Summary: Rename static tracepoint macros to better describe their targeted usage.
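For context, a minimal usage sketch of one of these macros (illustrative only; the function, probe name, and the header path `caffe2/core/static_tracepoint.h` are assumptions, not part of this diff):

```cpp
// Hypothetical example: emit a USDT probe that libbpf/bpftrace tooling can
// attach to. When no tracer is attached, the probe is effectively a no-op.
#include <caffe2/core/static_tracepoint.h>

void run_batch(int batch_size) {
  // CAFFE_SDT(name, args...) records the given arguments with the probe;
  // here we expose the batch size to tracing tools.
  CAFFE_SDT(run_batch_start, batch_size);
}
```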
Test Plan:
Same as for D47159249:
Tested the following macros on test scripts with libbpf USDTs:
* `CAFFE_SDT`
* `CAFFE_DISABLE_SDT`
* `CAFFE_SDT_WITH_SEMAPHORE`
Reviewed By: chaekit
Differential Revision: D47727339
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106380
Approved by: https://github.com/chaekit
Summary: Moving the static tracepoint macros header to a location where it can be easily used by various PyTorch components (`c10/util`).
Test Plan:
Same as for D47159249:
Tested the following macros on test scripts with libbpf USDTs:
* `CAFFE_SDT`
* `CAFFE_DISABLE_SDT`
* `CAFFE_SDT_WITH_SEMAPHORE`
Reviewed By: EDG-GH
Differential Revision: D47636258
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105856
Approved by: https://github.com/EDG-GH, https://github.com/chaekit
- BatchLinearAlgebraLib.cpp is now split, with its cuBLAS code moved into one additional file (see the sketch after this list)
- BatchLinearAlgebraLib.cpp uses only cusolver APIs
- BatchLinearAlgebraLibBlas.cpp uses only cublas APIs
- hipify operates at the file level and cannot mix cusolver and cublas APIs within the same file
- cmake changes to link against hipblas instead of rocblas
- hipify mappings changes to map cublas -> hipblas instead of rocblas
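As a sketch of what the split buys (illustrative only; the function below is hypothetical, not code from the diff): a translation unit that touches only the cuBLAS API, which is the shape BatchLinearAlgebraLibBlas.cpp now has, can be hipified wholesale to hipBLAS, while the cuSOLVER calls stay in the other file.

```cpp
#include <cublas_v2.h>

// hipify would rewrite this file to use hipblasHandle_t, hipblasCreate,
// hipblasDestroy, and HIPBLAS_STATUS_SUCCESS, with no cuSOLVER symbols to
// worry about in the same file.
int make_blas_handle() {
  cublasHandle_t handle = nullptr;
  if (cublasCreate(&handle) != CUBLAS_STATUS_SUCCESS) {
    return -1;
  }
  cublasDestroy(handle);
  return 0;
}
```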
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105881
Approved by: https://github.com/albanD
Summary:
Fix existing CAFFE static tracepoint macros and make them match the latest FOLLY version.
Per anakryiko, the current `CAFFE_SDT` definition is broken. Quote:
```
"Arguments: -5@-16(%rbp) -4@$100
Arguments: -8@-16(%rbp) -4@$100
#define FOLLY_SDT_IS_ARRAY_POINTER(x) ((__builtin_classify_type(x) == 14) || \
(__builtin_classify_type(x) == 5))
vs
#define CAFFE_SDT_ISARRAY(x) (__builtin_classify_type(x) == 14)
https://github.com/atgreen/gcc/blob/master/gcc/typeclass.h
that 5 is "pointer_type_class"
so you were right, it's just fixed up version of header
I think it should be 8, not 5
5 is the size of literal, but you don't pass string literal as an argument, you pass its address, so actual argument is a pointer, and so 8 byte long
you can try just fixing up CAFFE_SDT macro
```
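The corrected behavior can be summarized with a hedged sketch (macro names below are illustrative stand-ins, not the actual CAFFE_SDT internals): array and pointer arguments, including string literals, should be recorded with pointer size, which is why the probe argument above becomes `-8@...` rather than `-5@...`.

```cpp
// Illustrative size-selection pattern for SDT-style argument encoding.
// __builtin_classify_type returns 14 for arrays and 5 for pointers
// (see gcc/typeclass.h), and both cases should record sizeof(void*)
// (8 bytes on x86-64) rather than the literal's own length.
#define EXAMPLE_SDT_IS_ARRAY_POINTER(x)  \
  ((__builtin_classify_type(x) == 14) || \
   (__builtin_classify_type(x) == 5))
#define EXAMPLE_SDT_ARG_SIZE(x) \
  (EXAMPLE_SDT_IS_ARRAY_POINTER(x) ? sizeof(void*) : sizeof(x))
```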
Test Plan:
Tested the following macros on test scripts with libbpf USDTs:
* `CAFFE_SDT`
* `CAFFE_DISABLE_SDT`
* `CAFFE_SDT_WITH_SEMAPHORE`
Reviewed By: RihamSelim
Differential Revision: D47159249
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105232
Approved by: https://github.com/chaekit, https://github.com/malfet
This PR enables `-Winconsistent-missing-destructor-override` and `-Winconsistent-missing-override`
and fixes violations.
### <samp>🤖 Generated by Copilot at 47e904e</samp>
This pull request updates various classes and operators in the `caffe2` and `aten` subdirectories to use the `override` specifier instead of the `virtual` keyword for destructors and other virtual functions that override a base class function. This improves readability, code quality, and consistency with C++ best practices. It also modifies `./CMakeLists.txt` to enable warnings for these specifiers without treating them as errors.
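A minimal sketch of what these warnings catch (toy classes, not code from the diff):

```cpp
// -Winconsistent-missing-override and -Winconsistent-missing-destructor-override
// fire when a class marks some of its overriders with `override` but not others.
struct Base {
  virtual ~Base() = default;
  virtual void Run();
};

struct BeforeFix : Base {
  void Run() override;   // marked override...
  virtual ~BeforeFix();  // ...while the overriding destructor is not: warning.
};

struct AfterFix : Base {
  void Run() override;
  ~AfterFix() override = default;  // consistent, and no redundant `virtual`.
};
```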
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104032
Approved by: https://github.com/malfet
remove unused CAFFE2_VERSION macros
Summary:
Nothing reads these and they are completely subsumed by TORCH_VERSION.
Getting rid of these will be helpful for build unification, since they
are also not used internally.
Test Plan: Rely on CI.
Reviewers: sahanp
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97337
Approved by: https://github.com/malfet
Summary:
My team has been hitting a mysterious crash for a few months in a Windows binary that uses Caffe2 inside a worker thread.
When this thread gets destroyed, an error is raised at this line in context_gpu.h, where the operation returns CUDNN_STATUS_INTERNAL_ERROR instead of CUDNN_STATUS_SUCCESS.
When enabling cuDNN debug logs (via the env variables NVIDIA specifies), I can see that the context is destroyed twice, even though this code only destroys it once, so something mysterious is causing a double free.
This seems very similar to the issue/fix described for PyTorch in
https://github.com/pytorch/pytorch/issues/17658 and https://github.com/apache/tvm/pull/8267
PyTorch handles this in the same way, by simply not calling cudnnDestroy.
This seems to have become an issue with CUDA 11, but I tested CUDA 12 as well and found that the issue persists, so this needs to be fixed somehow.
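A hedged sketch of the workaround described above (the wrapper class is hypothetical, not the actual Caffe2 code):

```cpp
#include <cudnn.h>

// Keep the cuDNN handle in a wrapper whose destructor deliberately does NOT
// call cudnnDestroy(), so teardown ordering at thread/process exit cannot
// trigger a second destroy of an already-torn-down context.
class CudnnHandleWrapper {
 public:
  CudnnHandleWrapper() { cudnnCreate(&handle_); }
  ~CudnnHandleWrapper() {
    // Intentionally leak the handle instead of calling cudnnDestroy(handle_):
    // the CUDA context may already be gone at this point, and a second
    // destroy has been observed to crash on Windows.
  }
  cudnnHandle_t get() const { return handle_; }

 private:
  cudnnHandle_t handle_{nullptr};
};
```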
Test Plan:
CI
I checked that the specific Windows binary I am using is able to create and destroy Caffe2-invoking threads without causing the application to crash.
buck run arvr/mode/win/cuda11/opt //arvr/projects/nimble/prod/tools/MonoHandTrackingVis
Differential Revision: D43538017
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95382
Approved by: https://github.com/malfet
This PR introduces the following modifications (see the sketch after this list):
1. We identify const function parameters that can be passed by reference and add the reference.
2. We find more opportunities for passing by value and change them accordingly.
3. Some use-after-move errors are fixed.
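A minimal illustration of the three kinds of change (hypothetical functions, not code from the diff):

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// 1. A const parameter that used to be copied is now taken by const reference.
void logName(const std::string& name);  // was: void logName(const std::string name);

// 2. A sink parameter is taken by value and moved into place.
struct Record {
  explicit Record(std::vector<int> data) : data_(std::move(data)) {}
  std::vector<int> data_;
};

// 3. A use-after-move error: after std::move(data), the local must not be read.
std::size_t buildRecord(std::vector<int> data) {
  Record r(std::move(data));
  // return data.size();   // bug: reads a moved-from object
  return r.data_.size();   // fixed: query the new owner instead
}
```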
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95942
Approved by: https://github.com/Skylion007
Summary: `cuda::` is an ambiguous namespace. Make it explicit: `c10::cuda`.
Differential Revision: D41469007
```
/caffe2/caffe2/core/context_gpu.cu(564): error: "caffe2::cuda" is ambiguous
/caffe2/caffe2/core/context_gpu.cu(564): error: expected a ";"
/caffe2/caffe2/core/context_gpu.cu(568): warning #12-D: parsing restarts here after previous syntax error
Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
/caffe2/caffe2/core/context_gpu.cu(569): error: "caffe2::cuda" is ambiguous
/caffe2/caffe2/core/context_gpu.cu(628): error: "caffe2::cuda" is ambiguous
4 errors detected in the compilation of "/caffe2/caffe2/core/context_gpu.cu".
```
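A self-contained sketch of the ambiguity (toy namespaces, not the real headers): once both a top-level `::cuda` and `c10::cuda` are visible from caffe2 code, the unqualified name `cuda` cannot be resolved, and spelling out `c10::cuda` fixes it.

```cpp
namespace cuda { inline int device_count() { return 1; } }                    // stand-in for ::cuda
namespace c10 { namespace cuda { inline int device_count() { return 2; } } }  // stand-in for c10::cuda

namespace caffe2 {
using namespace ::c10;  // makes c10::cuda visible alongside ::cuda

int count() {
  // return cuda::device_count();     // error: "cuda" is ambiguous
  return c10::cuda::device_count();   // explicit qualification, as in this diff
}
}  // namespace caffe2
```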
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89534
Approved by: https://github.com/malfet
Fix a mix-up of Caffe2_CPU_INCLUDE with Caffe2_GPU_INCLUDE: when a variable is expanded to the parent scope, it should keep the same variable name. This fix corrects compilation in certain build configurations.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87030
Approved by: https://github.com/kit1980
We're no longer building Caffe2 mobile as part of our CI, and it adds a lot of clutter to our make files. Any lingering internal dependencies will use the buck build and so won't be affected.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84338
Approved by: https://github.com/dreiss
Add `TensorImpl::sym_strides`, bind it to Python with `torch.ops.aten.sym_strides`, and use it in `ProxyTensor` and `FakeTensor`.
Before, `ProxyTensor` was generating `ProxySymInt`s for the sizes, but not for the strides. Internally we still represent strides with a `SymIntArrayRef` though, so I ran into some weird issues where sizes were showing up as `ProxySymInt`s, but strides were `PySymInt`s.
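A hedged sketch of how the new accessor pairs with the existing one (assuming the tensor-level C++ surface mirrors the `TensorImpl` method named here; illustrative only, not a test from this PR):

```cpp
#include <ATen/ATen.h>

// Read both sizes and strides symbolically, so tracing code does not mix
// SymInt sizes with concrete integer strides.
void inspect(const at::Tensor& t) {
  c10::SymIntArrayRef sizes = t.sym_sizes();     // already available
  c10::SymIntArrayRef strides = t.sym_strides(); // exposed by this change
  (void)sizes;
  (void)strides;
}
```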
Differential Revision: [D38594558](https://our.internmc.facebook.com/intern/diff/D38594558)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81300
Approved by: https://github.com/ezyang
This PR relands sym_numel #82374 and fixes the iOS build break in commit 8cbd0031c5,
which was a type mismatch in an equality.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82731
Approved by: https://github.com/malfet