Context: To avoid cluttering the `torch.nn` namespace, the quantized modules are being migrated to `torch.ao.nn`.
The `nn.quantized` namespaces being migrated:
- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
- [X] [Current PR] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
- [ ] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
- [ ] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
- [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
- [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
- [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
- [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
- [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
- [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
- [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
- [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`
The majority of the files are simply moved to the new location.
However, a few specific files need to be double-checked:
- [Documentation](docs/source/quantization-support.rst) @vkuzo
- [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10
Differential Revision: [D36792967](https://our.internmc.facebook.com/intern/diff/D36792967/)
**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36792967/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78712
Approved by: https://github.com/jerryzh168
Summary: Implement `SOFT_ASSERT`, which fails only in debug mode and merely logs a warning in release mode. This allows us to gracefully handle invariant violations encountered while processing traces, which don't necessarily need to crash the entire program.
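For illustration, a minimal sketch of the idea (a hypothetical macro body, not the actual profiler macro): debug builds treat the check as a hard assert, release builds log a warning and report the failure so the caller can bail out gracefully.
```cpp
#include <cstdio>
#include <cstdlib>

#ifndef NDEBUG
// Debug: behave like a regular assert.
#define SOFT_ASSERT(cond)                                        \
  ((cond) ? true                                                 \
          : (std::fprintf(stderr, "Assert failed: %s\n", #cond), \
             std::abort(), false))
#else
// Release: warn and hand the failure back to the caller.
#define SOFT_ASSERT(cond)                                        \
  ((cond) ? true                                                 \
          : (std::fprintf(stderr, "Warning: %s\n", #cond), false))
#endif

// Hypothetical usage: skip a malformed trace entry instead of crashing.
bool processEntry(int size) {
  if (!SOFT_ASSERT(size >= 0)) {
    return false;  // graceful handling of the invariant violation
  }
  return true;
}
```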
Test Plan: Added SOFT_ASSERT test in containers.cpp
Differential Revision: D38327334
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82689
Approved by: https://github.com/robieta
Summary:
[Comment](https://github.com/pytorch/pytorch/pull/62445/files#r680132022) claims it was added for consistency with the top-level CMakeLists.txt, but `-Wno-unused-variable` is not mentioned there.
Fix the violations in 50+ files that were added in the interim, either by removing the unused variables or by decorating the code with `C10_UNUSED` when a local variable is likely used to extend an object's lifetime until the end of the block.
Suppressing this warning caused a preventable revert in https://github.com/pytorch/pytorch/pull/72633#issuecomment-1092300787
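As a hedged illustration of the remediation pattern (a made-up call site, not one of the actual diffs): the loop body never reads `i`, so `-Wunused-variable` fires once the suppression is removed, and `C10_UNUSED` documents the intent; the same annotation covers locals kept alive only for their destructor.
```cpp
#include <c10/macros/Macros.h>
#include <c10/util/irange.h>

int repeatTwice(int x) {
  int out = 0;
  // Index is intentionally unused; C10_UNUSED silences the warning.
  for (C10_UNUSED const auto i : c10::irange(2)) {
    out += x;
  }
  return out;
}
```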
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75538
Reviewed By: anjali411
Differential Revision: D35747333
Pulled By: malfet
fbshipit-source-id: 3fc5828e44a4c05ba0e89e92613e6ebbdb260626
(cherry picked from commit c179fba21cfa2a0093fad50ccad5a22dd7cff52c)
Summary:
Include the sequence ids of the ops that produced each input tensor in the NVTX markers. This feature adds additional information to the NVTX marker string, e.g. `seq_ids=[101, 102, 103]`: entry *i* is the sequence id of the op that produced input tensor *i*. In the example above, input tensor 0 was produced by the node with sequence id 101, input tensor 1 by node 102, and input tensor 2 by node 103. The array is organized the same way as the sizes array. If you know the sequence id of a node and the sequence ids of its input edges, you have enough information to reconstruct the network graph.
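A hypothetical sketch of how such a marker payload could be assembled (the real formatting code in the profiler differs):
```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

std::string markerString(const std::string& op_name,
                         const std::vector<int64_t>& seq_ids) {
  std::ostringstream os;
  os << op_name << ", seq_ids=[";
  for (size_t i = 0; i < seq_ids.size(); ++i) {
    if (i != 0) os << ", ";
    os << seq_ids[i];  // producer seq id of input tensor i
  }
  os << "]";
  return os.str();  // e.g. "aten::add, seq_ids=[101, 102, 103]"
}
```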
Fixes https://github.com/pytorch/pytorch/issues/66105
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70264
Reviewed By: chaekit
Differential Revision: D34792707
Pulled By: robieta
fbshipit-source-id: 4407b853c929a737505803b0db77a8ecd966cce2
(cherry picked from commit cd3c0c8c9d4d63d7897f60521c407883240d1d5b)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73855
Calling the clock is one of the most expensive parts of profiling. We can reduce the profiling overhead by using `rdtsc` instead. The tradeoff is that we have to measure the counter's rate and convert readings back to wall-clock time (a shift and a scale).
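Roughly, the measure-and-convert step could look like this (an assumed calibration scheme for illustration, not the actual implementation; x86 only): sample one (TSC, wall-clock) pair plus a rate estimate, then turn cheap `rdtsc` readings into nanoseconds with a linear map.
```cpp
#include <x86intrin.h>  // __rdtsc
#include <chrono>
#include <cstdint>

struct TscToNs {
  uint64_t tsc0;       // TSC at calibration (the "shift")
  int64_t t0_ns;       // wall-clock ns at calibration
  double ns_per_tick;  // measured rate (the "scale")

  int64_t operator()(uint64_t tsc) const {
    return t0_ns + static_cast<int64_t>(
                       static_cast<double>(tsc - tsc0) * ns_per_tick);
  }
};

TscToNs calibrate() {
  using clock = std::chrono::steady_clock;
  const auto w0 = clock::now();
  const uint64_t t0 = __rdtsc();
  // Spin briefly so the rate estimate has a usable baseline.
  while (clock::now() - w0 < std::chrono::milliseconds(10)) {}
  const uint64_t t1 = __rdtsc();
  const auto w1 = clock::now();
  const double elapsed_ns =
      std::chrono::duration<double, std::nano>(w1 - w0).count();
  const int64_t w0_ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                            w0.time_since_epoch()).count();
  return {t0, w0_ns, elapsed_ns / static_cast<double>(t1 - t0)};
}
```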
Test Plan: I added a cpp unit test with *very* aggressive anti-flake measures. I also ran the overhead benchmark (9 replicates) with `--stressTestKineto` (0.94 -> 0.89 us) and `--stressTestKineto --kinetoProfileMemory` (1.27 -> 1.17 us)
Reviewed By: chaekit
Differential Revision: D34231071
fbshipit-source-id: e3b3dd7580d93bcc783e87c7f2fc726cb74f4df8
(cherry picked from commit e8be9f8160793c6ee35d5af02bca3e01703e377d)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71247
Most uses of toIntVector() were for a Tensor shape. We have DimVector to avoid heap allocations in those cases, so let's use it.
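A sketch of the intended change at a hypothetical call site (`reshapeLike` is made up for illustration):
```cpp
#include <ATen/ATen.h>

at::Tensor reshapeLike(const at::Tensor& t) {
  // Before: materializing the shape heap-allocates a std::vector<int64_t>.
  // std::vector<int64_t> shape = t.sizes().vec();

  // After: DimVector is a SmallVector with inline storage for typical
  // tensor ranks, so the common case avoids the heap entirely.
  at::DimVector shape(t.sizes().begin(), t.sizes().end());
  shape.push_back(1);  // e.g. append a trailing unit dim
  return t.reshape(shape);
}
```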
ghstack-source-id: 146933314
Test Plan: CI -- if we think DimVector is good in general then I think we have to think this change is good?
Reviewed By: mikeiovine
Differential Revision: D33556198
fbshipit-source-id: cf2ad92c2d0b99ab1df4da0f6843e6ccb9a6320b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69798
One of the major sources of complexity in `profiler_kineto.cpp` is that Kineto may or may not be available. The code (including the types) follows two related but often distinct codepaths, and large sections may or may not be `#ifdef`'d out.
Optimizing such code while preserving correctness is quite difficult; at one point I realized that I had broken the non-Kineto case, because moving work into the finalize step ran afoul of a very large `#ifdef` around the finalize logic.
In order to make optimization more tractable, I gathered all of the calls to Kineto APIs and isolated them in the `kineto_shim.h/.cpp` files: the header allows callers to pretend that Kineto is always available (mostly), and the cpp file hides most of the horrible `#ifdef`s so they don't pollute the main profiler code.
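A minimal sketch of the shim pattern, with made-up names standing in for the real `kineto_shim.h/.cpp` contents: the header presents an API that is always available, and the `.cpp` is the only place that sees the `#ifdef`.
```cpp
// --- kineto_shim.h (illustrative) -------------------------------------
#include <string>
namespace shim {
void startTrace();                       // no #ifdef for callers
void logEvent(const std::string& name);  // no-ops if Kineto is absent
}  // namespace shim

// --- kineto_shim.cpp (illustrative) ------------------------------------
#ifdef USE_KINETO
// #include <libkineto.h>  // only pulled in when Kineto is built
#endif

namespace shim {
void startTrace() {
#ifdef USE_KINETO
  // forward to the real Kineto API here
#endif
}
void logEvent(const std::string& name) {
#ifdef USE_KINETO
  // forward to the real Kineto API here
#else
  (void)name;  // silently drop: callers never branch on availability
#endif
}
}  // namespace shim
```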
Test Plan: Unit tests.
Reviewed By: aaronenyeshi
Differential Revision: D32690568
fbshipit-source-id: 9a276654ef0ff9d40817c2f88f95071683f150c5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70326
See D24145988 for context: it allows loops such as `for (int i = 0; i < 10; i++)` to be expressed as `for (const auto i : c10::irange(10))`. This is nice because it auto-types the loops and adds const-safety to the iteration variable.
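Minimal usage example (assuming the c10 headers are on the include path):
```cpp
#include <c10/util/irange.h>
#include <cstdio>

int main() {
  // i is const and its type is deduced; the loop runs over 0..9.
  for (const auto i : c10::irange(10)) {
    std::printf("%d\n", static_cast<int>(i));
  }
}
```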
Test Plan: buck run //caffe2/torch/fb/sparsenn:test
Reviewed By: r-barnes
Differential Revision: D33243400
fbshipit-source-id: b1f1b4163f4bf662031baea9e5268459b40c69a3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70133
We were creating a `stringstream` plus string concatenations via `getNvtxStr` even when there were no inputs, wasting precious time. This diff avoids the `stringstream` when there is no input, squeezing out performance: a 60% reduction in overhead.
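A sketch of the fast path (hypothetical signature; the real `getNvtxStr` takes the profiler's own argument types): return early when there is nothing to format, so no `stringstream` is ever constructed.
```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

std::string getNvtxStr(const char* name, int64_t sequence_nr,
                       const std::vector<std::vector<int64_t>>& input_sizes) {
  if (sequence_nr < 0 && input_sizes.empty()) {
    return name;  // common case: skip the expensive formatting entirely
  }
  std::ostringstream os;
  os << name << ", seq = " << sequence_nr;
  // ... append "sizes = [...]" from input_sizes when present ...
  return os.str();
}
```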
Test Plan:
Before
```
I1214 22:48:07.964118 2971180 bench.cpp:154] Mean 0.970494
I1214 22:48:07.964139 2971180 bench.cpp:155] Median 0.969054
I1214 22:48:07.964144 2971180 bench.cpp:156] Min 0.962247
I1214 22:48:07.964148 2971180 bench.cpp:157] stddev 0.00774841
I1214 22:48:07.964154 2971180 bench.cpp:158] stddev / mean 0.00798398
```
After
```
I1214 22:59:00.039872 3437853 bench.cpp:154] Mean 0.384333
I1214 22:59:00.039896 3437853 bench.cpp:155] Median 0.384886
I1214 22:59:00.039899 3437853 bench.cpp:156] Min 0.370235
I1214 22:59:00.039902 3437853 bench.cpp:157] stddev 0.00435907
I1214 22:59:00.039907 3437853 bench.cpp:158] stddev / mean 0.0113419
```
Reviewed By: aaronenyeshi, robieta
Differential Revision: D33137501
fbshipit-source-id: ce0e8cf9aef7ea22fd8aed927e76be4ca375efc3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69255
One thing that I've found as I optimize the profiler is that there's a lot of intermingled code, where the Kineto profiler relies on the legacy (autograd) profiler for generic operations. This made optimization hard because I had to manage too many complex dependencies (exacerbated by the `USE_KINETO` `#ifdef`s sprinkled around). This PR is the first of several to restructure the profiler(s) so that the later optimizations go in more easily.
Test Plan: Unit tests
Reviewed By: aaronenyeshi
Differential Revision: D32671972
fbshipit-source-id: efa83b40dde4216f368f2a5fa707360031a85707