pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Andrew Gu	9d9267c6f7	[FSDP()][3/N] Refactor public APIs (#87917 ) - This PR defines a new `api.py` meant to hold the public API for FSDP (minus `FullyShardedDataParallel` itself). This is needed because several of the `_<...>_utils.py` files rely on the public API, and we cannot import from `torch.distributed.fsdp.fully_sharded_data_parallel` without a circular import. Calling the file `api.py` follows the convention used by `ShardedTensor`. - This PR cleans up the wording in the `BackwardPrefetch`, `ShardingStrategy`, `MixedPrecision`, and `CPUOffload` docstrings. - This PR adds the aforementioned classes to `fsdp.rst` to have them rendered in public docs. - To abide by the public bindings contract (`test_public_bindings.py`), the aforementioned classes are removed from `fully_sharded_data_parallel.py`'s `__all__`. This is technically BC breaking if someone uses `from torch.distributed.fsdp.fully_sharded_data_parallel import *`; however, that does not happen in any of our own external or internal code. Pull Request resolved: https://github.com/pytorch/pytorch/pull/87917 Approved by: https://github.com/mrshenli	2022-10-31 16:45:21 +00:00
Salil Desai	df1cc0ef47	[Vulkan] Add Vulkan Rewrite to Transfer Inputs and Outputs to Vulkan and CPU Backends Respectively (#87432 ) With this change, we don't have to manually invoke transferring input and output backends when we run vulkan models. Graph rewrite code based off of: - `32efff45ba (diff-a473bddb458dc24225866a45092d6eca064eddd256245d93020e48e216eee4d5R160-R179)` Differential Revision: [D39519168](https://our.internmc.facebook.com/intern/diff/D39519168/) NOTE FOR REVIEWERS: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39519168/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/87432 Approved by: https://github.com/mcr229, https://github.com/digantdesai	2022-10-31 14:18:45 +00:00
Driss Guessous	35c611d30f	Add mem efficient backend flag (#87946 ) # Summary Add a torch.backends.cuda flag and update the context manager to pick between the three implementations of scaled_dot_product_attention. cc @cpuhrsch @jbschlosser @bhosmer @mikaylagawarecki Pull Request resolved: https://github.com/pytorch/pytorch/pull/87946 Approved by: https://github.com/cpuhrsch	2022-10-28 15:51:10 +00:00
Alvaro Gaona	46b16977d9	Reimplement Kaiser window (#87330 ) Relates to #85366 - For reference follow #87082. Pull Request resolved: https://github.com/pytorch/pytorch/pull/87330 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-10-27 21:01:01 +00:00
Edward Z. Yang	1ff52225f1	Unify SymIntNode and SymFloatNode into SymNode (#87817 ) This refactor was prompted by challenges handling mixed int/float operations in C++. A previous version of this patch added overloads for each permutation of int/float and was unwieldy https://github.com/pytorch/pytorch/pull/87722/ This PR takes a different approach. The general outline of the patch is to combine the C++ types SymIntNode and SymFloatNode into a single type, SymNode. This is type erased; we no longer know statically at C++ if we have an int/float and have to test it with the is_int()/is_float() virtual methods. This has a number of knock on effects. - We no longer have C++ classes to bind to Python. Instead, we take an entirely new approach to our Python API, where we have a SymInt/SymFloat class defined entirely in Python, which hold a SymNode (which corresponds to the C++ SymNode). However, SymNode is not pybind11-bound; instead, it lives as-is in Python, and is wrapped into C++ SymNode using PythonSymNode when it goes into C++. This implies a userland rename. In principle, it is also possible for the canonical implementation of SymNode to be written in C++, and then bound to Python with pybind11 (we have this code, although it is commented out.) However, I did not implement this as we currently have no C++ implementations of SymNode. Because we do return SymInt/SymFloat from C++ bindings, the C++ binding code needs to know how to find these classes. Currently, this is done just by manually importing torch and getting the attributes. - Because SymInt/SymFloat are easy Python wrappers, __sym_dispatch__ now takes SymInt/SymFloat, rather than SymNode, bringing it in line with how __torch_dispatch__ works. Some miscellaneous improvements: - SymInt now has a constructor that takes SymNode. Note that this constructor is ambiguous if you pass in a subclass of SymNode, so an explicit downcast is necessary. This means toSymFloat/toSymInt are no more. 
This is a mild optimization as it means rvalue reference works automatically. - We uniformly use the caster for c10::SymInt/SymFloat, rather than going the long way via the SymIntNode/SymFloatNode. - Removed some unnecessary toSymInt/toSymFloat calls in normalize_* functions, pretty sure this doesn't do anything. - guard_int is now a free function, since to guard on an int you cannot assume the method exists. A function can handle both int and SymInt inputs. - We clean up the magic method definition code for SymInt/SymFloat/SymNode. ONLY the user classes (SymInt/SymFloat) get magic methods; SymNode gets plain methods; this is to help avoid confusion between the two types. Signed-off-by: Edward Z. Yang <ezyang@fb.com> cc @jansel @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 Pull Request resolved: https://github.com/pytorch/pytorch/pull/87817 Approved by: https://github.com/albanD, https://github.com/anjali411	2022-10-27 20:56:02 +00:00
HDCharles	d0e12d1cc8	[ao] Adding FAQ to docs (#87322 ) Summary: migrated from: https://discuss.pytorch.org/t/quantization-frequently-asked-questions/161251 Test Plan: circle CI tests Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/87322 Approved by: https://github.com/z-a-f	2022-10-25 20:18:04 +00:00
Masaki Kozuki	28593a8339	[docs] `batch_isend_irecv` and `P2POp` of torch.distributed (#86438 ) Reopening https://github.com/pytorch/pytorch/pull/79722 cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu Pull Request resolved: https://github.com/pytorch/pytorch/pull/86438 Approved by: https://github.com/kit1980	2022-10-25 00:11:50 +00:00
Kazuaki Ishizaki	72ec1b5fc1	Fix typo under docs directory (#87583 ) This PR fixes typo in `.rst` files under docs directory Pull Request resolved: https://github.com/pytorch/pytorch/pull/87583 Approved by: https://github.com/kit1980	2022-10-24 23:52:44 +00:00
Svetlana Karslioglu	7e83f65ad5	Add General Project Policies (#87385 ) Add General Project Policies to the Governance page Pull Request resolved: https://github.com/pytorch/pytorch/pull/87385 Approved by: https://github.com/orionr	2022-10-20 21:02:09 +00:00
George Qi	17202b3637	[maskedtensor] fix docs formatting (#87387 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/87387 Approved by: https://github.com/cpuhrsch	2022-10-20 20:48:25 +00:00
George Qi	cf2be34ff5	[maskedtensor] add docs (#84887 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84887 Approved by: https://github.com/cpuhrsch	2022-10-19 20:44:34 +00:00
Christian Puhrsch	e8c4adf3c3	Add torch.sparse overview section (#85265 ) The goal of this section is to provide a general overview of how PyTorch handles sparsity for readers who are already familiar with sparse matrices and their operators. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85265 Approved by: https://github.com/jisaacso	2022-10-18 21:07:57 +00:00
albanD	9db7270ee7	Small update to Module note (#87142 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/87142 Approved by: https://github.com/cpuhrsch	2022-10-17 22:56:49 +00:00
Jan Margeta	e85dbcc9b0	[docs] Fix ScalarTensor __repr__ in Extending PyTorch example (#86330 ) This PR fixes the __repr__ of the `ScalarTensor` class in the Extending PyTorch example to correspond with the class name instead of `DiagonalTensor`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/86330 Approved by: https://github.com/bdhirsh	2022-10-17 20:01:10 +00:00
Nikita Karetnikov	91b3cd0b5a	[primTorch] Add a ref for `narrow_copy` (#86748 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86748 Approved by: https://github.com/mruberry	2022-10-17 10:16:05 +00:00
Lukas Mührke	e027740e77	Chore: Add 'mps' to the docs of tensor_attributes (#86585 ) Since PyTorch supports 'mps' (Apple metal) devices it should be reflected in the documentation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/86585 Approved by: https://github.com/albanD	2022-10-14 19:59:33 +00:00
Alvaro Gaona	b48deedb77	Set up new module torch.signal.windows (#85599 ) Resolves #85366 Pull Request resolved: https://github.com/pytorch/pytorch/pull/85599 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-10-14 11:33:32 +00:00
Kshiteej K	54ee95c8ec	[nn] module: full_backward_pre_hook (#86700 ) Fixes https://github.com/pytorch/pytorch/issues/42824 * [x] Test * [x] Doc Pull Request resolved: https://github.com/pytorch/pytorch/pull/86700 Approved by: https://github.com/soulitzer	2022-10-13 17:36:39 +00:00
Shawn Zhong	e552cf1050	[DOC] Use type hints to show annotation in the docs (#79086 ) Fixes #44964 Use type hints in the code to show type annotations in the parameters section of the docs. For parameters that are already documented in the docstring but lack a type annotation, the type hints from the code are used. Before/after comparisons: `torch.nn.AdaptiveMaxPool1d`: [Before](https://pytorch.org/docs/master/generated/torch.nn.AdaptiveMaxPool1d.html), [After](https://docs-preview.pytorch.org/79086/generated/torch.nn.AdaptiveMaxPool1d.html); `torch.nn.Linear`: [Before](https://pytorch.org/docs/master/generated/torch.nn.Linear.html), [After](https://docs-preview.pytorch.org/79086/generated/torch.nn.Linear.html). Ref: - PR https://github.com/pytorch/pytorch/pull/49294 removed type annotations from signatures in HTML docs. - Sphinx version was bumped to 5.0.0 in PR #70309 - Duplicated (closed) issues: #78311 and #77501 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79086 Approved by: https://github.com/malfet	2022-10-12 22:31:48 +00:00
Mikayla Gawarecki	a77f2a95a7	Improve NestedTensor documentation (#85186 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85186 Approved by: https://github.com/cpuhrsch	2022-10-12 22:03:04 +00:00
Daniel Dale	ce56ee11fd	Extend torch.cuda.is_available() to attempt an NVML-based CUDA availability assessment when explicitly requested by the user (#85951 ) Fixes #83973 (This is a substitute PR for https://github.com/pytorch/pytorch/pull/85024) First of all, thanks for your invaluable contributions to PyTorch everyone! Given how extensively `torch.cuda.is_available` is used in the PyTorch ecosystem, IMHO it's worthwhile to provide downstream libraries/frameworks/users the ability to alter the default behavior of `torch.cuda.is_available` in the context of their PyTorch usage. I'm confident there are many current and future such use cases which could benefit from leveraging a weakened, NVML-based `torch.cuda.is_available` assessment at a downstream framework's explicit direction (thanks @malfet `81da50a972` !). Though one could always patch out the `torch.cuda.is_available` function with another implementation in a downstream library, I think this environmental variable based configuration option is more convenient and the cost to including the option is quite low. As discussed in https://github.com/pytorch/pytorch/pull/85024#issuecomment-1261542045, this PR gates new non-default NVML-based CUDA behavior with an environmental variable (PYTORCH_NVML_BASED_CUDA_CHK) that allows a user/framework to invoke non-default, NVML-based `is_available()` assessments if desired. Thanks again for your work everyone! @ngimel @malfet @awaelchli Pull Request resolved: https://github.com/pytorch/pytorch/pull/85951 Approved by: https://github.com/ngimel	2022-10-12 18:37:50 +00:00
Eddie Yan	25725fd624	(Re-open) Adds cudaMallocAsync as an alternative backend for the CUDA allocator (#82682 ) Rebased version of @mcarilli 's cudaMallocAsync #65365 for continued testing Pull Request resolved: https://github.com/pytorch/pytorch/pull/82682 Approved by: https://github.com/ngimel	2022-10-12 03:44:21 +00:00
Partho	42bd275233	[doc] LR scheduler example fix (#86629 ) Fixes issue #86208 As suggested in the issue, updated the LR scheduler example to use a regular nn.Module like the other examples on the same page. Pull Request resolved: https://github.com/pytorch/pytorch/pull/86629 Approved by: https://github.com/soulitzer	2022-10-11 21:41:50 +00:00
zaf	3a02873183	[quant][ao_migration] nn.intrinsic.quantized migration to ao (#86172 ) All quantization-related modules are being migrated to `torch.ao`. This migrates the `nn.intrinsic.quantized`. Please, see the [tracker](https://github.com/pytorch/pytorch/issues/81667) for the timeline. ``` python test/test_quantization.py -- TestAOMigrationNNIntrinsic ``` Internal: ``` buck2 test @mode/dev-nosan //caffe2/test:quantization -- TestAOMigrationNNIntrinsic ``` Differential Revision: [D39425515](https://our.internmc.facebook.com/intern/diff/D39425515/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86172 Approved by: https://github.com/jerryzh168	2022-10-08 00:01:38 +00:00
zaf	efccb6401c	[quant][ao_migration] nn.intrinsic.qat migration to ao (#86171 ) All quantization-related modules are being migrated to `torch.ao`. This migrates the `nn.intrinsic.qat`. Please, see the [tracker](https://github.com/pytorch/pytorch/issues/81667) for the timeline. ``` python test/test_quantization.py TestAOMigrationNNIntrinsic ``` Differential Revision: [D39419993](https://our.internmc.facebook.com/intern/diff/D39419993/) NOTE FOR REVIEWERS: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39419993/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/86171 Approved by: https://github.com/jerryzh168	2022-10-07 17:29:42 +00:00
Howard Huang	cc9183eb4c	Update distributed.rst backend collective support chart (#86406 ) NCCL `scatter` was added by Wanchao in https://github.com/pytorch/pytorch/pull/70029 Pull Request resolved: https://github.com/pytorch/pytorch/pull/86406 Approved by: https://github.com/wanchaol	2022-10-07 12:59:09 +00:00
Zafar	0e30da3f2f	[refactor] Renaming ao.sparsity to ao.pruning (#84867 ) `Sparsity` as a term doesn't reflect the tools that are developed by the AO. The `torch/ao/sparsity` also has utilities for structured pruning, which internally we always referred to as just "pruning". To avoid any confusion, we renamed `Sparsity` to `Prune`. We will not provide backwards compatibility, as this toolset has so far been kept under silent development. This change is reflected in the documentation as well. TODO: - [ ] Change the tutorials - [ ] Confirm no bc-breakages - [ ] Reflect the changes in the trackers and RFC docs Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/84867 Approved by: https://github.com/supriyar	2022-10-07 00:58:41 +00:00
Sahan Paliskara	936e93058b	Delete torch::deploy from pytorch core (#85953 ) As we have migrated torch::deploy over to https://github.com/pytorch/multipy, we can now delete it from pytorch core as ongoing development will happen there. This PR was created due to syncing issues with https://github.com/pytorch/pytorch/pull/85443 which is where the review history can be found. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85953 Approved by: https://github.com/seemethere, https://github.com/malfet	2022-10-06 07:20:16 +00:00
Elias Ellison	d04889323e	Add Context Manager for Disabling Multithreading in Backwards, use in aot autograd (#86245 ) We were running into a few issues with running multithreaded backwards in aot_autograd: such as https://github.com/pytorch/pytorch/issues/86136, and `FakeTensorMode` getting into a weird state as a result of not executing functions completely sequentially. The multithreaded backwards is lost in translation when we trace out the backwards anyway, and adds a lot of additional complexity. Pull Request resolved: https://github.com/pytorch/pytorch/pull/86245 Approved by: https://github.com/albanD, https://github.com/yf225	2022-10-06 03:27:42 +00:00
Jane Xu	a348975e00	Add opteinsum backend to give users control (#86219 ) This achieves the same things as https://github.com/pytorch/pytorch/pull/85908 but using backends instead of kwargs (which breaks torchscript unfortunately). This also does mean we let go of numpy compatibility BUT the wins here are that users can control what opt einsum they wanna do! The backend allows for..well you should just read the docs: ``` .. attribute:: torch.backends.opteinsum.enabled A :class:`bool` that controls whether opt_einsum is enabled (on by default). If so, torch.einsum will use opt_einsum (https://optimized-einsum.readthedocs.io/en/stable/path_finding.html) to calculate an optimal path of contraction for faster performance. .. attribute:: torch.backends.opteinsum.strategy A :class:`str` that specifies which strategies to try when `torch.backends.opteinsum.enabled` is True. By default, torch.einsum will try the "auto" strategy, but the "greedy" and "optimal" strategies are also supported. Note that the "optimal" strategy is factorial on the number of inputs as it tries all possible paths. See more details in opt_einsum's docs (https://optimized-einsum.readthedocs.io/en/stable/path_finding.html). ``` In trying (and failing) to land 85908, I discovered that jit script does NOT actually pull from python's version of einsum (because it cannot support variadic args nor kwargs). Thus I learned that jitted einsum does not subscribe to the new opt_einsum path calculation. Overall, this is fine since jit script is getting deprecated, but where is the best place to document this? 
## Test plan: - added tests to CI - locally tested that trying to set the strategy to something invalid will error properly - locally tested that tests will pass even if you don't have opt-einsum - locally tested that setting the strategy when opt-einsum is not there will also error properly Pull Request resolved: https://github.com/pytorch/pytorch/pull/86219 Approved by: https://github.com/soulitzer, https://github.com/malfet	2022-10-05 06:33:25 +00:00
Jing Xu	f20e4eab7b	Fix ITT unit-tests if PyTorch is compiled with `USE_ITT=OFF` (#86199 ) Fixes https://github.com/pytorch/pytorch/pull/84848#discussion_r986329680 @malfet @slgong-fb Pull Request resolved: https://github.com/pytorch/pytorch/pull/86199 Approved by: https://github.com/malfet	2022-10-04 21:57:05 +00:00
Khushi	d6b030856b	[primTorch] special: j0, j1, spherical_j0 (#86049 ) Adds prims and refs for special functions (bessel_j0, bessel_j1, spherical_bessel_j0). Thanks! Pull Request resolved: https://github.com/pytorch/pytorch/pull/86049 Approved by: https://github.com/mruberry	2022-10-04 18:21:46 +00:00
Driss Guessous	cd6477617c	Custom sdp implementations dense (#85984 ) # Summary - This code creates the runtime dispatch system for choosing a performant fused SDP kernel. The only choice of fused kernel is flash_attention. It also creates python flags and a context manager that can be used to turn off and on behavior for dispatch. - This also adds support for flash_attention with dense tensors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85984 Approved by: https://github.com/cpuhrsch	2022-10-03 17:36:37 +00:00
vfdev	8d9472d7d4	[skip-ci] Fixed bad link in build_ci_governance.rst (#85933 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85933 Approved by: https://github.com/albanD	2022-10-03 17:35:44 +00:00
Masaki Kozuki	85d520d448	[docs] Add `torch.channels_last_3d` (#85888 ) As per title, updating https://pytorch.org/docs/master/tensor_attributes.html#torch-memory-format. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85888 Approved by: https://github.com/ngimel	2022-10-03 17:32:07 +00:00
Codrin Popa	d401732baa	Added roundup_bypass_threshold_mb knobs to the PyTorch Caching Allocator (#85940 ) Summary: Added an additional roundup knob( ``roundup_bypass_threshold_mb``) to bypass rounding the requested allocation size, for allocation requests larger than the threshold value (in MB). This can help reduce the memory footprint when making large allocations that are expected to be persistent or have a large lifetime. Differential Revision: D39868104 Pull Request resolved: https://github.com/pytorch/pytorch/pull/85940 Approved by: https://github.com/zdevito	2022-10-03 16:56:22 +00:00
Richard Zou	a262ccea58	Change torch.autograd.graph.disable_saved_tensors_hooks to be public API (#85994 ) Also addresses some comments from the review in https://github.com/pytorch/pytorch/pull/85971 Pull Request resolved: https://github.com/pytorch/pytorch/pull/85994 Approved by: https://github.com/albanD, https://github.com/soulitzer	2022-10-03 16:25:01 +00:00
vfdev	6fd5d6397a	[Docs] Updated torchvision people (#85931 ) cc @datumbox @pmeier Pull Request resolved: https://github.com/pytorch/pytorch/pull/85931 Approved by: https://github.com/fmassa, https://github.com/datumbox	2022-10-03 10:57:08 +00:00
Ke Wen	05d1128106	[c10d] Start deprecating *_multigpu APIs (#85961 ) ### Deprecation reasons: - For most users training is on one GPU per process so these APIs are rarely used - They added one more API dimension - They can be expressed in a composed manner - They are not abstracted – specific to GPU - They caused backend APIs and implementations to have nested `std::vector<std::vector<Tensor>>`, which is hard to read or maintain Pull Request resolved: https://github.com/pytorch/pytorch/pull/85961 Approved by: https://github.com/XilunWu, https://github.com/H-Huang	2022-10-01 00:59:39 +00:00
Justin Chu	69b927701a	[ONNX] Update user documentation (#85819 ) - Remove mentions of `SymbolicContext` in the doc - Comment out the PythonOp example so that it is not shown to users - Updated code blocks and wording - Changed to recommend using `pip` for installing onnx. Now adds a deprecation message to the docs (demo only): ![image](https://user-images.githubusercontent.com/11205048/193327649-f789b369-6b59-49e0-8bba-34a6785eb128.png) Fixes #85608 Pull Request resolved: https://github.com/pytorch/pytorch/pull/85819 Approved by: https://github.com/AllenTiTaiWang, https://github.com/BowenBao	2022-09-30 19:35:34 +00:00
erjia	b13b10a8fa	Extend collate function that can register collate functions to handle specific types (#85748 ) As requested by the Vision team, this adds a `collate` function with an extra `collate_fn_map` argument to dispatch custom collate functions for non-collection objects and specific objects. If the type of a batch element is not present in `collate_fn_map`, it goes through all keys in insertion order to check whether the type is a subclass of the key. If so, it invokes the corresponding collate function. `default_collate` uses the `collate` function with a few default collate functions for `int`, `float`, `str` and `numpy object`. Benefit: - Domain teams can register their own `collate` function to handle their specific type of objects - Easier for users to extend from the `collate` function. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85748 Approved by: https://github.com/NivekT, https://github.com/pmeier	2022-09-30 13:30:18 +00:00
Ke Wen	ade1c19612	Add reduce_scatter_tensor in place of _reduce_scatter_base (#85867 ) This is a twin PR similar to the one for `all_gather_into_tensor` (#85686). The philosophy for renaming `_reduce_scatter_base` instead of merging it is described in #85686. Cc @rohan-varma @H-Huang @crcrpar @ptrblck @mrshenli Pull Request resolved: https://github.com/pytorch/pytorch/pull/85867 Approved by: https://github.com/crcrpar, https://github.com/H-Huang	2022-09-30 05:48:16 +00:00
Kazuaki Ishizaki	bc57306bdd	Fix typo under docs directory and RELEASE.md (#85896 ) This PR fixes typo in rst files under docs directory and `RELEASE.md`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85896 Approved by: https://github.com/kit1980	2022-09-29 21:41:59 +00:00
zaf	d542aab5c1	[quant][ao_migration] nn.intrinsic migration to ao (#84842 ) All quantization-related modules are being migrated to `torch.ao`. This migrates the `nn.intrinsic.modules`. Please, see the [tracker](https://github.com/pytorch/pytorch/issues/81667) for the timeline. Differential Revision: [D39419733](https://our.internmc.facebook.com/intern/diff/D39419733/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39419733/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/84842 Approved by: https://github.com/jerryzh168	2022-09-28 23:54:29 +00:00
Mikayla Gawarecki	afaee00fec	Add python `nested_tensor` and `as_nested_tensor` constructors in `torch.nested` (#85593 ) Remove `torch.nested_tensor` which has erroneous behavior wrt gradients (could be either leaf or not leaf). Introduce `torch.nested.nested_tensor` and `torch.nested.as_nested_tensor` in the vein of `torch.tensor` and `torch.as_tensor`. Done in nested `__init__.py` for now but can move to pybind in future (when we want to load from numpy/nested lists ). Discussed offline with @cpuhrsch and pybind constructor (https://github.com/pytorch/pytorch/pull/85536) was more gnarly than expected, so we can move to that when we do need loading from numpy etc. Differential Revision: [D39806622](https://our.internmc.facebook.com/intern/diff/D39806622) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85593 Approved by: https://github.com/drisspg, https://github.com/cpuhrsch	2022-09-28 20:15:02 +00:00
Jing Xu	80b8886223	add itt unit test and docstrings (#84848 ) Add unit tests and docstrings corresponding to PR https://github.com/pytorch/pytorch/pull/63289 UT: 1. `test_profiler_emit_itt` in `test/test_autograd.py`. This test is merely intended to catch if emit_itt breaks on construction. 2. Test `torch.profiler.itt` functions in `test/test_itt.py` 3. Only testing that emit_itt runs when `record_shapes` option is enabled in `test/test_profiler.py`. Docstring: 1. add ITT related info into `docs/source/bottleneck.rst` 2. add `torch.profiler.itt` functions to `docs/source/profiler.rst` 3. add docstring to `torch.profiler.itt` functions in `torch/profiler/itt.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/84848 Approved by: https://github.com/malfet	2022-09-28 01:39:58 +00:00
Andrew M. James	5bfcf1f01a	[Docs] Update sparse Maintainers (#85126 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85126 Approved by: https://github.com/cpuhrsch	2022-09-27 22:50:31 +00:00
Ke Wen	775a22c7c6	Add all_gather_into_tensor in place of _all_gather_base (#85686 ) ### Description - This PR renames `_all_gather_base` to `all_gather_into_tensor` so that it is clearer in meaning. - The `all_gather_into_tensor` API differs from the `all_gather` API in the output it accepts -- a single, large tensor instead of a list of tensors. - This PR also adds a deprecation warning to `_all_gather_base`. ### Issue `_all_gather_base` was implemented in https://github.com/pytorch/pytorch/pull/33924 to avoid unnecessary flattening. There was a previous effort (#82639) to merge `_all_gather_base` with the existing `all_gather` API by detecting the parameter type passed in for the output. There are, however, two "blockers" that make the merge difficult: (i) The merge leads to a backward compatibility break. We would need to change the parameter name `tensor_list` in `all_gather` to a general name `output` that can cover both tensor and tensor list. (ii) Recently, the `all_gather` API has added uneven tensor support, utilizing the tensor boundaries implied by the list. We are, however, not sure whether to add such support to the `_all_gather_base` function, because that would require users to pass in additional tensor boundary information. In view of the above, we decided to productize `_all_gather_base` as a separate function, but with a clearer name. ### Testing Added tests: - `test_all_gather_into_cat_tensor_cuda` -- output form as with `torch.cat`. For example: ``` >>> tensor_in tensor([1, 2], device='cuda:0') # Rank 0 tensor([3, 4], device='cuda:1') # Rank 1 >>> tensor_out tensor([1, 2, 3, 4], device='cuda:0') # Rank 0 tensor([1, 2, 3, 4], device='cuda:1') # Rank 1 ``` - `test_all_gather_into_stack_tensor_cuda` -- output form as with `torch.stack`. 
For example: ``` >>> tensor_out2 tensor([[1, 2], [3, 4]], device='cuda:0') # Rank 0 tensor([[1, 2], [3, 4]], device='cuda:1') # Rank 1 ``` The output form is determined by the shape of the output tensor passed by the user, no flag used. Cc @rohan-varma @mrshenli @crcrpar @ptrblck @H-Huang Pull Request resolved: https://github.com/pytorch/pytorch/pull/85686 Approved by: https://github.com/rohan-varma, https://github.com/crcrpar	2022-09-27 22:50:22 +00:00
supriyar	18685b7fe1	Update PT maintainers list for AO (#85125 ) Summary: Update the list based on recommendation in https://github.com/pytorch/pytorch/blob/master/docs/source/community/build_ci_governance.rst Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D39745619](https://our.internmc.facebook.com/intern/diff/D39745619) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85125 Approved by: https://github.com/gchanan	2022-09-23 23:38:57 +00:00
Ivan Yashchuk	539076e2c2	Remove deprecated torch.lstsq (#70980 ) The time has come to remove deprecated linear algebra related functions. This PR removes `torch.lstsq`. There's a note in `tools/codegen/gen.py` about `lstsq` schema in `native_function.yaml` that I will not remove: `87139d8532/tools/codegen/gen.py (L734-L770)` cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/70980 Approved by: https://github.com/lezcano, https://github.com/kit1980	2022-09-23 00:16:55 +00:00
Ivan Yashchuk	bcf93181a0	Remove deprecated torch.matrix_rank (#70981 ) The time has come to remove deprecated linear algebra related functions. This PR removes `torch.matrix_rank`. cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/70981 Approved by: https://github.com/lezcano, https://github.com/kit1980	2022-09-22 17:40:46 +00:00
lezcano	de0f3c4200	Change Lezcano to lezcano (#85396 ) I changed my handle to lezcano (no caps) as there are always issues with capital letters when automatising stuff. The last issue was https://github.com/pytorch/test-infra/pull/751 Pull Request resolved: https://github.com/pytorch/pytorch/pull/85396 Approved by: https://github.com/ezyang	2022-09-21 13:49:55 +00:00
Mateusz Sypniewski	b70c254ebb	Rework printing tensor aliases in CSAN error message (#85008 ) Small rework of how the error message is formatted, introduces a distinction between the arguments and the output of kernels. Verified manually on multiple examples that the message is printed as expected. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85008 Approved by: https://github.com/lw	2022-09-21 13:41:52 +00:00
Justin Chu	d6c2080eb4	[ONNX] Update ONNX documentation to include unsupported operators (#84496 ) - Update ONNX documentation to include unsupported operators - Include aten, quantized and other namespaces Pull Request resolved: https://github.com/pytorch/pytorch/pull/84496 Approved by: https://github.com/AllenTiTaiWang, https://github.com/BowenBao, https://github.com/kit1980	2022-09-16 23:48:37 +00:00
Feisi Fu	d8eae6283d	Rename 'torch/ao/nn/quantized._reference' to 'torch/ao/nn/quantized/reference'. (#84974 ) Currently, the path for reference modules contains _, which means it is private (https://github.com/pytorch/pytorch/tree/master/torch/ao/nn/quantized/_reference), but we would like to make it public since the reference module is now enabled by default in the fx graph mode quantization flow and will be added to the eager mode flow as well in the future. To make '_reference' public, it should satisfy the [public API rules](https://github.com/pytorch/pytorch/wiki/Public-API-definition-and-documentation). In the first commit (prepare '_reference' to be public) I: 1. added __all__ to public modules and packages; 2. made functions that are only used in the file where they are defined private by prefixing their names with _. Fixes #83090. (We rename 'torch/ao/nn/quantized/_reference' because of migration #81667.) This is a duplicate of #84786. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84974 Approved by: https://github.com/andrewor14, https://github.com/z-a-f	2022-09-16 17:49:07 +00:00
Khushi Agrawal	2386cd2945	[reland] [numpy] add torch.concatenate, alias of torch.cat (#85073 ) Previous PR: #82946 Fixes #81161 Pull Request resolved: https://github.com/pytorch/pytorch/pull/85073 Approved by: https://github.com/mruberry	2022-09-15 19:34:44 +00:00
PyTorch MergeBot	fa7bf3e2dc	Revert "[numpy] add `torch.concatenate`, alias of torch.cat (#82946 )" This reverts commit `270e5e519d`. Reverted https://github.com/pytorch/pytorch/pull/82946 on behalf of https://github.com/malfet due to Broke M1 tests, see `270e5e519d`	2022-09-14 21:32:11 +00:00
Khushi Agrawal	270e5e519d	[numpy] add `torch.concatenate`, alias of torch.cat (#82946 ) As per the title. Fixes: #81161 - [x] add ErrorInputs - ~[ ] dtype argument?~ - ~[ ] casting argument?~ As discussed offline with @kshitij12345, we can currently ignore `dtype` and `casting` arguments. cc: @kshitij12345! Pull Request resolved: https://github.com/pytorch/pytorch/pull/82946 Approved by: https://github.com/mruberry	2022-09-14 19:28:43 +00:00
Nayef Ahmed	cb9ef4668e	Updated library level maintainers for torchtext (#84950 ) - Updated library level maintainers for torchtext to reflect internal changes to the team Pull Request resolved: https://github.com/pytorch/pytorch/pull/84950 Approved by: https://github.com/mthrok	2022-09-14 00:35:36 +00:00
Mikayla Gawarecki	e217b30b0f	Add `torch.nested` namespace (#84102 ) First step towards #83775 - only `to_padded_tensor` is moved to the nested namespace for now - following the schema used for `special`, `fft`, `linalg` and other namespaces, nested functions are registered in native_functions.yaml as `nested_{function_name}` and are bound to the desired Python name in `torch/nested/__init__.py`, and the desired C++ name in `torch/csrc/api/include/torch/nested.h`. ~~Question: should we keep the documentation for `Tensor.to_padded_tensor` or can this deleted since it is shared by `torch.nested.to_padded_tensor`?~~ [generated nested docs](https://docs-preview.pytorch.org/84102/nested.html?highlight=nested#module-torch.nested) Differential Revision: [D39361148](https://our.internmc.facebook.com/intern/diff/D39361148) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84102 Approved by: https://github.com/drisspg	2022-09-12 16:31:05 +00:00
Slava Kovalevskyi	2698f99dc7	fixing form link for governance (#84861 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84861 Approved by: https://github.com/malfet	2022-09-12 14:15:52 +00:00
Dmytro Dzhulgakov	96e4bd9500	[docs] Person of interest update: sparse, torchrec and smaller tweaks (#84772 ) Fixes #83363 This is not a full update yet, but fixes some obvious things: missing modules (torchrec, sparse) and brings a few people from merge_rules.json who are working on the respective modules. There are still discrepancies - e.g. Intel CPU work is split in many categories in merge_rules, but it's better to improve things incrementally. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84772 Approved by: https://github.com/b0noI, https://github.com/malfet	2022-09-10 00:09:57 +00:00
Ivan Yashchuk	01c54ad6de	Remove deprecated torch.eig (#70982 ) The time has come to remove deprecated linear algebra related functions. This PR removes `torch.eig`. cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/70982 Approved by: https://github.com/Lezcano, https://github.com/malfet	2022-09-09 21:31:57 +00:00
Mateusz Sypniewski	d12f3524b7	Add user facing documentation for CSAN (#84689 ) This adds a user facing tutorial for the CSAN tool. The documentation preview should be available [here](https://docs-preview.pytorch.org/84689/index.html) once the GitHub job completes on this PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84689 Approved by: https://github.com/lw	2022-09-09 15:29:34 +00:00
Jerry Zhang	214a6500e3	[quant][docs] Additional fixes for quantize_fx docs (#84587 ) Summary: Some more clarifications for the arguments, including linking to object docs (QConfigMapping, BackendConfig) and adding types in the doc Test Plan: ``` cd docs make html ``` and visual inspection for the generated docs Pull Request resolved: https://github.com/pytorch/pytorch/pull/84587 Approved by: https://github.com/vkuzo	2022-09-09 15:23:23 +00:00
Sergii Dymchenko	49ec8d32c7	Suggest draft PRs in contribution_guide.rst (#84658 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84658 Approved by: https://github.com/huydhn	2022-09-08 03:12:50 +00:00
Eddie Yan	d892d5d682	[CUBLAS][TF32][CUDNN] Update numerical_accuracy.rst (#79537 ) CC @mruberry @ptrblck Pull Request resolved: https://github.com/pytorch/pytorch/pull/79537 Approved by: https://github.com/ngimel, https://github.com/mruberry	2022-09-07 18:30:26 +00:00
Bin Chen	06ebe2d5bc	Add watchdog to TorchElastic agent and trainers (#84081 ) Summary: D38604238 (`3b11b80fc3`) introduced a named pipe based watchdog timer. This diff uses the named pipe based watchdog timer in the TorchElastic agent and training worker processes (in the StuckJobDetector class) so that the TorchElastic agent can detect a stuck training process and kill it to create a core dump. Test Plan: ``` buck test mode/dev-nosan //caffe2/test/distributed/elastic/agent/server/test:local_agent_test ``` ``` RemoteExecution session id: reSessionID-0bfcacef-24d1-42bc-a1d3-f3058fc42b2f-tpx Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/7318349503394739 ✓ ListingSuccess: caffe2/test/distributed/elastic/agent/server/test:local_agent_test : 55 tests discovered (22.699) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_barrier_failed_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (47.140) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_distributed_sum_homogeneous_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (49.198) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_happy_function_c10d (local_elastic_agent_test.LocalElasticAgentTest) (46.387) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_happy_function_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (46.094) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_bipolar_function_etcd (local_elastic_agent_test.LocalElasticAgentTest) (106.342) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_correct_rank_assignment_homogeneous_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (64.888) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_correct_rank_assignment_homogeneous_etcd 
(local_elastic_agent_test.LocalElasticAgentTest) (69.158) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_agent_local_watchdog_setup_enabled_etcd (local_elastic_agent_test.LocalElasticAgentTest) (46.965) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_double_agent_elastic_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (79.626) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_function_with_return_value_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (46.113) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_sad_function_etcd (local_elastic_agent_test.LocalElasticAgentTest) (46.487) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_shutdown_called_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (24.358) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_torch_rpc_c10d (local_elastic_agent_test.LocalElasticAgentTest) (48.216) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_distributed_sum_homogeneous_c10d (local_elastic_agent_test.LocalElasticAgentTest) (48.433) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_torch_rpc_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (47.029) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_simple_dist_sum_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (44.357) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_check_master_addr_port_override_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (45.176) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_check_nccl_async_error_handling_env_default_c10d (local_elastic_agent_test.LocalElasticAgentTest) (45.980) ✓ Pass: 
caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_simple_dist_sum_c10d (local_elastic_agent_test.LocalElasticAgentTest) (47.151) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_simple_dist_sum_etcd (local_elastic_agent_test.LocalElasticAgentTest) (44.614) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_correct_rank_assignment_heterogeneous_etcd (local_elastic_agent_test.LocalElasticAgentTest) (69.099) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_agent_local_watchdog_setup_enabled_c10d (local_elastic_agent_test.LocalElasticAgentTest) (45.367) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_shutdown_called_etcd (local_elastic_agent_test.LocalElasticAgentTest) (22.804) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_double_agent_elastic_c10d (local_elastic_agent_test.LocalElasticAgentTest) (77.560) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_dummy_compute_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (46.050) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_distributed_sum_heterogeneous_c10d (local_elastic_agent_test.LocalElasticAgentTest) (48.088) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_double_agent_elastic_etcd (local_elastic_agent_test.LocalElasticAgentTest) (77.286) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_double_agent_fault_tolerance_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (50.670) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_check_master_addr_port_override_etcd (local_elastic_agent_test.LocalElasticAgentTest) (45.631) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_distributed_sum_heterogeneous_etcd_v2 
(local_elastic_agent_test.LocalElasticAgentTest) (50.867) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_double_agent_fault_tolerance_etcd (local_elastic_agent_test.LocalElasticAgentTest) (51.095) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_happy_function_etcd (local_elastic_agent_test.LocalElasticAgentTest) (45.000) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_sad_function_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (45.197) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_distributed_sum_homogeneous_etcd (local_elastic_agent_test.LocalElasticAgentTest) (46.873) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_shutdown_called_c10d (local_elastic_agent_test.LocalElasticAgentTest) (23.160) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_barrier_failed_etcd (local_elastic_agent_test.LocalElasticAgentTest) (43.632) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_torch_rpc_etcd (local_elastic_agent_test.LocalElasticAgentTest) (44.536) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_bipolar_function_c10d (local_elastic_agent_test.LocalElasticAgentTest) (89.859) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_workers_drift_fail_etcd (local_elastic_agent_test.LocalElasticAgentTest) (48.277) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_check_nccl_async_error_handling_env_c10d (local_elastic_agent_test.LocalElasticAgentTest) (43.930) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_bipolar_function_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (87.677) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - 
test_workers_drift_success_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (48.965) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_workers_drift_fail_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (50.143) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_workers_drift_success_etcd (local_elastic_agent_test.LocalElasticAgentTest) (46.781) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_function_with_return_value_etcd (local_elastic_agent_test.LocalElasticAgentTest) (45.152) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_barrier_failed_c10d (local_elastic_agent_test.LocalElasticAgentTest) (44.832) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_function_with_return_value_c10d (local_elastic_agent_test.LocalElasticAgentTest) (45.281) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_correct_rank_assignment_heterogeneous_etcd_v2 (local_elastic_agent_test.LocalElasticAgentTest) (74.968) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_agent_local_watchdog_setup_disabled_c10d (local_elastic_agent_test.LocalElasticAgentTest) (46.141) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_dummy_compute_c10d (local_elastic_agent_test.LocalElasticAgentTest) (44.960) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_dummy_compute_etcd (local_elastic_agent_test.LocalElasticAgentTest) (45.292) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_agent_local_watchdog_setup_disabled_etcd (local_elastic_agent_test.LocalElasticAgentTest) (44.611) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_check_env_function_etcd (local_elastic_agent_test.LocalElasticAgentTest) (44.939) ✓ Pass: 
caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_distributed_sum_heterogeneous_etcd (local_elastic_agent_test.LocalElasticAgentTest) (47.609) ✓ Pass: caffe2/test/distributed/elastic/agent/server/test:local_agent_test - test_run_sad_function_c10d (local_elastic_agent_test.LocalElasticAgentTest) (45.628) Summary Pass: 55 ListingSuccess: 1 Finished test run: https://www.internalfb.com/intern/testinfra/testrun/7318349503394739 ``` ----------- ``` buck test caffe2/torch/fb/trainer/stuck_detection/tests:stuck_job_detector_test ``` ``` RemoteExecution session id: reSessionID-607a0028-4095-4dfc-b657-55f0807fe621-tpx Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/8162774432794818 ✓ ListingSuccess: caffe2/torch/fb/trainer/stuck_detection/tests:stuck_job_detector_test : 11 tests discovered (39.037) ✓ Pass: caffe2/torch/fb/trainer/stuck_detection/tests:stuck_job_detector_test - test_thrift_api_called (caffe2.torch.fb.trainer.stuck_detection.tests.collect_quickstack_test.CollectQuickstackTrace) (0.655) ✓ Pass: caffe2/torch/fb/trainer/stuck_detection/tests:stuck_job_detector_test - test_setup_local_watchdog (caffe2.torch.fb.trainer.stuck_detection.tests.stuck_job_detector_test.StuckJobDetectorTest) (36.510) ✓ Pass: caffe2/torch/fb/trainer/stuck_detection/tests:stuck_job_detector_test - test_dont_print_when_job_normal (caffe2.torch.fb.trainer.stuck_detection.tests.stuck_job_detector_test.StuckJobDetectorTest) (36.727) ✓ Pass: caffe2/torch/fb/trainer/stuck_detection/tests:stuck_job_detector_test - test_send_watchdog_request_on_batch_callbacks_no_server (caffe2.torch.fb.trainer.stuck_detection.tests.stuck_job_detector_test.StuckJobDetectorTest) (37.060) ✓ Pass: caffe2/torch/fb/trainer/stuck_detection/tests:stuck_job_detector_test - test_quickstack_stuck_job (caffe2.torch.fb.trainer.stuck_detection.tests.stuck_job_detector_test.StuckJobDetectorTest) (37.242) ✓ Pass: 
caffe2/torch/fb/trainer/stuck_detection/tests:stuck_job_detector_test - test_setup_local_watchdog_disabled (caffe2.torch.fb.trainer.stuck_detection.tests.stuck_job_detector_test.StuckJobDetectorTest) (37.243) ✓ Pass: caffe2/torch/fb/trainer/stuck_detection/tests:stuck_job_detector_test - test_print_stack_trace_when_job_stuck (caffe2.torch.fb.trainer.stuck_detection.tests.stuck_job_detector_test.StuckJobDetectorTest) (37.590) ✓ Pass: caffe2/torch/fb/trainer/stuck_detection/tests:stuck_job_detector_test - test_print_when_stuck (caffe2.torch.fb.trainer.stuck_detection.tests.stuck_job_detector_test.StuckJobDetectorTest) (37.590) ✓ Pass: caffe2/torch/fb/trainer/stuck_detection/tests:stuck_job_detector_test - test_setup_local_watchdog_no_file (caffe2.torch.fb.trainer.stuck_detection.tests.stuck_job_detector_test.StuckJobDetectorTest) (37.589) ✓ Pass: caffe2/torch/fb/trainer/stuck_detection/tests:stuck_job_detector_test - test_signposts_stack_trace_when_job_stuck (caffe2.torch.fb.trainer.stuck_detection.tests.stuck_job_detector_test.StuckJobDetectorTest) (38.132) ✓ Pass: caffe2/torch/fb/trainer/stuck_detection/tests:stuck_job_detector_test - test_send_watchdog_request_on_batch_callbacks (caffe2.torch.fb.trainer.stuck_detection.tests.stuck_job_detector_test.StuckJobDetectorTest) (38.133) Summary Pass: 11 ListingSuccess: 1 Finished test run: https://www.internalfb.com/intern/testinfra/testrun/8162774432794818 ``` Differential Revision: D38930476 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84081 Approved by: https://github.com/d4l3k	2022-09-07 00:17:20 +00:00
Edward Z. Yang	2a332afbf4	Add SymFloat, support SymInt to SymFloat conversion (#84284 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/84284 Approved by: https://github.com/albanD	2022-09-03 01:30:32 +00:00
Slava Kovalevskyi	c585e149e2	Process for maintaining Build + CI contributors list (#83869 ) The following issues are fixed: * process of adding new contributors to the "Build + CI" module added * folks who qualified are explicitly added Pull Request resolved: https://github.com/pytorch/pytorch/pull/83869 Approved by: https://github.com/svekars, https://github.com/seemethere, https://github.com/malfet	2022-08-31 21:48:39 +00:00
apeltop	e7635c06ce	Fix typos in docs (#80602 ) I hope it helps. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80602 Approved by: https://github.com/kit1980	2022-08-29 23:32:44 +00:00
Zain Rizvi	d62a6ca521	Link to instructions on submitting an RFC (#83990 ) Point people to instructions on how to create an RFC Pull Request resolved: https://github.com/pytorch/pytorch/pull/83990 Approved by: https://github.com/janeyx99	2022-08-29 20:31:30 +00:00
Christian Jauvin	089101fc82	Fix small typo in cuda.rst (#84012 ) This fixes a very minor typo in the CUDA semantics doc. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84012 Approved by: https://github.com/malfet	2022-08-26 04:53:49 +00:00
Michael Voznesensky	ced2ca8f86	Torch cond operator, python dispatch, pyoperator (#83154 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83154 Approved by: https://github.com/ezyang	2022-08-25 20:11:53 +00:00
zaf	2f04ba2c7c	[quant][ao_migration] `torch.nn.qat` → `torch.ao.nn.qat` (#78716 ) Context: In order to avoid the cluttering of the `torch.nn` namespace the quantized modules namespace is moved to `torch.ao.nn`. The list of the `nn.quantized` files that are being migrated: - [X] `torch.nn.quantized` → `torch.ao.nn.quantized` - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` - [X] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` - [X] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference` - [X] `torch.nn.quantizable` → `torch.ao.nn.quantizable` - [X] [Current PR] `torch.nn.qat` → `torch.ao.nn.qat` - [X] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules` - [X] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic` - [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic` - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules` - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat` - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized` - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules` - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic` Majority of the files are just moved to the new location. However, specific files need to be double checked: - None Differential Revision: [D36861197](https://our.internmc.facebook.com/intern/diff/D36861197/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36861197/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/78716 Approved by: https://github.com/jerryzh168	2022-08-25 16:50:38 +00:00
zaf	29e83b6599	[quant][ao_migration] `torch.nn.quantizable` → `torch.ao.nn.quantizable`. (#78717 ) Context: In order to avoid the cluttering of the `torch.nn` namespace the quantized modules namespace is moved to `torch.ao.nn`. The list of the `nn.quantized` files that are being migrated: - [X] `torch.nn.quantized` → `torch.ao.nn.quantized` - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` - [X] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` - [X] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference` - [X] [Current PR] `torch.nn.quantizable` → `torch.ao.nn.quantizable` - [ ] `torch.nn.qat` → `torch.ao.nn.qat` - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules` - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic` - [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic` - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules` - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat` - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized` - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules` - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic` Majority of the files are just moved to the new location. However, specific files need to be double checked: - `torch/ao/nn/__init__.py` → Changing the imports to lazy. Differential Revision: [D36861090](https://our.internmc.facebook.com/intern/diff/D36861090/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36861090/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/78717 Approved by: https://github.com/jerryzh168	2022-08-25 16:50:37 +00:00
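The commit above notes that the imports in `torch/ao/nn/__init__.py` were changed to lazy ones. How that file does it is not shown in this log, so the following is only a minimal sketch of one common way to make package imports lazy in Python, a module-level `__getattr__` (PEP 562). The helper `make_lazy_package`, the package name `fake_ao_nn`, and the use of `json.decoder` as a stand-in submodule are all hypothetical and are not part of PyTorch.

```python
import importlib
import sys
import types


def make_lazy_package(name, submodules):
    """Build a module object whose listed submodules are imported on first access.

    `submodules` maps an attribute name to the dotted path of the module to load.
    In a real package the inner function would simply be a top-level
    `def __getattr__(name)` inside `__init__.py`.
    """
    pkg = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when normal attribute lookup on the module fails.
        if attr in submodules:
            module = importlib.import_module(submodules[attr])
            setattr(pkg, attr, module)  # cache so later lookups skip this hook
            return module
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    pkg.__getattr__ = __getattr__
    return pkg


# Hypothetical stand-in: "decoder" plays the role of a migrated submodule.
lazy_pkg = make_lazy_package("fake_ao_nn", {"decoder": "json.decoder"})
```

Nothing is imported when `lazy_pkg` is created; the first access to `lazy_pkg.decoder` triggers the import and caches the result on the module, which is the usual reason for choosing this pattern when eager imports would be slow or circular.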
zaf	d32a762147	[quant][ao_migration] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` (#78714 ) Context: In order to avoid the cluttering of the `torch.nn` namespace the quantized modules namespace is moved to `torch.ao.nn`. The list of the `nn.quantized` files that are being migrated: - [ ] `torch.nn.quantized` → `torch.ao.nn.quantized` - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` - [X] [Current PR] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference` - [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable` - [ ] `torch.nn.qat` → `torch.ao.nn.qat` - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules` - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic` - [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic` - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules` - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat` - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized` - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules` - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic` Majority of the files are just moved to the new location. 
However, specific files need to be double checked: - [Documentation](docs/source/quantization-support.rst) @vkuzo - [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10 - [BC test](test/quantization/bc/test_backward_compatibility.py) @vkuzo - [IR emitter](torch/csrc/jit/frontend/ir_emitter.cpp) @jamesr66a - [JIT serialization](torch/csrc/jit/serialization/import_source.cpp) @IvanKobzarev @jamesr66a Differential Revision: [D36860660](https://our.internmc.facebook.com/intern/diff/D36860660/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36860660/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/78714 Approved by: https://github.com/jerryzh168	2022-08-25 16:50:34 +00:00
zaf	c92e5ac95b	[quant][ao_migration] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` (#78713 ) Context: In order to avoid the cluttering of the `torch.nn` namespace the quantized modules namespace is moved to `torch.ao.nn`. The list of the `nn.quantized` files that are being migrated: - [ ] `torch.nn.quantized` → `torch.ao.nn.quantized` - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` - [X] [Current PR] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` - [ ] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference` - [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable` - [ ] `torch.nn.qat` → `torch.ao.nn.qat` - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules` - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic` - [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic` - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules` - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat` - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized` - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules` - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic` Majority of the files are just moved to the new location. 
However, specific files need to be double checked: - Documentation @vkuzo - docs/source/conf.py - docs/source/quantization.rst - [quantize_fx](torch/ao/quantization/quantize_fx.py) @jerryzh168 - [common test routine](test/quantization/ao_migration/common.py) @HDCharles - JIT stuff @jamesr66a - torch/csrc/jit/passes/hoist_conv_packed_params.cpp - torch/csrc/jit/passes/quantization/helper.h - torch/csrc/jit/serialization/import_source.cpp Differential Revision: [D38926012](https://our.internmc.facebook.com/intern/diff/D38926012/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/78713 Approved by: https://github.com/jerryzh168	2022-08-25 16:50:33 +00:00
Bin Chen	3b11b80fc3	Named pipe based watchdog timer (#83695 ) Summary: This diff implements a named pipe based watchdog timer (`FileTimerClient` and `FileTimerServer`). This is similar to the existing `LocalTimerClient` and `LocalTimerServer` (https://fburl.com/code/j4b9pyya). The motivation is the need to handle various timeout issues: the training process occasionally gets stuck, and we need a proper watchdog to monitor the liveness of the training processes. This timer allows the TorchElastic agent (as the watchdog) to monitor the progress of the training processes that it spawned. If a timeout occurs, the TorchElastic agent can kill the stuck process and create a core dump for it. `LocalTimerClient` and `LocalTimerServer` require a `multiprocessing.Queue()` to work, so they can only be used between `multiprocessing` parent and child processes. `FileTimerClient` and `FileTimerServer` do not have this limitation. Test Plan: ### Unit Test ``` buck test mode/opt caffe2/test/distributed/elastic/timer:file_based_timer_test ``` ``` RemoteExecution session id: reSessionID-06d70a77-043c-4d9d-b0f2-94c24460740a-tpx Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/844425186732666 ✓ ListingSuccess: caffe2/test/distributed/elastic/timer:file_based_timer_test : 12 tests discovered (2.177) ✓ Pass: caffe2/test/distributed/elastic/timer:file_based_timer_test - test_happy_path (file_based_local_timer_test.FileTimerTest) (2.463) ✓ Pass: caffe2/test/distributed/elastic/timer:file_based_timer_test - test_expired_timers (file_based_local_timer_test.FileTimerServerTest) (1.889) ✓ Pass: caffe2/test/distributed/elastic/timer:file_based_timer_test - test_send_request_release (file_based_local_timer_test.FileTimerServerTest) (1.700) ✓ Pass: caffe2/test/distributed/elastic/timer:file_based_timer_test - test_valid_timers (file_based_local_timer_test.FileTimerServerTest) (1.873) ✓ Pass: 
caffe2/test/distributed/elastic/timer:file_based_timer_test - test_watchdog_call_count (file_based_local_timer_test.FileTimerServerTest) (1.715) ✓ Pass: caffe2/test/distributed/elastic/timer:file_based_timer_test - test_watchdog_empty_queue (file_based_local_timer_test.FileTimerServerTest) (1.609) ✓ Pass: caffe2/test/distributed/elastic/timer:file_based_timer_test - test_exception_propagation (file_based_local_timer_test.FileTimerTest) (1.633) ✓ Pass: caffe2/test/distributed/elastic/timer:file_based_timer_test - test_multiple_clients_interaction (file_based_local_timer_test.FileTimerTest) (2.189) ✓ Pass: caffe2/test/distributed/elastic/timer:file_based_timer_test - test_get_timer_recursive (file_based_local_timer_test.FileTimerTest) (2.295) ✓ Pass: caffe2/test/distributed/elastic/timer:file_based_timer_test - test_no_client (file_based_local_timer_test.FileTimerTest) (1.753) ✓ Pass: caffe2/test/distributed/elastic/timer:file_based_timer_test - test_timer (file_based_local_timer_test.FileTimerTest) (2.151) ✓ Pass: caffe2/test/distributed/elastic/timer:file_based_timer_test - test_client_interaction (file_based_local_timer_test.FileTimerTest) (1.895) Summary Pass: 12 ListingSuccess: 1 Finished test run: https://www.internalfb.com/intern/testinfra/testrun/844425186732666 ``` Differential Revision: D38604238 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83695 Approved by: https://github.com/d4l3k	2022-08-24 22:16:12 +00:00
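The named pipe idea described in the commit above can be illustrated with a small standard-library sketch. This is not the `FileTimerClient`/`FileTimerServer` implementation: the class names `PipeWatchdogServer` and `PipeWatchdogClient`, the `worker,expiry` line protocol, and the convention that a negative expiry releases a timer are all assumptions made for illustration. It relies on `os.mkfifo`, so it only runs on Unix-like systems.

```python
import os
import tempfile


class PipeWatchdogServer:
    """Hypothetical watchdog side: collects deadlines sent over a named pipe."""

    def __init__(self, path):
        self.path = path
        os.mkfifo(path)
        # Non-blocking read end, so opening does not wait for a writer.
        self._fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        self._pending = b""
        self.deadlines = {}  # worker name -> absolute expiry time

    def _drain(self):
        while True:
            try:
                chunk = os.read(self._fd, 4096)
            except BlockingIOError:
                break  # a writer is connected but has sent nothing new
            if not chunk:
                break  # no writer connected and the pipe is empty
            self._pending += chunk
        # Keep any incomplete trailing line for the next call.
        *lines, self._pending = self._pending.split(b"\n")
        for line in lines:
            worker, expiry = line.decode().split(",")
            if float(expiry) < 0:
                self.deadlines.pop(worker, None)  # release request
            else:
                self.deadlines[worker] = float(expiry)

    def expired(self, now):
        """Return the workers whose deadline has passed at time `now`."""
        self._drain()
        return sorted(w for w, t in self.deadlines.items() if t <= now)

    def close(self):
        os.close(self._fd)
        os.unlink(self.path)


class PipeWatchdogClient:
    """Hypothetical worker side: any process that knows the path can report."""

    def __init__(self, path):
        self.path = path

    def _send(self, worker, expiry):
        fd = os.open(self.path, os.O_WRONLY | os.O_NONBLOCK)
        try:
            os.write(fd, f"{worker},{expiry}\n".encode())
        finally:
            os.close(fd)

    def acquire(self, worker, expiry):
        self._send(worker, expiry)

    def release(self, worker):
        self._send(worker, -1)


if __name__ == "__main__":
    pipe_path = os.path.join(tempfile.mkdtemp(), "watchdog.pipe")
    server = PipeWatchdogServer(pipe_path)
    PipeWatchdogClient(pipe_path).acquire("worker0", 100.0)
    print(server.expired(now=150.0))
    server.close()
```

Because the two sides share only a filesystem path, the client does not have to be a `multiprocessing` child of the server, which is the limitation of the queue-based timers that the commit message points out. A real agent would pass the current time to `expired` in a polling loop and then act on the returned workers.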
PyTorch MergeBot	6a9c02339d	Revert "[quant][ao_migration] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` (#78713 )" This reverts commit `432f037498`. Reverted https://github.com/pytorch/pytorch/pull/78713 on behalf of https://github.com/janeyx99 due to Reverting for breaking (trunk-only) ios build	2022-08-22 07:32:37 +00:00
PyTorch MergeBot	b1a7b67529	Revert "[quant][ao_migration] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` (#78714 )" This reverts commit `e6fb97d8ae`. Reverted https://github.com/pytorch/pytorch/pull/78714 on behalf of https://github.com/janeyx99 due to sorry, reverting so https://github.com/pytorch/pytorch/pull/78713 could be cleanly reverted	2022-08-22 07:30:48 +00:00
PyTorch MergeBot	e9dd4d5adf	Revert "[quant][ao_migration] `torch.nn.quantizable` → `torch.ao.nn.quantizable`. (#78717 )" This reverts commit `e0876feb49`. Reverted https://github.com/pytorch/pytorch/pull/78717 on behalf of https://github.com/janeyx99 due to sorry, reverting so https://github.com/pytorch/pytorch/pull/78713 could be cleanly reverted	2022-08-22 07:26:44 +00:00
PyTorch MergeBot	4cbb1986fe	Revert "[quant][ao_migration] `torch.nn.qat` → `torch.ao.nn.qat` (#78716 )" This reverts commit `7cd2fa1d38`. Reverted https://github.com/pytorch/pytorch/pull/78716 on behalf of https://github.com/janeyx99 due to sorry, reverting so https://github.com/pytorch/pytorch/pull/78713 could be cleanly reverted	2022-08-22 07:23:24 +00:00
zaf	7cd2fa1d38	[quant][ao_migration] `torch.nn.qat` → `torch.ao.nn.qat` (#78716 ) Context: In order to avoid the cluttering of the `torch.nn` namespace the quantized modules namespace is moved to `torch.ao.nn`. The list of the `nn.quantized` files that are being migrated: - [X] `torch.nn.quantized` → `torch.ao.nn.quantized` - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` - [X] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` - [X] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference` - [X] `torch.nn.quantizable` → `torch.ao.nn.quantizable` - [X] [Current PR] `torch.nn.qat` → `torch.ao.nn.qat` - [X] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules` - [X] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic` - [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic` - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules` - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat` - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized` - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules` - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic` Majority of the files are just moved to the new location. However, specific files need to be double checked: - None Differential Revision: [D36861197](https://our.internmc.facebook.com/intern/diff/D36861197/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36861197/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/78716 Approved by: https://github.com/jerryzh168	2022-08-22 05:33:23 +00:00
zaf	e0876feb49	[quant][ao_migration] `torch.nn.quantizable` → `torch.ao.nn.quantizable`. (#78717 ) Context: In order to avoid the cluttering of the `torch.nn` namespace the quantized modules namespace is moved to `torch.ao.nn`. The list of the `nn.quantized` files that are being migrated: - [X] `torch.nn.quantized` → `torch.ao.nn.quantized` - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` - [X] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` - [X] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference` - [X] [Current PR] `torch.nn.quantizable` → `torch.ao.nn.quantizable` - [ ] `torch.nn.qat` → `torch.ao.nn.qat` - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules` - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic` - [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic` - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules` - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat` - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized` - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules` - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic` Majority of the files are just moved to the new location. However, specific files need to be double checked: - None Differential Revision: [D36861090](https://our.internmc.facebook.com/intern/diff/D36861090/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36861090/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/78717 Approved by: https://github.com/jerryzh168	2022-08-22 05:31:48 +00:00
zaf	e6fb97d8ae	[quant][ao_migration] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` (#78714 ) Context: In order to avoid the cluttering of the `torch.nn` namespace the quantized modules namespace is moved to `torch.ao.nn`. The list of the `nn.quantized` files that are being migrated: - [ ] `torch.nn.quantized` → `torch.ao.nn.quantized` - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` - [X] [Current PR] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference` - [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable` - [ ] `torch.nn.qat` → `torch.ao.nn.qat` - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules` - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic` - [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic` - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules` - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat` - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized` - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules` - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic` Majority of the files are just moved to the new location. 
However, specific files need to be double checked: - [Documentation](docs/source/quantization-support.rst) @vkuzo - [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10 - [BC test](test/quantization/bc/test_backward_compatibility.py) @vkuzo - [IR emitter](torch/csrc/jit/frontend/ir_emitter.cpp) @jamesr66a - [JIT serialization](torch/csrc/jit/serialization/import_source.cpp) @IvanKobzarev @jamesr66a Differential Revision: [D36860660](https://our.internmc.facebook.com/intern/diff/D36860660/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36860660/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/78714 Approved by: https://github.com/jerryzh168	2022-08-22 05:22:00 +00:00
zaf	432f037498	[quant][ao_migration] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` (#78713 ) Context: In order to avoid the cluttering of the `torch.nn` namespace the quantized modules namespace is moved to `torch.ao.nn`. The list of the `nn.quantized` files that are being migrated: - [ ] `torch.nn.quantized` → `torch.ao.nn.quantized` - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` - [X] [Current PR] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` - [ ] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference` - [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable` - [ ] `torch.nn.qat` → `torch.ao.nn.qat` - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules` - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic` - [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic` - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules` - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat` - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized` - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules` - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic` Majority of the files are just moved to the new location. 
However, specific files need to be double checked: - Documentation @vkuzo - docs/source/conf.py - docs/source/quantization.rst - [quantize_fx](torch/ao/quantization/quantize_fx.py) @jerryzh168 - [common test routine](test/quantization/ao_migration/common.py) @HDCharles - JIT stuff @jamesr66a - torch/csrc/jit/passes/hoist_conv_packed_params.cpp - torch/csrc/jit/passes/quantization/helper.h - torch/csrc/jit/serialization/import_source.cpp Differential Revision: [D36860145](https://our.internmc.facebook.com/intern/diff/D36860145/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/78713 Approved by: https://github.com/jerryzh168	2022-08-22 01:38:55 +00:00
zaf	78c8a0d752	[quant][ao_migration] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` (#78712 ) Context: In order to avoid the cluttering of the `torch.nn` namespace the quantized modules namespace is moved to `torch.ao.nn`. The list of the `nn.quantized` files that are being migrated: - [ ] `torch.nn.quantized` → `torch.ao.nn.quantized` - [X] [Current PR] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` - [ ] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` - [ ] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference` - [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable` - [ ] `torch.nn.qat` → `torch.ao.nn.qat` - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules` - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic` - [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic` - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules` - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat` - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized` - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules` - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic` Majority of the files are just moved to the new location. However, specific files need to be double checked: - [Documentation](docs/source/quantization-support.rst) @vkuzo - [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10 Differential Revision: [D36792967](https://our.internmc.facebook.com/intern/diff/D36792967/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36792967/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/78712 Approved by: https://github.com/jerryzh168	2022-08-18 17:51:54 +00:00
George Qi	94ba085ce0	[maskedtensor] first commit, core and creation (#82836 ) * __->__ #82836 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82836 Approved by: https://github.com/albanD, https://github.com/bhosmer	2022-08-16 20:10:34 +00:00
Slava Kovalevskyi	2c79b9c638	module names are made more consistent with POI page (#83219 ) Less intrusive update after the first attempt got reverted: https://github.com/pytorch/pytorch/pull/83127 fix for: #83363 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83219 Approved by: https://github.com/malfet	2022-08-16 18:38:08 +00:00
joncrall	4618371da5	Integrate xdoctest - Rebased (#82797 ) This is a new version of #15648 based on the latest master branch. Unlike the previous PR where I fixed a lot of the doctests in addition to integrating xdoctest, I'm going to reduce the scope here. I'm simply going to integrate xdoctest, and then I'm going to mark all of the failing tests as "SKIP". This will let xdoctest run on the dashboards, provide some value, and still let the dashboards pass. I'll leave fixing the doctests themselves to another PR. In my initial commit, I do the bare minimum to get something running with failing dashboards. The few tests that I marked as skip are causing segfaults. Running xdoctest results in 293 failed, 201 passed tests. The next commits will be to disable those tests. (unfortunately I don't have a tool that will insert the `#xdoctest: +SKIP` directive over every failing test, so I'm going to do this mostly manually.) Fixes https://github.com/pytorch/pytorch/issues/71105 @ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/82797 Approved by: https://github.com/ezyang	2022-08-12 02:08:01 +00:00
Zachary DeVito	4128712397	Propagate CUDAOutOfMemoryError to Python. (#83146 ) The intention is to make it easier to catch this situation for debugging, logging, or application-specific recovery. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83146 Approved by: https://github.com/albanD	2022-08-11 21:32:11 +00:00
Federico Pozzi	f8a10a7f79	feat: add PolynomialLR scheduler (#82769 ) ### Description <!-- What did you change and why was it needed? --> Add PolynomialLR scheduler. ### Issue Closes #79511. ### Testing I added tests for PolynomialLR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82769 Approved by: https://github.com/datumbox	2022-08-10 18:21:00 +00:00
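The scheduler added in the commit above decays the learning rate along a polynomial curve until a fixed number of steps has elapsed. As a rough illustration, the closed form can be sketched in plain Python. The helper name `polynomial_lr` is an assumption for this sketch, and it is not the code from the PR.

```python
def polynomial_lr(base_lr: float, step: int, total_iters: int = 5, power: float = 1.0) -> float:
    """Closed-form polynomial decay (hypothetical helper, not the scheduler class).

    lr = base_lr * (1 - min(step, total_iters) / total_iters) ** power
    """
    clamped = min(step, total_iters)  # after total_iters steps the rate stays at 0
    return base_lr * (1.0 - clamped / total_iters) ** power


# Quadratic decay from 0.1 to 0 over 4 steps, then held at 0.
schedule = [polynomial_lr(0.1, s, total_iters=4, power=2.0) for s in range(6)]
```

With `power=1.0` the decay is linear. Larger powers drop the rate faster in the early steps.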
PyTorch MergeBot	3d61d93ea7	Revert "merge_rules, person_of_interst and CODEOWNERS now better aligned (#83127 )" This reverts commit `fb833aabac`. Reverted https://github.com/pytorch/pytorch/pull/83127 on behalf of https://github.com/malfet due to We should not have removed existing codeowners, nor spam Soumith and Ed with review requests	2022-08-10 16:31:28 +00:00
Slava Kovalevskyi	fb833aabac	merge_rules, person_of_interst and CODEOWNERS now better aligned (#83127 ) not 100% alignment just yet Pull Request resolved: https://github.com/pytorch/pytorch/pull/83127 Approved by: https://github.com/malfet	2022-08-10 14:46:25 +00:00
Sergii Dymchenko	a0b3854548	Change seperate -> separate (#83056 ) One instance was caught by Meta-internal "exact-word-misspell" linter in D38505529. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83056 Approved by: https://github.com/huydhn, https://github.com/seemethere	2022-08-09 23:11:34 +00:00
Slava Kovalevskyi	9ba1631c67	Governance process been actualized. (#82736 ) Changes: * a form for topic proposals for Core maintainers' review has been added * the merge_rules.json file is specified as the source of truth for the list of maintainers (since it is the file that actually defines permissions) * responsibilities of the module maintainers have been added (as per the last core maintainers meeting) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82736 Approved by: https://github.com/svekars, https://github.com/soumith	2022-08-09 00:40:20 +00:00
Justin Chu	c6cdca5c68	[ONNX] Reland #81953 Type utility for converting among JIT, torch and ONNX data types (#82995 ) Re-land #81953 Add `_type_utils` for handling data type conversion among JIT, torch and ONNX. - Replace dictionary / list indexing with methods in ScalarType - Breaking: Remove ScalarType from `symbolic_helper` and move it to `_type_utils` - Deprecated: "cast_pytorch_to_onnx", "pytorch_name_to_type", "scalar_name_to_pytorch", "scalar_type_to_onnx", "scalar_type_to_pytorch_type" in `symbolic_helper` - Deprecate the type mappings and lists. Remove all internal references - Move _cast_func_template to opset 9 and remove its reference elsewhere (clean up). Added documentation for easy discovery Why: List / dictionary indexing and lookup are error-prone and convoluted. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82995 Approved by: https://github.com/kit1980	2022-08-08 23:43:43 +00:00
Ben Wallace	7e3c3fd37b	Fix typos in `torch.package` documentation (#82994 ) This PR fixes typos found throughout the documentation for the `torch.package` module. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82994 Approved by: https://github.com/kit1980	2022-08-08 20:19:17 +00:00
Andrew Or	782f3489c6	[Quant][fx][bc-breaking] Integrate BackendConfig with quantization flow (part 2) (#82557 ) This is part 2 of the effort to replace `backend_config_dict` with a python config object, a more formal and robust API that leads to better user experience. This commit integrates the `BackendConfig` implemented in part 1 (https://github.com/pytorch/pytorch/pull/81469) with the existing FX graph mode quantization flow. Test Plan: python test/test_quantization.py TestQuantizeFx python test/test_quantization.py TestQuantizeFxOps BC-breaking Notes: Before: ``` import torch from torch.ao.quantization import get_default_qconfig_mapping from torch.ao.quantization.backend_config import ObservationType from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx dtype_config = { "input_dtype": torch.quint8, "output_dtype": torch.quint8, "weight_dtype": torch.qint8, "bias_dtype": torch.float, } backend_config_dict = { "name": "my_backend", "configs": [{ "pattern": torch.nn.Linear, "observation_type": ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT, "dtype_configs": [dtype_config], "root_module": torch.nn.Linear, "reference_quantized_module": torch.nn.quantized._reference.Linear, "qat_module": torch.nn.qat.Linear, }] } m = MyModel() qconfig_mapping = get_default_qconfig_mapping() example_inputs = (torch.rand(3, 3),) m = prepare_fx( m, qconfig_mapping, example_inputs, backend_config_dict=backend_config_dict) m = convert_fx(m, backend_config_dict=backend_config_dict) ``` After: ``` import torch from torch.ao.quantization import get_default_qconfig_mapping from torch.ao.quantization.backend_config import ( BackendConfig, BackendPatternConfig, DTypeConfig, ObservationType, ) from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx dtype_config = DTypeConfig( input_dtype=torch.quint8, output_dtype=torch.quint8, weight_dtype=torch.qint8, bias_dtype=torch.float, ) backend_config = BackendConfig("my_backend").set_backend_pattern_config( 
BackendPatternConfig(torch.nn.Linear) .set_observation_type(ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT) .add_dtype_config(dtype_config) .set_root_module(torch.nn.Linear) .set_reference_quantized_module(torch.nn.quantized._reference.Linear) .set_qat_module(torch.nn.qat.Linear)) m = MyModel() qconfig_mapping = get_default_qconfig_mapping() example_inputs = (torch.rand(3, 3),) m = prepare_fx(m, qconfig_mapping, example_inputs, backend_config=backend_config) m = convert_fx(m, backend_config=backend_config) ``` Reviewers: jerryzh168 Subscribers: jerryzh168, supriyar Differential Revision: [D38471932](https://our.internmc.facebook.com/intern/diff/D38471932) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82557 Approved by: https://github.com/jerryzh168	2022-08-08 18:55:50 +00:00
PyTorch MergeBot	b170a52a09	Revert "[ONNX] Type utility for converting among JIT, torch and ONNX data types (#81953 )" This reverts commit `6ddf4c6f58`. Reverted https://github.com/pytorch/pytorch/pull/81953 on behalf of https://github.com/kit1980 due to Broke internal builds by removing functions without deprecation	2022-08-07 20:15:28 +00:00
Justin Chu	6ddf4c6f58	[ONNX] Type utility for converting among JIT, torch and ONNX data types (#81953 ) Add `_type_utils` for handling data type conversion among JIT, torch and ONNX. - Replace dictionary / list indexing with methods in ScalarType - Breaking: Remove ScalarType from `symbolic_helper` and move it to `_type_utils` - Breaking: Remove "cast_pytorch_to_onnx", "pytorch_name_to_type", "scalar_name_to_pytorch", "scalar_type_to_onnx", "scalar_type_to_pytorch_type" from `symbolic_helper` - Deprecate the type mappings and lists. Remove all internal references - Move _cast_func_template to opset 9 and remove its reference elsewhere (clean up). Added documentation for easy discovery Why: List / dictionary indexing and lookup are error-prone and convoluted. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81953 Approved by: https://github.com/AllenTiTaiWang, https://github.com/BowenBao	2022-08-05 22:24:45 +00:00
BowenBao	26d50ff1be	[ONNX] Update merge rules and persons of interest (#82673 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82673 Approved by: https://github.com/malfet	2022-08-04 18:11:41 +00:00
shubhambhokare1	95d873855e	[ONNX] Inline prim::PythonOp for Autograd Function Export (#74765 ) Add flag (inline_autograd) to enable inline export of model consisting of autograd functions. Currently, this flag should only be used in TrainingMode.EVAL and not for training. An example: If a model containing ``autograd.Function`` is as follows ``` class AutogradFunc(torch.autograd.Function): @staticmethod def forward(ctx, i): result = i.exp() result = result.log() ctx.save_for_backward(result) return result ``` Then the model is exported as ``` graph(%0 : Float): %1 : Float = ^AutogradFunc(%0) return (%1) ``` If inline_autograd is set to True, this will be exported as ``` graph(%0 : Float): %1 : Float = onnx::Exp(%0) %2 : Float = onnx::Log(%1) return (%2) ``` If one of the ops within the autograd module is not supported, that particular node is exported as is mirroring ONNX_FALLTHROUGH mode Fixes: #61813 Pull Request resolved: https://github.com/pytorch/pytorch/pull/74765 Approved by: https://github.com/BowenBao, https://github.com/malfet	2022-08-03 23:30:19 +00:00
Markus	786a9d095a	Update backends.rst (#82525 ) ### Description Added `torch.backends.mps` to the list of available torch.backends at the top; it was missing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82525 Approved by: https://github.com/albanD	2022-08-03 18:33:15 +00:00
Kurt Mohler	14d0296e5c	Rename `_Typed/_UntypedStorage` to `Typed/UntypedStorage` and update docs (#82438 ) ### Description Since the major changes for `_TypedStorage` and `_UntypedStorage` are now complete, they can be renamed to be public. `TypedStorage._untyped()` is renamed to `TypedStorage.untyped()`. Documentation for storages is improved as well. ### Issue Fixes #82436 ### Testing N/A Pull Request resolved: https://github.com/pytorch/pytorch/pull/82438 Approved by: https://github.com/ezyang	2022-07-30 19:37:08 +00:00
Pearu Peterson	ff5399e528	Revise sparse docs regarding Sparse Compressed tensors (#82108 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82108 Approved by: https://github.com/bhosmer	2022-07-29 18:15:09 +00:00
albanD	386b398317	Update MPS POI (#81757 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81757 Approved by: https://github.com/malfet	2022-07-29 16:00:12 +00:00
Fabio Rocha	fd84c458f4	Add torch.unflatten and improve its docs (#81399 ) unflatten now has a free function version in torch.unflatten in addition to the method in torch.Tensor.unflatten. Updated docs to reflect this and polished them a little. For consistency, changed the signature of the int version of unflatten in native_functions.yaml. Some override tests were failing because unflatten has unusual characteristics: the .int and .Dimname versions take a different number of arguments, so this required some changes to test/test_override.py. Removed support for using a mix of integer and string arguments when specifying dimensions in unflatten. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81399 Approved by: https://github.com/Lezcano, https://github.com/ngimel	2022-07-29 15:02:42 +00:00
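To make the shape semantics of the unflatten operation above concrete, here is a small plain-Python sketch that only computes the resulting shape when one dimension is split into several. The helper `unflatten_shape` is hypothetical and does not touch tensor data. It assumes, as the public documentation describes, that at most one entry of `sizes` may be `-1` and is then inferred.

```python
from math import prod
from typing import Sequence, Tuple


def unflatten_shape(shape: Sequence[int], dim: int, sizes: Sequence[int]) -> Tuple[int, ...]:
    """Return the shape obtained by splitting dimension `dim` of `shape` into `sizes`.

    Hypothetical helper for illustration only.
    """
    ndim = len(shape)
    if not -ndim <= dim < ndim:
        raise IndexError(f"dim {dim} out of range for shape {tuple(shape)}")
    dim %= ndim  # normalise negative dims
    sizes = list(sizes)
    if sizes.count(-1) > 1:
        raise ValueError("at most one size may be -1")
    if -1 in sizes:
        # Infer the missing size from the remaining ones.
        known = prod(s for s in sizes if s != -1)
        if known == 0 or shape[dim] % known != 0:
            raise ValueError("cannot infer the -1 entry")
        sizes[sizes.index(-1)] = shape[dim] // known
    if prod(sizes) != shape[dim]:
        raise ValueError(f"sizes {sizes} do not multiply to {shape[dim]}")
    return tuple(shape[:dim]) + tuple(sizes) + tuple(shape[dim + 1:])


split = unflatten_shape((3, 4, 1), 1, (2, 2))
inferred = unflatten_shape((5, 12, 3), -2, (2, -1, 3))
```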
Jing Xu	5257d1d64b	A Launch script with Best Recipe of Deep Learning on Intel Xeon CPU (#63932 ) Fixes https://github.com/pytorch/pytorch/issues/63556 Usage: `python -m torch.backends.xeon.launch [--knobs] <script> [script parameters]` Pull Request resolved: https://github.com/pytorch/pytorch/pull/63932 Approved by: https://github.com/albanD	2022-07-29 12:57:22 +00:00
Edward Z. Yang	fd5ac1e6b5	Rename SymbolicIntNode to SymIntNodeImpl (#82350 ) Done via ``` git grep -l 'SymbolicIntNode' | xargs sed -i 's/SymbolicIntNode/SymIntNodeImpl/g' ``` Reasoning for the change: * Sym is shorter than Symbolic, and consistent with SymInt * You usually will deal in shared_ptr<...>, so we're going to reserve the shorter name (SymIntNode) for the shared pointer. But I don't want to update the Python name, so afterwards I ran ``` git grep -l _C.SymIntNodeImpl | xargs sed -i 's/_C.SymIntNodeImpl/_C.SymIntNode/' ``` and manually fixed up the binding code Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82350 Approved by: https://github.com/Krovatkin	2022-07-28 18:27:45 +00:00
Jing Xu	0e95746580	[RFC] enable oneMKL&oneDNN on-demands verbose functinality (#63212 ) RFC: Problem statement  Intel oneMKL and oneDNN are used to accelerate performance on Intel platforms. Both libraries provide verbose functionality that dumps detailed operator execution information as well as execution time. These verbose messages are very helpful for performance profiling. However, the verbose functionality applies to the entire execution. In many scenarios, though, we would like to profile only part of the execution. This feature exposes PyTorch API functions to control oneDNN and oneMKL verbose functionality at runtime. Additional context   The most commonly used performance profiling steps are shown in the following code snippet: ``` def inference(model, inputs): # step0 (optional): jit model = torch.jit.trace(model, inputs) # step1: warmup for _ in range(100): model(inputs) # step2: performance profiling. We only care the profiling result, as well as oneDNN and oneMKL verbose messages, of this step model(inputs) # step3 (optional): benchmarking t0 = time.time() for _ in range(100): model(inputs) t1 = time.time() print('dur: {}'.format((t1-t0)/100)) return model(inputs) ``` Since the environment variables MKL_VERBOSE and DNNL_VERBOSE take effect for the entire process, we get a large number of verbose messages for all 101 iterations (if step3 is not involved). However, we only care about the verbose messages dumped in step2. It is very difficult to filter out unnecessary verbose messages in a complicated usage scenario. JIT tracing also produces more undesired verbose messages. Furthermore, there are more complicated topologies or usages, such as the cascaded topologies below: ``` model1 = Model1() model2 = Model2() model3 = Model3() x1 = inference(model1, x) x2 = inference(model2, x1) y = inference(model3, x2) ``` In many cases it is very hard to split these child topologies out. 
In this scenario, it is not possible to investigate the performance of each individual topology with `DNNL_VERBOSE` and `MKL_VERBOSE`. To solve this issue, oneDNN and oneMKL provide API functions that make it possible to control verbose functionality at runtime. ``` int mkl_verbose (int enable) status dnnl::set_verbose(int level) ``` oneDNN and oneMKL print verbose messages to stdout when oneMKL or oneDNN ops are executed. Sample verbose messages: ``` MKL_VERBOSE SGEMM(t,n,768,2048,3072,0x7fff64115800,0x7fa1aca58040,3072,0x1041f5c0,3072,0x7fff64115820,0x981f0c0,768) 8.52ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:44 dnnl_verbose,exec,cpu,inner_product,brgemm:avx512_core,forward_training,src_f32::blocked:ab:f0 wei_f32::blocked:AB16b64a:f0 bia_f32::blocked:a:f0 dst_f32::blocked:ab:f0,,,mb16ic768oc768,0.0839844 ``` Design and implementation  The design is to add Python wrapper functions that invoke the mkl_verbose and dnnl::set_verbose functions. Design concern   - Need to add wrapper C++ functions for mkl_verbose and dnnl::set_verbose functions in torch/csrc and aten/csrc. - Python API functions will be added to device-specific backends - with torch.backends.mkl.verbose(1): - with torch.backends.mkldnn.verbose(1): Use cases   ``` def inference(model, inputs): # step0 (optional): jit model = torch.jit.trace(model, inputs) # step1: warmup for _ in range(100): model(inputs) # step2: performance profiling with torch.backends.mkl.verbose(1), torch.backends.mkldnn.verbose(1): model(inputs) # step3 (optional): benchmarking t0 = time.time() for _ in range(100): model(inputs) t1 = time.time() print('dur: {}'.format((t1-t0)/100)) return model(inputs) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/63212 Approved by: https://github.com/VitalyFedyunin, https://github.com/malfet	2022-07-27 23:29:35 +00:00
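The RFC above proposes scoping verbose output to a `with` block rather than to the whole process. The general pattern can be sketched in plain Python as a context manager that sets a level on entry and restores the previous one on exit. This is only an illustration of the scoping idea: `_verbose_level`, `set_verbose`, `verbose`, and `current_level` are stand-ins invented for this sketch, and the code does not call oneMKL, oneDNN, or PyTorch.

```python
from contextlib import contextmanager

# Stand-in for a library-wide verbose setting (hypothetical, not the real backend state).
_verbose_level = 0


def set_verbose(level: int) -> int:
    """Set the simulated verbose level and return the previous one."""
    global _verbose_level
    previous, _verbose_level = _verbose_level, level
    return previous


def current_level() -> int:
    return _verbose_level


@contextmanager
def verbose(level: int):
    """Enable the given verbose level only for the duration of the `with` block."""
    previous = set_verbose(level)
    try:
        yield
    finally:
        # Restore the earlier level even if the body raises.
        set_verbose(previous)


levels = [current_level()]          # before the profiled region
with verbose(1):
    levels.append(current_level())  # inside: messages would be emitted here
levels.append(current_level())      # after: back to the earlier setting
```

Restoring in a `finally` clause means that warmup and benchmarking loops outside the block stay quiet even when the profiled call fails.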
Slava Kovalevskyi	842f05f014	new doc/tutorial module been added, with the first maintainer svekars… (#82274 ) Approved on the core maintainers meeting: https://dev-discuss.pytorch.org/t/first-pytorch-quarterly-maintainers-meeting-minutes-meeting-date-july-22-2022/709 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82274 Approved by: https://github.com/kit1980, https://github.com/svekars	2022-07-27 19:57:15 +00:00
Danielle Pintz	ae5c166035	Fix two small typos in ddp_comm_hooks.rst (#82047 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82047 Approved by: https://github.com/kit1980	2022-07-23 19:10:57 +00:00
Shangdi Yu	c52ee6dc0a	CSE Pass and common pass Tests (#81742 ) Test cases for CSE Pass and common passes Pull Request resolved: https://github.com/pytorch/pytorch/pull/81742 Approved by: https://github.com/SherlockNoMad	2022-07-22 03:45:09 +00:00
soulitzer	e60f8f4f60	Improve autograd custom function docs (#81340 ) Fixes https://github.com/pytorch/pytorch/issues/81223 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81340 Approved by: https://github.com/albanD	2022-07-21 19:54:30 +00:00
Khaled Zaouk	2fb2740ef9	corrects typo in quantization docs (#81687 ) Fixes #81686 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81687 Approved by: https://github.com/jerryzh168	2022-07-21 00:17:13 +00:00
Adam J. Stewart	92c6690b9c	Fix linspace dtype replacement in docs (#81371 ) Fixes #81370 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81371 Approved by: https://github.com/ngimel	2022-07-20 13:06:16 +00:00
titaiwang	69608fc598	[ONNX] remove outdated ImplicitCastType QA in onnx.rst (#81268 ) Extend work from: https://github.com/pytorch/pytorch/pull/80596 This PR removes outdated QA of ImplicitCastType , as the coverage is greatly increased with the introduction of onnx shape inference and scalar type analysis. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81268 Approved by: https://github.com/justinchuby, https://github.com/BowenBao	2022-07-15 16:18:26 +00:00
Danielle Pintz	8926b5b9c2	Fix typos in docs: Profiler and CUDA semantics (#80406 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/80406 Approved by: https://github.com/robieta	2022-07-13 18:53:02 +00:00
Jing Xu	3c7044728b	Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289 ) More detailed description of benefits can be found at #41001. This is Intel's counterpart of NVidia’s NVTX (https://pytorch.org/docs/stable/autograd.html#torch.autograd.profiler.emit_nvtx). ITT is a functionality for labeling trace data during application execution across different Intel tools. For integrating Intel(R) VTune Profiler into Kineto, ITT needs to be integrated into PyTorch first. It works with both standalone VTune Profiler (https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html) and Kineto-integrated VTune functionality in the future. It works for both Intel CPU and Intel XPU devices. Pitch Add VTune Profiler's ITT API function calls to annotate PyTorch ops, as well as developer customized code scopes on CPU, like NVTX for NVidia GPU. This PR rebases the code changes at https://github.com/pytorch/pytorch/pull/61335 to the latest master branch. Usage example: ``` with torch.autograd.profiler.emit_itt(): for i in range(10): torch.itt.range_push('step_{}'.format(i)) model(input) torch.itt.range_pop() ``` cc @ilia-cher @robieta @chaekit @gdankel @bitfort @ngimel @orionr @nbcsm @guotuofeng @guyang3532 @gaoteng-git Pull Request resolved: https://github.com/pytorch/pytorch/pull/63289 Approved by: https://github.com/malfet	2022-07-13 13:50:15 +00:00
vspenubarthi	3b00b17f64	[docs] Updated quantization docs to show per channel support for conv1d (#81349 ) Summary: There is currently per channel quantization support for Conv1d, however this was not highlighted by the quantization documentation when discussing which modules have per channel quantization support. This adds that there is existing support for Conv1d, with evidence reproducible through the test plan below. Test Plan:
```
# Imports needed to run the snippet standalone
import torch
from torch.ao.quantization import QConfigMapping, quantize_fx


class SingleLayerModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1d = torch.nn.Conv1d(5, 5, 1).to(dtype=torch.float)

    def forward(self, x):
        x = self.conv1d(x)
        return x

    def get_example_inputs(self):
        return (torch.rand(5, 5, 1),)


torch.backends.quantized.engine = "fbgemm"
model = SingleLayerModel()
example_input = model.get_example_inputs()[0]
q_config = q_config_mapping = QConfigMapping()
q_config_mapping.set_global(torch.ao.quantization.get_default_qconfig(torch.backends.quantized.engine))
prepared = quantize_fx.prepare_fx(model, q_config_mapping, example_input)
print(prepared.conv1d.qconfig.weight.p.func)
```
Printing the above lines shows that the Conv1d has a PerChannelMinMaxObserver. To show that this doesn't work for everything, if you replace the Conv1d with a ConvTranspose1d and run the same code, an error is thrown about lack of support. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81349 Approved by: https://github.com/andrewor14	2022-07-12 23:36:37 +00:00
lezcano	e505796a2c	[Array API] Add linalg.vecdot (#70542 ) This PR adds the function `linalg.vecdot` specified by the [Array API](https://data-apis.org/array-api/latest/API_specification/linear_algebra_functions.html#function-vecdot) For the complex case, it chooses to implement \sum x_i y_i. See the discussion in https://github.com/data-apis/array-api/issues/356 Edit. When it comes to testing, this function is not quite a binopt, nor a reduction opt. As such, we're this close to be able to get the extra testing, but we don't quite make it. Now, it's such a simple op that I think we'll make it without this. Resolves https://github.com/pytorch/pytorch/issues/18027. cc @mruberry @rgommers @pmeier @asmeurer @leofang @AnirudhDagar @asi1024 @emcastillo @kmaehashi Pull Request resolved: https://github.com/pytorch/pytorch/pull/70542 Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry	2022-07-12 14:28:54 +00:00
vitrioil	747b3b311d	Fix links in `torch.testing` docs (#80353 ) Fixes #79266 Pull Request resolved: https://github.com/pytorch/pytorch/pull/80353 Approved by: https://github.com/mruberry	2022-07-11 19:15:53 +00:00
albanD	a879cb5865	Update poi based on recent activity (#81097 ) cc @Lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/81097 Approved by: https://github.com/Lezcano, https://github.com/b0noI	2022-07-09 14:39:34 +00:00
Zafar	68ec793cfd	[ao] Moving the sparsity/experimental to sparsity/_experimental (#81149 ) The experimental code in sparsity does not have a user-facing API and should reside under the private package. This involves pruner and base_sparsifier. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81149 Approved by: https://github.com/macandro96	2022-07-09 03:00:11 +00:00
PyTorch MergeBot	39f659c3ba	Revert "[Array API] Add linalg.vecdot (#70542 )" This reverts commit `74208a9c68`. Reverted https://github.com/pytorch/pytorch/pull/70542 on behalf of https://github.com/malfet due to Broke CUDA-10.2 for vecdot_bfloat16, see `74208a9c68`	2022-07-08 22:56:51 +00:00
Sherlock Huang	fc10a63727	Prims+NvFuser Backend Prototype (#80591 ) This PR integrates FX graph partitioner + Aten2Prims DecompositionInterpreter + Prims' TraceExecutor + naive caches for nvFuser. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80591 Approved by: https://github.com/jjsjann123, https://github.com/ezyang	2022-07-08 19:53:03 +00:00
lezcano	74208a9c68	[Array API] Add linalg.vecdot (#70542 ) This PR adds the function `linalg.vecdot` specified by the [Array API](https://data-apis.org/array-api/latest/API_specification/linear_algebra_functions.html#function-vecdot) For the complex case, it chooses to implement \sum x_i y_i. See the discussion in https://github.com/data-apis/array-api/issues/356 Edit. When it comes to testing, this function is not quite a binopt, nor a reduction opt. As such, we're this close to be able to get the extra testing, but we don't quite make it. Now, it's such a simple op that I think we'll make it without this. Resolves https://github.com/pytorch/pytorch/issues/18027. cc @mruberry @rgommers @pmeier @asmeurer @leofang @AnirudhDagar @asi1024 @emcastillo @kmaehashi Pull Request resolved: https://github.com/pytorch/pytorch/pull/70542 Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry	2022-07-08 15:37:58 +00:00
jjsjann123	d2c726d43c	torch.jit doc link for nvfuser readme.md (#77780 ) adding a quick link to nvfuser README.md in jit doc Note that for 1.12 release, we probably want to have the link pointed to the doc in the release code base. I don't know if we have a tag for 1.12 release candidate yet, so we might want to update that. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77780 Approved by: https://github.com/davidberard98	2022-07-07 23:25:35 +00:00
Eddie Yan	ae6dd20ba7	[cuDNN V8 API] (reopen 2) Allow the number of kernels profiled under torch.backends.cudnn.benchmark = True to be limited (Cudnnv8 benchmark limit) (#78299 ) Reopen of #77002 to address comments by @malfet CC @ngimel @ptrblck Pull Request resolved: https://github.com/pytorch/pytorch/pull/78299 Approved by: https://github.com/ngimel	2022-07-07 23:25:23 +00:00
Christian Puhrsch	c97ff3d51e	Update NestedTensor docs (#80963 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/80963 Approved by: https://github.com/george-qi	2022-07-07 22:15:39 +00:00
Sahan Paliskara	bd6bea35f8	Update package.rst to not include hermetic claim (#81019 ) Summary: Update package.rst to not include hermetic claim as torch.package is not fully hermetic Test Plan: external CI (docs build) Differential Revision: D37670779 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81019 Approved by: https://github.com/priyaramani	2022-07-07 18:40:55 +00:00
albanD	6f1d99b79f	update nn.init doc to reflect the no_grad (#80882 ) Fixes https://github.com/pytorch/pytorch/issues/80839 Pull Request resolved: https://github.com/pytorch/pytorch/pull/80882 Approved by: https://github.com/jbschlosser	2022-07-07 17:19:29 +00:00
lezcano	19f3d4d795	Expose linalg.solve_ex (#80073 ) This prepares for making `linalg.inv_ex` just a call into this function Pull Request resolved: https://github.com/pytorch/pytorch/pull/80073 Approved by: https://github.com/IvanYashchuk, https://github.com/albanD	2022-07-01 16:09:23 +00:00
Andrew M. James	5a4c9e8394	Add spdiags sparse matrix initialization (#78439 ) Similar to [scipy.sparse.spdiags](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.spdiags.html#scipy-sparse-spdiags). Part of #70926. In other functions (e.g. [torch.diagonal](https://pytorch.org/docs/stable/generated/torch.diagonal.html#torch.diagonal)), diagonals of a tensor are referenced using the offset and the two dimensions that the diagonal is taken with respect to. Here the reference implementation from scipy only considers matrix output, so even if we only support 2-d output at first, it may be useful to consider how the dimensions corresponding to each diagonal would be specified for higher dimensional output. The proposed torch signature implies that all offsets refer to the diagonals with respect to the only two dimensions of the output: ``` torch.sparse.spdiags(Tensor diagonals, IntTensor offsets, int[] shape, Layout? layout=None) -> SparseTensor ``` Above it is required that: `diagonals.ndimension() == 2`, `offsets.ndimension() == 1`, `offsets.shape[0] == diagonals.shape[0]` and `len(shape) == 2`. This would need to be altered for the case where `len(shape)` > 2. One option is: ``` torch.sparse.spdiags(Tensor[] diagonals, IntTensor[] offsets, IntTensor dims, int[] shape, Layout? layout=None) -> SparseTensor ``` Here `offsets` and `diagonals` become lists of tensors, and the `IntTensor dims` argument is introduced. This would require that `len(diagonals) == len(offsets) == dims.shape[0]`, `dims.ndimension() == 2` and `dims.shape[1] == 2`; the same restrictions as the 2d case above also apply to the elements of `diagonals` and `offsets` pairwise (that is `diagonals[i].ndimension() == 2`, `offsets[i].ndimension() == 1` and `offsets[i].shape[0] == diagonals[i].shape[0]` for all i).
This form of the signature would construct the sparse result by placing the values from `diagonals[i][j]` into the diagonal with offset `offsets[i][j]` taken with respect to dimensions `dims[i]`. The specialization back to the original signature for the 2d case could be seen as allowing the single row of dims to default to `[0, 1]` when there is only one `diagonals`, `offsets` provided, and shape is `2-d`. This option allows the rows of an input element `diagonals[i]` to have a different length, which may be appropriate as the max length of a diagonal along different dimension pairs will be different. Another option is to specify the dimensions the diagonal is taken with respect to for each offset. This signature would look like: ``` torch.sparse.spdiags(Tensor diagonals, IntTensor offsets, IntTensor dims, int[] shape, Layout? layout=None) -> SparseTensor ``` Here, `diagonals` is still 2-D with dimension 0 matching the length of 1-D `offsets`, and the tensor input `dims` is also 2-D with dimension 0 matching the length of 1-D `offsets` and the second dimension fixed at `2`. In this case the sparse result is constructed by placing the elements from `diagonals[i]` into the output diagonal `output.diagonal(offsets[i], dim0=dims[i][0], dim1=dims[i][1])` (with some additional consideration that makes it more complicated than simply assigning to that view). The specialization from this back to the 2-D form could be seen as assuming `dims = [[0, 1], [0, 1]... len(offsets) times ]` when `len(shape) == 2`. In both proposed signatures for the N-D case the specialization back to the 2-D signature is a bit of a stretch for your typical default arguments logic, however I think the first is the better choice as it offers more flexibility. I think some discussion is required about: - [x] Should the N-D output case be implemented from the outset - [x] If not, should the future addition of the N-D output case be considered when designing the interface.
- [x] Other thoughts on the signature which includes the `dims` information for the N-D output case. Resolution: Since no one has offered a request for N-D output support, I think it is fine to restrict this to sparse matrix generation. Should a request for N-D support come later, an overload accepting the additional `dims` could be added. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78439 Approved by: https://github.com/nikitaved, https://github.com/cpuhrsch, https://github.com/pearu	2022-07-01 01:11:54 +00:00
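For readers unfamiliar with the `scipy.sparse.spdiags` layout this commit refers to, the 2-D case can be sketched in plain Python. This is only an illustration under the assumption that the SciPy convention applies (`diagonals[k][j]` lands in column `j` of the diagonal with offset `offsets[k]`); the helper name `spdiags_dense` is hypothetical, it builds a dense list of lists rather than a sparse tensor, and it is not the PyTorch implementation.

```python
def spdiags_dense(diagonals, offsets, shape):
    """Dense illustration of the 2-D spdiags layout (SciPy convention assumed)."""
    rows, cols = shape
    if len(diagonals) != len(offsets):
        raise ValueError("need exactly one offset per row of diagonals")
    out = [[0] * cols for _ in range(rows)]
    for diag, offset in zip(diagonals, offsets):
        for col, value in enumerate(diag):
            # Entry (row, col) lies on the diagonal `offset` when col - row == offset.
            row = col - offset
            if 0 <= row < rows and col < cols:
                out[row][col] = value
    return out


# Main diagonal plus the first two lower diagonals of a 3x3 matrix.
diags = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
dense = spdiags_dense(diags, [0, -1, -2], (3, 3))
for line in dense:
    print(line)
```

Entries whose row index falls outside the output shape are dropped, which is why the trailing values of the lower diagonals do not appear in the result.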
PyTorch MergeBot	56e3bc5215	Revert "Add spdiags sparse matrix initialization (#78439 )" This reverts commit `cfb2034b65`. Reverted https://github.com/pytorch/pytorch/pull/78439 on behalf of https://github.com/suo due to broke windows builds, see: `cfb2034b65`	2022-06-30 21:04:36 +00:00
Andrew M. James	cfb2034b65	Add spdiags sparse matrix initialization (#78439 ) Similar to [scipy.sparse.spdiags](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.spdiags.html#scipy-sparse-spdiags). Part of #70926. In other functions (e.g. [torch.diagonal](https://pytorch.org/docs/stable/generated/torch.diagonal.html#torch.diagonal)), diagonals of a tensor are referenced using the offset and the two dimensions that the diagonal is taken with respect to. Here the reference implementation from scipy only considers matrix output, so even if we only support 2-d output at first, it may be useful to consider how the dimensions corresponding to each diagonal would be specified for higher dimensional output. The proposed torch signature implies that all offsets refer to the diagonals with respect to the only two dimensions of the output: ``` torch.sparse.spdiags(Tensor diagonals, IntTensor offsets, int[] shape, Layout? layout=None) -> SparseTensor ``` Above it is required that: `diagonals.ndimension() == 2`, `offsets.ndimension() == 1`, `offsets.shape[0] == diagonals.shape[0]` and `len(shape) == 2`. This would need to be altered for the case where `len(shape)` > 2. One option is: ``` torch.sparse.spdiags(Tensor[] diagonals, IntTensor[] offsets, IntTensor dims, int[] shape, Layout? layout=None) -> SparseTensor ``` Here `offsets` and `diagonals` become lists of tensors, and the `IntTensor dims` argument is introduced. This would require that `len(diagonals) == len(offsets) == dims.shape[0]`, `dims.ndimension() == 2` and `dims.shape[1] == 2`; the same restrictions as the 2d case above also apply to the elements of `diagonals` and `offsets` pairwise (that is `diagonals[i].ndimension() == 2`, `offsets[i].ndimension() == 1` and `offsets[i].shape[0] == diagonals[i].shape[0]` for all i).
This form of the signature would construct the sparse result by placing the values from `diagonals[i][j]` into the diagonal with offset `offsets[i][j]` taken with respect to dimensions `dims[i]`. The specialization back to the original signature for the 2d case could be seen as allowing the single row of dims to default to `[0, 1]` when there is only one `diagonals`, `offsets` provided, and shape is `2-d`. This option allows the rows of an input element `diagonals[i]` to have a different length, which may be appropriate as the max length of a diagonal along different dimension pairs will be different. Another option is to specify the dimensions the diagonal is taken with respect to for each offset. This signature would look like: ``` torch.sparse.spdiags(Tensor diagonals, IntTensor offsets, IntTensor dims, int[] shape, Layout? layout=None) -> SparseTensor ``` Here, `diagonals` is still 2-D with dimension 0 matching the length of 1-D `offsets`, and the tensor input `dims` is also 2-D with dimension 0 matching the length of 1-D `offsets` and the second dimension fixed at `2`. In this case the sparse result is constructed by placing the elements from `diagonals[i]` into the output diagonal `output.diagonal(offsets[i], dim0=dims[i][0], dim1=dims[i][1])` (with some additional consideration that makes it more complicated than simply assigning to that view). The specialization from this back to the 2-D form could be seen as assuming `dims = [[0, 1], [0, 1]... len(offsets) times ]` when `len(shape) == 2`. In both proposed signatures for the N-D case the specialization back to the 2-D signature is a bit of a stretch for your typical default arguments logic, however I think the first is the better choice as it offers more flexibility. I think some discussion is required about: - [x] Should the N-D output case be implemented from the outset - [x] If not, should the future addition of the N-D output case be considered when designing the interface.
- [x] Other thoughts on the signature which includes the `dims` information for the N-D output case. Resolution: Since no one has offered a request for N-D output support, I think it is fine to restrict this to sparse matrix generation. Should a request for N-D support come later, an overload accepting the additional `dims` could be added. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78439 Approved by: https://github.com/nikitaved, https://github.com/cpuhrsch, https://github.com/pearu	2022-06-30 19:54:47 +00:00
Bin Wen	45ae244086	[torch.package][doc] PackageExporter does not have file_structure (#79948 ) Summary: found this issue when testing torch.package. also found an open issue https://github.com/pytorch/pytorch/issues/74221. bootstrapping a fix. Reviewed By: d4l3k Differential Revision: D37063748 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79948 Approved by: https://github.com/d4l3k	2022-06-30 19:49:53 +00:00
PyTorch MergeBot	1454515253	Revert "Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289 )" This reverts commit `f988aa2b3f`. Reverted https://github.com/pytorch/pytorch/pull/63289 on behalf of https://github.com/malfet due to broke trunk, see `f988aa2b3f`	2022-06-30 12:49:41 +00:00
Jing Xu	f988aa2b3f	Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289 ) More detailed description of benefits can be found at #41001. This is Intel's counterpart of NVidia’s NVTX (https://pytorch.org/docs/stable/autograd.html#torch.autograd.profiler.emit_nvtx). ITT is a functionality for labeling trace data during application execution across different Intel tools. For integrating Intel(R) VTune Profiler into Kineto, ITT needs to be integrated into PyTorch first. It works with both standalone VTune Profiler (https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html) and Kineto-integrated VTune functionality in the future. It works for both Intel CPU and Intel XPU devices. Pitch: Add VTune Profiler's ITT API function calls to annotate PyTorch ops, as well as developer customized code scopes on CPU, like NVTX for NVidia GPU. This PR rebases the code changes at https://github.com/pytorch/pytorch/pull/61335 to the latest master branch. Usage example:
```
with torch.autograd.profiler.emit_itt():
    for i in range(10):
        torch.itt.range_push('step_{}'.format(i))
        model(input)
        torch.itt.range_pop()
```
cc @ilia-cher @robieta @chaekit @gdankel @bitfort @ngimel @orionr @nbcsm @guotuofeng @guyang3532 @gaoteng-git Pull Request resolved: https://github.com/pytorch/pytorch/pull/63289 Approved by: https://github.com/malfet	2022-06-30 05:14:03 +00:00
Allen Goodman	63ef2a03e5	torch.special.scaled_modified_bessel_k0 (#78900 ) ```Python scaled_modified_bessel_k0(input, *, out=None) -> Tensor ``` Scaled modified Bessel function of the second kind of order $0$. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78900 Approved by: https://github.com/mruberry	2022-06-29 14:53:37 +00:00
PyTorch MergeBot	602c38ff63	Revert "torch.special.gamma (#78904 )" This reverts commit `f563f25efd`. Reverted https://github.com/pytorch/pytorch/pull/78904 on behalf of https://github.com/suo due to This PR appears to have broken mac tests on master `f563f25efd`	2022-06-28 00:54:22 +00:00
Svetlana Karslioglu	7394de4e1e	Add a note on CUDA 11.6 (#80363 ) Fixes #79876 Pull Request resolved: https://github.com/pytorch/pytorch/pull/80363 Approved by: https://github.com/atalman	2022-06-27 21:34:24 +00:00
Allen Goodman	ab8797d69b	torch.special.spherical_bessel_j0 (#78912 ) ```Python spherical_bessel_j0(input, *, out=None) -> Tensor ``` Spherical Bessel function of the first kind of order $0$. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78912 Approved by: https://github.com/mruberry	2022-06-27 20:14:46 +00:00
Allen Goodman	f563f25efd	torch.special.gamma (#78904 ) ```Python gamma(input, *, out=None) -> Tensor ``` Gamma function $\Gamma\left(\text{input}\right)$. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78904 Approved by: https://github.com/mruberry	2022-06-27 19:36:17 +00:00
migeedz	443db9b58e	Introduce Z3 types and utility functions for constraint generation (#80084 ) Create Z3 types. In particular, dynamic dimensions, dynamic tensor type and tensor types up to size 4. Note that for Z3 decidability reasons, we are using uninterpreted functions for tensor types, which means we must explicitly define tensor constructors with a concrete size (for now, up to size 4). We defer lifting this requirement to future work. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80084 Approved by: https://github.com/anijain2305	2022-06-25 22:27:33 +00:00
Allen Goodman	b3ca3638be	torch.special.scaled_modified_bessel_k1 (#78901 ) ```Python scaled_modified_bessel_k1(input, *, out=None) -> Tensor ``` Scaled modified Bessel function of the second kind of order $1$. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78901 Approved by: https://github.com/mruberry	2022-06-24 20:57:38 +00:00
Sherlock Huang	752c06e0e1	FX graph partitioner and fuser (#79439 ) This PR introduces two components. CapabilityBasedPartitioner for FX graph: given a list of supported operators, this partitioner tries to form the largest subgraphs that only contain the supported ops. Fuser utility: given a list of nodes in FX graph, it lifts them as a sub-GraphModule in the original graph. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79439 Approved by: https://github.com/jjsjann123, https://github.com/davidberard98	2022-06-24 18:49:37 +00:00
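The idea of forming the largest subgraphs of supported ops can be illustrated with a deliberately simplified sketch. It assumes the graph is a straight-line chain of op names, so a partition is just a maximal run of consecutive supported ops; a real FX partitioner has to handle general DAGs and avoid creating cycles, which this toy does not attempt. The function name `partition_chain` and the op names are hypothetical.

```python
def partition_chain(ops, supported):
    """Group maximal runs of consecutive supported ops in a linear chain."""
    partitions, current = [], []
    for op in ops:
        if op in supported:
            current.append(op)
        elif current:
            # An unsupported op ends the current partition.
            partitions.append(current)
            current = []
    if current:
        partitions.append(current)
    return partitions


chain = ["add", "mul", "custom_op", "relu", "sigmoid", "print"]
parts = partition_chain(chain, supported={"add", "mul", "relu", "sigmoid"})
print(parts)
```

Each resulting group is what a fuser utility would then lift into its own submodule.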
HDCharles	0308609b41	[quant] Quantizable documentation (#79957 ) Minor documentation entry for the quantizable LSTM and MHA classes. due to weird CI issues old discussion can be found: https://github.com/pytorch/pytorch/pull/71191 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79957 Approved by: https://github.com/z-a-f	2022-06-24 16:55:15 +00:00
macandro96	70b7bca423	[ao][sparsity] Base scheduler class for Data Schedulers (#79817 ) The BaseDataScheduler is the abstract scheduler class specifically for the BaseDataSparsifier class. This class controls a specific hyperparameter of the sparsifier class and varies it across the training process (or across time). Args: data_sparsifier (instance of BaseDataSparsifier): implemented data sparsifier class wherein update_mask is implemented; schedule_param (str): a specific hyperparameter of the passed sparsifier that needs to be scheduled/varied; last_epoch (int, default=-1): passed specifically when training needs to be resumed from a particular point; verbose (bool, default=False): verbosity of the BaseDataScheduler. The get_schedule_param() function needs to be implemented by the user. Test Plan: ```python test/test_ao_sparsity.py TestBaseDataScheduler``` Differential Revision: [D37358608](https://our.internmc.facebook.com/intern/diff/D37358608) Pull Request resolved: https://github.com/pytorch/pytorch/pull/79817 Approved by: https://github.com/jerryzh168, https://github.com/z-a-f	2022-06-24 16:51:52 +00:00
HDCharles	ffdc5eebc7	[ao][docs] tests for quantization docs (#79923 ) Summary: per https://github.com/pytorch/pytorch/issues/79135 the code snippets in the docs don't run. This is a recurring problem since previously there was no unit test to check that these code snippets actually ran. This PR adds support for such a test, importing the snippet as a string and evaluating it to make sure that it actually runs. If the code snippet has user defined code, you can pass in dummy versions using global_inputs. Sometimes the imports of the code snippets behave oddly, but you can pass them in as in test_quantization_doc_custom, where nnq is passed in. Test Plan: python test/test_quantization.py TestQuantizationDocs also see https://github.com/pytorch/pytorch/pull/79994 to see what shows up in CI when the docs get broken Pull Request resolved: https://github.com/pytorch/pytorch/pull/79923 Approved by: https://github.com/z-a-f, https://github.com/vspenubarthi	2022-06-23 20:50:31 +00:00
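The testing approach described in this commit (evaluate a documentation snippet held as a string, supplying stand-ins for user-defined names) can be sketched with the standard library alone. This is a rough illustration, not the test added by the PR: `run_snippet`, `SNIPPET`, and `DummyModel` are hypothetical names, and only the `global_inputs` parameter name is borrowed from the commit message.

```python
import textwrap


def run_snippet(snippet, global_inputs=None):
    """Execute a documentation snippet and return the namespace it ran in."""
    # Names the snippet expects the user to define are injected as globals.
    namespace = dict(global_inputs or {})
    code = compile(textwrap.dedent(snippet), "<doc snippet>", "exec")
    exec(code, namespace)  # any exception here means the snippet is broken
    return namespace


SNIPPET = """
    model = UserModel()
    result = model.forward(3)
"""


class DummyModel:
    """Stand-in for the user-defined model the snippet refers to."""

    def forward(self, x):
        return x * 2


ns = run_snippet(SNIPPET, global_inputs={"UserModel": DummyModel})
print(ns["result"])
```

A snippet that references an undefined name raises `NameError` when executed this way, which is the kind of failure such a test is meant to surface in CI.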
Justin Chu	da33c93169	[ONNX] Clean up `onnx_supported_ops` (#79424 ) - Hide the module from `torch.onnx` public namespace because it is for internal use - Remove unused variables - Fix lint errors - Reformat - Create `onnx` folder under docs/scripts and add it to the onnx merge rule Pull Request resolved: https://github.com/pytorch/pytorch/pull/79424 Approved by: https://github.com/thiagocrepaldi, https://github.com/garymm, https://github.com/kit1980, https://github.com/malfet	2022-06-23 20:44:51 +00:00
Allen Goodman	b3308e21bf	torch.special.airy_ai (#78902 ) ```Python airy_ai(input, *, out=None) -> Tensor ``` Airy function $\text{Ai}\left(\text{input}\right)$. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78902 Approved by: https://github.com/mruberry, https://github.com/linbinyu, https://github.com/seemethere	2022-06-23 19:33:40 +00:00
Edward Z. Yang	f7ee061638	Wconstab/reland pysymint (#79795 ) rebased https://github.com/pytorch/pytorch/pull/79617/ to see if issues are reproducible. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79795 Approved by: https://github.com/malfet	2022-06-20 22:55:06 +00:00
David Berard	8edaf388e5	Fix fx decomposition example Previously GraphAppendingTracer was appending to the wrong graph. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79807 Approved by: https://github.com/kit1980	2022-06-20 17:26:17 +00:00
eqy	eff74ed7bd	[AMP] Use generic autocast in example, specify dtype (#79579 ) CC @mruberry @ptrblck Pull Request resolved: https://github.com/pytorch/pytorch/pull/79579 Approved by: https://github.com/mruberry, https://github.com/ngimel	2022-06-17 21:32:51 +00:00
Rhys Goodall	62ba548cac	[DOC] Missing line in serialization notes (#79454 ) Small typo fix to serialization docs where there was a missing line in one of the examples. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79454 Approved by: https://github.com/mruberry	2022-06-17 18:26:47 +00:00
Orion Reblitz-Richardson	4df76d1df3	Adjust wording for consistency (#79758 ) Requested by some of our internal review. @svekars thoughts? Thanks. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79758 Approved by: https://github.com/svekars, https://github.com/kit1980	2022-06-17 01:39:30 +00:00
Olga Andreeva	8a6d83079c	Functionality/pickling for commhooks (#79334 ) This PR addresses issue #75666. Stateful communication hook now can be saved and reloaded to resume training. Current PR adds the functionality for PowerSGD communication hook and tests that communication hook can be properly saved and restored. The PowerSGD implementation uses ``__slots__``; as a result, the introduced `__getstate__` and `__setstate__` methods are implemented to work with `__slots__` and not `__dict__`. `__getstate__` returns a dictionary that represents a ``PowerSGDState`` which will be pickled and saved. ``process_group`` is non-serializable and excluded from the returned state. `__setstate__` takes a provided ``state`` and restores ``PowerSGDState``. ``process_group`` is set to default with a proper warning issued to a user. Unit test: A hook-independent `_test_hook_pickling` is added with this PR, as well as `test_ddp_hook_pickling_powerSGD`, which tests `powerSGD`’s ability to be saved and reloaded. Currently, the test creates a ddp model with a provided hook, trains it for 10 epochs and saves model’s state and hook’s state. During reloading, unit test makes sure that a warning was logged (only one warning and the proper one). It then proceeds to check that reloaded hook and original hook are the same. Finally, it checks that a hook’s state was properly initialized: - it compares slot values (all, but 2: `process_group` and `rng`) for original and reloaded state - it checks that process group was set to a default group - it checks that a random state was restored properly with np.testing.assert_array_equal, because `rng` is an instance of `np.random.RandomState`, represented by a tuple. One of the entries is of `ndarray dtype[uint32]` type and `np.testing.assert_array_equal` is used for assertion.
Future To-Do: - Implement similar __getstate__ and __setstate__ for other stateful communication hooks - Add appropriate tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/79334 Approved by: https://github.com/rohan-varma, https://github.com/awgu	2022-06-16 23:15:34 +00:00
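The pickling pattern described above (a `__slots__` class that drops a non-serializable member in `__getstate__` and restores a default with a warning in `__setstate__`) can be sketched as follows. This is a minimal stand-in, not the PowerSGD code: the class `HookState`, its slot names, and the `"default_group"` placeholder are all hypothetical, and a `threading.Lock` plays the role of the non-picklable process group.

```python
import pickle
import threading
import warnings


class HookState:
    """Hypothetical stateful hook that stores its attributes in __slots__."""

    __slots__ = ["process_group", "rank", "iteration"]

    def __init__(self, process_group, rank=1, iteration=0):
        self.process_group = process_group
        self.rank = rank
        self.iteration = iteration

    def __getstate__(self):
        # There is no __dict__, so build the state from the slots and
        # leave out the member that cannot be pickled.
        return {
            slot: getattr(self, slot)
            for slot in self.__slots__
            if slot != "process_group"
        }

    def __setstate__(self, state):
        # Placeholder for "the default process group"; warn the user about it.
        self.process_group = "default_group"
        warnings.warn("process_group was not serialized; using the default group.")
        for slot, value in state.items():
            setattr(self, slot, value)


original = HookState(process_group=threading.Lock(), rank=4, iteration=10)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    restored = pickle.loads(pickle.dumps(original))

print(restored.rank, restored.iteration, restored.process_group)
```

Comparing every slot except the excluded one between `original` and `restored`, and checking that exactly one warning was recorded, mirrors the checks the commit message lists for its unit test.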
PyTorch MergeBot	44436947bc	Revert "Reland PySymInt (#79617 )" This reverts commit `8ef6356f26`. Reverted https://github.com/pytorch/pytorch/pull/79617 on behalf of https://github.com/zengk95 due to this is breaking periodic jobs (and maybe pull) on trunk	2022-06-16 19:40:27 +00:00
macandro96	15828bcfd7	[ao][sparsity] Base class for Data Sparsifier Base Data Sparsifier class for all Data sparsifiers. The abstract class accepts raw torch tensors / embedding / embedding bags (refer to SUPPORTED_TYPES above) to prepare for sparsification. In this case, mask (and parametrizations) is owned by the class and not by the user. Specifically, the container object inside the class maintains the mask and parametrizations of the input data Test Plan: ```python test/test_ao_sparsity.py TestBaseDataSparsifier``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/79251 Approved by: https://github.com/z-a-f, https://github.com/HDCharles	2022-06-16 17:31:22 +00:00
Nikolay Korovaiko	8ef6356f26	Reland PySymInt (#79617 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/79617 Approved by: https://github.com/Chillee	2022-06-16 04:18:06 +00:00
Joel Benjamin Schlosser	2d73c8e6e0	Add Dropout1d module Pull Request resolved: https://github.com/pytorch/pytorch/pull/79545 Approved by: https://github.com/ngimel, https://github.com/albanD	2022-06-15 14:39:07 +00:00
PyTorch MergeBot	b8db0a0475	Revert "Python Bindings for SymInts (#78135 )" This reverts commit `d332724071`. Reverted https://github.com/pytorch/pytorch/pull/78135 on behalf of https://github.com/ezyang due to broke torchvision tests	2022-06-15 13:52:14 +00:00
Svetlana Karslioglu	9d351b3ddd	Update the governance page (#78850 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/78850 Approved by: https://github.com/orionr, https://github.com/b0noI	2022-06-14 15:39:51 +00:00
Svetlana Karslioglu	5399fef644	Update persons of interest (#79076 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/79076 Approved by: https://github.com/b0noI	2022-06-14 15:26:13 +00:00
Nikolay Korovaiko	d332724071	Python Bindings for SymInts (#78135 ) This PR adds support for `SymInt`s in python. Namely, * `THPVariable_size` now returns `sym_sizes()` * python arg parser is modified to parse PyObjects into ints and `SymbolicIntNode`s * pybind11 bindings for `SymbolicIntNode` are added, so size expressions can be traced * a large number of tests added to demonstrate how to implement python symints. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78135 Approved by: https://github.com/ezyang	2022-06-14 02:17:59 +00:00
anjali411	38e717dc87	Add docs for Python Registration Pull Request resolved: https://github.com/pytorch/pytorch/pull/78753 Approved by: https://github.com/ezyang, https://github.com/albanD	2022-06-13 23:21:23 +00:00
Mike Ruberry	1d47e0df5a	Updates TF32 docs (#79401 ) Updates TF32 docs to reflect PyTorch 1.12 updates. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79401 Approved by: https://github.com/ngimel	2022-06-13 21:02:00 +00:00
anjali411	38350acf8f	Autogen Tags enum, and allow specifying tags while defining an op Pull Request resolved: https://github.com/pytorch/pytorch/pull/79322 Approved by: https://github.com/albanD	2022-06-11 00:29:32 +00:00
Svetlana Karslioglu	68136828e0	Add Design Philosophy (#79248 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/79248 Approved by: https://github.com/albanD	2022-06-10 21:21:05 +00:00
jjsjann123	462874f418	adding a quick link to nvfuser README.md in jit doc for 1.12 release (#78160 ) adding a link to github 1.12 release branch nvfuser README.md in jit doc Note that this PR is intended to be cherry-picked by 1.12 release, we'll have a follow up PR to update the link once this PR is merged. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78160 Approved by: https://github.com/davidberard98	2022-06-09 17:28:17 +00:00
lezcano	a8ea58afee	Add randomness case to the autograd notes I also took this chance to clean a bit the sphinx formatting and reworded a few minor things. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78617 Approved by: https://github.com/soulitzer, https://github.com/albanD	2022-06-08 21:27:03 +00:00
lezcano	c7d6cec078	Add linalg.lu_solve This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA when calling the batched MAGMA backend with trans=True. We work around that by solving two triangular systems instead. We also update the heuristics for this function, as they were fairly outdated. We found that cuSolver is king, so luckily we do not need to rely on the buggy backend from MAGMA for this function. We added tests testing this function left and right. We also added tests for the different backends. We also activated the tests for AMD, as those should work as well. Fixes https://github.com/pytorch/pytorch/issues/61657 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77634 Approved by: https://github.com/malfet	2022-06-07 22:28:28 +00:00
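The "two triangular systems" workaround mentioned in the entry above can be illustrated without any GPU backend. The following is a minimal pure-Python sketch, assuming an LU factorization without pivoting; the helper names (`solve_lower_unit`, `solve_upper`, `lu_solve_no_pivot`) are hypothetical and are not the functions used in the PR.

```python
def solve_lower_unit(L, b):
    """Forward substitution for a unit lower-triangular matrix L (L y = b)."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    return y


def solve_upper(U, y):
    """Back substitution for an upper-triangular matrix U (U x = y)."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x


def lu_solve_no_pivot(L, U, b):
    """Solve (L U) x = b as two triangular solves; pivoting is omitted."""
    return solve_upper(U, solve_lower_unit(L, b))


# A = L @ U = [[2, 1], [4, 5]]; b is chosen so that the solution is [1, 1].
L = [[1.0, 0.0], [2.0, 1.0]]
U = [[2.0, 1.0], [0.0, 3.0]]
b = [3.0, 9.0]
x = lu_solve_no_pivot(L, U, b)
```

A real implementation also applies the pivot permutation and, for the transposed case, solves with the transposed factors in the opposite order.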
Shawn Zhong	243dd7e74f	Fix LeakyReLU image (#78508 ) Fixes #56363, Fixes #78243 \| [Before](https://pytorch.org/docs/stable/generated/torch.nn.LeakyReLU.html) \| [After](https://docs-preview.pytorch.org/78508/generated/torch.nn.LeakyReLU.html) \| \| --- \| --- \| \| ![image](https://user-images.githubusercontent.com/6421097/171110542-4a9e8ff3-015d-4f3c-88da-171d17dad42e.png) \| ![LeakyReLU](https://user-images.githubusercontent.com/6421097/171110505-ba4bca24-2138-47c3-9ebd-35b75a7fe351.png) \| - Plot `LeakyReLU` with `negative_slope=0.1` instead of `negative_slope=0.01` - Changed the title from `"{function_name} activation function"` to the name returned by `_get_name()` (with parameter info). The full list is attached at the end. - Modernized the script and ran black on `docs/source/scripts/build_activation_images.py`. Apologies for the ugly diff. ``` ELU(alpha=1.0) Hardshrink(0.5) Hardtanh(min_val=-1.0, max_val=1.0) Hardsigmoid() Hardswish() LeakyReLU(negative_slope=0.1) LogSigmoid() PReLU(num_parameters=1) ReLU() ReLU6() RReLU(lower=0.125, upper=0.3333333333333333) SELU() SiLU() Mish() CELU(alpha=1.0) GELU(approximate=none) Sigmoid() Softplus(beta=1, threshold=20) Softshrink(0.5) Softsign() Tanh() Tanhshrink() ``` cc @brianjo @mruberry @svekars @holly1238 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78508 Approved by: https://github.com/jbschlosser	2022-06-07 16:32:45 +00:00
Kurt Mohler	a4403c17c7	Improve reproducibility docs for RNG (#78849 ) * Mention that operations may change RNG state and how to deal with it * Add link to Reproducibility note in `use_deterministic_algorithms` docs * Also fix a broken link Fixes #77206 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78849 Approved by: https://github.com/mruberry	2022-06-06 14:53:59 +00:00
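The point about operations changing RNG state can be shown with a save-and-restore pattern. This sketch uses Python's standard `random` module as a stand-in, which is an assumption made for illustration; PyTorch exposes analogous calls (`torch.get_rng_state` and `torch.set_rng_state`), but the code below does not use them.

```python
import random

random.seed(0)

# Save the generator state before the section that must be reproducible.
state = random.getstate()
first = [random.random() for _ in range(3)]

# Any unrelated draw advances the generator and would shift later results.
_ = random.random()

# Restoring the saved state replays exactly the same sequence.
random.setstate(state)
replay = [random.random() for _ in range(3)]
```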
Svetlana Karslioglu	8c8527233f	Update the Contribution Guide (#78779 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/78779 Approved by: https://github.com/ezyang, https://github.com/albanD	2022-06-03 21:15:52 +00:00
Nikita Shulga	40e2aadf47	Create __init__.py (#78629 ) To make `torch.utils.jit` a proper package, otherwise it will not be added to the wheel Pull Request resolved: https://github.com/pytorch/pytorch/pull/78629 Approved by: https://github.com/seemethere, https://github.com/xuzhao9, https://github.com/davidberard98	2022-06-03 18:14:21 +00:00
YifanShenSZ	6ba1d05fa4	to_padded_tensor doc v0 (#78657 ) Fixes #76846 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78657 Approved by: https://github.com/jbschlosser	2022-06-03 14:27:31 +00:00
PyTorch MergeBot	954522a485	Revert "Autogen Tags enum, and allow specifying tags while defining an op" This reverts commit `9476a78f37`. Reverted https://github.com/pytorch/pytorch/pull/77313 on behalf of https://github.com/malfet due to Broke OSS buck builds, see `9476a78f37`	2022-06-03 01:53:53 +00:00
anjali411	9476a78f37	Autogen Tags enum, and allow specifying tags while defining an op Pull Request resolved: https://github.com/pytorch/pytorch/pull/77313 Approved by: https://github.com/ezyang, https://github.com/albanD	2022-06-03 01:13:44 +00:00
albanD	b30b1f3dec	update mps note with more details (#78669 ) Follow up to the comments in https://github.com/pytorch/pytorch/pull/77767#pullrequestreview-978807521 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78669 Approved by: https://github.com/kulinseth, https://github.com/anjali411	2022-06-02 20:53:19 +00:00
vfdev	642fc94501	Update extending.rst (#78707 ) Follow-up fix for https://github.com/pytorch/pytorch/pull/78073 : https://github.com/pytorch/pytorch/pull/78073#discussion_r887621219 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78707 Approved by: https://github.com/albanD	2022-06-02 17:24:00 +00:00
Adam J. Stewart	d90652db65	Docs: build with Sphinx 5 (#70309 ) Fixes #60979. Also see #61045 and https://github.com/sphinx-doc/sphinx/issues/9395 for discussion. I _believe_ the reason that we were previously pinning to Sphinx 3 was because of issues with pytorch_sphinx_theme and Sphinx 4 support, but these seem to have been resolved now. See https://torchgeo.readthedocs.io/ for an example of docs built with pytorch_sphinx_theme and Sphinx 4. Pull Request resolved: https://github.com/pytorch/pytorch/pull/70309 Approved by: https://github.com/albanD	2022-06-01 22:28:29 +00:00
Andrew Or	e41389f84b	[Quant][docs] Replace qconfig_dict with QConfigMapping in docs Summary: https://github.com/pytorch/pytorch/pull/78452 replaced qconfig_dict with QConfigMapping as the default API for prepare_fx, prepare_qat_fx, and convert_fx. We should update the docs to reflect this change as well. Test Plan: ``` cd docs make html cd build/html python -m http.server ``` Reviewers: jerryzh168, vkuzo Subscribers: jerryzh168, vkuzo Pull Request resolved: https://github.com/pytorch/pytorch/pull/78533 Approved by: https://github.com/vkuzo	2022-06-01 15:10:48 +00:00
Svetlana Karslioglu	41e2611e6a	Fix left nav (#78552 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/78552 Approved by: https://github.com/albanD, https://github.com/orionr, https://github.com/malfet	2022-06-01 00:49:53 +00:00
Philip Meier	288b23bc52	fix MetadataTensor example (#78073 ) ```py [bar if bar for bar in foo] ``` is invalid Python syntax. The `if` clause needs to be at the end: ```py [bar for bar in foo if bar] ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/78073 Approved by: https://github.com/albanD	2022-05-31 21:34:19 +00:00
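The syntax point in the entry above can be checked directly. This is a small sketch; `foo` and its contents are made up for illustration.

```python
foo = [0, 1, "", "a", None, 2]

# Valid form: the filtering `if` clause comes after the `for` clause.
kept = [bar for bar in foo if bar]

# The form removed from the docs does not even compile.
try:
    compile("[bar if bar for bar in foo]", "<example>", "eval")
    invalid_form_compiles = True
except SyntaxError:
    invalid_form_compiles = False
```

A leading `if` is only legal as a conditional expression with an `else` branch, for example `[bar if bar else None for bar in foo]`, which maps values instead of filtering them.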
Gary Miguel	b27f0fea2c	[ONNX] Fix case in type annotation in docs (#78388 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/78388 Approved by: https://github.com/justinchuby, https://github.com/BowenBao	2022-05-31 19:27:34 +00:00
Kevin Tse	51ecc366e1	[DataLoader] Minor documentation improvement Pull Request resolved: https://github.com/pytorch/pytorch/pull/78404 Approved by: https://github.com/ejguan	2022-05-31 15:59:46 +00:00
PyTorch MergeBot	d450034f24	Revert "Beta function (#78031 )" This reverts commit `da16450360`. Reverted https://github.com/pytorch/pytorch/pull/78031 on behalf of https://github.com/suo due to broke trunk, see the above message	2022-05-24 22:55:06 +00:00
Sherlock Huang	6db8440f35	Python Jiterator supports multiple outputs (#78139 ) This PR is part3. Part1: https://github.com/pytorch/pytorch/pull/77902 Part2: https://github.com/pytorch/pytorch/pull/77921 Python Jiterator now supports returning multiple outputs ``` fn = torch.cuda.jiterator._create_multi_output_jit_fn( """ template <typename T> T binary_2outputs(T i0, T i1, T& out0, T& out1) { out0 = i0 + i1; out1 = i0 - i1; } """, num_outputs=2) x = torch.rand(3, device='cuda') y = torch.rand(3, device='cuda') out0, out1 = fn(x, y) torch.allclose(out0, x+y) torch.allclose(out1, x-y) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/78139 Approved by: https://github.com/ngimel	2022-05-24 21:52:56 +00:00
PyTorch MergeBot	b994ce359e	Revert "[cuDNN V8 API] (reopen) Allow the number of kernels profiled under torch.backends.cudnn.benchmark = True to be limitedCudnnv8 benchmark limit (#77002 )" This reverts commit `c274f2ad52`. Reverted https://github.com/pytorch/pytorch/pull/77002 on behalf of https://github.com/malfet due to please, as it breaks internal CI, but also no CUDA heads should be included from `torch/csrc/Module.cpp`, but rather should be implemented/registered in `torch/csrc/cuda/Module.cpp`	2022-05-24 21:52:35 +00:00
Allen Goodman	da16450360	Beta function (#78031 ) Euler beta function: ```Python torch.special.beta(input, other, *, out=None) → Tensor ``` `reentrant_gamma` and `reentrant_ln_gamma` implementations (using Stirling’s approximation) are provided. I started working on this before I realized we were missing a gamma implementation (despite providing incomplete gamma implementations). Uses the coefficients computed by Steve Moshier to replicate SciPy’s implementation. Likewise, it mimics SciPy’s behavior (instead of the behavior in Cephes). Pull Request resolved: https://github.com/pytorch/pytorch/pull/78031 Approved by: https://github.com/mruberry	2022-05-24 21:07:25 +00:00
Svetlana Karslioglu	e451259a60	Reorganize Community Section v1 (#77912 ) - Change Notes to Guides - Move the Community section to the top Pull Request resolved: https://github.com/pytorch/pytorch/pull/77912 Approved by: https://github.com/malfet	2022-05-24 16:38:24 +00:00
Eddie Yan	c274f2ad52	[cuDNN V8 API] (reopen) Allow the number of kernels profiled under torch.backends.cudnn.benchmark = True to be limitedCudnnv8 benchmark limit (#77002 ) (reopening due to botched merge) The cuDNN V8 API (main support merged in https://github.com/pytorch/pytorch/pull/60755) potentially exposes many more kernels with benchmark=True. While these additional kernels can improve performance, it is often unnecessary to run every kernel returned by the heuristic and doing so may degrade the user experience by causing the first model iteration to be very slow. To alleviate this issue, this PR introduces torch.backends.cudnn.benchmark_limit. benchmark_limit specifies the maximum number of working cuDNN kernels to try for a given workload, with the default being 10 (similar to what TensorFlow does). benchmark_limit = 0 yields the current behavior of trying every kernel returned by the heuristic. CC @ptrblck @ngimel @xwang233 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77002 Approved by: https://github.com/ngimel	2022-05-24 00:11:47 +00:00
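The `benchmark_limit` semantics described above (a cap with a default of 10, where 0 means "try everything") can be modeled in a few lines. This is only a sketch of the selection rule as stated in the commit message; `kernels_to_benchmark` and the candidate names are hypothetical and do not reflect how cuDNN or PyTorch actually select kernels.

```python
def kernels_to_benchmark(candidates, benchmark_limit=10):
    """Return the candidates that would be tried under a given limit.

    A limit of 0 keeps every candidate returned by the heuristic;
    any positive limit keeps at most that many, in heuristic order.
    """
    candidates = list(candidates)
    if benchmark_limit == 0:
        return candidates
    return candidates[:benchmark_limit]


heuristic_results = [f"kernel_{i}" for i in range(25)]
default_run = kernels_to_benchmark(heuristic_results)
exhaustive_run = kernels_to_benchmark(heuristic_results, benchmark_limit=0)
```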
Rohit Goswami	c915fbe201	ENH: Convert finfo.tiny to finfo.smallest_normal (#76292 ) Fixes #70909, by a straightforward search and replace discussed in #70909. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76292 Approved by: https://github.com/mruberry	2022-05-20 00:59:48 +00:00
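The rename makes sense because the value in question is the smallest positive normal number, not the smallest positive number. The sketch below shows the distinction using Python's built-in float (an IEEE 754 double) rather than `torch.finfo`, which is an assumption made to keep the example dependency-free.

```python
import sys

# Smallest positive *normal* double; the quantity that `smallest_normal` names.
smallest_normal = sys.float_info.min

# Subnormal numbers are smaller still: the smallest one is 2**-1074.
smallest_subnormal = sys.float_info.min * sys.float_info.epsilon

# Halving the smallest subnormal underflows to zero.
underflowed = smallest_subnormal / 2
```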
Michael Carilli	929f1d5317	[RELAND] Adds torch.cuda.is_current_stream_capturing (#77789 ) Resubmit of https://github.com/pytorch/pytorch/pull/77673, which was reverted due to Windows test failures: https://github.com/pytorch/pytorch/pull/77673#issuecomment-1130425845. I suspect these failures happened because I don't explicitly set a side stream for graph capture in the new test. Not setting a side stream explicitly is alright on Linux because CUDA tests implicitly use a side stream. I think Windows CUDA tests implicitly use the default stream, breaking capture and leaving the backend in a bad state. Other graph tests explicitly set side streams and don't error in Windows builds, so I'm 95% sure doing the same for the new test will work. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77789 Approved by: https://github.com/ezyang	2022-05-18 23:18:53 +00:00
Alban Desmaison	dcd2ba3538	improve mps note to describe the different functions available (#77767 ) Fixing https://github.com/pytorch/pytorch/issues/77748 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77767 Approved by: https://github.com/soulitzer	2022-05-18 20:17:23 +00:00
Jeff Daily	de86146c61	rocblas alt impl during backward pass only (#71881 ) In preparation for adopting future rocblas library options, it is necessary to track when the backward pass of training is executing. The scope-based helper class `BackwardPassGuard` is provided to toggle state. Pull Request resolved: https://github.com/pytorch/pytorch/pull/71881 Approved by: https://github.com/albanD	2022-05-18 19:42:58 +00:00
PyTorch MergeBot	0d8a0f186b	Revert "Adds torch.cuda.is_current_stream_capturing (#77673 )" This reverts commit `d03d43df52`. Reverted https://github.com/pytorch/pytorch/pull/77673 on behalf of https://github.com/suo	2022-05-18 19:31:49 +00:00
Edward Z. Yang	4941e72e40	Revert "Revert "Implement sym_sizes to create proper IR for sym ints representing tensor sizes (#76836 )"" This reverts commit `c35bd8d423`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77719 Approved by: https://github.com/Chillee, https://github.com/malfet	2022-05-18 18:40:57 +00:00
Michael Carilli	d03d43df52	Adds torch.cuda.is_current_stream_capturing (#77673 ) Exposes a way to query if CUDA graph capture is underway on the current stream. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77673 Approved by: https://github.com/ezyang	2022-05-18 16:46:35 +00:00
Vasiliy Kuznetsov	c15fca1137	quant doc: improve rendered documentation for backend_config_dict Summary: This improves the documentation page for backend_config_dict to render the configurations in a human readable format, such as ``` { 'pattern': torch.nn.modules.pooling.AdaptiveAvgPool1d, 'dtype_configs': [ { 'input_dtype': torch.quint8, 'output_dtype': torch.quint8, }, { 'input_dtype': torch.float16, 'weight_dtype': torch.float16, 'bias_dtype': torch.float16, 'output_dtype': torch.float16, }, ], 'observation_type': ObservationType.OUTPUT_SHARE_OBSERVER_WITH_INPUT, }, ``` The results are also now sorted alphabetically by the normalized name of the root op in the pattern. A couple of utility functions are created to help with this. If in the future we convert backend_config_dict to use typed objects, we can move this logic to the objects at that time. Test plan: ``` cd docs make html cd build python -m http.server // renders correctly, example: https://gist.github.com/vkuzo/76adfc7c89e119c59813a733fa2cd56f ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/77535 Approved by: https://github.com/andrewor14	2022-05-18 11:46:07 +00:00
PyTorch MergeBot	48581d74ad	Revert "Add dispatch mode testing for meta tensors and other stuff" This reverts commit `c1cdb1216b`. Reverted https://github.com/pytorch/pytorch/pull/77477 on behalf of https://github.com/malfet	2022-05-18 02:56:48 +00:00
Edward Z. Yang	c1cdb1216b	Add dispatch mode testing for meta tensors and other stuff We don't have any coverage for meta tensor correctness for backwards because torch function mode can only allow us to interpose on Python torch API calls, but backwards invocations happen from C++. To make this possible, I add torch_dispatch_meta test which runs the tests with __torch_dispatch__. While doing this, I needed to generate fresh expected failure / skip lists for the new test suite, and I discovered that my original scaffolding for this purpose was woefully insufficient. So I rewrote how the test framework worked, and at the same time rewrote the __torch_function__ code to also use the new logic. Here's what's new: - Expected failure / skip is now done on a per function call basis, rather than the entire test. This means that separate OpInfo samples for a function don't affect each other. - There are now only two lists: expect failure list (where the test consistently fails on all runs) and skip list (where the test sometimes passes and sometimes fails). - We explicitly notate the dtype that failed. I considered detecting when something failed on all dtypes, but this was complicated and listing everything out seemed to be nice and simple. To keep the dtypes short, I introduce a shorthand notation for dtypes. - Conversion to meta tensors is factored into its own class MetaConverter - To regenerate the expected failure / skip lists, just run with PYTORCH_COLLECT_EXPECT and filter on a specific test type (test_meta or test_dispatch_meta) for whichever you want to update. 
Other misc fixes: - Fix max_pool1d to work with BFloat16 in all circumstances, by making it dispatch and then fixing a minor compile error (constexpr doesn't work with BFloat16) - Add resolve_name for turning random torch API functions into string names - Add push classmethod to the Mode classes, so that you can more easily push a mode onto the mode stack - Add some more skips for missing LAPACK - Added an API to let you query if there's already a registration for a function, added a test to check that we register_meta for all decompositions (except detach, that decomp is wrong lol), and then update all the necessary sites to make the test pass. Signed-off-by: Edward Z. Yang <ezyangfb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/77477 Approved by: https://github.com/zou3519	2022-05-18 00:18:34 +00:00
Sadra Barikbin	71d61bb78b	Fix typo in torch.package code and docs (#77604 ) Fixes #77603 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77604 Approved by: https://github.com/cpuhrsch	2022-05-17 17:35:39 +00:00
ecao	541a378914	Remove operators that support BFloat16 in the fp32 cast policy list of AutocastCPU (#77623 ) Remove operators that support BFloat16 in the fp32 cast policy list of AutocastCPU. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77623 Approved by: https://github.com/frank-wei	2022-05-17 16:49:17 +00:00
Vasiliy Kuznetsov	9cc92d5358	quant docs: best practices for quantization accuracy debugging Summary: This PR creates a best practices guideline for debugging quantization accuracy. The content here comes from https://fburl.com/gdoc/nzlzxeaf, with experimental and Meta-only parts left out. For now, a lot of the debugging is manual, with the Numeric Suite the only tool we have to help the user find root causes of quantization inaccuracies. As we build additional tools for equalization detection, outlier detection, etc., we will add them to this page. Test plan: ``` cd docs make html cd build/html python -m http.server // result renders well in browser ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/77536 Approved by: https://github.com/hx89	2022-05-17 12:16:52 +00:00
Mikayla Gawarecki	841c65f499	Unprivate _index_reduce and add documentation Pull Request resolved: https://github.com/pytorch/pytorch/pull/76997 Approved by: https://github.com/cpuhrsch	2022-05-13 19:48:38 +00:00
Kulin Seth	e011a8e18b	Enable PyTorch operations on MPS Backend. (#77343 ) Add PyTorch operations to MPS backend. - https://github.com/pytorch/pytorch/issues/77394 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77343 Approved by: https://github.com/albanD	2022-05-13 18:28:53 +00:00
James Reed	286d788029	Properly capitalize PyTorch (#77308 ) pytorch -> PyTorch Pull Request resolved: https://github.com/pytorch/pytorch/pull/77308 Approved by: https://github.com/bertmaher, https://github.com/mthrok	2022-05-12 18:07:32 +00:00
ftorres16	e06400e730	Fix docs "quantization" instead of "quantiztion" (#77300 ) There seems to be a typo in the main quantization docs. In the table comparing "Eager Mode Quantization" against "FX Graph Mode Quantization", in the row named "Quantization Mode Support" both modes say they are "Quantiztion aware" instead of "Quantization aware" Pull Request resolved: https://github.com/pytorch/pytorch/pull/77300 Approved by: https://github.com/H-Huang	2022-05-12 12:19:32 +00:00
ecao	5993cc0b3d	Update operator list for AutocastCPU (#68725 ) Update operator list for AutocastCPU. Pull Request resolved: https://github.com/pytorch/pytorch/pull/68725 Approved by: https://github.com/frank-wei	2022-05-11 17:28:35 +00:00
Kulin Seth	f348b1b2b5	Add the Runtime components for MPS backend. (#76725 ) The PR adds the runtime components and a few basic operations like copy, as_strided for MPS backend. Current list of identified TODOs are: - https://github.com/pytorch/pytorch/issues/77176 - Unify the logic with CUDACachingAllocator and remove redundant code. - https://github.com/pytorch/pytorch/issues/77170 - Look into using C++ smart pointers where possible with ObjC code - Use empty_strided_generic() to implement the `empty_strided_mps` code - https://github.com/pytorch/pytorch/issues/77144 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76725 Approved by: https://github.com/albanD	2022-05-11 17:19:45 +00:00
leslie-fang-intel	f2d9fc18f1	Update amp document with CPU Training/Inference Examples (#77244 ) This PR mainly updates the document with CPU Training/Inference Examples. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77244 Approved by: https://github.com/H-Huang	2022-05-11 15:42:45 +00:00
Danielle Pintz	6ae047b0a9	Remove misleading statement in optim.Optimizer docs (#76967 ) Fixes #76752 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76967 Approved by: https://github.com/jbschlosser	2022-05-10 14:39:53 +00:00
Ivan Yashchuk	890bdf13e1	Remove deprecated torch.solve (#70986 ) The time has come to remove deprecated linear algebra related functions. This PR removes `torch.solve`. cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/70986 Approved by: https://github.com/Lezcano, https://github.com/albanD	2022-05-10 13:44:07 +00:00
PyTorch MergeBot	4ebc4890dd	Revert "Add linalg.lu_solve" This reverts commit `fc5b4a5a33`. Reverted https://github.com/pytorch/pytorch/pull/72935 on behalf of https://github.com/malfet	2022-05-09 19:12:30 +00:00
Alban Desmaison	d5210a4269	Add gradient choice detail to autograd doc Trying to clarify what our backward functions should compute. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76898 Approved by: https://github.com/soulitzer, https://github.com/Lezcano	2022-05-06 21:12:25 +00:00
Sherlockk Huang	8b6a78f39f	Python Interface for Jiterator This PR allows users to author a CUDA kernel in Python. ``` from torch.cuda.jiterator import create_jit_fn code_string = "template <typename T> T my_kernel(T x, T y, T alpha) { return -x * y + x - y + alpha; }" jitted_fn = create_jit_fn(code_string, alpha=0) a = torch.rand(3, device='cuda') b = torch.rand(3, device='cuda') result = jitted_fn(a, b, alpha=1.0) ``` Limitations: - Only supports elementwise kernels - 1~8 tensor inputs (empty input, e.g. factory methods, is not supported) - input tensors must live on a CUDA device - CPU Scalar is not supported - kwargs must be pre-declared when calling create_jit_fn - kwargs must be convertible to at::Scalar, one of float64, int64_t, bool. (complex is not supported for now) TODOs: - [x] consolidate union and c10::variant implementation - [x] plug into existing op testing framework - [ ] rename files, place files in the right folder - [ ] place util functions in the right file - [x] enforce assumptions in python interface e.g. <8 inputs, kwargs types - [x] Add user-facing documentation Pull Request resolved: https://github.com/pytorch/pytorch/pull/76394 Approved by: https://github.com/mruberry	2022-05-06 18:44:28 +00:00
zhoubo	fd6991e714	add trunc_normal_ function to doc of torch.nn.init Fixes #72517 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76896 Approved by: https://github.com/jbschlosser	2022-05-06 14:33:08 +00:00
lezcano	621ff0f973	Add linalg.vander This PR adds `linalg.vander`, the linalg version of `torch.vander`. We add autograd support and support for batched inputs. We also take this chance to improve the docs (TODO: Check that they render correctly!) and add an OpInfo. Discussion: The current default for the `increasing` kwarg is extremely odd as it is the opposite of the classical definition (see [wiki](https://en.wikipedia.org/wiki/Vandermonde_matrix)). This is reflected in the docs, where I make explicit both the odd defaults that we use and the classical definition. See also [this stackoverflow post](https://stackoverflow.com/a/71758047/5280578), which shows how people are confused by these defaults. My take on this would be to correct the default to be `increasing=True` and document the divergence with NumPy (as we do for other `linalg` functions) as: - It is what people expect - It gives the correct determinant called "the Vandermonde determinant" rather than (-1)^{n-1} times the Vandermonde det (ugh). - [Minor] It is more efficient (no `flip` needed) - Since it's under `linalg.vander`, it's strictly not a drop-in replacement for `np.vander`. We will deprecate `torch.vander` in a PR after this one in this stack (once we settle on what's the correct default). Thoughts? mruberry cc kgryte rgommers as they might have some context for the defaults of NumPy. Fixes https://github.com/pytorch/pytorch/issues/60197 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76303 Approved by: https://github.com/albanD, https://github.com/mruberry	2022-05-06 08:44:14 +00:00
PyTorch MergeBot	8ac6b0a010	Revert "Contribution- Grammatical Corrections in the documentation" This reverts commit `a0ebf1d386`. Reverted https://github.com/pytorch/pytorch/pull/57411 on behalf of https://github.com/malfet	2022-05-05 23:13:10 +00:00
Sanskar	a0ebf1d386	Contribution- Grammatical Corrections in the documentation Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/57411 Approved by: https://github.com/svekars, https://github.com/holly1238, https://github.com/malfet	2022-05-05 22:35:08 +00:00
lezcano	fc5b4a5a33	Add linalg.lu_solve This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA when calling the batched MAGMA backend with trans=True. We work around that by solving two triangular systems instead. We also update the heuristics for this function, as they were fairly outdated. We found that cuSolver is king, so luckily we do not need to rely on the buggy backend from MAGMA for this function. We added tests testing this function left and right. We also added tests for the different backends. We also activated the tests for AMD, as those should work as well. Fixes https://github.com/pytorch/pytorch/issues/61657 Pull Request resolved: https://github.com/pytorch/pytorch/pull/72935 Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry	2022-05-05 19:02:13 +00:00
sanchitintel	4ee29d6033	[Reland take-2] Add JIT graph fuser for oneDNN Graph API (v0.5) Re-landing #68111/#74596 ## Description v0.5 PR of this [RFC](https://github.com/pytorch/pytorch/issues/49444). On the basis of #50256, the below improvements are included: * The [v0.5 release branch](https://github.com/oneapi-src/oneDNN/releases/tag/graph-v0.5) of the oneDNN Graph API is used * The fuser now works with the profiling graph executor. We have inserted type check nodes to guard the profiled tensor properties. ### User API: The optimization pass is disabled by default. Users could enable it by: ``` torch.jit.enable_onednn_fusion(True) ``` `torch.jit.freeze` should be used after tracing (recommended) or scripting a model. ### Performance: [pytorch/benchmark](https://github.com/pytorch/benchmark) tool is used to compare the performance: * SkyLake 8180 (1 socket of 28 cores): ![image](https://user-images.githubusercontent.com/65992142/151162305-05e44425-a24e-4d5e-94e1-743b40b87a8c.png) * SkyLake 8180 (single thread): ![image](https://user-images.githubusercontent.com/65992142/151162528-69f90b79-d08d-46b8-8775-d80a6ccbce8a.png) * By mapping hardswish to oneDNN Graph, it’s 8% faster than PyTorch JIT (NNC + OFI) ** We expect performance gain after mapping transpose, contiguous & view to oneDNN graph ops ### Directory structure of the integration code Fuser-related code is placed under: ``` torch/csrc/jit/codegen/onednn/ ``` Optimization pass registration is done in: ``` torch/csrc/jit/passes/onednn_graph_fuser.h ``` CMake for the integration code is in: ``` caffe2/CMakeLists.txt cmake/public/mkldnn.cmake cmake/Modules/FindMKLDNN.cmake ``` ## Limitations * In this PR, we only support Pytorch-oneDNN-Graph integration on Linux platform. Support on Windows and MacOS will be enabled as a next step. * We have only optimized the inference use-case. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76622 Approved by: https://github.com/eellison	2022-05-05 16:57:03 +00:00
lezcano	7cb7cd5802	Add linalg.lu This PR modifies `lu_unpack` by: - Using less memory when unpacking `L` and `U` - Fusing the subtraction by `-1` with `unpack_pivots_stub` - Defining tensors of the correct types to avoid copies - Porting `lu_unpack` to be a structured kernel so that its `_out` version does not incur extra copies Then we implement `linalg.lu` as a structured kernel, as we want to compute its derivative manually. We do so because composing the derivatives of `torch.lu_factor` and `torch.lu_unpack` would be less efficient. This new function and `lu_unpack` come with everything they can come with: forward and backward AD, decent docs, correctness tests, OpInfo, complex support, support for meta tensors and support for vmap and vmap over the gradients. I really hope we don't continue adding more features. This PR also avoids saving some of the tensors that were previously saved unnecessarily for the backward in `lu_factor_ex_backward` and `lu_backward` and makes some other general improvements here and there to the forward and backward AD formulae of other related functions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/67833 Approved by: https://github.com/IvanYashchuk, https://github.com/nikitaved, https://github.com/mruberry	2022-05-05 09:17:05 +00:00
Eddie Yan	e838137b3e	Add high level control of fp32 matmul precision; disable TF32 for matmuls by default #76440 CC @mruberry @ptrblck Pull Request resolved: https://github.com/pytorch/pytorch/pull/76509 Approved by: https://github.com/ngimel	2022-05-04 20:40:13 +00:00
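A short sketch of what the high-level control looks like from Python. The function names `torch.get_float32_matmul_precision` and `torch.set_float32_matmul_precision` come from the PyTorch documentation, not from this log entry, so treat them as an assumption about the API this commit introduced; the example also requires PyTorch 1.12 or later to be installed.

```python
import torch

# Remember the current setting so it can be restored afterwards.
previous = torch.get_float32_matmul_precision()

# "high" permits reduced-precision internals (such as TF32 on supporting GPUs);
# "highest" keeps full fp32 precision for matmuls.
torch.set_float32_matmul_precision("high")
current = torch.get_float32_matmul_precision()

# Restore the original setting.
torch.set_float32_matmul_precision(previous)
restored = torch.get_float32_matmul_precision()
```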
Shawn Zhong	9c902f4749	Add `TORCH_CPP_LOG_LEVEL` to the docs Fixes #70667 `TORCH_CPP_LOG_LEVEL=INFO` is needed for `TORCH_DISTRIBUTED_DEBUG` to be effective. For reference, https://github.com/pytorch/pytorch/pull/71746 introduced the environment variable `TORCH_CPP_LOG_LEVEL` and https://github.com/pytorch/pytorch/pull/73361 documented it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76625 Approved by: https://github.com/rohan-varma	2022-05-03 17:01:11 +00:00
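The dependency between the two variables can be captured by always setting them together. The helper below is a hypothetical sketch (`distributed_debug_env` is not a PyTorch function), and the `DETAIL` value for `TORCH_DISTRIBUTED_DEBUG` is taken from the PyTorch distributed documentation rather than from this log entry.

```python
import os


def distributed_debug_env(base=None):
    """Return a copy of an environment with both debug variables set.

    TORCH_DISTRIBUTED_DEBUG only takes effect when TORCH_CPP_LOG_LEVEL
    is INFO, so the two are set as a pair.
    """
    env = dict(os.environ if base is None else base)
    env["TORCH_CPP_LOG_LEVEL"] = "INFO"
    env["TORCH_DISTRIBUTED_DEBUG"] = "DETAIL"
    return env


env = distributed_debug_env({})
```

The resulting mapping can be passed as the `env` argument of `subprocess.run` when launching a training script, so the variables are in place before the process starts.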
Shabab Ayub	3e08b18167	Back out "Back out "[torch deploy] Update deploy.rst with working simple example"" (#76713 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76713 Original commit changeset: d8deed7d0b7f Original Phabricator Diff: D36073344 (`d16ce8a2f6`) Test Plan: n/a Reviewed By: osalpekar Differential Revision: D36086703 fbshipit-source-id: 15d03bdb478c02a4c5253a2023828147ee1438e0 (cherry picked from commit fdc27f0fda4b63703839c9ddb620e4708a6360fa)	2022-05-03 14:12:18 +00:00
Shabab Ayub	d16ce8a2f6	Back out "[torch deploy] Update deploy.rst with working simple example" Summary: Original commit changeset: d78bb2886f94 Original Phabricator Diff: D35998155 Test Plan: n/a Reviewed By: osalpekar Differential Revision: D36073344 fbshipit-source-id: d8deed7d0b7fe716251bfed2450bf971a2dd394c (cherry picked from commit 689d84be98c106a1883f07343b64326560c920ce)	2022-05-02 22:07:42 +00:00
Shabab Ayub	a240d45277	[torch deploy] Update deploy.rst with working simple example (#76538 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76538 when running the example from the docs, I found that these steps were not working. These are the updates necessary to get the example working. Test Plan: n/a Reviewed By: PaliC Differential Revision: D35998155 fbshipit-source-id: d78bb2886f94889abae5a3af5239fcd306cd5e09 (cherry picked from commit 6893812efe7443b437ccafb7b1ff6bc7bd2e6670)	2022-05-02 22:07:42 +00:00
PyTorch MergeBot	bc5307347f	Revert "Add linalg.vander" This reverts commit `1ea49c68d0`. Reverted https://github.com/pytorch/pytorch/pull/76303 on behalf of https://github.com/malfet	2022-05-02 18:50:08 +00:00
lezcano	1ea49c68d0	Add linalg.vander This PR adds `linalg.vander`, the linalg version of `torch.vander`. We add autograd support and support for batched inputs. We also take this chance to improve the docs (TODO: Check that they render correctly!) and add an OpInfo. Discussion: The current default for the `increasing` kwarg is extremely odd as it is the opposite of the classical definition (see [wiki](https://en.wikipedia.org/wiki/Vandermonde_matrix)). This is reflected in the docs, where I make explicit both the odd defaults that we use and the classical definition. See also [this stackoverflow post](https://stackoverflow.com/a/71758047/5280578), which shows how people are confused by these defaults. My take on this would be to correct the default to be `increasing=True` and document the divergence with NumPy (as we do for other `linalg` functions) as: - It is what people expect - It gives the correct determinant called "the Vandermonde determinant" rather than (-1)^{n-1} times the Vandermonde det (ugh). - [Minor] It is more efficient (no `flip` needed) - Since it's under `linalg.vander`, it's strictly not a drop-in replacement for `np.vander`. We will deprecate `torch.vander` in a PR after this one in this stack (once we settle on what's the correct default). Thoughts? mruberry cc kgryte rgommers as they might have some context for the defaults of NumPy. Fixes https://github.com/pytorch/pytorch/issues/60197 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76303 Approved by: https://github.com/albanD	2022-05-02 15:26:44 +00:00
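The difference between the two column orderings discussed in this commit can be illustrated with a small pure-Python sketch. The `vander` helper below is hypothetical and written only for this example; it is not the `torch.vander` or `linalg.vander` implementation, and it ignores batching and autograd.

```python
def vander(xs, n=None, increasing=False):
    """Build a Vandermonde matrix as a list of rows (illustrative helper).

    With increasing=True each row is [1, x, x**2, ...] (the classical
    definition); with increasing=False the columns are reversed.
    """
    n = len(xs) if n is None else n
    rows = [[x ** k for k in range(n)] for x in xs]
    if not increasing:
        # Reverse the columns, which is the extra "flip" step.
        rows = [row[::-1] for row in rows]
    return rows


print(vander([1, 2, 3], increasing=True))   # [[1, 1, 1], [1, 2, 4], [1, 3, 9]]
print(vander([1, 2, 3], increasing=False))  # [[1, 1, 1], [4, 2, 1], [9, 3, 1]]
```

The decreasing ordering is just the increasing one with its columns reversed, which is why the commit notes that the increasing default needs no `flip`.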
PyTorch MergeBot	3dcd67a1b3	Revert "[Re-landing 68111] Add JIT graph fuser for oneDNN Graph API (Preview4.1)" This reverts commit `8b11d81058`. Reverted https://github.com/pytorch/pytorch/pull/74596 on behalf of https://github.com/janeyx99	2022-04-29 15:40:17 +00:00
chunyuan	8b11d81058	[Re-landing 68111] Add JIT graph fuser for oneDNN Graph API (Preview4.1) Re-landing https://github.com/pytorch/pytorch/pull/68111 ## Description Preview4 PR of this [RFC](https://github.com/pytorch/pytorch/issues/49444). On the basis of https://github.com/pytorch/pytorch/pull/50256, the below improvements are included: - The [preview4 release branch](https://github.com/oneapi-src/oneDNN/releases/tag/graph-v0.4.1) of the oneDNN Graph API is used - The fuser now works with the profiling graph executor. We have inserted type check nodes to guard the profiled tensor properties. ### User API: The optimization pass is disabled by default. Users could enable it by: ``` torch.jit.enable_onednn_fusion(True) ``` ### Performance: [pytorch/benchmark](https://github.com/pytorch/benchmark) tool is used to compare the performance: - SkyLake 8180 (1 socket of 28 cores): ![image](https://user-images.githubusercontent.com/65992142/151162305-05e44425-a24e-4d5e-94e1-743b40b87a8c.png) - SkyLake 8180 (single thread): ![image](https://user-images.githubusercontent.com/65992142/151162528-69f90b79-d08d-46b8-8775-d80a6ccbce8a.png) \* By mapping hardswish to oneDNN Graph, it’s 8% faster than PyTorch JIT (NNC + OFI) \** We expect performance gain after mapping transpose, contiguous & view to oneDNN graph ops ### Directory structure of the integration code Fuser-related code are placed under: ``` torch/csrc/jit/codegen/onednn/ ``` Optimization pass registration is done in: ``` torch/csrc/jit/passes/onednn_graph_fuser.h ``` CMake for the integration code is: ``` caffe2/CMakeLists.txt ``` ## Limitations - In this PR, we have only supported the optimization on Linux platform. The support on Windows and MacOS will be enabled as the next step. - We have only optimized the inference use case. Pull Request resolved: https://github.com/pytorch/pytorch/pull/74596 Approved by: https://github.com/malfet	2022-04-29 01:01:33 +00:00
Ivan Yashchuk	8bb7203049	Add torch.linalg.ldl_factor_ex and torch.linalg.ldl_solve This PR adds a function for computing the LDL decomposition and a function that can solve systems of linear equations using this decomposition. The result of `torch.linalg.ldl_factor_ex` is in a compact form and it's required to use it only through `torch.linalg.ldl_solve`. In the future, we could provide `ldl_unpack` function that transforms the compact representation into explicit matrices. Fixes https://github.com/pytorch/pytorch/issues/54847. cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/69828 Approved by: https://github.com/Lezcano, https://github.com/mruberry, https://github.com/albanD	2022-04-28 19:23:37 +00:00
Jerry Zhang	30342f6ba6	[quant][docs] Fix formatting for quantization.rst (#76223 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76223 Small formatting fixes that were missed because I didn't check the generated doc last time Test Plan: visual inspection of the generated docs for this PR Imported from OSS Reviewed By: HDCharles Differential Revision: D35853174 fbshipit-source-id: 4454a4bf5d0c998d866bbae1d6b5286827082033 (cherry picked from commit 125f60356ccc9cd6888c515889bd27ff9860ec74)	2022-04-26 03:16:39 +00:00
Elias Ellison	0d7be81c9c	[JIT] Add Context Manager to force strict fusion Fixes https://github.com/pytorch/pytorch/issues/75464 Adds a context manager that will throw if the ops in the context are not fused. API is : ``` with torch.jit.strict_fusion(): ... ``` A few TODOs: [+] Compose/figure out how to do with autodiff - right now it will run on autodiff as well [+] Support all of the nvfuser operators that are added in guarding [+] Figure out what to do with control flow that isn't taken (right now it will just error). this is probably a source of the original issue :/ - will just error [+] (After those are figured out) add to docs Pull Request resolved: https://github.com/pytorch/pytorch/pull/75777 Approved by: https://github.com/davidberard98	2022-04-25 16:08:57 +00:00
Jerry Zhang	056627ddce	[quant][docs] Add more docs for quantization.rst (#75998 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/75998 Add more details to user facing docs quantization.rst, which will be displayed in the official quantization doc page: https://pytorch.org/docs/stable/quantization.html This includes: * docs for quantization stack (quantized tensor, quantized operator and modules, observer, fake_quantize, QConfig, quantization flow) * Added support table for quantization mode, quantization flow mode and backend, (also moved around operator support table) * restructured eager mode and fx mode docs as well Test Plan: inspect the doc that's built by github ci Imported from OSS Reviewed By: dzdang Differential Revision: D35739111 fbshipit-source-id: 3762d387479bdd37472cb17d5c49da2f520effbb (cherry picked from commit db5e6411c52c08dd9c45f841ab86713d36a75d51)	2022-04-22 06:42:39 -07:00
albanD	a6a5e6cecf	move the stateless util to public API! Pull Request resolved: https://github.com/pytorch/pytorch/pull/75834 Approved by: https://github.com/zou3519, https://github.com/jbschlosser	2022-04-21 13:42:24 +00:00
kshitij12345	aa51704ce5	[complex32] add chalf alias for complex32 and chalf method Reference: https://github.com/pytorch/pytorch/issues/74537 Adds chalf alias for complex32 and also adds method `chalf` similar to `cfloat, cdouble` TODO: * [x] Add docs * [x] Add override Pull Request resolved: https://github.com/pytorch/pytorch/pull/75320 Approved by: https://github.com/anjali411	2022-04-20 23:44:47 +00:00
Jerry Zhang	74454bdb46	[quant][fx] Move backend_config folder to torch.ao.quantization Summary: Following https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md we implemented the backend configuration for the fbgemm/qnnpack backend. It is currently under the fx folder, but we'd like to use it for all the different workflows, including eager, fx graph and define-by-run quantization. This PR moves it to the torch.ao.quantization namespace so that it can be shared by different workflows. Also moves some utility functions specific to fx to fx/backend_config_utils.py; some files are kept in the fx folder (quantize_handler.py and fuse_handler.py) Test Plan: python test/test_quantization.py TestQuantizeFx python test/test_quantization.py TestQuantizeFxOps python test/test_quantization.py TestQuantizeFxModels python test/test_quantization.py TestAOMigrationQuantization python test/test_quantization.py TestAOMigrationQuantizationFx Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/75823 Approved by: https://github.com/vkuzo	2022-04-19 15:38:57 +00:00
Alban Desmaison	bd7e99cbb9	Fix doc build Regression introduced in https://github.com/pytorch/pytorch/pull/73224 The caller for this script has never been updated to pass in main: `2ecc59086a/.github/workflows/_docs.yml (L81-L85)` So this change made it so that all PR doc is built as-if it was a release (for example https://github.com/pytorch/pytorch/runs/6031182009?check_suite_focus=true) and so the coverage test for the doc didn't run for a month :( Pull Request resolved: https://github.com/pytorch/pytorch/pull/75997 Approved by: https://github.com/musebc, https://github.com/seemethere	2022-04-19 04:07:47 +00:00
Brian Johnson	990d155c9c	Update Index.rst to add TorchRec to domain list. Adds TorchRec and TorchData to domain library list. Pull Request resolved: https://github.com/pytorch/pytorch/pull/73229 Approved by: https://github.com/colin2328, https://github.com/jamesr66a	2022-04-15 02:39:12 +00:00
Nikita Shulga	348881deaf	Update doc copyrights to 2022 Also, s/Torch/PyTorch/ Pull Request resolved: https://github.com/pytorch/pytorch/pull/75690 Approved by: https://github.com/kit1980, https://github.com/soumith	2022-04-13 00:25:23 +00:00
Yulv-git	ac2d2e3a3d	Fix some typos. Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/75561 Approved by: https://github.com/albanD	2022-04-11 21:55:59 +00:00
Nuno-Mota	0bd3354547	Update onnx.rst Fixes #75508 Pull Request resolved: https://github.com/pytorch/pytorch/pull/75509 Approved by: https://github.com/BowenBao	2022-04-08 20:07:01 +00:00
Mikayla Gawarecki	11f1fef981	Update documentation for scatter_reduce Pull Request resolved: https://github.com/pytorch/pytorch/pull/74608 Approved by: https://github.com/cpuhrsch	2022-04-07 15:41:23 +00:00
Thiago Crepaldi	89e79f844d	Add list of supported ATen ops by ONNX converter into torch.onnx page This PR introduces a new documentation page with a list of supported ATen operators by the ONNX converter. When `make html` (or similar) is called, a python script will generate a temporary CSV file inside the doc build folder with a list of operators/opsets currently supported by the PyTorch ONNX exporter. That CSV is used by Sphinx to build an HTML table using the same theme as the rest of the documentation. That page is linked to the existing `onnx.rst`, including its table of contents. @BowenBao @shubhambhokare1 Feel free to add more details on how the script cross-references onnx symbolics and the aten operators list from the torch jit api. Below is the workflow for the changed pages: The initial torch.onnx page was modified to add a link to the list of supported aten operators ![image](https://user-images.githubusercontent.com/5469809/159046387-c459bffc-c9b2-4fcb-8468-8181fdddf911.png) The screen below highlights the text structure changes to the `ATen operators` section ![image](https://user-images.githubusercontent.com/5469809/159046730-ccd1e594-c8e6-4b8d-a9ec-8bf6ad58a435.png) Finally the new page with the list of supported operators is shown below ![image](https://user-images.githubusercontent.com/5469809/159046872-0d99b769-8b95-4c2b-99a9-a8cfdd0b6ecf.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/74397 Approved by: https://github.com/garymm, https://github.com/malfet	2022-04-07 00:05:44 +00:00
Vasiliy Kuznetsov	74b23b2066	quantization: autogenerate quantization backend configs for documentation (#75126 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/75126 Quantization has a high volume of configurations of how to quantize an op for a reference model representation which is useful for a lowering step for a backend. An example of this is ``` {'dtype_configs': [{'input_dtype': torch.quint8, 'output_dtype': torch.quint8}], 'observation_type': <ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT: 0>, 'pattern': <class 'torch.nn.modules.conv.ConvTranspose1d'>}, ``` These configs are checked into master, and they are created with Python functions. Therefore, there is no easy way for the user to see what the configs actually are without running some Python code. This PR is one approach to document these configs. Here is what this is doing: 1. during documentation build, write a text file of the configs 2. render that text file on a quantization page, with some additional context In the future, this could be extended to autogenerate better looking tables such as: op support per backend and dtype, op support per valid quantization settings per backend, etc. Test Plan: ``` cd docs make html cd html python -m http.server 8000 // render http://[::]:8000/quantization-backend-configuration.html // it renders correctly ``` Reviewed By: ejguan Differential Revision: D35365461 Pulled By: vkuzo fbshipit-source-id: d60f776ccb57da9db3d09550e4b27bd5e725635a (cherry picked from commit 14865c0e23bc080120342c8f9278f0fae8eb8fbd)	2022-04-04 22:22:30 +00:00
Sherlockk Huang	bbf7e159e0	Implement torch.special.log_ndtr Implements torch.special.log_ndtr Issue: https://github.com/pytorch/pytorch/issues/50345 TODO: - [x] adding proper reference to scipy implementation - [x] double check if the changes in test/test_unary_ufuncs.py is really necessary - [x] check setting for UnaryUfuncInfo cc: @kshitij12345 @mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/74795 Approved by: https://github.com/anjali411	2022-03-29 23:13:37 +00:00
Smark	ab57876420	fix docs error in Autograd Mechanics Fixes #74682 Pull Request resolved: https://github.com/pytorch/pytorch/pull/74807 Approved by: https://github.com/albanD	2022-03-29 18:32:16 +00:00
Janakan	923a922b1b	Grammatically updated quantization tech doc Improved PyTorch technical documentation consistency for the "quantization API summary" section. ![Screen Shot 2022-03-19 at 4 07 46 PM](https://user-images.githubusercontent.com/72175053/160317638-51e26ec0-903e-44ba-ba59-aa114d4fda93.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/74436 Approved by: https://github.com/albanD	2022-03-28 16:48:25 +00:00
Kurt Mohler	79ddc72b85	Virtualize `<type>Storage` classes (#66970 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/66228 cc ezyang bhosmer smessmer ljk53 bdhirsh Pull Request resolved: https://github.com/pytorch/pytorch/pull/66970 Reviewed By: bdhirsh Differential Revision: D33245612 Pulled By: ezyang fbshipit-source-id: 4c61c2cb029e2b94b0e68927c377d3e1c358dd7c (cherry picked from commit d29fcdfb4bc2cc17b1795d4349e4b56fa0d1cf12)	2022-03-22 23:44:48 +00:00
leslie-fang-intel	3a112ebb57	add autocast cpu doc As discussed in https://github.com/pytorch/pytorch/issues/55374#issuecomment-968333614, here we update the cpu autocast operation list in autocast API document. Pull Request resolved: https://github.com/pytorch/pytorch/pull/68567 Approved by: https://github.com/ezyang	2022-03-22 02:02:43 +00:00
Michael Suo	e5bf87963d	Revert D34584878: [pytorch][PR] Add JIT graph fuser for oneDNN Graph API (Preview4) Test Plan: revert-hammer Differential Revision: D34584878 (`7dd0823011`) Original commit changeset: ce817aa8cc90 Original Phabricator Diff: D34584878 (`7dd0823011`) fbshipit-source-id: a941aaad34f8fe5f0c51f719f9f5c29b811c4d5b (cherry picked from commit a43262ec7521b1665b02a64d3f279e72ee2344b9)	2022-03-21 23:07:14 +00:00
chunyuan	7dd0823011	Add JIT graph fuser for oneDNN Graph API (Preview4) (#68111 ) Summary: ## Description Preview4 PR of this [RFC](https://github.com/pytorch/pytorch/issues/49444). On the basis of https://github.com/pytorch/pytorch/pull/50256, the below improvements are included: - The [preview4 release branch](https://github.com/oneapi-src/oneDNN/releases/tag/graph-v0.4.1) of the oneDNN Graph API is used - The fuser now works with the profiling graph executor. We have inserted type check nodes to guard the profiled tensor properties. ### User API: The optimization pass is disabled by default. Users could enable it by: ``` torch.jit.enable_onednn_fusion(True) ``` ### Performance: [pytorch/benchmark](https://github.com/pytorch/benchmark) tool is used to compare the performance: - SkyLake 8180 (1 socket of 28 cores): ![image](https://user-images.githubusercontent.com/65992142/151162305-05e44425-a24e-4d5e-94e1-743b40b87a8c.png) - SkyLake 8180 (single thread): ![image](https://user-images.githubusercontent.com/65992142/151162528-69f90b79-d08d-46b8-8775-d80a6ccbce8a.png) \* By mapping hardswish to oneDNN Graph, it’s 8% faster than PyTorch JIT (NNC + OFI) \** We expect performance gain after mapping transpose, contiguous & view to oneDNN graph ops ### Directory structure of the integration code Fuser-related code are placed under: ``` torch/csrc/jit/codegen/onednn/ ``` Optimization pass registration is done in: ``` torch/csrc/jit/passes/onednn_graph_fuser.h ``` CMake for the integration code is: ``` caffe2/CMakeLists.txt ``` ## Limitations - In this PR, we have only supported the optimization on Linux platform. The support on Windows and MacOS will be enabled as the next step. - We have only optimized the inference use case. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68111 Reviewed By: eellison Differential Revision: D34584878 Pulled By: malfet fbshipit-source-id: ce817aa8cc9052ee9ed930c9cf66be83449e61a4 (cherry picked from commit cd17683aa7d9c0947df45a1ab53627feff795587)	2022-03-21 22:12:19 +00:00
Jaewon Lee	11ea09effc	[CUDACachingAlloc/GPUInference] Implement garbage collection without GPU sync (#74261 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74261 ### Goal Implement a cheap way to reclaim GPU memory (garbage collection) without incurring GPU sync. ### Why do we need this? Currently, there are only two ways to reclaim GPU memory block already assigned to a particular stream. - `release_available_cached_blocks(params)`: Free blocks exceeding the `CachingAllocatorConfig::max_split_size()` until we can satisfy the request. Issue: If the `max_split_size` is unset (default), this function is a no-op. Even if this is set, the reclamation is quite conservative (e.g., never frees blocks under max_split_size). - `release_cached_blocks()`: Waits for all the in-flight events and then reclaim blocks. Issue: 'waiting for all event' is very expensive as it will likely stall all the GPU operations. Many GPU applications without a proper handling of potential GPU throttling would suffer/crash. ### Proposed idea - If the garbage collection threshold is set, try to reclaim some memory blocks without synchronization. It should be safe to do so, as `release_available_cached_blocks` essentially does the same thing (but less aggressively). - GC is triggered only when we fail to serve a `malloc` request from the block pool. No need to free blocks when the block pool is functioning just fine. - Prioritize reclaiming blocks that weren't reused for long time. Reclamation stops once the used memory capacity < threshold. - This code path is totally optional; by default it won't be invoked. Test Plan: - Unit tests - Manually checked that the GPU memory usage stays as indicated by the garbage collector. If not the caching allocator at least tries to keep freeing the blocks. 
Reviewed By: jianyuh Differential Revision: D34482514 fbshipit-source-id: d5eae62ac60b94b0bca851f9d233a092d086e3c2 (cherry picked from commit 05780f1ed4b176f05e765b2411c9eaa2eaeb48b0)	2022-03-21 18:46:02 +00:00
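The reclamation policy proposed in this commit (free the least recently reused cached blocks first, and stop once usage falls below the threshold) can be sketched in plain Python. Everything here is hypothetical and for illustration only: the `Block` class, the `last_used` counter, and the `reclaim` function are assumptions of this sketch, not the allocator's actual data structures or API.

```python
from dataclasses import dataclass


@dataclass
class Block:
    size: int       # bytes held by this cached (currently unused) block
    last_used: int  # hypothetical counter: smaller means not reused for longer


def reclaim(cached_blocks, reserved_bytes, capacity_bytes, threshold):
    """Pick cached blocks to free, oldest first, until usage < threshold.

    Returns the blocks chosen for freeing and the resulting reserved size.
    No synchronization is modelled; only blocks already sitting in the
    cache are considered.
    """
    limit = threshold * capacity_bytes
    freed = []
    for block in sorted(cached_blocks, key=lambda b: b.last_used):
        if reserved_bytes < limit:
            break  # usage is under the threshold, stop reclaiming
        freed.append(block)
        reserved_bytes -= block.size
    return freed, reserved_bytes


blocks = [Block(100, last_used=5), Block(100, last_used=1), Block(50, last_used=3)]
freed, remaining = reclaim(blocks, reserved_bytes=900, capacity_bytes=1000, threshold=0.8)
print([b.last_used for b in freed], remaining)  # [1, 3] 750
```

In this sketch the most recently reused block (`last_used=5`) survives, because the two older blocks were enough to bring usage under 80% of capacity.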
BowenBao	54a6942f8d	[ONNX] ONNX Exporter logging (#71342 ) Summary: Add ONNX exporter logging facility. Supporting both C++/Python logging api. Logging can be turned on/off. Logging output stream can be either set to `stdout` or `stderr`. A few other changes: * When exception is raised in passes, the current IR graph being processed will be logged. * When exception is raised from `_jit_pass_onnx` (the pass that converts nodes from namespace `ATen` to `ONNX`), both ATen IR graph and ONNX IR graph under construction will be logged. * Exception message for ConstantFolding is truncated to avoid being too verbose. * Update the final printed IR graph with node name in ONNX ModelProto as node attribute. Torch IR Node does not have name. Adding this to printed IR graph helps debugging. Pull Request resolved: https://github.com/pytorch/pytorch/pull/71342 Reviewed By: msaroufim Differential Revision: D34433473 Pulled By: malfet fbshipit-source-id: 4b137dfd6a33eb681a5f2612f19aadf5dfe3d84a (cherry picked from commit 67a8ebed5192c266f604bdcca931df6fe589699f)	2022-03-17 19:40:03 +00:00
Banit Agrawal	ac3effd150	[PyTorch GPU Allocator] Better use of blocks with rounding of allocation sizes (#74213 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74213 In the current CUDACachingAllocator, sizes are rounded up to a multiple of the block size of 512, which works for smaller sizes. However, for large sizes we can have lots of different size blocks in the larger pool. This is problematic when we have variable batch sizes 1001, 1021, 1023 -> all will go to a different block size and will create blocks of different sizes. This will create lots of unused blocks and will waste GPU memory capacity. This diff adds a rounding approach to allocation size. It rounds up the size to the nearest power-of-2 division, and the number of power-of-2 divisions can be changed with an env variable setting. For example, if we need to round up a size of 1200 and the number of divisions is 4, the size 1200 lies between 1024 and 2048, and if we do 4 divisions between them, the values are 1024, 1280, 1536, and 1792. So the function will return 1280 as the nearest ceiling of power-2 division. env setting: export PYTORCH_CUDA_ALLOC_CONF=roundup_power2_divisions:4 ghstack-source-id: 151446017 Reviewed By: ezyang Differential Revision: D34868036 fbshipit-source-id: 494785add16e6b37c920dcb5a2b81d4c637b554a (cherry picked from commit 548454ccacbd8700e7ffd2d762e40b4ba37abbae)	2022-03-16 02:53:53 +00:00
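The worked example in this commit (1200 rounds up to 1280 with 4 divisions) can be reproduced with a short sketch. The function name `round_up_power2_divisions` and its handling of edge cases are assumptions made for illustration; this is not the allocator's C++ code.

```python
def round_up_power2_divisions(size, divisions):
    """Round size up to the next boundary obtained by splitting the interval
    between neighbouring powers of two into `divisions` equal steps."""
    if size <= 0 or divisions <= 1:
        return size
    lower = 1 << (size.bit_length() - 1)  # largest power of two <= size
    if lower == size:
        return size                       # already on a power-of-two boundary
    step = lower // divisions
    if step == 0:
        return lower * 2                  # interval too small to divide
    offset = size - lower
    steps_needed = (offset + step - 1) // step  # integer ceiling division
    return lower + steps_needed * step


print(round_up_power2_divisions(1200, 4))  # 1280
print([round_up_power2_divisions(s, 4) for s in (1100, 1200, 1280)])  # [1280, 1280, 1280]
```

Nearby request sizes collapse onto the same rounded size, which is how the rounding lets differently sized requests reuse the same cached blocks.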
Ke Wen	1f04a00ccf	[PyTorch Distributed] Update documentation about NCCL environment variables (#74006 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74006 updated recommendations about environment variables to use during debug and performance tuning Test Plan: `make html` Reviewed By: rohan-varma Differential Revision: D34767454 fbshipit-source-id: 08cd58469bf72b58702e50e82020fa19b43b5911 (cherry picked from commit ac7e6630f8043f85d3d16be17c6a8ad1ebb2990c)	2022-03-11 23:57:17 +00:00
Alban Desmaison	734281c3d6	Cleanup all module references in doc (#73983 ) Summary: Working towards https://docs.google.com/document/d/10yx2-4gs0gTMOimVS403MnoAWkqitS8TUHX73PN8EjE/edit?pli=1# This PR: - Ensure that all the submodules are listed in a rst file (that ensure they are considered by the coverage tool) - Remove some long deprecated code that just error out on import - Remove the allow list altogether to ensure nothing gets added back there Pull Request resolved: https://github.com/pytorch/pytorch/pull/73983 Reviewed By: anjali411 Differential Revision: D34787908 Pulled By: albanD fbshipit-source-id: 163ce61e133b12b2f2e1cbe374f979e3d6858db7 (cherry picked from commit c9edfead7a01dc45bfc24eaf7220d2a84ab1f62e)	2022-03-10 22:26:29 +00:00
Alban Desmaison	238f7d9cbf	rename config module file to work with gh pages better Fixes https://github.com/pytorch/pytorch/issues/62018 Pull Request resolved: https://github.com/pytorch/pytorch/pull/74038 Approved by: https://github.com/mruberry, https://github.com/seemethere	2022-03-10 20:41:44 +00:00
Rohit Goswami	979a78f8b2	Sphinx panel Fixes https://github.com/pytorch/pytorch/issues/73835. The full context for this is detailed in the issue, but briefly: - Adds `sphinx-panel` Other PRs will demonstrate usage. Pull Request resolved: https://github.com/pytorch/pytorch/pull/73836 Approved by: https://github.com/albanD	2022-03-07 14:50:09 +00:00
Pritam Damania	71aa3ab020	Add note in RPC docs about retries. (#73601 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73601 Some users had questions about how the RPC framework deals with failures and whether we retry. Adding a note about this to our docs to elaborate on our current behavior and why we chose that approach. ghstack-source-id: 150359866 Test Plan: view docs. Reviewed By: mrshenli Differential Revision: D34560199 fbshipit-source-id: ee33ceed7fa706270d4ca5c8fcff7535583490ff (cherry picked from commit 954a906240cc40aacf08ca13f6554a35303a678a)	2022-03-03 00:29:31 +00:00
Ren Pang	e8b10b6e34	fix wrong indexing of class names in docs Fixes #73631 Locally built and tested. Pull Request resolved: https://github.com/pytorch/pytorch/pull/73632 Approved by: jbschlosser	2022-03-02 22:21:21 +00:00
Christian Puhrsch	484c0de670	Minimal NestedTensor (#72881 ) Summary: This PR adds a minimal version of a NestedTensor. It introduces the general harness future development can be built around. Pull Request resolved: https://github.com/pytorch/pytorch/pull/72881 Reviewed By: albanD Differential Revision: D34259177 Pulled By: cpuhrsch fbshipit-source-id: 0245c36f603424e20f3b09651043c207f526d760 (cherry picked from commit 10764e8d427f29b364567e4cbc86ed73c3933158)	2022-03-02 16:31:51 +00:00
Nikita Shulga	8ac7393565	Revert D33767740: [pytorch][PR] Sparse CSR CPU: cuSolverSP backend for `linalg.solve` Test Plan: revert-hammer Differential Revision: D33767740 (`199d9a992c`) Original commit changeset: a945f065210c Original Phabricator Diff: D33767740 (`199d9a992c`) fbshipit-source-id: b7934df18118f8d6d5f165deb5aae9887953ae43 (cherry picked from commit d3ddbb021b227e3638f6f7c22c6eadfa73695e31)	2022-03-01 18:33:23 +00:00
Kushashwa Ravi Shrimali	199d9a992c	Sparse CSR CPU: cuSolverSP backend for `linalg.solve` (#71399 ) Summary: This PR introduces the `cuSolverSP` backend for `linalg.solve` with sparse CSR input matrices. The motivation comes from the issue: https://github.com/pytorch/pytorch/issues/69538. `cuSolver` provides [`cusolverSp<t>csrlsvluHost`](https://docs.nvidia.com/cuda/cusolver/index.html#cusolver-lt-t-gt-csrlsvlu) API, a few things to note: 1. As mentioned in the documentation: `only CPU (Host) path is provided.` From the profiling, there doesn't seem to be any GPU kernel launch for optimization, please see the profiling below. 2. Since only `host` path is provided, the CPU path uses `csrlsvluHost` (but requires PyTorch to be installed/built with CUDA support). 3. The documentation mentions reordering helps optimize stuff, but it isn't clear how it affects the performance. There are options for reordering, so we stick to `reorder = 0` as the default choice. `cuSolver` has [`csrlsvqr`](https://docs.nvidia.com/cuda/cusolver/index.html#cusolver-lt-t-gt-csrlsvqr) function which provides a `device` path to solve the linear system. This function is used for the CUDA path in this PR. Gist: For CPU Path: we call [`csrlsvluHost` function of cuSolver](https://docs.nvidia.com/cuda/cusolver/index.html#cusolver-lt-t-gt-csrlsvlu). For CUDA Path: we call [`csrlsvqr` function of cuSolver](https://docs.nvidia.com/cuda/cusolver/index.html#cusolver-lt-t-gt-csrlsvqr). 
Profiling: (On sparse input tensor of size 1000 x 1000, with a vector of shape length 1000), for `csrlsvlu` function (to show no GPU optimization) ```cpp ==3999651== Profiling result: Type Time(%) Time Calls Avg Min Max Name GPU activities: 100.00% 2.1440us 1 2.1440us 2.1440us 2.1440us [CUDA memcpy HtoD] API calls: 99.72% 1.07199s 9 119.11ms 500ns 1.07164s cudaFree 0.11% 1.2182ms 398 3.0600us 140ns 137.94us cuDeviceGetAttribute 0.06% 674.45us 4 168.61us 165.50us 173.64us cuDeviceTotalMem 0.03% 357.07us 4 89.268us 2.7800us 201.89us cudaMalloc 0.03% 309.29us 1 309.29us 309.29us 309.29us cudaGetDeviceProperties 0.01% 160.47us 332 483ns 350ns 3.3300us cudaFuncSetAttribute 0.01% 115.12us 4 28.780us 26.290us 33.410us cuDeviceGetName 0.00% 28.591us 5 5.7180us 440ns 16.921us cudaGetDevice 0.00% 22.061us 4 5.5150us 871ns 18.690us cudaDeviceSynchronize 0.00% 20.370us 18 1.1310us 410ns 6.9900us cudaEventDestroy 0.00% 16.390us 1 16.390us 16.390us 16.390us cudaMemcpy 0.00% 11.540us 2 5.7700us 1.4900us 10.050us cuDeviceGetPCIBusId 0.00% 10.510us 18 583ns 430ns 1.6200us cudaEventCreateWithFlags 0.00% 7.9100us 21 376ns 290ns 700ns cudaDeviceGetAttribute 0.00% 1.4300us 6 238ns 150ns 590ns cuDeviceGet 0.00% 1.2200us 4 305ns 190ns 500ns cuDeviceGetCount 0.00% 900ns 1 900ns 900ns 900ns cuInit 0.00% 860ns 4 215ns 180ns 260ns cuDeviceGetUuid 0.00% 240ns 1 240ns 240ns 240ns cuDriverGetVersion 0.00% 230ns 1 230ns 230ns 230ns cudaGetDeviceCount ``` Script: ```python import torch def solve(x, other, out): torch.linalg.solve(x, other, out=out) if __name__ == "__main__": dense_inp = torch.randn((1000, 1000), dtype=torch.float64) # Set 50% of the values to 0 randomly dense_inp = torch.nn.functional.dropout(dense_inp, p=0.5) sparse_inp = dense_inp.to_sparse_csr() other = torch.randint(100, (1000,), dtype=torch.float64) out = torch.randint(1, (1000,), dtype=torch.float64) solve(sparse_inp, other, out) ``` The following error is raised when the function is used on a CPU device with PyTorch 
built/installed without CUDA support: * When built without CUDA support: ```python /home/krshrimali/pytorch/torch/autograd/profiler.py:151: UserWarning: CUDA is not available, disabling CUDA profiling warn("CUDA is not available, disabling CUDA profiling") Traceback (most recent call last): File "/home/krshrimali/pytorch/test_sp.py", line 17, in <module> solve(x, other, out) File "/home/krshrimali/pytorch/test_sp.py", line 5, in solve torch.linalg.solve(x, other, out=out) RuntimeError: PyTorch was not built with CUDA support. Please use PyTorch built CUDA support ``` Performance Comparison (vs SciPy's [`scipy.sparse.linalg.spsolve`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.spsolve.html): Time taken by `scipy.sparse.linalg.spsolve` : 0.595 seconds On CPU: Time taken by `torch.linalg.solve` : 4.565 seconds On CUDA: Time taken by `torch.linalg.solve`: 1.838 seconds The inputs are of dimensions: (17281, 17281) and (17281, 1), and were taken from https://math.nist.gov/MatrixMarket/extreme.html. Thanks to IvanYashchuk for helping me with the PR, and guiding me through it. cc: IvanYashchuk pearu nikitaved cpuhrsch cc nikitaved pearu cpuhrsch Pull Request resolved: https://github.com/pytorch/pytorch/pull/71399 Reviewed By: VitalyFedyunin Differential Revision: D33767740 Pulled By: cpuhrsch fbshipit-source-id: a945f065210cd719096eb8d7cdbf8e8937c2fce9 (cherry picked from commit f4f35c17da414e1ca6c6d91402933521857aa1ea)	2022-03-01 05:32:35 +00:00
Vasiliy Kuznetsov	01bd6f4357	pytorch: fix typo in quantization docs (#73511 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73511 Fixes typo in describing the `torch.qint32` data type. Test Plan: CI Reviewed By: andrewor14 Differential Revision: D34522741 Pulled By: vkuzo fbshipit-source-id: f05f8440d9708281213a4b3736e8f59199dd7b1a (cherry picked from commit ca9e598d60cac016e58fda9cd0f329ca412ec36b)	2022-02-28 23:11:52 +00:00
Peter Bell	f437ca6e8e	Remove legacy tensor constructors for complex dtypes PR #72405 added four new types to the public python API: `torch.ComplexFloatTensor`, `torch.ComplexDoubleTensor`, `torch.cuda.ComplexFloatTensor` and `torch.cuda.ComplexDoubleTensor`. I believe this was unintentional and a clarifying comment as to the purpose of `all_declared_types` is needed to avoid this in future. Pull Request resolved: https://github.com/pytorch/pytorch/pull/73370	2022-02-28 15:13:44 +00:00
Philip Meier	c6f1bbc0ac	promote torch.testing to stable (#73348 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73348 Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D34457727 Pulled By: mruberry fbshipit-source-id: 2cc812b643e0d1e753bead2751ee79b3f03fde20 (cherry picked from commit bcdaca1a019a679b8b274e2fb5f19bfd08874ce9)	2022-02-25 06:30:31 +00:00
Jacob Hepkema	91261feb7b	Add SoftplusTransform (#52300 ) Summary: This pull request introduces `SoftplusTransform` to `torch.distributions.transforms`. `SoftplusTransform` transforms via the mapping `Softplus(x) = log(1 + exp(x))`. Note that the transform is different to [`torch.nn.Softplus`](https://pytorch.org/docs/stable/generated/torch.nn.Softplus.html#torch.nn.Softplus), as that has additional `beta` and `threshold` parameters. Inverse and `log_abs_det_jacobian` for a more complex `SoftplusTransform` can be added in the future. vitkl fritzo Addresses the issue discussed here: [pyro issue 855](https://github.com/pyro-ppl/numpyro/issues/855) Pull Request resolved: https://github.com/pytorch/pytorch/pull/52300 Reviewed By: albanD, ejguan Differential Revision: D34082655 Pulled By: neerajprad fbshipit-source-id: 6114e74ee5d73c1527191bed612a142d691e2094 (cherry picked from commit a181a3a9e53a34214a503d38760ad7778d08a680)	2022-02-25 02:30:03 +00:00
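The mapping `Softplus(x) = log(1 + exp(x))` named in the commit above can be sketched in plain Python. This is only an illustration of the math, not the `torch.distributions.transforms.SoftplusTransform` implementation; the function names `softplus` and `softplus_inverse` and the overflow-avoiding formulation are assumptions made for this example.

```python
import math


def softplus(x: float) -> float:
    """Compute log(1 + exp(x)) without overflowing for large |x|."""
    # Identity: log(1 + exp(x)) = max(x, 0) + log1p(exp(-|x|)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def softplus_inverse(y: float) -> float:
    """Invert softplus: x = log(exp(y) - 1), defined only for y > 0."""
    if y <= 0.0:
        raise ValueError("softplus only produces strictly positive values")
    # Identity: log(exp(y) - 1) = y + log(1 - exp(-y)) = y + log(-expm1(-y)).
    return y + math.log(-math.expm1(-y))


if __name__ == "__main__":
    x = 3.0
    y = softplus(x)
    print(y, softplus_inverse(y))
```

The rewritten forms matter only at the extremes: a direct `math.exp(1000.0)` would raise `OverflowError`, while the version above returns `1000.0`.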
Can Balioglu	0e7a7a5fe7	Add documentation for c10d log levels (#73361 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73361 This PR adds the documentation for the newly introduced `TORCH_CPP_LOG_LEVEL` and how it can be used along with `TORCH_DISTRIBUTED_DEBUG` to adjust the log level of c10d. ghstack-source-id: 149874995 Test Plan: Locally rendered and checked the documentation. Reviewed By: rohan-varma Differential Revision: D34452352 fbshipit-source-id: ecb54590f3030ddef9921a7152ca9f7fc9438345 (cherry picked from commit f4c7c6f3b27dbd3006686cf26a6e9e53cd2c8f09)	2022-02-24 20:38:15 +00:00
Edgar Andrés Margffoy Tuay	86deecd7be	Check clang++/g++ version when compiling CUDA extensions (#63230 ) Summary: See https://github.com/pytorch/pytorch/issues/55267 Pull Request resolved: https://github.com/pytorch/pytorch/pull/63230 Reviewed By: soulitzer Differential Revision: D34159119 Pulled By: malfet fbshipit-source-id: 6eef7582388bf6a42dcc1d82b6e4b1f40f418dd7 (cherry picked from commit 2056d0a0be7951602de22f8d3b4efc28dd71b6c2)	2022-02-24 08:32:32 +00:00
Can Balioglu	e1db2f13ce	Refactor TORCH_DISTRIBUTED_DEBUG implementation (#73166 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73166 This PR refactors, cleans up, and optimizes the implementation of `TORCH_DISTRIBUTED_DEBUG`. It also introduces three new user APIs: `get_debug_level()`, `set_debug_level()`, and `set_debug_level_from_env()` to retrieve and modify the debug level after a process has started. ghstack-source-id: 149778566 Test Plan: Run the existing unit tests. Reviewed By: rohan-varma Differential Revision: D34371226 fbshipit-source-id: e18443b411adcbaf39b2ec999178c198052fcd5b (cherry picked from commit 26d6bb1584b83a0490d8b766482656a5887fa21d)	2022-02-24 02:33:05 +00:00
Nikita Karetnikov	75db05c3fd	Check if the iterator is valid before dereferencing it (#72405 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72405 Fixes #71674. This shouldn't segfault now: ``` import torch d = torch.complex64 torch.set_default_dtype(d) ``` Test Plan: Imported from OSS Reviewed By: jbschlosser Differential Revision: D34423660 Pulled By: anjali411 fbshipit-source-id: cac92a6f56846f2c0727a120b5f568aa75baa21e (cherry picked from commit eaab813a0fddced24303b3bd50e4fcdba1516e46)	2022-02-23 18:33:46 +00:00
Nikita Shulga	cfb6c942fe	`scatter_reduce` documentation (#73125 ) Summary: Reland of https://github.com/pytorch/pytorch/issues/68580 (which were milestoned for 1.11) plus partial revert of https://github.com/pytorch/pytorch/pull/72543 Pull Request resolved: https://github.com/pytorch/pytorch/pull/73125 Reviewed By: bdhirsh Differential Revision: D34355217 Pulled By: malfet fbshipit-source-id: 325ecdeaf53183d653b44ee5e6e8839ceefd9200 (cherry picked from commit `71db31748a`)	2022-02-22 19:33:46 +00:00
Gary Miguel	dbac0f5cdc	Update persons of interest for ONNX (#72072 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72072 Reviewed By: H-Huang Differential Revision: D34230534 Pulled By: malfet fbshipit-source-id: ed5abdfacf0d9628c6cc99957fa578d71a79d025 (cherry picked from commit `4669c346c4`)	2022-02-16 23:01:13 +00:00
Elias Ellison	f8a2efc190	Make fusion strategy api public (#72639 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72639 Test Plan: Imported from OSS Reviewed By: soulitzer Differential Revision: D34159123 Pulled By: eellison fbshipit-source-id: 27e4d9694a83e8d6829009882715be4308c96a9f (cherry picked from commit `1cadcd2f75`)	2022-02-16 03:45:15 +00:00
Kurt Mohler	8e7fe87630	Rename `Typed/UntypedStorage` to `_Typed/_UntypedStorage` (#72540 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72540 Reviewed By: jbschlosser Differential Revision: D34216823 Pulled By: bdhirsh fbshipit-source-id: 1bc9930ab582771ebf02308e035576cd1a0dbe47 (cherry picked from commit `329238f612`)	2022-02-15 23:53:01 +00:00
Nikita Shulga	cb00d9601c	Revert D33800694: [pytorch][PR] `scatter_reduce` documentation Test Plan: revert-hammer Differential Revision: D33800694 (`12a1df27c7`) Original commit changeset: 2e09492a29ce Original Phabricator Diff: D33800694 (`12a1df27c7`) fbshipit-source-id: 2a4775c0042551607fe3ab77f5bfe9f2e4b6b78e (cherry picked from commit `4bd6c0d2bb`)	2022-02-15 20:10:26 +00:00
rusty1s	12a1df27c7	`scatter_reduce` documentation (#68580 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/63780 (part 2) Pull Request resolved: https://github.com/pytorch/pytorch/pull/68580 Reviewed By: atalman Differential Revision: D33800694 Pulled By: malfet fbshipit-source-id: 2e09492a29cef115a7cca7c8209d1dcb6ae24eb9 (cherry picked from commit `696ff75940`)	2022-02-15 19:43:54 +00:00
Huamin Li	32dd4a8639	move fx_acc out of pytorch core (#72803 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72803 as title Reviewed By: jfix71 Differential Revision: D34101788 fbshipit-source-id: a9fd84671929af21405c049603e9895ec68de3d8 (cherry picked from commit `e98fd1c32d`)	2022-02-15 16:13:43 +00:00
mattip	fb4504da2f	DOC: release documentation version should be major.minor (#72706 ) Summary: Fixes pytorch/pytorch.github.io#929 The pytorch doc team would like to move to only major.minor documentation at https://pytorch.org/docs/versions.html, not major.minor.patch. This has been done in the CI scripts, but the generated documentation still has the patch version. Remove it when building RELEASE documentation. This allows simplifying the logic, using `'.'.join(torch_version.split('.')[:2])` since we no longer care about trimming off the HASH: it automatically gets removed. holly1238, brianjo Pull Request resolved: https://github.com/pytorch/pytorch/pull/72706 Reviewed By: samdow Differential Revision: D34215815 Pulled By: albanD fbshipit-source-id: 8437036cc6636674d9ab8b1666f37b561d0527e1 (cherry picked from commit `d8caf988f9`)	2022-02-14 23:37:43 +00:00
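The expression quoted in the commit above, `'.'.join(torch_version.split('.')[:2])`, can be tried in isolation. The wrapper name `major_minor` and the sample version strings below are hypothetical and chosen only for illustration.

```python
def major_minor(torch_version: str) -> str:
    """Keep only the first two dot-separated fields of a version string."""
    # Anything after the second dot (patch number, local build suffix)
    # is discarded by the slice, so no separate hash trimming is needed.
    return '.'.join(torch_version.split('.')[:2])


if __name__ == "__main__":
    # Hypothetical version strings, not taken from a real build.
    for version in ("1.11.0", "1.11.0a0+gitabc1234"):
        print(version, "->", major_minor(version))
```

Because the patch number and any build suffix sit after the second dot, the slice drops both at once.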
Rohit Goswami	801abc0cdd	MAINT, DOC: Trivial spellings and warnings (#72745 ) Summary: Fixes N/A. Just minor annoyances. Pull Request resolved: https://github.com/pytorch/pytorch/pull/72745 Reviewed By: samdow Differential Revision: D34216016 Pulled By: albanD fbshipit-source-id: b65600b50e41a1dd7bf7d076b0dd3e2d1c99caf9 (cherry picked from commit `b959392a5f`)	2022-02-14 21:55:19 +00:00
Kurt Mohler	47c6993355	Update from_dlpack tests and documentation (#70543 ) Summary: Part of https://github.com/pytorch/pytorch/issues/58742 Pull Request resolved: https://github.com/pytorch/pytorch/pull/70543 Reviewed By: soulitzer Differential Revision: D34172475 Pulled By: mruberry fbshipit-source-id: d498764b8651a8b7a19181b3421aeebf28a5db2b (cherry picked from commit `05332f164c`)	2022-02-14 03:35:17 +00:00
Felix Divo	340fae4363	[Doc] Better formatting in autograd.rst (#72586 ) Summary: See title. Pull Request resolved: https://github.com/pytorch/pytorch/pull/72586 Reviewed By: soulitzer Differential Revision: D34177704 Pulled By: albanD fbshipit-source-id: 1adf6ebed4f64ec4d8fff160df300c8e6ee528ea (cherry picked from commit `bbb586d67d`)	2022-02-11 22:46:10 +00:00
BowenBao	9257de7efa	[ONNX] Minor doc update (#69501 ) (#69550 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69550 Fix the wiki URL. Also minor reorganization in onnx.rst. Test Plan: Imported from OSS Reviewed By: msaroufim Differential Revision: D32994269 Pulled By: malfet fbshipit-source-id: 112acfe8b7c778d7e3c2cef684023fdaf2c6ec9c (cherry picked from commit `f0787fabde`)	2022-02-11 22:05:15 +00:00
BowenBao	ce5b155ccb	[ONNX] Link to the wiki (#68505 ) (#72663 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72663 Test Plan: Imported from OSS Reviewed By: msaroufim Differential Revision: D34150535 Pulled By: malfet fbshipit-source-id: 230b786f6235549fff764083eac2c3744c6bff88 Co-authored-by: Gary Miguel <garymiguel@microsoft.com> (cherry picked from commit `c848c582d1`)	2022-02-11 22:05:15 +00:00
Felix Divo	25fba4a019	[DOC] Add link to "double backward" from "extending pytorch" page (#72584 ) Summary: It is probably the most user friendly to link to that (lesser known?) feature. Pull Request resolved: https://github.com/pytorch/pytorch/pull/72584 Reviewed By: soulitzer Differential Revision: D34173999 Pulled By: albanD fbshipit-source-id: 99fff7a55412faf54888f8317ab2388f4d7d30e4 (cherry picked from commit `2191ee7657`)	2022-02-11 20:34:13 +00:00
BowenBao	04c5d978b9	[ONNX] Refactor _run_symbolic_function (#67573 ) (#68491 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68491 * Allows implementing symbolic functions for domains other than `aten`, for example `prim`, in symbolic_opset#.py. * Allows symbolic function to access extra context if needed, through `SymbolicFunctionState`. * Particularly, the `prim::PythonOp` special case can access node without the need of passing node through inputs. Updates will be made downstreams, and in a follow-up PR we will remove the previous workaround in exporter. * `prim::Loop`, `prim::If`, etc are now moved outside of `_run_symbolic_function` from utils.py, and to symbolic_opset9.py. Motivation for this change: - Better maintainability and reducing complexity. Easier to add symbolic for operators, both simple and complex ones (that need additional context), without the former needing to know the existence of the latter. - The design idea was long outdated. prim ops are no longer rare special cases, and they shouldn't all be handled inside `_run_symbolic_function`. As a result this function becomes too clumsy. There were also prim ops symbolic added in symbolic_opset#.py with signature `prim_[opname]`, creating separation and confusion. Test Plan: Imported from OSS Reviewed By: jansel Differential Revision: D32483782 Pulled By: malfet fbshipit-source-id: f9affc31b1570af30ffa6668da9375da111fd54a Co-authored-by: BowenBao <bowbao@microsoft.com> (cherry picked from commit `1e04ffd2fd`)	2022-02-11 18:35:35 +00:00
Mike Ruberry	2fa34fb7b9	Revert D34154832: [pytorch][PR] Add `multi_head_attention_forward` to functional rst docs Test Plan: revert-hammer Differential Revision: D34154832 (`bafaf0d610`) Original commit changeset: 7279d05f31d4 Original Phabricator Diff: D34154832 (`bafaf0d610`) fbshipit-source-id: fcbc896b25f3b51a7ce0c5dc1dca652f57f7218c (cherry picked from commit `afa53acdfd`)	2022-02-11 05:08:46 +00:00
ProGamerGov	bafaf0d610	Add `multi_head_attention_forward` to functional rst docs (#72675 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/72597 Pull Request resolved: https://github.com/pytorch/pytorch/pull/72675 Reviewed By: malfet Differential Revision: D34154832 Pulled By: jbschlosser fbshipit-source-id: 7279d05f31d41259e57ba28fe6fdb7079d603660 (cherry picked from commit `68c32cdbd7`)	2022-02-11 01:52:58 +00:00
Till Hoffmann	b014d4ddb9	Add transformation using cdf of distribution. (#72495 ) Summary: This PR adds a transform that uses the cumulative distribution function of a given probability distribution. For example, the following code constructs a simple Gaussian copula. ```python # Construct a Gaussian copula from a multivariate normal. base_dist = MultivariateNormal( loc=torch.zeros(2), scale_tril=LKJCholesky(2).sample(), ) transform = CumulativeDistributionTransform(Normal(0, 1)) copula = TransformedDistribution(base_dist, [transform]) ``` The following snippet creates a "wrapped" Gaussian copula for correlated positive variables with Weibull marginals. ```python transforms = [ CumulativeDistributionTransform(Normal(0, 1)), CumulativeDistributionTransform(Weibull(4, 2)).inv, ] wrapped_copula = TransformedDistribution(base_dist, transforms) ``` cc fritzo Pull Request resolved: https://github.com/pytorch/pytorch/pull/72495 Reviewed By: ejguan Differential Revision: D34085919 Pulled By: albanD fbshipit-source-id: 7917391519a96b0d9b54c52db65d1932f961d070 (cherry picked from commit `572196146e`)	2022-02-09 14:46:47 +00:00
Yinghai Lu	3670466201	Move fx2trt out of PyTorch core (#72499 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72499 Pull Request resolved: https://github.com/pytorch/benchmark/pull/740 Move fx2trt out of tree to reduce bloat in PyTorch core. This is the first and major step. Next, we will move acc_tracer out of the tree and rearrange some fx passes. Reviewed By: suo Differential Revision: D34065866 fbshipit-source-id: c72b7ad752d0706abd9a63caeef48430e85ec56d (cherry picked from commit `91647adbca`)	2022-02-09 04:04:49 +00:00
Noufel	8d525d4760	Correcting a minor typo: "Users should pay" instead of "Users should be pay" (#72500 ) Summary: Correcting a minor typo: "Users should pay" instead of "Users should be pay" Pull Request resolved: https://github.com/pytorch/pytorch/pull/72500 Reviewed By: albanD Differential Revision: D34077972 Pulled By: ejguan fbshipit-source-id: 5d7a138d1f17eca838d2c1da76d7759d96e4375f (cherry picked from commit `d046baa89c`)	2022-02-08 23:08:25 +00:00
Kushashwa Ravi Shrimali	bc03c1d000	Structured Kernels for `index_copy`, add `out` variant (#67329 ) Summary: This PR ports `index_copy` implementation to structured kernels, also adds an `out` variant. ~Note to the reviewers: This is in draft mode, waiting for the tests from the CI, and I'll give a final look before requesting the review.~ Issue tracker: https://github.com/pytorch/pytorch/issues/55070 cc: bdhirsh ysiraichi Pull Request resolved: https://github.com/pytorch/pytorch/pull/67329 Reviewed By: ejguan Differential Revision: D34077219 Pulled By: bdhirsh fbshipit-source-id: 6accda33957f654b753261c5c3d765a27a64d2c0 (cherry picked from commit `f3ac83217a`)	2022-02-08 22:52:27 +00:00
Ivan Yashchuk	8cdcc1181c	Add missing entry for sampled_addmm in sparse.rst (#72312 ) Summary: Let's make the documentation for `torch.sparse.sampled_addmm` searchable in the PyTorch documentation. This PR shall be cherry-picked for the next 1.11 release. Pull Request resolved: https://github.com/pytorch/pytorch/pull/72312 Reviewed By: davidberard98 Differential Revision: D34045230 Pulled By: cpuhrsch fbshipit-source-id: c1b1dc907443284857f48c8ce1efab22c6701bbe (cherry picked from commit `225929ecf2`)	2022-02-08 00:07:20 +00:00
Yanli Zhao	2336571cb7	make fsdp folder to be public (#72084 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72084 make fsdp folder to be public ghstack-source-id: 148173447 Test Plan: unit tests Reviewed By: mrshenli Differential Revision: D33903417 fbshipit-source-id: 7852a2adc4af09af48a5ffa52ebf210489f834d5 (cherry picked from commit `bd06513cfe`)	2022-02-02 15:50:14 +00:00
Richard Zou	f99147dec0	Targeted documentation updates in autograd.functional (#72111 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72111 For vectorize flag: - Advertises the use of functorch For autograd.functional.jvp: - Advertises the use of functorch and the low-level jvp API, both of which will be more performant than the double backprop trick. Test Plan: - view docs Reviewed By: albanD Differential Revision: D33918065 Pulled By: zou3519 fbshipit-source-id: 6e19699aa94f0e023ccda0dc40551ad6d932b7c7 (cherry picked from commit `b4662ceb99`)	2022-02-02 03:19:31 +00:00
Tristan Rice	6208c2800e	torch/monitor: merge Interval and FixedCount stats (#72009 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72009 This simplifies the Stats interface by merging IntervalStat and FixedCountStat into a single Stat w/ a specific window size duration and an optional max samples per window. This allows for the original intention of having comparably sized windows (for statistical purposes) while also having a consistent output bandwidth. Test Plan: ``` buck test //caffe2/test:monitor //caffe2/test/cpp/monitor:monitor ``` Reviewed By: kiukchung Differential Revision: D33822956 fbshipit-source-id: a74782492421be613a1a8b14341b6fb2e8eeb8b4 (cherry picked from commit `293b94e0b4`)	2022-01-30 23:21:59 +00:00
soulitzer	0c2b1b8bcf	Update docs for forward AD and make them public (#71643 ) Summary: Follow up: we would need to update the links to the tutorial later Pull Request resolved: https://github.com/pytorch/pytorch/pull/71643 Reviewed By: albanD Differential Revision: D33713982 Pulled By: soulitzer fbshipit-source-id: a314ffa4e7d5c5ebdef9c50033f338b06578d71c (cherry picked from commit `ba30daaaa5`)	2022-01-28 03:33:00 +00:00
Wanchao Liang	9b53d3194c	Implement gather primitive for ProcessGroupNCCL (#66745 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66745 This PR implements NCCL gather and adds gather to ProcessGroupNCCL using the NCCL send/recv API. NCCL doesn’t directly provide a primitive for gather, so it needs to be implemented on top of NCCL’s send/recv API. 1. In ProcessGroupNCCL.cpp, the outputTensors are first flattened, then inputTensors and outputFlattened are passed by the collective class to the gather() function in nccl.cpp. 2. In nccl.cpp, gather is implemented using ncclSend/ncclRecv: all the ranks send inputTensor to the root rank, and the root rank uses a for loop to receive these inputTensors. ghstack-source-id: 147754838 Test Plan: test_gather_ops test_gather_checks test_gather_stress Reviewed By: pritamdamania87 Differential Revision: D29616361 fbshipit-source-id: b500d9b8e67113194c5cc6575fb0e5d806dc7782 (cherry picked from commit `d560ee732e`)	2022-01-27 19:37:55 +00:00
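The gather-over-send/recv pattern described in the commit above can be sketched as a single-process simulation. Nothing here calls NCCL or `torch.distributed`: the `FakeComm` class, its `send`/`recv` methods, and the `gather` helper are hypothetical stand-ins, and having the root copy its own input locally rather than send to itself is an assumption the commit message does not settle.

```python
from collections import defaultdict, deque


class FakeComm:
    """Hypothetical in-process stand-in for point-to-point send/recv."""

    def __init__(self):
        # One FIFO queue per (source rank, destination rank) pair.
        self._queues = defaultdict(deque)

    def send(self, src, dst, payload):
        self._queues[(src, dst)].append(payload)

    def recv(self, src, dst):
        return self._queues[(src, dst)].popleft()


def gather(comm, inputs, root):
    """Collect inputs[r] from every rank r onto the root rank.

    Returns the list the root would hold, ordered by rank.
    """
    world_size = len(inputs)
    # Every non-root rank sends its input to the root.
    for rank in range(world_size):
        if rank != root:
            comm.send(rank, root, inputs[rank])
    # The root loops over all ranks and receives one payload from each.
    outputs = []
    for rank in range(world_size):
        if rank == root:
            # Assumption: the root keeps its own input without a send.
            outputs.append(inputs[rank])
        else:
            outputs.append(comm.recv(rank, root))
    return outputs


if __name__ == "__main__":
    print(gather(FakeComm(), ["a", "b", "c"], root=1))
```

In a real multi-process setting each rank would run only its own branch of this logic; the loops over all ranks exist here only because one process plays every role.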
Tristan Rice	7aa4a1f63e	torch/monitor: TensorboardEventHandler (#71658 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71658 This adds the beginnings of a TensorboardEventHandler which will log stats to Tensorboard. Test Plan: buck test //caffe2/test:monitor Reviewed By: edward-io Differential Revision: D33719954 fbshipit-source-id: e9847c1319255ce0d9cf2d85d8b54b7a3c681bd2 (cherry picked from commit `5c8520a6ba`)	2022-01-27 08:33:55 +00:00
lezcano	108b37db84	[Array API] Add linalg.diagonal (#70599 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70599 This PR adds `linalg.diagonal` following the Array API: https://data-apis.org/array-api/latest/extensions/linear_algebra_functions.html#linalg-diagonal-x-axis1-0-axis2-1-offset-0 Fixes https://github.com/pytorch/pytorch/issues/62813 cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano rgommers pmeier asmeurer leofang AnirudhDagar asi1024 emcastillo kmaehashi Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D33760506 Pulled By: mruberry fbshipit-source-id: e32c3490321d8c3f31b3bb538bc1f72b39bd2854 (cherry picked from commit `44f41f8e39`)	2022-01-26 08:08:32 +00:00
Shen Li	7bc220e060	Update distributed.rst for ProcessGroup Extensions (#71482 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71482 cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang Test Plan: Imported from OSS Reviewed By: rohan-varma Differential Revision: D33745986 Pulled By: mrshenli fbshipit-source-id: fe2d0491901bf00be09deb5c556bc1e1d359b725 (cherry picked from commit `be5104bfd7`)	2022-01-25 00:30:08 +00:00
Priyam Parashar	f75e92a936	Fix for retracing documentation which would break for n-ary operators (#71599 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/68195 Updated fx.rst documentation and followed the instructions in [contributing.md](https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md#writing-documentation) to generate html. Faced errors which looked very similar to https://github.com/pytorch/pytorch/issues/32703 but gathered from the thread that a non-0 exit is OK for documentation building and these are warnings not affecting the html generation (at least for root rst folder). The HTML output is plain without any styling, please confirm this is intentional. Screenshot of generated html: <img width="1438" alt="Screen Shot 2022-01-20 at 4 31 24 PM" src="https://user-images.githubusercontent.com/9580531/150439448-1a626d74-68ba-4f94-91f2-a6942959b049.png"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/71599 Reviewed By: jamesr66a Differential Revision: D33719546 Pulled By: zephirefaith fbshipit-source-id: cc9b8ddb13cfdb9f14ebff54cf0d894a8b842aa1 (cherry picked from commit `170db5d7be`)	2022-01-24 20:07:08 +00:00
Tristan Rice	26d54b4076	monitor: add docstrings to pybind interface (#71481 ) Summary: This adds argument names and docstrings so the docs are a lot more understandable. Pull Request resolved: https://github.com/pytorch/pytorch/pull/71481 Test Plan: docs/tests CI should suffice ![Screenshot 2022-01-19 at 16-35-10 torch monitor — PyTorch master documentation](https://user-images.githubusercontent.com/909104/150240882-e69cfa17-e2be-4569-8ced-71979a89b369.png) Reviewed By: edward-io Differential Revision: D33661255 Pulled By: d4l3k fbshipit-source-id: 686835dfe331b92a51f4409ec37f8ee6211e49d3 (cherry picked from commit `0a6accda1b`)	2022-01-21 23:04:33 +00:00
Michael Suo	9f0227a0eb	Revert "[ONNX] Minor doc update (#69501 )" (#71615 ) This reverts commit `114c13d020`.	2022-01-20 17:35:04 -08:00
BowenBao	114c13d020	[ONNX] Minor doc update (#69501 ) Fix the wiki URL. Also minor reorganization in onnx.rst. [ONNX] restore documentation of public functions (#69623) The build-docs check requires all public functions to be documented. These should really not be public, but we'll fix that later. Pull Request resolved: https://github.com/pytorch/pytorch/pull/71609	2022-01-21 00:13:40 +00:00
Mike Ruberry	9b9b878c89	Fixes jiterator cache macro include + updates CUDA note with cache variables (#71452 ) Summary: Per title. Pull Request resolved: https://github.com/pytorch/pytorch/pull/71452 Reviewed By: ngimel Differential Revision: D33646495 Pulled By: mruberry fbshipit-source-id: bbf627e6d7a724a83a3ea2ae9c0f50430f8d578e (cherry picked from commit `d1e72b144a`)	2022-01-19 03:45:05 +00:00
Rohan Varma	4fd1992a60	[Docs][BE] DDP doc fix (#71363 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71363 Looks like DDP example is currently broken as per https://discuss.pytorch.org/t/official-ddp-example-is-broken/141493. Fix the issue by setting the correct env variable. ghstack-source-id: 147080377 Test Plan: CI Reviewed By: mrshenli Differential Revision: D33607250 fbshipit-source-id: e0e7d03cc365c186253b959c4c5405a5e3609218 (cherry picked from commit `32472884ec`)	2022-01-18 22:24:51 +00:00
Leo Fang	67941c8a94	Document `torch.cuda.ExternalStream`, `torch.cuda.caching_allocator_alloc` and `torch.cuda.caching_allocator_delete` (#70126 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/67414. Fixes https://github.com/pytorch/pytorch/issues/70117. cc brianjo mruberry ngimel Pull Request resolved: https://github.com/pytorch/pytorch/pull/70126 Reviewed By: mruberry Differential Revision: D33542910 Pulled By: ngimel fbshipit-source-id: 4b870f4dceca6ee4cc8fba58819f1cb18ac9f857	2022-01-12 15:44:40 -08:00
Tristan Rice	bfe1abd3b5	torch/monitor: add pybind (#69567 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69567 This exposes torch.monitor events and stats via pybind11 to the underlying C++ implementation. * The registration interface is a tad different since it takes a lambda function in Python whereas in C++ it's a full class. * This has a small number of changes to the counter interfaces since there's no way to create an initializer list at runtime so they now also take a vector. * Only double based stats are provided in Python since it's intended more for high level stats where float imprecision shouldn't be an issue. This can be changed down the line if need arises. ``` events = [] def handler(event): events.append(event) handle = register_event_handler(handler) log_event(Event(type="torch.monitor.TestEvent", timestamp=datetime.now(), metadata={"foo": 1.0})) ``` D32969391 is now included in this diff. This cleans up the naming for events. type is now name, message is gone, and metadata is renamed data. Test Plan: buck test //caffe2/test:monitor //caffe2/test/cpp/monitor:monitor Reviewed By: kiukchung Differential Revision: D32924141 fbshipit-source-id: 563304c2e3261a4754e40cca39fc64c5a04b43e8	2022-01-12 13:35:11 -08:00
Alban Desmaison	3c2ae2b47c	Revert D32994274: [ONNX] Link to the wiki (#68505 ) Test Plan: revert-hammer Differential Revision: D32994274 (`a606ea73d6`) Original commit changeset: 34d54f935799 Original Phabricator Diff: D32994274 (`a606ea73d6`) fbshipit-source-id: 81fc96c2aff9d14efb5e092fffd0685e507837e6	2022-01-11 07:40:14 -08:00
BowenBao	a606ea73d6	[ONNX] Link to the wiki (#68505 ) (#69544 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69544 Test Plan: Imported from OSS Reviewed By: malfet Differential Revision: D32994274 Pulled By: msaroufim fbshipit-source-id: 34d54f935799fa94516a541a241900ec205c7427 Co-authored-by: Gary Miguel <garymiguel@microsoft.com>	2022-01-10 15:51:04 -08:00
Steven Morad	cfc1117591	Update sparse.rst to warn about _values() (#71088 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/70357 Pull Request resolved: https://github.com/pytorch/pytorch/pull/71088 Reviewed By: jbschlosser Differential Revision: D33511207 Pulled By: cpuhrsch fbshipit-source-id: 9d0c5445842ed96999eb88445cbea7ae284b1a6f	2022-01-10 12:43:46 -08:00
Jake Tae	23f902f7e4	Fix incorrect variable in autograd docs (#70884 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/68362. Pull Request resolved: https://github.com/pytorch/pytorch/pull/70884 Reviewed By: mruberry Differential Revision: D33463331 Pulled By: ngimel fbshipit-source-id: 834ba9c450972710e0424cc92af222551f0b4a4a	2022-01-06 20:53:10 -08:00
lezcano	a35b4b49d2	Add linalg.lu_factor (#66933 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66933 This PR exposes `torch.lu` as `torch.linalg.lu_factor` and `torch.linalg.lu_factor_ex`. This PR also adds support for matrices with zero elements both in the size of the matrix and the batch. Note that this function simply returns empty tensors of the correct size in this case. We add a test and an OpInfo for the new function. This PR also adds documentation for this new function in line of the documentation in the rest of `torch.linalg`. Fixes https://github.com/pytorch/pytorch/issues/56590 Fixes https://github.com/pytorch/pytorch/issues/64014 cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano Test Plan: Imported from OSS Reviewed By: gchanan Differential Revision: D32834069 Pulled By: mruberry fbshipit-source-id: 51ef12535fa91d292f419acf83b800b86ee9c7eb	2022-01-05 20:32:12 -08:00
mattip	1681323ddc	DOC: Merge extraheader block from theme instead of override (#70187 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/70185 The extraheader block in docs/source/_templates/layout.html overrides the one from the pytorch theme. The theme block adds Google Analytics, so they were missing from the `master` documentation. This came up in PR pytorch/pytorch.github.io#899. brianjo Pull Request resolved: https://github.com/pytorch/pytorch/pull/70187 Reviewed By: bdhirsh Differential Revision: D33248466 Pulled By: malfet fbshipit-source-id: b314916a3f0789b6617cf9ba6bd938bf5ca27242	2022-01-05 06:42:38 -08:00
Juhyeong Kim	bc40fb5639	[Reinstate] Wishart distribution (#70377 ) Summary: Implement https://github.com/pytorch/pytorch/issues/68050 Reopened merged and reverted PR https://github.com/pytorch/pytorch/issues/68588 worked with neerajprad cc neerajprad Sorry for the confusion. TODO: - [x] Unit Test - [x] Documentation - [x] Change constraint of matrix variables with 'torch.distributions.constraints.symmetric' if it is reviewed and merged. Debug positive definite constraints https://github.com/pytorch/pytorch/issues/68720 Pull Request resolved: https://github.com/pytorch/pytorch/pull/70377 Reviewed By: mikaylagawarecki Differential Revision: D33355132 Pulled By: neerajprad fbshipit-source-id: e968c0d9a3061fb2855564b96074235e46a57b6c	2021-12-30 11:41:46 -08:00
Arvind Kannan	6217fee96b	Revert D33246843: [pytorch][PR] Implementation of Wishart distribution Test Plan: revert-hammer Differential Revision: D33246843 (`a217a62e73`) Original commit changeset: 825fcddf4785 Original Phabricator Diff: D33246843 (`a217a62e73`) fbshipit-source-id: 2c8063e8d10e9d3ac20fa44673e6011ed1160753	2021-12-21 18:55:49 -08:00
Kim Juhyeong	a217a62e73	Implementation of Wishart distribution (#68588 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/68050 TODO: - [x] Unit Test - [x] Documentation - [x] Change constraint of matrix variables with 'torch.distributions.constraints.symmetric' if it is reviewed and merged. https://github.com/pytorch/pytorch/issues/68720 Pull Request resolved: https://github.com/pytorch/pytorch/pull/68588 Reviewed By: bdhirsh Differential Revision: D33246843 Pulled By: neerajprad fbshipit-source-id: 825fcddf478555235e7a66de0c18368c41e935cd	2021-12-21 14:07:30 -08:00
Jerry Zhang	9d3a6fa623	[quant][bc-breaking] Remove QConfigDynamic from quantization api (#69875 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69875 att Test Plan: ci + regression tests: ``` python test/test_quantization.py TestPostTrainingStatic python test/test_quantization.py TestPostTrainingDynamic python test/test_quantization.py TestQuantizeFx ``` Imported from OSS Reviewed By: vkuzo Differential Revision: D33079096 fbshipit-source-id: 1e73bb27c518eba62b60f3a8c4b532dddc8367cf	2021-12-17 23:10:06 -08:00
Philip Meier	de296d526f	move torch.testing from prototype to beta (#69668 ) Summary: cc brianjo mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/69668 Reviewed By: albanD Differential Revision: D33028213 Pulled By: mruberry fbshipit-source-id: 3316b887d4c322cc1262feee651464da4124a6de	2021-12-17 09:52:47 -08:00
Jerry Zhang	043098ef7f	[quant][graphmode] Rename backend_config_dict folder to backend (#69882 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69882 att Test Plan: ``` python test/fx2trt/test_quant_trt.py ``` Imported from OSS Reviewed By: supriyar Differential Revision: D33081761 fbshipit-source-id: c3178eec5798ac8587be09a963944b570c73e8ea	2021-12-16 21:13:04 -08:00
Nicolas Hug	73a6c36f1b	Add more details to the known limitations section of torchhub docs (#69970 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69970 This is a follow up to https://github.com/pytorch/hub/issues/243 Test Plan: Imported from OSS Reviewed By: jbschlosser Differential Revision: D33124060 Pulled By: NicolasHug fbshipit-source-id: 298fe14b39a1aff3e0b029044c9a0db8bc82336a	2021-12-16 02:43:48 -08:00
Mike Guo	d4f8313497	Add low level torch.profiler.kineto_profile base class (#63302 ) Summary: Refactor torch.profiler.profile by separating it into one low-level class and one high-level wrapper. The PR includes the following changes: 1. Separate class torch.profiler.profile into two classes: kineto_profiler and torch.profiler.profile. 2. The former class has the low-level functionality exposed at the C++ level, such as prepare_profiler, start_profiler, stop_profiler. 3. The original logic in torch.profiler.profile, including export_chrome_trace, export_stacks, key_averages, events, add_metadata, is moved into kineto_profiler, since these are all exposed by torch.autograd.profiler. 4. The new torch.profiler.profile is fully backward-compatible with the original class since it inherits from torch.profiler.kineto_profiler. Its only responsibility in the new implementation is maintaining the finite state machine of ProfilerAction. With the refactoring, the responsibility boundary is clear and the new logic is simple to understand. Pull Request resolved: https://github.com/pytorch/pytorch/pull/63302 Reviewed By: albanD Differential Revision: D33006442 Pulled By: robieta fbshipit-source-id: 30d7c9f5c101638703f1243fb2fcc6ced47fb690	2021-12-14 14:47:43 -08:00
Brian Hirsh	457ba1dd3e	Porting index_add to structured kernels, add an out variant (#65993 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65993 This PR attempts to port `index_add` to structured kernels, but does more than that: * Adds an `out=` variant to `index_add` * Revises `native_functions.yaml` registrations, to not have multiple entries and instead pass default value to `alpha`. * Changes in `derivatives.yaml` file for autograd functioning * Revises error messages, please see: https://github.com/pytorch/pytorch/pull/65993#issuecomment-945441615 Follow-up PRs in near future will attempt to refactor the OpInfo test, and will give another look at tests in `test/test_torch.py` for this function. (hence the use of ghstack for this) ~This is WIP because there are tests failing for `Dimname` variant on mobile/android builds, and I'm working on fixing them.~ Issue tracker: https://github.com/pytorch/pytorch/issues/55070 Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D32646426 fbshipit-source-id: b035ecf843a9a27d4d1e18b202b035adc2a49ab5	2021-12-14 11:57:13 -08:00
Kevin Tse	b67eaec853	[DataLoader] more clearly expose 'default_collate' and 'default_convert' to users (#69862 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69862 Fixes #69445 cc SsnL VitalyFedyunin ejguan NivekT Test Plan: Imported from OSS Reviewed By: ejguan, ngimel Differential Revision: D33068792 Pulled By: NivekT fbshipit-source-id: ef9791acdc23d014b8761fa7420062d454ce8969	2021-12-14 11:18:26 -08:00
Supriya Rao	b1ef56d646	[quant][docs] quantized model save/load instructions (#69789 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69789 Add details on how to save and load quantized models without hitting errors Test Plan: CI autogenerated docs Imported from OSS Reviewed By: jerryzh168 Differential Revision: D33030991 fbshipit-source-id: 8ec4610ae6d5bcbdd3c5e3bb725f2b06af960d52	2021-12-13 20:23:59 -08:00
Mike Ruberry	dc87cf5fe1	Fixes mem_get_info when querying on a device other than the current device (#69640 ) Summary: Also fixes the documentation failing to appear and adds a test to validate that op works with multiple devices properly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/69640 Reviewed By: ngimel Differential Revision: D32965391 Pulled By: mruberry fbshipit-source-id: 4fe502809b353464da8edf62d92ca9863804f08e	2021-12-08 23:04:30 -08:00
Peter Bell	e279963eef	Remove remaining THC code (#69039 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69039 Test Plan: Imported from OSS Reviewed By: anjali411 Differential Revision: D32872476 Pulled By: ngimel fbshipit-source-id: 7972aacc24aef9450fb59b707ed6396c501bcb31	2021-12-08 12:18:08 -08:00
Vincent-Pierre Berges	30bb4e0071	Add nvidia-smi memory and utilization as native Python API (#69104 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69104 Add nvidia-smi memory and utilization as native Python API Test Plan: testing the function returns the appropriate value. Unit tests to come. Reviewed By: malfet Differential Revision: D32711562 fbshipit-source-id: 01e676203299f8fde4f3ed4065f68b497e62a789	2021-12-08 10:33:23 -08:00
Charles David Hernandez	fc2614537b	Updating quantization documentation (#68907 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68907 Added information about symmetric qschemes and corrected an error in reference to https://github.com/pytorch/pytorch/issues/68540 Test Plan: Imported from OSS Reviewed By: vkuzo Differential Revision: D32662033 fbshipit-source-id: 9052c597f61991934b86850fea8b6eab78397450	2021-12-08 08:32:33 -08:00
gmagogsfm	358e908162	Add Union type to TorchScript Language Ref (#69514 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69514 Reviewed By: tugsbayasgalan Differential Revision: D32909371 Pulled By: gmagogsfm fbshipit-source-id: af1c3040cd59ee913dc576cf8a8c759313f1e07f	2021-12-07 12:53:54 -08:00
Rodrigo Bermúdez Schettino	1a202b0c39	Docs: Fix broken code syntax in autograd.rst (#69362 ) Summary: The backticks around `nn.Parameters` were not rendered correctly because the word was enclosed in an italics block. Spotted the issue on https://pytorch.org/docs/stable/notes/autograd.html#locally-disable-grad-doc. Pull Request resolved: https://github.com/pytorch/pytorch/pull/69362 Reviewed By: zou3519 Differential Revision: D32924093 Pulled By: albanD fbshipit-source-id: 5a310ac3f3d13a5116f7aa911817b9452eee711d	2021-12-07 12:03:15 -08:00
Xiao Wang	bfe5ad28e6	[Linalg] Add a runtime switch to let pytorch prefer a backend impl in linalg functions on GPU (#67980 ) Summary: Per title. This PR introduces a global flag that lets pytorch prefer one of the many backend implementations while calling linear algebra functions on GPU. Usage: ```python torch.backends.cuda.preferred_linalg_library('cusolver') ``` Available options (str): `'default'`, `'cusolver'`, `'magma'`. Issue https://github.com/pytorch/pytorch/issues/63992 inspired me to write this PR. No heuristic is perfect on all devices, library versions, matrix shapes, workloads, etc. We can obtain better performance if we can conveniently switch linear algebra backends at runtime. Performance of linear algebra operators after this PR should be no worse than before. The flag is set to `'default'` by default, which makes everything the same as before this PR. The implementation of this PR is basically following that of https://github.com/pytorch/pytorch/pull/67790. Pull Request resolved: https://github.com/pytorch/pytorch/pull/67980 Reviewed By: mruberry Differential Revision: D32849457 Pulled By: ngimel fbshipit-source-id: 679fee7744a03af057995aef06316306073010a6	2021-12-03 19:06:30 -08:00
Michael Carilli	da023611d7	[CUDA graphs] Fixes make_graphed_callables example typos (#69379 ) Summary: cc mcarilli Pull Request resolved: https://github.com/pytorch/pytorch/pull/69379 Reviewed By: mruberry Differential Revision: D32841260 Pulled By: ngimel fbshipit-source-id: a7d0b9db0578526907547b201eddd55827812b63	2021-12-03 16:51:14 -08:00
Elio	088a4feb41	Update the documentation for AMP with DataParallel (#69218 ) Summary: Following https://github.com/pytorch/pytorch/issues/60540 and pull request https://github.com/pytorch/pytorch/issues/43102 Pull Request resolved: https://github.com/pytorch/pytorch/pull/69218 Reviewed By: gchanan Differential Revision: D32803814 Pulled By: ngimel fbshipit-source-id: 06fdbbee2c7734153271be70ec4bc24263c8c367	2021-12-03 14:58:47 -08:00
Michael Suo	ad182479b0	[deploy] docs (#69251 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69251 This adds some actual documentation for deploy, which is probably useful since we told everyone it was experimentally available so they will probably be looking at what the heck it is. It also wires up various components of the OSS build to actually work when used from an external project. Differential Revision: D32783312 Test Plan: Imported from OSS Reviewed By: wconstab Pulled By: suo fbshipit-source-id: c5c0a1e3f80fa273b5a70c13ba81733cb8d2c8f8	2021-12-01 21:55:18 -08:00
Nikul Patel	8f9f559453	amend tensors.rst and torch.rst for doc generation (#69030 ) Summary: (This is my first contribution to PyTorch) Added missing operations to docs added in https://github.com/pytorch/pytorch/issues/64430. Please let me know if I've done anything wrong. Fixes https://github.com/pytorch/pytorch/issues/68928 Pull Request resolved: https://github.com/pytorch/pytorch/pull/69030 Reviewed By: samdow Differential Revision: D32706826 Pulled By: soulitzer fbshipit-source-id: edcc175a8f9bc69450a39059580c05edce699312	2021-11-30 12:04:13 -08:00
mrshenli	b8c3693281	Remove autograd-enabled collective APIs from distributed docs (#69011 ) Summary: These APIs are not yet officially released and are still under discussion. Hence, this commit removes those APIs from docs and will add them back when ready. cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang Pull Request resolved: https://github.com/pytorch/pytorch/pull/69011 Reviewed By: fduwjj Differential Revision: D32703124 Pulled By: mrshenli fbshipit-source-id: ea049fc7ab6b0015d38cc40c5b5daf47803b7ea0	2021-11-29 18:14:50 -08:00
JUBIN CHHEDA	27228656e6	[FX][docs] Document gotcha about training flag (#68915 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/68913 Pull Request resolved: https://github.com/pytorch/pytorch/pull/68915 Reviewed By: jamesr66a Differential Revision: D32705410 Pulled By: jubinchheda fbshipit-source-id: a44c17ab0e62465823ceb0ef983ae330b50fb073	2021-11-29 16:13:32 -08:00
Mike Ruberry	6ae34ea6f8	Revert D32521980: Add linalg.lu_factor Test Plan: revert-hammer Differential Revision: D32521980 (`b10929a14a`) Original commit changeset: 26a49ebd87f8 fbshipit-source-id: e1a6bb9c2ece9bd78190fe17e16a46e3358c5c82	2021-11-28 17:22:15 -08:00
lezcano	b10929a14a	Add linalg.lu_factor (#66933 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66933 This PR exposes `torch.lu` as `torch.linalg.lu_factor` and `torch.linalg.lu_factor_ex`. This PR also adds support for matrices with zero elements both in the size of the matrix and the batch. Note that this function simply returns empty tensors of the correct size in this case. We add a test and an OpInfo for the new function. This PR also adds documentation for this new function in line of the documentation in the rest of `torch.linalg`. Fixes https://github.com/pytorch/pytorch/issues/56590 Fixes https://github.com/pytorch/pytorch/issues/64014 cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D32521980 Pulled By: mruberry fbshipit-source-id: 26a49ebd87f8a41472f8cd4e9de4ddfb7f5581fb	2021-11-27 17:52:48 -08:00
lezcano	cf54416925	Add docs entry for `adjoint`. (#68869 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68869 As per title. cc brianjo mruberry anjali411 Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D32647456 Pulled By: anjali411 fbshipit-source-id: 2cb053a6884e2b22d3decc058e86d10f355fcb84	2021-11-24 10:03:41 -08:00

2197 Commits