Commit Graph

20174 Commits

Author SHA1 Message Date
Rohan Varma
782ee6c7e7 [FSDP][Reland] Implement local_state_dict and load_local_state_dict
1. Implement the framework to allow users to choose among `state_dict`, `local_state_dict`, and `sharded_state_dict`.
2. Implement ShardedTensor-compatible local_state_dict() and load_local_state_dict().
ghstack-source-id: 149625958

Differential Revision: [D34383925](https://our.internmc.facebook.com/intern/diff/D34383925/)

[ghstack-poisoned]
2022-02-23 07:57:34 -08:00
Rohan Varma
4bb27ae7d3 Skip optimizer overlap tests that have issues with NCCL async error handling
Skip these tests which sometimes have issues on unrelated PRs such as
https://github.com/pytorch/pytorch/runs/5291461671?check_suite_focus=true. See
https://github.com/pytorch/pytorch/issues/73259 for additional details.

Differential Revision: [D34404857](https://our.internmc.facebook.com/intern/diff/D34404857/)

[ghstack-poisoned]
2022-02-22 15:53:01 -08:00
Scott Wolchok
28339ddc25 [PyTorch] Hit fused addmm path in linear() for existing MHA (#72871)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72871

We do this same trick in the native MHA implementation; backport it for purposes of fair comparison.
ghstack-source-id: 149526858
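
For reference, the fused path referred to above is the one where a 2-D input with a bias can be computed as a single addmm; a hedged sketch of the equivalence (illustrative, not the actual MHA code):

```python
import torch
import torch.nn.functional as F

x = torch.randn(8, 16)              # 2-D input
w = torch.randn(32, 16)
b = torch.randn(32)

y_linear = F.linear(x, w, b)        # may dispatch to the fused addmm path
y_addmm = torch.addmm(b, x, w.t())  # explicit fused multiply-add
print(torch.allclose(y_linear, y_addmm, atol=1e-6))
```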

Test Plan: CI

Reviewed By: ngimel

Differential Revision: D34176090

fbshipit-source-id: 8b578c29c4dcf0d85bae74dfbbb82db9a8f32dc7
(cherry picked from commit fd50170935)
2022-02-22 19:33:46 +00:00
Pavithran Ramachandran
932adf26e4 [easy][PyTorch][CleanUp] Removing unused function def (missing function implementation) (#73019)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73019

fb: Code search shows no usage https://www.internalfb.com/code/search?q=repo%3Aall%20writeMobileMetadata&hide_uninteresting=0&hide_tests=0
ghstack-source-id: 149381949

Test Plan: CI

Reviewed By: larryliu0820

Differential Revision: D34306823

fbshipit-source-id: b405e5683113bd4ff2e89eec023ae9ebb25c3dc9
(cherry picked from commit a72621fbbd)
2022-02-22 17:31:32 +00:00
Vasiliy Kuznetsov
6d86dc5390 dbr quant: store auto_quant_state on the top level model (#72934)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72934

Before this PR, DBR quantization had a limitation on handling user
code which iterates over all module children. For example, imagine
a forward function such as

```
def forward(self, x):
    for module in self:
        x = module(x)
    return x
```

Before this PR, this code would break with DBR quantization, because
we attach `AutoQuantizationState` objects to each child, and those
objects live in the child's module hierarchy and will appear in
these kinds of iterations, changing the meaning of the user program.

This PR reduces the scope of this problem to just the top level module.
Instead of attaching `AutoQuantizationState` objects to each child,
we register them in a map on the parent. Here is a before and after:

```
// toy model
model
 |--> child1

// toy model with AutoQuantizationState objects, before this PR
model
 |--> child1
 |  |--> _auto_quant_state
 |--> _auto_quant_state

// toy model with AutoQuantizationState objects, after this PR
model
 |--> child1
 |--> _fqn_to_auto_quant_state_map
    |--> ( ) --> _auto_quant_state // of `model`
    |--> (child1) --> _auto_quant_state // of `model.child1`
```

Note: `child1._auto_quant_state` works as before for convenience,
but the `child1` object now stores a soft link to its `_auto_quant_state`
instead of properly registering it in its module hierarchy. This is
somewhat hacky. If we need to improve this in the future, we could
remove this soft link and refactor the code to call the FQN map
instead.

Note: if the top level module iterates over its children, things will
still be broken. This is less likely, and we recommend that users work
around it by wrapping their model, or by checking for the
`AutoQuantizationStateModuleDict` type in their iteration loop, as sketched below.
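
A hedged sketch of the suggested iteration-loop workaround (the type check by name is illustrative; adapt to however the container is exposed):

```python
import torch
import torch.nn as nn

class WrappedSequential(nn.Sequential):
    def forward(self, x):
        for module in self:
            # skip the quantization-state container that DBR quant attaches
            # to the top-level model; the exact type check is illustrative
            if type(module).__name__ == "AutoQuantizationStateModuleDict":
                continue
            x = module(x)
        return x

m = WrappedSequential(nn.Linear(4, 4), nn.ReLU())
print(m(torch.randn(2, 4)).shape)
```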

The impact of this change should be an improvement of coverage
of user models. In fact, we expect this to drive our coverage of
torchbenchmark models from 89% to 100%.

Test Plan:
```
// previously disabled test cases with user code iterating
// over module children are now enabled, with wrappers
python test/test_quantization.py -k test_module_calls_items
python test/test_quantization.py -k test_vovnet_sequential
```

Reviewed By: dzdang

Differential Revision: D34281074

Pulled By: vkuzo

fbshipit-source-id: 0e25fc1ec529c47f72478a1875fe43219feac6b1
(cherry picked from commit 4008f89967)
2022-02-22 17:31:32 +00:00
Andrew Gu
c30659ffcc [ZeRO] (Reland) Add ctor support for multiple param groups (#72932)
Summary:
Reland of https://github.com/pytorch/pytorch/pull/72578.

**Overview**
Windows CI was failing due to the multi-rank single-GPU case (see [here](https://github.com/pytorch/pytorch/runs/5204906995?check_suite_focus=true)).

To address this, I
- added `common_distributed.skip_if_no_gpu` for `test_multiple_param_groups()` to ensure that each rank can safely call `to(self.device)` -- this targets the expected SPSD use case where each rank has its own GPU;
- moved `test_constructor()` back to `TestZeroRedundancyOptimizerSingleRank` to check that the multiple parameter group method for construction works even on a single rank.
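
For context, the relanded feature allows passing multiple parameter groups directly to the constructor. A hedged sketch, assuming the public `ZeroRedundancyOptimizer` signature (`model.backbone` / `model.head` are hypothetical submodules, and an initialized process group is required):

```python
import torch
from torch.distributed.optim import ZeroRedundancyOptimizer

def build_zero_optimizer(model: torch.nn.Module) -> ZeroRedundancyOptimizer:
    # two parameter groups, with a per-group learning-rate override
    param_groups = [
        {"params": model.backbone.parameters()},
        {"params": model.head.parameters(), "lr": 1e-4},
    ]
    return ZeroRedundancyOptimizer(
        param_groups,
        optimizer_class=torch.optim.SGD,
        lr=1e-2,
    )
```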

**Test Plan**
- I checked both tests for CPU, 1 GPU, 2 GPUs, 4 GPUs, and 8 GPUs.
- I added the `ciflow/win` label to run the failing Windows CI test.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72932

Reviewed By: rohan-varma

Differential Revision: D34281482

Pulled By: awgu

fbshipit-source-id: c4fe604ddd9d2c123c3071249741e6b8a6454b6e
(cherry picked from commit 6bea9bcc63)
2022-02-22 16:29:55 +00:00
Adam Costarino
849c6a526e Extrapolated on equiv between linalg @ and solve (#71769)
Summary:
Potentially fixes https://github.com/pytorch/pytorch/issues/71385; a similar docstring change could also fix https://github.com/pytorch/pytorch/issues/71384.

Updated the doc for `torch.linalg.inv` to include nuance around the equivalence to `torch.linalg.solve`:

Update is below:
```
.. note::
    Consider using :func:`torch.linalg.solve` if possible for multiplying a matrix on the left by
    the inverse, as::

        linalg.solve(A, B) == linalg.inv(A) @ B  # When B is a matrix

    It is always preferred to use :func:`~solve` when possible, as it is faster and more
    numerically stable than computing the inverse explicitly.
```
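
A quick numerical check of the documented equivalence (illustrative only):

```python
import torch

A = torch.randn(4, 4)
B = torch.randn(4, 3)
lhs = torch.linalg.solve(A, B)       # preferred: solve directly
rhs = torch.linalg.inv(A) @ B        # explicit inverse, then multiply
print(torch.allclose(lhs, rhs, atol=1e-5))
```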

IvanYashchuk please let me know if this is the right direction or an over-extrapolation. I can apply the same changes to the `tensorinv` doc to fix https://github.com/pytorch/pytorch/issues/71384. Also in https://github.com/pytorch/pytorch/issues/71384 there was a mention of updating the `torch.matmul` error message to indicate the proper tensor shapes; I could also do that in this PR if needed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71769

Reviewed By: H-Huang

Differential Revision: D34242541

Pulled By: mruberry

fbshipit-source-id: 40e98dad4d821928d1dea72d4512ee579b690a32
(cherry picked from commit a0321a5de9)
2022-02-22 12:29:32 +00:00
Taylor Robie
9f541aa3ac [Profiler] Optimize reportMemoryUsage (#71538)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71538

`reportMemoryUsage` is kind of awful. It does a bunch of string writes and such that makes it VERY expensive. Just moving that work off the hot path reduces the overhead for `profile_memory` from ~6.5 us to ~1.2 us. (85% reduction in the kineto contribution to profiling overhead.)
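
For context, a minimal usage sketch of the memory-profiling path this affects (the overhead numbers above come from the internal ubenchmark, not from this snippet):

```python
import torch
from torch.profiler import profile, ProfilerActivity

# profile_memory=True is what exercises the reportMemoryUsage path
with profile(activities=[ProfilerActivity.CPU], profile_memory=True) as prof:
    for _ in range(100):
        torch.empty(128, 128)

print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=5))
```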

Test Plan: Ran ubenchmark with `--op empty --stressTestKineto --kinetoProfileMemory`

Reviewed By: swolchok

Differential Revision: D32730167

fbshipit-source-id: fe18e8fa3881967cad8fa1c26c71c805e9b034e5
(cherry picked from commit 0d394cb252)
2022-02-20 23:29:13 +00:00
Michael Suo
bf03d93496 Revert D33919683: [FSDP] Implement local_state_dict and load_local_state_dict
Test Plan: revert-hammer

Differential Revision:
D33919683 (d50643adcd)

Original commit changeset: c9f1b43ce04d

Original Phabricator Diff: D33919683 (d50643adcd)

fbshipit-source-id: c54c181edf8eb6a3bc509ed54d34ffdce11b93f5
(cherry picked from commit 4dfb50cd0d)
2022-02-20 02:32:48 +00:00
Michael Suo
2a7f9f0600 Revert D34284271: [TLC][checkpoint] Add unit test for StatefulComponentCheckpointAgent
Test Plan: revert-hammer

Differential Revision:
D34284271 (f49a93ba56)

Original commit changeset: 58f84c69782a

Original Phabricator Diff: D34284271 (f49a93ba56)

fbshipit-source-id: 87deabae3c3c10c5a9532825ca33d78c5251958e
(cherry picked from commit 03bc05a970)
2022-02-19 21:28:55 +00:00
Chien-Chin Huang
d50643adcd [FSDP] Implement local_state_dict and load_local_state_dict (#72469)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72469

1. Implement the framework to allow users to choose among `state_dict`, `local_state_dict`, and `sharded_state_dict`.
2. Implement ShardedTensor-compatible local_state_dict() and load_local_state_dict().
ghstack-source-id: 149559985

Test Plan: CI

Reviewed By: rohan-varma

Differential Revision: D33919683

fbshipit-source-id: c9f1b43ce04da7db65c4aebf6ac2c7a0ac5e9de8
(cherry picked from commit 55fd6230c9)
2022-02-19 20:29:27 +00:00
Shihao Xu
f49a93ba56 [TLC][checkpoint] Add unit test for StatefulComponentCheckpointAgent
Summary: as titled

Test Plan:
tsloop --mode-dev-nosan aiplatform/modelstore/client/tests/:stateful_component_checkpoint_agent_test -- --focus --fail-fast

buck build mode/dev-nosan //aiplatform/modelstore/client/tests/:stateful_component_checkpoint_agent_test

./buck-out/gen///aiplatform/modelstore/client/tests//stateful_component_checkpoint_agent_test#binary.par --focus --fail-fast

Reviewed By: xunnanxu

Differential Revision: D34284271

fbshipit-source-id: 58f84c69782a7bdb30bed0a2420c74e7b7487bb9
(cherry picked from commit a1037118f4)
2022-02-19 18:31:20 +00:00
Nikolay Korovaiko
237574db19 add assert to make sure expected number of LTC roots matches what TS … (#73112)
Summary:
…computes

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73112

Reviewed By: mikaylagawarecki

Differential Revision: D34351338

Pulled By: Krovatkin

fbshipit-source-id: 1b3d0f3c801bd095b68d2eff3184ecbefadf7f34
(cherry picked from commit 53b7fc4ad6)
2022-02-19 06:33:08 +00:00
patel-zeel
c837caf5c5 Adding details to kl.py (#72845)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/72765.

- [x] Improved `NotImplementedError` verbosity.
- [x] Automated the docstring generation process.

## Improved `NotImplementedError` verbosity
### Code
```python
import torch

dist = torch.distributions

torch_normal = dist.Normal(loc=0.0, scale=1.0)
torch_mixture = dist.MixtureSameFamily(
    dist.Categorical(torch.ones(5,)
    ),
    dist.Normal(torch.randn(5,), torch.rand(5,)),
)

dist.kl_divergence(torch_normal, torch_mixture)
```
#### Output before this PR
```python
NotImplementedError:
```
#### Output after this PR
```python
NotImplementedError: No KL(p || q) is implemented for p type Normal and q type MixtureSameFamily
```
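
For completeness, users who hit this error can register their own KL for the pair via `register_kl`; a hedged sketch where the Monte-Carlo body is purely illustrative, not a recommendation:

```python
import torch
from torch.distributions import Normal, MixtureSameFamily
from torch.distributions.kl import register_kl

@register_kl(Normal, MixtureSameFamily)
def _kl_normal_mixture(p, q):
    # crude Monte-Carlo estimate of KL(p || q), for illustration only
    x = p.sample((1000,))
    return (p.log_prob(x) - q.log_prob(x)).mean(0)

# after registration, the call from the example above no longer raises
```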

## Automate the docstring generation process
### Docstring before this PR
```python
Compute Kullback-Leibler divergence :math:`KL(p \| q)` between two distributions.

    .. math::

        KL(p \| q) = \int p(x) \log\frac {p(x)} {q(x)} \,dx

    Args:
        p (Distribution): A :class:`~torch.distributions.Distribution` object.
        q (Distribution): A :class:`~torch.distributions.Distribution` object.

    Returns:
        Tensor: A batch of KL divergences of shape `batch_shape`.

    Raises:
        NotImplementedError: If the distribution types have not been registered via
            :meth:`register_kl`.
```
### Docstring after this PR
```python
Compute Kullback-Leibler divergence :math:`KL(p \| q)` between two distributions.

    .. math::

        KL(p \| q) = \int p(x) \log\frac {p(x)} {q(x)} \,dx

    Args:
        p (Distribution): A :class:`~torch.distributions.Distribution` object.
        q (Distribution): A :class:`~torch.distributions.Distribution` object.

    Returns:
        Tensor: A batch of KL divergences of shape `batch_shape`.

    Raises:
        NotImplementedError: If the distribution types have not been registered via
            :meth:`register_kl`.
    KL divergence is currently implemented for the following distribution pairs:
        * :class:`~torch.distributions.Bernoulli` and :class:`~torch.distributions.Bernoulli`
        * :class:`~torch.distributions.Bernoulli` and :class:`~torch.distributions.Poisson`
        * :class:`~torch.distributions.Beta` and :class:`~torch.distributions.Beta`
        * :class:`~torch.distributions.Beta` and :class:`~torch.distributions.ContinuousBernoulli`
        * :class:`~torch.distributions.Beta` and :class:`~torch.distributions.Exponential`
        * :class:`~torch.distributions.Beta` and :class:`~torch.distributions.Gamma`
        * :class:`~torch.distributions.Beta` and :class:`~torch.distributions.Normal`
        * :class:`~torch.distributions.Beta` and :class:`~torch.distributions.Pareto`
        * :class:`~torch.distributions.Beta` and :class:`~torch.distributions.Uniform`
        * :class:`~torch.distributions.Binomial` and :class:`~torch.distributions.Binomial`
        * :class:`~torch.distributions.Categorical` and :class:`~torch.distributions.Categorical`
        * :class:`~torch.distributions.Cauchy` and :class:`~torch.distributions.Cauchy`
        * :class:`~torch.distributions.ContinuousBernoulli` and :class:`~torch.distributions.ContinuousBernoulli`
        * :class:`~torch.distributions.ContinuousBernoulli` and :class:`~torch.distributions.Exponential`
        * :class:`~torch.distributions.ContinuousBernoulli` and :class:`~torch.distributions.Normal`
        * :class:`~torch.distributions.ContinuousBernoulli` and :class:`~torch.distributions.Pareto`
        * :class:`~torch.distributions.ContinuousBernoulli` and :class:`~torch.distributions.Uniform`
        * :class:`~torch.distributions.Dirichlet` and :class:`~torch.distributions.Dirichlet`
        * :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.Beta`
        * :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.ContinuousBernoulli`
        * :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.Exponential`
        * :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.Gamma`
        * :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.Gumbel`
        * :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.Normal`
        * :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.Pareto`
        * :class:`~torch.distributions.Exponential` and :class:`~torch.distributions.Uniform`
        * :class:`~torch.distributions.ExponentialFamily` and :class:`~torch.distributions.ExponentialFamily`
        * :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.Beta`
        * :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.ContinuousBernoulli`
        * :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.Exponential`
        * :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.Gamma`
        * :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.Gumbel`
        * :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.Normal`
        * :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.Pareto`
        * :class:`~torch.distributions.Gamma` and :class:`~torch.distributions.Uniform`
        * :class:`~torch.distributions.Geometric` and :class:`~torch.distributions.Geometric`
        * :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.Beta`
        * :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.ContinuousBernoulli`
        * :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.Exponential`
        * :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.Gamma`
        * :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.Gumbel`
        * :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.Normal`
        * :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.Pareto`
        * :class:`~torch.distributions.Gumbel` and :class:`~torch.distributions.Uniform`
        * :class:`~torch.distributions.HalfNormal` and :class:`~torch.distributions.HalfNormal`
        * :class:`~torch.distributions.Independent` and :class:`~torch.distributions.Independent`
        * :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.Beta`
        * :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.ContinuousBernoulli`
        * :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.Exponential`
        * :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.Gamma`
        * :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.Laplace`
        * :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.Normal`
        * :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.Pareto`
        * :class:`~torch.distributions.Laplace` and :class:`~torch.distributions.Uniform`
        * :class:`~torch.distributions.LowRankMultivariateNormal` and :class:`~torch.distributions.LowRankMultivariateNormal`
        * :class:`~torch.distributions.LowRankMultivariateNormal` and :class:`~torch.distributions.MultivariateNormal`
        * :class:`~torch.distributions.MultivariateNormal` and :class:`~torch.distributions.LowRankMultivariateNormal`
        * :class:`~torch.distributions.MultivariateNormal` and :class:`~torch.distributions.MultivariateNormal`
        * :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Beta`
        * :class:`~torch.distributions.Normal` and :class:`~torch.distributions.ContinuousBernoulli`
        * :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Exponential`
        * :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Gamma`
        * :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Gumbel`
        * :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Laplace`
        * :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Normal`
        * :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Pareto`
        * :class:`~torch.distributions.Normal` and :class:`~torch.distributions.Uniform`
        * :class:`~torch.distributions.OneHotCategorical` and :class:`~torch.distributions.OneHotCategorical`
        * :class:`~torch.distributions.Pareto` and :class:`~torch.distributions.Beta`
        * :class:`~torch.distributions.Pareto` and :class:`~torch.distributions.ContinuousBernoulli`
        * :class:`~torch.distributions.Pareto` and :class:`~torch.distributions.Exponential`
        * :class:`~torch.distributions.Pareto` and :class:`~torch.distributions.Gamma`
        * :class:`~torch.distributions.Pareto` and :class:`~torch.distributions.Normal`
        * :class:`~torch.distributions.Pareto` and :class:`~torch.distributions.Pareto`
        * :class:`~torch.distributions.Pareto` and :class:`~torch.distributions.Uniform`
        * :class:`~torch.distributions.Poisson` and :class:`~torch.distributions.Bernoulli`
        * :class:`~torch.distributions.Poisson` and :class:`~torch.distributions.Binomial`
        * :class:`~torch.distributions.Poisson` and :class:`~torch.distributions.Poisson`
        * :class:`~torch.distributions.TransformedDistribution` and :class:`~torch.distributions.TransformedDistribution`
        * :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.Beta`
        * :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.ContinuousBernoulli`
        * :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.Exponential`
        * :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.Gamma`
        * :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.Gumbel`
        * :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.Normal`
        * :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.Pareto`
        * :class:`~torch.distributions.Uniform` and :class:`~torch.distributions.Uniform`
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72845

Reviewed By: mikaylagawarecki

Differential Revision: D34344551

Pulled By: soulitzer

fbshipit-source-id: 7a603613a2f56f71138d56399c7c521e2238e8c5
(cherry picked from commit 6b2a51c796)
2022-02-19 06:33:08 +00:00
Andrey Talman
46f9e16afe Documenting cuda 11.5 windows issue (#73013)
Summary:
Adding documentation about compiling extension with CUDA 11.5 and Windows

Example of failure: https://github.com/pytorch/pytorch/runs/4408796098?check_suite_focus=true

 Note: Don't use torch/extension.h with CUDA 11.5 under Windows in your C++ code:
    use the ATen interface instead of the torch interface in all CUDA 11.5 code under Windows. Compilation has been failing with errors due to a bug in nvcc.
    Example use:
        >>> #include <ATen/ATen.h>
        >>> at::Tensor SigmoidAlphaBlendForwardCuda(....)
    Instead of:
        >>> #include <torch/extension.h>
        >>> torch::Tensor SigmoidAlphaBlendForwardCuda(...)
    Currently open issue for the nvcc bug: https://github.com/pytorch/pytorch/issues/69460
    Complete workaround code example: cb170ac024

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73013

Reviewed By: malfet, seemethere

Differential Revision: D34306134

Pulled By: atalman

fbshipit-source-id: 3c5b9d7a89c91bd1920dc63dbd356e45dc48a8bd
(cherry picked from commit 87098e7f17)
2022-02-19 02:34:59 +00:00
Steven Troxler
906d26fb9b [codemod][type-comments] Convert type comments in api.py (#73084)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73084

I'm wrapping up the conversion of type comments to type annotations
in caffe2. The last remaining "bulk" codemod has test failures that
are hard for me to understand, so I'm going to submit PRs for each
module individually which makes it easier to see what's causing
problems.

All the codemods were produced via LibCST and then manually cleaned up.
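
An illustrative before/after of the kind of conversion involved (not taken from api.py):

```python
from typing import List

# before the codemod: Python 2 style type comment
def scale_with_comment(values, factor):
    # type: (List[float], float) -> List[float]
    return [v * factor for v in values]

# after the codemod: inline annotations
def scale_with_annotations(values: List[float], factor: float) -> List[float]:
    return [v * factor for v in values]
```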

Test Plan: Wait for github CI

Reviewed By: H-Huang

Differential Revision: D34344289

fbshipit-source-id: e8e3a13c3d95f6804829f1818fb7f0605e5ba137
(cherry picked from commit 92d47d9cd5)
2022-02-19 00:31:45 +00:00
shubhambhokare1
671c8a459a [ONNX] Add pixel_unshuffle support in opset 9
Currently we are unable to utilize ONNX's SpaceToDepth operator due to the lack of the mode_s attribute, hence we add an alternative symbolic in opset 9 to support pixel_unshuffle.

- Adds support for pixel_unshuffle in opset9
- Adds support for dynamic input shapes for pixel_shuffle and pixel_unshuffle
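
A hedged sketch of exporting a model that uses pixel_unshuffle at opset 9 (model and shapes are illustrative):

```python
import io
import torch

class Unshuffle(torch.nn.Module):
    def forward(self, x):
        # spatial dims must be divisible by downscale_factor
        return torch.nn.functional.pixel_unshuffle(x, downscale_factor=2)

x = torch.randn(1, 3, 8, 8)
with io.BytesIO() as f:
    torch.onnx.export(Unshuffle(), x, f, opset_version=9)
```
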
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72449
2022-02-19 00:15:16 +00:00
Scott Wolchok
79a216ce57 Move native MHA code out of PyTorch core (#72944)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72944

Doesn't make sense to develop it in core right now.
ghstack-source-id: 149456040

Test Plan:
CI

run MHA benchmark in benchmark_transformers.py to make sure it doesn't crash

Reviewed By: zrphercule

Differential Revision: D34283104

fbshipit-source-id: 4f0c7a6bc066f938ceac891320d4cf4c3f8a9cd6
(cherry picked from commit b9df65e97c)
2022-02-18 21:34:06 +00:00
Stephen Oakley
1646a0033d Use irange in PyTorch (#72836)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72836

Replacing increment iterator loops with ranged loops. It allows loops such as for(int i=0;i<10;i++) to be expressed as for(const auto i : c10::irange(10)). This auto-types the loops and adds const-safety to the iteration variable.

Reviewed By: albanD

Differential Revision: D34136539

fbshipit-source-id: 760a70ad43ce6f05630ba8fea261d4dbb699e62e
(cherry picked from commit 0428408d88)
2022-02-18 19:29:07 +00:00
albanD
7fe3f334fb Remove call into python API without GIL being held in c10d (#72928)
Summary:
Fix https://github.com/pytorch/pytorch/issues/26475

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72928

Reviewed By: mikaylagawarecki

Differential Revision: D34317697

Pulled By: albanD

fbshipit-source-id: e13efb98e8c6bf4cbc05181c028d68871a844bf7
(cherry picked from commit c0e0397688)
2022-02-18 19:29:07 +00:00
BowenBao
956bafef8b [onnx export] Add broadcast to matmul shape inference (#70534)
Reuse the same broadcast code from the function `ProcessBroadcastNode`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72990
2022-02-18 18:44:19 +00:00
BowenBao
98f9ff9026 [ONNX] Fix an assertion failure involving Slice (#71965)
Before this change, exporting a model involving Slice to ONNX crashes at `axes[i]` on line 153 if C++ assertions are enabled:
```
/usr/include/c++/11.1.0/bits/stl_vector.h:1045: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = long int; _Alloc = std::allocator<long int>; std::vector<_Tp, _Alloc>::reference = long int&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__n < this->size()' failed.
```
The relevant check is https://github.com/gcc-mirror/gcc/blob/releases/gcc-11.1.0/libstdc++-v3/include/bits/stl_vector.h#L1045, which checks the vector index.

The issue can be reproduced by exporting Mask R-CNN or similar ones. For example,
```Python
import io
import torch
import torchvision as tv

model = tv.models.detection.maskrcnn_resnet50_fpn(pretrained=False)
x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)]
with io.BytesIO() as f:
    torch.onnx.export(model, x, f, opset_version=11)
```
(extracted from [onnxoptimizer tests](https://github.com/onnx/optimizer/blob/master/onnxoptimizer/test/optimizer_test.py))

Tested environment: Arch Linux x86_64 with pytorch and torchvision installed from [the official repo](https://github.com/archlinux/svntogit-community/blob/packages/python-pytorch/trunk/PKGBUILD) and [AUR](https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=python-torchvision), respectively.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72989
2022-02-18 18:41:47 +00:00
BowenBao
2791725a84 Integrate full ONNX check into ONNX export API (#71125)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72988
2022-02-18 18:40:09 +00:00
Raghavan Raman
2724e4c039 [Static Runtime] Do not replace with copy variants if TE fuser is enabled (#72946)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72946

The passes that replace ops with copy variants are run after TensorExpr fusion. Due to this, the resulting graph does not conform to the assumptions made in the fuser.

So, even if the flags `use_copy_variants` and `use_maybe_copy_variants` are turned on, the corresponding passes will not be executed if TensorExpr fusion is enabled.

ghstack-source-id: 149429753

Test Plan: Tested locally.

Reviewed By: mikeiovine

Differential Revision: D34283842

fbshipit-source-id: 74edea517a00c85dff0319f9c8b3ac8befe09018
(cherry picked from commit 3798af7f1b)
2022-02-18 18:34:50 +00:00
Raghavan Raman
02afdd54b9 [Static Runtime] Handle fallback graphs that are generated as part of the TE Fuser (#72945)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72945

ghstack-source-id: 149429754

Test Plan:
```
buck run mode/opt //caffe2/benchmarks/static_runtime:static_runtime_cpptest — --gtest_filter=CpuFusion.FallbackGraph
```

Reviewed By: mikeiovine

Differential Revision: D34283840

fbshipit-source-id: 868bd340a50fe691797164524f2400d07998d304
(cherry picked from commit 80f60f2cc0)
2022-02-18 18:34:50 +00:00
BowenBao
5843fea94d [ONNX] Add export support for linalg norm (#66575)
* Add matrix_norm

* Add vector norm

* Fix flake

* Fix flake

* nit fixes

* Nit fixes

* Restructure and add comments
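
A hedged export sketch exercising the new support (illustrative only; the exact opset requirements are not verified here):

```python
import io
import torch

class Norms(torch.nn.Module):
    def forward(self, x):
        return torch.linalg.matrix_norm(x), torch.linalg.vector_norm(x, ord=2)

x = torch.randn(4, 5)
with io.BytesIO() as f:
    torch.onnx.export(Norms(), x, f)
```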

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72987
2022-02-18 18:30:16 +00:00
BowenBao
32f6a1e2a2 [ONNX] First version of quantized model export: Support quantized.Linear (#69232)
Co-authored-by: David Fan <jiafa@microsoft.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72986
2022-02-18 18:27:26 +00:00
BowenBao
a6517c20cf [ONNX] Improve Expand shape inference (#69264)
Extend shape inference support for `Expand` when the value of the `shape` argument is unknown: infer the rank of the output of `Expand` and set its shape to dynamic, as long as the shape of the `shape` argument itself is known.

Without this, shape inference aborts and falls back to the static shape provided by the tracer, which is incorrect in many cases.

Co-authored-by: BowenBao <bowbao@microsoft.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72985
2022-02-18 18:24:28 +00:00
Alban Desmaison
0951cb513a Revert D34342689: Revert D34250357: Sync lazy_tensor_staging back to master
Test Plan: revert-hammer

Differential Revision:
D34342689

Original commit changeset: 43f6da6986f7

Original Phabricator Diff: D34250357 (69389fb542)

fbshipit-source-id: 8a3fb74877e719e9b9577b58027b4e7061a04ef0
(cherry picked from commit c749f08e7a)
2022-02-18 17:31:21 +00:00
Jane Xu
477d1bd6cf Revert D34313425: [quant] Add ConvTranspose reference module
Test Plan: revert-hammer

Differential Revision:
D34313425 (710f12f58e)

Original commit changeset: 3eeec1b24a51

Original Phabricator Diff: D34313425 (710f12f58e)

fbshipit-source-id: aecf9113d2e4cef3ccf4e1a9c4c33b07dc2ad385
(cherry picked from commit 3fcb9cd14d)
2022-02-18 17:31:20 +00:00
Alban Desmaison
86a961af87 Revert D34250357: Sync lazy_tensor_staging back to master
Test Plan: revert-hammer

Differential Revision:
D34250357 (69389fb542)

Original commit changeset: aa7d589f6050

Original Phabricator Diff: D34250357 (69389fb542)

fbshipit-source-id: 43f6da6986f7fc5189d641b7803adc5ada27194c
(cherry picked from commit 3c930a5e4e)
2022-02-18 15:47:37 +00:00
Kevin Tse
f5e201e4e9 [DataPipe] Adding usage examples for IterDataPipes (#73033)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73033
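
Since the PR only adds docstring examples, here is a hedged illustration of the kind of IterDataPipe usage being documented (assuming `IterableWrapper` from `torch.utils.data.datapipes.iter`):

```python
from torch.utils.data.datapipes.iter import IterableWrapper

# chain functional-form datapipes: wrap -> map -> filter
dp = IterableWrapper(range(10)).map(lambda x: x * 2).filter(lambda x: x % 3 == 0)
print(list(dp))  # [0, 6, 12, 18]
```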

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D34313793

Pulled By: NivekT

fbshipit-source-id: 51125be2f79d73d02658b2b1c2691f96be8d4769
(cherry picked from commit 3e3c2df7c6)
2022-02-18 15:12:34 +00:00
Vasiliy Kuznetsov
1c0df26597 eager quant: convert mapping for fused QAT Linear-Bn1d (#72796)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72796

Adds the eager mode convert mapping for the fused QAT Linear-Bn1d module.

Test Plan:
```
python test/test_quantization.py TestQuantizeEagerQATNumerics.test_linear_bn_workflow
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D34213150

fbshipit-source-id: c08b5eb843dea673fd07c6b7b93dcd3ba03eaec2
(cherry picked from commit 722edfe676)
2022-02-18 13:14:56 +00:00
Vasiliy Kuznetsov
e73eaffd3b quant: add QAT fused Linear-Bn1d [1/x]: prepared module (#72431)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72431

Adds support for a fused QAT observed module for `Linear` followed by
`BatchNorm1d`. In this PR, only the support for prepared module with
fake_quants in the right places is added.

A future PR will add support for `convert`, and tests for eager and FX
graph mode workflows.

Similar to conv-bn, we rescale the weight before applying the fake
quant, and undo the rescaling after the linear operation.
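
A hedged sketch of the rescaling trick described above (names and structure are hypothetical, not the actual module internals):

```python
import torch
import torch.nn.functional as F

def qat_linear_bn_forward(x, linear, bn, weight_fake_quant):
    # rescale the weight by the batch-norm factor before fake-quantizing
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    scaled_weight = weight_fake_quant(linear.weight * scale.reshape(-1, 1))
    out = F.linear(x, scaled_weight)
    # undo the rescaling after the linear op, then apply BN as usual
    out = out / scale.reshape(1, -1)
    if linear.bias is not None:
        out = out + linear.bias
    return bn(out)
```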

Test Plan:
```
python test/test_quantization.py TestQuantizeEagerQATNumerics.test_linear_bn
```

Imported from OSS

Reviewed By: jerryzh168, raghuramank10000

Differential Revision: D34044427

fbshipit-source-id: 47a519173939ca4824d2c6e6ea7a599764a8ed10
(cherry picked from commit bfc75fe078)
2022-02-18 13:14:56 +00:00
Terry Chen
710f12f58e [quant] Add ConvTranspose reference module (#73031)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73031

Add ConvTranspose reference module

Test Plan:
python3 test/test_quantization.py TestQuantizeEagerOps.test_conv_transpose_2d

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D34313425

fbshipit-source-id: 3eeec1b24a51c7951c4d4b0c7dca43a012468b85
(cherry picked from commit 0ee7c1cc39)
2022-02-18 06:29:12 +00:00
Will Constable
69389fb542 Sync lazy_tensor_staging back to master (#72875)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72875

This diff contains changes from several PRs landed to the lazy_tensor_staging branch.
* generating 'fallback' overrides for each codegenned op, useful for debugging
* supports operators which are missing aten:: symbols for op names, instead using their string counterpart
* makes the IR class a base class instead of hardcoding the assumption of TS

It also resolves lint issues and in particular cleans up the following:
* {Type}s shouldn't be passed into isValueType, and using the catch-all base class of CType is nicer than specifying a list of types.

Fixes #72852

Test Plan: test manually on lazy_tensor_staging branch

Reviewed By: shunting314

Differential Revision: D34250357

fbshipit-source-id: aa7d589f605055d5d02bc77c77fa6f1182ff7497
(cherry picked from commit 2f8f5e4971)
2022-02-18 03:49:46 +00:00
Don Jang
39fb771423 [Static Runtime] Report static op statistics from graph when input size is zero (#73032)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73032

Currently, ptvsc2_predictor_bench reports nothing when the input size is zero. However, Static Runtime's module creation produces some useful information just from loading a model.

This change reports static op statistics when the given input's size is zero. In addition, this enables reporting the out-variant coverage percentage, which is crucial for establishing the baseline performance of Static Runtime.

Test Plan: - Ran `ptvsc2_predictor_bench` with this change as seen above.

Reviewed By: mikeiovine

Differential Revision: D34294803

fbshipit-source-id: 80c02199075dae9280657d6edecc7c679c1c27f4
(cherry picked from commit 83aec141a2)
2022-02-17 23:58:32 +00:00
Raghavan Raman
6d33852685 [NNC] TensorExprKernel state should not be modified on calls to run methods (#73028)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73028

A typical use case for `TensorExprKernel` is to create the kernel once and call it multiple times, possibly in parallel. For the parallel calls to work, we need to ensure that the run() method calls do not change any state in `TensorExprKernel`.

Before this change, the `run()` method was modifying the sizes and strides vectors when dynamic shapes were present. This manifested as a data race when running a model with Static Runtime.
ghstack-source-id: 149398820

Test Plan:
```
buck build mode/dev-asan //caffe2/test/cpp/tensorexpr:tensorexpr
./buck-out/dev/gen/caffe2/test/cpp/tensorexpr/tensorexpr --gtest_filter="DynamicShapes.MultiThreadedExecution"
```

Reviewed By: eellison

Differential Revision: D34287960

fbshipit-source-id: d311f3c5a66c5d5de4e1deaeaa01816b53e9906e
(cherry picked from commit 161568bfae)
2022-02-17 23:14:27 +00:00
Joel Schlosser
f670179c0a Fix doc regressions for various modules and functional forms (#73014)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73014

Fixes #72501
Fixes #72502
Fixes #72503
Fixes #72504
Fixes #72505
Fixes #72506
Fixes #72507
Fixes #72509
Fixes #72510

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D34305640

Pulled By: jbschlosser

fbshipit-source-id: 62f341633fdb0316eaa346cf7247865290eb830a
(cherry picked from commit 8362d264e7)
2022-02-17 22:40:18 +00:00
vfdev
af3ca50291 Fixed docstring typo for nn.Module.get_submodule (#73018)
Summary:
Description:
- Fixed docstring typo for nn.Module.get_submodule

Without the fix, the example output is invisible in the rendered docs: https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.get_submodule

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73018

Reviewed By: davidberard98

Differential Revision: D34310091

Pulled By: jbschlosser

fbshipit-source-id: e35aef2b7479bdd81fb6b7ddd203bd71798769e1
(cherry picked from commit e4944e1f8e)
2022-02-17 22:40:18 +00:00
Rohan Varma
209a948896 [Reland][FSDP] Implement apply() (#72925)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72925

Reland with a fix to add the owner string in the test file.
ghstack-source-id: 149280348

Test Plan: CI

Reviewed By: zhaojuanmao

Differential Revision: D34273858

fbshipit-source-id: 2174c1d71fcc5148282d94e375071a50b92114f2
(cherry picked from commit 158762bbb3)
2022-02-17 21:50:03 +00:00
Chen Lai
2c916ef198 More update on the guidance (#72818)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72818

ghstack-source-id: 149395630

Test Plan: CI

Reviewed By: raziel

Differential Revision: D34226823

fbshipit-source-id: e31b71110e8e94bd9fabe25a388f0d4a9b9d0ca7
(cherry picked from commit 57e9b034aa)
2022-02-17 20:05:17 +00:00
Mike Iovine
d1c5f9e439 [JIT][SR] Introduce prim::IfThenElse (#72587)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72587

This pattern frequently appears in a few graphs:

```
%result = prim::If(%condition)
  block0():
    -> (%a)
  block1():
    -> (%b)
```

This is slow, particularly in static runtime. Static runtime creates memory planners/block runners for each sub-block, which eats up a lot of memory and introduces a lot of extra overhead for this relatively simple operation.

This diff introduces a new op that replaces nodes like the above with a single op meant to act like a ternary operator:

```
%result = prim::IfThenElse(%condition, %a, %b)
```

Test Plan: New unit tests

Reviewed By: eellison

Differential Revision: D34091789

fbshipit-source-id: eb6a8c460c39b4c019a1f4ab1f3f1e5b6edc400c
(cherry picked from commit 0f1b335e5b)
2022-02-17 18:22:48 +00:00
Chen Lai
cee84f4051 fix model dump for the lowered module (#72866)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72866

https://github.com/pytorch/pytorch/pull/71597 adds a wrapper `torch.jit.LoweredWrapper`, which breaks model dump. This change fixes model_dump in the notebook.
ghstack-source-id: 149311636

Test Plan:
CI and test with N509022

Before:

{F701413403}

After:

{F701412963}

Reviewed By: iseeyuan

Differential Revision: D34247216

fbshipit-source-id: 695b02b03675fae596bb450441b327e4cdcffe9c
(cherry picked from commit d46a82a4c1)
2022-02-17 07:09:44 +00:00
Jordan Fix
540cb5fee2 [graph_manipulation] Unpack list of outputs (#72940)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72940

att

Reviewed By: jackm321

Differential Revision: D34282062

fbshipit-source-id: 743710c18e1f38286d1b91c91868bb22c760f3ca
(cherry picked from commit fd2bdd189d)
2022-02-17 06:38:52 +00:00
Don Jang
5ea74b4996 [Static Runtime] Remove ProcessedNode::num_outputs_ (#72592)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72592

Only code paths that are not perf-critical read `ProcessedNode::num_outputs_`, and it is a static property of the op that the `ProcessedNode` instance is executing.

Therefore, it's better to move `ProcessedNode::num_outputs_` into `ProcessedFunction::num_outputs_` and let `ProcessedNode` access it via `ProcessedNode::fn_` for its occasional use. Note that this prevents duplicating num_outputs_ per node and per Static Runtime instance, since `ProcessedFunction` instances are shared across all runtime instances.

It's confirmed by local instrumentation that this change reduces `sizeof(ProcessedNode)` by 14%:

- Before: sizeof(ProcessedNode): 56
- After: sizeof(ProcessedNode): 48

Test Plan: `buck test //caffe2/benchmarks/static_runtime:static_runtime_cpptest`

Reviewed By: mikeiovine

Differential Revision: D33984792

fbshipit-source-id: e29ffc97b799e679215f42e1e85cd3fcd7e88983
(cherry picked from commit 0f7003f4df)
2022-02-17 05:09:17 +00:00
Pearu Peterson
456d96d544 Generate static docstrings for torch._masked functions. (#72865)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72865

Fixes #72636

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D34286183

Pulled By: cpuhrsch

fbshipit-source-id: 9cf81bfed6ba8c82593f6a1d9e0b20d0a083310d
(cherry picked from commit 0a3f57896b)
2022-02-17 02:44:16 +00:00
Philip Meier
1f74e082e2 only compare attributes for meta tensors (#72508)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72508

Todo:

- [x] document this behavior
- [x] add tests

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D34262452

Pulled By: ezyang

fbshipit-source-id: bc5c9653d5c3ad5c6efccc9c8e0efc0d28e15104
(cherry picked from commit 233142c88e)
2022-02-17 02:33:08 +00:00
Philip Meier
b5f2574f36 no longer coalesce sparse COO tensors before comparison (#69751)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69751

cc nikitaved pearu cpuhrsch IvanYashchuk

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D34262453

Pulled By: ezyang

fbshipit-source-id: e2e62d2aa03fc569d2951c880960b256f5dc4aaa
(cherry picked from commit cb6b0ef719)
2022-02-17 02:33:08 +00:00
Vitaly Fedyunin
81fbeea760 Add docstrings to native_channel_shuffle (#72919)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72919

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D34274717

Pulled By: VitalyFedyunin

fbshipit-source-id: fa42f91ef2335e2594b19ef65d914c711f7a94fd
(cherry picked from commit a6f6fe9112)
2022-02-17 02:33:08 +00:00