Commit Graph

44789 Commits

Author SHA1 Message Date
James Reed
a2d2610ec9 [FX] Assert None concrete_args and improve error messages (#74662)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74662

Previously, we would not emit a check that `concrete_args` with value `None` matched that value at runtime. This fixes that and improves some of the warning messages.
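
For illustration, a minimal sketch (not from this commit) of tracing with a `None` in `concrete_args`; after this fix, passing a different value at call time should trip the emitted check:

```python
import torch
from torch import fx

def f(x, flag):
    # control flow is specialized away when flag is traced as a concrete None
    return x * 2 if flag is None else x + flag

gm = fx.symbolic_trace(f, concrete_args={"flag": None})
print(gm(torch.randn(3), None))   # matches the specialization: fine
# gm(torch.randn(3), 1.0)         # should now fail the emitted None check at runtime
```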

Test Plan: Imported from OSS

Reviewed By: Chillee

Differential Revision: D35137362

Pulled By: jamesr66a

fbshipit-source-id: 222a2c8a907748f90290f1c1b4ab8012b46099a0
(cherry picked from commit b960405ad87e57dcf62ca25dd4d4bdfc34c8744c)
2022-03-25 23:36:27 +00:00
Linbin Yu
1c4eb3a266 [android] improve unsupported scalar type error message for android
Summary: Android only supports a few scalar types as model return values. This diff improves the error message so users can know which type is not supported.

Test Plan: verified unsupported scalar type is printed

Differential Revision: D35104788

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74660
Approved by: https://github.com/kit1980
2022-03-25 23:07:46 +00:00
Christian Puhrsch
edf2deb81e Add private conversion function from CSR to block CSR
This PR adds a private function that converts a CSR Tensor into a [scipy-style block CSR Tensor](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.bsr_matrix.html#scipy.sparse.bsr_matrix).

It uses the scipy CSR to BSR conversion routines (and credits them accordingly).

The main purpose of this function is to easily create a block CSR Tensor for matrix multiplication.
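
For reference, the scipy conversion this function mirrors (and credits) looks like the following; the new PyTorch entry point is private and not named in this message, so only the scipy side is shown:

```python
import numpy as np
from scipy.sparse import csr_matrix

a = csr_matrix(np.arange(16.0).reshape(4, 4))  # plain CSR
b = a.tobsr(blocksize=(2, 2))                  # scipy's CSR -> block CSR (BSR)
print(b.blocksize, b.data.shape)               # (2, 2) blocks stored densely
```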

Follow up work includes
- Blocksize support for sparse_csr_tensor
- Parallel CPU kernel
- CUDA kernels
- Faster arg sanitization
- Benchmarking of cuSPARSE backend
- Dense to/from block CSR
- Autograd support
- Column-major blocks
- Block CSR to CSR conversion
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71582
Approved by: https://github.com/IvanYashchuk, https://github.com/albanD
2022-03-25 21:22:15 +00:00
anjali411
1dab71ab25 Allow specifying tags for aten operators in native_functions.yaml
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72549

Approved by: https://github.com/ezyang
2022-03-25 21:17:52 +00:00
Eli Uriegas
79f91e6ef4 ci: Move ssh setup to its own action
SSH setup was being hidden away in the setup step for both Linux and
Windows; this moves it out to its own step so that users can know where
to click to get SSH details.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74773

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Approved by: https://github.com/suo, https://github.com/malfet
2022-03-25 21:10:45 +00:00
Will Constable
85abc328b9 Adds dependencies on lazy codegen sources to invocation of generate_code (#74750)
Summary:
Isn't foolproof since it doesn't include transitive deps of these python scripts, but it's better than nothing.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74750

Reviewed By: bdhirsh

Differential Revision: D35145749

Pulled By: wconstab

fbshipit-source-id: ccd77cf18f68cc66c790f41f111833eca4101dac
(cherry picked from commit 5ef757ffd27837eb2b3c98935d66aecb1fc5acf9)
2022-03-25 20:50:52 +00:00
Richard Zou
a75c718d7c [reland] Update tls logic to work better with guarded call (#73925)
This PR relands https://github.com/pytorch/pytorch/pull/73925 which we
reverted due to a large breakage in functorch.

As a part of the reland, this PR adds a change we agreed upon in
https://docs.google.com/document/d/1i7Y9VZp9PxtgVcrQh6nGQXkXkPc1uMep0dM-OMOGJ9o/edit
The change is moving the PythonTLSSnapshot key after
DynamicLayerFrontMode.

Test Plan:
- I tested this with an updated version of functorch and all the tests
pass so I think we are out of the woods.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74577

Approved by: https://github.com/albanD
2022-03-25 19:51:10 +00:00
Aaron Enye Shi
d014772b9f [Profiler] Store Input shapes, dtypes, and metadata into flat AppendOnlyList (#74241)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74241

Adds the following changes:
- During collection, replace the vector of vectors of int shapes and the vector of string dtypes; instead, pack the IValue details into InputOutputEncoder as flat AppendOnlyLists.
- This saves each IValue with an enum tag, metadata holding its dim and dtype, and the shapes.
- During post-processing, reconstruct the vectors that were originally expected (struct Inputs).
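
An illustrative Python sketch of the flat, tagged layout described above (the actual implementation is C++, using InputOutputEncoder over AppendOnlyList):

```python
# Three flat append-only lists instead of vector<vector<...>> per op.
tags, metadata, flat_shapes = [], [], []

def encode(values):
    for v in values:
        if hasattr(v, "dtype"):                  # tensor-like value
            tags.append("TENSOR")
            metadata.append((v.dim(), v.dtype))  # dim + dtype per tensor
            flat_shapes.extend(v.shape)          # shapes flattened, no nesting
        else:
            tags.append("OTHER")
            metadata.append(None)

# Post-processing walks (tags, metadata) to slice flat_shapes back into
# the per-op vectors that struct Inputs expects.
```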

Reviewed By: chaekit

Differential Revision: D34823546

Pulled By: aaronenyeshi

fbshipit-source-id: 718fccaa8aab16128da986d665564a8fef5436c8
(cherry picked from commit 96a47c068e55220e7b7224c8a1935033859b5cd2)
2022-03-25 19:10:09 +00:00
Omkar Salpekar
e8c4926e75 [GHF] Adding James Reed to Merge Rules superusers (#74758)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74758

See Title

Test Plan: NA

Reviewed By: jamesr66a

Differential Revision: D35147762

fbshipit-source-id: 34572bfb3aef5e14a06fe27dc3008308b40bdc34
(cherry picked from commit 190f429cdf6381b7d2c955ed9e9a2a62930d0582)
2022-03-25 18:38:18 +00:00
Pearu Peterson
ebeea9e2ea Support masked sum on sparse COO tensors.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71239

Approved by: https://github.com/cpuhrsch
2022-03-25 18:26:39 +00:00
Xiang Gao
3b29bd00eb Make ProcessGroupNCCL load torch_ucc.so when TORCH_UCC_LIBRARY_PATH is set (#69552)
Summary:
This is the very first step for the UCC-NCCL integration. This PR lets `ProcessGroupNCCL` load `torch_ucc.so` if the user specifies the environment variable `TORCH_UCC_LIBRARY_PATH`. If this environment variable is not specified by the user, then there will be no visible change.

In the future, we may want to make PyTorch smart enough to automatically detect the `torch_ucc.so` in the user's system, but before doing that, I believe we should first make sure that `ProcessGroupUCC` is very well tested.

Note that in this PR, `ProcessGroupNCCL` just loads the library but will not use it. I am trying to make PRs small, so the usage of `torch_ucc.so` will be submitted in later PRs.

This PR requires the change in https://github.com/facebookresearch/torch_ucc/pull/56, otherwise `torch_ucc.so` cannot be successfully loaded. But this PR can be landed separately without waiting for https://github.com/facebookresearch/torch_ucc/pull/56 because, in PyTorch's unit tests, UCC is never used or tested.
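
A hypothetical usage sketch (the library path is an example, and a NCCL build plus a compatible torch_ucc build are assumed):

```python
import os
import torch.distributed as dist

# If TORCH_UCC_LIBRARY_PATH is unset, ProcessGroupNCCL behaves as before.
os.environ["TORCH_UCC_LIBRARY_PATH"] = "/usr/local/lib/torch_ucc.so"  # example path
dist.init_process_group("nccl", init_method="tcp://127.0.0.1:29500",
                        rank=0, world_size=1)
```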

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69552

Reviewed By: mruberry

Differential Revision: D34675212

Pulled By: jiayisuse

fbshipit-source-id: a3d1fb98340dbe3a931af555423863efd381f1ae
(cherry picked from commit 3778b6fabe70c26b5a65e6ddec641d2ef9113cd1)
2022-03-25 18:19:39 +00:00
Nikita Shulga
f36ceefd71 [GHF] Speedup default PR query
Fetch check run statuses only for the last commit, and author names only for the first hundred commits.

This avoids hitting the resource limits on PRs with lots of commits.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74731
Approved by: https://github.com/seemethere
2022-03-25 18:12:29 +00:00
Slava Kovalevskyi
3b3bdfd51c Revert D34808842: Reland "[pytorch][PR] Support dataclasses in TorchScript"
Test Plan: revert-hammer

Differential Revision:
D34808842 (b57cc9c752)

Original commit changeset: 02f807cff1ea

Original Phabricator Diff: D34808842 (b57cc9c752)

fbshipit-source-id: bd7c47493b598677e77634d06d7dc3e3a457b92d
(cherry picked from commit e1853d73b3ad2494457626fbb34c65169ae8cc31)
2022-03-25 17:17:30 +00:00
Christian Puhrsch
7fe0b6a5cd mul(sparse_csr, sparse_csr) using mul(sparse, sparse)
Basic fallback implementation. Let's make this faster once used.

NOTE: This is stacked on top of https://github.com/pytorch/pytorch/pull/74294
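
A minimal sketch of the new entry point, assuming the COO fallback described above:

```python
import torch

crow = torch.tensor([0, 1, 2])
col = torch.tensor([0, 1])
val = torch.tensor([1.0, 2.0])
a = torch.sparse_csr_tensor(crow, col, val, (2, 2))
b = torch.sparse_csr_tensor(crow, col, val, (2, 2))
c = torch.mul(a, b)  # internally routed through mul(sparse, sparse) on COO
```
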
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74266
Approved by: https://github.com/pearu, https://github.com/malfet
2022-03-25 17:10:33 +00:00
Michael Melesse
cd929f403f [ROCM] Navi21 Enablement 7: Sparse kernels
This PR is a follow up to the following prs.
https://github.com/pytorch/pytorch/pull/69942
https://github.com/pytorch/pytorch/pull/72682
https://github.com/pytorch/pytorch/pull/72809
https://github.com/pytorch/pytorch/pull/73543
https://github.com/pytorch/pytorch/pull/73545
https://github.com/pytorch/pytorch/pull/73546

We are adding support for Navi21 GPUs, which have a warp size of 32. We cannot rely on a constant, so we have to dynamically look up the warp size when launching the kernel on the host side. Inside device functions this is not needed, and the compiler can correctly detect the warp size to replace the C10_WARP_SIZE constant.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73548
Approved by: https://github.com/ngimel
2022-03-25 17:09:03 +00:00
Brian Hirsh
c0491c9179 DispatchKeySet perf improvements (#72828)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72828

Reland of D34034847 (8aa3620d73)
ghstack-source-id: 152161453

Test Plan: confirm that Milan tests are passing

Reviewed By: ezyang, albanD

Differential Revision: D34227615

fbshipit-source-id: c7695e16dba3076e8ab9df8654327c5d57e92c77
(cherry picked from commit 940717db1551b799964894e0bb97757ecae14235)
2022-03-25 17:04:51 +00:00
Brian Hirsh
2cbddc0e9b free up dispatch key space (in C++) (#72827)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72827

Reland of D34034848 (6690256021)
ghstack-source-id: 152161452

Test Plan: Confirm that Milan tests are passing

Reviewed By: ezyang

Differential Revision: D34227616

fbshipit-source-id: 6d1dd0fd8144dfbd9e194cd7564cce017e7db968
(cherry picked from commit e5c1b29fedd5c2a0bad810cedc94aa784136b6aa)
2022-03-25 17:04:51 +00:00
Alban Desmaison
7c747c7907 Add Sherlock to superusers
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74744
Approved by: https://github.com/SherlockNoMad, https://github.com/seemethere
2022-03-25 17:02:14 +00:00
Jerry Zhang
0747bdbf11 [quant][fx] Removing more unused code (#74603)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74603

att

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: andrewor14

Differential Revision: D35071546

fbshipit-source-id: 273a7f0cb2a8f306864eb118916056fad3bb1399
(cherry picked from commit 9c31a50a2bccb2e5b7a5db833085a75e5ebda707)
2022-03-25 16:39:48 +00:00
Salil Desai
cdcd1ac121 [PyTorch Edge] Make contexts thread local for quantized matmul (#74676)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74676

We don't want to create and destroy a new context with each multiplication

Test Plan:
From fbcode:
```buck test caffe2/test:quantization -- test_qmatmul```

# Performance Improvement
*Benchmarking done on a model which performs matmuls of the same shapes and counts as the Transformer Model, as determined in D30901505*

*Notebook in which Benchmarking was performed: https://www.internalfb.com/intern/anp/view/?id=1582075&revision_id=1891629751047842*

**Improvement from this diff alone**
~9.71% Reduction in Latency
- Non Thread Local Contexts (before this diff, D35087184 v2): [8.5410ms](https://www.internalfb.com/intern/aibench/details/661728682381311)
- Thread Local Contexts (this diff, v12): [7.7113ms](https://www.internalfb.com/intern/aibench/details/956655867696198)

**FP32 Matmul vs Quantized Matmul, Overall Improvement from this diff stack**
56% reduction in latency compared to FP32 Matmul, 71% reduction in latency compared to Naive QMatmul
- FP32 Matmul: [17.4910ms](https://www.internalfb.com/intern/aibench/details/875394396322469)
- Quantized Matmul (after this diff): [7.7113ms](https://www.internalfb.com/intern/aibench/details/956655867696198)
- Naive Quantized Matmul (dequantize → fp32matmul → quantize): [26.8639ms](https://www.internalfb.com/intern/aibench/details/52181682131461)

Reviewed By: kimishpatel

Differential Revision: D34756288

fbshipit-source-id: b000658152cf71b4185dcd34a3cccc71b4cec1f0
(cherry picked from commit 5bc7ef6b5c3255388eb8fab230e44073004d2266)
2022-03-25 15:36:01 +00:00
Pavel Belevich
96c8f64459 Remove with_traceback(None) in wrapped_call to show the root cause error
Before:
```
Traceback (most recent call last):
  File "/Users/pbelevich/PycharmProjects/PiPPy/test/t5_test.py", line 37, in <module>
    t5_pipe_output = t5_pipe(input_ids=t5_input, decoder_attention_mask=None, decoder_input_ids=decoder_input_ids)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/pbelevich/PycharmProjects/PiPPy/pippy/IR.py", line 251, in forward
    return self.executor.run(*executor_args)
  File "/Users/pbelevich/PycharmProjects/PiPPy/pippy/IR.py", line 155, in run
    return super().run(*args, initial_env=initial_env)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/interpreter.py", line 121, in run
    self.env[node] = self.run_node(node)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/interpreter.py", line 148, in run_node
    return getattr(self, n.op)(n.target, args, kwargs)
  File "/Users/pbelevich/PycharmProjects/PiPPy/pippy/IR.py", line 170, in call_module
    return super().call_module(target, args, kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/interpreter.py", line 265, in call_module
    return submod(*args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 630, in wrapped_call
    raise e.with_traceback(None)
AttributeError: 'NoneType' object has no attribute 'dtype'
```
After:
```
Traceback (most recent call last):
  File "/Users/pbelevich/PycharmProjects/PiPPy/test/t5_test.py", line 37, in <module>
    t5_pipe_output = t5_pipe(input_ids=t5_input, decoder_attention_mask=None, decoder_input_ids=decoder_input_ids)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/pbelevich/PycharmProjects/PiPPy/pippy/IR.py", line 251, in forward
    return self.executor.run(*executor_args)
  File "/Users/pbelevich/PycharmProjects/PiPPy/pippy/IR.py", line 155, in run
    return super().run(*args, initial_env=initial_env)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/interpreter.py", line 121, in run
    self.env[node] = self.run_node(node)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/interpreter.py", line 148, in run_node
    return getattr(self, n.op)(n.target, args, kwargs)
  File "/Users/pbelevich/PycharmProjects/PiPPy/pippy/IR.py", line 170, in call_module
    return super().call_module(target, args, kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/interpreter.py", line 265, in call_module
    return submod(*args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 630, in wrapped_call
    raise e
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 620, in wrapped_call
    return cls_call(self, *args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 630, in wrapped_call
    raise e
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 620, in wrapped_call
    return cls_call(self, *args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 630, in wrapped_call
    raise e
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 620, in wrapped_call
    return cls_call(self, *args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 630, in wrapped_call
    raise e
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 620, in wrapped_call
    return cls_call(self, *args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 630, in wrapped_call
    raise e
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 620, in wrapped_call
    return cls_call(self, *args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 630, in wrapped_call
    raise e
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 622, in wrapped_call
    return super(cls, self).__call__(*args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "<eval_with_key>.42", line 74, in forward
  File "/Users/pbelevich/PycharmProjects/pbelevich-transformers/src/transformers/utils/fx.py", line 180, in wrapper
    return func(*args, **kwargs)
  File "/Users/pbelevich/PycharmProjects/pbelevich-transformers/src/transformers/modeling_utils.py", line 256, in create_extended_attention_mask_for_decoder
    causal_mask = causal_mask.to(attention_mask.dtype)
AttributeError: 'NoneType' object has no attribute 'dtype'
```

The last lines of the stack trace show where the problem is.
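
A minimal repro of the masking behavior removed here:

```python
def inner():
    return None.dtype  # the real bug: AttributeError on NoneType

try:
    inner()
except AttributeError as e:
    raise e.with_traceback(None)  # before: traceback stops here, hiding inner()
    # raise e                     # after: full traceback down to inner()
```
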
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74655
Approved by: https://github.com/ansley, https://github.com/rohan-varma
2022-03-25 14:40:45 +00:00
Nicolas Hug
7df0d9fda4 Call super().setUp() and super().tearDown() in torchhub tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74621

Approved by: https://github.com/vmoens, https://github.com/janeyx99, https://github.com/cpuhrsch
2022-03-25 14:36:31 +00:00
atalman
ca96d1d447 Use nvidia cuda image without cudnn for cudnn 8 and up
Use nvidia cuda image without cudnn for cudnn 8 and up.
We want to decouple the CUDA and cuDNN versions so that we can evolve these versions separately.
We want to use cuDNN 8.3.2 for the following CUDA versions: 11.3, 11.5 and 11.6.
We are using the official Nvidia CUDA Ubuntu image and installing cuDNN 8.3.2 on top of it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74545
Approved by: https://github.com/malfet
2022-03-25 12:18:42 +00:00
Jerry Zhang
66e07f2aef [quant][fx] Merge is_general_tensor_shape_op into is_general_tensor_value_op in QuantizeHandler (#74601)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74601

Currently the behavior for general tensor shape ops and general tensor value ops is the same, so we can remove
the is_general_tensor_shape_op flag and merge it into the is_general_tensor_value_op flag.

The is_general_tensor_value_op flag is used in two places in prepare:
(1) dtype propagation: we only do dtype propagation when this flag is true (this will be refactored in the future to be more systematic)
(2) observer sharing: we'll use the input observer instance as the output observer for an op if this flag is True

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: george-qi

Differential Revision: D35071438

fbshipit-source-id: 5e8f5fd84e37db0433a63fe0a0e212ce3c5908d6
(cherry picked from commit b4bbc9fa0e65f3768eb97ca8e84b7cbd7e840b67)
2022-03-25 11:10:44 +00:00
CodemodService FBSourceClangFormatLinterBot
7235ebc5e2 [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zsol

Differential Revision: D35138849

fbshipit-source-id: 9dc26f28c7855260121b188f3733d1e0a2a8560a
(cherry picked from commit 788424548ddecee7793a329cffd5e0454663a1ad)
2022-03-25 09:31:42 +00:00
Han Qi
b57cc9c752 Reland "[pytorch][PR] Support dataclasses in TorchScript" (#74353)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74353

Repatched `d00de0d43598522b8f6ab2de553b6aaf6768faa5` by Nora Belrose (norabelrose), with the following changes:
* Register the fake source of generated methods in linecache so that `inspect.getsource` will succeed.
* This patching is only triggered if the given dataclass was previously passed to torch.jit.script. Effectively we make this feature opt-in.

## Original Summary:
Fixes #72901.

Since we can't get access to the source code for synthesized magic methods on dataclasses, we have to synthesize our own versions. torch/jit/_dataclass_impls.py has the code that does this.

What's supported:

- Synthesized `__init__`, `__eq__`, and the comparison magic methods when `order=True` is set on the dataclass decorator
- Default values for fields
- `__post_init__`, including using `InitVar` fields inside of `__post_init__`, on Python 3.8+
- Overriding `__eq__` or any of the comparison magic methods to provide your own implementation

What's not supported:

- Default factory initializers for fields
- Frozen dataclasses
- `InitVar` on Python 3.7
- `__repr__` and `__hash__` (these are actually implemented, but the TorchScript interpreter won't call them)
- Using the `!=` operator on dataclasses inside TorchScript; this is because TorchScript requires that you implement `__ne__` to use this operator, whereas in regular Python the `!=` operator will resolve to the negation of whatever is returned by `__eq__` if there's no `__ne__`. Dataclasses don't actually synthesize an `__ne__` method for this reason. I've been toying with different ways to fix this, but `!=` is not working in this PR at the moment.
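
A minimal sketch of the opt-in flow, assuming the gating described above (the dataclass must itself be passed to torch.jit.script first):

```python
from dataclasses import dataclass
import torch

@dataclass
class Point:
    x: float
    y: float

torch.jit.script(Point)  # opt in: registers the synthesized methods

@torch.jit.script
def reflect(p: Point) -> Point:
    return Point(-p.x, -p.y)

print(reflect(Point(1.0, 2.0)))
```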

Test Plan:
unittest

Also run previously failed test:
```
buck test mode/dev-nosan //fblearner/flow/projects/fluent2/definition/transformers/contrib/faim/test:tests -- --exact 'fblearner/flow/projects/fluent2/definition/transformers/contrib/faim/test:tests - test_mixmatch_multiclass (fblearner.flow.projects.fluent2.definition.transformers.contrib.faim.test.faim_mixmatch_test.TestFaimTransformerMixMatch)'
```
passes

Reviewed By: zhxchen17

Differential Revision: D34808842

fbshipit-source-id: 02f807cff1ea99e606333960225c71a239743a4b
(cherry picked from commit ec885a2bc04f9e5f65838fa5704d9a05815ebd37)
2022-03-25 06:41:07 +00:00
Peter Bell
c7a6be4b9c qlinear: Remove legacy cpp_custom_type_hack support (#72680)
Summary:
Ref https://github.com/pytorch/pytorch/issues/72263 for cpp_custom_type_hack removal

`qlinear_prepack` and `qlinear_unpack` were updated to use torchbind
and the `cpp_custom_type_hack` overloads marked with a deprecation
warning in https://github.com/pytorch/pytorch/issues/38101 which was in the PyTorch 1.6 release. So, we are
safe to break BC here.

The deprecation warning only appears in unpack, but since you can't use one
without the other, I think that's still okay.
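
The surviving torchbind path, sketched (assumes a quantized engine such as fbgemm is available):

```python
import torch

qweight = torch.quantize_per_tensor(torch.randn(4, 8), 0.1, 0, torch.qint8)
packed = torch.ops.quantized.linear_prepack(qweight, None)  # torchbind object,
                                                            # no cpp_custom_type_hack
w, b = torch.ops.quantized.linear_unpack(packed)
```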

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72680

Reviewed By: george-qi

Differential Revision: D35056994

Pulled By: jerryzh168

fbshipit-source-id: cc046b9fa00d0219a4510854204564f4ea23da4b
(cherry picked from commit 31abbf1142d86174a1980feced57e4c621b704d1)
2022-03-25 04:34:21 +00:00
Scott Wolchok
3466c1b690 [PyTorch][deploy] Work around missing libdl (#74705)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74705

As the comment says, libdl might not be separate because it may be subsumed into libc.

Test Plan:
1) existing tests
2) this is being sent out on top of platform010 migration for caffe2

Reviewed By: d4l3k, r-barnes

Differential Revision: D35117159

fbshipit-source-id: c4a6de7c3412db695509bd25d529658cdf785e3d
(cherry picked from commit 563919d4c5fd7a9cbdc03d24b1afc5b6a2c09cc8)
2022-03-25 03:59:44 +00:00
Jerry Zhang
eaae62fed9 Make args work in the uru10x10_to_trt_eval script (#74707)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74707

att

Test Plan:
```
buck run mode/dev-nosan -c fbcode.split-dwarf=true -c fbcode.platform=platform009 accelerators/workloads/models/uru10x10:uru_10x10_to_trt_eval -- -h
```

Reviewed By: 842974287

Differential Revision: D34088069

fbshipit-source-id: 5c89d25db6493e0f66f7e57aac24ed72196d0378
(cherry picked from commit d9d79f03e28d609a14ddc3e55b97c52b0e102438)
2022-03-25 03:52:47 +00:00
Oleg Khabinov
5079321b71 Fix issue with prim::Print() and torch::deploy (#74513)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74513

Reviewed By: d4l3k, houseroad

Differential Revision: D35035089

fbshipit-source-id: d67b98600c74e2ed16b4d80f52148cd64b9e6ca0
(cherry picked from commit 16caf865077e28be31b805f015b9a61962632c8f)
2022-03-25 03:14:34 +00:00
Jerry Zhang
b347b8c191 [quant][fx] Support some default ops in the native backend config (#74600)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74600

Following https://github.com/pytorch/pytorch/pull/74210, this PR adds support for some ops
using the DefaultNodeQuantizeHandler in the backend_config_dict definition for the pytorch native backend.

TODO: There are still a few ops we didn't handle with the backend_config_dict path: gelu and softmax. We need to discuss if we still need them; if so, we can change the test
to use backend_config_dict and remove the DefaultNodeQuantizeHandler after that.

Test Plan:
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: andrewor14

Differential Revision: D35071437

fbshipit-source-id: 70351d2810ca1ac7dc09d4a9c239f6757ccb51ca
(cherry picked from commit 5e68f755a32ba7d90d6c73db9c2017f9c58d7fa5)
2022-03-25 02:59:36 +00:00
Mengwei Liu
797fa26f60 [PyTorch] Only select root ops in codegen unboxing (#74663)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74663

In lightweight dispatch, we only need to register root ops. Unlike in the dispatcher world, the transitive closure of the operators doesn't need to go through the dispatcher or the op registry.

Test Plan: Rely on unit tests

Reviewed By: priyaramani

Differential Revision: D35104401

fbshipit-source-id: 1a2df571880ac3c8625985c01bd89a2bb9566af9
(cherry picked from commit 16207fa18e87908ec5e038a7f60f41893a236749)
2022-03-25 02:52:51 +00:00
Mengwei Liu
4d82e5bf44 [PyTorch] Avoid registering ops into dispatcher in lightweight dispatch (#74664)
Summary:
This change adds the following logic:

If lightweight dispatch is enabled, do not generate `TORCH_LIBRARY` API calls for operator schemas and implementations, since these operators will be registered into the JIT op registry.

`skip_dispatcher_op_registration` is an existing argument to `gen.py`. With that set, `RegisterDispatchKey.cpp` will not generate `m.def` and `m.impl` for each native function. This logic will be removed once we find a better way to skip op registration into dispatcher.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74664

Test Plan: Rely on unit tests for lightweight dispatch.

Reviewed By: priyaramani

Differential Revision: D34634300

Pulled By: larryliu0820

fbshipit-source-id: d87828f2c6c62f15024ce9e98823b09ee5a81336
(cherry picked from commit 3eb1c27547dea6accd9fa95496189f3699d91201)
2022-03-25 02:52:51 +00:00
Edward Z. Yang
51e7a3406c Fix formatting of scalar tensors (don't call item)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74376

Approved by: https://github.com/bdhirsh
2022-03-25 02:22:25 +00:00
Peter Bell
f86bb2d6e4 Implement _pad_circular in ATen
Closes #44459

This migrates the python implementation of `_pad_circular` to ATen and
removes the old C++ implementation that had diverged from python.

Note that `pad` can't actually use this until the
forward-compatibility period is over.
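
For context, the user-facing op whose implementation is migrating (a minimal example):

```python
import torch
import torch.nn.functional as F

x = torch.arange(9.0).reshape(1, 1, 3, 3)    # N, C, H, W
y = F.pad(x, (1, 1, 1, 1), mode="circular")  # wraps each spatial edge around
print(y.shape)                               # torch.Size([1, 1, 5, 5])
```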

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73410

Approved by: https://github.com/ezyang
2022-03-25 02:09:01 +00:00
Slava Kovalevskyi
f7317d3c51 Jinja2 for docs/cpp build set to version 3.0
Fixes https://github.com/pytorch/pytorch/issues/74684

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74718
Approved by: https://github.com/malfet
2022-03-24 23:39:26 +00:00
Han Qi
75d6cbe605 [4/5]Testing jit module in flatbuffer in Python. (#74387)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74387

Make temporary python bindings for flatbuffer to test ScriptModule save / load.

(Note: this ignores all push blocking failures!)

Test Plan: unittest

Reviewed By: iseeyuan

Differential Revision: D34968080

fbshipit-source-id: d23b16abda6e4b7ecf6b1198ed6e00908a3db903
(cherry picked from commit 5cbbc390c5f54146a1c469106ab4a6286c754325)
2022-03-24 23:29:47 +00:00
Jamie McCrindle
11894db9ea Add Python Version to Torch.Package metadata (#74610)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74610

Adding the python version to the exported package and reading it on import, as per this issue on github: https://github.com/pytorch/pytorch/issues/74068
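
A minimal sketch of the round trip; the `python_version()` accessor name is an assumption based on this change:

```python
from torch.package import PackageExporter, PackageImporter

with PackageExporter("pkg.pt") as pe:  # records the interpreter version in metadata
    pe.save_text("notes", "readme.txt", "hello")

pi = PackageImporter("pkg.pt")
print(pi.python_version())  # accessor name assumed from this change
```
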
ghstack-source-id: 152003088

Test Plan: CI Tests

Reviewed By: PaliC

Differential Revision: D35062709

fbshipit-source-id: 04091a1255a09b96255112a60d31df127c424193
(cherry picked from commit ed39fd54b8b20918dac89a2873ecccf06aafd724)
2022-03-24 22:48:25 +00:00
Slava Kovalevskyi
7f996b855c Jinja2 version pinned to 3.0.* (#74690)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/74684

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74690

Reviewed By: malfet

Differential Revision: D35119993

Pulled By: b0noI

fbshipit-source-id: f53b2643000e24662644fda8718a7c4e1bfaa273
(cherry picked from commit 6dfadffff864f1d57eaea088c6dae0b673496bd7)
2022-03-24 21:58:28 +00:00
Jeeja
13ebcf3723 Add support for backend to register reducer timer
Currently, by default, reducer timer registration
is expected for all backends; if the timer is not
registered, an assert is thrown in set_runtime_stats_and_log().

To allow registration of the reducer timer for other
backends, the timer registration is moved to another
file, decoupling the internal interface.

Signed-off-by: Jeeja <jeejakp@habana.ai>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71700
Approved by: https://github.com/rohan-varma
2022-03-24 21:52:27 +00:00
wayi1
5fbe8b1966 [Model Averaging] Make HierarchicalModelAverager a subclass of averagers.ModelAverager
Making `HierarchicalModelAverager` a subclass of `averagers.ModelAverager` is a preparation step for incorporating hierarchical SGD into `PostLocalSGDOptimizer`.

Proposal: https://github.com/pytorch/pytorch/issues/73382
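
A usage sketch, assuming an initialized process group whose world size matches the group sizes below:

```python
from collections import OrderedDict
import torch.distributed.algorithms.model_averaging.hierarchical_model_averager as hsgd

# Average every 4 steps within groups of 8 ranks, and every 16 steps
# within groups of 32 ranks.
averager = hsgd.HierarchicalModelAverager(
    period_group_size_dict=OrderedDict([(4, 8), (16, 32)]),
    warmup_steps=100,
)
# As a ModelAverager subclass, it can now be handed to PostLocalSGDOptimizer
# as its `averager` argument.
```
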
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74564
Approved by: https://github.com/rohan-varma
2022-03-24 21:52:00 +00:00
Pavithran Ramachandran
fc2cf3d26f Back out "Revert D34805092: Extend _save_for_mobile and _load_for_mobile to support flatbuffer format; Default format is pickle + Change buck targets to support only pickle and pickle + flatbuffer for migration" (#74594)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74594

Extending `_save_for_mobile` and `_load_for_mobile` to support the flatbuffer format, with an additional optional argument which defaults to pickle.

Adding a new binary target with suffix `_pickle_and_flatbuffer` to help migration.

The size test in D34909502 shows the size has regressed by ~40K, but after removing pickle and comparing lite_predictors we measure ~120K of savings, which we will achieve when deprecating pickle and moving to flatbuffer.

**BEFORE:**

```lang=mermaid
graph TD;
    torch_core-->torch_mobile_deserialize;

    torch_mobile_core-->torch_mobile_deserialize;

    jit_module_saving-->torch_core;
    jit_module_saving-->torch_mobile_core;

    torch_mobile_deserialize-->caffe2_serialize;
    torch_mobile_deserialize-->torch_mobile_module;

    caffe2_serialize-->miniz;

    flatbuffer_loader-->mobile_bytecode;
    flatbuffer_serializer-->mobile_bytecode;

    mobile_bytecode-->flatbuffer_2.0;

    flatbuffer_loader-->torch_mobile_module;
    flatbuffer_serializer-->torch_mobile_module;
```

**AFTER:**
```lang=mermaid
graph TD;
    torch_core-->torch_mobile_deserialize;

    torch_mobile_core-->torch_mobile_deserialize;

    jit_module_saving-->torch_core;
    jit_module_saving-->torch_mobile_core;

    torch_mobile_deserialize-->caffe2_serialize;
    torch_mobile_deserialize-->torch_mobile_module;

    caffe2_serialize-->miniz;

    flatbuffer_loader-->mobile_bytecode;
    flatbuffer_serializer-->mobile_bytecode;

    mobile_bytecode-->flatbuffer_2.0;

    torch_mobile_deserialize_pickle_and_flatbuffer-->|new| flatbuffer_loader;
    torch_mobile_deserialize_pickle_and_flatbuffer-->|new| torch_mobile_deserialize;
    torch_mobile_core_pickle_and_flatbuffer-->|new| torch_mobile_deserialize_pickle_and_flatbuffer;
    torch_core_pickle_and_flatbuffer-->|new| torch_mobile_deserialize_pickle_and_flatbuffer;

    jit_module_saving_pickle_and_flatbuffer-->|new| torch_core_pickle_and_flatbuffer;
    jit_module_saving_pickle_and_flatbuffer-->|new| torch_mobile_core_pickle_and_flatbuffer;

    flatbuffer_serializer-->torch_mobile_module;

    jit_module_saving_pickle_and_flatbuffer-->|new|jit_module_saving;
    jit_module_saving_pickle_and_flatbuffer-->|new|flatbuffer_serializer;

    flatbuffer_loader-->torch_mobile_module;
```

Original commit changeset: 780dfb6fd6ba

Original Phabricator Diff: D34805092 (284b2b7135)
ghstack-source-id: 152044801

(Note: this ignores all push blocking failures!)

Test Plan:
CI

```
~/fbsource/fbcode] cd ~/fbsource/fbcode/ && buck test -c fbcode.caffe2_enable_flatbuffer=1 //caffe2/test/cpp/jit:jit  -- FlatbufferTest.ExtraFiles
Parsing buck files: finished in 0.9 sec
Building: finished in 5.3 sec (100%) 12992/54304 jobs, 0/54304 updated
  Total time: 6.2 sec
More details at https://www.internalfb.com/intern/buck/build/2b387fff-f813-4cfa-b53f-eb2378630d4e
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: f93a84d6-e7ce-41a0-a97f-0ef3fa6d199d
Trace available for this run at /tmp/tpx-20220323-134108.766518-f93a84d6-e7ce-41a0-a97f-0ef3fa6d199d/trace.log
RemoteExecution session id: reSessionID-f93a84d6-e7ce-41a0-a97f-0ef3fa6d199d-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/4503599723101693
    ✓ ListingSuccess: caffe2/test/cpp/jit:jit : 486 tests discovered (19.122)
    ✓ Pass: caffe2/test/cpp/jit:jit - FlatbufferTest.ExtraFiles (0.187)
Summary
  Pass: 1
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/4503599723101693
```

Similar Build Deps Dags

```
[pavithran@devvm5216.vll0 /data/users/pavithran/fbsource] buck query 'allpaths(//xplat/caffe2:torch_mobile_all_ops_pickle_and_flatbuffer, //xplat/caffe2:torch_mobile_deserialize_pickle_and_flatbuffer)' --output-format dot-compact  | pastry
P486770901: https://www.internalfb.com/intern/paste/P486770901/

[pavithran@devvm5216.vll0 /data/users/pavithran/fbsource] buck query 'allpaths(//xplat/caffe2:torch_mobile_all_ops, //xplat/caffe2:torch_mobile_deserialize)' --output-format dot-compact  | pastry
P486771278: https://www.internalfb.com/intern/paste/P486771278/
```

pickle_and_flatbuffer: https://www.internalfb.com/intern/dgw/graph/?build_id=P486770901
pickle: https://www.internalfb.com/intern/dgw/graph/?build_id=P486771278

Reviewed By: iseeyuan

Differential Revision: D35067157

fbshipit-source-id: 9044259c17a2e0da79bd6aedb28efbdfd57e23e0
(cherry picked from commit f738069ec3a72e79da56172741d027de514e9e5f)
2022-03-24 21:51:05 +00:00
Jerry Zhang
d64e7634ff [quant] Remove assert for weight since it could be non-Tensor (#74365)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74365

att

Test Plan:
Meta-Internal tests with fx2trt

Imported from OSS

Reviewed By: andrewor14

Differential Revision: D34952754

fbshipit-source-id: 11d392a520c9ab7c9484c96841f2b39fbbbc3f80
(cherry picked from commit e8a2348d5c6f9717b010972819723affba37a0e4)
2022-03-24 21:27:53 +00:00
Ansha Yu
f2ca4341c9 [pyper] to + lengths_to_offsets (#73879)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73879

Fuse the following pattern:
```
%1994 : Tensor = aten::to(%getattr_78.1, %188, %189, %189) # <eval_with_key>.50:11:0
%1995 : Tensor = fb::lengths_to_offsets(%1994, %190) # /mnt/xarfuse/uid-1994
```

This pattern is applied after all the applicable clip_ranges+gather_ranges patterns

Additional context in https://fb.quip.com/DSCbAozMBwUi

Test Plan:
> ./caffe2/caffe2/fb/predictor/scripts/run_disagg_model_benchmarks.sh 321004917 27 /data/users/ansha/tmp/ads_tail sr_only

~0.007ms overall reduction in tail model runtime
(321004917_27 oemae_long_attr_win_2d_7d_aux_model)

**Local  (25 fused nodes)**
Before: 2.04ms/iter
      0.0112739 ms.   0.543996%. fb::lengths_to_offsets (31 nodes, out variant)
     0.00805597 ms.   0.388722%. static_runtime::to_maybe_copy_out (30 nodes, out variant)

After: 1.96256ms/iter
      0.0100853 ms.   0.498655%. fb::to_lengths_to_offsets (25 nodes, out variant)
     0.00328385 ms.   0.157536%. fb::lengths_to_offsets (6 nodes, out variant)
     0.00239722 ms.   0.115002%. static_runtime::to_maybe_copy_out (5 nodes, out variant)

**Local_RO  (43 fused nodes)**
Before: 0.11427ms/iter
      0.0110696 ms.    9.42255%. fb::lengths_to_offsets (43 nodes, out variant)
     0.00638323 ms.    5.43349%. static_runtime::to_maybe_copy_out (43 nodes, out variant)
After: 0.112098ms/iter
       0.014206 ms.    12.6795%. fb::to_lengths_to_offsets (43 nodes, out variant)

**Remote_RO (17 fused nodes)**
Before: 0.24ms/iter
      0.0534883 ms.    23.0586%. static_runtime::to_maybe_copy_out (136 nodes, out variant)
     0.00216992 ms.   0.935446%. fb::lengths_to_offsets (17 nodes, out variant)
After: 0.240225ms/iter
      0.0525392 ms.    23.2864%. static_runtime::to_maybe_copy_out (119 nodes, out variant)
     0.00265347 ms.    1.17607%. fb::to_lengths_to_offsets (17 nodes, out variant)

Remote_Other (3 fused nodes)
Not much effect.

Reviewed By: mikeiovine

Differential Revision: D34696255

fbshipit-source-id: a0dc4a8ff8f25a825f6dc371ec5e4b3b09740c29
(cherry picked from commit a49b482117ebd6dbabce81a7e790f9e59cbf26c1)
2022-03-24 21:20:05 +00:00
Tristan Rice
5b915e844c c10d: retry dns lookup failures (#74641)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74641

This makes DNS hostname lookup failures retryable, since in some environments such as Kubernetes they're not guaranteed to be resolvable until the job starts. Retrying eliminates the race condition.

This also fixes `sandcastle_skip_if` when used on the class instead of the method. Previously such classes wouldn't inherit from TestCase, so they just wouldn't run under buck at all.

Fixes https://github.com/pytorch/pytorch/issues/73682

Test Plan:
Added a unit test

```
buck test //caffe2/test/distributed:test_store
```

Reviewed By: aivanou

Differential Revision: D35092284

fbshipit-source-id: d40bf187e52c41f551e4fe41c536b2b0015588ee
(cherry picked from commit f8908309d8ee64c25ee466a6b4922f34f2b7618e)
2022-03-24 19:51:09 +00:00
Facebook Community Bot
d0adb5ff26 Automated submodule update: FBGEMM (#74633)
Summary:
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).

New submodule commit: ef22aabc8b

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74633

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Reviewed By: jianyuh

Differential Revision: D35088395

Pulled By: geyyer

fbshipit-source-id: cb6808719a545c318302ed2770b3c7fa459fe169
(cherry picked from commit dfba6a9441136ff8563e80e6a09c555ad6a3af5a)
2022-03-24 19:09:57 +00:00
Taylor Robie
2ecf743757 [Profiler] Pay for what you use (v2) (#74484)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74484

In my first attempt at this in December, I stamped out specializations using variadic templates. However, I'm able to get comparable performance using simple conditionals, since the branch is very predictable and AppendOnlyList::emplace_back has low enough overhead that multiple calls don't cause an issue.

This is also a chance to do some BE: rather than forcing ops and backend events to use the same fields (which in practice means setting a bunch of default values when reporting backend events), I just split them and use a variant.

Test Plan: The single threaded benchmark (with no extra options set) improved considerably from ~0.88 us to ~0.62 us. The stress test benchmark improved modestly from ~6.1 us to ~5.8 us. So the bottleneck for multi-threading is somewhere else, but doing less wasted work is still able to move the needle a little bit.

Reviewed By: swolchok

Differential Revision: D34779994

fbshipit-source-id: 392bc7c6f12797fa5e18777063aa21210d9d2067
(cherry picked from commit f0a49ff7be8aa65bab2f6952cc2e6306c1edc24b)
2022-03-24 18:43:08 +00:00
Shiyan Deng
3f164e0395 [reland] Process inputs and outputs in fx interpreter (#74637)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74637

Forgot to update the expect file in https://github.com/pytorch/pytorch/pull/74242. Reland to include changes in expect file.

Test Plan: unit test

Reviewed By: yinghai

Differential Revision: D35089989

fbshipit-source-id: 5e3ad9c696cf31cbc691d34fdb77eff26f92e38d
(cherry picked from commit 110ac12f5e2bcca7552d4b4691c7d98fafb21a57)
2022-03-24 18:32:57 +00:00
Jiaxu Zhu
7c1f3cc89e [quant] Populate FakeQuantize quant_min/quant_max to observer (#74581)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74581

As title: currently the quant_min/quant_max of the FakeQuantize are not populated to the observer. We plan to populate them when both are not None.

To do this we need to:
1. Remove the current default quant_min/quant_max values (0/255), as they're not universal across dtypes.
2. Move the upper bound/lower bound check before creating the observer.
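
A sketch of the intended behavior after this change (explicit override forwarded to the observer):

```python
import torch
from torch.ao.quantization import FakeQuantize, MovingAverageMinMaxObserver

fq = FakeQuantize(
    observer=MovingAverageMinMaxObserver,
    quant_min=0,
    quant_max=127,      # both bounds set, so they propagate to the observer
    dtype=torch.quint8,
)
fq(torch.randn(4))      # observer now tracks the overridden [0, 127] range
```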

Test Plan:
```
[jiaxuzhu@devvm3400.frc0 /data/users/jiaxuzhu/fbsource/fbcode] buck test mode/dev //caffe2/test:quantization -- --exact 'caffe2/test:quantization - test_quant_min_max_override (quantization.core.test_workflow_module.TestFakeQuantize)'
Parsing buck files: finished in 0.8 sec
Downloaded 0/2 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 9.5 sec (100%) 18535/84579 jobs, 2/84579 updated
  Total time: 10.3 sec
More details at https://www.internalfb.com/intern/buck/build/1cab97ef-0788-4d06-92ed-a828995e3bde
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 24be645e-eebc-45d6-8111-052ef1225fa0
Trace available for this run at /tmp/tpx-20220323-094106.724238-24be645e-eebc-45d6-8111-052ef1225fa0/trace.log
RemoteExecution session id: reSessionID-24be645e-eebc-45d6-8111-052ef1225fa0-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/5066549674998735
    ✓ ListingSuccess: caffe2/test:quantization : 483 tests discovered (20.179)
    ✓ Pass: caffe2/test:quantization - test_quant_min_max_override (quantization.core.test_workflow_module.TestFakeQuantize) (18.896)
Summary
  Pass: 1
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/5066549674998735
```

Reviewed By: jerryzh168

Differential Revision: D34971236

fbshipit-source-id: 4407fd03116a296053256b333f7ce6d28dcc9c42
(cherry picked from commit f6980bccea802f220cc5b6dfe1bf3a3a3eef0a34)
2022-03-24 18:23:40 +00:00
kstant0725
ff58899b5e Pull request to run CI for #72556 (#73404)
Summary:
This PR moves the Dockerfile conda dependencies into a requirements-ci.txt (and begins the requirements file for other parts of CI as well). Packages are listed alphabetically in requirements-ci.txt. Uncommented packages before the mkl package have been tested and confirmed to work on all platforms. Packages before mkl that broke at least one platform have been commented out. There appears to be some randomness in certain platforms not passing tests, so it might be good to run a number of runs of the same configuration to confirm whether it is indeed these commented-out packages that cause the errors.

Remaining work is to test all commented-out packages to ensure they work on all platforms. This will likely involve repeat runs of the same configurations to ensure it is indeed the packages that break the platforms and not random errors.

This PR makes progress on task https://github.com/pytorch/pytorch/issues/72556

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73404

Reviewed By: janeyx99

Differential Revision: D34730797

Pulled By: kstant0725

fbshipit-source-id: 3e4b171720fa33b604cebb9c6101d38ba11f2f8b
(cherry picked from commit 99cc445aadb95f92f6ef040f2d4b7c6c6d5b7f8b)
2022-03-24 18:04:08 +00:00