Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62006
Closes gh-24646, gh-24647
There is no `TensorIterator` equivalent to these kernels so this is just
migrating the existing kernels over to the ATen style.
I've benchmarked for contiguous tensors with this script:
```
import torch
shape = (10, 10, 100, 100)
x = torch.randn(*shape, device='cuda')
w = torch.randn((10, 1, 5, 5), device='cuda')
for _ in range(100):
    torch.nn.functional.conv2d(x, w, groups=10)
```
and similarly for backwards. I see these as the same to within measurement error.
| | Master (us) | This PR (us) |
|------------------:|:-------------------:|:--------------------:|
| Forward | 133.5 | 133.6 |
| Backward (input) | 1,102 | 1,119 |
| Backward (weight) | 2,220 | 2,217 |
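For reference, a hedged sketch of how the same workload could be timed with CUDA events, reusing `x` and `w` from the script above (the numbers here were presumably collected with a profiler, not this exact code):
```
start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
torch.cuda.synchronize()
start.record()
for _ in range(100):
    torch.nn.functional.conv2d(x, w, groups=10)
end.record()
torch.cuda.synchronize()
print(end.elapsed_time(start) / 100 * 1000, "us per iteration")  # elapsed_time is in ms
```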
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D29883676
Pulled By: ngimel
fbshipit-source-id: 9b2ac62cdd8a84e1a23ffcd66035b2b2fe2374d8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61892
This PR changes is_reference=True for linear to produce a pattern consisting of dequant - float linear - quant instead of a reference linear module. This is useful for future transformations to custom backends, and it also helps simplify the implementation of
convert in the future.
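The produced pattern corresponds to the following (a minimal sketch in eager terms; the actual transformation operates on the FX graph):
```
import torch

def reference_quantized_linear(x_q, float_linear, out_scale, out_zero_point):
    x = x_q.dequantize()               # dequant
    y = float_linear(x)                # float linear
    return torch.quantize_per_tensor(  # quant
        y, out_scale, out_zero_point, torch.quint8)
```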
Test Plan:
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D29810657
fbshipit-source-id: 949615bbc017bc454d81c8a6b2bdec53badaab19
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62213
Added sanity checks in preprocess function for Android NNAPI delegate.
`preprocess()` requires some input metadata passed through its `method_compile_spec` function argument.
`preprocess()` now throws specific error messages if it cannot find the correct input arguments.
Example error message:
```
RuntimeError: method_compile_spec does not contain the "forward" key.
method_compile_spec should contain a Tensor or Tensor List which bundles input parameters: shape, dtype, quantization, and dimorder.
For input shapes, use 0 for run/load time flexible input.
method_compile_spec must use the following format: {"forward": {"inputs": at::Tensor}} OR {"forward": {"inputs": c10::List<at::Tensor>}}
```
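For reference, a hedged sketch of input metadata that would satisfy these checks (shapes and dtype are illustrative):
```
import torch

# Per the error message, a 0 in a shape marks a run/load time flexible dim.
inputs = torch.zeros((0, 3, 224, 224), dtype=torch.float32)
method_compile_spec = {"forward": {"inputs": inputs}}
```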
nnapi_backend_preprocess.cpp: contains sanity check implementation
test_backend_nnapi.py: sanity check unit tests
TODO: Using Tensors to pass input parameters is a temporary hack. When a dedicated object is implemented, update the sanity check error message.
ghstack-source-id: 134339282
Test Plan: Ran `python test/test_jit.py TestNnapiBackend` in OSS successfully.
Reviewed By: raziel, iseeyuan
Differential Revision: D29917004
fbshipit-source-id: 0d5c6b35889c556cda905ffc29c25c5422ae9ee4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61871
When set_static_graph=False, the only type of dynamism we really
support in DDP is a dynamic set of unused parameters, which must be explicitly
enabled with find_unused_parameters=True. However, some workflows have a static
set of unused parameters; it would be good to detect this and add it to logging to
identify workflows that are candidates for static graph optimization.
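A minimal sketch of the detection idea (not DDP's actual implementation): record the set of unused parameter indices each iteration and check whether it ever changes.
```
seen_unused_sets = set()

def record_unused(unused_param_indices):
    seen_unused_sets.add(frozenset(unused_param_indices))

def unused_set_is_static():
    # A single distinct set across all iterations means the workflow is a
    # candidate for static graph optimization.
    return len(seen_unused_sets) <= 1
```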
ghstack-source-id: 134371429
Test Plan: CI
Reviewed By: zhaojuanmao
Differential Revision: D29773962
fbshipit-source-id: 1f741984c6e6f8e3e55cf69ca719b1e25a485b13
Summary:
Re-land of D29935444
We previously had lots of ops with implementations like this:
```
if (p_node->Output(0).isNone()) {
  p_node->Output(0) = create_empty_like(input_0);
}
...
auto& out = p_node->Output(0);
some_func_out(inputs, out);
```
This would make the output have the correct shape. But it would
also take the dtype of `input_0`, which is not always correct.
This change transforms these blocks to:
```
if (p_node->Output(0).isNone()) {
  p_node->Output(0) = some_func(inputs);
} else {
  ...
  auto& out = p_node->Output(0);
  some_func_out(inputs, out);
}
```
This gives the output the correct shape and dtype.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62267
Reviewed By: ejguan
Differential Revision: D29937253
Pulled By: malfet
fbshipit-source-id: d91ca5d5703490d7d349a1de2ad3bb09b0c33967
Summary:
This PR contains the initial version of `ModuleInfo` for use in testing modules. The design philosophy taken here is to start small and simple and build out / refactor as needed when more test coverage or `ModuleInfo` entries are added. As such, it's not intended for general usage yet. The PR contains the following:
* (new file) `torch/testing/_internal/common_modules.py`
* `ModuleInfo` definition - metadata for each module to use in testing
* `module_db` - the actual `ModuleInfo` database; currently contains entries for two modules
* `ModuleInput` - analogous to `SampleInput` from OpInfo; contains `FunctionInput`s for both constructor and forward pass inputs
* Constructor and forward pass inputs are tied together within a `ModuleInput` because they are likely correlated
* `FunctionInput` - just contains args and kwargs to pass to a function (is there a nicer way to do this?)
* `modules` decorator - analogous to `ops`; specifies a set of modules to run a test over
* Some constants used to keep track of all modules under torch.nn:
* `MODULE_NAMESPACES` - list of all namespaces containing modules
* `MODULE_CLASSES` - list of all module class objects
* `MODULE_CLASS_NAMES` - dict from module class object to nice name (e.g. torch.nn.Linear -> "nn.Linear")
* (new file) `test/test_modules.py`
* Uses the above to define tests over modules
* Currently, there is one test for demonstration, `test_forward`, which instantiates a module, runs its forward pass, and compares the output to a reference, if one is defined (a sketch follows below)
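For illustration, a hedged sketch of how these pieces compose in a test. The decorator and field names follow the description above; the exact `ModuleInfo` accessors (e.g. a `module_inputs` method and a `module_cls` attribute) are assumptions here:
```
from torch.testing._internal.common_utils import TestCase
from torch.testing._internal.common_modules import module_db, modules

class TestModule(TestCase):
    @modules(module_db)
    def test_forward(self, device, dtype, module_info):
        for module_input in module_info.module_inputs(device=device, dtype=dtype):
            ctor = module_input.constructor_input  # FunctionInput for __init__
            fwd = module_input.forward_input       # FunctionInput for forward
            m = module_info.module_cls(*ctor.args, **ctor.kwargs).to(device, dtype)
            out = m(*fwd.args, **fwd.kwargs)
```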
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61935
Reviewed By: mruberry
Differential Revision: D29881832
Pulled By: jbschlosser
fbshipit-source-id: cc05c7d85f190a3aa42d55d4c8b01847d1efd57f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61907
Removing the code for the faulty process group agent since it was replaced by the faulty TensorPipe agent.
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision: D29794666
Pulled By: H-Huang
fbshipit-source-id: 0b35191cc07220b6774ecacc8d004f25fd2e87f0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61859
BC-breaking note:
Previously we did not add an observer/fake_quant for the output of add/mul for tensor-scalar operations.
In this PR we add an observer/fake_quant instance (the same one as the input's) to correctly model
the behavior of the quantized add_scalar and mul_scalar ops (since quantized add/mul scalar assumes the
output quantized tensor has the same quantization parameters as the input).
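Concretely, the behavior being modeled (a small sketch; per the note above, the output is assumed to carry the input's quantization parameters):
```
import torch

x = torch.quantize_per_tensor(torch.randn(4), scale=0.1, zero_point=0,
                              dtype=torch.quint8)
y = torch.ops.quantized.add_scalar(x, 3.0)
# The new observer on the add/mul output mirrors the input's observer.
print(y.q_scale(), y.q_zero_point())
```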
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_add
python test/test_quantization.py TestQuantizeFxOps.test_mul
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D29770859
fbshipit-source-id: f43fcbfecd04c392467770b22c481bbbdaf43c25
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62079
Adds support for kwarg arguments to the functional optimizer running as a
hook.
ghstack-source-id: 134330379
Test Plan: CI
Reviewed By: SciPioneer
Differential Revision: D29838127
fbshipit-source-id: 2ab051ef5f0dff19c145ebe2260668b927ba47b2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62078
Ensure that kwarg arguments such as momentum and weight decay maintain
parity between optimizer.step and step_param.
ghstack-source-id: 134330377
Test Plan: CI
Reviewed By: SciPioneer
Differential Revision: D29837942
fbshipit-source-id: 1ae39648fc26aebd8aaef1a7ac0e03b598a8ed60
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61756
DDP will support running the optimizer as a communication hook with
optimizers that support a per-parameter/gradient step function, `step_param`.
This adds parity tests, to be extended as more optimizers implementing step_param
are added, to ensure parity with regular optimizers.
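A hedged sketch of the `step_param` shape (illustrative signature; the real functional optimizers live under torch.distributed.optim):
```
import torch

class FunctionalSGD:
    def __init__(self, lr=0.01):
        self.lr = lr

    def step_param(self, param, grad):
        # Update a single parameter as soon as its (communicated) gradient
        # is ready, instead of stepping the whole parameter list at once.
        with torch.no_grad():
            param.add_(grad, alpha=-self.lr)
```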
ghstack-source-id: 134330378
Test Plan: CI
Reviewed By: SciPioneer
Differential Revision: D29727549
fbshipit-source-id: 18977c896f12b8e478298488b298fd107affcf5f
Summary:
Increase `page_idx` in the loop rather than outside of it.
Break from the loop when receiving an empty response, as it means there are no more items to fetch via the pagination request.
Also, add an option to use a provided GitHub token (via the `GITHUB_TOKEN` environment variable).
Fixes failures with "Rate Limit Exceeded" when doing something like `torch.hub.list("pytorch/test-infra:dsf")`.
Fixes https://github.com/pytorch/pytorch/issues/61755
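The corrected loop shape, as a minimal sketch (hypothetical helper; the real code is in torch/hub.py):
```
import os, json
from urllib.request import Request, urlopen

def fetch_all_pages(base_url):
    token = os.environ.get("GITHUB_TOKEN")
    headers = {"Authorization": f"token {token}"} if token else {}
    results, page_idx = [], 1
    while True:
        with urlopen(Request(f"{base_url}?page={page_idx}", headers=headers)) as r:
            page = json.loads(r.read())
        if not page:       # empty response: nothing more to paginate
            break
        results.extend(page)
        page_idx += 1      # incremented inside the loop
    return results
```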
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62072
Reviewed By: jbschlosser
Differential Revision: D29868539
Pulled By: malfet
fbshipit-source-id: 206082a0ba1208e9b15ff6c9c6cb71d2da74f1c3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61933
### Issue:
Submodules with the same name are not serialized correctly in bytecode format when using `_save_for_mobile`. These submodules are not distinguished as different modules, even though they have different forward, setstate, etc., if they share a name.
### Fix:
The Mangler creates unique names so that modules and submodules that share a name can be uniquely identified while saving the module. iseeyuan rightly pointed out the underlying issue: the mangler is not used in the process of saving bytecode, and hence unique references for the submodules are not created. Please refer to the notebook to repro the issue: N777224
### Diff:
The fix described above is implemented. The mangled names are used in bytecode, so the files in the `code/` directory now have the right references in `bytecode.pkl`.
Will this have backward compatibility?
iseeyuan please feel free to correct or update this.
Yes. This fix impacts only modules with same-name submodules, which were not serialized correctly before. Existing modules should have correct references, and `_load_for_mobile` must not see any change. To confirm this, the existing test cases need to pass for the diff to be approved and shipped.
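A hedged sketch of the kind of module that hits this (illustrative; the actual repro is in N777224): two submodules whose classes share the name `Sub`.
```
import torch

def make_sub(offset):
    class Sub(torch.nn.Module):   # both classes are named "Sub"
        def __init__(self):
            super().__init__()
            self.offset = offset
        def forward(self, x):
            return x + self.offset
    return Sub()

class Top(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.a = make_sub(1)
        self.b = make_sub(2)
    def forward(self, x):
        return self.b(self.a(x))

scripted = torch.jit.script(Top())
# Before this fix, saving via _save_for_mobile could conflate the two "Sub"
# types in bytecode; mangled names keep them distinct.
```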
ghstack-source-id: 134242696
Test Plan:
```
~/fbsource/fbcode > buck test caffe2/test/cpp/jit:jit -- BackendTest.TestCompositeWithSetStates
Downloaded 0/5 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 19.2 sec (100%) 17619/17619 jobs, 3/17619 updated
Total time: 19.5 sec
More details at https://www.internalfb.com/intern/buck/build/91542d50-25f2-434d-9e1a-b93117f4efe1
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: de9e27cf-4c6c-4980-8bc5-b830b7c9c534
Trace available for this run at /tmp/tpx-20210719-161607.659665/trace.log
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/844425127206388
✓ ListingSuccess: caffe2/test/cpp/jit:jit - main (8.140)
✓ Pass: caffe2/test/cpp/jit:jit - BackendTest.TestCompositeWithSetStates (0.528)
Summary
Pass: 1
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/844425127206388
```
```
~/fbsource/fbcode > buck test caffe2/test/cpp/jit:jit -- BackendTest.TestConsistencyOfCompositeWithSetStates
Building: finished in 4.7 sec (100%) 6787/6787 jobs, 0/6787 updated
Total time: 5.0 sec
More details at https://www.internalfb.com/intern/buck/build/63d6d871-1dd9-4c72-a63b-ed91900c4dc9
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 81023cd2-c1a2-498b-81b8-86383d73d23b
Trace available for this run at /tmp/tpx-20210722-160818.436635/trace.log
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/8725724325952153
✓ ListingSuccess: caffe2/test/cpp/jit:jit - main (7.867)
✓ Pass: caffe2/test/cpp/jit:jit - BackendTest.TestConsistencyOfCompositeWithSetStates (0.607)
Summary
Pass: 1
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/8725724325952153
```
To check the `bytecode.pkl` using the module inspector, please check:
N1007089
Reviewed By: iseeyuan
Differential Revision: D29669831
fbshipit-source-id: 504dfcb5f7446be5e1c9bd31f0bd9c986ce1a647
Summary:
CI built the documentation for the recent 1.9.0rc1 tag but left the git version in the `version` string, so (as of now) going to https://pytorch.org/docs/1.9.0/index.html and looking at the version in the upper-left corner shows "1.9.0a0+git5f0bbb3", not "1.9.0". This PR should change that to cut off everything after and including the "a".
It should be cherry-picked to the release/1.9 branch so that the next rc will override the current documentation with a "cleaner" version.
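The gist of the change, as a one-liner sketch (the actual edit is in the docs build scripts):
```
version = "1.9.0a0+git5f0bbb3"
version = version.split("a")[0]   # cut off everything after and including "a"
# -> "1.9.0"
```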
brianjo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58486
Reviewed By: zou3519
Differential Revision: D28640476
Pulled By: malfet
fbshipit-source-id: 9fd1063f4a2bc90fa8c1d12666e8c0de3d324b5c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62202
Adds the acc_ops.flatten converter. Also migrates to the OSS acc tracer for the TRT interpreter.
Test Plan: unit test
Reviewed By: khabinov
Differential Revision: D29861555
fbshipit-source-id: dac88a703fdbf386f3f7fb27674a67951f3add49
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61791
During inlining we attach an InlinedCallStack to the nodes being inlined. In
the process we attach module information as well, such that if a
CallMethod is being inlined we know which class instance and class type
the method belongs to. However, CallMethod can be calling a method of
the same object to which the graph belongs, e.g.:
```
def forward(self, input):
    x = input + 10
    return self.forward_impl_(x, input)
```
Here forward_impl_ is a method defined on the same class in which forward
is defined. The existing module hierarchy annotation will mislabel this as an
unknown instance, since the method is not associated with the output of a
GetAttr node (it would be if we had called self.conv.forward_impl_, for
example).
The change in this PR reconciles this by creating a placeholder name "SELF"
for the module instance, indicating that you can traverse the InlinedCallStack
backwards to find the first node with name != SELF, which would be the name
of the object.
e.g.:
TOP(ResNet)::forward.SELF(ResNet)::_forward_impl.layer1(Sequential)::forward.0(BasicBlock)::forward.conv1(Conv2d)::forward.SELF(Conv2d)::_conv_forward
Test Plan:
Add test
Imported from OSS
Reviewed By: larryliu0820
Differential Revision: D29745443
fbshipit-source-id: 1525e41df53913341c4c36a56772454782a0ba93
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62192
This support is hacky because it doesn't preserve meta tensor storage
sharing (e.g., if you serialize a model with shared storage, such as a
tensor and a view on that tensor, the viewing relationship is broken on
deserialization and they become different tensors). The
hack is also durable, in the sense that we will be on the hook for
supporting `_rebuild_meta_tensor_no_storage` in perpetuity in the
future, even if we change our mind about the serialization format.
This unblocks an FB production use case. I didn't add C++ support to minimize
blast area of this patch.
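A quick illustration of the newly supported path (note the caveat above about lost view relationships):
```
import io
import torch

t = torch.empty(4, 4, device="meta")
buf = io.BytesIO()
torch.save(t, buf)
buf.seek(0)
loaded = torch.load(buf)
assert loaded.is_meta and loaded.shape == (4, 4)
```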
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: zou3519
Differential Revision: D29910535
Pulled By: ezyang
fbshipit-source-id: d98dcdd0108dfc3ae730a071d3c583b6d0281d21
Summary:
This PR disables the `cppcoreguidelines-non-private-member-variables-in-classes` check. PyTorch makes use of `protected` members throughout the codebase, and we do not want to perform this clang-tidy check in CI, to improve the signal-to-noise ratio.
Relevant failure: https://github.com/pytorch/pytorch/pull/61871/checks?check_run_id=3146453417
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62212
Reviewed By: driazati
Differential Revision: D29917882
Pulled By: 1ntEgr8
fbshipit-source-id: f607c3d050a122e95136f9915060c4cda6694c9d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62210
Fixes #62204
Test Plan: On #62211, clang-tidy should only error on the added lines (and not on context/removals)
Reviewed By: driazati
Differential Revision: D29917897
Pulled By: 1ntEgr8
fbshipit-source-id: de91dbf34c1ad8405507cad91ab3dd0d6c61d82e
Summary:
This PR enables installing our custom MacOS clang-tidy binaries. It also updates related documentation.
The binaries are produced by [this CI job](https://github.com/pytorch/test-infra/blob/master/.github/workflows/clang-tidy-macos.yml), and are published to S3.
This PR does not handle versioning of the downloaded binaries as this is being worked on separately. See https://github.com/pytorch/test-infra/issues/73
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62214
Test Plan:
On a MacOS machine, run
```bash
python3 -m tools.linter.install.clang_tidy
.clang-tidy-bin/clang-tidy --checks="*" --list-checks | grep "misc-max-tokens"
```
Reviewed By: janeyx99, mruberry
Differential Revision: D29917728
Pulled By: 1ntEgr8
fbshipit-source-id: 98d0d8b7a57bdebf0ebcdc83228ef391e8c6629e
Summary:
We previously had lots of ops with implementations like this:
```
if (p_node->Output(0).isNone()) {
  p_node->Output(0) = create_empty_like(input_0);
}
...
auto& out = p_node->Output(0);
some_func_out(inputs, out);
```
This would make the output have the correct shape. But it would
also take the dtype of `input_0`, which is not always correct.
This change transforms these blocks to:
```
if (p_node->Output(0).isNone()) {
  p_node->Output(0) = some_func(inputs);
} else {
  ...
  auto& out = p_node->Output(0);
  some_func_out(inputs, out);
}
```
This gives the output the correct shape and dtype.
Test Plan: `buck test //caffe2/benchmarks/static_runtime:static_runtime_cpptest`
Reviewed By: hlu1
Differential Revision: D29887367
fbshipit-source-id: cef04bfa52ec082ad3a9a32aa27c44e275c6b24c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61458
Context
-------
functorch is unable to vmap(grad(f)) when f contains a .to
call. This is because .to (when it is not a no-op) decomposes
to .copy_ under grad and the .copy_ is not compatible with vmap.
Fix
---
The fix for this is to have all Tensor::to variants call a new operator,
`_to_copy`, that always copies and is a primitive w.r.t. autograd so
that autograd decomposes Tensor::to into a call to `_to_copy`.
(This is related to https://github.com/pytorch/pytorch/issues/60956,
please let me know if you want to bikeshed the naming).
In order to get this done I had to do a bit of refactoring. All of the
`::to` implementations now call `to_impl` which may call `_to_copy`.
Autograd codegen changes
------------------------
The second thing I had to do was modify the autograd codegen. Right now,
autograd assumes that every output is either statically known to be
differentiable or not differentiable at codegen time. `_to_copy` is a
little special because its differentiability depends on the output
dtype, e.g. `torch.randn(3, requires_grad=True).to(torch.long)` is
non-differentiable. To get this to work:
- I changed how `output_differentiability` in derivatives.yaml works.
- output_differentiability can now accept "conditions" for each of the
output arguments. A "condition" is some C++ code.
- We currently only support `output_differentiability` with conditions
if there is a single output. This is for convenience and can be changed
in the future.
- I added a new `output_differentiability_conditions` field to
DifferentiabilityInfo. This gets populated in load_derivatives.py.
- forward-mode and reverse-mode AD take
`output_differentiability_conditions` into account.
Here's what the generated code for `VariableType::_to_copy`
[looks
like](https://gist.github.com/zou3519/93462df4bda1837acee345205b7cc849)
No other autogenerated code gets modified by this PR.
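For concreteness, the dtype-dependent differentiability from the Python side:
```
import torch

x = torch.randn(3, requires_grad=True)
print(x.to(torch.float64).requires_grad)  # True: floating-point output
print(x.to(torch.long).requires_grad)     # False: integral output
```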
Performance benchmarking
------------------------
- I benchmarked [three
cases that demonstrate overhead](https://gist.github.com/zou3519/5b6985e6906b80eec5a0dd94ed5b6a1a).
- Case A: No-op .to(). Instruction count went from 50223 to 25623. I
have no clue why, but this is a good thing.
- Case B: not-no-op .to(). Instruction count went from 665291 to 671961.
This is expected; `_to_copy` adds an additional dispatch.
- Case C: not-no-op .to() forward pass and backward pass. Instruction count
went from 4022841 to 4030057. This PR adds
an additional dispatch to .to() (so there should be one additional
dispatch in the forward pass) so this number looks reasonable.
Test Plan
---------
- test_torch.py has a test_to
- test_cuda.py has test_to*
- test_autograd has tests (test_type_conversions) that exercise the
reverse-mode path
- test_ops.py has some tests (like log_softmax) that exercise the
reverse-mode and forward-mode AD path.
- test_quantization, test_namedtensor all exercise tensor.to as well.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D29801652
Pulled By: zou3519
fbshipit-source-id: bb01eb1acf3d79d84f284150d1be4be3b4ace351
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62103
This PR adds a squid proxy deployed dedicated to PyTorch CI. Initially we only roll it out to GHA; if things are OK, we will extend this to CircleCI tests as necessary.
`http_proxy` and `https_proxy` are compatible with the following HTTP clients (see the sketch after this list):
- curl
- wget
- python
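All three pick the proxy up from the standard environment variables (a sketch with a hypothetical proxy address):
```
import os
from urllib.request import getproxies

os.environ["http_proxy"] = "http://squid-proxy.internal:3128"   # hypothetical address
os.environ["https_proxy"] = "http://squid-proxy.internal:3128"  # hypothetical address
print(getproxies())  # urllib honors these by default, as do curl and wget
```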
Existing cache policy:
```
refresh_pattern -i \.(7z|deb|rpm|exe|zip|tar|tgz|gz|ram|rar|bin|tiff|bz2|run|csv|sh)$ 1440 80% 2880
```
It uses the standard squid refresh_pattern to cache requests. In our setup, we
cache for at least 1440 minutes (1 day) and at most 2880 minutes (2 days), with
a last-modified factor of 80% ([squid doc](http://www.squid-cache.org/Doc/config/refresh_pattern/)). Please refer to [pytorch/test-infra](https://github.com/pytorch/test-infra/tree/master/aws/websites/squid-proxy) for details.
Right now, it only applies to the `build` and `test` steps, to limit the scope and make sure build and test are more reliable with egress caching.
Test Plan: Imported from OSS
Reviewed By: jbschlosser, malfet, seemethere, janeyx99
Differential Revision: D29892919
Pulled By: zhouzhuojie
fbshipit-source-id: ac17227f2553ca62881711b3e9943488dfd8defd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61959
We no longer need to cache the input pointer, as XNNPACK has implemented a more robust approach where the indirection buffer does not need to be recalculated even if the activation tensor pointer changes, as long as the tensor dimensions are the same.
This reverses the changes in https://github.com/pytorch/pytorch/pull/42840/files
Reviewed By: kimishpatel
Differential Revision: D29777605
fbshipit-source-id: c1750538c17bce34f885c6f1bbb1f7164ebba25b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62177
Reland of https://github.com/pytorch/pytorch/pull/61678
Fixes the CI failure by gating inclusion of the torchvision model on whether torchvision is available.
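The gating follows the usual optional-dependency pattern (a sketch; the exact test file may differ):
```
import unittest

try:
    import torchvision
    HAS_TORCHVISION = True
except ImportError:
    HAS_TORCHVISION = False

skipIfNoTorchVision = unittest.skipIf(not HAS_TORCHVISION, "no torchvision")
```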
ghstack-source-id: 134282165
Test Plan: CI
Reviewed By: SciPioneer
Differential Revision: D29904101
fbshipit-source-id: 47e799eb4a90acbbda91c5857ea00de3045d49f5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62135
The initial implementation of Adam with Smart Decay had an off-by-one error, in the summation of the geometric series used to calculate how much built-up momentum would have been discharged in skipped minibatches.
The unit tests should have caught this, but the testing strategy missed it because k, the number of skipped minibatches, was always either 0 or so high that the impact of the bug was too small; the impact of the bug is proportional to 1/k. The testing strategy has been adjusted to cover this bug.
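For intuition, the flavor of the summation involved (purely illustrative; not the actual kernel code): discharging k skipped steps of momentum with decay beta sums a geometric series, and an off-by-one in the number of terms produces a relative error on the order of 1/k.
```
def discharged_momentum(beta, k):
    # Sum of beta + beta**2 + ... + beta**k: exactly k terms.
    # Summing k + 1 terms instead is the off-by-one described above.
    return beta * (1 - beta**k) / (1 - beta)
```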
Differential Revision: D29889309
fbshipit-source-id: b086c0efed5c27f621061e726533c73658daffc6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62197
For build configs with ATEN_CPU_STATIC_DISPATCH defined, quantization tests will fail since they
require QuantizedCPU dispatch to be enabled.
This will fix some internal test failures like https://www.internalfb.com/intern/test/844424941811803?ref_report_id=0 which are run under the `caffe2_aten_cpu_inference` project
Test Plan:
buck test mode/dev //caffe2/aten:quantized_test
Imported from OSS
Reviewed By: bdhirsh
Differential Revision: D29912742
fbshipit-source-id: b117eb9f4afb51e0d0dd52fbe9d5c5be7dfafe85
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61796
We can easily handle NNAPI conversion for NHWC inputs
that have 1 channel, or whose H and W are 1.
Test Plan:
pytest test/test_nnapi.py::TestNNAPI::test_flatten
Imported from OSS
Reviewed By: saketh-are
Differential Revision: D29827735
fbshipit-source-id: 65dee4b42fceef1b032bf5dd1c4cc6e020d01e14
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62109
The `size` parameter only worked correctly for an *args-like invocation
(`10, 20`), and not for a list (`[10, 20]`) or a tuple (`(10, 20)`). This PR ensures it
works like `torch.empty`.
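For reference, `torch.empty`'s size handling, which this change matches:
```
import torch

a = torch.empty(10, 20)     # *args
b = torch.empty([10, 20])   # list
c = torch.empty((10, 20))   # tuple
assert a.shape == b.shape == c.shape == torch.Size([10, 20])
```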
ghstack-source-id: 134246166
Test Plan:
1) unit tests
2) waitforbuildbot
Reviewed By: SciPioneer
Differential Revision: D29884768
fbshipit-source-id: 7a4a3c5ed5d7c081344f6ead3170905b97fc652d