Summary:
Reland of https://github.com/pytorch/pytorch/pull/72623, which was reverted because the TLS cleanup was removed.
From closer inspection of how the number of available keys is counted, I think there is one more than expected, since the guard is actually one past the last usable key. With this updated assert, the last usable key will still be <= 63, which fits just fine.
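As an illustration of the off-by-one reasoning only (the enum below is a hypothetical stand-in, not the real C++ DispatchKey):
```
# Illustration only: the guard value sits one past the last usable key, so
# asserting guard <= 64 still guarantees every usable key index is <= 63 and
# therefore fits in a 64-bit bitset.
import enum

class Key(enum.IntEnum):
    CPU = 0
    CUDA = 1
    NumKeys = 2   # guard: one past the last usable key

assert Key.NumKeys <= 64       # the updated assert described above
assert Key.NumKeys - 1 <= 63   # so the last usable key still fits
```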
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72832
Reviewed By: H-Huang
Differential Revision: D34228571
Pulled By: albanD
fbshipit-source-id: ce5e10a841ea87386727346cfc8d9327252574c4
(cherry picked from commit 59d3b86353)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72620
Clarify how LoggingTensor works with autograd.
The updated comment should cover the semantic changes.
Test Plan: Imported from OSS
Reviewed By: samdow
Differential Revision: D34214956
Pulled By: albanD
fbshipit-source-id: 730d0a68f4228d2a84758e6807d869a34cbc1b31
(cherry picked from commit 66110bf16b)
Summary:
A number of ROCm tests were skipped via the skipCUDAIfRocm flag.
The majority of these test cases are now supported on the ROCm platform. This fix enables all of the test_ops tests for ROCm and enables most operators in common_methods_invocations.py, except for the SpectralFuncInfo class, which still has some FFT issues.
Partially Fixes https://github.com/pytorch/pytorch/issues/51303
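Roughly, the skip pattern being removed looks like the hedged stand-in below, which uses plain `unittest` and `torch.version.hip` instead of PyTorch's internal `skipCUDAIfRocm` decorator; the test bodies are made up:
```
import unittest
import torch

IS_ROCM = torch.version.hip is not None

class TestExample(unittest.TestCase):
    # Before: skipped on ROCm builds (roughly what skipCUDAIfRocm does).
    @unittest.skipIf(IS_ROCM, "does not work on ROCm")
    def test_add_old(self):
        device = "cuda" if torch.cuda.is_available() else "cpu"
        torch.testing.assert_close(torch.ones(2, device=device) + 1,
                                   torch.full((2,), 2.0, device=device))

    # After this change: the decorator is simply dropped, so ROCm runs it too.
    def test_add_new(self):
        device = "cuda" if torch.cuda.is_available() else "cpu"
        torch.testing.assert_close(torch.ones(2, device=device) + 1,
                                   torch.full((2,), 2.0, device=device))
```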
cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH amathews-amd
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67706
Reviewed By: seemethere, janeyx99
Differential Revision: D34153457
Pulled By: malfet
fbshipit-source-id: 95f4420f306ca7580cd438d3b5cc0b24efbfae99
(cherry picked from commit 0d178fffd3)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72491
Retry of #71794; the base revision in that stack had been reverted.
Test Plan: Imported from OSS
Reviewed By: mikaylagawarecki
Differential Revision: D34062260
Pulled By: davidberard98
fbshipit-source-id: 40fbb2d2de3b10000645e25e7fe89f3ce929f0a2
(cherry picked from commit 917676f076)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72499
Pull Request resolved: https://github.com/pytorch/benchmark/pull/740
Move fx2trt out of tree to reduce the bloat of PyTorch core.
This is the first and major step. Next, we will move acc_tracer out of the tree and rearrange some FX passes.
Reviewed By: suo
Differential Revision: D34065866
fbshipit-source-id: c72b7ad752d0706abd9a63caeef48430e85ec56d
(cherry picked from commit 91647adbca)
Summary:
This PR ports the `index_copy` implementation to structured kernels and adds an `out` variant.
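For reference, a minimal usage sketch of the functional op and the new `out=` variant (shapes and values below are illustrative only):
```
import torch

x = torch.zeros(5, 3)
index = torch.tensor([0, 4, 2])
source = torch.arange(9, dtype=torch.float).reshape(3, 3)

y = torch.index_copy(x, 0, index, source)        # functional variant

out = torch.empty_like(x)
torch.index_copy(x, 0, index, source, out=out)   # new out= variant
torch.testing.assert_close(out, y)
```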
~Note to the reviewers: This is in draft mode, waiting for the tests from the CI, and I'll give a final look before requesting the review.~
Issue tracker: https://github.com/pytorch/pytorch/issues/55070
cc: bdhirsh ysiraichi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67329
Reviewed By: ejguan
Differential Revision: D34077219
Pulled By: bdhirsh
fbshipit-source-id: 6accda33957f654b753261c5c3d765a27a64d2c0
(cherry picked from commit f3ac83217a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71794
mvlgamma(inp, p) requires that all the elements of inp are > (p-1)/2.
The OpInfo test was occasionally producing inputs with elements == (p-1)/2, which would generate errors like:
```
ERROR: test_nnc_correctness_mvlgamma_mvlgamma_p_5_cpu_bfloat16 (__main__.TestNNCOpInfoCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/path/pytorch/torch/testing/_internal/common_device_type.py", line 381, in instantiated_test
raise rte
File "/path/pytorch/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
result = test(self, **param_kwargs)
File "/path/pytorch/torch/testing/_internal/common_device_type.py", line 753, in test_wrapper
return test(*args, **kwargs)
File "/path/pytorch/torch/testing/_internal/common_device_type.py", line 907, in only_fn
return fn(slf, *args, **kwargs)
File "/path/pytorch/test/test_jit_fuser_te.py", line 2293, in test_nnc_correctness
ref = variant(*clone_inputs((sample.input, *sample.args)), **sample.kwargs)
RuntimeError: All elements must be greater than (p-1)/2
```
repro example: https://gist.github.com/davidberard98/9da688e31cdfbaed7e990746b28a4ba2
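As a rough sketch (not the actual OpInfo sampling code; the 0.1 margin is an arbitrary illustration), the sampler needs to draw elements strictly above the bound:
```
import torch
from torch.testing import make_tensor

p = 5
low = (p - 1) / 2
# Sampling strictly above (p-1)/2 avoids the boundary case that triggers the error.
inp = make_tensor(3, 4, dtype=torch.float64, device="cpu", low=low + 0.1, high=low + 5)
out = torch.mvlgamma(inp, p)   # every element satisfies inp > (p-1)/2
```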
Test Plan: Imported from OSS
Reviewed By: qihqi
Differential Revision: D33780905
Pulled By: davidberard98
fbshipit-source-id: c9afd443bc90ce68f33b97498921b447e4f7d1d8
(cherry picked from commit a974b03f07)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70465
These tests check that:
(a) the result after NNC fusion (of a single op) is the same as the result of the
unfused op, and
(b) for certain ops where fusion is expected to occur, fusion does
actually occur.
A rough sketch of check (a) is included below.
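This is a minimal, hedged sketch rather than the actual test harness; the op, input size, and warm-up count are arbitrary choices:
```
import torch

def f(x):
    return torch.sigmoid(x) * x

x = torch.randn(64, 64)
eager = f(x)                 # unfused reference

scripted = torch.jit.script(f)
for _ in range(3):           # the profiling executor needs warm-up runs before fusing
    fused = scripted(x)

# (a) the (potentially) fused result matches the unfused result.
torch.testing.assert_close(fused, eager)
# (b) the real tests additionally inspect the optimized graph for
#     prim::TensorExprGroup nodes to confirm that fusion actually happened.
```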
Test Plan: Imported from OSS
Reviewed By: wenleix
Differential Revision: D33595240
Pulled By: davidberard98
fbshipit-source-id: e2e17a921bc30c313e92e8e5bbc6c1b5fcd14bc1
(cherry picked from commit b1ba221acc)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72348
**Overview**
#43307 changed `_test_accumulate_gradients_no_sync()` to add a `num_iters` argument. However, I think the change misconstrued the test logic slightly.
61ab04e1db/torch/testing/_internal/distributed/distributed_test.py (L4369-L4397)
- `iteration % num_iters == 0` evaluates to `True` only for `iteration == 0` since `iteration` comes from `for iteration in range(num_iters)`.
- IIUC, the intention is to alternate between accumulating gradients (using `no_sync()`) and synchronizing gradients normally. In the existing implementation, any iterations following the second one are non-productive since gradients are already in sync, meaning it reduces to testing normal DDP.
- This PR changes the check back to `iteration % 2 == 0` to restore the alternating behavior (see the sketch below).
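A minimal sketch of the alternating pattern the test is meant to exercise (the model, optimizer, and batches are hypothetical placeholders; `no_sync()` is the documented `DistributedDataParallel` context manager):
```
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel as DDP

def train_alternating(ddp_model: DDP, optimizer, batches):
    for iteration, (inp, target) in enumerate(batches):
        if iteration % 2 == 0:
            # Even iterations: accumulate gradients locally, no allreduce.
            with ddp_model.no_sync():
                F.mse_loss(ddp_model(inp), target).backward()
        else:
            # Odd iterations: normal backward; DDP allreduces the accumulated grads.
            F.mse_loss(ddp_model(inp), target).backward()
            optimizer.step()
            optimizer.zero_grad()
```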
Test Plan: Imported from OSS
Reviewed By: rohan-varma
Differential Revision: D34011559
Pulled By: awgu
fbshipit-source-id: 4ba771e45b28a343167a324462571e4b8e25ae72
(cherry picked from commit 8492a8b803)
Summary:
The rest of the tests in the CUDA test suite are skipped after GPU context corruption is encountered.
For tests decorated with `expectedFailure`, this creates the false impression that the entire test suite is passing.
Remedy this by suppressing the exception and printing a warning about the unexpected success if `should_stop_early` is true.
Also print a warning when this happens (to make attribution easier), as well as when the early-termination condition is first detected.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72016
Test Plan:
`python test_ops.py -v -k test_fn_fwgrad_bwgrad_gradient`
Before the change:
```
test_fn_fwgrad_bwgrad_gradient_cpu_complex128 (__main__.TestGradientsCPU) ... ok
test_fn_fwgrad_bwgrad_gradient_cpu_float64 (__main__.TestGradientsCPU) ... ok
test_fn_fwgrad_bwgrad_gradient_cuda_complex128 (__main__.TestGradientsCUDA) ... expected failure
----------------------------------------------------------------------
Ran 3 tests in 0.585s
OK (expected failures=1)
```
After the change:
```
test_fn_fwgrad_bwgrad_gradient_cpu_complex128 (__main__.TestGradientsCPU) ... ok
test_fn_fwgrad_bwgrad_gradient_cpu_float64 (__main__.TestGradientsCPU) ... ok
test_fn_fwgrad_bwgrad_gradient_cuda_complex128 (__main__.TestGradientsCUDA) ... /home/conda/miniconda3/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py:1670: UserWarning: TEST SUITE EARLY TERMINATION due to torch.cuda.synchronize() failed with CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
warn(f"TEST SUITE EARLY TERMINATION due to torch.cuda.synchronize() failed with {rte}")
/home/conda/miniconda3/lib/python3.9/site-packages/torch/testing/_internal/common_device_type.py:382: UserWarning: Suppressed expected failure that resulted in fatal error
warn("Suppressed expected failure that resulted in fatal error")
unexpected success
----------------------------------------------------------------------
Ran 3 tests in 0.595s
FAILED (unexpected successes=1)
```
And `stderr` from XML file contains requested info:
```
/home/conda/miniconda3/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py:1670: UserWarning: TEST SUITE EARLY TERMINATION due to torch.cuda.synchronize() failed with CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
warn(f"TEST SUITE EARLY TERMINATION due to torch.cuda.synchronize() failed with {rte}")
/home/conda/miniconda3/lib/python3.9/site-packages/torch/testing/_internal/common_device_type.py:382: UserWarning: Suppressed expected failure that resulted in fatal error
warn("Suppressed expected failure that resulted in fatal error")
```
Fixes https://github.com/pytorch/pytorch/issues/71973
Reviewed By: janeyx99, ngimel
Differential Revision: D33854287
Pulled By: malfet
fbshipit-source-id: dd0f5a4d2fcd21ebb7ee50ce4ec4914405a812d0
(cherry picked from commit 0c0baf3931)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69735
We want to build a prototype of Megatron-LM so that we can apply PT-D ops to models like transformers and other Meta flagship models.
The basic idea of Megatron-LM is as follows:
1. Col-wise sharding of linear weight. Perform the linear op for the first layer.
2. Perform a math op (optional), such as ReLU or GeLU. We use GeLU in our example unit test. The input is from step 1.
3. Row-wise sharding of linear weight. Perform the linear op for the second layer. The input is from step 2.
This saves the communication otherwise needed to concatenate the col-wise sharding results and to spread the input to different ranks for row-wise sharding. A single-process sketch of this math is included below.
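Below is a hedged, single-process illustration of the sharding math using plain tensors (not the PT-D sharding APIs; the names, shapes, and two-way split are made up):
```
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(4, 16, dtype=torch.double)    # input activations
w1 = torch.randn(32, 16, dtype=torch.double)  # first linear weight
w2 = torch.randn(16, 32, dtype=torch.double)  # second linear weight

# Step 1: col-wise shard w1 across two "ranks"; each shard computes half the outputs.
h_shards = [F.linear(x, w) for w in w1.chunk(2, dim=0)]

# Step 2: GeLU is applied independently per shard, so no communication is needed.
h_shards = [F.gelu(h) for h in h_shards]

# Step 3: row-wise shard w2; each rank produces a *partial* result that must be
# summed across ranks (this is what the PartialTensor abstraction represents).
partials = [F.linear(h, w) for h, w in zip(h_shards, w2.chunk(2, dim=1))]
out = sum(partials)                            # the merge/aggregate (reshard) step

# The sharded computation matches the unsharded reference.
ref = F.linear(F.gelu(F.linear(x, w1)), w2)
torch.testing.assert_close(out, ref)
```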
The changes are as follows:
1. Return a ShardedTensor for the col-wise sharding in the sharded_linear op.
2. Return a `PartialTensor` for the row-wise sharding in the sharded_linear op.
3. Leverage APIs already defined for `reshard` to merge/aggregate local results into a fully synced local result if needed.
4. Add helper function to create sharded tensor based on the local result.
5. Add a unit test to test the Megatron-LM idea mentioned above and compare with local ops, including the grad and optimizer so that we can ensure the correctness of the implementation.
6. Refactor the unit test of sharded linear to reflect the changes in the code.
ghstack-source-id: 148273049
Test Plan: Unit test + CI
Reviewed By: pritamdamania87
Differential Revision: D32978221
fbshipit-source-id: 565fc92e7807e19d53b0261f8ace3945bef69e3e
(cherry picked from commit 344abe7520)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70079
We defined a new concept named `PartialTensor`, which is an abstraction to represent Tensors that need aggregation across multiple devices and multiple processes.
We also defined an API `reshard_output` to reshard a `PartialTensor` to a `Tensor`, or a `ShardedTensor` to a `ShardedTensor`/`Tensor`. This is done via the class `ModuleResharder`, which acts as a wrapper around the original module plus a reshard in the final step.
The `reshard` logic is defined in each class (`ShardedTensor` and `PartialTensor`).
ghstack-source-id: 148273050
Test Plan: Unit test is in the next PR.
Reviewed By: pritamdamania87
Differential Revision: D33121037
fbshipit-source-id: 5f56617ea526b857c5b73df6e069697d428ec359
(cherry picked from commit 58b1457cbc)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72084
Make the FSDP folder public.
ghstack-source-id: 148173447
Test Plan: unit tests
Reviewed By: mrshenli
Differential Revision: D33903417
fbshipit-source-id: 7852a2adc4af09af48a5ffa52ebf210489f834d5
(cherry picked from commit bd06513cfe)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72141
We have many sharding components currently:
torch.distributed._sharded_tensor, torch.distributed._sharding_spec,
torch.distributed._sharded_optimizer and more coming.
As a result, this change organizes all of this under the `torch.distributed._shard`
package. For BC reasons, I'm still keeping the old packages and having them just
reference the new package.
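A rough sketch of the BC shim pattern being described (the file path and warning text are illustrative, not the exact code):
```
# torch/distributed/_sharded_tensor/__init__.py  (old location, kept for BC)
import warnings

warnings.warn(
    "torch.distributed._sharded_tensor has moved to "
    "torch.distributed._shard.sharded_tensor; please update your imports.",
    DeprecationWarning,
)
# Re-export everything from the new package so old imports keep working.
from torch.distributed._shard.sharded_tensor import *  # noqa: F401,F403
```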
ghstack-source-id: 148150861
Test Plan: waitforbuildbot
Reviewed By: fduwjj
Differential Revision: D33904585
fbshipit-source-id: 057e847eb7521b536a3ee4e0f94871aacc752062
(cherry picked from commit 29a70dd7af)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71427
This commit adds a lowering path for the LinearReLU modules
in static quantization mode. This includes torch.nn.qat.Linear,
torch.nn.intrinsic.LinearReLU, and torch.nn.intrinsic.qat.LinearReLU.
Future commits will add support for dynamic quantization and functional
LinearReLU.
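For context, here is a hedged sketch of the same fused LinearReLU module in the stable eager-mode quantization workflow (the commit itself targets the FX graph mode lowering path; the model and shapes below are arbitrary):
```
import torch
import torch.nn as nn
import torch.quantization as tq

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()
        self.linear = nn.Linear(8, 8)
        self.relu = nn.ReLU()
        self.dequant = tq.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.relu(self.linear(self.quant(x))))

m = M().eval()
m.qconfig = tq.get_default_qconfig("fbgemm")
fused = tq.fuse_modules(m, [["linear", "relu"]])  # Linear + ReLU -> nn.intrinsic.LinearReLU
prepared = tq.prepare(fused)
prepared(torch.randn(2, 8))                       # calibrate observers
quantized = tq.convert(prepared)                  # lowered to a fused quantized LinearReLU
```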
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_module
Imported from OSS
Reviewed By: george-qi
Differential Revision: D33694742
fbshipit-source-id: 19af11f82b1ad8ade0c307498971c29a3f776036
(cherry picked from commit b3f607de43)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71992
This reverts commit b7222e15b6.
We are conservatively reverting this because it broke a test in functorch.
The original PR added a `_max_pool1d_cpu` operator. I'm not sure if it
is actually safe to revert this due to the addition of the new operator
(someone may have serialized it between now and then) but because it has
only been two weeks this should be fine.
Test Plan: - wait for tests
Reviewed By: jbschlosser, VitalyFedyunin
Differential Revision: D33882918
Pulled By: zou3519
fbshipit-source-id: f146e82e6b46690376b3d8825dc7f7da62e2c7de
(cherry picked from commit 1606333e6c)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71742
We have many sharding components currently:
torch.distributed._sharded_tensor, torch.distributed._sharding_spec,
torch.distributed._sharded_optimizer and more coming.
As a result, this change organizes all of this under the `torch.distributed.shard`
package. For BC reasons, I'm still keeping the old packages and having them just
reference the new package.
ghstack-source-id: 147899768
Test Plan: waitforbuildbot
Reviewed By: fduwjj, wanchaol
Differential Revision: D33755913
fbshipit-source-id: dc692b31e2607063d55dfcb3db33ec53961d5a5b
(cherry picked from commit 5b6885f358)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67996
This is necessary for most matrix decompositions in `linalg`.
cc mruberry
Test Plan: Imported from OSS
Reviewed By: anjali411
Differential Revision: D33774418
Pulled By: mruberry
fbshipit-source-id: 576f2dda9d484808b4acf0621514c0ffe26834e6
(cherry picked from commit fb07c50aa9)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68522
Some OpInfos were inadvertently generating samples with a `grad_fn`, for
example when using functions like `transpose()` or `conj()` on the
inputs to generate transposed or conjugated inputs. This PR corrects
this and deactivates the tracking of gradients in all the sampling
functions.
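A minimal illustration of the problem and of generating samples without gradient tracking (not the actual OpInfo code):
```
import torch

base = torch.randn(3, 4, requires_grad=True)

# The problematic pattern: the transpose is recorded by autograd, so the
# generated sample already carries a grad_fn and is not a leaf tensor.
bad = base.transpose(0, 1)
assert bad.grad_fn is not None

# Generating the sample with gradient tracking disabled yields a clean leaf.
with torch.no_grad():
    good = base.transpose(0, 1)
assert good.grad_fn is None and good.is_leaf
good.requires_grad_(True)   # the sample itself can still require grad afterwards
```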
Test Plan: Imported from OSS
Reviewed By: anjali411
Differential Revision: D33774420
Pulled By: mruberry
fbshipit-source-id: da0e6189a2d67a2cb0fd458054558d36dbad9b61
(cherry picked from commit 42b0870774)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69909
This test detected a number of sampling methods that were not generating
the samples as expected, e.g. `index_put`, `cosine_embedding`, `stft`, but
perhaps most notably the generator for `BinOps`.
It also detected that `remainder` and `fmod` did not have the backward
formula for the second input implemented. I added this in the previous PR.
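As a rough illustration of what a backward formula for the second input enables (this is not the test from the PR; the values are chosen away from the discontinuities of `fmod` so that gradcheck's finite differences are well behaved):
```
import torch

x = torch.tensor([3.3, -2.7, 5.9], dtype=torch.double)
y = torch.tensor([1.7, 1.3, 2.1], dtype=torch.double, requires_grad=True)

# With the derivative w.r.t. the second input defined, gradcheck can verify it.
torch.autograd.gradcheck(lambda y: torch.fmod(x, y), (y,))
```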
Test Plan: Imported from OSS
Reviewed By: anjali411
Differential Revision: D33774422
Pulled By: mruberry
fbshipit-source-id: 76cfc75b1fdfd72ee64aa524665f83a75fe52509
(cherry picked from commit 13ea7b436b)