Commit Graph

13 Commits

Michael Carilli
0f0271e255 [RELAND2] Eager autocasting, out-of-place ops only (with MSVC 2017 fix) (#35102)
Summary:
This is the second reland attempt for https://github.com/pytorch/pytorch/pull/32140.

The first reland attempt https://github.com/pytorch/pytorch/pull/35011 failed due to a [small incompatible change](https://github.com/pytorch/pytorch/pull/35011#issuecomment-601754216) in recent master (`skipIfRocm` was removed from `test_data_parallel.py`).

The present PR restores `skipIfRocm`.

Description from first reland attempt https://github.com/pytorch/pytorch/pull/35011:

> https://github.com/pytorch/pytorch/pull/32140 was approved and merged, but [reverted](d0577e19f0) because it broke builds with versions of Visual Studio older than 15.8 that were not represented in public CI.  The build failures were caused by a [known VS bug](https://developercommunity.visualstudio.com/content/problem/27729/allow-function-with-internal-linkage-as-template-n.html), fixed in versions 15.8 and newer.
>
> The present PR reverts the revert (restoring https://github.com/pytorch/pytorch/pull/32140 's diffs) and adds a workaround to enable compilation with VS < 15.8.  The workaround isn't pretty, but it's guarded by macros such that it's only used when compiling with VS < 15.8.  All other builds compile with the same code/control flow as was merged in https://github.com/pytorch/pytorch/pull/32140.
>
> Original description of https://github.com/pytorch/pytorch/pull/32140:
> > Initial integration of eager autocasting, supporting out-of-place ops only for easier review.
> > Relevant issue/RFC: https://github.com/pytorch/pytorch/issues/25081
>
> > In-place ops and ops with user-supplied `out=...` can certainly be supported as well (my initial WIP https://github.com/pytorch/pytorch/issues/29552 handled many) but require substantially more complex special casing in the autocasting backend and tests. Support for these ops (much of which has already been written) will be broken into later PRs.
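For concreteness, a minimal sketch of the out-of-place vs. in-place/`out=` distinction described above (illustrative only; it assumes a CUDA device and uses `torch.cuda.amp.autocast` as the user-facing entry point, the name this feature eventually shipped under):

```python
import torch

# Illustrative sketch of the scope described above; not part of this commit.
a = torch.randn(4, 4, device="cuda")
b = torch.randn(4, 4, device="cuda")
c = torch.empty(4, 4, device="cuda")

with torch.cuda.amp.autocast():
    y = torch.mm(a, b)        # out-of-place op: eligible for autocasting in this PR
    a.mul_(2.0)               # in-place op: deferred to later PRs
    torch.mm(a, b, out=c)     # user-supplied out=: deferred to later PRs
```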
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35102

Differential Revision: D20596918

Pulled By: ezyang

fbshipit-source-id: 60caa279bb0ce4a9bb0b28c1d585d42cf1cc7e50
2020-03-24 09:08:04 -07:00
Mike Ruberry
fe276d541e Revert D20541921: [pytorch][PR] [RELAND] Eager autocasting, out-of-place ops only (with MSVC 2017 fix)
Test Plan: revert-hammer

Differential Revision: D20541921

Original commit changeset: abb5488dca86

fbshipit-source-id: d2c6038978f80e5429632f8b49107090a8a247f4
2020-03-19 22:39:12 -07:00
Michael Carilli
991b97277a [RELAND] Eager autocasting, out-of-place ops only (with MSVC 2017 fix) (#35011)
Summary:
https://github.com/pytorch/pytorch/pull/32140 was approved and merged, but [reverted](d0577e19f0) because it broke builds with versions of Visual Studio older than 15.8 that were not represented in public CI.  The build failures were caused by a [known VS bug](https://developercommunity.visualstudio.com/content/problem/27729/allow-function-with-internal-linkage-as-template-n.html), fixed in versions 15.8 and newer.

The present PR reverts the revert (restoring https://github.com/pytorch/pytorch/pull/32140 's diffs) and adds a workaround to enable compilation with VS < 15.8.  The workaround isn't pretty, but it's guarded by macros such that it's only used when compiling with VS < 15.8.  All other builds compile with the same code/control flow as was merged in https://github.com/pytorch/pytorch/pull/32140.

Original description of https://github.com/pytorch/pytorch/pull/32140:
> Initial integration of eager autocasting, supporting out-of-place ops only for easier review.
> Relevant issue/RFC: https://github.com/pytorch/pytorch/issues/25081

> In-place ops and ops with user-supplied `out=...` can certainly be supported as well (my initial WIP https://github.com/pytorch/pytorch/issues/29552 handled many) but require substantially more complex special casing in the autocasting backend and tests. Support for these ops (much of which has already been written) will be broken into later PRs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35011

Differential Revision: D20541921

Pulled By: ezyang

fbshipit-source-id: abb5488dca8620b0daac4306ebf2bb47fc36e4f5
2020-03-19 20:18:18 -07:00
Jeff Daily
35d9874a35 in test_data_parallel.py, remove skipIfRocm from tests that pass (#34978)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34978
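For context, a minimal sketch of the decorator pattern being removed from tests that now pass (the import path is an assumption about where the helper lives in PyTorch's shared test utilities):

```python
import unittest

# Import path is an assumption; skipIfRocm is a helper in PyTorch's shared
# test utilities.
from torch.testing._internal.common_utils import skipIfRocm


class TestDataParallel(unittest.TestCase):
    @skipIfRocm  # previously skipped this test on ROCm builds; dropped for tests that now pass
    def test_data_parallel_example(self):
        self.assertTrue(True)


if __name__ == "__main__":
    unittest.main()
```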

Differential Revision: D20535920

Pulled By: mrshenli

fbshipit-source-id: 3baa8608dd3b0dd5578bc32e56a2e6c1fe69492d
2020-03-19 09:16:43 -07:00
Edward Yang
d0577e19f0 Revert D20346700: [pytorch][PR] Eager autocasting, out-of-place ops only
Test Plan: revert-hammer

Differential Revision: D20346700

Original commit changeset: 12d77b391731

fbshipit-source-id: 108d72bf24232f443c0be293ec932c0c478d6a60
2020-03-18 11:42:51 -07:00
Michael Carilli
aaa8f02156 Eager autocasting, out-of-place ops only (#32140)
Summary:
Initial integration of eager autocasting, supporting out-of-place ops only for easier review.
Relevant issue/RFC: https://github.com/pytorch/pytorch/issues/25081

In-place ops and ops with user-supplied `out=...` can certainly be supported as well (my initial WIP https://github.com/pytorch/pytorch/pull/29552 handled many) but require substantially more complex special casing in the autocasting backend and tests.  Support for these ops (much of which has already been written) will be broken into later PRs.
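A minimal usage sketch of the feature being integrated (illustrative only; it assumes a CUDA device and the `torch.cuda.amp.autocast` context manager that this work eventually shipped as):

```python
import torch

# Illustrative sketch: run the forward pass under autocast so eligible
# out-of-place ops (matmuls, convolutions, ...) execute in float16 while
# precision-sensitive ops stay in float32.
model = torch.nn.Linear(16, 4).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = torch.randn(8, 16, device="cuda")
target = torch.randn(8, 4, device="cuda")

with torch.cuda.amp.autocast():
    output = model(data)
    loss = torch.nn.functional.mse_loss(output, target)

loss.backward()   # backward runs outside the autocast region
optimizer.step()
```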
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32140

Differential Revision: D20346700

Pulled By: ezyang

fbshipit-source-id: 12d77b3917310186fbddf11c59b2794dc859131f
2020-03-18 10:28:21 -07:00
Natalia Gimelshein
cd9d9a2235 fix handling of replica parameters in DataParallel (#33907)
Summary:
In DataParallel, replica parameters are not leaves (they are computed via broadcast from the master parameters) and should be treated as non-leaf tensors. Fixes https://github.com/pytorch/pytorch/issues/33552
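A minimal sketch of the distinction (illustrative only; assumes at least two CUDA devices and reflects the behavior with this fix in place):

```python
import torch
import torch.nn as nn

# replicate() broadcasts the master parameters onto each device, so a replica's
# "parameters" are outputs of an autograd op rather than leaf tensors.
module = nn.Linear(4, 4).cuda(0)
replicas = nn.parallel.replicate(module, [0, 1])

print(module.weight.is_leaf)       # True: the master parameter is a leaf
print(replicas[1].weight.is_leaf)  # False: a broadcast result, not a leaf
```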
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33907

Differential Revision: D20150199

Pulled By: ngimel

fbshipit-source-id: 5965d4115b6b3a8433063126ff6269567872fbeb
2020-03-10 10:35:44 -07:00
iurii zdebskyi
908eee5583 remove .data from test/distributed/ (#33874)
Summary:
`.data` calls are unsafe (mutations through `.data` bypass autograd's version tracking and can silently produce incorrect gradients) and should not be used.
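A short sketch of the kind of rewrite this implies (variable names are illustrative):

```python
import torch

p = torch.nn.Parameter(torch.randn(3))

# Before: mutating through .data hides the change from autograd's version
# tracking and can silently produce incorrect gradients.
# p.data.add_(1.0)

# After: state the intent explicitly.
with torch.no_grad():
    p.add_(1.0)          # mutation that autograd is deliberately told to ignore

snapshot = p.detach()    # read-only view that is explicitly cut from the graph
```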
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33874

Differential Revision: D20141059

Pulled By: izdeby

fbshipit-source-id: 8e11afc74f0cb04f5b18b458068fb813a6d51708
2020-02-27 13:14:29 -08:00
Pritam Damania
fd684cc312 Use torch.set_default_dtype in test_data_parallel and rename dtype2prec (#32962)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32962

As per gchanan's comments on https://github.com/pytorch/pytorch/pull/30445, I've used `torch.set_default_dtype` in test_data_parallel instead of specifying dtype=torch.double everywhere. Also renamed `dtype2prec` to `dtype2prec_DONTUSE`.
ghstack-source-id: 98388429
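A minimal sketch of the pattern (test names are illustrative):

```python
import unittest
import torch

# Illustrative sketch: switch the default dtype for the duration of the test
# class instead of passing dtype=torch.double to every tensor factory call.
class TestDataParallelDouble(unittest.TestCase):
    def setUp(self):
        self._prev_dtype = torch.get_default_dtype()
        torch.set_default_dtype(torch.double)

    def tearDown(self):
        torch.set_default_dtype(self._prev_dtype)

    def test_factories_default_to_double(self):
        self.assertEqual(torch.randn(2, 2).dtype, torch.float64)


if __name__ == "__main__":
    unittest.main()
```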

Test Plan: waitforbuildbot

Differential Revision: D19714374

fbshipit-source-id: eb55bbca33881625636ba9ea6dd4cb692f25668e
2020-02-15 14:07:54 -08:00
Peter Bell
efba630287 Issue a warning when zero_grad is used in DataParallel (#33064)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/31768; this is a second attempt at https://github.com/pytorch/pytorch/issues/32870

DataParallel creates replicas of the original `nn.Module` with the parameters duplicated onto the destination devices. Calling `backward` propagates gradients onto the original module's parameters, but calling `zero_grad` on the replica module doesn't clear the gradients from the parent module. However, any replica calling `backward` was broken anyway, since the replica's parameters are not leaf nodes in autograd. So, we should issue a warning.
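A minimal sketch of the pattern the warning targets (illustrative only; assumes at least two CUDA devices):

```python
import torch
import torch.nn as nn

# The replica's parameters are copies of the originals, so zero_grad() called
# on a replica never clears the gradients held by the parent module. After this
# PR, calling zero_grad() from inside a replicated forward issues a warning.
class SelfClearingModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 1)

    def forward(self, x):
        self.zero_grad()  # has no effect on the parent module's grads when replicated
        return self.linear(x)

model = nn.DataParallel(SelfClearingModule().cuda(), device_ids=[0, 1])
out = model(torch.randn(8, 4, device="cuda"))  # triggers the new warning
```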
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33064

Differential Revision: D19790178

Pulled By: albanD

fbshipit-source-id: 886f36640acef4834a6fa57a26ce16b42ff0e9ad
2020-02-10 07:04:27 -08:00
Richard Zou
8195961f20 Revert D19730209: [pytorch][PR] Issue a warning when using zero_grad in DataParallel
Test Plan: revert-hammer

Differential Revision: D19730209

Original commit changeset: cb9b2cb0c2e0

fbshipit-source-id: 5bf53ea3c37a7ed2411a2acc34e40d07eff144c9
2020-02-06 07:05:51 -08:00
Peter Bell
46c3c18bcc Issue a warning when using zero_grad in DataParallel (#32870)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/31768

`DataParallel` creates replicas of the original `nn.Module` with the parameters duplicated onto the destination devices. Calling `backward` propagates gradients onto the original module's parameters, but calling `zero_grad` on the replica module doesn't clear the gradients from the parent module,

~~breaking any model that uses `backward`-`zero_grad` in its `forward`. I fix this by patching the replica module so that `zero_grad` clears grads on the parent as well.~~

However, any replica calling `backward` was broken anyway, since the replica's parameters are not leaf nodes in autograd. So, we should raise a warning.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32870

Differential Revision: D19730209

Pulled By: ezyang

fbshipit-source-id: cb9b2cb0c2e0aca688ce0ff3e56b40fbd2aa3c66
2020-02-05 20:25:04 -08:00
Pritam Damania
f050b16dd9 Move pytorch distributed tests to separate folder for contbuild. (#30445)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30445

Create distributed and rpc directories under caffe/test for better management
of unit tests.

Differential Revision: D18702786

fbshipit-source-id: e9daeed0cfb846ef68806f6decfcb57c0e0e3606
2020-01-22 21:16:59 -08:00