pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Anthony Barbier	3f34d26040	Add __main__ guards to distributed tests (#154628 ) This is the first PR of a series in an attempt to re-submit #134592 as smaller PRs. In distributed tests: - Ensure all files which should call run_tests do call run_tests. - Raise a RuntimeError on tests which have been disabled (not run) - Remove any remaining uses of "unittest.main()"" Cc @wconstab @clee2000 Pull Request resolved: https://github.com/pytorch/pytorch/pull/154628 Approved by: https://github.com/Skylion007	2025-06-04 14:39:57 +00:00
cyy	b0dfd242fa	Remove NO_MULTIPROCESSING_SPAWN checks (#146705 ) py 3.9 has spawn. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146705 Approved by: https://github.com/colesbury	2025-02-28 05:53:19 +00:00
PyTorch MergeBot	926b7b5027	Revert "Remove NO_MULTIPROCESSING_SPAWN checks (#146705 )" This reverts commit `40ad5e01df`. Reverted https://github.com/pytorch/pytorch/pull/146705 on behalf of https://github.com/cyyever due to Broke lint?, I guess land race with rufff update ([comment](https://github.com/pytorch/pytorch/pull/146705#issuecomment-2689603077))	2025-02-28 03:04:38 +00:00
cyyever	40ad5e01df	Remove NO_MULTIPROCESSING_SPAWN checks (#146705 ) py 3.9 has spawn. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146705 Approved by: https://github.com/colesbury	2025-02-28 00:15:32 +00:00
Xuehai Pan	db3290846e	[BE][Easy][10/19] enforce style for empty lines in import segments in `test/d*/` (#129761 ) See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter. You can review these PRs via: ```bash git diff --ignore-all-space --ignore-blank-lines HEAD~1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129761 Approved by: https://github.com/fegin	2024-07-17 16:57:39 +00:00
Yuanhao Ji	e3effa5855	Enable UFMT on all of `test/distributed` (#123539 ) Partially addresses #123062 Ran lintrunner on: - `test/distributed` Pull Request resolved: https://github.com/pytorch/pytorch/pull/123539 Approved by: https://github.com/ezyang	2024-04-17 06:46:02 +00:00
PyTorch MergeBot	52be63eb2c	Revert "Enable UFMT on all of `test/distributed` (#123539 )" This reverts commit `89ac37fe91`. Reverted https://github.com/pytorch/pytorch/pull/123539 on behalf of https://github.com/DanilBaibak due to Broken trunk ([comment](https://github.com/pytorch/pytorch/pull/123539#issuecomment-2058329471))	2024-04-16 06:33:21 +00:00
Yuanhao Ji	89ac37fe91	Enable UFMT on all of `test/distributed` (#123539 ) Partially addresses #123062 Ran lintrunner on: - `test/distributed` Pull Request resolved: https://github.com/pytorch/pytorch/pull/123539 Approved by: https://github.com/ezyang	2024-04-16 03:23:56 +00:00
Aaron Gokaslan	3fe437b24b	[BE]: Update flake8 to v6.1.0 and fix lints (#116591 ) Updates flake8 to v6.1.0 and fixes a few lints using sed and some ruff tooling. - Replace `assert(0)` with `raise AssertionError()` - Remove extraneous parenthesis i.e. - `assert(a == b)` -> `assert a == b` - `if(x > y or y < z):`->`if x > y or y < z:` - And `return('...')` -> `return '...'` Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/116591 Approved by: https://github.com/albanD, https://github.com/malfet	2024-01-03 06:04:44 +00:00
Justin Chu	4cc1745b13	[BE] f-stringify torch/ and scripts (#105538 ) This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`. - https://docs.python.org/3/reference/lexical_analysis.html#f-strings - https://pypi.org/project/flynt/ Command used: ``` flynt torch/ -ll 120 flynt scripts/ -ll 120 flynt tools/ -ll 120 ``` and excluded `collect_env.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538 Approved by: https://github.com/ezyang, https://github.com/malfet	2023-07-21 19:35:24 +00:00
Xuehai Pan	046e88a291	[BE] [3/3] Rewrite `super()` calls in test (#94592 ) Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied. - #94587 - #94588 - #94592 Also, methods with only a `super()` call are removed: ```diff class MyModule(nn.Module): - def __init__(self): - super().__init__() - def forward(self, ...): ... ``` Some cases that change the semantics should be kept unchanged. E.g.: `f152a79be9/caffe2/python/net_printer.py (L184-L190)` `f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/94592 Approved by: https://github.com/ezyang, https://github.com/seemethere	2023-02-12 22:20:53 +00:00
Aaron Gokaslan	8fce9a09cd	[BE]: pyupgrade Python to 3.8 - imports and object inheritance only (#94308 ) Apply parts of pyupgrade to torch (starting with the safest changes). This PR only does two things: removes the need to inherit from object and removes unused future imports. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94308 Approved by: https://github.com/ezyang, https://github.com/albanD	2023-02-07 21:10:56 +00:00
Junjie Wang	7c2489bdae	[PyTorch][Distributed] Enable Reduce Scatter and modify all_to_all for sharded linear with more test cases. (#68786 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68786 To enable the auto grad for the sharded linear, we find we need to make some changes to the current nn function api (c10d api with auto grad enabled). So we made the following several changes: 1. Add a new api `reduce_scatter` since we need it in the rowwise sharding. 2. Modify the `all_to_all` api to make sure it consistent with the ones in distributed_c10d.py. 3. Found the cpp input params of `reduce_scatter` is missing input param, added more unit test to cover these cases. 4. Sync the NN test from gloo to nccl. ghstack-source-id: 144860208 Test Plan: CI + Unit Test Reviewed By: pritamdamania87 Differential Revision: D32569674 fbshipit-source-id: 9bd613f91bbf7a39eede0af32a5a5db0f2ade43b	2021-12-06 13:38:58 -08:00
Jane Xu	34051d74da	Add test owner to distributed files starting with test_ (#66797 ) Summary: Action based on https://github.com/pytorch/pytorch/issues/66232 cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang Pull Request resolved: https://github.com/pytorch/pytorch/pull/66797 Reviewed By: gchanan Differential Revision: D31761389 Pulled By: janeyx99 fbshipit-source-id: c27c9ab4acec1eb71d5edd4538cd113b770dfc6c	2021-10-19 10:55:20 -07:00
Pavel Belevich	5cc75e46fa	Split test_c10d.py to test_c10d_common.py, test_c10d_gloo.py, test_c10d_nccl.py (#56598 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56598 Test Plan: NA Reviewed By: SciPioneer Differential Revision: D27913170 fbshipit-source-id: 3439d18141131b02d55f2ca399a4c795cba2b04b	2021-04-21 22:10:41 -07:00
Wanchao Liang	a970e525fd	make ProcessGroup.Options.timeout argument private in python (#56531 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56531 per discussions in https://github.com/pytorch/pytorch/pull/53663/files#r593409009, we need to make sure our API not confusing user by passing in both timeout in argument and timeout in processgroup.options. This PR tries to make the `ProcessGroup.Options.timeout` be a private field, and only be used in our test utils, for both `init_process_group` and `new_group`, we still allow user pass `timeout` as a separate argument. Since `ProcessGroupGloo.Options` only have a `timeout` config, both functions will not allow passing in options for the GLOO backend. This way we still preserve the only `timeout` API, and only allow user to use `ProcessGroupNCCL.Options` when needed. cc pritamdamania87 rohan-varma Test Plan: Imported from OSS Reviewed By: rohan-varma Differential Revision: D27893395 Pulled By: wanchaol fbshipit-source-id: cdd29c84648002226ef3d9f9f3ea67b795e64bc5	2021-04-21 17:55:10 -07:00
Yi Wang	6a40339920	[SPMD] Error out SPMD mode (#54454 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54454 According to the pitch in https://github.com/pytorch/pytorch/issues/47012 1. Let DDP error out if `device_ids` contains multiple devices. 2. If device_ids is not specified, DDP will use the provided model (module argument in DDP constructor) as-is, regardless if the model is on one GPU or multiple GPUs or on CPU. 3. Remove the assertion that prevents SPMD in DDP `join()` method, because now SPMD is already forbidden by the constructor. Also remove the relevant unit test `test_ddp_uneven_inputs_replicated_error`. #Closes: https://github.com/pytorch/pytorch/issues/47012 ghstack-source-id: 125644392 Test Plan: buck test mode/dev-nosan caffe2/test/distributed:distributed_gloo_spawn -- test_cuda buck test mode/dev-nosan caffe2/test/distributed:distributed_gloo_spawn -- test_rnn buck test mode/dev-nosan caffe2/test/distributed:c10d -- test_nccl_backend_multi_device_ids_not_allowed buck test mode/dev-nosan caffe2/test/distributed:c10d -- test_nccl_backend_single_device_module_device_ids_None buck test mode/dev-nosan caffe2/test/distributed:c10d -- test_nccl_backend_multi_device_module_device_ids_None buck test mode/dev-nosan caffe2/test/distributed:c10d -- test_ddp_multi_device_module_config waitforbuildbot Reviewed By: pritamdamania87 Differential Revision: D27226092 fbshipit-source-id: 3ee1e4bc46e5e362fc82cf7a24b2fafb34fcf1b9	2021-04-02 15:11:59 -07:00
Howard Huang	5610e8271b	Fix skip_if_not_multigpu decorator (#54916 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54916 Fixes https://github.com/pytorch/pytorch/issues/54887 `skip_if_not_multigpu` was skipping all the tests that use it. Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D27412193 Pulled By: H-Huang fbshipit-source-id: 28d6697bd8cc6b6784cdb038ccb3ff138d0610eb	2021-04-01 18:01:33 -07:00
Wanchao Liang	133000fe7a	[distributed] add processgroup options as argument (#53663 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53663 This add the processgroup option as an optional argument to new_group and init_processgroup, this allows user to pass in a initialized processgroup option for gloo and nccl. Test Plan: Imported from OSS Reviewed By: rohan-varma Differential Revision: D26968857 Pulled By: wanchaol fbshipit-source-id: 2ff73a009120b85e83ecde7c69956b731902abc2	2021-03-18 01:04:17 -07:00
Kyle Chen	5c2b3d7784	[ROCm] Enable RNN test in test_c10d_spawn.py for ROCm (#52707 ) Summary: Enabling test_rnn test because it is passing for ROCm. Signed-off-by: Kyle Chen <kylechen@amd.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/52707 Reviewed By: albanD Differential Revision: D26994407 Pulled By: mrshenli fbshipit-source-id: f7d60ab7c4f0128e5f7770f959e2b83694d18275	2021-03-11 18:41:54 -08:00
Sam Estep	8c798e0622	Forbid trailing whitespace (#53406 ) Summary: Context: https://github.com/pytorch/pytorch/pull/53299#discussion_r587882857 These are the only hand-written parts of this diff: - the addition to `.github/workflows/lint.yml` - the file endings changed in these four files (to appease FB-internal land-blocking lints): - `GLOSSARY.md` - `aten/src/ATen/core/op_registration/README.md` - `scripts/README.md` - `torch/csrc/jit/codegen/fuser/README.md` The rest was generated by running this command (on macOS): ``` git grep -I -l ' $' -- . ':(exclude)/contrib/' ':(exclude)third_party' \| xargs gsed -i 's/ *$//' ``` I looked over the auto-generated changes and didn't see anything that looked problematic. Pull Request resolved: https://github.com/pytorch/pytorch/pull/53406 Test Plan: This run (after adding the lint but before removing existing trailing spaces) failed: - https://github.com/pytorch/pytorch/runs/2043032377 This run (on the tip of this PR) succeeded: - https://github.com/pytorch/pytorch/runs/2043296348 Reviewed By: walterddr, seemethere Differential Revision: D26856620 Pulled By: samestep fbshipit-source-id: 3f0de7f7c2e4b0f1c089eac9b5085a58dd7e0d97	2021-03-05 17:22:55 -08:00
Emilio Castillo	233e4ebdb6	Implement autograd functions for c10d communication operations (#40762 ) Summary: Closes https://github.com/pytorch/pytorch/issues/40702, Fixes https://github.com/pytorch/pytorch/issues/40690 Currently wip. But I would appreciate some feedback. Functions should be double-differentiable. Contrary to `b35cdc5200/torch/nn/parallel/_functions.py` This PR generates list of tensors instead of aggregating the received data in a single tensor. Is this behavior correct? Thanks! Pull Request resolved: https://github.com/pytorch/pytorch/pull/40762 Reviewed By: glaringlee Differential Revision: D24758889 Pulled By: mrshenli fbshipit-source-id: 79285fb4b791cae3d248f34e2aadb11c9ab10cce	2021-01-26 07:52:51 -08:00
Alexander Grund	282f4ab947	Workaround for bug in DistributedDataParallel (#46186 ) Summary: Fix the DistributedDataParallelSingleProcessTest to work around a limitation in DistributedDataParallel where the batch_size needs to evenly divide by the number of GPUs used See https://github.com/pytorch/pytorch/issues/46175 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46186 Reviewed By: bdhirsh Differential Revision: D24264664 Pulled By: mrshenli fbshipit-source-id: 6cfd6d29e97f3e3420391d03b7f1a8ad49d75f48	2020-10-13 07:34:02 -07:00
gunandrose4u	f07ac6a004	Fix Windows build failure after DDP PR merged (#45335 ) Summary: Fixes #{issue number} This is resubmit for PR https://github.com/pytorch/pytorch/issues/42897 . Together with fix for Windows build issue introduced by PR https://github.com/pytorch/pytorch/issues/44344 . Pull Request resolved: https://github.com/pytorch/pytorch/pull/45335 Reviewed By: zou3519 Differential Revision: D23931471 Pulled By: mrshenli fbshipit-source-id: f49b5a114944c1450b32934b3292170be064f494	2020-09-25 12:37:50 -07:00
Mike Ruberry	103fa3894a	Revert D23841786: [pytorch][PR] Enable distributed package on windows, Gloo backend supported only Test Plan: revert-hammer Differential Revision: D23841786 (`0122299f9b`) Original commit changeset: 334ba1ed73ef fbshipit-source-id: ec95432f9957df56a5a04e52661f5db920b7f57f	2020-09-24 22:44:33 -07:00
gunandrose4u	0122299f9b	Enable distributed package on windows, Gloo backend supported only (#42897 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/42095 For test case part will be committed to this PR later mrshenli, please help to review Pull Request resolved: https://github.com/pytorch/pytorch/pull/42897 Reviewed By: osalpekar Differential Revision: D23841786 Pulled By: mrshenli fbshipit-source-id: 334ba1ed73eff2f668857390fc32d1bc7f08e5f3	2020-09-24 21:13:55 -07:00
Pritam Damania	872237c1f2	Output to stderr in distributed tests. (#42139 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42139 A bunch of tests were failing with buck since we would output to stdout and buck would fail parsing stdout in some cases. Moving these print statements to stderr fixes this issue. ghstack-source-id: 108606579 Test Plan: Run the offending unit tests. Reviewed By: mrshenli Differential Revision: D22779135 fbshipit-source-id: 789af3b16a03b68a6cb12377ed852e5b5091bbad	2020-07-29 19:23:34 -07:00
Jeff Daily	fc8bca094c	skip_if_rocm test_rnn in test_c10d_spawn.py (#40577 ) Summary: Test was added a few months back in https://github.com/pytorch/pytorch/issues/36503 but recently became flaky for ROCm. CC ezyang xw285cornell sunway513 Pull Request resolved: https://github.com/pytorch/pytorch/pull/40577 Differential Revision: D22258196 Pulled By: ezyang fbshipit-source-id: 8a22b0c17b536b3d42d0382f7737df0f8823ba08	2020-06-26 09:45:45 -07:00
Shen Li	a2e1a948a4	Increase number of iterations in DDP SPMD tests (#40506 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40506 Test Plan: Imported from OSS Differential Revision: D22208965 Pulled By: mrshenli fbshipit-source-id: 7d27b60e2c09e641b4eeb1c89d9f9917c4e72e52	2020-06-24 12:48:04 -07:00
Mike Ruberry	13120bf677	Updates assertEqual to require atol and rtol, removes positional atol (#38872 ) Summary: This updates assertEqual and assertEqual-like functions to either require both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replace with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument. In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too. Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872 Differential Revision: D21740237 Pulled By: mruberry fbshipit-source-id: acbc027aa1d7877a49664d94db9a5fff91a07042	2020-05-27 06:31:07 -07:00
Rohan Varma	63e545e0fe	Revert D21717199: [pytorch][PR] Updates assertEqual to require atol and rtol, removes positional atol Test Plan: revert-hammer Differential Revision: D21717199 Original commit changeset: 9feb856f94ee fbshipit-source-id: bfde9c39a5ce99f0ca6183a7dde703c65b7c8259	2020-05-26 18:23:59 -07:00
Mike Ruberry	6ddca30b2d	Updates assertEqual to require atol and rtol, removes positional atol (#38872 ) Summary: This updates assertEqual and assertEqual-like functions to either require both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replace with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument. In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too. Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872 Differential Revision: D21717199 Pulled By: mruberry fbshipit-source-id: 9feb856f94eee911b44f6c7140a1d07c1b026d3a	2020-05-26 08:30:23 -07:00
Shen Li	8d6a8d2b3f	Fix DDP bug in single process multiple device use cases (#36503 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36503 Test Plan: Imported from OSS Differential Revision: D21179274 Pulled By: mrshenli fbshipit-source-id: 0afce30ae0ddda753d1e240584a0f80df9aec4c2	2020-04-22 15:06:28 -07:00
Pritam Damania	f050b16dd9	Move pytorch distributed tests to separate folder for contbuild. (#30445 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30445 Create distributed and rpc directories under caffe/test for better management of unit tests. Differential Revision: D18702786 fbshipit-source-id: e9daeed0cfb846ef68806f6decfcb57c0e0e3606	2020-01-22 21:16:59 -08:00

34 Commits