Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57413
An internal test fails because `Tuple[()]` is somehow not considered compatible with `Tuple[Any]` in TorchScript, even if the code that involves variables of this type is never executed.
Therefore, create separate templates for instantiation to avoid the type-check failure. This addresses the FIXME left in https://github.com/pytorch/pytorch/pull/57288.
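As a rough sketch of the approach (the template bodies, placeholder names, and the `enable_moving_cpu_tensors_to_cuda` flag below are illustrative, not the PR's exact generated code), the instantiator picks one of several templates up front, so the scripted `forward` never contains a never-executed branch whose annotations trip the `Tuple[()]` vs. `Tuple[Any]` check:
```
# Hedged sketch only: names and template bodies are illustrative.
_FORWARD_WITH_MOVE_TEMPLATE = """
def forward(self, {arg_decls}):
    args = ({args},)
    kwargs = {{{kwargs}}}
    # Variant that moves CPU tensors in args/kwargs to the module's CUDA device.
    return _remote_forward_with_move(self.module_rref, self.device, args, kwargs)
"""

_FORWARD_NO_MOVE_TEMPLATE = """
def forward(self, {arg_decls}):
    args = ({args},)
    kwargs = {{{kwargs}}}
    return _remote_forward(self.module_rref, args, kwargs)
"""


def _select_forward_template(enable_moving_cpu_tensors_to_cuda: bool) -> str:
    # Selecting the template at instantiation time keeps the unused variant
    # (and its incompatible tuple annotations) out of the scripted code entirely.
    return (
        _FORWARD_WITH_MOVE_TEMPLATE
        if enable_moving_cpu_tensors_to_cuda
        else _FORWARD_NO_MOVE_TEMPLATE
    )
```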
#Closes: https://github.com/pytorch/pytorch/issues/51670
Test Plan:
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule -j 1
buck test mode/dev-nosan caffe2/torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test -- test_load_di_parts
Reviewed By: wanchaol
Differential Revision: D28138864
fbshipit-source-id: 39e3e67b0c3979b607ff104d84b4fb1070ffefd6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57288
If the device map provided by RemoteModule is not empty, then the TensorPipe RPC backend can send GPU tensors directly over the wire.
Also add a pybind binding for `_get_device_map`.
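A minimal sketch of how this is used from the caller's side (worker names, ranks, and the device mapping below are placeholders, not taken from the PR):
```
import os

import torch
import torch.distributed.rpc as rpc
from torch.distributed.nn import RemoteModule


def run_owner() -> None:
    # Rendezvous settings for a hypothetical 2-worker setup.
    os.environ.setdefault("MASTER_ADDR", "localhost")
    os.environ.setdefault("MASTER_PORT", "29500")

    options = rpc.TensorPipeRpcBackendOptions()
    # Map local cuda:0 to cuda:0 on "worker1" so CUDA tensors can be sent
    # directly over the wire instead of being staged through CPU.
    options.set_device_map("worker1", {0: 0})
    rpc.init_rpc("worker0", rank=0, world_size=2, rpc_backend_options=options)

    # Place the module on worker1's cuda:0 and call forward with a CUDA input.
    remote_linear = RemoteModule("worker1/cuda:0", torch.nn.Linear, args=(10, 10))
    out = remote_linear.forward(torch.randn(5, 10, device="cuda:0"))
    print(out.shape)

    rpc.shutdown()
```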
The changes to the unit test setup are separated out into a follow-up PR, as they currently break some tests in `distributed/rpc/test_faulty_agent.py`.
`test_load_di_parts` in `torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test` still needs to be fixed. Currently an early return is used to bypass this test failure.
#Original PR issue: https://github.com/pytorch/pytorch/issues/51670
Test Plan:
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_input_moved_to_cuda_device
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_input_moved_to_cuda_device_script
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule -j 1
CAUTION: This one actually fails and is currently bypassed. See the FIXME in `_remote_forward`.
buck test mode/dev-nosan caffe2/torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test -- test_load_di_parts
Reviewed By: wanchaol
Differential Revision: D28021672
fbshipit-source-id: a89245dc35e1d9479811ec6f98d9f34116837d79
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56943
If the module is placed on a CUDA device, then all CPU tensors in `args` and `kwargs` are implicitly moved to the same CUDA device before running forward (see the sketch below).
Currently the forward output still needs to be moved from the CUDA device back to CPU, until:
1) The process group RPC backend is completely deprecated and the TensorPipe RPC backend is always used;
2) A device map is explicitly provided to the TensorPipe RPC backend.
These steps will be done in a separate PR.
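A hedged sketch of this implicit device movement (the helper names below are illustrative, not the PR's internals):
```
from typing import Any, Dict, Tuple

import torch


def _move_tensor(obj: Any, device: torch.device) -> Any:
    # Only tensors are moved; everything else is passed through unchanged.
    return obj.to(device) if isinstance(obj, torch.Tensor) else obj


def _run_forward_on_device(
    module: torch.nn.Module,
    device: torch.device,
    args: Tuple[Any, ...],
    kwargs: Dict[str, Any],
) -> Any:
    # Move CPU tensors in args/kwargs to the module's device before forward.
    args = tuple(_move_tensor(a, device) for a in args)
    kwargs = {k: _move_tensor(v, device) for k, v in kwargs.items()}
    out = module(*args, **kwargs)
    # For now, the output goes back to CPU before being sent over the wire.
    return _move_tensor(out, torch.device("cpu"))
```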
#Original PR issue: https://github.com/pytorch/pytorch/issues/51670
ghstack-source-id: 127457584
Test Plan:
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_input_moved_to_cuda_device
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_input_moved_to_cuda_device_script
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule
buck test mode/dev-nosan //caffe2/torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test -- --exact 'caffe2/torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test - test_load_di_parts (caffe2.torch.fb.training_toolkit.applications.sparse_nn.batch_distributed_inference.tests.batch_distributed_inference_test.BatchDistributedInferenceTest)'
Reviewed By: wanchaol
Differential Revision: D27934791
fbshipit-source-id: de27e27b905db83cc52800e63684fc6c942e9dc7
Summary:
As this diff shows, currently there are a couple hundred instances of raw `noqa` in the codebase, which just ignore all errors on a given line. That isn't great, so this PR changes all existing instances of that antipattern to qualify the `noqa` with respect to a specific error code, and adds a lint to prevent more of this from happening in the future.
Interestingly, some of the examples the `noqa` lint catches are genuine attempts to qualify the `noqa` with a specific error code, such as these two:
```
test/jit/test_misc.py:27: print(f"{hello + ' ' + test}, I'm a {test}") # noqa E999
test/jit/test_misc.py:28: print(f"format blank") # noqa F541
```
However, those are still wrong because they are [missing a colon](https://flake8.pycqa.org/en/3.9.1/user/violations.html#in-line-ignoring-errors), which actually causes the error code to be completely ignored:
- If you change those error codes to anything else, the warnings are still suppressed.
- If you add the necessary colons, it turns out that `E261` was also being suppressed, unintentionally:
```
test/jit/test_misc.py:27:57: E261 at least two spaces before inline comment
test/jit/test_misc.py:28:35: E261 at least two spaces before inline comment
```
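For illustration (these lines are not taken from the PR), the colon is what makes the qualification take effect:
```
print(f"format blank")  # noqa F541   <- no colon: the code is ignored and every error on this line is suppressed
print(f"format blank")  # noqa: F541  <- with the colon: only F541 (f-string without placeholders) is suppressed
```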
I did try using [flake8-noqa](https://pypi.org/project/flake8-noqa/) instead of a custom `git grep` lint, but it didn't seem to work. This PR is definitely missing some of the functionality that flake8-noqa is supposed to provide, though, so if someone can figure out how to use it, we should do that instead.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56272
Test Plan:
CI should pass on the tip of this PR, and we know that the lint works because the following CI run (before this PR was finished) failed:
- https://github.com/pytorch/pytorch/runs/2365189927
Reviewed By: janeyx99
Differential Revision: D27830127
Pulled By: samestep
fbshipit-source-id: d6dcf4f945ebd18cd76c46a07f3b408296864fcb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40902
See the bottom of this stack for context.
Test Plan: Imported from OSS
Reviewed By: eellison
Differential Revision: D22360210
Pulled By: suo
fbshipit-source-id: 4275127173a36982ce9ad357aa344435b98e1faf
Summary:
Previously the module would log some data using `print()`. This can be a problem in contexts where the process expects to write data to stdout itself. This diff changes those log statements to use `logger` instead, making them consistent with other log statements in the same module.
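A minimal sketch of the kind of change involved (the function below is hypothetical, not the module's actual code):
```
import logging

logger = logging.getLogger(__name__)


def _instantiate_template(module_name: str) -> None:
    # Previously: print(f"Instantiating {module_name}")
    logger.info("Instantiating %s", module_name)
```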
Test Plan:
Confirmed that no unexpected test output showed up when running:
buck test caffe2/test/distributed/nn/api:remote_module_fork
Differential Revision: D22136172
fbshipit-source-id: a3d144eba6c75925ed684981793c84b36eb45a5d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40173
- Avoid path sharing across runs and workers, so that even when test methods/workers run in parallel on the same host, they don't interfere with each other.
- In some environments (e.g. the fb internal CI platform), the torch package file tree is not writable, but the temporary folder chosen by Python's `tempfile` module is always writable (on Linux it's "/tmp"). See the sketch below.
Closes https://github.com/pytorch/pytorch/issues/40120
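A hedged sketch of the approach (the directory variable and helper below are illustrative, not necessarily the PR's exact code):
```
import os
import tempfile

# A per-process temporary directory: unique for each run/worker and always
# writable, unlike the installed torch package tree.
_TEMP_DIR = tempfile.TemporaryDirectory()
INSTANTIATED_TEMPLATE_DIR_PATH = _TEMP_DIR.name


def _write_instantiated_module(module_name: str, source: str) -> str:
    path = os.path.join(INSTANTIATED_TEMPLATE_DIR_PATH, f"{module_name}.py")
    with open(path, "w") as f:
        f.write(source)
    return path
```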
ghstack-source-id: 106086340
Test Plan:
```
buck test mode/dev-nosan //caffe2/test/distributed/nn/jit:test_instantiator
buck build mode/dev-nosan //caffe2/test/distributed/nn/jit:test_instantiator && \
buck-out/gen/caffe2/test/distributed/nn/jit/test_instantiator\#binary.par -r test_instantiate_scripted_remote_module_template
buck build mode/dev-nosan //caffe2/test/distributed/nn/jit:test_instantiator && \
buck-out/gen/caffe2/test/distributed/nn/jit/test_instantiator\#binary.par -r test_instantiate_non_scripted_remote_module_template
```
```
buck test mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork
```
Differential Revision: D5708493
fbshipit-source-id: dd92695682433aaf79d1912c7956cef40a450eaf