40 Commits

Author SHA1 Message Date
Anthony Barbier
ce9e27a0fc Add new keys for Graphcore IPU (DispatchKey / Backend / DeviceType)
We need a key to register our out of tree backend: https://github.com/graphcore/poptorch
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74763
Approved by: https://github.com/bdhirsh
2022-04-07 17:18:45 +00:00
Horace He
7cdbbfaee2 Revert D33716716: [pytorch][PR] Added remove_duplicate parameter to nn.Module
Test Plan: revert-hammer

Differential Revision:
D33716716 (7e8217549f)

Original commit changeset: ff1ed9980bd1

Original Phabricator Diff: D33716716 (7e8217549f)

fbshipit-source-id: 91c3d9acc5bc731da716dd0d2485431f85f861c9
(cherry picked from commit c81d193bf0)
2022-02-03 09:04:29 +00:00
Horace He
7e8217549f Added remove_duplicate parameter to nn.Module (#39)
Summary:
Pull Request resolved: https://github.com/pytorch/torchrec/pull/39

Pull Request resolved: https://github.com/facebookresearch/torchrec/pull/6

This makes it so that shared parameters get their own entry in `named_parameters`.

More broadly, this makes it so that
```
params_and_buffers = {**mod.named_parameters(remove_duplicate=False), **mod.named_buffers(remove_duplicate=False)}
_stateless.functional_call(mod, params_and_buffers, args, kwargs)
```
is identical to calling the original module's forward pass.
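
To illustrate what the flag changes, here is a minimal pure-Python sketch of an identity-based de-duplicating traversal. `Node` and `named_params` are hypothetical stand-ins introduced for illustration; this is not the `nn.Module` implementation.

```python
class Node:
    """Hypothetical stand-in for a module that owns parameters and child modules."""

    def __init__(self, params=None, children=None):
        self.params = dict(params or {})
        self.children = dict(children or {})


def named_params(node, prefix="", remove_duplicate=True, _seen=None):
    """Yield (qualified_name, param) pairs; optionally skip objects already yielded."""
    if _seen is None:
        _seen = set()
    for name, param in node.params.items():
        if remove_duplicate:
            if id(param) in _seen:
                continue  # same object already reported under another name
            _seen.add(id(param))
        yield prefix + name, param
    for child_name, child in node.children.items():
        yield from named_params(child, prefix + child_name + ".", remove_duplicate, _seen)


shared = [1.0, 2.0]  # one object registered under two names
model = Node(children={"enc": Node({"weight": shared}), "dec": Node({"weight": shared})})

deduped = dict(named_params(model))
full = dict(named_params(model, remove_duplicate=False))
```

With de-duplication on, only the first name of the shared object is reported; with it off, every name appears, which is what a name-keyed functional call needs.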

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71542

Reviewed By: jbschlosser, albanD

Differential Revision: D33716716

Pulled By: Chillee

fbshipit-source-id: ff1ed9980bd1a3f7ebaf695ee5e401202b543213
(cherry picked from commit d6e3ad3cd0)
2022-02-01 18:34:58 +00:00
Pritam Damania
9ae3f3945b Add remote_module logging to the __new__ method. (#68035)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68035

RemoteModule is sometimes created using object.__new__ (e.g. in
init_from_module_rref). In this case the logging in the __init__ method would
not pick this up.

Therefore, add a `__new__` method to RemoteModule to log all usages
appropriately.
ghstack-source-id: 142762019
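
The idea of logging in `__new__` rather than `__init__` can be sketched in plain Python. `Tracked`, `usage_log` and `from_existing` are hypothetical names, and the alternate constructor here is assumed to go through `cls.__new__(cls)`; this is not the RemoteModule code.

```python
usage_log = []  # hypothetical stand-in for an API usage logger


class Tracked:
    def __new__(cls, *args, **kwargs):
        # Runs for every instance created through the class, before __init__.
        usage_log.append(cls.__name__)
        return super().__new__(cls)

    def __init__(self, value):
        self.value = value

    @classmethod
    def from_existing(cls, value):
        # Deliberately skips __init__, but still passes through __new__.
        obj = cls.__new__(cls)
        obj.value = value
        return obj


a = Tracked(1)
b = Tracked.from_existing(2)
```

Both creation paths leave an entry in `usage_log`, whereas logging placed in `__init__` would only have seen the first one.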

Test Plan: waitforbuildbot

Reviewed By: vipannalla

Differential Revision: D32263978

fbshipit-source-id: a95ab0bb5d0836da8fe6333c41593af164b008d9
2021-11-09 09:32:34 -08:00
Pritam Damania
05e17e7ff6 Add API usage logging for several other RPC APIs. (#67722)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67722

ghstack-source-id: 142259452

Test Plan: waitforbuildbot

Reviewed By: jaceyca, fduwjj

Differential Revision: D32118872

fbshipit-source-id: 041ab5601221b1846c56ce4bb63364bec9ad28b0
2021-11-03 14:02:00 -07:00
gmagogsfm
479fc4e412 Remove outdated warning about RecursiveScriptModule not being copiable (#64085)
Summary:
RecursiveScriptModule has its customized `__copy__` and `__deepcopy__` defined. The warning/error that says it is not copiable is outdated.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64085

Reviewed By: rohan-varma

Differential Revision: D30598623

Pulled By: gmagogsfm

fbshipit-source-id: 0701d8617f42d818bc7b88244caee4cd47fbe976
2021-08-31 21:31:32 -07:00
Pritam Damania
b8e6144e0a Add a _RemoteDevice structure for ShardedTensor/ShardingSpec. (#62927)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62927

As part of the ShardedTensor work, we realized we do need some sort of
_RemoteDevice structure that deals with our format of "workername/device" so
that users don't have to worry about parsing this string directly.

Right now this structure is just the bare minimum and is mostly a container for
describing a remote device. It is currently only used in ShardedTensor,
ShardingSpec and RemoteModule.

Once we actually have a consolidated remote device proposal, this class can be
extended appropriately if needed.
ghstack-source-id: 135534086
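
As a rough sketch of what parsing the "workername/device" format involves, assuming both parts are required and separated by a single slash. `ParsedRemoteDevice` and `parse_remote_device` are hypothetical names and do not reflect the actual `_RemoteDevice` class.

```python
from typing import NamedTuple


class ParsedRemoteDevice(NamedTuple):
    worker_name: str
    device: str


def parse_remote_device(spec: str) -> ParsedRemoteDevice:
    """Split '<workername>/<device>' into its two parts, rejecting malformed input."""
    worker_name, sep, device = spec.partition("/")
    if not sep or not worker_name or not device or "/" in device:
        raise ValueError(f"expected '<workername>/<device>', got {spec!r}")
    return ParsedRemoteDevice(worker_name, device)


parsed = parse_remote_device("ps0/cuda:0")
print(parsed.worker_name, parsed.device)
```

Centralizing the parsing in one place means callers deal with named fields instead of splitting the string themselves.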

Test Plan:
1) unit tests
2) waitforbuildbot

Reviewed By: SciPioneer

Differential Revision: D30170689

fbshipit-source-id: 1ac2e81c7a597dc40bf3fbf2c1168c382c66649f
2021-08-11 11:27:32 -07:00
Philip Meier
d5988c5eca remove unused type: ignore directives (#60006)
Summary:
During development it is common practice to put `type: ignore` comments on lines that are correct but that `mypy` fails to recognize as such. This often stems from the fact that the `mypy` version in use could not handle the pattern in question.

With every new release `mypy` gets better at handling complex code. In addition to fixing all the previously accepted but now failing patterns, we should also revisit all `type: ignore` comments to see if they are still needed. Fortunately, we don't need to do it manually: by adding `warn_unused_ignores = True` to the configuration, `mypy` will error out when it encounters a `type: ignore` that is no longer needed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60006

Reviewed By: jbschlosser, malfet

Differential Revision: D29133237

Pulled By: albanD

fbshipit-source-id: 41e82edc5cd5affa7ccedad044b59b94dad4425a
2021-06-18 07:23:31 -07:00
Pritam Damania
f11120967e Support EnumerableShardingSpec in ShardedTensor. (#59061)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59061

Overall Design: https://github.com/pytorch/pytorch/issues/55207

This PR builds upon https://github.com/pytorch/pytorch/pull/58517 and
https://github.com/pytorch/pytorch/pull/57409 to support creating a
ShardedTensor using EnumerableShardingSpec.
ghstack-source-id: 130780376

Test Plan:
1) unit tests
2) waitforbuildbot

Reviewed By: SciPioneer

Differential Revision: D28734551

fbshipit-source-id: 656f5f2b22041dae071bc475f19fe94c969716e8
2021-06-09 23:21:14 -07:00
Yi Wang
d009c9c129 [RPC Framework] Separate initialize_from_module_rref method out of RemoteModule constructor (#59292)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59292

#Closes: https://github.com/pytorch/pytorch/issues/58274

Create an alternate initialization method, and also create a few util functions to avoid duplicate code.
ghstack-source-id: 130575373

Test Plan: buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_create_remote_module_from_module_rref

Reviewed By: vipannalla

Differential Revision: D28825895

fbshipit-source-id: 87803e94d9b50f94e1b7b2c99b9bf1634e20d065
2021-06-04 03:43:36 -07:00
Lishan Yang
2aa463d931 Support switching RemoteModule between train/eval (#59026)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59026

#Closes: https://github.com/pytorch/pytorch/issues/51480

Enabled methods train and eval in RemoteModule to call the underlying train/eval methods on the actual nn.Module
ghstack-source-id: 130421137

Test Plan:
Call these two updated methods in method test_send_remote_module_over_the_wire in remote_module_test.py. To test the correctness, after running method train, the training mode should be set to True; after running method eval, the training mode of the remote module should be set to False.

	Related test output:

    ✓ Pass: caffe2/test/distributed/rpc:process_group_agent - test_send_remote_module_over_the_wire (fb.test_process_group_agent.ProcessGroupThreeWorkersRemoteModuleTestWithFork) (23.059)
    ✓ Pass: caffe2/test/distributed/rpc:thrift_agent - test_send_remote_module_over_the_wire (fb.test_thrift_agent.ThriftThreeWorkersRemoteModuleTestWithFork) (27.965)
    ✓ Pass: caffe2/test/distributed/rpc:process_group_agent - test_send_remote_module_over_the_wire (test_process_group_agent.ProcessGroupThreeWorkersRemoteModuleTestWithSpawn) (74.481)
    ✓ Pass: caffe2/test/distributed/rpc:thrift_agent - test_send_remote_module_over_the_wire (fb.test_thrift_agent.ThriftThreeWorkersRemoteModuleTestWithSpawn) (77.243)
    ✓ Pass: caffe2/test/distributed/rpc:tensorpipe_agent - test_send_remote_module_over_the_wire (fb.test_tensorpipe_agent.TensorPipeThreeWorkersRemoteModuleTestWithFork) (58.644)
    ✓ Pass: caffe2/test/distributed/rpc:tensorpipe_agent - test_send_remote_module_over_the_wire (test_tensorpipe_agent.TensorPipeThreeWorkersRemoteModuleTestWithSpawn) (90.229)

Reviewed By: pritamdamania87, SciPioneer

Differential Revision: D28721078

fbshipit-source-id: aa45c1e5755f583200144ecfec3704f28221972c
2021-06-03 13:13:58 -07:00
Yi Wang
dbe629c51d [RPC Framework] Support creating a RemoteModule by RRef (#59242)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59242

#Original PR issue: https://github.com/pytorch/pytorch/issues/58274

This can be a workaround: Instead of passing a script `RemoteModule` over RPC, pass its `module_rref` field over RPC, and then construct a new `RemoteModule` on the receiver end.
ghstack-source-id: 130268018

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_send_remote_module_over_the_wire_script_not_supported

buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_remote_module_py_pickle_not_supported_script

buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_create_remote_module_by_module_rref

Reviewed By: vipannalla

Differential Revision: D28794905

fbshipit-source-id: 1a677ff0d4b47c078ad47b50d7102a198a1fc39b
2021-06-01 22:35:03 -07:00
Shivansh Dhar
e89b150a39 [typing] Pyre fixes for remote_module (#59046)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59046

Correcting type hint for _RemoteModule to pass Pyre checks.

Test Plan: N/A

Reviewed By: walterddr, SciPioneer

Differential Revision: D28725237

fbshipit-source-id: 1ca714bbf1a597a29850f70bac826a0c95a4019f
2021-05-27 09:44:50 -07:00
Rong Rong (AI Infra)
97c1179c9d Revert D28549240: [typing] Pyre fixes for batch_distributed_inference
Test Plan: revert-hammer

Differential Revision:
D28549240 (671c224b0a)

Original commit changeset: dadfedf93aae

fbshipit-source-id: 820fefccf2b4c6368defd762ce55245dd35505ca
2021-05-26 13:39:30 -07:00
Shivansh Dhar
671c224b0a [typing] Pyre fixes for batch_distributed_inference
Summary:
Pyre does not support dynamic imports, so we can leave the pyre-ignores for those. (https://fb.workplace.com/groups/pyreqa/permalink/3119812734775204/)

Parameterized pyre-ignore are also necessary as explained by [this Q&A](https://www.internalfb.com/intern/qa/109058/pyre-says-undefined-attribute-16-module-parameteri)

Test Plan:
- `pyre -l .`
- `pyre check`
- `buck test //caffe2/torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test`

Reviewed By: vipannalla

Differential Revision: D28549240

fbshipit-source-id: dadfedf93aae860fe6d0a112002bdfe743139b1e
2021-05-26 13:08:19 -07:00
Pritam Damania
0d6fa1adc5 Introduce ChunkShardingSpec as a model sharding specification. (#55728)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55728

Full design: https://github.com/pytorch/pytorch/issues/55207

This PR introduces ChunkShardingSpec (SingleShardingSpec in the design). Used
the name ChunkShardingSpec since it is very similar to `torch.chunk` in terms
of how a Tensor is split up and feels more clear compared to SingleShardingSpec.
ghstack-source-id: 129603318

Test Plan: waitforbuildbot

Reviewed By: SciPioneer

Differential Revision: D27694108

fbshipit-source-id: c8764abe6a4d5fc56d023fda29b74b5af2a73b49
2021-05-23 16:04:57 -07:00
Yi Wang
2436377a7d Remove the list for the attributes that will be ignored for pickling (#58345)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58345

1. Add a sanity check to make sure any new attribute added to the constructor is also added to either `_REMOTE_MODULE_PICKLED_ATTRIBUTES` or `_REMOTE_MODULE_ATTRIBUTES_IGNORE_FOR_PICKLING`.
2. Update some comments and warnings: if a new attribute is added after construction, it will now not be pickled. Previously this triggered a runtime error, which is hard to unit test (one worker hits the runtime error, while the other worker times out).
Context: https://github.com/pytorch/pytorch/pull/58019#discussion_r632322083
ghstack-source-id: 129070358

Test Plan: unit test

Reviewed By: rohan-varma

Differential Revision: D28460744

fbshipit-source-id: 8028186fc447c88fbf2bf57f5c5d321f42ba54ed
2021-05-15 00:47:48 -07:00
Yi Wang
e507771294 [RPC Framework] Replace Python Pickler with internal RPC pickler for RemoteModule (#58019)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58019

In order to support sending `RemoteModule` over RPC, the pickling/unpickling of `RemoteModule` was previously implemented based on `__setstate__` and `__getstate__`. However, this means that the user can call the regular Python pickler/unpickler to invoke the same logic, which should not be allowed.

This PR ensures that the pickling can only happen over RPC and not via regular python pickle.

Additionally, when a new attribute is added to `RemoteModule`, if it's not added to either `_REMOTE_MODULE_PICKLED_ATTRIBUTES` or `_REMOTE_MODULE_ATTRIBUTES_IGNORE_FOR_PICKLING`, this attribute will be ignored and an error message will be printed to stderr. However, it will not raise an exception like before, because such an exception raised at the RPC layer will somehow cause a timeout.
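
The behaviour described above can be sketched with the standard library alone. `Handle`, the two attribute tuples and the `*_for_transport` helpers are hypothetical stand-ins for `RemoteModule`, its attribute lists and the internal RPC pickler; the real implementation may differ.

```python
import pickle
import sys

_PICKLED_ATTRIBUTES = ("on", "device")  # hypothetical: attributes sent over the wire
_IGNORED_ATTRIBUTES = ("_cache",)       # hypothetical: attributes knowingly dropped


class Handle:
    def __init__(self, on, device):
        self.on = on
        self.device = device
        self._cache = {}

    def __getstate__(self):
        # Block the regular Python pickler.
        raise RuntimeError("Cannot pickle Handle with the Python pickler")

    def __setstate__(self, state):
        raise RuntimeError("Cannot unpickle Handle with the Python pickler")


def serialize_for_transport(obj):
    """Copy only whitelisted attributes; warn on stderr about unlisted ones."""
    state = {}
    for name, value in vars(obj).items():
        if name in _PICKLED_ATTRIBUTES:
            state[name] = value
        elif name not in _IGNORED_ATTRIBUTES:
            print(f"attribute {name!r} is not listed and will be ignored", file=sys.stderr)
    return state


def deserialize_from_transport(state):
    obj = object.__new__(Handle)  # bypass __init__ on the receiving side
    obj.__dict__.update(state)
    return obj


handle = Handle("worker1", "cpu")
handle.extra = 1  # attribute added after construction: dropped with a warning
clone = deserialize_from_transport(serialize_for_transport(handle))
```

Printing a warning instead of raising keeps an unlisted attribute from turning into an exception on one worker only.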

#Closes: https://github.com/pytorch/pytorch/issues/57516
ghstack-source-id: 128868501

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_send_remote_module_over_the_wire
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_remote_module_py_pickle_not_supported
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_send_remote_module_with_a_new_attribute_ignored_over_the_wire
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule

buck test mode/dev-nosan //caffe2/torch/fb/csrc/concurrency/test:atomic_int_interprocess_test -- --exact 'caffe2/torch/fb/csrc/concurrency/test:atomic_int_interprocess_test - test_multiple_processes (caffe2.torch.fb.csrc.concurrency.test.atomic_int_interprocess_test.ForkMultipleProcessTest)'
buck test mode/dev //caffe2/torch/distributed/fb/test:app_test -- --exact 'caffe2/torch/distributed/fb/test:app_test - test_custom_init_rpc (caffe2.torch.distributed.fb.test.app_test.TestRpc)'

Reviewed By: mrshenli

Differential Revision: D28318270

fbshipit-source-id: 7e7df2a6690f0860c4531a244d38789db424496f
2021-05-13 09:37:42 -07:00
Mehdi Mirzazadeh
614437751f make remote model instantiation async when possible (#58052)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/58052 for the cases where `module_interface_cls` is not provided

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58052

Reviewed By: mruberry

Differential Revision: D28369064

Pulled By: mrzzd

fbshipit-source-id: 3ded7ea943a5ff0425bedc05448a59e6eefbeaaf
2021-05-12 13:48:09 -07:00
Richard Barnes
d9ea93181b Some types for remote_module (#58012)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58012

Test Plan: Sandcastle

Reviewed By: SciPioneer

Differential Revision: D28334611

fbshipit-source-id: 5e4645a7de65e064cb6a919cdc2372151ec48d44
2021-05-11 16:43:55 -07:00
Yi Wang
4db88307d9 [RPC Framework] Add a link to the tutorial in RemoteModule docstring (#57875)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57875

This tutorial combines DDP and RemoteModule.
ghstack-source-id: 128482681

Test Plan: N/A

Reviewed By: rohan-varma

Differential Revision: D28305382

fbshipit-source-id: 572e1ec4b4aa00735fff16a6ce6ae4c7cad0b27f
2021-05-07 19:42:27 -07:00
Yi Wang
74d493cc07 [RPC Framework] Support passing RemoteModule as an arg (#57695)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57695

Add pickling/unpickling support for `RemoteModule`.

#Closes: https://github.com/pytorch/pytorch/issues/57516
ghstack-source-id: 128472946

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_send_remote_module_over_the_wire

buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_send_remote_module_with_a_new_attribute_over_the_wire

buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule

Reviewed By: rohan-varma

Differential Revision: D28233108

fbshipit-source-id: 94eea2251fa53fb71912457c80d0a1e44504fc85
2021-05-07 19:41:17 -07:00
Yi Wang
5c7e35c689 [RPC Framework] Clang-format remote_module.py and instantiator.py (#57414)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57414

ghstack-source-id: 127927609

Test Plan: N/A

Reviewed By: rohan-varma

Differential Revision: D28138870

fbshipit-source-id: 04894abaf2e713dc559cd9795197f85539b25e17
2021-05-03 20:28:51 -07:00
Yi Wang
4143483d95 [RPC Framework] Create a separate remote module template when moving CPU tensors to a cuda device is not enabled (#57413)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57413

An internal test fails because somehow `Tuple[()]` is not considered compatible with `Tuple[Any]` in TorchScript, even if the code that involves this type of variables is not executed at all.

Therefore, create separate templates for instantiation to avoid typing check failure. This can address the FIXME left in https://github.com/pytorch/pytorch/pull/57288

#Closes: https://github.com/pytorch/pytorch/issues/51670

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule -j 1

buck test mode/dev-nosan caffe2/torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test -- test_load_di_parts

Reviewed By: wanchaol

Differential Revision: D28138864

fbshipit-source-id: 39e3e67b0c3979b607ff104d84b4fb1070ffefd6
2021-05-03 19:10:24 -07:00
Yi Wang
13dbb77b7a [RPC Framework] Enable RemoteModule to directly send GPU tensors over the wire on TensorPipe RPC backend if a device map is provided (#57288)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57288

If the device map provided by RemoteModule is not empty, then TensorPipe RPC backend can support directly sending GPU tensors over the wire.

Also add pybind of `_get_device_map`.

The changes in unit test setup are separated out as a follow-up PR, as currently they break some tests in `distributed/rpc/test_faulty_agent.py`.

Still need to fix test_load_di_parts in `torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test`. Currently an early return is used to bypass this test failure.

#Original PR issue: https://github.com/pytorch/pytorch/issues/51670

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_input_moved_to_cuda_device

buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- test_input_moved_to_cuda_device_script

buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule -j 1

CAUTION: This one actually fails and now it is bypassed. See FIXME in `_remote_forward`.
buck test mode/dev-nosan caffe2/torch/fb/training_toolkit/applications/sparse_nn/batch_distributed_inference/tests:batch_distributed_inference_test -- test_load_di_parts

Reviewed By: wanchaol

Differential Revision: D28021672

fbshipit-source-id: a89245dc35e1d9479811ec6f98d9f34116837d79
2021-04-30 18:04:45 -07:00
Jerry Zhang
0a541e23e1 [nn] Add allow_duplicate option for named_modules (#54812)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54812

Needed for quantization since different attributes might refer to the same module instance

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27408376

fbshipit-source-id: cada85c4a1772d3dd9502c3f6f9a56d690d527e7
2021-04-16 01:26:16 -07:00
Nikita Shulga
add49e7e4e Enforce PEP263 for PyTorch python codebase (#55346)
Summary:
All python files containing non-ASCII characters should be correctly annotated with `# -*- coding: utf-8 -*-` comment

Delete a number of superfluous non-ASCII characters, most commonly the closing single quotation mark U+2019 (’) used in place of the ASCII apostrophe ', for example `Module’s`->`Module's`
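
A check along these lines could be sketched as follows. The regular expression is the one given in PEP 263; the helper names are hypothetical and this is not the lint rule the PR actually added.

```python
import re

# Encoding-declaration pattern from PEP 263; only the first two lines are inspected.
_CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def needs_coding_declaration(source: bytes) -> bool:
    """True if the file has non-ASCII bytes but no declaration in its first two lines."""
    if source.isascii():
        return False
    first_two = source.splitlines()[:2]
    return not any(_CODING_RE.match(line.decode("latin-1")) for line in first_two)


def replace_curly_apostrophes(text: str) -> str:
    """Swap U+2019 for the ASCII apostrophe, as in Module\u2019s -> Module's."""
    return text.replace("\u2019", "'")


sample = "doc = 'Module\u2019s state'\n"
print(needs_coding_declaration(sample.encode("utf-8")))
print(replace_curly_apostrophes(sample))
```

Files that only contained stray curly apostrophes can be made pure ASCII with the second helper, after which no declaration is needed.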

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55346

Reviewed By: samestep

Differential Revision: D27582044

Pulled By: malfet

fbshipit-source-id: c1cd89655915858ff3a41f675cdfffff795a8e44
2021-04-06 18:31:38 -07:00
Pritam Damania
f612d4eb58 Add 'remote_parameters' and 'get_module_rref' to RemoteModule docs. (#54645)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54645

Had to replace RRef[..] with just RRef in the return signature since
Sphinx seemed to completely mess up rendering RRef[..]
ghstack-source-id: 125024783

Test Plan: View locally.

Reviewed By: SciPioneer

Differential Revision: D27314609

fbshipit-source-id: 2dd9901e79f31578ac7733f79dbeb376f686ed75
2021-03-26 21:41:28 -07:00
Pritam Damania
59c0c19be2 Add RemoteModule to master RPC docs. (#53084)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53084

Adding RemoteModule to master RPC docs since it is a prototype
feature.
ghstack-source-id: 122816689

Test Plan: waitforbuildbot

Reviewed By: rohan-varma

Differential Revision: D26743372

fbshipit-source-id: 00ce9526291dfb68494e07be3e67d7d9c2686f1b
2021-03-03 13:52:11 -08:00
chengjun
4a8ef4525e Add new backend type for Intel heterogeneous computation platform. (#49786)
Summary:
Add a new device type 'XPU' ('xpu' for lower case) to PyTorch. Changes are needed for code related to device model and kernel dispatch, e.g. DeviceType, Backend and DispatchKey etc.

https://github.com/pytorch/pytorch/issues/48246

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49786

Reviewed By: mrshenli

Differential Revision: D25893962

Pulled By: ezyang

fbshipit-source-id: 7ff0a316ee34cf0ed6fc7ead08ecdeb7df4b0052
2021-01-20 08:15:18 -08:00
Guilherme Leobas
5f8e1a1da9 add type annotations to torch.nn.modules.module (#49045)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/49044

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49045

Reviewed By: malfet

Differential Revision: D25767092

Pulled By: walterddr

fbshipit-source-id: a81ba96f3495943af7bb9ee3e5fc4c94c690c405
2021-01-11 17:01:47 -08:00
Samuel Marks
e6779d4357 [*.py] Rename "Arguments:" to "Args:" (#49736)
Summary:
I've written custom parsers and emitters for everything from docstrings to classes and functions. However, I recently came across an issue when I was parsing/generating from the TensorFlow codebase: inconsistent use of `Args:` and `Arguments:` in its docstrings.

```sh
(pytorch#c348fae)$ for name in 'Args:' 'Arguments:'; do
    printf '%-10s %04d\n' "$name" "$(rg -IFtpy --count-matches "$name" | paste -s -d+ -- | bc)"; done
Args:      1095
Arguments: 0336
```

It is easy enough to extend my parsers to support both variants, however it looks like `Arguments:` is wrong anyway, as per:

  - https://google.github.io/styleguide/pyguide.html#doc-function-args @ [`ddccc0f`](https://github.com/google/styleguide/blob/ddccc0f/pyguide.md)

  - https://chromium.googlesource.com/chromiumos/docs/+/master/styleguide/python.md#describing-arguments-in-docstrings @ [`9fc0fc0`](https://chromium.googlesource.com/chromiumos/docs/+/9fc0fc0/styleguide/python.md)

  - https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html @ [`c0ae8e3`](https://github.com/sphinx-contrib/napoleon/blob/c0ae8e3/docs/source/example_google.rst)

Therefore, only `Args:` is valid. This PR replaces them throughout the codebase.
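
One way such a rename could be scripted is sketched below. It assumes the header always sits alone on its line; `rename_arguments_header` is a hypothetical helper, not the tool the PR author used.

```python
import re

# Match "Arguments:" only when it is the sole content of a line (after indentation).
_HEADER_RE = re.compile(r"^([ \t]*)Arguments:[ \t]*$", flags=re.MULTILINE)


def rename_arguments_header(source: str) -> str:
    """Rewrite docstring section headers from 'Arguments:' to 'Args:'."""
    return _HEADER_RE.sub(r"\1Args:", source)


before = (
    "def scale(x, factor):\n"
    '    """Scale a value.\n'
    "\n"
    "    Arguments:\n"
    "        x: input value\n"
    "        factor: multiplier\n"
    '    """\n'
    "    return x * factor\n"
)
after = rename_arguments_header(before)
print(after)
```

Anchoring the pattern to whole lines avoids touching prose that merely mentions the word "Arguments:" mid-sentence.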

PS: For related PRs, see tensorflow/tensorflow/pull/45420

PPS: The trackbacks automatically appearing below are sending the same changes to other repositories in the [PyTorch](https://github.com/pytorch) organisation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49736

Reviewed By: albanD

Differential Revision: D25710534

Pulled By: soumith

fbshipit-source-id: 61e8ff01abb433e9f78185c2d1d0cbd7c22c1619
2020-12-28 09:34:47 -08:00
Yi Wang
2b1057b0cf [RPC Framework] Support retrieving the RRef to the remote module (#48983)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48983

Expose an API for users to retrieve the RRef for the underlying module.

This would be useful if users would like to run custom code on the remote end for the nn.Module.

Original PR issue: RemoteModule enhancements #40550
ghstack-source-id: 118378601

Test Plan: buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule

Reviewed By: pritamdamania87

Differential Revision: D25386042

fbshipit-source-id: 2dff33e8d5c9770be464eacf0b26c3e82f49a943
2020-12-10 23:53:44 -08:00
Xu Zhao
7f66fa62ca Fix typing errors in torch.distributed.nn.* directory. (#47533)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47533

Test Plan: Imported from OSS

Reviewed By: walterddr

Differential Revision: D24952500

Pulled By: xuzhao9

fbshipit-source-id: 8e66784fd8f9f111b6329e0bb48d6cd61c690a4a
2020-11-16 23:27:55 -08:00
Yi Wang
cab32d9cdf [RPC Framework] Support remote device format "<workername>/<device>" (#46773)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46773

Changed the constructor of RemoteModule to accept a `remote_device` arg in the following format:
"<workername>/<device>" (e.g., "trainer0/cpu", "ps0/cuda:0")

This arg merges the original `on` and `device` arg.

Original PR issue: RemoteDevice Format #46554
ghstack-source-id: 115448051

Test Plan: buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule

Reviewed By: pritamdamania87

Differential Revision: D24482562

fbshipit-source-id: 5acfc73772576a4b674df27625bf560b8f8e67c1
2020-10-29 00:14:56 -07:00
Yi Wang
c68cc78299 Add a device parameter to RemoteModule (#44254)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44254

Add a device parameter to RemoteModule, so it can be placed on any device
and not just CPU.

Original PR issue: RemoteModule enhancements #40550

Test Plan: buck test test/distributed/rpc:process_group_agent -- RemoteModule

Reviewed By: pritamdamania87

Differential Revision: D23483803

fbshipit-source-id: 4918583c15c6a38a255ccbf12c9168660ab7f6db
2020-09-18 10:31:03 -07:00
Yi Wang
396469f18c Explicitly forbidden the other inherited methods of RemoteModule. (#43895)
Summary:
Throw exceptions when any inherited method other than the forwardXXX methods is used.
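
The pattern of disabling inherited methods can be sketched in plain Python. `Base`, `Proxy`, the list of blocked names and the choice of `ValueError` are all assumptions for illustration, not the RemoteModule code.

```python
class Base:
    """Hypothetical base class with more methods than the proxy wants to expose."""

    def forward(self, x):
        return x + 1

    def state_dict(self):
        return {}

    def half(self):
        return self


def _make_unsupported(name):
    def method(self, *args, **kwargs):
        raise ValueError(f"Method ``{name}`` not supported for {type(self).__name__}")

    method.__name__ = name
    return method


class Proxy(Base):
    """Keeps forward() but rejects the other inherited methods."""


# Override each blocked method on the subclass so calls fail loudly.
for _name in ("state_dict", "half"):
    setattr(Proxy, _name, _make_unsupported(_name))

proxy = Proxy()
print(proxy.forward(1))
```

Failing explicitly is safer than silently running a method against local state that the proxy does not actually hold.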

Original PR issue: RemoteModule enhancements #40550

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43895

Test Plan: buck test test/distributed/rpc:process_group_agent -- RemoteModule

Reviewed By: rohan-varma

Differential Revision: D23392842

Pulled By: SciPioneer

fbshipit-source-id: 7c09a55a03f9f0b7e9f9264a42bfb907607f4651
2020-09-05 14:48:56 -07:00
Yi Wang
8b17fd2516 Add remote_parameters() into RemoteModule class. (#43906)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43906

This method returns a list of RRefs of remote parameters that can be fed into the DistributedOptimizer.

Original PR issue: RemoteModule enhancements #40550

Test Plan: buck test caffe2/test/distributed/rpc:process_group_agent -- RemoteModule

Reviewed By: rohan-varma

Differential Revision: D23399586

fbshipit-source-id: 4b0f1ccf2e47c8a9e4f79cb2c8668f3cdbdff820
2020-09-04 16:22:40 -07:00
wudenggang
9600ed9af3 typo fixes (#41632)
Summary:
typo fixes

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41632

Reviewed By: ezyang

Differential Revision: D22617827

Pulled By: mrshenli

fbshipit-source-id: c2bfcb7cc36913a8dd32f13fc9adc3aa0a9b682f
2020-07-20 07:23:00 -07:00
Shihao Xu
00651b8c93 [distributed.nn] Implement TorchScript-compatible RemoteModule API (#37139)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37139

See design doc in https://github.com/pytorch/pytorch/issues/37136

ghstack-source-id: 105926270

Test Plan:
TODO:

- Make the generated Interface usable. https://github.com/pytorch/pytorch/pull/37139#discussion_r434190978
- Avoid generating the same template instances for Module that is not scriptable.
- Remove "infer_module_interface_cls".
- Use Python format instead of a CodeTemplate
- Use Python tempfile to track and delete the file. Does it work if there is a crash?

```
buck test mode/dev-nosan //caffe2/test/distributed/nn/jit:test_instantiator

buck build mode/dev-nosan //caffe2/test/distributed/nn/jit:test_instantiator && \
buck-out/gen/caffe2/test/distributed/nn/jit/test_instantiator\#binary.par -r test_instantiate_scripted_remote_module_template

buck build mode/dev-nosan //caffe2/test/distributed/nn/jit:test_instantiator && \
buck-out/gen/caffe2/test/distributed/nn/jit/test_instantiator\#binary.par -r test_instantiate_non_scripted_remote_module_template
```

```
buck test mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_spawn
```

```
buck test mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork

buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \
buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_user_provided_global_unique_name

buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \
buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_forward_async_script

buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \
buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_forward_sync_script

buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \
buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_forward_with_kwargs

buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \
buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_user_provided_global_unique_name
```

```
buck test mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork
```

buck test mode/opt-asan //caffe2/test:jit -- 'test_script_forward_method_replacement'

buck build mode/dev-nosan //caffe2/test:jit && \
buck-out/gen/caffe2/test/jit\#binary.par -r 'test_script_forward_method_replacement'

buck build mode/dev-nosan //caffe2/test:jit && \
buck-out/gen/caffe2/test/jit\#binary.par -r 'test_imported_classes'

Differential Revision: D20499658

fbshipit-source-id: dd9383ae4eb2343366c11127664f845b91ca3b0a
2020-06-15 19:07:35 -07:00