Commit Graph

9 Commits

Edward Yang
173f224570 Turn on F401: Unused import warning. (#18598)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**

This was requested by someone at Facebook; this lint is turned
on for Facebook by default.  "Sure, why not."

I had to noqa a number of imports in `__init__`.  In principle
we're supposed to use `__all__` in this case, but I was too lazy
to fix it.  Left for future work.

Be careful!  flake8-2 and flake8-3 behave differently with
respect to import resolution for `# type:` comments: flake8-3 will
report the import as unused, while flake8-2 will not.  For now, I just
noqa'd all these sites.
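
For illustration, a minimal sketch of the noqa sites described above (`pkg` and `helper` are hypothetical names):

```python
# Used only in the "# type:" comment below; flake8-3 flags this as
# unused (flake8-2 does not), hence the per-site noqa.
import typing  # noqa: F401

# A re-export silenced at the import site:
from pkg import helper  # noqa: F401

# The __all__ alternative deferred above: listing the name marks the
# re-export as intentional without a noqa.
__all__ = ["helper"]

xs = []  # type: typing.List[int]
```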

All the changes were done by hand.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D14687478

fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
2019-03-30 09:01:17 -07:00

Andy Wei
5eee0670ab Pass torch.distributed launch process local rank as environment variable instead of argument (#16360)
Summary:
`torch.distributed.launch.py` currently passes `local_rank` as a command-line argument and requires the user's program to parse it. It would be more flexible for users, and more consistent with the other variables it sets (e.g. `RANK`, `MASTER_PORT`, `WORLD_SIZE`), to pass it through an environment variable instead.

265ed8ff45/torch/distributed/launch.py (L200-L212)
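
From the training script's side, the difference looks roughly like this (a sketch; it assumes the launcher exports the rank as `LOCAL_RANK`, matching the naming of the variables above):

```python
import argparse
import os

# Before: the launcher appends --local_rank to every command line,
# and each user script has to parse it.
parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=0)
args = parser.parse_args()

# After: the rank arrives through the environment, like RANK,
# MASTER_PORT, and WORLD_SIZE already do.
local_rank = int(os.environ.get("LOCAL_RANK", args.local_rank))
```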
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16360

Differential Revision: D14070372

Pulled By: ezyang

fbshipit-source-id: c3f6a8e55ab513918cad09d1326eccdedb4d98c9
2019-02-15 14:52:55 -08:00

Andy Wei
4cf76574b9 Raise CalledProcessError when torch.distributed launch process not return 0 (#16069)
Summary:
`torch.distributed.launch.py` does not raise an error when a process it started via `subprocess.Popen` exits with a non-zero return code.
For easier debugging, it should always raise an error when a launched process terminates abnormally.
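
A minimal sketch of the intended behavior (the child command here is illustrative):

```python
import subprocess
import sys

# Launch a child process that exits with a non-zero code.
processes = [subprocess.Popen([sys.executable, "-c", "raise SystemExit(1)"])]

for process in processes:
    process.wait()
    if process.returncode != 0:
        # Fail loudly with the offending command and its exit code
        # instead of returning silently.
        raise subprocess.CalledProcessError(returncode=process.returncode,
                                            cmd=process.args)
```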
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16069

Differential Revision: D13709467

Pulled By: ezyang

fbshipit-source-id: 31d32a5ec8fed7bccd62d845bfba0e670ed3fe20
2019-01-22 08:50:47 -08:00

Edward Yang
058a31839d Warn about local_rank not being globally unique. (#12370)
Summary:
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

CC deepakn94
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12370

Differential Revision: D10220135

Pulled By: ezyang

fbshipit-source-id: 6d1a8a383951ae52753e4f75a14b8080bf02b815
2018-10-05 17:38:41 -07:00

Tongzhou Wang
8e33451e2e Make torch.cuda.* take device objects; Update distributed docs (#10833)
Summary:
Commits:

1. Make `torch.cuda.*` take device objects
2. Update `torch.distributed` docs to emphasize calling `torch.cuda.set_device` before `init_process_group` (see the sketch below)
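
A minimal sketch of the recommended ordering (the NCCL backend, `env://` initialization, and the `LOCAL_RANK` variable are illustrative assumptions):

```python
import os

import torch
import torch.distributed as dist

local_rank = int(os.environ["LOCAL_RANK"])  # illustrative source of the rank

# Bind this process to its GPU first; per commit 1, torch.cuda.* now
# accepts a torch.device object as well as a bare index.
torch.cuda.set_device(torch.device("cuda", local_rank))

# Only then create the process group, so collectives run on that device.
dist.init_process_group(backend="nccl", init_method="env://")
```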
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10833

Differential Revision: D9514241

Pulled By: SsnL

fbshipit-source-id: 2497464305fb1e63d6c495291a5744aaa7e2696e
2018-08-27 15:24:42 -07:00

Kaiyu Shi
db6e4576da Use customized python interpreter (#7520) 2018-05-12 13:06:39 -04:00

Edgar Andrés Margffoy Tuay
4ab6ea5b1f Add unbuffered flag to distributed node launcher (#7226) 2018-05-03 11:49:06 +02:00

Teng Li
f5beff334b Added distributed docs on NCCL2 backend/functions and launch module (#6579) 2018-04-15 21:53:10 -04:00

Teng Li
37059ba0ec Added torch.distributed.launch module for easier multi-proc/node distributed job launching (#5348) 2018-03-13 12:04:38 +01:00