Commit Graph

30 Commits

Author SHA1 Message Date
Yuanyuan Chen
a60d9e1f6d Fix flake8 B028 warnings (#166224)
This PR fixes flake8 B028 warnings by specifying `stacklevel=2` in `warnings.warn`. The advantage is that users get more contextual information about where PyTorch warnings originate.
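A minimal sketch of the pattern this change applies, with hypothetical names (not the actual diff):

```python
import warnings

def _legacy_helper() -> None:
    # stacklevel=2 attributes the warning to the caller of _legacy_helper(),
    # so the reported file/line points at user code rather than this module.
    warnings.warn("_legacy_helper is deprecated", UserWarning, stacklevel=2)

def user_code() -> None:
    _legacy_helper()  # the warning is reported at this call site
```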

Pull Request resolved: https://github.com/pytorch/pytorch/pull/166224
Approved by: https://github.com/ezyang
2025-10-26 06:18:55 +00:00
Aaron Orenstein
805c4b597a PEP585 update - torch/_higher_order_ops torch/_subclasses torch/backends torch/compiler torch/cuda torch/masked torch/mtia torch/nested (#145202)
See #145101 for details.
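For context, PEP 585 allows the builtin container types to be used directly as generic annotations; a hypothetical before/after:

```python
from typing import Dict, List  # pre-PEP 585 aliases

def pair_old(xs: List[int]) -> Dict[int, int]:
    return {x: x * x for x in xs}

# PEP 585: the same annotation written with builtin generics, no typing import needed.
def pair_new(xs: list[int]) -> dict[int, int]:
    return {x: x * x for x in xs}
```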
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145202
Approved by: https://github.com/bobrenjc93
2025-01-20 22:37:26 +00:00
Aaron Orenstein
62bcdc0ac9 Flip default value for mypy disallow_untyped_defs [4/11] (#127841)
See #127836 for details.
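As a rough illustration (assumed, not from the PR) of what `disallow_untyped_defs` enforces once it defaults to true:

```python
# Rejected by mypy under disallow_untyped_defs: no parameter/return annotations.
def scale(x, factor):
    return x * factor

# Accepted: a fully annotated definition.
def scale_typed(x: float, factor: float) -> float:
    return x * factor
```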

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127841
Approved by: https://github.com/oulgen
2024-06-08 18:36:48 +00:00
Kazuaki Ishizaki
6adcf21b2b Documenting the torch.cuda.nccl.version function (#128022)
Fixes #127892

This PR adds a docstring to the `torch.cuda.nccl.version` function.
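A small usage sketch of the documented function (assumes a CUDA build of PyTorch with NCCL):

```python
from torch.cuda import nccl

# version() reports the NCCL version PyTorch was built against as a tuple,
# e.g. (2, 17, 1), which makes comparisons straightforward.
print(nccl.version())
```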

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128022
Approved by: https://github.com/malfet
2024-06-06 01:13:07 +00:00
Xuehai Pan
67ef2683d9 [BE] wrap deprecated function/class with typing_extensions.deprecated (#127689)
Use `typing_extensions.deprecated` for deprecation annotations where possible. Otherwise, add `category=FutureWarning` to `warnings.warn("message")` calls where the category is missing.

Note that only warnings whose messages contain `[Dd]eprecat(ed|ion)` are updated in this PR.
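A hedged sketch of the decorator pattern described above (illustrative names only):

```python
from typing_extensions import deprecated

@deprecated("old_api() is deprecated; use new_api() instead", category=FutureWarning)
def old_api() -> None:
    # The decorator marks the function for type checkers and emits a
    # FutureWarning at runtime when old_api() is called.
    ...

def new_api() -> None:
    ...
```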

Resolves #126888

This PR is split from PR #126898.

------

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127689
Approved by: https://github.com/Skylion007
2024-06-02 12:30:43 +00:00
PyTorch MergeBot
033e733021 Revert "[BE] wrap deprecated function/class with typing_extensions.deprecated (#126898)"
This reverts commit 749a132fb0.

Reverted https://github.com/pytorch/pytorch/pull/126898 on behalf of https://github.com/fbgheith due to switching typing-extensions from 4.3.0 to 4.9.0 causing an internal failure ([comment](https://github.com/pytorch/pytorch/pull/126898#issuecomment-2142884456))
2024-05-31 19:47:24 +00:00
Xuehai Pan
749a132fb0 [BE] wrap deprecated function/class with typing_extensions.deprecated (#126898)
Use `typing_extensions.deprecated` for deprecation annotations where possible. Otherwise, add `category=FutureWarning` to `warnings.warn("message")` calls where the category is missing.

Note that only warnings whose messages contain `[Dd]eprecat(ed|ion)` are updated in this PR.

UPDATE: Use `FutureWarning` instead of `DeprecationWarning`.
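Where the decorator does not apply, the fallback described above looks roughly like this (hypothetical function name):

```python
import warnings

def legacy_entry_point() -> None:
    # An explicit FutureWarning is shown to end users by default,
    # unlike DeprecationWarning, which is hidden outside of __main__.
    warnings.warn(
        "legacy_entry_point is deprecated and will be removed",
        FutureWarning,
        stacklevel=2,
    )
```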

Resolves #126888

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126898
Approved by: https://github.com/albanD
2024-05-29 12:09:27 +00:00
Wes Bland
9c331be919 [pytorch] Remove dot if no suffix (#113273)
Summary: Adding the suffix to the version string shouldn't happen if there is no suffix.

Test Plan:
```
/data/users/wbland/fbsource/buck-out/v2/gen/fbcode/param_bench/train/comms/pt/comms.par \
--backend nccl --device cuda --collective all_gather \
--master-ip <snip> --log INFO --b 256 --e 1K \
--num-coll-per-iteration 10 --mode comms --num_iters 5 --w 1 --z 1
...
I1108 07:58:33.852557 2344130 ProcessGroupNCCL.cpp:990] [Rank 0] ProcessGroupNCCL initialization options: NCCL version: 2.17.1, NCCL_ASYNC_ERROR_HANDLING: 3, NCCL_DESYNC_DEBUG: 0, NCCL_ENABLE_TIMING: 0, NCCL_BLOCKING_WAIT: 0, TIMEOUT(ms): 600000, USE_HIGH_PRIORITY_STREAM: 0, TORCH_DISTRIBUTED_DEBUG: OFF, NCCL_DEBUG: OFF, ID=139992854228992
...
```
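The described fix amounts to appending the dot-separated suffix only when one exists; a hypothetical sketch (names assumed, not the actual diff):

```python
def format_nccl_version(major: int, minor: int, patch: int, suffix: str = "") -> str:
    base = f"{major}.{minor}.{patch}"
    # Only add the trailing ".suffix" when there actually is a suffix,
    # so "2.17.1" does not become "2.17.1."
    return f"{base}.{suffix}" if suffix else base
```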

Differential Revision: D51116095

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113273
Approved by: https://github.com/kwen2501
2023-11-12 15:41:27 +00:00
Wes Bland
9d765d28ca [pytorch] Add binding to get nccl version suffix (#112884)
Summary: Adds a Python-to-C binding to get the NCCL_SUFFIX value for more accurate NCCL version information and adds that to the NCCL version tuple.

Differential Revision: D50978181

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112884
Approved by: https://github.com/kwen2501
2023-11-08 02:51:22 +00:00
Edward Z. Yang
3bf922a6ce Apply UFMT to low traffic torch modules (#106249)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106249
Approved by: https://github.com/Skylion007
2023-07-29 23:37:30 +00:00
Carlos Mocholí
491ee70e6e Avoid collections deprecation warning (#72239)
Summary:
Avoids the following deprecation warning:

```python
    loss.backward(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/torch/tensor.py:245: in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
/usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py:147: in backward
    allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
/usr/local/lib/python3.7/dist-packages/torch/autograd/function.py:89: in apply
    return self._forward_cls.backward(self, *args)  # type: ignore
/usr/local/lib/python3.7/dist-packages/torch/nn/parallel/_functions.py:34: in backward
    return (None,) + ReduceAddCoalesced.apply(ctx.input_device, ctx.num_inputs, *grad_outputs)
/usr/local/lib/python3.7/dist-packages/torch/nn/parallel/_functions.py:45: in forward
    return comm.reduce_add_coalesced(grads_, destination)
/usr/local/lib/python3.7/dist-packages/torch/nn/parallel/comm.py:143: in reduce_add_coalesced
    flat_result = reduce_add(flat_tensors, destination)
/usr/local/lib/python3.7/dist-packages/torch/nn/parallel/comm.py:96: in reduce_add
    nccl.reduce(inputs, output=result, root=root_index)
/usr/local/lib/python3.7/dist-packages/torch/cuda/nccl.py:69: in reduce
    _check_sequence_type(inputs)
/usr/local/lib/python3.7/dist-packages/torch/cuda/nccl.py:48: in _check_sequence_type
    if not isinstance(inputs, collections.Container) or isinstance(inputs, torch.Tensor):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

name = 'Container'

    def __getattr__(name):
        # For backwards compatibility, continue to make the collections ABCs
        # through Python 3.6 available through the collections module.
        # Note, no new collections ABCs were added in Python 3.7
        if name in _collections_abc.__all__:
            obj = getattr(_collections_abc, name)
            import warnings
            warnings.warn("Using or importing the ABCs from 'collections' instead "
                          "of from 'collections.abc' is deprecated since Python 3.3,"
                          "and in 3.9 it will stop working",
>                         DeprecationWarning, stacklevel=2)
E           DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working

/usr/lib/python3.7/collections/__init__.py:52: DeprecationWarning
```
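The underlying fix swaps the deprecated top-level `collections` ABC alias for `collections.abc`; a sketch of the corrected check (error message illustrative):

```python
import collections.abc

import torch

def _check_sequence_type(inputs) -> None:
    # collections.Container was a deprecated alias removed in Python 3.10;
    # the ABC lives in collections.abc.
    if not isinstance(inputs, collections.abc.Container) or isinstance(inputs, torch.Tensor):
        raise TypeError("inputs should be a collection of tensors")
```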

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72239

Reviewed By: ngimel

Differential Revision: D34387815

Pulled By: mruberry

fbshipit-source-id: 30c9b4fe518351bc9a6f211269e27ee3ab73a13c
(cherry picked from commit 1f68cdfac5)
2022-02-23 02:31:42 +00:00
Eddie Yan
d893b44cd8 change nccl version reporting (#62916)
Summary:
https://github.com/pytorch/pytorch/issues/62295

Previously the packing and unpacking of the NCCL version "integer" was done to have parity with the upstream NCCL version encoding. However, there doesn't seem to be any place where this integer is directly compared with a version integer sourced from upstream NCCL, and syncing the encoding seems error-prone (e.g., a recent change added a special case for minor versions >= 10; see 7e51592129/src/nccl.h.in (L22)).

This patch changes the reporting to return a tuple of version numbers instead (to preserve ease-of-use for comparisons) and tweaks the passing between C/Python to avoid the digit overflow problem.
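With tuple-based reporting, callers compare against a version tuple directly instead of decoding a packed integer; a small sketch:

```python
from torch.cuda import nccl

ver = nccl.version()  # e.g. (2, 10, 3)

# Tuple comparison replaces arithmetic on a packed integer such as 2103.
if ver[:3] >= (2, 10, 0):
    print("NCCL >= 2.10 features available")
```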

CC ngimel mcarilli

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62916

Reviewed By: anjali411

Differential Revision: D30201069

Pulled By: mrshenli

fbshipit-source-id: 2e4e7c69f001c3f22bd04aa6df6a992e538bea45
2021-08-10 17:46:27 -07:00
Chester Liu
58eb23378f Clean up usage of torch._six partially (#49785)
Summary:
See https://github.com/pytorch/pytorch/issues/42919

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49785

Reviewed By: mruberry

Differential Revision: D25963833

Pulled By: bugra

fbshipit-source-id: 11c90d6b8d3f206c9d0a4d8621b773beb10c6ba2
2021-02-08 13:58:34 -08:00
Nikita Shulga
8ab2ad306d Enable torch.cuda.nccl typechecking (#45344)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/45336

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45344

Reviewed By: walterddr

Differential Revision: D23935306

Pulled By: malfet

fbshipit-source-id: dd09d4f8ff7a327131764487158675027a13bf69
2020-09-25 17:02:47 -07:00
Pritam Damania
306eb3def7 Additional error checking for torch.cuda.nccl APIs. (#43247)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43247

`torch.cuda.nccl` APIs didn't throw appropriate errors when called
with inputs/outputs of the wrong type, which resulted in some cryptic
errors instead.

Adding some error checks with explicit error messages for these APIs.
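A hedged sketch of the kind of explicit check described (helper name and message are assumptions):

```python
import torch

def _check_cuda_tensors(inputs) -> None:
    # Fail fast with a descriptive message instead of a cryptic downstream error.
    for i, t in enumerate(inputs):
        if not isinstance(t, torch.Tensor) or not t.is_cuda:
            raise TypeError(f"Expected a CUDA tensor at position {i}, got {type(t)!r}")
```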
ghstack-source-id: 110683546

Test Plan: waitforbuildbot

Reviewed By: rohan-varma

Differential Revision: D23206069

fbshipit-source-id: 8107b39d27f4b7c921aa238ef37c051a9ef4d65b
2020-08-26 13:50:00 -07:00
SsnL
d5236f8517 Avoid initializing unnecessary tensors in nccl.reduce (#39688)
Summary:
While working on https://github.com/pytorch/pytorch/issues/38911, I realized that `nccl.reduce` only needs a single output tensor, while our current implementation requires a list of output tensors. This, along with a TODO I fixed in reduce_add, should give some speedup for data parallel.
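After this change, a usage sketch of reduce with a single output tensor might look like the following (assumes two CUDA devices):

```python
import torch
from torch.cuda import nccl

# One input tensor per GPU; only the root device needs an output buffer.
inputs = [torch.ones(4, device=f"cuda:{i}") for i in range(2)]
output = torch.empty(4, device="cuda:0")

nccl.reduce(inputs, output=output, root=0)  # sums both inputs into `output` on cuda:0
```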
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39688

Differential Revision: D22034547

Pulled By: mrshenli

fbshipit-source-id: e74d54d673ebbb062474b1bb5cc93a095a3a5f6c
2020-06-14 10:11:32 -07:00
Sam Gross
bcfe259f83 Add streams and comms as optional arguments (#3968)
Adds streams and comms as optional arguments to the NCCL calls in
torch.cuda.nccl. Also exposes ncclUniqueId and ncclCommInitRank for
multi-process mode.

Moves Py_RETURN_NONE statements after the GIL is re-acquired.
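A rough usage sketch of the optional arguments (illustrative; assumes two GPUs):

```python
import torch
from torch.cuda import nccl

tensors = [torch.ones(4, device=f"cuda:{i}") for i in range(2)]
streams = [torch.cuda.Stream(device=i) for i in range(2)]

# streams (and comms) are optional; when provided, the collective is enqueued
# on the given per-device streams instead of each device's current stream.
nccl.all_reduce(tensors, streams=streams)
```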
2017-12-04 13:51:22 -05:00
SsnL
01be4d6b20 sparse broadcast_coalesce and reduce_add_coalesced 2017-10-28 18:52:35 -04:00
Soumith Chintala
efe91fb9c1 delete redundant python nccl code 2017-10-09 22:24:18 -04:00
Soumith Chintala
e9dccb3156 implement all_reduce, broadcast, all_gather, reduce_scatter 2017-10-09 22:24:18 -04:00
Soumith Chintala
4d62933529 add initial NCCL C bindings 2017-10-09 22:24:18 -04:00
Christian Sarofeen
ec86d0b2ba Updates for CUDA 9 2017-08-25 07:32:05 -04:00
Adam Paszke
8ab3d214d5 Fixes for DistributedDataParallel (#2168) 2017-07-21 16:00:46 -04:00
Adam Paszke
8db8716c7c Support non-default streams in NCCL reduce 2017-06-12 21:58:38 -04:00
Sam Gross
b9379cfab7 Use cuDNN and NCCL symbols from _C library (#1017)
This ensures that we use the same library at the C++ level and with
Python ctypes. It moves the searching for the correct library from
run-time to compile-time.
2017-03-16 16:10:17 -04:00
Sam Gross
fc6fcf23f7 Lock the cudaFree mutex. (#880)
Prevents NCCL calls from overlapping with cudaFree() which can lead to
deadlocks.
2017-03-01 11:29:25 -05:00
Luke Yeager
e7c1e6a8e3 [pep8] Fix most lint automatically with autopep8
Here's the command I used to invoke autopep8 (in parallel!):

    git ls-files | grep '\.py$' | xargs -n1 -P`nproc` autopep8 -i

Several rules are ignored in setup.cfg. The goal is to let autopep8
handle everything which it can handle safely, and to disable any rules
which are tricky or controversial to address. We may want to come back
and re-enable some of these rules later, but I'm trying to make this
patch as safe as possible.

Also configures flake8 to match pep8's behavior.

Also configures TravisCI to check the whole project for lint.
2017-01-28 01:15:51 +01:00
Natalia Gimelshein
2290798a83 if nccl is available, do not compile it and load system version 2017-01-14 10:09:48 +01:00
Sam Gross
0cb5943be8 Fix NCCL reduce_scatter in Python 2.7 (#183) 2016-10-30 17:58:02 -04:00
Sam Gross
f30081a313 Use NCCL bcast and reduce functions in comm 2016-10-14 14:16:32 -07:00