This PR fixes https://github.com/pytorch/pytorch/issues/96203.
**Details**
When using `nn.SyncBatchNorm` with a model converted to FP16, there is a dtype discrepancy in `SyncBatchNorm.forward()` that causes an error like:
```
File "/.../pytorch/torch/nn/modules/_functions.py", line 91, in forward
mean, invstd = torch.batch_norm_gather_stats_with_counts(
RuntimeError: Expected counts to have type Half but got Float
```
[`torch.batch_norm_gather_stats_with_counts()`](fe9da29842/torch/nn/modules/_functions.py (L88-L97)) requires `running_mean`, `running_var`, and `counts` to have the same dtype. However, when the model has been converted to FP16, only `running_mean` and `running_var` are FP16, while `counts` is FP32 because [`mean` is FP32](fe9da29842/torch/nn/modules/_functions.py (L25-L30)). This PR resolves the mismatch by casting `counts` from FP32 to FP16, rather than the alternative of casting `mean` and `invstd` from FP32 to FP16.
For the backward pass, this PR casts `weight` from FP16 to FP32 to match the dtype of `mean` and `invstd`, as required by `torch.batch_norm_backward_elemt()`, rather than casting `mean` and `invstd` from FP32 to FP16.
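A minimal sketch of the two casts, using standalone toy tensors rather than the actual variables inside `_functions.py` (not the verbatim diff):
```python
import torch

# Forward: counts gathered via the FP32 statistics path are cast to match the
# FP16 running stats of a half-precision module.
running_mean = torch.zeros(3, dtype=torch.float16)   # module converted with .half()
count_all = torch.tensor([[8.0], [8.0]])             # per-rank counts, FP32 like `mean`
counts = count_all.view(-1)
if counts.dtype != running_mean.dtype:
    counts = counts.to(running_mean.dtype)           # FP32 -> FP16

# Backward: the converse cast, promoting the FP16 weight to FP32 so it matches
# mean/invstd as torch.batch_norm_backward_elemt() expects.
weight = torch.ones(3, dtype=torch.float16)
mean = torch.zeros(3, dtype=torch.float32)
if weight.dtype != mean.dtype:
    weight = weight.to(mean.dtype)                   # FP16 -> FP32
```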
**Test Plan**
I dug up this run command from 2021:
For `world_size` in `{1,2}` and `backend` in `{nccl, gloo}`:
```
WORLD_SIZE=world_size BACKEND=backend python -m pytest test/distributed/test_distributed_spawn.py -k test_DistributedDataParallel_SyncBatchNorm_half -vs
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98332
Approved by: https://github.com/rohan-varma
We recently updated `SyncBatchNorm` to support empty input batches.
The new code removes stats from ranks with empty inputs. However,
this change breaks CUDA graph capture, as it forces a CPU sync. This
commit uses `is_current_stream_capturing()` to guard the new code
path, and only runs it when not capturing CUDA graphs. To
support empty inputs with CUDA graph capturing, we might need to
update CUDA kernels for `batch_norm_backward_elemt` and
`batch_norm_gather_stats_with_counts`. See #78656.
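A hedged sketch of the guard (illustrative helper name `filter_empty_ranks`; not the verbatim diff):
```python
import torch

def filter_empty_ranks(combined, count_all):
    # Only run the empty-input filtering when we are NOT capturing a CUDA graph;
    # the boolean-mask indexing below depends on values living on the GPU, which
    # forces a CPU sync and would break capture.
    if not (torch.cuda.is_available() and torch.cuda.is_current_stream_capturing()):
        mask = count_all.squeeze(-1) >= 1
        combined = combined[mask]
        count_all = count_all[mask]
    return combined, count_all
```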
Fixes #78549
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78666
Approved by: https://github.com/albanD
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74944
Fixes #36530
Prior to this commit, SyncBatchNorm crashed on empty inputs with the following
error message:
```
File "..../torch/nn/modules/_functions.py", line 17, in forward
mean, invstd = torch.batch_norm_stats(input, eps)
RuntimeError: cannot reshape tensor of 0 elements into shape [0, 3, -1] because the unspecified dimension size -1 can be any value and is ambiguous
```
This PR adds a dedicated branch to handle empty inputs. When a process
receives an empty input, it sets its local `mean`, `invstd`, and `count`
to zero and still participates in the `all_gather` collective communication in
the forward pass. The `mean` and `invstd` entries with a zero count are then
filtered out before computing the global mean and invstd. In the backward
pass, the process likewise participates in the `all_reduce` communication with zero
tensors to unblock its peers.
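A hedged sketch of the per-rank forward contribution (illustrative helper name `local_stats`; the real logic lives inside `SyncBatchNorm.forward`):
```python
import torch

def local_stats(input, eps):
    num_channels = input.size(1)
    if input.numel() > 0:
        mean, invstd = torch.batch_norm_stats(input, eps)
        count = torch.full((1,), input.numel() // num_channels,
                           dtype=input.dtype, device=input.device)
    else:
        # Empty input on this rank: contribute zero stats and a zero count so
        # the all_gather still completes; peers filter out zero-count entries
        # before computing the global mean/invstd.
        mean = torch.zeros(num_channels, dtype=input.dtype, device=input.device)
        invstd = torch.zeros(num_channels, dtype=input.dtype, device=input.device)
        count = torch.zeros(1, dtype=input.dtype, device=input.device)
    return mean, invstd, count
```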
Differential Revision: D35273409
Test Plan: Imported from OSS
Reviewed By: datumbox
Pulled By: mrshenli
fbshipit-source-id: 1cee51eea866773c329b3fbf5da2be8a5fee6f0f
(cherry picked from commit f8e2a2357240ebe7b7a058047d376a5300bdeda9)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57769
`_all_gather_base` avoids the extra copies incurred by `all_gather`, so it is more efficient.
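A hedged illustration of the idea (assumes an initialized process group; `_all_gather_base` was a private API, later exposed publicly as `all_gather_into_tensor`):
```python
import torch
import torch.distributed as dist

def gather_combined(combined: torch.Tensor) -> torch.Tensor:
    world_size = dist.get_world_size()
    # Instead of all_gather into a list of per-rank tensors followed by copies,
    # gather straight into one pre-allocated flat buffer.
    combined_all = torch.empty(world_size * combined.numel(),
                               dtype=combined.dtype, device=combined.device)
    dist.all_gather_into_tensor(combined_all, combined)
    return combined_all.view(world_size, -1)
```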
Test Plan: unit test
Reviewed By: SciPioneer
Differential Revision: D28227193
fbshipit-source-id: ddd8590095a5b45676497a71ed792a457f9825c6
Summary:
per title
This PR did
- Migrate `apex.parallel.SyncBatchNorm` channels_last to pytorch `torch.nn.SyncBatchNorm`
- Fix a TODO here by fusing the `sum` and `div` kernels into the backward elementwise kernel
b167402e2e/torch/nn/modules/_functions.py (L76-L95)
Todo
- [x] Discuss a regression introduced in https://github.com/pytorch/pytorch/pull/37133#discussion_r512530389, which is the synchronized copy here
b167402e2e/torch/nn/modules/_functions.py (L32-L34)
**Comment**: This PR uses the apex version of the size check. Tests passed and I haven't seen anything wrong so far.
- [x] The restriction on using the channels_last kernel looks like this
```
inline bool batch_norm_use_channels_last_kernels(const at::Tensor& self) {
return self.is_contiguous(at::MemoryFormat::ChannelsLast) || self.ndimension() == 2;
}
```
I think we can relax that for channels_last_3d as well?
**Comment**: we don't have a benchmark for this now; will check and add the functionality later when needed.
- [x] Add test
- [x] Add benchmark
Detailed benchmark is at https://github.com/xwang233/code-snippet/tree/master/syncbn-channels-last
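For context on the channels_last path discussed above, a minimal usage sketch (assumes a CUDA DDP setup with world_size > 1; not part of this PR's diff):
```python
import torch
import torch.nn as nn

sbn = nn.SyncBatchNorm(64).cuda()
x = torch.randn(8, 64, 32, 32, device="cuda").to(memory_format=torch.channels_last)
# out = sbn(x)  # run inside a DDP process group; with channels_last input the
#               # migrated channels_last CUDA kernels are used end to end
```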
Close https://github.com/pytorch/pytorch/issues/50781
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46906
Reviewed By: albanD
Differential Revision: D26771437
Pulled By: malfet
fbshipit-source-id: d00387044e9d43ac7e6c0e32a2db22c63d1504de
Summary:
Fixes https://github.com/pytorch/pytorch/issues/598
This is BC-breaking, as we now explicitly don't call the hook when there are no Tensors at the top level of the output.
This feature was not working anyway, since the returned grad_input/grad_output were wrong (they did not respect the output structure and gave wrong inputs for multi-Node Modules).
This is also BC-breaking in that we now report the correct gradients for `nn.Module`s that contain multiple autograd `Node`s, whereas we used to return incorrect results.
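A small illustration of the post-fix behavior, using the full-hook API name from current PyTorch (`register_full_backward_hook`), which may differ from the exact entry point this PR touched:
```python
import torch
import torch.nn as nn

mlp = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

def hook(module, grad_input, grad_output):
    # grad_input/grad_output mirror the module's actual input/output structure,
    # even though the module contains several autograd Nodes.
    print([g.shape for g in grad_input if g is not None],
          [g.shape for g in grad_output if g is not None])

mlp.register_full_backward_hook(hook)
mlp(torch.randn(3, 4, requires_grad=True)).sum().backward()
```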
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46163
Reviewed By: ailzhang, mruberry
Differential Revision: D24894180
Pulled By: albanD
fbshipit-source-id: e1b5d193d2818eb2f51e2a2722c7405c8bd13c2b
Summary:
Update the requirements on input dimensions for torch.nn.SyncBatchNorm:
1. Checks the aggregated batch size `count_all` instead of the batch size in every DDP process (https://github.com/pytorch/pytorch/issues/36865)
2. Adds a test for SyncBatchNorm where every process has only 1 input
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37133
Differential Revision: D21331120
Pulled By: zhaojuanmao
fbshipit-source-id: ef3d1937990006609cfe4a68a64d90276c5085f2
Summary:
As shown in https://github.com/pytorch/pytorch/issues/36452 , SyncBatchNorm can block the host thread due to the ``MemcpyDtoH`` and ``MemcpyHtoD`` incurred when handling the ``counts`` argument of the native function ``batch_norm_gather_stats_with_counts``.
- This fix changes the signature of ``batch_norm_gather_stats_with_counts`` to
```c++
std::tuple<Tensor, Tensor> batch_norm_gather_stats_with_counts_cuda(const Tensor& self, const Tensor& mean, const Tensor& invstd, const Tensor& running_mean, const Tensor& running_var, double momentum, double epsilon, const Tensor& counts)
```
so that it can directly receive ``counts`` as a ``CUDATensor`` rather than an ``IntArrayRef`` whose data lives in host memory.
- This fix also improves the implementation of the ``SyncBatchNorm`` function so that constructing the ``counts`` tensor no longer causes an additional ``MemcpyHtoD``, which would also block the host thread.
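A hedged sketch of the idea (shapes illustrative; requires a CUDA device):
```python
import torch

x = torch.randn(8, 3, 32, 32, device="cuda")
mean, invstd = torch.batch_norm_stats(x, 1e-5)
# Build the count directly on the GPU so that passing it to
# batch_norm_gather_stats_with_counts needs no extra host-to-device copy.
count = torch.full((1,), x.numel() // x.size(1),
                   dtype=mean.dtype, device=x.device)
```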
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36659
Differential Revision: D21196991
Pulled By: ngimel
fbshipit-source-id: 84a529e6cf22e03618fecbb8f070ec452f81229e
Summary:
Update the requirements on input dimensions for `torch.nn.SyncBatchNorm`:
1. 2D inputs are now permissible, https://github.com/pytorch/pytorch/issues/20204 ;
2. at least two elements are required along the normalization plane (matching BatchNorm behavior);
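A quick illustration of the relaxed input check (assumes the usual DDP setup; the forward call is commented out since it needs an initialized process group):
```python
import torch
import torch.nn as nn

sbn = nn.SyncBatchNorm(10).cuda()
x = torch.randn(4, 10, device="cuda")  # 2D (N, C) input is now accepted
# out = sbn(x)  # run inside a DDP process group; training still needs more than
#               # one element along the normalization plane
```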
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29626
Differential Revision: D18492531
Pulled By: albanD
fbshipit-source-id: f008e46a2d520d73c3c2730890a7424eba2ede9e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25339
This is to get rid of backend-specific dispatch in modules; this autograd function is no longer backend-specific, so
it doesn't need to live in a backend-specific location.
Test Plan: Imported from OSS
Differential Revision: D17101576
Pulled By: gchanan
fbshipit-source-id: f4f0bd3ecc2d4dbd8cdfedbaabcadb8c603d2507
Summary:
Added synchronized batch normalization, which allows synchronizing stats across mini-batches between processes within a process group.
The current implementation uses a mixture of extended ATen native functions (C++/CUDA extension) and torch.nn modules (c10d Python API).
- User-facing API:
1. torch.nn.utils.convert_sync_batchnorm(modules, process_group=None)
2. torch.nn.SyncBatchNorm(num_features, eps=1e-5, momentum=0.1, affine=True, track_running_stats=True, ***process_group=None***)
- supported use case:
DistributedDataParallel with ***single-gpu multi-process***
a. User creates model containing `torch.nn.SyncBatchNorm` layers through one of the ways listed below:
1. use layers directly:
torch.nn.SyncBatchNorm(...)
similar API to torch.nn.BatchNormXd(...)
with the added argument `process_group`, which limits synchronization to
within the given process group. The default value is None, which
implies synchronization across all GPUs
2. use torch.nn.utils.convert_sync_batchnorm(modules, process_group)
recursively converts all `torch.nn.BatchNormXd` layers into `torch.nn.SyncBatchNorm`,
preserving the values of parameters/buffers.
The utility function also allows the user to specify a process_group value for all
converted layers.
b. the user wraps their model with
`torch.nn.parallel.DistributedDataParallel`; from this point on, the user
should follow the general guidelines of the DDP usage guide (a minimal end-to-end sketch follows below)
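A minimal end-to-end sketch (assumes a single-GPU-per-process launch such as torchrun; uses the converter's current public name `nn.SyncBatchNorm.convert_sync_batchnorm`):
```python
import torch
import torch.distributed as dist
import torch.nn as nn

dist.init_process_group("nccl")
local_rank = dist.get_rank() % torch.cuda.device_count()
torch.cuda.set_device(local_rank)

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU()).cuda()
# Recursively replace BatchNorm*d layers with SyncBatchNorm, preserving
# parameters/buffers; an optional process_group limits the sync scope.
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)
ddp_model = nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])
```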
- Error checking
For use cases not supported, we error out:
1. Application launched without DDP:
> import torch
> sbn = torch.nn.SyncBatchNorm(10).cuda()
> inp = torch.randn(5, 10, 3, 3).cuda()
> sbn(inp) --> Error!
> AttributeError: SyncBatchNorm is only supported within torch.nn.parallel.DistributedDataParallel
2. Application launched using DDP with multiple GPUs per process:
> ddp_module = nn.parallel.DistributedDataParallel(module, device_ids=device_ids, output_device=args.local_rank)
> ValueError: SyncBatchNorm is only supported for DDP with single GPU per process
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14267
Differential Revision: D14270035
Pulled By: ezyang
fbshipit-source-id: 4956d8fa565c32e9df5408d53719ff9f945f4d6d