Summary:
At line 508, `convert_sync_batchnorm` is called recursively to convert BatchNorm layers to SyncBatchNorm, so the `process_group` argument should also be passed into the recursive call (sketched below).
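For context, a minimal sketch of the conversion pattern (illustrative only, not the exact upstream code), showing why the recursive call has to forward `process_group`:
```python
# Illustrative sketch only -- simplified from the idea described above, not the
# exact upstream implementation.  Without forwarding process_group, nested
# BatchNorm layers would be converted with process_group=None and would
# synchronize across all ranks instead of the requested group.
import torch

def convert_sync_batchnorm(module, process_group=None):
    converted = module
    if isinstance(module, torch.nn.modules.batchnorm._BatchNorm):
        converted = torch.nn.SyncBatchNorm(
            module.num_features, module.eps, module.momentum,
            module.affine, module.track_running_stats, process_group)
        converted.running_mean = module.running_mean
        converted.running_var = module.running_var
        if module.affine:
            converted.weight = module.weight
            converted.bias = module.bias
    for name, child in module.named_children():
        # the fix: pass process_group down to the recursive call as well
        converted.add_module(name, convert_sync_batchnorm(child, process_group))
    return converted
```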
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19240
Differential Revision: D15240318
Pulled By: ezyang
fbshipit-source-id: 0fc9e856392824814991e5e9e8f9513d57f311af
Summary:
This is a fix for issue https://github.com/pytorch/pytorch/issues/18525. The issue is related not only to ONNX export, but can manifest in other scenarios.
An existing test point in test/onnx/test_operators.py has been updated to cover this scenario as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18764
Reviewed By: zrphercule
Differential Revision: D14735166
Pulled By: houseroad
fbshipit-source-id: 5a737c648f64355929ff31eb12bd4869e744768d
Summary:
Closes #18382
Please let me know if any changes are required.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18787
Differential Revision: D14821147
Pulled By: soumith
fbshipit-source-id: edd98eab1b3f4151c4ae5148146435ddb2ae678d
Summary:
- Summary:
Added synchronized batch normalization, which allows synchronization of stats across mini-batches between processes within a process group.
The current implementation uses a mixture of extended ATen native functions (C++/CUDA extension) + torch.nn.modules (c10d Python API)
- User-facing api:
1. torch.nn.utils.convert_sync_batchnorm(modules, process_group=None)
2. torch.nn.SyncBatchNorm(num_features, eps=1e-5, momentum=0.1, affine=True, track_running_stats=True, ***process_group=None***)
- supported use case:
DistributedDataParallel with ***single-gpu multi-process***
a. User creates model containing `torch.nn.SyncBatchNorm` layers through one of the ways listed below:
1. use layers directly:
torch.nn.SyncBatchNorm(...)
similar API to torch.nn.BatchNormXd(...),
with the added argument `process_group`, which is used to limit the scope of
synchronization within each process group. The default value is None, which
implies synchronization across all GPUs
2. use torch.nn.utils.convert_sync_batchnorm(modules, process_group)
recursively converts all `torch.nn.BatchNormXd` layers into `torch.nn.SyncBatchNorm`,
preserving the values of parameters/buffers.
The utility function also allows the user to apply a process_group value to all
converted layers.
b. The user wraps their model with
`torch.nn.parallel.DistributedDataParallel`; from this point on, they
should follow the general guidelines in the DDP usage guide (see the sketch below)
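A minimal end-to-end sketch of this workflow (assumes torch.distributed has already been initialized and that `local_rank` identifies the single GPU owned by this process; the utility name is the one listed in this summary):
```python
import torch
import torch.nn as nn

# local_rank is assumed to come from the launcher (e.g. env var or argparse flag)
local_rank = 0

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3),
    nn.BatchNorm2d(16),          # converted to SyncBatchNorm below
    nn.ReLU(),
)
model = torch.nn.utils.convert_sync_batchnorm(model)   # optionally pass process_group
model = model.cuda(local_rank)
model = nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])
```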
- Error checking
For use cases not supported, we error out:
1. Application launched without ddp:
> import torch
> sbn = torch.nn.SyncBatchNorm(10).cuda()
> inp = torch.randn(5, 10, 3, 3).cuda()
> sbn(inp) --> Error!
> AttributeError: SyncBatchNorm is only supported within torch.nn.parallel.DistributedDataParallel
2. Application launched using DDP with multi-GPU per-process:
> ddp_module = nn.parallel.DistributedDataParallel(module, device_ids=device_ids, output_device=args.local_rank)
> ValueError: SyncBatchNorm is only supported for DDP with single GPU per process
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14267
Differential Revision: D14270035
Pulled By: ezyang
fbshipit-source-id: 4956d8fa565c32e9df5408d53719ff9f945f4d6d
Summary:
This PR:
1. add tests for batchnorm/dropout train/eval parameter mutation
2. remove training constants from all our standard library
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14780
Differential Revision: D13331578
Pulled By: wanchaol
fbshipit-source-id: d92ca3ce38cc2888688d50fe015e3e22539a20a5
Summary:
This PR:
1. Handle None value attr in the WeakScriptModuleProxy
2. add back module tests that now passing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14715
Differential Revision: D13313573
Pulled By: wanchaol
fbshipit-source-id: a6b7892707350290a6d69b6f6270ad089bfc954b
Summary:
This PR does three things:
1. It exports the BatchNorm functional and module, rewriting some of the components to stay aligned with the currently supported JIT features
2. In the process of the export, it adds the necessary compiler support for in-place augmented-assignment ops (see the sketch after this list)
3. It changes the test_jit behavior in add_module_test to use a single RNG state during module initialization
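A hedged sketch of the kind of in-place augmented assignment the added compiler support targets (illustrative, not taken from the PR):
```python
import torch

@torch.jit.script
def add_inplace(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    x += y   # in-place augmented assignment handled by the script compiler
    return x
```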
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14016
Differential Revision: D13112064
Pulled By: wanchaol
fbshipit-source-id: 31e3aee5fbb509673c781e7dbb6d8884cfa55d91
Summary:
Problems with SN and DP after #12671 :
1. in eval mode, `weight_orig` is not getting correct gradient #12737 .
Fix: keep `v` vector around as a buffer and always calculate `W = W_orig / (u @ W_orig @ v)` even in eval.
2. in training mode, the `weight` buffer of the parallelized module is never updated if someone touches `weight_orig` and/or `weight` and makes them stop sharing storage, so in `eval` the weight used is wrong.
Fix: Make `weight` not a buffer anymore and always calculate it as above.
3. #12671 changed SN to update `u` in-place to make DP work correctly, but that breaks backward through two forwards (e.g., the common GAN loss `D(real) - D(fake)`) because the vectors needed to backprop through the 1st forward are changed in the 2nd forward.
Fix: This PR clones `u` and `v` before using them (sketched below).
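A hedged sketch of the weight computation and cloning described in these fixes (illustrative, not the exact torch.nn.utils.spectral_norm code):
```python
import torch

def compute_weight(weight_orig, u, v):
    # weight_orig is the 2-D (flattened) weight matrix
    # clone so that a later forward's in-place updates to u/v cannot corrupt
    # the graph of an earlier forward (fix 3)
    u, v = u.clone(), v.clone()
    # sigma = u^T @ W_orig @ v, the estimated top singular value; always
    # recompute W from weight_orig, even in eval mode (fixes 1 and 2)
    sigma = torch.dot(u, torch.mv(weight_orig, v))
    return weight_orig / sigma
```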
To maintain BC, I added a hook interface for producing and loading the state_dict. This is ugly, and we should really have a better interface for spectral_norm; but for the purpose of fixing this issue, I am making this patch. Even with a better interface, a BC mechanism for loading legacy state_dicts would still be needed.
cc crcrpar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13350
Differential Revision: D12931044
Pulled By: SsnL
fbshipit-source-id: 8be6f934eaa62414d76d2c644dedd7e1b7eb31ef
* Replace incorrect usages of "NotImplemented"
Fixes #7266. Replaces "NotImplemented" (which is supposed to be used for
binary ops) with the correct "NotImplementedError".
* Address comments
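For reference, a small illustration of the distinction (plain Python semantics, not code from this change): `NotImplemented` is a value returned from binary dunder methods so the interpreter can try the reflected operation, while `NotImplementedError` is an exception raised for unimplemented behavior.
```python
class Meters:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if not isinstance(other, Meters):
            return NotImplemented          # let Python try other.__radd__
        return Meters(self.value + other.value)

    def to_feet(self):
        raise NotImplementedError("conversion not implemented yet")
```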
This PR enables users to print extra information about their subclassed nn.Module.
For now I simply insert the user-defined string at the end of the module name; the exact placement can be discussed in this PR.
Before this PR, users had to redefine __repr__ and copy & paste the source code from Module.
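A minimal sketch of the mechanism, using the `extra_repr` hook that the later commits in this change settle on (the Scale module itself is just an illustrative example):
```python
import torch.nn as nn

class Scale(nn.Module):
    def __init__(self, factor, inplace=False):
        super().__init__()
        self.factor = factor
        self.inplace = inplace

    def forward(self, x):
        return x.mul_(self.factor) if self.inplace else x * self.factor

    def extra_repr(self):
        # appended inside the printed module name; no need to override __repr__
        return 'factor={}, inplace={}'.format(self.factor, self.inplace)

print(Scale(2.0))   # Scale(factor=2.0, inplace=False)
```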
* Add support for extra information on Module
* Rewrite the repr method of Module
* Fix flake8
* Change the __repr__ to get_extra_repr in Linear
* Fix extra new-line for empty line
* Add test for __repr__ method
* Fix bug of block string indent
* Add indent for multi-line repr test.
* Address review comments
* Update tutorial for creating nn.Module
* Fix flake8, add extra_repr of bilinear
* Refactor DropoutNd
* Change to extra_repr in some Modules
* Fix flake8
* Refactor padding modules
* Refactor pooling module
* Fix typo
* Change to extra_repr
* Fix bug for GroupNorm
* Fix bug for LayerNorm
* Revert "Port ATen and JIT C++ tests to Catch2 (#5788)"
This reverts commit 6f80023c29.
* Revert "Fix error message for cat-ing zero-dim tensors (#5819)"
This reverts commit cf2e176049.
* Revert "Softmax symbolic should account for negative dim (#5846)"
This reverts commit ba64724aee.
* Revert "[fft][1 of 3] build system and helpers to support cuFFT and MKL (#5855)"
This reverts commit 22ef8e5654.
* Revert "Don't modify requires_grad when running DataParallel in no_grad mode (#5880)"
This reverts commit d11b7fbd1c.
* Revert "fix some methods not showing up in doc (#5882)"
This reverts commit 24fca0efb2.
* Revert "ReduceOps cleanup and set_num_threads (#5723)"
This reverts commit 84400d5531.
* Revert "introduce shape_as_tensor and reshape_from_variable_shape (#5824)"
This reverts commit f446b82e70.
* Revert "Enable resetting of batchnorm running moments and cumulative ("simple") moving average (#5766)"
This reverts commit 99b1f6cfad.
* Fix some minor errors in existing docs.
* Fix Convolution and Pooling docs in torch.nn.functional
* Cleaned up torch.nn.functional docs
* Address @SsnL 's comments
* Add multiplication sign missing in docs
* Fix more typos, and clear some warnings
* Change infinity symbol in LPPool2d
* Revert some changes in torch.nn.functional
* Few more minor changes
* Improve documentation
1. Add formula for erf, erfinv
2. Make exp, expm1 similar to log, log1p
3. Symbol change in ge, le, ne, isnan
* Fix minor nit in the docstring
* More doc improvements
1. Added some formulae
2. Complete scanning till "Other Operations" in Tensor docs
* Add more changes
1. Modify all torch.Tensor wherever required
* Fix Conv docs
1. Fix minor nits in the references for LAPACK routines
* Improve Pooling docs
1. Fix lint error
* Improve docs for RNN, Normalization and Padding
1. Fix flake8 error for pooling
* Final fixes for torch.nn.* docs.
1. Improve Loss Function documentation
2. Improve Vision Layers documentation
* Fix lint error
* Improve docstrings in torch.nn.init
* Fix lint error
* Fix minor error in torch.nn.init.sparse
* Fix Activation and Utils Docs
1. Fix Math Errors
2. Add explicit clean to Makefile in docs to prevent running graph generation script
while cleaning
3. Fix utils docs
* Make PYCMD a Makefile argument, clear up prints in the build_activation_images.py
* Fix batch norm doc error
The nn.* counterpart of #5443. Mostly removed the Variable wrappers. Also added docs for nn.RReLU.
Notice that torch.randn(*, requires_grad=True) isn't documented until #5462 is done.
* at::maybe_data_ptr and Check.h => TensorUtils.h
* THNN support for optional BN running_*
* ATen support for optional BN running_*
* Python nn.* support for optional BN running_*; Improve IN and BN doc
* Add tests for IN and BN new option
* Layer Norm
* Fix LRN doc
* functional interface for LN and IN
* Layer norm tests
* fix BN double backward returning undefined tensors
* fix jit test using wrong dim inputs for BN
* add/improve BN, IN and LN GPU tests with half type
* Update docs to be consistent with Conv notation
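A hedged usage sketch of the normalization options wired through above (reflecting the eventual public API; exact argument names assumed from current torch.nn):
```python
import torch
import torch.nn as nn

x = torch.randn(8, 16, 32)                                # (N, C, L)

bn = nn.BatchNorm1d(16, track_running_stats=False)        # BN without running_* buffers
inorm = nn.InstanceNorm1d(16, track_running_stats=True)   # IN opting into running stats
ln = nn.LayerNorm(x.shape[1:])                            # new LayerNorm over (C, L)

for m in (bn, inorm, ln):
    print(type(m).__name__, m(x).shape)
```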
Fix ONNX
Clarified the ONNX symbolic wrapper
* fix typo
* Address comments
Here's the command I used to invoke autopep8 (in parallel!):
git ls-files | grep '\.py$' | xargs -n1 -P`nproc` autopep8 -i
Several rules are ignored in setup.cfg. The goal is to let autopep8
handle everything which it can handle safely, and to disable any rules
which are tricky or controversial to address. We may want to come back
and re-enable some of these rules later, but I'm trying to make this
patch as safe as possible.
Also configures flake8 to match pep8's behavior.
Also configures TravisCI to check the whole project for lint.
- don't use cuDNN for half inputs because weight, bias, running_mean,
etc. are required to be of a different type than for THCUNN
- accept 3D inputs (N,C,L) in BatchNorm1d
- remove accidental 'use_cudnn=False'
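A small example of the newly accepted (N, C, L) input shape for BatchNorm1d (illustrative):
```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(32)
flat = torch.randn(8, 32)         # (N, C)    -- already supported
seq = torch.randn(8, 32, 100)     # (N, C, L) -- accepted after this change
print(bn(flat).shape, bn(seq).shape)
```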
Uses the assignment syntax to get deterministic ordering of parameters.
The ordering of parameters using the constructor syntax is
non-deterministic because kwargs use dict() in Python 3.5 and earlier.
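A hedged illustration of the ordering concern (the module classes are made up for the example): attribute assignment registers submodules in a fixed order, so parameters() iterates deterministically, while collecting submodules from **kwargs depends on dict ordering, which Python 3.5 and earlier did not guarantee.
```python
import torch.nn as nn

class Assigned(nn.Module):
    def __init__(self):
        super().__init__()
        self.first = nn.Linear(4, 4)     # registered first
        self.second = nn.Linear(4, 4)    # registered second

class FromKwargs(nn.Module):
    def __init__(self, **layers):
        super().__init__()
        for name, layer in layers.items():   # iteration order = kwargs dict order
            self.add_module(name, layer)

print([name for name, _ in Assigned().named_parameters()])
# ['first.weight', 'first.bias', 'second.weight', 'second.bias']
```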