Summary:
Here is a PR adding `ModuleList` to `modules.h` so that it can be used by including `torch/torch.h`.
yf225 edit: Closes https://github.com/pytorch/pytorch/issues/25293.
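For illustration, a minimal sketch of what this enables (a program that uses `ModuleList` through the single top-level include; the layer sizes and the `at<LinearImpl>` accessor are assumptions based on the usual `ModuleList` interface):
```cpp
#include <torch/torch.h>

int main() {
  // ModuleList is now reachable via the single torch/torch.h include.
  torch::nn::ModuleList layers(
      torch::nn::Linear(4, 8),
      torch::nn::Linear(8, 2));
  layers->push_back(torch::nn::Linear(2, 1));

  auto x = torch::randn({1, 4});
  for (size_t i = 0; i < layers->size(); ++i) {
    // Each entry is held type-erased; at<T> recovers the concrete module.
    x = layers->at<torch::nn::LinearImpl>(i).forward(x);
  }
}
```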
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25346
Differential Revision: D17115013
Pulled By: yf225
fbshipit-source-id: 38a1848b9a8272fa411865dfc83b76d10c5789a0
Summary:
This PR adds a `TORCH_WARN_ONCE` macro and uses it in `Tensor.data<T>()`.
cc. gchanan
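For reference, a hedged sketch of the usage pattern (the wording of the message here is illustrative):
```cpp
#include <c10/util/Exception.h>

// Unlike TORCH_WARN, the message is emitted at most once per process
// for this call site, no matter how often the function is called.
void deprecated_accessor() {
  TORCH_WARN_ONCE(
      "Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead.");
}
```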
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25207
Differential Revision: D17066263
Pulled By: yf225
fbshipit-source-id: 411c6ccc8326fb27ff885fee4638df8b5ba4d449
Summary:
This PR adds a deprecation message for `tensor.data<T>()` (91d94e7d41), and changes all call sites of `tensor.data<T>()` to `tensor.data_ptr<T>()` in PyTorch core.
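The mechanical change at each call site looks like this (a minimal sketch; `t` stands for any tensor):
```cpp
#include <ATen/ATen.h>

void migrate(at::Tensor t) {
  // Before (now deprecated):
  float* p_old = t.data<float>();
  // After, as changed across PyTorch core by this PR:
  float* p_new = t.data_ptr<float>();
}
```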
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24886
Differential Revision: D16924576
Pulled By: yf225
fbshipit-source-id: 0943d6be73245c7c549c78597b74c3b07fa24440
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24342
Right now the two APIs provided in the autograd package only have Python bindings, and we cannot call them from either the C++ API or TorchScript. This PR makes these two APIs available purely in C++ (while preserving their semantics) so they can be used in the C++ API and TorchScript.
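A sketch of what this enables from C++, assuming the two APIs in question are `torch::autograd::backward` and `torch::autograd::grad` (the summary does not name them, so this is an assumption):
```cpp
#include <torch/torch.h>

void example() {
  auto x = torch::randn({2, 2}, torch::requires_grad());
  auto y = (x * x).sum();

  // Functional-style gradients, mirroring torch.autograd.grad in Python:
  auto grads = torch::autograd::grad(/*outputs=*/{y}, /*inputs=*/{x});
  // grads[0] holds dy/dx == 2 * x
}
```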
Differential Revision: D16923271
fbshipit-source-id: 049d6fbd94cd71ecc08b2716f74d52ac061f861e
Summary:
This PR templatizes `Tensor.data_ptr()`, to prepare for the deprecation of `Tensor.data<T>()` and the introduction of a `Tensor.data()` that has the same semantics as `Variable.data()`.
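A small sketch of the two forms that coexist after this change:
```cpp
#include <ATen/ATen.h>

void example(at::Tensor t) {
  void* raw = t.data_ptr();            // untyped, unchanged behaviour
  float* typed = t.data_ptr<float>();  // new templated form, the planned
                                       // replacement for t.data<float>()
}
```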
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24847
Differential Revision: D16906061
Pulled By: yf225
fbshipit-source-id: 8f9db9fd105b146598a9d759aa4b4332011da8ea
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24393
Ability to register a hook on a variable, similar to the Python autograd API. `register_hook` takes a function as an argument and creates a `CppFunctionPreHook`, similar to `PyFunctionPreHook`.
It returns the index of the hook, which can be passed to `remove_hook` to disable the hook.
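A hedged sketch of the API described above (the lambda's grad-in/grad-out signature follows the Python `register_hook` convention):
```cpp
#include <torch/torch.h>

void example() {
  auto v = torch::randn({2, 2}, torch::requires_grad());
  // register_hook returns an index identifying the hook...
  auto pos = v.register_hook([](torch::Tensor grad) { return grad * 2; });
  (v * v).sum().backward();
  v.remove_hook(pos);  // ...which remove_hook uses to disable it again
}
```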
Test Plan: Added tests.
Differential Revision: D16861722
fbshipit-source-id: d08047f932e38c7bde04283a18b2d0311c8ad604
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23803
Custom `forward()` can return a single `Variable` in the case of a single output, instead of returning a `variable_list` of size 1.
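A sketch of the new convention (`Square` is a hypothetical custom function, written against the custom autograd API from #23572):
```cpp
#include <torch/torch.h>

using torch::autograd::AutogradContext;
using torch::autograd::Function;
using torch::autograd::Variable;
using torch::autograd::variable_list;

class Square : public Function<Square> {
 public:
  static Variable forward(AutogradContext *ctx, Variable x) {
    ctx->save_for_backward({x});
    return x * x;  // a bare Variable now works; no {x * x} wrapper needed
  }

  static variable_list backward(AutogradContext *ctx, variable_list grad_output) {
    auto saved = ctx->get_saved_variables();
    return {2 * saved[0] * grad_output[0]};
  }
};
```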
Test Plan: Modified tests involving single output forward functions.
Reviewed By: ezyang
Differential Revision: D16673857
Pulled By: ezyang
fbshipit-source-id: c96d9473b48ad99e6736a68d334b333a917498b7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23628
More tests for `autograd::Function`, based on the Python tests from test_autograd.py.
Test Plan: Imported from OSS
Differential Revision: D16600992
fbshipit-source-id: 0cb8bfbcff315111dc4936e837ff859d0a1e251d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23618
For example: `save_for_backward({Variable(), x, Variable()})` should be allowed, so that this is consistent with the Python API behaviour.
Test Plan: Added a test similar to the python test `test_save_none_for_backward` from test_autograd.py.
Differential Revision: D16589402
fbshipit-source-id: 847544ad8fc10772954d8629ad5a62bfdc1a66c1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23572
### **(The stack from #23020 was moved into this PR)**
Adding an API for custom autograd operations, with user-defined forward and backward, [like in Python](https://pytorch.org/docs/stable/notes/extending.html#extending-torch-autograd).
The custom operation should be a subclass of `Function`, with static forward and backward functions. `forward()` can accept any arguments, similar to the Python API, and `backward()` should accept a variable list as an argument.
Both `forward()` and `backward()` accept an `AutogradContext*` which can be used to share data between them.
Variables can be saved in the context using `save_for_backward()`, and other data can be saved in the `saved_data` map in the form of `<std::string, at::IValue>` pairs. Variables saved in forward can be accessed with `get_saved_variables()`.
Example usage:
```cpp
class MyFunction : public Function<MyFunction> {
 public:
  static variable_list forward(AutogradContext *ctx, int n, Variable var) {
    // Save data for backward in context
    ctx->saved_data["n"] = n;
    return {var};
  }

  static variable_list backward(AutogradContext *ctx, variable_list grad_output) {
    // Use data saved in forward
    auto n = ctx->saved_data["n"].toInt();
    return {grad_output[0] * n};
  }
};
```
Then, it can be used with:
```cpp
Variable x;
MyFunction::apply(6, x);
```
`AutogradContext` also has methods to mark outputs as non-differentiable and mark inputs as dirty, similar to the [Python API](ff23a02ac4/torch/autograd/function.py (L26)).
Test Plan: Added tests for the custom autograd function API based on test_autograd.py. Currently only the tests for the basic functionality have been added. More tests will be added later.
Differential Revision: D16583428
fbshipit-source-id: 0bd42f19ce37bcd99d3080d16195ad74d40d0413
Summary:
Add a sorting policy to ChunkDataset.
This is considered an advanced parameter for developers who want to apply a 'sorting policy' to the chunk data before it is sampled into minibatches.
Unlike the collate method, this policy is applied at the chunk level instead of the minibatch level. When a chunk of data is loaded (multiple chunks if `cross_chunk_shuffle_count_` is greater than 1), this policy is applied to the full loaded data. It is useful if developers want to perform some pre-processing (like bucketing) on the chunk data before the example sampler samples it, as in the sketch below.
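A heavily hedged sketch of such a policy; the callback signature is an assumption, and only the chunk-level granularity comes from the description above:
```cpp
#include <algorithm>
#include <vector>
#include <torch/torch.h>

// Hypothetical bucketing policy: sort the loaded chunk by sequence length
// so the example sampler draws minibatches of similar-length examples.
auto sort_by_length = [](std::vector<torch::data::Example<>>& chunk) {
  std::sort(chunk.begin(), chunk.end(),
            [](const torch::data::Example<>& a,
               const torch::data::Example<>& b) {
              return a.data.size(0) < b.data.size(0);
            });
};
```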
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23053
Differential Revision: D16537692
Pulled By: colesbury
fbshipit-source-id: cd21ed40ab787a18b8c6dd304e5b806a7a45e6ba
Summary:
As pointed out by SsnL in https://github.com/pytorch/pytorch/issues/20910, when the clone destination is different from the module's device,
`Cloneable` currently calls `clone()` and then `to()` on every parameter and buffer, where the first clone is unnecessary.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20995
Differential Revision: D15517353
Pulled By: mrshenli
fbshipit-source-id: 6b6dc01560540a63845663f863dea0a948021fa5
Summary:
In Python, the `register_module` / `register_parameter` / `register_buffer` methods on `nn.Module` are public. This PR makes those APIs public for the C++ `nn::Module` as well. Closes https://github.com/pytorch/pytorch/issues/23140.
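A minimal sketch: the usual registration pattern inside a module subclass, which these now-public methods also permit from outside the class:
```cpp
#include <torch/torch.h>

struct Net : torch::nn::Module {
  // Registration from inside the subclass, as before:
  Net() : fc(register_module("fc", torch::nn::Linear(4, 2))) {}
  torch::nn::Linear fc;
};

void example() {
  Net net;
  // With the methods public, external code can now register state too:
  net.register_buffer("running_stat", torch::zeros({2}));
}
```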
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23196
Differential Revision: D16440239
Pulled By: yf225
fbshipit-source-id: e0eff6e1db592961fba891ec417dc74fa765e968
Summary:
This adds a `replace_module` method to the C++ API; it is needed to be able to replace modules in an existing network.
The primary use case I am aware of is to enable finetuning of models.
Given that finetuning is fairly popular these days, I think it would be good to facilitate this in the C++ API as well.
This has been reported by Jean-Christophe Lombardo on the [forums](https://discuss.pytorch.org/t/finetuning-a-model-on-multiple-gpu-in-c/49195).
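A hedged sketch of that use case (the module name "fc" and the layer sizes are illustrative, and the overload shown assumes a module holder is accepted):
```cpp
#include <torch/torch.h>

void finetune(std::shared_ptr<torch::nn::Module> model, int64_t num_classes) {
  // Swap the classification head for a freshly initialized layer.
  model->replace_module("fc", torch::nn::Linear(512, num_classes));
}
```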
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22546
Differential Revision: D16440289
Pulled By: yf225
fbshipit-source-id: c136f914b8fc5c0f1975d877ea817fda5c851cda
Summary:
Creating an untyped generic list is deprecated; we always want type information to be present.
This fixes test cases and removes one that used lists with ambiguous types.
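For context, a small sketch of typed creation (assuming the `c10::List` API of the time):
```cpp
#include <ATen/core/List.h>

void example() {
  // The element type is always explicit; an untyped generic list can no
  // longer be created through the public API.
  c10::List<int64_t> ints({1, 2, 3});
}
```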
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23192
ghstack-source-id: 86972891
Differential Revision: D16431482
fbshipit-source-id: 4ca5cd142118a3f0a4dcb8cd77383127c54abb29
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22517
Force anybody creating an untyped Dict to call c10::impl::deprecatedUntypedDict().
This should hopefully make it clear that this is not public API and prevent people from using it.
Reviewed By: dzhulgakov
Differential Revision: D16115214
fbshipit-source-id: 2c8d0e4e375339c699d583995f79c05c59693c3e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22004
In the future, we want all dicts/lists to store information about the types they contain.
This is only possible if the creation API doesn't allow creating lists/dicts without type information.
This diff updates some call sites that don't specify type information so that they do.
Reviewed By: dzhulgakov
Differential Revision: D15906387
fbshipit-source-id: 64766a2534b52c221e8a5501a85eaad13812e7bd
Summary:
This change adds advanced support for cross-chunk shuffling.
For training with a static dataset, the default configuration is at the user's disposal. However, in some use cases new data is added to the dataset over each epoch, so the dataset's size is dynamically growing. In order to mix the new data and the old data for better random sampling, one approach is to shuffle examples from more than one chunk. This feature is supported with this change: by specifying `cross_chunk_shuffle_count_` on construction, advanced users can specify how many chunks to shuffle examples from, as sketched below.
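A hedged sketch of how this could be configured; the option name comes from this summary, while the surrounding constructor arguments are assumptions about the options struct:
```cpp
#include <torch/torch.h>

void example() {
  auto options = torch::data::datasets::ChunkDatasetOptions(
                     /*preloader_count=*/2, /*batch_size=*/64)
                     .cross_chunk_shuffle_count(4);  // mix examples across 4 chunks
}
```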
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22347
Differential Revision: D16081378
Pulled By: zhangguanheng66
fbshipit-source-id: fd001dfb9e66947839adecfb9893156fbbce80d0
Summary:
When dealing with a large-scale dataset, it is handy if we can save the dataset status and resume later. Especially in cases where an unexpected crash happens, users don't need to start the whole dataset over from the beginning; instead, they can reload it from the last checkpoint.
This change adds support for checkpoint save/load logic in ChunkDataset.
On ChunkDataset construction, the user can specify a file name from which to load the checkpoint. If it is empty, the dataset defaults to starting fresh; otherwise the ChunkDataset will 'fast forward' the chunk sampler to the corresponding checkpoint.
The user can also call `ChunkDataset::save()` to serialize the current status to a file, which can be used later.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21889
Differential Revision: D16024582
Pulled By: ailzhang
fbshipit-source-id: 1862ab5116f94c9d29da174ce04a91041d06cad5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22084
For DictPtr/ListPtr, default construction was disallowed because it was ambiguous whether it was supposed to create an empty list or a nullptr.
But since we renamed them to Dict/List, we can now allow default construction without ambiguity.
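A small sketch of what is now allowed (default construction yields an empty typed container, not a nullptr):
```cpp
#include <ATen/core/Dict.h>
#include <ATen/core/List.h>

void example() {
  c10::List<int64_t> list;              // empty list of int64_t
  c10::Dict<std::string, int64_t> dict; // empty dict with typed keys/values
}
```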
Differential Revision: D15948098
fbshipit-source-id: 942a9235b51608d1870ee4a2f2f0a5d0d45ec6e6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21937
This changes call sites to use the new naming scheme.
Reviewed By: zdevito
Differential Revision: D15892404
fbshipit-source-id: 8d32aa90a0ead1066688166478f299fde9c2c133
Summary:
Resolves https://github.com/pytorch/lockdown/issues/18
This implements NamedTuple by taking advantage of the existing `names` field in `TupleType`.
TODO: This currently doesn't retain the NamedTuple-ness through serialization. Discussed with suo offline; we can probably make a way to define an anonymous NamedTuple in script (e.g. `NamedTuple('Foo', [('a', int), ('b', float), ('c', List[float])])`) and serialize that.
TODO: implement support for calling the constructor with kwargs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21428
Differential Revision: D15741564
Pulled By: jamesr66a
fbshipit-source-id: c077cbcea1880675ca6deb340a9ec78f824a136c
Summary:
This renames the CMake `caffe2` target to `torch`, as well as renaming `caffe2_gpu` to `torch_gpu` (and likewise for other gpu target variants). Many intermediate variables that don't manifest as artifacts of the build remain for now with the "caffe2" name; a complete purge of `caffe2` from CMake variable names is beyond the scope of this PR.
The shell `libtorch` library that had been introduced as a stopgap in https://github.com/pytorch/pytorch/issues/17783 is again flattened in this PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20774
Differential Revision: D15769965
Pulled By: kostmo
fbshipit-source-id: b86e8c410099f90be0468e30176207d3ad40c821
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21177
- Integrate c10::ListPtr into IValue and the c10 dispatcher.
- Streamline conversion to/from IValue. Before, we had IValue::to<> and kernel_functor.h had its own ivalue_to_arg_type and return_type_to_ivalue. They are now unified. Also, this means that nested types like Dicts of Lists of Optional of Dict of ... now work as expected.
Differential Revision: D15476433
fbshipit-source-id: bde9df80df20091aa8e6ae17ba7e90abd149b954
Summary:
Fixes #19540
CC nmerrill67
C++ data parallel was using Module.clone() to create module replicas on every destination device. However, clone() does not set up gradient edges to point from replicas to the original module. As a result, the gradient will not be aggregated into the original module. This commit fixes the problem by manually setting gradient edges from every parameter X in every replica to the same parameter X in the original module.
## Failed Attempt
Initially I tried implementing what we did in `replicate.py`, which:
1. creates module replicas.
2. uses the Python `Broadcast` autograd function to broadcast every parameter in the original module to all destination devices.
3. assigns the broadcast result params to the module replicas' `_parameters` dict.
This works in Python because derived module member field params (e.g., `Linear.weight`) and base module `_parameters` (e.g., `Linear._parameters['weight']`) reference the same parameter instance. Assigning to one of them applies to both. However, in C++, even though I can modify a Module's `parameters_` values and gradient edges to point to the broadcast source, I cannot touch the weight and bias member fields in Linear, because replicate cannot (and should not) add special-case handlers for every different module. (See `Linear` [.h](https://github.com/pytorch/pytorch/blob/master/torch/csrc/api/include/torch/nn/modules/linear.h), [.cpp](https://github.com/pytorch/pytorch/blob/master/torch/csrc/api/src/nn/modules/linear.cpp).) Although they initially point to the same `TensorImpl` instance, after assigning to `Module.parameters_['weight']`, it will be different from `Linear.weight`.
## Solution Options
gchanan and I had several discussions on this issue and figured two solutions to this problem.
### Option One [implemented in this PR]
Replicate the module in two steps:
1. call `Module.clone()` to create a module replica on every destination device.
2. manually set gradient edges from every parameter in every replica to the same parameter in the original module.
* Pro: Does not need to change any existing module, and relatively easier to implement
* Con: It is a little hackish.
### Option Two
Implement a `Replicatable` class (similar to `Cloneable`), and make it a friend class of `Module`. For more details see `Note [Replicating Modules]` in the code change.
* Pro: Maybe this aligns more with our existing approach implemented in `Cloneable`?
* Con: Require changes to every existing module.
I am inclined to go with option one, because `replicate` will only be used for data parallel. I feel it would be overkill to change all existing module implementations due to a data parallel requirement.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20910
Differential Revision: D15556426
Pulled By: mrshenli
fbshipit-source-id: aa836290ec657b32742e2bea80bd0ac2404ef3b0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20669
Before, Dict was a value type, i.e. copying it did a deep copy.
Unfortunately, this doesn't work well with storing and passing Dicts around in IValues because IValues are reference types.
This diff changes Dict to be a reference type.
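In practice the reference semantics look like this (a sketch assuming the usual `insert`/`copy` accessors):
```cpp
#include <ATen/core/Dict.h>

void example() {
  c10::Dict<std::string, int64_t> a;
  a.insert("one", 1);
  auto b = a;            // copies the reference, not the contents
  b.insert("two", 2);    // also visible through a: a.size() == 2
  auto deep = a.copy();  // an explicit deep copy restores value semantics
}
```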
Reviewed By: dzhulgakov
Differential Revision: D15404911
fbshipit-source-id: dc990d3eb7cae044b74dd0253f8b704dde6a6c86
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20372
Implement a Dict type that allows us to abstract away from the concrete implementation used.
The API is similar to std::unordered_map, but behind the scenes we can switch to any map implementation we like. ska::flat_hash_map, google dense map, or any future map implementation with better performance.
Switching such an implementation choice does not have to break backwards compatibility of kernel code using the Dict type.
Reviewed By: zdevito
Differential Revision: D15298234
fbshipit-source-id: b5ad368a9e9516030805cd8f5f1b02e3986933c0
Summary:
This PR is an intermediate step toward the ultimate goal of eliminating "caffe2" in favor of "torch". This PR moves all of the files that had constituted "libtorch.so" into the "libcaffe2.so" library, and wraps "libcaffe2.so" with a shell library named "libtorch.so". This means that, for now, `caffe2/CMakeLists.txt` becomes a lot bigger, and `torch/CMakeLists.txt` becomes smaller.
The torch Python bindings (`torch_python.so`) still remain in `torch/CMakeLists.txt`.
The follow-up to this PR will rename references to `caffe2` to `torch`, and flatten the shell into one library.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17783
Differential Revision: D15284178
Pulled By: kostmo
fbshipit-source-id: a08387d735ae20652527ced4e69fd75b8ff88b05