Commit Graph

197 Commits

Author SHA1 Message Date
Sebastian Messmer
e68dc899d1 Fix compiler warnings (#22162)
Summary:
Fix various compiler warnings
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22162

Differential Revision: D16085339

Pulled By: smessmer

fbshipit-source-id: d36a4b334315f1a5942cac46443a7d166ca36d0d
2019-07-02 14:12:55 -07:00
Sebastian Messmer
6d5871300b Use concrete types on call sites for Dict/List (#22004)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22004

In the future, we want all dicts/lists to store information about the types they contain.
This is only possible if the creation API doesn't allow creating lists/dicts without type information.
This diff updates call sites that don't specify type information so that they do.

Reviewed By: dzhulgakov

Differential Revision: D15906387

fbshipit-source-id: 64766a2534b52c221e8a5501a85eaad13812e7bd
2019-07-02 11:52:35 -07:00
xzhu1900
f0f2331a1c Add support for cross-chunk shuffling in ChunkDataset (#22347)
Summary:
This change adds advanced support for cross-chunk shuffling.

For training with a static dataset, the default configuration suffices. However, in some use cases, new data is added to the dataset over each epoch, so the dataset's size grows dynamically. In order to mix the new data and the old data for better random sampling, one approach is to shuffle examples drawn from more than one chunk. This feature is supported with this change: by specifying `cross_chunk_shuffle_count_` on construction, advanced users can specify how many chunks to shuffle examples from.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22347

Differential Revision: D16081378

Pulled By: zhangguanheng66

fbshipit-source-id: fd001dfb9e66947839adecfb9893156fbbce80d0
2019-07-01 19:13:34 -07:00
Roy Li
6c454ff14c Stop using Type in Python bindings (#21963)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21963
ghimport-source-id: 4d9d66ba2c8587503d892b67f535cc2a62e2d19e

Test Plan: Imported from OSS

Differential Revision: D15897423

Pulled By: li-roy

fbshipit-source-id: 2dd55ceb80971df7c86545b7bfff733387f13572
2019-06-30 04:11:32 -07:00
xzhu1900
f39b6624ba ChunkDataset checkpoint support (#21889)
Summary:
When dealing with a large-scale dataset, it is handy to be able to save the dataset status and resume later. Especially in cases where an unexpected crash happens, users don't need to start the whole dataset over from the beginning; instead, they can reload it from the last checkpoint.

This change adds support for checkpoint save/load logic in ChunkDataset.

On ChunkDataset construction, the user can specify a file name from which to load the checkpoint. If it is empty, the dataset starts fresh by default; otherwise the ChunkDataset will 'fast forward' the chunk sampler to the corresponding checkpoint.

The user can also call ChunkDataset::save() to serialize the current status to a file, which can be used later.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21889

Differential Revision: D16024582

Pulled By: ailzhang

fbshipit-source-id: 1862ab5116f94c9d29da174ce04a91041d06cad5
2019-06-26 22:54:14 -07:00
Sebastian Messmer
de85abf226 Allow default construction of Dict/List (#22084)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22084

For DictPtr/ListPtr, default construction was disallowed because it was ambiguous whether it was supposed to create an empty list or a nullptr.
But since we renamed them to Dict/List, we can now allow default construction without ambiguity.

Differential Revision: D15948098

fbshipit-source-id: 942a9235b51608d1870ee4a2f2f0a5d0d45ec6e6
2019-06-25 17:40:48 -07:00
Sebastian Messmer
275087383b ListPtr->List DictPtr->Dict step 2 (#21937)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21937

This changes call sites to use the new naming scheme

Reviewed By: zdevito

Differential Revision: D15892404

fbshipit-source-id: 8d32aa90a0ead1066688166478f299fde9c2c133
2019-06-19 18:02:05 -07:00
peter
794ee6d00c Switch to out-source builds for LibTorch
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21772

Differential Revision: D15839332

Pulled By: yf225

fbshipit-source-id: 017cf61c5682c6a8ffeaf2ca952e1418c27be30e
2019-06-14 21:00:18 -07:00
James Reed
4bcc72fe95 Support for NamedTuple (#21428)
Summary:
Resolves https://github.com/pytorch/lockdown/issues/18

This implements NamedTuple by taking advantage of the existing `names` field in `TupleType`.

TODO: This currently doesn't retain the NamedTuple-ness through serialization. Discussed with suo offline; we can probably add a way to define an anonymous NamedTuple in script (e.g. `NamedTuple('Foo', [('a', int), ('b', float), ('c', List[float])])`) and serialize that
TODO: implement support for calling the constructor with kwargs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21428

Differential Revision: D15741564

Pulled By: jamesr66a

fbshipit-source-id: c077cbcea1880675ca6deb340a9ec78f824a136c
2019-06-14 16:45:56 -07:00
Mikhail Zolotukhin
fbecb4621f schema_matching.cpp: improve error messages.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21141

Differential Revision: D15808354

Pulled By: ZolotukhinM

fbshipit-source-id: 16d938fd5acafb445a0c433cabc9a55cab563165
2019-06-13 17:04:38 -07:00
Will Feng
d3b3cbe26e Revert D15769066: [pytorch][PR] schema_matching.cpp: improve error messages.
Differential Revision:
D15769066

Original commit changeset: 5853e0360581

fbshipit-source-id: ac6fa8429136abf4c7835919009f936eea11ea7b
2019-06-12 20:17:38 -07:00
Karl Ostmo
49481d576d Torch rename (#20774)
Summary:
This renames the CMake `caffe2` target to `torch`, as well as renaming `caffe2_gpu` to `torch_gpu` (and likewise for other gpu target variants).  Many intermediate variables that don't manifest as artifacts of the build remain for now with the "caffe2" name; a complete purge of `caffe2` from CMake variable names is beyond the scope of this PR.

The shell `libtorch` library that had been introduced as a stopgap in https://github.com/pytorch/pytorch/issues/17783 is again flattened in this PR.
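For downstream CMake users, the practical upshot is a single `torch` target to link against. A hedged sketch of a consumer `CMakeLists.txt` (the project and executable names are illustrative):

```cmake
# Minimal consumer sketch; `find_package(Torch)` and TORCH_LIBRARIES are the
# documented libtorch integration points.
cmake_minimum_required(VERSION 3.5)
project(my_app)

find_package(Torch REQUIRED)              # locates the flattened libtorch
add_executable(my_app main.cpp)           # illustrative target name
target_link_libraries(my_app "${TORCH_LIBRARIES}")
set_property(TARGET my_app PROPERTY CXX_STANDARD 14)
```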
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20774

Differential Revision: D15769965

Pulled By: kostmo

fbshipit-source-id: b86e8c410099f90be0468e30176207d3ad40c821
2019-06-12 20:12:34 -07:00
Sebastian Messmer
b527e48588 Use c10::List (#21177)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21177

- Integrate c10::ListPtr into IValue and the c10 dispatcher.
- Streamline conversion to/from IValue. Before, we had IValue::to<> and kernel_functor.h had its own ivalue_to_arg_type and return_type_to_ivalue. They are now unified. Also, this means that nested types like Dicts of Lists of Optional of Dict of ... do work as expected now

Differential Revision: D15476433

fbshipit-source-id: bde9df80df20091aa8e6ae17ba7e90abd149b954
2019-06-12 13:58:24 -07:00
Mikhail Zolotukhin
96910251e0 schema_matching.cpp: improve error messages.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21141

Differential Revision: D15769066

Pulled By: ZolotukhinM

fbshipit-source-id: 5853e0360581c44e42b068add3bf2bc68e671b2b
2019-06-12 12:43:12 -07:00
Will Feng
8cc8e15473 Back out "[pytorch][PR] [Re-landing] Fix caffe2 windows CI for new Windows AMI" (#21670)
Summary:
Original commit changeset: e65c1d6bfcc9
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21670

Differential Revision: D15776087

Pulled By: yf225

fbshipit-source-id: cbb55cbbcb133cae1aeb2fe75cc52e7350cc6c88
2019-06-12 10:37:19 -07:00
peter
bb788631ce Fix caffe2 windows CI for new Windows AMI (#21452)
Summary:
An alternative to #21410.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21452

Differential Revision: D15701767

Pulled By: ezyang

fbshipit-source-id: e65c1d6bfcc98e88460f4a57e5b99c2f395c0ceb
2019-06-06 13:46:45 -07:00
Shen Li
b7b6b612a7 Fix C++ data parallel (#20910)
Summary:
Fixes #19540

CC nmerrill67

C++ data parallel was using Module.clone() to create module replicas on every destination device. However, clone() does not set up gradient edges to point from replicas to the original module. As a result, the gradient will not be aggregated into the original module. This commit fixes the problem by manually setting gradient edges from every parameter X in every replica to the same parameter X in the original module.

## Failed Attempt

Initially I tried implementing what we did in `replicate.py`, which
1. create module replicas
2. use Python `Broadcast` autograd function to broadcast every parameter in the original module to all destination devices.
3. assign the broadcast result params to module replicas' `_parameters` dict.

This works in Python because derived module member field params (e.g., `Linear.weight`) and base module `_parameters` (e.g., `Linear._parameters['weight']`) reference the same parameter instance. Assigning one of them will apply to both. However, in C++, even though I can modify Module's `parameters_` values and gradient edges to point to the broadcast source, I cannot touch the weight and bias member fields in Linear, because replicate cannot (and should not) add special-case handlers for every different module. (See `Linear` [.h](https://github.com/pytorch/pytorch/blob/master/torch/csrc/api/include/torch/nn/modules/linear.h), [.cpp](https://github.com/pytorch/pytorch/blob/master/torch/csrc/api/src/nn/modules/linear.cpp)) Although they initially point to the same `TensorImpl` instance, after assigning to `Module.parameters_['weight']`, it will be different from `Linear.weight`.

## Solution Options

gchanan and I had several discussions on this issue and figured two solutions to this problem.

### Option One [implemented in this PR]

Replicate the module in two steps:
1. call `Module.clone()` to create a module replica on every destination device.
2. manually setting gradient edges from every parameter in every replica to the same parameter in the original module.

* Pro: Does not need to change any existing module, and relatively easier to implement
* Con: It is a little hackish.

### Option Two

Implement a `Replicatable` class (similar to `Cloneable`), and make it a friend class of `Module`. For more details see `Note [Replicating Modules]` in the code change.

* Pro: Maybe this aligns more with our existing approach implemented in `Cloneable`?
* Con: Require changes to every existing module.

I am inclined to go with option one, because `replicate` will only be used for data parallel. Changing all existing module implementations because of a data parallel requirement feels like overkill.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20910

Differential Revision: D15556426

Pulled By: mrshenli

fbshipit-source-id: aa836290ec657b32742e2bea80bd0ac2404ef3b0
2019-06-06 11:57:31 -07:00
Will Feng
c8083e0292 Include named_any.h in modules.h (#21437)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/19462.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21437

Differential Revision: D15684880

Pulled By: yf225

fbshipit-source-id: db23c7e4e0f62d22b0b6c18f15420c3bb66af366
2019-06-06 09:57:33 -07:00
Michael Suo
b6d1a72f48 improve error message on inferred type (#21058)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21058
ghimport-source-id: 7fad3a0567022dd417f4bd079a50a22e3c1dc020

Differential Revision: D15547218

Pulled By: suo

fbshipit-source-id: 5dbd567c79e6d01e9af4b8552777f7f0043df5b2
2019-05-30 10:50:34 -07:00
Michael Suo
154029a6ff Revert D15534670: [jit] improve error message on inferred type
Differential Revision:
D15534670

Original commit changeset: 8bbfd6e9c1af

fbshipit-source-id: fe62cf954292e8ef1d00a3cc569206f73cedcd31
2019-05-29 14:56:08 -07:00
Michael Suo
5dacf6b048 improve error message on inferred type (#21058)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21058
ghimport-source-id: e7d6e082b0faf4f3d3e683f2c98863ee269439f0

Differential Revision: D15534670

Pulled By: suo

fbshipit-source-id: 8bbfd6e9c1afbc3006d7d55ed633e18618e05021
2019-05-29 14:47:00 -07:00
Sebastian Messmer
d5b7138a2c Dict is a reference type (#20669)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20669

Before, Dict was a value type, i.e. copying it did a deep copy.
Unfortunately, this doesn't work well with storing and passing Dicts around in IValues because IValues are reference types.
This diff therefore changes Dict to be a reference type.

Reviewed By: dzhulgakov

Differential Revision: D15404911

fbshipit-source-id: dc990d3eb7cae044b74dd0253f8b704dde6a6c86
2019-05-23 15:24:31 -07:00
Ilia Cherniavskii
5835165ce3 Add get/set_num_interop_threads into torch.h include (#20659)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20659
ghimport-source-id: 4858d03a9f89c613f64901c3430a7b212f76eb95

Reviewed By: dzhulgakov

Differential Revision: D15399780

Pulled By: ilia-cher

fbshipit-source-id: c1c3cb628c5ee664468f9d181bcd76a5105a89fd
2019-05-20 00:34:59 -07:00
Sebastian Messmer
ace506fb38 Dict (#20372)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20372

Implement a Dict type that allows us to abstract away from the concrete implementation used.
The API is similar to std::unordered_map, but behind the scenes we can switch to any map implementation we like. ska::flat_hash_map, google dense map, or any future map implementation with better performance.
Switching such an implementation choice does not have to break backwards compatibility of kernel code using the Dict type.

Reviewed By: zdevito

Differential Revision: D15298234

fbshipit-source-id: b5ad368a9e9516030805cd8f5f1b02e3986933c0
2019-05-14 18:37:02 -07:00
Karl Ostmo
4ba28deb6e Unify libtorch and libcaffe2 (#17783)
Summary:
This PR is an intermediate step toward the ultimate goal of eliminating "caffe2" in favor of "torch".  This PR moves all of the files that had constituted "libtorch.so" into the "libcaffe2.so" library, and wraps "libcaffe2.so" with a shell library named "libtorch.so".  This means that, for now, `caffe2/CMakeLists.txt` becomes a lot bigger, and `torch/CMakeLists.txt` becomes smaller.

The torch Python bindings (`torch_python.so`) still remain in `torch/CMakeLists.txt`.

The follow-up to this PR will rename references to `caffe2` to `torch`, and flatten the shell into one library.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17783

Differential Revision: D15284178

Pulled By: kostmo

fbshipit-source-id: a08387d735ae20652527ced4e69fd75b8ff88b05
2019-05-10 09:50:53 -07:00
Edward Yang
c397134d6b Revert D15156384: Dict
Differential Revision:
D15156384

Original commit changeset: b9313ec4dd9a

fbshipit-source-id: 3b44f49ec4eaba692cfb2cfe46e5f98102e337d9
2019-05-10 06:11:25 -07:00
Sebastian Messmer
c92129033a Dict (#19976)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19976

Implement a Dict type that allows us to abstract away from the concrete implementation used.
The API is similar to std::unordered_map, but behind the scenes we can switch to any map implementation we like. ska::flat_hash_map, google dense map, or any future map implementation with better performance.
Switching such an implementation choice does not have to break backwards compatibility of kernel code using the Dict type.

Reviewed By: li-roy

Differential Revision: D15156384

fbshipit-source-id: b9313ec4dd9acb3b6a0035345b6ba4f2a437d1e5
2019-05-09 10:54:07 -07:00
Will Feng
0087069dce Use torch::get/set_num_threads without additional includes beyond torch/torch.h (#20176)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/20130.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20176

Differential Revision: D15275036

Pulled By: yf225

fbshipit-source-id: 0f04e1fbfed18c07030b20e92e957ef5f2b5707d
2019-05-09 06:08:27 -07:00
Thiago Crepaldi
3d4d7b9082 Refactor ChunkDataReader API + fix missing headers (#19485)
Summary:
This PR restricts the BatchType template argument of ChunkDataReader to STL
vectors only. Internally, ChunkDataReader was assuming BatchType was a
vector, but the user could pass any type as the template argument,
leading to compilation issues when building C++ extensions.

In addition to the proposed API change, this PR adds missing include
headers to chunk.h. The current implementation works, but if
users try to create C++ extensions that implement new ChunkDataReaders
to be used along with the existing ChunkDataset, the build will fail due to
the missing headers.

In terms of functionality, nothing has changed. This PR simply makes the
implementation slightly more robust for future extensions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19485

Differential Revision: D15261725

Pulled By: soumith

fbshipit-source-id: 38c9465d665392ae6a2d12c5a520a4f501e1a6ca
2019-05-08 22:20:19 -07:00
Will Feng
5099db08d4 Ignore nn::Functional submodules in nn::Module serialization (#19740)
Summary:
Currently, the Python API doesn't serialize layers that don't have weights (such as `nn.ReLU` and `nn.MaxPool2d`, e.g. in https://github.com/pytorch/vision/blob/master/torchvision/models/densenet.py#L80-L81). If one saves a model that contains weight-less layers in Python and tries to load it into C++, the C++ module loading code (`torch::load(...)`) will throw an error complaining that the expected layers are not found in the serialized file (e.g. https://github.com/pytorch/vision/pull/728#issuecomment-480974175). This PR solves the problem by ignoring layers that are not serializable (which currently only include `nn::Functional`) in the C++ module serialization code (`torch::save(...)` and `torch::load(...)`); the user is expected to wrap weight-less layers in `nn::Functional` so that they can be ignored when serializing / deserializing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19740

Differential Revision: D15100575

Pulled By: yf225

fbshipit-source-id: 956481a2355d1de45341585abedda05e35d2ee8b
2019-04-26 12:47:23 -07:00
Will Feng
9aa0e6078f Support serializing std::vector<torch::Tensor> (#19677)
Summary:
In the distributed training development work, we need to be able to serialize a `std::vector` of `torch::Tensor`s. This PR adds support for serializing `std::vector<torch::Tensor>`.

cc. mrshenli
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19677

Differential Revision: D15069860

Pulled By: yf225

fbshipit-source-id: 505147e5f5fea78be1bf60fb8418bc187dbc2a98
2019-04-24 16:50:16 -07:00
Roy Li
ab78449e8c Add ScalarType argument to Type::options() (#19270)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19270
ghimport-source-id: a5ade6131f3260066c5750ea1fa9ed5c998bb791

Differential Revision: D14938707

Pulled By: li-roy

fbshipit-source-id: 018fb3f01706531a06515d6d861e5683a455a705
2019-04-21 21:16:07 -07:00
David Riazati
a0e09216f0 Fix test build (#19444)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19444
ghimport-source-id: c85db00e8037e7f6f0424eb8bd17f957d20b7247

Reviewed By: eellison

Differential Revision: D15008679

Pulled By: driazati

fbshipit-source-id: 0987035116d9d0069794d96395c8ad458ba7c121
2019-04-18 18:05:04 -07:00
David Riazati
d9052b2176 Allow optionals arguments from C++ (#19311)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19311
ghimport-source-id: 699f62eb2bbad53ff2045fb2e217eb1402f2cdc5

Reviewed By: eellison

Differential Revision: D14983059

Pulled By: driazati

fbshipit-source-id: 442f96d6bd2a8ce67807ccad2594b39aae489ca5
2019-04-18 17:15:05 -07:00
Omegastick
31ff0ecd2b Fix torch::nn::init::orthogonal_ with CNNs (#18915)
Summary:
Fixes #18518

I changed the C++ API torch::nn::init::orthogonal_ implementation to match the Python implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18915

Differential Revision: D14851833

Pulled By: ezyang

fbshipit-source-id: 45b5e9741582777c203e9ebed564ab3ac1f94baf
2019-04-09 10:39:15 -07:00
Soumith Chintala
b5d8844bbe push magma init into lazyInitCUDA (#18527)
Summary:
Tries to fix C++ API's usage of MAGMA-based functions.

Attempts to Fix https://github.com/pytorch/pytorch/issues/18074
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18527

Differential Revision: D14691694

Pulled By: soumith

fbshipit-source-id: dd04e74418e486d73ea4a92193ddf79352ed71ba
2019-04-03 12:47:34 -07:00
Will Feng
6ebfbdf4c6 Add named submodule support to nn::Sequential (#17552)
Summary:
Previously, we were not able to assign names to `nn::Sequential`'s submodules. This PR adds this feature to match the Python API. Example use:
```cpp
Sequential sequential(named_submodule({
      {"linear", Linear(10, 3)},
      {"conv2d", Conv2d(1, 2, 3)},
      {"dropout", Dropout(0.5)},
      {"batchnorm", BatchNorm(5)},
      {"embedding", Embedding(4, 10)},
      {"lstm", LSTM(4, 5)}
}));
```

It also enables loading parameters of Python `nn.Sequential` module with custom submodules names into C++ frontend, unblocking https://github.com/pytorch/vision/pull/728#issuecomment-466661344.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17552

Differential Revision: D14246834

Pulled By: yf225

fbshipit-source-id: 3030b5c5d68f6dd5d3e37ac4b4f98dc6d6d9ba72
2019-03-29 13:06:29 -07:00
Roy Li
7aae51cded Replace tensor.type().scalarType() calls with tensor.scalar_type()
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17515

Reviewed By: ezyang

Differential Revision: D14233250

fbshipit-source-id: 6c7af8d2291c0c2b148001b30cf03834f34366c0
2019-03-08 14:08:18 -08:00
Elias Ellison
10ea02facf fix tuple matching (#17687)
Summary:
Check for tuple matching in isSubvalueOf, since tuples may contain container types that need to be recursed into within isSubvalueOf

Fix for https://github.com/pytorch/pytorch/issues/17650
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17687

Differential Revision: D14324642

Pulled By: eellison

fbshipit-source-id: 7f1e019875286b2640a3b9c003d1635dda8cf543
2019-03-06 11:25:36 -08:00
Jaliya Ekanayake
bb3a2d99ac Jaliyae/chunk buffer fix (#17409)
Summary:
The chunk buffer could hang when no data is read and the buffer size is lower than the chunk size. We detected this while running with a larger dataset, hence the fix. I added a test to mimic the situation and validated that the fix works. Thank you Xueyun for finding this issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17409

Differential Revision: D14198546

Pulled By: soumith

fbshipit-source-id: b8ca43b0400deaae2ebb6601fdc65b47f32b0554
2019-02-23 08:48:53 -08:00
Will Feng
be6ad7ddde Rename BatchNorm running_variance to running_var (#17371)
Summary:
Currently there is a mismatch in naming between Python BatchNorm `running_var` and C++ BatchNorm `running_variance`, which causes JIT model parameters loading to fail (https://github.com/pytorch/vision/pull/728#issuecomment-466067138):
```
terminate called after throwing an instance of 'c10::Error'
  what():  No such serialized tensor 'running_variance' (read at /home/shahriar/Build/pytorch/torch/csrc/api/src/serialize/input-archive.cpp:27)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x85 (0x7f2d92d32f95 in /usr/local/lib/libc10.so)
frame #1: torch::serialize::InputArchive::read(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, at::Tensor&, bool) + 0xdeb (0x7f2d938551ab in /usr/local/lib/libtorch.so.1)
frame #2: torch::nn::Module::load(torch::serialize::InputArchive&) + 0x98 (0x7f2d9381cd08 in /usr/local/lib/libtorch.so.1)
frame #3: torch::nn::Module::load(torch::serialize::InputArchive&) + 0xf9 (0x7f2d9381cd69 in /usr/local/lib/libtorch.so.1)
frame #4: torch::nn::Module::load(torch::serialize::InputArchive&) + 0xf9 (0x7f2d9381cd69 in /usr/local/lib/libtorch.so.1)
frame #5: torch::nn::operator>>(torch::serialize::InputArchive&, std::shared_ptr<torch::nn::Module> const&) + 0x32 (0x7f2d9381c7b2 in /usr/local/lib/libtorch.so.1)
frame #6: <unknown function> + 0x2b16c (0x5645f4d1916c in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest)
frame #7: <unknown function> + 0x27a3c (0x5645f4d15a3c in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest)
frame #8: <unknown function> + 0x2165c (0x5645f4d0f65c in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest)
frame #9: <unknown function> + 0x1540b (0x5645f4d0340b in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest)
frame #10: __libc_start_main + 0xf3 (0x7f2d051dd223 in /usr/lib/libc.so.6)
frame #11: <unknown function> + 0x1381e (0x5645f4d0181e in /home/shahriar/Projects/CXX/build-TorchVisionTest-Desktop_Qt_5_12_1_GCC_64bit-Debug/TorchVisionTest)
```
Renaming C++ BatchNorm `running_variance` to `running_var` should fix this problem.

This is a BC-breaking change, but it should be easy for end user to rename `running_variance` to `running_var` in their call sites.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17371

Reviewed By: goldsborough

Differential Revision: D14172775

Pulled By: yf225

fbshipit-source-id: b9d3729ec79272a8084269756f28a8f7c4dd16b6
2019-02-22 08:00:25 -08:00
Jaliya Ekanayake
9477c143c6 C++ Frontend: adding two distributed samples (Random and Sequential) (#16910)
Summary:
Adding two distributed samplers, Random and Sequential, to the mix. Similar to their Python counterparts, DistributedSampler introduces a new method `set_epoch(size_t epoch)` which can be used to shuffle data deterministically between distributed processes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16910

Differential Revision: D14130980

Pulled By: soumith

fbshipit-source-id: ec08b7130c01e2fc6dc3693f7ac622a0a6d60f10
2019-02-19 05:40:37 -08:00
David Riazati
b3d8c569d3 Remove templates for GenericDict
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17175

Differential Revision: D14113022

Pulled By: driazati

fbshipit-source-id: 5183e131cc8ccb58525875f76fa03133570a59ea
2019-02-15 21:35:19 -08:00
Josh Varty
1cdcdd78af Kaiming Initialization (#14718)
Summary:
/cc goldsborough

Working on #14582

The corresponding python implementations are at: [pytorch/torch/nn/init.py](6302e4001a/torch/nn/init.py (L261-L327))

Here is my initial implementation of Kaiming Initialization. I have not been able to figure out how to successfully run tests locally so I haven't added any yet.

A couple questions:
- Are the enums defined in the right place? I copied their names from Python, but do you prefer different naming conventions for C++?
- To run tests locally do I use `python setup.py test`? Can I run just a subset of the tests somehow?
- Should I add my tests at [test/cpp/api/misc.cpp](https://github.com/pytorch/pytorch/blob/master/test/cpp/api/misc.cpp#L47-L54)?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14718

Differential Revision: D14049159

Pulled By: goldsborough

fbshipit-source-id: 966ac5126875936e69b185b5041f16476ed4cf70
2019-02-15 14:58:22 -08:00
Michael Liu
92a516b9ff Apply modernize-use-override - 2/2
Summary:
Use C++11's override and remove virtual where applicable.
Changes are automatically generated.

Reviewed By: Orvid

Differential Revision: D14054721

fbshipit-source-id: 15d266fa1779b1e3ea6270f00841d7fb1e4d44ee
2019-02-13 21:01:28 -08:00
Dmytro Dzhulgakov
46503a7ac0 Trim libshm deps, move tempfile.h to c10 (#17019)
Summary:
libshm_manager doesn't need to depend on all of libtorch. It only uses the tiny tempfile.h, which can be moved to c10. I could just duplicate the file, but it's not worth it as c10 is small enough.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17019

Differential Revision: D14052688

Pulled By: dzhulgakov

fbshipit-source-id: 8797d15f8c7c49c49d40b7ab2f43aa3bf6becb0c
2019-02-13 19:38:35 -08:00
Jaliya Ekanayake
bc39cf4d5e Remove chunk count check on the ChunkBuffer (#16868)
Summary:
Previously, the ChunkBuffer depended on the remaining chunk count to signal the end of data loading. This does not work with distributed samplers, where each sampler only loads a subset of chunks. This refactor removes the dependency on the remaining chunk count from the ChunkBuffer.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16868

Differential Revision: D14066517

Pulled By: goldsborough

fbshipit-source-id: 293dfe282ceff326dff0876c2f75c2ee4f4463e2
2019-02-13 11:09:42 -08:00
David Riazati
ee0e71bee7 Allow dicts in C++ frontend (#16846)
Summary:
Fixes #16856
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16846

Differential Revision: D13991103

Pulled By: driazati

fbshipit-source-id: 4830dd6f707fa90429b5d3070eeda0bee53d2f2b
2019-02-07 18:44:49 -08:00
Edward Yang
4404762d7d Rename IntList to IntArrayRef. (#16751)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16751

This was made more complicated by the fact that ivalue::IntList
is a thing.  So I had to fix all of the sites where we referring
to IValue post facto.

The following codemods were run, in this order:

```
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntList IntArrayRef
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntArrayRef::create IntList::create
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in ivalue::IntArrayRef ivalue::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in Tag::IntArrayRef Tag::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in isIntArrayRef isIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in toIntArrayRef toIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'Shared<IntArrayRef>' 'Shared<IntList>'
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'intrusive_ptr<IntArrayRef>' 'intrusive_ptr<IntList>'
```

Some manual fixups were done afterwards; they can be reviewed separately
at https://github.com/pytorch/pytorch/pull/16752

Reviewed By: dzhulgakov

Differential Revision: D13954363

fbshipit-source-id: b5c40aacba042402155a2f5a229fa6db7992ac64
2019-02-05 14:54:34 -08:00
Elias Ellison
18659e1336 Allow generic containers as module inputs (#16482)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/16326

Previously we didn't handle module inputs which included generic lists. When checking whether a generic list is a subvalue of the input arg type, I currently recurse on every element of the list. This shouldn't be too slow since the innermost list will be specialized and we won't have to check its elements.

E.g. Tensor[][] -> GenericList [TensorList ].

The error message could be improved, but extracting the complete type of nested lists would have to deal with unifying types across lists / empty lists & typevars so I'm going to save that for a follow up PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16482

Differential Revision: D13882582

Pulled By: eellison

fbshipit-source-id: 3609bc572f0ee9ebf20a77ea5ebc8fa3b165e24b
2019-01-30 14:20:56 -08:00