pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Edward Yang	0478d32cb8	Move AlignOf, SmallVector and ArrayRef to c10. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13916 Reviewed By: smessmer Differential Revision: D13046722 fbshipit-source-id: 1583d3170d60e22f0a535cd1fd56bdf928186f5d	2018-11-14 11:13:16 -08:00
Peter Goldsborough	5151d33287	Unflake the ordering enforcement test (#13919) Summary: Attempts to unflake the dataloader ordering enforcement test. I think the issue was that the `thread_counter` variable was not atomic. I've made it atomic, and also global just to make it a bit clearer. Fixes https://github.com/pytorch/pytorch/issues/13634 colesbury SsnL ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/13919 Differential Revision: D13051718 Pulled By: goldsborough fbshipit-source-id: b9f7f6317701a8b861a1d5c6a9b2b17b44782561	2018-11-13 21:05:02 -08:00
Peter Goldsborough	393ad6582d	Use torch:: instead of at:: in all C++ APIs (#13523) Summary: In TorchScript and C++ extensions we currently advocate a mix of `torch::` and `at::` namespace usage. In the C++ frontend I had instead exported all symbols from `at::` and some from `c10::` into the `torch::` namespace. This is far, far easier for users to understand, and also avoids bugs around creating tensors vs. variables. The same should from now on be true for the TorchScript C++ API (for running and loading models) and all C++ extensions. Note that since we're just talking about typedefs, this change does not break any existing code. Once this lands I will update stuff in `pytorch/tutorials` too. zdevito ezyang gchanan Pull Request resolved: https://github.com/pytorch/pytorch/pull/13523 Differential Revision: D12942787 Pulled By: goldsborough fbshipit-source-id: 76058936bd8707b33d9e5bbc2d0705fc3d820763	2018-11-06 14:32:25 -08:00
Peter Goldsborough	8fafa7b6ac	Remove size() from BatchDataset and templatize IndexType (#12960) Summary: This PR brings two changes to the recently landed C++ Frontend dataloader: 1. Removes the `size()` method from `BatchDataset`. This makes it cleaner to implement unsized ("infinite stream") datasets. The method was not used much beyond initial configuration. 2. Makes the index type of a dataset a template parameter of `BatchDataset` and `Sampler`. This essentially allows custom index types instead of only `vector<size_t>`. This greatly improves flexibility. See the `InfiniteStreamDataset` and `TestIndex` datasets in the tests for what this enables. Some additional minor updates and code movements too. apaszke SsnL Pull Request resolved: https://github.com/pytorch/pytorch/pull/12960 Differential Revision: D12893342 Pulled By: goldsborough fbshipit-source-id: ef03ea0f11a93319e81fba7d52a0ef1a125d3108	2018-11-05 17:13:09 -08:00
Peter Goldsborough	c21471c77f	Sampler serialization and deserialization (#12999) Summary: Implements serialization and deserialization for samplers in the C++ frontend dataloader. apaszke Pull Request resolved: https://github.com/pytorch/pytorch/pull/12999 Differential Revision: D10859676 Pulled By: goldsborough fbshipit-source-id: cd132100fd35323e5a3df33e314511750806f48d	2018-10-26 12:20:51 -07:00
Dmytro Dzhulgakov	49046239f2	Change explicit usages of at::optional to c10::optional (#13082) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13082 Follow up of D10511254. For these cases we can move to preferred `optional` without namespace right away. Reviewed By: ezyang, Yangqing Differential Revision: D10844117 fbshipit-source-id: 99a59e692fb4b236b299579f937f1536d443d899	2018-10-25 15:17:53 -07:00
Peter Goldsborough	a022fd2d6b	Implement DataLoader (#11918) Summary: This PR implements a DataLoader API for the C++ frontend. The components present in this API largely match the Python API. It consists of: - `Dataset`s: Conceptually a function from a set of indices to a batch of examples; - `Transform`s: A functional transformation of a dataset. A `Map<D, T>` for Dataset `D` and transform `T` is itself a dataset; - `Sampler`s: Specify a strategy for generating indices for a new batch; - A `DataLoader`, with the ability to automatically parallelize fetching of samples across multiple worker threads; Note that collation functions fall naturally out of the `Map<Dataset, Transform>` abstraction. Things that are missing right now that maybe should be added: - Memory pinning for CUDA tensors The API was designed to be generalizable to almost any kind of dataset, transform or sampling strategy, while providing a convenient API out of the box. To achieve this, it is quite heavily templatized on various possible input types. There are many parts to this PR! Right now, I would like feedback on: - Your impression of the general usability of the API; - Your impression of which parts seem too complex or overthought; - The implementation of the parallelization aspects of the DataLoader. I've followed the Python implementation in some matters, but also differ in others. I think my implementation is a little cleaner and decouples components slightly better than the Python dataloader. I haven't added too many comments yet, as this is fresh out of the oven. Let me know if anything is unclear from the code itself. There also aren't any tests yet. I will write a comprehensive test suite once we agree on the API and implementation. apaszke ezyang pietern Pull Request resolved: https://github.com/pytorch/pytorch/pull/11918 Reviewed By: ezyang Differential Revision: D9998881 Pulled By: goldsborough fbshipit-source-id: 22cf357b63692bea42ddb1cc2abc71dae5030aea	2018-10-22 10:22:41 -07:00
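
The component breakdown in commit a022fd2d6b (datasets as functions from indices to batches, `Map<D, T>` being itself a dataset, samplers producing indices) can be illustrated with a small standalone sketch. This is not the `torch::data` implementation: `SquaresDataset`, `SumTransform`, `SequentialSampler`, `get_batch`, `next` and `load_all` are hypothetical names chosen for illustration, the loop is single-threaded, and worker threads are not modeled at all.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-ins for illustration only; not the torch::data classes.

// A dataset: conceptually a function from a set of indices to a batch.
struct SquaresDataset {
  using BatchType = std::vector<int>;
  BatchType get_batch(const std::vector<std::size_t>& indices) const {
    BatchType batch;
    batch.reserve(indices.size());
    for (std::size_t i : indices) {
      batch.push_back(static_cast<int>(i * i));
    }
    return batch;
  }
};

// A transform: here it doubles as a collation step, summing a batch.
struct SumTransform {
  using OutputType = int;
  OutputType operator()(const std::vector<int>& batch) const {
    return std::accumulate(batch.begin(), batch.end(), 0);
  }
};

// Map<D, T> applies T to every batch of D and is itself a dataset.
template <typename D, typename T>
struct Map {
  using BatchType = typename T::OutputType;
  D dataset;
  T transform;
  BatchType get_batch(const std::vector<std::size_t>& indices) const {
    return transform(dataset.get_batch(indices));
  }
};

// A sampler: yields consecutive indices, and an empty vector when exhausted.
class SequentialSampler {
 public:
  explicit SequentialSampler(std::size_t size) : size_(size) {}
  std::vector<std::size_t> next(std::size_t batch_size) {
    std::vector<std::size_t> indices;
    while (index_ < size_ && indices.size() < batch_size) {
      indices.push_back(index_++);
    }
    return indices;
  }

 private:
  std::size_t size_;
  std::size_t index_ = 0;
};

// A single-threaded loader loop: ask the sampler for indices, fetch a batch.
template <typename Dataset, typename Sampler>
std::vector<typename Dataset::BatchType> load_all(
    const Dataset& dataset, Sampler sampler, std::size_t batch_size) {
  std::vector<typename Dataset::BatchType> batches;
  for (auto indices = sampler.next(batch_size); !indices.empty();
       indices = sampler.next(batch_size)) {
    batches.push_back(dataset.get_batch(indices));
  }
  return batches;
}
```

Calling `load_all` with a `Map<SquaresDataset, SumTransform>`, a `SequentialSampler(5)` and a batch size of 2 produces three collated values, one per batch. Because the sampler signals exhaustion itself, the dataset needs no `size()` method, which is in the spirit of the later change in 8fafa7b6ac.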

7 Commits
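
Commit 5151d33287 attributes the test flakiness to a `thread_counter` that was not atomic. The sketch below shows the general fix pattern with `std::atomic`; it is not the code from the PyTorch test. Only the variable name comes from the commit message, and the helper `count_from_threads` is a hypothetical name introduced here.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Global atomic counter. With a plain std::size_t, concurrent increments
// would be a data race and updates could be lost.
std::atomic<std::size_t> thread_counter{0};

// Hypothetical helper: spawns num_threads workers that each increment the
// counter `increments` times, then returns the final count.
std::size_t count_from_threads(std::size_t num_threads, std::size_t increments) {
  thread_counter.store(0);
  std::vector<std::thread> workers;
  workers.reserve(num_threads);
  for (std::size_t t = 0; t < num_threads; ++t) {
    workers.emplace_back([increments] {
      for (std::size_t i = 0; i < increments; ++i) {
        thread_counter.fetch_add(1);  // atomic read-modify-write
      }
    });
  }
  for (auto& worker : workers) {
    worker.join();
  }
  return thread_counter.load();
}
```

With the atomic counter, the returned value always equals `num_threads * increments`; with a non-atomic counter the same loop has undefined behavior and may undercount.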