This PR:
- renames `torch.set_deterministic` to `torch._set_deterministic`
- renames `torch.is_deterministic` to `torch._is_deterministic`
- modifies the docstrings for both to indicate that the feature is not yet complete.
We would like to do this because this feature is experimental and the
docstrings before this PR are misleading.
This PR does not have an accompanying change in master because there is still discussion over what the eventual state of the feature should be: https://github.com/pytorch/pytorch/issues/15359. I expect that there will be a better plan for this once 1.7 rolls around.
Test Plan:
- wait for CI
Summary:
Adds a `torch.experimental.deterministic` flag to enforce deterministic algorithms across all of PyTorch.
Adds `torch.experimental.deterministic_error_level` to let users choose between error/warning/silent behavior when a deterministic implementation of an operation is not available.
Adds `torch.experimental.alert_not_deterministic()`, which should be called within operations that are not deterministic.
Offers both Python and ATen interfaces.
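A hypothetical usage sketch of the flags described in this summary (the values accepted by `deterministic_error_level` are an assumption here, not taken from the PR):
```python
import torch

# Assumed semantics: 2 = error, 1 = warn, 0 = silent. The actual accepted
# values for deterministic_error_level are not specified in this summary.
torch.experimental.deterministic = True
torch.experimental.deterministic_error_level = 2

x = torch.randn(8, 8)
y = x @ x  # an op that calls alert_not_deterministic() would now alert
```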
Issue https://github.com/pytorch/pytorch/issues/15359
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38683
Differential Revision: D21998093
Pulled By: ezyang
fbshipit-source-id: 23aabbddd20f6199d846f97764ff24d728163737
Summary:
'Program Files' does not have to be on drive C (nor does it necessarily have to be called `Program Files`).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39707
Differential Revision: D21954235
Pulled By: malfet
fbshipit-source-id: 91a9b765cd1bc7e6201dd4b800d45257207010d9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39008
This commit adds a `torch.futures.Future` type and exposes its ctor,
`wait`, `then`, and `set_result` APIs. This type is currently a
wrapper around `c10::ivalue::Future` and is mainly used by RPC for now. Later, we could revamp c10d APIs to return this `Future` type as well. More utils will be added to the `torch.futures` package in followup PRs.
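A minimal local (non-RPC) sketch of the exposed APIs:
```python
import torch

fut = torch.futures.Future()

# `then` chains a callback that receives the completed future and
# returns a new Future wrapping the callback's result.
chained = fut.then(lambda f: f.wait() + 1)

fut.set_result(41)     # mark the future as complete
print(fut.wait())      # 41
print(chained.wait())  # 42
```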
Test Plan: Imported from OSS
Differential Revision: D21723022
Pulled By: mrshenli
fbshipit-source-id: 92e56160544e9bf00d11db3e8347a1b9707882c9
Summary:
This reverts commit 1ab4f35499.
Without this PR, the OS tries to find the DLL in the following directories:
- The directory from which the application loaded.
- The system directory. Use the GetSystemDirectory function to get the path of this directory.
- The 16-bit system directory. There is no function that obtains the path of this directory, but it is searched.
- The Windows directory. Use the GetWindowsDirectory function to get the path of this directory.
- The current directory.
- The directories that are listed in the PATH environment variable. Note that this does not include the per-application path specified by the App Paths registry key. The App Paths key is not used when computing the DLL search path.
If we use LoadLibraryEx with LOAD_LIBRARY_SEARCH_* flags, the directories are searched in the following order:
- The directory that contains the DLL (LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR). This directory is searched only for dependencies of the DLL to be loaded.
- The application directory (LOAD_LIBRARY_SEARCH_APPLICATION_DIR).
- Paths explicitly added to the application search path with the AddDllDirectory function (LOAD_LIBRARY_SEARCH_USER_DIRS) or the SetDllDirectory function. If more than one path has been added, the order in which the paths are searched is unspecified.
- The System32 directory (LOAD_LIBRARY_SEARCH_SYSTEM32).
Advantages:
1. The directory that contains the DLL comes first, which is desirable for us because the dependencies in `lib` should always be preferred.
2. The system directory is considered last. According to some of the bug reports, DLL load failures are caused by loading conflicting DLLs from the system root.
Neutral:
1. The directories in `PATH` are not considered. Similar conflicts to those described in the previous point can come from `PATH`, so ignoring it may be beneficial for normal users. However, it may cause failures if there are new dependencies when building from source. (Resolved by falling back to `LoadLibraryW` if the error code is `126`.)
Disadvantages:
1. LoadLibraryEx with LOAD_LIBRARY_SEARCH_* flags is only available on Win7/2008 R2 + KB2533623 and up. (Resolved by falling back to `LoadLibraryW` if it is not supported.)
2. A failure during the call to `LoadLibraryEx` causes the OS to pop up a modal dialog, which can block the process if the user is on a CLI-only interface. This can be switched off by calling `SetErrorMode`. (Resolved by calling `SetErrorMode`; a sketch of the load-with-fallback logic follows below.)
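A minimal Python sketch of this load-with-fallback logic (the flag values come from the Win32 headers; the structure is an illustration, not the exact code in this PR):
```python
import ctypes
from ctypes import wintypes

LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR = 0x00000100  # from libloaderapi.h
LOAD_LIBRARY_SEARCH_DEFAULT_DIRS = 0x00001000
ERROR_MOD_NOT_FOUND = 126
SEM_FAILCRITICALERRORS = 0x0001

kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
kernel32.LoadLibraryExW.restype = wintypes.HMODULE
kernel32.LoadLibraryExW.argtypes = (wintypes.LPCWSTR, wintypes.HANDLE, wintypes.DWORD)
kernel32.LoadLibraryW.restype = wintypes.HMODULE
kernel32.LoadLibraryW.argtypes = (wintypes.LPCWSTR,)

def load_dll(path):
    # Suppress the modal error dialog mentioned in disadvantage 2.
    prev_mode = kernel32.SetErrorMode(SEM_FAILCRITICALERRORS)
    try:
        flags = LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR | LOAD_LIBRARY_SEARCH_DEFAULT_DIRS
        handle = kernel32.LoadLibraryExW(path, None, flags)
        if not handle and ctypes.get_last_error() == ERROR_MOD_NOT_FOUND:
            # Fall back to the legacy search order so new deps on PATH resolve.
            handle = kernel32.LoadLibraryW(path)
    finally:
        kernel32.SetErrorMode(prev_mode)
    return handle
```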
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38302
Test Plan:
Test some common cases (maybe in a new repo) including:
1. Python 3.6/3.7/3.8, conda python, conda install
2. Python 3.6/3.7/3.8, conda python, pip install
3. Python 3.6/3.7/3.8, official python, pip install
Plus some corner cases like
1. Conflicting DLLs in systemroot or `PATH`
2. Remove some local dependencies and use global ones
References:
1. https://docs.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-seterrormode
2. https://docs.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibraryexa
3. https://docs.microsoft.com/en-us/windows/win32/dlls/dynamic-link-library-search-order#standard-search-order-for-desktop-applications
Differential Revision: D21524090
Pulled By: malfet
fbshipit-source-id: 0cf5e260c91759b0af8c7aa0950a488e3b653ef5
Summary:
…n Windows
Without this PR, the OS tries to find the DLL in the following directories:
- The directory from which the application loaded.
- The system directory. Use the GetSystemDirectory function to get the path of this directory.
- The 16-bit system directory. There is no function that obtains the path of this directory, but it is searched.
- The Windows directory. Use the GetWindowsDirectory function to get the path of this directory.
- The current directory.
- The directories that are listed in the PATH environment variable. Note that this does not include the per-application path specified by the App Paths registry key. The App Paths key is not used when computing the DLL search path.
If we use LoadLibraryEx with LOAD_LIBRARY_SEARCH_* flags, the directories are searched in the following order:
- The directory that contains the DLL (LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR). This directory is searched only for dependencies of the DLL to be loaded.
- The application directory (LOAD_LIBRARY_SEARCH_APPLICATION_DIR).
- Paths explicitly added to the application search path with the AddDllDirectory function (LOAD_LIBRARY_SEARCH_USER_DIRS) or the SetDllDirectory function. If more than one path has been added, the order in which the paths are searched is unspecified.
- The System32 directory (LOAD_LIBRARY_SEARCH_SYSTEM32).
Advantages:
1. The directory that contains the DLL comes first, which is desirable for us because the dependencies in `lib` should always be preferred.
2. The system directory is considered last. According to some of the bug reports, DLL load failures are caused by loading conflicting DLLs from the system root.
Neutral:
1. The directories in `PATH` are not considered. Similar conflicts to those described in the previous point can come from `PATH`, so ignoring it may be beneficial for normal users. However, it may cause failures if there are new dependencies when building from source. (Resolved by falling back to `LoadLibraryW` if the error code is `126`.)
Disadvantages:
1. LoadLibraryEx with LOAD_LIBRARY_SEARCH_* flags is only available on Win7/2008 R2 + KB2533623 and up. (Resolved by falling back to `LoadLibraryW` if it is not supported.)
2. A failure during the call to `LoadLibraryEx` causes the OS to pop up a modal dialog, which can block the process if the user is on a CLI-only interface. This can be switched off by calling `SetErrorMode`. (Resolved by calling `SetErrorMode`.)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37763
Test Plan:
Test some common cases (maybe in a new repo) including:
1. Python 3.6/3.7/3.8, conda python, conda install
2. Python 3.6/3.7/3.8, conda python, pip install
3. Python 3.6/3.7/3.8, official python, pip install
Plus some corner cases like
1. Conflicting DLLs in systemroot or `PATH`
2. Remove some local dependencies and use global ones
References:
1. https://docs.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-seterrormode
2. https://docs.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibraryexa
3. https://docs.microsoft.com/en-us/windows/win32/dlls/dynamic-link-library-search-order#standard-search-order-for-desktop-applications
What do you think, malfet ezyang?
Differential Revision: D21496081
Pulled By: malfet
fbshipit-source-id: aa5e528e5134326b00ac98982f4db4b4bbb47a44
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38157
This removes the error-prone process of assembling `torch/__init__.pyi`
(and frequently forgetting to expose things), since now we can simply
rely on the true source file to get things done. Most of the old
codegen in gen_pyi.py is now rerouted to various files:
- `torch/_C/__init__.pyi` (the dumping pile of all misc bindings)
- `torch/_C/_nn.pyi` (NN function bindings)
- `torch/_C/_VariableFunctions.pyi` (torch function bindings)
`torch.types` grew a bunch more definitions that were previously defined in `torch/__init__.pyi`.
Some miscellaneous changes:
- Fixed a bug where we treated a single TensorList argument as implying that varargs are accepted. This is actually only supported for IntList. This means we can now correctly generate a stub for `dequantize`.
- Add missing manual stub for nonzero
- Switched torch/onnx/operators.py to directly refer to the `_C` module, since apparently mypy doesn't think that methods prefixed with underscores get re-exported. This may be a recurring theme; maybe we need to find a better way to solve it.
Because I was really lazy, I dumped namedtuple definitions in both
`torch._C` and `torch._C._VariableFunctions`. This is definitely wrong.
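For reference, a hypothetical sketch of the stub shape these files carry (signatures invented for illustration, not the generated text):
```python
# torch/_C/_VariableFunctions.pyi (illustrative only)
from torch import Tensor

def dequantize(input: Tensor) -> Tensor: ...
def nonzero(input: Tensor) -> Tensor: ...
```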
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D21497400
Pulled By: ezyang
fbshipit-source-id: 07b126141c82efaca37be27c07255cb2b9b3f064
Summary:
`is_tensor` doesn't really have a reason to exist anymore (other than
backwards compatibility) and is worse for typechecking with mypy (see
gh-32824). Given that it may not be obvious what the fix is once mypy
gives an error, make the change in a number of places at once, and add
a note on this to the `is_tensor` docstring.
Recommending an isinstance check instead has been done for quite a
while, e.g. https://github.com/pytorch/pytorch/pull/7769#discussion_r190458971
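A small sketch of the recommended change:
```python
import torch

x = torch.zeros(3)

# Before: opaque to mypy's type narrowing
if torch.is_tensor(x):
    print(x.shape)

# After: isinstance lets mypy narrow the type of x
if isinstance(x, torch.Tensor):
    print(x.shape)
```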
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38062
Differential Revision: D21470963
Pulled By: ezyang
fbshipit-source-id: 98dd60d32ca0650abd2de21910b541d32b0eea41
Summary:
xref gh-32838, gh-34032
This is a major refactor of parts of the documentation, splitting it up using sphinx's `autosummary` feature, which builds out `autofunction` and `autoclass` stub files and links to them. The end result is that the top module pages like torch.nn.rst and torch.rst are now more like tables of contents for the actual single-class or single-function documentation pages.
Along the way, I modified many of the docstrings to eliminate sphinx warnings when building. I think the only thing I changed from a non-documentation perspective is adding names to `__all__` when adding them to `globals()` in `torch/__init__.py`.
I do not know the CI system: are the documentation build artifacts available after the build, so reviewers can preview before merging?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37419
Differential Revision: D21337640
Pulled By: ezyang
fbshipit-source-id: d4ad198780c3ae7a96a9f22651e00ff2d31a0c0f
Summary:
Following up on https://github.com/pytorch/pytorch/pull/35851: cross-dtype storage copy is not being used internally, so I have not included cross-dtype copy for complex.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35771
Differential Revision: D21319650
Pulled By: anjali411
fbshipit-source-id: 07c72996ee598eba0cf401ad61534494d6f5b5b3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36682
For fb internal builds we need to separate whether to use the global deps library from whether to load it with RTLD_GLOBAL.
Test Plan: CI -- this should be a no-op for existing builds
Reviewed By: ezyang
Differential Revision: D21051427
fbshipit-source-id: 83bb703d6ceb0265a4c58166749312a44172e78c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36151
Python 2 has reached end-of-life and is no longer supported by PyTorch. To avoid confusing behavior when trying to use PyTorch with Python 2, detect this case early and fail with a clear message. This commit covers `import torch` only and not C++ for now.
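A minimal sketch of the early check (the message text here is illustrative):
```python
import sys

if sys.version_info < (3,):
    raise Exception(
        "Python 2 has reached end-of-life and is no longer supported by PyTorch."
    )
```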
Test Plan: waitforsandcastle
Reviewed By: dreiss
Differential Revision: D20894381
fbshipit-source-id: a1073b7a648e07cf10cda5a99a2cf4eee5a89230
Summary:
This reverts commit d7a7bcb042.
The previous commit is not useful because torch_global_deps doesn't include any external dependencies.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35355
Differential Revision: D20653036
Pulled By: ezyang
fbshipit-source-id: 6d2e2f90952ca865b27b649a6ff9114ada8ea78c
Summary:
This PR implements the following linear algebra algorithms for low-rank matrices:
- [x] Approximate `A` as `Q Q^H A` - using Algorithm 4.4 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061).
+ exposed as `torch.lowrank.get_approximate_basis(A, q, niter=2, M=None) -> Q`
+ [x] dense matrices
+ [x] batches of dense matrices
+ [x] sparse matrices
+ [x] documentation
- [x] SVD - using Algorithm 5.1 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061).
+ uses `torch.lowrank.get_approximate_basis`
+ exposed as `torch.svd_lowrank(A, q=6, niter=2, M=None) -> (U, S, V)`
+ [x] dense matrices
+ [x] batches of dense matrices
+ [x] sparse matrices
+ [x] documentation
- [x] PCA - using `torch.svd_lowrank`
+ uses `torch.svd_lowrank`
+ exposed as `torch.pca_lowrank(A, center=True, q=None, niter=2) -> (U, S, V)`
+ [x] dense matrices
+ [x] batches of dense matrices
+ [x] sparse matrices, uses non-centered sparse matrix algorithm
+ [x] documentation
- [x] generalized eigenvalue solver using the original LOBPCG algorithm [Knyazev, 2001](https://epubs.siam.org/doi/abs/10.1137/S1064827500366124)
+ exposed as `torch.lobpcg(A, B=None, k=1, method="basic", ...)`
+ [x] dense matrices
+ [x] batches of dense matrices
+ [x] sparse matrices
+ [x] documentation
- [x] generalized eigenvalue solver using robust LOBPCG with orthogonal basis selection [Stathopoulos, 2002](https://epubs.siam.org/doi/10.1137/S1064827500370883)
+ exposed as `torch.lobpcg(A, B=None, k=1, method="ortho", ...)`
+ [x] dense matrices
+ [x] batches of dense matrices
+ [x] sparse matrices
+ [x] documentation
- [x] generalized eigenvalue solver using the robust and efficient LOBPCG Algorithm 8 from [Duersch et al, 2018](https://epubs.siam.org/doi/abs/10.1137/17M1129830) that switches to orthogonal basis selection automatically
+ the "ortho" method improves iterations so rapidly that in the current test cases it does not make sense to use the basic iterations at all. If users will have matrices for which basic iterations could improve convergence then the `tracker` argument allows breaking the iteration process at user choice so that the user can switch to the orthogonal basis selection if needed. In conclusion, there is no need to implement Algorithm 8 at this point.
- [x] benchmarks
+ [x] `torch.svd` vs `torch.svd_lowrank`, see notebook [Low-rank SVD](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/Low-rank%20SVD.ipynb). In conclusion, the low-rank SVD is going to be useful only for large sparse matrices where the full-rank SVD will fail due to memory limitations.
+ [x] `torch.lobpcg` vs `scipy.sparse.linalg.lobpcg`, see notebook [LOBPCG - pytorch vs scipy](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/LOBPCG%20-%20pytorch%20vs%20scipy.ipynb). In conclusion, both implementations give the same results (up to numerical errors from different methods); the scipy lobpcg implementation is generally faster.
+ [x] For very small tolerances, `torch.lobpcg` is more robust than `scipy.sparse.linalg.lobpcg` (see `test_lobpcg_scipy` results)
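For illustration, a small usage sketch of the exposed entry points (shapes and arguments chosen arbitrarily):
```python
import torch

A = torch.randn(100, 40)

# Low-rank SVD: A is approximately U @ diag(S) @ V.t() with rank q
U, S, V = torch.svd_lowrank(A, q=6, niter=2)
print((A - U @ torch.diag(S) @ V.t()).norm())

# LOBPCG: the k largest eigenpairs of a symmetric positive-definite matrix
M = A.t() @ A
E, X = torch.lobpcg(M, k=3, method="ortho")
print(E)
```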
Resolves https://github.com/pytorch/pytorch/issues/8049.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29488
Differential Revision: D20193196
Pulled By: vincentqb
fbshipit-source-id: 78a4879912424595e6ea95a95e483a37487a907e
Summary:
The way DLL loading works on the Anaconda distribution of Python 3.8 is a bit different. Loading DLLs explicitly (e.g. `ctypes.CDLL`) relies on paths added by `os.add_dll_directory`. But if you try to load DLLs implicitly (e.g. `from torch._C import *`), it relies on `PATH`.
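A short sketch of the explicit registration on 3.8 (the path below is hypothetical):
```python
import os

# Python 3.8+: PATH is no longer consulted for explicit DLL loads;
# directories must be registered up front.
if hasattr(os, "add_dll_directory"):
    os.add_dll_directory(r"C:\hypothetical\path\to\torch\lib")
```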
Fixes https://github.com/pytorch/vision/issues/1916.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33856
Differential Revision: D20150080
Pulled By: soumith
fbshipit-source-id: cdbe76c138ea259ef7414c6634d4f7e0b1871af3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31162
This should help us resolve a multitude of weird segfaults and crashes
when PyTorch is imported along with other packages. Those would often
happen because libtorch symbols were exposed globally and could be used
as a source of relocations in shared libraries loaded after libtorch.
Fixes #3059.
Some of the subtleties in preparing this patch:
* Getting ASAN to play ball was a pain in the ass. The basic problem is that when we load with `RTLD_LOCAL`, we may now load a library multiple times into the address space; this happens when we have custom C++ extensions. Since the libraries are usually identical, this is usually benign, but it is technically undefined behavior and UBSAN hates it. I tried a few fixes to get things to "work" correctly: I preload libstdc++ (so that it is seen consistently across all library loads) and turned off vptr checks entirely. Another possibility is that we should have a mode where we use RTLD_GLOBAL to load _C, which would be acceptable in environments where you're sure C++ lines up correctly. There's a long comment in the test script going into more detail about this.
* Making some of our shared library dependencies load with `RTLD_LOCAL` breaks them. OpenMPI and MKL don't work; they play linker shenanigans to look up their symbols, which doesn't work when loaded locally, and if we load a library with `RTLD_LOCAL` we aren't able to subsequently see it with `ctypes`. To solve this problem, we employ a clever device invented by apaszke: we create a dummy library `torch_global_deps` with dependencies on all of the libraries which need to be loaded globally, and then load that with `RTLD_GLOBAL`. As long as none of these libraries have C++ symbols, we can avoid confusion about the C++ standard library. (A sketch follows below.)
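A minimal sketch of that device, assuming a hypothetical install layout:
```python
import ctypes
import os

# Load the dummy aggregation library with RTLD_GLOBAL so that libraries
# like MKL and OpenMPI can keep resolving their symbols globally ...
lib_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "lib")
ctypes.CDLL(os.path.join(lib_dir, "libtorch_global_deps.so"),
            mode=ctypes.RTLD_GLOBAL)

# ... while _C itself is subsequently imported normally, i.e. RTLD_LOCAL.
from torch._C import *  # noqa: F401,F403
```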
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D19262579
Test Plan: Imported from OSS
Pulled By: ezyang
fbshipit-source-id: 06a48a5d2c9036aacd535f7e8a4de0e8fe1639f2
Summary:
Fixes https://github.com/pytorch/pytorch/issues/28389
Intel's OpenMP implementation sets the thread affinity on the first call to an OpenMP function after a fork. By adding an atfork handler we can force this to happen before a user tries to set the affinity in their own DataLoader `worker_init_fn`.
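A Python-level sketch of the idea (the actual fix registers a C-level atfork handler; `os.register_at_fork` is used here purely as an illustration):
```python
import os
import torch

def _enter_openmp_runtime():
    # Any call that enters the OpenMP runtime makes Intel OpenMP pin thread
    # affinity now, before a worker_init_fn gets a chance to change it.
    torch.set_num_threads(torch.get_num_threads())

os.register_at_fork(after_in_child=_enter_openmp_runtime)
```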
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29006
Differential Revision: D18782456
Pulled By: ezyang
fbshipit-source-id: ce0b515256da0cf18ceb125e0cdec99a3311bbd3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25680
Add a runtime flag to choose between FBGEMM and QNNPACK when compiled with both.
The flag can be set via `torch.backends.quantized.engine = torch.fbgemm/torch.qnnpack` or `ctx::setPreferredQuantizedEngine(at::QEngine)`.
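A short sketch following the API spelling in this summary (assumes a build that includes both backends):
```python
import torch

# Select QNNPACK at runtime; torch.fbgemm would select FBGEMM instead.
torch.backends.quantized.engine = torch.qnnpack
print(torch.backends.quantized.engine)
```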
ghstack-source-id: 89935643
Test Plan: Verified torch.backends.quantized.engine works
Differential Revision: D17198233
fbshipit-source-id: e5449d06f4136385e0e6d18bd4237f8654a61672
Summary:
I have some test code in there as well, along with a script "test_libtorch" to run it. You'll need to modify `test_libtorch` to point to where you have `pytorch` built. I currently require that `pybind11` is included as a subdirectory of the test, but added it to the `.gitignore` to make this reviewable.
Currently, something like this works:
```cpp
#include <iostream>
// (plus the torchbind registration header added in this PR)

struct Foo {
  int x, y;
  Foo() : x(2), y(5) {}
  Foo(int x_, int y_) : x(x_), y(y_) {}
  void display() {
    std::cout << "x: " << x << ' ' << "y: " << y << std::endl;
  }
  int64_t add(int64_t z) {
    return (x + y) * z;
  }
  // Declared so the registration below compiles; see the
  // non-pointer-class issue in the list further down.
  void combine(Foo x);
};

static auto test = torch::jit::class_<Foo>("Foo")
    .def(torch::jit::init<int64_t, int64_t>())
    .def("display", &Foo::display)
    .def("add", &Foo::add)
    .def("combine", &Foo::combine);
```
with
```py
@torch.jit.script
def f(x):
    val = torch._C.Foo(5, 3)
    val.display()
    print(val.add(3))
```
results in
```
x: 5 y: 3
24
```
Current issues:
- [x] The Python class created by TorchScript doesn't interact properly with the surrounding code.
```
@torch.jit.script
def f(x):
    val = torch._C.Foo(5, 3)
    return val
```
- [x] Doesn't properly take in non-pointer classes. Can't define this function signature in C++ (we don't want to support this, I believe).
```cpp
void combine(Foo x) {
```
- [x] Has some issues with memory for blobs when constructing multiple objects (fix constant propagation pass to not treat capsules as the same object).
```py
@torch.jit.script
def f(x):
    val = torch._C.Foo(5, 3)
    val2 = torch._C.Foo(100, 0)
    val.display()
    print(val.add(3))
```
- [ ] Can't define multiple constructors (we would need to define an overload string; currently not possible since we don't support overloaded methods).
- [x] `init` has slightly different syntax than `pybind`: `.init<...>()` instead of `.def(py::init<>())`.
- [x] I couldn't figure out how to add some files into the build so they'd be copied to the `include/` directories, so I symlinked them manually.
- [ ] Currently, the conversion from Python into Torchscript doesn't work.
- [ ] Torchbind also currently requires a Python/Pybind dependency. Fixing this would probably involve some kind of macro to bind into Python when possible.
- [ ] We pass back into Python by value, currently. There's no way of passing by reference.
- [x] Currently can only register one method with the same type signature. This is because we create a `static auto opRegistry`, and the function is templated on the type signature.
Somewhat blocked on https://github.com/pytorch/pytorch/pull/21177. We currently use some structures that will be refactored by that PR (namely `return_type_to_ivalue` and `ivalue_to_arg_type`).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21098
Differential Revision: D16634872
Pulled By: Chillee
fbshipit-source-id: 1408bb89ea649c27d560df59e2cf9920467fe1de
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23003
Adds `torch.quantization.fuse_module` and the `torch.nn._intrinsic` convRelu and LinearRelu modules.
A fusion function combines specific modules: (conv, bn) and (conv, bn, relu).
In all cases, modules are replaced in place. The first module is replaced with the `_intrinsic` fused module and the remaining modules are replaced by `nn.Identity`.
Supports both training and eval. For training, the modules are "fused" with a sequential container, to allow for further module swaps for quantization-aware training.
Also adds `torch.nn._intrinsic` for convRelu and LinearRelu.
TODO: Add tests for _intrinsic modules.
Conv BN fusion code is based on DsKhudia's implementation
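An illustrative sketch of eval-time fusion along these lines (the plural helper name and list-of-names argument are assumptions about the final API shape):
```python
import torch
from torch import nn

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
model.eval()

# Fuse (conv, bn, relu) in place: index '0' becomes the fused module,
# while '1' and '2' are replaced by nn.Identity.
fused = torch.quantization.fuse_modules(model, [['0', '1', '2']])
print(fused)
```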
Differential Revision: D16199720
fbshipit-source-id: 95fb9ffe72b361d280313b2ec57de2acd4f9dda2
Summary:
This is a modified version of https://github.com/pytorch/pytorch/pull/14705, since the commit structure for that PR is quite messy.
1. Add `IterableDataset`.
2. So we have two data loading modes: `Iterable` and `Map`.
   1. `Iterable` if the `dataset` is an instance of `IterableDataset`
   2. `Map` otherwise.
3. Add better support for non-batch loading (i.e., `batch_size=None` and `batch_sampler=None`). This is useful for doing things like bulk loading.
4. Refactor `DataLoaderIter` into two classes, `_SingleProcessDataLoaderIter` and `_MultiProcessingDataLoaderIter`. Rename some methods to be more generic, e.g., `get_batch` -> `get_data`.
5. Add `torch.utils.data.get_worker_info`, which returns worker information in a worker process (e.g., worker id, dataset object copy, etc.) and can be used in `IterableDataset.__iter__` and `worker_init_fn` to do per-worker configuration (see the sketch after this list).
6. Add `ChainDataset`, which is the analog of `ConcatDataset` for `IterableDataset`.
7. Import torch.utils.data in `torch/__init__.py`.
8. Add data loader examples and documentation.
9. Use `get_worker_info` to detect whether we are in a worker process in `default_collate`.
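A small sketch of items 1, 2, and 5 together: an `IterableDataset` that shards its stream across workers via `get_worker_info` (the dataset itself is invented for illustration):
```python
import torch
from torch.utils.data import DataLoader, IterableDataset, get_worker_info

class RangeStream(IterableDataset):
    def __init__(self, start, end):
        self.start, self.end = start, end

    def __iter__(self):
        info = get_worker_info()
        if info is None:  # single-process loading: yield everything
            return iter(range(self.start, self.end))
        # In a worker process: yield a disjoint shard per worker.
        per_worker = -(-(self.end - self.start) // info.num_workers)  # ceil
        lo = self.start + info.id * per_worker
        return iter(range(lo, min(lo + per_worker, self.end)))

# batch_size=None enables the non-batch loading mode from item 3.
loader = DataLoader(RangeStream(0, 10), batch_size=None, num_workers=2)
```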
Closes https://github.com/pytorch/pytorch/issues/17909, https://github.com/pytorch/pytorch/issues/18096, https://github.com/pytorch/pytorch/issues/19946, and some of https://github.com/pytorch/pytorch/issues/13023
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19228
Reviewed By: bddppq
Differential Revision: D15058152
fbshipit-source-id: 9e081a901a071d7e4502b88054a34b450ab5ddde
Summary:
https://github.com/pytorch/pytorch/pull/17072 breaks `model.to(xla_device)`, because moving `model` to an XLA device involves changing its parameters' TensorImpl type, and the current implementation of `nn.Module.to()` doesn't support changing module parameters' TensorImpl type:
```python
# 6dc445e1a8/torch/nn/modules/module.py (L192-L208)
def _apply(self, fn):
    ...
    for param in self._parameters.values():
        if param is not None:
            # Tensors stored in modules are graph leaves, and we don't
            # want to create copy nodes, so we have to unpack the data.
            param.data = fn(param.data)  # NOTE: this doesn't allow changing `param.data`'s TensorImpl type
            if param._grad is not None:
                param._grad.data = fn(param._grad.data)  # NOTE: this doesn't allow changing `param._grad.data`'s TensorImpl type
    ...
```
yf225 TODO: fix the description here when we finish the implementation
To fix this problem, we introduce a new API `model.to_()` that always assigns new tensors to the parameters (thus supporting changing the parameters to any TensorImpl type), and also bumps the version counter of the original parameters correctly so that they are invalidated in any autograd graph they participate in.
We also add a warning to the current `model.to()` API to inform users about the upcoming behavior change of `model.to()`: in future releases, it will create and return a new model instead of in-place updating the current model.
This unblocks adding XLA to our CI test suite, which also allows XLA to catch up with other changes in our codebase, notably the c10 dispatcher.
[xla ci]
cc. resistor ailzhang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21613
Differential Revision: D15895387
Pulled By: yf225
fbshipit-source-id: b79f230fb06019122a37fdf0711bf2130a016fe6
Summary:
* `torch.hub.list('pytorch/vision')` - show all available hub models in `pytorch/vision`
* `torch.hub.show('pytorch/vision', 'resnet18')` - show docstring & example for `resnet18` in `pytorch/vision`
* Moved `torch.utils.model_zoo.load_url` to `torch.hub.load_state_dict_from_url` and deprecate `torch.utils.model_zoo`
* We have too many env variables controlling where the cache dir is; it's not really necessary. I actually want to unify `TORCH_HUB_DIR`, `TORCH_HOME` and `TORCH_MODEL_ZOO`, but haven't done it. (more suggestions are welcome!)
* Simplify the `pytorch/vision` example in the docs; it was used to show how a hub entrypoint can be written, so it had some confusing, unnecessary args.
An example of hub usage is shown below
```
In [1]: import torch
In [2]: torch.hub.list('pytorch/vision', force_reload=True)
Downloading: "https://github.com/pytorch/vision/archive/master.zip" to /private/home/ailzhang/.torch/hub/master.zip
Out[2]: ['resnet18', 'resnet50']
In [3]: torch.hub.show('pytorch/vision', 'resnet18')
Using cache found in /private/home/ailzhang/.torch/hub/vision_master
Resnet18 model
pretrained (bool): a recommended kwargs for all entrypoints
args & kwargs are arguments for the function
In [4]: model = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)
Using cache found in /private/home/ailzhang/.torch/hub/vision_master
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18758
Differential Revision: D14883651
Pulled By: ailzhang
fbshipit-source-id: 6db6ab708a74121782a9154c44b0e190b23e8309