Commit Graph

155 Commits

Author SHA1 Message Date
albanD
16c5b7b3f2 Avoid leaking has_torch_function and handle_torch_function in torch namespace (#46680)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46680

Reviewed By: zou3519

Differential Revision: D24459823

Pulled By: albanD

fbshipit-source-id: 4ff6925afcf14214dc45921bca0d2f33ca1944a1
2020-10-22 07:48:36 -07:00
Supriya Rao
04526a49d3 [quant] creating quint4x2 dtype for quantized tensors (#44678)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44678

This is a prototype PR that introduces 4-bit qtensors. The new dtype added for this is c10::quint4x2.
The underlying storage is still uint8_t, so we pack two 4-bit values into each byte during quantization.

This change uses most of the existing scaffolding for qtensor storage. We allocate storage
based on the dtype before creating a new qtensor.

It also adds a dispatch mechanism for this dtype so we can use it to get the bit width, qmin, and qmax info
while quantizing and packing the qtensor (which will matter when we add 2-bit qtensors).

Kernels that use this dtype should be aware of the packing format.
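The packing scheme (two 4-bit values per byte) can be sketched in plain Python. This is an illustration only: the nibble order chosen here is an assumption, not necessarily the layout the quantized kernels use.

```python
def pack_quint4x2(values):
    """Pack pairs of 4-bit unsigned values (0..15) into single bytes."""
    assert len(values) % 2 == 0
    packed = bytearray()
    for lo, hi in zip(values[0::2], values[1::2]):
        assert 0 <= lo <= 15 and 0 <= hi <= 15
        packed.append((hi << 4) | lo)  # assumed order: second value in the high nibble
    return bytes(packed)

def unpack_quint4x2(packed):
    out = []
    for byte in packed:
        out.append(byte & 0x0F)  # low nibble first
        out.append(byte >> 4)    # then high nibble
    return out
```

Round-tripping a list of 4-bit values through these two functions returns the original list, at half the byte count.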

Test Plan:
Locally tested
```
import os
import torch

x = torch.ones((100, 100), dtype=torch.float)
qx_8bit = torch.quantize_per_tensor(x, scale=1.0, zero_point=2, dtype=torch.quint8)
qx = torch.quantize_per_tensor(x, scale=1.0, zero_point=2, dtype=torch.quint4x2)

torch.save(x, "temp.p")
print('Size float (B):', os.path.getsize("temp.p"))
os.remove('temp.p')

torch.save(qx_8bit, "temp.p")
print('Size quantized 8bit(B):', os.path.getsize("temp.p"))
os.remove('temp.p')

torch.save(qx, "temp.p")
print('Size quantized 4bit(B):', os.path.getsize("temp.p"))
os.remove('temp.p')
```

Size float (B): 40760
Size quantized 8bit(B): 10808
Size quantized 4bit(B): 5816
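As a sanity check on the figures above, the raw payload sizes work out as expected; the remaining ~800 bytes in each file are serialization overhead (an observation from the numbers, not a spec):

```python
n = 100 * 100                 # elements in the test tensor
float_payload = n * 4         # float32: 4 bytes per element
q8_payload = n                # quint8: 1 byte per element
q4_payload = n // 2           # quint4x2: two elements per byte
print(float_payload, q8_payload, q4_payload)
```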

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D23993134

fbshipit-source-id: 073bf262f9680416150ba78ed2d932032275946d
2020-10-01 23:53:34 -07:00
Mike Ruberry
b2925671b6 Updates deterministic flag to throw a warning, makes docs consistent (#45410)
Summary:
Per feedback in the recent design review. Also tweaks the documentation to clarify what "deterministic" means and adds a test for the behavior.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45410

Reviewed By: ngimel

Differential Revision: D23974988

Pulled By: mruberry

fbshipit-source-id: e48307da9c90418fc6834fbd67b963ba2fe0ba9d
2020-09-29 11:17:33 -07:00
Iurii Zdebskyi
722faeb2a4 [RELAND] Added optimizers based on multi tensor apply (#45408)
Summary:
Original PR https://github.com/pytorch/pytorch/pull/45299. The present PR fixes minor bugs that caused the revert.

Adding a new namespace `torch.optim._multi_tensor` with a set of updated optimizers. These optimizers use the `_foreach` APIs, which improve performance significantly.

### Tests
- updated existing tests to use both optimizers
- added `test_multi_tensor_optimizers` test to verify correctness.
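The multi-tensor idea can be sketched in plain Python, with lists of floats standing in for tensors. The real `_foreach` APIs fuse the per-tensor kernel launches into one batched launch; this toy version only shows the shape of the transformation.

```python
def sgd_step_per_tensor(params, grads, lr):
    # baseline: one update (one kernel launch, in the real thing) per tensor
    return [[p - lr * g for p, g in zip(ps, gs)]
            for ps, gs in zip(params, grads)]

def sgd_step_foreach(params, grads, lr):
    # multi-tensor apply: conceptually a single fused update over all
    # parameters at once (toy analogue of torch._foreach_add_ with alpha=-lr)
    flat_p = [p for ps in params for p in ps]
    flat_g = [g for gs in grads for g in gs]
    flat = [p - lr * g for p, g in zip(flat_p, flat_g)]
    out, i = [], 0
    for ps in params:           # re-split to the original per-tensor shapes
        out.append(flat[i:i + len(ps)])
        i += len(ps)
    return out
```

Both versions produce identical updates; the win comes entirely from doing the work in one pass.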

### Perf results

**Adam**
timeit: 42.69 ms --> 10.16 ms
autorange: 41.96 ms --> 10.28 ms

**AdamW**
timeit: 51.38 ms --> 15.63 ms
autorange: 50.82 ms --> 16.07 ms

**SGD**
timeit: 6.28 ms --> 4.40 ms
autorange: 6.13 ms --> 4.73 ms

**RMSprop**
timeit: 28.63 ms --> 5.89 ms
autorange: 28.27 ms -->  5.76 ms

**Rprop**
timeit: 213.30 --> 178.42
autorange: 212.03 --> 178.03

**ASGD**
timeit: 21.67 --> 9.33
autorange: 21.64 --> 9.27

**Adamax**
timeit: 55.60 --> 48.29
autorange: 55.22 -> 49.13

**Perf script used**

```
import torch
import torch.optim as optim
import torch.nn as nn
import torchvision
import torch.utils._benchmark as benchmark_utils

device = "cuda"
model = torchvision.models.resnet.resnet101(pretrained=True).to(device)
targets = torch.randint(0, 1000, (100, 100), device=device)
criterion = nn.CrossEntropyLoss()

optimizer = optim.SGD(model.parameters(), lr=1e-3)  # optimizer under test:
                                                    # compare optim.SGD vs optim._multi_tensor.SGD
running_loss = 0.0
target = torch.empty(128, dtype=torch.long, device=device).random_(5)

optimizer.zero_grad()
inputs = torch.rand(128, 3, 100, 100, device=device , requires_grad=True)
outputs = model(inputs)
loss = criterion(outputs, target)
loss.backward()
optimizer.step()
running_loss += loss.item()

def main():
    timer = benchmark_utils.Timer(
        stmt="optimizer.step()",
        globals=globals(),
        label=str(optimizer),
    )

    for i in range(1):
        print(f"Run: {i}\n{'-' * 40}")
        print(f"timeit:\n{timer.timeit(1000)}\n")
        print(f"autorange:\n{timer.blocked_autorange()}\n\n")

if __name__ == "__main__":
    main()
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45408

Reviewed By: gchanan

Differential Revision: D23956680

Pulled By: izdeby

fbshipit-source-id: c5eab7bf5fce14a287c15cead1cdc26e42cfed94
2020-09-28 13:14:04 -07:00
Mike Ruberry
54a253fded Revert D23931987: Added optimizers based on multi tensor apply
Test Plan: revert-hammer

Differential Revision:
D23931987 (2b21e7767e)

Original commit changeset: 582134ef2d40

fbshipit-source-id: ffd500aea55fda34155442fb15e2529cb9c00100
2020-09-26 18:11:54 -07:00
Iurii Zdebskyi
2b21e7767e Added optimizers based on multi tensor apply (#45299)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45299

Adding a new namespace `torch.optim._multi_tensor` with a set of updated optimizers. These optimizers use the `_foreach` APIs, which improve performance significantly.

### Tests
- updated existing tests to use both optimizers
- added `test_multi_tensor_optimizers` test to verify correctness.

### Perf results

**Adam**
timeit: 42.69 ms --> 10.16 ms
autorange: 41.96 ms --> 10.28 ms

**AdamW**
timeit: 51.38 ms --> 15.63 ms
autorange: 50.82 ms --> 16.07 ms

**SGD**
timeit: 6.28 ms --> 4.40 ms
autorange: 6.13 ms --> 4.73 ms

**RMSprop**
timeit: 28.63 ms --> 5.89 ms
autorange: 28.27 ms -->  5.76 ms

**Rprop**
timeit: 213.30 --> 178.42
autorange: 212.03 --> 178.03

**ASGD**
timeit: 21.67 --> 9.33
autorange: 21.64 --> 9.27

**Adamax**
timeit: 55.60 --> 48.29
autorange: 55.22 -> 49.13

**Perf script used**

```
import torch
import torch.optim as optim
import torch.nn as nn
import torchvision
import torch.utils._benchmark as benchmark_utils

device = "cuda"
model = torchvision.models.resnet.resnet101(pretrained=True).to(device)
targets = torch.randint(0, 1000, (100, 100), device=device)
criterion = nn.CrossEntropyLoss()

optimizer = optim.SGD(model.parameters(), lr=1e-3)  # optimizer under test:
                                                    # compare optim.SGD vs optim._multi_tensor.SGD
running_loss = 0.0
target = torch.empty(128, dtype=torch.long, device=device).random_(5)

optimizer.zero_grad()
inputs = torch.rand(128, 3, 100, 100, device=device , requires_grad=True)
outputs = model(inputs)
loss = criterion(outputs, target)
loss.backward()
optimizer.step()
running_loss += loss.item()

def main():
    timer = benchmark_utils.Timer(
        stmt="optimizer.step()",
        globals=globals(),
        label=str(optimizer),
    )

    for i in range(1):
        print(f"Run: {i}\n{'-' * 40}")
        print(f"timeit:\n{timer.timeit(1000)}\n")
        print(f"autorange:\n{timer.blocked_autorange()}\n\n")

if __name__ == "__main__":
    main()
```

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D23931987

Pulled By: izdeby

fbshipit-source-id: 582134ef2d402909d27d89a45c5b588fb7130ea1
2020-09-26 12:17:43 -07:00
Vasiliy Kuznetsov
eee7dad376 Add torch.do_assert, which is symbolically traceable (#45188)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45188

This is a symbolically traceable alternative to Python's `assert`.
It should be useful to allow people who want to use FX to also
be able to assert things.

A number of TODO(before land) notes are inline. Thoughts are welcome
on where this code should live and what this
function should be called (since `assert` is reserved).
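The core requirement is that the assert be an ordinary function call, which a tracer can record as a node, rather than the `assert` statement, which compiles to control flow the tracer cannot capture. A minimal stand-alone sketch of the behavior, independent of FX (the name `do_assert` follows the commit title; the eventual name may differ):

```python
def do_assert(condition, message=""):
    """Function-call form of `assert`: same runtime behavior, but a symbolic
    tracer sees a plain call it can record instead of an if-statement."""
    if not condition:
        raise AssertionError(message)

do_assert(1 + 1 == 2, "math still works")
try:
    do_assert(False, "boom")
except AssertionError as e:
    print("caught:", e)
```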

Test Plan:
```
python test/test_fx.py TestFX.test_symbolic_trace_assert
```

Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D23861567

fbshipit-source-id: d9d6b9556140faccc0290eba1fabea401d7850de
2020-09-25 13:46:28 -07:00
Taylor Robie
a5a4924c27 Warn if import torch is called from the source root. (#39995)
Summary:
This is a small developer quality of life improvement. I commonly try to run some snippet of python as I'm working on a PR and forget that I've cd-d into the local clone to run some git commands, resulting in annoying failures like:
`ImportError: cannot import name 'default_generator' from 'torch._C' (unknown location)`

This actually took a non-trivial amount of time to figure out the first time I hit it, and even now it's annoying because it happens just infrequently enough to not sit high in the mental cache.

This PR adds a check to `torch/__init__.py` and warns if `import torch` is likely resolving to the wrong thing:

```
WARNING:root:You appear to be importing PyTorch from a clone of the git repo:
  /data/users/taylorrobie/repos/pytorch
  This will prevent `import torch` from resolving to the PyTorch install
  (instead it will try to load /data/users/taylorrobie/repos/pytorch/torch/__init__.py)
  and will generally lead to other failures such as a failure to load C extensions.
```

so that the soon to follow internal import failure makes some sense. I elected to make this a warning rather than an exception because I'm not 100% sure that it's **always** wrong. (e.g. weird `PYTHONPATH` or `importlib` corner cases.)

EDIT: There are now separate cases for `cwd` vs. `PYTHONPATH`, and failure is an `ImportError`.
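The check can be sketched as follows; the function name and message wording here are illustrative, not the actual code in `torch/__init__.py`:

```python
import os
import warnings

def warn_if_shadowed_by_cwd():
    # If the current working directory contains torch/__init__.py, a bare
    # `import torch` resolves to the source tree instead of the install.
    candidate = os.path.join(os.getcwd(), "torch", "__init__.py")
    if os.path.exists(candidate):
        warnings.warn(
            "You appear to be importing PyTorch from a clone of the git repo; "
            "`import torch` will load {} instead of the installed package.".format(candidate))
```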

Pull Request resolved: https://github.com/pytorch/pytorch/pull/39995

Reviewed By: malfet

Differential Revision: D23817209

Pulled By: robieta

fbshipit-source-id: d9ac567acb22d9c8c567a8565a7af65ac624dbf7
2020-09-23 10:55:08 -07:00
Nikita Shulga
0c01f136f3 [BE] Use f-string in various Python functions (#44161)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44161

Reviewed By: seemethere

Differential Revision: D23515874

Pulled By: malfet

fbshipit-source-id: 868cf65aedd58fce943c08f8e079e84e0a36df1f
2020-09-04 07:38:25 -07:00
Andrew Jones
24ca6aab02 Improves type-checking guards. (#43339)
Summary:
PR https://github.com/pytorch/pytorch/issues/38157 fixed type checking for mypy by including `if False` guards on some type-checker-only imports. However other typecheckers - [like pyright](https://github.com/microsoft/pylance-release/issues/262#issuecomment-677758245) - will respect this logic and ignore the imports. Using [`if TYPE_CHECKING`](https://docs.python.org/3/library/typing.html#typing.TYPE_CHECKING) instead means both mypy and pyright will work correctly.

[For background, an example of where the current code fails](https://github.com/microsoft/pylance-release/issues/262) is if you make a file `tmp.py` with the contents
```python
import torch
torch.ones((1,))
```
Then [`pyright tmp.py --lib`](https://github.com/microsoft/pyright#command-line) will fail with a `"ones" is not a known member of module` error. This is because it can't find the `_VariableFunctions.pyi` stub file, as pyright respects the `if False` logic. After adding the `TYPE_CHECKING` guard, all works correctly.

Credit to erictraut for suggesting the fix.
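The fix amounts to replacing the `if False:` guard with the documented constant, which type checkers treat as `True` while the interpreter sees `False`:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Visible to mypy and pyright, but never executed by the interpreter.
    from collections import OrderedDict

# At runtime the constant is False, so the guarded import above is skipped.
print(TYPE_CHECKING)
```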

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43339

Reviewed By: agolynski

Differential Revision: D23348142

Pulled By: ezyang

fbshipit-source-id: c8a58122a7b0016845c311da39a1cc48748ba03f
2020-09-03 07:45:53 -07:00
Kurt Mohler
d7ee84c9b5 Update determinism documentation (#41692)
Summary:
Add user-facing documentation for set_deterministic
Also update grammar and readability in Reproducibility page

Issue https://github.com/pytorch/pytorch/issues/15359

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41692

Reviewed By: ailzhang

Differential Revision: D23433061

Pulled By: mruberry

fbshipit-source-id: 4c4552950803c2aaf80f7bb4792d2095706d07cf
2020-08-31 21:06:24 -07:00
Mike Ruberry
9c8021c0b1 Adds torch.linalg namespace (#42664)
Summary:
This PR adds the `torch.linalg` namespace as part of our continued effort to be more compatible with NumPy. The namespace is tested by adding a single function, `torch.linalg.outer`, and testing it in a new test suite, test_linalg.py. It follows the same pattern as https://github.com/pytorch/pytorch/pull/41911, which added the `torch.fft` namespace.

Future PRs will likely:

- add more functions to torch.linalg
- expand the testing done in test_linalg.py, including legacy functions, like torch.ger
- deprecate existing linalg functions outside of `torch.linalg` in preference to the new namespace

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42664

Reviewed By: ngimel

Differential Revision: D22991019

Pulled By: mruberry

fbshipit-source-id: 39258d9b116a916817b3588f160b141f956e5d0b
2020-08-07 10:18:30 -07:00
Mike Ruberry
ccfce9d4a9 Adds fft namespace (#41911)
Summary:
This PR creates a new namespace, torch.fft (torch::fft), and puts a single function, fft, in it. This function is a simplified version of NumPy's [numpy.fft.fft](https://numpy.org/doc/1.18/reference/generated/numpy.fft.fft.html?highlight=fft#numpy.fft.fft) that accepts no optional arguments. It is intended to demonstrate how to add and document functions in the namespace, and is not intended to deprecate the existing torch.fft function.

Adding this namespace was complicated by the existence of the torch.fft function in Python. Creating a torch.fft Python module makes this name ambiguous: does it refer to a function or module? If the JIT didn't exist, a solution to this problem would have been to make torch.fft refer to a callable class that mimicked both the function and module. The JIT, however, cannot understand this pattern. As a workaround it's required to explicitly `import torch.fft` to access the torch.fft.fft function in Python:

```
import torch.fft

t = torch.randn(128, dtype=torch.cdouble)
torch.fft.fft(t)
```

See https://github.com/pytorch/pytorch/issues/42175 for future work. Another possible future PR is to get the JIT to understand torch.fft as a callable class so it need not be imported explicitly to be used.
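The "callable class that mimics both the function and the module" workaround mentioned above (ruled out because the JIT cannot model it) can be sketched in plain Python; `demo` and `CallableModule` are made-up names for illustration:

```python
import sys
import types

class CallableModule(types.ModuleType):
    """A module object that is also callable, so `demo(x)` and
    `demo.fft(x)` would both work - the pattern the JIT cannot understand."""
    def __call__(self, n):
        return [n]          # stand-in for the legacy function

mod = CallableModule("demo")
mod.fft = lambda n: [n, n]  # stand-in for the new namespaced function
sys.modules["demo"] = mod

import demo                 # resolves to our callable module object
print(demo(1), demo.fft(1))
```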

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41911

Reviewed By: glaringlee

Differential Revision: D22941894

Pulled By: mruberry

fbshipit-source-id: c8e0b44cbe90d21e998ca3832cf3a533f28dbe8d
2020-08-06 00:20:50 -07:00
mattip
8c653e05ff DOC: fail to build if there are warnings (#41335)
Summary:
Merge after gh-41334 and gh-41321 (EDIT: both are merged).
Closes gh-38011

This is the last in a series of PRs to build documentation without warnings. It adds `-WT --keepgoing` to the sphinx build, which will [fail the build if there are warnings](https://www.sphinx-doc.org/en/master/man/sphinx-build.html#cmdoption-sphinx-build-W), print a [traceback on error](https://www.sphinx-doc.org/en/master/man/sphinx-build.html#cmdoption-sphinx-build-T), and [finish the build](https://www.sphinx-doc.org/en/master/man/sphinx-build.html#cmdoption-sphinx-build-keep-going) even when there are warnings.

It should fail now, but pass once the PRs mentioned at the top are merged.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41335

Reviewed By: pbelevich

Differential Revision: D22794425

Pulled By: mruberry

fbshipit-source-id: eb2903e50759d1d4f66346ee2ceebeecfac7b094
2020-07-28 22:33:44 -07:00
rutujak24
96aaa311c0 Grammar Changes (#42076)
Summary:
Small grammatical updates.
![Screenshot (188)](https://user-images.githubusercontent.com/56619747/88471271-02723480-cf25-11ea-8fd1-ae98d5ebcc86.png)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42076

Reviewed By: mrshenli

Differential Revision: D22756651

Pulled By: ngimel

fbshipit-source-id: e810eb7397a5831d801348c8fff072854658830e
2020-07-26 13:53:41 -07:00
anjali411
e888c3bca1 Update torch.set_default_dtype doc (#41263)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41263

Test Plan: Imported from OSS

Differential Revision: D22482989

Pulled By: anjali411

fbshipit-source-id: 2aadfbb84bbab66f3111970734a37ba74d817ffd
2020-07-14 07:29:49 -07:00
lento
452d5e191b Grammatically updated the tech docs (#41031)
Summary:
Small grammatical update to the torch tech docs

![image](https://user-images.githubusercontent.com/26879385/86633690-e126c400-bfc8-11ea-8892-23cdc037daa9.png)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41031

Differential Revision: D22404342

Pulled By: ngimel

fbshipit-source-id: 1c723119cfb050c4ef53de7971fe6e0acf3e91a9
2020-07-07 11:17:17 -07:00
Nikita Shulga
591fffc524 Type-annotate serialization.py (#40862)
Summary:
Move the Storage class from `__init__.pyi.in` to `types.py` and make it a protocol, since it is not a real class.
Expose the `PyTorchFileReader` and `PyTorchFileWriter` native classes.

Ignore function attributes, as there is not yet a good way to type-annotate them; see https://github.com/python/mypy/issues/2087
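Making Storage a protocol means any object with the right methods satisfies the annotation, with no inheritance required. A minimal sketch of the idea — `StorageLike` and its single `size` method are illustrative names, not the actual protocol in `torch.types`:

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class StorageLike(Protocol):
    """Structural type: anything exposing these methods satisfies it."""
    def size(self) -> int: ...

class FakeStorage:  # note: does NOT inherit from StorageLike
    def size(self) -> int:
        return 16

print(isinstance(FakeStorage(), StorageLike))
```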
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40862

Differential Revision: D22344743

Pulled By: malfet

fbshipit-source-id: 95cdb6f980ee79383960f306223e170c63df3232
2020-07-02 07:10:55 -07:00
Richard Zou
727463a727 Initial vmap frontend API (#40172)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40172

This PR introduces the initial vmap frontend API. It has the following
limitations that we can resolve in the future:
- the inputs must be a flat list of tensors
- the outputs must be a flat list of tensors
- in_dims = 0 (so we always vmap over dim 0 of input tensors)
- out_dims = 0 (so the returned tensors have their vmap dim appear at
dim 0)
- Coverage limited to operations that have batching rules implemented
(torch.mul, torch.sum, torch.expand).

There are some other semantic limitations (like not being able to handle
mutation, aside from pytorch operations that perform mutation) that will
be documented in the future.

I wanted to introduce the API before adding a slow fallback for the
coverage so that we can test future batching rules (and coverage) via
the python API to avoid verbosity in C++-land.

The way vmap works is that `vmap(func)(inputs)` wraps all Tensor inputs
to be batched in BatchedTensors, sends those into func, and then unwraps
the output BatchedTensors. Operations on BatchedTensors perform the batched
operations that the user is asking for. When performing nested vmaps,
each nested vmap adds a batch dimension upon entry and removes a batch
dimension on exit.
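With in_dims = out_dims = 0, the observable semantics can be sketched as mapping over dim 0. Lists stand in for tensors here; the real implementation runs one batched computation via BatchedTensors rather than a Python loop:

```python
def vmap(func):
    """Toy vmap: apply `func` across the dim-0 slices of each input and
    stack the results. Real vmap batches the computation instead."""
    def batched(*inputs):
        assert len({len(x) for x in inputs}) == 1, "inputs must share dim 0"
        return [func(*slices) for slices in zip(*inputs)]
    return batched

print(vmap(lambda a, b: a * b)([1, 2, 3], [4, 5, 6]))
```

Nesting works the same way: `vmap(vmap(f))` applied to nested lists peels one "batch dimension" per level, mirroring the enter/exit behavior described above.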

Coming up in the near future:
- Support for non-zero in_dims and out_dims
- docstring for vmap
- slow fallback for operators that do not have a batching rule
implemented.

Test Plan: - `pytest test/test_vmap.py -v`

Differential Revision: D22102076

Pulled By: zou3519

fbshipit-source-id: b119f0a8a3a3b1717c92dbbd180dfb1618295563
2020-06-24 08:14:24 -07:00
Kurt Mohler
124cdf2290 Add experimental deterministic flag (#38683)
Summary:
Adds a `torch.experimental.deterministic` flag to enforce deterministic algorithms across all of PyTorch.
Adds `torch.experimental.deterministic_error_level` to let users choose between error/warning/silent when a deterministic implementation of an operation is not available.
Adds `torch.experimental.alert_not_deterministic()`, which should be called within operations that are not deterministic.
Offers both Python and ATen interfaces.
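The error-level mechanism can be sketched as follows; the level names mirror the summary, but the module-level variables and exact API surface here are hypothetical:

```python
import warnings

_deterministic = True      # hypothetical stand-in for the global flag
_error_level = "warning"   # one of "error", "warning", "silent"

def alert_not_deterministic(op_name):
    """Called from inside operations that lack a deterministic implementation."""
    if not _deterministic or _error_level == "silent":
        return
    msg = f"{op_name} does not have a deterministic implementation"
    if _error_level == "error":
        raise RuntimeError(msg)
    warnings.warn(msg)
```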

Issue https://github.com/pytorch/pytorch/issues/15359
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38683

Differential Revision: D21998093

Pulled By: ezyang

fbshipit-source-id: 23aabbddd20f6199d846f97764ff24d728163737
2020-06-12 08:44:06 -07:00
peter
b5848833f0 Add runtime check for MSVC redist (#39841)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/39734.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39841

Differential Revision: D21998020

Pulled By: ezyang

fbshipit-source-id: 77df537045e4d7e718ab34e35bb6f847638f4b01
2020-06-11 15:37:21 -07:00
Nikita Shulga
4e30146368 Use ProgramFiles environment variable on Windows (#39707)
Summary:
'Program Files' does not have to be on disk C (nor does it necessarily
have to be called `Program Files`)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39707

Differential Revision: D21954235

Pulled By: malfet

fbshipit-source-id: 91a9b765cd1bc7e6201dd4b800d45257207010d9
2020-06-09 14:55:52 -07:00
peter
3413f0a8ca Fix dll load failure in virtual environments on Windows (#39622)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/39620.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39622

Differential Revision: D21953420

Pulled By: malfet

fbshipit-source-id: ab0e0358327ec321130384e0a654987cd70349c0
2020-06-09 11:28:22 -07:00
Shen Li
bb0377bb24 Expose torch.futures.Future (#39008)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39008

This commit adds a `torch.futures.Future` type and exposes its ctor,
`wait`, `then`, and `set_result` APIs. This type is currently a
wrapper of `c10::ivalue::Future` and mainly used by RPC for now. Later,
we could revamp c10d APIs to return this `Future` type as well. More
utils will be added into `torch.futures` package in followup PRs.
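The `wait`/`then`/`set_result` contract can be sketched on top of `concurrent.futures.Future`. The real class wraps `c10::ivalue::Future`, so this is only a behavioral analogue:

```python
from concurrent.futures import Future as _PyFuture

class Future:
    """Sketch of the API: then() returns a new Future whose value is
    callback(self), computed once this Future completes."""
    def __init__(self):
        self._inner = _PyFuture()

    def set_result(self, value):
        self._inner.set_result(value)

    def wait(self):
        return self._inner.result()   # blocks until set_result is called

    def then(self, callback):
        chained = Future()
        self._inner.add_done_callback(
            lambda _: chained.set_result(callback(self)))
        return chained

f = Future()
g = f.then(lambda fut: fut.wait() + 1)  # callback receives the completed future
f.set_result(41)
print(g.wait())
```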

Test Plan: Imported from OSS

Differential Revision: D21723022

Pulled By: mrshenli

fbshipit-source-id: 92e56160544e9bf00d11db3e8347a1b9707882c9
2020-06-02 10:12:56 -07:00
peter
30146d7391 More fixes about using Windows API through ctypes (#39376)
Summary:
Representation of `NULL` using `c_void_p` is `None` in ctypes.
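This can be confirmed directly: ctypes maps a NULL pointer to `None`, so a Windows API declared with `restype = c_void_p` signals failure by returning `None` rather than `0`:

```python
import ctypes

null_ptr = ctypes.c_void_p(0)  # a NULL pointer...
print(null_ptr.value)          # ...reads back as None, not 0
```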
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39376

Differential Revision: D21833451

Pulled By: malfet

fbshipit-source-id: 70ec0a805a6c473e946ce9a7566440b6e0cd81ba
2020-06-02 09:42:09 -07:00
peter
e6d86036e2 Fix return types of Windows API functions in __init__.py (#39334)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/39327.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39334

Differential Revision: D21820898

Pulled By: malfet

fbshipit-source-id: ea771f8c44a152cee395ada70f8f129d4ad5283d
2020-06-01 17:03:57 -07:00
peter
7eb9f1788c Using LoadLibraryEX [Reland] (#38302)
Summary:
This reverts commit 1ab4f35499.

Without this PR, the OS tries to find the DLL in the following directories.
- The directory from which the application loaded.
- The system directory. Use the GetSystemDirectory function to get the path of this directory.
- The 16-bit system directory. There is no function that obtains the path of this directory, but it is searched.
- The Windows directory. Use the GetWindowsDirectory function to get the path of this directory.
- The current directory.
- The directories that are listed in the PATH environment variable. Note that this does not include the per-application path specified by the App Paths registry key. The App Paths key is not used when computing the DLL search path.

If we use `LoadLibraryEx` with LOAD_LIBRARY_SEARCH_* flags, the directories are searched in the following order.

- The directory that contains the DLL (LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR). This directory is searched only for dependencies of the DLL to be loaded.
- The application directory (LOAD_LIBRARY_SEARCH_APPLICATION_DIR).
- Paths explicitly added to the application search path with the AddDllDirectory function (LOAD_LIBRARY_SEARCH_USER_DIRS) or the SetDllDirectory function. If more than one path has been added, the order in which the paths are searched is unspecified.
- The System32 directory (LOAD_LIBRARY_SEARCH_SYSTEM32).

Advantages:
1. The directory that contains the DLL comes first and it's desirable for us, because the dependencies in `lib` should always be preferred.
2. The system directory is considered in the last place. According to some of the bug reports, the DLL load failure are caused by loading the conflicting ones in systemroot.

Neutral:
1. The directories in `PATH` are not considered. As in the previous point, this may benefit normal users; however, it may cause failures when PyTorch is built from source with new dependencies. (Resolved by falling back to `LoadLibraryW` when the error code is `126`)

Disadvantages:
1. `LoadLibraryEx` with LOAD_LIBRARY_SEARCH_* flags is only available on Win7/2008 R2 + KB2533623 and up. (Resolved by falling back to `LoadLibraryW` when it is not supported)
2. A failure during the call to `LoadLibraryEx` makes the OS pop up a modal dialog, which can block the process if the user is on a CLI-only interface. This can be switched off by calling `SetErrorMode`. (Resolved by calling `SetErrorMode`)
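The load-with-fallback strategy can be sketched via ctypes. The flag values are the documented winbase.h constants and error code 126 is ERROR_MOD_NOT_FOUND; the exact flag combination and the function itself are an illustration, not the actual code in `torch/__init__.py`, and it only runs on Windows:

```python
import ctypes
import sys

# Documented winbase.h constants (values from the Windows SDK)
LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR = 0x00000100
LOAD_LIBRARY_SEARCH_DEFAULT_DIRS = 0x00001000
SEM_FAILCRITICALERRORS = 0x0001
ERROR_MOD_NOT_FOUND = 126

def load_dll(path):
    """Sketch: prefer the restricted LoadLibraryEx search order, fall back
    to the legacy LoadLibraryW search if a dependency is not found."""
    if sys.platform != "win32":
        raise OSError("this sketch only runs on Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.SetErrorMode(SEM_FAILCRITICALERRORS)      # no modal error dialog
    kernel32.LoadLibraryExW.restype = ctypes.c_void_p  # pointers, not truncated ints
    kernel32.LoadLibraryW.restype = ctypes.c_void_p
    handle = kernel32.LoadLibraryExW(
        path, None,
        LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR | LOAD_LIBRARY_SEARCH_DEFAULT_DIRS)
    if not handle and ctypes.get_last_error() == ERROR_MOD_NOT_FOUND:
        handle = kernel32.LoadLibraryW(path)           # legacy search order
    return handle
```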
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38302

Test Plan:
Test some common cases (in a new repo maybe) including
1. Python 3.6/3.7/3.8, conda python, conda install
2. Python 3.6/3.7/3.8, conda python, pip install
3. Python 3.6/3.7/3.8, official python, pip install
Plus some corner cases like
1. Conflicting DLLs in systemroot or `PATH`
2. Remove some local dependencies and use global ones

References:
1. https://docs.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-seterrormode
2. https://docs.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibraryexa
3. https://docs.microsoft.com/en-us/windows/win32/dlls/dynamic-link-library-search-order#standard-search-order-for-desktop-applications

Differential Revision: D21524090

Pulled By: malfet

fbshipit-source-id: 0cf5e260c91759b0af8c7aa0950a488e3b653ef5
2020-05-12 09:31:43 -07:00
Natalia Gimelshein
1ab4f35499 Revert D21496081: [pytorch][PR] Using LoadLibraryEx and LOAD_LIBRARY_SEARCH_* flag for loading DLLs o…
Test Plan: revert-hammer

Differential Revision:
D21496081

Original commit changeset: aa5e528e5134

fbshipit-source-id: c0636b06dd65c7419018062f79aabc397fb2c5b8
2020-05-11 16:38:37 -07:00
peter
00f3790a9d Using LoadLibraryEx and LOAD_LIBRARY_SEARCH_* flag for loading DLLs o… (#37763)
Summary:
…n Windows

Without this PR, the OS tries to find the DLL in the following directories.
- The directory from which the application loaded.
- The system directory. Use the GetSystemDirectory function to get the path of this directory.
- The 16-bit system directory. There is no function that obtains the path of this directory, but it is searched.
- The Windows directory. Use the GetWindowsDirectory function to get the path of this directory.
- The current directory.
- The directories that are listed in the PATH environment variable. Note that this does not include the per-application path specified by the App Paths registry key. The App Paths key is not used when computing the DLL search path.

If we use `LoadLibraryEx` with LOAD_LIBRARY_SEARCH_* flags, the directories are searched in the following order.

- The directory that contains the DLL (LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR). This directory is searched only for dependencies of the DLL to be loaded.
- The application directory (LOAD_LIBRARY_SEARCH_APPLICATION_DIR).
- Paths explicitly added to the application search path with the AddDllDirectory function (LOAD_LIBRARY_SEARCH_USER_DIRS) or the SetDllDirectory function. If more than one path has been added, the order in which the paths are searched is unspecified.
- The System32 directory (LOAD_LIBRARY_SEARCH_SYSTEM32).

Advantages:
1. The directory that contains the DLL comes first and it's desirable for us, because the dependencies in `lib` should always be preferred.
2. The system directory is considered in the last place. According to some of the bug reports, the DLL load failure are caused by loading the conflicting ones in systemroot.

Neutral:
1. The directories in `PATH` are not considered. As in the previous point, this may benefit normal users; however, it may cause failures when PyTorch is built from source with new dependencies. (Resolved by falling back to `LoadLibraryW` when the error code is `126`)

Disadvantages:
1. `LoadLibraryEx` with LOAD_LIBRARY_SEARCH_* flags is only available on Win7/2008 R2 + KB2533623 and up. (Resolved by falling back to `LoadLibraryW` when it is not supported)
2. A failure during the call to `LoadLibraryEx` makes the OS pop up a modal dialog, which can block the process if the user is on a CLI-only interface. This can be switched off by calling `SetErrorMode`. (Resolved by calling `SetErrorMode`)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37763

Test Plan:
Test some common cases (in a new repo maybe) including
1. Python 3.6/3.7/3.8, conda python, conda install
2. Python 3.6/3.7/3.8, conda python, pip install
3. Python 3.6/3.7/3.8, official python, pip install
Plus some corner cases like
1. Conflicting DLLs in systemroot or `PATH`
2. Remove some local dependencies and use global ones

References:
1. https://docs.microsoft.com/en-us/windows/win32/api/errhandlingapi/nf-errhandlingapi-seterrormode
2. https://docs.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibraryexa
3. https://docs.microsoft.com/en-us/windows/win32/dlls/dynamic-link-library-search-order#standard-search-order-for-desktop-applications

What do you think, malfet, ezyang?

Differential Revision: D21496081

Pulled By: malfet

fbshipit-source-id: aa5e528e5134326b00ac98982f4db4b4bbb47a44
2020-05-11 14:02:03 -07:00
Edward Yang
6edf340338 Delete torch/__init__.pyi, deferring to direct extension stubs (#38157)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38157

This removes the error prone process of assembling `torch/__init__.pyi`
(and frequently forgetting to expose things), since now we can simply
rely on the true source file to get things done.  Most of the old
codegen in gen_pyi.py is now rerouted to various files:

- `torch/_C/__init__.pyi` (the dumping pile of all misc bindings)
- `torch/_C/_nn.pyi` (NN function bindings)
- `torch/_C/_VariableFunctions.pyi` (torch function bindings)

`torch.types` grew a bunch more definitions that previously where
defined in `torch/__init__.pyi`

Some miscellaneous changes

- Fixed a bug where we treat single TensorList argument as implying
  varargs are accepted. This is actually only supported on IntList.
  This means we can correctly generate a stub for dequantize.
- Add missing manual stub for nonzero
- Switched torch/onnx/operators.py to directly refer to _C module,
  since apparently mypy doesn't think that methods prefixed with
  underscores get reexported.  This may be a recurring theme; maybe
  we need to find a better way to solve it.

Because I was really lazy, I dumped namedtuple definitions in both
`torch._C` and `torch._C._VariableFunctions`.  This is definitely wrong.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D21497400

Pulled By: ezyang

fbshipit-source-id: 07b126141c82efaca37be27c07255cb2b9b3f064
2020-05-11 07:20:13 -07:00
Ralf Gommers
726aa713d5 Replace torch.is_tensor usages with isinstance checks. (#38062)
Summary:
`is_tensor` doesn't really have a reason to exist anymore (other than
backwards compatibility) and is worse for typechecking with mypy (see
gh-32824). Given that it may not be obvious what the fix is once mypy
gives an error, make the change in a number of places at once, and add
a note on this to the `is_tensor` docstring.

Recommending an isinstance check instead has been done for quite a
while, e.g. https://github.com/pytorch/pytorch/pull/7769#discussion_r190458971
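The typechecking point can be illustrated without torch; `Tensor` below is a stand-in class. mypy narrows the type of `x` inside an `isinstance` branch, but a plain boolean helper like `is_tensor` gives it nothing to narrow on:

```python
class Tensor:                   # stand-in for torch.Tensor
    def numel(self) -> int:
        return 0

def is_tensor(obj) -> bool:     # mirrors torch.is_tensor: correct, but opaque to mypy
    return isinstance(obj, Tensor)

x: object = Tensor()
if isinstance(x, Tensor):       # mypy narrows x from object to Tensor here
    print(x.numel())            # fine
if is_tensor(x):
    pass                        # x is still `object` to mypy; x.numel() would be flagged
```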
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38062

Differential Revision: D21470963

Pulled By: ezyang

fbshipit-source-id: 98dd60d32ca0650abd2de21910b541d32b0eea41
2020-05-08 10:10:11 -07:00
Edward Yang
4fef3763dd Revert "Revert D21337640: [pytorch][PR] Split up documentation into subpages and clean up some warnings" (#37778)
Summary:
Original PR: https://github.com/pytorch/pytorch/pull/37419

cc mattip suo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37778

Differential Revision: D21385774

Pulled By: ezyang

fbshipit-source-id: 5de532faab8bae132736b6b5189e0ee2ac9935be
2020-05-04 14:32:35 -07:00
Michael Suo
20f7e62b1d Revert D21337640: [pytorch][PR] Split up documentation into subpages and clean up some warnings
Test Plan: revert-hammer

Differential Revision:
D21337640

Original commit changeset: d4ad198780c3

fbshipit-source-id: fa9ba6ac542173a50bdb45bfa12f3fec0ed704fb
2020-05-04 10:57:55 -07:00
mattip
f10fbcc820 Split up documentation into subpages and clean up some warnings (#37419)
Summary:
xref gh-32838, gh-34032

This is a major refactor of parts of the documentation to split it up using sphinx's `autosummary` feature, which builds out `autofunction` and `autoclass` stub files and links to them. The end result is that the top module pages like torch.nn.rst and torch.rst are now more like tables of contents for the actual single-class or single-function documentation pages.

Along the way, I modified many of the docstrings to eliminate sphinx warnings when building. I think the only thing I changed from a non-documentation perspective is to add names to `__all__` when adding them to `globals()` in `torch.__init__.py`

I do not know the CI system: are the documentation build artifacts available after the build, so reviewers can preview before merging?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37419

Differential Revision: D21337640

Pulled By: ezyang

fbshipit-source-id: d4ad198780c3ae7a96a9f22651e00ff2d31a0c0f
2020-05-04 09:39:22 -07:00
anjali411
1f09f7ea44 Python API for Complex Storage and storage copy logic (#35771)
Summary:
Following up on this: https://github.com/pytorch/pytorch/pull/35851 cross-dtype storage copy is not used internally, so I have not included cross-dtype copy for complex.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35771

Differential Revision: D21319650

Pulled By: anjali411

fbshipit-source-id: 07c72996ee598eba0cf401ad61534494d6f5b5b3
2020-05-01 11:47:22 -07:00
James Reed
fd4a09ea73 [WIP] Bind in CellParams for RNN (#35787)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35787

Test Plan: Imported from OSS

Differential Revision: D20784118

Pulled By: jamesr66a

fbshipit-source-id: 5d8f7e1502f707bff9a9aefa90e3edfb3429549b
2020-04-28 21:47:06 -07:00
Alexander Fix
ca665c682c Separate RTLD_GLOBAL from _load_global_deps() (#36682)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36682

For fb internal builds we need to separate whether to use global deps library from loading with RTLD_GLOBAL.

Test Plan: CI -- this should be a no-op for existing builds

Reviewed By: ezyang

Differential Revision: D21051427

fbshipit-source-id: 83bb703d6ceb0265a4c58166749312a44172e78c
2020-04-22 19:08:44 -07:00
Orion Reblitz-Richardson
38849e119f [pytorch] Add error when PyTorch used with Python 2 (#36151)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36151

Python 2 has reached end-of-life and is no longer supported by PyTorch. To avoid confusing behavior when trying to use PyTorch with Python 2, detect this case early and fail with a clear message. This commit covers `import torch` only, not C++, for now.
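The early check described above amounts to something like the following (a minimal sketch, not the actual code in `torch/__init__.py`):

```python
import sys

# Fail fast with a clear message at import time, instead of a
# confusing SyntaxError or ImportError deeper in the package.
if sys.version_info < (3,):
    raise Exception(
        "Python 2 has reached end-of-life and is no longer supported by PyTorch."
    )
```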

Test Plan: waitforsandcastle

Reviewed By: dreiss

Differential Revision: D20894381

fbshipit-source-id: a1073b7a648e07cf10cda5a99a2cf4eee5a89230
2020-04-08 10:40:27 -07:00
Lisa Roach
2b068d10b0 Removing references to PYTHON3COMPATIMPORTS. (#35384)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35384

Removing references to PYTHON3COMPATIMPORTS, mostly suppressions but
removed one instance of usage in a bash script.

Fixed errors arc lint uncovered.

Test Plan:
arc lint
Sandcastle tests

Reviewed By: zertosh

Differential Revision: D20635401

fbshipit-source-id: 74c6b5edb85a78a44f96b96f72ee75a9c2d029f1
2020-04-01 10:34:04 -07:00
peter
4a4e385e13 Revert "Load torch_global_deps for Windows (#35177)" (#35355)
Summary:
This reverts commit d7a7bcb042.

The previous commit is not useful because torch_global_deps doesn't include any external dependencies.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35355

Differential Revision: D20653036

Pulled By: ezyang

fbshipit-source-id: 6d2e2f90952ca865b27b649a6ff9114ada8ea78c
2020-03-26 07:33:48 -07:00
peterjc123
de3044b210 Load all DLLs in the lib directory for Windows (#35362)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/35358.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35362

Differential Revision: D20645218

Pulled By: ezyang

fbshipit-source-id: 08ef5889fe2cd9139a3f6852ee73fe7742b315b5
2020-03-25 13:22:45 -07:00
peter
d7a7bcb042 Load torch_global_deps for Windows (#35177)
Summary:
Fixes https://discuss.pytorch.org/t/torch-cat-runtimeerror-error-in-loadlibrarya/71188/8.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35177

Differential Revision: D20604654

Pulled By: ezyang

fbshipit-source-id: 263eb401300812fd336ff820c53b543342dca95e
2020-03-24 08:20:45 -07:00
Pearu Peterson
8bae1ed144 PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem - copy (#34721)
Summary:
This is a copy of PR https://github.com/pytorch/pytorch/issues/29488 to help the merging process.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34721

Differential Revision: D20444270

Pulled By: vincentqb

fbshipit-source-id: 042c56c8c0dae37834f52b4aee2deae7dd6fa659
2020-03-16 14:13:30 -07:00
Edward Yang
4b929e5466 Revert D20193196: [pytorch][PR] PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem
Test Plan: revert-hammer

Differential Revision:
D20193196

Original commit changeset: 78a487991242

fbshipit-source-id: 8da4f8cb17c45af41e8c0ce80bc72581eb10dbb8
2020-03-11 09:24:34 -07:00
Pearu Peterson
2ec779d46c PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem (#29488)
Summary:
This PR implements the following linear algebra algorithms for low-rank matrices:
- [x] Approximate `A` as `Q Q^H A` - using Algorithm 4.4 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061).
  + exposed as `torch.lowrank.get_approximate_basis(A, q, niter=2, M=None) -> Q`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] SVD - using Algorithm 5.1 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061).
  + uses `torch.lowrank.get_approximate_basis`
  + exposed as `torch.svd_lowrank(A, q=6, niter=2, M=None) -> (U, S, V)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] PCA - using `torch.svd_lowrank`
  + uses `torch.svd_lowrank`
  + exposed as `torch.pca_lowrank(A, center=True, q=None, niter=2) -> (U, S, V)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices, uses non-centered sparse matrix algorithm
  + [x] documentation
- [x] generalized eigenvalue solver using the original LOBPCG algorithm [Knyazev, 2001](https://epubs.siam.org/doi/abs/10.1137/S1064827500366124)
  + exposed as `torch.lobpcg(A, B=None, k=1, method="basic", ...)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] generalized eigenvalue solver using robust LOBPCG with orthogonal basis selection [Stathopoulos, 2002](https://epubs.siam.org/doi/10.1137/S1064827500370883)
  + exposed as `torch.lobpcg(A, B=None, k=1, method="ortho", ...)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] generalized eigenvalue solver using the robust and efficient LOBPCG Algorithm 8 from [Duersch et al, 2018](https://epubs.siam.org/doi/abs/10.1137/17M1129830) that switches to orthogonal basis selection automatically
  + the "ortho" method improves iterations so rapidly that, in the current test cases, it does not make sense to use the basic iterations at all. If users have matrices for which basic iterations could improve convergence, the `tracker` argument allows breaking the iteration process at the user's choice, so that they can switch to orthogonal basis selection if needed. In conclusion, there is no need to implement Algorithm 8 at this point.
- [x] benchmarks
  + [x] `torch.svd` vs `torch.svd_lowrank`, see notebook [Low-rank SVD](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/Low-rank%20SVD.ipynb). In conclusion, the low-rank SVD is going to be useful only for large sparse matrices where the full-rank SVD will fail due to memory limitations.
  + [x] `torch.lobpcg` vs `scipy.sparse.linalg.lobpcg`, see notebook [LOBPCG - pytorch vs scipy](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/LOBPCG%20-%20pytorch%20vs%20scipy.ipynb). In conclusion, both implementations give the same results (up to numerical errors from the different methods); the scipy lobpcg implementation is generally faster.
  + [x] On very small tolerance cases, `torch.lobpcg` is more robust than `scipy.sparse.linalg.lobpcg` (see `test_lobpcg_scipy` results)
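The randomized range finder and low-rank SVD at the core of this PR can be sketched in NumPy (assuming NumPy is available; the names mirror the exposed functions, but this is a sketch of Halko et al.'s Algorithms 4.4 and 5.1, not the PyTorch implementation):

```python
import numpy as np

def get_approximate_basis(A, q, niter=2):
    # Randomized range finder (Halko et al., Algorithm 4.4): find an
    # orthonormal Q with A ~ Q Q^T A, refined by `niter` power iterations.
    m, n = A.shape
    Omega = np.random.randn(n, q)        # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)
    for _ in range(niter):
        Q, _ = np.linalg.qr(A.T @ Q)     # power iteration, re-orthonormalized
        Q, _ = np.linalg.qr(A @ Q)
    return Q

def svd_lowrank(A, q=6, niter=2):
    # Low-rank SVD (Algorithm 5.1): project onto the approximate range,
    # take an exact SVD of the small q x n matrix, and lift back.
    Q = get_approximate_basis(A, q, niter)
    B = Q.T @ A
    Uhat, S, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ Uhat, S, Vt.T
```

For an exactly rank-`r` matrix with `q > r`, the rank-`q` factors reconstruct the input to machine precision; the payoff described in the benchmarks comes from large sparse matrices where a full SVD is infeasible.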

Resolves https://github.com/pytorch/pytorch/issues/8049.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29488

Differential Revision: D20193196

Pulled By: vincentqb

fbshipit-source-id: 78a4879912424595e6ea95a95e483a37487a907e
2020-03-11 07:33:49 -07:00
peter
c18cb1eb52 Improve dll loading logic on Windows (#33856)
Summary:
The way it works on the Anaconda distribution of Python 3.8 is a bit different. Loading DLLs explicitly (e.g. `ctypes.CDLL`) relies on paths appended by `os.add_dll_directory`, but loading DLLs implicitly (e.g. `from torch._C import *`) relies on `PATH`.

Fixes https://github.com/pytorch/vision/issues/1916.
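The two resolution mechanisms described above can be sketched as follows (hypothetical `lib_dir`; guarded so it is a no-op off Windows):

```python
import os
import sys

def prepare_dll_search_path(lib_dir):
    # Sketch of the idea in this commit: on Python 3.8+, explicit loads
    # (ctypes.CDLL) resolve dependencies via os.add_dll_directory, while
    # implicit extension-module imports (`from torch._C import *`) still
    # resolve them via PATH -- so both must point at the same lib directory.
    if sys.platform != "win32":
        return  # no-op elsewhere
    os.environ["PATH"] = lib_dir + os.pathsep + os.environ.get("PATH", "")
    if hasattr(os, "add_dll_directory"):  # Python 3.8+
        os.add_dll_directory(lib_dir)
```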
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33856

Differential Revision: D20150080

Pulled By: soumith

fbshipit-source-id: cdbe76c138ea259ef7414c6634d4f7e0b1871af3
2020-02-27 21:58:35 -08:00
peter
b77c25dec0 Fix dll load logic for Python 3.8 on Windows (#32215)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/31181 and https://github.com/pytorch/pytorch/pull/31162#discussion_r362495611.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32215

Differential Revision: D19501869

Pulled By: ezyang

fbshipit-source-id: 363824e52d2592ad968ecf1df345aa4c0daff915
2020-01-22 08:33:34 -08:00
Tongzhou Wang
8d472bab6b Make torch.backends.mkldnn usable without import
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32055

Differential Revision: D19373220

Pulled By: ezyang

fbshipit-source-id: 50ab3ff70fc893c81123419c4d3cf2e3e48a0a93
2020-01-14 08:19:19 -08:00
Edward Yang
ddff4efa26 Don't use RTLD_GLOBAL to load _C. (#31162)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31162

This should help us resolve a multitude of weird segfaults and crashes
when PyTorch is imported along with other packages. Those would often
happen because libtorch symbols were exposed globally and could be used
as a source of relocations in shared libraries loaded after libtorch.

Fixes #3059.

Some of the subtleties in preparing this patch:

* Getting ASAN to play ball was a pain in the ass. The basic problem is that when we load with `RTLD_LOCAL`, we now may load a library multiple times into the address space; this happens when we have custom C++ extensions. Since the libraries are usually identical, this is usually benign, but it is technically undefined behavior and UBSAN hates it. I tried a few ways of getting things to "work" correctly: I preload libstdc++ (so that it is seen consistently across all library loads) and turned off vptr checks entirely. Another possibility is to have a mode where we use RTLD_GLOBAL to load _C, which would be acceptable in environments where you're sure C++ lines up correctly. There's a long comment in the test script going into more detail about this.
* Making some of our shared library dependencies load with `RTLD_LOCAL` breaks them. OpenMPI and MKL don't work; they play linker shenanigans to look up their symbols, which doesn't work when loaded locally, and if we load a library with `RTLD_LOCAL` we aren't able to subsequently see it with `ctypes`. To solve this problem, we employ a clever device invented by apaszke: we create a dummy library `torch_global_deps` with dependencies on all of the libraries which need to be loaded globally, and then load that with `RTLD_GLOBAL`. As long as none of these libraries have C++ symbols, we can avoid confusion about the C++ standard library.
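The torch_global_deps device described above can be sketched like this (the library path is hypothetical):

```python
import ctypes

def load_global_deps(lib_path):
    # Libraries that misbehave under RTLD_LOCAL (e.g. MKL, OpenMPI) are
    # linked into one dummy shared library, which alone is loaded with
    # RTLD_GLOBAL; torch._C itself is then imported with the default
    # RTLD_LOCAL, so libtorch symbols stay out of the global namespace.
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)

# load_global_deps("libtorch_global_deps.so")  # hypothetical; then: import torch._C
```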

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D19262579

Test Plan: Imported from OSS

Pulled By: ezyang

fbshipit-source-id: 06a48a5d2c9036aacd535f7e8a4de0e8fe1639f2
2020-01-09 07:28:15 -08:00
Peter Bell
dcd1216efe Force early initialization of OpenMP in forked children (#29006)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/28389

Intel's OpenMP implementation sets the thread affinity on the first call to an OpenMP function after a fork. By adding an atfork handler we can force this to happen before a user tries to set the affinity in their own DataLoader `worker_init_fn`.
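The atfork idea can be sketched in Python terms (the actual fix installs a C++ `pthread_atfork` handler; `_force_openmp_init` here is a hypothetical stand-in):

```python
import os

def _force_openmp_init():
    # The real handler calls into the OpenMP runtime here (e.g. an
    # omp_* entry point), so Intel's implementation pins thread
    # affinity immediately after fork rather than on first later use.
    pass

if hasattr(os, "register_at_fork"):  # POSIX, Python 3.7+
    # Run in the child right after fork, before user code such as a
    # DataLoader worker_init_fn gets a chance to set affinity itself.
    os.register_at_fork(after_in_child=_force_openmp_init)
```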
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29006

Differential Revision: D18782456

Pulled By: ezyang

fbshipit-source-id: ce0b515256da0cf18ceb125e0cdec99a3311bbd3
2019-12-03 15:23:31 -08:00