This PR:
- renames `torch.set_deterministic` to `torch._set_deterministic`
- renames `torch.is_deterministic` to `torch._is_deterministic`
- modifies the docstrings for both to indicate that the feature is not
yet complete.
We would like to do this because this feature is experimental and the
docstrings before this PR are misleading.
This PR does not have an accompanying change in master. That is because
there is still ongoing discussion about what the eventual state of the feature
should be: https://github.com/pytorch/pytorch/issues/15359. I expect
that there will be a better plan for this once 1.7 rolls around.
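For reference, a minimal usage sketch with the renamed private functions (names taken from the summary above; the flag's behavior is still experimental):
```
import torch

# Renamed private API from this PR; the underlying feature is still incomplete.
torch._set_deterministic(True)
print(torch._is_deterministic())  # True
torch._set_deterministic(False)
```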
Test Plan:
- wait for CI
* [quant] aten::repeat work for quantized tensor (#40644)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40644
Test Plan: Imported from OSS
Differential Revision: D22268558
fbshipit-source-id: 3bc9a129bece1b547c519772ecc6b980780fb904
* [quant][graphmode][fix] remove unsupported ops in the list (#40653)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40653
(Note: this ignores all push blocking failures!)
Test Plan: Imported from OSS
Differential Revision: D22271413
fbshipit-source-id: a01611b5d90849ac673fa5a310f910c858e907a3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38490
A meta tensor is a lot like a normal tensor, except it doesn't
actually have any data associated with it. You can use meta tensors
to carry out shape/dtype computations without actually running any
kernels; for example, this could be used to do shape inference in a
JIT analysis pass.
Check out the description in DispatchKey.h for more information.
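As a rough illustration (assuming factory functions accept `device='meta'`, and keeping in mind that only `torch.add` is wired up in this PR), shape/dtype propagation without real data looks something like:
```
import torch

# Meta tensors carry shape/dtype/device metadata but no storage.
a = torch.empty(2, 3, device="meta")
b = torch.empty(2, 3, device="meta")

# Running an op produces another meta tensor with the inferred metadata,
# without executing any real kernel.
out = torch.add(a, b)
print(out.shape, out.dtype)  # torch.Size([2, 3]) torch.float32
```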
Meta tensors are part of a larger project to rationalize how we
write kernels so that we don't have to duplicate shape logic
in the CPU, CUDA and meta kernels (this PR makes the
duplication problem worse!). However, that infrastructure can
be built on top of this proof of concept, which just shows how
you can start writing meta kernels today even without this
infrastructure.
There are a lot of things that don't work:
- I special-cased printing for dense tensors only; if you try to
allocate a meta sparse / quantized tensor, things aren't going
to work.
- The printing formula implies that torch.tensor() can take an
ellipsis, but I didn't add this.
- I wrote an example formula for binary operators, but it isn't
even right! (It doesn't do type promotion or memory layout
correctly.) The most future-proof way to do it right is to
factor the relevant computation out of TensorIterator,
as it is quite involved.
- Nothing besides torch.add works right now.
- Meta functions are ALWAYS included in mobile builds (selective
build doesn't work on them). This isn't a big deal for now
but will become more pressing as more meta functions are added.
One reason I'm putting up this PR now is to check with Yinghai Lu
whether we can unblock shape inference for accelerators, while we are
still working on a long-term plan for how to unify all shape
computation across our kernels.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D21935609
Pulled By: ezyang
fbshipit-source-id: f7d8636eeb8516b6bc296db99a16e56029972eee
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39203
Adds logic and test coverage for optional weights and biases in
the quantized normalization operators. This was broken before this
PR because the `TORCH_LIBRARY` registration had these as required
parameters; this PR removes that requirement and cleans up the callsites.
Note: the registrations are consolidated in `native_functions.yaml`, as opposed to `library.cpp`,
after a discussion with ezyang.
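A hedged sketch of what the optional parameters enable, with the `quantized::layer_norm` argument order assumed from this PR's description:
```
import torch

# Quantize a float tensor, then call the quantized op with weight=None and
# bias=None, which the old TORCH_LIBRARY registration rejected.
x = torch.randn(2, 4, 8, 8)
qx = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)
# args: input, normalized_shape, weight, bias, eps, output_scale, output_zero_point
qy = torch.ops.quantized.layer_norm(qx, [4, 8, 8], None, None, 1e-5, 0.1, 0)
```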
Test Plan:
```
python test/test_quantization.py TestQuantizedOps.test_qlayer_norm
python test/test_quantization.py TestQuantizedOps.test_group_norm
python test/test_quantization.py TestQuantizedOps.test_instance_norm
python test/test_quantization.py TestStaticQuantizedModule.test_layer_norm
python test/test_quantization.py TestStaticQuantizedModule.test_group_norm
python test/test_quantization.py TestStaticQuantizedModule.test_instance_norm
python test/test_quantization.py TestQuantizeScriptPTSQOps.test_layer_norm
python test/test_quantization.py TestQuantizeScriptPTSQOps.test_group_norm
python test/test_quantization.py TestQuantizeScriptPTSQOps.test_instance_norm
```
Imported from OSS
Differential Revision: D21885259
fbshipit-source-id: 978c7b8bd6c11a03e9e5fdb68f154cb80cc43599
Summary:
Adds a `torch.experimental.deterministic` flag to enforce deterministic algorithms across all of PyTorch.
Adds `torch.experimental.deterministic_error_level` to allow users to choose between error/warning/silent behavior when a deterministic implementation of an operation is not available.
Adds `torch.experimental.alert_not_deterministic()`, which should be called within operations that are not deterministic.
Offers both Python and ATen interfaces.
Issue https://github.com/pytorch/pytorch/issues/15359
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38683
Differential Revision: D21998093
Pulled By: ezyang
fbshipit-source-id: 23aabbddd20f6199d846f97764ff24d728163737
Summary:
This PR aims to add `arccosh`, `arcsinh` and `arctanh` support. Please see issue https://github.com/pytorch/pytorch/issues/38349 for more details.
**TODOs:**
* [x] Add test cases for `arccosh`, `arcsinh` and `arctanh`. (need help)
* [x] Overload ops if `std::op` does not work with `thrust::complex` types (like for `sinh`, `cosh`).
Note: `std::acosh, std::asinh, std::atanh` do not support `thrust::complex` types. Added support for complex types for these 3 ops (`arccosh, arcsinh, arctanh`)
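Assuming the ops landed as `torch.arccosh`, `torch.arcsinh` and `torch.arctanh` (aliases of `acosh`/`asinh`/`atanh`), a quick check looks like:
```
import torch

x = torch.tensor([1.5, 2.0, 3.0])
print(torch.arccosh(x))                         # same values as torch.acosh(x)
print(torch.arcsinh(x))                         # same values as torch.asinh(x)
print(torch.arctanh(torch.tensor([0.1, 0.5])))  # same values as torch.atanh(...)

# Complex inputs, per the note above about the added complex support.
z = torch.tensor([2.0 + 1.0j])
print(torch.arccosh(z))
```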
cc: mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38388
Differential Revision: D21882055
Pulled By: mruberry
fbshipit-source-id: d334590b47c5a89e491a002c3e41e6ffa89000e3
Summary:
Fixes https://github.com/pytorch/pytorch/issues/37259, fixes https://github.com/pytorch/pytorch/issues/20156
This lazily calls `at::init_num_threads` once for each thread by adding a call to `lazy_init_num_threads` in `at::parallel_for` and `at::parallel_reduce`.
If this solution is okay, then we should add the same to guard other places that might use MKL or OpenMP.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37461
Reviewed By: ezyang
Differential Revision: D21472763
Pulled By: ilia-cher
fbshipit-source-id: 889d6664f5bd4080037ade02ee324b1233992915
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36847
Adds a quantized instancenorm operator, which can reuse most of
groupnorm's logic.
Benchmarking shows that the quantized version is about 10x faster than
the floating-point version for equivalent input sizes
(https://gist.github.com/vkuzo/2f230e84d26f26cc6030afdbfbc8e7f0).
Test Plan:
```
python test/quantization/test_quantized.py TestQuantizedOps.test_instance_norm
```
Imported from OSS
Differential Revision: D21107925
fbshipit-source-id: 6bacda402f0eb9857bc8f9a5cf8ef306150613d4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36835
Adds a quantized groupnorm operator. We reuse most of the layernorm
kernel, modifying it to be able to perform channel-wise scaling.
Benchmark results: the quantized layer is between 6x and 15x faster
than its floating-point counterpart, depending on input shapes
(full results:
https://gist.github.com/vkuzo/db67623232415382dabff6c8923124e9)
Test Plan:
```
python test/quantization/test_quantized.py TestQuantizedOps.test_group_norm
python test/quantization/test_quantized.py TestQuantizedOps.test_qlayer_norm
```
Numerics are nearly equivalent, with the only difference documented
in the test case. The difference is the same type as with quantized
layernorm. Making numerics equivalent is possible but will sacrifice
speed.
Imported from OSS
Differential Revision: D21107926
fbshipit-source-id: 80e87e9e2c71310bc28c3d114c88de428819cb45
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36815
PyTorch does not have a native channel shuffle op.
This diff adds one for both FP and quantized tensors.
The FP implementation is an inefficient one. For quantized tensors there is a native
QNNPACK op for this.
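A small sketch, assuming the op is exposed as `torch.channel_shuffle` for both float and quantized tensors:
```
import torch

# 1 image, 4 channels, 2x2 spatial; groups=2 interleaves the two channel groups.
x = torch.arange(16, dtype=torch.float32).reshape(1, 4, 2, 2)
y = torch.channel_shuffle(x, 2)   # channel order becomes [0, 2, 1, 3]

# Quantized path (dispatches to the QNNPACK kernel where available).
qx = torch.quantize_per_tensor(x, scale=1.0, zero_point=0, dtype=torch.quint8)
qy = torch.channel_shuffle(qx, 2)
```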
ghstack-source-id: 103267234
Test Plan:
buck run caffe2/test:quantization --
quantization.test_quantized.TestQuantizedOps.test_channel_shuffle
The x86 implementation in QNNPACK is SSE2, so this may not be the most efficient
for x86.
Reviewed By: dreiss
Differential Revision: D21093841
fbshipit-source-id: 5282945f352df43fdffaa8544fe34dba99a5b97e
Summary:
Adds support for generating Vandermonde matrices, based on the NumPy implementation found [here](https://github.com/numpy/numpy/blob/v1.17.0/numpy/lib/twodim_base.py#L475-L563).
Adds tests to ensure the generated matrix matches the expected NumPy output. Note: tests are limited to torch.long and torch.double due to differences in how PyTorch and NumPy deal with type promotion.
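Assuming the op is exposed as `torch.vander` with NumPy-compatible semantics:
```
import torch

x = torch.tensor([1, 2, 3, 5])
print(torch.vander(x))
# tensor([[  1,   1,   1,   1],
#         [  8,   4,   2,   1],
#         [ 27,   9,   3,   1],
#         [125,  25,   5,   1]])
print(torch.vander(x, N=3, increasing=True))
# tensor([[ 1,  1,  1],
#         [ 1,  2,  4],
#         [ 1,  3,  9],
#         [ 1,  5, 25]])
```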
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36725
Differential Revision: D21075138
Pulled By: jessebrizzi
fbshipit-source-id: 6bb1559e8247945714469b0e2b07c6f4d5fd1fd0
Summary:
Notes:
1. didn't name them as _copy_real and _copy_imag because it's desirable (but not necessary) to have these methods as tensor methods.
2. replaced old .real() and .imag() instances with _copy_real() and _copy_imag() methods
3. didn't add documentation because we plan to remove these methods when we add real and imag as tensor attributes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35879
Differential Revision: D20841760
Pulled By: anjali411
fbshipit-source-id: 7267e6fbaab9a5ce426e9396f12238994666b0dd
Summary:
Since the last one was apparently reverted.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35530
Differential Revision: D20777341
Pulled By: ezyang
fbshipit-source-id: 6aaaf2a0755359074ae3d0efe32018d78dafe976
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34747
Adds the hardswish FP operator from MobileNetV3 to PyTorch. This is for
common operator coverage, since this is widely used. A future PR will
add the quantized version. CUDA is saved for a future PR as well.
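A minimal check of the op against the MobileNetV3 definition, assuming it is exposed as `torch.nn.functional.hardswish`:
```
import torch
import torch.nn.functional as F

x = torch.linspace(-4, 4, steps=9)
y = F.hardswish(x)                               # x * relu6(x + 3) / 6
ref = x * torch.clamp(x + 3, min=0, max=6) / 6   # reference formula
assert torch.allclose(y, ref)
```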
Test Plan:
tests pass:
```
python test/test_torch.py TestTorchDeviceTypeCPU.test_hardswish_cpu_float32
```
microbenchmark:
https://gist.github.com/vkuzo/b10d3b238f24e58c585314e8b5385aca
(batch_size == 1: 11.5GiB/s, batch_size == 4: 11.9GiB/s)
Imported from OSS
Differential Revision: D20451404
fbshipit-source-id: c7e13c9ab1a83e27a1ba18182947c82c896efae2
Summary:
Initial integration of eager autocasting, supporting out-of-place ops only for easier review.
Relevant issue/RFC: https://github.com/pytorch/pytorch/issues/25081
In-place ops and ops with user-supplied `out=...` can certainly be supported as well (my initial WIP https://github.com/pytorch/pytorch/pull/29552 handled many) but require substantially more complex special casing in the autocasting backend and tests. Support for these ops (much of which has already been written) will be broken into later PRs.
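For orientation, a usage sketch of eager autocasting as it later shipped (`torch.cuda.amp.autocast`); only out-of-place ops inside the context are covered by this PR:
```
import torch

model = torch.nn.Linear(8, 8).cuda()
x = torch.randn(4, 8, device="cuda")

# Out-of-place ops inside the context run in float16 where that is safe,
# and stay in float32 where it is not.
with torch.cuda.amp.autocast():
    out = model(x)
    loss = out.float().mean()
loss.backward()
```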
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32140
Differential Revision: D20346700
Pulled By: ezyang
fbshipit-source-id: 12d77b3917310186fbddf11c59b2794dc859131f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34545
This is for common operator coverage, since this is widely used. A future PR
will add the quantized version.
Some initial questions for reviewers, since it's my first FP operator
diff:
* do we need a backwards.out method for this?
* do we need CUDA? If yes, should it be in this PR, or is it ok to split it out?
Test Plan:
```
// test
python test/test_torch.py TestTorchDeviceTypeCPU.test_hardsigmoid_cpu_float32
// benchmark
python -m pt.hardsigmoid_test
...
Forward Execution Time (us) : 40.315
Forward Execution Time (us) : 42.603
```
Imported from OSS
Differential Revision: D20371692
fbshipit-source-id: 95668400da9577fd1002ce3f76b9777c6f96c327
Summary:
This is a redo of https://github.com/pytorch/pytorch/pull/33791, which was reverted because it introduced a flaky test. The test was flaky, and only flaky on Python 3.5, because of dict order randomization.
I've fixed the issue with tests clobbering each other in b539fec, and in e0d7402 I removed the override tests for `torch.nn.functional.tanh` and `torch.nn.functional.sigmoid`, which are deprecated and shouldn't be overridable. I also verified that no more test clobbering is happening.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34240
Differential Revision: D20252442
Pulled By: cpuhrsch
fbshipit-source-id: 069568e342a41c90e1dc76cbf85ba4aed47f24be
Summary:
Fixes https://github.com/pytorch/pytorch/issues/33182
This adds private API functions that developers of types that implement `__torch_function__` can use to ensure full coverage of the subset of the PyTorch API that can be overridden.
I've refactored some of the code in the tests into a new `torch._overrides.get_overridable_functions` function. I've also changed `TENSOR_LIKE_TORCH_OVERRIDES` into `torch._overrides.get_testing_overrides` and `IGNORED_TORCH_FUNCTIONS` into `torch._overrides.get_ignored_functions`. Making these two static global variables in the tests into functions should allow rewriting their implementation to construct their return values instead of just statically defining the return value as is done here. Currently that is blocked on not being able to inspect function signatures of compiled kernels in PyTorch (see https://github.com/pytorch/pytorch/issues/28233). See the docs I've added for usage examples of these new functions. I also refactored the existing override tests to make use of these new functions, which should be a good forcing function to make sure they're kept up-to-date.
Finally, while working on this I discovered that `TestTorchFunctionOverrides.test_mean` and `TestTorchFunctionOverrides.test_mm` weren't ever being run because they were getting clobbered by the other dynamically generated override tests. I fixed that by renaming the tests and then fixing the actual test code. I've verified that all the subclassing semantics is correct and that the updated test answers are correct. I'm happy to put the fixes to the existing tests in as a separate pull request if that would be easier to review.
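A sketch of how an implementer of `__torch_function__` might use the new helpers (imported here from `torch.overrides`, where they were later exposed publicly; at the time of this PR they live in `torch._overrides`):
```
from torch.overrides import (
    get_ignored_functions,
    get_overridable_functions,
    get_testing_overrides,
)

overridable = get_overridable_functions()  # namespace -> list of overridable functions
dummies = get_testing_overrides()          # function -> dummy override with a matching signature
ignored = get_ignored_functions()          # functions that cannot be overridden

print(sum(len(fns) for fns in overridable.values()), len(dummies), len(ignored))
```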
ping cpuhrsch since the feature request originally came from them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33791
Differential Revision: D20195053
Pulled By: cpuhrsch
fbshipit-source-id: 1585f4e405f5223932b410eae03a288dc8eb627e