Commit Graph

54 Commits

Author SHA1 Message Date
Richard Zou
a2922f589d
[1.6.0] Mark torch.set_deterministic and torch.is_deterministic as experimental (#41870)
This PR:
- renames `torch.set_deterministic` to `torch._set_deterministic`
- renames `torch.is_deterministic` to `torch._is_deterministic`
- modifies the docstrings for both to indicate that the feature is not
yet complete.

We would like to do this because this feature is experimental and the
docstrings before this PR are misleading.

This PR does not have an accompanying change in master. That is because
there still is discussion over what the eventual state of the feature
should be: https://github.com/pytorch/pytorch/issues/15359. I expect
that there will be a better plan for this once 1.7 rolls around.
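
For illustration, a minimal sketch of the renamed experimental API as described above (1.6.0 only, and subject to change):

```python
import torch

# Experimental, private API on the 1.6.0 branch (renamed from torch.set_deterministic);
# subject to change once the feature stabilizes.
torch._set_deterministic(True)    # request deterministic algorithms where available
print(torch._is_deterministic())  # True

torch._set_deterministic(False)   # restore the default behavior
```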

Test Plan:
- wait for CI
2020-07-22 18:32:47 -07:00
Jerry Zhang
d0045e5520
Some fixes for graph mode quantization (#40935)
* [quant] aten::repeat work for quantized tensor (#40644)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40644

Test Plan: Imported from OSS

Differential Revision: D22268558

fbshipit-source-id: 3bc9a129bece1b547c519772ecc6b980780fb904

* [quant][graphmode][fix] remove unsupported ops in the list (#40653)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40653

(Note: this ignores all push blocking failures!)

Test Plan: Imported from OSS

Differential Revision: D22271413

fbshipit-source-id: a01611b5d90849ac673fa5a310f910c858e907a3
2020-07-07 13:26:27 -07:00
Yanli Zhao
13a8ec3cc5 Revert D22102406: DNNL: enable max_pool3d and avg_pool3d
Test Plan: revert-hammer

Differential Revision:
D22102406

Original commit changeset: 296a87188b79

fbshipit-source-id: ff023be5e8dd4bfcd68770cab305da6ba2e03893
2020-06-22 15:23:01 -07:00
anjali411
8ec2ae9a9f Add view_as_real, view_as_complex for complex tensors (#39099)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39099

Test Plan: Imported from OSS

Differential Revision: D22057886

Pulled By: anjali411

fbshipit-source-id: bad5ba7097ba0dd13f2c549b2463094dee9afa14
2020-06-22 15:15:27 -07:00
Zhang, Xiaobing
c873895722 DNNL: enable max_pool3d and avg_pool3d (#35664)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35664

Test Plan: Imported from OSS

Differential Revision: D22102406

Pulled By: VitalyFedyunin

fbshipit-source-id: 296a87188b79545741f6b7e136a58e4380564f25
2020-06-22 11:57:12 -07:00
Edward Yang
e4766fb4d9 Meta tensors, but without code deduplication (#38490)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38490

A meta tensor is a tensor that is a lot like a normal tensor,
except it doesn't actually have any data associated with it.
You can use them to carry out shape/dtype computations without
actually having to run the actual code; for example, this could
be used to do shape inference in a JIT analysis pass.
Check out the description in DispatchKey.h for more information.

Meta tensors are part of a larger project to rationalize how we
write kernels so that we don't have to duplicate shape logic
in CPU kernel, CUDA kernel and meta kernel (this PR makes the
duplication problem worse!)  However, that infrastructure can
be built on top of this proof of concept, which just shows how
you can start writing meta kernels today even without this
infrastructure.

There are a lot of things that don't work:
- I special cased printing for dense tensors only; if you try to
  allocate a meta sparse / quantized tensor things aren't going
  to work.
- The printing formula implies that torch.tensor() can take an
  ellipsis, but I didn't add this.
- I wrote an example formula for binary operators, but it isn't
  even right!  (It doesn't do type promotion or memory layout
  correctly).  The most future proof way to do it right is to
  factor out the relevant computation out of TensorIterator,
  as it is quite involved.
- Nothing besides torch.add works right now
- Meta functions are ALWAYS included in mobile builds (selective
  build doesn't work on them).  This isn't a big deal for now
  but will become more pressing as more meta functions are added.

One reason I'm putting up this PR now is to check with Yinghai Lu
if we can unblock shape inference for accelerators, while we are
still working on a long term plan for how to unify all shape
computation across our kernels.
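
As a rough illustration (not part of this PR), a sketch of shape inference with meta tensors, assuming the meta device is exposed to Python as `device='meta'` and keeping in mind that only `torch.add` is expected to work at this point:

```python
import torch

# Meta tensors carry shape/dtype/device metadata but no storage, so nothing is allocated.
a = torch.empty(128, 64, device='meta')
b = torch.empty(128, 64, device='meta')

c = torch.add(a, b)  # runs only the shape/dtype computation, no kernel executes
print(c.shape, c.dtype, c.device)  # torch.Size([128, 64]) torch.float32 meta
```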

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D21935609

Pulled By: ezyang

fbshipit-source-id: f7d8636eeb8516b6bc296db99a16e56029972eee
2020-06-22 09:18:33 -07:00
Vasiliy Kuznetsov
4ad8ebe738 quant layer/group/instance norm: make weights and biases optional (#39203)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39203

Adds logic and test coverage for optional weights and biases for
the quantized normalization operators. This was broken before this
PR because the `TORCH_LIBRARY` registration had these as required
parameters; this PR removes that requirement and cleans up the callsites.

Note: the registrations are consolidated in `native_functions.yaml` rather than `library.cpp`,
following a discussion with ezyang.

Test Plan:
```
python test/test_quantization.py TestQuantizedOps.test_qlayer_norm
python test/test_quantization.py TestQuantizedOps.test_group_norm
python test/test_quantization.py TestQuantizedOps.test_instance_norm
python test/test_quantization.py TestStaticQuantizedModule.test_layer_norm
python test/test_quantization.py TestStaticQuantizedModule.test_group_norm
python test/test_quantization.py TestStaticQuantizedModule.test_instance_norm
python test/test_quantization.py TestQuantizeScriptPTSQOps.test_layer_norm
python test/test_quantization.py TestQuantizeScriptPTSQOps.test_group_norm
python test/test_quantization.py TestQuantizeScriptPTSQOps.test_instance_norm
```

Imported from OSS

Differential Revision: D21885259

fbshipit-source-id: 978c7b8bd6c11a03e9e5fdb68f154cb80cc43599
2020-06-18 10:19:39 -07:00
Kurt Mohler
124cdf2290 Add experimental deterministic flag (#38683)
Summary:
- Adds a `torch.experimental.deterministic` flag to enforce deterministic algorithms across all of PyTorch.
- Adds `torch.experimental.deterministic_error_level` to allow users to choose between error/warning/silent if determinism for an operation is not available.
- Adds `torch.experimental.alert_not_deterministic()`, which should be called within operations that are not deterministic.
- Offers both Python and ATen interfaces.

Issue https://github.com/pytorch/pytorch/issues/15359
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38683

Differential Revision: D21998093

Pulled By: ezyang

fbshipit-source-id: 23aabbddd20f6199d846f97764ff24d728163737
2020-06-12 08:44:06 -07:00
kshitij12345
9733390998 Add torch.flip{lr, ud} (#38599)
Summary:
Reference: https://github.com/pytorch/pytorch/issues/38349

TODO:
* [x] Add Tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38599

Differential Revision: D21941884

Pulled By: mruberry

fbshipit-source-id: 7a442ff11051c2c868cf8e3c04e4bba0f1a1d426
2020-06-09 07:19:37 -07:00
krshrimali
335e4a1e3b Add arccosh, arcsinh and arctanh to unary ops (#38388)
Summary:
This PR aims to add `arccosh`, `arcsinh` and `arctanh` support. Please see issue https://github.com/pytorch/pytorch/issues/38349 for more details.

**TODOs:**

* [x] Add test cases for `arccosh`, `arcsinh` and `arctanh`. (need help)
* [x] Overload ops if `std::op` does not work with `thrust::complex` types (like for `sinh`, `cosh`).

Note: `std::acosh, std::asinh, std::atanh` do not support `thrust::complex` types. Added support for complex types for these 3 ops (`arccosh, arcsinh, arctanh`)
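
A quick sketch of the ops in question (illustrative only; depending on the PyTorch version they are exposed as `acosh`/`asinh`/`atanh`, with the NumPy-style `arccosh`/`arcsinh`/`arctanh` spellings as aliases):

```python
import torch

x = torch.tensor([1.5, 2.0, 3.0])
print(torch.acosh(x))                  # inverse hyperbolic cosine, defined for x >= 1
print(torch.asinh(x))                  # inverse hyperbolic sine
print(torch.atanh(torch.tensor(0.5)))  # inverse hyperbolic tangent, defined for |x| < 1

# Complex inputs are the point of the thrust::complex work mentioned above.
z = torch.tensor([0.5 + 0.5j])
print(torch.asinh(z))
```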

cc: mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38388

Differential Revision: D21882055

Pulled By: mruberry

fbshipit-source-id: d334590b47c5a89e491a002c3e41e6ffa89000e3
2020-06-04 11:40:55 -07:00
Xiaomeng Yang
03eca384fd Optimize GroupNorm on CPU (#28203)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28203

Optimize GroupNorm on CPU
ghstack-source-id: 105149765

Test Plan: buck test mode/dev-nosan caffe2/test:nn -- "GroupNorm"

Reviewed By: houseroad

Differential Revision: D17901506

fbshipit-source-id: 5eb22ad0e8a9ab2533282b967b2818f690e48865
2020-06-03 23:52:16 -07:00
Aayush Naik
0829cadca3 Implement rad2deg, deg2rad (#38852)
Summary:
Resolves https://github.com/pytorch/pytorch/issues/38372.

cc mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38852

Differential Revision: D21868935

Pulled By: mruberry

fbshipit-source-id: ae6ded11b743c9d1cdc032984b4abe0a115290d6
2020-06-03 22:21:54 -07:00
anjali411
3370c045ae Remove copy_imag and copy_real methods (#39065)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39065

Test Plan: Imported from OSS

Differential Revision: D21803939

Pulled By: anjali411

fbshipit-source-id: c7313c527eb6b54d49ef46aa0a839a3418fa8d7e
2020-06-03 18:22:50 -07:00
Cloud Han
05f097b5bb Implement logaddexp (#38384)
Summary:
Resolves https://github.com/pytorch/pytorch/issues/38377
Related: https://github.com/pytorch/pytorch/issues/38349

This op should be disambiguated from `logsumexp`, which performs a reduction over a specific axis of a tensor.
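
A small sketch contrasting the two (not from the PR): `logaddexp` is element-wise over two tensors, while `logsumexp` reduces a single tensor along a dimension:

```python
import torch

a = torch.tensor([-1.0, -2.0, -3.0])
b = torch.tensor([-1.5, -2.5, -3.5])

# Element-wise log(exp(a) + exp(b)), computed in a numerically stable way.
print(torch.logaddexp(a, b))

# Reduction: log(sum(exp(x))) along dim 0 of a single tensor gives the same values here.
print(torch.logsumexp(torch.stack([a, b]), dim=0))
```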
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38384

Differential Revision: D21737336

Pulled By: mruberry

fbshipit-source-id: 7864d04ca304c0fb2937bb083583e3e3d6ef205d
2020-05-27 20:27:31 -07:00
Ivan Kobzarev
996b6a3d00 [vulkan] Fix python overrides tests for is_vulkan_available (#39016)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39016

Differential Revision: D21724619

Pulled By: IvanKobzarev

fbshipit-source-id: d7a6c8b944a55bc4f2cce957eeac08c5801667a0
2020-05-26 11:42:55 -07:00
kshitij12345
3487744821 Add torch.logcumsumexp (#36308)
Summary:
Creating a new PR as I am unable to push to pandeykartikey's branch (I don't have the permissions).

Closes https://github.com/pytorch/pytorch/issues/26411

Based on https://github.com/pytorch/pytorch/issues/32876. Thanks pandeykartikey for starting this out.

The review comments have been addressed.

anjali411 agadetsky albanD
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36308

Differential Revision: D21648573

Pulled By: albanD

fbshipit-source-id: bc1a8fc4ab474a1148298117a1549b0e46f7c3ff
2020-05-21 09:12:31 -07:00
Ralf Gommers
d363cf4639 Fix incorrect __torch_function__ handling in einsum (#38741)
Summary:
Closes gh-38479
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38741

Differential Revision: D21662512

Pulled By: ezyang

fbshipit-source-id: 247e3b50b8f2dd842c03be8d6ebe71910b619bc6
2020-05-21 06:59:25 -07:00
Peter Bell
5137827ad0 Lazily initialise thread local num_threads value (#37461)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/37259, fixes https://github.com/pytorch/pytorch/issues/20156

This lazily calls `at::init_num_threads` once for each thread by adding a call to `lazy_init_num_threads` in `at::parallel_for` and `at::parallel_reduce`.

If this solution is okay, then we should add the same to guard other places that might use MKL or OpenMP.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37461

Reviewed By: ezyang

Differential Revision: D21472763

Pulled By: ilia-cher

fbshipit-source-id: 889d6664f5bd4080037ade02ee324b1233992915
2020-05-11 13:24:45 -07:00
Vasiliy Kuznetsov
4fa049c525 add quantized instancenorm operator (#36847)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36847

Adds a quantized instancenorm operator, which can reuse most of
groupnorm's logic.

Benchmarking shows that the quantized version is about 10x faster than
floating point for equivalent input sizes
(https://gist.github.com/vkuzo/2f230e84d26f26cc6030afdbfbc8e7f0)

Test Plan:
```
python test/quantization/test_quantized.py TestQuantizedOps.test_instance_norm
```

Imported from OSS

Differential Revision: D21107925

fbshipit-source-id: 6bacda402f0eb9857bc8f9a5cf8ef306150613d4
2020-05-06 19:01:33 -07:00
Vasiliy Kuznetsov
b837d5d418 add quantized groupnorm operator (#36835)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36835

Adds a quantized groupnorm operator.  We reuse most of the layernorm
kernel, modifying it to be able to perform channel-wise scaling.

Benchmark results: the quantized layer is between 6x and 15x faster
than the floating-point version, depending on input shapes
(full results:
https://gist.github.com/vkuzo/db67623232415382dabff6c8923124e9)

Test Plan:
```
python test/quantization/test_quantized.py TestQuantizedOps.test_group_norm
python test/quantization/test_quantized.py TestQuantizedOps.test_qlayer_norm
```

Numerics are nearly equivalent, with the only difference documented
in the test case.  The difference is the same type as with quantized
layernorm.  Making numerics equivalent is possible but will sacrifice
speed.

Imported from OSS

Differential Revision: D21107926

fbshipit-source-id: 80e87e9e2c71310bc28c3d114c88de428819cb45
2020-05-06 19:01:26 -07:00
Kimish Patel
df31ddbd98 Add channel shuffle op fp32 + quantized. (#36815)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36815

PyTorch does not have a native channel shuffle op.
This diff adds one for both FP and quantized tensors.
The FP implementation is an inefficient one; for quantized tensors there is a
native QNNPACK op.
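
To illustrate the semantics (a sketch, not the code from this diff), channel shuffle on an NCHW tensor with `g` groups is a reshape/transpose/reshape; the exact Python binding of the new native op is not shown here, only a reference formulation:

```python
import torch

def channel_shuffle_reference(x: torch.Tensor, groups: int) -> torch.Tensor:
    # x is NCHW: split channels into `groups`, swap the group axis with the
    # per-group channel axis, then flatten back to the original channel count.
    n, c, h, w = x.shape
    assert c % groups == 0
    return (x.view(n, groups, c // groups, h, w)
             .transpose(1, 2)
             .reshape(n, c, h, w))

x = torch.arange(1 * 6 * 2 * 2, dtype=torch.float32).view(1, 6, 2, 2)
print(channel_shuffle_reference(x, groups=3)[0, :, 0, 0])  # channels interleaved across groups
```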
ghstack-source-id: 103267234

Test Plan:
buck run caffe2/test:quantization --
quantization.test_quantized.TestQuantizedOps.test_channel_shuffle

Note: the x86 implementation for QNNPACK is SSE2, so this may not be the most
efficient path on x86.

Reviewed By: dreiss

Differential Revision: D21093841

fbshipit-source-id: 5282945f352df43fdffaa8544fe34dba99a5b97e
2020-05-01 10:07:15 -07:00
Jesse Brizzi
bca82801e7 add support for generating Vandermonde matrices (#36725)
Summary:
Adds support for generating Vandermonde matrices based on the NumPy implementation found [here](https://github.com/numpy/numpy/blob/v1.17.0/numpy/lib/twodim_base.py#L475-L563).

Adds a test to ensure the generated matrix matches the expected NumPy output. Note that tests are limited to torch.long and torch.double due to differences in how PyTorch and NumPy handle type promotion.
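
A usage sketch, assuming the new op mirrors `numpy.vander` and is exposed as `torch.vander`:

```python
import torch

x = torch.tensor([1, 2, 3, 5])

# Default: decreasing powers, with N = len(x) columns.
print(torch.vander(x))

# Increasing powers with an explicit number of columns.
print(torch.vander(x, N=3, increasing=True))
```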
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36725

Differential Revision: D21075138

Pulled By: jessebrizzi

fbshipit-source-id: 6bb1559e8247945714469b0e2b07c6f4d5fd1fd0
2020-04-29 13:16:26 -07:00
James Reed
fd4a09ea73 [WIP] Bind in CellParams for RNN (#35787)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35787

Test Plan: Imported from OSS

Differential Revision: D20784118

Pulled By: jamesr66a

fbshipit-source-id: 5d8f7e1502f707bff9a9aefa90e3edfb3429549b
2020-04-28 21:47:06 -07:00
moto
5a27ec09b8 Add Inverse Short Time Fourier Transform in ATen native (#35569)
Summary:
Ported `torchaudio`'s implementation (tests and documentation as well) to ATen.

Notes:
 - Batch packing/unpacking is performed in Python; the ATen implementation expects a 4D input tensor.
 - `hop_length` is initialized in the same way as in the `stft` implementation. [Torchaudio's version tried to mimic the same behavior but is slightly different](7da61a4bee/torchaudio/functional.py (L152-L157)).

Closes https://github.com/pytorch/pytorch/issues/34827
Relates https://github.com/pytorch/pytorch/issues/3775
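
A round-trip sketch (not from the PR), assuming a recent PyTorch where `torch.stft` accepts `return_complex=True`; the exact input layout expected by `torch.istft` has varied across releases:

```python
import torch

x = torch.randn(2, 16000)  # a small batch of waveforms
n_fft, hop = 400, 160
window = torch.hann_window(n_fft)

spec = torch.stft(x, n_fft=n_fft, hop_length=hop, window=window, return_complex=True)
recon = torch.istft(spec, n_fft=n_fft, hop_length=hop, window=window, length=x.shape[-1])

print((x - recon).abs().max())  # reconstruction error should be tiny
```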
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35569

Differential Revision: D21178090

Pulled By: mthrok

fbshipit-source-id: 2701a8b241a36a6fb1b740c2fb2b07cb938185d4
2020-04-24 12:14:55 -07:00
Masaki Kozuki
6fcabf619d [takeover] BTRS algorithm for fast/efficient binomial sampling (#36858)
Summary:
The original PR is https://github.com/pytorch/pytorch/pull/31278.

CC: ezyang jamestwebber fritzo zasdfgbnm

 ---

<!-- # This PR - CPU
In [1]: import torch; import torch.distributions as dist

In [2]: counts = torch.randint(10, 1000, [1000,1000])
   ...: p = 0.5 * torch.ones(1000, 1000)

In [3]: %timeit dist.binomial.Binomial(total_count=counts, probs=p).sample()
94.8 ms ± 911 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
-->
```
# This PR - GPU
In [1]: import torch; import torch.distributions as dist

In [2]: counts = torch.randint(10, 1000, [1000,1000]).cuda(); p = 0.5 * torch.ones(1000, 1000).cuda()

In [3]:  %timeit dist.binomial.Binomial(total_count=counts, probs=p).sample()
737 µs ± 216 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

# master (commit: 806f22b167) - GPU
In [5]: counts = torch.randint(10, 1000, [1000,1000]).cuda(); p = 0.5 * torch.ones(1000, 1000).cuda()

In [6]: %timeit dist.binomial.Binomial(total_count=counts, probs=p).sample()
46.3 ms ± 76.2 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36858

Differential Revision: D21178367

Pulled By: ezyang

fbshipit-source-id: 7e7d6f463e35b07156d69bd7452040b2f9c2eb7a
2020-04-22 15:53:41 -07:00
Jesse Brizzi
28f439d4f4 add absolute alias for abs (#36597)
Summary:
Adds an `absolute` alias for the `abs` function to match NumPy, which provides both:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.absolute.html

Adds a test to ensure the output from `abs` and `absolute` is the same.
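
A tiny sketch of the alias in use:

```python
import torch

x = torch.tensor([-3.0, 0.0, 2.5])
print(torch.equal(torch.abs(x), torch.absolute(x)))  # True: absolute is an alias of abs
```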
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36597

Differential Revision: D21024458

Pulled By: jessebrizzi

fbshipit-source-id: 4f2987e7bc7cde444d0a93e833a0350844b48d44
2020-04-20 14:49:51 -07:00
Vasiliy Kuznetsov
a5d0d762fa redo of add quantized layer norm implementation (#36593)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36593

This is a redo of https://github.com/pytorch/pytorch/pull/35329 with a
better test.

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:
numerics match the floating point implementation

benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

Differential Revision: D21030268

Pulled By: vkuzo

fbshipit-source-id: b3594c3393cfce37a881319e2e0560620d51080f
2020-04-15 19:47:18 -07:00
lixinyu
1e7155caa5 Bucketization (#7284) (#34577)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34577

Test Plan: Imported from OSS

Differential Revision: D20380975

Pulled By: glaringlee

fbshipit-source-id: d75939bc54d98675f88d7037491a8420ac20847a
2020-04-15 10:32:51 -07:00
Kurt Mohler
2bc49a4b85 block_diag dense (#33449)
Summary:
Add block_diag function for dense tensors, based on scipy.linalg.block_diag

Closes https://github.com/pytorch/pytorch/issues/31932
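
A usage sketch, assuming the interface mirrors `scipy.linalg.block_diag`:

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5]])
c = torch.tensor([[6, 7, 8]])

# Stacks the inputs along the diagonal, zero-filling everything else.
print(torch.block_diag(a, b, c))
# tensor([[1, 2, 0, 0, 0, 0],
#         [3, 4, 0, 0, 0, 0],
#         [0, 0, 5, 0, 0, 0],
#         [0, 0, 0, 6, 7, 8]])
```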
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33449

Differential Revision: D20943099

Pulled By: zou3519

fbshipit-source-id: 8b5c9476fb5af959aafa4169612c660396d9b717
2020-04-13 10:04:55 -07:00
Hameer Abbasi
7c825bad10 [RELAND] Add __torch_function__ benchmarks (#36138)
Summary:
Re-land of https://github.com/pytorch/pytorch/issues/35530 and https://github.com/pytorch/pytorch/issues/34645
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36138

Differential Revision: D20893770

Pulled By: ezyang

fbshipit-source-id: 75ab688a086f5fb87412a853df5246c0c39704ca
2020-04-10 09:14:31 -07:00
Edward Yang
88c22070fe Revert D20768930: add quantized layer norm implementation
Test Plan: revert-hammer

Differential Revision:
D20768930

Original commit changeset: ddf8727e9840

fbshipit-source-id: a190e1d1e42281eba627b0dbb6de1b3651cd5e97
2020-04-09 14:36:37 -07:00
Vasiliy Kuznetsov
f813e7184e add quantized layer norm implementation (#35329)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35329

Adds a quantized implementation of LayerNorm for server.

A future PR will add the Python wrapper.

Test Plan:
numerics match the floating point implementation

benchmarks by input size:
v1 (mean+var non-vectorized): https://gist.github.com/vkuzo/f6d72c04742608112f4c2e612c74bd13
v2 (mean+var vectorized in float): https://gist.github.com/vkuzo/4dd95657c5b5f3654e0965db00eff8d2
v3 (mean+var vectorized in int, current): https://gist.github.com/vkuzo/57a75f75629da9f23b64b38ca0e3d34b

Imported from OSS

Differential Revision: D20768930

fbshipit-source-id: ddf8727e9840c65ead3b890220af0638c5637028
2020-04-09 09:11:41 -07:00
anjali411
66d50060eb Temporary methods for real and imag values of complex tensors (#35879)
Summary:
Notes:
1. didn't name them as _copy_real and _copy_imag because it's desirable (but not necessary) to have these methods as tensor methods.
2. replaced old .real() and .imag() instances with _copy_real() and _copy_imag() methods
3. didn't add documentation because we plan to remove these methods when we add real and imag as tensor attributes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35879

Differential Revision: D20841760

Pulled By: anjali411

fbshipit-source-id: 7267e6fbaab9a5ce426e9396f12238994666b0dd
2020-04-05 07:22:02 -07:00
Nik Ved
35cdb78522 Make kl_div accept target in log space (#34586)
Summary:
Fixes [32520](https://github.com/pytorch/pytorch/issues/32520), implements [34536](https://github.com/pytorch/pytorch/issues/34536).

Here are some benchmarks:
```python
import torch
import torch.nn.functional as F
from IPython import get_ipython

ipython = get_ipython()

torch.set_num_threads(1)

for d in [5, 10, 20, 50, 100, 1000]:
    i = torch.rand(d, d)
    t = torch.rand(d, d)
    print(f"Size: {d}x{d}")
    ipython.magic("timeit F.kl_div(i, t, reduction='none', log_target=False)")
    ipython.magic("timeit F.kl_div(i, t.log(), reduction='none', log_target=True)")
```
Output:
```
Size: 5x5
16 µs ± 33 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
8.24 µs ± 17.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Size: 10x10
16.7 µs ± 17.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
8.7 µs ± 20.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Size: 20x20
17.7 µs ± 47.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
9.7 µs ± 28.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Size: 50x50
23.6 µs ± 60.1 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
15 µs ± 33.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Size: 100x100
42.8 µs ± 223 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
34 µs ± 17.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Size: 1000x1000
3.9 ms ± 1.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
3.45 ms ± 364 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34586

Differential Revision: D20652726

Pulled By: ezyang

fbshipit-source-id: 480697b4cd01341bbeee7514a8b812705a0600ea
2020-04-01 12:26:58 -07:00
Michael Suo
6491bf2855 Revert D20777341: [pytorch][PR] Add __torch_function__ benchmarks.
Test Plan: revert-hammer

Differential Revision:
D20777341

Original commit changeset: 6aaaf2a07553

fbshipit-source-id: 1c324f91f85ac624bf878297c96c682a46958954
2020-04-01 10:23:00 -07:00
Hameer Abbasi
8c534bb0bd Add __torch_function__ benchmarks. (#35530)
Summary:
Since the last one was apparently reverted.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35530

Differential Revision: D20777341

Pulled By: ezyang

fbshipit-source-id: 6aaaf2a0755359074ae3d0efe32018d78dafe976
2020-04-01 06:30:17 -07:00
Alban Desmaison
4d39aeec27 Revert D20653072: [pytorch][PR] Add __torch_function__ benchmarks.
Test Plan: revert-hammer

Differential Revision:
D20653072

Original commit changeset: e7e363f8a1b8

fbshipit-source-id: e75e4979399d6fee10e00a673ea45b9bcc0fd447
2020-03-26 13:36:59 -07:00
Hameer Abbasi
bf24753570 Add __torch_function__ benchmarks. (#34645)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34645

Differential Revision: D20653072

Pulled By: ezyang

fbshipit-source-id: e7e363f8a1b84fc0c354586e266a695e4a2ea60e
2020-03-26 11:29:10 -07:00
Vasiliy Kuznetsov
f3e9fa6122 add hardswish FP operator (#34747)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34747

Adds the hardswish FP operator from MobileNetV3 to PyTorch. This is for
common operator coverage, since this is widely used.  A future PR will
add the quantized version.  CUDA is saved for a future PR as well.
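
A quick sketch of the op against its element-wise definition, hardswish(x) = x * relu6(x + 3) / 6 from MobileNetV3:

```python
import torch
import torch.nn.functional as F

x = torch.linspace(-5, 5, steps=11)

out = F.hardswish(x)
reference = x * F.relu6(x + 3.0) / 6.0  # element-wise MobileNetV3 definition

print(torch.allclose(out, reference))  # True
```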

Test Plan:
tests pass:
```
python test/test_torch.py TestTorchDeviceTypeCPU.test_hardswish_cpu_float32
```

microbenchmark:
https://gist.github.com/vkuzo/b10d3b238f24e58c585314e8b5385aca
(batch_size == 1: 11.5GiB/s, batch_size == 4: 11.9GiB/s)

Imported from OSS

Differential Revision: D20451404

fbshipit-source-id: c7e13c9ab1a83e27a1ba18182947c82c896efae2
2020-03-24 15:15:34 -07:00
Michael Carilli
0f0271e255 [RELAND2] Eager autocasting, out-of-place ops only (with MSVC 2017 fix) (#35102)
Summary:
This is the second reland attempt for https://github.com/pytorch/pytorch/pull/32140.

The first reland attempt https://github.com/pytorch/pytorch/pull/35011 failed due a [small incompatible change](https://github.com/pytorch/pytorch/pull/35011#issuecomment-601754216) in recent master (`skipIfRocm` was removed from `test_data_parallel.py`).

The present PR restores skipIfRocm.

Description from first reland attempt https://github.com/pytorch/pytorch/pull/35011:

> https://github.com/pytorch/pytorch/pull/32140 was approved and merged, but [reverted](d0577e19f0) because it broke builds with versions of Visual Studio older than 15.8 that were not represented in public CI.  The build failures were caused by a [known VS bug](https://developercommunity.visualstudio.com/content/problem/27729/allow-function-with-internal-linkage-as-template-n.html), fixed in versions 15.8 and newer.
>
> The present PR reverts the revert (restoring https://github.com/pytorch/pytorch/pull/32140 's diffs) and adds a workaround to enable compilation with VS < 15.8.  The workaround isn't pretty, but it's guarded by macros such that it's only used when compiling with VS < 15.8.  All other builds compile with the same code/control flow as was merged in https://github.com/pytorch/pytorch/pull/32140.
>
> Original description of https://github.com/pytorch/pytorch/pull/32140:
> > Initial integration of eager autocasting, supporting out-of-place ops only for easier review.
> Relevant issue/RFC: https://github.com/pytorch/pytorch/issues/25081
>
> > In-place ops and ops with user-supplied out=... can certainly be supported as well (my initial WIP https://github.com/pytorch/pytorch/issues/29552 handled many) but require substantially more complex special casing in the autocasting backend and tests. Support for these ops (much of which has already been written) will be broken into later PRs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35102

Differential Revision: D20596918

Pulled By: ezyang

fbshipit-source-id: 60caa279bb0ce4a9bb0b28c1d585d42cf1cc7e50
2020-03-24 09:08:04 -07:00
Mike Ruberry
fe276d541e Revert D20541921: [pytorch][PR] [RELAND] Eager autocasting, out-of-place ops only (with MSVC 2017 fix)
Test Plan: revert-hammer

Differential Revision:
D20541921

Original commit changeset: abb5488dca86

fbshipit-source-id: d2c6038978f80e5429632f8b49107090a8a247f4
2020-03-19 22:39:12 -07:00
Michael Carilli
991b97277a [RELAND] Eager autocasting, out-of-place ops only (with MSVC 2017 fix) (#35011)
Summary:
https://github.com/pytorch/pytorch/pull/32140 was approved and merged, but [reverted](d0577e19f0) because it broke builds with versions of Visual Studio older than 15.8 that were not represented in public CI.  The build failures were caused by a [known VS bug](https://developercommunity.visualstudio.com/content/problem/27729/allow-function-with-internal-linkage-as-template-n.html), fixed in versions 15.8 and newer.

The present PR reverts the revert (restoring https://github.com/pytorch/pytorch/pull/32140 's diffs) and adds a workaround to enable compilation with VS < 15.8.  The workaround isn't pretty, but it's guarded by macros such that it's only used when compiling with VS < 15.8.  All other builds compile with the same code/control flow as was merged in https://github.com/pytorch/pytorch/pull/32140.

Original description of https://github.com/pytorch/pytorch/pull/32140:
> Initial integration of eager autocasting, supporting out-of-place ops only for easier review.
Relevant issue/RFC: https://github.com/pytorch/pytorch/issues/25081

> In-place ops and ops with user-supplied out=... can certainly be supported as well (my initial WIP https://github.com/pytorch/pytorch/issues/29552 handled many) but require substantially more complex special casing in the autocasting backend and tests. Support for these ops (much of which has already been written) will be broken into later PRs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35011

Differential Revision: D20541921

Pulled By: ezyang

fbshipit-source-id: abb5488dca8620b0daac4306ebf2bb47fc36e4f5
2020-03-19 20:18:18 -07:00
Edward Yang
d0577e19f0 Revert D20346700: [pytorch][PR] Eager autocasting, out-of-place ops only
Test Plan: revert-hammer

Differential Revision:
D20346700

Original commit changeset: 12d77b391731

fbshipit-source-id: 108d72bf24232f443c0be293ec932c0c478d6a60
2020-03-18 11:42:51 -07:00
Michael Carilli
aaa8f02156 Eager autocasting, out-of-place ops only (#32140)
Summary:
Initial integration of eager autocasting, supporting out-of-place ops only for easier review.
Relevant issue/RFC: https://github.com/pytorch/pytorch/issues/25081

In-place ops and ops with user-supplied `out=...` can certainly be supported as well (my initial WIP https://github.com/pytorch/pytorch/pull/29552 handled many) but require substantially more complex special casing in the autocasting backend and tests.  Support for these ops (much of which has already been written) will be broken into later PRs.
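
For context (not part of this diff), the eager autocasting backend added here is what the Python side exposes as the `torch.cuda.amp.autocast` context manager; a minimal sketch, assuming a CUDA-enabled build:

```python
import torch

model = torch.nn.Linear(64, 64).cuda()
data = torch.randn(8, 64, device='cuda')

# Out-of-place ops inside the region run in float16 where autocast deems it safe.
with torch.cuda.amp.autocast():
    out = model(data)

print(out.dtype)  # torch.float16
```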
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32140

Differential Revision: D20346700

Pulled By: ezyang

fbshipit-source-id: 12d77b3917310186fbddf11c59b2794dc859131f
2020-03-18 10:28:21 -07:00
Hameer Abbasi
6b701de130 Add types argument to __torch_function__ (#34303)
Summary:
This PR adds the `types` argument to `__torch_function__` as per RFC 0001: https://github.com/pytorch/rfcs/pull/3
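
A minimal sketch of the protocol with the new `types` argument (the class and its behavior here are illustrative, not from the PR):

```python
import torch

class LoggingTensor:
    """A tensor-like object that intercepts torch.* calls via __torch_function__."""

    def __init__(self, data):
        self.data = torch.as_tensor(data)

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        print(f"called {func.__name__} with types {types}")
        # Unwrap our type and defer to the regular implementation.
        unwrapped = [a.data if isinstance(a, LoggingTensor) else a for a in args]
        return func(*unwrapped, **kwargs)

print(torch.add(LoggingTensor([1.0, 2.0]), torch.tensor([3.0, 4.0])))
```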
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34303

Differential Revision: D20474992

Pulled By: ezyang

fbshipit-source-id: cdd40b3b38f3bda4ece8812a629f5db87e919d01
2020-03-17 13:32:00 -07:00
Vasiliy Kuznetsov
1bac5fd0d3 add hardsigmoid FP operator to PyTorch (#34545)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34545

This is for common operator coverage, since this is widely used.  A future PR
will add the quantized version.

Some initial questions for reviewers, since it's my first FP operator
diff:
* do we need a backwards.out method for this?
* do we need CUDA? If yes, should it be in this PR or is it OK to split it out?

Test Plan:
```
// test
python test/test_torch.py TestTorchDeviceTypeCPU.test_hardsigmoid_cpu_float32

// benchmark
python -m pt.hardsigmoid_test
...
Forward Execution Time (us) : 40.315

Forward Execution Time (us) : 42.603
```

Imported from OSS

Differential Revision: D20371692

fbshipit-source-id: 95668400da9577fd1002ce3f76b9777c6f96c327
2020-03-16 15:24:12 -07:00
Pearu Peterson
8bae1ed144 PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem - copy (#34721)
Summary:
This is a copy of PR https://github.com/pytorch/pytorch/issues/29488 to help the merging process.
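
The copied PR adds the low-rank/iterative routines; a brief usage sketch, assuming they land as `torch.svd_lowrank`, `torch.pca_lowrank`, and `torch.lobpcg` (argument choices are illustrative):

```python
import torch

A = torch.randn(100, 40)

# Low-rank SVD: approximate the top-q singular triplets.
U, S, V = torch.svd_lowrank(A, q=6)
print(U.shape, S.shape, V.shape)  # (100, 6) (6,) (40, 6)

# PCA on (internally centered) data via the same low-rank machinery.
U_p, S_p, V_p = torch.pca_lowrank(A, q=6)

# LOBPCG: a few extreme eigenpairs of a symmetric positive-definite matrix.
B = A.t() @ A + 40 * torch.eye(40)
eigvals, eigvecs = torch.lobpcg(B, k=3)
print(eigvals.shape, eigvecs.shape)  # (3,) (40, 3)
```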
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34721

Differential Revision: D20444270

Pulled By: vincentqb

fbshipit-source-id: 042c56c8c0dae37834f52b4aee2deae7dd6fa659
2020-03-16 14:13:30 -07:00
Nathan Goldbaum
3f1ba3c465 Redo of "Add API for listing functions overridable by __torch_function__" (#34240)
Summary:
This is a redo of https://github.com/pytorch/pytorch/pull/33791, which was reverted because it introduced a flaky test. The test was flaky (and only on Python 3.5) because of dict order randomization.

I've fixed the issue with tests clobbering each other in b539fec, and in e0d7402 removed the override tests for `torch.nn.functional.tanh` and `torch.nn.functional.sigmoid`, which are deprecated and shouldn't be overridable. I also verified that no more test clobbering is happening.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34240

Differential Revision: D20252442

Pulled By: cpuhrsch

fbshipit-source-id: 069568e342a41c90e1dc76cbf85ba4aed47f24be
2020-03-12 10:33:17 -07:00
Shen Li
ac6e75a165 Revert D20195053: [pytorch][PR] Add API for listing functions overridable by __torch_function__
Test Plan: revert-hammer

Differential Revision:
D20195053

Original commit changeset: 1585f4e405f5

fbshipit-source-id: 3c1aab9c60e3138d40d200ae4238bda0cddf8896
2020-03-04 10:13:54 -08:00
Nathan Goldbaum
ad2825a2c9 Add API for listing functions overridable by __torch_function__ (#33791)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/33182

This adds private API functions that developers of types that implement `__torch_function__` can use to ensure full coverage of the subset of the PyTorch API that can be overrided.

I've refactored some of the code in the tests into a new `torch._overrides.get_overridable_functions` function. I've also changed `TENSOR_LIKE_TORCH_OVERRIDES` into `torch._overrides.get_testing_overrides` and `IGNORED_TORCH_FUNCTIONS` into `torch._overrides.get_ignored_functions`. Making these two static global variables in the tests into functions should allow rewriting their implementation to construct their return values instead of just statically defining the return value as is done here. Currently that is blocked on not being able to inspect function signatures of compiled kernels in PyTorch (see https://github.com/pytorch/pytorch/issues/28233). See the docs I've added for usage examples of these new functions. I also refactored the existing override tests to make use of these new functions, which should be a good forcing function to make sure they're kept up-to-date.
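
A short sketch of the new introspection helpers, assuming the private module path `torch._overrides` described above (later releases expose them under `torch.overrides`):

```python
import torch
from torch._overrides import (
    get_overridable_functions,
    get_testing_overrides,
    get_ignored_functions,
)

overridable = get_overridable_functions()  # namespace -> list of overridable functions
testing = get_testing_overrides()          # function -> dummy lambda with a matching signature
ignored = get_ignored_functions()          # functions that cannot be overridden

print(sum(len(v) for v in overridable.values()), len(testing), len(ignored))
```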

Finally, while working on this I discovered that `TestTorchFunctionOverrides.test_mean` and `TestTorchFunctionOverrides.test_mm` weren't ever being run because they were getting clobbered by the other dynamically generated override tests. I fixed that by renaming the tests and then fixing the actual test code. I've verified that all the subclassing semantics is correct and that the updated test answers are correct. I'm happy to put the fixes to the existing tests in as a separate pull request if that would be easier to review.

ping cpuhrsch since the feature request originally came from them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33791

Differential Revision: D20195053

Pulled By: cpuhrsch

fbshipit-source-id: 1585f4e405f5223932b410eae03a288dc8eb627e
2020-03-03 12:40:34 -08:00