pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Michael Carilli	e6bc34f549	Amp gradient accumulation example (#36601 ) Summary: Several people have asked me about proper Amp usage with gradient accumulation. In particular, it's [unclear to people](https://github.com/NVIDIA/apex/issues/439#issuecomment-610351482) that you should only call `scaler.unscale_()` (if desired) and `scaler.update()` in iterations where you actually plan to step. This PR adds a minimal accumulation example. I built the docs locally and it looks free from sphinx errors, at least. Pull Request resolved: https://github.com/pytorch/pytorch/pull/36601 Differential Revision: D21082295 Pulled By: ngimel fbshipit-source-id: b2faa6c02b9f7e1972618a0f1d5360a03f0450ac	2020-04-17 09:56:36 -07:00
Jessica Lin	ac950bb9c8	Update docs for master to remove Python 2 references (#36336 ) Summary: Fix compile error from original PR in jit_language_references.rst: https://github.com/pytorch/pytorch/pull/36114 Full details in task: https://our.intern.facebook.com/intern/tasks/?t=64776265 With pytroch 1.5+ we remove python2 support from PyTorch. All documentation under docs/ and on the pytorch.org website needs to remove Python 2 references. Pull Request resolved: https://github.com/pytorch/pytorch/pull/36336 Differential Revision: D21057507 Pulled By: jlin27 fbshipit-source-id: 993a763f1ecb16dad859bc02a07625ddc023645d	2020-04-16 10:15:48 -07:00
Shen Li	049dede3be	Move rpc.rst back to the source folder to preserve existing doc URLs (#36675 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36675 Test Plan: Imported from OSS Differential Revision: D21048628 Pulled By: mrshenli fbshipit-source-id: 3cb1b35ddc1f40c673b0db9048d77dfa024be1e7	2020-04-16 08:12:34 -07:00
Omkar Salpekar	5927a6731c	[PyTorch Docs] Updated RRef docs to indicate RPC Retries (#36678 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36678 Updated the docs to explicitly indicate that RRef control messages are idempotent and retried upon failure. ghstack-source-id: 102225791 Test Plan: build bot Differential Revision: D20828041 fbshipit-source-id: ca4d71c65a453664c16c32134c47637a966b1a19	2020-04-15 17:33:20 -07:00
Kurt Mohler	2bc49a4b85	block_diag dense (#33449 ) Summary: Add block_diag function for dense tensors, based on scipy.linalg.block_diag Closes https://github.com/pytorch/pytorch/issues/31932 Pull Request resolved: https://github.com/pytorch/pytorch/pull/33449 Differential Revision: D20943099 Pulled By: zou3519 fbshipit-source-id: 8b5c9476fb5af959aafa4169612c660396d9b717	2020-04-13 10:04:55 -07:00
Hameer Abbasi	1875c2e4bd	Add torch.Tensor.as_subclass method. (#34369 ) Summary: This is according to pytorch/rfcs#3. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34369 Differential Revision: D20963929 Pulled By: ezyang fbshipit-source-id: e618af6fd36e1dfaeda617162314ad5840f55358	2020-04-10 09:16:35 -07:00
Edward Yang	6016f694c0	Revert D20901746: [pytorch][PR] Update docs for master to remove Python 2 references Test Plan: revert-hammer Differential Revision: D20901746 Original commit changeset: 07f8dc8e6fab fbshipit-source-id: 13c55597f9f79b8473210cf35a5a0f1fb34bae39	2020-04-08 14:49:11 -07:00
Jessica Lin	373dc7c8ef	Group libraries in TOC and add PyTorch Elastic (#34928 ) Summary: Move XLA out of Notes and group with other libraries. Also adds link to PyTorch Elastic ![image](https://user-images.githubusercontent.com/8042156/76912125-f76d1080-686f-11ea-99d5-bb7be199adbd.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/34928 Differential Revision: D20901732 Pulled By: jlin27 fbshipit-source-id: a5da915bb435a3aa8995d8bbe87f53ef79fd3ce6	2020-04-07 16:37:45 -07:00
Jessica Lin	43234be525	Update docs for master to remove Python 2 references (#36114 ) Summary: Full details in task: https://our.intern.facebook.com/intern/tasks/?t=64776265 With pytroch 1.5+ we remove python2 support from PyTorch. All documentation under docs/ and on the pytorch.org website needs to remove Python 2 references. Pull Request resolved: https://github.com/pytorch/pytorch/pull/36114 Differential Revision: D20901746 Pulled By: jlin27 fbshipit-source-id: 07f8dc8e6fab0b232e5048a63079cab0c433c85f	2020-04-07 16:13:18 -07:00
Orion Reblitz-Richardson	2d8dbcd3ef	Remove python2 and 3.5 from requirements.txt, README and docs (#35677 ) Summary: Some more cleanup now that we no longer support python2 or 3.5 on master and eventually PyTorch 1.6 release. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35677 Differential Revision: D20838097 Pulled By: orionr fbshipit-source-id: 95d553a1e8769f3baa395e0bc6d4ce7cd93236e9	2020-04-03 11:05:43 -07:00
Feng Tian	762270c51f	add c10d dynamic loading mechanism and unit test (#28068 ) Summary: The original behavior of pytorch c10d only supports built-in c10d backends, such as nccl/gloo/mpi. This patch is used to extend the c10d capability to support dynamically loading 3rd party communication libraries which are derived from ProcessGroup base class. related RFC is in: https://github.com/pytorch/pytorch/issues/27955 Through this way, user just need specify a 3rd party c10d backend name when invoking torch.distributed.init_process_group(). The proposed logic will try to load corresponding c10d backend cpp extension automatically. as for how to develop a new 3rd party c10d backend through cpp extension, pls refer to test/cpp_extensions/cpp_c10d_extension.cpp Pull Request resolved: https://github.com/pytorch/pytorch/pull/28068 Differential Revision: D19174838 Pulled By: agolynski fbshipit-source-id: 3409a504a43ce7260e6f9d1207c00e87471fac62	2020-04-02 15:46:51 -07:00
anjali411	c070e8fb26	Updated canCast to disallow complex -> non complex conversion (#35883 ) Summary: fixes https://github.com/pytorch/pytorch/issues/35675 Pull Request resolved: https://github.com/pytorch/pytorch/pull/35883 Differential Revision: D20818130 Pulled By: anjali411 fbshipit-source-id: c9b4b6112897639d1e9b7073c5dac7a29b9cd990	2020-04-02 15:12:38 -07:00
Rohan Varma	6616fad92e	[Docs] Fix typo in RPC docs (#35809 ) Summary: It's also fixed in the cherry pick PR https://github.com/pytorch/pytorch/pull/35808 Pull Request resolved: https://github.com/pytorch/pytorch/pull/35809 Differential Revision: D20803338 Pulled By: rohan-varma fbshipit-source-id: 1925f367703faf053ab4b1c0ff0acb86230c5d89	2020-04-01 21:16:12 -07:00
Dhiraj D Kalamkar	945d7a7408	Add All-to-all comms support to distributed module and MPI backend (#32361 ) Summary: As described in https://github.com/pytorch/pytorch/issues/32345, a prototype implementation to add an alltoall communication primitive to torch.distributed module and ProcessGroup abstract interface. Also, implements alltoall in ProcessGroupMPI backend. mnaumovfb JianpingChen066 dmudiger srinivas212 Jianhui-Li mshiryaev ftian1 cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini xush6528 osalpekar Pull Request resolved: https://github.com/pytorch/pytorch/pull/32361 Reviewed By: mrshenli Differential Revision: D20635481 Pulled By: srinivas212 fbshipit-source-id: 3dd0af800ce55d02f02813cde550e3a0f1a287d2	2020-04-01 08:57:12 -07:00
Rohan Varma	1f06db2579	Refactored rpc docs (#35109 ) Summary: Reorganize as per jlin27 's comments. Screenshots added in comments. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35109 Differential Revision: D20788774 Pulled By: rohan-varma fbshipit-source-id: 7d64be70ef76ed6ff303d05d39c338293c234766	2020-04-01 02:01:34 -07:00
Ilia Cherniavskii	bc6bd0bb1a	Debug Information Guard Summary: This diff fixes the issues with current handling of debug information passed along the execution of the model. (For example, it is possible that multiple calls to the debug guard may override each other) Test Plan: CI test/cpp/jit Reviewed By: dzhulgakov Differential Revision: D20602775 fbshipit-source-id: 4683957954028af81a1a0f1f12b243650230c9bb	2020-04-01 01:55:29 -07:00
Ilia Cherniavskii	800d5617c0	Recording of TorchScript functions (#34710 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34710 Extending RecordFunction API to support new recording scopes (such as TorchScript functions), as well as giving more flexibility to set sampling rate. Test Plan: unit test (test_misc.cpp/testRecordFunction) Reviewed By: gdankel, dzhulgakov Differential Revision: D20158523 fbshipit-source-id: a9e0819d21cc06f4952d92d43246587c36137582	2020-03-31 00:33:23 -07:00
Mike Ruberry	860790de88	Makes torch.real and torch.imag NumPy compatible, but disables them for complex tensors (#35560 ) Summary: The current implementations of torch.real and torch.imag are not NumPy compatible. In particular: - torch.real on a real tensor does not return the real tensor, like contiguous - torch.real on a complex tensor does not return a real-valued view of the real part - torch.imag on a complex tensor does not return a real-valued view of the imaginary part - torch.Tensor.real and torch.Tensor.imag exist as methods, but in NumPy they are writable attributes This PR makes the functions NumPy compatible by removing the method variants and out kwarg, restricting them to work on only real tensors, and updating the behavior of torch.real to return its input. New tests are added to test_torch.py to verify the behavior, a couple existing complex tests are skipped, and the documentation is updated to reflect the change. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35560 Differential Revision: D20714568 Pulled By: mruberry fbshipit-source-id: 5dd092f45757b620c8426c829dd15ee997246a26	2020-03-29 02:09:00 -07:00
pinzhenx	bd604cb5b7	Upgrade MKL-DNN to DNNL v1.2 (#32422 ) Summary: ## Motivation This PR upgrades MKL-DNN from v0.20 to DNNL v1.2 and resolves https://github.com/pytorch/pytorch/issues/30300. DNNL (Deep Neural Network Library) is the new brand of MKL-DNN, which improves performance, quality, and usability over the old version. This PR focuses on the migration of all existing functionalities, including minor fixes, performance improvement and code clean up. It serves as the cornerstone of our future efforts to accommodate new features like OpenCL support, BF16 training, INT8 inference, etc. and to let the Pytorch community derive more benefits from the Intel Architecture. <br> ## What's included? Even DNNL has many breaking changes to the API, we managed to absorb most of them in ideep. This PR contains minimalist changes to the integration code in pytorch. Below is a summary of the changes: <br> General: 1. Replace op-level allocator with global-registered allocator ``` // before ideep::sum::compute<AllocForMKLDNN>(scales, {x, y}, z); // after ideep::sum::compute(scales, {x, y}, z); ``` The allocator is now being registeted at `aten/src/ATen/native/mkldnn/IDeepRegistration.cpp`. Thereafter all tensors derived from the `cpu_engine` (by default) will use the c10 allocator. ``` RegisterEngineAllocator cpu_alloc( ideep::engine::cpu_engine(), [](size_t size) { return c10::GetAllocator(c10::DeviceType::CPU)->raw_allocate(size); }, [](void* p) { c10::GetAllocator(c10::DeviceType::CPU)->raw_deallocate(p); } ); ``` ------ 2. Simplify group convolution We had such a scenario in convolution where ideep tensor shape mismatched aten tensor: when `groups > 1`, DNNL expects weights tensors to be 5-d with an extra group dimension, e.g. `goihw` instead of `oihw` in 2d conv case. As shown below, a lot of extra checks came with this difference in shape before. Now we've completely hidden this difference in ideep and all tensors are going to align with pytorch's definition. So we could safely remove these checks from both aten and c2 integration code. ``` // aten/src/ATen/native/mkldnn/Conv.cpp if (w.ndims() == x.ndims() + 1) { AT_ASSERTM( groups > 1, "Only group _mkldnn_conv2d weights could have been reordered to 5d"); kernel_size[0] = w.get_dim(0) * w.get_dim(1); std::copy_n( w.get_dims().cbegin() + 2, x.ndims() - 1, kernel_size.begin() + 1); } else { std::copy_n(w.get_dims().cbegin(), x.ndims(), kernel_size.begin()); } ``` ------ 3. Enable DNNL built-in cache Previously, we stored DNNL jitted kernels along with intermediate buffers inside ideep using an LRU cache. Now we are switching to the newly added DNNL built-in cache, and no longer caching buffers in order to reduce memory footprint. This change will be mainly reflected in lower memory usage from memory profiling results. On the code side, we removed couple of lines of `op_key_` that depended on the ideep cache before. ------ 4. Use 64-bit integer to denote dimensions We changed the type of `ideep::dims` from `vector<int32_t>` to `vector<int64_t>`. This renders ideep dims no longer compatible with 32-bit dims used by caffe2. So we use something like `{stride_.begin(), stride_.end()}` to cast parameter `stride_` into a int64 vector. <br> Misc changes in each commit: Commit: change build options Some build options were slightly changed, mainly to avoid name collisions with other projects that include DNNL as a subproject. In addition, DNNL built-in cache is enabled by option `DNNL_ENABLE_PRIMITIVE_CACHE`. Old \| New -- \| -- WITH_EXAMPLE \| MKLDNN_BUILD_EXAMPLES WITH_TEST \| MKLDNN_BUILD_TESTS MKLDNN_THREADING \| MKLDNN_CPU_RUNTIME MKLDNN_USE_MKL \| N/A (not use MKL anymore) ------ Commit: aten reintegration - aten/src/ATen/native/mkldnn/BinaryOps.cpp Implement binary ops using new operation `binary` provided by DNNL - aten/src/ATen/native/mkldnn/Conv.cpp Clean up group convolution checks Simplify conv backward integration - aten/src/ATen/native/mkldnn/MKLDNNConversions.cpp Simplify prepacking convolution weights - test/test_mkldnn.py Fixed an issue in conv2d unit test: it didn't check conv results between mkldnn and aten implementation before. Instead, it compared the mkldnn with mkldnn as the default cpu path will also go into mkldnn. Now we use `torch.backends.mkldnn.flags` to fix this issue - torch/utils/mkldnn.py Prepack weight tensor on module `__init__` to achieve better performance significantly ------ Commit: caffe2 reintegration - caffe2/ideep/ideep_utils.h Clean up unused type definitions - caffe2/ideep/operators/adam_op.cc & caffe2/ideep/operators/momentum_sgd_op.cc Unify tensor initialization with `ideep::tensor::init`. Obsolete `ideep::tensor::reinit` - caffe2/ideep/operators/conv_op.cc & caffe2/ideep/operators/quantization/int8_conv_op.cc Clean up group convolution checks Revamp convolution API - caffe2/ideep/operators/conv_transpose_op.cc Clean up group convolution checks Clean up deconv workaround code ------ Commit: custom allocator - Register c10 allocator as mentioned above <br><br> ## Performance We tested inference on some common models based on user scenarios, and most performance numbers are either better than or on par with DNNL 0.20. ratio: new / old \| Latency (batch=1 4T) \| Throughput (batch=64 56T) -- \| -- \| -- pytorch resnet18 \| 121.4% \| 99.7% pytorch resnet50 \| 123.1% \| 106.9% pytorch resnext101_32x8d \| 116.3% \| 100.1% pytorch resnext50_32x4d \| 141.9% \| 104.4% pytorch mobilenet_v2 \| 163.0% \| 105.8% caffe2 alexnet \| 303.0% \| 99.2% caffe2 googlenet-v3 \| 101.1% \| 99.2% caffe2 inception-v1 \| 102.2% \| 101.7% caffe2 mobilenet-v1 \| 356.1% \| 253.7% caffe2 resnet101 \| 100.4% \| 99.8% caffe2 resnet152 \| 99.8% \| 99.8% caffe2 shufflenet \| 141.1% \| 69.0% † caffe2 squeezenet \| 98.5% \| 99.2% caffe2 vgg16 \| 136.8% \| 100.6% caffe2 googlenet-v3 int8 \| 100.0% \| 100.7% caffe2 mobilenet-v1 int8 \| 779.2% \| 943.0% caffe2 resnet50 int8 \| 99.5% \| 95.5% _Configuration: Platform: Skylake 8180 Latency Test: 4 threads, warmup 30, iteration 500, batch size 1 Throughput Test: 56 threads, warmup 30, iteration 200, batch size 64_ † Shufflenet is one of the few models that require temp buffers during inference. The performance degradation is an expected issue since we no longer cache any buffer in the ideep. As for the solution, we suggest users opt for caching allocator like jemalloc as a drop-in replacement for system allocator in such heavy workloads. Pull Request resolved: https://github.com/pytorch/pytorch/pull/32422 Test Plan: Perf results: https://our.intern.facebook.com/intern/fblearner/details/177790608?tab=Experiment%20Results 10% improvement for ResNext with avx512, neutral on avx2 More results: https://fb.quip.com/ob10AL0bCDXW#NNNACAUoHJP Reviewed By: yinghai Differential Revision: D20381325 Pulled By: dzhulgakov fbshipit-source-id: 803b906fd89ed8b723c5fcab55039efe3e4bcb77	2020-03-26 22:07:59 -07:00
Ailing Zhang	7580470cc5	Update view op list. (#35399 ) Summary: Adding ops to the list based on our discussion. :D Pull Request resolved: https://github.com/pytorch/pytorch/pull/35399 Differential Revision: D20651393 Pulled By: ailzhang fbshipit-source-id: 8cf9026d10c0d74117953dbb68ebc2f537be956a	2020-03-25 16:15:00 -07:00
Yuichiro Ueno	aadd0fda8b	Document reduce_scatter collective operation (#35274 ) Summary: I don't know why reduce_scatter collective operation is not documented so I add it to the document. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35274 Differential Revision: D20645850 Pulled By: mrshenli fbshipit-source-id: 0a4458bff1a4e15a4593dd4dcc25e4e0f6e2265d	2020-03-25 13:36:29 -07:00
anjali411	c73e97033a	Added type promotion logic for complex numbers (#34093 ) Summary: Issue: https://github.com/pytorch/pytorch/issues/33780 After this PR: 1. dtype promotion logic will correctly work for ops involving complex scalars 2. added alias for complex64 (cfloat) and complex128 (cdouble) 3. added an internal function get_complex_default_dtype (consciously not exposed in public API) - sets the default complex dtype to be double if default_dtype is set to double, else float https://github.com/pytorch/pytorch/pull/34093#discussion_r392350224 >>> 1jtorch.ones(2) tensor([(0.0000 + 1.0000j), (0.0000 + 1.0000j)], dtype=torch.complex64) >>> torch.set_default_dtype(torch.float64) >>> 1jtorch.ones(2) tensor([(0.0000 + 1.0000j), (0.0000 + 1.0000j)], dtype=torch.complex128) >>> 1j + torch.ones(2) tensor([(1.0000 + 1.0000j), (1.0000 + 1.0000j)], dtype=torch.complex128) >>> torch.tensor(1j) + torch.ones(2,2) tensor([[(1.0000 + 1.0000j), (1.0000 + 1.0000j)], [(1.0000 + 1.0000j), (1.0000 + 1.0000j)]], dtype=torch.complex128) Pull Request resolved: https://github.com/pytorch/pytorch/pull/34093 Differential Revision: D20537125 Pulled By: anjali411 fbshipit-source-id: 05fb1f81b8ba039d0b698cdd2c0bbf8b0ce0b767	2020-03-25 09:12:21 -07:00
Michael Carilli	0f0271e255	[RELAND2] Eager autocasting, out-of-place ops only (with MSVC 2017 fix) (#35102 ) Summary: This is the second reland attempt for https://github.com/pytorch/pytorch/pull/32140. The first reland attempt https://github.com/pytorch/pytorch/pull/35011 failed due a [small incompatible change](https://github.com/pytorch/pytorch/pull/35011#issuecomment-601754216) in recent master (`skipIfRocm` was removed from `test_data_parallel.py`). The present PR restores skipIfRocm. Description from first reland attempt https://github.com/pytorch/pytorch/pull/35011: > https://github.com/pytorch/pytorch/pull/32140 was approved and merged, but [reverted](`d0577e19f0`) because it broke builds with versions of Visual Studio older than 15.8 that were not represented in public CI. The build failures were caused by a [known VS bug](https://developercommunity.visualstudio.com/content/problem/27729/allow-function-with-internal-linkage-as-template-n.html), fixed in versions 15.8 and newer. > > The present PR reverts the revert (restoring https://github.com/pytorch/pytorch/pull/32140 's diffs) and adds a workaround to enable compilation with VS < 15.8. The workaround isn't pretty, but it's guarded by macros such that it's only used when compiling with VS < 15.8. All other builds compile with the same code/control flow as was merged in https://github.com/pytorch/pytorch/pull/32140. > > Original description of https://github.com/pytorch/pytorch/pull/32140: > > Initial integration of eager autocasting, supporting out-of-place ops only for easier review. > Relevant issue/RFC: https://github.com/pytorch/pytorch/issues/25081 > > > In-place ops and ops with user-supplied out=... can certainly be supported as well (my initial WIP https://github.com/pytorch/pytorch/issues/29552 handled many) but require substantially more complex special casing in the autocasting backend and tests. Support for these ops (much of which has already been written) will be broken into later PRs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35102 Differential Revision: D20596918 Pulled By: ezyang fbshipit-source-id: 60caa279bb0ce4a9bb0b28c1d585d42cf1cc7e50	2020-03-24 09:08:04 -07:00
Mike Ruberry	7c1ea736ba	Extends true_divide to be a method (#34794 ) Summary: Per title. See related https://github.com/pytorch/pytorch/pull/34570. In PyTorch 1.7 the plan is for torch.div and Python's division operator to perform "true" division, like Python 3, JAX, and NumPy. To facilitate this change, this PR expands true_divide to be a method so it can cover all of torch.div's use cases. New true_divide tests are added to test_torch.py, test_type_promotion.py, and test_sparse.py. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34794 Differential Revision: D20545507 Pulled By: mruberry fbshipit-source-id: 55286f819716c8823d1930441a69008560ac2bd5	2020-03-23 23:12:23 -07:00
Peter Bell	bd0ef784e0	FAQ: Add note about recovering from OOM (#35214 ) Summary: Closes https://github.com/pytorch/pytorch/issues/18853 This documents the workaround needed to solve the issues in https://github.com/pytorch/pytorch/issues/18853 Pull Request resolved: https://github.com/pytorch/pytorch/pull/35214 Differential Revision: D20604877 Pulled By: ezyang fbshipit-source-id: 71ed13cfa567d8e88fa9f18180a171cd174fb528	2020-03-23 20:22:46 -07:00
Vitaly Fedyunin	40da01db6a	Add docs about memory format (#34818 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34818 Test Plan: Imported from OSS Differential Revision: D20601336 Pulled By: VitalyFedyunin fbshipit-source-id: d34ad226be950bf134c6b383a4810ea6aa75599e	2020-03-23 15:06:33 -07:00
Jerry Zhang	3fa7813b9f	[quant] Add dequantize.tensors (#34348 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34348 We need this function to do swap dequantize for prim::ListConstruct since the output of prim::ListConstruct is a list of Tensors Test Plan: . Imported from OSS Differential Revision: D20504454 fbshipit-source-id: e6155e37da98e2219a6f79737cd46fe32a509c9f	2020-03-20 22:51:51 -07:00
Xiang Gao	df8d6eeb19	Update docs about DP and DDP for CUDA (#35063 ) Summary: We should recommend DDP instead of DP. Hope we can also cherry-pick this for 1.5 Pull Request resolved: https://github.com/pytorch/pytorch/pull/35063 Differential Revision: D20549621 Pulled By: ngimel fbshipit-source-id: 86b1b2134664065cc6070ea4212895f993eaf543	2020-03-20 20:06:37 -07:00
Mike Ruberry	fe276d541e	Revert D20541921: [pytorch][PR] [RELAND] Eager autocasting, out-of-place ops only (with MSVC 2017 fix) Test Plan: revert-hammer Differential Revision: D20541921 Original commit changeset: abb5488dca86 fbshipit-source-id: d2c6038978f80e5429632f8b49107090a8a247f4	2020-03-19 22:39:12 -07:00
Michael Carilli	991b97277a	[RELAND] Eager autocasting, out-of-place ops only (with MSVC 2017 fix) (#35011 ) Summary: https://github.com/pytorch/pytorch/pull/32140 was approved and merged, but [reverted](`d0577e19f0`) because it broke builds with versions of Visual Studio older than 15.8 that were not represented in public CI. The build failures were caused by a [known VS bug](https://developercommunity.visualstudio.com/content/problem/27729/allow-function-with-internal-linkage-as-template-n.html), fixed in versions 15.8 and newer. The present PR reverts the revert (restoring https://github.com/pytorch/pytorch/pull/32140 's diffs) and adds a workaround to enable compilation with VS < 15.8. The workaround isn't pretty, but it's guarded by macros such that it's only used when compiling with VS < 15.8. All other builds compile with the same code/control flow as was merged in https://github.com/pytorch/pytorch/pull/32140. Original description of https://github.com/pytorch/pytorch/pull/32140: > Initial integration of eager autocasting, supporting out-of-place ops only for easier review. Relevant issue/RFC: https://github.com/pytorch/pytorch/issues/25081 > In-place ops and ops with user-supplied out=... can certainly be supported as well (my initial WIP https://github.com/pytorch/pytorch/issues/29552 handled many) but require substantially more complex special casing in the autocasting backend and tests. Support for these ops (much of which has already been written) will be broken into later PRs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35011 Differential Revision: D20541921 Pulled By: ezyang fbshipit-source-id: abb5488dca8620b0daac4306ebf2bb47fc36e4f5	2020-03-19 20:18:18 -07:00
albanD	1f4a4aaf64	functional autograd api (#34066 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34066 Basic implementation of https://github.com/pytorch/pytorch/issues/30632 Test Plan: Imported from OSS Differential Revision: D20260307 Pulled By: albanD fbshipit-source-id: 7db5c2411ddc3e954ff8fbbe93eb3b96a2bcfb2f	2020-03-19 08:24:07 -07:00
Mike Ruberry	9c4683e8e3	Revert D20312366: [pytorch][PR] Added type promotion logic for complex numbers Test Plan: revert-hammer Differential Revision: D20312366 Original commit changeset: 90f00a1a916d fbshipit-source-id: 4510739a888b2eec5d8a72e792998ac46da6d82a	2020-03-19 05:55:57 -07:00
anjali411	c8f665dcb6	Added type promotion logic for complex numbers (#34093 ) Summary: Issue: https://github.com/pytorch/pytorch/issues/33780 After this PR: 1. dtype promotion logic will correctly work for ops involving complex scalars 2. torch.ComplexFloatTensor, torch.ComplexDoubleTensor works 3. added alias for complex64 (cfloat) and complex128 (cdouble) 4. added an internal function get_complex_default_dtype (consciously not exposed in public API) >>> 1jtorch.ones(2) tensor([(0.0000 + 1.0000j), (0.0000 + 1.0000j)], dtype=torch.complex64) >>> torch.set_default_dtype(torch.float64) >>> 1jtorch.ones(2) tensor([(0.0000 + 1.0000j), (0.0000 + 1.0000j)], dtype=torch.complex128) >>> 1j + torch.ones(2) tensor([(1.0000 + 1.0000j), (1.0000 + 1.0000j)], dtype=torch.complex128) >>> torch.tensor(1j) + torch.ones(2,2) tensor([[(1.0000 + 1.0000j), (1.0000 + 1.0000j)], [(1.0000 + 1.0000j), (1.0000 + 1.0000j)]], dtype=torch.complex128) Pull Request resolved: https://github.com/pytorch/pytorch/pull/34093 Differential Revision: D20312366 Pulled By: anjali411 fbshipit-source-id: 90f00a1a916d9c8eeda101eb6e9d250fce569815	2020-03-18 23:36:13 -07:00
Mike Ruberry	3b7e1cd2cc	Makes floor_divide a method, adds sparse floor division (#34552 ) Summary: (Updated per review feedback) `torch.floor_divide` is currently a function that can operate on two tensors or a tensor and a scalar (scalar x scalar floor division is handled natively by Python and the JIT has a builtin function for it). This PR updates it to: - have an out variant: `floor_divide(x, y, out=z)` - be a method on a tensor: `x.floor_divide(y)` - have an in-place variant: `x.floor_divide_(y)` - work with sparse tensors Tests are added to test_sparse.py and test_torch.py for these new behaviors. In addition, this PR: - cleans up the existing sparse division and true_division code and improves their error message - adds testing of sparse true_division to test_sparse.py - extends existing floor_divide testing in test_torch to run on CUDA, too, not just the CPU Unfortunately, making floor_divide a method requires breaking backwards compatibility, and floor_divide has been added to the BC whitelist since this is international. The BC issue is that the first parameter name to torch.floor_divide is changing from input to self. If you previously called torch.floor_divide with keyword arguments, e.g. torch.floor_divide(input=x, other=y), you will need to update to torch.floor_divide(self=x, other=y), or the more common torch.floor_divide(x, y). The intent of this PR is to allow floor_divide to be substituted for division (torch.div, /) wherever division was previously used. In 1.6 we expect torch.div to perform true_division, and floor_divide is how users can continue to perform integer division with tensors. There are two potential follow-up issues suggested by this PR: - the test framework might benefit from additional tensor construction classes, like one to create dividends and divisors for multiple dtypes - the test framework might benefit from a universal function test class. while methods have reasonable coverage as part of test_torch.py's TestTensorOp tests, function coverage is spotty. Universal functions are similar enough it should be possible to generate tests for them. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34552 Differential Revision: D20509850 Pulled By: mruberry fbshipit-source-id: 2cd3c828aad67191c77f2ed8470411e246f604f8	2020-03-18 15:00:53 -07:00
Edward Yang	d0577e19f0	Revert D20346700: [pytorch][PR] Eager autocasting, out-of-place ops only Test Plan: revert-hammer Differential Revision: D20346700 Original commit changeset: 12d77b391731 fbshipit-source-id: 108d72bf24232f443c0be293ec932c0c478d6a60	2020-03-18 11:42:51 -07:00
Michael Carilli	aaa8f02156	Eager autocasting, out-of-place ops only (#32140 ) Summary: Initial integration of eager autocasting, supporting out-of-place ops only for easier review. Relevant issue/RFC: https://github.com/pytorch/pytorch/issues/25081 In-place ops and ops with user-supplied `out=...` can certainly be supported as well (my initial WIP https://github.com/pytorch/pytorch/pull/29552 handled many) but require substantially more complex special casing in the autocasting backend and tests. Support for these ops (much of which has already been written) will be broken into later PRs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/32140 Differential Revision: D20346700 Pulled By: ezyang fbshipit-source-id: 12d77b3917310186fbddf11c59b2794dc859131f	2020-03-18 10:28:21 -07:00
Mike Ruberry	a1eaaea288	Revert D20497453: [pytorch][PR] Makes floor_divide a method, adds sparse floor division Test Plan: revert-hammer Differential Revision: D20497453 Original commit changeset: ac326f2007d8 fbshipit-source-id: b94b89b1a25521506e3d0a6b072d3d4d8c55e63d	2020-03-18 01:48:50 -07:00
Mike Ruberry	b7129050e7	Makes floor_divide a method, adds sparse floor division (#34552 ) Summary: (Updated per review feedback) `torch.floor_divide` is currently a function that can operate on two tensors or a tensor and a scalar (scalar x scalar floor division is handled natively by Python and the JIT has a builtin function for it). This PR updates it to: - have an out variant: `floor_divide(x, y, out=z)` - be a method on a tensor: `x.floor_divide(y)` - have an in-place variant: `x.floor_divide_(y)` - work with sparse tensors Tests are added to test_sparse.py and test_torch.py for these new behaviors. In addition, this PR: - cleans up the existing sparse division and true_division code and improves their error message - adds testing of sparse true_division to test_sparse.py - extends existing floor_divide testing in test_torch to run on CUDA, too, not just the CPU Unfortunately, making floor_divide a method requires breaking backwards compatibility, and floor_divide has been added to the BC whitelist since this is international. The BC issue is that the first parameter name to torch.floor_divide is changing from input to self. If you previously called torch.floor_divide with keyword arguments, e.g. torch.floor_divide(input=x, other=y), you will need to update to torch.floor_divide(self=x, other=y), or the more common torch.floor_divide(x, y). The intent of this PR is to allow floor_divide to be substituted for division (torch.div, /) wherever division was previously used. In 1.6 we expect torch.div to perform true_division, and floor_divide is how users can continue to perform integer division with tensors. There are two potential follow-up issues suggested by this PR: - the test framework might benefit from additional tensor construction classes, like one to create dividends and divisors for multiple dtypes - the test framework might benefit from a universal function test class. while methods have reasonable coverage as part of test_torch.py's TestTensorOp tests, function coverage is spotty. Universal functions are similar enough it should be possible to generate tests for them. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34552 Differential Revision: D20497453 Pulled By: mruberry fbshipit-source-id: ac326f2007d8894f730d1278fef84d63bcb07b5d	2020-03-18 00:01:45 -07:00
Shen Li	3c48aadd98	Update descriptions for transmitting CUDA tensors (#34888 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34888 Test Plan: Imported from OSS Differential Revision: D20491408 Pulled By: mrshenli fbshipit-source-id: 4ca35ac9edd4c1af4f2bae2cfb0f1f6060658d5c	2020-03-17 17:43:48 -07:00
Shen Li	800bdcf000	Removing experimental tag in for RPC and adding experimental tag for RPC+TorchScript (#34887 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34887 Test Plan: Imported from OSS Differential Revision: D20491409 Pulled By: mrshenli fbshipit-source-id: ce79c9706eb70a3a52a4032de4f0bd538b694332	2020-03-17 17:43:42 -07:00
Hameer Abbasi	6b701de130	Add types argument to __torch_function__ (#34303 ) Summary: This PR adds the `types` argument to `__torch_function__` as per RFC 0001: https://github.com/pytorch/rfcs/pull/3 Pull Request resolved: https://github.com/pytorch/pytorch/pull/34303 Differential Revision: D20474992 Pulled By: ezyang fbshipit-source-id: cdd40b3b38f3bda4ece8812a629f5db87e919d01	2020-03-17 13:32:00 -07:00
Pearu Peterson	8bae1ed144	PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem - copy (#34721 ) Summary: This is a copy of PR https://github.com/pytorch/pytorch/issues/29488 to help the merging process. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34721 Differential Revision: D20444270 Pulled By: vincentqb fbshipit-source-id: 042c56c8c0dae37834f52b4aee2deae7dd6fa659	2020-03-16 14:13:30 -07:00
Rohan Varma	fd35596585	[docs][1.5] Update distributed autograd note (#34657 ) Summary: - Update API calls `backward` and `optim.step` now that we require `context_id` - Add notes to clarify purpose of distributed autograd context (this was a source of confusion in some feedback) - Add note that details why optimizer requires context_id - Clearly specify that we don't have SMART mode yet Pull Request resolved: https://github.com/pytorch/pytorch/pull/34657 Differential Revision: D20427667 Pulled By: rohan-varma fbshipit-source-id: 5f8a3539ccf648a78e9e9a0dfdfe389c678b1606	2020-03-12 22:56:32 -07:00
gabloa	a74fbea345	Continuous bernoulli distribution (take 2) (#34619 ) Summary: We recently had a NeurIPS paper (https://arxiv.org/abs/1907.06845 and https://papers.nips.cc/paper/9484-the-continuous-bernoulli-fixing-a-pervasive-error-in-variational-autoencoders) where we introduce a new [0,1]-supported distribution: the continuous Bernoulli. This pull request implements this distribution in pytorch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34619 Differential Revision: D20403123 Pulled By: ngimel fbshipit-source-id: d807c7d0d372c6daf6cb6ef09df178bc7491abb2	2020-03-12 11:53:18 -07:00
Nathan Goldbaum	3f1ba3c465	Redo of "Add API for listing functions overridable by __torch_function__" (#34240 ) Summary: This is a redo of https://github.com/pytorch/pytorch/pull/33791, which was reverted because it introduced a flaky test. The test was flaky and only flaky on Python3.5 because of dict order randomization. I've fixed the issue with tests clobbering each other in `b539fec` and removed the override tests for `torch.nn.functional.tanh` and `torch.nn.functional.sigmoid`, which are deprecated and shouldn't be overridable in `e0d7402`. I also verified that no more test clobbering is happening. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34240 Differential Revision: D20252442 Pulled By: cpuhrsch fbshipit-source-id: 069568e342a41c90e1dc76cbf85ba4aed47f24be	2020-03-12 10:33:17 -07:00
Michael Suo	c235be42dd	[jit] kill script namespace (#34515 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34515 Once upon a time we thought this was necessary. In reality it is not, so removing it. For backcompat, our public interface (defined in `api/`) still has typedefs to the old `script::` names. There was only one collision: `Pass` as a `Stmt` and `Pass` as a graph transform. I renamed one of them. Test Plan: Imported from OSS Differential Revision: D20353503 Pulled By: suo fbshipit-source-id: 48bb911ce75120a8c9e0c6fb65262ef775dfba93	2020-03-11 23:32:48 -07:00
Samuel	b039bca4db	Fix typo in data.rst (#34624 ) Summary: Fix minor typo Pull Request resolved: https://github.com/pytorch/pytorch/pull/34624 Differential Revision: D20401946 Pulled By: ngimel fbshipit-source-id: 0c6a7d838aa15120b3ecb8b9ba4b57550c9bcd32	2020-03-11 19:40:18 -07:00
Edward Yang	4b929e5466	Revert D20193196: [pytorch][PR] PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem Test Plan: revert-hammer Differential Revision: D20193196 Original commit changeset: 78a487991242 fbshipit-source-id: 8da4f8cb17c45af41e8c0ce80bc72581eb10dbb8	2020-03-11 09:24:34 -07:00
Pearu Peterson	2ec779d46c	PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem (#29488 ) Summary: This PR implements the following linear algebra algorithms for low-rank matrices: - [x] Approximate `A` as `Q Q^H A` - using Algorithm 4.4 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061). + exposed as `torch.lowrank.get_approximate_basis(A, q, niter=2, M=None) -> Q` + [x] dense matrices + [x] batches of dense matrices + [x] sparse matrices + [x] documentation - [x] SVD - using Algorithm 5.1 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061). + uses `torch.lowrank.get_approximate_basis` + exposed as `torch.svd_lowrank(A, q=6, niter=2, M=None) -> (U, S, V)` + [x] dense matrices + [x] batches of dense matrices + [x] sparse matrices + [x] documentation - [x] PCA - using `torch.svd_lowrank` + uses `torch.svd_lowrank` + exposed as `torch.pca_lowrank(A, center=True, q=None, niter=2) -> (U, S, V)` + [x] dense matrices + [x] batches of dense matrices + [x] sparse matrices, uses non-centered sparse matrix algorithm + [x] documentation - [x] generalized eigenvalue solver using the original LOBPCG algorithm [Knyazev, 2001](https://epubs.siam.org/doi/abs/10.1137/S1064827500366124) + exposed as `torch.lobpcg(A, B=None, k=1, method="basic", ...)` + [x] dense matrices + [x] batches of dense matrices + [x] sparse matrices + [x] documentation - [x] generalized eigenvalue solver using robust LOBPCG with orthogonal basis selection [Stathopoulos, 2002](https://epubs.siam.org/doi/10.1137/S1064827500370883) + exposed as `torch.lobpcg(A, B=None, k=1, method="ortho", ...)` + [x] dense matrices + [x] batches of dense matrices + [x] sparse matrices + [x] documentation - [x] generalized eigenvalue solver using the robust and efficient LOBPCG Algorithm 8 from [Duersch et al, 2018](https://epubs.siam.org/doi/abs/10.1137/17M1129830) that switches to orthogonal basis selection automatically + the "ortho" method improves iterations so rapidly that in the current test cases it does not make sense to use the basic iterations at all. If users will have matrices for which basic iterations could improve convergence then the `tracker` argument allows breaking the iteration process at user choice so that the user can switch to the orthogonal basis selection if needed. In conclusion, there is no need to implement Algorithm 8 at this point. - [x] benchmarks + [x] `torch.svd` vs `torch.svd_lowrank`, see notebook [Low-rank SVD](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/Low-rank%20SVD.ipynb). In conclusion, the low-rank SVD is going to be useful only for large sparse matrices where the full-rank SVD will fail due to memory limitations. + [x] `torch.lobpcg` vs `scipy.sparse.linalg.lobpcg`, see notebook [LOBPCG - pytorch vs scipy](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/LOBPCG%20-%20pytorch%20vs%20scipy.ipynb). In conculsion, both implementations give the same results (up to numerical errors from different methods), scipy lobpcg implementation is generally faster. + [x] On very small tolerance cases, `torch.lobpcg` is more robust than `scipy.sparse.linalg.lobpcg` (see `test_lobpcg_scipy` results) Resolves https://github.com/pytorch/pytorch/issues/8049. Pull Request resolved: https://github.com/pytorch/pytorch/pull/29488 Differential Revision: D20193196 Pulled By: vincentqb fbshipit-source-id: 78a4879912424595e6ea95a95e483a37487a907e	2020-03-11 07:33:49 -07:00
Mike Ruberry	3671036ef3	Adds true_divide function, analogous to Python 's, JAX's, NumPy's (true) division (#34236 ) Summary: See NumPy's division documentation here: https://numpy.org/doc/1.18/reference/generated/numpy.divide.html#numpy.divide. True division is the same as PyTorch's default division except when both inputs are integer or bool tensors. In the latter case the inputs are (conceptually) cast to the default floating type before the division is performed. The function is implemented for dense and sparse tensors and supports exporting to ONNX from PyTorch's eager mode or JIT traces. The function is inherently incompatible with exporting to ONNX via JIT script, and is another datapoint suggesting we should deprecate exporting scripted graphs to ONNX. Tests are added for the type promotion, named tensor, and ONNX export behavior. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34236 Reviewed By: houseroad Differential Revision: D20334087 Pulled By: mruberry fbshipit-source-id: 83d00d886f46f713215d7d9e02ffd043164c57f1	2020-03-09 21:06:33 -07:00

1 2 3 4 5 ...

823 Commits