pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Karel Ha	5d9b4d5720	Update contribution_guide.rst (#36438 ) Summary: Fix formatting: change "Frequently Asked Questions" into an RST header, which is clickable and lets one get a URL of the FAQ section Pull Request resolved: https://github.com/pytorch/pytorch/pull/36438 Differential Revision: D21106180 Pulled By: mruberry fbshipit-source-id: 370dafd1883bd57285b478cf2faa14ae2f86e3ba	2020-04-18 02:27:38 -07:00
Michael Carilli	e6bc34f549	Amp gradient accumulation example (#36601 ) Summary: Several people have asked me about proper Amp usage with gradient accumulation. In particular, it's [unclear to people](https://github.com/NVIDIA/apex/issues/439#issuecomment-610351482) that you should only call `scaler.unscale_()` (if desired) and `scaler.update()` in iterations where you actually plan to step. This PR adds a minimal accumulation example. I built the docs locally and it looks free from sphinx errors, at least. Pull Request resolved: https://github.com/pytorch/pytorch/pull/36601 Differential Revision: D21082295 Pulled By: ngimel fbshipit-source-id: b2faa6c02b9f7e1972618a0f1d5360a03f0450ac	2020-04-17 09:56:36 -07:00
Jessica Lin	ac950bb9c8	Update docs for master to remove Python 2 references (#36336 ) Summary: Fix compile error from original PR in jit_language_references.rst: https://github.com/pytorch/pytorch/pull/36114 Full details in task: https://our.intern.facebook.com/intern/tasks/?t=64776265 With PyTorch 1.5+ we remove Python 2 support from PyTorch. All documentation under docs/ and on the pytorch.org website needs to remove Python 2 references. Pull Request resolved: https://github.com/pytorch/pytorch/pull/36336 Differential Revision: D21057507 Pulled By: jlin27 fbshipit-source-id: 993a763f1ecb16dad859bc02a07625ddc023645d	2020-04-16 10:15:48 -07:00
Shen Li	049dede3be	Move rpc.rst back to the source folder to preserve existing doc URLs (#36675 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36675 Test Plan: Imported from OSS Differential Revision: D21048628 Pulled By: mrshenli fbshipit-source-id: 3cb1b35ddc1f40c673b0db9048d77dfa024be1e7	2020-04-16 08:12:34 -07:00
Omkar Salpekar	5927a6731c	[PyTorch Docs] Updated RRef docs to indicate RPC Retries (#36678 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36678 Updated the docs to explicitly indicate that RRef control messages are idempotent and retried upon failure. ghstack-source-id: 102225791 Test Plan: build bot Differential Revision: D20828041 fbshipit-source-id: ca4d71c65a453664c16c32134c47637a966b1a19	2020-04-15 17:33:20 -07:00
Kurt Mohler	2bc49a4b85	block_diag dense (#33449 ) Summary: Add block_diag function for dense tensors, based on scipy.linalg.block_diag Closes https://github.com/pytorch/pytorch/issues/31932 Pull Request resolved: https://github.com/pytorch/pytorch/pull/33449 Differential Revision: D20943099 Pulled By: zou3519 fbshipit-source-id: 8b5c9476fb5af959aafa4169612c660396d9b717	2020-04-13 10:04:55 -07:00
Hameer Abbasi	1875c2e4bd	Add torch.Tensor.as_subclass method. (#34369 ) Summary: This is according to pytorch/rfcs#3. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34369 Differential Revision: D20963929 Pulled By: ezyang fbshipit-source-id: e618af6fd36e1dfaeda617162314ad5840f55358	2020-04-10 09:16:35 -07:00
Edward Yang	6016f694c0	Revert D20901746: [pytorch][PR] Update docs for master to remove Python 2 references Test Plan: revert-hammer Differential Revision: D20901746 Original commit changeset: 07f8dc8e6fab fbshipit-source-id: 13c55597f9f79b8473210cf35a5a0f1fb34bae39	2020-04-08 14:49:11 -07:00
Jessica Lin	373dc7c8ef	Group libraries in TOC and add PyTorch Elastic (#34928 ) Summary: Move XLA out of Notes and group with other libraries. Also adds link to PyTorch Elastic ![image](https://user-images.githubusercontent.com/8042156/76912125-f76d1080-686f-11ea-99d5-bb7be199adbd.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/34928 Differential Revision: D20901732 Pulled By: jlin27 fbshipit-source-id: a5da915bb435a3aa8995d8bbe87f53ef79fd3ce6	2020-04-07 16:37:45 -07:00
Jessica Lin	43234be525	Update docs for master to remove Python 2 references (#36114 ) Summary: Full details in task: https://our.intern.facebook.com/intern/tasks/?t=64776265 With PyTorch 1.5+ we remove Python 2 support from PyTorch. All documentation under docs/ and on the pytorch.org website needs to remove Python 2 references. Pull Request resolved: https://github.com/pytorch/pytorch/pull/36114 Differential Revision: D20901746 Pulled By: jlin27 fbshipit-source-id: 07f8dc8e6fab0b232e5048a63079cab0c433c85f	2020-04-07 16:13:18 -07:00
Orion Reblitz-Richardson	2d8dbcd3ef	Remove python2 and 3.5 from requirements.txt, README and docs (#35677 ) Summary: Some more cleanup now that we no longer support python2 or 3.5 on master and eventually PyTorch 1.6 release. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35677 Differential Revision: D20838097 Pulled By: orionr fbshipit-source-id: 95d553a1e8769f3baa395e0bc6d4ce7cd93236e9	2020-04-03 11:05:43 -07:00
Feng Tian	762270c51f	add c10d dynamic loading mechanism and unit test (#28068 ) Summary: Originally, PyTorch c10d supported only the built-in c10d backends, such as nccl/gloo/mpi. This patch extends c10d to support dynamically loading 3rd party communication libraries that are derived from the ProcessGroup base class. The related RFC is: https://github.com/pytorch/pytorch/issues/27955 This way, the user just needs to specify a 3rd party c10d backend name when invoking torch.distributed.init_process_group(). The proposed logic will try to load the corresponding c10d backend cpp extension automatically. For how to develop a new 3rd party c10d backend through a cpp extension, please refer to test/cpp_extensions/cpp_c10d_extension.cpp Pull Request resolved: https://github.com/pytorch/pytorch/pull/28068 Differential Revision: D19174838 Pulled By: agolynski fbshipit-source-id: 3409a504a43ce7260e6f9d1207c00e87471fac62	2020-04-02 15:46:51 -07:00
anjali411	c070e8fb26	Updated canCast to disallow complex -> non complex conversion (#35883 ) Summary: fixes https://github.com/pytorch/pytorch/issues/35675 Pull Request resolved: https://github.com/pytorch/pytorch/pull/35883 Differential Revision: D20818130 Pulled By: anjali411 fbshipit-source-id: c9b4b6112897639d1e9b7073c5dac7a29b9cd990	2020-04-02 15:12:38 -07:00
Rohan Varma	6616fad92e	[Docs] Fix typo in RPC docs (#35809 ) Summary: It's also fixed in the cherry pick PR https://github.com/pytorch/pytorch/pull/35808 Pull Request resolved: https://github.com/pytorch/pytorch/pull/35809 Differential Revision: D20803338 Pulled By: rohan-varma fbshipit-source-id: 1925f367703faf053ab4b1c0ff0acb86230c5d89	2020-04-01 21:16:12 -07:00
Dhiraj D Kalamkar	945d7a7408	Add All-to-all comms support to distributed module and MPI backend (#32361 ) Summary: As described in https://github.com/pytorch/pytorch/issues/32345, a prototype implementation to add an alltoall communication primitive to torch.distributed module and ProcessGroup abstract interface. Also, implements alltoall in ProcessGroupMPI backend. mnaumovfb JianpingChen066 dmudiger srinivas212 Jianhui-Li mshiryaev ftian1 cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini xush6528 osalpekar Pull Request resolved: https://github.com/pytorch/pytorch/pull/32361 Reviewed By: mrshenli Differential Revision: D20635481 Pulled By: srinivas212 fbshipit-source-id: 3dd0af800ce55d02f02813cde550e3a0f1a287d2	2020-04-01 08:57:12 -07:00
Rohan Varma	1f06db2579	Refactored rpc docs (#35109 ) Summary: Reorganize as per jlin27 's comments. Screenshots added in comments. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35109 Differential Revision: D20788774 Pulled By: rohan-varma fbshipit-source-id: 7d64be70ef76ed6ff303d05d39c338293c234766	2020-04-01 02:01:34 -07:00
Ilia Cherniavskii	bc6bd0bb1a	Debug Information Guard Summary: This diff fixes the issues with current handling of debug information passed along the execution of the model. (For example, it is possible that multiple calls to the debug guard may override each other) Test Plan: CI test/cpp/jit Reviewed By: dzhulgakov Differential Revision: D20602775 fbshipit-source-id: 4683957954028af81a1a0f1f12b243650230c9bb	2020-04-01 01:55:29 -07:00
Ilia Cherniavskii	800d5617c0	Recording of TorchScript functions (#34710 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34710 Extending RecordFunction API to support new recording scopes (such as TorchScript functions), as well as giving more flexibility to set sampling rate. Test Plan: unit test (test_misc.cpp/testRecordFunction) Reviewed By: gdankel, dzhulgakov Differential Revision: D20158523 fbshipit-source-id: a9e0819d21cc06f4952d92d43246587c36137582	2020-03-31 00:33:23 -07:00
Mike Ruberry	860790de88	Makes torch.real and torch.imag NumPy compatible, but disables them for complex tensors (#35560 ) Summary: The current implementations of torch.real and torch.imag are not NumPy compatible. In particular: - torch.real on a real tensor does not return the real tensor, like contiguous - torch.real on a complex tensor does not return a real-valued view of the real part - torch.imag on a complex tensor does not return a real-valued view of the imaginary part - torch.Tensor.real and torch.Tensor.imag exist as methods, but in NumPy they are writable attributes This PR makes the functions NumPy compatible by removing the method variants and out kwarg, restricting them to work on only real tensors, and updating the behavior of torch.real to return its input. New tests are added to test_torch.py to verify the behavior, a couple existing complex tests are skipped, and the documentation is updated to reflect the change. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35560 Differential Revision: D20714568 Pulled By: mruberry fbshipit-source-id: 5dd092f45757b620c8426c829dd15ee997246a26	2020-03-29 02:09:00 -07:00
pinzhenx	bd604cb5b7	Upgrade MKL-DNN to DNNL v1.2 (#32422 ) Summary: ## Motivation This PR upgrades MKL-DNN from v0.20 to DNNL v1.2 and resolves https://github.com/pytorch/pytorch/issues/30300. DNNL (Deep Neural Network Library) is the new brand of MKL-DNN, which improves performance, quality, and usability over the old version. This PR focuses on the migration of all existing functionalities, including minor fixes, performance improvement and code clean up. It serves as the cornerstone of our future efforts to accommodate new features like OpenCL support, BF16 training, INT8 inference, etc. and to let the PyTorch community derive more benefits from the Intel Architecture. <br> ## What's included? Even though DNNL has many breaking changes to the API, we managed to absorb most of them in ideep. This PR contains minimalist changes to the integration code in pytorch. Below is a summary of the changes: <br> General: 1. Replace op-level allocator with global-registered allocator ``` // before ideep::sum::compute<AllocForMKLDNN>(scales, {x, y}, z); // after ideep::sum::compute(scales, {x, y}, z); ``` The allocator is now being registered at `aten/src/ATen/native/mkldnn/IDeepRegistration.cpp`. Thereafter all tensors derived from the `cpu_engine` (by default) will use the c10 allocator. ``` RegisterEngineAllocator cpu_alloc( ideep::engine::cpu_engine(), [](size_t size) { return c10::GetAllocator(c10::DeviceType::CPU)->raw_allocate(size); }, [](void* p) { c10::GetAllocator(c10::DeviceType::CPU)->raw_deallocate(p); } ); ``` ------ 2. Simplify group convolution We had such a scenario in convolution where ideep tensor shape mismatched aten tensor: when `groups > 1`, DNNL expects weights tensors to be 5-d with an extra group dimension, e.g. `goihw` instead of `oihw` in 2d conv case. As shown below, a lot of extra checks came with this difference in shape before. Now we've completely hidden this difference in ideep and all tensors are going to align with pytorch's definition. 
So we could safely remove these checks from both aten and c2 integration code. ``` // aten/src/ATen/native/mkldnn/Conv.cpp if (w.ndims() == x.ndims() + 1) { AT_ASSERTM( groups > 1, "Only group _mkldnn_conv2d weights could have been reordered to 5d"); kernel_size[0] = w.get_dim(0) * w.get_dim(1); std::copy_n( w.get_dims().cbegin() + 2, x.ndims() - 1, kernel_size.begin() + 1); } else { std::copy_n(w.get_dims().cbegin(), x.ndims(), kernel_size.begin()); } ``` ------ 3. Enable DNNL built-in cache Previously, we stored DNNL jitted kernels along with intermediate buffers inside ideep using an LRU cache. Now we are switching to the newly added DNNL built-in cache, and no longer caching buffers in order to reduce memory footprint. This change will be mainly reflected in lower memory usage from memory profiling results. On the code side, we removed a couple of lines of `op_key_` that depended on the ideep cache before. ------ 4. Use 64-bit integer to denote dimensions We changed the type of `ideep::dims` from `vector<int32_t>` to `vector<int64_t>`. This renders ideep dims no longer compatible with 32-bit dims used by caffe2. So we use something like `{stride_.begin(), stride_.end()}` to cast parameter `stride_` into an int64 vector. <br> Misc changes in each commit: Commit: change build options Some build options were slightly changed, mainly to avoid name collisions with other projects that include DNNL as a subproject. In addition, DNNL built-in cache is enabled by option `DNNL_ENABLE_PRIMITIVE_CACHE`. 
Old \| New -- \| -- WITH_EXAMPLE \| MKLDNN_BUILD_EXAMPLES WITH_TEST \| MKLDNN_BUILD_TESTS MKLDNN_THREADING \| MKLDNN_CPU_RUNTIME MKLDNN_USE_MKL \| N/A (MKL is no longer used) ------ Commit: aten reintegration - aten/src/ATen/native/mkldnn/BinaryOps.cpp Implement binary ops using new operation `binary` provided by DNNL - aten/src/ATen/native/mkldnn/Conv.cpp Clean up group convolution checks Simplify conv backward integration - aten/src/ATen/native/mkldnn/MKLDNNConversions.cpp Simplify prepacking convolution weights - test/test_mkldnn.py Fixed an issue in conv2d unit test: it didn't check conv results between mkldnn and aten implementation before. Instead, it compared mkldnn with mkldnn, as the default cpu path also goes into mkldnn. Now we use `torch.backends.mkldnn.flags` to fix this issue - torch/utils/mkldnn.py Prepack weight tensor on module `__init__` to achieve significantly better performance ------ Commit: caffe2 reintegration - caffe2/ideep/ideep_utils.h Clean up unused type definitions - caffe2/ideep/operators/adam_op.cc & caffe2/ideep/operators/momentum_sgd_op.cc Unify tensor initialization with `ideep::tensor::init`. Obsolete `ideep::tensor::reinit` - caffe2/ideep/operators/conv_op.cc & caffe2/ideep/operators/quantization/int8_conv_op.cc Clean up group convolution checks Revamp convolution API - caffe2/ideep/operators/conv_transpose_op.cc Clean up group convolution checks Clean up deconv workaround code ------ Commit: custom allocator - Register c10 allocator as mentioned above <br><br> ## Performance We tested inference on some common models based on user scenarios, and most performance numbers are either better than or on par with DNNL 0.20. 
ratio: new / old \| Latency (batch=1 4T) \| Throughput (batch=64 56T) -- \| -- \| -- pytorch resnet18 \| 121.4% \| 99.7% pytorch resnet50 \| 123.1% \| 106.9% pytorch resnext101_32x8d \| 116.3% \| 100.1% pytorch resnext50_32x4d \| 141.9% \| 104.4% pytorch mobilenet_v2 \| 163.0% \| 105.8% caffe2 alexnet \| 303.0% \| 99.2% caffe2 googlenet-v3 \| 101.1% \| 99.2% caffe2 inception-v1 \| 102.2% \| 101.7% caffe2 mobilenet-v1 \| 356.1% \| 253.7% caffe2 resnet101 \| 100.4% \| 99.8% caffe2 resnet152 \| 99.8% \| 99.8% caffe2 shufflenet \| 141.1% \| 69.0% † caffe2 squeezenet \| 98.5% \| 99.2% caffe2 vgg16 \| 136.8% \| 100.6% caffe2 googlenet-v3 int8 \| 100.0% \| 100.7% caffe2 mobilenet-v1 int8 \| 779.2% \| 943.0% caffe2 resnet50 int8 \| 99.5% \| 95.5% _Configuration: Platform: Skylake 8180 Latency Test: 4 threads, warmup 30, iteration 500, batch size 1 Throughput Test: 56 threads, warmup 30, iteration 200, batch size 64_ † Shufflenet is one of the few models that require temp buffers during inference. The performance degradation is an expected issue since we no longer cache any buffer in the ideep. As for the solution, we suggest users opt for caching allocator like jemalloc as a drop-in replacement for system allocator in such heavy workloads. Pull Request resolved: https://github.com/pytorch/pytorch/pull/32422 Test Plan: Perf results: https://our.intern.facebook.com/intern/fblearner/details/177790608?tab=Experiment%20Results 10% improvement for ResNext with avx512, neutral on avx2 More results: https://fb.quip.com/ob10AL0bCDXW#NNNACAUoHJP Reviewed By: yinghai Differential Revision: D20381325 Pulled By: dzhulgakov fbshipit-source-id: 803b906fd89ed8b723c5fcab55039efe3e4bcb77	2020-03-26 22:07:59 -07:00
Ailing Zhang	7580470cc5	Update view op list. (#35399 ) Summary: Adding ops to the list based on our discussion. :D Pull Request resolved: https://github.com/pytorch/pytorch/pull/35399 Differential Revision: D20651393 Pulled By: ailzhang fbshipit-source-id: 8cf9026d10c0d74117953dbb68ebc2f537be956a	2020-03-25 16:15:00 -07:00
Yuichiro Ueno	aadd0fda8b	Document reduce_scatter collective operation (#35274 ) Summary: I don't know why reduce_scatter collective operation is not documented so I add it to the document. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35274 Differential Revision: D20645850 Pulled By: mrshenli fbshipit-source-id: 0a4458bff1a4e15a4593dd4dcc25e4e0f6e2265d	2020-03-25 13:36:29 -07:00
anjali411	c73e97033a	Added type promotion logic for complex numbers (#34093 ) Summary: Issue: https://github.com/pytorch/pytorch/issues/33780 After this PR: 1. dtype promotion logic will correctly work for ops involving complex scalars 2. added alias for complex64 (cfloat) and complex128 (cdouble) 3. added an internal function get_complex_default_dtype (consciously not exposed in public API) - sets the default complex dtype to be double if default_dtype is set to double, else float https://github.com/pytorch/pytorch/pull/34093#discussion_r392350224 >>> 1j*torch.ones(2) tensor([(0.0000 + 1.0000j), (0.0000 + 1.0000j)], dtype=torch.complex64) >>> torch.set_default_dtype(torch.float64) >>> 1j*torch.ones(2) tensor([(0.0000 + 1.0000j), (0.0000 + 1.0000j)], dtype=torch.complex128) >>> 1j + torch.ones(2) tensor([(1.0000 + 1.0000j), (1.0000 + 1.0000j)], dtype=torch.complex128) >>> torch.tensor(1j) + torch.ones(2,2) tensor([[(1.0000 + 1.0000j), (1.0000 + 1.0000j)], [(1.0000 + 1.0000j), (1.0000 + 1.0000j)]], dtype=torch.complex128) Pull Request resolved: https://github.com/pytorch/pytorch/pull/34093 Differential Revision: D20537125 Pulled By: anjali411 fbshipit-source-id: 05fb1f81b8ba039d0b698cdd2c0bbf8b0ce0b767	2020-03-25 09:12:21 -07:00
Michael Carilli	0f0271e255	[RELAND2] Eager autocasting, out-of-place ops only (with MSVC 2017 fix) (#35102 ) Summary: This is the second reland attempt for https://github.com/pytorch/pytorch/pull/32140. The first reland attempt https://github.com/pytorch/pytorch/pull/35011 failed due to a [small incompatible change](https://github.com/pytorch/pytorch/pull/35011#issuecomment-601754216) in recent master (`skipIfRocm` was removed from `test_data_parallel.py`). The present PR restores skipIfRocm. Description from first reland attempt https://github.com/pytorch/pytorch/pull/35011: > https://github.com/pytorch/pytorch/pull/32140 was approved and merged, but [reverted](`d0577e19f0`) because it broke builds with versions of Visual Studio older than 15.8 that were not represented in public CI. The build failures were caused by a [known VS bug](https://developercommunity.visualstudio.com/content/problem/27729/allow-function-with-internal-linkage-as-template-n.html), fixed in versions 15.8 and newer. > > The present PR reverts the revert (restoring https://github.com/pytorch/pytorch/pull/32140 's diffs) and adds a workaround to enable compilation with VS < 15.8. The workaround isn't pretty, but it's guarded by macros such that it's only used when compiling with VS < 15.8. All other builds compile with the same code/control flow as was merged in https://github.com/pytorch/pytorch/pull/32140. > > Original description of https://github.com/pytorch/pytorch/pull/32140: > > Initial integration of eager autocasting, supporting out-of-place ops only for easier review. > Relevant issue/RFC: https://github.com/pytorch/pytorch/issues/25081 > > > In-place ops and ops with user-supplied out=... can certainly be supported as well (my initial WIP https://github.com/pytorch/pytorch/issues/29552 handled many) but require substantially more complex special casing in the autocasting backend and tests. Support for these ops (much of which has already been written) will be broken into later PRs. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35102 Differential Revision: D20596918 Pulled By: ezyang fbshipit-source-id: 60caa279bb0ce4a9bb0b28c1d585d42cf1cc7e50	2020-03-24 09:08:04 -07:00
Mike Ruberry	7c1ea736ba	Extends true_divide to be a method (#34794 ) Summary: Per title. See related https://github.com/pytorch/pytorch/pull/34570. In PyTorch 1.7 the plan is for torch.div and Python's division operator to perform "true" division, like Python 3, JAX, and NumPy. To facilitate this change, this PR expands true_divide to be a method so it can cover all of torch.div's use cases. New true_divide tests are added to test_torch.py, test_type_promotion.py, and test_sparse.py. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34794 Differential Revision: D20545507 Pulled By: mruberry fbshipit-source-id: 55286f819716c8823d1930441a69008560ac2bd5	2020-03-23 23:12:23 -07:00
Peter Bell	bd0ef784e0	FAQ: Add note about recovering from OOM (#35214 ) Summary: Closes https://github.com/pytorch/pytorch/issues/18853 This documents the workaround needed to solve the issues in https://github.com/pytorch/pytorch/issues/18853 Pull Request resolved: https://github.com/pytorch/pytorch/pull/35214 Differential Revision: D20604877 Pulled By: ezyang fbshipit-source-id: 71ed13cfa567d8e88fa9f18180a171cd174fb528	2020-03-23 20:22:46 -07:00
Vitaly Fedyunin	40da01db6a	Add docs about memory format (#34818 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34818 Test Plan: Imported from OSS Differential Revision: D20601336 Pulled By: VitalyFedyunin fbshipit-source-id: d34ad226be950bf134c6b383a4810ea6aa75599e	2020-03-23 15:06:33 -07:00
Jerry Zhang	3fa7813b9f	[quant] Add dequantize.tensors (#34348 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34348 We need this function to do swap dequantize for prim::ListConstruct since the output of prim::ListConstruct is a list of Tensors Test Plan: . Imported from OSS Differential Revision: D20504454 fbshipit-source-id: e6155e37da98e2219a6f79737cd46fe32a509c9f	2020-03-20 22:51:51 -07:00
Xiang Gao	df8d6eeb19	Update docs about DP and DDP for CUDA (#35063 ) Summary: We should recommend DDP instead of DP. Hope we can also cherry-pick this for 1.5 Pull Request resolved: https://github.com/pytorch/pytorch/pull/35063 Differential Revision: D20549621 Pulled By: ngimel fbshipit-source-id: 86b1b2134664065cc6070ea4212895f993eaf543	2020-03-20 20:06:37 -07:00
Mike Ruberry	fe276d541e	Revert D20541921: [pytorch][PR] [RELAND] Eager autocasting, out-of-place ops only (with MSVC 2017 fix) Test Plan: revert-hammer Differential Revision: D20541921 Original commit changeset: abb5488dca86 fbshipit-source-id: d2c6038978f80e5429632f8b49107090a8a247f4	2020-03-19 22:39:12 -07:00
Michael Carilli	991b97277a	[RELAND] Eager autocasting, out-of-place ops only (with MSVC 2017 fix) (#35011 ) Summary: https://github.com/pytorch/pytorch/pull/32140 was approved and merged, but [reverted](`d0577e19f0`) because it broke builds with versions of Visual Studio older than 15.8 that were not represented in public CI. The build failures were caused by a [known VS bug](https://developercommunity.visualstudio.com/content/problem/27729/allow-function-with-internal-linkage-as-template-n.html), fixed in versions 15.8 and newer. The present PR reverts the revert (restoring https://github.com/pytorch/pytorch/pull/32140 's diffs) and adds a workaround to enable compilation with VS < 15.8. The workaround isn't pretty, but it's guarded by macros such that it's only used when compiling with VS < 15.8. All other builds compile with the same code/control flow as was merged in https://github.com/pytorch/pytorch/pull/32140. Original description of https://github.com/pytorch/pytorch/pull/32140: > Initial integration of eager autocasting, supporting out-of-place ops only for easier review. Relevant issue/RFC: https://github.com/pytorch/pytorch/issues/25081 > In-place ops and ops with user-supplied out=... can certainly be supported as well (my initial WIP https://github.com/pytorch/pytorch/issues/29552 handled many) but require substantially more complex special casing in the autocasting backend and tests. Support for these ops (much of which has already been written) will be broken into later PRs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35011 Differential Revision: D20541921 Pulled By: ezyang fbshipit-source-id: abb5488dca8620b0daac4306ebf2bb47fc36e4f5	2020-03-19 20:18:18 -07:00
albanD	1f4a4aaf64	functional autograd api (#34066 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34066 Basic implementation of https://github.com/pytorch/pytorch/issues/30632 Test Plan: Imported from OSS Differential Revision: D20260307 Pulled By: albanD fbshipit-source-id: 7db5c2411ddc3e954ff8fbbe93eb3b96a2bcfb2f	2020-03-19 08:24:07 -07:00
Mike Ruberry	9c4683e8e3	Revert D20312366: [pytorch][PR] Added type promotion logic for complex numbers Test Plan: revert-hammer Differential Revision: D20312366 Original commit changeset: 90f00a1a916d fbshipit-source-id: 4510739a888b2eec5d8a72e792998ac46da6d82a	2020-03-19 05:55:57 -07:00
anjali411	c8f665dcb6	Added type promotion logic for complex numbers (#34093 ) Summary: Issue: https://github.com/pytorch/pytorch/issues/33780 After this PR: 1. dtype promotion logic will correctly work for ops involving complex scalars 2. torch.ComplexFloatTensor, torch.ComplexDoubleTensor works 3. added alias for complex64 (cfloat) and complex128 (cdouble) 4. added an internal function get_complex_default_dtype (consciously not exposed in public API) >>> 1j*torch.ones(2) tensor([(0.0000 + 1.0000j), (0.0000 + 1.0000j)], dtype=torch.complex64) >>> torch.set_default_dtype(torch.float64) >>> 1j*torch.ones(2) tensor([(0.0000 + 1.0000j), (0.0000 + 1.0000j)], dtype=torch.complex128) >>> 1j + torch.ones(2) tensor([(1.0000 + 1.0000j), (1.0000 + 1.0000j)], dtype=torch.complex128) >>> torch.tensor(1j) + torch.ones(2,2) tensor([[(1.0000 + 1.0000j), (1.0000 + 1.0000j)], [(1.0000 + 1.0000j), (1.0000 + 1.0000j)]], dtype=torch.complex128) Pull Request resolved: https://github.com/pytorch/pytorch/pull/34093 Differential Revision: D20312366 Pulled By: anjali411 fbshipit-source-id: 90f00a1a916d9c8eeda101eb6e9d250fce569815	2020-03-18 23:36:13 -07:00
Mike Ruberry	3b7e1cd2cc	Makes floor_divide a method, adds sparse floor division (#34552 ) Summary: (Updated per review feedback) `torch.floor_divide` is currently a function that can operate on two tensors or a tensor and a scalar (scalar x scalar floor division is handled natively by Python and the JIT has a builtin function for it). This PR updates it to: - have an out variant: `floor_divide(x, y, out=z)` - be a method on a tensor: `x.floor_divide(y)` - have an in-place variant: `x.floor_divide_(y)` - work with sparse tensors Tests are added to test_sparse.py and test_torch.py for these new behaviors. In addition, this PR: - cleans up the existing sparse division and true_division code and improves their error message - adds testing of sparse true_division to test_sparse.py - extends existing floor_divide testing in test_torch to run on CUDA, too, not just the CPU Unfortunately, making floor_divide a method requires breaking backwards compatibility, and floor_divide has been added to the BC whitelist since this is intentional. The BC issue is that the first parameter name to torch.floor_divide is changing from input to self. If you previously called torch.floor_divide with keyword arguments, e.g. torch.floor_divide(input=x, other=y), you will need to update to torch.floor_divide(self=x, other=y), or the more common torch.floor_divide(x, y). The intent of this PR is to allow floor_divide to be substituted for division (torch.div, /) wherever division was previously used. In 1.6 we expect torch.div to perform true_division, and floor_divide is how users can continue to perform integer division with tensors. There are two potential follow-up issues suggested by this PR: - the test framework might benefit from additional tensor construction classes, like one to create dividends and divisors for multiple dtypes - the test framework might benefit from a universal function test class. 
while methods have reasonable coverage as part of test_torch.py's TestTensorOp tests, function coverage is spotty. Universal functions are similar enough it should be possible to generate tests for them. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34552 Differential Revision: D20509850 Pulled By: mruberry fbshipit-source-id: 2cd3c828aad67191c77f2ed8470411e246f604f8	2020-03-18 15:00:53 -07:00
Edward Yang	d0577e19f0	Revert D20346700: [pytorch][PR] Eager autocasting, out-of-place ops only Test Plan: revert-hammer Differential Revision: D20346700 Original commit changeset: 12d77b391731 fbshipit-source-id: 108d72bf24232f443c0be293ec932c0c478d6a60	2020-03-18 11:42:51 -07:00
Michael Carilli	aaa8f02156	Eager autocasting, out-of-place ops only (#32140 ) Summary: Initial integration of eager autocasting, supporting out-of-place ops only for easier review. Relevant issue/RFC: https://github.com/pytorch/pytorch/issues/25081 In-place ops and ops with user-supplied `out=...` can certainly be supported as well (my initial WIP https://github.com/pytorch/pytorch/pull/29552 handled many) but require substantially more complex special casing in the autocasting backend and tests. Support for these ops (much of which has already been written) will be broken into later PRs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/32140 Differential Revision: D20346700 Pulled By: ezyang fbshipit-source-id: 12d77b3917310186fbddf11c59b2794dc859131f	2020-03-18 10:28:21 -07:00
Mike Ruberry	a1eaaea288	Revert D20497453: [pytorch][PR] Makes floor_divide a method, adds sparse floor division Test Plan: revert-hammer Differential Revision: D20497453 Original commit changeset: ac326f2007d8 fbshipit-source-id: b94b89b1a25521506e3d0a6b072d3d4d8c55e63d	2020-03-18 01:48:50 -07:00
Mike Ruberry	b7129050e7	Makes floor_divide a method, adds sparse floor division (#34552 ) Summary: (Updated per review feedback) `torch.floor_divide` is currently a function that can operate on two tensors or a tensor and a scalar (scalar x scalar floor division is handled natively by Python and the JIT has a builtin function for it). This PR updates it to: - have an out variant: `floor_divide(x, y, out=z)` - be a method on a tensor: `x.floor_divide(y)` - have an in-place variant: `x.floor_divide_(y)` - work with sparse tensors Tests are added to test_sparse.py and test_torch.py for these new behaviors. In addition, this PR: - cleans up the existing sparse division and true_division code and improves their error message - adds testing of sparse true_division to test_sparse.py - extends existing floor_divide testing in test_torch to run on CUDA, too, not just the CPU Unfortunately, making floor_divide a method requires breaking backwards compatibility, and floor_divide has been added to the BC whitelist since this is intentional. The BC issue is that the first parameter name to torch.floor_divide is changing from input to self. If you previously called torch.floor_divide with keyword arguments, e.g. torch.floor_divide(input=x, other=y), you will need to update to torch.floor_divide(self=x, other=y), or the more common torch.floor_divide(x, y). The intent of this PR is to allow floor_divide to be substituted for division (torch.div, /) wherever division was previously used. In 1.6 we expect torch.div to perform true_division, and floor_divide is how users can continue to perform integer division with tensors. There are two potential follow-up issues suggested by this PR: - the test framework might benefit from additional tensor construction classes, like one to create dividends and divisors for multiple dtypes - the test framework might benefit from a universal function test class. 
while methods have reasonable coverage as part of test_torch.py's TestTensorOp tests, function coverage is spotty. Universal functions are similar enough it should be possible to generate tests for them. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34552 Differential Revision: D20497453 Pulled By: mruberry fbshipit-source-id: ac326f2007d8894f730d1278fef84d63bcb07b5d	2020-03-18 00:01:45 -07:00
Shen Li	3c48aadd98	Update descriptions for transmitting CUDA tensors (#34888 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34888 Test Plan: Imported from OSS Differential Revision: D20491408 Pulled By: mrshenli fbshipit-source-id: 4ca35ac9edd4c1af4f2bae2cfb0f1f6060658d5c	2020-03-17 17:43:48 -07:00
Shen Li	800bdcf000	Removing experimental tag for RPC and adding experimental tag for RPC+TorchScript (#34887 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34887 Test Plan: Imported from OSS Differential Revision: D20491409 Pulled By: mrshenli fbshipit-source-id: ce79c9706eb70a3a52a4032de4f0bd538b694332	2020-03-17 17:43:42 -07:00
Hameer Abbasi	6b701de130	Add types argument to __torch_function__ (#34303 ) Summary: This PR adds the `types` argument to `__torch_function__` as per RFC 0001: https://github.com/pytorch/rfcs/pull/3 Pull Request resolved: https://github.com/pytorch/pytorch/pull/34303 Differential Revision: D20474992 Pulled By: ezyang fbshipit-source-id: cdd40b3b38f3bda4ece8812a629f5db87e919d01	2020-03-17 13:32:00 -07:00
Pearu Peterson	8bae1ed144	PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem - copy (#34721 ) Summary: This is a copy of PR https://github.com/pytorch/pytorch/issues/29488 to help the merging process. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34721 Differential Revision: D20444270 Pulled By: vincentqb fbshipit-source-id: 042c56c8c0dae37834f52b4aee2deae7dd6fa659	2020-03-16 14:13:30 -07:00
Rohan Varma	fd35596585	[docs][1.5] Update distributed autograd note (#34657 ) Summary: - Update API calls `backward` and `optim.step` now that we require `context_id` - Add notes to clarify purpose of distributed autograd context (this was a source of confusion in some feedback) - Add note that details why optimizer requires context_id - Clearly specify that we don't have SMART mode yet Pull Request resolved: https://github.com/pytorch/pytorch/pull/34657 Differential Revision: D20427667 Pulled By: rohan-varma fbshipit-source-id: 5f8a3539ccf648a78e9e9a0dfdfe389c678b1606	2020-03-12 22:56:32 -07:00
gabloa	a74fbea345	Continuous bernoulli distribution (take 2) (#34619 ) Summary: We recently had a NeurIPS paper (https://arxiv.org/abs/1907.06845 and https://papers.nips.cc/paper/9484-the-continuous-bernoulli-fixing-a-pervasive-error-in-variational-autoencoders) where we introduce a new [0,1]-supported distribution: the continuous Bernoulli. This pull request implements this distribution in pytorch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34619 Differential Revision: D20403123 Pulled By: ngimel fbshipit-source-id: d807c7d0d372c6daf6cb6ef09df178bc7491abb2	2020-03-12 11:53:18 -07:00
Nathan Goldbaum	3f1ba3c465	Redo of "Add API for listing functions overridable by __torch_function__" (#34240 ) Summary: This is a redo of https://github.com/pytorch/pytorch/pull/33791, which was reverted because it introduced a flaky test. The test was flaky and only flaky on Python3.5 because of dict order randomization. I've fixed the issue with tests clobbering each other in `b539fec` and removed the override tests for `torch.nn.functional.tanh` and `torch.nn.functional.sigmoid`, which are deprecated and shouldn't be overridable in `e0d7402`. I also verified that no more test clobbering is happening. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34240 Differential Revision: D20252442 Pulled By: cpuhrsch fbshipit-source-id: 069568e342a41c90e1dc76cbf85ba4aed47f24be	2020-03-12 10:33:17 -07:00
Michael Suo	c235be42dd	[jit] kill script namespace (#34515 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34515 Once upon a time we thought this was necessary. In reality it is not, so removing it. For backcompat, our public interface (defined in `api/`) still has typedefs to the old `script::` names. There was only one collision: `Pass` as a `Stmt` and `Pass` as a graph transform. I renamed one of them. Test Plan: Imported from OSS Differential Revision: D20353503 Pulled By: suo fbshipit-source-id: 48bb911ce75120a8c9e0c6fb65262ef775dfba93	2020-03-11 23:32:48 -07:00
Samuel	b039bca4db	Fix typo in data.rst (#34624 ) Summary: Fix minor typo Pull Request resolved: https://github.com/pytorch/pytorch/pull/34624 Differential Revision: D20401946 Pulled By: ngimel fbshipit-source-id: 0c6a7d838aa15120b3ecb8b9ba4b57550c9bcd32	2020-03-11 19:40:18 -07:00
Edward Yang	4b929e5466	Revert D20193196: [pytorch][PR] PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem Test Plan: revert-hammer Differential Revision: D20193196 Original commit changeset: 78a487991242 fbshipit-source-id: 8da4f8cb17c45af41e8c0ce80bc72581eb10dbb8	2020-03-11 09:24:34 -07:00
Pearu Peterson	2ec779d46c	PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem (#29488 ) Summary: This PR implements the following linear algebra algorithms for low-rank matrices: - [x] Approximate `A` as `Q Q^H A` - using Algorithm 4.4 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061). + exposed as `torch.lowrank.get_approximate_basis(A, q, niter=2, M=None) -> Q` + [x] dense matrices + [x] batches of dense matrices + [x] sparse matrices + [x] documentation - [x] SVD - using Algorithm 5.1 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061). + uses `torch.lowrank.get_approximate_basis` + exposed as `torch.svd_lowrank(A, q=6, niter=2, M=None) -> (U, S, V)` + [x] dense matrices + [x] batches of dense matrices + [x] sparse matrices + [x] documentation - [x] PCA - using `torch.svd_lowrank` + uses `torch.svd_lowrank` + exposed as `torch.pca_lowrank(A, center=True, q=None, niter=2) -> (U, S, V)` + [x] dense matrices + [x] batches of dense matrices + [x] sparse matrices, uses non-centered sparse matrix algorithm + [x] documentation - [x] generalized eigenvalue solver using the original LOBPCG algorithm [Knyazev, 2001](https://epubs.siam.org/doi/abs/10.1137/S1064827500366124) + exposed as `torch.lobpcg(A, B=None, k=1, method="basic", ...)` + [x] dense matrices + [x] batches of dense matrices + [x] sparse matrices + [x] documentation - [x] generalized eigenvalue solver using robust LOBPCG with orthogonal basis selection [Stathopoulos, 2002](https://epubs.siam.org/doi/10.1137/S1064827500370883) + exposed as `torch.lobpcg(A, B=None, k=1, method="ortho", ...)` + [x] dense matrices + [x] batches of dense matrices + [x] sparse matrices + [x] documentation - [x] generalized eigenvalue solver using the robust and efficient LOBPCG Algorithm 8 from [Duersch et al, 2018](https://epubs.siam.org/doi/abs/10.1137/17M1129830) that switches to orthogonal basis selection automatically + the "ortho" method improves iterations so rapidly 
that in the current test cases it does not make sense to use the basic iterations at all. If users have matrices for which basic iterations could improve convergence, then the `tracker` argument allows breaking the iteration process at the user's choice so that the user can switch to the orthogonal basis selection if needed. In conclusion, there is no need to implement Algorithm 8 at this point. - [x] benchmarks + [x] `torch.svd` vs `torch.svd_lowrank`, see notebook [Low-rank SVD](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/Low-rank%20SVD.ipynb). In conclusion, the low-rank SVD is going to be useful only for large sparse matrices where the full-rank SVD will fail due to memory limitations. + [x] `torch.lobpcg` vs `scipy.sparse.linalg.lobpcg`, see notebook [LOBPCG - pytorch vs scipy](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/LOBPCG%20-%20pytorch%20vs%20scipy.ipynb). In conclusion, both implementations give the same results (up to numerical errors from different methods), but the scipy lobpcg implementation is generally faster. + [x] On very small tolerance cases, `torch.lobpcg` is more robust than `scipy.sparse.linalg.lobpcg` (see `test_lobpcg_scipy` results) Resolves https://github.com/pytorch/pytorch/issues/8049. Pull Request resolved: https://github.com/pytorch/pytorch/pull/29488 Differential Revision: D20193196 Pulled By: vincentqb fbshipit-source-id: 78a4879912424595e6ea95a95e483a37487a907e	2020-03-11 07:33:49 -07:00
Mike Ruberry	3671036ef3	Adds true_divide function, analogous to Python's, JAX's, NumPy's (true) division (#34236 ) Summary: See NumPy's division documentation here: https://numpy.org/doc/1.18/reference/generated/numpy.divide.html#numpy.divide. True division is the same as PyTorch's default division except when both inputs are integer or bool tensors. In the latter case the inputs are (conceptually) cast to the default floating type before the division is performed. The function is implemented for dense and sparse tensors and supports exporting to ONNX from PyTorch's eager mode or JIT traces. The function is inherently incompatible with exporting to ONNX via JIT script, and is another datapoint suggesting we should deprecate exporting scripted graphs to ONNX. Tests are added for the type promotion, named tensor, and ONNX export behavior. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34236 Reviewed By: houseroad Differential Revision: D20334087 Pulled By: mruberry fbshipit-source-id: 83d00d886f46f713215d7d9e02ffd043164c57f1	2020-03-09 21:06:33 -07:00
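The commit above describes true division as analogous to Python's own `/` operator. As a rough illustration of that analogy only, the sketch below uses plain Python numbers, not tensors, to contrast true division with floor division. The helper names `true_divide` and `floor_divide` are local to this sketch and are not the PyTorch functions; how the PyTorch functions round negative quotients is not stated in the commit, so check the library documentation for that.

```python
# Plain-Python illustration of true vs. floor division.
# Assumption: these helpers only mirror Python's `/` and `//` operators;
# they do not call PyTorch and say nothing about tensor rounding rules.

def true_divide(a, b):
    """Python's `/`: integer (and bool) inputs still yield a float result."""
    return a / b


def floor_divide(a, b):
    """Python's `//`: the quotient is rounded toward negative infinity."""
    return a // b


if __name__ == "__main__":
    print(true_divide(7, 2))      # 3.5
    print(floor_divide(7, 2))     # 3
    print(floor_divide(-7, 2))    # -4
    print(true_divide(True, 2))   # 0.5
```

The last line shows the case the commit singles out: when both operands are integer or bool, true division still produces a floating-point result.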
Kamil Wojcicki	65bad41cbe	Fixed typos in quantization docs / docstrings (#34182 ) Summary: Removed extra back quote character. Pull Request resolved: https://github.com/pytorch/pytorch/pull/34182 Differential Revision: D20320146 Pulled By: jerryzh168 fbshipit-source-id: 33c347711a052cc55f7d1a41ed959dadf99a3d7d	2020-03-06 21:53:52 -08:00
Duncan Riach	516a587438	Enhance reproducibility documentation (#33795 ) Summary: Improves explanation of non-determinism when running on GPUs. Adds info about `torch.nn.BCELoss` operating non-deterministically on GPUs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/33795 Differential Revision: D20284880 Pulled By: ngimel fbshipit-source-id: d543959636d261a80c234150304344b19a37ba5d	2020-03-06 15:32:04 -08:00
Elias Ellison	479c3b0aa5	[JIT] add support for torch.norm (#33783 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33783 Fix for https://github.com/pytorch/pytorch/issues/20113 Test Plan: Imported from OSS Differential Revision: D20121917 Pulled By: eellison fbshipit-source-id: ffedcc40678cd80f5529ff9323088eed544e5158	2020-03-05 14:46:24 -08:00
Shen Li	ac6e75a165	Revert D20195053: [pytorch][PR] Add API for listing functions overridable by __torch_function__ Test Plan: revert-hammer Differential Revision: D20195053 Original commit changeset: 1585f4e405f5 fbshipit-source-id: 3c1aab9c60e3138d40d200ae4238bda0cddf8896	2020-03-04 10:13:54 -08:00
peter	5f4a01b2ea	Update MAGMA to 2.5.2 for Windows (#34205 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34205 Differential Revision: D20248224 Pulled By: soumith fbshipit-source-id: f5e0fe06aa8f8ee551abe45db1d55d06e95ab928	2020-03-04 08:28:09 -08:00
Jessica Lin	6d78882158	Add layout.html to template for stable docs (#33770 ) Summary: When docs are built, conf.py points to a _templates-stable/layout.html that does not exist. Adding this file here so future stable docs will build with Google Analytics tags and without the unstable able that is in _templates/layout.html Pull Request resolved: https://github.com/pytorch/pytorch/pull/33770 Differential Revision: D20164895 Pulled By: jlin27 fbshipit-source-id: 5fca9f9b825b1484dab52e2b2d91f92ae6372371	2020-03-04 03:14:52 -08:00
Shen Li	3af0dffe84	Use double quotes in C++ to stay consistent with Python RPC docs (#34095 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34095 Test Plan: Imported from OSS Differential Revision: D20227343 Pulled By: mrshenli fbshipit-source-id: 69c556beee1f9e944eb1053b5ff0ac368dd99c60	2020-03-03 16:44:30 -08:00
Shen Li	f1085a8e41	Improve ProcessGroup RpcBackendOptions Constructor API (#34081 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34081 Before this commit, applications have to do the following to configure number of threads in ProcessGroup RPC backend: ``` op = ProcessGroupRpcBackendOptions() op.rpc_timeout = rpc_timeout op.init_method = init_method op.num_send_recv_threads = 32 init_rpc(...., rpc_backend_options=op) ``` After this commit, it can be simplified to: ``` init_rpc(...., rpc_backend_options=ProcessGroupRpcBackendOptions(num_send_recv_threads=32)) ``` Fixes #34075 Test Plan: Imported from OSS Differential Revision: D20227344 Pulled By: mrshenli fbshipit-source-id: def4318e987179b8c8ecca44d7ff935702c8a6e7	2020-03-03 16:43:29 -08:00
Nathan Goldbaum	ad2825a2c9	Add API for listing functions overridable by __torch_function__ (#33791 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/33182 This adds private API functions that developers of types that implement `__torch_function__` can use to ensure full coverage of the subset of the PyTorch API that can be overridden. I've refactored some of the code in the tests into a new `torch._overrides.get_overridable_functions` function. I've also changed `TENSOR_LIKE_TORCH_OVERRIDES` into `torch._overrides.get_testing_overrides` and `IGNORED_TORCH_FUNCTIONS` into `torch._overrides.get_ignored_functions`. Making these two static global variables in the tests into functions should allow rewriting their implementation to construct their return values instead of just statically defining the return value as is done here. Currently that is blocked on not being able to inspect function signatures of compiled kernels in PyTorch (see https://github.com/pytorch/pytorch/issues/28233). See the docs I've added for usage examples of these new functions. I also refactored the existing override tests to make use of these new functions, which should be a good forcing function to make sure they're kept up-to-date. Finally, while working on this I discovered that `TestTorchFunctionOverrides.test_mean` and `TestTorchFunctionOverrides.test_mm` weren't ever being run because they were getting clobbered by the other dynamically generated override tests. I fixed that by renaming the tests and then fixing the actual test code. I've verified that all the subclassing semantics are correct and that the updated test answers are correct. I'm happy to put the fixes to the existing tests in as a separate pull request if that would be easier to review. ping cpuhrsch since the feature request originally came from them. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33791 Differential Revision: D20195053 Pulled By: cpuhrsch fbshipit-source-id: 1585f4e405f5223932b410eae03a288dc8eb627e	2020-03-03 12:40:34 -08:00
Moto Hira	6631c2a627	[doc] Add grad context manager doc to toplevel torch module. (#33877 ) Summary: fixes https://github.com/pytorch/pytorch/issues/32014 Pull Request resolved: https://github.com/pytorch/pytorch/pull/33877 Differential Revision: D20141801 Pulled By: albanD fbshipit-source-id: bac713382a71666dd5e2499f710c51a55cc579ba	2020-03-02 06:32:36 -08:00
Basil Hosmer	ad769d74d9	Collapse _like overloads into a single overload. (#33705 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33705 The fact that there were two overloads appears to be a historical artifact that dates back to when goldsborough originally added these bindings in the first place. If TensorOptions is made optional, then you only need one overload, not two, as they are exactly redundant with each other. When MemoryFormat was added, it was made a little harder to do this, as the C++ syntax at::empty_like(t, memory_format) would not work if you collapsed the overload; but now it works because TensorOptions supports MemoryFormat. The upshot is, I can get rid of all the overloads and just have one overload. Amazingly, this change is backwards compatible, as the test attests. While I was at it, I also deleted the overload name from the functions entirely. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D20073355 Pulled By: bhosmer fbshipit-source-id: c6a8908213b32ccf6737ea864d135e2cce34f56b	2020-03-01 19:40:22 -08:00
Ailing Zhang	69d2741480	Add list of view ops to public doc. (#32560 ) Summary: This PR comes from discussion with albanD in https://fb.quip.com/npBHAXaPfnbu. Main goal is to clarify view ops with general outplace/inplace ops and remind users about the difference. For reference this information is only available in code which is internal and hard to find. Also changes to this list actually affect users so we think it's better to expose it as public information. It's also helpful for new backend like XLA when implementing PyTorch ops. `19bbb4fccb/tools/autograd/gen_autograd.py (L32-L68)` Please feel free to comment! Pull Request resolved: https://github.com/pytorch/pytorch/pull/32560 Differential Revision: D20161069 Pulled By: ailzhang fbshipit-source-id: b5f1fd4353fe7594a427784db288aeb5a37dc521	2020-02-28 15:05:55 -08:00
Michael Suo	dbe850af5b	[jit] do the code reorg (#33851 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33851 Rationale and context described in #33828. Script to reproduce the move: https://gist.github.com/suo/16cbefaaeb67ca5a7c6caffd49b7f6e9 ghstack-source-id: 99079645 Test Plan: Make sure CI passes Reviewed By: jamesr66a Differential Revision: D20133869 fbshipit-source-id: 390e9241a9c85366d9005c492ac31f10aa96488e	2020-02-27 13:02:51 -08:00
Omkar Salpekar	24dd800e6a	[Dist Autograd] Functional API for Dist Autograd and Dist Optimizer (#33711 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33711 Fixed #33480 This makes `dist_autograd.backward` and `dist_optimizer.step` functional by making the user explicitly pass in the `context_id` as opposed to relying on the confusing thread_local context_id. This diff incorporates these API changes and all places where these functions are called. More concretely, this code: ``` with dist_autograd.context(): # Forward pass. dist_autograd.backward([loss.sum()]) dist_optim.step() ``` should now be written as follows: ``` with dist_autograd.context() as context_id: # Forward pass. dist_autograd.backward(context_id, [loss.sum()]) dist_optim.step(context_id) ``` Test Plan: Ensuring all existing dist_autograd and dist_optimizer tests pass with the new API. Also added a new test case for input checking. Differential Revision: D20011710 fbshipit-source-id: 216e12207934a2a79c7223332b97c558d89d4d65	2020-02-26 19:08:28 -08:00
Elias Ellison	857eb4145e	[JIT] add support for torch.cdist (#33737 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33737 Test Plan: Imported from OSS Differential Revision: D20121916 Pulled By: eellison fbshipit-source-id: b0427bbfd3ade1f3129c4a95a542fbc32c3abd76	2020-02-26 18:37:37 -08:00
Elias Ellison	f31b1d3453	[JIT] add support for lu_unpack (#33736 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33736 Test Plan: Imported from OSS Differential Revision: D20121914 Pulled By: eellison fbshipit-source-id: 1136f4d7678a2233129aefe3e30234af385b8353	2020-02-26 18:37:33 -08:00
Elias Ellison	4543cf4eb1	[JIT] add support for torch.lu to torchscript (#33724 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33724 Fix for https://github.com/pytorch/pytorch/issues/33381, partial fix of https://github.com/pytorch/pytorch/issues/30786 Test Plan: Imported from OSS Differential Revision: D20077321 Pulled By: eellison fbshipit-source-id: a1e6a0370712b36c9f66979098ac2f9d500ca5f6	2020-02-26 18:37:28 -08:00
Ahmad Salim Al-Sibahi	24659d28a1	Feature/vonmises upstream (#33418 ) Summary: Third try of https://github.com/pytorch/pytorch/issues/33177 😄 Pull Request resolved: https://github.com/pytorch/pytorch/pull/33418 Differential Revision: D20069683 Pulled By: ezyang fbshipit-source-id: f58e45e91b672bfde2e41a4480215ba4c613f9de	2020-02-26 08:19:12 -08:00
Michael Carilli	fc6a153688	[WIP] Reanimate gradient scaling API with original scale update heuristic (#33366 ) Summary: Also, windows memory failures responsible for the earlier reversion have been fixed. This PR (initially) contains 2 commits: * a revert of the revert * all changes to implement the original Apex scale update heuristic, squashed into a single commit for easier diff review Pull Request resolved: https://github.com/pytorch/pytorch/pull/33366 Differential Revision: D20099026 Pulled By: ngimel fbshipit-source-id: 339b9b6bd5134bf055057492cd1eedb7e4461529	2020-02-25 19:00:34 -08:00
peter	adbe289870	Update MKL to 2020.0.166 for Windows (#33690 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33690 Differential Revision: D20089300 Pulled By: ezyang fbshipit-source-id: 887c006fbdb2c837f0a1c607a196811f44f1fb35	2020-02-24 22:43:34 -08:00
Michael Suo	dc3d47110a	[docs] add experimental warning to TorchScript classes in language reference (#33697 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33697 reference Test Plan: Imported from OSS Differential Revision: D20070220 Pulled By: suo fbshipit-source-id: 9828d876afed59203cc472eaf0134d52d399069e	2020-02-24 14:01:19 -08:00
anjali411	13e4ee7883	Added tensor.is_complex(), is_complex and dtype.is_complex py binding, tensor printing, and fixed the scalar type returned for complex float (#33268 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33268 Test Plan: Imported from OSS Differential Revision: D19907698 Pulled By: anjali411 fbshipit-source-id: c3ce2e99fc09da91a90a8fb94e5525a00bb23703	2020-02-20 13:38:01 -08:00
Edward Yang	ae53f8dd25	Revert D19859905: [pytorch][PR] Gradient scaling API Test Plan: revert-hammer Differential Revision: D19859905 Original commit changeset: bb8ae6966214 fbshipit-source-id: 28f1c93e8a00e3a4bbe8cc981499b15468f0b970	2020-02-14 11:03:27 -08:00
Nicki Skafte	4bef344210	Implementation of mixture distributions (#22742 ) Summary: Addressing issue https://github.com/pytorch/pytorch/issues/18125 This implements mixture distributions, where all components are from the same distribution family. Right now the implementation supports the ```mean, variance, sample, log_prob``` methods. cc: fritzo and neerajprad - [x] add import and `__all__` string in `torch/distributions/__init__.py` - [x] register docs in docs/source/distributions.rst ### Tests (all tests live in tests/distributions.py) - [x] add an `Example(MixtureSameFamily, [...])` to the `EXAMPLES` list, populating `[...]` with three examples: one with `Normal`, one with `Categorical`, and one with `MultivariateNormal` (to exercise `FloatTensor`, `LongTensor`, and nontrivial `event_dim`) - [x] add a `test_mixture_same_family_shape()` to `TestDistributions`. It would be good to test this with both `Normal` and `MultivariateNormal` - [x] add a `test_mixture_same_family_log_prob()` to `TestDistributions`. - [x] add a `test_mixture_same_family_sample()` to `TestDistributions`. - [x] add a `test_mixture_same_family_shape()` to `TestDistributionShapes` ### Triaged for follow-up PR? - support batch shape - implement `.expand()` - implement `kl_divergence()` in torch/distributions/kl.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/22742 Differential Revision: D19899726 Pulled By: ezyang fbshipit-source-id: 9c816e83a2ef104fe3ea3117c95680b51c7a2fa4	2020-02-14 10:31:56 -08:00
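The commit above lists `mean`, `variance` and `log_prob` for a mixture whose components share one family. As a minimal sketch of what those quantities are mathematically, the code below works through a scalar mixture of normals in plain Python. It is not the PyTorch implementation: every function name here is hypothetical, and the choice of normal components is an assumption made for illustration. The formulas are the standard ones: the mixture log-density is a log-sum-exp over weighted component log-densities, the mean is the weighted mean of component means, and the variance is the weighted second moment minus the squared mean.

```python
import math

# Hypothetical helpers for a 1-D mixture of normals; none of these names
# come from torch.distributions.


def normal_log_pdf(x, mu, sigma):
    """Log-density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


def mixture_log_prob(x, weights, mus, sigmas):
    """log sum_k w_k * N(x; mu_k, sigma_k), computed with log-sum-exp."""
    terms = [
        math.log(w) + normal_log_pdf(x, m, s)
        for w, m, s in zip(weights, mus, sigmas)
    ]
    top = max(terms)  # subtract the max for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def mixture_mean(weights, mus):
    """Weighted average of the component means."""
    return sum(w * m for w, m in zip(weights, mus))


def mixture_variance(weights, mus, sigmas):
    """Weighted second moment of the components minus the squared mean."""
    mean = mixture_mean(weights, mus)
    second_moment = sum(
        w * (s * s + m * m) for w, m, s in zip(weights, mus, sigmas)
    )
    return second_moment - mean * mean


if __name__ == "__main__":
    w, mu, sd = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
    print(mixture_mean(w, mu))
    print(mixture_variance(w, mu, sd))
    print(mixture_log_prob(0.0, w, mu, sd))
```

The variance is larger than either component's variance because the spread between the component means contributes to it as well.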
George Guanheng Zhang	0c98939b7b	Revert D19899550: [pytorch][PR] Second try on Von Mises: Make it JIT compatible Test Plan: revert-hammer Differential Revision: D19899550 Original commit changeset: fbcdd9bc9143 fbshipit-source-id: c8a675a8b53f884acd0e6c57bc7aa15faf83d5d6	2020-02-14 08:42:16 -08:00
Ahmad Salim Al-Sibahi	b1583ceb1e	Second try on Von Mises: Make it JIT compatible (#33177 ) Summary: Follow up from https://github.com/pytorch/pytorch/issues/17168 . Pull Request resolved: https://github.com/pytorch/pytorch/pull/33177 Differential Revision: D19899550 Pulled By: ezyang fbshipit-source-id: fbcdd9bc91438164bcb2b1cbc314c765520754e1	2020-02-14 07:16:41 -08:00
Michael Carilli	40246fa63c	Gradient scaling API (#26512 ) Summary: This PR implements the gradient scaling API that mruberry, jjsjann123, ngimel, zdevito, gchanan and I have been discussing. Relevant issue/RFC: https://github.com/pytorch/pytorch/issues/25081. Volume-wise, this PR is mostly documentation and tests. The Python API (found entirely in `torch/cuda/amp/amp_scaler.py`) is lightweight. The exposed functions are intended to make the implementation and control flow of gradient scaling convenient, intuitive, and performant. The API is probably easiest to digest by looking at the documentation and examples. `docs/source/amp.rst` is the homepage for the Automatic Mixed Precision package. `docs/source/notes/amp_examples.rst` includes several examples demonstrating common but not-immediately-obvious use cases. Examples are backed by tests in `test_cuda.py` (and thankfully the tests pass :P). Two small utility kernels have been added in `native/cuda/AmpKernels.cu` to improve performance and avoid host-device synchronizations wherever possible. Existing optimizers, both in the wild and in Pytorch core, do not need to change to use the scaling API. However, the API was also designed to establish a contract between user scripts and optimizers such that writers of _new_ custom optimizers have the control points they need to implement fast, optionally sync-free updates. User scripts that obey the scaling API can drop such custom optimizers in and reap performance benefits without having to change anything aside from the optimizer constructor itself. [I know what the contract with custom optimizers should be](`35829f24ef/torch/cuda/amp/amp_scaler.py (L179-L184)`), but I'm waiting for review on the rest of the API before I go about documenting it (it will be given a dedicated section in `docs/source/notes/amp_examples.rst`). Currently, the gradient scaling examples do not include the auto-casting API as discussed in https://github.com/pytorch/pytorch/issues/25081. 
The gradient scaling API is intended to be orthogonal/modular relative to autocasting. Without auto-casting the gradient scaling API is fully use-_able_, but not terribly use-_ful_, so it's up to you guys whether you want to wait until auto-casting is ready before merging the scaling API as well. ### Todo - [ ] How do I get c10 registered status for my two custom kernels? They're very simple. Pull Request resolved: https://github.com/pytorch/pytorch/pull/26512 Differential Revision: D19859905 Pulled By: mruberry fbshipit-source-id: bb8ae6966214718dfee11345db824389e4286923	2020-02-13 11:06:06 -08:00
Ilia Cherniavskii	04829e924a	Update CPU threading doc (#33083 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33083 Added more recommendations, some notes and warning Test Plan: cd docs ; make html Differential Revision: D19829133 Pulled By: ilia-cher fbshipit-source-id: b9fbd89f5875b3ce35cc42ba75a3b44bb132c506	2020-02-11 14:13:51 -08:00
Shinichiro Hamaji	478356aeec	Fix broken links in governance.rst Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30815 Differential Revision: D19697401 Pulled By: ezyang fbshipit-source-id: d7e1a1b54039624f471b6cfb568428feb73060f4	2020-02-04 14:26:09 -08:00
Shinichiro Hamaji	67706187fb	Fix a broken link in contribution_guide.rst Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30814 Differential Revision: D19697403 Pulled By: ezyang fbshipit-source-id: b01fd0e189b3bc7ccaa197c9c64e12fee70a6310	2020-02-04 14:14:25 -08:00
BowenBao	10183061eb	[ONNX] Update ONNX landing page since 1.3 (#32805 ) Summary: * New ops supported for exporting. * Updates on support for tensor indexing and dynamic list of tensors. * lara-hdr, spandantiwari Should we also include updates on torchvision support in this page? cc houseroad, neginraoof Please review if I have missed anything. Pull Request resolved: https://github.com/pytorch/pytorch/pull/32805 Reviewed By: hl475 Differential Revision: D19635699 Pulled By: houseroad fbshipit-source-id: b6be4fce641f852dcbceed20b4433f4037d8024a	2020-02-03 10:38:29 -08:00
Edward Z. Yang	1177191c8e	Synchronize with ShipIt. Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2020-01-21 13:39:28 -05:00
Brian Wignall	f326045b37	Fix typos, via a Levenshtein-type corrector (#31523 ) Summary: Should be non-semantic. Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos, with https://github.com/bwignall/typochecker to help automate the checking. Uses an updated version of the tool used in https://github.com/pytorch/pytorch/pull/30606 . Pull Request resolved: https://github.com/pytorch/pytorch/pull/31523 Differential Revision: D19216749 Pulled By: mrshenli fbshipit-source-id: 7fd489cb9a77cd7e4950c1046f925d57524960ea	2020-01-17 16:03:19 -08:00
anjali411	5b815d980e	Added cummin Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32238 Differential Revision: D19416791 Pulled By: anjali411 fbshipit-source-id: 5aadc0a7a55af40d76f444ab7d7d47ec822f55a5	2020-01-17 10:51:58 -08:00
Shen Li	322f34b245	Adding DDP Design Note Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32158 Test Plan: Imported from OSS Differential Revision: D19405980 Pulled By: mrshenli fbshipit-source-id: 808ef1c71b637546f8872375bf1828967b1a5a60	2020-01-15 14:10:45 -08:00
Vamshi Chowdary	05088da8e9	[pytorch][PR] Fixed error in sample code of documentation (#31682 ) Summary: "in_features" and "out_features" are not defined. Possibly a typo. They should be "input_features" and "output_features" instead Pull Request resolved: https://github.com/pytorch/pytorch/pull/31682 Differential Revision: D19251685 Pulled By: zou3519 fbshipit-source-id: ac9e524e792a1853a16e8876d76b908495d8f35e	2020-01-15 10:34:07 -08:00
anjali411	8dc67a014f	Add cummax Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32169 Differential Revision: D19393236 Pulled By: anjali411 fbshipit-source-id: 5dac6b0a4038eb48458d4a0b253418daeccbb6bc	2020-01-14 17:19:10 -08:00
Zafar Takhirov	701ca68882	Docs entry for the `is_quantized` Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32075 Test Plan: Imported from OSS Differential Revision: D19353861 Pulled By: z-a-f fbshipit-source-id: 4249216ac9a4af354a251c62181d65bc14cbfd3e	2020-01-13 13:54:35 -08:00
Shen Li	62f93443e5	Explain RPC behavior when using Tensor as arg or return value Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31968 Test Plan: Imported from OSS Differential Revision: D19321380 Pulled By: mrshenli fbshipit-source-id: e3431f1f02963cc8d8266a420ab03866106f26ac	2020-01-09 16:42:24 -08:00
Bram Wasti	021e1e20c1	Revert D19320493: Javadoc changes Test Plan: revert-hammer Differential Revision: D19320493 Original commit changeset: cc76b2a2acbe fbshipit-source-id: 3b36dd2d2591acc60a06a421dd625c21adbe578a	2020-01-09 14:23:30 -08:00
Jessica Lin	26f552a3d1	Javadoc changes (#31956 ) Summary: - Add Javadoc url in index.rst - Delete no longer needed java rst files - Remove intersphinx extension from conf.py - Remove javasphinx from docs/requirements.txt Pull Request resolved: https://github.com/pytorch/pytorch/pull/31956 Differential Revision: D19320493 Pulled By: jlin27 fbshipit-source-id: cc76b2a2acbe2ecdabcd3339e1cc3182f0c906ae	2020-01-09 10:55:24 -08:00
xiaobing.zhang	9ba6a768de	Add op bitwise_or (#31559 ) Summary: ezyang , this PR add bitwise_or operator as https://github.com/pytorch/pytorch/pull/31104 . Benchmark script : ``` import timeit import torch torch.manual_seed(1) for n, t in [(10, 100000),(1000, 10000)]: print('__or__ (a.numel() == {}) for {} times'.format(n, t)) for device in ('cpu', 'cuda'): for dtype in ('torch.int8', 'torch.uint8', 'torch.int16', 'torch.int32', 'torch.int64'): print(f'device: {device}, dtype: {dtype}, {t} times', end='\t\t') print(timeit.timeit(f'a \| b\nif "{device}" == "cuda": torch.cuda.synchronize()', setup=f'import torch; a = torch.randint(0, 10, ({n},), dtype = {dtype}, device="{device}"); b = torch.randint(0, 10, ({n},), dtype = {dtype}, device="{device}")', number=t)) for n, t in [(10, 100000),(1000, 10000)]: print('__ior__ (a.numel() == {}) for {} times'.format(n, t)) for device in ('cpu', 'cuda'): for dtype in ('torch.int8', 'torch.uint8', 'torch.int16', 'torch.int32', 'torch.int64'): print(f'device: {device}, dtype: {dtype}, {t} times', end='\t\t') print(timeit.timeit(f'a \| b\nif "{device}" == "cuda": torch.cuda.synchronize()', setup=f'import torch; a = torch.randint(0, 10, ({n},), dtype = {dtype}, device="{device}"); b = torch.tensor(5, dtype = {dtype}, device="{device}")', number=t)) ``` Device: Tesla P100, skx-8180 Cuda verison: 9.0.176 Before: ``` __or__ (a.numel() == 10) for 100000 times device: cpu, dtype: torch.int8, 100000 times 0.17616272252053022 device: cpu, dtype: torch.uint8, 100000 times 0.17148233391344547 device: cpu, dtype: torch.int16, 100000 times 0.17616403382271528 device: cpu, dtype: torch.int32, 100000 times 0.17717823758721352 device: cpu, dtype: torch.int64, 100000 times 0.1801931718364358 device: cuda, dtype: torch.int8, 100000 times 1.270583058707416 device: cuda, dtype: torch.uint8, 100000 times 1.2636413089931011 device: cuda, dtype: torch.int16, 100000 times 1.2839747751131654 device: cuda, dtype: torch.int32, 100000 times 
1.2548385225236416 device: cuda, dtype: torch.int64, 100000 times 1.2650810535997152 __or__ (a.numel() == 1000) for 10000 times device: cpu, dtype: torch.int8, 10000 times 0.031136621721088886 device: cpu, dtype: torch.uint8, 10000 times 0.030786747112870216 device: cpu, dtype: torch.int16, 10000 times 0.02391665056347847 device: cpu, dtype: torch.int32, 10000 times 0.024147341027855873 device: cpu, dtype: torch.int64, 10000 times 0.024414129555225372 device: cuda, dtype: torch.int8, 10000 times 0.12741921469569206 device: cuda, dtype: torch.uint8, 10000 times 0.1249831635504961 device: cuda, dtype: torch.int16, 10000 times 0.1283819805830717 device: cuda, dtype: torch.int32, 10000 times 0.12591975275427103 device: cuda, dtype: torch.int64, 10000 times 0.12655890546739101 __ior__ (a.numel() == 10) for 100000 times device: cpu, dtype: torch.int8, 100000 times 0.3908365070819855 device: cpu, dtype: torch.uint8, 100000 times 0.38267823681235313 device: cpu, dtype: torch.int16, 100000 times 0.38239253498613834 device: cpu, dtype: torch.int32, 100000 times 0.3817988149821758 device: cpu, dtype: torch.int64, 100000 times 0.3901665909215808 device: cuda, dtype: torch.int8, 100000 times 1.4211318120360374 device: cuda, dtype: torch.uint8, 100000 times 1.4215159295126796 device: cuda, dtype: torch.int16, 100000 times 1.4307750314474106 device: cuda, dtype: torch.int32, 100000 times 1.4123614141717553 device: cuda, dtype: torch.int64, 100000 times 1.4480243818834424 __ior__ (a.numel() == 1000) for 10000 times device: cpu, dtype: torch.int8, 10000 times 0.06468924414366484 device: cpu, dtype: torch.uint8, 10000 times 0.06442475505173206 device: cpu, dtype: torch.int16, 10000 times 0.05267547257244587 device: cpu, dtype: torch.int32, 10000 times 0.05286940559744835 device: cpu, dtype: torch.int64, 10000 times 0.06211103219538927 device: cuda, dtype: torch.int8, 10000 times 0.15332304500043392 device: cuda, dtype: torch.uint8, 10000 times 0.15353196952492 device: cuda, dtype: 
torch.int16, 10000 times 0.15300503931939602 device: cuda, dtype: torch.int32, 10000 times 0.15274472255259752 device: cuda, dtype: torch.int64, 10000 times 0.1512152962386608 ``` After: ``` __or__ (a.numel() == 10) for 100000 times device: cpu, dtype: torch.int8, 100000 times 0.2465507509186864 device: cpu, dtype: torch.uint8, 100000 times 0.2472386620938778 device: cpu, dtype: torch.int16, 100000 times 0.2469814233481884 device: cpu, dtype: torch.int32, 100000 times 0.2535214088857174 device: cpu, dtype: torch.int64, 100000 times 0.24855613708496094 device: cuda, dtype: torch.int8, 100000 times 1.4351346511393785 device: cuda, dtype: torch.uint8, 100000 times 1.4434308474883437 device: cuda, dtype: torch.int16, 100000 times 1.4520929995924234 device: cuda, dtype: torch.int32, 100000 times 1.4456610176712275 device: cuda, dtype: torch.int64, 100000 times 1.4580101007595658 __or__ (a.numel() == 1000) for 10000 times device: cpu, dtype: torch.int8, 10000 times 0.029985425993800163 device: cpu, dtype: torch.uint8, 10000 times 0.03024935908615589 device: cpu, dtype: torch.int16, 10000 times 0.026356655173003674 device: cpu, dtype: torch.int32, 10000 times 0.027377349324524403 device: cpu, dtype: torch.int64, 10000 times 0.029163731262087822 device: cuda, dtype: torch.int8, 10000 times 0.14540370367467403 device: cuda, dtype: torch.uint8, 10000 times 0.1456305105239153 device: cuda, dtype: torch.int16, 10000 times 0.1450125053524971 device: cuda, dtype: torch.int32, 10000 times 0.1472016740590334 device: cuda, dtype: torch.int64, 10000 times 0.14709716010838747 __ior__ (a.numel() == 10) for 100000 times device: cpu, dtype: torch.int8, 100000 times 0.27195510920137167 device: cpu, dtype: torch.uint8, 100000 times 0.2692424338310957 device: cpu, dtype: torch.int16, 100000 times 0.27726674638688564 device: cpu, dtype: torch.int32, 100000 times 0.2815811652690172 device: cpu, dtype: torch.int64, 100000 times 0.2852728571742773 device: cuda, dtype: torch.int8, 100000 times 
1.4743850827217102 device: cuda, dtype: torch.uint8, 100000 times 1.4766502184793353 device: cuda, dtype: torch.int16, 100000 times 1.4774163831025362 device: cuda, dtype: torch.int32, 100000 times 1.4749693805351853 device: cuda, dtype: torch.int64, 100000 times 1.5772947426885366 __ior__ (a.numel() == 1000) for 10000 times device: cpu, dtype: torch.int8, 10000 times 0.03614502027630806 device: cpu, dtype: torch.uint8, 10000 times 0.03619729354977608 device: cpu, dtype: torch.int16, 10000 times 0.0319912089034915 device: cpu, dtype: torch.int32, 10000 times 0.03319283854216337 device: cpu, dtype: torch.int64, 10000 times 0.0343862259760499 device: cuda, dtype: torch.int8, 10000 times 0.1581476852297783 device: cuda, dtype: torch.uint8, 10000 times 0.15974601730704308 device: cuda, dtype: torch.int16, 10000 times 0.15957212820649147 device: cuda, dtype: torch.int32, 10000 times 0.16002820804715157 device: cuda, dtype: torch.int64, 10000 times 0.16129320487380028 ``` Fix https://github.com/pytorch/pytorch/issues/24511, https://github.com/pytorch/pytorch/issues/24515, https://github.com/pytorch/pytorch/issues/24658, https://github.com/pytorch/pytorch/issues/24662. Pull Request resolved: https://github.com/pytorch/pytorch/pull/31559 Differential Revision: D19315875 Pulled By: ezyang fbshipit-source-id: 4a3ca88fdafbeb796079687e676228111eb44aad	2020-01-08 15:06:30 -08:00
Jessica Lin	c888473b57	Restructure docs organization and naming (#31849 ) Summary: * Rename “Other Languages” → “Language Bindings” * Move the Community section to the bottom * Move "Language Bindings" above "Python API" Pull Request resolved: https://github.com/pytorch/pytorch/pull/31849 Differential Revision: D19290966 Pulled By: jlin27 fbshipit-source-id: 30b579e032a9fb1636e4afc7bbbd85a2708f637d	2020-01-07 11:16:53 -08:00
Rohan Varma	a561a8448b	minor doc tweak to use mp.spawn in example (#30381 ) Summary: Per pietern's comment in https://github.com/pytorch/pytorch/issues/30022, we can make this example launcher a bit simpler by using `torch.multiprocessing`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/30381 Differential Revision: D19292080 Pulled By: rohan-varma fbshipit-source-id: 018ace945601166ef3af05d8c3e69d900bd77c3b	2020-01-06 22:19:01 -08:00
xiaobing.zhang	b47e9b97a2	Add op bitwise_and (#31104 ) Summary: Refer to https://github.com/pytorch/pytorch/pull/25665, add `bitwise_and` operator. Benchmark script : ``` import timeit #for __and__ for n, t in [(10, 100000),(1000, 10000)]: print('__and__ (a.numel() == {}) for {} times'.format(n, t)) for device in ('cpu', 'cuda'): for dtype in ('torch.int8', 'torch.uint8', 'torch.int16', 'torch.int32', 'torch.int64'): print(f'device: {device}, dtype: {dtype}, {t} times', end='\t\t') print(timeit.timeit(f'a & b\nif "{device}" == "cuda": torch.cuda.synchronize()', setup=f'import torch; a = torch.randint(0, 10, ({n},), dtype = {dtype}, device="{device}"); b = torch.randint(0, 10, ({n},), dtype = {dtype}, device="{device}")', number=t)) #for __iand__ for n, t in [(10, 100000),(1000, 10000)]: print('__iand__ (a.numel() == {}) for {} times'.format(n, t)) for device in ('cpu', 'cuda'): for dtype in ('torch.int8', 'torch.uint8', 'torch.int16', 'torch.int32', 'torch.int64'): print(f'device: {device}, dtype: {dtype}, {t} times', end='\t\t') print(timeit.timeit(f'a & b\nif "{device}" == "cuda": torch.cuda.synchronize()', setup=f'import torch; a = torch.randint(0, 10, ({n},), dtype = {dtype}, device="{device}"); b = torch.tensor(5, dtype = {dtype}, device="{device}")', number=t)) ``` Device: Tesla P100, skx-8180 CUDA version: 9.0.176 Before: ``` __and__ (a.numel() == 10) for 100000 times device: cpu, dtype: torch.int8, 100000 times 0.1766007635742426 device: cpu, dtype: torch.uint8, 100000 times 0.17322628945112228 device: cpu, dtype: torch.int16, 100000 times 0.17650844901800156 device: cpu, dtype: torch.int32, 100000 times 0.17711848113685846 device: cpu, dtype: torch.int64, 100000 times 0.18240160401910543 device: cuda, dtype: torch.int8, 100000 times 1.273967768996954 device: cuda, dtype: torch.uint8, 100000 times 1.2778537990525365 device: cuda, dtype: torch.int16, 100000 times 1.2753686187788844 device: cuda, dtype: torch.int32, 100000 times 1.2797665279358625 device: 
cuda, dtype: torch.int64, 100000 times 1.2933144550770521 __and__ (a.numel() == 1000) for 10000 times device: cpu, dtype: torch.int8, 10000 times 0.031139614060521126 device: cpu, dtype: torch.uint8, 10000 times 0.03091452084481716 device: cpu, dtype: torch.int16, 10000 times 0.022756479680538177 device: cpu, dtype: torch.int32, 10000 times 0.025045674294233322 device: cpu, dtype: torch.int64, 10000 times 0.024164282716810703 device: cuda, dtype: torch.int8, 10000 times 0.12820732593536377 device: cuda, dtype: torch.uint8, 10000 times 0.12775669433176517 device: cuda, dtype: torch.int16, 10000 times 0.12697868794202805 device: cuda, dtype: torch.int32, 10000 times 0.12832533661276102 device: cuda, dtype: torch.int64, 10000 times 0.1280576130375266 __iand__ (a.numel() == 10) for 100000 times device: cpu, dtype: torch.int8, 100000 times 0.3687064303085208 device: cpu, dtype: torch.uint8, 100000 times 0.36253443732857704 device: cpu, dtype: torch.int16, 100000 times 0.362891579978168 device: cpu, dtype: torch.int32, 100000 times 0.37680106051266193 device: cpu, dtype: torch.int64, 100000 times 0.3689364707097411 device: cuda, dtype: torch.int8, 100000 times 1.419940729625523 device: cuda, dtype: torch.uint8, 100000 times 1.4247053815051913 device: cuda, dtype: torch.int16, 100000 times 1.4191444097086787 device: cuda, dtype: torch.int32, 100000 times 1.4305962566286325 device: cuda, dtype: torch.int64, 100000 times 1.4567416654899716 __iand__ (a.numel() == 1000) for 10000 times device: cpu, dtype: torch.int8, 10000 times 0.06224383972585201 device: cpu, dtype: torch.uint8, 10000 times 0.06205617543309927 device: cpu, dtype: torch.int16, 10000 times 0.05016433447599411 device: cpu, dtype: torch.int32, 10000 times 0.05216377507895231 device: cpu, dtype: torch.int64, 10000 times 0.06139362137764692 device: cuda, dtype: torch.int8, 10000 times 0.14827249851077795 device: cuda, dtype: torch.uint8, 10000 times 0.14801877550780773 device: cuda, dtype: torch.int16, 10000 
times 0.14952312968671322 device: cuda, dtype: torch.int32, 10000 times 0.14999118447303772 device: cuda, dtype: torch.int64, 10000 times 0.14951884001493454 ``` After: ``` __and__ (a.numel() == 10) for 100000 times device: cpu, dtype: torch.int8, 100000 times 0.23157884553074837 device: cpu, dtype: torch.uint8, 100000 times 0.23063660878688097 device: cpu, dtype: torch.int16, 100000 times 0.23005440644919872 device: cpu, dtype: torch.int32, 100000 times 0.23748818412423134 device: cpu, dtype: torch.int64, 100000 times 0.24106105230748653 device: cuda, dtype: torch.int8, 100000 times 1.4394256137311459 device: cuda, dtype: torch.uint8, 100000 times 1.4436759827658534 device: cuda, dtype: torch.int16, 100000 times 1.4631587155163288 device: cuda, dtype: torch.int32, 100000 times 1.459101552143693 device: cuda, dtype: torch.int64, 100000 times 1.4784048134461045 __and__ (a.numel() == 1000) for 10000 times device: cpu, dtype: torch.int8, 10000 times 0.028442862443625927 device: cpu, dtype: torch.uint8, 10000 times 0.028130197897553444 device: cpu, dtype: torch.int16, 10000 times 0.025318274274468422 device: cpu, dtype: torch.int32, 10000 times 0.02519288007169962 device: cpu, dtype: torch.int64, 10000 times 0.028299466706812382 device: cuda, dtype: torch.int8, 10000 times 0.14342594426125288 device: cuda, dtype: torch.uint8, 10000 times 0.145280827768147 device: cuda, dtype: torch.int16, 10000 times 0.14673697855323553 device: cuda, dtype: torch.int32, 10000 times 0.14499565307050943 device: cuda, dtype: torch.int64, 10000 times 0.14582364354282618 __iand__ (a.numel() == 10) for 100000 times device: cpu, dtype: torch.int8, 100000 times 0.25548241566866636 device: cpu, dtype: torch.uint8, 100000 times 0.2552562616765499 device: cpu, dtype: torch.int16, 100000 times 0.25905191246420145 device: cpu, dtype: torch.int32, 100000 times 0.26635489892214537 device: cpu, dtype: torch.int64, 100000 times 0.26269810926169157 device: cuda, dtype: torch.int8, 100000 times 
1.485458506271243 device: cuda, dtype: torch.uint8, 100000 times 1.4742380809038877 device: cuda, dtype: torch.int16, 100000 times 1.507783885113895 device: cuda, dtype: torch.int32, 100000 times 1.4926990242674947 device: cuda, dtype: torch.int64, 100000 times 1.519851053133607 __iand__ (a.numel() == 1000) for 10000 times device: cpu, dtype: torch.int8, 10000 times 0.03425929415971041 device: cpu, dtype: torch.uint8, 10000 times 0.03293587639927864 device: cpu, dtype: torch.int16, 10000 times 0.029559112153947353 device: cpu, dtype: torch.int32, 10000 times 0.030915481969714165 device: cpu, dtype: torch.int64, 10000 times 0.03292469773441553 device: cuda, dtype: torch.int8, 10000 times 0.15792148280888796 device: cuda, dtype: torch.uint8, 10000 times 0.16000914946198463 device: cuda, dtype: torch.int16, 10000 times 0.1600684942677617 device: cuda, dtype: torch.int32, 10000 times 0.16162546630948782 device: cuda, dtype: torch.int64, 10000 times 0.1629159888252616 ``` Fix https://github.com/pytorch/pytorch/issues/24508, https://github.com/pytorch/pytorch/issues/24509, https://github.com/pytorch/pytorch/issues/24655, https://github.com/pytorch/pytorch/issues/24656. Pull Request resolved: https://github.com/pytorch/pytorch/pull/31104 Differential Revision: D18938930 Pulled By: VitalyFedyunin fbshipit-source-id: a77e805a0b84e8ace16c6e648c2f67dad44f2e44	2020-01-03 10:32:36 -08:00
vishwakftw	22d84204f7	Expose torch.poisson in documentation (#31667 ) Summary: Changelog: - Add a docstring for torch.poisson describing its current behavior - Check for non-positive entries in the tensor passed as input to torch.poisson Closes https://github.com/pytorch/pytorch/issues/31646 Pull Request resolved: https://github.com/pytorch/pytorch/pull/31667 Differential Revision: D19247371 Pulled By: ngimel fbshipit-source-id: b53d105e73bf59a45beeb566f47365c3eb74efca	2019-12-28 21:32:26 -08:00
davidriazati	ec4e347744	Add Python language reference docs (#30686 ) Summary: This exposes our audit of https://docs.python.org/3/reference/ with descriptions for each line item. To generate the `.rst` from the Quip: ```bash pip install m2r m2r jit_language_reference.md ``` https://driazati.github.io/pytorch_doc_previews/30686/jit.html#python-functions-and-modules Pull Request resolved: https://github.com/pytorch/pytorch/pull/30686 Pulled By: driazati Differential Revision: D19219587 fbshipit-source-id: 249db9b5ee20e38804d4302bbfeca7d54f27d0bd	2019-12-26 13:21:36 -08:00
Martin Yuan	11854bcd38	Add test to torch.jit.export_opnames, make the _C function private Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31446 Test Plan: Imported from OSS Differential Revision: D19172851 Pulled By: iseeyuan fbshipit-source-id: f06d8766ed73c9abe4ebf41c402ee64880d745be	2019-12-20 13:38:43 -08:00
Elias Ellison	779b128872	add back in reference to jit_unsupported section (#31486 ) Summary: It was added in https://github.com/pytorch/pytorch/pull/31329 and removed in a bad merge in https://github.com/pytorch/pytorch/pull/31138/ Pull Request resolved: https://github.com/pytorch/pytorch/pull/31486 Differential Revision: D19181967 Pulled By: eellison fbshipit-source-id: 7e4b4a9b2042c30ec18f7f737bc4a9a56fac7d92	2019-12-19 12:44:16 -08:00
davidriazati	503a4e9019	Cleanup after moving language reference (#31146 ) Summary: Stacked PRs * #31146 - [jit] Cleanup after moving language reference * #31138 - [jit] Move TorchScript language reference to its own page Preview: https://driazati.github.io/pytorch_doc_previews/jit.html#torchscript-language Pull Request resolved: https://github.com/pytorch/pytorch/pull/31146 Pulled By: driazati Differential Revision: D19167390 fbshipit-source-id: f28daed36754a553264fc8ac142ed22c3e26d63e	2019-12-18 15:09:35 -08:00
davidriazati	ae2487bf4d	Move TorchScript language reference to its own page (#31138 ) Summary: Stacked PRs * #31146 - [jit] Cleanup after moving language reference * #31138 - [jit] Move TorchScript language reference to its own page Preview: https://driazati.github.io/pytorch_doc_previews/jit.html#torchscript-language Pull Request resolved: https://github.com/pytorch/pytorch/pull/31138 Pulled By: driazati Differential Revision: D19167375 fbshipit-source-id: d37110d85fc8b8d2c741be49846e873de1357c2a	2019-12-18 15:09:31 -08:00
Elias Ellison	fb30a48b4e	add unsupported section (#31329 ) Summary: Add a section for unsupported ops, and modules. Automatically generate the properties and attributes that aren't bound, and for ops that have semantic mismatches set up tests so the docs stay up to date. Pull Request resolved: https://github.com/pytorch/pytorch/pull/31329 Differential Revision: D19164472 Pulled By: eellison fbshipit-source-id: 46290bb8a64d9de928cfb1eda5ff4558c3799c88	2019-12-18 13:56:02 -08:00
Elliot Waite	c63f8e5ebe	Fix typo in data.rst docs Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31395 Differential Revision: D19160010 Pulled By: zou3519 fbshipit-source-id: cbc4e719e69117e8747617729d240c72e7a4e3dd	2019-12-18 09:52:10 -08:00
Vitaly Fedyunin	3e59e80429	Revert D18941024: Move TorchScript language reference to its own page Test Plan: revert-hammer Differential Revision: D18941024 Original commit changeset: d0ff600870a1 fbshipit-source-id: 01c0eac4c9741f27b91d710616e71a0d769f6f6a	2019-12-18 08:55:50 -08:00
davidriazati	c05538b831	Move TorchScript language reference to its own page (#31138 ) Summary: Preview: https://driazati.github.io/pytorch_doc_previews/jit.html#torchscript-language Pull Request resolved: https://github.com/pytorch/pytorch/pull/31138 Pulled By: driazati Differential Revision: D18941024 fbshipit-source-id: d0ff600870a14c4a7c6ce54867d152072a12c48c	2019-12-18 00:46:19 -08:00
Michael Suo	293a139d79	add a warning for script classes (#31069 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31069 Just to clarify that they are still experimental. Test Plan: Imported from OSS Differential Revision: D18920496 Pulled By: suo fbshipit-source-id: d2f3014592a01a21f7fc60a4ce46dd0bfe5e19e9	2019-12-11 14:48:55 -08:00
Rohan Varma	dbc8b00816	Document WorkerInfo and RpcBackendOptions structures in RPC docs. (#31077 ) Summary: We mention `WorkerInfo` and `RpcBackendOptions` in a couple of different locations in our docs, and these are public classes that the user may use, so we should add these classes to the documentation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/31077 Differential Revision: D18928162 Pulled By: rohan-varma fbshipit-source-id: 67f11eedd87523c469377b791a0ba23704ec3723	2019-12-11 11:39:57 -08:00
Michael Suo	d02280b432	move migration guide to appendix (#31068 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31068 Let's get it out of the early parts now that the recursive API has been around for a while Test Plan: Imported from OSS Differential Revision: D18920498 Pulled By: suo fbshipit-source-id: 6f4389739dd9e7e5f3014811b452249cc21d88e7	2019-12-10 18:04:02 -08:00
TH3CHARLie	5edfe9cb80	add torch.square (#30719 ) Summary: fixes https://github.com/pytorch/pytorch/issues/30524 This adds a new operator `torch.square` to PyTorch. I think it is ready for a first review now, albanD. Pull Request resolved: https://github.com/pytorch/pytorch/pull/30719 Differential Revision: D18909268 Pulled By: albanD fbshipit-source-id: 5626c445d8db20471a56fc1d7a3490e77812662b	2019-12-10 15:22:46 -08:00
Elias Ellison	f48a8901c5	Add floor_divide function (#30493 ) Summary: Adds `torch.floor_divide`, following NumPy's `floor_divide` API. I only implemented the out-of-place version; I can add the in-place version if requested. Also fixes https://github.com/pytorch/pytorch/issues/27512 Pull Request resolved: https://github.com/pytorch/pytorch/pull/30493 Differential Revision: D18896211 Pulled By: eellison fbshipit-source-id: ee401c96ab23a62fc114ed3bb9791b8ec150ecbd	2019-12-10 07:51:39 -08:00
Joseph Spisak	7af9d77290	Update persons_of_interest.rst Updating to add POI for mobile, quantization and an addition to optimizers.	2019-12-05 21:20:40 -08:00
davidriazati	2308a0ec1b	Improve documentation around builtin functions (#30347 ) Summary: This breaks the builtins page into some more sections and adds details about Python built-in functions Pull Request resolved: https://github.com/pytorch/pytorch/pull/30347 Pulled By: driazati Reviewed By: wanchaol Differential Revision: D18718166 fbshipit-source-id: bf43260ab7bcf92cccef684a5ce68cb16020771d	2019-12-04 13:50:40 -08:00
Nathan Goldbaum	9d3402e4cb	Add the __torch_function__ API override mechanism (#30730 ) Summary: This is a re-do of https://github.com/pytorch/pytorch/issues/27064, which was reverted (`b8792c0438`). This was landed at the same time as other work that added new operators to the `torch` namespace so the check for whether the `torch` namespace is exhaustively checked for overridability was triggering test failures. I've temporarily disabled that check and added an explanatory comment that the check will be re-enabled in a future PR that will be merged during a time when the commit velocity on PyTorch is lower. Pull Request resolved: https://github.com/pytorch/pytorch/pull/30730 Differential Revision: D18813270 Pulled By: ezyang fbshipit-source-id: 70477c4656dca8fea6e7bc59259555041fcfbf68	2019-12-04 13:19:07 -08:00
Tongzhou Wang	d0af07ca4c	Fix capitalization inconsistency in optim.rst Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30608 Differential Revision: D18808516 Pulled By: ezyang fbshipit-source-id: 4be68be9a8c8c3da7a0b98162bc1050b588fab43	2019-12-04 08:17:03 -08:00
Edward Yang	b8792c0438	Revert D18645954: add __torch_function__ API override mechanism Test Plan: revert-hammer Differential Revision: D18645954 Original commit changeset: 54b5e4344d7a fbshipit-source-id: 4a7aebb483e6b001130d6f384ccc53c5a808ab13	2019-12-04 07:41:47 -08:00
Prasun Anand	d12786b24f	add __torch_function__ API override mechanism (#27064 ) Summary: Closes https://github.com/pytorch/pytorch/issues/24015 (see description of that issue for more details). For a toy example, see the `DiagonalTensor` and `SubDiagonalTensor` class in test/test_overrides.py. This PR currently contains: * tests for `__torch_function__` behavior * modification to `gen_python_functions` and `parse` function signatures and dispatched to correct overloaded argument. This feature is inspired by and analogous to NumPy's `__array_function__` protocol ([see NumPy Enhancement Proposal 18](https://numpy.org/neps/nep-0018-array-function-protocol.html#trying-array-function-methods-until-the-right-one-works)). ### Benchmarks: See Nathan's comment below: https://github.com/pytorch/pytorch/pull/27064#issuecomment-554601189 Pull Request resolved: https://github.com/pytorch/pytorch/pull/27064 Differential Revision: D18645954 Pulled By: ezyang fbshipit-source-id: 54b5e4344d7afdbcf996bb57191b0bdadc7b1767	2019-12-04 05:56:46 -08:00
Martin Yuan	b26401f965	Dump operator names of a script module (#30467 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30467 Introduce function jit.export_opnames(module), which returns a list of all operator names used in the module and its submodules. One use is to let a mobile custom build link only the operators in the returned list, reducing the mobile binary size. Example: import torch m = torch.jit.load("example.pt") print(torch.jit.export_opnames(m)) The outputs are in alphabetical order: ['aten::_convolution', 'aten::add.Tensor', 'aten::add_.Tensor', 'aten::addmm', 'aten::append.Tensor', 'aten::cat', 'aten::dropout', 'aten::embedding', 'aten::matmul', 'aten::max.dim', 'aten::mul.Tensor', 'aten::permute', 'aten::relu', 'aten::t', 'aten::tanh', 'prim::ListConstruct', 'prim::TupleConstruct', 'prim::TupleUnpack'] Test Plan: Imported from OSS Differential Revision: D18801619 Pulled By: iseeyuan fbshipit-source-id: f9b198d3e82b095daf704ee595d8026ad889bb13	2019-12-03 20:20:33 -08:00
Hong Xu	bb5dcaf24f	Add logical_and and logical_or (#30521 ) Summary: With the CI failure caused in `8bbafa0b32` fixed (incorrect return type of the lambdas in CUDA kernels) Pull Request resolved: https://github.com/pytorch/pytorch/pull/30521 Differential Revision: D18770151 Pulled By: ailzhang fbshipit-source-id: 02f0fe1d5718c34d24da6dbb5884ee8b247ce39a	2019-12-03 18:24:54 -08:00
Joseph Spisak	4d4d8e0dce	Update persons_of_interest.rst (#30647 ) Summary: Adding back the 3 names for the MSFT team - re: ONNX Governance. Pull Request resolved: https://github.com/pytorch/pytorch/pull/30647 Differential Revision: D18781163 Pulled By: jlin27 fbshipit-source-id: 7284ba29841ab41b9807c9d92694630b50de7b6a	2019-12-03 14:46:15 -08:00
Brian Wignall	e7fe64f6a6	Fix typos (#30606 ) Summary: Should be non-semantic. Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos. Pull Request resolved: https://github.com/pytorch/pytorch/pull/30606 Differential Revision: D18763028 Pulled By: mrshenli fbshipit-source-id: 896515a2156d062653408852e6c04b429fc5955c	2019-12-02 20:17:42 -08:00
peterjc123	6deb41c88d	Update magma to 2.5.1 for Windows and switch CUDA in CI to 9.2 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30513 Differential Revision: D18764184 Pulled By: ezyang fbshipit-source-id: 4992869fd6a89471a5d25eb6a9b44ad8eceb480f	2019-12-02 11:56:10 -08:00
Shen Li	ec5e471647	Reorganize rpc API doc and add introduction (#30491 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30491 Our RPC API docs present the APIs well but lack a general introduction to them. Readers might be a little lost the first time they land on this page. This commit reorganizes the APIs into four components from the user's perspective: RPC, RRef, dist autograd, and dist optimizer. It also adds an intro to each and briefly describes why we provide them. Test Plan: Imported from OSS Differential Revision: D18723294 Pulled By: mrshenli fbshipit-source-id: 4aced4ab537b070aa780aaaf9724659fd47cb3cb	2019-11-28 15:34:18 -08:00
Rohan Varma	1350b99de4	Add local shutdown to process group agent (#30330 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30330 This is now possible due to previous changes made in `gloo` and `ProcessGroupGloo`. We `abort` the listener thread that is waiting for a message, and join all other threads. The API is changed so that the previous `wait_all_workers` does not destroy the agent, and this is now done in a new `shutdown` method. All callsites are updated appropriately. ghstack-source-id: 94673884 Test Plan: Unit tests pass. Reviewed By: mrshenli Differential Revision: D18661775 fbshipit-source-id: 5aaa7c14603e18253394224994f6cd43234301c2	2019-11-27 22:34:08 -08:00
Richard Zou	ec5c08de74	Revert D18580867: Add logical_and and logical_or Test Plan: revert-hammer Differential Revision: D18580867 Original commit changeset: 7e4d7c37da4d fbshipit-source-id: 81fb604c7aef8d847f518f5faa016e7bd0423016	2019-11-27 09:27:00 -08:00
Hong Xu	8bbafa0b32	Add logical_and and logical_or (#28162 ) Summary: Superseding https://github.com/pytorch/pytorch/issues/24379 as type promotion has been implemented. Close https://github.com/pytorch/pytorch/issues/24379 Pull Request resolved: https://github.com/pytorch/pytorch/pull/28162 Differential Revision: D18580867 Pulled By: ailzhang fbshipit-source-id: 7e4d7c37da4dc8df87314bd4f1f6a7539e46586a	2019-11-26 17:38:22 -08:00
Santiago Castro	4eff2f2007	Fix missing closing quotes in docs Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30448 Differential Revision: D18711396 Pulled By: zou3519 fbshipit-source-id: 6e35e0779716185791273eedca7a93667a6cda90	2019-11-26 17:38:13 -08:00
davidriazati	46e7f31fa3	Document unsupported types (#30344 ) Summary: This adds a listing of the parts of the `typing` module that are unsupported. This is also a first pass at deciding which features are 'unlikely to be implemented' vs 'not implemented', so these decisions are open to discussion. Pull Request resolved: https://github.com/pytorch/pytorch/pull/30344 Pulled By: driazati Differential Revision: D18665628 fbshipit-source-id: 22b8ebbde23df03839306cdb4344ca18a44f2c29	2019-11-26 06:53:22 -08:00
Rohan Varma	5c6705e62c	add default arg for init_method (#30208 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30208 Adds default arg for init_method so users don't have to pass this in, and moves it to `RpcBackendOptions` struct. Removes `init_method` arg from rpc.init_rpc. Also fixes some docs. ghstack-source-id: 94500475 Test Plan: Unit tests pass. Reviewed By: mrshenli Differential Revision: D18630074 fbshipit-source-id: 04b7dd7ec96f4c4da311b71d250233f1f262135a	2019-11-25 14:52:48 -08:00
Chris Gottbrath	7c4b9042ab	Updates to quantization documentation (#30288 ) Summary: This pull request includes fixes for six quantization doc bugs. https://github.com/pytorch/pytorch/issues/30283 - Rendering issue on QConfig https://github.com/pytorch/pytorch/issues/26305 - Minor doc issue on fuse_modules() https://github.com/pytorch/pytorch/issues/27451 - Issues with ConvReLU2d, ConvReLU3d, and LinearReLU doc issues https://github.com/pytorch/pytorch/issues/26899 - Missing docstrings in torch.nn.intrinsic fused functions https://github.com/pytorch/pytorch/issues/29735 - add discussion of QNNPack to quantization doc page https://github.com/pytorch/pytorch/issues/27938 - some of the quantized functions lack documentation Pull Request resolved: https://github.com/pytorch/pytorch/pull/30288 Differential Revision: D18653368 Pulled By: gottbrath fbshipit-source-id: 410b3dd81ff10909a7f1a7736ca42d7cabf0beb1	2019-11-23 09:29:30 -08:00
Shen Li	a9f3f48f88	Revert D5578006: Add local shutdown to process group agent Test Plan: revert-hammer Differential Revision: D5578006 Original commit changeset: 6258879fb44c fbshipit-source-id: 11b893b3a280a8383eeb20a0548626811616dca1	2019-11-22 11:31:04 -08:00
Rohan Varma	c478a92b93	Add local shutdown to process group agent (#30020 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30020 This is now possible due to previous changes made in `gloo` and `ProcessGroupGloo`. We `abort` the listener thread that is waiting for a message, and join all other threads. The destructor calls this same `localShutdown` method, but we ensure this is not called multiple times. ghstack-source-id: 94415336 Test Plan: Unit tests pass. Differential Revision: D5578006 fbshipit-source-id: 6258879fb44c9fca97fdfad64468c1488c16ac02	2019-11-22 10:03:00 -08:00
Shen Li	aa1e99e983	Fix two links in RPC API doc Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30259 Test Plan: Imported from OSS Differential Revision: D18644749 Pulled By: mrshenli fbshipit-source-id: ff515d2588cd59e0d87f020a01885156a6644450	2019-11-21 19:32:22 -08:00
Shen Li	063e22b7c2	Fix RRef design doc warning (#30240 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30240 Get rid of the following warning when building docs: ``` /Users/shenli/Project/pytorch/docs/source/notes/rref.rst:184: WARNING: Error in "code" directive: maximum 1 argument(s) allowed, 6 supplied. .. code:: import torch import torch.distributed.rpc as rpc # on worker A rref = rpc.remote('B', torch.add, args=(torch.ones(2), 1)) # say the rref has RRefId 100 and ForkId 1 rref.to_here() ``` Test Plan: Imported from OSS Differential Revision: D18640016 Pulled By: mrshenli fbshipit-source-id: d527827f01183411d4b4c73e0a976bdd7fccbf49	2019-11-21 16:22:39 -08:00
Shen Li	e0325011e4	Add link to RRef protocol in RPC doc Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30218 Test Plan: Imported from OSS Differential Revision: D18638881 Pulled By: mrshenli fbshipit-source-id: ca6fae6f8cea8cdcc33d275dd71a347fbb5dd45c	2019-11-21 16:22:35 -08:00
Alban Desmaison	a78e7eadbd	Fix typo in extending doc Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30159 Differential Revision: D18619060 Pulled By: albanD fbshipit-source-id: 1109c8da6242dffd6315b0c9de0f8ca34df0b276	2019-11-21 08:12:32 -08:00
Shen Li	2803261a23	Update API doc for wait_all_workers after rename Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30179 Test Plan: Imported from OSS Differential Revision: D18623092 Pulled By: mrshenli fbshipit-source-id: 1bbffc7476f256c156783274f7ef51342820edcd	2019-11-20 16:12:30 -08:00
Rohan Varma	de05114618	polish examples in docstrings and update docs to reflect correct use of (#30052 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30052 Some of the examples provided in `rpc/api.py` were not updated along with the code changes, this PR updates them. Also removes the `dist.ProcessGroup` information since `init_rpc` now initializes a default process group. ghstack-source-id: 94273004 Test Plan: Unit tests pass Differential Revision: D18582596 fbshipit-source-id: a637683f0221f9600f7e50b74e9f7e5a1d331d8f	2019-11-20 15:30:38 -08:00
Shen Li	73cf4d468f	Design doc for Remote Reference (#30066 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30066 This commit adds design reasoning and walks through four scenarios for RRef. Test Plan: Imported from OSS Reviewed By: rohan-varma Differential Revision: D18595094 Pulled By: mrshenli fbshipit-source-id: 134102901ce515a44a2e7cd013b62143a6158120	2019-11-20 12:42:28 -08:00
Rohan Varma	f304bd5062	rename join_rpc to wait_all_workers in public api (#30050 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30050 Renames this API to wait_all_workers as discussed. ghstack-source-id: 94273005 Test Plan: Unit tests pass Differential Revision: D18581466 fbshipit-source-id: 4ff5d5fb2d528f17252d5b5f30c3047d2efb92bf	2019-11-20 12:38:35 -08:00
Shen Li	ff7afede92	Stop showing .api as an API path component in RPC docs (#30160 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30160 The path torch.distributed.rpc.api is an implementation detail, which should not be used by applications to import RPC APIs. Instead, all RPC APIs are exposed directly as torch.distributed.rpc.*. This commit makes the API doc consistent with the above expectation. Test Plan: Imported from OSS Differential Revision: D18616359 Pulled By: mrshenli fbshipit-source-id: 8207f7d36c24cf55af737c03a27fd1896c231641	2019-11-20 12:04:10 -08:00
Pritam Damania	88ef402cb5	Add distributed optimizer section to distributed autograd design doc. (#30068 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30068 ghstack-source-id: 94228719 Test Plan: waitforbuildbot Differential Revision: D18556536 fbshipit-source-id: decd6927bfdd1ee3c81fef7430aa7095d7f38d33	2019-11-19 22:43:03 -08:00
Pritam Damania	5d69bc1eda	Add docs for distributed optimizer. (#29971 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29971 ghstack-source-id: 94132160 Test Plan: waitforbuildbot Differential Revision: D18554631 fbshipit-source-id: c4485f7cff5159f423d0f35d1caf71074b62dc28	2019-11-18 18:51:26 -08:00
Pritam Damania	ab93b3df60	Polish distributed autograd docs. (#29942 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29942 1) Added links to the design. 2) Fixed function signatures. 3) Expanded examples ghstack-source-id: 94162372 Test Plan: waitforbuildbot Differential Revision: D18547103 fbshipit-source-id: 067ba166c107ed14085af8ee3306d3f8a9dcebe7	2019-11-18 18:13:08 -08:00
Rohan Varma	639133d6d1	rename init_model_parallel to init_rpc (#29762 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29762 Rename this API as discussed, since its use cases extend beyond only model parallelism. ghstack-source-id: 94020627 Test Plan: Unit tests pass Differential Revision: D18491743 fbshipit-source-id: d07676bb14f072c64da0ce99ee818bcc582efc57	2019-11-18 06:07:44 -08:00
Rohan Varma	455b5c1a7d	minor updates to rpc docs (#29857 ) Summary: Small fixes to rpc docs: - mark as experimental and subject to change - Reference the distributed autograd design document in pytorch notes page. Pull Request resolved: https://github.com/pytorch/pytorch/pull/29857 Differential Revision: D18526252 Pulled By: rohan-varma fbshipit-source-id: e09757fa60a9f8fe9c76a868a418a1cd1c300eae	2019-11-15 22:28:08 -08:00
Pritam Damania	eb29276623	Update distributed autograd design doc with appropriate links. (#29927 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29927 With the docs page now up, we can update the links in the design doc to point to the docs page. ghstack-source-id: 94055423 Test Plan: waitforbuildbot Differential Revision: D18541878 fbshipit-source-id: f44702d9a8296ccc0a5d58d56c3b6dc8a822b520	2019-11-15 21:10:53 -08:00
Xiaomeng Yang	510ef4b63a	Add nn.quantized.Conv3d (#29813 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29813 Add nn.quantized.Conv3d Test Plan: buck test mode/dev-nosan //caffe2/test:quantized -- "conv" Reviewed By: jianyuh Differential Revision: D18467749 fbshipit-source-id: 892f708179e9e836ad902851ac1838847009da15	2019-11-15 04:33:40 -08:00
Rohan Varma	06ef4a757d	Add docs for RPC, dist autograd, and RRef modules (#29276 ) Summary: Closes https://github.com/pytorch/pytorch/issues/28983. Documentation for `torch.distributed.rpc` and `torch.distributed.autograd` modules. Also fixes/tidies up some of the docstrings in rpc/autograd, and moves some functions to be private so they don't show up in the documentation. Note: Much of the text to describe/explain the RPC/RRef layers is taken from the following RFCs: https://github.com/pytorch/pytorch/issues/23110, https://github.com/pytorch/pytorch/issues/26759 Pull Request resolved: https://github.com/pytorch/pytorch/pull/29276 Differential Revision: D18478754 Pulled By: rohan-varma fbshipit-source-id: e9a7089baf5275304e5408d319eb9bf98e53fff8	2019-11-14 14:32:03 -08:00
Hong Xu	bd0394d473	Add op bitwise_xor to replace __xor__ and __ixor__ (#25665 ) Summary: We define `bitwise_xor` instead of `__xor__` and `__ixor__`. The reason is that (a) it is not idiomatic to call functions starting and ending with double underscores, and that (b) the types of argument that we can add is limited (e.g., no out), and that (c) consistent with the naming of `bitwise_not` and numpy. Fix https://github.com/pytorch/pytorch/issues/24513, Fix https://github.com/pytorch/pytorch/issues/24517, Fix https://github.com/pytorch/pytorch/issues/24660, Fix https://github.com/pytorch/pytorch/issues/24664 Pull Request resolved: https://github.com/pytorch/pytorch/pull/25665 Differential Revision: D17577143 Pulled By: VitalyFedyunin fbshipit-source-id: 042f6385f9305bd66d50a8ce82e28f40a23a7266	2019-11-12 16:14:04 -08:00

924 Commits