Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39955
resolves https://github.com/pytorch/pytorch/issues/36323 by adding `torch.sgn` for complex tensors.
`torch.sgn` returns `x/abs(x)` for `x != 0` and returns `0 + 0j` for `x == 0`.
This PR doesn't test the correctness of the gradients. It will be done as a part of auditing all the ops in the future once we decide the autograd behavior (JAX vs TF) and add gradcheck.
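For example (a quick sketch of the formula above):
```python
import torch

z = torch.tensor([3 + 4j, 0 + 0j])
torch.sgn(z)  # tensor([0.6000+0.8000j, 0.0000+0.0000j]) -- z/abs(z), and 0 for the zero element
```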
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision: D23460526
Pulled By: anjali411
fbshipit-source-id: 70fc4e14e4d66196e27cf188e0422a335fc42f92
Summary:
We currently fetch an allreduced tensor from Python into C++, where we store the resulting tensor in a struct's parameter. This PR removes the extra tensor parameter from the function signature and fetches the tensor from a single place.
Fixes https://github.com/pytorch/pytorch/issues/43960
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44914
Reviewed By: rohan-varma
Differential Revision: D23798888
Pulled By: bugra
fbshipit-source-id: ad1b8c31c15e3758a57b17218bbb9dc1f61f1577
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45088
Fixes #45082
Found a few problems while working on #44983
1. We deliberately swallow RPC timeouts during shutdown, as we haven't
found a good way to handle those. When we converted `_wait_all_workers`
into `_all_gather`, the same logic was inherited. However, as
`_all_gather` is meant to be used in more general scenarios, we should
no longer keep silent about errors. This commit lets the error throw
in `_all_gather` and also lets `shutdown()` catch and log them.
2. After fixing (1), I found that `UnpickledPythonCall` needs to
acquire the GIL on destruction, and this can lead to deadlock when used
in conjunction with `ProcessGroup`, because the `ProcessGroup` ctor is a
synchronization point which holds the GIL. In `init_rpc`, followers
(`rank != 0`) can exit before the leader (`rank == 0`). If the two
happen together, we can end up with the following: a follower exits `init_rpc`
after running `_broadcast_to_followers` and before reaching the dtor
of `UnpickledPythonCall`. Then it runs the ctor of `ProcessGroup`,
which holds the GIL and waits for the leader to join. However, the
leader is waiting for the response from `_broadcast_to_followers`,
which is blocked by the dtor of `UnpickledPythonCall`, and hence
the deadlock. This commit drops the GIL in the `ProcessGroup` ctor.
3. After fixing (2), I found that `TensorPipe` backend
nondeterministically fails with `test_local_shutdown`, due to a
similar reason as (2), but this time it is that `shutdown()` on a
follower runs before the leader finishes `init_rpc`. This commit
adds a join for `TensorPipe` backend `init_rpc` after `_all_gather`.
The 3rd fix should be able to solve the 2nd problem as well, but since
I didn't see a reason to hold the GIL during the `ProcessGroup` ctor, I
made that change too.
Test Plan: Imported from OSS
Reviewed By: pritamdamania87
Differential Revision: D23825592
Pulled By: mrshenli
fbshipit-source-id: 94920f2ad357746a6b8e4ffaa380dd56a7310976
Summary:
This forces jit.script to raise an error if someone tries to mutate a tuple:
```
Tuple[int, int] does not support subscripted assignment:
File "/home/nshulga/test/tupleassignment.py", line 9
torch.jit.script
def foo(x: Tuple[int, int]) -> int:
    x[-1] = x[0] + 1
    ~~~~~ <--- HERE
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44929
Reviewed By: suo
Differential Revision: D23777668
Pulled By: malfet
fbshipit-source-id: 8efaa4167354ffb4930ccb3e702736a3209151b6
Summary:
Fixes https://github.com/pytorch/pytorch/issues/43622
- Moves the model loading part of `torch.hub.load()` into a new `torch.hub.load_local()` function that takes in a path to a local directory that contains a `hubconf.py` instead of a repo name.
- Refactors `torch.hub.load()` so that it now calls `torch.hub.load_local()` after downloading and extracting the repo.
- Updates `torch.hub` docs to include the new function + minor fixes.
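A minimal usage sketch of the new function as described above; the directory path and entrypoint name are placeholders, and any extra arguments are assumed to be forwarded to the entrypoint:
```python
import torch

# Load 'my_entrypoint' defined in /path/to/local_repo/hubconf.py, without hitting the network.
model = torch.hub.load_local('/path/to/local_repo', 'my_entrypoint')
```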
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44204
Reviewed By: malfet
Differential Revision: D23817429
Pulled By: ailzhang
fbshipit-source-id: 788fd83c87a94f487b558715b2809d346ead02b2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45014
Pull Request resolved: https://github.com/pytorch/tensorpipe/pull/219
Pull Request resolved: https://github.com/pytorch/tensorpipe/pull/212
+ Introduce buffer.h defining the buffer struct(s). The `CpuBuffer`
struct is always defined, while the `CudaBuffer` struct is defined
only when `TENSORPIPE_SUPPORTS_CUDA` is true.
+ Update all channels to take a `CpuBuffer` or `CudaBuffer` for
`send`/`recv` rather than a raw pointer and a length.
+ Make the base `Channel`/`Context` classes templated on `TBuffer`,
effectively creating two channel hierarchies (one for CPU channels,
one for CUDA channels).
+ Update the Pipe and the generic channel tests to use the new API. So
far, generic channel tests are CPU only, and tests for the CUDA IPC
channel are (temporarily) disabled. A subsequent PR will take care of
refactoring tests so that generic tests work for CUDA channels. Another
PR will add support for CUDA tensors in the Pipe.
Differential Revision: D23598033
Test Plan: Imported from OSS
Reviewed By: lw
Pulled By: beauby
fbshipit-source-id: 1d6c3f91e288420858835cd5e7962e8da051b44b
Summary:
A previous fix for masking Cuda dimensions (https://github.com/pytorch/pytorch/issues/44733) changed the behaviour of inserting thread synchronization barriers in the Cuda CodeGen, causing the CudaSharedMemReduce_1 to be flaky and ultimately disabled.
The issue is working out where these barriers must be inserted - solving this optimally is very hard, and I think not possible without dependency analysis that we don't have, so I've changed our logic to be quite pessimistic. We'll insert barriers before and after any blocks that have thread dimensions masked (even between blocks that have no data dependencies). This should be correct, but it's an area where we could improve performance. To address this somewhat, I've added a simplifier pass that removes obviously unnecessary syncThreads.
To avoid this test being flaky again, I've added a check against the generated code to ensure there is a syncThread in the right place.
Also fixed a couple of non-functional clarity issues in the generated code: added the missing newline after Stores in the CudaPrinter, and prevented the PrioritizeLoad mutator from pulling out loads contained within simple Let statements (such as those produced by the Registerizer).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44909
Reviewed By: agolynski
Differential Revision: D23800565
Pulled By: nickgg
fbshipit-source-id: bddef1f40d8d461da965685f01d00b468d8a2c2f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44894
Looks like we added double backwards support but only turned on the ModuleTests.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D23762544
Pulled By: gchanan
fbshipit-source-id: b5cef579608dd71f3de245c4ba92e49216ce8a5e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43208
This PR adds gradcheck for complex. The logic used for complex gradcheck is described in Section 3.5.3 here: https://arxiv.org/pdf/1701.00392.pdf
More concretely, this PR introduces the following changes:
1. Updates get_numerical_jacobian to take as input a scalar value for the vector (v). Adds gradcheck logic for C -> C, C -> R, R -> C. For R -> C functions, only the real value of the gradient is propagated.
2. Adds backward definition for `torch.complex` and also adds a test to verify the definition added.
3. Updates backward for `mul`, `sin`, `cos`, `sinh`, `cosh`.
4. Adds tests for all `torch.real`, `torch.imag`, `torch.view_as_real`, `torch.view_as_complex`, `torch.conj`.
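As a minimal sketch of the new support (using one of the ops updated above, such as `sin`), gradcheck can now be run directly on complex inputs:
```python
import torch
from torch.autograd import gradcheck

x = torch.randn(3, dtype=torch.cdouble, requires_grad=True)
# C -> C case: compares analytical and numerical Wirtinger derivatives.
print(gradcheck(torch.sin, (x,)))
```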
Follow up tasks:
1. Add more thorough tests for R -> C cases. Specifically, add R -> C test variants for functions, e.g., `torch.mul(complex_tensor, real_tensor)`.
2. Add back commented test in `common_methods_invocation.py`.
3. Add more special case checking for complex gradcheck to make debugging easier.
4. Update complex autograd note.
5. disable complex autograd for operators not tested for complex.
Test Plan: Imported from OSS
Reviewed By: zou3519
Differential Revision: D23655088
Pulled By: anjali411
fbshipit-source-id: caa75e09864b5f6ead0f988f6368dce64cf15deb
Summary:
These aliases are consistent with NumPy. Note that C++'s naming would be different (std::multiplies and std::divides), and that PyTorch's existing names (mul and div) are consistent with Python's dunders.
This also improves the instructions for adding an alias to clarify that dispatch keys should be removed when copying native_functions.yaml entries to create the alias entries.
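For example, the new aliases behave exactly like the existing names:
```python
import torch

a, b = torch.tensor([6.]), torch.tensor([3.])
assert torch.equal(torch.multiply(a, b), torch.mul(a, b))
assert torch.equal(torch.divide(a, b), torch.div(a, b))
```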
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44463
Reviewed By: ngimel
Differential Revision: D23670782
Pulled By: mruberry
fbshipit-source-id: 9f1bdf8ff447abc624ff9e9be7ac600f98340ac4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44956
Makes the buffers of HistogramObserver have the
same shapes in uninitialized and initialized states.
This is useful because the detectron2 checkpointer assumes
that these states will stay the same, so it removes the
need for manual hacks around the shapes changing.
Test Plan:
```
python test/test_quantization.py TestObserver.test_histogram_observer_consistent_buffer_shape
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D23785382
fbshipit-source-id: 1a83fd4f39b244b00747c368d5d305a07d877c92
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44795
Today, we build our cpp tests twice, once as a standalone gtest binary,
and once linked in `libtorch_python` so we can call them from
`test_jit.py`.
This is convenient (it means that `test_jit.py` is a single entry point
for all our tests), but has a few drawbacks:
1. We can't actually use the gtest APIs, since we don't link gtest into
`libtorch_python`. We're stuck with the subset that we want to write
polyfills for, and an awkward registration scheme where you have to
write a test and then include it in `tests.h`.
2. More seriously, we register custom operators and classes in these
tests. In a world where we may be linking many `libtorch_python`s, this
has a tendency to cause errors with `libtorch`.
So now, only tests that explicitly require cooperation with Python are
built into `libtorch_python`. The rest are built into
`build/bin/test_jit`.
There are tests which require that we define custom classes and
operators. In these cases, I've built them into separate `.so`s that we
call `torch.ops.load_library()` on.
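A sketch of how such a test-only library is loaded from Python (the library path below is an assumption):
```python
import torch

# Registers the custom operators/classes defined in the test library with the JIT runtime.
torch.ops.load_library("build/lib/libtorchbind_test.so")
```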
Test Plan: Imported from OSS
Reviewed By: SplitInfinity, ZolotukhinM
Differential Revision: D23735520
Pulled By: suo
fbshipit-source-id: d146bf4e7eb908afa6f96b394e4d395d63ad72ff
Summary:
Adds a pass to the IR Simplifier which fuses together the bodies of Cond statements which have identical conditions. e.g.
```
if (i < 10) {
  do_thing_1;
} else {
  do_thing_2;
}
if (i < 10) {
  do_thing_3;
}
```
is transformed into:
```
if (i < 10) {
  do_thing_1;
  do_thing_3;
} else {
  do_thing_2;
}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44886
Reviewed By: glaringlee
Differential Revision: D23768565
Pulled By: nickgg
fbshipit-source-id: 3fe40d91e82bdfff8dcb8c56a02a4fd579c070df
Summary:
Moved the description of the tool and reflected changes in the function name.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44124
Reviewed By: albanD
Differential Revision: D23674618
Pulled By: bzinodev
fbshipit-source-id: 5db0bb14fc106fc96358b1e0590f08e975388c6d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44254
Add a device parameter to RemoteModule, so it can be placed on any device
and not just CPU.
Original PR issue: RemoteModule enhancements #40550
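A rough sketch of what placing a RemoteModule on a non-CPU device could look like; the parameter order and names below are assumptions based on the description above, not the confirmed API, and RPC is assumed to be initialized elsewhere:
```python
import torch.nn as nn
from torch.distributed.nn import RemoteModule

# Hypothetical call: place the wrapped nn.Linear on cuda:0 of worker1.
remote_linear = RemoteModule("worker1", "cuda:0", nn.Linear, args=(20, 30))
```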
Test Plan: buck test test/distributed/rpc:process_group_agent -- RemoteModule
Reviewed By: pritamdamania87
Differential Revision: D23483803
fbshipit-source-id: 4918583c15c6a38a255ccbf12c9168660ab7f6db
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44786
This predates gradcheck and gradcheck does the same and more.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D23731902
Pulled By: gchanan
fbshipit-source-id: 425fd30e943194f63a663708bada8960265b8f05
Summary:
Ref https://github.com/pytorch/pytorch/issues/42175, fixes https://github.com/pytorch/pytorch/issues/34797
This adds complex support to `torch.stft` and `torch.istft`. Note that there are really two issues with complex here: complex signals, and returning complex tensors.
## Complex signals and windows
`stft` currently assumes all signals are real and uses `rfft` with `onesided=True` by default. Similarly, `istft` always takes a complex fourier series and uses `irfft` to return real signals.
For `stft`, I now allow complex inputs and windows by calling the full `fft` if either are complex. If the user gives `onesided=True` and the signal is complex, then this doesn't work and raises an error instead. For `istft`, there's no way to automatically know what to do when `onesided=False` because that could either be a redundant representation of a real signal or a complex signal. So there, the user needs to pass the argument `return_complex=True` in order to use `ifft` and get a complex result back.
## stft returning complex tensors
The other issue is that `stft` returns a complex result, represented as a `(... X 2)` real tensor. I think ideally we want this to return proper complex tensors, but to preserve BC I've had to add a `return_complex` argument to manage this transition. `return_complex` defaults to false for real inputs to preserve BC, but defaults to true for complex inputs where there is no BC to consider.
In order to `return_complex` by default everywhere without a sudden BC-breaking change, a simple transition plan could be:
1. introduce `return_complex`, defaulted to false when BC is an issue but giving a warning. (this PR)
2. raise an error in cases where `return_complex` defaults to false, making it a required argument.
3. change `return_complex` default to true in all cases.
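A short sketch of the opt-in complex output path described above for a real input:
```python
import torch

x = torch.randn(800)                                    # real signal
spec = torch.stft(x, n_fft=64, return_complex=True)     # complex tensor instead of (... x 2) real
recon = torch.istft(spec, n_fft=64, length=x.numel())   # back to a real signal
```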
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43886
Reviewed By: glaringlee
Differential Revision: D23760174
Pulled By: mruberry
fbshipit-source-id: 2fec4404f5d980ddd6bdd941a63852a555eb9147
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44345
As part of enhancing profiler support for RPC, when executing TorchScript functions over RPC, we would like to be able to support user-defined profiling scopes created by `with record_function(...)`.
Since after https://github.com/pytorch/pytorch/pull/34705, we support `with` statements in TorchScript, this PR adds support for `with torch.autograd.profiler.record_function` to be used within TorchScript.
This can be accomplished via the following without this PR:
```
torch.ops.profiler._record_function_enter(...)
# Script code, such as forward pass
torch.ops.profiler._record_function_exit(....)
```
This is a bit hacky and it would be much cleaner to use the context manager now that we support `with` statements. Also, `_record_function_` type operators are internal operators that are subject to change, this change will help avoid BC issues in the future.
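With this PR, the cleaner context-manager form can be used directly inside a scripted function, e.g.:
```python
import torch

@torch.jit.script
def forward_with_profiling(x: torch.Tensor) -> torch.Tensor:
    with torch.autograd.profiler.record_function("my_scope"):
        y = x + 1
    return y
```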
Tested with `python test/test_jit.py TestWith.test_with_record_function -v`
ghstack-source-id: 112320645
Test Plan:
Repro instructions:
1) Change `def script_add_ones_return_any(x) -> Any` to `def script_add_ones_return_any(x) -> Tensor` in `jit/rpc_test.py`
2) `buck test mode/dev-nosan //caffe2/test/distributed/rpc:process_group_agent -- test_record_function_on_caller_rpc_async --print-passing-details`
3) The function which ideally should accept `Future[Any]` is `def _call_end_callbacks_on_future` in `autograd/profiler.py`.
python test/test_jit.py TestWith.test_with_foo -v
Reviewed By: pritamdamania87
Differential Revision: D23332074
fbshipit-source-id: 61b0078578e8b23bfad5eeec3b0b146b6b35a870
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44798
[test all]
Update for relanding: in ddp.join(), moved _rebuild_buckets from end of backward to beginning of forward as well.
Part of relanding PR #41954, this refactoring moves the rebuild_buckets call from the end of the first iteration to the beginning of the second iteration.
ghstack-source-id: 112279261
Test Plan: unit tests
Reviewed By: rohan-varma
Differential Revision: D23735185
fbshipit-source-id: c26e0efeecb3511640120faa1122a2c856cd694e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44000
This wasn't documented, so add documentation saying that all ranks are used when
`ranks=None`.
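The summary doesn't name the API, but this presumably refers to `torch.distributed.new_group`; a sketch under that assumption:
```python
import torch.distributed as dist

# ranks=None (the default) means the new group contains every rank in the world.
group = dist.new_group(ranks=None)
```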
ghstack-source-id: 111206308
Test Plan: CI
Reviewed By: SciPioneer
Differential Revision: D23465034
fbshipit-source-id: 4c51f37ffcba3d58ffa5a0adcd5457e0c5676a5d
Summary:
* Implement tuple sort by traversing contained IValue types and generating a lambda function as the comparator for sort.
* Tuples and class objects can now arbitrarily nest within each other and still be sortable
Fixes https://github.com/pytorch/pytorch/issues/43219
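For example, sorting a list of tuples now works under scripting (a minimal sketch):
```python
import torch
from typing import List, Tuple

@torch.jit.script
def sort_pairs(pairs: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    pairs.sort()          # lexicographic comparison generated for the tuple elements
    return pairs

print(sort_pairs([(2, 1), (1, 5), (1, 2)]))  # [(1, 2), (1, 5), (2, 1)]
```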
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43448
Reviewed By: eellison
Differential Revision: D23352273
Pulled By: gmagogsfm
fbshipit-source-id: b6efa8d00e112178de8256da3deebdba7d06c0e1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44773
The model is created and prepared using fx APIs and then scripted for training.
In order to test QAT on scriptmodel we need to be able to disable/enable fake_quant
and observer modules on it.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qat_and_script
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23741354
fbshipit-source-id: 3fee7aa9b049d9901313b977710f4dc1c4501532
Summary:
[test all]
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44330
Part of relanding PR #41954, this refactor separates initialize_bucket_views and populate_bucket_views_out, as they do different things and are called from different callsites as well.
ghstack-source-id: 112257271
Test Plan: unit tests
Reviewed By: mrshenli
Differential Revision: D23583347
fbshipit-source-id: a5f2041b2c4f2c2b5faba1af834c7143eaade938
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44393
torch.quantile now correctly propagates nan, and torch.nanquantile is implemented similarly to numpy.nanquantile.
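A small example of the difference:
```python
import torch

t = torch.tensor([1., 2., float('nan'), 4.])
torch.quantile(t, 0.5)     # tensor(nan)  -- nan is propagated
torch.nanquantile(t, 0.5)  # tensor(2.)   -- nan values are ignored
```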
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D23649613
Pulled By: heitorschueroff
fbshipit-source-id: 5201d076745ae1237cedc7631c28cf446be99936
Summary:
Fixes https://github.com/pytorch/pytorch/issues/33394 .
This PR does two things:
1. Implement CUDA scatter reductions with revamped GPU atomic operations.
2. Remove support for divide and subtract for CPU reduction as was discussed with ngimel .
I've also updated the docs to reflect that only multiply and add are supported.
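A small sketch of the remaining reductions on the tensor `scatter_` API:
```python
import torch

src = torch.ones(2, 5)
index = torch.tensor([[0, 1, 2, 0, 0]])
out = torch.zeros(3, 5)
out.scatter_(0, index, src, reduce='add')   # 'add' and 'multiply' are the supported modes
```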
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41977
Reviewed By: mruberry
Differential Revision: D23748888
Pulled By: ngimel
fbshipit-source-id: ea643c0da03c9058e433de96db02b503514c4e9c
Summary:
Enabled type checking in common_distributed by using tensors of ints
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44821
Test Plan: Run python test/test_type_hints.py; errors are no longer ignored by mypy.ini
Reviewed By: walterddr
Differential Revision: D23747466
Pulled By: alanadakotashine
fbshipit-source-id: 820fd502d7ff715728470fbef0be90ae7f128dd6
Summary:
Adds a new optimization to the IRSimplifier which changes this pattern:
```
for ...
  if ...
    do thing;
```
into:
```
if ...
  for ...
    do thing;
```
This should be almost strictly better.
There are many cases where this isn't safe to do, hence the tests. Most obviously, it is unsafe when the condition depends on something modified within the loop.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44764
Reviewed By: mruberry
Differential Revision: D23734463
Pulled By: nickgg
fbshipit-source-id: 51617e837de96b354fb702d0090ac65ddc523d36
Summary:
PyObject_IsSubclass may set the Python live-exception bit if the given object is not a class. `IsNamedTuple` is currently using it incorrectly, which may trip up all following Python operations in a debug-build Python. A normal release-build Python is not affected because `assert` is a no-op in release builds.
Fixes https://github.com/pytorch/pytorch/issues/43577
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44769
Reviewed By: jamesr66a
Differential Revision: D23725584
Pulled By: gmagogsfm
fbshipit-source-id: 2dabd4f8667a045d5bf75813500876c6fd81542b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44586
**Summary**
This commit disallows plain `Optional` type annotations without
any contained types both in type comments and in-line as
Python3-style type annotations.
**Test Plan**
This commit adds a unit test for these two situations.
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D23721517
Pulled By: SplitInfinity
fbshipit-source-id: ead411e94aa0ccce227af74eb0341e2a5331370a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43796
This diff adds an option for the process group NCCL backend to pick high priority cuda streams.
Test Plan: waitforsandcastle
Reviewed By: jiayisuse
Differential Revision: D23404286
fbshipit-source-id: b79ae097b7cd945a26e8ba1dd13ad3147ac790eb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44577
I would like to move this to cmake so that I can depend on it
happening from other parts of the build.
This PR pulls out the logic for determining the version string and
writing the version file into its own module. `setup.py` still receives
the version string and uses it as before, but now the code for writing
out `torch/version.py` lives in a custom command in torch/CMakeLists.txt
I noticed a small inconsistency in how version info is populated.
`TORCH_BUILD_VERSION` is populated from `setup.py` at configuration
time, while `torch/version.py` is written at build time. So if, e.g., you
configured cmake on a certain git rev, then built it on another, the
two versions would be inconsistent.
This does not appear to matter, so I opted to preserve the existing
behavior.
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D23734781
Pulled By: suo
fbshipit-source-id: 4002c9ec8058503dc0550f8eece2256bc98c03a4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44585
**Summary**
This commit disallows plain `Tuple` type annotations without any
contained types both in type comments and in-line as Python3-style
type annotations.
**Test Plan**
This commit adds a unit test for these two situations.
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D23721515
Pulled By: SplitInfinity
fbshipit-source-id: e11c77a4fac0b81cd535c37a31b9f4129c276592
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44584
**Summary**
This commit extends the work done in #38130 and disallows plain
Python3-style `List` type annotations.
**Test Plan**
This commit extends `TestList.test_no_element_type_annotation` to the
Python3-style type annotation.
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D23721514
Pulled By: SplitInfinity
fbshipit-source-id: 48957868286f44ab6d5bf5e1bf97f0a4ebf955df
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44334
**Summary**
This commit detects and prohibits the case in which `typing.Dict` is
used as an annotation without type arguments (i.e. `typing.Dict[K, V]`).
At present, `typing.Dict` is always assumed to have two arguments, and
when it is used without them, `typing.Dict.__args__` is nonempty and
contains some `typing.TypeVar` instances, which have no JIT type equivalent.
Consequently, trying to convert `typing.Dict` to a JIT type results in
a `c10::DictType` with `nullptr` for its key and value types, which can cause
a segmentation fault.
This is fixed by returning a `DictType` from
`jit.annotations.try_ann_to_type` only if the key and value types are converted
successfully to a JIT type and returning `None` otherwise.
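A brief sketch of the effect: fully-specified annotations still compile, while a bare `Dict` is now rejected at script time instead of crashing later.
```python
import torch
from typing import Dict

@torch.jit.script
def count(x: Dict[str, int]) -> int:   # OK: key and value types are given
    return len(x)

# @torch.jit.script
# def bad(x: Dict) -> int:             # now raises an error instead of segfaulting
#     return len(x)
```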
**Test Plan**
This commit adds a unit test to `TestDict` that tests the plain `Dict`
annotations throw an error.
**Fixes**
This commit closes #43530.
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D23610766
Pulled By: SplitInfinity
fbshipit-source-id: 036b10eff6e3206e0da3131cfb4997d8189c4fec
Summary:
Unifies a number of partial solutions to the thread and block dimension extent masking, including the NoThreadIdxWriter and my last fix https://github.com/pytorch/pytorch/issues/44325. The NoThreadIdxWriter is gone in favour of tracking the current loop extents and masking any statements that have a lower rank than the launch parameters in any Block or Thread dimension, which handles both the "no" and "smaller" axis binding cases.
For example it will transform the following:
```
for i in 0..10 // blockIdx.x
  for j in 0..10 // threadIdx.x
    do thing(i, j);
  for k in 0..5 // threadIdx.x
    do other thing(i, k);
```
Into:
```
do thing(blockIdx.x, threadIdx.x);
if (threadIdx.x < 5) {
  do other thing(blockIdx.x, threadIdx.x);
}
```
It will also handle the case where statements are not bound by any axis, e.g.
```
do outer thing;
for i in 0..10 // blockIdx.x
  for j in 0..10 // threadIdx.x
    do thing(i, j);
  do other thing(i);
```
will become:
```
if (blockIdx.x < 1) {
  if (threadIdx.x < 1) {
    do outer thing;
  }
}
syncthreads();
do thing(blockIdx.x, threadIdx.x);
syncthreads();
if (threadIdx.x < 1) {
  do other thing(blockIdx.x);
}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44733
Reviewed By: mruberry
Differential Revision: D23736878
Pulled By: nickgg
fbshipit-source-id: 52d08626ae8043d53eb937843466874d479a6768
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44703
The description of this public function should be in the header file.
Also fix some typos.
Test Plan: N/A.
Reviewed By: pritamdamania87
Differential Revision: D23703661
fbshipit-source-id: 24ae63de9498e321b31dfb2efadb44183c6370df
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44663
The new API returns the type of the data object referenced by this
`RRef`. On the owner, this is same as `type(rref.local_value())`.
On a user, this will trigger an RPC to fetch the `type` object from
the owner. After this function is run once, the `type` object is
cached by the `RRef`, and subsequent invocations no longer trigger
RPC.
closes #33210
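A sketch of how the new API might be used; the method name `_get_type` is an assumption, and an initialized RPC agent is assumed:
```python
import torch
import torch.distributed.rpc as rpc

# Assumes init_rpc() has already been called on this worker.
rref = rpc.remote("worker1", torch.add, args=(torch.ones(2), 1))
print(rref._get_type())   # <class 'torch.Tensor'>; cached after the first (possibly remote) fetch
```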
Test Plan: Imported from OSS
Reviewed By: rohan-varma
Differential Revision: D23691990
Pulled By: mrshenli
fbshipit-source-id: a2d87cd601a691dd75164b6bcd7315245e9cf6bd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44439
Adds a test to ddp_under_dist_autograd_test to ensure that the uneven
inputs join() API works properly when DDP + RPC are combined. We test that when
running in outside DDP mode (DDP applied to the whole hybrid module) we can
correctly process uneven inputs across different trainers.
ghstack-source-id: 112156980
Test Plan: CI
Reviewed By: albanD
Differential Revision: D23612409
fbshipit-source-id: f1e328c096822042daaba263aa8747a9c7e89de7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44749
Ensure fx module is scriptable after calling prepare_qat on it
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qat_and_script
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23718380
fbshipit-source-id: abf63ffb21e707f7def8f6c88246877f5aded58c
Summary:
The subclass sets "self.last_epoch" when this is set in the parent class's init function. Why would we need to set last_epoch twice? I think calling "super" resets last_epoch anyway, so I am not sure why we would want to include this in the subclass. Am I missing something?
For the record, I am just a PyTorch enthusiast. I hope my question isn't totally silly.
Fixes #{issue number}
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44613
Reviewed By: albanD
Differential Revision: D23691770
Pulled By: mrshenli
fbshipit-source-id: 080d9acda86e1a2bfaafe2c6fcb8fc1544f8cf8a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44566
The Delegate objects were confusing. They were supposed to be a way to
configure how tracing works, but in some cases they appeared necessary
for constructing graphs, which was not true. This makes the organization
clearer by removing Delegate and moving its functionality into a Tracer class,
similar to how pickle has a Pickler class.
Test Plan: Imported from OSS
Reviewed By: jamesr66a
Differential Revision: D23683177
Pulled By: zdevito
fbshipit-source-id: 7605a34e65dfac9a487c0bada39a23ca1327ab00
Summary:
There's an annoying O(N^2) in module export logic that makes saving some of the models (if they have many classes) take eternity.
I'm not familiar enough with this code to properly untangle the deps and make it a pure hash lookup, so I just added a side lookup table for raw pointers. It's still quadratic, but it's O(num_classes^2) instead of O(num_classes * num_references), which already gives huge savings.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44589
Test Plan:
Tested with one of the offending models - just loading a saving a Torchscript file:
```
Before:
load 1.9239683151245117
save 165.74712467193604
After:
load 1.9409027099609375
save 1.4711427688598633
```
Reviewed By: suo
Differential Revision: D23675278
Pulled By: dzhulgakov
fbshipit-source-id: 8f3fa7730941085ea20d9255b49a149ac1bf64fe
Summary:
This is a reup https://github.com/pytorch/pytorch/issues/43885 with an extra commit which should fix the bugs that caused it to be reverted. Read that for general context.
The issue here was that we were still using the side maps `tensor_to_stmt_` and `stmt_to_tensor_` which get invalidated by any transform of the IR (rather than just any transform that isn't computeInline). I added a comment about this but didn't actually address our usages of it.
I've removed these maps and changed the `getLoopBodyFor` and `getLoopStatementsFor` helpers to search the root stmt directly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44231
Reviewed By: albanD
Differential Revision: D23689688
Pulled By: nickgg
fbshipit-source-id: 1c6009a880f8c0cebf2300fd06b5cc9322bffbf9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44654
Previously we weren't creating a fallback graph as intended in specialize autograd zero, so if a Tensor failed one of our undefinedness checks we would run the backward normally without reprofiling & optimizing.
Test Plan: Imported from OSS
Reviewed By: jamesr66a
Differential Revision: D23691764
Pulled By: eellison
fbshipit-source-id: 10c6fa79518c84a6f5ef2bfbd9ea10843af751eb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44326
Part of relanding PR #41954, this refactoring moves the rebuild_buckets call from the end of the first iteration to the beginning of the second iteration.
ghstack-source-id: 112011490
Test Plan: unit tests
Reviewed By: mrshenli
Differential Revision: D23583017
fbshipit-source-id: ef67f79437a820d9b5699b651803622418499a83
Summary:
This PR adds dilation to the _ConvTransposeNd._output_padding method and tests it using a bunch of differently sized inputs.
Fixes https://github.com/pytorch/pytorch/issues/14272
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43793
Reviewed By: zou3519
Differential Revision: D23493313
Pulled By: ezyang
fbshipit-source-id: bca605c428cbf3a97d3d24316d8d7fde4bddb307
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42390
**Summary**
This commit extends support for properties to include
ScriptModules.
**Test Plan**
This commit adds a unit test that has a ScriptModule with
a user-defined property.
`python test/test_jit_py3.py TestScriptPy3.test_module_properties`
Test Plan: Imported from OSS
Reviewed By: eellison, mannatsingh
Differential Revision: D22880298
Pulled By: SplitInfinity
fbshipit-source-id: 74f6cb80f716084339e2151ca25092b6341a1560
Summary:
We were hitting an assert error when you passed in an empty `List[List[int]]` - this fixes that error by not recursing into 0-element tensors.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44652
Reviewed By: ZolotukhinM
Differential Revision: D23688247
Pulled By: eellison
fbshipit-source-id: d48ea24893044fae96bc39f76c0f1f9726eaf4c7
Summary:
This PR:
- updates div to perform true division
- makes torch.true_divide an alias of torch.div
This follows on work in previous PyTorch releases that first deprecated div performing "integer" or "floor" division, then prevented it by throwing a runtime error.
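For example, integer inputs now divide like Python's `/` operator:
```python
import torch

a, b = torch.tensor([5]), torch.tensor([2])
torch.div(a, b)   # tensor([2.5000]) -- same result as torch.true_divide(a, b)
```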
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42907
Reviewed By: ngimel
Differential Revision: D23622114
Pulled By: mruberry
fbshipit-source-id: 414c7e3c1a662a6c3c731ad99cc942507d843927
Summary:
* Support sequence type (de)serialization, enables onnx shape inference on sequence nodes.
* Fix shape inference with block input/output: e.g. Loop and If nodes.
* Fix bugs in symbolics discovered by the coverage of ONNX shape inference.
* Improve debuggability: added more jit logs. For simplicity, the default log level, when jit log is enabled, will not dump ir graphs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43929
Reviewed By: albanD
Differential Revision: D23674604
Pulled By: bzinodev
fbshipit-source-id: ab6aacb16d0e3b9a4708845bce27c6d65e567ba7
Summary:
When caller / callee pairs are inserted into the mapping, verify that
the arity of the buffer access is consistent with its declared rank.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44561
Test Plan: CI, test_tensorexpr --gtest_filter=TensorExprTest.DetectInlineRankMismatch
Reviewed By: albanD
Differential Revision: D23684342
Pulled By: asuhan
fbshipit-source-id: dd3a0cdd4c2492853fa68381468e0ec037136cab
Summary:
Fixes https://github.com/pytorch/pytorch/issues/43389.
This PR replaces the old ELU formula from the docs that yields wrong results for negative alphas with the new one that fixes the issue and relies on the cases notation which makes the formula more straightforward.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43764
Reviewed By: ailzhang
Differential Revision: D23425532
Pulled By: albanD
fbshipit-source-id: d0931996e5667897d926ba4fc7a8cc66e8a66837
Summary:
Improve simplification of nested Min and Max patterns.
Specifically, handles the following pattern simplifications:
* `Max(A, Max(A, Const)) => Max(A, Const)`
* `Max(Min(A, B), Min(A, C)) => Min(A, Max(B, C))`
* `Max(Const, Max(A, OtherConst)) => Max(A, Max(Const, OtherConst))`
- This case can have an arbitrarily long chain of Max ops. For example: `Max(5, Max(x, Max(y, Max(z, 8)))) => Max(Max(Max(x, 8), y), z)`
Similarly, for the case of Min as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44142
Reviewed By: albanD
Differential Revision: D23644486
Pulled By: navahgar
fbshipit-source-id: 42bd241e6c2af820566744c8494e5dee172107f4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44562
Add a note that torch.median returns the smaller of the two middle elements for even-sized input and refer user to torch.quantile for the mean of the middle values.
fixes https://github.com/pytorch/pytorch/issues/39520
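For an even-sized input, the two behave as follows:
```python
import torch

t = torch.tensor([1., 2., 3., 4.])
torch.median(t)         # tensor(2.)     -- smaller of the two middle values
torch.quantile(t, 0.5)  # tensor(2.5000) -- mean of the two middle values
```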
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D23657208
Pulled By: heitorschueroff
fbshipit-source-id: 2747aa652d1e7f10229d9299b089295aeae092c2
Summary:
We run remove-profile-nodes and specialize-types before batch_mm, so we cannot run peepholes on the type information of tensors, since these properties have not been guarded and are therefore not guaranteed to be correct.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44565
Reviewed By: albanD
Differential Revision: D23661538
Pulled By: eellison
fbshipit-source-id: 0dd23a65714f047f49b4db4ec582b21870925fe1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44622
Remove an extra empty line in the warning comments.
Test Plan: N/A
Reviewed By: rohan-varma
Differential Revision: D23674070
fbshipit-source-id: 4ee570590c66a72fb808e9ee034fb773b833efcd
Summary:
This adds HIP version info to the `collect_env.py` output.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44106
Reviewed By: VitalyFedyunin
Differential Revision: D23652341
Pulled By: zou3519
fbshipit-source-id: a1f5bce8da7ad27a1277a95885934293d0fd43c5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44442
I noticed lock contention on startup as lookupByLiteral() was
calling registerPendingOperators() - some calls were holding the
lock for 10+ ms, as operators were being registered.
canonicalSchemaString() was using ostreamstring, which isn't typically
particularly fast (partly because of c++ spec locale requirements).
If we repalce with regular c++ string appends, it's somewhat faster
(which isn't hard when comparing with stringstream; albeit a bit
more codegen)
Over the first minute or so, this cuts out 1.4 seconds under the
OperatorRegistry lock (as part of registerPendingOperators) in the
first couple minutes of run time (mostly front-loaded) when running
sync sgd.
As an example, before:
registerPendingOperators 12688 usec for 2449 operators
After:
registerPendingOperators 6853 usec for 2449 operators
ghstack-source-id: 111862971
Test Plan: buck test mode/dev-nosan caffe2/test/cpp/...
Reviewed By: ailzhang
Differential Revision: D23614515
fbshipit-source-id: e712f9dac5bca0b1876e11fb8f0850402f03873a
Summary:
Fixes https://github.com/pytorch/pytorch/issues/44219
Rebasing https://github.com/pytorch/pytorch/pull/44288 and fixing the git history.
This allows users to benchmark code without having to specify how long to run the benchmark. It runs the benchmark until the variance (IQR / median) is low enough that we can be confident in the measurement.
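A sketch of how this is used via `torch.utils.benchmark.Timer`; the adaptive entry point is assumed to be `adaptive_autorange`:
```python
from torch.utils.benchmark import Timer

t = Timer(stmt="torch.mm(x, x)",
          setup="import torch; x = torch.randn(128, 128)")
m = t.adaptive_autorange()   # keeps measuring until IQR / median is small enough
print(m)
```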
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44607
Test Plan: There are unit tests, and we manually tested using Examples posted in git.
Reviewed By: robieta
Differential Revision: D23671208
Pulled By: bitfort
fbshipit-source-id: d63184290b88b26fb81c2452e1ae701c7d513d12
Summary:
This fixes a `katex` error I was getting trying to build the docs:
```
ParseError: KaTeX parse error: Undefined control sequence: \0 at position 55: …gin{cases}
```
This failure was introduced in https://github.com/pytorch/pytorch/issues/42523.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44481
Reviewed By: colesbury
Differential Revision: D23627700
Pulled By: mruberry
fbshipit-source-id: 9cc09c687a7d9349da79a0ac87d6c962c9cfbe2d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44337
Add a new run_method to mobile Module which is variadic (takes any number of arguments) to match full jit.
ghstack-source-id: 111909068
Test Plan: Added new unit test to test_jit test suite
Reviewed By: linbinyu, ann-ss
Differential Revision: D23585763
fbshipit-source-id: 007cf852290f03615b78c35aa6f7a21287ccff9e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44588
1) SOURCE_DUMP crashes when invoked on a backward graph since
`prim::GradOf` nodes can't be printed as sources (they don't have
schema).
2) Dumping graph each time we execute an optimized plan produces lots of
output in tests where we run the graph multiple times (e.g.
benchmarks). Outputting that at the lowest level of verbosity seems
like overkill.
3) Duplicated log statement is removed.
Differential Revision: D23666812
Test Plan: Imported from OSS
Reviewed By: bertmaher
Pulled By: ZolotukhinM
fbshipit-source-id: b9a30e34fd39c85f3e13c3f1e3594e157e1c130f
Summary:
**BC-breaking note**
This change is BC-breaking for C++ callers of linspace and logspace if they were providing a steps argument that could not be converted to an optional.
**PR note**
This PR deprecates calling linspace and logspace without setting steps explicitly by:
- updating the documentation to warn that not setting steps is deprecated
- warning (once) when linspace and logspace are called without steps being specified
A test for this behavior is added to test_tensor_creation_ops. The warning only appears once per process, however, so the test would pass even if no warning were thrown. Ideally there would be a mechanism to force all warnings, including those from TORCH_WARN_ONCE, to trigger.
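For example, after this PR:
```python
import torch

torch.linspace(0, 2, steps=5)   # explicit steps: tensor([0.0, 0.5, 1.0, 1.5, 2.0]), no warning
torch.linspace(0, 2)            # still works, but warns once that omitting steps is deprecated
```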
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43860
Reviewed By: izdeby
Differential Revision: D23498980
Pulled By: mruberry
fbshipit-source-id: c48d7a58896714d184cb6ff2a48e964243fafc90
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44340
Changed the constructor of GradBucket to pass the input by const
reference, and hence avoided unnecessary explicit move semantics. Since
previously the declaration and definition were separated, passing the input
tensor vector by value looked quite bizarre.
Test Plan: buck test caffe2/torch/lib/c10d:ProcessGroupGlooTest
Reviewed By: pritamdamania87
Differential Revision: D23569939
fbshipit-source-id: db761d42e76bf938089a0b38e98e76a05bcf4162
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44339
Moved the inline implementations of GradBucket class to the header for
succinctness and readability. This coding style is also consistent with
reducer.h under the same directory.
Test Plan: buck test caffe2/torch/lib/c10d:ProcessGroupGlooTest
Reviewed By: pritamdamania87
Differential Revision: D23569701
fbshipit-source-id: 237d9e2c5f63a6bcac829d0fcb4a5ba3bede75e5
Summary:
Follow up to https://github.com/pytorch/pytorch/pull/36404
Adding prim::device and prim::dtype to the list of skipped peepholes when we run inlining. In the long term, a better fix may be to not encode shape / dtype info on the traced graph, because it is not guaranteed to be correct. This is blocked by ONNX currently.
Partial fix for https://github.com/pytorch/pytorch/issues/43134
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43363
Reviewed By: glaringlee
Differential Revision: D23383987
Pulled By: eellison
fbshipit-source-id: 2e9c5160d39d690046bd9904be979d58af8d3a20
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44564
Before this change we sometimes inlined autodiff subgraph containing
fusion groups. This happened because we didn't look for 'unsupported'
nodes recursively (maybe we should), but fusion groups were inside
if-nodes.
The problem was detected by bertmaher in 'LearningToPaint' benchmark
investigation where this bug caused us to keep constantly hitting
fallback paths of the graph.
Test Plan: Imported from OSS
Reviewed By: bwasti
Differential Revision: D23657049
Pulled By: ZolotukhinM
fbshipit-source-id: 7c853424f6dce4b5c344d6cd9c467ee04a8f167e
Summary:
Fix an issue where loops of different sizes are bound to the same Cuda dimension / metavar.
More info and tests coming soon...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44325
Reviewed By: colesbury
Differential Revision: D23628859
Pulled By: nickgg
fbshipit-source-id: 3621850a4cc38a790b62ad168d32e7a0e2462fad
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43043
This adds support for rpc_sync in TorchScript in a way similar to
rpc_async.
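A sketch of what this enables inside TorchScript; it assumes an initialized RPC agent and that the call signature mirrors `rpc_async`:
```python
import torch
import torch.distributed.rpc as rpc

@torch.jit.script
def add_remote(to: str, x: torch.Tensor) -> torch.Tensor:
    return rpc.rpc_sync(to, torch.add, args=(x, 1))
```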
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision: D23252039
Pulled By: wanchaol
fbshipit-source-id: 8a05329cb8a24079b2863178b73087d47273914c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44537
Originally, the `min_val`, `max_val`, `min_vals`, `max_vals`
attributes of observers were Tensors but not buffers. They had custom
state_dict save/load code to ensure their state was saved.
At some point, these attributes became buffers, and the custom
save/load code remained. This introduced a subtle bug:
* create model A, move it to a device (cpu/cuda) and save its state_dict
* create model B, load its state dict.
* `min_val|min_vals|max_val|max_vals` would always be loaded to model A's device, even if the rest of model B was on a different device
* the above is inconsistent with how save/load on different devices is expected to work (see https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-loading-model-across-devices)
In practice, the case people would sometimes hit is:
* model A is on CPU, state dict is saved
* model B is created and moved to GPU, state_dict from model A is loaded
* assertions throw when operations are attempted across different devices
This PR fixes the behavior by removing the custom save/load where
possible and letting the default `nn.Module` save/load code handle
device assignment. We special case `PerChannelMinMaxObserver` and its
children to allow for loading buffers of different size, which is
normal.
There are some followups to also enable this for HistogramObserver
and FakeQuantize, which can be done in separate PRs due to higher
complexity.
Test Plan:
```
python test/test_quantization.py TestObserver.test_state_dict_respects_device_affinity
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D23644493
fbshipit-source-id: 0dbb6aa309ad569a91a663b9ee7e44644080032e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44486
SmoothL1Loss had a completely different (and incorrect, see #43228) path when target.requires_grad was True.
This PR does the following:
1) adds derivative support for target via the normal derivatives.yaml route
2) kill the different (and incorrect) path for when target.requires_grad was True
3) modify the SmoothL1Loss CriterionTests to verify that the target derivative is checked.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D23630699
Pulled By: gchanan
fbshipit-source-id: 0f94d1a928002122d6b6875182867618e713a917
Summary:
Add new transforms `sliceHead` and `sliceTail` to `LoopNest`, for example:
Before transformation:
```
for x in 0..10:
  A[x] = x*2
```
After `sliceHead(x, 4)`:
```
for x in 0..4:
  A[x] = x*2
for x in 4..10:
  A[x] = x*2
```
After `sliceTail(x, 1)`:
```
for x in 0..4:
  A[x] = x*2
for x in 4..9:
  A[x] = x*2
for x in 9..10:
  A[x] = x*2
```
`sliceHead(x, 10)` and `sliceTail(x, 10)` are no-ops.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43854
Test Plan: Tests are added in `test_loopnest.cpp`, the tests cover the basic transformations, and also tests the combination with other transformations such as `splitWithTail`.
Reviewed By: nickgg
Differential Revision: D23417366
Pulled By: cheng-chang
fbshipit-source-id: 06c6348285f2bafb4be3286d1642bfbe1ea499bf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44235
Removes nonvariadic run_method() from mobile Module entirely (to be later replaced by a variadic version). All use cases should have been migrated to use get_method() and Method::operator() in D23436351
ghstack-source-id: 111848220
Test Plan: CI
Reviewed By: iseeyuan
Differential Revision: D23484577
fbshipit-source-id: 602fcde61e13047a34915b509da048b9550103b1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44202
In preparation for changing mobile run_method() to be variadic, this diff:
* Implements get_method() for mobile Module, which is similar to find_method but expects the method to exist.
* Replaces calls to the current nonvariadic implementation of run_method() by calling get_method() and then invoking the operator() overload on Method objects.
ghstack-source-id: 111848222
Test Plan: CI, and all the unit tests which currently contain run_method that are being changed.
Reviewed By: iseeyuan
Differential Revision: D23436351
fbshipit-source-id: 4655ed7182d8b6f111645d69798465879b67a577
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43025
- Use new overloads that better reflect the arguments to interpolate.
- More uniform interface for upsample ops allows simplifying the Python code.
- Also reorder overloads in native_functions.yaml to give them priority.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37177
ghstack-source-id: 106938111
Test Plan:
test_nn has pretty good coverage.
Relying on CI for ONNX, etc.
Didn't test FC because this change is *not* forward compatible.
To ensure backwards compatibility, I ran this code before this change
```python
def test_func(arg):
    interp = torch.nn.functional.interpolate
    with_size = interp(arg, size=(16,16))
    with_scale = interp(arg, scale_factor=[2.1, 2.2], recompute_scale_factor=False)
    with_compute = interp(arg, scale_factor=[2.1, 2.2])
    return (with_size, with_scale, with_compute)
traced_func = torch.jit.trace(test_func, torch.randn(1,1,1,1))
sample = torch.randn(1, 3, 7, 7)
output = traced_func(sample)
assert not torch.allclose(output[1], output[2])
torch.jit.save(traced_func, "model.pt")
torch.save((sample, output), "data.pt")
```
then this code after this change
```python
model = torch.jit.load("model.pt")
sample, golden = torch.load("data.pt")
result = model(sample)
for r, g in zip(result, golden):
    assert torch.allclose(r, g)
```
Reviewed By: AshkanAliabadi
Differential Revision: D21209991
fbshipit-source-id: 5b2ebb7c3ed76947361fe532d1dbdd6faa3544c8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44471
L1Loss had a completely different (and incorrect, see #43228) path when target.requires_grad was True.
This PR does the following:
1) adds derivative support for target via the normal derivatives.yaml route
2) kill the different (and incorrect) path for when target.requires_grad was True
3) modify the L1Loss CriterionTests to verify that the target derivative is checked.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D23626008
Pulled By: gchanan
fbshipit-source-id: 2828be16b56b8dabe114962223d71b0e9a85f0f5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44500
Some user models are using those operators. Unblock them while keeping the ops selective.
Test Plan: CI
Reviewed By: linbinyu
Differential Revision: D23634769
fbshipit-source-id: 55841d1b07136b6a27b6a39342f321638dc508cd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44525
Since `TEST_SKIPS` is a global multiprocessing.manager, this was causing
issues when one test would fail and make the rest of the tests fail during
setup due to networking errors.
See the failed CI job: https://app.circleci.com/pipelines/github/pytorch/pytorch/212491/workflows/0450151d-ca09-4cf6-863d-272de6ed917f/jobs/7389065 for an example, where `test_ddp_backward` failed but then caused the rest of the tests to fail at the line `test_skips.update(TEST_SKIPS)`.
To fix this issue, at the end of every test we revert `TEST_SKIPS` back to a regular dict, and redo the conversion to a `multiprocessing.Manager` in the next test, which prevents these errors.
ghstack-source-id: 111844724
Test Plan: CI
Reviewed By: malfet
Differential Revision: D23641618
fbshipit-source-id: 27ce823968ece9804bb4dda898ffac43ef732b89
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44437
MSELoss had a completely different (and incorrect, see https://github.com/pytorch/pytorch/issues/43228) path when target.requires_grad was True.
This PR does the following:
1) adds derivative support for target via the normal derivatives.yaml route
2) kill the different (and incorrect) path for when target.requires_grad was True
3) modify the MSELoss CriterionTests to verify that the target derivative is checked.
TODO:
1) do we still need check_criterion_jacobian when we run grad/gradgrad checks?
2) ensure the Module tests check when target.requires_grad
3) do we actually test when reduction='none' and reduction='mean'?
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D23612166
Pulled By: gchanan
fbshipit-source-id: 4f74d38d8a81063c74e002e07fbb7837b2172a10
Summary:
Fixes a bug in the NNC registerizer for Cuda where it would hoist reads out of a conditional context when trying to cache them. As a quick fix, prevent scalar replacement if a usage is within a condition.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44223
Reviewed By: gchanan
Differential Revision: D23551247
Pulled By: nickgg
fbshipit-source-id: 17a7bf2be4c8c3dd8a9ab7997dce9aea200c3685
Summary:
Previously we were not removing profiling nodes in graphs that required grad and contained diff graphs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44420
Reviewed By: bertmaher
Differential Revision: D23607482
Pulled By: eellison
fbshipit-source-id: af095f3ed8bb3c5d09610f38cc7d1481cbbd2613
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44493
This function allows executing a graph exactly as it is, without going
through a graph executor, which would run passes on the graph before
interpreting it. I found this feature extremely helpful when I worked on
a stress-testing script to shake out bugs from the TE fuser: I needed to
run a very specific set of passes on a graph and nothing else, and
then execute exactly that graph.
Test Plan: Imported from OSS
Reviewed By: jamesr66a
Differential Revision: D23632505
Pulled By: ZolotukhinM
fbshipit-source-id: ea81fc838933743e2057312d3156b77284d832ef
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44411
This basically aborts errored NCCL communicators if either blocking
wait or async error handling is enabled. Otherwise we may abort nccl
communicators where neither are enabled, and this may result in subsequent GPU
operations using corrupted data.
ghstack-source-id: 111839264
Test Plan: Succesful Flow run: f217591683
Reviewed By: jiayisuse
Differential Revision: D23605382
fbshipit-source-id: 6c16f9626362be3b0ce2feaf0979b2dff97ce61b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44410
See #44052 for context. One of the cumprod_backward overloads was unused
so I just deleted it.
Test Plan: - `pytest test/test_autograd.py -v`
Reviewed By: mrshenli
Differential Revision: D23605503
Pulled By: zou3519
fbshipit-source-id: f9c5b595e62d2d6e71f26580ba96df15cc9de4f7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44427
Closes https://github.com/pytorch/pytorch/issues/44425
DDP join API currently does not work properly with `model.no_sync()`, see https://github.com/pytorch/pytorch/issues/44425 for details. This PR fixes the problem via the approach mentioned in the issue, namely scheduling an allreduce that tells joined ranks whether to sync in the backwards pass or not. Tests are added for skipping gradient synchronization for various `sync_interval`s.
ghstack-source-id: 111786479
Reviewed By: pritamdamania87
Differential Revision: D23609070
fbshipit-source-id: e8716b7881f8eee95e3e3499283e716bd3d7fe76
Summary:
This PR fixes three OpInfo-related bugs and moves some functions from TestTorchMathOps to be tested using the OpInfo pattern. The bugs are:
- A skip test path in test_ops.py incorrectly formatted its string argument
- Decorating the tests in common_device_type.py was incorrectly always applying decorators to the original test, not the op-specific variant of the test. This could cause the same decorator to be applied multiple times, overriding past applications.
- make_tensor was incorrectly constructing tensors in some cases
The functions moved are:
- asin
- asinh
- sinh
- acosh
- tan
- atan
- atanh
- tanh
- log
- log10
- log1p
- log2
In a follow-up PR more or all of the remaining functions in TestTorchMathOps will be refactored as OpInfo-based tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44277
Reviewed By: mrshenli, ngimel
Differential Revision: D23617361
Pulled By: mruberry
fbshipit-source-id: edb292947769967de9383f6a84eb327f027509e0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44224
The purpose of this file is to help developers on PT distributed get
upto speed on the code structure and layout for PT Distributed.
ghstack-source-id: 111644842
Test Plan: waitforbuildbot
Reviewed By: rohan-varma
Differential Revision: D23548377
fbshipit-source-id: 561d5b8e257642de172def8fdcc1311fae20690b
Summary:
To help with further typing, move dynamically added native contributions from `torch.autograd` to `torch._C._autograd`
Fix an invalid error handling pattern in
89ac30afb8/torch/csrc/autograd/init.cpp (L13-L15):
`PyImport_ImportModule` already raises a Python exception, and nullptr should be returned to properly propagate it to the Python runtime.
All native methods/types are added to `torch/autograd/__init__.py` after `torch._C._init_autograd()` has been called.
Use f-strings instead of `.format` in test_type_hints.py.
Fixes https://github.com/pytorch/pytorch/issues/44450
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44451
Reviewed By: ezyang
Differential Revision: D23618261
Pulled By: malfet
fbshipit-source-id: fa5f739d7cff8410641128b55b810318c5f636ae
Summary:
Previously, the specialized types were copied over to the fallback function, even though the tensors in the fallback were not of those types.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44434
Reviewed By: SplitInfinity
Differential Revision: D23611943
Pulled By: eellison
fbshipit-source-id: 2ea88a97529409f6c5c4c1f59a14b623524933de
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44347
Cloned from Pull Request resolved: https://github.com/pytorch/pytorch/pull/44097, because the original author Sinan has completed the internship and is now unable to submit this diff.
As johnsonpaul mentioned in D23277575 (7d517cf96f), it looks like all processes were allocating memory on GPU 0.
I was able to reproduce it by running the `test_ddp_comm_hook_allreduce_with_then_hook_nccl` unit test of `test_c10d.py` and running `nvidia-smi` while the test was running. The issue was reproduced as:
```
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 3132563 C python 777MiB |
| 0 3132564 C python 775MiB |
| 4 3132564 C python 473MiB |
+-----------------------------------------------------------------------------+
```
I realized that as we initialized ProcessGroupNCCL, both processes were initially allocating memory on GPU 0.
We later also realized that I had forgotten the `isHighPriority` argument of `getStreamFromPool`, so `futureNCCLCallbackStreams_.push_back(std::make_shared<at::cuda::CUDAStream>(at::cuda::getStreamFromPool(device_index)));` was just creating a vector of GPU 0 streams (the device index was being passed as the `isHighPriority` flag). After I changed `at::cuda::getStreamFromPool(device_index)` to `at::cuda::getStreamFromPool(false, device_index)`, `nvidia-smi` looked like:
```
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 673925 C python 771MiB |
| 0 673926 C python 771MiB |
| 1 673925 C python 771MiB |
| 1 673926 C python 771MiB |
| 2 673925 C python 771MiB |
| 2 673926 C python 771MiB |
| 3 673925 C python 771MiB |
| 3 673926 C python 771MiB |
| 4 673925 C python 771MiB |
| 4 673926 C python 771MiB |
| 5 673925 C python 771MiB |
| 5 673926 C python 771MiB |
| 6 673925 C python 771MiB |
| 6 673926 C python 771MiB |
| 7 673925 C python 707MiB |
| 7 673926 C python 623MiB |
+-----------------------------------------------------------------------------+
```
This confirms that we were just getting GPU 0 streams for the callback. I think this does not explain the `fp16_compress` stability issue, because we were able to reproduce that even without any `then` callback, just by copying from fp32 to fp16 before the allreduce. However, this can explain other issues where `allreduce` was not on par with `no_hook`. I'll run some additional simulations with this diff.
I tried to replace `getStreamFromPool` with `getDefaultCUDAStream(deviceIndex)` and it wasn't causing additional memory usage. In this diff, I temporarily solved the issue by initializing null pointers for each device in the constructor and setting the callback stream for the corresponding devices inside `ProcessGroupNCCL::getNCCLComm`. After the fix, it looks like the memory issue was resolved:
```
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 2513142 C python 745MiB |
| 4 2513144 C python 747MiB |
+-----------------------------------------------------------------------------+
```
I could use a dictionary instead of a vector for `futureNCCLCallbackStreams_`, but since the number of devices is fixed, I don't think it's necessary. Please let me know what you think in the comments.
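Not part of the diff: a small Python sketch for spot-checking per-device allocator usage from inside the test process, as an alternative to watching `nvidia-smi`:
```py
import torch

def report_gpu_memory() -> None:
    # Print allocator stats for every visible GPU so a stray allocation
    # on GPU 0 shows up immediately.
    for idx in range(torch.cuda.device_count()):
        allocated = torch.cuda.memory_allocated(idx) / 2**20
        reserved = torch.cuda.memory_reserved(idx) / 2**20
        print(f"cuda:{idx} allocated={allocated:.1f} MiB reserved={reserved:.1f} MiB")
```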
ghstack-source-id: 111485483
Test Plan:
`test_c10d.py` and some perf tests. Also checked `nvidia-smi` while running tests to validate that memory usage looks okay.
This diff also fixes the regression in HPC tests when we register a hook:
{F322730175}
See https://fb.quip.com/IGuaAbD8bnvy (474fdd7e2d) for details.
Reviewed By: pritamdamania87
Differential Revision: D23495436
fbshipit-source-id: ad08e1d94343252224595d7c8a279fe75e244822
Summary:
This PR fixes unexpected `SystemError` when warnings are emitted and warning filters are set.
## Current behavior
```
$ python -Werror
>>> import torch
>>> torch.range(1, 3)
UserWarning: torch.range is deprecated in favor of torch.arange and will be removed in 0.5. Note that arange generates values in [start; end), not [start; end].
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
SystemError: <built-in method range of type object at 0x7f38c7703a60> returned a result with an error set
```
## Expected behavior
```
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UserWarning: torch.range is deprecated and will be removed in a future release because its behavior is inconsistent with Python's range builtin. Instead, use torch.arange, which produces values in [start, end).
```
## Note
A Python exception must be raised if `PyErr_WarnEx` returns `-1` ([python docs](https://docs.python.org/3/c-api/exceptions.html#issuing-warnings)). This PR fixes warnings raised in the following code:
```py
import torch
torch.range(1, 3)
torch.autograd.Variable().volatile
torch.autograd.Variable().volatile = True
torch.tensor(torch.tensor([]))
torch.tensor([]).new_tensor(torch.tensor([]))
```
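With the fix in place, the same situation can also be exercised programmatically; a minimal sketch, assuming a build that includes this change:
```py
import warnings
import torch

# Turn UserWarnings into exceptions, as `python -W error` does.
warnings.simplefilter("error", UserWarning)
try:
    torch.range(1, 3)
except UserWarning as exc:
    # Previously this surfaced as a SystemError; now the warning
    # propagates as a regular Python exception.
    print("caught:", exc)
```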
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44371
Reviewed By: mrshenli
Differential Revision: D23598410
Pulled By: albanD
fbshipit-source-id: 2fbcb13fe4025dbebaf1fd837d4c8e0944e05010
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44398
These end up executing the same tests, so there is no reason to keep them separate.
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision: D23600855
Pulled By: gchanan
fbshipit-source-id: 0952492771498bf813f1bf8e1d7c8dce574ec965
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43958
There isn't any difference between these tests (I'm merging them), so let's merge them in the JIT as well.
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision: D23452337
Pulled By: gchanan
fbshipit-source-id: e6d13cdb164205eec3dbb7cdcd0052b02c961778
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44381
Perhaps this was necessary when the test was originally introduced, but it's difficult to figure out what is actually being tested. And I don't think we actually use NotImplementedErrors.
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision: D23598646
Pulled By: gchanan
fbshipit-source-id: aa18154bfc4969cca22323e61683a301198823be
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44226
**Summary**
At present, the `share_types` argument to `create_script_module` is used
to decide whether to reuse a previously created type for a top-level
module that has not yet been compiled. However, that setting does not apply
to the compilation of submodules of the top-level module; types are
still reused if possible.
This commit modifies `create_script_module` so that the `share_types`
flag is honoured during submodule compilation as well.
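A hedged sketch of the intended behaviour, using the internal `torch.jit._recursive` entry points named in this summary (`infer_methods_to_compile` is assumed as the usual stubs function; these are not public APIs and may change):
```py
import torch.nn as nn
from torch.jit._recursive import create_script_module, infer_methods_to_compile

class Sub(nn.Module):
    def forward(self, x):
        return x + 1

class Top(nn.Module):
    def __init__(self):
        super().__init__()
        self.a = Sub()
        self.b = Sub()

    def forward(self, x):
        return self.b(self.a(x))

# With share_types=False, neither the top-level module nor its submodules
# should reuse previously created JIT types after this change.
scripted = create_script_module(Top(), infer_methods_to_compile, share_types=False)
```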
**Test Plan**
This commit adds a unit test to `TestTypeSharing` that checks that
submodule types are not shared or reused when `share_types` is set to
`False`.
**Fixes**
This commit fixes #43605.
Test Plan: Imported from OSS
Reviewed By: eellison
Differential Revision: D23602371
Pulled By: SplitInfinity
fbshipit-source-id: b909b8b6abbe3b4cb9be8319ac263ade90e83bd3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44352
**Summary**
This commit adds support for `del` with class instances. If a class
implements `__delitem__`, then `del class_instance[key]` is syntactic
sugar for `class_instance.__delitem__(key)`.
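A minimal sketch of the new behaviour (the `Bag` class is made up for illustration):
```py
import torch

@torch.jit.script
class Bag(object):
    def __init__(self):
        self.items = {"a": 1, "b": 2}

    def __delitem__(self, key: str):
        del self.items[key]

@torch.jit.script
def drop(bag: Bag, key: str):
    # With this change, `del bag[key]` desugars to bag.__delitem__(key).
    del bag[key]
```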
**Test Plan**
This commit adds a unit test to TestClassTypes to test this feature.
Test Plan: Imported from OSS
Reviewed By: eellison
Differential Revision: D23603102
Pulled By: SplitInfinity
fbshipit-source-id: 28ad26ddc9a693a58a6c48a0e853a1c7cf5c9fd6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43384
Much like the FileStoreTests, the HashStoreTests were also run as a single blob and threw exceptions upon failure. This modularizes the tests by separating each function into a separate gtest test case.
ghstack-source-id: 111690834
Test Plan: Confirmed that the tests pass on devvm.
Reviewed By: jiayisuse
Differential Revision: D23257579
fbshipit-source-id: 7e821f0e9ee74c8b815f06facddfdb7dc2724294
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43383
The FileStore test currently has a large blob of tests that throw
exceptions upon failure. This PR modularizes each test so it can run
independently, and migrates the framework to gtest.
ghstack-source-id: 111690831
Test Plan: Confirmed tests pass on devvm
Reviewed By: jiayisuse
Differential Revision: D22879473
fbshipit-source-id: 6fa5468e594a53c9a6b972757068dfc41645703e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43382
StoreTestCommon defines standard helper functions that are used by all of our Store tests. These helpers currently throw exceptions upon failure; this PR changes them to use gtest assertions instead.
ghstack-source-id: 111690833
Test Plan: Tested the 2 PRs above this on devvm
Reviewed By: jiayisuse
Differential Revision: D22828156
fbshipit-source-id: 9e116cf2904e05ac0342a441e483501e00aad3dd
Summary:
Follow-up to https://github.com/pytorch/pytorch/pull/41946/: suggest enumerating the module as an alternative if a user tries indexing into a ModuleList/Sequential with something other than an integer literal.
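A hedged sketch of the suggested alternative (the module and sizes are made up):
```py
import torch
import torch.nn as nn

class Stack(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(4, 4) for _ in range(3)])

    def forward(self, x):
        # Indexing self.layers with a non-constant index is not scriptable;
        # iterating (here with enumerate) is the suggested alternative.
        for i, layer in enumerate(self.layers):
            x = layer(x)
        return x

scripted = torch.jit.script(Stack())
```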
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43361
Reviewed By: mrshenli
Differential Revision: D23602388
Pulled By: eellison
fbshipit-source-id: 51fa28d5bc45720529b3d45e92d367ee6c9e3316