Summary:
Fix https://github.com/pytorch/pytorch/issues/46242
This ensures that `check_inplace()` runs the proper checks even if the Tensor being modified in place does not require gradients, since the Tensor written into it might require gradients, making the in-place modification actually differentiable.
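A minimal sketch of the scenario, assuming the simplest view-based repro (see the linked issue for the exact case):
```
import torch

base = torch.zeros(3)                    # does not require grad
view = base[:]                           # a view of the non-differentiable base
src = torch.ones(3, requires_grad=True)
view.copy_(src)                          # writing a grad-requiring Tensor in
                                         # place makes the op differentiable
view.sum().backward()
print(src.grad)                          # tensor([1., 1., 1.])
```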
This contains:
- Codegen changes to tell `check_inplace()` if the inplace will be differentiable
- Changes in `handle_view_on_rebase` to work properly even when called for an input that does not require gradients (which was previously assumed to always be the case)
- Corresponding tests (both the warnings and the error would raise internal assert errors without this fix)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46296
Reviewed By: ezyang
Differential Revision: D24903770
Pulled By: albanD
fbshipit-source-id: 74e65dad3d2e3b9f762cbb7b39f92f19d9a0b094
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47952
We don't actually generate a TE kernel, so there is no need to use the
arena-allocation guard.
Test Plan:
```
buck test //caffe2/test/cpp/tensorexpr -- FuserPass
```
Reviewed By: ZolotukhinM
Differential Revision: D24967107
fbshipit-source-id: 302f65b2fcff704079e8b51b942b7b3baff95585
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47896
Per title
ghstack-source-id: 116710141
Test Plan: CI
Reviewed By: osalpekar
Differential Revision: D24943323
fbshipit-source-id: 7bf33ce3a021b9750b65e0c08f602c465cd81d28
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47818
This is another relatively small codegen.
Ideally we should use CppSignature.decl() to generate the C++ function declaration.
We didn't because it would need to add 'at::' to the types defined in the ATen namespace.
E.g.:
- standard declaration:
```
Tensor eye(int64_t n, int64_t m, const TensorOptions & options={})
```
- expected:
```
at::Tensor eye(int64_t n, int64_t m, const at::TensorOptions & options = {})
```
Kept the hacky fully_qualified_type() method for compatibility with the old codegen.
We could clean this up by:
- using these types in the torch namespace - but this is a user-facing header file,
and it's not clear whether that would cause problems;
- updating the cpp.argument_type() method to take an optional namespace argument.
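For illustration, a hypothetical sketch of the kind of rewriting fully_qualified_type() performs (the type list and regex here are illustrative, not the actual implementation):
```
import re

# Longer names first so the alternation matches greedily.
ATEN_TYPES = ('TensorOptions', 'Tensor', 'Scalar', 'Generator')

def fully_qualified_type(cpp_type: str) -> str:
    """Prefix bare ATen type names in a C++ declaration with 'at::'."""
    pattern = r'\b(?<!::)(' + '|'.join(ATEN_TYPES) + r')\b'
    return re.sub(pattern, r'at::\1', cpp_type)

print(fully_qualified_type('Tensor eye(int64_t n, const TensorOptions & options={})'))
# -> at::Tensor eye(int64_t n, const at::TensorOptions & options={})
```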
Confirmed byte-for-byte compatible with the old codegen:
```
Run it before and after this PR:
.jenkins/pytorch/codegen-test.sh <baseline_output_dir>
.jenkins/pytorch/codegen-test.sh <test_output_dir>
Then run diff to compare the generated files:
diff -Naur <baseline_output_dir> <test_output_dir>
```
Test Plan: Imported from OSS
Reviewed By: bhosmer
Differential Revision: D24909478
Pulled By: ljk53
fbshipit-source-id: a0ceaa60cc765c526908fee39f151cd7ed5ec923
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47746
- Removed the integration hack in gen_python_functions.py. It now directly
loads native_functions.yaml. All dependencies on Declarations.yaml
have been removed or moved elsewhere.
- Rewrote the deprecated.yaml parsing logic to work with the new data model directly.
Confirmed byte-for-byte compatible with the old codegen:
```
Run it before and after this PR:
.jenkins/pytorch/codegen-test.sh <baseline_output_dir>
.jenkins/pytorch/codegen-test.sh <test_output_dir>
Then run diff to compare the generated files:
diff -Naur <baseline_output_dir> <test_output_dir>
```
Differential Revision: D24885067
Test Plan: Imported from OSS
Reviewed By: bhosmer
Pulled By: ljk53
fbshipit-source-id: 8e906b7dd36a64395087bd290f6f54596485ceb4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47745
This is a relatively small codegen. Reintroduced 'simple_type' to preserve
the old codegen output.
It depends on some methods defined in gen_python_functions.py - the next PR will
clean up the remaining Declarations.yaml methods in gen_python_functions.py.
Confirmed byte-for-byte compatible with the old codegen:
```
Run it before and after this PR:
.jenkins/pytorch/codegen-test.sh <baseline_output_dir>
.jenkins/pytorch/codegen-test.sh <test_output_dir>
Then run diff to compare the generated files:
diff -Naur <baseline_output_dir> <test_output_dir>
```
Differential Revision: D24885068
Test Plan: Imported from OSS
Reviewed By: ezyang
Pulled By: ljk53
fbshipit-source-id: c0fbd726bcc450c3c7fe232c23e5b31779d0b65f
Summary:
- Renamed Partitioner.py to partitioner.py
- Renamed GraphManipulation.py to graph_manipulation.py
- Moved test_replace_target_nodes_with() to test_fx_experimental.py
- Removed the unnecessary argument from size_based_partition() in the Partitioner class
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47914
Reviewed By: gcatron
Differential Revision: D24956653
Pulled By: scottxu0730
fbshipit-source-id: 25b65be7dc7d64e90ffdc59cf394446fee83c3e6
Summary:
If world_size is less than or equal to the number of available GPUs,
then each rank can be mapped directly to the corresponding GPU.
This fixes the issue referenced in https://github.com/pytorch/pytorch/issues/45435 and https://github.com/pytorch/pytorch/issues/47629
For world_size = 3 and 8 GPUs, the rank-to-GPU mapping used to be 0, 2, 4.
Due to the barrier introduced in PR https://github.com/pytorch/pytorch/issues/45181,
the tensors in the barrier were mapped to cuda:0, 1, 2 while the tensors in the
actual test cases were mapped to cuda:0, 2, 4, resulting in different streams and
leading to a timeout. This issue is specific to the default process group;
it is not observed in a new process group, since the streams are created again
after the initial barrier call.
This patch maps each rank to the corresponding GPU when world_size is
less than or equal to the number of GPUs, in this case 0, 1, 2.
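A hypothetical helper illustrating the mapping change (not the actual test-suite code; the old spread formula is an assumption inferred from the 0, 2, 4 example above):
```
import torch

def rank_to_gpu(rank: int, world_size: int) -> int:
    n_gpus = torch.cuda.device_count()
    if world_size <= n_gpus:
        return rank                       # this patch: direct mapping -> 0, 1, 2
    return rank * (n_gpus // world_size)  # spread mapping -> 0, 2, 4 for 3 ranks on 8 GPUs
```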
Note: The barrier function in distributed_c10d.py should include a new parameter
to specify the tensor or rank-to-GPU mapping. In that case, this patch would be
redundant but harmless, since the tests could specify tensors with the appropriate
GPU rankings.
Fixes https://github.com/pytorch/pytorch/issues/47629
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47898
Reviewed By: smessmer
Differential Revision: D24956021
Pulled By: rohan-varma
fbshipit-source-id: a88257f22a7991ba36566329766c106d3360bb4e
Summary:
I think these can be safely removed since the minimum supported Python version is now 3.6
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47822
Reviewed By: smessmer
Differential Revision: D24954936
Pulled By: ezyang
fbshipit-source-id: 5d4b2aeb78fc97d7ee4abaf5fb2aae21bf765e8b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47246
We crash the process in NCCL Async Error Handling if a collective
has been running for longer than a set timeout. This PR adds more
information about the rank and the duration for which the collective ran.
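A hedged usage sketch (not code from this PR) of the feature in question; it assumes RANK/WORLD_SIZE/MASTER_ADDR/MASTER_PORT are set by the launcher:
```
import os
from datetime import timedelta
import torch.distributed as dist

# With async error handling on, a collective that exceeds the process group
# timeout aborts the process; the error now reports the rank and how long
# the collective ran.
os.environ["NCCL_ASYNC_ERROR_HANDLING"] = "1"
dist.init_process_group("nccl", timeout=timedelta(seconds=30))
```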
ghstack-source-id: 116676182
Test Plan: Run desync tests and flow.
Reviewed By: pritamdamania87
Differential Revision: D24695126
fbshipit-source-id: 61ae46477065a1a451dc46fb29c3ac0073ca531b
Summary:
Fix for https://github.com/pytorch/pytorch/issues/46122
For `Any`, we infer the type of the ivalue to set the ivalue's type tag. When we saw a Tensor, we would use a specialized Tensor type, so when `Dict[str, Tensor]` was passed in as an `Any` arg it would be inferred as `Dict[str, Float(2, 2, 2, 2)]`, which breaks runtime `isinstance` checking.
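A hedged repro sketch of the kind of check that breaks (the exact form of the original issue may differ, and `torch.jit.isinstance` availability in this era is assumed):
```
import torch
from typing import Any, Dict

@torch.jit.script
def fn(x: Any) -> str:
    # With the bad type tag, this container isinstance check fails.
    if torch.jit.isinstance(x, Dict[str, torch.Tensor]):
        return "dict of tensors"
    return "something else"

print(fn({"a": torch.rand(2, 2)}))  # expected after the fix: "dict of tensors"
```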
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46130
Reviewed By: glaringlee
Differential Revision: D24261447
Pulled By: eellison
fbshipit-source-id: 8a2bb26ce5b6c56c8dcd8db79e420f4b5ed83ed5
Summary:
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).
New submodule commit: 9b0131179f
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47929
Test Plan: Ensure that CI jobs succeed on GitHub before landing.
Reviewed By: smessmer
Differential Revision: D24957361
fbshipit-source-id: 72fe80a784f10ddca52ee99fcf67cf6448a93012
Summary: D24747035 (1478e5ec2a) removes the entry point of `nnq.functional.relu`. Adjust the op benchmark to use `torch.nn.ReLU` accordingly.
Test Plan: buck run caffe2/benchmarks/operator_benchmark/pt:qactivation_test -- --use_jit --iterations 1 --warmup_iterations 1
Reviewed By: mingzhe09088
Differential Revision: D24961625
fbshipit-source-id: 5ed0ec7fa6d8cfefc8e7fc8324cf9a2a3e59de90
Summary:
Inside a container, the user is often root. We should allow this use case so that people can easily run `run_test.py` inside a container.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43794
Reviewed By: ezyang
Differential Revision: D24904469
Pulled By: malfet
fbshipit-source-id: f96cb9dda3e7bd18b29801cde4c5b0616c750016
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47076
Pull Request resolved: https://github.com/pytorch/glow/pull/5038
Eliminate double casting in glow when submitting fp16 per-sample weights
Test Plan:
buck test glow/glow/torch_glow/tests:embedding_bag_test
Due to dependency conflicts between glow and caffe2, the test has been reverted from this diff and landed separately.
Reviewed By: allwu
Differential Revision: D24421367
fbshipit-source-id: eb3615144a2cad3d593543428dfdec165ad301df
Summary:
* Enable ONNX shape inference by default.
* ONNX could potentially set the inferred shape on the output instead of in value_infos; check both to be sure.
* Small fix in symbol_map to avoid overlooking duplicate symbols.
* Fix scalar_type_analysis to be consistent with PyTorch's scalar type promotion logic (see the example below).
* Correctly handle a None dim_param from an ONNX inferred shape.
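For reference, a quick check of the promotion behavior referred to above (standard PyTorch behavior, nothing PR-specific):
```
import torch

t = torch.ones(2, dtype=torch.int32)
print((t + 1).dtype)    # torch.int32  - an int scalar keeps the integer dtype
print((t + 1.5).dtype)  # torch.float32 - a float scalar promotes to the default dtype
```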
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46629
Reviewed By: ailzhang
Differential Revision: D24900171
Pulled By: bzinodev
fbshipit-source-id: 83d37fb9daf83a2c5969d8383e4c8aac986c35fb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47926
Given that we're soon enabling async error handling in PET, we should make the behavior explicit when users have set NCCL_BLOCKING_WAIT in their own code while also using PET. This PR essentially gives blocking wait precedence (for now). This way the blast radius of the PET change is smaller, while we continue working with blocking wait users and discussing whether moving to async error handling may be a good fit.
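A minimal sketch of the resulting precedence (both environment variables are real; the values are illustrative):
```
import os

os.environ["NCCL_BLOCKING_WAIT"] = "1"
os.environ["NCCL_ASYNC_ERROR_HANDLING"] = "1"
# With this PR, blocking wait takes precedence: async error handling is
# effectively ignored while NCCL_BLOCKING_WAIT is set.
```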
ghstack-source-id: 116553583
Test Plan: Simple FBL run/CI
Reviewed By: jiayisuse
Differential Revision: D24928149
fbshipit-source-id: d42c038ad44607feb3d46dd65925237c564ff7a3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47718
Distributed Inference splits a predict net into multiple parts, part0 being the main part, which contains ops that make remote calls to the other parts. The part0 predict net may contain AsyncIf ops to optimize RPC call usage. AsyncIf ops have internal nets which may refer to memongered blobs. This change handles AsyncIf ops by updating their internal nets to refer to memongered blobs.
As part of this change, I am also updating the dag memonger traversal to always start from root ops, i.e. ops with zero in-degree. The earlier logic would start traversing ops based on the head input blobs, and if one of the head inputs was used in a non-root op that got visited before its parent, the traversal would throw an assertion error here: https://fburl.com/diffusion/ob110s9z . For almost all distributed inference part0 nets, it was throwing this assertion error.
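A hypothetical sketch of the new starting point for the traversal (the helper name is illustrative, not memonger's actual code; `input`/`output` follow caffe2's OperatorDef fields):
```
def find_root_ops(ops):
    # Blobs produced by any op in the net.
    produced = {blob for op in ops for blob in op.output}
    # Root ops have zero in-degree: no input is produced by another op.
    return [op for op in ops
            if not any(blob in produced for blob in op.input)]
```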
Test Plan: Added corresponding tests in memonger_test.py. Could not find unit tests for the C++ version of memonger.
Reviewed By: hlu1
Differential Revision: D24872010
fbshipit-source-id: 1dc99b2fb52b2bc692fa4fc0aff6b7e4c5e4f5b0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46454
we stopped syncing this folder to fbcode, and it hasn't been used. AIbench will use the ones in xplat.
Test Plan: `zbgs fbcode/caffe2/mode/` finds nothing
Reviewed By: xta0
Differential Revision: D24356743
fbshipit-source-id: 7e70a2181a49b8ff3f87e5be3b8c808135f4c527
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47731
**Summary**
This commit modifies `ScriptTypeParser::parseTypeFromExpr` so that
string literal type annotations are resolved using
`Resolver::resolveType`. At present, they are parsed in
`parseBaseTypeName`, which inadvertently allows any key from
`string_to_type_lut` to be used as a string literal type annotation.
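For context, a hedged illustration of the main feature these annotations support (the class here is illustrative, not from the test suite): a TorchScript class referring to itself by name via a string literal annotation.
```
import torch
from typing import List

@torch.jit.script
class Meta(object):
    def __init__(self, n: int):
        self.n = n

    # String literal annotations let the class refer to its own type:
    def combine(self, others: List["Meta"]) -> "Meta":
        return Meta(self.n + len(others))
```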
**Test Plan**
Existing unit tests (most notably
`TestClassType.test_self_referential_method` which tests the main
feature, self-referential class type annotations, that make use of
string literal type annotations).
**Fixes**
This commit fixes #47570.
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D24934717
Pulled By: SplitInfinity
fbshipit-source-id: b915b2c08272566b63b3cf5ff4a07ad43bdc381a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47884
We need to know the output types of everything in a fusion group to ensure
that we generate correctly typed tensors. We were incorrectly starting a
fusion group with an output of unknown type.
Test Plan:
New unit tests:
```
buck test //caffe2/test:jit //caffe2/test/cpp/tensorexpr:tensorexpr
```
Reviewed By: eellison
Differential Revision: D24932786
fbshipit-source-id: 83978a951f32c1207bbc3555a7d3bd94fe4e70fb
Summary:
This is a second attempt at 8304c25c67, since the first attempt did not work, as shown by b05f3571fe and c59015f21d. This time the idea is to embed the commit hash itself directly into the generated command that is fed to `docker exec`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47922
Reviewed By: zou3519
Differential Revision: D24953734
Pulled By: samestep
fbshipit-source-id: 35b14d1266ef039e8c1bdf3648275af812a2e57b