Commit Graph

31465 Commits

Author SHA1 Message Date
Rong Rong
147a48fb27 [cmake] clean up cmake/Utils.cmake (#47923)
Summary:
Consolidate into cmake/public/utils.cmake

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47923

Reviewed By: samestep

Differential Revision: D24955961

Pulled By: walterddr

fbshipit-source-id: 9d5f6af2b353a8c6f6d521c841fd0989393755cd
2020-11-16 08:12:32 -08:00
albanD
cd4aa9c95c Fix inplace check logic to be triggered when the written-to Tensor does not require gradients (#46296)
Summary:
Fix https://github.com/pytorch/pytorch/issues/46242

This ensures that `check_inplace()` runs the proper checks even if the Tensor being modified inplace does not require gradients, since the Tensor written into it might require gradients and make this inplace modification actually differentiable.
This contains:
- Codegen changes to tell `check_inplace()` if the inplace will be differentiable
- Changes in `handle_view_on_rebase` to work properly even when called for an input that does not require gradients (which was assumed to be true before)
- Corresponding tests (both warnings and the error raise internal assert errors without this fix)
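
A minimal sketch of the scenario being fixed (assumed repro, not taken from the PR): the in-place target does not require gradients, but the value written into it does, so the in-place op must still be recorded as differentiable.

```
import torch

x = torch.rand(3)                      # requires_grad=False
w = torch.rand(3, requires_grad=True)

x.mul_(w)   # in-place write makes `x` differentiable through `w`

# check_inplace() must run its checks here even though `x` itself
# did not require gradients before the in-place op.
x.sum().backward()
print(w.grad)   # gradient flows back to `w`
```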

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46296

Reviewed By: ezyang

Differential Revision: D24903770

Pulled By: albanD

fbshipit-source-id: 74e65dad3d2e3b9f762cbb7b39f92f19d9a0b094
2020-11-16 08:06:06 -08:00
Jane Xu
d032d22141 Replacing CUDA11.0 config with CUDA11.1 in CI (#47942)
Summary:
Relands https://github.com/pytorch/pytorch/issues/46616

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47942

Reviewed By: walterddr

Differential Revision: D24963006

Pulled By: janeyx99

fbshipit-source-id: 71a61c56dec88a32a1c5d194db5a2730100f60a1
2020-11-16 07:32:35 -08:00
Mike Ruberry
013e6a3d9d Revert D24698027: Fix auto exponent issue for torch.pow
Test Plan: revert-hammer

Differential Revision:
D24698027 (8ef7ccd669)

Original commit changeset: f23fdb65c925

fbshipit-source-id: 9a67a2c6310c9e4fdefbb421a8cd4fa41595bc9a
2020-11-15 03:58:44 -08:00
anjali411
8ef7ccd669 Fix auto exponent issue for torch.pow (#47024)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47024

Fixes https://github.com/pytorch/pytorch/issues/46936

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#47024 Fix auto exponent issue for torch.pow**

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D24698027

Pulled By: anjali411

fbshipit-source-id: f23fdb65c925166243593036e08214c4f041a63d
2020-11-14 22:50:12 -08:00
Xiang Gao
d293413b3e Batched matmul dtypes (#47873)
Summary:
Fixes #{issue number}

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47873

Reviewed By: navahgar

Differential Revision: D24928256

Pulled By: anjali411

fbshipit-source-id: a26aef7a15a13fc0b5716e905971265d8b1cea61
2020-11-14 22:45:48 -08:00
anjali411
db1f217d8d Add complex support for torch.addcmul and torch.addcdiv (#46639)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46639

Resolves: https://github.com/pytorch/pytorch/issues/46546#issuecomment-713122245
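
A quick sketch of the newly supported usage (illustrative only; values are arbitrary):

```
import torch

t = torch.randn(3, dtype=torch.complex64)
t1 = torch.randn(3, dtype=torch.complex64)
t2 = torch.randn(3, dtype=torch.complex64)

out_mul = torch.addcmul(t, t1, t2, value=0.5)   # t + value * t1 * t2
out_div = torch.addcdiv(t, t1, t2, value=0.5)   # t + value * t1 / t2
```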

Test Plan: Imported from OSS

Reviewed By: izdeby, ansley

Differential Revision: D24879099

Pulled By: anjali411

fbshipit-source-id: 76131dc68ac964e67a633f62e07f7c799df4463e
2020-11-14 21:27:34 -08:00
Bert Maher
5adf840259 [pytorch][te][easy] Remove KernelScope from fusion pass tests (#47952)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47952

We don't actually generate a TE kernel, so there's no need to use the
arena-allocation guard.

Test Plan:
```
buck test //caffe2/test/cpp/tensorexpr -- FuserPass
```

Reviewed By: ZolotukhinM

Differential Revision: D24967107

fbshipit-source-id: 302f65b2fcff704079e8b51b942b7b3baff95585
2020-11-14 20:25:01 -08:00
Jianyu Huang
0e98fdd389 [ATen/CPU] Parallelize HalfToFloat + FloatToHalf operators in PT (#47777)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47777

Parallelize the FP32 <-> FP16 conversion op.
- Use at::parallel_for in ATen instead of parallelizing inside FBGEMM;
- provide more flexibility (the ATen parallelization can be configured with different parallel backends).
ghstack-source-id: 116499687
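
From the Python side, the conversion being parallelized is roughly the following (illustrative sketch, matching the 512 x 512 benchmark shape below):

```
import torch

torch.set_num_threads(2)      # the conversion now parallelizes across threads

x = torch.randn(512, 512)     # fp32
h = x.to(torch.float16)       # FloatToHalf path
y = h.to(torch.float32)       # HalfToFloat path
```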

Test Plan:
```
OMP_NUM_THREADS=10 buck test //caffe2/test:torch -- .test_half_tensor.
```
https://our.intern.facebook.com/intern/testinfra/testrun/7036874441928985

```
OMP_NUM_THREADS=10 buck run mode/opt -c pytorch.parallel_backend=tbb //caffe2/benchmarks/operator_benchmark/pt:tensor_to_test -- --iterations 1 --omp_num_threads 10 --warmup_iterations 0
```

Benchmark results for 512 x 512 Tensor copy:

- With 1 thread:
```
(base) [jianyuhuang@devbig281.ftw3.facebook.com: ~/fbsource/fbcode/caffe2/caffe2/operators] $ buck run mode/opt -c pytorch.parallel_backend=tbb //caffe2/benchmarks/operator_benchmark/pt:tensor_to_test -- --iterations 1 --omp_num_threads 1 --warmup_iterations 10
Parsing buck files: finished in 1.3 sec
Building: finished in 5.7 sec (100%) 6087/6087 jobs, 0 updated
  Total time: 7.0 sec
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: FloatToHalfTensorConversionBenchmark
# Mode: Eager
# Name: FloatToHalfTensorConversionBenchmark_M512_N512_cpu
# Input: M: 512, N: 512, device: cpu
Forward Execution Time (us) : 99.279

# Benchmarking PyTorch: HalfToFloatTensorConversionBenchmark
# Mode: Eager
# Name: HalfToFloatTensorConversionBenchmark_M512_N512_cpu
# Input: M: 512, N: 512, device: cpu
Forward Execution Time (us) : 81.707
```

- With 2 threads:
```
(base) [jianyuhuang@devbig281.ftw3.facebook.com: ~/fbsource/fbcode/caffe2/caffe2/operators] $ buck run mode/opt -c pytorch.parallel_backend=tbb //caffe2/benchmarks/operator_benchmark/pt:tensor_to_test -- --iterations 1 --omp_num_threads 2 --warmup_iterations 10
Parsing buck files: finished in 1.3 sec
Building: finished in 4.4 sec (100%) 6087/6087 jobs, 0 updated
  Total time: 5.7 sec
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: FloatToHalfTensorConversionBenchmark
# Mode: Eager
# Name: FloatToHalfTensorConversionBenchmark_M512_N512_cpu
# Input: M: 512, N: 512, device: cpu
Forward Execution Time (us) : 68.162

# Benchmarking PyTorch: HalfToFloatTensorConversionBenchmark
# Mode: Eager
# Name: HalfToFloatTensorConversionBenchmark_M512_N512_cpu
# Input: M: 512, N: 512, device: cpu
Forward Execution Time (us) : 49.245
```

Reviewed By: ngimel

Differential Revision: D24676355

fbshipit-source-id: 02bfb893a7b5a60f97c0559d8974c53837755ac2
2020-11-14 18:45:23 -08:00
Rohan Varma
f8248543a1 Pass in smaller timeout into init_process_group for distributed_test (#47896)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47896

Per title
ghstack-source-id: 116710141

Test Plan: CI

Reviewed By: osalpekar

Differential Revision: D24943323

fbshipit-source-id: 7bf33ce3a021b9750b65e0c08f602c465cd81d28
2020-11-14 13:38:20 -08:00
Jiakai Liu
07e98d28cf [pytorch][codegen] migrate gen_variable_factories.py to the new data model (#47818)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47818

This is another relatively small codegen.

Ideally we should use CppSignature.decl() to generate the C++ function declaration.
We didn't because it would need to add 'at::' to the types defined in the ATen namespace.

E.g.:
- standard declaration:
```
Tensor eye(int64_t n, int64_t m, const TensorOptions & options={})
```

- expected:
```
at::Tensor eye(int64_t n, int64_t m, const at::TensorOptions & options = {})
```

Kept the hacky fully_qualified_type() method to maintain compatibility with the old codegen.

We could clean up by:
- Using these types in the torch namespace - but this is a user-facing header file,
  and it's not clear whether it would cause problems;
- Updating the cpp.argument_type() method to take an optional namespace argument.

Confirmed byte-for-byte compatible with the old codegen:
```
Run it before and after this PR:
  .jenkins/pytorch/codegen-test.sh <baseline_output_dir>
  .jenkins/pytorch/codegen-test.sh <test_output_dir>

Then run diff to compare the generated files:
  diff -Naur <baseline_output_dir> <test_output_dir>
```

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D24909478

Pulled By: ljk53

fbshipit-source-id: a0ceaa60cc765c526908fee39f151cd7ed5ec923
2020-11-14 13:05:23 -08:00
Vasiliy Kuznetsov
4779553921 Revert "[quant] Remove nn.quantized.ReLU module and nn.quantized.functional.relu (#47415)" (#47949)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47949

This reverts commit 1478e5ec2a.

Test Plan: Imported from OSS

Reviewed By: supriyar

Differential Revision: D24966363

Pulled By: vkuzo

fbshipit-source-id: ca1126f699eef84027a15df35962728296c8a790
2020-11-14 08:40:30 -08:00
Jiakai Liu
c936b43f14 [pytorch][codegen] add fully migrated scripts to mypy strict config (#47747)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47747

Moved MANUAL_AUTOGRAD / etc to gen_trace_type.py to prevent mypy from
scanning the not-yet-migrated gen_variable_type.py.

Differential Revision: D24885066

Test Plan: Imported from OSS

Reviewed By: ezyang

Pulled By: ljk53

fbshipit-source-id: bf420e21c26f45fe2b94977bc6df840ffd8a3128
2020-11-14 02:28:00 -08:00
Jiakai Liu
4ff8cd8f3a [pytorch][codegen] gen_python_functions.py loading native_functions.yaml / deprecated.yaml directly (#47746)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47746

- Removed the integration hack in gen_python_functions.py. It now directly
  loads native_functions.yaml. All dependencies on Declarations.yaml
  have been removed or moved elsewhere.
- Rewrote the deprecated.yaml parsing logic to work with new data model directly.

Confirmed byte-for-byte compatible with the old codegen:
```
Run it before and after this PR:
  .jenkins/pytorch/codegen-test.sh <baseline_output_dir>
  .jenkins/pytorch/codegen-test.sh <test_output_dir>

Then run diff to compare the generated files:
  diff -Naur <baseline_output_dir> <test_output_dir>
```

Differential Revision: D24885067

Test Plan: Imported from OSS

Reviewed By: bhosmer

Pulled By: ljk53

fbshipit-source-id: 8e906b7dd36a64395087bd290f6f54596485ceb4
2020-11-14 02:27:57 -08:00
Jiakai Liu
d91cefb0d8 [pytorch][codegen] migrate gen_annotated_fn_args.py to new codegen model (#47745)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47745

This is a relatively small codegen. Reintroduced 'simple_type' to preserve
old codegen output.

It depends on some methods defined in gen_python_functions.py - next PR will
clean up the remaining Declarations.yaml methods in gen_python_functions.py.

Confirmed byte-for-byte compatible with the old codegen:
```
Run it before and after this PR:
  .jenkins/pytorch/codegen-test.sh <baseline_output_dir>
  .jenkins/pytorch/codegen-test.sh <test_output_dir>

Then run diff to compare the generated files:
  diff -Naur <baseline_output_dir> <test_output_dir>
```

Differential Revision: D24885068

Test Plan: Imported from OSS

Reviewed By: ezyang

Pulled By: ljk53

fbshipit-source-id: c0fbd726bcc450c3c7fe232c23e5b31779d0b65f
2020-11-14 02:24:39 -08:00
Wang Xu
0dbff184e9 change file name to snake style (#47914)
Summary:
- Change Partitioner.py file name to partitioner.py
- Change GraphManipulation.py file name to graph_manipulation.py
- Move test_replace_target_nodes_with() to test_fx_experimental.py
- Remove the unnecessary argument in size_based_partition() in the Partitioner class

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47914

Reviewed By: gcatron

Differential Revision: D24956653

Pulled By: scottxu0730

fbshipit-source-id: 25b65be7dc7d64e90ffdc59cf394446fee83c3e6
2020-11-14 01:29:25 -08:00
Jagadish Krishnamoorthy
1606899dbe distributed_test: Map rank to GPU accordingly (#47898)
Summary:
If world_size is less than or equal to the number of GPUs available,
then each rank can be mapped directly to the corresponding GPU.
This fixes the issue referenced in https://github.com/pytorch/pytorch/issues/45435 and https://github.com/pytorch/pytorch/issues/47629

For world_size = 3 and 8 GPUs, the rank-to-GPU mapping
will be 0,2,4. This is due to the introduction of barrier
(refer to PR https://github.com/pytorch/pytorch/issues/45181):
the tensors in barrier are mapped to cuda0,1,2 while the tensors in the
actual test cases are mapped to cuda0,2,4, resulting in different streams and
leading to timeout. This issue is specific to the default process group.
The issue is not observed in a new process group since the streams are created again
after the initial barrier call.

This patch maps each rank to the corresponding GPU when the world_size is
less than or equal to the number of GPUs, in this case 0,1,2.

Note: The barrier function in distributed_c10d.py should include a new parameter
to specify the tensor or rank-to-GPU mapping. In that case, this patch will be
redundant but harmless, since the tests can specify the tensors with appropriate
GPU rankings.
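
A hypothetical sketch of the mapping described above (helper name and structure are illustrative, not from the patch):

```
import torch

def rank_to_gpu(rank, world_size):
    # Hypothetical helper: map each rank directly to a GPU when there
    # are enough GPUs, matching the barrier's cuda0,1,2 mapping.
    n_gpus = torch.cuda.device_count()
    if world_size <= n_gpus:
        return rank              # e.g. ranks 0,1,2 -> cuda:0,1,2
    return rank % n_gpus         # otherwise wrap around
```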

Fixes https://github.com/pytorch/pytorch/issues/47629

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47898

Reviewed By: smessmer

Differential Revision: D24956021

Pulled By: rohan-varma

fbshipit-source-id: a88257f22a7991ba36566329766c106d3360bb4e
2020-11-13 23:59:42 -08:00
Natalia Gimelshein
982ae987d3 Revert D24941350: [pytorch][PR] Reopen PR for 0 dim batch size for AvgPool2d.
Test Plan: revert-hammer

Differential Revision:
D24941350 (ceeab70da1)

Original commit changeset: b7e50346d86e

fbshipit-source-id: 2e42e4418476658dc1afb905184841bf61688cfd
2020-11-13 22:33:37 -08:00
Richard Barnes
c543b3b582 Fix a downcast (#47919)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47919

Suppresses a downcast warning.

Test Plan:
Reproduces with
```
buck test mode/dev-nosan //caffe2/torch/fb/sparsenn:gpu_test
```

Reviewed By: suphoff

Differential Revision: D24866987

fbshipit-source-id: 44f19ab37a7d95abe08f570abfebc702827a2510
2020-11-13 22:26:29 -08:00
Katy Voor
fe7d1d7d0e Add LeakyReLU operator to static runtime (#47798)
Summary:
- Add LeakyReLU operator to static runtime
- Add LeakyReLU benchmark
- Add LeakyReLU correctness test case

Static Runtime
```
------------------------------------------------------------------------------
Benchmark                                       Time           CPU Iterations
------------------------------------------------------------------------------
BM_leaky_relu/1                              4092 ns       4092 ns     172331
BM_leaky_relu/8                              4425 ns       4425 ns     158434
BM_leaky_relu/20                             4830 ns       4830 ns     145335
BM_leaky_relu_const/1                        3545 ns       3545 ns     198054
BM_leaky_relu_const/8                        3825 ns       3825 ns     183074
BM_leaky_relu_const/20                       4222 ns       4222 ns     165999
```

Interpreter
```
------------------------------------------------------------------------------
Benchmark                                       Time           CPU Iterations
------------------------------------------------------------------------------
BM_leaky_relu/1                              7183 ns       7182 ns      96377
BM_leaky_relu/8                              7580 ns       7580 ns      91588
BM_leaky_relu/20                             8066 ns       8066 ns      87183
BM_leaky_relu_const/1                        6466 ns       6466 ns     107925
BM_leaky_relu_const/8                        7063 ns       7063 ns      98768
BM_leaky_relu_const/20                       7380 ns       7380 ns      94564
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47798

Reviewed By: ezyang

Differential Revision: D24927043

Pulled By: kavoor

fbshipit-source-id: 69b12cc57f725f1dc8d68635788813710a74dc2b
2020-11-13 22:05:52 -08:00
Chester Liu
17a6bc7c1b Cleanup unused code for Python < 3.6 (#47822)
Summary:
I think these can be safely removed since the minimum supported Python version is now 3.6

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47822

Reviewed By: smessmer

Differential Revision: D24954936

Pulled By: ezyang

fbshipit-source-id: 5d4b2aeb78fc97d7ee4abaf5fb2aae21bf765e8b
2020-11-13 21:37:01 -08:00
Guilherme Leobas
4f9d0757f3 Add type information to torch.cuda (#47134)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/47133

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47134

Reviewed By: smessmer

Differential Revision: D24955031

Pulled By: ezyang

fbshipit-source-id: 87f4623643715baa6ac0627383f009956f80cd46
2020-11-13 21:34:35 -08:00
Masaki Kozuki
2eb1e866e8 Update links in DDP note (#47663)
Summary:
Update the links in https://pytorch.org/docs/stable/notes/ddp.html#.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47663

Reviewed By: smessmer

Differential Revision: D24951684

Pulled By: ezyang

fbshipit-source-id: c1c104d76cf0292a7fc75a627bf76bb56fea72d0
2020-11-13 21:26:28 -08:00
Greg Tarr
550973b675 Missing curly bracket. (#47855)
Summary:
Typo fix

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47855

Reviewed By: smessmer

Differential Revision: D24951767

Pulled By: ezyang

fbshipit-source-id: 8884390370d4d71efd6cee10c3e0b8f55d7e5739
2020-11-13 21:17:24 -08:00
Meghan Lele
1bdd3687b9 Back out "[JIT] Fix function schema subtype checking"
Summary: Original commit changeset: bd07e7b47d2a

Test Plan: T79664004

Reviewed By: qizzzh

Differential Revision: D24969339

fbshipit-source-id: 8ecc4d52b86c5440c673e42b0e2cb78d94937a6f
2020-11-13 20:33:54 -08:00
Zino Benaissa
11710598db Preserve module parameters in freezing (#47094)
Summary:
Added a preserveParameters option to the freezing API that allows preserving module
parameters.

Fixes #39613

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47094

Reviewed By: eellison

Differential Revision: D24792867

Pulled By: bzinodev

fbshipit-source-id: f0cd980f5aed617b778afe2f231067c7c30a1527
2020-11-13 20:18:32 -08:00
Omkar Salpekar
f8c559db8e [resubmit] Providing more information while crashing process in async error handling (#47246)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47246

We crash the process in NCCL Async Error Handling if the collective
has been running for longer than some set timeout. This PR introduces more
information about the rank and the duration for which the collective ran.
ghstack-source-id: 116676182

Test Plan: Run desync tests and flow.

Reviewed By: pritamdamania87

Differential Revision: D24695126

fbshipit-source-id: 61ae46477065a1a451dc46fb29c3ac0073ca531b
2020-11-13 20:11:06 -08:00
Xiaomeng Yang
a9b6fa9e46 Fix multinomial when input has 0 prob (#47386)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47386

Fix multinomial when input has 0 prob
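
The guaranteed behavior, sketched (illustrative):

```
import torch

probs = torch.tensor([0.0, 0.5, 0.5])
samples = torch.multinomial(probs, num_samples=1000, replacement=True)

# An index with zero probability must never be sampled.
assert (samples != 0).all()
```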

Test Plan: buck test mode/dev-nosan //caffe2/test:torch -- "multinomial"

Reviewed By: ngimel

Differential Revision: D24699691

fbshipit-source-id: d88bb5be8cfed9da2ce6f6a8abd18e834fbde580
2020-11-13 19:07:49 -08:00
Ayush Saraf
f86ec08160 [pytorch][quantization] adding jit state for QuantizedLeakyReLU (#47660)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47660

Currently, `QuantizedLeakyReLU` doesn't have any items in its `state_dict`. However, this operator needs to store `scale` and `zero_point` in its state dictionary; otherwise, loading the state dict for a quantized model with LeakyReLUs that have non-default quantization params would break.

Test Plan:
Originally the issue was found here: https://www.internalfb.com/intern/anp/view/?id=390362&revision_id=2510709822565735

In the latest version, I fixed this issue: https://www.internalfb.com/intern/anp/view/?id=390362

Reviewed By: jerryzh168

Differential Revision: D24757522

fbshipit-source-id: 57e1dea072b5862e65e228e52a86f2062073aead
2020-11-13 18:59:46 -08:00
Elias Ellison
4380934b9b [JIT] Dont use specialized tensor type (#46130)
Summary:
Fix for https://github.com/pytorch/pytorch/issues/46122

For `Any`, we infer the type of the ivalue to set the ivalue's type tag. When we saw a Tensor, we would use a specialized Tensor type, so when `Dict[str, Tensor]` was passed in as an `Any` arg it would be inferred as `Dict[str, Float(2, 2, 2, 2)]`, which breaks runtime `isinstance` checking.
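
A minimal sketch of the failure mode (assumed repro based on the linked issue):

```
import torch
from typing import Any, Dict

@torch.jit.script
def fn(x: Any) -> str:
    # Before this fix, a dict of tensors passed as Any could be tagged
    # with a specialized type such as Dict[str, Float(2, 2, 2, 2)],
    # so this runtime isinstance check would not match.
    if isinstance(x, Dict[str, torch.Tensor]):
        return "dict of tensors"
    return "other"

print(fn({"a": torch.rand(2, 2)}))   # expected: "dict of tensors"
```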

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46130

Reviewed By: glaringlee

Differential Revision: D24261447

Pulled By: eellison

fbshipit-source-id: 8a2bb26ce5b6c56c8dcd8db79e420f4b5ed83ed5
2020-11-13 18:34:40 -08:00
Richard Barnes
5c0dff836a Improve dimensionality mismatch warning (#47874)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47874

Test Plan: N/A

Reviewed By: ngimel

Differential Revision: D24926123

fbshipit-source-id: ace5543ae5122906164e13ae9463fe4dfa74d8d6
2020-11-13 18:26:34 -08:00
Sameer Deshmukh
ceeab70da1 Reopen PR for 0 dim batch size for AvgPool2d. (#47426)
Summary:
Resubmitting https://github.com/pytorch/pytorch/pull/40694 since it could not be landed for some reason.

CC ngimel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47426

Reviewed By: mruberry

Differential Revision: D24941350

Pulled By: ngimel

fbshipit-source-id: b7e50346d86eb63aaaf4fdd5ee71fafee2d0b476
2020-11-13 17:57:35 -08:00
Ivan Yashchuk
260daf088d Added linalg.cholesky (#46083)
Summary:
This PR adds a `torch.linalg.cholesky` function that matches `numpy.linalg.cholesky`.

Fixed `lda` argument to `lapackCholesky` calls.
Added `random_hermitian_pd_matrix` helper function for tests.

Ref https://github.com/pytorch/pytorch/issues/42666.
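
A short usage sketch (illustrative):

```
import torch

a = torch.randn(3, 3, dtype=torch.complex128)
a = a @ a.conj().transpose(-2, -1) + 3 * torch.eye(3, dtype=torch.complex128)
l = torch.linalg.cholesky(a)   # lower-triangular factor of the Hermitian PD matrix

# Verify a == l @ l^H, matching numpy.linalg.cholesky semantics.
assert torch.allclose(l @ l.conj().transpose(-2, -1), a)
```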

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46083

Reviewed By: ailzhang

Differential Revision: D24861752

Pulled By: mruberry

fbshipit-source-id: 214dbceb4e8a2c589df209493efd843962d25593
2020-11-13 16:50:40 -08:00
Xiang Gao
e8fecd5caf Add constructor for ArgumentDef (#47492)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/47493

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47492

Reviewed By: bdhirsh

Differential Revision: D24791564

Pulled By: dzhulgakov

fbshipit-source-id: 43e4bbda754c61f40855675c1d5d0ddc9f351ebe
2020-11-13 16:39:45 -08:00
Facebook Community Bot
0685773d8d Automated submodule update: FBGEMM (#47929)
Summary:
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).

New submodule commit: 9b0131179f

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47929

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Reviewed By: smessmer

Differential Revision: D24957361

fbshipit-source-id: 72fe80a784f10ddca52ee99fcf67cf6448a93012
2020-11-13 16:06:49 -08:00
Yang Wang
0125e14c9a [OpBench] change relu entry point after D24747035
Summary: D24747035 (1478e5ec2a) removes the entry point of `nnq.functional.relu`. Adjust the op benchmark to use `torch.nn.ReLU` accordingly.

Test Plan: buck run caffe2/benchmarks/operator_benchmark/pt:qactivation_test -- --use_jit  --iterations 1 --warmup_iterations 1

Reviewed By: mingzhe09088

Differential Revision: D24961625

fbshipit-source-id: 5ed0ec7fa6d8cfefc8e7fc8324cf9a2a3e59de90
2020-11-13 15:38:27 -08:00
Xiang Gao
6e42b77be1 Add '--allow-run-as-root' to mpiexec to allow running distributed test inside a container (#43794)
Summary:
Inside a container, the user is often root. We should allow this use case so that people can easily run `run_test.py` inside a container.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43794

Reviewed By: ezyang

Differential Revision: D24904469

Pulled By: malfet

fbshipit-source-id: f96cb9dda3e7bd18b29801cde4c5b0616c750016
2020-11-13 15:31:06 -08:00
Ben Koopman
7b8bd91632 fp16 -> fp32 EmbeddingBag moved into CPU impl (#47076)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47076

Pull Request resolved: https://github.com/pytorch/glow/pull/5038

Eliminate double casting in glow when submitting fp16 per sample weights

Test Plan:
buck test glow/glow/torch_glow/tests:embedding_bag_test

Due to dependency conflicts between glow and caffe2, the test has been reverted from this diff, and landed separately

Reviewed By: allwu

Differential Revision: D24421367

fbshipit-source-id: eb3615144a2cad3d593543428dfdec165ad301df
2020-11-13 15:17:04 -08:00
BowenBao
6a4d55f23c [ONNX] Enable onnx shape inference in export by default (#46629)
Summary:
* Enable ONNX shape inference by default.
* ONNX could potentially set inferred shape in output instead of value_infos, checking both to be sure.
* Small fix in symbol_map to avoid overlooking dup symbols.
* Fix scalar_type_analysis to be consistent with PyTorch scalar type promotion logic.
* Correctly handle None dim_param from ONNX inferred shape.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46629

Reviewed By: ailzhang

Differential Revision: D24900171

Pulled By: bzinodev

fbshipit-source-id: 83d37fb9daf83a2c5969d8383e4c8aac986c35fb
2020-11-13 15:09:46 -08:00
Jerry Zhang
c0aa863c56 [quant][graphmode][fx][refactor] insert_quantize_node (#47880)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47880

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D24928797

fbshipit-source-id: 9a8b359cabfb800da86da114bf26bb5bd99d3fff
2020-11-13 14:50:42 -08:00
Omkar Salpekar
5d51b63984 Use Blocking Wait if both Blocking Wait and Async Error Handling Are Set (#47926)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47926

Given that we're soon enabling async error handling in PET, we should make the behavior explicit when users have set NCCL_BLOCKING_WAIT in their own code while also using PET. This PR essentially gives blocking wait precedence (for now). This way the blast radius of the PET change is smaller, while we continue working with blocking wait users and discussing whether moving to async error handling may be a good fit.
ghstack-source-id: 116553583
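
The interaction, sketched (these are the environment-variable switches involved):

```
import os

# If a user sets both switches, blocking wait takes precedence for now.
os.environ["NCCL_BLOCKING_WAIT"] = "1"
os.environ["NCCL_ASYNC_ERROR_HANDLING"] = "1"
```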

Test Plan: Simple FBL run/CI

Reviewed By: jiayisuse

Differential Revision: D24928149

fbshipit-source-id: d42c038ad44607feb3d46dd65925237c564ff7a3
2020-11-13 14:43:00 -08:00
Ankur Singla
f743b5639a [caffe2][memonger] Add support for distributed inference predict nets in DAG memonger (#47718)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47718

Distributed Inference splits a predict net into multiple parts, with part0 being the main part that contains the ops making remote calls to the other parts. The part0 predict net may contain AsyncIf ops to optimize rpc call usage. AsyncIf ops have internal nets which may refer to memongered blobs. This change handles AsyncIf ops by updating their internal nets to refer to memongered blobs.

As part of this change, I am also updating the DAG memonger traversal to always start from root ops, i.e. ops with 0 in-degree. The earlier logic would start traversing ops based on the input head blobs, and if one of the head inputs was used in a non-root op that got visited before its parent, the traversal would throw an assertion error here: https://fburl.com/diffusion/ob110s9z . For almost all distributed inference part0 nets, it was throwing this assertion error.
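
A hypothetical sketch of the root-first traversal (names are illustrative, not from the diff):

```
from collections import deque

def toposort_from_roots(ops, deps):
    # Visit every op starting from roots (ops with zero in-degree),
    # so parents are always visited before their children.
    # deps[op] lists the ops that must run before op.
    in_degree = {op: len(deps[op]) for op in ops}
    children = {op: [] for op in ops}
    for op in ops:
        for parent in deps[op]:
            children[parent].append(op)
    queue = deque(op for op in ops if in_degree[op] == 0)
    order = []
    while queue:
        op = queue.popleft()
        order.append(op)
        for child in children[op]:
            in_degree[child] -= 1
            if in_degree[child] == 0:
                queue.append(child)
    return order
```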

Test Plan: Added corresponding tests in memonger_test.py. Could not find unit tests for the C++ version of memonger.

Reviewed By: hlu1

Differential Revision: D24872010

fbshipit-source-id: 1dc99b2fb52b2bc692fa4fc0aff6b7e4c5e4f5b0
2020-11-13 14:12:07 -08:00
Jonathan Kwok
a3e08e5344 Support ReduceSum in c2_pt_converter (#47889)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47889

Adds support for converting the [caffe2 ReduceSum](https://caffe2.ai/docs/operators-catalogue#reducesum) operator to torch.
ghstack-source-id: 116580127

Test Plan:
buck test //caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test : [results](https://our.intern.facebook.com/intern/testinfra/testrun/6755399466095119)

    ✓ ListingSuccess: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - main (60.273)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_sub_op (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (101.119)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_layer_norm_conversion (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (101.404)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_local_model_conversion (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (101.966)
    ✓ Pass: caffe2/torch/fb/model_transform/c2_convert:c2_pt_converter_test - test_reduce_sum (caffe2.torch.fb.model_transform.c2_convert.c2_pt_converter_test.C2PTConverterTest) (114.896)

Reviewed By: bugra

Differential Revision: D24925318

fbshipit-source-id: 3f3b791eff1b03e8f5adee744560fe8bc811c659
2020-11-13 12:02:58 -08:00
Linbin Yu
eccbd4df1c Remove fbcode/caffe2/mode (#46454)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46454

We stopped syncing this folder to fbcode, and it is not being used. AIBench will use the ones in xplat.

Test Plan: zbgs fbcode/caffe2/mode/ finds nothing

Reviewed By: xta0

Differential Revision: D24356743

fbshipit-source-id: 7e70a2181a49b8ff3f87e5be3b8c808135f4c527
2020-11-13 11:54:47 -08:00
Meghan Lele
03d1978a1a [JIT] Resolve string literal type annotations using Resolver::resolveType (#47731)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47731

**Summary**
This commit modifies `ScriptTypeParser::parseTypeFromExpr` so that
string literal type annotations are resolved using
`Resolver::resolveType`. At present, they are parsed in
`parseBaseTypeName`, which inadvertently allows any key from
`string_to_type_lut` to be used as a string literal type annotation.
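
For context, a sketch of self-referential class type annotations, the main feature that relies on string literal annotations (illustrative):

```
import torch

@torch.jit.script
class Counter(object):
    def __init__(self):
        self.count = 0

    def bump(self) -> "Counter":
        # "Counter" is a string literal type annotation; it must be
        # resolved through the resolver rather than string_to_type_lut.
        self.count += 1
        return self
```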

**Test Plan**
Existing unit tests (most notably
`TestClassType.test_self_referential_method` which tests the main
feature, self-referential class type annotations, that make use of
string literal type annotations).

**Fixes**
This commit fixes #47570.

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D24934717

Pulled By: SplitInfinity

fbshipit-source-id: b915b2c08272566b63b3cf5ff4a07ad43bdc381a
2020-11-13 11:46:08 -08:00
Jerry Zhang
1915ae9510 [quant][graphmode][fx][refactor] is_output_quantized (#47879)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47879

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D24928796

fbshipit-source-id: 55c49243b6a0b4811953cf72af57e5f56be8c419
2020-11-13 11:15:55 -08:00
Bert Maher
6b8d20c023 [pytorch][te] Don't start TE fusion groups with an unknown-typed result (#47884)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47884

We need to know output types of everything in a fusion group to ensure
that we generate correctly-typed tensors.  We were incorrectly starting a
fusion group with an unknown-typed output.

Test Plan:
New unit tests:
```
buck test //caffe2/test:jit //caffe2/test/cpp/tensorexpr:tensorexpr
```

Reviewed By: eellison

Differential Revision: D24932786

fbshipit-source-id: 83978a951f32c1207bbc3555a7d3bd94fe4e70fb
2020-11-13 10:52:53 -08:00
Sam Estep
d54497fca7 Try again to give hash in doc push scripts (#47922)
Summary:
This is a second attempt at 8304c25c67, since the first attempt did not work as shown by b05f3571fe and c59015f21d. This time the idea is to directly embed the commit hash itself into the generated command that is fed to `docker exec`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47922

Reviewed By: zou3519

Differential Revision: D24953734

Pulled By: samestep

fbshipit-source-id: 35b14d1266ef039e8c1bdf3648275af812a2e57b
2020-11-13 10:17:37 -08:00
Gary Zheng
f1babb00f0 [caffe2] Fix ListWithEvicted _pprint_impl wrongly printing _evicted_values (#47881)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47881

ListWithEvicted's _pprint_impl was accidentally printing _items before this change.

Reviewed By: dzhulgakov

Differential Revision: D24928521

fbshipit-source-id: 0d7940719b4a27defbaae3b99af104d7fe7b5144
2020-11-13 09:23:10 -08:00
Richard Zou
d4db4718fa Revert D24873991: Profiler benchmark fix
Test Plan: revert-hammer

Differential Revision:
D24873991 (a97c7e2ef0)

Original commit changeset: 1c3950d7d289

fbshipit-source-id: 6f3b8a49caf90aaa3e16707005b6b7cf6e61d89f
2020-11-13 08:37:14 -08:00