Commit Graph

43119 Commits

Author SHA1 Message Date
HDCharles
338483e018 Update base for Update on "[quant] Add QuantizedLSTM class"
The nn.LSTM is quantized through the custom module mechanism, which uses nn.quantizable.LSTM for both the observed and quantized paths. This is potentially a source of confusion. This PR creates a `quantized.LSTM` class, which completely takes over the quantized path. Note that after this, the old usage will throw an error.

New way of using it:

```
>>> custom_module_config = {
...     'float_to_observed_custom_module_class': {
...         nn.LSTM: nn.quantizable.LSTM,
...     },
...     'observed_to_quantized_custom_module_class': {
...         nn.quantizable.LSTM: nn.quantized.LSTM,
...     }
... }
>>> tq.prepare(model, prepare_custom_module_class=custom_module_config)
>>> tq.convert(model, convert_custom_module_class=custom_module_config)
```

Differential Revision: [D33451338](https://our.internmc.facebook.com/intern/diff/D33451338/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D33451338/)!

[ghstack-poisoned]
2022-01-20 10:59:00 -08:00
CodemodService FBSourceClangFormatLinterBot
9f0c808593 [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D33677079

fbshipit-source-id: 997b73bebdcf83e09138bddc4bce257d0740e874
(cherry picked from commit 620023ad32)
2022-01-20 12:13:18 +00:00
Tran N.M. Hoang
06838ce8b1 fix: do_constant_folding arg when exporting ONNX (#71348)
Summary:
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71348

Reviewed By: H-Huang

Differential Revision: D33662228

Pulled By: msaroufim

fbshipit-source-id: a69c72838b7ff41a2305453ef00666c060ade593
(cherry picked from commit 75dd62b406)
2022-01-20 05:42:35 +00:00
Han Qi
21b697b646 add flatbuffer_loader and flatbuffer_serializer as BUCK target (#71463)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71463

title

Test Plan: unittest

Reviewed By: zhxchen17

Differential Revision: D33651339

fbshipit-source-id: 4bf325a40e263a441fd86bce560645ad0c1ebb23
(cherry picked from commit 4cb02e62a6)
2022-01-20 04:51:10 +00:00
Shirong Wu
99df96d800 Add silu and hardsigmoid converter (#71453)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71453

As title

Test Plan: unit test

Reviewed By: frank-wei

Differential Revision: D33646384

fbshipit-source-id: d86326c93e4d6bd59c9152592721f0e6ecf7f6fb
(cherry picked from commit d886380ede)
2022-01-20 03:16:20 +00:00
Can Balioglu
80b19c4c8c Enable Python bindings for UntypedStorage (#68945)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68945

This PR enables the Python conversion functions for `Storage` (specifically `UntypedStorage`) and also cleans up some remnants of the deprecated typed storages from `DynamicTypes.cpp`.
ghstack-source-id: 147245110

Test Plan: Run the existing unit and integration tests.

Reviewed By: albanD

Differential Revision: D32676505

fbshipit-source-id: 3a3f6db4fb0da5c78dd406c96ab70bdc37015521
(cherry picked from commit d6427b94cf)
2022-01-20 02:11:34 +00:00
Pritam Damania
f5b19ba683 Additional unit test for sharded linear. (#70476)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70476

1) Support a single dimension for inputs
2) Test several error cases

Partially addresses https://github.com/pytorch/pytorch/issues/65638
ghstack-source-id: 146307607

Test Plan: waitforbuildbot

Reviewed By: fduwjj

Differential Revision: D33344357

fbshipit-source-id: 4de7a7177452951dbcce76f27441703447609e6f
(cherry picked from commit 96dfded569)
2022-01-20 01:23:44 +00:00
Nikita Shulga
a5d5b11252 Add GitHub merge rules (#71514)
Summary:
The following subfolders of the project were identified as ones that can be
merged on GitHub first and then asynchronously merged into the Meta
codebase:
## ONNX exporter
PRs that include only files under `torch/onnx`, `torch/csrc/jit/passes/onnx` and `test/onnx` and are reviewed by garymm
## CUDA fusers
PRs that include only files under `torch/csrc/jit/codegen/fuser/cuda`, `torch/csrc/jit/codegen/cuda` or `benchmarks/cpp/nvfuser` and reviewed by csarofeen or ngimel
## OSS CI
PRs that include only files under `.circleci`, `.github` and `.jenkins` and are reviewed either by seemethere or myself
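The rule matching described above can be sketched in a few lines of plain Python. This is a hypothetical illustration only: the function name, the rule dictionary shape, and the field names (`patterns`, `approved_by`) are invented for the sketch and are not the actual `merge_rules.json` schema.

```python
from fnmatch import fnmatch

# Hypothetical sketch: a PR qualifies under a rule only if every changed file
# matches one of the rule's path patterns AND at least one approver is on the
# rule's reviewer list. Rule shape and names are illustrative, not the real schema.
def matches_rule(changed_files, approvers, rule):
    files_ok = all(
        any(fnmatch(path, pattern) for pattern in rule["patterns"])
        for path in changed_files
    )
    reviewer_ok = any(name in rule["approved_by"] for name in approvers)
    return files_ok and reviewer_ok

onnx_rule = {
    "patterns": ["torch/onnx/*", "torch/csrc/jit/passes/onnx/*", "test/onnx/*"],
    "approved_by": ["garymm"],
}
```

A PR touching only `torch/onnx/` files and approved by garymm would qualify; a PR touching files outside the listed folders would not, regardless of the reviewer.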

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71514

Reviewed By: bigfootjon

Differential Revision: D33673050

Pulled By: malfet

fbshipit-source-id: 21b909d49cb73ff79879b3ea0568e53ef65aa08c
(cherry picked from commit 520226c1bf)
2022-01-20 01:16:25 +00:00
Scott Wolchok
c59942ac73 [PyTorch] Fix a bunch of structured kernel refcounting (#71140)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71140

Structured kernels need to use the borrowing variants of TensorIterator's build APIs. (I am working on a debug check for this, but it is currently too strict, and relaxing it does not catch these bugs.)
ghstack-source-id: 147191022

Test Plan: CI

Reviewed By: bdhirsh

Differential Revision: D33520003

fbshipit-source-id: 3b0ff9036acdb78ae6fc7489ed0ed487d5ff080f
(cherry picked from commit 80ef4e14e3)
2022-01-20 00:30:43 +00:00
Zhengxu Chen
b98e955b24 [flatbuffer] Fix forward flatbuffer type handling with dynamic type. (#71500)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71500

Some places in flatbuffer_loader.cpp need to be updated to the newer API calls following the dynamic type changes.
ghstack-source-id: 147278860

Test Plan:
rebase D33665961
```
[zhxchen17@devbig560.ftw3 /data/users/zhxchen17/fbsource]  buck run fbcode/mode/dbg //arvr/firmware/silicon/turing:test_torch -c turing.min_runtime=1 -c turing.dsp_op=1 -c turing.model_file=test1.ptl -c pt.has_backtraces=1
Action graph will be rebuilt because files have been added or removed.
Downloaded 0/4 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 6.1 sec (100%) 253/253 jobs, 3/253 updated
  Total time: 6.1 sec
BUILD SUCCEEDED
Conv:  input [1, 32, 4, 4] residuals [1] weights [4, 4, 1, 1, 2, 32] nlu_params [4, 128] in_ch 32 out_ch 32 groups 1 kernel  stride  padding  upsample 0 op_type 0 act_type 0
```

Reviewed By: qihqi

Differential Revision: D33668588

fbshipit-source-id: 44163c1bc0ea57e4bd265384a253d6cc7b96ed4a
(cherry picked from commit 746487075e)
2022-01-20 00:22:35 +00:00
Scott Wolchok
565f78f571 [Pytorch] Speed up LayerNorm 4-5% (#71423)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71423

Replacing this math with a load seems to improve perf.
ghstack-source-id: 147171800

Test Plan: ptvsc2_predictor_bench runs on model from mikeiovine courtesy of mikeiovine

Reviewed By: mikeiovine, xiaomengy

Differential Revision: D33552176

fbshipit-source-id: f21a4cd66c13b9fcb7bcf48f356bdc85e94c4216
(cherry picked from commit 0354fcb988)
2022-01-20 00:16:17 +00:00
Scott Wolchok
958f9cf5ff [PyTorch][Static Runtime] Fix extra refcount bumps in layer_norm (#71237)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71237

Noticed these on inspection.
ghstack-source-id: 147171799

Test Plan: CI

Reviewed By: mikeiovine

Differential Revision: D33519799

fbshipit-source-id: 167c63323b345a5822303cecdbbbbb959f66f6e4
(cherry picked from commit 57e8da2d35)
2022-01-20 00:16:17 +00:00
Kim Juhyeong
811af25963 Fix trivial typo at the doc of torch.lobpcg (#71464)
Summary:
I think `symmetric positive defined generalized eigenvalue problem` should be changed to `symmetric positive definite generalized eigenvalue problem`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71464

Reviewed By: ejguan

Differential Revision: D33660670

Pulled By: H-Huang

fbshipit-source-id: 85dc830ed56a98d8a38bd2843f575f6ce08498cf
(cherry picked from commit dbbef542c0)
2022-01-20 00:07:39 +00:00
Nikita Shulga
dc5cda0cca Update min python version to 3.7 in setup.py and mypy configs (#71494)
Summary:
As Python 3.6 has reached EOL

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71494

Reviewed By: atalman

Differential Revision: D33667509

Pulled By: malfet

fbshipit-source-id: ab1f03085cfb9161df77ba5ce373b81f5e7ef3ae
(cherry picked from commit 60343166d9)
2022-01-20 00:03:57 +00:00
Jordan Fix
06bc6748a1 [acc_ops] Remove usage of kwarg expansion via **locals() for jit scripting support (#71425)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71425

att

Test Plan: CI

Reviewed By: yuhc

Differential Revision: D33639228

fbshipit-source-id: 95edced3b19a531d417538f00f0a555295c8741f
(cherry picked from commit 45455a6edc)
2022-01-19 23:49:50 +00:00
Rodrigo Kumpera
ef4bc3fa2f [distributed] Make rref_proxy._invoke_rpc truly async when needed. (#70206)
Summary:
From https://github.com/pytorch/pytorch/issues/67626: RRefProxy (rref.rpc_async, rref.rpc_sync, rref.remote) currently uses a blocking RPC call to the owner.

This is done by chaining async calls. In the sync case we wait on the
resulting Future.
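The chaining pattern described above can be sketched with plain `concurrent.futures` (illustrative only; the real code uses torch RPC futures, and `chain` is a hypothetical helper):

```python
from concurrent.futures import Future

# Minimal sketch: instead of blocking on an intermediate result before issuing
# the next call, each step is chained onto the previous future via a callback.
# The sync variant then waits only once, on the final future.
def chain(fut, fn):
    chained = Future()
    fut.add_done_callback(lambda f: chained.set_result(fn(f.result())))
    return chained

owner_fut = Future()                                          # resolves to the RRef owner
rpc_fut = chain(owner_fut, lambda owner: f"rpc_to:{owner}")   # no blocking here
owner_fut.set_result("worker0")                               # fires the chain
```

In the sync case the caller would call `rpc_fut.result()`; in the async case it simply returns `rpc_fut` to the user.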

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70206

Test Plan:
I ran rpc_tests using tensorpipe_rpc_agent_test_fixture.py and had to
adjust test_rref_proxy_timeout to the new behavior.

I ran into test_tensorpipe_set_default_timeout failing due to the
timeout being too small. Doesn't look related to this change.
mrshenli
Fixes https://github.com/pytorch/pytorch/issues/67626

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Reviewed By: pritamdamania87

Differential Revision: D33243348

Pulled By: kumpera

fbshipit-source-id: e1e8c34bb3d170407c0a793e2e585357f905d3c6
(cherry picked from commit 1ad5a7ceea)
2022-01-19 23:37:15 +00:00
Raghavan Raman
70c9146c40 [nnc] Update block and thread extents in cuda_codegen to use int64_t (#71428)
Summary:
The block and thread extent calculations in `cuda_codegen` should be using `int64_t` instead of `int`. The updated test, `test_dynamic_shapes`, fails without this change.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71428

Reviewed By: samdow

Differential Revision: D33640374

Pulled By: navahgar

fbshipit-source-id: 64c340ad2a9a1fa1fe066cf1c5dfc3b546b7be6d
(cherry picked from commit 6ea546ce11)
2022-01-19 23:21:24 +00:00
Shiyan Deng
2dbbb1a921 [fx2trt] Issue warnings instead of error if there's possible const folding opportunities (#71031)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71031

During the conversion stage, we might create some constants when the size op is called and the size is static. Raising an error here causes problems for this case. Generally speaking, it doesn't hurt to allow skipping const folding.
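The warn-instead-of-raise policy can be sketched as follows (hypothetical function and parameter names; the actual fx2trt code is not shown here):

```python
import warnings

# Hypothetical sketch of the policy change: when nodes that look const-foldable
# remain, emit a warning rather than raising, since conversion may still
# legitimately create constants (e.g. from a static size op).
def report_const_fold_opportunities(candidate_nodes, strict=False):
    if not candidate_nodes:
        return
    msg = f"{len(candidate_nodes)} node(s) could potentially be const-folded"
    if strict:
        raise RuntimeError(msg)   # the old behavior
    warnings.warn(msg)            # the new, permissive behavior
```

The design choice is that a missed folding opportunity is a performance concern, not a correctness one, so it should not abort conversion.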

Test Plan:
Test with D33483843 on shufflenet.

Added unit tests.

Reviewed By: wushirong

Differential Revision: D33484183

fbshipit-source-id: 5b32c06297e56965befd7e83fe8ca273e3665cee
(cherry picked from commit e6b79bd3dd)
2022-01-19 23:16:23 +00:00
Nikita Shulga
61713acb07 Add trymerge workflow (#71488)
Summary:
This workflow will react to the `repo_dispatch` event sent by PyTorch Probot
when the `pytorchbot merge this` command is issued.

At the moment, the workflow will only attempt to merge PRs that have not
been created from a forked repo and that match the rules defined in
`.github/merge_rules.json`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71488

Reviewed By: bigfootjon

Differential Revision: D33665142

Pulled By: malfet

fbshipit-source-id: e22daa1892523e62d7b7a941960636a6514cb7d7
(cherry picked from commit 92059bab07)
2022-01-19 23:11:48 +00:00
Can Balioglu
f45e217c01 Consolidate the overloads of TensorImpl::shallow_copy_and_detach (#68953)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68953

This PR consolidates the almost identical lvalue and rvalue implementations of shallow_copy_and_detach into a single templated function.
ghstack-source-id: 147238376

Test Plan: Run existing unit tests.

Reviewed By: fduwjj

Differential Revision: D32679741

fbshipit-source-id: 89a870335d2e09ffd005c943733a787d20d352f9
(cherry picked from commit 750344c860)
2022-01-19 21:52:13 +00:00
Michael Dagitses
805b7575db test //c10/... without Google libraries in OSS (#70853)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70853

We support both configurations, so we should ensure they both work.
ghstack-source-id: 147170900

Test Plan: This is adding a test to CI.

Reviewed By: malfet

Differential Revision: D33304505

fbshipit-source-id: 7074b6b98d05f60801bb1d74bc9ac1458c768d28
(cherry picked from commit 8e4134b777)
2022-01-19 20:56:12 +00:00
Michael Dagitses
78e1f9db34 port //c10/macros to common build structure (#70852)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70852

This is the first change that uses a common build file, build.bzl, to
hold most of the build logic.
ghstack-source-id: 147170895

Test Plan: Relying on internal and external CI.

Reviewed By: malfet

Differential Revision: D33299331

fbshipit-source-id: a66afffba6deec76b758dfb39bdf61d747b5bd99
(cherry picked from commit d9163c56f5)
2022-01-19 20:56:12 +00:00
Michael Dagitses
661d10aab4 use c10/macros/cmake_macros.h in fbcode build (#70851)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70851

This is a step towards OSS/fbcode convergence since OSS uses this file
in both CMake and Bazel.
ghstack-source-id: 147170896

Test Plan: Relying on the extensive CI internal tests for this.

Reviewed By: malfet

Differential Revision: D33299102

fbshipit-source-id: c650dd4755f8d696d5fce81c583d5c73782e3990
(cherry picked from commit 741ca140c8)
2022-01-19 20:56:12 +00:00
Alex Beloi
bdeec0c7b6 [fx] add documentation to AccOpProperties (#71450)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71450

att

Test Plan: no test

Reviewed By: jfix71

Differential Revision: D33515471

fbshipit-source-id: ded40ca117f63c971d6c5ed4556932cc71c009ca
(cherry picked from commit a9f66d5921)
2022-01-19 20:50:21 +00:00
Jeff Daily
7ce6db48e5 add rocm GHA workflow (#68552)
Summary:
cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68552

Reviewed By: bdhirsh

Differential Revision: D33569551

Pulled By: seemethere

fbshipit-source-id: cc7d68a22ad0eedd4d11eea3cf43a909e5b8616b
(cherry picked from commit 2bb701eb9d)
2022-01-19 20:31:17 +00:00
Zhengxu Chen
15e7d18124 [jit][edge] Create convenience wrapper for dynamic type constructors. (#71457)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71457

Today DynamicType is hard to create because we have separate APIs for different types. In this diff we introduce an easier API to create types, like the following:
```
#include <ATen/core/type_factory.h>

auto type = dynT<ListType>(dynT<TensorType>()); // etc...
```
ghstack-source-id: 147211236

Test Plan: CI

Reviewed By: iseeyuan

Differential Revision: D33647746

fbshipit-source-id: c850cf31ae781244eac805906a2fc110ef065a70
(cherry picked from commit 8cfd51d75f)
2022-01-19 20:11:11 +00:00
Jason Ansel
ac26f8237c Allow disabling nvfuser without CUDA (#71358)
Summary:
On a CPU-only build of PyTorch, `torch._C._jit_set_nvfuser_enabled(False)` would throw an error (even though it is a no-op). With this fix, the behavior becomes:
```
>>> torch._C._jit_set_nvfuser_enabled(False)
False
>>> torch._C._jit_set_nvfuser_enabled(True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: Running CUDA fuser is only supported on CUDA builds.
>>>
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71358

Reviewed By: eellison

Differential Revision: D33601135

Pulled By: jansel

fbshipit-source-id: c764df2fa197ce7b4f71e5df0a91cd988766e99c
(cherry picked from commit a801df9321)
2022-01-19 20:01:09 +00:00
Pearu Peterson
214f4bf2ff Support sparse.sum on empty sparse tensor (#71091)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71091

Fixes https://github.com/pytorch/pytorch/issues/65394

The masked sum on a full input tensor (of any layout) with an all-true mask is the same as the sum on the strided input tensor (after applying `to_dense` to sparse inputs).
Since masked sum uses `torch.sparse.sum`, for simplicity of the masked reduction implementations its reduction behavior ought to be defined by the behavior of `torch.sum`. This PR implements that behavioral connection for the directional summation of empty sparse tensors, which correspond to all-zero strided tensors.
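The invariant can be illustrated with a toy sketch using plain Python lists in place of tensors (no torch here; `masked_sum` is an invented stand-in for the masked reduction):

```python
# Toy sketch of the invariant: a masked sum under an all-true mask must agree
# with the plain sum, and reducing an empty (all-zero-equivalent) input must
# yield zero rather than fail.
def masked_sum(values, mask):
    return sum(v for v, keep in zip(values, mask) if keep)

values = [1.0, 2.0, 3.0]
```

The empty case is the one this PR pins down: summing "nothing" is well-defined and returns the additive identity, mirroring `torch.sum` on an all-zero strided tensor.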

cc nikitaved pearu cpuhrsch

Test Plan: Imported from OSS

Reviewed By: davidberard98

Differential Revision: D33651750

Pulled By: cpuhrsch

fbshipit-source-id: 703891bff88c8da6270b4272f5d2da81688db67d
(cherry picked from commit 53f97e80f7)
2022-01-19 18:58:08 +00:00
Rohan Varma
3b589c3497 [DDP Checkpointing] non-reentrant checkpoint tests (#69060)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69060

Saved variable hooks checkpointing was added in https://github.com/pytorch/pytorch/pull/69508; this PR adds some tests for DDP.

Specifically, we can support almost all DDP use cases with this new API, such as dynamic module with find_unused_parameters=True. One case remains to be supported, which is static_graph + non-reentrant based checkpointing. The underlying reason this does not work is https://github.com/pytorch/pytorch/issues/58111.
ghstack-source-id: 147219887

Test Plan: CI

Reviewed By: zhaojuanmao

Differential Revision: D32712126

fbshipit-source-id: ba5ae9ca77fd8929ee020c7dc97838bae9a1931b
(cherry picked from commit 9c7f93e217)
2022-01-19 18:09:41 +00:00
Richard Barnes
75aaa9f92b Remove simd qualifier for pragma omp loop in upsample_nearest_op.h (#71462)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71462

Fixes
```
      6 aienv/aienv_ig_reels_base:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
      6 deep_entity_classification/si_dec_gnn:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
      6 feed_recommendation_infra/multifeed_execution_graph_service_nosan:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
     12 mobile_cv/mobile-vision_experimental:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
     30 mobile_cv/mobile-vision_xraymobilev2_detection_caffe2:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
     42 aienv/aienv:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
    128 feed_recommendation_infra/multifeed_recagg_dev:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
    136 fluent2/fblearner_flow_projects_fluent2_nosan:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
   1338 f6/f6_nosan:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
```

Test Plan: Sandcastle

Reviewed By: luciang

Differential Revision: D33641869

fbshipit-source-id: 8424849cfac5cb0109272dec2086863067bbde66
(cherry picked from commit d18429905c)
2022-01-19 18:04:10 +00:00
kshitij12345
908fd3d78b [fix] composite compliance: quantile and nanquantile (#70894)
Summary:
Reference https://github.com/pytorch/pytorch/issues/69991

Refactored such that only the `out` variant copies the result into `out`; otherwise we just return the result of the composite functions as is.
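The shape of the refactor can be sketched in plain Python (hypothetical helper names; the real ATen implementation is not shown): the computation lives in one shared helper, the plain variant returns its result directly, and only the `out=` variant performs a copy.

```python
# Hypothetical sketch of the out-variant pattern: one shared implementation,
# with the copy confined to the out= wrapper.
def _quantile_impl(values, q):
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)   # linear interpolation

def quantile(values, q):
    return _quantile_impl(values, q)              # no extra copy

def quantile_out(values, q, out):
    out[0] = _quantile_impl(values, q)            # copy happens only here
    return out
```

Keeping the copy out of the shared helper is what makes the composite function "composite compliant": subclasses see only the operations the helper performs, not an extra in-place write.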

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70894

Reviewed By: samdow

Differential Revision: D33641742

Pulled By: zou3519

fbshipit-source-id: 671be13b31a7fff3afc0b7976706a5ecfc51ccac
(cherry picked from commit e7d5ac9af3)
2022-01-19 17:54:00 +00:00
Mike Ruberry
a0ada2d22b Back out "[pytorch][PR] Performance and memory improvements to batched torch.linalg.solve" (#71421)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71421

Original commit changeset: 7a0dd443cd0e

Original Phabricator Diff: D33028236 (410e91adee)

Test Plan: PyTorch OSS CI

Reviewed By: ngimel

Differential Revision: D33637628

fbshipit-source-id: 1e81485be202b2f9d6a1ff315279cc099754c2dc
(cherry picked from commit c2d730bfeb)
2022-01-19 17:26:01 +00:00
Nikita Shulga
8a9243996c Lazy load pandas when importing pytorch (#71316)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/71313

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71316

Reviewed By: wenleix

Differential Revision: D33595043

Pulled By: malfet

fbshipit-source-id: da8c7a7f132696645191d7b7055c4c21970d92c3
(cherry picked from commit 2d4847780a)
2022-01-19 17:02:50 +00:00
Jane Xu
671a0b5376 Move sccache compilation log to its own group (#71444)
Summary:
The sccache compilation log is often misleading.

We can move it to its own group so people don't see it right away

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71444

Reviewed By: atalman

Differential Revision: D33659650

Pulled By: janeyx99

fbshipit-source-id: f22fd21640a8747beeacce8857bbb8281efd76f4
(cherry picked from commit e25970abf9)
2022-01-19 16:47:36 +00:00
Andrey Talman
7ed2a43d26 Adding wheels with py3.10 (#71419)
Summary:
Adding wheels with py3.10

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71419

Reviewed By: janeyx99

Differential Revision: D33657770

Pulled By: atalman

fbshipit-source-id: 5d24f1771991ff07fbfd92d04d3d5211cf53084c
(cherry picked from commit bf2f2624e1)
2022-01-19 16:40:39 +00:00
Pritam Damania
b56ba296b1 Support multiple input dims for sharded linear. (#70266)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70266

Addresses some of the issues mentioned in
https://github.com/pytorch/pytorch/issues/65638. The ShardedLinear
implementation only supports 2D inputs.

On the other hand, `nn.Linear` supports arbitrary dimensions for inputs and
outputs. As a result, in this PR I've added support to ensure that
ShardedLinear supports arbitrary input dims as well.
ghstack-source-id: 147206607

Test Plan: waitforbuildbot

Reviewed By: wanchaol

Differential Revision: D33267630

fbshipit-source-id: 0460994c3aa33348b80547d9274206ef90cb29b6
(cherry picked from commit 7c289e1dbf)
2022-01-19 08:07:14 +00:00
Rohan Varma
fbc3b8c1bb [RPC] Fix a few flaky RPC tsan tests (#71460)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71460

When running with TSAN, we use a larger RPC timeout: https://github.com/pytorch/pytorch/blob/master/torch/testing/_internal/dist_utils.py#L68. As a result, the assertions here are invalid.

Tried to fix this by just setting `self.rpc_backend_options.rpc_timeout` to the new timeout, but `rpc_backend_options` is reconstructed every time it is accessed, so this doesn't work: https://github.com/pytorch/pytorch/blob/master/torch/testing/_internal/distributed/rpc/tensorpipe_rpc_agent_test_fixture.py#L15

Just removing the asserts should be fine as they don't really add value to what's being tested.
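The pitfall described above (a property that rebuilds its value on every access) can be shown with a minimal sketch; the class and attribute names are borrowed from the description but the implementation is purely illustrative:

```python
# Minimal sketch of the pitfall: when an attribute is a property that
# reconstructs its value on every access, mutating the returned object is
# silently lost -- each access hands back a fresh copy.
class Fixture:
    @property
    def rpc_backend_options(self):
        return {"rpc_timeout": 60}  # rebuilt on each access

fixture = Fixture()
fixture.rpc_backend_options["rpc_timeout"] = 300  # mutates a throwaway copy
```

Any attempted override is discarded, which is why adjusting the timeout from the test could not work.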
ghstack-source-id: 147208455

Test Plan: CI

Reviewed By: fduwjj

Differential Revision: D33648421

fbshipit-source-id: 9a5052b1c851fe7f838792d8bdf17d0563b4aa00
(cherry picked from commit 96ddab3433)
2022-01-19 06:12:43 +00:00
Chen Lai
9515213070 [Operator Versioning] Remove version compare as they are decoupled now (#71461)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71461

After the operator versioning work, the version in the model file is used for operator versioning, while bytecode_version is used for bytecode versioning (for the bytecode schema). They are two separate things now, and this comparison is not needed.
ghstack-source-id: 147209286

Test Plan: CI

Reviewed By: iseeyuan, tugsbayasgalan

Differential Revision: D33648592

fbshipit-source-id: beaa136a728f88435176a00c07b2d521210f107f
(cherry picked from commit e90e650e1a)
2022-01-19 04:51:45 +00:00
Pearu Peterson
677fab6d1d Support broadcast_to on sparse COO tensors (#71073)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71073

cc nikitaved pearu cpuhrsch

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33645744

Pulled By: cpuhrsch

fbshipit-source-id: 4775c9636c4e868022a8c1bbfec93e351d1cf885
(cherry picked from commit 640f21e09a)
2022-01-19 04:33:41 +00:00
Mike Ruberry
9b9b878c89 Fixes jiterator cache macro include + updates CUDA note with cache variables (#71452)
Summary:
Per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71452

Reviewed By: ngimel

Differential Revision: D33646495

Pulled By: mruberry

fbshipit-source-id: bbf627e6d7a724a83a3ea2ae9c0f50430f8d578e
(cherry picked from commit d1e72b144a)
2022-01-19 03:45:05 +00:00
Peter Bell
125bdb6d51 empty_meta: Add functions that don't depend on Tensor (#70615)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70615

This adds `at::detail::empty_meta` and
`at::detail::empty_strided_meta` to complement the cpu API.

Test Plan: Imported from OSS

Reviewed By: samdow

Differential Revision: D33623678

Pulled By: ngimel

fbshipit-source-id: 59e003116361fb547ec2c633bbc15a7973e21d0e
(cherry picked from commit b4f5836fa1)
2022-01-19 03:41:20 +00:00
Mengchi Zhang
b4a75af758 [fx2trt] Export some options out (#71315)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71315

Add variables in LowerSetting to export options from TRTInterpreter and interpreter.run:
- explicit precision
- int8_mode

Export skip_folding_node_fn options from split_const_subgraphs.

Reviewed By: wushirong

Differential Revision: D33585385

fbshipit-source-id: 3d20b69d255ad97487e462436ae479587a8e2118
(cherry picked from commit f24a279517)
2022-01-19 02:13:31 +00:00
Peter Bell
87215ed526 empty_strided: Factor out generic implementation (#70614)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70614

This creates an `empty_strided_generic` function which, similar to
`empty_generic`, is a device-independent tensor constructor. This also
adds `at::detail::empty_strided_cpu` to complement
`at::detail::empty_cpu`.

Test Plan: Imported from OSS

Reviewed By: samdow

Differential Revision: D33623679

Pulled By: ngimel

fbshipit-source-id: 85994e88d664870bf425f398dfcdfc467885c694
(cherry picked from commit 2ff2a89df5)
2022-01-19 01:54:16 +00:00
Matthias Braun
d5e9a276ea Adapt to llvm marking SmallVector::set_size private (#71434)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71434

See also https://reviews.llvm.org/D115380

Reviewed By: zhuhan0

Differential Revision: D33638540

fbshipit-source-id: a55e51462dc0d8f55a75bb79d9d76db781a36af2
(cherry picked from commit 78d1d65f77)
2022-01-19 00:54:03 +00:00
Eli Uriegas
30739f5329 ci: Change binary trigger to be nightly push (#71447)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71447

Changes the nightly build trigger to be based on pushes to the `nightly`
branch instead of being based on the tagged push. This aligns it with
our current CircleCI trigger and should make it so that it's easily
viewable using tools like https://hud.pytorch.org/ci/pytorch/pytorch/nightly

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D33647102

Pulled By: seemethere

fbshipit-source-id: c6757da35b7ec2d68bf36160dd7f3cb9ed040899
(cherry picked from commit 99b7b22650)
2022-01-19 00:27:42 +00:00
Peter Bell
6f4c491c6b empty_cpu: Add functions that don't depend on Tensor (#70613)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70613

This refactors `at::detail::empty_cpu` to use only `TensorBase` so you
can construct tensors without including `Tensor.h`. It also adds a
`TensorOptions` version to reduce friction in operators moving from
the `at::empty` API.

Test Plan: Imported from OSS

Reviewed By: samdow

Differential Revision: D33623682

Pulled By: ngimel

fbshipit-source-id: 7a7b08bc2ed06830a3d698197a0c8389a096dc1d
(cherry picked from commit 2e17ad0bbd)
2022-01-19 00:01:58 +00:00
Yan Li
6964aa2ced backout D33469839 (#71443)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71443

Cogwheel test inline_cvr_infer_canary_pyper_model_publish is timing out.

The convert_fx call takes > 20 mins for local and local_ro sub modules, which used to take ~ 2 mins.

Test Plan:
Fblearn flow run
* the following cmd took 1113 seconds before the diff and 5002 seconds after.
    flow-cli clone-locally 320014219  --run-as-secure-group pytorch_at_scale  --operators pyper_model_publish_workflow.pyper_model_publish_workflow.process_torch_package_model_files.process_non_sparse_parameters[0]

Cogwheel test
* Cogwheel test with packages in B3588 (the last good run) took 4694.48s
* Cogwheel test with packages in B3590 (the first timeout) took 13975.83s
* Cogwheel test with the following packages took 4535.04s
  * all packages in B3588 except the model publish
  * the model publish built with D33469839 (043e84b3d2) reversed (created D33633570)

Reviewed By: albanD, jerryzh168

Differential Revision: D33633570

fbshipit-source-id: dc5e777c48a90c551641a3f79126461f6a60449e
(cherry picked from commit 03ab65023a)
2022-01-18 23:51:51 +00:00
Rohan Varma
4fd1992a60 [Docs][BE] DDP doc fix (#71363)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71363

Looks like the DDP example is currently broken, as per
https://discuss.pytorch.org/t/official-ddp-example-is-broken/141493. Fix the
issue by setting the correct env variable.
ghstack-source-id: 147080377

Test Plan: CI

Reviewed By: mrshenli

Differential Revision: D33607250

fbshipit-source-id: e0e7d03cc365c186253b959c4c5405a5e3609218
(cherry picked from commit 32472884ec)
2022-01-18 22:24:51 +00:00
Taylor Robie
322f13d914 [Profiler] Fix memory profile type from recent refactor (#71417)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71417

I accidentally changed CPU_INSTANT_EVENT to CPU_OP, which broke TensorBoard.

Test Plan: Make memory profiling unit test check this case.

Reviewed By: aaronenyeshi

Differential Revision: D33637286

fbshipit-source-id: c95945f6b85cd4168820bd4d2a9203274a0a5bd6
(cherry picked from commit b1e258672a)
2022-01-18 22:18:11 +00:00
Nikita Shulga
ff8fb717db Fix get_git_repo_dir (#71448)
Summary:
Otherwise, rev-list will only pick up commits touching the `.github` folder

Before:
```
% git -C .github rev-list 1eb6146d967b2d09af37c54af411d03f0b790209..1ff7f65cc1ad499a71457368894ca14bed069749 -- .
598b55fd18
ae089d6bdf
```
After
```
% git -C . rev-list 1eb6146d967b2d09af37c54af411d03f0b790209..1ff7f65cc1ad499a71457368894ca14bed069749 -- .
1ff7f65cc1
2ac58b0dc1
598b55fd18
55899528a2
ae089d6bdf
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71448

Reviewed By: seemethere, atalman

Differential Revision: D33644256

Pulled By: malfet

fbshipit-source-id: fa2e06f6767e7702af6ce85471aea07fa58292c0
(cherry picked from commit 594cecc0e1)
2022-01-18 22:12:41 +00:00