Commit Graph

38836 Commits

Erjia Guan
8cdf16d1de Revert D29810657: [bc-breaking] reference option for linear produce a pattern instead of reference linear module
Test Plan: revert-hammer

Differential Revision:
D29810657 (9df605133e)

Original commit changeset: 949615bbc017

fbshipit-source-id: 54597d1f9636b0f94ae01c66018ff2592e5c39fc
2021-07-27 10:10:13 -07:00
Nikita Vedeneev
d7ddae8e4f det_backward: correct, more robust and with complex support [clone] (#61905)
Summary:
Clone of https://github.com/pytorch/pytorch/pull/58195 to ease the import. Done at the request of anjali411.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61905

Reviewed By: albanD

Differential Revision: D29937920

Pulled By: anjali411

fbshipit-source-id: 025892a8e6147790825b20458986730ad8c5bb0f
2021-07-27 10:08:26 -07:00
Peter Bell
de3a4eb583 Migrate thnn_conv_depthwise2d from THC to ATen (#62006)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62006

Closes gh-24646, gh-24647

There is no `TensorIterator` equivalent to these kernels so this is just
migrating the existing kernels over to the ATen style.

I've benchmarked for contiguous tensors with this script:
```
import torch
shape = (10, 10, 100, 100)
x = torch.randn(*shape, device='cuda')
w = torch.randn((10, 1, 5, 5), device='cuda')

for _ in range(100):
    torch.nn.functional.conv2d(x, w, groups=10)
```

and similarly for backwards. I see these as the same to within measurement error.

|                   | Master (us) | This PR (us) |
|------------------:|:-----------:|:------------:|
|           Forward |    133.5    |     133.6    |
|  Backward (input) |    1,102    |     1,119    |
| Backward (weight) |    2,220    |     2,217    |
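The structure of such a measurement can be sketched with a small host-side harness (hypothetical helper; for real GPU kernels the loop should synchronize via `torch.cuda.synchronize()` or use CUDA events, which this `perf_counter` sketch omits):

```python
import time

def benchmark(fn, warmup=10, iters=100):
    """Warm up, then return mean wall time per call in microseconds."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / iters * 1e6  # microseconds per iteration

# Cheap stand-in for the conv2d call above:
mean_us = benchmark(lambda: sum(range(1000)))
print(f"{mean_us:.1f} us per iteration")
```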

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D29883676

Pulled By: ngimel

fbshipit-source-id: 9b2ac62cdd8a84e1a23ffcd66035b2b2fe2374d8
2021-07-27 10:00:25 -07:00
Jerry Zhang
9df605133e [bc-breaking] reference option for linear produce a pattern instead of reference linear module (#61892)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61892

This PR changes `is_reference=True` for linear to produce a pattern consisting of dequant - float linear - quant instead of a reference linear module. This is useful for future transformations to custom backends, and it also helps simplify the implementation of convert in the future.

Test Plan:
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D29810657

fbshipit-source-id: 949615bbc017bc454d81c8a6b2bdec53badaab19
2021-07-27 09:49:20 -07:00
Amy He
6c6a9c73f2 [7/N] Nnapi backend delegation preprocess: compile_spec sanity check (#62213)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62213

Added sanity checks in preprocess function for Android NNAPI delegate.
`preprocess()` requires some input metadata passed through its `method_compile_spec` function argument.

`preprocess()` now throws specific error messages if it cannot find the correct input arguments.
Example error message:
```
RuntimeError: method_compile_spec does not contain the "forward" key.
method_compile_spec should contain a Tensor or Tensor List which bundles input parameters: shape, dtype, quantization, and dimorder.
For input shapes, use 0 for run/load time flexible input.
method_compile_spec must use the following format: {"forward": {"inputs": at::Tensor}} OR {"forward": {"inputs": c10::List<at::Tensor>}}
```
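A rough Python analogue of this sanity check (hypothetical names; the actual implementation is C++ in nnapi_backend_preprocess.cpp):

```python
def check_compile_spec(method_compile_spec):
    """Sketch of the sanity check: require {"forward": {"inputs": ...}}."""
    if "forward" not in method_compile_spec:
        raise RuntimeError('method_compile_spec does not contain the "forward" key.')
    forward = method_compile_spec["forward"]
    if not isinstance(forward, dict) or "inputs" not in forward:
        raise RuntimeError('"forward" must map to {"inputs": Tensor or Tensor list}.')
    return True

check_compile_spec({"forward": {"inputs": [1.0, 2.0]}})  # passes
```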

nnapi_backend_preprocess.cpp: contains sanity check implementation
test_backend_nnapi.py: sanity check unit tests

Test: Ran `python test/test_jit.py TestNnapiBackend` in OSS successfully.

TODO: Using Tensors to pass input parameters is a temporary hack. When a dedicated object is implemented, update the sanity check error message.
ghstack-source-id: 134339282

Test Plan: Ran `python test/test_jit.py TestNnapiBackend` in OSS successfully.

Reviewed By: raziel, iseeyuan

Differential Revision: D29917004

fbshipit-source-id: 0d5c6b35889c556cda905ffc29c25c5422ae9ee4
2021-07-27 09:31:35 -07:00
Rohan Varma
2cbc0ede7d [DDP] Log if graph is static at end of training (#61871)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61871

When set_static_graph=False, the only type of dynamism we really
support in DDP is a dynamic set of unused parameters, which must be explicitly
enabled with find_unused_parameters=True. However, some workflows have a static
set of unused parameters; it would be good to detect this and add it to logging to
identify workflows that are candidates for static graph optimization.
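The check being described boils down to whether the set of unused parameters is identical across iterations; a pure-Python sketch (hypothetical helper, not DDP's actual logging code):

```python
def is_static_unused_set(unused_per_iteration):
    """Return True if the same parameters were unused on every iteration."""
    sets = [frozenset(s) for s in unused_per_iteration]
    return all(s == sets[0] for s in sets)

print(is_static_unused_set([{"bias"}, {"bias"}, {"bias"}]))  # True: static graph candidate
print(is_static_unused_set([{"bias"}, set(), {"bias"}]))     # False: dynamic unused set
```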
ghstack-source-id: 134371429

Test Plan: CI

Reviewed By: zhaojuanmao

Differential Revision: D29773962

fbshipit-source-id: 1f741984c6e6f8e3e55cf69ca719b1e25a485b13
2021-07-27 09:23:43 -07:00
Mike Iovine
79eb8bb299 [Static Runtime] Enforce proper output dtype for many ops (re-land) (#62267)
Summary:
Re-land of D29935444
We previously had lots of ops with implementations like this:
```
if (p_node->Output(0).isNone()) {
  p_node->Output(0) = create_empty_like(input_0);
}
...
auto& out = p_node->Output(0);
some_func_out(inputs, out);
```
This would make the output have the correct shape. But it would
also take the dtype of `input_0`, which is not always correct.

This change transforms these blocks to:
```
if (p_node->Output(0).isNone()) {
  p_node->Output(0) = some_func(inputs);
} else {
  ...
  auto& out = p_node->Output(0);
  some_func_out(inputs, out);
}
```
This gives the output the correct shape and dtype.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62267

Reviewed By: ejguan

Differential Revision: D29937253

Pulled By: malfet

fbshipit-source-id: d91ca5d5703490d7d349a1de2ad3bb09b0c33967
2021-07-27 08:54:09 -07:00
Brian Vaughan
2eef1f27f8 Disable ccache for nccl builds (#62208)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62208

reverts
https://github.com/pytorch/pytorch/pull/55814
which removed a workaround for:
https://github.com/pytorch/pytorch/issues/13362

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D29935472

Pulled By: nairbv

fbshipit-source-id: 7ce9cde1408f17153632036fd128814032739746
2021-07-27 08:07:26 -07:00
Erjia Guan
dc55d511d9 Forward fix mypy (#62263)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62263

Fixes current HUD Error: https://github.com/pytorch/pytorch/runs/3170342799

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D29935265

Pulled By: ejguan

fbshipit-source-id: 6f247833d24bff7aea42f6287493a85d62d73b96
2021-07-27 07:52:31 -07:00
Ivan Yashchuk
3cd12448b4 Add forward mode differentiation for inverse and solve (#62160)
Summary:
This PR adds forward mode differentiation for `torch.linalg.inv`, `torch.linalg.inv_ex`, and `torch.linalg.solve` functions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62160

Reviewed By: mruberry

Differential Revision: D29917213

Pulled By: albanD

fbshipit-source-id: b08bbc830f77f342cc7ca5b823d7ea4380f2aaa8
2021-07-27 07:51:22 -07:00
Joel Schlosser
a0309f89f4 Initial ModuleInfo implementation (#61935)
Summary:
This PR contains the initial version of `ModuleInfo` for use in testing modules. The design philosophy taken here is to start small and simple and build out / refactor as needed when more test coverage or `ModuleInfo` entries are added. As such, it's not intended for general usage yet. The PR contains the following:

* (new file) `torch/testing/_internal/common_modules.py`
  * `ModuleInfo` definition - metadata for each module to use in testing
  * `module_db` - the actual `ModuleInfo` database; currently contains entries for two modules
  * `ModuleInput` - analogous to `SampleInput` from OpInfo; contains `FunctionInput`s for both constructor and forward pass inputs
      * Constructor and forward pass inputs are tied together within a `ModuleInput` because they are likely correlated
  * `FunctionInput` - just contains args and kwargs to pass to a function (is there a nicer way to do this?)
  * `modules` decorator - analogous to `ops`; specifies a set of modules to run a test over
  * Some constants used to keep track of all modules under torch.nn:
      * `MODULE_NAMESPACES` - list of all namespaces containing modules
      * `MODULE_CLASSES` - list of all module class objects
      * `MODULE_CLASS_NAMES` - dict from module class object to nice name (e.g. torch.nn.Linear -> "nn.Linear")
* (new file) `test/test_modules.py`
    * Uses the above to define tests over modules
    * Currently, there is one test for demonstration, `test_forward`, which instantiates a module, runs its forward pass, and compares it to a reference, if one is defined
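A stripped-down sketch of how these pieces fit together (hypothetical stand-ins, not the actual `common_modules.py` definitions):

```python
from dataclasses import dataclass, field

@dataclass
class FunctionInput:
    """Just args and kwargs to pass to a function."""
    args: tuple = ()
    kwargs: dict = field(default_factory=dict)

@dataclass
class ModuleInput:
    """Ties constructor and forward inputs together, since they are likely correlated."""
    constructor_input: FunctionInput
    forward_input: FunctionInput

# A tiny stand-in "module database" entry:
module_db = {
    "Identity": [ModuleInput(FunctionInput(), FunctionInput(args=(42,)))],
}

class Identity:
    def __call__(self, x):
        return x

# A loop in the spirit of test_forward: construct, run forward, check output.
for sample in module_db["Identity"]:
    module = Identity(*sample.constructor_input.args, **sample.constructor_input.kwargs)
    out = module(*sample.forward_input.args, **sample.forward_input.kwargs)
    assert out == 42
```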

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61935

Reviewed By: mruberry

Differential Revision: D29881832

Pulled By: jbschlosser

fbshipit-source-id: cc05c7d85f190a3aa42d55d4c8b01847d1efd57f
2021-07-27 07:42:07 -07:00
Howard Huang
afe3644321 Remove faulty process group code (#61907)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61907

Removing the code for faulty process group agent since it was replaced by faulty tensorpipe agent

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D29794666

Pulled By: H-Huang

fbshipit-source-id: 0b35191cc07220b6774ecacc8d004f25fd2e87f0
2021-07-27 07:37:40 -07:00
Erjia Guan
a3be2ecc3a Revert D29887367: [Static Runtime] Enforce proper output dtype for many ops
Test Plan: revert-hammer

Differential Revision:
D29887367 (f4136c5efc)

Original commit changeset: cef04bfa52ec

fbshipit-source-id: 32e89f2b6381930559dd746b535904c3e90fd52b
2021-07-27 07:29:09 -07:00
lezcano
b599c1e794 Create linalg and parametrizations codeowners (#62086)
Summary:
Added myself, nikitaved, and IvanYashchuk.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62086

Reviewed By: mruberry

Differential Revision: D29920798

Pulled By: albanD

fbshipit-source-id: dcbd57bb2a438a1f04d4651447710fced83264d3
2021-07-27 06:50:41 -07:00
CodemodService FBSourceClangFormatLinterBot
228b50e053 [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D29930232

fbshipit-source-id: e36dbc59a25d7f36d3bb7a02ad76696f299712cf
2021-07-27 04:13:15 -07:00
Jerry Zhang
2d7c1e3fa8 [bc-breaking] Produce quantization pattern for add_scalar and mul_scalar (#61859)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61859

BC-breaking note:
Previously we did not add an observer/fake_quant for the output of add/mul in tensor-scalar operations.
In this PR we add an observer/fake_quant instance (the same as the input's) to correctly model
the behavior of the quantized add_scalar and mul_scalar ops (since quantized add/mul scalar assumes the
output quantized tensor has the same quantization parameters as the input).
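A numeric sketch of the constraint (hypothetical scale and zero_point; plain Python standing in for the quantized ops): because quantized add_scalar reuses the input's quantization parameters, the observer inserted on the output must match the input's.

```python
def quantize(x, scale, zero_point):
    return round(x / scale) + zero_point

def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale

scale, zp = 0.1, 0
q_in = quantize(1.0, scale, zp)
# add_scalar operates on the integer representation, keeping the *input's* qparams:
q_out = q_in + quantize(2.0, scale, zp) - zp
print(dequantize(q_out, scale, zp))  # ~3.0 using the input's scale/zero_point
```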

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_add
python test/test_quantization.py TestQuantizeFxOps.test_mul

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D29770859

fbshipit-source-id: f43fcbfecd04c392467770b22c481bbbdaf43c25
2021-07-27 02:46:00 -07:00
Alex Suhan
b176feec1e Add device and key for lazy tensors (#61621)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61621

Test Plan: CI

Reviewed By: mruberry

Differential Revision: D29912934

Pulled By: asuhan

fbshipit-source-id: 493c32063a3e756d93cbf1d876563a35eaafb537
2021-07-26 23:00:22 -07:00
Nikita Shulga
2945a73d90 Add option to skip GH validation for torch.hub (#62139)
Summary:
Split from https://github.com/pytorch/pytorch/pull/62072

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62139

Reviewed By: mthrok

Differential Revision: D29891497

Pulled By: malfet

fbshipit-source-id: 5c0baf53a2acf8f95062bd001457e1f936011529
2021-07-26 22:44:12 -07:00
Rohan Varma
64283fe146 [DDP/Functional Optim] Support kwarg arguments (#62079)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62079

Adds support for kwarg arguments into functional optimizer running as
hook.
ghstack-source-id: 134330379

Test Plan: CI

Reviewed By: SciPioneer

Differential Revision: D29838127

fbshipit-source-id: 2ab051ef5f0dff19c145ebe2260668b927ba47b2
2021-07-26 22:12:50 -07:00
Rohan Varma
c0ebeca1a8 [Functional Optim] Test kwargs parity for SGD (#62078)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62078

Ensure that kwarg arguments such as momentum and weight decay maintain
parity between optimizer.step and step_param.
ghstack-source-id: 134330377

Test Plan: CI

Reviewed By: SciPioneer

Differential Revision: D29837942

fbshipit-source-id: 1ae39648fc26aebd8aaef1a7ac0e03b598a8ed60
2021-07-26 22:11:40 -07:00
Nikita Shulga
478098aaac Revert D29801652: Refactor Tensor::to to call a primitive that is not copy_.
Test Plan: revert-hammer

Differential Revision:
D29801652 (29bb3f4647)

Original commit changeset: bb01eb1acf3d

fbshipit-source-id: 93693bad8068d47a3a4c16f34f300e03ea573897
2021-07-26 19:37:17 -07:00
Rohan Varma
69adb21940 Parity tests for functional optimizer step_param (#61756)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61756

DDP will support running optimizer as communication hook with
optimizers that support a per-parameter/gradient step function `step_param`.
Add parity tests as we implement more optimizers that support step_param to
ensure parity with regular optimizers.
ghstack-source-id: 134330378

Test Plan: CI

Reviewed By: SciPioneer

Differential Revision: D29727549

fbshipit-source-id: 18977c896f12b8e478298488b298fd107affcf5f
2021-07-26 19:03:22 -07:00
Nikita Shulga
b6d10a3a27 Fix infinite loop in _validate_not_a_forked_repo() (#62072)
Summary:
Increase `page_idx` inside the loop rather than outside of it.
Break from the loop when an empty response is received, as it means there are no more items to fetch via pagination.

Also, add options to use provided github token (via `GITHUB_TOKEN` environment variable)

Fixes a "Rate Limit Exceeded" failure when doing something like `torch.hub.list("pytorch/test-infra:dsf")`

Fixes https://github.com/pytorch/pytorch/issues/61755
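The corrected loop structure can be sketched as follows (hypothetical `fetch_page` standing in for the GitHub API request):

```python
def list_all(fetch_page):
    """Paginate until an empty page; increment page_idx *inside* the loop."""
    results = []
    page_idx = 0
    while True:
        page = fetch_page(page_idx)
        if not page:  # empty response: no more items to fetch
            break
        results.extend(page)
        page_idx += 1  # previously incremented outside the loop -> infinite loop
    return results

pages = {0: ["a", "b"], 1: ["c"]}
print(list_all(lambda i: pages.get(i, [])))  # ['a', 'b', 'c']
```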

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62072

Reviewed By: jbschlosser

Differential Revision: D29868539

Pulled By: malfet

fbshipit-source-id: 206082a0ba1208e9b15ff6c9c6cb71d2da74f1c3
2021-07-26 17:54:07 -07:00
Pavithran Ramachandran
d0f430927b [PyTorch][Edge] Serializing sub modules with same names (#61933)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61933

### Issue:

Submodules with the same name are not serialized correctly in bytecode format when using `_save_for_mobile`. These submodules are not distinguished as different modules, even though they have different forward, setstate, etc., if they share the same name.

### Fix:
The Mangler creates unique names so that modules and submodules with the same name can be uniquely identified while saving the module. iseeyuan rightly pointed out the underlying issue: the mangler is not used in the process of saving bytecode, and hence unique references for the submodules are not created. Please refer to the notebook to repro the issue: N777224

### Diff:
The above fix is implemented: the mangled names are used in bytecode, so the files in the `code/` directory now have the right references in `bytecode.pkl`.

Will this have backward compatibility?
iseeyuan please feel free to correct or update this.
Yes. This fix impacts only modules with same-name submodules, which were not serialized correctly before. Existing modules should keep their correct references, and `_load_for_mobile` must not see any change. To confirm this, the existing test cases need to pass for the diff to be approved and shipped.
ghstack-source-id: 134242696

Test Plan:
```
~/fbsource/fbcode > buck test caffe2/test/cpp/jit:jit -- BackendTest.TestCompositeWithSetStates
Downloaded 0/5 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 19.2 sec (100%) 17619/17619 jobs, 3/17619 updated
  Total time: 19.5 sec
More details at https://www.internalfb.com/intern/buck/build/91542d50-25f2-434d-9e1a-b93117f4efe1
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: de9e27cf-4c6c-4980-8bc5-b830b7c9c534
Trace available for this run at /tmp/tpx-20210719-161607.659665/trace.log
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/844425127206388
    ✓ ListingSuccess: caffe2/test/cpp/jit:jit - main (8.140)
    ✓ Pass: caffe2/test/cpp/jit:jit - BackendTest.TestCompositeWithSetStates (0.528)
Summary
  Pass: 1
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/844425127206388
```

```
~/fbsource/fbcode > buck test caffe2/test/cpp/jit:jit -- BackendTest.TestConsistencyOfCompositeWithSetStates
Building: finished in 4.7 sec (100%) 6787/6787 jobs, 0/6787 updated
  Total time: 5.0 sec
More details at https://www.internalfb.com/intern/buck/build/63d6d871-1dd9-4c72-a63b-ed91900c4dc9
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 81023cd2-c1a2-498b-81b8-86383d73d23b
Trace available for this run at /tmp/tpx-20210722-160818.436635/trace.log
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/8725724325952153
    ✓ ListingSuccess: caffe2/test/cpp/jit:jit - main (7.867)
    ✓ Pass: caffe2/test/cpp/jit:jit - BackendTest.TestConsistencyOfCompositeWithSetStates (0.607)
Summary
  Pass: 1
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/8725724325952153
```

To check the `bytecode.pkl` using module inspector please check:
N1007089

Reviewed By: iseeyuan

Differential Revision: D29669831

fbshipit-source-id: 504dfcb5f7446be5e1c9bd31f0bd9c986ce1a647
2021-07-26 16:31:48 -07:00
mattip
a13f714b6d DOC: remove git stamp from release documentation version (#58486)
Summary:
CI built the documentation for the recent 1.9.0rc1 tag, but left the git version in the `version` string, so (as of now) going to https://pytorch.org/docs/1.9.0/index.html and looking at the version in the upper-left corner shows "1.9.0a0+git5f0bbb3" rather than "1.9.0". This PR changes that to cut off everything after and including the "a".
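The trimming described can be sketched as follows (assuming the pre-release suffix always begins at the first "a", as in the example version string):

```python
version = "1.9.0a0+git5f0bbb3"
# Cut off everything after and including the "a":
release = version[: version.find("a")] if "a" in version else version
print(release)  # 1.9.0
```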

It should be cherry-picked to the release/1.9 branch so that the next rc will override the current documentation with a "cleaner" version.

brianjo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58486

Reviewed By: zou3519

Differential Revision: D28640476

Pulled By: malfet

fbshipit-source-id: 9fd1063f4a2bc90fa8c1d12666e8c0de3d324b5c
2021-07-26 16:28:59 -07:00
Raghavan Raman
60070982d2 [Static Runtime] Fixed build failure in OSS due to test_utils (#62216)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62216

Test Plan: Imported from OSS

Reviewed By: hlu1

Differential Revision: D29917514

Pulled By: navahgar

fbshipit-source-id: 379863e6cd0b157de3bfa1482f5519b26654b3d2
2021-07-26 16:10:10 -07:00
Janet Yang
962841b532 Fix subnet counting and re-enable check for multiple onnxifi ops in AOT (#62033)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62033

Count the number of onnxifi ops rather than just the number of subnets, since a subnet whose size is < min_ops isn't turned into an onnxifi op.
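The counting change amounts to tallying actual onnxifi ops rather than subnets; a hedged sketch with a hypothetical op representation:

```python
def count_onnxifi_ops(net_ops):
    """Count actual Onnxifi ops; small subnets (< min_ops) never become ops."""
    return sum(1 for op in net_ops if op["type"] == "Onnxifi")

# Two subnets were lowered; a third was too small and stayed as a regular op:
ops = [{"type": "Onnxifi"}, {"type": "Concat"}, {"type": "Onnxifi"}]
print(count_onnxifi_ops(ops))  # 2
```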

Test Plan:
Runs which ran into the "Did not find a partition with an SLS node" error now report "multiple onnxifi ops found"
From https://fb.workplace.com/groups/527892364588452/permalink/807802049930814/:
```
buck run mode/opt-clang -c python.package_style=inplace sigrid/predictor/scripts:rerun_aot -- --manifold_url="https://manifold.facebook.net/v0/read/tree/2021-06-30/onnxifi_caffe2_net_aot_input_arguments_01-55-32_711d9476?bucketName=dper3_job_meta&apiKey=dper3_job_meta-key&timeoutMsec=5000&withPayload=1"

```
Reran some failures from last week which now pass AOT:
From https://fb.workplace.com/groups/527892364588452/permalink/807802049930814/,
https://fb.workplace.com/groups/243933520351820/permalink/572715897473579/

```
buck run mode/opt-clang -c python.package_style=inplace sigrid/predictor/scripts:rerun_aot -- --manifold_url="https://manifold.facebook.net/v0/read/tree/2021-07-09/onnxifi_caffe2_net_aot_input_arguments_05-31-08_ef5393a6?bucketName=dper3_job_meta&apiKey=dper3_job_meta-key&timeoutMsec=5000&withPayload=1"
```
```
buck run mode/opt-clang -c python.package_style=inplace sigrid/predictor/scripts:rerun_aot -- --manifold_url="https://manifold.facebook.net/v0/read/tree/2021-07-12/onnxifi_caffe2_net_aot_input_arguments_14-44-34_cfdf3053?bucketName=dper3_job_meta&apiKey=dper3_job_meta-key&timeoutMsec=5000&withPayload=1"
```
```
buck run mode/opt-clang -c python.package_style=inplace sigrid/predictor/scripts:rerun_aot -- --manifold_url="https://manifold.facebook.net/v0/read/tree/2021-07-13/onnxifi_caffe2_net_aot_input_arguments_04-03-30_162e7e53?bucketName=dper3_job_meta&apiKey=dper3_job_meta-key&timeoutMsec=5000&withPayload=1"
```

Reviewed By: khabinov

Differential Revision: D29796893

fbshipit-source-id: e9de7529ef86745207d41643d0fbe932fa166437
2021-07-26 16:08:51 -07:00
Shiyan Deng
037c4aa1d1 [fx2trt] flatten converter (#62202)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62202

Add acc_ops.flatten converter. Also migrate to the OSS acc tracer for the TRT interpreter.

Test Plan: unit test

Reviewed By: khabinov

Differential Revision: D29861555

fbshipit-source-id: dac88a703fdbf386f3f7fb27674a67951f3add49
2021-07-26 15:49:01 -07:00
Richard Barnes
f883ed9095 irange-ify 8b (#62195)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62195

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D29887946

fbshipit-source-id: e3bd44721cf06a34ced47994810212be8460a2bb
2021-07-26 15:38:54 -07:00
Richard Barnes
f7743e92bf irange-ify 9 (#62118)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62118

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D29879670

fbshipit-source-id: 99b86ac7d65dfa2a47d0e6b7d65433200d18081e
2021-07-26 15:13:02 -07:00
Kimish Patel
026cfe85b4 Fix InlinedCallStack annotation to account for module calling its own (#61791)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61791

methods from forward

During inlining we attach an InlinedCallStack to the nodes being inlined. In
the process we attach module information as well, such that if a
CallMethod is being inlined we know which class instance and class type
the method belongs to. However, CallMethod can be calling a method of
the same object to which the graph belongs, e.g.:

```
def forward(self, input):
  x = input + 10
  return forward_impl_(x, input)
```
Here forward_impl_ is a method defined on the same class in which forward
is defined. The existing module hierarchy annotation will mislabel this as an
unknown instance, since the method is not associated with the output of a
GetAttr node (it would be if we had called self.conv.forward_impl_, for
example).
The change in this PR reconciles this by creating a placeholder name "SELF"
for the module instance, indicating that you can traverse the InlinedCallStack
backwards to find the first node with a name != SELF, which is the name
of the object.
e.g.:
TOP(ResNet)::forward.SELF(ResNet)::_forward_impl.layer1(Sequential)::forward.0(BasicBlock)::forward.conv1(Conv2d)::forward.SELF(Conv2d)::_conv_forward
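Recovering the owning module name from such a callstack can be sketched as a backwards walk that skips the SELF placeholder (hypothetical frame representation, using only the instance names):

```python
def owning_module(frames):
    """Walk an inlined callstack backwards, skipping placeholder SELF frames."""
    for name in reversed(frames):
        if name != "SELF":
            return name
    return None

# Instance names from the example callstack above:
frames = ["TOP", "SELF", "layer1", "0", "conv1", "SELF"]
print(owning_module(frames))  # conv1
```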

Test Plan:
Add test

Imported from OSS

Reviewed By: larryliu0820

Differential Revision: D29745443

fbshipit-source-id: 1525e41df53913341c4c36a56772454782a0ba93
2021-07-26 15:00:57 -07:00
Nikita Shulga
f16102f72a Revert D29892919: Add squid proxy as egress cache
Test Plan: revert-hammer

Differential Revision:
D29892919 (e63160d735)

Original commit changeset: ac17227f2553

fbshipit-source-id: b78313147d60f26c1df68a25293e6b571ba66919
2021-07-26 14:42:28 -07:00
Edward Yang
cf1f59452b Hacky support for meta tensor serialization. (#62192)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62192

This support is hacky because it doesn't preserve meta tensor storage
sharing (e.g., if you serialize a model with shared storage, such as a
tensor and a view on that tensor, the viewing relationship will be broken
on deserialization and these become just different tensors). The
hack is also durable, in the sense that we will be on the hook for
supporting `_rebuild_meta_tensor_no_storage` in perpetuity in the
future, even if we change our mind about the serialization format.

This unblocks an FB production use case. I didn't add C++ support to minimize
blast area of this patch.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D29910535

Pulled By: ezyang

fbshipit-source-id: d98dcdd0108dfc3ae730a071d3c583b6d0281d21
2021-07-26 14:33:45 -07:00
Elton Leander Pinto
f0140a8c5f Disable cppcoreguidelines-non-private-member-variables-in-classes (#62212)
Summary:
This PR disables the `cppcoreguidelines-non-private-member-variables-in-classes` check. PyTorch makes use of `protected` members throughout the codebase, and we do not want to perform this clang-tidy check in CI to improve signal-to-noise.

Relevant failure: https://github.com/pytorch/pytorch/pull/61871/checks?check_run_id=3146453417

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62212

Reviewed By: driazati

Differential Revision: D29917882

Pulled By: 1ntEgr8

fbshipit-source-id: f607c3d050a122e95136f9915060c4cda6694c9d
2021-07-26 14:14:05 -07:00
Elton Leander Pinto
1343eea037 Fix clang-tidy line filtering logic (#62210)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62210

Fixes #62204

Test Plan: #62211 clang-tidy should only error on the added lines (and not on context/removals)

Reviewed By: driazati

Differential Revision: D29917897

Pulled By: 1ntEgr8

fbshipit-source-id: de91dbf34c1ad8405507cad91ab3dd0d6c61d82e
2021-07-26 14:12:53 -07:00
Elton Leander Pinto
2a83f24027 Enable macos clang-tidy installs (#62214)
Summary:
This PR enables installing our custom MacOS clang-tidy binaries. It also updates related documentation.

The binaries are produced by [this CI job](https://github.com/pytorch/test-infra/blob/master/.github/workflows/clang-tidy-macos.yml), and are published to S3.

This PR does not handle versioning of the downloaded binaries as this is being worked on separately. See https://github.com/pytorch/test-infra/issues/73

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62214

Test Plan:
On a MacOS machine, run
```bash
python3 -m tools.linter.install.clang_tidy
.clang-tidy-bin/clang-tidy --checks="*" --list-checks | grep "misc-max-tokens"
```

Reviewed By: janeyx99, mruberry

Differential Revision: D29917728

Pulled By: 1ntEgr8

fbshipit-source-id: 98d0d8b7a57bdebf0ebcdc83228ef391e8c6629e
2021-07-26 13:43:29 -07:00
Mike Iovine
f4136c5efc [Static Runtime] Enforce proper output dtype for many ops
Summary:
We previously had lots of ops with implementations like this:
```
if (p_node->Output(0).isNone()) {
  p_node->Output(0) = create_empty_like(input_0);
}
...
auto& out = p_node->Output(0);
some_func_out(inputs, out);
```
This would make the output have the correct shape. But it would
also take the dtype of `input_0`, which is not always correct.

This change transforms these blocks to:
```
if (p_node->Output(0).isNone()) {
  p_node->Output(0) = some_func(inputs);
} else {
  ...
  auto& out = p_node->Output(0);
  some_func_out(inputs, out);
}
```
This gives the output the correct shape and dtype.

Test Plan: `buck test //caffe2/benchmarks/static_runtime:static_runtime_cpptest`

Reviewed By: hlu1

Differential Revision: D29887367

fbshipit-source-id: cef04bfa52ec082ad3a9a32aa27c44e275c6b24c
2021-07-26 13:27:02 -07:00
Richard Zou
29bb3f4647 Refactor Tensor::to to call a primitive that is not copy_. (#61458)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61458

Context
-------
functorch is unable to vmap(grad(f)) when f contains a .to
call. This is because .to (when it is not a no-op) decomposes
to .copy_ under grad and the .copy_ is not compatible with vmap.

Fix
 ---
The fix for this is to have all Tensor::to variants call a new operator,
`_to_copy`, that always copies and is a primitive w.r.t. autograd so
that autograd decomposes Tensor::to into a call to `_to_copy`.
(This is related to https://github.com/pytorch/pytorch/issues/60956,
please let me know if you want to bikeshed the naming).

In order to get this done I had to do a bit of refactoring. All of the
`::to` implementations now call `to_impl` which may call `_to_copy`.

Autograd codegen changes
------------------------

The second thing I had to do was modify the autograd codegen. Right now,
autograd assumes that every output is either statically known to be
differentiable or not differentiable at codegen time. `_to_copy` is a
little special because its differentiability depends on the output
dtype, e.g. `torch.randn(3, requires_grad=True).to(torch.long)` is
non-differentiable. To get this to work:
- I changed how `output_differentiability` in derivatives.yaml work.
- output_differentiability can now accept "conditions" for each of the
output arguments. A "condition" is some C++ code.
- We currently only support `output_differentiability` with conditions
if there is a single output. This is for convenience and can be changed
in the future.
- I added a new `output_differentiability_conditions` field to
DifferentiabilityInfo. This gets populated in load_derivatives.yaml
- forward-mode and reverse-mode AD take
`output_differentiability_conditions` into account.

Here's how the generated code for `VariableType::_to_copy`
[looks
like](https://gist.github.com/zou3519/93462df4bda1837acee345205b7cc849)
No other autogenerated code gets modified by this PR.

Performance benchmarking
------------------------
- I benchmarked [three
cases that demonstrate overhead](https://gist.github.com/zou3519/5b6985e6906b80eec5a0dd94ed5b6a1a).
- Case A: No-op .to(). Instruction count went from 50223 to 25623. I
have no clue why but this is a good thing.
- Case B: not-no-op .to(). Instruction count went from 665291 to 671961.
This is expected; `_to_copy` adds an additional dispatch.
- Case C: not-no-op .to() forward pass and backward pass. Instruction count
went from 4022841 to 4030057. This PR adds
an additional dispatch to .to() (so there should be one additional
dispatch in the forward pass) so this number looks reasonable.

Test Plan
---------
- test_torch.py has a test_to
- test_cuda.py has test_to*
- test_autograd has tests (test_type_conversions) that exercise the
reverse-mode path
- test_ops.py has some tests (like log_softmax) that exercise the
reverse-mode and forward-mode AD path.
- test_quantization, test_namedtensor all exercise tensor.to as well.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D29801652

Pulled By: zou3519

fbshipit-source-id: bb01eb1acf3d79d84f284150d1be4be3b4ace351
2021-07-26 13:02:39 -07:00
zhouzhuojie
e63160d735 Add squid proxy as egress cache (#62103)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62103

This PR adds a squid proxy that's deployed dedicated for PyTorch CI. Initially we only roll out to GHA, and if things are ok we will extend this to circleci tests if necessary.

`http_proxy` and `https_proxy` are compatible with the following http clients:

- curl
- wget
- python

Existing cache policy:

```
refresh_pattern -i \.(7z|deb|rpm|exe|zip|tar|tgz|gz|ram|rar|bin|tiff|bz2|run|csv|sh)$ 1440 80% 2880
```

It uses the standard squid `refresh_pattern` to cache responses. In our setup, objects are
cached for at least 1440 minutes (1 day) and at most 2880 minutes (2 days), with a
last-modified factor of 80% ([squid doc](http://www.squid-cache.org/Doc/config/refresh_pattern/)). Please refer to [pytorch/test-infra](https://github.com/pytorch/test-infra/tree/master/aws/websites/squid-proxy) for details.
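
The freshness heuristic behind those numbers can be sketched as follows (a simplified model of squid's `refresh_pattern` check, not its exact algorithm): an object younger than the minimum age is fresh, older than the maximum is stale, and in between it is fresh while its age is under 80% of the time since it was last modified.

```python
def is_fresh(age, time_since_last_modified, min_age=1440, max_age=2880, lm_factor=0.8):
    # Simplified sketch of squid's refresh_pattern freshness decision
    # (ages in minutes, matching the 1440 / 80% / 2880 policy above).
    if age <= min_age:
        return True
    if age >= max_age:
        return False
    return age < lm_factor * time_since_last_modified

assert is_fresh(60, 100)           # younger than 1 day -> fresh
assert not is_fresh(4000, 100000)  # older than 2 days -> stale
assert is_fresh(2000, 10000)       # 2000 < 0.8 * 10000 -> fresh
assert not is_fresh(2000, 2000)    # 2000 >= 0.8 * 2000 -> stale
```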

Right now, it only applies to the `build` and `test` steps, to limit the scope and make sure build and test are more reliable with the egress cache.

Test Plan: Imported from OSS

Reviewed By: jbschlosser, malfet, seemethere, janeyx99

Differential Revision: D29892919

Pulled By: zhouzhuojie

fbshipit-source-id: ac17227f2553ca62881711b3e9943488dfd8defd
2021-07-26 13:01:34 -07:00
Richard Barnes
d2594fa538 irange-ify 3 (#62112)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62112

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D29879513

fbshipit-source-id: c01d18d34bb19014bf28d92c4d04b07e50a2770a
2021-07-26 12:56:58 -07:00
Salil Desai
f5c6c3947e Remove Input Pointer Caching for XNNPack (#61959)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61959

We no longer need to cache the input pointer, as XNNPACK has implemented a more robust approach in which the indirection buffer does not need to be recalculated even if the activation tensor pointer changes, as long as the tensor dimensions stay the same.

This reverses the changes in https://github.com/pytorch/pytorch/pull/42840/files
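
The key property can be sketched in a few lines (an illustrative model, not XNNPACK's actual data structure): if the indirection buffer stores offsets derived only from tensor geometry rather than absolute pointers, it is identical for any two tensors with the same dimensions, so it need not be rebuilt when only the base pointer changes.

```python
def indirection_offsets(h, w, kh, kw, row_stride):
    # Offsets (not absolute pointers) into the input for each tap of a
    # kh x kw convolution window; they depend only on tensor geometry.
    offs = []
    for oy in range(h - kh + 1):
        for ox in range(w - kw + 1):
            offs.append([(oy + ky) * row_stride + (ox + kx)
                         for ky in range(kh) for kx in range(kw)])
    return offs

a = indirection_offsets(4, 4, 3, 3, 4)
# first 3x3 window of a 4-wide row layout
assert a[0] == [0, 1, 2, 4, 5, 6, 8, 9, 10]
# same geometry -> identical offsets, so no recomputation is needed when
# only the activation tensor's base pointer changes
assert a == indirection_offsets(4, 4, 3, 3, 4)
```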

Reviewed By: kimishpatel

Differential Revision: D29777605

fbshipit-source-id: c1750538c17bce34f885c6f1bbb1f7164ebba25b
2021-07-26 12:02:15 -07:00
Richard Barnes
7ec6d1e857 irange-ify 2 (#62113)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62113

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D29879507

fbshipit-source-id: 1fb114e44afe8c1407f648b705db7fd4edb9d6e3
2021-07-26 12:00:52 -07:00
Rohan Varma
6dc2c07304 [Reland] [DDP] Implement a hook which performs FunctionalSGD step. (#62177)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62177

Reland of https://github.com/pytorch/pytorch/pull/61678
Fix CI failure by gating including torchvision model on whether torchvision is available or not.
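
The gating pattern looks roughly like this (a sketch with hypothetical test names; the flag check itself is the standard way to probe for an optional package):

```python
import importlib.util
import unittest

# Skip the torchvision-based test on CI machines without torchvision,
# instead of failing at import time.
HAS_TORCHVISION = importlib.util.find_spec("torchvision") is not None

class TestDDPFunctionalSGDHook(unittest.TestCase):  # hypothetical test class
    @unittest.skipIf(not HAS_TORCHVISION, "torchvision not available")
    def test_hook_on_torchvision_model(self):
        ...  # build the torchvision model and exercise the FunctionalSGD hook

assert isinstance(HAS_TORCHVISION, bool)
```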
ghstack-source-id: 134282165

Test Plan: CI

Reviewed By: SciPioneer

Differential Revision: D29904101

fbshipit-source-id: 47e799eb4a90acbbda91c5857ea00de3045d49f5
2021-07-26 11:56:56 -07:00
Jamie King
1dfb687f3c Fixed off-by-one bug in Adam Smart Decay (#62135)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62135

The initial implementation of Adam with Smart Decay had an off-by-one error in the summation of the geometric series used to calculate how much built-up momentum would have been discharged during skipped minibatches.

The unit tests should have caught this, but the testing strategy missed it because k, the number of skipped minibatches, was always either 0 or so high that the impact of the bug was too small to detect. The impact of the bug is proportional to 1/k. The testing strategy has been adjusted to cover this case.
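
To illustrate the class of bug (a generic reconstruction, not the actual Smart Decay formula): summing a geometric series over k skipped minibatches in closed form must agree with the term-by-term sum, and an off-by-one in the exponent only shows up clearly at small k.

```python
def discharged_loop(beta, k):
    # term-by-term sum of beta^i for i = 1..k
    return sum(beta ** i for i in range(1, k + 1))

def discharged_closed(beta, k):
    # correct closed form for sum_{i=1}^{k} beta^i
    return beta * (1 - beta ** k) / (1 - beta)

def discharged_off_by_one(beta, k):
    # buggy variant: closed form for one term too few
    return beta * (1 - beta ** (k - 1)) / (1 - beta)

beta = 0.9
for k in (1, 2, 5, 100):
    assert abs(discharged_loop(beta, k) - discharged_closed(beta, k)) < 1e-12
# at k = 1 the buggy variant is badly wrong; at large k the two nearly
# agree, which is how tests that only used k = 0 or large k missed it
assert abs(discharged_closed(beta, 1) - discharged_off_by_one(beta, 1)) > 0.5
```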

Differential Revision: D29889309

fbshipit-source-id: b086c0efed5c27f621061e726533c73658daffc6
2021-07-26 11:55:38 -07:00
Supriya Rao
dcb3eadc1f [quant][fix] Update quantization c++ tests to not run if CPU_STATIC_DISPATCH is specified (#62197)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62197

For build configs with ATEN_CPU_STATIC_DISPATCH defined, quantization tests will fail since they
require the QuantizedCPU dispatch to be enabled.
This fixes some internal test failures, such as https://www.internalfb.com/intern/test/844424941811803?ref_report_id=0, which are run under the `caffe2_aten_cpu_inference` project.

Test Plan:
buck test mode/dev //caffe2/aten:quantized_test

Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D29912742

fbshipit-source-id: b117eb9f4afb51e0d0dd52fbe9d5c5be7dfafe85
2021-07-26 11:39:45 -07:00
Richard Barnes
0ca5dc7f03 irange-ify 5 (#62114)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62114

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D29879534

fbshipit-source-id: 0b1d6d2c9062a2fd7a55b00cb9f3d59ec941bad3
2021-07-26 11:07:54 -07:00
Akshit Khurana
8e71f48f0a Handle simple NNAPI flatten NHWC case (#61796)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61796

We can easily handle NNAPI conversion for NHWC inputs
that have 1 channel, or whose H and W are both 1.
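
Why those cases are safe can be checked directly (an illustrative sketch, not the converter's code): with C == 1, or with H == W == 1, the NHWC and NCHW memory orders visit elements in the same sequence, so a flatten needs no layout permutation.

```python
from itertools import product

def flatten_order(n, h, w, c, layout):
    # Sequence of logical (n, h, w, c) indices in memory order.
    if layout == "NHWC":
        return [(ni, hi, wi, ci)
                for ni, hi, wi, ci in product(range(n), range(h), range(w), range(c))]
    else:  # NCHW
        return [(ni, hi, wi, ci)
                for ni, ci, hi, wi in product(range(n), range(c), range(h), range(w))]

# C == 1: the two memory orders coincide
assert flatten_order(2, 3, 4, 1, "NHWC") == flatten_order(2, 3, 4, 1, "NCHW")
# H == W == 1: likewise
assert flatten_order(2, 1, 1, 5, "NHWC") == flatten_order(2, 1, 1, 5, "NCHW")
# general case: the orders differ, so a plain flatten would be wrong
assert flatten_order(1, 2, 2, 3, "NHWC") != flatten_order(1, 2, 2, 3, "NCHW")
```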

Test Plan:
pytest test/test_nnapi.py::TestNNAPI::test_flatten

Imported from OSS

Reviewed By: saketh-are

Differential Revision: D29827735

fbshipit-source-id: 65dee4b42fceef1b032bf5dd1c4cc6e020d01e14
2021-07-26 10:59:04 -07:00
kshitij12345
b73d759708 [fix] polygamma n>=1 (#61641)
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/55357

TODO:
* [x] Use proper casting to avoid confusing the compiler

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61641

Reviewed By: albanD

Differential Revision: D29816592

Pulled By: mruberry

fbshipit-source-id: 2c020a6e4c325c1b5d15499a77fb39f9ba93dd79
2021-07-26 10:52:20 -07:00
Pritam Damania
ef7d572afa Ensure ShardedTensor handles list/tuple appropriately as size parameter. (#62109)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62109

The `size` parameter only worked correctly for an `*args`-like invocation
(10, 20), and not for a list [10, 20] or a tuple (10, 20). This PR ensures it
works like `torch.empty`.
ghstack-source-id: 134246166

Test Plan:
1) unit tests
2) waitforbuildbot

Reviewed By: SciPioneer

Differential Revision: D29884768

fbshipit-source-id: 7a4a3c5ed5d7c081344f6ead3170905b97fc652d
2021-07-26 10:31:32 -07:00
Richard Barnes
f9dce598a5 Add some missing cuda guards (#62100)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62100

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D29880330

fbshipit-source-id: 7089000ccbcaa70a13f0ab4531b032bd5326e539
2021-07-26 10:26:22 -07:00