Commit Graph

44789 Commits

Author SHA1 Message Date
James Reed
a2d2610ec9 [FX] Assert None concrete_args and improve error messages (#74662)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74662

Previously, we would not emit a check that `concrete_args` with value `None` matched that value at runtime. This fixes that and improves some of the warning messages.
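
For illustration, a minimal sketch (not from this commit) of tracing with a `None` in `concrete_args`; after this fix, passing a different value at call time should trip the emitted check:

```python
import torch
from torch import fx

def f(x, flag):
    # control flow is specialized away when flag is traced as a concrete None
    return x * 2 if flag is None else x + flag

gm = fx.symbolic_trace(f, concrete_args={"flag": None})
print(gm(torch.randn(3), None))   # matches the specialization: fine
# gm(torch.randn(3), 1.0)         # should now fail the emitted None check at runtime
```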

Test Plan: Imported from OSS

Reviewed By: Chillee

Differential Revision: D35137362

Pulled By: jamesr66a

fbshipit-source-id: 222a2c8a907748f90290f1c1b4ab8012b46099a0
(cherry picked from commit b960405ad87e57dcf62ca25dd4d4bdfc34c8744c)
2022-03-25 23:36:27 +00:00
Linbin Yu
1c4eb3a266 [android] improve unsupported scalar type error message for android
Summary: Android only supports a few scalar types as model return values. This diff improves the error message so users can know which type is not supported.

Test Plan: verified unsupported scalar type is printed

Differential Revision: D35104788

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74660
Approved by: https://github.com/kit1980
2022-03-25 23:07:46 +00:00
Christian Puhrsch
edf2deb81e Add private conversion function from CSR to block CSR
This PR adds a private function that converts a CSR Tensor into a [scipy-style block CSR Tensor](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.bsr_matrix.html#scipy.sparse.bsr_matrix).

It uses the scipy CSR to BSR conversion routines (and credits them accordingly).

The main purpose of this function is to easily create a block CSR Tensor for matrix multiplication.
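
For reference, the scipy conversion this function mirrors (and credits) looks like the following; the new PyTorch entry point is private and not named in this message, so only the scipy side is shown:

```python
import numpy as np
from scipy.sparse import csr_matrix

a = csr_matrix(np.arange(16.0).reshape(4, 4))  # plain CSR
b = a.tobsr(blocksize=(2, 2))                  # scipy's CSR -> block CSR (BSR)
print(b.blocksize, b.data.shape)               # (2, 2) blocks stored densely
```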

Follow up work includes
- Blocksize support for sparse_csr_tensor
- Parallel CPU kernel
- CUDA kernels
- Faster arg sanitization
- Benchmarking of cuSPARSE backend
- Dense to/from block CSR
- Autograd support
- Column-major blocks
- Block CSR to CSR conversion
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71582
Approved by: https://github.com/IvanYashchuk, https://github.com/albanD
2022-03-25 21:22:15 +00:00
anjali411
1dab71ab25 Allow specifying tags for aten operators in native_functions.yaml
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72549

Approved by: https://github.com/ezyang
2022-03-25 21:17:52 +00:00
Eli Uriegas
79f91e6ef4 ci: Move ssh setup to its own action
SSH setup was being hidden away in the setup step for both Linux and
Windows; this moves it out to its own step so that users can know where
to click to get SSH details.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74773

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Approved by: https://github.com/suo, https://github.com/malfet
2022-03-25 21:10:45 +00:00
Will Constable
85abc328b9 Adds dependencies on lazy codegen sources to invocation of generate_code (#74750)
Summary:
Isn't foolproof since it doesn't include transitive deps of these python scripts, but it's better than nothing.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74750

Reviewed By: bdhirsh

Differential Revision: D35145749

Pulled By: wconstab

fbshipit-source-id: ccd77cf18f68cc66c790f41f111833eca4101dac
(cherry picked from commit 5ef757ffd27837eb2b3c98935d66aecb1fc5acf9)
2022-03-25 20:50:52 +00:00
Richard Zou
a75c718d7c [reland] Update tls logic to work better with guarded call (#73925)
This PR relands https://github.com/pytorch/pytorch/pull/73925 which we
reverted due to a large breakage in functorch.

As a part of the reland, this PR adds a change we agreed upon in
https://docs.google.com/document/d/1i7Y9VZp9PxtgVcrQh6nGQXkXkPc1uMep0dM-OMOGJ9o/edit
The change is moving the PythonTLSSnapshot key after
DynamicLayerFrontMode.

Test Plan:
- I tested this with an updated version of functorch and all the tests
pass so I think we are out of the woods.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74577

Approved by: https://github.com/albanD
2022-03-25 19:51:10 +00:00
Aaron Enye Shi
d014772b9f [Profiler] Store Input shapes, dtypes, and metadata into flat AppendOnlyList (#74241)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74241

Adds the following changes:
- During collection, replace the vector of vectors of int shapes and the vector of string dtypes; instead, pack the IValue details into InputOutputEncoder as flat AppendOnlyLists.
- This saves each IValue with an enum tag, metadata holding its dim and dtype, and the shapes.
- During post-processing, reconstruct the vectors that were originally expected (struct Inputs).
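
An illustrative Python sketch of the flat, tagged layout described above (the actual implementation is C++, using InputOutputEncoder over AppendOnlyList):

```python
# Three flat append-only lists instead of vector<vector<...>> per op.
tags, metadata, flat_shapes = [], [], []

def encode(values):
    for v in values:
        if hasattr(v, "dtype"):                  # tensor-like value
            tags.append("TENSOR")
            metadata.append((v.dim(), v.dtype))  # dim + dtype per tensor
            flat_shapes.extend(v.shape)          # shapes flattened, no nesting
        else:
            tags.append("OTHER")
            metadata.append(None)

# Post-processing walks (tags, metadata) to slice flat_shapes back into
# the per-op vectors that struct Inputs expects.
```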

Reviewed By: chaekit

Differential Revision: D34823546

Pulled By: aaronenyeshi

fbshipit-source-id: 718fccaa8aab16128da986d665564a8fef5436c8
(cherry picked from commit 96a47c068e55220e7b7224c8a1935033859b5cd2)
2022-03-25 19:10:09 +00:00
Omkar Salpekar
e8c4926e75 [GHF] Adding James Reed to Merge Rules superusers (#74758)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74758

See Title

Test Plan: NA

Reviewed By: jamesr66a

Differential Revision: D35147762

fbshipit-source-id: 34572bfb3aef5e14a06fe27dc3008308b40bdc34
(cherry picked from commit 190f429cdf6381b7d2c955ed9e9a2a62930d0582)
2022-03-25 18:38:18 +00:00
Pearu Peterson
ebeea9e2ea Support masked sum on sparse COO tensors.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71239

Approved by: https://github.com/cpuhrsch
2022-03-25 18:26:39 +00:00
Xiang Gao
3b29bd00eb Make ProcessGroupNCCL load torch_ucc.so when TORCH_UCC_LIBRARY_PATH is set (#69552)
Summary:
This is the very first step for the UCC-NCCL integration. This PR lets `ProcessGroupNCCL` load `torch_ucc.so` if the user specifies the environment variable `TORCH_UCC_LIBRARY_PATH`. If this environment variable is not specified by the user, then there will be no visible change.

In the future, we may want to make PyTorch smart enough to automatically detect the `torch_ucc.so` in the user's system, but before doing that, I believe we should first make sure that `ProcessGroupUCC` is very well tested.

Note that in this PR, `ProcessGroupNCCL` just loads the library but will not use it. I am trying to make PRs small, so the usage of `torch_ucc.so` will be submitted in later PRs.

This PR requires the change in https://github.com/facebookresearch/torch_ucc/pull/56, otherwise `torch_ucc.so` cannot be successfully loaded. But this PR can be landed separately without waiting for https://github.com/facebookresearch/torch_ucc/pull/56 because, in PyTorch's unit tests, UCC is never used or tested.
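
A hypothetical usage sketch (the library path is an example, and a NCCL build plus a compatible torch_ucc build are assumed):

```python
import os
import torch.distributed as dist

# If TORCH_UCC_LIBRARY_PATH is unset, ProcessGroupNCCL behaves as before.
os.environ["TORCH_UCC_LIBRARY_PATH"] = "/usr/local/lib/torch_ucc.so"  # example path
dist.init_process_group("nccl", init_method="tcp://127.0.0.1:29500",
                        rank=0, world_size=1)
```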

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69552

Reviewed By: mruberry

Differential Revision: D34675212

Pulled By: jiayisuse

fbshipit-source-id: a3d1fb98340dbe3a931af555423863efd381f1ae
(cherry picked from commit 3778b6fabe70c26b5a65e6ddec641d2ef9113cd1)
2022-03-25 18:19:39 +00:00
Nikita Shulga
f36ceefd71 [GHF] Speedup default PR query
Fetch check run statuses only for the last commit, and author names only for the first hundred commits.

This avoids hitting the resource limits on PRs with lots of commits.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74731
Approved by: https://github.com/seemethere
2022-03-25 18:12:29 +00:00
Slava Kovalevskyi
3b3bdfd51c Revert D34808842: Reland "[pytorch][PR] Support dataclasses in TorchScript"
Test Plan: revert-hammer

Differential Revision:
D34808842 (b57cc9c752)

Original commit changeset: 02f807cff1ea

Original Phabricator Diff: D34808842 (b57cc9c752)

fbshipit-source-id: bd7c47493b598677e77634d06d7dc3e3a457b92d
(cherry picked from commit e1853d73b3ad2494457626fbb34c65169ae8cc31)
2022-03-25 17:17:30 +00:00
Christian Puhrsch
7fe0b6a5cd mul(sparse_csr, sparse_csr) using mul(sparse, sparse)
Basic fallback implementation. Let's make this faster once used.

NOTE: This is stacked on top of https://github.com/pytorch/pytorch/pull/74294
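
A minimal sketch of the new entry point, assuming the COO fallback described above:

```python
import torch

crow = torch.tensor([0, 1, 2])
col = torch.tensor([0, 1])
val = torch.tensor([1.0, 2.0])
a = torch.sparse_csr_tensor(crow, col, val, (2, 2))
b = torch.sparse_csr_tensor(crow, col, val, (2, 2))
c = torch.mul(a, b)  # internally routed through mul(sparse, sparse) on COO
```
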
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74266
Approved by: https://github.com/pearu, https://github.com/malfet
2022-03-25 17:10:33 +00:00
Michael Melesse
cd929f403f [ROCM] Navi21 Enablement 7: Sparse kernels
This PR is a follow up to the following prs.
https://github.com/pytorch/pytorch/pull/69942
https://github.com/pytorch/pytorch/pull/72682
https://github.com/pytorch/pytorch/pull/72809
https://github.com/pytorch/pytorch/pull/73543
https://github.com/pytorch/pytorch/pull/73545
https://github.com/pytorch/pytorch/pull/73546

We are adding support for Navi21 GPUs, which have a warp size of 32. We cannot rely on a constant, so we have to dynamically look up the warp size when launching the kernel on the host side. Inside device functions this is not needed, and the compiler can correctly detect the warp size to replace the C10_WARP_SIZE constant.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73548
Approved by: https://github.com/ngimel
2022-03-25 17:09:03 +00:00
Brian Hirsh
c0491c9179 DispatchKeySet perf improvements (#72828)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72828

Reland of D34034847 (8aa3620d73)
ghstack-source-id: 152161453

Test Plan: confirm that Milan tests are passing

Reviewed By: ezyang, albanD

Differential Revision: D34227615

fbshipit-source-id: c7695e16dba3076e8ab9df8654327c5d57e92c77
(cherry picked from commit 940717db1551b799964894e0bb97757ecae14235)
2022-03-25 17:04:51 +00:00
Brian Hirsh
2cbddc0e9b free up dispatch key space (in C++) (#72827)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72827

Reland of D34034848 (6690256021)
ghstack-source-id: 152161452

Test Plan: Confirm that Milan tests are passing

Reviewed By: ezyang

Differential Revision: D34227616

fbshipit-source-id: 6d1dd0fd8144dfbd9e194cd7564cce017e7db968
(cherry picked from commit e5c1b29fedd5c2a0bad810cedc94aa784136b6aa)
2022-03-25 17:04:51 +00:00
Alban Desmaison
7c747c7907 Add Sherlock to superusers
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74744
Approved by: https://github.com/SherlockNoMad, https://github.com/seemethere
2022-03-25 17:02:14 +00:00
Jerry Zhang
0747bdbf11 [quant][fx] Removing more unused code (#74603)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74603

att

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: andrewor14

Differential Revision: D35071546

fbshipit-source-id: 273a7f0cb2a8f306864eb118916056fad3bb1399
(cherry picked from commit 9c31a50a2bccb2e5b7a5db833085a75e5ebda707)
2022-03-25 16:39:48 +00:00
Salil Desai
cdcd1ac121 [PyTorch Edge] Make contexts thread local for quantized matmul (#74676)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74676

We don't want to create and destroy a new context with each multiplication

Test Plan:
From fbcode:
```buck test caffe2/test:quantization -- test_qmatmul```

# Performance Improvement
*Benchmarking done on a model which performs matmuls of the same shapes and counts as the Transformer Model, as determined in D30901505*

*Notebook in which Benchmarking was performed: https://www.internalfb.com/intern/anp/view/?id=1582075&revision_id=1891629751047842*

**Improvement from this diff alone**
~9.71% Reduction in Latency
- Non Thread Local Contexts (before this diff, D35087184 v2): [8.5410ms](https://www.internalfb.com/intern/aibench/details/661728682381311)
- Thread Local Contexts (this diff, v12): [7.7113ms](https://www.internalfb.com/intern/aibench/details/956655867696198)

**FP32 Matmul vs Quantized Matmul, Overall Improvement from this diff stack**
56% reduction in latency compared to FP32 Matmul, 71% reduction in latency compared to Naive QMatmul
- FP32 Matmul: [17.4910ms](https://www.internalfb.com/intern/aibench/details/875394396322469)
- Quantized Matmul (after this diff): [7.7113ms](https://www.internalfb.com/intern/aibench/details/956655867696198)
- Naive Quantized Matmul (dequantize → fp32matmul → quantize): [26.8639ms](https://www.internalfb.com/intern/aibench/details/52181682131461)

Reviewed By: kimishpatel

Differential Revision: D34756288

fbshipit-source-id: b000658152cf71b4185dcd34a3cccc71b4cec1f0
(cherry picked from commit 5bc7ef6b5c3255388eb8fab230e44073004d2266)
2022-03-25 15:36:01 +00:00
Pavel Belevich
96c8f64459 Remove with_traceback(None) in wrapped_call to show the root cause error
Before:
```
Traceback (most recent call last):
  File "/Users/pbelevich/PycharmProjects/PiPPy/test/t5_test.py", line 37, in <module>
    t5_pipe_output = t5_pipe(input_ids=t5_input, decoder_attention_mask=None, decoder_input_ids=decoder_input_ids)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/pbelevich/PycharmProjects/PiPPy/pippy/IR.py", line 251, in forward
    return self.executor.run(*executor_args)
  File "/Users/pbelevich/PycharmProjects/PiPPy/pippy/IR.py", line 155, in run
    return super().run(*args, initial_env=initial_env)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/interpreter.py", line 121, in run
    self.env[node] = self.run_node(node)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/interpreter.py", line 148, in run_node
    return getattr(self, n.op)(n.target, args, kwargs)
  File "/Users/pbelevich/PycharmProjects/PiPPy/pippy/IR.py", line 170, in call_module
    return super().call_module(target, args, kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/interpreter.py", line 265, in call_module
    return submod(*args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 630, in wrapped_call
    raise e.with_traceback(None)
AttributeError: 'NoneType' object has no attribute 'dtype'
```
After:
```
Traceback (most recent call last):
  File "/Users/pbelevich/PycharmProjects/PiPPy/test/t5_test.py", line 37, in <module>
    t5_pipe_output = t5_pipe(input_ids=t5_input, decoder_attention_mask=None, decoder_input_ids=decoder_input_ids)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/pbelevich/PycharmProjects/PiPPy/pippy/IR.py", line 251, in forward
    return self.executor.run(*executor_args)
  File "/Users/pbelevich/PycharmProjects/PiPPy/pippy/IR.py", line 155, in run
    return super().run(*args, initial_env=initial_env)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/interpreter.py", line 121, in run
    self.env[node] = self.run_node(node)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/interpreter.py", line 148, in run_node
    return getattr(self, n.op)(n.target, args, kwargs)
  File "/Users/pbelevich/PycharmProjects/PiPPy/pippy/IR.py", line 170, in call_module
    return super().call_module(target, args, kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/interpreter.py", line 265, in call_module
    return submod(*args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 630, in wrapped_call
    raise e
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 620, in wrapped_call
    return cls_call(self, *args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 630, in wrapped_call
    raise e
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 620, in wrapped_call
    return cls_call(self, *args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 630, in wrapped_call
    raise e
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 620, in wrapped_call
    return cls_call(self, *args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 630, in wrapped_call
    raise e
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 620, in wrapped_call
    return cls_call(self, *args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 630, in wrapped_call
    raise e
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 620, in wrapped_call
    return cls_call(self, *args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 630, in wrapped_call
    raise e
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/fx/graph_module.py", line 622, in wrapped_call
    return super(cls, self).__call__(*args, **kwargs)
  File "/Users/pbelevich/miniconda3/envs/PiPPy/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "<eval_with_key>.42", line 74, in forward
  File "/Users/pbelevich/PycharmProjects/pbelevich-transformers/src/transformers/utils/fx.py", line 180, in wrapper
    return func(*args, **kwargs)
  File "/Users/pbelevich/PycharmProjects/pbelevich-transformers/src/transformers/modeling_utils.py", line 256, in create_extended_attention_mask_for_decoder
    causal_mask = causal_mask.to(attention_mask.dtype)
AttributeError: 'NoneType' object has no attribute 'dtype'
```

The last lines of the stack trace show where the problem is.
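
A minimal repro of the masking behavior removed here:

```python
def inner():
    return None.dtype  # the real bug: AttributeError on NoneType

try:
    inner()
except AttributeError as e:
    raise e.with_traceback(None)  # before: traceback stops here, hiding inner()
    # raise e                     # after: full traceback down to inner()
```
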
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74655
Approved by: https://github.com/ansley, https://github.com/rohan-varma
2022-03-25 14:40:45 +00:00
Nicolas Hug
7df0d9fda4 Call super().setUp() and super().tearDown() in torchhub tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74621

Approved by: https://github.com/vmoens, https://github.com/janeyx99, https://github.com/cpuhrsch
2022-03-25 14:36:31 +00:00
atalman
ca96d1d447 Use nvidia cuda image without cudnn for cudnn 8 and up
Use nvidia cuda image without cudnn for cudnn 8 and up.
We want to decouple the CUDA and cuDNN versions so that we can evolve these versions separately.
We want to use cuDNN 8.3.2 for the following CUDA versions: 11.3, 11.5 and 11.6.
We are using the official Nvidia CUDA Ubuntu image and installing cuDNN 8.3.2 on top of it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74545
Approved by: https://github.com/malfet
2022-03-25 12:18:42 +00:00
Jerry Zhang
66e07f2aef [quant][fx] Merge is_general_tensor_shape_op into is_general_tensor_value_op in QuantizeHandler (#74601)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74601

Currently the behavior for general tensor shape ops and general tensor value ops is the same, so we can remove
the is_general_tensor_shape_op flag and merge it into the is_general_tensor_value_op flag.

The is_general_tensor_value_op flag is used in two places in prepare:
(1) dtype propagation: we only do dtype propagation when this flag is true (this will be refactored in the future to be more systematic)
(2) observer sharing: we'll use the input observer instance as the output observer for an op if this flag is True

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: george-qi

Differential Revision: D35071438

fbshipit-source-id: 5e8f5fd84e37db0433a63fe0a0e212ce3c5908d6
(cherry picked from commit b4bbc9fa0e65f3768eb97ca8e84b7cbd7e840b67)
2022-03-25 11:10:44 +00:00
CodemodService FBSourceClangFormatLinterBot
7235ebc5e2 [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zsol

Differential Revision: D35138849

fbshipit-source-id: 9dc26f28c7855260121b188f3733d1e0a2a8560a
(cherry picked from commit 788424548ddecee7793a329cffd5e0454663a1ad)
2022-03-25 09:31:42 +00:00
Han Qi
b57cc9c752 Reland "[pytorch][PR] Support dataclasses in TorchScript" (#74353)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74353

Repatched `d00de0d43598522b8f6ab2de553b6aaf6768faa5` by Nora Belrose (norabelrose), with the following changes:
* Register the fake source of generated methods in linecache so that `inspect.getsource` will succeed.
* This patching is only triggered if the given dataclass was previously passed to torch.jit.script. Effectively we make this feature opt-in.

## Original Summary:
Fixes #72901.

Since we can't get access to the source code for synthesized magic methods on dataclasses, we have to synthesize our own versions. torch/jit/_dataclass_impls.py has the code that does this.

What's supported:

- Synthesized `__init__`, `__eq__`, and the comparison magic methods when `order=True` is set on the dataclass decorator
- Default values for fields
- `__post_init__`, including using `InitVar` fields inside of `__post_init__`, on Python 3.8+
- Overriding `__eq__` or any of the comparison magic methods to provide your own implementation

What's not supported:

- Default factory initializers for fields
- Frozen dataclasses
- `InitVar` on Python 3.7
- `__repr__` and `__hash__` (these are actually implemented, but the TorchScript interpreter won't call them)
- Using the `!=` operator on dataclasses inside TorchScript; this is because TorchScript requires that you implement `__ne__` to use this operator, whereas in regular Python the `!=` operator will resolve to the negation of whatever is returned by `__eq__` if there's no `__ne__`. Dataclasses don't actually synthesize an `__ne__` method for this reason. I've been toying with different ways to fix this, but `!=` is not working in this PR at the moment.
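
A minimal sketch of the opt-in flow, assuming the gating described above (the dataclass must itself be passed to torch.jit.script first):

```python
from dataclasses import dataclass
import torch

@dataclass
class Point:
    x: float
    y: float

torch.jit.script(Point)  # opt in: registers the synthesized methods

@torch.jit.script
def reflect(p: Point) -> Point:
    return Point(-p.x, -p.y)

print(reflect(Point(1.0, 2.0)))
```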

Test Plan:
unittest

Also run previously failed test:
```
buck test mode/dev-nosan //fblearner/flow/projects/fluent2/definition/transformers/contrib/faim/test:tests -- --exact 'fblearner/flow/projects/fluent2/definition/transformers/contrib/faim/test:tests - test_mixmatch_multiclass (fblearner.flow.projects.fluent2.definition.transformers.contrib.faim.test.faim_mixmatch_test.TestFaimTransformerMixMatch)'
```
passes

Reviewed By: zhxchen17

Differential Revision: D34808842

fbshipit-source-id: 02f807cff1ea99e606333960225c71a239743a4b
(cherry picked from commit ec885a2bc04f9e5f65838fa5704d9a05815ebd37)
2022-03-25 06:41:07 +00:00
Peter Bell
c7a6be4b9c qlinear: Remove legacy cpp_custom_type_hack support (#72680)
Summary:
Ref https://github.com/pytorch/pytorch/issues/72263 for cpp_custom_type_hack removal

`qlinear_prepack` and `qlinear_unpack` were updated to use torchbind
and the `cpp_custom_type_hack` overloads marked with a deprecation
warning in https://github.com/pytorch/pytorch/issues/38101 which was in the PyTorch 1.6 release. So, we are
safe to break BC here.

The deprecation warning only appears in unpack, but since you can't use one
without the other, I think that's still okay.
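
The surviving torchbind path, sketched (assumes a quantized engine such as fbgemm is available):

```python
import torch

qweight = torch.quantize_per_tensor(torch.randn(4, 8), 0.1, 0, torch.qint8)
packed = torch.ops.quantized.linear_prepack(qweight, None)  # torchbind object,
                                                            # no cpp_custom_type_hack
w, b = torch.ops.quantized.linear_unpack(packed)
```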

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72680

Reviewed By: george-qi

Differential Revision: D35056994

Pulled By: jerryzh168

fbshipit-source-id: cc046b9fa00d0219a4510854204564f4ea23da4b
(cherry picked from commit 31abbf1142d86174a1980feced57e4c621b704d1)
2022-03-25 04:34:21 +00:00
Scott Wolchok
3466c1b690 [PyTorch][deploy] Work around missing libdl (#74705)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74705

As the comment says, libdl might not be separate because it may be subsumed into libc.

Test Plan:
1) existing tests
2) this is being sent out on top of platform010 migration for caffe2

Reviewed By: d4l3k, r-barnes

Differential Revision: D35117159

fbshipit-source-id: c4a6de7c3412db695509bd25d529658cdf785e3d
(cherry picked from commit 563919d4c5fd7a9cbdc03d24b1afc5b6a2c09cc8)
2022-03-25 03:59:44 +00:00
Jerry Zhang
eaae62fed9 Make args work in the uru10x10_to_trt_eval script (#74707)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74707

att

Test Plan:
```
buck run mode/dev-nosan -c fbcode.split-dwarf=true -c fbcode.platform=platform009 accelerators/workloads/models/uru10x10:uru_10x10_to_trt_eval -- -h
```

Reviewed By: 842974287

Differential Revision: D34088069

fbshipit-source-id: 5c89d25db6493e0f66f7e57aac24ed72196d0378
(cherry picked from commit d9d79f03e28d609a14ddc3e55b97c52b0e102438)
2022-03-25 03:52:47 +00:00
Oleg Khabinov
5079321b71 Fix issue with prim::Print() and torch::deploy (#74513)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74513

Reviewed By: d4l3k, houseroad

Differential Revision: D35035089

fbshipit-source-id: d67b98600c74e2ed16b4d80f52148cd64b9e6ca0
(cherry picked from commit 16caf865077e28be31b805f015b9a61962632c8f)
2022-03-25 03:14:34 +00:00
Jerry Zhang
b347b8c191 [quant][fx] Support some default ops in the native backend config (#74600)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74600

Following https://github.com/pytorch/pytorch/pull/74210, this PR adds support for some ops
using the DefaultNodeQuantizeHandler in the backend_config_dict definition for the pytorch native backend.

TODO: There are still a few ops we didn't handle with the backend_config_dict path: gelu and softmax. We need to discuss if we still need them; if so, we can change the test
to use backend_config_dict and remove the DefaultNodeQuantizeHandler after that.

Test Plan:
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: andrewor14

Differential Revision: D35071437

fbshipit-source-id: 70351d2810ca1ac7dc09d4a9c239f6757ccb51ca
(cherry picked from commit 5e68f755a32ba7d90d6c73db9c2017f9c58d7fa5)
2022-03-25 02:59:36 +00:00
Mengwei Liu
797fa26f60 [PyTorch] Only select root ops in codegen unboxing (#74663)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74663

In lightweight dispatch, we only need to register root ops. Unlike in the dispatcher world, the transitive closure of the operators doesn't need to go through the dispatcher or the op registry.

Test Plan: Rely on unit tests

Reviewed By: priyaramani

Differential Revision: D35104401

fbshipit-source-id: 1a2df571880ac3c8625985c01bd89a2bb9566af9
(cherry picked from commit 16207fa18e87908ec5e038a7f60f41893a236749)
2022-03-25 02:52:51 +00:00
Mengwei Liu
4d82e5bf44 [PyTorch] Avoid registering ops into dispatcher in lightweight dispatch (#74664)
Summary:
This change adds the following logic:

If lightweight dispatch is enabled, do not generate `TORCH_LIBRARY` API calls for operator schemas and implementations, since these operators will be registered into the JIT op registry.

`skip_dispatcher_op_registration` is an existing argument to `gen.py`. With that set, `RegisterDispatchKey.cpp` will not generate `m.def` and `m.impl` for each native function. This logic will be removed once we find a better way to skip op registration into dispatcher.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74664

Test Plan: Rely on unit tests for lightweight dispatch.

Reviewed By: priyaramani

Differential Revision: D34634300

Pulled By: larryliu0820

fbshipit-source-id: d87828f2c6c62f15024ce9e98823b09ee5a81336
(cherry picked from commit 3eb1c27547dea6accd9fa95496189f3699d91201)
2022-03-25 02:52:51 +00:00
Edward Z. Yang
51e7a3406c Fix formatting of scalar tensors (don't call item)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74376

Approved by: https://github.com/bdhirsh
2022-03-25 02:22:25 +00:00
Peter Bell
f86bb2d6e4 Implement _pad_circular in ATen
Closes #44459

This migrates the python implementation of `_pad_circular` to ATen and
removes the old C++ implementation that had diverged from python.

Note that `pad` can't actually use this until the
forward-compatibility period is over.
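
For context, the user-facing op whose implementation is migrating (a minimal example):

```python
import torch
import torch.nn.functional as F

x = torch.arange(9.0).reshape(1, 1, 3, 3)    # N, C, H, W
y = F.pad(x, (1, 1, 1, 1), mode="circular")  # wraps each spatial edge around
print(y.shape)                               # torch.Size([1, 1, 5, 5])
```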

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73410

Approved by: https://github.com/ezyang
2022-03-25 02:09:01 +00:00
Slava Kovalevskyi
f7317d3c51 Jinja2 for docs/cpp build set to version 3.0
Fixes https://github.com/pytorch/pytorch/issues/74684

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74718
Approved by: https://github.com/malfet
2022-03-24 23:39:26 +00:00
Han Qi
75d6cbe605 [4/5]Testing jit module in flatbuffer in Python. (#74387)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74387

Make temporary python bindings for flatbuffer to test ScriptModule save / load.

(Note: this ignores all push blocking failures!)

Test Plan: unittest

Reviewed By: iseeyuan

Differential Revision: D34968080

fbshipit-source-id: d23b16abda6e4b7ecf6b1198ed6e00908a3db903
(cherry picked from commit 5cbbc390c5f54146a1c469106ab4a6286c754325)
2022-03-24 23:29:47 +00:00
Jamie McCrindle
11894db9ea Add Python Version to Torch.Package metadata (#74610)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74610

Adding the python version to the exported package and reading it on import, as per this issue on github: https://github.com/pytorch/pytorch/issues/74068
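
A minimal sketch of the round trip; the `python_version()` accessor name is an assumption based on this change:

```python
from torch.package import PackageExporter, PackageImporter

with PackageExporter("pkg.pt") as pe:  # records the interpreter version in metadata
    pe.save_text("notes", "readme.txt", "hello")

pi = PackageImporter("pkg.pt")
print(pi.python_version())  # accessor name assumed from this change
```
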
ghstack-source-id: 152003088

Test Plan: CI Tests

Reviewed By: PaliC

Differential Revision: D35062709

fbshipit-source-id: 04091a1255a09b96255112a60d31df127c424193
(cherry picked from commit ed39fd54b8b20918dac89a2873ecccf06aafd724)
2022-03-24 22:48:25 +00:00
Slava Kovalevskyi
7f996b855c Jinja2 version pinned to 3.0.* (#74690)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/74684

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74690

Reviewed By: malfet

Differential Revision: D35119993

Pulled By: b0noI

fbshipit-source-id: f53b2643000e24662644fda8718a7c4e1bfaa273
(cherry picked from commit 6dfadffff864f1d57eaea088c6dae0b673496bd7)
2022-03-24 21:58:28 +00:00
Jeeja
13ebcf3723 Add support for backend to register reducer timer
Currently, by default, reducer timer registration
is expected for all backends; if the timer is not
registered, an assert is thrown in set_runtime_stats_and_log().

To allow registration of the reducer timer for other
backends, the timer registration is moved to another
file, decoupling the internal interface.

Signed-off-by: Jeeja <jeejakp@habana.ai>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71700
Approved by: https://github.com/rohan-varma
2022-03-24 21:52:27 +00:00
wayi1
5fbe8b1966 [Model Averaging] Make HierarchicalModelAverager a subclass of averagers.ModelAverager
Making `HierarchicalModelAverager` a subclass of `averagers.ModelAverager` is a preparation step for incorporating hierarchical SGD into `PostLocalSGDOptimizer`.

Proposal: https://github.com/pytorch/pytorch/issues/73382
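
A usage sketch, assuming an initialized process group whose world size matches the group sizes below:

```python
from collections import OrderedDict
import torch.distributed.algorithms.model_averaging.hierarchical_model_averager as hsgd

# Average every 4 steps within groups of 8 ranks, and every 16 steps
# within groups of 32 ranks.
averager = hsgd.HierarchicalModelAverager(
    period_group_size_dict=OrderedDict([(4, 8), (16, 32)]),
    warmup_steps=100,
)
# As a ModelAverager subclass, it can now be handed to PostLocalSGDOptimizer
# as its `averager` argument.
```
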
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74564
Approved by: https://github.com/rohan-varma
2022-03-24 21:52:00 +00:00
Pavithran Ramachandran
fc2cf3d26f Back out "Revert D34805092: Extend _save_for_mobile and _load_for_mobile to support flatbuffer format; Default format is pickle + Change buck targets to support only pickle and pickle + flatbuffer for migration" (#74594)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74594

Extending `_save_for_mobile` and `_load_for_mobile` to support the flatbuffer format, with an additional optional argument which defaults to pickle.

Adding a new binary target with suffix `_pickle_and_flatbuffer` to help migration.

The size test in D34909502 shows the size has regressed by ~40K, but after removing pickle and comparing lite_predictors we measure ~120K of savings, which we will achieve when deprecating pickle and moving to flatbuffer.

**BEFORE:**

```lang=mermaid
graph TD;
    torch_core-->torch_mobile_deserialize;

    torch_mobile_core-->torch_mobile_deserialize;

    jit_module_saving-->torch_core;
    jit_module_saving-->torch_mobile_core;

    torch_mobile_deserialize-->caffe2_serialize;
    torch_mobile_deserialize-->torch_mobile_module;

    caffe2_serialize-->miniz;

    flatbuffer_loader-->mobile_bytecode;
    flatbuffer_serializer-->mobile_bytecode;

    mobile_bytecode-->flatbuffer_2.0;

    flatbuffer_loader-->torch_mobile_module;
    flatbuffer_serializer-->torch_mobile_module;
```

**AFTER:**
```lang=mermaid
graph TD;
    torch_core-->torch_mobile_deserialize;

    torch_mobile_core-->torch_mobile_deserialize;

    jit_module_saving-->torch_core;
    jit_module_saving-->torch_mobile_core;

    torch_mobile_deserialize-->caffe2_serialize;
    torch_mobile_deserialize-->torch_mobile_module;

    caffe2_serialize-->miniz;

    flatbuffer_loader-->mobile_bytecode;
    flatbuffer_serializer-->mobile_bytecode;

    mobile_bytecode-->flatbuffer_2.0;

    torch_mobile_deserialize_pickle_and_flatbuffer-->|new| flatbuffer_loader;
    torch_mobile_deserialize_pickle_and_flatbuffer-->|new| torch_mobile_deserialize;
    torch_mobile_core_pickle_and_flatbuffer-->|new| torch_mobile_deserialize_pickle_and_flatbuffer;
    torch_core_pickle_and_flatbuffer-->|new| torch_mobile_deserialize_pickle_and_flatbuffer;

    jit_module_saving_pickle_and_flatbuffer-->|new| torch_core_pickle_and_flatbuffer;
    jit_module_saving_pickle_and_flatbuffer-->|new| torch_mobile_core_pickle_and_flatbuffer;

    flatbuffer_serializer-->torch_mobile_module;

    jit_module_saving_pickle_and_flatbuffer-->|new|jit_module_saving;
    jit_module_saving_pickle_and_flatbuffer-->|new|flatbuffer_serializer;

    flatbuffer_loader-->torch_mobile_module;
```

Original commit changeset: 780dfb6fd6ba

Original Phabricator Diff: D34805092 (284b2b7135)
ghstack-source-id: 152044801

(Note: this ignores all push blocking failures!)

Test Plan:
CI

```
~/fbsource/fbcode] cd ~/fbsource/fbcode/ && buck test -c fbcode.caffe2_enable_flatbuffer=1 //caffe2/test/cpp/jit:jit  -- FlatbufferTest.ExtraFiles
Parsing buck files: finished in 0.9 sec
Building: finished in 5.3 sec (100%) 12992/54304 jobs, 0/54304 updated
  Total time: 6.2 sec
More details at https://www.internalfb.com/intern/buck/build/2b387fff-f813-4cfa-b53f-eb2378630d4e
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: f93a84d6-e7ce-41a0-a97f-0ef3fa6d199d
Trace available for this run at /tmp/tpx-20220323-134108.766518-f93a84d6-e7ce-41a0-a97f-0ef3fa6d199d/trace.log
RemoteExecution session id: reSessionID-f93a84d6-e7ce-41a0-a97f-0ef3fa6d199d-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/4503599723101693
    ✓ ListingSuccess: caffe2/test/cpp/jit:jit : 486 tests discovered (19.122)
    ✓ Pass: caffe2/test/cpp/jit:jit - FlatbufferTest.ExtraFiles (0.187)
Summary
  Pass: 1
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/4503599723101693
```

Similar Build Deps Dags

```
[pavithran@devvm5216.vll0 /data/users/pavithran/fbsource] buck query 'allpaths(//xplat/caffe2:torch_mobile_all_ops_pickle_and_flatbuffer, //xplat/caffe2:torch_mobile_deserialize_pickle_and_flatbuffer)' --output-format dot-compact  | pastry
P486770901: https://www.internalfb.com/intern/paste/P486770901/

[pavithran@devvm5216.vll0 /data/users/pavithran/fbsource] buck query 'allpaths(//xplat/caffe2:torch_mobile_all_ops, //xplat/caffe2:torch_mobile_deserialize)' --output-format dot-compact  | pastry
P486771278: https://www.internalfb.com/intern/paste/P486771278/
```

pickle_and_flatbuffer: https://www.internalfb.com/intern/dgw/graph/?build_id=P486770901
pickle: https://www.internalfb.com/intern/dgw/graph/?build_id=P486771278

Reviewed By: iseeyuan

Differential Revision: D35067157

fbshipit-source-id: 9044259c17a2e0da79bd6aedb28efbdfd57e23e0
(cherry picked from commit f738069ec3a72e79da56172741d027de514e9e5f)
2022-03-24 21:51:05 +00:00
Jerry Zhang
d64e7634ff [quant] Remove assert for weight since it could be non-Tensor (#74365)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74365

att

Test Plan:
Meta-Internal tests with fx2trt

Imported from OSS

Reviewed By: andrewor14

Differential Revision: D34952754

fbshipit-source-id: 11d392a520c9ab7c9484c96841f2b39fbbbc3f80
(cherry picked from commit e8a2348d5c6f9717b010972819723affba37a0e4)
2022-03-24 21:27:53 +00:00
Ansha Yu
f2ca4341c9 [pyper] to + lengths_to_offsets (#73879)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73879

Fuse the following pattern:
```
%1994 : Tensor = aten::to(%getattr_78.1, %188, %189, %189) # <eval_with_key>.50:11:0
%1995 : Tensor = fb::lengths_to_offsets(%1994, %190) # /mnt/xarfuse/uid-1994
```

This pattern is applied after all the applicable clip_ranges+gather_ranges patterns

Additional context in https://fb.quip.com/DSCbAozMBwUi

Test Plan:
> ./caffe2/caffe2/fb/predictor/scripts/run_disagg_model_benchmarks.sh 321004917 27 /data/users/ansha/tmp/ads_tail sr_only

~0.007ms overall reduction in tail model runtime
(321004917_27 oemae_long_attr_win_2d_7d_aux_model)

**Local  (25 fused nodes)**
Before: 2.04ms/iter
      0.0112739 ms.   0.543996%. fb::lengths_to_offsets (31 nodes, out variant)
     0.00805597 ms.   0.388722%. static_runtime::to_maybe_copy_out (30 nodes, out variant)

After: 1.96256ms/iter
      0.0100853 ms.   0.498655%. fb::to_lengths_to_offsets (25 nodes, out variant)
     0.00328385 ms.   0.157536%. fb::lengths_to_offsets (6 nodes, out variant)
     0.00239722 ms.   0.115002%. static_runtime::to_maybe_copy_out (5 nodes, out variant)

**Local_RO  (43 fused nodes)**
Before: 0.11427ms/iter
      0.0110696 ms.    9.42255%. fb::lengths_to_offsets (43 nodes, out variant)
     0.00638323 ms.    5.43349%. static_runtime::to_maybe_copy_out (43 nodes, out variant)
After: 0.112098ms/iter
       0.014206 ms.    12.6795%. fb::to_lengths_to_offsets (43 nodes, out variant)

**Remote_RO (17 fused nodes)**
Before: 0.24ms/iter
      0.0534883 ms.    23.0586%. static_runtime::to_maybe_copy_out (136 nodes, out variant)
     0.00216992 ms.   0.935446%. fb::lengths_to_offsets (17 nodes, out variant)
After: 0.240225ms/iter
      0.0525392 ms.    23.2864%. static_runtime::to_maybe_copy_out (119 nodes, out variant)
     0.00265347 ms.    1.17607%. fb::to_lengths_to_offsets (17 nodes, out variant)

Remote_Other (3 fused nodes)
Not much effect.

Reviewed By: mikeiovine

Differential Revision: D34696255

fbshipit-source-id: a0dc4a8ff8f25a825f6dc371ec5e4b3b09740c29
(cherry picked from commit a49b482117ebd6dbabce81a7e790f9e59cbf26c1)
2022-03-24 21:20:05 +00:00
Tristan Rice
5b915e844c c10d: retry dns lookup failures (#74641)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74641

This makes DNS hostname lookup failures retryable, since in some environments such as Kubernetes they're not guaranteed to be resolvable until the job starts. Retrying eliminates the race condition.

This also fixes `sandcastle_skip_if` when used on the class instead of the method. Previously such classes wouldn't inherit from TestCase, so they just wouldn't run under buck at all.

Fixes https://github.com/pytorch/pytorch/issues/73682

Test Plan:
Added a unit test

```
buck test //caffe2/test/distributed:test_store
```

Reviewed By: aivanou

Differential Revision: D35092284

fbshipit-source-id: d40bf187e52c41f551e4fe41c536b2b0015588ee
(cherry picked from commit f8908309d8ee64c25ee466a6b4922f34f2b7618e)
2022-03-24 19:51:09 +00:00
Facebook Community Bot
d0adb5ff26 Automated submodule update: FBGEMM (#74633)
Summary:
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).

New submodule commit: ef22aabc8b

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74633

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Reviewed By: jianyuh

Differential Revision: D35088395

Pulled By: geyyer

fbshipit-source-id: cb6808719a545c318302ed2770b3c7fa459fe169
(cherry picked from commit dfba6a9441136ff8563e80e6a09c555ad6a3af5a)
2022-03-24 19:09:57 +00:00
Taylor Robie
2ecf743757 [Profiler] Pay for what you use (v2) (#74484)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74484

In my first attempt at this in December, I stamped out specializations using variadic templates. However, I'm able to get comparable performance using simple conditionals, since the branch is very predictable and AppendOnlyList::emplace_back has low enough overhead that multiple calls don't cause an issue.

This is also a chance to do some BE: rather than forcing ops and backend events to use the same fields (which in practice means setting a bunch of default values when reporting backend events), I just split them and use a variant.

Test Plan: The single threaded benchmark (with no extra options set) improved considerably from ~0.88 us to ~0.62 us. The stress test benchmark improved modestly from ~6.1 us to ~5.8 us. So the bottleneck for multi-threading is somewhere else, but doing less wasted work is still able to move the needle a little bit.

Reviewed By: swolchok

Differential Revision: D34779994

fbshipit-source-id: 392bc7c6f12797fa5e18777063aa21210d9d2067
(cherry picked from commit f0a49ff7be8aa65bab2f6952cc2e6306c1edc24b)
2022-03-24 18:43:08 +00:00
Shiyan Deng
3f164e0395 [reland] Process inputs and outputs in fx interpreter (#74637)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74637

Forgot to update the expect file in https://github.com/pytorch/pytorch/pull/74242. Reland to include changes in expect file.

Test Plan: unit test

Reviewed By: yinghai

Differential Revision: D35089989

fbshipit-source-id: 5e3ad9c696cf31cbc691d34fdb77eff26f92e38d
(cherry picked from commit 110ac12f5e2bcca7552d4b4691c7d98fafb21a57)
2022-03-24 18:32:57 +00:00
Jiaxu Zhu
7c1f3cc89e [quant] Populate FakeQuantize quant_min/quant_max to observer (#74581)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74581

As title: currently the quant_min/quant_max of the FakeQuantize are not populated to the observer. We plan to populate them when both are not None.

To do this we need to:
1. Remove the current default quant_min/quant_max values (0/255), as they're not universal across dtypes.
2. Move the upper bound/lower bound check before creating the observer.
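
A sketch of the intended behavior after this change (explicit override forwarded to the observer):

```python
import torch
from torch.ao.quantization import FakeQuantize, MovingAverageMinMaxObserver

fq = FakeQuantize(
    observer=MovingAverageMinMaxObserver,
    quant_min=0,
    quant_max=127,      # both bounds set, so they propagate to the observer
    dtype=torch.quint8,
)
fq(torch.randn(4))      # observer now tracks the overridden [0, 127] range
```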

Test Plan:
```
[jiaxuzhu@devvm3400.frc0 /data/users/jiaxuzhu/fbsource/fbcode] buck test mode/dev //caffe2/test:quantization -- --exact 'caffe2/test:quantization - test_quant_min_max_override (quantization.core.test_workflow_module.TestFakeQuantize)'
Parsing buck files: finished in 0.8 sec
Downloaded 0/2 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 9.5 sec (100%) 18535/84579 jobs, 2/84579 updated
  Total time: 10.3 sec
More details at https://www.internalfb.com/intern/buck/build/1cab97ef-0788-4d06-92ed-a828995e3bde
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 24be645e-eebc-45d6-8111-052ef1225fa0
Trace available for this run at /tmp/tpx-20220323-094106.724238-24be645e-eebc-45d6-8111-052ef1225fa0/trace.log
RemoteExecution session id: reSessionID-24be645e-eebc-45d6-8111-052ef1225fa0-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/5066549674998735
    ✓ ListingSuccess: caffe2/test:quantization : 483 tests discovered (20.179)
    ✓ Pass: caffe2/test:quantization - test_quant_min_max_override (quantization.core.test_workflow_module.TestFakeQuantize) (18.896)
Summary
  Pass: 1
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/5066549674998735
```

Reviewed By: jerryzh168

Differential Revision: D34971236

fbshipit-source-id: 4407fd03116a296053256b333f7ce6d28dcc9c42
(cherry picked from commit f6980bccea802f220cc5b6dfe1bf3a3a3eef0a34)
2022-03-24 18:23:40 +00:00
kstant0725
ff58899b5e Pull request to run CI for #72556 (#73404)
Summary:
This PR moves the Dockerfile conda dependencies into a requirements-ci.txt (and begins the requirements file for other parts of CI as well). Packages are listed alphabetically in requirements-ci.txt. Uncommented packages before the mkl package have been tested and confirmed to work on all platforms. Packages before mkl that broke at least one platform have been commented out. There appears to be some randomness in certain platforms not passing tests, so it might be good to run a number of runs of the same configuration to confirm whether it is indeed these commented-out packages that cause the errors.

Remaining work is to test all commented-out packages to ensure they work on all platforms. This will likely involve repeat runs of the same configurations to ensure it is indeed the packages that break the platforms and not random errors.

This PR makes progress on task https://github.com/pytorch/pytorch/issues/72556

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73404

Reviewed By: janeyx99

Differential Revision: D34730797

Pulled By: kstant0725

fbshipit-source-id: 3e4b171720fa33b604cebb9c6101d38ba11f2f8b
(cherry picked from commit 99cc445aadb95f92f6ef040f2d4b7c6c6d5b7f8b)
2022-03-24 18:04:08 +00:00