Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55462
handles and symbolicate exception callstack thrown from backend.
Objective of this diff is to achieve improve error reporting when
exceptions are raised from lowered backend. We would effectively like to
get the same model level stack trace that you would get without having
lowered some module to backend.
For example:
```
class AA(nn.Module):
def forward(self, x, y):
return x + y
class A(nn.Module):
def __init__(...):
self.AA0 = AA()
def forward(self, x, y):
return self.AA0.forward(x, y) + 3
class B(nn.Module):
def forward(self, x):
return x + 2
class C(nn.Module):
def __init__(...):
self.A0 = A()
self.B0 = B()
def forward(self, x, y):
return self.A0.forward(x, y) + self.B0.forward(x)
```
If the we then do C().forward(torch.rand((2,3)), torch.rand(14,2))) we
will likely see error stack like:
```
C++ exception with description "The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
File "<string>", line 3, in forward
def forward(self, x, y):
return self.A0.forward(x, y) + self.B0.forward(x)
~~~~~~~~~~~~~~~ <--- HERE
File "<string>", line 3, in forward
def forward(self, x, y):
return self.AA0.forward(x, y) + 3
~~~~~~~~~~~~~~~~ <--- HERE
File "<string>", line 3, in forward
def forward(self, x, y):
return x + y
~~~~~ <--- HERE
```
We would like to see the same error stack if we lowered C.A0 to some
backend.
With this diff we get something like:
```
Module hierarchy:top(C).A0(backend_with_compiler_demoLoweredModule).AA0(AA)
Traceback of TorchScript (most recent call last):
File "<string>", line 3, in FunctionName_UNKNOWN
def forward(self, x, y):
return self.A0.forward(x, y) + self.B0.forward(x)
~~~~~~~~~~~~~~~ <--- HERE
File "<string>", line 5, in FunctionName_UNKNOWN
typed_inputs: List[Any] = [x, y, ]
if self.__backend.is_available() :
_0, = self.__backend.execute(self.__handles["forward"], typed_inputs)
~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
assert isinstance(_0, Tensor)
return _0
File "<string>", line 3, in FunctionName_UNKNOWN
def forward(self, x, y):
return self.AA0.forward(x, y) + 3
~~~~~~~~~~~~~~~~ <--- HERE
File "<string>", line 3, in FunctionName_UNKNOWN
def forward(self, x, y):
return x + y
~~~~~ <--- HERE
```
This is achieved in 3 parts:
Part 1:
A. BackendDebugInfoRecorder:
During backend lowering, in `to_backend`, before calling the preprocess
function corresponding to the backend. This will facilitate recording of
debug info (such as source range + inlined callstack) for the lowered module.
B. Instantiate WithBackendDebugInfoRecorder with BackendDebugInfoRecorder.
This initializes thread local pointer to BackendDebugInfoRecorder.
C. generate_debug_handles:
In preprocess function, the backend will call generate_debug_handles
for each method being lowered separately. generate_debug_handles
takes `Graph` of the method being lowered and returns a map
of Node*-to-debug_handles. Backend is responsible for storing debug
handles appropriately so as to raise exception (and later profiling)
using debug handles when the exception being raised corresponds to
particular Node that was lowered.
Inside generate_debug_handles, we will query the current
BackendDebugHandleInfoRecorder, that is issuing debug handles. This debug
handle manager will issue debug handles as well as record
debug_handles-to-<source range, inlined callstack> map.
D. Back in `to_backend`, once the preprocess function is has finished
lowering the module, we will call `stopRecord` on
BackendDebugInfoRecorder. This will return the debug info map. This
debug info is then stored inside the lowered module.
Part 2:
Serialization:
During serialization for bytecode (lite interpreter), we will do two
things:
1. Extract all the source ranges that are contained inside
debug_handles-to-<source range, inlined callstack> map for lowered
module. This will be source range corresponding to debug handles,
including what is there is inlined callstack. Since we replaced original
module with lowered module, we wont be serializing code for the original
module and thus no source range. That is why the source range will have
to be stored separately. We will lump all the source ranges for all the
lowered modules in one single debug_pkl file.
2. Then we will serialize debug_handles-to-<source range, inlined
callstack> map.
Now during deserialization we will be able to reconstruct
debug_handles-to-<source range, inlined callstack> map. Given all
debug_handles are unique we would not need any module information.
Test Plan:
Tests are added in test_backend.cpp
Tests are added in test_backend.cpp
Imported from OSS
Differential Revision:
D27621330
D27621330
Reviewed By: raziel
Pulled By: kimishpatel
fbshipit-source-id: 0650ec68cda0df0a945864658cab226a97ba1890
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58623
This splits out everything shellcheck related into its own job that generates and checks GHA workflows, then shellchecks those + jenkins scripts. This PR also integrates shellcheck into the changed-only stuff in `actions_local_runner.py` so that shellcheck won't do anything unless someone edits a shell script in their local checkout. This is the final piece to clean up the output of `make quicklint` and speeds it up by a good bit (before it was shellchecking everything which took a few seconds):
```
$ make quicklint -j $(nproc)
✓ quick-checks: Ensure no unqualified noqa
✓ quick-checks: Ensure canonical include
✓ quick-checks: Ensure no unqualified type ignore
✓ quick-checks: Ensure no direct cub include
✓ quick-checks: Ensure no tabs
✓ quick-checks: Ensure no non-breaking spaces
✓ shellcheck: Regenerate workflows
✓ quick-checks: Ensure no versionless Python shebangs
✓ quick-checks: Ensure correct trailing newlines
✓ shellcheck: Assert that regenerating the workflows didn't change them
✓ mypy (skipped typestub generation)
✓ cmakelint: Run cmakelint
✓ quick-checks: Ensure no trailing spaces
✓ flake8
✓ shellcheck: Extract scripts from GitHub Actions workflows
✓ shellcheck: Run Shellcheck
real 0.92
user 6.12
sys 2.45
```
Test Plan: Imported from OSS
Reviewed By: nikithamalgifb
Differential Revision: D28617293
Pulled By: driazati
fbshipit-source-id: af960ed441db797d07697bfb8292aff5010ca45b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55978
This is needed for broadcasting two of the same symbolic shape
Test Plan: Imported from OSS
Reviewed By: nikithamalgifb
Differential Revision: D27755328
Pulled By: eellison
fbshipit-source-id: d38d9458a9e28d31558f0bc55206516b78131032
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54809
I'm going to post on dev-discuss soon with a more thorough explanation of the design and advantages of this shape analysis, so I'm leaving out that for now.
There is still a ton left to do, I'm posting this initial version so we can get something on master multiple can work on. List of many remaining steps to do:
- [ ] Add symbolic shapes support
- [ ] Bind shape functions for operators in C++
- [ ] Make classes of operators share the same shape function (e.g. pointwise, broadcast two inputs)
- [ ] Refactor APIs
- [ ] Only iteratively optimize shape function while a change has been made
- [ ] Expand coverage of coverage to common ops
- [ ] Add shape analysis pass on Graph that handles Ifs and Loops
- [ ] Allow concurrent reads to the operator map
- [ ] Successive applications of same inputs to same shape function (e.g. series of pointwise ops)
For this review, I am mostly looking for comments related to the implementation of symolic_shape_analysis.cpp, with the caveats listed above. I am not really looking for comments related to api/registration/graph level analysis as those are all planned to be changed. I am fine landing this as is or waiting until necessary components of the TODOs above are finished.
Test Plan: Imported from OSS
Reviewed By: pbelevich
Differential Revision: D27750998
Pulled By: eellison
fbshipit-source-id: 4338b99e8651df076291c6b781c0e36a1bcbec03
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57397
Introduces two main classes in C++ runtime:
ScriptProfile is the implementation for enalbing and disabling interpreter
profiling in C++. This should be only used from Python, and we will add
corresponding Python API in the next diff.
InstructionSpan is a utility class to instrument execution of each single
instruction. A start timestamp is recorded in the consturctor, and an end
timestamp is recorded in the destructor. During destruction, this will send
runtime data to all enabled ScriptProfile instances.
Test Plan:
build/bin/test_jit --gtest_filter='ScriptProfileTest.Basic'
Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D28133579
fbshipit-source-id: e7e30e96151367022793ab3ad323f01c51ad4a3b
Summary:
This adds to internal build target and makes it ready for selective build
workflow.
Test Plan: CI builds
Reviewed By: z-a-f
Differential Revision: D28103697
fbshipit-source-id: 19c8b27aae4de1cece8d88d13ea51ca4ac7d79b6
Summary:
Testing for both that a lint job ran and that it was successful depends
on having lint pass for the PR, which can create confusion if it doesn't
(i.e. a flake8 failure also causes this job to fail, and it's not
immediately clear why). With this PR we just check for the presence of
job names to see that something ran.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58490
Reviewed By: samestep
Differential Revision: D28511229
Pulled By: driazati
fbshipit-source-id: 3036deff9f9d0ef2e78b44a9a43b342acdcfa296
Summary:
Build lite interpreter as default for android, should wait until https://github.com/pytorch/pytorch/pull/56002 lands
Mainly two changes:
1. Use lite interpreter as default for Android
2. Switch the lite interpreter build test to full jit build test
Test Plan: Imported from OSS
Differential Revision: D27695530
Reviewed By: IvanKobzarev
Pulled By: cccclai
fbshipit-source-id: e1b2c70fee6590accc22c7404b9dd52c7d7c36e2
Summary:
Fixes https://github.com/pytorch/pytorch/issues/57679
##### Release Notes
This is part of the end of the deprecation of inplace/view:
- `detach_` will now raise an error when invoked on any view created by `split`, `split_with_sizes`, or `chunk`. You should use the non-inplace `detach` instead.
- The error message for when an in-place operation (that is not detach) is performed on a view created by `split`, `split_with_size`, and `chunk` has been changed from "This view is **an** output of a function..." to "This view is **the** output of a function...".
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58285
Reviewed By: bdhirsh
Differential Revision: D28441980
Pulled By: soulitzer
fbshipit-source-id: e2301d7b8cbc3dcdd328c46f24bcb9eb7f3c0d87
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57510
This is a re-write of https://github.com/pytorch/pytorch/pull/56835, which is significantly shorter thanks to the data model change in the PR below this one in the stack. See the original description in the linked PR for details.
The functional changes in this PR are the same as in the above linked one, so the description is the same with a few small changes:
- I don't bother generating `at::xla::{op}` entries for CPU fallbacks. After looking around, I see precedent for that. For example, we don't have `at::cpu::{op}` entries for composite ops- if you really want to bypass the dispatcher you need to call `at::compositeimplicitautograd::{op}`. Maybe we should revisit that later if we find an important use case for having full namespace coverage, but that doesn't seem worth half-fixing for external backends in this PR.
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D28474364
Pulled By: bdhirsh
fbshipit-source-id: 4d58b60e5debad6f1ff06420597d8df8505b2876
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57361
Data model change in the codegen, which splits backend-specific information out of `NativeFunction`
### Overview
Currently in the codegen, native_functions.yaml has backend-specific information about each operator that is encoded directly into the data model, in the `NativeFunction` object. That's reasonable, since the native_functions.yaml is the source of truth for information about an operator, and the data model encodes that information into types.
Now that external backends can use the codegen though, that information is technically incomplete/inaccurate. In another PR, I tried patching the information on the `NativeFunction` object with the additional external information, by updating the `dispatch` entry to contain the external backend kernel name and dispatch key.
Instead, this PR tries to split out that information. The `NativeFunction` class contains all information about an operator from native_functions.yaml that's backend-independent and is known never to change regardless of what extra information backends provide. We also build up a backend "index", which is basically a mapping from [backend] -> [backend-specific-metadata]. Reading in an external backend yaml just involves updating that index with the new backend.
There were a few places where `NativeFunction` used the dispatch table directly, that I encoded as properties directly on the NativeFunction object (e.g. `is_abstract`). They were mostly around whether or not the operator has a composite kernel, which isn't something that's going to change for any external backends.
This has a few advantages:
- We can more easily re-use the existing logic in `native_function.py` and `register_dispatch_key.py` for both native and external backends, since they both involve a NativeFunction + a particular backend index
- The data in the data model will be the same regardless of how the codegen is run. Running the codegen with a new external backend doesn't change the data inside of NativeFunction or an existing backend index. It just adds a new index for that backend.
- There are several of codegen areas that don't care about backend-specific information: mostly the tracing and autograd codegen. We can reason about the codegen there more easily, knowing that backend-specific info is entirely uninvolved.
An alternative to this split would be to augment the NativeFunction objects with external backend information at the time that we create them. So the external codegen could read both native_functions.yaml and the external backend's yaml at the same time, and construct a NativeObject with a full dispatch table (including the XLA entry), and the correct setting of structured (taking into account both yamls). One disadvantage to this approach is that NativeFunction objects now contain different stuff depending on how you ran the codegen, and you have to make sure that any changes to the codegen can properly handle all the different variants.
### Data Model Changes
Removed 3 classes, which are used by the external codegen:
- ExternalBackendFunction
- ExternalBackendFunctionsGroup
- ExternalBackendMetadata
And added two new ones:
- BackendIndex
- BackendMetadata
`BackendIndex` contains any info that's specific to that backend, plus a mapping from operator names to backend specific metadata about the operator. One example of backend-specific info that's not operator-dependent is the fact that XLA prefers to implement functional kernels instead of out kernels (and so when they eventually mark an op as structured, they're going to mark the functional op and not the out op).
`BackendMetadata` contains info specific to an (operator, backend) pair. Right now, that's just (a) the name of the kernel, and (b) whether or not that operator is structured.
### Questions
I wanted to get this PR up earlier so I could get feedback, but there are a few things I want to call out:
**Dealing with `structured`.**
This PR separates out the notion of `structured` into two bits of information:
- Does [operator] have a meta() function. This is backend-agnostic, and is represented by the `structured` property on `NativeFunction`, same as before. This is used, e.g., to decide what signatures to add to `MetaFunctions.h`.
- Does [operator, backend] have an impl() function. This is backend dependent; even though technically all in-tree backends are forced to write impl() functions for an operator when we port the op to structured in native_functions.yaml, out-of-tree backends can decide to opt in independently. This is represented as a property on `BackendMetadata`. This is used in most other cases, e.g. in `RegisterDispatchKey` when we're deciding whether or not to gen a structured or unstructured wrapper.
I also baked `is_structured_dispatch_key` directly into each BackendIndex. So for operators marked "structured" in native_functions.yaml, their corresponding CPU/CUDA BackendIndex entries will be marked structured, and all others (except for potentially external backends) will not.
I ended up trying to deal with `structured` in this change since it's technically backend dependent (XLA can opt kernels into structured separately from in-tree ops), but that may have been too ambitious: it's technically not relevant until we actually add support for structured external kernels. If it's not clear that this is the right path for dealing with structured and we want to push that off, I'm fine with backing out the bits of this PR that make `structured` backend-dependent. I don't see anything *too* controversial related to structured in the change, but I tried to call out any areas in the comments
**Localizing the fact that external backends follow Dispatcher convention.**
Another thing that's sort of backend specific that I didn't totally address in this PR is the fact the fact that in-tree backends follow the Native API while external backends follow the Dispatcher API. I painted over that in `native_functions.py` by adding a helper, `kernel_signature`, that takes in a native function and gives you the "correct" signature for the specified backend- NativeSignature for in-tree backends, and DispatcherSignature for out-of-tree backends. In order to make that fully useable though, we'll need `NativeSignature` and `DispatcherSignature` to have matching interfaces. I didn't bother with that in this PR, which is why `gen_external_aten_fallbacks.py` still has a bunch of direct references to the dispatcher API. Thinking of adding it in a later PR but wanted to see if anyone has other opinions.
Maybe `is_external()` shouldn't even be a property on the BackendMetadata, and anything the codegen does that requires asking for that information should just be better abstracted away.
**Thoughts on the `BackendIndex` / `BackendMetadata` breakdown.**
One thing that's annoying right now is that to query for various pieces of metadata, you call helper functions like `backend_index.structured(f)`, which queries that particular backend and tells you if that specific NativeFunctionGroup is structured for that backend. It has to return an `Optional[bool]` though, since you have to handle the case where that operator doesn't have a kernel for that backend at all. So users of those helpers end up with a bunch of optionals that they need to unpack, even if they know at some point that the result isn't None. I think it would be easier instead to just store the NativeFunction object as a field directly on the BackendMetadata. Curious if there are any other opinions on a better way to model it though.
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D28474362
Pulled By: bdhirsh
fbshipit-source-id: 41a00821acf172467d764cb41e771e096542f661
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56597
3 small changes, all centered around error messaging.
1) Improved error messages when `gen_backend_stubs.py` receives invalid yaml
2) Added error message tests. I wasn't sure if there was a canonical way to do this, so I just wrote a test that takes in a list of (yaml input, expected error message) pairs and runs the codegen pipeline on each of them.
3) I also removed the LineLoader from the yaml parsing bit that reads in the external backend yaml file. Two reasons that I took it out:
- The main reason we use it with native_functions.yaml is to easily pinpoint problems with new ops as they're added, that the codegen can pick up. 99% of these problems have to do with schema, which is irrelevant to the external yaml since it pulls the schema from native_functions
- Not all operators have to appear in the external yaml. We could do something like "line: -1", but that's kind of weird.
If you think the line numbers would actually be of more use than I'm thinking of in the external yaml though, let me know!
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D28474363
Pulled By: bdhirsh
fbshipit-source-id: 8b5ec804b388dbbc0350a20c053da657fad0474f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58205
It's worthing moving train related files into their own folder since we are adding more code under the mobile directory.
This diff does that.
Test Plan: run unit tests and ci
Reviewed By: iseeyuan
Differential Revision: D28402432
fbshipit-source-id: cd76a1c4f8ff06508cdc3aad8a169fbf34bb4995
Summary:
Some machines don't have a versionless `python` on their PATH, which breaks these existing shebangs.
I'm assuming that all the existing versionless `python` shebangs are meant to be `python3` and not `python2`; please let me know if my assumption was incorrect for any of these.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58275
Test Plan: CI.
Reviewed By: zhouzhuojie
Differential Revision: D28428143
Pulled By: samestep
fbshipit-source-id: 6562be3d12924db72a92a0207b060ef740f61ebf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55172
Description:
This is part 1 of series of PRs for supporting torch.jit.ignore as context manager. Following features are implemented in this PR:
- Unique name for the registered function under torch.jit.frontend module. The unique name is generated based on the file name and line number of context manager
- Forcing user to explicitly annotate the input and outputs.
- No side effects are considered.
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D27895283
Pulled By: tugsbayasgalan
fbshipit-source-id: 5d36d9aa5d457055a6bb1676f264647a745ec36a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58201
Add light version of RandomSampler which can be used torch mobile.
Test Plan: run unit test
Reviewed By: iseeyuan
Differential Revision: D28364467
fbshipit-source-id: 3148129fa56533f5f4b76b63b60e8778eeaf815f
Summary:
This adds the methods `Tensor.cfloat()` and `Tensor.cdouble()`.
I was not able to find the tests for `.float()` functions. I'd be happy to add similar tests for these functions once someone points me to them.
Fixes https://github.com/pytorch/pytorch/issues/56014
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58137
Reviewed By: ejguan
Differential Revision: D28412288
Pulled By: anjali411
fbshipit-source-id: ff3653cb3516bcb3d26a97b9ec3d314f1f42f83d
Summary:
Currently only supports native ops that have all tensor arguments, an out variant, and no kwargs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58118
Reviewed By: ejguan
Differential Revision: D28421323
Pulled By: Chillee
fbshipit-source-id: 1c75c900415deca63fcc0e496e3bac126f21bf49
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48918
enable test case on AvgPool2d channels last for CPU
Test Plan: Imported from OSS
Reviewed By: glaringlee
Differential Revision: D25399466
Pulled By: VitalyFedyunin
fbshipit-source-id: 9477b0c281c0de5ed981a97e2dcbe6072d7f0aef
Summary:
This PR:
- Renames symeig_backward to eigh_backward
- Improves the stability and speed of the gradient computation by doing `V(A + B)Vh` instead of `VAVh + VBVh` when both the gradients of the eigenvectors and eigenvalues are defined.
- Updates the comments of the function to make them arguably clearer
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55049
Reviewed By: ngimel
Differential Revision: D28396823
Pulled By: mruberry
fbshipit-source-id: a144482bfb1054e281b58ae1fe3cf1015bab505d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58235
this is to make the opinfo change python only
Test Plan: Imported from OSS
Reviewed By: ejguan
Differential Revision: D28412937
Pulled By: albanD
fbshipit-source-id: 1d6eb1e4baaa837c300ee8aa00b57986ba3e3eb2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58039
The new function has the following signature
`inv_ex(Tensor inpit, *, bool check_errors=False) -> (Tensor inverse, Tensor info)`.
When `check_errors=True`, an error is thrown if the matrix is not invertible; `check_errors=False` - responsibility for checking the result is on the user.
`linalg_inv` is implemented using calls to `linalg_inv_ex` now.
Resolves https://github.com/pytorch/pytorch/issues/25095
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D28405148
Pulled By: mruberry
fbshipit-source-id: b8563a6c59048cb81e206932eb2f6cf489fd8531
Summary:
Fixes https://github.com/pytorch/pytorch/issues/56608
- Adds binding to the `c10::InferenceMode` RAII class in `torch._C._autograd.InferenceMode` through pybind. Also binds the `torch.is_inference_mode` function.
- Adds context manager `torch.inference_mode` to manage an instance of `c10::InferenceMode` (global). Implemented in `torch.autograd.grad_mode.py` to reuse the `_DecoratorContextManager` class.
- Adds some tests based on those linked in the issue + several more for just the context manager
Issues/todos (not necessarily for this PR):
- Improve short inference mode description
- Small example
- Improved testing since there is no direct way of checking TLS/dispatch keys
-
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58045
Reviewed By: agolynski
Differential Revision: D28390595
Pulled By: soulitzer
fbshipit-source-id: ae98fa036c6a2cf7f56e0fd4c352ff804904752c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49352
In this PR, we replace all definitions of slice to take None parameters for the start, end, and step. This will simplify the compiler logic
Test Plan:
test_jit test cases
Imported from OSS
Reviewed By: jamesr66a, nikithamalgifb
Differential Revision: D25929903
fbshipit-source-id: 5bfc6bad514a8aafbef2dacc706f95f867fe85f1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57701
The new OpInfo flag has the following semantic:
- If it says that it supports forward AD, we run gradcheck with forward AD to ensure it is correct
- If it says that it does not support it, we check that the corresponding error is raised
All the added tests take 3s to run for CPU builds and 1min for GPU builds which should be pretty negligible compared to the test_ops runtime for each of these arch.
Test Plan: Imported from OSS
Reviewed By: agolynski
Differential Revision: D28387767
Pulled By: albanD
fbshipit-source-id: 369d76921c8460aa4548f9b5159b7297994672f5