Fixes#115331.
This PR increases the number of valid GPU devices to 512 (from 64) in order to future-proof PyTorch for providers that offer [single nodes with a large device count](https://www.tensorwave.com/). Until now, `DeviceIndex` was an `int8_t`, thus multiple changes were necessary:
- `DeviceIndex` changed to `int16_t`. Updated consumers that assume it to be an `int8_t`.
- Updated bounds checking for `torch.device()` in the Python frontend. Right now, we allow funny things like `torch.device('cpu', 200).index == -56`, which is undefined behavior. I inserted some checks to only allow values between 0 and `c10::Device::MAX_NUM_DEVICES - 1`.
- Updated the `ArgumentInfo` struct as it hardcodes the device index as 8 bit field [^1]. Might be a breaking change, not sure if users rely on this.
- Introduced `c10::Device::MAX_NUM_DEVICES` as a replacement for the old `C10_COMPILE_TIME_MAX_GPUS`
[^1]: This field was unsigned, so I guess this has also been undef behavior the whole time? Our default device index is -1, so this always wrapped around to 255 when written to the `ArgumentInfo` struct. When I switched the `DeviceIndex` to `int16_t`, it actually stayed 255 after unpacking from `ArgumentInfo` again, as the `DeviceIndex` was now wide enough that it didn't wrap back to -1.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119639
Approved by: https://github.com/cyyever, https://github.com/albanD
This PR relands #108238 that was closed as stale due to CLA issues and also because the CI check has marked the PR as not mergeable.
Repro 1:
```python
import torch
def f(x):
return x[x > 0]
jf = torch.jit.trace(f, torch.tensor(2., device="cuda"))
```
Error:
```bash
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/pytorch/torch/jit/_trace.py", line 874, in trace
traced = torch._C._create_function_from_trace(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<stdin>", line 2, in f
RuntimeError: NYI: Named tensors are not supported with the tracer
```
Repro2:
```python
import torch
import torch.nn.functional as F
from torch import nn
import copy
class Net(nn.Module):
def __init__(self):
super().__init__()
def forward(self, inputs):
x = copy.deepcopy(inputs) # RuntimeError: NYI: Named tensors are not supported with the tracer
x = F.relu(x)
return x
model = Net()
images = torch.randn(8, 28, 28)
torch.jit.trace(model, images)
```
Error 2:
```bash
Traceback (most recent call last):
File "/opt/pytorch/test_deepcopy.py", line 18, in <module>
File "/opt/pytorch/torch/jit/_trace.py", line 806, in trace
return trace_module(
^^^^^^^^^^^^^
File "/opt/pytorch/torch/jit/_trace.py", line 1074, in trace_module
module._c._create_method_from_trace(
File "/opt/pytorch/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/pytorch/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/pytorch/torch/nn/modules/module.py", line 1501, in _slow_forward
result = self.forward(*input, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/pytorch/test_deepcopy.py", line 12, in forward
x = F.relu(x)
^^^^^^^^^^
File "/opt/conda/envs/ptca/lib/python3.11/copy.py", line 153, in deepcopy
y = copier(memo)
^^^^^^^^^^^^
File "/opt/pytorch/torch/_tensor.py", line 122, in __deepcopy__
new_storage = self._typed_storage()._deepcopy(memo)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/pytorch/torch/storage.py", line 847, in _deepcopy
return self._new_wrapped_storage(copy.deepcopy(self._untyped_storage, memo))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/ptca/lib/python3.11/copy.py", line 153, in deepcopy
y = copier(memo)
^^^^^^^^^^^^
File "/opt/pytorch/torch/storage.py", line 112, in __deepcopy__
new_storage = self.clone()
^^^^^^^^^^^^
File "/opt/pytorch/torch/storage.py", line 126, in clone
return type(self)(self.nbytes(), device=self.device).copy_(self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: NYI: Named tensors are not supported with the tracer
```
----
#48054 RuntimeError: NYI: Named tensors are not supported with the tracer
#49538 jit tracer doesn't work with unflatten layer
#31591 when i try to export a pytorch model to ONNX, got RuntimeError: output of traced region did not have observable data dependence with trace inputs; this probably indicates your program cannot be understood by the tracer.
- This bug was closed but exists. Multiple comments on it still showing error. This is addressed
Likely fixes the following issues (but untested)
#63297 Named tensor in tracer
#2323 [Bug] torch.onnx.errors.UnsupportedOperatorError when convert mask2former to onnx
Fix zero dimensioned tensors when used with jit.trace They are currently assigned an empty set for names {} this is not the same as "no name" so jit.trace bails with
"NYI: Named tensors are not supported with the tracer"
This happens when I am trying to save a non-trivial model as onnx but the simplest repro I have seen is 48054 above which has been added as test/jit/test_zero_dim_tensor_trace.py
Test plan:
New unit test added
Broken scenarios tested locally
CI
Fixes#48054
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118393
Approved by: https://github.com/zou3519
Updates flake8 to v6.1.0 and fixes a few lints using sed and some ruff tooling.
- Replace `assert(0)` with `raise AssertionError()`
- Remove extraneous parenthesis i.e.
- `assert(a == b)` -> `assert a == b`
- `if(x > y or y < z):`->`if x > y or y < z:`
- And `return('...')` -> `return '...'`
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116591
Approved by: https://github.com/albanD, https://github.com/malfet
* Enable PERF402. Makes code more efficient and succinct by removing useless list copies that could be accomplished either via a list constructor or extend call. All test cases have noqa added since performance is not as sensitive in that folder.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115505
Approved by: https://github.com/malfet
Fixes#68972
Relands #107246
To avoid causing Meta-internal CI failures, this PR avoids always asserting that the default dtype is float in the `TestCase.setUp/tearDown` methods. Instead, the assert is only done if `TestCase._default_dtype_check_enabled == True`. `_default_dtype_check_enabled` is set to True in the `if __name__ == "__main__":` blocks of all the relevant test files that have required changes for this issue
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108088
Approved by: https://github.com/ezyang
This updates ruff to 0.285 which is faster, better, and have fixes a bunch of false negatives with regards to fstrings.
I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
This updates ruff to 0.285 which is faster, better, and have fixes a bunch of false negatives with regards to fstrings.
I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
Summary: check trace runs with no_grad() and grad or not impacts transformer trace construction. use no_grad() consistently
Test Plan:
sandcastle and github ci
```
buck2 run mode/opt mode/inplace //caffe2/test:test_jit_cuda -- --regex test_scriptmodule_transformer_cuda
```
Differential Revision: D48020889
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106523
Approved by: https://github.com/davidberard98
Add similar semantics for creating a buffer object similar to creating a parameter. This is done by introducing a new `Buffer` class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same as the `register_buffer` method has not been changed. The `persistent` parameter in the `Buffer` type is to indicate whether a buffer object should be persistent or not. Other non-test changes have to do with getting the new `Buffer` type recognized by inductor and dynamo. Remaining changes are test changes to make sure that the `Buffer` type can be used as a drop in replacement for `register_buffer` as it just leads to `register_buffer` being called. The addition of this new functionality still allows for normal tensors to be used as buffers so these changes are intended to be backwards compatible.
Fixes#35735
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104069
Approved by: https://github.com/mikaylagawarecki
When handling custom classes from Python, it is nice to be able to specify how they are displayed to the user.
Out of the two standard functions to do this, only `__str__` could be implemented in C++. This PR add `__repr__` to the allowlist of magic methods.
The second commit tweaks the default output of `__str__` to make it more informative, but I can remove the change if you want.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100724
Approved by: https://github.com/ezyang
stringstream construction is expensive, and we can exactly reserve space for the output string while doing the same number of string copies. (If we wanted to improve performance further, we could introduce annotation_str_out to append the output to a given std::string and thus avoid copying subtype annotation strings. It occurs to me that the existing approach is quadratic in the number of layers of nesting, so we should probably do this!)
Differential Revision: [D43919651](https://our.internmc.facebook.com/intern/diff/D43919651/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96498
Approved by: https://github.com/Skylion007
stringstream is expensive to create, we used stringstream instead of ostringstream, and we can easily specialize the empty tuple. Also, anybody compiling with C++20 support can move out of the stringstream and it shouldn't hurt people without C++20 support to do so. I would consider specializing the 1-element case as well but I don't have evidence that that's necessary right now.
Differential Revision: [D43882402](https://our.internmc.facebook.com/intern/diff/D43882402/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96497
Approved by: https://github.com/Skylion007
Fixes https://github.com/pytorch/pytorch/issues/91483
Using a separate test class here, so that there is no need to run setup and teardown for all tests in TestJit. The root cause here is that test_profiler could be flaky and fail in the middle without the chance to restore `torch._C._set_graph_executor_optimize` to its original value (https://github.com/pytorch/pytorch/issues/81626). This causes issues for all future tests running after as shown in https://github.com/pytorch/pytorch/issues/91483.
I suspect that is also the same root cause for several other flaky tests in the same file https://github.com/search?q=repo%3Apytorch%2Fpytorch+DISABLED+test_jit.TestScript&type=issues. After this fix is merged, I would let retry bot does it job and close these issues after 2 weeks.
### Testing
The issue https://github.com/pytorch/pytorch/issues/91483 can now be reproduced by adding `torch._C._set_graph_executor_optimize(False)` locally to see if the test fails:
```
diff --git a/test/test_jit.py b/test/test_jit.py
index 2d1161d7466..17745d39182 100644
--- a/test/test_jit.py
+++ b/test/test_jit.py
@@ -5413,6 +5413,8 @@ a")
FileCheck().check("int =").check("ListConstruct").check("aten::cat").run(str(g))
def test_stack(self):
+ torch._C._set_graph_executor_optimize(False)
+
with enable_profiling_mode_for_profiling_tests():
@torch.jit.script
def func(x):
```
It indeed fails:
```
======================================================================
FAIL [0.006s]: test_stack (test_jit.TestScript)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/var/lib/jenkins/workspace/test/test_jit.py", line 5437, in test_stack
self.assertAutodiffNode(func2.graph_for(x, y), True, ['aten::stack'], [])
File "/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/testing/_internal/common_jit.py", line 282, in assertAutodiffNode
self.assertEqual(should_autodiff_node,
##[endgroup]
File "/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 2975, in assertEqual
raise error_metas[0].to_error(
AssertionError: Booleans mismatch: True is not False
Failure in testing nodes' autodifferentiation. One or more nodes were expected to be autodiffed, but were not found in specified fusible/nonfusible DifferentiableGraph groups.
Specifically:
['aten::stack'] were not in one of the DifferentiableGraphs when they were expected to be. Did you intend for these nodes to be autodiffed? If not, remove them from the list of nonfusible nodes.
----------------------------------------------------------------------
Ran 2677 tests in 84.596s
FAILED (failures=1, skipped=136, expected failures=13)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96135
Approved by: https://github.com/clee2000
This PR optimizes the guards overhead introduced by dynamo tracing module forward hooks.
It can and maybe should be followed by a wider change proposed by @voznesenskym to optimize specialized nnmodules by 'observing' any user mutations and directly invalidating the root guard, obviating the need to install other nnmodule guards. (But this observer change seems more involved...)
Idea: maintain a flag, and keep it up to date whenever adding or
removing hooks. Use the flag rather than dict checks to enter the call fast path.
- need to extend RemovableHandle to keep a ref to nnModule so it can update the flag on removal.
- also need to handle the flag in ScriptModule which still uses the python call impl when called from python.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95931
Approved by: https://github.com/ezyang, https://github.com/voznesenskym
We want to make TorchRec sharded models TorchScriptable.
TorchRec sharded models uses generic types Awaitable[W] and LazyAwaitable[W] (https://github.com/pytorch/torchrec/blob/main/torchrec/distributed/types.py#L212).
In sharded model those types are used instead of contained type W, having the initialization function that produces object of type W.
At the moment when the first attribute of W is requested - `LazyAwaitable[W]` will call its initialization function (on the same stack), cache the result inside and work transparently as an object of W. So we can think about it as a delayed object initialization.
To support this behavior in TorchScript - we propose a new type to TorchScript - `Await`.
In eager mode it works the same as `LazyAwaitable[W]` in TorchRec, being dynamically typed - acting as a type `W` while it is `Await[W]`.
Within torchscript it is `Await[W]` and can be only explicitly converted to W, using special function `torch.jit.awaitable_wait(aw)`.
Creation of this `Await[W]` is done via another special function `torch.jit.awaitable(func, *args)`.
The semantic is close to `torch.jit.Future`, fork, wait and uses the same jit mechanics (inline fork Closures) with the difference that it does not start this function in parallel on fork. It only stores as a lambda inside IValue that will be called on the same thread when `torch.jit.awaitable_wait` is called.
For example (more examples in this PR `test/jit/test_await.py`)
```
def delayed(z: Tensor) -> Tensor:
return Tensor * 3
@torch.jit.script
def fn(x: Tensor):
aw: Await[int] = torch.jit._awaitable(delayed, 99)
a = torch.eye(2)
b = torch.jit._awaitable_wait(aw)
return a + b + x
```
Functions semantics:
`_awaitable(func -> Callable[Tuple[...], W], *args, **kwargs) -> Await[W]`
Creates Await object, owns args and kwargs. Once _awaitable_wait calls, executes function func and owns the result of the function. Following _awaitable_wait calls will return this result from the first function call.
`_awaitable_wait(Await[W]) -> W`
Returns either cached result of W if it is not the first _awaitable_wait call to this Await object or calls specified function if the first.
`_awaitable_nowait(W) -> Await[W]`
Creates trivial Await[W] wrapper on specified object To be type complaint for the corner cases.
Differential Revision: [D42502706](https://our.internmc.facebook.com/intern/diff/D42502706)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90863
Approved by: https://github.com/davidberard98
Preparation for the next PR in this stack: #89559.
I replaced
- `self.assertTrue(torch.equal(...))` with `self.assertEqual(..., rtol=0, atol=0, exact_device=True)`,
- the same for `self.assertFalse(...)` with `self.assertNotEqual(...)`, and
- `assert torch.equal(...)` with `torch.testing.assert_close(..., rtol=0, atol=0)` (note that we don't need to set `check_device=True` here since that is the default).
There were a few instances where the result of `torch.equal` is used directly. In that cases I've replaced with `(... == ...).all().item()` while sometimes also dropping the `.item()` depending on the context.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89527
Approved by: https://github.com/mruberry
Introduce `_eval_no_call` method, that evaluates statement only if it
does not contain any calls(done by examining the bytecode), thus preventing command injection exploit
Added simple unit test to check for that
`torch.jit.annotations.get_signature` would not result in calling random
code.
Although, this code path exists for Python-2 compatibility, and perhaps
should be simply removed.
Fixes https://github.com/pytorch/pytorch/issues/88868
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89189
Approved by: https://github.com/suo
# Support unpacking python dictionary in **torch.jit.trace()**
## Problem statement & Motivation
### Problem 1(usability):
Say, if you have a model and its forward method defined as follows:
**`def forward(self, key1=value1, key2=value2, key3=value3)`**
And you have a dataset and each data point in the dataset is a python dict as follows:
**`data = {key1:value1, key3:value3, key2:value2}`**
The problem is that if you want to trace the model using the dict data by the giving dataset, you need unpack the dictionary and reorder its value manually and make up a tuple as **`data_tuple = (value1, value2, value3)`** as the **`example_inputs`** parameter of **`torch.jit.trace()`**. This marshalling process is not user friendly.
### Problem 2 (feasibility):
Say, if you have a model and its forward method defined as follows:
**`def forward(self, key1=None, key2=None, key3=None)`** -> The default value is **None**
And you have a dataset and each data point in the dataset is a python dict as follows:
**`data = {key1:value1, key3:value3}`** -> Only **part of** the required value by forward was given, the rest use the default value.
The problem is that if you want to trace the model using the dict data by the giving dataset, it's not feasible at all. Cause neither you can pass a tuple like **`T1 = (value1, value3)`** nor **`T2 = (value1, None, value3)`**. T1 will mismatch value3 with key2 and T2 include **None** type which will be blocked by tracer's type checking. (Of course you can pass **`T3 = (value1,)`** to make the trace function finish without exception, but the traced model you get probably is not what you expect cause the different input may result in different traced result.).
These problems come from the HuggingFace's PT model, especially in text-classification tasks with datasets such as [MRPC,](https://paperswithcode.com/dataset/mrpc) [MNLI](https://paperswithcode.com/dataset/multinli) etc.
## Solution
To address these two issues, we propose to support a new type, that is, python dict as example_inputs parameter for torch.jit.trace(). We can base on the runtime type information of the example_inputs object to determine if we fall back to the original tuple path or go into the new dictionary path. Both problem 1 and problem 2 can be solved by utilizing the "**`**`**"
operator.
## Limitation & Mitigation
1. If we use dict as example_inputs to trace the model, then we have to pass a dictionary to the traced model too. (Cause probably we will change the order of debug name of the input parameter in torchscript IR, thus we can't assume the traced model's input parameters order are the same with the original model.). We need highlight this too in the document to mitigate this problem.
For example:
```
# fetch a data from dataloader, and the data is a dictionary
# and the example_inputs_dict is like: {key1:value1, key3:value3, key2:value2}
# the forward() is like: def forward(self, key1=value1, key2=value2, key3=value3)
example_inputs_dict = next(iter(dataloader))
jit_model = model.eval()
# use the dictionary to trace the model
jit_model = torch.jit.trace(jit_model, example_inputs_dict, strict=False) # Now the IR will be graph(%self : __torch__.module.___torch_mangle_n.Mymodule, %key1 : type1, %key3 : type3, %key2 : type2)
jit_model = torch.jit.freeze(jit_model)
# It's OK to use dict as the parameter for traced model
jit_model(**example_inputs_dict)
example_inputs_tuple = (value1, value3, value2)
# It's wrong to rely on the original args order.
jit_model(*example_inputs_tuple)
```
## Note
1. This PR will make some UT introduced in [39601](https://github.com/pytorch/pytorch/pull/39601) fail, which I think should be classified as unpacking a tuple containing a single dictionary element in our solution.
4. I think there is ambiguity since currently we only specify passing a tuple or a single Tensor as our example_inputs parameter in **torch.jit.trace()**'s documentation, but it seems we can still passing a dictionary.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81623
Approved by: https://github.com/davidberard98
Pytorch's built-in fuser seems have higher priority than my fuser registered via
```cpp
torch::jit::RegisterPass pass([accelerator_symbol](std::shared_ptr<torch::jit::Graph>& g) {
OrtFuseGraph(g, Accelerator::Supported, accelerator_symbol);
});
```
With this PR, I can reuse `aot_autograd` backend in TorchDynamo with my own JIT fuser. My custom context is
```python
class AOTAutogradOrtFusionWithContext:
"""Pass nvfuser context to TorchDynamo"""
def __init__(self):
self.backend_ctx_ctor = lambda: torch.jit.fuser("none")
def __call__(self, gm: torch.fx.GraphModule, example_inputs):
return AOTAutogradMemoryEfficientFusion.compile_fn(gm, example_inputs)
aot_autograd_ort_strategy = AOTAutogradOrtFusionWithContext()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81731
Approved by: https://github.com/davidberard98
**BC-breaking note**:
This PR deprecates `torch.lu` in favor of `torch.linalg.lu_factor`.
A upgrade guide is added to the documentation for `torch.lu`.
Note this PR DOES NOT remove `torch.lu`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77636
Approved by: https://github.com/malfet
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74353
Repatched `d00de0d43598522b8f6ab2de553b6aaf6768faa5` by Nora Belrose (norabelrose). With following changes:
* Register fake source of generated methods in linecache so that inspect.get_source will succeed.
* this patching is only triggered if the given dataclass passed to torch.jit.script previously. Effectively we make this feature opt-in.
## Original Summary:
Fixes https://github.com/pytorch/pytorch/issues/72901.
Since we can't get access to the source code for synthesized magic methods on dataclasses, we have to synthesize our own versions. torch/jit/_dataclass_impls.py has the code that does this.
What's supported
Synthesized __init__, __eq__, and the comparison magic methods when order=True is set on the dataclass decorator
Default values for fields
__post_init__, including using InitVar fields inside of __post_init__, on Python 3.8+
Overriding __eq__ or any of the comparison magic methods to provide your own implementation
What's not supported
Default factory initializers for fields
Frozen dataclasses
InitVar on Python 3.7
__repr__ and __hash__ (these are actually implemented, but the TorchScript interpreter won't call them)
Using the != operator on dataclasses inside TorchScript; this is because TorchScript requires that you implement __ne__ to use this operator, whereas in regular Python the != operator will resolve to the negation of whatever is returned by __eq__ if there's no __ne__. Dataclasses don't actually synthesize an __ne__ method for this reason. I've been toying with different ways to fix this but != is not working in this PR at the moment.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74889
Test Plan:
unittest
Also run previously failed test:
```
buck test mode/dev-nosan //fblearner/flow/projects/fluent2/definition/transformers/contrib/faim/test:tests -- --exact 'fblearner/flow/projects/fluent2/definition/transformers/contrib/faim/test:tests - test_mixmatch_multiclass (fblearner.flow.projects.fluent2.definition.transformers.contrib.faim.test.faim_mixmatch_test.TestFaimTransformerMixMatch)'
```
passes
Reviewed By: zhxchen17
Differential Revision: D35206262
Pulled By: qihqi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76771
Approved by: https://github.com/seemethere
PR https://github.com/pytorch/pytorch/pull/77042 has fixed the new folding conv-bn data type issue but missing the case when original conv has no bias input.
In this PR:
- Fix the new folding conv-bn's bias data type issue, when conv has no bias but weight as lower precision datatype, the new generated bias data type should be same as conv's weight.
- Move the Autocast JIT Trace UT from `test_jit.py` to `test_jit_autocast.py`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78241
Approved by: https://github.com/davidberard98
Consider the following JIT graph, where the type of `%a` and `%b` are out of sync with tuple `%c`.
Before:
```
graph(%a : Float(123), %b : Float(4, 5, 6)):
c : (Tensor, Tensor) = prim::TupleConstruct(%a, %b)
return (%c)
```
After:
```
graph(%a : Float(123), %b : Float(4, 5, 6)):
c : (Float(123), Float(4, 5, 6)) = prim::TupleConstruct(%a, %b)
return (%c)
```
This PR adds a pass `RefineTypes(...)` to update all such instances with the correct type. This is also available via Python by using `torch._C._jit_pass_refine_types(...)`.
A unit test has been added for unnamed tuples, but no test exists for `NamedTuple` (though it was tested manually) since it isn't supported by the parser:
```
RuntimeError:
unknown type specifier:
graph(%a : Float(123), %b : Float(4, 5, 6)):
%c : NamedTuple(Tensor : Tuple, Tensor : Tuple) = prim::TupleConstruct(%a, %b)
~~~~~~~~~~ <--- HERE
return (%c)
```
cc: @ke1337 @antoniojkim @wconstab @eellison
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76919
Approved by: https://github.com/eellison
This PR ...
Makes the following testing changes:
- Updates stride testing in test_python_reference_consistency to only check strides of dimensions with length > 1
- Creates reference inputs for reshape
- Creates reference inputs for chunk
- Extends the sample inputs for unsqueeze
- Extends the sample inputs for stack -- test_conj_view and test_neg_view are now xfailed
- https://github.com/pytorch/pytorch/issues/77046
Makes the following architecture changes:
- Adds the refs.special (sub)module
- Adds the refs.nn.functional (sub)module
Adds the following prims:
- expand_dims
- view_of
- rev
- clone
Adds the following references:
- flatten
- squeeze
- unsqueeze
- special.i0e
- special.i1e
- logical_or
- logical_and
- isclose
- flip
- stack
- nn.functional.elu
- chunk
- clone
- narrow
Identifies the following bugs in PyTorch today:
- https://github.com/pytorch/pytorch/issues/77054
- https://github.com/pytorch/pytorch/issues/77055
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77043
Approved by: https://github.com/ngimel
For the most part, PrimTorch refs have the same signature as their
ATen equivalents. I modify most PrimTorch refs to register themselves
as decompositions, using the prim name they wrap to find the aten name
(except for a few cases where the prim/aten names mismatch). There are
some exclusions, falling into one of two categories:
- The torch equivalent was already implemented as a CompositeImplicitAutograd
decomposition in C++
- The ref doesn't support enough features (e.g., the real deal has more
kwargs / overloads than are currently implemented)
PrimTorch refs are written as a single function that supports all
overloads, and this style is convenient for cases where we have a bundle
of overloads for what morally is a single overload with a Union type
on an argument (which we ought to have supported in
native_functions.yaml but blah); to support registering a single decomp
for all the overloads, we modify register_decomposition to register
to ALL overloads if you pass it an overload packet. This is technically
BC breaking but no tests started failing because of it.
Signed-off-by: Edward Z. Yang <ezyangfb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76835
Approved by: https://github.com/Chillee, https://github.com/mruberry
Summary:
This enables previous change made at D35196883 (b34b192d6b)
Previous change is landed for 2 weeks to make sure that the format change introduced here will be handed in code.
Test Plan: existing tests
Differential Revision: D36074453
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76688
Approved by: https://github.com/gmagogsfm
crossref is a new strategy for performing tests when you want
to run a normal PyTorch API call, separately run some variation of
the API call (e.g., same thing but all the arguments are meta tensors)
and then cross-reference the results to see that they are consistent.
Any logic you add to CrossRefMode will get run on *every* PyTorch API
call that is called in the course of PyTorch's test suite. This can
be a good choice for correctness testing if OpInfo testing is not
exhaustive enough.
For now, the crossref test doesn't do anything except verify that
we can validly push a mode onto the torch function mode stack for all
functions.
Signed-off-by: Edward Z. Yang <ezyangfb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75988
Approved by: https://github.com/seemethere
This PR makes the following improvements:
- moves the custom skip list for test_normalize_operator_exhaustive in test_fx_experimental to use the typical OpInfo skip architecture. The skips were updated to xfails, and that identified some operators which were no longer failing the test
- redundant tests with OpInfo-based testing in test_jit.py were removed
- test_dtypes was improved so its error messages are clear and it makes test_nondifferentiable redundant; the latter test has been removed
- OpInfo.supports_complex_autograd() is removed in favor of a more accurate and general test for whether the particular dtype is in the backward dtypes of the operator
- gradchecks have been improved to verify that an operator doesn't support grad if it claims not to
- gradchecks have been improved to test the gradient of all input tensors that require gradient
- the concept of "default test dtypes" has been removed
- excessive and mostly redundant out testing for elementwise unary operators has been removed
- metadata for whether an op supports nuanced "safe casting" to out behavior has been removed from OpInfos
- numerous skips have been converted to xfails
- numerous OpInfos have had their metadata fixed based on the new checks
- jit-specific utilities in common_methods_invocations.py have been moved to jit_programming_utils.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75951
Approved by: https://github.com/ngimel
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75119
Add support for parsing Tensor constants like Double(4, 4) ... by initializing random tensors. This makes saving IR and then parsing it lossy, so I have it toggled as default not on, but is useful in cases like repro-ing Fusions with tensor constants post-freezing.
cc Krovatkin
Test Plan: Imported from OSS
Reviewed By: ejguan
Differential Revision: D35373999
Pulled By: eellison
fbshipit-source-id: a5c8d9f93f23a7442258fc745ed6b6def330dca8
(cherry picked from commit 32dd6567522973563bd452bf486ed27b02e4e35c)
Summary:
This PR introduces `SymInt` type to Pytorch which will be used by LTC and AOTAutograd for tracing size arithmetic and tests.
`SymInt` is a C++ union structure [int64_t, SymbolicIntNode*] that wraps around an int64_t field where the value of the field could be an index into a list of `shared_ptr<SymbolicIntNode>` or a real int.
This PR doesn't add any support for actually tracing symbolic ints. i.e. data_ for now can only contain real ints.
```
Goal 1: just to show we can add a type to PyTorch core. (wraps int) LANDEABLE
Finalize the naming - symint
Want the name to be short
Does invoke “size” - NO
SInt/SymInt/SymbolicInt
SInt could mean signed int
sym_int or symint or SymInt (originally it was “int”; capitalized implies object semantics, whereas lowercase implies value semantics)
JIT schema - symint
C++ - symint
```
See more details here: https://docs.google.com/document/d/1iiLNwR5ohAsw_ymfnOpDsyF6L9RTUaHMpD8 (d843f63f2a)YLw-jxEw
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74861
Reviewed By: qihqi, ngimel
Differential Revision: D35226230
Pulled By: Krovatkin
fbshipit-source-id: 34acf342bd50fcaa4d8d5dd49c2fd6a98823a5b3
(cherry picked from commit 218643f63ef181cabb92d13a6e837eb64f2dda3c)