For some tensor `x`, `x.type(torch.FloatTensor)` does essentially the same thing as `x.to(torch.float)`. `x.type` can be called with at least three kinds of input:
* a string, e.g. `"torch.FloatTensor"`
* a dtype, e.g. `torch.float`
* a tensor type, e.g. `torch.FloatTensor`

The third option (`torch.FloatTensor`) fails in fx, because fx cannot trace `torch.FloatTensor` objects. So this PR replaces the `torch.FloatTensor` type with the string `"torch.FloatTensor"`.
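For illustration, a minimal eager-mode sketch of the three call forms and the equivalent `.to()` call; only the third form is the one fx cannot trace:

```python
import torch

x = torch.arange(4)  # an int64 tensor

a = x.type("torch.FloatTensor")  # string form -- fx can trace this
b = x.type(torch.float)          # dtype form -- fx can trace this
c = x.type(torch.FloatTensor)    # tensor-type form -- fx cannot trace this
d = x.to(torch.float)            # the equivalent .to() call

assert a.dtype == b.dtype == c.dtype == d.dtype == torch.float32
```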
Why not fix this in fx? It's possible, but I'm not sure of a nice way to do it. We would want to update [torch.fx.node.BaseArgumentTypes](d88bc38b0c/torch/fx/node.py (L17)) to contain `torch.FloatTensor` etc. We could hard-code a list of tensor types there (the types vary depending on build type, e.g. whether or not CUDA tensors are available), but that's not great in case our hardcoded list differs from the actual list registered by python_tensor.cpp. Another option is to dynamically populate the list of types with `Union[tuple(...)]`, filling the tuple from `torch._tensor_classes` (which is populated directly by python_tensor.cpp), but apparently this breaks most typecheckers.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93043
Approved by: https://github.com/jansel
The output of the Torchbench model `doctr_det_predictor` on CPU is a `numpy.ndarray`. When running the accuracy benchmark for this model, the following error is raised: `RuntimeError: unsupported type: ndarray`.
Repro CMD:
```bash
python benchmarks/dynamo/torchbench.py --accuracy --float32 -dcpu -n50 --inductor --no-skip --dashboard --only doctr_det_predictor --batch_size 1 --threads 1
```
This PR adds support for comparing `numpy.ndarray` outputs in the dynamo utils.
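For illustration, a rough sketch of the kind of comparison this enables; the function name `same` and the tolerances are placeholders, not the exact utility used by the benchmark:

```python
import numpy as np
import torch

def same(ref, res, tol=1e-4):
    # Illustrative only: route numpy arrays through the tensor comparison path.
    if isinstance(ref, np.ndarray) and isinstance(res, np.ndarray):
        return same(torch.from_numpy(ref), torch.from_numpy(res), tol)
    if isinstance(ref, torch.Tensor) and isinstance(res, torch.Tensor):
        return torch.allclose(ref, res, rtol=tol, atol=tol)
    return ref == res
```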
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91870
Approved by: https://github.com/jgong5, https://github.com/Chillee
When we run the node with a fake value for `tensor.item()`, it would previously error because the utility method doesn't know how to handle placeholder nodes. The tensor we call `.item()` on can be an input from the user, which will be a placeholder node in the graph.
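A hypothetical repro sketch of the pattern involved (the function and values are made up): the tensor supplied by the user becomes a placeholder node in the traced graph, and `.item()` is then called on it.

```python
import torch

def f(x):
    # x is supplied by the user, so it becomes a placeholder node in the
    # traced graph; .item() on it must then be evaluated with a fake value.
    return x.item() + 1

compiled = torch.compile(f)
print(compiled(torch.tensor(3)))  # 4
```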
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91668
Approved by: https://github.com/voznesenskym
GraphArgs worked fairly well, but it was still missing sources
sometimes. Now, we maintain an auxiliary data structure which we
MUST populate whenever we fakeify a tensor / allocate a bare SymInt.
This should guarantee once and for all that every symbol is available.
Should fix swin_base_patch4_window7_224.
While I was at it, I moved the fakeification utility back to builder, as it was only used at one call site.
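A minimal sketch of the kind of bookkeeping described above; the names and string formats are illustrative, not the actual data structure in this PR:

```python
from collections import defaultdict

# Illustrative only: map each allocated symbol (e.g. "s0") to the sources it
# was derived from (e.g. "L['x'].size()[0]"), populated at the moment a
# tensor is fakeified or a bare SymInt is allocated, so that every symbol
# can later be named when guards are generated.
symbol_to_sources = defaultdict(list)

def record_symbol(symbol: str, source_name: str) -> None:
    symbol_to_sources[symbol].append(source_name)

record_symbol("s0", "L['x'].size()[0]")
record_symbol("s0", "L['y'].size()[1]")  # two sources, one symbol
```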
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90911
Approved by: https://github.com/voznesenskym
Instead of inferring shape mappings from a bunch of data structures plumbed through InstructionTranslator, we work out the mappings by simply iterating over the GraphArgs and mapping symbols to arguments as they show up. If multiple argument sizes/strides/offsets map to the same symbol, they are duck sized, so we also generate extra equality tests asserting that they are equal. Finally, we generate 0/1 specialization guards. The resulting code is much shorter, and I think also easier to understand.
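A simplified sketch of the idea, assuming a mapping from each symbol to the graph-arg sources where it showed up (all names and the guard string format are illustrative):

```python
def generate_shape_guards(symbol_to_sources):
    # Illustrative only: pick the first source per symbol as canonical; any
    # other source mapped to the same symbol was duck sized, so emit an
    # equality guard for it, plus a 0/1 specialization guard per symbol.
    guards = []
    for sources in symbol_to_sources.values():
        canonical, *rest = sources
        for other in rest:
            guards.append(f"{canonical} == {other}")
        guards.append(f"{canonical} != 0 and {canonical} != 1")
    return guards

print(generate_shape_guards({"s0": ["L['x'].size()[0]", "L['y'].size()[1]"]}))
# ["L['x'].size()[0] == L['y'].size()[1]",
#  "L['x'].size()[0] != 0 and L['x'].size()[0] != 1"]
```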
TODO: Delete all the tensor ref tracking code, it's unnecessary
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90528
Approved by: https://github.com/voznesenskym
Wow, I had to sweat so much to get this PR out lol.
This PR enforces the invariant that whenever we allocate SymInts as part of fakeification, the SymInt is associated with a Source, and in fact we store the string source name on SymbolWithSourceName. We use 'sname' as the shorthand for source name, as 'name' is already used by sympy to name symbols.
In order to store source names, we have to plumb source names from Dynamo to PyTorch. This made doing this PR a bit bone crushing, because there are many points in the Dynamo codebase where we are improperly converting intermediate tensors into fake tensors, where there is no source (and there cannot be, because it's a frickin' intermediate tensor). I've fixed all of the really awful cases in earlier PRs in the stack. This PR is just plumbing in source names from places where we do have it.
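A minimal sketch of the `sname` convention; the class body here is illustrative, only the name/`sname` split is taken from the PR:

```python
import sympy

class SymbolWithSourceName(sympy.Symbol):
    # Illustrative sketch: sympy already uses `name` for the symbol itself,
    # so the source name is stored under the shorthand `sname`.
    def __new__(cls, name, sname):
        instance = super().__new__(cls, name)
        instance.sname = sname
        return instance

s = SymbolWithSourceName("s0", "L['x'].size()[0]")
print(s.name, s.sname)  # s0 L['x'].size()[0]
```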
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90295
Approved by: https://github.com/voznesenskym
The old code didn't actually fakeify traceable tensor subclasses at the time they were added as a GraphArg to the module; now we do, by ignoring the subclass during fakeification and relying on Dynamo to simulate the subclass on top. See comments for more details.
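A rough sketch of "ignore the subclass during fakeification"; the helper and the subclass are hypothetical stand-ins for the real logic in Dynamo's builder:

```python
import torch
from torch._subclasses import FakeTensorMode

class MySubclass(torch.Tensor):
    # Stand-in for a traceable tensor subclass, for illustration only.
    pass

def fakeify_ignoring_subclass(fake_mode, t):
    # Fakeify only the underlying plain tensor; Dynamo is expected to
    # simulate the subclass behaviour on top of the resulting fake tensor.
    plain = t.as_subclass(torch.Tensor)
    return fake_mode.from_tensor(plain)

fake_mode = FakeTensorMode()
x = torch.randn(3).as_subclass(MySubclass)
fake = fakeify_ignoring_subclass(fake_mode, x)
```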
BTW, this codepath is super broken, see filed issues linked on the
inside.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90009
Approved by: https://github.com/wconstab, https://github.com/voznesenskym
In #87741 we added inference support for the dynamo/torchxla integration. Later, in #88449, we attempted to add training support. That attempt was not smooth because:
- we tried two things together:
  1. let dynamo trace the model on XLA rather than eager
  2. enable training
- it turns out neither of these tasks is trivial.

Furthermore, item 2 (enable training) depends on item 1 (tracing on XLA). We enable training via AOTAutograd, and AOTAutograd lifts all model parameters/buffers as graph inputs. Without item 1 done, we would need to copy all graph inputs (including model parameters/buffers) from the eager device to XLA devices, which hurts performance a lot. Keeping a cache that maps eager parameters to XLA parameters does not solve the problem, since an update to either side will not automatically sync to the other; they easily go out of sync.
This PR lets dynamo trace the model on XLA rather than eager, as a preparation step for enabling training.
Also, tracing on XLA makes the data movement more efficient: we see a 1.50x geomean speedup, compared to the previous 1.38x.
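A hedged sketch of what tracing on XLA looks like, assuming `torch_xla` is installed and using a small `nn.Linear` as a stand-in for a torchbench model; the backend name comes from the example command below:

```python
import torch
import torch._dynamo
import torch_xla.core.xla_model as xm

device = xm.xla_device()                     # an XLA device instead of eager CPU/GPU
model = torch.nn.Linear(128, 64).to(device)  # stand-in for a torchbench model
example = torch.randn(32, 128, device=device)

# Dynamo now traces the model with XLA tensors, so parameters/buffers already
# live on the XLA device and nothing needs to be copied over from eager.
opt_model = torch._dynamo.optimize("torchxla_trace_once")(model)
out = opt_model(example)
```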
```
+-------------------------+--------------------+-------------------------+
| Model | XLA (trace once) | XLA (trace everytime) |
+=========================+====================+=========================+
| resnet18 | 1.38 | 1.008 |
+-------------------------+--------------------+-------------------------+
| resnet50 | 1.227 | 0.998 |
+-------------------------+--------------------+-------------------------+
| resnext50_32x4d | 1.544 | 1.008 |
+-------------------------+--------------------+-------------------------+
| alexnet | 1.085 | 1.045 |
+-------------------------+--------------------+-------------------------+
| mobilenet_v2 | 2.028 | 1.013 |
+-------------------------+--------------------+-------------------------+
| mnasnet1_0 | 1.516 | 0.995 |
+-------------------------+--------------------+-------------------------+
| squeezenet1_1 | 0.868 | 1.01 |
+-------------------------+--------------------+-------------------------+
| vgg16 | 1.099 | 1.008 |
+-------------------------+--------------------+-------------------------+
| BERT_pytorch | 3.26 | 1.027 |
+-------------------------+--------------------+-------------------------+
| timm_vision_transformer | 2.182 | 1.015 |
+-------------------------+--------------------+-------------------------+
| geomean | 1.50389 | 1.01261 |
+-------------------------+--------------------+-------------------------+
```
Example command
```
GPU_NUM_DEVICES=1 python benchmarks/dynamo/torchbench.py --randomize-input --performance --trace-on-xla --only resnet18 --backend=torchxla_trace_once
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88904
Approved by: https://github.com/wconstab, https://github.com/JackCaoG, https://github.com/jansel
**Introduces symbolic shape guards into dynamo.**
In this PR, we take the existing fake tensor infra and plumbing in dynamo and start passing a `shape_env` around. This `shape_env` does not get plumbed down to the middle layers / backend yet; at the moment it only collects expressions from frontend invocations. We then translate these expressions into guards at the point where we take the other guards installed throughout dynamo, and add them to `check_fn`.
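A simplified sketch of the translation step; `shape_env` and `check_fn` are names from the PR, while the guard-string format and the `eval` are illustrative:

```python
import torch

# Suppose the shape_env has collected symbolic expressions relating graph
# input sizes. Guard generation turns each expression into a Python check
# over the real inputs and folds it into the same check_fn that holds
# dynamo's other guards (the format of the guard strings is illustrative).
collected_exprs = [
    "L['x'].size()[0] == L['y'].size()[0]",
    "L['x'].size()[0] > 1",
]
check_fn = eval("lambda L: " + " and ".join(collected_exprs))

inputs = {"x": torch.randn(4, 3), "y": torch.randn(4, 5)}
print(check_fn(inputs))  # True -> the previously compiled graph can be reused
```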
Part 1 of https://docs.google.com/document/d/1QJ-M4zfMkD-fjHIqW089RptjLl9EgozZGCceUbvmgfY/edit#
cc @jansel @lezcano @fdrocha @mlazos @soumith @yanboliang @penguinwu @anijain2305
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87570
Approved by: https://github.com/ezyang