This PR introduces some modifications:
1. We identify some const function parameters that can be passed by reference and add the reference.
2. We find more opportunities for passing by value and change the code accordingly.
3. Some use-after-move errors are fixed (see the sketch below).
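For illustration, a minimal sketch of the kinds of fixes involved; the code and names below are illustrative, not taken from the PR itself:
```cpp
// Illustrative only; not code from the PR.
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// (1)/(2) A const parameter that is only read can be taken by const reference
// instead of by value, avoiding a copy.
void log_name(const std::string& name) { (void)name; }  // was: const std::string name

void consume(std::vector<int> v) { (void)v; }

// (3) Use-after-move fix: read the value *before* std::move-ing from it.
std::size_t fixed(std::vector<int> v) {
  const std::size_t n = v.size();  // previously this was read after the move below
  consume(std::move(v));
  return n;
}
```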
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95942
Approved by: https://github.com/Skylion007
Summary:
Since `c10::ArrayRef` now supports `c10::ArrayRef<const T>`, let's restore `ComputePostOrder` to accept `const Node*` again, which better fits the context of the given helpers.
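For reference, a minimal sketch of the restored shape of the helper, using stand-in types rather than the real c10/torch headers:
```cpp
// Stand-in types only; the real declaration lives under torch/csrc/lazy.
#include <vector>

struct Node {};  // stand-in for torch::lazy::Node

// Stand-in for c10::ArrayRef<T>; per the summary above, ArrayRef<const T> now works.
template <typename T>
using ArrayRef = const std::vector<T>&;

// Restored signature: the post-order helper accepts const Node* again.
std::vector<const Node*> ComputePostOrder(ArrayRef<const Node*> nodes) {
  // Placeholder body; the real helper performs a post-order traversal of the graph.
  return std::vector<const Node*>(nodes.begin(), nodes.end());
}
```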
Test Plan:
CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88773
Approved by: https://github.com/JackCaoG
Partially fixes: #66328
This PR:
- adds support for `ITensorList` to the dispatcher for:
- computing the dispatch key
- boxing and unboxing `ITensorList`
- modifies the codegen for structured kernels:
- codegen APIs use `ITensorList` instead of `ArrayRef<Tensor>` (a sketch of the resulting signature change follows at the end of this description)
**Changes summary:**
- Signature changes due to the different APIs:
- dispatcher API (e.g. `BatchingRegistrations.cpp`)
- C++ API (e.g. `TensorShape.cpp`)
- Miscellaneous functions used by codegen'd functions (e.g. `FunctionalTensorWrapper.*`)
- Dispatcher changes for handling `ITensorList` correctly (e.g. `DispatchKeyExtractor.h`)
- Signature changes of `at::cat` due to the need for `const` inside `TensorBody.h`
- Forward declarations of `ITensorList` (e.g. `MethodOperators.h`)
- Codegen changes, special casing structured kernels (e.g. `gen.py`)
**Short description of structured kernels special casing:**
I introduced, mainly, five kinds of changes to the codegen so that it generates different code
depending on whether the kernel is structured or not:
1. Added a `structured_type_override` flag to the `argument_type` function definition of
the affected APIs (mainly the dispatcher and C++ APIs).
- `api/cpp.py`, `api/dispatcher.py`, `api/native.py`
2. Added a `structured_type_override` member to the signature
classes (e.g. `CppSignature`), since `FunctionSchema` doesn't really know whether the
function is structured or not
- `api/types.py`
3. Added a `part_of_structured_group` to the `NativeFunction` class, which is just a
convenience function that forwards to `structured_type_override` wherever needed
- `model.py`
4. Appropriately changed the rest of the codegen, whenever it used either the signature
classes or the `arguments` function directly
5. Added a check for `const ITensorList&` type wherever there was a check for `TensorList`
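For illustration, a hedged sketch of the signature change mentioned above, using stand-in types rather than the actual ATen headers; `cat_before`/`cat_after` are illustrative names:
```cpp
// Stand-in types only; the real ones live in ATen/c10.
#include <cstdint>
#include <vector>

struct Tensor {};
struct ITensorList { std::vector<Tensor> tensors; };

// Before: codegen'd APIs took a materialized ArrayRef<Tensor>
// (approximated here by a const vector reference).
Tensor cat_before(const std::vector<Tensor>& tensors, int64_t dim = 0) {
  (void)tensors; (void)dim;
  return Tensor{};
}

// After: structured kernels are special-cased to take const ITensorList&,
// hence the `const` needed inside TensorBody.h.
Tensor cat_after(const ITensorList& tensors, int64_t dim = 0) {
  (void)tensors; (void)dim;
  return Tensor{};
}
```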
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73350
Approved by: https://github.com/bdhirsh
Adds a `ShouldSyncTensor` interface to allow output pruning when a vendor does not support retrieving the value of a certain output.
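A minimal sketch of the idea, assuming the hook sits on the backend interface; `OutputHandle` and the method placement are hypothetical stand-ins, not the exact PyTorch API:
```cpp
// Illustrative names and placement; not the exact PyTorch API.
struct OutputHandle {};  // hypothetical stand-in for whatever identifies an output

struct BackendImplInterface {
  virtual ~BackendImplInterface() = default;
  // Default: sync every output. A vendor overrides this to prune outputs whose
  // values it cannot retrieve.
  virtual bool ShouldSyncTensor(const OutputHandle& output) const {
    (void)output;
    return true;
  }
};
```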
CC: @wconstab @JackCaoG @Krovatkin
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84418
Approved by: https://github.com/wconstab
Previously, when codegening ops like `zeros_` or `ones_`, we'd hit a `Code below assumes there is at least one tensor arg` error. That check is not entirely correct, which is why the error is thrown: ops like the ones mentioned pass in a `device` parameter that can be used in place of the "first tensor".
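A minimal sketch of the fallback this describes, with stand-in types and a hypothetical `PickBackendDevice` name rather than the actual generated code:
```cpp
// Stand-in types; not the actual generated code.
#include <optional>
#include <stdexcept>
#include <vector>

struct Device {};
struct Tensor { Device device; };

Device PickBackendDevice(const std::vector<Tensor>& tensor_args,
                         const std::optional<Device>& device_arg) {
  if (!tensor_args.empty()) {
    return tensor_args.front().device;  // old assumption: at least one tensor arg
  }
  if (device_arg) {
    return *device_arg;  // fix: ops like zeros_/ones_ can use their device parameter
  }
  throw std::runtime_error("op has neither tensor args nor a device argument");
}
```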
CC: @wconstab @desertfire @henrytwo @ke1337
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76917
Approved by: https://github.com/desertfire
Proposed solution for #76826
Basically adds a context which is only "active" while mark step is running. Any backend can then use it to check whether it is within a mark step.
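One way such a context could look, sketched with illustrative names (not the exact implementation):
```cpp
// Illustrative only; not the exact implementation.
#include <atomic>

namespace lazy_sketch {

std::atomic<bool> g_in_mark_step{false};

// RAII scope that mark_step enters; the flag is cleared when the scope ends.
struct MarkStepScope {
  MarkStepScope() { g_in_mark_step.store(true); }
  ~MarkStepScope() { g_in_mark_step.store(false); }
};

// Backends can query this to, e.g., warn when compiling outside a mark step.
bool InMarkStep() { return g_in_mark_step.load(); }

}  // namespace lazy_sketch
```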
I've also added an example warning in the TS backend so that we now see the following:
```python
>>> import torch
>>> import torch._lazy
>>> import torch._lazy.ts_backend
>>> torch._lazy.ts_backend.init()
>>> a = torch.tensor([1, 2, 3, 4], device="lazy")
>>> b = torch.tensor([5, 6, 7, 8], device="lazy")
>>> c = a + b
>>> c
[W ts_backend_impl.cpp:187] Compile outside of mark step
tensor([ 6, 8, 10, 12], device='lazy:0')
>>> d = a * b
>>> torch._lazy.mark_step()
>>> d
tensor([ 5, 12, 21, 32], device='lazy:0')
```
This is mainly meant as an example; I'm happy to remove it if the warning is not desired.
Fixes #76826
CC: @wconstab @desertfire @henrytwo @ke1337
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76840
Approved by: https://github.com/desertfire
Next stage of breaking up https://github.com/pytorch/pytorch/pull/74710
An IR builder class is introduced to decouple core lazy tensor code from explicit usage of `TsNode`.
Requires https://github.com/pytorch/pytorch/pull/75324 to be merged in first.
**Background**
- there are ~5 special ops used in lazy core but defined as `: public {Backend}Node` (DeviceData, Expand, Scalar, ...)
- we currently require all nodes to derive from {Backend}Node, so that backends can make this assumption safely
- it is hard to have shared 'IR classes' in core/ because they depend on 'Node'
**Motivation**
1. avoid copy-paste of "special" node classes for each backend
2. in general decouple and remove all dependencies that LTC has on the TS backend
**Summary of changes**
- new 'IRBuilder' interface that knows how to make the 5 special ops (sketched below)
- move 'special' node classes to `ts_backend/`
- implement a `TSIRBuilder` that makes the special TS nodes
- new backend interface API to get the IRBuilder
- update core code to call the builder
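A sketch of the IRBuilder idea with stand-in types; the `Make*` method names here are hypothetical, not the exact interface:
```cpp
// Stand-in types and names; not the exact interface.
#include <memory>

struct Node {};               // stand-in for torch::lazy::Node
struct Value {};              // stand-in for an IR operand
struct Shape {};
struct BackendDataHandle {};  // stand-in for a handle to backend data

// Core asks the backend's builder for the "special" nodes instead of
// constructing backend-specific node classes (e.g. TsNode) directly.
struct IrBuilder {
  virtual ~IrBuilder() = default;
  virtual std::shared_ptr<Node> MakeDeviceData(BackendDataHandle data) const = 0;
  virtual std::shared_ptr<Node> MakeScalar(double value, Shape shape) const = 0;
  virtual std::shared_ptr<Node> MakeExpand(Value input, Shape shape) const = 0;
  // ... and the remaining special ops.
};

// A TSIRBuilder in ts_backend/ would implement this and return TsNode-derived
// nodes; the backend interface exposes a getter so core never names TsNode.
```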
CC: @wconstab @JackCaoG @henrytwo
Partially fixes #74628
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75433
Approved by: https://github.com/wconstab
Summary:
This PR enables Input/Output aliasing for Lazy Tensor Core. `SetUpAlias` is a virtual function that can be overridden in a vendor's custom `LoweringContext` implementation.
The return type of `LoweringContext::GetResultShape` has also been updated to return a `c10::optional` value, since `GetResultShape` isn't currently implemented for the TorchScript backend.
The changes here mirror the interface used by `torch_xla`: https://github.com/pytorch/xla/blob/master/torch_xla/csrc/tensor.cpp#L1548-L1549
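A minimal sketch of the two interface changes described above, using stand-in types and illustrative parameter names rather than the real torch/csrc/lazy headers:
```cpp
// Stand-in types; the real interface lives under torch/csrc/lazy.
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

struct Shape {};

struct LoweringContext {
  virtual ~LoweringContext() = default;

  // Backends that support input/output aliasing override this; the default is
  // a no-op, so backends without aliasing (e.g. TorchScript) can ignore it.
  virtual void SetUpAlias(const std::vector<int64_t>& output_index,
                          int64_t param_number,
                          const std::vector<int64_t>& param_index) {
    (void)output_index; (void)param_number; (void)param_index;
  }

  // Optional because GetResultShape isn't implemented for the TS backend yet.
  virtual std::optional<Shape> GetResultShape(std::size_t index) const {
    (void)index;
    return std::nullopt;
  }
};
```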
cc: antoniojkim ke1337 wconstab silvasean
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75828
Reviewed By: Krovatkin
Differential Revision: D35952593
Pulled By: wconstab
fbshipit-source-id: e20b11e44e0e1beda1b1c47aa3a8b611afd97b7f
(cherry picked from commit bcbc9ef01ef8eb84667e5c42edc10d38d5d78395)
Summary:
Next stage of breaking up https://github.com/pytorch/pytorch/pull/74710
Move shape cache implementation to the backend interface. Also, clean up some of the hashing logic in the base node class.
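Roughly, the shape cache moves behind the backend interface; the sketch below uses hypothetical names (`ShapeCache`, `GetShapeCache`) to show the shape of the change, not the exact API:
```cpp
// Illustrative names; not the exact interface.
#include <cstdint>
#include <memory>
#include <unordered_map>

struct Shape {};
using hash_t = uint64_t;  // stand-in for the lazy IR hash type

// Trivial stand-in cache keyed by a node/shape hash.
using ShapeCache = std::unordered_map<hash_t, std::shared_ptr<Shape>>;

struct BackendImplInterface {
  virtual ~BackendImplInterface() = default;
  // Each backend owns and configures its shape cache (size, eviction, ...),
  // instead of core holding a single global cache.
  virtual ShapeCache* GetShapeCache() const = 0;
};
```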
CC: wconstab JackCaoG henrytwo
Partially Fixes https://github.com/pytorch/pytorch/issues/74628
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75324
Reviewed By: anjali411
Differential Revision: D35730823
Pulled By: wconstab
fbshipit-source-id: cf6fa326319b9324e5f422a78817b6fb5bf7e9b8
(cherry picked from commit faec5043df56639e2fd23de2d91ae796e4f3df70)
Summary:
Also enables the bazel build to run lazy codegen. The Bazel (OSS) build feeds off the same file lists as cmake/buck (build_variables.bzl), so enabling it is easier than keeping it disabled.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74111
Test Plan: Run CI and verify test_lazy_ops is running via OSS cmake builds
Reviewed By: bdhirsh
Differential Revision: D34772403
fbshipit-source-id: 8a63f58b9536e6ac1be530667932176ef2549496
(cherry picked from commit e807ffb1918853d10b924fdc24f85ee5b1a39021)
Summary:
This merges changes that have already been reviewed/landed onto the lazy_tensor_staging branch. It combines changes from multiple PRs into one diff.
updated from lazy_tensor_staging on 3/16
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74311
Test Plan:
Run CI to ensure compilation on various platforms
Run unit tests on lazy_tensor_staging branch with source version of all these diffs
Reviewed By: desertfire
Differential Revision: D34929235
fbshipit-source-id: babbc3bbeabc5b8107ee9284ed7765887a148622
(cherry picked from commit d91577a6557343ec536f6859e4808ec1a8a9b685)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73445
Refactors the whole codebase to use LazyTensorPtr (defined as a c10::intrusive_ptr) to enable XLA to use a derived class, XlaLazyTensor, and override functionality.
This PR is just the first step; we will need to add a factory class that XLA can override in their backend to actually hook up their derived tensor class.
Parallel PR on lazy_tensor_staging: #73429
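A sketch of the pattern only; the `CreateXlaLazyTensor` factory below is a hypothetical stand-in for the factory class mentioned above, and the classes are heavily simplified:
```cpp
// Sketch of the pattern, not the full classes.
#include <c10/util/intrusive_ptr.h>

namespace sketch {

// Ref-counted base; the real torch::lazy LazyTensor carries the IR/graph state.
class LazyTensor : public c10::intrusive_ptr_target {};

// The alias the codebase is refactored to use everywhere.
using LazyTensorPtr = c10::intrusive_ptr<LazyTensor>;

// A backend such as XLA can derive and override functionality; the factory
// class mentioned above (future work) would let core construct this type.
class XlaLazyTensor : public LazyTensor {};

inline LazyTensorPtr CreateXlaLazyTensor() {
  return c10::make_intrusive<XlaLazyTensor>();
}

}  // namespace sketch
```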
Test Plan: tested via lazy_tensor_staging test_ptltc and torchbench and CI
Reviewed By: ezyang
Differential Revision: D34481918
fbshipit-source-id: 01176b127df6b79039aa1bc57bc6da5505161f87
(cherry picked from commit 52b9ae4e22d2703d44c6436311d79d40bd62c6aa)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70867
This commit syncs LazyGraphExecutor and LazyTensor with the staging branch's
latest changes.
Test Plan: CI in the lazy_tensor_staging branch.
Reviewed By: wconstab, desertfire
Differential Revision: D33440005
Pulled By: alanwaketan
fbshipit-source-id: 0dd72643dbf81a87fc4b05019b6564fcb28f1979
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70062
This commit upstreams LTCTensorImpl from the lazy_tensor_staging branch.
It inherits from c10::TensorImpl and thus manages the lifetime/storage
of LazyTensor.
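A minimal sketch of that relationship, with stand-in types in place of the real c10::TensorImpl and LazyTensor:
```cpp
// Stand-in types; the real LTCTensorImpl derives from c10::TensorImpl.
#include <memory>
#include <utility>

struct TensorImpl {            // stand-in for c10::TensorImpl
  virtual ~TensorImpl() = default;
};

struct LazyTensor {};          // stand-in for torch::lazy::LazyTensor

class LTCTensorImpl : public TensorImpl {
 public:
  explicit LTCTensorImpl(std::shared_ptr<LazyTensor> tensor)
      : tensor_(std::move(tensor)) {}

  const std::shared_ptr<LazyTensor>& tensor() const { return tensor_; }

 private:
  // Destroyed together with the impl, so the LazyTensor lives exactly as long
  // as the at::Tensor wrapper that holds this impl.
  std::shared_ptr<LazyTensor> tensor_;
};
```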
Test Plan: ./build/bin/test_lazy --gtest_filter=LazyTensorImplTest.*
Reviewed By: desertfire
Differential Revision: D33171186
Pulled By: alanwaketan
fbshipit-source-id: 6af9f91cc7c7e997f120cb89a7bcd6785c03ace0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69098
Add the following utils: helpers, ir_dump_util, and
tensor_util. Some of the util functions may be better organized by
grouping into different files, but we can leave that for later.
Test Plan: Imported from OSS
Reviewed By: alanwaketan
Differential Revision: D32758480
Pulled By: desertfire
fbshipit-source-id: 2a0707879f0c49573380b4c8227a3c916c99bf9a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69012
Some changes to torch/csrc/lazy/core were done on the
lazy_tensor_staging branch (https://github.com/pytorch/pytorch/pull/68427).
Merge those back into the trunk.
Test Plan: Imported from OSS
Reviewed By: wconstab
Differential Revision: D32708696
Pulled By: desertfire
fbshipit-source-id: e54b978f2bdb9c7db27880f60246fdf1e8b41019
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67927
BackendData - represents 'tensor data' in opaque backend storage
LoweringContext - interface for performing backend-specific IR lowering
BackendImplInterface - interface for lazy tensor backends to implement
Reorganizes backend-related files into the lazy/backend subdir
Includes a few small fixes, which were made on lazy_tensor_staging but need to be back-ported to master.
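A rough sketch of the three pieces listed above, with stand-in types and illustrative method names rather than the exact interfaces:
```cpp
// Stand-in types and names; not the exact interfaces.
#include <memory>
#include <string>

struct BackendDevice {};

// BackendData: an opaque handle to tensor data held in backend storage.
struct BackendData {
  virtual ~BackendData() = default;
  virtual const BackendDevice& device() const = 0;
};

// LoweringContext: performs backend-specific lowering of the accumulated IR.
struct LoweringContext {
  virtual ~LoweringContext() = default;
};

// BackendImplInterface: the entry point a lazy tensor backend implements.
struct BackendImplInterface {
  virtual ~BackendImplInterface() = default;
  virtual std::unique_ptr<LoweringContext> CreateLoweringContext(
      const std::string& name, BackendDevice device) const = 0;
};
```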
Test Plan: used by lazy_tensor_staging branch
Reviewed By: desertfire
Differential Revision: D32142032
fbshipit-source-id: 828c717bcd0d511876e64ad209b50f7bfb10cec5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68027
This commit upstreams class BackendDevice to master, which is a backend-specific
representation of the actual hardware, for instance, CPU, GPU, or TPU.
This concept is important for backends like XLA, which need to tell the actual
hardware type apart from the c10::DeviceType::Lazy virtual device during
both IR construction and lowering.
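A minimal sketch of the BackendDevice idea; the fields below are assumptions chosen for illustration, not the exact class:
```cpp
// Illustrative fields; not the exact class.
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Backends define what the type means (e.g. CPU, GPU, or TPU for XLA).
struct BackendDeviceType {
  virtual ~BackendDeviceType() = default;
  virtual std::string toString() const = 0;
};

// The backend-specific device behind the generic c10::DeviceType::Lazy device.
class BackendDevice {
 public:
  BackendDevice(std::shared_ptr<BackendDeviceType> type, int64_t ordinal)
      : type_(std::move(type)), ordinal_(ordinal) {}

  int64_t ordinal() const { return ordinal_; }

 private:
  std::shared_ptr<BackendDeviceType> type_;  // actual hardware kind
  int64_t ordinal_;                          // device index
};
```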
Test Plan: ./build/bin/test_lazy --gtest_filter=BackendDeviceTest.*
Reviewed By: wconstab
Differential Revision: D32261838
Pulled By: alanwaketan
fbshipit-source-id: 579c3fc5f9da7847c887a383c6047e8ecb9cc5bc