This PR introduces the following modifications:
1. We find some `const` function parameters that can be passed by reference and change them to references.
2. We find more opportunities for passing by value and change the code accordingly.
3. We fix some use-after-move errors.
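A minimal C++ sketch of the three patterns above (illustrative only, not code taken from this diff):
```cpp
#include <string>
#include <utility>
#include <vector>

// 1. A const parameter that is only read: pass by const reference to avoid a copy.
void print_all(const std::vector<std::string>& names);  // was: std::vector<std::string> names

// 2. A parameter that is stored: pass by value and move (at most one copy).
struct Node {
  explicit Node(std::string name) : name_(std::move(name)) {}
  std::string name_;
};

// 3. Use-after-move: read a variable before moving from it, never after.
void add(std::vector<std::string>& out, std::string name) {
  const auto n = name.size();    // correct: read before the move
  out.push_back(std::move(name));
  // reading `name` here would be a use-after-move bug
  (void)n;
}
```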
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95942
Approved by: https://github.com/Skylion007
Partially fixes: #66328
This PR:
- adds support for `ITensorList` to the dispatcher for:
  - computing the dispatch key
  - boxing and unboxing `ITensorList`
- modifies the codegen for structured kernels:
  - codegen APIs use `ITensorList` instead of `ArrayRef<Tensor>`
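For the dispatch-key part, the idea is the same as for a plain `TensorList`: union the key set of every tensor in the list. A simplified sketch (not the actual `DispatchKeyExtractor` code):
```cpp
#include <ATen/core/Tensor.h>
#include <c10/core/DispatchKeySet.h>

// Simplified sketch: union the key sets of all tensors in the list
// argument, exactly as the extractor already does for TensorList.
template <typename TensorListLike>
c10::DispatchKeySet extract_list_keys(const TensorListLike& tensors) {
  c10::DispatchKeySet ks;
  for (const at::Tensor& t : tensors) {
    ks = ks | t.key_set();
  }
  return ks;
}
```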
**Changes summary:**
- Signature changes due to the different APIs:
- dispatcher API (e.g. `BatchingRegistrations.cpp`)
- C++ API (e.g. `TensorShape.cpp`)
- Miscellaneous functions used by codegen'd functions (e.g. `FunctionalTensorWrapper.*`)
- Dispatcher changes for handling `ITensorList` correctly (e.g. `DispatchKeyExtractor.h`)
- Signature changes of `at::cat` due to the need for `const` inside `TensorBody.h`
- Forward declarations of `ITensorList` (e.g. `MethodOperators.h`)
- Codegen changes, special casing structured kernels (e.g. `gen.py`)
**Short description of structured kernels special casing:**
I introduced five main types of changes to the codegen, so that it generates different code
depending on whether the kernel is structured or not:
1. Added a `structured_type_override` flag to the `argument_type` function definition of
the affected APIs (mainly the dispatcher and C++ APIs).
- `api/cpp.py`, `api/dispatcher.py`, `api/native.py`
2. Added a `structured_type_override` member to the signature
classes (e.g. `CppSignature`), since `FunctionSchema` doesn't really know whether the
function is structured or not
- `api/types.py`
3. Added a `part_of_structured_group` helper to the `NativeFunction` class, which is just a
   convenient way to forward to `structured_type_override` wherever needed
- `model.py`
4. Appropriately changed the rest of the codegen, whenever it used either the signature
classes or the `arguments` function directly
5. Added a check for `const ITensorList&` type wherever there was a check for `TensorList`
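To make the effect concrete, here is a hedged before/after of the kind of signature the special-cased codegen emits for a structured kernel taking a tensor list (the exact generated declarations differ; types are from ATen):
```cpp
// Before: tensor lists surface as at::TensorList (i.e. ArrayRef<Tensor>).
at::Tensor cat(at::TensorList tensors, int64_t dim);

// After, for structured kernels only (via structured_type_override): the
// new list type, const-qualified as required inside TensorBody.h.
at::Tensor cat(const at::ITensorList& tensors, int64_t dim);
```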
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73350
Approved by: https://github.com/bdhirsh
Previously, when codegening ops like `zeros_` or `ones_`, we'd hit a `Code below assumes there is at least one tensor arg` error. That check is not entirely correct, which is why the error is thrown: ops like the ones mentioned pass in a `device` parameter that can be used in place of the "first tensor".
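A hypothetical sketch of the fix (illustrative names only, not the actual generated code): prefer the first tensor argument's device, but fall back to an explicit `device` parameter when, as in `zeros_`/`ones_`, one is available:
```cpp
#include <optional>
#include <vector>
#include <ATen/core/Tensor.h>

std::optional<at::Device> infer_device(
    const std::vector<at::Tensor>& tensor_args,
    const std::optional<at::Device>& device_arg) {
  if (!tensor_args.empty()) {
    return tensor_args.front().device();  // the usual "first tensor" path
  }
  return device_arg;  // previously this path hit the hard error above
}
```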
CC: @wconstab @desertfire @henrytwo @ke1337
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76917
Approved by: https://github.com/desertfire
Summary:
This merges changes that have already been reviewed/landed onto the lazy_tensor_staging branch. It combines changes from multiple PRs into one diff.
Updated from lazy_tensor_staging on 3/16.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74311
Test Plan:
Run CI to ensure compilation on various platforms
Run unit tests on lazy_tensor_staging branch with source version of all these diffs
Reviewed By: desertfire
Differential Revision: D34929235
fbshipit-source-id: babbc3bbeabc5b8107ee9284ed7765887a148622
(cherry picked from commit d91577a6557343ec536f6859e4808ec1a8a9b685)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73445
Refactors the whole codebase to use `LazyTensorPtr` (defined as a `c10::intrusive_ptr`) to enable XLA to use a derived class, `XlaLazyTensor`, and override functionality.
This PR is just the first step; we will need to add a factory class that XLA can override in their backend to actually hook up their derived tensor class.
Parallel PR on lazy_tensor_staging: #73429
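A minimal sketch of the pattern (class bodies elided; only `LazyTensor`, `LazyTensorPtr`, and `XlaLazyTensor` come from this PR, the rest is illustrative):
```cpp
#include <c10/util/intrusive_ptr.h>

// LazyTensor participates in intrusive reference counting.
class LazyTensor : public c10::intrusive_ptr_target {
 public:
  ~LazyTensor() override = default;
  // ... IR node, device, shape, etc.
};

using LazyTensorPtr = c10::intrusive_ptr<LazyTensor>;

// A backend such as XLA derives from LazyTensor and overrides behavior.
class XlaLazyTensor : public LazyTensor {
  // XLA-specific state/overrides would live here.
};

// Creation goes through c10::make_intrusive, e.g.:
// LazyTensorPtr t = c10::make_intrusive<XlaLazyTensor>();
```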
Test Plan: tested via lazy_tensor_staging `test_ptltc`, torchbench, and CI
Reviewed By: ezyang
Differential Revision: D34481918
fbshipit-source-id: 01176b127df6b79039aa1bc57bc6da5505161f87
(cherry picked from commit 52b9ae4e22d2703d44c6436311d79d40bd62c6aa)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68027
This commit upstreams the `BackendDevice` class to master. It is a backend-specific
representation of the actual hardware, for instance CPU, GPU, or TPU.
This concept is important for backends like XLA, which need to tell the actual
hardware type apart from the `c10::DeviceType::Lazy` virtual device during both
IR construction and lowering.
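An illustrative sketch of the concept (simplified; the real class differs in detail):
```cpp
#include <cstdint>

// A BackendDevice pairs a backend-defined hardware type with an ordinal,
// while the dispatcher-visible device remains c10::DeviceType::Lazy.
class BackendDevice {
 public:
  BackendDevice(int8_t type, int64_t ordinal)
      : type_(type), ordinal_(ordinal) {}

  // The meaning of `type` is defined by the backend (e.g. XLA: CPU/GPU/TPU).
  int8_t type() const { return type_; }
  int64_t ordinal() const { return ordinal_; }

 private:
  int8_t type_;
  int64_t ordinal_;
};
```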
Test Plan: ./build/bin/test_lazy --gtest_filter=BackendDeviceTest.*
Reviewed By: wconstab
Differential Revision: D32261838
Pulled By: alanwaketan
fbshipit-source-id: 579c3fc5f9da7847c887a383c6047e8ecb9cc5bc