if the function is
```func(a, b, c) ```
and is called as
```func(a=1, b=.., c=..)```
before this change we do not iterate on the a, b, c, since those appear in kwargs this diff fix that issue.
This function is used in _inductor/ir.py to iterate over custom op arguments and when a custom pass does changes
and pass arguments as kwargs, we do not process them.
```
for info, arg in torch._library.utils.zip_schema(schema, args, kwargs):
handle_aliasing_and_mutation(info, arg)
```
Fix https://github.com/pytorch/pytorch/issues/137057
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137311
Approved by: https://github.com/zou3519
Add a way of generating a FunctionSchema from example values because hop's schema varies even for the same hop.
We didn't use torch._C.FunctionSchema because we cannot construct the classes directly (e.g. "__init__" cannot be used for torch._C.FunctionSchema). Also extending the Basic types in c++ seems not that easy.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133521
Approved by: https://github.com/zou3519
We promise the user that these custom ops (and their kernels) are black
boxes w.r.t. torch.compile. Unfortunately Dynamo can turn itself back
on in the implementation of the custom operator, so we force it off by
disabling Dynamo
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133125
Approved by: https://github.com/ezyang
The capture_triton decorator returns a function that goes through the
triton kernel wrapper HOP. This is useful for make_fx tracing and
non-strict export. However, the HOP dispatch is slow (~1ms) and not
necessary in certain situations.
This PR skips going through the HOP dispatch for any
capture_triton-wrapped triton kernels that are registered as
implementations to a `@triton_op` custom operator. We do this by
creating a new thread-local flag that controls if the
capture_trition-wrapped triton kernel goes through HOP dispatch or not.
Test Plan:
- new test and existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132822
Approved by: https://github.com/SherlockNoMad
Summary:
Skip the warning if the fake script object doesn't implement a fake method for:
1. __obj_flatten__: for real script object only.
2. __set_state__ and __get_state__ for serialization. Don't expect it to be used during tracing.
Test Plan: Existing tests.
Reviewed By: angelayi
Differential Revision: D60478460
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132306
Approved by: https://github.com/angelayi
Made the following changes:
- mutates_args is now keyword-only and mandatory. This is to align with
torch.library.custom_op (which makes it mandatory because it's easy to
miss)
- op_name is now keyword-only. This helps the readability of the API
- updated all usages of infer_schema
This change is not BC-breaking because we introduced
torch.library.infer_schema a couple of days ago.
Test Plan:
- tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130705
Approved by: https://github.com/yushangdi
ghstack dependencies: #131777
Fixes#130284Fixes#130653
- Add `torch.library.register_vmap` to custom ops
- Add `register_vmap` for operators in ops in custom_op_db.
- Make `torch.autograd.Function` support kwarg-only kwargs for vmap
- test operators in op_db with `tests/test_vmap`.
- change `test_vmap` to allow custom `out_dim` and allow "None" in `out_dim` when testing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130589
Approved by: https://github.com/zou3519
Fixes#130284Fixes#130653
- Add `torch.library.register_vmap` to custom ops
- Add `register_vmap` for operators in ops in custom_op_db.
- Make `torch.autograd.Function` support kwarg-only kwargs for vmap
- test operators in op_db with `tests/test_vmap`.
- change `test_vmap` to allow custom `out_dim` and allow "None" in `out_dim` when testing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130589
Approved by: https://github.com/zou3519
Made the following changes:
- mutates_args is now keyword-only and mandatory. This is to align with
torch.library.custom_op (which makes it mandatory because it's easy to
miss)
- op_name is now keyword-only. This helps the readability of the API
- updated all usages of infer_schema
This change is not BC-breaking because we introduced
torch.library.infer_schema a couple of days ago.
Test Plan:
- tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130705
Approved by: https://github.com/yushangdi
This is the initial version of an API to create custom operators whose
implementations are backed by triton kernels. While user-defined triton
kernels work out-of-the-box with triton kernels, you may wish to
construct a custom operator if you need to compose with other PyTorch
subsystems, like Tensor subclasses or vmap.
I'm hoping to get design feedback on this and ship it so that we can
begin experimenting with customers.
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130637
Approved by: https://github.com/albanD
Sometimes, it could be difficult to write a fake class e.g. when the original implementation is using some third-party libraries or users are certain that the class is safe to trace with the real object.
This PR allows user to specify their intention by implementing a "safe_to_trace_with_real_obj" method on their script class.
Test Plan:
`pytest test/export/test_torchbind.py -k safe`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129586
Approved by: https://github.com/zou3519
We add torch.library.Library._register_torch_dispatch_rule. Here, a user
can provide us a specific rule to run for a specific
(torch_dispatch_class, operator) pair. The motivation is that a user
might want to extend a subclass/mode but may not have access to the
source code of the subclass/mode.
I'll make this public in a follow-up PR if we think the approach and API
is good.
Keep in mind that many subclasses will likely deliver their own open
registration solution (DTensor has register_sharding_prop_rule and NJT
has register_jagged_op); _register_torch_dispatch_rule is meant as a
catch-all open registration mechanism for when the subclass hasn't
provided anything more specific.
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130064
Approved by: https://github.com/albanD
We add torch.library.Library._register_torch_dispatch_rule. Here, a user
can provide us a specific rule to run for a specific
(torch_dispatch_class, operator) pair. The motivation is that a user
might want to extend a subclass/mode but may not have access to the
source code of the subclass/mode.
I'll make this public in a follow-up PR if we think the approach and API
is good.
Keep in mind that many subclasses will likely deliver their own open
registration solution (DTensor has register_sharding_prop_rule and NJT
has register_jagged_op); _register_torch_dispatch_rule is meant as a
catch-all open registration mechanism for when the subclass hasn't
provided anything more specific.
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130064
Approved by: https://github.com/albanD
We add torch.library.Library._register_torch_dispatch_rule. Here, a user
can provide us a specific rule to run for a specific
(torch_dispatch_class, operator) pair. The motivation is that a user
might want to extend a subclass/mode but may not have access to the
source code of the subclass/mode.
I'll make this public in a follow-up PR if we think the approach and API
is good.
Keep in mind that many subclasses will likely deliver their own open
registration solution (DTensor has register_sharding_prop_rule and NJT
has register_jagged_op); _register_torch_dispatch_rule is meant as a
catch-all open registration mechanism for when the subclass hasn't
provided anything more specific.
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130064
Approved by: https://github.com/albanD
Fixes#129389
If a user registers a device-specific implementation for an operator that accepts no Tensors, then we require the operator to have a "device: torch.device argument"
We switch on the device argument to select the correct backend to dispatch to.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129978
Approved by: https://github.com/zou3519
I run into this a lot. I can imagine that it would look opaque to users,
so made it more friendly
Old error message: "ValueError: infer_schema(func): Return has unsupported type <class 'inspect._empty'>."
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129896
Approved by: https://github.com/yushangdi
Fixes [#129370](https://github.com/pytorch/pytorch/issues/129370)
Suggest correct a List type annotation when input is in Tuple type. To avoid confusion, we only suggest a type if the type is supported.
Example:
Tuple[int, int] -> List[int]
Tuple[Tensor, Tensor, Optional[Tensor]] -> List[Optional[Tensor]]
Tuple[int, ...] -> List[int]
ValueError: infer_schema(func): Parameter y has unsupported type typing.Tuple[torch.Tensor, torch.Tensor, typing.Optional[torch.Tensor]]. Tuple type annotation is not supported. Please try to use a List instead. For example, typing.List[typing.Optional[torch.Tensor]].
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129417
Approved by: https://github.com/zou3519
As titled. Previously, __obj_flatten__ can run in a fake tensor mode, e.g. in process_input of aot_autograd, which is surrounded by a fake tensor mode. This causes the tensor ops inside __obj_flatten__ to run under fake tensor mode. However, tensors inside of script obejct are real tensors, this causes the fake tensor mode to error out saying that we need to first fakify fall the tensors (because allow_non_fake_inputs is set to True).
In this PR, we disable all the dispatch modes when running to_fake_obj.
Note that, the output of `__obj_flatten__` will be fakified and filled inside of the corresponding FakeScriptObject. So during traicng, we'll be using FakeScriptObject that has fake tensor contents.
Test Plan:
Add a new test: pytest test/export/test_torchbind.py -k test_compile_tensor_op_in_tensor_flatten
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129605
Approved by: https://github.com/angelayi
This PR does two things:
1. it duplicates the fake script object because aot_export trace the program twice. The result of tracing in the first time would cause the tracing result of second time be wrong.
2. Also add a new test for methods that return constant outputs. Before the PR, there's is no meta["val"] for these nodes because fx won't track these constants. We still need to preserve these constant return operators in the graph because torchbind objects are stateful and deleting it would remove the implicit state mutation inside of the object.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128844
Approved by: https://github.com/angelayi
This PR:
- moves some of the dtype-string utilities into ScalarType.{h, cpp}
- adds a new utility to get a mapping from dtype name to the C++ dtype
- the perser now checks if the string is a dtype name; if it is then it
pulls the c++ dtype from the mapping.
Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129189
Approved by: https://github.com/albanD
ghstack dependencies: #129177, #129178, #129179