pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Nikolay Korovaiko	f725009a48	as_strided supports SymInt; codegen supports optional SymInt (#84393 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/84393 Approved by: https://github.com/ezyang	2022-09-06 16:39:24 +00:00
YifanShenSZ	673b35c847	Better reshape with autograd support (#82754 ) (#84154 ) The original author is @YifanShenSZ and the original PR is: #82754 # Summary: Previous reshape [https://github.com/pytorch/pytorch/issues/80981](https://github.com/pytorch/pytorch/pull/80981) is ok for forward, but needs improvement for backward: need to handle "sometimes view sometimes copy" behavior. This pull request fixes it by: 1. add a new alias dispatch key `CompositeImplicitAutogradNestedTensor`, which ideally would work as nested-tensor version of `CompositeImplicitAutograd` 2. register `reshape_nested` to `reshape` by `CompositeImplicitAutogradNestedTensor` Side changes: * add contiguous memory format support to `clone_nested` * add `view_nested` * add `reshape_as_nested` Fix issue [https://github.com/pytorch/pytorch/issues/83041](https://github.com/pytorch/pytorch/issues/83041) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82754 Test Plan: Imported from GitHub, without a `Test Plan:` line. Static Docs Preview: executorch \|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D39023822/V13/executorch/)\| \|Modified Pages\| Reviewed By: albanD Differential Revision: D39023822 Pulled By: drisspg Pull Request resolved: https://github.com/pytorch/pytorch/pull/84154 Approved by: https://github.com/bdhirsh, https://github.com/albanD	2022-09-01 20:01:39 +00:00
Edward Z. Yang	5e2c23377a	LTC codegen appears to be hardcoded to only support tensors (#84355 ) Assert accordingly Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/84355 Approved by: https://github.com/wconstab	2022-09-01 16:29:39 +00:00
Edward Z. Yang	ad44670fa1	Back out "Revert D38984222: Don't introduce new overload for SymInt (#83628 )" (#84173 ) Also Back out "Revert D39075159: [acc_tensor] Use SymIntArrayRef for overloaded empty.memory_format's signature" Original commit changeset: dab4a9dba4fa Original commit changeset: dcaf16c037a9 Original Phabricator Diff: D38984222 Original Phabricator Diff: D39075159 Also update Metal registrations for C++ registration changes. Also update NNPI registration to account for tightened schema checking Differential Revision: [D39084762](https://our.internmc.facebook.com/intern/diff/D39084762/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39084762/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/84173 Approved by: https://github.com/Krovatkin	2022-08-29 18:01:07 +00:00
PyTorch MergeBot	c7edcd6968	Revert "Don't introduce new overload for SymInt (#83628 )" This reverts commit `9790d90e4b`. Reverted https://github.com/pytorch/pytorch/pull/83628 on behalf of https://github.com/malfet due to Breaks internal builds, see D39076487	2022-08-27 01:23:17 +00:00
Edward Z. Yang	9790d90e4b	Don't introduce new overload for SymInt (#83628 ) Previously, we introduced new SymInt overloads for every function we wanted. This led to a lot of boilerplate, and also a lot of confusion about how the overloads needed to be implemented. This PR takes a simpler but more risky approach: just take the original function and change its ints to SymInts. This is BC-breaking in the following ways: * The C++ API for registering implementations for aten operators will change from int64_t to SymInt whenever you make this change. Code generated registrations in PyTorch do not change as codegen handles the translation automatically, but manual registrations will need to follow the change. Typically, if you now accept a SymInt where you previously only took int64_t, you have to convert it back manually. This will definitely break XLA, see companion PR https://github.com/pytorch/xla/pull/3914 Note that not all dispatch keys get the automatic translation; all the composite keys and Meta keys are modified to take SymInt directly (because they should handle them directly), and so there are adjustments for this. This is not BC-breaking in the following ways: * The user facing C++ API remains compatible. Even if a function changes from int to SymInt, the default C++ binding still takes only ints (e.g., at::empty(IntArrayRef, ...)). To call with SymInts, you must call at::empty_symint instead. This involved adding two more signatures to CppSignatureGroup; in many cases I refactored code to iterate over all signatures in the group instead of hard-coding the two that previously existed. * This is TorchScript compatible; internally we treat SymInts as ints so there is no change to what happens at runtime in TorchScript. In particular, it's OK to reference an empty schema by its old type (using int types): as long as you're not doing string equality (which you shouldn't be), these parse to the same underlying type.
Structure of the PR: * The general strategy of this PR is that, even when you write `SymInt` inside `native_functions.yaml`, sometimes, we will treat it as if it were an `int`. This idea pervades the codegen changes, where we have a translation from SymInt to c10::SymInt or int64_t, and this is controlled by a symint kwarg which I added and then audited all call sites to decide which I wanted. Here are some of the major places where we pick one or the other: * The C++ FunctionSchema representation represents `SymInt` as `int`. There are a few places we do need to know that we actually have a SymInt and we consult `real_type()` to get the real type in this case. In particular: * When we do schema validation of C++ operator registration, we must compare against true schema (as the C++ API will provide `c10::SymInt`, and this will only be accepted if the schema is `SymInt`). This is handled with cloneWithRealTypes before we check for schema differences. * In `toIValue` argument parsing, we parse against the true schema value. For backwards compatibility reasons, I do still accept ints in many places where Layout/SymInt/etc were expected. (Well, accepting int where SymInt is expected is not BC, it's just the right logic!) * In particular, because SymInt never shows up as type() in FunctionSchema, this means that we no longer need a dedicated Tag::SymInt. This is good, because SymInts never show up in mobile anyway. * Changes to functorch/aten are mostly about tracking changes to the C++ API registration convention. Additionally, since SymInt overloads no longer exist, registrations for SymInt implementations are deleted. In many cases, the old implementations did not properly support SymInts; I did not add any new functionality with this PR, but I did try to annotate with TODOs where there is work to do.
Finally, because the signature of `native::` API changed from int to SymInt, I need to find alternative APIs for people who were directly calling these functions to call. Typically, I insert a new dispatch call when perf doesn't matter, or use `at::compositeexplicitautograd` namespace to handle other cases. * The change to `make_boxed_from_unboxed_functor.h` is so that we accept a plain IntList IValue anywhere a SymIntList is expected; these are read-only arguments so covariant typing is OK. * I change how unboxing logic works slightly. Previously, we interpreted the C++ type for Layout/etc directly as IntType JIT type, which works well because the incoming IValue is tagged as an integer. Now, we interpret the C++ type for Layout as its true type, e.g., LayoutType (change to `jit_type.h`), but then we accept an int IValue for it anyway. This makes it symmetric with SymInt, where we interpret the C++ type as SymIntType, and then accept SymInt and int IValues for it. * I renamed the `empty.names` overload to `empty_names` to make it less confusing (I kept mixing it up with the real empty overload) * I deleted the `empty.SymInt` overload, which ended up killing a pile of functions. (This was originally a separate PR but the profiler expect test was giving me grief so I folded it in.) * I deleted the LazyDynamicOpsTest tests. These were failing after these changes, and I couldn't figure out why they used to be passing: they make use of `narrow_copy` which didn't actually support SymInts; they were immediately converted to ints. * I bashed LTC into working. The patches made here are not the end of the story. The big problem is that SymInt translates into Value, but what if you have a list of SymInt? This cannot be conveniently represented in the IR today, since variadic Values are not supported.
To work around this, I translate SymInt[] into plain int[] (this is fine for tests because LTC dynamic shapes never actually worked); but this will need to be fixed for proper LTC SymInt support. The LTC codegen also looked somewhat questionable; I added comments based on my code reading. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83628 Approved by: https://github.com/albanD, https://github.com/bdhirsh	2022-08-26 01:35:40 +00:00
PyTorch MergeBot	a7edf71360	Revert "Don't introduce new overload for SymInt (#83628 )" This reverts commit `8fae7027b3`. Reverted https://github.com/pytorch/pytorch/pull/83628 on behalf of https://github.com/malfet due to breaking internal builds, see https://www.internalfb.com/diff/D38984222	2022-08-25 00:49:40 +00:00
Henry Tu	4a18d0a972	Fix LTC build warnings (#83955 ) Addresses `Wc++98-compat-extra-semi` warning from https://github.com/llvm/torch-mlir/issues/1264 by removing extraneous semicolon after autogen LTC native function definitions. ``` /home/runner/work/torch-mlir/torch-mlir/build/tools/torch-mlir/python/torch_mlir/csrc/base_lazy_backend/generated/LazyNativeFunctions.cpp:4241:6: warning: extra ';' outside of a function is incompatible with C++98 [-Wc++98-compat-extra-semi] }; ^ ``` cc: @wconstab @desertfire @ke1337 @antoniojkim Pull Request resolved: https://github.com/pytorch/pytorch/pull/83955 Approved by: https://github.com/wconstab	2022-08-24 14:33:52 +00:00
Sergii Dymchenko	591222f5d9	Fix use-dict-literal lint (#83718 ) Fix use-dict-literal pylint suggestions by changing `dict()` to `{}`. This PR should do the change for every Python file except test/jit/test_list_dict.py, where I think the intent is to test the constructor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83718 Approved by: https://github.com/albanD	2022-08-24 00:26:46 +00:00
Edward Z. Yang	8fae7027b3	Don't introduce new overload for SymInt (#83628 ) Previously, we introduced new SymInt overloads for every function we wanted. This led to a lot of boilerplate, and also a lot of confusion about how the overloads needed to be implemented. This PR takes a simpler but more risky approach: just take the original function and change its ints to SymInts. This is BC-breaking in the following ways: * The C++ API for registering implementations for aten operators will change from int64_t to SymInt whenever you make this change. Code generated registrations in PyTorch do not change as codegen handles the translation automatically, but manual registrations will need to follow the change. Typically, if you now accept a SymInt where you previously only took int64_t, you have to convert it back manually. This will definitely break XLA, see companion PR https://github.com/pytorch/xla/pull/3914 Note that not all dispatch keys get the automatic translation; all the composite keys and Meta keys are modified to take SymInt directly (because they should handle them directly), and so there are adjustments for this. This is not BC-breaking in the following ways: * The user facing C++ API remains compatible. Even if a function changes from int to SymInt, the default C++ binding still takes only ints (e.g., at::empty(IntArrayRef, ...)). To call with SymInts, you must call at::empty_symint instead. This involved adding two more signatures to CppSignatureGroup; in many cases I refactored code to iterate over all signatures in the group instead of hard-coding the two that previously existed. * This is TorchScript compatible; internally we treat SymInts as ints so there is no change to what happens at runtime in TorchScript. In particular, it's OK to reference an empty schema by its old type (using int types): as long as you're not doing string equality (which you shouldn't be), these parse to the same underlying type.
Structure of the PR: * The general strategy of this PR is that, even when you write `SymInt` inside `native_functions.yaml`, sometimes, we will treat it as if it were an `int`. This idea pervades the codegen changes, where we have a translation from SymInt to c10::SymInt or int64_t, and this is controlled by a symint kwarg which I added and then audited all call sites to decide which I wanted. Here are some of the major places where we pick one or the other: * The C++ FunctionSchema representation represents `SymInt` as `int`. There are a few places we do need to know that we actually have a SymInt and we consult `real_type()` to get the real type in this case. In particular: * When we do schema validation of C++ operator registration, we must compare against true schema (as the C++ API will provide `c10::SymInt`, and this will only be accepted if the schema is `SymInt`). This is handled with cloneWithRealTypes before we check for schema differences. * In `toIValue` argument parsing, we parse against the true schema value. For backwards compatibility reasons, I do still accept ints in many places where Layout/SymInt/etc were expected. (Well, accepting int where SymInt is expected is not BC, it's just the right logic!) * In particular, because SymInt never shows up as type() in FunctionSchema, this means that we no longer need a dedicated Tag::SymInt. This is good, because SymInts never show up in mobile anyway. * Changes to functorch/aten are mostly about tracking changes to the C++ API registration convention. Additionally, since SymInt overloads no longer exist, registrations for SymInt implementations are deleted. In many cases, the old implementations did not properly support SymInts; I did not add any new functionality with this PR, but I did try to annotate with TODOs where there is work to do.
Finally, because the signature of `native::` API changed from int to SymInt, I need to find alternative APIs for people who were directly calling these functions to call. Typically, I insert a new dispatch call when perf doesn't matter, or use `at::compositeexplicitautograd` namespace to handle other cases. * The change to `make_boxed_from_unboxed_functor.h` is so that we accept a plain IntList IValue anywhere a SymIntList is expected; these are read-only arguments so covariant typing is OK. * I change how unboxing logic works slightly. Previously, we interpreted the C++ type for Layout/etc directly as IntType JIT type, which works well because the incoming IValue is tagged as an integer. Now, we interpret the C++ type for Layout as its true type, e.g., LayoutType (change to `jit_type.h`), but then we accept an int IValue for it anyway. This makes it symmetric with SymInt, where we interpret the C++ type as SymIntType, and then accept SymInt and int IValues for it. * I renamed the `empty.names` overload to `empty_names` to make it less confusing (I kept mixing it up with the real empty overload) * I deleted the `empty.SymInt` overload, which ended up killing a pile of functions. (This was originally a separate PR but the profiler expect test was giving me grief so I folded it in.) * I deleted the LazyDynamicOpsTest tests. These were failing after these changes, and I couldn't figure out why they used to be passing: they make use of `narrow_copy` which didn't actually support SymInts; they were immediately converted to ints. * I bashed LTC into working. The patches made here are not the end of the story. The big problem is that SymInt translates into Value, but what if you have a list of SymInt? This cannot be conveniently represented in the IR today, since variadic Values are not supported.
To work around this, I translate SymInt[] into plain int[] (this is fine for tests because LTC dynamic shapes never actually worked); but this will need to be fixed for proper LTC SymInt support. The LTC codegen also looked somewhat questionable; I added comments based on my code reading. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83628 Approved by: https://github.com/albanD, https://github.com/bdhirsh	2022-08-23 22:04:07 +00:00
John Clow	eff28d61c9	[JIT SSA] Allow updating shape functions without recompilation (#83629 ) This avoids extra round trips, and avoids confusion about having to manually pull in the latest copy of the shape_functions.py file. This also fixes the cases where people pull in the wrong version of the file. This can happen in cases such as when developers run `python setup.py install` instead of `python setup.py develop` to generate their current copy of PyTorch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83629 Approved by: https://github.com/davidberard98	2022-08-22 18:03:44 +00:00
Edward Z. Yang	329deb9757	Refactor is_X_like, better invariant checking for SymInt overload (#83668 ) Add is_symint_like, by way of is_base_ty_like which generalizes the pattern for is_tensor_like and is_generator_like. Now that we can query if a signature contains a SymInt, we can enforce that you must name the overload with SymInt if the signature contains SymInt. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83668 Approved by: https://github.com/bdhirsh, https://github.com/larryliu0820	2022-08-19 23:30:54 +00:00
Edward Z. Yang	0ec7fc13d6	Refactor CppSignatureGroup to collect signatures as list. (#83667 ) This makes it easier to add more signatures to the signature group, as relevant logic which needs to run for each signature no longer needs to be adjusted. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83667 Approved by: https://github.com/larryliu0820, https://github.com/bdhirsh	2022-08-19 16:00:33 +00:00
Edward Z. Yang	9152144944	Coverage for nondeterministic_seeded, respect it in constant prop (#83650 ) - nondeterministic_seeded was not applied to enough functions. I added some heuristics to codegen for identifying functions that are likely to be random and added a bunch of these tags to functions. Not sure I got all of them. - Don't constant propagate through nondeterministic functions in FX tracing. It would be better to do some testing for the tag but this would be quite an effort. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83650 Approved by: https://github.com/bdhirsh, https://github.com/eellison	2022-08-18 22:18:10 +00:00
Mengwei Liu	badbdb0330	[torchgen] Relax the restriction on number of custom namespaces (#83580 ) Summary: We started to see use cases that involve more than 1 custom namespace living within the same yaml file. Hence this relaxes the restriction that 1 yaml file can only have 1 custom namespace other than `aten`. Updated unit test as well. Differential Revision: D38775685 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83580 Approved by: https://github.com/JacobSzwejbka	2022-08-18 04:47:13 +00:00
Larry Liu	11d4d91bdc	[torchgen] Add logic in annotation parser to accept alias set (#83501 ) Extending the current regex in `model.py` to support annotation alias set. See issue #83214. Ideally we should have a full fledged lexer similar to `schema_type_parser.cpp`, since regex can be more and more difficult to read if we add more support to it. Adding this to unblock this issue for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83501 Approved by: https://github.com/SherlockNoMad	2022-08-17 07:04:25 +00:00
Nikolay Korovaiko	759c37a4f4	make sure arguments are tuples otherwise they won't be hashable (#83342 ) make sure arguments are tuples, otherwise they won't be hashable if used in autograd.py or any other place that uses dictionaries for that matter Pull Request resolved: https://github.com/pytorch/pytorch/pull/83342 Approved by: https://github.com/bdhirsh, https://github.com/albanD	2022-08-15 19:12:15 +00:00
Mengwei Liu	d0d6b1f222	[torchgen] Generate out variant for functional operator (#81437 ) Summary: Previously we didn't generate an out variant (both schema and kernel) for an operator with a functional variant only. This adds support for that and adds a test. ## Changes on `native_function_generation.py` We are generating an out variant for all functional variants if possible. This PR introduces a lot of newly generated out variants and `native_functions.yaml` needs to incorporate the changes by adding `autogen` keywords. The logic for determining what operators we should generate an out variant for is the following: 1. No existing out variant for this `NativeFunction` 2. Contains an existing in place, mutable or functional variant 3. Contains at least 1 tensor like return(s) For operators matching the first two conditions but failing the third, I listed them in `FUNCTIONAL_OPS_THAT_CANNOT_GET_AN_OUT_VARIANT`. ## Special handling The following operators satisfy all 3 criteria above but we chose to not autogen them, for the following reasons. * `mkldnn_adaptive_avg_pool2d`, the generated out variant `mkldnn_adaptive_avg_pool2d.out` collides with the `mkldnn_adaptive_avg_pool2d_out` kernel in the `adaptive_avg_pool2d.out` operator. I manually created `mkldnn_adaptive_avg_pool2d.out` and renamed `mkldnn_adaptive_avg_pool2d_out` to `mkldnn_adaptive_avg_pool2d_out_stub`. * `min`, `max` and `mean`. There already exist `min.out`, `max.out` and `mean.out` but they have different semantics from the functional ones. I manually created `min.unary_out`, `max.unary_out` and `mean.dtype_out` to disambiguate. ## Autograd Changes We introduced logic to not match derivatives info in `derivatives.yaml` to out variants, since we are generating `NOT_IMPLEMENTED` kernels for those out variants anyway. The issue we are seeing with the original logic is that it doesn't handle `TensorOption` arguments really well. For example we have these two operators: * `_to_copy(Tensor self, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None, bool non_blocking=False, MemoryFormat? memory_format=None) -> Tensor` * `_to_copy.out(Tensor self, *, bool non_blocking=False, MemoryFormat? memory_format=None, Tensor(a!) out) -> Tensor(a!)` If we use `_to_copy` derivative info, there will be a compilation error since `dtype` is missing from the `_to_copy.out` signature. Test Plan: Rely on unit test Differential Revision: D37832342 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81437 Approved by: https://github.com/iseeyuan, https://github.com/bdhirsh	2022-08-13 05:44:53 +00:00
Mengwei Liu	c322fc03a1	[torchgen] Fix selective build error on custom namespace (#83141 ) Summary: Currently `SelectiveBuilder` is hardcoding namespace `aten` for operators. This is not working anymore since operators started to have custom namespaces. This fixes it. Test Plan: Rely on newly added unit test Differential Revision: D38565527 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83141 Approved by: https://github.com/JacobSzwejbka	2022-08-10 21:27:05 +00:00
Mikayla Gawarecki	e3e33cfae0	Enable codegen of per-dispatch key derivative formulas in derivatives.yaml (#82801 ) `derivatives.yaml` can now take a `dispatch` entry which registers per-autograd dispatch key derivatives such as ``` name: foo(Tensor self, Tensor y) -> Tensor dispatch: Default: x: grad y: grad.expand(y.sizes()) AutogradNestedTensor: x: grad y: NestedTensor_foo_backward(grad, y) output_differentiability: [True] ``` However the old schema where there is no `dispatch` entry is still supported. Would greatly appreciate feedback on how to improve the testing strategy of this PR, currently have registered an aten test op in TestOps.cpp with dummy gradients in derivatives.yaml and have some tests in test_autograd.py:TestAutogradMultipleDispatch but I am not sure whether these are sufficiently rigorous. Additionally, this PR also makes the assumption that sets like [VIEW_FUNCTIONS](`ff5399e528/tools/autograd/gen_inplace_or_view_type.py (L60)`) are per-native-function and not per-native-function-and-dispatch-key. I'm not sure whether this is necessarily the case; would there ever be a situation where, e.g., a nested_tensor op is a view op but the aten function is not, or vice versa? Pull Request resolved: https://github.com/pytorch/pytorch/pull/82801 Approved by: https://github.com/bhosmer, https://github.com/albanD	2022-08-10 19:26:29 +00:00
John Clow	ab76862b7a	Editing gen_jit_shape_functions to make lintrunner happy (#79571 ) Somehow even with clang-format off, it was unhappy with this line >>> Lint for torch/csrc/jit/runtime/serialized_shape_function_registry.cpp: Warning (CLANGFORMAT) format See https://clang.llvm.org/docs/ClangFormat.html. Run `lintrunner -a` to apply this patch. You can run `lintrunner -a` to apply this patch. 2855 2855 \| return shape_mappings; 2856 2856 \| } 2857 2857 \| 2857 \|- 2859 2858 \| // clang-format on 2860 2859 \| 2861 2860 \| } // namespace jit Note that there is no changes to `serialized_shape_function_registry.cpp` in this diff because I had to manually run `lintrunner` to make it format the code correctly in the previous diff so that we can land it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79571 Approved by: https://github.com/eellison	2022-08-10 18:20:18 +00:00
Bin Bao	7b39406526	[LTC] Pass a BackendDevice parameter into GetIrValueForScalarFromCodegen (#82970 ) Summary: Currently GetIrValueForScalarFromCodegen uses CPU as the default backend device for scalars, but we should make it a backend-dependent decision. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82970 Approved by: https://github.com/Krovatkin, https://github.com/JackCaoG	2022-08-10 03:59:25 +00:00
soulitzer	b55f9047e1	Add forward AD support for elu_, celu_, selu_ (#83080 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83080 Approved by: https://github.com/albanD	2022-08-09 20:15:44 +00:00
Shunting Zhang	943553965e	support custom class in torchgen schema parser (#82925 ) Differential Revision: [D38480514](https://our.internmc.facebook.com/intern/diff/D38480514/) The torchgen schema parser does not yet support parsing function schemas that use a custom class. Here is an example: ``` quantized::conv2d_relu.new(Tensor qx, __torch__.torch.classes.quantized.Conv2dPackedParamsBase packed_weight, float output_scale, int output_zero_point) -> (Tensor) ``` This PR parses the custom class name and encapsulates it in an object of CustomClassType. The only thing we need right now is to store the string class name and return it in the `__str__` method. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82925 Approved by: https://github.com/ezyang, https://github.com/bdhirsh	2022-08-08 22:24:43 +00:00
Peter Bell	4f255dbfb3	Remove manual bindings for arange (#81380 ) The functional variant of one of the `arange` overloads has a schema mismatch with the out variant. The functional one has `Scalar step`, but the corresponding out variant has `Scalar step=1`. This isn't allowed, so it had to be special-cased in the python codegen and manually bound. This adds the default `step` value to the functional overload and removes the special-casing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81380 Approved by: https://github.com/ngimel	2022-08-07 00:10:27 +00:00
Peter Bell	2c2278a960	Make python TensorOption signatures consistent with JIT schemas (#82241 ) Fixes #81774 `TensorOptions` arguments in the JIT schema are optional, but in the Python API these were being translated to non-optional but with a default value. This change makes the arguments accept `None` for consistency with the JIT schema. However, it also means that `dtype=c10::nullopt` was previously completely untested so this also fixes several related bugs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82241 Approved by: https://github.com/ngimel	2022-08-07 00:10:27 +00:00
Mengwei Liu	406ce692ca	[torchgen] Generate wrapper functions under custom namespaces (#81744 ) Summary: A follow up of #81581. Before these 2 PRs, if an operator with custom kernel namespace is added to `native_functions.yaml` (or any other yaml consumed by `torchgen`), although we are able to recognize the custom kernel in files such as `NativeFunctions.h` and `RegisterCPU.cpp`, we still generate backend specific wrappers under the hardcoded `at` namespace. This changes the behavior, by generating wrapper functions under custom namespaces. For example, if the entries in yaml file looks like: ``` - func: op_1(Tensor(a) self) -> Tensor(a) dispatch: CPU: at::op_1_kernel # ATen kernel - func: op_2(Tensor(a) self) -> Tensor(a) dispatch: CPU: custom::op_2_kernel # custom kernel ``` We generate the following code for `CPUFunctions_inl.h` and `RegisterCPU.cpp`: `CPUFunctions_inl.h`: ``` namespace at { namespace cpu { TORCH_API at::Tensor & op_1(const at::Tensor & self); } // namespace cpu } // namespace at namespace custom { namespace cpu { TORCH_API at::Tensor & op_2(const at::Tensor & self); } // namespace cpu } // namespace custom ``` Notice the difference between `at::cpu` and `custom::cpu`. Then the definition for these can be found in `RegisterCPU.cpp`. 
`RegisterCPU.cpp`: ``` #include "CPUFunctions.h" namespace at { namespace { at::Tensor & wrapper_op_1(const at::Tensor & self) { // No device check // DeviceGuard omitted return at::native::op_1_kernel(self); } } // anonymous namespace TORCH_LIBRARY_IMPL(aten, CPU, m) { m.impl("op_1", TORCH_FN(wrapper_op_1)); } namespace cpu { at::Tensor & op_1(at::Tensor & self) { return wrapper_op_1(self); } } // namespace cpu } // namespace at namespace custom { namespace { at::Tensor & wrapper_op_2(const at::Tensor & self) { // No device check // DeviceGuard omitted return at::native::op_2_kernel(self); } } // anonymous namespace TORCH_LIBRARY_IMPL(aten, CPU, m) { m.impl("op_2", TORCH_FN(wrapper_op_2)); } namespace cpu { at::Tensor & op_2(at::Tensor & self) { return wrapper_op_2(self); } } // namespace cpu } // namespace custom ``` The benefit for this change is that it unifies all the namespaces derived from custom ops. In the example above, there are: 1. `custom::native` for kernels 2. `custom::<dispatch_key>` e.g., `custom::cpu` for wrappers This customized operator will have nothing to do with `at::native`, `at::cpu` etc. Test Plan: This is very hard to test. I will refactor this logic, abstract out some layers so it's testable. Will do it in coming PRs Differential Revision: D37972772 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81744 Approved by: https://github.com/bdhirsh	2022-08-04 07:48:44 +00:00
Brian Hirsh	684ce1b0bc	add inplace_view tag to resize_() (#82667 ) `resize_()` is annoying because it needs special casing for functionalization. It's technically an inplace-view op, but it can't really have a pure view variant, since calling resize_() might bust the old storage. I gave it an `inplace_view` tag so that stuff like `FakeTensor` that relies on tags will pick it up properly, which required jumping through some codegen hoops. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82667 Approved by: https://github.com/eellison	2022-08-03 18:13:00 +00:00
Peter Bell	afafd16671	Lintrunner: Run mypy-strict on torchgen (#82576 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82576 Approved by: https://github.com/ezyang	2022-08-01 19:14:27 +00:00
Peter Bell	ba84e9662e	Use OrderedSet in ufunc codegen (#82567 ) Follow up from https://github.com/pytorch/pytorch/pull/82536#discussion_r934000916 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82567 Approved by: https://github.com/ezyang	2022-08-01 18:11:06 +00:00
Edward Z. Yang	50e8abbcad	Change SymIntNode into an intrusive pointer (#82548 ) This will make the pointer type a single word, which is important for packing it into an int64_t. This time, this diff doesn't segfault when you build with DEBUG mode; more details at https://github.com/pybind/pybind11/issues/4099 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82548 Approved by: https://github.com/albanD	2022-08-01 15:07:21 +00:00
Peter Bell	53f56894ae	Fix nondeterminism in torchgen (#82536 ) Closes #82320 The iteration order of a `set` can change from run to run, resulting in real content changes to generated files and therefore unnecessary rebuilding. The fix is to use a sort to give a repeatable iteration order. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82536 Approved by: https://github.com/ezyang	2022-07-31 12:58:10 +00:00
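The fix described in #82536 can be illustrated with a minimal sketch. This is not torchgen's actual code: `emit_includes` and the header names are hypothetical, and the only point is that sorting a `set` before iterating gives the same output on every run.

```python
from typing import Iterable, List


def emit_includes(headers: Iterable[str]) -> List[str]:
    """Render one #include line per header in a repeatable order.

    Hypothetical helper for illustration only; not part of torchgen.
    """
    # Deduplicate with a set, then sort. Iterating the set directly could
    # yield a different order from run to run (string hashing is randomized
    # per interpreter process), which would change the generated file.
    return [f"#include <{h}>" for h in sorted(set(headers))]


# Illustrative input with a duplicate entry.
lines = emit_includes(["ops/mul.h", "ops/add.h", "ops/mul.h"])
```

Because the order no longer depends on hashing, regenerating the file with unchanged inputs yields byte-identical content and should not trigger a rebuild.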
Mengwei Liu	301fe8c27d	[torchgen] Fix multiple backends with custom namespace (#82133 ) Summary: Some quantized operators need the `QuantizedCPU` backend. Due to an issue in namespace checking, if a native function has two backends as well as a custom namespace, codegen currently hits an assertion error. This PR fixes this issue. The root cause is that codegen right now asserts that a native function should only have one namespace. The current behavior is that if a native function is not found in a `BackendIndex`, we use the default namespace for that backend, for fallback kernels. However, that default namespace may not be listed in the yaml file and it should not be counted when checking if we have two different namespaces for that backend. In our error case, we have 2 `BackendIndex`, one for `QuantizedCPU` and one for `CPU`. The native function doesn't have a kernel in `QuantizedCPU` but we still use a default namespace (`at::native`) for it. Since we have a custom namespace for dispatch key `CPU`, we ran into the assertion error. This PR changes the assertion criteria. We only error out if a namespace has two or more kernels and they have two or more different namespaces. Test Plan: rely on newly added unit test Differential Revision: D38101345 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82133 Approved by: https://github.com/iseeyuan	2022-07-29 22:53:58 +00:00
PyTorch MergeBot	3b9cbb1738	Revert "Change SymIntNode into an intrusive pointer (#82432 )" This reverts commit `7be44f8158`. Reverted https://github.com/pytorch/pytorch/pull/82432 on behalf of https://github.com/ezyang due to segfaults on test but not caught in CI	2022-07-29 20:08:59 +00:00
Edward Z. Yang	7be44f8158	Change SymIntNode into an intrusive pointer (#82432 ) This will make the pointer type a single word, which is important for packing it into an int64_t Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82432 Approved by: https://github.com/albanD, https://github.com/Krovatkin	2022-07-29 17:32:54 +00:00
Peter Bell	ba4727d4e5	Codegen: Parse deprecated signatures as a full FunctionSchema (#82179 ) Deprecated signatures are currently "parsed" manually to find the relative order of the argument names and all other information is inferred from the aten schema for the non-deprecated overload. However, this leads to problems if the argument names don't match or if there are multiple candidates that match the ATen function call. Instead, this makes the deprecated function a full FunctionSchema and so the entire python signature comes solely from the deprecated schema, with the `aten:` clause only used for the dispatch lambda call. I have confirmed locally that there is no change to `python_torch_functionsEverything.cpp`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82179 Approved by: https://github.com/albanD	2022-07-29 17:19:54 +00:00
Edward Z. Yang	2f95b61cea	Revert "Revert "Make factory functions CompositeExplicitAutograd (#82251 )"" (#82470 ) This reverts commit `1df307f334`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82470 Approved by: https://github.com/zou3519	2022-07-29 17:06:07 +00:00
PyTorch MergeBot	1df307f334	Revert "Make factory functions CompositeExplicitAutograd (#82251 )" This reverts commit `9943ca3ce6`. Reverted https://github.com/pytorch/pytorch/pull/82251 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally	2022-07-29 03:05:59 +00:00
Edward Z. Yang	fd5ac1e6b5	Rename SymbolicIntNode to SymIntNodeImpl (#82350 ) Done via ``` git grep -l 'SymbolicIntNode' \| xargs sed -i 's/SymbolicIntNode/SymIntNodeImpl/g' ``` Reasoning for the change: * Sym is shorter than Symbolic, and consistent with SymInt * You usually will deal in shared_ptr<...>, so we're going to reserve the shorter name (SymIntNode) for the shared pointer. But I don't want to update the Python name, so afterwards I ran ``` git grep -l _C.SymIntNodeImpl \| xargs sed -i 's/_C.SymIntNodeImpl/_C.SymIntNode/' ``` and manually fixed up the binding code Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82350 Approved by: https://github.com/Krovatkin	2022-07-28 18:27:45 +00:00
Edward Z. Yang	9943ca3ce6	Make factory functions CompositeExplicitAutograd (#82251 ) This also makes them not decompose when we switch Python key. Note that CompositeExplicitAutogradNonFunctional may be overly conservative for some implementations (which actually call into other functional ops), but for now I just uniformly apply this everywhere to avoid errors. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82251 Approved by: https://github.com/bdhirsh, https://github.com/zou3519	2022-07-28 18:18:51 +00:00
Richard Zou	2d5318434e	Generate vmap plumbing on all native_functions, not just ones in allowlist (#82352 ) Motivation - The initial motivation for the allowlist is that we were checking in VmapGeneratedPlumbing.h to pytorch/functorch but people were changing schemas of operators in pytorch/pytorch. The allowlist helped reduce the number of collisions (because people change schemas of more operators than we had in the allowlist). This is no longer a problem because functorch is in the pytorch/pytorch repo - Avoid merge conflicts. Multiple people editing the allowlist leads to merge conflicts; getting rid of that alleviates it. Test Plan: - wait for CI Pull Request resolved: https://github.com/pytorch/pytorch/pull/82352 Approved by: https://github.com/ezyang	2022-07-27 20:39:37 +00:00
Richard Zou	5c92777307	Stop checking in VmapGeneratedPlumbing.h (#82351 ) This PR changes VmapGeneratedPlumbing.h to be generated by torchgen. The output file is ATen/VmapGeneratedPlumbing.h. Why generate this file inside PyTorch codegen instead of a separate step in functorch? - I can't figure out how to get functorch's fbcode target to generate - functorch's build system will, in the mid-term, be absorbed into pytorch's build system, so I don't want to do the extra work of adding a step to the functorch build process. Test Plan: - build pytorch, build functorch Pull Request resolved: https://github.com/pytorch/pytorch/pull/82351 Approved by: https://github.com/ezyang	2022-07-27 20:39:37 +00:00
Edward Z. Yang	d38ffa6a4c	Make all of new_/_like factory functions composite explicit autograd (#82238 ) Once CompositeImplicitAutograd gets registered to Python key, this will ensure that tensor subclasses can interpose on these functions directly rather than getting decomposed. We prefer not decomposing as these functions are functional, but their implementations use inplace operations (and are thus more difficult to deal with, unless you use functionalization.) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82238 Approved by: https://github.com/zou3519, https://github.com/bdhirsh	2022-07-27 18:33:46 +00:00
Nikolay Korovaiko	d2c47d559c	Revert "Revert "Enabling SymInt in autograd; take 3 (#81145 )"" ; make sure is_intlist checks for symintnodes (#82189 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82189 Approved by: https://github.com/ezyang	2022-07-26 20:47:11 +00:00
Nikolay Korovaiko	30e74be784	a new section for ir generation (#81847 ) This is to get a conversation started. * @JackCaoG we could add attributes to items in `ir_codegen` section to customize IR generation logic (e.g. not generating `::Lower`). Though it could be a bit tricky to thread it through. * Adding an extra argument to `map_codegen` to filter native functions out seems like a step in the right direction. Otherwise, it's a bit confusing how do we go from a full list to a codegen list. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81847 Approved by: https://github.com/JackCaoG, https://github.com/wconstab, https://github.com/bdhirsh	2022-07-26 20:39:07 +00:00
Will Constable	4f34cd6d1e	Replace all CHECK_ and DCHECK_ with TORCH_* macros (#82032 ) Avoid exposing defines that conflict with google logging, since this blocks external usage of libtorch in certain cases. All the 'interesting' changes should be in these two files, and the rest should just be mechanical changes via sed. c10/util/logging_is_not_google_glog.h c10/util/logging_is_google_glog.h Fixes https://github.com/pytorch/pytorch/issues/81415 cc @miladm @malfet Pull Request resolved: https://github.com/pytorch/pytorch/pull/82032 Approved by: https://github.com/soumith, https://github.com/miladm	2022-07-26 01:20:44 +00:00
Wei-Sheng Chin	64094d81fe	Remove unused line (#82019 ) As title. #80251 introduced a new branch but forgot to delete the old one. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82019 Approved by: https://github.com/ezyang	2022-07-24 16:30:37 +00:00
PyTorch MergeBot	c078476eb0	Revert "Enabling SymInt in autograd; take 3 (#81145 )" This reverts commit `032facd6e6`. Reverted https://github.com/pytorch/pytorch/pull/81145 on behalf of https://github.com/jeanschmidt due to breaking internal builds	2022-07-22 11:15:20 +00:00
Nikolay Korovaiko	032facd6e6	Enabling SymInt in autograd; take 3 (#81145 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81145 Approved by: https://github.com/ezyang	2022-07-22 00:14:50 +00:00
Edward Z. Yang	6f0c253956	Add sparse, quantized and nested tensor meta support to codegen (#81793 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/81793 Approved by: https://github.com/cpuhrsch, https://github.com/bdhirsh	2022-07-21 21:23:56 +00:00
Edward Z. Yang	ec91f42d79	Sync up DispatchKey in model with C++ (#81770 ) I also dropped some keys that are not useful for codegen. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/81770 Approved by: https://github.com/bdhirsh	2022-07-21 21:23:54 +00:00
Peter Bell	5f2e31797a	Replace _dtype_default_type_hack (#81479 ) Currently any function with a default dtype other than None has to be manually entered into this function. Instead, this reads the default directly from `native_functions.yaml`. In order to do this, I also change `PythonSignatureGroup` to take `tensor_options_args` from the functional variant since the out variant doesn't actually have tensor options arguments to take the default values from. Also note that we need to use `default_init` instead of `default` because the out argument version doesn't have a `tensor_options` argument to extract the default value from and so the PythonSignature objects wouldn't match. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81479 Approved by: https://github.com/albanD	2022-07-21 16:42:49 +00:00
Brian Hirsh	ed0091f8db	fix view_copy kernel striding check logic (#81553 ) The composite kernel for `view_copy` that we generate is special-cased a bit for efficiency to avoid having to do extra clones in some cases. That logic was slightly wrong though, and is fixed here (it needs to mirror the logic in `reshape()`). It manifested as a debug assert firing for Lazy Tensor, which I confirmed no longer fires when running this script: ``` # ran with "python test_ltc_only_torch.py --device=lazy --sync=1 --nvtx=1" import torch import torch._lazy from torch._lazy.ts_backend import init as init_ts_backend init_ts_backend() torch.manual_seed(42) from transformers import BertForSequenceClassification def parse_args(): import argparse parser = argparse.ArgumentParser(description='') parser.add_argument('--device', type=str, default='cuda') parser.add_argument('--sync', type=bool, default=False) parser.add_argument('--nvtx', type=bool, default=False) return parser.parse_args() args = parse_args() device = args.device model = BertForSequenceClassification.from_pretrained('bert-base-uncased', return_dict=True) from transformers import AdamW from transformers import BertTokenizer tokenizer = BertTokenizer.from_pretrained('bert-base-uncased') text_batch = ["I love Pixar.", "I don't care for Pixar."] encoding = tokenizer(text_batch, return_tensors='pt', padding=True, truncation=True) input_ids = encoding['input_ids'].to(device) attention_mask = encoding['attention_mask'].to(device) model = model.to(device) model.train() no_decay = ['bias', 'LayerNorm.weight'] optimizer_grouped_parameters = [ {'params': [p for n, p in model.named_parameters() if not any(nd in n for nd in no_decay)], 'weight_decay': 0.01}, {'params': [p for n, p in model.named_parameters() if any(nd in n for nd in no_decay)], 'weight_decay': 0.0} ] optimizer = AdamW(optimizer_grouped_parameters, lr=1e-5) labels = torch.tensor([1,0]).unsqueeze(0).to(device) for _ in range(6): torch.cuda.nvtx.range_push(f'Iter{_}') 
torch.cuda.nvtx.range_push('F') outputs = model(input_ids, attention_mask=attention_mask, labels=labels) if args.sync: torch._lazy.mark_step() torch._lazy.wait_device_ops() torch.cuda.nvtx.range_pop() loss = outputs.loss torch.cuda.nvtx.range_push('B') optimizer.zero_grad() loss.backward() if args.sync: torch._lazy.mark_step() torch._lazy.wait_device_ops() torch.cuda.nvtx.range_pop() torch.cuda.nvtx.range_push('O') optimizer.step() if args.sync: torch._lazy.mark_step() torch._lazy.wait_device_ops() torch.cuda.nvtx.range_pop() torch.cuda.nvtx.range_pop() torch._lazy.mark_step() torch._lazy.wait_device_ops() ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/81553 Approved by: https://github.com/ezyang	2022-07-19 13:47:05 +00:00
Mengwei Liu	9f873ed7c8	[torchgen] support codegen'd C++ API for a mixture of namespaces (#81581 ) Summary: In #77710 I introduced a hack to allow static dispatch to take namespaces. Now that namespaces are part of ops and kernels, we don't have to pass a namespace into `static_dispatch()`; instead we generate ops with the kernel namespace for `Functions.h`. After this diff, if we have a yaml file looking like this: ``` - func: op_1(Tensor(a) self) -> Tensor(a) dispatch: CPU: at::op_1_kernel # ATen kernel - func: op_2(Tensor(a) self) -> Tensor(a) dispatch: CPU: custom::op_2_kernel # custom kernel ``` `Functions.h` will contain the following C++ APIs: ``` TORCH_API inline at::Tensor & op_1(at::Tensor & self) { return at::cpu::op_1_kernel(self); } TORCH_API inline at::Tensor & op_2(at::Tensor & self) { return custom::cpu::op_2_kernel(self); } ``` Test Plan: Rely on CI Differential Revision: D37900753 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81581 Approved by: https://github.com/iseeyuan	2022-07-19 07:46:36 +00:00
Sergii Dymchenko	19a296486b	Remove unused local variables from generator.py (#81505 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81505 Approved by: https://github.com/seemethere	2022-07-18 16:26:59 +00:00
Huy Do	a4647cc1fa	Apply ufmt linter to all py files under torchgen (#81570 ) Previous batches: * https://github.com/pytorch/pytorch/pull/81285 * https://github.com/pytorch/pytorch/pull/81335 We have multiple batches here to minimize merge conflicts and reviewing process. Once everything has been formatted by ufmt (black+usort), the current black linter will be removed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81570 Approved by: https://github.com/ezyang	2022-07-16 03:52:25 +00:00
Edward Z. Yang	fca03eeec1	Make proxy tensor support item() calls on torch.tensor constants (#81192 ) This PR is doing a few interrelated things, all of which are necessary to get correctness. Read the comment in torch/fx/experimental/proxy_tensor.py for the high level overview. Let's break down the parts of this PR: * Bug fix where `enable_torch_dispatch_mode` with `None` doesn't work. This makes `enable_torch_dispatch_mode(current_mode.inner)` work, which is the basis for how we temporarily disable fake tensor mode. * Bug fix for when fake tensor mode is combined with a non-mode tensor subclass. This actually could be ablated from this PR but it affects where the logic for allowing non fake tensor inputs with lift goes, so it's all in here in one go. There are some relevant tests for the fix in fake tensor, but it turns out I didn't need this because I'm always using proxy tensors as a mode (which ensures the ordering is right.) * New `lift_fresh` view operator. Note that like lift, we have to manually write the functionalize kernel for these functions. * The actual change, which is to save constants when we see them in the proxy tensor mode, and then propagate them as we go (because otherwise you'll handle mutations on constants incorrectly; see test.) This is mildly BC-breaking if anyone was previously interposing on at::lift, but this operator was relatively new and I checked functorch which has no explicit reference to lift. So I think it should not be too disruptive. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/81192 Approved by: https://github.com/samdow, https://github.com/bdhirsh	2022-07-15 03:53:40 +00:00
Sergii Dymchenko	3dea7fe6f3	Remove unused local variables from gen.py (#81508 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81508 Approved by: https://github.com/huydhn	2022-07-15 01:26:32 +00:00
Tim Gates	3a87b47de9	docs: Fix a few typos (#81435 ) There are small typos in: - caffe2/python/recurrent.py - test/distributed/test_c10d_nccl.py - test/test_fx.py - torch/csrc/jit/runtime/autodiff.cpp - torchgen/gen.py Fixes: - Should read `propagation` rather than `propogation`. - Should read `multiplied` rather than `multuplied`. - Should read `eliminate` rather than `elminate`. - Should read `dispatcher` rather than `disaptcher`. Semi-automated pull request generated by https://github.com/timgates42/meticulous/blob/master/docs/NOTE.md Pull Request resolved: https://github.com/pytorch/pytorch/pull/81435 Approved by: https://github.com/ngimel	2022-07-14 04:20:26 +00:00
Huy Do	8f07b7a069	Fix circular import error in torchgen (#81355 ) This also formats `tools/pyi/gen_pyi.py` with `usort` to test the fix because that is how the bug was discovered. The usort-formatted `gen_pyi.py` should work now without any issues Fixes #81294 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81355 Approved by: https://github.com/ezyang	2022-07-13 03:16:38 +00:00
Mengwei Liu	80f6d2e9e6	[torchgen] Extract out schema registration logic into a function (#80780 ) Summary: A followup to #78015 and #79733. In those PRs I introduced custom namespace support into: * `Register<DispatchKey>.cpp` * `RegisterSchema.cpp` * `NativeFunctions.h` This PR extracts out logic that generates schema registration code (used in `RegisterSchema.cpp`) into a function so that it can be easily tested and reused. Added unit test to cover the logic as well. Test Plan: Rely on newly added unit tests. Differential Revision: D37581186 Pull Request resolved: https://github.com/pytorch/pytorch/pull/80780 Approved by: https://github.com/iseeyuan	2022-07-12 21:52:42 +00:00
Brian Hirsh	f84b30f790	fix functionalization regression introduced by ProxyTorchDispatchMode, migrate testing to make_fx (#80416 ) `ProxyTorchDispatchMode` was added recently as part of `make_fx`, which was secretly causing the meta tensor calls used inside of functionalization to get baked into the graph. It also wasn't caught because the functionalization tests in core don't use `make_fx`, and the tests in functorch aren't as comprehensive. Now that `make_fx` is in core, I also ported the functionalization test infra over to use it, which would have caught the regression. This also makes the tests cleaner, since mode-based tracing lets us pick up factory functions in the trace output. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80416 Approved by: https://github.com/ezyang, https://github.com/albanD	2022-07-12 01:46:16 +00:00
Mengwei Liu	5c8a9803c8	[torchgen] Support multiple namespace in NativeFunctions.h (#79733 ) Summary: This is a follow up to #78015. This PR * introduces namespace logic for generating `NativeFunctions.h`. * adds helper function to extract namespace from string * relaxes the constraint on the levels we support for custom kernel namespace to 2 Test Plan: Yaml entry: ``` - func: unsqueeze.out(Tensor(a) self, int dim, *, Tensor(a!) out) -> Tensor(a!) variants: function device_check: NoCheck dispatch: CPU: custom_1::custom_2::unsqueeze ``` Generated `NativeFunctions.h`: ``` namespace custom_1 { namespace custom_2 { namespace native { TORCH_API at::Tensor & unsqueeze(const at::Tensor & self, int64_t dim, at::Tensor & out); } // namespace native } // namespace custom_2 } // namespace custom_1 ``` Differential Revision: D37198111 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79733 Approved by: https://github.com/bdhirsh	2022-07-08 21:56:52 +00:00
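The "helper function to extract namespace from string" mentioned in #79733 could look roughly like the sketch below. This is an illustration under stated assumptions, not the helper torchgen actually ships: the name `split_namespace`, its parameters, the `"at"` default and the error message are all hypothetical. It splits on the last `::` and enforces the two-level limit the commit describes.

```python
from typing import Tuple


def split_namespace(
    entity: str, default_namespace: str = "at", max_levels: int = 2
) -> Tuple[str, str]:
    """Split 'ns1::ns2::name' into ('ns1::ns2', 'name').

    Hypothetical helper for illustration only; not torchgen's API.
    """
    # rpartition splits on the LAST '::', so everything before it is the
    # namespace and everything after it is the bare kernel name.
    namespace, sep, name = entity.rpartition("::")
    if not sep:
        # No '::' present: fall back to the (assumed) default namespace.
        return default_namespace, name
    levels = namespace.split("::")
    if len(levels) > max_levels:
        raise ValueError(
            f"at most {max_levels} namespace levels are supported, got {entity!r}"
        )
    return namespace, name


ns, kernel = split_namespace("custom_1::custom_2::unsqueeze")
```

With the yaml entry from the test plan above, `ns` would be `custom_1::custom_2` and `kernel` would be `unsqueeze`; the generated header then nests the declaration under `custom_1::custom_2::native`.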
Brian Hirsh	960758b0b7	fix overload ambiguity with functional ops; fix _foreach op grouping (#80556 ) This should fix the last issue that @anijain2305 hit when running ResNet with TorchDynamo <> functionalization. Today if you try to call an `OpOverloadPacket` from python with some arguments, we will use the types of those arguments to perform overload resolution. With some functional variants of ops, this can be ambiguous. Today this affects just one op: `_fused_moving_avg_obs_fq_helper`, although it would potentially affect e.g. `native_batch_norm` in the future. Example: ``` # There are technically two overloads: # torch.ops.aten._fused_moving_avg_obs_fq_helper.default (returns 2 argument, mutates 4 of its inputs inplace) # torch.ops.aten._fused_moving_avg_obs_fq_helper.functional (returns 6 argument, mutates none of its inputs) # We pick the wrong one - no way to know that we should pick the functional one, just from the call site. outs = torch.ops.aten._fused_moving_avg_obs_fq_helper(a, a, a, a, a, a, a, 1.0, 0, 1, 0) # raises an error - tries to call the overload with only 2 returns return _fused_moving_avg_obs_fq_helper_functional[5] ``` Specifically, functionalization will bake `_fused_moving_avg_obs_fq_helper.functional` into the graph, but when AOTAutograd tries to compile with TorchScript, it needs to remove the overload name (TS doesn't know how to parse overload names directly, so we need to remove the overload name and let it infer the right overload at runtime later- so it picks the wrong one). The situation is pretty similar to inplace; `ops.aten.add` and `ops.aten.add_` represent two different `OverloadPacket` objects; they can't be overloads of the same op, because their schemas would be ambiguous - the alias annotations are different, but that isn't enough to disambiguate). 
In this PR, I try to fix the situation in a pretty similar way to how we handle `inplace` in the data model: `inplace` ops get their own base operator name, but they are represented as a flag inside of `BaseOperatorName` in the data model. Two other important changes that I made as part of this PR: (1) Originally, there were ~100 different `_functional` operators: e.g. we had operators named `resize.functional` and `zero.functional`. The `_functional` bit isn't actually necessary in most cases: it's only necessary for operators that also* have a `SchemaKind.mutable` variant, where `_fused_moving_avg_obs_fq_helper` is the only op that fits that description today. So I removed the unnecessary notion of "functional" from those other ops. I also added a bunch of assertions to force this restriction. I think that makes more sense in the long run, because it eliminates an unnecessary difference in the model. E.g. we don't have `add_.Tensor` and `add.Tensor_functional`. We just have `add_.Tensor` and `add.Tensor`. (2) I noticed that we actually still weren't pairing up a bunch of `_foreach` operators correctly, because their input arguments were different (`self` vs. `tensors`). Since they're private API's, I went ahead and changed the argument names directly so they get matched up. Before this PR, we were generating a separate `_foreach_add` and `_foreach_add.functional` variant in a bunch of cases, that really did the same thing (but happened to have a different name for the first argument). Pull Request resolved: https://github.com/pytorch/pytorch/pull/80556 Approved by: https://github.com/ezyang, https://github.com/albanD	2022-07-06 12:45:11 +00:00
Edward Z. Yang	805120ab57	See if we can elide TORCH_API from inline functions. (#80609 ) See https://github.com/pytorch/pytorch/issues/80604 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/80609 Approved by: https://github.com/malfet	2022-06-30 23:31:38 +00:00
Brian Hirsh	c2d395cf8e	functionalization <> LTC integration (take 3) (#80251 ) new PR for https://github.com/pytorch/pytorch/pull/75527. It looks like there's a bug in the windows CI scripts that was causing flaky failures, that disappear when I create a new PR. example failure: https://github.com/pytorch/pytorch/runs/6999272635?check_suite_focus=true Pull Request resolved: https://github.com/pytorch/pytorch/pull/80251 Approved by: https://github.com/wconstab	2022-06-26 23:10:21 +00:00
Nikita Shulga	f11cce309b	[MPS] Add equal operator (#80195 ) Which is, in essence, a composite of `eq`->`all`->`item`. `native/mps/operators/Equal.cpp` is an almost verbatim copy of `native/cuda/Equal.cpp`. Fix codegen by generating MPSFunctions headers. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80195 Approved by: https://github.com/albanD	2022-06-25 12:40:52 +00:00
Peter Bell	2c43876f64	AT_DISPATCH: Expose switch-case like macro syntax (#79978 ) This expands the `AT_DISPATCH` macros to enable writing your own `AT_DISPATCH_SWITCH` statements with multiple `AT_DISPATCH_CASE` labels. So, where previously you may have written: ```cpp if (iter.common_dtype() == kBool) { my_bool_kernel(iter); } else { AT_DISPATCH_INTEGRAL_TYPES(iter.common_dtype(), "my_kernel", [&] { ... }); } ``` You can now instead write ```cpp AT_DISPATCH_SWITCH(iter.common_dtype(), "my_kernel", AT_DISPATCH_CASE(kBool, [&] { my_kernel_bool(iter); }) AT_DISPATCH_CASE_INTEGRAL_TYPES([&] { ... }) ); ``` The macro syntax is a bit ugly, however the benefits are: - Greater flexibility, as the kernel code doesn't have to be shared for all dtypes. - Selective build and RECORD_KERNEL_FUNCTION work even for single dtype specializations such as the bool case in the example. - The compiler sees a single switch for all types, which should be easier to optimize into a jump table. - We also now get errors if the same scalar type is handled twice. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79978 Approved by: https://github.com/ezyang	2022-06-25 04:06:56 +00:00
Henry Tu	fc6b645fe2	Prevent out of bounds access to null LTC operands (#80060 ) When constructing a lazy::Node, [null operands (optional values that aren't included) are dropped](`30fb2c4aba/torch/csrc/lazy/core/ir.cpp (L82-L84)`), so it’s possible for the stored operand list to be a different length than the one that was passed into the constructor. This can become a problem during the call to `CanBeReused` in the autogen `LazyIr.h` code. For example: ``` bool CanBeReused(const torch::lazy::Value& input, const c10::optional<torch::lazy::Value>& weight, const c10::optional<torch::lazy::Value>& bias, const c10::optional<torch::lazy::Value>& running_mean, const c10::optional<torch::lazy::Value>& running_var, const bool& training, const double& momentum, const double& eps) const { size_t i = 0; std::cout << "Num operands: " << operands().size() << std::endl; return (operand(i++) == input && operand(i++) == weight.value_or(kNullValue) && operand(i++) == bias.value_or(kNullValue) && operand(i++) == running_mean.value_or(kNullValue) && operand(i++) == running_var.value_or(kNullValue) && this->training == training && this->momentum == momentum && this->eps == eps); } ``` Here we operate under the assumption that the number of operands stored in the `lazy::Node` is equal to the number of operands originally passed into the constructor. Recall that we drop any null operands though, so it’s possible to inadvertently access an invalid index at this point. This PR addresses this issue by adding a new nullable_operand method which falls back to a null value instead of producing an index error when going out of bounds. This should solve the issue found at https://github.com/pytorch/pytorch/pull/79637#issuecomment-1162044545 cc: @antoniojkim @ke1337 @wconstab @desertfire Pull Request resolved: https://github.com/pytorch/pytorch/pull/80060 Approved by: https://github.com/desertfire	2022-06-24 20:39:37 +00:00
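The bounds fallback described in #80060 can be sketched in a few lines. The real change is in C++ (`torch::lazy::Node`); the Python below is only a simplified stand-in, and the `Node` class, `NULL_VALUE` and the operand strings are hypothetical. It models null operands being dropped at construction and a `nullable_operand` accessor that returns a null value instead of failing when the index is out of range.

```python
from typing import List, Optional

# Stand-in for the C++ kNullValue sentinel (assumption for this sketch).
NULL_VALUE = None


class Node:
    """Toy model of a lazy IR node; not the real torch::lazy::Node."""

    def __init__(self, operands: List[Optional[str]]) -> None:
        # Null operands are dropped, so the stored list can be shorter
        # than the list that was passed in.
        self._operands = [op for op in operands if op is not None]

    def operand(self, i: int) -> str:
        # Plain access: raises IndexError when i is out of bounds.
        return self._operands[i]

    def nullable_operand(self, i: int) -> Optional[str]:
        # Fall back to the null value instead of raising.
        if 0 <= i < len(self._operands):
            return self._operands[i]
        return NULL_VALUE


# Three operands passed in, but only two are stored.
node = Node(["input", None, "bias"])
```

Here `node.operand(2)` would raise, while `node.nullable_operand(2)` returns the null value, which is the behavior the generated `CanBeReused` code relies on after this change.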
Max Podkorytov	bf75708ce4	[static-runtime] add nnc codegen for aten::div (#76903 ) Differential Revision: D36151087 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76903 Approved by: https://github.com/mikeiovine	2022-06-22 05:47:44 +00:00
Nikolay Korovaiko	efc7343743	Revert "Revert "Put symint overloads on a different name"" (#79680 ) This relands https://github.com/pytorch/pytorch/pull/79281 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79680 Approved by: https://github.com/malfet	2022-06-21 07:06:33 +00:00
Shunting Zhang	0d909d3cff	add a new FunctionSchema kind called scratch Pull Request resolved: https://github.com/pytorch/pytorch/pull/79659 The scratch op exposes intermediate/scratch tensors used in the kernel implementation as kernel input arguments so a memory planning algorithm can plan memory for those tensors. Differential Revision: [D37194429](https://our.internmc.facebook.com/intern/diff/D37194429/) Approved by: https://github.com/bdhirsh	2022-06-20 16:34:16 +00:00
Brian Hirsh	adf8060600	add a new alias key for functional to view op decompositions Pull Request resolved: https://github.com/pytorch/pytorch/pull/79615 Approved by: https://github.com/zou3519	2022-06-15 23:18:09 +00:00
PyTorch MergeBot	b9bb52d97b	Revert "Put symint overloads on a different name" This reverts commit `213a8fc992`. Reverted https://github.com/pytorch/pytorch/pull/79281 on behalf of https://github.com/bigfootjon due to Diff reverted internally	2022-06-15 17:15:21 +00:00
Edward Z. Yang	213a8fc992	Put symint overloads on a different name Due to implicit conversion shenanigans, having both IntArrayRef and SymIntArrayRef overloads makes {} ambiguous. While we could fix this by making a single unified type that accepts all the overloads we want, an easier fix was to just push the SymIntArrayRef overload to its own name. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/79281 Approved by: https://github.com/suo	2022-06-12 14:36:39 +00:00
anjali411	38350acf8f	Autogen Tags enum, and allow specifying tags while defining an op Pull Request resolved: https://github.com/pytorch/pytorch/pull/79322 Approved by: https://github.com/albanD	2022-06-11 00:29:32 +00:00
Mengwei Liu	24050a5801	[RFC][Codegen] Add custom namespace support (#78015 ) Summary: Add a feature that allows users to specify namespaces for operators and kernels. # Feature There's a feature request to allow the DSL to: 1. take in an operator namespace other than `aten`. 2. take in a kernel that is in a different namespace than `at::native`. For both features, we only allow a single level of namespace for the sake of simplicity. If the user specifies `custom::function` as the kernel, the codegen will depend on `custom::native::function`, where `native` is hardcoded. # Proposal For feature 1, add a `namespace` attribute to the data class `NativeFunction`. The namespace will be extracted by matching the pattern "::" on the `func` variable. For `NativeFunctionsGroup` there's an assumption that all variants (function, inplace, out) will have the same namespace. By default (if not specified) the namespace will be "aten". For feature 2, add a `namespace` attribute to the `BackendMetadata` class, and similarly match the pattern "::" on the kernel field. Remove the `cpp_namespace` field from the `register_dispatch_key` data class. By default (if not specified) the namespace for a kernel will be "at::native". Test Plan: Example yaml entries: ``` - func: custom::gelu.out(Tensor self, *, str approximate='none', Tensor(a!) out) -> Tensor(a!) structured: True structured_inherits: TensorIteratorBase device_check: NoCheck # TensorIterator python_module: nn dispatch: CPU: custom::gelu_out_cpu CUDA: custom::gelu_out_cuda MPS: custom::gelu_out_mps - func: custom::gelu_(Tensor(a!) self, *, str approximate='none') -> Tensor(a!) 
structured_delegate: gelu.out device_check: NoCheck # TensorIterator python_module: nn dispatch: NestedTensorCPU, NestedTensorCUDA: custom::NestedTensor_gelu_ - func: custom::gelu(Tensor self, *, str approximate='none') -> Tensor structured_delegate: gelu.out device_check: NoCheck # TensorIterator python_module: nn dispatch: MkldnnCPU: custom::mkldnn_gelu QuantizedCPU: custom::gelu_quantized_cpu NestedTensorCPU, NestedTensorCUDA: custom::NestedTensor_gelu ``` see generated code: `RegisterCPU.cpp`: ``` TORCH_LIBRARY_IMPL(aten, CPU, m) { ... } TORCH_LIBRARY_IMPL(custom, CPU, m) { m.impl("gelu", TORCH_FN(wrapper_gelu)); m.impl("gelu.out", TORCH_FN(wrapper_gelu_out_out)); m.impl("gelu_", TORCH_FN(wrapper_gelu_)); }; ``` ``` struct structured_gelu_out_cpu_inplace final : public custom::native::structured_gelu_out_cpu { structured_gelu_out_cpu_inplace(Tensor& self) : outputs_{std::ref(self)} {} void set_output_strided( int64_t output_idx, IntArrayRef sizes, IntArrayRef strides, TensorOptions options, DimnameList names ) override { const auto& out = outputs_[output_idx].get(); check_inplace(out, sizes, options); auto maybe_proxy = maybe_create_proxy(out, sizes, strides, options); if (C10_UNLIKELY(maybe_proxy.has_value())) { proxy_outputs_[output_idx] = c10::ExclusivelyOwned<Tensor>(std::move(maybe_proxy).value()); } if (!names.empty()) { namedinference::propagate_names(outputs_[output_idx], names); } // super must happen after, so that downstream can use maybe_get_output // to retrieve the output custom::native::structured_gelu_out_cpu::set_output_raw_strided(output_idx, sizes, strides, options, names); } void set_output_raw_strided( int64_t output_idx, IntArrayRef sizes, IntArrayRef strides, TensorOptions options, DimnameList names ) override { const auto& out = outputs_[output_idx].get(); check_inplace(out, sizes, options); if (!names.empty()) { namedinference::propagate_names(outputs_[output_idx], names); } // super must happen after, so that downstream can use 
maybe_get_output // to retrieve the output custom::native::structured_gelu_out_cpu::set_output_raw_strided(output_idx, sizes, strides, options, names); } const Tensor& maybe_get_output(int64_t output_idx) override { return proxy_outputs_[output_idx].has_value() ? proxy_outputs_[output_idx] : outputs_[output_idx].get(); } std::array<std::reference_wrapper<Tensor>, 1> outputs_; std::array<c10::optional<c10::ExclusivelyOwned<Tensor>>, 1> proxy_outputs_; }; ``` `RegisterSchema.cpp` ``` TORCH_LIBRARY(aten, m) { ... } TORCH_LIBRARY(custom, m) { m.def("gelu.out(Tensor self, *, str approximate='none', Tensor(a!) out) -> Tensor(a!)"); m.def("gelu_(Tensor(a!) self, *, str approximate='none') -> Tensor(a!)"); m.def("gelu(Tensor self, *, str approximate='none') -> Tensor"); }; ``` Differential Revision: D36558459 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78015 Approved by: https://github.com/bdhirsh	2022-06-10 21:04:36 +00:00
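The namespace rules described in the custom namespace commit (#78015) can be sketched roughly as follows. This is an illustrative sketch only, not the torchgen implementation: `split_namespace` and `kernel_cpp_namespace` are hypothetical helper names, and the only behavior taken from the commit message is the "::" matching, the single-level restriction, the hardcoded `native` component, and the `aten` / `at::native` defaults.

```python
from typing import Tuple


def split_namespace(name: str, default: str) -> Tuple[str, str]:
    """Hypothetical helper: split 'ns::rest' into (ns, rest).

    Falls back to `default` when no '::' is present, and rejects more than
    one level of namespace, as the commit message describes.
    """
    if "::" not in name:
        return default, name
    namespace, _, rest = name.partition("::")
    if "::" in rest:
        raise ValueError(f"only a single namespace level is allowed: {name!r}")
    return namespace, rest


def kernel_cpp_namespace(kernel: str) -> Tuple[str, str]:
    """Hypothetical helper: map a kernel entry to (C++ namespace, function name).

    'custom::fn' resolves to 'custom::native'; a bare 'fn' resolves to the
    default 'at::native'. The 'native' component is hardcoded.
    """
    namespace, function = split_namespace(kernel, default="at")
    return f"{namespace}::native", function


if __name__ == "__main__":
    print(split_namespace("custom::gelu_(Tensor(a!) self) -> Tensor(a!)", "aten"))
    print(kernel_cpp_namespace("custom::gelu_out_cpu"))
    print(kernel_cpp_namespace("gelu_out_cpu"))
```

Raising on a second "::" in the sketch mirrors the stated single-level restriction; the actual codegen may report the error differently.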
Brian Hirsh	7b3a0ff87a	Port `index.Tensor` to structured kernels. Tracking issue: #55070 Pull Request resolved: https://github.com/pytorch/pytorch/pull/69607 Approved by: https://github.com/bdhirsh	2022-06-10 17:27:47 +00:00
George Qi	a90f006fe5	add strides to slow path Pull Request resolved: https://github.com/pytorch/pytorch/pull/78610 Approved by: https://github.com/ezyang	2022-06-10 16:59:14 +00:00
PyTorch MergeBot	4b82ef7928	Revert "Port `index.Tensor` to structured kernels." This reverts commit `cfd84125bd`. Reverted https://github.com/pytorch/pytorch/pull/69607 on behalf of https://github.com/zengk95 due to This is breaking mac trunk tests `cfd84125bd`	2022-06-08 20:16:10 +00:00
Brian Hirsh	cfd84125bd	Port `index.Tensor` to structured kernels. Tracking issue: #55070 Pull Request resolved: https://github.com/pytorch/pytorch/pull/69607 Approved by: https://github.com/bdhirsh	2022-06-08 18:17:52 +00:00
Richard Zou	9da5defff6	Package config/template files with torchgen (#78942 ) Package config/template files with torchgen This PR packages native_functions.yaml, tags.yaml and ATen/templates with torchgen. This PR: - adds a step to setup.py to copy the relevant files over into torchgen - adds a docstring for torchgen (so `import torchgen; help(torchgen)` says something) - adds a helper function in torchgen so you can get the torchgen root directory (and figure out where the packaged files are) - changes some scripts to explicitly pass the location of torchgen, which will be helpful for the first item in the Future section. Future ====== - torchgen, when invoked from the command line, should use sources in torchgen/packaged instead of aten/src. I'm unable to do this because people (aka PyTorch CI) invoke `python -m torchgen.gen` without installing torchgen. - the source of truth for all of these files should be in torchgen. This is a bit annoying to execute on due to potential merge conflicts and dealing with merge systems - CI and testing. The way things are set up right now is really fragile; we should have a CI job for torchgen. Test Plan ========= I ran the following locally: ``` python -m torchgen.gen -s torchgen/packaged ``` and verified that it produced output files. Furthermore, I did a setup.py install and checked that the files are actually being packaged with torchgen. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78942 Approved by: https://github.com/ezyang	2022-06-07 13:33:55 +00:00
Sergii Dymchenko	0fdc1caf02	Cleanup some Python2-related code (#78864 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/78864 Approved by: https://github.com/janeyx99, https://github.com/jbschlosser	2022-06-06 17:40:02 +00:00
Brian Hirsh	67b27a7bae	generate kernels for codegend out= operators Pull Request resolved: https://github.com/pytorch/pytorch/pull/78626 Approved by: https://github.com/ezyang, https://github.com/JacobSzwejbka, https://github.com/larryliu0820	2022-06-06 15:36:28 +00:00
PyTorch MergeBot	bcb424c8cf	Fix #78675 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/78699 Approved by: https://github.com/tugsbayasgalan	2022-06-04 01:07:24 +00:00
Linbin Yu	1683a2618d	rename BUILD.buck to BUCK.oss (#78792 ) rename BUILD.buck to BUCK.oss to better reflect that it's the OSS version of BUCK build, not the one shared with Bazel Pull Request resolved: https://github.com/pytorch/pytorch/pull/78792 Approved by: https://github.com/kit1980	2022-06-03 07:23:16 +00:00
PyTorch MergeBot	954522a485	Revert "Autogen Tags enum, and allow specifying tags while defining an op" This reverts commit `9476a78f37`. Reverted https://github.com/pytorch/pytorch/pull/77313 on behalf of https://github.com/malfet due to Broke OSS buck builds, see `9476a78f37`	2022-06-03 01:53:53 +00:00
anjali411	9476a78f37	Autogen Tags enum, and allow specifying tags while defining an op Pull Request resolved: https://github.com/pytorch/pytorch/pull/77313 Approved by: https://github.com/ezyang, https://github.com/albanD	2022-06-03 01:13:44 +00:00
PyTorch MergeBot	fca1f495c2	Revert "Port `index.Tensor` to structured kernels." This reverts commit `9fe6f1baf5`. Reverted https://github.com/pytorch/pytorch/pull/69607 on behalf of https://github.com/suo due to this broke master, see: `9fe6f1baf5`	2022-06-01 00:12:15 +00:00
Brian Hirsh	9fe6f1baf5	Port `index.Tensor` to structured kernels. Tracking issue: #55070 Pull Request resolved: https://github.com/pytorch/pytorch/pull/69607 Approved by: https://github.com/bdhirsh	2022-05-31 22:15:20 +00:00
Antonio Kim	fe67dff82a	Deprecate `TSNodeLoweringInterface` (#78273 ) Fixes #78206 Deprecate `TSNodeLoweringInterface` and refactor lower functions into IR nodes. CC: @wconstab @desertfire Pull Request resolved: https://github.com/pytorch/pytorch/pull/78273 Approved by: https://github.com/wconstab	2022-05-31 18:09:12 +00:00
Brian Hirsh	92229adf0c	add special handling for resize_() in functionalization pass Pull Request resolved: https://github.com/pytorch/pytorch/pull/77714 Approved by: https://github.com/ezyang	2022-05-26 16:15:44 +00:00
Bin Bao	29189d2ba8	[LT] Add IR reusing support for manually-implemented ops Summary: Add CanBeReused methods for manually-implemented ops and replace MakeNode with ReuseOrMakeNode. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77616 Approved by: https://github.com/JackCaoG, https://github.com/wconstab	2022-05-26 16:04:47 +00:00
Hui Guo	1803a592f4	[static_runtime] Add script to auto-generate view ops (#77105 ) Summary: Add a script that goes through the view ops in "native_functions.yaml", auto-registers them into static runtime, and auto-generates op unit tests for each. Overall there are 96 grouped view ops, among which 21 are already registered by hand; 9 (including sparse ops/training related ops etc.) are not the target of static runtime; 30 have list args or list ret; and 7 have non-basic types such as "Dimname", "MemoryFormat", etc. In summary, this script auto-generates 29 view ops for now. Run `buck run //caffe2/torch/fb/jit:gen_static_runtime_ops` to generate static runtime ops, and the results with this script are, ``` total grouped native ops: 1582 grouped native ops with out variant: 548 generated functions groups with out variant: 241 view grouped native ops: 96 generated functions view groups: 29 overall generated : 270 ``` The generated view ops are added in D36258968 Test Plan: Generate static runtime ops: `buck run //caffe2/torch/fb/jit:gen_static_runtime_ops` Unit tests: `buck run mode/opt //caffe2/benchmarks/static_runtime:static_runtime_cpptest` Differential Revision: D36258767 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77105 Approved by: https://github.com/mikeiovine	2022-05-26 03:12:22 +00:00
Antonio Kim	02c4d877b4	Codegen Non-Native IR Nodes (#76535 ) Add codegen infrastructure to generate IR nodes for non-native ops. The proposed change is to add a `non_native` key to the `{backend}_native_functions.yaml` file that contains schema definitions similar to what is found in `native_functions.yaml`. e.g. ``` non_native: ... - func: expand(Tensor input, int[] size, bool is_scalar_expand) -> Tensor ... ``` these definitions are parsed into a `LazyIrSchema` that can be used for generating IR nodes using `GenLazyIR`. Fixes #74628 CC: @wconstab @desertfire @henrytwo Pull Request resolved: https://github.com/pytorch/pytorch/pull/76535 Approved by: https://github.com/wconstab	2022-05-24 19:29:23 +00:00
Brian Hirsh	7ddc1425ff	functionalization fix for inplace comparison ops Pull Request resolved: https://github.com/pytorch/pytorch/pull/77125 Approved by: https://github.com/ezyang	2022-05-24 18:20:31 +00:00
Brian Hirsh	22d566acda	functionalization fix for inplace_view ops Pull Request resolved: https://github.com/pytorch/pytorch/pull/77126 Approved by: https://github.com/ezyang	2022-05-24 18:20:30 +00:00
Mengwei Liu	9e806619cc	[Codegen] Remove view operator check in NativeFunctionGroups and allow skipping native function generation (#78145 ) Summary: This PR adds two features: * A boolean to turn off native function generation in codegen * Relaxing `view` operator check for `NativeFunctionGroups` Differential Revision: D36604646 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78145 Approved by: https://github.com/iseeyuan, https://github.com/bdhirsh	2022-05-24 05:48:30 +00:00
Mengwei Liu	ffa3cce100	[Codegen] Expose namespace argument for static dispatch (#77710 ) For static dispatch we are hardcoding namespace to be `at` for backend-specific C++ functions, e.g., `at::cpu::add()`. We are extending it to accept namespaces from callsite. This is a temporary solution, in the long run we want to introduce custom namespace into codegen system, e.g., we should be able to add `at::` to `native_functions.yaml` and parse it into `NativeFunction`. This needs a bit more design. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77710 Approved by: https://github.com/ezyang	2022-05-21 00:39:06 +00:00
John Clow	417373337f	Put imports in correct order so clang-format doesn't get mad every time Pull Request resolved: https://github.com/pytorch/pytorch/pull/77282 Approved by: https://github.com/Krovatkin	2022-05-20 18:39:47 +00:00
Brian Hirsh	0161e9eb00	[test] attempt to functionalize ops with mutable positional-only args Pull Request resolved: https://github.com/pytorch/pytorch/pull/76320 Approved by: https://github.com/ezyang	2022-05-19 18:50:34 +00:00
Edward Z. Yang	befa4e371e	Fix typo Fixes #77412 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/77488 Approved by: https://github.com/mruberry	2022-05-18 18:25:54 +00:00
Antonio Kim	55be35ae39	Fix 'Code below assumes there is at least one tensor arg' assumption (#76917 ) Previously, when codegening ops like `zeros_` or `ones_`, we'd hit a `Code below assumes there is at least one tensor arg` error. The check that throws this error is not entirely correct: ops like the ones mentioned pass in a `device` parameter that can be used in place of the "first tensor". CC: @wconstab @desertfire @henrytwo @ke1337 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76917 Approved by: https://github.com/desertfire	2022-05-18 17:58:47 +00:00
John Clow	2a99018147	Adding a way to register both upper and lower bound functions Pull Request resolved: https://github.com/pytorch/pytorch/pull/77388 Approved by: https://github.com/eellison	2022-05-18 17:34:07 +00:00
Brian Hirsh	edc904d6ba	add native view_copy.out ops, teach codegen about tensorlist out= Pull Request resolved: https://github.com/pytorch/pytorch/pull/76126 Approved by: https://github.com/ezyang	2022-05-18 14:23:43 +00:00
Yukio Siraichi	9d44250760	Reduce structured kernels' `set_output` boilerplate with new overloads. Partially fix #69813 This PR does mainly 3 things: 1. Introduces new methods for the `MetaBase` API: - `set_output_strided`: creates proxy tensors with exact strides, if strides don't match - `set_output_contiguous`: alias for `set_output_strided` with contiguous strides - `set_output_raw_strided`: does not create proxy tensors 2. Modifies codegen for handling proxy tensors: - Creates a new field for out-of-place kernels: `proxy_output_` - Implements `set_output_strided` by creating a proxy tensor if necessary - Passes the proxy tensor to the `IMPL` function - Copies the result back to the real output, in the end, whenever a proxy was created 3. Replaces `set_output` by `set_output_raw_strided` for `TensorIterator*` - Needed, since it overrides `set_output` Pull Request resolved: https://github.com/pytorch/pytorch/pull/76096 Approved by: https://github.com/ezyang	2022-05-17 12:01:53 +00:00
Linbin Yu	1f8049566f	Re-land BUCK build for pytorch mobile (#77612 ) see https://github.com/pytorch/pytorch/pull/76480 fixed most lint errors Pull Request resolved: https://github.com/pytorch/pytorch/pull/77612 Approved by: https://github.com/kit1980	2022-05-17 00:30:13 +00:00
Bin Bao	25c6ebd12c	Revert "Revert "[LT] Codegen ReuseNode for supported ops"" Summary: Fixed a XLC build failure by generating an always-return-false default CanBeReused method. This reverts commit `3cade9d454`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77513 Approved by: https://github.com/alanwaketan	2022-05-16 20:14:42 +00:00
PyTorch MergeBot	530481ed69	Revert "[mobile] add buck build for mobile targets (#76480 )" This reverts commit `168dc70faf`. Reverted https://github.com/pytorch/pytorch/pull/76480 on behalf of https://github.com/atalman	2022-05-16 16:14:17 +00:00
francescocastelli	dca416b578	Pretty-print dataclasses (#76810 ) Unfortunately, the built-in pprint module supports pretty-printing of dataclasses only from Python 3.10. The code that I wrote in the `__str__` method of OpInfo should do the same job and should also work for any dataclass. For now I've put it there, but we can create a function and put it somewhere accessible to other dataclasses as well. Also, the max width (80) is currently hardcoded, but it would ideally be a parameter of the function. When you call print on an OpInfo you get: ``` OpInfo(name = '__getitem__', ref = None, aliases = (), variant_test_name = '', op = <slot wrapper '__getitem__' of 'torch._C._TensorBase' objects>, method_variant = <slot wrapper '__getitem__' of 'torch._C._TensorBase' objects>, inplace_variant = None, skips = (<torch.testing._internal.common_methods_invocations.DecorateInfo object at 0x7f463acbca90>, <torch.testing._internal.common_methods_invocations.DecorateInfo object at 0x7f463acbcae0>), decorators = (<torch.testing._internal.common_methods_invocations.DecorateInfo object at 0x7f463acbca90>, <torch.testing._internal.common_methods_invocations.DecorateInfo object at 0x7f463acbcae0>), sample_inputs_func = <function sample_inputs_getitem at 0x7f463acc6af0>, reference_inputs_func = None, error_inputs_func = None, sample_inputs_sparse_coo_func = <function _DecoratorContextManager.__call__.<locals>.decorate_context at 0x7f463acc6b80>, sample_inputs_sparse_csr_func = <function _DecoratorContextManager.__call__.<locals>.decorate_context at 0x7f463acc6c10>, dtypes = {torch.int16, torch.float64, torch.int32, torch.int64, torch.complex64, torch.float16, torch.bfloat16, torch.uint8, torch.complex128, torch.bool, torch.float32, torch.int8}, dtypesIfCUDA = {torch.int16, torch.float64, torch.int32, torch.int64, torch.complex64, torch.float16, torch.bfloat16, torch.uint8, torch.complex128, torch.bool, torch.float32, torch.int8}, dtypesIfROCM = {torch.int16, torch.float64, torch.int32, 
torch.int64, torch.complex64, torch.float16, torch.bfloat16, torch.uint8, torch.complex128, torch.bool, torch.float32, torch.int8}, backward_dtypes = {torch.int16, torch.float64, torch.int32, torch.int64, torch.complex64, torch.float16, torch.bfloat16, torch.uint8, torch.complex128, torch.bool, torch.float32, torch.int8}, backward_dtypesIfCUDA = {torch.int16, torch.float64, torch.int32, torch.int64, torch.complex64, torch.float16, torch.bfloat16, torch.uint8, torch.complex128, torch.bool, torch.float32, torch.int8}, backward_dtypesIfROCM = {torch.int16, torch.float64, torch.int32, torch.int64, torch.complex64, torch.float16, torch.bfloat16, torch.uint8, torch.complex128, torch.bool, torch.float32, torch.int8}, supports_out = False, supports_autograd = True, supports_gradgrad = True, supports_fwgrad_bwgrad = True, supports_inplace_autograd = False, supports_forward_ad = True, gradcheck_wrapper = <function OpInfo.<lambda> at 0x7f463a7a40d0>, check_batched_grad = True, check_batched_gradgrad = True, check_batched_forward_grad = True, check_inplace_batched_forward_grad = True, gradcheck_nondet_tol = 0.0, gradcheck_fast_mode = None, aten_name = '__getitem__', decomp_aten_name = None, aten_backward_name = None, assert_autodiffed = False, autodiff_nonfusible_nodes = ['aten::__getitem__'], autodiff_fusible_nodes = [], supports_sparse = False, supports_scripting = False, supports_sparse_csr = False, test_conjugated_samples = True, test_neg_view = True, assert_jit_shape_analysis = False, supports_expanded_weight = False) ``` cc @ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/76810 Approved by: https://github.com/ezyang	2022-05-16 14:20:41 +00:00
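The idea in the pretty-print commit (#76810), a generic formatter that works for any dataclass and takes the width as a parameter, could look roughly like the sketch below. This is not the code from the PR: `pretty_dataclass` is a hypothetical name, and the one-field-per-line fallback is an assumption chosen to resemble the `name = value` output shown above.

```python
import dataclasses


def pretty_dataclass(obj: object, width: int = 80) -> str:
    """Hypothetical helper: format any dataclass instance as 'Name(a = 1, b = 2)'.

    If the single-line form is longer than `width`, place one field per line,
    aligned under the first field.
    """
    if not dataclasses.is_dataclass(obj) or isinstance(obj, type):
        raise TypeError("expected a dataclass instance")
    name = type(obj).__name__
    parts = [f"{f.name} = {getattr(obj, f.name)!r}" for f in dataclasses.fields(obj)]
    one_line = f"{name}({', '.join(parts)})"
    if len(one_line) <= width:
        return one_line
    indent = " " * (len(name) + 1)  # align continuation lines after "Name("
    return f"{name}(" + f",\n{indent}".join(parts) + ")"


@dataclasses.dataclass
class Point:
    x: int
    y: int


if __name__ == "__main__":
    print(pretty_dataclass(Point(1, 2)))
    print(pretty_dataclass(Point(1, 2), width=5))
```

Keeping `width` as a keyword parameter addresses the hardcoded-80 limitation the commit message mentions.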
Linbin Yu	168dc70faf	[mobile] add buck build for mobile targets (#76480 ) Create buck targets to replicate internal BUCK build, including - XNNPACK - QNNPACK - C10 - aten_cpu - torch_mobile_core - torch_mobile_all_ops - ptmobile_benchmark And able to run mobilenet v2 using ptmobile_benchmark (with all ops). Pull Request resolved: https://github.com/pytorch/pytorch/pull/76480 Approved by: https://github.com/seemethere, https://github.com/dreiss	2022-05-15 18:42:41 +00:00
PyTorch MergeBot	3cade9d454	Revert "[LT] Codegen ReuseNode for supported ops" This reverts commit `6066e5929f`. Reverted https://github.com/pytorch/pytorch/pull/76738 on behalf of https://github.com/malfet	2022-05-14 00:33:10 +00:00
Bin Bao	6066e5929f	[LT] Codegen ReuseNode for supported ops Summary: 1. Update the codegen script to add a TrieCache lookup (ReuseNode) before creating a new IR node. The following is an example of the generated code, ``` at::Tensor LazyNativeFunctions::add(const at::Tensor & self, const at::Tensor & other, const at::Scalar & alpha) { ... torch::lazy::NodePtr node = torch::lazy::ReuseNode<AddTensor>(lazy_self->GetIrValue(), lazy_other->GetIrValue(), node_alpha); if (!node) { auto out_meta = at::meta::add(self, other, alpha); std::vector<Shape> shapes{Shape(out_meta.scalar_type(), out_meta.sizes().vec())}; TORCH_INTERNAL_ASSERT(shapes.size() == 1); if(symbolicShapeEnabled()){ std::vector<jit::IValue> inputs = { self, other, alpha }; char* schema_str = "aten::add.Tensor(Tensor self, Tensor other, *, Scalar alpha=1) -> Tensor"; applySymbolicShapesOnLT(schema_str, inputs, shapes); } node = torch::lazy::MakeNode<AddTensor>(lazy_self->GetIrValue(), lazy_other->GetIrValue(), node_alpha, std::move(shapes)); CacheNode(node); } ... } ``` 2. TrieCache lookup depends on each IR node subclass to provide its own comparison function. The following is an example of the generated code, ``` bool CanBeReused(const torch::lazy::Value& self, const torch::lazy::Value& other, const torch::lazy::Value& alpha) const { size_t i = 0; return (operand(i++) == self && operand(i++) == other && operand(i++) == alpha); } ``` 3. DeviceData is specially handled. 4. Non-codegen op changes are coming in a separate PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76738 Approved by: https://github.com/JackCaoG, https://github.com/wconstab	2022-05-13 19:13:58 +00:00
Kulin Seth	e011a8e18b	Enable PyTorch operations on MPS Backend. (#77343 ) Add PyTorch operations to MPS backend. - https://github.com/pytorch/pytorch/issues/77394 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77343 Approved by: https://github.com/albanD	2022-05-13 18:28:53 +00:00
JackCaoG	e36a8c1f13	Lazy codegen change for xla (#76717 ) Codegen change to enable PyTorch/XLA to generate the first op in https://github.com/pytorch/xla/pull/3544. @bdhirsh @wconstab Pull Request resolved: https://github.com/pytorch/pytorch/pull/76717 Approved by: https://github.com/Krovatkin	2022-05-12 17:04:04 +00:00
Brian Hirsh	47dd092bae	add a new at::lift operator, fix torch.tensor for functionalization This reverts commit `85bd65a880`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77285 Approved by: https://github.com/albanD, https://github.com/ezyang	2022-05-12 13:31:19 +00:00
PyTorch MergeBot	85bd65a880	Revert "[test] try to fix torch.tensor for functionalization" This reverts commit `9edee09ed6`. Reverted https://github.com/pytorch/pytorch/pull/76319 on behalf of https://github.com/janeyx99	2022-05-11 18:48:42 +00:00
Brian Hirsh	9edee09ed6	[test] try to fix torch.tensor for functionalization Pull Request resolved: https://github.com/pytorch/pytorch/pull/76319 Approved by: https://github.com/ezyang	2022-05-11 17:27:34 +00:00
Kulin Seth	f348b1b2b5	Add the Runtime components for MPS backend. (#76725 ) The PR adds the runtime components and few basic operations like copy, as_strided for MPS backend. Current list of identified TODOs are: - https://github.com/pytorch/pytorch/issues/77176 - Unify the logic with CUDACachingAllocator and remove redundant code. - https://github.com/pytorch/pytorch/issues/77170 - Look into using C++ smart pointers where possible with ObjC code - Use empty_strided_generic() to implement the `empty_strided_mps` code - https://github.com/pytorch/pytorch/issues/77144 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76725 Approved by: https://github.com/albanD	2022-05-11 17:19:45 +00:00
Bin Bao	8f5cdc6d5d	Revert "Revert "[LT] Store OpKind for each IR subclass in a static field"" Summary: Re-land https://github.com/pytorch/pytorch/pull/76711 by fixing internal build errors. Generate class-level opkind as a static method instead of a static member. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77102 Approved by: https://github.com/wconstab, https://github.com/JackCaoG, https://github.com/antoniojkim	2022-05-11 12:27:05 +00:00
PyTorch MergeBot	7eaf4780ba	Revert "[LT] Store OpKind for each IR subclass in a static field" This reverts commit `ac37ddc795`. Reverted https://github.com/pytorch/pytorch/pull/76711 on behalf of https://github.com/malfet	2022-05-09 20:50:09 +00:00
Nikolay Korovaiko	daf8c48a87	Revert "Revert "[WIP] customize the C++ class for valueT"" (#77003 ) This reverts commit `ec841b0346`. Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/77003 Approved by: https://github.com/shunting314, https://github.com/JackCaoG	2022-05-09 17:40:17 +00:00
PyTorch MergeBot	ec841b0346	Revert "[WIP] customize the C++ class for valueT" This reverts commit `c152817926`. Reverted https://github.com/pytorch/pytorch/pull/76911 on behalf of https://github.com/suo	2022-05-06 22:36:04 +00:00
Nikolay Korovaiko	c152817926	[WIP] customize the C++ class for valueT Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/76911 Approved by: https://github.com/wconstab	2022-05-06 21:05:35 +00:00
Bin Bao	ac37ddc795	[LT] Store OpKind for each IR subclass in a static field Summary: Currently OpKind is stored as an object field called op_ for each IR node, and one usage of op_ is to avoid dynamic_cast in NodeCast when we need to downcast a base-node pointer into a concrete sub-node pointer. As a result, we need to construct and pass in an op when downcasting nodes, and this becomes quite annoying when we start to implement the trie-based IR node reusing. More importantly, the op for each subclass should be unique for that subclass, and thus making it a const static field is a more logical design. In this PR, we still keep the object-level op_ for easier XLA adoption. As future work, we can come back to remove op_, make the op() method virtual, and get rid of OpKind in all the node constructors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76711 Approved by: https://github.com/wconstab, https://github.com/JackCaoG	2022-05-06 19:14:46 +00:00
Yukio Siraichi	fcf38a5812	Add support to `Tensor[]?` for structured kernel codegen. This PR turns the previously introduced `ITensorList` into a more general `IList` class. It is a container wrapper for arbitrary types (given their appropriate implementations). In summary, I have: - Renamed `ITensorList` (its iterators and macros, for consistency) to `IList` - Made `IList` a templated function (for an arbitrary type `T`), given that they: - Specialize `IListTagImpl<T, Tag>`, for all `IListTag` - Introduced type aliases (for both list and iterator types): - `at::ITensorList` -> `c10::IList<at::Tensor>` - `at::IOptTensorRefList` -> `c10::IList<at::OptionalTensorRef>` - Added support for `Tensor?[]` in the structured codegen Pull Request resolved: https://github.com/pytorch/pytorch/pull/69606 Approved by: https://github.com/ezyang	2022-05-06 14:24:18 +00:00
Peter Bell	6df5a53127	Fix unmarked fstring This error message currently prints the format string literally, because the string isn't marked with the `f`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76841 Approved by: https://github.com/bdhirsh	2022-05-04 21:18:05 +00:00
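The class of bug fixed in the unmarked f-string commit (#76841) can be reproduced in a few lines. This is a generic illustration, not the message from the PR: the function name `describe` and the message text are made up for the example.

```python
from typing import Tuple


def describe(op_name: str) -> Tuple[str, str]:
    """Return the same message built without and with the f prefix."""
    # Without the f prefix, the braces are kept literally in the output.
    unmarked = "unsupported op: {op_name}"
    # With the f prefix, {op_name} is replaced by the variable's value.
    marked = f"unsupported op: {op_name}"
    return unmarked, marked


if __name__ == "__main__":
    for message in describe("add"):
        print(message)
```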
Hui Guo	ca0f267022	[Static Runtime] [RFC] Codegen support for ops with unstructured kernels (#76203 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76203 Request for comments: This change adds extra code generator support to generate out variant wrappers for operators with unstructured kernels. The current version generates 105 new out variant wrappers in addition to the existing 136 auto-generated out variant wrappers. This change shows that a simple tweak can increase the generated op coverage to 16% (241/1559) among all native ops described in native_functions.yaml, no matter whether they are structured or not. Command to generate out variant wrappers. ``` buck run //caffe2/torch/fb/jit:gen_static_runtime_ops ``` - AFTER this change ``` total grouped native ops: 1559 structured grouped native ops: 545 generated grouped native ops: 241 ``` - BEFORE this change ``` total grouped native ops: 1503 structured grouped native ops: 540 generated grouped native ops: 136 ``` To enable CI tests and make it easier to review, the generated ops are added in a separate diff: D35945633 More details: We added a block list to remove the generation of around 10 operations that are deprecated or for which the unit test would fail. All generated ops compile, but the compiled unit tests may not pass due to the lack of hand-picked test input values for certain ops. Among the 42 ops whose unit tests do not pass, 1 (op "index_select") is repeated from the existing ops; 32 ops are fixed; and 9 ops are removed and blocked from generation, either because they are not commonly used in internal models, such as "cholesky", "linalg_householder_product", and the sparse kernel "sspaddmm", or because they cause errors in static runtime, such as "conj_physical", which leads to an error in the memory planner, and "binary_cross_entropy". 
Test Plan: OP generation: ```buck run //caffe2/torch/fb/jit:gen_static_runtime_ops``` Test generated ops: ```buck run mode/opt //caffe2/benchmarks/static_runtime:static_runtime_cpptest``` Reviewed By: tenpercent Differential Revision: D34913736 fbshipit-source-id: a6f408321653c3589ae1c76826177fc403d59c44 (cherry picked from commit 6f4501730478dbaeeea7f3ad4f9d29bf6787e7c1)	2022-05-04 19:34:19 +00:00
Michael Suo	fb0f285638	[lint] upgrade mypy to latest version Fixes https://github.com/pytorch/pytorch/issues/75927. Had to fix some bugs and add some ignores. To check if clean: ``` lintrunner --paths-cmd='git grep -Il .' --take MYPY,MYPYSTRICT ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/76753 Approved by: https://github.com/malfet	2022-05-03 20:51:34 +00:00
John Clow	db21e22b4b	[EASY] Quick Fix for broken shape function autogen. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76703 Approved by: https://github.com/eellison	2022-05-03 17:34:05 +00:00
Bin Bao	f8a4780eb2	[LT] Move MakeNode into ir_builder.h Summary: Move MakeNode into ir_builder.h to avoid circular header reference later when introducing a trie cache for IR node lookup. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76482 Approved by: https://github.com/wconstab	2022-05-03 14:53:19 +00:00
Will Constable	d0cb31d5bc	Make lazy tensor ptr class customizable (#76476 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76476 Test Plan: Imported from OSS Reviewed By: Krovatkin, bdhirsh Differential Revision: D35980433 Pulled By: wconstab fbshipit-source-id: 1d4d00a494bf8aea86278b007f7f353cd7a822f8 (cherry picked from commit a78655bef23b5fa8487ced13443ca0bfdec65e5c)	2022-04-28 03:21:56 +00:00
Will Constable	4cae57080a	Make lazy tensor creation and value strings customizable (#76472 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76472 - lets XLA backend customize codegenned functions during migration to LTC Test Plan: Imported from OSS Reviewed By: Krovatkin, bdhirsh Differential Revision: D35980435 Pulled By: wconstab fbshipit-source-id: 6138aef20862fccec40d715ffbb5a40a0a7d0401 (cherry picked from commit bad23f4b7ef73ffc2ef4a893364512907e9c4555)	2022-04-28 03:21:56 +00:00
Will Constable	cfc90cf3eb	Fix GenLazyIR.node_base_ctor_call (#76471 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76471 Make node_base_ctor_call produce the entire node_base_ctor_call. Previously it was only producing the beginning of the call, which was unintended. Addresses part of https://github.com/pytorch/xla/issues/3472 Test Plan: Imported from OSS Reviewed By: qihqi, ngimel Differential Revision: D35980436 Pulled By: wconstab fbshipit-source-id: a443cf593ac7c35b2b65e72b82907e88e1e71c7a (cherry picked from commit 360ad6d82a7e8303b8a60e61b177dabf0131ea8b)	2022-04-28 03:21:56 +00:00
anjali411	b204ad863f	Revert "Revert "Allow specifying tags for aten operators in native_functions.yaml"" This reverts commit `ea44645c9a`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76456 Approved by: https://github.com/osalpekar	2022-04-28 02:04:57 +00:00
Brian Hirsh	40d96f0afd	Revert "functionalization: add support for zero_()" This reverts commit `7d44b3675b`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76375 Approved by: https://github.com/datumbox, https://github.com/albanD	2022-04-26 19:27:27 +00:00
Edward Z. Yang	c2ae0b01c0	Reapply black for torchgen, this time with lint to fix! Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/76359 Approved by: https://github.com/suo	2022-04-26 04:03:38 +00:00
Nikolay Korovaiko	bb60cac25a	E2E SymInt example narrow_copy This roughly corresponds to Goal 3.2 in https://docs.google.com/document/d/1iiLNwR5ohAsw_ymfnOpDsyF6L9RTUaHMpD8YLw-jxEw/edit# Namely, it adds the following: * SymbolicIntNode interface * LazySymbolicIntNode implementation * Lazy `narrow_copy` implementation * Support for SymInt in codegen * Test (below) ```cpp TEST(LazyDynamicOpsTest, NarrowCopy) { auto x = torch::rand({5, 10, 10}).to(kLazy); const size_t Y_DIM = 3; const size_t X_DIM_INDEX = 2; auto y = torch::rand({Y_DIM}).to(kLazy); auto ly = torch::lazy::TryGetLtcTensor(y); auto dim_node = MakeNode<SizeNode>(ly->GetIrValue(), 0); auto lmn = new torch::lazy::SymbolicIntNode(dim_node); auto z = x.narrow_copy(X_DIM_INDEX, 0, lmn->toSymInt()); AllClose(z.cpu(), x.cpu().narrow_copy(X_DIM_INDEX, 0, Y_DIM)); } ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/75759 Approved by: https://github.com/wconstab	2022-04-26 02:40:27 +00:00
Brian Hirsh	640ce6bc9b	functionalization bugfix: using owning type when unwrapping tensors Pull Request resolved: https://github.com/pytorch/pytorch/pull/76125 Approved by: https://github.com/ezyang	2022-04-25 22:00:19 +00:00
Brian Hirsh	74e93f727a	remove _is_foreach_op codegen special cases, clean up mutable return type checks Pull Request resolved: https://github.com/pytorch/pytorch/pull/76190 Approved by: https://github.com/ezyang	2022-04-25 21:34:17 +00:00
Brian Hirsh	5da76acd1d	functionalization: add a copy() native function Pull Request resolved: https://github.com/pytorch/pytorch/pull/76083 Approved by: https://github.com/albanD	2022-04-25 21:31:48 +00:00
Brian Hirsh	7d44b3675b	functionalization: add support for zero_() Pull Request resolved: https://github.com/pytorch/pytorch/pull/75913 Approved by: https://github.com/albanD	2022-04-25 21:31:48 +00:00
Priya Ramani	f954c0a774	[Pytorch][4/4 Static dispatch] Support multiple backends with multiple kernels (#76059 ) Summary: - Supports multiple backends with multiple kernels in static dispatch - Refactor static dispatch generators Pull Request resolved: https://github.com/pytorch/pytorch/pull/76059 ghstack-source-id: 154735166 Test Plan: ``` (pytorch) ~/fbsource └─ $ buck build --config pt.enable_lightweight_dispatch=1 --config pt.static_dispatch_backend="CPU;QuantizedCPU;CompositeExplicitAutograd" //xplat/caffe2/fb/lite_predictor:lite_predictor_flatbuffer ``` Reviewed By: bdhirsh Differential Revision: D35727473 fbshipit-source-id: 986ba3390c6e585fcf8477b6d069720ee1fbc90b (cherry picked from commit 6473990c208a78879985e4cdfb50960f5727ad5e)	2022-04-25 21:18:08 +00:00
Edward Yang	36420b5e8c	Rename tools/codegen to torchgen (#76275 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76275 In preparation for addressing https://github.com/pytorch/pytorch/issues/73212 Diff was generated with: ``` git mv tools/codegen torchgen git grep -l 'tools.codegen' \| xargs sed -i 's/tools.codegen/torchgen/g' sed -i "s/\${TOOLS_PATH}\/codegen/\${TORCH_ROOT}\/torchgen/g" caffe2/CMakeLists.txt ``` and manual edits to: * tools/test/test_gen_backend_stubs.py * torchgen/build.bzl * torchgen/gen_backend_stubs.py aka this diff: ``` diff --git a/tools/test/test_gen_backend_stubs.py b/tools/test/test_gen_backend_stubs.py index 3dc26c6d2d..104054575e 100644 --- a/tools/test/test_gen_backend_stubs.py +++ b/tools/test/test_gen_backend_stubs.py @@ -9,7 +9,7 @@ from torchgen.gen_backend_stubs import run from torchgen.gen import _GLOBAL_PARSE_NATIVE_YAML_CACHE # noqa: F401 path = os.path.dirname(os.path.realpath(__file__)) -gen_backend_stubs_path = os.path.join(path, '../torchgen/gen_backend_stubs.py') +gen_backend_stubs_path = os.path.join(path, '../../torchgen/gen_backend_stubs.py') # gen_backend_stubs.py is an integration point that is called directly by external backends. # The tests here are to confirm that badly formed inputs result in reasonable error messages.
diff --git a/torchgen/build.bzl b/torchgen/build.bzl index ed04e35a43..d00078a3cf 100644 --- a/torchgen/build.bzl +++ b/torchgen/build.bzl @@ -1,6 +1,6 @@ def define_targets(rules): rules.py_library( - name = "codegen", + name = "torchgen", srcs = rules.glob(["*/.py"]), deps = [ rules.requirement("PyYAML"), @@ -11,6 +11,6 @@ def define_targets(rules): rules.py_binary( name = "gen", - srcs = [":codegen"], + srcs = [":torchgen"], visibility = ["//visibility:public"], ) diff --git a/torchgen/gen_backend_stubs.py b/torchgen/gen_backend_stubs.py index c1a672a655..beee7a15e0 100644 --- a/torchgen/gen_backend_stubs.py +++ b/torchgen/gen_backend_stubs.py @@ -474,7 +474,7 @@ def run( ) -> None: # Assumes that this file lives at PYTORCH_ROOT/torchgen/gen_backend_stubs.py - pytorch_root = pathlib.Path(__file__).parent.parent.parent.absolute() + pytorch_root = pathlib.Path(__file__).parent.parent.absolute() template_dir = os.path.join(pytorch_root, "aten/src/ATen/templates") def make_file_manager(install_dir: str) -> FileManager: ``` run_all_fbandroid_tests Test Plan: sandcastle Reviewed By: albanD, ngimel Differential Revision: D35770317 fbshipit-source-id: 153ac4a7fef15b1e750812a90bfafdbc8f1ebcdf (cherry picked from commit c6d485d1d4648fa1c8a4c14c5bf3d8e899b9b4dd)	2022-04-25 01:38:06 +00:00


344 Commits