Commit Graph

70 Commits

Edward Z. Yang
213a8fc992 Put symint overloads on a different name
Due to implicit conversion shenanigans, having both IntArrayRef
and SymIntArrayRef overloads makes {} ambiguous.  While we could
fix this by making a single unified type that accepts all the overloads
we want, an easier fix was to just push the SymIntArrayRef overload
to its own name.
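
A minimal, self-contained sketch of the kind of ambiguity described above, using simplified stand-in types and hypothetical function names rather than the real ATen classes:

```
#include <cstdint>
#include <initializer_list>

// Simplified stand-ins for at::IntArrayRef and c10::SymIntArrayRef,
// for illustration only.
struct IntArrayRef {
  IntArrayRef(std::initializer_list<int64_t>) {}
};
struct SymInt {
  SymInt(int64_t) {}
};
struct SymIntArrayRef {
  SymIntArrayRef(std::initializer_list<SymInt>) {}
};

// Two overloads on one name: a braced-init-list such as {} can convert to
// either parameter type, so the commented-out call below is ambiguous.
void resize(IntArrayRef) {}
void resize(SymIntArrayRef) {}

// Pushing the SymInt overload to its own name leaves only one candidate.
void resize_symint(SymIntArrayRef) {}

int main() {
  // resize({});            // error: ambiguous overload
  resize(IntArrayRef{});    // OK: type named explicitly
  resize_symint({});        // OK: only one candidate
  return 0;
}
```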

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79281

Approved by: https://github.com/suo
2022-06-12 14:36:39 +00:00
anjali411
38350acf8f Autogen Tags enum, and allow specifying tags while defining an op
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79322

Approved by: https://github.com/albanD
2022-06-11 00:29:32 +00:00
Mengwei Liu
24050a5801 [RFC][Codegen] Add custom namespace support (#78015)
Summary:
Adding a feature to allow users to specify namespaces for operators and kernels.

# Feature
There's a feature request to allow the DSL to:
1. take in an operator namespace other than `aten`.
2. take in a kernel that is in a different namespace than `at::native`.

For both features, we only allow users to have a single layer of namespace, for the sake of simplicity. If the user specifies `custom::function` as the kernel, the codegen will depend on `custom::native::function`, where `native` is hardcoded.

# Proposal

For feature 1, add a `namespace` attribute to the data class `NativeFunction`. The namespace will be extracted by matching the pattern "::" on the `func` variable. For `NativeFunctionsGroup` there's an assumption that all variants (function, inplace, out) will have the same namespace. By default (if not specified) the namespace will be "aten".

For feature 2, add a `namespace` attribute to the `BackendMetadata` class, similarly matching the pattern "::" on the kernel field. Remove the `cpp_namespace` field from the `register_dispatch_key` data class. By default (if not specified) the namespace for a kernel would be "at::native".
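
A rough, self-contained sketch of the "::" matching described above (torchgen itself is Python; the `split_namespace` helper below is hypothetical and only illustrates the single-layer split and the defaulting behavior):

```
#include <iostream>
#include <string>
#include <utility>

// Split an optional "ns::" prefix off an operator or kernel name,
// falling back to a default namespace when no "::" is present.
std::pair<std::string, std::string> split_namespace(
    const std::string& name, const std::string& default_ns) {
  const auto pos = name.find("::");
  if (pos == std::string::npos) {
    return {default_ns, name};
  }
  return {name.substr(0, pos), name.substr(pos + 2)};
}

int main() {
  const auto op = split_namespace("custom::gelu.out", "aten");
  const auto kernel = split_namespace("gelu_out_cpu", "at::native");
  std::cout << op.first << " | " << op.second << "\n";          // custom | gelu.out
  std::cout << kernel.first << " | " << kernel.second << "\n";  // at::native | gelu_out_cpu
  return 0;
}
```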

Test Plan:
Example yaml entries:
```
- func: custom::gelu.out(Tensor self, *, str approximate='none', Tensor(a!) out) -> Tensor(a!)
  structured: True
  structured_inherits: TensorIteratorBase
  device_check: NoCheck   # TensorIterator
  python_module: nn
  dispatch:
    CPU: custom::gelu_out_cpu
    CUDA: custom::gelu_out_cuda
    MPS: custom::gelu_out_mps

- func: custom::gelu_(Tensor(a!) self, *, str approximate='none') -> Tensor(a!)
  structured_delegate: gelu.out
  device_check: NoCheck   # TensorIterator
  python_module: nn
  dispatch:
    NestedTensorCPU, NestedTensorCUDA: custom::NestedTensor_gelu_

- func: custom::gelu(Tensor self, *, str approximate='none') -> Tensor
  structured_delegate: gelu.out
  device_check: NoCheck   # TensorIterator
  python_module: nn
  dispatch:
    MkldnnCPU: custom::mkldnn_gelu
    QuantizedCPU: custom::gelu_quantized_cpu
    NestedTensorCPU, NestedTensorCUDA: custom::NestedTensor_gelu
```

see generated code:

`RegisterCPU.cpp`:
```
TORCH_LIBRARY_IMPL(aten, CPU, m) {
  ...
}
TORCH_LIBRARY_IMPL(custom, CPU, m) {
    m.impl("gelu", TORCH_FN(wrapper_gelu));
    m.impl("gelu.out", TORCH_FN(wrapper_gelu_out_out));
    m.impl("gelu_", TORCH_FN(wrapper_gelu_));
};
```
```
struct structured_gelu_out_cpu_inplace final : public custom::native::structured_gelu_out_cpu {
    structured_gelu_out_cpu_inplace(Tensor& self) : outputs_{std::ref(self)} {}

    void set_output_strided(
        int64_t output_idx, IntArrayRef sizes, IntArrayRef strides,
        TensorOptions options, DimnameList names
    ) override {

        const auto& out = outputs_[output_idx].get();
        check_inplace(out, sizes, options);

        auto maybe_proxy = maybe_create_proxy(out, sizes, strides, options);
        if (C10_UNLIKELY(maybe_proxy.has_value())) {
            proxy_outputs_[output_idx] = c10::ExclusivelyOwned<Tensor>(std::move(maybe_proxy).value());
        }

        if (!names.empty()) {
          namedinference::propagate_names(outputs_[output_idx], names);
        }
        // super must happen after, so that downstream can use maybe_get_output
        // to retrieve the output
        custom::native::structured_gelu_out_cpu::set_output_raw_strided(output_idx, sizes, strides, options, names);
    }

    void set_output_raw_strided(
        int64_t output_idx, IntArrayRef sizes, IntArrayRef strides,
        TensorOptions options, DimnameList names
    ) override {

        const auto& out = outputs_[output_idx].get();
        check_inplace(out, sizes, options);

        if (!names.empty()) {
          namedinference::propagate_names(outputs_[output_idx], names);
        }
        // super must happen after, so that downstream can use maybe_get_output
        // to retrieve the output
        custom::native::structured_gelu_out_cpu::set_output_raw_strided(output_idx, sizes, strides, options, names);
    }

    const Tensor& maybe_get_output(int64_t output_idx) override {
      return proxy_outputs_[output_idx].has_value() ? **proxy_outputs_[output_idx] : outputs_[output_idx].get();

    }
    std::array<std::reference_wrapper<Tensor>, 1> outputs_;
    std::array<c10::optional<c10::ExclusivelyOwned<Tensor>>, 1> proxy_outputs_;
};
```

`RegisterSchema.cpp`
```
TORCH_LIBRARY(aten, m) {
  ...
}
TORCH_LIBRARY(custom, m) {
    m.def("gelu.out(Tensor self, *, str approximate='none', Tensor(a!) out) -> Tensor(a!)");

    m.def("gelu_(Tensor(a!) self, *, str approximate='none') -> Tensor(a!)");

    m.def("gelu(Tensor self, *, str approximate='none') -> Tensor");
};
```

Differential Revision: D36558459

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78015
Approved by: https://github.com/bdhirsh
2022-06-10 21:04:36 +00:00
Brian Hirsh
7b3a0ff87a Port index.Tensor to structured kernels.
Tracking issue: #55070

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69607

Approved by: https://github.com/bdhirsh
2022-06-10 17:27:47 +00:00
George Qi
a90f006fe5 add strides to slow path
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78610

Approved by: https://github.com/ezyang
2022-06-10 16:59:14 +00:00
PyTorch MergeBot
4b82ef7928 Revert "Port index.Tensor to structured kernels."
This reverts commit cfd84125bd.

Reverted https://github.com/pytorch/pytorch/pull/69607 on behalf of https://github.com/zengk95 due to This is breaking mac trunk tests cfd84125bd
2022-06-08 20:16:10 +00:00
Brian Hirsh
cfd84125bd Port index.Tensor to structured kernels.
Tracking issue: #55070

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69607

Approved by: https://github.com/bdhirsh
2022-06-08 18:17:52 +00:00
Richard Zou
9da5defff6 Package config/template files with torchgen (#78942)
Package config/template files with torchgen

This PR packages native_functions.yaml, tags.yaml and ATen/templates
with torchgen.

This PR:
- adds a step to setup.py to copy the relevant files over into torchgen
- adds a docstring for torchgen (so `import torchgen; help(torchgen)`
says something)
- adds a helper function in torchgen so you can get the torchgen root
directory (and figure out where the packaged files are)
- changes some scripts to explicitly pass the location of torchgen,
which will be helpful for the first item in the Future section.

Future
======

- torchgen, when invoked from the command line, should use sources
in torchgen/packaged instead of aten/src. I'm unable to do this because
people (i.e., PyTorch CI) invoke `python -m torchgen.gen` without
installing torchgen.
- the source of truth for all of these files should be in torchgen.
This is a bit annoying to execute on due to potential merge conflicts
and dealing with merge systems
- CI and testing. The way things are set up right now is really fragile;
we should have a CI job for torchgen.

Test Plan
=========
I ran the following locally:

```
python -m torchgen.gen -s torchgen/packaged
```
and verified that it outputted files.

Furthermore, I did a setup.py install and checked that the files are
actually being packaged with torchgen.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78942
Approved by: https://github.com/ezyang
2022-06-07 13:33:55 +00:00
Sergii Dymchenko
0fdc1caf02 Cleanup some Python2-related code (#78864)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78864
Approved by: https://github.com/janeyx99, https://github.com/jbschlosser
2022-06-06 17:40:02 +00:00
Brian Hirsh
67b27a7bae generate kernels for codegen'd out= operators
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78626

Approved by: https://github.com/ezyang, https://github.com/JacobSzwejbka, https://github.com/larryliu0820
2022-06-06 15:36:28 +00:00
PyTorch MergeBot
bcb424c8cf Fix #78675
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78699

Approved by: https://github.com/tugsbayasgalan
2022-06-04 01:07:24 +00:00
Linbin Yu
1683a2618d rename BUILD.buck to BUCK.oss (#78792)
rename BUILD.buck to BUCK.oss to better reflect that it's the OSS version of BUCK build, not the one shared with Bazel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78792
Approved by: https://github.com/kit1980
2022-06-03 07:23:16 +00:00
PyTorch MergeBot
954522a485 Revert "Autogen Tags enum, and allow specifying tags while defining an op"
This reverts commit 9476a78f37.

Reverted https://github.com/pytorch/pytorch/pull/77313 on behalf of https://github.com/malfet due to Broke OSS buck builds, see 9476a78f37
2022-06-03 01:53:53 +00:00
anjali411
9476a78f37 Autogen Tags enum, and allow specifying tags while defining an op
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77313

Approved by: https://github.com/ezyang, https://github.com/albanD
2022-06-03 01:13:44 +00:00
PyTorch MergeBot
fca1f495c2 Revert "Port index.Tensor to structured kernels."
This reverts commit 9fe6f1baf5.

Reverted https://github.com/pytorch/pytorch/pull/69607 on behalf of https://github.com/suo due to this broke master, see: 9fe6f1baf5
2022-06-01 00:12:15 +00:00
Brian Hirsh
9fe6f1baf5 Port index.Tensor to structured kernels.
Tracking issue: #55070

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69607

Approved by: https://github.com/bdhirsh
2022-05-31 22:15:20 +00:00
Antonio Kim
fe67dff82a Deprecate TSNodeLoweringInterface (#78273)
Fixes #78206

Deprecate `TSNodeLoweringInterface` and refactor lower functions into IR nodes.

CC: @wconstab @desertfire
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78273
Approved by: https://github.com/wconstab
2022-05-31 18:09:12 +00:00
Brian Hirsh
92229adf0c add special handling for resize_() in functionalization pass
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77714

Approved by: https://github.com/ezyang
2022-05-26 16:15:44 +00:00
Bin Bao
29189d2ba8 [LT] Add IR reusing support for manually-implemented ops
Summary: Add CanBeReused methods for manually-implemented ops and replace MakeNode with
ReuseOrMakeNode.
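
A rough, self-contained sketch of what a `ReuseOrMakeNode`-style helper does conceptually (toy types and a toy cache only, not the actual torch::lazy implementation — see the generated code in the "[LT] Codegen ReuseNode for supported ops" entry further down this log):

```
#include <cstdio>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Toy stand-ins; the real implementation uses IR node classes and a trie
// keyed on op kind and operands.
struct Node {
  explicit Node(std::string op) : op(std::move(op)) {}
  std::string op;
};
using NodePtr = std::shared_ptr<Node>;

std::unordered_map<std::string, NodePtr>& Cache() {
  static std::unordered_map<std::string, NodePtr> cache;
  return cache;
}

NodePtr ReuseOrMakeNode(const std::string& key) {
  auto it = Cache().find(key);
  if (it != Cache().end()) {
    return it->second;                       // reuse an equivalent existing node
  }
  auto node = std::make_shared<Node>(key);   // otherwise build a fresh node
  Cache().emplace(key, node);                // and record it for future reuse
  return node;
}

int main() {
  NodePtr a = ReuseOrMakeNode("add(self, other, alpha=1)");
  NodePtr b = ReuseOrMakeNode("add(self, other, alpha=1)");
  std::printf("reused: %s\n", a == b ? "yes" : "no");  // prints "reused: yes"
  return 0;
}
```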

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77616

Approved by: https://github.com/JackCaoG, https://github.com/wconstab
2022-05-26 16:04:47 +00:00
Hui Guo
1803a592f4 [static_runtime] Add script to auto-generate view ops (#77105)
Summary:
Add script to go through view ops in "native_functions.yaml" and auto-register them into static runtime and auto-generate op unit tests for each.

Overall there are 96 grouped view ops, among which 21 are already registered by hand; 9 (including sparse ops/training-related ops etc.) are not targets of static runtime; 30 have list args or list returns; and 7 have non-basic types such as "Dimname", "MemoryFormat", etc. In summary, this script auto-generates 29 view ops for now.

Run `buck run //caffe2/torch/fb/jit:gen_static_runtime_ops` to generate static runtime ops; the results with this script are:

```
total grouped native ops: 1582
grouped native ops with out variant: 548
generated functions groups with out variant: 241

view grouped native ops: 96
generated functions view groups: 29

overall generated : 270
```

The generated view ops are added in D36258968

Test Plan:
Generate static runtime ops: `buck run //caffe2/torch/fb/jit:gen_static_runtime_ops`

Unit tests: `buck run mode/opt //caffe2/benchmarks/static_runtime:static_runtime_cpptest`

Differential Revision: D36258767

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77105
Approved by: https://github.com/mikeiovine
2022-05-26 03:12:22 +00:00
Antonio Kim
02c4d877b4 Codegen Non-Native IR Nodes (#76535)
Add codegen infrastructure to generate IR nodes for non-native ops.

The proposed change is to add a `non_native` key to the `{backend}_native_functions.yaml` file that contains schema definitions similar to what is found in `native_functions.yaml`, e.g.:
```
non_native:
    ...
    - func: expand(Tensor input, int[] size, bool is_scalar_expand) -> Tensor
    ...
```
These definitions are parsed into a `LazyIrSchema` that can be used for generating IR nodes using `GenLazyIR`.

Fixes #74628

CC: @wconstab @desertfire @henrytwo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76535
Approved by: https://github.com/wconstab
2022-05-24 19:29:23 +00:00
Brian Hirsh
7ddc1425ff functionalization fix for inplace comparison ops
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77125

Approved by: https://github.com/ezyang
2022-05-24 18:20:31 +00:00
Brian Hirsh
22d566acda functionalization fix for inplace_view ops
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77126

Approved by: https://github.com/ezyang
2022-05-24 18:20:30 +00:00
Mengwei Liu
9e806619cc [Codegen] Remove view operator check in NativeFunctionGroups and allow skipping native function generation (#78145)
Summary:
This PR adds two features:
* A boolean to turn off native function generation in codegen
* Relaxing the `view` operator check for `NativeFunctionGroups`

Differential Revision: D36604646

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78145
Approved by: https://github.com/iseeyuan, https://github.com/bdhirsh
2022-05-24 05:48:30 +00:00
Mengwei Liu
ffa3cce100 [Codegen] Expose namespace argument for static dispatch (#77710)
For static dispatch we are hardcoding the namespace to be `at` for backend-specific C++ functions, e.g., `at::cpu::add()`. We are extending it to accept namespaces from the call site. This is a temporary solution; in the long run we want to introduce custom namespaces into the codegen system, e.g., we should be able to add `at::` to `native_functions.yaml` and parse it into `NativeFunction`. This needs a bit more design.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77710
Approved by: https://github.com/ezyang
2022-05-21 00:39:06 +00:00
John Clow
417373337f Put imports in correct order so clang-format doesn't get mad every time
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77282

Approved by: https://github.com/Krovatkin
2022-05-20 18:39:47 +00:00
Brian Hirsh
0161e9eb00 [test] attempt to functionalize ops with mutable positional-only args
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76320

Approved by: https://github.com/ezyang
2022-05-19 18:50:34 +00:00
Edward Z. Yang
befa4e371e Fix typo
Fixes #77412

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77488

Approved by: https://github.com/mruberry
2022-05-18 18:25:54 +00:00
Antonio Kim
55be35ae39 Fix 'Code below assumes there is at least one tensor arg' assumption (#76917)
Previously, when codegening ops like `zeros_` or `ones_`, we'd hit a `Code below assumes there is at least one tensor arg` error. This check is not entirely correct, which is what causes the error to be thrown. There are ops like the ones mentioned that pass in a `device` parameter that can be used in place of the "first tensor".

CC: @wconstab @desertfire @henrytwo @ke1337
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76917
Approved by: https://github.com/desertfire
2022-05-18 17:58:47 +00:00
John Clow
2a99018147 Adding a way to register both upper and lower bound functions
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77388

Approved by: https://github.com/eellison
2022-05-18 17:34:07 +00:00
Brian Hirsh
edc904d6ba add native view_copy.out ops, teach codegen about tensorlist out=
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76126

Approved by: https://github.com/ezyang
2022-05-18 14:23:43 +00:00
Yukio Siraichi
9d44250760 Reduce structured kernels' set_output boilerplate with new overloads.
Partially fix #69813
This PR does mainly 3 things:

1. Introduces new methods for the `MetaBase` API:
    - `set_output_strided`: creates proxy tensors with exact strides, if strides don't match
    - `set_output_contiguous`: alias for `set_output_strided` with contiguous strides
    - `set_output_raw_strided`: does not create proxy tensors

2. Modifies codegen for handling proxy tensors:
    - Creates a new field for out-of-place kernels: `proxy_output_`
    - Implements `set_output_strided` by creating a proxy tensor if necessary
    - Passes the proxy tensor to the `IMPL` function
    - Copies the result back to the real output at the end, whenever a proxy was created (sketched below)

3. Replaces `set_output` with `set_output_raw_strided` for `TensorIterator*`
    - Needed, since it overrides `set_output`
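
A self-contained sketch of that proxy-output flow, using toy types only (not the actual ATen machinery; the real generated wrappers appear in the custom-namespace entry earlier in this log):

```
#include <cstdio>
#include <optional>
#include <vector>

// Toy stand-in for a tensor: just strides plus flat data.
struct FakeTensor {
  std::vector<long> strides;
  std::vector<float> data;
};

int main() {
  FakeTensor out{{1}, std::vector<float>(4, 0.f)};  // user-supplied out= tensor
  const std::vector<long> wanted_strides{2};        // strides the kernel prefers

  // set_output_strided-like step: if the strides don't match, allocate a proxy.
  std::optional<FakeTensor> proxy;
  if (out.strides != wanted_strides) {
    proxy = FakeTensor{wanted_strides, std::vector<float>(4, 0.f)};
  }

  // The IMPL function writes into the proxy when one exists, else into out.
  FakeTensor& target = proxy ? *proxy : out;
  for (auto& x : target.data) x = 1.f;

  // At the end, copy the proxy's result back into the real output.
  if (proxy) out.data = proxy->data;
  std::printf("out.data[0] = %.1f\n", out.data[0]);  // prints 1.0
  return 0;
}
```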

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76096

Approved by: https://github.com/ezyang
2022-05-17 12:01:53 +00:00
Linbin Yu
1f8049566f Re-land BUCK build for pytorch mobile (#77612)
see https://github.com/pytorch/pytorch/pull/76480
fixed most lint errors
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77612
Approved by: https://github.com/kit1980
2022-05-17 00:30:13 +00:00
Bin Bao
25c6ebd12c Revert "Revert "[LT] Codegen ReuseNode for supported ops""
Summary: Fixed an XLC build failure by generating an always-return-false
default CanBeReused method.

This reverts commit 3cade9d454.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77513

Approved by: https://github.com/alanwaketan
2022-05-16 20:14:42 +00:00
PyTorch MergeBot
530481ed69 Revert "[mobile] add buck build for mobile targets (#76480)"
This reverts commit 168dc70faf.

Reverted https://github.com/pytorch/pytorch/pull/76480 on behalf of https://github.com/atalman
2022-05-16 16:14:17 +00:00
francescocastelli
dca416b578 Pretty-print dataclasses (#76810)
Unfortunately, the built-in pprint module supports pretty-printing of dataclasses only from Python 3.10. The code that I wrote in the `__str__` method of OpInfo should do the same job and should also work for any dataclass. For now I've put it there, but we can create a function and put it somewhere it is accessible to other dataclasses as well. Also, the max width (80) is hardcoded for now, but it would ideally be a parameter of the function.

When you call print on an OpInfo you get:
```
OpInfo(name = '__getitem__',
       ref = None,
       aliases = (),
       variant_test_name = '',
       op = <slot wrapper '__getitem__' of 'torch._C._TensorBase' objects>,
       method_variant = <slot wrapper '__getitem__' of 'torch._C._TensorBase' objects>,
       inplace_variant = None,
       skips = (<torch.testing._internal.common_methods_invocations.DecorateInfo object at 0x7f463acbca90>,
                <torch.testing._internal.common_methods_invocations.DecorateInfo object at 0x7f463acbcae0>),
       decorators = (<torch.testing._internal.common_methods_invocations.DecorateInfo object at 0x7f463acbca90>,
                     <torch.testing._internal.common_methods_invocations.DecorateInfo object at 0x7f463acbcae0>),
       sample_inputs_func = <function sample_inputs_getitem at 0x7f463acc6af0>,
       reference_inputs_func = None,
       error_inputs_func = None,
       sample_inputs_sparse_coo_func = <function _DecoratorContextManager.__call__.<locals>.decorate_context at 0x7f463acc6b80>,
       sample_inputs_sparse_csr_func = <function _DecoratorContextManager.__call__.<locals>.decorate_context at 0x7f463acc6c10>,
       dtypes = {torch.int16,
                 torch.float64,
                 torch.int32,
                 torch.int64,
                 torch.complex64,
                 torch.float16,
                 torch.bfloat16,
                 torch.uint8,
                 torch.complex128,
                 torch.bool,
                 torch.float32,
                 torch.int8},
       dtypesIfCUDA = {torch.int16,
                       torch.float64,
                       torch.int32,
                       torch.int64,
                       torch.complex64,
                       torch.float16,
                       torch.bfloat16,
                       torch.uint8,
                       torch.complex128,
                       torch.bool,
                       torch.float32,
                       torch.int8},
       dtypesIfROCM = {torch.int16,
                       torch.float64,
                       torch.int32,
                       torch.int64,
                       torch.complex64,
                       torch.float16,
                       torch.bfloat16,
                       torch.uint8,
                       torch.complex128,
                       torch.bool,
                       torch.float32,
                       torch.int8},
       backward_dtypes = {torch.int16,
                          torch.float64,
                          torch.int32,
                          torch.int64,
                          torch.complex64,
                          torch.float16,
                          torch.bfloat16,
                          torch.uint8,
                          torch.complex128,
                          torch.bool,
                          torch.float32,
                          torch.int8},
       backward_dtypesIfCUDA = {torch.int16,
                                torch.float64,
                                torch.int32,
                                torch.int64,
                                torch.complex64,
                                torch.float16,
                                torch.bfloat16,
                                torch.uint8,
                                torch.complex128,
                                torch.bool,
                                torch.float32,
                                torch.int8},
       backward_dtypesIfROCM = {torch.int16,
                                torch.float64,
                                torch.int32,
                                torch.int64,
                                torch.complex64,
                                torch.float16,
                                torch.bfloat16,
                                torch.uint8,
                                torch.complex128,
                                torch.bool,
                                torch.float32,
                                torch.int8},
       supports_out = False,
       supports_autograd = True,
       supports_gradgrad = True,
       supports_fwgrad_bwgrad = True,
       supports_inplace_autograd = False,
       supports_forward_ad = True,
       gradcheck_wrapper = <function OpInfo.<lambda> at 0x7f463a7a40d0>,
       check_batched_grad = True,
       check_batched_gradgrad = True,
       check_batched_forward_grad = True,
       check_inplace_batched_forward_grad = True,
       gradcheck_nondet_tol = 0.0,
       gradcheck_fast_mode = None,
       aten_name = '__getitem__',
       decomp_aten_name = None,
       aten_backward_name = None,
       assert_autodiffed = False,
       autodiff_nonfusible_nodes = ['aten::__getitem__'],
       autodiff_fusible_nodes = [],
       supports_sparse = False,
       supports_scripting = False,
       supports_sparse_csr = False,
       test_conjugated_samples = True,
       test_neg_view = True,
       assert_jit_shape_analysis = False,
       supports_expanded_weight = False)
```

cc @ezyang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76810
Approved by: https://github.com/ezyang
2022-05-16 14:20:41 +00:00
Linbin Yu
168dc70faf [mobile] add buck build for mobile targets (#76480)
Create buck targets to replicate internal BUCK build, including
- XNNPACK
- QNNPACK
- C10
- aten_cpu
- torch_mobile_core
- torch_mobile_all_ops
- ptmobile_benchmark

And able to run mobilenet v2 using ptmobile_benchmark (with all ops).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76480
Approved by: https://github.com/seemethere, https://github.com/dreiss
2022-05-15 18:42:41 +00:00
PyTorch MergeBot
3cade9d454 Revert "[LT] Codegen ReuseNode for supported ops"
This reverts commit 6066e5929f.

Reverted https://github.com/pytorch/pytorch/pull/76738 on behalf of https://github.com/malfet
2022-05-14 00:33:10 +00:00
Bin Bao
6066e5929f [LT] Codegen ReuseNode for supported ops
Summary:
1. Update the codegen script to add a TrieCache lookup (ReuseNode)
before creating a new IR node. The following is an example of the
generated code:

```
    at::Tensor LazyNativeFunctions::add(const at::Tensor & self, const at::Tensor & other, const at::Scalar & alpha) {
        ...
        torch::lazy::NodePtr node = torch::lazy::ReuseNode<AddTensor>(lazy_self->GetIrValue(), lazy_other->GetIrValue(), node_alpha);
        if (!node) {
            auto out_meta = at::meta::add(self, other, alpha);
            std::vector<Shape> shapes{Shape(out_meta.scalar_type(), out_meta.sizes().vec())};
            TORCH_INTERNAL_ASSERT(shapes.size() == 1);
            if(symbolicShapeEnabled()){
                std::vector<jit::IValue> inputs = { self, other, alpha };
                char* schema_str = "aten::add.Tensor(Tensor self, Tensor other, *, Scalar alpha=1) -> Tensor";
                applySymbolicShapesOnLT(schema_str, inputs, shapes);
            }

            node = torch::lazy::MakeNode<AddTensor>(lazy_self->GetIrValue(), lazy_other->GetIrValue(), node_alpha, std::move(shapes));
            CacheNode(node);
        }
        ...
    }
```
2. TrieCache lookup depends on each IR node subclass to provide its own
comparison function. The following is an example of the generated code:

```
  bool CanBeReused(const torch::lazy::Value& self, const torch::lazy::Value& other, const torch::lazy::Value& alpha) const {
    size_t i = 0;
    return (operand(i++) == self &&
        operand(i++) == other &&
        operand(i++) == alpha);
  }
```

3. DeviceData is specially handled.

4. Non-codegen op changes are coming in a separate PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76738

Approved by: https://github.com/JackCaoG, https://github.com/wconstab
2022-05-13 19:13:58 +00:00
Kulin Seth
e011a8e18b Enable PyTorch operations on MPS Backend. (#77343)
Add PyTorch operations to MPS backend.

- https://github.com/pytorch/pytorch/issues/77394
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77343
Approved by: https://github.com/albanD
2022-05-13 18:28:53 +00:00
JackCaoG
e36a8c1f13 Lazy codegen change for xla (#76717)
Codegen change to enable PyTorch/XLA to generate the first op in https://github.com/pytorch/xla/pull/3544.

@bdhirsh @wconstab
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76717
Approved by: https://github.com/Krovatkin
2022-05-12 17:04:04 +00:00
Brian Hirsh
47dd092bae add a new at::lift operator, fix torch.tensor for functionalization
This reverts commit 85bd65a880.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77285

Approved by: https://github.com/albanD, https://github.com/ezyang
2022-05-12 13:31:19 +00:00
PyTorch MergeBot
85bd65a880 Revert "[test] try to fix torch.tensor for functionalization"
This reverts commit 9edee09ed6.

Reverted https://github.com/pytorch/pytorch/pull/76319 on behalf of https://github.com/janeyx99
2022-05-11 18:48:42 +00:00
Brian Hirsh
9edee09ed6 [test] try to fix torch.tensor for functionalization
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76319

Approved by: https://github.com/ezyang
2022-05-11 17:27:34 +00:00
Kulin Seth
f348b1b2b5 Add the Runtime components for MPS backend. (#76725)
The PR adds the runtime components and a few basic operations, like copy and as_strided, for the MPS backend.

The current list of identified TODOs is:

-  https://github.com/pytorch/pytorch/issues/77176
- Unify the logic with CUDACachingAllocator and remove redundant code.
-  https://github.com/pytorch/pytorch/issues/77170
- Look into using C++ smart pointers where possible with ObjC code
- Use empty_strided_generic() to implement the `empty_strided_mps` code
- https://github.com/pytorch/pytorch/issues/77144
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76725
Approved by: https://github.com/albanD
2022-05-11 17:19:45 +00:00
Bin Bao
8f5cdc6d5d Revert "Revert "[LT] Store OpKind for each IR subclass in a static field""
Summary: Re-land https://github.com/pytorch/pytorch/pull/76711 by
fixing internal build errors.
Generate class-level opkind as a static method instead of a static
member.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77102

Approved by: https://github.com/wconstab, https://github.com/JackCaoG, https://github.com/antoniojkim
2022-05-11 12:27:05 +00:00
PyTorch MergeBot
7eaf4780ba Revert "[LT] Store OpKind for each IR subclass in a static field"
This reverts commit ac37ddc795.

Reverted https://github.com/pytorch/pytorch/pull/76711 on behalf of https://github.com/malfet
2022-05-09 20:50:09 +00:00
Nikolay Korovaiko
daf8c48a87 Revert "Revert "[WIP] customize the C++ class for valueT"" (#77003)
This reverts commit ec841b0346.

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77003
Approved by: https://github.com/shunting314, https://github.com/JackCaoG
2022-05-09 17:40:17 +00:00
PyTorch MergeBot
ec841b0346 Revert "[WIP] customize the C++ class for valueT"
This reverts commit c152817926.

Reverted https://github.com/pytorch/pytorch/pull/76911 on behalf of https://github.com/suo
2022-05-06 22:36:04 +00:00
Nikolay Korovaiko
c152817926 [WIP] customize the C++ class for valueT
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76911
Approved by: https://github.com/wconstab
2022-05-06 21:05:35 +00:00