Commit Graph

58 Commits

Author SHA1 Message Date
Jane Xu
71ca600af9 Renaming CAFFE2_API to TORCH_API (#49496)
Summary:
Since caffe2 and torch have been consolidated, CAFFE2_API should be merged with TORCH_API. Addresses a TODO.

Manually edited some references of the removed `CAFFE2_API`:
* `CONTRIBUTING.md`
* `caffe2/proto/CMakeLists.txt`
* `cmake/ProtoBuf.cmake`
* `c10/macros/Export.h`
* `torch/csrc/WindowsTorchApiMacro.h`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49496

Reviewed By: malfet, samestep

Differential Revision: D25600726

Pulled By: janeyx99

fbshipit-source-id: 7e068d959e397ac183c097d7e9a9afeca5ddd782
2020-12-18 10:54:50 -08:00
Jerry Zhang
b8d98f05e7 [reland][quant][docs] Add fx graph mode quantization to quantization docs (#49211) (#49515)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49515

Test Plan:
Imported from OSS

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D25601061

fbshipit-source-id: 74e917d57895e9b4131a01fdcea8df3e94322bec
2020-12-17 10:30:10 -08:00
Sebastian Messmer
872f6486b1 Prevent accidentally writing old style ops (#49510)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49510

Adding old style operators with out arguments will break XLA. This prevents that. See for background: https://fb.workplace.com/groups/pytorch.dev/permalink/809934446251704/

This is a temporary change that will prevent this breakage for the next couple of days until the problem is resolved for good.
It will be deleted in https://github.com/pytorch/pytorch/pull/49164 then.
ghstack-source-id: 118756437

(Note: this ignores all push blocking failures!)

Test Plan: waitforsandcastle

Reviewed By: bhosmer

Differential Revision: D25599112

fbshipit-source-id: 6b0ca4da4b55da8aab9d1b332cd9f68e7602301e
2020-12-16 16:34:49 -08:00
Edward Yang
3efd5d8f01 Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122

cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way.  This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming.  The net result is shorter, cleaner and should be more robust to future changes.

This PR is a bit of a whopper.  Here is the order to review it.

- tools/codegen/api/types.py
  - Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
  - Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the  binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
  - I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
  - All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
   - We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
  - `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: ljk53

Differential Revision: D25455887

Pulled By: ezyang

fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-16 16:18:40 -08:00
Mike Ruberry
676bfa6dbd Revert D25507480: [quant][docs] Add fx graph mode quantization to quantization docs
Test Plan: revert-hammer

Differential Revision:
D25507480 (7729581414)

Original commit changeset: 9e9e4b5fef97

fbshipit-source-id: fdb08d824209b97defaba2e207d1a914575a6ae7
2020-12-16 14:26:18 -08:00
Jerry Zhang
7729581414 [quant][docs] Add fx graph mode quantization to quantization docs (#49211)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49211

Test Plan: Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D25507480

fbshipit-source-id: 9e9e4b5fef979f5621c1bbd1b49e9cc6830da617
2020-12-16 12:40:02 -08:00
Sebastian Messmer
a6274c1278 Making ops c10 full: out overloads with default arguments (#49012)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49012

For some reason we apply default arguments to the functions in at::native too. So when an out overload had default arguments,
we couldn't move the out argument to the end because of those default arguments preceding it.
This PR fixes that and makes out overloads with default arguments c10-full
ghstack-source-id: 118619222

(Note: this ignores all push blocking failures!)

Test Plan: waitforsandcastle

Reviewed By: ezyang

Differential Revision: D25394605

fbshipit-source-id: 2ed1c3ce0d04a548e3141df2dca517756428fe15
2020-12-15 11:21:40 -08:00
Nikita Shulga
8999915a86 Fix "Missing return statement" mypy error (#49276)
Summary:
Adds `return None` after `assert_never` in the inner `get_one` function
Without it, TestTypeHints.test_run_mypy_strict using mypy  0.770 fails with the above mentioned error, see https://app.circleci.com/pipelines/github/pytorch/pytorch/249909/workflows/597d8e34-ff04-4efa-9dde-9e28fbded341/jobs/9557705

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49276

Reviewed By: jamesr66a

Differential Revision: D25513658

Pulled By: malfet

fbshipit-source-id: 318eaff7e0534b10eafe46c0b834b7f7cefea757
2020-12-11 22:18:50 -08:00
Sebastian Messmer
1cb5aa6c60 Fix structured kernel codegen (#49244)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49244

see https://fb.quip.com/ceEdANd5iVsO

RegisterMkldnnCPU kernels incorrectly used makeUnboxedOnly() calls to register add_.Tensor kernels. This is because the codegen incorrectly thought they're not c10-full.
This PR fixes that.
ghstack-source-id: 118411117

Test Plan: After this PR, RegisterMkldnnCPU doesn't contain the makeUnboxedOnly() calls anymore.

Reviewed By: ezyang

Differential Revision: D25500246

fbshipit-source-id: 8a8c2be9c4f4a5ce7eaae94257c2f8cbd176e92e
2020-12-11 12:37:35 -08:00
Sebastian Messmer
56a157fc79 hacky_wrapper_for_legacy_signatures reorders out arguments (#48911)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48911

This enables us to use hacky_wrapper_for_legacy_signatures for ops with out arguments so they can use templated unboxing logic without having to be rewritten.

This only actually enables it for one op as a proof of concept. There will be a separate PR enabling it for more ops.
ghstack-source-id: 118379659

Test Plan: waitforsandcastle

Reviewed By: bhosmer

Differential Revision: D25363336

fbshipit-source-id: da075d2cc58814f886a25d52652511dbbe990cec
2020-12-10 23:29:00 -08:00
Edward Yang
59e822026c Add manual_cpp_binding to native_functions.yaml (#49092)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49092

Functions which specify manual_cpp_binding don't automatically
get C++ bindings generated for them in TensorBody.h or
Functions.h.  This lets end users manually define the bindings
themselves, which may be helpful if there is a way to
short circuit the dispatcher entirely.  contiguous() is switched
to use this mechanism.

Although manual_cpp_binding suggests that we don't generate the
binding at all, it is often the case that there is some "fast
path", but when this path is not satisfied, we should go back
to the slow dispatch.  So we still generate a fallback method/function
which the user-defined binding can call into in case that we
have to go slowpath.

The correctness conditions for bindings manually written in this
way are subtle.  Here are the ones I can think of off the top
of my head:

- Whatever condition is tested in the C++ body, must ALSO be
  tested again in the native:: implementation on the other
  side of the dispatcher.  This is because you are NOT GUARANTEED
  to hit the native:: implementation through the C++ binding,
  you may go straight to the implementation via a boxed call.

- If a binding is written in this way, it is only safe to
  skip dispatch if you would have returned the same tensor as
  before.  In any situation you would return a fresh tensor,
  you MUST go to the slow path, because you need to actually
  get to the autograd kernel.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D25428440

Pulled By: swolchok

fbshipit-source-id: 6e71767cb8d1086d56cd827c1d2d56cac8f6f5fe
2020-12-10 21:56:53 -08:00
Edward Yang
9b0ffb9fb3 Delete cpp.group_arguments (#49043)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49043

Previously, this function had nontrivial algorithmic content,
but after #48195, this was just a swiss army knife for pasting
together arguments while maintaining structure.  I added some
more properties for Arguments for convenient access in this way,
and then inlined the implementation of group_arguments into all of its call
sites, simplifying whenever contextual.  This might be controversial, but I
think the resulting code is easier to understand.

You may notice that there is some modest code duplication between
dispatcher.cpparguments_exprs and CppSignature.argument_packs.
This is a known problem and I will be attempting to fix it in
a follow up PR.

Confirmed to be byte-for-byte compatible.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D25455885

Pulled By: ezyang

fbshipit-source-id: 8fbe066e8c3cb7ee8adb5b87296ec5bd7b49e01f
2020-12-10 18:20:46 -08:00
Edward Yang
267641a245 Rename positional and kwarg_only to have flat prefix (#49042)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49042

I want the names positional and kwarg_only to give the unflat
representation (e.g., preserving TensorOptionsArguments in the
returned Union).  So I regret my original naming choice when
I moved grouping to model.  This renames them to have flat_ prefix
and also adds a flat_non_out argument for cases where you just
want to look at non-out arguments.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D25455884

Pulled By: ezyang

fbshipit-source-id: f923f8881267a3e3e8e9521519412f7cc25034fc
2020-12-10 18:20:43 -08:00
Sebastian Messmer
c892c3ac9a remove hacky_wrapper from BackendSelect (#49079)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49079

BackendSelect kernels have been changed to be written the new way, so this hacky_wrapper here isn't needed anymore.
This PR is not expected to change perf or anything, just simplify the code a bit. The hacky_wrapper here was a no-op and not creating any actual wrappers
because it short-cirtuits to not create a wrapper when there is no wrapper needed.
ghstack-source-id: 118318436

Test Plan: waitforsandcastle

Reviewed By: bhosmer

Differential Revision: D25421633

fbshipit-source-id: 7a6125613f465dabed155dd892c8be6af5c617cf
2020-12-10 12:54:29 -08:00
Edward Yang
16b8e6ab01 Class-based structured kernels, with migration of add to framework (#48718)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48718

This PR rewrites structured kernels to do the class-based mechanism (instead of defining a meta and impl function, they are methods on a class), and adds enough customizability on the class to support TensorIterator. To show it works, add is made a structured kernel. Don't forget to check https://github.com/pytorch/rfcs/pull/9 for a mostly up-to-date high level description of what's going on here.

High level structure of this PR (the order you should review files):
* TensorMeta.h - TensorMeta is deleted entirely; instead, meta functions will call `set_output` to allocate/resize their outputs. MetaBase gets a new `maybe_get_output` virtual method for retrieving the (possibly non-existent) output tensor in a meta function; this makes it easier to do special promotion behavior, e.g., as in TensorIterator.
* TensorIterator.cpp - Two major changes: first, we add TensorIteratorBase::set_output, which is a "light" version of TensorIterator::set_output; it sets up the internal data structures in TensorIterator, but it doesn't do allocation (that is assumed to have been handled by the structured kernels framework). The control flow here is someone will call the subclassed set_output, which will allocate output, and then we will call the parent class (TensorIteratorBase) to populate the fields in TensorIterator so that other TensorIterator phases can keep track of it. Second, we add some tests for meta tensors, and skip parts of TensorIterator which are not necessary when data is not available.
* tools/codegen/model.py - One new field in native_functions.yaml, structured_inherits. This lets you override the parent class of a structured meta class; normally it's MetaBase, but you can make it point at TensorIteratorBase instead for TensorIterator based kernels
* tools/codegen/gen.py - Now generate all of the classes we promised. It's kind of hairy because this is the first draft. Check the RFC for what the output looks like, and then follow the logic here. There are some complications: I need to continue to generate old style wrapper functions even if an operator is structured, because SparseCPU/SparseCUDA/etc won't actually use structured kernels to start. The most complicated code generation is the instantiation of `set_output`, which by in large replicates the logic in `TensorIterator::set_output`. This will continue to live in codegen for the forseeable future as we would like to specialize this logic per device.
* aten/src/ATen/native/UpSampleNearest1d.cpp - The previous structured kernel is ported to the new format. The changes are very modest.
* aten/src/ATen/native/BinaryOps.cpp - Add is ported to structured.

TODO:
* Work out an appropriate entry point for static runtime, since native:: function stubs no longer are generated
* Refactor TensorIteratorConfig construction into helper functions, like before
* Make Tensor-Scalar addition structured to fix perf regression
* Fix `verify_api_visibility.cpp`
* Refactor tools/codegen/gen.py for clarity
* Figure out why header changes resulted in undefined reference to `at::Tensor::operator[](long) const`

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D25278031

Pulled By: ezyang

fbshipit-source-id: 57c43a6e5df21929b68964d485995fbbae4d1f7b
2020-12-09 15:39:12 -08:00
Sebastian Messmer
3ef36dca8e Faithful out arguments (#47712)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47712

This adds a faithful API for ops with out arguments, as described in https://docs.google.com/document/d/1h7nBibRwkRLQ8rsPhfALlwWR0QbkdQm30u4ZBwmaps8/edit# .

After this, an op will generate the following overloads for the C++ API:

```cpp
// Generated from the aten::abs operator (NOT from aten::abs.out)
Tensor at::abs(Tensor& self)

// Generated from the aten::abs.out operator
Tensor& at::abs(Tensor& self, Tensor& out)
Tensor& at::abs_out(Tensor& out, Tensor& self)

```

This is an important step towards making those ops c10-full (it allows VariableType, XLA and other backends to ignore reordering and just call through with the same argument order), but this does not make any of those ops c10-full yet.
It enables the faithful API independent from c10-fullness. That means the API is more consistent with the same API for all ops and making an op c10-full in the future will not trigger future C++ API changes.
ghstack-source-id: 118068091

Test Plan: waitforsandcastle

Reviewed By: ezyang

Differential Revision: D24835252

fbshipit-source-id: dedfabd07140fc8347bbf16ff219aad3b20f2870
2020-12-08 03:48:42 -08:00
Sebastian Messmer
046ea6696d Enable faithful API for all ops (#47711)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47711

Seems we generated the declaration but the definition only for c10-full ops. We should also generate the definition for non-c10-full ops.
This makes future migrations of ops from non-c10-full to c10-full have a lower impact on the C++ API.
ghstack-source-id: 118064755

Test Plan: waitforsandcastle

Reviewed By: bhosmer

Differential Revision: D24835006

fbshipit-source-id: 8f5c3c0ffcdc9b479ca3785d57da16db508795f5
2020-12-08 03:43:48 -08:00
Edward Yang
742903c0df Move argument grouping into FunctionSchema (#48195)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48195

The general approach is to change Arguments, splitting `positional`, `kwarg_only` and `out`, into `pre_self_positional`, `self_arg`, `post_self_positional`, and `pre_tensor_options_kwarg_only`, `tensor_options` and `post_tensor_options_kwarg_only`. The splits are as you'd expect: we extract out the self argument and the tensor options arguments, and record the other arguments that came before and after. To do this, we move the logic in `group_arguments` to the parsing process.

Some fuzz in the process:
* I renamed `ThisArgument` to `SelfArgument`, since we don't actually use the terminology "this" outside of C++ (and the model is Python biased)
* I kept the `group_arguments` function, which now just reads out the arguments from the structured model in the correct order. In the long term, we should get rid of this function entirely, but for now I kept it as is to reduce churn.
* I decided to arbitrarily say that when self is missing, everything goes in "post-self", but when tensor options is missing, everything goes in "pre-tensor-options". This was based on where you typically find the argument in question: self is usually at front (so most args are after it), while tensor options are typically at the end (so most args go before it).

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: zhangguanheng66

Differential Revision: D25231166

Pulled By: ezyang

fbshipit-source-id: 25d77ad8319c4ce0bba4ad82e451bf536ef823ad
2020-12-02 07:57:11 -08:00
Edward Yang
ba5686f8c5 Refactor argument fields in FunctionSchema to Arguments (#48182)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48182

I'm planning to add a bunch more argument fields following
https://github.com/pytorch/pytorch/pull/45890#discussion_r503646917 and
it will be a lot more convenient if the arguments get to live
in their own dedicated struct.  Type checker will tell you if
I've done it wrong.  No change to output.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: ljk53

Differential Revision: D25057897

Pulled By: ezyang

fbshipit-source-id: dd377181dad6ab0c894d19d83408b7812775a691
2020-12-02 07:57:06 -08:00
Edward Yang
b4f5efa7b2 Structured kernels generate Meta registrations (#48116)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48116

If you port kernels to be structured, you get Meta kernels automatically
generated for you.  This is one payoff of structured kernels.

Code generation was mercifully really simple, although at risk of
"swiss cheese" syndrome: there's two new conditionals in the codegen
to tweak behavior when generating for meta keys.  It's not too bad
right now but there's a risk of things getting out of hand.  One
way to rationalize the logic here would be to transmit "TensorMeta-ness"
inside the TensorOptions (so tensor_from_meta can deal with it); then
the "Meta" kernel magic would literally just be generating empty
out_impls to call after all the scaffolding is done.  But I didn't
do this because it seemed like it would be more annoying short term.

Also had to teach resize_ to work on meta tensors, since we use them
to implement the out kernels.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: bhosmer, ailzhang

Differential Revision: D25056640

Pulled By: ezyang

fbshipit-source-id: f8fcfa0dbb58a94d9b4196748f56e155f83b1521
2020-12-02 07:54:48 -08:00
Jiakai Liu
f98ab18445 [pytorch][codegen] move is_abstract property to NativeFunction model (#48252)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48252

Moved to a shared place so that gen_variable_type.py can reuse it.

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D25087808

Pulled By: ljk53

fbshipit-source-id: 1f32e506956fc4eb08734cfde0add47b3e666bd9
2020-11-19 12:30:13 -08:00
Edward Yang
cdc2d2843b Structured kernel definitions (#45277)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45277

Implements structured kernels as per https://github.com/pytorch/rfcs/pull/9 and ports upsample_nearest1d to use the framework.

The general structure of this diff:

- Define a new syntax for specifying structured kernels in `native_functions.yaml`. You put `structured: True` on the `out` function (that's what you implement) and `structured_delegate: foo.out` on the functional/inplace variants to define them in terms of the `out` function. There's a bunch of new consistency checking to see if you've done this right, though the error messages are of varying quality. This is most of what's going on in tools.codegen.model
- NativeFunctionGroup turns into StructuredNativeFunctions. Previously I thought that maybe we would use this grouping mechanism for both structured and unstructured kernels, but it turned out that Jiakai needed to make his own grouping structure. So now I've specialized it for structured kernels, which also means I get to add a bunch of invariants, like requiring structured kernels to have both a functional and an out variant. This is the lower bundle of changes in tools.codegen.model
- When you make an out kernel structured, this induces us to generate a new meta function signature for you to write shape checking and output allocation code. The signatures of these is defined by `tools.codegen.api.meta` and generated into `MetaFunctions.h`. Coverage here is very bare bones and will be driven by actual operators we port as we go.
- The meaty part of code generation is what we do when we have some grouped StructuredNativeFunctions. We continue to generate a wrapper per function type, but they're are a bit different as the call your meta functions, and make reference to the actual implementations in out.
- Then there's a port of `upsample_nearest1d`; easiest to review by just looking at what the final code looks like.

Missing pieces:

- Stride calculation in TensorMeta
- Sufficient sanity checking for inplace/out variants
- Enough rope to make TensorIterator work

This PR improves instruction counts on `upsample_nearest1d` because it eliminates an extra redispatch. Testing `at::upsample_nearest1d(x, {10});`

* Functional: before 1314105, after 1150705
* Out: before 915705, after 838405

These numbers may be jittered up to +-16400 (which is the difference when I tested against an unaffected operator `at::upsample_linear1d`), though that may also because unrelated changes affected all operators globally.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D24253555

Test Plan: Imported from OSS

Reviewed By: smessmer

Pulled By: ezyang

fbshipit-source-id: 4ef58dd911991060f13576864c8171f9cc614456
2020-11-17 15:24:43 -08:00
Edward Yang
809660ffa4 ATen DerivedType is dead, long live ATen RegisterDispatchKey (#47011)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47011

smessmer has complained about how it is difficult to find generated
code.  Well hopefully this diffs helps a bit with that.

There are three components to this refactor:

- Rename TypeDerived (CPUType) to RegisterDispatchKey (RegisterCPU).
  The 'Type' nomenclature is vestigial and I think Register says
  what these files do a lot more clearly.  I also got rid of
  the CPUType namespace; everything just goes in anonymous
  namespace now, less moving parts this way.
- Give Math and DefaultBackend their own files (RegisterMath and
  RegisterDefaultBackend)
- Restructure code generation so that schema definition is done
  completely separately from RegisterDispatchKey

I decided to name the files RegisterCPU rather than the old convention
BackendSelectRegister, because it seems better to me if these
files clump together in an alphabetical listing rather than being
spread out everywhere.  There are a few manual registration files
which should probably get similar renaming.

I also did a little garden cleaning about how we identify if a
dispatch key is a cuda key or a generic key (previously called
KEYWORD_ALL_BACKENDS but I like my naming better).

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D24600806

Test Plan: Imported from OSS

Reviewed By: smessmer

Pulled By: ezyang

fbshipit-source-id: c1b510dd7515bd95e3ad25b8edf961b2fb30a25a
2020-11-12 09:53:48 -08:00
Edward Yang
0c64f9f526 Convert from higher order functions to classes in tools.codegen.gen (#47008)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47008

bhosmer has been complaining about how it is difficult to distinguish
between local variables and closed over variables in the higher order
functions.  Well, closures and objects do basically the same thing, so
just convert all these HOFs into objects.

The decoder ring:
- Higher order function => Constructor for object
- Access to closed over variable => Access to member variable on object
- with_native_function => method_with_native_function (because it's
  hard writing decorators that work for both functions and methods)

I didn't even have to change indentation (much).

When there is no need for closed over variables (a few functions), I
kept them as plain old functions, no need for an object with no
members.

While I was at it, I also deleted the kwargs, since the types are
enough to prevent mistakes.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D24600805

Pulled By: ezyang

fbshipit-source-id: 7e3ce8cb2446e3788f934ddcc17f7da6e9299511
2020-11-11 10:30:50 -08:00
Jiakai Liu
4159191f0e [pytorch] split out trace type generator and migrate to new codegen model (#47438)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47438

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D24808211

Pulled By: ljk53

fbshipit-source-id: 44dfadf550a255c05aa201e54b48101aaf722885
2020-11-09 12:39:39 -08:00
Jiakai Liu
499d2fad98 [pytorch] factor out return_names api (#47437)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47437

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D24808213

Pulled By: ljk53

fbshipit-source-id: 8ec6d58952fd677ab2d97e63b060cafda052411a
2020-11-09 12:39:37 -08:00
Edward Yang
843cab3f2e Delete TypeDefault.h and TypeDerived.h codegen entirely. (#47002)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47002

There was no good reason for TypeDerived.h (CPUType.h) codegen
to exist after static dispatch was deleted, and now that we
have Math alias key TypeDefault.h header is not needed either.
Sorry to anyone who was using these out of tree.

I didn't entirely delete TypeDefault.h as it has a use in
a file that I can't conveniently compile test locally.  Will
kill it entirely in a follow up.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D24596583

Pulled By: ezyang

fbshipit-source-id: b5095d3509098ff74f836c5d0c272db0b2d226aa
2020-10-29 14:43:53 -07:00
Edward Yang
41f8641f1e Delete SchemaRegister.cpp, make flag operate on TypeDefault.cpp (#46991)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46991

This change is motivated by a problem bdhirsh observed which is
that in internal builds that include both SchemaRegister.cpp
and TypeDefault.cpp, some operators have their schemas defined
multiple times.  Instead of dumping schema registrations in
multiple files, it seems better to just toggle how many schemas
we write into TypeDefault.cpp.

ljk53 observes that technically SchemaRegister.cpp is only needed by
full-JIT frontend, and not by light interpreter (to resolve schema
lookups).  However, in practice, the registration file seems to be
unconditionally loaded.  This change will make it harder to do the
optimization where we drop schemas in the light interpreter, but you
probably want to architect this differently (similar to per-op
registrations, DON'T do any registrations in ATen, and then write out
the schema registrations in a separate library.)

I took this opportunity to also simplify the TypeDefault generation
logic by reworking things so that we only ever call with None argument
when registering.  Soon, we should be able to just split these
files up entirely.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: ljk53

Differential Revision: D24593704

Pulled By: ezyang

fbshipit-source-id: f01ea22a3999493da77b6e254d188da0ce9adf2f
2020-10-29 14:43:47 -07:00
Edward Yang
54d83296a9 Desugar missing dispatch field into singleton Math entry (#46970)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46970

Now that catchall declarations are reinterpreted as registrations to
dispatch key Math, we can now simplify code generation logic by directly
generating to Math, and bypasing logic for catchall.  This also helps
avoid bugs where we incorrectly classify some kernels as Math and others
as not, even though they get registered in the same way.

Bill of changes:
- Give Math its own unique TORCH_LIBRARY_IMPL
- Make it so NativeFunction.dispatch is always non-None.  Simplify
  downstream conditionals accordingly
- When parsing NativeFunction, fill in missing dispatch with a
  singleton Math entry (pointing to the cpp.name!)

One thing that is a little big about this change is a lot of kernels
which previously didn't report as "math" now report as math.  I picked
a setting for these booleans that made sense to me, but I'm not sure
if e.g. XLA will handle it 100% correctly.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D24592391

Pulled By: ezyang

fbshipit-source-id: 2e3355f19f9525698864312418df08411f30a85d
2020-10-29 14:43:44 -07:00
Edward Yang
87e86fa84c Some miscellaneous cleanup in codegen (#46940)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46940

- Remove inaccurate generated comments
- Delete some dead code
- Delete some unused headers
- Delete unnecessary SparseTypeDerived.cpp template

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D24573971

Pulled By: ezyang

fbshipit-source-id: 3de05d9cd9bada4c73f01d6cfaf51f16ada66013
2020-10-29 14:43:41 -07:00
Edward Yang
dc6f723cb4 Delete Vulkan from code generator. (#46938)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46938

It turns out that after https://github.com/pytorch/pytorch/pull/42194
landed we no longer actually generate any registrations into this
file.  That means it's completely unnecessary.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: IvanKobzarev

Differential Revision: D24573518

Pulled By: ezyang

fbshipit-source-id: b41ada9e394b780f037f5977596a36b896b5648c
2020-10-29 14:40:54 -07:00
Alexander Grund
93719440b8 Replace map(lambda constructs (#46462)
Summary:
Follow-up of https://github.com/pytorch/pytorch/issues/46461 with a similar goal

Makes them more readable and possibly faster. Care has to be taken because `map` applies the function immediately while `(x for x in xs)` is a generator expression which gets evaluated later. This is a benefit in some cases where it is not required to actually create the list of values in memory (e.g. when passing to `tuple` or `extend` or `join`)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46462

Reviewed By: zou3519

Differential Revision: D24422343

Pulled By: ezyang

fbshipit-source-id: 252e33499c92ac0b15238f2df32681dbbda2b237
2020-10-22 09:50:22 -07:00
Ailing Zhang
33e82c0269 Update error message to include link to readme. (#46613)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46613

Test Plan: CI

Reviewed By: ezyang

Differential Revision: D24430852

fbshipit-source-id: 811e4d10508d47ef830d2b8445f11592f342461f
2020-10-21 19:38:19 -07:00
Alexander Grund
5b0f400488 Replace list(map(...)) constructs by list comprehensions (#46461)
Summary:
As discussed in https://github.com/pytorch/pytorch/issues/46392 this makes the code more readable and possibly more performant.

It also fixes a bug detected by this where the argument order of `map` was confused: 030a24906e (diff-5bb26bd3a23ee3bb540aeadcc0385df2a4e48de39f87ed9ea76b21990738fe98L1537-R1537)

Fixes https://github.com/pytorch/pytorch/issues/46392

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46461

Reviewed By: ailzhang

Differential Revision: D24367015

Pulled By: ezyang

fbshipit-source-id: d55a67933cc22346b00544c9671f09982ad920e7
2020-10-19 18:42:49 -07:00
Dhruv Matani
0c5cd8c2b9 [RFC] Switch PyTorch Selective Build (Custom Build) to use the SelectiveBuilder abstraction (#45722)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45722

This diff does a bunch of things:

1. Introduces some abstractions as detailed in https://fb.quip.com/2oEzAR5MKqbD to help with selective build related codegen in multiple files.
2. Adds helper methods to combine operators, debug info, operator lists, etc...
3. Currently, the selective build machinery querying `op_registration_whitelist` directly at various places in the code. `op_registration_whitelist` is a list of allowed operator names (without overload name). We want to move to a world where the overload names are also included so that we can be more selective about which operators we include. To that effect, it makes sense to hide the checking logic in a separate abstraction and have the build use that abstraction instead of putting all this selective build specific logic in the code-generator itself. This change is attempting to do just that.
4. Updates generate_code, unboxing-wrapper codegen, and autograd codegen to accept the operator selector paradigm as opposed to a selected operator list.
5. Update `tools/code_analyzer/gen_op_registration_allowlist.py` to expose providing an actual structured operator dependency graph in addition to a serialized string.

There are a bunch of structural changes as well:

1. `root_op_list.yaml` and `combined_op_list.yaml` are now actual YAML files (not a space separated list of operator names)
2. `generate_code.py` accepts only paths to operator list YAML files (both old style as well as new style) and not list of operator names on the command line as arguments
3. `gen.py` optionally also accepts a custom build related operators YAML path (this file has information about which operators to register in the generated library).

ghstack-source-id: 114578753

(Note: this ignores all push blocking failures!)

Test Plan:
`buck test caffe2/test:selective_build`

Generated YAML files after the change:

{P143981979}

{P143982025}

{P143982056}

Ensure that the generated files are same before and after the change:

```
[dhruvbird@devvm2490 /tmp/TypeDefault.cpp] find -name "*.cpp" | xargs md5sum
d72c3d125baa7b77e4c5581bbc7110d2  ./after_change/gen_aten/TypeDefault.cpp
42353036c83ebc7620a7159235b9647f  ./after_change/lite_predictor_lib_aten/TypeDefault.cpp
d72c3d125baa7b77e4c5581bbc7110d2  ./before_change/gen_aten/TypeDefault.cpp
42353036c83ebc7620a7159235b9647f  ./before_change/lite_predictor_lib_aten/TypeDefault.cpp
```

`VariableTypes_N.cpp` are generated the same both before and after the change:

```
[dhruvbird@devvm2490 /tmp/VariableType] find -name "*.cpp" | xargs -n 1 md5sum | sort
3be89f63fd098291f01935077a60b677  ./after/VariableType_2.cpp
3be89f63fd098291f01935077a60b677  ./before/VariableType_2.cpp
40a3e59d64e9dbe86024cf314f127fd6  ./after/VariableType_4.cpp
40a3e59d64e9dbe86024cf314f127fd6  ./before/VariableType_4.cpp
a4911699ceda3c3a430f08c64e8243fd  ./after/VariableType_1.cpp
a4911699ceda3c3a430f08c64e8243fd  ./before/VariableType_1.cpp
ca9aa611fcb2a573a8cba4e269468c99  ./after/VariableType_0.cpp
ca9aa611fcb2a573a8cba4e269468c99  ./before/VariableType_0.cpp
e18f639ed23d802dc4a31cdba40df570  ./after/VariableType_3.cpp
e18f639ed23d802dc4a31cdba40df570  ./before/VariableType_3.cpp
```

Reviewed By: ljk53

Differential Revision: D23837010

fbshipit-source-id: ad06b1756af5be25baa39fd801dfdf09bc565442
2020-10-18 15:10:42 -07:00
Ailing Zhang
8c629ecc9a [WIP] Move catchAll to Math (#45939)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45939

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D24165890

Pulled By: ailzhang

fbshipit-source-id: 72fe71ea95a738251b2fafc9eea4ab3831cf426b
2020-10-16 16:17:16 -07:00
Ailing Zhang
ec5f81f9d3 Remove variable_excluded_from_dispatch() check for factory functions. (#46371)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46371

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D24324545

Pulled By: ailzhang

fbshipit-source-id: 78038054690dff14883df711073be4c2da4e1f8b
2020-10-15 21:15:41 -07:00
Ailing Zhang
419dafe791 [Reland] Update native_functions.yaml to add DefaultBackend. (#46236)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46236

Test Plan: Imported from OSS

Reviewed By: Krovatkin

Differential Revision: D24273378

Pulled By: ailzhang

fbshipit-source-id: bed1d4c84c0bba88a7da4d9bd2ccaa58253cf91e
2020-10-14 22:37:28 -07:00
Sebastian Messmer
69e152e60b Fix device guard for c10-full ops (#46091)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46091

ghstack-source-id: 114269274

Test Plan:
vs prev diff: https://www.internalfb.com/intern/fblearner/details/224487971/

vs D23328718 (6ba6ecb048) : https://www.internalfb.com/intern/fblearner/details/224488043/

Reviewed By: ezyang

Differential Revision: D24219943

fbshipit-source-id: bbabafb5c5b76ce0e93df4fdae2f08221354d9f7
2020-10-14 06:32:43 -07:00
Ailing Zhang
a37f2749cd Avoid computing AutogradKey if not needed. (#46252)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46252

Test Plan: CI

Reviewed By: ngimel

Differential Revision: D24272744

fbshipit-source-id: 6cb66d13e6c910df1ad1a8badd43f990e7b55368
2020-10-13 15:01:55 -07:00
Edward Yang
d705083c2b Refactor dispatcher and native to use Signature structure. (#45990)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45990

In #45890 we introduced the concept of a CppSignature, which bundled
up all of the information necessary to declare a C++ signature for
the cpp API.  This PR introduces analogous concepts for dispatcher
and native: DispatcherSignature and NativeSignature.

The three interfaces are not particularly well coupled right now,
but they do have some duck typing coincidences:

- defn() which renders the C++ definition "bool f(int x)"
- decl() which renders the C++ declaration "bool f(int x = 2)"
- type() which renders the C++ function type "bool(int)"

Maybe at some point we'll introduce a Protocol, or a supertype.
Many other methods (like arguments()) have varying types.  These
signatures also have some helper methods that forward back to real
implementations in the api modules.  Something to think about is
whether or not we should attempt to reduce boilerplate here or
not; I'm not too sure about it yet.

The net effect is we get to reduce the number of variables we
have to explicitly write out in the codegen, since now these are all
bundled together into a signature.  Something extra special happens
in BackendSelect, where we now dynamically select between dispatcher_sig
and native_sig as "how" the backend select is implemented.

A little bit of extra cleanup:
- Some places where we previously advertised Sequence, we now advertise
  a more informative Tuple.
- defn() may take an optional positional parameter overriding the entire
  name, or a kwarg-only prefix parameter to just add a prefix to the
  name.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: smessmer

Differential Revision: D24223100

Pulled By: ezyang

fbshipit-source-id: f985eced08af4a60ba9641d125d0f260f8cda9eb
2020-10-13 08:34:48 -07:00
Edward Yang
f086032676 Remove unnecessary byte-for-byte compatibility code that is not needed. (#45975)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45975

I reordered declarations in the faithful API reimplementation to
make sure the diffs lined up nicely; they're not necessary now.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: smessmer

Differential Revision: D24223102

Pulled By: ezyang

fbshipit-source-id: 77c6ae40c9a3dac36bc184dd6647d6857c63a50c
2020-10-13 08:34:46 -07:00
Edward Yang
8d5c899b19 Rename legacy_dispatcher to native. (#45974)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45974

The term "legacy dispatcher" caused a bunch of confusion between
me and Sebastian when discussing what the intended semantics of
legacy dispatcher argument is.  Legacy dispatcher argument implies
that you ought NOT to use it when you have use_c10_dispatcher: full;
but that's not really what's going on; legacy dispatcher API describes
the API that you write native:: functions (NativeFunctions.h) to.
Renaming it here makes this more clear.

I applied these seds:

```
git grep -l 'legacy_dispatcher' | xargs sed -i 's/legacy_dispatcher/native/g'
git grep -l 'legacydispatcher' | xargs sed -i 's/legacydispatcher/native/g'
git grep -l 'LegacyDispatcher' | xargs sed -i 's/LegacyDispatcher/Native/g'
```

and also grepped for "legacy" in tools/codegen and fixed documentation.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: smessmer

Differential Revision: D24223101

Pulled By: ezyang

fbshipit-source-id: d1913b8b823b3b95e4546881bc0e876acfa881eb
2020-10-13 08:34:43 -07:00
Edward Yang
944eb0e31d Add NativeFunctionGroup (#45918)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45918

This groups together related native functions (functional, inplace, out)
into a single group.  It's not used by anything but Jiakai said this
would be useful for his stuff so I'm putting it in immediately.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: smessmer

Differential Revision: D24163526

Pulled By: ezyang

fbshipit-source-id: 9979b0fe9249c78e4a64a50c5ed0e2ab99f499b9
2020-10-13 08:34:36 -07:00
Edward Yang
9079aea1ac Rewrite implementation of faithful cpp signatures (#45890)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45890

This rewrite is as per my comments at https://github.com/pytorch/pytorch/pull/44087#issuecomment-701664506
I did the rewrite by reverting #44087 and then reimplementing it on top.
You may find it easier to review by diffing against master with only #44087
reverted.

There are two main ideas.

First, we now factor cpp argument processing into two phases operating
on three representations of data:

1. `FunctionSchema` - this is the source from native_functions.yaml
2. `Union[Argument, ThisArgument, TensorOptionsArgument]` - this is
   the arguments after doing some basic semantic analysis to group
   them (for TensorOptions) or identify the this argument (if this
   is a method).  There is only ever one of these per functions.
3. `Union[CppArgument, CppThisArgument, CppTensorOptionsArgument]` -
   this is the arguments after we've elaborated them to C++.  There
   may be multiple of these per actual C++ signature.

You can think of (2) as common processing, whereas (3) bakes in specific
assumptions about whether or not you have a faithful or non-faithful
signature.

Second, we now have CppSignature and CppSignatureGroup representing
the *total* public C++ API signature.  So those dataclasses are what
know how to render definitions/declarations, and you no longer have
to manually type it out in the Functions/TensorMethods codegen.

Here is an exhaustive accounting of the changes.

tools.codegen.api.types

- CppSignature and CppSignatureGroup got moved to tools.codegen.api.types
- Add new CppThisArgument and CppTensorOptionsArguments (modeled off
  of ThisArgument and TensorOptionsArguments) so that we can retain
  high level semantic structure even after elaborating terms with C++
  API information.  Once this is done, we can refine
  CppArgument.argument to no longer contain a ThisArgument (ThisArgument
  is always translated to CppThisArgument.  Note that this doesn't
  apply to TensorOptionsArguments, as those may be expanded or not
  expanded, and so you could get a single CppArgument for 'options')
- Add no_default() functional mutator to easily remove default arguments
  from CppArgument and friends
- Add an explicit_arguments() method to CppArgument and friends to
  extract (flat) argument list that must be explicitly written in the signature.
  This is everything except (Cpp)ThisArgument, and is also convenient
  when you don't care about the extra structure of
  CppTensorOptionsArguments

tools.codegen.api.cpp

- group_arguments is back, and it doesn't send things directly to a
  CppSignatureGroup; instead, it moves us from representation (1) to (2)
  (perhaps it should live in model).  Here I changed my mind from my
  PR comment; I discovered it was not necessary to do classification at
  grouping time, and it was simpler and easier to do it later.
- argument got split into argument_not_this/argument/argument_faithful.
  argument and argument_faithful are obvious enough what they do,
  and I needed argument_not_this as a more refined version of argument
  so that I could get the types to work out on TensorOptionsArguments

tools.codegen.api.dispatcher

- Here we start seeing the payoff.  The old version of this code had a
  "scatter" mode and a "gather" mode.  We don't need that anymore:
  cppargument_exprs is 100% type-directed via the passed in cpp
  arguments.  I am able to write the functions without any reference
  to use_c10_dispatcher

tools.codegen.gen

- Instead of having exprs_str and types_str functions, I moved these to
  live directly on CppSignature, since it seemed pretty logical.
- The actual codegen for TensorMethods/Functions is greatly simplified,
  since (1) all of the heavy lifting is now happening in
  CppSignature(Group) construction, and (2) I don't need to proxy one
  way or another, the new dispatcher translation code is able to handle
  both cases no problem.  There is a little faffing about with ordering
  to reduce the old and new diff which could be removed afterwards.

Here are codegen diffs.  For use_c10_dispatcher: full:

```
+// aten::_cudnn_init_dropout_state(float dropout, bool train, int dropout_seed, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False) -> Tensor
 Tensor _cudnn_init_dropout_state(double dropout, bool train, int64_t dropout_seed, const TensorOptions & options) {
-    return _cudnn_init_dropout_state(dropout, train, dropout_seed, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
+    static auto op = c10::Dispatcher::singleton()
+        .findSchemaOrThrow("aten::_cudnn_init_dropout_state", "")
+        .typed<Tensor (double, bool, int64_t, c10::optional<ScalarType>, c10::optional<Layout>, c10::optional<Device>, c10::optional<bool>)>();
+    return op.call(dropout, train, dropout_seed, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
 }
```

Otherwise:

```
+// aten::empty_meta(int[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None, MemoryFormat? memory_format=None) -> Tensor
 Tensor empty_meta(IntArrayRef size, c10::optional<ScalarType> dtype, c10::optional<Layout> layout, c10::optional<Device> device, c10::optional<bool> pin_memory, c10::optional<MemoryFormat> memory_format) {
-    return empty_meta(size, TensorOptions().dtype(dtype).layout(layout).device(device).pinned_memory(pin_memory), memory_format);
+    static auto op = c10::Dispatcher::singleton()
+        .findSchemaOrThrow("aten::empty_meta", "")
+        .typed<Tensor (IntArrayRef, const TensorOptions &, c10::optional<MemoryFormat>)>();
+    return op.call(size, TensorOptions().dtype(dtype).layout(layout).device(device).pinned_memory(pin_memory), memory_format);
 }
```

Things that I probably did not get right:

- The Union[Argument, TensorOptionsArguments, ThisArgument] and
  the Cpp variants are starting to get a little unwieldy.  Not sure if
  this means I should add a supertype (or at the very least an
  alias); in some cases I do purposely omit one of these from the Union
- Code may not necessarily live in the most logical files.  There isn't
  very much rhyme or reason to it.
- The fields on CppSignature.  They're not very well constrained and
  it will be better if people don't use them directly.
- Disambiguation.  We should do this properly in #44087 and we don't
  need special logic for deleting defaulting for faithful signatures;
  there is a more general story here.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: smessmer

Differential Revision: D24144035

Pulled By: ezyang

fbshipit-source-id: a185f8bf9df8b44ca5718a7a44dac23cefd11c0a
2020-10-13 08:31:54 -07:00
Ailing Zhang
e6d30c89c1 Revert D24165889: Update native_functions.yaml to add DefaultBackend.
Test Plan: revert-hammer

Differential Revision:
D24165889 (1f9ddf64d2)

Original commit changeset: 7f3ccdb3499b

fbshipit-source-id: b5d0de57d918011f1e19c9ef6aafa89fefcb42d5
2020-10-12 23:17:06 -07:00
Ailing Zhang
1f9ddf64d2 Update native_functions.yaml to add DefaultBackend. (#45938)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45938

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D24165889

Pulled By: ailzhang

fbshipit-source-id: 7f3ccdb3499b40795bc34af716d0e63241ae8de3
2020-10-12 22:06:50 -07:00
Sebastian Messmer
6ba6ecb048 Only use hacky_wrapper_for_legacy_signatures if an op needs it (#45742)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45742

Add a new flag to native_functions.yaml: `use_c10_dispatcher: hacky_wrapper_for_legacy_signatures`
and the codegen only wraps kernels in the aforementioned wrapper if that flag is set.
Apart from that, `use_c10_dispatcher: hacky_wrapper_for_legacy_signatures` is equivalent to `full`,
i.e. it has full boxing and unboxing support.

This greatly reduces the number of ops we apply the hacky_wrapper to, i.e. all ops marked as `use_c10_dispatcher: full` don't have it anymore.
ghstack-source-id: 113982139

Test Plan:
waitforsandcastle

vs fbcode:
https://www.internalfb.com/intern/fblearner/details/214511705/

vs base diff:
https://www.internalfb.com/intern/fblearner/details/214693207/

Reviewed By: ezyang

Differential Revision: D23328718

fbshipit-source-id: be120579477b3a05f26ca5f75025bfac37617620
2020-10-12 09:39:18 -07:00
Ailing Zhang
d811d4d7ba Support DefaultBackend keyword in native_functions.yaml. (#45719)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45719

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D24165888

Pulled By: ailzhang

fbshipit-source-id: 9b3c5e71f5b6a985e1a43157813e7d77dbe13b07
2020-10-09 16:28:26 -07:00
Peter Bell
8b39498a23 codegen: Allow string arguments to have defaults (#45665)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45665

Fixes #43944

Note that the codegen doesn't use a proper parser so, in the same way as with lists, the string `, ` cannot appear in defaults or it will be interpreted as a splitting point between arguments.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D24141835

Pulled By: ezyang

fbshipit-source-id: 578127861fd2504917f4486c44100491a2c40343
2020-10-06 21:53:56 -07:00