pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Brian Hirsh	27a3204982	generate C++ API for meta functions using at::meta:: (#58570 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58570 What the PR does Generate a fast-path `at::meta::{op}` API for calling meta functions without having to go through the dispatcher. This will be important for perf for external backends that want to use meta functions for shape checking (which seems likely to be what we end up doing for LazyTensorCore). Details In order to avoid naming collisions I had to make two small changes: - rename `MetaFunctions.h` template -> `NativeMetaFunctions.h` (this is the file that declares the impl() function for every structured operator). - rename the meta class: `at::meta::{op}::meta()` -> `at::meta::structured_{op}::meta()` I also deleted a few unnecessary includes, since any file that includes NativeFunctions.h will automatically include NativeMetaFunctions.h. Why I made the change This change isn't actually immediately used anywhere; I already started writing it because I thought it would be useful for structured composite ops, but that isn't actually true (see [comment](https://github.com/pytorch/pytorch/pull/58266#issuecomment-843213147)). The change feels useful and unambiguous though so I think it's safe to add. I added explicit tests for C++ meta function calls just to ensure that I wrote it correctly - which is actually how I hit the internal linkage issue in the PR below this in the stack. Test Plan: Imported from OSS Reviewed By: pbelevich Differential Revision: D28711299 Pulled By: bdhirsh fbshipit-source-id: d410d17358c2b406f0191398093f17308b3c6b9e	2021-06-15 16:54:46 -07:00
Brian Hirsh	e341bab8ae	bugfix: ensure that at::{dispatch_key}:: API gets external linkage (#58569 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58569 This should allow external C++ files that aren't compiled into `libtorch.so`/`libtorch_cpu.so` (including all of fbcode) to use fast path functions like `at::cpu::add()`, which skip the dispatcher. So, after spending way too much time trying to figure out why I was getting linker errors when calling `at::meta::{op}` and `at::cpu::{op}` from C++ test files, I realized that we're not including the header files for C++ for the namespaced operator definitions. I.e. `RegisterCPU.cpp`, which provides definitions for the `at::cpu::{op}` fast path functions, wasn't including the `CPUFunctions.h` header. Why that breaks stuff: the `CPUFunctions.h` header file is what marks each function with the `TORCH_API` macro, so without including it, when we build `libtorch.so` and `libtorch_cpu.so`, the compiler will look at the definition in `RegisterCPU.cpp`, not see a `TORCH_API`, and decide that the function should get internal linkage. An alternative would be to directly mark the function definitions in `RegisterCPU.cpp` with `TORCH_API`, but this seemed cleaner. Test Plan: Imported from OSS Reviewed By: pbelevich Differential Revision: D28711300 Pulled By: bdhirsh fbshipit-source-id: 535f245c20e977ff566d6da0757b3cefa137040b	2021-06-15 16:53:22 -07:00
albanD	30a18fe318	refactor yaml loader import, no runtime change (#59850 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59850 This whole stack does not change anything to the codegened code Test Plan: Imported from OSS Reviewed By: ailzhang Differential Revision: D29063816 Pulled By: albanD fbshipit-source-id: ca3067443d8e6282c1077d3dafa3b4f330d43b28	2021-06-12 06:58:34 -07:00
albanD	c60d1ac9cf	Use C dumper if possible aten codegen 23s -> 13s (#59849 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59849 This whole stack does not change anything to the codegened code Test Plan: Imported from OSS Reviewed By: ailzhang Differential Revision: D29063815 Pulled By: albanD fbshipit-source-id: c4baa72594bd2fe50ac67f513916f2b2ccb7488c	2021-06-12 06:58:32 -07:00
albanD	504ec30109	avoid error string formatting aten codegen 28s -> 23s (#59848 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59848 This whole stack does not change anything to the codegened code Test Plan: Imported from OSS Reviewed By: ailzhang Differential Revision: D29063818 Pulled By: albanD fbshipit-source-id: c68734672eeacd212d7bd9bebe3d53aaa20c3c24	2021-06-12 06:58:31 -07:00
albanD	7143a6a189	Avoid unnecessary re-computation autograd codegen 21s -> 15s (#59847 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59847 This whole stack does not change anything to the codegened code Test Plan: Imported from OSS Reviewed By: ailzhang Differential Revision: D29063817 Pulled By: albanD fbshipit-source-id: 284c3e057029b7a67f43a1b034bb30863bd68c71	2021-06-12 06:57:19 -07:00
Jacob Szwejbka	1faba1e4cc	[Pytorch Edge] Make RegisterBackendSelect Selective (#59096 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59096 RegisterBackendSelect is bringing in ~100 extra ops to the runtime. This messes with the compatibility api, and also adds a nontrivial amount of size. Test Plan: Model Unittests/CI Reviewed By: iseeyuan Differential Revision: D28588100 fbshipit-source-id: ffd0b5b9cbe20f27dbf3be418a6c1f80c7396fdb	2021-06-07 19:48:46 -07:00
Richard Zou	970096b624	[Reland] Adds an aten::_ops namespace with unambiguous function names (#59018 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59018 Fixes #58044. This PR: - adds `ATEN_FN(op)` and `ATEN_FN2(op, overload)` macros that resolve to an non-overloaded function in aten::_ops that calls the desired operator (without default arguments). The motivation for this is two-fold: 1) Using aten operators with templates is hard if the operator is overloaded (e.g. add.Tensor and add.Scalar). 2) Method-only operators require special handling; pointers-to-method are different from function pointers. `ATEN_FN2(add_, Tensor)` returns a function instead of a method. There is some interesting behavior for out= operations. `ATEN_FN2(sin, "out")` gives a function that is faithful to the schema; that is, the order of arguments is exactly what it looks like in the schema. This makes it so that you can directly register `ATEN_FN2(sin,"out")` (or a function wrapping it using the same signature) as an override for a DispatchKey. Test Plan: - New tests that ATEN_FN2 works on function and method-only operators - New test that ATEN_FN works - New test that ATEN_FN macro returns a "faithful" function. Codegen output: Operators.h and Operators.cpp are both here: https://gist.github.com/zou3519/c2c6a900410b571f0d7d127019ca5175 Reviewed By: bdhirsh Differential Revision: D28721206 Pulled By: zou3519 fbshipit-source-id: a070017f98e8f4038cb0c64be315eef45d264217	2021-06-01 17:19:06 -07:00
Alice Ou	24508337f4	Revert D28643215: Adds an aten::_ops namespace with unambiguous function names Test Plan: revert-hammer Differential Revision: D28643215 (`28740869a1`) Original commit changeset: 7b2b8459f1b2 fbshipit-source-id: ea869bf4cfde7038087e990b2cff5a86f9e2a531	2021-05-26 12:35:34 -07:00
Richard Zou	28740869a1	Adds an aten::_ops namespace with unambiguous function names (#58092 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58092 Fixes #58044. This PR: - adds `ATEN_FN(op)` and `ATEN_FN2(op, overload)` macros that resolve to an non-overloaded function in aten::_ops that calls the desired operator (without default arguments). The motivation for this is two-fold: 1) Using aten operators with templates is hard if the operator is overloaded (e.g. add.Tensor and add.Scalar). 2) Method-only operators require special handling; pointers-to-method are different from function pointers. `ATEN_FN2(add_, Tensor)` returns a function instead of a method. There is some interesting behavior for out= operations. `ATEN_FN2(sin, "out")` gives a function that is faithful to the schema; that is, the order of arguments is exactly what it looks like in the schema. This makes it so that you can directly register `ATEN_FN2(sin,"out")` (or a function wrapping it using the same signature) as an override for a DispatchKey. Test Plan: - New tests that ATEN_FN2 works on function and method-only operators - New test that ATEN_FN works - New test that ATEN_FN macro returns a "faithful" function. Codegen output: Operators.h and Operators.cpp are both here: https://gist.github.com/zou3519/c2c6a900410b571f0d7d127019ca5175 Reviewed By: mruberry Differential Revision: D28643215 Pulled By: zou3519 fbshipit-source-id: 7b2b8459f1b2eb5ad01ee7b0d2bb77639f77940e	2021-05-26 07:29:15 -07:00
Brian Hirsh	32273e806a	Ensure NativeFunctions.h codegen output is deterministic (#58889 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58889 fixes https://github.com/pytorch/pytorch/issues/58796 Planning on re-testing locally tomorrow morning to confirm, but this change should fix the non-determinism in the codegen output that was causing `ccache` not to re-use its cached output. I built from the commit referenced in https://github.com/pytorch/pytorch/issues/58796 a few times and ran `diff -Naur` on the codegen output in `build/aten/src/ATen`. After a few tries, `NativeFunctions.h` had a few diffs. The diffs were all related to the ordering of functional/inplace/out variants of a NativeFunctionGroup, which looked non-deterministic. That looks like it's coming from my calling `set()` to filter out duplicate NativeFunction declarations. The earlier version of the codegen also called `set()` to filter out duplicates, but it did so individually for each `NativeFunction` object, before merging the groups (I'm not too sure why this didn't introduce non-determinism before. though). With the refactor from https://github.com/pytorch/pytorch/pull/57361, we're calling `set()` on the declarations from every operator for a given DispatchKey, which is probably what introduced the nondeterminism. Test Plan: Imported from OSS Reviewed By: gchanan Differential Revision: D28675941 Pulled By: bdhirsh fbshipit-source-id: bb66de00aafeeb9720d85e8156ac9f7539aed0d6	2021-05-25 16:33:03 -07:00
Brian Hirsh	1a9efbbc92	generate inplace/out kernels for xla (#57510 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57510 This is a re-write of https://github.com/pytorch/pytorch/pull/56835, which is significantly shorter thanks to the data model change in the PR below this one in the stack. See the original description in the linked PR for details. The functional changes in this PR are the same as in the above linked one, so the description is the same with a few small changes: - I don't bother generating `at::xla::{op}` entries for CPU fallbacks. After looking around, I see precedent for that. For example, we don't have `at::cpu::{op}` entries for composite ops- if you really want to bypass the dispatcher you need to call `at::compositeimplicitautograd::{op}`. Maybe we should revisit that later if we find an important use case for having full namespace coverage, but that doesn't seem worth half-fixing for external backends in this PR. Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D28474364 Pulled By: bdhirsh fbshipit-source-id: 4d58b60e5debad6f1ff06420597d8df8505b2876	2021-05-17 12:25:38 -07:00
Brian Hirsh	9354a68e7d	[codegen] split out backend-specific information from NativeFunction in the model (#57361 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57361 Data model change in the codegen, which splits backend-specific information out of `NativeFunction` ### Overview Currently in the codegen, native_functions.yaml has backend-specific information about each operator that is encoded directly into the data model, in the `NativeFunction` object. That's reasonable, since the native_functions.yaml is the source of truth for information about an operator, and the data model encodes that information into types. Now that external backends can use the codegen though, that information is technically incomplete/inaccurate. In another PR, I tried patching the information on the `NativeFunction` object with the additional external information, by updating the `dispatch` entry to contain the external backend kernel name and dispatch key. Instead, this PR tries to split out that information. The `NativeFunction` class contains all information about an operator from native_functions.yaml that's backend-independent and is known never to change regardless of what extra information backends provide. We also build up a backend "index", which is basically a mapping from [backend] -> [backend-specific-metadata]. Reading in an external backend yaml just involves updating that index with the new backend. There were a few places where `NativeFunction` used the dispatch table directly, that I encoded as properties directly on the NativeFunction object (e.g. `is_abstract`). They were mostly around whether or not the operator has a composite kernel, which isn't something that's going to change for any external backends. This has a few advantages: - We can more easily re-use the existing logic in `native_function.py` and `register_dispatch_key.py` for both native and external backends, since they both involve a NativeFunction + a particular backend index - The data in the data model will be the same regardless of how the codegen is run. Running the codegen with a new external backend doesn't change the data inside of NativeFunction or an existing backend index. It just adds a new index for that backend. - There are several of codegen areas that don't care about backend-specific information: mostly the tracing and autograd codegen. We can reason about the codegen there more easily, knowing that backend-specific info is entirely uninvolved. An alternative to this split would be to augment the NativeFunction objects with external backend information at the time that we create them. So the external codegen could read both native_functions.yaml and the external backend's yaml at the same time, and construct a NativeObject with a full dispatch table (including the XLA entry), and the correct setting of structured (taking into account both yamls). One disadvantage to this approach is that NativeFunction objects now contain different stuff depending on how you ran the codegen, and you have to make sure that any changes to the codegen can properly handle all the different variants. ### Data Model Changes Removed 3 classes, which are used by the external codegen: - ExternalBackendFunction - ExternalBackendFunctionsGroup - ExternalBackendMetadata And added two new ones: - BackendIndex - BackendMetadata `BackendIndex` contains any info that's specific to that backend, plus a mapping from operator names to backend specific metadata about the operator. One example of backend-specific info that's not operator-dependent is the fact that XLA prefers to implement functional kernels instead of out kernels (and so when they eventually mark an op as structured, they're going to mark the functional op and not the out op). `BackendMetadata` contains info specific to an (operator, backend) pair. Right now, that's just (a) the name of the kernel, and (b) whether or not that operator is structured. ### Questions I wanted to get this PR up earlier so I could get feedback, but there are a few things I want to call out: Dealing with `structured`. This PR separates out the notion of `structured` into two bits of information: - Does [operator] have a meta() function. This is backend-agnostic, and is represented by the `structured` property on `NativeFunction`, same as before. This is used, e.g., to decide what signatures to add to `MetaFunctions.h`. - Does [operator, backend] have an impl() function. This is backend dependent; even though technically all in-tree backends are forced to write impl() functions for an operator when we port the op to structured in native_functions.yaml, out-of-tree backends can decide to opt in independently. This is represented as a property on `BackendMetadata`. This is used in most other cases, e.g. in `RegisterDispatchKey` when we're deciding whether or not to gen a structured or unstructured wrapper. I also baked `is_structured_dispatch_key` directly into each BackendIndex. So for operators marked "structured" in native_functions.yaml, their corresponding CPU/CUDA BackendIndex entries will be marked structured, and all others (except for potentially external backends) will not. I ended up trying to deal with `structured` in this change since it's technically backend dependent (XLA can opt kernels into structured separately from in-tree ops), but that may have been too ambitious: it's technically not relevant until we actually add support for structured external kernels. If it's not clear that this is the right path for dealing with structured and we want to push that off, I'm fine with backing out the bits of this PR that make `structured` backend-dependent. I don't see anything too controversial related to structured in the change, but I tried to call out any areas in the comments Localizing the fact that external backends follow Dispatcher convention. Another thing that's sort of backend specific that I didn't totally address in this PR is the fact the fact that in-tree backends follow the Native API while external backends follow the Dispatcher API. I painted over that in `native_functions.py` by adding a helper, `kernel_signature`, that takes in a native function and gives you the "correct" signature for the specified backend- NativeSignature for in-tree backends, and DispatcherSignature for out-of-tree backends. In order to make that fully useable though, we'll need `NativeSignature` and `DispatcherSignature` to have matching interfaces. I didn't bother with that in this PR, which is why `gen_external_aten_fallbacks.py` still has a bunch of direct references to the dispatcher API. Thinking of adding it in a later PR but wanted to see if anyone has other opinions. Maybe `is_external()` shouldn't even be a property on the BackendMetadata, and anything the codegen does that requires asking for that information should just be better abstracted away. Thoughts on the `BackendIndex` / `BackendMetadata` breakdown. One thing that's annoying right now is that to query for various pieces of metadata, you call helper functions like `backend_index.structured(f)`, which queries that particular backend and tells you if that specific NativeFunctionGroup is structured for that backend. It has to return an `Optional[bool]` though, since you have to handle the case where that operator doesn't have a kernel for that backend at all. So users of those helpers end up with a bunch of optionals that they need to unpack, even if they know at some point that the result isn't None. I think it would be easier instead to just store the NativeFunction object as a field directly on the BackendMetadata. Curious if there are any other opinions on a better way to model it though. Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D28474362 Pulled By: bdhirsh fbshipit-source-id: 41a00821acf172467d764cb41e771e096542f661	2021-05-17 12:25:35 -07:00
Brian Hirsh	6fa1d880b6	make external codegen aware of autogen'd composite kernels (#56960 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56960 Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D28012667 Pulled By: bdhirsh fbshipit-source-id: 56da050c5d46b8952ddecfa83ebd5fe8454acffe	2021-04-30 07:23:28 -07:00
Brian Hirsh	76fbd755c1	Reland of "D27708346: generate xla codegen in-tree" (#56601 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56601 Updating it to ensure that RegistrationDeclarations.yaml is completely unchanged This reverts commit `90e532f3ef`. Test Plan: Imported from OSS Reviewed By: ailzhang Differential Revision: D27915305 Pulled By: bdhirsh fbshipit-source-id: 491a025c44221690dad849f9a2166934130c0fec	2021-04-21 19:36:31 -07:00
Scott Wolchok	1211bccc65	[PyTorch] Fix const correctness for resize native functions (#55351 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55351 We incorrectly used `Tensor&` to mean "the underlying TensorImpl cannot be changed", as explained in https://github.com/zdevito/ATen/issues/27#issuecomment-330717839 . This diff gets us on the path to fixing this problem: we have an incremental way to fix individual native functions so that we can apply any handwritten fixes a few at a time. It gets the migration started with the `resize` family of native functions. ghstack-source-id: 127092677 Test Plan: fitsships Reviewed By: ezyang Differential Revision: D27583983 fbshipit-source-id: 4eeeec85f5d268e9d0f1645eb9396914a9f9557f	2021-04-21 14:51:41 -07:00
Brian Hirsh	90e532f3ef	Revert D27708346: generate xla codegen in-tree Test Plan: revert-hammer Differential Revision: D27708346 (`51d0212d0f`) Original commit changeset: 2289edd641f3 fbshipit-source-id: 86711c07db19833b9e772c558e12accba1432499	2021-04-21 11:07:45 -07:00
Brian Hirsh	51d0212d0f	generate xla codegen in-tree (#55050 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55050 not ready for review yet Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D27708346 Pulled By: bdhirsh fbshipit-source-id: 2289edd641f30277d7561cf2d48ec69c6a2137a9	2021-04-21 08:19:08 -07:00
Sam Estep	75024e228c	Add lint for unqualified `type: ignore` (#56290 ) Summary: The other half of https://github.com/pytorch/pytorch/issues/56272. Pull Request resolved: https://github.com/pytorch/pytorch/pull/56290 Test Plan: CI should pass on the tip of this PR, and we know that the lint works because the following CI runs (before this PR was finished) failed: - https://github.com/pytorch/pytorch/runs/2384511062 - https://github.com/pytorch/pytorch/actions/runs/765036024 Reviewed By: seemethere Differential Revision: D27867219 Pulled By: samestep fbshipit-source-id: e648f07b6822867e70833e23ddafe7fb7eaca235	2021-04-21 08:07:23 -07:00
Brian Hirsh	87a1ebc9cd	fix RegistrationDeclarations.yaml, now that we codegen composite kernels for structured functional/inplace ops (#56307 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56307 This should fix https://github.com/pytorch/pytorch/issues/56273. I tested these changes locally by making them directly on top of https://github.com/pytorch/pytorch/pull/56151, and running the xla tests (`xla/test/cpp/build/test_ptxla`). Current state: For ops that are ported to structured, If external backends like XLA have implemented the `out` op but not the `functional` version, they will call into our code-generated `CompositeExplicitAutograd` kernel, which calls the structured operator's `meta()` function and then redispatches to the external backend's `out` function. If XLA has registered their own kernel to the `functional` variant of the op, it'll override our codegen'd composite kernel. XLA has logic to code-generate "CPU fallback" kernels for "required" ops. It gets this information based off of `RegistrationDeclarations.yaml`. That info was technically incorrect up until this PR, since we were code-generating `inplace/functional` composite kernels for structured ops, but not updating `RegistrationDeclarations.yaml` with that information. Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D27883950 Pulled By: bdhirsh fbshipit-source-id: fe896b0d2bbd4369490dcdf7a87f227fd3d8b8b3	2021-04-21 07:41:09 -07:00
Brian Hirsh	947c7a8215	add C++ namespacing logic to ctypes (#55047 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55047 Added namespaces to all of the `CTypes` printed in the codegen. This is pretty much required if we want to use codegen externally, since we can no longer assume that we're inside of the `at::` namespace. Important changes are in `types.py`. How do we add the notion of namespaces to C++ types without people having to write "at::Tensor" everywhere? Before this PR, `CType` held a raw string representing the type, i.e. `BaseCType("Tensor", binds)`. This PR introduces a set of singleton base C++ types in `types.py`, that know how to print their namespace. Instead, we'd write `BaseCType(tensorT, binds)`, where printing `tensorT` will properly print out "at::Tensor". This also means that you can't create arbitrary `CTypes`. If we need a new C++ type in the codegen, we need to add it to the list in `types.py`. One blip in the design: we don't want to change `RegistrationDeclarations.yaml`, since that'll break external backends that ingest it. I added separate functions to display types without the namespace that are used to create RegistrationDeclarations.yaml`. With an external codegen API though, we can eventually kill it :) I also didn't realize until this PR that `Declarations.yaml` is still directly in use, by some python/autograd codegen. Rather than keep that yaml byte-for-byte compatible, I just updated the callsites in the autograd codegen to work with namespaces. In the NEXT pr, I try to clean up some of the autograd codegen to stop using raw strings to match against C++ types. Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D27708349 Pulled By: bdhirsh fbshipit-source-id: 56a4f81fc101795bcb9ee1f722121480fb2356ad	2021-04-16 11:43:06 -07:00
Brian Hirsh	164bee1d09	Return a CType instead of a string for returns, beef up CType (#55046 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55046 Updating `returns` in the codegen to return a CType instead of a raw string. This has benefit of putting all stringifying logic through CType, which is useful in the followup PR when I add namespaces. I also added new CTypes for other templated C++ types: array, vector and tuple. Mostly because it makes the namespacing logic in the next PR significantly easier. It also seems more natural to me that `BaseCType` shouldn't represent specializations of templated types. There's a little bit of weirdness, types that are currently only used for returns, i.e. `TupleCType`. Returns aren't named, so I opted not to give it one- so we can add it in later if we discover that we need it. Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D27708348 Pulled By: bdhirsh fbshipit-source-id: 230b210c3e53be1bd362105fbea8451055dc59a8	2021-04-16 11:41:46 -07:00
Edward Yang	6c327ef9d4	matches_jit_signatures is dead (#53637 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53637 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D26920687 Pulled By: ezyang fbshipit-source-id: 288bd9dca63da04ccc633d939833066a3305a68a	2021-04-15 12:31:19 -07:00
Sam Estep	4753100a3b	Un-ignore F403 in .flake8 (#55838 ) Summary: Generally wildcard imports are bad for the reasons described here: https://www.flake8rules.com/rules/F403.html This PR replaces wildcard imports with an explicit list of imported items where possible, and adds a `# noqa: F403` comment in the other cases (mostly re-exports in `__init__.py` files). This is a prerequisite for https://github.com/pytorch/pytorch/issues/55816, because currently [`tools/codegen/dest/register_dispatch_key.py` simply fails if you sort its imports](https://github.com/pytorch/pytorch/actions/runs/742505908). Pull Request resolved: https://github.com/pytorch/pytorch/pull/55838 Test Plan: CI. You can also run `flake8` locally. Reviewed By: jbschlosser Differential Revision: D27724232 Pulled By: samestep fbshipit-source-id: 269fb09cb4168f8a51fd65bfaacc6cda7fb87c34	2021-04-13 09:24:07 -07:00
Sameer Deshmukh	5fb1142702	Add CSR (compressed sparse row) layout for sparse tensors (#50937 ) Summary: Implement compressed sparse row format. Derived from the GCS implementation at https://github.com/pytorch/pytorch/pull/44190 Pull Request resolved: https://github.com/pytorch/pytorch/pull/50937 Reviewed By: mrshenli Differential Revision: D27439865 Pulled By: ezyang fbshipit-source-id: 3ba3dcb9679505b980ff6a5f513e913bbae2fb1d	2021-04-12 10:09:12 -07:00
Edward Yang	13b1ca9466	Rename DefaultBackend to CompositeExplicitAutograd (#54470 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54470 ``` git grep -l 'DefaultBackend' \| xargs sed -i 's/DefaultBackend/CompositeExplicitAutograd/g' ``` Plus a quick fixup in native/README.md Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: bdhirsh Differential Revision: D27253240 Pulled By: ezyang fbshipit-source-id: 964df951ea8b52fa72937f3cc66aeaf49a702e6f	2021-03-26 10:53:30 -07:00
Edward Yang	145bc5cd51	Rename Math to CompositeImplicitAutograd (#54466 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54466 I had to very carefully audit all the use sites since there are a lot of other uses of the string Math; I did most of the conversion by grepping for all occurrences of Math and then doing a search replace. I also updated documentation for clarity. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D27253239 Pulled By: ezyang fbshipit-source-id: afb485d07ff39575742a4f0e1e205179b60bc953	2021-03-24 13:49:24 -07:00
Edward Yang	6e8c4ad7fd	s/StructuredNativeFunctions/NativeFunctionsGroup/ (#54427 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54427 A StructuredNativeFunctions is no longer guaranteed to actually be structured (test structured property for that), so we rename this to a more neutral name. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: ailzhang Differential Revision: D27235380 Pulled By: ezyang fbshipit-source-id: 2b438d615bf06a47fc9c7bf6eb66fd8b4df31bc8	2021-03-23 00:43:57 -07:00
Edward Yang	bf2ca35f35	Rejigger to use NativeFunctionsGroup even without structured: True (#54426 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54426 Previously, we only put NativeFunctions in StructuredNativeFunctions if the out variant advertised that the kernel was structured. However, there are a few code generation things that can take advantage of this trio structure, even if the kernel itself hasn't been ported to be structured. So better to always group things when they are related, and then let clients decide whether or not to use the structure or throw it away. While doing this, I had hoped that there weren't any functional/inplace pairs that didn't also have an out variant. This turned out to not be true. These are probably all oversights and should get fixed at some point. Bill of changes: - The actual operational change happens in StructuredNativeFunctions.from_dict; then I need to relax some __post_init__ invariants. To tell if a StructuredNativeFunctions is actually structured, there is a new structured property, which is queried from a few new locations in code - Refactor native_functions.py into gen_structured/gen_unstructured functions so I can easily call gen_unstructured from two contexts I intend to s/StructuredNativeFunctions/NativeFunctionsGroup/ but for ease of review this rename hasn't been done in this PR. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: ailzhang Differential Revision: D27235379 Pulled By: ezyang fbshipit-source-id: d8a15de9abb75b365348ab94e67b830704e30cf0	2021-03-23 00:43:54 -07:00
Edward Yang	c00d66f73c	Move compute_native_function_declaration to its own dest module (#54419 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54419 I'm planning to break it into some helper functions, so let's put it in its own module first. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: ailzhang Differential Revision: D27235378 Pulled By: ezyang fbshipit-source-id: c03c5440d2d753859e2c5ec2b2c8b1b82870f03a	2021-03-23 00:43:50 -07:00
kedejesu	53d8778b4d	Update clang-format linux hash and yaml import calls (#53932 ) Summary: Fixing Bandit security issues. - yaml_load: Use of unsafe yaml load. Allows instantiation of arbitrary objects. Consider yaml.safe_load(). Test ID: B506 Severity: MEDIUM Confidence: HIGH File: ./caffe2/contrib/aten/gen_op.py More info: https://bandit.readthedocs.io/en/latest/plugins/b506_yaml_load.html 235 if __name__ == '__main__': 236 decls = yaml.load(read(os.path.join(args.yaml_dir, 'Declarations.yaml')), Loader=Loader) 237 factory_methods = find_factory_methods(decls) - Blacklist: Use of insecure MD2 (`6149a26adb`), MD4 (`fc7f026980`), MD5 (`7ea9d9af4e`), or SHA1 hash function. Test ID: B303 Severity: MEDIUM Confidence: HIGH File: ./tools/clang_format_utils.py More info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b303-md5 36 37 hash = hashlib.sha1() 38 Pull Request resolved: https://github.com/pytorch/pytorch/pull/53932 Reviewed By: jbschlosser Differential Revision: D27072017 Pulled By: malfet fbshipit-source-id: 2fef0119388797aee3cacdc880fc345bd2ba68ce	2021-03-18 17:11:58 -07:00
Brian Hirsh	a7ddd15d15	fix static dispatch linker error (#53859 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53859 The redispatch API wasn't linking properly when static dispatch is enabled. I'm still not sure why this wasn't caught by the static dispatch test in CI- maybe, as swolchok pointed out, we have a flag set somewhere that defers undefined symbols until runtime. Before, building with static dispatch enabled locally + running `import torch` gave me this error: ``` >>> import torch Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/raid/hirsheybar/pytorch/torch/__init__.py", line 197, in <module> from torch._C import * ImportError: /raid/hirsheybar/pytorch/torch/lib/libtorch_cpu.so: undefined symbol: _ZN2at10redispatch11logical_or_EN3c1014DispatchKeySetERNS_6TensorERKS3_ >>> ``` Printing the symbol: ``` (pytorch) hirsheybar@devfair017:/scratch/hirsheybar/pytorch$ c++filt _ZN2at10redispatch11logical_or_EN3c1014DispatchKeySetERNS_6TensorERKS3_ at::redispatch::logical_or_(c10::DispatchKeySet, at::Tensor&, at::Tensor const&) ``` Sure enough, the functions defined in `RedispatchFunctions.cpp` don't have the DispatchKeySet argument included. Adding them in this PR. Test Plan: Imported from OSS Reviewed By: ljk53 Differential Revision: D26998735 Pulled By: bdhirsh fbshipit-source-id: c6c1104e42d13b7ec9d964b7e08d2adc8b344b78	2021-03-12 09:41:08 -08:00
Ailing Zhang	9f75de278f	Move common autograd utils functions from gen_variable_type.py to api/autograd.py. (#53340 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53340 Test Plan: Imported from OSS Reviewed By: nikithamalgifb Differential Revision: D26973914 Pulled By: ailzhang fbshipit-source-id: 8367a08b27b25808782c77aadc3c67d07c354957	2021-03-11 19:58:45 -08:00
Brian Hirsh	08d266943d	structured kernels - error check when structured_delegate is not marked structured (#52227 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52227 Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D26431105 Pulled By: bdhirsh fbshipit-source-id: 82e8c6eb5eee6f9cdb39637ecfee84ab5bb2cabe	2021-02-24 13:17:52 -08:00
Brian Hirsh	d02a2bd5d1	codegen'd API for redispatching (#52008 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52008 Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D26356079 Pulled By: bdhirsh fbshipit-source-id: 1fd34fbb4dbc48cc8390cad99e30e0d04fc75a4f	2021-02-22 10:55:38 -08:00
Jiakai Liu	c9c4b871a5	[pytorch] reintroduce static dispatch (#51957 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51957 This is a simplified version of #51554. Compared to #51554, this version only supports statically dispatching to a specific backend. The benefit is that it skipped the dispatch key computation logic thus has less framework overhead. The downside is that if input tensors do not match the specified backend it will throw error instead of falling back to regular dispatch. Sample code: ``` Tensor empty(IntArrayRef size, TensorOptions options, c10::optional<MemoryFormat> memory_format) { return at::cpu::empty(size, options, memory_format); } // aten::conj(Tensor(a) self) -> Tensor(a) Tensor conj(const Tensor & self) { return at::math::conj(self); } // aten::conj.out(Tensor self, , Tensor(a!) out) -> Tensor(a!) Tensor & conj_out(Tensor & out, const Tensor & self) { return at::cpu::conj_out(out, self); } // aten::conj.out(Tensor self, , Tensor(a!) out) -> Tensor(a!) Tensor & conj_outf(const Tensor & self, Tensor & out) { return at::cpu::conj_out(out, self); } // aten::_conj(Tensor self) -> Tensor Tensor _conj(const Tensor & self) { return at::defaultbackend::_conj(self); } ``` For ops without the specific backend dispatch, it will throw error: ``` // aten::_use_cudnn_ctc_loss(Tensor log_probs, Tensor targets, int[] input_lengths, int[] target_lengths, int blank) -> bool bool _use_cudnn_ctc_loss(const Tensor & log_probs, const Tensor & targets, IntArrayRef input_lengths, IntArrayRef target_lengths, int64_t blank) { TORCH_CHECK(false, "Static dispatch does not support _use_cudnn_ctc_loss for CPU."); } ``` Differential Revision: D26337857 Test Plan: Imported from OSS Reviewed By: bhosmer Pulled By: ljk53 fbshipit-source-id: a8e95799115c349de3c09f04a26b01d21a679364	2021-02-19 11:41:39 -08:00
Brian Hirsh	2303c244fc	skip a second call to shouldUseRecordFunction for BackendSelect ops (#50891 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50891 Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D25999514 Pulled By: bdhirsh fbshipit-source-id: 8a6c17ab502fe463cf3fb38a1e555c64bc5556f0	2021-02-08 18:32:40 -08:00
Brian Hirsh	41bab9a4b6	Plumbing dispatch keys through the dispatcher (#49354 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49354 Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D25614042 Pulled By: bdhirsh fbshipit-source-id: 269a75e9a3ac518aa63bff2cafbd47ed2c4ff780	2021-02-08 11:09:51 -08:00
Edward Yang	668e0f3598	Split anonymous and namespaced definitions in RegisterDispatchKey (#51585 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51585 Some payoff from the stack of refactors. When I initially landed at::cpu, Brian asked me why I couldn't just separate the anonymous and namespaced definitions. Well, it used to be annoying. Now it's not annoying anymore, so go ahead and split them up. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: zou3519 Differential Revision: D26209873 Pulled By: ezyang fbshipit-source-id: 63057d22acfaa0c17229947d9e65ec1193e360ec	2021-02-04 09:19:41 -08:00
Edward Yang	93c4f9f972	Split out RegisterDispatchKey to its own file (#51508 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51508 No substantive changes. The codegen for this file was getting a bit long so I moved it off into tools.codegen.dest submodule (I wanted to do tools.codegen.gen but that conflicts with the existing module; oy vey!) To do this I had to move some other functions around so that they were more generally accessible. Otherwise self-explanatory. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: ljk53 Differential Revision: D26187856 Pulled By: ezyang fbshipit-source-id: fd3784571d03d01c4acb7ca589fcde4492526408	2021-02-04 09:19:32 -08:00
Edward Yang	6045663f39	Use Literal to model targets. (#51500 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51500 I'm going to add some new Target types shortly, so having tighter types for the individual unions will make it clearer which ones are valid. This is also the first use of typing_extensions in the codegen, and I want to make sure it works. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D26187854 Pulled By: ezyang fbshipit-source-id: 6a9842f19b3f243b90b210597934db902b816c21	2021-02-04 09:16:22 -08:00
Edward Yang	333a0c8b6f	Add support for generating faithful at::cpu signatures (#51499 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51499 I'm going to turn on at::cpu signatures on for all operators; before I do it I want to make sure I'm at feature parity everywhere. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D26187855 Pulled By: ezyang fbshipit-source-id: 8fdfd9d843fc98435b1f1df8b475d3184d87dc96	2021-02-03 14:03:50 -08:00
Edward Yang	81c7c3bae5	Add api.structured; switch structured kernels to use const Tensor& everywhere (#51490 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51490 Mutable Tensor ref is a source of endless confusion for kernel writers; if we're going to make everyone rewrite their kernels, might as well also get rid of mutable Tensor& while we're at it. This is a refactor-then-small-update double whammy. The refactor is to separate tools.codegen.api.structured from api.native for describing the type signatures of structured kernels (previously, I was naughtily reusing native for this purpose--now I need it to behave differently as Tensor). This started off as a copy paste, but since there are not that many structured kernels so far I could delete all of the legacy logic from native that didn't make sense (without having to go out and fix all the use sites all at once). One more small addition was teaching translate to convert Tensor& to const Tensor&. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D26182413 Pulled By: ezyang fbshipit-source-id: ed636866add3581179669cf9283f9835fcaddc06	2021-02-03 14:03:46 -08:00
Scott Wolchok	341c76dcc1	[PyTorch] Add C10_ALWAYS_INLINE to critical dispatcher paths (#51245 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51245 Splitting this out from #51164 (D26069629) to allow it to land separately; I'm sure this is a good idea but I'm less sure about #51164. ghstack-source-id: 120697499 Test Plan: double-check effect on empty benchmark with perf stat; didn't move Reviweers: ezyang, messmer Reviewed By: ezyang Differential Revision: D26112627 fbshipit-source-id: 50d4418d351527bcedd5ccdc49106bc642699870	2021-02-01 12:39:58 -08:00
Jiakai Liu	83287a6f2b	[pytorch] change codegen dispatch key from string to enum (#51115 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51115 Add enum type for dispatch key. Prepare to implement the DispatchTable computation logic in python for static dispatch. Verified byte-for-byte compatibility of the codegen output. Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D26077430 Pulled By: ljk53 fbshipit-source-id: 86e74f3eb32266f31622a2ff6350b91668c8ff42	2021-01-27 22:28:52 -08:00
Scott Wolchok	1935880860	[PyTorch] Remove unnecessary dispatcher.h include in torch/library.h (#51162 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51162 It's unused. ghstack-source-id: 120427120 Test Plan: CI Reviewed By: bhosmer Differential Revision: D25859010 fbshipit-source-id: 7bb21312843debaedaa6a969727c171b2bb0e6b2	2021-01-26 22:19:32 -08:00
Edward Yang	5e79b8e06d	Back out "Revert D25903846: [pytorch][PR] Structured kernel definition for upsample_nearest2d" (#50794 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50794 Original commit changeset: b4a7948088c0 There are some subtle extra tweaks on top of the original. I can unbundle them, but I've opted to keep it with the port because it's the easiest way to make sure the changes are exercised. * There's a bugfix in the codegen to test if a dispatch key is structured before short circuiting because the dispatch key was missing in the table. This accounts for mixed structured-nonstructured situations where the dispatch table is present, but the relevant structured key isn't (because the dispatch table only exists to register, e.g., QuantizedCPU) * Dispatch tables for functions which delegate to structured kernels don't have Math entries from generated for them. * It's now illegal to specify a structured dispatch key in a delegated structured kernel (it will be ignored!) add is now fixed to follow this * There are some extra sanity checks for NativeFunctions validation * Finally, unlike the original PR, I switched the .vec variant of upsample_nearest2d to also be DefaultBackend, bringing it inline with upsample_nearest1d. ghstack-source-id: 120038038 Test Plan: ``` buck test mode/dev //coreai/tiefenrausch:python_tests -- --exact 'coreai/tiefenrausch:python_tests - test_can_run_local_async_inference_cpu (coreai.tiefenrausch.tests.python_test.TiefenrauschPY)' --run-disabled ``` Reviewed By: ngimel Differential Revision: D25962873 fbshipit-source-id: d29a9c97f15151db3066ae5efe7a0701e6dc05a3	2021-01-25 10:43:53 -08:00
Edward Yang	2ab497012f	Add at::cpu namespace of functions for structured kernels (#49505 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49505 I have a problem which is that static runtime needs a way to bypass dispatch and call into kernels directly. Previously, it used native:: bindings to do this; but these bindings no longer exist for structured kernels! Enter at::cpu: a namespace of exactly at:: compatible functions that assume all of their arguments are CPU and non-autograd! The header looks like this: ``` namespace at { namespace cpu { CAFFE2_API Tensor & add_out(Tensor & out, const Tensor & self, const Tensor & other, Scalar alpha=1); CAFFE2_API Tensor add(const Tensor & self, const Tensor & other, Scalar alpha=1); CAFFE2_API Tensor & add_(Tensor & self, const Tensor & other, Scalar alpha=1); CAFFE2_API Tensor & upsample_nearest1d_out(Tensor & out, const Tensor & self, IntArrayRef output_size, c10::optional<double> scales=c10::nullopt); CAFFE2_API Tensor upsample_nearest1d(const Tensor & self, IntArrayRef output_size, c10::optional<double> scales=c10::nullopt); CAFFE2_API Tensor & upsample_nearest1d_backward_out(Tensor & grad_input, const Tensor & grad_output, IntArrayRef output_size, IntArrayRef input_size, c10::optional<double> scales=c10::nullopt); CAFFE2_API Tensor upsample_nearest1d_backward(const Tensor & grad_output, IntArrayRef output_size, IntArrayRef input_size, c10::optional<double> scales=c10::nullopt); }} ``` This slows down static runtime because these are not the "allow resize of nonzero tensor" variant binding (unlike the ones I had manually written). We can restore this: it's a matter of adding codegen smarts to do this, but I haven't done it just yet since it's marginally more complicated. In principle, non-structured kernels could get this treatment too. But, like an evil mastermind, I'm withholding it from this patch, as an extra carrot to get people to migrate to structured muahahahaha. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D25616105 Pulled By: ezyang fbshipit-source-id: 84955ae09d0b373ca1ed05e0e4e0074a18d1a0b5	2021-01-22 13:11:59 -08:00
Sebastian Messmer	e4c41b6936	Remove codegen logic to support non-c10-full ops (#49164 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49164 This PR removes the logic paths in codegen that were responsible for handling non-c10-full ops. This only goes through our basic codegen. It does not simplify C++ code yet and it does not remove the codegen for generated unboxing wrappers yet. ghstack-source-id: 119450487 Test Plan: waitforsandcastle Reviewed By: ezyang Differential Revision: D25462977 fbshipit-source-id: 7e70d14bea96948f5056d98125f3e6ba6bd78285	2021-01-06 14:17:36 -08:00
Edward Yang	68a6e46379	Push anonymous namespace into codegen, not template (#49498 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49498 In the near future, I want to code generate some functions that are visible externally to this compilation unit. I cannot easily do this if all the codegen code is wrapped in a global anonymous namespace, so push the namespace in. Registration has to stay in an anonymous namespace to avoid name conflicts. This could also have been solved by making the wrapper functions have more unique names but I didn't do this in the end. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: albanD, smessmer Differential Revision: D25616104 Pulled By: ezyang fbshipit-source-id: 323c0dda05a081502aab702f359a08dfac8c41a4	2021-01-06 08:44:49 -08:00

1 2 3

114 Commits