Commit Graph

298 Commits

Edward Yang
2f222fc88c Mild refactor of native_functions.yaml dispatch parsing (#66109)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66109

This refactor is no longer necessary for ufunc codegen, as I changed
the format of ufuncs so they are not directly inserted into the 'dispatch'
key, but I think the refactored code here is better.  The basic concept
is to construct BackendMetadata directly as we parse entries of the
dispatch dictionary, rather than creating it post facto.  This
centralizes the computation and means that the creation of the backend index
is just a simple reindexing by operator name (nothing nontrivial).
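
A minimal sketch of the idea in Python, with simplified stand-in types
(names are illustrative, not the actual codegen API):

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict

@dataclass
class BackendMetadata:
    kernel: str        # name of the C++ kernel, e.g. "add_cpu"
    structured: bool   # whether the kernel is structured

# Construct BackendMetadata directly while walking the dispatch dict of a
# native_functions.yaml entry, instead of collecting raw strings and
# building the metadata afterwards.
backend_index: Dict[str, Dict[str, BackendMetadata]] = defaultdict(dict)

def parse_dispatch(op_name: str, dispatch: Dict[str, str], structured: bool) -> None:
    for dispatch_key, kernel in dispatch.items():
        backend_index[dispatch_key][op_name] = BackendMetadata(kernel, structured)

# The backend index then falls out of a simple reindex by operator name:
parse_dispatch("add.Tensor", {"CPU": "add_cpu", "CUDA": "add_cuda"}, structured=True)
```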

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D31385760

Pulled By: ezyang

fbshipit-source-id: 4fcb491ba025d2aa6fd356586b57affb97a507fc
(cherry picked from commit 21c93d4199)
2022-02-17 02:01:36 +00:00
Will Constable
889f3f48b2 Revert D34178476: Update lazy_ir.py from lazy_tensor_staging
Test Plan: revert-hammer

Differential Revision:
D34178476 (3842140fd5)

Original commit changeset: 7190b2e0d82b

Original Phabricator Diff: D34178476 (3842140fd5)

fbshipit-source-id: 4c969a355f01244c6f5acc52bc31679f2182aa55
(cherry picked from commit 17082075dd)
2022-02-16 19:34:41 +00:00
Guo Yejun
c19255840f codegen: do not generate code for dispatch_namespaced_definitions (#69074)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69074

Reviewed By: jbschlosser

Differential Revision: D32758621

Pulled By: bdhirsh

fbshipit-source-id: f8a174fd9d74039003f9713d8dfaae2b4eaa7089
(cherry picked from commit 462e92c82d)
2022-02-16 19:30:38 +00:00
Will Constable
3842140fd5 Update lazy_ir.py from lazy_tensor_staging (#72730)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72730

This diff contains changes from several PRs landed to the lazy_tensor_staging branch. It:
- generates 'fallback' overrides for each codegenned op, which is useful for debugging
- supports operators that are missing aten:: symbols for their op names by using their string counterparts instead
- makes the IR class a base class instead of hardcoding the assumption of TorchScript (TS); see the sketch below
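
A hypothetical sketch of the third change, assuming simplified class and
method names (the real hierarchy in lazy_ir.py is richer):

```python
from abc import ABC, abstractmethod

class LazyIrCodegen(ABC):
    """Backend-agnostic IR codegen; subclasses supply the lowering."""

    @abstractmethod
    def lowering_body(self, op: str) -> str:
        ...

    def gen_node_class(self, op: str) -> str:
        # Shared node-class scaffolding lives in the base class.
        return f"class {op}Node : public Node {{ {self.lowering_body(op)} }};"

class TsLazyIrCodegen(LazyIrCodegen):
    """TorchScript-specific details are confined to this subclass."""

    def lowering_body(self, op: str) -> str:
        return f'TSOpVector Lower() override {{ return LowerToTS("{op}"); }}'

print(TsLazyIrCodegen().gen_node_class("Cos"))
```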

Test Plan: tested on lazy_tensor_staging branch

Reviewed By: desertfire

Differential Revision: D34178476

fbshipit-source-id: 7190b2e0d82b4eb1f4510c858c24446c6df3f9d0
(cherry picked from commit 6713d3f0ef)
2022-02-16 18:33:31 +00:00
Brian Hirsh
22ccf448e8 Revert D34034848: free up dispatch key space (in C++)
Test Plan: revert-hammer

Differential Revision:
D34034848 (6690256021)

Original commit changeset: 9677ee2c0a1a

Original Phabricator Diff: D34034848 (6690256021)

fbshipit-source-id: fd50943d915ef813bb9f9ab278fb582429eea3b1
(cherry picked from commit 3acefee1cd)
2022-02-14 23:29:00 +00:00
Brian Hirsh
6690256021 free up dispatch key space (in C++) (#72402)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72402

The original PR had an array-out-of-bounds access in `DispatchKeyExtractor.cpp`, that wasn't caught by ASAN and appeared to only manifest in a subset of android internal tests. After fixing the OOB access (and adding more asserts), I confirmed that the android internal test passes.

Reland of D33255193 (20b8653dfa)
ghstack-source-id: 148830728

Test Plan:
Steps to test:

(1) connect to a mobile OD

(2) run `one_world android emulator android-29` in a terminal to start the android emulator

(3) In a separate terminal, run the test: `buck test //fbandroid/instrumentation_tests/com/facebook/pytorch/bi_xray:instrumentation_test -c test.external_runner=tpx -- --regex 'testBIXRayModel.*PyTorchBIXRayInstrumentationTest' --force-remote-execution --run-disabled`

I also ran `buck test fbandroid/mode/dbg //fbandroid/instrumentation_tests/com/facebook/pytorch/bi_xray:instrumentation_test`, which failed before and passed after the PR.

Reviewed By: albanD

Differential Revision: D34034848

fbshipit-source-id: 9677ee2c0a1afd1183896f7055009445712523c5
(cherry picked from commit 9ab9b12d35)
2022-02-14 16:02:29 +00:00
Shintaro Iwasaki
078247304a fix minor issues for ATen/ROCm (#71925)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71925

This patch fixes a few minor issues and introduces a few changes that were found while enabling ROCm compilation for ATen.

1. Minor type-related changes in `ATen/miopen/*`
This is to suppress compiler warnings.

2. [EXCLUDED] Hipify `ATen/native/miopen/*.cpp`
`ATen/native/miopen/*.cpp` includes `cuda/CUDAConfig.h`, which should be `hip/HIPConfig.h` (though compilation succeeds without this change, since currently `CUDAConfig.h` = `HIPConfig.h`).

3. Update `gen.py` to include `hip/EmptyTensor.h` instead of `cuda/EmptyTensor.h` for HIP compilation (see the sketch after this list)
`RegisterCUDA.cpp` (for HIP) should include `hip/EmptyTensor.h` (though compilation succeeds without this change, since currently `cuda/EmptyTensor.h` does not contain CUDA-specific logic).

4. Exclude the `USE_DIRECT_NVRTC` code when `USE_ROCM=0`.
Note that `USE_DIRECT_NVRTC` is always undefined for OSS compilation. It seems that this flag exists only for internal purposes.

5. [EXCLUDED] Exclude `frexp()` for ROCm <= 3.10
A newer ROCm (i.e., the officially supported ROCm versions) has `frexp()`, but an old ROCm (e.g., ROCm <= 3.10) doesn't. This preprocessor branch avoids a compilation error for old ROCm (though such an old ROCm is not officially supported).

6. Change an include path from `aten/src/ATen/` to `ATen/` in `SharedReduceOps.h`
This is, as far as I checked, the only place that includes `ATen` from `aten/src`. This change unifies the include format.
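
An illustrative Python sketch of the `gen.py` change in item 3; the
function and header paths are assumptions, not the exact codegen internals:

```python
def empty_tensor_include(rocm: bool) -> str:
    # RegisterCUDA.cpp, when compiled for HIP, should pull in the HIP header.
    return ("#include <ATen/hip/EmptyTensor.h>" if rocm
            else "#include <ATen/cuda/EmptyTensor.h>")

assert empty_tensor_include(rocm=True) == "#include <ATen/hip/EmptyTensor.h>"
```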

Test Plan: CI (including GitHub CI for ROCm)

Reviewed By: xw285cornell

Differential Revision: D33441758

fbshipit-source-id: 0853806c60de050d329b5ddddb8d51948f8f2788
(cherry picked from commit c2b8c16308)
2022-02-09 19:11:01 +00:00
francescocastelli
5e6f296612 Structured Kernel Precompute codegen handle fields without replacement (#71368)
Summary:
I've added parsing of an optional first line in native_functions.yaml, after the `precomputed` keyword, for arguments that will be precomputed without replacement. This line is optional, must come first, and does not contain an arrow.

These new fields are precomputed in the meta function as before and added to the precompute struct returned by the meta function. For now I've put them as the last args of the impl function, where they can be reused.

example:

native_functions.yaml:
```
  ...
  precomputed:
  - int numBatch, int numPlanes, int inputT, int inputH, int inputW   <- new
  - kernel_size -> int poolSizeT, int poolSizeH, int poolSizeW
  - output_size -> int outputT, int outputH, int outputW
```

meta:
```
TORCH_PRECOMPUTE_META_FUNC(fractional_max_pool3d)(
  const at::Tensor& input_,
  IntArrayRef pool_size,
  IntArrayRef output_size,
  const at::Tensor& randomSamples
) {
    ...

return TORCH_PRECOMPUTE_STRUCT(fractional_max_pool3d)().set_numBatch(numBatch).set_numPlanes(numPlanes).set_inputT(inputT).set_inputH(inputH).set_inputW(inputW)
  .set_poolSizeT(poolSizeT) ...
}
```

impl:
```
TORCH_IMPL_FUNC(fractional_max_pool3d_out_cpu)(
  const at::Tensor& input_,
  int64_t poolSizeT,
  int64_t poolSizeH,
  int64_t poolSizeW,
  int64_t outputT,
  int64_t outputH,
  int64_t outputW,
  const at::Tensor& randomSamples,
  const at::Tensor& output,
  const at::Tensor& indices,
  int64_t numBatch,    <- for now I've put them here
  int64_t numPlanes,
  int64_t inputT,
  int64_t inputH,
  int64_t inputW) {
```

Fixes https://github.com/pytorch/pytorch/issues/71314

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71368

Reviewed By: zou3519

Differential Revision: D33683984

Pulled By: bdhirsh

fbshipit-source-id: 33066dd92b8743aadf0dc8102f6bf0689f843242
(cherry picked from commit 64e46af6a4)
2022-02-08 03:56:56 +00:00
Guo Yejun
4d4b94b3cb gen_backend_stubs.py: fix typo for supported_autograd (#68562)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68562

Reviewed By: jbschlosser

Differential Revision: D32758608

Pulled By: bdhirsh

fbshipit-source-id: 496e1ec831edaa6fcc586f3c8f0361c31cad4e78
(cherry picked from commit 68ea9e9df5)
2022-02-08 01:28:42 +00:00
Jacob Szwejbka
791e7df7d9 Back out "free up dispatch key space (in C++)"
Summary: I think this diff stack broke all the related tasks below.

Test Plan:
For our failing tests:

buck test //fbandroid/instrumentation_tests/com/facebook/pytorch/bi_xray:instrumentation_test -c test.external_runner=tpx -- --regex 'testBIXRayModel.*PyTorchBIXRayInstrumentationTest' --force-remote-execution --run-disabled

For the ubn:

Not really sure what to do, trying to build the app and see if I can use an effect?

Reviewed By: shoumikhin

Differential Revision: D34018849

fbshipit-source-id: 3571718cb6621931af931b494e0a70d6e0164e65
(cherry picked from commit 3cc63cb2ea)
2022-02-05 01:25:42 +00:00
Brian Hirsh
20b8653dfa free up dispatch key space (in C++) (#69633)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69633

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33255193

Pulled By: bdhirsh

fbshipit-source-id: 79773e9c15bf4f2f27675121a49ff5ffd1375238
(cherry picked from commit eac0b13005)
2022-02-04 17:57:38 +00:00
Chen Lai
d23231fd8c Fix upgrader codegen when constant list is 0 (#72199)
Summary:
When the constant list is empty, the previous codegen would generate something like
```
std::vector<c10::IValue>({

}), // constants list,
```
However, it fails quick-check because it includes trailing whitespace. This PR generates the following instead.
```
std::vector<c10::IValue>(), // constants list,
```
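
The fix boils down to special-casing the empty list in the emitter. A
minimal sketch with assumed helper names (the real logic lives in the
upgrader codegen script):

```python
def emit_constants_list(constants: list) -> str:
    # An empty brace-initializer body would leave a blank line with trailing
    # whitespace, which fails quick-check; emit the no-argument form instead.
    if not constants:
        return "std::vector<c10::IValue>(), // constants list"
    body = ",\n".join(f"        {c}" for c in constants)
    return "std::vector<c10::IValue>({\n" + body + "\n}), // constants list"

print(emit_constants_list([]))  # std::vector<c10::IValue>(), // constants list
```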

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72199

ghstack-source-id: 148231023

Test Plan: CI

Reviewed By: tugsbayasgalan

Differential Revision: D33952046

fbshipit-source-id: 359b8a418928c89bbeb446b44774b312c94f03bc
(cherry picked from commit 060490f667)
2022-02-03 00:41:03 +00:00
Peter Bell
b0518b2705 Codegen: Do less work in dry-runs for sharded files (#69805)
Summary:
This improves a dry-run of `gen.py` from 0.80s to 0.45s.

`FileManager` in `dry_run` mode doesn't actually need to compute the
environment; it just records the filenames that would have been
written.

cc ezyang bhosmer bdhirsh

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69805

Reviewed By: ngimel

Differential Revision: D33944912

Pulled By: albanD

fbshipit-source-id: 74f22af3f2bd5afdef7105961270198566fa91e5
(cherry picked from commit 6fcdc15954)
2022-02-02 19:25:16 +00:00
Brian Hirsh
58dabebcd7 improve quantized error checking for structured kernels (#71928)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71928

Test Plan: Imported from OSS

Reviewed By: wconstab, bhosmer

Differential Revision: D33823417

Pulled By: bdhirsh

fbshipit-source-id: e894b9724833b77b12963cc4bf194bc6ce526ad9
(cherry picked from commit 6be10b79e7)
2022-02-01 16:09:45 +00:00
Chen Lai
784bd92340 Use upgrader_mobile.cpp as the reference for codegen unittest (#71930)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71930

Previously `fbcode/caffe2/test/mobile/test_upgrader_bytecode_table_example.cpp` was checked in as an intermediate step to make sure the upgrader codegen worked properly, before the upgrader codegen was actually being used.

This change uses `buck run mode/opt //caffe2/torch/fb/mobile/upgrader_codegen:upgrader_codegen` to codegen `upgrader_mobile.cpp`, so we no longer need the checked-in file `test_upgrader_bytecode_table_example.cpp` for the codegen unit test.
ghstack-source-id: 147957826

Test Plan:
```
buck test mode/opt //caffe2/test:upgrader_codegen
```

Reviewed By: tugsbayasgalan

Differential Revision: D33746264

fbshipit-source-id: 18de3cae53aed966e67f8dc42976a2d10d3788b3
(cherry picked from commit 661ffa7860)
2022-01-30 03:11:32 +00:00
Chen Lai
af65634d1c Move generated keyword out of gen_mobile_upgraders.py (#71938)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71938

The `generated` keyword triggers generated-file handling and hides the file's changes in review. It's also misleading, because `gen_mobile_upgraders.py` itself is not autogenerated. Separating the keyword out of `gen_mobile_upgraders.py` makes changes to `gen_mobile_upgraders.py` easier to see.
ghstack-source-id: 147957825

Test Plan:
```
buck run mode/opt //caffe2/torch/fb/mobile/upgrader_codegen:upgrader_codegen
```

Reviewed By: tugsbayasgalan

Differential Revision: D33826982

fbshipit-source-id: 593c19f8ef4c9da776b11650863dc43c0b171cd5
(cherry picked from commit 43038d5bc7)
2022-01-30 03:11:32 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
c5df294940 Fix bug in upgrader generation in mobile (#71578)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71578

Use more robust way of extracting upgrader min and max versions

Test Plan: omgitsgreen

Reviewed By: cccclai

Differential Revision: D33690113

fbshipit-source-id: 79a964acb26d7ca1354e104710a285b8da3f46d1
(cherry picked from commit 9e316ee5c1)
2022-01-28 18:20:59 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
e849c8b0f2 Move bytecode generation to python (#71681)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71681

Test Plan: Imported from OSS

Reviewed By: gmagogsfm, cccclai

Differential Revision: D33730791

Pulled By: tugsbayasgalan

fbshipit-source-id: e752e9ae20c01a57a3bea270f604215fdcc9182e
(cherry picked from commit 69c9dc0548)
2022-01-28 02:33:00 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
c9bd1c60ed Move upgraders from python to cpp (#70593)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70593

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D33402543

Pulled By: tugsbayasgalan

fbshipit-source-id: 713c54fbbb2bc4c96d5e3b6084f3090a8923a12d
(cherry picked from commit e72b375264)
2022-01-22 00:24:24 +00:00
Peter Bell
84fe4279db Structured Kernels: Use at::detail::empty functions (#70617)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70617

This reduces the divergence between the code generated for
`create_out` across different devices, and means the `TensorOptions`
don't need to be unpacked.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33623680

Pulled By: ngimel

fbshipit-source-id: 54f36774a8530be99c26a54270d4d95f3e38d684
(cherry picked from commit b22ba92e27)
2022-01-21 22:57:27 +00:00
Peter Bell
71a41323bb BackendSelect: Use at::_ops API and per-operator headers (#69840)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69840

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33160027

Pulled By: albanD

fbshipit-source-id: 0e492ec8bab73da90afd9df70f48c17a8206a768
(cherry picked from commit 133ec77e9f)
2022-01-21 21:44:24 +00:00
Peter Bell
2bb6a4f437 Generate aten_interned_strings.h automatically (#69407)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69407

This generates aten_interned_strings.h from `native_functions.yaml`,
which is more like how it was originally done. The items deleted from
`interned_strings.h` are duplicates that need to be removed in order
for the code to compile; some of the remaining items may still be out
of date, but even if that's the case it is fairly benign.
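
Conceptually the generator walks the parsed operator names and emits one
symbol entry each; a rough sketch (the macro name matches the generated
header, but the helper is illustrative):

```python
def gen_interned_strings(op_names: list) -> str:
    # op_names: base operator names parsed out of native_functions.yaml.
    lines = [f"_(aten, {name}) \\" for name in sorted(set(op_names))]
    return "#define FORALL_ATEN_BASE_SYMBOLS(_) \\\n" + "\n".join(lines)

print(gen_interned_strings(["sum", "add", "add"]))
```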

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D32923636

Pulled By: albanD

fbshipit-source-id: a0fd6b3714e70454c5f4ea9b19da5e047d2a4687
2022-01-18 08:29:54 -08:00
Zhengxu Chen
30699cbfd5 Reland D33284352: [jit][edge] Do not reuse mobile type parser for all unpicklers. (#71048)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71048

reland D33284352 (0a921ba0d0)
ghstack-source-id: 146735646

Test Plan: All Github CI: ciflow rerun -l ciflow/all

Reviewed By: gmagogsfm

Differential Revision: D33489731

fbshipit-source-id: 3e160209a1abb193ad3eed3018054aa7d331025e
2022-01-10 12:42:23 -08:00
Zhengxu Chen
9762aa0fdc Revert D33284352: [jit][edge] Do not reuse mobile type parser for all unpicklers.
Test Plan: revert-hammer

Differential Revision:
D33284352 (0a921ba0d0)

Original commit changeset: 997c4f110b36

Original Phabricator Diff: D33284352 (0a921ba0d0)

fbshipit-source-id: af316727442a64f1ae40d53d7a9d26ec550d634e
2022-01-07 19:58:03 -08:00
Zhengxu Chen
0a921ba0d0 [jit][edge] Do not reuse mobile type parser for all unpicklers. (#70338)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70338

Today, Unpickler is used by both server and mobile for deserializing models, and it always falls back to the mobile type parser when the user provides no type resolver. However, this is not intended, as the server and mobile type parsers support different things. In this diff we provide a default fallback using the script parser and opt out of it for all mobile cases.
ghstack-source-id: 146727330

(Note: this ignores all push blocking failures!)

Test Plan: CI

Reviewed By: iseeyuan

Differential Revision: D33284352

fbshipit-source-id: 997c4f110b36eee6596e8f23f6a87bf91a4197ed
2022-01-07 18:35:32 -08:00
Jiewen Tan
524bbb1442 [LTC] Sync gen_lazy_tensor.py from the staging branch (#70385)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70385

This commit syncs gen_lazy_tensor.py from the lazy_tensor_staging
branch to master.

Test Plan: CI in the lazy_tensor_staging branch.

Reviewed By: wconstab

Differential Revision: D33306232

Pulled By: alanwaketan

fbshipit-source-id: a15c72b22418637f851a6cd4901a9f5c4be75449
2022-01-06 13:12:37 -08:00
Brian Hirsh
7b8c43cd7c Revert "Revert D32498570: make codegen'd device guards not cuda-specific. Allow them to be used in external codegen" (#69951)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69951

This reverts commit 0ef523633f.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33113543

Pulled By: bdhirsh

fbshipit-source-id: b28073ee0870b413ea9f617f27671ae5c6f3c696
2022-01-04 14:53:21 -08:00
Brian Hirsh
bb5b4cceb6 Revert "Revert D32498569: allow external backend codegen to toggle whether to generate out= and inplace kernels" (#69950)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69950

This reverts commit f6cad53443.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33113545

Pulled By: bdhirsh

fbshipit-source-id: d6590294662588d36c09662dea65919ad4e1e288
2022-01-04 14:52:00 -08:00
Chen Lai
a0c99a8d3b [Operator Verioning][Edge] Update upgrader codegen with latest change (#70293)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70293

```
python /Users/chenlai/pytorch/tools/codegen/operator_versions/gen_mobile_upgraders.py

```
https://github.com/pytorch/pytorch/pull/70161 was landed to resolve a thread-safety issue. Accordingly, the upgrader codegen needs to be updated.
ghstack-source-id: 146296324

Test Plan:
```
buck test mode/opt //caffe2/test:upgrader_codegen
buck run mode/opt //caffe2/torch/fb/mobile/upgrader_codegen:upgrader_codegen
python /Users/chenlai/pytorch/tools/codegen/operator_versions/gen_mobile_upgraders.py

```

Reviewed By: iseeyuan

Differential Revision: D33274831

fbshipit-source-id: 0e1d2a81edc9b6111f3c6127dbd5b97e16c93dca
2021-12-28 18:34:31 -08:00
Brian Hirsh
5e222d08a1 Revert "Revert D32498572: allow external backend codegen to be used without autograd kernels" (#69949)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69949

This reverts commit 33363cea64.

Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D33113544

Pulled By: bdhirsh

fbshipit-source-id: e219f10d52776498c9ad273e97bca3e3406cf702
2021-12-21 08:19:37 -08:00
Richard Barnes
70ed4f3ffc Try dropping Torch from typeshed_internal (#69926)
Summary:
Removes the internal typeshed for PyTorch and replaces it with PyTorch's own type annotations.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69926

Generated files are in P471601595, P471601643, P471601662

Based on an example in D26410012

Test Plan: Sandcastle

Reviewed By: malfet, pradeep90

Differential Revision: D32292834

fbshipit-source-id: 5223f514cbdccd02c08ef0a027a48d92cdebed2c
2021-12-17 14:08:19 -08:00
Nikita Shulga
fa582045fc Fix lint/mypy violations (#70059)
Summary:
Introduced by https://github.com/pytorch/pytorch/pull/69194

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70059

Reviewed By: suo, cccclai

Differential Revision: D33170748

Pulled By: malfet

fbshipit-source-id: a2e42f37d04c21a735f6474e42eb6670d2a0c3b9
2021-12-16 14:06:27 -08:00
Chen Lai
b23890177f [Operator Versioning][Edge] Codegen upgrader_mobile.cpp (#69194)
Summary:
From the operator version map and the upgrader TorchScript, generate the upgrader_mobile.cpp file. This PR also includes a unit test.
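
As a rough sketch of the flow, one generated table row per entry in the
operator version map (the row layout here is an assumption, not the exact
upgrader_mobile.cpp format):

```python
def emit_upgrader_entry(op: str, min_ver: int, max_ver: int, upgrader: str) -> str:
    # Maps an operator to the bytecode-version range handled by an upgrader.
    return f'{{"{op}", {{{min_ver}, {max_ver}, "{upgrader}"}}}},'

print(emit_upgrader_entry("aten::div.Tensor", 0, 3, "div_Tensor_0_3"))
```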

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69194

ghstack-source-id: 145819351

Test Plan:
```
buck test mode/opt //caffe2/test:upgrader_codegen
```
```
buck run mode/opt //caffe2/torch/fb/mobile/upgrader_codegen:upgrader_codegen
```
```
python /Users/chenlai/pytorch/tools/codegen/operator_versions/gen_mobile_upgraders.py
```

Reviewed By: iseeyuan

Differential Revision: D32748985

fbshipit-source-id: f8437766edaba459bfc5e7fc7a3ca0520c4edb9a
2021-12-16 10:29:35 -08:00
Bin Bao
e6a4988b2d [LTC] Upstream utils in computation_client (#69621)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69621

Upstream the following utils
- metrics.h
- multi_wait.h
- thread_pool.h
- unique.h

Test Plan: Imported from OSS

Reviewed By: wconstab, VitalyFedyunin

Differential Revision: D32957629

Pulled By: desertfire

fbshipit-source-id: 5f2fb57493856556099b7cda7560a568d1f9ed97
2021-12-16 05:43:09 -08:00
Peter Bell
9c7c1b769a Functionalization: Only include headers for required ops (#68690)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68690

RegisterFunctionalization.cpp is a sharded file, so only including the
required operators means a single operator change requires only 1
shard to be rebuilt instead of all of them.

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D32596275

Pulled By: albanD

fbshipit-source-id: 8b56f48872156b96fbc0a16b542b8bab76b73fd4
2021-12-15 14:29:35 -08:00
Peter Bell
7bb4b683b5 Codegen: Registration now only includes the functions used (#68689)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68689

Currently Register{DispatchKey}.cpp includes all of
`NativeFunctions.h`, so any operator signature change requires all
backend registrations to be recompiled. However, most backends only
have registrations for a small fraction of operators, so it makes sense
to include only the specific functions required.
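
A hedged sketch of the include-pruning idea (helper and header names are
illustrative):

```python
def gen_registration_includes(registered_ops: list) -> str:
    # One per-operator header per kernel actually registered for this
    # dispatch key, instead of the umbrella NativeFunctions.h.
    return "\n".join(f"#include <ATen/ops/{op}_native.h>"
                     for op in sorted(set(registered_ops)))

# A backend registering only two ops now rebuilds only when these change:
print(gen_registration_includes(["empty", "add"]))
```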

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D32596273

Pulled By: albanD

fbshipit-source-id: 11d511f47937fbd5ff9f677c9914277b5d015c25
2021-12-15 14:29:32 -08:00
Peter Bell
6ba18ba87e Codegen: Generate static dispatch headers per operator (#68714)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68714

This splits the static dispatch headers (e.g. `CPUFunctions.h`)
into per-operator headers (e.g. `ops/empty_cpu_dispatch.h`), which is
needed when `Tensor.h` is compiled with static dispatch enabled.

There are also several places in ATen where the static dispatch
headers are used as an optimization even in dynamic dispatch builds.
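
A tiny sketch of the naming scheme, inferred from the example above (the
real codegen helper may differ):

```python
def static_dispatch_header(root_name: str, backend: str) -> str:
    # e.g. ("empty", "CPU") -> "ATen/ops/empty_cpu_dispatch.h"
    return f"ATen/ops/{root_name}_{backend.lower()}_dispatch.h"

assert static_dispatch_header("empty", "CPU") == "ATen/ops/empty_cpu_dispatch.h"
```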

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D32596265

Pulled By: albanD

fbshipit-source-id: 287783ef4e35c7601e9d2714ddbc8d4a5b1fb9e5
2021-12-15 14:29:29 -08:00
Peter Bell
bab61be43b Codegen: Add root_name property to NativeFunction{,sGroup} (#68687)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68687

This adds `NativeFunction.root_name`, which is the canonical name
for the operator group, i.e. the `BaseOperatorName` without inplace or
double-underscores. In the previous PR I referred to this as
`base_name`, but confusingly `BaseOperatorName` does potentially
include inplace or double-underscores.

I also add the property to `NativeFunctionsGroup` so that grouped
functions with type `Union[NativeFunction, NativeFunctionsGroup]`
can have the property queried without needing `isinstance` checks.
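
A simplified sketch of the property (field sets trimmed; the real classes
live in the codegen model):

```python
from dataclasses import dataclass

@dataclass
class BaseOperatorName:
    base: str             # e.g. "add"
    inplace: bool         # trailing underscore, e.g. "add_"
    dunder_method: bool   # double-underscore form, e.g. "__add__"

@dataclass
class NativeFunction:
    name: BaseOperatorName

    @property
    def root_name(self) -> str:
        # Canonical group name: the base without inplace/dunder decorations.
        return self.name.base

@dataclass
class NativeFunctionsGroup:
    functional: NativeFunction

    @property
    def root_name(self) -> str:
        # Same property on the group, so Union[NativeFunction,
        # NativeFunctionsGroup] callers need no isinstance checks.
        return self.functional.root_name
```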

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D32596271

Pulled By: albanD

fbshipit-source-id: 8b6dad806ec8d796dcd70fc664604670d668cae7
2021-12-15 14:28:10 -08:00
Brian Hirsh
33363cea64 Revert D32498572: allow external backend codegen to be used without autograd kernels
Test Plan: revert-hammer

Differential Revision:
D32498572 (b83b6f7424)

Original commit changeset: 3e7159c633f6

Original Phabricator Diff: D32498572 (b83b6f7424)

fbshipit-source-id: f93fa444c95a2423eef5975a2ecdb96f14e0c535
2021-12-14 15:28:49 -08:00
Brian Hirsh
f6cad53443 Revert D32498569: allow external backend codegen to toggle whether to generate out= and inplace kernels
Test Plan: revert-hammer

Differential Revision:
D32498569 (aa0cf68c17)

Original commit changeset: ebd932d042b9

Original Phabricator Diff: D32498569 (aa0cf68c17)

fbshipit-source-id: 21a393fa339510d926512a7983d33ece327b743d
2021-12-14 15:27:24 -08:00
Brian Hirsh
0ef523633f Revert D32498570: make codegen'd device guards not cuda-specific. Allow them to be used in external codegen
Test Plan: revert-hammer

Differential Revision:
D32498570 (2e7a91c45f)

Original commit changeset: 0ce6a5614417

Original Phabricator Diff: D32498570 (2e7a91c45f)

fbshipit-source-id: 7c64ce1b5e51a680b4aeae8721e0c9e15c793289
2021-12-14 15:04:10 -08:00
Brian Hirsh
2e7a91c45f make codegen'd device guards not cuda-specific. Allow them to be used in external codegen (#68531)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68531

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D32498570

Pulled By: bdhirsh

fbshipit-source-id: 0ce6a5614417671313b4d274ea84742c5b81d1b0
2021-12-14 10:25:04 -08:00
Brian Hirsh
aa0cf68c17 allow external backend codegen to toggle whether to generate out= and inplace kernels (#68530)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68530

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D32498569

Pulled By: bdhirsh

fbshipit-source-id: ebd932d042b988e19c71aa04a21677db9bdc9f04
2021-12-14 10:25:02 -08:00
Brian Hirsh
b83b6f7424 allow external backend codegen to be used without autograd kernels (#68529)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68529

Test Plan: Imported from OSS

Reviewed By: wconstab

Differential Revision: D32498572

Pulled By: bdhirsh

fbshipit-source-id: 3e7159c633f6a80b60faa068436a4c49ebe731ca
2021-12-14 10:23:12 -08:00
Peter Bell
4829dcea09 Codegen: Generate separate headers per operator (#68247)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68247

This splits `Functions.h`, `Operators.h`, `NativeFunctions.h` and
`NativeMetaFunctions.h` into separate headers per operator base name.
With `at::sum` as an example, we can include:
```cpp
<ATen/ops/sum.h>         // Like Functions.h
<ATen/ops/sum_ops.h>     // Like Operators.h
<ATen/ops/sum_native.h>  // Like NativeFunctions.h
<ATen/ops/sum_meta.h>    // Like NativeMetaFunctions.h
```

The umbrella headers are still being generated, but all they do is
include from the `ATen/ops` folder.

Further, `TensorBody.h` now only includes the operators that have
method variants, which means files that only include `Tensor.h` don't
need to be rebuilt when you modify function-only operators. Currently
there are about 680 operators that don't have method variants, so this
is potentially a significant win for incremental builds.

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D32596272

Pulled By: albanD

fbshipit-source-id: 447671b2b6adc1364f66ed9717c896dae25fa272
2021-12-14 06:40:08 -08:00
anjali411
3e6164449f Add efficient zero tensors (#64837)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64837

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D32834987

Pulled By: anjali411

fbshipit-source-id: 20ea08ade0db0044ca633d9c1a117a6a2e65d1fd
2021-12-08 10:37:39 -08:00
Bin Bao
8a975c0106 [LT] Sync with the lazy_tensor_staging branch (#69527)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69527

- Add missing TORCH_API in class/struct declarations;
- Fix internal op declarations in ltc_ops;
- Update lazy_ts_lowering.py

Test Plan: Imported from OSS

Reviewed By: alanwaketan

Differential Revision: D32918929

Pulled By: desertfire

fbshipit-source-id: e956d51aff5ef593fdf4cd5ad2a38e38788913d8
2021-12-07 16:47:35 -08:00
Peter Bell
9a7732e852 CMake: Support dynamic codegen outputs (#68246)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68246

Currently the codegen produces a list of output files at CMake
configuration time and the build system has no way of knowing if the
outputs change. So if that happens, you basically need to delete the
build folder and re-run from scratch.

Instead, this generates the output list every time the code generation
is run and changes the output to be a `.cmake` file that gets included
in the main cmake configuration step. That means the build system
knows to re-run cmake automatically if a new output is added. So, for
example you could change the number of shards that `Operators.cpp` is
split into and it all just works transparently to the user.

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D32596268

Pulled By: albanD

fbshipit-source-id: 15e0896aeaead90aed64b9c8fda70cf28fef13a2
2021-12-07 15:58:06 -08:00
Will Constable
855365e9c4 Clean up dead code (#69296)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69296

Remove a commented-out block of code that was accidentally checked in.

Test Plan: no testable changes

Reviewed By: alanwaketan

Differential Revision: D32799197

fbshipit-source-id: d3eb05cbafb0f5a4a3f41c17f66ca6d0c2fc60b7
2021-12-03 15:11:38 -08:00
Mark Richardson
834bd3134e Back out "Add efficient zero tensors" (#69327)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69327

Original commit changeset: d44096d88265

Original Phabricator Diff: D32144240 (668574af4a)

Test Plan:
CI

original diff failed 175 builds in CI

Reviewed By: airboyang, anjali411

Differential Revision: D32809407

fbshipit-source-id: c7c8e69bcee0274992e2d5da901f035332e60071
2021-12-02 19:11:41 -08:00