gen_static_runtime_ops hasn't been updated in a while. In preparation for https://github.com/pytorch/pytorch/pull/127675 in which I need to re-run the codegen step for cumprod, I want to land these changes beforehand in case there are any other issues that arise.
I added a number of ops to the blocklist:
```
+ "_nested_tensor_storage_offsets",
+ "_nested_get_values", # no CPU backend
+ "_nested_get_values_copy", # no CPU backend
+ "_nested_view_from_jagged", # testing needs to be patched
+ "_nested_view_from_jagged_copy", # testing needs to be patched
+ "_nested_view_from_buffer", # testing needs to be patched
+ "_nested_view_from_buffer_copy", # testing needs to be patched
+ "_int_mm", # testing needs to be patched
+ "_to_sparse_csc", # testing needs to be patched
+ "_to_sparse_csr", # testing needs to be patched
+ "segment_reduce", # testing needs to be patched
```
Most of these are added just because testing doesn't work right now.
Additionally, a few `fft` ops seem to have been removed from native_functions.yaml; I'm guessing it's unlikely FFT would have been used in many real models though.
Differential Revision: [D58329403](https://our.internmc.facebook.com/intern/diff/D58329403/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128299
Approved by: https://github.com/YuqingJ
Summary:
Add script to go through view ops in "native_functions.yaml" and auto-register them into static runtime and auto-generate op unit tests for each.
Overall there are 96 grouped view ops, among which 21 is already registered by hand; 9 (including sparse ops/training related ops etc.) are not the target of static runtime; 30 has list args or list ret; and 7 has non-basic types such as "Dimname", "MemoryFormat", etc. In summary, this script auto-generate 29 view ops for now.
Run `buck run //caffe2/torch/fb/jit:gen_static_runtime_ops` to generate static runtime ops, and the results with this script are,
```
total grouped native ops: 1582
grouped native ops with out variant: 548
generated functions groups with out variant: 241
view grouped native ops: 96
generated functions view groups: 29
overall generated : 270
```
The generated view ops are added in D36258968
Test Plan:
Generate static runtime ops: `buck run //caffe2/torch/fb/jit:gen_static_runtime_ops`
Unit tests: `buck run mode/opt //caffe2/benchmarks/static_runtime:static_runtime_cpptest`
Differential Revision: D36258767
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77105
Approved by: https://github.com/mikeiovine
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76203
Request for comments:
This change adds extra code generator support to generate out variant wrappers for operators with unstructured kernels.
The current version generates 105 new out variant wrappers in addition to the existing 136 auto-generated out variants wrappers.
This change shows that a simple tweak can increase the generated op coverage to 16% (241/1559) among all native ops described in native_functions.yaml no. matter if they are structured or not.
Command to generate out variant wrappers.
```
buck run //caffe2/torch/fb/jit:gen_static_runtime_ops
```
- AFTER this change
```
total grouped native ops: 1559
structured grouped native ops: 545
generated grouped native ops: 241
```
- BEFORE this change
```
total grouped native ops: 1503
structured grouped native ops: 540
generated grouped native ops: 136
```
To enable CI tests and make it easier to review, the generated ops are added in a separate diff: D35945633
More details:
We added a block list to remove the generation of around 10 operations that are deprecated or for which the unit test would fail. All generated ops are well *compiled* but the compiled unittest may not pass due to the lack of hand-picked test input values for certain ops. Among the 42 ops whose unittest does not pass, 1 (op "index_select") is repeated from the existing ops; 32 ops are fixed; and 9 ops are removed and blocked from generation because either it is not being commonly used in internal models such as "cholesky", "linalg_householder_product", sparse kernel "sspaddmm", or it causes some errors in static runtime such as "conj_physical" leads to an error in memory planner, and "binary_cross_entropy".
Test Plan:
OP generation:
```buck run //caffe2/torch/fb/jit:gen_static_runtime_ops```
Test generated ops:
```buck run mode/opt //caffe2/benchmarks/static_runtime:static_runtime_cpptest```
Reviewed By: tenpercent
Differential Revision: D34913736
fbshipit-source-id: a6f408321653c3589ae1c76826177fc403d59c44
(cherry picked from commit 6f4501730478dbaeeea7f3ad4f9d29bf6787e7c1)