From PR: https://github.com/pytorch/pytorch/pull/58691, Replacing the second input of `Gather` 0 to 1 affects other innocent Nodes. In Issue #91526 onnx::range starts from 0, the 0 is changed by this mechanism, as it's shared with onnx::Gather. This PR intends to create a whole independent Constant 0 for replacement. NOTE: The PR passes all existing RNN tests locally in case CI doesn't include RNN test.
~~TODO: test~~
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93120
Approved by: https://github.com/BowenBao
Not only is this change usually shorter and more readable, it also can yield better performance. size() is not always a constant time operation (such as on LinkedLists), but empty() always is.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93236
Approved by: https://github.com/malfet
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68691
TraceType is a sharded file, so by only including specific operator
headers, we ensure that changing one (non-method) operator only needs
one shard to be re-compiled.
This also changes all the included autograd and jit headers from
including `ATen/ATen.h` to just including `ATen/core/Tensor.h`.
Test Plan: Imported from OSS
Reviewed By: gchanan
Differential Revision: D33336948
Pulled By: albanD
fbshipit-source-id: 4e40371592b9a5a7e7fcd1d8cecae11ffb873113
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68691
TraceType is a sharded file, so by only including specific operator
headers, we ensure that changing one (non-method) operator only needs
one shard to be re-compiled.
This also changes all the included autograd and jit headers from
including `ATen/ATen.h` to just including `ATen/core/Tensor.h`.
Test Plan: Imported from OSS
Reviewed By: jbschlosser, malfet
Differential Revision: D32596264
Pulled By: albanD
fbshipit-source-id: 2f28b62d7b9932f30fad7daacd8ac5bb7f63c621
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66140
* Add new argument to export api to enable users specifying `nn.Module` classes that they wish to be exported as local function in ONNX model.
* Refactor `torch/csrc/jit/serialization/export.cpp`, and remove redundant `EncoderBase` class.
* ~~Contains changes from #63268~~
* Depends on #63716 to update onnx submodule.
Test Plan: Imported from OSS
Reviewed By: jansel
Differential Revision: D31424098
fbshipit-source-id: c949d0b01c206c30b4182c2dd1a5b90e32b7a0d3
Co-authored-by: BowenBao <bowbao@microsoft.com>
Summary:
Fixes https://github.com/pytorch/pytorch/issues/45255
Mostly straightforward. Only downside in this PR is the lack of more scalable way to check for all newly-created nodes in `callPySymbolicFunction`. The other options were:
* Create a scope within the node's scope and loop through all nodes that correspond to the scope. The code would still need to loop through all nodes.
* Add extra state to the graph (no good reason to do so).
* Add extra state to the ONNX exporter, since python calls go back to `g.op(...)` (no good reason to do so, also not very pythonic).
cc BowenBao neginraoof
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45256
Reviewed By: malfet, houseroad
Differential Revision: D31744281
Pulled By: msaroufim
fbshipit-source-id: 1b63f6e7f02ed61b3a9b7ac3d0be0a3a203c8ff6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61558
When we construct an empty list by python list comprehension, we need to avoid converting the node without inputs to onnx::Concat in shape_type_inference.cpp and peephole.cpp because it will create an invalid Concat node which doesn't have inputs.
In addition, update the code to avoid passing a Sequence input to an onnx::Cast node which doesn't accept Sequence data type as an input.
Add tests for the validation as well.
Test Plan: Imported from OSS
Reviewed By: nikithamalgifb
Differential Revision: D29767989
Pulled By: SplitInfinity
fbshipit-source-id: f97f172ff20eebda4c3744c7a934df36716f12a2
Co-authored-by: fatcat-z <jiz@microsoft.com>
Summary:
* Minor: spelling, grammar.
* Add calls to `GRAPH_DUMP()` where they were missing.
* Add or expand a few comments.
* Move a few comments to seemingly more appropriate spots.
* In canonicalize_graph_fuser_ops.cpp inline `runnableInputs()` since it
was only called in one place and had a misleading comment and
confusing name.
* In `PeepholeOptimizeImpl::optimizeBlock()`, set `changed = true;` when
removing `aten::is_complex`. Pretty sure its absence was a bug.
* Delete unused `_jit_pass_remove_inplace_ops` and and its
implementation `RemoveInplaceOps()`.
* In `preprocessCaffe2Ops()`, remove redundant check for nested optional
types. It was already checked in `checkONNXCompatibility()`.
* In `EncoderBase::AddAttribute`, log the unexpected attribute kind.
I don't remember the repro case now but I did hit this error at some
point and this additional logging made it easier to understand.
* In `fuseConvBatchNorm()` in eval_peephole.cpp, consistently use
camelCase instead of snake_case for local variables.
* Add curly braces around the bodies of if and loops.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60390
Reviewed By: Krovatkin
Differential Revision: D29523283
Pulled By: SplitInfinity
fbshipit-source-id: 4e16c5648616f53da07d68dab7fdf252e06a0752
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58691
Note the first commit in this PR has its own pull request here since it seemed self-contained:
https://github.com/pytorch/pytorch/pull/57082
* [ONNX] simplify batch_first logic in RNN tests
* [ONNX] support GRU with packed input in scripting mode
This required two changes:
* Add as_tensor to symbolic_opset9.py
* Change torch::jit::pushPackingPastRnn to recognize and properly
replace another use of the batch_sizes output of prim::PackPadded.
Previously the code assumed that the first use was as input to the
RNN operator. However in some cases, it is also used to compute
max_batch_size. For example in this code:
https://github.com/pytorch/pytorch/blob/febff45/torch/nn/modules/rnn.py#L815-L815
With these changes the GRU tests now pass in scripting mode for opset
version >= 11.
Test Plan: Imported from OSS
Reviewed By: driazati
Differential Revision: D28714805
Pulled By: SplitInfinity
fbshipit-source-id: f19647a04533d9ec76399a8793b3f712ea0337d2
Co-authored-by: Gary Miguel <garymiguel@microsoft.com>
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57603
With explicit list unpack code from user, it is possible to observe `prim::ListUnpack` with a `ONNX::Sequence` object.
This PR supports the conversion.
Test Plan: Imported from OSS
Reviewed By: malfet
Differential Revision: D28393527
Pulled By: SplitInfinity
fbshipit-source-id: 1e6234d349b94c97c6ff20880a801433a9a428e9
Co-authored-by: BowenBao <bowbao@microsoft.com>
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os
def get_compiled_files_list():
import json
with open("build/compile_commands.json") as f:
data = json.load(f)
files = [os.path.relpath(node['file']) for node in data]
for idx, fname in enumerate(files):
if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
return files
def run_clang_tidy(fname):
check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"])
changes = check_output(["git", "ls-files", "-m"])
if len(changes) == 0:
return
check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"])
def main():
git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
compiled_files = get_compiled_files_list()
for idx, fname in enumerate(git_files):
if fname not in compiled_files:
continue
if fname.startswith("caffe2/contrib/aten/"):
continue
print(f"[{idx}/{len(git_files)}] Processing {fname}")
run_clang_tidy(fname)
if __name__ == "__main__":
main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52349
Adds a check for patterns for cases with autocasting enabled in which a cast node is inserted before the NegativeLogLikelihoodLoss
node and causing these patterns below not to be recognizable by peephole pass function
Test Plan: Imported from OSS
Reviewed By: malfet
Differential Revision: D26490326
Pulled By: SplitInfinity
fbshipit-source-id: 4a6d806acc51b4696fd3932734d55af075fba6b1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50229
`fastmod -m 'cast(<((at|c10)::)?\w+Type>\(\)\s*)->' 'castRaw${1}->'` Presuming it builds, this is a safe change: the
result of `cast()` wasn't being saved anywhere, so we didn't need
it, so we can use a raw pointer instead of a new `shared_ptr`.
ghstack-source-id: 120769170
Test Plan: CI
Reviewed By: SplitInfinity
Differential Revision: D25837494
fbshipit-source-id: 46319100dc0dfc78f6d2b45148207f83481f2ada
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50906
In opset 13, squeeze/unsqueeze is updated to take axes as input, instead of attribute.
Test Plan: Imported from OSS
Reviewed By: pbelevich
Differential Revision: D26050883
Pulled By: SplitInfinity
fbshipit-source-id: 7b5faf0e016d476bc75cbf2bfee6918d77e8aecd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50228
`fastmod -m 'expect(<((at|c10)::)?\w+Type>\(\)\s*)->'
'expectRef${1}.'`
Presuming it builds, this is a safe change: the result of `expect()`
wasn't being saved anywhere, so we didn't need it, so we can take a
reference instead of a new `shared_ptr`.
ghstack-source-id: 119782961
Test Plan: CI
Reviewed By: SplitInfinity
Differential Revision: D25837374
fbshipit-source-id: 86757b70b1520e3dbaa141001e7976400cdd3b08
Summary:
* Support propagating `dim_param` in ONNX by encoding as `ShapeSymbol` in `SymbolicShape` of outputs. If export is called with `dynamic_axes` provided, shape inference will start with these axes set as dynamic.
* Add new test file `test_pytorch_onnx_shape_inference.py`, reusing all test cases from `test_pytorch_onnx_onnxruntime.py`, but focus on validating shape for all nodes in graph. Currently this is not enabled in the CI, since there are still quite some existing issues and corner cases to fix. The test is default to run only at opset 12.
* Bug fixes, such as div, _len, and peephole.cpp passes for PackPadded, and LogSoftmaxCrossEntropy.
* This PR depends on existing PR such as 44332.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44920
Reviewed By: eellison
Differential Revision: D23958398
Pulled By: bzinodev
fbshipit-source-id: 00479d9bd19c867d526769a15ba97ec16d56e51d
Summary:
* Support sequence type (de)serialization, enables onnx shape inference on sequence nodes.
* Fix shape inference with block input/output: e.g. Loop and If nodes.
* Fix bugs in symbolic discovered by coverage of onnx shape inference.
* Improve debuggability: added more jit logs. For simplicity, the default log level, when jit log is enabled, will not dump ir graphs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43929
Reviewed By: albanD
Differential Revision: D23674604
Pulled By: bzinodev
fbshipit-source-id: ab6aacb16d0e3b9a4708845bce27c6d65e567ba7
Summary:
It is often that the conversion from torch operator to onnx operator requires input rank/dtype/shape to be known. Previously, the conversion depends on tracer to provide these info, leaving a gap in conversion of scripted modules.
We are extending the export with support from onnx shape inference. If enabled, onnx shape inference will be called whenever an onnx node is created. This is the first PR introducing the initial look of the feature. More and more cases will be supported following this PR.
* Added pass to run onnx shape inference on a given node. The node has to have namespace `onnx`.
* Moved helper functions from `export.cpp` to a common place for re-use.
* This feature is currently experimental, and can be turned on through flag `onnx_shape_inference` in internal api `torch.onnx._export`.
* Currently skipping ONNX Sequence ops, If/Loop and ConstantOfShape due to limitations. Support will be added in the future.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40628
Reviewed By: mrshenli
Differential Revision: D22709746
Pulled By: bzinodev
fbshipit-source-id: b52aeeae00667e66e0b0c1144022f7af9a8b2948
Summary:
in `_jit_pass_onnx`, symbolic functions are called for each node for conversion. However, there are nodes that cannot be converted without additional context. For example, the number of outputs from split (and whether it is static or dynamic) is unknown until the point where it is unpacked by listUnpack node. This pass does a preprocess, and prepares the nodes such that enough context can be received by the symbolic function.
* After preprocessing, `_jit_pass_onnx` should have enough context to produce valid ONNX nodes, instead of half baked nodes that replies on fixes from later postpasses.
* `_jit_pass_onnx_peephole` should be a pass that does ONNX specific optimizations instead of ONNX specific fixes.
* Producing more valid ONNX nodes in `_jit_pass_onnx` enables better utilization of the ONNX shape inference https://github.com/pytorch/pytorch/issues/40628.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41832
Reviewed By: ZolotukhinM
Differential Revision: D22968334
Pulled By: bzinodev
fbshipit-source-id: 8226f03c5b29968e8197d242ca8e620c6e1d42a5
Summary:
* move both under new file `fixup_onnx_controlflow`
* move the fixup to where the ONNX loop/if node is created, as oppose to running the fixup as postpass. This will help with enable onnx shape inference later.
* move `fuseSequenceSplitConcat` to `Peephole`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40943
Reviewed By: mrshenli
Differential Revision: D22709999
Pulled By: bzinodev
fbshipit-source-id: 51d316991d25dc4bb4047a6bb46ad1e2401d3d2d
Summary:
Add ONNX export support for torch.nn.CrossEntropyLoss.
This PR makes following changes:
1. Updates nll_loss export
2. Makes a post pass for SoftmaxCrossEntropy
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34830
Reviewed By: hl475
Differential Revision: D21230712
Pulled By: houseroad
fbshipit-source-id: c81911a41968e23813ba10274340ce4d8ba1ed78
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35115
This commit runs the newly added tools/clang_format.py on the JIT
codebase and includes all of the formatting changes thus produced.
Testing:
Ran the script, CI.
Test Plan: Imported from OSS
Reviewed By: eellison
Differential Revision: D20568523
Pulled By: SplitInfinity
fbshipit-source-id: e09bdb982ccf090eecfb7c7b461b8d0681eef82b
Summary:
torch.mm is exported as Gemm operator in ONNX and both have an optional input: out.
out is considered as broadcastable in Gemm and during graph optimization the optional input (out) would get selected. Since out is optional, in case when it is not defined in torch.mm that would result in the following exception:
IndexError: vector::_M_range_check: __n (which is 2) >= this->size() (which is 2)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34661
Reviewed By: hl475
Differential Revision: D20496398
Pulled By: houseroad
fbshipit-source-id: e677aef0a6aefb1f83a54033153aaabe5c23bc0f