In almost all cases this is only included for writing the output formatter, which
only uses `std::ostream`, so including `<ostream>` is sufficient.
The `<istream>` header is ~1000 lines, so the difference is non-trivial.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106914
Approved by: https://github.com/lezcano
Summary:
Previously we copied a meta merge and used it as a skeleton for d2d merge replication. However, some models, such as prospector, have the CPU op LongIndex, which takes a long time to load. That makes the meta merge copy expensive.
Modify `jit::Module::deepcopy()` to allow device copy. This simplifies user code and removes unnecessary copies such as the tempfile and the meta merge.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106521
Approved by: https://github.com/davidberard98
As we live in a C++17 world.
This is a functional no-op, just:
- `s/namespace at { namespace native {/namespace at::native {/`
- `s/namespace torch { namespace jit {/namespace torch::jit {/`
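Both spellings open the same namespace, which is why the substitution is a no-op. A minimal sketch of the equivalence follows; the two function names are hypothetical and exist only for illustration.

```cpp
// Pre-C++17 spelling: one namespace block nested inside another.
namespace at { namespace native {
inline int declared_old_style() { return 1; }  // hypothetical function
}} // namespace at::native

// C++17 nested namespace definition: reopens exactly the same namespace,
// so names declared in the block above are visible here unqualified.
namespace at::native {
inline int declared_new_style() { return declared_old_style() + 1; }  // hypothetical function
} // namespace at::native
```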
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92100
Approved by: https://github.com/izaitsevfb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71778
This assert was broken (never triggers). Fixing the assert leads to test failures. We need to fix those test failures, so a FIXME has been filed. The urgency is avoiding the compile time failure that will come with enabling `-Wstring-conversion` as an error.
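One common way an assert ends up never triggering, and the pattern that `-Wstring-conversion` flags, is a string literal used as the condition. Whether the assert in this change had exactly this form is an assumption; the sketch below only illustrates the general pattern with a hypothetical function.

```cpp
#include <cassert>

// Hypothetical function, for illustration only.
inline int checked_size(int size) {
  // Broken form: the literal converts to a non-null pointer, which is always
  // true, so an assert written like this can never fire:
  //   assert("size must be non-negative");
  // Working form: test the real predicate; the literal only supplies the
  // message that shows up in the failure output.
  assert(size >= 0 && "size must be non-negative");
  return size;
}
```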
Test Plan: CI Pass
Reviewed By: r-barnes
Differential Revision: D33754171
fbshipit-source-id: 834b070b94007af583d0fc6c022f23b6703f3fbc
(cherry picked from commit ac8f905fb1)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69406
Most files that include `interned_strings.h` don't actually depend on
anything generated from `FORALL_NS_SYMBOLS`, yet because it all lives in a
single file, they need to be recompiled whenever a new symbol is added. Here
I move the class definition into a separate file so this doesn't
happen.
Test Plan: Imported from OSS
Reviewed By: zou3519
Differential Revision: D32923637
Pulled By: albanD
fbshipit-source-id: 6e488cbfcfe2c041a99d9ff22e167dbddf3f46d7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65967
Graph is an implementation detail. If a user wants access to the
underlying graph, they should be able to explicitly dynamic cast instead.
ghstack-source-id: 141659819
Test Plan: no behavior change.
Reviewed By: gmagogsfm
Differential Revision: D31326153
fbshipit-source-id: a0e984f57c6013494b92a7095bf5bb660035eb84
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64066
I noticed a bunch of time being spent heap-allocating Tuples
in the unpickler. 1-, 2-, and 3-element Tuples are apparently common
enough that they get their own bytecode instructions, so I decided to
try also giving them their own representation. We store up to 3
IValues inline in `Tuple` rather than doing a second heap allocation
for a `std::vector<IValue>`.
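The idea can be sketched as a small container that keeps up to three elements inline and only falls back to a vector beyond that. This is a simplified illustration, not the actual `TupleElements` implementation: the class name `SmallElements` is hypothetical, it stores `int` instead of `IValue`, and it keeps an empty `std::vector` member rather than overlapping the two representations.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical, simplified stand-in for the idea behind TupleElements.
class SmallElements {
 public:
  SmallElements(std::initializer_list<int> values) : size_(values.size()) {
    if (values.size() <= kInlineCapacity) {
      // Small case: copy into the inline buffer, no heap allocation.
      std::copy(values.begin(), values.end(), inline_.begin());
    } else {
      // Large case: fall back to a heap-allocated vector.
      heap_.assign(values.begin(), values.end());
    }
  }

  std::size_t size() const { return size_; }
  bool is_inline() const { return size_ <= kInlineCapacity; }

  int operator[](std::size_t i) const {
    return is_inline() ? inline_[i] : heap_[i];
  }

 private:
  static constexpr std::size_t kInlineCapacity = 3;
  std::array<int, kInlineCapacity> inline_{};
  std::vector<int> heap_;  // stays empty (no allocation) in the small case
  std::size_t size_;
};
```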
ghstack-source-id: 140695395
Test Plan:
Added automated tests for TupleElements.
Pixel 3 before: https://www.internalfb.com/intern/aibench/details/761596366576284
Pixel 3 after: https://www.internalfb.com/intern/aibench/details/591414145082422
We went from 347 ms to 302 ms.
Reviewed By: dhruvbird
Differential Revision: D30592622
fbshipit-source-id: 93625c54c9dca5f765ef6d5c191944179cb281a8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62442
For PythonMethodWrapper::setArgumentNames, make sure to use the correct method
specified by method_name_ rather than the parent model_ obj, which itself
_is_ callable but does not have the right signature to extract.
For Python vs Script, unify the behavior to skip the 'self' parameter, so we only
list the names of the unbound arguments, which is what we need in practice.
Test Plan: update unit test and it passes
Reviewed By: alanwaketan
Differential Revision: D29965283
fbshipit-source-id: a4e6a1d0f393f2a41c3afac32285548832da3fb4
Summary:
As the GoogleTest `TEST` macro, as well as `DEFINE_DISPATCH`, is non-compliant with the `cppcoreguidelines-avoid-non-const-global-variables` check.
All changes except the ones to `.clang-tidy` were generated using the following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`; do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008
Reviewed By: driazati, r-barnes
Differential Revision: D29838584
Pulled By: malfet
fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61856
This diff does the following:
1. It implements IMethod::getArgumentNames() for all of IMethod's subclasses.
2. It refactors PyTorchDeployPredictor to use IMethod for model execution.
Test Plan:
[... ~/fbsource/fbcode/caffe2] buck test mode/dev caffe2/fb/predictor:pytorch_predictor_test -- PyTorchDeployPredictor
[... ~/fbsource/fbcode/caffe2] buck test mode/dev caffe2/fb/predictor:pytorch_predictor_test -- PyTorchPredictor
Reviewed By: wconstab
Differential Revision: D29648756
fbshipit-source-id: e047345f26ce495a5d74d8063f7f8edc32a1b13c
Summary:
Freezing exists as a pass which partially evaluates your model and applies generic optimizations which should speed it up. Optimize for inference is a counterpart to these optimizations which runs build- and server-specific optimizations. The interaction with the existing `optimize_frozen_module` is not great; I guess we could just deprecate the API entirely? It was never officially released but just existed to document the `optimize_numerics` keyword.
Eventually, I would like to add a way of adding example inputs, but I didn't add that here because they are not being used at all yet. I also have not yet included a way to blacklist individual optimizations, and would like to wait until we move this to Beta and have a little more clarity on how everything will fit together. I also think blacklisting will be an uncommon use case for the current optimizations.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58193
Reviewed By: bertmaher, navahgar
Differential Revision: D28443714
Pulled By: eellison
fbshipit-source-id: b032355bb2585720a6d2f00c89d0d9a7ef60e649
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56152
Currently, the Bundled Inputs API mutates the module in-place. It adds class methods and not instance methods. This causes a small problem: one can't re-run an already executed cell in Bento if the class has already been subject to bundled inputs.
In addition, there is no way to add bundled inputs to a module that already has bundled inputs. This API solves that problem as well by adding an `ignored_methods` argument to the call to `clone()`: the bundled inputs implementation passes in the methods that it will add as `ignored_methods`, so that when it later tries to add those methods, it can do so successfully.
We'll have to be careful when ignoring those methods during the call to `torch.jit._clone_module_with_class` since any bundled input that relies on a user-provided method will need to be preserved and not ignored during the clone.
Looking for feedback on whether this is an acceptable direction.
ghstack-source-id: 128908360
Test Plan:
Added unit test and ran it as `buck test //caffe2/test:mobile`
Also see this Bento Notebook: https://www.internalfb.com/intern/anp/view/?id=550829
Reviewed By: gmagogsfm
Differential Revision: D27788394
fbshipit-source-id: 48109cd4583506d4efdb345e4ba31385db23a273
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os


def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files


def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"])


def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)


if __name__ == "__main__":
    main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
The code uses `torch::jit::jit_log_prefix` for handling recursive
indenting in most places in this function. There was one place that was
using "level", but it was buggy -- it would result in a compounding
superlinear indent. Note that changing it to "level+1" doesn't fix the
bug.
Before/after:
https://gist.github.com/silvasean/8ee3ef115a48de6c9c54fbc40838d8d7
The new code establishes a recursive invariant for
`Module::dump_to_str`: the function returns the module printed at the
base indent level (i.e. no indent). `torch::jit::jit_log_prefix` is used
to prefix recursive calls. The code was already nearly there, except for
this spurious use of "level".
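The invariant can be sketched as follows: each call returns its text with no indent, and the caller prefixes every line of a child's output exactly once. This is an illustrative sketch, not the actual `Module::dump_to_str` code; `Node` and `prefix_lines` are hypothetical stand-ins for the module tree and for `torch::jit::jit_log_prefix`.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a module with submodules.
struct Node {
  std::string name;
  std::vector<Node> children;
};

// Hypothetical helper: prepends `prefix` to every line of `text`.
inline std::string prefix_lines(const std::string& prefix, const std::string& text) {
  std::istringstream in(text);
  std::ostringstream out;
  std::string line;
  while (std::getline(in, line)) {
    out << prefix << line << '\n';
  }
  return out.str();
}

// Invariant: the result is printed at the base indent level (no indent).
// Children are indented by prefixing their already-formatted output once,
// so the indent grows linearly with depth instead of compounding.
inline std::string dump_to_str(const Node& node) {
  std::ostringstream out;
  out << node.name << " {\n";
  for (const Node& child : node.children) {
    out << prefix_lines("  ", dump_to_str(child));
  }
  out << "}\n";
  return out.str();
}
```

Because no call needs to know its own depth, there is no "level" parameter left to get wrong.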
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52539
Reviewed By: navahgar
Differential Revision: D26773657
Pulled By: gmagogsfm
fbshipit-source-id: ab476f0738bf07de9f40d168dd038dbf62a9a79e
Summary:
Update the freezing API for 1.8, and add a corresponding C++ API. The `optimize` flag hasn't been publicly released yet, so we are able to change it without breaking BC. I will submit a PR to the release branch as well; there are a few more days to do that.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52337
Reviewed By: ejguan
Differential Revision: D26491833
Pulled By: eellison
fbshipit-source-id: 6dcd74eb8f76db64ac53183d03dabdd0f101f4b5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48863
Support default arguments when invoking a module via PyTorch Lite (`mobile::Module`).
Test Plan:
buck test mode/dbg //caffe2/test/cpp/jit:jit -- LiteInterpreterTest.MethodInvocation
buck test mode/dbg caffe2/test:mobile -- test_method_calls_with_optional_arg
Reviewed By: iseeyuan
Differential Revision: D25896212
fbshipit-source-id: 6d7e7fd5f3244a88bd44889024d81ad2e678ffa5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48863
Support default arguments when invoking a module via PyTorch Lite (`mobile::Module`).
Test Plan:
buck test mode/dbg //caffe2/test/cpp/jit:jit -- LiteInterpreterTest.MethodInvocation
buck test mode/dbg caffe2/test:mobile -- test_method_calls_with_optional_arg
Reviewed By: raziel, iseeyuan
Differential Revision: D25152559
fbshipit-source-id: bbf52f1fbdbfbc6f8fa8b65ab524b1cd4648f9c0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48607
This change builds on top of
https://github.com/pytorch/pytorch/pull/46865
further exposing the async interface to `torch::jit::Method`.
Added a unit test for the new `run_async`.
Test Plan: `buck test caffe2/test/cpp/jit/...`
Reviewed By: dzhulgakov
Differential Revision: D25219726
fbshipit-source-id: 89743c82a0baa1affe0254c1e2dbf873de8e5c76
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45858
When cloning a module that has `__setstate__` and `__getstate__` methods,
we need to load these methods to initialize the module.
Test Plan: Imported from OSS
Reviewed By: suo
Differential Revision: D24116524
Pulled By: bzinodev
fbshipit-source-id: a5111638e2dc903781f6468838c000850d1f9a74
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42121
This PR changes the Module API to allow registering a module with a module
interface type, which in turn allows `Module::clone` to work in the case
where a module interface type is shared by two submodules.
The interface type will be shared by the newly cloned instance in the same
compilation unit because it only
contains a list of `FunctionSchema`s, which, unlike a `ClassType`, does not involve any
attributes.
fixes https://github.com/pytorch/pytorch/issues/41882
Test Plan: Imported from OSS
Reviewed By: suo
Differential Revision: D22781205
Pulled By: wanchaol
fbshipit-source-id: f97f4b75970f0b434e38b5a1f778eda2c4e5109b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37548
Moving RecordFunction from `torch::autograd::profiler` into the `at` namespace.
Test Plan:
CI
Imported from OSS
Differential Revision: D21315852
fbshipit-source-id: 4a4dbabf116c162f9aef0da8606590ec3f3847aa
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37472
Our convention is for `findX` to return an optional version and `getX`
to assert that the X is there. Fix up `getMethod` to be consistent with
this convention.
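The convention can be sketched as a pair of accessors where `find` reports absence through an optional and `get` treats absence as an error. This is an illustration only: the `Registry` class and its members are hypothetical, and throwing `std::out_of_range` stands in for whatever assertion mechanism the real `getMethod` uses.

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical container used to illustrate the findX / getX convention.
class Registry {
 public:
  void add(const std::string& name, int value) { entries_[name] = value; }

  // findX: absence is an expected outcome, reported as std::nullopt.
  std::optional<int> find_entry(const std::string& name) const {
    auto it = entries_.find(name);
    if (it == entries_.end()) {
      return std::nullopt;
    }
    return it->second;
  }

  // getX: the caller asserts that X exists; absence is an error.
  int get_entry(const std::string& name) const {
    if (auto found = find_entry(name)) {
      return *found;
    }
    throw std::out_of_range("entry '" + name + "' is not defined");
  }

 private:
  std::unordered_map<std::string, int> entries_;
};
```

Implementing `get` on top of `find` keeps the lookup logic in one place, so the two accessors cannot drift apart.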
Test Plan: Imported from OSS
Differential Revision: D21297543
Pulled By: suo
fbshipit-source-id: b40f56231cc8183e61bbb01fe5c0c113bcb6464d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32684
Previously we had `clone` and `clone_instance`, where `clone` clones both the type
and the value and `clone_instance` clones only the value; both are shallow copies.
We need to re-evaluate whether we should expose them as a user-facing API.
I think we should hide `clone`, but `clone_instance` might be useful as well, especially
when copying a model with very large weights, where people might just want a shallow copy.
This PR adds a `deepcopy` that might be useful as a user API. It deep copies the values, including
Tensors, but does not deep copy `Blob`, `Capsule`, `Future` or `PyObject`.
For more discussions please see the following issue.
fixes: https://github.com/pytorch/pytorch/issues/32519
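The difference between the shallow and deep variants can be sketched with shared storage. This is a simplified illustration, not the `Module` implementation: the `Holder` type is hypothetical and a `std::vector<float>` stands in for tensor data.

```cpp
#include <memory>
#include <vector>

// Hypothetical object whose "weights" live in shared storage.
struct Holder {
  std::shared_ptr<std::vector<float>> weights;

  // Shallow copy: the new object points at the same storage, so a write
  // through either object is visible through the other.
  Holder clone_instance() const { return Holder{weights}; }

  // Deep copy: the storage itself is duplicated, so the copies are independent.
  Holder deepcopy() const {
    return Holder{std::make_shared<std::vector<float>>(*weights)};
  }
};
```

The shallow variant is cheap for very large weights, while the deep variant is what a caller needs before mutating the copy.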
Test Plan: Imported from OSS
Differential Revision: D21220756
fbshipit-source-id: 476bf11fe82c08fac36e7457879a09f545ffdc5e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34710
Extending RecordFunction API to support new recording scopes (such as TorchScript functions), as well as giving more flexibility to set sampling rate.
Test Plan: unit test (test_misc.cpp/testRecordFunction)
Reviewed By: gdankel, dzhulgakov
Differential Revision: D20158523
fbshipit-source-id: a9e0819d21cc06f4952d92d43246587c36137582
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35115
This commit runs the newly added tools/clang_format.py on the JIT
codebase and includes all of the formatting changes thus produced.
Testing:
Ran the script, CI.
Test Plan: Imported from OSS
Reviewed By: eellison
Differential Revision: D20568523
Pulled By: SplitInfinity
fbshipit-source-id: e09bdb982ccf090eecfb7c7b461b8d0681eef82b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34515
Once upon a time we thought this was necessary. In reality it is not, so
removing it.
For backcompat, our public interface (defined in `api/`) still has
typedefs to the old `script::` names.
There was only one collision: `Pass` as a `Stmt` and `Pass` as a graph
transform. I renamed one of them.
Test Plan: Imported from OSS
Differential Revision: D20353503
Pulled By: suo
fbshipit-source-id: 48bb911ce75120a8c9e0c6fb65262ef775dfba93