Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65861
First in a series. This PR changes the code in deploy.h/cpp and
interpreter_impl.h/cpp from snake_case to camelCase. Starting
with these files as they have the most impact on downstream users.
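To illustrate the convention change, here is a minimal sketch; the member names below are hypothetical stand-ins, not the actual deploy API surface:
```cpp
// Hypothetical stand-ins for the renamed deploy.h/interpreter_impl.h members.
struct InterpreterExample {
  // Before this PR (snake_case):
  //   void load_pickle();
  //   void acquire_one();
  // After this PR (camelCase):
  void loadPickle() {}
  void acquireOne() {}
};
```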
Test Plan: Imported from OSS
Reviewed By: shannonzhu
Differential Revision: D31291183
Pulled By: suo
fbshipit-source-id: ba6f74042947c9a08fb9cb3ad7276d8dbb5b2934
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63918
Previously we were building with `USE_DISTRIBUTED` off, because c10d was built as a separate library for historical reasons. Since then, lw has merged the c10d build into libtorch, so this is fairly easy to turn on.
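For context, a minimal sketch of how such a compile-time gate behaves, assuming `USE_DISTRIBUTED` is passed as a preprocessor define; the program below is hypothetical, while the real macro guards the c10d bindings in torch/csrc:
```cpp
#include <iostream>

// Hypothetical probe of the USE_DISTRIBUTED gate; compile with
// -DUSE_DISTRIBUTED to take the first branch.
int main() {
#ifdef USE_DISTRIBUTED
  std::cout << "c10d-backed distributed support compiled in\n";
#else
  std::cout << "built without distributed support\n";
#endif
}
```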
Differential Revision: D30492442
Test Plan: added a unit test
Reviewed By: wconstab
Pulled By: suo
fbshipit-source-id: 843b8fcf349a72a7f6fcbd1fcc8961268690fb8c
Summary:
The `cppcoreguidelines-avoid-non-const-global-variables` check is disabled in `.clang-tidy`, as the GoogleTest `TEST` macro is non-compliant with it, as is `DEFINE_DISPATCH`.
All changes but the ones to `.clang-tidy` are generated using the following script:
```
# Delete the per-line suppressions from every file that contains one;
# the check itself is disabled globally in .clang-tidy instead.
for i in $(find . -type f \( -iname "*.c*" -or -iname "*.h" \) \
           | xargs grep -l cppcoreguidelines-avoid-non-const-global-variables); do
  sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" "$i"
done
```
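For illustration, this is the kind of edit the script performs in each file; the global below is a hypothetical stand-in for what `TEST` or `DEFINE_DISPATCH` expands to:
```cpp
// Before the script runs, each such global carried a per-line suppression:
//   // NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)
//   int g_dispatch_table = 0;
// After: the comment is gone; the check is instead disabled in .clang-tidy.
int g_dispatch_table = 0;
```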
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008
Reviewed By: driazati, r-barnes
Differential Revision: D29838584
Pulled By: malfet
fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
Summary:
This PR suppresses clang-tidy warnings in the codebase (for now) so that we can re-enable clang-tidy checks on master.
I ran this script to add the `NOLINTNEXTLINE` comments (on a devserver):
```bash
python3 setup.py develop
# Uses the same script that's run on CI, adding the -j (parallel), -s (add comments), and -k (continue if diagnostic errors are found) options
python3 tools/clang_tidy.py \
-j \
-s \
-k \
-v \
--paths torch/csrc/ \
-g"-torch/csrc/jit/passes/onnx/helper.cpp" \
-g"-torch/csrc/jit/passes/onnx/shape_type_inference.cpp" \
-g"-torch/csrc/jit/serialization/onnx.cpp" \
-g"-torch/csrc/jit/serialization/export.cpp" \
-g"-torch/csrc/jit/serialization/import.cpp" \
-g"-torch/csrc/jit/serialization/import_legacy.cpp" \
-g"-torch/csrc/onnx/init.cpp" \
-g"-torch/csrc/cuda/nccl.*" \
-g"-torch/csrc/cuda/python_nccl.cpp" \
-g"-torch/csrc/autograd/FunctionsManual.cpp" \
-g"-torch/csrc/generic/*.cpp" \
-g"-torch/csrc/jit/codegen/cuda/runtime/*" \
-g"-torch/csrc/deploy/interpreter/interpreter.cpp" \
-g"-torch/csrc/deploy/interpreter/interpreter.h" \
-g"-torch/csrc/deploy/interpreter/interpreter_impl.h" \
-g"-torch/csrc/deploy/interpreter/test_main.cpp"
```
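For illustration, the `-s` option writes a targeted suppression comment on the line above each diagnostic rather than fixing it; the flagged global and check name below are hypothetical examples:
```cpp
// What a clang_tidy.py -s insertion looks like at a flagged site.
// NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)
int g_suppressed_global = 0;
```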
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60649
Test Plan: Verified changes by re-running the script (without the `-s` option) and seeing no warnings/errors.
Reviewed By: walterddr, janeyx99
Differential Revision: D29504258
Pulled By: 1ntEgr8
fbshipit-source-id: 78310b30ee8213b73ddb4771ad874665323e7a4e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59460
Original commit changeset: 6e01a96d3746
Test Plan: Verify new tests run in sandcastle and existing CI is OK
Reviewed By: H-Huang
Differential Revision: D28900869
fbshipit-source-id: a8962ec48c66bba3b4b8f001ece7231953b29e82
Summary:
Added GPU tests in previous diffs but had to disable them, as they only
passed locally on devgpu, not in sandcastle.
Note: local testing requires mode/dev-nosan, or else ASAN interferes with CUDA.
Test Plan: Verify tests passing in sandcastle.
Reviewed By: malfet
Differential Revision: D28538996
fbshipit-source-id: 1a6ccea07cfe2f150eee068594e636add620cd91
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58493
In fbcode, we want torch::deploy to be a target that works with or without cuda, depending only on whether cuda is linked in the final binary. To enable this, we build both flavors of libinterpreter, and choose which to load at runtime depending on whether cuda is available in the application. This comes at a cost to binary size, as it includes two copies of libinterpreter instead of one. However, it does not require _loading_ two copies of libinterpreter into memory at runtime, so the memory footprint of the interpreter (which we make N copies of) is not impacted.
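A minimal sketch of that runtime selection, assuming hypothetical library names and a caller-supplied CUDA probe; the real loader in torch::deploy differs in its details:
```cpp
#include <dlfcn.h>

#include <stdexcept>
#include <string>

// Map exactly one flavor of libinterpreter into the process. The binary
// ships both copies, but only the chosen one is loaded, so the
// per-interpreter memory footprint is unchanged.
void* loadInterpreterLibrary(bool cudaAvailable) {
  const char* lib = cudaAvailable
      ? "libinterpreter_cuda.so"  // hypothetical name
      : "libinterpreter_cpu.so";  // hypothetical name
  void* handle = dlopen(lib, RTLD_LAZY | RTLD_LOCAL);
  if (handle == nullptr) {
    throw std::runtime_error(std::string("failed to load ") + dlerror());
  }
  return handle;
}
```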
In OSS/CMake, this change is a no-op: CUDA is already handled there by building a single libinterpreter, with CUDA enabled or not for the whole PyTorch build via a global CMake flag.
Test Plan: test in fbcode with the new GPU-mode unit tests; verify existing OSS CI passes
Reviewed By: suo
Differential Revision: D28512178
fbshipit-source-id: 61354bf78b1932605a841388fcbc4bafc0c4bbb4