Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70248
Modified loops in files under fbsource/fbcode/caffe2/ from the format
```
for(TYPE var=x0;var<x_max;var++)
```
to the format
```
for(const auto var: irange(x_max))
```
This was achieved by running r-barnes's loop upgrader script (D28874212), modified to exclude all files under /torch/jit, followed by a number of hand-applied reversions and unused-variable warning suppressions.
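For illustration, a minimal sketch of the two loop styles. `demo::irange` is a hypothetical, simplified stand-in written for this example (it only covers ranges starting at 0); the real helper lives in c10 and is not reproduced here.

```cpp
#include <cstddef>
#include <vector>

namespace demo {
// Hypothetical, minimal stand-in for an irange helper: iterates over [0, end).
class IntRange {
 public:
  class Iterator {
   public:
    explicit Iterator(std::size_t v) : v_(v) {}
    std::size_t operator*() const { return v_; }
    Iterator& operator++() { ++v_; return *this; }
    bool operator!=(const Iterator& other) const { return v_ != other.v_; }
   private:
    std::size_t v_;
  };
  explicit IntRange(std::size_t end) : end_(end) {}
  Iterator begin() const { return Iterator(0); }
  Iterator end() const { return Iterator(end_); }
 private:
  std::size_t end_;
};
inline IntRange irange(std::size_t end) { return IntRange(end); }
}  // namespace demo

// Old style: explicit index variable, bound check and increment.
inline long sum_indexed(const std::vector<int>& v) {
  long total = 0;
  for (std::size_t i = 0; i < v.size(); i++) {
    total += v[i];
  }
  return total;
}

// New style: the index is const and its type is deduced from the range.
inline long sum_irange(const std::vector<int>& v) {
  long total = 0;
  for (const auto i : demo::irange(v.size())) {
    total += v[i];
  }
  return total;
}
```

Both functions visit the same indices; the range form leaves no room to increment the wrong variable or to modify the index inside the body.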
Test Plan: Sandcastle
Reviewed By: malfet
Differential Revision: D32813863
fbshipit-source-id: 527244b4a2b220fdfe7f17dee3599603f492a2ca
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66743
Modified loops in files under fbsource/fbcode/caffe2/ from the format
`for(TYPE var=x0;var<x_max;var++)`
to the format
`for(const auto var: irange(x_max))`
This was achieved by running r-barnes's loop upgrader script (D28874212), modified to exclude all files under /torch/jit, followed by a number of hand-applied reversions and unused-variable warning suppressions.
Test Plan: Sandcastle
Reviewed By: malfet
Differential Revision: D31705359
fbshipit-source-id: c9ea2fbc0f9cd29e97a52dcb203addc5f2abb09b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66234
Modified loops in files under fbsource/fbcode/caffe2/ from the format
`for(TYPE var=x0;var<x_max;var++)`
to the format
`for(const auto var: irange(x_max))`
This was achieved by running r-barnes's loop upgrader script (D28874212), modified to exclude all files under /torch/jit, followed by a number of hand-applied reversions and unused-variable warning suppressions.
bypass_size_limit
allow-large-files
Test Plan: Sandcastle
Reviewed By: ngimel
Differential Revision: D30652629
fbshipit-source-id: 0ae6c4bbbb554bad42e372792a6430e1acf15e3e
Summary:
Replace `for` loops with `irange` loops. Also fix some unused-variable warnings in range-loop cases.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62928
Reviewed By: driazati
Differential Revision: D30171904
Pulled By: malfet
fbshipit-source-id: 1b437a0f7e3515f4a2e324f3450e93312f1933ae
Summary:
The GoogleTest `TEST` macro, like `DEFINE_DISPATCH`, is non-compliant with the `cppcoreguidelines-avoid-non-const-global-variables` check.
All changes except the ones to `.clang-tidy` were generated using the following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`; do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008
Reviewed By: driazati, r-barnes
Differential Revision: D29838584
Pulled By: malfet
fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
Summary:
AFAICT, this include was a typo, and meant to be the corresponding
header for this .cpp, but instead pulled in an unrelated header.
Test Plan: CI
Reviewed By: igorsugak
Differential Revision: D29422993
fbshipit-source-id: cc9bb29ee1f1007b68c6666ea8e389f6f39928af
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os


def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files


def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname, "-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit", "--all", "-m", f"NOLINT stubs for {fname}"])


def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)


if __name__ == "__main__":
    main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
Since caffe2 and torch have been consolidated, CAFFE2_API should be merged with TORCH_API. Addresses a TODO.
Manually edited some references of the removed `CAFFE2_API`:
* `CONTRIBUTING.md`
* `caffe2/proto/CMakeLists.txt`
* `cmake/ProtoBuf.cmake`
* `c10/macros/Export.h`
* `torch/csrc/WindowsTorchApiMacro.h`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49496
Reviewed By: malfet, samestep
Differential Revision: D25600726
Pulled By: janeyx99
fbshipit-source-id: 7e068d959e397ac183c097d7e9a9afeca5ddd782
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43987
This replaces the caffe2 CPU random number (std::mt19937) with at::mt19937 which is the one currently used in pytorch. The ATen RNG is 10x faster than the std one and appears to be more robust given bugs in the std (https://fburl.com/diffusion/uhro7lqb)
For large embedding tables (10GB+) we see UniformFillOp taking upwards of 10 minutes as we're bottlenecked on the single threaded RNG. Swapping to at::mt19937 cuts that time to 10% of the current.
Test Plan: Ran all relevant tests + CI. This doesn't introduce new features (+ is a core change) so existing tests+CI should be sufficient to catch regressions.
Reviewed By: dzhulgakov
Differential Revision: D23219710
fbshipit-source-id: bd16ed6415b2933e047bcb283a013d47fb395814
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45986
Recurrent networks have subnets that are not well supported by `RemoveOpsByType`. Here we exclude recurrent networks by adding the same check as in memonger.
Test Plan:
```
buck test //caffe2/caffe2/fb/predictor:black_box_predictor_test
```
AdIndexer canary for sanity check:
https://www.internalfb.com/intern/ads/canary/430059485214766620
Differential Revision: D24167284
fbshipit-source-id: fa90d1c1f34af334a599d879af09d4c0bf7c27bd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42286
One more bug to fix. Operators such as If and AsyncIf need special treatment not just in `onnx::SsaRewrite`, but also in `RemoveOpsByType`. The solution needs two steps:
1) add external inputs/outputs of the subnets of If/AsyncIf op to the inputs/outputs of the op
2) if the inputs/outputs of the If/AsyncIf op need to be renamed as a result, the same inputs/outputs of the subnets need to be renamed as well.
I also added unit tests to cover this corner case.
Test Plan:
```
buck test //caffe2/caffe2/fb/predictor:black_box_predictor_test
mkdir /tmp/models
rm -rf /tmp/$USER/snntest
rm -rf /tmp/snntest
buck run mode/opt admarket/lib/ranking/prediction_replayer/snntest_replayer_test/tools:snntest_replay_test -- --serving_paradigm=USER_AD_PRECOMPUTATION_DSNN
```
Differential Revision: D22834028
fbshipit-source-id: c070707316cac694f452a96e5c80255abf4014bc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41606
The previous diff (D22220798 (59294fbbb9) and D22220797) was recently reverted (D22492356 (28291d3cf8), D22492355) because of a bug associated with the op AsyncIf. The AsyncIf op has net_defs as args and the SSA rewriting didn't take that into account. It has a special path for the op If, but not for AsyncIf. Several changes I made to fix the bug:
1) Add op AsyncIf to the special path for If op in SSA rewriting
2) clear inputs/outputs of the netdefs that are args in If/AsyncIf ops because they're no longer valid
3) revert renamed inputs/outputs in the arg netdefs that are in the external_outputs in the parent netdef
2) and 3) are existing bugs in the `SsaRewrite` function that were just never exposed before.
The algorithm for `RemoveOpsByType` is the same as in my previous diff D22220798 (59294fbbb9). The only new changes in this diff are in `onnx::SsaRewrite` and a few newly added unit tests.
(Note: this ignores all push blocking failures!)
Reviewed By: yinghai
Differential Revision: D22588652
fbshipit-source-id: ebb68ecd1662ea2bae14d4be8f61a75cd8b7e3e6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40649
The original implementation of RemoveOpsByType is pretty buggy and does not remove all instances of the ops that should be removed. It's also quite complicated and hard to modify. I reimplemented it by first converting the graph to its SSA form. The algorithm is quite simple once the graph is in SSA form. It's very similar to constant propagation with a few modifications. The hardest part is to deal with the case of removing an op with the output being an output of the predict net, because that output has to be preserved.
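A toy sketch of the idea, not the actual `RemoveOpsByType` code. `ToyOp`, `Resolve` and `RemoveOpsOfType` are hypothetical names, and the sketch assumes the net is already in SSA form and that removable ops have exactly one input and one output. A removed op's output is normally replaced by its input in all consumers; when that output is an external output, the producer is renamed to write it instead.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy operator: a type name plus input/output blob names.
struct ToyOp {
  std::string type;
  std::vector<std::string> inputs;
  std::vector<std::string> outputs;
};

// Follows the rename chain until it reaches a name that is not renamed.
static std::string Resolve(const std::map<std::string, std::string>& rename,
                           std::string name) {
  for (auto it = rename.find(name); it != rename.end(); it = rename.find(name)) {
    name = it->second;
  }
  return name;
}

// Assumes ssa_ops is in SSA form (each blob is written exactly once).
std::vector<ToyOp> RemoveOpsOfType(const std::vector<ToyOp>& ssa_ops,
                                   const std::set<std::string>& external_outputs,
                                   const std::string& type_to_remove) {
  std::map<std::string, std::string> rename;  // old blob name -> replacement
  std::vector<ToyOp> kept;
  for (ToyOp op : ssa_ops) {
    // Propagate earlier removals into this op's inputs.
    for (std::string& in : op.inputs) in = Resolve(rename, in);
    const bool candidate = op.type == type_to_remove &&
        op.inputs.size() == 1 && op.outputs.size() == 1;
    if (!candidate) {
      kept.push_back(op);
      continue;
    }
    const std::string in = op.inputs[0];
    const std::string out = op.outputs[0];
    if (external_outputs.count(out) == 0) {
      rename[out] = in;  // later consumers read the removed op's input directly
      continue;
    }
    // The output must be preserved: make the producer of `in` write `out`.
    bool has_producer = false;
    for (const ToyOp& k : kept) {
      if (std::find(k.outputs.begin(), k.outputs.end(), in) != k.outputs.end()) {
        has_producer = true;
      }
    }
    if (!has_producer || external_outputs.count(in) != 0) {
      kept.push_back(op);  // cannot rename safely, so keep the op
      continue;
    }
    for (ToyOp& k : kept) {
      std::replace(k.inputs.begin(), k.inputs.end(), in, out);
      std::replace(k.outputs.begin(), k.outputs.end(), in, out);
    }
    rename[in] = out;
  }
  return kept;
}
```

For a chain `FC -> StopGradient -> Relu -> StopGradient` whose final blob is an external output, removing `StopGradient` leaves `FC` and `Relu`, with `Relu` writing the external output directly.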
(Note: this ignores all push blocking failures!)
Reviewed By: yinghai, dzhulgakov
Differential Revision: D22220798
fbshipit-source-id: faf6ed5242f1e2f310125d964738c608c6c55c94
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30915
Since we now have C++14, we don't need these c10::guts helpers anymore
ghstack-source-id: 95777609
Test Plan: waitforsandcastle
Differential Revision: D18869639
fbshipit-source-id: 97716f932297c64c6e814410ac47b444c33d4e2e
Summary:
Overall context: open-source BlackBoxPredictor as the entry
point for inference in Caffe2 (thread safe abstraction for Caffe2
inference). This should be used in ThroughputBenchmark for the purpose
of framework comparison.
This specific diff:
There should be no harm in moving transformation code to
OSS. On the advantages side, we will be able to compare the production
Caffe2 setup with PyTorch in the fairest way via
ThroughputBenchmark. This approach avoids any complicated
transformation registries. Building those properly would be a significant
engineering effort as well as a production risk. In the past we had SEVs
related to transforms being turned off due to various refactors. Given
that we don't plan to make any other significant investments in
transformation logic except the existing ones (like TVM and Glow), and
those also relate to open-source technologies, I came to the
conclusion that the whole thing should move to OSS.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23350
ghstack-source-id: 87121538
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24928
Test Plan: waitforsandcastle
Differential Revision: D16445133
Pulled By: salexspb
fbshipit-source-id: a93106489611dfe427b0f144717bc720d04e47f3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23257
Overall context: open-source BlackBoxPredictor as the entry
point for inference in Caffe2 (thread safe abstraction for Caffe2
inference). This should be used in ThroughputBenchmark for the purpose
of framework comparison.
This specific diff:
There should be no harm in moving transformation code to
OSS. On the advantages side, we will be able to compare the production
Caffe2 setup with PyTorch in the fairest way via
ThroughputBenchmark. This approach avoids any complicated
transformation registries. Building those properly would be a significant
engineering effort as well as a production risk. In the past we had SEVs
related to transforms being turned off due to various refactors. Given
that we don't plan to make any other significant investments in
transformation logic except the existing ones (like TVM and Glow), and
those also relate to open-source technologies, I came to the
conclusion that the whole thing should move to OSS.
Reviewed By: zrphercule
Differential Revision: D16428124
fbshipit-source-id: b35deada5c015cd97b91ae12a7ea4aac53bd14b8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17619
`--filler hive --iter -1` lets the debugger exhaust all batches from a hive partition before exiting.
Also add a README that summarizes the command-line options and usage.
Reviewed By: yinghai
Differential Revision: D14220166
fbshipit-source-id: daa23b7e8a9184481c6d7b67acf1599e5c99d74a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17481
Usually, feature macros are either defined or undefined and checked accordingly.
C10_MOBILE was a weird special case that was always defined but either defined to 1 or to 0.
This caused a lot of confusion for me when trying to disable something from mobile build and it also disabled it
from the server build (because I was using ifdef). Also, I found a place in the existing code base that made
that wrong assumption and used the macro wrongly, see https://fburl.com/y4icohts
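A small sketch of the pitfall. `DEMO_ALWAYS_DEFINED_MOBILE` is a hypothetical macro that mimics the always-defined 0/1 convention described above; it is not the real `C10_MOBILE`.

```cpp
// Hypothetical macro following the always-defined convention: value is 0 or 1.
#define DEMO_ALWAYS_DEFINED_MOBILE 0

// Wrong for a 0/1 macro: #ifdef only asks whether the macro exists,
// so this branch is taken even though the value is 0.
inline bool mobile_by_ifdef() {
#ifdef DEMO_ALWAYS_DEFINED_MOBILE
  return true;
#else
  return false;
#endif
}

// Correct for a 0/1 macro: #if evaluates the value.
inline bool mobile_by_if() {
#if DEMO_ALWAYS_DEFINED_MOBILE
  return true;
#else
  return false;
#endif
}
```

Here `mobile_by_ifdef()` returns `true` and `mobile_by_if()` returns `false`. With a macro that is either defined or left undefined, `#ifdef` gives the expected answer and the two styles cannot silently disagree.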
Reviewed By: dzhulgakov
Differential Revision: D14214825
fbshipit-source-id: f3a155b6d43d334e8839e2b2e3c40ed2c773eab6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16597
This diff fixes some bugs in shape inference for `SparseLengthsSumFused8BitRowwise` and adds input shape inference for `Concat` when `add_axis=1`.
Reviewed By: bertmaher
Differential Revision: D13892452
fbshipit-source-id: 6cd95697a6fabe6d78a5ce3cb749a3a1e51c68e7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16620
LogfiledbNetLoader loads all external input blobs into a workspace instance. We pack a shared pointer to this loaded workspace into the SingleLoadedNetSupplier.
SingleLoadedNetSupplier will pass this workspace to BlackBoxPredictor to be executed. (D13891759 is a WIP of how it all comes together)
Reviewed By: pjh5
Differential Revision: D13901467
fbshipit-source-id: 20589f898922f5f1aec50be131dad17a8c38e9b2
Summary:
Based on offline discussion it should be less surprising to the users of existing code. Thus caffe2::Tensor is now a move-only class (as it used to be), explicit calls to UnsafeSharedInstance() are necessary to get shared_ptr behavior.
This change also identified a few places that misused the copy constructor - those are fixed
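A minimal sketch of the move-only pattern with an explicit sharing call. `ToyTensor` is a hypothetical toy class, not the real `caffe2::Tensor`; only the name `UnsafeSharedInstance` is taken from the summary above, and its behavior here is an assumption.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical toy handle around shared storage.
class ToyTensor {
 public:
  explicit ToyTensor(std::vector<float> data)
      : impl_(std::make_shared<std::vector<float>>(std::move(data))) {}

  // Copying is disabled so that storage is never shared by accident.
  ToyTensor(const ToyTensor&) = delete;
  ToyTensor& operator=(const ToyTensor&) = delete;
  ToyTensor(ToyTensor&&) = default;
  ToyTensor& operator=(ToyTensor&&) = default;

  // Sharing is opt-in and visible at the call site.
  ToyTensor UnsafeSharedInstance() const { return ToyTensor(impl_); }

  float& at(std::size_t i) { return (*impl_)[i]; }
  bool defined() const { return impl_ != nullptr; }

 private:
  explicit ToyTensor(std::shared_ptr<std::vector<float>> impl)
      : impl_(std::move(impl)) {}
  std::shared_ptr<std::vector<float>> impl_;
};

static_assert(!std::is_copy_constructible<ToyTensor>::value,
              "ToyTensor must be move-only");
```

A write through the result of `UnsafeSharedInstance()` is visible through the original handle, while `ToyTensor b = a;` no longer compiles, which is how misuse of the copy constructor shows up at build time.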
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15416
Reviewed By: Yangqing
Differential Revision: D13524598
fbshipit-source-id: aea12d6dff77342606fa88ce4ddddbff266245a7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15418
Previously we used Resize + ShareData.
Instead, we'll create a function on Tensor that clones itself with the same storage.
Suppose we want `t` to share data with `t0`. Previously:
```
Tensor t(dims, CPU);
t.Resize(t0.sizes());
t.ShareData(t0);
```
Now:
```
Tensor t = t0.Alias();
```
Reviewed By: dzhulgakov
Differential Revision: D13507609
fbshipit-source-id: 6e4275d02f4c3356cbce91127f1b01111dc86b9f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15027
- Make DataRandomFiller able to accept input_dims and input_types for only non-intermediate inputs. Add a helper to fill input directly into a workspace.
Reviewed By: highker
Differential Revision: D13408345
fbshipit-source-id: 5fc54d33da12e3f0a200e79380d4c695b0339b17
Summary:
Hi guys,
I'd like to build Caffe2 with more supported options on Windows with Microsoft Visual Studio.
This is the first pull request.
Running scripts/build_windows_shared.bat is able to build Caffe2 with both CMAKE_BUILD_TYPE=Debug and CMAKE_BUILD_TYPE=Release with Visual Studio 14 2015.
CUDA is 9.0, cudnn is 7.0.5, glog, gflags and lmdb are supported on my system.
Python is 3.5, Detectron works from python interface as well.
It was even possible to debug detectron code and step into caffe2_gpu.dll with pdbs built.
Disappointingly, the c10/experimental ops don't build with this Visual Studio generator, so I added a special option INCLUDE_EXPERIMENTAL_C10_OPS (default ON) to deal with it in build_windows_shared.bat.
After this pull request the next step is to add Visual Studio 2017 support in the script.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13550
Reviewed By: ezyang
Differential Revision: D13042597
Pulled By: orionr
fbshipit-source-id: f313f909f599cd582a1d000eff766eef3a9fc4fc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13949
This diff adds support to fillers for `SparseLengthsWeight*` ops. It does 3 things:
1. Add the fillers for `SparseLengthsWeight*` ops
2. Add filling heuristics to consider the path of `LengthsRangeFill` -> `Gather` -> `SparseLengthsWeightedSum`, where the length input is shared by `LengthsRangeFill` and `SparseLengthsWeightedSum`. Therefore, we need to carefully bound the value of that length input so that at `Gather`, it does not index out-of-bound for the weight input of `Gather`.
3. Fix and simplify the logic of `math::RandFixedSum`, where we just keep rejecting the generated value if it violates the invariants.
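A sketch of the rejection idea from item 3, under assumptions: `RandFixedSumSketch` is a hypothetical integer-only function, and its signature is not the real `math::RandFixedSum`. It draws `n` values in `[a, b]` that add up to `sum`, redrawing a value whenever the remaining values could no longer reach the target.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical sketch: n integers in [a, b] whose total is exactly `sum`.
std::vector<int> RandFixedSumSketch(int n, int a, int b, int sum,
                                    std::mt19937& rng) {
  assert(n > 0 && a <= b);
  // The target must be reachable at all.
  assert(static_cast<long>(n) * a <= sum && sum <= static_cast<long>(n) * b);
  std::uniform_int_distribution<int> dist(a, b);
  std::vector<int> out(n);
  long remaining = sum;
  for (int i = 0; i < n; ++i) {
    const long left = n - 1 - i;  // values still to draw after this one
    int v;
    do {
      v = dist(rng);
      // Reject v if the values still to come could not make up the rest.
    } while (remaining - v < left * a || remaining - v > left * b);
    out[i] = v;
    remaining -= v;
  }
  return out;
}
```

The loop invariant is that `remaining` always lies between `(n - i) * a` and `(n - i) * b`, so at least one acceptable value exists at every step and the last value is forced to equal the remainder.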
Reviewed By: highker
Differential Revision: D13048216
fbshipit-source-id: bfe402e07e6421b28548047d18b298c148e0ec87
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13820
We would like to provide an option to show additional info of the net to be benchmarked.
Reviewed By: highker, rdzhabarov
Differential Revision: D13018219
fbshipit-source-id: d3ec69901bdae58117a482ddd2c327b0f8cf7cb6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13522
Currently Tensor is a shared pointer to the underlying implementation rather than a value, so copying
the pointer shares the underlying TensorImpl; ShareData probably doesn't make sense anymore.
Reviewed By: dzhulgakov
Differential Revision: D12871708
fbshipit-source-id: d3773c66b7ed0bf1c37e886f69f59aec158b216b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12905
This diff does some clean up of the multithread benchmark code:
1. Split implementation to `.cc` file to separate implementation and improve build
2. Make `MutatingNetSupplier` more generic by providing the mutating function as an argument instead of virtual method.
3. Fix AI benchmark by sticking to the original option names
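A sketch of the design change in item 2, with hypothetical names throughout (`ToyNet` and `ToyMutatingSupplier` are not the real classes): the mutation is passed in as a callable instead of being supplied by overriding a virtual method.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a net definition.
struct ToyNet {
  std::string name;
  int num_ops;
};

// Hypothetical supplier: hands out a mutated copy of a base net.
class ToyMutatingSupplier {
 public:
  using Mutator = std::function<void(ToyNet*)>;

  ToyMutatingSupplier(ToyNet base, Mutator mutate)
      : base_(std::move(base)), mutate_(std::move(mutate)) {}

  // Each call copies the base net and applies the caller-provided mutation.
  ToyNet get() const {
    ToyNet net = base_;
    mutate_(&net);
    return net;
  }

 private:
  ToyNet base_;
  Mutator mutate_;
};
```

With this shape, a new mutation is a lambda at the call site rather than a new subclass.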
Reviewed By: highker
Differential Revision: D10479238
fbshipit-source-id: afa201fc287e3fdbb232db24513ecf8024501f66
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11135
This diff does not have any logic change; it simply moves files/functions/classes around.
Open source (almost all) necessary dependencies for the multithreaded predictor bench.
The benchmark itself can be open sourced once the predictor is open sourced.
Reviewed By: salexspb
Differential Revision: D9602006
fbshipit-source-id: 386c9483e2c64c8b7d36e4600189c4e0b7e159ff
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12043
Re-trying D9979976, this time with all call sites fixed.
D9979976 got reverted because there was a call site that wasn't covered by sandcastle it seems.
I fixed it and used 'grep' to ensure there aren't any more call sites in fbsource.
Reviewed By: ezyang
Differential Revision: D10026392
fbshipit-source-id: cd341514a8e53a40147ea0ee3e52f63bb6444157