Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os


def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files


def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname, "-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit", "--all", "-m", f"NOLINT stubs for {fname}"])


def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)


if __name__ == "__main__":
    main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48021
Extends the operator schema check used by the simple memonger to the dag memonger as well. As part of this, a fix is made to handle inplace ops (ops with at least one output name identical to an input blob name). Previously, all output blobs of an op were treated as shareable, which tripped the assertion that external input blobs with the same name are not allowed to be shared.
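As a rough illustration of the inplace handling described above (not the actual memonger code; plain dicts stand in for Caffe2 OperatorDef protos and the helper name is made up):
```
def collect_shareable_blobs(ops, external_inputs):
    """Return output blobs that are safe to share.

    `ops` is a list of dicts with 'input'/'output' blob-name lists. Outputs
    that alias an input (inplace ops) or collide with an external input are
    excluded, which is what the assertion mentioned above guards against.
    """
    produced, inplace = set(), set()
    for op in ops:
        inputs = set(op["input"])
        for out in op["output"]:
            produced.add(out)
            if out in inputs:          # inplace: output reuses an input name
                inplace.add(out)
    return produced - inplace - set(external_inputs)


# Example: 'x' is written inplace by the second op, so only 'y' is shareable.
ops = [
    {"input": ["data"], "output": ["x"]},
    {"input": ["x"], "output": ["x"]},
    {"input": ["x"], "output": ["y"]},
]
print(collect_shareable_blobs(ops, {"data"}))  # -> {'y'}
```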
Test Plan: Added corresponding unit tests
Reviewed By: hlu1
Differential Revision: D24968862
fbshipit-source-id: b6679a388a82b0d68f65ade64b85560354aaa3ef
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47718
Distributed Inference splits a predict net into multiple parts, with part0 being the main part that contains the ops making remote calls to the other parts. The part0 predict net may contain AsyncIf ops to optimize RPC call usage, and those AsyncIf ops carry internal nets that may refer to memongered blobs. This change handles AsyncIf ops by updating their internal nets to refer to the memongered blobs.
As part of this change, I am also updating the dag memonger traversal to always start from root ops, i.e. ops with in-degree 0. The earlier logic started traversing from the input head blobs, and if one of the head inputs was used by a non-root op that got visited before its parent, the traversal threw an assertion error here: https://fburl.com/diffusion/ob110s9z . Almost all distributed inference part0 nets were hitting this assertion error.
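A minimal sketch of the "start from root ops" traversal described above, with plain dicts standing in for Caffe2 OperatorDef protos (this is not the memonger implementation, just the idea):
```
from collections import deque

def visit_from_roots(ops):
    """Kahn-style traversal starting from ops with in-degree 0, so every op
    is visited only after all of its producers. `ops` is a list of dicts
    with 'input'/'output' blob-name lists."""
    producers = {}
    for idx, op in enumerate(ops):
        for blob in op["output"]:
            producers.setdefault(blob, []).append(idx)

    in_degree = [0] * len(ops)
    children = [[] for _ in ops]
    for idx, op in enumerate(ops):
        for blob in op["input"]:
            for parent in producers.get(blob, []):
                if parent != idx:              # ignore self-edges of inplace ops
                    in_degree[idx] += 1
                    children[parent].append(idx)

    queue = deque(i for i, deg in enumerate(in_degree) if deg == 0)  # root ops
    order = []
    while queue:
        idx = queue.popleft()
        order.append(idx)
        for child in children[idx]:
            in_degree[child] -= 1
            if in_degree[child] == 0:
                queue.append(child)
    return order
```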
Test Plan: Added corresponding tests in memonger_test.py. Could not find unit tests for the C++ version of memonger.
Reviewed By: hlu1
Differential Revision: D24872010
fbshipit-source-id: 1dc99b2fb52b2bc692fa4fc0aff6b7e4c5e4f5b0
Summary:
Distributed Inference splits a predict net into multiple parts, with part0 being the main part that contains the ops making remote calls to the other parts. The part0 predict net may contain AsyncIf ops to optimize RPC call usage, and those AsyncIf ops carry internal nets that may refer to memongered blobs. This change handles AsyncIf ops by updating their internal nets to refer to the memongered blobs. Here is one reference part0 predict net with AsyncIf ops: https://www.internalfb.com/intern/paste/P145812115/
As part of this change, I am also updating the dag memonger traversal to always start from root ops, i.e. ops with in-degree 0. The earlier logic started traversing from the input head blobs, and if one of the head inputs was used by a non-root op that got visited before its parent, the traversal threw an assertion error here: https://fburl.com/diffusion/ob110s9z . Almost all distributed inference part0 nets were hitting this assertion error.
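To sketch the AsyncIf handling, here is a hedged example of applying a blob-renaming map recursively to internal nets carried in NetDef-typed operator arguments; it assumes caffe2_pb2 is importable and that internal nets live in the Argument fields `n`/`nets`, as in caffe2.proto:
```
def rename_blobs_recursively(net, renaming):
    """Apply `renaming` (old blob name -> memongered name) to a NetDef,
    recursing into NetDef-typed args such as the internal nets of AsyncIf ops."""
    for op in net.op:
        for i, name in enumerate(op.input):
            op.input[i] = renaming.get(name, name)
        for i, name in enumerate(op.output):
            op.output[i] = renaming.get(name, name)
        for arg in op.arg:
            if arg.HasField("n"):              # single internal NetDef
                rename_blobs_recursively(arg.n, renaming)
            for subnet in arg.nets:            # repeated internal NetDefs
                rename_blobs_recursively(subnet, renaming)
    # The net's external_input/external_output lists may need the same rewrite.
```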
Reviewed By: hlu1
Differential Revision: D24346771
fbshipit-source-id: ad2dd2e63f3e822ad172682f6d63f8474492255d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46560
Follow-up for D24236604 (16c52d918b).
For nets that pass the schema check, memonger makes sure to preserve the inplaceness of operators that are already inplace, so we can safely enable it for correct input nets.
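A small hedged sketch of what "preserving inplaceness" means here, with plain dicts standing in for OperatorDef protos (names are illustrative, not the actual check):
```
def inplace_pairs(op):
    """(input index, output index) pairs whose blob names alias each other."""
    return {
        (i, o)
        for i, in_name in enumerate(op["input"])
        for o, out_name in enumerate(op["output"])
        if in_name == out_name
    }

def assert_inplaceness_preserved(original_ops, memongered_ops):
    """Every op that was inplace before memonger must still be inplace at the
    same positions afterwards, even though blob names may have changed."""
    for before, after in zip(original_ops, memongered_ops):
        assert inplace_pairs(before) <= inplace_pairs(after), before
```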
(Note: this ignores all push blocking failures!)
Differential Revision: D24402482
fbshipit-source-id: a7e95cb0e3eb87adeac79b9b69eef207957b0bd5
Summary:
Hi guys,
I'd like to build Caffe2 with more supported options on Windows with Microsoft Visual Studio.
This is the first pull request.
Running scripts/build_windows_shared.bat builds Caffe2 with both CMAKE_BUILD_TYPE=Debug and CMAKE_BUILD_TYPE=Release using Visual Studio 14 2015.
CUDA is 9.0, cuDNN is 7.0.5, and glog, gflags, and lmdb are supported on my system.
Python is 3.5, and Detectron works from the Python interface as well.
It was even possible to debug Detectron code and step into caffe2_gpu.dll with PDBs built.
Disappointingly, the c10/experimental ops don't build with this Visual Studio generator, so I added a special option INCLUDE_EXPERIMENTAL_C10_OPS (default ON) to build_windows_shared.bat to deal with it.
After this pull request, the next step is to add Visual Studio 2017 support to the script.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13550
Reviewed By: ezyang
Differential Revision: D13042597
Pulled By: orionr
fbshipit-source-id: f313f909f599cd582a1d000eff766eef3a9fc4fc
Summary:
Original commit changeset: f5614a5d2607
D9986213 is causing a [huge performance difference](https://our.intern.facebook.com/intern/ads/analyze_canary/412951953278781781/) for Multifeed Aggregator and has been blocking the aggregator push since last Friday night: https://fburl.com/feedtools/b6izvwjz
We need to land this revert ASAP to unblock the aggregator push.
Reviewed By: orionr
Differential Revision: D10123245
fbshipit-source-id: d83da8e00a1250f5d09811a0a587c127e377aab2
Summary:
A purely experimental addition to guide us in delivering this into real production systems and their thread pools. The biggest limitation right now is that we need to turn off the BlackBoxPredictor activation deallocation logic to get to sane performance.
Reviewed By: highker
Differential Revision: D8798029
fbshipit-source-id: ec7962689d605fba62b2c9e0904309df567a25a4
* Fix some signed/unsigned mismatches
* Skip unused result warning
* Explicit fallthrough for murmur hash
* Enable aligned new support to eliminate warning
* Switch to int instead of unsigned in some cases
Summary: memonger.cc's support for RNNs was broken in D5994548 because it changed a .n argument to a .s argument. That made data_parallel_model_test fail (but tests were not run for the blame diff, so this was not noticed).
Reviewed By: kennyhorror
Differential Revision: D6043948
fbshipit-source-id: d29abd6927c519227a28b41c1ef70fb1756904bf
Summary:
RNNOp has been using TextFormat to represent nets. This has already caused some incompatibilities and also pulls huge dependencies into RNN on mobile. This diff adds support for using a NetDef arg instead and adds support for compiling only this version.
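As a rough illustration of the two representations (not the exact RNNOp code; the arg name "step_net" and the helper are assumptions), a step net can be read either from a text-format string in arg.s or directly from a NetDef in arg.n:
```
from caffe2.proto import caffe2_pb2
from google.protobuf import text_format

def get_step_net(op, arg_name="step_net"):
    """Return the step net stored on `op`, whether it is carried as a
    text-format string (arg.s) or as an embedded NetDef (arg.n)."""
    for arg in op.arg:
        if arg.name != arg_name:
            continue
        if arg.HasField("n"):                  # NetDef arg: no TextFormat needed
            return arg.n
        if arg.HasField("s"):                  # legacy: text-format bytes
            step = caffe2_pb2.NetDef()
            text_format.Merge(arg.s.decode("utf-8"), step)
            return step
    return None
```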
Reviewed By: salexspb
Differential Revision: D5994548
fbshipit-source-id: 6c4ded97b80d7a57ad5a013b79ae917aac777c7d
Summary: When we ported memonger to C++ in D5544219, we forgot to include the special handling of RecurrentNetwork ops. This fixes that and adds a test.
Reviewed By: asaadaldien
Differential Revision: D5692407
fbshipit-source-id: 4e739b5dd6c7298303eee9bfa1aa4d19359eb7b5
Summary:
Enforce that blobs are not mixed between operators on different GPUs or between CPU and GPU. Add a test.
+ Fix memonger when no namescope is provided.
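A hedged sketch of the kind of enforcement described above, with plain dicts standing in for OperatorDef/DeviceOption (field names are illustrative):
```
def check_blobs_do_not_cross_devices(ops):
    """Assert that every blob is only touched by ops on a single device.

    `ops` is a list of dicts with 'input'/'output' blob lists and a 'device'
    string such as 'cpu' or 'gpu:0'."""
    blob_device = {}
    for op in ops:
        for blob in list(op["input"]) + list(op["output"]):
            first = blob_device.setdefault(blob, op["device"])
            assert first == op["device"], (
                f"blob {blob!r} is used on both {first} and {op['device']}"
            )
```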
Reviewed By: asaadaldien
Differential Revision: D5644708
fbshipit-source-id: 0cb361efd6361b6e2138462584bab6b4de039b5d
Summary: This diff replaces the main part of the memonger-for-dag algorithm, _compute_blob_recycling_for_dag, with a C++ implementation.
Reviewed By: akyrola
Differential Revision: D5544219
fbshipit-source-id: 9f868880c8d0eb997ad3dd39433f9d0b9216d303
Summary: Fix a bug reported by dzhulgakov that occurs when an input blob is used twice by the same op: it was released to the recycled-blobs pool twice.
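A minimal sketch of the fix (hypothetical names, plain data structures in place of the memonger internals): deduplicate an op's inputs so a dying blob is released to the pool at most once.
```
def release_dead_inputs(op_inputs, remaining_consumers, free_pool):
    """Decrement the per-blob consumer count once per op and recycle blobs
    whose count reaches zero. Iterating over set(op_inputs) ensures a blob
    listed twice in the same op is released only once."""
    for blob in set(op_inputs):
        remaining_consumers[blob] -= 1
        if remaining_consumers[blob] == 0:
            free_pool.append(blob)
```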
Reviewed By: dzhulgakov, volkhin
Differential Revision: D5414023
fbshipit-source-id: 861bb46fe901023cb9a496401736e6ecb77d5fae
Summary:
To be used with the predictor "online": a C++ version of memonger for simple nets. It uses a very simple greedy algorithm and works well at least on the Resnet-50 inference graph, where only 3 shared blobs are used.
Next I will integrate this with the predictor and run a canary (separate diff).
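A hedged Python sketch of a greedy scheme in the same spirit (not the actual C++ implementation): walk a sequential net in order, free each blob after its last use, and serve new outputs from the free pool.
```
def greedy_share_blobs(ops, dont_share):
    """Return a renaming map from original blob names to shared names.

    `ops` is a list of dicts with 'input'/'output' blob lists for a simple
    sequential net; `dont_share` holds blobs that must keep their identity
    (e.g. external inputs/outputs).
    """
    last_use = {}
    for idx, op in enumerate(ops):
        for blob in op["input"]:
            last_use[blob] = idx

    free_pool, renaming, counter = [], {}, 0
    for idx, op in enumerate(ops):
        for blob in op["output"]:
            if blob in dont_share or blob in renaming:
                continue
            if free_pool:
                renaming[blob] = free_pool.pop()     # reuse a freed shared blob
            else:
                renaming[blob] = f"__shared_{counter}"
                counter += 1
        for blob in set(op["input"]):
            if last_use.get(blob) == idx and blob in renaming:
                free_pool.append(renaming[blob])     # last consumer: recycle
    return renaming
```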
Reviewed By: asaadaldien
Differential Revision: D5375392
fbshipit-source-id: d36e419e39a32e568e105657c27fb00c85a2535d