pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
PyTorch MergeBot	b5594f7df0	Revert "Use missing-prototypes in torch_cpu (#103725 )" This reverts commit `716b3b893d`. Reverted https://github.com/pytorch/pytorch/pull/103725 on behalf of https://github.com/osalpekar due to Broke caffe2 builds due. More info at [D46920675](https://www.internalfb.com/diff/D46920675) ([comment](https://github.com/pytorch/pytorch/pull/103725#issuecomment-1603129273))	2023-06-22 18:30:31 +00:00
cyy	716b3b893d	Use missing-prototypes in torch_cpu (#103725 ) This PR enables Wmissing-prototypes in torch_cpu except some generated cpp files and the mps and metal backends. Pull Request resolved: https://github.com/pytorch/pytorch/pull/103725 Approved by: https://github.com/albanD	2023-06-21 13:19:55 +00:00
Richard Barnes	704af23ee4	Use a reference in GetSingleArgument (#71007 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71007 A string copy at Line 417 is currently consuming 125,749,287,000 cycles/day. I suspect the issue is with a copy-on-return, but we can experiment with introducing a reference in the middle to see if that produces a good savings without changing the interface. Reference ``` ["Inline caffe2::ArgumentHelper::GetSingleArgument @ caffe2/caffe2/utils/proto_utils.cc:417"] ``` Test Plan: Sandcastle Reviewed By: xw285cornell Differential Revision: D33478883 fbshipit-source-id: e863e359c0c718fcd0d52fd4b3c7858067de0670	2022-01-07 20:18:56 -08:00
Scott Wolchok	03a58a2ba0	[Caffe2] Create fewer strings during argument fetching (#64285 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64285 With C++14 heterogeneous ordered container lookup, it is no longer necessary to create a `std::string` in order to look up elements of a `CaffeMap` keyed by std::string. Accordingly, this diff reworks the argument-getting operator functions to avoid that in favor of `c10::string_view`. ghstack-source-id: 137139818 ghstack-source-id: 137139818 Test Plan: buildsizebot iOS apps -- code size win. less strings is probably marginally good for perf but this only happens at setup time anyway. Reviewed By: dzhulgakov Differential Revision: D26826676 fbshipit-source-id: ee653b14dc2c528bae8c90f0fc6a7a419cbca1d6	2021-09-01 13:30:54 -07:00
Andrew Gallagher	20bda0057e	[caffe2/utils] Add explicit rule to avoid package boundary violation Summary: Add a rule to wrap proto_utils.h and depend on that, rather than relying on a glob which violates package boundaries. Reviewed By: igorsugak Differential Revision: D29273453 fbshipit-source-id: 08f198a03d06ee2fdf61f5dbe1d0087db22aec8b	2021-06-22 12:22:24 -07:00
Nikita Shulga	3a66a1cb99	[clang-tidy] Exclude cppcoreguidelines-avoid-magic-numbers (#57841 ) Summary: Add cppcoreguidelines-avoid-magic-numbers exclusion to clang-tidy Remove existing nolint warnings using following script: ``` for file in `git ls-files \| grep -v \.py`; do gsed '/^ *\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-magic-numbers)/d' -i $file; done ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/57841 Reviewed By: samestep Differential Revision: D28295045 Pulled By: malfet fbshipit-source-id: 7c6e8d1213c9593f169ed3df6a916498f1a97163	2021-05-07 20:02:33 -07:00
Nikita Shulga	4cb534f92e	Make PyTorch code-base clang-tidy compliant (#56892 ) Summary: This is an automatic change generated by the following script: ``` #!/usr/bin/env python3 from subprocess import check_output, check_call import os def get_compiled_files_list(): import json with open("build/compile_commands.json") as f: data = json.load(f) files = [os.path.relpath(node['file']) for node in data] for idx, fname in enumerate(files): if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'): files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')] return files def run_clang_tidy(fname): check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"]) changes = check_output(["git", "ls-files", "-m"]) if len(changes) == 0: return check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"]) def main(): git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n") compiled_files = get_compiled_files_list() for idx, fname in enumerate(git_files): if fname not in compiled_files: continue if fname.startswith("caffe2/contrib/aten/"): continue print(f"[{idx}/{len(git_files)}] Processing {fname}") run_clang_tidy(fname) if __name__ == "__main__": main() ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892 Reviewed By: H-Huang Differential Revision: D27991944 Pulled By: malfet fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179	2021-04-28 14:10:25 -07:00
davidriazati@fb.com	2f5c352162	Fix protobuf warnings in caffe2 (#56186 ) Summary: This guards some deprecated usages of the Protobuf API behind an `#ifdef` (this is how onnx does it as well) Pull Request resolved: https://github.com/pytorch/pytorch/pull/56186 Pulled By: driazati Reviewed By: bertmaher, dzhulgakov Differential Revision: D27803121 fbshipit-source-id: 2d3a348ec1ab9879a0d8f2dff17c5444fd4baf2c	2021-04-19 15:19:53 -07:00
Ying Zhang	8c1a70a7c9	[A*][Gen-1.5] Add shape inference func for PredictorCall. Summary: ATT, so that the shape inference works for a model with only distributed parts. Previously, we rely on a full_predictor net to do shape inference. For very large models, the full_predictor net won't be generated, so we have to do shape inference based on distributed parts. Surprisingly, the PredictorCall op does tensor name mapping so it has to have shape inference func supported. Test Plan: Added unittests. Reviewed By: khabinov Differential Revision: D27250956 fbshipit-source-id: 3ebd36ba1eb020bb5d00358cffb8f038a6a996e8	2021-04-06 21:18:40 -07:00
Scott Wolchok	4c9eb57914	[PyTorch] Narrow Device to 2 bytes by narrowing DeviceType and DeviceIndex (#47023 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47023 DeviceType pretty clearly only needs 1 byte. DeviceIndex only needs 1 byte given that machines don't have anywhere near 255 GPUs in them as far as I know. ghstack-source-id: 116901430 Test Plan: Existing tests, added assertion to catch if my assumption about DeviceIndex is incorrect Reviewed By: dzhulgakov Differential Revision: D24605460 fbshipit-source-id: 7c9a89027fcf8eebd623b7cdbf6302162c981cd2	2020-11-18 19:39:40 -08:00
Yinghai Lu	a92b49f7c8	[Onnxifi] Don't throw exception when we cannot write out debug files (#45979 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45979 For some reason, sometime we cannot write out the debug files. This shouldn't block the whole service. Hence, we opt in to error out instead of throw error. Test Plan: Run net_runner test at `/` and observe error being printed out but the test passes. Reviewed By: ipiszy Differential Revision: D24165081 fbshipit-source-id: a4e1d0479d54d741e615e3a00b3003f512394fd4	2020-10-08 00:18:24 -07:00
Hao Lu	39b4701d31	[caffe2][redo] Reimplement RemoveOpsByType with SSA (#41606 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41606 The previous diff (D22220798 (`59294fbbb9`) and D22220797) was recently reverted (D22492356 (`28291d3cf8`), D22492355) because of a bug associated with the op AsyncIf. The AsyncIf op has net_defs as args and the SSA rewriting didn't take that into account. It has a special path for the op If, but not for AsyncIf. Several changes I made to fix the bug: 1) Add op AsyncIf to the special path for If op in SSA rewriting 2) clear inputs/outputs of the netdefs that are args in If/AsyncIf ops because they're no longer valid 3) revert renamed inputs/outputs in the arg netdefs that are in the external_outputs in the parent netdef 2) and 3) are existing bugs in the `SsaRewrite` function that were just never exposed before. The algorithm for `RemoveOpsByType` is the same as in my previous diff D22220798 (`59294fbbb9`). The only new changes in this diff are in `onnx::SsaRewrite` and a few newly added unit tests. (Note: this ignores all push blocking failures!) Reviewed By: yinghai Differential Revision: D22588652 fbshipit-source-id: ebb68ecd1662ea2bae14d4be8f61a75cd8b7e3e6	2020-07-17 16:06:43 -07:00
Dmytro Dzhulgakov	49457a7be7	Logging for ATen op subtype Summary: ATenOp should go away, but before it does it's important to understand what's going inside of it. We already log `arguments`, but it's rather hard to parse in scuba as its a list, not a dictionary. Let's extract operator name explicitly so that grouping works well Test Plan: unittest Reviewed By: ngimel Differential Revision: D21057966 fbshipit-source-id: 86be7cca39055620477a28bd5d8ab29e8edd2ff9	2020-04-19 23:02:50 -07:00
Dmytro Dzhulgakov	1f759936f0	Propagate model id used by Predictor to Caffe2 logging Summary: Does the same things as D19658565 but for Caffe2 models. From investigation https://fb.quip.com/PbgsAEmoJVuf the model id that predictor uses and the model id saved inside the model don't match. Common reason is recurring fluent2 jobs but there are others. Since model_id from predictor is what the rest of datasets use, it's way more useful imho. I've considered adding both ids, but it'd require additional piping and I don't think it's that useful. Test Plan: unittests added Reviewed By: houseroad Differential Revision: D20630599 fbshipit-source-id: 3e6d0cb0b6f8c8b6ae5935138f55ae7a2ff60653	2020-03-29 23:07:32 -07:00
Yinghai Lu	79e1305519	[net_runner] Get shape info from qtensors (#34321 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34321 Mostly cosmetic as we can infer the shape anyway. It can remove a lot of the noise in the log though. Note that weight sharing doesn't work yet. I'll add another diff to address this. Reviewed By: houseroad Differential Revision: D20290841 fbshipit-source-id: fe6f9b60d05dbe150af15b5d9d7a69fd902e12cc	2020-03-09 18:34:16 -07:00
Yinghai Lu	80404cb2f5	Add support for getting TensorProto argument (#18364 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18364 att Reviewed By: bddppq Differential Revision: D14584784 fbshipit-source-id: 03f9207d5cf4f7f4b812428a931edbcdcb21ca8d	2019-04-02 20:58:28 -07:00
Duc Ngo	172ec4ace5	caffe2 - Util to cleanup external inputs and outputs from a NetDef (#18194 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18194 Add a util method to cleanup external inputs and outputs from a NetDef The following conditions will be met after the modification - No duplicate external inputs - No duplicate external outputs - Going through list of ops in order, all op inputs must be outputs from other ops, or registered as external inputs. - All external outputs must be outputs of some operators. Reviewed By: ZolotukhinM Differential Revision: D14528589 fbshipit-source-id: c8d82fda1946aa3696abcbec869a4a8bb22f09b6	2019-03-22 11:23:03 -07:00
Sebastian Messmer	d408324350	Move files to/from c10/core and c10/util (#15316 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15316 This starts cleaning up the files in c10 according to the module structure we decided on. Move to c10/util: - Half.h, Half-inl.h, Half.cpp, bitcasts.h Move to c10/core: - Device.h, Device.cpp - DeviceType.h, DeviceType.cpp i-am-not-moving-c2-to-c10 Reviewed By: dzhulgakov Differential Revision: D13498493 fbshipit-source-id: dfcf1c490474a12ab950c72ca686b8ad86428f63	2019-01-10 16:22:22 -08:00
David Reiss	cbd1c519c4	Replace non-printable-ascii characters in ProtoDebugString (#14918 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14918 When ProtoBuf-Lite is in use, ProtoDebugString just calls SerializeAsString. This produces binary output, which is not a very suitable "debug" string. Specifically, we've observed it causing problems when calling code tries to add the debug string to a Java exception message (which requires valid UTF-8). Now, we replace all non-ASCII bytes with "?". This is not a very fast implementation, but generating debug strings shouldn't be a performance-sensitive operation in any application. Reviewed By: dzhulgakov Differential Revision: D13385540 fbshipit-source-id: 8868172baf20efaf53fecf7d666a6980f59b64f5	2018-12-13 13:16:24 -08:00
Junjie Bai	a682ce9144	Add back HIP support to async net (#13400 ) Summary: We lost HIP support in last refactoring `620ece2668` Pull Request resolved: https://github.com/pytorch/pytorch/pull/13400 Differential Revision: D12868211 Pulled By: bddppq fbshipit-source-id: 72dbfda105b826bee28ddf480e88fca7d63f93d8	2018-10-31 17:52:36 -07:00
Ilia Cherniavskii	620ece2668	Simplify thread pool creation logic (#13114 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13114 Using one thread pool creator for all device types Reviewed By: manojkris, wesolwsk Differential Revision: D10851533 fbshipit-source-id: 32ca51d7932ba7faa8137df26315f52ecb4c6157	2018-10-26 16:02:08 -07:00
Junjie Bai	e290a9d2fd	Back out "Migrate DeviceOption.numa_node_id to DeviceOption.device_id" Summary: Original commit changeset: 82583d0ad4b8 Reviewed By: enosair, ilia-cher Differential Revision: D10560741 fbshipit-source-id: e289a37d441bd2243b369810abf451292891d9ee	2018-10-24 17:11:25 -07:00
Edward Yang	34cca9f05b	Move Device and DeviceType to c10 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12995 Reviewed By: Yangqing Differential Revision: D10513246 fbshipit-source-id: 0c6d52e09166d7e8a786c1a0e21685ec9c35b12a	2018-10-24 08:27:44 -07:00
Junjie Bai	202893fe1a	Migrate DeviceOption.numa_node_id to DeviceOption.device_id Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12717 Reviewed By: ilia-cher Differential Revision: D10408325 fbshipit-source-id: 82583d0ad4b8db094ee4c5c607b52500826328f7	2018-10-19 12:45:48 -07:00
Yangqing Jia	713e706618	Move exception to C10 (#12354 ) Summary: There are still a few work to be done: - Move logging and unify AT_WARN with LOG(ERROR). - A few header files are still being plumbed through, need cleaning. - caffe2::EnforceNotMet aliasing is not done yet. - need to unify the macros. See c10/util/Exception.h This is mainly a codemod and not causing functional changes. If you find your job failing and trace back to this diff, usually it can be fixed by the following approaches: (1) add //caffe2/c10:c10 to your dependency (or transitive dependency). (2) change objects such as at::Error, at::Optional to the c10 namespace. (3) change functions to the c10 namespace. Especially, caffe2::MakeString is not overridden by the unified c10::str function. Nothing else changes. Please kindly consider not reverting this diff - it involves multiple rounds of rebasing and the fix is usually simple. Contact jiayq@ or AI Platform Dev for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12354 Reviewed By: orionr Differential Revision: D10238910 Pulled By: Yangqing fbshipit-source-id: 7794d5bf2797ab0ca6ebaccaa2f7ebbd50ff8f32	2018-10-15 13:33:18 -07:00
Junjie Bai	89010d60f9	Migrate HIP to use DeviceOption.device_id and delete DeviceOption.hip_gpu_id Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12546 Reviewed By: hyuen, xw285cornell Differential Revision: D10305222 fbshipit-source-id: 955e1d2878508a25fe4e9980ae66f8f54aaf7db9	2018-10-10 18:25:06 -07:00
Junjie Bai	f54ab540af	Rename cuda_gpu_id to device_id in DeviceOption (#12456 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12456 codemod with 'Yes to all' codemod -d . --extensions h,cc,cpp,cu,py,proto,pbtxt,pb.txt,config cuda_gpu_id device_id Overload TextFormat::ParseFromString to do string replace when parsing from protobuf format Reviewed By: Yangqing Differential Revision: D10240535 fbshipit-source-id: 5e6992bec961214be8dbe26f16f5794154a22b25	2018-10-09 15:54:04 -07:00
Junjie Bai	ff608a9ff3	Back out "Revert D10123245: Back out "codemod cuda_gpu_id to device_id"" (#12232 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12232 Original commit changeset: fca91fea58b7 This adds proper modifications to the DeviceType <->DeviceOption conversion code added in D10033396 Reviewed By: jerryzh168 Differential Revision: D10132473 fbshipit-source-id: 801ef777e2950982cb47b48051b1471a0a91e64b	2018-10-01 21:54:52 -07:00
Rick Ratmansky	3010dc4208	Revert D10123245: Back out "codemod cuda_gpu_id to device_id" Differential Revision: D10123245 Original commit changeset: d83da8e00a12 fbshipit-source-id: fca91fea58b7df208edc2e218a1d514f9821ec7b	2018-10-01 12:22:36 -07:00
Yang Liu	7d7d336c45	Back out "codemod cuda_gpu_id to device_id" Summary: Original commit changeset: f5614a5d2607 D9986213 is causing Multifeed Aggregator a [huge performance different](https://our.intern.facebook.com/intern/ads/analyze_canary/412951953278781781/) and is blocking aggregator push since last Friday night: https://fburl.com/feedtools/b6izvwjz We need to land this revert ASAP to unblock aggregator push. Reviewed By: orionr Differential Revision: D10123245 fbshipit-source-id: d83da8e00a1250f5d09811a0a587c127e377aab2	2018-10-01 11:31:14 -07:00
Junjie Bai	3eb5940cf5	codemod cuda_gpu_id to device_id (#12022 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12022 codemod -d . --extensions h,cc,cpp,cu,py,proto,pbtxt,pb.txt,config cuda_gpu_id device_id codemod with 'Yes to all' Reviewed By: orionr Differential Revision: D9986213 fbshipit-source-id: f5614a5d26078817aee8caf79a494abfd6a95ff1	2018-09-27 20:24:53 -07:00
Yangqing Jia	28dba2f928	Unify all _EXPORT and _IMPORT macros across c++ backend (#12019 ) Summary: TSIA. Right now we should basically use C10_EXPORT and C10_IMPORT for explicitly marking dllexport and dllimport, as a continued effort of the C10 unification. This is a codemod by mechanically doing the following change: CAFFE2_{EXPORT,IMPORT} -> C10_{EXPORT,IMPORT} AT_CORE_{EXPORT,IMPORT} -> C10_{EXPORT,IMPORT} Pull Request resolved: https://github.com/pytorch/pytorch/pull/12019 Reviewed By: ezyang, teng-li Differential Revision: D10016276 Pulled By: Yangqing fbshipit-source-id: a420d62c43d1110105fc88f9e9076e28a3203164	2018-09-25 17:41:05 -07:00
Jerry Zhang	9f4bcdf075	caffe2::DeviceType -> at::DeviceType (#11254 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11254 Previously we use DeviceType in caffe2.proto directly, but it's an `enum` and have implicit conversion to int, which does not have type safety, e.g. we have to explicitly check for a device type is valid in event.h: ``` template <int d> struct EventCreateFunctionRegisterer { explicit EventCreateFunctionRegisterer(EventCreateFunction f) { static_assert(d < MaxDeviceTypes, ""); Event::event_creator_[d] = f; } }; ``` at::DeviceType is an `enum class`, and it does not have implicit conversion to int, and provides better type safety guarantees. In this diff we have done the following refactor(taking CPU as an example): 1. caffe2::DeviceType → caffe2::DeviceTypeProto 2. caffe2::CPU → caffe2::PROTO_CPU 3. caffe2::DeviceType = at::DeviceType 4. caffe2::CPU = at::DeviceType::CPU codemod -d caffe2/caffe2 --extensions h,cc,cpp 'device_type, ' 'device_type(), PROTO_' + some manual changes In short, after this diff, in c++, caffe2::CPU refers to the at::DeviceType::CPU and the old proto caffe2::CPU will be caffe2::PROTO_CPU. In python side, we have a temporary workaround that alias `caffe2_pb2.CPU = caffe2_pb2.PROTO_CPU` to make the change easier to review and this will be removed later. Reviewed By: ezyang Differential Revision: D9545704 fbshipit-source-id: 461a28a4ca74e616d3ee183a607078a717fd38a7	2018-09-05 16:28:09 -07:00
Orion Reblitz-Richardson	6508db7421	Remove BUILD_CAFFE2 and build everything (#8338 ) Summary: This completely removes BUILD_CAFFE2 from CMake. There is still a little bit of "full build" stuff in setup.py that enables USE_CUDNN and BUILD_PYTHON, but otherwise everything should be enabled for PyTorch as well as Caffe2. This gets us a lot closer to full unification. cc mingzhe09088, pjh5, ezyang, smessmer, Yangqing Pull Request resolved: https://github.com/pytorch/pytorch/pull/8338 Reviewed By: mingzhe09088 Differential Revision: D9600513 Pulled By: orionr fbshipit-source-id: 9f6ca49df35b920d3439dcec56e7b26ad4768b7d	2018-08-31 13:10:24 -07:00
Pengyao Chen	dec3ed7b49	Increase the limit for Proto size (#10745 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10745 ParseProtoFromLargeString hits limit when using recurring v2. To unblock warmup project, we can increase the limit temporarily. More details in this post -- https://fb.facebook.com/groups/264913123977784/permalink/463566404112454/ Differential Revision: D9436368 fbshipit-source-id: 54488f27ef941cab679843cb0c502095dd056c1b	2018-08-23 13:55:50 -07:00
Edward Yang	36939417b2	Introduce at::DeviceType, which subsumes at::Device::Type and (partially) caffe2::DeviceType (#10175 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10175 Previously, we had at::Device::Type and caffe2::DeviceType (from protobuf), intended to help us distinguish between CPU, CUDA, etc. devices. This replaces at::Device::Type entirely with at::DeviceType, which in turn is a direct, 'enum class' version of the protobuf generated caffe2::DeviceType 'enum'. We can't eliminate the 'enum' because this would a pretty drastic API change (enum is interconvertible with integers, enum class is not) but we can make the two line up exactly and share code for, e.g., printing. Reviewed By: Yangqing Differential Revision: D9137156 fbshipit-source-id: 566385cd6efb1ed722b25e6f7849a910b50342ab	2018-08-03 19:25:06 -07:00
Lu Fang	63233f98ad	Bump up opset version to 7 in Caffe2 ONNX exporter (#8854 ) Summary: Will bump up to opset 8 in another PR to match the current opset version. Already tested through generating the models in current model zoo. Closes https://github.com/pytorch/pytorch/pull/8854 Reviewed By: ezyang Differential Revision: D8666437 Pulled By: houseroad fbshipit-source-id: feffdf704dd3136aa59c0f1ff1830c14d1bd20aa	2018-06-28 07:39:02 -07:00
Duc Ngo	f52c2ca1c6	net_async tracing use enable_profile arg from NetDef (#8927 ) Summary: Closes https://github.com/pytorch/pytorch/pull/8927 Closes https://github.com/pytorch/pytorch/pull/8855 - Add parameter `enable_tracing` to the Arg field of NetDef. `net_async_tracing` will only enable Tracer for Net instances that have this field set (unless the command line argument also include the net name). - Append a unique id to the json profiling result file because there could be multiple instances of the same net running. - Dump json profling file regularly instead of just when the Tracer object is destroyed Reviewed By: ilia-cher Differential Revision: D8372378 fbshipit-source-id: 8adc9d59f48b67456beed2e3a88235c298fdfd01	2018-06-27 16:24:57 -07:00
Orion Reblitz-Richardson	edd4e2c5d1	Expose proto utils and ONNX (#8073 ) * Expose proto utils and ONNX from PyTorch libcaffe2.so * Try to use protobuf from _C.so * Fix ONNX proto header include * Adjust order of imports for ONNX until nanopb goes away * Set and use ONNX_NAMESPACE for PyTorch builds * Show protobuf summary for all builds * Add ONNX_NAMESPACE for cpp_build * Statically link libprotobuf.a into libtorch.so * Set ONNX_NAMESPACE on Windows build * Move core/dispatch up as well * Add /MD flag for Windows build of _C * Potential Windows fix for ONNX and protobuf * Add direct linkage from _C to ONNX on Windows * Only include protobuf wrapper for PyTorch * Pass extra_compile_args to _nvrtc ext build * Remove installation of .a files	2018-06-13 10:25:32 -07:00
bddppq	e5b997223c	[Caffe2] Enabling AMD GPU Backend for Caffe2 (#7955 ) * Add hip support for caffe2 core * Add MIOPEN header/wrapper to caffe2 core * Add HIP device into caffe2 PB * top level makefile change for rocm/hip * makefile scaffolding for AMD/RocM/HIP * Makefile scafodding for AMD/RocM/HIP; add makefile/utility for HIP files * caffe2 PB update for AMD/ROCM HIP device * Add AMD/RocM/Thrust dependency * HIP threadpool update * Fix makefile macro * makefile fix: duplicate test/binary name * makefile clean-up * makefile clean-up * add HIP operator registry * add utilities for hip device * Add USE_HIP to config summary * makefile fix for BUILD_TEST * merge latest * Fix indentation * code clean-up * Guard builds without HIP and use the same cmake script as PyTorch to find HIP * Setup rocm environment variables in build.sh (ideally should be done in the docker images) * setup locale * set HIP_PLATFORM * Revert "set HIP_PLATFORM" This reverts commit 8ec58db2b390c9259220c49fa34cd403568300ad. * continue the build script environment variables mess * HCC_AMDGPU_TARGET * Cleanup the mess, has been fixed in the lastest docker images * Assign protobuf field hip_gpu_id a new field number for backward compatibility * change name to avoid conflict * Fix duplicated thread pool flag * Refactor cmake files to not add hip includes and libs globally * Fix the wrong usage of environment variables detection in cmake * Add MIOPEN CNN operators * Revert "Add MIOPEN CNN operators" This reverts commit 6e89ad4385b5b8967a7854c4adda52c012cee42a. * Resolve merge conflicts * . * Update GetAsyncNetHIPThreadPool * Enable BUILD_CAFFE2 in pytorch build * Unifiy USE_HIP and USE_ROCM * always check USE_ROCM * . * remove unrelated change * move all core hip files to separate subdirectory * . * . * recurse glob core directory * . * correct include * .	2018-06-04 09:04:30 -07:00
bddppq	5e35fbfaa3	Post process onnx proto (#8064 ) * Post processing onnx generated protobuf files to hide global symbols * . * .	2018-06-02 10:46:48 -07:00
bddppq	966c65859d	Revert "[Caffe2] Enabling AMD GPU Backend for Caffe2" (#7802 ) * Revert "[auto] Update onnx to 4898c9e - Added TensorDenotation and metadata_props for images (onnx/onnx#879) `4898c9e925`" This reverts commit `9c679dab5f`. * Revert "Add BiasCHW fallback for GPU (#7738)" This reverts commit `14ad2e74f1`. * Revert "[Caffe2] Enabling AMD GPU Backend for Caffe2 (#7566)" This reverts commit `2ebcf4bb37`.	2018-05-23 17:58:47 -07:00
Peter Yeh	2ebcf4bb37	[Caffe2] Enabling AMD GPU Backend for Caffe2 (#7566 ) * Add hip support for caffe2 core * Add MIOPEN header/wrapper to caffe2 core * Add HIP device into caffe2 PB * top level makefile change for rocm/hip * makefile scaffolding for AMD/RocM/HIP * Makefile scafodding for AMD/RocM/HIP; add makefile/utility for HIP files * caffe2 PB update for AMD/ROCM HIP device * Add AMD/RocM/Thrust dependency * HIP threadpool update * Fix makefile macro * makefile fix: duplicate test/binary name * makefile clean-up * makefile clean-up * add HIP operator registry * add utilities for hip device * Add USE_HIP to config summary * makefile fix for BUILD_TEST * merge latest * Fix indentation * code clean-up * Guard builds without HIP and use the same cmake script as PyTorch to find HIP * Setup rocm environment variables in build.sh (ideally should be done in the docker images) * setup locale * set HIP_PLATFORM * Revert "set HIP_PLATFORM" This reverts commit 8ec58db2b390c9259220c49fa34cd403568300ad. * continue the build script environment variables mess * HCC_AMDGPU_TARGET * Cleanup the mess, has been fixed in the lastest docker images * Assign protobuf field hip_gpu_id a new field number for backward compatibility * change name to avoid conflict * Fix duplicated thread pool flag * Refactor cmake files to not add hip includes and libs globally * Fix the wrong usage of environment variables detection in cmake * Add MIOPEN CNN operators * Revert "Add MIOPEN CNN operators" This reverts commit 6e89ad4385b5b8967a7854c4adda52c012cee42a.	2018-05-23 15:13:09 -07:00
Jinghui	26ddefbda1	[feature request] [Caffe2] Enable MKLDNN support for inference (#6699 ) * Add operators based-on IDEEP interfaces Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Enable IDEEP as a caffe2 device Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add test cases for IDEEP ops Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add IDEEP as a caffe2 submodule Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Skip test cases if no IDEEP support Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Correct cmake options for IDEEP Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add dependences on ideep libraries Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Fix issues in IDEEP conv ops and etc. Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Move ideep from caffe2/ideep to caffe2/contrib/ideep Signed-off-by: Gu Jinghui <jinghui.gu@intel.com> * Update IDEEP to fix cmake issue Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Fix cmake issue caused by USE_MKL option Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Correct comments in MKL cmake file Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>	2018-04-22 21:58:14 -07:00
Bram Wasti	a73b3fd1f0	[caffe2][opencl] Add OpenCL context (#6777 )	2018-04-20 11:31:21 -07:00
Yinghai Lu	ef8f556212	[Caffe2] Changes done inside Facebook (#6378 ) * fix unit test for sqrt op From the error logging: [idx, grad, grad_estimate] are: [[ 146. 0.5 0.45776367] [ 147. 0.5 0.45776367] The gradient == 0.5 is correct, which means the SqrtOp and its gradient is doing right job. (Because y = sqrt(x), loss = y^2/2 = x/2, and then d(loss)/dx = 1/2 = 0.5; ) The test failed because of numerical problem of grad_estimate (in unit test). It can be because the step_size is small, and float precision is not high (when there are multiple elements in the tensor, we do sum(y^2) to compute loss) This diff - increase the step size, and also move the test cases to be further away from 0 (where sqrt(x) is not well defined) to be safe :) - also clean up, and merge the test case for inplace Vs. non-inplace Tested with: `CAFFE2_HYPOTHESIS_PROFILE=debug ai_bt caffe2/caffe2/python/operator_test:elementwise_ops_test -- "test_sqrt"` * CompositeReader & CompositeReaderBuilder A new type of reader gluing multiple readers together. * Back out "Revert D7394363: [GanH]: Log D Trick for Cross Entropy with Sigmoid" Original commit changeset: 9325a4356dbe * [dai][WIP] convert params to int8 on ps before sending to trainer Add float->uint8 conversion in addition to float->fp16 conversion in model_saver. * [easy] improve unit test for sparse length sum ops as desc. #accept2ship * Update GitHub upstream to `771fcb3455` * move sparse hash unique ops to OOS and add unit tests - move the SparseHash version to OOS, since 'sparsehash' is already deps of caffe2 OOS: https://fburl.com/arssw4n1 - The 'SparseHash' engine is also being used in OOS, so the SparseHash version shall be in OOS to reduce confusion: https://fburl.com/o5ea7ah2 - fix the CUDA UniqueOp for the case when batch is empty. 
- add unit test * group_norm_op for caffe2 This is the cuda op for Group Normalization (GN): https://arxiv.org/abs/1803.08494 This code implements GN in one op that computes Y=gamma * (X-mu) / sigma + beta and also its gradients. It is expected to have minimal memory consumption (similar to the BN op), without creating new blobs if GN were implemented as several ops (e.g., reshape, norm_mean/std, affine_channel). * Resubmit D7405233: disappeared in D7464958 OOS publish causes the op missing -- however, test was still there * [c2] add sparse hash engine for cuda unique op The SparseHash version of UniqueOp copy input tensor to CPU, and make use of sparse hash map to get unique output, and then copy back to GPU. * [dper][gpu] enable unit testing gpu trainer for sparse nn to debug the GPU trainer using mock data in unit test. make it easier to develop GPU trainer for new models. * Reuse Gloo context for Synchronize() calls Previously we were creating (and leaking) the Gloo context on each call to Synchronize(). Now only run the common world op and create the barrier net once, then run the barrier net on each Synchronize() call. Since timeout is associated with the Gloo context, assert that the timeout is fixed instead of trying to handle the complexity of multiple timeouts (and associated contexts). * [GanH/WGAN][1/n]: add FC param clipping as titled * [mobile] minimizing changes between caffe2_benchmark and speed_benchmark * [GanH]: enable diagnose within model avoid finding blob names but to directly enable inside the model * Add `net_transformer_fun` option to DPM This callback allows for various transformations to be made to the model after gradient operators have been added. The immediate motivation for this is to allow transformations such has "checkpoint-and-recompute" which allow trading off memory for additional compute. Adding several callbacks like this has made DPM's API less than ideal at this stage. However, I could not find any reasonable alternative. 
* [DT] [33/n] Compile flow task groups task groups need to compiled in order to pickle the object in fblearner. However I also changed the Job's compile function as creating new object is not necessary. * Initial commit for sparse_normalize vectorization and benchmark * [GanH]: LB Calibration for JSD as titled * Tracing event in async executor Adding event tracing through TRACE_EVENT macro in async executor * [Resubmit] D7409751 Reseting book-keeping blobs when the reservoir is reset D7409751 got lost in D7464958 * Visualizing realtime weights values we want to visualize the weights values as optimizer is iterating. This diff supports to visual the weights at an assigned index. Currently, we assume the blob to be 2 dimensional. * [GanH][Easy]: Fix Homotopy Weighting apparantely, there was a bug in homotopy weight (alpha, beta) update * [c2] move sparse hash unique op out of oss so that oss do not need to depend on google hash map. * Get rid of std::round as it's not supported on Android * Revert changes on setup.py * Skip shaky test on Dataio * fix	2018-04-10 21:11:43 -07:00
Orion Reblitz-Richardson	dbac044759	Add protobuf wrapper functions to proto_utils. * These will be used when we statically link libprotobuf.a inside libcaffe2.so	2018-03-28 10:05:20 -07:00
Orion Reblitz-Richardson	1d5780d42c	Remove Apache headers from source. * LICENSE file contains details, so removing from individual source files.	2018-03-27 13:10:18 -07:00
Yangqing Jia	2d03ae2f85	Move ParseProtobufFromLargeString to proto_utils (#2354 ) * Move ParseProtobufFromLargeString to proto_utils * ParseProtobuf -> ParseProto to be consistent in naming	2018-03-21 17:05:14 -07:00
Yangqing Jia	611a89c4b6	Remove more protobuf APIs. (#2348 ) * Wrap ShutdownProtobufLibrary * Remove text_format.h header and only put the function in proto_utils.h * ParseFromString returns bool	2018-03-21 10:29:45 -07:00

82 Commits
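
Note on commit cbd1c519c4 (Replace non-printable-ascii characters in ProtoDebugString): the message describes replacing bytes that are not safe to embed in a debug string with "?". The block below is a minimal sketch of that idea only, not the code that landed in proto_utils. The function name `SanitizeForDebug` is hypothetical, and the choice of 0x20..0x7E as the kept range is an assumption based on the commit title ("non-printable-ascii"); the commit body only says "non-ASCII bytes".

```cpp
#include <string>

// Hypothetical helper for illustration; not taken from the PyTorch sources.
// Returns a copy of `raw` in which every byte outside the printable ASCII
// range (assumed here to be 0x20..0x7E) is replaced with '?'.
std::string SanitizeForDebug(const std::string& raw) {
  std::string out = raw;
  for (char& c : out) {
    // Convert through unsigned char so bytes >= 0x80 compare correctly
    // on platforms where plain char is signed.
    const unsigned char byte = static_cast<unsigned char>(c);
    if (byte < 0x20 || byte > 0x7E) {
      c = '?';
    }
  }
  return out;
}
```

As the commit message itself points out, a byte-by-byte pass like this is not fast, which is acceptable because building debug strings should not be performance-sensitive.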