Commit Graph

48 Commits

Author SHA1 Message Date
Shashank Chaudhry
06d1be2447 [NOOP][clangformat][codemod] Enable CLANGFORMAT for caffe2/caffe2/* (#67624)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67624

Test Plan: Visual inspection. Sandcastle.

Reviewed By: malfet

Differential Revision: D31986628

fbshipit-source-id: c872bded7325997a2945dbf5d4d052628dcb3659
2021-11-02 22:14:04 -07:00
Nikita Shulga
a9b0a921d5 Disable avoid-non-const-global-variables lint check (#62008)
Summary:
As the GoogleTest `TEST` macro is non-compliant with it, as is `DEFINE_DISPATCH`

All changes but the ones to `.clang-tidy` are generated using following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`;  do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008

Reviewed By: driazati, r-barnes

Differential Revision: D29838584

Pulled By: malfet

fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
2021-07-22 18:04:40 -07:00
Nikita Shulga
3a66a1cb99 [clang-tidy] Exclude cppcoreguidelines-avoid-magic-numbers (#57841)
Summary:
Add cppcoreguidelines-avoid-magic-numbers exclusion to clang-tidy
Remove existing nolint warnings using the following script:
```
for file in `git ls-files | grep -v \.py`; do gsed '/^ *\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-magic-numbers)/d' -i  $file; done
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57841

Reviewed By: samestep

Differential Revision: D28295045

Pulled By: malfet

fbshipit-source-id: 7c6e8d1213c9593f169ed3df6a916498f1a97163
2021-05-07 20:02:33 -07:00
Nikita Shulga
4cb534f92e Make PyTorch code-base clang-tidy compliant (#56892)
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892

Reviewed By: H-Huang

Differential Revision: D27991944

Pulled By: malfet

fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
2021-04-28 14:10:25 -07:00
Jane Xu
71ca600af9 Renaming CAFFE2_API to TORCH_API (#49496)
Summary:
Since caffe2 and torch have been consolidated, CAFFE2_API should be merged with TORCH_API. Addresses a TODO.

Manually edited some references of the removed `CAFFE2_API`:
* `CONTRIBUTING.md`
* `caffe2/proto/CMakeLists.txt`
* `cmake/ProtoBuf.cmake`
* `c10/macros/Export.h`
* `torch/csrc/WindowsTorchApiMacro.h`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49496

Reviewed By: malfet, samestep

Differential Revision: D25600726

Pulled By: janeyx99

fbshipit-source-id: 7e068d959e397ac183c097d7e9a9afeca5ddd782
2020-12-18 10:54:50 -08:00
Shai Szulanski
0ddaaf6a92 [codemod][caffe2] Run clang-format - 5/7
Summary:
This directory is opted-in to clang-format but is not format-clean. This blocks continuous formatting from being enabled on fbcode, and causes hassle for other codemods that leave inconsistent formatting. This diff runs clang-format, which is widely used and considered safe.

If you are unhappy with the formatting of a particular block, please *accept this diff* and then in a stacked commit undo the change and wrap that code in `// clang-format off` and `// clang-format on`, or `/* clang-format off */` and `/* clang-format on */`.

drop-conflicts

Test Plan: sandcastleit

Reviewed By: jerryzh168

Differential Revision: D22311706

fbshipit-source-id: 1ca59a82e96156a4a5dfad70ba3e64d44c5e762a
2020-06-30 15:45:11 -07:00
Michael Ranieri
51d969e86a preprocessor cleanup (#33957)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33957

Lots of small preprocessor warning cleanups for Windows

Test Plan: CI green

Reviewed By: malfet, albanD

Differential Revision: D20153582

fbshipit-source-id: 18fd61c466fd1f55ededdae4448b3009a9cedc04
2020-03-02 13:37:19 -08:00
Sebastian Messmer
643ca5def2 Replace c10::guts::stuff with std::stuff (#30915)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30915

Since we now have C++14, we don't need these c10::guts helpers anymore
ghstack-source-id: 95777609

Test Plan: waitforsandcastle

Differential Revision: D18869639

fbshipit-source-id: 97716f932297c64c6e814410ac47b444c33d4e2e
2019-12-16 13:57:19 -08:00
Martin Schatz
5b835682e3 Remove GPU dependency from ProfileObserver (#17592)
Summary:
Remove GPU dependency and register ProfileObserver.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17592

Reviewed By: ezyang

Differential Revision: D14265801

Pulled By: mdschatz

fbshipit-source-id: f98c0c32653c64a8b087c58ece4f864dfbe1d4b8
2019-03-04 10:00:46 -08:00
Jerry Zhang
0c32e1b43e use C10_MOBILE/ANDROID/IOS (#15363)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15363

Didn't define C10_MOBILE in the numa file move diff: D13380559
move CAFFE2_MOBILE/ANDROID/IOS to c10

```
codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_MOBILE" "C10_MOBILE"
codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_ANDROID" "C10_ANDROID"
codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_IOS" "C10_IOS"

```

i-am-not-moving-c2-to-c10

Reviewed By: marcinkwiatkowski

Differential Revision: D13490020

fbshipit-source-id: c4f01cacbefc0f16d5de94155c26c92fd5d780e4
2019-01-09 15:08:20 -08:00
Aldian Fazrihady
e27d77815d Prevent profile_observer_test from being run by CPU test (#14168)
Summary:
Fix CMakeLists.txt so the CPU test won't run profile_observer_test.cc, as it currently supports only GPU
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14168

Differential Revision: D13356274

Pulled By: ezyang

fbshipit-source-id: 7d105f2e18675e5fab129864958148b0f18d582c
2018-12-05 22:34:29 -08:00
Junjie Bai
fade36668a Revert D10428917: [Caffe2] Add cost into profile observer
Differential Revision:
D10428917

Original commit changeset: 7c100e551bdd

fbshipit-source-id: 5164d9ba61cc103eccfdeb91a5cc140cea31a819
2018-11-16 23:30:07 -08:00
Junjie Bai
a43037fa11 Revert D10439558: Add cost for non-linear ops
Differential Revision:
D10439558

Original commit changeset: 9aeb05bac8b5

fbshipit-source-id: f00977b4f95bdd500d254eb44fb5b0c816506ee4
2018-11-16 23:30:05 -08:00
Haixin Liu
cbc94894fb Add cost for non-linear ops (#13327)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13327

Add a cost inference function to non-linear ops. Since the actual flops of a non-linear operator depend on the implementation, we use the number of non-linear operations as a proxy for the analytical flops of non-linear operators.

Reviewed By: jspark1105

Differential Revision: D10439558

fbshipit-source-id: 9aeb05bac8b5c7ae5d351ebf365e0a81cf4fc227
2018-11-16 18:53:30 -08:00
Haixin Liu
86dc3ab252 Add cost into profile observer (#12793)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12793

Add analytical cost into profile observer. It includes the op level cost information for each op run and net level aggregated cost information for each op type.

It outputs the following information:
1. analytical flops
2. analytical bytes_read
3. analytical bytes_written

Example output at op level:
```
I1017 14:58:14.245978 3686541 profile_observer_gpu.cc:26] --------- Starting operator FC op#24 ---------
I1017 14:58:14.246049 3686541 profile_observer_gpu.cc:33] Input 0: Tensor model1/embedded_encoder_inputs of type float. Dims: (17,1,256,):
I1017 14:58:14.246109 3686541 profile_observer_gpu.cc:33] Input 1: Tensor model1/encoder/layer0/fw/milstm/i2h_w of type float. Dims: (2048,256,):
I1017 14:58:14.246176 3686541 profile_observer_gpu.cc:33] Input 2: Tensor model1/encoder/layer0/fw/milstm/i2h_b of type float. Dims: (2048,):
I1017 14:58:14.246217 3686541 profile_observer_gpu.cc:44] Argument 0: name: "use_cudnn" i: 1
I1017 14:58:14.246271 3686541 profile_observer_gpu.cc:44] Argument 1: name: "cudnn_exhaustive_search" i: 0
I1017 14:58:14.246338 3686541 profile_observer_gpu.cc:44] Argument 2: name: "order" s: "NHWC"
I1017 14:58:14.246372 3686541 profile_observer_gpu.cc:44] Argument 3: name: "axis" i: 2
I1017 14:58:14.246418 3686541 profile_observer_gpu.cc:44] Argument 4: name: "quantization_scheme" i: 1
I1017 14:58:14.246470 3686541 profile_observer_gpu.cc:53] Output 0: Tensor model1/encoder/layer0/fw/milstm/i2h of type float. Dims: (17,1,2048,):
I1017 14:58:14.246596 3686541 profile_observer_gpu.cc:61] Cost (flops, bytes_read, bytes_written):
I1017 14:58:14.246649 3686541 profile_observer_gpu.cc:62]        17860608 2122752 139264
I1017 14:58:14.246677 3686541 profile_observer_gpu.cc:64] --------- Finished operator FC in 0.764221 ms ---------
```
Example output at net level:
```
I1017 11:13:44.675585 3146691 profile_observer_gpu.cc:165] ================ Detailed stats for net model0/encoder/layer0/bw/milstm ================
I1017 11:13:44.675662 3146691 profile_observer_gpu.cc:167] Cost (flops, bytes_read, bytes_written) per operator type:
I1017 11:13:44.675706 3146691 profile_observer_gpu.cc:169]        20992000 42045440 81920 FC
I1017 11:13:44.675745 3146691 profile_observer_gpu.cc:169]           20480 163840 81920 Mul
I1017 11:13:44.675824 3146691 profile_observer_gpu.cc:169]           20480 163840 81920 Sum
I1017 11:13:44.675878 3146691 profile_observer_gpu.cc:169]               0 0 0 ElementwiseLinear
I1017 11:13:44.675909 3146691 profile_observer_gpu.cc:169]               0 0 0 LSTMUnit
I1017 11:13:44.675958 3146691 profile_observer_gpu.cc:169]               0 0 0 rnn_internal_apply_link
```

Reviewed By: mdschatz

Differential Revision: D10428917

fbshipit-source-id: 7c100e551bdd3ac8d7c09be12c72d70a2d67cae1
2018-11-16 18:53:28 -08:00
ArutyunovG
8e91da4cb3 Windows shared build (#13550)
Summary:
Hi guys,

I'd like to build Caffe2 with more supported options on Windows with Microsoft Visual Studio.
This is the first pull request.
Running scripts/build_windows_shared.bat builds Caffe2 with both CMAKE_BUILD_TYPE=Debug and CMAKE_BUILD_TYPE=Release under Visual Studio 14 2015.
CUDA is 9.0, cuDNN is 7.0.5; glog, gflags, and lmdb are supported on my system.
Python is 3.5, and Detectron works from the Python interface as well.
It was even possible to debug Detectron code and step into caffe2_gpu.dll with PDBs built.

What is disappointing is that the c10/experimental ops don't build with this Visual Studio generator, so I added a special option INCLUDE_EXPERIMENTAL_C10_OPS (default ON) to deal with it in build_windows_shared.bat.

After this pull request, the next step is to add Visual Studio 2017 support to the script.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13550

Reviewed By: ezyang

Differential Revision: D13042597

Pulled By: orionr

fbshipit-source-id: f313f909f599cd582a1d000eff766eef3a9fc4fc
2018-11-16 12:16:28 -08:00
Junjie Bai
f54ab540af Rename cuda_gpu_id to device_id in DeviceOption (#12456)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12456

codemod with 'Yes to all'
codemod -d . --extensions h,cc,cpp,cu,py,proto,pbtxt,pb.txt,config cuda_gpu_id device_id

Overload TextFormat::ParseFromString to do string replace when parsing from protobuf format

Reviewed By: Yangqing

Differential Revision: D10240535

fbshipit-source-id: 5e6992bec961214be8dbe26f16f5794154a22b25
2018-10-09 15:54:04 -07:00
Junjie Bai
ff608a9ff3 Back out "Revert D10123245: Back out "codemod cuda_gpu_id to device_id"" (#12232)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12232

Original commit changeset: fca91fea58b7

This adds proper modifications to the DeviceType <->DeviceOption conversion code added in D10033396

Reviewed By: jerryzh168

Differential Revision: D10132473

fbshipit-source-id: 801ef777e2950982cb47b48051b1471a0a91e64b
2018-10-01 21:54:52 -07:00
Rick Ratmansky
3010dc4208 Revert D10123245: Back out "codemod cuda_gpu_id to device_id"
Differential Revision:
D10123245

Original commit changeset: d83da8e00a12

fbshipit-source-id: fca91fea58b7df208edc2e218a1d514f9821ec7b
2018-10-01 12:22:36 -07:00
Yang Liu
7d7d336c45 Back out "codemod cuda_gpu_id to device_id"
Summary:
Original commit changeset: f5614a5d2607

D9986213 is causing Multifeed Aggregator a [huge performance difference](https://our.intern.facebook.com/intern/ads/analyze_canary/412951953278781781/) and has been blocking the aggregator push since last Friday night: https://fburl.com/feedtools/b6izvwjz
We need to land this revert ASAP to unblock the aggregator push.

Reviewed By: orionr

Differential Revision: D10123245

fbshipit-source-id: d83da8e00a1250f5d09811a0a587c127e377aab2
2018-10-01 11:31:14 -07:00
Junjie Bai
3eb5940cf5 codemod cuda_gpu_id to device_id (#12022)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12022

codemod -d . --extensions h,cc,cpp,cu,py,proto,pbtxt,pb.txt,config cuda_gpu_id device_id

codemod with 'Yes to all'

Reviewed By: orionr

Differential Revision: D9986213

fbshipit-source-id: f5614a5d26078817aee8caf79a494abfd6a95ff1
2018-09-27 20:24:53 -07:00
Sebastian Messmer
ce6906b051 Narrowing Blob (#11167)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11167

Narrow the Blob API as preparation for merging Blob/IValue

- get rid of templated IsType and Operator::InputIsType / OutputIsType
- Use 'using' instead of 'typedef' for DestroyCall (just for readability)

Reviewed By: ezyang

Differential Revision: D9623916

fbshipit-source-id: 952f0b0cf5a525094b02e8d2798dd57a56a9e1d8
2018-09-10 12:40:16 -07:00
Hector Yuen
a6cb41486d update documentation for observers
Summary:
update to the latest observer usage syntax
add an example of HistogramObservers

Reviewed By: jspark1105

Differential Revision: D6878439

fbshipit-source-id: c9521f2daecfc7f0c17de6a944dce58e568e3dbe
2018-08-30 18:11:48 -07:00
Orion Reblitz-Richardson
edb34434ab More changes for hidden visibility (#10692)
Summary:
Let's run CI tests to see what fails given the changes that just landed in https://github.com/pytorch/pytorch/pull/10624

cc mingzhe09088 ezyang Yangqing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10692

Reviewed By: mingzhe09088

Differential Revision: D9423617

Pulled By: orionr

fbshipit-source-id: 3bda1f118d13f8dd8e823727c93167cae747d8cf
2018-08-21 13:39:57 -07:00
Jerry Zhang
3b3aff2ed6 IsType<TensorCPU> -> IsType<Tensor>(CPU) (#10135)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10135

att

Reviewed By: yinghai

Differential Revision: D9121892

fbshipit-source-id: 4a4a3bfc450896b619bf92c92ef218aaaefc3081
2018-08-03 17:24:59 -07:00
Jerry Zhang
aebf3b47ae Remove template parameter from Tensor (#9939)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9939

Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13

Pull Request resolved: https://github.com/pytorch/translate/pull/166

Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125

Closes https://github.com/pytorch/pytorch/pull/9125

Use inheritance for polymorphism, and remove the template parameter.
This changes the templating at call sites; the core implementations will change later.

Before, the Caffe2 Tensor class was fixed at compile time to bind to a particular device/context. With this change, we're making it a runtime property (stored inside the tensor) while preserving the same semantics. For example, one has to specify a device type in order to create a Tensor - there are no uninitialized tensors. More specifically, the changes are:

1. We added an extra argument *DeviceType* to most of the constructors of the tensor, e.g. Tensor(DeviceType type),
2. The semantics of the constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context) have changed: the second context is passed in to enable us to call the templated Copy function. Previously it could be in a different context than the source and target; now we enforce that the context, if provided, has the same device type as src.
3. To preserve the 'get-or-construct' semantics of Blob, we added a specialized getter, Blob::GetMutableTensor, that verifies both that the Blob contains a Tensor and that it is of the correct type.
4. Specifically, the Tensor type is no longer default-constructible (as we don't have unknown-device tensors), and thus some of the code handling STL containers needs to change.

Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.

Reviewed By: ezyang, houseroad

Differential Revision: D9024330

fbshipit-source-id: e0b8295d2dc6ebe2963383ded5af799ad17164ba
2018-07-27 10:56:39 -07:00
Jerry Zhang
969b62f276 Revert D8121878: Remove template parameter from Tensor
Differential Revision:
D8121878

Original commit changeset: 4a5e9a677ba4

fbshipit-source-id: d8e2c0bb145b52fbcca323b22d1d3346f0b3249e
2018-07-26 14:02:04 -07:00
Jerry Zhang
cd5adc7b5f Remove template parameter from Tensor (#13)
Summary:
Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13

Pull Request resolved: https://github.com/pytorch/translate/pull/166

Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125

Closes https://github.com/pytorch/pytorch/pull/9125

Use inheritance for polymorphism, and remove the template parameter.
This changes the templating at call sites; the core implementations will change later.

Before, the Caffe2 Tensor class was fixed at compile time to bind to a particular device/context. With this change, we're making it a runtime property (stored inside the tensor) while preserving the same semantics. For example, one has to specify a device type in order to create a Tensor - there are no uninitialized tensors. More specifically, the changes are:

1. We added an extra argument *DeviceType* to most of the constructors of the tensor, e.g. Tensor(DeviceType type),
2. The semantics of the constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context) have changed: the second context is passed in to enable us to call the templated Copy function. Previously it could be in a different context than the source and target; now we enforce that the context, if provided, has the same device type as src.
3. To preserve the 'get-or-construct' semantics of Blob, we added a specialized getter, Blob::GetMutableTensor, that verifies both that the Blob contains a Tensor and that it is of the correct type.
4. Specifically, the Tensor type is no longer default-constructible (as we don't have unknown-device tensors), and thus some of the code handling STL containers needs to change.

Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.

Reviewed By: xw285cornell

Differential Revision: D8121878

fbshipit-source-id: 4a5e9a677ba4ac82095df959851a054c81eccf81
2018-07-26 10:25:23 -07:00
Bram Wasti
cfb626b638
[caffe2][tiny][fix] Make the build work with profile observers (#6908) 2018-04-24 12:46:48 -07:00
mdschatz
aeb91587e5 [caffe2] Fix observer logic in RNN executor. Remove dynamic casts (#6202)
* Fix observer logic in RNN executor. Remove dynamic casts

* Revert to original design
2018-04-23 15:01:00 -07:00
Lu Fang
8b434d1141 Quick fix on the observer test 2018-03-27 18:10:39 -07:00
Martin Schatz
d5e38a8aee [PerfModel] Add Profile observer
Adds a profile observer to the system. This outputs the following information:
1) Input tensor sizes
2) Argument list
3) Output tensor sizes
4) Operator run time

Example output:
  I0206 14:00:51.217067 1730559 profile_observer_gpu.cc:53] --------- Starting operator Conv op#0 ---------
  I0206 14:00:51.217073 1730559 profile_observer_gpu.cc:65] Input 0: Tensor gpu_0/data of type float. Dims: (32,3,227,227,):
  I0206 14:00:51.217077 1730559 profile_observer_gpu.cc:65] Input 1: Tensor gpu_0/conv1_w of type float. Dims: (64,3,7,7,):
  I0206 14:00:51.217082 1730559 profile_observer_gpu.cc:71] Argument 0: name: "kernel" i: 7
  I0206 14:00:51.217087 1730559 profile_observer_gpu.cc:71] Argument 1: name: "enable_tensor_core" i: 0
  I0206 14:00:51.217089 1730559 profile_observer_gpu.cc:71] Argument 2: name: "exhaustive_search" i: 1
  I0206 14:00:51.217092 1730559 profile_observer_gpu.cc:71] Argument 3: name: "float16_compute" i: 0
  I0206 14:00:51.217095 1730559 profile_observer_gpu.cc:71] Argument 4: name: "stride" i: 2
  I0206 14:00:51.217099 1730559 profile_observer_gpu.cc:71] Argument 5: name: "pad" i: 3
  I0206 14:00:51.217103 1730559 profile_observer_gpu.cc:71] Argument 6: name: "order" s: "NCHW"
  I0206 14:00:51.217105 1730559 profile_observer_gpu.cc:71] Argument 7: name: "ws_nbytes_limit" i: 67108864
  I0206 14:00:51.217109 1730559 profile_observer_gpu.cc:85] Output 0: Tensor gpu_0/conv1 of type float. Dims: (32,64,114,114,):
  I0206 14:00:51.217111 1730559 profile_observer_gpu.cc:88] --------- Finished operator Conv in 1.12685 ms ---------

Example output for internal RNN op (from seq2seq):
  I0219 18:57:06.779331 2960991 profile_observer_gpu.cc:52] --------- Starting operator LSTMUnit op#3161697160-7 ---------
  I0219 18:57:06.779336 2960991 profile_observer_gpu.cc:59] Input 0: Tensor model0/encoder/layer3/lstm/hidden_t_prev of type float. Dims: (1,1,512,):
  I0219 18:57:06.779340 2960991 profile_observer_gpu.cc:59] Input 1: Tensor model0/encoder/layer3/lstm/cell_t_prev of type float. Dims: (1,1,512,):
  I0219 18:57:06.779343 2960991 profile_observer_gpu.cc:59] Input 2: Tensor model0/encoder/layer3/lstm/gates_t of type float. Dims: (1,1,2048,):
  I0219 18:57:06.779346 2960991 profile_observer_gpu.cc:59] Input 3: Tensor encoder_lengths of type int. Dims: (1,):
  I0219 18:57:06.779350 2960991 profile_observer_gpu.cc:59] Input 4: Tensor timestep_rnnexec_t24 of type int. Dims: (1,):
  I0219 18:57:06.779353 2960991 profile_observer_gpu.cc:70] Argument 0: name: "no_sequence_lengths" i: 0
  I0219 18:57:06.779357 2960991 profile_observer_gpu.cc:70] Argument 1: name: "drop_states" i: 0
  I0219 18:57:06.779362 2960991 profile_observer_gpu.cc:70] Argument 2: name: "forget_bias" f: 0
  I0219 18:57:06.779366 2960991 profile_observer_gpu.cc:79] Output 0: Tensor model0/encoder/layer3/lstm/hidden_t of type float. Dims: (1,1,512,):
  I0219 18:57:06.779369 2960991 profile_observer_gpu.cc:79] Output 1: Tensor model0/encoder/layer3/lstm/cell_t of type float. Dims: (1,1,512,):
  I0219 18:57:06.779372 2960991 profile_observer_gpu.cc:89] RecurrentNetwork 3161697160: order: 7
  I0219 18:57:06.779373 2960991 profile_observer_gpu.cc:92] --------- Finished operator LSTMUnit in 0.00153923 ms ---------

Existing deficiencies:
1) Need support to create separate CPU and GPU builds

Once this is approved, I'll port the changes over to OSS
2018-03-27 18:10:39 -07:00
Martin Schatz
8baa563daf Change observer copy() method to take id parameter
This diff is added to support the ProfileObserver in order to differentiate operators in the stepnet properly.  Since copy() is only used in the context of RNNs, the name has been changed to reflect that.
2018-03-27 18:10:39 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Yangqing Jia
611a89c4b6 Remove more protobuf APIs. (#2348)
* Wrap ShutdownProtobufLibrary

* Remove text_format.h header and only put the function in proto_utils.h

* ParseFromString returns bool
2018-03-21 10:29:45 -07:00
Koan-Sin Tan
fab2c07af9 make -DUSE_OBSERVERS=ON work
```
scripts/build_android.sh -DBUILD_TEST=ON -DBUILD_BINARY=ON  -DBUILD_OBSERVERS=On -DUSE_OBSERVERS=ON
```
2018-03-02 20:31:33 +01:00
mdschatz
3c952426fb Add operator attaching net observer
Summary:
Commonly, net observers attach operator observers at construction. This diff separates the logic into a base class to inherit from.
Closes https://github.com/caffe2/caffe2/pull/1806

Reviewed By: salexspb

Differential Revision: D6808623

Pulled By: mdschatz

fbshipit-source-id: 75ef0eea913ef30943541c829c0a976965f42736
2018-01-29 14:34:34 -08:00
Martin Schatz
f19ae690c3 Update observer when attached to RNN ops
Summary: The observer passed to the RNN step net was cloned with the RecurrentOperator as its subject instead of the internal Operator. This diff adds the internal operator as the subject.

Reviewed By: enosair

Differential Revision: D6560996

fbshipit-source-id: 7af4fb0ff8c19795b5c994c5fc6876f3d2ba7bf4
2017-12-14 10:04:20 -08:00
Alexander Sidorov
913a9a736c Backed out changeset 4e1241fe65cd (revert a revert :) )
Summary:
This fixes the issue, but I haven't yet figured out why it is
happening.

Reviewed By: bwasti

Differential Revision: D6437378

fbshipit-source-id: bf983c9b6f57647423423ec6b22e0f9d2b170e74
2017-11-29 13:33:15 -08:00
Aapo Kyrola
e0c8c539e7 Backed out changeset 119623addbbd
Summary: Unlanding D6327460 because it seems to be causing instability.

Differential Revision: D6377117

fbshipit-source-id: 4e1241fe65cd4c7a127fa6fa724f60b75965a096
2017-11-20 16:17:52 -08:00
Alexander Sidorov
8d321b6cd3 Improve observer framework overhead
Summary:
There were several regressions over time. It looks like the main
one is a recent change that introduced a map which we iterate over for each
operator call. I made some other small optimizations to our Facebook
observer. Overall this seems to cut about 1000ns from an operator. At a
rate of 36B operators per second this should be about 750 type VI
hosts.

Reviewed By: bwasti

Differential Revision: D6327460

fbshipit-source-id: 119623addbbd575486906959d65603eea8d4f5e6
2017-11-17 08:36:35 -08:00
Dmytro Dzhulgakov
2bf4dec9ff Add missing CMakeFile in caffe2/observers
Summary:
Broken by 43075b779b
Closes https://github.com/caffe2/caffe2/pull/1473

Differential Revision: D6333460

Pulled By: dzhulgakov

fbshipit-source-id: 94a06b53650b02ff5938367896b52f47cdbf811a
2017-11-14 21:46:57 -08:00
Qinqing Zheng
c77f0cb5e6 Attach observers to operators inside step net
Summary: Pass the list of observers to rnnExecutor_ and attach them to operators

Reviewed By: akyrola

Differential Revision: D6279655

fbshipit-source-id: 086dde1bf6edbfb36082d6b4de33ec41f0bbefab
2017-11-14 15:06:38 -08:00
Bram Wasti
7d16d320d5 expose observers to python, add multiple observers per observable
Summary: observer framework can now be used in python + a small writeup of how to use it.  this is D6035393 with a fix for ct-scan

Reviewed By: salexspb

Differential Revision: D6066380

fbshipit-source-id: 896c4c580d4387240b81ac2dbbc43db51d4bfeb9
2017-10-16 14:32:56 -07:00
Scott Yost
a7a81351f2 Revert D6035393: [caffe2] expose observers to python, add multiple observers per observable
Summary:
This reverts commit 4563cf0203095fa979bb2160621cd16dd22ff830

bypass-lint

Differential Revision: D6035393

fbshipit-source-id: 090fba774ce433904f7ef769dda75c2fbbf784a8
2017-10-14 21:47:34 -07:00
Bram Wasti
58fe66e337 expose observers to python, add multiple observers per observable
Summary: observer framework can now be used in python + a small writeup of how to use it

Reviewed By: sf-wind

Differential Revision: D6035393

fbshipit-source-id: 4563cf0203095fa979bb2160621cd16dd22ff830
2017-10-14 13:09:29 -07:00
Yangqing Jia
b1508e8e86 Revert D5905002: [caffe2] expose observers to python
Summary:
This reverts commit e40ec24a55e08fb73beea9b4f3b68e71fc66ffb1

bypass-lint

Differential Revision: D5905002

fbshipit-source-id: 4f1b79d9a318978f6b74565f633f34b9701a9d5c
2017-10-10 22:12:00 -07:00
Bram Wasti
63caca89db expose observers to python
Summary: observer framework can now be used in python + a small writeup of how to use it

Reviewed By: salexspb

Differential Revision: D5905002

fbshipit-source-id: e40ec24a55e08fb73beea9b4f3b68e71fc66ffb1
2017-10-10 16:10:41 -07:00