Commit Graph

48 Commits

Author SHA1 Message Date
Shashank Chaudhry
06d1be2447 [NOOP][clangformat][codemod] Enable CLANGFORMAT for caffe2/caffe2/* (#67624)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67624

Test Plan: Visual inspection. Sandcastle.

Reviewed By: malfet

Differential Revision: D31986628

fbshipit-source-id: c872bded7325997a2945dbf5d4d052628dcb3659
2021-11-02 22:14:04 -07:00
Nikita Shulga
a9b0a921d5 Disable avoid-non-const-global-variables lint check (#62008)
Summary:
As the GoogleTest `TEST` macro is non-compliant with it, as is `DEFINE_DISPATCH`

All changes but the ones to `.clang-tidy` are generated using following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`;  do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008

Reviewed By: driazati, r-barnes

Differential Revision: D29838584

Pulled By: malfet

fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
2021-07-22 18:04:40 -07:00
Nikita Shulga
3a66a1cb99 [clang-tidy] Exclude cppcoreguidelines-avoid-magic-numbers (#57841)
Summary:
Add cppcoreguidelines-avoid-magic-numbers exclusion to clang-tidy
Remove existing nolint warnings using the following script:
```
for file in `git ls-files | grep -v \.py`; do gsed '/^ *\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-magic-numbers)/d' -i  $file; done
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57841

Reviewed By: samestep

Differential Revision: D28295045

Pulled By: malfet

fbshipit-source-id: 7c6e8d1213c9593f169ed3df6a916498f1a97163
2021-05-07 20:02:33 -07:00
Nikita Shulga
4cb534f92e Make PyTorch code-base clang-tidy compliant (#56892)
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892

Reviewed By: H-Huang

Differential Revision: D27991944

Pulled By: malfet

fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
2021-04-28 14:10:25 -07:00
Jane Xu
71ca600af9 Renaming CAFFE2_API to TORCH_API (#49496)
Summary:
Since caffe2 and torch have been consolidated, CAFFE2_API should be merged with TORCH_API. Addresses a TODO.

Manually edited some references of the removed `CAFFE2_API`:
* `CONTRIBUTING.md`
* `caffe2/proto/CMakeLists.txt`
* `cmake/ProtoBuf.cmake`
* `c10/macros/Export.h`
* `torch/csrc/WindowsTorchApiMacro.h`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49496

Reviewed By: malfet, samestep

Differential Revision: D25600726

Pulled By: janeyx99

fbshipit-source-id: 7e068d959e397ac183c097d7e9a9afeca5ddd782
2020-12-18 10:54:50 -08:00
Shai Szulanski
0ddaaf6a92 [codemod][caffe2] Run clang-format - 5/7
Summary:
This directory is opted-in to clang-format but is not format-clean. This blocks continuous formatting from being enabled on fbcode, and causes hassle for other codemods that leave inconsistent formatting. This diff runs clang-format, which is widely used and considered safe.

If you are unhappy with the formatting of a particular block, please *accept this diff* and then in a stacked commit undo the change and wrap that code in `// clang-format off` and `// clang-format on`, or `/* clang-format off */` and `/* clang-format on */`.

drop-conflicts

Test Plan: sandcastleit

Reviewed By: jerryzh168

Differential Revision: D22311706

fbshipit-source-id: 1ca59a82e96156a4a5dfad70ba3e64d44c5e762a
2020-06-30 15:45:11 -07:00
Michael Ranieri
51d969e86a preprocessor cleanup (#33957)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33957

Lots of small preprocessor warning cleanups for Windows

Test Plan: CI green

Reviewed By: malfet, albanD

Differential Revision: D20153582

fbshipit-source-id: 18fd61c466fd1f55ededdae4448b3009a9cedc04
2020-03-02 13:37:19 -08:00
Sebastian Messmer
643ca5def2 Replace c10::guts::stuff with std::stuff (#30915)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30915

Since we now have C++14, we don't need these c10::guts helpers anymore
ghstack-source-id: 95777609

Test Plan: waitforsandcastle

Differential Revision: D18869639

fbshipit-source-id: 97716f932297c64c6e814410ac47b444c33d4e2e
2019-12-16 13:57:19 -08:00
Martin Schatz
5b835682e3 Remove GPU dependency from ProfileObserver (#17592)
Summary:
Remove GPU dependency and register ProfileObserver.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17592

Reviewed By: ezyang

Differential Revision: D14265801

Pulled By: mdschatz

fbshipit-source-id: f98c0c32653c64a8b087c58ece4f864dfbe1d4b8
2019-03-04 10:00:46 -08:00
Jerry Zhang
0c32e1b43e use C10_MOBILE/ANDROID/IOS (#15363)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15363

Didn't define C10_MOBILE in the numa file move diff: D13380559
move CAFFE2_MOBILE/ANDROID/IOS to c10

```
codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_MOBILE" "C10_MOBILE"
codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_ANDROID" "C10_ANDROID"
codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_IOS" "C10_IOS"

```

i-am-not-moving-c2-to-c10

Reviewed By: marcinkwiatkowski

Differential Revision: D13490020

fbshipit-source-id: c4f01cacbefc0f16d5de94155c26c92fd5d780e4
2019-01-09 15:08:20 -08:00
Aldian Fazrihady
e27d77815d Prevent profile_observer_test from being run by CPU test (#14168)
Summary:
Fix CMakeLists.txt so the CPU test won't run profile_observer_test.cc, as it currently supports only GPU
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14168

Differential Revision: D13356274

Pulled By: ezyang

fbshipit-source-id: 7d105f2e18675e5fab129864958148b0f18d582c
2018-12-05 22:34:29 -08:00
Junjie Bai
fade36668a Revert D10428917: [Caffe2] Add cost into profile observer
Differential Revision:
D10428917

Original commit changeset: 7c100e551bdd

fbshipit-source-id: 5164d9ba61cc103eccfdeb91a5cc140cea31a819
2018-11-16 23:30:07 -08:00
Junjie Bai
a43037fa11 Revert D10439558: Add cost for non-linear ops
Differential Revision:
D10439558

Original commit changeset: 9aeb05bac8b5

fbshipit-source-id: f00977b4f95bdd500d254eb44fb5b0c816506ee4
2018-11-16 23:30:05 -08:00
Haixin Liu
cbc94894fb Add cost for non-linear ops (#13327)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13327

Add a cost inference function to non-linear ops. Since the actual flops of a non-linear operator depend on the implementation, we use the number of non-linear operations as a proxy for the analytical flops of non-linear operators.

Reviewed By: jspark1105

Differential Revision: D10439558

fbshipit-source-id: 9aeb05bac8b5c7ae5d351ebf365e0a81cf4fc227
2018-11-16 18:53:30 -08:00
Haixin Liu
86dc3ab252 Add cost into profile observer (#12793)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12793

Add analytical cost into profile observer. It includes the op level cost information for each op run and net level aggregated cost information for each op type.

It outputs the following information:
1. analytical flops
2. analytical bytes_read
3. analytical bytes_written

Example output at op level:
```
I1017 14:58:14.245978 3686541 profile_observer_gpu.cc:26] --------- Starting operator FC op#24 ---------
I1017 14:58:14.246049 3686541 profile_observer_gpu.cc:33] Input 0: Tensor model1/embedded_encoder_inputs of type float. Dims: (17,1,256,):
I1017 14:58:14.246109 3686541 profile_observer_gpu.cc:33] Input 1: Tensor model1/encoder/layer0/fw/milstm/i2h_w of type float. Dims: (2048,256,):
I1017 14:58:14.246176 3686541 profile_observer_gpu.cc:33] Input 2: Tensor model1/encoder/layer0/fw/milstm/i2h_b of type float. Dims: (2048,):
I1017 14:58:14.246217 3686541 profile_observer_gpu.cc:44] Argument 0: name: "use_cudnn" i: 1
I1017 14:58:14.246271 3686541 profile_observer_gpu.cc:44] Argument 1: name: "cudnn_exhaustive_search" i: 0
I1017 14:58:14.246338 3686541 profile_observer_gpu.cc:44] Argument 2: name: "order" s: "NHWC"
I1017 14:58:14.246372 3686541 profile_observer_gpu.cc:44] Argument 3: name: "axis" i: 2
I1017 14:58:14.246418 3686541 profile_observer_gpu.cc:44] Argument 4: name: "quantization_scheme" i: 1
I1017 14:58:14.246470 3686541 profile_observer_gpu.cc:53] Output 0: Tensor model1/encoder/layer0/fw/milstm/i2h of type float. Dims: (17,1,2048,):
I1017 14:58:14.246596 3686541 profile_observer_gpu.cc:61] Cost (flops, bytes_read, bytes_written):
I1017 14:58:14.246649 3686541 profile_observer_gpu.cc:62]        17860608 2122752 139264
I1017 14:58:14.246677 3686541 profile_observer_gpu.cc:64] --------- Finished operator FC in 0.764221 ms ---------
```
Example output at net level:
```
I1017 11:13:44.675585 3146691 profile_observer_gpu.cc:165] ================ Detailed stats for net model0/encoder/layer0/bw/milstm ================
I1017 11:13:44.675662 3146691 profile_observer_gpu.cc:167] Cost (flops, bytes_read, bytes_written) per operator type:
I1017 11:13:44.675706 3146691 profile_observer_gpu.cc:169]        20992000 42045440 81920 FC
I1017 11:13:44.675745 3146691 profile_observer_gpu.cc:169]           20480 163840 81920 Mul
I1017 11:13:44.675824 3146691 profile_observer_gpu.cc:169]           20480 163840 81920 Sum
I1017 11:13:44.675878 3146691 profile_observer_gpu.cc:169]               0 0 0 ElementwiseLinear
I1017 11:13:44.675909 3146691 profile_observer_gpu.cc:169]               0 0 0 LSTMUnit
I1017 11:13:44.675958 3146691 profile_observer_gpu.cc:169]               0 0 0 rnn_internal_apply_link
```

Reviewed By: mdschatz

Differential Revision: D10428917

fbshipit-source-id: 7c100e551bdd3ac8d7c09be12c72d70a2d67cae1
2018-11-16 18:53:28 -08:00
ArutyunovG
8e91da4cb3 Windows shared build (#13550)
Summary:
Hi guys,

I'd like to build Caffe2 with more supported options on Windows with Microsoft Visual Studio.
This is the first pull request.
Running scripts/build_windows_shared.bat builds Caffe2 with both CMAKE_BUILD_TYPE=Debug and CMAKE_BUILD_TYPE=Release under Visual Studio 14 2015.
CUDA is 9.0, cuDNN is 7.0.5; glog, gflags, and lmdb are supported on my system.
Python is 3.5, and Detectron works from the Python interface as well.
It was even possible to debug Detectron code and step into caffe2_gpu.dll with PDBs built.

What is disappointing is that the c10/experimental ops don't build with this Visual Studio generator, so I added a special option INCLUDE_EXPERIMENTAL_C10_OPS (default ON) to deal with it in build_windows_shared.bat.

After this pull request, the next step is to add Visual Studio 2017 support to the script.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13550

Reviewed By: ezyang

Differential Revision: D13042597

Pulled By: orionr

fbshipit-source-id: f313f909f599cd582a1d000eff766eef3a9fc4fc
2018-11-16 12:16:28 -08:00
Junjie Bai
f54ab540af Rename cuda_gpu_id to device_id in DeviceOption (#12456)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12456

codemod with 'Yes to all'
codemod -d . --extensions h,cc,cpp,cu,py,proto,pbtxt,pb.txt,config cuda_gpu_id device_id

Overload TextFormat::ParseFromString to do string replace when parsing from protobuf format

Reviewed By: Yangqing

Differential Revision: D10240535

fbshipit-source-id: 5e6992bec961214be8dbe26f16f5794154a22b25
2018-10-09 15:54:04 -07:00
Junjie Bai
ff608a9ff3 Back out "Revert D10123245: Back out "codemod cuda_gpu_id to device_id"" (#12232)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12232

Original commit changeset: fca91fea58b7

This adds proper modifications to the DeviceType <->DeviceOption conversion code added in D10033396

Reviewed By: jerryzh168

Differential Revision: D10132473

fbshipit-source-id: 801ef777e2950982cb47b48051b1471a0a91e64b
2018-10-01 21:54:52 -07:00
Rick Ratmansky
3010dc4208 Revert D10123245: Back out "codemod cuda_gpu_id to device_id"
Differential Revision:
D10123245

Original commit changeset: d83da8e00a12

fbshipit-source-id: fca91fea58b7df208edc2e218a1d514f9821ec7b
2018-10-01 12:22:36 -07:00
Yang Liu
7d7d336c45 Back out "codemod cuda_gpu_id to device_id"
Summary:
Original commit changeset: f5614a5d2607

D9986213 is causing Multifeed Aggregator a [huge performance difference](https://our.intern.facebook.com/intern/ads/analyze_canary/412951953278781781/) and has been blocking the aggregator push since last Friday night: https://fburl.com/feedtools/b6izvwjz
We need to land this revert ASAP to unblock the aggregator push.

Reviewed By: orionr

Differential Revision: D10123245

fbshipit-source-id: d83da8e00a1250f5d09811a0a587c127e377aab2
2018-10-01 11:31:14 -07:00
Junjie Bai
3eb5940cf5 codemod cuda_gpu_id to device_id (#12022)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12022

codemod -d . --extensions h,cc,cpp,cu,py,proto,pbtxt,pb.txt,config cuda_gpu_id device_id

codemod with 'Yes to all'

Reviewed By: orionr

Differential Revision: D9986213

fbshipit-source-id: f5614a5d26078817aee8caf79a494abfd6a95ff1
2018-09-27 20:24:53 -07:00
Sebastian Messmer
ce6906b051 Narrowing Blob (#11167)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11167

Narrow the Blob API as preparation for merging Blob/IValue

- get rid of templated IsType and Operator::InputIsType / OutputIsType
- Use 'using' instead of 'typedef' for DestroyCall (just for readability)

Reviewed By: ezyang

Differential Revision: D9623916

fbshipit-source-id: 952f0b0cf5a525094b02e8d2798dd57a56a9e1d8
2018-09-10 12:40:16 -07:00
Hector Yuen
a6cb41486d update documentation for observers
Summary:
update to the latest observer usage syntax
add an example of HistogramObservers

Reviewed By: jspark1105

Differential Revision: D6878439

fbshipit-source-id: c9521f2daecfc7f0c17de6a944dce58e568e3dbe
2018-08-30 18:11:48 -07:00
Orion Reblitz-Richardson
edb34434ab More changes for hidden visibility (#10692)
Summary:
Let's run CI tests to see what fails given the changes that just landed in https://github.com/pytorch/pytorch/pull/10624

cc mingzhe09088 ezyang Yangqing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10692

Reviewed By: mingzhe09088

Differential Revision: D9423617

Pulled By: orionr

fbshipit-source-id: 3bda1f118d13f8dd8e823727c93167cae747d8cf
2018-08-21 13:39:57 -07:00
Jerry Zhang
3b3aff2ed6 IsType<TensorCPU> -> IsType<Tensor>(CPU) (#10135)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10135

att

Reviewed By: yinghai

Differential Revision: D9121892

fbshipit-source-id: 4a4a3bfc450896b619bf92c92ef218aaaefc3081
2018-08-03 17:24:59 -07:00
Jerry Zhang
aebf3b47ae Remove template parameter from Tensor (#9939)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9939

Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13

Pull Request resolved: https://github.com/pytorch/translate/pull/166

Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125

Closes https://github.com/pytorch/pytorch/pull/9125

Use inheritance for polymorphism, and remove the template parameter.
This changes the templating at call sites; the core implementations will change later.

Before, the Caffe2 Tensor class was fixed at compile time to bind to a particular device/context. With this change, we're making it a runtime property (stored inside the tensor) while preserving the same semantics. For example, one has to specify a device type in order to create a Tensor - there are no uninitialized tensors. More specifically, the changes are:

1. We added an extra argument *DeviceType* to most of the constructors of the tensor, e.g. Tensor(DeviceType type),
2. The semantics of the constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context) have changed: the second context is passed in to enable us to call the templated Copy function. Previously it could be in a different context than the source and target; now we enforce that the context, if provided, has the same device type as src.
3. To preserve the 'get-or-construct' semantics of Blob, we added a specialized getter, Blob::GetMutableTensor, that verifies both that the Blob contains a Tensor and that it is of the correct type.
4. Specifically, the Tensor type is no longer default-constructible (as we don't have unknown-device tensors), and thus some of the code handling STL containers needs to change.

Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.

Reviewed By: ezyang, houseroad

Differential Revision: D9024330

fbshipit-source-id: e0b8295d2dc6ebe2963383ded5af799ad17164ba
2018-07-27 10:56:39 -07:00
Jerry Zhang
969b62f276 Revert D8121878: Remove template parameter from Tensor
Differential Revision:
D8121878

Original commit changeset: 4a5e9a677ba4

fbshipit-source-id: d8e2c0bb145b52fbcca323b22d1d3346f0b3249e
2018-07-26 14:02:04 -07:00
Jerry Zhang
cd5adc7b5f Remove template parameter from Tensor (#13)
Summary:
Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13

Pull Request resolved: https://github.com/pytorch/translate/pull/166

Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125

Closes https://github.com/pytorch/pytorch/pull/9125

Use inheritance for polymorphism, and remove the template parameter.
This changes the templating at call sites; the core implementations will change later.

Before, the Caffe2 Tensor class was fixed at compile time to bind to a particular device/context. With this change, we're making it a runtime property (stored inside the tensor) while preserving the same semantics. For example, one has to specify a device type in order to create a Tensor - there are no uninitialized tensors. More specifically, the changes are:

1. We added an extra argument *DeviceType* to most of the constructors of the tensor, e.g. Tensor(DeviceType type),
2. The semantics of the constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context) have changed: the second context is passed in to enable us to call the templated Copy function. Previously it could be in a different context than the source and target; now we enforce that the context, if provided, has the same device type as src.
3. To preserve the 'get-or-construct' semantics of Blob, we added a specialized getter, Blob::GetMutableTensor, that verifies both that the Blob contains a Tensor and that it is of the correct type.
4. Specifically, the Tensor type is no longer default-constructible (as we don't have unknown-device tensors), and thus some of the code handling STL containers needs to change.

Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.

Reviewed By: xw285cornell

Differential Revision: D8121878

fbshipit-source-id: 4a5e9a677ba4ac82095df959851a054c81eccf81
2018-07-26 10:25:23 -07:00
Bram Wasti
cfb626b638
[caffe2][tiny][fix] Make the build work with profile observers (#6908) 2018-04-24 12:46:48 -07:00
mdschatz
aeb91587e5 [caffe2] Fix observer logic in RNN executor. Remove dynamic casts (#6202)
* Fix observer logic in RNN executor. Remove dynamic casts

* Revert to original design
2018-04-23 15:01:00 -07:00
Lu Fang
8b434d1141 Quick fix on the observer test 2018-03-27 18:10:39 -07:00
Martin Schatz
d5e38a8aee [PerfModel] Add Profile observer
Adds a profile observer to the system. This outputs the following information:
1) Input tensor sizes
2) Argument list
3) Output tensor sizes
4) Operator run time

Example output:
  I0206 14:00:51.217067 1730559 profile_observer_gpu.cc:53] --------- Starting operator Conv op#0 ---------
  I0206 14:00:51.217073 1730559 profile_observer_gpu.cc:65] Input 0: Tensor gpu_0/data of type float. Dims: (32,3,227,227,):
  I0206 14:00:51.217077 1730559 profile_observer_gpu.cc:65] Input 1: Tensor gpu_0/conv1_w of type float. Dims: (64,3,7,7,):
  I0206 14:00:51.217082 1730559 profile_observer_gpu.cc:71] Argument 0: name: "kernel" i: 7
  I0206 14:00:51.217087 1730559 profile_observer_gpu.cc:71] Argument 1: name: "enable_tensor_core" i: 0
  I0206 14:00:51.217089 1730559 profile_observer_gpu.cc:71] Argument 2: name: "exhaustive_search" i: 1
  I0206 14:00:51.217092 1730559 profile_observer_gpu.cc:71] Argument 3: name: "float16_compute" i: 0
  I0206 14:00:51.217095 1730559 profile_observer_gpu.cc:71] Argument 4: name: "stride" i: 2
  I0206 14:00:51.217099 1730559 profile_observer_gpu.cc:71] Argument 5: name: "pad" i: 3
  I0206 14:00:51.217103 1730559 profile_observer_gpu.cc:71] Argument 6: name: "order" s: "NCHW"
  I0206 14:00:51.217105 1730559 profile_observer_gpu.cc:71] Argument 7: name: "ws_nbytes_limit" i: 67108864
  I0206 14:00:51.217109 1730559 profile_observer_gpu.cc:85] Output 0: Tensor gpu_0/conv1 of type float. Dims: (32,64,114,114,):
  I0206 14:00:51.217111 1730559 profile_observer_gpu.cc:88] --------- Finished operator Conv in 1.12685 ms ---------

Example output for internal RNN op (from seq2seq):
  I0219 18:57:06.779331 2960991 profile_observer_gpu.cc:52] --------- Starting operator LSTMUnit op#3161697160-7 ---------
  I0219 18:57:06.779336 2960991 profile_observer_gpu.cc:59] Input 0: Tensor model0/encoder/layer3/lstm/hidden_t_prev of type float. Dims: (1,1,512,):
  I0219 18:57:06.779340 2960991 profile_observer_gpu.cc:59] Input 1: Tensor model0/encoder/layer3/lstm/cell_t_prev of type float. Dims: (1,1,512,):
  I0219 18:57:06.779343 2960991 profile_observer_gpu.cc:59] Input 2: Tensor model0/encoder/layer3/lstm/gates_t of type float. Dims: (1,1,2048,):
  I0219 18:57:06.779346 2960991 profile_observer_gpu.cc:59] Input 3: Tensor encoder_lengths of type int. Dims: (1,):
  I0219 18:57:06.779350 2960991 profile_observer_gpu.cc:59] Input 4: Tensor timestep_rnnexec_t24 of type int. Dims: (1,):
  I0219 18:57:06.779353 2960991 profile_observer_gpu.cc:70] Argument 0: name: "no_sequence_lengths" i: 0
  I0219 18:57:06.779357 2960991 profile_observer_gpu.cc:70] Argument 1: name: "drop_states" i: 0
  I0219 18:57:06.779362 2960991 profile_observer_gpu.cc:70] Argument 2: name: "forget_bias" f: 0
  I0219 18:57:06.779366 2960991 profile_observer_gpu.cc:79] Output 0: Tensor model0/encoder/layer3/lstm/hidden_t of type float. Dims: (1,1,512,):
  I0219 18:57:06.779369 2960991 profile_observer_gpu.cc:79] Output 1: Tensor model0/encoder/layer3/lstm/cell_t of type float. Dims: (1,1,512,):
  I0219 18:57:06.779372 2960991 profile_observer_gpu.cc:89] RecurrentNetwork 3161697160: order: 7
  I0219 18:57:06.779373 2960991 profile_observer_gpu.cc:92] --------- Finished operator LSTMUnit in 0.00153923 ms ---------

Existing deficiencies:
1) Need support to create separate CPU and GPU builds

Once this is approved, I'll port the changes over to OSS
2018-03-27 18:10:39 -07:00
Martin Schatz
8baa563daf Change observer copy() method to take id parameter
This diff is added to support the ProfileObserver in order to differentiate operators in the stepnet properly.  Since copy() is only used in the context of RNNs, the name has been changed to reflect that.
2018-03-27 18:10:39 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Yangqing Jia
611a89c4b6 Remove more protobuf APIs. (#2348)
* Wrap ShutdownProtobufLibrary

* Remove text_format.h header and only put the function in proto_utils.h

* ParseFromString returns bool
2018-03-21 10:29:45 -07:00
Koan-Sin Tan
fab2c07af9 make -DUSE_OBSERVERS=ON work
```
scripts/build_android.sh -DBUILD_TEST=ON -DBUILD_BINARY=ON  -DBUILD_OBSERVERS=On -DUSE_OBSERVERS=ON
```
2018-03-02 20:31:33 +01:00
mdschatz
3c952426fb Add operator attaching net observer
Summary:
Commonly, net observers attach operator observers at construction. This diff separates the logic into a base class to inherit from.
Closes https://github.com/caffe2/caffe2/pull/1806

Reviewed By: salexspb

Differential Revision: D6808623

Pulled By: mdschatz

fbshipit-source-id: 75ef0eea913ef30943541c829c0a976965f42736
2018-01-29 14:34:34 -08:00
Martin Schatz
f19ae690c3 Update observer when attached to RNN ops
Summary: The observer passed to the RNN step net was cloned with the RecurrentOperator as its subject instead of the internal Operator. This diff adds the internal operator as the subject.

Reviewed By: enosair

Differential Revision: D6560996

fbshipit-source-id: 7af4fb0ff8c19795b5c994c5fc6876f3d2ba7bf4
2017-12-14 10:04:20 -08:00
Alexander Sidorov
913a9a736c Backed out changeset 4e1241fe65cd (revert a revert :) )
Summary:
This fixes the issue, but I haven't yet figured out why it is
happening.

Reviewed By: bwasti

Differential Revision: D6437378

fbshipit-source-id: bf983c9b6f57647423423ec6b22e0f9d2b170e74
2017-11-29 13:33:15 -08:00
Aapo Kyrola
e0c8c539e7 Backed out changeset 119623addbbd
Summary: Unlanding D6327460 because it seems to be causing instability.

Differential Revision: D6377117

fbshipit-source-id: 4e1241fe65cd4c7a127fa6fa724f60b75965a096
2017-11-20 16:17:52 -08:00
Alexander Sidorov
8d321b6cd3 Improve observer framework overhead
Summary:
There were several regressions over time. It looks like the main
one is a recent change that introduced a map which we iterate over for each
operator call. I made some other small optimizations to our Facebook
observer. Overall this seems to cut about 1000ns from an operator. At a
rate of 36B operators per second this should be about 750 type VI
hosts.

Reviewed By: bwasti

Differential Revision: D6327460

fbshipit-source-id: 119623addbbd575486906959d65603eea8d4f5e6
2017-11-17 08:36:35 -08:00
Dmytro Dzhulgakov
2bf4dec9ff Add missing CMakeFile in caffe2/observers
Summary:
Broken by 43075b779b
Closes https://github.com/caffe2/caffe2/pull/1473

Differential Revision: D6333460

Pulled By: dzhulgakov

fbshipit-source-id: 94a06b53650b02ff5938367896b52f47cdbf811a
2017-11-14 21:46:57 -08:00
Qinqing Zheng
c77f0cb5e6 Attach observers to operators inside step net
Summary: Pass the list of observers to rnnExecutor_ and attach them to operators

Reviewed By: akyrola

Differential Revision: D6279655

fbshipit-source-id: 086dde1bf6edbfb36082d6b4de33ec41f0bbefab
2017-11-14 15:06:38 -08:00
Bram Wasti
7d16d320d5 expose observers to python, add multiple observers per observable
Summary: observer framework can now be used in python + a small writeup of how to use it.  this is D6035393 with a fix for ct-scan

Reviewed By: salexspb

Differential Revision: D6066380

fbshipit-source-id: 896c4c580d4387240b81ac2dbbc43db51d4bfeb9
2017-10-16 14:32:56 -07:00
Scott Yost
a7a81351f2 Revert D6035393: [caffe2] expose observers to python, add multiple observers per observable
Summary:
This reverts commit 4563cf0203095fa979bb2160621cd16dd22ff830

bypass-lint

Differential Revision: D6035393

fbshipit-source-id: 090fba774ce433904f7ef769dda75c2fbbf784a8
2017-10-14 21:47:34 -07:00
Bram Wasti
58fe66e337 expose observers to python, add multiple observers per observable
Summary: observer framework can now be used in python + a small writeup of how to use it

Reviewed By: sf-wind

Differential Revision: D6035393

fbshipit-source-id: 4563cf0203095fa979bb2160621cd16dd22ff830
2017-10-14 13:09:29 -07:00
Yangqing Jia
b1508e8e86 Revert D5905002: [caffe2] expose observers to python
Summary:
This reverts commit e40ec24a55e08fb73beea9b4f3b68e71fc66ffb1

bypass-lint

Differential Revision: D5905002

fbshipit-source-id: 4f1b79d9a318978f6b74565f633f34b9701a9d5c
2017-10-10 22:12:00 -07:00
Bram Wasti
63caca89db expose observers to python
Summary: observer framework can now be used in python + a small writeup of how to use it

Reviewed By: salexspb

Differential Revision: D5905002

fbshipit-source-id: e40ec24a55e08fb73beea9b4f3b68e71fc66ffb1
2017-10-10 16:10:41 -07:00