Summary:
Fix CMakeLists.txt, so the test for CPU won't run profile_observer_test.cc, as currently it only supports GPU
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14168
Differential Revision: D13356274
Pulled By: ezyang
fbshipit-source-id: 7d105f2e18675e5fab129864958148b0f18d582c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13327
Add cost inference function to non-linear ops. Since the actual flops of the non-linear operator depends on the implementation, we use the number of non-linear operations as the proxy for the analytical flops for non-linear operators.
Reviewed By: jspark1105
Differential Revision: D10439558
fbshipit-source-id: 9aeb05bac8b5c7ae5d351ebf365e0a81cf4fc227
Summary:
Hi guys,
I'd like to build Caffe2 with more supported options in Windows with Microsoft Visual Studios.
This is the first pull request.
Running scripts/build_windows_shared.bat is able to build Caffe2 with both CMAKE_BUILD_TYPE=Debug and CMAKE_BUILD_TYPE=Release with Visual Studio 14 2015.
CUDA is 9.0, cudnn is 7.0.5, glog, gflags and lmdb are supported on my system.
Python is 3.5, Detectron works from python interface as well.
It was even possible to debug detectron code and step into caffe2_gpu.dll with pdbs built.
What is disappointing, that c10/experimental ops don't build with this Visual Studio generator, I added special option INCLUDE_EXPERIMENTAL_C10_OPS (default ON) to deal with it in build_windows_shared.bat.
After this pull request the next step is to add Visual Studio 2017 support in the script.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13550
Reviewed By: ezyang
Differential Revision: D13042597
Pulled By: orionr
fbshipit-source-id: f313f909f599cd582a1d000eff766eef3a9fc4fc
Summary:
Original commit changeset: f5614a5d2607
D9986213 is causing Multifeed Aggregator a [huge performance different](https://our.intern.facebook.com/intern/ads/analyze_canary/412951953278781781/) and is blocking aggregator push since last Friday night: https://fburl.com/feedtools/b6izvwjz
We need to land this revert ASAP to unblock aggregator push.
Reviewed By: orionr
Differential Revision: D10123245
fbshipit-source-id: d83da8e00a1250f5d09811a0a587c127e377aab2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11167
Narrow the Blob API as preparation for merging Blob/IValue
- get rid of templated IsType and Operator::InputIsType / OutputIsType
- Use 'using' instead of 'typedef' for DestroyCall (just for readability)
Reviewed By: ezyang
Differential Revision: D9623916
fbshipit-source-id: 952f0b0cf5a525094b02e8d2798dd57a56a9e1d8
Summary:
update to the latest observer usage syntax
add an example of HistogramObservers
Reviewed By: jspark1105
Differential Revision: D6878439
fbshipit-source-id: c9521f2daecfc7f0c17de6a944dce58e568e3dbe
Summary:
Let's run CI tests to see what fails given the changes that just landed in https://github.com/pytorch/pytorch/pull/10624
cc mingzhe09088 ezyang Yangqing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10692
Reviewed By: mingzhe09088
Differential Revision: D9423617
Pulled By: orionr
fbshipit-source-id: 3bda1f118d13f8dd8e823727c93167cae747d8cf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9939
Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13
Pull Request resolved: https://github.com/pytorch/translate/pull/166
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125
Closes https://github.com/pytorch/pytorch/pull/9125
Use inheritance for polymorphism, and remove template parameter
This is to change the templating in call sites, the core implementations will change later
Before Caffe2 Tensor class was compile-time fixed to bind to a particular device/context. With this change, we're making it a runtime property (stored inside the tensor), but preserve the same semantics. For example, one has to specify device type in order to create a Tensor - there are no uninitialized tensors. More specifically the changes are:
1. We added an extra argument *DeviceType* to most of the constructors of the tensor, e.g. (Tensor(DeviceType type)),
2. Semantics of constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context); is changed, in this constructor, the second context is passed in to enable us to call the templated Copy function, it could be in a different context as source and target previously, now we'll enforce that the context should have same device type as src, if it is provided.
3. To preserve 'get-or-construct' semantics of Blob, we added specialized getter Blob::GetMutableTensor that verifies both that Blob contains a Tensor and that it's of a correct type
4. Specifically, Tensor type is not default-constructible any more (as we don't have unknown device tensors) and thus some of the code handling STL containers needs to change
Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.
Reviewed By: ezyang, houseroad
Differential Revision: D9024330
fbshipit-source-id: e0b8295d2dc6ebe2963383ded5af799ad17164ba
Summary:
Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13
Pull Request resolved: https://github.com/pytorch/translate/pull/166
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125
Closes https://github.com/pytorch/pytorch/pull/9125
Use inheritance for polymorphism, and remove template parameter
This is to change the templating in call sites, the core implementations will change later
Before Caffe2 Tensor class was compile-time fixed to bind to a particular device/context. With this change, we're making it a runtime property (stored inside the tensor), but preserve the same semantics. For example, one has to specify device type in order to create a Tensor - there are no uninitialized tensors. More specifically the changes are:
1. We added an extra argument *DeviceType* to most of the constructors of the tensor, e.g. (Tensor(DeviceType type)),
2. Semantics of constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context); is changed, in this constructor, the second context is passed in to enable us to call the templated Copy function, it could be in a different context as source and target previously, now we'll enforce that the context should have same device type as src, if it is provided.
3. To preserve 'get-or-construct' semantics of Blob, we added specialized getter Blob::GetMutableTensor that verifies both that Blob contains a Tensor and that it's of a correct type
4. Specifically, Tensor type is not default-constructible any more (as we don't have unknown device tensors) and thus some of the code handling STL containers needs to change
Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.
Reviewed By: xw285cornell
Differential Revision: D8121878
fbshipit-source-id: 4a5e9a677ba4ac82095df959851a054c81eccf81
This diff is added to support the ProfileObserver in order to differentiate operators in the stepnet properly. Since copy() is only used in the context of RNNs, the name has been changed to reflect that.
Summary:
Commonly, net observers attach operator observers at construction. This diff separates the logic into a base class to inherit from.
Closes https://github.com/caffe2/caffe2/pull/1806
Reviewed By: salexspb
Differential Revision: D6808623
Pulled By: mdschatz
fbshipit-source-id: 75ef0eea913ef30943541c829c0a976965f42736
Summary: Observer passed to RNN step net cloned with RecurrentOperator as subject instead of internal Operator. This diff applies adds the internal operator as the subject
Reviewed By: enosair
Differential Revision: D6560996
fbshipit-source-id: 7af4fb0ff8c19795b5c994c5fc6876f3d2ba7bf4
Summary:
This fixes the issue but I haven't figured out yet why is it
happening.
Reviewed By: bwasti
Differential Revision: D6437378
fbshipit-source-id: bf983c9b6f57647423423ec6b22e0f9d2b170e74
Summary: Unlanding D6327460 because seems to be causing unstability.
Differential Revision: D6377117
fbshipit-source-id: 4e1241fe65cd4c7a127fa6fa724f60b75965a096
Summary:
There were several regressions over time. Looks like the main
one is recent change with having a map which we iterate over for each
operator call. I made some other little optimizations to our Facebook
observer. Overal this seems to cut about 1000ns from an opertor. At a
rate of 36B operators per second this shouldbe about 750 type vi
hosts.
Reviewed By: bwasti
Differential Revision: D6327460
fbshipit-source-id: 119623addbbd575486906959d65603eea8d4f5e6
Summary: Pass the list of observers to rnnExecutor_ and attach them to operators
Reviewed By: akyrola
Differential Revision: D6279655
fbshipit-source-id: 086dde1bf6edbfb36082d6b4de33ec41f0bbefab
Summary: observer framework can now be used in python + a small writeup of how to use it. this is D6035393 with a fix for ct-scan
Reviewed By: salexspb
Differential Revision: D6066380
fbshipit-source-id: 896c4c580d4387240b81ac2dbbc43db51d4bfeb9
Summary: observer framework can now be used in python + a small writeup of how to use it
Reviewed By: sf-wind
Differential Revision: D6035393
fbshipit-source-id: 4563cf0203095fa979bb2160621cd16dd22ff830
Summary: observer framework can now be used in python + a small writeup of how to use it
Reviewed By: salexspb
Differential Revision: D5905002
fbshipit-source-id: e40ec24a55e08fb73beea9b4f3b68e71fc66ffb1