Commit Graph

35 Commits

Author SHA1 Message Date
Xiaomeng Yang
e04c9195b7 Update math::Transpose to support tensor with size > 2G (#17670)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17670

Update math::Transpose to support tensor with size > 2G
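A minimal sketch of the indexing concern behind this change (editor-added, hypothetical code, not the actual caffe2 kernel): once a tensor holds more than 2^31 elements, 32-bit int offsets overflow, so loop counters and stride arithmetic need 64-bit types.

```
#include <cstdint>

// Transpose a row-major rows x cols matrix; sizes may exceed 2G elements,
// so all index math uses int64_t to avoid 32-bit overflow.
template <typename T>
void Transpose2D(int64_t rows, int64_t cols, const T* X, T* Y) {
  for (int64_t i = 0; i < rows; ++i) {
    for (int64_t j = 0; j < cols; ++j) {
      Y[j * rows + i] = X[i * cols + j];  // int64_t products stay in range
    }
  }
}
```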

i-am-not-moving-c2-to-c10

Differential Revision: D14313624

fbshipit-source-id: 0b4a85b913972e5a8981f0d40d0c539407b98f30
2019-03-20 18:22:21 -07:00
Xiaomeng Yang
3a34f443c5 Separate reduce functions from math (#16929)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16929

Separate CPU reduce functions from math

i-am-not-moving-c2-to-c10

Reviewed By: houseroad

Differential Revision: D13999469

fbshipit-source-id: bd628b15a6e3c1f04cc62aefffb0110690e1c0d1
2019-02-13 17:50:47 -08:00
Xiaomeng Yang
2db847b3a7 Separate elementwise level2 math functions (#16753)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16753

Separate elementwise level2 math functions

i-am-not-moving-c2-to-c10

Reviewed By: houseroad

Differential Revision: D13954928

fbshipit-source-id: 1ca7a5d3da96e32510f502e5e4e79168854bee67
2019-02-07 18:38:26 -08:00
Xiaomeng Yang
866c4e3467 Separate Moments from math and optimize it (#16175)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16175

Separate Moments from math and optimize it

i-am-not-moving-c2-to-c10

Reviewed By: houseroad

Differential Revision: D13742472

fbshipit-source-id: 90757d908d38c98ca69818855aaf68315e525992
2019-01-20 08:53:25 -08:00
Jerry Zhang
91e87c0395 Renaming size() to numel() - 2/2
Summary:
Codemod generated with clangr shard mode, 50 files per diff,
clangr code(size->numel): diffusion/FBS/browse/master/fbcode/caffe2/caffe2/fb/codemods/TensorMethodRename.cpp

i-am-not-moving-c2-to-c10

Reviewed By: ezyang

Differential Revision: D12833748

fbshipit-source-id: 98dc2d3abc23c177c2c9e457b81499952d4b690c
2018-10-29 18:59:29 -07:00
Dmytro Dzhulgakov
5a2b2aa6af Remove calls to CopyFrom that can be sync (#13205)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13205

CopyFrom without a context argument does a sync copy on the current GPU - exactly what most call sites need.

This diff kills about 60% of CopyFrom usages. The most common pattern is a GPU->CPU copy followed by FinishDeviceComputation - the latter can simply be removed.
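A self-contained mock of the pattern described above (editor-added; hypothetical types that only approximate the Caffe2 API of that era), showing why the explicit sync becomes redundant once the context-less CopyFrom is itself synchronous:

```
#include <iostream>

struct MockContext {
  void FinishDeviceComputation() { std::cout << "explicit sync\n"; }
};

struct MockTensor {
  // Old pattern: asynchronous copy tied to a context, caller must sync.
  void CopyFrom(const MockTensor& /*src*/, MockContext* /*ctx*/) {
    std::cout << "async copy enqueued\n";
  }
  // New pattern: the context-less overload blocks until the copy is done.
  void CopyFrom(const MockTensor& /*src*/) { std::cout << "sync copy done\n"; }
};

int main() {
  MockTensor gpu, cpu;
  MockContext ctx;
  cpu.CopyFrom(gpu, &ctx);        // old: enqueue copy on the current GPU
  ctx.FinishDeviceComputation();  // old: wait for it to finish
  cpu.CopyFrom(gpu);              // new: one call, already synchronous
}
```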

Reviewed By: Yangqing

Differential Revision: D11236076

fbshipit-source-id: eb790ca494dfc5d5e3a7d850b45d6f73221bb204
2018-10-29 13:57:42 -07:00
Yangqing Jia
38f3d1fc40 move flags to c10 (#12144)
Summary:
still in flux.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12144

Reviewed By: smessmer

Differential Revision: D10140176

Pulled By: Yangqing

fbshipit-source-id: 1a313abed022039333e3925d19f8b3ef2d95306c
2018-10-04 02:09:56 -07:00
Sebastian Messmer
8f0db9bbbb Removing some dependency edges from Blob to other caffe2 (#12043)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12043

Re-trying D9979976, this time with all call sites fixed.

D9979976 got reverted because, it seems, there was a call site that wasn't covered by sandcastle.
I fixed it and used 'grep' to ensure there aren't any more call sites in fbsource.

Reviewed By: ezyang

Differential Revision: D10026392

fbshipit-source-id: cd341514a8e53a40147ea0ee3e52f63bb6444157
2018-09-25 11:40:24 -07:00
Maciej Bargiel
2cdf98a74d Back out "Removing some dependency edges from Blob to other caffe2"
Summary: Original commit changeset: 2ea17724e223

Differential Revision: D10026321
Ninja: stable broken

fbshipit-source-id: faf87cb7cc0f78c2c10d4aa6fceea279cd27acd6
2018-09-25 01:11:14 -07:00
Sebastian Messmer
17a65bf9b6 Removing some dependency edges from Blob to other caffe2 (#11923)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11923

This is pre-work to allow moving Blob to ATen/core, which cannot depend on caffe2 anymore.
(1) Removing the Blob -> Tensor dependency allows us to move Blob to ATen/core and use it inside IValue without having to wait for the Tensor merge to be complete.
(2) In the final Blob design, we want it to be a very small class that doesn't have any special treatment for Tensor (or, more precisely, doesn't allow storing Tensor anymore), so this is the direction we want to go anyway.

This changes call sites that will have to be moved to IValue later, but they cannot be moved to IValue directly, because for that, IValue first needs to be able to store Blob, which in turn first needs this diff and some other changes coming up in future diffs.

Codemods:
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)\\.IsTensorType\\(" "BlobIsTensorType(\\1, "
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)->IsTensorType\\(" "BlobIsTensorType(*\\1, "
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)\\.GetMutableTensor\\(" "BlobGetMutableTensor(\\1, "
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)->GetMutableTensor\\(" "BlobGetMutableTensor(*\\1, "

It is, however, not only these codemods, because regex-based refactoring was only able to match a small fraction of the call sites. To catch more, I would have needed an AST-aware tool like clangr, which I didn't figure out how to use.
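A standalone demo (editor-added; uses std::regex in place of the codemod tool) of what the first rewrite above does to a call site:

```
#include <iostream>
#include <regex>
#include <string>

int main() {
  std::string line = "if (blob.IsTensorType(CPU)) {";
  // Mirrors: codemod "([a-zA-Z0-9_]+)\.IsTensorType\(" -> "BlobIsTensorType(\1, "
  std::regex pattern(R"(([a-zA-Z0-9_]+)\.IsTensorType\()");
  std::string rewritten =
      std::regex_replace(line, pattern, "BlobIsTensorType($1, ");
  std::cout << line << "\n" << rewritten << "\n";
  // Prints:
  //   if (blob.IsTensorType(CPU)) {
  //   if (BlobIsTensorType(blob, CPU)) {
}
```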

Reviewed By: ezyang

Differential Revision: D9979976

fbshipit-source-id: 2ea17724e223b5b73b44f99362727759ca689e61
2018-09-24 22:57:05 -07:00
Christian Puhrsch
a6630e25af Remove many caffe2::TIndex and replace them with int64_t (#11943)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11943

See title

Reviewed By: ezyang

Differential Revision: D9992645

fbshipit-source-id: e8f80d6ea762971513e5e8072975ceea53e1f11a
2018-09-22 18:11:04 -07:00
Jerry Zhang
9f4bcdf075 caffe2::DeviceType -> at::DeviceType (#11254)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11254
Previously we used DeviceType from caffe2.proto directly, but it's an `enum` and has an implicit conversion to int, which lacks type safety; e.g. we have to explicitly check that a device type is valid in event.h:
```
template <int d>
struct EventCreateFunctionRegisterer {
  explicit EventCreateFunctionRegisterer(EventCreateFunction f) {
    static_assert(d < MaxDeviceTypes, "");
    Event::event_creator_[d] = f;
  }
};
```
at::DeviceType is an `enum class`; it does not have an implicit conversion to int and provides better type-safety guarantees. In this diff we have done the following refactor (taking CPU as an example):

    1. caffe2::DeviceType → caffe2::DeviceTypeProto
    2. caffe2::CPU → caffe2::PROTO_CPU
    3. caffe2::DeviceType = at::DeviceType
    4. caffe2::CPU = at::DeviceType::CPU

codemod -d caffe2/caffe2 --extensions h,cc,cpp 'device_type\(\), ' 'device_type(), PROTO_'
+ some manual changes

In short, after this diff, in C++, caffe2::CPU refers to at::DeviceType::CPU and the old proto caffe2::CPU becomes caffe2::PROTO_CPU.
On the Python side, we have a temporary workaround that aliases `caffe2_pb2.CPU = caffe2_pb2.PROTO_CPU` to make the change easier to review; this will be removed later.
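A minimal illustration of the type-safety difference driving the switch (editor-added; stand-in types, not the actual at::DeviceType definition):

```
// Plain enums convert to int silently; enum class values do not.
enum ProtoDeviceType { PROTO_CPU = 0, PROTO_CUDA = 1 };
enum class DeviceType { CPU = 0, CUDA = 1 };

void TakesInt(int) {}

int main() {
  TakesInt(PROTO_CPU);                          // compiles: implicit int conversion
  // TakesInt(DeviceType::CPU);                 // would not compile: no implicit conversion
  TakesInt(static_cast<int>(DeviceType::CPU));  // enum class requires an explicit cast
}
```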

Reviewed By: ezyang

Differential Revision: D9545704

fbshipit-source-id: 461a28a4ca74e616d3ee183a607078a717fd38a7
2018-09-05 16:28:09 -07:00
Xiaomeng Yang
f57e4ce1d5 Update broadcast with alpha to reduce num of launching kernels. (#10235)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10235

Update broadcast with alpha to reduce the number of kernel launches.
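A hedged sketch of one reading of the idea (editor-added; hypothetical helper, not the diff's code): fold the alpha scale into the broadcast pass itself, so a GPU backend needs a single kernel launch for Y = alpha * broadcast(X) rather than a broadcast launch followed by a separate scale launch.

```
#include <cstdint>

// One-pass broadcast-and-scale: write `rows` scaled copies of a length-n vector.
void BroadcastRowsWithAlpha(int64_t rows, int64_t n, float alpha,
                            const float* X, float* Y) {
  for (int64_t r = 0; r < rows; ++r) {
    for (int64_t i = 0; i < n; ++i) {
      Y[r * n + i] = alpha * X[i];  // scale fused into the broadcast pass
    }
  }
}
```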

Reviewed By: houseroad

Differential Revision: D9175824

fbshipit-source-id: 7a463833350a2c84dcfb82f73cf40da403dd59a0
2018-08-04 19:54:20 -07:00
Xiaomeng Yang
57d2d4bcff Optimize reduce ops for 2d and 3d (#9992)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9992

Optimize reduce ops for 2d and 3d

Reviewed By: houseroad

Differential Revision: D9042505

fbshipit-source-id: 62af2125aa6439106293e59bdf6a2b920792fd2d
2018-08-04 13:53:58 -07:00
Jerry Zhang
aebf3b47ae Remove template parameter from Tensor (#9939)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9939

Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13

Pull Request resolved: https://github.com/pytorch/translate/pull/166

Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125

Closes https://github.com/pytorch/pytorch/pull/9125

Use inheritance for polymorphism, and remove template parameter
This is to change the templating at call sites; the core implementations will change later

Before, the Caffe2 Tensor class was compile-time fixed to bind to a particular device/context. With this change, we make it a runtime property (stored inside the tensor) but preserve the same semantics. For example, one has to specify a device type in order to create a Tensor - there are no uninitialized tensors. More specifically, the changes are:

1. We added an extra *DeviceType* argument to most of the Tensor constructors, e.g. Tensor(DeviceType type).
2. The semantics of the constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context) have changed: the second context is passed in so we can call the templated Copy function. Previously it could be on a different context than the source and target; now we enforce that the context, if provided, has the same device type as src.
3. To preserve the 'get-or-construct' semantics of Blob, we added a specialized getter Blob::GetMutableTensor that verifies both that the Blob contains a Tensor and that it is of the correct type.
4. Specifically, the Tensor type is no longer default-constructible (as we don't have unknown-device tensors), so some of the code handling STL containers needs to change.

Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.
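A toy illustration of the core shift described above (editor-added; hypothetical types, not the caffe2 classes): the device moves from a compile-time template parameter to a runtime constructor argument, and the default constructor disappears so there are no "unknown device" tensors.

```
#include <cassert>
#include <cstdint>
#include <vector>

enum class DeviceType { CPU, CUDA };

// Before: device baked in at compile time, e.g. Tensor<CPUContext>.
template <typename Context>
struct OldTensor {
  std::vector<int64_t> dims;
};

// After: device is a runtime property; no default constructor.
struct NewTensor {
  explicit NewTensor(DeviceType type) : device(type) {}
  DeviceType device;
  std::vector<int64_t> dims;
};

int main() {
  OldTensor<int> old_style;              // placeholder context type
  NewTensor new_style(DeviceType::CPU);  // device passed at construction
  assert(new_style.device == DeviceType::CPU);
  (void)old_style;
}
```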

Reviewed By: ezyang, houseroad

Differential Revision: D9024330

fbshipit-source-id: e0b8295d2dc6ebe2963383ded5af799ad17164ba
2018-07-27 10:56:39 -07:00
Jerry Zhang
969b62f276 Revert D8121878: Remove template parameter from Tensor
Differential Revision: D8121878

Original commit changeset: 4a5e9a677ba4

fbshipit-source-id: d8e2c0bb145b52fbcca323b22d1d3346f0b3249e
2018-07-26 14:02:04 -07:00
Jerry Zhang
cd5adc7b5f Remove template parameter from Tensor (#13)
Summary:
Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13

Pull Request resolved: https://github.com/pytorch/translate/pull/166

Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125

Closes https://github.com/pytorch/pytorch/pull/9125

Use inheritance for polymorphism, and remove template parameter
This is to change the templating at call sites; the core implementations will change later

Before, the Caffe2 Tensor class was compile-time fixed to bind to a particular device/context. With this change, we make it a runtime property (stored inside the tensor) but preserve the same semantics. For example, one has to specify a device type in order to create a Tensor - there are no uninitialized tensors. More specifically, the changes are:

1. We added an extra *DeviceType* argument to most of the Tensor constructors, e.g. Tensor(DeviceType type).
2. The semantics of the constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context) have changed: the second context is passed in so we can call the templated Copy function. Previously it could be on a different context than the source and target; now we enforce that the context, if provided, has the same device type as src.
3. To preserve the 'get-or-construct' semantics of Blob, we added a specialized getter Blob::GetMutableTensor that verifies both that the Blob contains a Tensor and that it is of the correct type.
4. Specifically, the Tensor type is no longer default-constructible (as we don't have unknown-device tensors), so some of the code handling STL containers needs to change.

Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.

Reviewed By: xw285cornell

Differential Revision: D8121878

fbshipit-source-id: 4a5e9a677ba4ac82095df959851a054c81eccf81
2018-07-26 10:25:23 -07:00
Xiaomeng Yang
5df3eae89e Add 1x1 specialization for conv with NCHW order (#9671)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9671

Add 1x1 specialization for conv with NCHW order
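For context (editor-added; a naive reference sketch, not the caffe2 kernel): a 1x1, stride-1, unpadded convolution in NCHW layout is, per image, just a matrix product of the (C_out x C_in) filter with the (C_in x H*W) input, which is presumably what makes a dedicated fast path worthwhile.

```
#include <cstdint>

void Conv1x1NCHW(int64_t C_out, int64_t C_in, int64_t HW,
                 const float* W,   // C_out x C_in filter
                 const float* X,   // C_in x HW (one image)
                 float* Y) {       // C_out x HW output
  for (int64_t o = 0; o < C_out; ++o) {
    for (int64_t p = 0; p < HW; ++p) {
      float acc = 0.f;
      for (int64_t i = 0; i < C_in; ++i) {
        acc += W[o * C_in + i] * X[i * HW + p];
      }
      Y[o * HW + p] = acc;
    }
  }
}
```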

Reviewed By: houseroad

Differential Revision: D8944686

fbshipit-source-id: 94bf44f69498b1934b7dfff4c0e989342c7bb61c
2018-07-23 18:54:58 -07:00
Xiaomeng Yang
e27d66a454 Remove Eigen from math CUDA and update algorithm in ReduceTensor and Moments (#6922) 2018-04-24 23:07:35 -07:00
Xiaomeng Yang
34fa355f27 [caffe2] Add Moments to math (#6798)
* Add gpu check for reduce_max

* Add Moments in math

* Update cpu version to avoid int type to be 0

* Update Moments on CPU to same as GPU
2018-04-21 01:03:44 -07:00
Xiaomeng Yang
38614c4670 Add gpu check for reduce_max (#6729) 2018-04-18 14:51:52 -07:00
Xiaomeng Yang
be0b7f8c81 Add reduce min and reduce max (#6685) 2018-04-18 10:58:05 -07:00
Xiaomeng Yang
4be34ca0f3 Add broadcast and reduce gradient (#6668)
Add broadcast and reduce gradient
2018-04-17 13:31:13 -07:00
Xiaomeng Yang
cd2112717c [caffe2] Update math functions with params on host. (#6602)
* Update ReduceMean

Add reduce mean to math

Add reduce mean to math

* sync reduce_ops_test

* Update math_gpu.cu
2018-04-14 21:41:41 -07:00
Xiaomeng Yang
8849bea120 [caffe2] Update ReduceOps (#6497)
* Update ReduceMean

* Add reduce mean to math

* Update cuda flag

* Update Eigen::Tensor ctor

* Remove unused variables

* Skip ReduceTensorGPUTest if no gpus

* Add NOMINMAX for windows

* Fix lpnorm_op in windows
2018-04-11 23:36:05 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* The LICENSE file contains the details, so the headers are removed from individual source files.
2018-03-27 13:10:18 -07:00
Xiaomeng Yang
278d398748 Add GPU version of math::Transpose
Summary: Add GPU version of math::Transpose

Reviewed By: Yangqing

Differential Revision: D6747958

fbshipit-source-id: 7047107609386c1ab53492381ca9bcf8bccd2924
2018-01-24 14:18:02 -08:00
Xiaomeng Yang
0a8a18ca01 Fix GemmBatched
Summary: Fix GemmBatched

Reviewed By: Yangqing

Differential Revision: D6678168

fbshipit-source-id: 132117633573600d4e31c1959a0ccbe34416e1f1
2018-01-10 18:16:52 -08:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Wojciech Glogowski
e27431ddf5 New math.h functions required by YellowFin
Summary: New math.h functions required by YellowFin

Reviewed By: akyrola

Differential Revision: D5695258

fbshipit-source-id: b21a23b7f9647004173f8eb4f8ba9a852370d97a
2017-08-25 18:09:34 -07:00
Ahmed Taei
b294aadc66 fp16 support for FullyConnected op(Fixed)
Summary: This diff resolved some issues in the reverted PR246.

Differential Revision: D4911821

fbshipit-source-id: 0a6fa47f4c2405475697e40fb926758c534f8ef7
2017-04-19 12:49:12 -07:00
Aapo Kyrola
9ab077dc9d Revert D4871248: [caffe2][PR] fp16 support for FullyConnected op
Summary: This reverts commit 6a991c2c993dcf0b1e18aa3f2ffbe19e693dbadd

Differential Revision: D4871248

fbshipit-source-id: b6d812d09a00c83e363432e84742c503abfed65b
2017-04-17 21:31:20 -07:00
Simon Layton
1082db600e fp16 support for FullyConnected op
Summary:
Includes math lib support, removal of double-precision.
Closes https://github.com/caffe2/caffe2/pull/246

Reviewed By: Yangqing

Differential Revision: D4871248

Pulled By: asaadaldien

fbshipit-source-id: 6a991c2c993dcf0b1e18aa3f2ffbe19e693dbadd
2017-04-17 12:07:57 -07:00
Yiming Wu
fd5643e426 Add math::Gemv<double, CUDAContext> by cublas::cublasDgemv
Summary: support double gemv in CUDAContext
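For reference (editor-added sketch, not the caffe2 code): a BLAS-style gemv computes y = alpha * A * x + beta * y for an M x N matrix; this commit routes the double-precision case on GPU to cublasDgemv. A plain C++ statement of the same computation for a row-major matrix:

```
#include <cstdint>

void GemvRowMajor(int64_t M, int64_t N, double alpha, const double* A,
                  const double* x, double beta, double* y) {
  for (int64_t i = 0; i < M; ++i) {
    double acc = 0.0;
    for (int64_t j = 0; j < N; ++j) {
      acc += A[i * N + j] * x[j];  // dot product of row i with x
    }
    y[i] = alpha * acc + beta * y[i];
  }
}
```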

Differential Revision: D4872986

fbshipit-source-id: c6397c5a3b2667ca446deca0f5edbcc7f29f7a1e
2017-04-12 01:17:47 -07:00
Aapo Kyrola
ed44e87f98 use striped batch add for the recurrent network gradient
Summary: Instead of calling batch-size-many math::Adds, added a new function that does a batch of additions. For CPU there is no difference, but for CUDA we do everything in one kernel. I don't think this has a huge performance impact, but it at least makes the CUDA profiling look better with fewer kernel launches.
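A hedged sketch of the shape of such a helper (editor-added; hypothetical name and signature, not the diff's code): one call covers all batch slices, so a CUDA backend can do the work in a single kernel launch instead of one launch per slice.

```
#include <cstdint>

// Adds `batch` contiguous slices of length `n`: Y[b] += X[b] for each slice.
void StridedBatchAdd(int64_t batch, int64_t n, const float* X, float* Y) {
  for (int64_t b = 0; b < batch; ++b) {
    const float* x = X + b * n;
    float* y = Y + b * n;
    for (int64_t i = 0; i < n; ++i) {
      y[i] += x[i];
    }
  }
}
```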

Reviewed By: jamesr66a

Differential Revision: D4798411

fbshipit-source-id: 44ac65b2da5a615971219809b9298b4e122085cd
2017-03-30 08:57:16 -07:00