Commit Graph

31 Commits

Eli Amesefe
385a755b68 Undefined behavior with memset of std::string to 0 (#18703)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18703

 `zeroPtr` is sometimes a `std::string` tensor, so `memset` to 0 is undefined behavior.

This might be accidentally safe with `std::string` implementations that use SSO (Small String Optimization), but it will crash otherwise.
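
A minimal C++ sketch of the hazard (names are illustrative, not the actual Caffe2 call site): `memset` is only defined behavior for trivially copyable element types, so zero-filling `std::string` objects stomps on their internal pointers.

```
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Zero-initialize a buffer of n elements of type T.
template <typename T>
void zero_init(T* data, std::size_t n) {
  if (std::is_trivially_copyable<T>::value) {
    std::memset(data, 0, n * sizeof(T));  // fine for POD payloads like float
  } else {
    for (std::size_t i = 0; i < n; ++i) {
      data[i] = T();  // e.g. std::string: value-initialize instead
    }
  }
}
```

With SSO, a zeroed string's inline bytes can masquerade as an empty string; with a heap-backed representation, the zeroed pointers crash on first use.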

Reviewed By: zheng-xq

Differential Revision: D14714458

fbshipit-source-id: 012a18464e6514d38ff791509b88ddc3fc55b2b1
2019-04-02 10:10:11 -07:00
Junjie Bai
246f5c412e Revert "Tensor construction codemod(raw_mutable_data) (#16373)" (#18680)
Summary:
This reverts commit d73c830e23.

We have observed a significant perf drop when training ResNext101 with multiple AMD GPUs:

Before:
https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang7-rocmdeb-ubuntu16.04-bench/1636/console
2 GPUs ResNext training got 150~160 imgs/sec
4 GPUs ResNext training got 270~280 imgs/sec

After:
https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang7-rocmdeb-ubuntu16.04-bench/1637/console
Both 2 and 4 GPUs ResNext training drop to 110~120 imgs/sec

Similar perf drops are seen on ResNet50 training jobs as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18680

Differential Revision: D14702941

Pulled By: bddppq

fbshipit-source-id: 828141805afc23f25c08d4a2eb6d4b99f817c128
2019-04-01 14:39:13 -07:00
Jerry Zhang
d73c830e23 Tensor construction codemod(raw_mutable_data) (#16373)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16373

motivation: https://github.com/pytorch/pytorch/pull/12407
This is a manual diff.
Most of the fixes should follow this pattern:

```
auto* Y = Output(0);
Y->Resize(dims);
Y->raw_mutable_data(dtype);
```
-->
```
auto* Y = Output(0, dims, at::dtype(dtype));
```
But there might be other cases.

Reviewed By: dzhulgakov

Differential Revision: D13725460

fbshipit-source-id: 649a4b0e42f62cda1a60171dd9fa3e440dc9dca1
2019-03-29 18:36:46 -07:00
Jerry Zhang
5fefb29a53 Tensor construction: combine Resize+mutable_data - 4/4 (#13856)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13856

Codemod generated with clangr shard mode, 25 files per diff,
motivation: https://github.com/pytorch/pytorch/pull/12407

Reviewed By: smessmer

Differential Revision: D13007310

fbshipit-source-id: 941f064ef8934bb17fbfb706e6ed3db173b5d268
2018-11-27 12:34:25 -08:00
Xiaoqiang Zheng
de41d1ae0b Enable junk fill for the default CPU allocator (#13377)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13377

* Enable junk fill for the default CPU allocator (sketched below). This first diff only enables it for the tests; a second diff will change the default of zero-fill to false.
* Fix tests to use the 64-bit counters that IterOp and LearningRateOp demand.
* Fix kernels that use uninitialized memory.
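
A rough sketch of the junk-fill idea, with hypothetical flag names standing in for the real allocator options: poisoning fresh allocations with a recognizable byte pattern makes reads of uninitialized memory fail loudly instead of silently observing zeros.

```
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical knobs; the real change wires equivalents into the Caffe2
// default CPU allocator.
static bool FLAGS_zero_fill = true;   // old behavior: zeroing can hide bugs
static bool FLAGS_junk_fill = false;  // new option: poison memory instead

void* AllocCPU(std::size_t nbytes) {
  void* data = std::malloc(nbytes);
  if (data == nullptr) {
    return nullptr;
  }
  if (FLAGS_junk_fill) {
    std::memset(data, 0xA5, nbytes);  // recognizable junk pattern
  } else if (FLAGS_zero_fill) {
    std::memset(data, 0, nbytes);
  }
  return data;
}
```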

Reviewed By: salexspb

Differential Revision: D10866512

fbshipit-source-id: 17860e77e63a203edf46d0da0335608f77884821
2018-11-08 00:02:37 -08:00
Jerry Zhang
508f676c50 Rename ndim() -> dim() - 5/6
Summary:
Codemod generated with clangr shard mode, 50 files per diff,
clangr code(ndim()->dim()): diffusion/FBS/browse/master/fbcode/caffe2/caffe2/fb/codemods/TensorMethodRename.cpp

Reviewed By: salexspb

Differential Revision: D12935787

fbshipit-source-id: 303d71d3eb050789af2ab9575e5dcc48f6037086
2018-11-06 16:38:35 -08:00
Jerry Zhang
519570def8 Rename dim(i) -> size(i) - 2/2
Summary:
Codemod generated with clangr shard mode, 50 files per diff,
clangr code(dim->size): diffusion/FBS/browse/master/fbcode/caffe2/caffe2/fb/codemods/TensorMethodRename.cpp

Reviewed By: salexspb

Differential Revision: D12896721

fbshipit-source-id: deb0290354a1ffd69d080f0f126479844bf04e3c
2018-11-02 14:29:06 -07:00
Jerry Zhang
13b9fd3e05 Renaming meta() to dtype() - 2/2 (#13334)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13334

Codemod generated with clangr shard mode, 50 files per diff,
clangr code(meta->dtype): diffusion/FBS/browse/master/fbcode/caffe2/caffe2/fb/codemods/TensorMethodRename.cpp

i-am-not-moving-c2-to-c10

Reviewed By: ezyang

Differential Revision: D12845197

fbshipit-source-id: f87eb575d3c31593ca76b70780cc4fca888e706b
2018-10-30 18:24:30 -07:00
Jerry Zhang
537d671829 Renaming size() to numel() - 4/6
Summary: Codemod generated with clangr shard mode, 50 files per diff

Reviewed By: li-roy

Differential Revision: D10866391

fbshipit-source-id: 3badc4e86edaac376918fca8d09dbfa396ac3a2c
2018-10-26 16:47:36 -07:00
Jerry Zhang
cccd457a1e Tensor dims() -> sizes() (caffe2/operators) - 4/5 (#13031)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13031

Codemod generated with clangr shard mode, 25 files per diff, for renaming dims() to sizes()

Reviewed By: ezyang

Differential Revision: D10476232

fbshipit-source-id: cb4ad76be068065eb2c5e7d87f33d04423cf93c4
2018-10-24 15:07:42 -07:00
Edward Yang
54d9823d00 Make caffe2::Tensor::dims() return an IntList instead of a const vector& (#12180)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12180

I had to fix a lot of call sites, because a lot of places assume that
you can actually get a const vector&, and if the internal representation
of sizes in a tensor is NOT a vector, it's not possible to fulfill
this API contract.

Framework changes:
- I deleted TensorImpl::dims(); caffe2::Tensor::dims() just forwards to
  sizes() now.
- De-templatized SetDims; now it is an explicit list of ArrayRef and
  variadic overloads.  This makes implicit conversions work again,
  so I don't need to explicitly list the std::vector cases too.
  - As a knock-on effect, this causes Reset() to accept at::IntList as well as
    const std::vector<int64_t>&
- Edited variadic overloads of SetDims to all forward to the underlying
  arbitrary-dim implementation, reducing code duplication; see the sketch
  below.  (It's probably marginally less efficient in the new world.)
- Replace Tensor constructor accepting const std::vector<int64_t>& with at::IntList
- Make MKLTensor accept ArrayRef along with vector in constructor and
  Reset (unfortunately, no implicit conversions here, since it's templated on
  index type.)
- There are a few other places, like cudnn, where I changed functions
  that previously took const std::vector<int64_t>& to take at::IntList
  instead.
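
A self-contained sketch of the de-templatized SetDims shape, under the assumption that the real code forwards through an ArrayRef-like view (the types here are illustrative stand-ins, not the actual TensorImpl):

```
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal stand-in for at::IntList (ArrayRef<int64_t>).
struct IntList {
  const int64_t* ptr;
  std::size_t len;
  IntList(const std::vector<int64_t>& v) : ptr(v.data()), len(v.size()) {}
  IntList(const int64_t* p, std::size_t n) : ptr(p), len(n) {}
};

struct TensorImplSketch {
  std::vector<int64_t> sizes_;

  // Single arbitrary-dim implementation.
  void SetDims(IntList d) { sizes_.assign(d.ptr, d.ptr + d.len); }

  // Variadic overloads all forward to the ArrayRef-style version above.
  void SetDims(int64_t d0) {
    int64_t a[] = {d0};
    SetDims(IntList(a, 1));
  }
  void SetDims(int64_t d0, int64_t d1) {
    int64_t a[] = {d0, d1};
    SetDims(IntList(a, 2));
  }
};
```

Because the view is implicitly constructible from std::vector, existing call sites like SetDims(some_vector) keep compiling without explicit conversions.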

Classification of call site changes:
- 'const std::vector<int64_t>& x_dims = x.dims()' ==>
  'at::IntList x_dims = x.dims()' (sketched after this list)
- 'std::vector<int64_t> x_dims = x.dims()' ==>
  'std::vector<int64_t> x_dims = x.dims().vec()' (we need a copy!)
  Usually this is because we're about to mutably modify the vector
  to compute some new dimension.  However, it also very commonly occurs in the
  form: 'x_dims_ = x.dims()' because we frequently cache sizes in operators.
- Instead of constructing std::vector<int64_t>{blah, blah}, construct an
  at::IntList directly
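
A self-contained sketch of the first two call-site patterns, again with illustrative stand-ins for at::IntList and caffe2::Tensor:

```
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal stand-in for at::IntList (ArrayRef<int64_t>): a non-owning view.
struct IntList {
  const int64_t* ptr = nullptr;
  std::size_t len = 0;
  IntList(const std::vector<int64_t>& v) : ptr(v.data()), len(v.size()) {}
  std::vector<int64_t> vec() const { return {ptr, ptr + len}; }
};

// Stand-in for caffe2::Tensor after this change.
struct Tensor {
  std::vector<int64_t> sizes_;
  IntList dims() const { return IntList(sizes_); }  // a view, not a const vector&
};

void call_sites(const Tensor& x) {
  IntList x_dims = x.dims();                    // read-only view; must not outlive x
  std::vector<int64_t> owned = x.dims().vec();  // explicit copy when caching/mutating
  owned.push_back(1);                           // e.g. computing a new output shape
  (void)x_dims;                                 // silence unused-variable warnings
}
```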

ArrayRef changes:
- cbegin()/cend() iterators; they operate the same as begin()/end() because
  everything on ArrayRef is const.
- Moved operator<< into ArrayRef.h, so that it's always available when
  working with ArrayRef.  I also templated it, so it now works on an
  ArrayRef of any type.
- Add operator== overload for ArrayRef, and also add variants to permit
  comparison of ArrayRef with std::vector, a very common operation; usage is
  sketched below.  (The non-templated version of operator== can get these
  automatically via implicit conversion, but with templates C++ refuses to do
  any explicit conversions.)
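
A usage sketch of the new ArrayRef conveniences; the header path and namespace are approximate for this era of the codebase:

```
#include <cstdint>
#include <iostream>
#include <vector>
#include <ATen/core/ArrayRef.h>  // approximate location at the time

void check(at::ArrayRef<int64_t> dims, const std::vector<int64_t>& cached) {
  if (dims == cached) {           // new: compare ArrayRef against std::vector
    std::cout << dims << "\n";    // operator<< now ships with ArrayRef.h
  }
  for (auto it = dims.cbegin(); it != dims.cend(); ++it) {
    // cbegin()/cend() behave like begin()/end(): everything here is const
  }
}
```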

I'm planning to audit all dims() call sites to make sure they don't
expect 'auto x = t.dims()' to give you an x whose lifetime can validly
outlive the tensor.

I opted not to do a dims() to sizes() rename, because dims() also matches
the protobuf accessor. Bad news!

Reviewed By: jerryzh168

Differential Revision: D10111759

fbshipit-source-id: a2a81dc4b92c22ad4b3b8ef4077a7e97b6479452
2018-10-05 15:57:41 -07:00
Christian Puhrsch
a6630e25af Remove many caffe2::TIndex and replace them with int64_t (#11943)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11943

See title

Reviewed By: ezyang

Differential Revision: D9992645

fbshipit-source-id: e8f80d6ea762971513e5e8072975ceea53e1f11a
2018-09-22 18:11:04 -07:00
Jerry Zhang
aebf3b47ae Remove template parameter from Tensor (#9939)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9939

Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13

Pull Request resolved: https://github.com/pytorch/translate/pull/166

Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125

Closes https://github.com/pytorch/pytorch/pull/9125

Use inheritance for polymorphism, and remove the template parameter.
This changes the templating at call sites; the core implementations will change later.

Before this change, the Caffe2 Tensor class was compile-time bound to a particular device/context. We are now making the device a runtime property (stored inside the tensor) while preserving the same semantics. For example, one has to specify a device type in order to create a Tensor - there are no uninitialized tensors. More specifically, the changes are:

1. We added an extra *DeviceType* argument to most of the tensor constructors, e.g. Tensor(DeviceType type).
2. The semantics of the constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context) have changed: the second context is passed in so that we can call the templated Copy function. Previously it could be a different context than the source and target; now we enforce that, if provided, the context has the same device type as src.
3. To preserve the 'get-or-construct' semantics of Blob, we added a specialized getter Blob::GetMutableTensor that verifies both that the Blob contains a Tensor and that it is of the correct type.
4. In particular, the Tensor type is no longer default-constructible (as we don't have unknown-device tensors), so some of the code handling STL containers needs to change (a sketch follows the note below).

Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.
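
A runnable sketch of the API shift; the names loosely mirror the Caffe2 of that time and are not exact signatures:

```
#include <cstdint>
#include <utility>
#include <vector>

enum class DeviceType { CPU, CUDA };

// Stand-in for the post-change caffe2::Tensor.
struct Tensor {
  DeviceType device;
  std::vector<int64_t> sizes;
  Tensor() = delete;  // no unknown-device tensors
  explicit Tensor(DeviceType d) : device(d) {}
  Tensor(std::vector<int64_t> s, DeviceType d)
      : device(d), sizes(std::move(s)) {}
};

int main() {
  // Before: Tensor<CPUContext> t({2, 3});  -- device fixed at compile time.
  Tensor t({2, 3}, DeviceType::CPU);  // after: device is a runtime property
  Tensor u(DeviceType::CUDA);         // device required even with no shape yet
  // Tensor v;                        // no longer compiles: deleted default ctor
  (void)t; (void)u;
}
```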

Reviewed By: ezyang, houseroad

Differential Revision: D9024330

fbshipit-source-id: e0b8295d2dc6ebe2963383ded5af799ad17164ba
2018-07-27 10:56:39 -07:00
Jerry Zhang
969b62f276 Revert D8121878: Remove template parameter from Tensor
Differential Revision:
D8121878

Original commit changeset: 4a5e9a677ba4

fbshipit-source-id: d8e2c0bb145b52fbcca323b22d1d3346f0b3249e
2018-07-26 14:02:04 -07:00
Jerry Zhang
cd5adc7b5f Remove template parameter from Tensor (#13)
Summary:
Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13

Pull Request resolved: https://github.com/pytorch/translate/pull/166

Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125

Closes https://github.com/pytorch/pytorch/pull/9125

Use inheritance for polymorphism, and remove the template parameter.
This changes the templating at call sites; the core implementations will change later.

Before this change, the Caffe2 Tensor class was compile-time bound to a particular device/context. We are now making the device a runtime property (stored inside the tensor) while preserving the same semantics. For example, one has to specify a device type in order to create a Tensor - there are no uninitialized tensors. More specifically, the changes are:

1. We added an extra *DeviceType* argument to most of the tensor constructors, e.g. Tensor(DeviceType type).
2. The semantics of the constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context) have changed: the second context is passed in so that we can call the templated Copy function. Previously it could be a different context than the source and target; now we enforce that, if provided, the context has the same device type as src.
3. To preserve the 'get-or-construct' semantics of Blob, we added a specialized getter Blob::GetMutableTensor that verifies both that the Blob contains a Tensor and that it is of the correct type.
4. In particular, the Tensor type is no longer default-constructible (as we don't have unknown-device tensors), so some of the code handling STL containers needs to change.

Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.

Reviewed By: xw285cornell

Differential Revision: D8121878

fbshipit-source-id: 4a5e9a677ba4ac82095df959851a054c81eccf81
2018-07-26 10:25:23 -07:00
Mingzhe Li
a70a90b28f Fix pytorch linux build issues (#9273)
Summary:
Breaking out of #8338

This fixes the build issues with PyTorch on Linux machines after BUILD_CAFFE2 and BUILD_ATEN were removed.

cc orionr
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9273

Reviewed By: orionr

Differential Revision: D8768869

Pulled By: mingzhe09088

fbshipit-source-id: 2730426ed1bed398eb5dc804c7348aeeb27c93d3
2018-07-09 14:41:36 -07:00
Orion Reblitz-Richardson
9ec0a2aef4 fbshipit-source-id: ba600fcd2b5cefc7621357bdeb05e24cea02e5af 2018-06-27 04:50:56 -07:00
Matthew Inkawhich
b10c94b507 Update operator documentation with markdown descriptions and interfaces (#8085)
* Update operator documentation with markdown descriptions and interfaces

* Added rest of updated operator documentation to source files

* Committing local changes for rebase

* fixed bracket typo in sqrt_op.cc file

* Added updated markdown documentation to remaining completed ops
2018-06-15 19:02:24 -04:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Yangqing Jia
efa7c895f6 Misc Windows lint
Summary: Closes https://github.com/caffe2/caffe2/pull/1656

Differential Revision: D6633052

Pulled By: Yangqing

fbshipit-source-id: 5eeb3912fc769cfd06d252f3ed1d8d5f2a207cfc
2017-12-23 20:07:27 -08:00
Ahmed Taei
0a25926f4b CUDA implementation for GatherPaddingOp
Summary: AT

Reviewed By: enosair

Differential Revision: D6561996

fbshipit-source-id: ad03d6db8d4318e426ff96569bb3c93cba696926
2017-12-15 16:05:31 -08:00
Qinqing Zheng
5ec224496b Merge common part in CUDA & CPU implementations of AddPaddingOp
Summary: The RunWithType() function of the CUDA version shares a lot of code with the CPU version of the op. Merge them by pulling the differing parts out of RunWithType() and moving them into separate CPU/CUDA functions.
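
A structural sketch of the pattern, with illustrative names (not the actual Caffe2 helpers): the shared driver keeps the common logic and delegates only the device-specific inner loop.

```
#include <algorithm>
#include <cstddef>

// Device-specific inner loop (CPU shown; a CUDA build would provide a
// kernel-launching overload with the same signature).
template <typename T>
void CopyWithPadding(const T* in, T* out, std::size_t n,
                     std::size_t pad, T pad_value) {
  std::fill(out, out + pad, pad_value);                    // leading padding
  std::copy(in, in + n, out + pad);                        // payload
  std::fill(out + pad + n, out + 2 * pad + n, pad_value);  // trailing padding
}

// Shared driver: the shape logic is identical across devices, so it lives
// once in common code and calls the per-device helper.
template <typename T>
bool RunWithTypeShared(const T* in, T* out, std::size_t n,
                       std::size_t pad, T pad_value) {
  // ...shared checks and output sizing would go here...
  CopyWithPadding(in, out, n, pad, pad_value);  // only this call is per-device
  return true;
}
```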

Reviewed By: asaadaldien

Differential Revision: D6467962

fbshipit-source-id: 83b45e697a094e959f66e898f46f06b0e2c329bc
2017-12-04 16:55:49 -08:00
James Cross
0e21cd2eae CUDA implementation of RemovePadding operator
Summary:
This is a CUDA implementation of the RemovePadding operator, modeled on akyrola's implementation for AddPadding.

There's also an incidental spelling correction: GetAddPadingGradient -> GetAddPaddingGradient.

Reviewed By: akyrola

Differential Revision: D6439594

fbshipit-source-id: b29cd0c252021c58e150b901bbaad28a3bd3cc4a
2017-11-29 18:48:01 -08:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Yangqing Jia
8efb762fcd gpu sequence op step 1: clean headers
Summary:
@public

This has no functional changes yet; it only cleans up the sequence_op file
so that the header is context-independent. I will implement the GPU parts
separately.

Reviewed By: pietern

Differential Revision: D4777140

fbshipit-source-id: 9b4aea6c36f06a64a53e235a125cd3477d54a045
2017-03-29 08:47:00 -07:00
Zhao Tan
31ca9d57b6 Remove args in Grad
Summary: Removed Def().arg() from the backward computation, since the args have already been included in the forward op.

Differential Revision: D4563600

fbshipit-source-id: bb6ee25e7c8da99977b82963670267392893fcde
2017-02-15 16:00:44 -08:00
Martin Raison
2c3eb3e592 fix sequence_ops doc (pad_width -> padding_width)
Summary: The doc for sequence ops says "pad_width" instead of "padding_width". This diff fixes it.

Differential Revision: D4277186

fbshipit-source-id: 63af6cce2fe0af0d395f78c6a6a1f41518039cf8
2016-12-15 12:01:29 -08:00
Yangqing Jia
589398950f fbsync at f5a877 2016-11-18 15:41:06 -08:00
Yangqing Jia
d1e9215184 fbsync 2016-10-07 13:08:53 -07:00
Yangqing Jia
b23e51d467 chunky sync 2016-09-06 15:55:19 -07:00
Yangqing Jia
09bed67e4f add untracked files 2016-07-21 11:26:41 -07:00