62 Commits

Author SHA1 Message Date
Rohith Menon
879a90b322 [ModelLoading] Use byte encoding for uint8, fp16 etc. instead of int32 (#34343)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34343

Use byte encoding for uint8, fp16 etc. instead of int32 in TensorProto serialization/deserialization

tl;dr
- fp16 tensor deserialization 12x faster, serialized size 25% lower
- uint8 tensor deserialization 36x faster, serialized size 25% lower
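
As a rough illustration of the idea (not the actual TensorProto code), the sketch below packs 16-bit values such as fp16 bit patterns into a raw byte string and reads them back. It also shows how the size of a protobuf-style varint depends on the value's magnitude. `PackAsBytes`, `UnpackFromBytes`, and `VarintSize` are hypothetical names, and the sketch assumes writer and reader share the same byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: store each 16-bit value as exactly two raw bytes.
std::string PackAsBytes(const std::vector<uint16_t>& values) {
  std::string out(values.size() * sizeof(uint16_t), '\0');
  if (!values.empty()) {
    std::memcpy(&out[0], values.data(), out.size());
  }
  return out;
}

// Hypothetical helper: a single bulk copy restores the values, with no
// per-element decoding step.
std::vector<uint16_t> UnpackFromBytes(const std::string& bytes) {
  std::vector<uint16_t> values(bytes.size() / sizeof(uint16_t));
  if (!values.empty()) {
    std::memcpy(&values[0], bytes.data(), values.size() * sizeof(uint16_t));
  }
  return values;
}

// Number of bytes a base-128 varint needs for one value
// (7 payload bits per byte).
std::size_t VarintSize(uint32_t v) {
  std::size_t n = 1;
  while (v >= 0x80) {
    v >>= 7;
    ++n;
  }
  return n;
}
```

With the varint form, a 16-bit value takes one to three bytes and must be decoded element by element; the byte form always takes two bytes per value and can be copied in bulk.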

Test Plan:
```
============================================================================
caffe2/caffe2/fb/predictor/ModelLoaderBenchmark.cpp  relative  time/iter  iters/s
============================================================================
BlobProtoInt32DeserializationFloat16                        12.37ms    80.82
BlobProtoByteDeserializationFloat16             1125.46%     1.10ms   909.64
----------------------------------------------------------------------------
BlobProtoInt32DeserializationUInt8                          17.57ms    56.92
BlobProtoByteDeserializationUInt8               3629.45%   484.02us    2.07K
============================================================================
```

Reviewed By: yinghai

Differential Revision: D20137451

fbshipit-source-id: 8ed4be2286a6d4c7e134fcb0832f22bc645039a1
2020-03-06 11:58:30 -08:00
Shunting Zhang
7f5f2e8871 add ZERO_COLLISION_HASH to caffe2 data type (#30912)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30912

Add a new data type ZERO_COLLISION_HASH .

Test Plan: ci

Reviewed By: boryiingsu

Differential Revision: D18843626

fbshipit-source-id: b2d8280f13c78b4a656cf95822198df59de7b64c
2019-12-10 21:36:24 -08:00
Edward Yang
1e6acc676f Replace caffe2::DeviceGuard with c10::cuda::CUDAGuard (#17623)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17623

Despite its generic-sounding name, caffe2::DeviceGuard actually
only worked on CUDA devices.  Rename it to something that more
clearly spells out its applicability.

I'm not sure if it's the right call, but in this patch I added
'using CUDAGuard = c10::cuda::CUDAGuard', as this seems to be more
in-line with how the Caffe2 codebase is currently written.  More
idiomatic c10 namespace style would be to say cuda::CUDAGuard.
Willing to change this if people shout.

This is a respin of D13156470 (#14284)

Reviewed By: dzhulgakov

Differential Revision: D14285504

fbshipit-source-id: 93b8ab938b064572b3b010c307e1261fde0fff3d
2019-03-06 10:48:15 -08:00
Michael Liu
5f866d0ea2 Apply modernize-use-override (2nd iteration)
Summary:
Use C++11’s override and remove virtual where applicable.
Changes are automatically generated.
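
A minimal sketch of the kind of rewrite this check performs, using hypothetical class names rather than anything from the affected files:

```cpp
#include <cassert>

struct Serializer {
  virtual ~Serializer() = default;
  virtual int Version() const { return 1; }
};

// Before the codemod, the method below might have been declared as
// `virtual int Version() const`. With `override`, the compiler rejects the
// declaration if it stops matching a virtual method in the base class.
struct TensorSerializer : Serializer {
  int Version() const override { return 2; }
};
```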

Reviewed By: Orvid

Differential Revision: D14086124

fbshipit-source-id: 2005227d095d776ca3b4309a57f54e25782b9b58
2019-02-14 16:52:57 -08:00
Shahzad Lone
53ae8bc64d Reserve vectors that we know the size in advance for. (#16201)
Summary:
Save reallocation costs by reserving vectors according to how many elements we expect to put in.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16201
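
The pattern can be sketched as follows. `MakeSquares` is a hypothetical function chosen for illustration, not one of the call sites touched by this change:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: the final size is known up front, so one reserve()
// call replaces the repeated reallocations push_back() could trigger.
std::vector<int> MakeSquares(std::size_t n) {
  std::vector<int> out;
  out.reserve(n);
  for (std::size_t i = 0; i < n; ++i) {
    out.push_back(static_cast<int>(i * i));
  }
  return out;
}
```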

Differential Revision: D13762594

Pulled By: ezyang

fbshipit-source-id: 7e3bfe421489dde48a2ddb0920dd155f69baecc0
2019-01-22 08:02:40 -08:00
Edward Yang
f4c59c5fdf Replace SwitchToDevice(0) with SwitchToDevice() (#15126)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15126

I want to make people stop manufacturing StreamId from thin air,
and a first step is to make people use the default stream.

Reviewed By: dzhulgakov

Differential Revision: D13432922

fbshipit-source-id: 9f0d8d70646c50d979bde5ba3c3addeebac48a3d
2018-12-17 15:15:00 -08:00
Jerry Zhang
9b272c08cf Remove partially initialized Tensor in Deserialization (#14197)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14197

Pull Request resolved: https://github.com/pytorch/pytorch/pull/13642

Previously we passed a partially initialized Tensor to Deserialize, which filled
it with the result of deserializing a tensor proto. Now we want it to return
a Tensor directly, since a Tensor is just a shared pointer to TensorImpl.
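
The shape of the new interface might look like the sketch below. `ToyProto`, `ToyTensorImpl`, and `ToyTensor` are hypothetical stand-ins and not the Caffe2 types; the point is only that the function builds the implementation object itself and returns a cheap handle by value.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the proto and the tensor implementation.
struct ToyProto {
  std::vector<int64_t> dims;
  std::string bytes;
};

struct ToyTensorImpl {
  std::vector<int64_t> dims;
  std::string data;
};

// The handle is just a shared pointer, so returning it by value is cheap.
class ToyTensor {
 public:
  explicit ToyTensor(std::shared_ptr<ToyTensorImpl> impl)
      : impl_(std::move(impl)) {}
  const ToyTensorImpl& impl() const { return *impl_; }

 private:
  std::shared_ptr<ToyTensorImpl> impl_;
};

// Returns a fully initialized tensor instead of filling in a caller-provided,
// partially initialized one.
ToyTensor Deserialize(const ToyProto& proto) {
  auto impl = std::make_shared<ToyTensorImpl>();
  impl->dims = proto.dims;
  impl->data = proto.bytes;
  return ToyTensor(std::move(impl));
}
```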

Reviewed By: dzhulgakov

Differential Revision: D12874357

fbshipit-source-id: 12b80a763375da23cfa64a74d6bc186d8d03b94f
2018-12-10 17:17:29 -08:00
Dmytro Dzhulgakov
0cfbbceac3 Change Tensor::CopyFrom to a simple double dispatch (#14268)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14268

Removes the need for Context in Tensor by doing simple dispatch for CopyBytes. It'd eventually be subsumed by Roy Li's changes of proper copy_ op, but before that is done, let's get a clear logic of how copies are implemented and clean up some cruft in CopyFrom implementation.

Note that with these changes, one can probably get rid of Context::CopyFromCPU/CopyToCPU, but it's a matter for follow up diffs.

This diff doesn't change the API of Tensor yet, but relies on the fact that passing `Context` to CopyFrom makes copy async if the device is CUDA and doesn't have any effect otherwise (that's how Context methods are implemented).

This doesn't change semantics of copy async implementation - as before it blindly calls cudaMemcpyAsync which probably means that it can be misused if invoked separately outside of operator body. I'll leave it for the follow up copy_ unification.

For Extend() we always do async copy - it makes sense as it's an in-place device-device operation and only any further op would be observable.

Note: there are now three ways of invoking copy in C2 code - templated CopyBytes, virtual CopyFromCPU/etc, and double-dispatch free method here. Hopefully we can get rid of the second one.

Also, please advise whether it's c10-worthy :)

Reviewed By: ezyang

Differential Revision: D13117987

fbshipit-source-id: a6772d6dcf3effaf06717da3a656fc9873b310b5
2018-11-28 15:45:37 -08:00
Jerry Zhang
a228a95b94 Rename ndim() -> dim() - 1/6
Summary:
Codemod generated with clangr shard mode, 50 files per diff,
clangr code(ndim()->dim()): diffusion/FBS/browse/master/fbcode/caffe2/caffe2/fb/codemods/TensorMethodRename.cpp

Reviewed By: ezyang

Differential Revision: D12935693

fbshipit-source-id: f24f1c10cd5bbb9e63cda0a0da989e6e3766380a
2018-11-07 07:30:11 -08:00
Jerry Zhang
2e1b7a6f4f Renaming dim() to size() - 1/3 (#13434)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13434

Codemod generated with clangr shard mode, 50 files per diff,
clangr code(dim->size): diffusion/FBS/browse/master/fbcode/caffe2/caffe2/fb/codemods/TensorMethodRename.cpp

Reviewed By: ezyang

Differential Revision: D12867223

fbshipit-source-id: 3e05be1a370ebd1a273bd4c70499d019fd056ac4
2018-10-31 17:43:52 -07:00
Jerry Zhang
edd902594a Renaming meta() to dtype() - 1/2 (#13333)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13333

Codemod generated with clangr shard mode, 50 files per diff,
clangr code(meta->dtype): diffusion/FBS/browse/master/fbcode/caffe2/caffe2/fb/codemods/TensorMethodRename.cpp

Reviewed By: ezyang

Differential Revision: D12845168

fbshipit-source-id: 492091963d2211ea80215200e981965767566135
2018-10-31 17:14:08 -07:00
Dmytro Dzhulgakov
47c0d88739 Bring back warning for dtype uninitialized in serialization (#13239)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13239

Previous diff missed the if (dtype_initialized) check, duh.

Also, to guard against log spam, use LOG_EVERY_MS if it's available.

Reviewed By: kennyhorror

Differential Revision: D12818938

fbshipit-source-id: 76590bd1b28010fb13f5d33423c8eac1395e9f76
2018-10-29 22:09:54 -07:00
Jerry Zhang
eea2ee6d29 Renaming size() to numel() - 1/17
Summary: Codemod generated with clangr shard mode, 25 files per diff

Reviewed By: li-roy

Differential Revision: D10866237

fbshipit-source-id: 020fcfdf52083430c5b674eda8e07ad3adfcc838
2018-10-26 15:36:59 -07:00
Edward Yang
f282fa1afe Comment out LOG(ERROR) for legacy no-dtype serialization behavior
Reviewed By: wylqc

Differential Revision: D12569279

fbshipit-source-id: 46def8ca163bcf9070a1179166fd8970e07ee229
2018-10-26 13:18:27 -07:00
Dmytro Dzhulgakov
c95fa4b904 fix dtype uninitialized tensor serialization
Summary:
See D10380678 for the discussion.

Caffe2 serialization code was able to handle dtype-uninitialized tensors as long as their numel was 0 O_O.

For safety to unblock the push I'm preserving this behavior with critical. As we fix all occurrences of old API, we can delete this test.

Reviewed By: kennyhorror

Differential Revision: D10866562

fbshipit-source-id: e172bd045fdfca660ff05b426e001f5f2f03f408
2018-10-26 01:30:47 -07:00
Michael Antonov
a6949abb15 Guard all Caffe2 protobuf string serializations with CAFFE_ENFORCE (fixed reverted bug) (#12848)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12848

Updated all non-test uses of protobuf::MessageLite::SerializeAsString to call
SerializeAsString_EnforceCheck so that the return value is checked and can
throw an exception if failing.

Most of the affected code was called from classes derived from  BlobSerializeBase.
Didn't touch most tests and ENFORCE calls because they usually do checks
anyway.
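
The idea of checking the serialization result and throwing on failure can be sketched generically. This is not the actual `SerializeAsString_EnforceCheck` helper: `SerializeOrThrow` and `ToyMessage` are hypothetical, and the sketch assumes a message type exposing the bool-returning `SerializeToString(std::string*)` form.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical wrapper: turn a silent serialization failure into an exception.
template <typename Message>
std::string SerializeOrThrow(const Message& msg) {
  std::string out;
  if (!msg.SerializeToString(&out)) {
    throw std::runtime_error("protobuf serialization failed");
  }
  return out;
}

// Toy stand-in for a message, used only to exercise both paths.
struct ToyMessage {
  bool ok;
  bool SerializeToString(std::string* out) const {
    if (!ok) {
      return false;
    }
    *out = "payload";
    return true;
  }
};
```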

Original commit changeset: c0760e73ecc7

Reviewed By: dzhulgakov

Differential Revision: D10453456

fbshipit-source-id: d2f2b7b4578e721924354149f08f627c7e3bf070
2018-10-23 16:21:26 -07:00
Junjie Bai
805f4d5cb8 Revert D10416438: Guard all Caffe2 protobuf string serializations with CAFFE_ENFORCE
Differential Revision:
D10416438

Original commit changeset: cb842e3e26b0

fbshipit-source-id: c0760e73ecc76ca9b1b74f6844e243c2df5260a2
2018-10-18 13:46:33 -07:00
Michael Antonov
63cd051867 Guard all Caffe2 protobuf string serializations with CAFFE_ENFORCE (#12799)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12799

Updated all non-test uses of protobuf::MessageLite::SerializeAsString to call
SerializeAsString_EnforceCheck so that the return value is checked and can
throw an exception if failing.

Most of the affected code was called from classes derived from  BlobSerializeBase.
Didn't touch most tests and ENFORCE calls because they usually do checks
anyway.

Reviewed By: ezyang

Differential Revision: D10416438

fbshipit-source-id: cb842e3e26b0918829d71267a375d4dd40600d58
2018-10-18 12:49:01 -07:00
Yangqing Jia
7d5f7ed270 Using c10 namespace across caffe2. (#12714)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12714

This is a short change to enable c10 namespace in caffe2. We did not enable
it before due to gflags global variable confusion, but it should have been
mostly cleaned now. Right now, the plan on record is that namespace caffe2 and
namespace aten will fully be supersets of namespace c10.

Most of the diff is codemod; the only two non-codemod changes are in caffe2/core/common.h, where

```
using namespace c10;
```

is added, and in Flags.h, where instead of creating aliasing variables in c10 namespace, we directly put it in the global namespace to match gflags (and same behavior if gflags is not being built with).

Reviewed By: dzhulgakov

Differential Revision: D10390486

fbshipit-source-id: 5e2df730e28e29a052f513bddc558d9f78a23b9b
2018-10-17 12:57:19 -07:00
Sebastian Messmer
dd7501e3a8 Remove Blob::ShareExternal from serialization (#11926)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11926

With the preparation work of diffs stacked below, we're now able to remove this call to Blob::ShareExternal(),
preparing for removing that function from Blob.

Reviewed By: dzhulgakov

Differential Revision: D9884563

fbshipit-source-id: 7dd5c5fe02be0df7a44be45587c1dd7c474126ef
2018-10-17 11:50:35 -07:00
Sebastian Messmer
6cbf1992bd Serialization takes pointers instead of Blob (#11925)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11925

This is step 1 in the refactoring to remove Blob::ShareExternal(), i.e. Blob would then always own its contents.

ShareExternal() is for example used to pass non-owning blobs to serialization. This diff prepares removing that.

Reviewed By: ezyang

Differential Revision: D9884177

fbshipit-source-id: d01df9a613a4fc62e5679fe45bfc47e2c899b818
2018-10-17 11:50:34 -07:00
Lu Fang
30aaa07594 New serialization format (#12384)
Summary:
Addressed Dima's feedback.

The proposal is here: https://fb.quip.com/TbQmAuqIznCf
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12384

Reviewed By: dzhulgakov

Differential Revision: D10246743

Pulled By: houseroad

fbshipit-source-id: c80db0c35d60ca32965275da705f2b1dfb2a7265
2018-10-16 16:36:58 -07:00
Yangqing Jia
713e706618 Move exception to C10 (#12354)
Summary:
There is still some work to be done:

- Move logging and unify AT_WARN with LOG(ERROR).
- A few header files are still being plumbed through, need cleaning.
- caffe2::EnforceNotMet aliasing is not done yet.
- need to unify the macros. See c10/util/Exception.h

This is mainly a codemod and not causing functional changes. If you find your job failing and trace back to this diff, usually it can be fixed by the following approaches:

(1) add //caffe2/c10:c10 to your dependency (or transitive dependency).
(2) change objects such as at::Error, at::Optional to the c10 namespace.
(3) change functions to the c10 namespace. Especially, caffe2::MakeString is not overridden by the unified c10::str function. Nothing else changes.

Please kindly consider not reverting this diff - it involves multiple rounds of rebasing and the fix is usually simple. Contact jiayq@ or AI Platform Dev for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/12354

Reviewed By: orionr

Differential Revision: D10238910

Pulled By: Yangqing

fbshipit-source-id: 7794d5bf2797ab0ca6ebaccaa2f7ebbd50ff8f32
2018-10-15 13:33:18 -07:00
Jerry Zhang
7724807551 Remove ExtractDeviceOption from StaticContext (#12304)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12304

- make ExtractDeviceOption to be a free function.
- Add a Storage(at::Device) constructor in order to preserve the device_id.

Reviewed By: dzhulgakov

Differential Revision: D10069839

fbshipit-source-id: a5f3994a39bdf1b7503b39bb42c228e438b52bfa
2018-10-10 14:12:16 -07:00
Yangqing Jia
38f3d1fc40 move flags to c10 (#12144)
Summary:
Still in flux.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12144

Reviewed By: smessmer

Differential Revision: D10140176

Pulled By: Yangqing

fbshipit-source-id: 1a313abed022039333e3925d19f8b3ef2d95306c
2018-10-04 02:09:56 -07:00
Dmytro Dzhulgakov
1d3f650ce4 Revert D10098106: [pytorch][PR] [WIP] New version of PT1 model format
Differential Revision:
D10098106

Original commit changeset: 94ec7fc57c84

fbshipit-source-id: 38f729b0970618f38359797b806cbbcd865f4715
2018-10-02 00:43:40 -07:00
Lu Fang
35becd1879 New version of PT1 model format (#12149)
Summary:
Considered four different existing formats: 1) static graph, 2) torch script, 3) pickle files, 4) PyTorch C++ serialize APIs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12149

Reviewed By: BIT-silence

Differential Revision: D10098106

Pulled By: houseroad

fbshipit-source-id: 94ec7fc57c842e50fae5286ddeda657a4967a07a
2018-10-01 15:57:02 -07:00
Jerry Zhang
006171fffc Back out "[pytorch][PR] Revert "Move CreateContext to global registry (#11688)"" (#12121)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12121

Pull Request resolved: https://github.com/pytorch/pytorch/pull/12055

Original commit changeset: 6ca9de65b707

Reviewed By: ezyang

Differential Revision: D10033396

fbshipit-source-id: ca9f4b2f7ef0561f619b833415d394a8b9972bf4
2018-10-01 11:10:46 -07:00
Yangqing Jia
9c49bb9ddf Move registry fully to c10 (#12077)
Summary:
This does 6 things:

- add c10/util/Registry.h as the unified registry util
  - cleaned up some APIs such as export condition
- fully remove aten/core/registry.h
- fully remove caffe2/core/registry.h
- remove a bogus aten/registry.h
- unifying all macros
- set up registry testing in c10

Also, an important note that we used to mark the templated Registry class as EXPORT - this should not happen, because one should almost never export a template class. This PR fixes that.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12077

Reviewed By: ezyang

Differential Revision: D10050771

Pulled By: Yangqing

fbshipit-source-id: 417b249b49fed6a67956e7c6b6d22374bcee24cf
2018-09-27 03:09:54 -07:00
Sebastian Messmer
8f0db9bbbb Removing some dependency edges from Blob to other caffe2 (#12043)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12043

Re-trying D9979976, this time with all call sites fixed.

D9979976 got reverted because there was a call site that wasn't covered by sandcastle it seems.
I fixed it and used 'grep' to ensure there aren't any more call sites in fbsource.

Reviewed By: ezyang

Differential Revision: D10026392

fbshipit-source-id: cd341514a8e53a40147ea0ee3e52f63bb6444157
2018-09-25 11:40:24 -07:00
Edward Yang
d7e11e3aae Revert "Move CreateContext to global registry (#11688)" (#12049)
Summary:
This reverts commit 3ae6ee4ebd.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12049

Differential Revision: D10030954

Pulled By: ezyang

fbshipit-source-id: 6ca9de65b707c5b4c68280fc6f1b8e5ad7251efc
2018-09-25 10:13:43 -07:00
Maciej Bargiel
2cdf98a74d Back out "Removing some dependency edges from Blob to other caffe2"
Summary: The controller you requested could not be found. Original commit changeset: 2ea17724e223

Differential Revision:
D10026321
Ninja: stable broken

fbshipit-source-id: faf87cb7cc0f78c2c10d4aa6fceea279cd27acd6
2018-09-25 01:11:14 -07:00
Sebastian Messmer
17a65bf9b6 Removing some dependency edges from Blob to other caffe2 (#11923)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11923

This is pre-work to allow moving Blob to ATen/core, which cannot depend on caffe2 anymore.
(1) Removing the Blob -> Tensor dependency allows us to move Blob to ATen/core and use it inside IValue without having to wait for the Tensor merge to be complete.
(2) In the final Blob design, we want it to be a very small class that doesn't have any special treatment for Tensor (or to be more correct, doesn't allow storing Tensor anymore), so this is anyhow the direction we want to go.

This changes call sites that will have to be moved to IValue later, but they cannot be moved to IValue directly, because for that, IValue first needs to be able to store Blob, which in turn first needs this diff and some other changes coming up in future diffs.

Codemods:
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)\\.IsTensorType\\(" "BlobIsTensorType(\\1, "
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)->IsTensorType\\(" "BlobIsTensorType(*\\1, "
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)\\.GetMutableTensor\\(" "BlobGetMutableTensor(\\1, "
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)->GetMutableTensor\\(" "BlobGetMutableTensor(*\\1, "

It is, however, not only these codemods, because regex-based refactoring was only able to match a small number of the call sites. To catch more, I would've needed an AST-aware tool like clangr, which I didn't figure out how to use.
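
For illustration, the first codemod above can be reproduced with `std::regex`. `RewriteIsTensorType` is a hypothetical helper, and the second call site in the usage note is only one plausible example of a pattern the regex cannot match:

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper mirroring the first codemod:
//   foo.IsTensorType(  ->  BlobIsTensorType(foo,
std::string RewriteIsTensorType(const std::string& source) {
  static const std::regex pattern(
      R"re(([a-zA-Z0-9_]+)\.IsTensorType\()re");
  return std::regex_replace(source, pattern, "BlobIsTensorType($1, ");
}
```

A plain identifier receiver such as `blob.IsTensorType(CPU)` becomes `BlobIsTensorType(blob, CPU)`, while a receiver that is itself an expression, such as `ws.GetBlob(n).IsTensorType(CPU)`, is left untouched because the text before the dot is not an identifier.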

Reviewed By: ezyang

Differential Revision: D9979976

fbshipit-source-id: 2ea17724e223b5b73b44f99362727759ca689e61
2018-09-24 22:57:05 -07:00
Jerry Zhang
3ae6ee4ebd Move CreateContext to global registry (#11688)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11688

As a first step toward removing the static context (merging it with the allocator), we'll create
global registries for context constructors and remove the CreateContext function from tensor.

Reviewed By: ezyang, dzhulgakov

Differential Revision: D9779821

fbshipit-source-id: 8b239ea50af7a0556fde2382f58f79194f0e3dc1
2018-09-24 17:07:50 -07:00
Christian Puhrsch
a6630e25af Remove many caffe2::TIndex and replace them with int64_t (#11943)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11943

See title

Reviewed By: ezyang

Differential Revision: D9992645

fbshipit-source-id: e8f80d6ea762971513e5e8072975ceea53e1f11a
2018-09-22 18:11:04 -07:00
Sebastian Messmer
b2b05b7c20 Move blob serialization to free functions (#11817)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11817

Blob::Serialize() and Blob::Deserialize() are now free functions SerializeBlob(), DeserializeBlob() instead.
This takes away access to Blob internals from them and makes future refactorings easier.

Reviewed By: ezyang

Differential Revision: D9882726

fbshipit-source-id: 3251ebd4b53fc12f5e6924a6e4a8db3846ab3729
2018-09-20 23:27:34 -07:00
Roy Li
30521a37ad codemod: caffe::float16 -> at::Half (#11785)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11785

Replace each instance of float16 with Half.

Reviewed By: Yangqing

Differential Revision: D9892158

fbshipit-source-id: b9225ca7bd5c84fd1c04a9d24b026c8b6cbff120
2018-09-20 18:55:19 -07:00
Edward Yang
5765549155 codemod -d caffe2 --extensions cc,h CaffeTypeId TypeIdentifier (#10166)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10166

TypeIdentifier is still easy to codemod away from

Reviewed By: smessmer

Differential Revision: D9132840

fbshipit-source-id: bc83a8b17b2e7c19c9d2c9cfe5c7ce6ec1d8cec5
2018-08-02 11:54:30 -07:00
Jerry Zhang
aebf3b47ae Remove template parameter from Tensor (#9939)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9939

Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13

Pull Request resolved: https://github.com/pytorch/translate/pull/166

Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125

Closes https://github.com/pytorch/pytorch/pull/9125

Use inheritance for polymorphism, and remove template parameter
This is to change the templating in call sites, the core implementations will change later

Before Caffe2 Tensor class was compile-time fixed to bind to a particular device/context. With this change, we're making it a runtime property (stored inside the tensor), but preserve the same semantics. For example, one has to specify device type in order to create a Tensor - there are no uninitialized tensors. More specifically the changes are:

1. We added an extra argument *DeviceType* to most of the constructors of the tensor, e.g. (Tensor(DeviceType type)),
2. Semantics of constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context); is changed, in this constructor, the second context is passed in to enable us to call the templated Copy function, it could be in a different context as source and target previously, now we'll enforce that the context should have same device type as src, if it is provided.
3. To preserve 'get-or-construct' semantics of Blob, we added specialized getter Blob::GetMutableTensor that verifies both that Blob contains a Tensor and that it's of a correct type
4. Specifically, Tensor type is not default-constructible any more (as we don't have unknown device tensors) and thus some of the code handling STL containers needs to change
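
A minimal sketch of what points 1 and 4 imply for calling code, using hypothetical `ToyDeviceType` and `DeviceTensor` types rather than the real Caffe2 classes: the device becomes a constructor argument stored in the object, and containers can no longer rely on default construction.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

enum class ToyDeviceType { CPU, CUDA };

// Hypothetical tensor whose device is a runtime property, not a template
// parameter.
class DeviceTensor {
 public:
  explicit DeviceTensor(ToyDeviceType type) : type_(type) {}
  ToyDeviceType GetDeviceType() const { return type_; }

 private:
  ToyDeviceType type_;
};

// There is no "unknown device" tensor, so default construction is unavailable.
static_assert(!std::is_default_constructible<DeviceTensor>::value,
              "DeviceTensor requires a device type");

// std::vector<DeviceTensor>(n) would not compile; an explicit prototype
// element has to be supplied instead.
std::vector<DeviceTensor> MakeTensors(std::size_t n, ToyDeviceType type) {
  return std::vector<DeviceTensor>(n, DeviceTensor(type));
}
```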

Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.

Reviewed By: ezyang, houseroad

Differential Revision: D9024330

fbshipit-source-id: e0b8295d2dc6ebe2963383ded5af799ad17164ba
2018-07-27 10:56:39 -07:00
Jerry Zhang
969b62f276 Revert D8121878: Remove template parameter from Tensor
Differential Revision:
D8121878

Original commit changeset: 4a5e9a677ba4

fbshipit-source-id: d8e2c0bb145b52fbcca323b22d1d3346f0b3249e
2018-07-26 14:02:04 -07:00
Jerry Zhang
cd5adc7b5f Remove template parameter from Tensor (#13)
Summary:
Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13

Pull Request resolved: https://github.com/pytorch/translate/pull/166

Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125

Closes https://github.com/pytorch/pytorch/pull/9125

Use inheritance for polymorphism, and remove template parameter
This is to change the templating in call sites, the core implementations will change later

Before Caffe2 Tensor class was compile-time fixed to bind to a particular device/context. With this change, we're making it a runtime property (stored inside the tensor), but preserve the same semantics. For example, one has to specify device type in order to create a Tensor - there are no uninitialized tensors. More specifically the changes are:

1. We added an extra argument *DeviceType* to most of the constructors of the tensor, e.g. (Tensor(DeviceType type)),
2. Semantics of constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context); is changed, in this constructor, the second context is passed in to enable us to call the templated Copy function, it could be in a different context as source and target previously, now we'll enforce that the context should have same device type as src, if it is provided.
3. To preserve 'get-or-construct' semantics of Blob, we added specialized getter Blob::GetMutableTensor that verifies both that Blob contains a Tensor and that it's of a correct type
4. Specifically, Tensor type is not default-constructible any more (as we don't have unknown device tensors) and thus some of the code handling STL containers needs to change

Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.

Reviewed By: xw285cornell

Differential Revision: D8121878

fbshipit-source-id: 4a5e9a677ba4ac82095df959851a054c81eccf81
2018-07-26 10:25:23 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Ilia Cherniavskii
1149b9bbb5 Polling async net executor
Summary:
Implementation of polling async net executor.
Notes:
- New net executor async_polling - schedules CPU and GPU ops asynchronously, uses single polling thread
- Events: update to Caffe2 events to support async CPU events, adding new methods:
 Query() - non-blocking checking of event states: INITIALIZED -> RECORDED -> SUCCESS/FAILED
 ErrorMessage() - when operation runs asynchronously and fails calling this on event will give error message
- Tasks: using existing DAGNet's algorithm to compute CPU and GPU chains, a separate task for each chain
- Polling: using single thread to query state of events - for CPU tasks atomically queries task state, for GPU task - uses cudaEventQuery; using Event
- Scheduling of CPU ops: using global thread pools
- Scheduling of GPU ops: using GPU thread pool per GPU device
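
The non-blocking query described in the notes above can be sketched with an atomic state per event. `ToyEvent` and `CountFinished` are hypothetical and cover only the CPU case; the real executor also queries GPU events through cudaEventQuery, which this sketch omits.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

enum class EventState { INITIALIZED, RECORDED, SUCCESS, FAILED };

// Hypothetical CPU-side event: the task that owns it stores the state, and
// the polling thread only reads it.
struct ToyEvent {
  std::atomic<EventState> state{EventState::INITIALIZED};

  // Non-blocking check of the current state.
  EventState Query() const { return state.load(std::memory_order_acquire); }
};

// One pass of a polling loop: count the events that reached a final state.
std::size_t CountFinished(const std::vector<const ToyEvent*>& events) {
  std::size_t finished = 0;
  for (const ToyEvent* event : events) {
    const EventState s = event->Query();
    if (s == EventState::SUCCESS || s == EventState::FAILED) {
      ++finished;
    }
  }
  return finished;
}
```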

Reviewed By: dzhulgakov

Differential Revision: D5985110

fbshipit-source-id: a9de7fcbb71d046a3aa1b573072b89a65dfeee8c
2017-11-03 07:27:44 -07:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Victor Gao
34be12353b comment out unused parameters
Summary: This uses `clang-tidy` to comment out unused parameters (in functions, methods and lambdas) in fbcode. Cases that the tool failed to handle are fixed manually.

Reviewed By: igorsugak

Differential Revision: D5454343

fbshipit-source-id: 5dee339b4334e25e963891b519a5aa81fbf627b2
2017-07-21 15:14:43 -07:00
Artem Volkhin
54e8ef14fb add flag caffe2_serialize_fp16_as_bytes
Reviewed By: kennyhorror

Differential Revision: D5403218

fbshipit-source-id: 755e7a709880f54096a6e5e661554614fc2cc585
2017-07-11 22:20:36 -07:00
Andrey Malevich
12965a4108 Add Poorman's IOBound ThreadPool for serialization.
Summary:
At the moment serialization can take up to 3x memory of the largest blob:
original blob, BlobProto, SerializeAsString version of the blob. As a result in
certain cases serialization takes more memory than it should and it hurts
utilization/max model size per machines.

This diff is adding IOBound ThreadPool that should set quite strict limitation
on the extra memory overhead per one blob.

Reviewed By: dzhulgakov

Differential Revision: D5012887

fbshipit-source-id: 12dbb9d3efab136411ddeffd519b602cf606661e
2017-05-08 06:43:31 -07:00
Dmytro Dzhulgakov
8a35fea9eb Improve error message for not found operator
Summary: Seems like a lot of confusion in the group lately has been about missing CUDA operators. Let's make it clearer in the error message.

Reviewed By: azzolini

Differential Revision: D4737037

fbshipit-source-id: 56c7819df909bf954510296703bff5f221fa8ae7
2017-03-21 10:35:00 -07:00
Dmytro Dzhulgakov
864f561525 Make BlobDeserialization throw exceptions instead of returning bool
Summary: Makes it much nicer to spot errors, especially in iPython notebook.

Reviewed By: kennyhorror

Differential Revision: D4465726

fbshipit-source-id: c0adaf5168248a70987ff9d5dfce54a622ff2219
2017-01-26 09:44:19 -08:00
Dmytro Dzhulgakov
65f7c915fd Fix non-chunked Blob::Serialize method
Summary: Previous implementation was just concatenating strings, which I believe is wrong. Instead let's turn off chunking when we don't ask for it.

Reviewed By: kennyhorror

Differential Revision: D4461311

fbshipit-source-id: 8b9a3325a40a1cd0a8ffeeb20a17bf9f57b7b0a9
2017-01-25 11:14:51 -08:00