Summary:
The cases are found out by compiling against clang on Windows.
Those functions will still be exported under this case, which is a waste of space in the symbol table.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62952
Reviewed By: gchanan
Differential Revision: D30191291
Pulled By: ezyang
fbshipit-source-id: 3319b0ec4f5fb02e0fe1b81dbbcedcf12a0c795e
Summary:
As GoogleTest `TEST` macro is non-compliant with it as well as `DEFINE_DISPATCH`
All changes but the ones to `.clang-tidy` are generated using following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`; do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008
Reviewed By: driazati, r-barnes
Differential Revision: D29838584
Pulled By: malfet
fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61006
Test Plan: Modified existing unit test to test for eps = 0. It would fail without the equality test first.
Reviewed By: ajyu
Differential Revision: D29423770
fbshipit-source-id: 168e7de00d8522c4b646a8335d0120700915f260
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60208
Update the DB APIs so that `db::Transaction::Put()` accepts the value by
rvalue reference. This allows DB implementations to write data asynchronously
without being forced to make an additional copy of the data in memory.
`Put()` implementations can now use the string move constructor or assignment
operator to get the string data and continue performing the write
asynchronously after returning from `Put()`.
Note that I chose to entirely replace the existing `Put()`, removing the
ability for callers to call `Put()` with a `const std::string&` argument for
the value, rather than simply adding another overloaded version of `Put()`.
This was done because in practice there were no call sites using `Put()` that
cannot move in their data. Eliminating the `const std::string&` API entirely
simplifies the DB implementations: DBs that wish do support move semantics do
not have to implement both the move and the copy versions of `Put()`.
Test Plan:
Searched through fbcode to try and make sure I found all `db::Transaction`
subclasses, and will check sandcastle results to help confirm.
Ran the modelstore checkpointing unit tests.
Differential Revision: D29204425
fbshipit-source-id: 28be6646e92e5df71954d4bb3dc0c8add30ed041
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60207
Update the `BlobSerializerBase` API so that the serizialized blob data is
passed as a `std::string&&` rather than `const std::string&`. This allows the
acceptor to take ownership of the string data. This allows the acceptor to do
things like queue it for storing asynchronously, rather than having to make a
copy of the data if they need it to remain valid after returning.
All existing `BlobSerializerBase` implementations already pass in a valid
rvalue reference to the data, so this change did not require updating any of
the existing serializer implementations.
ghstack-source-id: 132216750
Test Plan:
Examined all ~46 `BlobSerializerBase` subclasses in fbsource to confirm they
already pass in an rvalue reference for this argument. Also searched for
`BlobSerializerBase` on google and did not find any external references to
this class in other open source projects that might be affected.
Differential Revision: D29204426
fbshipit-source-id: b1d567e52a5c17a01d651c70bbfa2fddbaea6cd9
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os
def get_compiled_files_list():
import json
with open("build/compile_commands.json") as f:
data = json.load(f)
files = [os.path.relpath(node['file']) for node in data]
for idx, fname in enumerate(files):
if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
return files
def run_clang_tidy(fname):
check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"])
changes = check_output(["git", "ls-files", "-m"])
if len(changes) == 0:
return
check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"])
def main():
git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
compiled_files = get_compiled_files_list()
for idx, fname in enumerate(git_files):
if fname not in compiled_files:
continue
if fname.startswith("caffe2/contrib/aten/"):
continue
print(f"[{idx}/{len(git_files)}] Processing {fname}")
run_clang_tidy(fname)
if __name__ == "__main__":
main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53402
Add an `options` field to the `Save` operator which accepts options for how to
serialize different blobs. At the moment this simply allows controlling the
existing `chunk_size` behavior, but in the future we can add other options,
such as the ability to control compression settings or other serialization
formats.
ghstack-source-id: 123567034
Test Plan:
Added a new test to `load_save_test.py` that passes in options and verifies
that blobs were serialized with the expected number of chunks.
buck test caffe2/caffe2:caffe2_test_cpu \
caffe2/caffe2/core:serialization_test \
caffe2/caffe2/python/operator_test:load_save_test
Reviewed By: mraway
Differential Revision: D26502577
fbshipit-source-id: 6e302e530bb96990517c2e35c505db7f14a56284
Summary:
Fixes incorrect usages of symbol annotations including:
1. Exporting or importing a function/class in an anonymous namespace.
2. Exporting or importing a function/class implementation in a header file. However, by removing the symbol annotations, they are now local symbols. If they need to be remain global, I can move the implementations to the source file.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35364
Differential Revision: D20670031
Pulled By: ezyang
fbshipit-source-id: cd8018dee703e2424482c27fe9608e040d8105b8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15418
Previously we are using Resize + ShareData.
Instead, we'll create a function on Tensor that clones itself with same storage.
Suppose we want `t` to `ShareData` with `t0`, Previous:
```
Tensor t(dims, CPU);
t.Resize(t0.sizes());
t.ShareData(t0);
```
Now:
```
Tensor t = t0.Alias();
```
Reviewed By: dzhulgakov
Differential Revision: D13507609
fbshipit-source-id: 6e4275d02f4c3356cbce91127f1b01111dc86b9f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14197
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13642
Previously we pass in a patially initialized Tensor to Deserialize and it will fill
it with the result of deserialization of a tensor proto. Now we want it to return
a Tensor directly since it's just a shared pointer to TensorImpl.
Reviewed By: dzhulgakov
Differential Revision: D12874357
fbshipit-source-id: 12b80a763375da23cfa64a74d6bc186d8d03b94f
Summary:
See D10380678 for the discussion.
Caffe2 serialization code was able to handle dtype uninitalized tensor as long as their numel was 0 O_O.
For safety to unblock the push I'm preserving this behavior with critical. As we fix all occurrences of old API, we can delete this test.
Reviewed By: kennyhorror
Differential Revision: D10866562
fbshipit-source-id: e172bd045fdfca660ff05b426e001f5f2f03f408
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12656
I originally wanted to do this in two steps, but deleting the Storage-only
constructor also changes the default numel state (which breaks tests),
so easiest to do it all in one go.)
- I still need a way to compute the correct TensorTypeId for all of the
Caffe2 constructors; rather than hard-code it, I wrote a function
in at::detail::computeTensorTypeId() to do this calculation. Maybe
this function could be used more widely, but for now, it's used
by Caffe2 only.
- Added a pile more TensorTypeId for all of Caffe2's supported DeviceTypes
- Because I still can't put arbitrary TypeMeta in TensorOptions, the
TensorTypeId() calculation doesn't respect dtype. For now, this is
not a problem, but this might block work to split non-POD dtypes
into their own TensorTypeId.
Reviewed By: li-roy
Differential Revision: D10380678
fbshipit-source-id: 10c5d12020596fc9f27d5579adffad00513af363
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12848
Updated all non-test uses of protobuf::MessageLite::SerializeAsString to call
SerializeAsString_EnforceCheck so that the return value is checked and can
throw an exception if failing.
Most of the affected code was called from classes derived from BlobSerializeBase.
Didn't touch most tests and ENFORCE calls because they usually do checks
anyway.
Original commit changeset: c0760e73ecc7
Reviewed By: dzhulgakov
Differential Revision: D10453456
fbshipit-source-id: d2f2b7b4578e721924354149f08f627c7e3bf070
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12799
Updated all non-test uses of protobuf::MessageLite::SerializeAsString to call
SerializeAsString_EnforceCheck so that the return value is checked and can
throw an exception if failing.
Most of the affected code was called from classes derived from BlobSerializeBase.
Didn't touch most tests and ENFORCE calls because they usually do checks
anyway.
Reviewed By: ezyang
Differential Revision: D10416438
fbshipit-source-id: cb842e3e26b0918829d71267a375d4dd40600d58
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12407
We want to use tensor factory to refactor the caffe2's old way of initialize Tensor by Resize and mutable_data
in order to eliminate uninitialized Tensor.
Previously when we want to create a Tensor in caffe2, we'll do the following
```
Tensor x(CPU); // device type provided
x.Resize({1, 2, 3}); // size provided
x.mutable_data<float>(); // data type provided and memory allocated
```
This leaves Tensor in not fully initialized state during the process, to eliminate this, we
want to provide all the needed information in the begining. ATen already has its TensorFactories: https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/TensorFactories.cpp, and there is a TensorOption, we want to adopt the same interface to ease future refactoring.
In the callsite, we used to have `Output(i)` that returns a `Blob` that contains an uninitialized `Tensor` and we'll call Resize and mutable_data afterwards to provide dimension and data type,
```
// uninitialized tensor
auto* Y = Output(0);
// set dimensions
Y->Resize({1, 2, 3});
// actually allocate the data
auto* data = Y->mutable_data<float>();
// After this step, Tensor is fully initialized.
```
We want to change it to the following:
```
// provide dimensions and TensorOptions which include device type and data type.
// This will set all the information of Tensor properly and also allocate memory.
auto* Y = Output(0, {1, 2, 3}, at::device({context_.device_type()}).template dtype<T>());
// Tensor is fully initialized after this step
// following `mutable_data` call won't allocate memory.
auto* data = Y->mutable_data<float>();
```
microbenchmarks
```
============================================================================
caffe2/caffe2/fb/benchmarks/core_overhead_benchmark.ccrelative time/iter iters/s
============================================================================
OperatorNewOutputTensorAPI 3.27us 306.05K
OperatorOldOutputTensorAPI 3.55us 281.54K
============================================================================
```
Reviewed By: ezyang
Differential Revision: D10207890
fbshipit-source-id: f54ddacaa057b7c6bc7d5a8290171f35e9e40e29
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12714
This is a short change to enable c10 namespace in caffe2. We did not enable
it before due to gflags global variable confusion, but it should have been
mostly cleaned now. Right now, the plan on record is that namespace caffe2 and
namespace aten will fully be supersets of namespace c10.
Most of the diff is codemod, and only two places of non-codemod is in caffe2/core/common.h, where
```
using namespace c10;
```
is added, and in Flags.h, where instead of creating aliasing variables in c10 namespace, we directly put it in the global namespace to match gflags (and same behavior if gflags is not being built with).
Reviewed By: dzhulgakov
Differential Revision: D10390486
fbshipit-source-id: 5e2df730e28e29a052f513bddc558d9f78a23b9b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11925
This is step 1 in the refactoring to remove Blob::ShareExternal(), i.e. Blob would then always own its contents.
ShareExternal() is for example used to pass non-owning blobs to serialization. This diff prepares removing that.
Reviewed By: ezyang
Differential Revision: D9884177
fbshipit-source-id: d01df9a613a4fc62e5679fe45bfc47e2c899b818
Summary:
There are still a few work to be done:
- Move logging and unify AT_WARN with LOG(ERROR).
- A few header files are still being plumbed through, need cleaning.
- caffe2::EnforceNotMet aliasing is not done yet.
- need to unify the macros. See c10/util/Exception.h
This is mainly a codemod and not causing functional changes. If you find your job failing and trace back to this diff, usually it can be fixed by the following approaches:
(1) add //caffe2/c10:c10 to your dependency (or transitive dependency).
(2) change objects such as at::Error, at::Optional to the c10 namespace.
(3) change functions to the c10 namespace. Especially, caffe2::MakeString is not overridden by the unified c10::str function. Nothing else changes.
Please kindly consider not reverting this diff - it involves multiple rounds of rebasing and the fix is usually simple. Contact jiayq@ or AI Platform Dev for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12354
Reviewed By: orionr
Differential Revision: D10238910
Pulled By: Yangqing
fbshipit-source-id: 7794d5bf2797ab0ca6ebaccaa2f7ebbd50ff8f32
Summary:
This does 6 things:
- add c10/util/Registry.h as the unified registry util
- cleaned up some APIs such as export condition
- fully remove aten/core/registry.h
- fully remove caffe2/core/registry.h
- remove a bogus aten/registry.h
- unifying all macros
- set up registry testing in c10
Also, an important note that we used to mark the templated Registry class as EXPORT - this should not happen, because one should almost never export a template class. This PR fixes that.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12077
Reviewed By: ezyang
Differential Revision: D10050771
Pulled By: Yangqing
fbshipit-source-id: 417b249b49fed6a67956e7c6b6d22374bcee24cf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12043
Re-trying D9979976, this time with all call sites fixed.
D9979976 got reverted because there was a call site that wasn't covered by sandcastle it seems.
I fixed it and used 'grep' to ensure there aren't any more call sites in fbsource.
Reviewed By: ezyang
Differential Revision: D10026392
fbshipit-source-id: cd341514a8e53a40147ea0ee3e52f63bb6444157
Summary: The controller you requested could not be found. Original commit changeset: 2ea17724e223
Differential Revision:
D10026321
Ninja: stable broken
fbshipit-source-id: faf87cb7cc0f78c2c10d4aa6fceea279cd27acd6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11923
This is pre-work to allow moving Blob to ATen/core, which cannot depend on caffe2 anymore.
(1) Removing the Blob -> Tensor dependency allows us to move Blob to ATen/core and use it inside IValue without having to wait for the Tensor merge to be complete.
(2) In the final Blob design, we want it to be a very small class that doesn't have any special treatment for Tensor (or to be more correct, doesn't allow storing Tensor anymore), so this is anyhow the direction we want to go.
This changes call sites that will have to be moved to IValue later, but they cannot be moved to IValue directly, because for that, IValue first needs to be able to store Blob, which in turn first needs this diff and some other changes coming up in future diffs.
Codemods:
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)\\.IsTensorType\\(" "BlobIsTensorType(\\1, "
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)->IsTensorType\\(" "BlobIsTensorType(*\\1, "
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)\\.GetMutableTensor\\(" "BlobGetMutableTensor(\\1, "
$ codemod --extensions h,hpp,c,cpp,cc "([a-zA-Z0-9_]+)->GetMutableTensor\\(" "BlobGetMutableTensor(*\\1, "
It is, however, not only these codemods because regex based refactoring was only able to match a small amount of the call sites. To catch more, I wouldn've needed a AST aware tool like clangr, which I didn't figure out how to use.
Reviewed By: ezyang
Differential Revision: D9979976
fbshipit-source-id: 2ea17724e223b5b73b44f99362727759ca689e61
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11817
Blob::Serialize() and Blob::Deserialize() are now free functions SerializeBlob(), DeserializeBlob() instead.
This takes away access to Blob internals from them and makes future refactorings easier.
Reviewed By: ezyang
Differential Revision: D9882726
fbshipit-source-id: 3251ebd4b53fc12f5e6924a6e4a8db3846ab3729
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11167
Narrow the Blob API as preparation for merging Blob/IValue
- get rid of templated IsType and Operator::InputIsType / OutputIsType
- Use 'using' instead of 'typedef' for DestroyCall (just for readability)
Reviewed By: ezyang
Differential Revision: D9623916
fbshipit-source-id: 952f0b0cf5a525094b02e8d2798dd57a56a9e1d8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11254
Previously we use DeviceType in caffe2.proto directly, but it's an `enum` and have implicit conversion to int, which does not have type safety, e.g. we have to explicitly check for a device type is valid in event.h:
```
template <int d>
struct EventCreateFunctionRegisterer {
explicit EventCreateFunctionRegisterer(EventCreateFunction f) {
static_assert(d < MaxDeviceTypes, "");
Event::event_creator_[d] = f;
}
};
```
at::DeviceType is an `enum class`, and it does not have implicit conversion to int, and provides better type safety guarantees. In this diff we have done the following refactor(taking CPU as an example):
1. caffe2::DeviceType → caffe2::DeviceTypeProto
2. caffe2::CPU → caffe2::PROTO_CPU
3. caffe2::DeviceType = at::DeviceType
4. caffe2::CPU = at::DeviceType::CPU
codemod -d caffe2/caffe2 --extensions h,cc,cpp 'device_type\(\), ' 'device_type(), PROTO_'
+ some manual changes
In short, after this diff, in c++, caffe2::CPU refers to the at::DeviceType::CPU and the old proto caffe2::CPU will be caffe2::PROTO_CPU.
In python side, we have a temporary workaround that alias `caffe2_pb2.CPU = caffe2_pb2.PROOT_CPU` to make the change easier to review and this will be removed later.
Reviewed By: ezyang
Differential Revision: D9545704
fbshipit-source-id: 461a28a4ca74e616d3ee183a607078a717fd38a7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10804
Make ShareData and ShareExternalPointer to create new storage when the old one is used by multiple tensors.
When we need to modify the field of storage, we'll create a new storage instead.
Reviewed By: ezyang
Differential Revision: D9350686
fbshipit-source-id: 68d2b6b886b0367b0fc4fabfd55b9a480e7388ca
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9939
Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13
Pull Request resolved: https://github.com/pytorch/translate/pull/166
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125
Closes https://github.com/pytorch/pytorch/pull/9125
Use inheritance for polymorphism, and remove template parameter
This is to change the templating in call sites, the core implementations will change later
Before Caffe2 Tensor class was compile-time fixed to bind to a particular device/context. With this change, we're making it a runtime property (stored inside the tensor), but preserve the same semantics. For example, one has to specify device type in order to create a Tensor - there are no uninitialized tensors. More specifically the changes are:
1. We added an extra argument *DeviceType* to most of the constructors of the tensor, e.g. (Tensor(DeviceType type)),
2. Semantics of constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context); is changed, in this constructor, the second context is passed in to enable us to call the templated Copy function, it could be in a different context as source and target previously, now we'll enforce that the context should have same device type as src, if it is provided.
3. To preserve 'get-or-construct' semantics of Blob, we added specialized getter Blob::GetMutableTensor that verifies both that Blob contains a Tensor and that it's of a correct type
4. Specifically, Tensor type is not default-constructible any more (as we don't have unknown device tensors) and thus some of the code handling STL containers needs to change
Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.
Reviewed By: ezyang, houseroad
Differential Revision: D9024330
fbshipit-source-id: e0b8295d2dc6ebe2963383ded5af799ad17164ba