This PR enables `-Winconsistent-missing-destructor-override` and `-Winconsistent-missing-override`
and fixes violations.
<!--
copilot:summary
-->
### <samp>🤖 Generated by Copilot at 47e904e</samp>
This pull request updates the code of various classes and operators in the `caffe2` and `aten` subdirectories to use the `override` specifier instead of the `virtual` keyword for destructors and other virtual functions that override a base class function. This improves the code readability, quality, and consistency with C++ best practices. It also modifies the `./CMakeLists.txt` file to enable warnings for these specifiers, but disable errors.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104032
Approved by: https://github.com/malfet
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64285
With C++14 heterogeneous ordered container lookup, it is no longer necessary to create a `std::string` in order to look up elements of a `CaffeMap` keyed by std::string. Accordingly, this diff reworks the argument-getting operator functions to avoid that in favor of `c10::string_view`.
ghstack-source-id: 137139818
ghstack-source-id: 137139818
Test Plan: buildsizebot iOS apps -- code size win. less strings is probably marginally good for perf but this only happens at setup time anyway.
Reviewed By: dzhulgakov
Differential Revision: D26826676
fbshipit-source-id: ee653b14dc2c528bae8c90f0fc6a7a419cbca1d6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61500
libstdc++ defines a static variable called `std::__ioinit` in iostream that adds global constructor size overhead to each translation that includes iostream. To reduce the size overhead from that, we can often include ostream instead.
ghstack-source-id: 136163529
Test Plan: buildsizebot some mobile apps
Reviewed By: dhruvbird
Differential Revision: D29648016
fbshipit-source-id: 9c3139712c71248513cc5032d21e77f3ecbae8fe
Summary:
The cases are found out by compiling against clang on Windows.
Those functions will still be exported under this case, which is a waste of space in the symbol table.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62952
Reviewed By: gchanan
Differential Revision: D30191291
Pulled By: ezyang
fbshipit-source-id: 3319b0ec4f5fb02e0fe1b81dbbcedcf12a0c795e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55121
This is done by allow -1 as a stream ID, meaning "don't change
the stream", in SwitchToDevice
Fixes#54830
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: malfet
Differential Revision: D27527544
Pulled By: ezyang
fbshipit-source-id: c54983d6fc79a8fa1c65a71559a57425e40ba717
Summary:
Since caffe2 and torch have been consolidated, CAFFE2_API should be merged with TORCH_API. Addresses a TODO.
Manually edited some references of the removed `CAFFE2_API`:
* `CONTRIBUTING.md`
* `caffe2/proto/CMakeLists.txt`
* `cmake/ProtoBuf.cmake`
* `c10/macros/Export.h`
* `torch/csrc/WindowsTorchApiMacro.h`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49496
Reviewed By: malfet, samestep
Differential Revision: D25600726
Pulled By: janeyx99
fbshipit-source-id: 7e068d959e397ac183c097d7e9a9afeca5ddd782
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44145
## Motivation
* To be able to make C2 ops cancellable so we can safely exit.
* Some C2 operators are now blocking thus being non-cancellable. If an error
occurs we need to be able to safely stop all net execution so we can throw
the exception to the caller.
## Summary
* Adds `NetBase::Cancel()` to NetBase which iterates over the entire list of
operators and call Cancel.
* Cancel on all ops was added to Net since there's nothing Asyc specific about it.
* `AsyncSchedulingNet` calls parent Cancel.
* To preserve backwards compatibility, `AsyncSchedulingNet`'s Cancel still calls
`CancelAndFinishAsyncTasks` .
* Adds `Cancel()` to `OperatorBase`.
Reviewed By: dzhulgakov
Differential Revision: D23279202
fbshipit-source-id: e1bb0ff04a4e1393f935dbcac7c78c0baf728550
Summary: per title, makes c2 wrappers safer as contiguity of torch inputs is not guaranteed
Test Plan: covered by existing tests
Reviewed By: dzhulgakov
Differential Revision: D23310137
fbshipit-source-id: 3fe12abc7e394b8762098d032200778018e5b591
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41096
The spark spot model had some issues in tensor conversion, see P134598596. It happens when we convert an undefined c10 tensor to caffe2 tensor.
This diff added a null check.
Test Plan: spark spot model runs without problem
Reviewed By: smessmer
Differential Revision: D22330705
fbshipit-source-id: dfe0f29a48019b6611cad3fd8f2ae49e8db5427e
Summary:
If virtual function is implemented in header file, it's implementation will be included as a weak symbol to every shared library that includes this header along with all of it's dependencies.
This was one of the reasons why size of libcaffe2_module_test_dynamic.so was 500Kb (AddRelatedBlobInfo implementation pulled a quarter of libprotobuf.a with it)
Combination of this and https://github.com/pytorch/pytorch/issues/40845 reduces size of `libcaffe2_module_test_dynamic.so` from 500kb to 50Kb.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40844
Differential Revision: D22334725
Pulled By: malfet
fbshipit-source-id: 836a4cbb9f344355ddd2512667e77472546616c0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37101Fixes#36954.
The basic concept is to streamline the process of rethrowing
c10::Error with extra error information. This is in a few
steps:
- I completely remodeled the Error data type and the internal
invariants. Instead of manually adding in newlines, the
message stack formatting process is responsible for inserting
newlines and spacing as necessary. Call sites are then
modified to respect the new API model.
- TORCH_RETHROW macro is added, which adds context to an error
message and then rethrows it.
New internal assert failure looks like:
```
0 INTERNAL ASSERT FAILED at ../c10/test/util/exception_test.cpp:64, please report a bug to PyTorch.
Exception raised from TestBody at ../c10/test/util/exception_test.cpp:64 (most recent call first):
frame #0: <unknown function> + 0x6aab9 (0x7ff611d3aab9 in /data/users/ezyang/pytorch-tmp/build/lib/libc10.so)
frame #1: ...
```
Error message with context looks like:
```
This is an error
This is context 1
This is context 2
```
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D21202891
Pulled By: ezyang
fbshipit-source-id: 361cadd16bc52e5886dba08e79277771ada76169
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35187
When I touch these files, lint will always introduce some unintended change, to prevent it from happening, we need to format the code first.
change is generated by:
arc f
Test Plan: integration test.
Differential Revision: D20587596
fbshipit-source-id: 512cf6b86bd6632a61c80ed53e3a9e229feecc2a
Summary:
This is a realand of https://github.com/pytorch/pytorch/pull/36196
Before the fix bazel spews following multi-line warning for every single caffe2 operator:
```
In file included from ./c10/util/logging_is_google_glog.h:50,
from ./c10/util/Logging.h:26,
from ./caffe2/core/logging.h:2,
from ./caffe2/core/blob.h:13,
from ./caffe2/core/operator.h:18,
from ./caffe2/sgd/adadelta_op.h:1,
from caffe2/sgd/adadelta_op.cc:1:
bazel-out/k8-fastbuild/bin/external/com_github_glog/_virtual_includes/glog/glog/logging.h: In instantiation of 'std::string* google::Check_LTImpl(const T1&, const T2&, const char*) [with T1 = int; T2 = long unsigned int; std::string = std::__cxx11::basic_string<char>]':
./caffe2/core/operator.h:192:5: required from 'const T& caffe2::OperatorBase::Input(int, caffe2::DeviceType) [with T = caffe2::Tensor; caffe2::DeviceType = c10::DeviceType]'
./caffe2/core/operator.h:890:48: required from 'const caffe2::Tensor& caffe2::Operator<Context>::Input(int, caffe2::DeviceType) [with Context = caffe2::CPUContext; caffe2::DeviceType = c10::DeviceType]'
./caffe2/sgd/adadelta_op.h:87:5: required from 'bool caffe2::SparseAdadeltaOp<Context>::RunOnDevice() [with Context = caffe2::CPUContext]'
./caffe2/sgd/adadelta_op.h:85:8: required from here
bazel-out/k8-fastbuild/bin/external/com_github_glog/_virtual_includes/glog/glog/logging.h:722:32: warning: comparison of integer expressions of different signedness: 'const int' and 'const long unsigned int' [-Wsign-compare]
722 | DEFINE_CHECK_OP_IMPL(Check_LT, < )
| ^
bazel-out/k8-fastbuild/bin/external/com_github_glog/_virtual_includes/glog/glog/logging.h:148:53: note: in definition of macro 'GOOGLE_PREDICT_TRUE'
148 | #define GOOGLE_PREDICT_TRUE(x) (__builtin_expect(!!(x), 1))
| ^
bazel-out/k8-fastbuild/bin/external/com_github_glog/_virtual_includes/glog/glog/logging.h:722:1: note: in expansion of macro 'DEFINE_CHECK_OP_IMPL'
722 | DEFINE_CHECK_OP_IMPL(Check_LT, < )
| ^~~~~~~~~~~~~~~~~~~~
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36224
Test Plan: CI
Differential Revision: D20919506
Pulled By: malfet
fbshipit-source-id: b8b4b7c62dcbc109b30165b19635a6ef30033e73
Summary:
Otherwise, while bazel spews following multi-line warning for every single caffe2 operator:
```
In file included from ./c10/util/logging_is_google_glog.h:50,
from ./c10/util/Logging.h:26,
from ./caffe2/core/logging.h:2,
from ./caffe2/core/blob.h:13,
from ./caffe2/core/operator.h:18,
from ./caffe2/sgd/adadelta_op.h:1,
from caffe2/sgd/adadelta_op.cc:1:
bazel-out/k8-fastbuild/bin/external/com_github_glog/_virtual_includes/glog/glog/logging.h: In instantiation of 'std::string* google::Check_LTImpl(const T1&, const T2&, const char*) [with T1 = int; T2 = long unsigned int; std::string = std::__cxx11::basic_string<char>]':
./caffe2/core/operator.h:192:5: required from 'const T& caffe2::OperatorBase::Input(int, caffe2::DeviceType) [with T = caffe2::Tensor; caffe2::DeviceType = c10::DeviceType]'
./caffe2/core/operator.h:890:48: required from 'const caffe2::Tensor& caffe2::Operator<Context>::Input(int, caffe2::DeviceType) [with Context = caffe2::CPUContext; caffe2::DeviceType = c10::DeviceType]'
./caffe2/sgd/adadelta_op.h:87:5: required from 'bool caffe2::SparseAdadeltaOp<Context>::RunOnDevice() [with Context = caffe2::CPUContext]'
./caffe2/sgd/adadelta_op.h:85:8: required from here
bazel-out/k8-fastbuild/bin/external/com_github_glog/_virtual_includes/glog/glog/logging.h:722:32: warning: comparison of integer expressions of different signedness: 'const int' and 'const long unsigned int' [-Wsign-compare]
722 | DEFINE_CHECK_OP_IMPL(Check_LT, < )
| ^
bazel-out/k8-fastbuild/bin/external/com_github_glog/_virtual_includes/glog/glog/logging.h:148:53: note: in definition of macro 'GOOGLE_PREDICT_TRUE'
148 | #define GOOGLE_PREDICT_TRUE(x) (__builtin_expect(!!(x), 1))
| ^
bazel-out/k8-fastbuild/bin/external/com_github_glog/_virtual_includes/glog/glog/logging.h:722:1: note: in expansion of macro 'DEFINE_CHECK_OP_IMPL'
722 | DEFINE_CHECK_OP_IMPL(Check_LT, < )
| ^~~~~~~~~~~~~~~~~~~~
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36196
Differential Revision: D20909696
Pulled By: malfet
fbshipit-source-id: 16723355f473379ba9da6d3c33bd561b9724800a
Summary:
Fixes incorrect usages of symbol annotations including:
1. Exporting or importing a function/class in an anonymous namespace.
2. Exporting or importing a function/class implementation in a header file. However, by removing the symbol annotations, they are now local symbols. If they need to be remain global, I can move the implementations to the source file.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35364
Differential Revision: D20670031
Pulled By: ezyang
fbshipit-source-id: cd8018dee703e2424482c27fe9608e040d8105b8
Summary:
Replacing <ATen/core/Tensor.h> with <<ATen/core/TensorBody.h> speeds up compilation of caffe2 operators by 15%
For example, it reduces pool_op.cu compilation from 18.8s to 16s
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34810
Test Plan: CI
Differential Revision: D20472230
Pulled By: malfet
fbshipit-source-id: e1b261cc24ff577f09e2d5f6428be2063c6d4a8b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30734
What are specialized lists?
The IValues that hold List[int], List[Tensor], and List[AnythingElse] are different C++ types.
e.g. List[int] has a std::vector<int> while List[AnythingElse] holds a std::vector<IValue>.
Why do we have specialized lists?
When we first created the JIT we needed to bind the ATen C++ API which has std::vector<int>,
std::vector<Tensor> as inputs. The easiest way to match this API was to make our IValues contain
these same types. Conversion was just unwrapping the IValue, very easy and cheap.
What is the problem with specialized lists?
We end up with significant special cases through the compiler. Other types like Dict are not
specialized. So in the Pickler, for instance, there is a single piece of logic to handle
their serialization. For Lists, we end up with multiple cases. Furthermore, it doesn't
match Python, leading to problems along translation boundaries. Our pickle serialization
is slightly different than python, so it is harder to load objects from our IValue serialization
as Python values.
They also make it harder to provide an easy-to-use user API. We'd like to match pybind11 for C++
bindings to TorchScript. This would entail having a single torch::List class (untemplated)
that can be used to construct inputs. This is made much harder if the underlying ivalue needs
to be different depending on the type inside the list. The ideal case would be to have a constructor like
```
template<typename T>
List(std::vector<T> foo);
```
It would then set up the type tags correctly based on type T, without the need for passing tags.
Do specialized lists improve perf?
Not in a way we have been able to measure. Our major concern initially was having to translate
a std::vector<IValue> to std::vector<int> to call ATen functions. This was especially a concern
for aten::_convolution which takes a number of mostly-constant lists of integers. However,
when we measure the effect of actually having to do this conversion for an aten::_convolution,
it does not take measurable time (benchmark results below).
This is true even if you use a trivial convolution (e.g. 1x1x1), and comment out the actual convolution code.
What are the issues removing them?
This PR removes list specialization but keeps the serialization format, and IValue APIs almost exactly
the same. The only visible change is that toTensorListRef and family have turned into toTensorVector
because they now return by value a copy of the list as a vector.
Further PRs can then clean up the complexity issues that arose from speclization. This will likely
involve removing the isTensorList/isIntList functions, and refactoring the code that used them to
work generically. At some point we will also change serialization to no longer write specialized
lists in the pickle binary. This is forward incompatible, so will go in its own PR.
Benchmark:
```
import torch
import torch.nn as nn
import torch.nn.functional as F
import time
class MnistNet(nn.Module):
def __init__(self):
super(MnistNet, self).__init__()
self.conv1 = nn.Conv2d(1, 1, kernel_size=1)
self.conv2 = nn.Conv2d(1, 1, kernel_size=1)
def forward(self, x):
for i in range(10):
x = F.relu(self.conv1(x))
x = F.relu(self.conv2(x))
return x
model = MnistNet()
x = torch.rand(1, 1, 1, 1)
r = torch.jit.trace(model, x )
r(x)
r(x)
r(x)
r(x)
print(torch.jit.last_executed_optimized_graph())
while True:
b = time.time()
for i in range(100):
r(x)
e = time.time()
print(e - b)
```
Results (no observable difference):
```
Before (actual conv)
0.13251137733459473
0.13260436058044434
0.13276338577270508
0.1327497959136963
0.13250041007995605
0.13270330429077148
0.13290190696716309
0.13265132904052734
0.13274288177490234
0.1326758861541748
0.13253355026245117
0.13254785537719727
0.13260746002197266
0.13285017013549805
0.13264012336730957
0.132490873336792
0.13280034065246582
0.13243484497070312
0.1325232982635498
0.1326127052307129
0.13264131546020508
0.13274383544921875
0.13298296928405762
0.1326909065246582
-------------------
After (actual conv)
0.13127517700195312
0.13150334358215332
0.13092470169067383
0.13102364540100098
0.13134360313415527
0.13155555725097656
0.13314104080200195
0.13151955604553223
0.13160037994384766
0.1315293312072754
0.13137340545654297
0.13148093223571777
0.131455659866333
0.1327371597290039
0.13134026527404785
0.13152337074279785
0.13151192665100098
0.13165974617004395
0.13403725624084473
0.13251852989196777
0.13135504722595215
0.1315624713897705
0.1317615509033203
0.1314380168914795
0.13157200813293457
--------------------
The following replace the convolution operator with a no-op, to show
that even if the conv op was made faster, then we still would not see
a difference:
Before (fake conv)
0.0069539546966552734
0.0069522857666015625
0.007120847702026367
0.007344722747802734
0.007689952850341797
0.007932662963867188
0.00761723518371582
0.007501363754272461
0.007532835006713867
0.007141828536987305
0.007174253463745117
0.007114410400390625
0.007071495056152344
------------------
After (fake conv)
0.007458209991455078
0.007337093353271484
0.007268190383911133
0.007313251495361328
0.007306575775146484
0.007468700408935547
0.0073091983795166016
0.007308483123779297
0.007538318634033203
0.007356882095336914
0.007464170455932617
0.007372140884399414
```
Test Plan: Imported from OSS
Differential Revision: D18814702
Pulled By: zdevito
fbshipit-source-id: 0371c73b63068fdc12f24b801371ea90f23531a6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27086
This is a major source of merge conflicts, and AFAICT isn't necessary anymore (it may have been necessary for some mobile build stuff in the past).
This is a commandeer of #25031
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D17687345
Pulled By: ezyang
fbshipit-source-id: bf6131af835ed1f9e3c10699c81d4454a240445f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23668
- The eager mode frontend now calls operators who are defined in native_functions.yaml with `use_c10_dispatcher: True` through the c10 dispatcher and not anymore through globalATenDispatch().
- These operators aren't registered with globalAtenDispatch anymore, only on c10 now.
- Backend extensions calling globalATenDispatch().registerOp() to add their own kernels still work, this function will forward the registration to the c10 dispatcher for them.
ghstack-source-id: 90130455
Test Plan: benchmarks at https://docs.google.com/document/d/1gpzKZcFf1JJameY1vKxF7Cloul9s6D8HKIK2_Pp1hFo/edit#
Differential Revision: D16603133
fbshipit-source-id: 991f17b355e9c78c5e86fee4fa381df7ab98ac82
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24361
Currently we only support Conv in kernel but have entrance for both type using one same class
It is time make change
Reviewed By: csummersea
Differential Revision: D16604713
fbshipit-source-id: b98d39a2c7960707cd50ba27e43dce73f741eeeb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22477
There is actually no use of uninitialized variable but some compilers are not smart enough to reason about two if branches are already taken together.
Reviewed By: hx89
Differential Revision: D16100211
fbshipit-source-id: 25f01d668063603d7aaa776451afe8a10415d2ea
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22005
When a Dict or List is created with type information, it will remember that.
If at any point later, this list is instantiated to a List<T> with a concrete type, it will assert that T is the correct type.
Differential Revision: D15914462
fbshipit-source-id: a8c3d91cb6d28d0c1ac0b57a4c4c6ac137153ff7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22241
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20387
glibc has a non-standard function, feenableexcept, that triggers floating-point exception handler . Compared to feclearexcept + fetestexcept , this approach allows us to see precisely where the exception is raised from the stack trace.
Reviewed By: jspark1105
Differential Revision: D15301095
fbshipit-source-id: 94f6e72456b2280f78d7d01c2ee069ae46d609bb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21937
This changes call sites to use the new naming scheme
Reviewed By: zdevito
Differential Revision: D15892404
fbshipit-source-id: 8d32aa90a0ead1066688166478f299fde9c2c133
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21177
- Integrate c10::ListPtr into IValue and the c10 dispatcher.
- Streamline conversion to/from IValue. Before, we had IValue::to<> and kernel_functor.h had its own ivalue_to_arg_type and return_type_to_ivalue. They are now unified. Also, this means that nested types like Dicts of Lists of Optional of Dict of ... do work as expected now
Differential Revision: D15476433
fbshipit-source-id: bde9df80df20091aa8e6ae17ba7e90abd149b954
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21492
If one async operator failed, async_scheduling net currently only marks all scheduled async operators as finished without cancelling the callbacks.
The new behavior is to cancel the callbacks first, then set event status to finished.
Reviewed By: ilia-cher
Differential Revision: D15702475
fbshipit-source-id: 55a1774d768b2e238bab859b83332f1877a001ca
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17946
Some of these are probably implementable for exported operators,
but aren't implemented yet and for now it's better to assert than to just return wrong results.
Reviewed By: ezyang
Differential Revision: D14430749
fbshipit-source-id: 2b0037a9ed227a22aa7376a90e6d3d09d3e04707