This PR enables `-Winconsistent-missing-destructor-override` and `-Winconsistent-missing-override`
and fixes violations.
### <samp>🤖 Generated by Copilot at 47e904e</samp>
This pull request updates various classes and operators in the `caffe2` and `aten` subdirectories to use the `override` specifier instead of the `virtual` keyword for destructors and other virtual functions that override a base class function. This improves readability, code quality, and consistency with C++ best practices. It also modifies the `./CMakeLists.txt` file to enable these warnings without treating them as errors.
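For reference, a minimal sketch of the kind of change these warnings drive (class and member names are illustrative, not taken from the diff):
```
struct Base {
  virtual ~Base() = default;
  virtual void Run() = 0;
};

struct Derived : Base {
  // `virtual ~Derived() {}` without `override` is what
  // -Winconsistent-missing-destructor-override flags once other members of
  // the class already use the specifier.
  ~Derived() override = default;

  // Likewise, `virtual void Run() {}` without `override` is what
  // -Winconsistent-missing-override flags; `override` alone is preferred.
  void Run() override {}
};
```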
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104032
Approved by: https://github.com/malfet
Summary:
This diff fixes more test failures (T150117218) caused by upgrading the "hypothesis" library to 6.70.1 (D44523679).
# //caffe2/caffe2/python:hypothesis_test
This test generates float numbers and filters out those whose absolute values are less than 1e-2.
It is a known issue of the new version of "hypothesis" that it generates zeros or floats with small absolute values too often:
https://github.com/HypothesisWorks/hypothesis/issues/3603
I'm circumventing this issue by suppressing the health check `filter_too_much`.
# //caffe2/caffe2/quantization/server:resize_nearest_dnnlowp_op_test
All arithmetic should be done in float32 when calculating the reference, since the network being tested uses float32 everywhere.
Mixing float32, float64 or even integers will result in intermediate values in float64.
The different precision may cause off-by-1 errors when converting to integer.
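As a hedged illustration of the point, not the actual test code (names here are hypothetical):
```
#include <cstdint>

// Compute a nearest-neighbor source index the way a float32 network would:
// every intermediate stays in float. Mixing in a double (for example via a
// double literal) promotes the intermediate to float64, and the slightly
// different rounding can move the truncated result by 1.
int32_t NearestIndexFloat32(int32_t dst, float scale) {
  float src = static_cast<float>(dst) * scale;  // float32 throughout
  return static_cast<int32_t>(src);
}
```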
Test Plan:
Run all the tests in both "dev" and "opt" modes:
```
for mode in dev opt; do
  buck2 test mode/$mode //caffe2/caffe2/python:hypothesis_test -- --run-disabled
  buck2 test mode/$mode //caffe2/caffe2/quantization/server:resize_nearest_dnnlowp_op_test -- --run-disabled
  buck2 test mode/$mode //caffe2/caffe2/fb/layers/tests:tum_history_test -- --run-disabled
  buck2 test mode/$mode //caffe2/caffe2/fb/dper/layer_models/tests:nn_ops_test -- --run-disabled
  buck2 test mode/$mode //caffe2/caffe2/fb/metrics:metrics_test -- --run-disabled
  buck2 test mode/$mode //deeplearning/numeric_suite/toolkit/test:net_transform_test -- --run-disabled
  buck2 test mode/$mode //f3/type_system:tests -- --run-disabled
done
```
**NOTE:** In the first test (`//caffe2/caffe2/python:hypothesis_test`), the two methods `test_constant_fill_from_tensor` and `test_recurrent` would crash.
But these crash on hypothesis 5.49.0, too, so I'm leaving them alone.
Differential Revision: D44812706
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98685
Approved by: https://github.com/malfet
Summary:
[Comment](https://github.com/pytorch/pytorch/pull/62445/files#r680132022) claims it was added for consistency with the top-level CMakeLists.txt, but `-Wno-unused-variable` is not mentioned there.
Fix violations in 50+ files that were added in the interim, either by removing unused variables or by decorating the code with `C10_UNUSED` when the local variable is likely used to extend an object's lifetime until the end of the block.
This suppression caused a preventable revert in https://github.com/pytorch/pytorch/pull/72633#issuecomment-1092300787
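A hedged sketch of the two remediation patterns mentioned above (variable names are hypothetical):
```
#include <c10/macros/Macros.h>
#include <mutex>

void Example(std::mutex& mu) {
  // The guard is never referenced by name, but it must live until the end of
  // the block, so it is decorated instead of removed.
  C10_UNUSED std::lock_guard<std::mutex> guard(mu);

  // A genuinely unused local, by contrast, is simply deleted:
  // int leftover = 0;
}
```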
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75538
Reviewed By: anjali411
Differential Revision: D35747333
Pulled By: malfet
fbshipit-source-id: 3fc5828e44a4c05ba0e89e92613e6ebbdb260626
(cherry picked from commit c179fba21cfa2a0093fad50ccad5a22dd7cff52c)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73478
temp_qparams is not used, and constructing it accesses non-existent qparams
Test Plan: buck test mode/dev -c fbcode.platform=platform010 //caffe2/caffe2/quantization/server:conv_groupwise_dnnlowp_acc16_op_test -- --exact 'caffe2/caffe2/quantization/server:conv_groupwise_dnnlowp_acc16_op_test - test_groupwise_dnnlowp_conv_acc16_outlier (caffe2.caffe2.quantization.server.conv_groupwise_dnnlowp_acc16_op_test.GroupWiseDNNLowPOpConvAcc16OpTest)'
Reviewed By: jiyuanzFB
Differential Revision: D34502856
fbshipit-source-id: aeaa5c3aa76a2fb01d9565ee294a0c627418f55e
(cherry picked from commit 9f930156b3823d7f0676e26ebd4d0ae1e682f08b)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73452
Added a fused Int8FC path using PackAWithQuantRowOffset, like the INT8 dynamic path. There are two ways to enable it:
(1) set a positive "X_scale" value in the arg list of the Int8FC op
(2) send both "Qparam" (for output requantization; these could be dummy values) and "in_Qparam" (for fused input quantization)
Differential Revision: D34034681
fbshipit-source-id: f25ca8a2b783ea597389d31c110448d19610218e
(cherry picked from commit 6fa10ba0e3be2d46298b439fba0fe9ae7e329f3a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71166
Remove the use of the deprecated old interface
Test Plan: CI
Reviewed By: jiyuanzFB
Differential Revision: D33533494
fbshipit-source-id: 930eb93cd67c7a9bb77708cc48914aa0c9f1c841
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70248
Modified loops in files under fbsource/fbcode/caffe2/ from the format
```
for(TYPE var=x0;var<x_max;x++)
```
to the format
```
for(const auto var: irange(xmax))
```
This was achieved by running r-barnes's loop upgrader script (D28874212), with some modification to exclude all files under /torch/jit, plus a number of reversions and unused-variable warning suppressions added by hand.
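A minimal compilable sketch of the before/after shape of the change (the surrounding function is illustrative):
```
#include <c10/util/irange.h>
#include <cstdint>

float Sum(const float* data, int64_t n) {
  float total = 0.0f;
  // Before: for (int64_t i = 0; i < n; i++) { total += data[i]; }
  // After: the same index sequence, with a const loop variable of the right type.
  for (const auto i : c10::irange(n)) {
    total += data[i];
  }
  return total;
}
```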
Test Plan: Sandcastle
Reviewed By: malfet
Differential Revision: D32813863
fbshipit-source-id: 527244b4a2b220fdfe7f17dee3599603f492a2ca
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70207
In the corner case where min == max, the adjust_hist_to_include_zero() function used in the L2 search computes additional_nbins = -2147483648 and initializes bins_f with a negative size.
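A hedged sketch of the failure mode (an illustration with hypothetical names, not the actual DNNLOWP code):
```
#include <cmath>
#include <cstdio>

void SketchAdjustHistToIncludeZero(float min, float max, int nbins) {
  float bin_width = (max - min) / nbins;  // collapses to 0.0f when min == max
  if (bin_width == 0.0f) {
    // The fix amounts to handling this degenerate range explicitly instead of
    // dividing by zero below: casting the resulting non-finite float to int is
    // undefined behavior and commonly yields INT_MIN (-2147483648) on x86,
    // which then sizes bins_f with a negative count.
    std::printf("degenerate histogram range: min == max\n");
    return;
  }
  int additional_nbins = static_cast<int>(std::ceil(-min / bin_width));
  std::printf("additional_nbins = %d\n", additional_nbins);
}
```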
Test Plan:
Before fix:
f315187213
After fix:
f315471862
Reviewed By: jspark1105
Differential Revision: D33227717
fbshipit-source-id: 7e8a455e51a0703a3a9c5eb7595d9b4d43966001
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66743
Modified loops in files under fbsource/fbcode/caffe2/ from the format
`for(TYPE var=x0;var<x_max;x++)`
to the format
`for(const auto var: irange(xmax))`
This was achieved by running r-barnes's loop upgrader script (D28874212), with some modification to exclude all files under /torch/jit, plus a number of reversions and unused-variable warning suppressions added by hand.
Test Plan: Sandcastle
Reviewed By: malfet
Differential Revision: D31705359
fbshipit-source-id: c9ea2fbc0f9cd29e97a52dcb203addc5f2abb09b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68365
As in the title: the broadcast fastpath has been running fine for the enabled ops for a while now, so make it the default for these ops.
Test Plan: diff is a no-op, so sandcastle
Differential Revision: D32107847
fbshipit-source-id: b239b127b219985bf7df6a0eea2d879b8e9c79a4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66234
Modified loops in files under fbsource/fbcode/caffe2/ from the format
`for(TYPE var=x0;var<x_max;x++)`
to the format
`for(const auto var: irange(xmax))`
This was achieved by running r-barnes's loop upgrader script (D28874212), with some modification to exclude all files under /torch/jit, plus a number of reversions and unused-variable warning suppressions added by hand.
bypass_size_limit
allow-large-files
Test Plan: Sandcastle
Reviewed By: ngimel
Differential Revision: D30652629
fbshipit-source-id: 0ae6c4bbbb554bad42e372792a6430e1acf15e3e
Summary:
Delete `-Wno-unused-variable` from top level `CMakeLists.txt`
Still suppress those warnings for tests and `torch_python`
Delete a number of unused variables from caffe2 code
Use `(void)var;` to suppress unused variable in range loops
Use `C10_UNUSED` for global constructors and use `constexpr` instead of `static` for global constants
Do not delete `caffe2::OperatorBase::Output` calls as they have side effects
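A hedged sketch of the `(void)var;` and `constexpr` patterns listed above (names are hypothetical):
```
#include <cstddef>
#include <map>
#include <string>

constexpr int kMaxRetries = 3;  // was a `static const` global in the old style

size_t CountEntries(const std::map<std::string, int>& m) {
  size_t count = 0;
  for (const auto& kv : m) {
    (void)kv;  // only the iteration count matters; silences -Wunused-variable
    ++count;
  }
  return count;
}
```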
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66041
Reviewed By: ngimel
Differential Revision: D31360142
Pulled By: malfet
fbshipit-source-id: 6fdfb9f91efdc49ca984a2f2a17ee377d28210c8
Summary:
Delete `-Wno-unused-variable` from top level `CMakeLists.txt`
Still suppress those warnings for tests and `torch_python`
Delete a number of unused variables from caffe2 code
Use `(void)var;` to suppress unused variable in range loops
Use `C10_UNUSED` for global constructors and use `constexpr` instead of `static` for global constants
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65954
Reviewed By: ngimel
Differential Revision: D31326599
Pulled By: malfet
fbshipit-source-id: 924155f1257a2ba1896c50512f615e45ca1f61f3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64935
As title
Test Plan: CI
Reviewed By: dskhudia
Differential Revision: D30889157
fbshipit-source-id: 316c808806b084bd2e44c56e1cdb61adf2369a9d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62369
This diff is a big no-op that just sets up scaffolding for passing the "allow_broadcast_fastpath" flag from caffe2 operator protos created in Python down to C++. To facilitate this, we create helper template wrappers that pass the "allow_broadcast_fastpath" flag down to elementwise functors. This flag will determine whether to try to take the broadcast fastpath, which we will add in subsequent diffs.
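A hedged sketch of the wrapper pattern being described, with hypothetical names and signatures rather than the actual caffe2 templates:
```
#include <cstdint>

// A plain elementwise functor, as the ops use today.
struct AddFunctor {
  template <typename T>
  void operator()(const T* a, const T* b, T* out, int64_t n) const {
    for (int64_t i = 0; i < n; ++i) {
      out[i] = a[i] + b[i];
    }
  }
};

// Wrapper that carries allow_broadcast_fastpath (read from the operator proto)
// down to the functor. In the no-op stage it only forwards the call; the
// actual fastpath dispatch lands in later diffs.
template <typename Functor>
struct WithBroadcastFastpath {
  bool allow_broadcast_fastpath = false;
  Functor functor;

  template <typename T>
  void operator()(const T* a, const T* b, T* out, int64_t n) const {
    functor(a, b, out, n);
  }
};
```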
Test Plan: sandcastle + let github CI run
Differential Revision: D28154475
fbshipit-source-id: 15750a0bcd2994fbc6a61fb5653d8cae6b0177dd
Summary:
The GoogleTest `TEST` macro, as well as `DEFINE_DISPATCH`, is non-compliant with the `cppcoreguidelines-avoid-non-const-global-variables` clang-tidy check, so its per-line NOLINT suppressions are removed.
All changes but the ones to `.clang-tidy` are generated using the following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`; do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008
Reviewed By: driazati, r-barnes
Differential Revision: D29838584
Pulled By: malfet
fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
Summary:
This PR deletes some code in `MiscCheck.cmake` that duplicates the functionality of `FindAVX.cmake`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61748
Reviewed By: ejguan
Differential Revision: D29791282
Pulled By: malfet
fbshipit-source-id: 6595fd1b61c8ae12b821fad8c9a34892dd52d213
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59541
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/621
Fixing 2 issues. These are actually 2 independent issues, one in Caffe2 and another in FBGEMM, so there is no need to wait until FBGEMM is synchronized with PyTorch:
1) conv 16-bit accumulation doesn't support the fast gconv path, so TakeGConvFastPath_ should honor that
2) packed_index_ generates indices up to (G/GTogether_) * F * R * S * OC_per_G * GTogether_ * paddedICPerG, which can exceed the G * kernel_prod * OC_per_G * paddedICPerG allocated in PackWeightMatrixForGConv (kernel_prod = F * R * S): e.g., when G=3 and GTogether_=2, we allocate 3 * F * R * S * OC_per_G * paddedICPerG but access up to 2 * F * R * S * OC_per_G * 2 * paddedICPerG
BTW, I'm not sure how this issue went unnoticed for so long. Any ideas would be really appreciated.
Test Plan:
In a BDW machine,
buck test //caffe2/caffe2/quantization/server:conv_groupwise_dnnlowp_acc16_op_test -- --run-disabled
Reviewed By: dskhudia
Differential Revision: D28927214
fbshipit-source-id: 3ec98ea2fc177545392a0148daca592d80f40ad3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58022
Caffe2 Int8FC + rowwise quantization was not handling bias correctly.
Test Plan: The example in D28347336 no longer shows a bigger error with rowwise quantization
Reviewed By: hx89, janeyx99
Differential Revision: D28347336
fbshipit-source-id: 3ac95fd2f29ef6e52705c3a2361b605813c2bcc5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57378
The previous version got reverted due to some tests not running, because I wasn't in the pytorch GitHub org.
Differential Revision: D28125562
fbshipit-source-id: 758c1c9a009e79febf6cbd062a47d2a3d94e3a78
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname, "-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit", "--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
fix Semmle warning: Comparison of narrow type with wide type in loop condition
For example, consider the following piece of code:
for (int i=0; i<array.size(); ++i) {}
The problem is that array.size() returns size_t, which can be a wider type than int depending on the implementation, so there is a chance that i overflows (for a very large array whose size is beyond the range of int) and this loop never terminates.
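The fix is to give the index a type at least as wide as the container's size type, e.g.:
```
#include <cstddef>
#include <vector>

float SumAll(const std::vector<float>& array) {
  float sum = 0.0f;
  // size_t matches array.size(), so the comparison no longer mixes widths and
  // the index cannot wrap before reaching the bound.
  for (size_t i = 0; i < array.size(); ++i) {
    sum += array[i];
  }
  return sum;
}
```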
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53951
Reviewed By: zou3519
Differential Revision: D27181495
Pulled By: malfet
fbshipit-source-id: 0612c5cedcdc656c193085e7fbb87dd163f20688
Summary:
In order to enable FC int8 quantization in P2C2, we are trying to run the caffe2 op Int8FCPackWeight in the model transformation pipeline.
The net is being generated from the python side, and passed back into C++ and run here: https://fburl.com/diffusion/3zt1mp03, with these dependencies included: https://fburl.com/diffusion/rdjtdtcf
However, when the net is executed, it errors out with:
```
Cannot create operator of type 'Int8FCPackWeight' on the device 'CPU'
```
This diff attempts to fix this issue.
Test Plan:
To reproduce, just run this test without this diff:
```
buck test //aiplatform/modelstore/transformation/tests:pyper_to_caffe2_dispatcher_test
```
Reviewed By: jspark1105
Differential Revision: D25965167
fbshipit-source-id: a7414669abb8731177c14e8792de58f400970732
Summary: When the FC output min/max range is very small, we want to enforce a cutoff on the scale parameter to better generalize to future values that could fall beyond the original range.
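A hedged sketch of the idea; the cutoff and names are hypothetical, not the actual DNNLOWP parameters:
```
#include <algorithm>

// Clamp the quantization scale so a very narrow observed range cannot produce
// a scale too small to represent values that later fall outside that range.
float ClampedScale(float observed_min, float observed_max, float min_scale_cutoff) {
  float scale = (observed_max - observed_min) / 255.0f;  // assuming 8-bit quantization
  return std::max(scale, min_scale_cutoff);
}
```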
Test Plan:
More analysis about the output distributions can be found in N425166
An example workflow using fp16 min clipping is f240972205
Reviewed By: jspark1105
Differential Revision: D25681249
fbshipit-source-id: c4dfbd3ee823886afed06e6c2eccfc29d612f7e6
Summary: Adding support to generate qparams for quantizing a tensor from its min and max thresholds
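A hedged sketch of deriving qparams from fixed min/max thresholds, assuming standard uint8 affine quantization (not the exact operator implementation):
```
#include <algorithm>
#include <cmath>
#include <cstdint>

struct QParams {
  float scale;
  int32_t zero_point;
};

QParams QParamsFromMinMax(float min, float max) {
  // The representable range must contain zero so that zero is exact.
  min = std::min(min, 0.0f);
  max = std::max(max, 0.0f);
  float scale = (max - min) / 255.0f;
  if (scale == 0.0f) {
    scale = 1.0f;  // degenerate range: every value maps to the same bin
  }
  int32_t zero_point = static_cast<int32_t>(std::nearbyint(-min / scale));
  zero_point = std::min(std::max(zero_point, 0), 255);
  return {scale, zero_point};
}
```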
Test Plan:
```
buck test mode/opt caffe2/caffe2/quantization/server:int8_gen_quant_params_min_max_test
```
```
Started reporting to test run: https://our.intern.facebook.com/intern/testinfra/testrun/5629499573509506
✓ ListingSuccess: caffe2/caffe2/quantization/server:int8_gen_quant_params_min_max_test - main (2.522)
✓ Pass: caffe2/caffe2/quantization/server:int8_gen_quant_params_min_max_test - test_int8_gen_quant_params_min_max_op (caffe2.caffe2.quantization.server.int8_gen_quant_params_min_max_test.TestInt8GenQuantParamsMinMaxOperator) (1.977)
Summary
Pass: 1
ListingSuccess: 1
```
Reviewed By: hx89
Differential Revision: D24485985
fbshipit-source-id: 18dee193f7895295d85d31dc013570e5d5d97357
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48749
This change reverts D25179863 (55e225a2dc) because in 1.0.0.14 this behavior got reintroduced.
We believe this was already working pre-1.0.0.9; then Intel regressed, which is why we had to remove this quantization section, and in 1.0.0.14 they fixed it.
Test Plan:
We tested ctr_instagram_5x, which now passes with bitwise matching.
hl475 will test the top 6 models, and if they match, we will use this point to lock any further changes in the future.
Reviewed By: venkatacrc
Differential Revision: D25283605
fbshipit-source-id: 33aa9af008c113d4d61e3461a44932b502bf42ea
Summary:
The FullyConnectedDNNLowPOp::Y_int32_ vectors consume between 1GB and 2GB on one of FB's larger applications. By adding tracing I noticed that the number of elements in each instance oscillates wildly over time. Because the buffer backing a vector is only ever grown by resize operations (its capacity never shrinks), this leaves wasted memory space. So as a simple optimization, I added code to right-size the buffer backing the vector when the number of elements is less than half of the vector's capacity at that point; this doesn't affect the existing elements.
There is of course a memory/cpu tradeoff here - with the change we are doing more mallocs and frees. I added tracing to measure how many times we grow or shrink per second: it's about 100 per second on average, which is not a great deal.
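A hedged sketch of the right-sizing logic, simplified from the description above (not the exact Y_int32_ code):
```
#include <cstddef>
#include <cstdint>
#include <vector>

// Resize to the currently needed element count, and reclaim the backing buffer
// when the live size has dropped well below the capacity. Elements that are
// kept are not modified.
void RightSizeBuffer(std::vector<int32_t>& buf, size_t needed) {
  buf.resize(needed);
  if (buf.size() < buf.capacity() / 2) {
    buf.shrink_to_fit();  // non-binding, but reallocates to ~size() in practice
  }
}
```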
Test Plan:
Memory growth impact: over 24 hours and after the startup period, the memory consumed by this code grows from 0.85GB to 1.20GB vs 0.95GB to 1.75GB in the baseline. [ source: https://fburl.com/scuba/heap_profiles/wm47kpfe ]
https://pxl.cl/1pHlJ
Reviewed By: jspark1105
Differential Revision: D24592098
fbshipit-source-id: 7892b35f24e42403653a74a1a9d06cbc7ee866b9
Summary: It creates CPU overload issues when OpenMP is enabled and OMP_NUM_THREADS=1 is not set.
Test Plan: buck test //caffe2/caffe2/quantization/server:quantize_dnnlowp_op_test
Reviewed By: jspark1105
Differential Revision: D24437305
fbshipit-source-id: 426209fc33ce0d4680c478f584716837ee62cb5e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46449
modifies `ComputeEqualizationScale` to have a single output `S`
Test Plan:
```
buck test caffe2/caffe2/quantization/server:compute_equalization_scale_test
```
plus e2e tests
Reviewed By: hx89
Differential Revision: D23946768
fbshipit-source-id: 137c2d7a58bb858db411248606a5784b8066ab23