Commit Graph

23 Commits

Author SHA1 Message Date
cyy
483f748dd5 [BE] Enforce missing override keyword (#104032)
This PR enables `-Winconsistent-missing-destructor-override` and `-Winconsistent-missing-override`
and fixes violations.

### 🤖 Generated by Copilot at 47e904e

This pull request updates the code of various classes and operators in the `caffe2` and `aten` subdirectories to use the `override` specifier instead of the `virtual` keyword for destructors and other virtual functions that override a base-class function. This improves code readability, quality, and consistency with C++ best practices. It also modifies the `./CMakeLists.txt` file to enable warnings for these specifiers without treating them as errors.
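As a hedged illustration of what these warnings catch (generic classes, not code from this PR):

```cpp
#include <cstdio>

struct Base {
  virtual ~Base() = default;
  virtual void Run() { std::puts("Base::Run"); }
  virtual void Step() {}
};

// Before: `Run` is marked `override` but the destructor and `Step` are
// not, which is exactly the inconsistency the two warnings flag:
//
//   struct Derived : Base {
//     virtual ~Derived() {}       // -Winconsistent-missing-destructor-override
//     void Run() override {}
//     virtual void Step() {}      // -Winconsistent-missing-override
//   };

// After: every overriding member says so explicitly, and the compiler
// rejects accidental signature mismatches.
struct Derived : Base {
  ~Derived() override = default;
  void Run() override { std::puts("Derived::Run"); }
  void Step() override {}
};

int main() {
  Derived d;
  d.Run();
}
```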

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104032
Approved by: https://github.com/malfet
2023-06-24 02:34:24 +00:00
Kazuaki Ishizaki
601e7dc0bb Fix typos under caffe2/operators directory (#98235)
This PR fixes typos in comments and messages of `.cc` and `.h` files under the `caffe2/operators` directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98235
Approved by: https://github.com/kit1980
2023-04-05 06:26:01 +00:00
Shai Szulanski
0ddaaf6a92 [codemod][caffe2] Run clang-format - 5/7
Summary:
This directory is opted-in to clang-format but is not format-clean. This blocks continuous formatting from being enabled on fbcode, and causes hassle for other codemods that leave inconsistent formatting. This diff runs clang-format, which is widely used and considered safe.

If you are unhappy with the formatting of a particular block, please *accept this diff* and then, in a stacked commit, undo the change and wrap that code in `// clang-format off` and `// clang-format on`, or `/* clang-format off */` and `/* clang-format on */`.
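For reference, the escape hatch described above looks like this (a generic illustration, not a block from this diff):

```cpp
// clang-format off
int kIdentity[3][3] = {{1, 0, 0},
                       {0, 1, 0},
                       {0, 0, 1}};  // hand alignment survives reformatting
// clang-format on
```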

drop-conflicts

Test Plan: sandcastleit

Reviewed By: jerryzh168

Differential Revision: D22311706

fbshipit-source-id: 1ca59a82e96156a4a5dfad70ba3e64d44c5e762a
2020-06-30 15:45:11 -07:00
Edward Yang
f4c59c5fdf Replace SwitchToDevice(0) with SwitchToDevice() (#15126)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15126

I want to make people stop manufacturing StreamId from thin air,
and a first step is to make people use the default stream.
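A standalone sketch of the migration; the `Context` class below is illustrative, not caffe2's actual API:

```cpp
#include <iostream>

// Illustrative stand-in for a caffe2-style device context.
struct Context {
  // Explicit-stream overload: callers had been passing a literal 0.
  void SwitchToDevice(int stream_id) {
    std::cout << "switching to stream " << stream_id << "\n";
  }
  // Default-stream overload this commit migrates callers onto.
  void SwitchToDevice() { SwitchToDevice(0); }
};

int main() {
  Context ctx;
  // Before: ctx.SwitchToDevice(0);  // StreamId manufactured from thin air
  ctx.SwitchToDevice();              // After: use the default stream
}
```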

Reviewed By: dzhulgakov

Differential Revision: D13432922

fbshipit-source-id: 9f0d8d70646c50d979bde5ba3c3addeebac48a3d
2018-12-17 15:15:00 -08:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Rishi Raj Singh Jhelumi
1fd05df738 Add no_prefetch option to prefetch_op.
Summary:
We may not want to run the operator in a prefetching manner if we don't need any prefetching.
The option allows any operator to run in a normal fashion, without modification.
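A hypothetical standalone sketch of the behavior (names are illustrative; this is not the actual caffe2 operator):

```cpp
#include <iostream>

// Illustrative operator that can either prefetch or run inline.
class PrefetchingOp {
 public:
  explicit PrefetchingOp(bool no_prefetch) : no_prefetch_(no_prefetch) {}

  bool Run() {
    if (no_prefetch_) {
      // "Normal fashion": fetch and consume in the caller's thread.
      return Fetch() && Consume();
    }
    // Prefetch path: in the real op, the fetch would run on a
    // background thread ahead of consumption.
    return FetchAhead() && Consume();
  }

 private:
  bool Fetch()      { std::cout << "fetching inline\n";   return true; }
  bool FetchAhead() { std::cout << "prefetching ahead\n"; return true; }
  bool Consume()    { std::cout << "consuming batch\n";   return true; }
  bool no_prefetch_;
};

int main() {
  PrefetchingOp op(/*no_prefetch=*/true);  // arg name mirrors the new option
  return op.Run() ? 0 : 1;
}
```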

Differential Revision: D6717720

fbshipit-source-id: 10114d68edd95258b823603d8532360120421649
2018-01-16 11:07:50 -08:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Dmytro Dzhulgakov
5f8693cc6f Make Context::FinishDeviceComputation throw instead of FATAL
Summary:
We shouldn't LOG(FATAL) in Caffe2 code under any conditions as it's a library.

The case where it failed was a bug in SparseAdaGrad that failed on empty input by trying to launch a 0-sized CUDA kernel.

Also, the trend for the C2 core is to move from bool returns to exceptions, so I just moved CAFFE_ENFORCE directly into FinishDeviceComputation. Most of the use cases were already doing that or ignoring the return value (bad!).
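A standalone sketch of the bool-to-exception move; `std::runtime_error` stands in here for the exception type CAFFE_ENFORCE actually throws:

```cpp
#include <iostream>
#include <stdexcept>

// Before: a LOG(FATAL)-style failure would abort the whole process,
// which a library must never do. After: enforce and throw instead.
void FinishDeviceComputation(bool device_ok) {
  if (!device_ok) {
    throw std::runtime_error("error at FinishDeviceComputation");
  }
}

int main() {
  try {
    FinishDeviceComputation(/*device_ok=*/false);
  } catch (const std::exception& e) {
    // The caller now decides what a device failure means; the library
    // no longer kills the process.
    std::cerr << e.what() << "\n";
    return 1;
  }
  return 0;
}
```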

Reviewed By: akyrola

Differential Revision: D5495913

fbshipit-source-id: 66f382369417a262da69d54470f720e7d04a5cdf
2017-07-31 00:05:10 -07:00
Victor Gao
34be12353b comment out unused parameters
Summary: This uses `clang-tidy` to comment out unused parameters (in functions, methods and lambdas) in fbcode. Cases that the tool failed to handle are fixed manually.
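The transformation in miniature (a generic example, not taken from this diff):

```cpp
// Before: `height` is unused and triggers -Wunused-parameter.
//   int Area(int width, int height) { return width * width; }

// After: the unused name is commented out but kept for readers.
int Area(int width, int /*height*/) { return width * width; }

int main() { return Area(3, 4) == 9 ? 0 : 1; }
```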

Reviewed By: igorsugak

Differential Revision: D5454343

fbshipit-source-id: 5dee339b4334e25e963891b519a5aa81fbf627b2
2017-07-21 15:14:43 -07:00
Aapo Kyrola
6f1b1828e9 add SwitchToDevice to PrefetchOp constructor
Summary: In https://github.com/caffe2/caffe2/pull/802, slayton58 fixed an issue in ImageInputOp where the std and mean blobs were allocated on the wrong GPU (0). This fails when there is no P2P memory access. The fundamental reason was that ImageInputOp's constructor did not call SwitchToDevice. Operator's constructor does, but ImageInputOp inherits PrefetchOp -> OperatorBase, neither of which does the switch. So I made PrefetchOperator do the switch (OperatorBase does not have a context, so it cannot).
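A standalone analogue of the fix, with illustrative names in place of the caffe2 classes:

```cpp
#include <iostream>

// Base class whose constructor pins the device *before* any derived
// constructor runs -- the essence of making PrefetchOperator switch.
struct PrefetchBaseLike {
  explicit PrefetchBaseLike(int device) {
    // Without this switch, allocations made in derived constructors
    // would land on the default GPU (0) instead of the right one.
    std::cout << "switching to device " << device << "\n";
  }
};

struct ImageInputLike : PrefetchBaseLike {
  explicit ImageInputLike(int device) : PrefetchBaseLike(device) {
    // In the real ImageInputOp, the mean/std blobs are allocated here;
    // they now see the correct current device.
    std::cout << "allocating mean/std blobs\n";
  }
};

int main() { ImageInputLike op(/*device=*/3); }
```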

Reviewed By: asaadaldien

Differential Revision: D5258729

fbshipit-source-id: c615c60eb2047ad26249c5bcba57ab0ef21d00e4
2017-06-15 22:35:27 -07:00
Yangqing Jia
aa4d07d3c4 bugfix for Windows, esp. VS 2017
Summary:
aaronmarkham this solves your Windows build issue. Basically:

(1) VS 2017 does not have CUDA support yet, and we will be waiting on NVIDIA to add it.

(2) VS 2015 and 2017 need different cmake generator strings.

This PR shows how to determine those, and also updates appveyor to provide contbuild guards for the following 3 settings:
- VS2015 without cuda
- VS2017 without cuda
- VS2015 with cuda
Closes https://github.com/caffe2/caffe2/pull/210
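For context, CMake's generator strings for these toolchains are `Visual Studio 14 2015` and `Visual Studio 15 2017` (with a `Win64` suffix for 64-bit builds of that era); presumably these are the strings the PR chooses between, though the message itself does not show them.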

Differential Revision: D4745007

Pulled By: Yangqing

fbshipit-source-id: 50952552843abd0eb6f4145d9f132daeee3a6794
2017-03-21 05:17:59 -07:00
Aapo Kyrola
ba1d592b5f New 40% faster net-type for MLP on GPUs
Summary:
This diff introduces a new net type, 'singlethread_async', which is based on my investigation of DPER/hogwild MLP bottlenecks.
It uses only one CPU thread per GPU, but multiple CUDA streams on each GPU. This is implemented by having each Net submit its list of operators to
a central GPU-specific executor queue, with a thread that executes them asynchronously. This executor takes all tasks in the queue, executes them on separate CUDA streams, and then waits on them at the end. This solution can achieve >95% GPU utilization on 8 GPUs when a sufficient number of workers is used.

FYI: I also tried fancier solutions, such as using cudaStreamCallbacks(), but they did not have as good performance.

Improved the dper bench by adding the MomentumSGDUpdate operations and adding speed test capabilities. During my testing I also noticed that the startup costs for initializing CUDA streams and contexts are high, so it is important to do a warm-up.
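A much-simplified standalone sketch of the executor idea, with plain tasks standing in for operator lists and no real CUDA:

```cpp
#include <condition_variable>
#include <functional>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>

// One executor per GPU: a single CPU thread drains a queue of per-net
// task lists. In the real net type, each list would be launched on its
// own CUDA stream and all streams synchronized at the end.
class GpuExecutor {
 public:
  explicit GpuExecutor(int gpu) : worker_([this] { Loop(); }) {
    std::cout << "executor for GPU " << gpu << " started\n";
  }
  ~GpuExecutor() {
    { std::lock_guard<std::mutex> g(mu_); done_ = true; }
    cv_.notify_one();
    worker_.join();  // drains remaining tasks before exiting
  }
  // Nets submit their operator lists here instead of running them.
  void Submit(std::function<void()> task_list) {
    { std::lock_guard<std::mutex> g(mu_); queue_.push(std::move(task_list)); }
    cv_.notify_one();
  }

 private:
  void Loop() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lk(mu_);
        cv_.wait(lk, [this] { return done_ || !queue_.empty(); });
        if (queue_.empty()) return;  // only reached once done_ is set
        task = std::move(queue_.front());
        queue_.pop();
      }
      task();  // stand-in for "launch ops on a stream, then sync"
    }
  }
  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> queue_;
  bool done_ = false;
  std::thread worker_;  // declared last so the rest is initialized first
};

int main() {
  GpuExecutor exec(/*gpu=*/0);
  exec.Submit([] { std::cout << "net A operator list\n"; });
  exec.Submit([] { std::cout << "net B operator list\n"; });
}
```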

Reviewed By: Yangqing

Differential Revision: D4553941

fbshipit-source-id: bb00524bef653d75de026dd64097b8d9b7a0acb3
2017-02-21 21:40:15 -08:00
Aapo Kyrola
8ed9a91d77 Avoid PrefetchOp destructor assertion when not necessary
Summary:
Countless hours were spent debugging why ImageInputOp failed with a cryptic exception P56967302. It turns out that the assertion happened in the PrefetchOp destructor, which was triggered when an assertion failed in the ImageInputOp constructor. Because of this, the underlying problem was shadowed. I fixed this by not asserting on finalize_ if there is no prefetch thread running, and now the error is clean:

[enforce fail at image_input_op.h:105] scale_ > 0. -1 vs 0. Must provide the scaling factor.
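A standalone sketch of the fix, with illustrative names:

```cpp
#include <iostream>
#include <thread>

// Only complain about finalize_ if a prefetch thread was actually
// started; otherwise a constructor failure (as in the scale_ enforce
// above) would be shadowed by a destructor assertion.
class PrefetchLikeOp {
 public:
  void StartPrefetchThread() {
    thread_ = std::thread([] { /* prefetch work */ });
    started_ = true;
  }
  void Finalize() {
    finalized_ = true;
    if (thread_.joinable()) thread_.join();
  }
  ~PrefetchLikeOp() {
    if (started_ && !finalized_) {
      // The old code asserted here unconditionally.
      std::cerr << "prefetch thread was never finalized\n";
      if (thread_.joinable()) thread_.join();
    }
  }

 private:
  std::thread thread_;
  bool started_ = false;
  bool finalized_ = false;
};

int main() {
  // Mimics a constructor that throws before prefetching ever starts:
  // destruction is now silent, so the real error stays visible.
  PrefetchLikeOp op;
}
```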

Reviewed By: Yangqing

Differential Revision: D4435105

fbshipit-source-id: 52f85a9fd30eea396c9faca54b6d946fa847b7ff
2017-01-19 08:29:22 -08:00
Yangqing Jia
589398950f fbsync at f5a877 2016-11-18 15:41:06 -08:00
Yangqing Jia
44509f9f91 fbsync: mostly lint changes, added mkl files 2016-10-11 22:45:06 -07:00
Yangqing Jia
b23e51d467 chunky sync 2016-09-06 15:55:19 -07:00
Yangqing Jia
6463eebc7b chunky sync - build scripts to be written 2016-07-21 10:16:42 -07:00
Yangqing Jia
559053d3a8 chunky sync 2016-05-13 14:43:48 -07:00
Yangqing Jia
0521e1d672 notebook rewrite and grammar bugfix 2016-03-10 17:34:31 -08:00
Yangqing Jia
648d1b101a A consolidation of a couple random weekend work.
(1) various bugfixes.
(2) Tensor is now a class independent of its data type. This allows us
    to write type-independent operators more easily.
(3) code conventions change a bit: dtype -> T, Tensor<*Context> -> Tensor* alias.
(4) ParallelNet -> DAGNet, to be more consistent with what it does.
(5) Caffe's own flags library instead of gflags.
(6) Caffe's own logging library instead of glog, but glog can be chosen with
    the compile-time definition -DCAFFE2_USE_GOOGLE_GLOG. As a result, glog
    macros like CHECK and DCHECK now have the prefix CAFFE_, and LOG(*)
    becomes CAFFE_LOG_* (see the sketch after this list).
(7) an optional protobuf inclusion, which can be chosen with USE_SYSTEM_PROTOBUF
    in build_env.py.
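A minimal before/after for item (6); the macro spellings follow the description above and are not verified against the tree at this revision:

```cpp
// glog style                     Caffe2's own logging (item 6)
// CHECK(n >= 0);                 CAFFE_CHECK(n >= 0);
// DCHECK_EQ(a, b);               CAFFE_DCHECK_EQ(a, b);
// LOG(ERROR) << "bad input";     CAFFE_LOG_ERROR << "bad input";
```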
2015-10-11 23:14:06 -07:00
Yangqing Jia
a12a471b2d suppress compiler warning. 2015-08-28 14:02:53 -07:00
Yangqing Jia
b4656c77b3 prefetch op bugfix 2015-08-08 13:01:12 -07:00
Yangqing Jia
2ed1077a83 A clean init for Caffe2, removing my earlier hacky commits.
2015-06-25 16:26:01 -07:00