Commit Graph

81 Commits

Author SHA1 Message Date
cyy
e4c32d14a8 [3/N] Remove inclusion of c10/util/string_utils.h (#128504)
Follows #128372

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128504
Approved by: https://github.com/malfet
2024-06-15 06:38:40 +00:00
cyy
3008644297 [Caffe2] Remove remaining unused perfkernels (#128477)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128477
Approved by: https://github.com/ezyang, https://github.com/r-barnes
2024-06-12 22:19:36 +00:00
James Donald
1dcc034fba [caffe2] Avoid attempt to use undefined preprocessor directive
Summary:
This is somewhat more verbose, but it's more correct and addresses this warning on Visual Studio 2017:
```
xplat\caffe2\caffe2\core\common.h(76): warning C4067: unexpected tokens following preprocessor directive - expected a newline
```

Test Plan: Built locally with fix

Reviewed By: simpkins

Differential Revision: D28868632

fbshipit-source-id: f6a583e8275162adedb2a4bc5ed0f64847020871
2021-06-05 09:22:52 -07:00
Adam Simpkins
91531d3047 [caffe2] add a CAFFE2_NODISCARD macro to help support old compilers (#53754)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53754

Some of the PyTorch CircleCI builds still use gcc 5.4, and compile with
`-Werror=attributes` causing this old compiler to fail because it does not
understand the `[[nodiscard]]` attribute.

Let's define a `CAFFE2_NODISCARD` macro to work around this.
ghstack-source-id: 123594084

Test Plan: I'm using this macro in subsequent diffs in the stack.

Reviewed By: mraway

Differential Revision: D26959584

fbshipit-source-id: c7ba94f7ea944b6340e9fe20949ba41931e11d41
2021-03-12 11:32:30 -08:00
Jane Xu
71ca600af9 Renaming CAFFE2_API to TORCH_API (#49496)
Summary:
Since caffe2 and torch have been consolidated, CAFFE2_API should be merged with TORCH_API. Addresses a TODO.

Manually edited some references of the removed `CAFFE2_API`:
* `CONTRIBUTING.md`
* `caffe2/proto/CMakeLists.txt`
* `cmake/ProtoBuf.cmake`
* `c10/macros/Export.h`
* `torch/csrc/WindowsTorchApiMacro.h`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49496

Reviewed By: malfet, samestep

Differential Revision: D25600726

Pulled By: janeyx99

fbshipit-source-id: 7e068d959e397ac183c097d7e9a9afeca5ddd782
2020-12-18 10:54:50 -08:00
Michael Ranieri
9239608037 fix windows clang attributes (#33959)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33959

make sure clang on windows uses correct attributes.
add support for cl.exe style pragma attributes

Test Plan: CI green

Differential Revision: D20153548

fbshipit-source-id: bfbfd374e8f5e7d7b8598453c3ca2b6693a425f1
2020-03-02 13:20:51 -08:00
Brian Wignall
f326045b37 Fix typos, via a Levenshtein-type corrector (#31523)
Summary:
Should be non-semantic.

Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos, with https://github.com/bwignall/typochecker to help automate the checking.

Uses an updated version of the tool used in https://github.com/pytorch/pytorch/pull/30606 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31523

Differential Revision: D19216749

Pulled By: mrshenli

fbshipit-source-id: 7fd489cb9a77cd7e4950c1046f925d57524960ea
2020-01-17 16:03:19 -08:00
Sebastian Messmer
643ca5def2 Replace c10::guts::stuff with std::stuff (#30915)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30915

Since we now have C++14, we don't need these c10::guts helpers anymore
ghstack-source-id: 95777609

Test Plan: waitforsandcastle

Differential Revision: D18869639

fbshipit-source-id: 97716f932297c64c6e814410ac47b444c33d4e2e
2019-12-16 13:57:19 -08:00
Sebastian Messmer
409151e1bb Use [[noreturn]] instead of C10_NORETURN or CAFFE_NORETURN (#30917)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30917

This is a C++14 feature, we can use this now.
ghstack-source-id: 95255753

Test Plan: waitforsandcastle

Differential Revision: D18869637

fbshipit-source-id: dd02036b9faeaffa64b2d2d305725443054da31b
2019-12-15 23:54:16 -08:00
Karl Ostmo
49481d576d Torch rename (#20774)
Summary:
This renames the CMake `caffe2` target to `torch`, as well as renaming `caffe2_gpu` to `torch_gpu` (and likewise for other gpu target variants).  Many intermediate variables that don't manifest as artifacts of the build remain for now with the "caffe2" name; a complete purge of `caffe2` from CMake variable names is beyond the scope of this PR.

The shell `libtorch` library that had been introduced as a stopgap in https://github.com/pytorch/pytorch/issues/17783 is again flattened in this PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20774

Differential Revision: D15769965

Pulled By: kostmo

fbshipit-source-id: b86e8c410099f90be0468e30176207d3ad40c821
2019-06-12 20:12:34 -07:00
Jerry Zhang
0c32e1b43e use C10_MOBILE/ANDROID/IOS (#15363)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15363

Didn't define C10_MOBILE in the numa file move diff: D13380559
move CAFFE2_MOBILE/ANDROID/IOS to c10

```
codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_MOBILE" "C10_MOBILE"
codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_ANDROID" "C10_ANDROID"
codemod -m -d caffe2 --extensions h,hpp,cc,cpp,mm "CAFFE2_IOS" "C10_IOS"

```

i-am-not-moving-c2-to-c10

Reviewed By: marcinkwiatkowski

Differential Revision: D13490020

fbshipit-source-id: c4f01cacbefc0f16d5de94155c26c92fd5d780e4
2019-01-09 15:08:20 -08:00
vaeksare
82903dda9b Fixes for some Windows compiler warnings (#14490)
Summary:
Implement some simple fixes to clean up windows build by fixing compiler warnings. Three main types of warnings were fixes:

1. GCC specific pragmas were changed to not be used on windows.
2. cmake flags that don't exist on windows were removed from windows build
3. Fix a macro that was defined multiple times on Windows.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14490

Differential Revision: D13241988

Pulled By: ezyang

fbshipit-source-id: 38da8354f0e3a3b9c97e33309cdda9fd23c08247
2018-12-05 21:27:07 -08:00
ArutyunovG
8e91da4cb3 Windows shared build (#13550)
Summary:
Hi guys,

I'd like to build Caffe2 with more supported options in Windows with Microsoft Visual Studios.
This is the first pull request.
Running scripts/build_windows_shared.bat is able to build Caffe2 with both CMAKE_BUILD_TYPE=Debug and CMAKE_BUILD_TYPE=Release with Visual Studio 14 2015.
CUDA is 9.0, cudnn is 7.0.5, glog, gflags and lmdb are supported on my system.
Python is 3.5, Detectron works from python interface as well.
It was even possible to debug detectron code and step into caffe2_gpu.dll with pdbs built.

What is disappointing, that c10/experimental ops don't build with this Visual Studio generator, I added special option INCLUDE_EXPERIMENTAL_C10_OPS (default ON) to deal with it in build_windows_shared.bat.

After this pull request the next step is to add Visual Studio 2017 support in the script.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13550

Reviewed By: ezyang

Differential Revision: D13042597

Pulled By: orionr

fbshipit-source-id: f313f909f599cd582a1d000eff766eef3a9fc4fc
2018-11-16 12:16:28 -08:00
Lu Fang
e2a7d43dfd Use the torch.proto to store script module (#13736)
Summary:
Directly operate protobuf in the serializer/deserializer.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13736

Reviewed By: dzhulgakov

Differential Revision: D13028487

Pulled By: houseroad

fbshipit-source-id: e578474008874f00f2a22f0a2ffd85f52643881a
2018-11-14 00:22:09 -08:00
Edward Yang
c0e24443f7 Revert D10459665: [c10] Redo jit/type and utils/functional to ATen/core
Differential Revision:
D10459665

Original commit changeset: 563dec9987aa

fbshipit-source-id: bea1dac93ebe73c9e09753d641f04f722d80aef7
2018-11-01 07:26:54 -07:00
Bram Wasti
10a6a3e404 Redo jit/type and utils/functional to ATen/core (#12862)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12862

This is a redo of the previous move in a way that doesn't migrate the namespace -- also will check for the windows cudnn build failure

Reviewed By: Yangqing

Differential Revision: D10459665

fbshipit-source-id: 563dec9987aa979702e6d71072ee2f4b2d969d69
2018-10-31 19:57:43 -07:00
Sebastian Messmer
979560c9fc Include c10 namespace into caffe2 and at namespaces. (#12950)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12950

For backwards compatibility, we want the c10 symbols to be reachable from caffe2 and aten.
When we move classes from at/caffe2 to c10, this
 1. allow keeping backwards compatibility with third paty code we can't control
 2. allows splitting diffs that move such classes into two diffs, where one only fixes the includes and the second one fixes the namespaces.

Reviewed By: ezyang

Differential Revision: D10496244

fbshipit-source-id: 914818688fad8c079889dfdc6242bc228b539f0e
2018-10-25 14:08:47 -07:00
Yangqing Jia
7d5f7ed270 Using c10 namespace across caffe2. (#12714)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12714

This is a short change to enable c10 namespace in caffe2. We did not enable
it before due to gflags global variable confusion, but it should have been
mostly cleaned now. Right now, the plan on record is that namespace caffe2 and
namespace aten will fully be supersets of namespace c10.

Most of the diff is codemod, and only two places of non-codemod is in caffe2/core/common.h, where

```
using namespace c10;
```

is added, and in Flags.h, where instead of creating aliasing variables in c10 namespace, we directly put it in the global namespace to match gflags (and same behavior if gflags is not being built with).

Reviewed By: dzhulgakov

Differential Revision: D10390486

fbshipit-source-id: 5e2df730e28e29a052f513bddc558d9f78a23b9b
2018-10-17 12:57:19 -07:00
Orion Reblitz-Richardson
cee19eb31c Back out "[c10][NFCI] Move jit/type, function_schema, and utils/functional to ATen/core" (#12568)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12568

Second attempt at D10324615

Original commit changeset: b71eeec98dfe
Original commit changeset #2: 1af6400ae0c1

Reviewed By: bwasti

Differential Revision: D10338168

fbshipit-source-id: 04cb443a89a9cd1a174df6d5ac1a86c3d423d56b
2018-10-11 09:53:40 -07:00
Andrey Malevich
229397b439 Revert D10324615: [pytorch][PR] Revert #12466 and #12467 to fix JIT test error on Windows CI
Differential Revision:
D10324615

Original commit changeset: 12e5fc73da42

fbshipit-source-id: 710c5f3b7a4fe56799ae31a86359b2085b7e741d
2018-10-11 03:39:14 -07:00
Will Feng
1f7cbea984 Revert #12466 and #12467 to fix JIT test error on Windows CI (#12557)
Summary:
Sample error log: https://ci.pytorch.org/jenkins/job/pytorch-builds/job/pytorch-win-ws2016-cuda9-cudnn7-py3-test2/11766/console
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12557

Differential Revision: D10324615

Pulled By: yf225

fbshipit-source-id: 12e5fc73da42ffa22e39250aee9ea072fd2e33de
2018-10-10 23:56:56 -07:00
Bram Wasti
f989d4b18e Move jit/type and utils/functional to ATen/core (#12466)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12466

Moves type.{h,cpp} and functional.h to ATen/core

move is necessary for IR merging -- slimmed down from this diff: D9819906

Reviewed By: ezyang

Differential Revision: D10242680

fbshipit-source-id: b71eeec98dfe9496e751a91838d538970ff05b25
2018-10-09 15:38:24 -07:00
Sebastian Messmer
ac9bb8ecef Make dynamic_cast_if_rtti safer (#12408)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12408

Using static_cast is better than reinterpret_cast because it will cause a compile time error in the following cases, while reinterpret_cast would run into undefined behavior and likely segfault:
- Src and Dst are not related through inheritance (say converting int* to double*)
- Src and Dst are related through virtual inheritance

This `dynamic_cast_if_rtti` is still unsafe because `dynamic_cast` and `static_cast` behave differently if the runtime type is not what you expected (i.e. dynamic_cast returns nullptr or throws whereas static_cast has undefined behavior), but it's much safer than doing reinterpret_cast.

Reviewed By: Yangqing

Differential Revision: D10227820

fbshipit-source-id: 530bebe9fe1ff88646f435096d7314b65622f31a
2018-10-06 12:56:27 -07:00
Yangqing Jia
9c49bb9ddf Move registry fully to c10 (#12077)
Summary:
This does 6 things:

- add c10/util/Registry.h as the unified registry util
  - cleaned up some APIs such as export condition
- fully remove aten/core/registry.h
- fully remove caffe2/core/registry.h
- remove a bogus aten/registry.h
- unifying all macros
- set up registry testing in c10

Also, an important note that we used to mark the templated Registry class as EXPORT - this should not happen, because one should almost never export a template class. This PR fixes that.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12077

Reviewed By: ezyang

Differential Revision: D10050771

Pulled By: Yangqing

fbshipit-source-id: 417b249b49fed6a67956e7c6b6d22374bcee24cf
2018-09-27 03:09:54 -07:00
Christian Puhrsch
db5f8d42bb Remove TIndex typedef from core/common.h (#12032)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12032

See title

Reviewed By: dinhviethoa

Differential Revision: D10023757

fbshipit-source-id: dbf0a043b2afab767f052bd4c5e8de13e0f57dcc
2018-09-26 17:02:54 -07:00
Hoa Dinh
70e4b3ef59 Revert D10006069: Remove TIndex typedef from core/common.h
Differential Revision:
D10006069

Original commit changeset: 5e2aac993968

fbshipit-source-id: fbd8d3860635211e641ca14eaff7a64882e0d6bd
2018-09-24 15:30:25 -07:00
Yangqing Jia
a6f1ae7f20 set up c10 scaffolding. Move macros proper first.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11939

Reviewed By: orionr, dzhulgakov

Differential Revision: D10004629

Pulled By: Yangqing

fbshipit-source-id: ba50a96820d35c7922d81c78c4cbe849c85c251c
2018-09-24 11:09:59 -07:00
Christian Puhrsch
1a1d79e761 Remove TIndex typedef from core/common.h (#11993)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11993

See title

Reviewed By: ezyang

Differential Revision: D10006069

fbshipit-source-id: 5e2aac993968307c850e431c00052cb1a339ced2
2018-09-24 10:55:55 -07:00
Orion Reblitz-Richardson
d4832f1e7b More fixes for hidden visibility (#10624)
Summary:
Some more `ATEN_API` additions for hidden visibility.

Running CI tests to see what fails to link.

cc Yangqing mingzhe09088 ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10624

Reviewed By: mingzhe09088

Differential Revision: D9392728

Pulled By: orionr

fbshipit-source-id: e0f0861496b12c9a4e40c10b6e0c9e0df18e8726
2018-08-20 10:11:59 -07:00
Yangqing Jia
0a809fc8b1 build changes to make cpu unified build working. (#10504)
Summary:
Properly annotated all apis for cpu front. Checked with cmake using

cmake -DUSE_ATEN=ON -DUSE_CUDA=OFF -DBUILD_ATEN=ON

and resulting libcaffe2.so has about 11k symbols.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10504

Reviewed By: ezyang

Differential Revision: D9316491

Pulled By: Yangqing

fbshipit-source-id: 215659abf350af7032e9a4b0f28a856babab2454
2018-08-15 17:22:36 -07:00
Edward Yang
ad76fc8807 s/DISABLE_COPY_AND_ASSIGN/AT_DISABLE_COPY_AND_ASSIGN/ (#10275)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10275

Remove forwarding declaration in caffe2/core/common.h

```
codemod -d caffe2 --extensions cc,cpp,cu,cuh,h \\bDISABLE_COPY_AND_ASSIGN AT_DISABLE_COPY_AND_ASSIGN
```

Reviewed By: mingzhe09088

Differential Revision: D9184809

fbshipit-source-id: 958cf5162b0d92b83ea9c2597abb77320ca57ce8
2018-08-07 08:54:26 -07:00
Edward Yang
66f7b8abbe Better macro name hygiene prefixing. (#10274)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10274

Good C++ libraries don't take up un-namespaced identifiers
like DISABLE_COPY_AND_ASSIGN.  Re-prefix this.

Follow up fix: codemod Caffe2 to use the new macro, delete
the forwarding definition

Reviewed By: mingzhe09088

Differential Revision: D9181939

fbshipit-source-id: 857d099de1c2c0c4d0c1768c1ab772d59e28977c
2018-08-07 08:54:24 -07:00
Edward Yang
fa9ea5bde9 Move CoreAPI.h to Macros.h, to give it a more accurate name. (#10264)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10264

Since we now have DISABLE_COPY_AND_ASSIGN macro in the file,
CoreAPI is no longer an accurate name.

Reviewed By: dzhulgakov

Differential Revision: D9181687

fbshipit-source-id: a9cc5556be9c43e6aaa22671f755010707caef67
2018-08-06 22:27:44 -07:00
Edward Yang
da44cf6101 Move TensorTypeId, TensorTypeIdRegistration and flat_hash_map to ATen/core (#10263)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10263

Auxiliary changes that were needed:
- Add DISABLE_COPY_AND_ASSIGN to CoreAPI.h (maybe we should rename this file
  now)

Reviewed By: dzhulgakov

Differential Revision: D9181321

fbshipit-source-id: 975687068285b5a94a57934817c960aeea2bbafa
2018-08-06 22:27:40 -07:00
Sebastian Meßmer
d3b690ecd5
TensorTypeId (#8389) 2018-06-19 15:05:24 -07:00
wuhuikx
fa277e6785 [IDEEP] [fix bug] Fix bug in ideep SkipOutputCopy strategy (#8372)
* fix a bug for SkipIndices

* IDEEP bug, revise the output to CPUTensor in SkipOutputCopy strategy

* [IDEEP] Add IDEEP fallbacks for Style-Transfer ops
2018-06-14 09:42:00 -07:00
bddppq
e5b997223c
[Caffe2] Enabling AMD GPU Backend for Caffe2 (#7955)
* Add hip support for caffe2 core

* Add MIOPEN header/wrapper to caffe2 core

* Add HIP device into caffe2 PB

* top level makefile change for rocm/hip

* makefile scaffolding for AMD/RocM/HIP

* Makefile scafodding for AMD/RocM/HIP; add makefile/utility for HIP files

* caffe2 PB update for AMD/ROCM HIP device

* Add AMD/RocM/Thrust dependency

* HIP threadpool update

* Fix makefile macro

* makefile fix: duplicate test/binary name

* makefile clean-up

* makefile clean-up

* add HIP operator registry

* add utilities for hip device

* Add USE_HIP to config summary

* makefile fix for BUILD_TEST

* merge latest

* Fix indentation

* code clean-up

* Guard builds without HIP and use the same cmake script as PyTorch to find HIP

* Setup rocm environment variables in build.sh (ideally should be done in the docker images)

* setup locale

* set HIP_PLATFORM

* Revert "set HIP_PLATFORM"

This reverts commit 8ec58db2b390c9259220c49fa34cd403568300ad.

* continue the build script environment variables mess

* HCC_AMDGPU_TARGET

* Cleanup the mess, has been fixed in the lastest docker images

* Assign protobuf field hip_gpu_id a new field number for backward compatibility

* change name to avoid conflict

* Fix duplicated thread pool flag

* Refactor cmake files to not add hip includes and libs globally

* Fix the wrong usage of environment variables detection in cmake

* Add MIOPEN CNN operators

* Revert "Add MIOPEN CNN operators"

This reverts commit 6e89ad4385b5b8967a7854c4adda52c012cee42a.

* Resolve merge conflicts

* .

* Update GetAsyncNetHIPThreadPool

* Enable BUILD_CAFFE2 in pytorch build

* Unifiy USE_HIP and USE_ROCM

* always check USE_ROCM

* .

* remove unrelated change

* move all core hip files to separate subdirectory

* .

* .

* recurse glob core directory

* .

* correct include

* .
2018-06-04 09:04:30 -07:00
xkszltl
89ba9dc44f Import/export observer symbols for DLL, which fixes the linking error in Visual Studio. (#6834)
* Import/export observer symbols for DLL, which fixes the linking error in Visual Studio.

* Add support of all default cmake build types for release to cuda.
2018-05-31 10:22:21 -07:00
bddppq
966c65859d Revert "[Caffe2] Enabling AMD GPU Backend for Caffe2" (#7802)
* Revert "[auto] Update onnx to 4898c9e - Added TensorDenotation and metadata_props for images (onnx/onnx#879) 4898c9e925"

This reverts commit 9c679dab5f.

* Revert "Add BiasCHW fallback for GPU (#7738)"

This reverts commit 14ad2e74f1.

* Revert "[Caffe2] Enabling AMD GPU Backend for Caffe2 (#7566)"

This reverts commit 2ebcf4bb37.
2018-05-23 17:58:47 -07:00
Peter Yeh
2ebcf4bb37 [Caffe2] Enabling AMD GPU Backend for Caffe2 (#7566)
* Add hip support for caffe2 core

* Add MIOPEN header/wrapper to caffe2 core

* Add HIP device into caffe2 PB

* top level makefile change for rocm/hip

* makefile scaffolding for AMD/RocM/HIP

* Makefile scafodding for AMD/RocM/HIP; add makefile/utility for HIP files

* caffe2 PB update for AMD/ROCM HIP device

* Add AMD/RocM/Thrust dependency

* HIP threadpool update

* Fix makefile macro

* makefile fix: duplicate test/binary name

* makefile clean-up

* makefile clean-up

* add HIP operator registry

* add utilities for hip device

* Add USE_HIP to config summary

* makefile fix for BUILD_TEST

* merge latest

* Fix indentation

* code clean-up

* Guard builds without HIP and use the same cmake script as PyTorch to find HIP

* Setup rocm environment variables in build.sh (ideally should be done in the docker images)

* setup locale

* set HIP_PLATFORM

* Revert "set HIP_PLATFORM"

This reverts commit 8ec58db2b390c9259220c49fa34cd403568300ad.

* continue the build script environment variables mess

* HCC_AMDGPU_TARGET

* Cleanup the mess, has been fixed in the lastest docker images

* Assign protobuf field hip_gpu_id a new field number for backward compatibility

* change name to avoid conflict

* Fix duplicated thread pool flag

* Refactor cmake files to not add hip includes and libs globally

* Fix the wrong usage of environment variables detection in cmake

* Add MIOPEN CNN operators

* Revert "Add MIOPEN CNN operators"

This reverts commit 6e89ad4385b5b8967a7854c4adda52c012cee42a.
2018-05-23 15:13:09 -07:00
Orion Reblitz-Richardson
6223bfdb1d Update from Facebook (#6692)
* [GanH][Easy]: Add assertion to adaptive weighting layer

0 weight causes numeric instability and exploding ne

* [Easy] Add cast op before computing norm in diagnose options

As LpNorm only takes floats we add a manual casting here.

* Introduce a new caching device allocator

`cudaMalloc` and `cudaFree` calls are slow, and become slower the
more GPUs there are. Essentially, they grab a host-wide (not device-wide) lock
because GPU memory is transparently shared across all GPUs. Normally, this
isn't much of a concern since workloads allocate memory upfront, and reuse it
during later computation.

However, under some computation models (specifically, memory conserving
approaches like checkpoint-and-recompute, see
https://medium.com/@yaroslavvb/fitting-larger-networks-into-memory-583e3c758ff9)
this assumption is no longer true. In these situations, `cudaMalloc` and
`cudaFree` are common and frequent. Furthermore, in data parallel contexts,
these calls happen at nearly the same time from all GPUs worsening lock
contention.

A common solution to this problem is to add a custom allocator. In fact,
nVIDIA provides one out of the box: CUB, which Caffe2 already supports.
Unfortunately, the CUB allocator suffers from very high fragmentation. This is
primarily because it is a "buddy" allocator which neither splits nor merges
free cached blocks. Study
https://github.com/NVlabs/cub/blob/1.8.0/cub/util_allocator.cuh#L357 if you
want to convince yourself.

This diff adapts a caching allocator from the Torch codebase
https://github.com/torch/cutorch/blob/master/lib/THC/THCCachingAllocator.cpp
which does splitting and merging and ends up working really well, at least for
workloads like the checkpoint-and-recompute computation models noted above.

I simplified the implementation a little bit, made it a bit more C++-like. I
also removed a bunch of stream synchronization primitives for this diff. I
plan to add them back in subsequent diffs.

* Report reader progress in fblearner workflows

Integrate with fblearner progress reporting API and add support to report training progress from reader nodes.
If reader is constructed with batch limits, report based on finished batch vs total batch. The finished batch may be more than total batch because we evaludate if we should stop processing everytime we dequeue a split.
If no limit for the reader, report based on finished splits (Hive files) vs total splits. This is fairly accurate.

* [GanH][Diagnose]: fix plotting

1. ganh diagnose needs to set plot options
2. modifier's blob name is used for metric field can need to be fixed before
generating net

* Automatic update of fbcode/onnx to 985af3f5a0f7e7d29bc0ee6b13047e7ead9c90c8

* Make CompositeReader stops as soon as one reader finishes

Previously, CompositeReader calls all readers before stopping. It results in flaky test since the last batch may be read by different threads; resulting in dropped data.

* [dper] make sure loss is not nan

as desc.

* [rosetta2] [mobile-vision] Option to export NHWC order for RoIWarp/RoIAlign

Thanks for finding this @stzpz and @wangyanghan. Looks like NHWC is more
optimized. For OCR though it doesn't yet help since NHWC uses more mem b/w but
will soon become important.

* Intra-op parallel FC operator

Intra-op parallel FC operator

* [C2 Proto] extra info in device option

passing extra information in device option

design doc: https://fb.quip.com/yAiuAXkRXZGx

* Unregister MKL fallbacks for NCHW conversions

* Tracing for more executors

Modified Tracer to work with other executors and add more tracing

* Remove ShiftActivationDevices()

* Check for blob entry iff it is present

When processing the placeholders ops, ignore if the blob is not present in the blob_to_device.

* Internalize use of eigen tensor

Move use of eigen tensor out of the header file so we don't get template partial specialization errors when building other libraries.

* feature importance for transformed features.

* - Fix unused parameter warnings

The changes in this diff comments out unused parameters.
This will allow us to enable -Wunused-parameter as error.

#accept2ship

* add opencv dependencies to caffe2

The video input op requires additional opencv packages. This is to add them to
cmake so that it can build

* Add clip_by_value option in gradient clipping

Add clip_by_value option in gradient clipping

when the value is bigger than max or smaller than min, do the clip

* std::round compat
2018-04-17 23:36:40 -07:00
Xiaomeng Yang
8849bea120 [caffe2] Update ReduceOps (#6497)
* Update ReduceMean

* Add reduce mean to math

* Update cuda flag

* Update Eigen::Tensor ctor

* Remove unused variables

* Skip ReduceTensorGPUTest if no gpus

* Add NOMINMAX for windows

* Fix lpnorm_op in windows
2018-04-11 23:36:05 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Yangqing Jia
91d76f5dbd Reapply Windows fix
Summary:
Last fix was uncommitted due to a bug in internal build (CAFFE2_API causing error). This one re-applies it as well as a few more, especially enabling gtest.

Earlier commit message: Basically, this should make windows {static_lib, shared_lib} * {static_runtime, shared_runtime} * {cpu, gpu} work other than gpu shared_lib, which willyd kindly pointed out a symbol limit problem. A few highlights:
(1) Updated newest protobuf.
(2) use protoc dllexport command to ensure proper symbol export for windows.
(3) various code updates to make sure that C2 symbols are properly shown
(4) cmake file changes to make build proper
(5) option to choose static runtime and shared runtime similar to protobuf
(6) revert to visual studio 2015 as current cuda and msvc 2017 do not play well together.
(7) enabled gtest and fixed testing bugs.

Earlier PR is #1793

Closes https://github.com/caffe2/caffe2/pull/1827

Differential Revision: D6832086

Pulled By: Yangqing

fbshipit-source-id: 85f86e9a992ee5c53c70b484b761c9d6aed721df
2018-01-29 10:03:28 -08:00
Vladimir Chalyshev
8c02674964 Revert D6817719: [caffe2][PR] Better support for windows
Summary:
This reverts commit d286264fccc72bf90a2fcd7da533ecca23ce557e

bypass-lint

An infra SEV is better than not reverting this diff.
If you copy this password, see you in SEV Review!
cause_a_sev_many_files

Differential Revision: D6817719

fbshipit-source-id: 8fe0ad7aba75caaa4c3cac5e0a804ab957a1b836
2018-01-26 06:08:49 -08:00
Yangqing Jia
8aa8eaabb1 Better support for windows
Summary:
Basically, this should make windows {static_lib, shared_lib} * {static_runtime, shared_runtime} * {cpu, gpu} work. A few highlights:

(1) Updated newest protobuf.
(2) use protoc dllexport command to ensure proper symbol export.
(3) various code updates to make sure that C2 symbols are properly shown
(4) cmake file changes to make build proper
(5) option to choose static runtime and shared runtime similar to protobuf
(6) revert to visual studio 2015 as current cuda and msvc 2017 do not play well together.
Closes https://github.com/caffe2/caffe2/pull/1793

Reviewed By: dzhulgakov

Differential Revision: D6817719

Pulled By: Yangqing

fbshipit-source-id: d286264fccc72bf90a2fcd7da533ecca23ce557e
2018-01-26 00:48:43 -08:00
Yangqing Jia
046c11cd73 Stod
Summary:
This is in order for Android to pass - Android support for string related functions is quite limited.
Closes https://github.com/caffe2/caffe2/pull/1571

Reviewed By: pietern

Differential Revision: D6486079

Pulled By: Yangqing

fbshipit-source-id: f0961e2dde6202bd6506f4fb8a3aea4af1670cb5
2017-12-05 10:48:09 -08:00
Dmytro Dzhulgakov
5527dd3b08 Expose CMake options in the binary
Summary:
Useful for figuring out with people which version they built with. We can just ask for --caffe2_version gflag or get core.build_options from python.

Also adds CMAKE_INSTALL_RPATH_USE_LINK_PATH - without it wasn't building on my Mac. How should it be tested?
Closes https://github.com/caffe2/caffe2/pull/1271

Reviewed By: bddppq

Differential Revision: D5940750

Pulled By: dzhulgakov

fbshipit-source-id: 45b4c94f67e79346a10a65b34f40fd258295dad1
2017-10-04 02:33:02 -07:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Yangqing Jia
f14d75c7ef Proper versioning and misc CMake improvements
Summary:
This brings proper versioning in Caffe2: instead of manual version macros, this puts the version information in CMake (replacing the TODO bwasti line) and uses macros.h.in to then generate the version in the C++ header.

A few misc updates:
- Removed the mac os rpath, verified on local macbook that it is no longer needed.
- Misc updates for caffe2 ready:
  - Mapped cmake/Cuda.cmake with gloo's setting.
  - upstreamed third_party/nccl so it builds with cuda 9.
- Separated the Caffe2 cpu dependencies and cuda dependencies
  - now libCaffe2_CPU.so do not depend on any cuda libs.
  - caffe2 python extensions now depend on cpu and gpu separately too.
- Reduced the number of unused functions in Utils.cmake
Closes https://github.com/caffe2/caffe2/pull/1256

Reviewed By: dzhulgakov

Differential Revision: D5899210

Pulled By: Yangqing

fbshipit-source-id: 36366e47366c3258374d646cf410b5f49f95767b
2017-09-26 08:52:21 -07:00