
22 Commits

Author SHA1 Message Date
cyy
fa65ae8f56 cleanup unused include (#93359)
Use the `include-what-you-use` tool to find and remove some unused includes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93359
Approved by: https://github.com/malfet
2023-02-04 02:15:50 +00:00
Nikita Shulga
a9b0a921d5 Disable avoid-non-const-global-variables lint check (#62008)
Summary:
The GoogleTest `TEST` macro is non-compliant with this check, as is `DEFINE_DISPATCH`.

All changes except the ones to `.clang-tidy` were generated using the following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`;  do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008

Reviewed By: driazati, r-barnes

Differential Revision: D29838584

Pulled By: malfet

fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
2021-07-22 18:04:40 -07:00
Nikita Shulga
3a66a1cb99 [clang-tidy] Exclude cppcoreguidelines-avoid-magic-numbers (#57841)
Summary:
Add a cppcoreguidelines-avoid-magic-numbers exclusion to clang-tidy.
Remove the existing NOLINT comments using the following script:
```
for file in `git ls-files | grep -v \.py`; do gsed '/^ *\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-magic-numbers)/d' -i  $file; done
```
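For readers without `gsed`, the same deletion can be sketched in Python. This is a minimal illustration, not the script used in the commit; the function name `strip_nolint` and the sample input are hypothetical. It mirrors the anchored pattern above, so only lines that start with optional spaces followed by the comment are dropped.

```python
import re

# Same pattern as the sed expression above: optional leading spaces,
# then the NOLINTNEXTLINE comment for the magic-numbers check.
NOLINT = re.compile(r"^ *// NOLINTNEXTLINE\(cppcoreguidelines-avoid-magic-numbers\)")


def strip_nolint(source: str) -> str:
    """Return `source` without the lines that match NOLINT (hypothetical helper)."""
    kept = [line for line in source.splitlines(keepends=True) if not NOLINT.match(line)]
    return "".join(kept)


sample = (
    "int f() {\n"
    "  // NOLINTNEXTLINE(cppcoreguidelines-avoid-magic-numbers)\n"
    "  return 42;\n"
    "}\n"
)
print(strip_nolint(sample), end="")
```

Because the pattern is anchored at the start of the line, a comment that follows code on the same line is left alone.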

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57841

Reviewed By: samestep

Differential Revision: D28295045

Pulled By: malfet

fbshipit-source-id: 7c6e8d1213c9593f169ed3df6a916498f1a97163
2021-05-07 20:02:33 -07:00
Nikita Shulga
4cb534f92e Make PyTorch code-base clang-tidy compliant (#56892)
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892

Reviewed By: H-Huang

Differential Revision: D27991944

Pulled By: malfet

fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
2021-04-28 14:10:25 -07:00
Dmytro Dzhulgakov
51dd2000cd unify c2 and TH allocator (#16892)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16892

Replaces https://github.com/pytorch/pytorch/pull/14517

Merged caffe2 and TH CPU Allocators. Mostly using the code from caffe2 allocators.
`memset` of caffe2 allocator is gone now. These two allocators should be almost the same.

Baseline:
```
Running ./tensor_allocation
Run on (48 X 2501 MHz CPU s)
CPU Caches:
  L1 Data 32K (x24)
  L1 Instruction 32K (x24)
  L2 Unified 256K (x24)
  L3 Unified 30720K (x2)
-------------------------------------------------------------------------
Benchmark                                  Time           CPU Iterations
-------------------------------------------------------------------------
BM_MakeStorageImpl                       148 ns        148 ns    4676594
BM_StorageImplCtor                        54 ns         54 ns   12957810
BM_MallocStorageImpl                      62 ns         62 ns   11254745
BM_TensorImplCtor                         22 ns         22 ns   31939472
BM_MallocTensorImpl                      105 ns        105 ns    6505661
BM_Malloc_1                               43 ns         43 ns   16464905
BM_MakeTensorFromStorage                 126 ns        126 ns    5586116
BM_MakeVariableFromTensor                236 ns        236 ns    2995528
BM_ATenCPUTensorAllocationSmall1         319 ns        319 ns    2268884
BM_ATenCPUTensorAllocationSmall2         318 ns        318 ns    2163332
BM_ATenCPUTensorAllocationMedium1        403 ns        403 ns    1663228
BM_ATenCPUTensorAllocationMedium2        448 ns        448 ns    1595004
BM_ATenCPUTensorAllocationBig1           532 ns        532 ns    1352634
BM_ATenCPUTensorAllocationBig2          4486 ns       4486 ns     160978
```
Changed:
```
Running ./tensor_allocation
Run on (48 X 2501 MHz CPU s)
CPU Caches:
  L1 Data 32K (x24)
  L1 Instruction 32K (x24)
  L2 Unified 256K (x24)
  L3 Unified 30720K (x2)
-------------------------------------------------------------------------
Benchmark                                  Time           CPU Iterations
-------------------------------------------------------------------------
BM_MakeStorageImpl                       141 ns        141 ns    4803576
BM_StorageImplCtor                        55 ns         55 ns   13129391
BM_MallocStorageImpl                      64 ns         64 ns   11088143
BM_TensorImplCtor                         23 ns         23 ns   31616273
BM_MallocTensorImpl                      101 ns        101 ns    7017585
BM_Malloc_1                               39 ns         39 ns   18523954
BM_MakeTensorFromStorage                 118 ns        118 ns    5877919
BM_MakeVariableFromTensor                452 ns        452 ns    1565722
BM_ATenCPUTensorAllocationSmall1         384 ns        384 ns    1819763
BM_ATenCPUTensorAllocationSmall2         389 ns        389 ns    1857483
BM_ATenCPUTensorAllocationMedium1        425 ns        425 ns    1646284
BM_ATenCPUTensorAllocationMedium2        430 ns        430 ns    1561319
BM_ATenCPUTensorAllocationBig1           508 ns        508 ns    1309969
BM_ATenCPUTensorAllocationBig2          3799 ns       3799 ns     173674
```
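To read the two tables side by side, the relative change per benchmark can be computed as below. This is a small sketch over a few rows copied from the tables above; the helper name `percent_change` is hypothetical, and single-run timings like these may include noise.

```python
# Times in ns, copied from the "Baseline" and "Changed" tables above.
baseline = {
    "BM_MakeStorageImpl": 148,
    "BM_MakeVariableFromTensor": 236,
    "BM_ATenCPUTensorAllocationSmall1": 319,
    "BM_ATenCPUTensorAllocationBig2": 4486,
}
changed = {
    "BM_MakeStorageImpl": 141,
    "BM_MakeVariableFromTensor": 452,
    "BM_ATenCPUTensorAllocationSmall1": 384,
    "BM_ATenCPUTensorAllocationBig2": 3799,
}


def percent_change(before: float, after: float) -> float:
    """Relative change in percent; negative means the new time is lower."""
    return (after - before) / before * 100.0


for name in baseline:
    delta = percent_change(baseline[name], changed[name])
    print(f"{name:<34} {baseline[name]:>5} -> {changed[name]:>5} ns ({delta:+.1f}%)")
```

In these rows, the large allocation gets faster (4486 ns to 3799 ns) while `BM_MakeVariableFromTensor` roughly doubles (236 ns to 452 ns).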

lstm benchmark:
Before:
```
INFO:lstm_bench:Iter: 1 / 390. Entries Per Second: 0.7k.
INFO:lstm_bench:Iter: 21 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 41 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 61 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 81 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 101 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 121 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 141 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 161 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 181 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 201 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 221 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 241 / 390. Entries Per Second: 0.7k.
INFO:lstm_bench:Iter: 261 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 281 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 301 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 321 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 341 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 361 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 381 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Done. Total EPS excluding 1st iteration: 0.8k
```

After:
```
INFO:lstm_bench:Iter: 1 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 21 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 41 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 61 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 81 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 101 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 121 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 141 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 161 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 181 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 201 / 390. Entries Per Second: 0.8k.
INFO:lstm_bench:Iter: 221 / 390. Entries Per Second: 0.7k.
INFO:lstm_bench:Iter: 241 / 390. Entries Per Second: 0.7k.
INFO:lstm_bench:Iter: 261 / 390. Entries Per Second: 0.7k.
INFO:lstm_bench:Iter: 281 / 390. Entries Per Second: 0.7k.
INFO:lstm_bench:Iter: 301 / 390. Entries Per Second: 0.7k.
INFO:lstm_bench:Iter: 321 / 390. Entries Per Second: 0.7k.
INFO:lstm_bench:Iter: 341 / 390. Entries Per Second: 0.7k.
INFO:lstm_bench:Iter: 361 / 390. Entries Per Second: 0.7k.
INFO:lstm_bench:Iter: 381 / 390. Entries Per Second: 0.7k.
INFO:lstm_bench:Done. Total EPS excluding 1st iteration: 0.8k
```

Reviewed By: ezyang

Differential Revision: D13202632

fbshipit-source-id: db6d2ec756ed15b0732b15396c82ad42302bb79d
2019-02-12 21:16:34 -08:00
Edward Yang
298b775577 Delete temporary ATenCoreTest. (#14622)
Summary:
It was previously used to ensure that ATen/core was working;
but now we have plenty of headers and C++ files in ATen/core
so this is no longer necessary.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14622

Differential Revision: D13276899

Pulled By: ezyang

fbshipit-source-id: 9bef7eb1882ccdfa3ee7681a3d5b048ea94b59d3
2018-12-03 15:07:40 -08:00
Jerry Zhang
74dc4460eb New in StaticContext returns at::DataPtr (#12029)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12029

To remove the New() function from StaticContext (and eventually StaticContext itself) and converge on the Allocator design, we first change the return type of New to at::DataPtr.

Reviewed By: ezyang

Differential Revision: D9889990

fbshipit-source-id: 3257c763530b987025f428741bdd2e089d11bad4
2018-10-03 19:10:07 -07:00
Edward Yang
91797c0672 Replace direct include of caffe2.pb.h with an intermediary header caffe2_pb.h (#10946)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10946

```
codemod -d . --extensions cc,cpp,cu,cuh,h caffe2/proto/caffe2.pb.h caffe2/proto/caffe2_pb.h
```
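The effect of the codemod command can be approximated in Python as follows. This is a rough sketch, not the `codemod` tool itself: it uses plain string replacement, and the helper names `rewrite_include` and `rewrite_tree` are hypothetical. The paths and extensions are the ones given in the command above.

```python
from pathlib import Path

OLD = "caffe2/proto/caffe2.pb.h"
NEW = "caffe2/proto/caffe2_pb.h"
EXTENSIONS = {".cc", ".cpp", ".cu", ".cuh", ".h"}


def rewrite_include(source: str) -> str:
    """Replace the direct include path with the intermediary header."""
    return source.replace(OLD, NEW)


def rewrite_tree(root: str) -> int:
    """Rewrite matching files under `root` in place; return how many changed."""
    changed = 0
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in EXTENSIONS:
            continue
        text = path.read_text(encoding="utf-8")
        updated = rewrite_include(text)
        if updated != text:
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed


print(rewrite_include('#include "caffe2/proto/caffe2.pb.h"'))
```

Calling `rewrite_tree(".")` would apply the replacement to the current directory tree, so it is best tried on a scratch copy first.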

Reviewed By: houseroad

Differential Revision: D9539945

fbshipit-source-id: 497d04720e8e7e61c05ffe1b23733d0cb774de7e
2018-08-28 11:57:08 -07:00
Edward Yang
37a226de63 When BUILD_ATEN=OFF, use ATen/core directly (#10019)
Summary:
ATenCore.h is a dummy header to just test that this is working at all.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10019

Reviewed By: smessmer

Differential Revision: D9067262

Pulled By: ezyang

fbshipit-source-id: 58bab9c0aa83b56335e36b719b9b6505400d8dee
2018-07-30 21:09:55 -07:00
Jerry Zhang
aebf3b47ae Remove template parameter from Tensor (#9939)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9939

Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13

Pull Request resolved: https://github.com/pytorch/translate/pull/166

Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125

Closes https://github.com/pytorch/pytorch/pull/9125

Use inheritance for polymorphism and remove the template parameter.
This changes the templating at call sites; the core implementations will change later.

Previously, the Caffe2 Tensor class was bound at compile time to a particular device/context. With this change, we make the device a runtime property (stored inside the tensor) but preserve the same semantics. For example, one has to specify a device type in order to create a Tensor; there are no uninitialized tensors. More specifically, the changes are:

1. We added an extra argument *DeviceType* to most of the Tensor constructors, e.g. Tensor(DeviceType type).
2. The semantics of the constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context) have changed. The second argument, the context, is passed in so that we can call the templated Copy function. Previously it could be in a different context from the source and target; now we enforce that the context, if provided, has the same device type as src.
3. To preserve the 'get-or-construct' semantics of Blob, we added a specialized getter, Blob::GetMutableTensor, that verifies both that the Blob contains a Tensor and that it is of the correct type.
4. Specifically, the Tensor type is no longer default-constructible (as we don't have unknown-device tensors), so some of the code handling STL containers needs to change.
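The four points above can be modeled conceptually in a few lines of Python. This is only an illustrative model, not the Caffe2 C++ API: the class and method names are Python stand-ins, and the choice to replace the Blob's contents on a device mismatch is an assumption about what "get-or-construct" means here.

```python
from enum import Enum


class DeviceType(Enum):
    CPU = "cpu"
    CUDA = "cuda"


class Tensor:
    """The device is a runtime property, and there is no default constructor."""

    def __init__(self, device_type: DeviceType):
        if not isinstance(device_type, DeviceType):
            raise TypeError("device_type must be a DeviceType")
        self.device_type = device_type


class Blob:
    """Holds an arbitrary value; models the get-or-construct getter."""

    def __init__(self):
        self._value = None

    def get_mutable_tensor(self, device_type: DeviceType) -> Tensor:
        # Check both that a Tensor is stored and that its device matches.
        # Assumption: on a mismatch, a fresh Tensor replaces the old contents.
        value = self._value
        if not isinstance(value, Tensor) or value.device_type is not device_type:
            self._value = Tensor(device_type)
        return self._value


blob = Blob()
cpu_tensor = blob.get_mutable_tensor(DeviceType.CPU)
print(cpu_tensor.device_type)
```

Calling `Tensor()` with no argument raises a `TypeError`, which mirrors the loss of default construction described in point 4.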

Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.

Reviewed By: ezyang, houseroad

Differential Revision: D9024330

fbshipit-source-id: e0b8295d2dc6ebe2963383ded5af799ad17164ba
2018-07-27 10:56:39 -07:00
Jerry Zhang
969b62f276 Revert D8121878: Remove template parameter from Tensor
Differential Revision:
D8121878

Original commit changeset: 4a5e9a677ba4

fbshipit-source-id: d8e2c0bb145b52fbcca323b22d1d3346f0b3249e
2018-07-26 14:02:04 -07:00
Jerry Zhang
cd5adc7b5f Remove template parameter from Tensor (#13)
Summary:
Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13

Pull Request resolved: https://github.com/pytorch/translate/pull/166

Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125

Closes https://github.com/pytorch/pytorch/pull/9125

Use inheritance for polymorphism and remove the template parameter.
This changes the templating at call sites; the core implementations will change later.

Previously, the Caffe2 Tensor class was bound at compile time to a particular device/context. With this change, we make the device a runtime property (stored inside the tensor) but preserve the same semantics. For example, one has to specify a device type in order to create a Tensor; there are no uninitialized tensors. More specifically, the changes are:

1. We added an extra argument *DeviceType* to most of the Tensor constructors, e.g. Tensor(DeviceType type).
2. The semantics of the constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context) have changed. The second argument, the context, is passed in so that we can call the templated Copy function. Previously it could be in a different context from the source and target; now we enforce that the context, if provided, has the same device type as src.
3. To preserve the 'get-or-construct' semantics of Blob, we added a specialized getter, Blob::GetMutableTensor, that verifies both that the Blob contains a Tensor and that it is of the correct type.
4. Specifically, the Tensor type is no longer default-constructible (as we don't have unknown-device tensors), so some of the code handling STL containers needs to change.

Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.

Reviewed By: xw285cornell

Differential Revision: D8121878

fbshipit-source-id: 4a5e9a677ba4ac82095df959851a054c81eccf81
2018-07-26 10:25:23 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Yangqing Jia
91d76f5dbd Reapply Windows fix
Summary:
The last fix was reverted due to a bug in the internal build (CAFFE2_API causing an error). This one re-applies it along with a few more fixes, notably enabling gtest.

Earlier commit message: Basically, this should make windows {static_lib, shared_lib} * {static_runtime, shared_runtime} * {cpu, gpu} work, except for gpu shared_lib, where willyd kindly pointed out a symbol limit problem. A few highlights:
(1) Updated newest protobuf.
(2) use protoc dllexport command to ensure proper symbol export for windows.
(3) various code updates to make sure that C2 symbols are properly shown
(4) cmake file changes to make build proper
(5) option to choose static runtime and shared runtime similar to protobuf
(6) revert to visual studio 2015 as current cuda and msvc 2017 do not play well together.
(7) enabled gtest and fixed testing bugs.
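The build matrix named in the message can be enumerated as a quick check of how many configurations are involved. This is a sketch only; the predicate `is_expected_to_work` is hypothetical, and it assumes that "gpu shared_lib" excludes both runtime variants of that combination.

```python
from itertools import product

LIBS = ("static_lib", "shared_lib")
RUNTIMES = ("static_runtime", "shared_runtime")
DEVICES = ("cpu", "gpu")


def is_expected_to_work(lib: str, runtime: str, device: str) -> bool:
    # Assumption: the symbol limit problem affects every gpu + shared_lib build,
    # regardless of which runtime is chosen.
    return not (lib == "shared_lib" and device == "gpu")


configs = list(product(LIBS, RUNTIMES, DEVICES))
supported = [c for c in configs if is_expected_to_work(*c)]

for lib, runtime, device in configs:
    status = "ok" if is_expected_to_work(lib, runtime, device) else "excluded"
    print(f"{lib:<11} {runtime:<15} {device:<4} {status}")
```

Under that assumption, the matrix has eight combinations, of which six are expected to work.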

Earlier PR is #1793

Closes https://github.com/caffe2/caffe2/pull/1827

Differential Revision: D6832086

Pulled By: Yangqing

fbshipit-source-id: 85f86e9a992ee5c53c70b484b761c9d6aed721df
2018-01-29 10:03:28 -08:00
Vladimir Chalyshev
8c02674964 Revert D6817719: [caffe2][PR] Better support for windows
Summary:
This reverts commit d286264fccc72bf90a2fcd7da533ecca23ce557e

bypass-lint

An infra SEV is better than not reverting this diff.
If you copy this password, see you in SEV Review!
cause_a_sev_many_files

Differential Revision: D6817719

fbshipit-source-id: 8fe0ad7aba75caaa4c3cac5e0a804ab957a1b836
2018-01-26 06:08:49 -08:00
Yangqing Jia
8aa8eaabb1 Better support for windows
Summary:
Basically, this should make windows {static_lib, shared_lib} * {static_runtime, shared_runtime} * {cpu, gpu} work. A few highlights:

(1) Updated newest protobuf.
(2) use protoc dllexport command to ensure proper symbol export.
(3) various code updates to make sure that C2 symbols are properly shown
(4) cmake file changes to make build proper
(5) option to choose static runtime and shared runtime similar to protobuf
(6) revert to visual studio 2015 as current cuda and msvc 2017 do not play well together.
Closes https://github.com/caffe2/caffe2/pull/1793

Reviewed By: dzhulgakov

Differential Revision: D6817719

Pulled By: Yangqing

fbshipit-source-id: d286264fccc72bf90a2fcd7da533ecca23ce557e
2018-01-26 00:48:43 -08:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Dmytro Dzhulgakov
0d833590c1 Change Allocator interface to return deleter
Summary:
This is in preparation for adding huge pages. There we want to remember, for each pointer, how we got it: via mmap() or alloc(). One option is to store a gigantic map of void* -> destructor, but luckily the usages of Context::New are all inside Tensor, which already uses shared_ptr with a custom deleter.

This diff could have used unique_ptr as the return type, but then it's easy to accidentally call release() and lose the deleter. Thus going with std::pair<void*, MemoryDeleter> to be explicit.

Also, now CPUAllocator can be effectively changed to std::function. Haven't done it yet, but can do if necessary.

Let me know whether it's a bad idea to proceed like this.
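The idea of returning the deleter together with the memory can be modeled in Python. This is an analogy under stated assumptions, not the C++ interface from the diff: `allocate`, `Storage`, and the `use_mmap` flag are hypothetical names, an anonymous `mmap` stands in for the huge-page path, and a `bytearray` stands in for the ordinary allocation.

```python
import mmap
from typing import Callable, Tuple, Union

Buffer = Union[bytearray, mmap.mmap]
Deleter = Callable[[], None]


def allocate(nbytes: int, use_mmap: bool = False) -> Tuple[Buffer, Deleter]:
    """Return the buffer together with the function that releases it."""
    if use_mmap:
        mapping = mmap.mmap(-1, nbytes)  # anonymous mapping, needs nbytes > 0
        return mapping, mapping.close
    # Ordinary memory needs no special cleanup in this model.
    return bytearray(nbytes), lambda: None


class Storage:
    """Owner that keeps the deleter next to the buffer, so it cannot be lost."""

    def __init__(self, nbytes: int, use_mmap: bool = False):
        self.data, self._deleter = allocate(nbytes, use_mmap)

    def release(self) -> None:
        # Run the deleter at most once.
        if self._deleter is not None:
            self._deleter()
            self._deleter = None


storage = Storage(4096, use_mmap=True)
storage.data[0:2] = b"ok"
storage.release()
```

The caller never needs to know which path produced the buffer, which is the point of pairing the pointer with its deleter instead of keeping a separate lookup table.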

Reviewed By: Yangqing

Differential Revision: D5429830

fbshipit-source-id: 8382ab7b81592d51272056c05c122894bb203827
2017-07-17 15:26:27 -07:00
Andrew Gallagher
9c58341809 codemod: use <> includes for gtest headers
Summary: These are system headers and so should be included via `<>`.

Reviewed By: yfeldblum

Differential Revision: D4783480

fbshipit-source-id: 979670b594859b45560cead34f615442dfcc9f8b
2017-03-28 00:50:54 -07:00
Yangqing Jia
559053d3a8 chunky sync
2016-05-13 14:43:48 -07:00
Jeff Donahue
d829950eff change arg order of Copy/Memcpy to follow inputs-then-outputs convention
instead of C memcpy order -- from (dst, src, n) to (n, src, dst)
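The new argument order can be illustrated with a tiny Python stand-in. This is not the Caffe2 function; `copy` here is a hypothetical helper that only shows inputs (`n`, `src`) coming before the output (`dst`).

```python
def copy(n: int, src: bytes, dst: bytearray) -> None:
    """Copy the first n bytes of src into dst: inputs first, output last."""
    dst[:n] = src[:n]


source = b"abcd"
target = bytearray(4)
copy(2, source, target)
print(target)
```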
2015-07-27 21:19:32 -07:00
Yangqing Jia
2ed1077a83 A clean init for Caffe2, removing my earlier hacky
commits.
2015-06-25 16:26:01 -07:00