Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76366
caffe2 is not currently being built for XROS.
Test Plan: CI
Reviewed By: kimishpatel
Differential Revision: D35923922
fbshipit-source-id: 260dacadf0bd5b6bab7833a4ce81e896d280b053
(cherry picked from commit 8370b8dd2519d55a79fa8d45e7951ca8dc0b21a8)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71007
A string copy at Line 417 is currently consuming 125,749,287,000 cycles/day. I suspect the issue is a copy-on-return, but we can experiment with introducing a reference in the middle (sketched below) to see whether that produces a worthwhile saving without changing the interface.
Reference
```
["Inline caffe2::ArgumentHelper::GetSingleArgument @ caffe2/caffe2/utils/proto_utils.cc:417"]
```
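A minimal sketch of the "reference in the middle" experiment, using an invented map-based stand-in rather than the real `caffe2::ArgumentHelper` internals: the by-value return is kept for interface compatibility, but the lookup result is bound to a const reference internally so the only remaining copy is the one made on return.
```
#include <map>
#include <string>

// Hypothetical stand-in for the argument lookup; the real
// caffe2::ArgumentHelper::GetSingleArgument differs.
std::string GetSingleArgumentSketch(
    const std::map<std::string, std::string>& args,
    const std::string& name,
    const std::string& default_value) {
  const auto it = args.find(name);
  // "Reference in the middle": no intermediate std::string is materialized.
  const std::string& found = (it == args.end()) ? default_value : it->second;
  return found;  // the single remaining copy happens here
}
```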
Test Plan: Sandcastle
Reviewed By: xw285cornell
Differential Revision: D33478883
fbshipit-source-id: e863e359c0c718fcd0d52fd4b3c7858067de0670
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69533
Modified loops in files under fbsource/fbcode/caffe2/ from the format
```
for(TYPE var=x0;var<x_max;var++)
```
to the format
```
for(const auto var: irange(x_max))
```
This was achieved by running r-barnes's loop upgrader script (D28874212), with some modifications to exclude all files under /torch/jit and with a number of reversions and unused-variable warning suppressions added by hand.
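For concreteness, a small before/after sketch of the conversion (the loop body is invented; `c10::irange` is assumed to come from `c10/util/irange.h`):
```
#include <c10/util/irange.h>
#include <vector>

void scale(std::vector<float>& v, float k) {
  // Before: for (size_t i = 0; i < v.size(); i++) { v[i] *= k; }
  for (const auto i : c10::irange(v.size())) {
    v[i] *= k;
  }
}
```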
Test Plan: Sandcastle
Reviewed By: malfet
Differential Revision: D32837942
fbshipit-source-id: 8663037a38ade8f81bd5e983a614d197ea11f0d1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66743
Modified loops in files under fbsource/fbcode/caffe2/ from the format
`for(TYPE var=x0;var<x_max;var++)`
to the format
`for(const auto var: irange(x_max))`
This was achieved by running r-barnes's loop upgrader script (D28874212), with some modifications to exclude all files under /torch/jit and with a number of reversions and unused-variable warning suppressions added by hand.
Test Plan: Sandcastle
Reviewed By: malfet
Differential Revision: D31705359
fbshipit-source-id: c9ea2fbc0f9cd29e97a52dcb203addc5f2abb09b
Summary:
This PR is to update PyTorch with the following cub changes:
- Starting with cub 1.13.1, cub requires users to define `CUB_NS_QUALIFIER` if `CUB_NS_PREFIX` is also defined. Besides that, a new mechanism, `CUB_WRAPPED_NAMESPACE`, is added.
And I make the following changes to PyTorch:
- Starting with CUDA 11.5, define `CUB_WRAPPED_NAMESPACE` globally as an nvcc flag (see the sketch below).
- Fix caffe2 failures caused by the above change.
- Add an `aten/src/ATen/cuda/cub_definitions.cuh` header that defines helper macros for feature availability.
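A rough sketch of what the wrapped namespace buys (the wrapper namespace name below is an assumption for illustration, not taken from this diff):
```
// Defining CUB_WRAPPED_NAMESPACE=at_cuda_detail on the nvcc command line is
// roughly equivalent to spelling out the older macro trio by hand:
#define CUB_NS_PREFIX namespace at_cuda_detail {
#define CUB_NS_POSTFIX }
#define CUB_NS_QUALIFIER ::at_cuda_detail::cub
// cub symbols are then referenced as at_cuda_detail::cub::... instead of
// cub::..., so the copy of cub built into PyTorch cannot clash with another
// cub linked into the same binary.
```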
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66219
Reviewed By: bdhirsh
Differential Revision: D31626931
Pulled By: ngimel
fbshipit-source-id: 97ebf5ef671ade8bf46d0860edc317f22660f26d
Summary:
CAFFE2 has been deprecated for a while, but is still included in every PyTorch build.
We should stop building it by default, although CI should still validate that caffe2 code is buildable.
Build even fewer dependencies when compiling mobile builds without Caffe2
Introduce `TEST_CAFFE2` in torch.common.utils
Skip `TestQuantizedEmbeddingOps` and `TestJit.test_old_models_bc` if the code is compiled without Caffe2
Should be landed after https://github.com/pytorch/builder/pull/864
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66658
Reviewed By: driazati, seemethere, janeyx99
Differential Revision: D31669156
Pulled By: malfet
fbshipit-source-id: 1cc45e2d402daf913a4685eb9f841cc3863e458d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66234
Modified loops in files under fbsource/fbcode/caffe2/ from the format
`for(TYPE var=x0;var<x_max;var++)`
to the format
`for(const auto var: irange(x_max))`
This was achieved by running r-barnes's loop upgrader script (D28874212), with some modifications to exclude all files under /torch/jit and with a number of reversions and unused-variable warning suppressions added by hand.
bypass_size_limit
allow-large-files
Test Plan: Sandcastle
Reviewed By: ngimel
Differential Revision: D30652629
fbshipit-source-id: 0ae6c4bbbb554bad42e372792a6430e1acf15e3e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65245
Building and running c10 and qnnpack tests on XROS.
Notable changes:
- Adding `#if defined(_XROS_)` guards in a few places not supported by XROS
- Changing Threadpool to an abstract class (see the sketch below)
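A hypothetical sketch of the abstract-interface shape (invented names and signatures, not the actual caffe2 Threadpool declaration): making the type abstract lets XROS plug in its own implementation behind the same interface.
```
#include <cstddef>
#include <functional>

class ThreadPoolInterface {
 public:
  virtual ~ThreadPoolInterface() = default;
  virtual size_t getNumThreads() const = 0;
  // Run fn(thread_id, task_id) for task_id in [0, range).
  virtual void run(const std::function<void(int, size_t)>& fn,
                   size_t range) = 0;
};
```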
ghstack-source-id: 139513579
Test Plan: Run c10 and qnnpack tests on XROS.
Reviewed By: veselinp, iseeyuan
Differential Revision: D30137333
fbshipit-source-id: bb6239b935187fac712834341fe5a8d3377762b1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65610
- Replace HIP_PLATFORM_HCC with USE_ROCM (see the sketch below).
- Don't rely on CUDA_VERSION or HIP_VERSION; use USE_ROCM and ROCM_VERSION instead.
- In the next PR:
  - Remove the mapping from CUDA_VERSION to HIP_VERSION and from CUDA to HIP in hipify.
  - Since HIP_PLATFORM_HCC is deprecated, add HIP_PLATFORM_AMD to support HIP host code compilation on gcc.
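An illustrative sketch of the guard change (the function is invented; the exact files touched differ): platform checks move from the HIP-compiler macro to the build-level flag.
```
bool built_for_rocm() {
#if defined(USE_ROCM)
  // Previously this would have been guarded with __HIP_PLATFORM_HCC__.
  return true;
#else
  return false;
#endif
}
```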
cc jeffdaily sunway513 jithunnair-amd ROCmSupport amathews-amd
Reviewed By: jbschlosser
Differential Revision: D30909053
Pulled By: ezyang
fbshipit-source-id: 224a966ebf1aaec79beccbbd686fdf3d49267e06
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64285
With C++14 heterogeneous ordered container lookup, it is no longer necessary to create a `std::string` in order to look up elements of a `CaffeMap` keyed by std::string. Accordingly, this diff reworks the argument-getting operator functions to avoid that in favor of `c10::string_view`.
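A minimal sketch of heterogeneous lookup, illustrated here with C++17 `std::string_view` and `std::less<>` rather than `c10::string_view` and the actual CaffeMap type: a transparent comparator lets `find()` take a string view directly, so no temporary `std::string` is constructed per lookup.
```
#include <map>
#include <string>
#include <string_view>

int get_arg(const std::map<std::string, int, std::less<>>& args,
            std::string_view name,
            int default_value) {
  auto it = args.find(name);  // no std::string materialized here
  return it == args.end() ? default_value : it->second;
}
```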
ghstack-source-id: 137139818
Test Plan: buildsizebot iOS apps -- code size win. Fewer strings is probably marginally good for perf, but this only happens at setup time anyway.
Reviewed By: dzhulgakov
Differential Revision: D26826676
fbshipit-source-id: ee653b14dc2c528bae8c90f0fc6a7a419cbca1d6
Summary:
- HIP_VERSION semantic versioning will change in ROCm 4.3. These changes essentially remove the dependency on the HIP_VERSION provided in the hip header, to keep the code compatible with both older and newer versions of ROCm.
- TORCH_HIP_VERSION is derived from HIP_VERSION_MAJOR and HIP_VERSION_MINOR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62786
Reviewed By: bdhirsh
Differential Revision: D30281682
Pulled By: seemethere
fbshipit-source-id: e41e69fb9e13de5ddd1af99ba5bbdcbb7b64b673
Summary:
The cases were found by compiling with clang on Windows.
Those functions would still be exported in that case, which is a waste of space in the symbol table.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62952
Reviewed By: gchanan
Differential Revision: D30191291
Pulled By: ezyang
fbshipit-source-id: 3319b0ec4f5fb02e0fe1b81dbbcedcf12a0c795e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62493
This diff adds a broadcast fastpath for the caffe2 broadcast utility function, which just copies the contents of a smaller tensor into a larger one. We also update the tests to exercise the new functionality.
Test Plan: unit tests + let CI run
Differential Revision: D29938285
fbshipit-source-id: 543ecc548500380e307be91902696033454964a2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62428
In this diff we add a broadcast fastpath for reduce utility functions. These functions are used by various elementwise ops, whose tests we update to exercise the new functionality.
Test Plan: Added test cases to elementwise ops (which will exercise the new reducer functionality) that will be run by CI. It's worth noting there's still no code (outside of the new test cases) that takes the new code paths added -- the user must explicitly request `allow_broadcast_fastpath=True`, and nothing outside of the added tests currently does so.
Differential Revision: D29938264
fbshipit-source-id: 5d5542bd93afb85fd9f7a4073f766adc07eb3b65
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62369
This diff is a big no-op that just sets up scaffolding for passing the "allow_broadcast_fastpath" from caffe2 operator protos created in Python down to C++. To facilitate this, we create helper template wrappers that pass a flag for "allow_broadcast_fastpath" down to elementwise functors. This flag will determine whether to try and take the broadcast fastpath, which we will add in subsequent diffs.
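A hypothetical sketch of the wrapper idea (names and signatures are invented for illustration; the real caffe2 functor interfaces differ): a thin template carries the allow_broadcast_fastpath flag alongside an existing elementwise functor, so the flag can be threaded through without changing every call site at once.
```
#include <utility>

template <typename Functor>
struct WithBroadcastFastpath {
  Functor functor;
  bool allow_broadcast_fastpath = false;

  template <typename... Args>
  auto operator()(Args&&... args) const {
    // For now the flag only selects an execution path; the result is the
    // same whether or not the fastpath is taken.
    return functor(allow_broadcast_fastpath, std::forward<Args>(args)...);
  }
};
```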
Test Plan: sandcastle + let github CI run
Differential Revision: D28154475
fbshipit-source-id: 15750a0bcd2994fbc6a61fb5653d8cae6b0177dd
Summary:
Disables the `cppcoreguidelines-avoid-non-const-global-variables` clang-tidy check, as the GoogleTest `TEST` macro is non-compliant with it, as is `DEFINE_DISPATCH`.
All changes but the ones to `.clang-tidy` are generated using the following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`; do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008
Reviewed By: driazati, r-barnes
Differential Revision: D29838584
Pulled By: malfet
fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60402
Add float64 data type support for ScatterWeightedSum, for cases where float32's ~10^7 precision is not sufficient.
Test Plan: buck test caffe2/caffe2/python/operator_test:sparse_ops_test -- testScatterWeightedSum
Reviewed By: jianyuh
Differential Revision: D29190324
fbshipit-source-id: 871a60744694e901a2c7685a67350860745d6729
Summary:
Enables an important performance optimization for ROCm, in light of the discussion in https://github.com/pytorch/pytorch/issues/41028.
CC jithunnair-amd sunway513
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60607
Reviewed By: jbschlosser
Differential Revision: D29409894
Pulled By: ngimel
fbshipit-source-id: effca258a0f37eaefa35674a7fd19459ca7dc95b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60677
Add a rule to wrap conversions.h and depend on that, rather than
relying on a glob which violates package boundaries.
Test Plan: `buck2 build fbcode//caffe2/caffe2:caffe2_core`
Reviewed By: mzlee
Differential Revision: D29370841
fbshipit-source-id: d4dd383eb8457d4f5118574e34e6f17c32fde647
Summary:
Add a rule to wrap proto_utils.h and depend on that, rather than
relying on a glob which violates package boundaries.
Reviewed By: igorsugak
Differential Revision: D29273453
fbshipit-source-id: 08f198a03d06ee2fdf61f5dbe1d0087db22aec8b
Summary:
Add a rule to wrap simple_queue.h and depend on that, rather than
relying on a glob which violates package boundaries.
Test Plan: `buck2 build fbcode//caffe2/caffe2:caffe2_core`
Reviewed By: igorsugak
Differential Revision: D29273415
fbshipit-source-id: f2b62a82cd6478bd71a8194d661d1c8b023c0953
Summary:
Fixes https://github.com/pytorch/pytorch/issues/57273.
Some users reported that they dislike the Caffe2 thread-pool leak warning, as it floods their logs, and have requested disabling it, or have asked for a way to filter it.
In the binary distribution, the caffe2 pthreadpool apparently already exists because of some dependency, so a `torch.set_num_threads()` invocation isn't required to reproduce the issue (unlike when building from the master branch).
The test script in https://github.com/pytorch/pytorch/issues/60171 does have a `set_num_threads` invocation, which is why I was able to reproduce the issue after building from the master branch's source.
cc malfet & ejguan, who have the authority to make a decision.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60318
Reviewed By: albanD
Differential Revision: D29265771
Pulled By: ezyang
fbshipit-source-id: 26f678af2fec45ef8f7e1d39a57559790eb9e94b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59355
Add a `CheckKnob()` function for doing run-time checks of feature roll-out
knobs. This provides an API for safely controlling the roll-out of new
functionality in the code.
Test Plan: Included some basic unit tests.
Reviewed By: voznesenskym
Differential Revision: D26536430
fbshipit-source-id: 2e53234c6d9ce624848fc8b2c76f6833f344f48b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58759
* Makes `pthreadpool()->run` respect `_NoPThreadPoolGuard`: when the guard is present, tasks run on the calling thread instead of being parallelized (see the usage sketch below).
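A usage sketch: the guard and pool accessor names come from this diff's summary, but the header path, namespace qualification, and `run()` signature are assumptions rather than taken from the code.
```
#include <caffe2/utils/threadpool/pthreadpool-cpp.h>

#include <cstddef>
#include <functional>

void run_serially(const std::function<void(size_t)>& fn, size_t range) {
  // While the guard is alive, run() executes the whole range on the calling
  // thread instead of dispatching work to the pool's threads.
  caffe2::_NoPThreadPoolGuard guard;
  caffe2::pthreadpool()->run(fn, range);
}
```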
Test Plan:
buck build //xplat/caffe2:aten_test_test_thread_pool_guard
./buck-out/last/aten_test_test_thread_pool_guard
Reviewed By: kimishpatel
Differential Revision: D28597425
fbshipit-source-id: 0365ad9947c239f5b37ce682802d4d401b8b0a48
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname, "-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit", "--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56717
The signal_handler was under the caffe2 namespace but was being used
by PyTorch as well.
I've fixed this by moving it to the c10 namespace, where both C2 and PyTorch
can now use it.
The signal_handler interface in caffe2/utils/signal_handler.h is kept the same
for backward compatibility for C2, but most of the common code has moved to c10.
ghstack-source-id: 127446929
Test Plan: waitforbuildbot
Reviewed By: ezyang
Differential Revision: D27946738
fbshipit-source-id: d6228d1a0108f4c807d405e7a0bb799c5375388f
Summary:
This cuts out caffe2's old backtrace generation in favor of the one already in c10.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56198
Pulled By: driazati
Reviewed By: nikithamalgifb
Differential Revision: D27868282
fbshipit-source-id: aa9b9691271eaa3f95baab48773ffefebd924ae2
Summary:
This guards some deprecated usages of the Protobuf API behind an `#ifdef` (this is how onnx does it as well).
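A hedged illustration of the `#ifdef` pattern: the version threshold and the specific call below are assumptions for the sake of a concrete example, not necessarily what this diff guards; `SetTotalBytesLimit`'s two-argument overload is one Protobuf API that was deprecated in favor of a single-argument form.
```
#include <climits>
#include <google/protobuf/io/coded_stream.h>

void set_read_limit(google::protobuf::io::CodedInputStream& stream) {
  // Select the call based on the protobuf version at compile time.
#if GOOGLE_PROTOBUF_VERSION >= 3008000
  stream.SetTotalBytesLimit(INT_MAX);
#else
  stream.SetTotalBytesLimit(INT_MAX, INT_MAX);
#endif
}
```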
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56186
Pulled By: driazati
Reviewed By: bertmaher, dzhulgakov
Differential Revision: D27803121
fbshipit-source-id: 2d3a348ec1ab9879a0d8f2dff17c5444fd4baf2c
Summary:
Following up on https://github.com/pytorch/pytorch/pull/54895#discussion_r606402656.
A race-condition wouldn't arise because `leak_corrupted_threadpool` can be set to true only after fork via the `pthread_atfork` handler, when a (child) process would be single-threaded. It's set to false also when the process is still single-threaded (`pthreadpool` is called during an invocation to `set_num_threads`, prior to which a child process would remain single-threaded). All threads (if & when multiple threads would be created) would always see `leak_corrupted_threadpool` as false if it would be accessed concurrently.
Since no reader threads can exist while a writer thread changes its value (false->true and true->false), `leak_corrupted_threadpool` might as well be a non-atomic bool.
### Pros
1. No thread-synchronization is required for `leak_corrupted_threadpool`, as it's a non-atomic bool.
2. The call to `compare_exchange_strong` has been removed.
cc: malfet VitalyFedyunin ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55341
Reviewed By: albanD
Differential Revision: D27669442
Pulled By: ezyang
fbshipit-source-id: 926cb5c1b0a537c1c2ab164b0d51d37c1f1b67f0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55435
We've seen reports from the macos skylight app that PyTorch is very slow due to the lack of cap support in pthreadpool. For mac builds, we set the thread count to `#threads/2`.
ghstack-source-id: 125900852
Test Plan:
- Sandcastle CI
- CircleCI
Reviewed By: kimishpatel
Differential Revision: D27578871
fbshipit-source-id: 7b947bc5d6cf289378abf5f479575e112325d02b
Summary:
As titled: make shape inference work for a model with only distributed parts.
Previously, we relied on a full_predictor net to do shape inference. For very large models, the full_predictor net won't be generated, so we have to do shape inference based on the distributed parts. Surprisingly, the PredictorCall op does tensor name mapping, so it also needs a shape inference function.
Test Plan: Added unittests.
Reviewed By: khabinov
Differential Revision: D27250956
fbshipit-source-id: 3ebd36ba1eb020bb5d00358cffb8f038a6a996e8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55003
Using the `caffe2::setPrintStackTracesOnFatalSignal` utility in
distributed tests to set a signal handler that dumps the state of all threads
for all processes when it receives a FATAL signal. This would help in debugging
tests further.
I had to revert all the python faulthandler code, since only one signal handler
function is supported and running the python faulthandler together with
`setPrintStackTracesOnFatalSignal` doesn't work.
Sample output:
```
SIGSEGV(11), PID: 3492872, Thread 3492872:
[0] ???(0x7fa7b2d1d61b) in libcaffe2_caffe2_caffe2_cpu.so
[1] ???(0x7fa7b2d1d3fb) in libcaffe2_caffe2_caffe2_cpu.so
[2] ???(0x7fa7b2d1d33d) in libcaffe2_caffe2_caffe2_cpu.so
[3] ???(0x7fa7b2d1d167) in libcaffe2_caffe2_caffe2_cpu.so
[4] ???(0x7fa7ce683150) in libpthread.so.0
[5] ???(0x7fa7be2b233c) in libcaffe2__C_impl_cuda.so
[6] ???(0x7fa7be2ce80c) in libcaffe2__C_impl_cuda.so
[7] ???(0x7fa7be2a0512) in libcaffe2__C_impl_cuda.so
[8] torch::distributed::rpc::TensorPipeAgent::send(torch::distributed::rpc::WorkerInfo const&, torch::distributed::rpc::Message&&, float, std::unordered_map<signed char, signed char, std::hash<signed char>, std::equal_to<signed char>, std::allocator<std::pair<signed char const, signed char> > > const&)+0x24f(0x7fa7be29f71f) in libcaffe2__C_impl_cuda.so
[9] torch::distributed::autograd::sendMessageWithAutograd(torch::distributed::rpc::RpcAgent&, torch::distributed::rpc::WorkerInfo const&, torch::distributed::rpc::Message&&, bool, float, bool)+0x393(0x7fa7b602b203) in libcaffe2_libtorch.so
[10] torch::distributed::rpc::pyRpcPythonUdf(torch::distributed::rpc::WorkerInfo const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::vector<at::Tensor, std::allocator<at::Tensor> >&, float, bool)+0x201(0x7fa7bd844971) in libcaffe2__C_impl_cuda.so
```
ghstack-source-id: 125630551
Test Plan: waitforbuildbot
Reviewed By: SciPioneer
Differential Revision: D27419714
fbshipit-source-id: 8aca9a14ef688004053d8798124d9c3a3fbe3489
Summary:
## Problem summary
Fixes https://github.com/pytorch/pytorch/issues/54752 - when the number of threads is more than 3 and at least one `set_num_threads` invocation has taken place before the dataloader forks child processes, calling `set_num_threads(1)` in a child process causes a segfault. During that invocation, the child process ends up handling the data structures of the parent process's Caffe2 thread-pool, which it inherits from the parent via fork's copy-on-write semantics (the threads themselves don't exist in the child process, but some of the thread-pool's data structures do).
## Solution
malfet [advised](https://github.com/pytorch/pytorch/issues/54752#issuecomment-810315302) & [authored code](https://github.com/pytorch/pytorch/pull/54895#pullrequestreview-625670122) for adding a `pthread_atfork` handler in `pytorch/caffe2/utils/threadpool/pthreadpool-cpp.cc`, which is invoked in the child process right after fork, to leak the Caffe2 thread-pool (the child inherits the thread-pool's data structures from its parent process, but doesn't actually have those threads, since after `fork` a child process only has one thread).
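A minimal sketch of the mechanism (the flag here is a simplified stand-in; the real handler deals with the actual thread-pool object):
```
#include <pthread.h>

namespace {
// Simplified stand-in for the inherited pool state.
bool leak_corrupted_threadpool = false;

void child_after_fork() {
  // Called in the child right after fork(): abandon ("leak") the inherited
  // pool state instead of letting the single-threaded child try to use it.
  leak_corrupted_threadpool = true;
}
}  // namespace

void install_fork_handler() {
  pthread_atfork(/*prepare=*/nullptr, /*parent=*/nullptr,
                 /*child=*/child_after_fork);
}
```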
## Additional changes
Added unittest `test_no_segfault` to test for this issue in `test_dataloader.py`
Also enabled `test_segfault` (which actually makes sure that segfaults happen in worker processes in a particular case).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54895
Reviewed By: zhangguanheng66
Differential Revision: D27542253
Pulled By: malfet
fbshipit-source-id: 10f9c67ce1ff1aa37d3efebf405bd93f7f9d2489
Summary:
*Context:* https://github.com/pytorch/pytorch/issues/53406 added a lint for trailing whitespace at the ends of lines. However, in order to pass FB-internal lints, that PR also had to normalize the trailing newlines in four of the files it touched. This PR adds an OSS lint to normalize trailing newlines.
The changes to the following files (made in 54847d0adb9be71be4979cead3d9d4c02160e4cd) are the only manually-written parts of this PR:
- `.github/workflows/lint.yml`
- `mypy-strict.ini`
- `tools/README.md`
- `tools/test/test_trailing_newlines.py`
- `tools/trailing_newlines.py`
I would have liked to make this just a shell one-liner like the other three similar lints, but nothing I could find quite fit the bill. Specifically, all the answers I tried from the following Stack Overflow questions were far too slow (at least a minute and a half to run on this entire repository):
- [How to detect file ends in newline?](https://stackoverflow.com/q/38746)
- [How do I find files that do not end with a newline/linefeed?](https://stackoverflow.com/q/4631068)
- [How to list all files in the Git index without newline at end of file](https://stackoverflow.com/q/27624800)
- [Linux - check if there is an empty line at the end of a file [duplicate]](https://stackoverflow.com/q/34943632)
- [git ensure newline at end of each file](https://stackoverflow.com/q/57770972)
To avoid giving false positives during the few days after this PR is merged, we should probably only merge it after https://github.com/pytorch/pytorch/issues/54967.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54737
Test Plan:
Running the shell script from the "Ensure correct trailing newlines" step in the `quick-checks` job of `.github/workflows/lint.yml` should print no output and exit in a fraction of a second with a status of 0. That was not the case prior to this PR, as shown by this failing GHA workflow run on an earlier draft of this PR:
- https://github.com/pytorch/pytorch/runs/2197446987?check_suite_focus=true
In contrast, this run (after correcting the trailing newlines in this PR) succeeded:
- https://github.com/pytorch/pytorch/pull/54737/checks?check_run_id=2197553241
To unit-test `tools/trailing_newlines.py` itself (this is run as part of our "Test tools" GitHub Actions workflow):
```
python tools/test/test_trailing_newlines.py
```
Reviewed By: malfet
Differential Revision: D27409736
Pulled By: samestep
fbshipit-source-id: 46f565227046b39f68349bbd5633105b2d2e9b19
Summary:
Fix Semmle warning: comparison of a narrow type with a wide type in a loop condition.
For example, consider the following piece of code:
```
for (int i=0; i<array.size(); ++i) {}
```
The problem is that `array.size()` returns `size_t`, which can be a wider type than `int` depending on the implementation, so there is a chance that `i` overflows (for a very large array whose size is beyond the range of `int`) and the loop never terminates.
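One common fix (shown here as a sketch, not necessarily the exact change made in this PR) is to give the induction variable the same width as the container's size type:
```
#include <cstddef>
#include <vector>

long long sum(const std::vector<int>& array) {
  long long total = 0;
  // The comparison no longer mixes a narrow int with a wide size_t.
  for (std::size_t i = 0; i < array.size(); ++i) {
    total += array[i];
  }
  return total;
}
```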
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53951
Reviewed By: zou3519
Differential Revision: D27181495
Pulled By: malfet
fbshipit-source-id: 0612c5cedcdc656c193085e7fbb87dd163f20688
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50862
Add all missing kernel launch checks for all .cu and .cuh files under caffe2/caffe2/utils.
Test Plan:
Building with ```buck build //caffe2/caffe2:``` gives no errors.
All tests pass with ```buck test //caffe2/caffe2:```.
Ran the kernel launch check to ensure nothing shows up under `fbcode/caffe2/caffe2/utils`.
The PR on GitHub shows all tests passing: https://github.com/pytorch/pytorch/actions/runs/500036434
Reviewed By: r-barnes
Differential Revision: D25987367
fbshipit-source-id: 52add63a14f2da855c784ab24468f64056c93836
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50238
Added `C10_CUDA_KERNEL_LAUNCH_CHECK();` after all kernel launches in caffe2/caffe2/utils/math
Test Plan:
```
buck build //caffe2/caffe2
```
files in caffe2/caffe2/utils/math no longer show up when running
```
python3 caffe2/torch/testing/check_kernel_launches.py
```
Reviewed By: r-barnes
Differential Revision: D25773299
fbshipit-source-id: 28d67b4b9f57f1fa1e8699e43e9202bad4d42c5f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49574
Adds support for additional Eigen Utils for custom type defs.
Reviewed By: linbinyu
Differential Revision: D25624556
fbshipit-source-id: 0ffa90aaf8cbf1d08825e95156fb40d966ca7042
Summary:
Since caffe2 and torch have been consolidated, CAFFE2_API should be merged with TORCH_API. Addresses a TODO.
Manually edited some references to the removed `CAFFE2_API`:
* `CONTRIBUTING.md`
* `caffe2/proto/CMakeLists.txt`
* `cmake/ProtoBuf.cmake`
* `c10/macros/Export.h`
* `torch/csrc/WindowsTorchApiMacro.h`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49496
Reviewed By: malfet, samestep
Differential Revision: D25600726
Pulled By: janeyx99
fbshipit-source-id: 7e068d959e397ac183c097d7e9a9afeca5ddd782