Summary: Rename static tracepoint macros to better describe their targeted usage.
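For context, a minimal usage sketch of one of these macros (illustrative only; the function, probe name, and the header path `caffe2/core/static_tracepoint.h` are assumptions, not part of this diff):

```cpp
// Hypothetical example: emit a USDT probe that libbpf/bpftrace tooling can
// attach to. When no tracer is attached, the probe is effectively a no-op.
#include <caffe2/core/static_tracepoint.h>

void run_batch(int batch_size) {
  // CAFFE_SDT(name, args...) records the given arguments with the probe;
  // here we expose the batch size to tracing tools.
  CAFFE_SDT(run_batch_start, batch_size);
}
```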
Test Plan:
Same as for D47159249:
Tested the following macros on test scripts with libbpf USDTs:
* `CAFFE_SDT`
* `CAFFE_DISABLE_SDT`
* `CAFFE_SDT_WITH_SEMAPHORE`
Reviewed By: chaekit
Differential Revision: D47727339
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106380
Approved by: https://github.com/chaekit
Summary: Moving the static tracepoint macros header to a location where it can be easily used by various PyTorch components (`c10/util`).
Test Plan:
Same as for D47159249:
Tested the following macros on test scripts with libbpf USDTs:
* `CAFFE_SDT`
* `CAFFE_DISABLE_SDT`
* `CAFFE_SDT_WITH_SEMAPHORE`
Reviewed By: EDG-GH
Differential Revision: D47636258
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105856
Approved by: https://github.com/EDG-GH, https://github.com/chaekit
- BatchLinearAlgebraLib.cpp is now split, with its cuBLAS code moved into one additional file (see the sketch after this list)
- BatchLinearAlgebraLib.cpp uses only cusolver APIs
- BatchLinearAlgebraLibBlas.cpp uses only cublas APIs
- hipify operates at the file level and cannot mix cusolver and cublas APIs within the same file
- cmake changes to link against hipblas instead of rocblas
- hipify mappings changes to map cublas -> hipblas instead of rocblas
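As a sketch of what the split buys (illustrative only; the function below is hypothetical, not code from the diff): a translation unit that touches only the cuBLAS API, which is the shape BatchLinearAlgebraLibBlas.cpp now has, can be hipified wholesale to hipBLAS, while the cuSOLVER calls stay in the other file.

```cpp
#include <cublas_v2.h>

// hipify would rewrite this file to use hipblasHandle_t, hipblasCreate,
// hipblasDestroy, and HIPBLAS_STATUS_SUCCESS, with no cuSOLVER symbols to
// worry about in the same file.
int make_blas_handle() {
  cublasHandle_t handle = nullptr;
  if (cublasCreate(&handle) != CUBLAS_STATUS_SUCCESS) {
    return -1;
  }
  cublasDestroy(handle);
  return 0;
}
```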
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105881
Approved by: https://github.com/albanD
Summary:
Fix existing CAFFE static tracepoint macros and make them match the latest FOLLY version.
Per anakryiko, the current `CAFFE_SDT` definition is broken. Quote:
```
"Arguments: -5@-16(%rbp) -4@$100
Arguments: -8@-16(%rbp) -4@$100
#define FOLLY_SDT_IS_ARRAY_POINTER(x) ((__builtin_classify_type(x) == 14) || \
(__builtin_classify_type(x) == 5))
vs
#define CAFFE_SDT_ISARRAY(x) (__builtin_classify_type(x) == 14)
https://github.com/atgreen/gcc/blob/master/gcc/typeclass.h
that 5 is "pointer_type_class"
so you were right, it's just fixed up version of header
I think it should be 8, not 5
5 is the size of literal, but you don't pass string literal as an argument, you pass its address, so actual argument is a pointer, and so 8 byte long
you can try just fixing up CAFFE_SDT macro
```
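The corrected behavior can be summarized with a hedged sketch (macro names below are illustrative stand-ins, not the actual CAFFE_SDT internals): array and pointer arguments, including string literals, should be recorded with pointer size, which is why the probe argument above becomes `-8@...` rather than `-5@...`.

```cpp
// Illustrative size-selection pattern for SDT-style argument encoding.
// __builtin_classify_type returns 14 for arrays and 5 for pointers
// (see gcc/typeclass.h), and both cases should record sizeof(void*)
// (8 bytes on x86-64) rather than the literal's own length.
#define EXAMPLE_SDT_IS_ARRAY_POINTER(x)  \
  ((__builtin_classify_type(x) == 14) || \
   (__builtin_classify_type(x) == 5))
#define EXAMPLE_SDT_ARG_SIZE(x) \
  (EXAMPLE_SDT_IS_ARRAY_POINTER(x) ? sizeof(void*) : sizeof(x))
```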
Test Plan:
Tested the following macros on test scripts with libbpf USDTs:
* `CAFFE_SDT`
* `CAFFE_DISABLE_SDT`
* `CAFFE_SDT_WITH_SEMAPHORE`
Reviewed By: RihamSelim
Differential Revision: D47159249
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105232
Approved by: https://github.com/chaekit, https://github.com/malfet
This PR enables `-Winconsistent-missing-destructor-override` and `-Winconsistent-missing-override`
and fixes violations.
### <samp>🤖 Generated by Copilot at 47e904e</samp>
This pull request updates various classes and operators in the `caffe2` and `aten` subdirectories to use the `override` specifier instead of the `virtual` keyword for destructors and other virtual functions that override a base class function. This improves readability, code quality, and consistency with C++ best practices. It also modifies `./CMakeLists.txt` to enable warnings for these specifiers without treating them as errors.
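A minimal sketch of what these warnings catch (toy classes, not code from the diff):

```cpp
// -Winconsistent-missing-override and -Winconsistent-missing-destructor-override
// fire when a class marks some of its overriders with `override` but not others.
struct Base {
  virtual ~Base() = default;
  virtual void Run();
};

struct BeforeFix : Base {
  void Run() override;   // marked override...
  virtual ~BeforeFix();  // ...while the overriding destructor is not: warning.
};

struct AfterFix : Base {
  void Run() override;
  ~AfterFix() override = default;  // consistent, and no redundant `virtual`.
};
```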
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104032
Approved by: https://github.com/malfet
remove unused CAFFE2_VERSION macros
Summary:
Nothing reads these and they are completely subsumed by TORCH_VERSION.
Getting rid of these will be helpful for build unification, since they
are also not used internally.
Test Plan: Rely on CI.
Reviewers: sahanp
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97337
Approved by: https://github.com/malfet
Summary:
My team has been hitting a mysterious crash for a few months in a Windows binary that uses Caffe2 inside a worker thread.
When this thread gets destroyed, an error is raised at this line in context_gpu.h, where the operation returns CUDNN_STATUS_INTERNAL_ERROR instead of CUDNN_STATUS_SUCCESS.
When enabling cuDNN debug logs (via the env variables NVIDIA specifies), I can see that the context is destroyed twice, even though this code only destroys it once, so something mysterious is causing a double free.
This seems very similar to the issue/fix described for PyTorch in
https://github.com/pytorch/pytorch/issues/17658 and https://github.com/apache/tvm/pull/8267
PyTorch handles this in the same way, by simply not calling cudnnDestroy.
This seems to have become an issue with CUDA 11, but I tested CUDA 12 as well and found that the issue persists, so this needs to be fixed somehow.
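A hedged sketch of the workaround described above (the wrapper class is hypothetical, not the actual Caffe2 code):

```cpp
#include <cudnn.h>

// Keep the cuDNN handle in a wrapper whose destructor deliberately does NOT
// call cudnnDestroy(), so teardown ordering at thread/process exit cannot
// trigger a second destroy of an already-torn-down context.
class CudnnHandleWrapper {
 public:
  CudnnHandleWrapper() { cudnnCreate(&handle_); }
  ~CudnnHandleWrapper() {
    // Intentionally leak the handle instead of calling cudnnDestroy(handle_):
    // the CUDA context may already be gone at this point, and a second
    // destroy has been observed to crash on Windows.
  }
  cudnnHandle_t get() const { return handle_; }

 private:
  cudnnHandle_t handle_{nullptr};
};
```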
Test Plan:
CI
I checked that the specific Windows binary I am using is able to create and destroy Caffe2-invoking threads without causing the application to crash.
buck run arvr/mode/win/cuda11/opt //arvr/projects/nimble/prod/tools/MonoHandTrackingVis
Differential Revision: D43538017
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95382
Approved by: https://github.com/malfet
This PR introduces the following modifications (see the sketch after this list):
1. We identify const function parameters that can be passed by reference and add the reference.
2. We find more opportunities for passing by value and change them accordingly.
3. Some use-after-move errors are fixed.
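A minimal illustration of the three kinds of change (hypothetical functions, not code from the diff):

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// 1. A const parameter that used to be copied is now taken by const reference.
void logName(const std::string& name);  // was: void logName(const std::string name);

// 2. A sink parameter is taken by value and moved into place.
struct Record {
  explicit Record(std::vector<int> data) : data_(std::move(data)) {}
  std::vector<int> data_;
};

// 3. A use-after-move error: after std::move(data), the local must not be read.
std::size_t buildRecord(std::vector<int> data) {
  Record r(std::move(data));
  // return data.size();   // bug: reads a moved-from object
  return r.data_.size();   // fixed: query the new owner instead
}
```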
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95942
Approved by: https://github.com/Skylion007
Summary: `cuda::` is an ambiguous namespace. Make it explicit: `c10::cuda`.
Differential Revision: D41469007
```
/caffe2/caffe2/core/context_gpu.cu(564): error: "caffe2::cuda" is ambiguous
/caffe2/caffe2/core/context_gpu.cu(564): error: expected a ";"
/caffe2/caffe2/core/context_gpu.cu(568): warning #12-D: parsing restarts here after previous syntax error
Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
/caffe2/caffe2/core/context_gpu.cu(569): error: "caffe2::cuda" is ambiguous
/caffe2/caffe2/core/context_gpu.cu(628): error: "caffe2::cuda" is ambiguous
4 errors detected in the compilation of "/caffe2/caffe2/core/context_gpu.cu".
```
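A self-contained sketch of the ambiguity (toy namespaces, not the real headers): once both a top-level `::cuda` and `c10::cuda` are visible from caffe2 code, the unqualified name `cuda` cannot be resolved, and spelling out `c10::cuda` fixes it.

```cpp
namespace cuda { inline int device_count() { return 1; } }                    // stand-in for ::cuda
namespace c10 { namespace cuda { inline int device_count() { return 2; } } }  // stand-in for c10::cuda

namespace caffe2 {
using namespace ::c10;  // makes c10::cuda visible alongside ::cuda

int count() {
  // return cuda::device_count();     // error: "cuda" is ambiguous
  return c10::cuda::device_count();   // explicit qualification, as in this diff
}
}  // namespace caffe2
```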
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89534
Approved by: https://github.com/malfet
Fix a mix-up of Caffe2_CPU_INCLUDE with Caffe2_GPU_INCLUDE: when a variable is expanded to the parent scope, it should keep the same variable name. This fix corrects compilation in certain build configurations.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87030
Approved by: https://github.com/kit1980
We're no longer building Caffe2 mobile as part of our CI, and it adds a lot of clutter to our make files. Any lingering internal dependencies will use the buck build and so won't be affected.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84338
Approved by: https://github.com/dreiss
Add `TensorImpl::sym_strides`, bind it to Python with `torch.ops.aten.sym_strides`, and use it in `ProxyTensor` and `FakeTensor`.
Before, `ProxyTensor` was generating `ProxySymInt`s for the sizes, but not for the strides. Internally we still represent strides with a `SymIntArrayRef` though, so I ran into some weird issues where sizes were showing up as `ProxySymInt`s, but strides were `PySymInt`s.
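A hedged sketch of how the new accessor pairs with the existing one (assuming the tensor-level C++ surface mirrors the `TensorImpl` method named here; illustrative only, not a test from this PR):

```cpp
#include <ATen/ATen.h>

// Read both sizes and strides symbolically, so tracing code does not mix
// SymInt sizes with concrete integer strides.
void inspect(const at::Tensor& t) {
  c10::SymIntArrayRef sizes = t.sym_sizes();     // already available
  c10::SymIntArrayRef strides = t.sym_strides(); // exposed by this change
  (void)sizes;
  (void)strides;
}
```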
Differential Revision: [D38594558](https://our.internmc.facebook.com/intern/diff/D38594558)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81300
Approved by: https://github.com/ezyang
This PR relands sym_numel #82374 and fixes the iOS build break in commit 8cbd0031c5,
which was a type mismatch in an equality.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82731
Approved by: https://github.com/malfet