pytorch/caffe2
drisspg 3ecfe6be25 [Submodule] Turning flash-attention integration into 3rd party submod (#144120) (#146372)
Summary:

### Sticky points

CUDA-graph RNG handling has changed / deviated from the original implementation. We are left with a dangling 'offset' value and confusing naming, kept for backward compatibility (BC).

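To make the sticky point concrete, here is a minimal sketch (not code from this PR) of the pattern the RNG handling has to support: dropout inside `scaled_dot_product_attention` draws from the Philox generator, so a captured CUDA graph must pick up a fresh offset on every replay rather than baking one in at capture time. The shapes and dropout probability below are arbitrary, and a CUDA build with the flash/SDPA kernels is assumed.

```python
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k, v = q.clone(), q.clone()

# Warm up on a side stream before capture, per the CUDA graphs docs.
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    F.scaled_dot_product_attention(q, k, v, dropout_p=0.1)
torch.cuda.current_stream().wait_stream(s)

g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
    out = F.scaled_dot_product_attention(q, k, v, dropout_p=0.1)

# Each replay must advance the Philox offset so dropout masks differ.
g.replay()
g.replay()
```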
## Dependencies
- Flash PR: https://github.com/Dao-AILab/flash-attention/pull/1419

### Other Points
- The BC linter complains about losing generate.py and its functions, which is not a real BC surface.
cc albanD

imported-using-ghimport

Test Plan:
Imported from OSS

Building in dev:
`buck build @//mode/dev-nosan -c fbcode.nvcc_arch=h100a //caffe2:ATen-cu --show-full-output`

And running `nm` on the .so, I do see that the flash symbols are correctly named:
```
0000000001c3dfb0 t pytorch_flash::run_mha_bwd(pytorch_flash::Flash_bwd_params&, CUstream_st*)::$_0::operator()() const::{lambda()#1}::operator()() const::{lambda()#1}::operator()() const::{lambda()#7}::operator()() const
0000000001c36080 t pytorch_flash::run_mha_fwd(pytorch_flash::Flash_fwd_params&, CUstream_st*, bool)::$_0::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}::operator()() const::{lambda()#6}::operator()() const
0000000001c360e0 t pytorch_flash::run_mha_fwd(pytorch_flash::Flash_fwd_params&, CUstream_st*, bool)::$_0::operator()() const::{lambda()#2}::operator()() const::{lambda()#1}::operator()() const::{lambda()#7}::operator()() const
0000000001c35fc0 t pytorch_flash::run_mha_fwd(pytorch_flash::Flash_fwd_params&, CUstream_st*, bool)::$_0::operator()() const::{lambda()#1}::operator()() const::{lambda()#1}::operator()() const::{lambda()#6}::operator()() const
0000000001c36020 t pytorch_flash::run_mha_fwd(pytorch_flash::Flash_fwd_params&, CUstream_st*, bool)::$_0::operator()() const::{lambda()#1}::operator()() const::{lambda()#1}::operator()() const::{lambda()#7}::operator()() const
```
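
For reference, a quick way to repeat this symbol check from Python (a sketch: it assumes binutils `nm` is on PATH, and the library path below is a placeholder for whatever artifact your build produced):

```python
import subprocess

# Placeholder path; point this at the .so produced by your build.
lib = "libtorch_cuda.so"

# `nm -C` demangles C++ names so the pytorch_flash:: namespace is visible.
nm = subprocess.run(["nm", "-C", lib], capture_output=True, text=True, check=True)
flash_symbols = [ln for ln in nm.stdout.splitlines() if "pytorch_flash::" in ln]
print(f"found {len(flash_symbols)} pytorch_flash symbols")
```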

Reviewed By: vkuzo

Differential Revision: D68502879

Pulled By: drisspg

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146372
Approved by: https://github.com/jbschlosser
2025-02-26 00:10:59 +00:00
| Path | Last commit | Date |
| --- | --- | --- |
| core | [Build] Add COMMIT_SHA to caffe2::GetBuildOptions (#141313) | 2024-11-26 00:09:36 +00:00 |
| perfkernels | [caffe2] Add AVX512 support for box_cox operator (#143627) | 2025-01-07 09:54:39 +00:00 |
| serialize | Add option to serialization config to reduce random reads from get_record_offset when loading with mmap=True (#143880) | 2025-01-31 17:09:20 +00:00 |
| utils | Re-enable some C++ warnings (#142332) | 2024-12-12 04:02:12 +00:00 |
| .clang-format | [BE][clang-format] make macro PyObject_HEAD_INIT(type) and PyVarObject_HEAD_INIT(type, size) have its own line (#136949) | 2024-10-02 18:39:22 +00:00 |
| CMakeLists.txt | [Submodule] Turning flash-attention integration into 3rd party submod (#144120) (#146372) | 2025-02-26 00:10:59 +00:00 |
| unexported_symbols.lds | Hide all symbols in llvm namespace (#63272) | 2021-08-15 11:29:43 -07:00 |
| version_script.lds | Hide all symbols in llvm namespace (#63272) | 2021-08-15 11:29:43 -07:00 |