Commit Graph

1694 Commits

Author SHA1 Message Date
cyy
d7d6190493 [11/N] Use std::nullopt and std::optional (#132396)
Follows #132364
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132396
Approved by: https://github.com/ezyang
2024-08-01 14:46:33 +00:00
PyTorch MergeBot
161bb67116 Revert "Fix static py::object dangling pointer with py::gil_safe_call_once_and_store (#130341)"
This reverts commit ace6decc99.

Reverted https://github.com/pytorch/pytorch/pull/130341 on behalf of https://github.com/clee2000 due to unfortunately the internal pybind update got reverted cc @malfet ([comment](https://github.com/pytorch/pytorch/pull/130341#issuecomment-2253147079))
2024-07-26 17:02:56 +00:00
cyy
eac83479cc Enable Wunused-function and Wunused-result globally (#131596)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131596
Approved by: https://github.com/zou3519
2024-07-25 23:50:12 +00:00
Xuehai Pan
ace6decc99 Fix static py::object dangling pointer with py::gil_safe_call_once_and_store (#130341)
Fix static `py::object`s with `py::gil_safe_call_once_and_store`.

The following code will leak a `py::object` which will call its destructor when shutdown the program. The destructor will call `Py_DECREF(obj.m_ptr)` which may raise a segmentation fault.

```c++
void func() {
    static py::object obj = py::module_::import("foo").attr("bar");

    ...
}
```

The correct code is to use raw pointers rather than the instance.

```c++
void func() {
    static py::object* obj_ptr = new py::object{py::module_::import("foo").attr("bar")};
    py::object obj = *obj_ptr;

    ...
}
```

This PR uses the `py::gil_safe_call_once_and_store` function from `pybind11`, which can run arbitrary initialization code only once under the Python GIL thread safely.

```c++
void func() {
    PYBIND11_CONSTINIT static py::gil_safe_call_once_and_store<py::object> storage;
    py::object obj = storage
                         .call_once_and_store_result(
                             []() -> py::object {
                                 return py::module_::import("foo").attr("bar");
                             }
                         )
                         .get_stored();

    ...
}
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130341
Approved by: https://github.com/ezyang, https://github.com/malfet
2024-07-25 05:53:09 +00:00
Xuehai Pan
740fb22966 [BE][Easy][4/19] enforce style for empty lines in import segments in functorch/ (#129755)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129755
Approved by: https://github.com/zou3519
ghstack dependencies: #129752
2024-07-18 05:08:03 +00:00
PyTorch MergeBot
ea78b0c177 Revert "Fix static py::object dangling pointer with py::gil_safe_call_once_and_store (#130341)"
This reverts commit a17d1e5322.

Reverted https://github.com/pytorch/pytorch/pull/130341 on behalf of https://github.com/izaitsevfb due to internal needs pybind update ([comment](https://github.com/pytorch/pytorch/pull/130341#issuecomment-2226499397))
2024-07-12 23:07:37 +00:00
Xuehai Pan
a17d1e5322 Fix static py::object dangling pointer with py::gil_safe_call_once_and_store (#130341)
Fix static `py::object`s with `py::gil_safe_call_once_and_store`.

The following code will leak a `py::object` which will call its destructor when shutdown the program. The destructor will call `Py_DECREF(obj.m_ptr)` which may raise a segmentation fault.

```c++
void func() {
    static py::object obj = py::module_::import("foo").attr("bar");

    ...
}
```

The correct code is to use raw pointers rather than the instance.

```c++
void func() {
    static py::object* obj_ptr = new py::object{py::module_::import("foo").attr("bar")};
    py::object obj = *obj_ptr;

    ...
}
```

This PR uses the `py::gil_safe_call_once_and_store` function from `pybind11`, which can run arbitrary initialization code only once under the Python GIL thread safely.

```c++
void func() {
    PYBIND11_CONSTINIT static py::gil_safe_call_once_and_store<py::object> storage;
    py::object obj = storage
                         .call_once_and_store_result(
                             []() -> py::object {
                                 return py::module_::import("foo").attr("bar");
                             }
                         )
                         .get_stored();

    ...
}
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130341
Approved by: https://github.com/ezyang
2024-07-10 04:23:37 +00:00
cyy
c219fa5eb9 [3/N] Remove unused functions (#128179)
Following https://github.com/pytorch/pytorch/pull/128005, this PR continues to remove unused functions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128179
Approved by: https://github.com/ezyang
2024-06-07 16:13:16 +00:00
PaliC
a25b28a753 [Split Build] Add option to create libtorch wheel and use it to build pytorch as a separate wheel (#126328)
Creates an option to just build the libtorch portion of pytorch such that we have the necessary .so files.  Then it builds a torch package using the libtorch wheel. These options are enabled using ` BUILD_LIBTORCH_WHL` and `BUILD_PYTHON_ONLY`.

We run

```
 BUILD_LIBTORCH_WHL=1 python setup.py install
python setup.py clean
BUILD_PYTHON_ONLY=1 python setup.py install
```

to produce

```
sahanp@devgpu086 ~/pytorch (detached HEAD|REBASE-i 3/5)> ls /home/sahanp/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/torch/lib/                                                                                                                (pytorch-3.10)
libshm.so*  libtorch_global_deps.so*  libtorch_python.so*
sahanp@devgpu086 ~/pytorch (detached HEAD|REBASE-i 3/5)> ldd build/lib/libtorch_python.so                                                                                                                                                                (pytorch-3.10)
        linux-vdso.so.1 (0x00007ffdc2d37000)
        libtorch.so => /home/sahanp/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/libtorch/lib/libtorch.so (0x00007f539fe99000)
        libshm.so => /home/sahanp/pytorch/build/lib/libshm.so (0x00007f539fe90000)
        libcudnn.so.8 => /usr/local/cuda-12.1/targets/x86_64-linux/lib/libcudnn.so.8 (0x00007f539e800000)
        libnvToolsExt.so.1 => /usr/local/cuda/lib64/libnvToolsExt.so.1 (0x00007f539e400000)
        libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f539e000000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f539fda5000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f539ebe5000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f539dc00000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f539fea0000)
        libtorch_cpu.so => /home/sahanp/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/libtorch/lib/libtorch_cpu.so (0x00007f5392400000)
        libtorch_cuda.so => /home/sahanp/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/libtorch/lib/libtorch_cuda.so (0x00007f5380000000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f539fd9e000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f539fd99000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f539fd94000)
        libc10.so => /home/sahanp/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/libtorch/lib/libc10.so (0x00007f539eb07000)
        libmkl_intel_lp64.so.2 => /home/sahanp/.conda/envs/pytorch-3.10/lib/libmkl_intel_lp64.so.2 (0x00007f537ec00000)
        libmkl_gnu_thread.so.2 => /home/sahanp/.conda/envs/pytorch-3.10/lib/libmkl_gnu_thread.so.2 (0x00007f537ce00000)
        libmkl_core.so.2 => /home/sahanp/.conda/envs/pytorch-3.10/lib/libmkl_core.so.2 (0x00007f5378800000)
        libomp.so => /home/sahanp/.conda/envs/pytorch-3.10/lib/libomp.so (0x00007f539e707000)
        libcupti.so.12 => /usr/local/cuda/lib64/libcupti.so.12 (0x00007f5377e00000)
        libcudart.so.12 => /usr/local/cuda/lib64/libcudart.so.12 (0x00007f5377a00000)
        libc10_cuda.so => /home/sahanp/.conda/envs/pytorch-3.10/lib/python3.10/site-packages/libtorch/lib/libc10_cuda.so (0x00007f539ea6a000)
        libcusparse.so.12 => /usr/local/cuda/lib64/libcusparse.so.12 (0x00007f5368400000)
        libcufft.so.11 => /usr/local/cuda/lib64/libcufft.so.11 (0x00007f535ee00000)
        libcusolver.so.11 => /usr/local/cuda/lib64/libcusolver.so.11 (0x00007f534c800000)
        libcurand.so.10 => /usr/local/cuda/lib64/libcurand.so.10 (0x00007f5346200000)
        libcublas.so.12 => /usr/local/cuda/lib64/libcublas.so.12 (0x00007f533f800000)
        libcublasLt.so.12 => /usr/local/cuda/lib64/libcublasLt.so.12 (0x00007f531e800000)
        libutil.so.1 => /lib64/libutil.so.1 (0x00007f539ea63000)
        libnvJitLink.so.12 => /usr/local/cuda/lib64/libnvJitLink.so.12 (0x00007f531b800000)
sahanp@devgpu086 ~/pytorch (detached HEAD|REBASE-i 3/5)> ldd build/lib/libtorch_global_deps.so                                                                                                                                                           (pytorch-3.10)
        linux-vdso.so.1 (0x00007ffc265df000)
        libmkl_intel_lp64.so.2 => /home/sahanp/.conda/envs/pytorch-3.10/lib/libmkl_intel_lp64.so.2 (0x00007fa93fc00000)
        libmkl_gnu_thread.so.2 => /home/sahanp/.conda/envs/pytorch-3.10/lib/libmkl_gnu_thread.so.2 (0x00007fa93de00000)
        libmkl_core.so.2 => /home/sahanp/.conda/envs/pytorch-3.10/lib/libmkl_core.so.2 (0x00007fa939800000)
        libm.so.6 => /lib64/libm.so.6 (0x00007fa940f05000)
        libcudart.so.12 => /usr/local/cuda/lib64/libcudart.so.12 (0x00007fa939400000)
        libnvToolsExt.so.1 => /usr/local/cuda/lib64/libnvToolsExt.so.1 (0x00007fa939000000)
        libgomp.so.1 => /home/sahanp/.conda/envs/pytorch-3.10/lib/libgomp.so.1 (0x00007fa93fb07000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fa938c00000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fa940efe000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fa940ef9000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fa940ff5000)
        librt.so.1 => /lib64/librt.so.1 (0x00007fa940ef2000)
        libstdc++.so.6 => /home/sahanp/.conda/envs/pytorch-3.10/lib/libstdc++.so.6 (0x00007fa93921d000)
        libgcc_s.so.1 => /home/sahanp/.conda/envs/pytorch-3.10/lib/libgcc_s.so.1 (0x00007fa93faec000)
        ```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126328
Approved by: https://github.com/atalman
2024-05-29 04:33:56 +00:00
Xuehai Pan
26f4f10ac8 [5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126
Approved by: https://github.com/kit1980
2024-05-27 14:49:57 +00:00
PyTorch MergeBot
55c0ab2887 Revert "[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)"
This reverts commit 7763c83af6.

Reverted https://github.com/pytorch/pytorch/pull/127126 on behalf of https://github.com/XuehaiPan due to Broken CI ([comment](https://github.com/pytorch/pytorch/pull/127126#issuecomment-2133044286))
2024-05-27 09:22:08 +00:00
Xuehai Pan
7763c83af6 [5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126
Approved by: https://github.com/kit1980
ghstack dependencies: #127122, #127123, #127124, #127125
2024-05-27 04:22:18 +00:00
Xuehai Pan
a28bfb5ed5 [4/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort functorch (#127125)
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127125
Approved by: https://github.com/Skylion007
ghstack dependencies: #127122, #127123, #127124
2024-05-25 22:45:38 +00:00
chilli
f9a7033194 Refactor partitioner and clean it up (#126318)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126318
Approved by: https://github.com/anijain2305
2024-05-17 06:15:00 +00:00
Richard Barnes
ed327876f5 [codemod] c10:optional -> std::optional (#126135)
Generated by running the following from PyTorch root:
```
find . -regex ".*\.\(cpp\|h\|cu\|hpp\|cc\|cxx\)$" | grep -v "build/" | xargs -n 50 -P 4 perl -pi -e 's/c10::optional/std::optional/'
```

`c10::optional` is just an alias for `std::optional`. This removes usages of that alias in preparation for eliminating it entirely.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126135
Approved by: https://github.com/Skylion007, https://github.com/malfet, https://github.com/albanD, https://github.com/aaronenyeshi
2024-05-14 19:35:51 +00:00
Aaron Orenstein
a8574a9719 Fix global flake8 issues (#124771)
Prior to this `lintrunner --all-files --take FLAKE8` failed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124771
Approved by: https://github.com/Skylion007
ghstack dependencies: #124428
2024-04-26 15:35:53 +00:00
PyTorch MergeBot
1ac60484c1 Revert "Fix global flake8 issues (#124771)"
This reverts commit f01275934b.

Reverted https://github.com/pytorch/pytorch/pull/124771 on behalf of https://github.com/jeanschmidt due to Unfortunately, I needed to revert #123735 and this one depends on it. So please check if there are no merge conflicts or breakages and feel free to merge this PR again ([comment](https://github.com/pytorch/pytorch/pull/124428#issuecomment-2078699836))
2024-04-26 06:15:17 +00:00
Nikita Shulga
f9a611a3ce Update Jinja to 3.1.3 (#124976)
To fix CVE-2024-22195

Also, delete unused docs/cpp/requirements.txt and functorch/docs/requirements.txt

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124976
Approved by: https://github.com/kit1980
2024-04-26 02:57:55 +00:00
Aaron Orenstein
f01275934b Fix global flake8 issues (#124771)
Prior to this `lintrunner --all-files --take FLAKE8` failed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124771
Approved by: https://github.com/Skylion007
ghstack dependencies: #124428
2024-04-25 14:25:00 +00:00
mantaionut
247646333e Fix py opcode (#118977)
Added a C file that includes the symbols  _PyOpcode_Deopt and _PyOpcode_Caches since they are not available in the python lib but they are available on Linux in order to fix linking issues in Windows in python 3.11.
Fixes #93854

Test by running on python 3.11 `python test/functorch/test_dims.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118977
Approved by: https://github.com/ezyang
2024-04-10 02:39:17 +00:00
lezcano
8a5a377190 Move doc links to point to main (#121823)
The previous links were pointing to an outdated branch

Command: `find . -type f -exec sed -i "s:docs/main:docs/master:g" {} + `

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121823
Approved by: https://github.com/albanD, https://github.com/malfet
2024-03-15 19:49:37 +00:00
Aaron Gokaslan
33938cfddd [BE][Ez] Update ruff to 0.2.2 (#120517)
Updates ruff to 0.2.2. This updates the config and handles some of the new rules that have come out of preview.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120517
Approved by: https://github.com/albanD
2024-02-24 07:13:53 +00:00
Aaron Gokaslan
f9200c8608 [BE][Ez]: FURB129: remove unneeded readlines() (#119796)
Applies a refurb rule to remove any readlines() in a for loop iteration as it just creates a temporary list in memory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119796
Approved by: https://github.com/ezyang
2024-02-13 21:21:22 +00:00
Aaron Gokaslan
1562dae62c [BE]: Apply RUF025 dict.fromkeys preview rule (#118637)
Simplifies and optimizes dict construction using the `fromkeys` classmethod ctor. This also makes it really obvious when all the keys will have the same static value, which could be a bug if unintentional. It is also significantly faster than using a dict comprehension. The rule is in preview, but I am adding a forward fix for when it becomes stable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118637
Approved by: https://github.com/albanD
2024-01-30 20:46:54 +00:00
Ilya Kamen
be455921f5 Fix missing words in README.md (#116606)
minor fix to wording

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116606
Approved by: https://github.com/Skylion007
2024-01-02 18:24:58 +00:00
Aaron Gokaslan
bd10fea79a [BE]: Enable F821 and fix bugs (#116579)
Fixes #112371

I tried to fix as many of the bugs as I could, a few I could not figure out what the proper fix for them was though and so I left them with noqas.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116579
Approved by: https://github.com/ezyang
2024-01-01 08:40:46 +00:00
Wongboo
68f74dd162 Add python and C++ support for LPPool3d (#114199)
Add python and C++ support for LPPool3d to Fixes #114114

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114199
Approved by: https://github.com/mikaylagawarecki
2023-12-08 18:18:44 +00:00
albanD
25fb88cf23 Add all 3.12 binary build for wheel. Let's see how it goes. V2 (#112882)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112882
Approved by: https://github.com/malfet, https://github.com/sammcj
2023-11-16 18:20:12 +00:00
ydwu4
670311190d [HigherOrderOp] Move _map.py to _higher_order_ops (#111152)
Differential Revision: [D50332159](https://our.internmc.facebook.com/intern/diff/D50332159)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111152
Approved by: https://github.com/zou3519
2023-11-16 03:04:12 +00:00
PyTorch MergeBot
77d5f0379e Revert "[HigherOrderOp] remove _deprecated_global_ns (#112757)"
This reverts commit fa81237af7.

Reverted https://github.com/pytorch/pytorch/pull/112757 on behalf of https://github.com/PaliC due to breaking a bunch of executorch tests ([comment](https://github.com/pytorch/pytorch/pull/112757#issuecomment-1795503740))
2023-11-06 17:04:19 +00:00
ydwu4
fa81237af7 [HigherOrderOp] remove _deprecated_global_ns (#112757)
As titled.

Test Plan:
existing test.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112757
Approved by: https://github.com/zou3519
2023-11-03 23:03:18 +00:00
Igor Sugak
4a17693d19 [CODEMOD][caffe2] replace uses of np.float with np.float64 (#112675)
Differential Revision: D50752096

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112675
Approved by: https://github.com/Skylion007
2023-11-03 03:00:51 +00:00
ydwu4
8bc0b382fa [HigherOrderOp] Move map_impl to torch.ops.higher_order (#111404)
The purpose of this pr is as titled. Because of some misusage of ghstack, ghimport, and export to github from internal, the stack of https://github.com/pytorch/pytorch/pull/111092 is a mess. I'll try to land them one by one. This is a replacement for #111092 and #111400.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111404
Approved by: https://github.com/tugsbayasgalan, https://github.com/zou3519
2023-10-26 16:59:10 +00:00
ydwu4
d84bcb9c8c [HigherOrderOp] expose torch.cond (#110293)
This pr expose torch._higher_order_ops.cond as torch.cond.

1. Need to add #noqa: F811 to the _check calls in torch/__init__.py to address some confusing linter error "Redefinition of unused 'cond'" but only one cond is imported and for these lines that have this error, they don't define the cond but just use it as an argument.
2. Also add cond to the list that allows it to be traced through so as dynamo could trigger the CondHigherOrder logic instead of creating a TorchVariable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110293
Approved by: https://github.com/zou3519
2023-10-07 20:39:52 +00:00
PyTorch MergeBot
576b80d23e Revert "[HigherOrderOp] expose torch.cond (#110293)"
This reverts commit 601f872831.

Reverted https://github.com/pytorch/pytorch/pull/110293 on behalf of https://github.com/ydwu4 due to Sorry, didn't check the error carefully on the PR. A doc error is related to this pr ([comment](https://github.com/pytorch/pytorch/pull/110293#issuecomment-1751176719))
2023-10-06 17:44:17 +00:00
ydwu4
601f872831 [HigherOrderOp] expose torch.cond (#110293)
This pr expose torch._higher_order_ops.cond as torch.cond.

1. Need to add #noqa: F811 to the _check calls in torch/__init__.py to address some confusing linter error "Redefinition of unused 'cond'" but only one cond is imported and for these lines that have this error, they don't define the cond but just use it as an argument.
2. Also add cond to the list that allows it to be traced through so as dynamo could trigger the CondHigherOrder logic instead of creating a TorchVariable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110293
Approved by: https://github.com/zou3519
2023-10-06 17:04:31 +00:00
ydwu4
cc1de49340 [HigherOrderOp] fallthrough some keys by default. (#110478)
Fixes #109253

Test Plan:
Added a new test that shows default fallthrough keys can be overrided.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110478
Approved by: https://github.com/ezyang
2023-10-05 16:25:42 +00:00
Brian Hirsh
b457e3f79a Reland attempt 2 of "Update AOTAutograd to use FunctionalTensorMode instead of C++ functionalization (#106406)" (#109906)" (#110079)
The first reland broke internal (failing diff: D49617462).

The major error looks like it's because there's an internal-only higher order op that needs a new functionalization rule. I'm going to land an internal diff for that and confirm tests pass before relanding this PR.

Also confirmed that the issue from https://github.com/pytorch/pytorch/issues/110121 is fixed, and added a test.

This reverts commit 1b90f07f5a.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110079
Approved by: https://github.com/ezyang
2023-10-03 18:50:25 +00:00
PyTorch MergeBot
1b90f07f5a Revert "Reland "Update AOTAutograd to use FunctionalTensorMode instead of C++ functionalization (#106406)" (#109906)"
This reverts commit d0fe8fa5db.

Reverted https://github.com/pytorch/pytorch/pull/109906 on behalf of https://github.com/atalman due to Breaks internal tests ([comment](https://github.com/pytorch/pytorch/pull/109906#issuecomment-1735416852))
2023-09-26 12:10:25 +00:00
Moritz Hennen
09c598745c Rename torch._C._TensorBase to TensorBase (#109940)
I have gone ahead and implemented the renaming of the type `torch._C._TensorBase` to a non-private class name `TensorBase`.
The changes also include leaving `torch._C._TensorBase` as an alias to the new type: 70458768fb/torch/csrc/autograd/python_variable.cpp (L2196-L2197) both in the c++ code and in the corresponding `__init__.pyi.in` file:
70458768fb/torch/_C/__init__.pyi.in (L1522)

Fixes #109438

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109940
Approved by: https://github.com/ezyang
2023-09-25 19:10:22 +00:00
Brian Hirsh
d0fe8fa5db Reland "Update AOTAutograd to use FunctionalTensorMode instead of C++ functionalization (#106406)" (#109906)
I'm pretty sure this is fixed but I'll run inductor and trunk CI. The failing test in trunk previously was that the selective activation checkpointing code that landed recently assumes that it can detect whether or not AOTAutograd is running by seeing if the inputs to SAC are C++ `FunctionalTensorWrapper`s

previous land broke some inductor trunk tests

This reverts commit 629a628cc8.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109906
Approved by: https://github.com/ezyang
2023-09-25 14:53:54 +00:00
PyTorch MergeBot
629a628cc8 Revert "Update AOTAutograd to use FunctionalTensorMode instead of C++ functionalization (#106406)"
This reverts commit b5d6e831a9.

Reverted https://github.com/pytorch/pytorch/pull/106406 on behalf of https://github.com/malfet due to Broke lots of tests on trunk ([comment](https://github.com/pytorch/pytorch/pull/106406#issuecomment-1731524917))
2023-09-22 14:32:34 +00:00
Brian Hirsh
b5d6e831a9 Update AOTAutograd to use FunctionalTensorMode instead of C++ functionalization (#106406)
Now that FunctionalTensor and `FunctionalTensorMode` are lower down in this stack, the changes in this PR are more mechanical: Everywhere in AOTAutograd that I used to use the C++ functionalization API, I now use the python functionalization API.

Note that this doesn't actually cause functionalization to run underneath torch_dispatch. I'm saving that re-ordering for later in the stack.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106406
Approved by: https://github.com/ezyang
ghstack dependencies: #108654, #109662, #109632, #109023
2023-09-22 07:09:04 +00:00
Peter Bell
7ce69d5dbe [RELAND] Remove some unnecessary <iostream> includes from headers (#108150)
In almost all cases this is only included for writing the output formatter, which
only uses `std::ostream` so including `<ostream>` is sufficient.

The istream header is ~1000 lines so the difference is non-trivial.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108150
Approved by: https://github.com/albanD, https://github.com/malfet
ghstack dependencies: #108149
2023-09-20 21:55:15 +00:00
Brian Hirsh
238fb66085 python functionalization: support higher order ops (#108656)
We now have two types of functionalization, C++ Functionalization (through the `Functionalize` dispatch key), and python functionalization (through the `FunctionalTensorMode` torch_dispatch mode).

This means that all higher order ops need custom functionalization rules for the python variant too. I added them here, as well as a helper function `dispatch_functionalize()` - equivalent to `torch.func.functionalize()`, except that it uses `FunctionalTensorMode`.

In theory we could have secretly switched `torch.func.functionalize` to use `FunctionalTensorMode`. This would be BC-breaking, though, since `FunctionalTensorMode` isn't composable with the other functorch transforms (the functorch layer-mode stack doesn't know how to re-order torch_dispatch modes arbitrarily).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108656
Approved by: https://github.com/zou3519
ghstack dependencies: #109024, #109248
2023-09-20 04:37:31 +00:00
Yanbo Liang
8a567bb59d [HigherOrderOp] Should automatically pop modes (#109157)
Fixes #108282

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109157
Approved by: https://github.com/zou3519
2023-09-18 20:54:09 +00:00
PyTorch MergeBot
07f2efa285 Revert "[HigherOrderOp] Should automatically pop modes (#109157)"
This reverts commit f03b8abd47.

Reverted https://github.com/pytorch/pytorch/pull/109157 on behalf of https://github.com/clee2000 due to broke internal builds D49346922 ([comment](https://github.com/pytorch/pytorch/pull/109157#issuecomment-1722571262))
2023-09-17 21:19:52 +00:00
Yanbo Liang
f03b8abd47 [HigherOrderOp] Should automatically pop modes (#109157)
Fixes #108282

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109157
Approved by: https://github.com/zou3519
2023-09-14 20:46:26 +00:00
Kurt Mohler
4c5e43574c Reland 2: Add PyObject preservation for UntypedStorage (#109039)
Relands #103907 after it was reverted. This PR makes the new `ignore_hermetic_tls` argument of `check_pyobj` optional to avoid causing a compilation error in torchdistx

Part of #91395

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109039
Approved by: https://github.com/ezyang
2023-09-12 22:26:05 +00:00
PyTorch MergeBot
59f605be57 Revert "Reland 2: Add PyObject preservation for UntypedStorage (#109039)"
This reverts commit 419e4e17a2.

Reverted https://github.com/pytorch/pytorch/pull/109039 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is failing linter job in trunk, probably due to a landrace ([comment](https://github.com/pytorch/pytorch/pull/109039#issuecomment-1715147020))
2023-09-12 07:26:11 +00:00
Kurt Mohler
419e4e17a2 Reland 2: Add PyObject preservation for UntypedStorage (#109039)
Relands #103907 after it was reverted. This PR makes the new `ignore_hermetic_tls` argument of `check_pyobj` optional to avoid causing a compilation error in torchdistx

Part of #91395

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109039
Approved by: https://github.com/ezyang
2023-09-12 01:19:40 +00:00
PyTorch MergeBot
68238606f3 Revert "Reland: Add PyObject preservation for UntypedStorage (#103907)"
This reverts commit 56b848157c.

Reverted https://github.com/pytorch/pytorch/pull/103907 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is failing torchdistx build which uses check_pyobj here 9c1b9f5cb2/src/python/torchdistx/_C/deferred_init.cc (L87) ([comment](https://github.com/pytorch/pytorch/pull/103907#issuecomment-1712121158))
2023-09-08 19:27:07 +00:00
Kurt Mohler
56b848157c Reland: Add PyObject preservation for UntypedStorage (#103907)
This relands #97470 after #102553 reverted it. This PR attempts to fix the internal failure by avoiding an unnecessary intermediate storage buffer allocation in `c10::newStorageImplFromRefcountedDataPtr`.

Part of #91395

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103907
Approved by: https://github.com/ezyang
2023-09-07 04:24:11 +00:00
Shayekh Bin Islam
e8005781be Softmax in functorch example fixed (#107988)
The output of softmax was overwritten by the output of fc2 in the following line. So, the output of the softmax is never utilized. Now, the final output of the model includes softmax.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107988
Approved by: https://github.com/zou3519
2023-09-05 17:18:48 +00:00
ydwu4
f8c93df2d1 Fix boolean tensor for map (#108289)
torch.empty_strided is able to create a new tensor based on the meta data. For boolean tensor, we call a clone directly, however, we'll get a functional tensor if input is a functional tensor and that functional tensor won't be tracked by tracer's tensor_tracker after dispatching so it become a tensor\_constant in the graph if create_arg. So we manually unwrap the functional tensor before calling clone.

Test Plan:
See added test.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108289
Approved by: https://github.com/angelayi
2023-08-31 19:17:28 +00:00
PyTorch MergeBot
378ffde8c1 Revert "Remove some unnecessary <iostream> includes from headers (#106914)"
This reverts commit a6c29b7227.

Reverted https://github.com/pytorch/pytorch/pull/106914 on behalf of https://github.com/izaitsevfb due to Causing metal breakage internally, see D48709279 ([comment](https://github.com/pytorch/pytorch/pull/106914#issuecomment-1696670027))
2023-08-29 02:22:33 +00:00
ydwu4
8688965337 Move cond to torch/_higher_order_ops/ (#108025)
1. Move cond to torch/_higher_order_ops
2. Fix a bug in map, which didn't respect tensor dtype when creating a new one from them. We cannot directly use empty_strided because boolean tensor created by empty_strided is not properly intialized so it causes error "load of value 190, which is not a valid value for type 'bool'" on clang asan environment on CI.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108025
Approved by: https://github.com/zou3519
2023-08-28 00:01:35 +00:00
james-a-watson
808e088615 Update writing_batching_rules.md (#108007)
Was reading through the batching rules info which is very cool and just saw a couple of typos 😊.

Thanks

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108007
Approved by: https://github.com/msaroufim
2023-08-26 19:07:36 +00:00
Peter Bell
a6c29b7227 Remove some unnecessary <iostream> includes from headers (#106914)
In almost all cases this is only included for writing the output formatter, which
only uses `std::ostream` so including `<ostream>` is sufficient.

The istream header is ~1000 lines so the difference is non-trivial.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106914
Approved by: https://github.com/lezcano
2023-08-25 18:24:05 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
2b7271c703 Support cond and out_dtype for predispatch (#107941)
Summary: Title

Test Plan: CI

Differential Revision: D48675742

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107941
Approved by: https://github.com/jerryzh168
2023-08-25 17:37:16 +00:00
albanD
b9472decf8 Initial Python 3.12 build fixes (#106083)
This compiles with python 3.12
You can get numpy from https://anaconda.org/scientific-python-nightly-wheels/numpy/files so that you don't need to remove numpy from test files.

Basic core tests work but obviously dynamo and first class dims don't work.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106083
Approved by: https://github.com/ezyang
2023-08-25 13:23:48 +00:00
PyTorch MergeBot
28dc1a093f Revert "Remove some unnecessary <iostream> includes from headers (#106914)"
This reverts commit 60936e4c29.

Reverted https://github.com/pytorch/pytorch/pull/106914 on behalf of https://github.com/ZainRizvi due to Sorry, but this is breaking internal builds. Seems like a lot of internal code depends on some of the removed imports ([comment](https://github.com/pytorch/pytorch/pull/106914#issuecomment-1688605975))
2023-08-22 17:16:48 +00:00
Peter Bell
60936e4c29 Remove some unnecessary <iostream> includes from headers (#106914)
In almost all cases this is only included for writing the output formatter, which
only uses `std::ostream` so including `<ostream>` is sufficient.

The istream header is ~1000 lines so the difference is non-trivial.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106914
Approved by: https://github.com/lezcano
2023-08-19 20:21:58 +00:00
Alexander Pivovarov
02abbb8109 Fix some typos, mostly "that that" (#106901)
Fix some typos
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106901
Approved by: https://github.com/janeyx99
2023-08-10 19:46:53 +00:00
kshitij12345
cce2c52b0b [pt2] support vmap (#101707)
Teach dynamo about `vmap`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101707
Approved by: https://github.com/zou3519
2023-08-09 03:39:33 +00:00
drisspg
788c825837 Higher order operator util for raising if inputs require grads (#106078)
<!--
copilot:summary
-->
### <samp>🤖 Generated by Copilot at 08bd685</samp>

Added a utility function `autograd_not_implemented_check` to `torch._higher_order_ops.utils` and used it in `out_dtype_autograd` to simplify and standardize the error handling for higher order operators that do not support autograd.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106078
Approved by: https://github.com/zou3519
2023-08-01 00:13:13 +00:00
Edward Z. Yang
e6ec0efaf8 Apply UFMT to all non test/torch files (#106205)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106205
Approved by: https://github.com/albanD
2023-07-29 02:56:24 +00:00
cyy
b3e24c53eb use performance-unnecessary-value-param in clang-tidy (#102615)
performance-unnecessary-value-param has been disabled in clang-tidy for a long time. However, this check is actually useful and able to some interesting performance problems.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102615
Approved by: https://github.com/malfet, https://github.com/Skylion007
2023-07-28 17:37:03 +00:00
Justin Chu
4cc1745b13 [BE] f-stringify torch/ and scripts (#105538)
This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`.

- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- https://pypi.org/project/flynt/

Command used:

```
flynt torch/ -ll 120
flynt scripts/ -ll 120
flynt tools/ -ll 120
```

and excluded `collect_env.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-21 19:35:24 +00:00
Justin Chu
8a688277a2 [BE] Enable ruff's UP rules and autoformat dynamo / functorch and refs (#105432)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105432
Approved by: https://github.com/ezyang
2023-07-19 13:48:44 +00:00
ydwu4
6a3d5f1986 [HigherOrderOp] Remove _deprecated_global_ns from cond (#104380)
Remove _deprecated_global_ns from cond following #104105.

We change the module attribute of HigherOrderOperator instances in the constructor from torch.ops to torch.ops.higher_order when self.namespace is "higher_order". For subclasses (e.g. customized higher order operator), we leave their \_\_module\_\_ unchanged.

Will import this PR to fix internal tests.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104380
Approved by: https://github.com/zhxchen17, https://github.com/zou3519
2023-07-07 17:13:09 +00:00
Zhengxu Chen
df281bf788 Refactor unwrap_proxy() for proxy tensor tracing. (#104667)
Test Plan: CI

Differential Revision: D47241815

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104667
Approved by: https://github.com/tugsbayasgalan
2023-07-06 03:03:13 +00:00
Omkar Salpekar
ae1ed27756 [codemod][numpy] replace np.str with str (#103931)
Summary:
`np.str` is removed from numpy 1.20.0. It was an alias to builtin `str` and it's safe to do the replacement.

The whole changes is mechanical, generated using the following onliner:
```
fbgr -sl 'np\.str\b' | xargs perl -pi -e 's,\bnp\.str\b,str,g'
```

Test Plan: sandcastle

Differential Revision: D46586144

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103931
Approved by: https://github.com/huydhn
2023-06-21 18:16:42 +00:00
rzou
036cda415f Change HigherOrderOperator default namespace from global to 'higher_order' (#103870)
This PR changes the default namespace for higher order operators from the
global namespace (e.g. torch.ops.cond) to `higher_order` (e.g.
torch.ops.higher_order.cond). We don't actually change the namespace
for existing HigherOrderOperators.

The motivation is to stem the bleeding; exposing operators into the global
namespace is a bad idea due to name collision with other user-defined
namespaces.

We will go in and fix the `_deprecated_global_ns` as necessary after this diff.

Differential Revision: [D46809738](https://our.internmc.facebook.com/intern/diff/D46809738/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103870
Approved by: https://github.com/ydwu4
2023-06-20 19:10:55 +00:00
Tugsbayasgalan Manlaibaatar
d4b85f3031 Support params/buffers inside cond and map (#102310)
With #102022, params and buffers are always treated as special case of free variables. In this PR, I switch cond and map implementation to the this method and deprecate the old tracing mechanism.

Differential Revision: [D46746202](https://our.internmc.facebook.com/intern/diff/D46746202)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102310
Approved by: https://github.com/avikchaudhuri, https://github.com/zou3519
2023-06-20 05:33:10 +00:00
PyTorch MergeBot
2087d32811 Revert "Support params/buffers inside cond and map (#102310)"
This reverts commit 766f236bad.

Reverted https://github.com/pytorch/pytorch/pull/102310 on behalf of https://github.com/huydhn due to The test is failing in trunk 766f236bad ([comment](https://github.com/pytorch/pytorch/pull/102310#issuecomment-1592159710))
2023-06-15 00:29:20 +00:00
Tugsbayasgalan Manlaibaatar
766f236bad Support params/buffers inside cond and map (#102310)
With #102022, params and buffers are always treated as special case of free variables. In this PR, I switch cond and map implementation to the this method and deprecate the old tracing mechanism.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102310
Approved by: https://github.com/avikchaudhuri, https://github.com/zou3519
2023-06-14 22:32:33 +00:00
cyy
48e3ee29ff enable missing-prototypes in functorch (#103391)
This PR enables  missing-prototypes in functorch target and turn some functions into static ones

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103391
Approved by: https://github.com/malfet
2023-06-12 17:47:37 +00:00
PyTorch MergeBot
d1f24f73da Revert "Make HigherOrderOperator stop appearing like torch.ops.* in FX (#103108)"
This reverts commit 194262ee49.

Reverted https://github.com/pytorch/pytorch/pull/103108 on behalf of https://github.com/izaitsevfb due to Breaks executorch internally, see D46581996 ([comment](https://github.com/pytorch/pytorch/pull/103108#issuecomment-1585041505))
2023-06-09 19:31:40 +00:00
Richard Zou
194262ee49 Make HigherOrderOperator stop appearing like torch.ops.* in FX (#103108)
Previously, defining a HigherOrderOperators (like cond) automatically generates
a torch.ops.cond and causes them to trace into the FX graph as e.g.
torch.ops.cond.

This is not good, because:
- Duplication. Since HigherOrderOperators are written in Python, they have an
associated Python function that users should access them from. E.g.
torch.cond (when we make it public). That is what should actually appear in the
graph.
- torch.ops.cond is a valid namespace for operator registration; having
it be a function too confuses things.

This PR:
- Moves cond/map HigherOrderOperators to be under torch (necessary for
the FX logic to not do weird things)
- Sets the `__module__` of a HigherOrderOperator correct. This is what
FX uses when tracing the operator.

Test Plan:
- updated tests

Future:
- I'll delete the ability to call cond as torch.ops.cond in a couple of
days, after this change circulates internally.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103108
Approved by: https://github.com/ydwu4
2023-06-08 01:55:27 +00:00
Matthew Hoffman
dda59162f1 Native rearrange in functorch (#101957)
Fixes #92675

Here we implement a native version of [`einops.rearrange`](https://einops.rocks/api/rearrange/) using first class dims to perform the operations. The string parsing + validation, documentation, and relevant tests are adapted from `einops`. The API is exactly the same as the `einops` API.

The main idea is to take the string and convert it to a left and right `ParsedExpression`, and then find a mapping from the axes to first class dims. Once the mapping exists we convert the left expression `composition` list into a `Tensor.__getitem__` index and the right expression `composition` into the `Tensor.order` arguments, and then use this to dynamically create a callable that performs the `rearrange` operation as specified by the pattern.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101957
Approved by: https://github.com/zdevito
2023-06-06 02:10:42 +00:00
Richard Zou
fc31b3a106 Allow existing "Python RAII guards" to be used as context managers (#102579)
This PR adds a `py_context_manager_DEPRECATED` that converts a C++ RAII
guard to an object that may be either used as Python context manager or
as a "Python RAII guard".

We don't convert all of them to Python context manager only due to BC
reasons; people in OSS and internally actually rely on these APIs and I
don't want to break them. We are justified in breaking BC if we wanted
to, but it seemed like too much work for not a lot of gain.

The API is postfixed with "DEPRECATED" to indicate that people should
really use `py_context_manager` (converts C++ RAII guard to Python
context manager) instead.

Test Plan:
- this PR converts all PyTorch usages of _AutoDispatchBelowAutograd to
context manager. I can do the rest in follow-ups.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102579
Approved by: https://github.com/bdhirsh, https://github.com/albanD
2023-05-31 19:55:38 +00:00
Richard Zou
08fb648fe1 Add mechanism to turn any RAII guard into a Python Context Manager (#102037)
This PR:
- adds a mechanism to turn any RAII guard into a Python Context Manager
- turns ExcludeDispatchKeyGuard into a context manager, and purges usages
of the older torch._C.ExcludeDispatchKeyGuard from the codebase.

The mechanism is that given a RAII guard, we construct a context
manager object that holds an optional guard. When we enter the context
manager we populate the guard, when we exit we reset it.

We don't delete torch._C.ExcludeDispatchKeyGuard for BC reasons (people
are using it in fbcode). If this code actually sticks
(it is using C++17 and that worries me a bit), then I'll apply the
change to other RAII guards we have, otherwise, we can write our own
std::apply.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102037
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2023-05-24 14:20:52 +00:00
Liang Hou
b429a4de13 Update public_api to remove duplicated randn_like (#101302)
Remove duplicated `randn_like` in `functorch/op_analysis`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101302
Approved by: https://github.com/drisspg
2023-05-17 21:03:13 +00:00
ydwu4
326a4cc815 Support map autograd and pytree in/out. (#101633)
Rebased https://github.com/pytorch/pytorch/pull/100494 and added dummy AOTConfig.

This PR adds autograd and pytree support for map operator.

Implementation-wise:

1. We temporarily make two HigherOrderOperators, "map" and "map_impl":
- "map" is user-facing. Currently, it unwraps the pytrees in inputs and create a flat_fn for it. Dynamo currently cannot deal with pytree.tree_flatten and pytree.tree_unflatten, we therefore make it a HigherOrderOperator to trigger dynamo logic of handling HigherOrderOperators.
- "map_impl" is the actual operator that works with the rest of torch subsystems such as functionalization, make_fx. It accepts flattend arguments, and a num_mapped_args integer denoting how many of the flattend arguments need to mapped i.e. their first dimension will be unstacked.

2. We create the forward and backward graph in autograd key and call torch.autograd.Function. Currently, the backward graph is recomputation-based and we need to partition the joint graph in the future to be more efficient.

Example traced graphs for map operators:
### Case 1: simple f and autograd
```python
def f(x, y):
    return x + y

def g(xs, y):
    out = control_flow.map(f, xs, y)
    return torch.autograd.grad(out, (xs, y), torch.ones_like(out))

gm = make_fx(g, tracing_mode="symbolic")(torch.ones(3, 4, 5, requires_grad=True), torch.ones(5, requires_grad=True))
# gm.print_readable() produces following:
class g(torch.nn.Module):
    def forward(self, xs_1: f32[3, s1, s2], y_1: f32[s2]):
        # No stacktrace found for following nodes
        body_graph_0 = self.body_graph_0
        map_impl = torch.ops.map_impl(body_graph_0, 1, xs_1, y_1);  body_graph_0 = None
        getitem: f32[3, s1, s2] = map_impl[0];  map_impl = None
        ones_like: f32[3, s1, s2] = torch.ops.aten.ones_like.default(getitem, pin_memory = False)
        is_same_size = torch.ops.aten.is_same_size.default(getitem, ones_like);  getitem = None
        body_graph_1 = self.body_graph_1
        map_impl_1 = torch.ops.map_impl(body_graph_1, 2, xs_1, ones_like, y_1);  body_graph_1 = xs_1 = ones_like = None
        getitem_1 = map_impl_1[0]
        getitem_2: f32[3, s1, s2] = map_impl_1[1]
        getitem_3: f32[3, s2] = map_impl_1[2];  map_impl_1 = None
        sum_1: f32[1, s2] = torch.ops.aten.sum.dim_IntList(getitem_3, [0], True);  getitem_3 = None
        sym_size: Sym(s2) = torch.ops.aten.sym_size(y_1, 0);  y_1 = None
        view: f32[s2] = torch.ops.aten.view.default(sum_1, [sym_size]);  sum_1 = sym_size = None
        return (getitem_2, view)

    class <lambda>(torch.nn.Module):
        def forward(self, arg0_1, arg1_1: f32[s1, s2], arg2_1: f32[s2]):
            # No stacktrace found for following nodes
            add: f32[s1, s2] = torch.ops.aten.add.Tensor(arg1_1, arg2_1);  arg1_1 = arg2_1 = None
            return [add]

    class <lambda>(torch.nn.Module):
        def forward(self, arg0_1, arg1_1: f32[s1, s2], arg2_1: f32[s1, s2], arg3_1: f32[s2]):
            # No stacktrace found for following nodes
            add: f32[s1, s2] = torch.ops.aten.add.Tensor(arg1_1, arg3_1);  arg1_1 = None
            is_same_size = torch.ops.aten.is_same_size.default(add, arg2_1);  add = None
            sum_1: f32[1, s2] = torch.ops.aten.sum.dim_IntList(arg2_1, [0], True)
            sym_size: Sym(s2) = torch.ops.aten.sym_size(arg3_1, 0);  arg3_1 = None
            view: f32[s2] = torch.ops.aten.view.default(sum_1, [sym_size]);  sum_1 = sym_size = None
            return [None, arg2_1, view]
```
### Case 2: list input/output f and autograd
```python
def f(x, y):
    return [x[0].cos() + y.sin(), x[1].sin() * y.cos()]

def g(xs, y):
    out = control_flow.map(f, xs, y)
    flat_out, _ = pytree.tree_flatten(out)
    flat_inp, _ = pytree.tree_flatten((xs, y))
    requires_grad_inp = [inp for inp in flat_inp if inp.requires_grad]
    return torch.autograd.grad(flat_out, requires_grad_inp, [torch.ones_like(out) for out in flat_out])

gm = make_fx(g, tracing_mode="symbolic")(
    [torch.ones(3, 4, 5), torch.ones(3, 4, 5, requires_grad=True)],
    torch.ones(5, requires_grad=True))

# gm.print_readable() produces following:
class g(torch.nn.Module):
    def forward(self, xs, y):
        xs_1: f32[3, s1, s2], xs_2: f32[3, s1, s2], y_1: f32[s2], = fx_pytree.tree_flatten_spec([xs, y], self._in_spec)
        # No stacktrace found for following nodes
        body_graph_0 = self.body_graph_0
        map_impl = torch.ops.map_impl(body_graph_0, 2, xs_1, xs_2, y_1);  body_graph_0 = None
        getitem: f32[3, s1, s2] = map_impl[0]
        getitem_1: f32[3, s1, s2] = map_impl[1];  map_impl = None
        ones_like: f32[3, s1, s2] = torch.ops.aten.ones_like.default(getitem, pin_memory = False)
        ones_like_1: f32[3, s1, s2] = torch.ops.aten.ones_like.default(getitem_1, pin_memory = False)
        is_same_size = torch.ops.aten.is_same_size.default(getitem, ones_like);  getitem = None
        is_same_size_1 = torch.ops.aten.is_same_size.default(getitem_1, ones_like_1);  getitem_1 = None
        body_graph_1 = self.body_graph_1
        map_impl_1 = torch.ops.map_impl(body_graph_1, 4, xs_1, xs_2, ones_like, ones_like_1, y_1);  body_graph_1 = xs_1 = xs_2 = ones_like = ones_like_1 = None
        getitem_2 = map_impl_1[0]
        getitem_3 = map_impl_1[1]
        getitem_4: f32[3, s1, s2] = map_impl_1[2]
        getitem_5: f32[3, s2] = map_impl_1[3];  map_impl_1 = None
        sum_1: f32[1, s2] = torch.ops.aten.sum.dim_IntList(getitem_5, [0], True);  getitem_5 = None
        sym_size: Sym(s2) = torch.ops.aten.sym_size(y_1, 0);  y_1 = None
        view: f32[s2] = torch.ops.aten.view.default(sum_1, [sym_size]);  sum_1 = sym_size = None
        return pytree.tree_unflatten([getitem_4, view], self._out_spec)

    class <lambda>(torch.nn.Module):
        def forward(self, arg0_1, arg1_1: f32[s1, s2], arg2_1: f32[s1, s2], arg3_1: f32[s2]):
            # No stacktrace found for following nodes
            cos: f32[s1, s2] = torch.ops.aten.cos.default(arg1_1);  arg1_1 = None
            sin: f32[s2] = torch.ops.aten.sin.default(arg3_1)
            add: f32[s1, s2] = torch.ops.aten.add.Tensor(cos, sin);  cos = sin = None
            sin_1: f32[s1, s2] = torch.ops.aten.sin.default(arg2_1);  arg2_1 = None
            cos_1: f32[s2] = torch.ops.aten.cos.default(arg3_1);  arg3_1 = None
            mul: f32[s1, s2] = torch.ops.aten.mul.Tensor(sin_1, cos_1);  sin_1 = cos_1 = None
            return [add, mul]

    class <lambda>(torch.nn.Module):
        def forward(self, arg0_1, arg1_1: f32[s1, s2], arg2_1: f32[s1, s2], arg3_1: f32[s1, s2], arg4_1: f32[s1, s2], arg5_1: f32[s2]):
            # No stacktrace found for following nodes
            cos: f32[s1, s2] = torch.ops.aten.cos.default(arg1_1);  arg1_1 = None
            sin: f32[s2] = torch.ops.aten.sin.default(arg5_1)
            add: f32[s1, s2] = torch.ops.aten.add.Tensor(cos, sin);  cos = sin = None
            sin_1: f32[s1, s2] = torch.ops.aten.sin.default(arg2_1)
            cos_1: f32[s2] = torch.ops.aten.cos.default(arg5_1)
            mul: f32[s1, s2] = torch.ops.aten.mul.Tensor(sin_1, cos_1)
            is_same_size = torch.ops.aten.is_same_size.default(add, arg3_1);  add = None
            is_same_size_1 = torch.ops.aten.is_same_size.default(mul, arg4_1);  mul = None
            mul_1: f32[s1, s2] = torch.ops.aten.mul.Tensor(arg4_1, sin_1);  sin_1 = None
            mul_2: f32[s1, s2] = torch.ops.aten.mul.Tensor(arg4_1, cos_1);  arg4_1 = cos_1 = None
            sum_1: f32[1, s2] = torch.ops.aten.sum.dim_IntList(mul_1, [0], True);  mul_1 = None
            sym_size: Sym(s2) = torch.ops.aten.sym_size(arg5_1, 0)
            view: f32[s2] = torch.ops.aten.view.default(sum_1, [sym_size]);  sum_1 = None

            #
            sin_2: f32[s2] = torch.ops.aten.sin.default(arg5_1)
            neg: f32[s2] = torch.ops.aten.neg.default(sin_2);  sin_2 = None
            mul_3: f32[s2] = torch.ops.aten.mul.Tensor(view, neg);  view = neg = None
            cos_2: f32[s1, s2] = torch.ops.aten.cos.default(arg2_1);  arg2_1 = None
            mul_4: f32[s1, s2] = torch.ops.aten.mul.Tensor(mul_2, cos_2);  mul_2 = cos_2 = None
            sum_2: f32[1, s2] = torch.ops.aten.sum.dim_IntList(arg3_1, [0], True);  arg3_1 = None
            view_1: f32[s2] = torch.ops.aten.view.default(sum_2, [sym_size]);  sum_2 = sym_size = None
            cos_3: f32[s2] = torch.ops.aten.cos.default(arg5_1);  arg5_1 = None
            mul_5: f32[s2] = torch.ops.aten.mul.Tensor(view_1, cos_3);  view_1 = cos_3 = None
            add_1: f32[s2] = torch.ops.aten.add.Tensor(mul_3, mul_5);  mul_3 = mul_5 = None
            return [None, None, mul_4, add_1]
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101633
Approved by: https://github.com/zou3519
2023-05-17 16:52:26 +00:00
PyTorch MergeBot
e69198b043 Revert "Support map autograd and pytree in/out (#100494)"
This reverts commit b8fa41be9d.

Reverted https://github.com/pytorch/pytorch/pull/100494 on behalf of https://github.com/PaliC due to breaking tests on trunk, please check hud.pytorch.org for the broken tests ([comment](https://github.com/pytorch/pytorch/pull/100494#issuecomment-1550454835))
2023-05-16 22:50:18 +00:00
ydwu4
b8fa41be9d Support map autograd and pytree in/out (#100494)
This PR adds autograd and pytree support for map operator.

Implementation-wise:

1. We temporarily make two HigherOrderOperators, "map" and "map_impl":
- "map" is user-facing. Currently, it unwraps the pytrees in inputs and create a flat_fn for it. Dynamo currently cannot deal with pytree.tree_flatten and pytree.tree_unflatten, we therefore make it a HigherOrderOperator to trigger dynamo logic of handling HigherOrderOperators.
- "map_impl" is the actual operator that works with the rest of torch subsystems such as functionalization, make_fx. It accepts flattend arguments, and a num_mapped_args integer denoting how many of the flattend arguments need to mapped i.e. their first dimension will be unstacked.

2. We create the forward and backward graph in autograd key and call torch.autograd.Function. Currently, the backward graph is recomputation-based and we need to partition the joint graph in the future to be more efficient.

Example traced graphs for map operators:
### Case 1: simple f and autograd
```python
def f(x, y):
    return x + y

def g(xs, y):
    out = control_flow.map(f, xs, y)
    return torch.autograd.grad(out, (xs, y), torch.ones_like(out))

gm = make_fx(g, tracing_mode="symbolic")(torch.ones(3, 4, 5, requires_grad=True), torch.ones(5, requires_grad=True))
# gm.print_readable() produces following:
class g(torch.nn.Module):
    def forward(self, xs_1: f32[3, s1, s2], y_1: f32[s2]):
        # No stacktrace found for following nodes
        body_graph_0 = self.body_graph_0
        map_impl = torch.ops.map_impl(body_graph_0, 1, xs_1, y_1);  body_graph_0 = None
        getitem: f32[3, s1, s2] = map_impl[0];  map_impl = None
        ones_like: f32[3, s1, s2] = torch.ops.aten.ones_like.default(getitem, pin_memory = False)
        is_same_size = torch.ops.aten.is_same_size.default(getitem, ones_like);  getitem = None
        body_graph_1 = self.body_graph_1
        map_impl_1 = torch.ops.map_impl(body_graph_1, 2, xs_1, ones_like, y_1);  body_graph_1 = xs_1 = ones_like = None
        getitem_1 = map_impl_1[0]
        getitem_2: f32[3, s1, s2] = map_impl_1[1]
        getitem_3: f32[3, s2] = map_impl_1[2];  map_impl_1 = None
        sum_1: f32[1, s2] = torch.ops.aten.sum.dim_IntList(getitem_3, [0], True);  getitem_3 = None
        sym_size: Sym(s2) = torch.ops.aten.sym_size(y_1, 0);  y_1 = None
        view: f32[s2] = torch.ops.aten.view.default(sum_1, [sym_size]);  sum_1 = sym_size = None
        return (getitem_2, view)

    class <lambda>(torch.nn.Module):
        def forward(self, arg0_1, arg1_1: f32[s1, s2], arg2_1: f32[s2]):
            # No stacktrace found for following nodes
            add: f32[s1, s2] = torch.ops.aten.add.Tensor(arg1_1, arg2_1);  arg1_1 = arg2_1 = None
            return [add]

    class <lambda>(torch.nn.Module):
        def forward(self, arg0_1, arg1_1: f32[s1, s2], arg2_1: f32[s1, s2], arg3_1: f32[s2]):
            # No stacktrace found for following nodes
            add: f32[s1, s2] = torch.ops.aten.add.Tensor(arg1_1, arg3_1);  arg1_1 = None
            is_same_size = torch.ops.aten.is_same_size.default(add, arg2_1);  add = None
            sum_1: f32[1, s2] = torch.ops.aten.sum.dim_IntList(arg2_1, [0], True)
            sym_size: Sym(s2) = torch.ops.aten.sym_size(arg3_1, 0);  arg3_1 = None
            view: f32[s2] = torch.ops.aten.view.default(sum_1, [sym_size]);  sum_1 = sym_size = None
            return [None, arg2_1, view]
```
### Case 2: list input/output f and autograd
```python
def f(x, y):
    return [x[0].cos() + y.sin(), x[1].sin() * y.cos()]

def g(xs, y):
    out = control_flow.map(f, xs, y)
    flat_out, _ = pytree.tree_flatten(out)
    flat_inp, _ = pytree.tree_flatten((xs, y))
    requires_grad_inp = [inp for inp in flat_inp if inp.requires_grad]
    return torch.autograd.grad(flat_out, requires_grad_inp, [torch.ones_like(out) for out in flat_out])

gm = make_fx(g, tracing_mode="symbolic")(
    [torch.ones(3, 4, 5), torch.ones(3, 4, 5, requires_grad=True)],
    torch.ones(5, requires_grad=True))

# gm.print_readable() produces following:
class g(torch.nn.Module):
    def forward(self, xs, y):
        xs_1: f32[3, s1, s2], xs_2: f32[3, s1, s2], y_1: f32[s2], = fx_pytree.tree_flatten_spec([xs, y], self._in_spec)
        # No stacktrace found for following nodes
        body_graph_0 = self.body_graph_0
        map_impl = torch.ops.map_impl(body_graph_0, 2, xs_1, xs_2, y_1);  body_graph_0 = None
        getitem: f32[3, s1, s2] = map_impl[0]
        getitem_1: f32[3, s1, s2] = map_impl[1];  map_impl = None
        ones_like: f32[3, s1, s2] = torch.ops.aten.ones_like.default(getitem, pin_memory = False)
        ones_like_1: f32[3, s1, s2] = torch.ops.aten.ones_like.default(getitem_1, pin_memory = False)
        is_same_size = torch.ops.aten.is_same_size.default(getitem, ones_like);  getitem = None
        is_same_size_1 = torch.ops.aten.is_same_size.default(getitem_1, ones_like_1);  getitem_1 = None
        body_graph_1 = self.body_graph_1
        map_impl_1 = torch.ops.map_impl(body_graph_1, 4, xs_1, xs_2, ones_like, ones_like_1, y_1);  body_graph_1 = xs_1 = xs_2 = ones_like = ones_like_1 = None
        getitem_2 = map_impl_1[0]
        getitem_3 = map_impl_1[1]
        getitem_4: f32[3, s1, s2] = map_impl_1[2]
        getitem_5: f32[3, s2] = map_impl_1[3];  map_impl_1 = None
        sum_1: f32[1, s2] = torch.ops.aten.sum.dim_IntList(getitem_5, [0], True);  getitem_5 = None
        sym_size: Sym(s2) = torch.ops.aten.sym_size(y_1, 0);  y_1 = None
        view: f32[s2] = torch.ops.aten.view.default(sum_1, [sym_size]);  sum_1 = sym_size = None
        return pytree.tree_unflatten([getitem_4, view], self._out_spec)

    class <lambda>(torch.nn.Module):
        def forward(self, arg0_1, arg1_1: f32[s1, s2], arg2_1: f32[s1, s2], arg3_1: f32[s2]):
            # No stacktrace found for following nodes
            cos: f32[s1, s2] = torch.ops.aten.cos.default(arg1_1);  arg1_1 = None
            sin: f32[s2] = torch.ops.aten.sin.default(arg3_1)
            add: f32[s1, s2] = torch.ops.aten.add.Tensor(cos, sin);  cos = sin = None
            sin_1: f32[s1, s2] = torch.ops.aten.sin.default(arg2_1);  arg2_1 = None
            cos_1: f32[s2] = torch.ops.aten.cos.default(arg3_1);  arg3_1 = None
            mul: f32[s1, s2] = torch.ops.aten.mul.Tensor(sin_1, cos_1);  sin_1 = cos_1 = None
            return [add, mul]

    class <lambda>(torch.nn.Module):
        def forward(self, arg0_1, arg1_1: f32[s1, s2], arg2_1: f32[s1, s2], arg3_1: f32[s1, s2], arg4_1: f32[s1, s2], arg5_1: f32[s2]):
            # No stacktrace found for following nodes
            cos: f32[s1, s2] = torch.ops.aten.cos.default(arg1_1);  arg1_1 = None
            sin: f32[s2] = torch.ops.aten.sin.default(arg5_1)
            add: f32[s1, s2] = torch.ops.aten.add.Tensor(cos, sin);  cos = sin = None
            sin_1: f32[s1, s2] = torch.ops.aten.sin.default(arg2_1)
            cos_1: f32[s2] = torch.ops.aten.cos.default(arg5_1)
            mul: f32[s1, s2] = torch.ops.aten.mul.Tensor(sin_1, cos_1)
            is_same_size = torch.ops.aten.is_same_size.default(add, arg3_1);  add = None
            is_same_size_1 = torch.ops.aten.is_same_size.default(mul, arg4_1);  mul = None
            mul_1: f32[s1, s2] = torch.ops.aten.mul.Tensor(arg4_1, sin_1);  sin_1 = None
            mul_2: f32[s1, s2] = torch.ops.aten.mul.Tensor(arg4_1, cos_1);  arg4_1 = cos_1 = None
            sum_1: f32[1, s2] = torch.ops.aten.sum.dim_IntList(mul_1, [0], True);  mul_1 = None
            sym_size: Sym(s2) = torch.ops.aten.sym_size(arg5_1, 0)
            view: f32[s2] = torch.ops.aten.view.default(sum_1, [sym_size]);  sum_1 = None

            #
            sin_2: f32[s2] = torch.ops.aten.sin.default(arg5_1)
            neg: f32[s2] = torch.ops.aten.neg.default(sin_2);  sin_2 = None
            mul_3: f32[s2] = torch.ops.aten.mul.Tensor(view, neg);  view = neg = None
            cos_2: f32[s1, s2] = torch.ops.aten.cos.default(arg2_1);  arg2_1 = None
            mul_4: f32[s1, s2] = torch.ops.aten.mul.Tensor(mul_2, cos_2);  mul_2 = cos_2 = None
            sum_2: f32[1, s2] = torch.ops.aten.sum.dim_IntList(arg3_1, [0], True);  arg3_1 = None
            view_1: f32[s2] = torch.ops.aten.view.default(sum_2, [sym_size]);  sum_2 = sym_size = None
            cos_3: f32[s2] = torch.ops.aten.cos.default(arg5_1);  arg5_1 = None
            mul_5: f32[s2] = torch.ops.aten.mul.Tensor(view_1, cos_3);  view_1 = cos_3 = None
            add_1: f32[s2] = torch.ops.aten.add.Tensor(mul_3, mul_5);  mul_3 = mul_5 = None
            return [None, None, mul_4, add_1]
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100494
Approved by: https://github.com/zou3519
2023-05-16 22:05:11 +00:00
Edward Z. Yang
0577043d94 Rename minpybind namespace from py to mpy (#101410)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101410
Approved by: https://github.com/Skylion007, https://github.com/zou3519
2023-05-15 23:15:01 +00:00
Tugsbayasgalan Manlaibaatar
bf08b072a7 Add functionalization pass in TorchDynamo (#99461)
Fixes: https://github.com/pytorch/pytorch/issues/99000

Differential Revision: [D45106409](https://our.internmc.facebook.com/intern/diff/D45106409)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99461
Approved by: https://github.com/bdhirsh, https://github.com/anijain2305, https://github.com/zou3519
2023-05-05 16:08:14 +00:00
Huy Do
bb6b24c622 [BE] Dockerize PyTorch docs jobs (#100601)
Saw some connection error to pip in docs jobs today, so let's dockerize it:

* https://github.com/pytorch/pytorch/actions/runs/4877612277/jobs/8702572072
* https://github.com/pytorch/pytorch/actions/runs/4877612277/jobs/8702572072

Some additional fixes:
* Moving the docs script from under `.circleci` to under `.ci` as they should be
* Linter (as scripts under .ci are subjected to shellcheck)
* Fix some minor Sphinx warnings in functorch docs

### Testing
Docs previews look fine:

* https://docs-preview.pytorch.org/100601/index.html
* https://docs-preview.pytorch.org/100601/cppdocs/index.html
* https://docs-preview.pytorch.org/100601/functorchdocs/index.html

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100601
Approved by: https://github.com/clee2000
2023-05-05 06:24:46 +00:00
PyTorch MergeBot
6d2f8114be Revert "[BE] Dockerize PyTorch docs jobs (#100601)"
This reverts commit 2703684acf.

Reverted https://github.com/pytorch/pytorch/pull/100601 on behalf of https://github.com/huydhn due to Curiously, this breaks inductor jobs ([comment](https://github.com/pytorch/pytorch/pull/100601#issuecomment-1535515587))
2023-05-04 23:13:15 +00:00
Huy Do
2703684acf [BE] Dockerize PyTorch docs jobs (#100601)
Saw some connection error to pip in docs jobs today, so let's dockerize it:

* https://github.com/pytorch/pytorch/actions/runs/4877612277/jobs/8702572072
* https://github.com/pytorch/pytorch/actions/runs/4877612277/jobs/8702572072

Some additional fixes:
* Moving the docs script from under `.circleci` to under `.ci` as they should be
* Linter (as scripts under .ci is subjected to shellcheck)
* Fix some minor Sphinx warnings in functorch docs

### Testing
Docs previews look fine:

* https://docs-preview.pytorch.org/100601/index.html
* https://docs-preview.pytorch.org/100601/cppdocs/index.html
* https://docs-preview.pytorch.org/100601/functorchdocs/index.html

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100601
Approved by: https://github.com/clee2000
2023-05-04 20:33:51 +00:00
Richard Zou
b2d703e2d7 Stop loading functorch._C unless torchdim is needed (#100491)
Just a small optimization. This PR changes it so that import of
functorch.dim ends up loading functorch._C (which is entirely composed
of torchdim APIs)

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100491
Approved by: https://github.com/Chillee, https://github.com/kshitij12345
2023-05-03 13:47:49 +00:00
Svetlana Karslioglu
d425da8bf3 Replace master with main in links and docs/conf.py (#100176)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100176
Approved by: https://github.com/albanD, https://github.com/malfet
2023-05-02 18:20:32 +00:00
Huy Do
56e235ad8c Pin functorch docs requirements (#100257)
The job https://github.com/pytorch/pytorch/actions/runs/4830815291/jobs/8607848573 starts to fail with the new IPython https://pypi.org/project/ipython/#history

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100257
Approved by: https://github.com/clee2000
2023-04-28 17:58:58 +00:00
Aleksei Nikiforov
bc0c74bcd5 Don't apply _Py_OPCODE twice (#97986)
It's already applied in PyInstDecoder::opcode.
Applying it twice returns incorrect result on big endian systems.

This change fixes 14 tests in test/functorch/test_dims.py on big endian systems.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97986
Approved by: https://github.com/Skylion007, https://github.com/kit1980
2023-04-27 00:27:32 +00:00
Aaron Gokaslan
e2a3817dfd [BE] Enable C419 rule for any all shortcircuiting (#99890)
Apparently https://github.com/pytorch/pytorch/pull/78142 made torch.JIT allow for simple generator expressions which allows us to enable rules that replace unnecessary list comprehensions with generators in any/all. This was originally part of #99280 but I split it off into this PR so that it can be easily reverted should anything break.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99890
Approved by: https://github.com/justinchuby, https://github.com/kit1980, https://github.com/malfet
2023-04-25 15:02:13 +00:00
Justin Chu
79c9e82e27 Fix flake8 lint errors reported by ruff - take 2 (#99798)
Replaces #99784. This PR is pure autofix.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99798
Approved by: https://github.com/Skylion007, https://github.com/kit1980
2023-04-23 23:09:51 +00:00
Horace He
547bef11ee tweak heuristic for sdpa selection based off of *data* (and a decision tree) (#99644)
High level approach:
1. I generated a bunch of data comparing FlashAttention and Cutlass implementations (https://pastebin.com/pe0j3YeK)
2. I trained a decision tree using standard train/val split methodology and hyperparameter sweeps (https://pastebin.com/fjYX1HjR).
2a. I did a bunch of feature augmentation to capture interactions between features.

The heuristic I ended up with is:
```
use_flash = seq_len / (num_heads * batch_size) > 6
```

TL;DR: On my dataset, where FlashAttention and Cutlass differ by more than 10%, the existing heuristic achieves 69% accuracy.  My new heuristic achieves 94% accuracy.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99644
Approved by: https://github.com/ngimel, https://github.com/drisspg
2023-04-21 23:28:44 +00:00
Guang Yang
aa4ed332c3 Improve torch.cond useability: Return UserError with actionable error messages (#98909)
It's part of the effort to improve PT2 Export UX. This PR is to improve the usability of `torch.cond()` by separating user errors from the dynamo internal errors. By definition, user error means the usage of `torch.cond()` violates the restrictions of this API therefore needs users to take action and fix the error.

In this notebook N3363227 we discovered a bunch of limitations of using `torch.cond(pred, true_fn, false_fn, operands)`. In summary, the limitations can be categorized as:
 - predicate restriction (`pred`)
 - operands restriction (`operands`)
 - branch restriction (`true_fn` & `false_fn`)

The error message will be more accurate about where the (user) error is from and more actionable for users to fix it.

For example, `operands` must be a list of tensors and the signature of `true_fn` and `false_fn` must match with the `operands`.
If the operands contains non-tensor types, user will see error message like:
```
torch._dynamo.exc.UserError: Expected a list of tensors but got ["<class 'torch.Tensor'>", "<class 'float'>"]

from user code:
   File "~/pytorch/test/dynamo/test_export.py", line 2504, in f_non_tensor_operands
    return cond(True, lambda x, a: x.sin(), lambda x, a: x.cos(), [x, a])
```
If the signature of the branch function doesn't match with `operands`, user will see error message like:
```
torch._dynamo.exc.UserError: too many positional arguments.
  func = 'false_fn' ~/pytorch/test/dynamo/test_export.py:2514, args = [<class 'torch.Tensor'>, <class 'torch.Tensor'>], kwargs = {}
```
Or if the tensor returned from user defined branches has different metadata, e.g. shapes, dtypes, etc., user will see error message like:
```
TypeError: Expected each tensor to have same metadata but got:
  cond_true_0 returns TensorMetadata(shape=torch.Size([2, 1]), dtype=torch.int64, requires_grad=False, stride=(1, 1), memory_format=torch.contiguous_format, is_quantized=False, qparams={})
  cond_false_0 returns TensorMetadata(shape=torch.Size([1]), dtype=torch.float32, requires_grad=False, stride=(1,), memory_format=torch.contiguous_format, is_quantized=False, qparams={})
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98909
Approved by: https://github.com/jansel
2023-04-20 17:20:41 +00:00