Commit Graph

7520 Commits

Author SHA1 Message Date
Angela Yi
51f6b946ae [torchbind] Add generic __deepcopy__ method (#137613)
Summary: Added a generic `__deepcopy__` method which will use the torchbind object's existing `__getattr__` and `__setattr__` to copy the torchbind object. This will later be used in [D64124825](https://www.internalfb.com/diff/D64124825)

Differential Revision: D64124826

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137613
Approved by: https://github.com/ydwu4, https://github.com/zou3519
2024-10-24 22:14:55 +00:00
Richard Barnes
8f62832189 c10::nullopt -> std::nullopt (#138701)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138701
Approved by: https://github.com/Skylion007, https://github.com/malfet
2024-10-24 15:03:32 +00:00
Richard Barnes
dbf0fa811a Remove C10_HOST_CONSTEXPR_EXCEPT_WIN_CUDA and CONSTEXPR_EXCEPT_WIN_CUDA (#138479)
BC linter suppressed due to removal of `tools/linter/adapters/constexpr_linter.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138479
Approved by: https://github.com/eqy, https://github.com/malfet
2024-10-24 07:51:05 +00:00
FFFrog
af0bc75460 Remove deprecated alias macro(1/3) (#137556)
**Detailed Descriptions:**
- Remove AT_ERROR Macro

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137556
Approved by: https://github.com/ezyang
2024-10-21 17:32:32 +00:00
Nikita Shulga
c0879d0c21 Fix lint
Regression casued by fddabc6e0b that was force merged
2024-10-19 08:33:41 -07:00
Richard Barnes
fddabc6e0b C10_UNUSED to [[maybe_unused]] (#6357) (#138364)
Summary: Pull Request resolved: https://github.com/pytorch/executorch/pull/6357

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138364
Approved by: https://github.com/Skylion007, https://github.com/eqy
2024-10-19 13:17:43 +00:00
Richard Barnes
542f7c8383 Eliminate C10_NODISCARD (#138336)
Test Plan: Sandcastle

Reviewed By: swolchok

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138336
Approved by: https://github.com/Skylion007
2024-10-19 02:54:06 +00:00
Richard Barnes
8dd575faf6 [BE] Modernize C10_UNUSED (#138102)
[`[[maybe_unused]]`](https://en.cppreference.com/w/cpp/language/attributes/maybe_unused) is part of C++17 standard

Test Plan: Sandcastle

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138102
Approved by: https://github.com/Skylion007, https://github.com/albanD, https://github.com/malfet, https://github.com/eqy
2024-10-18 16:33:01 +00:00
Pian Pawakapan
95f869c3d7 [pytorch_operator_stats] log if using torchscript runtime (#137986)
Summary: logs if an operator is run with the TorchScript runtime, using a thread_local variable set in `InterpreterState.run()`

Test Plan: buck2 run mode/dev-nosan caffe2/torch/fb/observers:scuba_observer_runner

Reviewed By: zou3519

Differential Revision: D64200781

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137986
Approved by: https://github.com/angelayi
2024-10-18 00:55:22 +00:00
FFFrog
ad28565ed7 Use C++17 Convention Methods in PyTorch (#137958)
Detailed Descriptions:
- `std::is_same<X, Y>::value` -> `std::is_same_v<X, Y>`
- `std::enable_if<C, T>::type` -> `std::enable_if_t<C, T>`
- and so on

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137958
Approved by: https://github.com/janeyx99
2024-10-18 00:52:51 +00:00
Edward Yang
b14269dcfb Make Context to be Device-agnostic Step by Step (1/N) (#136519) (#138155)
Summary:
- make init to be device-agnostic and move it to AcceleratorHooksInterface
- refactoring context related to device initialization

Original pull request: https://github.com/pytorch/pytorch/pull/136519

Test Plan: contbuild & OSS CI, see 4a8e49389c

Reviewed By: malfet

Differential Revision: D64471142

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138155
Approved by: https://github.com/malfet, https://github.com/bobrenjc93
2024-10-17 20:58:56 +00:00
PyTorch MergeBot
375dcb960f Revert "Avoid some dangling reference warnings (#132535)"
This reverts commit f3d7a02716.

Reverted https://github.com/pytorch/pytorch/pull/132535 on behalf of https://github.com/clee2000 due to broke some internal builds D64479234 ([comment](https://github.com/pytorch/pytorch/pull/132535#issuecomment-2419983509))
2024-10-17 16:23:36 +00:00
PyTorch MergeBot
dd32a32cb6 Revert "Expose option to disable CRC-32 computation during torch.save (#137735)"
This reverts commit 534fa96f2d.

Reverted https://github.com/pytorch/pytorch/pull/137735 on behalf of https://github.com/clee2000 due to failing internally D64438525, probably needs gating ([comment](https://github.com/pytorch/pytorch/pull/137735#issuecomment-2417412264))
2024-10-16 17:03:06 +00:00
Isuru Fernando
f3d7a02716 Avoid some dangling reference warnings (#132535)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132535
Approved by: https://github.com/aaronenyeshi
2024-10-16 13:41:12 +00:00
Mikayla Gawarecki
534fa96f2d Expose option to disable CRC-32 computation during torch.save (#137735)
Option only works in open source, not internal

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137735
Approved by: https://github.com/albanD
2024-10-15 19:30:02 +00:00
PyTorch MergeBot
cd292908e5 Revert "Make c10::string_view an alias of std::string_view (#130417)"
This reverts commit c48fe89011.

Reverted https://github.com/pytorch/pytorch/pull/130417 on behalf of https://github.com/clee2000 due to breaking some internal tests, probably usages of string_view that need to be changed? ([comment](https://github.com/pytorch/pytorch/pull/130417#issuecomment-2414775064))
2024-10-15 18:55:09 +00:00
PyTorch MergeBot
d4d687ffb2 Revert "Make Context to be Device-agnostic Step by Step (1/N) (#136519)"
This reverts commit 4a8e49389c.

Reverted https://github.com/pytorch/pytorch/pull/136519 on behalf of https://github.com/clee2000 due to breaking internal tests related to MITA, @ezyang has a forward fix? ([comment](https://github.com/pytorch/pytorch/pull/136519#issuecomment-2414588302))
2024-10-15 17:19:16 +00:00
Richard Barnes
b7f798caa4 Use C10_UNUSED instead of (void)X (#137239)
Summary:
Auto-generated with
```
buck run //scripts/rbarnes/regex_multiline_replacer:regex_multiline_replacer -- --find '^(\s*for\s*\()(const.*\n)\s*\(void\)[A-Za-z]+;\s*//\s*Suppress.*\s*\n(.*)'  --replace '\1C10_UNUSED \2\3' `find caffe2/ -regex ".*\.\(cpp\|h\)"`
```

Differential Revision: D33432600

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137239
Approved by: https://github.com/Skylion007
2024-10-15 14:32:59 +00:00
cyy
c48fe89011 Make c10::string_view an alias of std::string_view (#130417)
In order to facilitate the mitigation from c10::string_view to std::string_view, the old c10::string_view was renamed to c10::string_view_ext.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130417
Approved by: https://github.com/ezyang
2024-10-14 09:28:04 +00:00
FFFrog
4a8e49389c Make Context to be Device-agnostic Step by Step (1/N) (#136519)
----

- make init to be device-agnostic and move it to AcceleratorHooksInterface
- refactoring context related to device initialization

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136519
Approved by: https://github.com/ezyang, https://github.com/EikanWang, https://github.com/guangyey
2024-10-13 12:38:02 +00:00
Richard Barnes
a919742149 c10::optional -> std::optional in PyTorch (#137333)
Test Plan: Sandcastle

Differential Revision: D63876535

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137333
Approved by: https://github.com/Skylion007, https://github.com/albanD
2024-10-11 00:16:10 +00:00
PyTorch MergeBot
079f909263 Revert "Make Context to be Device-agnostic Step by Step (1/N) (#136519)"
This reverts commit be0b75256a.

Reverted https://github.com/pytorch/pytorch/pull/136519 on behalf of https://github.com/jovianjaison due to this pr is causing errors internally ([comment](https://github.com/pytorch/pytorch/pull/136519#issuecomment-2405781093))
2024-10-10 18:32:17 +00:00
FFFrog
be0b75256a Make Context to be Device-agnostic Step by Step (1/N) (#136519)
- make init to be device-agnostic and move it to AcceleratorHooksInterface
- refactoring context related to device initialization

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136519
Approved by: https://github.com/ezyang, https://github.com/EikanWang, https://github.com/guangyey
2024-10-09 02:13:36 +00:00
cyy
bbff667e32 [Distributed] [13/N] Fix clang-tidy warnings in torch/csrc/distributed/ (#136713)
Follows #136528

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136713
Approved by: https://github.com/kwen2501
2024-09-27 10:11:53 +00:00
cyy
3c542ce831 [Reland] Check function declarations of COREML code (#136070)
Reland of #135467 by fixing periodic workflows.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136070
Approved by: https://github.com/ezyang
2024-09-26 03:52:06 +00:00
David Berard
8ecc5f1a8f [TorchScript][tensorexpr] imbue locale for IRPrinter (#136458)
We had an internal report where the NNC-generated CUDA code had thousands separators in integer literals. Although I wasn't able to cleanly repro, I did come up with a hacky repro and verified that this fix works (see #136459).

Differential Revision: [D63278771](https://our.internmc.facebook.com/intern/diff/D63278771)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136458
Approved by: https://github.com/eellison
2024-09-24 19:00:57 +00:00
albanD
067d203b22 Upgrade pybind11 API calls for 3.13t (#136370)
This is a modified version of https://github.com/pytorch/pytorch/pull/130341 that preserve support for older pybind version.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136370
Approved by: https://github.com/Skylion007, https://github.com/malfet
2024-09-20 23:09:55 +00:00
albanD
cf31724db7 Fix and improvements to toward 3.13t (#136319)
Small part of https://github.com/pytorch/pytorch/pull/130689
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136319
Approved by: https://github.com/malfet, https://github.com/Skylion007
2024-09-20 04:22:18 +00:00
cyy
7bbdf87517 [22/N] Fix clang-tidy warnings in jit (#134829)
Follows  #134537

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134829
Approved by: https://github.com/ezyang
2024-09-19 19:24:42 +00:00
cyy
31e42a45dd Fix redundant move warnings by g++ (#134987)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134987
Approved by: https://github.com/ezyang
2024-09-15 05:28:19 +00:00
PyTorch MergeBot
564d00f364 Revert "Fix clang-tidy warnings in Caffe2 code (#134935)"
This reverts commit 7cfd23636c.

Reverted https://github.com/pytorch/pytorch/pull/134935 on behalf of https://github.com/izaitsevfb due to breaks internal builds, caffe2 is still used internally ([comment](https://github.com/pytorch/pytorch/pull/134935#issuecomment-2349368152))
2024-09-13 16:42:37 +00:00
PyTorch MergeBot
3de9e474df Revert "Check function declarations of Core ML code (#135467)"
This reverts commit bc1b8f094d.

Reverted https://github.com/pytorch/pytorch/pull/135467 on behalf of https://github.com/malfet due to This breaks ios periodic jobs, see https://github.com/pytorch/pytorch/actions/runs/10797026668/job/29947377532 ([comment](https://github.com/pytorch/pytorch/pull/135467#issuecomment-2347322784))
2024-09-12 22:04:35 +00:00
cyy
7cfd23636c Fix clang-tidy warnings in Caffe2 code (#134935)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134935
Approved by: https://github.com/ezyang
2024-09-12 03:27:09 +00:00
penguin-wwy
356f14e7b7 Fix the output of FileCheck when not run and add unit tests (#135345)
When FileCheck is destructed without execution, it should output all rules.
For example:
```
>>> fc = FileCheck().check("test")
>>> del fc
You have not run this instance of FileCheck!
FileCheck checks:
        CHECK: test
```

Additionally, unit tests for the Python interface of FileCheck will be added.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135345
Approved by: https://github.com/eellison
2024-09-11 04:13:24 +00:00
cyy
bc1b8f094d Check function declarations of Core ML code (#135467)
Relax the restrictions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135467
Approved by: https://github.com/ezyang
2024-09-10 16:05:22 +00:00
cyy
2196f32475 [22/N] Fix clang-tidy warnings in jit (#135319)
Follows #134537
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135319
Approved by: https://github.com/titaiwangms
2024-09-08 17:18:29 +00:00
Kulin Seth
144fde4fd2 [MPS] Add support for autocast in MPS (#99272)
Fixes https://github.com/pytorch/pytorch/issues/88415

Need to run inductor/test_cpu_select_algorithm

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99272
Approved by: https://github.com/malfet

Co-authored-by: Siddharth Kotapati <skotapati@apple.com>
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
Co-authored-by: Roy Hvaara <roy@lightyear.no>
2024-09-05 23:23:17 +00:00
cyyever
d9cc693719 [jit] Change argument names (#134828)
It seems like a bug.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134828
Approved by: https://github.com/janeyx99

Co-authored-by: Jane (Yuan) Xu <31798555+janeyx99@users.noreply.github.com>
2024-08-31 08:42:30 +00:00
Lucian Grijincu
71ff168dbb pytorch: llvm_codegen: prefix JIT generated functions with 8B of data so jitted code can be called from ASAN+UBSAN on LLVM17 (llvm/llvm-project#65253) (#134572)
Summary:
Similar workaround was already applied elsewhere in pytorch https://github.com/pytorch/pytorch/pull/133623 {D61348865}

LLVM17 UBSAN change discussion https://github.com/llvm/llvm-project/issues/104505

Here we also have to associate the data with the function with `setPrefixData(dummyPrefixData)` to prevent this workaround being disabled by the `optimize(*module_);` call which  could change layout/remove the unused variable/etc.

Differential Revision: D61845799

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134572
Approved by: https://github.com/atalman
2024-08-29 23:15:13 +00:00
Jennifer (Jiyue) Wang
a32255481b [caffe2][hipify] remove un-used flag from pybind_utils.h (#134404)
Summary:
Encountered issues related to AMD build when working on https://www.internalfb.com/diff/D60739324?dst_version_fbid=2203158110057105 (see stack trace P1545717562)

Looking at the file history, seems that the flag is no longer used so I propose to remove it.  Alternatively, I could change the `#ifdef` to check both `USE_C10D_NCCL` and  `USE_ROCM` and include the corresponding AMD header files.

Let me know what is more preferred way.

Test Plan: Sandcastle

Differential Revision: D61762129

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134404
Approved by: https://github.com/malfet
2024-08-29 04:09:44 +00:00
cyy
ec3f52dd27 [21/N] Fix clang-tidy warnings in jit (#134537)
Follows #133399

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134537
Approved by: https://github.com/Skylion007
2024-08-28 03:22:01 +00:00
cyy
73604eed0c [20/N] Fix clang-tidy warnings in jit (#133399)
Follows #133067

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133399
Approved by: https://github.com/Skylion007
2024-08-26 17:43:52 +00:00
Zhengxu Chen
517aee5369 [torchscript] Add a sampled logging integration point. (#133484)
Test Plan:
test script:
```
    def test_zhxchen17(self):
        from libfb.py.pyinit import initFacebook

        initFacebook()

        class M(torch.nn.Module):
            def forward(self, x):
                return torch.add(x, x)

        def tmptmp(x, y):
            return torch.mul(x, y)

        m = M()
        n = torch.jit.script(m)
        print(n(torch.tensor(1)))
        print(torch.jit.script(tmptmp)(torch.tensor(1), torch.tensor(2)))
```

```
I0802 12:01:23.932929 4079081 init.cc:407] Logging to scuba: run __torch__.caffe2.test.export.test_export.M.forward sample rate: 1000000
```

Differential Revision: D60920867

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133484
Approved by: https://github.com/davidberard98
2024-08-19 18:04:45 +00:00
Matthias Braun
5ee070266f Workaround ASAN failure (#133623)
Summary:
ASAN in llvm 17.x and newer reads 8 bytes in front of every function called. This means the JIT must not place a function immediately at the beginning of a freshly `mmap`ed page. This adds an 8 byte sized dummy variable as the first thing to work around the problem.

See also:
- https://reviews.llvm.org/D148665
- https://github.com/llvm/llvm-project/issues/65253

Test Plan:
- `servicelab create cogwheel_adfinder_ubsan_multi_trial_test --local-commit`: https://www.internalfb.com/servicelab/experiment/3701354882
- sandcastle

Differential Revision: D61348865

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133623
Approved by: https://github.com/Skylion007
2024-08-16 18:48:10 +00:00
Zhengxu Chen
59b3f5911d [sigmoid] Support custom obj deserialization. (#133463)
Summary:
It seems we have multiple places deserializing torchbind objects. Moving the code around so that every load essentially share the same implementation.

Also added a test case "package_reader_testing" which load back the archive file in Python and eagerly validate the numerical result.

Test Plan: buck test mode/opt sigmoid/inference/test:e2e_test_cpu

Reviewed By: SherlockNoMad

Differential Revision: D61235770

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133463
Approved by: https://github.com/ydwu4
2024-08-15 17:58:44 +00:00
Daniil Kutz
62cd065de2 Validate that node TK_ASSIGN have field initialized (#127878)
Fixes segmentation fault during model load via C++ API.

An `Assign` statement (`TK_ASSIGN` type) have 3 fields: `lhs`, `rhs` and `type`. Field `type` is of type `Maybe`, which means it could be not presented. During model load in `import_source.cpp` field `type` is dereferenced without validation.

It is similar error that have been fixed in #106041.

Fixes #127877

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127878
Approved by: https://github.com/malfet
2024-08-14 15:27:58 +00:00
cyy
2f30473fba [19/N] Fix clang-tidy warnings in jit (#133067)
Follows  #132963
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133067
Approved by: https://github.com/Skylion007
2024-08-13 15:59:43 +00:00
cyy
8967d55b01 [18/N] Fix clang-tidy warnings in jit (#132963)
Follows #132753

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132963
Approved by: https://github.com/Skylion007
2024-08-09 01:27:32 +00:00
Jeff Daily
ca713b8393 llvm update for backward-breaking APIs in 18 and 19 (#132825)
Related to #130661, #129797.  Based on the LLVM tagged releases, these LLVM_VERSION_MAJOR guards are accurate.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132825
Approved by: https://github.com/dcci, https://github.com/Skylion007
2024-08-07 18:31:40 +00:00
cyy
bfeb45e46b [17/N] Fix clang-tidy warnings in jit (#132753)
Follows #132604
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132753
Approved by: https://github.com/Skylion007
2024-08-07 03:47:54 +00:00
cyy
3ef45e5669 Fix ODR (#131032)
Fixes ODR violation

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131032
Approved by: https://github.com/ezyang
2024-08-05 23:19:49 +00:00
PyTorch MergeBot
2764bee942 Revert "[MPS] Add support for autocast in MPS (#99272)"
This reverts commit 6919e8baab.

Reverted https://github.com/pytorch/pytorch/pull/99272 on behalf of https://github.com/clee2000 due to Broke test/inductor/test_cpu_select_algorithm.py::TestSelectAlgorithmCPU::test_quantized_linear_amx_batch_size_3_in_features_128_out_features_64_bias_False_cpu on sm86 jobs [GH job link](https://github.com/pytorch/pytorch/actions/runs/10252979157/job/28367091621) [HUD commit link](6919e8baab) Not caught on PR due to bad TD ([comment](https://github.com/pytorch/pytorch/pull/99272#issuecomment-2269808857))
2024-08-05 19:59:04 +00:00
cyy
d5045cceff [16/N] Fix clang-tidy warnings in jit (#132604)
Follows #132564

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132604
Approved by: https://github.com/Skylion007
2024-08-05 17:36:22 +00:00
Kulin Seth
6919e8baab [MPS] Add support for autocast in MPS (#99272)
Fixes https://github.com/pytorch/pytorch/issues/88415

Co-authored-by: Siddharth Kotapati <skotapati@apple.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99272
Approved by: https://github.com/malfet
2024-08-05 17:02:30 +00:00
cyy
bc46f205c4 [15/N] Fix clang-tidy warnings in jit (#132564)
Follows  #132477

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132564
Approved by: https://github.com/Skylion007
2024-08-03 19:33:24 +00:00
cyy
fcef6cc6d1 [13/N] Fix clang-tidy warnings in jit (#132477)
Follows  #132209

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132477
Approved by: https://github.com/Skylion007
2024-08-03 00:13:18 +00:00
cyy
b9cb1abf65 [12/N] Use std::optional (#132361)
Follows #132396

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132361
Approved by: https://github.com/eqy
2024-08-02 13:46:46 +00:00
cyy
07fe1dd58f [13/N] Fix clang-tidy warnings in jit (#132411)
Follows  #132209

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132411
Approved by: https://github.com/Skylion007
2024-08-02 03:14:09 +00:00
Oguz Ulgen
72d2dba992 Add None return type to init (#132335)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132335
Approved by: https://github.com/albanD
2024-08-01 15:26:45 +00:00
cyy
c99adce9a1 [12/N] Fix clang-tidy warnings in jit (#132209)
Follows #132131

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132209
Approved by: https://github.com/Skylion007
2024-08-01 15:12:12 +00:00
cyy
043e41f4f4 [10/N] Use std::nullopt and std::make_optional (#132364)
Follows #130674
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132364
Approved by: https://github.com/ezyang
2024-08-01 07:02:35 +00:00
cyy
89da94594e [11/N] Fix clang-tidy warnings in jit (#132131)
Follows #132122

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132131
Approved by: https://github.com/Skylion007
2024-07-31 03:45:52 +00:00
cyy
eccbd408e5 [10/N] Fix clang-tidy warnings in jit (#132122)
Follows #132010

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132122
Approved by: https://github.com/Skylion007
2024-07-30 12:56:31 +00:00
cyy
c764ef6d53 [9/N] Fix clang-tidy warnings in jit (#132010)
Follows  #131997

Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132010
Approved by: https://github.com/Skylion007
2024-07-29 18:38:35 +00:00
cyy
efca51e171 [8/N] Fix clang-tidy warnings in jit (#131997)
Follows #131996
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131997
Approved by: https://github.com/Skylion007
2024-07-29 12:40:42 +00:00
cyy
5b3b2b9cc7 [7/N] Fix clang-tidy warnings in jit (#131996)
Follows #131986

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131996
Approved by: https://github.com/ezyang
2024-07-29 01:21:18 +00:00
cyy
ddd539ba6c [6/N] Fix clang-tidy warnings in jit (#131986)
Follows  #131969
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131986
Approved by: https://github.com/ezyang
2024-07-29 00:49:08 +00:00
cyy
8e5a367311 [5/N] Fix clang-tidy warnings in jit (#131969)
Follows #131903
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131969
Approved by: https://github.com/ezyang
2024-07-27 17:54:20 +00:00
cyy
99e13e68e9 [4/N] Fix clang-tidy warnings in jit (#131903)
Follows #131830

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131903
Approved by: https://github.com/Skylion007
2024-07-27 08:08:14 +00:00
PyTorch MergeBot
161bb67116 Revert "Fix static py::object dangling pointer with py::gil_safe_call_once_and_store (#130341)"
This reverts commit ace6decc99.

Reverted https://github.com/pytorch/pytorch/pull/130341 on behalf of https://github.com/clee2000 due to unfortunately the internal pybind update got reverted cc @malfet ([comment](https://github.com/pytorch/pytorch/pull/130341#issuecomment-2253147079))
2024-07-26 17:02:56 +00:00
cyy
2988d33c80 [3/N] Fix clang-tidy warnings in jit (#131830)
Follows #131735

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131830
Approved by: https://github.com/ezyang
2024-07-26 15:46:28 +00:00
Brian Hirsh
5612408735 _get_operation_overload: dont raise exception when overload does not exist (#131554)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131554
Approved by: https://github.com/ezyang, https://github.com/zou3519
ghstack dependencies: #131403, #131482, #131665
2024-07-26 15:38:11 +00:00
cyy
b07ea91c4c [2/N] Fix clang-tidy warnings in jit (#131735)
Follows #131034

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131735
Approved by: https://github.com/ezyang
2024-07-25 15:56:53 +00:00
Xuehai Pan
ace6decc99 Fix static py::object dangling pointer with py::gil_safe_call_once_and_store (#130341)
Fix static `py::object`s with `py::gil_safe_call_once_and_store`.

The following code will leak a `py::object` which will call its destructor when shutdown the program. The destructor will call `Py_DECREF(obj.m_ptr)` which may raise a segmentation fault.

```c++
void func() {
    static py::object obj = py::module_::import("foo").attr("bar");

    ...
}
```

The correct code is to use raw pointers rather than the instance.

```c++
void func() {
    static py::object* obj_ptr = new py::object{py::module_::import("foo").attr("bar")};
    py::object obj = *obj_ptr;

    ...
}
```

This PR uses the `py::gil_safe_call_once_and_store` function from `pybind11`, which can run arbitrary initialization code only once under the Python GIL thread safely.

```c++
void func() {
    PYBIND11_CONSTINIT static py::gil_safe_call_once_and_store<py::object> storage;
    py::object obj = storage
                         .call_once_and_store_result(
                             []() -> py::object {
                                 return py::module_::import("foo").attr("bar");
                             }
                         )
                         .get_stored();

    ...
}
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130341
Approved by: https://github.com/ezyang, https://github.com/malfet
2024-07-25 05:53:09 +00:00
cyy
538258bc13 [1/N] Fix clang-tidy warnings in jit (#131034)
Some some tidy warnings
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131034
Approved by: https://github.com/ezyang
2024-07-25 03:03:46 +00:00
cyy
46e42ae85d [4/N] Fix Wunused-parameter warnings (#131291)
Follows #131271

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131291
Approved by: https://github.com/ezyang
2024-07-25 02:59:22 +00:00
Nikita Shulga
62e566b345 [BE] Remove suppression of inconsistent missing overrides (#131524)
This should prevent regressions like the ones fixed by https://github.com/pytorch/pytorch/pull/131204

- Remove global `-Wno-error=inconsistent-missing-override`
- Wrap offending includes (protobuf and asmjit) with `C10_DIAGNOSTIC_PUSH_AND_IGNORE` and `C10_DIAGNOSTIC_POP_AND_IGNORED`
- Add `override` keyword to `at::namespace::tunable::StreamTimer` and `LLVMCodeGenImpl`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131524
Approved by: https://github.com/atalman
2024-07-24 10:07:36 +00:00
cyy
cd8bbdc71a [2/N] Fix Wunused-parameter warnings (#131170)
Follows #130924
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131170
Approved by: https://github.com/mikaylagawarecki
2024-07-19 23:58:56 +00:00
cyy
3cc6183ce1 Fix getAugOp error (#131033)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131033
Approved by: https://github.com/ezyang
2024-07-19 01:07:24 +00:00
cyy
73d0f484b3 [structural binding][11/N] Replace std::tie with structural binding (#130830)
Follows  #130784

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130830
Approved by: https://github.com/janeyx99
2024-07-18 00:45:06 +00:00
Lei Wang (Server LLVM)
9a6d81b178 Fix pytorch JIT build for LLVM 18+ (#130661)
Summary: LLVM upstream(https://github.com/llvm/llvm-project/pull/97824) changed `getHostCPUFeatures`to  use Return StringMap. Fix this to unblock T195389358

Test Plan:
```
buck2 build mode/opt-clang-thinlto --upload-all-actions -c unicorn.hfsort="1" -c cxx.extra_cxxflags="-gpubnames -w -Wno-enum-constexpr-conversion -Wno-missing-template-arg-list-after-template-kw -Wno-c++11-narrowing -Wno-c++11-narrowing-const-reference -ferror-limit=0" -c cxx.extra_cflags="-gpubnames -w -Wno-enum-constexpr-conversion -Wno-missing-template-arg-list-after-template-kw -Wno-c++11-narrowing -Wno-c++11-narrowing-const-reference" -c cxx.profile="fbcode//fdo/autofdo/unicorn/topaggr/top_aggregator_server:autofdo" unicorn/topaggr:top_aggregator_server
```

Differential Revision: D59708722

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130661
Approved by: https://github.com/Skylion007
2024-07-16 23:47:48 +00:00
David Berard
1fbfb3202d [docs][TorchScript] document c10::AliasAnalysisKind::CONSERVATIVE (#130765)
I spent a while trying to search this to remember what this was called. Adding it to the OVERVIEW.md docs so it's easier to search
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130765
Approved by: https://github.com/nmacchioni, https://github.com/eellison, https://github.com/aaronenyeshi
2024-07-16 14:20:31 +00:00
PyTorch MergeBot
ea78b0c177 Revert "Fix static py::object dangling pointer with py::gil_safe_call_once_and_store (#130341)"
This reverts commit a17d1e5322.

Reverted https://github.com/pytorch/pytorch/pull/130341 on behalf of https://github.com/izaitsevfb due to internal needs pybind update ([comment](https://github.com/pytorch/pytorch/pull/130341#issuecomment-2226499397))
2024-07-12 23:07:37 +00:00
Bertrand Thia
43b98fa521 Add debug repr to SymNode (#129925)
Fixes #129403

Create a separate printing function to debug SymNode, since we can't easily change `__repr__` that is used by GraphModule.recompile() to create a pythonic version of a graph

This is my first contribution, please let me know if there is anything that I should look into in further details

Thank you for you guidance! 🙏 I hope to contribute more in the future!

@aorenste
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129925
Approved by: https://github.com/aorenste
2024-07-12 18:31:23 +00:00
PyTorch MergeBot
7ce5b5767c Revert "Make c10::string_view an alias of std::string_view (#130417)"
This reverts commit c9551a3f50.

Reverted https://github.com/pytorch/pytorch/pull/130417 on behalf of https://github.com/izaitsevfb due to depends on #130009 which needs to be reverted ([comment](https://github.com/pytorch/pytorch/pull/130417#issuecomment-2224212227))
2024-07-12 00:37:04 +00:00
cyy
c9551a3f50 Make c10::string_view an alias of std::string_view (#130417)
Follows #130009 to further facilitate the mitigation from c10::string_view to std::string_view. The old c10::string_view was renamed to c10::string_view_ext.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130417
Approved by: https://github.com/ezyang
2024-07-11 12:31:06 +00:00
cyy
798b9652f7 [6/N] Replace c10::optional with std::optional (#130438)
Follows #130408

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130438
Approved by: https://github.com/janeyx99
2024-07-11 01:15:37 +00:00
Xuehai Pan
a17d1e5322 Fix static py::object dangling pointer with py::gil_safe_call_once_and_store (#130341)
Fix static `py::object`s with `py::gil_safe_call_once_and_store`.

The following code will leak a `py::object` which will call its destructor when shutdown the program. The destructor will call `Py_DECREF(obj.m_ptr)` which may raise a segmentation fault.

```c++
void func() {
    static py::object obj = py::module_::import("foo").attr("bar");

    ...
}
```

The correct code is to use raw pointers rather than the instance.

```c++
void func() {
    static py::object* obj_ptr = new py::object{py::module_::import("foo").attr("bar")};
    py::object obj = *obj_ptr;

    ...
}
```

This PR uses the `py::gil_safe_call_once_and_store` function from `pybind11`, which can run arbitrary initialization code only once under the Python GIL thread safely.

```c++
void func() {
    PYBIND11_CONSTINIT static py::gil_safe_call_once_and_store<py::object> storage;
    py::object obj = storage
                         .call_once_and_store_result(
                             []() -> py::object {
                                 return py::module_::import("foo").attr("bar");
                             }
                         )
                         .get_stored();

    ...
}
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130341
Approved by: https://github.com/ezyang
2024-07-10 04:23:37 +00:00
Jerry Mannil
42f647219a [ROCm] Add int4 support (#129710)
- Add AMD support for int4 kernel
  - Only supports CDNA2 and CDNA3 gpus for now
  - Uses `mfma_f32_16x16x16bf16` instruction for matrix multiply
  - Uses `v_and_or_b32` instruction and `__hfma2` instrinsic for unpacking bf16 values
  - Enable hipify for `__nv_bfloat16` and `__nv_bfloat162` data types
- Enable int4 unit tests for CDNA2 and CDNA3 AMD gpus
- Fix torchscript issues due to hipify for `__nv_bfloat16` type
  - TorchScript has its own implementation for bfloat16 type
    - Implemented in `__nv_bloat16` structure at [resource_strings.h](https://github.com/pytorch/pytorch/blob/main/torch/csrc/jit/codegen/fuser/cuda/resource_strings.h)
    - So, we shouldn't hipify any reference of `__nv_bfloat16` in the torchscript implementation
    - Hence moved the `__nv_bfloat16` direct references in `codegen.cpp` and `cuda_codegen.cpp` to `resource_strings.h` which is already exempted from hipify

Fixes #124699
Fixes pytorch-labs/gpt-fast/issues/154

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129710
Approved by: https://github.com/malfet
2024-07-09 19:49:12 +00:00
cyy
71efbf701d [3/N] Change #include <c10/util/Optional.h> to #include <optional> (#130300)
Follows #130236

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130300
Approved by: https://github.com/ezyang
2024-07-09 13:32:57 +00:00
cyy
29861779ce [2/N] Change #include <c10/util/Optional.h> to #include <optional> (#130236)
Follows  #128301. The changes were made by grep and sed

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130236
Approved by: https://github.com/ezyang
2024-07-09 03:17:24 +00:00
cyy
f4dcf2ae93 [1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128301
Approved by: https://github.com/ezyang, https://github.com/r-barnes
2024-07-08 07:03:53 +00:00
cyy
efb73eda51 [2/N] Fix some violations of unused-function and unused-variable checks in torch_cpu (#129878)
Follows #128670

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129878
Approved by: https://github.com/ezyang
2024-07-04 00:39:28 +00:00
Shiyan Deng
1e27af335e [easy] enhance local model loading (#129897)
Summary:
1. add one more model lib dep.
2. add error message when torchscript failed to find a class in python compilation unit.

Test Plan: CI

Reviewed By: jingsh

Differential Revision: D59243250

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129897
Approved by: https://github.com/jingsh
2024-07-03 00:29:02 +00:00
cyy
26de2c2487 [3/N] Enable clang-tidy on torch/csrc/jit/serialization/* (#129850)
Follows #129300.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129850
Approved by: https://github.com/ezyang
2024-07-02 20:08:48 +00:00
PyTorch MergeBot
07450e9713 Revert "[MPS] Add support for autocast in MPS (#99272)"
This reverts commit 6240cfd5c7.

Reverted https://github.com/pytorch/pytorch/pull/99272 on behalf of https://github.com/jeanschmidt due to introduced breakages in trunk ([comment](https://github.com/pytorch/pytorch/pull/99272#issuecomment-2203033719))
2024-07-02 12:29:51 +00:00
Kulin Seth
6240cfd5c7 [MPS] Add support for autocast in MPS (#99272)
Fixes https://github.com/pytorch/pytorch/issues/88415

Co-authored-by: Siddharth Kotapati <skotapati@apple.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99272
Approved by: https://github.com/malfet
2024-07-02 01:49:52 +00:00
cyy
ca5d13c672 [1/N] Enable unused variable warnings on torch_cpu and fix some violations (#128670)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128670
Approved by: https://github.com/ezyang
2024-07-01 14:56:46 +00:00
cyy
eb1583dbc1 [2/N] Fix clang-tidy warnings in torch/csrc/jit/serialization (#129300)
Follows #129055
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129300
Approved by: https://github.com/ezyang
2024-07-01 01:09:00 +00:00
Wenlei He
89696db4b0 Revert "[LLVM/TensorExpr] Update for an API change in LLVM 18." (#129797)
This reverts commit 20f394f10a (https://github.com/pytorch/pytorch/pull/117086)

LLVM upstream changed the pass builder API again, so registerPassBuilderCallbacks no longer takes extra boolean for PopulateClassToPassNames. Update accordingly.

Relevant LLVM upstream change:
https://github.com/llvm/llvm-project/pull/96321
https://github.com/llvm/llvm-project/pull/96462

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129797
Approved by: https://github.com/dcci
2024-06-29 05:17:20 +00:00
rzou
856541c701 [custom_op] support default dtype values (#129189)
This PR:
- moves some of the dtype-string utilities into ScalarType.{h, cpp}
- adds a new utility to get a mapping from dtype name to the C++ dtype
- the perser now checks if the string is a dtype name; if it is then it
  pulls the c++ dtype from the mapping.

Test Plan:
- new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129189
Approved by: https://github.com/albanD
ghstack dependencies: #129177, #129178, #129179
2024-06-23 00:13:23 +00:00
cyy
2c7c286fa4 [1/N] Fix clang-tidy warnings in torch/csrc/jit/serialization (#129055)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129055
Approved by: https://github.com/r-barnes
2024-06-21 14:56:31 +00:00
cyy
5c676bb8b3 Remove Caffe2 handling from onnx_unpack_quantized_weights (#129021)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129021
Approved by: https://github.com/justinchuby, https://github.com/albanD
2024-06-21 06:16:44 +00:00
David Berard
108318ad10 [BE][JIT] Handle case where codegen object can be unset (#128951)
Summary:
Unblocks a test that's failing.

`codegen` can be unset until `compile` is called. If `codegen` is not set, then just use the kernel name directly.

Test Plan:
```
buck2 run //caffe2/test:tensorexpr -- --regex test_simple_add
```

Differential Revision: D58727391

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128951
Approved by: https://github.com/aaronenyeshi
2024-06-18 15:40:45 +00:00
PyTorch MergeBot
846bb30e13 Revert "[1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301)"
This reverts commit bd72e28314.

Reverted https://github.com/pytorch/pytorch/pull/128301 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it fails XLA build bd72e28314. Please rebase your PR before relanding because I think the failure is hidden by an unrelated broken trunk XLA failure from your current base commit ([comment](https://github.com/pytorch/pytorch/pull/128301#issuecomment-2169035822))
2024-06-15 01:58:20 +00:00
cyy
bd72e28314 [1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128301
Approved by: https://github.com/ezyang
2024-06-14 23:21:01 +00:00
angelayi
e9c6e8369c Torchbind call method + effects support (#128397)
Adds effect token support to torchbind method calls by allowing `with_effects` to take in `torch.ops._higher_order_ops.call_torchbind` as an input.

Here is the print from `TORCH_LOGS="aot" python test/export/test_torchbind.py -k test_compile_obj_torchbind_op`:
```python
def forward(self, arg0_1: "f32[0]", arg1_1: "f32[2]", arg2_1):
    # File: /data/users/angelayi/pytorch2/test/export/test_torchbind.py:1266 in f, code: torch.ops._TorchScriptTesting.queue_push(tq, x.cos())
    cos: "f32[2]" = torch.ops.aten.cos.default(arg1_1)
    with_effects = torch._higher_order_ops.effects.with_effects(arg0_1, torch.ops._TorchScriptTesting.queue_push.default, arg2_1, cos);  arg0_1 = cos = None
    getitem: "f32[0]" = with_effects[0];  with_effects = None

    # File: /data/users/angelayi/pytorch2/test/export/test_torchbind.py:1267 in f, code: torch.ops._TorchScriptTesting.queue_push(tq, x.cos() + 1)
    cos_1: "f32[2]" = torch.ops.aten.cos.default(arg1_1)
    add: "f32[2]" = torch.ops.aten.add.Tensor(cos_1, 1);  cos_1 = None
    with_effects_1 = torch._higher_order_ops.effects.with_effects(getitem, torch.ops._TorchScriptTesting.queue_push.default, arg2_1, add);  getitem = add = None
    getitem_2: "f32[0]" = with_effects_1[0];  with_effects_1 = None

    # File: /data/users/angelayi/pytorch2/test/export/test_torchbind.py:1268 in f, code: torch.ops._TorchScriptTesting.queue_pop(tq)
    with_effects_2 = torch._higher_order_ops.effects.with_effects(getitem_2, torch.ops._TorchScriptTesting.queue_pop.default, arg2_1);  getitem_2 = None
    getitem_4: "f32[0]" = with_effects_2[0];  with_effects_2 = None

    # File: /data/users/angelayi/pytorch2/test/export/test_torchbind.py:1269 in f, code: torch.ops._TorchScriptTesting.queue_push(tq, x.sin())
    sin: "f32[2]" = torch.ops.aten.sin.default(arg1_1);  arg1_1 = None
    with_effects_3 = torch._higher_order_ops.effects.with_effects(getitem_4, torch.ops._TorchScriptTesting.queue_push.default, arg2_1, sin);  getitem_4 = sin = None
    getitem_6: "f32[0]" = with_effects_3[0];  with_effects_3 = None

    # File: /data/users/angelayi/pytorch2/test/export/test_torchbind.py:1270 in f, code: return tq.pop(), tq.pop() + tq.size(), tq
    with_effects_4 = torch._higher_order_ops.effects.with_effects(getitem_6, torch.ops._higher_order_ops.call_torchbind, arg2_1, 'pop');  getitem_6 = None
    getitem_8: "f32[0]" = with_effects_4[0]
    getitem_9: "f32[2]" = with_effects_4[1];  with_effects_4 = None
    with_effects_5 = torch._higher_order_ops.effects.with_effects(getitem_8, torch.ops._higher_order_ops.call_torchbind, arg2_1, 'pop');  getitem_8 = None
    getitem_10: "f32[0]" = with_effects_5[0]
    getitem_11: "f32[2]" = with_effects_5[1];  with_effects_5 = None
    with_effects_6 = torch._higher_order_ops.effects.with_effects(getitem_10, torch.ops._higher_order_ops.call_torchbind, arg2_1, 'size');  getitem_10 = arg2_1 = None
    getitem_12: "f32[0]" = with_effects_6[0];  with_effects_6 = None
    add_1: "f32[2]" = torch.ops.aten.add.Tensor(getitem_11, 0);  getitem_11 = None
    return (getitem_12, getitem_9, add_1)
```

In order to support this, this PR makes the following changes:
* Adds `FakeScriptObject` to `CustomObjArgument`, which will be put on the `meta["val"]` of nodes representing torchbind objects.
* Adds pickle/deepcopy support to FunctionSchema.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128397
Approved by: https://github.com/ydwu4, https://github.com/zou3519
2024-06-14 21:28:17 +00:00
cyy
9ebec1f345 Enable Wunused-function in torch_cpu (#128576)
Follows #128499

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128576
Approved by: https://github.com/ezyang, https://github.com/r-barnes
2024-06-14 00:12:58 +00:00
cyy
3f9b8446cf [8/N] Remove unused functions (#128499)
Follows #128407

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128499
Approved by: https://github.com/malfet
2024-06-13 01:15:11 +00:00
cyy
219da29dfd [7/N] Remove unused functions (#128407)
Follows  #128309
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128407
Approved by: https://github.com/ezyang
2024-06-12 01:10:33 +00:00
David Berard
29081059b6 [Static Runtime] Fix & run gen_static_runtime_ops (#128299)
gen_static_runtime_ops hasn't been updated in a while. In preparation for https://github.com/pytorch/pytorch/pull/127675 in which I need to re-run the codegen step for cumprod, I want to land these changes beforehand in case there are any other issues that arise.

I added a number of ops to the blocklist:
```
+        "_nested_tensor_storage_offsets",
+        "_nested_get_values",  # no CPU backend
+        "_nested_get_values_copy",  # no CPU backend
+        "_nested_view_from_jagged",  # testing needs to be patched
+        "_nested_view_from_jagged_copy",  # testing needs to be patched
+        "_nested_view_from_buffer",  # testing needs to be patched
+        "_nested_view_from_buffer_copy",  # testing needs to be patched
+        "_int_mm",  # testing needs to be patched
+        "_to_sparse_csc",  # testing needs to be patched
+        "_to_sparse_csr",  # testing needs to be patched
+        "segment_reduce",  # testing needs to be patched
```

Most of these are added just because testing doesn't work right now.

Additionally, a few `fft` ops seem to have been removed from native_functions.yaml; I'm guessing it's unlikely FFT would have been used in many real models though.

Differential Revision: [D58329403](https://our.internmc.facebook.com/intern/diff/D58329403/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128299
Approved by: https://github.com/YuqingJ
2024-06-11 16:27:39 +00:00
cyy
99f5a85a09 [Clang Tidy] Fix misc-header-include-cycle errors in clang-tidy and ignore some files (#127233)
Since there are such cycles in libfmt and PyTorch, which are detected by clang-tidy.
```
/home/cyy/pytorch/third_party/fmt/include/fmt/format-inl.h:25:10: error: circular header file dependency detected while including 'format.h', please check the include path [misc-header-include-cycle,-warnings-as-errors]
   25 | #include "format.h"
      |          ^
/home/cyy/pytorch/third_party/fmt/include/fmt/format.h:4530:12: note: 'format-inl.h' included from here
 4530 | #  include "format-inl.h"
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127233
Approved by: https://github.com/ezyang
2024-06-10 23:49:58 +00:00
cyy
30875953a4 [1/N] Remove inclusion of c10/util/string_utils.h (#128300)
As a first step to remove it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128300
Approved by: https://github.com/ezyang, https://github.com/eqy
2024-06-10 23:40:47 +00:00
Peter Bell
d3817d8a60 Don't create python tuple when _maybe_handle_torch_function is called from C++ (#128187)
Marginal overhead reduction when calling through the `torch.ops` API.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128187
Approved by: https://github.com/lezcano
ghstack dependencies: #128183, #128184, #128185
2024-06-10 00:16:59 +00:00
Edward Z. Yang
3964a3ec73 Complete revamp of float/promotion sympy handling (#126905)
At a high level, the idea behind this PR is:

* Make it clearer what the promotion and int/float rules for various Sympy operations are. Operators that previously were polymorphic over int/float are now split into separate operators for clarity. We never do mixed int/float addition/multiplication etc in sympy, instead, we always promote to the appropriate operator. (However, equality is currently not done correctly.)
* Enforce strict typing on ValueRanges: if you have a ValueRange for a float, the lower and upper MUST be floats, and so forth for integers.

The story begins in **torch/utils/_sympy/functions.py**. Here, I make some changes to how we represent certain operations in sympy expressions:

* FloorDiv now only supports integer inputs; to do float floor division, do a truediv and then a trunc. Additionally, we remove the divide out addition by gcd optimization, because sympy gcd is over fields and is willing to generate rationals (but rationals are bad for ValueRange strict typing).
* ModularIndexing, LShift, RShift now assert they are given integer inputs.
* Mod only supports integer inputs; eventually we will support FloatMod (left for later work, when we build out Sympy support for floating operations). Unfortunately, I couldn't assert integer inputs here, because of a bad interaction with sympy's inequality solver that is used by the offline solver
* TrueDiv is split into FloatTrueDiv and IntTrueDiv. This allows for us to eventually generate accurate code for Python semantics IntTrueDiv, which is written in a special way to preserve precision when the inputs are >= 2**53 beyond what first coercing the integer to floats and then doing true division.
* Trunc is split to TruncToFloat and TruncToInt.
* Round is updated to return a float, not an int, making it consistent with the round op handler in Inductor. To get Python-style conversion to int, we call TruncToInt on the result.
* RoundDecimal updated to consistently only ever return a float
* Add ToFloat for explicit coercion to float (required so we can enforce strict ValueRanges typing)

In **torch/__init__.py**, we modify SymInt and SymFloat to appropriately call into new bindings that route to these refined sympy operations.  Also, we modify `torch.sym_min` and `torch.sym_max` to have promotion semantics (if one argument is a float, the return result is always a float), making them inconsistent with builtins.min/max, but possible to do type analysis without runtime information.

We also need to introduce some new op handlers in **torch/_inductor/ops_handler.py**:

* `to_int` for truncation to int64, directly corresponding to TruncToInt; this can be implemented by trunc and dtype, but with a dedicated handler it is more convenient for roundtripping in Sympy
* `int_truediv` for Python-style integer true division, which has higher precision than casting to floats and then running `truediv`

These changes have consequences. First, we need to make some administrative changes:

* Actually wire up these Sympy functions from SymInt/SymFloat in **torch/fx/experimental/sym_node.py**, including the new promotion rules (promote2)
* Add support for new Sympy functions in **torch/utils/_sympy/interp.py**, **torch/utils/_sympy/reference.py**
  * In particular, in torch.utils._sympy.reference, we have a strong preference to NOT do nontrivial compute, instead, everything in ops handler should map to a singular sympy function
  * TODO: I chose to roundtrip mod back to our Mod function, but I think I'm going to have to deal with the C/Python inconsistency this to fix tests here
* Add printer support for the Sympy functions in **torch/_inductor/codegen/common.py**, **torch/_inductor/codegen/cpp_utils.py**, **torch/_inductor/codegen/triton.py**. `int_truediv` and mixed precision equality is currently not implemented soundly, so we will lose precision in codegen for large values. TODO: The additions here are not exhaustive yet
* Update ValueRanges logic to use new sympy functions in **torch/utils/_sympy/value_ranges.py**. In general, we prefer to use the new Sympy function rather than try to roll things by hand, which is what was done previously for many VR analysis functions.

In **torch/fx/experimental/symbolic_shapes.py** we need to make some symbolic reasoning adjustments:

* Avoid generation of rational subexpressions by removing simplification of `x // y` into `floor(x / y)`. This simplification then triggers an addition simplification rule `(x + y) / c --> x / c + y / c` which is bad because x / c is a rational number now
* `_assert_bound_is_rational` is no more, we no longer generate rational bounds
* Don't intersect non-int value ranges with the `int_range`
* Support more sympy Functions for guard SYMPY_INTERP
* Assert the type of value range is consistent with the variable type

The new asserts uncovered necessary bug fixes:

* **torch/_inductor/codegen/cpp.py**, **torch/_inductor/select_algorithm.py**, **torch/_inductor/sizevars.py** - Ensure Wild/Symbol manually allocated in Inductor is marked `is_integer` so it's accepted to build expressions
* **torch/_inductor/utils.py** - make sure you actually pass in sympy.Expr to these functions
* **torch/_inductor/ir.py** - make_contiguous_strides_for takes int/SymInt, not sympy.Expr!
* **torch/export/dynamic_shapes.py** - don't use infinity to represent int ranges, instead use sys.maxsize - 1

Because of the removal of some symbolic reasoning that produced rationals, some of our symbolic reasoning has gotten worse and we are unable to simplify some guards. Check the TODO at **test/test_proxy_tensor.py**

**Reland notes.** This requires this internal fbcode diff https://www.internalfb.com/phabricator/paste/view/P1403322587 but I cannot prepare the diff codev due to https://fb.workplace.com/groups/osssupport/posts/26343544518600814/

It also requires this Executorch PR https://github.com/pytorch/executorch/pull/3911 but the ET PR can be landed prior to this landing.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126905
Approved by: https://github.com/xadupre, https://github.com/lezcano
2024-06-09 06:20:25 +00:00
cyy
c219fa5eb9 [3/N] Remove unused functions (#128179)
Following https://github.com/pytorch/pytorch/pull/128005, this PR continues to remove unused functions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128179
Approved by: https://github.com/ezyang
2024-06-07 16:13:16 +00:00
PyTorch MergeBot
ac51f782fe Revert "Complete revamp of float/promotion sympy handling (#126905)"
This reverts commit 2f7cfecd86.

Reverted https://github.com/pytorch/pytorch/pull/126905 on behalf of https://github.com/atalman due to Sorry need to revert - failing internally ([comment](https://github.com/pytorch/pytorch/pull/126905#issuecomment-2155118778))
2024-06-07 16:01:46 +00:00
Edward Z. Yang
2f7cfecd86 Complete revamp of float/promotion sympy handling (#126905)
At a high level, the idea behind this PR is:

* Make it clearer what the promotion and int/float rules for various Sympy operations are. Operators that previously were polymorphic over int/float are now split into separate operators for clarity. We never do mixed int/float addition/multiplication etc in sympy, instead, we always promote to the appropriate operator. (However, equality is currently not done correctly.)
* Enforce strict typing on ValueRanges: if you have a ValueRange for a float, the lower and upper MUST be floats, and so forth for integers.

The story begins in **torch/utils/_sympy/functions.py**. Here, I make some changes to how we represent certain operations in sympy expressions:

* FloorDiv now only supports integer inputs; to do float floor division, do a truediv and then a trunc. Additionally, we remove the divide out addition by gcd optimization, because sympy gcd is over fields and is willing to generate rationals (but rationals are bad for ValueRange strict typing).
* ModularIndexing, LShift, RShift now assert they are given integer inputs.
* Mod only supports integer inputs; eventually we will support FloatMod (left for later work, when we build out Sympy support for floating operations). Unfortunately, I couldn't assert integer inputs here, because of a bad interaction with sympy's inequality solver that is used by the offline solver
* TrueDiv is split into FloatTrueDiv and IntTrueDiv. This allows for us to eventually generate accurate code for Python semantics IntTrueDiv, which is written in a special way to preserve precision when the inputs are >= 2**53 beyond what first coercing the integer to floats and then doing true division.
* Trunc is split to TruncToFloat and TruncToInt.
* Round is updated to return a float, not an int, making it consistent with the round op handler in Inductor. To get Python-style conversion to int, we call TruncToInt on the result.
* RoundDecimal updated to consistently only ever return a float
* Add ToFloat for explicit coercion to float (required so we can enforce strict ValueRanges typing)

In **torch/__init__.py**, we modify SymInt and SymFloat to appropriately call into new bindings that route to these refined sympy operations.  Also, we modify `torch.sym_min` and `torch.sym_max` to have promotion semantics (if one argument is a float, the return result is always a float), making them inconsistent with builtins.min/max, but possible to do type analysis without runtime information.

We also need to introduce some new op handlers in **torch/_inductor/ops_handler.py**:

* `to_int` for truncation to int64, directly corresponding to TruncToInt; this can be implemented by trunc and dtype, but with a dedicated handler it is more convenient for roundtripping in Sympy
* `int_truediv` for Python-style integer true division, which has higher precision than casting to floats and then running `truediv`

These changes have consequences. First, we need to make some administrative changes:

* Actually wire up these Sympy functions from SymInt/SymFloat in **torch/fx/experimental/sym_node.py**, including the new promotion rules (promote2)
* Add support for new Sympy functions in **torch/utils/_sympy/interp.py**, **torch/utils/_sympy/reference.py**
  * In particular, in torch.utils._sympy.reference, we have a strong preference to NOT do nontrivial compute, instead, everything in ops handler should map to a singular sympy function
  * TODO: I chose to roundtrip mod back to our Mod function, but I think I'm going to have to deal with the C/Python inconsistency this to fix tests here
* Add printer support for the Sympy functions in **torch/_inductor/codegen/common.py**, **torch/_inductor/codegen/cpp_utils.py**, **torch/_inductor/codegen/triton.py**. `int_truediv` and mixed precision equality is currently not implemented soundly, so we will lose precision in codegen for large values. TODO: The additions here are not exhaustive yet
* Update ValueRanges logic to use new sympy functions in **torch/utils/_sympy/value_ranges.py**. In general, we prefer to use the new Sympy function rather than try to roll things by hand, which is what was done previously for many VR analysis functions.

In **torch/fx/experimental/symbolic_shapes.py** we need to make some symbolic reasoning adjustments:

* Avoid generation of rational subexpressions by removing simplification of `x // y` into `floor(x / y)`. This simplification then triggers an addition simplification rule `(x + y) / c --> x / c + y / c` which is bad because x / c is a rational number now
* `_assert_bound_is_rational` is no more, we no longer generate rational bounds
* Don't intersect non-int value ranges with the `int_range`
* Support more sympy Functions for guard SYMPY_INTERP
* Assert the type of value range is consistent with the variable type

The new asserts uncovered necessary bug fixes:

* **torch/_inductor/codegen/cpp.py**, **torch/_inductor/select_algorithm.py**, **torch/_inductor/sizevars.py** - Ensure Wild/Symbol manually allocated in Inductor is marked `is_integer` so it's accepted to build expressions
* **torch/_inductor/utils.py** - make sure you actually pass in sympy.Expr to these functions
* **torch/_inductor/ir.py** - make_contiguous_strides_for takes int/SymInt, not sympy.Expr!
* **torch/export/dynamic_shapes.py** - don't use infinity to represent int ranges, instead use sys.maxsize - 1

Because of the removal of some symbolic reasoning that produced rationals, some of our symbolic reasoning has gotten worse and we are unable to simplify some guards. Check the TODO at **test/test_proxy_tensor.py**

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126905
Approved by: https://github.com/xadupre, https://github.com/lezcano
2024-06-06 02:29:45 +00:00
cyy
9f2c4b9342 Replace with standard type traits in torch/csrc (#127852)
In preparation to clean up more type traits.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127852
Approved by: https://github.com/ezyang
2024-06-05 15:22:48 +00:00
PyTorch MergeBot
d5cb5d623a Revert "Complete revamp of float/promotion sympy handling (#126905)"
This reverts commit fb696ef3aa.

Reverted https://github.com/pytorch/pytorch/pull/126905 on behalf of https://github.com/ezyang due to internal user reported ceiling equality simplification problem, I have a plan ([comment](https://github.com/pytorch/pytorch/pull/126905#issuecomment-2148805840))
2024-06-05 03:57:58 +00:00
cyy
059cae6176 [Caffe2] Remove Caffe2 proto and other files (#127655)
Remove Caffe2 proto files altogether.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127655
Approved by: https://github.com/ezyang
2024-06-04 14:22:21 +00:00
Edward Z. Yang
fb696ef3aa Complete revamp of float/promotion sympy handling (#126905)
At a high level, the idea behind this PR is:

* Make it clearer what the promotion and int/float rules for various Sympy operations are. Operators that previously were polymorphic over int/float are now split into separate operators for clarity. We never do mixed int/float addition/multiplication etc in sympy, instead, we always promote to the appropriate operator. (However, equality is currently not done correctly.)
* Enforce strict typing on ValueRanges: if you have a ValueRange for a float, the lower and upper MUST be floats, and so forth for integers.

The story begins in **torch/utils/_sympy/functions.py**. Here, I make some changes to how we represent certain operations in sympy expressions:

* FloorDiv now only supports integer inputs; to do float floor division, do a truediv and then a trunc. Additionally, we remove the divide out addition by gcd optimization, because sympy gcd is over fields and is willing to generate rationals (but rationals are bad for ValueRange strict typing).
* ModularIndexing, LShift, RShift now assert they are given integer inputs.
* Mod only supports integer inputs; eventually we will support FloatMod (left for later work, when we build out Sympy support for floating operations). Unfortunately, I couldn't assert integer inputs here, because of a bad interaction with sympy's inequality solver that is used by the offline solver
* TrueDiv is split into FloatTrueDiv and IntTrueDiv. This allows for us to eventually generate accurate code for Python semantics IntTrueDiv, which is written in a special way to preserve precision when the inputs are >= 2**53 beyond what first coercing the integer to floats and then doing true division.
* Trunc is split to TruncToFloat and TruncToInt.
* Round is updated to return a float, not an int, making it consistent with the round op handler in Inductor. To get Python-style conversion to int, we call TruncToInt on the result.
* RoundDecimal updated to consistently only ever return a float
* Add ToFloat for explicit coercion to float (required so we can enforce strict ValueRanges typing)

In **torch/__init__.py**, we modify SymInt and SymFloat to appropriately call into new bindings that route to these refined sympy operations.  Also, we modify `torch.sym_min` and `torch.sym_max` to have promotion semantics (if one argument is a float, the return result is always a float), making them inconsistent with builtins.min/max, but possible to do type analysis without runtime information.

We also need to introduce some new op handlers in **torch/_inductor/ops_handler.py**:

* `to_int` for truncation to int64, directly corresponding to TruncToInt; this can be implemented by trunc and dtype, but with a dedicated handler it is more convenient for roundtripping in Sympy
* `int_truediv` for Python-style integer true division, which has higher precision than casting to floats and then running `truediv`

These changes have consequences. First, we need to make some administrative changes:

* Actually wire up these Sympy functions from SymInt/SymFloat in **torch/fx/experimental/sym_node.py**, including the new promotion rules (promote2)
* Add support for new Sympy functions in **torch/utils/_sympy/interp.py**, **torch/utils/_sympy/reference.py**
  * In particular, in torch.utils._sympy.reference, we have a strong preference to NOT do nontrivial compute, instead, everything in ops handler should map to a singular sympy function
  * TODO: I chose to roundtrip mod back to our Mod function, but I think I'm going to have to deal with the C/Python inconsistency this to fix tests here
* Add printer support for the Sympy functions in **torch/_inductor/codegen/common.py**, **torch/_inductor/codegen/cpp_utils.py**, **torch/_inductor/codegen/triton.py**. `int_truediv` and mixed precision equality is currently not implemented soundly, so we will lose precision in codegen for large values. TODO: The additions here are not exhaustive yet
* Update ValueRanges logic to use new sympy functions in **torch/utils/_sympy/value_ranges.py**. In general, we prefer to use the new Sympy function rather than try to roll things by hand, which is what was done previously for many VR analysis functions.

In **torch/fx/experimental/symbolic_shapes.py** we need to make some symbolic reasoning adjustments:

* Avoid generation of rational subexpressions by removing simplification of `x // y` into `floor(x / y)`. This simplification then triggers an addition simplification rule `(x + y) / c --> x / c + y / c` which is bad because x / c is a rational number now
* `_assert_bound_is_rational` is no more, we no longer generate rational bounds
* Don't intersect non-int value ranges with the `int_range`
* Support more sympy Functions for guard SYMPY_INTERP
* Assert the type of value range is consistent with the variable type

The new asserts uncovered necessary bug fixes:

* **torch/_inductor/codegen/cpp.py**, **torch/_inductor/select_algorithm.py**, **torch/_inductor/sizevars.py** - Ensure Wild/Symbol manually allocated in Inductor is marked `is_integer` so it's accepted to build expressions
* **torch/_inductor/utils.py** - make sure you actually pass in sympy.Expr to these functions
* **torch/_inductor/ir.py** - make_contiguous_strides_for takes int/SymInt, not sympy.Expr!
* **torch/export/dynamic_shapes.py** - don't use infinity to represent int ranges, instead use sys.maxsize - 1

Because of the removal of some symbolic reasoning that produced rationals, some of our symbolic reasoning has gotten worse and we are unable to simplify some guards. Check the TODO at **test/test_proxy_tensor.py**

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126905
Approved by: https://github.com/xadupre, https://github.com/lezcano
2024-06-04 11:47:32 +00:00
Kiuk Chung
4d0386ce1c [torch/jit-runtime] Add explicit include of <chrono> to torch/jit/run… (#127779)
Added an explicit include to `<chrono>` in `jit/runtime/logging.h` since `std::chrono::time_point<std::chrono::high_resolution_clock>` is directly referenced in the header.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127779
Approved by: https://github.com/albanD
2024-06-04 02:12:17 +00:00
Daniil Kutz
f343f98710 [jit] Validate mobile module fields parsed by flatbuffer loader (#127437)
Fixing error in `torch.jit.load` Python API function that cause crash in C-backend of PyTorch.
The mobile module is succesfully parsed from flatbuffer format, but its fields are used without any validation.

Fixes #127434

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127437
Approved by: https://github.com/davidberard98
2024-06-03 08:48:12 +00:00
cyy
8629f9b3f2 Remove more unused variables in tests (#127510)
Follows #127379

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127510
Approved by: https://github.com/Skylion007, https://github.com/r-barnes
2024-05-31 03:39:45 +00:00
cyy
67d52d7fcb [caffe2] Remove import_legacy.cpp (#126149)
I think they are for Caffe2 and should be deleted.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126149
Approved by: https://github.com/r-barnes
2024-05-24 19:47:32 +00:00
Richard Barnes
3f5b59eef4 [codemod] c10::optional -> std::optional in caffe2/aten/src/ATen/DeviceGuard.h +117 (#126901)
Summary:
Generated with
```
fbgs -f '.*\.(cpp|cxx|cc|h|hpp|cu|cuh)$' c10::optional -l | perl -pe 's/^fbsource.fbcode.//' | grep -v executorch | xargs -n 50 perl -pi -e 's/c10::optional/std::optional/g'
```

 - If you approve of this diff, please use the "Accept & Ship" button :-)

(117 files modified.)

Test Plan: Sandcastle

Reviewed By: palmje

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126901
Approved by: https://github.com/Skylion007, https://github.com/eqy
2024-05-24 00:26:15 +00:00
Richard Zou
f8857cef45 [Reland] Verify types in custom op schemas (#126861)
Summary:
co-dev reland of https://github.com/pytorch/pytorch/pull/124520, which requires
the removal of some executorch tests.

Before this PR, we didn't check that types in a schema were valid. This
is because TorchScript treats unknown types as type variables.

This PR checks types in a schema for the TORCH_LIBRARY APIs. To do this,
we add an `allow_typevars` flag to parseSchema so that TorchScript can
use allow_typevars=True. We also add some error messages for common
mistakes (e.g. using int64_t or double in schema).

Test Plan: Wait for tests

Differential Revision: D57666659

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126861
Approved by: https://github.com/albanD
2024-05-23 19:53:52 +00:00
cyy
e5db6758c8 [BE]: Use make_unique (#126966)
Adds make_unique in places

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126966
Approved by: https://github.com/Skylion007
2024-05-23 17:39:48 +00:00
Wei Han
74b053d7c4 Pass model path to observer (#126503)
Summary: Passing model path to observer so that they can get additional info if needed.

Test Plan: contbuild & OSS CI

Differential Revision: D57475129

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126503
Approved by: https://github.com/kirklandsign
2024-05-20 22:17:56 +00:00
David Berard
cb3b8cd0d3 Use object identity for deepcopy memo (#126126)
Copy of #126089, with some additional fixes & tests

Partial fix for #125635: previously, the deepcopy implementation would group together any tensors with any aliasing relationship and assign them to the same tensor. This was sort of good if you have two tensors `b = a.detach()`, because then if you deepcopy `list = [a, b]` to `list2 = list.deepcopy()`, then writes to `list2[0]` will also modify `list2[1]`. But for the most part, it's bad; (1) if you have `b = a.as_strided((4, 4), (16, 1), 16)`, then it'll make `b == a` in the deepcopied implementation, which is completely wrong; and (2) even if you have `b = a.detach()`, these are still initially two different tensors which become the same tensor after the old deepcopy implementation.

The new implementation only groups together tensors that have the same identity. This is a partial fix, but it's more reasonable. What changes:
* (becomes more correct): different views of the same base tensor will no longer all become equal after deepcopying
* (still kind of wrong): views won't actually alias each other after deepcopying.
* (arguably a minor regression): equivalent views of the same tensor will no longer be copied to the same tensor - so they won't alias.

BC breaking: C++ deepcopy interface changes from accepting `IValue::HashAliasedIValueMap memo` to accepting `IValue::HashIdentityIValueMap memo`. If there are objections, we can keep the old API. However, it seems likely that users generally won't try to deepcopy from C++.

Differential Revision: [D57406306](https://our.internmc.facebook.com/intern/diff/D57406306)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126126
Approved by: https://github.com/ezyang
2024-05-17 00:06:26 +00:00
Gustav Larsson
5862521ad1 [onnx.export] Cache SetGraphInputTypeReliable (#124912)
This PR is part of an effort to speed up torch.onnx.export (https://github.com/pytorch/pytorch/issues/121422).

- For each node that is processed in onnx.export, a check is run to see if all inputs are "reliable" (static shape, etc.). This value does not change, so it is much faster to cache it on the first computation. The caching is added to the ConstantMap state.
- Resolves (6) in #121422.
- Also see #123028 with a similar addition of a cache state.

(partial fix of #121545)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124912
Approved by: https://github.com/justinchuby
2024-05-16 18:48:56 +00:00
Gustav Larsson
31d22858e9 [onnx.export] Avoid unnecessary copy of debug_names (#123026)
This PR is part of an effort to speed up torch.onnx.export (#121422).

- The `auto debug_names = ` infers a copy, where as `const auto& debug_names` does not.
- However, this ones requires us to be careful, since calls to `setDebugName` changes `debug_names` and invalidates the `exist_name` iterator. So if we simply change `auto` to `const auto&`, then between that line and `find` we have corrupted the iterator by calling `output[i]->setDebugName`. This change aims to be functionally equivalent to the original, which is why we first get the Value pointer, then call `output[i]->setDebugName`, and finally call `setDebugName` on the found value. It is possible functionally it is OK to simply call `output[i]->setDebugName` first and then find and the second `setDebugName`, but this would not be identical to current behavior.
- Resolves (2) in #121422.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123026
Approved by: https://github.com/justinchuby
2024-05-15 20:58:18 +00:00
Gustav Larsson
44e47d5bd0 [onnx.export] Avoid linear loop over symbol_dim_map (#123029)
This PR is part of an effort to speed up torch.onnx.export (#121422).

- Doing a reverse look-up in `symbol_dim_map` incurs a linear cost in number of symbols. This happens for each node, so incurs a quadratic cost to the whole export.
- Add a reverse look-up `dim_symbol_map` that is kept in parallel of `symbol_dim_map`. This avoids a linear time look-up, which creates a quadratic export time complexity.
- This is a highly pragmatic solution. If someone more familiar with the code base has a better solution, I'm interested to hear about it.
- Resolves (9) in #121422.

(partial fix of #121422)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123029
Approved by: https://github.com/justinchuby
2024-05-15 17:22:30 +00:00
Mikayla Gawarecki
bbdbfe3661 Reland add write_record_metadata to PyTorchFileWriter (#126087)
Reland of https://github.com/pytorch/pytorch/pull/125184 with compiler warning fixed by extending `m_pWrite` rather than adding `m_pSeek` to miniz API

Differential Revision: [](https://our.internmc.facebook.com/intern/diff/)

Differential Revision: [D57287327](https://our.internmc.facebook.com/intern/diff/D57287327)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126087
Approved by: https://github.com/albanD
2024-05-14 21:48:44 +00:00
Richard Barnes
ed327876f5 [codemod] c10:optional -> std::optional (#126135)
Generated by running the following from PyTorch root:
```
find . -regex ".*\.\(cpp\|h\|cu\|hpp\|cc\|cxx\)$" | grep -v "build/" | xargs -n 50 -P 4 perl -pi -e 's/c10::optional/std::optional/'
```

`c10::optional` is just an alias for `std::optional`. This removes usages of that alias in preparation for eliminating it entirely.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126135
Approved by: https://github.com/Skylion007, https://github.com/malfet, https://github.com/albanD, https://github.com/aaronenyeshi
2024-05-14 19:35:51 +00:00
Gustav Larsson
50c3d58734 [onnx.export] Cache AllGraphInputsStatic (#123028)
This PR is part of an effort to speed up torch.onnx.export (#121422).

- The inputs (dynamic inputs and constants) do not change as as nodes are added and it is expensive to re-compute for every node. So, we cache this value so we avoid computing it for every node. Open to entirely other solution as well.
- Resolves (5) in #121422.

(partial fix of #121545)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123028
Approved by: https://github.com/justinchuby
2024-05-14 18:19:04 +00:00
Richard Barnes
b9e7b35912 Remove caffe2 from more build files (#125898)
Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125898
Approved by: https://github.com/Skylion007
2024-05-13 18:37:59 +00:00
Jeff Daily
ae9a4fa63c [ROCm] enforce ROCM_VERSION >= 6.0 (#125646)
Remove any code relying on ROCM_VERSION < 6.0.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125646
Approved by: https://github.com/albanD, https://github.com/eqy
2024-05-12 18:01:28 +00:00
Zhengxu Chen
1115a25c36 Add obc counter for TS migration. (#125986)
Summary: Since table caffe2_pytorch_usage_stats only has 1 day retention which renders it useless for TS migration purposes, we want to build a lightweight counter mechanism to collect usage data about torch jit APIs which can monitor the usage decline in the long term.

Test Plan: CI

Reviewed By: SherlockNoMad

Differential Revision: D57216847

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125986
Approved by: https://github.com/gmagogsfm
2024-05-11 05:14:02 +00:00
Gustav Larsson
52fad83335 [onnx.export] Avoid linear look up in env for exist_in_env (#124909)
This PR is part of a series of PRs to significantly speed up torch.onnx.export for models with many nodes (e.g. LLM). See #121422 for more analysis.

- As part of torch.onnx.export, a reverse look-up is made in env. This is done for each node, and this look-up costs in proportional to the graph size, which incurs and overall O(N^2) time complexity.
- A pragmatic solution is simply to keep a separate data structure to make this de facto constant time. So, this introduces a set containing all the values of env. Open to other ideas. Ideally `exist_in_env` wouldn't be needed at all, but to preserve current behavior exactly I'm not sure how that can be done.
- Resolves (4) in #121422.
- This code change and the choice of py::set looks a bit more natural on top of #123063, where the env is changed from a std::unordered_map to a py::dict.

Partially fixes #121422
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124909
Approved by: https://github.com/srikris-sridhar, https://github.com/justinchuby
2024-05-09 22:38:00 +00:00
Gustav Larsson
e5766f02d0 [onnx.export] Avoid dict <-> unordered_map implicit copies (#123063)
This PR is part of an effort to speed up torch.onnx.export (#121422).

- Avoid [implicit copy](https://pybind11.readthedocs.io/en/stable/advanced/cast/stl.html#automatic-conversion) between `pybind11::dict` and `std::unordered_map` that
  happens for every node that gets processed. The copy scales with N
  (number of nodes), so this creates a quadratic time complexity.
  Solution is to always use `pybind11::dict`.
- This alone speeds up exports by x2 for large models.
- Resolves (1) in #121422.

(partial fix of #121422)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123063
Approved by: https://github.com/justinchuby
2024-05-09 07:34:47 +00:00
Yu, Guangye
d17be10df1 make torch.amp.autocast more generic (#125103)
# Motivation
As discussed in [#124479](https://github.com/pytorch/pytorch/pull/124479), `torch.amp.autocast` can NOT be completely equivalent to `torch.cuda.amp.autocast` and `torch.cpu.amp.autocast` since `torch.amp.autocast` has NOT the default `dtype` for CPU (`torch.bfloat16` by default) and CUDA (`torch.float16` by default) respectively. We would like `torch.amp.autocast` to be more generic to help the developer/customer write the device-agnostic code. Because there are not enough reasons to add device-specific autocast `torch.xxx.amp.autocast` for each device backend.

# Solution
When `None` is passed to `dtype`, we should use `torch.get_autocast_dtype` to get the related dtype for each backend. Meanwhile, `torch.get_autocast_dtype` is necessary to be supported in JIT path for BC.

# Additional Context
With this PR, `torch.amp.autocast(device_type='cuda')` is equivalent to `torch.cuda.amp.autocast`.
Add two new UTs to cover this change in eager and jit path respectively.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125103
Approved by: https://github.com/albanD, https://github.com/jgong5, https://github.com/gujinghui
2024-05-08 12:13:26 +00:00
PyTorch MergeBot
ccbac091d2 Revert "Add write_record_metadata to PyTorchFileWriter (#125184)"
This reverts commit dd92637f44.

Reverted https://github.com/pytorch/pytorch/pull/125184 on behalf of https://github.com/izaitsevfb due to breaks internal builds, see D56962076 ([comment](https://github.com/pytorch/pytorch/pull/125184#issuecomment-2094976897))
2024-05-05 22:40:00 +00:00
Sergii Dymchenko
59abd1dccb Fix lint after PR 122611 (#125512)
Fix lint after https://github.com/pytorch/pytorch/pull/122611
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125512
Approved by: https://github.com/clee2000
2024-05-03 22:58:20 +00:00
Iosif Spulber
4abcf36dde Make c10::Error empty backtrace as an optional argument (#122611)
Summary: Split from the main diff in the stack.

Test Plan: Build validation should be enough.

Reviewed By: ezyang

Differential Revision: D55313410

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122611
Approved by: https://github.com/ezyang
2024-05-03 22:50:00 +00:00
Mikayla Gawarecki
dd92637f44 Add write_record_metadata to PyTorchFileWriter (#125184)
Add `PyTorchFileWriter.write_record_metadata(record_name, num_bytes)` that
- writes the zipfile header/end of central directory metadata for an entry*
- reserves `num_bytes` in the zipfile for the payload.

*Since the payload is not provided, the CRC32 computation is skipped and 0s are written in the corresponding entry of the zipfile header

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125184
Approved by: https://github.com/albanD
2024-05-03 07:29:52 +00:00
PyTorch MergeBot
ea347fa6ce Revert "Fix & optimze open device registration test. (#124712)"
This reverts commit f03cf9d4dc.

Reverted https://github.com/pytorch/pytorch/pull/124712 on behalf of https://github.com/kit1980 due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/124712#issuecomment-2086971499))
2024-04-30 20:00:37 +00:00
FFFrog
f03cf9d4dc Fix & optimze open device registration test. (#124712)
Fixes #100152

1. Fix the wrong tests about lazy init for PrivateUse1 named foo
2. Fix wrong backend meta registry mechanism when compiling with clang++( compiling with g++ work well)(introduced by static variable in inline function)
3. Refactor the tests and make it more flexible
4. Disable the two tests temporarily
    - test_open_device_storage_pin_memory
    - test_compile_autograd_function_aliasing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124712
Approved by: https://github.com/albanD, https://github.com/malfet
2024-04-29 18:55:38 +00:00
James Pang
43069c460e Correct check for Boolean list input type (#124899)
Summary:
This diff fixes a bug in PyTorch where when creating a tensor from a List of booleans, PyTorch was throwing an error.

This fix resolves that issue. All credit goes to swolchok for identifying the root cause of the issue and suggesting this fix.

Test Plan: Running our model end to end works as expected and no error occurs.

Differential Revision: D55990810

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124899
Approved by: https://github.com/zhxchen17
2024-04-26 22:25:43 +00:00