Commit Graph

2485 Commits

Author SHA1 Message Date
Tugsbayasgalan Manlaibaatar
36164265ae [export oncall] add some examples during oncall (#112445)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112445
Approved by: https://github.com/ydwu4
2023-10-31 18:33:03 +00:00
Devang Aggarwal
69b9e54d45 Add openvino backend into torch.compile docs (#112321)
The torch.compile [docs page](https://pytorch.org/docs/stable/torch.compiler.html) lists commonly used torch.compile backends. Recently, the OpenVINO backend for torch.compile was released; this PR adds it to that docs page.
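A minimal sketch of selecting the backend, assuming the `openvino` package is installed (installing it registers the backend with torch.compile):

```python
import torch

model = torch.nn.Linear(8, 8)  # hypothetical model
# "openvino" becomes available as a backend name once the openvino package is installed
compiled = torch.compile(model, backend="openvino")
out = compiled(torch.randn(1, 8))
```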

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112321
Approved by: https://github.com/msaroufim
2023-10-30 20:13:41 +00:00
PyTorch MergeBot
ace2713d1e Revert "Add torch.utils.deterministic.fill_uninitialized_memory flag (#111377)"
This reverts commit f1785373c0.

Reverted https://github.com/pytorch/pytorch/pull/111377 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/111377#issuecomment-1784179040))
2023-10-29 17:41:55 +00:00
agunapal
1460e5b7f5 updated aarch64 maintainers in docs (#112047)
This PR adds a new section for maintainers of `aarch64`.

Adding @snadampal to the list

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112047
Approved by: https://github.com/atalman
2023-10-27 21:09:36 +00:00
lezcano
47ccf04885 Split SymNode into its own file (#112037)
This PR:

- Moves TrueDiv, LShift, RShift, IsNonOverlappingAndDenseIndicator to `_sympy/functions.py`
- Moves SymNode to `fx.experimental.sym_node`.
  - This file does not have any SymPy dependencies at import time
  - It installs the magic methods in Sym{Bool,Int,Float}.
  - N.b. With this split, we may be able to move Sym{Bool,Int,Float} to this file, and remove quite a few of the hacks around these classes
- Imports `sym_node` in `torch/__init__.py` rather than the whole `symbolic_shapes.py`.
  This breaks the import-time dependency between torch and SymPy
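A quick way to check the import-time claim above (a sketch, assuming a fresh interpreter where SymPy has not been imported yet):

```python
import sys

import torch  # noqa: F401

# after this split, importing torch should not eagerly import SymPy
assert "sympy" not in sys.modules
```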

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112037
Approved by: https://github.com/peterbell10
ghstack dependencies: #112035, #112036
2023-10-26 23:32:27 +00:00
Kurt Mohler
f1785373c0 Add torch.utils.deterministic.fill_uninitialized_memory flag (#111377)
Part of #109802
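A minimal sketch of the flag in use (the behavior applies when deterministic algorithms are enabled):

```python
import torch

torch.use_deterministic_algorithms(True)
torch.utils.deterministic.fill_uninitialized_memory = True

# uninitialized memory is now filled with a known value
# (e.g. NaN for floating-point types) instead of arbitrary garbage
x = torch.empty(3)
```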

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111377
Approved by: https://github.com/albanD
2023-10-26 02:39:06 +00:00
eellison
7fe51e3e9b Add cudagraph_mark_step_begin in torch.compiler, reference in error message (#111722)
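A minimal sketch of the new API (assumes a CUDA device; the call marks an iteration boundary so the cudagraphs backend can safely reuse memory from the previous step):

```python
import torch

@torch.compile(mode="reduce-overhead")
def step(x):
    return x * 2

for _ in range(3):
    # tell the cudagraphs backend a new iteration is starting
    torch.compiler.cudagraph_mark_step_begin()
    step(torch.randn(4, device="cuda"))
```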
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111722
Approved by: https://github.com/ezyang, https://github.com/msaroufim
2023-10-25 21:53:21 +00:00
Mikayla Gawarecki
b54ab57522 Document torch.from_file and fix UntypedStorage.from_file docs (#111688)
Fixes https://github.com/pytorch/pytorch/issues/37439

Also threads through filename so it is accessible via `t.storage().filename`
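A minimal sketch of the documented behavior (the file name `data.bin` is hypothetical):

```python
import torch

# write some raw float32 data first
torch.arange(10, dtype=torch.float32).numpy().tofile("data.bin")

t = torch.from_file("data.bin", shared=True, size=10, dtype=torch.float32)
print(t.storage().filename)  # "data.bin", per this PR
```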

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111688
Approved by: https://github.com/albanD
2023-10-25 19:28:11 +00:00
Thiago Crepaldi
9d4dbebc34 Add support to ExportedProgram as input to torch.onnx.dynamo_export (#111497)
Fixes #109889

This PR adds `torch.export.export` as another `FXGraphExtractor` implementation. `torch.onnx.dynamo_export` automatically uses this new FX tracer when a `torch.export.ExportedProgram` is specified as `model`.

The implementation is backward compatible; non-`ExportedProgram` models are handled exactly as before.
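A minimal sketch of the new input type (hedged: exact argument handling for `ExportedProgram` inputs may differ):

```python
import torch

class M(torch.nn.Module):
    def forward(self, x):
        return x.relu()

x = torch.randn(2)
ep = torch.export.export(M(), (x,))
# an ExportedProgram can now be passed directly as `model`
onnx_program = torch.onnx.dynamo_export(ep, x)
```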
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111497
Approved by: https://github.com/BowenBao
2023-10-25 18:11:19 +00:00
PyTorch MergeBot
5120c97f32 Revert "Add support to ExportedProgram as input to torch.onnx.dynamo_export (#111497)"
This reverts commit 4f42edfb6e.

Reverted https://github.com/pytorch/pytorch/pull/111497 on behalf of https://github.com/huydhn due to Sorry for reverting your change, it is failing ONNX test in trunk 4f42edfb6e, possibly a landrace ([comment](https://github.com/pytorch/pytorch/pull/111497#issuecomment-1778519212))
2023-10-25 05:07:00 +00:00
Thiago Crepaldi
4f42edfb6e Add support to ExportedProgram as input to torch.onnx.dynamo_export (#111497)
Fixes #109889

This PR adds `torch.export.export` as another `FXGraphExtractor` implementation. `torch.onnx.dynamo_export` automatically uses this new FX tracer when a `torch.export.ExportedProgram` is specified as `model`.

The implementation is backward compatible; non-`ExportedProgram` models are handled exactly as before.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111497
Approved by: https://github.com/BowenBao
2023-10-25 00:17:43 +00:00
PyTorch MergeBot
e62c887bab Revert "[inductor][BE] split triton_meta and inductor_meta (#111397)"
This reverts commit 070b94dc08.

Reverted https://github.com/pytorch/pytorch/pull/111397 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/111397#issuecomment-1776282039))
2023-10-24 00:52:24 +00:00
Richard Zou
0ea9646cdd Rewrite torch.library's documentation (#111310)
We mention the higher-level torch.library APIs and put the original docs
into a low-level API section.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111310
Approved by: https://github.com/soulitzer
ghstack dependencies: #111380, #111659
2023-10-23 23:02:41 +00:00
Nikita Shulga
d22e5e4b52 Fix DDP notes (#111833)
Include `import os`, otherwise the sample is not syntactically correct. Reported in https://github.com/pytorch/pytorch.github.io/pull/1490

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111833
Approved by: https://github.com/wanchaol
2023-10-23 22:05:36 +00:00
David Berard
070b94dc08 [inductor][BE] split triton_meta and inductor_meta (#111397)
triton_meta is intended to be passed directly to triton. Previously we were also putting other metadata into triton_meta; we should split that other metadata into a separate dict to avoid possible conflicts in the future.

This PR splits out triton_meta and inductor_meta so we have a place to put additional metadata that isn't intended to be passed to triton.

Tests - wait for CI

Differential Revision: [D50442547](https://our.internmc.facebook.com/intern/diff/D50442547)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111397
Approved by: https://github.com/shunting314, https://github.com/eellison
2023-10-23 21:38:21 +00:00
ydwu4
f3d02d9ae6 Add support for sym_ite (#111440)
This PR supports sym_ite. This is useful for converting a SymBool to a SymInt in e.g. #109916. Internally, it uses sympy.Piecewise. We cannot use sympy.ITE because it expects the arguments and output to all be boolean, but we want to return a SymInt when converting a SymBool to a SymInt. So we use sympy.Piecewise to denote the symbolic relationship.

Note that this PR uses the range analysis for sympy.Piecewise implemented in https://github.com/pytorch/pytorch/blob/main/torch/utils/_sympy/value_ranges.py.
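A minimal illustration (on plain Python values `sym_ite` behaves like a ternary; with a SymBool it builds a symbolic Piecewise expression instead):

```python
import torch

print(torch.sym_ite(True, 1, 0))   # 1
print(torch.sym_ite(False, 1, 0))  # 0
```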

Test Plan:
See added test.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111440
Approved by: https://github.com/ezyang
2023-10-23 16:17:43 +00:00
eqy
894b9957c8 [DOCS][CUDA] Update TF32 docs for sm90 (#111337)
For #110252.
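For context, the TF32 switches covered by the updated docs (a sketch; TF32 applies on Ampere and newer GPUs, including sm90):

```python
import torch

torch.backends.cuda.matmul.allow_tf32 = True  # TF32 for matmuls
torch.backends.cudnn.allow_tf32 = True        # TF32 for cuDNN convolutions
```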
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111337
Approved by: https://github.com/msaroufim
2023-10-19 09:36:13 +00:00
PyTorch MergeBot
7a740e2b85 Revert "direct runtime assertions (#111262)"
This reverts commit e6d9350d7f.

Reverted https://github.com/pytorch/pytorch/pull/111262 on behalf of https://github.com/jeanschmidt due to Breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/111262#issuecomment-1765881675))
2023-10-17 08:04:36 +00:00
Chien-Chin Huang
19a6487ad4 [state_dict][6/N] Change API names to avoid conflict and simplify the API signatures (#111120)
`state_dict` is a very common variable name people use to represent a local
state_dict and `load_state_dict` conflicts with DCP's `load_state_dict`.

This PR changes `state_dict` to `get_state_dict`. `get_state_dict` is closer to what this API does -- users use the API to get the current state_dict for saving or for loading (passed to DCP for loading in-place).

This PR also changes `load_state_dict` to `set_state_dict`. `set_state_dict` is less ideal compared to `get_state_dict` but is symmetric. We can still change the API name before it goes to beta.

This PR also simplifies the API signatures. `model_only` is removed and `optim_only` only exists for `get_state_dict`.
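A minimal sketch of the renamed APIs on a plain module (keyword names are as of this stack and may have shifted since; distributed wrappers like FSDP/DDP are handled the same way):

```python
import torch
from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

model = torch.nn.Linear(4, 4)
optim = torch.optim.Adam(model.parameters())

model_sd, optim_sd = get_state_dict(model, optim)
set_state_dict(model, optim, model_state_dict=model_sd, optim_state_dict=optim_sd)
```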

Differential Revision: [D50213931](https://our.internmc.facebook.com/intern/diff/D50213931/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111120
Approved by: https://github.com/wz337
ghstack dependencies: #111106, #111107, #111275, #111109, #111110
2023-10-17 00:15:31 +00:00
Avik Chaudhuri
e6d9350d7f direct runtime assertions (#111262)
Previously we were generating a graph to add runtime assertions on inputs and then running that graph to check input constraints. This PR checks input constraints directly.

Differential Revision: D50289970

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111262
Approved by: https://github.com/zhxchen17
2023-10-15 05:15:09 +00:00
fduwjj
ff3d773dd9 [TP] Add deprecation warnings in the documentations for Pairwise parallel, sequence parallel and other prepare input/output functions (#111176)
As part of the TP UX improvements, we want to keep our API simple (not easy) so that users get the flexibility to do what they want, and to avoid an overly generic API that tries to solve everything and becomes too complicated. We are updating the doc accordingly.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111176
Approved by: https://github.com/wanchaol
ghstack dependencies: #111160, #111166
2023-10-15 00:39:24 +00:00
fduwjj
8085e08a84 [TP] Add prepareInput and output for input/output DTensor layout annotation in the parent module in TP API (#111166)
In some use cases, we found that users might want to annotate the input/output DTensor layouts on the parent module rather than on the submodule whose parameters are to be distributed. We therefore add these two classes so users can annotate input/output DTensor layouts, and we register pre-forward/forward hooks for the TP-lized module accordingly.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111166
Approved by: https://github.com/wanchaol
ghstack dependencies: #111160
2023-10-14 15:37:52 +00:00
Chien-Chin Huang
7c67139e7b [state_dict][3/N] Cleanup StateDictOptions, make it more readable (#111275)
This is a reland PR for https://github.com/pytorch/pytorch/pull/111108 with the proper docstring fix.

1. Rename DistributedStateDictOptions to StateDictOptions.
2. Remove cpu_offload as we have not yet required this option.
3. Rename save_frozen_parameters to ignore_frozen_params.

Differential Revision: [D50294352](https://our.internmc.facebook.com/intern/diff/D50294352/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111275
Approved by: https://github.com/wz337
ghstack dependencies: #111106, #111107
2023-10-14 15:34:52 +00:00
yewentao
c151163333 Documentation Clarification on torch.compile Example (#110942)
Fixes #110917
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110942
Approved by: https://github.com/msaroufim, https://github.com/malfet
2023-10-13 22:46:42 +00:00
Kaichao You
69dcbc02b0 [Dynamo]Expose bytecode hooks and add example usage for decompilation in docs (#110714)
Dynamo dynamically translates the bytecode of python functions, which is powerful but produces bytecode that is difficult to understand; most users cannot read python bytecode. Although a general-purpose way to decompile python bytecode into source code is very difficult, this work can be greatly simplified because Dynamo already cleans up the code: the bytecode generated by Dynamo is a reduced subset of well-behaved python bytecode.

I created a tiny decompiler for pytorch 2.0, named `depyf`: https://github.com/youkaichao/depyf .

There are several takeaways:

- **It supports python 3.7 - 3.11 (both inclusive), the same python versions supported by pytorch.** Since the main usage of this library is to understand pytorch 2.0, I plan to keep pace with pytorch. If pytorch supports a new python version, I can add support for it. (Actually, the core code is just about 1k lines. Adding support for a new version of python bytecode can be done in just several days.)
- **I have tested the correctness of the decompiled source code in torchbench.** I capture the modified bytecode generated by Dynamo, decompile it into source code, compile that back into new bytecode, and replace the Dynamo-generated bytecode with it. **It passed all the accuracy tests for timm models.** For huggingface models the situation is more complicated: all failed cases are caused by the compile step. Some functions use `__class__` as a closure variable, but the decompiler only sees the code object, so it has no way to figure out `__class__`, leading to a name error when compiling the decompiled code. That said, it passed the remaining tests that don't hit the `__class__` issue. Please see the log files https://cloud.tsinghua.edu.cn/f/685e4af8d930499baa7c/?dl=1 and https://cloud.tsinghua.edu.cn/f/cab89500e15e4b62890b/?dl=1 for details.

With the above efforts, I think it would be great to add an additional logging option in Dynamo: we can try to decompile the generated bytecode into source code, so that users can have a rough idea of what the modified bytecode does. It does not affect the workflow of Dynamo, but just adds more debug information.

An example code from the [doc](https://pytorch.org/docs/main/torch.compiler_deepdive.html):

```python
from typing import List
import torch
from torch import _dynamo as torchdynamo
def my_compiler(gm: torch.fx.GraphModule, example_inputs: List[torch.Tensor]):
    print("my_compiler() called with FX graph:")
    gm.graph.print_tabular()
    return gm.forward  # return a python callable

@torchdynamo.optimize(my_compiler)
def toy_example(a, b):
    x = a / (torch.abs(a) + 1)
    if b.sum() < 0:
        b = b * -1
    return x * b
for _ in range(100):
    toy_example(torch.randn(10), torch.randn(10))
```

Run with `export TORCH_LOGS="+dynamo,guards,bytecode"`.

Bytecode logging:

```
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG] ORIGINAL BYTECODE toy_example /Users/youkaichao/DeepLearning/depyf/ykc_test.py line 8
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]  10           0 LOAD_FAST                0 (a)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               2 LOAD_GLOBAL              0 (torch)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               4 LOAD_METHOD              1 (abs)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               6 LOAD_FAST                0 (a)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               8 CALL_METHOD              1
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              10 LOAD_CONST               1 (1)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              12 BINARY_ADD
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              14 BINARY_TRUE_DIVIDE
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              16 STORE_FAST               2 (x)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]  11          18 LOAD_FAST                1 (b)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              20 LOAD_METHOD              2 (sum)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              22 CALL_METHOD              0
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              24 STORE_FAST               3 (__temp_2)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]  12          26 LOAD_FAST                3 (__temp_2)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              28 LOAD_CONST               2 (0)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              30 COMPARE_OP               0 (<)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              32 POP_JUMP_IF_FALSE       21 (to 42)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]  13          34 LOAD_FAST                1 (b)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              36 LOAD_CONST               3 (-1)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              38 BINARY_MULTIPLY
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              40 STORE_FAST               1 (b)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]  14     >>   42 LOAD_FAST                2 (x)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              44 LOAD_FAST                1 (b)
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              46 BINARY_MULTIPLY
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              48 RETURN_VALUE
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 23:56:44,929] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG] MODIFIED BYTECODE toy_example /Users/youkaichao/DeepLearning/depyf/ykc_test.py line 8
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]   8           0 LOAD_GLOBAL              3 (__compiled_fn_0)
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               2 LOAD_FAST                0 (a)
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               4 LOAD_FAST                1 (b)
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               6 CALL_FUNCTION            2
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               8 UNPACK_SEQUENCE          2
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              10 STORE_FAST               2 (x)
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              12 POP_JUMP_IF_FALSE       12 (to 24)
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              14 LOAD_GLOBAL              4 (__resume_at_34_1)
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              16 LOAD_FAST                1 (b)
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              18 LOAD_FAST                2 (x)
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              20 CALL_FUNCTION            2
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              22 RETURN_VALUE
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]         >>   24 LOAD_GLOBAL              5 (__resume_at_42_2)
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              26 LOAD_FAST                1 (b)
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              28 LOAD_FAST                2 (x)
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              30 CALL_FUNCTION            2
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              32 RETURN_VALUE
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 23:56:44,930] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
```

New output with this PR:

```
[2023-10-06 16:25:21,535] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG] possible source code:
[2023-10-06 16:25:21,535] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG] def toy_example(a, b):
[2023-10-06 16:25:21,535] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]     __temp_1 = __compiled_fn_0(a, b)
[2023-10-06 16:25:21,535] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]     x = __temp_1[0]
[2023-10-06 16:25:21,535] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]     if __temp_1[1]:
[2023-10-06 16:25:21,535] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]         return __resume_at_34_1(b, x)
[2023-10-06 16:25:21,535] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]     return __resume_at_42_2(b, x)
[2023-10-06 16:25:21,535] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 16:25:21,535] [0/0] torch._dynamo.convert_frame.__bytecode: [DEBUG] If you find the decompiled code is wrong,please submit an issue at https://github.com/youkaichao/depyf/issues.
```

The remaining two logs (note the `possible source code:` output):

```
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG] ORIGINAL BYTECODE <resume in toy_example> /workspace/youkaichao/code/pytorch/ykc.py line 12
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]  12           0 JUMP_ABSOLUTE           22 (to 44)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               2 LOAD_FAST                2 (a)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               4 LOAD_GLOBAL              0 (torch)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               6 LOAD_ATTR                1 (abs)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               8 LOAD_FAST                2 (a)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              10 CALL_FUNCTION            1
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              12 LOAD_CONST               1 (1)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              14 BINARY_ADD
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              16 BINARY_TRUE_DIVIDE
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              18 STORE_FAST               1 (x)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              20 LOAD_FAST                0 (b)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              22 LOAD_ATTR                2 (sum)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              24 CALL_FUNCTION            0
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              26 STORE_FAST               3 (__temp_2)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              28 LOAD_FAST                3 (__temp_2)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              30 LOAD_CONST               2 (0)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              32 COMPARE_OP               0 (<)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              34 POP_JUMP_IF_FALSE       22 (to 44)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              36 LOAD_FAST                0 (b)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              38 LOAD_CONST               3 (-1)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              40 BINARY_MULTIPLY
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              42 STORE_FAST               0 (b)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]  14     >>   44 LOAD_FAST                1 (x)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              46 LOAD_FAST                0 (b)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              48 BINARY_MULTIPLY
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              50 RETURN_VALUE
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG] MODIFIED BYTECODE <resume in toy_example> /workspace/youkaichao/code/pytorch/ykc.py line 12
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]  12           0 LOAD_GLOBAL              3 (__compiled_fn_3)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               2 LOAD_FAST                0 (b)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               4 LOAD_FAST                1 (x)
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               6 CALL_FUNCTION            2
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               8 UNPACK_SEQUENCE          1
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              10 RETURN_VALUE
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 16:25:21,566] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 16:25:21,567] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG] possible source code:
[2023-10-06 16:25:21,567] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG] def <resume in toy_example>(b, x):
[2023-10-06 16:25:21,567] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]     return __compiled_fn_3(b, x)[0]
[2023-10-06 16:25:21,567] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 16:25:21,567] [1/0] torch._dynamo.convert_frame.__bytecode: [DEBUG] If you find the decompiled code is wrong,please submit an issue at https://github.com/youkaichao/depyf/issues.
```

```
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG] ORIGINAL BYTECODE <resume in toy_example> /workspace/youkaichao/code/pytorch/ykc.py line 12
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]  12           0 JUMP_ABSOLUTE           18 (to 36)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               2 LOAD_FAST                2 (a)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               4 LOAD_GLOBAL              0 (torch)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               6 LOAD_ATTR                1 (abs)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               8 LOAD_FAST                2 (a)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              10 CALL_FUNCTION            1
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              12 LOAD_CONST               1 (1)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              14 BINARY_ADD
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              16 BINARY_TRUE_DIVIDE
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              18 STORE_FAST               1 (x)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              20 LOAD_FAST                0 (b)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              22 LOAD_ATTR                2 (sum)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              24 CALL_FUNCTION            0
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              26 STORE_FAST               3 (__temp_2)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              28 LOAD_FAST                3 (__temp_2)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              30 LOAD_CONST               2 (0)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              32 COMPARE_OP               0 (<)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              34 POP_JUMP_IF_FALSE       22 (to 44)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]  13     >>   36 LOAD_FAST                0 (b)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              38 LOAD_CONST               3 (-1)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              40 BINARY_MULTIPLY
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              42 STORE_FAST               0 (b)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]  14     >>   44 LOAD_FAST                1 (x)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              46 LOAD_FAST                0 (b)
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              48 BINARY_MULTIPLY
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              50 RETURN_VALUE
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 16:25:21,579] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 16:25:21,580] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG] MODIFIED BYTECODE <resume in toy_example> /workspace/youkaichao/code/pytorch/ykc.py line 12
[2023-10-06 16:25:21,580] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]  12           0 LOAD_GLOBAL              3 (__compiled_fn_4)
[2023-10-06 16:25:21,580] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               2 LOAD_FAST                0 (b)
[2023-10-06 16:25:21,580] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               4 LOAD_FAST                1 (x)
[2023-10-06 16:25:21,580] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               6 CALL_FUNCTION            2
[2023-10-06 16:25:21,580] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]               8 UNPACK_SEQUENCE          1
[2023-10-06 16:25:21,580] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]              10 RETURN_VALUE
[2023-10-06 16:25:21,580] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 16:25:21,580] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 16:25:21,580] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG] possible source code:
[2023-10-06 16:25:21,580] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG] def <resume in toy_example>(b, x):
[2023-10-06 16:25:21,580] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]     return __compiled_fn_4(b, x)[0]
[2023-10-06 16:25:21,580] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG]
[2023-10-06 16:25:21,580] [2/0] torch._dynamo.convert_frame.__bytecode: [DEBUG] If you find the decompiled code is wrong,please submit an issue at https://github.com/youkaichao/depyf/issues.
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110714
Approved by: https://github.com/jansel
2023-10-13 12:36:00 +00:00
Zhengxu Chen
168bad5f23 [export] Reland "Fix graph signature data model to list of specs." (#111136)
Summary: reland D49876258

Test Plan: CI

Differential Revision: D50224384

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111136
Approved by: https://github.com/angelayi
2023-10-13 02:04:29 +00:00
Matthew Hoffman
ad4472833c define public API for torch.nn.utils (#111026)
Adding the modules imported here and the following functions to `__all__` (a usage sketch follows the list):
* [clip_grad_norm_](https://pytorch.org/docs/stable/generated/torch.nn.utils.clip_grad_norm_.html)
* [clip_grad_value_](https://pytorch.org/docs/stable/generated/torch.nn.utils.clip_grad_value_.html)
* [remove_weight_norm](https://pytorch.org/docs/stable/generated/torch.nn.utils.remove_weight_norm.html)
* [parameters_to_vector](https://pytorch.org/docs/stable/generated/torch.nn.utils.parameters_to_vector.html)
* [vector_to_parameters](https://pytorch.org/docs/stable/generated/torch.nn.utils.vector_to_parameters.html)
* [remove_spectral_norm](https://pytorch.org/docs/stable/generated/torch.nn.utils.remove_spectral_norm.html)
* [skip_init](https://pytorch.org/docs/stable/generated/torch.nn.utils.skip_init.html)
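A minimal usage sketch for one of the listed functions:

```python
import torch

model = torch.nn.Linear(4, 2)
model(torch.randn(3, 4)).sum().backward()

# now exported as part of the documented public API
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```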
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111026
Approved by: https://github.com/mikaylagawarecki
2023-10-12 23:05:23 +00:00
PyTorch MergeBot
42b89aea4b Revert "[export] Fix graph signature data model to list of specs. (#111017)"
This reverts commit 33b69509d3.

Reverted https://github.com/pytorch/pytorch/pull/111017 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/111017#issuecomment-1759292161))
2023-10-12 09:52:33 +00:00
Tugsbayasgalan Manlaibaatar
5614023f5e Move export.constrain_as_* to torch._constrain_as_* (#110757)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110757
Approved by: https://github.com/avikchaudhuri
ghstack dependencies: #109859
2023-10-12 05:37:44 +00:00
PyTorch MergeBot
6ce3a38050 Revert "Move export.constrain_as_* to torch._constrain_as_* (#110757)"
This reverts commit 5aee22e0e0.

Reverted https://github.com/pytorch/pytorch/pull/110757 on behalf of https://github.com/kit1980 due to Depends on https://github.com/pytorch/pytorch/pull/109859 that needs to be reverted ([comment](https://github.com/pytorch/pytorch/pull/110757#issuecomment-1758908371))
2023-10-12 04:53:29 +00:00
albanD
5e8be63e99 Allow specifiying inputs as GradientEdge in autograd APIs (#110867)
This can be useful for advanced users (like AOTAutograd) who don't want to keep the corresponding Tensor alive (for memory reasons, for example), or when an in-place op will change the Tensor's grad_fn but gradients with respect to the original value are needed.

I went with a minimal API change, but am open to suggestions.
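A minimal sketch of the idea (assumes `get_gradient_edge` from this stack; the edge stands in for the tensor without keeping it alive as an input):

```python
import torch
from torch.autograd.graph import get_gradient_edge

x = torch.randn(3, requires_grad=True)
y = (x * 2).sum()

edge = get_gradient_edge(x)  # capture the edge instead of passing x itself
(grad_x,) = torch.autograd.grad(y, inputs=edge)
```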

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110867
Approved by: https://github.com/soulitzer
2023-10-12 04:08:44 +00:00
Zhengxu Chen
33b69509d3 [export] Fix graph signature data model to list of specs. (#111017)
Summary:
Previously we designed the GraphSignature format as a bunch of input and output node names. After a discussion in the design meeting, we decided to change the format to make the signature more self-contained. Now the signature format looks like the following:
```
[
    InputSpec(
        kind=InputKind.USER_INPUT,
        arg=TensorArgument(name="arg0_1"),
        target=None,
    ),
    ...
]
```

Test Plan: CI

Reviewed By: angelayi

Differential Revision: D49876258

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111017
Approved by: https://github.com/angelayi
2023-10-12 03:39:04 +00:00
Kurt Mohler
5292a92e03 Add torch.unravel_index (#110580)
Fixes #35674
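A minimal example of the new op:

```python
import torch

# flat index 5 in a (2, 3) shape corresponds to coordinates (1, 2)
coords = torch.unravel_index(torch.tensor(5), (2, 3))
print(coords)  # (tensor(1), tensor(2))
```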

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110580
Approved by: https://github.com/lezcano, https://github.com/kulinseth
2023-10-12 00:55:51 +00:00
Michael Voznesensky
1e7947b3e0 Revert "Reland 3rd try [finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#109323)" + Forward fixes + test (#110964)
This reverts commit f786fbdebd.

Forward fixes

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110964
Approved by: https://github.com/ezyang, https://github.com/anijain2305
2023-10-11 05:16:47 +00:00
wz337
a614281ea9 Add current_device() to torch.cpu (#110987)
To better support device-agnostic code, add a "cpu" return for `current_device()` in torch.cpu so that we won't run into `AttributeError: module 'torch.cpu' has no attribute 'current_device'`.
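A minimal sketch of the behavior:

```python
import torch

print(torch.cpu.current_device())  # "cpu", instead of an AttributeError
```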

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110987
Approved by: https://github.com/wanchaol
2023-10-11 05:13:10 +00:00
Tugsbayasgalan Manlaibaatar
5aee22e0e0 Move export.constrain_as_* to torch._constrain_as_* (#110757)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110757
Approved by: https://github.com/avikchaudhuri
ghstack dependencies: #109859
2023-10-11 02:37:55 +00:00
soulitzer
c9eb8d8d90 Add set_checkpoint_debug_enabled that overrides local setting (#110728)
People access activation checkpointing through many layers of config, and it is not always guaranteed that all the layers of wrapping around checkpoint properly propagate all the kwargs, e.g. debug mode. This context manager offers an alternative way to enable debug mode that bypasses the need for every layer to propagate kwargs.
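A minimal sketch (the override applies to every checkpoint call in scope, regardless of the `debug` kwarg each wrapper passes):

```python
import torch
from torch.utils.checkpoint import checkpoint, set_checkpoint_debug_enabled

x = torch.randn(3, requires_grad=True)
with set_checkpoint_debug_enabled(True):
    out = checkpoint(torch.sin, x, use_reentrant=False)
```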
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110728
Approved by: https://github.com/albanD
ghstack dependencies: #110673, #110674, #110675, #110676
2023-10-11 02:12:31 +00:00
Jerry Zhang
7a69e3d30b [fx][subgraph_matcher] Add a matcher that supports name to node map (#110743)
Summary:
We want the matcher to return a name -> node map in the target graph so that we can refer to nodes by name; this is useful for downstream applications like quantization.

We can also use the torch API as the source of truth instead of matching the aten API directly.

Test Plan:
python test/fx/test_matcher_utils.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110743
Approved by: https://github.com/SherlockNoMad
2023-10-10 22:21:24 +00:00
angelayi
3704bf4ee8 [export] Update custom ops docs (#110492)
Updating the doc links in the custom ops documentation in export
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110492
Approved by: https://github.com/avikchaudhuri
2023-10-09 23:40:40 +00:00
Wanchao Liang
28d7d7fc42 device agnostic: torch.cpu.set_device (#110716)
To support device-agnostic code, add a dummy placeholder in torch.cpu.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110716
Approved by: https://github.com/albanD
2023-10-09 23:00:15 +00:00
Kazuaki Ishizaki
50bd252863 Fix typo the the (#110869)
This PR fixes the typo `the the` in comments and exception messages.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110869
Approved by: https://github.com/soulitzer
2023-10-09 19:32:45 +00:00
ydwu4
d84bcb9c8c [HigherOrderOp] expose torch.cond (#110293)
This PR exposes torch._higher_order_ops.cond as torch.cond (a usage sketch follows below).

1. We need to add #noqa: F811 to the _check calls in torch/__init__.py to silence a confusing linter error ("Redefinition of unused 'cond'"): only one cond is imported, and the flagged lines don't define cond but merely use it as an argument.
2. We also add cond to the list of names allowed to be traced through, so that dynamo triggers the CondHigherOrder logic instead of creating a TorchVariable.
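A minimal usage sketch of the exposed API:

```python
import torch

def true_fn(x):
    return x.sin()

def false_fn(x):
    return x.cos()

x = torch.randn(4)
out = torch.cond(x.sum() > 0, true_fn, false_fn, (x,))
```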

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110293
Approved by: https://github.com/zou3519
2023-10-07 20:39:52 +00:00
albanD
a0bbd075b2 Add the Mode section in the extending doc (#110073)
Covers the basic principles of Modes, their behavior, and an example of how to use them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110073
Approved by: https://github.com/janeyx99
2023-10-06 23:50:55 +00:00
PyTorch MergeBot
576b80d23e Revert "[HigherOrderOp] expose torch.cond (#110293)"
This reverts commit 601f872831.

Reverted https://github.com/pytorch/pytorch/pull/110293 on behalf of https://github.com/ydwu4 due to Sorry, didn't check the error carefully on the PR. A doc error is related to this pr ([comment](https://github.com/pytorch/pytorch/pull/110293#issuecomment-1751176719))
2023-10-06 17:44:17 +00:00
ydwu4
601f872831 [HigherOrderOp] expose torch.cond (#110293)
This PR exposes torch._higher_order_ops.cond as torch.cond.

1. We need to add #noqa: F811 to the _check calls in torch/__init__.py to silence a confusing linter error ("Redefinition of unused 'cond'"): only one cond is imported, and the flagged lines don't define cond but merely use it as an argument.
2. We also add cond to the list of names allowed to be traced through, so that dynamo triggers the CondHigherOrder logic instead of creating a TorchVariable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110293
Approved by: https://github.com/zou3519
2023-10-06 17:04:31 +00:00
albanD
c4db607607 Doc test non packages (#110568)
Add non-package python modules to the public API checks.
The original change is to remove the `ispkg` check in this line
https://github.com/pytorch/pytorch/blob/main/docs/source/conf.py#L518

Everything else adds the appropriate modules to the rst files, makes sure every module we provide can be imported (fixed by either making optional dependencies optional or just deleting files that have been un-importable for 3 years), makes APIs that are both modules and functions (like torch.autograd.gradcheck) render properly on the docs website without confusion, and adds every non-documented API to the allow list (~3k of them).

Next steps will be to try and fix these missing docs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110568
Approved by: https://github.com/zou3519
2023-10-06 14:16:01 +00:00
Banit Agrawal
64583c4d04 [CUDA Host Allocator] Add support of CudaHostRegister (#108488)
Summary: This diff adds another option to create CUDA pinned memory using cudaHostRegister.

Differential Revision: D45843715

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108488
Approved by: https://github.com/zdevito
2023-10-06 04:13:02 +00:00
Zhengxu Chen
be5dc3a00d [export] Update ArgumentSpec definition. (#110612)
Summary: Changing ArgumentSpec into a true union type in Python without changing serialization format.

Test Plan: CI

Differential Revision: D49871088

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110612
Approved by: https://github.com/angelayi
2023-10-06 03:14:45 +00:00
Angela Yi
a93337ed55 [export] Add ir spec (#110394)
Summary: Copied IR spec over from Executorch

Test Plan: _docs_

Differential Revision: D49829187

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110394
Approved by: https://github.com/ydwu4, https://github.com/gmagogsfm
2023-10-05 03:06:30 +00:00
ydwu4
6db3853eeb Add doc for torch.cond (#108691)
We add a doc for torch.cond. This PR is a replacement for https://github.com/pytorch/pytorch/pull/107977.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108691
Approved by: https://github.com/zou3519
2023-10-04 21:24:14 +00:00
Jerry Zhang
64416a1fc7 [quant][docs] Fix formatting (#110460)
Summary:
att

Test Plan:
check generated docs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110460
Approved by: https://github.com/andrewor14
2023-10-04 04:54:10 +00:00
Kazuaki Ishizaki
aa3629ee3e Fix typo under docs directory (#110359)
This PR fixes typos in `.rst` files under the docs directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110359
Approved by: https://github.com/kit1980
2023-10-03 16:36:05 +00:00
Jerry Zhang
28b3ff7974 [quant][pt2e][docs] Update main quant doc with pt2 export quantization information (#110260)
Summary:
att

Test Plan:
.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110260
Approved by: https://github.com/kimishpatel
2023-10-02 21:29:38 +00:00
Avik Chaudhuri
5da5e068f3 deprecate constraints in favor of dynamic_shapes (#110143)
Recently we updated the `export` API to take an experimental `dynamic_shapes` argument that was meant to subsume the existing `constraints` argument.

This PR deprecates `constraints` (with a warning on its use, but without actually removing it). Simultaneously it replaces all uses of `constraints` in docs, examples, and tests with corresponding uses of `dynamic_shapes` (preserving behavior). This exercise fortunately revealed some minor bugs in the implementation which have also been fixed in this PR.

Some uses of `constraints` still remain, e.g., when `torch._dynamo.export` is called directly. (Meta-internal uses will be updated in a separate diff.)
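A minimal sketch of the replacement in use:

```python
import torch
from torch.export import Dim, export

class M(torch.nn.Module):
    def forward(self, x):
        return x + 1

batch = Dim("batch")  # first-class named dynamic dimension
ep = export(M(), (torch.randn(4, 3),), dynamic_shapes={"x": {0: batch}})
```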

Differential Revision: D49676049

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110143
Approved by: https://github.com/tugsbayasgalan
2023-09-28 10:26:21 +00:00
Howard Huang
1ca68c971c distributed doc fix (#110157)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110157
Approved by: https://github.com/awgu
2023-09-28 01:34:02 +00:00
Nikita Shulga
58c33789c6 Fix governance.rst link rendering (#110171)
By adding `__` to the end of the link decorator according to https://sublime-and-sphinx-guide.readthedocs.io/en/latest/references.html#links-to-external-web-pages

Fixes regression introduced by https://github.com/pytorch/pytorch/pull/106863

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110171
Approved by: https://github.com/seemethere, https://github.com/msaroufim, https://github.com/atalman
2023-09-27 18:49:03 +00:00
Avik Chaudhuri
ebc7039bcb New export API with dynamic shape specifications instead of constraints (#108448)
Our experience using `constraints` / `dynamic_dim` with the existing export API has found it to be (subjectively) clunky and (objectively) verbose in common cases.

This PR implements a new design for the export API that replaces the use of `constraints` / `dynamic_dim` with a new way of specifying dynamic shapes, involving the following concepts:
* a constructor `Dim` for first-class named dynamic dimensions with ranges (similar to `functorch.dim`, and analogous to internal symbolic sizes)
* a mechanism that uses the above in `export` calls to associate inputs to their dynamic shape specifications (`dynamic_shapes`)

Design doc: https://docs.google.com/presentation/d/168U7XK72C_WSsZpGESP6Cho9udh193fi0gfjxCNcJ4E/edit#slide=id.p (Meta-only). Note that we only implement Option 1 in that doc. An older version of this PR also implemented Option 3, which is an alternative way of specifying dynamic shapes using tensor type annotations on the exported callable; but we have moved that to future work for now.

See docs for these new features in `torch.export`. The existing `torch.export.export` is modified to use the new API, `torch._export.export__RC__`, whenever `constraints=None`. We have not deprecated the existing API yet, but will do in a follow-up.

Constraint violation errors arising through use of the new API will now contain suggested fixes using the new API. No longer do we need to report all specializations for static dimensions and suggest all constraints over dynamic dimensions to fix such errors. Instead, due to the redesign, the suggested fixes are much more concise, only involving modifying the definitions of relevant `Dim`s.

Differential Revision: [D48919204](https://our.internmc.facebook.com/intern/diff/D48919204/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108448
Approved by: https://github.com/suo, https://github.com/gmagogsfm
2023-09-22 06:58:26 +00:00
Suraj Subramanian
d43f9f7707 Add redirect links to the contributor wiki (#106863)
* Update contribution guide links to the wiki page

---------

Co-authored-by: Svetlana Karslioglu <svekars@meta.com>
2023-09-21 22:01:20 -04:00
Edward Z. Yang
d38379f9f1 Update dynamic shapes documentation (#109764)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109764
Approved by: https://github.com/gchanan
2023-09-21 13:53:43 +00:00
lezcano
13bd4ed933 Add docs for torch.compile(numpy) (#109710)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109710
Approved by: https://github.com/ev-br, https://github.com/gchanan, https://github.com/peterbell10
2023-09-21 03:05:21 +00:00
Nikita Shulga
af867c2d14 [Docs] Fix compiler.list_backends invocation (#109568)
s/torch.compile.list_backends/torch.compiler.list_backends/

Fixes https://github.com/pytorch/pytorch/issues/109451

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109568
Approved by: https://github.com/msaroufim, https://github.com/svekars
2023-09-19 10:00:04 +00:00
JackCaoG
282aa26764 Update the instruction to enable dynamo logs (#109409)
```
   torch._dynamo.config.log_level = logging.INFO
   torch._dynamo.config.output_code = True
```

These were replaced with module-level log control in https://github.com/pytorch/pytorch/pull/94858.
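The Python-side equivalent of the `TORCH_LOGS` env var (a sketch using the module-level log control):

```python
import logging

import torch._logging

# replaces torch._dynamo.config.log_level / output_code
torch._logging.set_logs(dynamo=logging.INFO, output_code=True)
```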
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109409
Approved by: https://github.com/msaroufim
2023-09-18 17:49:40 +00:00
David Berard
b4ea3260d7 [JIT] Document torch.jit.interface (#109356)
A good option for replacing "Callable" types; we should document it so it's searchable.
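A minimal sketch of the feature being documented:

```python
import torch

@torch.jit.interface
class ModuleInterface(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pass

class Impl(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + 1

class Container(torch.nn.Module):
    impl: ModuleInterface  # any module matching the interface can be swapped in

    def __init__(self):
        super().__init__()
        self.impl = Impl()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.impl.forward(x)

scripted = torch.jit.script(Container())
```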
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109356
Approved by: https://github.com/eellison, https://github.com/gmagogsfm
2023-09-15 23:23:47 +00:00
Animesh Jain
f786fbdebd Reland 3rd try [finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#109323)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109323
Approved by: https://github.com/huydhn, https://github.com/voznesenskym
2023-09-15 08:44:14 +00:00
FFFrog
d4990ad5a1 Fix the example in the extending.func.rst (#109279)
As the title says, the `backward` function is missing the definitions of `ind` and `ind_inv`, which leads to an error when calling backward.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109279
Approved by: https://github.com/zou3519
2023-09-14 17:29:39 +00:00
Sahdev Zala
35aeb6aa85 Do not use a specific LOC in link (#108957)
The order of lines in a file can change, so a specific line number should not be used when creating a link. A specific line is also not needed here given the function name, as used generally in the overall documentation.
Previously, a fix was provided by updating the line number for the mentioned issue in this PR, but the line eventually moved again, resulting in a broken link.

Fixes #102183

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108957
Approved by: https://github.com/ezyang
2023-09-13 19:21:45 +00:00
Yanan Cao
a09539f454 Add torch.export.register_dataclass API (#109152)
`register_dataclass` allows dataclasses to be used as valid input/output types of torch.export.export.
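A minimal sketch of the API (the `Inputs` dataclass is hypothetical):

```python
import torch
from dataclasses import dataclass

@dataclass
class Inputs:
    x: torch.Tensor
    y: torch.Tensor

torch.export.register_dataclass(Inputs)

class M(torch.nn.Module):
    def forward(self, inp: Inputs):
        return inp.x + inp.y

ep = torch.export.export(M(), (Inputs(torch.randn(2), torch.randn(2)),))
```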

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109152
Approved by: https://github.com/ydwu4
2023-09-13 04:17:12 +00:00
Michael Voznesensky
55a204ebc8 [Easy] log graphs in compiled_autograd if TORCH_LOGS=compiled_autograd (#108991)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108991
Approved by: https://github.com/ezyang
ghstack dependencies: #108846
2023-09-12 00:15:02 +00:00
PyTorch MergeBot
56c2386157 Revert "reland [finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#108883)"
This reverts commit d4230e5574.

Reverted https://github.com/pytorch/pytorch/pull/108883 on behalf of https://github.com/huydhn due to Per the discussion thread on D49122208, reverting this change ([comment](https://github.com/pytorch/pytorch/pull/108883#issuecomment-1712707853))
2023-09-10 04:40:02 +00:00
angelayi
2b138e4f7d [export] torch.export landing page (#108783)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108783
Approved by: https://github.com/avikchaudhuri, https://github.com/gmagogsfm
2023-09-10 01:40:42 +00:00
Animesh Jain
d4230e5574 reland [finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#108883)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108883
Approved by: https://github.com/voznesenskym, https://github.com/huydhn
2023-09-09 03:12:31 +00:00
Thiago Crepaldi
7b3efeaf42 Follow-up #108379 (#108905)
Fixes #108379

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108905
Approved by: https://github.com/abock
2023-09-09 01:38:36 +00:00
Thiago Crepaldi
aa3355da8a Refactor torch.onnx documentation (#108379)
* Distinguish between the TorchScript-based exporter (`torch.onnx.export`) and the TorchDynamo-based exporter (`torch.onnx.dynamo_export`)
* Merge ONNX diagnostics page with the exporter page
* Add initial version of a quick overview on the new exporter
* Updates `torch.compiler.html` with the right page for the ONNX Runtime backend for `torch.compile`
* Renamed doc files to clearly identify files belonging to the legacy and newer onnx exporters

Fixes #108274

https://docs-preview.pytorch.org/pytorch/pytorch/108379/index.html
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108379
Approved by: https://github.com/justinchuby, https://github.com/wschin, https://github.com/malfet
2023-09-08 18:23:48 +00:00
PyTorch MergeBot
72f24d0001 Revert "[dynamo][finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#108528)"
This reverts commit 34bb74c4cf.

Reverted https://github.com/pytorch/pytorch/pull/108528 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it has some nasty merge conflicts after the revert of D48910794. I need to revert this so the conflict could be resolved. Please help rebase this tomorrow and reland the change ([comment](https://github.com/pytorch/pytorch/pull/108528#issuecomment-1711034781))
2023-09-08 03:49:41 +00:00
Animesh Jain
34bb74c4cf [dynamo][finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#108528)
**This PR is a 99% copy paste of Sam Gross** (@colesbury) work at https://github.com/pytorch/pytorch/pull/100642. Copied from there

--------
The NN_MODULE guard now subsumes guards on Module attributes. The check_fn will fail if module attributes are changed (such as Module.training), if parameters, submodules, or buffers are added or removed, or if fields are changed on the type itself.

This gives up specificity in the guard check -- if any field is changed the check_fn fails -- for faster overall checks.

-----

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108528
Approved by: https://github.com/ezyang
2023-09-07 01:45:47 +00:00
Sherlock Huang
bee7e78130 [PT2 Inference] Prototype of Inference Runtime (#108482)
Summary:
This diff demonstrates a simplified E2E workflow for PT2 Inference stack:
1. Model authoring with `torch.export()`
2. Model processing with `aot_inductor.compile()`
3. Model serving with a new Inference Runtime API, named `ModelRunner`

`torch.export()` and `aot_inductor.compile()` produces a zip file using `PyTorchStreamWriter`.
Runtime reads the zip file with `PyTorchStreamReader`.
The zip file contains
 {F1080328179}
More discussion on packaging can be found in https://docs.google.com/document/d/1C-4DP5yu7ZhX1aB1p9JcVZ5TultDKObM10AqEtmZ-nU/edit?usp=sharing

Runtime can now switch between two Execution modes:
1. Graph Interpreter mode, implemented based on Sigmoid's Executor
2. AOTInductor mode, implemented based on FBAOTInductorModel

Test Plan:
buck2 run  mode/dev-nosan mode/inplace -c fbcode.enable_gpu_sections=True //sigmoid/inference/test:e2e_test

Export and Lower with AOTInductor
buck2 run mode/dev-sand mode/inplace -c fbcode.enable_gpu_sections=True sigmoid/inference:export_package

Run with GraphInterpreter and AOTInductor
buck2 run mode/dev-nosan //sigmoid/inference:main

Reviewed By: suo

Differential Revision: D47781098

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108482
Approved by: https://github.com/zhxchen17
2023-09-06 19:28:58 +00:00
Jing Xu
aa89f0a1fd [Doc] Move Dynamo IPEX backend to training/inference category (#108643)
As title.
Since the dynamo IPEX backend supports training, move it to the training/inference category above.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108643
Approved by: https://github.com/msaroufim
2023-09-06 15:57:12 +00:00
Thiago Crepaldi
b1729d8bbe Fix doc preview page url at CONTRIBUTING.md (#108580)
The URL for previewing documentation directly on a PR has changed, and CONTRIBUTING.md was outdated. There is also a minor fix to a non-existent document URL.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108580
Approved by: https://github.com/svekars, https://github.com/kit1980
2023-09-05 20:17:55 +00:00
kshitij12345
a74f50d524 torch.compile-functorch interaction: update docs (#108130)
Doc Preview: https://docs-preview.pytorch.org/pytorch/pytorch/108130/torch.compiler_faq.html#torch-func-works-with-torch-compile-for-grad-and-vmap-transforms

Will also cherry-pick this for release branch.
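
For reference, a minimal illustration of the documented interaction (illustrative only; depending on the PyTorch version this may compile end-to-end or fall back with graph breaks):

```python
import torch
from torch.func import grad, vmap

def f(x):
    return torch.sin(x).sum()

compiled_grad = torch.compile(grad(f))
print(compiled_grad(torch.randn(3)))

compiled_vmap = torch.compile(vmap(torch.sin))
print(compiled_vmap(torch.randn(4, 3)))
```
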
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108130
Approved by: https://github.com/zou3519
2023-09-05 18:24:08 +00:00
youkaichao
ba9acbebfc [Doc] Update the dynamo deepdive doc (#108147)
With a new tool, `depyf`, that decompiles bytecode into human-readable source code, understanding dynamo becomes much easier.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108147
Approved by: https://github.com/jansel
2023-09-03 13:08:13 +00:00
Pritam Damania
704b0b3c67 [RESUBMIT] Standardize on error types for distributed errors. (#108191)
We have a plethora of error types for various errors raised from c10d. These include `RuntimeError`, `TimeoutError`, `SocketError`, `DistBackendError` etc.

This results in messy code during error handling somewhat like this:
```
if "NCCL" in exception_str:
  ...
if "Timed out initializing process group in store based barrier on rank" in exception_str:
  ...
if "The client socket has timed out after" in exception_str:
  ...
if "Broken pipe" in exception_str:
  ...
if "Connection reset by peer" in exception_str:
  ...
```

To address this issue, in this PR I've added these error types:

1. **DistError** - the base type of all distributed errors
2. **DistBackendError** - this already existed and referred to PG backend errors
3. **DistStoreError** - for errors originating from the store
4. **DistNetworkError** - for general network errors coming from the socket library
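
A hedged sketch of what error handling can then look like, assuming the new classes are exposed under `torch.distributed` next to the existing `DistBackendError`; exception types replace substring matching:

```python
import torch.distributed as dist

try:
    dist.init_process_group(backend="nccl")
except dist.DistNetworkError:
    ...  # socket-level failures ("Broken pipe", "Connection reset by peer", ...)
except dist.DistStoreError:
    ...  # store-based barrier / TCPStore failures
except dist.DistBackendError:
    ...  # NCCL / Gloo backend errors
except dist.DistError:
    ...  # any other distributed error
```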

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108191
Approved by: https://github.com/H-Huang
2023-08-30 21:47:39 +00:00
Jane Xu
fa49be2a49 [docs] Properly link register_post_accumulate_grad_hook docs (#108157)
it shows up now

![image](https://github.com/pytorch/pytorch/assets/31798555/0aa86839-b9c5-4b4b-b1b1-aa1c0c0abbab)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108157
Approved by: https://github.com/soulitzer, https://github.com/albanD
2023-08-29 22:13:33 +00:00
PyTorch MergeBot
d4ff06ec84 Revert "Standardize on error types for distributed errors. (#107651)"
This reverts commit 0e2317479b.

Reverted https://github.com/pytorch/pytorch/pull/107651 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is failing inductor test in trunk for one of its model moco ([comment](https://github.com/pytorch/pytorch/pull/107651#issuecomment-1696578138))
2023-08-28 23:58:33 +00:00
Pritam Damania
0e2317479b Standardize on error types for distributed errors. (#107651)
We have a plethora of error types for various errors raised from c10d. These include `RuntimeError`, `TimeoutError`, `SocketError`, `DistBackendError` etc.

This results in messy code during error handling somewhat like this:
```
if "NCCL" in exception_str:
  ...
if "Timed out initializing process group in store based barrier on rank" in exception_str:
  ...
if "The client socket has timed out after" in exception_str:
  ...
if "Broken pipe" in exception_str:
  ...
if "Connection reset by peer" in exception_str:
  ...
```

To address this issue, in this PR I've added these error types:

1. **DistError** - the base type of all distributed errors
2. **DistBackendError** - this already existed and referred to PG backend errors
3. **DistStoreError** - for errors originating from the store
4. **DistNetworkError** - for general network errors coming from the socket library
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107651
Approved by: https://github.com/H-Huang
2023-08-28 21:58:15 +00:00
Aaron Bockover
15e5bd5103 [ONNX] Support torch.compile(backend="onnxrt", options=OrtBackendOptions(...)) (#107973)
This reworks the DORT backend factory function to support the options kwarg of torch.compile, and defines a concrete OrtBackendOptions type that can be used to influence the backend.

Caching is also implemented in order to reuse backends with equal options.

Wrapping the backend in auto_autograd also becomes an option, which allows `OrtBackend` to always be returned as the callable for torch.compile; wrapping happens internally if opted into (True by default).

Lastly, expose options for configuring preferred execution providers (they will be attempted first), whether or not to attempt to infer an ORT EP from a torch device found in the graph or inputs, and finally the default/fallback EPs.

### Demo

The following demo runs `Gelu` through `torch.compile(backend="onnxrt")` using various backend options through a dictionary form and a strongly typed form. It additionally exports the model through both the ONNX TorchScript exporter and the new TorchDynamo exporter.

```python
import math

import onnx.inliner
import onnxruntime
import torch
import torch.onnx

torch.manual_seed(0)

class Gelu(torch.nn.Module):
    def forward(self, x):
        return x * (0.5 * torch.erf(math.sqrt(0.5) * x) + 1.0)

@torch.compile(
    backend="onnxrt",
    options={
        "preferred_execution_providers": [
            "NotARealEP",
            "CPUExecutionProvider",
        ],
        "export_options": torch.onnx.ExportOptions(dynamic_shapes=True),
    },
)
def dort_gelu(x):
    return Gelu()(x)

ort_session_options = onnxruntime.SessionOptions()
ort_session_options.log_severity_level = 0

dort_gelu2 = torch.compile(
    Gelu(),
    backend="onnxrt",
    options=torch.onnx._OrtBackendOptions(
        preferred_execution_providers=[
            "NotARealEP",
            "CPUExecutionProvider",
        ],
        export_options=torch.onnx.ExportOptions(dynamic_shapes=True),
        ort_session_options=ort_session_options,
    ),
)

x = torch.randn(10)

torch.onnx.export(Gelu(), (x,), "gelu_ts.onnx")

export_output = torch.onnx.dynamo_export(Gelu(), x)
export_output.save("gelu_dynamo.onnx")
inlined_model = onnx.inliner.inline_local_functions(export_output.model_proto)
onnx.save_model(inlined_model, "gelu_dynamo_inlined.onnx")

print("Torch Eager:")
print(Gelu()(x))

print("DORT:")
print(dort_gelu(x))
print(dort_gelu2(x))
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107973
Approved by: https://github.com/BowenBao
2023-08-26 18:20:18 +00:00
Pearu Peterson
c5ad44be1d Add torch.sparse.as_sparse_gradcheck decorator of gradcheck that allows gradcheck input function to receive and return sparse tensors (#107150)
Compared to #104848, this PR goes a step further: when the enable_sparse_support decorator is applied to `torch.autograd.gradcheck`, the resulting callable is equivalent to `torch.autograd.gradcheck` with the extra feature of supporting functions that can receive sparse tensors as inputs and/or return sparse tensors.

At the same time, the underlying call to `torch.autograd.gradcheck` will operate on strided tensors only. This basically means that torch/autograd/gradcheck.py can be cleaned up by removing the code that deals with sparse tensors.
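
A minimal sketch of the intended usage, assuming the decorator is exposed as `torch.sparse.as_sparse_gradcheck` per the title (the checked function here is a toy example):

```python
import torch

# Wrap the stock gradcheck so it accepts sparse inputs directly.
gradcheck = torch.sparse.as_sparse_gradcheck(torch.autograd.gradcheck)

def fn(x):
    return x.to_dense().sum()

x = torch.eye(3, dtype=torch.float64).to_sparse_coo().requires_grad_()
assert gradcheck(fn, (x,))
```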

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107150
Approved by: https://github.com/albanD, https://github.com/amjames, https://github.com/cpuhrsch
ghstack dependencies: #107638, #107777
2023-08-26 07:24:31 +00:00
BowenBao
25d98a3e3b [ONNX] Remove API reference for TorchScript export diagnostics (#107979)
Remove both api reference and rules specific to TorchScript ONNX export. The page should display only info related to `torch.onnx.dynamo_export` diagnostics.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107979
Approved by: https://github.com/justinchuby
2023-08-26 00:52:59 +00:00
gmagogsfm
9af0e47653 Hide transform method by renaming it (#107940)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107940
Approved by: https://github.com/tugsbayasgalan
2023-08-25 16:31:44 +00:00
gmagogsfm
39854df1d3 Make validate private by renaming validate to _validate (#107927)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107927
Approved by: https://github.com/tugsbayasgalan
2023-08-25 08:14:56 +00:00
gmagogsfm
bfb09204bd Expose torch.export.{save,load} APIs (#107888)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107888
Approved by: https://github.com/angelayi
2023-08-25 06:06:36 +00:00
gmagogsfm
7dd1113463 Expose ExportedProgram and related classes (#107852)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107852
Approved by: https://github.com/zhxchen17, https://github.com/angelayi
2023-08-25 00:07:00 +00:00
Digant Desai
8a7a6867b9 [PyTorch][Tensor] Introduce tensor.dim_order (#106835)
Summary:
This is a stride-based attribute for a tensor, available in Python.

This can help inspect tensors generated using `torch.empty_permuted(.., physical_layout, ...)`, where physical_layout should match the dim_order returned here. `empty_permuted` will be renamed to use dim_order as the param name in the future. It will also help the ExecuTorch export pipeline with implementing dim_order based tensors.
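
For illustration, what the new attribute reports for a channels-last tensor:

```python
import torch

x = torch.empty(2, 3, 4, 5).to(memory_format=torch.channels_last)
# Logical sizes stay NCHW, but the physical layout is NHWC, so the
# dimensions ordered from outermost to innermost stride are:
print(x.dim_order())  # (0, 2, 3, 1)
```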

Differential Revision: D48134476

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106835
Approved by: https://github.com/ezyang
2023-08-25 00:06:03 +00:00
Zachary DeVito
40cbda274b document memory snapshotting (#107660)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107660
Approved by: https://github.com/albanD
ghstack dependencies: #107171, #107399
2023-08-24 19:20:03 +00:00
angelayi
6ec2ec845c [exportdb] Fix generating docs (#107838)
Previously I accidentally replaced all `=` with `-`, resulting in clowny code rendering like:
![image](https://github.com/pytorch/pytorch/assets/10901756/738eaf92-8cc6-43bd-b531-224ec44afa9f)

The purpose of replacing the `=` with `-` is to change the RST heading size of modules. So now, I replace strings with more than 3 `=`'s with `-`. This should avoid incorrectly replacing code where we set variables with `=` and do equality checks with `==`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107838
Approved by: https://github.com/gmagogsfm
2023-08-24 06:32:51 +00:00
gmagogsfm
f8119f8bda Move Constraint class to torch.export() to avoid circular dependency in _dynamo package (#107750)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107750
Approved by: https://github.com/tugsbayasgalan
2023-08-24 03:07:28 +00:00
gmagogsfm
652ccfadc1 Expose torch.export.constrain_as_{size,value} APIs (#107735)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107735
Approved by: https://github.com/avikchaudhuri
2023-08-23 20:13:40 +00:00
PyTorch MergeBot
ecde622649 Revert "reseed all Generators in Dataloader's _worker_loop() -- via GC (#107131)"
This reverts commit 42625da5e1.

Reverted https://github.com/pytorch/pytorch/pull/107131 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/107131#issuecomment-1690325745))
2023-08-23 17:08:07 +00:00
gmagogsfm
137d96a26e Expose torch.export.dynamic_dim() API (#107635)
With updated doc

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107635
Approved by: https://github.com/avikchaudhuri
2023-08-22 18:40:49 +00:00
Jane Xu
515aa993e3 Document post acc grad hooks in backward hooks execution (#107323)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107323
Approved by: https://github.com/soulitzer, https://github.com/albanD
2023-08-22 18:37:03 +00:00
Alexander Jipa
2e054037da fixing named tensor unflatten example (#106921)
Fixes an example from the documentation [here](https://pytorch.org/docs/stable/named_tensor.html#manipulating-dimensions).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106921
Approved by: https://github.com/zou3519
2023-08-22 18:00:10 +00:00
gmagogsfm
bbb216bca4 Move torch.export() to torch.export.export() (#107609)
New plan:

torch.export.export() as the main API

All other utilities will be torch.export.foo_utilities
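
A minimal example of the renamed entry point (module and inputs are arbitrary):

```python
import torch

class M(torch.nn.Module):
    def forward(self, x):
        return x.sin() + x.cos()

# previously torch.export(...); the main API now lives in the torch.export namespace
ep = torch.export.export(M(), (torch.randn(4),))
print(ep.graph)
```
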
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107609
Approved by: https://github.com/tugsbayasgalan, https://github.com/msaroufim
2023-08-22 00:38:32 +00:00
moto
a250cc9bd7 Update persons_of_interest.rst (#107592)
Updating the state of PyTorch Audio.

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107592
Approved by: https://github.com/cpuhrsch
2023-08-21 20:01:46 +00:00
Chien-Chin Huang
7ba513b6e4 [FSDP][state_dict] Expose optimizer state_dict config (#105949)
Optimizer state_dict configs are not exposed. This PR exposes the two dataclasses.

Differential Revision: [D47766024](https://our.internmc.facebook.com/intern/diff/D47766024/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105949
Approved by: https://github.com/rohan-varma
2023-08-21 07:29:49 +00:00
Nicolas Hug
42625da5e1 reseed all Generators in Dataloader's _worker_loop() -- via GC (#107131)
Alternative to https://github.com/pytorch/pytorch/pull/107034, implements @ezyang 's suggestion from https://github.com/pytorch/pytorch/pull/107034#discussion_r1292857201.

This PR addresses https://fb.workplace.com/groups/pytorch.oss.dev/posts/1699944830430051 and does a bunch of stacked changes:

- Make the `Generator` class support GC; this makes all `Generator` instances tracked and accessible through Python's GC.
- Use the GC to retrieve all existing Generator instances in Dataloader's `_worker_loop` and re-seed them: this extends what is already applied to the global/default Generator, which is already re-seeded.

~TODO: a bit of docs and justification, which I'll do if this PR is mergeable.~ -- Done

CC @albanD @ezyang  as previously discussed

BC-Breaking Note
-------------------

We now re-seed all `Generator` instances within the `Dataloader` workers' loop to ensure that their RNG is different across workers.
Previously, the RNG of user-defined `Generators` would be the same across workers, which could lead to wrong training procedures. This only affects user-defined `Generators`, not the default `Generator` (which was already re-seeded).
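
A toy illustration of the scenario this affects (the dataset below is made up): a user-defined `Generator` held by the dataset is now re-seeded per worker, so workers stop producing identical random streams.

```python
import torch
from torch.utils.data import DataLoader, Dataset

class RandomDataset(Dataset):
    def __init__(self):
        self.gen = torch.Generator()  # user-defined, not the global default Generator

    def __len__(self):
        return 4

    def __getitem__(self, idx):
        # After this change each worker re-seeds self.gen in _worker_loop,
        # so values differ across workers instead of repeating.
        return torch.rand(1, generator=self.gen)

if __name__ == "__main__":
    loader = DataLoader(RandomDataset(), num_workers=2)
    print(list(loader))
```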

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107131
Approved by: https://github.com/ezyang
2023-08-18 10:23:23 +00:00
Alexander Pivovarov
35b2b3ee47 Fix rst formatting in torch.compiler_troubleshooting.rst (#107360)
Fix some rst formatting - mostly around ``.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107360
Approved by: https://github.com/kit1980
2023-08-18 01:04:24 +00:00
Alexander Pivovarov
a98f745c80 Use compiled model in torch.compiler_get_started (#107267)
- Text says `Next, let’s try a real model like resnet50 from the PyTorch` but the code example uses `resnet18`. Fixed code to use `resnet50` for consistency.
- One of the examples in the TorchDynamo Overview used an uncompiled model - fixed it - now it uses the compiled model.
- Removed an unused import of `_dynamo` in one of the examples
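
For reference, the corrected get-started snippet looks roughly like this (assuming torchvision is installed; the input shape is illustrative):

```python
import torch
import torchvision.models as models

model = models.resnet50()
opt_model = torch.compile(model, backend="inductor")
opt_model(torch.randn(1, 3, 64, 64))
```
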
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107267
Approved by: https://github.com/soulitzer
2023-08-17 09:26:54 +00:00
Alexander Pivovarov
11e366943d Fix rst formatting in dynamo/guards-overview doc (#107275)
Fix rst formatting in dynamo/guards-overview doc
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107275
Approved by: https://github.com/soulitzer
2023-08-17 09:04:44 +00:00
fduwjj
983fd5ba79 [2D][TP] Enable DDP TP integration with unit test (#106583)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106583
Approved by: https://github.com/kumpera, https://github.com/fegin, https://github.com/wanchaol
ghstack dependencies: #107313
2023-08-17 02:54:17 +00:00
gmagogsfm
ddba7a5a55 Expose torch.export() API (#106904)
Other class definitions and utilities will be moved in subsequent PRs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106904
Approved by: https://github.com/avikchaudhuri
2023-08-16 10:47:26 +00:00
BowenBao
19a76290d8 [ONNX] Public diagnostic options for 'dynamo_export' (#106741)
Generate diagnostic reports to monitor the internal stages of the export process. This tool aids in unblocking model exports and debugging the exporter.

#### Settings

~~1. Choose if you want to produce a .sarif file and specify its location.~~
1. Updated: saving .sarif file should be done by `export_output.save_sarif_log(dst)`, similar to saving exported onnx model `export_output.save(model_dst)`.
2. Customize diagnostic options:
    - Set the desired verbosity for diagnostics.
    - Treat warnings as errors.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106741
Approved by: https://github.com/titaiwangms, https://github.com/justinchuby, https://github.com/malfet
2023-08-15 17:46:15 +00:00
youkaichao
05db3d9969 improve doc on how to understand dynamo (#106860)
Per the discussion in https://github.com/pytorch/pytorch/pull/106673#issuecomment-1669939815 , I add more documentation to explain the output of dynamo compilation. I didn't find any de-compile library, so I manually de-compile the bytecode. The result looks good.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106860
Approved by: https://github.com/jansel, https://github.com/msaroufim
2023-08-14 19:49:24 +00:00
BowenBao
22095acfd7 [ONNX] Migrate to PT2 logging (#106592)
Summary
- The 'dynamo_export' diagnostics leverages the PT2 artifact logger to handle the verbosity
level of logs that are recorded in each SARIF log diagnostic. In addition to SARIF log,
terminal logging is by default disabled. Terminal logging can be activated by setting
the environment variable `TORCH_LOGS="onnx_diagnostics"`. When the environment variable
is set, it also fixes logging level to `logging.DEBUG`, overriding the verbosity level
specified in the diagnostic options.
See `torch/_logging/__init__.py` for more on PT2 logging.
- Replaces 'with_additional_message' with 'Logger.log' like apis.
- Introduce 'LazyString', adopted from 'torch._dynamo.utils', to skip
evaluation if the message will not be logged into diagnostic.
- Introduce 'log_source_exception' for easier exception logging.
- Introduce 'log_section' for easier markdown title logging.
- Updated all existing code to use new api.
- Removed 'arg_format_too_verbose' diagnostic.
- Rename legacy diagnostic classes for TorchScript Onnx Exporter to avoid
confusion.

Follow ups
- The 'dynamo_export' diagnostic now will not capture python stack
information at point of diagnostic creation. This will be added back in
follow up PRs for debug level logging.
- There is type mismatch due to subclassing 'Diagnostic' and 'DiagnosticContext'
for 'dynamo_export' to incorporate with PT2 logging. Follow up PR will
attempt to fix it.
- More docstrings with examples.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106592
Approved by: https://github.com/titaiwangms
2023-08-11 23:27:00 +00:00
Howard Huang
149e458846 [BE] RPC is missing RRef docs (#106902)
The current `RRef` class derives from `PyRRef`, which has all the method definitions and documentation, and we don't see any of this in the current documentation:

<img width="891" alt="image" src="https://github.com/pytorch/pytorch/assets/14858254/62897766-a660-4846-97bf-182e4aa45079">

Changing to :inherited-members: so sphinx can pick up these methods

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106902
Approved by: https://github.com/svekars
2023-08-10 16:26:27 +00:00
Ivan Yashchuk
c913f3857f Remove dynamo+nvfuser (#105789)
This PR removes unmaintained Dynamo+nvFuser.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105789
Approved by: https://github.com/jansel, https://github.com/jjsjann123, https://github.com/albanD
2023-08-08 22:29:32 +00:00
Jason Lu
bc88028e8e Back out "Reland "Make adding buffers more like adding parameters (#104069)" (#106224)" (#106743)
Summary:
Original commit changeset: 81319beb97f3

Original Phabricator Diff: D47961182

Test Plan: revert to maintain backward compat with legacy ads_dper3 production package. Read details in: S357822

Reviewed By: atuljangra

Differential Revision: D48131623

@diff-train-skip-merge
(D48131623 landed internally)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106743
Approved by: https://github.com/malfet
2023-08-08 15:27:34 +00:00
PyTorch MergeBot
891bb259f8 Revert "Remove dynamo+nvfuser (#105789)"
This reverts commit 6030151d37.

Reverted https://github.com/pytorch/pytorch/pull/105789 on behalf of https://github.com/DanilBaibak due to Break a lot of tests on main. ([comment](https://github.com/pytorch/pytorch/pull/105789#issuecomment-1669710571))
2023-08-08 14:20:32 +00:00
Ivan Yashchuk
6030151d37 Remove dynamo+nvfuser (#105789)
This PR removes unmaintained Dynamo+nvFuser.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105789
Approved by: https://github.com/jansel, https://github.com/jjsjann123, https://github.com/albanD
2023-08-08 13:29:31 +00:00
Ramin Azarmehr
cdfd0ea162 [MPS] Introduce torch.mps.Event() APIs (#102121)
- Implement `MPSEventPool` to recycle events.
- Implement python bindings with `torch.mps.Event` class using the MPSEventPool backend. The current member functions of the Event class are `record()`, `wait()`, `synchronize()`, `query()`, and `elapsed_time()`.
- Add API to measure elapsed time between two event recordings.
- Added documentation for Event class to `mps.rst`.
- Added test case to `test_mps.py`.
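
A short sketch of the new API (runs only on a build with MPS available; the `enable_timing` flag is assumed to mirror the CUDA event constructor):

```python
import torch

if torch.backends.mps.is_available():
    start = torch.mps.Event(enable_timing=True)
    end = torch.mps.Event(enable_timing=True)

    x = torch.randn(1024, 1024, device="mps")
    start.record()
    y = x @ x
    end.record()
    torch.mps.synchronize()
    print("elapsed ms:", start.elapsed_time(end))
```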

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102121
Approved by: https://github.com/albanD, https://github.com/kulinseth
2023-08-08 03:45:45 +00:00
AllenTiTaiWang
b782beb18e [ONNX] Expose OnnxRegistry publicly (#106140)
The official move of `OnnxRegistry` to `torch.onnx` allows it to become one of the parameters in `torch.onnx.ExportOptions`. By incorporating `OnnxRegistry` in `torch.onnx.ExportOptions`, users gain access to various functionalities, including the ability to register custom operators using `register_custom_op`, check whether an operator is supported using `is_registered_op`, and obtain symbolic functions that support specific operators using `get_functions`.

Additionally, `opset_version` is now exclusively available in `torch.onnx.OnnxRegistry`, as it is removed from `torch.onnx.ExportOptions`. The initialization of the registry with torchlib under the provided opset version ensures that the exporter uses the specified opset version as the primary version for exporting.

These changes encompass scenarios where users can:

1. Register an unsupported ATen operator with a custom implementation using onnx-script.
2. Override an existing symbolic function (onnx invariant).

NOTE: The custom registered function will be prioritized in onnx dispatcher, and if there are multiple custom ones, the one registered the last will be picked.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106140
Approved by: https://github.com/justinchuby, https://github.com/thiagocrepaldi
2023-08-04 20:46:03 +00:00
wangxiyuan
4eeda6616c Correct URL Link for torchDynamo (#105903)
Correct some erroneous or 404 URLs in the TorchDynamo docs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105903
Approved by: https://github.com/malfet
2023-07-31 20:50:09 +00:00
Mikayla Gawarecki
d8e5f2aa6d Reland "Make adding buffers more like adding parameters (#104069)" (#106224)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106224
Approved by: https://github.com/atalman, https://github.com/albanD
2023-07-31 17:18:56 +00:00
Svetlana Karslioglu
4d3ea5df65 Restructure torch.compile docs (#105376)
Current torch.compile docs have become a bit of a mess, with the docs expanded in the left nav. This PR moves them under the torch.compiler menu item in the left nav. A bunch of rewrites were made in collaboration with @msaroufim to address formatting issues; the latest updates that moved some of the APIs to the public torch.compiler namespace were addressed as well. The documentation is broken down into three categories that address three main audiences: PyTorch users, PyTorch developers, and PyTorch backend vendors. While the user-facing documentation was significantly rewritten, the dev docs and vendor docs were left mostly untouched. This can be addressed in follow-up PRs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105376
Approved by: https://github.com/msaroufim
2023-07-28 20:58:57 +00:00
Mikayla Gawarecki
035124774a Enable registering fallthroughs to (op, dk) from torch.library (#106086)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106086
Approved by: https://github.com/zou3519, https://github.com/albanD
2023-07-28 19:37:59 +00:00
fduwjj
487ebcac3b Clean up unsed MHA code to avoid confusion (#105956)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105956
Approved by: https://github.com/wz337, https://github.com/ezyang, https://github.com/wanchaol
2023-07-27 17:10:17 +00:00
Edward Z. Yang
edebdaf182 Change _dynamo.explain to be explain(f)(*args, **kwargs) (#106066)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
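
The new calling convention in a nutshell (toy function for illustration):

```python
import torch
import torch._dynamo as dynamo

def f(x):
    return x.sin() + x.cos()

explanation = dynamo.explain(f)(torch.randn(4))
print(explanation)
```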

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106066
Approved by: https://github.com/wanchaol, https://github.com/voznesenskym
2023-07-27 03:21:52 +00:00
Edward Z. Yang
f70844bec7 Enable UFMT on a bunch of low traffic Python files outside of main files (#106052)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106052
Approved by: https://github.com/albanD, https://github.com/Skylion007
2023-07-27 01:01:17 +00:00
Jerry Zhang
3a77f9aaaf [quant][api] Move torch.ao.quantization.pt2e.quantizer to torch.ao.quantization.quantizer (#105885)
Summary: moving quantizer to torch.ao.quantization to make it a public api, since pt2e is a folder for implementations

Test Plan:
CIs

sanity check: "buck test //executorch/backends/xnnpack/test:test_xnnpack_quantized_models -- test_resnet18"

Differential Revision: D47727838

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105885
Approved by: https://github.com/andrewor14
2023-07-26 18:20:09 +00:00
Danni Li
c0c208516b [Doc] Add Tensor.Shape (#104750)
Summary:
Add `Tensor.Shape` doc.

Fix: #104038

Ref:

- https://github.com/pytorch/pytorch/issues/5544
- https://github.com/pytorch/pytorch/issues/1980

Differential Revision: D47278630

CC: @svekars @carljparker

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104750
Approved by: https://github.com/mikaylagawarecki
2023-07-26 16:30:15 +00:00
albanD
9d2e15882e Add torch.utils to the docs page, remove dead code and fix docstrings (#105142)
As per title.
Note that the c++ side code for the minidumps part was removed. So trying to call any of these 3 functions today results in an error saying that `torch._C` doesn't have these attributes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105142
Approved by: https://github.com/janeyx99
2023-07-26 14:24:58 +00:00
Andrew Gu
c9edf11073 [FSDP][Docs] Make model/optim state dict configs visible in docs (#105848)
This closes https://github.com/pytorch/pytorch/issues/104717.

Rendered docs:
![Screenshot 2023-07-25 at 11 15 23 AM](https://github.com/pytorch/pytorch/assets/31054793/3c38166a-70c0-472c-805d-452d3bd9c700)
![Screenshot 2023-07-25 at 11 15 30 AM](https://github.com/pytorch/pytorch/assets/31054793/6d275d94-020a-44a2-a64c-0eeba083d47f)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105848
Approved by: https://github.com/rohan-varma
2023-07-25 16:23:53 +00:00
Ruoxi
5afc2f5069 Documentation for torch.autocast (#95760)
- [x] Corrected examples for CUDA devices.
- [x] Information about availability of `torch.autocast`.
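
For reference, a minimal device-aware autocast example along the lines of the corrected docs (dtype choices are illustrative):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.bfloat16

model = torch.nn.Linear(8, 8).to(device)
x = torch.randn(4, 8, device=device)

with torch.autocast(device_type=device, dtype=dtype):
    out = model(x)

print(out.dtype)  # lower-precision dtype inside the autocast region
```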

Fixes #95547

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95760
Approved by: https://github.com/leslie-fang-intel, https://github.com/kit1980
2023-07-22 03:56:34 +00:00
PyTorch MergeBot
050d3de07d Revert "Correct dynamo logging docs (#105658)"
This reverts commit f3a261e096.

Reverted https://github.com/pytorch/pytorch/pull/105658 on behalf of https://github.com/PaliC due to breaking docs f3a261e096 ([comment](https://github.com/pytorch/pytorch/pull/105658#issuecomment-1646310865))
2023-07-21 22:38:28 +00:00
David Radley
f3a261e096 Correct dynamo logging docs (#105658)
Fixes #105657

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105658
Approved by: https://github.com/zou3519
2023-07-21 21:37:02 +00:00
PyTorch MergeBot
117325862c Revert "Add torch.utils to the docs page, remove dead code and fix docstrings (#105142)"
This reverts commit e985719e98.

Reverted https://github.com/pytorch/pytorch/pull/105142 on behalf of https://github.com/huydhn due to Sorry for reverting this but it is failing python doc build job in trunk e985719e98 ([comment](https://github.com/pytorch/pytorch/pull/105142#issuecomment-1644874540))
2023-07-21 01:47:49 +00:00
albanD
e985719e98 Add torch.utils to the docs page, remove dead code and fix docstrings (#105142)
As per title.
Note that the c++ side code for the minidumps part was removed. So trying to call any of these 3 functions today results in an error saying that `torch._C` doesn't have these attributes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105142
Approved by: https://github.com/janeyx99
2023-07-21 00:14:59 +00:00
ydwu4
6abb8c382c [export] add kwargs support for export. (#105337)
Solving #105242.

During export, the exported function's signature changes multiple times. Suppose we'd like to export f as shown in following example:
```python
def f(arg1, arg2, kw1, kw2):
  pass

args = (arg1, arg2)
kwargs =  {"kw2":arg3, "kw1":arg4}

torch.export(f, args, kwargs)
```
The signature changes mutiple times during export process in the following order:
1. **gm_torch_level = dynamo.export(f, *args, \*\*kwargs)**. In this step, we turn all kinds of parameters, such as **positional_only**, **var_positional**, **kw_only**, and **var_kwargs**, into **positional_or_kw**. It also preserves the positional and keyword argument names of the original function (i.e. f in this example) [here](https://github.com/pytorch/pytorch/blob/main/torch/_dynamo/export.py#L546C13-L546C27). The order of kwargs will be the **key order** of kwargs (after Python 3.6, this is the insertion order of keys) instead of the original function signature, and the order is baked into an _orig_args variable of gm_torch_level's pytree info. So we'll have:
```python
def gm_torch_level(arg1, arg2, kw2, kw1)
```
Such a difference is acceptable as it's transparent to users of export.

2. **gm_aot_export = aot_export_module(gm_torch_level, pos_or_kw_args)**. In this step, we need to turn kwargs into positional args in the order gm_torch_level expects, which is stored in _orig_args. The returned gm_aot_export has the graph signature of flat_args, in_spec = pytree.tree_flatten(pos_or_kw_args):
``` python
flat_args, _ = pytree.tree_flatten(pos_or_kw_args)
def gm_aot_export(*flat_args)
```

3. **exported_program(*args, \*\*kwargs)**. The exported artifact is exported_program, which is a wrapper over gm_aot_export and has the same calling convention as the original function "f". To do this, we need to 1. specialize the order of kwargs into pos_or_kw_args and 2. flatten the pos_or_kw_args into what gm_aot_export expects. We can combine the two steps into one with:
```python
_, in_spec = pytree.tree_flatten((args, kwargs))

# Then during exported_program.__call__(*args, **kwargs)
flat_args  = fx_pytree.tree_flatten_spec((args, kwargs), in_spec)
```
, where kwargs is treated as a normal pytree whose key order is preserved in in_spec.

Implementation-wise, we treat _orig_args in the dynamo-exported graph module as the single source of truth, and kwargs are ordered following it.

Test plan:
See added tests in test_export.py.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105337
Approved by: https://github.com/angelayi, https://github.com/tugsbayasgalan
2023-07-20 19:53:08 +00:00
Andrey Talman
c6653b65d8 Back out "Make adding buffers more like adding parameters (#104069)" (#105581)
Summary:
D47537831 is breaking pyper tests: https://fb.workplace.com/groups/802176577445480/posts/1018902842439518/

with `TypeError: register_buffer() takes 3 positional arguments but 4 were given`

Original commit changeset: d4b4069fbd38

Original Phabricator Diff: D47537831

Test Plan:
```
buck2 run //caffe2/torch/fb/training_toolkit/integration_tests/training_lifecycle/cogwheel_tests/pyper_release_v2:cogwheel_smallworld_inline_cvr_infer_pyper_pyper__canary_offline_training-launcher -- --run-harness-in-tupperware --build-fbpkg ads_dper3 --build-fbpkg training_platform
```

Reviewed By: atalman

Differential Revision: D47600140

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105581
Approved by: https://github.com/mikaylagawarecki
2023-07-20 03:39:53 +00:00
Justin Chu
14d87bb5ff [BE] Enable ruff's UP rules and autoformat tools and scripts (#105428)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105428
Approved by: https://github.com/albanD, https://github.com/soulitzer, https://github.com/malfet
2023-07-19 01:24:44 +00:00
ekamiti
32d422f335 Make adding buffers more like adding parameters (#104069)
Add similar semantics for creating a buffer object to those for creating a parameter. This is done by introducing a new `Buffer` class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same, as the `register_buffer` method has not been changed. The `persistent` parameter in the `Buffer` type indicates whether a buffer object should be persistent or not. Other non-test changes have to do with getting the new `Buffer` type recognized by inductor and dynamo. The remaining changes are test changes to make sure that the `Buffer` type can be used as a drop-in replacement for `register_buffer`, as it just leads to `register_buffer` being called. The addition of this new functionality still allows normal tensors to be used as buffers, so these changes are intended to be backwards compatible.
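
A hedged sketch of the new semantics, assuming the `Buffer` class is importable from `torch.nn` (next to `Parameter`):

```python
import torch
from torch.nn import Buffer  # assumption: Buffer lives alongside Parameter

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Assigning a Buffer registers it, just like assigning a Parameter,
        # without an explicit self.register_buffer(...) call.
        self.running_mean = Buffer(torch.zeros(4), persistent=False)

m = M()
print(dict(m.named_buffers()))
```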

Fixes #35735

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104069
Approved by: https://github.com/mikaylagawarecki
2023-07-17 17:59:05 +00:00
Jerry Zhang
7b4d080496 [quant][pt2e] Rename _pt2e to pt2e (#104668)
Summary:
X-link: https://github.com/pytorch/executorch/pull/3

att

Test Plan: Imported from OSS

Differential Revision: D47202807

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104668
Approved by: https://github.com/andrewor14
2023-07-15 06:34:17 +00:00
Aleksandar Samardžić
d7e6040efa Update sparse semi-structured linear operator (#104608)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104608
Approved by: https://github.com/cpuhrsch
2023-07-13 23:52:39 +00:00
Aleksandar Samardžić
fc2f87b281 Add semi-structured sparse conversions (#103830)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103830
Approved by: https://github.com/amjames, https://github.com/jcaip, https://github.com/cpuhrsch
2023-07-13 21:09:09 +00:00
William Wen
15c67ca95c Update troubleshooting.rst (#105018)
Update with `TORCH_LOGS` information

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105018
Approved by: https://github.com/mlazos
2023-07-12 21:42:53 +00:00
Rodrigo Kumpera
fc012d716d [core] Bring cpu device module closer to cuda's. (#103172)
By implementing some of the functionality used by CUDA we make
implementing device-agnostic code a lot easier.

With this set of changes it's now possible to get FSDP to wrap a trivial
module. FWD/BWD still TBD.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103172
Approved by: https://github.com/wz337, https://github.com/wanchaol
2023-07-12 19:43:22 +00:00
Zaili Wang
16d3638c11 Add best practices for CPU backend doc (#105051)
Content is the same as #103948.
@svekars the PR content is updated per your comment, but while trying to resolve the conflict the original PR was accidentally closed. Would you help handle this new one? Sorry for the inconvenience.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105051
Approved by: https://github.com/svekars
2023-07-12 18:04:51 +00:00
Svetlana Karslioglu
eb03af44ee Fixes to the torch.compile doc and doctest (#104911)
Fixing the user warning in doctest by removing autosummary from compile/index.rst:
```
/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/__init__.py:docstring of torch.compile:1: WARNING: duplicate object description of torch.compile, other instance in compile/generated/torch.compile, use :noindex: for one of them
```
The error is no longer present in the log: https://github.com/pytorch/pytorch/actions/runs/5513741050/jobs/10052379357?pr=104911
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104911
Approved by: https://github.com/kit1980, https://github.com/malfet
2023-07-11 17:54:12 +00:00
Thiago Crepaldi
f1bff6601c [ONNX] Add fake tensor support to torch.onnx.dynamo_export (#103865)
## Context prior to this PR

https://github.com/pytorch/pytorch/pull/100017/ was merged onto PyTorch `main` branch with the goal of enabling `torch._dynamo.export` to perform symbolic tracing.
In that context, symbolic tracing is defined as tracing of a model using fake inputs and weights. An input is fake when a `torch.Tensor` is replaced by `torch._subclasses.FakeTensor`, whereas a weight is fake when a `torch.nn.Parameter` is replaced by a `torch._subclasses.FakeTensor`.

For additional context, several strategies were discussed with Meta to enable this feature, including 1) calling `torch._dynamo.export` within a `torch._subclass.FakeTensorMode` context and 2) **fake**fying input and model as separate step and then call `torch._dynamo.export` without an active `torch._subclass.FakeTensorMode` context. At the end, 2) was preferred and implemented by #100017 to minimize the number of side-effects the fake tensor mode has on the code base.

As a consequence, the `torch._dynamo.export` API introduced a new argument called `fake_mode`. When symbolic tracing is used, the user must pass in the `fake_mode` used to fakefy both the input and the model. Internally, `torch._dynamo.export` will adopt this `fake_mode` instead of creating its own instance. This is needed because each instance of `FakeTensorMode` has metadata on the tensor/parameter it fakefied. Thus, using a real tensor/model and specifying a `fake_mode` to `torch._dynamo.export` is an error. Also, specifying a `fake_mode` instance to `torch._dynamo.export` different from the one used to fakefy the model and input is an error.

## Changes introduced from this PR

This PR is intended to integrate `torch._dynamo.export(fake_mode=...)` through `torch.onnx.dynamo_export`. In essence, it
* Introduces a new public API `ONNXFakeContext` which wraps a `FakeTensorMode` under the hood. This removes complexity from the user side while still allowing the exporter to leverage the fake mode.
* Adds a new public API `enable_fake_mode` *context manager* that instantiates and returns an `ONNXFakeContext`.
* Adds a new `ExportOptions.fake_context` that will be used to persist the `ONNXFakeContext` created by `enable_fake_mode` and plumb through until it reaches the call to `torch._dynamo.export`.
* Adds a `model_state_dict` argument to `ExportOutput.save` API.
  * When model is exported with fake tensors, no actual data exist in the FX module and, therefore, in the ONNX graph.
    * In fact, `torch.fx.make_fx` lifts initializers as model input when fake tensors are used
      * https://github.com/pytorch/pytorch/pull/104493 is needed to enforce name matching between Parameters and inputs
    *  A model checkpoint file or state_dict is needed to populate the ONNX graph with real initializers through `export_output.save(model_state_dict=...)` API

Symbolic tracing, or onnx fake mode, is only enabled when the user instantiates the input and model within the `enable_fake_mode` context. Otherwise, real tracing is done, which preserves the current behavior.

## Usability

Because symbolic tracing depends a lot on having changes made on Dynamo side before it can be consumed on ONNX exporter, this feature may have its API and assumptions changed as symbolic tracing matures upstream. Nonetheless, it is still important to have this feature merged ASAP on the ONNX exporter side to "lock" changes on Dynamo that would otherwise break ONNX exporter without warning.

Example:

```python
import torch

class Model(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(2, 2)

    def forward(self, x):
        out = self.linear(x)
        return out

with torch.onnx.enable_fake_mode() as fake_context:
    x = torch.rand(5, 2, 2)
    model = Model()

# Export the model with fake inputs and parameters
export_options = torch.onnx.ExportOptions(fake_context=fake_context)
export_output = torch.onnx.dynamo_export(
    model, x, export_options=export_options
)

model_state_dict = Model().state_dict()  # optional
export_output.save("/path/to/model.onnx", model_state_dict=model_state_dict)
```

## Next steps

* Add unit tests running the exported model with ORT
Today this is not possible yet because `make_fx` used by our Decomposition pass lifts initializers as model inputs. However, the initializer names are not preserved by FX tracing, causing a mismatch between the initializer and input name.
https://github.com/pytorch/pytorch/pull/104493 and https://github.com/pytorch/pytorch/pull/104741 should fix the initializer mismatch, enabling model execution

* Revisit `ONNXTorchPatcher` and how the ONNX initializers are saved in the graph as external data
We can try to get rid of the PyTorch patcher. If we can't, we might prefer to create specific patchers, say an `FXSymbolicTracePatcher` used specifically during an export using `torch.fx.symbolic_trace`, and maybe an `ExportOutputSavePatcher` used specifically for `ExportOutput.save`, to prevent patching too many PyTorch APIs that we don't need.

## References

* [FakeTensor implementation](https://github.com/pytorch/pytorch/blob/main/torch/_subclasses/fake_tensor.py)
* [PR that adds fake tensor support to torch._dynamo.export](https://github.com/pytorch/pytorch/pull/100017)
* [Short fake tensor documentation](https://pytorch.org/torchdistx/latest/fake_tensor.html)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103865
Approved by: https://github.com/BowenBao
2023-07-11 03:17:17 +00:00
David Radley
dbc2216800 Add autograd modes table to docs (#104774)
Fixes #104461

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104774
Approved by: https://github.com/soulitzer
2023-07-08 03:14:10 +00:00
Aleksei Nikiforov
c42fd73cf9 Add functions to get and set default endianness in load() functions (#101973)
By default interpret tensor data as native endian, but add an option to interpret data as little endian or big endian.
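
A hedged sketch of how the option might be used; the function and enum names below are assumed from the feature description and may differ:

```python
import torch
from torch.serialization import LoadEndianness, set_default_load_endianness  # assumed names

torch.save(torch.arange(4), "t.pt")

# Interpret tensor data in checkpoints that carry no byte-order metadata as little endian.
set_default_load_endianness(LoadEndianness.LITTLE)
print(torch.load("t.pt"))
```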

Related to #101688

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101973
Approved by: https://github.com/mikaylagawarecki
2023-07-06 20:12:56 +00:00
toma
2abbed42ee correct the generated code and corresponding text to make them consistent (#104596)
Fixes #104500

As discussed in #104500, the [corresponding doc](https://pytorch.org/docs/stable/dynamo/get-started.html#getting-started) for dynamo is inconsistent between the code and explanation. I have run the code example to get the correct code.
![image](https://github.com/pytorch/pytorch/assets/6964699/d11e0f2f-2225-4ba9-8934-b06c9fc78721)
This PR fixes the problem and makes the doc more readable.

cc:
@davidberard98 @ezyang  please help me check this PR, thanks!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104596
Approved by: https://github.com/ezyang
2023-07-04 22:56:03 +00:00
angelayi
828b275740 [exportdb] Setup website (#104288)
<img width="1109" alt="image" src="https://github.com/pytorch/pytorch/assets/10901756/e67ff8a9-adb1-466f-8285-fb7d3653d139">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104288
Approved by: https://github.com/zhxchen17
2023-07-01 01:03:56 +00:00
Jesse Cai
2da6cae43c [core][pruning][sparse][feature] SparseSemiStructured tensor subclass (#102135)
This PR adds in support for semi-structured sparsity via a tensor
subclass. It currently uses the CUTLASS kernels merged in PR #100881.

In the future we plan to add in cuSPARSELt support (see the other PRs in
the stack), which will give us larger performance gains.

This PR adds in 2 things:
- a Tensor subclass, `SparseSemiStructuredTensor` to store the
  sparse tensor in compressed form and override `__torch_dispatch__`.
- a conversion function that takes in a dense tensor and a
  semi-structured sparse bool mask and creates an instance of the
  subclass.

**SparseSemiStructuredTensor**

The subclass stores the dense tensor in a contiguous flattened tensor
for future compatibility with cuSPARSELt, which expects this format.
Note that the CUTLASS kernels do not have this limitation, as the
specified values and the metadata are passed separately in
`_structured_sparse_linear`. In the future we can use the cuSPARSELT bindings
[here](https://github.com/pytorch/pytorch/pull/103700) for faster matmul, better dtype coverage, and relaxed shape
constraints.

Since we currently don't have a way to go back from the sparse
representation to the dense representation, and we store the weights in
compressed form, we don't have a great way to handle .t().

Instead, we keep track of how often we've called transpose on our
tensor, and if it's an unexpected number we throw an error. When the first
argument is sparse, we expect an even number of calls to transpose,
while when the second argument is sparse, we expect an odd number of
calls. This is because we support second argument sparse matrix
multiplications by using transpose properties.

**to_sparse_semi_structured**

This is a conversion function to convert a dense tensor and a
semi-structured sparse bool mask into a subclass. Currently, we must
pass in a bool mask, since we can't infer it becuase there may be
additional zero elements in the dense tensor, so `tensor !=0` is not 2:4
sparse.

Once we add either a method to derive the mask from the dense tensor or
cuSPARSELt, we no longer need to pass in the mask. cuSPARSELt has its
own helper functions to create the metadata mask.

**User Details**

We have implemented support for the following ops for `torch.float16`
and `torch.int8`:
```
torch.addmm(bias, dense, sparse.t())
torch.mm(dense, sparse)
torch.mm(sparse, dense)
aten.linear.default
aten.t.default
aten.t.detach
```

The end user interface to accelerate an nn.Linear module with the
subclass would look like this:

```
import torch
import torch.nn as nn
from torch.sparse import to_sparse_semi_structured

mask = torch.Tensor([0, 0, 1, 1]).tile(128, 32).cuda().bool()
linear = nn.Linear(128, 128).half().cuda()

linear.weight = nn.Parameter(to_sparse_semi_structured(linear.weight,
                                                       mask=mask))

```

This also updates tests and the `torch.sparse` module docstring to
reflect these changes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102135
Approved by: https://github.com/albanD
2023-06-27 19:21:06 +00:00
PyTorch MergeBot
b76a040b18 Revert "[core][pruning][sparse][feature] SparseSemiStructured tensor subclass (#102135)"
This reverts commit aea771de30.

Reverted https://github.com/pytorch/pytorch/pull/102135 on behalf of https://github.com/huydhn due to test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_cuda_int8 is still failing CUDA trunk jobs aea771de30 ([comment](https://github.com/pytorch/pytorch/pull/102135#issuecomment-1608744110))
2023-06-27 03:49:31 +00:00
Jesse Cai
aea771de30 [core][pruning][sparse][feature] SparseSemiStructured tensor subclass (#102135)
This PR adds in support for semi-structured sparsity via a tensor
subclass. It currently uses the CUTLASS kernels merged in PR #100881.

In the future we plan to add in cuSPARSELt support (see the other PRs in
the stack), which will give us larger performance gains.

This PR adds in 2 things:
- a Tensor subclass, `SparseSemiStructuredTensor` to store the
  sparse tensor in compressed form and override `__torch_dispatch__`.
- a conversion function that takes in a dense tensor and a
  semi-structured sparse bool mask and creates an instance of the
  subclass.

**SparseSemiStructuredTensor**

The subclass stores the dense tensor in a contiguous flattened tensor
for future compatibility with cuSPARSELt, which expects this format.
Note that the CUTLASS kernels do not have this limitation, as the
specified values and the metadata are passed separately in
`_structured_sparse_linear`. In the future we can use the cuSPARSELT bindings
[here](https://github.com/pytorch/pytorch/pull/103700) for faster matmul, better dtype coverage, and relaxed shape
constraints.

Since we currently don't have a way to go back from the sparse
representation to the dense representation, and we store the weights in
compressed form, we don't have a great way to handle .t().

Instead, we keep track of how often we've called transpose on our
tensor, and if it's an unexpected number we throw an error. When the first
argument is sparse, we expect an even number of calls to transpose,
while when the second argument is sparse, we expect an odd number of
calls. This is because we support second argument sparse matrix
multiplications by using transpose properties.

**to_sparse_semi_structured**

This is a conversion function to convert a dense tensor and a
semi-structured sparse bool mask into a subclass. Currently, we must
pass in a bool mask, since we can't infer it becuase there may be
additional zero elements in the dense tensor, so `tensor !=0` is not 2:4
sparse.

Once we add either a method to derive the mask from the dense tensor or
cuSPARSELt, we no longer need to pass in the mask. cuSPARSELt has its
own helper functions to create the metadata mask.

**User Details**

We have implemented support for the following ops for `torch.float16`
and `torch.int8`:
```
torch.addmm(bias, dense, sparse.t())
torch.mm(dense, sparse)
torch.mm(sparse, dense)
aten.linear.default
aten.t.default
aten.t.detach
```

The end user interface to accelerate an nn.Linear module with the
subclass would look like this:

```
import torch
import torch.nn as nn
from torch.sparse import to_sparse_semi_structured

mask = torch.Tensor([0, 0, 1, 1]).tile(128, 32).cuda().bool()
linear = nn.Linear(128, 128).half().cuda()

linear.weight = nn.Parameter(to_sparse_semi_structured(linear.weight,
                                                       mask=mask))

```

This also updates tests and the `torch.sparse` module docstring to
reflect these changes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102135
Approved by: https://github.com/albanD
2023-06-27 02:37:00 +00:00
Mikayla Gawarecki
981f24e806 Add docstring to torch.serialization.register_package (#104046)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104046
Approved by: https://github.com/albanD
2023-06-26 23:28:32 +00:00
PyTorch MergeBot
bfa08a1c67 Revert "[core][pruning][sparse][feature] SparseSemiStructured tensor subclass (#102135)"
This reverts commit cf5262a84f.

Reverted https://github.com/pytorch/pytorch/pull/102135 on behalf of https://github.com/huydhn due to Sorry for reverting your PR but test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_cuda_int8 is failing CUDA trunk jobs cf5262a84f. This looks like a landrace ([comment](https://github.com/pytorch/pytorch/pull/102135#issuecomment-1608423849))
2023-06-26 22:54:16 +00:00
Jesse Cai
cf5262a84f [core][pruning][sparse][feature] SparseSemiStructured tensor subclass (#102135)
This PR adds in support for semi-structured sparsity via a tensor
subclass. It currently uses the CUTLASS kernels merged in PR #100881.

In the future we plan to add in cuSPARSELt support (see the other PRs in
the stack), which will give us larger performance gains.

This PR adds in 2 things:
- a Tensor subclass, `SparseSemiStructuredTensor` to store the
  sparse tensor in compressed form and override `__torch_dispatch__`.
- a conversion function that takes in a dense tensor and a
  semi-structured sparse bool mask and creates an instance of the
  subclass.

**SparseSemiStructuredTensor**

The subclass stores the dense tensor in a contiguous flattened tensor
for future compatibility with cuSPARSELt, which expects this format.
Note that the CUTLASS kernels do not have this limitation, as the
specified values and the metadata are passed separately in
`_structured_sparse_linear`. In the future we can use the cuSPARSELT bindings
[here](https://github.com/pytorch/pytorch/pull/103700) for faster matmul, better dtype coverage, and relaxed shape
constraints.

Since we currently don't have a way to go back from the sparse
representation to the dense representation, and we store the weights in
compressed form, we don't have a great way to handle .t().

Instead, we keep track of how often we've called transpose on our
tensor, and if it's an unexpected number we throw an error. When the first
argument is sparse, we expect an even number of calls to transpose,
while when the second argument is sparse, we expect an odd number of
calls. This is because we support second argument sparse matrix
multiplications by using transpose properties.

**to_sparse_semi_structured**

This is a conversion function to convert a dense tensor and a
semi-structured sparse bool mask into a subclass. Currently, we must
pass in a bool mask, since we can't infer it becuase there may be
additional zero elements in the dense tensor, so `tensor !=0` is not 2:4
sparse.

Once we add either a method to derive the mask from the dense tensor or
cuSPARSELt, we no longer need to pass in the mask. cuSPARSELt has its
own helper functions to create the metadata mask.

**User Details**

We have implemented support for the following ops for `torch.float16`
and `torch.int8`:
```
torch.addmm(bias, dense, sparse.t())
torch.mm(dense, sparse)
torch.mm(sparse, dense)
aten.linear.default
aten.t.default
aten.t.detach
```

The end user interface to accelerate an nn.Linear module with the
subclass would look like this:

```
import torch
import torch.nn as nn
from torch.sparse import to_sparse_semi_structured

mask = torch.Tensor([0, 0, 1, 1]).tile(128, 32).cuda().bool()
linear = nn.Linear(128, 128).half().cuda()

linear.weight = nn.Parameter(to_sparse_semi_structured(linear.weight,
                                                       mask=mask))

```

This also updates tests and the `torch.sparse` module docstring to
reflect these changes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102135
Approved by: https://github.com/albanD
2023-06-26 21:30:43 +00:00
Sergii Dymchenko
adf9595c2f Update CODEOWNERS (#103934)
Remove users that no longer have write access to the repo, resolving CODEOWNERS errors.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103934
Approved by: https://github.com/ZainRizvi, https://github.com/atalman, https://github.com/malfet
2023-06-26 19:29:29 +00:00
ZhaoqiongZ
7cef7195f6 [draft] Update Multiprocessing best practices with CPU device (#103229)
Fixes [#102498](https://github.com/pytorch/pytorch/issues/102498)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103229
Approved by: https://github.com/mingfeima, https://github.com/svekars, https://github.com/jgong5
2023-06-25 06:26:40 +00:00
Zachary DeVito
afc788a99c Re-land _cycleviz.py: visualize reference cycles holding cuda memory (#104051)
Reference cycles are freed by the cycle collector rather than being cleaned up
when the objects in the cycle first become unreachable. If a cycle points to a tensor,
the CUDA memory for that tensor will not be freed until garbage collection runs.
Accumulation of CUDA allocations can lead to out of memory errors (OOMs), as well as
non-deterministic allocation behavior which is harder to debug.

This visualizer installs a garbage collection hook to look for cycles containing
CUDA tensors and saves a visualization of the garbage:

```
from torch.cuda._cycleviz import warn_tensor_cycles
warn_tensor_cycles()
# do some work that results in a cycle getting garbage collected
# ...
> WARNING:root:Reference cycle includes a CUDA Tensor see visualization of cycle /tmp/tmpeideu9gl.html
```
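
For instance, a cycle of the kind this hook catches could be built like this (an illustrative sketch, not code from the PR):

```
import gc
import torch

class Node:
    def __init__(self):
        self.peer = None
        self.buf = torch.empty(1024, device="cuda")  # CUDA memory pinned by the cycle

a, b = Node(), Node()
a.peer, b.peer = b, a   # create a reference cycle
del a, b                # unreachable now, but only the GC can free them
gc.collect()            # cycle collected; the hook warns and writes the HTML file
```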

Reland to make windows skip the test.

This reverts commit 7b3b6dd426.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104051
Approved by: https://github.com/aaronenyeshi, https://github.com/malfet
2023-06-23 13:44:58 +00:00
PyTorch MergeBot
7b3b6dd426 Revert "_cycleviz.py: visualize reference cycles holding cuda memory (#102656)"
This reverts commit dba67f71c9.

Reverted https://github.com/pytorch/pytorch/pull/102656 on behalf of https://github.com/huydhn due to Sorry for reverting your PR. But I think the change is failing on Windows CUDA https://github.com/pytorch/pytorch/actions/runs/5341701630/jobs/9683293600 ([comment](https://github.com/pytorch/pytorch/pull/102656#issuecomment-1603035364))
2023-06-22 17:16:47 +00:00
albanD
4143b6b89b Add torch_dispatch and modes to extending.rst note (#102087)
The following subjects are not in this PR and will be done in a follow up:
- Go through torch_function section and update to the latest phrasing and link to the proper new sections
- Go through torch.library and custom device docs to add links to the new sections as appropriate
- Top level explanations on which component should be used
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102087
Approved by: https://github.com/janeyx99
2023-06-22 12:56:35 +00:00
Zachary DeVito
dba67f71c9 _cycleviz.py: visualize reference cycles holding cuda memory (#102656)
Reference cycles are freed by the cycle collector rather than being cleaned up
when the objects in the cycle first become unreachable. If a cycle points to a tensor,
the CUDA memory for that tensor will not be freed until garbage collection runs.
Accumulation of CUDA allocations can lead to out of memory errors (OOMs), as well as
non-deterministic allocation behavior which is harder to debug.

This visualizer installs a garbage collection hook to look for cycles containing
CUDA tensors and saves a visualization of the garbage:

```
from torch.cuda._cycleviz import warn_tensor_cycles
warn_tensor_cycles()
# do some work that results in a cycle getting garbage collected
# ...
> WARNING:root:Reference cycle includes a CUDA Tensor see visualization of cycle /tmp/tmpeideu9gl.html
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102656
Approved by: https://github.com/aaronenyeshi
2023-06-22 04:00:28 +00:00
Michael Suo
a475ea4542 [fx] change from #users to num_users in graph printout (#101140)
`#users` has a special meaning in various chat apps, which makes it annoying to copy-paste graphs into them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101140
Approved by: https://github.com/ezyang
2023-06-20 21:24:32 +00:00
PyTorch MergeBot
e031dd23b0 Revert "To add brief intro for CPU backend optimization (#103666)"
This reverts commit 013ffe457e.

Reverted https://github.com/pytorch/pytorch/pull/103666 on behalf of https://github.com/huydhn due to Failing doc tests in trunk 013ffe457e ([comment](https://github.com/pytorch/pytorch/pull/103666#issuecomment-1599301270))
2023-06-20 18:33:01 +00:00
Zaili Wang
013ffe457e To add brief intro for CPU backend optimization (#103666)
This PR adds a brief introduction for x86 CPU backend optimization. Per previous discussion, the former PR #103307 was closed in favor of this one, and the contents are put into a new file.
@Guobing-Chen @jgong5 @mingfeima @jingxu10 please help review, thanks.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103666
Approved by: https://github.com/jgong5, https://github.com/malfet
2023-06-20 17:35:22 +00:00
leslie-fang-intel
9832cfbbfe Quantization oneDNN backend only support VNNI CPU (#103653)
**Summary**

- Update the quantization document to note that the default qconfig with the oneDNN backend is recommended for use on CPUs with Vector Neural Network Instruction (VNNI) support.
- Add a warning message when the user uses the default qconfig with the oneDNN backend on a CPU without Vector Neural Network Instruction support.
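
For reference, the qconfig in question can be obtained as follows (a usage sketch; "onednn" is the backend string accepted by the quantization APIs):

```python
import torch

# default qconfig for the oneDNN quantization backend; per the updated docs,
# this is recommended only on CPUs with VNNI support
qconfig = torch.ao.quantization.get_default_qconfig("onednn")
```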

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103653
Approved by: https://github.com/jgong5, https://github.com/malfet
2023-06-19 09:50:07 +00:00
albanD
918fe519a0 Use the new analytics ID (#103766)
Re: https://github.com/pytorch/pytorch.github.io/issues/1397
Following the migration to latest google analytics
FYI @malfet
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103766
Approved by: https://github.com/svekars
2023-06-16 23:21:08 +00:00
Edward Z. Yang
bc6ec97e02 Switch dynamic_shapes to True by default (#103597)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103597
Approved by: https://github.com/voznesenskym
2023-06-15 15:16:20 +00:00
Mark Saroufim
ea384cd377 torch.compiler public namespace (#102182)
# torch.compiler public API

## Goal

The goal of this document is to describe the public facing API for torchdynamo and torchinductor.

Today both dynamo and torchinductor live in the `torch/_dynamo` and `torch/_inductor` namespaces. The only public function, `torch.compile()`, is placed directly in `torch/__init__.py`.

This poses a few problems for users trying to take dependencies on PyTorch 2.0
1. Unclear BC guarantees
2. No builtin discovery mechanism outside of reading the source code
3. No hard requirements for docstrings or type annotations

Most importantly, it mixes two personas, the PyTorch 2.0 developer vs. the PyTorch 2.0 customer, so this is an attempt to address that. We draw a lot of inspiration from the `functorch` migration to the `func` namespace.

## Alternate names

We did discuss some other alternative names

1. `torch.compile` -> the problem is this would break BC on the existing `torch.compile` function
2. `torch.dynamo` -> `dynamo` is so far not something we've deliberately hidden from users, but the problem is that figuring out what is `_dynamo` vs `dynamo` might be confusing
3. `torch.compiler` -> 1 would be better, but to keep BC this is a good compromise

# The general approach
## Proposal 1
In https://github.com/pytorch/pytorch/blob/main/torch/_dynamo/__init__.py

We have a function called `reset()`; this function is essential if users are trying to `torch.compile()` a model under different settings

```python
# in _dynamo/
def reset():
    do_reset_stuff()
```

Instead we propose

```python
# in compiler/
def reset():
    do_reset_stuff() # As in copy paste the logic from _dynamo.reset

# in _dynamo/
import warnings
import inspect

def reset():
    function_name = inspect.currentframe().f_code.co_name
    warnings.warn(f"{function_name} is deprecated, use compiler.{function_name} instead", DeprecationWarning)
    return compiler.reset()

```
## Proposal 2

```python
# in compiler/
def reset():
    """
    Docstrings here
    """
    _dynamo.reset()

# in _dynamo/
No changes
```
Consensus so far seems to be proposal 2 since fewer warnings will be less jarring and it’ll make it quite easy to merge the public API

## Docstrings

The above was an example of a function that has no inputs or outputs, but there are other functions whose docstrings could use an improvement. For example, allow_in_graph actually works over lists of functions, but that's not mentioned anywhere except the source code.

```python
def allow_in_graph(fn):
    """
    Customize which functions TorchDynamo will include in the generated
    graph. Similar to `torch.fx.wrap()`.

    Parameters:
        fn (callable or list/tuple): The function(s) to be allowed in the graph.

    Returns:
        callable or list/tuple: The input function(s) included in the graph.

    Examples:
        Customize inclusion of a single function:
        ::
            torch._dynamo.allow_in_graph(my_custom_function)

        Customize inclusion of multiple functions:
        ::
            torch._dynamo.allow_in_graph([my_custom_function1, my_custom_function2])

        @torch._dynamo.optimize(...)
        def fn(a):
            x = torch.add(a, 1)
            x = my_custom_function(x)
            x = torch.add(x, 1)
            return x

        fn(...)

    Notes:
        The `allow_in_graph` function allows customization of which functions TorchDynamo
        includes in the generated graph. It can be used to include specific functions that
        are not automatically captured by TorchDynamo.

        If `fn` is a list or tuple, `allow_in_graph` will be called recursively on each
        element in the sequence.

        Once a function is allowed in the graph using `allow_in_graph`, it will be captured
        in the graph generated by TorchDynamo. This customization enables more fine-grained
        control over the functions included in the graph.

        Note that `allow_in_graph` expects the input `fn` to be a callable.

    """
    if isinstance(fn, (list, tuple)):
        return [allow_in_graph(x) for x in fn]
    assert callable(fn), "allow_in_graph expects a callable"
    allowed_functions._allowed_function_ids.add(id(fn))
    allowed_functions._disallowed_function_ids.remove(id(fn))
    return fn
```

So to make the API public, we’d have to write similar docstrings for all public functions we’d like to create.

The benefit of this approach is that
1. No BC risks; internal and external users relying on our tooling can slowly wean off the private functions.
2. We will also have to write correct docstrings, which will automatically make our documentation easier to maintain and render correctly on pytorch.org
3. We already have some BC guarantees: we don't kill OptimizedModule, and we rejected the PR to change the config system

The con of this approach is that we will be stuck with some potentially suboptimal functions/classes that we can't kill.

## Testing strategy
If the approach is to mostly make a public function call an already tested private function, then all we need to do is ensure that the function signatures don't change
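
A minimal sketch of such a check (an assumption about how the test could look, not the PR's actual test):

```python
import inspect

import torch._dynamo
import torch.compiler

def test_public_signature_matches_private():
    # the public wrapper should keep the signature of the private function
    assert inspect.signature(torch.compiler.reset) == inspect.signature(torch._dynamo.reset)
```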

## Which functions should be in the public API

Our heuristic for deciding whether something should be public or not is: are users already relying on it for lack of other options, or have we recommended some non-public functions for users to debug their PT 2.0 programs?

The heuristic for not putting something in public is that it's an experimental subsystem with the goal of turning it on by default, it's very core-dev centric or Meta centric, it's a bunch of different configs that should be batched into a single user-facing one, or it's something that needs to be renamed because the name is confusing

#### Top level
`torch.compile()` -> already is a public API; it does require some minor improvements like having configs be passed in to any backend and not just inductor (EDIT: This was already done https://github.com/pytorch/pytorch/pull/99645) and renaming `mode=reduce-overhead` to `mode=cudagraph`

To make sure that PT 2.0 is supported with a given pytorch version, users can create a new public function; this would replace the need for the `try/except` blocks around `import torch._dynamo` that have been populating user code.

```python
def pt2_enabled():
    if hasattr(torch, 'compile'):
        return True
    else:
        return False
```
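
Usage would then be a simple guard (`MyModel` is a hypothetical stand-in for any module, not from the PR):

```python
import torch

model = MyModel()
if pt2_enabled():
    # PT 2.0 is available, so opt into compilation
    model = torch.compile(model)
```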

For all of the below they will be translated to `torch.compiler.function_name()`

#### From _dynamo

As a starting point we looked at https://github.com/pytorch/pytorch/blob/main/torch/_dynamo/__init__.py and we suggest redefining these functions in `pytorch/torch/compiler/__init__.py`

It might also make sense to split them over multiple files and import them in `__init__.py` but because the number of functions is small it'd probably be fine to add them all into a single compiler/__init__.py until this list becomes larger

1. `reset()`
2. `allow_in_graph()`
3. `list_backends()`
4. `compile()`: torch.compile() would be mostly a shell function passing arguments to torch.compiler.compile()
5. `assume_constant_result()`: TODO: Double check how this is useful
6. `torch._dynamo.disable()`

Some notable omissions
1. `explain()`: We need to clean up the output for this function, make it a data class and pretty printable
2. `forbid_in_graph()`: Considered adding this but should instead consolidate on `disallow_in_graph`
3. `optimize_assert()`: Already covered by `torch.compile(fullgraph=True)`
4. `check_if_dynamo_supported()`: this would be supplanted by pt2_enabled()
5. `compilation_metrics`, `graph_breaks_reasons`, etc.: would all be accessed via `torch.compiler.explain()`
6. `replay`: does not seem useful to end customers
7. `graph_break()`: Mostly useful for debugging or unit tests
8. `register_backend()`: End users will just pass a string backend to torch.compile, only devs will create new backends
9. `export()`: Eventually this needs to be public, but for now it's not ready, so just highlighting that it will be in the public API eventually
10. `disallow_in_graph()`: Usage is limited
11. `mark_static()`: we can keep this private until dynamic=True is recommended in stable
12. `mark_dynamic()`: we can keep this private until dynamic=True is recommended in trunk
13. `OptimizedModule`: This is the only class that we'd expose, but it is crucial since users are running code like `if isinstance(mod, OptimizedModule): torch.save(mod._orig_mod)` EDIT: because we fixed pickling we no longer need to expose this
14. `is_compiling()`: Still not clear how this is useful to end users

There are also config variables which we need to expose https://github.com/pytorch/pytorch/blob/main/torch/_dynamo/config.py

Some of our configs are useful dev flags, others gate experimental functionality, and others are essential debugging tools; we separate out the essential debugging and logging tools into a public-facing config.

TODO: I still need to think of a good way of porting the configs in a BC way; here are some ideas
1. Just make all passes available and controllable via `torch.compile(options={})` but only show docstrings for the ones users should care about.

The current problem with our config system is that we have 3 ways of setting configs: via `options={}`, environment variables, and variables in `config.py`; it'd be worth settling on one source of truth and having that be the public API.

The configs we should make public are
1. `log_file_name`
2. `verbose`
3. `cache_size_limit`
4. `repro_level` and `repro_after`: Although we can rename these to minifier and give human readable names to the levels

Everything else should stay private in particular

1. `print_graph_breaks`, `print_specializations`: should be supplanted by `explain()` for public users
2. dynamic shape configs : Users should only have to worry about `torch.compile(dynamic=True/False)`
3. The distributed flags, hook or guard configs: If we tell a user to use FSDP and DDP then the flag should be enabled by default or be in a private namespace
4. The fbcode flags: Obviously no need to be user facing
5. Skip/Allow lists: Not something normal users should play around with

#### From _inductor
Very little of inductor should be exposed in a public facing API; our core audience, i.e. people writing models, mostly just needs information on what certain passes mean and how to control them at a high level, and they can do this with `torch.compile(options={})`. So the goal here should be more to make the available passes clearer and ideally consolidate them into `torch.compile()` docstrings or modes.

There are some exceptions though from https://github.com/pytorch/pytorch/blob/main/torch/_inductor/__init__.py

1. `list_mode_options()`
2. `list_options()`: this needs an additional pass to hide internal or debug options

For both of these we'd rename them to `compiler.inductor_list_mode_options()` and `compiler.inductor_list_options()` since they would be in the same init file as the one for dynamo

Notable omissions
1. `_inductor.compile()`: Because users are coming in with their own fx graph, they are likely developers
2. `_inductor.aot_compile()`: Again, this is about capturing and modifying fx graphs, so these APIs don't need to be public

However the configs are a slightly different story, because we can choose to either
1. Make all configs public
2. Make some configs public and keep most of the private ones. If public config is set it should override the private version
3. Make all configs controllable via `torch.compile(options={})` but make list_options() hide more things

For now, 3 seems like the most reasonable choice, with some high-level configs we'll keep, like TORCH_COMPILE_DEBUG

Regardless here's what should probably be public or advertised more
1. `disable_progress` and `verbose_progress`: Combine and enable by default
2. `fallback_random`: We could make the case this shouldn't be public if a top level deterministic mode enables this
3. `profile_bandwidth`: Or could make the case that this should be in TORCH_COMPILE_DEBUG

Notable omissions
1. Any config that would generally improve performance for most that we should probably enable by default but might be disabled in the short term because of stability: example `epilogue_fusion`, `pattern_matcher`, `reordering`
2. Autotuning flags: Should just sit behind `torch.compile(mode="max-autotune")` like `max_autotune`, `max_autotune_gemm`
3. `coordinate_descent_tuning`: I'm a bit mixed about this one; maybe it should just also fall into `mode="max-autotune"`
4. `trace`: `TORCH_COMPILE_DEBUG` is the best flag for all of this
5. `triton.cudagraphs`: Default should be `torch.compile(mode="reduce-overhead")` - I'd go further and rename it to `mode=cudagraph`, and we can keep reduce-overhead for BC reasons
6. `triton_unique_kernel_names`: Mostly useful for devs debugging
7. `dce`: which doesn't really do anything
8. `shape_padding`: Elias is working on enabling this by default, in which case we also remove it

## Mechanics

This PR would include the public functions with their docstrings

Another PR will take a stab at the configs

And for work where the APIs are still being cleaned up, whether it's the minifier or escape hatches, export or dynamic shapes, aot_inductor, etc., we'll keep them private until a public commitment can be made

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102182
Approved by: https://github.com/jansel, https://github.com/albanD
2023-06-13 19:52:17 +00:00
Michael Lazos
6c6c897d6b Add graph break logging option instead of config flag (#103202)
Make graph break logging a logging option vs a config setting
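
A usage sketch of the new option (the kwarg name follows the `torch._logging.set_logs` artifact names):

```python
import torch._logging

# enable graph-break logs through the logging API rather than a config flag;
# the TORCH_LOGS="graph_breaks" environment variable is the equivalent switch
torch._logging.set_logs(graph_breaks=True)
```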

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103202
Approved by: https://github.com/yanboliang, https://github.com/anijain2305
2023-06-12 19:52:31 +00:00
shaoyf42
443edb9015 [DOCS][DDP] Fix the sample of saving and reloading PowerSGD state and hook. (#102721)
Fix the sample code for saving and reloading PowerSGD state and hook.
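
The pattern the fixed sample covers looks roughly like this (a hedged sketch; `ddp_model` is assumed to be an existing `DistributedDataParallel` instance):

```python
import torch
import torch.distributed.algorithms.ddp_comm_hooks.powerSGD_hook as powerSGD

state = powerSGD.PowerSGDState(process_group=None, matrix_approximation_rank=1)
ddp_model.register_comm_hook(state, powerSGD.powerSGD_hook)

# save both the model weights and the PowerSGD state
torch.save({"model": ddp_model.state_dict(), "powersgd": state}, "checkpoint.pt")

# in a fresh run: reload, then register the hook with the restored state
ckpt = torch.load("checkpoint.pt")
ddp_model.load_state_dict(ckpt["model"])
ddp_model.register_comm_hook(ckpt["powersgd"], powerSGD.powerSGD_hook)
```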

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102721
Approved by: https://github.com/H-Huang
2023-06-10 00:15:00 +00:00
Weiming Zhao
28f43c767c Fix outdated log settings in doc (#102285) (#102286)
Replace `torch._dynamo.config.loglevel=<level>` with `torch._logging.set_logs(dynamo=<level>)`
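
In code, the replacement looks like this:

```python
import logging
import torch._logging

# old, removed style:
#   torch._dynamo.config.loglevel = logging.INFO
# new style:
torch._logging.set_logs(dynamo=logging.INFO)
```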

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102286
Approved by: https://github.com/msaroufim, https://github.com/Neilblaze
2023-06-07 18:07:20 +00:00
David Berard
038955f489 torch.compile docs: "Profiling to understand torch.compile performance" (#102862)
Docs on how to use torch.profiler.profile to understand torch.compile performance.
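
A minimal sketch of that workflow (`fn` and `x` stand in for a CUDA model and its input; the warm-up call keeps compilation out of the trace):

```python
import torch
from torch.profiler import profile, ProfilerActivity

compiled_fn = torch.compile(fn)
compiled_fn(x)  # warm-up: compilation happens here, outside the profile

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
    compiled_fn(x)

print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```
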
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102862
Approved by: https://github.com/eellison
2023-06-06 22:00:36 +00:00
Eli Uriegas
e26f5b2ac7 docs: Render bullet points correctly (#103021)
This wasn't rendering correctly on the website; this should make it so that the bullet points actually show correctly now.

Signed-off-by: Eli Uriegas <eliuriegas@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103021
Approved by: https://github.com/albanD
2023-06-06 00:22:49 +00:00
Elias Ellison
4479e2fa19 fix profiling ref in side panel (#103014)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103014
Approved by: https://github.com/msaroufim
2023-06-05 21:19:51 +00:00
Elias Ellison
d89c719160 Fix torch.compile side panels refs (#102407)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102407
Approved by: https://github.com/msaroufim
2023-06-05 20:08:40 +00:00
PyTorch MergeBot
258d398eec Revert "torch.compiler public namespace (#102182)"
This reverts commit b5840f99c3.

Reverted https://github.com/pytorch/pytorch/pull/102182 on behalf of https://github.com/DanilBaibak due to Break internal build ([comment](https://github.com/pytorch/pytorch/pull/102182#issuecomment-1576144551))
2023-06-05 06:52:37 +00:00
Mark Saroufim
b5840f99c3 torch.compiler public namespace (#102182)
# torch.compiler public API

## Goal

The goal of this document is to describe the public facing API for torchdynamo and torchinductor.

Today both dynamo and torchinductor live in the `torch/_dynamo` and `torch/_inductor` namespaces. The only public function, `torch.compile()`, is placed directly in `torch/__init__.py`.

This poses a few problems for users trying to take dependencies on PyTorch 2.0
1. Unclear BC guarantees
2. No builtin discovery mechanism outside of reading the source code
3. No hard requirements for docstrings or type annotations

Most importantly, it mixes two personas, the PyTorch 2.0 developer vs. the PyTorch 2.0 customer, so this is an attempt to address that. We draw a lot of inspiration from the `functorch` migration to the `func` namespace.

## Alternate names

We did discuss some other alternative names

1. `torch.compile` -> the problem is this would break BC on the existing `torch.compile` function
2. `torch.dynamo` -> `dynamo` is so far not something we've deliberately hidden from users, but the problem is that figuring out what is `_dynamo` vs `dynamo` might be confusing
3. `torch.compiler` -> 1 would be better, but to keep BC this is a good compromise

# The general approach
## Proposal 1
In https://github.com/pytorch/pytorch/blob/main/torch/_dynamo/__init__.py

We have a function called `reset()`; this function is essential if users are trying to `torch.compile()` a model under different settings

```python
# in _dynamo/
def reset():
    do_reset_stuff()
```

Instead we propose

```python
# in compiler/
def reset():
    do_reset_stuff() # As in copy paste the logic from _dynamo.reset

# in _dynamo/
import warnings
import inspect

def reset():
    function_name = inspect.currentframe().f_code.co_name
    warnings.warn(f"{function_name} is deprecated, use compiler.{function_name} instead", DeprecationWarning)
    return compiler.reset()

```
## Proposal 2

```python
# in compiler/
def reset():
    """
    Docstrings here
    """
    _dynamo.reset()

# in _dynamo/
No changes
```
Consensus so far seems to be proposal 2 since fewer warnings will be less jarring and it’ll make it quite easy to merge the public API

## Docstrings

The above was an example of a function that has no inputs or outputs, but there are other functions whose docstrings could use an improvement. For example, allow_in_graph actually works over lists of functions, but that's not mentioned anywhere except the source code.

```python
def allow_in_graph(fn):
    """
    Customize which functions TorchDynamo will include in the generated
    graph. Similar to `torch.fx.wrap()`.

    Parameters:
        fn (callable or list/tuple): The function(s) to be allowed in the graph.

    Returns:
        callable or list/tuple: The input function(s) included in the graph.

    Examples:
        Customize inclusion of a single function:
        ::
            torch._dynamo.allow_in_graph(my_custom_function)

        Customize inclusion of multiple functions:
        ::
            torch._dynamo.allow_in_graph([my_custom_function1, my_custom_function2])

        @torch._dynamo.optimize(...)
        def fn(a):
            x = torch.add(a, 1)
            x = my_custom_function(x)
            x = torch.add(x, 1)
            return x

        fn(...)

    Notes:
        The `allow_in_graph` function allows customization of which functions TorchDynamo
        includes in the generated graph. It can be used to include specific functions that
        are not automatically captured by TorchDynamo.

        If `fn` is a list or tuple, `allow_in_graph` will be called recursively on each
        element in the sequence.

        Once a function is allowed in the graph using `allow_in_graph`, it will be captured
        in the graph generated by TorchDynamo. This customization enables more fine-grained
        control over the functions included in the graph.

        Note that `allow_in_graph` expects the input `fn` to be a callable.

    """
    if isinstance(fn, (list, tuple)):
        return [allow_in_graph(x) for x in fn]
    assert callable(fn), "allow_in_graph expects a callable"
    allowed_functions._allowed_function_ids.add(id(fn))
    allowed_functions._disallowed_function_ids.remove(id(fn))
    return fn
```

So to make the API public, we’d have to write similar docstrings for all public functions we’d like to create.

The benefit of this approach is that
1. No BC risks; internal and external users relying on our tooling can slowly wean off the private functions.
2. We will also have to write correct docstrings, which will automatically make our documentation easier to maintain and render correctly on pytorch.org
3. We already have some BC guarantees: we don't kill OptimizedModule, and we rejected the PR to change the config system

The con of this approach is that we will be stuck with some potentially suboptimal functions/classes that we can't kill.

## Testing strategy
If the approach is to mostly make a public function call an already tested private function, then all we need to do is ensure that the function signatures don't change

## Which functions should be in the public API

Our heuristic for deciding whether something should be public or not is: are users already relying on it for lack of other options, or have we recommended some non-public functions for users to debug their PT 2.0 programs?

The heuristic for not putting something in public is that it's an experimental subsystem with the goal of turning it on by default, it's very core-dev centric or Meta centric, it's a bunch of different configs that should be batched into a single user-facing one, or it's something that needs to be renamed because the name is confusing

#### Top level
`torch.compile()` -> already is a public API; it does require some minor improvements like having configs be passed in to any backend and not just inductor (EDIT: This was already done https://github.com/pytorch/pytorch/pull/99645) and renaming `mode=reduce-overhead` to `mode=cudagraph`

To make sure that PT 2.0 is supported with a given pytorch version, users can create a new public function; this would replace the need for the `try/except` blocks around `import torch._dynamo` that have been populating user code.

```python
def pt2_enabled():
    if hasattr(torch, 'compile'):
        return True
    else:
        return False
```

For all of the below they will be translated to `torch.compiler.function_name()`

#### From _dynamo

As a starting point we looked at https://github.com/pytorch/pytorch/blob/main/torch/_dynamo/__init__.py and we suggest redefining these functions in `pytorch/torch/compiler/__init__.py`

It might also make sense to split them over multiple files and import them in `__init__.py` but because the number of functions is small it'd probably be fine to add them all into a single compiler/__init__.py until this list becomes larger

1. `reset()`
2. `allow_in_graph()`
3. `list_backends()`
4. `compile()`: torch.compile() would be mostly a shell function passing arguments to torch.compiler.compile()
5. `assume_constant_result()`: TODO: Double check how this is useful
6. `torch._dynamo.disable()`

Some notable omissions
1. `explain()`: We need to clean up the output for this function, make it a data class and pretty printable
2. `forbid_in_graph()`: Considered adding this but should instead consolidate on `disallow_in_graph`
3. `optimize_assert()`: Already covered by `torch.compile(fullgraph=True)`
4. `check_if_dynamo_supported()`: this would be supplanted by pt2_enabled()
5. `compilation_metrics`, `graph_breaks_reasons`, etc.: would all be accessed via `torch.compiler.explain()`
6. `replay`: does not seem useful to end customers
7. `graph_break()`: Mostly useful for debugging or unit tests
8. `register_backend()`: End users will just pass a string backend to torch.compile, only devs will create new backends
9. `export()`: Eventually this needs to be public, but for now it's not ready, so just highlighting that it will be in the public API eventually
10. `disallow_in_graph()`: Usage is limited
11. `mark_static()`: we can keep this private until dynamic=True is recommended in stable
12. `mark_dynamic()`: we can keep this private until dynamic=True is recommended in trunk
13. `OptimizedModule`: This is the only class that we'd expose, but it is crucial since users are running code like `if isinstance(mod, OptimizedModule): torch.save(mod._orig_mod)` EDIT: because we fixed pickling we no longer need to expose this
14. `is_compiling()`: Still not clear how this is useful to end users

There are also config variables which we need to expose https://github.com/pytorch/pytorch/blob/main/torch/_dynamo/config.py

Some of our configs are useful dev flags, others gate experimental functionality, and others are essential debugging tools; we separate out the essential debugging and logging tools into a public-facing config.

TODO: I still need to think of a good way of porting the configs in a BC way; here are some ideas
1. Just make all passes available and controllable via `torch.compile(options={})` but only show docstrings for the ones users should care about.

The current problem with our config system is that we have 3 ways of setting configs: via `options={}`, environment variables, and variables in `config.py`; it'd be worth settling on one source of truth and having that be the public API.

The configs we should make public are
1. `log_file_name`
2. `verbose`
3. `cache_size_limit`
4. `repro_level` and `repro_after`: Although we can rename these to minifier and give human readable names to the levels

Everything else should stay private in particular

1. `print_graph_breaks`, `print_specializations`: should be supplanted by `explain()` for public users
2. dynamic shape configs : Users should only have to worry about `torch.compile(dynamic=True/False)`
3. The distributed flags, hook or guard configs: If we tell a user to use FSDP and DDP then the flag should be enabled by default or be in a private namespace
4. The fbcode flags: Obviously no need to be user facing
5. Skip/Allow lists: Not something normal users should play around with

#### From _inductor
Very little of inductor should be exposed in a public facing API; our core audience, i.e. people writing models, mostly just needs information on what certain passes mean and how to control them at a high level, and they can do this with `torch.compile(options={})`. So the goal here should be more to make the available passes clearer and ideally consolidate them into `torch.compile()` docstrings or modes.

There are some exceptions though from https://github.com/pytorch/pytorch/blob/main/torch/_inductor/__init__.py

1. `list_mode_options()`
2. `list_options()`: this needs an additional pass to hide internal or debug options

For both of these we'd rename them to `compiler.inductor_list_mode_options()` and `compiler.inductor_list_options()` since they would be in the same init file as the one for dynamo

Notable omissions
1. `_inductor.compile()`: Because users are coming in with their own fx graph, they are likely developers
2. `_inductor.aot_compile()`: Again, this is about capturing and modifying fx graphs, so these APIs don't need to be public

However the configs are a slightly different story, because we can choose to either
1. Make all configs public
2. Make some configs public and keep most of the private ones. If public config is set it should override the private version
3. Make all configs controllable via `torch.compile(options={})` but make list_options() hide more things

For now, 3 seems like the most reasonable choice, with some high-level configs we'll keep, like TORCH_COMPILE_DEBUG

Regardless here's what should probably be public or advertised more
1. `disable_progress` and `verbose_progress`: Combine and enable by default
2. `fallback_random`: We could make the case this shouldn't be public if a top level deterministic mode enables this
3. `profile_bandwidth`: Or could make the case that this should be in TORCH_COMPILE_DEBUG

Notable omissions
1. Any config that would generally improve performance for most that we should probably enable by default but might be disabled in the short term because of stability: example `epilogue_fusion`, `pattern_matcher`, `reordering`
2. Autotuning flags: Should just sit behind `torch.compile(mode="max-autotune")` like `max_autotune`, `max_autotune_gemm`
3. `coordinate_descent_tuning`: I'm a bit mixed about this one; maybe it should just also fall into `mode="max-autotune"`
4. `trace`: `TORCH_COMPILE_DEBUG` is the best flag for all of this
5. `triton.cudagraphs`: Default should be `torch.compile(mode="reduce-overhead")` - I'd go further and rename it to `mode=cudagraph`, and we can keep reduce-overhead for BC reasons
6. `triton_unique_kernel_names`: Mostly useful for devs debugging
7. `dce`: which doesn't really do anything
8. `shape_padding`: Elias is working on enabling this by default, in which case we also remove it

## Mechanics

This PR would include the public functions with their docstrings

Another PR will take a stab at the configs

And for work where the APIs are still being cleaned up, whether it's the minifier or escape hatches, export or dynamic shapes, aot_inductor, etc., we'll keep them private until a public commitment can be made

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102182
Approved by: https://github.com/jansel
2023-06-02 14:38:55 +00:00
Weiming Zhao
b76af5f9a6 Fix broken link in Dynamo's guards doc (#102183) (#102185)
This PR fixes broken link for the code referenced in the guards doc.

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102185
Approved by: https://github.com/mikaylagawarecki, https://github.com/ezyang
2023-06-02 14:36:28 +00:00
Thomas J. Fan
0d17bd5fa4 DOC Fixes unpacking issue in dynamo explain docs (#101761)
This PR updates the docs to be consistent with `torch._dynamo.explain`, which currently returns 6 items:

bfb3941ad8/torch/_dynamo/eval_frame.py (L622-L629)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101761
Approved by: https://github.com/desertfire
2023-05-25 22:32:15 +00:00
Elias Ellison
aa83a52742 Profiling doc (#101895)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101895
Approved by: https://github.com/msaroufim, https://github.com/shunting314
2023-05-25 04:57:38 +00:00
Elias Ellison
4692ea76a0 Fine grained apis docs (#101897)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101897
Approved by: https://github.com/msaroufim
2023-05-23 19:03:44 +00:00
Elias Ellison
2bce7c8f46 CUDAGraph trees doc (#101902)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101902
Approved by: https://github.com/msaroufim
2023-05-23 03:35:43 +00:00
Ramil Nugmanov
2ae87a1f87 missed StackDataset documentation (#101927)
New dataset class added by #101338 missed in documentation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101927
Approved by: https://github.com/kit1980
2023-05-22 21:12:16 +00:00
Ren Pang
a630328695 Fix Backend docs search items (#101214)
Fixes #100944

## New

<img width="1142" alt="image" src="https://github.com/pytorch/pytorch/assets/13214530/79102f2e-8a8f-4169-be53-9248397e653c">

<img width="765" alt="image" src="https://github.com/pytorch/pytorch/assets/13214530/4e5f17e7-a445-4822-ac8a-0d73c9ed71ee">

## Old

<img width="1341" alt="image" src="https://github.com/pytorch/pytorch/assets/13214530/985b4ec9-6d11-4962-8619-3c14ec09c3d9">

<img width="1112" alt="image" src="https://github.com/pytorch/pytorch/assets/13214530/e8dcf1a9-73e7-4fd6-8adc-eb036b1bb87b">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101214
Approved by: https://github.com/albanD
2023-05-22 14:58:38 +00:00
Rickey K. Liang
807d81155f [CUDA][CUBLAS] Fix BF16 reduced precision reduction note in Numerical accuracy docs (#101884)
Fixes #100966

Ref #101044

Align implementation and documentation. (This is what was previously missed from the above issue and PR.)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101884
Approved by: https://github.com/eqy, https://github.com/ezyang
2023-05-21 17:38:00 +00:00
Mark Saroufim
3666ca9d97 Dynamic Shape Doc (#101885)
<!--
copilot:poem
-->
### <samp>🤖 Generated by Copilot at 2f25c1e</samp>

> _Dynamic shapes guide_
> _`TorchDynamo` and `TorchInductor`_
> _Learn from data flow_

Thanks @ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101885
Approved by: https://github.com/eellison, https://github.com/ezyang
2023-05-19 21:43:22 +00:00
Mark Saroufim
ff5b9428aa Fake Tensor Docs (#101882)
<!--
copilot:poem
-->
### <samp>🤖 Generated by Copilot at 75f33ae</samp>

> _Fake tensors help_
> _compile and optimize code_
> _`PT2` in autumn_

Thanks @ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101882
Approved by: https://github.com/eellison, https://github.com/ezyang
2023-05-19 21:39:34 +00:00
Mark Saroufim
581d13a069 Add Logging Doc to compile index (#101888)
<!--
copilot:poem
-->
### <samp>🤖 Generated by Copilot at ba85a41</samp>

> _`logging` module_
> _documents PyTorch events_
> _cutting through the fog_

Thanks @mlazos
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101888
Approved by: https://github.com/eellison
2023-05-19 21:29:25 +00:00
Mark Saroufim
2dd33c71c1 Docs for torchcompile and functorch (#101881)
<!--
copilot:poem
-->
### <samp>🤖 Generated by Copilot at b5f48b6</samp>

> _`torch.compile` docs_
> _Add a new section for `func`_
> _Winter of features_

Thanks @zou3519
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101881
Approved by: https://github.com/eellison, https://github.com/zou3519
2023-05-19 21:23:43 +00:00
Jane Xu
cde597efa1 [docs] Warn that GradScaler can scale under 1 (#101569)
Completes action item 1 in https://github.com/pytorch/pytorch/issues/99640

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101569
Approved by: https://github.com/ngimel
2023-05-16 23:56:07 +00:00
PyTorch MergeBot
66eef31444 Revert "[fx] change from #users to num_users in graph printout (#101140)"
This reverts commit e568c5a18d.

Reverted https://github.com/pytorch/pytorch/pull/101140 on behalf of https://github.com/jeanschmidt due to There are internal changes to this commit that are preventing landing, so I am reverting to unblock the diff train ([comment](https://github.com/pytorch/pytorch/pull/101140#issuecomment-1547989487))
2023-05-15 14:35:22 +00:00
Ramin Azarmehr
0be53d83fc [MPS] Add support for MPSProfiler Python bindings (#101002)
- Added torch.mps.profiler.[start() and stop()] APIs with RST documentation
- Added test case in test_mps
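
Usage looks roughly like this (a sketch; requires a macOS build with MPS available):

```python
import torch

torch.mps.profiler.start(mode="interval", wait_until_completed=False)
x = torch.randn(1024, 1024, device="mps")
y = x @ x  # some MPS work to capture in the profile
torch.mps.profiler.stop()
```
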
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101002
Approved by: https://github.com/malfet
2023-05-12 21:55:34 +00:00
Yueming Hao
a12b640dc9 Fix typos in troubleshooting.rst (#101305)
There are several typos in the troubleshooting documentation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101305
Approved by: https://github.com/desertfire
2023-05-12 21:05:13 +00:00
Ran Ding
b5c8d0359c Update autograd.rst (#101007)
Fixes #ISSUE_NUMBER

typo fix and small change to improve clarity

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101007
Approved by: https://github.com/lezcano, https://github.com/anjali411
2023-05-12 11:47:51 +00:00
Michael Suo
e568c5a18d [fx] change from #users to num_users in graph printout (#101140)
`#users` has a special meaning in various chat apps, which makes it annoying to copy-paste graphs into them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101140
Approved by: https://github.com/ezyang
2023-05-12 04:34:01 +00:00
eqy
33f3dca6b5 [CUDA][CUBLAS] Fix BF16 reduced precision reduction note in docs (#101044)
#100966

CC @ngimel @ezyang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101044
Approved by: https://github.com/ngimel
2023-05-10 06:50:58 +00:00
eqy
6e2efd16d8 [CUDA][CUBLAS] Add cuBLAS workspace allocation behavior to docs (#100919)
Adding to the docs for now, hopefully we can move to `cudaMallocAsync`-backed cuBLAS workspaces soon which should alleviate the recent confusion around `cuBLAS` "leaking" memory through workspaces.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100919
Approved by: https://github.com/ngimel
2023-05-10 06:40:26 +00:00
fduwjj
953aa6d90e [TP] Enable more generic attn in Tensor Parallelism (#100508)
To make TP more generic for the Attention module, we came up with this new col/rowwise parallel style.

Basically, the idea behind it is:
We only do DTensor ops for the col/rowwise-sharded part. For the rest of the ATen ops, we leave it to plain Tensor ops.

And we set this behavior as the default for the Colwise and Rowwise parallel styles. If people want to customize it, they can always pass in a different prepare_input or prepare_output.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100508
Approved by: https://github.com/wanchaol
2023-05-07 18:15:49 +00:00
Michael Lazos
850556ed6e Add "all" option to logging (#100664)
Adds the long-promised "all" option to logging.
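
A usage sketch (assuming the keyword mirrors the `TORCH_LOGS="all"` env-var spelling):

```python
import logging
import torch._logging

# one switch to turn on every registered log
torch._logging.set_logs(all=logging.DEBUG)
```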

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100664
Approved by: https://github.com/lezcano
2023-05-06 01:11:18 +00:00
Michael Lazos
c525440ba3 Logging documentation updates (#100595)
Updated the logging.rst with info about the env var.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100595
Approved by: https://github.com/msaroufim, https://github.com/lezcano
2023-05-04 21:54:02 +00:00
Animesh Jain
8994d9e610 [dynamo] Hide guard_fail_hook behind a flag to improve cache lookup time (+10% DebertaV2) (#100590)
For TorchDynamo eager backend, DebertaV2 speedup improves from 0.77x to 0.87x.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100590
Approved by: https://github.com/voznesenskym, https://github.com/wconstab
2023-05-04 18:52:21 +00:00
Bin Bao
edebad81a9 Add a rst doc for the performance dashboard (#100592)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100592
Approved by: https://github.com/msaroufim, https://github.com/huydhn
2023-05-04 18:28:09 +00:00
Richard Barnes
9c185b6b46 [codemod] Replace hasattr with getattr in caffe2/docs/source/notes/extending.rst (#100598)
Summary:
The pattern
```
X.Y if hasattr(X, "Y") else Z
```
can be replaced with
```
getattr(X, "Y", Z)
```

The [getattr](https://www.w3schools.com/python/ref_func_getattr.asp) function gives more succinct code than the [hasattr](https://www.w3schools.com/python/ref_func_hasattr.asp) function. Please use it when appropriate.

**This diff is very low risk. Green tests indicate that you can safely Accept & Ship.**

Test Plan: Sandcastle

Differential Revision: D44886464

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100598
Approved by: https://github.com/Skylion007
2023-05-04 16:36:15 +00:00
Angela Yi
8eb82135d1 [docs] Docs for writing ATen IR passes + FX Pattern matching (#100577)
I'm not really sure where to put this...maybe just link it somewhere in torch.compile docs?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100577
Approved by: https://github.com/msaroufim
2023-05-04 05:17:10 +00:00
shibo
6aeb85add8 add checkpoint support for custom device (#99626)
Fixes #ISSUE_NUMBER
1. add checkpoint support for custom devices
2. add a device argument: I wanted to add a device="cuda" parameter to the `forward` func of `CheckpointFunction` so the device type can be specified when using it, but the `apply` func of `torch.autograd.Function` does not support `kwargs`, so I added a variable named `_device`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99626
Approved by: https://github.com/soulitzer
2023-05-04 00:23:42 +00:00
vfdev-5
6a12f10b08 Publicly exposing torch.backends.cpu.get_cpu_capability() (#100164)
Description:

- As suggested by Nikita, created `torch.backends.cpu` submodule and exposed `get_cpu_capability`.

- In the torchvision Resize method we want to know the current CPU capability in order to pick an appropriate codepath depending on CPU capabilities

The newly coded vectorized resize of uint8 images on AVX2-supported CPUs is now faster than the older way (uint8 -> float -> resize -> uint8). However, on non-AVX hardware (e.g. Mac M1) certain configs are slower using native uint8.
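
A usage sketch of the newly exposed function (it returns a string such as "AVX2"; the exact set of values depends on the build):

```python
import torch

capability = torch.backends.cpu.get_cpu_capability()
if capability in ("AVX2", "AVX512"):
    pass  # take the vectorized uint8 resize codepath
else:
    pass  # fall back to the uint8 -> float -> resize -> uint8 path
```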

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100164
Approved by: https://github.com/albanD, https://github.com/malfet
2023-05-03 19:02:07 +00:00
Svetlana Karslioglu
d425da8bf3 Replace master with main in links and docs/conf.py (#100176)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100176
Approved by: https://github.com/albanD, https://github.com/malfet
2023-05-02 18:20:32 +00:00
Hirochika Matsumoto
f143c92739 [docs] Fix typo in get-started.rst (#100355)
This PR changes `""nvprims_nvfuser"`, which is a typo, to `"nvprims_nvfuser"`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100355
Approved by: https://github.com/Skylion007, https://github.com/kit1980
2023-05-02 00:29:53 +00:00
BowenBao
c94b6a6712 [ONNX] Introduce 'diagnostics' to 'dynamo_export' api (#99668)
Summary
* Introduce `DiagnosticContext` to `torch.onnx.dynamo_export`.
* Remove `DiagnosticEngine` in preparation for updating 'diagnostics' in `dynamo_export` to drop dependencies on the global diagnostic context. No plans to update `torch.onnx.export` diagnostics.

Next steps
* Separate `torch.onnx.export` diagnostics and `torch.onnx.dynamo_export` diagnostics.
* Drop dependencies on global diagnostic context. https://github.com/pytorch/pytorch/pull/100219
* Replace 'print's with 'logger.log'.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99668
Approved by: https://github.com/justinchuby, https://github.com/abock
2023-05-01 19:58:49 +00:00
pbialecki
8fe91d16b0 Remove CUDA 11.6 note from complex docs (#100118)
Removes note in the complex docs pointing to the CUDA 11.6 wheels introduced in https://github.com/pytorch/pytorch/pull/80363.
Background: this warning was added via https://github.com/pytorch/pytorch/issues/79876, which pointed out a slow compilation time in 11.3. The 11.6 pip wheels were thus recommended but are not built anymore, as our current support is 11.7, 11.8 (and 12.1 experimental in nightlies).

The note is confusing users as it doesn't explain why 11.6 is needed.
Reference: https://discuss.pytorch.org/t/complex-numbers-cuda-11-6-documentation-warning/178588/1

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100118
Approved by: https://github.com/msaroufim
2023-04-27 16:26:27 +00:00
milesial
45bf3f6216 Optimized EMA implementation (#94820)
This PR proposes an optimized way to do Exponential Moving Average (EMA), which is faster than the current way using `swa_utils.AveragedModel` described in https://pytorch.org/docs/stable/optim.html#custom-averaging-strategies.

This implementation is asynchronous, and is built as an optimizer wrapper so that the EMA weight update happens without any additional CPU/GPU sync, just after optimizer steps, and with limited code changes.

Example usage:
```
model = Model().to(device)
opt = torch.optim.Adam(model.parameters())

opt = EMAOptimizer(opt, device, 0.9999)

for epoch in range(epochs):
    training_loop(model, opt)

    regular_eval_accuracy = evaluate(model)

    with opt.swap_ema_weights():
        ema_eval_accuracy = evaluate(model)
```

Here are some benchmarks (time per iteration) on various torchvision models:

|model|this PR iteration time|swa_utils.AveragedModel iteration time|iteration speedup|
|-----|----------------------|--------------------------------------|-----------------|
|regnet_x_1_6gf|62.73                        |67.998                 |1.08                                         |
|regnet_x_3_2gf|101.75                       |109.422                |1.08                                         |
|regnet_x_400mf|25.13                        |32.005                 |1.27                                         |
|regnet_x_800mf|33.01                        |37.466                 |1.13                                         |
|regnet_x_8gf|128.13                       |134.868                |1.05                                         |
|regnet_y_16gf|252.91                       |261.292                |1.03                                         |
|regnet_y_1_6gf|72.14                        |84.22                  |1.17                                         |
|regnet_y_3_2gf|99.99                        |109.296                |1.09                                         |
|regnet_y_400mf|29.53                        |36.506                 |1.24                                         |
|regnet_y_800mf|37.82                        |43.634                 |1.15                                         |
|regnet_y_8gf|196.63                       |203.317                |1.03                                         |
|resnet101|128.80                       |137.434                |1.07                                         |
|resnet152|182.85                       |196.498                |1.07                                         |
|resnet18|29.06                        |29.975                 |1.03                                         |
|resnet34|50.73                        |53.443                 |1.05                                         |
|resnet50|76.88                        |80.602                 |1.05                                         |
|resnext101_32x8d|277.29                       |280.759                |1.01                                         |
|resnext101_64x4d|269.56                       |281.052                |1.04                                         |
|resnext50_32x4d|100.73                       |101.102                |1.00                                         |
|shufflenet_v2_x0_5|10.56                        |15.419                 |1.46                                         |
|shufflenet_v2_x1_0|13.11                        |18.525                 |1.41                                         |
|shufflenet_v2_x1_5|18.05                        |23.132                 |1.28                                         |
|shufflenet_v2_x2_0|25.04                        |30.008                 |1.20                                         |
|squeezenet1_1|14.26                        |14.325                 |1.00                                         |
|swin_b|264.52                       |274.613                |1.04                                         |
|swin_s|180.66                       |188.914                |1.05                                         |
|swin_t|108.62                       |112.632                |1.04                                         |
|swin_v2_s|220.29                       |231.153                |1.05                                         |
|swin_v2_t|127.27                       |133.586                |1.05                                         |
|vgg11|95.52                        |103.714                |1.09                                         |
|vgg11_bn|106.49                       |120.711                |1.13                                         |
|vgg13|132.94                       |147.063                |1.11                                         |
|vgg13_bn|149.73                       |165.256                |1.10                                         |
|vgg16|158.19                       |172.865                |1.09                                         |
|vgg16_bn|177.04                       |192.888                |1.09                                         |
|vgg19|184.76                       |194.194                |1.05                                         |
|vgg19_bn|203.30                       |213.334                |1.05                                         |
|vit_b_16|217.31                       |219.748                |1.01                                         |
|vit_b_32|69.47                        |75.692                 |1.09                                         |
|vit_l_32|223.20                       |258.487                |1.16                                         |
|wide_resnet101_2|267.38                       |279.836                |1.05                                         |
|wide_resnet50_2|145.06                       |154.918                |1.07                                         |

You can see that in all cases it is faster than using `AveragedModel`. In fact in many cases, adding EMA does not add any overhead since the computation is hidden behind the usual iteration flow.

This is a similar implementation to the one currently in [NVIDIA NeMo](https://github.com/NVIDIA/NeMo).

If the team is interested in merging this, let me know and I'll add some documentation similar to `swa_utils` and tests.

Credits to @szmigacz for the implementation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94820
Approved by: https://github.com/janeyx99
2023-04-26 18:02:11 +00:00
Chris Gottbrath
f0e28b1cb9 Adding the maintainers approved in 2023Q1 Core Maintainers meeting (#98520)
Added Nikita to Core Maintainers
Merged MKLDNN with CPU Performance
Renamed CUDA to GPU Performance
Added Jiong to Compiler and CPU Performance
Added Xiaobing to CPU Performance
Marking Vitaly and Jian Hui as Emeritus
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98520
Approved by: https://github.com/ezyang, https://github.com/soumith, https://github.com/dzhulgakov
2023-04-24 17:58:18 +00:00
Kurt Mohler
1e8cf6ad7f Add documentation for torch._logging.set_logs (#99219)
Part of #98871

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99219
Approved by: https://github.com/mlazos, https://github.com/lezcano
2023-04-24 08:06:57 +00:00
BowenBao
51742a467d [ONNX] Fix missing import numpy for docs example (#99663)
Fixes https://github.com/pytorch/pytorch/issues/99408
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99663
Approved by: https://github.com/justinchuby
2023-04-21 04:06:45 +00:00
Simon Seo
9f95032101 Fix broken links in contribution_guide.rst (#99295)
mainly from `master` to `main`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99295
Approved by: https://github.com/kit1980
2023-04-20 22:20:56 +00:00
Will Constable
e6aa8e0729 Test and document dynamo backward hooks support (#99382)
No new support added, but backward hooks are working and now there is a test and some documentation about the limitations (hooks firing after whole graph).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99382
Approved by: https://github.com/yanboliang
2023-04-18 03:03:29 +00:00
Will Constable
6eab5e88c8 Graph-break on allowed modules if they have hooks (#97184)
Allowed modules are stuck into dynamo's fx graph as call_module
nodes, without dynamo doing any tracing of the module.  This means
during AOT trace time, hooks will fire during tracing when the
call_module is executed, but the hooks themselves will disappear
after that and not be present in the compiled program.
  (worse, if they performed any tensor operations, those would get
   traced so you could end up with part of the hook's functionality).

To circumvent this, there are two options for 'allowed modules' with hooks.
1) don't treat them as 'allowed' - trace into them
2) graph-break, so the module is no longer part of the dynamo trace at all

(1) will fail for users that opted into allowed modules because they know
    their module has problems being traced by dynamo.
(2) causes graph breaks on common modules such as nn.Linear, just because they
    are marked as 'allowed'.

It would help matters if we could differentiate between types of allowed modules
  (A) allowed to avoid overheads - used for common ops like nn.Linear
  (B) allowed to avoid dynamo graphbreaks caused by unsupported code

Ideally, we'd use method (1) for group (A) and (2) for (B).

For now, graph-break on all cases of allowed modules.
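
A hedged sketch of the resulting behavior (toy example):

```python
import torch

mod = torch.nn.Linear(4, 4)  # nn.Linear is an 'allowed' module
mod.register_forward_hook(lambda m, inp, out: out + 1)

@torch.compile(backend="eager")
def fn(x):
    # dynamo graph-breaks at this call because the allowed module has a hook,
    # so the hook runs in eager mode instead of silently disappearing
    return mod(x)

fn(torch.randn(2, 4))
```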

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97184
Approved by: https://github.com/jansel
2023-04-15 01:46:15 +00:00
BowenBao
606ce5b653 [ONNX] Introduce Input/Output adapter; Switch to 'DynamoExporter' (#98421)
Summary
* Introduce an input/output adapter. Due to design differences, the input/output
formats of a PyTorch model and its exported ONNX model are often not the same. E.g.,
`None` inputs are allowed for a PyTorch model but are not supported by ONNX; nested
constructs of tensors are allowed for a PyTorch model, but only flattened tensors are
supported by ONNX; etc. The new input/output adapter is exported with the model,
providing an interface to automatically convert and validate the input/output formats.
* As suggested by #98251, provide an extension point for unwrapping user-defined Python
classes for the `dynamo.export`-based exporter. This unblocks HuggingFace models.
* Re-wire tests to run through `DynamoExporter` with the `dynamo_export` API.
`DynamoOptimizeExporter` is kept in the tests for now for coverage of this change.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98421
Approved by: https://github.com/justinchuby, https://github.com/titaiwangms, https://github.com/thiagocrepaldi
2023-04-15 01:13:00 +00:00
PyTorch MergeBot
dda7ce4bb3 Revert "[core][pruning][be] Rename sparsifier folder to pruner (#98758)"
This reverts commit 778fd1922a.

Reverted https://github.com/pytorch/pytorch/pull/98758 on behalf of https://github.com/jcaip due to https://www.internalfb.com/diff/D44905951 need to fix broken import in fbcode
2023-04-13 16:30:47 +00:00
Tugsbayasgalan Manlaibaatar
39fd7f945f Add SymBool support in Python to C++ translation (#98453)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98453
Approved by: https://github.com/ezyang
2023-04-12 03:21:57 +00:00
Mark Saroufim
bc8cb62bcb torch.compile benchmark utility (#97699)
I've had many exchanges that look like https://github.com/rasbt/faster-pytorch-blog/pull/2, so this is an attempt to make this problem easier.
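
As an illustration of the kind of measurement involved (not necessarily the exact utility this PR adds), a before/after timing sketch with the existing `torch.utils.benchmark.Timer`:

```python
import torch
import torch.utils.benchmark as benchmark

model = torch.nn.Linear(512, 512)
x = torch.randn(64, 512)
compiled = torch.compile(model)
compiled(x)  # warm up so one-time compilation cost is excluded

for label, fn in [("eager", model), ("compiled", compiled)]:
    t = benchmark.Timer(stmt="fn(x)", globals={"fn": fn, "x": x})
    print(label, t.blocked_autorange())
```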

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97699
Approved by: https://github.com/ezyang
2023-04-12 03:02:06 +00:00
soulitzer
367051e47e [docs] Add missing functions to autograd.rst (#98854)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98854
Approved by: https://github.com/albanD
2023-04-11 20:45:49 +00:00
Jesse Cai
778fd1922a [core][pruning][be] Rename sparsifier folder to pruner (#98758)
Summary:
att

Test Plan:
```
python test/test_ao_sparsity.py
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98758
Approved by: https://github.com/jerryzh168
2023-04-11 17:26:29 +00:00
Edward Z. Yang
b8b840be3d Convert logging f-strings to use % format, part five (#98765)
This does some annoying but simple cases by hand.
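
For reference, the kind of conversion this does:

```python
import logging

log = logging.getLogger(__name__)
n, path = 42, "/tmp/data"

# before: the f-string is always formatted, even when INFO logging is disabled
log.info(f"loaded {n} items from {path}")

# after: %-style arguments are only formatted if the record is actually emitted
log.info("loaded %s items from %s", n, path)
```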

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98765
Approved by: https://github.com/wanchaol
2023-04-11 13:17:59 +00:00
Guspan Tanadi
ab385bd49e docs: Linking ResNeXt PyTorch Hub Pipeline (#98689)
Introduces the ResNeXt model as a link to PyTorch Hub; see the Skip connections section.
Handles the issue reported in #98690.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98689
Approved by: https://github.com/zou3519, https://github.com/kit1980
2023-04-11 02:20:26 +00:00
Will Constable
390c51bf87 Skip nnmodule hook guards by default (#98371)
This PR makes basic nnmodule forward hooks work by default, without any overhead. But it leaves silent correctness issues if users modify or remove their hooks later, so it also emits a warning.

- the usual case is to not use hooks, so avoid guard overhead here
- registering any hook before compile will trigger a warning about hook support
- registering a hook later (or removing one) requires user knowledge and opting in,
  currently this isn't warnable (but maybe we can observe compiled nnmodules to make it
  warnable).

Why skip hook guards by default instead of not tracing __call__/hooks by default?
- avoid having a mode flag that alters dynamo tracing behavior (harder to test both codepaths
  in CI with full coverage)
- the most basic hook use case (registering a hook before compile and never removing it,
  as in the sketch below) will work by default with this PR, whereas it would require
  explicit enablement and incur overhead under the 'not tracing __call__' proposal.
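
A minimal sketch of that basic use case:

```python
import torch

model = torch.nn.Linear(8, 8)
# registered before compile and never removed: works by default, no guard overhead
model.register_forward_hook(lambda m, inp, out: out.relu())

compiled = torch.compile(model, backend="eager")
compiled(torch.randn(2, 8))
# removing or replacing the hook after compiling is not guarded, so the
# compiled program may silently keep the old behavior (hence the warning)
```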

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98371
Approved by: https://github.com/jansel
2023-04-07 15:10:51 +00:00
BJ Hargrave
555ab310dc Add itemsize and nbytes properties to Tensor (#98322)
Adds properties for itemsize and nbytes to Tensor matching the properties in NumPy.
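
For example:

```python
import torch

t = torch.zeros(2, 3, dtype=torch.float32)
print(t.itemsize)  # 4: bytes per float32 element, as in numpy.ndarray.itemsize
print(t.nbytes)    # 24: 6 elements * 4 bytes, as in numpy.ndarray.nbytes
```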

Fixes https://github.com/pytorch/pytorch/issues/12728

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98322
Approved by: https://github.com/ezyang
2023-04-05 12:11:55 +00:00
Aaron Bockover
558e5a240e Introduce torch.onnx.dynamo_export API (#97920)
This is the first phase of the new ONNX exporter API for exporting from TorchDynamo and FX, and represents the beginning of a new era for exporting ONNX from PyTorch.

The API here is a starting point upon which we will layer more capability and expressiveness in subsequent phases. This first phase introduces the following into `torch.onnx`:

```python
dynamo_export(
    model: torch.nn.Module,
    /,
    *model_args,
    export_options: Optional[ExportOptions] = None,
    **model_kwargs,
) -> ExportOutput:
    ...

class ExportOptions:
    opset_version: Optional[int] = None
    dynamic_shapes: Optional[bool] = None
    logger: Optional[logging.Logger] = None

class ExportOutputSerializer(Protocol):
    def serialize(
        self,
        export_output: ExportOutput,
        destination: io.BufferedIOBase,
    ) -> None:
        ...

class ExportOutput:
    model_proto: onnx.ModelProto

    def save(
        self,
        destination: Union[str, io.BufferedIOBase],
        *,
        serializer: Optional[ExportOutputSerializer] = None,
    ) -> None:
        ...
```
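
A minimal usage sketch of the API above (the model and inputs are illustrative):

```python
import torch

class MLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(8, 2)

    def forward(self, x):
        return torch.nn.functional.relu(self.fc(x))

export_output = torch.onnx.dynamo_export(MLP(), torch.randn(2, 8))
export_output.save("mlp.onnx")  # uses the default ExportOutputSerializer
```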

In addition to the API in the first commit on this PR, we have a few experiments for exporting Dynamo and FX to ONNX; this PR rationalizes them through the new Exporter API and adjusts their tests to use the new API.

- A base `FXGraphModuleExporter` exporter from which all derive:
  - `DynamoExportExporter`: uses dynamo.export to acquire FX graph
  - `DynamoOptimizeExporter`: uses dynamo.optimize to acquire FX graph
  - `FXSymbolicTraceExporter`: uses FX symbolic tracing

The `dynamo_export` API currently uses `DynamoOptimizeExporter`.

### Next Steps (subsequent PRs):

* Combine `DynamoExportExporter` and `DynamoOptimizeExporter` into a single `DynamoExporter`.
* Make it easy to test `FXSymbolicTraceExporter` through the same API; eventually `FXSymbolicTraceExporter` goes away entirely when the Dynamo approach works for large models. We want to keep `FXSymbolicTraceExporter` around for now for experimenting and internal use.
* Parameterize (on `ExportOptions`) and consolidate Dynamo exporter tests.
  - This PR intentionally leaves the existing tests unchanged as much as possible except for the necessary plumbing.
* Subsequent API phases:
  - Diagnostics
  - Registry, dispatcher, and Custom Ops
  - Passes
  - Dynamic shapes

Fixes #94774

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97920
Approved by: https://github.com/justinchuby, https://github.com/titaiwangms, https://github.com/thiagocrepaldi, https://github.com/shubhambhokare1
2023-04-04 18:13:29 +00:00
Richard Zou
6b9e22f3f6 Clarify the saving of intermediates in the "extending torch.func" docs (#98020)
Fixes https://github.com/pytorch/pytorch/issues/97260

We got some feedback that the page reads like "in order to save an input
for backward, you must return it as an output of the
autograd.Function.forward".

Doing so actually raises an error (on master and as of 2.1), but results
in an ambiguous situation on 2.0.0. To avoid more users running into
this, we clarify the documentation so that it no longer reads that way
and clearly mentions that you can save things from either the inputs or
the outputs.
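
A minimal sketch of what the clarified docs describe, i.e. saving an *input* for backward without returning it from forward, in the `setup_context` style that page uses:

```python
import torch

class Square(torch.autograd.Function):
    @staticmethod
    def forward(x):
        return x ** 2

    @staticmethod
    def setup_context(ctx, inputs, output):
        x, = inputs
        ctx.save_for_backward(x)  # saving an input: no need to return it as an output

    @staticmethod
    def backward(ctx, grad_output):
        x, = ctx.saved_tensors
        return 2 * x * grad_output

y = Square.apply(torch.randn(3, requires_grad=True))
y.sum().backward()
```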
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98020
Approved by: https://github.com/soulitzer, https://github.com/kshitij12345
2023-03-31 13:57:37 +00:00
drisspg
a5b6f10c5d Fix format bug in NT docs (#97998)
Fixes a formatting bug in the NT docs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97998
Approved by: https://github.com/jbschlosser
2023-03-31 01:00:25 +00:00
Driss Guessous
5a81508bb6 Add NestedTensor ops: logical_not, logical_not_, masked_fill (#97934)
# Summary
### 🤖 Generated by Copilot at 7954302

This pull request adds support for `logical_not` and `masked_fill` operations on nested tensors, which are tensors that can have tensors as elements. It modifies the `native_functions.yaml` file to dispatch these operations to the nested tensor backend, implements the logic for these operations in `NestedTensorBinaryOps.cpp` and `NestedTensorUnaryOps.cpp`, adds documentation in `nested.rst`, and adds tests in `test_nestedtensor.py`.

## Description
### 🤖 Generated by Copilot at 7954302

*  Implement `logical_not` operation on nested tensors ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991R1164), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991R1172), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-f7c94671810b3ce652f9ad5458518cb7bbd67e8bf7e84e0a2fba641d878ba7c5R45-R56), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-c8b131d009badb3f92031b2aaa6e7f93a793f13caee278ea78e1c57d78c0399eR203), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-6eef496a8ec635930b6e52507358e069c80021f3535b8737d39e14ffc38950c0L854-R867))
  - Add `NestedTensor_logical_not` and `NestedTensor_logical_not_` functions to `native_functions.yaml` for CPU and CUDA dispatch ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991R1164), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991R1172))
  - Define `NestedTensor_logical_not` and `NestedTensor_logical_not_` functions in `NestedTensorUnaryOps.cpp` using `map_nt` and `get_buffer` ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-f7c94671810b3ce652f9ad5458518cb7bbd67e8bf7e84e0a2fba641d878ba7c5R45-R56))
  - Document `torch.logical_not` function for nested tensors in `nested.rst` ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-c8b131d009badb3f92031b2aaa6e7f93a793f13caee278ea78e1c57d78c0399eR203))
  - Add subtest for `logical_not` function in `test_activations` method in `TestNestedTensorDeviceType` class in `test_nestedtensor.py` ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-6eef496a8ec635930b6e52507358e069c80021f3535b8737d39e14ffc38950c0L854-R867))
* Implement `masked_fill` operation on nested tensors ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991R7439), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-f847e41e3d373230df0b25574e993ec0e6b699bf16796b3df9ae9fb518048e25L210-R224), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-c8b131d009badb3f92031b2aaa6e7f93a793f13caee278ea78e1c57d78c0399eR197), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-6eef496a8ec635930b6e52507358e069c80021f3535b8737d39e14ffc38950c0R677-R688), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-6eef496a8ec635930b6e52507358e069c80021f3535b8737d39e14ffc38950c0R2515-R2528))
  - Add `NestedTensor_masked_fill` function to `native_functions.yaml` for CPU and CUDA dispatch ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991R7439))
  - Define `NestedTensor_masked_fill` function in `NestedTensorBinaryOps.cpp` using `NestedTensor_elementwise_Tensor` ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-f847e41e3d373230df0b25574e993ec0e6b699bf16796b3df9ae9fb518048e25L210-R224))
  - Document `torch.Tensor.masked_fill` function for nested tensors in `nested.rst` ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-c8b131d009badb3f92031b2aaa6e7f93a793f13caee278ea78e1c57d78c0399eR197))
  - Add test case for `masked_fill` function in `TestNestedTensorDeviceType` class in `test_nestedtensor.py` ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-6eef496a8ec635930b6e52507358e069c80021f3535b8737d39e14ffc38950c0R677-R688))
  - Add test case for backward pass of `masked_fill` function in `TestNestedTensorAutograd` class in `test_nestedtensor.py` ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-6eef496a8ec635930b6e52507358e069c80021f3535b8737d39e14ffc38950c0R2515-R2528))
* Improve error message for unsupported element-wise binary operations on nested dense tensors ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-f847e41e3d373230df0b25574e993ec0e6b699bf16796b3df9ae9fb518048e25L142-R150))
  - Modify `NestedTensor_elementwise_Tensor` function in `NestedTensorBinaryOps.cpp` to include operation name in error message ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-f847e41e3d373230df0b25574e993ec0e6b699bf16796b3df9ae9fb518048e25L142-R150))

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97934
Approved by: https://github.com/cpuhrsch
2023-03-30 08:14:39 +00:00
Driss Guessous
f603873c1b add various NT ops needed for testing (#97837)
# Summary
Add some simple unary and binary NT ops:
- sub
- sgn
- abs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97837
Approved by: https://github.com/cpuhrsch
2023-03-29 23:43:37 +00:00
vfdev
0f424f7f05 Fixed broken link to troubleshooting.html docs page (#97330)
First seen in this error message:
```
[2023-03-22 10:30:39,786] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (64)
   function: '<resume in paste_mask_in_image>' (/vision/torchvision/models/detection/roi_heads.py:407)
   reasons:  w == 857
to diagnose recompilation issues, see https://pytorch.org/docs/master/dynamo/troubleshooting.html.
[2023-03-22 10:30:40,036] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (64)
   function: '<resume in paste_mask_in_image>' (/vision/torchvision/models/detection/roi_heads.py:406)
   reasons:  ___stack0 == 207
to diagnose recompilation issues, see https://pytorch.org/docs/master/dynamo/troubleshooting.html.
```

Broken link:
- https://pytorch.org/docs/master/dynamo/troubleshooting.html.

Good link:
- https://pytorch.org/docs/master/compile/troubleshooting.html

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97330
Approved by: https://github.com/zou3519
2023-03-22 16:40:21 +00:00
Mikayla Gawarecki
b04363ead4 [easy] Expose documentation for a few global nn.Module hooks (#97185)
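
For context, a minimal sketch of one of the global hooks in question (presumably `register_module_forward_hook` and friends, which apply to every module):

```python
import torch
from torch.nn.modules.module import register_module_forward_hook

# a global forward hook runs after every module's forward call
handle = register_module_forward_hook(
    lambda module, args, output: print(type(module).__name__, tuple(output.shape))
)
torch.nn.Linear(2, 3)(torch.randn(1, 2))  # prints: Linear (1, 3)
handle.remove()
```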
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97185
Approved by: https://github.com/albanD
2023-03-21 20:09:29 +00:00
Kazuaki Ishizaki
50ed38a7eb Fix typo under docs directory (#97202)
This PR fixes typos in `.rst` files under the docs directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97202
Approved by: https://github.com/kit1980
2023-03-21 01:24:10 +00:00
Driss Guessous
a269e5fa04 Add forward and backward support for silu to NestedTensors (#97181)
# Summary
Add forward and backward support for silu to NestedTensors
- Add forward support to silu
- Add forward support to silu_
- Add backward support to silu
- Add to NT docs
- Add tests
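
A minimal usage sketch (a toy example, not the tests added here):

```python
import torch

nt = torch.nested.nested_tensor(
    [torch.randn(2, 3), torch.randn(4, 3)], requires_grad=True
)
out = torch.nn.functional.silu(nt)  # forward on a nested tensor
grad = torch.nested.nested_tensor([torch.ones(2, 3), torch.ones(4, 3)])
out.backward(grad)                  # backward support added in this PR
```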
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97181
Approved by: https://github.com/cpuhrsch, https://github.com/jbschlosser
2023-03-20 23:46:12 +00:00
Mark Saroufim
6110effa86 Rework torch.compile docs (#96706)
Chatted with @stas00 on Slack; here are some great improvements he suggested for the compile docs:

- [x] Rename `dynamo` folder to `compile`
- [x] Link `compile` docstring on `torch.html` to main index page for compile
- [x] Create a new index page that describes why people should care
  - [x] easy perf, memory reduction, 1 line
  - [x] Short benchmark table
  - [x] How to guide
  - [x] TOC that links to the more technical pages folks have written, make the existing docs we have a Technical overview
- [x] Highlight the new APIs for `torch._inductor.list_options()` and `torch._inductor.list_mode_options()` - clarify these are inductor specific and add more prose around which ones are most interesting

He also highlighted an interesting way to think about who is reading these docs:

- [x] End users, who just want things to run fast
- [x] Library maintainers wrapping torch.compile, who care, for example, about when in their code they should compile a model and which backends are supported
- [x] Debuggers, whose needs are somewhat addressed by the troubleshooting guide and FAQ, though those could be dramatically reworked to say what we expect to break

And in a separate PR I'll work on the below with @SherlockNoMad
- [ ] Authors of new backends that care about how to plug into dynamo or inductor layer so need to explain some more internals like
  - [ ] IR
  - [ ] Where to plugin, dynamo? inductor? triton?

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96706
Approved by: https://github.com/svekars
2023-03-15 04:41:13 +00:00
Bin Bao
f03db8d6cb [reland2][inductor] Add an AOT compilation mode for Inductor CPP backend (#96520)
Summary: This is a reland of https://github.com/pytorch/pytorch/pull/94822.
Solved the long compilation issue for inductor cpp tests.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96520
Approved by: https://github.com/huydhn, https://github.com/malfet
2023-03-14 16:10:54 +00:00
eqy
6e3e22d58c [CUDA][cuFFT] Minor fix for cuFFT plan cache docs (#96373)
The attributes described in the docs require indexing into the plan cache manager, as there is a separate plan cache per device.
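
That is, the documented attributes live on the per-device cache object (a sketch; requires a CUDA build):

```python
import torch

# index the plan cache manager by device: each CUDA device has its own cache
cache = torch.backends.cuda.cufft_plan_cache[torch.cuda.current_device()]
cache.max_size = 32  # per-device cache capacity
print(cache.size)    # number of plans currently cached on this device
```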

CC @ptrblck @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96373
Approved by: https://github.com/ngimel
2023-03-14 00:28:14 +00:00
Driss Guessous
f330281fb2 Add torch.nn.LayerNorm() to documented list of supported nested tensor ops (#96434)
Layer norm is supported and this updates the documentation to reflect that.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96434
Approved by: https://github.com/cpuhrsch, https://github.com/jbschlosser
2023-03-13 23:16:09 +00:00
Joel Schlosser
30d56dd8c1 Support randn_like() for NT (#96528)
To satisfy an internal ask.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96528
Approved by: https://github.com/mikaylagawarecki, https://github.com/cpuhrsch
2023-03-13 19:39:51 +00:00
Kiuk Chung
55a1bd3fc6 [PT-D] Update CODEOWNERS, merge_rules, and Persons-of-Interest for to… (#96321)
Synchronize CODEOWNERS, merge_rules, and POI files to reflect kiukchung and d4l3k (Tristan Rice) as maintainers of the distributed module.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96321
Approved by: https://github.com/d4l3k, https://github.com/albanD, https://github.com/malfet
2023-03-13 17:38:43 +00:00
Joel Schlosser
024ea1a21e Support zeros_like() for NT (#96527)
This is used for the fake tensor fallbacks.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96527
Approved by: https://github.com/cpuhrsch
2023-03-13 15:15:08 +00:00
Rishub Tamirisa
f3b8638074 Adding nn.ZeroPad1d and nn.ZeroPad3d (#96295)
Fixes #95796

### Implementation
Adds python implementation for `nn.ZeroPad1d` and `nn.ZeroPad3d` in `torch/nn/modules/padding.py`.

Adds the C++ implementation for `nn::ZeroPad1d` and `nn::ZeroPad3d` in the following 3 files, refactored with templates similarly to `nn::ConstantPad`'s implementation:
- `torch/csrc/api/include/torch/nn/modules/padding.h`
- `torch/csrc/api/include/torch/nn/options/padding.h`
- `torch/csrc/api/src/nn/modules/padding.cpp`

Also added relevant definitions in `torch/nn/modules/__init__.py`.
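
A minimal usage sketch of the new Python module:

```python
import torch
import torch.nn as nn

pad = nn.ZeroPad1d(2)  # pad 2 zeros on both sides of the last dimension
x = torch.randn(1, 3, 5)
print(pad(x).shape)    # torch.Size([1, 3, 9])
```
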
### Testing
Adds the following tests:
- cpp tests of similar length and structure to those for `ConstantPad` and the existing `ZeroPad2d` impl in `test/cpp/api/modules.cpp`
- cpp API parity tests in `torch/testing/_internal/common_nn.py`
- module init tests in `test/test_module_init.py`

Also added relevant definitions in `test/cpp_api_parity/parity-tracker.md`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96295
Approved by: https://github.com/soulitzer
2023-03-10 03:51:41 +00:00
Joel Schlosser
7324aef9a8 Add torch.empty_like() to documented list of supported nested tensor ops (#96211)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96211
Approved by: https://github.com/drisspg
2023-03-07 23:33:34 +00:00
Iris
a7698a8260 [DCP] Add DCP FSDP sharded_state_dict checkpoint example to DCP .rst file (#95517)
As title.
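
Roughly the pattern such an example documents (a sketch, assuming a run under torchrun with GPUs available):

```python
import torch
import torch.distributed as dist
import torch.distributed.checkpoint as dist_cp
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, StateDictType

dist.init_process_group("nccl")
model = FSDP(torch.nn.Linear(8, 8).cuda())

# save each rank's shard of the model with DCP
with FSDP.state_dict_type(model, StateDictType.SHARDED_STATE_DICT):
    state_dict = {"model": model.state_dict()}
    dist_cp.save_state_dict(
        state_dict=state_dict,
        storage_writer=dist_cp.FileSystemWriter("checkpoint/"),
    )
```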
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95517
Approved by: https://github.com/kumpera
2023-03-03 18:09:10 +00:00
Svetlana Karslioglu
004bcffc6a Fix formatting (#95906)
Fixing list formatting by adding a missing blank line:

Before:
![Screenshot 2023-03-02 at 3 17 28 PM (2)](https://user-images.githubusercontent.com/5317992/222585127-9b6ed4dd-4719-4756-b2ac-1ba6e8f97b87.png)

After:
![Screenshot 2023-03-02 at 3 16 48 PM (2)](https://user-images.githubusercontent.com/5317992/222585172-3ef35a48-641f-4b73-9f7b-f419a122196b.png)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95906
Approved by: https://github.com/orionr
2023-03-03 16:18:12 +00:00
Michael Lazos
184fb9f11d Small doc update for torch_compile_debug (#95809)
Updates the troubleshooting documentation with the folder structure of the debug directory
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95809
Approved by: https://github.com/msaroufim
2023-03-02 00:25:28 +00:00