Commit Graph

2337 Commits

Author SHA1 Message Date
soulitzer
a7bcc78bff Make it clearer that current selective AC is PT2-only and private (#115081)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115081
Approved by: https://github.com/albanD
2023-12-04 23:01:22 +00:00
Xuehai Pan
55064a4ef9 [BE] add parentheses to kwargs unpacking func(*args, **(kwargs or {})) (#115026)
This PR adds parentheses to kwargs unpacking `func(*args, **(kwargs or {}))` for better code readability.

With or without the parentheses, the two forms are semantically equivalent: they produce the same bytecode.

```console
$ echo "func(*args, **kwargs or {})" | python3 -m dis -
  0           0 RESUME                   0

  1           2 PUSH_NULL
              4 LOAD_NAME                0 (func)
              6 LOAD_NAME                1 (args)
              8 BUILD_MAP                0
             10 LOAD_NAME                2 (kwargs)
             12 JUMP_IF_TRUE_OR_POP      1 (to 16)
             14 BUILD_MAP                0
        >>   16 DICT_MERGE               1
             18 CALL_FUNCTION_EX         1
             20 POP_TOP
             22 LOAD_CONST               0 (None)
             24 RETURN_VALUE

$ echo "func(*args, **(kwargs or {}))" | python3 -m dis -
  0           0 RESUME                   0

  1           2 PUSH_NULL
              4 LOAD_NAME                0 (func)
              6 LOAD_NAME                1 (args)
              8 BUILD_MAP                0
             10 LOAD_NAME                2 (kwargs)
             12 JUMP_IF_TRUE_OR_POP      1 (to 16)
             14 BUILD_MAP                0
        >>   16 DICT_MERGE               1
             18 CALL_FUNCTION_EX         1
             20 POP_TOP
             22 LOAD_CONST               0 (None)
             24 RETURN_VALUE
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115026
Approved by: https://github.com/Skylion007
2023-12-03 20:03:26 +00:00
PyTorch MergeBot
3a2e2044cd Revert "[DeviceMesh] Rename _device_mesh.py to device_mesh.py to prepare for beta (#114710) (#114991)"
This reverts commit 729ac7317a.

Reverted https://github.com/pytorch/pytorch/pull/114991 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/114991#issuecomment-1837214567))
2023-12-02 17:55:51 +00:00
Wanchao Liang
28925902fa [TP] fully rewrite Tensor Parallel APIs (#114732)
This PR rewrites the Tensor Parallel implementation. The Tensor Parallel APIs
are supposed to be a very thin wrapper around the DTensor APIs, but the
current implementation got too messy and buggy, and it's really hard to debug
what went wrong when using it. It's crucially important for advanced users and
developers to understand the API and its implementation easily, without
going through many different types of functions and utils, so that
they can trust what happens under the hood.

In particular, this PR:

* Makes ParallelStyle a real contract API for parallelize_module to
  take; each concrete ParallelStyle only needs to implement `apply` to
  apply its sharding to an nn.Module. Removes all unnecessary fields, which
  also enables easier ParallelStyle authoring going forward.
* Keeps the ColwiseParallel and RowwiseParallel public interfaces, but
  refactors them so that parameter sharding and input/output handling
  live within the style itself, making it easy to understand how
  Linear/Embedding layers are sharded and how the input/output
  transformations are performed.
* Removes the private _prepare_input/_prepare_output_fn fields from both
  ColwiseParallel and RowwiseParallel. Since we have thrown deprecation
  messages in nightlies for a while, TP is a prototype release, and the
  fields are private, it should be safe to remove them.
* Refactors the recently landed PrepareModuleInput/Output styles: renames
  output_layouts to desired_input/output_layouts, groups the functions
  inside the style itself, and drops the default arguments for these two
  styles so users have to specify them and think about the sharding
  layouts. Fixes bugs around not handling the `use_local_output` flag.
* Makes default arguments None instead of Placement objects; it is
  standard Python practice not to use custom object instances as default
  arguments.
* Removes all dead APIs (i.e. the PairwiseParallel and SequenceParallel
  styles and all prepare input/output functions), as we have thrown
  deprecation messages for a while and are in the process of removing
  all of them from the tests.
* Throws a deprecation warning for `tp_mesh_dim`, as we recommend using
  device mesh slicing/indexing instead of manually specifying the mesh dim.
* Rewrites the documentation for every ParallelStyle to make it clearer
  what each style is doing (a usage sketch follows below).
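
To make the new contract concrete, here is a hedged usage sketch of the rewritten API (the module, mesh shape, and plan keys are illustrative, and the device-mesh import path reflects this point in time):

```python
import torch.nn as nn
from torch.distributed._tensor import init_device_mesh
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    parallelize_module,
)

# One mesh dimension -> 8-way tensor parallelism
tp_mesh = init_device_mesh("cuda", (8,))

mlp = nn.Sequential(nn.Linear(16, 32), nn.GELU(), nn.Linear(32, 16))
mlp = parallelize_module(
    mlp,
    tp_mesh,
    {"0": ColwiseParallel(), "2": RowwiseParallel()},  # shard the two Linears
)
```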

TODOs:
* Rewrite TP tests to adjust for the changes we have in this PR
* add more tests to guard the bug fixes

Differential Revision: [D51761183](https://our.internmc.facebook.com/intern/diff/D51761183)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114732
Approved by: https://github.com/wz337, https://github.com/fduwjj
2023-12-02 08:18:12 +00:00
Iris Zhang (PyTorch)
729ac7317a [DeviceMesh] Rename _device_mesh.py to device_mesh.py to prepare for beta (#114710) (#114991)
Summary:

Same content of changes as https://github.com/pytorch/pytorch/pull/114710

Rename _device_mesh.py to device_mesh.py, update all callsites, adds documentation.
ghstack-source-id: 208980207
exported-using-ghexport

Test Plan: CI.

Reviewed By: wanchaol

Differential Revision: D51629761

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114991
Approved by: https://github.com/wanchaol, https://github.com/fduwjj, https://github.com/fegin
2023-12-02 04:39:41 +00:00
Rohan Varma
3c78ea4c9d [DDP][Compile] Test to Ensure torch.compile works w/static_graph=True (#114621)
Resolves https://github.com/pytorch/pytorch/issues/93672. This was
actually fixed by https://github.com/pytorch/pytorch/pull/103487, but I didn't
realize at the time that that PR also fixed torch.compile.

Differential Revision: [D51596148](https://our.internmc.facebook.com/intern/diff/D51596148/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114621
Approved by: https://github.com/wconstab
2023-12-01 22:18:45 +00:00
Lucas Pasqualin
f073dcd4f7 Stateful Checkpointing for Distributed [1/N] (#113867)
First pass at adding a save/load API, as well as a definition of Stateful objects.

Among a couple of TODOs, we still need to explore adding an `all_gather`, and potentially a `barrier`, while iterating through state keys.
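
A minimal sketch of the Stateful protocol this stack introduces; the exact method set shown here is an assumption based on the description above:

```python
from typing import Any, Dict, Protocol, runtime_checkable


@runtime_checkable
class Stateful(Protocol):
    """An object that can round-trip its state through a dict."""

    def state_dict(self) -> Dict[str, Any]: ...

    def load_state_dict(self, state_dict: Dict[str, Any]) -> None: ...
```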

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113867
Approved by: https://github.com/fegin, https://github.com/wz337
2023-12-01 19:21:03 +00:00
Philip Meier
373f2060ba fix extending torch native API docs (#114863)
Couldn't think of a better `release notes:` label. Feel free to set a more fitting one.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114863
Approved by: https://github.com/mikaylagawarecki
2023-12-01 06:09:35 +00:00
Jerry Zhang
64fd706b21 [quant][pt2e] Add generate_numeric_debug_handle pass (#114315)
Summary:
This is a util for the numeric suite in pt2 export so that we can build
a more streamlined UX for numerical debugging in the quant + executorch stack.

Test Plan:
python test/test_quantization.py TestGenerateNumericDebugHandle

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114315
Approved by: https://github.com/zhxchen17
2023-12-01 03:38:17 +00:00
William Wen
38ae17d166 [dynamo, docs] update dynamo backend registration docs (#114820)
Update docs to reflect current backend registration API. Add `lookup_backend` to root `dynamo` module.
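
The newly exposed entry point fetches a registered compiler backend by name:

```python
import torch._dynamo as dynamo

# e.g. look up the default inductor backend
inductor = dynamo.lookup_backend("inductor")
```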

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114820
Approved by: https://github.com/eellison
2023-11-30 21:41:05 +00:00
Nikita Shulga
a9d5133207 [ez][doc] Fix sample code in onnx_dynamo.rst (#114770)
By adding `import torch.nn as nn`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114770
Approved by: https://github.com/atalman, https://github.com/thiagocrepaldi
2023-11-29 19:27:52 +00:00
Guo Yejun
4aa2c51a09 [doc] fix typo on graph 3 that is recorded (#114666)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114666
Approved by: https://github.com/eellison
2023-11-28 20:40:13 +00:00
Guo Yejun
4a35ec3c0e [docs] correct the code for cudagraph trees integration (#114583)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114583
Approved by: https://github.com/eellison
2023-11-28 20:28:52 +00:00
lezcano
4ba3e6758d Canonicalize runtime asserts (#114509)
This allows us to remove quite a few redundant runtime asserts, and potentially a number of guards as well.

On
```
python test/dynamo/test_subclasses.py -k test_unbind
```
we go from
```
inserting runtime assert i0 <= s0
inserting runtime assert 0 <= -i0 + s0
inserting runtime assert i0 + i1 <= s0
inserting runtime assert i0 <= -i1 + s0
inserting runtime assert i0 + i1 + i2 <= s0
inserting runtime assert i0 + i1 <= -i2 + s0
inserting runtime assert Eq(i0 + i1 + i2 + i3, s0)
inserting runtime assert i0 + i1 + i2 + i3 <= s0
inserting runtime assert i0 + i1 + i2 <= -i3 + s0
```
to
```
inserting runtime assert i0 - s0 <= 0
inserting runtime assert i0 + i1 - s0 <= 0
inserting runtime assert i0 + i1 + i2 - s0 <= 0
inserting runtime assert Eq(i0 + i1 + i2 + i3, s0)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114509
Approved by: https://github.com/voznesenskym
2023-11-28 01:38:47 +00:00
voznesenskym
081c5b3adc Add Stateful/Stateless symbolic contexts, use fresh fake mode for dynamo backends (#113926) (#114526)
Summary:

The primary problem we are setting out to solve here is fake tensor freshness. Before this PR, fake tensors after dynamo represented fake tensors *at the end* of trace, so subsequent retraces like aot_autograd would start off with fake tensors in the wrong (end result) state, rather than their expected fresh state. The solution here is to start a fresh fake mode, and re-fakify the tensors. The nuance comes from ensuring that symbols are uniformly created for the symbolic sizes and strides of the tensor.

This PR is the result of *a lot* of back and forth with ezyang and eellison. Initially, the first pass at this was not super different from what we have in the PR - the broad strokes were the same:

1) We cache source->symbol in shape_env
2) We pass policy objects around, stored at dynamo fakification time, and reused for later fakification
3) We create a new fake mode for backends
(from https://github.com/pytorch/pytorch/pull/113605/files)

This is ugly, and has some layering violations. We detoured our decision making through a few other alternatives. Immutable/mutable fake tensor mode was the most interesting alternative, https://github.com/pytorch/pytorch/pull/113653, but was struck down over concerns about complexity in fake mode, combined with it not covering all edge cases. We also detoured on what to do about tensor memoization returning potentially different tensors than requested, and on whether that is an anti-pattern (it is) that we want to hack around with the symbol cache (we don't).

We went back to the drawing board here, but with a few concessions:
1) the cache for source->symbol must live outside of shape_env, for both lifecycle and layering reasons
2) A good amount of work needs to be done to pipe policy around fake_mode and meta_utils correctly, to cover all the cases (ezyang did this)

cc penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 aakhundov kadeng

imported-using-ghimport

Test Plan: Imported from OSS

Reviewed By: huydhn, Chillee

Differential Revision: D51566250

Pulled By: voznesenskym

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114526
Approved by: https://github.com/Chillee, https://github.com/huydhn
2023-11-26 23:40:32 +00:00
Akihiro Nitta
d37c4c6995 Update torch.compiler_troubleshooting.rst (#114530)
If you copy and paste the env var from the docs:
```console
TORCHDYNAMO_REPRO_AFTER=“aot”
```
it leads to this error:
```python
    @functools.wraps(unconfigured_compiler_fn)
    def debug_wrapper(gm, example_inputs, **kwargs):
        compiler_fn = functools.partial(unconfigured_compiler_fn, **kwargs)
>       assert config.repro_after in ("dynamo", "aot", None)
E       torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
E       AssertionError:
```
because `config.repro_after` ends up as `'“aot”'` (curly quotes included) rather than `'aot'`.
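
The fix is to use plain ASCII quotes (or no quotes at all) when setting the variable:

```console
$ export TORCHDYNAMO_REPRO_AFTER="aot"
```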

---

It would've saved a few minutes of my time 😄
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114530
Approved by: https://github.com/Chillee
2023-11-25 23:15:47 +00:00
PyTorch MergeBot
2f3beb715c Revert "Add Stateful/Stateless symbolic contexts, use fresh fake mode for dynamo backends (#113926)"
This reverts commit 2ca1119d53.

Reverted https://github.com/pytorch/pytorch/pull/113926 on behalf of https://github.com/DanilBaibak due to Break internal build ([comment](https://github.com/pytorch/pytorch/pull/113926#issuecomment-1822713852))
2023-11-22 12:52:33 +00:00
Thiago Crepaldi
3f736c2d77 Add ONNXProgram.__call__ API to run model with ONNX Runtime (#113495)
Currently the user can use torch.onnx.dynamo_export to export the model to ONNX.

```python
import torch

class Model(torch.nn.Module):
    def forward(self, x):
        return x + 1.0

onnx_program = torch.onnx.dynamo_export(
    Model(),
    torch.randn(1, 1, 2, dtype=torch.float),
)
```

The next step would be instantiating an ONNX Runtime session to execute it.

```python
import onnxruntime  # type: ignore[import]

# Rewritten from ONNXProgram internals so the snippet runs standalone,
# using the `onnx_program` object created above.
onnx_input = onnx_program.adapt_torch_inputs_to_onnx(
    torch.randn(1, 1, 2, dtype=torch.float)
)
providers = onnxruntime.get_available_providers()
onnx_model = onnx_program.model_proto.SerializeToString()
ort_session = onnxruntime.InferenceSession(onnx_model, providers=providers)

def to_numpy(tensor):
    return (
        tensor.detach().cpu().numpy()
        if tensor.requires_grad
        else tensor.cpu().numpy()
    )

onnxruntime_input = {
    k.name: to_numpy(v) for k, v in zip(ort_session.get_inputs(), onnx_input)
}

outputs = ort_session.run(None, onnxruntime_input)
```

This PR provides the `ONNXProgram.__call__` method as a facilitator that uses ONNX Runtime under the hood, similar to how `torch.export.ExportedProgram.__call__` allows the underlying `torch.fx.GraphModule` to be executed.
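
With this PR, the boilerplate above reduces to a direct call, reusing `onnx_program` from the first snippet:

```python
outputs = onnx_program(torch.randn(1, 1, 2, dtype=torch.float))
```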
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113495
Approved by: https://github.com/titaiwangms
2023-11-22 01:48:45 +00:00
Antonio Kim
7fc292930c Add support for torch.Generator type in TorchScript (#110413)
- Add support for `torch.Generator` type in TorchScript
- Add `generator` args to all `torch.nn.init` functions that call `uniform_` or `normal_`
- Add support for `torch.Generator` in LTC's TorchScript backend (CC: @wconstab)
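
A hedged sketch of what the first two bullets enable (whether every factory op accepts the `generator` kwarg under scripting is an assumption here):

```python
import torch

# nn.init functions now take an explicit generator for reproducible init:
g = torch.Generator().manual_seed(0)
w = torch.empty(3, 3)
torch.nn.init.uniform_(w, a=0.0, b=1.0, generator=g)

# torch.Generator is now a first-class type in TorchScript:
@torch.jit.script
def sample(gen: torch.Generator) -> torch.Tensor:
    return torch.randn(2, 2, generator=gen)
```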

CC: @eellison @davidberard98 @GlebKazantaev @behzad-a
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110413
Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/glebk-cerebras, https://github.com/davidberard98
2023-11-21 23:07:21 +00:00
HDCharles
18e1a37c4e [ao] updating embedding_bag support for fx and eager (#107623)
Summary: our docs were saying dynamic embedding bag wasn't supported, but
it actually is (at least at the same level as embeddings are); it just wasn't previously tested or listed.

Test Plan: python test/test_quantization.py -k "test_embedding"
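
For reference, one plausible eager-mode flow for the support described above (a hedged sketch; the qconfig choice mirrors the existing embedding support):

```python
import torch
import torch.ao.quantization as tq


class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = torch.nn.EmbeddingBag(num_embeddings=10, embedding_dim=4)

    def forward(self, idx, offsets):
        return self.emb(idx, offsets)


m = M().eval()
m.emb.qconfig = tq.float_qparams_weight_only_qconfig
tq.prepare(m, inplace=True)
tq.convert(m, inplace=True)  # m.emb -> torch.ao.nn.quantized.EmbeddingBag
```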

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107623
Approved by: https://github.com/jerryzh168
2023-11-21 03:54:00 +00:00
Ke Wen
dc65f6c601 [c10d] Remove deprecated multi-gpu-per-thread APIs (#114156)
As of today, PyTorch Distributed's preferred programming model is one device per thread, as exemplified by the APIs in its documentation. The multi-GPU functions (which stood for multiple GPUs per CPU thread) have been deprecated for three releases. Removing them now, before the 2.2 release.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114156
Approved by: https://github.com/albanD, https://github.com/fduwjj, https://github.com/H-Huang
2023-11-21 03:50:23 +00:00
voznesenskym
2ca1119d53 Add Stateful/Stateless symbolic contexts, use fresh fake mode for dynamo backends (#113926)
The primary problem we are setting out to solve here is fake tensor freshness. Before this PR, fake tensors after dynamo represented fake tensors *at the end* of trace, so subsequent retraces like aot_autograd would start off with fake tensors in the wrong (end result) state, rather than their expected fresh state. The solution here is to start a fresh fake mode, and re-fakify the tensors. The nuance comes from ensuring that symbols are uniformly created for the symbolic sizes and strides of the tensor.

This PR is the result of *a lot* of back and forth with @ezyang and @eellison. Initially, the first pass at this was not super different from what we have in the PR - the broad strokes were the same:

1) We cache source->symbol in shape_env
2) We pass policy objects around, stored at dynamo fakification time, and reused for later fakification
3) We create a new fake mode for backends
(from https://github.com/pytorch/pytorch/pull/113605/files)

This is ugly, and has some layering violations. We detoured our decision making through a few other alternatives. Immutable/mutable fake tensor mode was the most interesting alternative, https://github.com/pytorch/pytorch/pull/113653, but was struck down over concerns about complexity in fake mode, combined with it not covering all edge cases. We also detoured on what to do about tensor memoization returning potentially different tensors than requested, and on whether that is an anti-pattern (it is) that we want to hack around with the symbol cache (we don't).

We went back to the drawing board here, but with a few concessions:
1) the cache for source->symbol must live outside of shape_env, for both lifecycle and layering reasons
2) A good amount of work needs to be done to pipe policy around fake_mode and meta_utils correctly, to cover all the cases (@ezyang did this)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113926
Approved by: https://github.com/ezyang, https://github.com/eellison
2023-11-20 23:06:37 +00:00
Edward Z. Yang
aeb5fd52c7 Remove dead tensor_has_hints. (#114071)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114071
Approved by: https://github.com/aakhundov
2023-11-20 16:02:24 +00:00
Pearu Peterson
0bd4d1f4ab Add sparse tensors support to dataloader. (#112842)
Fixes https://github.com/pytorch/pytorch/issues/106837
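
A small sketch of what this enables, assuming default collation now understands sparse tensors:

```python
import torch
from torch.utils.data import DataLoader

data = [torch.eye(2).to_sparse() for _ in range(4)]
for batch in DataLoader(data, batch_size=2):
    print(batch.shape, batch.is_sparse)  # torch.Size([2, 2, 2]) True
```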

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112842
Approved by: https://github.com/cpuhrsch, https://github.com/gokulavasan
2023-11-19 16:05:27 +00:00
Edward Z. Yang
e2b114ab9f [BE] Package dynamic_dims/constraint_dims into CreateSymbolicPolicy (#113802)
This will make it more convenient to propagate more information through
all of these functions in the future (e.g., for storage offset
information).

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113802
Approved by: https://github.com/davidberard98, https://github.com/voznesenskym
2023-11-17 18:22:46 +00:00
Edward Z. Yang
3a3a979984 Add torch.distributed.breakpoint (#113775)
I tested that it works by patching

```
diff --git a/test/distributed/test_dynamo_distributed.py b/test/distributed/test_dynamo_distributed.py
index 96b3a82bdfa..dea9bac9302 100644
--- a/test/distributed/test_dynamo_distributed.py
+++ b/test/distributed/test_dynamo_distributed.py
@@ -18,6 +18,7 @@ from torch._dynamo import config
 from torch._dynamo.utils import same
 from torch._dynamo.testing import collect_results
 from torch.utils._triton import has_triton
+import torch.distributed as dist
 from torch.distributed.fsdp.wrap import transformer_auto_wrap_policy, lambda_auto_wrap_policy
 from torch._higher_order_ops.wrap import tag_activation_checkpoint
 from torch.nn.parallel import DistributedDataParallel as DDP
@@ -398,6 +399,7 @@ class TestMultiProc(DynamoDistributedMultiProcTestCase):
     @unittest.skipIf(not has_triton(), "Inductor+gpu needs triton and recent GPU arch")
     def test_fsdp_activation_checkpointing(self):
         with _dynamo_dist_per_rank_init(self.rank, self.world_size):
+            dist.breakpoint()
             model, inputs = get_toy_model_for_activation_checkpointing(f"cuda:{self.rank}")
             is_inner = lambda module: isinstance(module, ToyInnerModel)  # noqa: E731
             wrap_policy = functools.partial(lambda_auto_wrap_policy, lambda_fn=is_inner)
```

and then running `python test/distributed/test_dynamo_distributed.py -k test_fsdp_activation_checkpointing`

It prints:

```
ATTENTION!!!

Type 'up' to get to the frame that called dist.breakpoint(rank=0)

> /data/users/ezyang/c/pytorch/torch/distributed/__init__.py(71)breakpoint()
-> barrier()
(Pdb) up
> /data/users/ezyang/c/pytorch/test/distributed/test_dynamo_distributed.py(402)test_fsdp_activation_checkpointing()
-> dist.breakpoint()
(Pdb) list
397
398         @skip_if_lt_x_gpu(1)
399         @unittest.skipIf(not has_triton(), "Inductor+gpu needs triton and recent GPU arch")
400         def test_fsdp_activation_checkpointing(self):
401             with _dynamo_dist_per_rank_init(self.rank, self.world_size):
402  ->             dist.breakpoint()
403                 model, inputs = get_toy_model_for_activation_checkpointing(f"cuda:{self.rank}")
404                 is_inner = lambda module: isinstance(module, ToyInnerModel)  # noqa: E731
405                 wrap_policy = functools.partial(lambda_auto_wrap_policy, lambda_fn=is_inner)
406                 model = apply_fsdp_with_checkpointing(model, wrap_policy, is_inner)
407                 correct_outputs = model(inputs)
```

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113775
Approved by: https://github.com/wconstab, https://github.com/wanchaol
2023-11-16 19:30:57 +00:00
Mu-Chu Lee
eddce3c054 [AOTInductor] Rename model_runner to model_container_runner (#111324)
Summary:
We rename model_runner to model_container_runner to prepare for
adding tests of a pure model without a container.

Test Plan:
The commit itself is a test.


Pull Request resolved: https://github.com/pytorch/pytorch/pull/111324
Approved by: https://github.com/desertfire, https://github.com/chenyang78
2023-11-16 19:14:22 +00:00
Tongzhou Wang
275403be16 [doc] Add nn.parametrizations.weight_norm (#113783)
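A quick usage sketch of the API being documented:

```python
import torch.nn as nn
from torch.nn.utils.parametrizations import weight_norm

layer = weight_norm(nn.Linear(20, 40), name="weight", dim=0)
# The weight is recomputed on access from the parametrization's
# underlying magnitude/direction tensors:
print(layer.parametrizations.weight)
```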
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113783
Approved by: https://github.com/albanD
2023-11-16 17:42:48 +00:00
Iris Zhang
72ce5dd13e [2D] Remove enable_2d_with_fsdp() API and make remove_enable_2d_with_fsdp private (#112473)
As we have our new 2D flow out, we want to remove `enable_2d_with_fsdp()`.
In addition, we make `pre_dp_module_transform` private, as we may need to change the UX later on.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112473
Approved by: https://github.com/fegin, https://github.com/wanchaol
2023-11-16 01:14:00 +00:00
gs-olive
757f36b988 [docs] Fix torch.compile "tensorrt" backend docs (#113711)
- Update description from ONNX to current state (Torch-TensorRT)
- Add clarification about import
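
A hedged sketch of the clarified usage (the model here is illustrative):

```python
import torch
import torch_tensorrt  # noqa: F401  -- importing registers the "tensorrt" backend

model = torch.nn.Linear(8, 8).eval().cuda()
compiled = torch.compile(model, backend="tensorrt")
```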

Fixes documentation on this page: https://pytorch.org/docs/stable/torch.compiler.html

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113711
Approved by: https://github.com/msaroufim
2023-11-15 08:42:53 +00:00
drisspg
9b0f2f8d94 expose sdpa helpers to python (#110496)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110496
Approved by: https://github.com/jbschlosser
2023-11-15 07:34:34 +00:00
PyTorch MergeBot
252e68a83b Revert "Add support for torch.Generator type in TorchScript (#110413)"
This reverts commit 54493fe8c4.

Reverted https://github.com/pytorch/pytorch/pull/110413 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is, unfortunately, still breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/110413#issuecomment-1811625557))
2023-11-15 00:51:23 +00:00
Antonio Kim
54493fe8c4 Add support for torch.Generator type in TorchScript (#110413)
- Add support for `torch.Generator` type in TorchScript
- Add `generator` args to all `torch.nn.init` functions that call `uniform_` or `normal_`
- Add support for `torch.Generator` in LTC's TorchScript backend (CC: @wconstab)

CC: @eellison @davidberard98 @GlebKazantaev @behzad-a
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110413
Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/glebk-cerebras, https://github.com/davidberard98
2023-11-13 23:18:14 +00:00
Justin Chu
47a59ee4d1 [ONNX] Update exporter issue report instructions for quantized models (#113494)
Update the instructions to point users to the right place for creating issues.

https://github.com/onnx/onnx/issues/5674#issuecomment-1806505240

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113494
Approved by: https://github.com/jerryzh168
2023-11-13 18:18:19 +00:00
Bin Bao
c197c48ceb [aotinductor] Add a demo tutorial (#112457)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112457
Approved by: https://github.com/msaroufim, https://github.com/albanD
2023-11-10 17:01:03 +00:00
Thiago Crepaldi
574e313643 Add thiagocrepaldi as person of interest for onnx exporter (#113402)
@malfet @kit1980

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113402
Approved by: https://github.com/malfet
2023-11-10 15:19:58 +00:00
Sergii Dymchenko
bb06725ee0 Update mentions of deprecated functions in complex_numbers.rst (#113391)
`torch.svd` is deprecated, and `torch.solve` is completely removed.
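
For reference, the current replacements:

```python
import torch

A = torch.randn(3, 3, dtype=torch.complex64)
b = torch.randn(3, dtype=torch.complex64)

U, S, Vh = torch.linalg.svd(A)  # replaces the deprecated torch.svd
x = torch.linalg.solve(A, b)    # replaces the removed torch.solve
```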

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113391
Approved by: https://github.com/malfet, https://github.com/lezcano
2023-11-09 22:32:26 +00:00
Jerry Zhang
501d118255 [quant][pt2e] Add transform_for_annotation method in Quantizer (#113115)
Summary:
Adding the method so that people can do some transformations before annotation, to make the graph easier to annotate.

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_transform_for_annotation
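
A hedged sketch of a custom Quantizer overriding the new hook (the no-op bodies are placeholders):

```python
import torch
from torch.ao.quantization.quantizer import Quantizer


class MyQuantizer(Quantizer):
    def transform_for_annotation(
        self, model: torch.fx.GraphModule
    ) -> torch.fx.GraphModule:
        # e.g. decompose or fold ops so annotate() sees a simpler graph
        return model

    def annotate(self, model: torch.fx.GraphModule) -> torch.fx.GraphModule:
        return model

    def validate(self, model: torch.fx.GraphModule) -> None:
        pass
```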


Differential Revision: [D51141080](https://our.internmc.facebook.com/intern/diff/D51141080)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113115
Approved by: https://github.com/kimishpatel
2023-11-09 20:23:29 +00:00
Nikita Shulga
81bf0bd68d [no ci] Fix typo in persons_of_interest.rst (#113283)
There is no `c` in `Hirsh`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113283
Approved by: https://github.com/bdhirsh
2023-11-08 19:36:32 +00:00
Edward Z. Yang
1f3fa13f0a Handle unbacked SymInt sized outputs in AOTAutograd (#113159)
Thanks aakhundov for constructing the test case. This PR was constructed by running the failing test case, and then fixing problems until we got all the way to the end. There are a few distinct fixes:

* AOTAutograd performs equality tests on tensor metadata to determine if a metadata mutation had occurred. If we test i0 vs i1, we should report these are NOT equal, since obviously we have somehow resized the tensor from i0 to i1 (even if, on a particular run, it is possible i0 == i1).
* There's a sketchy fix for `test_aot_autograd_exhaustive_matmul_cpu_float32` where we check if the output shape equals the tangent shape. Unfortunately, the same `definitely_true` treatment does not work here, it still fails on the example. I piled an extra sketchy fix on top of it, where I just try my best to avoid doing the view. Maybe we should have some sort of logging here.
* Partitioner needs to get out a size for unbacked SymInt when partitioning. I just feed it a random heuristic value in this case, similar to how we've been dealing with this in Inductor.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113159
Approved by: https://github.com/aakhundov, https://github.com/bdhirsh
2023-11-08 04:28:38 +00:00
PyTorch MergeBot
9a28a7b498 Revert "Add support for torch.Generator type in TorchScript (#110413)"
This reverts commit 27e31ab6e8.

Reverted https://github.com/pytorch/pytorch/pull/110413 on behalf of https://github.com/PaliC due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/110413#issuecomment-1799003164))
2023-11-07 15:53:32 +00:00
Thiago Crepaldi
eefe327b11 Rename torch.onnx.ExportOutput* to ONNXProgram* (#112263)
Since PyTorch 2.1, when the torch.export API was introduced, the term
"export" has been overloaded due to the already existing torch.onnx.export API.

The torch.onnx.dynamo_export API was introduced in PyTorch 2.0, and it
exposed a torch.onnx.ExportOutput, which can now be confused with
the output of torch.export.export.

To prevent such ambiguity and to standardize names around the new
torch.export.ExportedProgram, this PR renames torch.onnx.ExportOutput to
torch.onnx.ONNXProgram.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112263
Approved by: https://github.com/BowenBao
ghstack dependencies: #112444
2023-11-06 22:27:15 +00:00
Antonio Kim
27e31ab6e8 Add support for torch.Generator type in TorchScript (#110413)
- Add support for `torch.Generator` type in TorchScript
- Add `generator` args to all `torch.nn.init` functions that call `uniform_` or `normal_`
- Add support for `torch.Generator` in LTC's TorchScript backend (CC: @wconstab)

CC: @eellison @davidberard98 @GlebKazantaev @behzad-a
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110413
Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/glebk-cerebras, https://github.com/davidberard98
2023-11-06 21:27:02 +00:00
Peter Bell
718035791d Prefer e.is_number over not e.free_symbols in SymPy (#112688)
We spend somewhere on the order of 1% of time in `sympy.Expr.free_symbols`, as it is called millions of times.
Most of the time we actually just want to know "is this a constant?"; however, `e.is_constant()` is
horribly slow. It turns out that there is another property, `is_number`, that does what we want.

> property is_number:
>
> Returns True if self has no free symbols and no undefined functions (AppliedUndef, to be precise). It will be faster
> than if not self.free_symbols, however, since is_number will fail as soon as it hits a free symbol or undefined
> function.

Even further, we also avoid the overhead of building the unnecessary set object.
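
The distinction in action:

```python
import sympy

x = sympy.Symbol("x")

assert (sympy.Integer(2) * 3).is_number  # a constant expression
assert not (x + 1).is_number             # bails out at the free symbol
assert (x + 1).free_symbols == {x}       # must build a set to answer
```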

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112688
Approved by: https://github.com/lezcano
2023-11-06 20:05:13 +00:00
Chien-Chin Huang
9d0c3e21d0 [state_dict][9/N] Add get and set APIs for model and optimizer state_dict (#112203)
The original get_state_dict and set_state_dict pair is too complicated because of the possible combinations of usages. This PR adds APIs to get/set the model state_dict and the optimizer state_dict separately.
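
A hedged usage sketch of the split APIs (names and keyword arguments follow this stack and may differ):

```python
import torch
from torch.distributed.checkpoint.state_dict import (
    get_model_state_dict,
    get_optimizer_state_dict,
    set_model_state_dict,
    set_optimizer_state_dict,
)

model = torch.nn.Linear(4, 4)
optim = torch.optim.AdamW(model.parameters())

model_sd = get_model_state_dict(model)
optim_sd = get_optimizer_state_dict(model, optim)
set_model_state_dict(model, model_sd)
set_optimizer_state_dict(model, optim, optim_state_dict=optim_sd)
```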

Differential Revision: [D50713584](https://our.internmc.facebook.com/intern/diff/D50713584/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112203
Approved by: https://github.com/wz337
ghstack dependencies: #112167
2023-11-02 22:03:57 +00:00
Zhengxu Chen
50767a075a [export] Clean up verifier [1/n]. (#112505)
Summary: Some adjustments to the verifier so that it's easier to use correctly. We will enable the verifier later, so the current diff is a no-op.

Test Plan: CI

Differential Revision: D50839295

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112505
Approved by: https://github.com/tugsbayasgalan, https://github.com/angelayi
2023-11-02 19:36:06 +00:00
Jerry Zhang
6929ebf2b0 [quant][docs] Add x86 inductor quant docs (#112648)
Summary:
As titled.
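
A hedged sketch of the PT2E flow the new docs cover (API paths as of this period):

```python
import torch
import torch.ao.quantization.quantizer.x86_inductor_quantizer as xiq
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
example = (torch.randn(1, 3, 16, 16),)

m = capture_pre_autograd_graph(model, example)
quantizer = xiq.X86InductorQuantizer()
quantizer.set_global(xiq.get_default_x86_inductor_quantization_config())
m = prepare_pt2e(m, quantizer)
m(*example)          # calibrate
m = convert_pt2e(m)
optimized = torch.compile(m)  # lower through inductor
```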

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112648
Approved by: https://github.com/leslie-fang-intel, https://github.com/jgong5, https://github.com/andrewor14
2023-11-02 17:02:09 +00:00
Edward Z. Yang
09df6b771b Add a note about performant record_stream use. (#112526)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112526
Approved by: https://github.com/albanD
2023-11-02 15:50:22 +00:00
David Berard
8191fb3e06 [Reland2] [inductor][BE] split triton_meta and inductor_meta (#112351)
triton_meta is intended to be passed directly to Triton. Previously, we were also putting other metadata into triton_meta; we should instead split that metadata into a separate dict to avoid possible conflicts in the future.

This PR splits out triton_meta and inductor_meta so we have a place to put additional metadata that isn't intended to be passed to Triton.

Tests - wait for CI

Differential Revision: [D50864493](https://our.internmc.facebook.com/intern/diff/D50864493)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112351
Approved by: https://github.com/eellison
2023-11-02 00:40:12 +00:00
angelayi
131e0f1b75 [export] Separate out graph signature (#112412)
Differential Revision: [D50800524](https://our.internmc.facebook.com/intern/diff/D50800524)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112412
Approved by: https://github.com/zhxchen17
2023-11-02 00:18:28 +00:00