Commit Graph

218 Commits

Author SHA1 Message Date
Ken Jin
3de0857503 [Dynamo] Match closures by code ID (#109427)
Closes https://github.com/pytorch/pytorch/issues/107866

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109427
Approved by: https://github.com/ezyang, https://github.com/jansel
2023-09-25 19:10:35 +00:00
Michael Voznesensky
a902150a1e [Easy] ConstantVariable() -> .create (#109896)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109896
Approved by: https://github.com/ezyang
2023-09-22 22:30:15 +00:00
Michael Lazos
24ba4b7059 [dynamo][__torch_function__ 1/n] Add getset descriptor and __get__ vars (#109542)
Adds the MethodWrapperVariable and GetSetDescriptor variable types. These are used in `__torch_function__` tracing to represent attribute reads (`__get__`) and for comparing unbound methods (the `func` argument when `__torch_function__` is dispatched from a method call).

towards tracing for https://github.com/pytorch/pytorch/issues/93723

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109542
Approved by: https://github.com/jansel
2023-09-22 10:39:15 +00:00
Avik Chaudhuri
ebc7039bcb New export API with dynamic shape specifications instead of constraints (#108448)
Our experience using `constraints` / `dynamic_dim` with the existing export API has found it to be (subjectively) clunky and (objectively) verbose in common cases.

This PR implements a new design for the export API that replaces the use of `constraints` / `dynamic_dim` with a new way of specifying dynamic shapes, involving the following concepts:
* a constructor `Dim` for first-class named dynamic dimensions with ranges (similar to `functorch.dim`, and analogous to internal symbolic sizes)
* a mechanism that uses the above in `export` calls to associate inputs to their dynamic shape specifications (`dynamic_shapes`)
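
As a rough sketch of what this looks like in user code (a hedged example; the model, names, and ranges below are illustrative, not taken from the PR):

```python
import torch
from torch.export import Dim, export

class M(torch.nn.Module):
    def forward(self, x, y):
        return x + y

# A single named dynamic dimension shared by both inputs; the range is illustrative.
batch = Dim("batch", min=2, max=1024)

ep = export(
    M(),
    (torch.randn(4, 8), torch.randn(4, 8)),
    dynamic_shapes={"x": {0: batch}, "y": {0: batch}},
)
```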

Design doc: https://docs.google.com/presentation/d/168U7XK72C_WSsZpGESP6Cho9udh193fi0gfjxCNcJ4E/edit#slide=id.p (Meta-only). Note that we only implement Option 1 in that doc. An older version of this PR also implemented Option 3, which is an alternative way of specifying dynamic shapes using tensor type annotations on the exported callable; but we have moved that to future work for now.

See the docs for these new features in `torch.export`. The existing `torch.export.export` is modified to use the new API, `torch._export.export__RC__`, whenever `constraints=None`. We have not deprecated the existing API yet, but will do so in a follow-up.

Constraint violation errors arising through use of the new API will now contain suggested fixes using the new API. No longer do we need to report all specializations for static dimensions and suggest all constraints over dynamic dimensions to fix such errors. Instead, due to the redesign, the suggested fixes are much more concise, only involving modifying the definitions of relevant `Dim`s.

Differential Revision: [D48919204](https://our.internmc.facebook.com/intern/diff/D48919204/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108448
Approved by: https://github.com/suo, https://github.com/gmagogsfm
2023-09-22 06:58:26 +00:00
Edward Z. Yang
518308a740 Trace through pytree API with dynamo. (#108533)
Fix: #107315

This PR enables dynamo to trace through the `pytree` API by inlining its functions. In
order to do so, a few details of `pytree` had to be changed.

In summary, this PR:

- Introduces `TreeSpecVariable` for representing `TreeSpec` instances
- Specializes `<type>.__bases__` calls, returning a `TupleVariable`
- Enables calls to the `id` builtin function for every variable that implements
  the `as_python_constant` method
- Specializes `ConstantVariable.call_method` for its (un)flatten functions
- Implements `UserDefinedObjectVariable.as_python_constant`
- Modifies `pytree` by:
    - Making `SUPPORTED_NODES` a map of ids (instead of types) to `NodeDef`
    - Removing the `functools.wraps` wrapper, since it can't be inlined
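
As a small sketch of what this enables (assuming the inlining works end-to-end; the function below is illustrative), flatten/unflatten calls inside a compiled function should now be traced rather than breaking the graph:

```python
import torch
import torch.utils._pytree as pytree

@torch.compile(backend="eager", fullgraph=True)
def f(x):
    # Flatten a nested container, transform the leaves, and rebuild it.
    leaves, spec = pytree.tree_flatten({"a": x, "b": [x + 1]})
    return pytree.tree_unflatten([t * 2 for t in leaves], spec)

out = f(torch.randn(3))
```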

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108533
Approved by: https://github.com/ezyang, https://github.com/voznesenskym
ghstack dependencies: #109201
2023-09-20 00:04:56 +00:00
Edward Yang
88600e7d2e [RELAND] Force synced KJT to trace unbacked SymInt (#108960) (#109216)
Summary:

The basic concept behind this diff is to modify Dynamo's tracing behavior when it encounters a KeyedJaggedTensor that is synced (aka has `_length_per_key` and `_offset_per_key` populated). These fields are lists of integers; ordinarily, Dynamo will optimistically try to specialize on integers. However, for KJTs, we know that these integers will definitely vary from run to run. Furthermore, ordinarily we would also specialize these integers if they are 0/1, but we frequently expect features in KJTs to be 0/1.

The fix is to detect KJTs and treat these integers as *unbacked integers*. This is NOT a universally sound optimization: when treating these integers as unbacked, we never report them as equal to zero or one. In return, we always generate graphs that generalize no matter the length of values on features. This is enough to trace through APS sparse arch, torchrec_dlrm and some small split-cat examples.

The special integer behavior is triggered by a dynamically scoped `force_unspec_int_unbacked_size_like` variable on TracingContext, which we trigger when we wrap a KJT. There probably are other ways to do this, but this was simple and worked.
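
A minimal sketch of the dynamically scoped pattern described above (the context object and attribute here stand in for the real TracingContext field and should be read as assumptions):

```python
from contextlib import contextmanager

@contextmanager
def unbacked_ints_for_kjt(tracing_context):
    # Temporarily flip the flag while wrapping a KJT, then restore the old value.
    prev = getattr(tracing_context, "force_unspec_int_unbacked_size_like", False)
    tracing_context.force_unspec_int_unbacked_size_like = True
    try:
        yield
    finally:
        tracing_context.force_unspec_int_unbacked_size_like = prev
```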

Test Plan:
```
buck2 test mode/dev-nosan //pytorch/benchmark/fb/test_gpu:run_test_gpu
```

from aakhundov

1. first build feed_lower_benchmark:
```
buck2 build --show-output mode/opt -c python.package_style=inplace -c fbcode.enable_gpu_sections=true -c fbcode.platform=platform010 -c fbcode.split-dwarf=true hpc/new/models/feed/benchmark:feed_lower_benchmark
```
2. then run the lowering of the model with it:
```
TORCHINDUCTOR_MAX_AUTOTUNE=1 TORCHINDUCTOR_UNIQUE_KERNEL_NAMES=1 TORCH_LOGS="output_code,graph_code" TORCH_COMPILE_DEBUG=1 ../buck-out/v2/gen/fbcode/79c6b019ee0f9469/hpc/new/models/feed/benchmark/__feed_lower_benchmark__/feed_lower_benchmark.par --load=manifold://ig_inference_model/tree/user/facebook/fblearner/predictor/960999465/60/gpu_lowering/input.predictor --skip-trt --skip-ait --sync-mode=0 --enable-aot-inductor --lower-presets="ig_stories" --gpu-trace
```
cf https://docs.google.com/document/d/1yD30xYrdmM8r2HTdmXnZTg0-MHVexfVrAa0294m1AUE/edit?pli=1#heading=h.qiv3fp7e6zg0

From torchrec: https://www.internalfb.com/intern/wiki/Torchrec/Development/Testing_production_models/

From ge0405
baseline (without your diff): f477293168
your diff: f477292363

```
buck2 test //caffe2/test/dynamo:test_dynamo_torchrec
buck2 run 'fbcode//mode/opt' fbcode//pytorch/benchmark/fb/test_gpu:run_test_gpu -- 'pytorch.benchmark.fb.test_gpu.test_gpu.TestBenchmarkFbGpu.test_train_blue_reels_vdd_v3_inductor_speedup'
```

Differential Revision: D49236757

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109216
Approved by: https://github.com/voznesenskym
2023-09-18 14:39:44 +00:00
ydwu4
706d8e2230 [dynamo] Respect shape dynamism of SymInt sized tensor (#109331)
Before this PR, if we run the following code:
```python
def true_fn(x):
    return x - x.cos()

def false_fn(x):
    return x + x.sin()

def foo(x):
    return cond(x.shape[0] == 4, true_fn, false_fn, [x])
gm = make_fx(foo, tracing_mode='symbolic')(torch.ones(3, 4))
gm = make_fx(foo, tracing_mode='symbolic')(torch.ones(4, 5))
```
we'll have the following error:
```python
Traceback (most recent call last):
  File "/home/yidi/local/pytorch/make_fx.py", line 16, in <module>
    gm = make_fx(foo, tracing_mode='symbolic')(torch.ones(4, 5))
  File "/home/yidi/local/pytorch/torch/fx/experimental/proxy_tensor.py", line 841, in wrapped
    t = dispatch_trace(wrap_key(func, args, fx_tracer, pre_dispatch), tracer=fx_tracer, concrete_args=tuple(phs))
  File "/home/yidi/local/pytorch/torch/_compile.py", line 24, in inner
    return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
  File "/home/yidi/local/pytorch/torch/_dynamo/eval_frame.py", line 397, in _fn
    return fn(*args, **kwargs)
  File "/home/yidi/local/pytorch/torch/_dynamo/external_utils.py", line 17, in inner
    return fn(*args, **kwargs)
  File "/home/yidi/local/pytorch/torch/fx/experimental/proxy_tensor.py", line 461, in dispatch_trace
    graph = tracer.trace(root, concrete_args)
  File "/home/yidi/local/pytorch/torch/_dynamo/eval_frame.py", line 397, in _fn
    return fn(*args, **kwargs)
  File "/home/yidi/local/pytorch/torch/_dynamo/external_utils.py", line 17, in inner
    return fn(*args, **kwargs)
  File "/home/yidi/local/pytorch/torch/fx/_symbolic_trace.py", line 817, in trace
    (self.create_arg(fn(*args)),),
  File "/home/yidi/local/pytorch/torch/fx/experimental/proxy_tensor.py", line 497, in wrapped
    out = f(*tensors)
  File "/home/yidi/local/pytorch/make_fx.py", line 13, in foo
    return control_flow.cond(x.shape[0] == 4, true_fn, false_fn, [x])
  File "/home/yidi/local/pytorch/torch/_higher_order_ops/cond.py", line 151, in cond
    return torch.compile(cond_op, backend="eager", fullgraph=True)(
  File "/home/yidi/local/pytorch/torch/_dynamo/eval_frame.py", line 397, in _fn
    return fn(*args, **kwargs)
  File "/home/yidi/local/pytorch/torch/_dynamo/eval_frame.py", line 545, in catch_errors
    return callback(frame, cache_entry, hooks, frame_state)
  File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 140, in _fn
    return fn(*args, **kwargs)
  File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 380, in _convert_frame_assert
    return _compile(
  File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 561, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "/home/yidi/local/pytorch/torch/_dynamo/utils.py", line 189, in time_wrapper
    r = func(*args, **kwargs)
  File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 483, in compile_inner
    out_code = transform_code_object(code, transform)
  File "/home/yidi/local/pytorch/torch/_dynamo/bytecode_transformation.py", line 1028, in transform_code_object
    transformations(instructions, code_options)
  File "/home/yidi/local/pytorch/torch/_dynamo/convert_frame.py", line 432, in transform
    tracer = InstructionTranslator(
  File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2032, in __init__
    self.symbolic_locals = collections.OrderedDict(
  File "/home/yidi/local/pytorch/torch/_dynamo/symbolic_convert.py", line 2035, in <genexpr>
    VariableBuilder(
  File "/home/yidi/local/pytorch/torch/_dynamo/variables/builder.py", line 229, in __call__
    vt = self._wrap(value).clone(**self.options())
  File "/home/yidi/local/pytorch/torch/_dynamo/variables/builder.py", line 374, in _wrap
    return type_dispatch(self, value)
  File "/home/yidi/local/pytorch/torch/_dynamo/variables/builder.py", line 808, in wrap_listlike
    output = [
  File "/home/yidi/local/pytorch/torch/_dynamo/variables/builder.py", line 809, in <listcomp>
    VariableBuilder(self.tx, GetItemSource(self.get_source(), i))(
  File "/home/yidi/local/pytorch/torch/_dynamo/variables/builder.py", line 229, in __call__
    vt = self._wrap(value).clone(**self.options())
  File "/home/yidi/local/pytorch/torch/_dynamo/variables/builder.py", line 374, in _wrap
    return type_dispatch(self, value)
  File "/home/yidi/local/pytorch/torch/_dynamo/variables/builder.py", line 808, in wrap_listlike
    output = [
  File "/home/yidi/local/pytorch/torch/_dynamo/variables/builder.py", line 809, in <listcomp>
    VariableBuilder(self.tx, GetItemSource(self.get_source(), i))(
  File "/home/yidi/local/pytorch/torch/_dynamo/variables/builder.py", line 229, in __call__
    vt = self._wrap(value).clone(**self.options())
  File "/home/yidi/local/pytorch/torch/_dynamo/variables/builder.py", line 374, in _wrap
    return type_dispatch(self, value)
  File "/home/yidi/local/pytorch/torch/_dynamo/variables/builder.py", line 1040, in wrap_tensor
    tensor_variable = wrap_fx_proxy(
  File "/home/yidi/local/pytorch/torch/_dynamo/variables/builder.py", line 1267, in wrap_fx_proxy
    return wrap_fx_proxy_cls(
  File "/home/yidi/local/pytorch/torch/_dynamo/variables/builder.py", line 1382, in wrap_fx_proxy_cls
    example_value = wrap_to_fake_tensor_and_record(
  File "/home/yidi/local/pytorch/torch/_dynamo/variables/builder.py", line 1652, in wrap_to_fake_tensor_and_record
    dynamic_dims, constraint_dims = _automatic_dynamic(
  File "/home/yidi/local/pytorch/torch/_dynamo/variables/builder.py", line 1550, in _automatic_dynamic
    if dim is not None and e.size()[i] != dim:
  File "/home/yidi/local/pytorch/torch/__init__.py", line 352, in __bool__
    return self.node.bool_()
  File "/home/yidi/local/pytorch/torch/fx/experimental/symbolic_shapes.py", line 1019, in bool_
    return self.guard_bool("", 0)
  File "/home/yidi/local/pytorch/torch/fx/experimental/symbolic_shapes.py", line 1001, in guard_bool
    r = self.shape_env.evaluate_expr(self.expr, self.hint, fx_node=self.fx_node)
  File "/home/yidi/local/pytorch/torch/fx/experimental/recording.py", line 227, in wrapper
    return fn(*args, **kwargs)
  File "/home/yidi/local/pytorch/torch/fx/experimental/symbolic_shapes.py", line 3793, in evaluate_expr
    assert orig_expr == hint, f"{orig_expr} != {hint}"
AssertionError: False != True

from user code:

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information

You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True
```

This is because we record the SymInt in the frame state in `_automatic_dynamic` the first time we compile the function. Then, the second time, when we are given a SymInt-sized input with different hints, the comparison fails.

Implementation:
This PR derives shape dynamism from the dynamism of the inputs: if a dimension is a SymInt, return DYNAMIC; otherwise return STATIC.
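
Roughly, the classification amounts to the following (a simplified sketch, not the actual `_automatic_dynamic` code):

```python
import torch

def dim_dynamism(t: torch.Tensor):
    # SymInt-sized dims come from symbolic tracing and should stay dynamic;
    # plain int dims keep the existing static/automatic treatment.
    return ["DYNAMIC" if isinstance(s, torch.SymInt) else "STATIC" for s in t.size()]
```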

Test Plan:
Add a test.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109331
Approved by: https://github.com/ezyang
2023-09-16 02:56:53 +00:00
ydwu4
6140facf00 Support SymBool input to torch.compile (#107850)
We could have SymBool inputs for torch.compile, e.g. in the following situation:
```
def f(x:torch.Tensor):
  pred = x.size(0) == 3
  torch.compile(f)(pred, x)

make_fx(f, tracing_mode="symbolic")(x)
```

The idea of this PR (credit to @ezyang) is to support SymBool by re-using the infra we've already had for SymInt so that we don't need to replicate a lot of stuff.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107850
Approved by: https://github.com/ezyang
ghstack dependencies: #107662
2023-09-14 21:34:31 +00:00
PyTorch MergeBot
47f79e9a2b Revert "Support SymBool input to torch.compile (#107850)"
This reverts commit 9f6d70b2fd.

Reverted https://github.com/pytorch/pytorch/pull/107850 on behalf of https://github.com/huydhn due to Sorry for reverting this, but test_export_with_symbool_inputs is failing in trunk a08e1370ef ([comment](https://github.com/pytorch/pytorch/pull/107850#issuecomment-1718675877))
2023-09-14 02:53:36 +00:00
Michael Voznesensky
064ae9ff33 Support register_hook on input tensors (#108903)
The strategy in this PR is pretty straightforward.

There are 2 kinds of hooks:

1) Hooks on objects with sources (inputs, params)
2) Hooks on objects w/o sources (intermediaries, and outputs).

Note: As outputs can be made simple by how dynamo handles residuals, they could actually be handled as if they were inputs, but, for the sake of this PR, we will refer to hooks as either hooks on inputs (sourced), or hooks on intermediaries (not sourced).

The plan:

**For tensors w/ a source:**
We record registered hooks, store them as a global, and associate them with the tensor in residuals. This means that when dynamo goes to create the frame, where we produce bytecode to stitch together our PT2-modified bytecode with the original eager code, we call `register_hook`. This registration of hooks in residuals is sound because (a) it happens right after a PT2 frame region ends and (b) we know that the tensor is alive in f_locals, f_globals, or a module in the user's invoking frame. This means we can soundly know it will be around to invoke `register_hook` on. As long as we guard on the identity of the lifted function, this is sound to do.

**For tensors w/o a source:**
Graph break - we will support this in a subsequent PR

**Handles:**

An interesting new component here is the creation of a `STORE_FAST` -> `LOAD_FAST` pair associated with the handle, the return value of `register_hook`. If the user code stored the result of `register_hook` in a handle, we need to honor that. We do so by interceding in `STORE_FAST` and recording the name of the local variable as directed by the user code. We then honor that same name in the reconstructed bytecode. If the user did not store the handle, we merely pop the produced value to preserve the stack.
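
A sketch of the user-facing pattern this PR targets (a hook on an input tensor, with the returned handle stored in a local; the hook itself is illustrative):

```python
import torch

def double_grad(grad):
    return grad * 2

@torch.compile(backend="eager")
def f(x):
    handle = x.register_hook(double_grad)  # x has a source, so registration lands in the residual bytecode
    return (x * x).sum()

x = torch.randn(4, requires_grad=True)
f(x).backward()
# x.grad is now doubled relative to the un-hooked gradient
```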

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108903
Approved by: https://github.com/ezyang
ghstack dependencies: #108846, #109092
2023-09-14 01:52:21 +00:00
ydwu4
9f6d70b2fd Support SymBool input to torch.compile (#107850)
We could have SymBool inputs for torch.compile, e.g. in the following situation:
```
def f(x:torch.Tensor):
  pred = x.size(0) == 3
  torch.compile(f)(pred, x)

make_fx(f, tracing_mode="symbolic")(x)
```

The idea of this PR (credit to @ezyang) is to support SymBool by re-using the infra we've already had for SymInt so that we don't need to replicate a lot of stuff.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107850
Approved by: https://github.com/ezyang
ghstack dependencies: #107662
2023-09-14 01:16:29 +00:00
PyTorch MergeBot
1d32c9c7f2 Revert "Force synced KJT to trace unbacked SymInt (#108960)"
This reverts commit f9a250c35b.

Reverted https://github.com/pytorch/pytorch/pull/108960 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/108960#issuecomment-1715850779))
2023-09-12 14:37:36 +00:00
Edward Yang
f9a250c35b Force synced KJT to trace unbacked SymInt (#108960)
Summary:
The basic concept behind this diff is to modify Dynamo's tracing behavior when it encounters a KeyedJaggedTensor that is synced (aka has `_length_per_key` and `_offset_per_key` populated). These fields are lists of integers; ordinarily, Dynamo will optimistically try to specialize on integers. However, for KJTs, we know that these integers will definitely vary from run to run. Furthermore, ordinarily we would also specialize these integers if they are 0/1, but we frequently expect features in KJTs to be 0/1.

The fix is to detect KJTs and treat these integers as *unbacked integers*. This is NOT a universally sound optimization: when treating these integers as unbacked, we never report them as equal to zero or one. In return, we always generate graphs that generalize no matter the length of values on features. This is enough to trace through APS sparse arch, torchrec_dlrm and some small split-cat examples.

The special integer behavior is triggered by a dynamically scoped `force_unspec_int_unbacked_size_like` variable on TracingContext, which we trigger when we wrap a KJT. There probably are other ways to do this, but this was simple and worked.

Test Plan:
```
buck2 test mode/dev-nosan //pytorch/benchmark/fb/test_gpu:run_test_gpu
```

from aakhundov

1. first build feed_lower_benchmark:
```
buck2 build --show-output mode/opt -c python.package_style=inplace -c fbcode.enable_gpu_sections=true -c fbcode.platform=platform010 -c fbcode.split-dwarf=true hpc/new/models/feed/benchmark:feed_lower_benchmark
```
2. then run the lowering of the model with it:
```
TORCHINDUCTOR_MAX_AUTOTUNE=1 TORCHINDUCTOR_UNIQUE_KERNEL_NAMES=1 TORCH_LOGS="output_code,graph_code" TORCH_COMPILE_DEBUG=1 ../buck-out/v2/gen/fbcode/79c6b019ee0f9469/hpc/new/models/feed/benchmark/__feed_lower_benchmark__/feed_lower_benchmark.par --load=manifold://ig_inference_model/tree/user/facebook/fblearner/predictor/960999465/60/gpu_lowering/input.predictor --skip-trt --skip-ait --sync-mode=0 --enable-aot-inductor --lower-presets="ig_stories" --gpu-trace
```
cf https://docs.google.com/document/d/1yD30xYrdmM8r2HTdmXnZTg0-MHVexfVrAa0294m1AUE/edit?pli=1#heading=h.qiv3fp7e6zg0

From torchrec: https://www.internalfb.com/intern/wiki/Torchrec/Development/Testing_production_models/

From ge0405
baseline (without your diff): f477293168
your diff: f477292363

Differential Revision: D49019987

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108960
Approved by: https://github.com/voznesenskym
2023-09-12 03:44:24 +00:00
Michael Voznesensky
e4350d6d4e Functools partial support in dynamo (#108846)
The strategy for supporting functools partials is relatively straightforward.

There are 2 cases we need to support:

**1) Functools partials as input**
In this case, we are first seeing the functools partial and it is guaranteed to have a source. As such, the args, keywords, and func of the functools partial are passed through VariableBuilder. As this is the first time we are seeing these objects (as it is an input), we re-enter VariableBuilder with a source referencing the args, keywords, and func as attributes of the input to produce:

- func: A callable VariableTracker (UDF, TorchVariable, etc) depending on the value of `func`
- args: List[VariableTracker] - note, not ListVariableTracker!
- keywords: Dict[str, VariableTracker]

A major benefit of this structure is that it very elegantly matches the args to `call_function`.

We then compose a FunctoolsPartialVariable from the VariableTrackers made above.

**2) Functools partials created within compile**
In this case, we already have all the args as known VTs, and thus just compose a FunctoolsPartialVariable as we do for case (1).

For both (1) and (2), we propagate all guards from the func, args, and keyword VTs to the FunctoolsPartialVariable. See the sketch below.
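
A small illustration of both cases (function names and values are made up for the example):

```python
import functools
import torch

def scale(x, factor):
    return x * factor

double = functools.partial(scale, factor=2.0)  # case (1): partial passed in as an input

@torch.compile(backend="eager")
def f(fn, x):
    triple = functools.partial(scale, factor=3.0)  # case (2): partial created inside the compiled region
    return fn(x) + triple(x)

out = f(double, torch.randn(3))
```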

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108846
Approved by: https://github.com/ezyang, https://github.com/jansel
2023-09-09 17:25:02 +00:00
CK Luk
366baf690b Back out "[Dynamo x FSDP] Add support for params, buffers, submodules on FSDPManagedNNModuleVariable (#107923)" (#108823)
Summary:
Original commit changeset: 33650f7cb0fb

Original Phabricator Diff: D48833682

Test Plan: See T162942232 for how we figured out that this diff caused significant numeric difference.

Reviewed By: voznesenskym

Differential Revision: D49082219

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108823
Approved by: https://github.com/xw285cornell
2023-09-08 14:39:43 +00:00
Huy Do
5a4fe05a15 Revert "Force synced KJT to trace unbacked SymInt (#107788)" (#108684)
This manually reverts commit 3b92ef814d, since the automated revert did not go through.

(Not sure why the bot doesn't work on https://github.com/pytorch/pytorch/pull/107788)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108684
Approved by: https://github.com/ezyang
2023-09-06 19:15:45 +00:00
Edward Z. Yang
3b92ef814d Force synced KJT to trace unbacked SymInt (#107788)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107788
Approved by: https://github.com/voznesenskym
2023-09-06 03:18:26 +00:00
vasiliy
3702980717 dynamo: trace autograd.Function with tensor subclass input (#108093)
Summary:

Enables dynamo eager mode tracing for the following situation:
1. we have a torch.autograd.Function
2. the input to that function is a tensor subclass which is an intermediary

This is useful for float8 training UX.

Test Plan:

```
python test/dynamo/test_autograd_function.py -k intermediary_input
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108093
Approved by: https://github.com/bdhirsh, https://github.com/wanchaol
2023-09-01 02:12:38 +00:00
Wanchao Liang
a29b9101fa [dynamo] fix dynamo + DTensor to work with 2d (#108329)
Pair-debugged with @wconstab; we found issues on both the dynamo side and
the TP's FSDP extension side. This PR fixes the dynamo + DTensor integration
so that the current graph-break FSDP can work with tensor parallel by moving
torch.compile after FSDP wrapping.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108329
Approved by: https://github.com/Skylion007, https://github.com/wconstab
2023-08-31 22:46:26 +00:00
Michael Lazos
49df1de383 Cudagraphs support for compiled optimizers (#107504)
Marks all params/optimizer state as static addresses and adds a finalizer which cleans up the graph attributes when the optimizer goes out of scope.

**Note:** this does not mark grads as static because that would increase memory usage significantly.

There are two cases:
1. The upstream graph is cudagraphed - this case will work fine OOTB
2. The upstream graph is not cudagraphed - in this case, there will be a lot of copies introduced from the upstream (to copy the grads) into cudagraph-owned memory, unless the user explicitly marks the grads as static. If the user does this, it will also require not deallocating the grads in zero_grad() (either the module or optimizer version) by setting them to zero instead of None. There is a PR (https://github.com/pytorch/pytorch/pull/107853) in flight to throw an error if zero_grad attempts to set static grads to None.
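
A hedged sketch of the intended usage for case (1) above, with the optimizer step compiled under the cudagraph-enabled inductor mode (requires a CUDA device; the model and training loop are illustrative):

```python
import torch

model = torch.nn.Linear(8, 8).cuda()
opt = torch.optim.Adam(model.parameters())

@torch.compile(mode="reduce-overhead")  # inductor mode that enables cudagraphs
def step():
    opt.step()

for _ in range(3):
    loss = model(torch.randn(4, 8, device="cuda")).sum()
    loss.backward()
    step()
    opt.zero_grad(set_to_none=False)  # keep grad storage alive rather than freeing it
```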

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107504
Approved by: https://github.com/eellison
2023-08-31 20:47:18 +00:00
Yanbo Liang
dabdb97087 [Dynamo] Graph break on functions using tensor out variants (#108182)
Fixes #108021

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108182
Approved by: https://github.com/eellison
2023-08-31 17:49:14 +00:00
Michael Lazos
0297232053 Fix operator precedence (#108196)
Summary: Ensure that modules are only installed if they are not fsdp modules.

Differential Revision: D48810186

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108196
Approved by: https://github.com/shunting314, https://github.com/anijain2305
2023-08-30 14:00:33 +00:00
voznesenskym
f3a8d57aea [Dynamo x FSDP] Add support for params, buffers, submodules on FSDPManagedNNModuleVariable (#107923)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107923
Approved by: https://github.com/wconstab
2023-08-29 08:54:13 +00:00
ydwu4
cbcd551045 Fix torch.compile FunctionalTensor inputs for higherOrderOps (#107604)
Before this PR, for the added [test](https://github.com/pytorch/pytorch/pull/107604/files#diff-c618f2274b6b5ccc533c580549d2e552edbd9fc5ac0da1aa4b00338525c8f78dR224), which feeds FunctionalTensorWrapper inputs to a higherOrderOperator, we hit an assertion error at this line of [code](https://github.com/pytorch/pytorch/pull/107604/files#diff-9f0663783bcd93e948e0491ef61b48123bdc9977bcc632fd707da578df13bfa1R1284).

The key change in this PR is to this [line](https://github.com/pytorch/pytorch/pull/107604/files#diff-9f0663783bcd93e948e0491ef61b48123bdc9977bcc632fd707da578df13bfa1L1263) of checking:
```python
        elif (
            isinstance(example_value, FakeTensor)
            and example_value.fake_mode is tx.fake_mode
        ):
```
The original intention of this check seems to be to handle the case where we want to wrap an fx proxy for an intermediate fake tensor that's produced by some tensor ops and an example value is provided (as is the case for higher-order ops [here](https://github.com/pytorch/pytorch/blob/main/torch/_dynamo/variables/higher_order_ops.py#L85)). A fakified FunctionalTensorWrapper(FakeTensor) always fails this check. This PR changes it to check whether the value is already fakified by tx.fake_mode.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107604
Approved by: https://github.com/zou3519
ghstack dependencies: #107569
2023-08-23 02:42:18 +00:00
lezcano
db39a81e1e Add a flag that allows breaking on NumPy ops (#107687)
This was removed in 63d406a6a9.
Restoring it, as it's rather useful for debugging.
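
Assuming the flag is exposed as `torch._dynamo.config.trace_numpy` (the exact name is an assumption), usage looks roughly like:

```python
import numpy as np
import torch
import torch._dynamo as dynamo

dynamo.config.trace_numpy = False  # assumed flag: graph-break on NumPy ops instead of tracing them

@torch.compile(backend="eager")
def f(x):
    return torch.from_numpy(np.sin(x.numpy()))

f(torch.randn(4))
```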

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107687
Approved by: https://github.com/larryliu0820
2023-08-23 01:21:22 +00:00
ydwu4
a408920817 Reland fakify FunctionalTensor (#107569)
Try to rebase and reland https://github.com/pytorch/pytorch/pull/107062. One difference compared with the previous attempt is to keep the DTensor logic the same as it previously was in `_clone_input`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107569
Approved by: https://github.com/zou3519
2023-08-22 15:46:25 +00:00
lezcano
612c8a8c84 Guard numpy imports in the dynamo folder (#107299)
Fixes https://github.com/pytorch/pytorch/issues/107228

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107299
Approved by: https://github.com/atalman
2023-08-21 19:07:20 +00:00
PyTorch MergeBot
96c5be8bc4 Revert "Fakify leaf of FunctionalTensor (#107062)"
This reverts commit 3349725766.

Reverted https://github.com/pytorch/pytorch/pull/107062 on behalf of https://github.com/ydwu4 due to This appears to have broken the test TestDTensorCompile.test_dtensor_fullgraph. Probably a land race ([comment](https://github.com/pytorch/pytorch/pull/107062#issuecomment-1685447747))
2023-08-21 00:30:16 +00:00
ydwu4
3349725766 Fakify leaf of FunctionalTensor (#107062)
This PR allows dynamo to fakify a FunctionalTensorWrapper by unwrapping it, replacing the inner tensor, and wrapping it again, so that a FunctionalTensorWrapper can be passed as input to dynamo.optimize and we can support something like this:
```python
ff = torch.func.functionalize(f)
torch.compile(ff)(x)
```

This PR doesn't follow the \_\_tensor_flatten\_\_ and \_\_tensor_unflatten\_\_ protocol for now because we're not yet sure of the plan for doing that for FunctionalTensorWrapper (it's implemented in C++).

**Test Plan:**
Add a new test.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107062
Approved by: https://github.com/zou3519
ghstack dependencies: #107042
2023-08-19 17:33:42 +00:00
zhxchen17
8d6a487d69 [dynamo] Make KeyedJaggedTensor a variable. (#107319)
This is extracted from https://github.com/pytorch/pytorch/pull/107156/
to model KeyedJaggedTensor as a first-class concept in dynamo.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107319
Approved by: https://github.com/ezyang
2023-08-18 17:15:46 +00:00
PyTorch MergeBot
3c11184ca8 Revert "Fakify leaf of FunctionalTensor (#107062)"
This reverts commit 6cb0128c8a.

Reverted https://github.com/pytorch/pytorch/pull/107062 on behalf of https://github.com/ZainRizvi due to This appears to have broken the test TestDTensorCompile.test_dtensor_fullgraph.  Probably a land race ([comment](https://github.com/pytorch/pytorch/pull/107062#issuecomment-1684124230))
2023-08-18 16:02:54 +00:00
ydwu4
6cb0128c8a Fakify leaf of FunctionalTensor (#107062)
This PR allows dynamo to fakify a FunctionalTensorWrapper by unwrapping it, replacing the inner tensor, and wrapping it again, so that a FunctionalTensorWrapper can be passed as input to dynamo.optimize and we can support something like this:
```python
ff = torch.func.functionalize(f)
torch.compile(ff)(x)
```

This PR doesn't follow the \_\_tensor_flatten\_\_ and \_\_tensor_unflatten\_\_ protocol for now because we're not yet sure of the plan for doing that for FunctionalTensorWrapper (it's implemented in C++).

**Test Plan:**
Add a new test.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107062
Approved by: https://github.com/zou3519
ghstack dependencies: #107042
2023-08-18 03:05:45 +00:00
Animesh Jain
7cb2a6bfab [dynamo][fallback] Fallback to eager when backend fails with fake tensor exceptions (#107179)
Example (I think we should fix this test case for real, but I am using it to test the UX around fallbacks):

~~~
@torch.compile(backend="aot_eager")
def fn(x):
    return torch.sum(x, dim=1).tolist()

print(fn(torch.rand(4, 4).to(dtype=torch.int64)))
~~~

Running the script as is

~~~
[2023-08-14 14:53:48,863] torch._dynamo.output_graph: [WARNING] Backend compiler failed with a fake tensor exception at
[2023-08-14 14:53:48,863] torch._dynamo.output_graph: [WARNING]   File "/data/users/anijain/pytorch/examples/spl.py", line 5, in fn
[2023-08-14 14:53:48,863] torch._dynamo.output_graph: [WARNING]     return torch.sum(x, dim=1).tolist()
[2023-08-14 14:53:48,863] torch._dynamo.output_graph: [WARNING] Falling back to eager for this frame. Please use TORCH_LOGS=graph_breaks to see the full stack trace.
[0, 0, 0, 0]
~~~

Running the script with TORCH_LOGS="graph_breaks"

~~~
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG] WON'T CONVERT fn /data/users/anijain/pytorch/examples/spl.py line 3
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG] ========== TorchDynamo Stack Trace ==========
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG] Traceback (most recent call last):
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/_dynamo/output_graph.py", line 995, in call_user_compiler
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     compiled_fn = compiler_fn(gm, self.example_inputs())
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/_dynamo/repro/after_dynamo.py", line 117, in debug_wrapper
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     compiled_gm = compiler_fn(gm, example_inputs)
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/__init__.py", line 1586, in __call__
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     return self.compiler_fn(model_, inputs_, **self.kwargs)
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/_dynamo/backends/common.py", line 55, in compiler_fn
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     cg = aot_module_simplified(gm, example_inputs, **kwargs)
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/_functorch/aot_autograd.py", line 3795, in aot_module_simplified
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     compiled_fn = create_aot_dispatcher_function(
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/_dynamo/utils.py", line 194, in time_wrapper
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     r = func(*args, **kwargs)
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/_functorch/aot_autograd.py", line 3283, in create_aot_dispatcher_function
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     fw_metadata = run_functionalized_fw_and_collect_metadata(
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/_functorch/aot_autograd.py", line 757, in inner
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     flat_f_outs = f(*flat_f_args)
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/_functorch/aot_autograd.py", line 3400, in functional_call
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     out = Interpreter(mod).run(*args[params_len:], **kwargs)
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/fx/interpreter.py", line 138, in run
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     self.env[node] = self.run_node(node)
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/fx/interpreter.py", line 195, in run_node
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     return getattr(self, n.op)(n.target, args, kwargs)
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/fx/interpreter.py", line 289, in call_method
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     return getattr(self_obj, target)(*args_tail, **kwargs)
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/utils/_stats.py", line 20, in wrapper
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     return fn(*args, **kwargs)
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/_subclasses/fake_tensor.py", line 1233, in __torch_dispatch__
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     return self.dispatch(func, types, args, kwargs)
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/_subclasses/fake_tensor.py", line 1470, in dispatch
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     op_impl_out = op_impl(self, func, *args, **kwargs)
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/torch/_subclasses/fake_tensor.py", line 501, in local_scalar_dense
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     raise DataDependentOutputException(func)
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG] torch._subclasses.fake_tensor.DataDependentOutputException: aten._local_scalar_dense.default
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG] While executing %item : [num_users=1] = call_method[target=item](args = (%getitem,), kwargs = {})
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG] Original traceback:
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]   File "/data/users/anijain/pytorch/examples/spl.py", line 5, in fn
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]     return torch.sum(x, dim=1).tolist()
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]
[2023-08-14 14:54:15,689] torch._dynamo.output_graph.__graph_breaks: [DEBUG]
~~~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107179
Approved by: https://github.com/ezyang
2023-08-16 14:57:42 +00:00
Michael Lazos
e0d6072f69 Add API to mark input tensors static for cudagraphs (#107154)
Adds an API to mark a tensor as a static input.
To make this trigger recompiles properly, I'll need to update the tensor match checks to also check for this new attribute.

An additional concern is memory - the tensors will be kept alive, but this is the current behavior for nn modules and parameters.
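
Assuming the new API is exposed as `torch._dynamo.mark_static_address` (the name is an assumption), the intended usage is roughly:

```python
import torch
from torch._dynamo import mark_static_address  # assumed entry point for the new API

buf = torch.randn(32, device="cuda")
mark_static_address(buf)  # cudagraphs may treat this tensor's address as stable across replays

@torch.compile(mode="reduce-overhead")
def f(x):
    return x * 2

f(buf)
```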

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107154
Approved by: https://github.com/eellison
2023-08-16 04:38:19 +00:00
ydwu4
c71828b097 Lift non-FakeTensor restriction for compile (#107042)
Currently, we have an assertion that dynamo won't accept FakeTensor inputs unless we're exporting. This PR tries to remove this restriction to finish https://github.com/pytorch/pytorch/pull/105679.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107042
Approved by: https://github.com/ezyang, https://github.com/zou3519
2023-08-15 20:58:56 +00:00
Michael Voznesensky
71a336ef75 [Dynamo x FSDP][1/x] Builder support for deque, appendleft (#106884)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106884
Approved by: https://github.com/ezyang
2023-08-11 03:26:12 +00:00
lezcano
a9dca53438 NumPy support in torch.compile (#106211)
RFC: https://github.com/pytorch/rfcs/pull/54
First commit is the contents of https://github.com/Quansight-Labs/numpy_pytorch_interop/

We have already been using this in core for the last few months as a external dependency. This PR pulls all these into core.
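
As a quick illustration of the user-facing result (the function below is illustrative, not from the PR), plain NumPy code inside a compiled function is traced and compiled rather than breaking the graph:

```python
import numpy as np
import torch

@torch.compile
def numpy_fn(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    return np.sum(x * y, axis=1)

out = numpy_fn(np.random.randn(8, 8), np.random.randn(8, 8))
```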

In the next commits, I do a number of things in this order
- Fix a few small issues
- Make the tests that this PR adds pass
- Bend backwards until lintrunner passes
- Remove the optional dependency on `torch_np` and simply rely on the upstreamed code
- Fix a number of dynamo tests that were passing before (they were not testing anything, I think) and are not passing now.

Missing from this PR (but not blocking):
- Have a flag that deactivates tracing NumPy functions and simply breaks. There used to be one, but it stopped working after the merge and I removed it. @lezcano to investigate.
- https://github.com/pytorch/pytorch/pull/106431#issuecomment-1667079543. @voznesenskym to submit a fix after we merge.

All the tests in `tests/torch_np` take about 75s to run.

This was joint work by @ev-br, @rgommers, @honno, and me. I did not create this PR via ghstack (which would have been convenient) as this is a collaboration, and ghstack doesn't allow for shared contributions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106211
Approved by: https://github.com/ezyang
2023-08-11 00:39:32 +00:00
Jason Lu
bc88028e8e Back out "Reland "Make adding buffers more like adding parameters (#104069)" (#106224)" (#106743)
Summary:
Original commit changeset: 81319beb97f3

Original Phabricator Diff: D47961182

Test Plan: revert to maintain backward compat with legacy ads_dper3 production package. Read details in: S357822

Reviewed By: atuljangra

Differential Revision: D48131623

@diff-train-skip-merge
(D48131623 landed internally)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106743
Approved by: https://github.com/malfet
2023-08-08 15:27:34 +00:00
Edward Z. Yang
697893568d Improve error message when export encounters non-local input (#106403)
Previously, you would get an error like

```
Dynamo input and output is a strict subset of traced input/output
```

now you get

```
Cannot export model which references tensors that are neither
buffers/parameters/constants nor are direct inputs.  For each tensor, if you'd
like this tensor to be an explicit input, add it as a dummy argument
to the top-level model definition you are exporting; if you would
like its value to be embedded as an exported constant, wrap its access
in a function marked with @assume_constant_result.

G['bulbous_bouffant'], accessed at:
  File "test_export.py", line N, in f
    return bulbous_bouffant + y
```

This doesn't handle outputs; I'm going to hit that next.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106403
Approved by: https://github.com/tugsbayasgalan
2023-08-03 12:35:25 +00:00
Mikayla Gawarecki
d8e5f2aa6d Reland "Make adding buffers more like adding parameters (#104069)" (#106224)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106224
Approved by: https://github.com/atalman, https://github.com/albanD
2023-07-31 17:18:56 +00:00
Michael Voznesensky
8549abc347 Grab bag of DTensor enablement stuff (Enable whole graph capture for DTensor) (#105787)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105787
Approved by: https://github.com/ezyang
2023-07-30 00:17:45 +00:00
Yukio Siraichi
707aadeedd Track global Numpy variables as side-effect. (#105959)
Fix: #105074

This PR makes dynamo handle NumPy global variables the same way as PyTorch tensor global
variables, by tracking them as side effects.

In summary, we add `NumpyNdarrayVariable` to the
`VariableBuilder._can_lift_attrs_to_inputs` function.
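
A small sketch of the kind of program this covers (names are made up): a global NumPy array rebound inside a compiled function is now tracked as a side effect.

```python
import numpy as np
import torch

state = np.zeros(3)  # global ndarray

@torch.compile(backend="eager")
def f(x):
    global state
    state = state + 1  # rebinding a NumPy global, handled via side-effect tracking
    return x + 1

f(torch.zeros(3))
print(state)  # reflects the mutation after the compiled call
```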

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105959
Approved by: https://github.com/ezyang
2023-07-27 03:49:48 +00:00
Wanchao Liang
c76c84bde4 [dynamo] make ProcessGroupVariable a DistributedVariable (#105593)
This PR moves the ProcessGroupVariable from UDO to DistributedVT
so that the distributed VTs are consolidated together.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105593
Approved by: https://github.com/voznesenskym
2023-07-26 06:42:50 +00:00
Michael Voznesensky
54a673bdcf Initial sourceless builder (#104734)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104734
Approved by: https://github.com/ezyang
2023-07-24 02:48:32 +00:00
Wanchao Liang
66fbffce1f Fix unstable CI related to dynamo tests (#105797)
This PR fixes the currently unstable CI. The test failure comes from a bad
revert in https://github.com/pytorch/pytorch/pull/105581, which did not
revert the intended PR correctly (there were some merge conflicts, and some
logic got deleted during the revert).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105797
Approved by: https://github.com/ezyang
2023-07-23 05:43:54 +00:00
Thiago Crepaldi
09b5c35911 Support torch.onnx.dynamo_export within FakeTensorMode (#105477)
Currently, exporting a model to ONNX with fake tensor mode requires the
user to load the data and model within the `torch.onnx.enable_fake_mode` context,
but the actual call to `torch.onnx.dynamo_export` is done outside that
context.

With this PR, we enable `torch.onnx.dynamo_export` to be called either
within `torch.onnx.enable_fake_mode` or outside of it. This feature
required changes to the core PyTorch Dynamo, which were greatly
supported by @ezyang

In future steps we will determine which scenario we are going to
support, but for now we can use either to explore different options and
scenarios and assess their pros and cons.

This PR also creates a separate suite of tests for fake mode specific
scenarios (`TestFxToOnnxFakeTensorWithOnnxRuntime`).
It was done separately to decrease the test time, but we
could merge it with the default `TestFxToOnnxWithOnnxRuntime`. The
additional parameters are `load_checkpoint_during_init` and
`export_within_fake_mode`

With the newly added support for nested export within fake mode, the
following scenarios are now supported:

```python
import torch

with torch.onnx.enable_fake_mode() as fake_context:
    fake_args = create_args()
    fake_kwargs = create_kwargs()
    fake_model = create_model()
    fake_model.load_state_dict(torch.load(tmp_checkpoint_file.name))

    export_options = torch.onnx.ExportOptions(fake_context=fake_context)

    # `torch.onnx.dynamo_export` called WITHIN `torch.onnx.enable_fake_mode`
    export_output = torch.onnx.dynamo_export(
        fake_model,
        *fake_args,
        **fake_kwargs,
        export_options=export_options,
    )

    export_output.save("/path/to/model.onnx", model_state_dict=create_model())
```

If we decide to only support scenarios in which `torch._dynamo.export` is called within `FakeTensorMode`, then we can remove the `fake_mode` argument from `torch._dynamo.export` as a follow-up task.

ps: This PR is mostly Edward's https://github.com/pytorch/pytorch/pull/105468 + unit tests after an offline discussion
ps: https://github.com/pytorch/pytorch/issues/105464 tracks pending tasks/limitations from this PR

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105477
Approved by: https://github.com/ezyang, https://github.com/BowenBao
2023-07-22 03:50:52 +00:00
Andrey Talman
c6653b65d8 Back out "Make adding buffers more like adding parameters (#104069)" (#105581)
Summary:
D47537831 is breaking pyper tests: https://fb.workplace.com/groups/802176577445480/posts/1018902842439518/

with `TypeError: register_buffer() takes 3 positional arguments but 4 were given`

Original commit changeset: d4b4069fbd38

Original Phabricator Diff: D47537831

Test Plan:
```
buck2 run //caffe2/torch/fb/training_toolkit/integration_tests/training_lifecycle/cogwheel_tests/pyper_release_v2:cogwheel_smallworld_inline_cvr_infer_pyper_pyper__canary_offline_training-launcher -- --run-harness-in-tupperware --build-fbpkg ads_dper3 --build-fbpkg training_platform
```

Reviewed By: atalman

Differential Revision: D47600140

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105581
Approved by: https://github.com/mikaylagawarecki
2023-07-20 03:39:53 +00:00
Yukio Siraichi
5ce5372d70 Create tensor from Numpy in current device. (#105546)
Fix: #105046

This PR changes how tensors are created from NumPy arrays when tracing with
dynamo. Instead of using `from_numpy`, we use `as_tensor`. The latter takes
the current device into consideration.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105546
Approved by: https://github.com/lezcano
2023-07-19 21:31:52 +00:00
Wanchao Liang
f139aab2f4 [dynamo] add initial dynamo support for DTensor (#103146)
This PR adds initial dynamo support for DTensor. In particular, it:
- allows DTensor to be passed into a compiled function, and allows fakifying
DTensor during dynamo tracing by turning the inner local tensor into a meta
tensor.
- We use `allow_in_graph` to include `DTensor` and `DTensor.from_local` to be represented as `TorchVariable`
- The dtensor created becomes a normal `TensorVariable` and it would insert any tensor operations to the output graph just like torch.Tensor
- note that DTensor has a new instance method `redistribute` compared to a plain tensor, and we currently special-case it in `TensorVariable`

`from_local` and `redistribute` both accept some non-trivial metadata as arguments (i.e., DeviceMesh, Placement) which fx.Graph does not support. In order to let these two APIs appear in the dynamo-captured graph, we encoded the metadata into a new function (like `functools.partial`), and the new function only accepts prim args (i.e., tensors); then we put a `call_function` node with this new function into the graph. This was suggested by @ezyang. The underlying rationale here is that the metadata will not change across graph invocations, so it's safe to encode it.

Captured graph:
```
    def forward(self, L_x_ : torch.Tensor):
        l_x_ = L_x_

        # File: /scratch/wanchaol/work/pytorch/test/distributed/_tensor/test_dtensor.py:685, code: dt = DTensor.from_local(x, mesh, [Shard(0)], run_check=False)
        prim_from_local = torch__dynamo_variables_torch_prim_from_local(l_x_, run_check = False);  l_x_ = None

        # File: /scratch/wanchaol/work/pytorch/test/distributed/_tensor/test_dtensor.py:686, code: return dt.redistribute(mesh, [Replicate()]).to_local() + 2
        prim_redistribute = torch__dynamo_variables_tensor_prim_redistribute(prim_from_local);  prim_from_local = None
        to_local = prim_redistribute.to_local();  prim_redistribute = None
        add = to_local + 2;  to_local = None
        return (add,)
```
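
For reference, a sketch of the eager code whose captured graph is shown above (mesh construction is omitted and requires an initialized process group; module paths reflect the `torch.distributed._tensor` package at the time):

```python
import torch
from torch.distributed._tensor import DTensor, Replicate, Shard

def fn(x, mesh):
    dt = DTensor.from_local(x, mesh, [Shard(0)], run_check=False)
    return dt.redistribute(mesh, [Replicate()]).to_local() + 2

compiled = torch.compile(fn, backend="eager")
```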

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103146
Approved by: https://github.com/voznesenskym
2023-07-19 16:01:12 +00:00
Justin Chu
8a688277a2 [BE] Enable ruff's UP rules and autoformat dynamo / functorch and refs (#105432)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105432
Approved by: https://github.com/ezyang
2023-07-19 13:48:44 +00:00