Commit Graph

13 Commits

ydwu4
cc1de49340 [HigherOrderOp] fallthrough some keys by default. (#110478)
Fixes #109253

Test Plan:
Added a new test that shows default fallthrough keys can be overridden.
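
A minimal sketch (not the new test itself) of what overriding a default fallthrough key could look like; the operator name, handler, and choice of key here are illustrative, not taken from this PR.

~~~
import torch
from torch._ops import HigherOrderOperator

# Illustrative operator; real HigherOrderOps live inside PyTorch itself.
my_hop = HigherOrderOperator("my_hop")

# Keys that fall through by default can still be given an explicit kernel:
# registering a py_impl for such a key is expected to take precedence over
# the default fallthrough.
@my_hop.py_impl(torch._C.DispatchKey.ADInplaceOrView)
def my_hop_inplace_or_view(*args, **kwargs):
    # Hypothetical handler; a real kernel would redispatch or implement the op.
    raise NotImplementedError
~~~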

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110478
Approved by: https://github.com/ezyang
2023-10-05 16:25:42 +00:00
Yanbo Liang
8a567bb59d [HigherOrderOp] Should automatically pop modes (#109157)
Fixes #108282

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109157
Approved by: https://github.com/zou3519
2023-09-18 20:54:09 +00:00
PyTorch MergeBot
07f2efa285 Revert "[HigherOrderOp] Should automatically pop modes (#109157)"
This reverts commit f03b8abd47.

Reverted https://github.com/pytorch/pytorch/pull/109157 on behalf of https://github.com/clee2000 due to broke internal builds D49346922 ([comment](https://github.com/pytorch/pytorch/pull/109157#issuecomment-1722571262))
2023-09-17 21:19:52 +00:00
Yanbo Liang
f03b8abd47 [HigherOrderOp] Should automatically pop modes (#109157)
Fixes #108282

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109157
Approved by: https://github.com/zou3519
2023-09-14 20:46:26 +00:00
Animesh Jain
78a053bad7 [activation checkpointing] Add default autocast keys to functional rng wrappers (#107934)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107934
Approved by: https://github.com/xw285cornell
2023-08-25 18:22:02 +00:00
eellison
3495f0c999 Generate mypy hints for torch.Tag, add a couple of pointwise ops (#106910)
Replaces https://github.com/pytorch/pytorch/pull/106739, since I had a bad CLA commit.

- adds clone and convert_element_dtype to the pointwise tag
- adds codegen for mypy hints of torch.Tag and removes the existing ignores for them (see the sketch below)
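
A hedged illustration of what becomes type-checkable with the generated stubs; `clone` is used here because this PR tags it as pointwise.

~~~
import torch

# With the codegen'd stubs, mypy knows torch.Tag and its members, so this
# line needs no `type: ignore`. At runtime it reads the tags off the overload.
assert torch.Tag.pointwise in torch.ops.aten.clone.default.tags
~~~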

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106910
Approved by: https://github.com/mlazos
2023-08-10 05:12:27 +00:00
drisspg
788c825837 Higher order operator util for raising if inputs require grads (#106078)
🤖 Generated by Copilot at 08bd685

Added a utility function `autograd_not_implemented_check` to `torch._higher_order_ops.utils` and used it in `out_dtype_autograd` to simplify and standardize the error handling for higher order operators that do not support autograd.
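
A minimal sketch of the kind of check such a utility performs, assuming it raises when grad mode is enabled and any tensor input requires grad; this is not the exact implementation in `torch._higher_order_ops.utils`.

~~~
import torch
from torch.utils._pytree import tree_flatten

def autograd_not_implemented_check_sketch(op_name, args):
    # Raise instead of silently ignoring autograd for this higher order op.
    flat_args, _ = tree_flatten(args)
    if torch.is_grad_enabled() and any(
        isinstance(a, torch.Tensor) and a.requires_grad for a in flat_args
    ):
        raise RuntimeError(f"Autograd not implemented for {op_name}")
~~~
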
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106078
Approved by: https://github.com/zou3519
2023-08-01 00:13:13 +00:00
Zhengxu Chen
df281bf788 Refactor unwrap_proxy() for proxy tensor tracing. (#104667)
Test Plan: CI

Differential Revision: D47241815

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104667
Approved by: https://github.com/tugsbayasgalan
2023-07-06 03:03:13 +00:00
Richard Zou
280df5dc2e [HigherOrderOp] Remove _deprecated_global_ns from some ops (#104105)
The remaining ops after this PR are:
- cond
- map
- anything that is out of tree.

These are a bit more difficult to remove.

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104105
Approved by: https://github.com/ydwu4
2023-06-28 00:03:29 +00:00
rzou
036cda415f Change HigherOrderOperator default namespace from global to 'higher_order' (#103870)
This PR changes the default namespace for higher order operators from the
global namespace (e.g. torch.ops.cond) to `higher_order` (e.g.
torch.ops.higher_order.cond). We don't actually change the namespace
for existing HigherOrderOperators.
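
A hedged sketch of the new default; `my_hop` is an illustrative name, and existing operators such as `cond` keep their old namespace for now.

~~~
import torch
from torch._ops import HigherOrderOperator

my_hop = HigherOrderOperator("my_hop")  # defined after this change

# The operator is expected to be reachable under torch.ops.higher_order,
# i.e. torch.ops.higher_order.my_hop, rather than the global torch.ops.my_hop.
print(torch.ops.higher_order.my_hop)
~~~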

The motivation is to stem the bleeding; exposing operators into the global
namespace is a bad idea due to name collision with other user-defined
namespaces.

We will go in and fix the `_deprecated_global_ns` as necessary after this diff.

Differential Revision: [D46809738](https://our.internmc.facebook.com/intern/diff/D46809738/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103870
Approved by: https://github.com/ydwu4
2023-06-20 19:10:55 +00:00
Animesh Jain
58d2c66a70 [activation checkpointing] Higher order functional rng op wrappers (#102934)
Introduces two higher order operators
* run_and_save_rng_state - Saves the current rng state and then runs the op.
* run_with_rng_state - Runs the op with the rng state supplied as an input

Ideally, we would like to use torch.compile for these operators. But currently the plan is to introduce these operators at the partitioner level, obviating the need to support them fully through the torch.compile stack. To ensure that we have good enough debugging with minifiers, we have ensured that they work with make_fx. In the future, we can move them to torch.compile.
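
A hedged usage sketch of the two wrappers; the import path and calling convention shown here are assumptions inferred from the description above, not verified API.

~~~
import torch
from torch._prims.rng_prims import run_and_save_rng_state, run_with_rng_state

x = torch.randn(4, device="cuda")
op = torch.ops.aten.rand_like.default

# Save the current RNG state, then run the op (assumed return: state, output).
rng_state, a = run_and_save_rng_state(op, x)

# Later, re-run the op with the saved state; b should match a.
b = run_with_rng_state(rng_state, op, x)
~~~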

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102934
Approved by: https://github.com/jansel, https://github.com/zou3519
2023-06-12 22:54:17 +00:00
Animesh Jain
6bc4651193 [philox_rand] Dynamic shape support (#99290)
Extends the RNG functionalization work to dynamic shapes. An example of the generated graph looks like this:

~~~

[2023-04-24 21:41:37,446] torch._functorch.aot_autograd.__aot_graphs: [INFO] TRACED GRAPH
 ===== Forward graph 1 =====
 <eval_with_key>.7 class <lambda>(torch.nn.Module):
    def forward(self, arg0_1: i64[], arg1_1: i64[], arg2_1: Sym(s0), arg3_1: Sym(s1), arg4_1: f32[s0, s1]):
        # File: /scratch/anijain/work/pytorch/test/test_functionalization_of_rng_ops.py:46, code: a = torch.rand_like(x) * x
        add: i64[] = torch.ops.aten.add.Tensor(arg1_1, 0)
        philox_rand = torch.ops.rngprims.philox_rand.default([arg2_1, arg3_1], arg0_1, add, None, device(type='cuda', index=0), torch.float32);  add = None
        getitem: f32[s0, s1] = philox_rand[0]
        getitem_1: i64[] = philox_rand[1];  philox_rand = None
        add_1: i64[] = torch.ops.aten.add.Tensor(getitem_1, 0);  getitem_1 = None
        mul: f32[s0, s1] = torch.ops.aten.mul.Tensor(getitem, arg4_1);  getitem = arg4_1 = None

        # File: /scratch/anijain/work/pytorch/test/test_functionalization_of_rng_ops.py:47, code: a = torch.rand_like(x) * a
        add_2: i64[] = torch.ops.aten.add.Tensor(arg1_1, add_1)
        philox_rand_1 = torch.ops.rngprims.philox_rand.default([arg2_1, arg3_1], arg0_1, add_2, None, device(type='cuda', index=0), torch.float32);  arg2_1 = arg3_1 = arg0_1 = add_2 = None
        getitem_2: f32[s0, s1] = philox_rand_1[0]
        getitem_3: i64[] = philox_rand_1[1];  philox_rand_1 = None
        add_3: i64[] = torch.ops.aten.add.Tensor(add_1, getitem_3);  add_1 = getitem_3 = None
        mul_1: f32[s0, s1] = torch.ops.aten.mul.Tensor(getitem_2, mul);  getitem_2 = mul = None

        # No stacktrace found for following nodes
        add_4: i64[] = torch.ops.aten.add.Tensor(arg1_1, add_3);  arg1_1 = add_3 = None
        return (mul_1, add_4)

 ~~~

Each rand op is accompanied by its offset calculation op.
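
A hedged sketch of how a graph like the one above can be obtained; the config flag enabling RNG functionalization is an assumption about how the feature is switched on, and CUDA is required.

~~~
import torch
import torch._functorch.config as functorch_config

functorch_config.functionalize_rng_ops = True  # assumed opt-in flag

def f(x):
    a = torch.rand_like(x) * x
    a = torch.rand_like(x) * a
    return a

x = torch.randn(8, 16, device="cuda", requires_grad=True)
# dynamic=True makes the sizes symbolic (s0, s1) as in the graph above.
out = torch.compile(f, dynamic=True)(x)
~~~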

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99290
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2023-04-25 22:40:28 +00:00
Animesh Jain
fdbc8625a1 Functionalization of torch.rand/rand_like ops (#97377)
This PR introduces the functionalization of RNG ops. Key points are:

* Introduces a new `philox_rand` prim operator that accepts a seed and offset.
* Adds decompositions for random operators that use these philox_rand prims (sketched below).
* Adds a PhiloxStateTracker to track the offset for each occurrence of a rand op.
* Changes the calling convention of AOT Autograd and adds <fwd_seed, fwd_base_offset> and <bwd_seed, bwd_base_offset> inputs.
* Monkeypatches set_rng_state and get_rng_state during AOT Autograd tracing to record the rng state behavior.
* Raises an assertion for CPU because CPU does not have a Philox RNG.
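
A conceptual sketch, not the actual decomposition or the real PhiloxStateTracker: each rand-like call is rewritten to the philox_rand prim with a running offset, which in the forward graph below advances by 4 per call.

~~~
import torch

class OffsetTracker:  # illustrative stand-in for PhiloxStateTracker
    def __init__(self):
        self.offset = 0

    def advance(self, n):
        start, self.offset = self.offset, self.offset + n
        return start

def rand_like_decomp_sketch(x, seed, base_offset, tracker):
    # Argument order (shape, seed, offset, stride, device, dtype) is read off
    # the traced forward graph below.
    offset = base_offset + tracker.advance(4)
    return torch.ops.prims.philox_rand(
        list(x.shape), seed, offset, list(x.stride()), x.device, x.dtype
    )
~~~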

Not dealt with in this PR:
* dropout op - offset calculation is different
* other distributions like normal, poisson etc
* Inductor support
* Cudagraph support
* Dynamic shape support

An example:
~~~

class Custom(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        a = torch.rand_like(x) * x
        a = torch.rand_like(x) * a
        return a

    @staticmethod
    def backward(ctx, grad_out):
        x, = ctx.saved_tensors
        return grad_out * torch.rand_like(grad_out) * torch.cos(x)

====== Forward graph 0 ======
def forward(self, fwd_seed_1: i64[], fwd_base_offset_1: i64[], primals_1: f32[16, 16]):
    # No stacktrace found for following nodes
    add: i64[] = torch.ops.aten.add.Tensor(fwd_base_offset_1, 0)
    philox_rand: f32[16, 16] = torch.ops.prims.philox_rand.default([16, 16], fwd_seed_1, add, [16, 1], device(type='cuda', index=0), torch.float32);  add = None
    mul: f32[16, 16] = torch.ops.aten.mul.Tensor(philox_rand, primals_1);  philox_rand = None
    add_1: i64[] = torch.ops.aten.add.Tensor(fwd_base_offset_1, 4);  fwd_base_offset_1 = None
    philox_rand_1: f32[16, 16] = torch.ops.prims.philox_rand.default([16, 16], fwd_seed_1, add_1, [16, 1], device(type='cuda', index=0), torch.float32);  fwd_seed_1 = add_1 = None
    mul_1: f32[16, 16] = torch.ops.aten.mul.Tensor(philox_rand_1, mul);  philox_rand_1 = mul = None
    return [mul_1, primals_1]

====== Backward graph 0 ======
def forward(self, bwd_seed_1: i64[], bwd_base_offset_1: i64[], primals_1: f32[16, 16], tangents_1: f32[16, 16]):
    # No stacktrace found for following nodes
    add_2: i64[] = torch.ops.aten.add.Tensor(bwd_base_offset_1, 0);  bwd_base_offset_1 = None
    philox_rand_2: f32[16, 16] = torch.ops.prims.philox_rand.default([16, 16], bwd_seed_1, add_2, [16, 1], device(type='cuda', index=0), torch.float32);  bwd_seed_1 = add_2 = None
    mul_2: f32[16, 16] = torch.ops.aten.mul.Tensor(tangents_1, philox_rand_2);  tangents_1 = philox_rand_2 = None
    cos: f32[16, 16] = torch.ops.aten.cos.default(primals_1);  primals_1 = None
    mul_3: f32[16, 16] = torch.ops.aten.mul.Tensor(mul_2, cos);  mul_2 = cos = None
    return [mul_3]

~~~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97377
Approved by: https://github.com/ezyang
2023-04-16 09:55:56 +00:00