pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Animesh Jain	7e092a62e6	[dynamo] Support weakref objects (#128533 ) Fixes https://github.com/pytorch/pytorch/issues/125720 I was earlier worried that DELETE_* or STORE_* on referent values should result in a graph break, because they could invalidate the weak ref. But then @zou3519 pointed out that weakref invalidation will happen EVENTUALLY; CPython provides no guarantees about when the weakref will be invalidated (even when the user calls del x and x is the last reference). So any code that relies on del x to invalidate the weakref of x right away is BAD code. Therefore we can (ab)use this nuance, and can just ignore DELETE_* or STORE_* on the referent objects. The only corner case is when Dynamo is reconstructing the weakref object. Dynamo will have a hard time being correct here, so just SKIP_FRAME on such a case. This is rare. CPython notes 1) https://docs.python.org/3/library/weakref.html 2) https://docs.python.org/3/reference/datamodel.html#index-2 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128533 Approved by: https://github.com/jansel	2024-06-15 02:16:25 +00:00
Simon Fan	4b96575a09	[dynamo][aot autograd] Silently disable default saved tensor hooks during tracing (#123196 ) FIXES #113263. Same idea as in https://github.com/pytorch/pytorch/pull/113417, but we need a more intrusive C API to silently nop default saved tensor hooks, in order to support user code that uses torch.autograd.disable_saved_tensors_hooks (see test_unpack_hooks_can_be_disabled). We mock the output of get_hooks while leaving push/pop untouched. For compiled autograd, we're firing pack hooks once and unpack hooks twice right now; I'll look into this separately from this issue. Pull Request resolved: https://github.com/pytorch/pytorch/pull/123196 Approved by: https://github.com/soulitzer	2024-06-14 20:28:08 +00:00
Michael Lazos	b86b4ace88	Invalidate eager params when inlining and freezing nn modules (#128543 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/128543 Approved by: https://github.com/anijain2305	2024-06-13 04:50:17 +00:00
Animesh Jain	2b28b107db	[dynamo][fsdp] Dont take unspecializedNNModuleVariable path for FSDP modules (#128453 ) Co-authored-by: Laith Sakka <lsakka@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/128453 Approved by: https://github.com/yf225 ghstack dependencies: #126578, #128440, #128470	2024-06-12 22:03:45 +00:00
William Wen	85eeb90d2c	[dynamo] Fix graph breaks related to HF ModelOutput (#127780 ) Fixes https://github.com/pytorch/pytorch/issues/126028 and https://github.com/pytorch/pytorch/issues/126027. Changes: - Support building `CustomizedDictVariable` in `VariableBuilder` (but only for HF `ModelOutput` subclasses) - Remove `DataClassVariable` since it's not really being used anywhere (`CustomizedDictVariable` can be used instead) - Support side effects for `CustomizedDictVariable` - Allow `NO_HASATTR` leaf guard on `DictSubclassGuardManager` Pull Request resolved: https://github.com/pytorch/pytorch/pull/127780 Approved by: https://github.com/jansel, https://github.com/anijain2305	2024-06-12 02:16:24 +00:00
Wei Chen	bb2a995529	Back out "[Dynamo] Treat integers stored on nn.Modules as dynamic (#126466 )" (#128432 ) Summary: Original commit changeset: c7d2e6b13922 Original Phabricator Diff: D57618942 Differential Revision: D58383241 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128432 Approved by: https://github.com/ezyang, https://github.com/Yuzhen11	2024-06-12 01:34:32 +00:00
Michael Lazos	a32157c67c	Mark params static if inlining modules and freezing (#128355 ) Today inlining builtin nn modules is not compatible with parameter freezing. Freezing parameters and then constant folding them through the graph relies on the assumption that they will not be inputs and will be static across calls to the same graph. When inlining builtin nn modules this assumption is broken and we reuse the same graph for different instances of the same nn module. There are three options: 1) abandon constant folding, 2) create a dispatcher layer (like cudagraphs) which will dispatch to the correct constant-folded graph for each distinct set of parameters, or 3) recompile. This PR implements option 3 by introducing guards on the parameter pointers. This choice was made because freezing is relatively rare and performance sensitive. Option 2 had many more unknowns, and option 1 is not viable due to the drop in performance. Pull Request resolved: https://github.com/pytorch/pytorch/pull/128355 Approved by: https://github.com/anijain2305	2024-06-11 06:48:26 +00:00
Animesh Jain	05711eece9	[dynamo][inlining inbuilt modules] Ensure BC for nn_module_stack (#128295 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/128295 Approved by: https://github.com/ydwu4	2024-06-10 23:11:04 +00:00
Andrew M. James	80a8fc07b2	[dynamo] Handle np.iinfo/finfo/dtype as input (#124482 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/124482 Approved by: https://github.com/lezcano ghstack dependencies: #124481	2024-05-29 16:00:15 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar	72f0bdcc22	Remove torch._constrain_as_value (#127103 ) Summary: This API doesn't do anything useful and should be subsumed by torch._check. Test Plan: CI Differential Revision: D57786740 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127103 Approved by: https://github.com/angelayi	2024-05-24 22:49:46 +00:00
Aart Bik	ff82e2e7cf	[traced-graph][sparse] propagate sparsity metadata into traced graph (#117907 ) Propagate sparsity metadata from sparse tensors of torch.sparse into the traced graph representation (which would be useful for a JIT backend that supports a "sparse compiler"). This is a first careful attempt, since the actual "meta" feature still seems incomplete for coo and completely lacking for csr/csc/bsr/bsc. For background see forum postings (with examples): https://discuss.pytorch.org/t/connecting-pytorch-sparse-tensors-with-mlir/195145 https://dev-discuss.pytorch.org/t/connecting-pytorch-sparse-tensors-with-mlir/1803 And feature request: https://github.com/pytorch/pytorch/issues/117188 Pull Request resolved: https://github.com/pytorch/pytorch/pull/117907 Approved by: https://github.com/pearu, https://github.com/ezyang	2024-05-23 22:46:46 +00:00
Edward Z. Yang	0d17aae242	Teach FakeTensor to fill in item_memo when converting scalar CPU tensor (#126245 ) This PR requires a little justification, but let's start with what it does first: 1. When you have a 0d CPU scalar int64/float64 tensor input to a graph, we will preallocate a backed SymInt/SymFloat corresponding to what you would get if you call item() on this tensor. This means you can freely change your input to be a Python int/float or a Tensor with an item() call and end up with exactly the same level of expressivity (specifically, you can guard on the internal SymInt/SymFloat no matter what). By default, the source of the backed SymInt/SymFloat is `L['tensor'].item()`, but if you have promoted a float input into a Tensor, we will cancel out `torch.as_tensor(L['float']).item()` into just `L['float']`. 2. We switch wrap_symfloat to use this, instead of hand crafting the new SymNodeVariable. Everything works out, except that we carefully pass the item() result to tracked fakes (and not the fake Tensor argument). OK, so why do this at all? There is some marginal benefit where now some item() calls on scalar inputs can be guarded on, but IMO this is a pretty marginal benefit, and if it was the only reason, I wouldn't do this. The real reason for this is that I need to be able to propagate fake tensors through the graphs that are produced by Dynamo, and if I am doing the old custom wrap_symfloat logic, there's no way I can do this, because ordinarily an item() call will cause an unbacked SymInt when I reallocate. The other obvious way to solve the problem above is to make a HOP alternative to item() that "bakes in" the backed SymInt it's supposed to return. But this strategy seems more parsimonious, and it does have the marginal benefit I mentioned above. The main downside is that what I have to do next is make it so that when I run tensor computation, I also apply the equivalent operations to the SymInt/SymFloat as well. That's the next PR. 
Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/126245 Approved by: https://github.com/eellison ghstack dependencies: #126637	2024-05-22 15:25:38 +00:00
Yanbo Liang	c1b90a4e8a	[Dynamo] Treat integers stored on nn.Modules as dynamic (#126466 ) Fixes #115711 Pull Request resolved: https://github.com/pytorch/pytorch/pull/126466 Approved by: https://github.com/jansel	2024-05-21 03:31:20 +00:00
PyTorch MergeBot	71b6459edc	Revert "[Dynamo] Treat integers stored on nn.Modules as dynamic (#126466 )" This reverts commit `6bb9d6080d`. Reverted https://github.com/pytorch/pytorch/pull/126466 on behalf of https://github.com/huydhn due to Sorry for reverting your change but the ONNX test failure looks legit, not flaky, as it starts failing in trunk `6bb9d6080d` ([comment](https://github.com/pytorch/pytorch/pull/126466#issuecomment-2119078245))	2024-05-19 02:52:11 +00:00
Yanbo Liang	6bb9d6080d	[Dynamo] Treat integers stored on nn.Modules as dynamic (#126466 ) Fixes #115711 Pull Request resolved: https://github.com/pytorch/pytorch/pull/126466 Approved by: https://github.com/jansel	2024-05-18 05:02:16 +00:00
Animesh Jain	173b1d811d	[dynamo] Sourceless builder - ordered dict and re.pattern (#126468 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/126468 Approved by: https://github.com/Skylion007	2024-05-17 23:24:55 +00:00
Yanbo Liang	dfab69fdf1	[Inductor] Flex attention supports dynamic shape (#125994 ) ## static shapes perf ``` \| Type \| Speedup \| batch_size \| num_heads \| q_seq_len \| k_seq_len \| head_dim \| score_mod \| dtype \| \|---------\|-----------\|--------------\|-------------\|-------------\|-------------\|------------\|-------------\|----------------\| \| Average \| 0.692 \| \| \| \| \| \| \| \| \| Max \| 0.855 \| 16 \| 16 \| 4096 \| 4096 \| 64 \| head_bias \| torch.bfloat16 \| \| Min \| 0.419 \| 8 \| 16 \| 512 \| 512 \| 256 \| noop \| torch.bfloat16 \| ``` ## dynamic shapes perf ``` \| Type \| Speedup \| batch_size \| num_heads \| q_seq_len \| k_seq_len \| head_dim \| score_mod \| dtype \| \|---------\|-----------\|--------------\|-------------\|-------------\|-------------\|------------\|---------------\|----------------\| \| Average \| 0.670 \| \| \| \| \| \| \| \| \| Max \| 0.864 \| 16 \| 16 \| 4096 \| 4096 \| 64 \| relative_bias \| torch.bfloat16 \| \| Min \| 0.376 \| 8 \| 16 \| 512 \| 512 \| 256 \| relative_bias \| torch.bfloat16 \| ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/125994 Approved by: https://github.com/Chillee	2024-05-15 04:43:24 +00:00
Edward Z. Yang	2ba102f689	Implement native support for float inputs in Dynamo and ShapeEnv (#125325 ) The big idea is that floats are treated as Tensors on input/output to the FX graph, but on the inside, we immediately call item() on the synthetic Tensor and record regular float operations on it. Canonicalization to Tensor operations will happen in a standalone FX pass. This behavior is controlled by `specialize_float` config variable when set to False. The generated graph looks like this for the test `test_unspec_float_output`: ``` def forward(self, L_x_: "f32[3]", L_y_: "f32[]"): l_x_ = L_x_ l_y_ = L_y_ # File: /data/users/ezyang/a/pytorch/test/dynamo/test_unspec.py:511 in f, code: return x + 1, y * 2 add: "f32[3]" = l_x_ + 1; l_x_ = None item: "Sym(zf0)" = l_y_.item(); l_y_ = None mul: "Sym(2*zf0)" = item * 2; item = None scalar_tensor: "f32[]" = torch.scalar_tensor(mul); mul = None return (add, scalar_tensor) ``` The ingredients: * torch/_dynamo/variables/builder.py When `specialize_float` is False, we wrap float literals with `wrap_symfloat`. This is an unholy mashup of `wrap_symint` and `wrap_unspecialized_primitive`. The overall strategy is that we first generate a tensor argument (because that's what we want to show up into the FX graph), but then immediately call item() on the tensor argument to get a SymNodeVariable, which we will do the rest of the tracing with. Importantly, this SymNodeVariable is backed with the source of the original float: this means we can guard on the resulting value (something we could NOT do with UnspecializedPythonVariable). This has to be done manually, because if you literally call item() on the tensor, you will end up with an unbacked float. There is a bit of copy paste from wrap_symint and wrap_unspecialized_primitive which we can try to factor out, but this really is its own thing and you should review every line of code in the function. 
* torch/fx/experimental/symbolic_shapes.py We now can generate guards on float inputs, and these guards are handled inside of ShapeEnv. So we need to be able to allocate (backed!) float symbols, and produce guards for them. Fairly straightforward generalization. * torch/_dynamo/codegen.py I also need to maintain the invariant that there are no float outputs to the FX graph. I chose to do this at codegen time. When we detect a SymNodeVariable on the return stack for a float, we on the fly convert it (via `as_tensor`) to a TensorVariable, which is the true output. We then special case the output bytecode to call item() on it again. The tensor conversion is memoized on SymNodeVariable since we typically run the code generation process twice. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125325 Approved by: https://github.com/lezcano, https://github.com/jansel	2024-05-14 04:10:01 +00:00
Simon Fan	7e0edafe86	[compiled autograd][dynamo] improve lifted autograd.Function.backward handling and fallback to pseudo-eager (#125661 ) - `FakeContext` hides all fields other than ctx.saved_tensors, so Dynamo errors when the autograd.Function.backward uses other attrs on ctx, and it also doesn't allow fallback to eager. - If we remove it, we still can't fallback to eager: node variables are already freed (ctx.saved_tensors throws) - However, we can fallback to "pseudo-eager" by using a duck-typed ctx and routing the ctx.saved_tensors to lifted tensors - Dynamo tries to inline external_utils.call_backward, treats BackwardCFunction as an AutogradFunctionContextVariable (only used up until we create the fake context: FakeBackwardCFunction) - we call_function backward from the forward class AutogradFunctionVariable, and we still pass in the fake context as a UserDefinedObjectVariable (can later use AutogradFunctionContextVariable + HOO graph speculate) Fixes #125489 #124827 Pull Request resolved: https://github.com/pytorch/pytorch/pull/125661 Approved by: https://github.com/jansel	2024-05-08 21:00:37 +00:00
ydwu4	461ffaaaf3	[dynamo] support torchbind object input (#124978 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/124978 Approved by: https://github.com/jansel	2024-05-07 03:02:00 +00:00
Edward Z. Yang	650a248d3e	Rename is_unspecialized to pass_arg_as_tensor, add comment (#125496 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125496 Approved by: https://github.com/lezcano ghstack dependencies: #125395, #125419, #125483, #125494	2024-05-05 16:57:50 +00:00
Edward Z. Yang	12da7ee58f	Don't use wrap_fx_proxy_cls for wrap_symint (#125494 ) We use very little of the code in wrap_fx_proxy_cls, so dupe it out. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125494 Approved by: https://github.com/lezcano ghstack dependencies: #125395, #125419, #125483	2024-05-05 16:57:50 +00:00
Edward Z. Yang	617e473da5	Split wrap_symint out of wrap_unspecialized_primitive (#125483 ) While there are some similarities, they are also quite different (one handles NumPy numbers while the other handles ints). I am also going to add a wrap_symfloat soon which will behave even more differently. So split these out for clarity. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125483 Approved by: https://github.com/lezcano ghstack dependencies: #125395, #125419	2024-05-05 16:57:50 +00:00
Edward Z. Yang	b4ccc615cd	Do exact type match on int so we don't pick up bool here too (#125305 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125305 Approved by: https://github.com/Skylion007	2024-05-01 19:46:36 +00:00
Animesh Jain	e68d65dae2	[dynamo][cpp-guards] Differentiate dict guards wrt to guarding on key order (#124779 ) We guard on key order 1) When a key is a non-constant object 2) When we actually need key order - like .values, .items etc For dicts/OrderedDicts that do not require key order guarding, we just rely on usual `GuardManger + DictGetItemGuardAccessor`. This is faster than going through the `list(d.keys())` based design for OrderedDicts. Pull Request resolved: https://github.com/pytorch/pytorch/pull/124779 Approved by: https://github.com/jansel	2024-04-25 08:20:35 +00:00
Animesh Jain	59a1f1f308	[dynamo][inline inbuilt nn modules] Do not inline for export (#124814 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/124814 Approved by: https://github.com/jansel	2024-04-25 06:35:31 +00:00
Aaron Gokaslan	5a1216bb2e	[BE]: Update ruff to 0.4.1 (#124549 ) Update ruff to 0.4.1. This version fixes a lot of false negatives/false positives, is 20-40% faster, and has various other bug fixes. Below is a before and after table showing the execution time of ruff lint and ruff format in milliseconds courtesy of https://astral.sh/blog/ruff-v0.4.0 \| Repository \| Linter (v0.3) \| Linter (v0.4) \| Formatter (v0.3) \| Formatter (v0.4) \| \|----------------------------------------------------\|---------------\|---------------\|------------------\|------------------\| \| [pytorch/pytorch](https://github.com/pytorch/pytorch) \| 328.7 \| 251.8 \| 351.1 \| 274.9 \| Pull Request resolved: https://github.com/pytorch/pytorch/pull/124549 Approved by: https://github.com/ezyang	2024-04-21 14:06:23 +00:00
JackCaoG	7ae835eee4	Enable SourcelessBuilder to build GraphModule generated by make_fx (#123673 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123673 Approved by: https://github.com/ezyang, https://github.com/anijain2305 ghstack dependencies: #123680	2024-04-19 17:23:51 +00:00
Xuehai Pan	a6f044a490	[dynamo, 3.8-3.9] support dataclass with `frozen=True` in Python 3.8/3.9 (#124393 ) Closes #114966 Frozen field assignment in `__init__` in Python 3.8-3.9: `f5bd65ed37/Lib/dataclasses.py (L402-L411)` ```python import builtins BUILTINS = builtins def _field_assign(frozen, name, value, self_name): # If we're a frozen class, then assign to our fields in __init__ # via object.__setattr__. Otherwise, just use a simple # assignment. # # self_name is what "self" is called in this function: don't # hard-code "self", since that might be a field name. if frozen: return f'BUILTINS.object.__setattr__({self_name},{name!r},{value})' return f'{self_name}.{name}={value}' ``` Frozen field assignment in `__init__` in Python 3.10+: `812245ecce/Lib/dataclasses.py (L436-L445)` ```python __dataclass_builtins_object__ = object def _field_assign(frozen, name, value, self_name): # If we're a frozen class, then assign to our fields in __init__ # via object.__setattr__. Otherwise, just use a simple # assignment. # # self_name is what "self" is called in this function: don't # hard-code "self", since that might be a field name. if frozen: return f'__dataclass_builtins_object__.__setattr__({self_name},{name!r},{value})' return f'{self_name}.{name}={value}' ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/124393 Approved by: https://github.com/jansel	2024-04-19 05:10:33 +00:00
Edward Z. Yang	bebdbb63ce	Introduce set_example_value and use it throughout Dynamo (#124176 ) I'm going to set up some extra behavior when we set example value, so I need a convenient place to interpose. I cannot easily do it on meta itself because it's a generic dict with no interposition point. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/124176 Approved by: https://github.com/oulgen ghstack dependencies: #124105, #124059	2024-04-17 22:57:11 +00:00
Animesh Jain	f433517181	[dynamo][decorator] Support disable on nn modules (#124185 ) Fixes https://github.com/pytorch/pytorch/issues/123979 Pull Request resolved: https://github.com/pytorch/pytorch/pull/124185 Approved by: https://github.com/weifengpy, https://github.com/yoyoyocmu	2024-04-17 16:20:34 +00:00
Jason Ansel	11e6f84ad8	[dynamo] Graph break on uninitialized nn.Module (#123790 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123790 Approved by: https://github.com/anijain2305 ghstack dependencies: #123700, #123705, #123786	2024-04-12 19:03:13 +00:00
Jason Ansel	6b0ba6bbd3	[dynamo] Improve constant-prop for regex/torch.__version__ (#123705 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123705 Approved by: https://github.com/anijain2305 ghstack dependencies: #123700	2024-04-12 19:03:13 +00:00
Simon Fan	7fc3aa5f81	[compiled autograd][aot] Trim runtime refs for list inputs from dynamo (#122535 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/122535 Approved by: https://github.com/bdhirsh ghstack dependencies: #123630, #123674, #122353, #123359	2024-04-12 10:29:09 +00:00
Simon Fan	d274d57037	[compiled autograd][dynamo] Make compiled graph take in boxed inputs (#122353 ) ### Context In today's Dynamo, we lift all tensors encountered during tracing to be individual graph inputs, even when they were in a container. And [Dynamo generates](`fdc281f258/torch/_dynamo/codegen.py (L371)`) the runtime function's signature using the graph's graphargs. This means that the generated function will have each grapharg as an argument, which is problematic if we want to free the inputs in inductor codegen. See [python function arguments are kept alive for the duration of the function call](https://github.com/pytorch/pytorch/pull/83137#issuecomment-1211320670). ```python # original code def forward(inputs): a, b, c, d, e = inputs inputs.clear() out = a out += b del b # frees memory out += c del c # frees memory out += d del d # frees memory out += e del e # frees memory return out # compiled code: def forward(a, b, c, d, e): # b, c, d, e can't be freed before end of function ``` This isn't a concern when compiling forward because a, b, c, d, e are all from user code, and should be kept alive. But when compiling backwards, a, b, c, d, e may be intermediate results, i.e. activations, that we DO want to clear ASAP to remain on par with eager peak memory. ### Solution We have encountered similar memory problems in AOTAutograd before, where we adopted the boxed calling convention (wrapping to-be-freed objects in a list), added list clearing to inductor codegen, and were careful about holding references to elements in the input list. We need to do something similar, but for inputs from the user program (compiled autograd fx graph in this case). This PR supports lists as graphargs/placeholder nodes. When tracing a list of tensors, we create a node for it, and pre-emptively initialize variable trackers for its elements before they are used in the user program. Subsequent uses of those variables will find hits in the lookup table `input_source_to_var`. 
With the inputs as a list in the graph args, our compiled code can free inputs just like in the eager case. ```python def forward(inputs): # a, b, c, d, e can be freed within the function now ``` Currently, AOT/Inductor flattens list input via [flatten_graph_inputs wrapper](`597f479643/torch/_inductor/compile_fx.py (L1454-L1478)`), which is why this PR's CI can be green. Additional changes are needed to its runtime wrapper, done in the next PR. The next step is to ensure that we are careful in forwarding the list to inductor codegen without holding additional references. Pull Request resolved: https://github.com/pytorch/pytorch/pull/122353 Approved by: https://github.com/jansel ghstack dependencies: #123630, #123674	2024-04-12 10:29:09 +00:00
Thiago Crepaldi	1b5944358e	Ignore logging.Logger.* calls during dynamo export (#123402 ) Follow up for https://github.com/pytorch/pytorch/pull/123368 Pull Request resolved: https://github.com/pytorch/pytorch/pull/123402 Approved by: https://github.com/williamwen42	2024-04-09 18:51:00 +00:00
PyTorch MergeBot	d04957c0c6	Revert "Ignore logging.Logger.* calls during dynamo export (#123402 )" This reverts commit `75933ff523`. Reverted https://github.com/pytorch/pytorch/pull/123402 on behalf of https://github.com/DanilBaibak due to Broken trunk ([comment](https://github.com/pytorch/pytorch/pull/123402#issuecomment-2044236088))	2024-04-09 06:28:12 +00:00
Thiago Crepaldi	75933ff523	Ignore logging.Logger.* calls during dynamo export (#123402 ) Follow up for https://github.com/pytorch/pytorch/pull/123368 Pull Request resolved: https://github.com/pytorch/pytorch/pull/123402 Approved by: https://github.com/williamwen42	2024-04-08 22:50:54 +00:00
Michael Lazos	73e235f0a6	Swap to ID guard for optimizer Variable (#123496 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123496 Approved by: https://github.com/anijain2305	2024-04-08 19:28:25 +00:00
PyTorch MergeBot	3e8d3577be	Revert "Swap to ID guard for optimizer Variable (#123496 )" This reverts commit `26bf05ccac`. Reverted https://github.com/pytorch/pytorch/pull/123496 on behalf of https://github.com/PaliC due to seems to have broken distributed/fsdp/test_fsdp_hybrid_shard.py as per `26bf05ccac` ([comment](https://github.com/pytorch/pytorch/pull/123496#issuecomment-2043251234))	2024-04-08 17:06:05 +00:00
Michael Lazos	26bf05ccac	Swap to ID guard for optimizer Variable (#123496 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123496 Approved by: https://github.com/anijain2305	2024-04-08 05:03:34 +00:00
Will Feng	7b02910163	[Compile FSDP2][2/n] Support streams created outside of compile region (#123487 ) FSDP2 creates CUDA streams outside of compile region in its 1st iteration eager run, and then torch.compile will attempt to record method calls on these streams (e.g. `stream.record_event()`) in >1st iteration compiled run. Before this PR, stream proxy is None which causes "None doesn't have attribute record_event" error when we try to call `record_event()` on it. After this PR, stream proxy has the correct value which makes calling methods on it possible. Pull Request resolved: https://github.com/pytorch/pytorch/pull/123487 Approved by: https://github.com/jansel	2024-04-06 08:42:42 +00:00
Animesh Jain	fb7664d5bf	[dynamo][optimizer][guard-overhead] NOT_NONE guard for param.grad instead of TENSOR_MATCH (#123285 ) For optimizers, we do a DATA_PTR match for parameters. For param.grad, we were doing TENSOR_MATCH, but what we really need to guard on is whether param.grad is None or not. Therefore, I add a new guard called NOT_NONE. This further improves the guard overhead ![image](https://github.com/pytorch/pytorch/assets/13822661/574598ac-ca71-4e5e-9e75-8774577cd58f) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123285 Approved by: https://github.com/mlazos, https://github.com/jansel	2024-04-04 03:52:47 +00:00
Animesh Jain	969bbf8e82	[dynamo][guards] Skip aliasing guards for optimizers (#123044 ) I am ok if people don't want this PR to be merged. For optimizers, we know that the state dict and param_group have the same parameters. So, I think it's ok to skip TENSOR_MUST_ALIAS guards. Similarly for state tensors, all of them are different. Therefore, we can skip the tensor aliasing guards. With this PR, these are the numbers for Megatron which has 394 parameters <img width="290" alt="image" src="https://github.com/pytorch/pytorch/assets/13822661/0ce75dc6-4299-46bb-bf3c-7989ebc7cfc4"> C++ numbers jump a lot for 2 reasons: 1) We are now not doing INCREF/DECREF for a large number of tensors. 2) For python guards, we can expect higher numbers but that requires some more plumbing because the Python tensor guards are all collapsed into one. Pull Request resolved: https://github.com/pytorch/pytorch/pull/123044 Approved by: https://github.com/jansel, https://github.com/mlazos	2024-04-02 08:51:00 +00:00
drisspg	557e7c9c16	Add some type hints to functions and update a few spelling mistakes (#123015 ) # Summary While working on this PR: https://github.com/pytorch/pytorch/pull/121845 I found that these type hints made my ide/ noob experience easier to reason about Pull Request resolved: https://github.com/pytorch/pytorch/pull/123015 Approved by: https://github.com/Skylion007	2024-03-30 21:15:01 +00:00
Simon Fan	1d96791661	[dynamo] Fix list proxy to list element proxy source propagation (#122691 ) Currently, when we create proxies for a list's elements in wrap_fx_proxy_cls, we create them using the same source as the list's, e.g. `LocalSource(inputs)` instead of `GetItemSource(LocalSource(inputs), index=i)`. This results in invalid guards when the tensors it contains become dynamic, and the guard system thinks the list is a tensor: ``` Malformed guard: L['sizes'][0] == L['inputs'].size()[0] Malformed guard: 2 <= L['inputs'].size()[0] Traceback [...] AttributeError: 'list' object has no attribute 'size' ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/122691 Approved by: https://github.com/jansel, https://github.com/anijain2305	2024-03-28 14:40:54 +00:00
Joel Schlosser	07b618e2d4	Graph break cleanly in Dynamo for module parametrization (#121041 ) Fixes #118795 This is a graph breaking partial fix for #120914. We still need -actual- module parametrization tracing support, but at least it doesn't blow up hard now. Background: Module parametrization injects a property as the module parameter attribute that calls a `nn.Module` whose forward takes in a module parameter and returns a reparametrized module parameter. Example: ``` class MyParametrization(nn.Module): def forward(self, X): # This reparametrization just negates the original parameter value return -X m = nn.Linear(...) p = MyParametrization() register_parametrization(m, "weight", p) # Accessing the "weight" attribute will invoke p's forward() on m's original weight and return the output as the new weight. # m.weight here is now an injected property that does the above instead of an actual Parameter. # This property is defined in torch/nn/utils/parametrize.py. m.weight # NB: Parametrization changes the module type (e.g. torch.nn.utils.parametrize.ParametrizedLinear) print(type(m)) ``` Problem 1: Dynamo has special tracing rules for things in `torch.nn`. Parametrizing a module changes the type of the module and the parametrized attribute, so now these rules wrongly affect tracing here. To fix this: * For parametrized modules, call `convert_to_unspecialized()` to restart analysis where Dynamo starts inlining the module. Problem 2: The issue seen in #118795 is that Dynamo will see a dynamically constructed tensor when `m.weight` is called and introduce that to its `tensor_weakref_to_sizes_strides` cache during fake-ification. This tensor is also made to be a graph input, since it's a module parameter. When guards are created for this module parameter input, the logic calls `m.weight` again and tries to look the result up in the cache, but this is a different tensor now, giving the `KeyError` symptom. 
To fix this: * Replace Dynamo's `tensor_weakref_to_sizes_strides` cache with a `input_source_to_sizes_strides` cache. * This cache was originally introduced in #100128. Pull Request resolved: https://github.com/pytorch/pytorch/pull/121041 Approved by: https://github.com/anijain2305	2024-03-26 23:44:51 +00:00
Yifu Wang	36188360dd	[dynamo] support torch.distributed.{group.WORLD, GroupMember.WORLD, distributed_c10d._get_default_group} (#120560 ) Fixes https://github.com/pytorch/pytorch/issues/120431 Pull Request resolved: https://github.com/pytorch/pytorch/pull/120560 Approved by: https://github.com/wconstab	2024-03-24 11:13:05 +00:00
Guilherme Leobas	4eaa000acc	Teach dynamo about torch.func.jvp (#119926 ) List of changes: - Replace JVP_NESTING by torch._C._functorch.maybe_current_level() - Remove all increment nesting functions from wrap_fx_proxy_cls - fwAD.make_dual receives the dual_level as keyword argument - Add jvp_increment_nesting, set_fwd_grad_enabled and dual_level context managers to dynamo Pull Request resolved: https://github.com/pytorch/pytorch/pull/119926 Approved by: https://github.com/zou3519	2024-03-22 20:25:47 +00:00
Joel Schlosser	cd6bfc7965	Proper view support for jagged layout NestedTensor (#113279 ) This PR: * Introduces an ATen op for creating true jagged views from a dense values buffer * `_nested_view_from_jagged(values, offsets, lengths, ragged_idx, dummy)` * This op is implemented on the Python side using torch.library so we can return a subclass instance * `jagged_from_list()` now uses this instead of the old autograd.Function `NestedViewFromBuffer` * The latter op is used for non-contiguous JTs returned via `torch.nested.narrow()` * `dummy` is an awful hack to ensure that `NestedTensor.__torch_dispatch__()` is invoked for our view * Introduces an ATen op for accessing the `values` component of an NT via a view * `_nested_get_values(nt)` * Removes the autograd.Functions `ViewNestedFromBuffer` and `ViewBufferFromNested` in favor of `nested_from_values_offsets()` / `nested_from_values_offsets_lengths()` and `nt.values()`, respectively. * Changes test code to prefer `as_nested_tensor()` over `jagged_from_list()` directly * Similarly, avoid `buffer_from_jagged()`, preferring `values()` * Depends on general subclass view fake-ification on the PT2 side (handled solely in previous PRs in the stack) With these changes, the semantics of jagged layout NTs are such that they are considered a true view of the underlying `values` buffer. This means views of jagged NTs are views of the underlying buffer as well, simplifying some handling. Differential Revision: [D54269922](https://our.internmc.facebook.com/intern/diff/D54269922) Co-authored-by: voznesenskym <voznesenskym@gmail.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/113279 Approved by: https://github.com/ezyang	2024-03-22 02:12:36 +00:00
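
Note on the boxed-inputs entry (#122353) in the log above: the lifetime argument made in that commit message can be reproduced in plain Python without torch. The sketch below is only an illustration under stated assumptions, not Dynamo or Inductor code. `Activation`, `forward_unboxed`, and `forward_boxed` are hypothetical names, and the immediate-free behaviour it checks relies on CPython's reference counting rather than on any language guarantee.

```python
import weakref


class Activation:
    """Hypothetical stand-in for a large tensor; used only to observe lifetime."""

    def __init__(self, value):
        self.value = value


def forward_unboxed(a, b):
    # Flattened signature: the caller's container still references `b`,
    # so `del b` only drops this frame's local name.
    probe = weakref.ref(b)
    total = a.value + b.value
    del b
    freed = probe() is None  # False while the caller's list is alive
    return total, freed


def forward_boxed(inputs):
    # Boxed convention: unpack, then clear the caller's list so this frame
    # owns the only remaining references to the elements.
    a, b = inputs
    inputs.clear()
    probe = weakref.ref(b)
    total = a.value + b.value
    del b  # last reference gone; CPython's refcounting frees it here
    freed = probe() is None
    return total, freed


if __name__ == "__main__":
    args = [Activation(1), Activation(2)]
    print(forward_unboxed(*args))  # second element reports whether b was freed
    print(forward_boxed(args))
    print(len(args))  # the callee emptied the caller's list
```

The boxed variant only helps when the caller keeps no other reference to the list elements, which is why the commit message stresses being careful about holding references to elements in the input list.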


427 Commits