pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
ekamiti	32d422f335	Make adding buffers more like adding parameters (#104069 ) Add similar semantics for creating a buffer object similar to creating a parameter. This is done by introducing a new `Buffer` class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same as the `register_buffer` method has not been changed. The `persistent` parameter in the `Buffer` type is to indicate whether a buffer object should be persistent or not. Other non-test changes have to do with getting the new `Buffer` type recognized by inductor and dynamo. Remaining changes are test changes to make sure that the `Buffer` type can be used as a drop in replacement for `register_buffer` as it just leads to `register_buffer` being called. The addition of this new functionality still allows for normal tensors to be used as buffers so these changes are intended to be backwards compatible. Fixes #35735 Pull Request resolved: https://github.com/pytorch/pytorch/pull/104069 Approved by: https://github.com/mikaylagawarecki	2023-07-17 17:59:05 +00:00
Animesh Jain	95232c216b	[dynamo] Bugfix for enums (#105306 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105306 Approved by: https://github.com/yanboliang	2023-07-17 16:39:16 +00:00
lezcano	b190f46514	Allow NumPy code in torch.compile to run on cuda (#104699 ) This can be achieved by doing `torch.set_default_device("cuda")`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104699 Approved by: https://github.com/ezyang, https://github.com/larryliu0820	2023-07-06 18:43:09 +00:00
Animesh Jain	8c191d8eef	[dynamo][ac] Reland #104397 - Remove disable monkeypatching of utils.checkpoint (#104665 ) NO CHANGE from before. The ancestor diff was reverted, so this diff got reverted as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104665 Approved by: https://github.com/wconstab	2023-07-06 00:48:02 +00:00
Animesh Jain	4005152b92	[dynamo] Organize higherorderops variable trackers (#104565 ) The main change is moving the higherorderops from torch.py to higher_order_ops.py. And creating smaller subclasses of HigherOrderOp for cond, map etc Pull Request resolved: https://github.com/pytorch/pytorch/pull/104565 Approved by: https://github.com/zou3519	2023-07-05 22:19:26 +00:00
PyTorch MergeBot	40f53912cf	Revert "[dynamo][ac] Remove disable monkeypatching of utils.checkpoint (#104397 )" This reverts commit `537a6c0651`. Reverted https://github.com/pytorch/pytorch/pull/104397 on behalf of https://github.com/huydhn due to This has been reverted internally by D47216591, so I need to also revert it on OSS to keep them in sync ([comment](https://github.com/pytorch/pytorch/pull/104397#issuecomment-1621086360))	2023-07-05 06:11:08 +00:00
Animesh Jain	537a6c0651	[dynamo][ac] Remove disable monkeypatching of utils.checkpoint (#104397 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/104397 Approved by: https://github.com/wconstab	2023-06-30 02:27:06 +00:00
Animesh Jain	2bb83cd45c	[dynamo][ac] Minor refactor for better code organization and a bugfix (#104276 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/104276 Approved by: https://github.com/zou3519	2023-06-29 12:57:59 +00:00
cdzhan	c06bb82ba1	fix specialization when you pass an unspec int into slicing on a Python list. (#104142 ) Fixes #103545 Pull Request resolved: https://github.com/pytorch/pytorch/pull/104142 Approved by: https://github.com/malfet, https://github.com/jansel	2023-06-28 13:13:07 +00:00
Animesh Jain	75dab587ef	[dynamo] FSDP + AC + torch.compile (#103953 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/103953 Approved by: https://github.com/wanchaol	2023-06-24 01:40:56 +00:00
Vinay Kumar Burugu	3c28431a0f	Feature: Dump compile_times when TORCH_LOGS=dynamo is enabled. (#104057 ) Partial implementation of https://github.com/pytorch/pytorch/issues/103173. This PR only implements the feature to dump compile_times at the end of the session using the atexit handler. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104057 Approved by: https://github.com/ezyang	2023-06-23 05:25:09 +00:00
Thiago Crepaldi	6f655d4195	Add symbolic tracing support to torch._dynamo.export (fake input + weights) (#100017 ) Fixes #95900 Using the following repro as guide:
```python
import torch
import torch._dynamo
from torch._subclasses import fake_tensor
from torch.fx.experimental.symbolic_shapes import ShapeEnv
from torch._dynamo.output_graph import config

class Model(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(2, 2)
        self.linear2 = torch.nn.Linear(2, 2)

    def forward(self, x):
        out = self.linear(x)
        out = self.linear2(out)
        return out

fake_mode = fake_tensor.FakeTensorMode(
    allow_non_fake_inputs=False,
    allow_fallback_kernels=True,
    shape_env=ShapeEnv(
        allow_scalar_outputs=config.capture_scalar_outputs,
        allow_dynamic_output_shape_ops=config.capture_dynamic_output_shape_ops,
        frame_id=0
    ),
)

# Fakefying input/model before calling torch._dynamo.export
with fake_mode:
    fake_x = torch.rand(5, 2, 2)
    model = Model()

# Calling torch._dynamo.export without active fake mode
graph_module, guards = torch._dynamo.export(
    model, fake_x, aten_graph=True, fake_mode=fake_mode
)
graph_module.print_readable()
graph_module.graph.print_tabular()
```
Summary of changes:
* Plumb fake_mode through torch.export API. When specified, it replaces the creation of a new FakeTensorMode at InstructionTranslator on behalf of OutputGraph. Hacks FakeTensor.__new__ to prevent a torch.tensor._make_subclass call for inputs that are already fakefied by user. This probably needs to be fixed in a nicer way. Any idea?
* Removed a few asserts that didn't want faked tensors coming from user script
* Added torch._subclasses.fake_tensor.FakeTensor to type list on a few assert checks to allow fake inputs

The changes above allowed symbolic tracing with both static and dynamic shapes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100017 Approved by: https://github.com/ezyang	2023-06-15 21:28:10 +00:00
Mengwei Liu	96c23fe212	[dynamo][numpy] Add support for builtin functions (#103457 ) In order to be able to run stuff like:
```
def f(x):
    a = x.numpy()
    return a + a
```
This PR adds a branch in `BuiltinVariable` to handle `NumpyNdarrayVariable` case. Pull Request resolved: https://github.com/pytorch/pytorch/pull/103457 Approved by: https://github.com/ezyang	2023-06-15 09:18:45 +00:00
Animesh Jain	16c2090b2d	[benchmark][compile] Limit number of bounding boxes to 5 (#103413 ) Depends on https://github.com/pytorch/benchmark/pull/1729 Pull Request resolved: https://github.com/pytorch/pytorch/pull/103413 Approved by: https://github.com/ezyang	2023-06-15 01:06:40 +00:00
Edward Z. Yang	ddf4cd69ec	Delete ifdyn and ifunspec combinators (#103596 ) Replaced with expect tests for ease of updating. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/103596 Approved by: https://github.com/voznesenskym	2023-06-15 00:14:17 +00:00
Animesh Jain	bd0ed940b7	[activation checkpoint][dynamo] Wrap AC into Tag based higher order op (#102935 ) These are the numbers with this PR ![image](https://github.com/pytorch/pytorch/assets/13822661/63e991d5-80e2-4e94-8e4b-243621c3990e) There are 3 main followups * A naive partitioner gives better memory footprint than min-cut partitioner here. Currently, we are using min-cut partitioner. Waiting for @Chillee to discuss this further to either modify min-cut or add a naive partitioner. * aot_eager is < 1x memory footprint. This is true even for non AC models. This could hide some inefficiency somewhere. * inductor is giving very different memory numbers between AOT-traced-AC (duplicate early) vs this implementation. This leads to some inefficiency in inductor that we need to resolve. Pull Request resolved: https://github.com/pytorch/pytorch/pull/102935 Approved by: https://github.com/jansel	2023-06-14 20:15:43 +00:00
Edward Z. Yang	8b015c166c	Don't test dynamic_shapes in tensor_always_has_static_shape (#103517 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/103517 Approved by: https://github.com/anijain2305	2023-06-14 07:04:17 +00:00
Mengwei Liu	2eac8bd2b8	[dynamo][numpy] Support ndarray methods (#97537 ) This PR adds universal support for ndarray methods. After #100839 each `NumpyNdarrayVariable` should wrap a `torch.Tensor`. This PR adds a `numpy_method_wrapper` which converts the `torch.Tensor` to `torch_np.ndarray` and then calls the numpy ndarray method. Then we also try to return a `torch.Tensor` (returned as-is if the value is not ndarray-like). Pull Request resolved: https://github.com/pytorch/pytorch/pull/97537 Approved by: https://github.com/ezyang	2023-06-12 17:21:31 +00:00
Edward Z. Yang	12cd1dbba0	Handle recursive tuple in clone_inputs (#102979 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/102979 Approved by: https://github.com/wconstab	2023-06-05 22:11:48 +00:00
Michael Lazos	c46af25bb3	Initialize optimizer in dynamo to avoid graph break and tracing slowness (#102640 ) On calls to `_init_group` rather than tracing through it, extract python values from the arguments, and call the initialization. This avoids having to trace this function which is very slow with large parameters, and also avoids graph breaking on it. This is sound in this case because the state is only initialized once in the eager case. Guards on the state and params are generated explicitly rather than via tracing the initialization. Caveats: `_init_group` also gathers various state tensors into lists via mutating list arguments to pass to the functional optimizer implementation. These state tensors exist on the optimizer itself, but we don't know exactly how the gathering is done and which tensors correspond to which attributes of the optimizer module (each optimizer has different states). To rectify this, we keep weak_ptrs to all of the tensors collected in the lists in globals (similar to how parameter keys are stored for dictionaries). These pointers are guaranteed to be alive as long as the optimizer object is alive if the internal state is not interfered with and they are guarded with weakref guards Pull Request resolved: https://github.com/pytorch/pytorch/pull/102640 Approved by: https://github.com/jansel	2023-06-03 15:49:51 +00:00
Mengwei Liu	c304fddf68	[dynamo][numpy] Support graph break for numpy ndarray (#100839 ) Issue: #93684 In previous PRs #95849 #99560 we redirect `numpy.`, `<tensor>.numpy()` calls to `torch_np.` methods and attributes, by creating `NumpyNdarrayVariable` for those calls. We need to handle `NumpyNdarrayVariable` when a graph break happens. This PR did 2 things: 1. In `codegen.py` we made sure we can reconstruct the value wrapped by `NumpyNdarrayVariable`, to be `torch_np.ndarray` in the stack whenever we recompile the subgraph. 2. In `builder.py` we can wrap the value to be `NumpyNdarrayVariable` and save it as graph input. ----- Starting from commit 6: ## A new design for supporting numpy in dynamo In short the core concept doesn't change: we still convert `numpy` API calls to `torch_np` API calls. However, instead of wrapping a `torch_np.ndarray` in `NumpyNdarrayVariable`, the new design wraps a `torch.Tensor`. The reason for this change is that we need to keep `torch.Tensor` everywhere in the captured graph, so that it works well with the backend of dynamo. See discussions in https://github.com/Quansight-Labs/numpy_pytorch_interop/issues/142 for details.
### Flow
This is an example showing how we think about dynamo working on a simple function:
```python
def f(x: torch.Tensor, y: torch.Tensor):
    a, b = x.numpy(), y.numpy()
    c = np.add(a, b)
    return torch.from_numpy(c)
```
```
              +------------+             +------------+
 torch.Tensor |            |numpy.ndarray|            |
 -------------- .numpy()   --------------|            |
              |            |             |            |             +------------------+
              +------------+             | numpy.add  |numpy.ndarray|                  |torch.Tensor
              +------------+             |            --------------| torch.from_numpy --------------
 torch.Tensor |            |numpy.ndarray|            |             |                  |
 -------------- .numpy()   --------------|            |             +------------------+
              |            |             |            |
              +------------+             +------------+

              +------------+             +----------------+
 torch.Tensor |            |torch.Tensor |                |
 -------------- .detach()  --------------|                |
              |            |             |                |                +----------------+            +------------+
              +------------+             |                |torch_np.ndarray|                |torch.Tensor|            |torch.Tensor
                                         |  torch_np.add  -----------------| util.to_tensor -------------| .detach()  --------------
              +------------+             |                |                |                |            |            |
 torch.Tensor |            |torch.Tensor |                |                +----------------+            +------------+
 -------------- .detach()  --------------|                |
              |            |             |                |
              +------------+           | +----------------+                                     |
                                       |                wrapper on torch_np.add                 |
                                       +--------------------------------------------------------+
```
### Approach
`torch_np` APIs can take both `torch_np.ndarray` as well as `torch.Tensor`. What we need to do is to have a wrapper for these APIs to convert the return value back to `torch.Tensor`. This way only the wrapper shows up in the captured graph, with `torch.Tensor`s as input and `torch.Tensor` as output. If we have a graph break or we've traced to the end of the program, we need to inspect all the `NumpyNdarrayVariable` in the stack and convert them back to `numpy.ndarray`, to make sure the compiled version still behaves the same as the eager version.
### Examples
Here's an example of the graph generated:
```python
def fn(x: np.ndarray, y: np.ndarray):
    a = x.real
    b = y.real
    torch._dynamo.graph_break()
    return np.add(a, 1), np.add(b, 1)
```
Graph generated:
```
[2023-05-16 10:31:48,737] torch._dynamo.output_graph.__graph: [DEBUG] TRACED GRAPH __compiled_fn_0 <eval_with_key>.0
opcode         name            target                                                      args                    kwargs
-------------  --------------  ----------------------------------------------------------  ----------------------  --------
placeholder    l_x_            L_x_                                                        ()                      {}
placeholder    l_y_            L_y_                                                        ()                      {}
call_function  from_numpy      <built-in method from_numpy of type object at 0x12b1fdc80>  (l_x_,)                 {}
call_function  from_numpy_1    <built-in method from_numpy of type object at 0x12b1fdc80>  (l_y_,)                 {}
call_function  attr_wrapper    <function attr_wrapper at 0x12e8693a0>                      (from_numpy, 'real')    {}
call_function  attr_wrapper_1  <function attr_wrapper at 0x12e8693a0>                      (from_numpy_1, 'real')  {}
output         output          output                                                      ((),)                   {}

[2023-05-16 10:31:48,908] torch._dynamo.output_graph.__graph: [DEBUG] TRACED GRAPH __compiled_fn_2 <eval_with_key>.1
opcode         name           target                                                      args                             kwargs
-------------  -------------  ----------------------------------------------------------  -------------------------------  --------
placeholder    l_a_           L_a_                                                        ()                               {}
placeholder    l_b_           L_b_                                                        ()                               {}
call_function  from_numpy     <built-in method from_numpy of type object at 0x12b1fdc80>  (l_a_,)                          {}
call_function  from_numpy_1   <built-in method from_numpy of type object at 0x12b1fdc80>  (l_b_,)                          {}
call_function  wrapped_add    <Wrapped function <original add>>                           (from_numpy, 1)                  {}
call_function  wrapped_add_1  <Wrapped function <original add>>                           (from_numpy_1, 1)                {}
output         output         output                                                      ((wrapped_add, wrapped_add_1),)  {}
```
### Changes
* `codegen.py`: reconstruct `numpy.ndarray` from `NumpyNdarrayVariable` by adding bytecode to call `utils.to_numpy_helper()`.
* `output_graph.py`: getting rid of legacy code that does exactly what `codegen.py` does, but which only handled the return case and not the graph break case.
* `utils.py`: added helpers to convert `numpy.ndarray` to `torch.Tensor` and vice versa. Also added a wrapper class that takes in a function. In `__call__` it calls the function and converts its output to `torch.Tensor` (or a list of them).
* `builder.py`: add method to wrap `numpy.ndarray` graph inputs into `NumpyNdarrayVariable`, by calling `torch.numpy` in the proxy.
* `misc.py`: `numpy` API calls go into `NumpyVariable` and we find the function with the same name in the `torch_np` module, then wrap it with the wrapper defined in `utils.py`.
* `tensor.py`, `torch.py`: proxy `tensor.numpy()` to be `torch.detach()` but wrap it with `NumpyNdarrayVariable`. Similarly, `torch.from_numpy()` -> `torch.detach()` but wrap it with `TensorVariable`. In `NumpyNdarrayVariable`, do similar `torch_np.ndarray` to `torch.Tensor` wrapping for attributes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100839 Approved by: https://github.com/ezyang	2023-06-03 00:54:25 +00:00
Edward Z. Yang	90b1b17c9f	Fix string concatenation with non-string (#102728 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/102728 Approved by: https://github.com/Skylion007	2023-06-01 20:02:03 +00:00
Animesh Jain	2fa1b563da	[dynamo] Activation checkpoint higher order ops - Reland 101028 (#101790 ) https://github.com/pytorch/pytorch/pull/101028 was reverted due to internal breakage. Relanding. Pull Request resolved: https://github.com/pytorch/pytorch/pull/101790 Approved by: https://github.com/zou3519	2023-05-18 19:09:14 +00:00
Yanbo Liang	7052fb37bd	[Dynamo] Improve handling UnspecializedNNModuleVariable side effect (#101141 ) Fixes #101102 Pull Request resolved: https://github.com/pytorch/pytorch/pull/101141 Approved by: https://github.com/jansel	2023-05-16 03:57:13 +00:00
PyTorch MergeBot	d0db7d624d	Revert "[dynamo] Activation checkpointing as higher order op (#101028 )" This reverts commit `de15e740a1`. Reverted https://github.com/pytorch/pytorch/pull/101028 on behalf of https://github.com/jeanschmidt due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/101028#issuecomment-1548280970))	2023-05-15 17:47:08 +00:00
Michael Lazos	d75f93603a	Flatten exceptions in dynamo (#100779 ) Fixes https://github.com/pytorch/pytorch/issues/93571 [before and after](https://gist.github.com/mlazos/256b0e8f0f98495752a22b960e9f4fcb) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100779 Approved by: https://github.com/ezyang	2023-05-13 00:58:57 +00:00
Animesh Jain	de15e740a1	[dynamo] Activation checkpointing as higher order op (#101028 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/101028 Approved by: https://github.com/voznesenskym, https://github.com/zou3519	2023-05-12 03:17:41 +00:00
Jerry Zhang	c3f3cb5b0f	[quant][pt2e] Support conv bn fusion in convert step for QAT flow (#100442 ) Summary: This PR adds support for folding bn weights into conv for the QAT flow. This is equivalent to the QAT branch of `from_float` in the eager mode quantized conv module: https://github.com/pytorch/pytorch/blob/main/torch/ao/nn/quantized/modules/conv.py#L223 Items that need followup: * there are some workarounds I added because quantize_per_tensor is using float/int args and dynamo does not support these args; need to fix after we change the quantized model representation and also change these args to Tensor Test Plan: buck2 test @//mode/opt //caffe2/test:quantization_pt2e -- --exact 'caffe2/test:quantization_pt2e - test_convert_qat_conv_bn_fusion (quantization.pt2e.test_quantize_pt2e.TestQuantizePT2E)' Reviewed By: andrewor14 Differential Revision: D45344281 Pull Request resolved: https://github.com/pytorch/pytorch/pull/100442 Approved by: https://github.com/kimishpatel	2023-05-09 19:43:51 +00:00
Bin Bao	86ddfc7f68	[inductor] Move cpp wrapper trigger logic to inner_compile (#100611 ) Summary: This enables cpp wrapper for backward as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100611 Approved by: https://github.com/jansel	2023-05-08 15:24:02 +00:00
Animesh Jain	3f025c607c	summarize graph breaks (#100696 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100696 Approved by: https://github.com/yanboliang	2023-05-05 22:27:47 +00:00
Edward Z. Yang	ce1ad1c143	Add load_storage (#100519 ) This adds a new operator debugprims::load_storage which does the unusual thing of loading a tensor from disk (via ContentStoreReader). This will be used in a later PR to implement delta debugging in the minifier, even when the repro is too big to fit into memory. The way it works is that you specify a name of the tensor you want to load, as well as enough metadata to reconstruct the tensor, if the store isn't available. If there is an active content store, we read and return the tensor from that store; otherwise we use `rand_strided` to create it. I needed some infra improvements to do this: * `custom_op` now supports factory functions. Factory functions have to be registered specially via `impl_factory` * I modified `clone_input` to also support dtype conversion, which I use to change the dtype of a loaded tensor if necessary. * ContentStore needs to work with a device argument, so we torch.load directly to the correct device. This is for fake tensor support. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100519 Approved by: https://github.com/zou3519, https://github.com/anijain2305	2023-05-05 05:25:03 +00:00
Animesh Jain	8994d9e610	[dynamo] Hide guard_fail_hook behind a flag to improve cache lookup time (+10% DebertaV2) (#100590 ) For TorchDynamo eager backend, DebertaV2 speedup improves from 0.77x to 0.87x. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100590 Approved by: https://github.com/voznesenskym, https://github.com/wconstab	2023-05-04 18:52:21 +00:00
Edward Z. Yang	c7e9f40653	Misc accuracy improvements on minifier (#100447 ) The changes: * Add config knob `same_two_models_use_fp64` for toggling whether or not to use fp64 * Add a test showing that RMSE is superior to atol/rtol * Add `--strict-accuracy` options, which allows for testing against integral/boolean accuracy. Regular accuracy by default now ONLY. There's a test which exercises this, it's a little delicate but I had trouble thinking of a good test otherwise. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100447 Approved by: https://github.com/voznesenskym	2023-05-04 02:51:26 +00:00
kshitij12345	8b64dee5d2	[fix] torch_compile_debug don't log with 0 (#100462 ) Fixes https://github.com/pytorch/pytorch/issues/99906 Tested locally. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100462 Approved by: https://github.com/mlazos	2023-05-03 08:23:09 +00:00
Richard Zou	984a2397ba	Refactor OutputGraph (#99987 ) This PR splits OutputGraph into two classes: - SubgraphTracer (handles FX-tracing) - OutputGraph (handles Dynamo-specific output graph logic, like tracking graph inputs, compiling the graph, and executing it). The motivation behind this is in the next PR up in the stack. TL;DR is: in order to do higher-order operators, we need nested SubgraphTracer, one for each level of nesting of the higher-order operators. I'm happy to flatten the stack into a single PR, but this separation made it easier for me to test. Lmk if you want the stack flattened. Test Plan: - existing tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/99987 Approved by: https://github.com/anijain2305, https://github.com/voznesenskym	2023-05-02 17:11:02 +00:00
Michael Voznesensky	aafc6ce8cc	Produce constant variables in cases where a SymNode is created with a constant (#100144 ) ` AOT_DYNAMIC_SHAPES=1 TORCHDYNAMO_DYNAMIC_SHAPES=1 benchmarks/dynamo/huggingface.py --performance --training --amp --backend eager --disable-cudagraphs --device cuda --only AllenaiLongformerBase --explain` Looks promising! Goes from: Dynamo produced 173 graphs covering 2760 ops with 160 graph breaks (14 unique) To: Dynamo produced 6 graphs covering 2298 ops with 15 graph breaks (7 unique) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100144 Approved by: https://github.com/ezyang	2023-05-01 21:32:11 +00:00
Edward Z. Yang	2d8deffc1e	Refactor repro/minifier into CLI; add analyze (#100226 ) This is a two part PR; I can split it if you really want me to. The first part is a refactor of the after aot repro/minifier scripts to come with a command line interface. I maintain exact BC with the previous interface (so, e.g., you still get a repro.py and a run_minifier.py that do the same thing as before), but each of these scripts also take command line arguments now which you can use to customize what actually happens. Check `run_repro` for full documentation on the arguments. The second part of this is an implementation of `analyze` subcommand on the new CLI for any repro. <img width="1277" alt="image" src="https://user-images.githubusercontent.com/13564/235045677-8545aab7-5e83-4813-bbec-47783dc60122.png"> This facility is oriented towards accuracy debugging. It does several things: 1. It will run your model twice and check for nondeterminism in inductor/float64, even on intermediate inputs (our benchmarking nondeterminism test only checks for nondeterminism on the final output). This makes localizing which operator is nondeterministic easy. 2. It will run your compiled model side-by-side with eager and float64 variants, and then report when things diverge too far from RMSE delta from float64. Importantly, it does all this without requiring every intermediate to be held in memory (which will cause an OOM on large repros, such as the one I tested this on.) 
Some other minor improvements:
* MinifierTestBase now has an easy to comment out spot that you can use to retain the temporary directory; good for debugging
* We print "running minifier" and "running repro" in MinifierTestBase to make it easier to orient where logs are coming from
* same takes a `log_error` optional argument which you can use to reroute the error logs when things mismatch
* counters["inductor"]["intermediate_hooks"] tracks the number of intermediate hooks we've codegen'ed; good for populating the tqdm interface
* torch.fx.interpreter gets an official `boxed_run` interface which uses the boxed arguments calling convention and doesn't retain inputs unnecessarily long
* torch.utils._content_store gets compute_tensor_metadata/read_tensor_metadata helper functions for computing tensor information without serializing it

Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100226 Approved by: https://github.com/bertmaher, https://github.com/bdhirsh, https://github.com/anijain2305	2023-05-01 11:12:38 +00:00
PyTorch MergeBot	89c43f4108	Revert "Produce constant variables in cases where a SymNode is created with a constant (#100144 )" This reverts commit `d7bdfd3454`. Reverted https://github.com/pytorch/pytorch/pull/100144 on behalf of https://github.com/ezyang due to ci failure is real ([comment](https://github.com/pytorch/pytorch/pull/100144#issuecomment-1529587039))	2023-05-01 11:10:48 +00:00
Michael Voznesensky	d7bdfd3454	Produce constant variables in cases where a SymNode is created with a constant (#100144 ) ` AOT_DYNAMIC_SHAPES=1 TORCHDYNAMO_DYNAMIC_SHAPES=1 benchmarks/dynamo/huggingface.py --performance --training --amp --backend eager --disable-cudagraphs --device cuda --only AllenaiLongformerBase --explain` Looks promising! Goes from: Dynamo produced 173 graphs covering 2760 ops with 160 graph breaks (14 unique) To: Dynamo produced 6 graphs covering 2298 ops with 15 graph breaks (7 unique) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100144 Approved by: https://github.com/ezyang	2023-04-30 17:13:57 +00:00
Animesh Jain	03806eddbf	[dynamo] Compile torchvision augmentations (#100292 ) Resolves https://github.com/pytorch/pytorch/issues/100112 Pull Request resolved: https://github.com/pytorch/pytorch/pull/100292 Approved by: https://github.com/jansel	2023-04-29 02:59:41 +00:00
Larry Liu	f5853342ea	[dynamo][numpy] Handle return value being numpy ndarray (#99560 ) On top of #95849 this PR is trying to handle the special case when dealing with numpy. Consider the following example:
```
def f(x: torch.Tensor) -> np.ndarray:
    a = x.numpy()
    return a.T
```
In the previous PR this would error out because we translate `a.T` to be a method call on `torch_np.ndarray.T` which is also a `torch_np.ndarray`. This PR handles this case, by conditionally converting a `torch_np.ndarray` to `np.ndarray` before returning, to match the original behavior. The compiled version will be:
```
def f(x):
    ___tmp_0 = __compiled_fn_0(x)
    if isinstance(___tmp_0, torch_np.ndarray):
        return ___tmp_0.tensor.numpy()
    else:
        return ___tmp_0
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99560 Approved by: https://github.com/jansel, https://github.com/yanboliang	2023-04-27 16:18:35 +00:00
Larry Liu	687afeb686	[dynamo][numpy] Add NumpyTensorVariable to translate ndarray attribute calls to tensor attributes (#95849 ) Issue: #93684 # Problem Reduce graph breaks when dynamo compiles python functions containing numpy functions and ndarray operations. # Design (as I know it) * Use torch_np.ndarray(a wrapper of tensor) to back a `VariableTracker`: `NumpyTensorVariable`. * Translate all attributes and methods calls, on ndarray, to torch_np.ndarray equivalent. This PR adds `NumpyTensorVariable` and supports: 1. tensor to ndarray, ndarray to tensor 2. numpy functions such as numpy.meshgrid() 3. ndarray attributes such as `itemsize`, `stride` Next PR will handle returning `np.ndarray` and add support for ndarray methods Pull Request resolved: https://github.com/pytorch/pytorch/pull/95849 Approved by: https://github.com/ezyang	2023-04-27 16:18:35 +00:00
Animesh Jain	3dcc7b396c	[easy] iterate dict with sorted keys for accuracy checking (#99793 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99793 Approved by: https://github.com/jansel	2023-04-24 21:26:35 +00:00
Edward Z. Yang	f602b3a6ae	Preserve mark_dynamic when cloning inputs (#99617 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99617 Approved by: https://github.com/ngimel, https://github.com/voznesenskym, https://github.com/anijain2305	2023-04-22 19:46:31 +00:00
Michael Voznesensky	0ac0d9d224	Pass locals to enum_repr to correctly make the guard str for enums (#99680 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99680 Approved by: https://github.com/jansel	2023-04-21 07:14:49 +00:00
Yanbo Liang	05809c7d3b	[Dynamo] No graph break for explicit calling Conv{1/2/3}d.forward & ConvTranspose{1/2/3}d.forward (#99015 ) Before this PR, if users call ```Conv2d(x)```, dynamo handles it well (no graph break) and puts a ```call_module``` op in the FX graph. However, if users explicitly call ```Conv2d.forward(x)``` in another ```forward``` function, inlining would fail (causing a graph break). This PR fixes the issue by translating the explicit ```Conv2d.forward(x)``` to ```Conv2d(x)```. Pull Request resolved: https://github.com/pytorch/pytorch/pull/99015 Approved by: https://github.com/jansel, https://github.com/wconstab	2023-04-15 08:04:13 +00:00
Michael Voznesensky	10fbdcf72c	Re-PR of 90269 - Force all nn_module associated tensors to be static (#99108 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99108 Approved by: https://github.com/ezyang	2023-04-14 05:53:48 +00:00
Angela Yi	1d077f28ed	[export] Constraints API (#98433 ) Wrapper for users to insert constraints into model code. The constraints will not be maintained in the graph after tracing through make_fx so retracing with dynamo/make_fx will not work. This will be supported after torch._assert support is implemented. Then we can convert the constrain_range calls to torch._asserts. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98433 Approved by: https://github.com/avikchaudhuri, https://github.com/tugsbayasgalan	2023-04-13 21:20:10 +00:00
PyTorch MergeBot	ab761605ae	Revert "[export] Constraints API (#98433 )" This reverts commit `1510eb4072`. Reverted https://github.com/pytorch/pytorch/pull/98433 on behalf of https://github.com/izaitsevfb due to Breaks internal tests, asked by author to revert	2023-04-12 23:37:19 +00:00
PyTorch MergeBot	629377ea8b	Revert "Replace _dynamo.config with an object instead of module (#96455 )" This reverts commit `420104a886`. Reverted https://github.com/pytorch/pytorch/pull/96455 on behalf of https://github.com/jansel due to BC breaking, was landed prematurely	2023-04-12 15:06:14 +00:00


133 Commits