pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Jerry Zhang	c3f3cb5b0f	[quant][pt2e] Support conv bn fusion in convert step for QAT flow (#100442 ) Summary: This PR adds support for folding bn weights into conv for QAT flow, this is equivalent to the QAT branch of `from_float` in eager mode quantized conv module: https://github.com/pytorch/pytorch/blob/main/torch/ao/nn/quantized/modules/conv.py#L223 Items that needs followup: * there are some workaround I did because quantize_per_tensor is using float/int args and dynamo does not support these args, need to fix after we change the quantized model representation and also change these args to Tensor Test Plan: buck2 test @//mode/opt //caffe2/test:quantization_pt2e -- --exact 'caffe2/test:quantization_pt2e - test_convert_qat_conv_bn_fusion (quantization.pt2e.test_quantize_pt2e.TestQuantizePT2E)' Reviewed By: andrewor14 Differential Revision: D45344281 Pull Request resolved: https://github.com/pytorch/pytorch/pull/100442 Approved by: https://github.com/kimishpatel	2023-05-09 19:43:51 +00:00
Bin Bao	86ddfc7f68	[inductor] Move cpp wrapper trigger logic to inner_compile (#100611 ) Summary: This enables cpp wrapper for backward as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100611 Approved by: https://github.com/jansel	2023-05-08 15:24:02 +00:00
Animesh Jain	3f025c607c	summarize graph breaks (#100696 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100696 Approved by: https://github.com/yanboliang	2023-05-05 22:27:47 +00:00
Edward Z. Yang	ce1ad1c143	Add load_storage (#100519 ) This adds a new operator debugprims::load_storage which does the unusual thing of loading a tensor from disk (via ContentStoreReader). This will be used in a later PR to implement delta debugging in the minifier, even when the repro is too big to fit into memory. The way it works is that you specify a name of the tensor you want to load, as well as enough metadata to reconstruct the tensor, if the store isn't available. If there is an active content store, we read and return the tensor from that store; otherwise we use `rand_strided` to create it. I needed some infra improvements to do this: * `custom_op` now supports factory functions. Factory functions have to be registered specially via `impl_factory` * I modified `clone_input` to also support dtype conversion, which I use to change the dtype of a loaded tensor if necessary. * ContentStore needs to work with a device argument, so we torch.load directly to the correct device. This is for fake tensor support. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100519 Approved by: https://github.com/zou3519, https://github.com/anijain2305	2023-05-05 05:25:03 +00:00
Animesh Jain	8994d9e610	[dynamo] Hide guard_fail_hook behind a flag to improve cache lookup time (+10% DebertaV2) (#100590 ) For TorchDynamo eager backend, DebertaV2 speedup improves from 0.77x to 0.87x. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100590 Approved by: https://github.com/voznesenskym, https://github.com/wconstab	2023-05-04 18:52:21 +00:00
Edward Z. Yang	c7e9f40653	Misc accuracy improvements on minifier (#100447 ) The changes: * Add config knob `same_two_models_use_fp64` for toggling whether or not to use fp64 * Add a test showing that RMSE is superior to atol/rtol * Add `--strict-accuracy` options, which allows for testing against integral/boolean accuracy. Regular accuracy by default now ONLY. There's a test which exercises this, it's a little delicate but I had trouble thinking of a good test otherwise. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100447 Approved by: https://github.com/voznesenskym	2023-05-04 02:51:26 +00:00
kshitij12345	8b64dee5d2	[fix] torch_compile_debug don't log with 0 (#100462 ) Fixes https://github.com/pytorch/pytorch/issues/99906 Tested locally. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100462 Approved by: https://github.com/mlazos	2023-05-03 08:23:09 +00:00
Richard Zou	984a2397ba	Refactor OutputGraph (#99987 ) This PR splits OutputGraph into two classes: - SubgraphTracer (handles FX-tracing) - OutputGraph (handles Dynamo-specific output graph logic, like tracking graph inputs, compiling the graph, and executing it). The motivation behind this is in the next PR up in the stack. TL;DR is: in order to do higher-order operators, we need nested SubgraphTracer, one for each level of nesting of the higher-order operators. I'm happy to flatten the stack into a single PR, but this separate made it easier for me to test. Lmk if you want the stack flattened. Test Plan: - existing tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/99987 Approved by: https://github.com/anijain2305, https://github.com/voznesenskym	2023-05-02 17:11:02 +00:00
Michael Voznesensky	aafc6ce8cc	Produce constant variables in cases where a SymNode is created with a constant (#100144 ) ` AOT_DYNAMIC_SHAPES=1 TORCHDYNAMO_DYNAMIC_SHAPES=1 benchmarks/dynamo/huggingface.py --performance --training --amp --backend eager --disable-cudagraphs --device cuda --only AllenaiLongformerBase --explain` Looks promising! Goes from: Dynamo produced 173 graphs covering 2760 ops with 160 graph breaks (14 unique) To: Dynamo produced 6 graphs covering 2298 ops with 15 graph breaks (7 unique) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100144 Approved by: https://github.com/ezyang	2023-05-01 21:32:11 +00:00
Edward Z. Yang	2d8deffc1e	Refactor repro/minifier into CLI; add analyze (#100226 ) This is a two part PR; I can split it if you really want me to. The first part is a refactor of the after aot repro/minifier scripts to come with a command line interface. I maintain exact BC with the previous interface (so, e.g., you still get a repro.py and a run_minifier.py that do the same thing as before), but each of these scripts also take command line arguments now which you can use to customize what actually happens. Check `run_repro` for full documentation on the arguments. The second part of this is an implementation of `analyze` subcommand on the new CLI for any repro. <img width="1277" alt="image" src="https://user-images.githubusercontent.com/13564/235045677-8545aab7-5e83-4813-bbec-47783dc60122.png"> This facility is oriented towards accuracy debugging. It does several things: 1. It will run your model twice and check for nondeterminism in inductor/float64, even on intermediate inputs (our benchmarking nondeterminism test only checks for nondeterminism on the final output). This makes localizing which operator is nondeterministic easy. 2. It will run your compiled model side-by-side with eager and float64 variants, and then report when things diverge too far from RMSE delta from float64. Importantly, it does all this without requiring every intermediate to be held in memory (which will cause an OOM on large repros, such as the one I tested this on.) Some other minor improvements: * MinifierTestBase now has an easy to comment out spot that you can use to retain the temporary directory; good for debugging * We print "running minifier" and "running repro" in MinifierTestBase to make it easier to orient where logs are coming from * same takes a `log_error` optional argument which you can use to reroute the error logs when things mismatch * counters["inductor"]["intermediate_hooks"] tracks the number of intermediate hooks we've codegen'ed; good for populate the tqdm interface * torch.fx.interpreter gets an official `boxed_run` interface which uses the boxed arguments calling convention and doesn't retain inputs unnecessarily long * torch.utils._content_store gets compute_tensor_metadata/read_tensor_metadata helper functions for computing tensor information without serializing it Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100226 Approved by: https://github.com/bertmaher, https://github.com/bdhirsh, https://github.com/anijain2305	2023-05-01 11:12:38 +00:00
PyTorch MergeBot	89c43f4108	Revert "Produce constant variables in cases where a SymNode is created with a constant (#100144 )" This reverts commit `d7bdfd3454`. Reverted https://github.com/pytorch/pytorch/pull/100144 on behalf of https://github.com/ezyang due to ci failure is real ([comment](https://github.com/pytorch/pytorch/pull/100144#issuecomment-1529587039))	2023-05-01 11:10:48 +00:00
Michael Voznesensky	d7bdfd3454	Produce constant variables in cases where a SymNode is created with a constant (#100144 ) ` AOT_DYNAMIC_SHAPES=1 TORCHDYNAMO_DYNAMIC_SHAPES=1 benchmarks/dynamo/huggingface.py --performance --training --amp --backend eager --disable-cudagraphs --device cuda --only AllenaiLongformerBase --explain` Looks promising! Goes from: Dynamo produced 173 graphs covering 2760 ops with 160 graph breaks (14 unique) To: Dynamo produced 6 graphs covering 2298 ops with 15 graph breaks (7 unique) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100144 Approved by: https://github.com/ezyang	2023-04-30 17:13:57 +00:00
Animesh Jain	03806eddbf	[dynamo] Compile torchvision augmentations (#100292 ) Resolves https://github.com/pytorch/pytorch/issues/100112 Pull Request resolved: https://github.com/pytorch/pytorch/pull/100292 Approved by: https://github.com/jansel	2023-04-29 02:59:41 +00:00
Larry Liu	f5853342ea	[dynamo][numpy] Handle return value being numpy ndarray (#99560 ) On top of #95849 this PR is trying to handle the special case when dealing with numpy. Consider the following example: ``` def f(x: torch.Tensor) -> np.ndarray: a = x.numpy() return a.T ``` In previous PR this will error out because we translate `a.T` to be a method call on `torch_np.ndarray.T` which is also a `torch_np.ndarray`. This PR handles this case, by conditionally converting a `torch_np.ndarray` to `np.ndarray` before returning, to match the original behavior. The compiled version will be: ``` def f(x): ___tmp_0 = __compiled_fn_0(x) if isinstance(___tmp_0, torch_np.ndarray): return ___tmp_0.tensor.numpy() else: return ___tmp_0 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/99560 Approved by: https://github.com/jansel, https://github.com/yanboliang	2023-04-27 16:18:35 +00:00
Larry Liu	687afeb686	[dynamo][numpy] Add NumpyTensorVariable to translate ndarray attribute calls to tensor attributes (#95849 ) Issue: #93684 # Problem Reduce graph breaks when dynamo compiles python functions containing numpy functions and ndarray operations. # Design (as I know it) * Use torch_np.ndarray(a wrapper of tensor) to back a `VariableTracker`: `NumpyTensorVariable`. * Translate all attributes and methods calls, on ndarray, to torch_np.ndarray equivalent. This PR adds `NumpyTensorVariable` and supports: 1. tensor to ndarray, ndarray to tensor 2. numpy functions such as numpy.meshgrid() 3. ndarray attributes such as `itemsize`, `stride` Next PR will handle returning `np.ndarray` and add support for ndarray methods Pull Request resolved: https://github.com/pytorch/pytorch/pull/95849 Approved by: https://github.com/ezyang	2023-04-27 16:18:35 +00:00
Animesh Jain	3dcc7b396c	[easy] iterate dict with sorted keys for accuracy checking (#99793 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99793 Approved by: https://github.com/jansel	2023-04-24 21:26:35 +00:00
Edward Z. Yang	f602b3a6ae	Preserve mark_dynamic when cloning inputs (#99617 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/99617 Approved by: https://github.com/ngimel, https://github.com/voznesenskym, https://github.com/anijain2305	2023-04-22 19:46:31 +00:00
Michael Voznesensky	0ac0d9d224	Pass locals to enum_repr to correctly make the guard str for enums (#99680 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99680 Approved by: https://github.com/jansel	2023-04-21 07:14:49 +00:00
Yanbo Liang	05809c7d3b	[Dynamo] No graph break for explicit calling Conv{1/2/3}d.forward & ConvTranspose{1/2/3}d.forward (#99015 ) Before this PR, if users call ```Conv2d(x)```, dynamo handles it well(no graph break) and puts a ```call_module``` op in the FX graph. However, if users explicitly call ```Conv2d.forward(x)``` in another ```forward``` function, the inlining would be failed(caused graph break). This PR fixed this issue by translating the explicit ```Conv2d.forward(x)``` to ```Conv2d(x)```. Pull Request resolved: https://github.com/pytorch/pytorch/pull/99015 Approved by: https://github.com/jansel, https://github.com/wconstab	2023-04-15 08:04:13 +00:00
Michael Voznesensky	10fbdcf72c	Re-PR of 90269 - Force all nn_module associated tensors to be static (#99108 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99108 Approved by: https://github.com/ezyang	2023-04-14 05:53:48 +00:00
Angela Yi	1d077f28ed	[export] Constraints API (#98433 ) Wrapper for users to insert constraints into model code. The constraints will not be maintained in the graph after tracing through make_fx so retracing with dynamo/make_fx will not work. This will be supported after torch._assert supported is implemented. Then we can convert the constrain_range calls to torch._asserts. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98433 Approved by: https://github.com/avikchaudhuri, https://github.com/tugsbayasgalan	2023-04-13 21:20:10 +00:00
PyTorch MergeBot	ab761605ae	Revert "[export] Constraints API (#98433 )" This reverts commit `1510eb4072`. Reverted https://github.com/pytorch/pytorch/pull/98433 on behalf of https://github.com/izaitsevfb due to Breaks internal tests, asked by author to revert	2023-04-12 23:37:19 +00:00
PyTorch MergeBot	629377ea8b	Revert "Replace _dynamo.config with an object instead of module (#96455 )" This reverts commit `420104a886`. Reverted https://github.com/pytorch/pytorch/pull/96455 on behalf of https://github.com/jansel due to BC breaking, was landed prematurely	2023-04-12 15:06:14 +00:00
Angela Yi	1510eb4072	[export] Constraints API (#98433 ) Wrapper for users to insert constraints into model code. The constraints will not be maintained in the graph after tracing through make_fx so retracing with dynamo/make_fx will not work. This will be supported after torch._assert supported is implemented. Then we can convert the constrain_range calls to torch._asserts. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98433 Approved by: https://github.com/avikchaudhuri, https://github.com/tugsbayasgalan	2023-04-12 01:32:44 +00:00
Han Qi	420104a886	Replace _dynamo.config with an object instead of module (#96455 ) Summary: Replace _dynamo.config with an object instead of module Current usage patterns of setting and reading fields on config will work unchanged. Only changes needed going forward: 1. import torch._dynamo.config will not work. However, just doing import torch._dynamo is sufficient to access dynamo config as torch._dynamo.config. 2. Files inside of _dynamo folder need to access config via from torch._dynamo.config_util import config instead of from torch._dynamo import config. Because _dynamo/__init__.py imports some of the files so it would be circular import. Test Plan: Reviewers: Subscribers: Tasks: Tags: Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/96455 Approved by: https://github.com/williamwen42	2023-04-11 21:23:32 +00:00
Edward Z. Yang	b8b840be3d	Convert logging f-strings to use % format, part five (#98765 ) This does some annoying but simple cases by hand. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98765 Approved by: https://github.com/wanchaol	2023-04-11 13:17:59 +00:00
Edward Z. Yang	822464567f	Lazily format graphs for debug printing (#98776 ) The current code unconditionally formats the graphs, which is a waste of CPU if no one looks at them. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98776 Approved by: https://github.com/albanD, https://github.com/mlazos	2023-04-10 22:41:33 +00:00
Edward Z. Yang	b09722f540	Convert logging f-strings to use % format, part two (#98700 ) This hits multi-line logging strings Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98700 Approved by: https://github.com/voznesenskym	2023-04-10 12:19:31 +00:00
Edward Z. Yang	9a8f71f23e	Convert logging f-strings to use % format (#98697 ) Codemod done with https://gist.github.com/ezyang/2e8b0463cdc6be278478495b23ff0530 with assistance from ChatGPT. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98697 Approved by: https://github.com/voznesenskym	2023-04-10 12:19:31 +00:00
YJ Shi	5ceae85f1c	[Dynamo] Include UserDict in clone_inputs (#97725 ) Fixes #97724 Pull Request resolved: https://github.com/pytorch/pytorch/pull/97725 Approved by: https://github.com/yanboliang	2023-04-08 00:19:35 +00:00
Horace He	c75dd7c413	grab bag of changes (#98572 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98572 Approved by: https://github.com/shunting314, https://github.com/mlazos	2023-04-07 20:02:59 +00:00
Will Constable	390c51bf87	Skip nnmodule hook guards by default (#98371 ) This PR makes basic nnmodule forward hooks work by default, without any overhead. But it leaves silent correctness issues if users modify/remove their hooks later, thus also emits a warning. - the usual case is to not use hooks, so avoid guard overhead here - registering any hook before compile will trigger a warning about hook support - registering a hook later (or removing one) requires user knowledge and opting in, currently this isn't warnable (but maybe we can observe compiled nnmodules to make it warnable). Why skip hook guards by default instead of not tracing __call__/hooks by default? - avoid having a mode flag that alters dynamo tracing behavior (harder to test both codepaths in CI with full coverage) - the most basic hook usecase (registering a hook before compile, and never removing it) will work by default with this PR, while it would require enablement and incur overhead in the 'not tracing __call__' proposal. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98371 Approved by: https://github.com/jansel	2023-04-07 15:10:51 +00:00
Edward Z. Yang	d01ee10b25	Add detect_fake_mode (#98321 ) This replaces fake_mode_from_tensors but it preferentially looks for fake_mode in TracingContext and also if there is an active fake mode on the dispatch stack, before groveling in tensors to find it. This advances PegasusForCausalLM, which was previously failing because we generated a graph that had a parameter (non-fake) and a SymInt, and thus previously we failed to detect the correct fake mode. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98321 Approved by: https://github.com/voznesenskym	2023-04-05 22:15:16 +00:00
Yanbo Liang	b1c2925493	[Dynamo] Support typing.Union and typing.Optional (#98384 ) Fixes #98265 Pull Request resolved: https://github.com/pytorch/pytorch/pull/98384 Approved by: https://github.com/ezyang	2023-04-05 21:31:52 +00:00
Michael Voznesensky	b1e60bfb6a	Pass f_locals as a dict rather than kwargs (#98107 ) Fixes https://github.com/pytorch/pytorch/issues/97688 One big problem is that instead of printing x < y we now print `E["x"] < E["y"]` and now all of the tests wobbled and I'm mad. Signed-off-by: Edward Z. Yang <ezyangmeta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98107 Approved by: https://github.com/ezyang	2023-04-04 00:30:08 +00:00
Yanbo Liang	a6bd21d935	[Dynamo] Eagerly initializing Lazy Module to reduce graph breaks (#97946 ) Fixes Meta internal user case. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97946 Approved by: https://github.com/wconstab	2023-04-03 22:24:43 +00:00
Jason Ansel	35b3309539	Fix graph break from inline patched init (#98150 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98150 Approved by: https://github.com/anijain2305, https://github.com/yanboliang	2023-04-03 01:11:30 +00:00
Michael Lazos	ee9a9b7add	Remove old logging callsites (#98095 ) Get around GH first issue, OSS only changes for https://github.com/pytorch/pytorch/pull/97182 Pull Request resolved: https://github.com/pytorch/pytorch/pull/98095 Approved by: https://github.com/anijain2305	2023-04-01 00:57:37 +00:00
William Wen	14ef91cea6	[dynamo 3.11] small bug fixes (#96508 ) Bugs fixed: - CALL_FUNCTION_EX expects null pop in symbolic_convert - make_function_with_closure codegen requires a push_null - copy over the closure in eval_frame.c - add JUMP_FORWARD to terminal opcodes - enum repr fix in utils.py - fix symbolic_convert's break_graph_if_unsupported wrapper Pull Request resolved: https://github.com/pytorch/pytorch/pull/96508 Approved by: https://github.com/jansel	2023-03-31 18:18:12 +00:00
David Berard	c218309f88	[dynamo] profiler.record_function on all dynamo_timed functions (#96495 ) Summary: profiler.record_function inserts an event into the chrome trace generated by the pytorch profiler. This PR adds record_function everywhere that @dynamo_timed is annotated. dynamo_timed and the CLI viewer torch._dynamo.utils.compile_times() are already useful on their own; but for identifying _when_ these get called, it's nice to be able to view in the profiler chrome trace. Why not just turn on python stack traces in the profiler to get this information? Dynamo compilation is implemented in python and therefore produces a huge amount of events when it records compilation steps. The resulting trace files are often too large to load in chrome://tracing, and they take a long time to generate. Additionally, the stack traces are deep enough that they are often hard to read. This approach produces much more readable traces with lower overhead. Tests: - Added in test/dynamo/test_profiler.py. Verified in https://github.com/pytorch/pytorch/actions/runs/4559322864/jobs/8043307798?pr=96495 that the tests are actually running. - Performance run with `ciflow/inductor-perf-compare` shows no noticeable change in compilation time or speedup numbers. Geomean speedup changes from 1.275 -> 1.277. Geomean compilation times change from 54.2s -> 53.8s. That's likely just due to noise. All individual benchmark numbers regressed by no more than 5% between the two runs; and we see improvements of around the same magnitude, suggesting this is, again, just noise. For meta employees, you can see the results in a google sheets here: https://docs.google.com/spreadsheets/d/1Ki69XvcgxcA3ZnqC5n_jav5KiD4u7Wojlad3VTnIdlk/edit?usp=sharing Example: Run this: ```python import torch def gn(x): return x.sin().cos() def fn(x, y): return x.sin() * y.cos() x, y = [torch.rand((2, 2), device='cuda') for _ in range(2)] # just to clear out any lazy initialization with torch.profiler.profile() as prof: torch.compile(gn)(x) with torch.profiler.profile() as prof: torch.compile(fn)(x, y) prof.export_chrome_trace("./dynamo_timed_profile.json") ``` and we can see that the resulting trace shows important dynamo steps, even when python tracing is turned off. <img width="867" alt="Screenshot 2023-03-29 at 7 26 15 PM" src="https://user-images.githubusercontent.com/5067123/228712263-8ae67ab9-1a52-4765-a9c2-7c5cf0abe2f5.png"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96495 Approved by: https://github.com/ngimel, https://github.com/mlazos	2023-03-30 21:49:02 +00:00
Edward Z. Yang	fb7f983357	Graph break on operators that fake tensor doesn't support (#97708 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/97708 Approved by: https://github.com/eellison	2023-03-28 19:49:54 +00:00
vfdev	0f424f7f05	Fixed broken link to troubleshooting.html docs page (#97330 ) Seen first in error message: ``` [2023-03-22 10:30:39,786] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (64) function: '<resume in paste_mask_in_image>' (/vision/torchvision/models/detection/roi_heads.py:407) reasons: w == 857 to diagnose recompilation issues, see https://pytorch.org/docs/master/dynamo/troubleshooting.html. [2023-03-22 10:30:40,036] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (64) function: '<resume in paste_mask_in_image>' (/vision/torchvision/models/detection/roi_heads.py:406) reasons: ___stack0 == 207 to diagnose recompilation issues, see https://pytorch.org/docs/master/dynamo/troubleshooting.html. ``` Broken link: - https://pytorch.org/docs/master/dynamo/troubleshooting.html. Good link: - https://pytorch.org/docs/master/compile/troubleshooting.html Pull Request resolved: https://github.com/pytorch/pytorch/pull/97330 Approved by: https://github.com/zou3519	2023-03-22 16:40:21 +00:00
Will Constable	141a2ebcf1	Clean up Compilation Profiler (#97029 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/97029 Approved by: https://github.com/voznesenskym	2023-03-21 06:24:22 +00:00
Michael Voznesensky	722c4e59a4	Replace source check with assert (#95640 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/95640 Approved by: https://github.com/ezyang	2023-03-19 21:51:59 +00:00
Michael Lazos	a1c46e5f8f	component-level configurable logging for dynamo, inductor, aot (#94858 ) Summary: Adds NNC-like logging that is configured through an env var `TORCH_COMPILE_LOGS` Examples: `TORCH_LOGS="dynamo,guards" python script.py` - prints dynamo logs at level INFO with guards of all functions that are compiled `TORCH_LOGS="+dynamo,guards,graph" python script.py` - prints dynamo logs at level DEBUG with guards and graphs (in tabular) format of all graphs that are compiled [More examples with full output](https://gist.github.com/mlazos/b17f474457308ce15e88c91721ac1cce) Implementation: The implementation parses the log settings from the environment, finds any components (aot, dynamo, inductor) or other loggable objects (guards, graph, etc.) and generates a log_state object. This object contains all of the enabled artifacts, and a qualified log name -> level mapping. _init_logs then adds handlers to the highest level logs (the registered logs), and sets any artifact loggers to level DEBUG if the artifact is enabled. Note: set_logs is an alternative for manipulating the log_state, but if the environment contains TORCH_LOGS, the environment settings will be prioritized. Adding a new log: To add a new log, a dev should add their log name to torch._logging._registrations (there are examples there already). Adding a new artifact: To add a new artifact, a dev should add their artifact name to torch._logging._registrations as well. Additionally, wherever the artifact is logged, `torch._logging.getArtifactLogger(__name__, <artifact_name>)` should be used instead of the standard logging implementation. [design doc](https://docs.google.com/document/d/1ZRfTWKa8eaPq1AxaiHrq4ASTPouzzlPiuquSBEJYwS8/edit#) Pull Request resolved: https://github.com/pytorch/pytorch/pull/94858 Approved by: https://github.com/ezyang	2023-03-18 04:17:31 +00:00
Edward Z. Yang	384d3ec2b6	Extra CR comments from #95621 (#96043 ) Specifically: `063e441471 (r1120306196)` https://github.com/pytorch/pytorch/pull/95621#discussion_r1125015510 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96043 Approved by: https://github.com/Chillee, https://github.com/albanD	2023-03-10 01:10:48 +00:00
Horace He	5bbec680d7	Fix usages of contextmanager without finally (#96170 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/96170 Approved by: https://github.com/ngimel, https://github.com/malfet	2023-03-08 20:59:27 +00:00
Edward Z. Yang	d303665d33	Make int unspecialization actually work (#95621 ) OK, so this PR used to be about reducing the number of constants we specialize on, but it turns out that unspecialization was ~essentially never used (because we still constant specialized way too aggressively) and I ended up having to fix a bunch of issues to actually get tests to pass. So this PR is now "make int unspecialization actually work". As part of this, I have to turn off unspecialization by default, as there are still latent bugs in inductor. The general strategy is that an unspecialized int is represented as a SymInt. Representing it as a 0d tensor (which is what the code used to do) is untenable: (1) we often need unspecialized ints to participate in size computations, but we have no way of propagating sympy expressions through tensor compute, and (2) a lot of APIs work when passed SymInt, but not when passed a Tensor. However, I continue to represent Numpy scalars as Tensors, as they are rarely used for size computation and they have an explicit dtype, so they are more accurately modeled as 0d tensors. * I folded in the changes from https://github.com/pytorch/pytorch/pull/95099 as I cannot represent unspecialized ints as SymInts without also turning on dynamic shapes. This also eliminates the necessity for test_unspec.py, as toggling specialization without dynamic shapes doesn't do anything. As dynamic shapes defaults to unspecializing, I just deleted this entirely; for the specialization case, I rely on regular static shape tests to catch it. (Hypothetically, we could also rerun all the tests with dynamic shapes, but WITH int/float specialization, but this seems... not that useful? I mean, I guess export wants it, but I'd kind of like our Source heuristic to improve enough that export doesn't have to toggle this either.) * Only 0/1 integers get specialized by default now * A hodgepodge of fixes. I'll comment on the PR about them. Fixes https://github.com/pytorch/pytorch/issues/95469 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/95621 Approved by: https://github.com/jansel, https://github.com/Chillee	2023-03-04 01:22:08 +00:00
Michael Voznesensky	34a7c79eac	Rename func (#95639 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/95639 Approved by: https://github.com/ezyang	2023-03-01 23:03:09 +00:00
Edward Z. Yang	835122c89f	Add missing f-string specifiers (#95707 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/95707 Approved by: https://github.com/Skylion007, https://github.com/albanD	2023-02-28 20:20:05 +00:00

1 2 3

106 Commits