pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Laith Sakka	acbf888a13	rename sl to strobelight (#124455 ) Summary: TORCH_COMPILE_SL_PROFILE ->TORCH_COMPILE_STROBELIGHT SL_MAX_STACK_LENGTH -> COMPILE_STROBELIGHT_MAX_STACK_LENGTH SL_MAX_PROFILE_TIME -> COMPILE_STROBELIGHT_MAX_PROFILE_TIME profile_with_sl() -> strobelight() compiletime_sl_profile_meta() -> compiletime_strobelight_meta() Test Plan: 1. run and verify ``` TORCH_COMPILE_STROBELIGHT=TRUE buck2 run @//mode/inplace @//mode/opt //caffe2/fb/strobelight:compiletime_profiler_example ``` 2. run and verify ``` buck2 run @//mode/inplace @//mode/opt //caffe2/fb/strobelight:function_profiler_example --local-only ``` 3. run and verify truncated stack for ``` TORCH_COMPILE_STROBELIGHT=TRUE COMPILE_STROBELIGHT_MAX_STACK_LENGTH=1 buck2 run @//mode/inplace @//mode/opt //caffe2/fb/strobelight:compiletime_profiler_example ``` 4. add infinite loop in _verify and verify samples for ``` COMPILE_STROBELIGHT_MAX_PROFILE_TIME=30 TORCH_COMPILE_STROBELIGHT=TRUE buck2 run @//mode/inplace @//mode/opt //caffe2/fb/strobelight:compiletime_profiler_example ``` Reviewed By: oulgen Differential Revision: D56327139 Pull Request resolved: https://github.com/pytorch/pytorch/pull/124455 Approved by: https://github.com/oulgen	2024-04-19 22:50:13 +00:00
Animesh Jain	601112fdb4	[dynamo][log] Print missing skipped frame info on debug (#124078 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/124078 Approved by: https://github.com/yanboliang	2024-04-15 20:33:17 +00:00
Animesh Jain	4e3022dbe9	[dynamo][logs] Print bytecode before tracing (#123877 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123877 Approved by: https://github.com/jansel ghstack dependencies: #122943	2024-04-12 02:32:58 +00:00
Aaron Orenstein	2bcc83dfbd	Preserve dispatch state across function tracing (#122073 ) If we throw an exception in the "wrong" place we can end up with the dispatch state being in a weird state which can cause all future dispatching to fail. Preserve and restore it as part of `preserve_global_state` so we know it's sane after that. Also fake_tensor's in_kernel_invocation_manager() was leaving a bit set in the dispatcher (DispatchKey.Dense) which affected follow-on code. Fixed that to reset after as well. Repro: before: ``` $ rm test/dynamo_skips/TestSparseCPU.test_to_dense_with_gradcheck_sparse_cpu_complex64 $ PYTORCH_TEST_WITH_DYNAMO=1 pytest -s test/dynamo/test_export.py test/test_sparse.py -k 'test_to_dense_with_gradcheck_sparse_cpu_complex64' ======== 1 passed, 6173 deselected in 5.21s ============= $ PYTORCH_TEST_WITH_DYNAMO=1 pytest -s test/dynamo/test_export.py test/test_sparse.py -k 'test_torch_inference_mode_ctx or test_to_dense_with_gradcheck_sparse_cpu_complex64' ========= 1 skipped, 6172 deselected, 1 error in 5.29s ========= ``` (note that test_to_dense_with_gradcheck_sparse_cpu_complex64 passes on its own but failed when including the skipped test_export.py tests) after: ``` $ rm test/dynamo_skips/TestSparseCPU.test_to_dense_with_gradcheck_sparse_cpu_complex64 $ PYTORCH_TEST_WITH_DYNAMO=1 pytest -s test/dynamo/test_export.py test/test_sparse.py -k 'test_to_dense_with_gradcheck_sparse_cpu_complex64' ===================== 1 passed, 6173 deselected in 5.42s ===================== $ PYTORCH_TEST_WITH_DYNAMO=1 pytest -s test/dynamo/test_export.py test/test_sparse.py -k 'test_torch_inference_mode_ctx or test_to_dense_with_gradcheck_sparse_cpu_complex64' ===================== 1 passed, 1 skipped, 6172 deselected in 7.30s ====================== ``` (note that test_to_dense_with_gradcheck_sparse_cpu_complex64 passes in both runs) Pull Request resolved: https://github.com/pytorch/pytorch/pull/122073 Approved by: https://github.com/zou3519	2024-04-10 18:57:01 +00:00
Animesh Jain	1dc4e1e335	[dynamo][logs] Bug fix (#123606 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123606 Approved by: https://github.com/jansel, https://github.com/ezyang	2024-04-10 07:30:02 +00:00
Animesh Jain	bb04f3f66a	[dynamo][logger] Log graph break on Unsupported bytecodes (#122684 ) This would have saved me a few hours while debugging an internal model. We could not support a LOAD_ATTR bytecode, because it was a property, and the inlining failed due to skip. Since LOAD_ATTR does not support a continuation function, we would fall back to eager for the whole frame, aka skip. But we should also log this as a graph break. This PR does that. Bonus - removes skip from a test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/122684 Approved by: https://github.com/ezyang	2024-04-08 01:50:04 +00:00
Jason Ansel	e3ea316623	[dynamo] Save/restore cublas_allow_tf32 in convert_frame (#123509 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123509 Approved by: https://github.com/anijain2305	2024-04-07 03:37:47 +00:00
Laith Sakka	caed7f6727	profile pt2 compile time with strobelight (#123311 ) For oss this diff adds a decorator @profile_sb_fbcode that is a nop for non-Meta workloads. Facebook: With this diff someone can generate a strobelight profile for pt2 compilation. Users need to set the env variable TORCH_COMPILE_SL_PROFILE=TRUE. For example: ``` TORCH_COMPILE_SL_PROFILE=TRUE buck2 run @//mode/inplace @//mode/opt //caffe2/fb/strobelight:compiletime_profile_example ``` See sample output below, at the end of the summary. The way this works is that a unique id is generated and associated with all samples that are collected for functions that are decorated with profile_sb_fbcode. This id can then be used to combine different strobelight profiles into one (for example, three compilation events happen in the code below). Right now the following two functions are annotated with profile_sb_fbcode: bw_compiler and _compile. If profile_sl_fbcode is called recursively, recursive invocations are ignored and a log is printed. 
The output is: ``` Strobelight is enabled for pt2 compilation Unique user-id for this run is: 2024-04-03-13:59:49147091devvm4561.ash0.facebook.com You can use the following link to access the strobelight profile at the end of the run: https://www.internalfb.com/intern/scuba/query/?dataset=pyperf_experimental%2Fon_demand&drillstate=%7B%22purposes%22%3A[]%2C%22end%22%3A%22now%22%2C%22start%22%3A%22-30%20days%22%2C%22filterMode%22%3A%22DEFAULT%22%2C%22modifiers%22%3A[]%2C%22sampleCols%22%3A[]%2C%22cols%22%3A[%22namespace_id%22%2C%22namespace_process_id%22]%2C%22derivedCols%22%3A[]%2C%22mappedCols%22%3A[]%2C%22enumCols%22%3A[]%2C%22return_remainder%22%3Afalse%2C%22should_pivot%22%3Afalse%2C%22is_timeseries%22%3Afalse%2C%22hideEmptyColumns%22%3Afalse%2C%22timezone%22%3A%22America%2FLos_Angeles%22%2C%22compare%22%3A%22none%22%2C%22samplingRatio%22%3A%221%22%2C%22metric%22%3A%22count%22%2C%22aggregation_field%22%3A%22async_stack_complete%22%2C%22top%22%3A10000%2C%22aggregateList%22%3A[]%2C%22param_dimensions%22%3A[%7B%22dim%22%3A%22py_async_stack%22%2C%22op%22%3A%22edge%22%2C%22param%22%3A%220%22%2C%22anchor%22%3A%220%22%7D]%2C%22order%22%3A%22weight%22%2C%22order_desc%22%3Atrue%2C%22constraints%22%3A[[%7B%22column%22%3A%22run_user%22%2C%22op%22%3A%22eq%22%2C%22value%22%3A[%22[%5C%222024-04-03-13:59:49147091devvm4561.ash0.facebook.com%5C%22]%22]%7D]]%2C%22c_constraints%22%3A[[]]%2C%22b_constraints%22%3A[[]]%2C%22ignoreGroupByInComparison%22%3Afalse%7D&view=GraphProfilerView&&pool=uber&graphprofiler_filter=&graphprofiler_column_to_sort_by=exclusive the link below takes you to the collected strobelight profile 
https://www.internalfb.com/intern/scuba/query/?dataset=pyperf_experimental%2Fon_demand&drillstate=%7B%22dimensions%22%3A%5B%5D%2C%22param_dimensions%22%3A%5B%7B%22anchor%22%3A%220%22%2C%22param%22%3A%220%22%2C%22op%22%3A%22edge%22%2C%22dim%22%3A%22py_async_stack%22%7D%5D%2C%22constraints%22%3A%5B%5B%7B%22value%22%3A%5B%22%5B%5C%22-6800545191281321%5C%22%5D%22%5D%2C%22op%22%3A%22eq%22%2C%22column%22%3A%22run_id%22%7D%2C%7B%22value%22%3A%5B%22%5B%5C%222024-04-03-13%3A59%3A49147091devvm4561.ash0.facebook.com%5C%22%5D%22%5D%2C%22op%22%3A%22eq%22%2C%22column%22%3A%22run_user%22%7D%5D%5D%2C%22top%22%3A10000%2C%22end%22%3A%221712181610%22%2C%22start%22%3A%221712174410%22%7D&view=GraphProfilerView& 1 storbelight success runs out of 1 non-ignored runs. strobelight run id is: 6181728288420687 the link below takes you to the collected strobelight profile https://www.internalfb.com/intern/scuba/query/?dataset=pyperf_experimental%2Fon_demand&drillstate=%7B%22dimensions%22%3A%5B%5D%2C%22param_dimensions%22%3A%5B%7B%22anchor%22%3A%220%22%2C%22param%22%3A%220%22%2C%22op%22%3A%22edge%22%2C%22dim%22%3A%22py_async_stack%22%7D%5D%2C%22constraints%22%3A%5B%5B%7B%22value%22%3A%5B%22%5B%5C%226181728288420687%5C%22%5D%22%5D%2C%22op%22%3A%22eq%22%2C%22column%22%3A%22run_id%22%7D%2C%7B%22value%22%3A%5B%22%5B%5C%222024-04-03-13%3A59%3A49147091devvm4561.ash0.facebook.com%5C%22%5D%22%5D%2C%22op%22%3A%22eq%22%2C%22column%22%3A%22run_user%22%7D%5D%5D%2C%22top%22%3A10000%2C%22end%22%3A%221712181621%22%2C%22start%22%3A%221712174421%22%7D&view=GraphProfilerView& 2 storbelight success runs out of 2 non-ignored runs. 
strobelight run id is: -1026103682715688 the link below takes you to the collected strobelight profile https://www.internalfb.com/intern/scuba/query/?dataset=pyperf_experimental%2Fon_demand&drillstate=%7B%22dimensions%22%3A%5B%5D%2C%22param_dimensions%22%3A%5B%7B%22anchor%22%3A%220%22%2C%22param%22%3A%220%22%2C%22op%22%3A%22edge%22%2C%22dim%22%3A%22py_async_stack%22%7D%5D%2C%22constraints%22%3A%5B%5B%7B%22value%22%3A%5B%22%5B%5C%22-1026103682715688%5C%22%5D%22%5D%2C%22op%22%3A%22eq%22%2C%22column%22%3A%22run_id%22%7D%2C%7B%22value%22%3A%5B%22%5B%5C%222024-04-03-13%3A59%3A49147091devvm4561.ash0.facebook.com%5C%22%5D%22%5D%2C%22op%22%3A%22eq%22%2C%22column%22%3A%22run_user%22%7D%5D%5D%2C%22top%22%3A10000%2C%22end%22%3A%221712181647%22%2C%22start%22%3A%221712174447%22%7D&view=GraphProfilerView& 3 storbelight success runs out of 3 non-ignored runs. ``` Test Plan: Was tested on buck2 run @//mode/inplace @//mode/opt //caffe2/fb/strobelight:compiletime_profile_example This was also tested in one of the ads benchmarks ``` TORCH_COMPILE_SL_PROFILE=TRUE buck2 run mode/opt mode/inplace //pytorch/benchmark:run -- ads_mc_igctr_mc3_v0 -d cuda -t train --torchdynamo inductor ``` The results match the results reported in https://fb.workplace.com/groups/257735836456307/permalink/657458576484029 Differential Revision: D55672271 Pull Request resolved: https://github.com/pytorch/pytorch/pull/123311 Approved by: https://github.com/aorenste	2024-04-06 18:57:44 +00:00
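The behaviour described in this commit (a decorator that does nothing unless an environment variable is set, and that ignores nested invocations with a log message) might look roughly like the sketch below. The decorator name `profiled`, the `profiled_calls` list, and the functions `inner` and `outer` are hypothetical; the real decorator starts a Strobelight profile, which is replaced here by recording the function name.

```python
import functools
import os

ENV_VAR = "TORCH_COMPILE_SL_PROFILE"  # variable name taken from the commit message
_active = False        # True while an outer decorated call is being profiled
profiled_calls = []    # hypothetical stand-in for "a profile was collected"


def profiled(func):
    """No-op unless ENV_VAR is TRUE; nested decorated calls are not profiled again."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        global _active
        if os.environ.get(ENV_VAR, "").upper() != "TRUE":
            return func(*args, **kwargs)
        if _active:
            print(f"profiling already active, ignoring nested call to {func.__name__}")
            return func(*args, **kwargs)
        _active = True
        try:
            profiled_calls.append(func.__name__)
            return func(*args, **kwargs)
        finally:
            _active = False
    return wrapper


@profiled
def inner(x):
    return x + 1


@profiled
def outer(x):
    return inner(x) * 2
```

With the variable set to TRUE, calling `outer` records a single profiled call; the nested call to `inner` only prints the log line.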
James Wu	df1cdaedeb	Log restart reasons and extra compile time in CompilationMetrics (#121827 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/121827 Approved by: https://github.com/ezyang, https://github.com/yanboliang	2024-03-18 18:59:25 +00:00
Wenting Wang	02bb2180f4	[torch export] replace traceback.extract_stack with CapturedTraceback.extract (#121449 ) Summary: with a simple bench in TestDeserializer.test_basic function: ``` time_start = time.time() for i in range(1000): self.check_graph(MyModule(), inputs) warnings.warn(f"time_taken: {time.time() - time_start}") ``` and forcing FakeTensorConfig.debug to True, record_stack_traces to True, logging level to debug, it shows that the changed code is consistently around 20 secs faster (~90s vs originally ~110s) Test Plan: test passed, see summary compared debug trace before and after: - exactly the same for fake tensor and proxy callsite https://www.internalfb.com/intern/diffing/?paste_number=1189883685 - slightly different for the user frame in proxy node https://www.internalfb.com/intern/diffing/?paste_number=1189884347 Differential Revision: D54237017 Pull Request resolved: https://github.com/pytorch/pytorch/pull/121449 Approved by: https://github.com/angelayi	2024-03-13 00:19:05 +00:00
Yanbo Liang	169c220bf8	[torch.compile] Provide capability to register callback on compile start/stop (#120764 ) This is a requirement from Meta internal cases, where people want to register a callback function to detect if a job is stuck during compilation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/120764 Approved by: https://github.com/jansel	2024-02-29 07:37:52 +00:00
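A start/stop callback mechanism of the kind this commit describes could be sketched as a small registry. The class and method names below (`CompileCallbacks`, `register_start`, `register_end`, `around_compile`) are assumptions made for illustration and are not the API added by the PR.

```python
from contextlib import contextmanager
from typing import Callable, List


class CompileCallbacks:
    """Hypothetical registry of functions to run when compilation starts and ends."""

    def __init__(self) -> None:
        self.start_callbacks: List[Callable[[], None]] = []
        self.end_callbacks: List[Callable[[], None]] = []

    def register_start(self, cb: Callable[[], None]) -> Callable[[], None]:
        self.start_callbacks.append(cb)
        return cb

    def register_end(self, cb: Callable[[], None]) -> Callable[[], None]:
        self.end_callbacks.append(cb)
        return cb

    @contextmanager
    def around_compile(self):
        # Start callbacks fire first; end callbacks fire even if compilation raises.
        for cb in self.start_callbacks:
            cb()
        try:
            yield
        finally:
            for cb in self.end_callbacks:
                cb()


events = []
callbacks = CompileCallbacks()
callbacks.register_start(lambda: events.append("start"))
callbacks.register_end(lambda: events.append("end"))

with callbacks.around_compile():
    events.append("compiling")  # placeholder for the real compile step
```

A watchdog for stuck jobs could record a timestamp in the start callback and clear it in the end callback.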
Edward Z. Yang	1a1fc1047d	Add structured trace logs (#120289 ) Overall design: https://docs.google.com/document/d/1CX_hJ0PNy9f3R1y8TJrfkSeLkvGjjjLU84BSXgS2AZ8/edit How to read the diff: * Most files are me augmenting pre-existing logging with structured variants. For the most part it's simple (esp FX graphs, which have a canonical string representation); it gets more complicated when I decided to JSON-ify some data structures instead of keeping the ad hoc printing (notably, guards and dynamo output graph sizes) * torch/_functorch/_aot_autograd/collect_metadata_analysis.py is some unrelated fixes I noticed while auditing artifact logs * torch/_logging/_internal.py has the actual trace log implementation. The trace logger is implemented as a logger named torch.__trace which is disconnected from the logging hierarchy. It gets its own handler and formatter (TorchLogsFormatter with _is_trace True). `trace_structured` is the main way to emit a trace log. Unusually, there are separate "metadata" and "payload" fields. The metadata field should not be too long (as it is serialized as a single line) and is always JSON (we put contextual things like compile id in it); the payload field can be long and is emitted after the metadata log line and can span multiple lines. * torch/_logging/structured.py contains some helpers for converting Python data structures into JSON form. Notably, we have a string interning implementation here, which helps reduce the cost of serializing filenames into the log. * test/dynamo/test_structured_trace.py the tests are cribbed from test_logging.py, but all rewritten to use expect tests on munged versions of what we'd actually output. Payloads are never tested, since they tend not to be very stable. https://github.com/ezyang/tlparse is a POC Rust program that can interpret these logs. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/120289 Approved by: https://github.com/Skylion007 ghstack dependencies: #120712	2024-02-28 01:01:41 +00:00
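The metadata/payload split and the string interning mentioned in this commit can be illustrated as follows. This is a sketch under assumptions: `format_trace`, `intern_string`, the tab-indented payload layout, and the `graph_dump` record name are all hypothetical and do not reproduce the actual `trace_structured` output format.

```python
import json

_interned = {}  # maps each string to a small integer id


def intern_string(s):
    """Return a stable integer id for `s`, assigning a new id on first use."""
    if s not in _interned:
        _interned[s] = len(_interned)
    return _interned[s]


def format_trace(name, metadata, payload=""):
    """Emit one single-line JSON metadata record, then the payload lines (indented)."""
    lines = [json.dumps({name: metadata})]
    if payload:
        lines.extend("\t" + line for line in payload.split("\n"))
    return "\n".join(lines)


entry = format_trace(
    "graph_dump",
    {"compile_id": "0/0", "filename": intern_string("a.py")},
    "def forward(x):\n    return x * 2",
)
```

Keeping the metadata on one line means a reader can parse each record with a single `json.loads` call and then treat the following indented lines as opaque payload.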
Edward Z. Yang	213b3ac3f2	[BE] fail_* variables don't need to be shared across restarts, they're set only once (#120712 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/120712 Approved by: https://github.com/yanboliang	2024-02-28 00:48:11 +00:00
PyTorch MergeBot	f3dd2a544c	Revert "Add structured trace logs (#120289 )" This reverts commit `9dfaef962c`. Reverted https://github.com/pytorch/pytorch/pull/120289 on behalf of https://github.com/kit1980 due to breaking internal builds, see D54230697 ([comment](https://github.com/pytorch/pytorch/pull/120289#issuecomment-1967477120))	2024-02-27 19:49:05 +00:00
Edward Z. Yang	9dfaef962c	Add structured trace logs (#120289 ) Overall design: https://docs.google.com/document/d/1CX_hJ0PNy9f3R1y8TJrfkSeLkvGjjjLU84BSXgS2AZ8/edit How to read the diff: * Most files are me augmenting pre-existing logging with structured variants. For the most part it's simple (esp FX graphs, which have a canonical string representation); it gets more complicated when I decided to JSON-ify some data structures instead of keeping the ad hoc printing (notably, guards and dynamo output graph sizes) * torch/_functorch/_aot_autograd/collect_metadata_analysis.py is some unrelated fixes I noticed while auditing artifact logs * torch/_logging/_internal.py has the actual trace log implementation. The trace logger is implemented as a logger named torch.__trace which is disconnected from the logging hierarchy. It gets its own handler and formatter (TorchLogsFormatter with _is_trace True). There's a teensy bit of FB specific code to automatically enable trace logging if a /logs directory exists. `trace_structured` is the main way to emit a trace log. Unusually, there are separate "metadata" and "payload" fields. The metadata field should not be too long (as it is serialized as a single line) and is always JSON (we put contextual things like compile id in it); the payload field can be long and is emitted after the metadata log line and can span multiple lines. * torch/_logging/structured.py contains some helpers for converting Python data structures into JSON form. Notably, we have a string interning implementation here, which helps reduce the cost of serializing filenames into the log. * test/dynamo/test_structured_trace.py the tests are cribbed from test_logging.py, but all rewritten to use expect tests on munged versions of what we'd actually output. Payloads are never tested, since they tend not to be very stable. https://github.com/ezyang/tlparse is a POC Rust program that can interpret these logs. Testing that the fbcode detection works at https://www.internalfb.com/mlhub/pipelines/runs/fblearner/534553450 (Meta-only) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/120289 Approved by: https://github.com/Skylion007	2024-02-27 00:04:23 +00:00
Michael Lazos	56203fc407	Add profiling for backward (#120540 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/120540 Approved by: https://github.com/anijain2305	2024-02-24 16:53:28 +00:00
Yanbo Liang	d42ede8ae4	[torch.compile] Log compilation start time for timeline view (#120220 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/120220 Approved by: https://github.com/angelayi	2024-02-20 21:07:40 +00:00
Yanbo Liang	7f5b87c953	[torch.compile] Log more compilation time breakdown (#119865 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119865 Approved by: https://github.com/ezyang	2024-02-15 02:20:07 +00:00
Yanbo Liang	5356b5d1f0	[Dynamo][16/N] Move skipfiles to trace_rules.py (#119432 ) This is follow-up-1 for https://github.com/pytorch/pytorch/pull/118971#issue-2114082018. Only code motion and doc update in this PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/119432 Approved by: https://github.com/jansel	2024-02-09 18:18:23 +00:00
PyTorch MergeBot	eff93fbd86	Revert "[Dynamo][16/N] Move skipfiles to trace_rules.py (#119432 )" This reverts commit `56364124af`. Reverted https://github.com/pytorch/pytorch/pull/119432 on behalf of https://github.com/atalman due to Breaks internal tests ([comment](https://github.com/pytorch/pytorch/pull/119432#issuecomment-1936122795))	2024-02-09 15:25:25 +00:00
Yanbo Liang	56364124af	[Dynamo][16/N] Move skipfiles to trace_rules.py (#119432 ) This is follow-up-1 for https://github.com/pytorch/pytorch/pull/118971#issue-2114082018. Only code motion and doc update in this PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/119432 Approved by: https://github.com/jansel	2024-02-08 09:41:52 +00:00
Chien-Chin Huang	1d2382f141	[DDP] Use compiled_autograd to trace DDP backward allreduce (#110662 ) Summary The reducer of `DistributedDataParallel` is implemented with C++ and it is not easy to trace the allreduce launched in the reducer. This PR modifies `DistributedDataParallel` to launch one allreduce per gradient when `compiled_autograd` is enabled. The changes allow us to use `compiled_autograd` to trace the allreduce and later be optimized (fused) in the Inductor. Key Logic 1. If `ddp_python_hook` is True, we assume `compiled_autograd` is used. `DistributedDataParallel` registers `compiled_accum_grad_hook` for all parameters. 2. In the first forward() call, if `DistributedDataParallel` is not compiled, all `compiled_accum_grad_hook` are deregistered. If `DistributedDataParallel` is compiled, all `compiled_accum_grad_hook` will be compiled by `compiled_autograd`. 3. `compiled_accum_grad_hook` launches an allreduce to reduce the gradient of the parameter. Bucketing The compiled backward is slow because there is no bucketing for the allreduces. We rely on Inductor to bucket the allreduces. The bucketing is done in a separate PR. Differential Revision: [D49428482](https://our.internmc.facebook.com/intern/diff/D49428482/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/110662 Approved by: https://github.com/wconstab	2024-02-08 03:03:15 +00:00
Edward Z. Yang	169c070076	Move catch_errors_wrapper to convert_frame (#119253 ) With this change, we now have the invariant that eval_frame only contains "hot" functions that are called at runtime, as opposed to cold functions which are only called at compile time. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/119253 Approved by: https://github.com/yanboliang ghstack dependencies: #119251	2024-02-06 17:40:07 +00:00
Edward Z. Yang	790858afa9	Make start compiling stack trace omit framework frames (#119251 ) Fixes https://github.com/pytorch/pytorch/issues/119238 Here's what it looks like now: ``` $ TORCH_LOGS=+torch._dynamo.convert_frame python a.py [2024-02-05 18:52:07,248] [0/0] torch._dynamo.convert_frame: [DEBUG] torchdynamo start compiling f /data/users/ezyang/b/pytorch/a.py:3, stack (elided 5 frames): [2024-02-05 18:52:07,248] [0/0] torch._dynamo.convert_frame: [DEBUG] File "/data/users/ezyang/b/pytorch/a.py", line 7, in <module> [2024-02-05 18:52:07,248] [0/0] torch._dynamo.convert_frame: [DEBUG] f(torch.randn(2)) [2024-02-05 18:52:07,248] [0/0] torch._dynamo.convert_frame: [DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/eval_frame.py", line 453, in _fn [2024-02-05 18:52:07,248] [0/0] torch._dynamo.convert_frame: [DEBUG] return fn(*args, **kwargs) [2024-02-05 18:52:07,248] [0/0] torch._dynamo.convert_frame: [DEBUG] $ cat a.py import torch @torch.compile def f(x): return x * 2 f(torch.randn(2)) ``` The eval_frame frame is intentionally present, since what happens is you run the torch.compile wrapper, and then you actually hit the user frame to be compiled. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/119251 Approved by: https://github.com/yanboliang, https://github.com/mlazos	2024-02-06 17:40:07 +00:00
Animesh Jain	0c3a1c893e	[dynamo] Setup the globals for guard_fn without a reference to f_locals (#118447 ) UPDATE - I changed the PR because from discussion with @jansel it was clear that someone else was holding on to a reference to f_locals. This PR now solves that problem first. I removed the eval_frame.c part because it was failing tests that use `exec` or `eval` with a weird error like `no locals found when storing 'math'`. I will debug that in a separate PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/118447 Approved by: https://github.com/Skylion007, https://github.com/jansel ghstack dependencies: #118975, #118420	2024-02-05 05:39:39 +00:00
Catherine Lee	4f5785b6b3	Enable possibly-undefined error code (#118533 ) Fixes https://github.com/pytorch/pytorch/issues/118129 Suppressions automatically added with ``` import re with open("error_file.txt", "r") as f: errors = f.readlines() error_lines = {} for error in errors: match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", error) if match: file_path, line_number, error_type = match.groups() if file_path not in error_lines: error_lines[file_path] = {} error_lines[file_path][int(line_number)] = error_type for file_path, lines in error_lines.items(): with open(file_path, "r") as f: code = f.readlines() for line_number, error_type in sorted(lines.items(), key=lambda x: x[0], reverse=True): code[line_number - 1] = code[line_number - 1].rstrip() + f" # type: ignore[{error_type}]\n" with open(file_path, "w") as f: f.writelines(code) ``` Signed-off-by: Edward Z. Yang <ezyang@meta.com> Co-authored-by: Catherine Lee <csl@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/118533 Approved by: https://github.com/Skylion007, https://github.com/zou3519	2024-01-30 21:07:01 +00:00
PyTorch MergeBot	40ece2e579	Revert "Enable possibly-undefined error code (#118533 )" This reverts commit `4f13f69a45`. Reverted https://github.com/pytorch/pytorch/pull/118533 on behalf of https://github.com/clee2000 due to sorry i'm trying to figure out a codev merge conflict, if this works i'll be back to rebase and merge ([comment](https://github.com/pytorch/pytorch/pull/118533#issuecomment-1917695185))	2024-01-30 19:00:34 +00:00
Edward Z. Yang	4f13f69a45	Enable possibly-undefined error code (#118533 ) Fixes https://github.com/pytorch/pytorch/issues/118129 Suppressions automatically added with ``` import re with open("error_file.txt", "r") as f: errors = f.readlines() error_lines = {} for error in errors: match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", error) if match: file_path, line_number, error_type = match.groups() if file_path not in error_lines: error_lines[file_path] = {} error_lines[file_path][int(line_number)] = error_type for file_path, lines in error_lines.items(): with open(file_path, "r") as f: code = f.readlines() for line_number, error_type in sorted(lines.items(), key=lambda x: x[0], reverse=True): code[line_number - 1] = code[line_number - 1].rstrip() + f" # type: ignore[{error_type}]\n" with open(file_path, "w") as f: f.writelines(code) ``` Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/118533 Approved by: https://github.com/Skylion007, https://github.com/zou3519	2024-01-30 05:08:10 +00:00
Edward Z. Yang	46712b019d	Enable local_partial_types (#118467 ) When using dmypy, this setting is enabled and cannot be turned off. Force it for regular mypy too. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/118467 Approved by: https://github.com/Skylion007 ghstack dependencies: #118414, #118418, #118432	2024-01-28 13:38:22 +00:00
Shunting Zhang	fe10b1800f	LazyGraphModule (#117911 ) I feel it's easier to open a new PR rather than iterating on the previous PR (https://github.com/pytorch/pytorch/pull/105257 ) since this is more like a rewrite. In this PR, instead of changing GraphModule directly, which can easily cause BC issues, I create a LazyGraphModule class as Zachary & Jason suggested in comments on the previous PR. The difference between LazyGraphModule and GraphModule is mainly about how recompilation of the graph module happens. In GraphModule the recompilation happens 'eagerly': constructing a GraphModule will cause the recompilation. In LazyGraphModule, we just mark the module as needing recompilation. The real recompilation only happens when absolutely required (e.g. calling the forward method, accessing the code property etc.). In a lot of cases in torch.compile, the real recompilation is eventually not triggered at all. This can save a few seconds of compilation time. By default, GraphModule rather than LazyGraphModule is used. The `use_lazy_graph_module(True)` context manager can be used to pick LazyGraphModule instead. This has been applied to the torch.compile stack. Pull Request resolved: https://github.com/pytorch/pytorch/pull/117911 Approved by: https://github.com/jansel	2024-01-27 04:10:18 +00:00
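The deferred-recompilation pattern described above (mark the module dirty at construction, recompile only when the code is first needed) can be sketched in plain Python. `LazyModule` and its attributes are hypothetical and only mirror the idea; this is not the `LazyGraphModule` implementation.

```python
class LazyModule:
    """Hypothetical module that defers its expensive recompile step until first use."""

    def __init__(self, graph):
        self.graph = list(graph)
        self.compile_count = 0        # how many times the expensive step ran
        self._code = None
        self._needs_recompile = True  # constructing only marks the module dirty

    def _recompile(self):
        self.compile_count += 1
        return " -> ".join(self.graph)  # stand-in for real code generation

    @property
    def code(self):
        # The real work happens here, and only if something asks for the code.
        if self._needs_recompile:
            self._code = self._recompile()
            self._needs_recompile = False
        return self._code


module = LazyModule(["placeholder", "mul", "output"])
count_after_init = module.compile_count
first = module.code
second = module.code
```

If nothing ever reads `code`, the recompile step never runs, which is where the compile-time saving would come from.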
Peter Bell	b53cc6cf8d	[dynamo] Fix test_replay_record.py (#116230 ) This test isn't run in CI because the CI runners don't have dill installed. This fixes the tests so they run for me locally, and in the next PR I add dill to the CI so we can test it properly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116230 Approved by: https://github.com/jansel	2024-01-24 23:42:35 +00:00
Ana Basalo	5667a990fd	Chore: improve log message about cache size limit exceeded (#116557 ) Fixes #114527 Pull Request resolved: https://github.com/pytorch/pytorch/pull/116557 Approved by: https://github.com/ezyang	2024-01-17 06:07:18 +00:00
voznesenskym	33917150d3	Cleanup scope ref properly (#116169 ) Fixes https://github.com/pytorch/pytorch/issues/116143 See test in PR for a case where this happens. Discovered while debugging optimizers. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116169 Approved by: https://github.com/janeyx99, https://github.com/williamwen42, https://github.com/jansel	2023-12-28 23:29:37 +00:00
Yanbo Liang	7e12e722af	[Dynamo][12/N] Remove allowed_functions.py (#116401 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116401 Approved by: https://github.com/angelayi	2023-12-28 21:26:06 +00:00
Yanbo Liang	f657b2b1f8	[Dynamo][10/N] Remove TorchVariable and is_allowed (#116312 ) After this refactor: * ```TorchVariable``` definition and all references are removed. * All ```is_allowed``` references except one are removed. - The only one left is in ```torch/_dynamo/decorators:_disallow_in_graph_helper```. It is called when users put the ```disallow_in_graph``` decorator on a function. Since we use the lists in ```trace_rules``` to decide the function's trace rule, the decorator would only be used on customer functions rather than torch functions. I'll defer this to a separate decorator refactor PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116312 Approved by: https://github.com/jansel	2023-12-27 18:47:05 +00:00
PyTorch MergeBot	3b709d7c1e	Revert "[Dynamo][10/N] Remove TorchVariable and is_allowed (#116312 )" This reverts commit `015bd0e0a1`. Reverted https://github.com/pytorch/pytorch/pull/116312 on behalf of https://github.com/kit1980 due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/116312#issuecomment-1869825506))	2023-12-26 23:47:15 +00:00
Yanbo Liang	015bd0e0a1	[Dynamo][10/N] Remove TorchVariable and is_allowed (#116312 ) After this refactor: * ```TorchVariable``` definition and all references are removed. * All ```is_allowed``` references except one are removed. - The only one left is in ```torch/_dynamo/decorators:_disallow_in_graph_helper```. It is called when users put the ```disallow_in_graph``` decorator on a function. Since we use the lists in ```trace_rules``` to decide the function's trace rule, the decorator would only be used on customer functions rather than torch functions. I'll defer this to a separate decorator refactor PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116312 Approved by: https://github.com/jansel	2023-12-23 09:44:09 +00:00
David Berard	054f9548b4	[dynamo] Store CompilationEvents in a buffer in torch._dynamo.utils (#115788 ) Motivation: it would be nice to be able to test using the metrics in log_compilation_event; it currently dumps logs (or logs to a database in fbcode), and these are hard to use in unit tests. This change: * always record the information in torch._dynamo.utils.record_compilation_metrics; here, log into a limited-size deque to prevent the list of metrics from getting too long * if config.log_compilation_metrics, then call back into the original log_compilation_event function Pull Request resolved: https://github.com/pytorch/pytorch/pull/115788 Approved by: https://github.com/yanboliang	2023-12-18 23:26:13 +00:00
youkaichao	034e871710	[Dynamo] Look up variables from old frame, rather than copy variables to new frame; skip some copy to save time. (#115062 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115062 Approved by: https://github.com/williamwen42	2023-12-16 00:02:59 +00:00
Yanbo Liang	b4d6443bcf	[Dynamo] Log innermost user frame filename & lineno for better error aggregation (#115899 ) CompilationMetrics example: ``` frame_key='1', co_name='fn', co_filename='/data/users/ybliang/debug/debug1.py', co_firstlineno=58, cache_size=0, accumulated_cache_size=0, guard_count=None, graph_op_count=None, graph_node_count=None, graph_input_count=None, entire_frame_compile_time_s=None, backend_compile_time_s=None, fail_type="<class 'torch._dynamo.exc.Unsupported'>", fail_reason='custome dict init with args/kwargs unimplemented', fail_user_frame_filename='/data/users/ybliang/debug/debug1.py', fail_user_frame_lineno=61 ``` where: * ```fail_type``` and ```fail_reason``` are exceptions inside of Dynamo. * ```fail_user_frame_filename``` and ```fail_user_frame_lineno``` are where the original user code triggered the exception. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115899 Approved by: https://github.com/davidberard98, https://github.com/ydwu4	2023-12-15 08:24:55 +00:00
David Berard	67232199b1	[dynamo] Log shape_env_guard_count separately from guard_count (#115776 ) guard_count counts all the shape_env guards as a single guard; log the shape_env_guard_count separately so those metrics can be used. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115776 Approved by: https://github.com/yanboliang	2023-12-14 20:12:49 +00:00
David Berard	89ee3af076	[Reland][Dynamo] Don't log compilation metrics for PyTorch unit tests (#115571 ) Reland #115452, which was reverted to simplify a merge conflict with #115386 Pull Request resolved: https://github.com/pytorch/pytorch/pull/115571 Approved by: https://github.com/yanboliang	2023-12-12 01:15:54 +00:00
David Berard	5c0976fa04	Revert "[dynamo] guarded config (#111299 )" (#115386 ) This reverts commit `5927e9cbf2`. Differential Revision: [D51959266](https://our.internmc.facebook.com/intern/diff/D51959266) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115386 Approved by: https://github.com/yanboliang, https://github.com/malfet ghstack dependencies: #115384, #115401, #115385	2023-12-11 19:35:42 +00:00
David Berard	6db7b30db4	Revert "[dynamo] Cache size calc for differing config (#111300 )" (#115385 ) This reverts commit `78318d0249`. Differential Revision: [D51959268](https://our.internmc.facebook.com/intern/diff/D51959268) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115385 Approved by: https://github.com/malfet ghstack dependencies: #115384, #115401	2023-12-11 19:35:42 +00:00
PyTorch MergeBot	f06f51b152	Revert "[Dynamo] Don't log compilation metrics for PyTorch unit tests (#115452 )" This reverts commit `cd444aa075`. Reverted https://github.com/pytorch/pytorch/pull/115452 on behalf of https://github.com/davidberard98 due to Merge conflict with #115385, which already landed in fbcode ([comment](https://github.com/pytorch/pytorch/pull/115452#issuecomment-1850729965))	2023-12-11 19:21:40 +00:00
Yanbo Liang	cd444aa075	[Dynamo] Don't log compilation metrics for PyTorch unit tests (#115452 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/115452 Approved by: https://github.com/zou3519	2023-12-09 01:39:36 +00:00
rzou	c56d91ba39	Log pt2_compliant custom ops used with torch.compile (#115083) Summary: We already log non-pt2_compliant ops. This PR extends the logging to include pt2_compliant custom ops. We do not log all pt2_compliant ops (i.e. including builtin ops) because doing so would probably take too much memory. Test Plan: Tested locally. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115083 Approved by: https://github.com/yanboliang, https://github.com/williamwen42	2023-12-05 00:51:33 +00:00
Kaichao You	d114f31b30	add testcase when bytecode hook changes the bytecode; fix code map (#114487 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/114487 Approved by: https://github.com/jansel	2023-11-28 22:14:57 +00:00
Jon Chuang	78318d0249	[dynamo] Cache size calc for differing config (#111300 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/111300 Approved by: https://github.com/ezyang ghstack dependencies: #111299	2023-11-17 09:59:58 +00:00
Jon Chuang	5927e9cbf2	[dynamo] guarded config (#111299 ) --- Fixes: https://github.com/pytorch/pytorch/issues/110682 Replaces: https://github.com/pytorch/pytorch/pull/111074 The guards are installed based on config that is valid at the call to `torch.compile`, rather than at any subsequent call / triggered compilation. Subsequent compilations will restore the config if there is a config mismatch of the existing global config with the saved config. TODO: - [X] add tests Follow up PRs: - [x] add revised cache size computation (follow up PR: #111300 , based on: https://github.com/pytorch/pytorch/pull/107496) - [ ] handle run-only mode? - [ ] config restoration itself is not thread-safe (tracked: https://github.com/pytorch/pytorch/issues/111150) Pull Request resolved: https://github.com/pytorch/pytorch/pull/111299 Approved by: https://github.com/ezyang	2023-11-17 09:59:58 +00:00


150 Commits