Support for effectful operations in backward:
1/ AOTD collects metadata from the forward fn only, so the backward can use effectful ops that were not used in the forward => allow token discovery while tracing the joint function (a rough sketch of this pattern follows the list).
FunctionalTensorMode holds `_tokens`; in the joint function, after tracing the forward, we memoize `_tokens` as `_tokens_forward_output`.
2/ Tokens are added as primal inputs (forward) in EffectTokensWrapper.
Tokens that will be used in the backward end up in the partitioner's saved values; we do not control at which positions they are saved in the forward outputs.
2/ If new tokens are discovered in the backward after tracing joint_fn, they are manually appended to the end of the primals in the resulting graph (_aot_autograd/utils.py).
3/ All effectful ops traced in the backward are marked with the 'must_be_in_backward' partitioner_tag, to prevent the partitioner from placing them in the forward.
For that, FunctionalTensorMode gets a new optional state `self._effects_partitioner_tag` for effectful ops, set after tracing the forward.
There are additional changes in the partitioner to improve the handling of 'must_be_in_backward'.
4/ Unlifting tokens now runs for both forward and backward.
- Since tokens saved for backward land in non-static positions, we identify the input and output tokens to erase via the inputs and outputs of the `with_effects` operations.
- In the forward we can have input tokens, discovered in the backward, that are not used by any `with_effects` op in the forward but are saved for backward; we identify them by their position in the forward inputs.
5/ Add AOT debug logging for the graphs before unlifting and before adding the additional primals for backward tokens.
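For orientation, here is a rough, hypothetical sketch of the kind of program this enables: an effectful op used only in the backward of a compiled function. The use of `torch.ops.aten._print`, the custom autograd.Function, and the `aot_eager` backend are illustrative assumptions, not taken from this PR; the actual coverage lives in the tests below.
```python
# Illustrative only: an effectful op that appears only in backward, so its
# token has to be discovered while tracing the joint graph.
import torch

class LogInBackward(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x * 2

    @staticmethod
    def backward(ctx, grad_out):
        # Hypothetical effectful op usage in backward only.
        torch.ops.aten._print("backward called")
        return grad_out * 2

def fn(x):
    return LogInBackward.apply(x).sum()

compiled = torch.compile(fn, backend="aot_eager")
x = torch.randn(4, requires_grad=True)
compiled(x).backward()
```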
Tests:
```
python test/higher_order_ops/test_with_effects.py
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132638
Approved by: https://github.com/bdhirsh
move benchmarking out of `torch._inductor.runtime.runtime_utils` and into `torch._inductor.runtime.benchmarking`, and prefer this path over directly accessing Triton's benchmarking
Fixes #ISSUE_NUMBER
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132827
Approved by: https://github.com/eellison
Adds support for SymInts in the FakeTensor cache.
A couple notes:
1. When a SymInt is present in the input key for a FakeTensor operation, we cache on the ShapeEnv instead of using the FakeTensorMode cache. This is necessary so we don't have to remember and check the guards. It reduces cache hits, but there are diminishing returns on how much work we can do before the cache becomes more of a burden than a gain.
2. We need to be careful that, when we cache an output SymInt that is a direct copy of an input, a cache hit copies the SymNode from the input to the output. This is important because the fx-graph building code actually uses SymNode ids while building the graph, so constructing a same-content-but-different-id SymNode will fail.
3. In the cache key we store SymInts as a _PySymInputStub. These represent SymInt (and friends) but support `__hash__` and `__eq__` (which SymInt does not); a rough sketch follows these notes.
4. In the cache entry we store SymInts as a _SymIntOutputStub.
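A minimal sketch of what such a hashable wrapper can look like, purely for illustration; the class name `SymIntKey` and its body are assumptions and not the PR's actual `_PySymInputStub`:
```python
# Illustrative only: a hashable cache-key wrapper for SymInt that hashes on the
# underlying sympy expression rather than object identity.
import torch

class SymIntKey:
    def __init__(self, value: torch.SymInt) -> None:
        self.node = value.node  # the backing SymNode

    def __hash__(self) -> int:
        return hash(str(self.node.expr))

    def __eq__(self, other: object) -> bool:
        return isinstance(other, SymIntKey) and str(self.node.expr) == str(other.node.expr)
```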
Perf example:
```
python benchmarks/dynamo/timm_models.py --ci --accuracy --timing
--explain --inductor --dynamic-shapes --dynamic-batch-only --device cuda
--training --amp --total-partitions 2 --partition-id 0 --output
/tmp/training_timm_models.csv --filter crossvit_9_240
```
fake tensor cache before:
```
INFO: FakeTensor cache stats:
INFO: cache_hits: 68137
INFO: cache_misses: 837
INFO: cache_bypasses:
INFO: symbolic shape: 48224
INFO: CompositeImplicitAutograd: 917
INFO: non-fake tensor: 70
INFO: non-FakeTensor output: 62
INFO: non-builtin: 8
INFO: dynamic output shape: 1
```
and after:
```
INFO: FakeTensor cache stats:
INFO: cache_hits: 88187
INFO: cache_misses: 14233
INFO: cache_bypasses:
INFO: CompositeImplicitAutograd: 1037
INFO: non-FakeTensor output: 602
INFO: non-fake tensor: 70
INFO: unsafe view: 36
INFO: non-builtin: 8
INFO: dynamic output shape: 1
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127596
Approved by: https://github.com/eellison
ghstack dependencies: #131014, #129780
Summary:
Add `TORCH_LOGS=+fsdp` on the CLI to print FSDP logs
Example:
`TORCH_LOGS=+fsdp torchrun --standalone --nproc_per_node=2 run_fsdp.py`
Description:
Add logging to `FSDPParamGroup.pre_forward`, `FSDPParamGroup.post_forward`, `FSDPParamGroup.pre_backward`, `FSDPParamGroup.post_backward`, `FSDPState._root_pre_forward` (if it is the root), and `FSDPState._root_post_backward_final_callback`.
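`run_fsdp.py` is not part of this PR; a hypothetical minimal version that would hit these hooks could look like the following (the model, sizes, and the bare `fully_shard(model)` call are assumptions):
```python
# Hypothetical run_fsdp.py: forward + backward through an FSDP2 module, which
# exercises the pre/post forward/backward hooks that now log under +fsdp.
import torch
import torch.distributed as dist
from torch.distributed._composable.fsdp import fully_shard

def main():
    dist.init_process_group("nccl")
    torch.cuda.set_device(dist.get_rank())
    model = torch.nn.Sequential(
        torch.nn.Linear(16, 16), torch.nn.ReLU(), torch.nn.Linear(16, 16)
    ).cuda()
    fully_shard(model)
    x = torch.randn(8, 16, device="cuda")
    model(x).sum().backward()
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```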
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128663
Approved by: https://github.com/weifengpy, https://github.com/awgu
This adds dumps of MetaTensorDesc and MetaStorageDesc to structured logs
when they are triggered from Dynamo. The logs look like this:
```
V0522 08:13:25.267000 140224882566144 torch/_subclasses/meta_utils.py:195] {"describe_storage": {"id": 0, "describer_id": 0, "size": 32}, "frame_id": 0, "frame_compile_id": 0, "attempt": 0}
V0522 08:13:25.267000 140224882566144 torch/_subclasses/meta_utils.py:220] {"describe_tensor": {"id": 0, "ndim": 1, "dtype": "torch.float32", "device": "device(type='cpu')", "size": [8], "is_leaf": true, "stride": [1], "storage": 0, "view_func": "<built-in method _view_func_unsafe of Tensor object at 0x7f882959e840>", "describer_id": 0}, "frame_id": 0, "frame_compile_id": 0, "attempt": 0}
V0522 08:13:25.268000 140224882566144 torch/_subclasses/meta_utils.py:1594] {"describe_source": {"describer_id": 0, "id": 0, "source": "L['x']"}, "frame_id": 0, "frame_compile_id": 0, "attempt": 0}
```
The `describer_id` is used to disambiguate ids. We expect it to be unique per frame id, but if there is a bug it may not be. Note that you will get redundant dumps when evaluation restarts.
tlparse can use this to give a visualization of the input tensors to a model; you could also use this to generate example inputs to run graphs on.
Some care is taken to avoid redumping the tensor metadata multiple
times, which would happen ordinarily because AOTAutograd refakifies
everything after Dynamo, to deal with metadata mutation.
Partially fixes https://github.com/pytorch/pytorch/issues/126644
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126879
Approved by: https://github.com/jamesjwu
e.g. dist_ddp -> ddp
'distributed' shortcut remains unchanged
Feedback has been that it is not appealing to have the dist_ prefix,
and the main reason for it was to keep the distributed shortcuts grouped
together in the help menu. It's nice to have shorter shortcuts.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126499
Approved by: https://github.com/XilunWu, https://github.com/kwen2501
ghstack dependencies: #126322
- Sets it as a fake stack trace, as we don't have a generic comment feature.
- When verbose is disabled, this still adds a context manager and flag checks; the alternative is to use macros, but that wouldn't be usable with TORCH_LOGS.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124954
Approved by: https://github.com/jansel
Summary:
This is a reimplemented version of the FB specific code in https://www.internalfb.com/diff/D54230697
The new strategy is that we unconditionally install an FB handler on the trace_log logger (and always set its level to DEBUG). When the first log message is emitted, we check the JK/filesystem to see if we should actually do logging. If we decide not to log, we remove the handler from trace_log and are done (a generic sketch of this lazy-install pattern follows).
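Purely as an illustration of that lazy-install pattern (generic stdlib logging, not the actual internal handler):
```python
# Generic sketch: install the handler unconditionally, decide whether to keep
# it the first time a record is emitted.
import logging

trace_log = logging.getLogger("torch.__trace")

class LazyTraceHandler(logging.StreamHandler):
    def __init__(self, should_log):
        super().__init__()
        self._should_log = should_log  # e.g. consult a JK / check the filesystem
        self._decided = False

    def emit(self, record):
        if not self._decided:
            self._decided = True
            if not self._should_log():
                trace_log.removeHandler(self)  # opted out: detach and drop the record
                return
        super().emit(record)

trace_log.setLevel(logging.DEBUG)
trace_log.addHandler(LazyTraceHandler(lambda: False))
```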
build_only[github-export-checks,executorch,pytorch_benchmark,pytorch_quantization,pytorch_distributed,pytorch_distributed_gpu,pytorch_dynamo_inductor,pytorch_functorch,pytorch_fx2trt,pytorch_diff_train_tests_ads,glow_fb_pytorch_tests,training_platform,training_platform_compatibility,training_toolkit_applications,training_toolkit_examples,training_toolkit_model_optimization,dper3_pytorch,xplat_caffe2,pytorch_dev,android-pytorch-instrumentation-tests,smartpytorchgithub_first_try_merge,frl-target-determinator,f6-buck,training_platform_for_github,sigmoid_cpu,sigmoid_gpu,aiplatform_modelprocessing_for_github,accelerators_workloads_models_slimdsnn,ae_aotinductor_benchmark_test,aps_,aps_deterministic_ne_tests,dper_lib_silvertorch,torchrec,torchrec_fb,deeplearning_aot_inductor]
Test Plan:
sandcastle
```
buck2 test 'fbcode//mode/dev-nosan' fbcode//torchrec/inference/tests:test_single_gpu_executor -- --exact 'torchrec/inference/tests:test_single_gpu_executor - TorchDeployGPUTest.NestedModelSingleGPU'
buck2 test 'fbcode//mode/dev-nosan' fbcode//dper_lib/silvertorch/modules/dynamic_stats/tests:accumulators_test -- --exact 'dper_lib/silvertorch/modules/dynamic_stats/tests:accumulators_test - test_global_fixed_interval_accumulator (dper_lib.silvertorch.modules.dynamic_stats.tests.accumulators_test.GlobalFixedIntervalUnivalentAcculumatorTest)'
```
Also running a test flow with/without JK enabled
Differential Revision: D54275086
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120915
Approved by: https://github.com/yanboliang
Shorthand for `"%(levelname)s:%(name)s:%(message)s"` which is hard to
remember.
I find the default formatter annoying since just the metadata fills up
most of the width of my terminal.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120757
Approved by: https://github.com/ezyang
Overall design: https://docs.google.com/document/d/1CX_hJ0PNy9f3R1y8TJrfkSeLkvGjjjLU84BSXgS2AZ8/edit
How to read the diff:
* Most files are me augmenting pre-existing logging with structured variants. For the most part it's simple (esp FX graphs, which have a canonical string representation); it gets more complicated when I decided to JSON-ify some data structure instead of keeping the ad hoc printing (notably, guards and dynamo output graph sizes)
* torch/_functorch/_aot_autograd/collect_metadata_analysis.py is some unrelated fixes I noticed while auditing artifact logs
* torch/_logging/_internal.py has the actual trace log implementation. The trace logger is implemented as a logger named torch.__trace which is disconnected from the logging hierarchy. It gets its own handler and formatter (TorchLogsFormatter with _is_trace True). `trace_structured` is the main way to emit a trace log (a hedged example follows this list). Unusually, there's a separate "metadata" and "payload" field. The metadata field should not be too long (as it is serialized as a single line) and is always JSON (we put contextual things like compile id in it); the payload field can be long and is emitted after the metadata log line and can span multiple lines.
* torch/_logging/structured.py contains some helpers for converting Python data structures into JSON form. Notably, we have a string interning implementation here, which helps reduce the cost of serializing filenames into the log.
* test/dynamo/test_structured_trace.py the tests are cribbed from test_logging.py, but all rewritten to use expect tests on munged versions of what we'd actually output. Payloads are never tested, since they tend not be very stable.
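For flavor, a `trace_structured` call looks roughly like this; the artifact name and metadata keys below are illustrative assumptions, since the accepted names come from the structured-logging registry:
```python
# Illustrative only: one structured trace entry with short JSON metadata and a
# potentially long, multi-line payload.
from torch._logging import trace_structured

trace_structured(
    "artifact",  # hypothetical registered structured-log name
    metadata_fn=lambda: {"name": "my_debug_dump", "encoding": "string"},
    payload_fn=lambda: "line 1\nline 2\n",
)
```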
https://github.com/ezyang/tlparse is a POC Rust program that can interpret these logs.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120289
Approved by: https://github.com/Skylion007
ghstack dependencies: #120712
Yuzhen Huang was complaining to me that searching for `__recompile` no longer works. This is because the glog format prints the filename, not the logger name, so we lost the artifact name. Add it back.
Looks like:
```
V0226 15:56:04.142000 139828992779264 torch/_dynamo/guards.py:1084] [0/2] __guards: ___check_type_id(L['inputs'], 7626144)
```
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120671
Approved by: https://github.com/Skylion007
Overall design: https://docs.google.com/document/d/1CX_hJ0PNy9f3R1y8TJrfkSeLkvGjjjLU84BSXgS2AZ8/edit
How to read the diff:
* Most files are me augmenting pre-existing logging with structured variants. For the most part it's simple (esp FX graphs, which have a canonical string representation); it gets more complicated when I decided to JSON-ify some data structure instead of keeping the ad hoc printing (notably, guards and dynamo output graph sizes)
* torch/_functorch/_aot_autograd/collect_metadata_analysis.py is some unrelated fixes I noticed while auditing artifact logs
* torch/_logging/_internal.py has the actual trace log implementation. The trace logger is implemented as a logger named torch.__trace which is disconnected from the logging hierarchy. It gets its own handler and formatter (TorchLogsFormatter with _is_trace True). There's a teensy bit of FB specific code to automatically enable trace logging if a /logs directory exists. `trace_structured` is the main way to emit a trace log. Unusually, there's a separate "metadata" and "payload" field. The metadata field should not be too long (as it is serialized as a single line) and is always JSON (we put contextual things like compile id in it); the payload field can be long and is emitted after the metadata log line and can span multiple lines.
* torch/_logging/structured.py contains some helpers for converting Python data structures into JSON form. Notably, we have a string interning implementation here, which helps reduce the cost of serializing filenames into the log.
* test/dynamo/test_structured_trace.py the tests are cribbed from test_logging.py, but all rewritten to use expect tests on munged versions of what we'd actually output. Payloads are never tested, since they tend not be very stable.
https://github.com/ezyang/tlparse is a POC Rust program that can interpret these logs.
Testing that the fbcode detection works at https://www.internalfb.com/mlhub/pipelines/runs/fblearner/534553450 (Meta-only)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120289
Approved by: https://github.com/Skylion007
Before:
```
[2024-02-13 19:34:50,591] [0/0] torch._dynamo.guards.__guards: [DEBUG] GUARDS:
[2024-02-13 19:34:50,591] [0/0] torch._dynamo.guards.__guards: [DEBUG] ___check_type_id(L['x'], 70049616) # assert x.shape[0] > 2 # b.py:5 in f
[2024-02-13 19:34:50,592] [0/0] torch._dynamo.guards.__guards: [DEBUG] hasattr(L['x'], '_dynamo_dynamic_indices') == False # assert x.shape[0] > 2 # b.py:5 in f
```
After this change, the logs look like this:
```
V0214 07:00:49.354000 139646045393920 torch/_dynamo/guards.py:1023 [0/0] GUARDS:
V0214 07:00:49.354000 139646045393920 torch/_dynamo/guards.py:1039 [0/0] ___check_type_id(L['x'], 70050096) # assert x.shape[0] > 2 # b.py:5 in f
V0214 07:00:49.355000 139646045393920 torch/_dynamo/guards.py:1039 [0/0] hasattr(L['x'], '_dynamo_dynamic_indices') == False # assert x.shape[0] > 2 # b.py:5 in f
```
The main differences from what we had before:
* We don't print DEBUG/INFO/WARNING; instead, we only print a single character. DEBUG, somewhat oddly, maps to V, because it corresponds to glog VERBOSE
* The year is omitted, and a more compact representation for date/month is adopted. Somewhat perplexingly, six digits are allocated for the sub-second timestamp, even though Python typically doesn't have that level of resolution
* The thread ID is included (in a containerized environment, this thread id will be typically much lower)
* Instead of using the module name, we give a filepath, as well as the line the log message was emitted from. I think the line number is a nice touch and improvement over our old logs, but one downside is we do lose the artifact name in the log message, in case anyone was grepping for that.
* I chose to move the compile id prefix to the very end so as to keep a uniform layout before it, but I do think there are benefits to having it before the filename
Meta only: This format was reverse engineered off of 6b8bbe3b53/supervisor/logging.py and https://www.internalfb.com/code/fbsource/[e6728305a48540110f2bdba198aa74eee47290f9]/fbcode/tupperware/front_end/log_reader/filter/StreamingLogLineFilter.cpp?lines=105-114
Now, I think this may be slightly controversial, but I have chosen to apply this format *by default* in OSS. My reasoning is that many PT2 developers work with the logs in OSS, and keeping the format identical to what we run in prod will make it easier for these skills to transfer.
The non-negotiable portion of the new format is "V0213 19:28:32"; the date string is expected to be in exactly this form or Tupperware will fail to parse it as a date.
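For reference, a minimal sketch of a formatter that produces a prefix of this shape (illustrative only; the real TorchLogsFormatter in torch/_logging/_internal.py differs in detail):
```python
# Rough glog-style formatter: level character, MMDD HH:MM:SS.ffffff, thread id,
# path:lineno, then "] " and the message.
import logging
import threading
import time

class GlogLikeFormatter(logging.Formatter):
    LEVEL_CHARS = {"DEBUG": "V", "INFO": "I", "WARNING": "W", "ERROR": "E", "CRITICAL": "C"}

    def format(self, record):
        t = time.localtime(record.created)
        micros = int((record.created % 1) * 1_000_000)
        prefix = "%s%02d%02d %02d:%02d:%02d.%06d %d %s:%d" % (
            self.LEVEL_CHARS.get(record.levelname, "I"),
            t.tm_mon, t.tm_mday, t.tm_hour, t.tm_min, t.tm_sec, micros,
            threading.get_ident(), record.pathname, record.lineno,
        )
        return f"{prefix}] {record.getMessage()}"
```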
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119869
Approved by: https://github.com/oulgen, https://github.com/mlazos, https://github.com/Skylion007
Partially fixes https://github.com/pytorch/pytorch/issues/105077
Repro:
```python
import tempfile
import torch
from torch._subclasses import fake_tensor
class TheModelClass(torch.nn.Module):
    def __init__(self):
        super(TheModelClass, self).__init__()
        self.fc1 = torch.nn.Linear(5, 10)

    def forward(self, x):
        return self.fc1(x)

with tempfile.NamedTemporaryFile() as state_dict_file:
    # Create state_dict to be loaded later
    model = TheModelClass()
    torch.save(model.state_dict(), state_dict_file.name)

    fake_mode = fake_tensor.FakeTensorMode()
    with fake_mode:
        # This is where the bug is triggered
        state_dict = torch.load(state_dict_file.name)
```
Error:
```bash
Traceback (most recent call last):
File "issue_gh_torch_105077.py", line 22, in <module>
state_dict = torch.load(state_dict_file.name)
File "/opt/pytorch/torch/serialization.py", line 1014, in load
return _load(opened_zipfile,
File "/opt/pytorch/torch/serialization.py", line 1422, in _load
result = unpickler.load()
File "/opt/pytorch/torch/_utils.py", line 205, in _rebuild_tensor_v2
tensor = _rebuild_tensor(storage, storage_offset, size, stride)
File "/opt/pytorch/torch/_utils.py", line 184, in _rebuild_tensor
return t.set_(storage._untyped_storage, storage_offset, size, stride)
File "/opt/pytorch/torch/utils/_stats.py", line 20, in wrapper
return fn(*args, **kwargs)
File "/opt/pytorch/torch/_subclasses/fake_tensor.py", line 1288, in __torch_dispatch__
return self.dispatch(func, types, args, kwargs)
File "/opt/pytorch/torch/_subclasses/fake_tensor.py", line 1468, in dispatch
self.invalidate_written_to_constants(func, flat_arg_fake_tensors, args, kwargs)
File "/opt/pytorch/torch/_subclasses/fake_tensor.py", line 1733, in invalidate_written_to_constants
_, new_kwargs = normalize_function(
File "/opt/pytorch/torch/fx/operator_schemas.py", line 297, in normalize_function
torch_op_schemas = get_signature_for_torch_op(target)
File "/opt/pytorch/torch/fx/operator_schemas.py", line 167, in get_signature_for_torch_op
signatures = [_torchscript_schema_to_signature(schema) for schema in schemas]
File "/opt/pytorch/torch/fx/operator_schemas.py", line 167, in <listcomp>
signatures = [_torchscript_schema_to_signature(schema) for schema in schemas]
File "/opt/pytorch/torch/fx/operator_schemas.py", line 70, in _torchscript_schema_to_signature
arg_type = _torchscript_type_to_python_type(arg.type)
File "/opt/pytorch/torch/fx/operator_schemas.py", line 64, in _torchscript_type_to_python_type
return eval(ts_type.annotation_str, _type_eval_globals)
File "<string>", line 1, in <module>
NameError: name 'Storage' is not defined
```
This PR adds the ability to create fake tensors during `torch.load` by wrapping the `torch.Tensor.set_` call in `torch.utils._mode_utils.no_dispatch()`, which skips the fake-mode dispatcher and thus creates a real tensor. It then calls `fake_mode.from_tensor(t)` to finally create the fake tensor.
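A hedged sketch of that idea (the helper name and exact rebuild arguments below are assumptions for illustration, not the PR's actual code in torch/_utils.py):
```python
# Illustrative only: materialize the real tensor with dispatch disabled, then
# convert it into a fake tensor under the active FakeTensorMode.
import torch
from torch.utils._mode_utils import no_dispatch

def rebuild_as_fake(storage, storage_offset, size, stride, fake_mode):
    with no_dispatch():
        # set_() runs without fake-mode interception, yielding a real tensor
        t = torch.empty(0).set_(storage, storage_offset, size, stride)
    return fake_mode.from_tensor(t)
```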
Co-authored-by: Edward Z. Yang <ezyang@mit.edu>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108186
Approved by: https://github.com/ezyang
For long running batch jobs, it is best to opt for logs that are too
spammy rather than not spammy enough. This lines up DEFAULT_LOGGING
with our current internal guidance at Meta.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119207
Approved by: https://github.com/bdhirsh
This PR is the start of integrating PyTorch distributed logs into TORCH_LOGS. We already have one tag, "distributed", for all distributed components, but distributed is a very large component and we want some hierarchy and to give users the option to turn on logs only for certain submodules. So we also added tags starting with "dist_*" for each submodule. (This PR only adds some of them; we will add more down the road.)
Related discussions can be found here: https://github.com/pytorch/pytorch/issues/113544
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116434
Approved by: https://github.com/awgu, https://github.com/wanchaol
There are now 3 ways to see logs from ddpoptimizer.
1) TORCH_LOGS="distributed"
2) TORCH_LOGS="dynamo"
3) TORCH_LOGS="torch._dynamo.backends.distributed"
(1 and 2 are different supersets of 3 that also include other content)
Note: ddp_graphs is still a separate 'artifact' logger, which just
includes graph dumps from the graph-splitting process.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114376
Approved by: https://github.com/wanchaol
Followup to https://github.com/pytorch/pytorch/pull/110325 - re-add the `report_all_guard_failures` config as a logging artifact `recompiles_verbose`, with the following changes:
- Evaluating the check must be wrapped with exception handling, because the code parts following the first failure may error if evaluated (e.g. if a guard first checks the size and then tries to index, a guard failure due to insufficient size would result in an index error for the latter check).
- Added a test for this case
Sample:
```python
import torch
def fn(x):
    return torch.rand(x[-1], len(x))
opt_fn = torch.compile(fn)
opt_fn([4, 5, 6])
opt_fn([7, 8])
opt_fn([9])
```
Output (with `TORCH_LOGS="recompiles_verbose"`):
```bash
[2023-11-15 16:13:26,741] torch._dynamo.guards.__recompiles_verbose: [DEBUG] Recompiling function fn in /data/users/williamwen/pytorch/playground5.py:15
[2023-11-15 16:13:26,741] torch._dynamo.guards.__recompiles_verbose: [DEBUG] triggered by the following guard failure(s):
[2023-11-15 16:13:26,741] torch._dynamo.guards.__recompiles_verbose: [DEBUG] guard 0 failures:
[2023-11-15 16:13:26,741] torch._dynamo.guards.__recompiles_verbose: [DEBUG] - len(L['x']) == 3
[2023-11-15 16:13:26,741] torch._dynamo.guards.__recompiles_verbose: [DEBUG] - L['x'][0] == 4
[2023-11-15 16:13:26,741] torch._dynamo.guards.__recompiles_verbose: [DEBUG] - L['x'][1] == 5
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG] Recompiling function fn in /data/users/williamwen/pytorch/playground5.py:15
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG] triggered by the following guard failure(s):
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG] guard 0 failures:
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG] - len(L['x']) == 2
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG]
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG] guard 1 failures:
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG] - len(L['x']) == 3
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG] - L['x'][0] == 4
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113585
Approved by: https://github.com/jon-chuang, https://github.com/ezyang
When SymNode was refactored into its own module, this broke logging for this file, as the `dynamic` alias no longer covered it. This PR adds support for an alias to point to multiple qualified module names. To drive the refactor, I renamed `log_alias_to_log_qname` to `log_alias_to_log_qnames` and then audited all use sites. I invite you to do so as well.
For good measure, I also add dynamic to dynamo, so that I always get dynamic logs when dynamo is enabled. Empirically this will be helpful because people keep sending me dynamo debug logs that don't have dynamic logs.
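Conceptually, the change amounts to something like the sketch below; the mapping is illustrative only and the module names are assumptions, with the real registration living in torch/_logging/_internal.py:
```python
# Illustrative only: one alias can now fan out to several qualified logger names.
import logging

log_alias_to_log_qnames = {
    "dynamo": ["torch._dynamo"],
    # After the SymNode refactor, "dynamic" needs to cover both modules:
    "dynamic": [
        "torch.fx.experimental.symbolic_shapes",
        "torch.fx.experimental.sym_node",
    ],
}

def loggers_for_alias(alias):
    return [logging.getLogger(q) for q in log_alias_to_log_qnames.get(alias, [])]
```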
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113567
Approved by: https://github.com/Skylion007, https://github.com/lezcano, https://github.com/mlazos
ghstack dependencies: #113566
Previously, the way our logging system worked was that for every registered log, we would explicit set a log level for it. This would lead to unintuitive behavior when you had multiple overlapping loggers, e.g., from the module hierarchy. Specifically, if you had `TORCH_LOGS=torch`, this would not actually set the logging level for torch._dynamo to be INFO, because the default log level is WARNING, and because torch._dynamo has a registered logger 'dynamo' this would end up setting the level on torch._dynamo to be WARNING, thereby overriding the level of the parent module torch. The 'all' logger did not display this behavior, but only because it was special cased to directly modify the default log level of all other loggers (and so this would not work for any sub-hierarchies).
This PR refactors our code into a much more logical setup using NOTSET. Instead of setting the level of all loggers to some level, we instead default all loggers to NOTSET, unless a user explicitly requested logging from some logger. This means that if we have some logger which isn't explicitly mentioned by the user, parent loggers now have a chance to influence its log behavior. With this, I can eliminate the 'all' special case; 'all' really just means 'torch'. (I keep special handling directing all to torch for backwards compatibility, though arguably I could just turn all into an alias.)
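To illustrate the stdlib behavior this relies on (a generic example, nothing torch-specific):
```python
# A child logger left at NOTSET defers to its parent's effective level, so
# setting the parent is enough; an explicit child level overrides the parent.
import logging

parent = logging.getLogger("mylib")
child = logging.getLogger("mylib.sub")

parent.setLevel(logging.INFO)
child.setLevel(logging.NOTSET)              # defer to parent
assert child.getEffectiveLevel() == logging.INFO

child.setLevel(logging.WARNING)             # explicit level wins
assert child.getEffectiveLevel() == logging.WARNING
```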
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113566
Approved by: https://github.com/mlazos, https://github.com/Skylion007
This PR will cause logging.exception() to also dump the exception and stacktrace. Copied from 74723e1110/Lib/logging/__init__.py (L707-L711)
repro:
<details>
```python
import torch
import torch._inductor.config
torch._inductor.config.triton.inject_relu_bug_TESTING_ONLY = "runtime_error"
def fn(x, y):
    return (x @ y).relu()
x, y = [torch.rand((16, 16), device='cuda') for _ in range (2)]
torch.compile(fn)(x, y)
```
run with TORCHDYNAMO_REPRO_AFTER=aot TORCHDYNAMO_REPRO_LEVEL=4
</details>
before:
```
...
[2023-10-12 14:18:52,902] torch._dynamo.debug_utils: [ERROR] While minifying the program in accuracy minification mode, ran into a runtime exception which is likely an unrelated issue. Skipping this graph.
```
now:
```
...
[2023-10-12 14:18:52,902] torch._dynamo.debug_utils: [ERROR] While minifying the program in accuracy minification mode, ran into a runtime exception which is likely an unrelated issue. Skipping this graph.
Traceback (most recent call last):
File "/data/users/dberard/scripts/relu_accuracy_issue.py", line 10, in <module>
torch.compile(fn)(x, y)
...
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111164
Approved by: https://github.com/eellison
Now looks like:
```
[2023-09-08 06:04:48,532] [0/0] torch._dynamo.symbolic_convert: [DEBUG] TRACE STORE_ATTR foo [ConstantVariable(int), NNModuleVariable()]
[2023-09-08 06:04:48,532] [0/0] torch._dynamo.convert_frame: [INFO] Restarting analysis due to _dynamo/variables/nn_module.py:138 in convert_to_unspecialized
[2023-09-08 06:04:48,533] [0/0_1] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing f /data/users/ezyang/c/pytorch/a.py:6
[2023-09-08 06:04:48,533] [0/0_1] torch._dynamo.symbolic_convert.__trace_source: [DEBUG] TRACE starts_line f /data/users/ezyang/c/pytorch/a.py:6
```
I'm happy to bikeshed the exact formatting of the attempt number if you
want.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108864
Approved by: https://github.com/mlazos, https://github.com/voznesenskym
All log messages that occur while running Dynamo compilation now have `[X/Y]` added to the beginning of their message. X represents the frame being compiled, while Y says which compilation of the frame. For example, if you are debugging a frame that is repeatedly recompiling, you can look for N/0, N/1, N/2, etc. for the same N. Here is what the logs look like as you transition from one frame to another:
<img width="1372" alt="image" src="https://github.com/pytorch/pytorch/assets/13564/4897e368-1e50-4807-b342-54e911bcf087">
To accurately get this prefix added to all messages, I had to expand the scope of the `tracing` context manager. Its scope now coincides with `log_compilation_event`. To do this, I had to populate fake mode lazily in the TracingContext, since it isn't created until later, inside the OutputGraph.
This subsumes the previous X.Y logging that was solely for dynamic shapes.
Unfortunately I had to reindent some stuff. Review the diff with whitespace off.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107530
Approved by: https://github.com/anijain2305
ghstack dependencies: #107505, #107516
It looks like this:
```
[DEBUG] GUARD: ___check_type_id(L['z'][L["MyEnum"].BAR], 7640416) and L['z'][L["MyEnum"].BAR] == 10
[DEBUG] Stack:
[DEBUG] File "/data/users/ezyang/b/pytorch/test/dynamo/test_misc.py", line 6657, in <module>
[DEBUG] run_tests()
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/test_case.py", line 38, in run_tests
[DEBUG] run_tests()
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/testing/_internal/common_utils.py", line 985, in run_tests
[DEBUG] unittest.main(argv=argv)
[DEBUG] File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/main.py", line 101, in __init__
[DEBUG] self.runTests()
[DEBUG] File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/main.py", line 271, in runTests
[DEBUG] self.result = testRunner.run(self.test)
[DEBUG] File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/runner.py", line 184, in run
[DEBUG] test(result)
[DEBUG] File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/suite.py", line 84, in __call__
[DEBUG] return self.run(*args, **kwds)
[DEBUG] File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/suite.py", line 122, in run
[DEBUG] test(result)
[DEBUG] File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/suite.py", line 84, in __call__
[DEBUG] return self.run(*args, **kwds)
[DEBUG] File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/suite.py", line 122, in run
[DEBUG] test(result)
[DEBUG] File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/case.py", line 650, in __call__
[DEBUG] return self.run(*args, **kwds)
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/testing/_internal/common_utils.py", line 2521, in run
[DEBUG] self._run_with_retry(
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/testing/_internal/common_utils.py", line 2450, in _run_with_retry
[DEBUG] super_run(result=result)
[DEBUG] File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/case.py", line 591, in run
[DEBUG] self._callTestMethod(testMethod)
[DEBUG] File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
[DEBUG] method()
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/testing/_internal/common_utils.py", line 2377, in wrapper
[DEBUG] method(*args, **kwargs)
[DEBUG] File "/data/users/ezyang/b/pytorch/test/dynamo/test_misc.py", line 2529, in test_enum_as_dict_key_with_overloaded_str
[DEBUG] res = opt_fn(x)
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/eval_frame.py", line 333, in _fn
[DEBUG] return fn(*args, **kwargs)
[DEBUG] File "/data/users/ezyang/b/pytorch/test/dynamo/test_misc.py", line 2519, in fn
[DEBUG] torch._dynamo.graph_break()
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/eval_frame.py", line 493, in catch_errors
[DEBUG] return callback(frame, cache_size, hooks, frame_state)
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 637, in _convert_frame
[DEBUG] result = inner_convert(frame, cache_size, hooks, frame_state)
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 133, in _fn
[DEBUG] return fn(*args, **kwargs)
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 371, in _convert_frame_assert
[DEBUG] return _compile(
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 567, in _compile
[DEBUG] guarded_code = compile_inner(code, one_graph, hooks, transform)
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/utils.py", line 181, in time_wrapper
[DEBUG] r = func(*args, **kwargs)
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 466, in compile_inner
[DEBUG] out_code = transform_code_object(code, transform)
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/bytecode_transformation.py", line 1028, in transform_code_object
[DEBUG] transformations(instructions, code_options)
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 416, in transform
[DEBUG] tracer = InstructionTranslator(
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/symbolic_convert.py", line 2018, in __init__
[DEBUG] self.symbolic_locals = collections.OrderedDict(
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/symbolic_convert.py", line 2021, in <genexpr>
[DEBUG] VariableBuilder(
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 211, in __call__
[DEBUG] vt = self._wrap(value).clone(**self.options())
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 404, in _wrap
[DEBUG] result = {
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 405, in <dictcomp>
[DEBUG] k: VariableBuilder(
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 211, in __call__
[DEBUG] vt = self._wrap(value).clone(**self.options())
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 354, in _wrap
[DEBUG] return type_dispatch(self, value)
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 837, in wrap_literal
[DEBUG] return self.wrap_unspecialized_primitive(value)
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 1073, in wrap_unspecialized_primitive
[DEBUG] guards=self.make_guards(GuardBuilder.CONSTANT_MATCH),
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 269, in make_guards
[DEBUG] return {source.make_guard(guard) for guard in guards}
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 269, in <setcomp>
[DEBUG] return {source.make_guard(guard) for guard in guards}
[DEBUG] File "/data/users/ezyang/b/pytorch/torch/_guards.py", line 641, in make_guard
[DEBUG] return Guard(self.name(), self.guard_sou
```
One downside is I can't report *why* the guard was added. I'm not entirely sure how to do this; the problem is that guards propagate to a bunch of variables before finally getting included as part of the final set. Maybe a very, very verbose version could report stack traces at every handoff point.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107388
Approved by: https://github.com/mlazos
ghstack dependencies: #107438, #107358
This adds some utilities for conveniently working with fast combined CapturedTraceback from Python. The main goal of these utilities is to make it easier for people to use CapturedTraceback as a drop-in replacement for `traceback.extract_stack`, which is 20x slower than CapturedTraceback.
I port symbolic shapes to use the new CapturedTraceback code, to validate that the APIs work and are useful.
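A hedged sketch of the intended drop-in usage (the method names are taken as assumptions from torch/utils/_traceback.py and may differ):
```python
# Illustrative only: capture cheaply now, pay for symbolization only when the
# traceback is actually formatted.
from torch.utils._traceback import CapturedTraceback

def where_am_i() -> str:
    tb = CapturedTraceback.extract()  # fast replacement for traceback.extract_stack()
    return "".join(tb.format())       # symbolize lazily, only when printing

print(where_am_i())
```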
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107358
Approved by: https://github.com/zdevito, https://github.com/albanD
ghstack dependencies: #107438
Since Python 3.11 bytecode contains end line and column information, we attribute the source code corresponding to each bytecode in a more accurate way. For example, we can highlight a function call in a series of nested function calls, or highlight a function call spanning multiple lines.
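For background, the 3.11 stdlib feature this relies on can be seen with `dis` (a generic illustration, not code from the PR):
```python
# In 3.11+ each instruction carries start/end line and column offsets, which is
# what lets us point at the exact call inside a nested expression.
import dis
import sys

def h(x):
    return x * 5

def f(x):
    return h(h(x))  # two calls on one line; columns tell them apart

if sys.version_info >= (3, 11):
    for ins in dis.get_instructions(f):
        if ins.opname.startswith("CALL"):
            print(ins.opname, ins.positions)
```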
Sample:
```python
import torch
import torch._dynamo
from functorch.experimental.control_flow import cond
def h(x):
    return x * 5

def true_fn(x):
    return x * 2

def false_fn(x):
    return x * 3

def f(pred, x):
    x = h(
        h(h(x))
    )
    x = x[1:][:2]
    torch._dynamo.graph_break()
    x = cond(pred, true_fn, false_fn, [x])

opt_f = torch.compile(f, backend="eager")
opt_f(torch.tensor(True), torch.randn(3, 3, 3, 3))
```
Output:
```
$ TORCH_LOGS="trace_call" python playground9.py
TRACE inlined call h from f /scratch/williamwen/work/pytorch/playground9.py:16
h(h(x))
~^^^
TRACE FX call mul from h /scratch/williamwen/work/pytorch/playground9.py:6 (inline depth: 1)
return x * 5
~~^~~
TRACE inlined call h from f /scratch/williamwen/work/pytorch/playground9.py:16
h(h(x))
~^^^^^^
TRACE FX call mul_1 from h /scratch/williamwen/work/pytorch/playground9.py:6 (inline depth: 1)
return x * 5
~~^~~
TRACE inlined call h from f /scratch/williamwen/work/pytorch/playground9.py:15
x = h(
~^
h(h(x))
^^^^^^^
)
^
TRACE FX call mul_2 from h /scratch/williamwen/work/pytorch/playground9.py:6 (inline depth: 1)
return x * 5
~~^~~
TRACE FX call getitem from f /scratch/williamwen/work/pytorch/playground9.py:18
x = x[1:][:2]
~^^^^
TRACE FX call getitem_1 from f /scratch/williamwen/work/pytorch/playground9.py:18
x = x[1:][:2]
~~~~~^^^^
TRACE inlined call true_fn from <resume in f> /scratch/williamwen/work/pytorch/playground9.py:20
x = cond(pred, true_fn, false_fn, [x])
~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TRACE FX call mul from true_fn /scratch/williamwen/work/pytorch/playground9.py:9 (inline depth: 1)
return x * 2
~~^~~
TRACE inlined call false_fn from <resume in f> /scratch/williamwen/work/pytorch/playground9.py:20
x = cond(pred, true_fn, false_fn, [x])
~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TRACE FX call mul from false_fn /scratch/williamwen/work/pytorch/playground9.py:12 (inline depth: 1)
return x * 3
~~^~~
TRACE FX call cond from <resume in f> /scratch/williamwen/work/pytorch/playground9.py:20
x = cond(pred, true_fn, false_fn, [x])
~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104676
Approved by: https://github.com/ezyang
This makes it easier to exclude multi-line messages using single line
grepping. If your screen is wide enough this should not be a big
problem.
Example of what it looks like:
```
[2023-07-10 20:11:30,529] torch._dynamo.convert_frame.__guards: [DEBUG] GUARDS:
[2023-07-10 20:11:30,529] torch._dynamo.convert_frame.__guards: [DEBUG] hasattr(L['x'], '_dynamo_dynamic_indices') == False
[2023-07-10 20:11:30,529] torch._dynamo.convert_frame.__guards: [DEBUG] ___is_grad_enabled()
[2023-07-10 20:11:30,529] torch._dynamo.convert_frame.__guards: [DEBUG] not ___are_deterministic_algorithms_enabled()
[2023-07-10 20:11:30,529] torch._dynamo.convert_frame.__guards: [DEBUG] utils_device.CURRENT_DEVICE == None
```
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104932
Approved by: https://github.com/mlazos, https://github.com/albanD
Prototype for the feature request:
>When working on a codebase that is unfamiliar to you, it can be helpful to single step through all of the code to see what is getting executed, what conditional branches are taken, and where indirect function jumps go. Model x-ray uses dynamo to give you a single step log of every source code line that does something relevant (i.e., a Tensor operation)
Dynamo logs to the ~~`starts_line`~~ `trace_source` logging artifact when it starts tracing bytecode for a new source line. It logs the line of source code associated with that bytecode.
~~Dynamo logs to the `graph_source` logging when a FX GraphModule is constructed. For each node in the graph, it logs the location of the original source code associated with that node.~~
Development notes: https://docs.google.com/document/d/1LjFeHzCgDDt535QUq5HydcQs56d7jWl5RvW8TLZN19g/edit?usp=sharing
Since the draft, we removed the `graph_source` logging artifact since printing the code of `GraphModule`s already displays the original source.
Sample:
```python
import torch
from functorch.experimental.control_flow import cond
def true_fn(x):
    return x * 2

def false_fn(x):
    return x * 3

def f_cond(pred, x):
    return cond(pred, true_fn, false_fn, [x])

def f_outer(pred, x):
    y = f_cond(pred, x)
    if x.sum() > 0:
        x = x * 2
    else:
        x = x * 3
    return x, y

opt_f_cond = torch.compile(f_outer, backend="eager")
opt_f_cond(torch.tensor(True), torch.randn(3, 3))
```
Logs:
```shell
$ TORCH_LOGS="trace_source" python playground8.py
TRACE starts_line f_outer playground8.py:54
def f_outer(pred, x):
TRACE starts_line f_outer playground8.py:55
y = f_cond(pred, x)
TRACE starts_line f_cond playground8.py:51 (inline depth: 1)
def f_cond(pred, x):
TRACE starts_line f_cond playground8.py:52 (inline depth: 1)
return cond(pred, true_fn, false_fn, [x])
TRACE starts_line true_fn playground8.py:45 (inline depth: 2)
def true_fn(x):
TRACE starts_line true_fn playground8.py:46 (inline depth: 2)
return x * 2
TRACE starts_line false_fn playground8.py:48 (inline depth: 2)
def false_fn(x):
TRACE starts_line false_fn playground8.py:49 (inline depth: 2)
return x * 3
TRACE starts_line f_outer playground8.py:56
if x.sum() > 0:
TRACE starts_line <resume in f_outer> playground8.py:56
if x.sum() > 0:
TRACE starts_line <resume in f_outer> playground8.py:57
x = x * 2
TRACE starts_line <resume in f_outer> playground8.py:60
return x, y
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104013
Approved by: https://github.com/ezyang
When investigating failures in https://github.com/pytorch/pytorch/pull/100017 I realized that we were reentering FakeTensorMode even though there was already one on the stack. Although we have attempted assert for these cases in the past, e.g., as in https://github.com/pytorch/pytorch/pull/97186 it seems that the existing protections were insufficient.
In this particular case, the reapplication of FakeTensorMode was due to an interaction with the NotImplemented multiple-dispatch handling. If proxy tensor mode detects an unrecognized tensor type (this includes FakeTensor, if it is not tracked with a proxy), it will return NotImplemented to give this tensor a chance to unpack itself into proxyable operations. However, this is never the right thing for FakeTensor, where no unpacking is possible. Yet today, FakeTensor attempts to reapply the FakeTensorMode, resulting in FakeTensorMode being on the stack twice.
This PR does a number of things:
* It adds an assert in `FakeTensorMode.__torch_dispatch__` that you must not already have this mode on the stack, this is ALWAYS an error
* It modifies `FakeTensor.__torch_dispatch__` to return `NotImplemented` if the mode is already active. This prevents us from readding the mode on the stack
* It adds a new logging artifact `not_implemented` which you can use to get debug logs about all of the times a `__torch_dispatch__` handler returned NotImplemented and why it did so. Your subclass has to manually opt into this logging, but I inserted the necessary logs for ProxyTensorMode and FakeTensor(Mode)
* `with fake_mode` now no-ops if the fake mode is already on the stack, which is what users want anyway (a generic sketch of this re-entry handling follows the list)
* I am BREAKING pre-autograd tracing, because it is currently doing something weird with the original C++ mode stack. Brian is going to follow up with a fix next week.
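Purely as an illustration of that re-entry handling (a generic sketch assuming `torch.utils._python_dispatch._get_current_dispatch_mode` is available; this is not the actual FakeTensor code):
```python
# Illustrative only: make entering a mode a no-op when that exact mode instance
# is already the active dispatch mode.
import contextlib
from torch.utils._python_dispatch import TorchDispatchMode, _get_current_dispatch_mode

class ReentrantSafeMode(TorchDispatchMode):
    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        return func(*args, **(kwargs or {}))

    def maybe_enter(self):
        # Already active: entering again would put the mode on the stack twice.
        if _get_current_dispatch_mode() is self:
            return contextlib.nullcontext()
        return self
```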
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102091
Approved by: https://github.com/thiagocrepaldi, https://github.com/eellison, https://github.com/wanchaol, https://github.com/bdhirsh
Removes a check which would sometimes allow `off_by_default` artifacts to be logged if logged at a higher level.
This change will only allow artifact messages to be displayed if the artifact is enabled, regardless of level.
closes #99144
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99260
Approved by: https://github.com/lezcano