mirror of
https://github.com/zebrajr/pytorch.git
synced 2025-12-06 12:20:52 +01:00
Overall design: https://docs.google.com/document/d/1CX_hJ0PNy9f3R1y8TJrfkSeLkvGjjjLU84BSXgS2AZ8/edit

How to read the diff:

* Most files are me augmenting pre-existing logging with structured variants. For the most part it's simple (esp. FX graphs, which have a canonical string representation); it gets more complicated where I decided to JSON-ify some data structure instead of keeping the ad hoc printing (notably, guards and dynamo output graph sizes).
* torch/_functorch/_aot_autograd/collect_metadata_analysis.py contains some unrelated fixes I noticed while auditing artifact logs.
* torch/_logging/_internal.py has the actual trace log implementation. The trace logger is implemented as a logger named torch.__trace, which is disconnected from the logging hierarchy. It gets its own handler and formatter (TorchLogsFormatter with _is_trace True). `trace_structured` is the main way to emit a trace log. Unusually, there are separate "metadata" and "payload" fields. The metadata field should not be too long (it is serialized as a single line) and is always JSON (we put contextual things like compile id in it); the payload field can be long, is emitted after the metadata log line, and can span multiple lines.
* torch/_logging/structured.py contains some helpers for converting Python data structures into JSON form. Notably, we have a string interning implementation here, which helps reduce the cost of serializing filenames into the log.
* test/dynamo/test_structured_trace.py: the tests are cribbed from test_logging.py, but all rewritten to use expect tests on munged versions of what we'd actually output. Payloads are never tested, since they tend not to be very stable.

https://github.com/ezyang/tlparse is a POC Rust program that can interpret these logs.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120289
Approved by: https://github.com/Skylion007
ghstack dependencies: #120712
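To make the metadata/payload split concrete, here is a minimal sketch using only the standard library. It is not the torch/_logging/_internal.py implementation: the logger name `sketch.__trace`, the helper `emit_structured`, the `graph_dump` record name, and the tab-indented payload lines are all assumptions made for illustration.

```python
import io
import json
import logging

# Hypothetical logger for this sketch. Setting propagate=False keeps its
# records away from handlers on ancestor loggers, so it only uses its own.
sketch_log = logging.getLogger("sketch.__trace")
sketch_log.propagate = False
sketch_log.setLevel(logging.DEBUG)

# Capture output in memory so the result can be inspected.
buffer = io.StringIO()
sketch_log.addHandler(logging.StreamHandler(buffer))


def emit_structured(name, metadata_fn, payload_fn=None):
    """Write one single-line JSON metadata record, then an optional payload.

    Both arguments are callables so that nothing is computed when the
    logger is disabled.
    """
    if not sketch_log.isEnabledFor(logging.DEBUG):
        return
    lines = [json.dumps({name: metadata_fn()})]
    if payload_fn is not None:
        # Assumption: payload lines are indented with a tab so a parser can
        # tell them apart from the next metadata line.
        lines.extend("\t" + line for line in payload_fn().split("\n"))
    sketch_log.debug("\n".join(lines))


emit_structured(
    "graph_dump",
    lambda: {"compile_id": "0/0"},
    lambda: "def forward(x):\n    return x + 1",
)
output_lines = buffer.getvalue().splitlines()
```

In this sketch the first output line is always parseable as JSON on its own, while the payload can be arbitrarily long without affecting how the metadata line is read.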
38 lines
873 B
Python
"""
Utilities for converting data types into structured JSON for dumping.
"""

import traceback
from typing import Dict, Sequence

import torch._logging._internal


# Maps each interned string to its integer ID, assigned in insertion order.
INTERN_TABLE: Dict[str, int] = {}


def intern_string(s: str) -> int:
    r = INTERN_TABLE.get(s, None)
    if r is None:
        # First time this string is seen: assign the next ID and emit a
        # "str" trace record carrying the (string, ID) pair.
        r = len(INTERN_TABLE)
        INTERN_TABLE[s] = r
        torch._logging._internal.trace_structured(
            "str", lambda: (s, r), suppress_context=True
        )
    return r


def from_traceback(tb: Sequence[traceback.FrameSummary]) -> object:
    # Convert each frame into a JSON-friendly dict; filenames are interned
    # so repeated paths are stored as small integers.
    r = []
    for frame in tb:
        # dict naming convention here coincides with
        # python/combined_traceback.cpp
        r.append(
            {
                "line": frame.lineno,
                "name": frame.name,
                "filename": intern_string(frame.filename),
            }
        )
    return r
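The interning scheme above can be tried without importing torch. The following is a sketch under stated assumptions, not the module itself: the `emitted` list stands in for the "str" trace records that `trace_structured` would write, and the frames are built by hand with made-up filenames.

```python
import traceback
from typing import Dict, List, Tuple

intern_table: Dict[str, int] = {}
# Hypothetical stand-in for the "str" trace records.
emitted: List[Tuple[str, int]] = []


def intern_string(s: str) -> int:
    idx = intern_table.get(s)
    if idx is None:
        idx = len(intern_table)
        intern_table[s] = idx
        emitted.append((s, idx))  # each new string is announced exactly once
    return idx


def from_traceback(tb):
    return [
        {"line": f.lineno, "name": f.name, "filename": intern_string(f.filename)}
        for f in tb
    ]


# Made-up frames; lookup_line=False avoids reading source files from disk.
frames = [
    traceback.FrameSummary("model.py", 10, "forward", lookup_line=False),
    traceback.FrameSummary("model.py", 42, "helper", lookup_line=False),
    traceback.FrameSummary("utils.py", 7, "add", lookup_line=False),
]
result = from_traceback(frames)
```

Both frames from `model.py` share ID 0 here, so the full path would be written to the log only once, which is the saving the commit message describes for filenames.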