Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68074

This is the first step of many PRs towards implementing the `torch.monitor` RFC https://github.com/pytorch/rfcs/pull/30

This defines the aggregation types and the `Stat` class, and provides some simple collection of the stats. It doesn't match the RFC exactly, as it incorporates some of the comments on the RFC as well as a few changes for performance.

Changes:
* Added window_size to the stats. If specified, the stat is always computed over `window_size` values; until the current window has enough values, the stat from the previous window is reported.
* This doesn't include the push metrics yet (coming in a later PR). After more discussion, the best way to handle this looks like a hybrid where each metric can set how frequently it is logged. Fixed window_size metrics are logged each time they hit the window size, which allows both performant counters and lower-frequency push counters (window_size=1).

Performance considerations:
* Updating a stat acquires a lock on that Stat object. This should be performant unless many threads write to the same stat; the single-threaded case will typically use a futex and should be quite fast.
* Adding, removing, or fetching all stats takes a global lock on the stat list -- this shouldn't be an issue since these events happen infrequently.
* Fetching stats accesses one stat at a time instead of holding a global lock. The exported values are therefore linearizable per stat but not serializable across multiple stats, which is not expected to be an issue.

Next steps:
1. Add a StatCollector interface for push-style metrics
2. Add pybind interfaces to expose to Python
3. Add default metric providers
4. Integrate into the Kineto trace view

Test Plan: buck test //caffe2/test/cpp/monitor:monitor

CI

Reviewed By: kiukchung

Differential Revision: D32266032

fbshipit-source-id: dab8747b4712f5dba5644387817a3a0fda18b66a
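The windowed aggregation and per-stat locking described in the summary can be illustrated with a small standalone sketch. The names below (`WindowedStat`, `add`, `get`) are illustrative assumptions, not the actual `torch::monitor` API; the sketch only models the behavior the summary describes: values are aggregated over `window_size` samples, the previously completed window is reported until the current one fills, and each update takes a lock on its own stat object rather than a global one.

```cpp
#include <cstddef>
#include <mutex>
#include <numeric>
#include <vector>

// Illustrative sketch only -- not the torch::monitor API.
// Aggregates values over a fixed window_size; until the current window
// fills, get() keeps reporting the last completed window's value.
class WindowedStat {
 public:
  explicit WindowedStat(std::size_t window_size) : window_size_(window_size) {}

  void add(double v) {
    // Per-stat lock: contention only occurs when many threads
    // update the *same* stat concurrently.
    std::lock_guard<std::mutex> guard(mu_);
    current_.push_back(v);
    if (current_.size() == window_size_) {
      // Window is complete: publish its aggregate and start a new window.
      published_ = std::accumulate(current_.begin(), current_.end(), 0.0);
      current_.clear();
    }
  }

  double get() const {
    std::lock_guard<std::mutex> guard(mu_);
    return published_;  // aggregate of the last completed window
  }

 private:
  const std::size_t window_size_;
  mutable std::mutex mu_;
  std::vector<double> current_;  // values in the in-progress window
  double published_ = 0.0;       // aggregate of the last completed window
};
```

With `window_size = 3`, three `add()` calls publish one aggregated value; subsequent calls start a new window while `get()` continues returning the previously published aggregate, matching the "reports the previous stats" behavior noted above.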
| Name |
|---|
| _C |
| _masked |
| ao |
| autograd |
| backends |
| contrib |
| cpu |
| csrc |
| cuda |
| distributed |
| distributions |
| fft |
| for_onnx |
| futures |
| fx |
| jit |
| legacy |
| lib |
| linalg |
| multiprocessing |
| nn |
| onnx |
| optim |
| package |
| profiler |
| quantization |
| sparse |
| special |
| testing |
| utils |
| __config__.py |
| __future__.py |
| __init__.py |
| _appdirs.py |
| _classes.py |
| _deploy.py |
| _jit_internal.py |
| _linalg_utils.py |
| _lobpcg.py |
| _lowrank.py |
| _namedtensor_internals.py |
| _ops.py |
| _python_dispatcher.py |
| _six.py |
| _sources.py |
| _storage_docs.py |
| _tensor_docs.py |
| _tensor_str.py |
| _tensor.py |
| _torch_docs.py |
| _utils_internal.py |
| _utils.py |
| _VF.py |
| _vmap_internals.py |
| abi-check.cpp |
| autocast_mode.py |
| CMakeLists.txt |
| custom_class_detail.h |
| custom_class.h |
| deploy.h |
| extension.h |
| functional.py |
| hub.py |
| library.h |
| overrides.py |
| py.typed |
| quasirandom.py |
| random.py |
| README.txt |
| script.h |
| serialization.py |
| storage.py |
| torch_version.py |
| types.py |
Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TH/THC provide some hpp headers, which are proper C++ headers rather than C headers. These headers serve double duty as *internal implementation detail* headers, whose contents should largely not be used by external clients. Ideally, we would not install these headers at all; instead, you should use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`) to manipulate these structs. However, there are a few places in torch/csrc where we violate this abstraction. They are marked with a pointer to this note. Each of those sites will have to be refactored when we refactor the guts of THTensor and related structures.
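As an illustration of the distinction the note draws, the fragment below contrasts the two access patterns. The type and function names (`THXTensor`, `THXTensor_nDimension`) are hypothetical stand-ins, not the real TH declarations; the point is only that callers should go through functions declared in the public `.h` header instead of reaching into struct fields that only the internal `.hpp` header exposes.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only -- not the real TH types.

// What an internal .hpp header might expose: the concrete struct layout.
struct THXTensor {
  int64_t ndim;
  // ... sizes, strides, storage, etc.
};

// What the public .h header would declare: an accessor that hides the layout.
inline int64_t THXTensor_nDimension(const THXTensor* tensor) {
  return tensor->ndim;
}

int64_t client_code(const THXTensor* t) {
  // Abstraction violation: `t->ndim` would tie the caller to the internal
  // layout defined in the .hpp header.
  // Preferred: use the public function so the caller only needs the .h header.
  return THXTensor_nDimension(t);
}
```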