mirror of
https://github.com/zebrajr/pytorch.git
synced 2025-12-07 12:21:27 +01:00
Summary: When raising an exception here this causes pybind11's dispatcher to kick in, which causes aiplatform's logic to kick in (aiplatform::error_reporting::util::printAddressesWithBestEffortLocationInfo), which ultimately uses `folly::symbolizer::Symbolizer::symbolize` for building up the stack trace. In 3.8 this uses about 3.62% of the CPU time per pyperf (https://fburl.com/scuba/pyperf_experimental/on_demand/oi554uvy). In Cinder 3.8 for some reason this is worse - using 5.94% of the CPU. This exception is happening when doing a hasattr() on `prims` for things like `bitwise_left_shift` which don't exist: https://www.internalfb.com/code/fbsource/[2d695f650d00]/fbcode/caffe2/torch/_inductor/lowering.py?lines=590 That exception is ultimately going to be swallowed anyway, and the stack trace has no meaningful value. Furthermore because this is kind of an expected outcome in the code versus some random C++ exception the stack trace is less valuable as well. This changes this to return a (None, None) on the failure case instead of returning a valid op/overload list, avoiding the exception, and reclaiming the 3.62%-5.94% of time. Test Plan: Existing CI and perf run: https://fburl.com/scuba/pyperf_experimental/on_demand/oi554uvy Differential Revision: D50018789 Pull Request resolved: https://github.com/pytorch/pytorch/pull/111438 Approved by: https://github.com/davidberard98 |
||
|---|---|---|
| .. | ||
| _awaits | ||
| _C | ||
| _C_flatbuffer | ||
| _custom_op | ||
| _decomp | ||
| _dispatch | ||
| _dynamo | ||
| _export | ||
| _functorch | ||
| _higher_order_ops | ||
| _inductor | ||
| _lazy | ||
| _library | ||
| _logging | ||
| _numpy | ||
| _prims | ||
| _prims_common | ||
| _refs | ||
| _subclasses | ||
| amp | ||
| ao | ||
| autograd | ||
| backends | ||
| compiler | ||
| contrib | ||
| cpu | ||
| csrc | ||
| cuda | ||
| distributed | ||
| distributions | ||
| export | ||
| fft | ||
| func | ||
| futures | ||
| fx | ||
| jit | ||
| legacy | ||
| lib | ||
| linalg | ||
| masked | ||
| monitor | ||
| mps | ||
| multiprocessing | ||
| nested | ||
| nn | ||
| onnx | ||
| optim | ||
| package | ||
| profiler | ||
| quantization | ||
| signal | ||
| sparse | ||
| special | ||
| testing | ||
| utils | ||
| __config__.py | ||
| __future__.py | ||
| __init__.py | ||
| _appdirs.py | ||
| _classes.py | ||
| _compile.py | ||
| _custom_ops.py | ||
| _deploy.py | ||
| _guards.py | ||
| _jit_internal.py | ||
| _linalg_utils.py | ||
| _lobpcg.py | ||
| _lowrank.py | ||
| _meta_registrations.py | ||
| _namedtensor_internals.py | ||
| _ops.py | ||
| _python_dispatcher.py | ||
| _sources.py | ||
| _storage_docs.py | ||
| _streambase.py | ||
| _tensor_docs.py | ||
| _tensor_str.py | ||
| _tensor.py | ||
| _torch_docs.py | ||
| _utils_internal.py | ||
| _utils.py | ||
| _VF.py | ||
| _vmap_internals.py | ||
| _weights_only_unpickler.py | ||
| abi-check.cpp | ||
| CMakeLists.txt | ||
| custom_class_detail.h | ||
| custom_class.h | ||
| extension.h | ||
| functional.py | ||
| hub.py | ||
| library.h | ||
| library.py | ||
| overrides.py | ||
| py.typed | ||
| quasirandom.py | ||
| random.py | ||
| README.txt | ||
| return_types.py | ||
| script.h | ||
| serialization.py | ||
| storage.py | ||
| torch_version.py | ||
| types.py | ||
| version.py.tpl | ||
Note [TH abstraction violation] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ TH/THC provide some hpp headers, which are proper C++ headers rather than C headers. These headers serve double duty as *internal implementation detail* headers, whose contents should largely not be used by external clients. Ideally, we would not install these headers at all; instead, you should use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`) to manipulate these structs. However, there are a few places in torch/csrc where we violate this abstraction. They are marked with a pointer to this note. Each of those sites will have to be refactored when we refactor the guts of THTensor and related structures.