pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Dino Viehland 5b71834785 Avoid c++ exception and stack trace (#111438)		2023-10-26 23:55:34 +00:00

Summary:
When raising an exception here this causes pybind11's dispatcher to kick in, which causes aiplatform's logic to kick in (aiplatform::error_reporting::util::printAddressesWithBestEffortLocationInfo), which ultimately uses `folly::symbolizer::Symbolizer::symbolize` for building up the stack trace. In 3.8 this uses about 3.62% of the CPU time per pyperf (https://fburl.com/scuba/pyperf_experimental/on_demand/oi554uvy). In Cinder 3.8 for some reason this is worse - using 5.94% of the CPU.

This exception is happening when doing a hasattr() on `prims` for things like `bitwise_left_shift` which don't exist: https://www.internalfb.com/code/fbsource/[2d695f650d00]/fbcode/caffe2/torch/_inductor/lowering.py?lines=590

That exception is ultimately going to be swallowed anyway, and the stack trace has no meaningful value. Furthermore because this is kind of an expected outcome in the code versus some random C++ exception the stack trace is less valuable as well.

This changes this to return a (None, None) on the failure case instead of returning a valid op/overload list, avoiding the exception, and reclaiming the 3.62%-5.94% of time.

Test Plan: Existing CI and perf run: https://fburl.com/scuba/pyperf_experimental/on_demand/oi554uvy

Differential Revision: D50018789
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111438
Approved by: https://github.com/davidberard98
_awaits
_C	Add regex matching to Inductor all2all collective unit tests (#112077)	2023-10-26 08:29:30 +00:00
_C_flatbuffer
_custom_op	[custom op] Use canonical API to constrain unbacked values (#108372)	2023-10-10 05:14:28 +00:00
_decomp	Do not import sympy within torch._prims_common (#112034)	2023-10-26 12:53:25 +00:00
_dispatch
_dynamo	Split SymNode into its own file (#112037)	2023-10-26 23:32:27 +00:00
_export	[aotinductor] Pass TorchIR to AOTInductor (#110020)	2023-10-26 15:54:31 +00:00
_functorch	Split SymNode into its own file (#112037)	2023-10-26 23:32:27 +00:00
_higher_order_ops	[inductor] Implement clone removal for user defined triton kernel via reinplace_scatters (#111627)	2023-10-22 22:28:00 +00:00
_inductor	Split SymNode into its own file (#112037)	2023-10-26 23:32:27 +00:00
_lazy
_library	Make torch.library.define consistent with the new APIs (#111307)	2023-10-16 22:32:23 +00:00
_logging	Fix unit tests and add logging for Inductor intra-graph reordering (#111981)	2023-10-25 18:19:43 +00:00
_numpy	Revert "WIP / TST: allow testing torch._numpy under Dynamo (#110401)"	2023-10-25 18:21:16 +00:00
_prims	Use torch._check for cat error checking (#111035)	2023-10-12 03:28:27 +00:00
_prims_common	Do not import sympy within torch._prims_common (#112034)	2023-10-26 12:53:25 +00:00
_refs	[ATen] Support multi dim any and all reductions (#110310)	2023-10-24 21:33:53 +00:00
_subclasses	Do not import sympy within torch._prims_common (#112034)	2023-10-26 12:53:25 +00:00
amp
ao	[quant][pt2e][be] Cleanup observer insertion logic (#111828)	2023-10-25 03:48:36 +00:00
autograd	Enable flake8-bugbear B020 lint (#110823)	2023-10-24 22:43:47 +00:00
backends	typo: add space after cudnn error messages (#110806)	2023-10-08 20:58:40 +00:00
compiler	Add cudagraph_mark_step_begin in torch.compiler, reference in error message (#111722)	2023-10-25 21:53:21 +00:00
contrib
cpu	Add current_device() to torch.cpu (#110987)	2023-10-11 05:13:10 +00:00
csrc	Avoid c++ exception and stack trace (#111438)	2023-10-26 23:55:34 +00:00
cuda	bypass nvml for torch.cuda.device_count() if rocm (#110418)	2023-10-23 16:15:48 +00:00
distributed	Revert "[2D] Enable 2D optimizer set_state_dict() (#111778)"	2023-10-26 00:18:30 +00:00
distributions
export	Do not import sympy within torch._prims_common (#112034)	2023-10-26 12:53:25 +00:00
fft
func
futures
fx	Split SymNode into its own file (#112037)	2023-10-26 23:32:27 +00:00
jit	Revert "Remove deprecated fbgemm operators (#104535)"	2023-10-25 16:34:16 +00:00
legacy
lib	Revert "Move at::{Refcounted,}MapAllocator to c10 (#109881)"	2023-10-13 17:57:53 +00:00
linalg
masked	Make is_sparse a property of MaskedTensor (#110725)	2023-10-09 22:35:38 +00:00
monitor
mps
multiprocessing	Multiprocessing support for NT (#110292)	2023-10-10 21:58:19 +00:00
nested	Add compile support for NT unbind (#111531)	2023-10-23 21:16:20 +00:00
nn	keep sync bn training flag same with converted bn's training flag (#111998)	2023-10-26 08:18:08 +00:00
onnx	[ONNX] A better way to safe guard 2GB model serialization (#111984)	2023-10-25 19:18:37 +00:00
optim	Make step() faster by passing in a tensor vs scalar 1 (#111084)	2023-10-20 01:34:08 +00:00
package	Fix typo under torch directory (#110824)	2023-10-09 19:16:43 +00:00
profiler	Reland [Profiler] Improve the docstring for export_memory_timeline (#110983)	2023-10-11 16:42:05 +00:00
quantization	Add CUTLASS-based support for mixed dtypes matrix multiplication (#110981)	2023-10-11 21:47:52 +00:00
signal
sparse	[sparse] semi-structured sparse + torch.compile support (#111049)	2023-10-24 02:23:20 +00:00
special
testing	Add bf16 support to replicate padding (#112099)	2023-10-26 20:30:49 +00:00
utils	Split SymNode into its own file (#112037)	2023-10-26 23:32:27 +00:00
__config__.py
__future__.py
__init__.py	Split SymNode into its own file (#112037)	2023-10-26 23:32:27 +00:00
_appdirs.py
_classes.py
_compile.py
_custom_ops.py	[test][docs] Fix doctest warnings for syntax errors (#110517)	2023-10-05 00:00:06 +00:00
_deploy.py
_guards.py	Do not import sympy within torch._prims_common (#112034)	2023-10-26 12:53:25 +00:00
_jit_internal.py	Fix typo under torch directory (#110824)	2023-10-09 19:16:43 +00:00
_linalg_utils.py
_lobpcg.py
_lowrank.py
_meta_registrations.py	Do not import sympy within torch._prims_common (#112034)	2023-10-26 12:53:25 +00:00
_namedtensor_internals.py
_ops.py	Avoid c++ exception and stack trace (#111438)	2023-10-26 23:55:34 +00:00
_python_dispatcher.py
_sources.py
_storage_docs.py	Document torch.from_file and fix UntypedStorage.from_file docs (#111688)	2023-10-25 19:28:11 +00:00
_streambase.py	[dynamo][stream]support device-agnostic stream in dynamo and capture stream/event method in fx graph (#108312)	2023-10-22 13:22:58 +00:00
_tensor_docs.py	Add `torch.utils.deterministic.fill_uninitialized_memory` flag (#111377)	2023-10-26 02:39:06 +00:00
_tensor_str.py
_tensor.py	Pickle support for NT (#110219)	2023-09-29 15:30:06 +00:00
_torch_docs.py	Add `torch.utils.deterministic.fill_uninitialized_memory` flag (#111377)	2023-10-26 02:39:06 +00:00
_utils_internal.py
_utils.py	Fix typo under torch directory (#110824)	2023-10-09 19:16:43 +00:00
_VF.py
_vmap_internals.py
_weights_only_unpickler.py
abi-check.cpp
CMakeLists.txt	[ROCm] remove HCC references (#111975)	2023-10-26 02:39:10 +00:00
custom_class_detail.h
custom_class.h
extension.h
functional.py	raise instead of skip in test/test_meta.py (#110939)	2023-10-17 10:17:43 +00:00
hub.py
library.h	[Reland2] Remove calls of c10::either (#110487)	2023-10-06 00:25:15 +00:00
library.py	[torch.library] Clarify torch.library.define's schema (#111915)	2023-10-25 21:20:54 +00:00
overrides.py	Revert "Remove deprecated fbgemm operators (#104535)"	2023-10-25 16:34:16 +00:00
py.typed
quasirandom.py
random.py	use torch.xpu.manual_seed_all in torch.seed (#110376)	2023-10-03 13:41:55 +00:00
README.txt
return_types.py
script.h
serialization.py	fix get device index if has _utils._get_device_index in privateuse1 (#108123)	2023-10-07 06:18:59 +00:00
storage.py	Document torch.from_file and fix UntypedStorage.from_file docs (#111688)	2023-10-25 19:28:11 +00:00
torch_version.py
types.py	Unify torch.SymInt and torch.types.SymInt (#110573)	2023-10-24 16:17:23 +00:00
version.py.tpl

README.txt

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers serve double duty as *internal implementation
detail* headers, whose contents should largely not be used by external
clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.