pytorch/torch
Riley Dulin 3be150653c [torch][ao] Add customizable loss function to NodeAccuracySummary (#136282)
Summary:
Add a loss function callback to NodeAccuracySummary so that users can
supply their own loss function.

Also, fix some type errors and propagate clearer exception messages when
unexpected tensor comparisons occur. Finally, make
`generate_numeric_debug_handle` robust to being called multiple times on
the same model by ensuring it never reuses existing IDs.
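
A minimal sketch of what such a callback might look like; the `loss_function`
keyword and the call site shown are assumptions based on this summary, not the
exact API:

```python
import torch

def sqnr_loss(ref: torch.Tensor, actual: torch.Tensor) -> torch.Tensor:
    # Example custom loss: signal-to-quantization-noise ratio in dB.
    # clamp_min guards against a zero denominator when ref == actual.
    noise = (ref - actual).norm().clamp_min(1e-12)
    return 20 * torch.log10(ref.norm() / noise)

# Hypothetical call site: score each node's reference/actual tensor pair
# with the user-supplied loss when building NodeAccuracySummary entries.
# summaries = compare_results(ref_results, actual_results, loss_function=sqnr_loss)
```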

Test Plan: Added a test for this case in `test_numeric_debugger`.
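
A hedged sketch of the repeated-call scenario that test guards against; the
import path, the `"numeric_debug_handle"` meta key, and the `handle_ids`
helper are assumptions for illustration only:

```python
import torch
from torch.ao.quantization import generate_numeric_debug_handle  # assumed import path

class M(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.relu(x + 1)

ep = torch.export.export(M(), (torch.randn(2, 2),))

def handle_ids(ep):
    # Hypothetical helper: collect every debug-handle ID stamped on node meta.
    key = "numeric_debug_handle"  # assumed meta key
    return [n.meta[key] for n in ep.graph.nodes if key in n.meta]

generate_numeric_debug_handle(ep)
first = handle_ids(ep)
generate_numeric_debug_handle(ep)  # second call on the same model
second = handle_ids(ep)
# Existing handles must survive, and no ID may be handed out twice.
assert first and len(set(second)) == len(second)
```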

Reviewed By: jerryzh168

Differential Revision: D62898297

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136282
Approved by: https://github.com/jerryzh168
2024-09-24 03:28:12 +00:00
_awaits
_C Revert "[PT2/Profiler] Add Context Info to Torch-Compiled Regions (#132765)" 2024-09-20 17:10:27 +00:00
_C_flatbuffer
_custom_op
_decomp Remove prims.slice_in_dim and prims.slice (#136150) 2024-09-23 01:27:22 +00:00
_dispatch
_dynamo Remove vt argument in raise_observed_exception (#136037) 2024-09-24 02:36:57 +00:00
_export [export] Deserialize args with python keyword names (#136036) 2024-09-17 18:13:14 +00:00
_functorch Make AOTAutogradCache support remote FXGraphCache (#136173) 2024-09-23 17:24:27 +00:00
_higher_order_ops Revert "Allow fx graph caching higher order operators (opt-in) (#135877)" 2024-09-23 09:04:24 +00:00
_inductor [inductor] Log precompilation time (#136395) 2024-09-24 01:47:54 +00:00
_lazy
_library [BE]: Update mypy to 1.11.2 (#133816) 2024-09-16 19:44:11 +00:00
_logging Existing mypy issues (#136236) 2024-09-24 01:02:07 +00:00
_numpy
_prims Remove prims.slice_in_dim and prims.slice (#136150) 2024-09-23 01:27:22 +00:00
_prims_common
_refs Remove prims.slice_in_dim and prims.slice (#136150) 2024-09-23 01:27:22 +00:00
_strobelight [Pytorch] Cleanup Strobelight URL and shorten for readability (#136102) 2024-09-16 18:10:33 +00:00
_subclasses [BE]: Update mypy to 1.11.2 (#133816) 2024-09-16 19:44:11 +00:00
_vendor
amp
ao [torch][ao] Add customizable loss function to NodeAccuracySummary (#136282) 2024-09-24 03:28:12 +00:00
autograd Param fixes in docstring (#136097) 2024-09-21 18:56:34 +00:00
backends Extending the Pytorch vec backend for SVE (ARM) (#119571) 2024-09-18 18:59:10 +00:00
compiler
contrib
cpu
csrc [AMD] Skipping 0 byte send/recv for AMD GPU (#136362) 2024-09-23 19:14:12 +00:00
cuda [ROCm][CI] upgrade CI to ROCm 6.2 (#132555) 2024-09-20 17:39:31 +00:00
distributed [Pipelining] Allow non-0 stages to accept kwargs (#136416) 2024-09-23 23:50:59 +00:00
distributions [BE]: Update mypy to 1.11.2 (#133816) 2024-09-16 19:44:11 +00:00
export [TorchRec][PT2 IR][APF] short circuit the flatten/unflatten between EBC and KTRegroupAsDict modules (#136045) 2024-09-17 18:42:56 +00:00
fft
func
futures
fx Refactor maybe_evaluate_static into a worker function off of ShapeEnv (#135107) 2024-09-23 14:39:20 +00:00
jit
legacy
lib
linalg
masked [BE]: Update mypy to 1.11.2 (#133816) 2024-09-16 19:44:11 +00:00
monitor
mps
mtia [MTIA] Support torch.cuda.get_device_capability equivalent API on MTIA (#135889) 2024-09-17 17:42:56 +00:00
multiprocessing [torch/multiprocessing] Use multiprocessing.reduction.register ForkingPickler.register to register custom tensor and storage reductions (#135030) 2024-09-16 20:07:29 +00:00
nested Support rms_norm() for NJT (#135872) 2024-09-17 18:09:20 +00:00
nn Support embedding_bag() with NJT input (#135888) 2024-09-23 17:35:19 +00:00
onnx [CODEMOD][caffe2] use npt.NDArray instead of np.ndarray in type annotations (#136288) 2024-09-19 12:40:36 +00:00
optim Add back optim type hints that were lost when *.pyi files were removed (#136185) 2024-09-17 15:45:15 +00:00
package [3.13] fix 3.13 pickle error in torch/package (#136049) 2024-09-14 14:28:09 +00:00
profiler [Profiler] Torch Profiler distributed info is not JSON serializable (#135548) 2024-09-13 02:22:33 +00:00
quantization
signal
sparse Add scaling arguments to bsr_dense_addmm (#136104) 2024-09-16 20:26:54 +00:00
special
testing Support embedding_bag() with NJT input (#135888) 2024-09-23 17:35:19 +00:00
utils Param fixes in docstring (#136097) 2024-09-21 18:56:34 +00:00
xpu [Intel GPU] Add XPU memory-related APIs (#129919) 2024-09-07 11:15:17 +00:00
__config__.py
__future__.py
__init__.py Revert "Add deterministic path for CUDA cumsum (#136224)" 2024-09-23 19:57:13 +00:00
_appdirs.py
_classes.py
_compile.py
_custom_ops.py
_deploy.py
_guards.py Revert "[aotd] Fix freezing API for subclasses (#136265)" 2024-09-23 16:25:05 +00:00
_jit_internal.py
_linalg_utils.py
_lobpcg.py
_lowrank.py
_meta_registrations.py Add decomps for max_unpool (#133146) 2024-09-20 21:35:25 +00:00
_namedtensor_internals.py
_ops.py Revert "Allow fx graph caching higher order operators (opt-in) (#135877)" 2024-09-23 09:04:24 +00:00
_python_dispatcher.py
_size_docs.py
_sources.py
_storage_docs.py
_streambase.py
_tensor_docs.py Revert "Add deterministic path for CUDA cumsum (#136224)" 2024-09-23 19:57:13 +00:00
_tensor_str.py
_tensor.py Revert "Add deterministic path for CUDA cumsum (#136224)" 2024-09-23 19:57:13 +00:00
_torch_docs.py Revert "Add deterministic path for CUDA cumsum (#136224)" 2024-09-23 19:57:13 +00:00
_utils_internal.py Revert "[Pytorch] Consolidate Strobelight compile time profiler between OSS and fbcode (#135953)" 2024-09-15 05:32:38 +00:00
_utils.py
_VF.py
_vmap_internals.py
_weights_only_unpickler.py
abi-check.cpp
CMakeLists.txt
custom_class_detail.h
custom_class.h
extension.h
functional.py Revert "Add deterministic path for CUDA cumsum (#136224)" 2024-09-23 19:57:13 +00:00
hub.py torch.hub: add get_dir/set_dir type hints (#134906) 2024-09-12 03:53:29 +00:00
library.h
library.py Param fixes in docstring (#136097) 2024-09-21 18:56:34 +00:00
overrides.py Revert "Add deterministic path for CUDA cumsum (#136224)" 2024-09-23 19:57:13 +00:00
py.typed
quasirandom.py
random.py
README.txt
return_types.py
script.h
serialization.py [3.13] fix 3.13 pickle error in serialization.py (#136034) 2024-09-14 00:02:40 +00:00
storage.py
torch_version.py
types.py
version.py.tpl

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers do double duty: they are installed alongside the
public C headers, but they are *internal implementation detail* headers,
whose contents should largely not be used by external clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.