pytorch/torch
Darshan Sanghani 33dd4f187d [pytorch/et] Allow ET to save additional resources for completing a trace like generated kernels and index tensor data (#143430)
The resources directory lets the ET observer dump additional data, such as generated Triton kernels, while capturing the ET.

This allows us to use the ET trace to replay PT2 workloads and gain visibility into data such as generated kernels and their usage in a model, index tensor data, and so on.

We also added a few ways to enable ET and ET resources through environment variables.

Setting `ENABLE_PYTORCH_EXECUTION_TRACE` will enable default Execution Tracing in PyTorch.

Additionally, setting `ENABLE_PYTORCH_EXECUTION_TRACE_EXTRAS` will enable ET to collect extra resources from the run, such as Triton kernels.
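
For illustration, a minimal sketch of enabling both flags from Python before running a PT2 workload; the enabling value `"1"` and the placement before the workload are assumptions, not specified by this change:

```python
import os

# Set the flags before the profiled workload runs; "1" as the enabling
# value is an assumption for illustration.
os.environ["ENABLE_PYTORCH_EXECUTION_TRACE"] = "1"         # default Execution Tracing
os.environ["ENABLE_PYTORCH_EXECUTION_TRACE_EXTRAS"] = "1"  # also collect extras, e.g. Triton kernels

import torch

# Any PT2 workload; torch.compile is what produces the generated
# kernels that the extras flag is meant to capture.
model = torch.compile(torch.nn.Linear(8, 8))
out = model(torch.randn(4, 8))
```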

Differential Revision: [D58707846](https://our.internmc.facebook.com/intern/diff/D58707846/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143430
Approved by: https://github.com/shengfukevin, https://github.com/sraikund16
2024-12-20 21:20:32 +00:00
_awaits
_C [ARM][feat]: Add 4 bit dynamic quantization matmuls & KleidiAI Backend (#134124) 2024-12-20 19:32:03 +00:00
_C_flatbuffer
_custom_op
_decomp [Inductor][CPU] disable bernoulli_p decomposition (#143460) 2024-12-19 11:21:35 +00:00
_dispatch Remove unused Python variables in torch/[_-a]* (#133492) 2024-12-12 17:39:14 +00:00
_dynamo pgo: Log feature use (#142819) 2024-12-20 20:22:20 +00:00
_export [ts converter] use Dim.AUTO for ts -> export converter (#138273) 2024-12-20 07:48:24 +00:00
_functorch remove allow-untyped-defs for torch/_functorch/batch_norm_replacement.py (#143438) 2024-12-18 09:01:06 +00:00
_higher_order_ops [user triton] Raise an exception when encountering nested @triton.autotune decorators or @triton.heuristics (#143519) 2024-12-20 06:38:45 +00:00
_inductor [ARM][feat]: Add 4 bit dynamic quantization matmuls & KleidiAI Backend (#134124) 2024-12-20 19:32:03 +00:00
_lazy remove allow-untyped-defs from torch/_lazy/config.py (#143603) 2024-12-20 05:34:19 +00:00
_library Revert "[export] don't decompose custom triton op when exporting (#142426)" 2024-12-19 21:21:38 +00:00
_logging Add "inductor_pre_grad_graph" logging (#142717) (#143126) 2024-12-13 21:48:25 +00:00
_numpy Remove unused Python variables in torch/[_-a]* (#133492) 2024-12-12 17:39:14 +00:00
_prims Remove unused Python variables in torch/[_-a]* (#133492) 2024-12-12 17:39:14 +00:00
_prims_common Pass allow_rhs_unbacked to the stride test in metadata test too (#143040) 2024-12-19 09:37:50 +00:00
_refs Remove unused Python variables in torch/[_-a]* (#133492) 2024-12-12 17:39:14 +00:00
_strobelight Remove unused Python variables in torch/[_-a]* (#133492) 2024-12-12 17:39:14 +00:00
_subclasses Remove unused Python variables in torch/[_-a]* (#133492) 2024-12-12 17:39:14 +00:00
_vendor
accelerator [BE][accelerator] formalize API name {current,set}_device_{idx => index} (#140542) 2024-12-12 10:53:48 +00:00
amp
ao remove allow-untyped-defs from torch/ao/quantization/experimental/APoT_tensor.py (#143601) 2024-12-20 05:26:09 +00:00
autograd Remove unused Python variables in torch/[_-a]* (#133492) 2024-12-12 17:39:14 +00:00
backends [ARM][feat]: Add 4 bit dynamic quantization matmuls & KleidiAI Backend (#134124) 2024-12-20 19:32:03 +00:00
compiler [export] add is_exporting flag (#142425) 2024-12-18 21:36:28 +00:00
contrib
cpu
csrc [c10d][fr] flight recorder improvements (#143446) 2024-12-20 20:41:30 +00:00
cuda [ROCm] Fix unit test: matmul_offline_mgpu_tunableop (#143507) 2024-12-19 19:48:20 +00:00
distributed remove allow-untyped-defs from torch/distributed/elastic/multiprocessing/errors/handlers.py (#143605) 2024-12-20 05:25:01 +00:00
distributions Remove some unused type ignores (round 1) (#142325) 2024-12-09 18:23:46 +00:00
export [export] add is_exporting flag (#142425) 2024-12-18 21:36:28 +00:00
fft
func
futures
fx Revert "refactor tensorify restart logic to use sources (#141517)" (#143623) 2024-12-20 15:38:34 +00:00
jit Add warning to torch.jit.load (#143403) 2024-12-18 00:17:41 +00:00
legacy
lib
linalg
masked remove allow-untyped-defs for torch/masked/maskedtensor/creation.py (#143321) 2024-12-17 16:44:50 +00:00
monitor
mps [MPS] Add CompileShader method (#141478) 2024-12-11 02:00:51 +00:00
mtia (MTIA) Move "empty_cache" API (#143402) 2024-12-20 17:39:06 +00:00
multiprocessing
nested NJT linear_backward should not return inner tensor as-is (#143333) 2024-12-18 00:15:18 +00:00
nn Rewrite _reparametrize_module to use contextmanager (#138203) 2024-12-20 12:02:27 +00:00
onnx [Codemod][AddExplicitStrictExportArg] caffe2/torch/onnx/_internal/exporter (#143542) 2024-12-20 00:54:52 +00:00
optim Add support for differentiable LR in SGD + test v2.0 (#143510) 2024-12-19 21:04:44 +00:00
package
profiler [pytorch/et] Allow ET to save additional resources for completing a trace like generated kernels and index tensor data (#143430) 2024-12-20 21:20:32 +00:00
quantization
signal
sparse
special
testing [ARM][feat]: Add 4 bit dynamic quantization matmuls & KleidiAI Backend (#134124) 2024-12-20 19:32:03 +00:00
utils Add config.save.use_pinned_memory_for_d2h to serialization config (#143342) 2024-12-20 21:01:18 +00:00
xpu
__config__.py remove allow-untyped-defs for torch/__config__.py (#143320) 2024-12-17 00:16:09 +00:00
__future__.py
__init__.py [dynamo, 3.13t] raise error if torch.compile is attempted in 3.13t (nogil) (#143404) 2024-12-19 18:10:01 +00:00
_appdirs.py Remove unused Python variables in torch/[_-a]* (#133492) 2024-12-12 17:39:14 +00:00
_classes.py
_compile.py
_custom_ops.py
_deploy.py
_environment.py
_guards.py dynamo tracing perf: no import on hot path: 47.62 -> 47.26 (#143065) 2024-12-20 20:06:42 +00:00
_jit_internal.py Remove unused Python variables in torch/[_-a]* (#133492) 2024-12-12 17:39:14 +00:00
_linalg_utils.py
_lobpcg.py Remove unused Python variables in torch/[_-a]* (#133492) 2024-12-12 17:39:14 +00:00
_lowrank.py Remove unused Python variables in torch/[_-a]* (#133492) 2024-12-12 17:39:14 +00:00
_meta_registrations.py [ARM][feat]: Add 4 bit dynamic quantization matmuls & KleidiAI Backend (#134124) 2024-12-20 19:32:03 +00:00
_namedtensor_internals.py
_ops.py Remove unused Python variables in torch/[_-a]* (#133492) 2024-12-12 17:39:14 +00:00
_python_dispatcher.py
_size_docs.py
_sources.py
_storage_docs.py
_streambase.py
_tensor_docs.py
_tensor_str.py Remove unused Python variables in torch/[_-a]* (#133492) 2024-12-12 17:39:14 +00:00
_tensor.py __cuda_array_interface__: Use "<V2" for bfloat16. (#143042) 2024-12-14 06:27:52 +00:00
_thread_safe_fork.py
_torch_docs.py Add torch.cat tensors type promotion description (#141339) 2024-12-14 01:36:41 +00:00
_utils_internal.py [reland] Kill capture_pre_autograd_graph API (#143426) 2024-12-18 12:07:09 +00:00
_utils.py Reraise worker errors as runtime errors in more cases when the original exception can't be constructed (#140911) 2024-12-14 03:11:36 +00:00
_VF.py
_vmap_internals.py
_weights_only_unpickler.py Remove unused Python variables in torch/[_-a]* (#133492) 2024-12-12 17:39:14 +00:00
abi-check.cpp
CMakeLists.txt export AOTI_TORCH_EXPORT on Windows. (#140030) 2024-12-20 11:42:09 +00:00
custom_class_detail.h
custom_class.h
extension.h
functional.py
hub.py
library.h
library.py make it clearer (in docs) one can double decorate with torch.library.impl_* APIs (#137608) 2024-12-17 15:13:58 +00:00
overrides.py [dim_order] raised runtime error when tensor has ambiguous dim order (#141632) 2024-12-08 23:16:57 +00:00
py.typed
quasirandom.py
random.py
README.txt
return_types.py
script.h
serialization.py Add config.save.use_pinned_memory_for_d2h to serialization config (#143342) 2024-12-20 21:01:18 +00:00
storage.py
torch_version.py
types.py
version.py.tpl

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers serve double duty as *internal implementation
detail* headers, whose contents should largely not be used by external
clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.