pytorch/torch
Yu, Guangye b8550f527f Support gpu trace on XPU (#121795)
# Motivation
Support GPU trace on the XPU backend by adding GPU trace hooks to the XPU runtime. This also lays the groundwork for generalizing the device caching allocator in a follow-up step.
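For context, here is a minimal, hedged sketch of how such trace hooks are consumed on the CUDA side today; it assumes the registration helpers in `torch.cuda._gpu_trace` (e.g. `register_callback_for_event_creation`), and it assumes the XPU-side hooks added here mirror that interface. It is an illustration, not code from this PR.

```python
# Hedged sketch, not code from this PR: registering GPU trace callbacks.
# Assumes the CUDA-side registration helpers in torch.cuda._gpu_trace; the
# XPU-side hooks added here are assumed to mirror the same interface.
import torch
from torch.cuda import _gpu_trace


def on_event_creation(event_id: int) -> None:
    # Called by the runtime when a GPU event is created (while tracing is active).
    print(f"GPU event created: {event_id:#x}")


def on_memory_allocation(ptr: int) -> None:
    # Called when the caching allocator hands out a device allocation.
    print(f"GPU memory allocated at: {ptr:#x}")


_gpu_trace.register_callback_for_event_creation(on_event_creation)
_gpu_trace.register_callback_for_memory_allocation(on_memory_allocation)

if torch.cuda.is_available():
    x = torch.empty(1024, device="cuda")  # should trigger the allocation callback
    e = torch.cuda.Event()
    e.record()  # should trigger the event-creation callback
```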

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121795
Approved by: https://github.com/EikanWang, https://github.com/gujinghui, https://github.com/jgong5, https://github.com/albanD
ghstack dependencies: #121794
2024-03-30 13:07:53 +00:00
_awaits
_C Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
_C_flatbuffer
_custom_op infer_schema can add alias annotations when passed a list of mutated args (#122343) 2024-03-21 21:39:07 +00:00
_decomp Added DispatchKey.CompositeImplicitAutograd to all upsample_nearest*.default decompositions (#122782) 2024-03-29 13:55:25 +00:00
_dispatch
_dynamo Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
_export [export] Add torch_fn (#122693) 2024-03-30 06:47:15 +00:00
_functorch Added some checkpointing tests (#122848) 2024-03-29 03:49:19 +00:00
_higher_order_ops [inductor] Add torch.while_loop support to JIT Inductor (#122069) 2024-03-22 02:45:27 +00:00
_inductor [Inductor] Fix AFOC QPS Regression. (#122944) 2024-03-30 07:34:41 +00:00
_lazy
_library Fix FallbackKernel behavior on mutable ops (#118649) 2024-02-09 19:01:54 +00:00
_logging [TORCH_TRACE] Record stack when no compile context is available (#122644) 2024-03-26 19:30:52 +00:00
_numpy
_prims add decomposition for frexp (#119217) 2024-02-23 21:52:42 +00:00
_prims_common Make expected stride test in torch._prims_common size oblivious (#122370) 2024-03-21 17:14:42 +00:00
_refs Allow dynamo to inline through "hessian" (#121410) 2024-03-27 21:39:37 +00:00
_subclasses Some improvements to nonzero post guard_size_oblivious (#122156) 2024-03-28 03:53:16 +00:00
_vendor
amp Remove device assert in Gradscaler (#119362) 2024-02-22 08:02:18 +00:00
ao [export] Make quantizer compatible with the standard nn_module_stack. (#122819) 2024-03-28 19:36:46 +00:00
autograd Delete torch.autograd.function.traceable APIs (#122817) 2024-03-28 18:24:15 +00:00
backends [TorchElastic] Refactoring to support non-default logging strategy (#120691) 2024-02-29 20:59:17 +00:00
compiler [torch.export] Support is_compiling() flag for non-strict mode (#119602) 2024-02-29 05:52:51 +00:00
contrib
cpu
csrc Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
cuda Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
distributed [BE] minor logging cleanup in distributed (#122921) 2024-03-29 03:34:01 +00:00
distributions
export Revert "Add non strict inline constraints and runtime assertions to non-strict exported program (#122722)" 2024-03-28 20:42:35 +00:00
fft
func Let torch dynamo inline torch.func.grad (#118407) 2024-02-28 20:05:00 +00:00
futures
fx [export] Add torch_fn (#122693) 2024-03-30 06:47:15 +00:00
jit [jit] Fix _batch_norm_with_update shape function (#122430) 2024-03-22 14:21:57 +00:00
legacy
lib Remove unneeded linking of torch_shm_manager in CMake (#119540) 2024-02-11 06:33:35 +00:00
linalg Move doc links to point to main (#121823) 2024-03-15 19:49:37 +00:00
masked
monitor
mps
multiprocessing
nested [BE] minor logging cleanup in distributed (#122921) 2024-03-29 03:34:01 +00:00
nn Add RMSNorm module (#121364) 2024-03-29 18:05:28 +00:00
onnx Prevent dup initializers when ONNXProgram.save is called many times (#122435) 2024-03-22 21:03:15 +00:00
optim Add tensor step and capturable support to rprop (#122261) 2024-03-28 23:31:18 +00:00
package Back out "Support triton.language.dtype with torch.compile (#121690)" (#122108) 2024-03-18 20:50:28 +00:00
profiler [profiler] Fix recorded profiler step number (#121127) 2024-03-09 06:54:51 +00:00
quantization
signal
sparse Update DimOrDims typing in torch.sparse (#122471) 2024-03-25 16:25:56 +00:00
special
testing Avoid COW materialize in conv forward ops (#122748) 2024-03-29 20:34:19 +00:00
utils Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
xpu Support gpu trace on XPU (#121795) 2024-03-30 13:07:53 +00:00
__config__.py
__future__.py Update nn.Module._apply to not gate on should_use_set_data when swap_tensors is set (#120659) 2024-02-28 00:59:34 +00:00
__init__.py [dynamo, 3.12] Allocate Dynamo shadow frames by mimicking CPython (#122146) 2024-03-27 20:39:39 +00:00
_appdirs.py
_classes.py
_compile.py
_custom_ops.py
_deploy.py [Lint] replace [assigment] with [method-assign] for methods (#119706) 2024-02-13 02:06:04 +00:00
_guards.py [dynamo] Compile time optimizations in tx.step() (#121790) 2024-03-15 01:01:05 +00:00
_jit_internal.py Add scuba logging for TorchScript usage (#121936) 2024-03-19 17:38:27 +00:00
_linalg_utils.py
_lobpcg.py [Lint] replace [assigment] with [method-assign] for methods (#119706) 2024-02-13 02:06:04 +00:00
_lowrank.py Fix svd_lowrank parameter M (#122681) 2024-03-29 18:06:38 +00:00
_meta_registrations.py Add metas for randint/rand factory functions out overload (#122375) 2024-03-25 04:01:38 +00:00
_namedtensor_internals.py
_ops.py Revert "Support map in pre-dispatch functionalization (#121444)" 2024-03-28 23:42:26 +00:00
_python_dispatcher.py
_sources.py
_storage_docs.py
_streambase.py
_tensor_docs.py update the tensor.scatter_ doc (#120169) 2024-02-23 02:51:55 +00:00
_tensor_str.py Add sparse compressed meta tensor support (#120707) 2024-03-01 13:28:47 +00:00
_tensor.py Disallow {FakeTensor,FunctionalTensor}.data_ptr (#122514) 2024-03-26 23:55:42 +00:00
_torch_docs.py Graph-Safe RNG State Exchange for Tensor Parallelism (#114068) 2024-03-27 01:14:38 +00:00
_utils_internal.py [export] build the infra to rollout predispatch export. (#122326) 2024-03-22 00:55:10 +00:00
_utils.py Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
_VF.py
_vmap_internals.py
_weights_only_unpickler.py
abi-check.cpp
CMakeLists.txt
custom_class_detail.h
custom_class.h
extension.h
functional.py Fix ouput typos (#120870) 2024-02-29 08:29:14 +00:00
hub.py Add verbose parameter to torch.hub.list (#120717) 2024-03-01 07:39:48 +00:00
library.h
library.py Better error messages for impl_abstract_pystub (#120959) 2024-03-04 15:24:36 +00:00
overrides.py Add RMSNorm module (#121364) 2024-03-29 18:05:28 +00:00
py.typed
quasirandom.py
random.py [2/2] Intel GPU Runtime Upstreaming for Generator (#118613) 2024-02-28 05:28:11 +00:00
README.txt
return_types.py register torch.return_types in torch.fx._pytree (#120027) 2024-02-23 21:52:42 +00:00
script.h
serialization.py Add support to save safetensors checkpoint directly into onnx (#121001) 2024-03-11 15:21:59 +00:00
storage.py Add hpu device support in storage/resize (#119761) 2024-02-17 01:04:27 +00:00
torch_version.py
types.py
version.py.tpl

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers serve double duty as *internal implementation
detail* headers, whose contents should largely not be used by external
clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.
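
As a purely illustrative sketch of the boundary this note describes (the
function and field names below are assumptions for illustration, not the
exact historical TH API), the preferred pattern looks like:

    /* Illustrative only: THTensor_nDimension, THTensor_size and the sizes_
       field are assumed names; the point is the pattern, not the exact API. */
    #include <cstdint>
    #include <THTensor.h>      /* public C API: treat THTensor as opaque     */
    /* #include <THTensor.hpp>    internal C++ header: not for external use  */

    int64_t first_dim_or_zero(THTensor* t) {
      /* Preferred: query the tensor through the public functions declared
         in THTensor.h ... */
      if (THTensor_nDimension(t) == 0) {
        return 0;
      }
      return THTensor_size(t, 0);
      /* ... rather than reading internal fields (e.g. t->sizes_) that only
         THTensor.hpp exposes; doing so is the abstraction violation that
         this note tracks inside torch/csrc. */
    }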