pytorch/torch
Yu, Guangye b8550f527f Support gpu trace on XPU (#121795)
# Motivation
Support GPU trace on the XPU backend by adding GPU trace hooks to the XPU runtime. This also lays the groundwork for generalizing the device caching allocator in a follow-up step.
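For context, here is a minimal, hedged sketch of how such trace hooks are consumed on the CUDA side today; it assumes the registration helpers in `torch.cuda._gpu_trace` (e.g. `register_callback_for_event_creation`), and it assumes the XPU-side hooks added here mirror that interface. It is an illustration, not code from this PR.

```python
# Hedged sketch, not code from this PR: registering GPU trace callbacks.
# Assumes the CUDA-side registration helpers in torch.cuda._gpu_trace; the
# XPU-side hooks added here are assumed to mirror the same interface.
import torch
from torch.cuda import _gpu_trace


def on_event_creation(event_id: int) -> None:
    # Called by the runtime when a GPU event is created (while tracing is active).
    print(f"GPU event created: {event_id:#x}")


def on_memory_allocation(ptr: int) -> None:
    # Called when the caching allocator hands out a device allocation.
    print(f"GPU memory allocated at: {ptr:#x}")


_gpu_trace.register_callback_for_event_creation(on_event_creation)
_gpu_trace.register_callback_for_memory_allocation(on_memory_allocation)

if torch.cuda.is_available():
    x = torch.empty(1024, device="cuda")  # should trigger the allocation callback
    e = torch.cuda.Event()
    e.record()  # should trigger the event-creation callback
```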

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121795
Approved by: https://github.com/EikanWang, https://github.com/gujinghui, https://github.com/jgong5, https://github.com/albanD
ghstack dependencies: #121794
2024-03-30 13:07:53 +00:00
_awaits
_C Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
_C_flatbuffer
_custom_op infer_schema can add alias annotations when passed a list of mutated args (#122343) 2024-03-21 21:39:07 +00:00
_decomp Added DispatchKey.CompositeImplicitAutograd to all upsample_nearest*.default decompositions (#122782) 2024-03-29 13:55:25 +00:00
_dispatch
_dynamo Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
_export [export] Add torch_fn (#122693) 2024-03-30 06:47:15 +00:00
_functorch Added some checkpointing tests (#122848) 2024-03-29 03:49:19 +00:00
_higher_order_ops [inductor] Add torch.while_loop support to JIT Inductor (#122069) 2024-03-22 02:45:27 +00:00
_inductor [Inductor] Fix AFOC QPS Regression. (#122944) 2024-03-30 07:34:41 +00:00
_lazy
_library Fix FallbackKernel behavior on mutable ops (#118649) 2024-02-09 19:01:54 +00:00
_logging [TORCH_TRACE] Record stack when no compile context is available (#122644) 2024-03-26 19:30:52 +00:00
_numpy
_prims add decomposition for frexp (#119217) 2024-02-23 21:52:42 +00:00
_prims_common Make expected stride test in torch._prims_common size oblivious (#122370) 2024-03-21 17:14:42 +00:00
_refs Allow dynamo to inline through "hessian" (#121410) 2024-03-27 21:39:37 +00:00
_subclasses Some improvements to nonzero post guard_size_oblivious (#122156) 2024-03-28 03:53:16 +00:00
_vendor
amp Remove device assert in Gradscaler (#119362) 2024-02-22 08:02:18 +00:00
ao [export] Make quantizer compatible with the standard nn_module_stack. (#122819) 2024-03-28 19:36:46 +00:00
autograd Delete torch.autograd.function.traceable APIs (#122817) 2024-03-28 18:24:15 +00:00
backends [TorchElastic] Refactoring to support non-default logging strategy (#120691) 2024-02-29 20:59:17 +00:00
compiler [torch.export] Support is_compiling() flag for non-strict mode (#119602) 2024-02-29 05:52:51 +00:00
contrib
cpu
csrc Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
cuda Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
distributed [BE] minor logging cleanup in distributed (#122921) 2024-03-29 03:34:01 +00:00
distributions
export Revert "Add non strict inline constraints and runtime assertions to non-strict exported program (#122722)" 2024-03-28 20:42:35 +00:00
fft
func Let torch dynamo inline torch.func.grad (#118407) 2024-02-28 20:05:00 +00:00
futures
fx [export] Add torch_fn (#122693) 2024-03-30 06:47:15 +00:00
jit [jit] Fix _batch_norm_with_update shape function (#122430) 2024-03-22 14:21:57 +00:00
legacy
lib Remove unneeded linking of torch_shm_manager in CMake (#119540) 2024-02-11 06:33:35 +00:00
linalg Move doc links to point to main (#121823) 2024-03-15 19:49:37 +00:00
masked
monitor
mps
multiprocessing
nested [BE] minor logging cleanup in distributed (#122921) 2024-03-29 03:34:01 +00:00
nn Add RMSNorm module (#121364) 2024-03-29 18:05:28 +00:00
onnx Prevent dup initializers when ONNXProgram.save is called many times (#122435) 2024-03-22 21:03:15 +00:00
optim Add tensor step and capturable support to rprop (#122261) 2024-03-28 23:31:18 +00:00
package Back out "Support triton.language.dtype with torch.compile (#121690)" (#122108) 2024-03-18 20:50:28 +00:00
profiler [profiler] Fix recorded profiler step number (#121127) 2024-03-09 06:54:51 +00:00
quantization
signal
sparse Update DimOrDims typing in torch.sparse (#122471) 2024-03-25 16:25:56 +00:00
special
testing Avoid COW materialize in conv forward ops (#122748) 2024-03-29 20:34:19 +00:00
utils Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
xpu Support gpu trace on XPU (#121795) 2024-03-30 13:07:53 +00:00
__config__.py
__future__.py Update nn.Module._apply to not gate on should_use_set_data when swap_tensors is set (#120659) 2024-02-28 00:59:34 +00:00
__init__.py [dynamo, 3.12] Allocate Dynamo shadow frames by mimicking CPython (#122146) 2024-03-27 20:39:39 +00:00
_appdirs.py
_classes.py
_compile.py
_custom_ops.py
_deploy.py [Lint] replace [assigment] with [method-assign] for methods (#119706) 2024-02-13 02:06:04 +00:00
_guards.py [dynamo] Compile time optimizations in tx.step() (#121790) 2024-03-15 01:01:05 +00:00
_jit_internal.py Add scuba logging for TorchScript usage (#121936) 2024-03-19 17:38:27 +00:00
_linalg_utils.py
_lobpcg.py [Lint] replace [assigment] with [method-assign] for methods (#119706) 2024-02-13 02:06:04 +00:00
_lowrank.py Fix svd_lowrank parameter M (#122681) 2024-03-29 18:06:38 +00:00
_meta_registrations.py Add metas for randint/rand factory functions out overload (#122375) 2024-03-25 04:01:38 +00:00
_namedtensor_internals.py
_ops.py Revert "Support map in pre-dispatch functionalization (#121444)" 2024-03-28 23:42:26 +00:00
_python_dispatcher.py
_sources.py
_storage_docs.py
_streambase.py
_tensor_docs.py update the tensor.scatter_ doc (#120169) 2024-02-23 02:51:55 +00:00
_tensor_str.py Add sparse compressed meta tensor support (#120707) 2024-03-01 13:28:47 +00:00
_tensor.py Disallow {FakeTensor,FunctionalTensor}.data_ptr (#122514) 2024-03-26 23:55:42 +00:00
_torch_docs.py Graph-Safe RNG State Exchange for Tensor Parallelism (#114068) 2024-03-27 01:14:38 +00:00
_utils_internal.py [export] build the infra to rollout predispatch export. (#122326) 2024-03-22 00:55:10 +00:00
_utils.py Refactor gpu trace to be device-agnostic (#121794) 2024-03-30 13:04:38 +00:00
_VF.py
_vmap_internals.py
_weights_only_unpickler.py
abi-check.cpp
CMakeLists.txt
custom_class_detail.h
custom_class.h
extension.h
functional.py Fix ouput typos (#120870) 2024-02-29 08:29:14 +00:00
hub.py Add verbose parameter to torch.hub.list (#120717) 2024-03-01 07:39:48 +00:00
library.h
library.py Better error messages for impl_abstract_pystub (#120959) 2024-03-04 15:24:36 +00:00
overrides.py Add RMSNorm module (#121364) 2024-03-29 18:05:28 +00:00
py.typed
quasirandom.py
random.py [2/2] Intel GPU Runtime Upstreaming for Generator (#118613) 2024-02-28 05:28:11 +00:00
README.txt
return_types.py register torch.return_types in torch.fx._pytree (#120027) 2024-02-23 21:52:42 +00:00
script.h
serialization.py Add support to save safetensors checkpoint directly into onnx (#121001) 2024-03-11 15:21:59 +00:00
storage.py Add hpu device support in storage/resize (#119761) 2024-02-17 01:04:27 +00:00
torch_version.py
types.py
version.py.tpl

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers serve double duty as *internal implementation
detail* headers, whose contents should largely not be used by external
clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.
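
As a purely illustrative sketch of the boundary this note describes (the
function and field names below are assumptions for illustration, not the
exact historical TH API), the preferred pattern looks like:

    /* Illustrative only: THTensor_nDimension, THTensor_size and the sizes_
       field are assumed names; the point is the pattern, not the exact API. */
    #include <cstdint>
    #include <THTensor.h>      /* public C API: treat THTensor as opaque     */
    /* #include <THTensor.hpp>    internal C++ header: not for external use  */

    int64_t first_dim_or_zero(THTensor* t) {
      /* Preferred: query the tensor through the public functions declared
         in THTensor.h ... */
      if (THTensor_nDimension(t) == 0) {
        return 0;
      }
      return THTensor_size(t, 0);
      /* ... rather than reading internal fields (e.g. t->sizes_) that only
         THTensor.hpp exposes; doing so is the abstraction violation that
         this note tracks inside torch/csrc. */
    }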