pytorch/torch
soulitzer e41a0b33ec Allow Fakified subclass to have different device for inner and outer tensor (#141839)
Previously, if a wrapper tensor subclass was fakified, the inner tensors would end up with the same device as the outer tensor. This PR makes it so that the inner and outer tensors can have different devices.

See the OffloadTensor PR https://github.com/pytorch/pytorch/pull/141840/files#diff-3bc0cf540b694f4ec0a3749f78b047456657a53a5657e495ffb68e5970c5fdaaR1955 for an application; a simpler test has been added in this PR.

This is technically BC-breaking because the callback passed to MetaConverter now needs to accept an extra argument, but no external code should be relying on that callback anyway. A sketch of the kind of subclass this enables appears below.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141839
Approved by: https://github.com/bdhirsh
ghstack dependencies: #141166
2024-12-03 00:09:41 +00:00
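
As an illustration, here is a minimal, hypothetical sketch of the wrapper-subclass pattern this change enables: the outer tensor advertises one device while the inner tensor holds the data on another. This is not the OffloadTensor from the linked PR; the class name, the choice of `meta` as the outer device, and the unwrap/rewrap logic are all illustrative assumptions.

```python
# A minimal sketch, NOT the OffloadTensor from the linked PR: the class
# name and all helper logic here are illustrative assumptions.
import torch
from torch.utils._pytree import tree_map_only


class OffloadedTensor(torch.Tensor):
    """Wrapper subclass: advertises `outer_device`, stores data on `inner.device`."""

    @staticmethod
    def __new__(cls, inner: torch.Tensor, outer_device: torch.device):
        # _make_wrapper_subclass creates a storage-less outer tensor that
        # reports `outer_device`; the real data stays on `inner.device`.
        return torch.Tensor._make_wrapper_subclass(
            cls, inner.shape, dtype=inner.dtype, device=outer_device
        )

    def __init__(self, inner: torch.Tensor, outer_device: torch.device):
        self.inner = inner

    def __repr__(self):
        return (
            f"OffloadedTensor(outer device={self.device}, "
            f"inner device={self.inner.device})"
        )

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        # For brevity, assume at least one positional arg is an OffloadedTensor.
        outer_device = next(
            a.device for a in args if isinstance(a, OffloadedTensor)
        )
        # Unwrap to the inner tensors, run the op where the data lives,
        # then rewrap so outputs keep advertising the outer device.
        inner_args = tree_map_only(OffloadedTensor, lambda t: t.inner, args)
        inner_kwargs = tree_map_only(OffloadedTensor, lambda t: t.inner, kwargs)
        out = func(*inner_args, **inner_kwargs)
        return tree_map_only(
            torch.Tensor, lambda t: OffloadedTensor(t, outer_device), out
        )


# The outer tensor claims the always-available "meta" device; the inner
# tensor holds real values on CPU.
t = OffloadedTensor(torch.ones(4), torch.device("meta"))
print(t.device, t.inner.device)  # meta cpu
print((t + t).inner)             # tensor([2., 2., 2., 2.])
```

If a subclass like this is fakified, the change in this commit is what allows the fake inner tensors to keep their own device rather than inheriting the outer tensor's device.
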
_awaits
_C [MTIA] Support torch.mtia.empty_cache() (#141533) 2024-11-28 02:24:19 +00:00
_C_flatbuffer
_custom_op
_decomp Fix mismatched tensor metadata between FakeTensor and Intel XPU concrete tensor when running F.logsigmoid (#141333) 2024-12-02 22:09:20 +00:00
_dispatch
_dynamo Revert "[BE]: Update mypy to 1.13.0 (#140808)" 2024-12-02 20:47:43 +00:00
_export Revert "[BE]: Update mypy to 1.13.0 (#140808)" 2024-12-02 20:47:43 +00:00
_functorch Revert "[REFACTOR] Inline FxGraphCache.post_compile into sole call site (#141877)" 2024-12-02 21:26:13 +00:00
_higher_order_ops Revert "Ensure that BlockMask length must always exactly match the sequence length in flex_attention (#141625)" 2024-12-02 14:10:38 +00:00
_inductor [ROCm] Enable inductor GEMM lowering for gfx11 (#141687) 2024-12-02 22:13:34 +00:00
_lazy
_library [BE]: Apply PERF401 autofixes from ruff (#140980) 2024-11-20 17:52:07 +00:00
_logging [BE]: Apply PERF401 autofixes from ruff (#140980) 2024-11-20 17:52:07 +00:00
_numpy
_prims [BE]: Apply PERF401 autofixes from ruff (#140980) 2024-11-20 17:52:07 +00:00
_prims_common Revert "[BE]: Update mypy to 1.13.0 (#140808)" 2024-12-02 20:47:43 +00:00
_refs Revert "[BE]: Update mypy to 1.13.0 (#140808)" 2024-12-02 20:47:43 +00:00
_strobelight Increase default COMPILE_STROBELIGHT_MAX_STACK_LENGTH to 500 (#138006) 2024-10-17 07:31:32 +00:00
_subclasses Allow Fakified subclass to have different device for inner and outer tensor (#141839) 2024-12-03 00:09:41 +00:00
_vendor
accelerator Introduce a device-agnostic runtime API design (#132204) 2024-10-27 10:37:09 +00:00
amp [MPS] Add support for bf16 autocast (#139390) 2024-11-20 19:52:28 +00:00
ao Revert "[BE]: Update mypy to 1.13.0 (#140808)" 2024-12-02 20:47:43 +00:00
autograd Missing space in torch.autograd.Function deprecation warning (#141562) 2024-11-27 01:31:26 +00:00
backends Revert "[sparse] add search for optimal alg_id to torch.compile (#137427)" 2024-10-24 17:27:06 +00:00
compiler [dynamo] skip_guard_eval_unsafe stance for power users (#140251) 2024-11-21 06:28:58 +00:00
contrib Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
cpu [Inductor][CPP] Add oneDNN BRGEMM config for Half cpp gemm template (#136255) 2024-11-05 05:33:29 +00:00
csrc cpp_wrapper: Add support for MemoryFormat arguments (#141367) 2024-12-02 20:40:24 +00:00
cuda [ROCM] Support Multi-GPU offline tuning in TunableOp (#139673) 2024-11-26 19:07:41 +00:00
distributed Revert "[BE]: Update mypy to 1.13.0 (#140808)" 2024-12-02 20:47:43 +00:00
distributions [BE] Use torch.special.expm1 (#141518) 2024-11-26 01:47:11 +00:00
export improve typings in unflatten (#141817) 2024-11-30 22:12:15 +00:00
fft
func
futures
fx fix deep copy of empty graph (#141660) 2024-12-02 22:03:13 +00:00
jit Revert "[BE]: Update mypy to 1.13.0 (#140808)" 2024-12-02 20:47:43 +00:00
legacy
lib Add and use thread-safe strerror (#140472) 2024-11-19 04:24:17 +00:00
linalg docs: clarify alias usage for x parameter in vector_norm function (#136921) 2024-09-30 02:50:06 +00:00
masked [BE]: Update Typeguard to TypeIs for better type inference (#133814) 2024-10-26 15:07:13 +00:00
monitor
mps
mtia [MTIA] Support torch.mtia.empty_cache() (#141533) 2024-11-28 02:24:19 +00:00
multiprocessing Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
nested Switch to using Python nested int (#141166) 2024-12-02 19:17:30 +00:00
nn Revert "[BE]: Update mypy to 1.13.0 (#140808)" 2024-12-02 20:47:43 +00:00
onnx [ONNX] Remove special handling of torchvision.ops imports in onnx export (#141569) 2024-11-28 18:05:40 +00:00
optim Revert "[BE]: Update mypy to 1.13.0 (#140808)" 2024-12-02 20:47:43 +00:00
package [BE]: Apply PERF401 autofixes from ruff (#140980) 2024-11-20 17:52:07 +00:00
profiler Add skip_first_wait to profiler.schedule (V2) (#141512) 2024-11-26 18:10:54 +00:00
quantization Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
signal
sparse [sparse] add extra options to _cslt_sparse_mm (#137427) 2024-11-27 05:32:45 +00:00
special
testing Revert "[BE]: Update mypy to 1.13.0 (#140808)" 2024-12-02 20:47:43 +00:00
utils Revert "[dynamo][pytree][1/N] make CXX pytree traceable: tree_iter / tree_leaves (#137397)" 2024-12-02 16:05:14 +00:00
xpu [BE]: Apply PERF401 autofixes from ruff (#140980) 2024-11-20 17:52:07 +00:00
__config__.py
__future__.py
__init__.py [dynamo] add SymNode bitwise and/or (#138777) 2024-11-22 23:36:16 +00:00
_appdirs.py
_classes.py
_compile.py
_custom_ops.py
_deploy.py
_environment.py Improve is_fbcode functionality (#136871) 2024-09-27 21:19:01 +00:00
_guards.py dynamo: guard on FSDP module parameters (#138819) 2024-11-13 20:46:46 +00:00
_jit_internal.py
_linalg_utils.py
_lobpcg.py
_lowrank.py
_meta_registrations.py [sparse] add extra options to _cslt_sparse_mm (#137427) 2024-11-27 05:32:45 +00:00
_namedtensor_internals.py
_ops.py remove redundant a (#139046) 2024-10-28 17:47:24 +00:00
_python_dispatcher.py
_size_docs.py
_sources.py
_storage_docs.py
_streambase.py Use torch.Stream&torch.Event for Dynamo capture (#134850) 2024-10-02 14:15:33 +00:00
_tensor_docs.py Revert "Add deterministic path for CUDA cumsum (#136224)" 2024-09-27 12:54:47 +00:00
_tensor_str.py
_tensor.py type annotations for meta_utils (#140203) 2024-11-13 20:07:47 +00:00
_thread_safe_fork.py [inductor] parallel compile: add import of thread_safe_fork for internal (#137155) 2024-10-03 17:37:21 +00:00
_torch_docs.py Clarify torch.arange floating-point rounding behavior (#141655) 2024-11-27 09:31:39 +00:00
_utils_internal.py Change export IR to non-functional pre-dispatch IR (#139511) 2024-11-20 21:47:55 +00:00
_utils.py [Device] Add mps as device type in torch._utils._get_available_device_type() (#141098) 2024-11-20 20:45:59 +00:00
_VF.py
_vmap_internals.py
_weights_only_unpickler.py Add small test case for #140230 (#140850) 2024-11-19 02:44:54 +00:00
abi-check.cpp
CMakeLists.txt Add torch.version.xpu (#139466) 2024-11-09 13:31:21 +00:00
custom_class_detail.h Remove some pre-cpp17 stuff (#138410) 2024-10-23 00:38:03 +00:00
custom_class.h Remove some pre-cpp17 stuff (#138410) 2024-10-23 00:38:03 +00:00
extension.h
functional.py Clarify opt-einsum usage, fix #127109 (#137596) 2024-10-09 20:31:24 +00:00
hub.py Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
library.h [1/N] Enable cppcoreguidelines-special-member-functions (#137405) 2024-10-23 00:16:53 +00:00
library.py no-op torch.library.custom_op APIs on torch.deploy (#139509) 2024-11-04 18:01:08 +00:00
overrides.py Add Weighted Loss Functions to PyTorch : WMSE, WMAE, and Weighted Huber Loss (#132049) 2024-10-31 21:59:43 +00:00
py.typed
quasirandom.py
random.py [Torch] Support meta device in random.fork_rng (#137715) 2024-10-16 18:00:39 +00:00
README.txt
return_types.py
script.h
serialization.py Allow NJT by default for weights_only torch.load (take 2) (#140739) 2024-11-19 02:44:53 +00:00
storage.py Fix .to(cpu) for Storage (#138011) 2024-10-23 01:31:48 +00:00
torch_version.py
types.py
version.py.tpl

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers serve double duty as *internal implementation
detail* headers, whose contents should largely not be used by external
clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.