pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Yutao Xu 79fb7416e7 [Intel GPU] Add device guard for XPU structured operator in torchgen (#138802)

This PR is a supplement to https://github.com/pytorch/pytorch/pull/133980. The previous PR fulfilled the basic functionality of the XPU device guard, but we found that it fails to address structured operators. With the current PR, the code snippet in RegisterXPU.cpp is as follows, where we can see that the device guard is successfully generated.

```c++
struct structured_exp_out_functional final : public at::native::structured_exp_out {
    void set_output_strided(
        int64_t output_idx, IntArrayRef sizes, IntArrayRef strides,
        TensorOptions options, DimnameList names
    ) override {
        auto current_device = guard_.current_device();
        if (C10_UNLIKELY(current_device.has_value())) {
          TORCH_INTERNAL_ASSERT(current_device == options.device(),
            "structured kernels don't support multi-device outputs");
        } else {
          guard_.reset_device(options.device());
        }
        outputs_[output_idx] = create_out(sizes, strides, options);
        if (!names.empty()) {
          namedinference::propagate_names(outputs_[output_idx], names);
        }
        // super must happen after, so that downstream can use maybe_get_output
        // to retrieve the output
        at::native::structured_exp_out::set_output_raw_strided(output_idx, sizes, strides, options, names);
    }
    void set_output_raw_strided(
        int64_t output_idx, IntArrayRef sizes, IntArrayRef strides,
        TensorOptions options, DimnameList names
    ) override {
        auto current_device = guard_.current_device();
        if (C10_UNLIKELY(current_device.has_value())) {
          TORCH_INTERNAL_ASSERT(current_device == options.device(),
            "structured kernels don't support multi-device outputs");
        } else {
          guard_.reset_device(options.device());
        }
        outputs_[output_idx] = create_out(sizes, strides, options);
        if (!names.empty()) {
          namedinference::propagate_names(outputs_[output_idx], names);
        }
        // super must happen after, so that downstream can use maybe_get_output
        // to retrieve the output
        at::native::structured_exp_out::set_output_raw_strided(output_idx, sizes, strides, options, names);
    }
    const Tensor& maybe_get_output(int64_t output_idx) override {
        return outputs_[output_idx];
    }
    std::array<Tensor, 1> outputs_;
    c10::OptionalDeviceGuard guard_;
};
```

However, without the current change, the generated code is

```c++
struct structured_exp_out_functional final : public at::native::structured_exp_out {
    void set_output_strided(
        int64_t output_idx, IntArrayRef sizes, IntArrayRef strides,
        TensorOptions options, DimnameList names
    ) override {
        outputs_[output_idx] = create_out(sizes, strides, options);
        if (!names.empty()) {
          namedinference::propagate_names(outputs_[output_idx], names);
        }
        // super must happen after, so that downstream can use maybe_get_output
        // to retrieve the output
        at::native::structured_exp_out::set_output_raw_strided(output_idx, sizes, strides, options, names);
    }
    void set_output_raw_strided(
        int64_t output_idx, IntArrayRef sizes, IntArrayRef strides,
        TensorOptions options, DimnameList names
    ) override {
        outputs_[output_idx] = create_out(sizes, strides, options);
        if (!names.empty()) {
          namedinference::propagate_names(outputs_[output_idx], names);
        }
        // super must happen after, so that downstream can use maybe_get_output
        // to retrieve the output
        at::native::structured_exp_out::set_output_raw_strided(output_idx, sizes, strides, options, names);
    }
    const Tensor& maybe_get_output(int64_t output_idx) override {
        return outputs_[output_idx];
    }
    std::array<Tensor, 1> outputs_;
};
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138802
Approved by: https://github.com/EikanWang, https://github.com/guangyey, https://github.com/ezyang
2024-11-13 05:40:38 +00:00
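
The guard logic in the generated snippet can be tried outside PyTorch with a small stand-in. The sketch below is only an illustration of the control flow: the first output pins the device, and later outputs must match it. `Device`, `active_device`, `OptionalGuard`, and `OutputSetter` are hypothetical names invented for this example. They are not the `c10` types and are not part of torchgen's output.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a device identifier (not c10::Device).
struct Device {
    std::string type;
    int index;
    bool operator==(const Device& other) const {
        return type == other.type && index == other.index;
    }
};

// Hypothetical process-wide "current device" that the guard switches and restores.
inline Device& active_device() {
    static Device device{"xpu", 0};
    return device;
}

// Simplified analogue of an optional device guard: it starts empty, switches the
// active device on reset_device(), and restores the original one on destruction.
class OptionalGuard {
public:
    OptionalGuard() = default;
    OptionalGuard(const OptionalGuard&) = delete;
    OptionalGuard& operator=(const OptionalGuard&) = delete;
    ~OptionalGuard() {
        if (original_.has_value()) {
            active_device() = *original_;
        }
    }

    // Empty until reset_device() has been called at least once.
    std::optional<Device> current_device() const { return current_; }

    void reset_device(const Device& device) {
        if (!original_.has_value()) {
            original_ = active_device();  // remember what to restore later
        }
        active_device() = device;
        current_ = device;
    }

private:
    std::optional<Device> original_;
    std::optional<Device> current_;
};

// Mirrors the branch added to set_output_strided / set_output_raw_strided.
struct OutputSetter {
    OptionalGuard guard;

    void set_output(const Device& output_device) {
        auto current = guard.current_device();
        if (current.has_value()) {
            // A device is already pinned: every later output must live on it.
            if (!(*current == output_device)) {
                throw std::logic_error(
                    "structured kernels don't support multi-device outputs");
            }
        } else {
            // First output: pin the guard to this output's device.
            guard.reset_device(output_device);
        }
    }
};
```

With this stand-in, constructing an `OutputSetter`, calling `set_output(Device{"xpu", 1})` and then calling it again with a different device throws, while destroying the setter restores the previously active device. In the generated code without the guard member, neither the device switch nor the consistency check takes place.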
..
_autoheuristic	[BE] Format `.ci/` / `.github/` / `benchmarks/` / `functorch/` / `tools/` / `torchgen/` with `ruff format` (#132577)	2024-10-11 18:30:26 +00:00
aoti	[aoti] Add masked_select to cshim (#139071)	2024-10-31 21:52:53 +00:00
api	[BE] Format `.ci/` / `.github/` / `benchmarks/` / `functorch/` / `tools/` / `torchgen/` with `ruff format` (#132577)	2024-10-11 18:30:26 +00:00
decompositions	[BE][Easy] eliminate relative import in `torchgen` (#128872)	2024-06-21 14:11:46 +00:00
dest	[Intel GPU] Add device guard for XPU structured operator in torchgen (#138802)	2024-11-13 05:40:38 +00:00
executorch	[Reland][7/N] Fix Wextra-semi warning (#140342)	2024-11-12 18:55:31 +00:00
fuse	[BE] update type annotations for basic utilities in `torch/__init__.py` (#129001)	2024-06-24 18:04:38 +00:00
operator_versions	[BE] Format `.ci/` / `.github/` / `benchmarks/` / `functorch/` / `tools/` / `torchgen/` with `ruff format` (#132577)	2024-10-11 18:30:26 +00:00
selective_build	[BE][Easy] enable postponed annotations in `torchgen` (#129376)	2024-06-29 09:23:39 +00:00
shape_functions	[BE][Easy] enable postponed annotations in `torchgen` (#129376)	2024-06-29 09:23:39 +00:00
static_runtime	[6/N] Fix Wextra-semi warning (#139605)	2024-11-04 13:43:16 +00:00
__init__.py
BUCK.oss
BUILD.bazel
build.bzl
code_template.py	[BE][Easy] enable postponed annotations in `torchgen` (#129376)	2024-06-29 09:23:39 +00:00
context.py	[BE] Format `.ci/` / `.github/` / `benchmarks/` / `functorch/` / `tools/` / `torchgen/` with `ruff format` (#132577)	2024-10-11 18:30:26 +00:00
gen_aoti_c_shim.py	[AOTI] Introduce an extensibility mechanism for the c shim codegen to make it easy to produce c shims for out-of-tree OP kernels as well. Add c_shim for XPU. (#136742)	2024-11-09 13:19:52 +00:00
gen_backend_stubs.py	[Reland][7/N] Fix Wextra-semi warning (#140342)	2024-11-12 18:55:31 +00:00
gen_executorch.py	[BE] Format `.ci/` / `.github/` / `benchmarks/` / `functorch/` / `tools/` / `torchgen/` with `ruff format` (#132577)	2024-10-11 18:30:26 +00:00
gen_functionalization_type.py	[BE] Format `.ci/` / `.github/` / `benchmarks/` / `functorch/` / `tools/` / `torchgen/` with `ruff format` (#132577)	2024-10-11 18:30:26 +00:00
gen_lazy_tensor.py	[BE] Format `.ci/` / `.github/` / `benchmarks/` / `functorch/` / `tools/` / `torchgen/` with `ruff format` (#132577)	2024-10-11 18:30:26 +00:00
gen_schema_utils.py	[HOP] support generating schema for hop (#133521)	2024-08-21 17:34:21 +00:00
gen_vmap_plumbing.py	Added batching rule for sdpa_math, sdpa_efficient_attention forward, cudnn, and flash attention (#133964)	2024-08-22 05:29:49 +00:00
gen.py	[Reland][7/N] Fix Wextra-semi warning (#140342)	2024-11-12 18:55:31 +00:00
local.py	[BE][Easy] enable postponed annotations in `torchgen` (#129376)	2024-06-29 09:23:39 +00:00
model.py	[Intel GPU] Support RegisterSparseXPU.cpp codegen. (#139267)	2024-11-13 01:41:43 +00:00
native_function_generation.py	[BE][Ez]: Use interned hardcoded string FURB156 (#138330)	2024-10-18 18:26:16 +00:00
utils.py	[torchgen] reference generated comment to actual location of the generator and template (#130020)	2024-07-05 21:47:14 +00:00
yaml_utils.py