pytorch/tools
Stephen Jia 545d2126f6 [pt-vulkan] Enable Python code blocks in shader templates and upgrade shader template generation (#115948)
Summary:
This change makes two major improvements to PyTorch Vulkan's shader authoring workflow.

## Review Guide

There are a lot of changed files because every GLSL shader had to be touched. The majority of the changes consist of changing

```
#define PRECISION $precision
#define FORMAT $format
```

to

```
#define PRECISION ${PRECISION}
#define FORMAT ${FORMAT}
```

due to changes in how shader templates are processed.
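
For context, the new `${...}` placeholder syntax behaves like Python's standard template substitution. As a rough illustration (the actual logic lives in `gen_vulkan_spv.py`; the parameter values below are made up), the expansion is equivalent to:

```python
# Minimal sketch of ${...} placeholder expansion using the standard library.
# Illustrative only -- not the actual gen_vulkan_spv.py implementation.
from string import Template

shader_src = """#define PRECISION ${PRECISION}
#define FORMAT ${FORMAT}
"""

variant_params = {"PRECISION": "highp", "FORMAT": "rgba16f"}
print(Template(shader_src).substitute(variant_params))
# #define PRECISION highp
# #define FORMAT rgba16f
```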

For reviewers, the primary functional changes to review are:

* `gen_vulkan_spv.py`
  * The majority of the functional changes are in this file, which controls how shader templates are processed.
* `shader_params.yaml`
  * Controls how shader variants are generated.

## Python Codeblocks in Shader Templates

From now on, every compute shader (i.e. every `.glsl` file) is treated as a shader template. To this end, the `templates/` folder has been removed, and there is now a global `shader_params.yaml` file describing the shader variants that should be generated for every shader template.

**Taking inspiration from XNNPACK's [`xngen` tool](https://github.com/google/XNNPACK/blob/master/tools/xngen.py), this change allows shader templates to use Python codeblocks**. One example is:

```
$if not INPLACE:
  layout(set = 0, binding = 0, FORMAT) uniform PRECISION restrict writeonly image3D uOutput;
  layout(set = 0, binding = 1) uniform PRECISION sampler3D uInput;
  layout(set = 0, binding = 2) uniform PRECISION sampler3D uOther;
  layout(set = 0, binding = 3) uniform PRECISION restrict Block {
    ivec4 output_sizes;
    ivec4 input_sizes;
    ivec4 other_sizes;
    float alpha;
  }
  uArgs;
$else:
  layout(set = 0, binding = 0, FORMAT) uniform PRECISION restrict image3D uOutput;
  layout(set = 0, binding = 1) uniform PRECISION sampler3D uOther;
  layout(set = 0, binding = 2) uniform PRECISION restrict Block {
    ivec4 output_sizes;
    ivec4 other_sizes;
    float alpha;
  }
  uArgs;
```

Another is:

```
  // PYTHON CODEBLOCK
  $if not IS_DIV:
    const int c_index = (pos.z % ((uArgs.output_sizes.z + 3) / 4)) * 4;
    if (uArgs.other_sizes.z != 1 && c_index + 3 >= uArgs.output_sizes.z) {
      ivec4 c_ind = ivec4(c_index) + ivec4(0, 1, 2, 3);
      vec4 mask = vec4(lessThan(c_ind, ivec4(uArgs.output_sizes.z)));
      other_texel = other_texel * mask + vec4(1, 1, 1, 1) - mask;
    }

  // PYTHON CODEBLOCK
  $if not INPLACE:
    ivec3 input_pos =
        map_output_pos_to_input_pos(pos, uArgs.output_sizes, uArgs.input_sizes);
    const vec4 in_texel =
        load_texel(input_pos, uArgs.output_sizes, uArgs.input_sizes, uInput);

    imageStore(uOutput, pos, OP(in_texel, other_texel, uArgs.alpha));
  $else:
    const vec4 in_texel = imageLoad(uOutput, pos);
    imageStore(uOutput, pos, OP(in_texel, other_texel, uArgs.alpha));
```

In addition to making shader templates easier and clearer to write, this allows shaders that previously could not be consolidated, such as the non-inplace and inplace variants of the same shader, to be expressed with a single template.
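
To give a feel for the mechanics, here is a deliberately simplified sketch of how single-level `$if`/`$else` codeblocks could be evaluated. The real processing in `gen_vulkan_spv.py` is more general; the two-space dedent and blank-line-terminates-block rules below are assumptions made for brevity:

```python
# Toy evaluator for single-level $if/$else codeblocks (illustrative only;
# not the actual gen_vulkan_spv.py implementation).
def preprocess(src: str, env: dict) -> str:
    out = []
    taking = None  # None = outside a codeblock; True/False = taken/skipped branch
    for line in src.splitlines():
        stripped = line.strip()
        if stripped.startswith("$if ") and stripped.endswith(":"):
            # Evaluate the condition against the variant's parameters.
            taking = bool(eval(stripped[4:-1], {}, dict(env)))
        elif stripped == "$else:":
            taking = not taking
        elif taking is None:
            out.append(line)
        elif not stripped:
            taking = None  # a blank line closes the codeblock
            out.append(line)
        elif taking:
            out.append(line[2:])  # drop the branch body's two-space indent
    return "\n".join(out)

# e.g. preprocess(glsl_template_source, {"INPLACE": 0, "IS_DIV": 0}) keeps the
# non-inplace branch and drops the $else branch entirely.
```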

## `generate_variant_forall` in shader variant YAML configuration

YAML files that describe how shader variants should be generated can now use a `generate_variant_forall` field to iterate over a set of values for a given parameter, generating one variant per value for each entry in `shader_variants`. Example:

```
unary_op:
  parameter_names_with_default_values:
    OPERATOR: exp(X)
    INPLACE: 0
  generate_variant_forall:
    INPLACE:
      - VALUE: 0
        SUFFIX: ""
      - VALUE: 1
        SUFFIX: "inplace"
  shader_variants:
    - NAME: exp
      OPERATOR: exp(X)
    - NAME: sqrt
      OPERATOR: sqrt(X)
    - NAME: log
      OPERATOR: log(X)
```

Previously, the `inplace` variants needed separate `shader_variants` entries. If multiple parameters are iterated over, all possible combinations are generated. Reviewers should take a look at the new YAML configuration to see how it works.
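
As a rough sketch of the expansion this implies (the field names mirror the YAML above, but the function shape and the `_`-joined suffix convention are assumptions, not the actual implementation):

```python
# Hypothetical expansion of generate_variant_forall: for each shader_variant,
# emit one variant per combination of the iterated parameter values.
from itertools import product

def expand_variants(config: dict) -> list:
    defaults = config["parameter_names_with_default_values"]
    axes = config.get("generate_variant_forall", {})
    keys = list(axes)
    variants = []
    for base in config["shader_variants"]:
        overrides = {k: v for k, v in base.items() if k != "NAME"}
        for combo in product(*(axes[k] for k in keys)):
            name, params = base["NAME"], {**defaults, **overrides}
            for key, setting in zip(keys, combo):
                params[key] = setting["VALUE"]
                if setting["SUFFIX"]:
                    name += "_" + setting["SUFFIX"]  # assumed naming convention
            variants.append((name, params))
    return variants

# For the unary_op config above, this would yield exp, exp_inplace, sqrt,
# sqrt_inplace, log, and log_inplace, each with INPLACE set accordingly.
```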

Test Plan:
This diff introduces no functional changes; we only need to make sure that the generated shaders are still correct. Therefore, running `vulkan_api_test` is sufficient.

```
# On Mac Laptop
buck run --target-platforms ovr_config//platform/macos:arm64-fbsource //xplat/caffe2:pt_vulkan_api_test_binAppleMac\#macosx-arm64 -c pt.vulkan_full_precision=1 -- --gtest_filter="*"
```

Reviewed By: digantdesai

Differential Revision: D52087084

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115948
Approved by: https://github.com/manuelcandales
2023-12-20 05:47:33 +00:00
Contents of `pytorch/tools`:

| Name | Last commit | Date |
| --- | --- | --- |
| `alerts` | | |
| `amd_build` | Revert "Initial Flash Attention support on ROCM (#114309)" (#115975) | 2023-12-16 03:40:14 +00:00 |
| `autograd` | SymInt'ify sparse_compressed_tensor (#107903) | 2023-12-17 17:36:20 +00:00 |
| `bazel_tools` | | |
| `build/bazel` | Bump urllib3 from 2.0.6 to 2.0.7 in /tools/build/bazel (#111435) | 2023-10-18 17:14:06 -07:00 |
| `build_defs` | Fix buck OSS build after #115570 (#115804) | 2023-12-14 08:33:07 +00:00 |
| `code_analyzer` | [BE]: Apply FURB145 to make code more readable and idiomatic. (#112990) | 2023-11-06 13:15:04 +00:00 |
| `code_coverage` | | |
| `config` | | |
| `coverage_plugins_package` | [BE] Enable ruff's UP rules and autoformat tools and scripts (#105428) | 2023-07-19 01:24:44 +00:00 |
| `dynamo` | Replaced deprecated pkg_resources.packaging with packaging module (#113023) | 2023-11-10 15:06:03 +00:00 |
| `gdb` | [BE] f-stringify torch/ and scripts (#105538) | 2023-07-21 19:35:24 +00:00 |
| `github` | | |
| `iwyu` | [2/N] Cleanup header inclusions in torch_cpu by iwyu (#109964) | 2023-11-19 20:56:32 +00:00 |
| `jit` | [BE] Enable ruff's UP rules and autoformat tools and scripts (#105428) | 2023-07-19 01:24:44 +00:00 |
| `linter` | [CI] Update clang-format (#116002) | 2023-12-18 14:58:46 +00:00 |
| `lite_interpreter` | removing some redundant str splits (#106089) | 2023-09-01 00:22:58 +00:00 |
| `lldb` | | |
| `onnx` | [torch/csrc/onnx] Use nested namespaces (3/N) (#113993) | 2023-11-18 00:20:19 +00:00 |
| `pyi` | AOTAutograd: keep input mutations in the graph if they are under no_grad, even if they require_grad (#114646) | 2023-11-29 04:29:32 +00:00 |
| `rules` | | |
| `rules_cc` | | |
| `setup_helpers` | Fix the Requirement of CMake Version (#106254) | 2023-08-02 08:02:52 +00:00 |
| `shared` | | |
| `stats` | [td] Consistent pytest cache (#113804) | 2023-11-17 23:45:47 +00:00 |
| `test` | [pt-vulkan] Enable Python code blocks in shader templates and upgrade shader template generation (#115948) | 2023-12-20 05:47:33 +00:00 |
| `testing` | [BE]: ruff FURB136: replace ternary with min/max (preview) (#114382) | 2023-11-22 22:10:01 +00:00 |
| `__init__.py` | | |
| `bazel.bzl` | | |
| `BUCK.bzl` | Use global variables to register the return_types namedtuples (#108832) | 2023-09-13 17:42:46 +00:00 |
| `BUCK.oss` | | |
| `build_libtorch.py` | | |
| `build_pytorch_libs.py` | | |
| `download_mnist.py` | [BE] f-stringify torch/ and scripts (#105538) | 2023-07-21 19:35:24 +00:00 |
| `extract_scripts.py` | | |
| `gen_flatbuffers.sh` | | |
| `gen_vulkan_spv.py` | [pt-vulkan] Enable Python code blocks in shader templates and upgrade shader template generation (#115948) | 2023-12-20 05:47:33 +00:00 |
| `generate_torch_version.py` | Enable import following in MYPYNOFOLLOW (now MYPYINDUCTOR) (#113830) | 2023-11-17 18:24:21 +00:00 |
| `generated_dirs.txt` | | |
| `git_add_generated_dirs.sh` | | |
| `git_reset_generated_dirs.sh` | | |
| `nightly.py` | removing some redundant str splits (#106089) | 2023-09-01 00:22:58 +00:00 |
| `nvcc_fix_deps.py` | | |
| `pytorch.version` | | |
| `README.md` | | |
| `render_junit.py` | | |
| `substitute.py` | [Codemod][python/main_function] caffe2: (#113357) | 2023-11-15 22:17:31 +00:00 |
| `update_masked_docs.py` | | |
| `vscode_settings.py` | | |

This folder contains a number of scripts which are used as part of the PyTorch build process. This directory also doubles as a Python module hierarchy (thus the `__init__.py`).

Overview

Modern infrastructure:

  • autograd - Code generation for autograd. This includes definitions of all our derivatives.
  • jit - Code generation for JIT.
  • shared - Generic infrastructure that scripts in tools may find useful.
    • module_loader.py - Makes it easier to import arbitrary Python files in a script, without having to add them to the PYTHONPATH first; a sketch of the general technique follows this list.
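
A minimal sketch of that technique using only the standard library (module_loader.py's actual interface may differ, and the usage path below is hypothetical):

```python
# Load a Python file as a module by path, without touching PYTHONPATH.
import importlib.util

def import_file(module_name: str, file_path: str):
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

# Hypothetical usage:
# helpers = import_file("helpers", "tools/shared/helpers.py")
```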

Build system pieces:

  • setup_helpers - Helper code for searching for third-party dependencies on the user system.
  • build_pytorch_libs.py - Cross-platform script that builds all of the constituent libraries of PyTorch, but not the PyTorch Python extension itself.
  • build_libtorch.py - Script for building libtorch, a standalone C++ library without Python support. This build script is tested in CI.

Developer tools which you might find useful:

Important if you want to run on AMD GPU:

  • amd_build - HIPify scripts, for transpiling CUDA into AMD HIP. Right now, PyTorch and Caffe2 share logic for how to do this transpilation, but have separate entry-points for transpiling either PyTorch or Caffe2 code; a toy illustration of the renaming follows this list.
    • build_amd.py - Top-level entry point for HIPifying our codebase.
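
At its core, HIPify is a source-to-source rename of CUDA identifiers to their HIP equivalents. A toy sketch of that idea (the mapping entries are real CUDA/HIP names, but the actual hipify scripts are far more thorough):

```python
# Toy HIPify: rename a handful of CUDA identifiers to their HIP equivalents.
import re

CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cudaStream_t": "hipStream_t",
}

def hipify(source: str) -> str:
    pattern = re.compile(r"\b(" + "|".join(CUDA_TO_HIP) + r")\b")
    return pattern.sub(lambda m: CUDA_TO_HIP[m.group(1)], source)

print(hipify("cudaMalloc(&ptr, n); cudaFree(ptr);"))
# hipMalloc(&ptr, n); hipFree(ptr);
```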

Tools which are only situationally useful: