Jesse Cai cf5262a84f [core][pruning][sparse][feature] SparseSemiStructured tensor subclass (#102135)
This PR adds in support for semi-structured sparsity via a tensor
subclass. It currently uses the CUTLASS kernels merged in PR #100881.

In the future we plan to add in cuSPARSELt support (see the other PRs in
the stack), which will give us larger performance gains.

This PR adds in 2 things:
- a Tensor subclass, `SparseSemiStructuredTensor` to store the
  sparse tensor in compressed form and override `__torch_dispatch__`.
- a conversion function that takes in a dense tensor and a
  semi-structured sparse bool mask and creates an instance of the
  subclass.

**SparseSemiStructuredTensor**

The subclass stores the dense tensor in a contiguous flattened tensor
for future compatibility with cuSPARSELt, which expects this format.
Note that the CUTLASS kernels do not have this limitation, as the
specified values and the metadata are passed separately in
`_structured_sparse_linear`. In the future we can use the cuSPARSELt bindings
[here](https://github.com/pytorch/pytorch/pull/103700) for faster matmul, better dtype coverage, and relaxed shape
constraints.
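
For illustration, a minimal sketch (hypothetical, not the actual implementation) of the wrapper-subclass pattern described above: the logical dense shape is advertised while the data lives in the flattened compressed tensor, and all ops are intercepted through `__torch_dispatch__`:

```
import torch

class _SemiStructuredSketch(torch.Tensor):
    """Simplified stand-in for SparseSemiStructuredTensor (illustrative only)."""

    @staticmethod
    def __new__(cls, compressed, original_shape):
        # Wrapper subclass: advertise the logical (dense) shape, while the
        # actual data lives in the flattened compressed tensor below.
        return torch.Tensor._make_wrapper_subclass(
            cls, original_shape, dtype=compressed.dtype, device=compressed.device
        )

    def __init__(self, compressed, original_shape):
        self.compressed = compressed  # contiguous, flattened compressed values

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        # The real subclass routes mm/addmm/linear/t to the sparse CUTLASS
        # kernels here; everything else raises.
        raise NotImplementedError(f"{func} is not supported in this sketch")
```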

Since we currently don't have a way to go back from the sparse
representation to the dense representation, and we store the weights in
compressed form, we don't have a great way to handle `.t()`.

Instead, we keep track of how often we've called transpose on our
tensor, and if it's an unexpected number we throw an error. When the first
argument is sparse, we expect an even number of calls to transpose,
while when the second argument is sparse, we expect an odd number of
calls. This is because we support second argument sparse matrix
multiplications by using transpose properties.
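
As a concrete illustration (with ordinary dense tensors) of the transpose property this relies on:

```
import torch

A = torch.randn(4, 8)
B = torch.randn(8, 16)

# A @ B can be rewritten as (B.t() @ A.t()).t(). This is how a matmul with a
# sparse *second* argument is rerouted through the sparse-first-argument
# kernel, and the extra .t() calls are what the parity check above counts.
assert torch.allclose(A @ B, (B.t() @ A.t()).t(), atol=1e-5)
```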

**to_sparse_semi_structured**

This is a conversion function to convert a dense tensor and a
semi-structured sparse bool mask into an instance of the subclass. Currently, we must
pass in a bool mask, since we can't infer it: there may be
additional zero elements in the dense tensor, so `tensor != 0` is not 2:4
sparse.
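
For example (arbitrary values): a 2:4 semi-structured pattern keeps 2 of every 4 contiguous elements in a row, and extra zeros in the dense tensor break that invariant:

```
import torch

w = torch.tensor([[1., 0., 0., 0.],   # only 1 nonzero in this group of 4
                  [3., 0., 4., 0.]])  # 2 nonzeros: a valid 2:4 group
nonzeros_per_group = (w != 0).reshape(-1, 4).sum(dim=-1)
print(nonzeros_per_group)  # tensor([1, 2]) -- the first group is not 2:4
```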

Once we add either a method to derive the mask from the dense tensor or
cuSPARSELt support, we will no longer need to pass in the mask. cuSPARSELt has its
own helper functions to create the metadata mask.

**User Details**

We have implemented support for the following ops for `torch.float16`
and `torch.int8`:
```
torch.addmm(bias, dense, sparse.t())
torch.mm(dense, sparse)
torch.mm(sparse, dense)
aten.linear.default
aten.t.default
aten.t.detach
```

The end-user interface to accelerate an `nn.Linear` module with the
subclass would look like this:

```
import torch
from torch import nn
from torch.sparse import to_sparse_semi_structured

# 2:4 mask: keep 2 of every 4 contiguous elements in each row
mask = torch.Tensor([0, 0, 1, 1]).tile(128, 32).cuda().bool()
linear = nn.Linear(128, 128).half().cuda()

linear.weight = nn.Parameter(to_sparse_semi_structured(linear.weight, mask=mask))
```
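
After the swap, the module can be used as before; for example (continuing the hypothetical snippet above):

```
x = torch.rand(64, 128).half().cuda()
out = linear(x)  # dispatches aten.linear.default to the sparse kernel
```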

This also updates tests and the `torch.sparse` module docstring to
reflect these changes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102135
Approved by: https://github.com/albanD
2023-06-26 21:30:43 +00:00