pytorch/torch
Fritz Obermeyer 0745591855 Vectorize LowerCholeskyTransform (#24131)
Summary:
Removes older `torch.stack`-based logic in favor of `torch.diagonal()` and `torch.diag_embed()`.

I see a ~100x speedup in my application, where my batched matrix has shape `(800, 32, 32)`.
```py
import torch
from torch.distributions import constraints, transform_to
x = torch.randn(800, 32, 32, requires_grad=True)

# Before this PR:
%%timeit
transform_to(constraints.lower_cholesky)(x).sum().backward()
# 579 ms ± 34.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# After this PR:
%%timeit
transform_to(constraints.lower_cholesky)(x).sum().backward()
# 4.5 ms ± 241 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```
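The idea behind the speedup can be sketched without torch: operate on the whole batch of matrices at once with batched triangle/diagonal ops instead of assembling each matrix via a Python-level loop and `stack`. Below is a minimal NumPy analogue (the function name and the use of `exp` for diagonal positivity are illustrative assumptions, not the actual `LowerCholeskyTransform` implementation, which lives in `torch.distributions.transforms`):

```python
import numpy as np

def lower_cholesky_vectorized(x):
    # Keep the strict lower triangle and make the diagonal positive,
    # all as batched array ops -- no Python loop over the 800 matrices.
    tril = np.tril(x, k=-1)                            # strict lower triangle, batched
    diag = np.exp(np.diagonal(x, axis1=-2, axis2=-1))  # (800, 32) positive diagonals
    out = tril.copy()
    # Scatter the positive diagonal back in place (the NumPy analogue
    # of what torch.diag_embed() does for a batch of vectors).
    i = np.arange(x.shape[-1])
    out[..., i, i] = diag
    return out

x = np.random.randn(800, 32, 32)
L = lower_cholesky_vectorized(x)
# Each matrix in the batch is lower triangular with a positive diagonal:
assert np.allclose(np.triu(L, k=1), 0)
assert (np.diagonal(L, axis1=-2, axis2=-1) > 0).all()
```

Because every step (`tril`, `diagonal`, the diagonal scatter) maps over the leading batch dimension in native code, the cost no longer scales with a per-matrix Python loop, which is where the old `torch.stack`-based path lost its time.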
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24131

Differential Revision: D16764035

Pulled By: ezyang

fbshipit-source-id: 170cdb0d924cdc94cd5ad3b75d1427404718d437
2019-08-15 06:46:19 -07:00
_thnn Turn on F401: Unused import warning. (#18598) 2019-03-30 09:01:17 -07:00
autograd Added torch.autograd.profiler.record_function() as context manager. (#23428) 2019-07-30 11:10:01 -07:00
backends Fix cuda and cudnn libraries search process on Windows (#20205) 2019-05-08 06:08:47 -07:00
contrib Remove GraphExecutor's python bindings (#19141) 2019-04-13 08:42:24 -07:00
csrc python udf over rpc (#23569) 2019-08-14 23:13:33 -07:00
cuda Let set_rng_state and get_rng_state accept string parameter (#23448) 2019-07-29 08:08:39 -07:00
distributed python udf over rpc (#23569) 2019-08-14 23:13:33 -07:00
distributions Vectorize LowerCholeskyTransform (#24131) 2019-08-15 06:46:19 -07:00
for_onnx Turn on F401: Unused import warning. (#18598) 2019-03-30 09:01:17 -07:00
jit fix lint 2019-08-14 17:37:39 -07:00
legacy Remove torch/legacy (#11823) 2018-09-20 14:00:54 -07:00
lib cleanup warnings 2019-08-12 16:12:30 -07:00
multiprocessing Add multiprocessing_context= argument to DataLoader (#22990) 2019-07-29 12:58:40 -07:00
nn test {__init__,from_float} on nnq{,d}.Linear 2019-08-14 17:42:23 -07:00
onnx Fix validation of dynamic axes names (#23974) 2019-08-13 16:33:27 -07:00
optim reduce memory usage for centered rmsprop (#24170) 2019-08-13 12:18:31 -07:00
quantization Remove the activation observer for default_qconfig (#24299) 2019-08-14 17:21:50 -07:00
sparse Correct conv and pooling docstrings in nn module (#17052) 2019-02-15 06:58:02 -08:00
testing Fix get_all_math_dtypes for device='cuda' retuning None (#23028) 2019-07-19 09:29:16 -07:00
utils Remove hard Caffe2 dependency for TensorBoard (#24295) 2019-08-13 20:33:24 -07:00
__config__.py Allow a non-OpenMP based build (#19749) 2019-05-06 19:34:48 -07:00
__future__.py Add torch.__future__._overwrite_module_params_on_conversion global flag, and check it in nn.Module._apply() (#21613) 2019-06-19 10:30:02 -07:00
__init__.py Updated docs and added deprecation warnings to acknowledge a bool tensor (#22261) 2019-08-05 07:42:34 -07:00
__init__.pyi.in Updated docs and added deprecation warnings to acknowledge a bool tensor (#22261) 2019-08-05 07:42:34 -07:00
_classes.py Initial torchbind prototype (#21098) 2019-08-02 18:45:15 -07:00
_jit_internal.py Add the ability to compile exports on traced modules (#24298) 2019-08-14 13:51:22 -07:00
_ops.py Initial torchbind prototype (#21098) 2019-08-02 18:45:15 -07:00
_six.py Finished the high-priority functions (#21127) 2019-06-04 17:59:05 -07:00
_storage_docs.py Enabled BFloat16 storage (#21523) 2019-07-09 21:51:06 -07:00
_tensor_docs.py Updated docs and added deprecation warnings to acknowledge a bool tensor (#22261) 2019-08-05 07:42:34 -07:00
_tensor_str.py Add names to repr for named tensors 2019-08-02 11:37:29 -07:00
_torch_docs.py Fix docstring for argmax (#23775) 2019-08-07 09:42:19 -07:00
_utils_internal.py Override the resolve_library_path in FBCode (#17497) 2019-03-12 22:09:24 -07:00
_utils.py Catch and print exception traceback in parallel_apply() workers (#18055) 2019-07-26 11:41:22 -07:00
abi-check.cpp Fixes for Torch Script C++ API (#11682) 2018-09-17 09:54:50 -07:00
CMakeLists.txt python udf over rpc (#23569) 2019-08-14 23:13:33 -07:00
custom_class.h search class type for methods (#23689) 2019-08-12 20:29:45 -07:00
extension.h Remove deprecated variable_tensor_functions (#15003) 2018-12-11 17:16:11 -08:00
functional.py Implement tensor.align_to(names), torch.align_tensors(*tensors) (#23804) 2019-08-14 09:40:27 -07:00
hub.py Use dst dir for temp file (#23629) 2019-07-31 19:04:03 -07:00
namedtensor.py Update tensor.view_names / tensor.names_ API (#23973) 2019-08-14 09:40:35 -07:00
py.typed More type stubs (#18511) 2019-04-01 16:03:58 -07:00
quasirandom.py Introduce SobolEngine (#10505) 2019-03-26 07:53:07 -07:00
random.py Refactor Random Number Generators in ATen (#21555) 2019-06-19 13:54:09 -07:00
README.txt
script.h Add Pickler C++ API (#23241) 2019-08-12 14:43:31 -07:00
serialization.py fix error message 2019-07-18 23:38:55 -07:00
storage.py Enabled BFloat16 storage (#21523) 2019-07-09 21:51:06 -07:00
tensor.py Update tensor.view_names / tensor.names_ API (#23973) 2019-08-14 09:40:35 -07:00

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers also serve as *internal implementation detail*
headers, whose contents should largely not be used by external clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.