pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Gao, Xiang	a47749cb28	Add at::one_hot (#15208 ) Summary: Closes: https://github.com/pytorch/pytorch/issues/15060 Differential Revision: D13528014 Pulled By: ezyang fbshipit-source-id: 5a18689a4c5638d92f9390c91517f741e5396293	2018-12-20 14:24:58 -08:00
vishwakftw	41e7e1bc40	Rename potrs to cholesky_solve (#15334 ) Summary: Changelog: - Renames `potrs` to `cholesky_solve` to remain consistent with TensorFlow and SciPy (not really, they call their function chol_solve) - Default argument for `upper` in `cholesky_solve` is False. This allows a seamless interface between `cholesky` and `cholesky_solve`, since the `upper` argument means the same thing in both functions. - Rename all tests - Create a tentative alias for `cholesky_solve` under the name `potrs`, and add a deprecation warning to discourage its use. Pull Request resolved: https://github.com/pytorch/pytorch/pull/15334 Differential Revision: D13507724 Pulled By: soumith fbshipit-source-id: b826996541e49d2e2bcd061b72a38c39450c76d0	2018-12-19 12:31:24 -08:00
Krishna Kalyan	c51c825efe	Delete ffi documentation (#15220 ) Summary: Deleting FFI documentation since it's deprecated. Differential Revision: D13477329 Pulled By: soumith fbshipit-source-id: 0b3d485eb7cef1f05b6b397dff50f21a49d6409e	2018-12-15 09:49:02 -08:00
David Riazati	e9fb4d1f11	Fix jit doc codeblocks and tables (#15227 ) Summary: Some of the codeblocks were showing up as normal text and the "unsupported modules" table was formatted incorrectly Pull Request resolved: https://github.com/pytorch/pytorch/pull/15227 Differential Revision: D13468847 Pulled By: driazati fbshipit-source-id: eb7375710d4f6eca1d0f44dfc43c7c506300cb1e	2018-12-14 14:27:56 -08:00
David Riazati	5837320b70	Add script standard library documentation + cleanup (#14912 ) Summary: Documents what is supported in the script standard library. * Adds `my_script_module._get_method('forward').schema()` method to get function schema from a `ScriptModule` * Removes `torch.nn.functional` from the list of builtins. The only functions not supported are `nn.functional.fold` and `nn.functional.unfold`, but those currently just dispatch to their corresponding aten ops, so from a user's perspective it looks like they work. * Allow printing of `IValue::Device` by getting its string representation Pull Request resolved: https://github.com/pytorch/pytorch/pull/14912 Differential Revision: D13385928 Pulled By: driazati fbshipit-source-id: e391691b2f87dba6e13be05d4aa3ed2f004e31da	2018-12-12 12:30:13 -08:00
Michael Carilli	5d3a347685	Stashing checkpointing RNG states based on devices of arg tensors (#14518 ) Summary: This PR intends to address apaszke's concerns in https://github.com/pytorch/pytorch/pull/14253#issuecomment-441740016. Preserving the rng state is now controlled by a kwarg rather than a global state, hopefully in a python 2.7-compatible way. Additionally, the checkpointing function stashes and restores the RNG states of (1) devices associated with all input tensor args to run_fn and (2) the current device. I could easily change this to save and restore only the RNG states associated with (1). This would simplify the logic to create a [deduplicated, ordered](https://github.com/pytorch/pytorch/compare/master...mcarilli:checkpointing_rng_touchup?expand=1#diff-58da227fc9b1d56752b7dfad90428fe0R37) list of devices considered active. I'm wondering if the [get_device_states](https://github.com/pytorch/pytorch/compare/master...mcarilli:checkpointing_rng_touchup?expand=1#diff-58da227fc9b1d56752b7dfad90428fe0R32) and [set_device_states](https://github.com/pytorch/pytorch/compare/master...mcarilli:checkpointing_rng_touchup?expand=1#diff-58da227fc9b1d56752b7dfad90428fe0R47) functions are general enough to reside elsewhere (presumably torch/random.py). I'm also wondering if the check on [torch.cuda._initialized](https://github.com/pytorch/pytorch/compare/master...mcarilli:checkpointing_rng_touchup?expand=1#diff-58da227fc9b1d56752b7dfad90428fe0R47) would be better placed within `get_device_states`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14518 Differential Revision: D13356210 Pulled By: ezyang fbshipit-source-id: afa4cc21ce7862142d5cb1dec3750018df222039	2018-12-11 09:48:45 -08:00
Michael Suo	25144c8a09	s/Torch Script/TorchScript/g (#15011 ) Summary: pls Pull Request resolved: https://github.com/pytorch/pytorch/pull/15011 Differential Revision: D13404158 Pulled By: suo fbshipit-source-id: e906281463d65c86e4e9073eb0c0a26f4f29e307	2018-12-10 13:48:24 -08:00
James Reed	459aac4f24	Update graph printouts in JIT docs (#14914 ) Summary: Tracing records variable names and we have new types and stuff in the IR, so this updates the graph printouts in the docs Pull Request resolved: https://github.com/pytorch/pytorch/pull/14914 Differential Revision: D13385101 Pulled By: jamesr66a fbshipit-source-id: 6477e4861f1ac916329853763c83ea157be77f23	2018-12-07 15:08:53 -08:00
Ailing Zhang	5734e96775	Improve hub documentation (#14862 ) Summary: Added a few examples and explanations of how to publish/load models. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14862 Differential Revision: D13384790 Pulled By: ailzhang fbshipit-source-id: 008166e84e59dcb62c0be38a87982579524fb20e	2018-12-07 14:59:01 -08:00
vishwakftw	1c9df7facf	Expose torch.roll function and method (#14880 ) Summary: Fixes #14859 . Differential Revision: D13376915 Pulled By: zou3519 fbshipit-source-id: f1fc0e8492a159431a3fc0a19a41aa10429ecc80	2018-12-07 07:42:47 -08:00
Xiang Gao	3799d32b7b	Optimize images (#14084 ) Summary: This is a PR that [ImgBot](https://imgbot.net/) opened on my fork https://github.com/zasdfgbnm/pytorch/pull/1, I forward it here. ImgBot does lossless compression on images to reduce file size. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14084 Differential Revision: D13356293 Pulled By: ezyang fbshipit-source-id: 731236d95ad870db8ccb99b03ed306704365242c	2018-12-05 22:46:32 -08:00
Brendan Soffientini	2d60afbc90	Remove outdated css file and refs in cpp conf.py (#14779 ) Summary: pytorch_theme.css is no longer necessary for the cpp or html docs site build. The new theme styles are located at https://github.com/pytorch/pytorch_sphinx_theme. The Lato font is also no longer used in the new theme. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14779 Differential Revision: D13356125 Pulled By: ezyang fbshipit-source-id: c7635eb7512c7dcaddb9cad596ab3dbc96480144	2018-12-05 21:55:45 -08:00
peterjc123	e1eb32d9f1	Update magma to 2.4.0 for Windows Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14738 Differential Revision: D13341611 Pulled By: soumith fbshipit-source-id: 39a49fc60e710cc32a463858c9cee57c182330e2	2018-12-05 09:53:39 -08:00
Teng Li	2d3cf98b49	Making dist.get_default_group private for PT1 release (#14767 ) Summary: When I wrote the frontend API, it was designed not to let users use the default_group directly in any function. It should really be private. All collectives are supposed to use either group.WORLD or a group returned by new_group. That was the initial design. We need to add a TODO to remove group.WORLD one day. It exists for backward compatibility reasons and adds lots of complexity. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14767 Reviewed By: pietern Differential Revision: D13330655 Pulled By: teng-li fbshipit-source-id: ace107e1c3a9b3910a300b22815a9e8096fafb1c	2018-12-04 19:22:24 -08:00
Wei Yang	5ee8312b63	sparse.mm(), reland #14526 (#14661 ) Summary: - reland reverted PR #14526 with doc fixes Pull Request resolved: https://github.com/pytorch/pytorch/pull/14661 Differential Revision: D13289047 Pulled By: weiyangfb fbshipit-source-id: 5b843a11a58b56aeada3af2680a27cf89ecef4d8	2018-12-03 10:39:27 -08:00
Alyssa Wang	1c21dc6e16	Revert D13252990: [pytorch][PR] [sparse] sparse.mm(S, D) Differential Revision: D13252990 Original commit changeset: 8fdb14144405 fbshipit-source-id: 49b8b0759a6e647854689962ffa72a205b4a2088	2018-11-30 18:53:47 -08:00
Wei Yang	c3a2b1e155	sparse.mm(S, D) (#14526 ) Summary: - add `sparse.mm(S, D)` with backward - for `sparse.addmm()`, relax input constraint so that sparse matrix input doesn't have to coalesced Pull Request resolved: https://github.com/pytorch/pytorch/pull/14526 Reviewed By: ezyang Differential Revision: D13252990 Pulled By: weiyangfb fbshipit-source-id: 8fdb14144405a2122d4b8447ad4055cd0330e6e8	2018-11-30 14:15:34 -08:00
Pieter Noordhuis	3648c269e9	Misc distributed documentation updates (#14605 ) Summary: * s/environmental/environment/g * Casing (CUDA, InfiniBand, Ethernet) * Don't embed torch.multiprocessing.spawn but link to it (not part of the package) * spawn _function_ instead of _utility_ (it's mentioned after the launch utility which is a proper utility) Pull Request resolved: https://github.com/pytorch/pytorch/pull/14605 Differential Revision: D13273480 Pulled By: pietern fbshipit-source-id: da6b4b788134645f2dcfdd666d1bbfc9aabd97b1	2018-11-29 21:51:43 -08:00
Teng Li	2b7345bcd5	PT1 distributed doc update (#14530 ) Summary: Removed an incorrect section. We don't support this. I wrote this from my memory :( Pull Request resolved: https://github.com/pytorch/pytorch/pull/14530 Differential Revision: D13253471 Pulled By: teng-li fbshipit-source-id: c3f1ffc6c98ef8789157e885776e0b775ec47b15	2018-11-29 17:50:47 -08:00
albanD	f80d34a1c8	Update Tensor doc (#14339 ) Summary: Add to the Tensor doc info about `.device`, `.is_cuda`, `.requires_grad`, `.is_leaf` and `.grad`. Update the `register_backward_hook` doc with a warning stating that it does not work in all cases. Add support in the `_add_docstr` function to add docstring to attributes. There is an explicit cast here but I am not sure how to handle it properly. The thing is that the doc field for getsetdescr is written as being a const char * (as all other doc fields in descriptors objects) in cpython online documentation. But in the code, it is the only one that is not const. I assumed here that it is a bug in the code because it does not follow the doc and the convention of the others descriptors and so I cast out the const. EDIT: the online doc I was looking at is for 3.7 and in that version both the code and the doc are const. For older versions, both are non const. Please let me know if this should not be done. And if it should be done if there is a cleaner way to do it ! Pull Request resolved: https://github.com/pytorch/pytorch/pull/14339 Differential Revision: D13243266 Pulled By: ezyang fbshipit-source-id: 75b7838f7cd6c8dc72b0c61950e7a971baefaeeb	2018-11-28 15:28:17 -08:00
Wei Yang	be7c618fd7	torch.sparse.sum() (#12430 ) Summary:
- to fix #12241
- add `_sparse_sum()` to ATen, and expose as `torch.sparse.sum()`, not support `SparseTensor.sum()` currently
- this PR depends on #11253, and will need to be updated once it lands
- [x] implement forward
- [x] implement backward
- performance [benchmark script](https://gist.github.com/weiyangfb/f4c55c88b6092ef8f7e348f6b9ad8946#file-sparse_sum_benchmark-py):
- sum all dims is fastest for sparse tensor
- when input is sparse enough nnz = 0.1%, sum of sparse tensor is faster than dense in CPU, but not necessarily in CUDA
- CUDA backward is comparable (<2x) between `sum several dims` vs `sum all dims` in sparse
- CPU backward uses binary search and is still slow in sparse, takes `5x` time in `sum [0, 2, 3] dims` vs `sum all dims`
- optimize CUDA backward for now
- using thrust for sort and binary search, but runtime not improved
- both of CPU and CUDA forward are slow in sparse (`sum several dims` vs `sum all dims`), at most `20x` slower in CPU, and `10x` in CUDA
- improve CPU and CUDA forward kernels

(nnz, sizes, sum_dims, keepdim, sum all or dims, bk=backward) | CPU (sparse vs dense) | CUDA (sparse vs dense)
-- | -- | --
(1000, [1000, 1000, 2, 2], [0, 1], False, sumAll) | 8.77 µs vs 72.9 µs | 42.5 µs vs 108 µs
(1000, [1000, 1000, 2, 2], [0, 1], False, sumD) | 112 µs vs 4.47 ms | 484 µs vs 407 µs
(1000, [1000, 1000, 2, 2], [0, 1], False, sumAll, bk) | 141 µs vs 148 µs | 647 µs vs 231 µs
(1000, [1000, 1000, 2, 2], [0, 1], False, sumD, bk) | 235 µs vs 1.23 ms | 781 µs vs 213 µs
(1000, [1000, 1000, 2, 2], [2, 3], False, sumD) | 48.5 µs vs 360 µs | 160 µs vs 2.03 ms
(1000, [1000, 1000, 2, 2], [2, 3], False, sumD, bk) | 258 µs vs 1.22 ms | 798 µs vs 224 µs
(1000, [1000, 1000, 2, 2], [0, 2, 3], False, sumD) | 204 µs vs 882 µs | 443 µs vs 133 µs
(1000, [1000, 1000, 2, 2], [0, 2, 3], False, sumD, bk) | 709 µs vs 1.15 ms | 893 µs vs 202 µs
(10000, [1000, 1000, 2, 2], [0, 1], False, sumAll) | 39.8 µs vs 81 µs | 42.4 µs vs 113 µs
(10000, [1000, 1000, 2, 2], [0, 1], False, sumD) | 747 µs vs 4.7 ms | 2.4 ms vs 414 µs
(10000, [1000, 1000, 2, 2], [0, 1], False, sumAll, bk) | 1.04 ms vs 126 µs | 5.03 ms vs 231 µs
(10000, [1000, 1000, 2, 2], [0, 1], False, sumD, bk) | 1.12 ms vs 1.24 ms | 5.99 ms vs 213 µs
(10000, [1000, 1000, 2, 2], [2, 3], False, sumD) | 133 µs vs 366 µs | 463 µs vs 2.03 ms
(10000, [1000, 1000, 2, 2], [2, 3], False, sumD, bk) | 1.56 ms vs 1.22 ms | 6.11 ms vs 229 µs
(10000, [1000, 1000, 2, 2], [0, 2, 3], False, sumD) | 1.53 ms vs 799 µs | 824 µs vs 134 µs
(10000, [1000, 1000, 2, 2], [0, 2, 3], False, sumD, bk) | 5.15 ms vs 1.09 ms | 7.02 ms vs 205 µs

- after improving CPU and CUDA forward kernels
- in `(1000, [1000, 1000, 2, 2], [0, 2, 3], False, sumD)` forward, CPU takes ~~`171 µs`~~, in which `130 µs` is spent on `coalesce()`, for CUDA, total time is ~~`331 µs`~~, in which `141 µs` is spent on `coalesce()`, we need to reduce time at other places outside `coalesce()`.
- after a few simple tweaks, now in the forward, it is at most `10x` slower in CPU, and `7x` in CUDA. And the time taken in `sum dense dims only [2, 3]` is `~2x` of `sum all dims`. Speed of `sum all sparse dims [0, 1]` is on par with `sum all dims`

(nnz, sizes, sum_dims, keepdim, sum all or dims, bk=backward) | CPU (sparse vs dense) | CUDA (sparse vs dense)
-- | -- | --
(1000, [1000, 1000, 2, 2], [0, 1], False, sumAll) | 7 µs vs 69.5 µs | 31.5 µs vs 61.6 µs
(1000, [1000, 1000, 2, 2], [0, 1], False, sumD) | 11.3 µs vs 4.72 ms | 35.2 µs vs 285 µs
(1000, [1000, 1000, 2, 2], [0, 1], False, sumAll, bk) | 197 µs vs 124 µs | 857 µs vs 134 µs
(1000, [1000, 1000, 2, 2], [0, 1], False, sumD, bk) | 124 µs vs 833 µs | 796 µs vs 106 µs
(1000, [1000, 1000, 2, 2], [2, 3], False, sumD) | 20.5 µs vs 213 µs | 39.4 µs vs 1.24 ms
(1000, [1000, 1000, 2, 2], [2, 3], False, sumD, bk) | 131 µs vs 830 µs | 881 µs vs 132 µs
(1000, [1000, 1000, 2, 2], [0, 2, 3], False, sumD) | 95.8 µs vs 409 µs | 246 µs vs 87.2 µs
(1000, [1000, 1000, 2, 2], [0, 2, 3], False, sumD, bk) | 624 µs vs 820 µs | 953 µs vs 124 µs
(10000, [1000, 1000, 2, 2], [0, 1], False, sumAll) | 45.3 µs vs 72.9 µs | 33.9 µs vs 57.2 µs
(10000, [1000, 1000, 2, 2], [0, 1], False, sumD) | 81.4 µs vs 4.49 ms | 39.7 µs vs 280 µs
(10000, [1000, 1000, 2, 2], [0, 1], False, sumAll, bk) | 984 µs vs 111 µs | 6.41 ms vs 121 µs
(10000, [1000, 1000, 2, 2], [0, 1], False, sumD, bk) | 1.45 ms vs 828 µs | 6.77 ms vs 113 µs
(10000, [1000, 1000, 2, 2], [2, 3], False, sumD) | 74.9 µs vs 209 µs | 37.7 µs vs 1.23 ms
(10000, [1000, 1000, 2, 2], [2, 3], False, sumD, bk) | 1.48 ms vs 845 µs | 6.96 ms vs 132 µs
(10000, [1000, 1000, 2, 2], [0, 2, 3], False, sumD) | 1.14 ms vs 411 µs | 252 µs vs 87.8 µs
(10000, [1000, 1000, 2, 2], [0, 2, 3], False, sumD, bk) | 4.53 ms vs 851 µs | 7.12 ms vs 128 µs

- the time taken in CUDA backward of sparse is super long with large variance (in case of nnz=10000, it normally takes 6-7ms). To improve backward of sparse ops, we will need to debug at places other than CUDA kernels. Here is a benchmark of `torch.copy_()`:
```
>>> d = [1000, 1000, 2, 2]
>>> nnz = 10000
>>> I = torch.cat([torch.randint(0, d[0], size=(nnz,)), torch.randint(0, d[1], size=(nnz,))], 0).reshape(2, nnz)
>>> V = torch.randn(nnz, d[2], d[3])
>>> size = torch.Size(d)
>>> S = torch.sparse_coo_tensor(I, V, size).coalesce().cuda()
>>> S2 = torch.sparse_coo_tensor(I, V, size).coalesce().cuda().requires_grad_()
>>> data = S2.clone()
>>> S.copy_(S2)
>>> y = S * 2
>>> torch.cuda.synchronize()
>>> %timeit y.backward(data, retain_graph=True); torch.cuda.synchronize()
7.07 ms ± 3.06 ms per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12430 Differential Revision: D12878313 Pulled By: weiyangfb fbshipit-source-id: e16dc7681ba41fdabf4838cf05e491ca9108c6fe	2018-11-28 02:19:12 -08:00
Teng Li	a38ed0268e	PT1 Stable Release Distributed Documentation (#14444 ) Summary: The doc covers pretty much all we have had on distributed for PT1 stable release, tracked in https://github.com/pytorch/pytorch/issues/14080 Tested by previewing the sphinx generated webpages. All look good. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14444 Differential Revision: D13227675 Pulled By: teng-li fbshipit-source-id: 752f00df096af38dd36e4a337ea2120ffea79f86	2018-11-28 00:34:11 -08:00
Wei Yang	50bc9dc9c3	fix doc for sparse.addmm (#14403 ) Summary: - fixing the doc issue in sparse.addmm ================ before change ================== ![image](https://user-images.githubusercontent.com/38509346/49063994-2f10fe80-f1ce-11e8-9ccc-54241bc45f0b.png) ![image](https://user-images.githubusercontent.com/38509346/49064064-641d5100-f1ce-11e8-865a-7227be7156ef.png) ================ post change ================== ![image](https://user-images.githubusercontent.com/38509346/49064078-76978a80-f1ce-11e8-8f38-f1f8ac9ce63b.png) ![image](https://user-images.githubusercontent.com/38509346/49064085-7bf4d500-f1ce-11e8-8a0d-bf9e5460d21f.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/14403 Differential Revision: D13216582 Pulled By: weiyangfb fbshipit-source-id: 52e0a20c6b341c37cfb31f281be3afe2a52ca532	2018-11-27 10:24:18 -08:00
Wei Yang	12558019a8	backward for sparse.addmm(D, S, D, alpha, beta) -> D (#13345 ) Summary: - introduce `sparse.addmm()` with backward for sparse matrix input for https://github.com/pytorch/pytorch/issues/12308 Pull Request resolved: https://github.com/pytorch/pytorch/pull/13345 Differential Revision: D13094070 Pulled By: weiyangfb fbshipit-source-id: 136c08c3ca9bafb20577b60dd43d31c3e5cd5461	2018-11-26 17:47:48 -08:00
Michael Carilli	c36156eded	Option to preserve bitwise accuracy of gradient checkpointed vs non-checkpointed dropout (#14253 ) Summary: This issue was noticed, and fix proposed, by raulpuric.

Checkpointing is implemented by rerunning a forward-pass segment for each checkpointed segment during backward. This can result in the RNG state advancing more than it would without checkpointing, which can cause checkpoints that include dropout invocations to lose end-to-end bitwise accuracy as compared to non-checkpointed passes. The present PR contains optional logic to juggle the RNG states such that checkpointed passes containing dropout achieve bitwise accuracy with non-checkpointed equivalents.**

The user requests this behavior by supplying `preserve_rng_state=True` to `torch.utils.checkpoint` or `torch.utils.checkpoint_sequential`. Currently, `preserve_rng_state=True` may incur a moderate performance hit because restoring MTGP states can be expensive. However, restoring Philox states is dirt cheap, so syed-ahmed's [RNG refactor](https://github.com/pytorch/pytorch/pull/13070#discussion_r235179882), once merged, will make this option more or less free.

I'm a little wary of the [def checkpoint(function, *args, preserve_rng_state=False):](https://github.com/pytorch/pytorch/pull/14253/files#diff-58da227fc9b1d56752b7dfad90428fe0R75) argument-passing method (specifically, putting a kwarg after a variable argument list). Python 3 seems happy with it. Edit: It appears Python 2.7 is NOT happy with a [kwarg after *args](https://travis-ci.org/pytorch/pytorch/builds/457706518?utm_source=github_status&utm_medium=notification). `preserve_rng_state` also needs to be communicated in a way that doesn't break any existing usage. I'm open to suggestions (a global flag perhaps)?

**Batchnorm may still be an issue, but that's a battle for another day.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/14253 Differential Revision: D13166665 Pulled By: soumith fbshipit-source-id: 240cddab57ceaccba038b0276151342344eeecd7	2018-11-23 08:09:43 -08:00
Pieter Noordhuis	1caa341c68	Add torch.multiprocessing.spawn docs Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13846 Differential Revision: D13029595 Pulled By: pietern fbshipit-source-id: b733b00f7070c18535c31801f20e6e717eec7748	2018-11-12 14:39:52 -08:00
Elias Ellison	a92ff57a4d	update range doc (#13730 ) Summary: Update range documentation to show that we don't support start or increment parameters Pull Request resolved: https://github.com/pytorch/pytorch/pull/13730 Differential Revision: D12982016 Pulled By: eellison fbshipit-source-id: cc1462fc1af547ae80c6d3b87999b7528bade8af	2018-11-08 11:40:52 -08:00
Wei Yang	5dd153b1c2	speed up torch.sparse_mask() cpu kernel (#13290 ) Summary:
- `sparse_mask(D, S)` is useful to implement backward for `sparse_addmm()`
- previous `sparse_mask(D, S)` cpu kernel is not parallelized
- this PR speeds up the cpu kernel for two separate cases:
  - `D.dim == S.sparse_dim`: simply parallelize the kernel
  - `D.dim > S.sparse_dim`: simply use CUDA kernel implementation
- performance:

`D.dim == S.sparse_dim`
```
>>> nnz = 100000
>>> dims = [1000, 1000]
>>> I = torch.cat([torch.randint(0, dims[0], size=(nnz,)), torch.randint(0, dims[1], size=(nnz,))], 0).reshape(2, nnz)
>>> V = torch.randn(nnz)
>>> size = torch.Size(dims)
>>> S = torch.sparse_coo_tensor(I, V, size).coalesce()
>>> D = torch.randn(dims)
>>> %timeit D.sparse_mask(S)
======= before change =======
6.4 ms ± 684 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
======= after change =======
333 µs ± 89.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```
`D.dim > S.sparse_dim`
```
>>> nnz = 100000
>>> dims = [1000, 1000, 2, 2]
>>> I = torch.cat([torch.randint(0, dims[0], size=(nnz,)), torch.randint(0, dims[1], size=(nnz,))], 0).reshape(2, nnz)
>>> V = torch.randn(nnz, dims[2], dims[3])
>>> size = torch.Size(dims)
>>> S = torch.sparse_coo_tensor(I, V, size).coalesce()
>>> D = torch.randn(dims)
>>> %timeit D.sparse_mask(S)
======= before change =======
495 ms ± 41.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
======= after change =======
594 µs ± 68.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13290 Differential Revision: D12878336 Pulled By: weiyangfb fbshipit-source-id: 10b5981af382f7c6095a42c0fee7297d6438ce37	2018-11-07 20:02:17 -08:00
Brendan Soffientini	9900a8dd89	Remove outdated css and font files in html docs (#13699 ) Summary: The stylesheet at docs/source/_static/css/pytorch_theme.css is no longer necessary for the html docs build. The new html docs theme styles are located at https://github.com/pytorch/pytorch_sphinx_theme. The Lato font is also no longer used in the new theme. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13699 Differential Revision: D12967448 Pulled By: soumith fbshipit-source-id: 7de205162a61e3acacfd8b499660d328ff3812ec	2018-11-07 16:31:28 -08:00
Tongzhou Wang	044d00516c	Rename DistBackend -> Backend (#11830 ) Summary: Also add docs for get_backend, Backend, and reduce_op fixes #11803 cc pietern apaszke Pull Request resolved: https://github.com/pytorch/pytorch/pull/11830 Differential Revision: D9927991 Pulled By: SsnL fbshipit-source-id: a2ffb70826241ba84264f36f2cb173e00b19af48	2018-11-07 11:58:12 -08:00
Thomas Viehmann	f0ed927b62	Add diag_embed to ATen and torch (#12447 ) Summary: Fixes: #12160 Pull Request resolved: https://github.com/pytorch/pytorch/pull/12447 Differential Revision: D12916234 Pulled By: SsnL fbshipit-source-id: 512a04efb0c2e0a54295b857a61be66c3aae13da	2018-11-05 08:55:28 -08:00
vishwakftw	d714ecf879	Rename potrf to cholesky (#12699 ) Summary: This PR performs a renaming of the function `potrf` responsible for the Cholesky decomposition on positive definite matrices to `cholesky` as NumPy and TF do. Billing of changes - make potrf cname for cholesky in Declarations.cwrap - modify the function names in ATen/core - modify the function names in Python frontend - issue warnings when potrf is called to notify users of the change Reviewed By: soumith Differential Revision: D10528361 Pulled By: zou3519 fbshipit-source-id: 19d9bcf8ffb38def698ae5acf30743884dda0d88	2018-11-01 15:10:55 -07:00
Ailing Zhang	4a3baec961	Hub Implementation (#12228 ) Summary: [Edit: after applying colesbury's suggestions]
* Hub module enables users to share code + pretrained weights through github repos. Example usage:
```
hub_model = hub.load(
    'ailzhang/vision:hub',  # repo_owner/repo_name:branch
    'wrapper1',             # entrypoint
    1234,                   # args for callable [not applicable to resnet18]
    pretrained=True)        # kwargs for callable
```
* Protocol on repo owner side: example https://github.com/ailzhang/vision/tree/hub
  * The "published" models should be at least in a branch/tag. It can't be a random commit.
  * Repo owner should have the following field defined in `hubconf.py`
  * function/entrypoint with function signature `def wrapper1(pretrained=False, *args, **kwargs):`
  * `pretrained` allows users to load pretrained weights from repo owner.
  * `args` and `kwargs` are passed to the callable `resnet18`, repo owner should clearly specify their help message in the docstring
```
def wrapper1(pretrained=False, *args, **kwargs):
    """
    pretrained (bool): a recommended kwargs for all entrypoints
    args & kwargs are arguments for the function
    """
    # Imports needed by the entrypoint body.
    from torch.utils import model_zoo
    from torchvision.models.resnet import resnet18
    model = resnet18(*args, **kwargs)
    checkpoint = 'https://download.pytorch.org/models/resnet18-5c106cde.pth'
    if pretrained:
        model.load_state_dict(model_zoo.load_url(checkpoint, progress=False))
    return model
```
* Hub_dir
  * `hub_dir` specifies where the intermediate files/folders will be saved. By default this is `~/.torch/hub`.
  * Users can change it by either setting the environment variable `TORCH_HUB_DIR` or calling `hub.set_dir(PATH_TO_HUB_DIR)`.
  * By default, we don't clean up files after loading so that users can use the cache next time.
* Cache logic:
  * We use the cache by default if it exists in `hub_dir`.
  * Users can force a fresh reload by calling `hub.load(..., force_reload=True)`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/12228 Differential Revision: D10511470 Pulled By: ailzhang fbshipit-source-id: 12ac27f01d33653f06b2483655546492f82cce38	2018-10-29 18:43:14 -07:00
Doug Friedman	bc352ace7c	dense.to_sparse() re: #8853 (#12171 ) Summary: Here is my stab at ```dense.to_sparse``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/12171 Differential Revision: D10859078 Pulled By: weiyangfb fbshipit-source-id: 5df72f72ba4f8f10e283402ff7731fd535682664	2018-10-26 21:48:52 -07:00
Pat Mellon	21285e73da	Add Google pixel code Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12998 Differential Revision: D10515096 Pulled By: JoelMarcey fbshipit-source-id: 7f97014451448a70ea7f91d7d8bd96fbf6e83f7f	2018-10-23 13:26:37 -07:00
Benoit Steiner	3fb3a07f54	Added a default constructor for torch.finfo. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12847 Differential Revision: D10457487 Pulled By: benoitsteiner fbshipit-source-id: 7d164a71ba52631e5906098f643eecb0630879d1	2018-10-23 09:03:24 -07:00
Tongzhou Wang	b357470421	Add DistributedDataParallelCPU to doc Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12864 Differential Revision: D10481669 Pulled By: SsnL fbshipit-source-id: 20831af41aaba75546e6ed6a99f011f0447b1acf	2018-10-21 11:20:11 -07:00
Tongzhou Wang	8a35aafca6	Try to fix randomness.rst formatting again Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12853 Differential Revision: D10458439 Pulled By: SsnL fbshipit-source-id: ebd259e598327b0c5d63de6b7c182781fe361fbd	2018-10-18 19:18:49 -07:00
Tongzhou Wang	a85174b46a	Fix randomness.rst formatting Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12850 Differential Revision: D10457694 Pulled By: SsnL fbshipit-source-id: fa64964ff6d41625d9383ca96393017230e4ee0f	2018-10-18 18:26:26 -07:00
Thomas Viehmann	0521c47c91	Amend nondeterminism notes (#12217 ) Summary: include atomicAdd commentary as this is less well known There is some discussion in #12207 Unfortunately, I cannot seem to get the ..include working in `_tensor_docs.py` and `_torch_docs.py`. I could use a hint for that. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12217 Differential Revision: D10419739 Pulled By: SsnL fbshipit-source-id: eecd04fb7486bd9c6ee64cd34859d61a0a97ec4e	2018-10-16 23:59:26 -07:00
Benoit Steiner	bbe6ef3864	torch.finfo and torch.iinfo to mimic the numpy equivalent (#12472 ) Summary: This pull request intends to provide the functionality requested in https://github.com/pytorch/pytorch/issues/10742 by adding a new torch.finfo and torch.iinfo API. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12472 Differential Revision: D10250829 Pulled By: benoitsteiner fbshipit-source-id: eb22ca55d5b0064bef381fa7f1eb75989977df30	2018-10-15 13:43:52 -07:00
Natalia Gimelshein	134b5d62e8	don't copy weight gradients in rnn (#12600 ) Summary: This PR gets rid of unnecessary copy of weight gradients in cudnn rnn. Also removes unnecessary check for input size when deciding whether to use persistent rnn, and adds doc string explaining when persistent rnn can be used. cc ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/12600 Differential Revision: D10359981 Pulled By: soumith fbshipit-source-id: 0fce11b527d543fabf21e6e9213fb2879853d7fb	2018-10-12 13:34:10 -07:00
vishwakftw	48bc57fa8d	Introduce chain_matmul (#12380 ) Summary: - This was one of the few functions left out from the list of functions in NumPy's `linalg` module - `multi_mm` is particularly useful for DL research, for quick analysis of deep linear networks - Added tests and doc string Pull Request resolved: https://github.com/pytorch/pytorch/pull/12380 Differential Revision: D10357136 Pulled By: SsnL fbshipit-source-id: 52b44fa18d6409bdeb76cbbb164fe4e88224458e	2018-10-12 03:58:12 -07:00
Yangqing Jia	38f3d1fc40	move flags to c10 (#12144 ) Summary: still in flux. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12144 Reviewed By: smessmer Differential Revision: D10140176 Pulled By: Yangqing fbshipit-source-id: 1a313abed022039333e3925d19f8b3ef2d95306c	2018-10-04 02:09:56 -07:00
Wei Yang	5ffc915f26	fix docs (#12126 ) Summary: - fix https://github.com/pytorch/pytorch/issues/12120 - add `torch.argsort`, `torch.pdist`, `broadcast_tensors` to *.rst files - add parameter dim to `torch.unique` doc - fix table and args for `torch.norm` - test plan: make html and check docs in browser gchanan Pull Request resolved: https://github.com/pytorch/pytorch/pull/12126 Differential Revision: D10087006 Pulled By: weiyangfb fbshipit-source-id: 25f65c43d14e02140d0da988d8742c7ade3d8cc9	2018-09-29 22:26:45 -07:00
cclauss	b0248df72a	Docs: Change cuda(async) —> cuda(non_blocking) (#12158 ) Summary: goldsborough Modify the docs to match the changes made in #4999 Pull Request resolved: https://github.com/pytorch/pytorch/pull/12158 Differential Revision: D10103964 Pulled By: SsnL fbshipit-source-id: 1b8692da86aca1a52e8d2e6cea76a5ad1f71e058	2018-09-28 08:39:27 -07:00
Doug Friedman	c2f8f5076c	add narrow() support for sparse tensors re: #8853 (#11342 ) Summary: A couple of questions: 1) I used the log1p implementation in #8969 as a guide, especially for testing. I'm not sure what the `skipIfROCM` annotation is for, so I'm unsure whether I need it for my test. 2) I implemented the branching logic in the narrow function itself; is this the right place to do so? I noticed that there are a number of places where sparse-specific logic is handled with just an if statement in this file. Or should I implement a separate dispatch in native_functions.yml as in the log1p? And of course, happy to make any other updates/changes that I may have missed as well. This is my first PR to the project. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11342 Differential Revision: D9978430 Pulled By: weiyangfb fbshipit-source-id: e73dc20302ab58925afb19e609e31f4a38c634ad	2018-09-26 12:24:54 -07:00
Brian Johnson	23f5b2abbe	Fixes an error with canonical url. (#11938 ) Summary: Deleted this section by mistake in last PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11938 Reviewed By: SsnL Differential Revision: D9993258 Pulled By: brianjo fbshipit-source-id: 2552178cebd005a1105a22930c4d128c67247378	2018-09-21 12:21:42 -07:00
Brian Johnson	17cd426c72	Updated docs styles (#11835 ) Summary: Updated requirements.txt and conf.py. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11835 Reviewed By: SsnL Differential Revision: D9941160 Pulled By: brianjo fbshipit-source-id: fbac91214558e6d17beff74261d990c7dc762038	2018-09-20 21:11:12 -07:00
Tongzhou Wang	c30790797f	Minor data loader doc improvements Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11821 Differential Revision: D9948292 Pulled By: SsnL fbshipit-source-id: 01c21c129423c0f7844b403e665a8fe021a9c820	2018-09-19 15:33:25 -07:00

437 Commits