pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
mfkasim91	a49367e9c9	Update the docs of torch.eig about derivative (#47598 ) Summary: Related: https://github.com/pytorch/pytorch/issues/33090 I just realized that I haven't updated the docs of `torch.eig` when implementing the backward. Here's the PR updating the docs about the grad of `torch.eig`. cc albanD Pull Request resolved: https://github.com/pytorch/pytorch/pull/47598 Reviewed By: heitorschueroff Differential Revision: D24829373 Pulled By: albanD fbshipit-source-id: 89963ce66b2933e6c34e2efc93ad0f2c3dd28c68	2020-11-09 13:28:27 -08:00
Edward Yang	1aeefcdaa6	Revert D24730264: [pytorch][PR] Added CUDA support for complex input for torch.inverse Test Plan: revert-hammer Differential Revision: D24730264 (`33acbedace`) Original commit changeset: b9c94ec46301 fbshipit-source-id: beb9263700e9bc92685f74c37c46aa33f3b595b9	2020-11-06 07:28:14 -08:00
Ivan Yashchuk	33acbedace	Added CUDA support for complex input for torch.inverse (#45034 ) Summary: `torch.inverse` now works for complex inputs on GPU. Test cases with complex matrices are xfailed for now. For example, batched matmul does not work with complex yet. Ref. https://github.com/pytorch/pytorch/issues/33152 Pull Request resolved: https://github.com/pytorch/pytorch/pull/45034 Reviewed By: zou3519 Differential Revision: D24730264 Pulled By: anjali411 fbshipit-source-id: b9c94ec463012913c117278a884adeee96ea02aa	2020-11-05 16:30:11 -08:00
Heitor Schueroff	a4ba018e57	Updated docs/test for dot and vdot (#47242 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47242 Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D24733771 Pulled By: heitorschueroff fbshipit-source-id: 92e3b0e28e0565918335fa85d52abe5db9eeff57	2020-11-05 06:27:50 -08:00
Jane Xu	01da0fe5ff	Including generator param in randperm documentation (#47231 ) Summary: The `randperm` documentation is outdated and did not use to include the optional `generator` parameter. This PR just adds that along with the `pin_memory` parameter. This PR was brought up in [PR 47022](https://github.com/pytorch/pytorch/pull/47022), but is now rebased onto master. New docs look like: ![image](https://user-images.githubusercontent.com/31798555/97923963-e6084400-1d2c-11eb-9d46-573ba3189ad6.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/47231 Reviewed By: mruberry Differential Revision: D24711960 Pulled By: janeyx99 fbshipit-source-id: 3ff8be62ec33e34ef87d017ea97bb950621a3064	2020-11-04 09:37:41 -08:00
Erjia Guan	f1ac63d324	Implement copysign (#46396 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46396 Related #38349 [numpy](https://numpy.org/doc/stable/reference/generated/numpy.copysign.html?highlight=copysign#numpy.copysign) - No in-place function - No method - Optional output - Available: byte, char, bool, int, short, long, float, double, half - Integral promoted to float - Not available: float/double complex `c = np.copysign(a, b)` \| a \| b \| c \| a.grad \| \| -1 \| -1 \| -1 \| 1 \| \| -0 \| -1 \| -0 \| 0 \| \| 0 \| -1 \| -0 \| 0 \| \| 1 \| -1 \| -1 \| -1 \| \| -1 \| -0 \| -1 \| 1 \| \| -0 \| -0 \| 0 \| 0 \| \| 0 \| -0 \| 0 \| 0 \| \| 1 \| -0 \| -1 \| -1 \| \| -1 \| 0 \| 1 \| -1 \| \| -0 \| 0 \| 0 \| 0 \| \| 0 \| 0 \| 0 \| 0 \| \| 1 \| 0 \| 1 \| 1 \| \| -1 \| 1 \| 1 \| -1 \| \| -0 \| 1 \| 0 \| 0 \| \| 0 \| 1 \| 0 \| 0 \| \| 1 \| 1 \| 1 \| 1 \| This function becomes non-differentiable at `a=0` for any `b`. So, in my opinion, we may set the gradient for `a=0` to 0. TODO: - [x] test (cpu/gpu) - [x] doc - [x] ~kernel_vec~ Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D24401366 Pulled By: ejguan fbshipit-source-id: 3621c5ff74b185376a3705589983bb5197ab896d	2020-11-04 08:08:57 -08:00
Qi Zhou	0ec717c830	Support int32 indices and offsets in nn.EmbeddingBag (#46758 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46758 It's in general helpful to support int32 indices and offsets, especially when such tensors are large and need to be transferred to accelerator backends. Since it may not be very useful to support the combination of int32 indices and int64 offsets, here we enforce that these two must have the same type. Test Plan: unit tests Reviewed By: ngimel Differential Revision: D24470808 fbshipit-source-id: 94b8a1d0b7fc9fe3d128247aa042c04d7c227f0b	2020-11-03 23:33:50 -08:00
Ivan Yashchuk	f276ab55cd	Added Kronecker product of tensors (torch.kron) (#45358 ) Summary: This PR adds a function for calculating the Kronecker product of tensors. The implementation is based on `at::tensordot` with permutations and reshape. Tests pass. TODO: - [x] Add more test cases - [x] Write documentation - [x] Add entry `common_methods_invokations.py` Ref. https://github.com/pytorch/pytorch/issues/42666 Pull Request resolved: https://github.com/pytorch/pytorch/pull/45358 Reviewed By: mrshenli Differential Revision: D24680755 Pulled By: mruberry fbshipit-source-id: b1f8694589349986c3abfda3dc1971584932b3fa	2020-11-03 12:41:41 -08:00
Xiong Wei	74d730c0b5	implement NumPy-like functionality column_stack, row_stack (#46313 ) Summary: Related https://github.com/pytorch/pytorch/issues/38349 This PR implements `column_stack` as the composite ops of `torch.reshape` and `torch.hstack`, and makes `row_stack` as the alias of `torch.vstack`. Todo - [x] docs - [x] alias pattern for `row_stack` Pull Request resolved: https://github.com/pytorch/pytorch/pull/46313 Reviewed By: ngimel Differential Revision: D24585471 Pulled By: mruberry fbshipit-source-id: 62fc0ffd43d051dc3ecf386a3e9c0b89086c1d1c	2020-10-29 12:14:39 -07:00
mfkasim91	6eaa324c9f	Implement torch.igamma (#46183 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/41637 This is regularized lower incomplete gamma function, equivalent to scipy's `gammainc` and tensorflow `igamma`. cc fritzo mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/46183 Reviewed By: gchanan Differential Revision: D24479126 Pulled By: mruberry fbshipit-source-id: fdf8ea289fe4ca1b408810732192411e948fcdfe	2020-10-29 11:40:18 -07:00
Pearu Peterson	905ed3c840	Revised sparse tensor documentation. (#45400 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/44635. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45400 Reviewed By: ezyang Differential Revision: D24359410 Pulled By: mruberry fbshipit-source-id: 37c691a49a7b0042c7a298e0ed1226702b097c8b	2020-10-22 02:07:54 -07:00
Ivan Yashchuk	c1141b6f68	Added support for complex torch.pinverse (#45819 ) Summary: This PR adds support for complex-valued input for `torch.pinverse`. Fixed cuda SVD implementation to return singular values with real dtype. Fixes https://github.com/pytorch/pytorch/issues/45385. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45819 Reviewed By: heitorschueroff Differential Revision: D24306539 Pulled By: anjali411 fbshipit-source-id: 2fe19bc630de528e0643132689e1bc5ffeaa162a	2020-10-15 12:28:22 -07:00
Kurt Mohler	bd449334b8	Fix formatting issues in torch.tensor_split documentation (#46328 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46328 Reviewed By: heitorschueroff Differential Revision: D24318003 Pulled By: mruberry fbshipit-source-id: 140d391dd927ff3374dd6c4c6e2da7cb67417b31	2020-10-15 10:08:38 -07:00
Erjia Guan	bed3b40523	Implement ravel (#46098 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46098 Doc: ![image](https://user-images.githubusercontent.com/68879799/95611323-ae5cf380-0a2f-11eb-9b8e-56bf79ce68af.png) Test Plan: Imported from OSS Reviewed By: glaringlee Differential Revision: D24253213 Pulled By: ejguan fbshipit-source-id: 42a866c902272cbe3743a9d0cb3afb9165d51c0b	2020-10-12 16:00:44 -07:00
Erjia Guan	59414b359d	Document fix for logspace and linspace (#46056 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46056 The result: * logspace ![image](https://user-images.githubusercontent.com/68879799/95513793-e6f5c200-0988-11eb-8279-b093612743ca.png) * linspace ![image](https://user-images.githubusercontent.com/68879799/95513824-f543de00-0988-11eb-9910-72d28d7b6277.png) Test Plan: Imported from OSS Reviewed By: zou3519 Differential Revision: D24204441 Pulled By: ejguan fbshipit-source-id: fe1179fdbebb326d33e9c474b1efc8282a391901	2020-10-09 10:20:57 -07:00
Heitor Schueroff de Souza	636eb18029	Fixed median nan propagation and implemented nanmedian (#45847 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45847 Original PR here https://github.com/pytorch/pytorch/pull/45084. Created this one because I was having problems with ghstack. Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D24136629 Pulled By: heitorschueroff fbshipit-source-id: dd7c7540a33f6a19e1ad70ba2479d5de44abbdf9	2020-10-08 11:20:21 -07:00
Natalia Gimelshein	52f2db752d	unify reproducibility notes (#45748 ) Summary: Many of our functions contain same warnings about results reproducibility. Make them use common template. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45748 Reviewed By: colesbury Differential Revision: D24089114 Pulled By: ngimel fbshipit-source-id: e6aa4ce6082f6e0f4ce2713c2bf1864ee1c3712a	2020-10-08 02:14:57 -07:00
Kurt Mohler	ef4817fe5a	Add `tensor_split` function, based on `numpy.array_split` (#45168 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/9382 Pull Request resolved: https://github.com/pytorch/pytorch/pull/45168 Reviewed By: ngimel Differential Revision: D24166164 Pulled By: mruberry fbshipit-source-id: 795459821e52885bc99623a01a2abec060995ce6	2020-10-07 23:14:48 -07:00
Nikita Vedeneev	30bf799f9c	`torch.matrix_exp` doc fix (#45909 ) Summary: As per title. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45909 Reviewed By: dzhulgakov Differential Revision: D24147314 Pulled By: albanD fbshipit-source-id: fc21094f4dbdd04cc2063a9639b9d1f5728cb53f	2020-10-07 10:23:37 -07:00
Vaidotas Simkus	e154b36685	Standardized clamp kernels to Numpy-like implementation (#43288 ) Summary: BC-breaking note For ease of exposition let a_min be the value of the "min" argument to clamp, and a_max be the value of the "max" argument to clamp. This PR changes the behavior of torch.clamp to always compute min(max(a, a_min), a_max). torch.clamp currently computes this in its vectorized CPU specializations: `78b95b6204/aten/src/ATen/cpu/vec256/vec256_double.h (L304)` but in other places it clamps differently: `78b95b6204/aten/src/ATen/cpu/vec256/vec256_base.h (L624)` `78b95b6204/aten/src/ATen/native/cuda/UnaryOpsKernel.cu (L160)` These implementations are the same when a_min < a_max, but divergent when a_min > a_max. This divergence is easily triggered: ``` t = torch.arange(200).to(torch.float) torch.clamp(t, 4, 2)[0] : tensor(2.) torch.clamp(t.cuda(), 4, 2)[0] : tensor(4., device='cuda:0') torch.clamp(torch.tensor(0), 4, 2) : tensor(4) ``` This PR makes the behavior consistent with NumPy's clip. C++'s std::clamp's behavior is undefined when a_min > a_max, but Clang's std::clamp will return 10 in this case (although the program, per the above comment, is in error). Python has no standard clamp implementation. PR Summary Fixes discrepancy between AVX, CUDA, and base vector implementation for clamp, such that all implementations are consistent and use min(max_vec, max(min_vec, x) formula, thus making it equivalent to numpy.clip in all implementations. The same fix as in https://github.com/pytorch/pytorch/issues/32587 but isolated to the kernel change only, so that the internal team can benchmark. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43288 Reviewed By: colesbury Differential Revision: D24079453 Pulled By: mruberry fbshipit-source-id: 67f30d2f2c86bbd3e87080b32f00e8fb131a53f7	2020-10-06 13:42:08 -07:00
kshitij12345	f65ab89edd	[numpy] Add torch.nan_to_num (#44592 ) Summary: Reference https://github.com/pytorch/pytorch/issues/42515 TODO: * [x] Add tests * [x] Add docs Pull Request resolved: https://github.com/pytorch/pytorch/pull/44592 Reviewed By: colesbury Differential Revision: D24079472 Pulled By: mruberry fbshipit-source-id: 2b67d36cba46eaa7ca16cd72671b57750bd568bc	2020-10-05 01:38:56 -07:00
Ansley Ussery	db8b076272	Change signature for torch.poisson (#45656 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45656 Test Plan: Imported from OSS Reviewed By: gmagogsfm Differential Revision: D24078609 Pulled By: ansleyadelaide fbshipit-source-id: 97a95b08334ed0d710e032a267b940c2fc9f7f40	2020-10-02 13:14:12 -07:00
Brian Hirsh	c703602e17	make broadcasting explanation clearer in matmul doc: #22763 (#45699 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45699 Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D24065584 Pulled By: bdhirsh fbshipit-source-id: 5e2cdd00ed18ad47d24d11751cfa5bee63853cc9	2020-10-02 06:51:42 -07:00
lixinyu	fc4209bd4f	Fix the bucketization wrong doc for right argument (#45684 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45684 Test Plan: Imported from OSS Reviewed By: zou3519 Differential Revision: D24057996 Pulled By: glaringlee fbshipit-source-id: 3db1c24f3cae9747effa4b1f3c5c3baf6888c9a1	2020-10-01 18:16:49 -07:00
Xiang Gao	c2c7099944	Fix docs for kwargs, q-z (#43589 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43589 Reviewed By: zhangguanheng66 Differential Revision: D24006259 Pulled By: mruberry fbshipit-source-id: 39abd474744f152648aad201d7311b42d20efc88	2020-09-29 22:57:02 -07:00
Mike Ruberry	bb19a55429	Improves fft doc consistency and makes deprecation warnings more prominent (#45409 ) Summary: This PR makes the deprecation warnings for existing fft functions more prominent and makes the torch.stft deprecation warning consistent with our current deprecation planning. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45409 Reviewed By: ngimel Differential Revision: D23974975 Pulled By: mruberry fbshipit-source-id: b90d8276095122ac3542ab625cb49b991379c1f8	2020-09-29 09:07:49 -07:00
Mike Ruberry	87f98a5b54	Updates torch.floor_divide documentation to clarify it's actually torch.trunc_divide (or torch.rtz_divide) (#45411 ) Summary: Addresses https://github.com/pytorch/pytorch/issues/43874 for 1.7. 1.8 will need to take floor_divide through a proper deprecation process. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45411 Reviewed By: ngimel Differential Revision: D23974997 Pulled By: mruberry fbshipit-source-id: 16dd07e50a17ac76bfc93bd6b71d4ad72d909bf4	2020-09-29 05:55:44 -07:00
Xiong Wei	241afc9188	Migrate `addr` from the TH to Aten (CPU) (#44364 ) Summary: Related https://github.com/pytorch/pytorch/issues/24507 Fixes https://github.com/pytorch/pytorch/issues/24666 This PR is to modernize the CPU implementation of the vector `outer product`. The existing TH implementation for `torch.attr` is migrated to `aten`, as the `torch.ger` manipulates the `addr` functions to calculate outer product, Pull Request resolved: https://github.com/pytorch/pytorch/pull/44364 Reviewed By: ezyang Differential Revision: D23866733 Pulled By: mruberry fbshipit-source-id: 5159ea22f0e3c991123fe7c19cc9beb6ad00301e	2020-09-25 01:18:09 -07:00
Peter Bell	dc67b47bc9	Deprecate old fft functions (#44876 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44876 Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D23866715 Pulled By: mruberry fbshipit-source-id: 73305eb02f92cbd1ef7d175419529d19358fedda	2020-09-24 02:39:44 -07:00
anjali411	58b6ab69e5	torch.sgn for complex tensors (#39955 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39955 resolves https://github.com/pytorch/pytorch/issues/36323 by adding `torch.sgn` for complex tensors. `torch.sgn` returns `x/abs(x)` for `x != 0` and returns `0 + 0j` for `x==0` This PR doesn't test the correctness of the gradients. It will be done as a part of auditing all the ops in future once we decide the autograd behavior (JAX vs TF) and add gradchek. Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D23460526 Pulled By: anjali411 fbshipit-source-id: 70fc4e14e4d66196e27cf188e0422a335fc42f92	2020-09-22 08:24:53 -07:00
Mike Ruberry	60709ad1bf	Adds multiply and divide aliases (#44463 ) Summary: These alias are consistent with NumPy. Note that C++'s naming would be different (std::multiplies and std::divides), and that PyTorch's existing names (mul and div) are consistent with Python's dunders. This also improves the instructions for adding an alias to clarify that dispatch keys should be removed when copying native_function.yaml entries to create the alias entries. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44463 Reviewed By: ngimel Differential Revision: D23670782 Pulled By: mruberry fbshipit-source-id: 9f1bdf8ff447abc624ff9e9be7ac600f98340ac4	2020-09-19 15:47:52 -07:00
Heitor Schueroff de Souza	28085cbd39	Fixed quantile nan propagation and implemented nanquantile (#44393 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44393 torch.quantile now correctly propagates nan and implemented torch.nanquantile similar to numpy.nanquantile. Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D23649613 Pulled By: heitorschueroff fbshipit-source-id: 5201d076745ae1237cedc7631c28cf446be99936	2020-09-17 05:53:25 -07:00
Muthu Arivoli	b61d3d8be8	Implement torch.kaiser_window (#44271 ) Summary: Related to https://github.com/pytorch/pytorch/issues/38349 Pull Request resolved: https://github.com/pytorch/pytorch/pull/44271 Reviewed By: ngimel Differential Revision: D23727972 Pulled By: mruberry fbshipit-source-id: b4c931b2eb3a536231ad6d6c3cb66e52a13286ac	2020-09-16 20:41:31 -07:00
Xiang Gao	e48201c5cf	Mention TF32 on related docs (#44690 ) Summary: cc: ptrblck ![image](https://user-images.githubusercontent.com/1032377/93168022-cbbfcb80-f6d6-11ea-8f6e-f2c8a15c5bea.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/44690 Reviewed By: ngimel Differential Revision: D23727921 Pulled By: mruberry fbshipit-source-id: db7cc8e74cde09c13d6a57683129fd839863b914	2020-09-16 19:18:30 -07:00
kshitij12345	1d733d660d	[docs] torch.min/max: remove incorrect warning from docs (#44615 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/44195 cc: mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/44615 Reviewed By: ngimel Differential Revision: D23703525 Pulled By: mruberry fbshipit-source-id: 471ebd764be667e29c03a30f3ef341440adc54d2	2020-09-15 10:42:08 -07:00
Muthu Arivoli	9c364da9b9	Fix doc builds for bool kwargs (#44686 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/43669 The bool will still link to https://docs.python.org/3/library/functions.html#bool. Tested using bmm: ![image](https://user-images.githubusercontent.com/16063114/93156438-2ad11080-f6d6-11ea-9b81-96e02ee68d90.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/44686 Reviewed By: ngimel Differential Revision: D23703823 Pulled By: mruberry fbshipit-source-id: 7286afad084f5ab24a1254ad84e5d01907781c85	2020-09-15 10:34:58 -07:00
Mike Ruberry	686e281bcf	Updates div to perform true division (#42907 ) Summary: This PR: - updates div to perform true division - makes torch.true_divide an alias of torch.div This follows on work in previous PyTorch releases that first deprecated div performing "integer" or "floor" division, then prevented it by throwing a runtime error. Pull Request resolved: https://github.com/pytorch/pytorch/pull/42907 Reviewed By: ngimel Differential Revision: D23622114 Pulled By: mruberry fbshipit-source-id: 414c7e3c1a662a6c3c731ad99cc942507d843927	2020-09-14 15:50:38 -07:00
Heitor Schueroff de Souza	199435af90	Update median doc to note return value of even-sized input (#44562 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44562 Add a note that torch.median returns the smaller of the two middle elements for even-sized input and refer user to torch.quantile for the mean of the middle values. fixes https://github.com/pytorch/pytorch/issues/39520 Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D23657208 Pulled By: heitorschueroff fbshipit-source-id: 2747aa652d1e7f10229d9299b089295aeae092c2	2020-09-14 13:18:33 -07:00
kshitij12345	c68a99bd61	[numpy] Add `torch.exp2` (#44184 ) Summary: Reference https://github.com/pytorch/pytorch/issues/42515 TODO * [x] Add tests * [x] Add docs Pull Request resolved: https://github.com/pytorch/pytorch/pull/44184 Reviewed By: ngimel Differential Revision: D23674237 Pulled By: mruberry fbshipit-source-id: 7f4fb1900fad3051cd7fc9d3d7f6d985c5fb093c	2020-09-14 04:05:37 -07:00
Peter Bell	8daaa3bc7e	Fix latex error in heaviside docs (#44481 ) Summary: This fixes a `katex` error I was getting trying to build the docs: ``` ParseError: KaTeX parse error: Undefined control sequence: \0 at position 55: …gin{cases} ``` This failure was introduced in https://github.com/pytorch/pytorch/issues/42523. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44481 Reviewed By: colesbury Differential Revision: D23627700 Pulled By: mruberry fbshipit-source-id: 9cc09c687a7d9349da79a0ac87d6c962c9cfbe2d	2020-09-13 16:42:19 -07:00
Mike Ruberry	7e91728f68	Deprecates calling linspace and logspace without setting steps explicitly (#43860 ) Summary: BC-breaking note This change is BC-breaking for C++ callers of linspace and logspace if they were providing a steps argument that could not be converted to an optional. PR note This PR deprecates calling linspace and logspace wihout setting steps explicitly by: - updating the documentation to warn that not setting steps is deprecated - warning (once) when linspace and logspace are called without steps being specified A test for this behavior is added to test_tensor_creation_ops. The warning only appears once per process, however, so the test would pass even if no warning were thrown. Ideally there would be a mechanism to force all warnings, include those from TORCH_WARN_ONCE, to trigger. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43860 Reviewed By: izdeby Differential Revision: D23498980 Pulled By: mruberry fbshipit-source-id: c48d7a58896714d184cb6ff2a48e964243fafc90	2020-09-13 06:09:19 -07:00
Natalia Gimelshein	ecc6358dbe	Port nonzero cuda from THC to ATen (#44259 ) Summary: 1) Ports nonzero from THC to ATen 2) replaces most thrust uses with cub, to avoid synchronization and to improve performance. There is still one necessary synchronization point, communicating number of nonzero elements from GPU to CPU 3) slightly changes algorithm, now we first compute the number of nonzeros, and then allocate correct-sized output, instead of allocating full-sized output as was done before, to account for possibly all elements being non-zero 4) unfortunately, since the last transforms are still done with thrust, 2) is slightly beside the point, however it is a step towards a future without thrust 4) hard limits the number of elements in the input tensor to MAX_INT. Previous implementation allocated a Long tensor with the size ndimnelements, so that would be at least 16 GB for a tensor with MAX_INT elements. It is reasonable to say that larger tensors could not be used anyway. Benchmarking is done for tensors with approximately half non-zeros <details><summary>Benchmarking script</summary> <p> ``` import torch from torch.utils._benchmark import Timer from torch.utils._benchmark import Compare import sys device = "cuda" results = [] for numel in (1024 128,):#, 1024 * 1024, 1024 * 1024 * 128): inp = torch.randint(2, (numel,), device="cuda", dtype=torch.float) for ndim in range(2,3):#(1,4): if ndim == 1: shape = (numel,) elif ndim == 2: shape = (1024, numel // 1024) else: shape = (1024, 128, numel // 1024 // 128) inp = inp.reshape(shape) repeats = 3 timer = Timer(stmt="torch.nonzero(inp, as_tuple=False)", label="Nonzero", sub_label=f"number of elts {numel}", description = f"ndim {ndim}", globals=globals()) for i in range(repeats): results.append(timer.blocked_autorange()) print(f"\rnumel {numel} ndim {ndim}", end="") sys.stdout.flush() comparison = Compare(results) comparison.print() ``` </p> </details> ### Results Before: ``` [--------------------------- Nonzero ---------------------------] \| ndim 1 \| ndim 2 \| ndim 3 1 threads: ------------------------------------------------------ number of elts 131072 \| 55.2 \| 71.7 \| 90.5 number of elts 1048576 \| 113.2 \| 250.7 \| 497.0 number of elts 134217728 \| 8353.7 \| 23809.2 \| 54602.3 Times are in microseconds (us). ``` After: ``` [-------------------------- Nonzero --------------------------] \| ndim 1 \| ndim 2 \| ndim 3 1 threads: ---------------------------------------------------- number of elts 131072 \| 48.6 \| 79.1 \| 90.2 number of elts 1048576 \| 64.7 \| 134.2 \| 161.1 number of elts 134217728 \| 3748.8 \| 7881.3 \| 9953.7 Times are in microseconds (us). ``` There's a real regression for smallish 2D tensor due to added work of computing number of nonzero elements, however, for other sizes there are significant gains, and there are drastically lower memory requirements. Perf gains would be even larger for tensors with fewer nonzeros. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44259 Reviewed By: izdeby Differential Revision: D23581955 Pulled By: ngimel fbshipit-source-id: 0b99a767fd60d674003d83f0848dc550d7a363dc	2020-09-08 20:52:51 -07:00
Mike Ruberry	bb861e1d69	Ports CUDA var and std reduce all (with no out argument) to ATen, fixes var docs (#43858 ) Summary: When var and std are called without args (other than unbiased) they currently call into TH or THC. This PR: - Removes the THC var_all and std_all functions and updates CUDA var and std to use the ATen reduction - Fixes var's docs, which listed its arguments in the incorrect order - Adds new tests comparing var and std with their NumPy counterparts Performance appears to have improved as a result of this change. I ran experiments on 1D tensors, 1D tensors with every other element viewed ([::2]), 2D tensors and 2D transposed tensors. Some notable datapoints: - torch.randn((8000, 8000)) - var measured 0.0022215843200683594s on CUDA before the change - var measured 0.0020322799682617188s on CUDA after the change - torch.randn((8000, 8000)).T - var measured .015128850936889648 on CUDA before the change - var measured 0.001912832260131836 on CUDA after the change - torch.randn(8000 ** 2) - std measured 0.11031460762023926 on CUDA before the change - std measured 0.0017833709716796875 on CUDA after the change Timings for var and std are, as expected, similar. On the CPU, however, the performance change from making the analogous update was more complicated, and ngimel and I decided not to remove CPU var_all and std_all. ngimel wrote the following script that showcases how single-threaded CPU inference would suffer from this change: ``` import torch import numpy as np from torch.utils._benchmark import Timer from torch.utils._benchmark import Compare import sys base = 8 multiplier = 1 def stdfn(a): meanv = a.mean() ac = a-meanv return torch.sqrt(((acac).sum())/a.numel()) results = [] num_threads=1 for _ in range(7): size = basemultiplier input = torch.randn(size) tasks = [("torch.var(input)", "torch_var"), ("torch.var(input, dim=0)", "torch_var0"), ("stdfn(input)", "stdfn"), ("torch.sum(input, dim=0)", "torch_sum0") ] timers = [Timer(stmt=stmt, num_threads=num_threads, label="Index", sub_label=f"{size}", description=label, globals=globals()) for stmt, label in tasks] repeats = 3 for i, timer in enumerate(timers * repeats): results.append( timer.blocked_autorange() ) print(f"\r{i + 1} / {len(timers) * repeats}", end="") sys.stdout.flush() multiplier =10 print() comparison = Compare(results) comparison.print() ``` The TH timings using this script on my devfair are: ``` [------------------------------ Index ------------------------------] \| torch_var \| torch_var0 \| stdfn \| torch_sum0 1 threads: ---------------------------------------------------------- 8 \| 16.0 \| 5.6 \| 40.9 \| 5.0 80 \| 15.9 \| 6.1 \| 41.6 \| 4.9 800 \| 16.7 \| 12.0 \| 42.3 \| 5.0 8000 \| 27.2 \| 72.7 \| 51.5 \| 6.2 80000 \| 129.0 \| 715.0 \| 133.0 \| 18.0 800000 \| 1099.8 \| 6961.2 \| 842.0 \| 112.6 8000000 \| 11879.8 \| 68948.5 \| 20138.4 \| 1750.3 ``` and the ATen timings are: ``` [------------------------------ Index ------------------------------] \| torch_var \| torch_var0 \| stdfn \| torch_sum0 1 threads: ---------------------------------------------------------- 8 \| 4.3 \| 5.4 \| 41.4 \| 5.4 80 \| 4.9 \| 5.7 \| 42.6 \| 5.4 800 \| 10.7 \| 11.7 \| 43.3 \| 5.5 8000 \| 69.3 \| 72.2 \| 52.8 \| 6.6 80000 \| 679.1 \| 676.3 \| 129.5 \| 18.1 800000 \| 6770.8 \| 6728.8 \| 819.8 \| 109.7 8000000 \| 65928.2 \| 65538.7 \| 19408.7 \| 1699.4 ``` which demonstrates that performance is analogous to calling the existing var and std with `dim=0` on a 1D tensor. This would be a significant performance hit. Another simple script shows the performance is mixed when using multiple threads, too: ``` import torch import time # Benchmarking var and std, 1D with varying sizes base = 8 multiplier = 1 op = torch.var reps = 1000 for _ in range(7): size = base multiplier t = torch.randn(size) elapsed = 0 for _ in range(reps): start = time.time() op(t) end = time.time() elapsed += end - start multiplier *= 10 print("Size: ", size) print("Avg. elapsed time: ", elapsed / reps) ``` ``` var cpu TH vs ATen timings Size: 8 Avg. elapsed time: 1.7853736877441406e-05 vs 4.9788951873779295e-06 (ATen wins) Size: 80 Avg. elapsed time: 1.7803430557250977e-05 vs 6.156444549560547e-06 (ATen wins) Size: 800 Avg. elapsed time: 1.8569469451904296e-05 vs 1.2302875518798827e-05 (ATen wins) Size: 8000 Avg. elapsed time: 2.8756141662597655e-05 vs. 6.97789192199707e-05 (TH wins) Size: 80000 Avg. elapsed time: 0.00026622867584228516 vs. 0.0002447957992553711 (ATen wins) Size: 800000 Avg. elapsed time: 0.0010556647777557374 vs 0.00030616092681884767 (ATen wins) Size: 8000000 Avg. elapsed time: 0.009990205764770508 vs 0.002938544034957886 (ATen wins) std cpu TH vs ATen timings Size: 8 Avg. elapsed time: 1.6681909561157225e-05 vs. 4.659652709960938e-06 (ATen wins) Size: 80 Avg. elapsed time: 1.699185371398926e-05 vs. 5.431413650512695e-06 (ATen wins) Size: 800 Avg. elapsed time: 1.768803596496582e-05 vs. 1.1279821395874023e-05 (ATen wins) Size: 8000 Avg. elapsed time: 2.7791500091552735e-05 vs 7.031106948852539e-05 (TH wins) Size: 80000 Avg. elapsed time: 0.00018650460243225096 vs 0.00024368906021118164 (TH wins) Size: 800000 Avg. elapsed time: 0.0010522041320800782 vs 0.0003039860725402832 (ATen wins) Size: 8000000 Avg. elapsed time: 0.009976618766784668 vs. 0.0029211788177490234 (ATen wins) ``` These results show the TH solution still performs better than the ATen solution with default threading for some sizes. It seems like removing CPU var_all and std_all will require an improvement in ATen reductions. https://github.com/pytorch/pytorch/issues/40570 has been updated with this information. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43858 Reviewed By: zou3519 Differential Revision: D23498981 Pulled By: mruberry fbshipit-source-id: 34bee046c4872d11c3f2ffa1b5beee8968b22050	2020-09-06 09:40:54 -07:00
Mike Ruberry	83a6e7d342	Adds inequality testing aliases for better NumPy compatibility (#43870 ) Summary: This PR adds the following aliaes: - not_equal for torch.ne - greater for torch.gt - greater_equal for torch.ge - less for torch.lt - less_equal for torch.le This aliases are consistent with NumPy's naming for these functions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43870 Reviewed By: zou3519 Differential Revision: D23498975 Pulled By: mruberry fbshipit-source-id: 78560df98c9f7747e804a420c1e53fd1dd225002	2020-09-06 09:36:23 -07:00
Muthu Arivoli	719d29dab5	Implement torch.i0 and torch.kaiser_window (#43132 ) Summary: Related to https://github.com/pytorch/pytorch/issues/38349 Pull Request resolved: https://github.com/pytorch/pytorch/pull/43132 Reviewed By: smessmer Differential Revision: D23479072 Pulled By: mruberry fbshipit-source-id: 4fb1de44830771c6a7222cf19f7728d9ac7c043b	2020-09-05 23:11:47 -07:00
Milind Yishu Ujjawal	ab7606702c	Rectified a few grammatical errors in documentation (#43695 ) Summary: Rectified a few grammatical errors in documentation of pytorch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43695 Reviewed By: anjali411 Differential Revision: D23451600 Pulled By: ezyang fbshipit-source-id: bc7b34c240fde1b31cac811080befa2ff2989395	2020-09-02 23:59:45 -07:00
anjali411	129f406062	Make torch.conj() a composite function and return self for real tensors (#43270 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43270 `torch.conj` is a very commonly used operator for complex tensors, but it's mathematically a no op for real tensors. Switching to tensorflow gradients for complex tensors (as discussed in #41857) would involve adding `torch.conj()` to the backward definitions for a lot of operators. In order to preserve autograd performance for real tensors and maintain numpy compatibility for `torch.conj`, this PR updates `torch.conj()` which behaves the same for complex tensors but performs a view/returns `self` tensor for tensors of non-complex dtypes. The documentation states that the returned tensor for a real input shouldn't be mutated. We could perhaps return an immutable tensor for this case in future when that functionality is available (zdevito ezyang ). Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D23460493 Pulled By: anjali411 fbshipit-source-id: 3b3bf0af55423b77ff2d0e29f5d2c160291ae3d9	2020-09-02 17:06:04 -07:00
kshitij12345	b6b5ebc345	Add `torch.vdot` (#43004 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/42747 Pull Request resolved: https://github.com/pytorch/pytorch/pull/43004 Reviewed By: mruberry Differential Revision: D23318935 Pulled By: anjali411 fbshipit-source-id: 12d4824b7cb42bb9ca703172c54ec5c663d9e325	2020-09-02 09:00:30 -07:00
Hong Xu	6da26cf0d9	Update torch.range warning message regarding the removal version number (#43569 ) Summary: `torch.range` still hasn't been removed way after version 0.5. This PR fixes the warning message. Alternatively, we can remove `torch.range`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43569 Reviewed By: ngimel Differential Revision: D23408233 Pulled By: mruberry fbshipit-source-id: 86c4f9f018ea5eddaf80b78a3c54dfa41cfc6fa6	2020-08-31 22:23:32 -07:00
kiyosora	3682df77db	Implementing NumPy-like function torch.heaviside() (#42523 ) Summary: - Related with https://github.com/pytorch/pytorch/issues/38349 - Implementing the NumPy-like function `torch.heaviside()` . Pull Request resolved: https://github.com/pytorch/pytorch/pull/42523 Reviewed By: ngimel Differential Revision: D23416743 Pulled By: mruberry fbshipit-source-id: 9975bd9c9fa73bd0958fe9879f79a692aeb722d5	2020-08-31 15:54:56 -07:00

1 2 3 4 5 ...

503 Commits