* Add optimizer_for_mobile doc into python api root doc
* Apply suggestions from code review
Remove all references to `optimization_blacklist` as it's missing in 1.6
Co-authored-by: Nikita Shulga <nshulga@fb.com>
Summary:
In short, we messed up. The SHM and CMA backends of TensorPipe are Linux-specific and thus they are guarded by an #ifdef in the agent's code. Due to a mishap with CMake (caused by the fact that TensorPipe has two CMake files, one for PyTorch and a "standalone" one), we were not correctly propagating some flags, so these #ifdefs were always false. This means that these two backends have always been disabled and have thus never been covered by our OSS CI. It would be irresponsible to enable them now in v1.6, so instead we remove any mention of them from the docs.
Note that this is perhaps not as bad as it sounds. These two backends provided higher performance (lower latency) when the two endpoints were on the same machine. However, I suspect that most RPC users will only do transfers across machines, for which SHM and CMA wouldn't have played any role.
Original PR against master: #41200 (merged as dde3d5f4a8)
Test Plan: Docs only
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40461
It turned out `:inherited-members:` (see [doc](https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#directive-autoclass)) is not really usable.
This is because pybind11 generates a docstring that annotates `self` with the parent-class type, `rpc.PyRRef`.
As a workaround, I am pulling the docstrings from the parent class, `PyRRef`, into the subclass, `RRef`, and doing surgery on the docstrings generated by pybind11.
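A toy, self-contained sketch of that kind of docstring surgery (the class and helper names here are illustrative, not the actual implementation):
```
# Hypothetical stand-in: PyRRefLike plays the role of the pybind11 PyRRef.
class PyRRefLike:
    def local_value(self):
        """local_value(self: PyRRefLike) -> object"""
        return 42

class RRef(PyRRefLike):
    pass

def _forward_with_doc(name, doc):
    # Define a thin forwarding method that carries the patched docstring.
    def method(self, *args, **kwargs):
        return getattr(super(RRef, self), name)(*args, **kwargs)
    method.__doc__ = doc.replace("PyRRefLike", "RRef")
    return method

for name in ("local_value",):
    setattr(RRef, name, _forward_with_doc(name, getattr(PyRRefLike, name).__doc__))

print(RRef().local_value())       # 42
print(RRef.local_value.__doc__)   # local_value(self: RRef) -> object
```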
ghstack-source-id: 106472496
P134031188
Differential Revision: D7933834
fbshipit-source-id: c03a8a4c9d98888b64492a8caba1591595bfe247
Co-authored-by: Shihao Xu <shihaoxu@fb.com>
Summary:
Currently, a custom autograd function written with
```
@staticmethod
@torch.cuda.amp.custom_fwd(cast_inputs=dtype)
def forward(ctx, *args):
    ...
```
casts incoming floating-point CUDA tensors to `dtype` unconditionally, regardless of whether the function executes in an autocast-enabled region. I think I had the wrong idea there. Autocast-disabled regions should give the user control of input types. Also, `custom_fwd(cast_inputs=dtype)`-decorated functions' behavior should align with native fp32list/fp16list functions. C++-side casting wrappers have no effect when autocast is disabled, and `custom_fwd`'s casting should behave the same way.
The present PR changes `custom_fwd` so it only casts in autocast-enabled regions (also updates custom_fwd to ignore fp64 inputs, like the C++ wrappers).
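For illustration, a minimal sketch of the intended behavior (the `float16` choice and the toy function are just examples):
```
import torch
from torch.cuda.amp import custom_fwd, custom_bwd, autocast

class MyMM(torch.autograd.Function):
    @staticmethod
    @custom_fwd(cast_inputs=torch.float16)
    def forward(ctx, a, b):
        ctx.save_for_backward(a, b)
        return a.mm(b)

    @staticmethod
    @custom_bwd
    def backward(ctx, grad):
        a, b = ctx.saved_tensors
        return grad.mm(b.t()), a.t().mm(grad)

a = torch.randn(8, 8, device="cuda", requires_grad=True)
b = torch.randn(8, 8, device="cuda", requires_grad=True)

with autocast():
    out = MyMM.apply(a, b)  # inputs are cast to float16 (autocast enabled)
out = MyMM.apply(a, b)      # after this PR: inputs are left alone (autocast disabled)
```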
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36171
Differential Revision: D22179511
Pulled By: ngimel
fbshipit-source-id: 5a93d070179a43206066bce19da0a5a19ecaabbd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40377
Cleans up the docstring for quantized ELU and adds it to the quantization docs.
Test Plan: * build on Mac OS and inspect
Differential Revision: D22162834
Pulled By: vkuzo
fbshipit-source-id: e548fd4dc8d67db27ed19cac4dbdf2a942586759
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40346
Cleans up docstrings for quantized BatchNorm and adds to quantization docs
Test Plan: * build on Mac OS and inspect
Differential Revision: D22152633
Pulled By: vkuzo
fbshipit-source-id: e0bf02194158231e0205b5b2df7f6f1ffc3c4d65
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40345
Fixes docstrings and adds to quantization docs for quantized InstanceNorm.
Test Plan: * build on Mac OS and inspect
Differential Revision: D22152637
Pulled By: vkuzo
fbshipit-source-id: 7a485311ead20796b7a0944827d1d04e14ec8dcd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40343
Cleans up the quantized GroupNorm docstring and adds it to quantization docs.
Test Plan: * build on Mac OS and inspect
Differential Revision: D22152635
Pulled By: vkuzo
fbshipit-source-id: 5553b841c7a5d77f1467f0c40657db9e5d730a12
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40342
Cleans up the docstrings for quantized LayerNorm, and adds it to the docs.
Test Plan: * build on Mac OS and inspect
Differential Revision: D22152639
Pulled By: vkuzo
fbshipit-source-id: 38adf14b34675d1983ac4ed751938aa396e5400b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40341
Cleans up the hardtanh docstring and adds it to quantization docs.
Test Plan: * build and inspect on Mac OS
Differential Revision: D22152636
Pulled By: vkuzo
fbshipit-source-id: c98e635199c8be332aa6958664ff23faad834908
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40340
Adds and simplifies quantization docs for hardsigmoid
Test Plan:
* build docs on Mac OS
* inspect
Differential Revision: D22152634
Pulled By: vkuzo
fbshipit-source-id: 18da273023fb00e5f0bc1e881b00536492c606d3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40323
Cleans up the naming and the function param docs for quantized hardswish.
Removes redundant docstrings and links to the floating point modules instead.
Test Plan:
* build the docs on Mac OS
* verify that every link works as expected
Differential Revision: D22152638
Pulled By: vkuzo
fbshipit-source-id: fef04874ae460b449c677424a6a1c6dd47054795
Summary:
https://github.com/pytorch/pytorch/pull/40129 fixed the error responsible for the first revert, but exposed another error in the same test.
This PR is intended as the "master copy" for merge, and it runs on full CI.
Two other PRs (restricted to run on a small subset of CI) support debugging DDP failures/hangs with multiple devices per process (`test_c10d.py:DistributedDataParallelTest.test_grad_layout_1devicemodule_2replicaperprocess`):
- https://github.com/pytorch/pytorch/pull/40290 tries the test with purely rowmajor contiguous params on an untouched master. In other words https://github.com/pytorch/pytorch/pull/40290 contains none of this PR's diffs aside from the test itself.
- https://github.com/pytorch/pytorch/pull/40178, for comparison, tries the test with this PR's diffs.
Both fail the same way, indicating failure is unrelated to this PR's other diffs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40358
Differential Revision: D22165785
Pulled By: albanD
fbshipit-source-id: ac7cdd79af5c080ab74341671392dca8e717554e
Summary:
Removes the line mentioning `ProcessGroupRoundRobin` since we don't intend it to be used as a public API just yet. We can add this back when we officially support the API.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40380
Differential Revision: D22165556
Pulled By: rohan-varma
fbshipit-source-id: 24d0477d881dc74f2ff579de61dfd1ced2b09e75
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38490
A meta tensor is a tensor that is a lot like a normal tensor,
except it doesn't actually have any data associated with it.
You can use them to carry out shape/dtype computations without
actually running the code; for example, this could
be used to do shape inference in a JIT analysis pass.
Check out the description in DispatchKey.h for more information.
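A minimal sketch of the idea (assuming meta tensors are exposed through
the `meta` device string, and sticking to `torch.add`, the one op wired
up in this PR):
```
import torch

x = torch.empty(2, 3, device="meta")
y = torch.empty(2, 3, device="meta")
z = torch.add(x, y)  # only shape/dtype are computed; no data is ever allocated
print(z.shape, z.dtype, z.device)  # torch.Size([2, 3]) torch.float32 meta
```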
Meta tensors are part of a larger project to rationalize how we
write kernels so that we don't have to duplicate shape logic
in the CPU, CUDA, and meta kernels (this PR makes the
duplication problem worse!). However, that infrastructure can
be built on top of this proof of concept, which just shows how
you can start writing meta kernels today even without this
infrastructure.
There are a lot of things that don't work:
- I special cased printing for dense tensors only; if you try to
allocate a meta sparse / quantized tensor things aren't going
to work.
- The printing formula implies that torch.tensor() can take an
ellipsis, but I didn't add this.
- I wrote an example formula for binary operators, but it isn't
even right! (It doesn't do type promotion or memory layout
correctly). The most future-proof way to do it right is to
factor the relevant computation out of TensorIterator,
as it is quite involved.
- Nothing besides torch.add works right now
- Meta functions are ALWAYS included in mobile builds (selective
build doesn't work on them). This isn't a big deal for now
but will become more pressing as more meta functions are added.
One reason I'm putting up this PR now is to check with Yinghai Lu
if we can unblock shape inference for accelerators, while we are
still working on a long term plan for how to unify all shape
computation across our kernels.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D21935609
Pulled By: ezyang
fbshipit-source-id: f7d8636eeb8516b6bc296db99a16e56029972eee
Summary: NVIDIA's Apex is updating to no longer rely on this behavior, but we're reverting this Python2->Python3 update to unblock internal apex users.
Test Plan: Sandcastle + OSS CI.
Reviewed By: ngimel
Differential Revision: D22146782
fbshipit-source-id: f9483d2cbf9dc3a469ad48a6c863edea3ae51070
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40296
1. Added a link to the parameter server tutorial
2. Explained the current state of TorchScript support
Test Plan: Imported from OSS
Differential Revision: D22142647
Pulled By: mrshenli
fbshipit-source-id: ffd697dd64a3aa874cf3f3488122ed805903370d
Summary:
Update the pytorch/onnx docs for the new export API args:
the external data format and training args.
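A minimal sketch of the two args being documented (the model and parameter values here are illustrative, based on the 1.6-era export API):
```
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())
dummy = torch.randn(1, 3, 32, 32)
torch.onnx.export(
    model, dummy, "model.onnx",
    training=torch.onnx.TrainingMode.TRAINING,  # export with training-mode semantics
    use_external_data_format=True,              # store large weights in separate files
)
```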
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39802
Reviewed By: hl475
Differential Revision: D22139664
Pulled By: houseroad
fbshipit-source-id: 7d6dcf75129cb88987f8c37b7d9d48ca594c0f38
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40222
Mention the TensorPipe agent in the RPC docs and give users the information they need to choose which agent to use.
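For example, a minimal sketch of opting in to the TensorPipe agent explicitly (argument names assumed from the 1.6-era API, where the ProcessGroup agent remains the default):
```
import torch.distributed.rpc as rpc

rpc.init_rpc(
    "worker0",
    rank=0,
    world_size=2,
    backend=rpc.BackendType.TENSORPIPE,  # opt in to the TensorPipe agent
    rpc_backend_options=rpc.TensorPipeRpcBackendOptions(rpc_timeout=60),
)
```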
ghstack-source-id: 106225711
Test Plan: Export to GitHub, build locally and try out the docs.
Differential Revision: D22116494
fbshipit-source-id: 30703ba8410c40f64e785f60d71dfd9faa8de4a1
Summary:
**Summary**
This commit adds support for with statements to PyTorch JIT. Each
of the with items in a with statement is represented in the JIT IR
as a pair of `prim::Enter` and `prim::Exit` nodes that call the
`__enter__` and `__exit__` methods defined on the context manager objects
returned by the expressions in the with item.
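For example, a hypothetical context manager and scripted function (illustrative only; the exact type-annotation requirements may differ):
```
from typing import Any
import torch

@torch.jit.script
class Scale(object):
    def __init__(self, s: float):
        self.s = s

    def __enter__(self) -> float:
        return self.s

    def __exit__(self, type: Any, value: Any, tb: Any):
        pass

@torch.jit.script
def scaled(x: torch.Tensor) -> torch.Tensor:
    # __enter__/__exit__ are invoked via the prim::Enter / prim::Exit nodes
    with Scale(2.0) as s:
        x = x * s
    return x

print(scaled(torch.ones(2)))  # tensor([2., 2.])
```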
**Testing**
This commit adds unit tests for with statements with named with items,
nameless with items, and with statements that encounter exceptions.
```
$ python test/test_jit.py TestWith.test_with_as
Fail to import hypothesis in common_utils, tests are not derandomized
.
----------------------------------------------------------------------
Ran 1 test in 0.430s
OK
```
```
$ python test/test_jit.py TestWith.test_with_no_as
Fail to import hypothesis in common_utils, tests are not derandomized
.
----------------------------------------------------------------------
Ran 1 test in 0.264s
OK
```
```
$ python test/test_jit.py TestWith.test_with_exceptions
Fail to import hypothesis in common_utils, tests are not derandomized
Couldn't download test skip set, leaving all tests enabled...
.
----------------------------------------------------------------------
Ran 1 test in 1.053s
OK
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34705
Differential Revision: D22095945
Pulled By: SplitInfinity
fbshipit-source-id: f661565a834786725259b8ea014b4d7532f9419d
Summary:
BC-breaking note:
If a user was using one of these dunders directly, they will no longer be available. Users should update to the Python3-compatible dunders.
Original PR note:
`__div__` (and `__idiv__` and `__rdiv__`) are no longer special dunders in Python3. This PR replaces them with the `__truediv__` (`__itruediv__`, `__rtruediv__`) dunders, since we no longer support Python2.
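A minimal sketch of the user-facing change (tensor values are illustrative):
```
import torch

a, b = torch.ones(3), torch.full((3,), 2.0)
# a.__div__(b)            # no longer available
print(a.__truediv__(b))   # use the Python3 dunder instead...
print(a / b)              # ...or simply the operator
```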
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39151
Differential Revision: D22075713
Pulled By: mruberry
fbshipit-source-id: d318b47b51f7cc4c3728b1606a34d81e49ba0fa1
Summary:
Fixes gh-40046
PR gh-37419 refactored the content of `docs/source/rpc/index.rst` into `docs/source/rpc.rst` but did not link to the latter from `docs/source/index.rst`, so the top-level RPC documentation is missing from https://pytorch.org/docs/master/.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40077
Differential Revision: D22068128
Pulled By: mrshenli
fbshipit-source-id: 394433f98f86509e0c9cb6d910a86fb8a2932683
Summary:
Currently, whether `AccumulateGrad` [steals](67cb018462/torch/csrc/autograd/functions/accumulate_grad.h (L42)) or [clones](67cb018462/torch/csrc/autograd/functions/accumulate_grad.h (L80)) an incoming gradient, the gradient ends up rowmajor contiguous, regardless of its param's layout. If the param's layout is channels last, or otherwise not rowmajor contiguous, later kernels that apply gradients to params are forced into an uncoalesced memory access pattern for either the param or the gradient. This may not sound like a big deal, but for any binary op on large tensors it's a >3X increase in gmem traffic => 3X slowdown.
The present PR changes `AccumulateGrad` to prefer, where possible, stashing gradients that match their params' layouts (["Gradient Layout Contract"](https://github.com/pytorch/pytorch/pull/34904/files#diff-ef1a56d24f66b280dcdb401502d6a796R29-R38)).
Allowing `AccumulateGrad` to stash non-rowmajor-contiguous grads means DDP allreduces and DP reduces must allow non-rowmajor-contiguous grads. This PR extends DDP and DP to allow gradients with non-rowmajor-contiguous strides as long as their layout is nonoverlapping and dense.
For good measure, I include changes that allow all five nccl primitives (allreduce, reduce, broadcast, allgather, reducescatter) to act on non-rowmajor-contiguous tensors (again as long as each input's layout is nonoverlapping and dense, and as long as all tensors participating in a given collective have the same layout). The primitive comm changes aren't necessary to enable the DDP changes, but I wasn't sure this would end up true until I had written both sets of changes. I think primitive comm enablement is reasonable to keep in the PR, especially since the code for it is simple.
Channels last params will be a major beneficiary of this PR, but I don't see it as a channels-last-specific fix. The spirit is layout matching in general:
- Grads should be stashed with memory layouts matching their params.
- Src and dst tensors on opposite ends of collectives should have matching dense layouts.
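As a concrete sketch of the expected behavior after this PR (the exact strides can vary by op and backend):
```
import torch

conv = torch.nn.Conv2d(3, 8, kernel_size=3).to(memory_format=torch.channels_last)
x = torch.randn(2, 3, 16, 16).contiguous(memory_format=torch.channels_last)
conv(x).sum().backward()

w = conv.weight
print(w.is_contiguous(memory_format=torch.channels_last))       # True
print(w.grad.is_contiguous(memory_format=torch.channels_last))  # grad layout matches the param
```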
This PR also updates autograd docs to describe potential BC-breaking changes below.
## BC notes
ngimel albanD gchanan
#### BC-breaking
In the common case where the user lets AccumulateGrad decide grad layouts, strides for grads of dense but non-rowmajor-contiguous params will change. Any user code that was accustomed to `view(-1)`ing these grads will break.
Also, the circumstances under which a grad can be stolen directly from the backward function that created it, as opposed to deep-copied by AccumulateGrad, have changed. In most cases we expect a silent performance improvement, because we expect channels-last-aware backward kernels to create channels-last gradients for channels-last params. Now those can be stolen, whereas before this PR they were cloned and made rowmajor contiguous. IMO this is a mild BC breakage. Param backward hooks still see grads come in with whatever format the backward kernel gave them. The only BC breakage potential I see is if user code somehow relies on a grad seen in a hook sharing (or not sharing) memory with the eventual `param.grad`. Any such users hopefully know they're off the edge of the map and understand how to update their expectations.
#### BC escape hatches
At alband's recommendation, this PR's changes to AccumulateGrad do not alter the pre-PR code's decisions about whether grad is accumulated in or out of place. Accumulations of new grads onto an existing `.grad` attribute were (usually) in-place before this PR and remain in-place after this PR, keeping the existing `.grad`'s layout. After this PR, if the user wants to force accumulation into a grad with a particular layout, they can preset `param.grad` to a zeroed tensor with the desired strides or call `grad.contiguous(desired format)`. This likely won't be as performant as letting AccumulateGrad establish grad layouts by cloning or stealing grads with contract-compliant strides, but at least users have a control point.
One limitation (present before this PR and unchanged by this PR): Presetting `param.grad` does not ensure in-place accumulation all the time. For example, if `create_graph=True`, or if incoming `new_grad` is dense and existing `variable_grad` is sparse, accumulation occurs out of place, and the out-of-place result may not match the existing grad's strides.
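A minimal sketch of the escape hatch from the paragraph above, for a simple case where accumulation stays in place (the shapes are illustrative):
```
import torch

p = torch.nn.Parameter(torch.randn(8, 3, 3, 3))
p.grad = torch.zeros_like(p, memory_format=torch.channels_last)  # preset the desired layout
(p * 2).sum().backward()  # accumulates in place into the preset grad
print(p.grad.is_contiguous(memory_format=torch.channels_last))   # True
```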
----------------------------
I also noticed some potential DDP improvements that I considered out of scope but want to mention for visibility:
1. make sure Reducer's ops sync with AccumulateGrad streams
2. ~to reduce CPU overhead and incur fewer kernel launches, lazily create flat `contents` tensors by a single `cat` kernel only when a bucket is full, instead of `copy_`ing grads into `contents` individually as soon as they are received.~ PR includes a [minor change](https://github.com/pytorch/pytorch/pull/34904/files#diff-c269190a925a4b0df49eda8a8f6c5bd3R312-R315) to divide grads while copying them into flat buffers, instead of copying them in, then dividing separately. Without cat+div fusion, div-while-copying is the best we can do.
3. https://github.com/pytorch/pytorch/issues/38942
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34904
Differential Revision: D20496044
Pulled By: albanD
fbshipit-source-id: 248d680f4b1bf77b0a986451844ec6e254469217
Summary:
This PR aims to add `arccosh`, `arcsinh` and `arctanh` support. Please see issue https://github.com/pytorch/pytorch/issues/38349 for more details.
**TODOs:**
* [x] Add test cases for `arccosh`, `arcsinh` and `arctanh`. (need help)
* [x] Overload ops if `std::op` does not work with `thrust::complex` types (like for `sinh`, `cosh`).
Note: `std::acosh`, `std::asinh`, `std::atanh` do not support `thrust::complex` types. Added support for complex types for these 3 ops (`arccosh`, `arcsinh`, `arctanh`).
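A minimal sketch of exercising the new ops (assuming they are exposed under the standard `acosh`/`asinh`/`atanh` names):
```
import torch

x = torch.tensor([1.5, 2.0, 3.0])
print(torch.acosh(x), torch.asinh(x), torch.atanh(1.0 / x))

z = torch.tensor([1.5 + 0.5j])
print(torch.asinh(z))  # complex support (the CUDA path uses the thrust::complex overloads)
```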
cc: mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38388
Differential Revision: D21882055
Pulled By: mruberry
fbshipit-source-id: d334590b47c5a89e491a002c3e41e6ffa89000e3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39331
Fixes gh-37590
Adds an extra `make coverage` target to the documentation build, which uses the built-in Sphinx facility for checking docstring coverage. Also fixes a failure to import `torch/jit/supported_ops.py`, which broke the [TorchScript Builtins](https://pytorch.org/docs/stable/jit_builtin_functions.html) page.
This also adds the required `SPHINXOPTS` to turn warnings into errors, but this is commented out. Note that since the `torchvision` documentation is merged in here, failures there would cause failures here if this is made active. Some thought might be needed about pinning the torchvision version whose documentation is merged in.
The first commit should fail, since the "ScriptModule" class is commented out. I did that in order to check that a CI failure is properly reported.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38244
Differential Revision: D21640589
Pulled By: ezyang
fbshipit-source-id: 1e240d81669b5f21404d596de4a27d192dc9fd8a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39216
The `rpc.functions.async_execution` decorator specifies that the
wrapped function is guaranteed to return a `torch.futures.Future`.
The decorator adds a `_wrapped_async_rpc_function` attribute to
the wrapper function. The caller retrieves this information and
then sets the `isAsyncFunction` argument accordingly, which is later
added to the PythonCall RPC message as a field. On the callee side,
if the PythonCall carries an asynchronous function, it will cast
the function's return value to a jit::PythonFutureWrapper object,
and then installs response creation and communication as a callback
on that jit::PythonFutureWrapper.
For applications, this feature is useful when a function needs to
wait for IO or additional signaling. In those cases, marking the
user function as `rpc.functions.async_execution` will prevent it
from blocking one thread on the callee for too long.
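For example, a minimal sketch of such a decorated function (the chained `then` callback stands in for the IO/signaling wait described above):
```
import torch
import torch.distributed.rpc as rpc

@rpc.functions.async_execution
def async_add_one(to, x, y):
    # Return a Future immediately; the RPC response is sent only once it completes.
    return rpc.rpc_async(to, torch.add, args=(x, y)).then(
        lambda fut: fut.wait() + 1
    )

# A caller would then use it like any other RPC target, e.g.:
#   ret = rpc.rpc_sync("worker1", async_add_one, args=("worker2", torch.ones(2), 1))
```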
Test Plan: Imported from OSS
Reviewed By: rohan-varma
Differential Revision: D21779962
fbshipit-source-id: 6b6aa698bf6f91dad6ed2a7ee433df429b59e941