Summary:
This patch writes documentation for `Tensor.record_stream()`, which is currently an undocumented API. I've discussed publishing it with colesbury in https://github.com/pytorch/pytorch/issues/23729.
The documentation is based on the introduction at the top of `c10/cuda/CUDACachingAllocator.cpp` (lines 47-50 as of commit 25d1496d58). ~~I didn't explain the full details of the memory block life cycle or the allocator's stream awareness, in order to keep the level of detail consistent with other documentation.~~ I explain the allocator's stream awareness in a note block.
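For illustration, a minimal usage sketch (assuming a CUDA device is available; the tensors and stream here are made up):

```python
import torch

x = torch.randn(1024, device="cuda")   # allocated on the current (default) stream
side_stream = torch.cuda.Stream()

with torch.cuda.stream(side_stream):
    y = x * 2                          # `x` is consumed on a different stream

# Tell the caching allocator that `x` is in use on `side_stream`, so its memory
# is not handed out to another tensor until that stream's work has finished.
x.record_stream(side_stream)
```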
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24078
Differential Revision: D16743526
Pulled By: zou3519
fbshipit-source-id: 05819c3cc96733e2ba93c0a7c0ca06933acb22f3
Summary:
This is a batch of documentation changes for style, correctness, and
updates to the new script API and recent TorchScript changes (e.g.
NamedTuple support).
For reviewers, ping me for a link to the rendered output.
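As a hedged illustration of the kind of recent change the docs now cover (the class and function here are made up), a NamedTuple can be used from TorchScript roughly like this:

```python
from typing import NamedTuple

import torch

class Point(NamedTuple):
    x: torch.Tensor
    y: torch.Tensor

@torch.jit.script
def make_point(t: torch.Tensor) -> Point:
    return Point(x=t, y=t + 1)

print(make_point(torch.ones(2)))
```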
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24371
Pulled By: driazati
Differential Revision: D16832417
fbshipit-source-id: a28e748cf1b590964ca0ae2dfb5d8259c766a203
Summary:
Stacked PRs
* #24258 - [jit] Add `trace_module` to docs
* **#24208 - [jit] Cleanup documentation around `script` and `trace`**
Examples and info were duplicated between `ScriptModule`, `script`, and
`trace`, so this PR consolidates them and moves some things around to make
the docs clearer.
For reviewers, if you want to see the rendered output, ping me for a link.
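To give a feel for the distinction the consolidated docs draw (a minimal sketch, not taken from the docs themselves):

```python
import torch

class Gate(torch.nn.Module):
    def forward(self, x):
        # data-dependent control flow: preserved by scripting, baked in by tracing
        if x.sum() > 0:
            return x * 2
        return x - 1

scripted = torch.jit.script(Gate())              # compiles the code, keeps control flow
traced = torch.jit.trace(torch.nn.Linear(3, 3),  # records the ops run for one example input
                         torch.randn(1, 3))
```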
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24208
Pulled By: driazati
Differential Revision: D16746236
fbshipit-source-id: fac3c6e762a31c897b132b8421baa8d4d61f694c
Summary:
**Patch Description**:
Update the docs to reflect that one no longer needs to install the TensorBoard nightly build, as TensorBoard 1.14.0 was [released last week](https://github.com/tensorflow/tensorboard/releases/tag/1.14.0).
**Testing**:
Haven't actually tested PyTorch with TensorBoard 1.14 yet. I'll update this PR once I have.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22026
Differential Revision: D16772136
Pulled By: orionr
fbshipit-source-id: 2e1e17300f304f50026837abbbc6ffb25704aac0
Summary:
This was previously buggy and not being displayed on master. This fixes
the issues with the script that generates the builtin function schemas and
moves the output to its own page (it's 6000+ lines of schemas).
It looks like Sphinx will just keep going if it hits errors when importing modules; we should find out how to turn that off and add the check to CI.
This also includes some other small fixes:
* removes internal-only args from the `script()` and `trace()` docs; this requires manually keeping the argument lists up to date, but I think the cleanliness is worth it
* removes an outdated note about early returns
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24056
Pulled By: driazati
Differential Revision: D16742406
fbshipit-source-id: 9102ba14215995ffef5aaafcb66a6441113fad59
Summary:
Adds new people and reorders sections so they make more sense.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23693
Differential Revision: D16618230
Pulled By: dzhulgakov
fbshipit-source-id: 74191b50c6603309a9e6d14960b7c666eec6abdd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23376
This uses the master version of sphinxcontrib-katex, as prerender
support was only added recently.
Fixes #20984
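For reference, a hedged sketch of what enabling prerendering in `docs/source/conf.py` might look like (option name taken from the sphinxcontrib-katex docs; treat it as an assumption):

```python
# docs/source/conf.py (sketch)
extensions = [
    "sphinxcontrib.katex",
]

# Render math to static HTML at build time instead of in the browser.
katex_prerender = True
```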
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D16582064
Pulled By: ezyang
fbshipit-source-id: 9ef24c5788c19572515ded2db2e8ebfb7a5ed44d
Summary:
Changelog:
- Rename `gels` to `lstsq`
- Fix all callsites
- Rename all tests
- Create a tentative alias for `lstsq` under the name `gels` and add a deprecation warning to discourage its use (see the sketch below)
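A minimal sketch of the rename from a user's point of view (shapes are illustrative):

```python
import torch

A = torch.randn(5, 3)   # overdetermined system
B = torch.randn(5, 2)

solution, qr = torch.lstsq(B, A)     # new name
solution_old, _ = torch.gels(B, A)   # old name still works, but emits a deprecation warning
```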
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23460
Test Plan: - All tests should pass to confirm that the patch is correct
Differential Revision: D16547834
Pulled By: colesbury
fbshipit-source-id: b3bdb8f4c5d14c7716c3d9528e40324cc544e496
Summary:
Use the recursive script API in the existing docs
TODO:
* Migration guide for 1.1 -> 1.2
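A rough sketch of the recursive API the docs now use (the module here is made up):

```python
import torch

class MyModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))

# With the recursive API, an ordinary nn.Module is compiled directly
# instead of having to subclass torch.jit.ScriptModule.
scripted = torch.jit.script(MyModule())
```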
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21612
Pulled By: driazati
Differential Revision: D16553734
fbshipit-source-id: fb6be81a950224390bd5d19b9b3de2d97b3dc515
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23417
Test Plan:
cd docs; make html
Imported from OSS
Differential Revision: D16523781
Pulled By: ilia-cher
fbshipit-source-id: d6c09e8a85d39e6185bbdc4b312fea44fcdfff06
Summary:
Thanks adefazio for the feedback. This adds a note to the contribution guide so that folks don't start working on code without first checking with the maintainers.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23513
Differential Revision: D16546685
Pulled By: soumith
fbshipit-source-id: 1ee8ade963703c88374aedecb8c9e5ed39d7722d
Summary:
This is still a work in progress.
There are several more items to add to complete this doc, including
- [x] LHS indexing, index assignments.
- [x] Tensor List.
- [x] ~~Shape/Type propagation.~~
- [x] FAQs
Please review and share your thoughts; feel free to add anything that you think should be included as well. houseroad spandantiwari lara-hdr neginraoof
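As a hedged illustration of the export entry point the doc is organized around (the model and file name are made up):

```python
import torch

model = torch.nn.Sequential(torch.nn.Linear(3, 3), torch.nn.ReLU())
dummy_input = torch.randn(1, 3)

# The doc walks through what the exporter supports (indexing, tensor lists,
# etc.); this just shows the basic export call.
torch.onnx.export(model, dummy_input, "model.onnx")
```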
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23185
Differential Revision: D16459647
Pulled By: houseroad
fbshipit-source-id: b401c005f848d957541ba3b00e00c93ac2f4609b
Summary:
With this change you can now list multiple interfaces separated by
commas. ProcessGroupGloo creates a single Gloo context for every device
in the list (a context represents a connection to every other
rank). For every collective that is called, it selects a context
in round-robin fashion. The number of worker threads responsible for
executing the collectives is set to twice the number of devices.
If you have a single physical interface and wish to increase
parallelism, you can also specify
`GLOO_SOCKET_IFNAME=eth0,eth0,eth0,eth0`. This makes ProcessGroupGloo
use 4 connections per rank, 4 I/O threads, and 8 worker threads
responsible for executing the collectives.
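A hedged sketch of how this might be used (interface names, address, and world size are illustrative):

```python
import os
import torch.distributed as dist

# List several interfaces (or repeat a single one) to get multiple Gloo
# contexts, I/O threads, and worker threads per rank.
os.environ["GLOO_SOCKET_IFNAME"] = "eth0,eth0,eth0,eth0"

dist.init_process_group(
    backend="gloo",
    init_method="tcp://127.0.0.1:23456",
    rank=0,
    world_size=1,
)
```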
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22978
ghstack-source-id: 87006270
Differential Revision: D16339962
fbshipit-source-id: 9aa1dc93d8e131c1714db349b0cbe57e9e7266f1
Summary:
Covers fleet-wide profiling, API logging, etc.
It's my first time writing reST, so suggestions are definitely welcome.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23010
Differential Revision: D16456721
Pulled By: dzhulgakov
fbshipit-source-id: 3d3018f41499d04db0dca865bb3a9652d8cdf90a
Summary:
I manually went through all functions in `torch.*` and corrected any mismatches between the arguments mentioned in the docs and the ones actually accepted by the function. This fixes https://github.com/pytorch/pytorch/issues/8698.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22973
Differential Revision: D16419602
Pulled By: yf225
fbshipit-source-id: 5562c9b0b95a0759abee41f967c45efacf2267c2
Summary:
This cleans up the `torch.utils.tensorboard` API to remove all kwargs usage (which isn't clear to the user) and removes the "experimental" warning in preparation for our 1.2 release.
We also no longer need the additional PyTorch version checks now that the code lives in the PyTorch codebase itself.
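For context, typical usage of the cleaned-up API looks roughly like this (log directory and values are made up):

```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/example")
for step in range(10):
    writer.add_scalar("loss", 1.0 / (step + 1), global_step=step)
writer.close()
```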
cc ezyang lanpa natalialunova
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21786
Reviewed By: natalialunova
Differential Revision: D15854892
Pulled By: orionr
fbshipit-source-id: 06b8498826946e578824d4b15c910edb3c2c20c6
Summary:
# What is this?
This is an implementation of the AdamW optimizer as implemented in the fastai library (`fastai/callback.py` at commit 803894051b) and as initially introduced in the paper [Decoupled Weight Decay Regularization](https://arxiv.org/abs/1711.05101). It decouples the weight decay regularization step from the optimization step during training.
There have already been several abortive attempts to push this into PyTorch in some form or fashion: https://github.com/pytorch/pytorch/pull/17468, https://github.com/pytorch/pytorch/pull/10866, https://github.com/pytorch/pytorch/pull/3740, https://github.com/pytorch/pytorch/pull/4429. Hopefully this one goes through.
# Why is this important?
Via a simple reparameterization, it can be shown that L2 regularization has a weight decay effect in the case of SGD optimization. Because of this, L2 regularization became synonymous with the concept of weight decay. However, it can be shown that the equivalence of L2 regularization and weight decay breaks down for more complex adaptive optimization schemes. It was shown in the paper [Decoupled Weight Decay Regularization](https://arxiv.org/abs/1711.05101) that this is the reason why models trained with SGD achieve better generalization than those trained with Adam. Weight decay is a very effective regularizer. L2 regularization, in and of itself, is much less effective. By explicitly decaying the weights, we can achieve state-of-the-art results while also taking advantage of the quick convergence properties that adaptive optimization schemes have.
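A minimal usage sketch of the new optimizer (the model and hyperparameters are illustrative):

```python
import torch

model = torch.nn.Linear(10, 1)

# Weight decay is applied directly to the parameters, decoupled from the
# Adam gradient update, rather than being folded in as an L2 penalty.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)

loss = model(torch.randn(4, 10)).pow(2).mean()
loss.backward()
optimizer.step()
```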
# How was this tested?
Test cases were added to `test_optim.py`, and I also ran a [little experiment](https://gist.github.com/mjacar/0c9809b96513daff84fe3d9938f08638) to validate that this implementation is equivalent to the fastai implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21250
Differential Revision: D16060339
Pulled By: vincentqb
fbshipit-source-id: ded7cc9cfd3fde81f655b9ffb3e3d6b3543a4709