Commit Graph

347 Commits

Frank Lin
249e65b92d Graph-Safe RNG State Exchange for Tensor Parallelism (#114068)
See #113541

This PR allows registering and controlling multiple RNG states by index, ensuring CUDA-graph-safe operations, and includes both C++ and Python API changes to support this functionality.
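
For context, a minimal sketch using the existing public per-device RNG API (not the new graph-safe calls this PR adds, which extend this idea so state swaps can happen inside a captured CUDA graph):

```python
import torch

# Each CUDA device already has its own RNG state that can be
# snapshotted and restored by device index.
state0 = torch.cuda.get_rng_state(device=0)   # snapshot device 0's state
torch.cuda.manual_seed(1234)                  # perturb the current device's state
torch.cuda.set_rng_state(state0, device=0)    # restore it by index
```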

cc  @eellison @anijain2305 @jansel @ezyang @ptrblck @csarofeen @mcarilli
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114068
Approved by: https://github.com/ezyang, https://github.com/eqy, https://github.com/xuzhao9
2024-03-27 01:14:38 +00:00
PyTorch MergeBot
4dc09d6aa4 Revert "Graph-Safe RNG State Exchange for Tensor Parallelism (#114068)"
This reverts commit e9dcda5cba.

Reverted https://github.com/pytorch/pytorch/pull/114068 on behalf of https://github.com/ezyang due to memory leak in another ci ([comment](https://github.com/pytorch/pytorch/pull/114068#issuecomment-2018044527))
2024-03-25 13:49:04 +00:00
Frank Lin
e9dcda5cba Graph-Safe RNG State Exchange for Tensor Parallelism (#114068)
See #113541

This PR allows registering and controlling multiple RNG states by index, ensuring CUDA-graph-safe operations, and includes both C++ and Python API changes to support this functionality.

cc  @eellison @anijain2305 @jansel @ezyang @ptrblck @csarofeen @mcarilli
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114068
Approved by: https://github.com/ezyang
2024-03-21 01:57:08 +00:00
Jane Xu
37e563276b Document complex optimizer semantic behavior (#121667)
[screenshot of the rendered docs section]

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121667
Approved by: https://github.com/albanD
2024-03-16 00:43:47 +00:00
chilli
ed8eebd1c2 Changed cublas reproducibility URL (#121534)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121534
Approved by: https://github.com/Skylion007
2024-03-08 23:46:21 +00:00
Svetlana Karslioglu
5ae6f6cffe Test seo torch cuda (#119324)
Testing whether this will help improve the SEO of this page.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119324
Approved by: https://github.com/albanD
2024-02-07 00:39:51 +00:00
Mikayla Gawarecki
9ffed22391 Document file format returned by torch.save (#118719)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118719
Approved by: https://github.com/albanD
2024-02-03 02:11:44 +00:00
Will Constable
abe3c55a6a Update DDP dynamo debug docs (#118295)
Refreshes https://github.com/pytorch/pytorch/pull/114201 and updates it to cover other log names that also include ddp_optimizer.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118295
Approved by: https://github.com/LucasLLC, https://github.com/wanchaol
2024-01-29 14:58:26 +00:00
Stas Bekman
86b4b27e26 [docs] start a new FSDP notes doc (#117323)
As discussed on [slack](https://pytorch.slack.com/archives/C3PDTEV8E/p1703699711772289) adding Andrew Gu's advanced FSDP design notes with a few additions from myself based on our discussion.

I hope I got the RST right; I haven't written RST in a while.

- The first section is Andrew's words verbatim + formatting
- The second section is Andrew's words verbatim + formatting + a few of my additions that were confirmed by Andrew, and which hopefully should help understand the process better.

tagging @albanD as requested.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117323
Approved by: https://github.com/awgu
2024-01-22 15:46:35 +00:00
PyTorch MergeBot
02209b5880 Revert "[docs] start a new FSDP notes doc (#117323)"
This reverts commit 7f474da6bc.

Reverted https://github.com/pytorch/pytorch/pull/117323 on behalf of https://github.com/awgu due to broke docs ([comment](https://github.com/pytorch/pytorch/pull/117323#issuecomment-1902740900))
2024-01-21 19:47:27 +00:00
Stas Bekman
7f474da6bc [docs] start a new FSDP notes doc (#117323)
As discussed on [slack](https://pytorch.slack.com/archives/C3PDTEV8E/p1703699711772289) adding Andrew Gu's advanced FSDP design notes with a few additions from myself based on our discussion.

I hope I got the RST right; I haven't written RST in a while.

- The first section is Andrew's words verbatim + formatting
- The second section is Andrew's words verbatim + formatting + a few of my additions that were confirmed by Andrew, and which hopefully should help understand the process better.

tagging @albanD as requested.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117323
Approved by: https://github.com/albanD, https://github.com/awgu
2024-01-21 15:11:24 +00:00
Xuehai Pan
55064a4ef9 [BE] add parentheses to kwargs unpacking func(*args, **(kwargs or {})) (#115026)
This PR adds parentheses to kwargs unpacking `func(*args, **(kwargs or {}))` for better code readability.

The forms with and without the parentheses are semantically equivalent because they produce the same bytecode.

```console
$ echo "func(*args, **kwargs or {})" | python3 -m dis -
  0           0 RESUME                   0

  1           2 PUSH_NULL
              4 LOAD_NAME                0 (func)
              6 LOAD_NAME                1 (args)
              8 BUILD_MAP                0
             10 LOAD_NAME                2 (kwargs)
             12 JUMP_IF_TRUE_OR_POP      1 (to 16)
             14 BUILD_MAP                0
        >>   16 DICT_MERGE               1
             18 CALL_FUNCTION_EX         1
             20 POP_TOP
             22 LOAD_CONST               0 (None)
             24 RETURN_VALUE

$ echo "func(*args, **(kwargs or {}))" | python3 -m dis -
  0           0 RESUME                   0

  1           2 PUSH_NULL
              4 LOAD_NAME                0 (func)
              6 LOAD_NAME                1 (args)
              8 BUILD_MAP                0
             10 LOAD_NAME                2 (kwargs)
             12 JUMP_IF_TRUE_OR_POP      1 (to 16)
             14 BUILD_MAP                0
        >>   16 DICT_MERGE               1
             18 CALL_FUNCTION_EX         1
             20 POP_TOP
             22 LOAD_CONST               0 (None)
             24 RETURN_VALUE
```
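A quick runnable sketch of why the guard matters at all (hypothetical `func`/`kwargs` names): without `or {}`, unpacking `None` raises a `TypeError`.

```python
def func(*args, **kwargs):
    return args, kwargs

args, kwargs = (1, 2), None      # kwargs may legitimately be None
func(*args, **(kwargs or {}))    # OK: falls back to an empty dict
# func(*args, **kwargs)          # TypeError: argument after ** must be a mapping
```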
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115026
Approved by: https://github.com/Skylion007
2023-12-03 20:03:26 +00:00
Rohan Varma
3c78ea4c9d [DDP][Compile] Test to Ensure torch.compile works w/static_graph=True (#114621)
Resolves https://github.com/pytorch/pytorch/issues/93672. This was
actually fixed by https://github.com/pytorch/pytorch/pull/103487, but I didn't
realize at the time that that PR also fixed torch.compile.

Differential Revision: [D51596148](https://our.internmc.facebook.com/intern/diff/D51596148/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114621
Approved by: https://github.com/wconstab
2023-12-01 22:18:45 +00:00
Philip Meier
373f2060ba fix extending torch native API docs (#114863)
Couldn't think of a better `release notes:` label. Feel free to set a more fitting one.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114863
Approved by: https://github.com/mikaylagawarecki
2023-12-01 06:09:35 +00:00
Edward Z. Yang
09df6b771b Add a note about performant record_stream use. (#112526)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112526
Approved by: https://github.com/albanD
2023-11-02 15:50:22 +00:00
Kurt Mohler
fd209543d5 Add torch.utils.deterministic.fill_uninitialized_memory flag (#111377)
Part of #109802
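
For reference, a small usage sketch of the new flag (names per the deterministic-algorithms docs):

```python
import torch

# With deterministic algorithms enabled, tensors from torch.empty and
# friends are filled with a known value so that reads of uninitialized
# memory behave deterministically; the new flag opts out of that fill.
torch.use_deterministic_algorithms(True)
torch.utils.deterministic.fill_uninitialized_memory = False  # skip the fill
t = torch.empty(3)  # contents are uninitialized again
```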

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111377
Approved by: https://github.com/albanD, https://github.com/aaronenyeshi
2023-11-01 16:10:09 +00:00
PyTorch MergeBot
ace2713d1e Revert "Add torch.utils.deterministic.fill_uninitialized_memory flag (#111377)"
This reverts commit f1785373c0.

Reverted https://github.com/pytorch/pytorch/pull/111377 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/111377#issuecomment-1784179040))
2023-10-29 17:41:55 +00:00
Kurt Mohler
f1785373c0 Add torch.utils.deterministic.fill_uninitialized_memory flag (#111377)
Part of #109802

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111377
Approved by: https://github.com/albanD
2023-10-26 02:39:06 +00:00
Nikita Shulga
d22e5e4b52 Fix DDP notes (#111833)
Includes `import os`, without which the sample is not syntactically correct. Reported in https://github.com/pytorch/pytorch.github.io/pull/1490
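
For context, the setup sample in the DDP notes reads the rendezvous address from environment variables, so it cannot run without `os` — roughly:

```python
import os
import torch.distributed as dist

def setup(rank, world_size):
    # The rendezvous address/port come from the environment,
    # which is why the sample needs `import os`.
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "29500"
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
```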

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111833
Approved by: https://github.com/wanchaol
2023-10-23 22:05:36 +00:00
eqy
894b9957c8 [DOCS][CUDA] Update TF32 docs for sm90 (#111337)
For #110252.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111337
Approved by: https://github.com/msaroufim
2023-10-19 09:36:13 +00:00
albanD
a0bbd075b2 Add the Mode section in the extending doc (#110073)
Covers the basic principles of Modes, an example of how to use them, and their behavior.
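
As a flavor of what the section covers, a minimal dispatch-Mode sketch (using the `torch.utils._python_dispatch.TorchDispatchMode` helper):

```python
import torch
from torch.utils._python_dispatch import TorchDispatchMode

class LoggingMode(TorchDispatchMode):
    # Every ATen op executed under this mode is routed through here.
    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        print(f"called: {func}")
        return func(*args, **(kwargs or {}))

with LoggingMode():
    torch.ones(2) + 1  # logs the underlying aten ops
```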
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110073
Approved by: https://github.com/janeyx99
2023-10-06 23:50:55 +00:00
Banit Agrawal
64583c4d04 [CUDA Host Allocator] Add support of CudaHostRegister (#108488)
Summary: This diff adds another option for creating CUDA pinned memory, using cudaHostRegister.

Differential Revision: D45843715

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108488
Approved by: https://github.com/zdevito
2023-10-06 04:13:02 +00:00
Kazuaki Ishizaki
aa3629ee3e Fix typo under docs directory (#110359)
This PR fixes typos in `.rst` files under the docs directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110359
Approved by: https://github.com/kit1980
2023-10-03 16:36:05 +00:00
FFFrog
d4990ad5a1 Fix the example in the extending.func.rst (#109279)
As the title says, the `backward` function is missing the definitions of `ind` and `ind_inv`, which leads to an error when calling backward.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109279
Approved by: https://github.com/zou3519
2023-09-14 17:29:39 +00:00
Zachary DeVito
40cbda274b document memory snapshotting (#107660)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107660
Approved by: https://github.com/albanD
ghstack dependencies: #107171, #107399
2023-08-24 19:20:03 +00:00
Jane Xu
515aa993e3 Document post acc grad hooks in backward hooks execution (#107323)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107323
Approved by: https://github.com/soulitzer, https://github.com/albanD
2023-08-22 18:37:03 +00:00
David Radley
dbc2216800 Add autograd modes table to docs (#104774)
Fixes #104461

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104774
Approved by: https://github.com/soulitzer
2023-07-08 03:14:10 +00:00
Aleksei Nikiforov
c42fd73cf9 Add functions to get and set default endianness in load() functions (#101973)
By default, tensor data is interpreted as native-endian, but this adds an option to interpret the data as little-endian or big-endian.
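
A sketch of the resulting knobs — treat the exact names as an assumption based on this PR series:

```python
import torch
from torch.serialization import LoadEndianness  # name assumed from this PR

# Default behavior interprets saved tensor data as native-endian;
# override to force a little-endian interpretation on load:
torch.serialization.set_default_load_endianness(LoadEndianness.LITTLE)
obj = torch.load("checkpoint.pt")  # hypothetical file
```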

Related to #101688

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101973
Approved by: https://github.com/mikaylagawarecki
2023-07-06 20:12:56 +00:00
Mikayla Gawarecki
981f24e806 Add docstring to torch.serialization.register_package (#104046)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104046
Approved by: https://github.com/albanD
2023-06-26 23:28:32 +00:00
ZhaoqiongZ
7cef7195f6 [draft] Update Multiprocessing best practices with CPU device (#103229)
Fixes [#102498](https://github.com/pytorch/pytorch/issues/102498)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103229
Approved by: https://github.com/mingfeima, https://github.com/svekars, https://github.com/jgong5
2023-06-25 06:26:40 +00:00
albanD
4143b6b89b Add torch_dispatch and modes to extending.rst note (#102087)
The following subjects are not in this PR and will be done in a follow-up:
- Go through torch_function section and update to the latest phrasing and link to the proper new sections
- Go through torch.library and custom device docs to add links to the new sections as appropriate
- Top level explanations on which component should be used
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102087
Approved by: https://github.com/janeyx99
2023-06-22 12:56:35 +00:00
Rickey K. Liang
807d81155f [CUDA][CUBLAS] Fix BF16 reduced precision reduction note in Numerical accuracy docs (#101884)
Fixes #100966

Ref #101044

Aligns the implementation and documentation. (This is what was previously missed in the above issue and PR.)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101884
Approved by: https://github.com/eqy, https://github.com/ezyang
2023-05-21 17:38:00 +00:00
Ran Ding
b5c8d0359c Update autograd.rst (#101007)
Typo fix and a small change to improve clarity.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101007
Approved by: https://github.com/lezcano, https://github.com/anjali411
2023-05-12 11:47:51 +00:00
eqy
33f3dca6b5 [CUDA][CUBLAS] Fix BF16 reduced precision reduction note in docs (#101044)
#100966

CC @ngimel @ezyang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101044
Approved by: https://github.com/ngimel
2023-05-10 06:50:58 +00:00
eqy
6e2efd16d8 [CUDA][CUBLAS] Add cuBLAS workspace allocation behavior to docs (#100919)
Adding to the docs for now; hopefully we can move to `cudaMallocAsync`-backed cuBLAS workspaces soon, which should alleviate the recent confusion around cuBLAS "leaking" memory through workspaces.
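
A related, pre-existing knob (not added by this PR): the workspace size cuBLAS uses can be configured through an environment variable, e.g.:

```python
import os

# Per NVIDIA's docs the format is ":<size_kib>:<count>"; ":4096:8" allows
# up to 8 workspace buffers of 4096 KiB each. Must be set before the
# first cuBLAS call in the process.
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
```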

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100919
Approved by: https://github.com/ngimel
2023-05-10 06:40:26 +00:00
Richard Barnes
9c185b6b46 [codemod] Replace hasattr with getattr in caffe2/docs/source/notes/extending.rst (#100598)
Summary:
The pattern
```
X.Y if hasattr(X, "Y") else Z
```
can be replaced with
```
getattr(X, "Y", Z)
```

The [getattr](https://www.w3schools.com/python/ref_func_getattr.asp) function gives more succinct code than the [hasattr](https://www.w3schools.com/python/ref_func_hasattr.asp) function. Please use it when appropriate.
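
A tiny concrete instance of the rewrite:

```python
class Point:
    x = 1.0

p = Point()
# Before the codemod: two attribute lookups
y_old = p.y if hasattr(p, "y") else 0.0
# After: one lookup with a default
y_new = getattr(p, "y", 0.0)
assert y_old == y_new == 0.0
```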

**This diff is very low risk. Green tests indicate that you can safely Accept & Ship.**

Test Plan: Sandcastle

Differential Revision: D44886464

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100598
Approved by: https://github.com/Skylion007
2023-05-04 16:36:15 +00:00
Svetlana Karslioglu
d425da8bf3 Replace master with main in links and docs/conf.py (#100176)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100176
Approved by: https://github.com/albanD, https://github.com/malfet
2023-05-02 18:20:32 +00:00
Richard Zou
6b9e22f3f6 Clarify the saving of intermediates in the "extending torch.func" docs (#98020)
Fixes https://github.com/pytorch/pytorch/issues/97260

We got some feedback that the page reads like "in order to save an input
for backward, you must return it as an output of the
autograd.Function.forward".

Doing so actually raises an error (on master and as of 2.1), but results
in an ambiguous situation on 2.0.0. To avoid more users running into
this, we clarify the documentation so it doesn't read like the above
and clearly mention that you can save things from the inputs or the
outputs.
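
A minimal sketch of what the clarified docs allow (hypothetical `Square` function): an input can be saved directly in `setup_context`; it does not have to be returned from `forward`.

```python
import torch

class Square(torch.autograd.Function):
    @staticmethod
    def forward(x):
        return x ** 2

    @staticmethod
    def setup_context(ctx, inputs, output):
        x, = inputs
        ctx.save_for_backward(x)  # saving an input directly is fine

    @staticmethod
    def backward(ctx, grad_out):
        x, = ctx.saved_tensors
        return 2 * x * grad_out

x = torch.randn(3, requires_grad=True)
Square.apply(x).sum().backward()
```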
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98020
Approved by: https://github.com/soulitzer, https://github.com/kshitij12345
2023-03-31 13:57:37 +00:00
Kazuaki Ishizaki
50ed38a7eb Fix typo under docs directory (#97202)
This PR fixes typo in `.rst` files under docs directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97202
Approved by: https://github.com/kit1980
2023-03-21 01:24:10 +00:00
Xuehai Pan
8d45f555d7 [BE] [1/3] Rewrite super() calls in caffe2 and benchmarks (#94587)
Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied.

- #94587
- #94588
- #94592

Also, methods with only a `super()` call are removed:

```diff
class MyModule(nn.Module):
-   def __init__(self):
-       super().__init__()
-
    def forward(self, ...):
        ...
```

Some cases where the rewrite would change the semantics are kept unchanged. E.g.:

f152a79be9/caffe2/python/net_printer.py (L184-L190)

f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94587
Approved by: https://github.com/ezyang
2023-02-11 18:19:48 +00:00
double7
685108b201 [docs] Fix incorrect wrapping of function (#94446)
The sample code in the document wraps the function incorrectly in its decorator. To fix this, update the attributes of `func` based on `torch_function`.

Fixes #94305
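
A sketch of the corrected pattern from the extending docs' `implements` decorator (`HANDLED_FUNCTIONS` is the registry used by the docs' `__torch_function__` example):

```python
import functools

HANDLED_FUNCTIONS = {}

def implements(torch_function):
    """Register a __torch_function__ override."""
    def decorator(func):
        # The fix: copy metadata from the torch function onto the
        # override, rather than wrapping in the wrong direction.
        functools.update_wrapper(func, torch_function)
        HANDLED_FUNCTIONS[torch_function] = func
        return func
    return decorator
```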

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94446
Approved by: https://github.com/ezyang
2023-02-09 16:01:10 +00:00
soulitzer
77cbaedd5c [docs] Add section about tensor hooks on in-place in autograd note (#93116)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93116
Approved by: https://github.com/albanD
2023-02-01 17:35:21 +00:00
Felix Divo
219e9533f0 Improve autograd doc on complex numbers (#93065)
A tiny change to fix formatting and clarify a bit in [this section](https://pytorch.org/docs/stable/notes/autograd.html#what-are-complex-derivatives).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93065
Approved by: https://github.com/albanD
2023-01-27 09:36:38 +00:00
Richard Zou
98b78aa11c [autograd.Function] setup_context always appears on the Function (#92312)
Previously, we used the existence of setup_context to decide whether
forward should take a ctx object.

To be consistent with all the other staticmethods (which always exist on
the autograd.Function), this PR changes it so that whether the user
overrides setup_context determines whether forward takes a ctx object.

Fixes https://github.com/pytorch/pytorch/issues/91451
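
The two supported shapes, as a sketch:

```python
import torch

class Old(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):          # setup_context not overridden: forward takes ctx
        ctx.save_for_backward(x)
        return x * 2

    @staticmethod
    def backward(ctx, g):
        return 2 * g

class New(torch.autograd.Function):
    @staticmethod
    def forward(x):               # setup_context overridden: forward is ctx-free
        return x * 2

    @staticmethod
    def setup_context(ctx, inputs, output):
        pass

    @staticmethod
    def backward(ctx, g):
        return 2 * g
```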

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92312
Approved by: https://github.com/albanD, https://github.com/soulitzer
2023-01-18 02:55:42 +00:00
soulitzer
88366a9075 Document hooks ordering behavior in the autograd note (#91667)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91667
Approved by: https://github.com/albanD
2023-01-18 00:20:13 +00:00
Richard Zou
2f9166ef89 [autograd.Function] Cleanup asymmetry in generate_vmap_rule and vmap (#91787)
This PR:
- changes generate_vmap_rule to either be True or False. Previously it
  could be True, False, or not set. This simplifies the implementation a
  bit.
- changes the vmap staticmethod to always be on the autograd.Function
  rather than sometimes defined.
  This is how the other staticmethods (forward, backward, jvp) are
  implemented and allows us to document it.

There are four possible states for the autograd.Function w.r.t. the
above (see the sketch after this list):
- generate_vmap_rule is True, vmap staticmethod overridden. This raises
  an error when used with vmap.
- generate_vmap_rule is False, vmap staticmethod overridden. This is
  valid.
- generate_vmap_rule is True, vmap staticmethod not overridden. This is
  valid.
- generate_vmap_rule is False, vmap staticmethod not overridden. This
  raises an error when used with vmap.
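
A sketch of the two valid configurations (identity function for brevity; the vmap staticmethod's `(info, in_dims, *args)` signature follows the torch.func docs):

```python
import torch

class AutoVmap(torch.autograd.Function):
    generate_vmap_rule = True     # valid: PyTorch derives the vmap rule

    @staticmethod
    def forward(x):
        return x.clone()

    @staticmethod
    def setup_context(ctx, inputs, output):
        pass

    @staticmethod
    def backward(ctx, g):
        return g

class ManualVmap(torch.autograd.Function):
    generate_vmap_rule = False    # valid: supply a vmap staticmethod instead

    @staticmethod
    def forward(x):
        return x.clone()

    @staticmethod
    def setup_context(ctx, inputs, output):
        pass

    @staticmethod
    def backward(ctx, g):
        return g

    @staticmethod
    def vmap(info, in_dims, x):
        return x.clone(), in_dims[0]  # (output, out_dims)
```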

Future:
- setup_context needs the same treatment, but that's a bit trickier to
  implement.

Test Plan:
- new unittest
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91787
Approved by: https://github.com/soulitzer
2023-01-17 13:36:34 +00:00
Emilio Castillo
07e595e88a Add device_idx to free_fn in CUDAPluggableAllocator (#91398)
This was requested by NVIDIA folks: also track the device_id in the free function.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91398
Approved by: https://github.com/albanD
2023-01-12 05:03:48 +00:00
Kazuaki Ishizaki
4f91b8e0ee Fix typo under docs directory (#91871)
This PR fixes typos in `.rst` files under the `docs` directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91871
Approved by: https://github.com/ngimel
2023-01-10 22:33:36 +00:00
Will Constable
630ef6c711 Fix Dynamo+DDP documentation (#91832)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91832
Approved by: https://github.com/soumith, https://github.com/davidberard98
2023-01-09 17:35:49 +00:00
Richard Zou
264f5ed516 [autograd.Function] Add docs on the functorch interaction (#91452)
This PR:
- Updates autograd.Function.forward docs to reflect how you either
  define a forward with ctx or a separate forward and setup_context
- Updates the "Extending Autograd" docs to suggest the usage of
  autograd.Function with separate forward and setup_context. This should
  be the default because there is a low barrier to go from this to
  an autograd.Function that is fully supported by functorch transforms.
- Adds a new "Extending torch.func with autograd.Function" doc that
  explains how to use autograd.Function with torch.func. It also
  explains how to use generate_vmap_rule and how to manually write a
  vmap staticmethod.

While writing this, I noticed that the implementations of the
setup_context staticmethod, generate_vmap_rule, and the vmap staticmethod
are a bit inconsistent with the other methods/attributes on autograd.Function:
- https://github.com/pytorch/pytorch/issues/91451
- I'm happy to fix those if we think it is a problem, either in this PR
  or a follow-up (this PR is getting long; I want some initial docs
  out that I can point early adopters at, and fixing the problems in the
  future isn't really BC-breaking).

Test Plan:
- view docs preview
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91452
Approved by: https://github.com/soulitzer
2023-01-04 00:28:19 +00:00