See #113541
The PR allows registering and controlling multiple RNG states via indices, ensures cudagraph-safe operation, and includes both C++ and Python API changes to support this functionality.
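For intuition only, a hypothetical Python sketch of the index-keyed registry idea (these helper names are made up and are NOT the PR's actual API, which lives in the C++ generator and the CUDA graphs integration; only the `torch.cuda` RNG-state calls are real):

```python
# Hypothetical illustration of an index-keyed RNG-state registry; the
# registry and helpers below are invented for explanation only.
import torch

_states = {}  # hypothetical registry: index -> saved RNG state

def register_rng_state(index, device="cuda"):
    # Snapshot the current CUDA RNG state under a user-chosen index.
    _states[index] = torch.cuda.get_rng_state(device)

def activate_rng_state(index, device="cuda"):
    # Restore a registered state so subsequent kernels draw from that
    # stream of randomness, independently of the other registered states.
    torch.cuda.set_rng_state(_states[index], device)
```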
cc @eellison @anijain2305 @jansel @ezyang @ptrblck @csarofeen @mcarilli
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114068
Approved by: https://github.com/ezyang
As discussed on [slack](https://pytorch.slack.com/archives/C3PDTEV8E/p1703699711772289), this adds Andrew Gu's advanced FSDP design notes with a few additions from myself based on our discussion.
I hope I got the RST right; I haven't written RST in a while.
- The first section is Andrew's words verbatim, plus formatting.
- The second section is Andrew's words verbatim, plus formatting and a few additions of mine that Andrew confirmed, which should hopefully make the process easier to understand.
Tagging @albanD as requested.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117323
Approved by: https://github.com/albanD, https://github.com/awgu
The following subjects are not in this PR and will be done in a follow-up:
- Go through the torch_function section, update it to the latest phrasing, and link to the proper new sections
- Go through the torch.library and custom-device docs to add links to the new sections as appropriate
- Top-level explanations of which component should be used
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102087
Approved by: https://github.com/janeyx99
Adding this to the docs for now; hopefully we can move to `cudaMallocAsync`-backed cuBLAS workspaces soon, which should alleviate the recent confusion around `cuBLAS` "leaking" memory through workspaces.
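A hedged sketch of how the cached workspace shows up (assumes a CUDA build; `CUBLAS_WORKSPACE_CONFIG` is the standard cuBLAS env var, and the `:4096:8` value is one of the settings PyTorch's determinism docs mention):

```python
# cuBLAS workspaces are allocated lazily on the first GEMM and then cached
# per handle/stream, so memory can look "leaked" afterwards.
import os
# Optionally cap the workspace size; must be set before cuBLAS initializes
# (format ":SIZE_KiB:COUNT"; the numbers here are illustrative).
os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")

import torch

a = torch.randn(1024, 1024, device="cuda")
b = torch.randn(1024, 1024, device="cuda")

before = torch.cuda.memory_allocated()
c = torch.mm(a, b)  # first GEMM also allocates a workspace that stays cached
torch.cuda.synchronize()
after = torch.cuda.memory_allocated()
# The delta covers the output c plus (typically) the cached workspace, since
# the workspace is held through the caching allocator.
print(f"bytes newly held: {after - before}")
```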
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100919
Approved by: https://github.com/ngimel
Fixes https://github.com/pytorch/pytorch/issues/97260
We got some feedback that the page reads like "in order to save an input
for backward, you must return it as an output of the
autograd.Function.forward".
Doing so actually raises an error (on master and as of 2.1), but results
in an ambiguous situation on 2.0.0. To avoid more users running into
this, we clarify the documentation so it doesn't read like the above
and clearly mentions that you can save things from the inputs or
outputs.
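A minimal sketch of what the clarified docs allow (standard autograd.Function API; the op is just an example): you can save an input with `save_for_backward` without returning it from forward.

```python
import torch

class Square(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)  # saving an *input* is fine; there is no
        return x * x              # need to return x from forward to save it

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return 2 * x * grad_out

x = torch.randn(3, requires_grad=True)
Square.apply(x).sum().backward()  # x.grad == 2 * x
```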
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98020
Approved by: https://github.com/soulitzer, https://github.com/kshitij12345
This PR:
- changes generate_vmap_rule to be either True or False. Previously it
could be True, False, or not set. This simplifies the implementation a
bit.
- changes the vmap staticmethod to always be present on the
autograd.Function rather than only sometimes defined. This is how the
other staticmethods (forward, backward, jvp) are implemented, and it
allows us to document it.
There are 4 possible states for the autograd.Function w.r.t. the above
(see the sketch after this list):
- generate_vmap_rule is True, vmap staticmethod overridden. This raises
an error when used with vmap.
- generate_vmap_rule is False, vmap staticmethod overridden. This is
valid.
- generate_vmap_rule is True, vmap staticmethod not overridden. This is
valid.
- generate_vmap_rule is False, vmap staticmethod not overridden. This
raises an error when used with vmap.
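A minimal sketch of the two valid states, using the new-style Function with a separate setup_context (API as in current releases; the sin op is just an example):

```python
import torch

class SinAuto(torch.autograd.Function):
    generate_vmap_rule = True  # valid: do NOT also override vmap

    @staticmethod
    def forward(x):
        return torch.sin(x)

    @staticmethod
    def setup_context(ctx, inputs, output):
        ctx.save_for_backward(inputs[0])

    @staticmethod
    def backward(ctx, grad):
        (x,) = ctx.saved_tensors
        return grad * torch.cos(x)

class SinManual(torch.autograd.Function):
    # generate_vmap_rule defaults to False; valid: override vmap ourselves.
    @staticmethod
    def forward(x):
        return torch.sin(x)

    @staticmethod
    def setup_context(ctx, inputs, output):
        ctx.save_for_backward(inputs[0])

    @staticmethod
    def backward(ctx, grad):
        (x,) = ctx.saved_tensors
        return grad * torch.cos(x)

    @staticmethod
    def vmap(info, in_dims, x):
        # sin is elementwise, so the batched dim maps straight through
        return torch.sin(x), in_dims[0]

x = torch.randn(5, 3)
torch.vmap(SinAuto.apply)(x)    # uses the generated vmap rule
torch.vmap(SinManual.apply)(x)  # uses the hand-written vmap staticmethod
```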
Future:
- setup_context needs the same treatment, but that's a bit tricker to
implement.
Test Plan:
- new unittest
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91787
Approved by: https://github.com/soulitzer
This PR:
- Updates the autograd.Function.forward docs to reflect that you either
define a forward that takes ctx or a separate forward and setup_context
(both styles are sketched below).
- Updates the "Extending Autograd" docs to suggest using
autograd.Function with a separate forward and setup_context. This should
be the default because there is a low barrier to go from it to an
autograd.Function that is fully supported by functorch transforms.
- Adds a new "Extending torch.func with autograd.Function" doc that
explains how to use autograd.Function with torch.func. It also explains
how to use generate_vmap_rule and how to manually write a vmap
staticmethod.
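A hedged sketch of the two forward styles the updated docs describe (standard autograd.Function API; the multiply op and class names are just for illustration):

```python
import torch

class MulOld(torch.autograd.Function):
    # Traditional style: forward receives ctx directly.
    @staticmethod
    def forward(ctx, x, y):
        ctx.save_for_backward(x, y)
        return x * y

    @staticmethod
    def backward(ctx, grad):
        x, y = ctx.saved_tensors
        return grad * y, grad * x

class MulNew(torch.autograd.Function):
    # Recommended style: forward is ctx-free; setup_context does the saving,
    # which is what makes functorch transforms straightforward to support.
    @staticmethod
    def forward(x, y):
        return x * y

    @staticmethod
    def setup_context(ctx, inputs, output):
        x, y = inputs
        ctx.save_for_backward(x, y)

    @staticmethod
    def backward(ctx, grad):
        x, y = ctx.saved_tensors
        return grad * y, grad * x
```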
While writing this, I noticed that the implementations of the
setup_context staticmethod, generate_vmap_rule, and the vmap
staticmethod are a bit inconsistent with the other methods/attributes on
autograd.Function:
- https://github.com/pytorch/pytorch/issues/91451
- I'm happy to fix those if we think it is a problem, either in this PR
or a follow-up (this PR is getting long, I want some initial docs out
that I can point early adopters at, and fixing the problems in the
future isn't really BC-breaking).
Test Plan:
- view docs preview
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91452
Approved by: https://github.com/soulitzer