`skip_if_rocm` is used only in the multiprocess case (when the UT test class is a child of `MultiProcessTestCase`), where each individual process can exit with a skip code. If it is used in a single-process UT, it will cause the UT to fail because the process returns a non-zero exit code. Use `skipIfRocm` in single-process UTs.
To avoid this confusion, this PR renames `skip_if_rocm` to `skip_if_rocm_multiprocess`.
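A minimal sketch of the intended usage after the rename (import paths are assumed to be in `torch.testing._internal.common_distributed`; check the actual location):

```python
# Hedged sketch: import paths assumed from torch.testing internals
from torch.testing._internal.common_distributed import (
    MultiProcessTestCase,
    skip_if_rocm_multiprocess,
)

class MyDistributedTest(MultiProcessTestCase):
    @skip_if_rocm_multiprocess  # each spawned process exits with a skip code
    def test_collective(self):
        ...
```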
Fixes #ISSUE_NUMBER
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136161
Approved by: https://github.com/jithunnair-amd, https://github.com/kwen2501, https://github.com/fegin
Add semantics for creating a buffer object analogous to creating a parameter. This is done by introducing a new `Buffer` class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same, as the `register_buffer` method has not been changed. The `persistent` parameter in the `Buffer` type indicates whether the buffer should be persistent or not. The other non-test changes get the new `Buffer` type recognized by inductor and dynamo. The remaining changes are test changes to make sure that the `Buffer` type can be used as a drop-in replacement for `register_buffer`, as it just leads to `register_buffer` being called. This new functionality still allows normal tensors to be used as buffers, so these changes are intended to be backwards compatible.
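A minimal sketch of the new usage, mirroring how `nn.Parameter` is assigned:

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        # Assigning a Buffer behaves like calling register_buffer
        self.running_mean = nn.Buffer(torch.zeros(4))
        # persistent=False keeps the buffer out of the state_dict
        self.scratch = nn.Buffer(torch.zeros(4), persistent=False)

m = Model()
assert "running_mean" in m.state_dict()
assert "scratch" not in m.state_dict()
```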
Fixes #35735
Co-authored-by: Mikayla Gawarecki <mikaylagawarecki@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125971
Approved by: https://github.com/albanD, https://github.com/anijain2305, https://github.com/mlazos
Automatic fixes that replace certain list comprehensions with generator expressions where appropriate, so that they are consumed immediately. This is preview functionality in ruff for rule C419 and was applied automatically.
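For illustration, the kind of rewrite C419 performs:

```python
xs = range(10)

# Before: the list is fully built before any() sees it
any([x > 5 for x in xs])

# After: the generator is consumed lazily and can short-circuit
any(x > 5 for x in xs)
```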
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123960
Approved by: https://github.com/malfet
Add semantics for creating a buffer object analogous to creating a parameter. This is done by introducing a new `Buffer` class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same, as the `register_buffer` method has not been changed. The `persistent` parameter in the `Buffer` type indicates whether the buffer should be persistent or not. The other non-test changes get the new `Buffer` type recognized by inductor and dynamo. The remaining changes are test changes to make sure that the `Buffer` type can be used as a drop-in replacement for `register_buffer`, as it just leads to `register_buffer` being called. This new functionality still allows normal tensors to be used as buffers, so these changes are intended to be backwards compatible.
Fixes #35735
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104069
Approved by: https://github.com/mikaylagawarecki
Attempts to fix #92656
BC-breaking! This changes the default of `zero_grad()` in `optim` and in `nn` to set grads to `None` instead of zero tensors. We are changing the default because there are proven perf wins and existing code has typically not regressed due to this change. (This note will probably need to be fleshed out more.)
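A minimal sketch of the new default behavior:

```python
import torch

model = torch.nn.Linear(2, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
model(torch.randn(1, 2)).sum().backward()

opt.zero_grad()                      # new default: grads are set to None
assert model.weight.grad is None
# The old behavior is still available explicitly:
# opt.zero_grad(set_to_none=False)   # grads become zero tensors
```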
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92731
Approved by: https://github.com/ngimel
1. Add param_group check logic and a unit test.
2. Remove the unnecessary check for conditional param updates.
3. Return the param_group from the inner optimizer, so that when param_group is None or not all params are specified, we still return the expected result.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91147
Approved by: https://github.com/fegin
In PyTorch, the optimizer `state_dict` always uses numeric indices to refer to parameters.
The composability workstream now needs an FQN-based way to index the optimizer `state_dict` for parameters.
For example, the SGD optimizer might have something like this in its `state_dict`:
```
{
    'state': {
        0: {'momentum_buffer': tensor(...)},
        1: {'momentum_buffer': tensor(...)},
        ...
    },
    'param_groups': [
        {'lr': 0.001, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0,
         'nesterov': False, 'maximize': False, 'foreach': None,
         'differentiable': False, 'params': [0, 1, 2, 3, 4, 5, 6, 7]}
    ]
}
```
And in `NamedOptimizer` we want the `state_dict` to be:
```
{
    'state': {
        'net1.0.weight': {'momentum_buffer': tensor(...)},
        'net1.0.bias': {'momentum_buffer': tensor(...)},
        ...
    },
    'param_groups': [
        {'lr': 0.001, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0,
         'nesterov': False, 'maximize': False, 'foreach': None,
         'differentiable': False,
         'params': ['net1.0.weight', 'net1.0.bias', 'net2.0.weight', 'net2.0.bias',
                    'net3.weight', 'net3.bias', 'net4.1.weight', 'net4.1.bias']}
    ]
}
```
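For illustration only (this is not the actual `NamedOptimizer` implementation), the numeric indices can be mapped to FQNs when the optimizer's parameter order matches `named_parameters()`:

```python
import torch

model = torch.nn.Sequential(torch.nn.Linear(4, 4))
opt = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
model(torch.randn(2, 4)).sum().backward()
opt.step()  # populates the momentum_buffer state

# Assumes optimizer param order matches named_parameters() order
fqns = [name for name, _ in model.named_parameters()]
sd = opt.state_dict()
named_state = {fqns[idx]: s for idx, s in sd["state"].items()}
# {'0.weight': {'momentum_buffer': ...}, '0.bias': {'momentum_buffer': ...}}
```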
We also want to support `load_state_dict` to enable optim `state_dict` overrides for `NamedOptimizer`.
For the next couple of PRs/diffs, we also need to:
1. Make `NamedOptimizer` work with FSDP (e.g. registering a hook for a model wrapped with FSDP) and other PTD/PT components.
2. Make `NamedOptimizer` work well with `apply_optim_in_backward`.
3. Also upstream `CombinedOptimizer`.
Differential Revision: [D41432088](https://our.internmc.facebook.com/intern/diff/D41432088/)
**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D41432088/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89480
Approved by: https://github.com/rohan-varma
Summary:
Upstreaming this as part of sharing common APIs. This is just a plain
move; any changes needed to support DDP / FSDP will come in follow-up diffs.
Test Plan: CI
Reviewed By: zhaojuanmao
Differential Revision: D40564646
fbshipit-source-id: 619c434e02196812f8d4db1e40d07290e08b18f9
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88539
Approved by: https://github.com/awgu
Near-term fix for https://github.com/pytorch/pytorch/issues/76368.
Q. Why does the user need to request `capturable=True` in the optimizer constructor? Why can't capture safety be completely automatic?
A. We need to set up capture-safe (device-side) state variables before capture. If we don't, and `step()` internally detects that capture is underway, it's too late: the best we could do is create a device state variable and copy the current CPU value into it, which is not something we want baked into the graph.
Q. Ok, why not just do the capture-safe approach with device-side state variables all the time?
A. It incurs several more kernel launches per parameter, which could really add up and regress CPU overhead for ungraphed `step()`s. If the optimizer won't be captured, we should allow `step()` to stick with its current CPU-side state handling.
Q. But CUDA RNG is a stateful thing that maintains its state on the CPU outside of capture and replay, and we capture it automatically. Why can't we do the same thing here?
A. The graph object can handle RNG generator increments because its `capture_begin`, `capture_end`, and `replay()` methods can see and access the generator object. But the graph object has no explicit knowledge of, or access to, the optimizer steps in its capture scope. We could let the user tell the graph object which optimizers will be stepped in its scope, i.e. something like
```python
graph.will_use_optimizer(opt)
graph.capture_begin()
...
```
but that seems clunkier than an optimizer constructor arg.
I'm open to other ideas, but right now I think a constructor arg is necessary and the least bad approach.
Long term, https://github.com/pytorch/pytorch/issues/71274 is a better fix.
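A minimal sketch of the constructor-arg approach with whole-iteration capture (assuming Adam supports the new `capturable` flag; this follows the warmup-then-capture pattern from the CUDA graphs docs):

```python
import torch

model = torch.nn.Linear(8, 8, device="cuda")
opt = torch.optim.Adam(model.parameters(), lr=1e-3, capturable=True)
static_x = torch.randn(4, 8, device="cuda")

# Warm up on a side stream so optimizer state exists before capture
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    for _ in range(3):
        opt.zero_grad(set_to_none=True)
        model(static_x).sum().backward()
        opt.step()
torch.cuda.current_stream().wait_stream(s)

# Capture forward, backward, and the capturable step() in one graph
g = torch.cuda.CUDAGraph()
opt.zero_grad(set_to_none=True)
with torch.cuda.graph(g):
    static_loss = model(static_x).sum()
    static_loss.backward()
    opt.step()

g.replay()  # re-runs the whole iteration, step() included
```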
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77862
Approved by: https://github.com/ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75753
As per the design in https://github.com/pytorch/pytorch/issues/72138,
convert DDP parameters to ReplicatedTensor during its forward pass. Concretely,
this is done as follows:
1) Create a separate `_replicated_tensor_module`, which is a copy of `self.module`
without creating copies of the Tensors themselves.
2) Use `_replicated_tensor_module` instead of `self.module` during the forward
pass.
3) Add a context manager `_ddp_replicated_tensor` to enable this, since
certain edge cases can fail where `self.module` is changed out of band, resulting
in a discrepancy between `self.module` and `_replicated_tensor_module`.
Differential Revision: [D35533736](https://our.internmc.facebook.com/intern/diff/D35533736/)
Approved by: https://github.com/wanchaol, https://github.com/rohan-varma
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73842
**Overview**
This cleans up the `ZeroRedundancyOptimizer` tests. I apologize for the heavy formatting changes mixed in with the actually beneficial changes. It was convenient to unify the formatting while doing a deep comb through the full test file.
The main non-formatting changes include:
- Using `parametrize` instead of manually writing `for` loops over possible argument values (see the sketch after this list)
- Removing the `DEVICE` global variable, which was used only for the `TestZeroRedundancyOptimizerSingleRank` tests, in favor of consistent usage of `self.device` in both `TestZeroRedundancyOptimizerSingleRank` and `TestZeroRedundancyOptimizerDistributed`
- Moving `assert ... == ...` to `self.assertEqual(..., ...)` when the assert is part of the test's correctness
- Removing the `if self.rank >= self.world_size or (torch.cuda.is_available() and torch.cuda.device_count() < 2):` conditional guards in favor of `common_distributed.skip_if_no_gpu` for `TestZeroRedundancyOptimizerDistributed`
- For `TestZeroRedundancyOptimizerDistributed`, `self.device` is `torch.device(self.rank)` if CUDA is available, while `self.world_size` is at least 2, even if `torch.cuda.device_count() == 1`.
- The problematic case is exactly when `torch.cuda.device_count() == 1` but `self.world_size == 2` since then calling `self.device` on rank 1 will error. The existing conditional guard prevented this case for some tests, but it was not used consistently (e.g. `test_multiple_groups()`), which is most likely the reason for the hangs and resulting test flakiness. (From my experience landing the recent ZeRO constructor changes, the Windows environment uses a world size of 2 but only has 1 device available.)
- A more robust solution is to always use the `skip_if_no_gpu` decorator as long as the test uses `self.device` and CUDA is available. This is in line with the recommended SPSD usage of ZeRO.
- Renaming `test_multiple_groups()` to `test_nondefault_process_group()`
- The existing `test_multiple_groups()` was slightly misnamed. Also, it is only nontrivial for a world size of (at least) 4 since it tests using a process group including only even ranks. It was marked as flaky on Windows, and I believe this is because of the world size and `torch.cuda.device_count()` mismatch. Now, the test only uses GPU if there are enough available and falls back to CPU otherwise, which is safe since the test uses Gloo backend.
- There was also a duplicated section, which I was unsure how to non-naively de-duplicate. The top half and bottom half are identical even though they claim to target fitting into the broadcast bucket and not fitting into the broadcast bucket:
1d497114e7/test/distributed/optim/test_zero_redundancy_optimizer.py (L658-L684)
- Changing `_test_zero_model_parallel()` to not use CPU
- This is my own fault, having introduced this inefficiency last summer. It makes more sense to simply designate one of the two GPUs for a process to be its default device rather than routing through CPU.
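A minimal sketch of the `parametrize`-based style (test names and argument values are illustrative):

```python
from torch.testing._internal.common_utils import (
    TestCase,
    instantiate_parametrized_tests,
    parametrize,
    run_tests,
)

@instantiate_parametrized_tests
class TestZeroExample(TestCase):
    # One generated test per (maximize, lr) combination,
    # replacing a hand-written double for-loop
    @parametrize("maximize", [False, True])
    @parametrize("lr", [1e-2, 1e-1])
    def test_step(self, maximize, lr):
        self.assertGreater(lr, 0)

if __name__ == "__main__":
    run_tests()
```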
**Questions**
- How might we limit the runs for `test_ddp_zero_overlap()`? Because it parameterizes over many values, it contributes significantly to the time-to-signal. However, it is an experimental feature, so it is not critical that the tests run every time.
Test Plan: Imported from OSS
Reviewed By: rohan-varma
Differential Revision: D34675709
Pulled By: awgu
fbshipit-source-id: 71ce9ac968fb34415cd65206855b4bb5e67754fb
(cherry picked from commit 34e3dd0a184318ea9f63a1ee20cd14b111af3501)
Summary:
Reland of https://github.com/pytorch/pytorch/pull/72578.
**Overview**
Windows CI was failing due to the multi-rank single-GPU case (see [here](https://github.com/pytorch/pytorch/runs/5204906995?check_suite_focus=true)).
To address this, I
- added `common_distributed.skip_if_no_gpu` for `test_multiple_param_groups()` to ensure that each rank can safely call `to(self.device)` -- this targets the expected SPSD use case where each rank has its own GPU;
- moved `test_constructor()` back to `TestZeroRedundancyOptimizerSingleRank` to check that the multiple parameter group method for construction works even on a single rank.
**Test Plan**
- I checked both tests for CPU, 1 GPU, 2 GPUs, 4 GPUs, and 8 GPUs.
- I added the `ciflow/win` label to run the failing Windows CI test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72932
Reviewed By: rohan-varma
Differential Revision: D34281482
Pulled By: awgu
fbshipit-source-id: c4fe604ddd9d2c123c3071249741e6b8a6454b6e
(cherry picked from commit 6bea9bcc63)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72578
**Overview**
This adds `ZeroRedundancyOptimizer` constructor support for multiple parameter groups (i.e. passing an `iterable` of `dict`s instead of an `iterable` of `torch.Tensor` as the `parameters` argument) to mirror the API for non-sharded optimizers.
Fixes https://github.com/pytorch/pytorch/issues/71347 and https://github.com/pytorch/pytorch/issues/59973.
This modifies `test_collect_shards()` to skip on ROCm.
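A minimal sketch of the new constructor usage (assuming a process group is already initialized):

```python
import torch
from torch.distributed.optim import ZeroRedundancyOptimizer

# Assumes the torch.distributed process group is already initialized
model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.Linear(4, 2))
# Pass an iterable of dicts, mirroring the non-sharded optimizer API
opt = ZeroRedundancyOptimizer(
    [
        {"params": model[0].parameters(), "lr": 1e-2},
        {"params": model[1].parameters(), "lr": 1e-3},
    ],
    optimizer_class=torch.optim.SGD,
)
```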
**Test Plan**
I adjusted the existing constructor test, and I added a test for parity between constructing with two parameter groups up front, versus constructing with one parameter group and adding the second afterward (via `add_param_group()`), versus a non-sharded optimizer.
Test Plan: Imported from OSS
Reviewed By: rohan-varma
Differential Revision: D34106940
Pulled By: awgu
fbshipit-source-id: 7e70fc0b3cec891646e0698eaedf02ff4354c128
(cherry picked from commit 40f2d45172)
Summary:
Solves the next most important use case in https://github.com/pytorch/pytorch/issues/68052.
I have kept the style as close to that in SGD as seemed reasonable, given the slight differences in their internal implementations.
All feedback welcome!
cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68164
Reviewed By: VitalyFedyunin
Differential Revision: D32994129
Pulled By: albanD
fbshipit-source-id: 65c57c3f3dbbd3e3e5338d51def54482503e8850
Summary:
Fixes https://github.com/pytorch/pytorch/issues/46480 -- for SGD.
## Notes:
- I have modified the existing tests to take a new `constructor_accepts_maximize` flag. When this is set to `True`, the `_test_basic_cases_template` function will test both maximizing and minimizing the sample function.
- This was the clearest way I could think of testing the changes -- I would appreciate feedback on this strategy.
## Work to be done:
- [ ] I need to update the docs.
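For reference, a minimal sketch of the new flag in action:

```python
import torch

p = torch.tensor([1.0, -2.0], requires_grad=True)
# maximize=True turns the update into gradient ascent
opt = torch.optim.SGD([p], lr=0.1, maximize=True)

p.sum().backward()  # grad is all ones
opt.step()          # each element increases by lr, ascending the objective
```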
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67847
Reviewed By: H-Huang
Differential Revision: D32252631
Pulled By: albanD
fbshipit-source-id: 27915a3cc2d18b7e4d17bfc2d666fe7d2cfdf9a4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65519
Adds buck target so we can run this internally.
ghstack-source-id: 139009957
Test Plan: CI
Reviewed By: SciPioneer
Differential Revision: D31072784
fbshipit-source-id: 7185cc1e6f9df3d79251eb017270471942a9d7dd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65385
Enables the ZeRO tests to run on Windows. Closes
https://github.com/pytorch/pytorch/issues/63086.
`Backend == NCCL` was used as a proxy for whether we were running under CUDA, but the Windows GPU tests use Gloo. In this case, use Gloo on GPU.
For some reason these tests don't seem to test Gloo on GPU with ZeRO in general (they pick the NCCL backend when a GPU is available), so that behavior is kept for now.
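Illustrative sketch of the resulting backend choice (not the exact test code):

```python
import torch.distributed as dist

# NCCL where it is available; otherwise fall back to Gloo,
# which the Windows GPU CI uses even with CUDA present.
backend = "nccl" if dist.is_nccl_available() else "gloo"
```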
ghstack-source-id: 139003920
Test Plan: CI
Reviewed By: mrshenli
Differential Revision: D31071181
fbshipit-source-id: 45a76309ac5e882f5aa6c4b130118a68800754bb