pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Oguz Ulgen	eafbd20f23	Annotate all InstructionTranslator (#131509 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/131509 Approved by: https://github.com/zou3519	2024-07-24 05:31:01 +00:00
Xuehai Pan	973037be6a	[BE][Easy] apply autofix for ruff rules unnecessary-collection-call (C408): `list()` / `tuple()` / `dict()` (#130199 ) This PR changes the empty collection factory call to Python literals: - `list()` -> `[]` - `tuple()` -> `()` - `dict()` -> `{}` The Python literals are more performant and safer. For example, the bytecode for building an empty dictionary: ```bash $ python3 -m dis - <<EOS import collections d1 = {} d2 = dict() dict = collections.OrderedDict d3 = dict() EOS ``` ```text 0 0 RESUME 0 1 2 LOAD_CONST 0 (0) 4 LOAD_CONST 1 (None) 6 IMPORT_NAME 0 (collections) 8 STORE_NAME 0 (collections) 3 10 BUILD_MAP 0 12 STORE_NAME 1 (d1) 4 14 PUSH_NULL 16 LOAD_NAME 2 (dict) 18 CALL 0 26 STORE_NAME 3 (d2) 6 28 LOAD_NAME 0 (collections) 30 LOAD_ATTR 8 (OrderedDict) 50 STORE_NAME 2 (dict) 7 52 PUSH_NULL 54 LOAD_NAME 2 (dict) 56 CALL 0 64 STORE_NAME 5 (d3) 66 RETURN_CONST 1 (None) ``` The dict literal `{}` only has one bytecode `BUILD_MAP`, while the factory call `dict()` has three `PUSH_NULL + LOAD_NAME + CALL`. Also, the factory call is not safe if users override the `dict` name in `locals` or `globals` (see the example of replacing with `OrderedDict` above). Pull Request resolved: https://github.com/pytorch/pytorch/pull/130199 Approved by: https://github.com/malfet	2024-07-11 17:30:28 +00:00
Michael Lazos	e72ef4f22a	Fix capturable enablement conditions (#125826 ) Only enable capturable if state hasn't been initialized and all parameters are on CUDA. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125826 Approved by: https://github.com/anijain2305 ghstack dependencies: #125825	2024-05-11 06:29:59 +00:00
Aaron Gokaslan	1dd42e42c4	[BE]: Try TCH autofixes on torch/ (#125536 ) Tries TCH autofixes and see what breaks Pull Request resolved: https://github.com/pytorch/pytorch/pull/125536 Approved by: https://github.com/ezyang	2024-05-05 23:13:59 +00:00
laithsakka	0b70026d3b	Do not pass none to has_pending_mutation (#125359 ) #fix https://github.com/pytorch/pytorch/issues/125315 Several failures when inlining nn module is enabled are due to passing None to has_pending_mutation from previous code, it sounds like its expected for variable to be none when not found, In that case we should skip it and not call has_pending_mutation this is tested in https://github.com/pytorch/pytorch/pull/125354 Pull Request resolved: https://github.com/pytorch/pytorch/pull/125359 Approved by: https://github.com/mlazos	2024-05-02 09:08:22 +00:00
Michael Lazos	5050e627dc	Defer marking_static_address (#124309 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/124309 Approved by: https://github.com/anijain2305 ghstack dependencies: #123324, #123404, #123405	2024-04-19 17:20:57 +00:00
Xuehai Pan	93e249969b	[BE] enable `ruff` rule `RSE` and remove useless parentheses in `raise` statements (#124261 ) Remove useless parentheses in `raise` statements if the exception type is raised with no argument. Pull Request resolved: https://github.com/pytorch/pytorch/pull/124261 Approved by: https://github.com/albanD	2024-04-17 19:29:34 +00:00
Michael Lazos	bff321716c	Remove special handling of step with closure (#123620 ) Implements https://github.com/pytorch/pytorch/issues/123479 Pull Request resolved: https://github.com/pytorch/pytorch/pull/123620 Approved by: https://github.com/anijain2305 ghstack dependencies: #123496, #123497, #123551, #123552, #123618	2024-04-09 21:15:24 +00:00
Michael Lazos	7c23fed12c	Move step to cpu if state is already initialized (#123618 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123618 Approved by: https://github.com/anijain2305 ghstack dependencies: #123496, #123497, #123551, #123552	2024-04-09 09:04:18 +00:00
Michael Lazos	89e6292d48	Defer setting capturable in optimizer variable (#123497 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123497 Approved by: https://github.com/anijain2305 ghstack dependencies: #123496	2024-04-08 19:31:25 +00:00
PyTorch MergeBot	d9ac80f80c	Revert "Defer setting capturable in optimizer variable (#123497 )" This reverts commit `76b290344f`. Reverted https://github.com/pytorch/pytorch/pull/123497 on behalf of https://github.com/PaliC due to seems to have broken distributed/fsdp/test_fsdp_hybrid_shard.py as per `26bf05ccac` ([comment](https://github.com/pytorch/pytorch/pull/123496#issuecomment-2043251234))	2024-04-08 17:06:05 +00:00
Michael Lazos	76b290344f	Defer setting capturable in optimizer variable (#123497 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123497 Approved by: https://github.com/anijain2305 ghstack dependencies: #123496	2024-04-08 08:34:19 +00:00
Michael Lazos	d9d25076fe	Reduce guards of optimizer state dict to guard once per param group (#123413 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123413 Approved by: https://github.com/anijain2305	2024-04-06 00:12:59 +00:00
Michael Lazos	3e2b7e6052	[dynamo][guard overhead] Data ptr guard optimizer state tensors (#122858 ) Stricter (but faster) guarding on optimizer state tensors Pull Request resolved: https://github.com/pytorch/pytorch/pull/122858 Approved by: https://github.com/anijain2305	2024-04-03 21:42:06 +00:00
Animesh Jain	d91db70295	[dynamo][cpp-guards] Optimize tensor.grad accessor (#123226 ) For LayoutLM model, reduces C++ guard overhead by 1.48x. These are the numbers ![image](https://github.com/pytorch/pytorch/assets/13822661/25cfc35b-b67d-4903-8403-71fa931dacdd) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123226 Approved by: https://github.com/jansel	2024-04-03 05:32:13 +00:00
Animesh Jain	5b42c41b19	[dynamo][improve-guard-overhead] Skip TENSOR_MATCH guards on parameters for optimizers (#122647 ) 1.32x guard overhead reduction (1.092 vs vs 0.827 ms) for MegatronBertForCausalLM with 394 params. Pull Request resolved: https://github.com/pytorch/pytorch/pull/122647 Approved by: https://github.com/jansel, https://github.com/mlazos ghstack dependencies: #122646	2024-03-27 19:40:43 +00:00
Jason Ansel	477d154ffd	[dynamo] Add missing _nonvar_fields annotations (#122219 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/122219 Approved by: https://github.com/anijain2305 ghstack dependencies: #122218	2024-03-20 07:53:18 +00:00
Jason Ansel	46bf37b3f7	[dynamo] Replace VariableTracker.apply with visit/realize_all (#122218 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/122218 Approved by: https://github.com/anijain2305	2024-03-20 07:53:18 +00:00
Jason Ansel	153a01833b	[dynamo] Optimize SourcelessBuilder (#122063 ) Improves `benchmarks/dynamo/microbenchmarks/dynamo_microbenchmarks.py` from 2.7s to 2.5s. Pull Request resolved: https://github.com/pytorch/pytorch/pull/122063 Approved by: https://github.com/anijain2305 ghstack dependencies: #122039, #122043, #122055, #122058, #122060	2024-03-19 04:23:30 +00:00
Michael Lazos	15abc56bd5	Graph break on step closure in optimizer (#121777 ) Fixes https://github.com/pytorch/pytorch/issues/116494 Pull Request resolved: https://github.com/pytorch/pytorch/pull/121777 Approved by: https://github.com/yanboliang	2024-03-14 03:18:23 +00:00
Michael Lazos	c5ef4df274	guard on grads being `None` in compiled optimizers (#121291 ) Fixes #115607 We were missing guards when the grads were set to `None`. So if we compiled the optimizer with grads set to their proper value, and then with the grads set to `None` we'd continuously run the `None` version because all of the guards would pass and it would be ordered before the correct version in the cache. Pull Request resolved: https://github.com/pytorch/pytorch/pull/121291 Approved by: https://github.com/Skylion007, https://github.com/anijain2305	2024-03-06 18:33:23 +00:00
lezcano	cdbc29e91a	[dynamo,optim] Use the actual sources from the parameters when tracing "params" in an optimizer (#118535 ) Fixes the unnecessary guards described at https://github.com/pytorch/pytorch/pull/117983#discussion_r1467622149 Pull Request resolved: https://github.com/pytorch/pytorch/pull/118535 Approved by: https://github.com/mlazos ghstack dependencies: #117982, #118098, #117983, #117625, #118194, #118003, #118208, #118199	2024-02-02 14:42:56 +00:00
lezcano	75a5c41921	[dynamo,optim] Place guards on the args before assuming they exist (#117983 ) This enables the new way of writing guards for dicts. Before we were doing things like ``` L['self'].param_groups[0][___dict_keys_getitem(L['self'].param_groups[0], 0)][3] is L['self'].param_groups[0]['params'][3] ``` without knowing whether `L['self'].param_groups[0][___dict_keys_getitem(L['self'].param_groups[0], 0)]` was a list. On a different note, I'll probably write a pass to recover the previous way to place guards on dicts via something like `DICT_KEYS` as an optimisation, as it seems relevant for optimisers. Pull Request resolved: https://github.com/pytorch/pytorch/pull/117983 Approved by: https://github.com/mlazos ghstack dependencies: #117982, #118098	2024-02-02 14:37:46 +00:00
Edward Z. Yang	d03173e88c	Unify MYPYINDUCTOR and MYPY (#118432 ) The original motivation for MYPYINDUCTOR was a faster type checking configuration that only checked a subset of files. With the removal of `follow_imports = ignore`, we are now able to use dmypy to do fast incremental typechecking, eliminating the need for this. Perhaps erroneously, when I tee'ed up this PR I elected to delete the `follow_imports = skip` designations in the mypy-inductor.ini. This lead to a number of extra type error suppressions that I manually edited. You will need to review. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/118432 Approved by: https://github.com/Skylion007 ghstack dependencies: #118414, #118418	2024-01-27 17:23:20 +00:00
rzou	5e0ef84b01	[dynamo] Refactor install_global_once, remove usages of install_global_unsafe (#118100 ) We split install_global_once into two APIs: - `install_global_by_id(prefix, value) -> name`: installs a global if it hasn't been installed yet - `install_global(prefix, value) -> name`: always installs the global (and generates a unique name for it) Then, we refactor most callsites of `install_global_unsafe` to one of the previous. Some callsites cannot be refactored because we create the global name first, do a lot of stuff with it, and then install it. This fixes more test flakiness. Test Plan: - Existing tests; I can't reliably repro the flakiness Pull Request resolved: https://github.com/pytorch/pytorch/pull/118100 Approved by: https://github.com/ezyang, https://github.com/mlazos	2024-01-24 23:25:44 +00:00
Michael Lazos	f302a0d380	Re-enable SGD (#117434 ) Re-enables the SGD optimizer now that compile times are more reasonable. [Benchmark run](https://github.com/pytorch/pytorch/actions/runs/7511073761) Pull Request resolved: https://github.com/pytorch/pytorch/pull/117434 Approved by: https://github.com/anijain2305, https://github.com/janeyx99	2024-01-19 04:28:50 +00:00
PyTorch MergeBot	b0084be114	Revert "Re-enable SGD (#117434 )" This reverts commit `e7fac72be7`. Reverted https://github.com/pytorch/pytorch/pull/117434 on behalf of https://github.com/lezcano due to breaks test_profiler.py when run with dynamo ([comment](https://github.com/pytorch/pytorch/pull/117434#issuecomment-1898311961))	2024-01-18 11:37:36 +00:00
Michael Lazos	e7fac72be7	Re-enable SGD (#117434 ) Re-enables the SGD optimizer now that compile times are more reasonable. [Benchmark run](https://github.com/pytorch/pytorch/actions/runs/7511073761) Pull Request resolved: https://github.com/pytorch/pytorch/pull/117434 Approved by: https://github.com/anijain2305, https://github.com/janeyx99	2024-01-18 06:47:15 +00:00
Animesh Jain	e54b40e5eb	[dynamo] GetItemSource - restrict the supported index Source to be GlobalWeakRefSource (#117138 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/117138 Approved by: https://github.com/jansel, https://github.com/mlazos	2024-01-12 18:21:14 +00:00
PyTorch MergeBot	ac0bed01df	Revert "[dynamo] GetItemSource - restrict the supported index Source to be GlobalWeakRefSource (#117138 )" This reverts commit `c278a1b39c`. Reverted https://github.com/pytorch/pytorch/pull/117138 on behalf of https://github.com/zou3519 due to Broke jobs on main, I'm not sure why ([comment](https://github.com/pytorch/pytorch/pull/117138#issuecomment-1888290068))	2024-01-12 01:55:49 +00:00
Animesh Jain	c278a1b39c	[dynamo] GetItemSource - restrict the supported index Source to be GlobalWeakRefSource (#117138 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/117138 Approved by: https://github.com/jansel	2024-01-11 23:26:25 +00:00
Michael Lazos	fbeca60b1f	Remove replace_all and make VTs mutable (#113725 ) 1. Removes calls to `replace_all` and `clone` and makes VTs mutable. 2. Properly handles Tuple Iterator mutation. Previously TupleIterator variables would only be properly reconstructed if they were advanced at least once in a frame. On calls to `next`, the source information would be lost (due to constructing a new iterator without using builder), which would ensure that during codegen the variable would be reconstructed from scratch. Now that VTs are mutated, the source is never lost, so we need to properly track mutation and handle it by replaying calls to `next` at the end of the modified bytecode. 3. Added test for checking iadd side effects, this was missing in our unit test coverage. 4. Fixed two incorrect sources, DelayGraphBreakVariable, and UserMethodVariable both relied on setting the source to AttrSource(parent, name) at the callsite of `var_getattr`. 5. Fixed a bug in inplace adding for lists, it would set the resulting VariableTracker's source to `None` which would utilize a different reconstruct path in codegen. Now this is handled explicitly by reconstructing vars when allow_cache=`False`, so that during side effect replay, the mutated var is correctly updated. In subsequent PRs: * Refactoring side effect tracking to be significantly simpler (I think we only need an `is_modified` flag) * Refactor `next_variables` iterator to match the signature of `next` * Remove all references to `options` in the code * Refactor VTs representing mutable collections to implement their own mutation update handling * Remove clone and/or make it specific to lists for creating slices * Add mutation tracking/replay for sets * Add mutation tracking/replay for iter.py * Removing setting source in builder (it's set at the top level after a var is returned) Pull Request resolved: https://github.com/pytorch/pytorch/pull/113725 Approved by: https://github.com/jansel	2023-12-10 09:31:21 +00:00
Jason Ansel	9664190952	[dynamo] Eagerly install guards (#111415 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/111415 Approved by: https://github.com/voznesenskym ghstack dependencies: #111306	2023-11-07 19:55:19 +00:00
Jon Chuang	76918367ff	fix(dynamo): `Optimizer._init_group` did not handle return value (#110709 ) blocks: https://github.com/pytorch/pytorch/pull/110706 Causes a bug for all optimizers that use _init_group return value. `compile` + _init_group ret value is not on testing path. So we also add test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/110709 Approved by: https://github.com/ezyang	2023-11-01 05:22:42 +00:00
Jason Ansel	c7b78fb76c	[dynamo] Replace recursively_contains with parents_tracker (#112122 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/112122 Approved by: https://github.com/voznesenskym	2023-10-28 06:46:48 +00:00
Kazuaki Ishizaki	2c1b009e39	Fix typo under torch/_dynamo directory (#110459 ) This PR fixes typo of comments in files under `torch/_dynamo` directory Pull Request resolved: https://github.com/pytorch/pytorch/pull/110459 Approved by: https://github.com/colesbury	2023-10-04 16:05:05 +00:00
Michael Voznesensky	a902150a1e	[Easy] ConstantVariable() -> .create (#109896 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/109896 Approved by: https://github.com/ezyang	2023-09-22 22:30:15 +00:00
Michael Lazos	49df1de383	Cudagraphs support for compiled optimizers (#107504 ) Marks all params/optimizer state as static addresses and a finalizer which cleans up the graph attributes when the optimizer goes out of scope. **Note: this does not mark grads as static because this will increase memory usage significantly There are two cases: 1. The upstream graph is cudagraphed - this case will work fine OOTB 2. The upstream graph is not cudagraphed - in this case, there will be a lot of copies introduced from the upstream (to copy the grads) into cudagraphed-owned memory, unless the user explicitly marks the grads as static. If the user does this, this will also require not deallocating the grads in zero_grad() (either the mod or optimizer version) by setting them to zero vs None. There is a PR (https://github.com/pytorch/pytorch/pull/107853) in flight to throw an error if zero_grad attempts to set static grads to None. Pull Request resolved: https://github.com/pytorch/pytorch/pull/107504 Approved by: https://github.com/eellison	2023-08-31 20:47:18 +00:00
Michael Lazos	690ea933ca	Enable more e2e foreach optimizer compilation tests (#105438 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/105438 Approved by: https://github.com/jansel	2023-07-20 02:41:19 +00:00
Michael Lazos	a290cbf32b	Enable fused foreach Adam compilation (#104121 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/104121 Approved by: https://github.com/janeyx99	2023-07-05 23:40:03 +00:00
Michael Lazos	5a97c947c6	Fix optimizer grad mode state interaction with dynamo (#103952 ) Graph break before restoring the grad mode to ensure dynamo respects `no_grad`. This isn't a bug necessarily, but this will allow us to get good perf until aot is updated. https://github.com/pytorch/pytorch/issues/104053 Pull Request resolved: https://github.com/pytorch/pytorch/pull/103952 Approved by: https://github.com/janeyx99	2023-06-23 02:07:08 +00:00
Michael Lazos	05e91a50d9	Manually generate guards for optimizer (#103121 ) Manually generate guards for optimizer rather than use variable builder, which can be slow with lots of params. This is the reason for ~10s compile slowdown Redisable `_init_group`. This is important, because if for any reason a frame which calls `_init_group` is run in the python interpreter, we will trace it, which we don't want to do. We only want to call it when it is accessed via the fast path implemented with the optimizer variable during symbolic interpretation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/103121 Approved by: https://github.com/jansel	2023-06-08 21:45:19 +00:00
Michael Lazos	c46af25bb3	Initialize optimizer in dynamo to avoid graph break and tracing slowness (#102640 ) On calls to `_init_group` rather than tracing through it, extract python values from the arguments, and call the initialization. This avoids having to trace this function which is very slow with large parameters, and also avoids graph breaking on it. This is sound in this case because the state is only initialized once in the eager case. Guards on the state and params are generated explicitly rather than via tracing the initialization. Caveats: `_init_group` also gathers various state tensors into lists via mutating list arguments to pass to the functional optimizer implementation. These state tensors exist on the optimizer itself, but we don't know exactly how the gathering is done and which tensors correspond to which attributes of the optimizer module (each optimizer has different states). To rectify this, we keep weak_ptrs to all of the tensors collected in the lists in globals (similar to how parameter keys are stored for dictionaries). These pointers are guaranteed to be alive as long as the optimizer object is alive if the internal state is not interfered with and they are guarded with weakref guards Pull Request resolved: https://github.com/pytorch/pytorch/pull/102640 Approved by: https://github.com/jansel	2023-06-03 15:49:51 +00:00

43 Commits