pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
leslie-fang-intel	98929ceae3	[Inductor][CPP] Enable Local Buffer for Outer loop fusion (#126967 ) Summary Currently, the Inductor CPP backend [generated code](https://gist.github.com/leslie-fang-intel/98f91d43dabed581a1ffe23daf133a65#file-bf16-softmax-generated-code-wo-local-buffer-py) for `Softmax` with BF16 data type is significantly slower than the [ATen Implementation](`9a2beb862d/aten/src/ATen/native/cpu/SoftMaxKernel.cpp (L149)`). Upon comparing the generated code with ATen, the performance bottleneck appears to be related to the usage of [local buffer in ATen](`9a2beb862d/aten/src/ATen/native/cpu/SoftMaxKernel.cpp (L159-L160)`). In the current implementation, the Inductor uses the output buffer of Kernel Group Args to store and load temporary result (such as `exp`), since this buffer is corresponding to a `SchedulerNode`. Each thread accesses a portion of this output buffer via indexing. However, since this buffer (take this `exp` as example) is only utilized internally within decomposed `softmax`, this buffer can be replaced with a thread-local buffer similar to ATen's approach. In this PR, we have introduced the optimizations of `LocalBuffer`. Following this enhancement, the [new generated Inductor code with local buffer](https://gist.github.com/leslie-fang-intel/98f91d43dabed581a1ffe23daf133a65#file-bf16-softmax-generated-code-w-local-buffer-py) for BF16 `Softmax` demonstrates significantly improved performance. Running the benchmark [here](https://gist.github.com/leslie-fang-intel/37d81441237b5139c8295f5e6c4cd31a) to test this BF16 `Softmax` case on an 8480 Xeon server shows similar performance between the Inductor CPP Backend and the ATen implementation. 
TestPlan ``` python -u -m pytest -s -v inductor/test_cpu_repro.py -k test_local_buffer_in_outer_loop_fusion ``` Next Step - [ ] Support more than one Local Buffer/Global Buffer Pull Request resolved: https://github.com/pytorch/pytorch/pull/126967 Approved by: https://github.com/jgong5, https://github.com/peterbell10	2024-07-07 05:34:57 +00:00
Aaron Orenstein	afe15d2d2f	Flip default value for mypy disallow_untyped_defs [3/11] (#127840 ) See #127836 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127840 Approved by: https://github.com/oulgen	2024-06-08 18:28:01 +00:00
Peter Bell	98fd23cccc	[EASY] Move OpsHandler and MockHandler to their own file (#119851 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119851 Approved by: https://github.com/lezcano ghstack dependencies: #119728	2024-02-15 18:54:41 +00:00
Andrew M. James	884b6d2a67	[inductor] Implementing missing magic methods on IR values. (#118933 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/118933 Approved by: https://github.com/peterbell10	2024-02-06 05:50:26 +00:00
Edward Z. Yang	96d94f574e	Fix several bugs related to unbacked SymInt codegen in inductor (#117862 ) Let me tell you, this was a journey. * When we repropagate through FX interpreter in AOTAutograd, this will reallocate unbacked SymInts. We can eliminate all of these fresh allocations by appropriately asserting equalities on them setting up replacements. See also https://github.com/pytorch/pytorch/issues/111950 * The `inner_fn` of Loops can contain references to unbacked SymInts. We must collect them to prevent DCE. * Export naughtily accessed `_expr` when it should have accessed `expr` on SymNode. Fixed two sites of this. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/117862 Approved by: https://github.com/bdhirsh	2024-01-26 18:08:03 +00:00
laith sakka	708e6241ed	Fix sympy_subs to preserve integer and non-negative properties. (#118150 ) This diff introduces the following changes: 1. Fix sympy_subs to preserve the integer and non-negative properties of the replaced symbol when the replacement is a string. Why is this needed? I was compiling the expression `x*abs(y)` where y = -2. This expression is passed as ``s1*abs(s0)``, then s0 is replaced with ks0 by a call to sympy_subs. But sympy_subs used to replace s0 (integer=False, nonnegative=False) with ks0 (integer=True, nonnegative=True), resulting in ``x*abs(ks0) = x*ks0``, which is wrong. 2. Rename sympy_symbol to sympy_index_symbol to make it explicit. 3. Add an assertion that the replaced expression is not passed as a string but always as a sympy expression. Fixes https://github.com/pytorch/pytorch/issues/117757 Pull Request resolved: https://github.com/pytorch/pytorch/pull/118150 Approved by: https://github.com/ezyang	2024-01-25 20:54:55 +00:00
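The failure mode described in #118150 can be reproduced with plain SymPy. The sketch below is illustrative only: `subs_keeping_assumptions` is a hypothetical helper written for this example, not Inductor's `sympy_subs`, and it only copies the two assumptions the commit message mentions.

```python
import sympy


def subs_keeping_assumptions(expr, old, new_name):
    """Hypothetical helper: replace `old` by a new symbol that copies its assumptions."""
    kept = {
        key: getattr(old, f"is_{key}")
        for key in ("integer", "nonnegative")
        if getattr(old, f"is_{key}") is not None  # unknown assumptions stay unknown
    }
    return expr.subs(old, sympy.Symbol(new_name, **kept))


# s0 carries no assumptions, so Abs(s0) cannot be simplified.
s0, s1 = sympy.symbols("s0 s1")
expr = s1 * sympy.Abs(s0)

# Lossy substitution: the replacement symbol claims to be a non-negative integer,
# so SymPy folds Abs(ks0) into ks0 and the absolute value disappears.
ks0 = sympy.Symbol("ks0", integer=True, nonnegative=True)
lossy = expr.subs(s0, ks0)

# Assumption-preserving substitution keeps the Abs in place.
preserved = subs_keeping_assumptions(expr, s0, "ks0")
```

With y = -2 the lossy form evaluates to `x * -2` instead of `x * 2`, which is the wrong result the commit describes.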
Edward Z. Yang	df4e3d9d08	Document OpsHandler protocol (#117790 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/117790 Approved by: https://github.com/jansel	2024-01-21 07:20:53 +00:00
Edward Z. Yang	634ce3c913	Document and type torch._inductor.virtualized (#117658 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/117658 Approved by: https://github.com/eellison, https://github.com/peterbell10 ghstack dependencies: #117650	2024-01-18 03:03:20 +00:00
Jason Ansel	94363cee41	[inductor] Indexing refactors (#116078 ) Perf differences seem to be noise: ![image](https://github.com/pytorch/pytorch/assets/533820/d7a36574-0388-46e4-bd4d-b274d37cab2b) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116078 Approved by: https://github.com/aakhundov	2024-01-09 19:06:51 +00:00
Jez Ng	87925789ae	Make V.graph properly typed (#114025 ) Previously it lacked a type hint and so was treated as an Any type. This resulted in a lot of untyped code downstream as V.graph is referenced in many places in inductor code. I've typed it properly now as GraphLowering, and fixed the numerous type errors this surfaced. Pull Request resolved: https://github.com/pytorch/pytorch/pull/114025 Approved by: https://github.com/eellison ghstack dependencies: #114013	2023-11-21 02:14:29 +00:00
Jez Ng	4667e20b3f	Delete a bunch of type-ignores (#113990 ) * Replaced `ignore[import]` by mypy config file entries * Removed a bunch of ignores around previously-fixed attr-defined / call-arg issues * Fixed some invalid / undefined types; added a few more type-ignores to squelch the downstream errors this exposed Pull Request resolved: https://github.com/pytorch/pytorch/pull/113990 Approved by: https://github.com/eellison, https://github.com/Skylion007 ghstack dependencies: #113979	2023-11-18 02:48:38 +00:00
Shunting Zhang	fbafff3668	[reland][inductor] benchmark fusion (#112450 ) reland https://github.com/pytorch/pytorch/pull/108193 Pull Request resolved: https://github.com/pytorch/pytorch/pull/112450 Approved by: https://github.com/jansel	2023-10-31 18:17:06 +00:00
PyTorch MergeBot	fc0b0820fc	Revert "Readded device_assert skipping in index and index_put (and also added (#112093 )" This reverts commit `b110d87ac2`. Reverted https://github.com/pytorch/pytorch/pull/112093 on behalf of https://github.com/ZainRizvi due to Stack breaks internal builds ([comment](https://github.com/pytorch/pytorch/pull/112093#issuecomment-1785922905))	2023-10-30 19:45:41 +00:00
chilli	b110d87ac2	Readded device_assert skipping in index and index_put (and also added (#112093 ) copy to noop pass) Pull Request resolved: https://github.com/pytorch/pytorch/pull/112093 Approved by: https://github.com/oulgen, https://github.com/lezcano	2023-10-27 18:23:49 +00:00
PyTorch MergeBot	64fd027f2e	Revert "[inductor] benchmark fusion (#108193 )" This reverts commit `73cc5d1cdd`. Reverted https://github.com/pytorch/pytorch/pull/108193 on behalf of https://github.com/izaitsevfb due to Trying to unblock the revert of #108690, please rebase and reland. ([comment](https://github.com/pytorch/pytorch/pull/108193#issuecomment-1782157638))	2023-10-27 01:40:06 +00:00
PyTorch MergeBot	0a3199dd7e	Revert "Readded device_assert skipping in index and index_put (and also added (#112093 )" This reverts commit `e38347f490`. Reverted https://github.com/pytorch/pytorch/pull/112093 on behalf of https://github.com/izaitsevfb due to Sorry, trying to resolve a conflict with intern, and unblock the revert of #108690 ([comment](https://github.com/pytorch/pytorch/pull/112093#issuecomment-1782154814))	2023-10-27 01:37:33 +00:00
Shunting Zhang	73cc5d1cdd	[inductor] benchmark fusion (#108193 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/108193 Approved by: https://github.com/jansel	2023-10-26 22:18:37 +00:00
PyTorch MergeBot	485cc0faae	Revert "[inductor] benchmark fusion (#108193 )" This reverts commit `ec0cdcdf6a`. Reverted https://github.com/pytorch/pytorch/pull/108193 on behalf of https://github.com/ZainRizvi due to This test is breaking trunk. In the future please make sure to add the ciflow/trunk label before force merging any PR to ensure your code doesn't break those tests ([comment](https://github.com/pytorch/pytorch/pull/108193#issuecomment-1781473282))	2023-10-26 16:41:20 +00:00
chilli	e38347f490	Readded device_assert skipping in index and index_put (and also added (#112093 ) copy to noop pass) Pull Request resolved: https://github.com/pytorch/pytorch/pull/112093 Approved by: https://github.com/oulgen, https://github.com/lezcano ghstack dependencies: #111990	2023-10-26 07:54:44 +00:00
Shunting Zhang	ec0cdcdf6a	[inductor] benchmark fusion (#108193 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/108193 Approved by: https://github.com/jansel	2023-10-26 04:14:22 +00:00
Elias Ellison	0a147fd112	Pointwise fuse cat with pointwise inputs or outputs and <= 4 inputs (#111233 ) Improves perf of llama_v2 locally from 1.55 -> 1.57. The initial heuristic is to lower to pointwise if # of inputs is <= 4, and all the inputs are pointwise or cannot be memory planned away, or if all the outputs are pointwise. Perf run was +3% on inference. There are definitely instances where we should be lowering to foreach_kernels, but it's less flexible for fusion. The motivating example was:
```
def rotate_half(x):
    """Rotates half the hidden dims of the input."""
    x1 = x[..., : x.shape[-1] // 2]
    x2 = x[..., x.shape[-1] // 2 :]
    return torch.cat((-x2, x1), dim=-1)

def apply_rotary_pos_emb(q, k, cos, sin):
    iota = torch.ops.prims.iota.default(512, start = 0, step = 1, dtype = torch.int64, device = device(type='cuda', index=0), requires_grad = False)
    # File: /scratch/eellison/work/torchdynamo/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py:657, code: position_ids = position_ids.unsqueeze(0).view(-1, seq_length)
    unsqueeze = torch.ops.aten.unsqueeze.default(iota, 0)
    position_ids = torch.ops.aten.reshape.default(unsqueeze, [-1, 512]); unsqueeze = None
    # The first two dimensions of cos and sin are always 1, so we can `squeeze` them.
    cos = cos.squeeze(1).squeeze(0)  # [seq_len, dim]
    sin = sin.squeeze(1).squeeze(0)  # [seq_len, dim]
    cos = cos[position_ids].unsqueeze(1)  # [bs, 1, seq_len, dim]
    sin = sin[position_ids].unsqueeze(1)  # [bs, 1, seq_len, dim]
    q_embed = (q * cos) + (rotate_half(q) * sin)
    k_embed = (k * cos) + (rotate_half(k) * sin)
    return q_embed, k_embed
```
Also not sure if I should be more worried about concatting reduction->pointwise inputs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111233 Approved by: https://github.com/Chillee	2023-10-21 02:34:05 +00:00
Jez Ng	c77dd684c9	Enable typechecking in _inductor/ir.py (#110112 ) I used a bunch of ignore-type comments, mostly due to https://github.com/pytorch/pytorch/issues/109963. Pull Request resolved: https://github.com/pytorch/pytorch/pull/110112 Approved by: https://github.com/peterbell10	2023-10-07 04:19:38 +00:00
Jez Ng	d2d36aad6f	Enable typechecking for _inductor/virtualized.py (#108916 ) Also add a few more type annotations to utils.py (some of its functions are called from virtualized.py) Pull Request resolved: https://github.com/pytorch/pytorch/pull/108916 Approved by: https://github.com/eellison	2023-09-13 13:04:51 +00:00
Mu-Chu Lee	30a33b76b9	[AOTInductor] Include constants in AOTInductor .so file. (#108473 ) Summary: Include constants in AOTInductor .so file. Added some difference: 1) serialize with ctypes instead of the native of torch.storage 2) Use the underlying for_blob instead of from_blob to construct Tensor. Test Plan: Unit tests: ``` test/inductor/test_aot_inductor.py ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/108473 Approved by: https://github.com/angelayi	2023-09-08 03:49:53 +00:00
Bin Bao	06d74e6b24	Revert "[AOTInductor] Include constants in AOTInductor .so file. (#10… (#108349 ) This reverts commit `c3239442a3` due to internal test failures. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108349 Approved by: https://github.com/aakhundov, https://github.com/zhxchen17	2023-08-31 16:26:02 +00:00
Mu-Chu Lee	c3239442a3	[AOTInductor] Include constants in AOTInductor .so file. (#107718 ) Summary: Include the constants into AOTInductor .so file. We do not modify existing API signatures but create necessary format with weight lifted out instead. Test Plan: test/inductor/test_aot_inductor.py Reviewers: Subscribers: Tasks: Tags: Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/107718 Approved by: https://github.com/angelayi, https://github.com/eellison	2023-08-29 22:37:30 +00:00
FFFrog	83517c8dba	Enable Mypy Check in torch/_inductor/virtualized.py (#107127 ) Fixes #105230 ```shell $ lintrunner init && lintrunner -a torch/_inductor/virtualized.py ... ok No lint issues. Successfully applied all patches. ``` ```shell $ mypy torch/_inductor/virtualized.py Success: no issues found in 1 source file ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/107127 Approved by: https://github.com/eellison	2023-08-23 04:54:32 +00:00
Peter Bell	18b1c2907d	[inductor] Add ir.WelfordReduction with multiple outputs (#104725 ) This replaces `var_unnormalized` reduction type with `welford_reduce` which takes the input data and outputs not just the variance, but also the mean and weights which account for the full welford accumulator state. Thus we can avoid re-computing the mean, and we now have enough information to create a multilayer reduction which I implement here by adding a second reduction type called `welford_combine` which reduces over all three inputs simultaneously. Multi-layer support is particularly important as normalization operators like BatchNorm are being split in many timm models, which meant `var_unnormalized` had to fall back to two-pass variance calculation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104725 Approved by: https://github.com/lezcano	2023-08-18 08:18:01 +00:00
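The accumulator state that #104725 describes (mean, a second-moment term, and a weight) and the way two such states merge can be sketched in plain Python. This is a minimal illustration of the standard Welford and Chan et al. formulas, not Inductor's IR; the function names reuse the reduction-type names from the commit message only for readability, and the tuple layout `(mean, m2, weight)` is an assumption of this example.

```python
def welford_reduce(values):
    """Single pass over `values`, returning (mean, m2, weight).

    m2 is the sum of squared deviations from the mean, i.e. the
    unnormalized variance; weight is the number of elements seen.
    """
    mean, m2, weight = 0.0, 0.0, 0
    for x in values:
        weight += 1
        delta = x - mean
        mean += delta / weight
        m2 += delta * (x - mean)
    return mean, m2, weight


def welford_combine(a, b):
    """Merge two accumulator states without revisiting the input data."""
    mean_a, m2_a, w_a = a
    mean_b, m2_b, w_b = b
    weight = w_a + w_b
    if weight == 0:
        return 0.0, 0.0, 0
    delta = mean_b - mean_a
    mean = mean_a + delta * w_b / weight
    m2 = m2_a + m2_b + delta * delta * w_a * w_b / weight
    return mean, m2, weight


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
full = welford_reduce(data)
# A two-layer reduction: reduce each half, then combine the partial states.
split = welford_combine(welford_reduce(data[:4]), welford_reduce(data[4:]))
```

Because the combined state carries the mean and weight along with `m2`, a split reduction reaches the same result as the single pass, which is what removes the need for a two-pass variance fallback.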
Bin Bao	da7ca82121	[inductor] Store real inputs to be used for cpp wrapper codegen (#103289 ) Summary: defaked args (zeros) may cause device-side access assertion, so record the original real tensor inputs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/103289 Approved by: https://github.com/jansel, https://github.com/eellison	2023-06-15 20:05:50 +00:00
Jason Ansel	0c6f409cda	[inductor] Refactor RNG operators (#100064 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100064 Approved by: https://github.com/ngimel	2023-05-20 03:43:33 +00:00
Peter Bell	89bd5d3dab	[inductor] Implement magic methods on IR values (#101076 ) This wraps `ops` into an `OpsWrapper` object which wraps any returned IR values into an `OpsValue` instance. This allows magic methods to be implemented and means lowerings can write mathematical expressions much more fluently. So instead of ```python ops.add(ops.mul(ops.mul(ops.sub(ops.mul(_Ap2, x), _Ap3), x), x), _1) ``` we can write ```python (_Ap2 * x - _Ap3) * x * x + _1 ``` And it will translate to the equivalent `ops` calls. Pull Request resolved: https://github.com/pytorch/pytorch/pull/101076 Approved by: https://github.com/lezcano, https://github.com/ngimel	2023-05-19 23:09:37 +00:00
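The wrapping idea in #101076 can be sketched with a toy handler. Everything below is hypothetical: `StringOps` and `Value` are stand-ins written for this example and do not reproduce Inductor's `OpsWrapper` or `OpsValue`; only three operators are covered.

```python
class StringOps:
    """Toy ops handler that renders each call as a string."""

    @staticmethod
    def add(a, b):
        return f"add({a}, {b})"

    @staticmethod
    def sub(a, b):
        return f"sub({a}, {b})"

    @staticmethod
    def mul(a, b):
        return f"mul({a}, {b})"


class Value:
    """Wraps a handler result so Python operators dispatch to handler calls."""

    def __init__(self, ops, inner):
        self.ops = ops
        self.inner = inner

    def _binary(self, name, other):
        # Unwrap the right-hand side if it is itself a wrapped value.
        rhs = other.inner if isinstance(other, Value) else other
        return Value(self.ops, getattr(self.ops, name)(self.inner, rhs))

    def __add__(self, other):
        return self._binary("add", other)

    def __sub__(self, other):
        return self._binary("sub", other)

    def __mul__(self, other):
        return self._binary("mul", other)


ops = StringOps()
_Ap2, _Ap3, _1, x = (Value(ops, name) for name in ("_Ap2", "_Ap3", "_1", "x"))

# Fluent form, relying on Python's operator precedence.
fluent = ((_Ap2 * x - _Ap3) * x * x + _1).inner

# Explicit nested form from the commit message, written against the toy handler.
nested = ops.add(ops.mul(ops.mul(ops.sub(ops.mul("_Ap2", "x"), "_Ap3"), "x"), "x"), "_1")
```

Both forms issue the same sequence of handler calls, so lowerings can use whichever reads better.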
Peter Bell	66e398951a	[inductor/decomp] Add aten._unsafe_index to disable range checks (#101602 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/101602 Approved by: https://github.com/lezcano, https://github.com/ngimel	2023-05-17 23:36:24 +00:00
PyTorch MergeBot	5f07c589b0	Revert "[inductor] Refactor RNG operators (#100064 )" This reverts commit `3bbf0683a1`. Reverted https://github.com/pytorch/pytorch/pull/100064 on behalf of https://github.com/izaitsevfb due to breaks inductor tests, see D45936056 ([comment](https://github.com/pytorch/pytorch/pull/100064#issuecomment-1552093728))	2023-05-17 21:16:41 +00:00
Jason Ansel	3bbf0683a1	[inductor] Refactor RNG operators (#100064 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100064 Approved by: https://github.com/ngimel	2023-05-17 01:29:31 +00:00
lezcano	495e1b4d0e	Add device_asserts before indirect loads and stores (#98590 ) This PR also adds a way to CSE statements (not only assignments). The tests follow the pattern from https://github.com/openai/triton/pull/1143 They take a fair amount of time to run (90s in my box). If we wanted to improve this, we could avoid testing the `ndim == 3` case. Changes like this one make me hope that we get to clean the amount of lowerings we have at some point...

Generated code for `x[y]` with `x.shape == (3, 2, 4), y.ndim == 1`:

With `dynamic=False`:
```python
tmp0 = tl.load(in_ptr0 + (x1), xmask)
tl.device_assert(((0 <= tmp0) & (tmp0 < 3)) | (~xmask), f"index out of bounds: 0 <= tmp0 < 3")
tmp1 = tl.load(in_ptr1 + (x0 + (8*tmp0)), xmask)
```
With `dynamic=True`:
```python
tmp0 = tl.load(in_ptr0 + (x1), xmask)
tl.device_assert(((0 <= tmp0) & (tmp0 < ks3)) | (~xmask), f"index out of bounds: 0 <= tmp0 < ks3")
tmp1 = tl.load(in_ptr1 + (x0 + (ks1*ks2*tmp0)), xmask)
```

Generated code for `x[y+1, y+1]` with `x.shape == (3, 2, 4), y.ndim == (3, 3)`:

With `dynamic=False` (note how it folds the two upper bounds to `min(3, 2) == 2`):
```python
tmp0 = tl.load(in_ptr0 + (x1), xmask)
tmp1 = 1
tmp2 = tmp0 + tmp1
tl.device_assert(((0 <= tmp2) & (tmp2 < 2)) | (~xmask), f"index out of bounds: 0 <= tmp2 < 2")
tmp3 = tl.load(in_ptr1 + (x0 + (12*tmp2)), xmask)
```
With `dynamic=True`:
```python
tl.device_assert(((0 <= tmp2) & (tmp2 < min(ks2, ks1))) | (~xmask), f"index out of bounds: 0 <= tmp2 < min(ks2, ks1)")
```
The same works when the CSE'd variable appears 3 or more times, but then it generates `min(ks0, min(ks1, ks2))`.

Generated code for `x[y] = z` with `x.ndim = 3`, `y.ndim = 1` and dynamic shapes:
```python
tmp0 = tl.load(in_ptr0 + (x1), xmask)
tmp1 = tl.load(in_ptr1 + (x2), xmask)
tl.device_assert(((0 <= tmp0) & (tmp0 < ks3)) | (~xmask), f"index out of bounds: 0 <= tmp0 < ks3")
tl.store(out_ptr0 + (x0 + (ks1*ks2*tmp0) + tl.zeros([XBLOCK], tl.int32)), tmp1, xmask)
```
Fixes https://github.com/pytorch/pytorch/issues/93538 Pull Request resolved: https://github.com/pytorch/pytorch/pull/98590 Approved by: https://github.com/ngimel	2023-04-19 21:26:57 +00:00
Jason Ansel	fe4fec37a4	[inductor] Refactor IR printing (#96024 ) Reland #95567 part 2. The previous version of this had a bug, which the added test triggers. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96024 Approved by: https://github.com/ngimel	2023-03-07 02:23:06 +00:00
Jason Ansel	43dd043ea7	Revert "[inductor] Improve error messages (#95567 )" (#96014 ) This reverts commit `62b775583f`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96014 Approved by: https://github.com/Chillee	2023-03-04 04:03:31 +00:00
Jason Ansel	62b775583f	[inductor] Improve error messages (#95567 ) Example error message before/after (710 to 131 lines): https://gist.github.com/jansel/6fecad057738089fa95bf08c3de9fc8a Pull Request resolved: https://github.com/pytorch/pytorch/pull/95567 Approved by: https://github.com/mlazos	2023-03-02 02:20:55 +00:00
Wang, Eikan	9895c19a7a	To vectorize long datatype as mask index (#91076 ) In this PR, we record the current fx node being executed to cache additional information to simplify the vectorization checker. In addition, we supported `masked` in this PR by simplifying it as `mask_load` to support `max_pool2d`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91076 Approved by: https://github.com/jgong5, https://github.com/desertfire, https://github.com/jansel	2023-02-05 03:36:22 +00:00
Jason Ansel	8c09a005c5	[inductor] Pattern matching engine (copy) (#93291 ) This is an exact duplicate of https://github.com/pytorch/pytorch/pull/90739 The fbcode workflow for landing that diff seems buggy. The github-export-checks task is failing with credentials errors. Plan to try to land it using GH1. Pull Request resolved: https://github.com/pytorch/pytorch/pull/93291 Approved by: https://github.com/desertfire	2023-01-31 04:51:00 +00:00
Mark Saroufim	15af4b1cee	Dynamo, FX, Inductor Progress Bars (#88384 ) There are 3 progress bars each gated behind their own config, all off by default for now 1. Dynamo: Macro level config for dynamo, AOT, inductor 2. FX: Progress bar for each pass, with their names 3. Inductor Pull Request resolved: https://github.com/pytorch/pytorch/pull/88384 Approved by: https://github.com/wconstab, https://github.com/mlazos, https://github.com/malfet	2022-12-21 11:56:58 +00:00
Peter Bell	81f351acd7	[inductor] Prevent blowup in inner_fn_str and extract_read_writes (#88933 ) Currently the default `ops` handler expects strings as arguments and just formats them into a function call template string. For complex expressions, this can lead to exponential growth in terms. Say for example you have:
```python
def fn(a):
    for _ in range(3):
        a = ops.mul(a, a)
    return a
```
You might expect `inner_fn_str` to contain 1 load and 3 multiplies, but instead you find 8 loads and 7 multiplies:
```python
load(arg_0, i0) * load(arg_0, i0) * load(arg_0, i0) * load(arg_0, i0) * load(arg_0, i0) * load(arg_0, i0) * load(arg_0, i0) * load(arg_0, i0)
```
This type of blowup is present in the lowering for `max_pool2d_with_indices_backward` which in #pytorch/torchdynamo#1352 was reported to have caused the entire compilation to hang. This PR fixes the issue by formatting the string as a series of assignments to variables, so for the example above, we now get:
```
tmp0 = load(arg_0, i0)
tmp1 = tmp0 * tmp0
tmp2 = tmp1 * tmp1
tmp3 = tmp2 * tmp2
return tmp3
```
Which corresponds to the sequence of `ops` calls made. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88933 Approved by: https://github.com/jansel	2022-12-15 15:36:52 +00:00
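The difference between the two formatting strategies in #88933 can be reproduced with two toy handlers. Both classes are hypothetical stand-ins written for this example (they are not Inductor's handlers), and `fn` takes the handler as an argument instead of using a global `ops`.

```python
class InlineOps:
    """Formats every call by pasting its argument strings into a template."""

    def load(self, name, index):
        return f"load({name}, {index})"

    def mul(self, a, b):
        return f"{a} * {b}"


class AssignOps:
    """Emits one assignment per call and returns the temporary's name."""

    def __init__(self):
        self.lines = []

    def _emit(self, expr):
        var = f"tmp{len(self.lines)}"
        self.lines.append(f"{var} = {expr}")
        return var

    def load(self, name, index):
        return self._emit(f"load({name}, {index})")

    def mul(self, a, b):
        return self._emit(f"{a} * {b}")


def fn(ops):
    a = ops.load("arg_0", "i0")
    for _ in range(3):
        a = ops.mul(a, a)
    return a


# Each multiply duplicates its whole operand string, so terms double per step.
inline_str = fn(InlineOps())

# Each multiply refers to a temporary by name, so output grows linearly.
assign_ops = AssignOps()
result_var = fn(assign_ops)
assign_str = "\n".join(assign_ops.lines + [f"return {result_var}"])
```

The inline string doubles in size with each multiply, while the assignment form adds exactly one line per `ops` call.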
PyTorch MergeBot	6581063583	Revert "Dynamo, FX, Inductor Progress Bars (#88384 )" This reverts commit `db0ce4acf3`. Reverted https://github.com/pytorch/pytorch/pull/88384 on behalf of https://github.com/malfet due to Broke test_public_bindings across the board	2022-12-09 16:32:25 +00:00
Mark Saroufim	db0ce4acf3	Dynamo, FX, Inductor Progress Bars (#88384 ) There are 3 progress bars each gated behind their own config, all off by default for now 1. Dynamo: Macro level config for dynamo, AOT, inductor 2. FX: Progress bar for each pass, with their names 3. Inductor Pull Request resolved: https://github.com/pytorch/pytorch/pull/88384 Approved by: https://github.com/wconstab, https://github.com/mlazos	2022-12-09 04:32:31 +00:00
Jean Schmidt	f62e54df8f	Reland "Dynamo, FX, Inductor Progress Bars (#88384 )" … (#90055 ) This commit had inconsistent internal land and pr merged. This caused merge conflicts that required revert in both places, normalize the internal commit stack, and then re-land properly. Original commit: #88384 (`011452a2a1`) Inconsistent revert: #90018 (8566aa7c0b4bdca50bf85ca14705b4304de030b3) Revert of the inconsistent revert to restore healthy state (or re-land of the original commit): `cf3c3f2280` Landing the correct, internally congruent revert of the original commit: (This PR) #90055 (TBD) Pull Request resolved: https://github.com/pytorch/pytorch/pull/90055 Approved by: https://github.com/DanilBaibak, https://github.com/malfet	2022-12-02 13:28:00 +00:00
PyTorch MergeBot	cf3c3f2280	Revert "Revert "Dynamo, FX, Inductor Progress Bars (#88384 )" (#90018 )" This reverts commit `bcf4292f04`. Reverted https://github.com/pytorch/pytorch/pull/90018 on behalf of https://github.com/jeanschmidt due to landed internal commit does not match with this one, causing merge conflict and preventing import and land new commits	2022-12-02 09:57:31 +00:00
Eli Uriegas	bcf4292f04	Revert "Dynamo, FX, Inductor Progress Bars (#88384 )" (#90018 ) This breaks in environments that use the fake tqdm `015b05af18/torch/hub.py (L26)` which doesn't support the 'desc' kwarg and is not iterable Original try using pytorchbot did not go through because of a merge conflict: https://github.com/pytorch/pytorch/pull/88384#issuecomment-1334272489 This reverts commit `011452a2a1`. Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/90018 Approved by: https://github.com/drisspg, https://github.com/dbort	2022-12-01 20:17:07 +00:00
Mark Saroufim	011452a2a1	Dynamo, FX, Inductor Progress Bars (#88384 ) There are 3 progress bars each gated behind their own config, all off by default for now 1. Dynamo: Macro level config for dynamo, AOT, inductor 2. FX: Progress bar for each pass, with their names 3. Inductor Pull Request resolved: https://github.com/pytorch/pytorch/pull/88384 Approved by: https://github.com/wconstab, https://github.com/mlazos	2022-11-30 06:07:14 +00:00
Horace He	419ef2cdcf	Added utility to count memory reads/written in Inductor (#89203 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/89203 Approved by: https://github.com/jansel, https://github.com/ngimel	2022-11-19 04:18:26 +00:00
Yanbo Liang	98f40af7e3	[Inductor] Truncate function expr str if it's too long at RecordLoadStore (#87248 ) See context at https://github.com/pytorch/torchdynamo/issues/1352#issuecomment-1283131872 Fixes https://github.com/pytorch/torchdynamo/issues/1352 cc @jansel @lezcano @fdrocha @mlazos @soumith @voznesenskym @penguinwu Pull Request resolved: https://github.com/pytorch/pytorch/pull/87248 Approved by: https://github.com/jansel	2022-10-25 03:22:27 +00:00


52 Commits