pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Boyuan Feng	514409d032	update torchvision pin (#154255 ) Fixes #153985 Pull Request resolved: https://github.com/pytorch/pytorch/pull/154255 Approved by: https://github.com/desertfire	2025-05-27 16:15:25 +00:00
Huy Do	6cd9d66b7f	Allow higher fp16 tolerance for phlippe_resnet on CUDA 12.8 (#154109 ) After https://github.com/pytorch/pytorch/pull/154004, one of the model `phlippe_resnet` needs higher tolerance for fp16 on CUDA 12.8. I can reproduce it locally with: ``` python benchmarks/dynamo/torchbench.py --accuracy --timing --explain --print-compilation-time --inductor --device cuda --training --amp --only phlippe_resnet E0522 02:47:12.392000 2130213 site-packages/torch/_dynamo/utils.py:2949] RMSE (res-fp64): 0.00144, (ref-fp64): 0.00036 and shape=torch.Size([]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.001000, use_larger_multiplier_for_smaller_tensor: 0 ``` I'm not sure what exactly happens behind the scene, but this should help fix the CI failure. Also remove some left over expected accuracy results for CUDA 12.4 which we are not using anymore on CI for benchmark jobs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154109 Approved by: https://github.com/Skylion007, https://github.com/malfet	2025-05-22 14:25:12 +00:00
Pian Pawakapan	8ea95d2e73	[inductor] dtype promotion error in cat decomp (#152995 ) cloning single tensor wasn't following dtype promotion rules for SAM model: https://github.com/pytorch/pytorch/issues/152606 Pull Request resolved: https://github.com/pytorch/pytorch/pull/152995 Approved by: https://github.com/yushangdi, https://github.com/eellison	2025-05-09 16:58:58 +00:00
Animesh Jain	ecd74c953f	[dynamo] Recursively realize the stack_values (#152853 ) Might also fix - https://github.com/pytorch/pytorch/issues/135696 Pull Request resolved: https://github.com/pytorch/pytorch/pull/152853 Approved by: https://github.com/Lucaskabela, https://github.com/mlazos, https://github.com/jansel	2025-05-07 02:36:44 +00:00
PyTorch MergeBot	fcd5e49138	Revert "[dynamo] Recursively realize the stack_values (#152853 )" This reverts commit `460888f908`. Reverted https://github.com/pytorch/pytorch/pull/152853 on behalf of https://github.com/malfet due to Looks like it broke inductor tests ([comment](https://github.com/pytorch/pytorch/pull/152853#issuecomment-2854897485))	2025-05-06 15:02:57 +00:00
Animesh Jain	460888f908	[dynamo] Recursively realize the stack_values (#152853 ) Might also fix - https://github.com/pytorch/pytorch/issues/135696 Pull Request resolved: https://github.com/pytorch/pytorch/pull/152853 Approved by: https://github.com/Lucaskabela, https://github.com/mlazos, https://github.com/jansel	2025-05-06 06:30:31 +00:00
rzou	64957db6c9	Fix some inductor periodic benchmarks (#152605 ) Some were reporting "pass" consistently on https://hud.pytorch.org/ Those are fine to flip. I filed a separate issue for the now-regressions for AOTI: https://github.com/pytorch/pytorch/issues/152606. These should be looked at. Pull Request resolved: https://github.com/pytorch/pytorch/pull/152605 Approved by: https://github.com/eellison, https://github.com/huydhn	2025-05-01 22:18:30 +00:00
Anthony Shoumikhin	e2f9759bd0	Fix broken URLs (#152237 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152237 Approved by: https://github.com/huydhn, https://github.com/malfet	2025-04-27 09:56:42 +00:00
Shunting Zhang	8f440a8e70	don't return logits for benchmark script (#151075 ) PT2 benchmark scripts has a pattern like: ``` def forward_and_backward_pass(self, mod, inputs, collect_outputs=True): cloned_inputs = clone_inputs(inputs) self.optimizer_zero_grad(mod) with self.autocast(self.autocast_arg): pred = mod(cloned_inputs) loss = self.compute_loss(pred) self.grad_scaler.scale(loss).backward() self.optimizer_step() if collect_outputs: return collect_results(mod, pred, loss, cloned_inputs) return None ``` for training. The collect_outputs argument is True only for accuracy testing and it's false for performance testing. For HF benchmark suite, a model usually returns tuple (loss, logits). For performance testing, even though the logits is never used anywhere, dynamo has to keep it due to the control flow. A few bad things if we keep logits here 1. the peak memory will be higher since the logits is large and we can not release its memory earlier. 2. we can not do optimization like chunking for the logits because the tensor needs to be returned from the pre-grad graph Actually I think it's fine to not return logits at all. - For training cases, checking loss and gradients for accuracy is good enough. It's hard to see two runs have mismatch logits but matching loss/gradients. - Also, discarding logits as soon as possible for perf benchmarking makes it more fair for us. On the other hand, it may be interesting to let dynamo support something like dynamo.constexpr (similar to tl.constexpr). A variable annotated as dynamo.constexpr will be specialized at compile time and we can do more optimization (DCE e.g.) at compile time. (A small [repro](https://gist.github.com/shunting314/0912a8947028a904c34f361021b8024d)) Benchmark results here [link](https://hud.pytorch.org/benchmark/compilers?dashboard=torchinductor&startTime=Fri%2C%2004%20Apr%202025%2018%3A03%3A26%20GMT&stopTime=Fri%2C%2011%20Apr%202025%2018%3A03%3A26%20GMT&granularity=hour&mode=training&dtype=amp&deviceName=cuda%20(h100)&lBranch=gh/shunting314/204/head&lCommit=fe25dab3f65e1b0e9db0af03f7664af70fcc9c66&rBranch=main&rCommit=55e62ff74ad5614faf80b060c7bfc551e3b7af5a) - HF 15% (1.51 -> 1.66 compression ratio) peak memory improvement - I also see 5% (2.74 -> 2.79x) perf win for HF. It could be true. We may generate more efficient kernels since we don't need keep logits and return it from the pre-grad graph. But I'll double check Pull Request resolved: https://github.com/pytorch/pytorch/pull/151075 Approved by: https://github.com/eellison, https://github.com/jansel	2025-04-15 17:13:00 +00:00
Animesh Jain	7b1a2373e8	[dynamo][super variable] Fix bug to use correct source (#151154 ) Fixes https://github.com/pytorch/pytorch/issues/150994 We should cherry-pick to 2.7 branch if possible, because this breaks torch.compile on some HF models. Look at the issue referenced here. Pull Request resolved: https://github.com/pytorch/pytorch/pull/151154 Approved by: https://github.com/jansel	2025-04-13 04:48:52 +00:00
LifengWang	51f0403f46	Update the baseline for max_autotune ci workflow (#149107 ) Since the issue https://github.com/pytorch/pytorch/issues/148535 is fixed in PR https://github.com/pytorch/pytorch/pull/148923, update the baseline for max_autotune ci workflow. Pull Request resolved: https://github.com/pytorch/pytorch/pull/149107 Approved by: https://github.com/chuanqi129, https://github.com/leslie-fang-intel, https://github.com/desertfire	2025-03-31 09:45:44 +00:00
LifengWang	e40a9e602b	Add the max_autotune tests in the periodic jobs. (#143560 ) To promptly detect issues with max_autotune, such as [#143102](https://github.com/pytorch/pytorch/issues/143102), add the max_autotune tests to the periodic CI to track the accuracy regularly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/143560 Approved by: https://github.com/leslie-fang-intel, https://github.com/desertfire	2025-03-12 01:47:46 +00:00
Bin Bao	f69e58e8e8	[CI] Update crossvit_9_240 as pass (#148989 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/148989 Approved by: https://github.com/ZainRizvi	2025-03-11 20:54:39 +00:00
Ting Lu	9769618d35	[CI] [inductor] Add cu126 inductor jobs and move away cu124 (#148612 ) https://github.com/pytorch/pytorch/issues/145570 breaking https://github.com/pytorch/pytorch/pull/140793 into eager and inductor benchmarks to unblock Seems many inductor yml are added after initial change was prepared. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148612 Approved by: https://github.com/nWEIdia, https://github.com/atalman Co-authored-by: atalman <atalman@fb.com>	2025-03-07 18:30:14 +00:00
Huy Do	04011304e5	Update dynamo expected 20250210 (#146856 ) Update all the ci accuracy expect values to make trunk green. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146856 Approved by: https://github.com/yanboliang	2025-02-12 18:01:20 +00:00
blorange-amd	5fd15a04b7	[ROCm] Enable inductor-periodic testing for MI300 (#144594 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144594 Approved by: https://github.com/malfet, https://github.com/huydhn Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-02-10 17:42:09 +00:00
Animesh Jain	e2e265e27b	[dynamo] Use polyfill to implement comparison operators (#144485 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144485 Approved by: https://github.com/jansel	2025-02-06 17:27:07 +00:00
Huy Do	f38d5b4a74	Update TorchBench commit to main (#145455 ) I'm adding sam2 to TorchBench https://github.com/pytorch/benchmark/issues/2566, so, as part of that, I'm updating PyTorch CI to use latest TorchBench commit. The corresponding change from TorchBench is https://github.com/pytorch/benchmark/pull/2584 The main thing to call out that the newer transformers added by https://github.com/pytorch/benchmark/pull/2488 is regressing several models. This needs to be investigated further, and I pin the version to unblock this change. * `hf_Roberta_base` a new model added by https://github.com/pytorch/benchmark/pull/2279, not sure why it fails accuracy on A10G, but it works fine on A100 * `speech_transformer` failures are pre-existing trunk failures, i.e. https://github.com/pytorch/pytorch/actions/runs/13040114684/job/36380989702#step:22:2408 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145455 Approved by: https://github.com/kit1980	2025-02-01 06:44:26 +00:00
PyTorch MergeBot	1185b81c51	Revert "[dynamo] Use polyfill to implement comparison operators (#144485 )" This reverts commit `d1f82de2bf`. Reverted https://github.com/pytorch/pytorch/pull/144485 on behalf of https://github.com/huydhn due to This seems to break dynamo tests in trunk after landing ([comment](https://github.com/pytorch/pytorch/pull/144485#issuecomment-2622893294))	2025-01-29 21:30:42 +00:00
Animesh Jain	d1f82de2bf	[dynamo] Use polyfill to implement comparison operators (#144485 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144485 Approved by: https://github.com/jansel	2025-01-29 17:37:40 +00:00
Bin Bao	6a44a61514	[BE] Bump TIMM pin (#145320 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145320 Approved by: https://github.com/Skylion007	2025-01-23 20:44:26 +00:00
Huy Do	8e4539245e	Update ci_expected_accuracy for TIMM levit_128 for further investigation (#145112 ) TSIA, it looks like an upstream change, but I'm not sure from where yet. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145112 Approved by: https://github.com/izaitsevfb, https://github.com/malfet	2025-01-18 01:55:34 +00:00
Huy Do	396630ed78	Update the accuracy results for moco and llama (#144523 ) This has been failing in trunk for sometimes, let's just update the accuracy results first. The command I run `python benchmarks/dynamo/ci_expected_accuracy/update_expected.py 127f836881e75e0c688619b54a35b018a69d7ee7`. I also fix the update script a bit to make it working after https://github.com/pytorch/pytorch/pull/139337 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144523 Approved by: https://github.com/kit1980, https://github.com/Skylion007	2025-01-10 19:40:49 +00:00
Guilherme Leobas	6bc17b0725	Update #graph breaks for moco benchmark (#144266 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144266 Approved by: https://github.com/zou3519	2025-01-09 18:51:13 +00:00
Guilherme Leobas	4c8d661348	Set `enable_trace_contextlib_contextmanager` flag to True (#140604 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/140604 Approved by: https://github.com/zou3519 ghstack dependencies: #136033	2025-01-06 16:56:22 +00:00
PyTorch MergeBot	8d63a4a409	Revert "Set `enable_trace_contextlib_contextmanager` flag to True (#140604 )" This reverts commit `1c817fe671`. Reverted https://github.com/pytorch/pytorch/pull/140604 on behalf of https://github.com/guilhermeleobas due to breaking one of the benchmarks (moco) ([comment](https://github.com/pytorch/pytorch/pull/140604#issuecomment-2569640837))	2025-01-03 18:23:53 +00:00
Xuehai Pan	b6bdb67f82	[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 ) Changes by apply order: 1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`. 2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`. 3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first. `.parent{...}.absolute()` -> `.absolute().parent{...}` 4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.) `.parent.parent.parent.parent` -> `.parents[3]` 5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~ ~`.parents[3]` -> `.parents[4 - 1]`~ 6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374 Approved by: https://github.com/justinchuby, https://github.com/malfet	2024-12-29 17:23:13 +00:00
Animesh Jain	c3c27aef34	[dynamo] Remove HFPretrained config hack (#143698 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/143698 Approved by: https://github.com/williamwen42, https://github.com/jansel ghstack dependencies: #143888	2024-12-28 02:03:13 +00:00
PyTorch MergeBot	475656fd9c	Revert "[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 )" This reverts commit `2293fe1024`. Reverted https://github.com/pytorch/pytorch/pull/129374 on behalf of https://github.com/malfet due to failing internal ROCM builds with error: ModuleNotFoundError: No module named hipify ([comment](https://github.com/pytorch/pytorch/pull/129374#issuecomment-2562973920))	2024-12-26 17:32:23 +00:00
Xuehai Pan	2293fe1024	[BE][Easy] use `pathlib.Path` instead of `dirname` / `".."` / `pardir` (#129374 ) Changes by apply order: 1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`. 2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`. 3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first. `.parent{...}.absolute()` -> `.absolute().parent{...}` 4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.) `.parent.parent.parent.parent` -> `.parents[3]` 5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~ ~`.parents[3]` -> `.parents[4 - 1]`~ 6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374 Approved by: https://github.com/justinchuby, https://github.com/malfet	2024-12-21 22:08:01 +00:00
Guilherme Leobas	1c817fe671	Set `enable_trace_contextlib_contextmanager` flag to True (#140604 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/140604 Approved by: https://github.com/zou3519 ghstack dependencies: #136033	2024-12-20 12:02:27 +00:00
Guilherme Leobas	673cc88fd6	Add support for `contextmanager` in Dynamo (#136033 ) Fixes #130559 * Intro This PR adds support for `@contextmanager` in Dynamo. We chose to limit the scope of this work to only `@contextmanager` and plan to handle generators fully in #141055 (still in draft). * Motivation Dynamo lacks support for generator functions. When it encounters one, it traces it as if it were a regular function. This is problematic because it can lead to incorrect behavior. To illustrate, consider the test case below: ```python import torch import contextlib @contextlib.contextmanager def set_default_dtype(dtype): old_dtype = torch.get_default_dtype() try: torch.set_default_dtype(dtype) yield finally: torch.set_default_dtype(old_dtype) @torch.compile(backend="eager", fullgraph=True) def fn(): with set_default_dtype(torch.float64): x = torch.tensor([3.0, 3.0 + 5.0j]) return x ``` Before this work, Dynamo would not stop at the `yield`, and the graph produced would contain both calls to `set_default_dtype` executed one after the other. This is incorrect because the context manager should execute code before and after the `yield`. * List of changes `YIELD_VALUE` now raises an exception (`YieldValueOp`) to signal that control flow must be suspended and returned to the caller. Additionally, `RETURN_VALUE` behaves differently in a generator function. Unlike regular functions, where `RETURN_VALUE` indicates the final result, in generators it signifies that the generator is exhausted and implicitly raises `StopIteration`. A new `VariableTracker` named `FunctionDecoratedByContextlibContextManagerVariable` was introduced to handle `@contextmanager`. This variable tracker acts not just as a wrapper for the original function but also maintains an internal `tx` (InstructionTranslator) object to suspend and return control flow to the parent tracer when a `yield` is encountered. * Corner cases Returning a context manager from a compiled function is not supported. This would require PyTorch to synchronize the generator state between Dynamo and the interpreter. Any attempt to return it will result in an `IncorrectUsage` exception. Graph breaks require special handling as well. In the event of a graph break, the frame associated with the context manager is skipped, and the context manager runs in eager mode. * This PR is breaking my code There is a configuration flag (`enable_trace_contextlib`) that can be set to `False` to disable tracing context managers. If this still causes crashes, please revert this PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/136033 Approved by: https://github.com/zou3519	2024-12-20 12:02:20 +00:00
Tugsbayasgalan Manlaibaatar	87f9c1abe5	Change export IR to non-functional pre-dispatch IR (#139511 ) Differential Revision: [D65362160](https://our.internmc.facebook.com/intern/diff/D65362160) State after this IR: 1. For the tests that require inference IR, they are replaced with ep.run_decomp({}) so export_for_training_run_decomp is sort of redundant but i guess it is still nice that multiple round of retracing still working. In general, we need some auditing to reduce our redundant testing coverages. 2. After this PR landed and not get reverted for a week or so, i will replace the export_for_training calls with export as they are the same thing now. 3. Added more tests to also cover now "deprecated" old IR by patching export to use old export. For reviewers, please look at the internal version. Pull Request resolved: https://github.com/pytorch/pytorch/pull/139511 Approved by: https://github.com/ydwu4, https://github.com/angelayi, https://github.com/avikchaudhuri	2024-11-20 21:47:55 +00:00
Catherine Lee	fc813df120	Benchmarks dynamo update script to use ClickHouse instead of Rockset (#140574 ) Query works but the part where it parses the job name is broken Pull Request resolved: https://github.com/pytorch/pytorch/pull/140574 Approved by: https://github.com/huydhn	2024-11-15 22:17:35 +00:00
Pian Pawakapan	09848c892a	[aot_compile] propagate ShapeEnv during lowering (#138362 ) We found that `export() -> _inductor.aot_compile()` lowering, 3 different ShapeEnvs get created, leading to errors when one ShapeEnv processes expressions created by another ShapeEnv. This plumbs the 2 places where ShapeEnv creation happens, detecting the original ShapeEnv from the GraphModule example values, so the original ShapeEnv is just reused. Differential Revision: D64613290 Pull Request resolved: https://github.com/pytorch/pytorch/pull/138362 Approved by: https://github.com/angelayi	2024-10-24 22:22:14 +00:00
Animesh Jain	0a2407b93c	[dynamo] Support omegaconf DictConfig (#138378 ) Fixes https://github.com/pytorch/pytorch/issues/138224 Pull Request resolved: https://github.com/pytorch/pytorch/pull/138378 Approved by: https://github.com/jansel ghstack dependencies: #138359	2024-10-20 02:43:17 +00:00
Colin Peppler	9690cacd61	[aotinductor] Add helper fn to atomically apply size_hint to an expr w/ unbacked symints (#137537 ) ### Context Fixes CUDA IMA in autotune_at_compile_time, where we would generate an example tensor with an incorrect stride. In the case below, the stride should be (u0 * 128, 128, 1). However, we apply the fallback on the entire expr (i.e. u0 * 128). ``` # buf817 = tensor(size=(s0, u0, 128), stride=(u0 * 128, 128, 1)) buf812 = generate_example_value( (64, 8192, 128), (8192, 128, 1), "cuda:0", torch.bfloat16, 0 ) ``` The fix is to apply the fallback on each symbol. ### Test ``` PYTORCH_NO_CUDA_MEMORY_CACHING=1 compute-sanitizer python test_aot_inductor.py -k test_stride_with_unbacked_expr_abi_compatible_cuda ========= Invalid __global__ write of size 2 bytes ``` Differential Revision: [D64074561](https://our.internmc.facebook.com/intern/diff/D64074561) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137537 Approved by: https://github.com/jingsh	2024-10-10 17:11:24 +00:00
Bin Bao	a15f3f51bc	[AOTI] Update sam_fast from timeout to fail_to_run (#136996 ) Summary: sam_fast changes from timeout to fail_to_run after https://github.com/pytorch/pytorch/pull/136591, which "regressed" in a good way. Update the expected result file and continue investigating. Pull Request resolved: https://github.com/pytorch/pytorch/pull/136996 Approved by: https://github.com/ezyang	2024-09-30 14:05:49 +00:00
William Wen	2157e396a3	[dynamo] attempt run only mode when dynamo cache limit is hit (#136655 ) Implement https://github.com/pytorch/pytorch/issues/135458. Try run-only mode when dynamo cache limit is hit. If no valid cache entries are found, then skip code recursively. Pull Request resolved: https://github.com/pytorch/pytorch/pull/136655 Approved by: https://github.com/jansel	2024-09-27 17:15:05 +00:00
zengxian	7ec17b49cf	Fix dynamo benchmark skip logic for cpu device (#135193 ) Fixes #132380, adjust torchbench and huggingface skip models list, then we can remove `--no-skip` when running benchmarks on 3 suites. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135193 Approved by: https://github.com/chuanqi129, https://github.com/jansel	2024-09-10 03:02:19 +00:00
Yanbo Liang	d81731615f	[Dynamo] Adding CallFunctionNoArgsSource and (#135425 ) CallFunctionNoArgsGuardAccessor to support torch.cuda.current_device() Pull Request resolved: https://github.com/pytorch/pytorch/pull/135425 Approved by: https://github.com/anijain2305	2024-09-09 22:46:00 +00:00
leslie-fang-intel	07689a38bf	[Inductor] Fix AOT weight alignment issue on CPU (#135205 ) Summary Fix issue: https://github.com/pytorch/pytorch/issues/135027. On CPU, the `consts_size` used to generate `_binary_constants_bin_start` is not padded to `ALIGN_BYTES`, while `serialized_weights` is, causing a failure in the 16K alignment check. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135205 Approved by: https://github.com/jgong5, https://github.com/desertfire	2024-09-06 03:06:51 +00:00
Animesh Jain	577a93514f	[dynamo][dynamic][heuristic] Mark tuple getitem integers as static (#134734 ) Fixes issue seen in https://github.com/pytorch/pytorch/issues/132872#issuecomment-2314574656 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134734 Approved by: https://github.com/jansel ghstack dependencies: #134653, #134713	2024-08-30 17:06:57 +00:00
Bin Bao	387d3fc296	[AOTI] Switch benchmarking to use export non-strict mode (#130977 ) Summary: Switch the export part used by AOTInductor benchmarking from strict to non-strict, and switch it from producing torch IR to aten IR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/130977 Approved by: https://github.com/angelayi ghstack dependencies: #134639	2024-08-29 16:08:52 +00:00
Bin Bao	e6bf1710ff	[Inductor][Refactor] Rename CPU benchmark test configs (#134639 ) Summary: benchmarks/dynamo/ci_expected_accuracy/update_expected.py expects a benchmark run config is named as {config}_{benchmark}, and CPU tests should follow the same naming convention. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134639 Approved by: https://github.com/huydhn	2024-08-28 14:49:55 +00:00
Weizhuo Zhang	5153550e4b	[CI] Add FP32 dynamic, AMP static, AMP dynamic for AOT inductor accuracy CPU CI test (#132836 ) This PR added 3 more accuracy test for AOT inductor CPU side. 1. FP32 dynamic shape accuracy test, torchbench suite 2. AMP static shape accuracy test, torchbench suite 3. AMP dynamic shape accuracy test, torchbench suite Test Time cost: \| Precision \| Shape Type \| Suite \| Time cost \| \|----------- \|------------ \|------------ \|----------- \| \| FP32 \| dynamic \| Torchbench \| 1h40m \| \| AMP \| Static \| Torchbench \| 1h38m \| \| AMP \| dynamic \| Torchbench \| 1h48m \| Pull Request resolved: https://github.com/pytorch/pytorch/pull/132836 Approved by: https://github.com/desertfire	2024-08-19 14:26:48 +00:00
Bin Bao	0272934238	[Inductor][CPU] Fix an InvalidVecISA issue on CI (#131812 ) Summary: CPU CI nodes failed to find valid VecISA because importing torch under the default pytorch directory will fail with the following msg, so switch cwd to a tmp directory. ``` Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/var/lib/jenkins/workspace/torch/__init__.py", line 66, in <module> from torch.torch_version import __version__ as __version__ File "/var/lib/jenkins/workspace/torch/torch_version.py", line 4, in <module> from torch.version import __version__ as internal_version ModuleNotFoundError: No module named 'torch.version' ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/131812 Approved by: https://github.com/eellison, https://github.com/malfet	2024-07-26 22:31:44 +00:00
Xuehai Pan	c0ed38e644	[BE][Easy][3/19] enforce style for empty lines in import segments in `benchmarks/` (#129754 ) See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter. You can review these PRs via: ```bash git diff --ignore-all-space --ignore-blank-lines HEAD~1 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/129754 Approved by: https://github.com/ezyang	2024-07-17 14:34:42 +00:00
eellison	9ab8d47f9d	Constant folding for dynamic shape node (#129686 ) Extend constant folding for dynamic shape node, only support pointwise op and some restricted ops We support dynamic shapes by limiting constant folding of ops that are guaranteed to have uniform values (full, pointwise ops, and views) and running these operators with tensors of shape 1. This also eliminates the possibility of memory overhead of constant folding. Taken over from https://github.com/pytorch/pytorch/pull/128937 joint work with @imzhuhl Pull Request resolved: https://github.com/pytorch/pytorch/pull/129686 Approved by: https://github.com/Chillee ghstack dependencies: #130367	2024-07-16 00:17:11 +00:00
PyTorch MergeBot	9df4bc6a0d	Revert "Constant folding for dynamic shape node (#129686 )" This reverts commit `b7d287fbec`. Reverted https://github.com/pytorch/pytorch/pull/129686 on behalf of https://github.com/atalman due to Failing internally. Test: https://github.com/pytorch/ao/blob/main/test/prototype/mx_formats/test_mx_linear.py ([comment](https://github.com/pytorch/pytorch/pull/129686#issuecomment-2228755295))	2024-07-15 15:19:24 +00:00

1 2 3 4 5 ...

262 Commits