pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 00:20:18 +01:00

Author	SHA1	Message	Date
Wang, Chuanqi	0d3a4f7155	[CD] Enable Inductor performance test for xpu (#166289 ) Add Dynamo benchmark performance tests for XPU backend Pull Request resolved: https://github.com/pytorch/pytorch/pull/166289 Approved by: https://github.com/EikanWang, https://github.com/atalman	2025-10-31 10:52:07 +00:00
Shunting Zhang	0db6bcc015	Fix accuracy for layernorm/rmsnorm benchmarking (#166005 ) Example command: python benchmarks/dynamo/genai_layers/benchmark.py --exit-on-accuracy-failure --tolerance=1e-2 rmsnorm_backward Fix the accuracy problem for layernorm/rmsnorm fwd/bwd. Also fix some quack calls (maybe due to quack API change) Pull Request resolved: https://github.com/pytorch/pytorch/pull/166005 Approved by: https://github.com/BoyuanFeng	2025-10-24 18:14:51 +00:00
Shunting Zhang	673060beae	[inductor] turn Inductor deterministic mode on with torch.use_deterministic_algorithms (#165950 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/165950 Approved by: https://github.com/v0i0, https://github.com/eellison	2025-10-23 02:48:42 +00:00
Jason Ansel	3c3b278872	[reland][fx] Move Node._prepend/Node._remove_from_list to C++ (#165882 ) Relands #148261 that was reverted by #150542 Pull Request resolved: https://github.com/pytorch/pytorch/pull/165882 Approved by: https://github.com/ezyang	2025-10-21 19:43:55 +00:00
Tugsbayasgalan Manlaibaatar	c73f5080de	Migrating some more callsites (#163580 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/163580 Approved by: https://github.com/avikchaudhuri ghstack dependencies: #165582	2025-10-19 15:52:17 +00:00
Yuanyuan Chen	3255e7872b	Enable all flake8-logging-format rules (#164655 ) These rules are enabled by removing existing suppressions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164655 Approved by: https://github.com/janeyx99, https://github.com/mlazos	2025-10-19 00:59:28 +00:00
Yuanyuan Chen	e595136187	Enable PLC1802 on ruff (#165813 ) This PR enables ruff check `PLC1802`, which detects len calls on sequences in a boolean test context. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165813 Approved by: https://github.com/ezyang	2025-10-18 05:44:14 +00:00
Han, Xu	bfcdbd0a97	fix wrong accuracy_status when exception. (#165731 ) While debugging an `XPU` accuracy issue, I found that the script output the wrong accuracy_status. When the `try` block raises an exception, we should process the exception rather than return `fail_accuracy`. Before the fix, it returned `fail_accuracy`: <img width="1109" height="216" alt="image" src="https://github.com/user-attachments/assets/385c354f-fbf6-48e4-a1be-3e37e987341b" /> After the fix, it returned the exception message: <img width="1101" height="292" alt="image" src="https://github.com/user-attachments/assets/f18c0e3c-8358-4ec7-a6bb-c2e01b69d27f" /> Pull Request resolved: https://github.com/pytorch/pytorch/pull/165731 Approved by: https://github.com/Stonepia, https://github.com/chuanqi129, https://github.com/Lucaskabela	2025-10-17 16:37:06 +00:00
Yuanyuan Chen	e925dfcc6b	Enable all SIM rules except disabled ones (#164645 ) `SIM` rules are useful for simplifying boolean expressions and enhancing code readability. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164645 Approved by: https://github.com/ezyang, https://github.com/mlazos	2025-10-17 07:27:11 +00:00
Yuanyuan Chen	b2953f5643	[9/N] Apply ruff UP035 rule (#165515 ) This is a follow-up to #165214 that continues applying the ruff UP035 rule to the code base. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165515 Approved by: https://github.com/Lucaskabela	2025-10-17 00:09:51 +00:00
Jeff Daily	7a97832585	[ROCm] Add more timm models, forward fix #165381 (#165569 ) PR #165381 added timm models to cuda and cpu expected accuracy files. ROCm expected accuracy files were not updated. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165569 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-10-15 18:11:21 +00:00
Yiming Zhou	47524dcc48	[benchmark] Add more timm models (#165381 ) Added following models to timm_models - [convnextv2_nano.fcmae_ft_in22k_in1k](https://huggingface.co/timm/convnextv2_nano.fcmae_ft_in22k_in1k) - [vit_base_patch14_dinov2.lvd142m](https://huggingface.co/timm/vit_base_patch14_dinov2.lvd142m) - [ViT-B-16-SigLIP-i18n-256](https://huggingface.co/timm/ViT-B-16-SigLIP-i18n-256) - [deit_tiny_patch16_224.fb_in1k](https://huggingface.co/timm/deit_tiny_patch16_224.fb_in1k) Pull Request resolved: https://github.com/pytorch/pytorch/pull/165381 Approved by: https://github.com/BoyuanFeng	2025-10-15 01:19:10 +00:00
Yiming Zhou	102b7885ff	Add option to run AOT Precompile in benchmark (#164906 ) Use the existing benchmark infra to get some signals for AOT precompile pass rate on OSS models. Here we also measure and log the loading time. ``` python ./benchmarks/dynamo/huggingface.py --accuracy --inference --aot-precompile python ./benchmarks/dynamo/timm_models.py --accuracy --inference --aot-precompile python ./benchmarks/dynamo/torchbench.py --accuracy --inference --aot-precompile ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/164906 Approved by: https://github.com/zhxchen17	2025-10-14 20:59:55 +00:00
Huy Do	5ad7611b52	Reland vision pinned commit hash update (#164492 ) Redo https://github.com/pytorch/pytorch/pull/154694 Pull Request resolved: https://github.com/pytorch/pytorch/pull/164492 Approved by: https://github.com/yangw-dev	2025-10-12 04:53:27 +00:00
Shunting Zhang	5171f14064	[inductor] verify determinism with inductor benchmark script (#164904 ) Verify the deterministic mode with torch.compile benchmark scripts. Here is what my testing script does (pasted at the end): - run a model in default mode, save its result - run the model again in default mode, but distort the benchmarking results. Compare it with the saved result. - Do the above again in deterministic mode. I tried to test a few models - BertForMaskedLM and GoogleFnet: I can repro the numeric change by distorting the benchmark result in the default mode. The non-determinism is gone in the deterministic mode - DistillGPT2: I cannot repro the numeric change by distorting the benchmarking result in the default mode. It does not surprise me much. Reduction order change does not always cause numeric change. ``` model=GoogleFnet export TORCHINDUCTOR_WRITE_ARE_DETERMINISTIC_ALGORITHMS_ENABLED=0 export TORCHINDUCTOR_FORCE_DISABLE_CACHES=1 # disable autotune cache export TORCHINDUCTOR_FX_GRAPH_REMOTE_CACHE=0 export TORCHINDUCTOR_FX_GRAPH_CACHE=0 export TORCHINDUCTOR_CACHE_DIR=/tmp/torchinductor_shunting/ export TORCHINDUCTOR_BENCHMARK_KERNEL=1 export TORCHINDUCTOR_UNIQUE_KERNEL_NAMES=1 export INDUCTOR_TEST_DISABLE_FRESH_CACHE=1 # Non deterministic mode # --float32 rather than --amp to make it easier to repro non-deterministic echo "Save results for non-deterministic mode" python benchmarks/dynamo/huggingface.py --backend inductor --float32 --accuracy --only $model --training --disable-cudagraphs --save-model-outputs-to=/tmp/saved-non-deterministic.pkl echo "Compare results with distorted benchmarking in non-deterministic mode" TORCHINDUCTOR_DISTORT_BENCHMARKING_RESULT=inverse python benchmarks/dynamo/huggingface.py --backend inductor --float32 --accuracy --only $model --training --disable-cudagraphs --compare-model-outputs-with=/tmp/saved-non-deterministic.pkl echo "Save results for deterministic mode" TORCHINDUCTOR_DETERMINISTIC=1 python benchmarks/dynamo/huggingface.py --backend inductor --float32 --accuracy --only $model --training --disable-cudagraphs --save-model-outputs-to=/tmp/saved-deterministic.pkl echo "Compare results with distorted benchmarking in deterministic mode" TORCHINDUCTOR_DETERMINISTIC=1 TORCHINDUCTOR_DISTORT_BENCHMARKING_RESULT=inverse python benchmarks/dynamo/huggingface.py --backend inductor --float32 --accuracy --only $model --training --disable-cudagraphs --compare-model-outputs-with=/tmp/saved-deterministic.pkl ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/164904 Approved by: https://github.com/jansel, https://github.com/v0i0	2025-10-12 00:03:42 +00:00
PyTorch MergeBot	d2cb183344	Revert "[inductor] verify determinism with inductor benchmark script (#164904 )" This reverts commit `a3c700656f`. Reverted https://github.com/pytorch/pytorch/pull/164904 on behalf of https://github.com/huydhn due to Sorry for reverting your PR but there seems to be some failed vLLM failures coming out of this ([comment](https://github.com/pytorch/pytorch/pull/164904#issuecomment-3388443678))	2025-10-10 06:23:07 +00:00
Laith Sakka	7f2a902ea2	more sizelike deprecation (#164889 ) remove expext_size c++ bindings and usages Pull Request resolved: https://github.com/pytorch/pytorch/pull/164889 Approved by: https://github.com/mlazos ghstack dependencies: #164884, #164885, #164886, #164887, #164888	2025-10-10 03:45:06 +00:00
Shunting Zhang	a3c700656f	[inductor] verify determinism with inductor benchmark script (#164904 ) Verify the deterministic mode with torch.compile benchmark scripts. Here is what my testing script does (pasted at the end): - run a model in default mode, save its result - run the model again in default mode, but distort the benchmarking results. Compare it with the saved result. - Do the above again in deterministic mode. I tried to test a few models - BertForMaskedLM and GoogleFnet: I can repro the numeric change by distorting the benchmark result in the default mode. The non-determinism is gone in the deterministic mode - DistillGPT2: I cannot repro the numeric change by distorting the benchmarking result in the default mode. It does not surprise me much. Reduction order change does not always cause numeric change. ``` model=GoogleFnet export TORCHINDUCTOR_WRITE_ARE_DETERMINISTIC_ALGORITHMS_ENABLED=0 export TORCHINDUCTOR_FORCE_DISABLE_CACHES=1 # disable autotune cache export TORCHINDUCTOR_FX_GRAPH_REMOTE_CACHE=0 export TORCHINDUCTOR_FX_GRAPH_CACHE=0 export TORCHINDUCTOR_CACHE_DIR=/tmp/torchinductor_shunting/ export TORCHINDUCTOR_BENCHMARK_KERNEL=1 export TORCHINDUCTOR_UNIQUE_KERNEL_NAMES=1 export INDUCTOR_TEST_DISABLE_FRESH_CACHE=1 # Non deterministic mode # --float32 rather than --amp to make it easier to repro non-deterministic echo "Save results for non-deterministic mode" python benchmarks/dynamo/huggingface.py --backend inductor --float32 --accuracy --only $model --training --disable-cudagraphs --save-model-outputs-to=/tmp/saved-non-deterministic.pkl echo "Compare results with distorted benchmarking in non-deterministic mode" TORCHINDUCTOR_DISTORT_BENCHMARKING_RESULT=inverse python benchmarks/dynamo/huggingface.py --backend inductor --float32 --accuracy --only $model --training --disable-cudagraphs --compare-model-outputs-with=/tmp/saved-non-deterministic.pkl echo "Save results for deterministic mode" TORCHINDUCTOR_DETERMINISTIC=1 python benchmarks/dynamo/huggingface.py --backend inductor --float32 --accuracy --only $model --training --disable-cudagraphs --save-model-outputs-to=/tmp/saved-deterministic.pkl echo "Compare results with distorted benchmarking in deterministic mode" TORCHINDUCTOR_DETERMINISTIC=1 TORCHINDUCTOR_DISTORT_BENCHMARKING_RESULT=inverse python benchmarks/dynamo/huggingface.py --backend inductor --float32 --accuracy --only $model --training --disable-cudagraphs --compare-model-outputs-with=/tmp/saved-deterministic.pkl ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/164904 Approved by: https://github.com/jansel, https://github.com/v0i0 ghstack dependencies: #164801, #164532	2025-10-10 00:00:58 +00:00
Boyuan Feng	90b4e130d6	[Benchmark] cleanup torchbench models (#164816 ) Prune models from TorchInductor dashboard to reduce ci cost. This PR prunes torchbench models according to the [doc](https://docs.google.com/document/d/1nLPNNAU-_M9Clx9FMrJ1ycdPxe-xRA54olPnsFzdpoU/edit?tab=t.0), which removes timm and huggingface models from torchbench. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164816 Approved by: https://github.com/anijain2305, https://github.com/seemethere, https://github.com/huydhn, https://github.com/malfet	2025-10-09 00:31:25 +00:00
Boyuan Feng	83458197d1	[Benchmark] remove old timm models from benchmark (#164805 ) Prune models from TorchInductor dashboard to reduce ci cost. This PR prunes for timm models according to the [doc](https://docs.google.com/document/d/1nLPNNAU-_M9Clx9FMrJ1ycdPxe-xRA54olPnsFzdpoU/edit?tab=t.0), which reduces from 60 to 14 models. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164805 Approved by: https://github.com/anijain2305, https://github.com/seemethere, https://github.com/huydhn, https://github.com/malfet	2025-10-08 17:14:58 +00:00
PyTorch MergeBot	1927783aa3	Revert "Reland vision pinned commit hash update (#164492 )" This reverts commit `6861a27062`. Reverted https://github.com/pytorch/pytorch/pull/164492 on behalf of https://github.com/izaitsevfb due to see autorevert msg above, inductor breakage is legit ([comment](https://github.com/pytorch/pytorch/pull/164492#issuecomment-3379537888))	2025-10-08 04:38:26 +00:00
Boyuan Feng	f76fdcaaf8	[Benchmark] cleanup huggingface models (#164815 ) Prune models from TorchInductor dashboard to reduce ci cost. This PR prunes for hugging face models according to the [doc](https://docs.google.com/document/d/1nLPNNAU-_M9Clx9FMrJ1ycdPxe-xRA54olPnsFzdpoU/edit?tab=t.0), which reduces from 46 to 27 models. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164815 Approved by: https://github.com/anijain2305, https://github.com/seemethere, https://github.com/huydhn, https://github.com/malfet	2025-10-08 03:21:04 +00:00
Huy Do	6861a27062	Reland vision pinned commit hash update (#164492 ) Redo https://github.com/pytorch/pytorch/pull/154694 Pull Request resolved: https://github.com/pytorch/pytorch/pull/164492 Approved by: https://github.com/yangw-dev	2025-10-07 22:45:05 +00:00
PyTorch MergeBot	afee8062d5	Revert "Fix mesh.get_local_rank when it is > 1d (#164473 )" This reverts commit `83d71dfb2f`. Reverted https://github.com/pytorch/pytorch/pull/164473 on behalf of https://github.com/izaitsevfb due to appears to be causing vision_maskrcnn regression ([comment](https://github.com/pytorch/pytorch/pull/164473#issuecomment-3374738997))	2025-10-07 00:37:41 +00:00
PyTorch MergeBot	5d7360bb03	Revert "Enable all SIM rules except disabled ones (#164645 )" This reverts commit `321e602692`. Reverted https://github.com/pytorch/pytorch/pull/164645 on behalf of https://github.com/izaitsevfb due to causes lint failures ([comment](https://github.com/pytorch/pytorch/pull/164645#issuecomment-3369274351))	2025-10-05 19:32:21 +00:00
Yuanyuan Chen	321e602692	Enable all SIM rules except disabled ones (#164645 ) `SIM` rules are useful for simplifying boolean expressions and enhancing code readability. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164645 Approved by: https://github.com/ezyang	2025-10-05 07:38:25 +00:00
Francisco Massa	83d71dfb2f	Fix mesh.get_local_rank when it is > 1d (#164473 ) Previously, we would not take the arguments passed by get_local_rank into account. This means that we wouldn't be able to trace this call if we had a device_mesh > 1d Pull Request resolved: https://github.com/pytorch/pytorch/pull/164473 Approved by: https://github.com/xmfan, https://github.com/Skylion007	2025-10-04 11:27:55 +00:00
Jeff Daily	412c6d28ec	[ROCm][CI] additional dynamo benchmarks for inductor-periodic (#164279 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/164279 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-10-04 00:55:17 +00:00
PyTorch MergeBot	0319556a35	Revert "[vision hash update] update the pinned vision hash (#154694 )" This reverts commit `bcafea5c92`. Reverted https://github.com/pytorch/pytorch/pull/154694 on behalf of https://github.com/yangw-dev due to break the unittest for inductor with improved, update benchmarks/dynamo/ci_expected_accuracy/inductor_torchbench_inference.csv, see failure example https://github.com/pytorch/pytorch/actions/runs/18185852421/job/51776537817 ([comment](https://github.com/pytorch/pytorch/pull/154694#issuecomment-3362285901))	2025-10-02 17:32:04 +00:00
PyTorch UpdateBot	bcafea5c92	[vision hash update] update the pinned vision hash (#154694 ) This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml). Update the pinned vision hash. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154694 Approved by: https://github.com/pytorchbot Co-authored-by: Huy Do <huydhn@gmail.com>	2025-10-02 07:02:40 +00:00
Laith Sakka	b377c9e365	graph break on tolist if capture_scalar_outputs is false (#163807 ) Addresses https://github.com/pytorch/pytorch/issues/163798. It is problematic not to graph break because: 1. It breaks the current contract. 2. Dynamo traces the call, so we then have a .item call; if we ever re-trace later, for example in autograd, we hit a failure (we do not know where to graph break at that point). See the added unit test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/163807 Approved by: https://github.com/bobrenjc93	2025-09-28 04:02:52 +00:00
Arsh Zahed	254d2864d6	Add runtime_overhead PR Time Benchmark (#163866 ) This adds a PR time benchmark that checks for runtime overhead on a very small graph. This will help track regressions in runtime overhead. Example Results: ``` runtime_overhead_inductor,instruction_count,222645 runtime_overhead_inductor_inference_mode,instruction_count,234998 runtime_overhead_inductor_requires_grad,instruction_count,293556 runtime_overhead_inductor_requires_grad_backward,instruction_count,78181 runtime_overhead_inductor_dynamic,instruction_count,234870 runtime_overhead_inductor_inference_mode_dynamic,instruction_count,248711 runtime_overhead_inductor_requires_grad_dynamic,instruction_count,309979 runtime_overhead_inductor_requires_grad_backward_dynamic,instruction_count,77599 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/163866 Approved by: https://github.com/jansel, https://github.com/mlazos, https://github.com/anijain2305	2025-09-27 03:26:59 +00:00
Yidi Wu	21a41edd4f	Add fake_impl for _native_multi_head_attention (#163700 ) Test Plan: See added test in test_export.py Differential Revision: D83099187 Pull Request resolved: https://github.com/pytorch/pytorch/pull/163700 Approved by: https://github.com/angelayi	2025-09-25 19:01:27 +00:00
angelayi	dad54ca7c0	Add mistral/gpt-oss to benchmarks (#163565 ) Potential issues * gpt-oss-20b is probably too big (I can't run on my devserver) * Mistral requires HF authentication * Mistral also takes a while to run the performance checks (need to wait for CI) Pull Request resolved: https://github.com/pytorch/pytorch/pull/163565 Approved by: https://github.com/huydhn	2025-09-24 06:12:36 +00:00
James Wu	bfe9e60ffb	Simplify PrecompileContext to no longer be a CacheArtifactManager (#162886 ) Summary: This diff does a big refactor of PrecompileContext to make it considerably simpler: instead of being a CacheArtifactManager and managing a bunch of bytes, it simply stores two things: dynamo cache entries and backend cache entries. When asked, it stitches them together into PrecompileCacheEntries, which are stored by DynamoCache. This structure then allows us to register DynamoCache to the regular Megacache API, instead of having two separate APIs that are confusing. It also lets us remove the autotune cache integration, since MegaCache API will automatically store autotune cache entries. The intent here is that users who want to use caching precompile will simply be able to use torch.compiler.save_cache_artifacts as before, just with `torch.dynamo.config.caching_precompile` set to True. They can also directly interact with PrecompileContext if they wish to specifically only load Precompile entries, using PrecompileContext.create_cache_entries(). Saving single entries and such with DynamoCache still works normally. Test Plan: All existing unit tests pass. Rollback Plan: Differential Revision: D82380307 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162886 Approved by: https://github.com/zhxchen17	2025-09-20 01:24:37 +00:00
dependabot[bot]	33e6c5a93d	[Dependabot] Update(deps): Bump transformers from 4.54.0 to 4.56.0 in /.ci/docker/ci_commit_pins (#162063 ) * [Dependabot] Update(deps): Bump transformers Bumps [transformers](https://github.com/huggingface/transformers) from 4.54.0 to 4.56.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v4.54.0...v4.56.0) --- updated-dependencies: - dependency-name: transformers dependency-version: 4.56.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * Refresh results Signed-off-by: Huy Do <huydhn@gmail.com> * Another round of updates Signed-off-by: Huy Do <huydhn@gmail.com> * Another round of update Signed-off-by: Huy Do <huydhn@gmail.com> * Hopefully the last round of update Signed-off-by: Huy Do <huydhn@gmail.com> * Plz Signed-off-by: Huy Do <huydhn@gmail.com> --------- Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Huy Do <huydhn@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Huy Do <huydhn@gmail.com>	2025-09-19 02:50:36 -07:00
Animesh Jain	ddc56f6f92	[functional] Use the saved device on storage instead for device_custom (#162987 ) Trying to reduce the number of __torch_dispatch__ calls of FakeTensorMode in the AOT metadata collection pass. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162987 Approved by: https://github.com/Lucaskabela, https://github.com/bdhirsh, https://github.com/zou3519	2025-09-18 23:43:20 +00:00
Jeff Daily	62a746f62c	[ROCm] update ci_expected_accuracy for dynamo benchmarks (#163256 ) Some tests that were already failing changed status to skipped. Some model entries were missing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/163256 Approved by: https://github.com/malfet Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-09-18 19:05:19 +00:00
Jeff Daily	c7fa16a05c	[ROCm][CI] update _rocm-test.yml based on _linux-test.yml (#163014 ) Fixes missing huggingface secrets and aligns _rocm-test.yml with other updates from _linux-test.yml that it was initially based on. Pull Request resolved: https://github.com/pytorch/pytorch/pull/163014 Approved by: https://github.com/huydhn	2025-09-16 02:14:38 +00:00
Jeff Daily	b334a5a379	[ROCm][benchmark] Add HF LLM benchmark expected accuracy (#162965 ) PR #156967 added HF LLM benchmarks but did not add the ci expected accuracy files for ROCm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162965 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-09-15 18:04:39 +00:00
angelayi	972140b7e9	[benchmark] Add HF LLM benchmarks (#156967 ) Results in https://docs.google.com/spreadsheets/d/1xXOPg9JjEmPx0zc5QBNdyXQq8-K2_r4ybHaiS-q7pZ0/edit?gid=88695043#gid=88695043 Pull Request resolved: https://github.com/pytorch/pytorch/pull/156967 Approved by: https://github.com/huydhn Co-authored-by: Huy Do <huydhn@gmail.com>	2025-09-14 07:41:06 +00:00
David Berard	cad052423b	[triton] Update 3.5 pin to 5ae38bdb0dc066c5823e34dc9797afb9de42c866 (#162821 ) Include @aakhundov's sam_fast patch, plus NVIDIA's sm88/sm110 patches (thanks @nWEIdia) Pull Request resolved: https://github.com/pytorch/pytorch/pull/162821 Approved by: https://github.com/atalman	2025-09-12 18:34:22 +00:00
atalman	e8eeb06034	Move inductor jobs 3.9->3.10 (#162323 ) Related to: https://github.com/pytorch/pytorch/issues/161167 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162323 Approved by: https://github.com/huydhn, https://github.com/Skylion007 Co-authored-by: Huy Do <huydhn@gmail.com>	2025-09-12 03:43:06 +00:00
PyTorch MergeBot	23170dfebc	Revert "Move inductor jobs 3.9->3.10 (#162323 )" This reverts commit `0663bdb123`. Reverted https://github.com/pytorch/pytorch/pull/162323 on behalf of https://github.com/huydhn due to Not sure what had happened, but some inductor unit tests start failing after this lands ([comment](https://github.com/pytorch/pytorch/pull/162323#issuecomment-3278125192))	2025-09-11 05:57:13 +00:00
atalman	0663bdb123	Move inductor jobs 3.9->3.10 (#162323 ) Related to: https://github.com/pytorch/pytorch/issues/161167 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162323 Approved by: https://github.com/huydhn, https://github.com/Skylion007	2025-09-10 20:58:41 +00:00
PyTorch MergeBot	e1f0a69943	Revert "test fixing benchmarks (#162503 )" This reverts commit `484c4093a8`. Reverted https://github.com/pytorch/pytorch/pull/162503 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it regresses CPU perf smoke test ([comment](https://github.com/pytorch/pytorch/pull/162503#issuecomment-3273554680))	2025-09-10 06:55:35 +00:00
angelayi	484c4093a8	test fixing benchmarks (#162503 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/162503 Approved by: https://github.com/huydhn ghstack dependencies: #160741	2025-09-10 03:15:49 +00:00
David Berard	3f5993316e	[upstream triton] update triton pin to triton 3.5 (#162278 ) Update PyTorch to the latest Triton release candidate branch (release/3.5.x in triton-lang/triton) Notably: * this does not include the version number bump from 3.4 -> 3.5 (we'll do that in a follow-up PR) * sam_fast is still failing, so we've disabled it temporarily https://github.com/pytorch/pytorch/issues/162282 and we are committed to fixing it, ideally before the branch cut but possibly as a cherry-pick into the release branch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162278 Approved by: https://github.com/atalman ghstack dependencies: #162244, #162309	2025-09-08 14:29:24 +00:00
Animesh Jain	e9481b6617	[dynamo] Prevent unnecessary recompile on disabled functions in the compiled frame (#161883 ) Trying out a re-impl of https://github.com/pytorch/pytorch/pull/160934 The above PR led to OOM, most likely because of the cache holding to a nested function (which if not held in the cache would have been garbage collected), which holds on to cuda tensors in its closure. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161883 Approved by: https://github.com/jansel	2025-09-02 01:13:48 +00:00
PyTorch MergeBot	9b67d8e344	Revert "[RELAND] Close some sources of fake tensor leakage (#161589 )" This reverts commit `5790b00975`. Reverted https://github.com/pytorch/pytorch/pull/161589 on behalf of https://github.com/atalman due to [GH job link](https://github.com/pytorch/pytorch/actions/runs/17305150611/job/49128381649) [HUD commit link](`5790b00975`) ([comment](https://github.com/pytorch/pytorch/pull/161589#issuecomment-3235224249))	2025-08-28 23:19:36 +00:00


1215 Commits