pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 00:20:18 +01:00

Author	SHA1	Message	Date
Wang, Chuanqi	b09fb481e0	[CD] Upgrade GCC version to 13 for XPU build (#162474 ) Follow #152426 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162474 Approved by: https://github.com/zxiiro, https://github.com/atalman	2025-10-31 21:15:37 +00:00
Jeff Daily	239e7b541a	[ROCm][CI] upgrade nightly wheels to ROCm 7.1 (#166730 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/166730 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-10-31 17:30:47 +00:00
Jeff Daily	24e94e021a	[ROCm][CI] create ROCm 7.1 magma tarball (#166693 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/166693 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-10-31 15:20:00 +00:00
Xuehai Pan	69be99ee51	Remove manually synced arch versions in `tools/nightly.py` (#166616 ) Discussed with @atalman offline. To reduce duplicate changes and reduce the number of files to change when updating arch versions. ------ Pull Request resolved: https://github.com/pytorch/pytorch/pull/166616 Approved by: https://github.com/ezyang	2025-10-31 15:11:28 +00:00
Wang, Chuanqi	0d3a4f7155	[CD] Enable Inductor performance test for xpu (#166289 ) Add Dynamo benchmark performance tests for XPU backend Pull Request resolved: https://github.com/pytorch/pytorch/pull/166289 Approved by: https://github.com/EikanWang, https://github.com/atalman	2025-10-31 10:52:07 +00:00
Jeff Daily	1129605415	[ROCm][CI] create ROCm 7.1 images for binary builds (#166665 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/166665 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-10-31 02:52:37 +00:00
amdfaa	a7fd0b4001	[ROCm][CI] fix disk space message (#166645 ) Fixes diskspace cutoff to say that the machine does not have difference=100 - diskspace_cutoff_int space available. Pull Request resolved: https://github.com/pytorch/pytorch/pull/166645 Approved by: https://github.com/jeffdaily	2025-10-30 19:38:34 +00:00
PyTorch UpdateBot	f20bf77874	[audio hash update] update the pinned audio hash (#166597 ) This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml). Update the pinned audio hash. Pull Request resolved: https://github.com/pytorch/pytorch/pull/166597 Approved by: https://github.com/pytorchbot	2025-10-30 04:28:30 +00:00
amdfaa	0187db88d4	[ROCm][CI] Create periodic-rocm-mi200.yml (#166544 ) * We are separating out the rocm jobs of the periodic workflow * We are introducing a new label `ciflow/periodic-rocm-mi200` to allow us to run distributed tests only on ROCm runners, without triggering many other jobs on the `periodic.yml` workflow (via `ciflow/periodic`) * This new workflow will also be triggered via the `ciflow/periodic`, thus maintaining the old status quo. * We are reverting to the `linux.rocm.gpu.4` label since it targets a lot more CI nodes at this point than the K8s/ARC-based `linux.rocm.gpu.mi250.4` label, as that is still having some network/scaling issues. Pull Request resolved: https://github.com/pytorch/pytorch/pull/166544 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-10-30 02:08:07 +00:00
Andrey Talman	82ff07c788	Add py 3.14 CI docker build pytorch-linux-jammy-py3.14-clang12 (#164791 ) Related to https://github.com/pytorch/pytorch/issues/156856 Pull Request resolved: https://github.com/pytorch/pytorch/pull/164791 Approved by: https://github.com/huydhn, https://github.com/malfet, https://github.com/albanD	2025-10-29 22:21:22 +00:00
Aaron Gokaslan	96b61844a7	[BE]: Update nvshmem to 3.4.5 (#164046 ) Release notes can be found here: https://docs.nvidia.com/nvshmem/release-notes-install-guide/release-notes/release-3405.html main difference is the addition of a CPU assisted IBGDA fallback which should allow NVSHMEM IBGDA to work on way more systems without admin intervention and without using GDRCopy. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164046 Approved by: https://github.com/ezyang, https://github.com/kwen2501	2025-10-29 07:32:05 +00:00
etaf	1b655a87ef	[xpu][test] Enable more UTs for Intel GPU. (#166047 ) This PR enables additional Inductor unit tests for Intel GPU. Due to the increased number of test cases, the number of runners has been extended from 8 to 12 to prevent CI timeouts. Pull Request resolved: https://github.com/pytorch/pytorch/pull/166047 Approved by: https://github.com/jansel Co-authored-by: Deng, Daisy <daisy.deng@intel.com> Co-authored-by: Jason Ansel <jansel@jansel.net>	2025-10-29 06:25:36 +00:00
fffrog	cb6966704c	Add merge rule for PrivateUse1 Module (#166394 ) Add merge rights for the following people: - albanD - fffrog Pull Request resolved: https://github.com/pytorch/pytorch/pull/166394 Approved by: https://github.com/ezyang	2025-10-29 06:13:44 +00:00
PyTorch UpdateBot	5849eea129	[vision hash update] update the pinned vision hash (#166356 ) This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml). Update the pinned vision hash. Pull Request resolved: https://github.com/pytorch/pytorch/pull/166356 Approved by: https://github.com/pytorchbot	2025-10-29 04:14:16 +00:00
Ting Lu	544b443ea1	[CD] Upgrade to CUDA 13.0.2 for nightly binaries (#165470 ) 13.0.U2 is posted, adding to nightlies Why we want to upgrade: CUDA 13.0.U2 included a new release from cuBLAS that 1. Enabled opt-in fixed-point emulation for FP64 matmuls (D/ZGEMM) which improves performance and power-efficiency. 2. Improved performance on NVIDIA [DGX Spark](https://www.nvidia.com/en-us/products/workstations/dgx-spark/) for FP16/BF16 and FP8 GEMMs. 3. adds BF16x9 FP32 emulation support for SYRK and HERK routines. Reference: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cublas-release-13-0-update-2 Pull Request resolved: https://github.com/pytorch/pytorch/pull/165470 Approved by: https://github.com/atalman	2025-10-28 15:14:43 +00:00
PyTorch MergeBot	74336f8c77	Revert "[CD] Upgrade to CUDA 13.0.2 for nightly binaries (#165470 )" This reverts commit `5e769ff867`. Reverted https://github.com/pytorch/pytorch/pull/165470 on behalf of https://github.com/atalman due to Sorry reverting for now, to restore trunk health ([comment](https://github.com/pytorch/pytorch/pull/165470#issuecomment-3454166879))	2025-10-28 02:21:48 +00:00
Ting Lu	5e769ff867	[CD] Upgrade to CUDA 13.0.2 for nightly binaries (#165470 ) 13.0.U2 is posted, adding to nightlies Why we want to upgrade: CUDA 13.0.U2 included a new release from cuBLAS that 1. Enabled opt-in fixed-point emulation for FP64 matmuls (D/ZGEMM) which improves performance and power-efficiency. 2. Improved performance on NVIDIA [DGX Spark](https://www.nvidia.com/en-us/products/workstations/dgx-spark/) for FP16/BF16 and FP8 GEMMs. 3. adds BF16x9 FP32 emulation support for SYRK and HERK routines. Reference: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cublas-release-13-0-update-2 Pull Request resolved: https://github.com/pytorch/pytorch/pull/165470 Approved by: https://github.com/atalman	2025-10-28 00:21:47 +00:00
PyTorch UpdateBot	4295a9a158	[xla hash update] update the pinned xla hash (#165895 ) This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml). Update the pinned xla hash. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165895 Approved by: https://github.com/pytorchbot	2025-10-27 11:47:29 +00:00
Richard Zou	b6a4236e5d	[label_to_label] minor updates (#166172 ) vllm-compile implies "module: vllm" and "oncall: pt2". The volume of issues in Flex -> HigherOrderOperators is too noisy, plus we have a different set of folks looking at each, so I'm going to make that not automatic anymore. We can still manually label flex issues as higher order operator issues. Pull Request resolved: https://github.com/pytorch/pytorch/pull/166172 Approved by: https://github.com/angelayi	2025-10-24 22:47:23 +00:00
Yang Wang	fdcf402d82	vllm test build (#166146 ) Fix the vllm test build, which is broken due to the flashinfer dependency. Pull Request resolved: https://github.com/pytorch/pytorch/pull/166146 Approved by: https://github.com/huydhn	2025-10-24 19:18:10 +00:00
Huy Do	b146ea411e	Save GitHub env variables on ROCm (#165821 ) As `.github/actions/setup-rocm/action.yml` is now used on `linux_job_v2` to setup ROCm, we need to have this step here to save the list of GitHub env variables. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165821 Approved by: https://github.com/atalman	2025-10-23 22:13:37 +00:00
Catherine Lee	0977cc4474	[lint] Extend workflowsync linter to more files (#166082 ) And fix the lint issues found Pull Request resolved: https://github.com/pytorch/pytorch/pull/166082 Approved by: https://github.com/izaitsevfb, https://github.com/atalman	2025-10-23 20:29:29 +00:00
PyTorch UpdateBot	b1eb6dede5	[vision hash update] update the pinned vision hash (#166046 ) This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml). Update the pinned vision hash. Pull Request resolved: https://github.com/pytorch/pytorch/pull/166046 Approved by: https://github.com/pytorchbot	2025-10-23 04:27:44 +00:00
Catherine Lee	e7592f4005	[CI] Move the periodic debug tests to newer runner (#165158 ) Previously g3 = NVIDIA Tesla M60 Now g6 = NVIDIA L4 Also change cuda arch list accordingly Pros: More memory, newer GPU Cons: That was one of the few remaining tests on g3 runners, so we probably lost coverage? We can probably run more tests in parallel now but I'm not going to do that here Disabled a bunch of sparse tests and nestedtensor tests that were previously skipped due to not having sufficient hardware? They are now failing with ``` Traceback (most recent call last): File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3293, in wrapper method(args, kwargs) File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3292, in wrapper with policy(): File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2532, in __enter__ self.beforeStreams[-1].synchronize() File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/streams.py", line 105, in synchronize super().synchronize() torch.AcceleratorError: CUDA error: device-side assert triggered Search for `cudaErrorAssert' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information. CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1 Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions. 
Exception raised from stream_synchronize at /var/lib/jenkins/workspace/c10/cuda/CUDAFunctions.h:120 (most recent call first): C++ CapturedTraceback: #4 std::_Function_handler<std::shared_ptr<c10::LazyValue<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const> (), c10::SetStackTraceFetcher(std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) from Logging.cpp:0 #5 c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) from ??:0 #6 c10::cuda::c10_cuda_check_implementation(int, char const, char const, unsigned int, bool) [clone .cold] from CUDAException.cpp:0 #7 THCPStream_synchronize(_object, _object*) from Stream.cpp:0 #8 cfunction_vectorcall_NOARGS from /usr/local/src/conda/python-3.10.14/Objects/methodobject.c:489 #9 _PyObject_VectorcallTstate from /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114 #10 _PyEval_EvalFrame from /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46 #11 _PyObject_VectorcallTstate from /usr/local/src/conda/python-3.10.14/Include/cpython/abstract.h:114 #12 _PyEval_EvalFrame from /usr/local/src/conda/python-3.10.14/Include/internal/pycore_ceval.h:46 ``` when run with cuda launch blocking I got a ton of stuff like ``` /var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [5,3,0], thread: [2,7,0] Assertion `value < upper_bound` failed. /var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [5,3,0], thread: [3,7,0] Assertion `value < upper_bound` failed. 
/var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [3,8,0], thread: [0,0,0] Assertion `value < upper_bound` failed. /var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [3,8,0], thread: [1,0,0] Assertion `value < upper_bound` failed. /var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [3,8,0], thread: [2,0,0] Assertion `value < upper_bound` failed. /var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [3,8,0], thread: [3,0,0] Assertion `value < upper_bound` failed. /var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [3,8,0], thread: [0,1,0] Assertion `value < upper_bound` failed. /var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [3,8,0], thread: [1,1,0] Assertion `value < upper_bound` failed. /var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [3,8,0], thread: [3,1,0] Assertion `value < upper_bound` failed. 
/var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [3,8,0], thread: [0,2,0] Assertion `value < upper_bound` failed. /var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [3,8,0], thread: [2,2,0] Assertion `value < upper_bound` failed. /var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [3,8,0], thread: [3,2,0] Assertion `value < upper_bound` failed. /var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [3,8,0], thread: [0,3,0] Assertion `value < upper_bound` failed. /var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [3,8,0], thread: [1,3,0] Assertion `value < upper_bound` failed. /var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [3,8,0], thread: [1,4,0] Assertion `value < upper_bound` failed. /var/lib/jenkins/workspace/third_party/cutlass/include/cutlass/integer_subbyte.h:124: cutlass::integer_subbyte<Bits, Signed>::integer_subbyte(unsigned int) [with int Bits = 2; __nv_bool Signed = false]: block: [3,8,0], thread: [3,4,0] Assertion `value < upper_bound` failed. 
``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/165158 Approved by: https://github.com/seemethere	2025-10-21 21:28:12 +00:00
Ivan Zaitsev	5b35fc8777	Support multiple commits on push events in trunk tagging workflow (#165937 ) Context: * this workflow is used to create tags like `trunk/{sha}` for all `main` commits * those tags are used by [autorevert](https://github.com/pytorch/test-infra/blob/main/aws/lambda/pytorch-auto-revert/README.md) to rerun selected workflows Problem: currently the workflow creates only a single tag per push event, while ghstack pushes multiple commits per single push. This PR supports tag creation for all commits in the push event. Complementary autorevert PR: https://github.com/pytorch/test-infra/pull/7291 --- ### Testing I created an identical copy of this workflow in my personal repo: https://github.com/izaitsevfb/pr-head-test/actions/workflows/trunk-tagging.yml See action runs there. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165937 Approved by: https://github.com/huydhn	2025-10-21 20:52:34 +00:00
Wang, Chuanqi	292454942e	[CD] Introduce windows.12xlarge runners for CD Windows build (#165287 ) Follows https://github.com/pytorch/test-infra/pull/7174. Windows CD build time comparison (runner: cpu / cuda / xpu): windows.4xlarge: 1.5h / 4.0h / 5.5h; windows.12xlarge: 0.5h / 1.5h / 2.5h. Fixes #162962 Pull Request resolved: https://github.com/pytorch/pytorch/pull/165287 Approved by: https://github.com/zxiiro, https://github.com/malfet, https://github.com/seemethere	2025-10-21 18:28:23 +00:00
PyTorch MergeBot	21131a2444	Revert "[ROCm][CI] Update rocm.yml workflow to use 1 GPU ARC runners (#165481 )" This reverts commit `ffa90d46e6`. Reverted https://github.com/pytorch/pytorch/pull/165481 on behalf of https://github.com/jeffdaily due to timeouts after merge ([comment](https://github.com/pytorch/pytorch/pull/165481#issuecomment-3426898171))	2025-10-21 14:15:55 +00:00
amdfaa	ffa90d46e6	[ROCm][CI] Update rocm.yml workflow to use 1 GPU ARC runners (#165481 ) * Moving rocm.yml from using persistent non-ARC runners from the combined MI2xx (MI210 + MI250) cluster to the ARC runners from the MI250 cluster. This halves the number of nodes, but provides access to approximately 4 times the runners, since every 8-GPU MI250 node now provides 8 1-GPU runners. This should help with concurrent capacity and queueing on the MI2xx jobs. Tested here successfully: https://github.com/pytorch/pytorch/actions/runs/18620814622/job/53092469720 Pull Request resolved: https://github.com/pytorch/pytorch/pull/165481 Approved by: https://github.com/jeffdaily Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>	2025-10-21 04:02:04 +00:00
Jithun Nair	70592c6819	[ROCm][CI] Move gfx1100 workflows to own yaml file (#165699 ) This should allow us to move gfx1100 workflow to a lower frequency and also allow it to be triggered on PRs via a dedicated label, for any PRs that target Navi fixes such as [this](https://github.com/pytorch/pytorch/pull/165630) or [this](https://github.com/pytorch/pytorch/pull/165625). Pull Request resolved: https://github.com/pytorch/pytorch/pull/165699 Approved by: https://github.com/jeffdaily	2025-10-20 23:52:48 +00:00
PyTorch MergeBot	4f7f43253d	Revert "[ROCm][CI] Update rocm.yml workflow to use 1 GPU ARC runners (#165481 )" This reverts commit `8700d68fef`. Reverted https://github.com/pytorch/pytorch/pull/165481 on behalf of https://github.com/malfet due to Broke lint somehow, see `8f06a1308f/1` ([comment](https://github.com/pytorch/pytorch/pull/165481#issuecomment-3423642456))	2025-10-20 20:39:56 +00:00
amdfaa	8700d68fef	[ROCm][CI] Update rocm.yml workflow to use 1 GPU ARC runners (#165481 ) * Moving rocm.yml from using persistent non-ARC runners from the combined MI2xx (MI210 + MI250) cluster to the ARC runners from the MI250 cluster. This halves the number of nodes, but provides access to approximately 4 times the runners, since every 8-GPU MI250 node now provides 8 1-GPU runners. This should help with concurrent capacity and queueing on the MI2xx jobs. Tested here successfully: https://github.com/pytorch/pytorch/actions/runs/18620814622/job/53092469720 Pull Request resolved: https://github.com/pytorch/pytorch/pull/165481 Approved by: https://github.com/jeffdaily, https://github.com/pruthvistony, https://github.com/albanD Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>	2025-10-20 16:06:37 +00:00
Jithun Nair	2705937080	[CI] Add rocm CI back to trunk for pre-submit/PR jobs (#165674 ) Only adding single-GPU shards for now, to observe how current capacity handles it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165674 Approved by: https://github.com/jeffdaily	2025-10-20 12:14:06 +00:00
Nikita Shulga	5d62b63a76	[BE] Use Python-3.14 GE build (#165804 ) 3.14 reached general availability on Oct 7th 2025, so we can remove all pre-release workarounds Pull Request resolved: https://github.com/pytorch/pytorch/pull/165804 Approved by: https://github.com/yangw-dev, https://github.com/Skylion007, https://github.com/cyyever	2025-10-19 11:45:10 +00:00
PyTorch UpdateBot	e939651972	[audio hash update] update the pinned audio hash (#165807 ) This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml). Update the pinned audio hash. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165807 Approved by: https://github.com/pytorchbot	2025-10-19 04:45:20 +00:00
Huy Do	9095a9dfae	[CD] Apply the fix from #162455 to aarch64+cu129 build (#165794 ) When trying to bring cu129 back in https://github.com/pytorch/pytorch/pull/163029, I mainly looked at https://github.com/pytorch/pytorch/pull/163029 and missed another tweak coming from https://github.com/pytorch/pytorch/pull/162455. I discovered this issue when testing aarch64+cu129 builds in https://github.com/pytorch/test-infra/actions/runs/18603342105/job/53046883322?pr=7373. Surprisingly, there is no test running for the aarch64 CUDA build from what I see in `79a37055e7`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165794 Approved by: https://github.com/malfet	2025-10-18 04:16:24 +00:00
drisspg	fe80f03726	Add B200 files to labeler and update codeowners (#165767 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/165767 Approved by: https://github.com/slayton58	2025-10-17 23:24:17 +00:00
Nikita Shulga	6ece527fc5	[CI] Add aarch64 operator benchmark (#165585 ) Running on Graviton4 Skip ConvTranspose1d benchmarks if PyTorch is compiled with ACL, due to https://github.com/pytorch/pytorch/issues/165654 Pull Request resolved: https://github.com/pytorch/pytorch/pull/165585 Approved by: https://github.com/huydhn	2025-10-17 14:42:14 +00:00
Yuanyuan Chen	e925dfcc6b	Enable all SIM rules except disabled ones (#164645 ) `SIM` rules are useful for simplifying boolean expressions and enhancing code readability. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164645 Approved by: https://github.com/ezyang, https://github.com/mlazos	2025-10-17 07:27:11 +00:00
Shangdi Yu	d82527b32a	[Windows] Add AOTI cross-compilation CI (#165573 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/165573 Approved by: https://github.com/malfet ghstack dependencies: #165560	2025-10-17 01:05:35 +00:00
Wei Wang	d7e275d4b4	[CI][CUDA] Add periodic b200 distributed job (#159323 ) 1. Run the distributed job on a B200 runner, periodically. 2. Discovered a generic distributed test issue: certain unit tests hard-code ranks, which calls for the require_exact_world_size(world_size) API instead of require_world_size(world_size). Pull Request resolved: https://github.com/pytorch/pytorch/pull/159323 Approved by: https://github.com/eqy Co-authored-by: Aidyn-A <aidyn.b.aitzhan@gmail.com>	2025-10-16 21:54:04 +00:00
Jithun Nair	d5db3aee0d	[CI] Use 1-GPU runners for rocm-mi355.yml (#165658 ) Should only need 1-GPU runners for rocm-mi355.yml since it runs `default` test config which only needs 1 GPU Pull Request resolved: https://github.com/pytorch/pytorch/pull/165658 Approved by: https://github.com/jeffdaily	2025-10-16 21:53:22 +00:00
Maggie Moss	d795fb225a	[RFC] Add pyrefly to lintrunner (#165179 ) This will add pyrefly to lintrunner as a warning only, and allow us to collect feedback about the tool before switching to pyrefly as the main type checker. References the steps outlined here: https://github.com/pytorch/pytorch/issues/163283 Test plan: run `lintrunner init`, then `lintrunner`, and confirm that when pyrefly errors are present the results look like: https://gist.github.com/maggiemoss/e6cb2d015dd1ded560ae1329098cf33f Pull Request resolved: https://github.com/pytorch/pytorch/pull/165179 Approved by: https://github.com/ezyang	2025-10-16 20:07:09 +00:00
Huy Do	6dedd34c31	[CD] Skip 12.9 build on Windows (#165665 ) Per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/165665 Approved by: https://github.com/Camyll, https://github.com/malfet	2025-10-16 19:11:27 +00:00
Thanh Ha	85586d7efc	Make c7i the default for _linux-build.yml (#164747 ) Use linux.c7i.2xlarge as the default runner for the _linux-build.yml workflow. In testing we found that switching from c5 to c7i gives 15-20% faster build times, despite c7i costing 5% more. This should reduce costs of jobs using _linux-build.yml. Relates to pytorch/test-infra#7175. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164747 Approved by: https://github.com/atalman	2025-10-16 17:37:51 +00:00
Nikita Shulga	23fb7e9f4b	[CI] Add arch prefix in front of op benchmark results (#165584 ) To be able to run x86 and aarch64 benchmarks later on Pull Request resolved: https://github.com/pytorch/pytorch/pull/165584 Approved by: https://github.com/huydhn ghstack dependencies: #165583	2025-10-16 01:50:52 +00:00
Huy Do	c2bd41ac9f	Build vLLM nightly wheels for CUDA 13.0 (#163239 ) Now that https://github.com/vllm-project/vllm/pull/24599 has been merged Pull Request resolved: https://github.com/pytorch/pytorch/pull/163239 Approved by: https://github.com/malfet, https://github.com/atalman	2025-10-16 01:03:26 +00:00
Nikita Shulga	7e6721fb0a	[BE] Remove confusing `opbenchmark-on-demand-build` (#165583 ) As it doesn't have a test shard, so what's the point of running the build? Was added in https://github.com/pytorch/pytorch/pull/143733 and it looks like a test shard never existed for it. Moreover, allow one to specify benchmark size as an argument, so one technically can do a workflow dispatch with different opbenchmark sizes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165583 Approved by: https://github.com/huydhn	2025-10-15 23:48:28 +00:00
PyTorch UpdateBot	59d30d1b75	[vision hash update] update the pinned vision hash (#165496 ) This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml). Update the pinned vision hash. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165496 Approved by: https://github.com/pytorchbot	2025-10-15 04:35:50 +00:00
PyTorch UpdateBot	3915898c22	[audio hash update] update the pinned audio hash (#165495 ) This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/main/.github/workflows/nightly.yml). Update the pinned audio hash. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165495 Approved by: https://github.com/pytorchbot	2025-10-15 04:32:49 +00:00
Jean Schmidt	1ec0755a7e	[ISSUES] Update ci:sev template to include a note about ci: disable-autorevert label (#165459 ) We noticed that disabling autorevert in any and all ci:sevs is too impactful, as ci: sevs are sometimes created just to communicate an action or an impactful change. Also, sometimes during a SEV we might not want to disable autorevert anyway; an example is a ci: sev impacting jobs we don't use as a basis for autorevert. So, a note is added reminding the ci:sev author to optionally add this tag to disable auto-revert. Note: using this opportunity to fix the ci: disable-autorevert issues, as it is best for the title to be simple and the displayed message in the GitHub interface to be decorated with emoji :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/165459 Approved by: https://github.com/malfet	2025-10-14 20:32:46 +00:00
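
Commit 5b35fc8777 in the log above describes creating one `trunk/{sha}` tag per commit in a push event instead of a single tag per push. The idea can be sketched as follows. This is an illustration only, not the workflow's actual code: the function name `trunk_tags_for_push` and the example payload are hypothetical, and the sketch assumes the push event payload carries a `commits` array whose entries hold the commit SHA in an `id` field, as GitHub's push webhook payload does.

```python
from typing import Any


def trunk_tags_for_push(payload: dict[str, Any]) -> list[str]:
    """Return one `trunk/{sha}` tag name per commit in a push event payload.

    Hypothetical helper: assumes `payload["commits"]` is a list of objects
    with an `id` field holding the commit SHA.
    """
    tags: list[str] = []
    for commit in payload.get("commits", []):
        tag = f"trunk/{commit['id']}"
        # Skip duplicates so each commit is tagged at most once.
        if tag not in tags:
            tags.append(tag)
    return tags


# Made-up payload standing in for a ghstack push that lands three commits at once.
example_payload = {
    "ref": "refs/heads/main",
    "commits": [{"id": "a" * 40}, {"id": "b" * 40}, {"id": "c" * 40}],
}

for tag in trunk_tags_for_push(example_payload):
    print(tag)
```

A single-commit push is just the special case where the list has one entry, so the same loop covers both the old and the new behaviour.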

4513 Commits
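
Commit 85586d7efc in the log above argues that moving `_linux-build.yml` from c5 to c7i runners should reduce job cost even though c7i costs 5% more, because builds are 15-20% faster. The arithmetic can be checked with a minimal sketch under two assumptions that the commit message does not state: "15-20% faster" means wall-clock build time drops by 15-20%, and job cost is proportional to price multiplied by run time. The function name `relative_job_cost` is hypothetical.

```python
def relative_job_cost(price_increase: float, time_reduction: float) -> float:
    """Cost of a job on the new runner relative to the old one.

    Assumes cost = hourly price * run time, so the ratio is
    (1 + price_increase) * (1 - time_reduction).
    """
    return (1.0 + price_increase) * (1.0 - time_reduction)


# 5% higher price, build time reduced by 15% and by 20% (assumed reading).
for reduction in (0.15, 0.20):
    ratio = relative_job_cost(0.05, reduction)
    print(f"{reduction:.0%} shorter build: {ratio:.4f}x the c5 cost")
```

Under these assumptions a job costs roughly 0.84x to 0.89x of what it did on c5, which is consistent with the commit's claim of reduced cost.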