pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Yiming Zhou	6eb8d9671b	Enable torch.nn.functional.batch_norm in test_export_opinfo (#164261) Summary: There are actually 2 `nn.functional.batch_norm` in op_db. See https://github.com/pytorch/pytorch/blob/main/torch/testing/_internal/common_methods_invocations.py#L16797-L16831 So previously the test failed at `assert len(ops)==1` Test Plan: python test/export/test_export_opinfo.py TestExportOnFakeCudaCUDA Differential Revision: D83581427 Pull Request resolved: https://github.com/pytorch/pytorch/pull/164261 Approved by: https://github.com/SherlockNoMad	2025-10-01 21:56:08 +00:00
Yiming Zhou	937869657e	Exporting aten.sdpa with cuda under fake mode on a cuda-less machine (#164162) Summary: As titled. sdpa will select backend based on hardware check, and it fails when exporting with cuda under fake mode on a cuda-less machine. We guard `at::cuda::is_available()` check before `at::cuda::getCurrentDeviceProperties()` and give warnings. Test Plan: buck2 run mode/dev-nosan caffe2/test:test_export -- -r nn_functional_scaled_dot_product_attention Differential Revision: D83496154 Pull Request resolved: https://github.com/pytorch/pytorch/pull/164162 Approved by: https://github.com/SherlockNoMad	2025-09-30 17:21:31 +00:00
Sherlock Huang	3564cd294c	Fix TestExportOpInfo (#164184) Fixes https://github.com/pytorch/pytorch/issues/163699 Pull Request resolved: https://github.com/pytorch/pytorch/pull/164184 Approved by: https://github.com/yiming0416, https://github.com/tugsbayasgalan	2025-09-30 16:12:39 +00:00
Yiming Zhou	2a45f30ae7	Exporting aten.conv with cuda under fake mode on a cuda-less machine (#163912) Summary: Improve op coverage of exporting a CUDA model on a CPU-only machine under fake tensor mode. For `torch.nn.functional.conv2d`, it will `_select_conv_backend` based on input and weight shapes. When calling into `supportsDepthwiseConvolutionWithCuDNN()`, it calls `at::cuda::getCurrentDeviceProperties()` and fails on a CPU-only machine. So we check if CUDA is actually enabled first. Test Plan: TORCH_SHOW_CPP_STACKTRACES=1 buck2 run fbcode//caffe2/test:test_export -- --r nn_functional_conv2d Reviewed By: angelayi, henryoier Differential Revision: D80562984 Pull Request resolved: https://github.com/pytorch/pytorch/pull/163912 Approved by: https://github.com/SherlockNoMad	2025-09-26 06:04:20 +00:00
Sherlock Huang	6f9aef5fef	[2/n] Support module.to("cuda:0") in FakeTensorMode on cuda-less machine (#163433) Summary: To support exporting a cuda model on a CPU-only machine under fake tensor mode. User commonly need to move sample inputs to the cuda device with .to("cuda:0") or .to("cuda") call. This diff supports this. I expect the following pattern to work ``` with FakeTensorMode(allow_non_fake_inputs=True): cuda_module = module.to("cuda:0") cuda_sample_inputs = tuple([x.to("cuda:0") for x in sample_inputs]) with torch.no_grad(): ep = torch.export.export(cuda_module, cuda_sample_inputs) ``` Before: moving module.to("cuda:0") under fake tensor mode would have parameter on `meta` device. After: parameters would be on "cuda:0". Test Plan: buck2 run fbcode//caffe2/test:fake_tensor -- --r test_move_module Reviewed By: mikaylagawarecki Differential Revision: D80102876 Pull Request resolved: https://github.com/pytorch/pytorch/pull/163433 Approved by: https://github.com/albanD	2025-09-22 20:16:32 +00:00
Sherlock Huang	033b7d1e1a	[Reland] Return NoOpDeviceGuardImpl in replace of CudaDeviceGuard when device is not available (#163187) Reland of #160532 Summary: To support exporting a cuda model on a CPU-only machine under fake tensor mode. User commonly need to move sample inputs to the cuda device with .to("cuda:0") or .to("cuda") call. This diff supports this. I expect the following pattern to work ``` with FakeTensorMode(allow_non_fake_inputs=True): cuda_module = module.to("cuda:0") cuda_sample_inputs = tuple([x.to("cuda:0") for x in sample_inputs]) with torch.no_grad(): ep = torch.export.export(cuda_module, cuda_sample_inputs) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/163016 Approved by: https://github.com/huydhn Pull Request resolved: https://github.com/pytorch/pytorch/pull/163187 Approved by: https://github.com/angelayi	2025-09-18 04:46:26 +00:00
PyTorch MergeBot	79fd497423	Revert "[Reland] Return NoOpDeviceGuardImpl in replace of CudaDeviceGuard when device is not available, or cpu-only build (#163016)" This reverts commit `f1eb99e2e4`. Reverted https://github.com/pytorch/pytorch/pull/163016 on behalf of https://github.com/jeffdaily due to broke rocm CI, see export/test_export_opinfo.py::TestExportOnFakeCudaCUDA::test_fake_export_nonzero_cuda_float32 [GH job link](https://github.com/pytorch/pytorch/actions/runs/17787208381/job/50564369696) [HUD commit link](`f1eb99e2e4`) ([comment](https://github.com/pytorch/pytorch/pull/163016#issuecomment-3303707552))	2025-09-17 16:17:53 +00:00
Sherlock Huang	f1eb99e2e4	[Reland] Return NoOpDeviceGuardImpl in replace of CudaDeviceGuard when device is not available, or cpu-only build (#163016) Reland of #160532 Summary: To support exporting a cuda model on a CPU-only machine under fake tensor mode. User commonly need to move sample inputs to the cuda device with .to("cuda:0") or .to("cuda") call. This diff supports this. I expect the following pattern to work ``` with FakeTensorMode(allow_non_fake_inputs=True): cuda_module = module.to("cuda:0") cuda_sample_inputs = tuple([x.to("cuda:0") for x in sample_inputs]) with torch.no_grad(): ep = torch.export.export(cuda_module, cuda_sample_inputs) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/163016 Approved by: https://github.com/huydhn	2025-09-17 05:01:33 +00:00
PyTorch MergeBot	9c93dc8123	Revert "Return NoOpDeviceGuardImpl in replace of CudaDeviceGuard when device is not available, or cpu-only build (#160532)" This reverts commit `a956c4ab1c`. Reverted https://github.com/pytorch/pytorch/pull/160532 on behalf of https://github.com/huydhn due to Reverted internally ([comment](https://github.com/pytorch/pytorch/pull/160532#issuecomment-3287745165))	2025-09-13 07:42:12 +00:00
Sherlock Huang	a956c4ab1c	Return NoOpDeviceGuardImpl in replace of CudaDeviceGuard when device is not available, or cpu-only build (#160532) Summary: To support exporting a cuda model on a CPU-only machine under fake tensor mode. User commonly need to move sample inputs to the cuda device with .to("cuda:0") or .to("cuda") call. This diff supports this. I expect the following pattern to work ``` with FakeTensorMode(allow_non_fake_inputs=True): cuda_module = module.to("cuda:0") cuda_sample_inputs = tuple([x.to("cuda:0") for x in sample_inputs]) with torch.no_grad(): ep = torch.export.export(cuda_module, cuda_sample_inputs) ``` Test Plan: CI Rollback Plan: Differential Revision: D80181887 Pull Request resolved: https://github.com/pytorch/pytorch/pull/160532 Approved by: https://github.com/henryoier, https://github.com/ezyang	2025-09-13 01:50:51 +00:00
Sherlock Huang	eaa5d9d3d3	Introduce OpInfo test for testing export on fake device (#160694) Summary: Prepare for the upcoming diffs for exporting on fake cuda device. Test Plan: test Rollback Plan: Differential Revision: D80304225 Pull Request resolved: https://github.com/pytorch/pytorch/pull/160694 Approved by: https://github.com/dolpm	2025-08-15 07:26:28 +00:00
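
Note on the sdpa (937869657e) and conv (2a45f30ae7) commits above: both describe the same fix, which is to check that CUDA is available before querying device properties during backend selection. The actual change lives in C++ and is not reproduced here. The following is only a rough Python sketch of that guard pattern. `FakeRuntime`, `current_device_properties`, `supports_fast_path`, and the compute-capability threshold are all hypothetical stand-ins invented for illustration, not PyTorch APIs.

```python
import warnings


class FakeRuntime:
    """Hypothetical stand-in for a CUDA runtime; not a PyTorch API."""

    def __init__(self, device_count):
        self.device_count = device_count

    def is_available(self):
        return self.device_count > 0

    def current_device_properties(self):
        # Mirrors the failure mode in the commit messages: querying
        # properties without a device is an error.
        if not self.is_available():
            raise RuntimeError("no CUDA device present")
        return {"major": 8, "minor": 0}  # made-up values


def supports_fast_path(runtime):
    """Backend check that guards the property query (illustrative only)."""
    if not runtime.is_available():
        # Warn and fall back instead of raising, so tracing with fake
        # tensors can continue on a machine without a GPU.
        warnings.warn("CUDA unavailable; skipping hardware-specific check")
        return False
    props = runtime.current_device_properties()
    return props["major"] >= 7  # assumed threshold, for illustration
```

With the guard in place, `supports_fast_path(FakeRuntime(device_count=0))` emits a warning and returns `False`, whereas calling `current_device_properties()` directly on that object raises `RuntimeError`.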

11 Commits