H. Vetinari
e6c1e6e20e
simplify torch.utils.cpp_extension.include_paths; use it in cpp_builder ( #145480 )
...
While working on conda-forge integration, I needed to look at the way the include paths are calculated, and noticed an avoidable duplication between `torch/utils/cpp_extension.py` and `torch/_inductor/cpp_builder.py`. The latter already imports the former anyway, so simply reuse the same function.
Furthermore, remove long-obsolete include-paths. AFAICT, the `/TH` headers have not existed since pytorch 1.11.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145480
Approved by: https://github.com/ezyang
2025-01-27 07:19:42 +00:00
Bin Bao
b8087747f5
[inductor][BE] Enable test_cpu_cpp_wrapper in fbcode ( #145373 )
...
Differential Revision: D68278174
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145373
Approved by: https://github.com/Skylion007
2025-01-24 17:59:13 +00:00
Irem Yuksel
66bf7da446
Enable sleef for Win Arm64 ( #144876 )
...
Sleef module was disabled for Windows Arm64 on b021486405
This PR enables it again since the issue is no longer valid.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144876
Approved by: https://github.com/albanD , https://github.com/malfet
Co-authored-by: Ozan Aydin <148207261+ozanMSFT@users.noreply.github.com>
2025-01-23 19:22:58 +00:00
Aaron Orenstein
893ca1dfe1
PEP585 update - torch/_inductor/[_-i]* ( #145137 )
...
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145137
Approved by: https://github.com/bobrenjc93
2025-01-19 01:22:47 +00:00
bobrenjc93
a3ab27b8e0
Migrate from Tuple -> tuple in torch/_inductor ( #144264 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144264
Approved by: https://github.com/eellison
2025-01-07 03:27:27 +00:00
Bin Bao
fecf03fa3f
[AOTI][reland] Emit a CMakeLists.txt when package_cpp_only ( #143680 )
...
Summary: Emit a CMakeLists.txt with compile and link options when package_cpp_only is specified. After unzipping AOTI generated .pt2 package file, user can manually build the generated model code in their local environment.
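As a rough illustration of the idea in the summary above, the following sketch renders a small CMakeLists.txt from a list of compile and link options. The function name `render_cmakelists`, the target name, and the exact CMake commands chosen are assumptions made for this example; they are not taken from what AOTI actually emits.

```python
def render_cmakelists(target, sources, compile_flags, link_flags):
    """Return CMakeLists.txt text for a shared library (hypothetical layout)."""
    lines = [
        "cmake_minimum_required(VERSION 3.18)",
        f"project({target} LANGUAGES CXX)",
        f"add_library({target} SHARED {' '.join(sources)})",
    ]
    # Only emit option lines when there is something to pass through.
    if compile_flags:
        lines.append(
            f"target_compile_options({target} PRIVATE {' '.join(compile_flags)})"
        )
    if link_flags:
        lines.append(
            f"target_link_options({target} PRIVATE {' '.join(link_flags)})"
        )
    return "\n".join(lines) + "\n"


# Placeholder inputs; a real package would supply its own sources and flags.
print(render_cmakelists("aoti_model", ["model.cpp"], ["-O3", "-fPIC"], ["-pthread"]))
```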
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143680
Approved by: https://github.com/huydhn
2024-12-21 03:48:40 +00:00
xinan.lin
b5e159270a
[AOTI XPU] Replace intel compiler with g++ to build inductor CPP wrapper in runtime. ( #142322 )
...
This PR aims to remove the dependency on the Intel Compiler at Inductor runtime. Now we only need `SYCL_HOME` at runtime to find the SYCL headers and libs.
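A minimal sketch of resolving header and library directories from `SYCL_HOME` might look like the following. The helper name `sycl_search_paths` and the `include`/`lib` subdirectory layout are assumptions for illustration only; the actual lookup in cpp_builder may differ.

```python
import os


def sycl_search_paths(env=None):
    """Return (include_dir, lib_dir) derived from SYCL_HOME (assumed layout)."""
    env = os.environ if env is None else env
    home = env.get("SYCL_HOME")
    if not home:
        raise RuntimeError("SYCL_HOME is not set; cannot locate SYCL headers and libs")
    # Assumption: headers live in <SYCL_HOME>/include and libs in <SYCL_HOME>/lib.
    return os.path.join(home, "include"), os.path.join(home, "lib")


# Example with an explicit, made-up environment mapping.
print(sycl_search_paths({"SYCL_HOME": "/opt/sycl"}))
```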
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142322
Approved by: https://github.com/EikanWang , https://github.com/desertfire , https://github.com/albanD
ghstack dependencies: #143491
2024-12-21 02:27:04 +00:00
Tom Ritchford
b5475d334e
[inductor] Fix an unused variable in cpu_vec_isa.py ( #138473 )
...
----
* Extracted from https://github.com/pytorch/pytorch/pull/133492
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138473
Approved by: https://github.com/EikanWang , https://github.com/albanD , https://github.com/xuhancn
2024-12-20 18:50:19 +00:00
Huamin Li
f5af87c23c
Make Inductor cpp backend enable_floating_point_contract_flag to take string ( #143450 )
...
Differential Revision: D66269001
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143450
Approved by: https://github.com/desertfire
2024-12-20 16:28:54 +00:00
PyTorch MergeBot
71479a9b9c
Revert "[AOTI] Emit a CMakeLists.txt when package_cpp_only ( #143352 )"
...
This reverts commit 429f4cd140 .
Reverted https://github.com/pytorch/pytorch/pull/143352 on behalf of https://github.com/huydhn due to Sorry for reverting your change but the new test is failing on ROCm ([comment](https://github.com/pytorch/pytorch/pull/143352#issuecomment-2556365140 ))
2024-12-20 06:21:31 +00:00
Bin Bao
429f4cd140
[AOTI] Emit a CMakeLists.txt when package_cpp_only ( #143352 )
...
Summary: Emit a CMakeLists.txt with compile and link options when package_cpp_only is specified. After unzipping AOTI generated .pt2 package file, user can manually build the generated model code in their local environment.
Differential Revision: [D67458526](https://our.internmc.facebook.com/intern/diff/D67458526 )
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143352
Approved by: https://github.com/malfet
2024-12-19 22:01:05 +00:00
Bin Bao
0e8013fc1c
[AOTI] Fix a typo in cpp_builder.py ( #143351 )
...
Summary: passthough -> passthrough
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143351
Approved by: https://github.com/yushangdi , https://github.com/chenyang78
ghstack dependencies: #143350
2024-12-18 16:28:37 +00:00
Benjamin Glass
bb06fc79fb
cpp_builder: handle CUDA lib paths involving "stubs" in more circumstances ( #142175 )
...
conda packages for `cuda-driver-dev=12.4.127` use a "stubs" subdirectory to contain `libcuda.so`. This was previously only handled by cpp_builder in some cases, but now needs to be potentially handled more generally.
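The "stubs" handling described above can be sketched as a search that checks each library directory and its `stubs` subdirectory. The helper name `dirs_containing_lib` is hypothetical and this is not the code from the PR; it only illustrates the lookup pattern.

```python
import os


def dirs_containing_lib(lib_dirs, lib_name="libcuda.so"):
    """Return directories that hold lib_name, also checking a 'stubs' subdirectory."""
    found = []
    for lib_dir in lib_dirs:
        for candidate in (lib_dir, os.path.join(lib_dir, "stubs")):
            has_lib = os.path.isfile(os.path.join(candidate, lib_name))
            if has_lib and candidate not in found:
                found.append(candidate)
    return found
```

The returned directories could then be passed to the linker as `-L` search paths.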
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142175
Approved by: https://github.com/desertfire
2024-12-17 17:21:27 +00:00
Tom Ritchford
dc23f1944a
Remove unused Python variables in torch/[_-a]* ( #133492 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492
Approved by: https://github.com/albanD
2024-12-12 17:39:14 +00:00
Colin L. Rice
d68403df3b
filelock: Make waitcounter variant to use ( #139816 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139816
Approved by: https://github.com/ezyang
2024-12-12 01:18:34 +00:00
PyTorch MergeBot
5c97ac9721
Revert "Remove unused Python variables in torch/[_-a]* ( #133492 )"
...
This reverts commit fda975a7b3 .
Reverted https://github.com/pytorch/pytorch/pull/133492 on behalf of https://github.com/clee2000 due to Sorry, I need to revert this in order to revert something else. The only thing you need to do is rebase and remerge ([comment](https://github.com/pytorch/pytorch/pull/133492#issuecomment-2536635516 ))
2024-12-11 17:29:12 +00:00
PyTorch MergeBot
2374d460d0
Revert "filelock: Make waitcounter variant to use ( #139816 )"
...
This reverts commit 237c4b559c .
Reverted https://github.com/pytorch/pytorch/pull/139816 on behalf of https://github.com/clee2000 due to Sorry, I need to revert this in order to revert something else. The only thing you need to do is rebase and remerge ([comment](https://github.com/pytorch/pytorch/pull/139816#issuecomment-2536616808 ))
2024-12-11 17:26:46 +00:00
Colin L. Rice
237c4b559c
filelock: Make waitcounter variant to use ( #139816 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139816
Approved by: https://github.com/ezyang
2024-12-10 23:02:59 +00:00
Tom Ritchford
fda975a7b3
Remove unused Python variables in torch/[_-a]* ( #133492 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492
Approved by: https://github.com/albanD
2024-12-10 21:48:44 +00:00
Colin Peppler
0602676c8d
[CUTLASS][AOTI] Fixes undefined symbol: cudaLaunchKernelExC ( #142094 )
...
Summary:
### Context
* When compiling the object file for a CUTLASS kernel, CUDA RT symbols are left undefined.
* When compiling the final shared object file, we statically link with `libcudart_static.a`.
* One important thing is that ordering matters when specifying the lib search paths (-L).
Test Plan:
```
// before diff
RuntimeError: Failure loading .so: /tmp/tmpqhz_dnza/model.so: undefined symbol: cudaLaunchKernelExC
```
Differential Revision: D66793974
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142094
Approved by: https://github.com/chenyang78 , https://github.com/hl475
2024-12-06 02:18:54 +00:00
xinan.lin
4742080ed9
[AOTI XPU] Enable Cpp wraper for Intel GPU. ( #135318 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135318
Approved by: https://github.com/jgong5 , https://github.com/EikanWang , https://github.com/guangyey , https://github.com/desertfire
2024-11-26 11:51:32 +00:00
Joseph Kleinhenz
7b2138b864
[inductor] fix uncaught exception when checking for openmp on macos ( #141208 )
...
Based on #133776
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141208
Approved by: https://github.com/Skylion007
2024-11-21 22:17:52 +00:00
Aaron Gokaslan
12e95aa4ee
[BE]: Apply PERF401 autofixes from ruff ( #140980 )
...
* Automatically applies ruff rule 401. Turns loops into equivalent list comprehensions which are faster and do not leak the scope of the loop variables.
* List comprehensions not only often have better typing, but are 50+% faster than for loops on overhead. They also preserve length information and are easier for the interpreter to optimize.
* Manually went back and made mypy happy after the change.
* Also fixed style lints in files covered by flake8 but not by pyfmt
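For readers unfamiliar with the rule, the kind of rewrite described in the list above looks like this. The function names are made up for the example; both versions return the same list.

```python
def even_squares_loop(values):
    # Pattern flagged by the rule: build a list by appending inside a for loop.
    result = []
    for value in values:
        if value % 2 == 0:
            result.append(value * value)
    return result


def even_squares_comprehension(values):
    # Equivalent list comprehension; the loop variable stays local to it.
    return [value * value for value in values if value % 2 == 0]


print(even_squares_comprehension(range(6)))
```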
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140980
Approved by: https://github.com/justinchuby , https://github.com/malfet
2024-11-20 17:52:07 +00:00
Valentine233
263a5bf95e
[cpu] Modify inductor opt flag --- ftree-loop-vectorize ( #136827 )
...
Reopen https://github.com/pytorch/pytorch/pull/121782 , as more optimizations have landed.
Fixes https://github.com/pytorch/pytorch/issues/115261 , https://github.com/pytorch/pytorch/issues/113017 .
For CPU inductor path, remove -ftree-loop-vectorize from optimization flags to fix functional issues.
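As a simple sketch of the change, dropping one flag from a list of optimization flags can be written as below. The helper name `without_flag` and the sample flag list are placeholders, not the actual flags used by the CPU inductor path.

```python
def without_flag(flags, unwanted="-ftree-loop-vectorize"):
    """Return a copy of flags with every occurrence of unwanted removed."""
    return [flag for flag in flags if flag != unwanted]


# Placeholder flag list for illustration only.
sample_flags = ["-O3", "-ftree-loop-vectorize", "-march=native"]
print(without_flag(sample_flags))
```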
### Validation on 3 benchmark suites
#### FP32

Outlier models (speedup<0.8, single socket): None.
#### BF16

Outlier models (speedup<0.8, single socket multi threads):
- functorch_dp_cifar10 0.58
- opacus_cifar10 0.57
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136827
Approved by: https://github.com/jansel , https://github.com/jgong5
2024-11-12 01:26:18 +00:00
PyTorch MergeBot
347f96061f
Revert "[cpu] Modify inductor opt flag --- ftree-loop-vectorize ( #136827 )"
...
This reverts commit cf0bb6c435 .
Reverted https://github.com/pytorch/pytorch/pull/136827 on behalf of https://github.com/ZainRizvi due to Sorry but this breaks internally. See D65605094 for more details ([comment](https://github.com/pytorch/pytorch/pull/136827#issuecomment-2465805271 ))
2024-11-08 21:52:33 +00:00
Valentine233
cf0bb6c435
[cpu] Modify inductor opt flag --- ftree-loop-vectorize ( #136827 )
...
Reopen https://github.com/pytorch/pytorch/pull/121782 , as more optimizations have landed.
Fixes https://github.com/pytorch/pytorch/issues/115261 , https://github.com/pytorch/pytorch/issues/113017 .
For CPU inductor path, remove -ftree-loop-vectorize from optimization flags to fix functional issues.
### Validation on 3 benchmark suites
#### FP32

Outlier models (speedup<0.8, single socket): None.
#### BF16

Outlier models (speedup<0.8, single socket multi threads):
- functorch_dp_cifar10 0.58
- opacus_cifar10 0.57
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136827
Approved by: https://github.com/jansel , https://github.com/jgong5
2024-11-07 02:49:52 +00:00
Irem Yuksel
b021486405
Enable Windows Arm64 ( #133088 )
...
This PR enables PyTorch for Windows on Arm64 - CPU only.
Currently, there aren't any checks in place to build and test for Windows on Arm64, but we're working to implement those as soon as possible.
We recommend using [Arm Performance Libraries (APL)](https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Libraries ) as a BLAS option, which is introduced in this PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133088
Approved by: https://github.com/malfet
Co-authored-by: cristian panaite <panaite.cristian2000@gmail.com>
Co-authored-by: Stefan-Alin Pahontu <56953855+alinpahontu2912@users.noreply.github.com>
Co-authored-by: Ozan Aydin <148207261+ozanMSFT@users.noreply.github.com>
2024-10-24 16:10:44 +00:00
Zhuoran Zhao
2414c3f534
AOTI fixes for MI300 lowering ( #137939 )
...
Summary:
1) Add sleef back to enable SIMD on AMD
2) Add kpack to triton compute_meta for AMD triton, since user-defined triton kernels will use it for k-dim packing
Test Plan:
```
HIP_VISIBLE_DEVICES=0 TORCHINDUCTOR_UNIQUE_KERNEL_NAMES=1 TORCH_LOGS="output_code,graph_code" buck run mode/{opt,amd-gpu} -c fbcode.triton_backend=amd -c fbcode.enable_gpu_sections=true //hpc/new/models/feed/benchmark:feed_lower_benchmark -- --skip-flop-estimation --skip-trt --skip-ait --enable-aot-inductor --sync-mode=0 --gpu-trace --sample-input-tile-factor=1 --load="manifold://ads_storage_fblearner/tree/user/facebook/fblearner/predictor/925729118/0/gpu_lowering/input.merge" --lowering-input-str='{"serialized_inference_model_input_path":"ads_storage_fblearner/tree/user/facebook/fblearner/predictor/925729118/0/gpu_lowering/input.merge","serialized_inference_model_output_path":"ads_storage_fblearner/tree/user/facebook/fblearner/predictor/925729118/0/gpu_lowering/mi300_output.merge","submodule_names_to_lower":["merge"],"inductor_lowering_context":{"aot_inductor_lowering_settings":{"use_scripting":true,"preset_lowerer":"ifu_cint;disable_new_lowering_weights;disable_dper_passes:passes=fuse_parallel_linear_no_weight_change","precision":3,"output_precision":3, "remove_unexpected_type_cast":false, "sample_input_tile_factor":32}},"model_entity_id":925729118,"model_snapshot_id":0,"add_sample_inputs":false,"hardware_type":0,"platform_arch":1,"dense_in_place_format":2}' --precision=bf16 2>&1 | tee local_benchmark_log.txt
```
Differential Revision: D64262924
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137939
Approved by: https://github.com/frank-wei
2024-10-17 16:09:04 +00:00
Bin Bao
fe43f72be7
[AOTI] Remove the non-ABI-compatible mode (part 2) ( #138047 )
...
Summary: Continue to clean up non-ABI-compatible mode related code.
Differential Revision: [D64444327](https://our.internmc.facebook.com/intern/diff/D64444327 )
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138047
Approved by: https://github.com/chenyang78
ghstack dependencies: #137982 , #138016 , #138009
2024-10-17 02:54:24 +00:00
Henry Tsang
a0a978ce23
[aoti config] add raise_error_on_ignored_optimization ( #138035 )
...
Summary: Unfortunately this means adding another config.
Test Plan: ci
Differential Revision: D64437699
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138035
Approved by: https://github.com/chenyang78 , https://github.com/desertfire
2024-10-16 18:38:47 +00:00
Bin Bao
c04b35a5ae
[AOTI] Add standalone version of TORCH_CHECK ( #136873 )
...
Summary: In the standalone mode, TORCH_CHECK throws std::runtime_error, instead of c10::Error. The goal is to cut dependency on libtorch. Specifically, AOTI generates CPU code which may call ATen vectorization ops and we need to make sure those ops are self-contained.
Differential Revision: [D63911928](https://our.internmc.facebook.com/intern/diff/D63911928 )
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136873
Approved by: https://github.com/albanD , https://github.com/chenyang78
2024-10-08 15:30:01 +00:00
Dan Zimmerman
b3972ee19a
[triton] Unify build_paths.py for NV & AMD, fix typing ( #136952 )
...
Summary: Some build improvements.
Test Plan: CI
Differential Revision: D63583959
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136952
Approved by: https://github.com/bertmaher
2024-09-30 21:51:45 +00:00
Isuru Fernando
2a178a6982
Avoid changing FTZ/DAZ flags in CPP builder ( #136466 )
...
Fixes https://github.com/pytorch/pytorch/issues/136273
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136466
Approved by: https://github.com/ezyang
2024-09-24 14:39:17 +00:00
xinan.lin
67735d1ee8
[Inductor] Generalize is_cuda to specific device_type to make cpp_wrapper mode be extensible ( #134693 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134693
Approved by: https://github.com/ezyang , https://github.com/EikanWang , https://github.com/jansel
2024-09-10 10:11:13 +00:00
Xu Han
29d72c1100
[inductor] check intel compiler minimal version ( #135209 )
...
On Windows, early versions of icx have a `-print-file-name` issue and can't preload correctly for inductor. Add a minimum version check for the Intel compiler.
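A minimum version check of this kind could be sketched as follows. The helper names, the sample banner string, and the minimum version tuple are all assumptions for illustration; they are not the values or parsing logic used in the PR.

```python
import re


def parse_version(banner):
    """Extract the first 'X.Y.Z' triple from compiler output, or None if absent."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", banner)
    return tuple(int(part) for part in match.groups()) if match else None


def meets_minimum(banner, minimum):
    """True when the parsed version is at least the required minimum tuple."""
    version = parse_version(banner)
    return version is not None and version >= minimum


# Made-up banner text and made-up minimum version.
print(meets_minimum("Example C++ Compiler 2024.2.1 Build 20240711", (2024, 2, 1)))
```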
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135209
Approved by: https://github.com/ezyang
2024-09-06 03:21:07 +00:00
Xu Han
6448d351db
[inductor] clean up cpp_builder code. ( #134909 )
...
Clean up duplicated code in cpp_builder.
Hi @henrylhtsang , could you please help land this internally?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134909
Approved by: https://github.com/henrylhtsang
2024-09-04 05:29:08 +00:00
Xu Han
c40e622966
[inductor] add openmp config for intel conpiler on Linux. ( #134973 )
...
Config `openmp` for Intel Compiler on Linux.
Based on this PR, we can confirm that the Intel optimized libraries build and work well.
<img width="1039" alt="image" src="https://github.com/user-attachments/assets/838d5114-c778-4961-9cfe-39a814647089 ">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134973
Approved by: https://github.com/jgong5 , https://github.com/jansel
2024-09-03 20:10:21 +00:00
Xu Han
136badae64
[inductor] preload icx built in math libs ( #134870 )
...
The Intel Compiler implements more math libraries than clang, for performance purposes.
We need to preload them, like the openmp library.
UT to reproduce:
```cmd
pytest test/inductor/test_cpu_cpp_wrapper.py -v -k test_silu_cpu_dynamic_shapes_cpp_wrapper
```
Module dependencies:
<img width="804" alt="Image" src="https://github.com/user-attachments/assets/9a672e03-ebf5-4ebb-b182-09180e6f7841 ">
Local test pass:
<img width="857" alt="image" src="https://github.com/user-attachments/assets/afbb8c1c-8fcc-4d64-a3ad-c8521b137d2d ">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134870
Approved by: https://github.com/jansel
2024-08-31 04:50:31 +00:00
Xu Han
15f5a4858b
[inductor] enable Intel Compiler(icx-cl) for inductor windows ( #134772 )
...
This PR enables the Intel Compiler (`icx-cl`) for Windows inductor, like the previous PR https://github.com/pytorch/pytorch/pull/134444 , which enabled clang.
Changes:
1. Fix an icx-cl crash caused by wrong decode args; the correct encoding is "utf-8".
2. Add an Intel compiler check and an Intel compiler Windows driver (icx-cl) check.
3. Add Intel compiler openmp args config.
4. Add Intel compiler openmp binary preload.
For intel compiler openmp binary path:
<img width="788" alt="image" src="https://github.com/user-attachments/assets/54c76356-018d-4bef-a9b7-0ea150fd7aba ">
In terms of performance, the Intel compiler (`icx-cl`) is much better than MSVC (`cl`):
<img width="875" alt="image" src="https://github.com/user-attachments/assets/67865faf-b1de-4535-917a-486b72527204 ">
Append `clang-cl` performance data:
<img width="821" alt="image" src="https://github.com/user-attachments/assets/476f4568-bf58-457f-b73d-4e57f49be384 ">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134772
Approved by: https://github.com/jgong5 , https://github.com/jansel
2024-08-30 17:51:46 +00:00
Zhuoran Zhao
8b4c487581
Fix AOTInductor complication on ROCM ( #134522 )
...
Summary:
The original PR (https://github.com/pytorch/pytorch/pull/124123 ) was broken by the cpp_builder refactoring, so this PR resubmits it as a fix.
Test Plan: Test with command here: https://www.internalfb.com/phabricator/paste/view/P1549765548
Differential Revision: D61827208
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134522
Approved by: https://github.com/frank-wei
2024-08-29 21:59:04 +00:00
Xu Han
1dd4b9221b
[inductor] enable clang for Windows inductor ( #134444 )
...
Changes:
1. Add Windows clang-cl compiler check.
2. Add openmp config for clang-cl.
3. Preload libomp.dll when using clang.
4. Add compiler flags syntax check for `clang` and `clang++`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134444
Approved by: https://github.com/jgong5 , https://github.com/jansel , https://github.com/malfet
2024-08-26 18:19:59 +00:00
Xu Han
98d6a6eb7d
[inductor] clean up TODO comments. ( #133718 )
...
clean up TODO comments.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133718
Approved by: https://github.com/henrylhtsang
2024-08-16 22:12:01 +00:00
Xu Han
89795da5e3
[inductor] process compile_only case in all build options class. ( #129975 )
...
Optimize the `compile_only` logic. The original code only applied to `CppTorchCudaOptions`; this PR makes it apply to all build option classes.
Changes:
1. Remove the `libraries_dirs` and `libraries` settings when `compile_only` is set.
2. Remove compile_only from CppTorchCudaOptions.
3. Make `compile_only` apply to all classes.
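The behavior listed above can be sketched with a small options class that clears link-related settings when `compile_only` is set. The class name `SketchBuildOptions` and its fields are hypothetical stand-ins, not the real build option classes in cpp_builder.

```python
from dataclasses import dataclass, field


@dataclass
class SketchBuildOptions:
    """Hypothetical stand-in for a build options class."""

    include_dirs: list = field(default_factory=list)
    libraries_dirs: list = field(default_factory=list)
    libraries: list = field(default_factory=list)
    compile_only: bool = False

    def __post_init__(self):
        # Compiling to an object file involves no link step,
        # so library search paths and library names are dropped.
        if self.compile_only:
            self.libraries_dirs = []
            self.libraries = []


opts = SketchBuildOptions(
    include_dirs=["/usr/include"],
    libraries_dirs=["/usr/lib"],
    libraries=["m"],
    compile_only=True,
)
print(opts.libraries_dirs, opts.libraries)
```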
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129975
Approved by: https://github.com/henrylhtsang
2024-08-13 16:45:27 +00:00
Xu Han
9f0d90655d
[inductor] cpp_builder add dynamo time trace for compile_file ( #133103 )
...
trace `compile_file` time for cpp_builder.
Ref: https://github.com/pytorch/pytorch/pull/132328/files#diff-c9b517f8db609ffa866804dfa2689188a4fee20abacaa0b0dca91625c1b5cb8dR2224
<img width="994" alt="image" src="https://github.com/user-attachments/assets/862c7943-79dc-4d06-b398-a09595ad1295 ">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133103
Approved by: https://github.com/ezyang
2024-08-10 04:55:02 +00:00
Henry Tsang
e98eac76b3
[inductor] switch AotCodeCompiler to new cpp_builder. (take 3) ( #132766 )
...
Summary: This is basically https://github.com/pytorch/pytorch/pull/131304 together with https://github.com/pytorch/pytorch/pull/132594 and absolute path fix for fbcode.
Test Plan: ci
Differential Revision: D60773405
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132766
Approved by: https://github.com/xuhancn , https://github.com/chenyang78 , https://github.com/desertfire
2024-08-06 23:56:34 +00:00
Xu Han
a672f6c84e
[inductor] unificate SUBPROCESS_DECODE_ARGS variable in cpp_builder.py ( #132615 )
...
[inductor] unificate SUBPROCESS_DECODE_ARGS variable in cpp_builder.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132615
Approved by: https://github.com/jgong5 , https://github.com/desertfire
2024-08-05 16:00:35 +00:00
Xu Han
7f8a384a8f
[inductor] add msvc_cl compiler check ( #132571 )
...
add `msvc_cl` compiler check.
Local test:
<img width="880" alt="image" src="https://github.com/user-attachments/assets/fe4da5e0-dd52-4dbc-831e-c32479e27a29 ">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132571
Approved by: https://github.com/ezyang
2024-08-04 03:48:25 +00:00
Xu Han
36ec0fdf10
[inductor] check compiler exist on Windows. ( #132533 )
...
In the current Windows environment, if the MSVC environment is not activated, no clear error about the compiler is raised:
<img width="904" alt="image" src="https://github.com/user-attachments/assets/725ea608-d181-40b1-8930-42fe2b32643a ">
With this PR, we can point users to the compiler as the source of the issue.
<img width="1034" alt="image" src="https://github.com/user-attachments/assets/8515a796-e3e9-4909-a68f-8a14d4864951 ">
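A compiler existence check of this kind can be sketched with `shutil.which`. The helper name `require_compiler` and the error wording are assumptions for this example, not the code or message added by the PR.

```python
import shutil


def require_compiler(name):
    """Return the full path of the compiler, or raise a clear error if missing."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(
            f"Compiler '{name}' was not found on PATH. "
            "Check that the compiler environment is activated."
        )
    return path
```

Calling `require_compiler("cl")` in a shell where the MSVC environment has not been activated would then fail early with a message that names the compiler.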
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132533
Approved by: https://github.com/jansel
2024-08-03 07:47:11 +00:00
Xu Han
475da800c7
[inductor] optimize cflags for Windows. ( #131980 )
...
changes:
1. optimize cflags for Windows. Ref: https://github.com/pytorch/pytorch/blob/v2.4.0/torch/utils/cpp_extension.py#L215
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131980
Approved by: https://github.com/jgong5 , https://github.com/jansel
2024-07-30 02:59:51 +00:00
Xu Han
28fd2e905d
[inductor] enhance cpp_builder lint check. ( #131752 )
...
enhance cpp_builder `mypy` check.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131752
Approved by: https://github.com/jgong5 , https://github.com/jansel
2024-07-27 02:46:27 +00:00