pytorch

OSSForks/pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Commit Graph

Author

SHA1

Message

Date

Mark Harfouche

43afaa4aac

Allow users to overwrite ld with environment variable in linker optimization script (#137331)

This should help in the case of cross compilation.

xref: https://github.com/conda-forge/pytorch-cpu-feedstock/pull/261

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137331
Approved by: https://github.com/isuruf, https://github.com/seemethere

2024-11-26 22:54:24 +00:00
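
As an illustration of the idea in this commit, here is a minimal sketch of how a build helper might let an environment variable take precedence over the default `ld`. This is not the code from the pull request: the function name `resolve_linker` and the variable name `LD` are assumptions chosen for the example.

```python
import os


def resolve_linker(env=None, default="ld"):
    """Return the linker command, preferring an environment override.

    Assumption: the override lives in a variable named "LD". The actual
    variable used by the PyTorch script is not stated in this log.
    """
    env = os.environ if env is None else env
    # An empty or whitespace-only value is treated as "not set".
    candidate = env.get("LD", "").strip()
    return candidate or default


if __name__ == "__main__":
    # A cross-compiling user would export the variable before building;
    # the linker name below is only an example value.
    print(resolve_linker({"LD": "aarch64-linux-gnu-ld"}))
    print(resolve_linker({}))
```

Falling back to plain `ld` keeps native builds unchanged, while a cross-compilation environment can point the script at its target linker.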

Aidyn-A

a6080f79e9

[Build] Add linker script optimization (#121975)

This PR adds a linker script optimization based on prioritized symbols that can be extracted from the profiles of popular workloads. The present linker script was generated to target ARM+CUDA and can be extended later if necessary. The reason we target ARM is shown below:

> PyTorch and other applications that access more than 24x 2MB code regions in quick succession can run into performance bottlenecks in the CPU front-end. The link-time optimization improves executable code locality and improves performance. We recommend always turning on the optimization for PyTorch and other applications that behave similarly.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121975
Approved by: https://github.com/ptrblck, https://github.com/atalman

2024-04-09 20:22:25 +00:00
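
To make the "prioritized symbols" idea concrete, the following sketch renders a GNU ld style `.text` output section that places a list of hot symbols first, so that frequently executed code ends up close together. It is a simplified illustration, not the script added by the pull request: it assumes objects are compiled with `-ffunction-sections` (so each function lives in its own `.text.<symbol>` input section), and the function name `render_text_section` and the symbol names are hypothetical.

```python
def render_text_section(prioritized_symbols):
    """Render a .text output section that lists hot symbols first.

    Assumption: each function sits in an input section named
    ".text.<symbol>", as produced by compiling with -ffunction-sections.
    """
    seen = set()
    lines = [".text : {"]
    for sym in prioritized_symbols:
        if sym in seen:
            # Keep only the first occurrence so the order stays stable.
            continue
        seen.add(sym)
        lines.append(f"    *(.text.{sym})")
    # Everything not listed above follows in the linker's default order.
    lines.append("    *(.text .text.*)")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical symbol names; a real list would come from profiling.
    print(render_text_section(["hot_kernel_launch", "hot_dispatch"]))
```

In a real build the symbol list would be derived from workload profiles, as the commit message describes, and the fragment would sit inside a complete `SECTIONS` block.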

2 Commits