pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Mikayla Gawarecki db3685a35c Add option to serialization config to reduce random reads from get_record_offset when loading with mmap=True (#143880)	2025-01-27 23:57:30 +00:00

## Background

This PR adds `torch.utils.serialization.config.load.calculate_storage_offsets`. This option relies on the previous PR in this stack, where the storage order was changed to non-lexicographical. A `.format_version` entry was added to the zipfile, and `calculate_storage_offsets` only works on checkpoints that contain `.format_version`.

When this option is turned on, for `torch.load(mmap=True)`, the offset of each storage record (other than the 0th storage) is calculated instead of relying on `miniz` APIs to determine it.

The existing APIs issue multiple random reads (reading the end of central directory record, then reading the zipfile header for the record) to determine the storage offset where the record starts. This can greatly degrade `torch.load(mmap=True)` performance for non-filesystem cases.

`6aaae9d78f/caffe2/serialize/inline_container.cc (L589-L605)`

## Testing strategy

The agreed-upon testing strategy was as follows:
- Add debug code gated by an environment flag `TORCH_SERIALIZATION_DEBUG` that runs this offset calculation logic and verifies it against getRecordOffset for each storage (when mmap=False).
- This flag is set throughout CI, which means that every time `torch.load` is called, the offset calculation logic is implicitly tested.

Differential Revision: [D67673026](https://our.internmc.facebook.com/intern/diff/D67673026)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143880
Approved by: https://github.com/albanD
ghstack dependencies: #143879
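
The lookup cost described in the commit message can be illustrated with a minimal sketch that uses only Python's standard `zipfile` and `struct` modules. It is not the PyTorch or `miniz` implementation, and the helper names `build_archive` and `record_data_offset` are hypothetical. Opening the archive parses the central directory (located through the end of central directory record), and finding where a record's bytes begin then requires a second read of that record's local file header.

```python
import io
import struct
import zipfile


def build_archive(records):
    """Write uncompressed (stored) records into an in-memory zip archive."""
    stream = io.BytesIO()
    with zipfile.ZipFile(stream, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, payload in records.items():
            zf.writestr(name, payload)
    return stream.getvalue()


def record_data_offset(archive, info):
    """Return the byte offset where a record's payload starts.

    A local file header has 30 fixed bytes followed by the file name and an
    optional extra field; their lengths are stored at bytes 26-29 of the header.
    """
    name_len, extra_len = struct.unpack_from("<HH", archive, info.header_offset + 26)
    return info.header_offset + 30 + name_len + extra_len


# Hypothetical record names and payloads, chosen only for illustration.
records = {"data/0": b"\x01" * 16, "data/1": b"\x02" * 8}
archive = build_archive(records)

# Read 1: ZipFile locates and parses the central directory.
with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    # Read 2 (per record): inspect the local file header to find the payload.
    offsets = {info.filename: record_data_offset(archive, info) for info in zf.infolist()}

for name, payload in records.items():
    start = offsets[name]
    assert archive[start:start + len(payload)] == payload
```

On a local file these reads are cheap, but each one is a separate seek, which is the cost the commit message describes for non-filesystem cases. A loader that already knows the record order and sizes can derive the offsets arithmetically instead.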
..
amp_examples.rst	Update document for autocast on CPU (#135299)	2024-09-13 09:11:47 +00:00
autograd.rst	Fix unexpected inference_mode interaction with torch.autograd.functional.jacobian (#130307)	2024-08-25 22:14:02 +00:00
broadcasting.rst
cpu_threading_runtimes.svg
cpu_threading_torchscript_inference.rst
cpu_threading_torchscript_inference.svg
cuda.rst	Revert "[CUDA][cuBLAS] Add fp16 accumulate option to cuBLAS/cuBLASLt (#144441)"	2025-01-27 19:38:26 +00:00
custom_operators.rst	Redirect the custom ops landing page :D (#139634)	2024-11-04 22:25:15 +00:00
ddp.rst	Update DDP dynamo debug docs (#118295)	2024-01-29 14:58:26 +00:00
extending.func.rst	Fix the example in the extending.func.rst (#109279)	2023-09-14 17:29:39 +00:00
extending.rst	[doc] fix grammar in "Extending Torch" (#140209)	2024-11-13 05:34:43 +00:00
faq.rst
fsdp.rst	[docs] start a new FSDP notes doc (#117323)	2024-01-22 15:46:35 +00:00
get_start_xpu.rst	Fix IdentationError of code example (#145251)	2025-01-23 18:17:11 +00:00
gradcheck.rst
hip.rst	[ROCm] set hipblas workspace (#138791)	2024-10-29 01:37:55 +00:00
large_scale_deployments.rst
modules.rst	Fix to modules.rst: indent line with activation functions (#139667)	2024-11-08 01:12:52 +00:00
mps.rst
multiprocessing.rst	[draft] Update Multiprocessing best practices with CPU device (#103229)	2023-06-25 06:26:40 +00:00
numerical_accuracy.rst	Add option to configure reduced precision math backend for SDPA (#135964)	2024-09-24 07:11:38 +00:00
randomness.rst	Fix typo in Reproducibility docs (#141341)	2024-11-26 16:53:26 +00:00
serialization.rst	Add option to serialization config to reduce random reads from get_record_offset when loading with mmap=True (#143880)	2025-01-27 23:57:30 +00:00
windows.rst