pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
zeshengzong	97272e4b49	Fix `torch.nn.functional.hardswish` gradients corner case (#148049 ) Fixes #147801 ## Changes - Change hardswish gradient compute condition as [torch.nn.functional.hardswish](https://pytorch.org/docs/stable/generated/torch.nn.functional.hardswish.html) - Enable cuda for test `test_hardswish_grad_corner` - Add test case for value=-3 ## Test Result ```bash pytest test/test_nn.py -k test_hardswish pytest test/test_unary_ufuncs.py -k test_hardswish pytest test/inductor/test_torchinductor.py -k test_hardswish ``` ![image](https://github.com/user-attachments/assets/000cb5c4-15f5-4bfd-ab45-f52bf810ff3d) ![image](https://github.com/user-attachments/assets/38b08cf8-ea84-47a2-8e37-0a213da3e0c8) ![image](https://github.com/user-attachments/assets/54bc57be-2c57-46cc-ab90-94ea6cbe1c34) Pull Request resolved: https://github.com/pytorch/pytorch/pull/148049 Approved by: https://github.com/soulitzer	2025-03-14 18:53:10 +00:00
Zhenghao Hu	e5fccb2bab	[pytorch] Fix duplicated Malloc/Free insertation when using IRBuilderBase::CreateMalloc/CreateFree in LLVM 18+ (#149058 ) Summary: A PyTorch unit test hangs when jitting the Tensor kernel. The problem exists for LLVM version >= 18 due to this upstream change: `45bb45f2ae` `IRBuilderBase::CreateCall` now inserts the instruction into the BasicBlock by default, so we don't need to insert the instruction explicitly when compiling the tensor kernel. Test Plan: ## Test with the release toolchain ``` buck test 'mode/dev' //caffe2/test:jit -- --exact 'caffe2/test:jit - test_concat_invariant (test_jit_fuser_te.TestTEFuserDynamic)' ``` ## Test with the Buckified toolchain Apply this D71046097 to select the LLVM libraries. ``` # Build tests buck build 'mode/dev-asan' //caffe2/test:jit --show-output ``` ``` # Run test (Change HASH and paths accordingly) HASH="b755f1c435832a1e" ENABLE_FLATBUFFER=0 FB_OVERRIDE_PYBIND11_GIL_INCREF_DECREF_CHECK=1 MKL_NUM_THREADS=1 NO_MULTIPROCESSING_SPAWN=0 OMP_NUM_THREADS=1 PYTORCH_TEST=1 PYTORCH_TEST_FBCODE=1 PYTORCH_TEST_WITH_ASAN=1 PYTORCH_TEST_WITH_DEV_DBG_ASAN=1 PYTORCH_TEST_WITH_TSAN=0 PYTORCH_TEST_WITH_UBSAN=1 SKIP_TEST_BOTTLENECK=1 TENSORPIPE_TLS_DATACENTER=test_dc TEST_PILOT=True TPX_IS_TEST_EXECUTION=true TPX_TIMEOUT_SEC=6000 \ buck-out/v2/gen/$HASH/caffe2/test/__jit__/jit.par --test-filter test_jit_fuser_te.TestTEFuserDynamic.test_concat_invariant ``` Differential Revision: D71046799 Pull Request resolved: https://github.com/pytorch/pytorch/pull/149058 Approved by: https://github.com/dcci, https://github.com/Skylion007	2025-03-13 20:37:47 +00:00
Zhenghao Hu	f1444f006c	[caffe2/torch] Fixup upstream LLVM (major version 21) API changes (#148833 ) The latest LLVM introduced two changes to `Triple` usage that cause build failures when building PyTorch. ## Failure in llvm_codegen.cpp: Triple is stored in Modules instead of the string: `979c275097` ## Failure in llvm_jit.cpp: Triple argument is removed from LLJITBuilder::... : `b18e5b6a36` Pull Request resolved: https://github.com/pytorch/pytorch/pull/148833 Approved by: https://github.com/Skylion007	2025-03-09 18:58:36 +00:00
cyy	f7c0c230b0	Fix compile errors (#148758 ) Fix ``` /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/14.2.1/../../../../include/c++/14.2.1/bits/unique_ptr.h:91:16: error: invalid application of 'sizeof' to an incomplete type 'torch::jit::AliasDb::WriteRegistry' 91 \| static_assert(sizeof(_Tp)>0, \| ^~~~~~~~~~~ /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/14.2.1/../../../../include/c++/14.2.1/bits/unique_ptr.h:399:4: note: in instantiation of member function 'std::default_delete<torch::jit::AliasDb::WriteRegistry>::operator()' requested here 399 \| get_deleter()(std::move(__ptr)); \| ^ ../torch/csrc/jit/ir/alias_analysis.cpp:200:10: note: in instantiation of member function 'std::unique_ptr<torch::jit::AliasDb::WriteRegistry>::~unique_ptr' requested here 200 \| AliasDb::~AliasDb() = default; \| ^ ../torch/csrc/jit/ir/alias_analysis.cpp:200:23: note: in defaulted destructor for 'torch::jit::AliasDb' first required here 200 \| AliasDb::~AliasDb() = default; \| ^ ../torch/csrc/jit/ir/alias_analysis.h:298:10: note: forward declaration of 'torch::jit::AliasDb::WriteRegistry' 298 \| struct WriteRegistry; \| ^ 1 error generated. ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/148758 Approved by: https://github.com/Skylion007	2025-03-08 04:56:42 +00:00
Nikita Shulga	6602e632cd	Suppress build warnings when gcc-11 is used (#148763 ) Decorate the header with `C10_DIAGNOSTIC_PUSH_AND_IGNORED_IF_DEFINED("-Wmismatched-new-delete")` to suppress the following error (when building against the ancient llvm-9) ``` In file included from /var/lib/jenkins/workspace/torch/csrc/jit/tensorexpr/llvm_codegen.cpp:24: /opt/llvm/include/llvm/IR/IRBuilder.h: In member function 'llvm::LoadInst* llvm::IRBuilder<T, Inserter>::CreateLoad(llvm::Type, llvm::Value, const llvm::Twine&) [with T = llvm::ConstantFolder; Inserter = llvm::IRBuilderDefaultInserter]': /opt/llvm/include/llvm/IR/IRBuilder.h:1581:19: error: 'static void llvm::User::operator delete(void)' called on pointer returned from a mismatched allocation function [-Werror=mismatched-new-delete] 1581 \| return Insert(new LoadInst(Ty, Ptr), Name); \| ^~~~~~~~~~~~~~~~~~~~~ /opt/llvm/include/llvm/IR/IRBuilder.h:1581:19: note: returned from 'static void llvm::UnaryInstruction::operator new(size_t)' ``` A reasonable follow-up would probably be to disable NNC testing altogether, as the project has been in maintenance mode for a while now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148763 Approved by: https://github.com/Skylion007, https://github.com/ZainRizvi, https://github.com/atalman ghstack dependencies: #148739	2025-03-07 20:43:35 +00:00
PyTorch MergeBot	abcca2fcbb	Revert "Fix `torch.nn.functional.hardswish` gradients corner case (#148049 )" This reverts commit `29b28e9d9f`. Reverted https://github.com/pytorch/pytorch/pull/148049 on behalf of https://github.com/soulitzer due to This may be causing an accuracy failure on inductor ([comment](https://github.com/pytorch/pytorch/pull/148049#issuecomment-2706839169))	2025-03-07 16:05:56 +00:00
zeshengzong	29b28e9d9f	Fix `torch.nn.functional.hardswish` gradients corner case (#148049 ) Fixes #147801 ## Changes - Change hardswish gradient compute condition as [torch.nn.functional.hardswish](https://pytorch.org/docs/stable/generated/torch.nn.functional.hardswish.html) - Enable cuda for test `test_hardswish_grad_corner` - Add test case for value=-3 ## Test Result ```bash pytest test/test_nn.py -k test_hardswish pytest test/test_unary_ufuncs.py -k test_hardswish pytest test/inductor/test_torchinductor.py -k test_hardswish ``` ![image](https://github.com/user-attachments/assets/000cb5c4-15f5-4bfd-ab45-f52bf810ff3d) ![image](https://github.com/user-attachments/assets/38b08cf8-ea84-47a2-8e37-0a213da3e0c8) ![image](https://github.com/user-attachments/assets/54bc57be-2c57-46cc-ab90-94ea6cbe1c34) Pull Request resolved: https://github.com/pytorch/pytorch/pull/148049 Approved by: https://github.com/soulitzer	2025-03-06 19:04:52 +00:00
Mikayla Gawarecki	be0ceee1c3	Make record/storage alignment in torch.save configurable (#147788 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/147788 Approved by: https://github.com/albanD ghstack dependencies: #147786, #147787	2025-03-06 12:04:46 +00:00
cyy	9aa897b992	Remove unnecessary tensor clone (#148159 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/148159 Approved by: https://github.com/Skylion007	2025-03-02 16:21:39 +00:00
Richard Barnes	5301710b15	[codemod] Fix unused-value issue in caffe2/aten/src/ATen/cuda/detail/CUDAHooks.cpp +4 (#147555 ) Summary: LLVM has a warning `-Wunused-value` which we treat as an error because it's so often diagnostic of a code issue. Unused values often indicate a programming mistake, but can also just be unnecessary cruft that harms readability and performance. For questions/comments, contact r-barnes. - If you approve of this diff, please use the "Accept & Ship" button :-) Test Plan: Sandcastle Differential Revision: D69945678 Pull Request resolved: https://github.com/pytorch/pytorch/pull/147555 Approved by: https://github.com/Skylion007, https://github.com/eqy	2025-03-01 19:46:13 +00:00
Hoa Dinh	687fe64667	Fix crash in -[PTMCoreMLCompiler _compileModel:atPath:] (#147809 ) Summary: We could hit one of those exceptions: https://github.com/apple/coremltools/blob/main/modelpackage/src/ModelPackage.cpp#L205-L225 And it would make this code path crash. Test Plan: build. Differential Revision: D70122378 Pull Request resolved: https://github.com/pytorch/pytorch/pull/147809 Approved by: https://github.com/mcr229	2025-02-25 20:56:16 +00:00
cyy	b61a556427	Turn onnx functions into static (#147598 ) To avoid exposing ONNX symbols. Pull Request resolved: https://github.com/pytorch/pytorch/pull/147598 Approved by: https://github.com/justinchuby	2025-02-21 07:40:28 +00:00
Kevin Fu	4986f0f52e	[PT2]: allow empty dict to pass type check (#147167 ) (#147480 ) Summary: Seeing errors like the following when testing sigmoid for inline_cvr and perevent_cvr models. ``` terminate called after throwing an instance of 'c10::Error' what(): forward() Expected a value of type 'Dict[int, Tuple[Tensor, Tensor, Tensor]]' for argument 'event_based_features' but instead found type 'Dict[Any, Any]'. ``` Let empty dict pass type check. please, do NOT use any of the following flags, those are result of manual interventions in other parts of the system, misuse of them can be very painful for both detect and recover: Test Plan: ``` MODEL_ENTITY_ID=691508446 SNAPSHOT_ID=0 OTHER_MODEL_ENTITY_ID=649645886 OTHER_SNAPSHOT_ID=0 MODULE=local buck2 run mode/opt caffe2/torch/fb/model_transform/fx2trt/packaging:load_net_predictor -- \ --loadMode=BenchmarkAB \ --inputNetFile=/data/users/${USER}/models/${MODEL_ENTITY_ID}/${SNAPSHOT_ID}/${MODEL_ENTITY_ID}_${SNAPSHOT_ID}${suffix} \ --otherNetFile=/data/users/${USER}/models/${OTHER_MODEL_ENTITY_ID}/${OTHER_SNAPSHOT_ID}/${OTHER_MODEL_ENTITY_ID}_${OTHER_SNAPSHOT_ID}${suffix} \ --moduleName=${module} \ --submodToDevice "" \ --benchmarkDontRebatchSamples=true \ --sampleInputFilePath=/data/users/${USER}/models/${MODEL_ENTITY_ID}/${SNAPSHOT_ID}/archive_.predictor.disagg.gpu.local/data/sample_inputs/local.pt ``` Reviewed By: yjhao Differential Revision: D69871393 Pull Request resolved: https://github.com/pytorch/pytorch/pull/147480 Approved by: https://github.com/henryoier, https://github.com/jeanschmidt	2025-02-21 07:00:46 +00:00
Michal Gallus	d9cf1debf9	[ROCm][Windows] Fix clang-cl error related to -Wmissing prototypes enabled (#146981 ) Some of the Windows files (fused_kernels.cpp or temp_file.h) contain code that fails to compile with clang-cl when this flag is enabled. This PR resolves the issue by ensuring that those flags are not included on Windows, even when building with clang-cl. Alternatively, if needed, I can fix the files mentioned so that they pass under this flag. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146981 Approved by: https://github.com/cyyever, https://github.com/Skylion007	2025-02-18 07:41:12 +00:00
Zhou Fang	a8fa4bcfd2	[StaticRuntime] Support a new pattern (aten::to with 5 inputs) for ClipRangesToGatherToOffsets (#147189 ) Summary: Support the following new pattern for ClipRangesToGatherToOffsets: Before optimization: ``` %11175 : Tensor, %11176 : Tensor = fb::clip_ranges_gather(%int_66.1, %getitem_1784.1, %347) %getattr_256.1 : int = prim::dtype(%11175) %to_298.1 : Tensor = aten::to(%11176, %getattr_256.1, %13, %13, %12) %lengths_to_offsets_333.1 : Tensor = fb::lengths_to_offsets(%to_298.1, %8) ``` After optimization: ``` %11199 : int = prim::dtype(%int_66.1) %11200 : Tensor, %11201 : Tensor = fb::clip_ranges_gather_to_offsets(%int_66.1, %getitem_1784.1, %347, %8, %11199) ``` It is similar to https://github.com/pytorch/pytorch/pull/146931, but aten::to has 5 inputs instead of 4. Differential Revision: D69627793 Pull Request resolved: https://github.com/pytorch/pytorch/pull/147189 Approved by: https://github.com/hanyilou123	2025-02-16 22:16:02 +00:00
cyy	8f291e8c00	Fix clang-tidy warnings in torch/jit (#146963 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/146963 Approved by: https://github.com/davidberard98	2025-02-15 03:36:59 +00:00
Zhou Fang	d774a6333d	[StaticRuntime] Support a new pattern for ClipRangesToGatherToOffsets (#146931 ) Summary: Support the following new pattern for ClipRangesToGatherToOffsets: Before optimization: ``` %18267 : Tensor, %18268 : Tensor = fb::clip_ranges_gather(%int_77.1, %getitem_2484.1, %493) %getattr_368.1 : int = prim::dtype(%18267) %to_443.1 : Tensor = aten::to(%18268, %getattr_368.1, %self._maybe_compute_kjt_to_jt_dict.is_weighted, %self._maybe_compute_kjt_to_jt_dict.is_weighted) %lengths_to_offsets_490.1 : Tensor = fb::lengths_to_offsets(%to_443.1, %8) ``` After optimization: ``` %18297 : int = prim::dtype(%int_77.1) %18298 : Tensor, %18299 : Tensor = fb::clip_ranges_gather_to_offsets(%int_77.1, %getitem_2484.1, %493, %8, %18297) ``` Reviewed By: garroud Differential Revision: D69373835 Pull Request resolved: https://github.com/pytorch/pytorch/pull/146931 Approved by: https://github.com/hanyilou123	2025-02-12 08:19:41 +00:00
Zhou Fang	fc5913b6bf	[StaticRuntime] Fix a bug that memory planner ignores subblocks (#146728 ) (#146855 ) Summary: When a Static Runtime graph node has sub-blocks, the memory planner does not treat the sub-blocks' inputs as inputs of that node. As a result, the lifetimes of such inputs are computed incorrectly, and the corresponding tensor memory is released earlier than required, which causes errors. Differential Revision: D69195886 Pull Request resolved: https://github.com/pytorch/pytorch/pull/146855 Approved by: https://github.com/swolchok	2025-02-11 13:59:54 +00:00
cyy	25aa7ca62d	Cleanup CallOnce.h (#146700 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/146700 Approved by: https://github.com/albanD	2025-02-07 16:44:45 +00:00
Michal Gallus	9ea1823f96	[ROCm][Windows] Remove external linkage from an anonymous namespace (#146607 ) Fixes a clang-cl compiler error caused by an attempt to export a symbol that doesn't have any external linkage, since it's declared within a local anonymous namespace. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146607 Approved by: https://github.com/jeffdaily	2025-02-06 23:48:20 +00:00
Michael Suo	99dd846672	[torch] fix builds for older pybind (#146630 ) Summary: some versions of pybind we build with don't have `py::set_error`. So just use the underlying python C API. Test Plan: unit tests Differential Revision: D69254629 Pull Request resolved: https://github.com/pytorch/pytorch/pull/146630 Approved by: https://github.com/colin2328, https://github.com/ngimel	2025-02-06 21:22:00 +00:00
Michael Suo	425804db2b	[torch] fix exception types in custom class magic setattr/getattr (#146516 ) Summary: `c10::AttributeError` is not automatically converted to Python's AttributeError; it needs some special macros (e.g. `HANDLE_TH_ERRORS`). Some Python functions like `hasattr` rely on the type of the thrown exception being correct. We don't need the full generality of those macros, so just do a targeted error type conversion here. Test Plan: added unit test Differential Revision: D69197217 Pull Request resolved: https://github.com/pytorch/pytorch/pull/146516 Approved by: https://github.com/zdevito	2025-02-06 02:14:11 +00:00
cyy	6293d1446b	[2/N] Remove NOLINT suppressions (#146402 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/146402 Approved by: https://github.com/soulitzer	2025-02-05 08:38:52 +00:00
PyTorch MergeBot	00dc5b10f6	Revert "[Environment Variable][7/N] Use thread-safe getenv functions (#140211 )" This reverts commit `2fd1b6b361`. Reverted https://github.com/pytorch/pytorch/pull/140211 on behalf of https://github.com/atalman due to Breaks executorch tests ([comment](https://github.com/pytorch/pytorch/pull/140211#issuecomment-2632202864))	2025-02-03 22:04:28 +00:00
cyy	2fd1b6b361	[Environment Variable][7/N] Use thread-safe getenv functions (#140211 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/140211 Approved by: https://github.com/ezyang, https://github.com/eqy	2025-02-01 12:33:41 +00:00
Mikayla Gawarecki	001e355a56	Add option to serialization config to reduce random reads from get_record_offset when loading with mmap=True (#143880 ) ## Background This PR adds `torch.utils.serialization.config.load.calculate_storage_offsets`. This option relies on the previous PR in this stack, where storage order was changed to non lexicographical. A `.format_version` entry was added to the zipfile and `calculate_storage_offsets` will only work on checkpoints with `.format_version`. When this is turned on, for `torch.load(mmap=True)`, offsets of each storage record (other than the 0th storage) will be calculated instead of relying on `miniz` APIs to determine this. The existing APIs will issue multiple random reads (reading the end of central directory record, then reading the zipfile header for the record) to determine the storage offset where the record starts. This can greatly degrade `torch.load(mmap=True)` performance for non-filesystem cases. `6aaae9d78f/caffe2/serialize/inline_container.cc (L589-L605)` ## How does this work The format for the checkpoint is as such ``` archive_name/ \|_ data.pkl \|_.format_version \|_byteorder \|_data/ \|_ 0 \|_ 1 \|_ 2 \|_ ... \|_ ``` Each `data/i` record represents a storage, where storages are written in the order that the Pickler encounters them. For each storage, our `persistent_load` logic saves the following metadata to the pickle file `dtype, numel, key, location` where `numel` is the number of bytes in the storage. 
Note that we always use `miniz` writer in the zip64 mode per [here](`7796e308d0/caffe2/serialize/inline_container.cc (L701)`) A zipfile record written by miniz looks as such ``` ---------------- ----------------- ------------------- ---------------- --------- ------------------------------ \| 30 byte header \| n byte filename \| zip64_extra_data \| m byte padding \| storage \| 16 or 24 byte local dir footer \| ---------------- ----------------- ------------------- ---------------- --------- ------------------------------ ``` - The header size (30) is given by [`MZ_ZIP_LOCAL_DIR_HEADER_SIZE`](https://github.com/pytorch/pytorch/blob/main/third_party/miniz-3.0.2/miniz.c#L3290) - filename will be `"{archive_name}/{filepath}"` - `zip64_extra_data` is determined by [`mz_zip_writer_create_zip64_extra_data`](`7796e308d0/third_party/miniz-3.0.2/miniz.c (L6202)`). Note that [we only create zip64_extra_data if storage_size >= 0xFFFFFFFF or the offset of the start of the header >= 0xFFFFFFFF](`7796e308d0/third_party/miniz-3.0.2/miniz.c (L6519-L6524)`) - `m` is determined by [`getPadding`](`7796e308d0/caffe2/serialize/inline_container.cc (L254)`), which accounts for filename, zip64_extra_data to determine `m` such that the start of `storage` is aligned to 64 bytes. The `m` bytes will always start with `F B padding_size` as the first 4 bytes - The local dir footer size is determined based on [this snippet ](`7796e308d0/third_party/miniz-3.0.2/miniz.c (L6610-L6632)`): if the buffer size is 0 it is skipped. If the zip64_extra_data was created, it is 24, otherwise it is 16. 
When `torch.utils.serialization.config.load.calculate_storage_offsets` is set we do the following - We keep track of where the "cursor" is in the file using `current_offset`, after each persistent_load call, it will be at the offset where the header for the next record starts - for the 0th storage, "data/0", we use the regular get_record_offset to determine the start of the storage - for any other storage, (where the storages will be in order encountered by the unpickler, 0, 1, 2, 3, ...) we use `get_record_offset_no_read`, which re-uses the `getPadding` logic to determine the offset of the storage - Note that `load_tensor` will only ever be called again with the same key if the storage's `._data_ptr()` is 0 [[pointer1](https://github.com/pytorch/pytorch/blob/main/torch/serialization.py#L1917-L1918)][[pointer2](https://github.com/pytorch/pytorch/blob/main/torch/serialization.py#L1936-L1937)], so we cache the offsets for this edge case - After each storage, if the storage is non-zero, we account for the local dir footer based on the logic described above ## Testing strategy The agreed upon testing strategy was as follows: - Add debug code gated by an environment flag `TORCH_SERIALIZATION_DEBUG` that will run this offset calculation logic and verify it against getRecordOffset for each storage (when mmap=False) - This flag is set throughout CI, which means that every time `torch.load` is called, the offset calculation logic is implicitly being tested. Differential Revision: [D67673026](https://our.internmc.facebook.com/intern/diff/D67673026) Pull Request resolved: https://github.com/pytorch/pytorch/pull/143880 Approved by: https://github.com/albanD ghstack dependencies: #143879	2025-01-31 17:09:20 +00:00
Manav Avlani	f9227e7c33	Expose ToIValueAllowNumbersAsTensors to TORCH_PYTHON_API so we can use it in monarch (#146087 ) Summary: TSIA Test Plan: Tested up the stack but existing unittests Reviewed By: suo Differential Revision: D68917233 Pull Request resolved: https://github.com/pytorch/pytorch/pull/146087 Approved by: https://github.com/suo	2025-01-31 05:08:11 +00:00
PyTorch MergeBot	284f217011	Revert "[Environment Variable][7/N] Use thread-safe getenv functions (#140211 )" This reverts commit `97b3b73f3e`. Reverted https://github.com/pytorch/pytorch/pull/140211 on behalf of https://github.com/ZainRizvi due to Sorry but this is failing internally. @eqy @ezyang can you please help this get remerged? See D68779772. ([comment](https://github.com/pytorch/pytorch/pull/140211#issuecomment-2622504898))	2025-01-29 18:24:29 +00:00
cyyever	97b3b73f3e	[Environment Variable][7/N] Use thread-safe getenv functions (#140211 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/140211 Approved by: https://github.com/ezyang, https://github.com/eqy	2025-01-28 15:21:12 +00:00
PyTorch MergeBot	9010649292	Revert "Add option to serialization config to reduce random reads from get_record_offset when loading with mmap=True (#143880 )" This reverts commit `db3685a35c`. Reverted https://github.com/pytorch/pytorch/pull/143880 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but either this PR or the base PR breaks distributed tests ([comment](https://github.com/pytorch/pytorch/pull/143880#issuecomment-2617743403))	2025-01-28 03:07:17 +00:00
Mikayla Gawarecki	db3685a35c	Add option to serialization config to reduce random reads from get_record_offset when loading with mmap=True (#143880 ) ## Background This PR adds `torch.utils.serialization.config.load.calculate_storage_offsets`. This option relies on the previous PR in this stack, where storage order was changed to non lexicographical. A `.format_version` entry was added to the zipfile and `calculate_storage_offsets` will only work on checkpoints with `.format_version`. When this is turned on, for `torch.load(mmap=True)`, offsets of each storage record (other than the 0th storage) will be calculated instead of relying on `miniz` APIs to determine this. The existing APIs will issue multiple random reads (reading the end of central directory record, then reading the zipfile header for the record) to determine the storage offset where the record starts. This can greatly degrade `torch.load(mmap=True)` performance for non-filesystem cases. `6aaae9d78f/caffe2/serialize/inline_container.cc (L589-L605)` ## Testing strategy The agreed upon testing strategy was as follows: - Add debug code gated by an environment flag `TORCH_SERIALIZATION_DEBUG` that will run this offset calculation logic and verify it against getRecordOffset for each storage (when mmap=False) - This flag is set throughout CI, which means that every time `torch.load` is called, the offset calculation logic is implicitly being tested. Differential Revision: [D67673026](https://our.internmc.facebook.com/intern/diff/D67673026) Pull Request resolved: https://github.com/pytorch/pytorch/pull/143880 Approved by: https://github.com/albanD ghstack dependencies: #143879	2025-01-27 23:57:30 +00:00
c8ef	a989a0b13a	[NFC] Fix some minor typos. (#145599 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145599 Approved by: https://github.com/Skylion007	2025-01-24 18:58:59 +00:00
cyy	379bbef23c	Enable more C++ warnings (#143355 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/143355 Approved by: https://github.com/albanD	2024-12-27 05:46:57 +00:00
PyTorch MergeBot	9255ffc841	Revert "Enable more C++ warnings (#143355 )" This reverts commit `daa3ffe0eb`. Reverted https://github.com/pytorch/pytorch/pull/143355 on behalf of https://github.com/malfet due to It fails internal build system as it kind of breaks separation between native and native/cpu ([comment](https://github.com/pytorch/pytorch/pull/143355#issuecomment-2562961546))	2024-12-26 17:13:10 +00:00
cyy	daa3ffe0eb	Enable more C++ warnings (#143355 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/143355 Approved by: https://github.com/albanD	2024-12-21 09:19:02 +00:00
Sean Xiao	e4301aeaa5	[ODML] Make the ML feature provider thread safe (#143418 ) Summary: This PR is generated from a Meta-internal diff, aiming to resolve a crash caused by a race condition on the dictionary. Test Plan: Build and run. Print out the count/name/value of the dictionary and check that the values are read, set, and removed correctly. Observe the print statement on app start within IG @diff-train-skip-merge Pull Request resolved: https://github.com/pytorch/pytorch/pull/143418 Approved by: https://github.com/shoumikhin	2024-12-19 04:47:56 +00:00
James	d4ed5941db	Fix floating point literals in IRPrinter (#142119 ) Fixes #114035 This is a recreation of #140002 with approval from its author. Original description: > When v is larger than 1e16, the output format is wrong. For example, when v is 1.2e17 the output is 1.2e17.f, which contains two '.' characters. Pull Request resolved: https://github.com/pytorch/pytorch/pull/142119 Approved by: https://github.com/jgong5, https://github.com/malfet	2024-12-18 21:59:48 +00:00
cyy	e9f6045e80	[15/N] Fix extra warnings brought by clang-tidy-17 (#143100 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/143100 Approved by: https://github.com/Skylion007	2024-12-14 03:24:10 +00:00
Richard Barnes	82ce888273	c10::string_view -> std::string_view in more places (#142517 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/142517 Approved by: https://github.com/malfet	2024-12-12 19:45:59 +00:00
Richard Barnes	7e41717a26	c10::string_view -> std::string_view in caffe2/jit (#142383 ) Test Plan: Sandcastle Differential Revision: D66939979 Pull Request resolved: https://github.com/pytorch/pytorch/pull/142383 Approved by: https://github.com/malfet	2024-12-10 15:42:28 +00:00
cyyever	a108b282ff	[4/N] Avoid copy in std::get (#142285 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/142285 Approved by: https://github.com/Skylion007 Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>	2024-12-09 07:59:35 +00:00
Richard Barnes	46dc2965de	Adding missing space to pybind_utils.h error message (#142258 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/142258 Approved by: https://github.com/Skylion007	2024-12-08 20:46:32 +00:00
Richard Barnes	17f1a42c13	Add missing py::bytes to pybind_utils tryToInferType (#142265 ) I'm not sure what the best way to fix this is, but this does unbreak an internal test. Test Plan: Sandcastle Reviewed By: itamaro Pull Request resolved: https://github.com/pytorch/pytorch/pull/142265 Approved by: https://github.com/houseroad	2024-12-07 20:31:57 +00:00
cyy	ab5467897a	Fix NOLINTNEXTLINE (#141794 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/141794 Approved by: https://github.com/Skylion007 Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>	2024-12-02 19:22:00 +00:00
PyTorch MergeBot	eb7deb2db5	Revert "Fix NOLINTNEXTLINE (#141794 )" This reverts commit `7dd9b5fc43`. Reverted https://github.com/pytorch/pytorch/pull/141794 on behalf of https://github.com/atalman due to [GH job link](https://github.com/pytorch/pytorch/actions/runs/12087979418/job/33711943084) [HUD commit link](`7dd9b5fc43`) ([comment](https://github.com/pytorch/pytorch/pull/141794#issuecomment-2511789484))	2024-12-02 15:07:50 +00:00
cyy	e29dabbd71	Fix performance-unnecessary-copy-initialization (#141792 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/141792 Approved by: https://github.com/Skylion007	2024-11-29 22:10:06 +00:00
cyyever	7dd9b5fc43	Fix NOLINTNEXTLINE (#141794 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/141794 Approved by: https://github.com/Skylion007 Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>	2024-11-29 16:23:59 +00:00
cyy	45ed7c13fa	Remove unneeded std::make_optional (#141567 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/141567 Approved by: https://github.com/albanD	2024-11-28 00:05:21 +00:00
Richard Barnes	fca0f34b83	Switch c10::string_view to std::string_view (#139635 ) Shortens `string_view_starts_with` to `starts_with`. Adds some missing headers. Isolates `c10_string_view` to use with `get_fully_qualified_name`. Test Plan: Sandcastle Reviewed By: ezyang Differential Revision: D64833558 Pull Request resolved: https://github.com/pytorch/pytorch/pull/139635 Approved by: https://github.com/Skylion007, https://github.com/ezyang	2024-11-27 01:41:18 +00:00
Richard Barnes	cb8c956b5f	Fix PyBind 2.10.4 compatibility issue in caffe2/torch/csrc/dynamo/guards.cpp +2 (#141456 ) Summary: See D65023502 and [here](https://fb.workplace.com/groups/mldp.users/permalink/8706556336131960/) for details. Test Plan: Sandcastle Reviewed By: itamaro Differential Revision: D66395491 Pull Request resolved: https://github.com/pytorch/pytorch/pull/141456 Approved by: https://github.com/Skylion007	2024-11-24 21:05:48 +00:00
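
The record layout described in the message of commit `001e355a56` (#143880) can be sketched in a few lines. This is only an illustration of the arithmetic stated there (30-byte header, 64-byte storage alignment, 16- or 24-byte footer), not the actual `getPadding` or `get_record_offset_no_read` code. The function name `record_layout`, its parameters, and the choice to count the 4-byte `F B padding_size` tag as part of `m` are assumptions made for this example, and the size of `zip64_extra_data` is simply passed in rather than modelled.

```python
HEADER_SIZE = 30      # local dir header size stated in the commit message
ALIGNMENT = 64        # storages start on a 64-byte boundary
PADDING_TAG_SIZE = 4  # assumption: the 4-byte "F B padding_size" tag is part of m


def record_layout(cursor, filename_len, storage_size, zip64_extra_len=0):
    """Return (m, storage_start, next_cursor) for one record.

    cursor is the offset at which this record's 30-byte header starts.
    zip64_extra_len is supplied by the caller (hypothetical parameter).
    """
    before_padding = cursor + HEADER_SIZE + filename_len + zip64_extra_len
    # Smallest m >= 4 such that before_padding + m is a multiple of 64.
    m = PADDING_TAG_SIZE + (-(before_padding + PADDING_TAG_SIZE)) % ALIGNMENT
    storage_start = before_padding + m
    if storage_size == 0:
        footer = 0    # the footer is skipped for empty storages
    elif zip64_extra_len > 0:
        footer = 24   # zip64_extra_data was created
    else:
        footer = 16
    return m, storage_start, storage_start + storage_size + footer


# First record of a hypothetical archive named "archive".
m, start, next_cursor = record_layout(0, len("archive/data/0"), 100)
```

Feeding `next_cursor` back in as the `cursor` of the following record mirrors the "cursor" bookkeeping the commit message describes for every storage after `data/0`.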

7509 Commits
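
For readers following the `hardswish` gradient commits above (#148049, landed as `29b28e9d9f`, reverted, then relanded as `97272e4b49`), here is a minimal pure-Python sketch of the piecewise definition given in the linked `torch.nn.functional.hardswish` documentation, with a derivative that uses the same boundary conditions. It is not the PyTorch kernel; the function names are hypothetical, and assigning the boundary points x = -3 and x = 3 to the outer pieces is an assumption based on the commit message's wording ("change hardswish gradient compute condition as" the documented function).

```python
def hardswish(x):
    """Documented definition: 0 if x <= -3, x if x >= 3, else x * (x + 3) / 6."""
    if x <= -3.0:
        return 0.0
    if x >= 3.0:
        return x
    return x * (x + 3.0) / 6.0


def hardswish_grad(x):
    """Derivative using the same conditions as the forward definition."""
    if x <= -3.0:
        return 0.0
    if x >= 3.0:
        return 1.0
    # d/dx [x * (x + 3) / 6] = (2x + 3) / 6
    return (2.0 * x + 3.0) / 6.0


corner_values = [hardswish_grad(v) for v in (-3.0, 0.0, 3.0)]
```

The corner case matters because the interior formula `(2x + 3) / 6` evaluates to -0.5 at x = -3 and 1.5 at x = 3, so the result at those two points depends entirely on which piece the comparison assigns them to.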