Commit Graph

903 Commits

Author SHA1 Message Date
PyTorch MergeBot
6b59a19242 Revert "[RELAND] Always build USE_DISTRIBUTED (#160449) and Make distributed modules importable even when backend not built (#159889) (#162594)"
This reverts commit 6e8f17c580.

Reverted https://github.com/pytorch/pytorch/pull/162594 on behalf of https://github.com/huydhn due to Reverted internally ([comment](https://github.com/pytorch/pytorch/pull/162594#issuecomment-3283985880))
2025-09-12 06:52:03 +00:00
Edward Yang
6e8f17c580 [RELAND] Always build USE_DISTRIBUTED (#160449) and Make distributed modules importable even when backend not built (#159889) (#162594)
Summary:
Original: D81957844 and D81957923

Also, https://github.com/pytorch/pytorch/pull/162142 is patched in as well

#buildall

Test Plan:
sandcastle and oss ci

Reviewed By: H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162594
Approved by: https://github.com/H-Huang, https://github.com/dcci
2025-09-12 03:56:18 +00:00
Edward Yang
dda071587f Revert "Make distributed modules importable even when backend not built (#159889)" (#162568)
This reverts commit a0d026688c.

Revert "Always build USE_DISTRIBUTED. (#160449)"

This reverts commit d80297a684.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162568
Approved by: https://github.com/huydhn
2025-09-10 04:29:42 +00:00
Scott Wolchok
0e7ccc09db [easy] Don't force copy result of getAllOperatorsFor in init.cpp (#162218)
It returns a const reference to a vector.
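A minimal sketch of the distinction, with a hypothetical getter standing in for `getAllOperatorsFor`:

```
#include <string>
#include <vector>

// Hypothetical stand-in for a getter that, like getAllOperatorsFor,
// returns a const reference to an internally cached vector.
const std::vector<std::string>& allOperators() {
  static const std::vector<std::string> ops = {"aten::add", "aten::mul"};
  return ops;
}

int main() {
  auto copied = allOperators();         // `auto` deduces by value: copies the vector
  const auto& viewed = allOperators();  // binds to the reference: no copy
  return copied.size() == viewed.size() ? 0 : 1;
}
```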

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162218
Approved by: https://github.com/Skylion007
ghstack dependencies: #161591, #161595, #161633, #161634, #161692, #162219, #162220
2025-09-10 00:08:15 +00:00
Scott Wolchok
dcc42e95f4 Fix missing moves in initJITBindings (#162428)
Per @Skylion007 on #162219

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162428
Approved by: https://github.com/Skylion007
2025-09-09 08:47:33 +00:00
Scott Wolchok
a8a187b2cf Overload _get_operation_for_overload_or_packet & friends to accept ArrayRef (#162219)
Avoids requiring a vector allocation to call this.
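The general shape of the change, sketched with hypothetical names (assumes a PyTorch build environment for `c10::ArrayRef`):

```
#include <c10/util/ArrayRef.h>
#include <vector>

// After: an ArrayRef overload views any contiguous int sequence
// (std::vector, C array, ...) without a heap allocation.
int sumAll(c10::ArrayRef<int> xs) {
  int total = 0;
  for (int x : xs) total += x;
  return total;
}

// Before-style entry point, forwarding to the ArrayRef version.
int sumAll(const std::vector<int>& xs) {
  return sumAll(c10::ArrayRef<int>(xs));
}

int main() {
  std::vector<int> v{1, 2, 3};
  int a[] = {4, 5};  // callers can now pass a stack array directly
  return (sumAll(v) == 6 && sumAll(c10::ArrayRef<int>(a)) == 9) ? 0 : 1;
}
```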

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162219
Approved by: https://github.com/Skylion007
ghstack dependencies: #161591, #161595, #161633, #161634, #161692
2025-09-09 01:10:06 +00:00
Scott Wolchok
a951f435fd Avoid redundant PyTuple_GetSize call in _maybe_handle_torch_function (#161633)
`py::args::size()` calls `PyTuple_Size`. The compiler can't know that two such calls will always return the same result, so we have to consolidate them ourselves.
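A sketch of the consolidation (hypothetical function; assumes pybind11):

```
#include <pybind11/pybind11.h>
namespace py = pybind11;

// Before: each args.size() call goes through PyTuple_Size, and the
// compiler can't prove both calls return the same value.
bool needs_dispatch_before(const py::args& args) {
  return args.size() > 0 && args.size() < 8;
}

// After: read the size once and reuse it.
bool needs_dispatch_after(const py::args& args) {
  const auto nargs = args.size();
  return nargs > 0 && nargs < 8;
}
```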

Differential Revision: [D81530096](https://our.internmc.facebook.com/intern/diff/D81530096)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161633
Approved by: https://github.com/ezyang, https://github.com/Skylion007
ghstack dependencies: #161591, #161595
2025-09-09 01:10:06 +00:00
Edward Yang
d80297a684 Always build USE_DISTRIBUTED. (#160449)
Signed-off-by: Edward Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160449
Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/dcci
2025-09-08 19:10:36 +00:00
Scott Wolchok
49c446c617 Add C++ function for torch.distributed.tensor._op_schema.is_view_op (#161595)
This one seems to have been especially slow because of the repeated pybind accesses (`schema` is a pybind property, as is `arguments`, and then we hit each argument). It's still ~1% of total benchmark runtime because of the repeated single pybind function call, but that's a lot better.
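A plausible shape for such a C++ check, as a sketch (not the exact implementation): walk the schema once in C++ and return a single bool, replacing the chain of per-argument pybind accesses:

```
#include <ATen/core/function_schema.h>

// Sketch: treat an op as view-like if some output aliases an input
// without writing to it. One C++ call replaces repeated Python-side
// schema/arguments/alias_info attribute accesses.
bool is_view_op(const c10::FunctionSchema& schema) {
  for (const auto& ret : schema.returns()) {
    const auto* alias = ret.alias_info();
    if (alias != nullptr && !alias->isWrite()) {
      return true;
    }
  }
  return false;
}
```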

Differential Revision: [D81530095](https://our.internmc.facebook.com/intern/diff/D81530095)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161595
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
ghstack dependencies: #161466, #161586, #161590, #161591
2025-09-08 16:28:08 +00:00
PyTorch MergeBot
1e0656f063 Revert "Always build USE_DISTRIBUTED. (#160449)"
This reverts commit de893e96c7.

Reverted https://github.com/pytorch/pytorch/pull/160449 on behalf of https://github.com/jeanschmidt due to internal changes breaks import checks, see [D81845053](https://www.internalfb.com/diff/D81845053) ([comment](https://github.com/pytorch/pytorch/pull/160449#issuecomment-3264887002))
2025-09-08 07:04:36 +00:00
Edward Yang
de893e96c7 Always build USE_DISTRIBUTED. (#160449)
Signed-off-by: Edward Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160449
Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/dcci
2025-09-05 20:15:11 +00:00
PyTorch MergeBot
adae7f66aa Revert "Always build USE_DISTRIBUTED. (#160449)"
This reverts commit c37103234a.

Reverted https://github.com/pytorch/pytorch/pull/160449 on behalf of https://github.com/jeanschmidt due to Breaking internal build rules, see D81756619 ([comment](https://github.com/pytorch/pytorch/pull/160449#issuecomment-3259430011))
2025-09-05 18:58:47 +00:00
Edward Yang
c37103234a Always build USE_DISTRIBUTED. (#160449)
Signed-off-by: Edward Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160449
Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/dcci
2025-09-04 19:43:17 +00:00
PyTorch MergeBot
b7dad7dd49 Revert "Always build USE_DISTRIBUTED. (#160449)"
This reverts commit 90b08643c3.

Reverted https://github.com/pytorch/pytorch/pull/160449 on behalf of https://github.com/jeanschmidt due to Already discussed with @ezyang about the internal quirks and errors ([comment](https://github.com/pytorch/pytorch/pull/160449#issuecomment-3254219358))
2025-09-04 15:25:07 +00:00
Edward Yang
90b08643c3 Always build USE_DISTRIBUTED. (#160449)
Signed-off-by: Edward Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160449
Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/dcci
2025-09-03 07:33:55 +00:00
PyTorch MergeBot
4e42aa8ffc Revert "Always build USE_DISTRIBUTED. (#160449)"
This reverts commit b7034e9c92.

Reverted https://github.com/pytorch/pytorch/pull/160449 on behalf of https://github.com/jeanschmidt due to Breaking internal builds, can't be landed with forward fix due to internal tooling problems ([comment](https://github.com/pytorch/pytorch/pull/160449#issuecomment-3246689684))
2025-09-02 20:28:42 +00:00
Edward Yang
b7034e9c92 Always build USE_DISTRIBUTED. (#160449)
Signed-off-by: Edward Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160449
Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/dcci
2025-09-01 23:00:21 +00:00
Scott Wolchok
5d35b49ba7 Fix forced copying def_property_readonly for FunctionSchema & friends (#161301)
This took me a bit to figure out, and I'm pretty sure I've looked at this code before. Pybind uses `return_value_policy::reference_internal` for `def_property`, which [causes the owning object to be kept alive for the lifespan of the return value](https://pybind11.readthedocs.io/en/stable/advanced/functions.html), allowing the getter to safely avoid copying the property value. However, lambdas act like they return `auto`, not `decltype(auto)`, so our lambdas themselves were forcing copies!

Testing: observed `std::vector<Argument>` copying disappear in a Linux perf profile of `someOpInfo._schema.arguments/returns` (in `_python_dispatch.correct_storage_aliasing`).
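A minimal illustration of the lambda pitfall (hypothetical bound types; assumes pybind11):

```
#include <pybind11/pybind11.h>
#include <vector>

namespace py = pybind11;

struct Arguments { std::vector<int> data; };
struct Schema { Arguments arguments; };

PYBIND11_MODULE(example, m) {
  py::class_<Arguments>(m, "Arguments");
  py::class_<Schema>(m, "Schema")
      // Copies: the lambda's deduced return type is Arguments (by
      // value), so the reference_internal policy never sees a reference.
      .def_property_readonly(
          "arguments_copy",
          [](const Schema& s) { return s.arguments; })
      // No copy: with an explicit const-reference return type,
      // reference_internal keeps the Schema alive and exposes the
      // existing Arguments object directly.
      .def_property_readonly(
          "arguments",
          [](const Schema& s) -> const Arguments& { return s.arguments; });
}
```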

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161301
Approved by: https://github.com/Skylion007, https://github.com/malfet, https://github.com/wconstab
2025-08-30 06:55:42 +00:00
Scott Wolchok
67457dbb9d Fix non-const reference arguments in torch/csrc/jit/python/init.cpp (#161300)
Shouldn't be any generated code impact, just fixing bad practice.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161300
Approved by: https://github.com/wconstab, https://github.com/malfet
ghstack dependencies: #161286
2025-08-29 19:01:32 +00:00
Nikita Shulga
d8cb3db533 Add unsigned support to IValue (#160102)
- Moved the repeated logic for saving an int64/uint64 into a polymorphic container into a single helper, `THPUtils_unpackInteger`
- Added `TestPythonDispatch.test_dispatch_uint64` regression test
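The underlying idea, sketched with CPython APIs (illustrative names, not the exact PyTorch helper):

```
#include <Python.h>
#include <cstdint>
#include <stdexcept>
#include <variant>

// Sketch: unpack a Python int, preferring int64 but falling back to
// uint64 when the value overflows the signed range, roughly the logic
// a helper like THPUtils_unpackInteger centralizes.
std::variant<int64_t, uint64_t> unpackInteger(PyObject* obj) {
  int overflow = 0;
  long long value = PyLong_AsLongLongAndOverflow(obj, &overflow);
  if (!overflow) {
    if (value == -1 && PyErr_Occurred()) {
      throw std::runtime_error("expected an int");
    }
    return static_cast<int64_t>(value);
  }
  // Out of int64 range: try the unsigned interpretation.
  unsigned long long uvalue = PyLong_AsUnsignedLongLong(obj);
  if (uvalue == static_cast<unsigned long long>(-1) && PyErr_Occurred()) {
    throw std::runtime_error("int too large for int64/uint64");
  }
  return static_cast<uint64_t>(uvalue);
}
```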

Fixes https://github.com/pytorch/pytorch/issues/159168

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160102
Approved by: https://github.com/ezyang
2025-08-11 03:57:18 +00:00
Yanan Cao (PyTorch)
731ee31f7b [TorchScript, PT2] Add torch._check compatibility support (#159988)
Summary:
Add support for torch._check() in the TorchScript jit.script frontend.

* It is special-cased to behave like `torch._assert`, i.e. turned into an `if` plus a raised exception.

Test Plan:
Unit tests

Differential Revision: D79744604

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159988
Approved by: https://github.com/davidberard98
2025-08-08 23:14:13 +00:00
Xuehai Pan
541584d22e [BE][8/16] fix typos in torch/ (torch/csrc/jit/) (#156318)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156318
Approved by: https://github.com/albanD
2025-07-02 22:55:29 +00:00
Jason Ansel
0596323c35 Better fix for __index__ SymInt issue (#157201)
This improves on #156928

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157201
Approved by: https://github.com/ezyang
2025-07-01 07:06:46 +00:00
Laith Sakka
74ebd8d14e use guard_or_false for expand utils reduction (#155868)
This is a classic broadcast-like pattern.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155868
Approved by: https://github.com/bobrenjc93
2025-06-21 23:42:19 +00:00
Yidi Wu
545fbd58dc [export] inline jit.scripted function in export (#155180)
When we export a scripted function, we inline the original callable stored in `_torchdynamo_inline`; this is the same strategy as the torch.compile path.

We do the same thing for scripted methods, where a `__wrapped__` attribute points to the original callable in most cases. There are some corner cases we identified: a top-level jit.scripted module's methods don't have a `__wrapped__`. In this case, we fall back to the original scripted approach. There may be more such cases, but they need verification.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155180
Approved by: https://github.com/zou3519
2025-06-10 20:34:12 +00:00
cyy
388912dd94 Remove AttributeError constructor (#154808)
It is a private API and uses C vsnprintf, which is not type safe.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154808
Approved by: https://github.com/Skylion007

Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>
2025-06-03 03:49:09 +00:00
PyTorch MergeBot
ef92653022 Revert "Remove AttributeError constructor (#154808)"
This reverts commit 3239da0c73.

Reverted https://github.com/pytorch/pytorch/pull/154808 on behalf of https://github.com/cyyever due to Need format code ([comment](https://github.com/pytorch/pytorch/pull/154808#issuecomment-2933286113))
2025-06-03 03:40:41 +00:00
Yuanyuan Chen
3239da0c73 Remove AttributeError constructor (#154808)
It is a private API and uses C vsnprintf, which is not type safe.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154808
Approved by: https://github.com/Skylion007

Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>
2025-06-03 02:18:51 +00:00
Aaron Gokaslan
0cd18ba1ca [BE][Ez] Update deprecated pybind11 functions (#154798)
* `getType()` is deprecated; replace it with the new/proper static methods. These are backwards compatible with the old pybind11 versions we support, so break this off before we upgrade to pybind11 3.0 (where these methods are dropped) in #154115

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154798
Approved by: https://github.com/jansel, https://github.com/cyyever
2025-06-01 06:17:50 +00:00
David Berard
a237831bc2 [JIT] Optimize DCE by storing a MemoryLocations for an entire set<Value*> (#153645)
Summary:
**TL;DR**: make DCE faster by replacing a Set<Value*> with a MemoryLocations sparse bitset (representing all the memory locations stored by the collection of all values in the set).

**Details**
The goal of this PR is to optimize this function from AliasDb:

```
bool AliasDb::writesToAlias(Node* n, const ValueSet& vs) const {
  const auto writtenTo = getWrites(n);
  if (writtenTo.empty()) {
    return false;
  }

  MemoryLocations locs;
  for (const auto v : vs) {
    auto it = elementMap_.find(v);
    if (it != elementMap_.end()) {
      const auto& vlocs = memoryDAG_->getMemoryLocations(it->second);
      if (writtenTo.intersects(vlocs)) {
        return true;
      }
    }
  }

  return false;
}
```

In the DCE use case, we have a ValueSet of live values, into which we insert `Value*`s, and we sometimes need to check whether a node mutates any of the live values using `writesToAlias`.

Looping through all the values in the ValueSet and indexing into the elementMap_ is slow, so if we can pre-compute the MemoryLocations set, this speeds up the function. In some large model examples, I see ~15-25x speedups from this change.

**Implementation**: To avoid exposing too many details of AliasDb, I introduce a friend class `ValueAndMemoryLocationSet`, an insert-only set of Values that also maintains the corresponding MemoryLocations.

Then I use `ValueAndMemoryLocationSet` if we're using AliasDb for analysis, and otherwise a `Set<Value*>` when we don't have an AliasDb.
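A sketch of the shape of that helper (simplified stand-in types; the real class lives inside AliasDb):

```
#include <set>
#include <unordered_set>

struct Value {};
struct SparseBitVector {  // stands in for MemoryLocations
  std::set<int> bits;
  void operator|=(const SparseBitVector& o) {
    bits.insert(o.bits.begin(), o.bits.end());
  }
  bool intersects(const SparseBitVector& o) const {
    for (int b : o.bits)
      if (bits.count(b)) return true;
    return false;
  }
};

// Insert-only set of Values that eagerly maintains the union of their
// memory locations, so writesToAlias becomes a single bitset
// intersection instead of a per-value elementMap_ lookup loop.
class ValueAndMemoryLocationSet {
 public:
  void insert(const Value* v, const SparseBitVector& locsOfV) {
    values_.insert(v);
    locations_ |= locsOfV;  // union in v's locations once, at insert time
  }
  bool mayAliasWrites(const SparseBitVector& writtenTo) const {
    return locations_.intersects(writtenTo);
  }

 private:
  std::unordered_set<const Value*> values_;
  SparseBitVector locations_;
};
```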

Test Plan: Rely on unit tests.

Differential Revision: D74827086

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153645
Approved by: https://github.com/eellison
2025-05-19 21:04:59 +00:00
David Berard
5e6e52e7c9 [JIT] add GRAPH_DEBUG for setGraphExecutorOptimize (#153549)
Summary: Optionally log when setGraphExecutorOptimize is called, so we can get insight into the GraphExecutor behavior.

Differential Revision: D74692508

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153549
Approved by: https://github.com/PaulZhang12, https://github.com/SamGinzburg
2025-05-14 20:07:25 +00:00
cyy
45efa1aaa8 [3/N] Use internal linkage in C++ files (#151297)
Follows #151070.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151297
Approved by: https://github.com/Skylion007
2025-05-05 17:48:39 +00:00
cyy
41bd0c900a [1/N] Deprecate c10::string_view and at::string (#151972)
Uses of `c10::string_view` in the code base are replaced by `std::string_view`, and uses of `at::string` by `std::string`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151972
Approved by: https://github.com/malfet
2025-04-29 07:23:52 +00:00
cyy
70d7638b0d Fix clang-tidy suppression in torch/csrc/jit (#152271)
Remove some clang-tidy suppressions in torch/csrc/jit by applying fixes or refactoring.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152271
Approved by: https://github.com/Skylion007, https://github.com/malfet

Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>
2025-04-27 21:18:39 +00:00
cyyever
24ca7e91e6 [1/N] Use internal linkage in torch/csrc C++ files. (#150930)
Make more functions and variables static when they are not used outside their .cpp files, and remove unused functions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150930
Approved by: https://github.com/Skylion007

Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>
2025-04-11 02:19:31 +00:00
Yidi Wu
c714d2fc0e [hop] support base_hop._gen_schema (#149688)
This PR creates two utils for generating a schema for HOPs from example inputs, and uses the base hop as an example.
1. `HopArgumentInfoGen` creates an argument or output schema with mutation information.
2. `CFunctionSchemaGen` pieces together the argument info of the inputs and outputs and produces a `torch._C.FunctionSchema`.

The `is_write` attribute of the argument info can be computed. Note that the `is_write` annotation only works when the inputs are flattened (e.g. it cannot support mutation inside a tuple). We need special handling for the case where we have tuple inputs, like `cond`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149688
Approved by: https://github.com/zou3519
2025-04-09 16:42:55 +00:00
cyy
79e8a69257 Enable move warnings for torch targets (#149923)
This PR enables more move warnings for torch targets and fixes some code.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149923
Approved by: https://github.com/malfet
2025-03-26 08:38:13 +00:00
Mikayla Gawarecki
be0ceee1c3 Make record/storage alignment in torch.save configurable (#147788)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147788
Approved by: https://github.com/albanD
ghstack dependencies: #147786, #147787
2025-03-06 12:04:46 +00:00
cyy
9aa897b992 Remove unnecessary tensor clone (#148159)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148159
Approved by: https://github.com/Skylion007
2025-03-02 16:21:39 +00:00
Michael Suo
99dd846672 [torch] fix builds for older pybind (#146630)
Summary:
Some versions of pybind11 we build with don't have `py::set_error`, so we just use the underlying Python C API.
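The workaround amounts to calling the CPython function that py::set_error wraps (a sketch; the version cutoff shown is approximate):

```
#include <pybind11/pybind11.h>
namespace py = pybind11;

// Newer pybind11 provides py::set_error; older versions need the
// underlying CPython call. The 2.12 threshold here is approximate.
void set_error_compat(const py::object& exc_type, const char* message) {
#if defined(PYBIND11_VERSION_HEX) && PYBIND11_VERSION_HEX >= 0x020C0000
  py::set_error(exc_type, message);
#else
  PyErr_SetString(exc_type.ptr(), message);
#endif
}
```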

Test Plan: unit tests

Differential Revision: D69254629

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146630
Approved by: https://github.com/colin2328, https://github.com/ngimel
2025-02-06 21:22:00 +00:00
Michael Suo
425804db2b [torch] fix exception types in custom class magic setattr/getattr (#146516)
Summary:
`c10::AttributeError` is not automatically converted to a Python AttributeError; that needs some special macros (e.g. `HANDLE_TH_ERRORS`).

Some Python functions like `hasattr` rely on the type of the thrown exception being correct.

We don't need the full generality of those macros, so just do a targeted error-type conversion here.
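A sketch of such a targeted translation (simplified; a local stand-in type replaces c10::AttributeError, and the getter body is elided):

```
#include <pybind11/pybind11.h>
#include <stdexcept>
#include <string>

namespace py = pybind11;

// Stand-in for c10::AttributeError.
struct AttributeError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Translate the C++ exception into a Python AttributeError so that
// Python protocols like hasattr(), which swallow only AttributeError,
// behave correctly.
py::object getattr_translated(const py::object& self, const char* name) {
  try {
    // ... call into the custom class's C++ getattr here ...
    throw AttributeError(std::string("no attribute: ") + name);
  } catch (const AttributeError& e) {
    PyErr_SetString(PyExc_AttributeError, e.what());
    throw py::error_already_set();
  }
}
```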

Test Plan: added unit test

Differential Revision: D69197217

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146516
Approved by: https://github.com/zdevito
2025-02-06 02:14:11 +00:00
Mikayla Gawarecki
001e355a56 Add option to serialization config to reduce random reads from get_record_offset when loading with mmap=True (#143880)
## Background

This PR adds `torch.utils.serialization.config.load.calculate_storage_offsets`. This option relies on the previous PR in this stack, where the storage order was changed to be non-lexicographical. A `.format_version` entry was added to the zipfile, and `calculate_storage_offsets` will only work on checkpoints with `.format_version`.

When this is turned on, for `torch.load(mmap=True)`, the offset of each storage record (other than the 0th storage) will be calculated instead of relying on `miniz` APIs to determine it.

The existing APIs will issue multiple random reads (reading the end of central directory record, then reading the zipfile header for the record) to determine the storage offset where the record starts. This can greatly degrade `torch.load(mmap=True)` performance for non-filesystem cases.

6aaae9d78f/caffe2/serialize/inline_container.cc (L589-L605)

## How does this work

The format of the checkpoint is as follows

```
archive_name/
|_ data.pkl
|_.format_version
|_byteorder
|_data/
  |_ 0
  |_ 1
  |_ 2
  |_ ...
|_
```

Each `data/i` record represents a storage, where storages are written in the order that the Pickler encounters them.

For each storage, our `persistent_load` logic saves the following metadata to the pickle file: `dtype, numel, key, location`, where `numel` is the number of bytes in the storage.

Note that we always use the `miniz` writer in zip64 mode per [here](7796e308d0/caffe2/serialize/inline_container.cc (L701)). A zipfile record written by miniz looks like this:

```
 ---------------- ----------------- ------------------- ---------------- --------- ------------------------------
| 30 byte header | n byte filename | zip64_extra_data | m byte padding | storage | 16 or 24 byte local dir footer  |
 ---------------- ----------------- ------------------- ---------------- --------- ------------------------------
```

- The header size (30) is given by [`MZ_ZIP_LOCAL_DIR_HEADER_SIZE`](https://github.com/pytorch/pytorch/blob/main/third_party/miniz-3.0.2/miniz.c#L3290)
- filename will be `"{archive_name}/{filepath}"`

- `zip64_extra_data` is determined by [`mz_zip_writer_create_zip64_extra_data`](7796e308d0/third_party/miniz-3.0.2/miniz.c (L6202)). Note that [we only create zip64_extra_data if storage_size >= 0xFFFFFFFF or the offset of the start of the header >= 0xFFFFFFFF](7796e308d0/third_party/miniz-3.0.2/miniz.c (L6519-L6524))
- `m` is determined by [`getPadding`](7796e308d0/caffe2/serialize/inline_container.cc (L254)), which accounts for the filename and zip64_extra_data to choose `m` such that the start of `storage` is aligned to 64 bytes. The `m` bytes will always start with "F B padding_size" as the first 4 bytes
- The local dir footer size is determined based on [this snippet](7796e308d0/third_party/miniz-3.0.2/miniz.c (L6610-L6632)): if the buffer size is 0, the footer is skipped; if zip64_extra_data was created, it is 24 bytes, otherwise 16

When `torch.utils.serialization.config.load.calculate_storage_offsets` is set we do the following
- We keep track of where the "cursor" is in the file using `current_offset`; after each persistent_load call, it will be at the offset where the header for the next record starts
- For the 0th storage, "data/0", we use the regular get_record_offset to determine the start of the storage
- For any other storage (where the storages appear in the order encountered by the unpickler: 0, 1, 2, 3, ...), we use `get_record_offset_no_read`, which re-uses the `getPadding` logic to determine the offset of the storage
- Note that `load_tensor` will only ever be called again with the same key if the storage's `._data_ptr()` is 0 [[pointer1](https://github.com/pytorch/pytorch/blob/main/torch/serialization.py#L1917-L1918)][[pointer2](https://github.com/pytorch/pytorch/blob/main/torch/serialization.py#L1936-L1937)], so we cache the offsets for this edge case
- After each storage, if the storage is non-zero, we account for the local dir footer based on the logic described above
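In arithmetic form, the per-record layout math above looks roughly like this (a simplified sketch; the real getPadding also embeds the "F B padding_size" marker in the padding bytes):

```
#include <cstdint>
#include <string>

constexpr uint64_t kLocalDirHeaderSize = 30;  // MZ_ZIP_LOCAL_DIR_HEADER_SIZE
constexpr uint64_t kAlignment = 64;           // storages are 64-byte aligned

// Offset where a record's storage bytes begin, given the offset of its
// 30-byte local header.
uint64_t storageStart(uint64_t headerOffset, const std::string& filename,
                      uint64_t zip64ExtraSize) {
  uint64_t unpadded =
      headerOffset + kLocalDirHeaderSize + filename.size() + zip64ExtraSize;
  // getPadding-style alignment: pad so storage starts on a 64-byte boundary.
  uint64_t padding = (kAlignment - unpadded % kAlignment) % kAlignment;
  return unpadded + padding;
}

// Offset of the next record's header: storage bytes plus the 16- or
// 24-byte local dir footer (0 when the storage is empty).
uint64_t nextHeaderOffset(uint64_t storageOffset, uint64_t storageSize,
                          uint64_t footerSize) {
  return storageOffset + storageSize + footerSize;
}
```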

## Testing strategy

The agreed upon testing strategy was as follows:
- Add debug code gated by an environment flag `TORCH_SERIALIZATION_DEBUG` that will run this offset calculation logic and verify it against getRecordOffset for each storage (when mmap=False)
- This flag is set throughout CI, which means that every time `torch.load` is called, the offset calculation logic is implicitly being tested.

Differential Revision: [D67673026](https://our.internmc.facebook.com/intern/diff/D67673026)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143880
Approved by: https://github.com/albanD
ghstack dependencies: #143879
2025-01-31 17:09:20 +00:00
Manav Avlani
f9227e7c33 Expose ToIValueAllowNumbersAsTensors to TORCH_PYTHON_API so we can use it in monarch (#146087)
Summary: TSIA

Test Plan: Tested up the stack with existing unit tests

Reviewed By: suo

Differential Revision: D68917233

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146087
Approved by: https://github.com/suo
2025-01-31 05:08:11 +00:00
PyTorch MergeBot
9010649292 Revert "Add option to serialization config to reduce random reads from get_record_offset when loading with mmap=True (#143880)"
This reverts commit db3685a35c.

Reverted https://github.com/pytorch/pytorch/pull/143880 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but either this PR or the base PR breaks distributed tests ([comment](https://github.com/pytorch/pytorch/pull/143880#issuecomment-2617743403))
2025-01-28 03:07:17 +00:00
Mikayla Gawarecki
db3685a35c Add option to serialization config to reduce random reads from get_record_offset when loading with mmap=True (#143880)
## Background

This PR adds `torch.utils.serialization.config.load.calculate_storage_offsets`. This option relies on the previous PR in this stack, where the storage order was changed to be non-lexicographical. A `.format_version` entry was added to the zipfile, and `calculate_storage_offsets` will only work on checkpoints with `.format_version`.

When this is turned on, for `torch.load(mmap=True)`, the offset of each storage record (other than the 0th storage) will be calculated instead of relying on `miniz` APIs to determine it.

The existing APIs will issue multiple random reads (reading the end of central directory record, then reading the zipfile header for the record) to determine the storage offset where the record starts. This can greatly degrade `torch.load(mmap=True)` performance for non-filesystem cases.

6aaae9d78f/caffe2/serialize/inline_container.cc (L589-L605)

## Testing strategy

The agreed upon testing strategy was as follows:
- Add debug code gated by an environment flag `TORCH_SERIALIZATION_DEBUG` that will run this offset calculation logic and verify it against getRecordOffset for each storage (when mmap=False)
- This flag is set throughout CI, which means that every time `torch.load` is called, the offset calculation logic is implicitly being tested.

Differential Revision: [D67673026](https://our.internmc.facebook.com/intern/diff/D67673026)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143880
Approved by: https://github.com/albanD
ghstack dependencies: #143879
2025-01-27 23:57:30 +00:00
c8ef
a989a0b13a [NFC] Fix some minor typos. (#145599)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145599
Approved by: https://github.com/Skylion007
2025-01-24 18:58:59 +00:00
cyy
e9f6045e80 [15/N] Fix extra warnings brought by clang-tidy-17 (#143100)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143100
Approved by: https://github.com/Skylion007
2024-12-14 03:24:10 +00:00
Richard Barnes
46dc2965de Adding missing space to pybind_utils.h error message (#142258)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142258
Approved by: https://github.com/Skylion007
2024-12-08 20:46:32 +00:00
Richard Barnes
17f1a42c13 Add missing py::bytes to pybind_utils tryToInferType (#142265)
I'm not sure what the best way to fix this is, but this does unbreak an internal test.

Test Plan: Sandcastle

Reviewed By: itamaro

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142265
Approved by: https://github.com/houseroad
2024-12-07 20:31:57 +00:00
cyy
45ed7c13fa Remove unneeded std::make_optional (#141567)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141567
Approved by: https://github.com/albanD
2024-11-28 00:05:21 +00:00