Collects some scattershot improvements made while attempting to enable training for AOTInductor. The non-typing changes are:
1. Swapping a few custom searches for the output node of an FX graph for calls to `graph.output_node()` (see the sketch after this list).
2. Removing two unused parameters from `torch.export._unlift._unlift`.
3. Switching handles to constants in `cpp_wrapper_cpu` to use C++ references for memory efficiency.
4. Cleaning out unused, unexported imports from `torch/export/__init__.py`, and adding one missing export to `__all__`.
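For item 1, a minimal sketch of the pattern being replaced (the traced function is made up for illustration):
```python
import torch.fx

def f(x):
    return x + 1

gm = torch.fx.symbolic_trace(f)

# Before: hand-rolled scan over all nodes for the output node.
out = next(n for n in gm.graph.nodes if n.op == "output")

# After: the built-in accessor.
out = gm.graph.output_node()
```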
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158075
Approved by: https://github.com/Skylion007
We package the weights and save them in `data/weights/` (`WEIGHTS_DIR`). In addition, we store a `weights_config.json` in each model's folder to specify which weight file corresponds to which weight name.
Models can share weights. We dedup the weights based on their underlying storage (`tensor.untyped_storage()`).
- Use `"aot_inductor.package_constants_on_disk": True` config to produce the `Weights` in aot_compile
- If we see `Weights` in aoti_files, we'll automatically package them to disk
- `"aot_inductor.package_constants_on_disk"` config and `"aot_inductor.package_constants_in_so"` config work independently.
- Use `load_pt2(package_path, load_weights_from_disk=True)` to load the weights from disk. `load_weights_from_disk` defaults to False.
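A minimal end-to-end sketch of the above, assuming a toy model and assuming `load_pt2` lives in `torch.export.pt2_archive._package` (the import path is an assumption):
```python
import torch

# Toy module standing in for a real model.
class MyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 8)

    def forward(self, x):
        return self.linear(x)

ep = torch.export.export(MyModel(), (torch.randn(4, 8),))
package_path = torch._inductor.aoti_compile_and_package(
    ep,
    inductor_configs={"aot_inductor.package_constants_on_disk": True},
)

# load_weights_from_disk defaults to False; set it to read data/weights/.
from torch.export.pt2_archive._package import load_pt2  # import path assumed

pt2_contents = load_pt2(package_path, load_weights_from_disk=True)
```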
Test Plan:
```
buck2 run @//mode/dev-nosan //caffe2/test/inductor:aot_inductor_package -- -r "test_package_shared_weights"
```
Tested with whisper at https://github.com/pytorch-labs/torchnative/pull/7
Differential Revision: D74747190
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155241
Approved by: https://github.com/desertfire
Refactor the pt2 archive saving to consolidate the format of torch.export.save and torch._inductor.package.package_aoti.
This PR adds the following functions, which torch.export.save and AOTI packaging call into:
```python
def package_pt2(
    f: FileLike,
    *,
    exported_programs: Optional[Union[ExportedProgram, dict[str, ExportedProgram]]] = None,
    aoti_files: Optional[Union[list[str], dict[str, list[str]]]] = None,
    extra_files: Optional[dict[str, Any]] = None,
) -> FileLike: ...


@dataclass
class PT2ArchiveContents:
    exported_programs: dict[str, ExportedProgram]
    aoti_runners: dict[str, AOTICompiledModel]
    extra_files: dict[str, Any]


def load_pt2(f: FileLike) -> PT2ArchiveContents: ...
```
Power users can call directly into these APIs if they want to bundle multiple exported programs, AOTI files, or extra metadata, as in the sketch below.
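A sketch of that power-user path, assuming the functions live in `torch.export.pt2_archive._package` (import path is an assumption) and that `encoder_ep`/`decoder_ep` are previously exported programs:
```python
from torch.export.pt2_archive._package import load_pt2, package_pt2  # assumed path

# Bundle two exported programs plus a metadata file into one archive.
package_pt2(
    "bundle.pt2",
    exported_programs={"encoder": encoder_ep, "decoder": decoder_ep},
    extra_files={"user_metadata.txt": "exported from revision abc"},
)

contents = load_pt2("bundle.pt2")  # -> PT2ArchiveContents
encoder_ep = contents.exported_programs["encoder"]
```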
This is what the pt2 archive looks like ([spec](https://docs.google.com/document/d/1RQ4cmywilnFUT1VE-4oTGxwXdc8vowCSZsrRgo3wFA8/edit?tab=t.0)):
```
├── archive_format
├── version
├── .data
├── data
│   ├── aotinductor
│   │   └── model1
│   │       ├── model1.cpp
│   │       ├── model1.so  # currently AOTI automatically moves weights in here, TODO to move it out
│   │       ├── cg7domx3woam3nnliwud7yvtcencqctxkvvcafuriladwxw4nfiv.cubin
│   │       └── cubaaxppb6xmuqdm4bej55h2pftbce3bjyyvljxbtdfuolmv45ex.cubin
│   ├── weights
│   │   ├── model1.pt  # TODO to dedup weights between model1/model2
│   │   └── model2.pt
│   ├── constants
│   │   ├── model1.pt
│   │   └── model2.pt
│   └── sample_inputs
│       ├── model1.pt
│       └── model2.pt
├── extra
│   └── user_metadata.txt
└── models
    ├── model1.json
    └── model2.json
```
Future todos:
- unbundle the weights -- instead of .pt, we can use bin files, which will also allow us to dedup weights if we store multiple models
- update aoti_compile_and_package to also save the exported program
- integrate TNR with this packaging flow
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152495
Approved by: https://github.com/yushangdi
Summary:
Before:
`from sigmoid.core.package.pt2_archive import PT2ArchiveWriter, PT2ArchiveReader, is_sigmoid_package`
After:
`from torch.export.pt2_archive import PT2ArchiveWriter, PT2ArchiveReader, is_pt2_package`
By merging the two PT2ArchiveReader/Writer implementations into the native PyTorchFileReader/Writer, the open source PT2 archive format changed to include an additional top-level folder. However, this PR still maintains support for loading old PT2 archives that do not have the additional folder.
Before:
```
├── archive_format
├── byteorder
├── .data
│ ├── serialization_id
│ └── version
├── data
│ ├── aotinductor
```
After:
```
├── tmp
│ ├── archive_format
│ ├── byteorder
│ ├── .data
│ │ ├── serialization_id
│ │ └── version
│ ├── data
│ │ ├── aotinductor
```
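For reference, a small sketch of the new import path in use; the writer's context-manager behavior and `write_string` method are assumptions about its interface:
```python
from torch.export.pt2_archive import PT2ArchiveWriter, is_pt2_package

# Write a minimal archive and verify it is recognized as a PT2 package.
with PT2ArchiveWriter("model.pt2") as writer:  # context manager assumed
    writer.write_string("extra/user_metadata.txt", "hello")

assert is_pt2_package("model.pt2")
```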
Test Plan:
`buck2 test //sigmoid/...`
https://www.internalfb.com/intern/testinfra/testrun/5348024839248187
Differential Revision: D74616598
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153795
Approved by: https://github.com/zhxchen17
This is my suggestion for resolving #152087.
This PR extends the constructor of `AOTIModelPackageLoader` with an (optional) device index. The device type is still determined by `metadata_["AOTI_DEVICE_KEY"]`, but the `device_index` argument can be used to move an AOTI model package to different devices like `cuda:0`, `cuda:1`, ... in a convenient way. AFAIK, this is not possible so far using `AOTIModelPackageLoader` alone. The default case (no device index specified) with `metadata_["AOTI_DEVICE_KEY"] == "cuda"` preserves the current behavior, i.e., the model is loaded to device `cuda`. A usage sketch follows.
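A hypothetical sketch of the intended use, assuming the Python binding in `torch._C._aoti` exposes the new argument by keyword:
```python
import torch
from torch._C._aoti import AOTIModelPackageLoader  # binding path assumed

# Load the same compiled package onto two different GPUs via the device index.
loader0 = AOTIModelPackageLoader("model.pt2", device_index=0)  # cuda:0
loader1 = AOTIModelPackageLoader("model.pt2", device_index=1)  # cuda:1

out0 = loader0.run([torch.randn(4, 8, device="cuda:0")])
```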
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152093
Approved by: https://github.com/desertfire
Summary:
**Codegen**
- Skip some codegen parts for torchbind objects (such as arg declaration) because they are loaded in the proxy executor, so we do not need to declare torchbind args in the cpp code.
- Added a helper method to get the schema of CallTorchBind HOP. The returned schema is only the schema of `obj.method()`.
**Serialization**
Add support for torchbind object in serialization
- For the CallTorchBind HOP, we need to handle it specially because of its schema. The serialized args are in the format `(obj, method, *args, **kwargs)`.
- `ir.TorchBindObject` inputs are serialized to `as_custom_obj` Argument.
**Packaging**
Add torchbind objects file and `custom_objs_config.json` file to generated files output of `aot_compile`.
The json file is stored in the `data/aotinductor/<model_name>` folder in the pt2 archive.
The torchbind objects are stored in the `data/constants/` folder in the pt2 archive.
Torchbind object files are named `f"{CUSTOM_OBJ_FILENAME_PREFIX}{custom_obj_idx}"`, e.g. `custom_obj_0`.
CustomClassHolder objects implement their own pickle methods.
Note that this `custom_objs_config.json` file is different from the `model_constants_config.json` file produced in `package_sigmoid()`. The keys in `custom_objs_config` directly correspond to the arg name in extern nodes json.
The key in `model_constants_config.json` produced by `package_sigmoid` is the attribute name in the user mode code.
This is required for both internal and OSS torchbind support.
For OSS torchbind support, we also need to package torchbind_constants into the .pt2 output.
**Work Left**
We still need to add torchbind support in ProxyExecutor for inductor.aoti_load_package to work. See other diffs in the stack.
Test Plan:
```
buck run fbcode//mode/dev-nosan //caffe2/test/inductor:torchbind -- -r schema
buck run fbcode//mode/dev-nosan //caffe2/test/inductor:torchbind -- -r aot_compile
```
Differential Revision: D69490718
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148506
Approved by: https://github.com/angelayi
**Background**: I've been comparing performance of torch.compile vs. torch.export + AOTI (specifically, loaded from Python) on the Flux model and found a ~1.4% performance decrease with the latter. The trace shows that CUDAGraphs are not utilized for torch.export + AOTI, leading to higher overhead.
When trying to manually CUDAGraph the loaded, previously exported + AOTIed model (thanks to @eellison for the logic here), I get:
```
Error: operation not permitted when stream is capturing
```
@desertfire confirms that this is due to multi-threading logic on the AOTI runtime side (in `AOTIModelContainer` / `AOTIModel`) conflicting with the use of CUDAGraphs.
**Fix**: This PR takes the approach of providing an alternate, single-threaded method for running loaded models with the AOTI runtime. Details:
* Python side introduces a new flag to enable this behavior (needs a better name): `torch._inductor.package.load_package(..., run_single_threaded=False)`
* This flag is passed down to the C++ side's `AOTIModelPackageLoader`, which passes it to the `CreateAOTIModelRunnerFunc` during `AOTIModelContainerRunner` construction.
* C++ side introduces single-threaded alternatives to model running and model container running:
* `AOTIModelContainer.run_single_threaded()` / `AOTIModel.run_single_threaded()`. The interfaces match those of `run()`, but the synchronization logic has been removed.
* Introduces `AOTInductorModelContainerRunSingleThreaded` to AOTI's `interface.h`; this is invoked by the `AOTIModelContainerRunner` utility class when `run_single_threaded=true`.
I've verified on both a small repro and my real-world use case that I can manually CUDAGraph a loaded model that was previously exported + AOTIed; a sketch of that wrapping pattern is below.
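A sketch of that manual wrapping, assuming a package built with the new interface (model path and input shapes are made up):
```python
import torch

# Load the package in single-threaded mode so capture is permitted.
compiled = torch._inductor.package.load_package(
    "model.pt2", run_single_threaded=True
)

static_input = torch.randn(8, 16, device="cuda")

# Warm up on a side stream first, as CUDA graph capture requires.
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    for _ in range(3):
        compiled(static_input)
torch.cuda.current_stream().wait_stream(s)

# Capture one run into a graph, then replay it after copying new data
# into the static input tensor.
g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
    static_output = compiled(static_input)

static_input.copy_(torch.randn(8, 16, device="cuda"))
g.replay()
```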
**Future work:**
* Flip default value to `run_single_threaded=True` as Python-side inference doesn't take advantage of the AOTI runtime thread pool
* There are some BC concerns here - models need to be re-serialized so the .so contains the new `AOTInductorModelContainerRunSingleThreaded` interface func. We can flip the default value and warn (instead of crashing) if the `AOTInductorModelContainerRunSingleThreaded` symbol does not exist.
* Compose with cudagraph trees as opposed to manual cuda graph wrapping
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148601
Approved by: https://github.com/desertfire
Fixes #144976
Using approach ① `IO[bytes]`, but could also try with a protocol.
## Notes:
- moved `torch.serialization.FILE_LIKE` to `torch.types.FileLike`
- Use `FileLike` annotation where it makes sense
- made sure those functions also support `os.PathLike`
- Replaced `isinstance(x, io.BytesIO)` with `isinstance(x, (io.IOBase, IO))` where appropriate.
- Replaced `BinaryIO` with `IO[bytes]` (the two ABCs are almost identical, the only difference is that `BinaryIO` allows `bytearray` input to `write`, whereas `IO[bytes]` only `bytes`)
- needed to make `torch.serialization._opener` generic to avoid LSP violations.
- skipped `torch/onnx/verification` for now (the functions use `BytesIO.getvalue`, which is not part of the `IO[bytes]` ABC, but this seems redundant, as e.g. `onnx.load` supports `str | PathLike[str] | IO[bytes]` directly). A minimal sketch of the new alias is below.
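For illustration, a rough sketch of the alias and a function annotated with it (the exact definition in `torch.types` may differ):
```python
import io
import os
from typing import IO, Union

# Roughly what torch.types.FileLike covers: a path or a binary file object.
FileLike = Union[str, os.PathLike, IO[bytes]]

def save_blob(f: FileLike, data: bytes) -> None:
    # Paths are opened; file-like objects are written to directly.
    if isinstance(f, (str, os.PathLike)):
        with open(f, "wb") as fh:
            fh.write(data)
    else:
        f.write(data)

save_blob(io.BytesIO(), b"\x00\x01")    # in-memory buffer
save_blob("blob.bin", b"\x00\x01")      # filesystem path
```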
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144994
Approved by: https://github.com/ezyang, https://github.com/Skylion007
Summary:
With the changes in https://github.com/pytorch/pytorch/pull/140755 and https://github.com/pytorch/pytorch/pull/141997, I added a `load_constants` function to the packaging API. Currently this doesn't work for CPU.
The workflow is something like:
```
ep = torch.export.export(model, example_inputs)
package = torch._inductor.aoti_compile_and_package(ep, inductor_configs=inductor_configs)
compiled = torch._inductor.aoti_load_package(package)
print(compiled.get_constant_fqns()) # see what are the fqns needed/available
compiled.load_constants(new_state_dict, check_full_update=True) # update the constants in AOTI
```
You can also use the `aot_inductor.package_constants_in_so` config to stop including the constants in the .so:
```
package = torch._inductor.aoti_compile_and_package(ep, inductor_configs={"aot_inductor.package_constants_in_so": False})
compiled = torch._inductor.aoti_load_package(package)
compiled(*inputs) # segfaults because there are no constants --> we should probably have a better error msg
compiled.load_constants(new_state_dict, check_full_update=True)
compiled(*inputs)
```
Test Plan: `buck2 run @//mode/dev-nosan //caffe2/test/inductor:aot_inductor_package -- -r "test_so_without_weight" `
Differential Revision: D66796206
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142246
Approved by: https://github.com/henrylhtsang, https://github.com/desertfire
* Added a cpp loader, `AOTIModelPackageLoader`, which can load the .pt2, build the .so, and create a runner. The Python-facing API is that users can directly call the `run` function, whereas in cpp users can directly access the `runner_` if they are more familiar with that. I couldn't figure out how to bind the `get_runner()` function to python...
* Added a new config, `aot_inductor.package_cpp_only`, which will **not** package the .so. This means that whenever the package is loaded, we will need to build the .so. This is turned off by default so that new environments do not need to rebuild their .so. The `package_cpp_only` is a feature which torchchat intends to use to provide flexibility to users.
* Added a new config, `aot_inductor.metadata` which stores user-provided metadata, serialized to the pt2 as a json file. It also stores the device used when exporting, "cuda" or "cpu", so that during load time, we can use that data to determine which AOTIModelContainerRunner to use. The metadata can be accessed through `loader.get_metadata()`. TODO is to move this metadata to the toplevel `package_aoti` function so that we can remove the metadata as a config.
* Separated out `package_aoti` as a standalone function, instead of it automatically being called in inductor. This prepares for the case where users compile multiple models and want to bundle them in one package. The specific use case is in torchchat, where we want to package the separately-exported encoder and decoder layers. An example of how to use this is in `test_multiple_methods`, and a sketch follows this list.
* `load_package` will load a singular model, given the model name.
* The loader doesn't support Windows for now; I think I need to add some more special-casing to make the build commands work on Windows.
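A sketch of the multi-model flow referenced in `test_multiple_methods`, assuming `aot_compile` returns the list of generated files when `"aot_inductor.package"` is set and that `encoder_gm`/`decoder_gm` are prepared graph modules:
```python
from torch._inductor.package import load_package, package_aoti

encoder_files = torch._inductor.aot_compile(
    encoder_gm, encoder_inputs, options={"aot_inductor.package": True}
)
decoder_files = torch._inductor.aot_compile(
    decoder_gm, decoder_inputs, options={"aot_inductor.package": True}
)

# Bundle both compiled models into a single .pt2 archive.
package_aoti("multimodel.pt2", {"encoder": encoder_files, "decoder": decoder_files})

# load_package loads a single model from the package, given its name.
encoder = load_package("multimodel.pt2", "encoder")
```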
Differential Revision: [D62329906](https://our.internmc.facebook.com/intern/diff/D62329906)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135374
Approved by: https://github.com/desertfire, https://github.com/malfet
In this PR, I added support for packaging the AOTI generated files into a zipfile, and loading it in python.
`compile_so` takes the path to the package, a device, and a desired so_path location; it compiles the package into a .so and saves it to the specified location.
`load_package` takes a path to the package and device, calls _extract_so, and then creates a callable to run the compiled model.
The zipfile generated looks like the following:
```
|- version
|- archive_format
|- data
  |- aotinductor
    |- cbtnafqaqrhvwztv7xudlal4xs6sofxa5oxccyuaqtrt6aozaklx.cubin  # AOTI cuda generated cubin files
    |- cskkqtna23bty2v3aq7g2q37cxrgufehlkuaaolhlgug5zg6fuwe.cpp  # AOTI generated cpp file
    |- cskkqtna23bty2v3aq7g2q37cxrgufehlkuaaolhlgug5zg6fuwe_compile_flags  # Flags for compiling the .o
    |- c6qqtnpgwfi3dv5nb76ai773kt45ezoxfwdmd7q37lvq6fs2tnoi.o  # AOTI saved const.o
    |- cskkqtna23bty2v3aq7g2q37cxrgufehlkuaaolhlgug5zg6fuwe_linker_flags  # Flags for linking the files to form the .so
  |- constants
    |- constants.pt  # Constants saved using torch.save, can be loaded using mmap
```
The workflow is something like:
```
with torch.no_grad():
    ep = torch.export.export(
        model,
        example_inputs,
        dynamic_shapes=dynamic_shapes,
        strict=False,
    )
    gm = ep.module()
    package_path = torch._inductor.aot_compile(
        gm,
        example_inputs,
        options={
            "aot_inductor.output_path": "my_path.pt2",  # or a directory
            "aot_inductor.package": True,
        },
    )
    compiled_model = torch._inductor.package.load_package(package_path, device)
    return compiled_model
```
I tried turning on loading the weights using mmap by default, but had some trouble with it, so that is left as a TODO.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129895
Approved by: https://github.com/malfet