Commit Graph

7 Commits

Author SHA1 Message Date
Sherlock Huang
e924df23a6 [NativeRT] Strengthen matcher check for StaticDispatch kernel (#159187)
Summary:
Strength matcher for StaticDispatch kernels: all input, output tensor must be on CPU, all Device-typed attribute must be CPU.

Previously, we only check output tensor on CPU. This will miss catching the case where we do DeviceToHost aten._to_copy.

Prepare for turning on static dispatch kernel by default.

Test Plan:
I should add some test before land.

Rollback Plan:

Differential Revision: D78747600

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159187
Approved by: https://github.com/dolpm
2025-07-29 04:03:49 +00:00
Sherlock Huang
1abff80fae Reland D78841818 (#159216)
Summary: Relanding D78841818 with fixes

Test Plan:
Tested all failing tests

buck build --config fbcode.use_link_groups=true --flagfile fbcode//mode/dev-nosan fbcode//sigmoid/core/executor/memory/test:layout_planner_tests

buck test 'fbcode//mode/opt' fbcode//sigmoid/inference/test:test_passes

Rollback Plan:

Reviewed By: hl475

Differential Revision: D79038615

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159216
Approved by: https://github.com/dolpm
2025-07-28 07:39:35 +00:00
PyTorch MergeBot
3db8623dcb Revert "[NativeRT] Apply Device placement once when loading the graph (#158996)"
This reverts commit 28ee8be5bf.

Reverted https://github.com/pytorch/pytorch/pull/158996 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/158996#issuecomment-3121540050))
2025-07-26 09:05:26 +00:00
Sherlock Huang
28ee8be5bf [NativeRT] Apply Device placement once when loading the graph (#158996)
Summary:
Placement is leaked to too many classes!

In this diff, we consolidate all placement lookup into one place: Graph::ApplyDevicePlacement.

After applying placement, the in-memory graph, tensorMeta, weightMeta would already have the re-mapped device.
The subsequence weight loading, sample input loading, target device inference would look up the re-mapped device from graph's tensorMeta.

graph's tensorMeta becomes the only ground truth!

Test Plan:
Need to add some tests before landing.
This is a big change.

Rollback Plan:

Differential Revision: D78841818

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158996
Approved by: https://github.com/henryoier
2025-07-25 20:11:35 +00:00
Sherlock Huang
86dbc0ef67 [NativeRT] Remove makeProxyExecutor from ModelRunner interface (#158587)
Summary: makeProxyExecutor shouldn't be exposed to ModelRunner Interface.

Test Plan:
CI

Rollback Plan:

Differential Revision: D78501011

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158587
Approved by: https://github.com/yiming0416, https://github.com/henryoier
2025-07-18 03:20:40 +00:00
dolpm
725c327284 [nativert] add memory overlap debug assertion (#157290)
Summary: better safe than sorry. will throw if memory overlap detected when using planned tensors and debug mode is enabled -- this will make our planning unit tests more robust.

Test Plan:
ci

Rollback Plan:

Differential Revision: D77327841

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157290
Approved by: https://github.com/SherlockNoMad, https://github.com/zhxchen17
2025-07-14 19:12:41 +00:00
Sheng Qin
88c6199db0 [nativert] Move KernelFactory to PyTorch core (#156913)
Summary: Kernel factory handles the kernel nodes initializations and different type of kernels executions.

Test Plan:
CI

Rollback Plan:

Differential Revision: D77346836

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156913
Approved by: https://github.com/zhxchen17
2025-06-28 06:34:24 +00:00