Subgraph matcher now handles the matching of non-Node arguments.
Here are the 4 cases (sketched in code after the example below):
- pn is a Node, gn is a Node: this goes through the regular _match_node() function
- pn is a Node, gn is not a Node: this is a match only if pn is a placeholder op
- pn is not a Node, gn is a Node: this is not a match
- pn is not a Node, gn is not a Node: this goes through the argument comparison.
With this change
```
def target(x):
    return foo(x, 3)

def pattern(x, y):
    return foo(x, y)
```
is a match
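For reference, here is a hedged sketch of the four-way dispatch described above (the function name and the call into `_match_node` are illustrative, not the actual matcher internals):
```
from torch.fx import Node

def match_argument(matcher, pn, gn) -> bool:
    # Illustrative only: `matcher` stands in for the SubgraphMatcher instance.
    if isinstance(pn, Node) and isinstance(gn, Node):
        return matcher._match_node(pn, gn)  # case 1: regular node-to-node matching
    if isinstance(pn, Node) and not isinstance(gn, Node):
        return pn.op == "placeholder"       # case 2: a pattern placeholder can bind a literal
    if not isinstance(pn, Node) and isinstance(gn, Node):
        return False                        # case 3: a literal never matches a node
    return pn == gn                         # case 4: compare the literal arguments directly
```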
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85456
Approved by: https://github.com/jerryzh168
Summary:
{F770932209}
Given the original execution order and the node dependency relationship (note that the same dependency graph can produce multiple valid execution orders, i.e. topological orders), after reunion we find that the new GraphModule's execution order differs from the original one, which is not what we want.
For example, assume that NewLeaf_1 is EmbeddingLookup (calling EmbeddingLookup is awaitable: we keep executing the following nodes rather than waiting for the result until we actually have to know it), and NewLeaf_4 is the node where we HAVE to have the lookup result in order to interact with NewLeaf_3. NewLeaf_1 launches a lookup kernel and an all2all communication stream to distribute the result to all ranks. In the meantime, we want to keep executing NewLeaf_2 and NewLeaf_3 to avoid meaningless waiting. However, with the new execution order, we have to wait for the lookup kernel and the all2all communication to finish because the next node, NewLeaf_4, needs the result; only then can we execute NewLeaf_2, etc. This loses the overlap between computation and the communication stream and hurts QPS a lot.
So while constructing the GraphModule, we have to switch from the topological order back to the original order.
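Conceptually, the reordering boils down to walking the nodes in their original order and chaining them back together; a minimal sketch, assuming `original_order` holds the graph's nodes in the original module's execution order:
```
import torch.fx

def restore_original_order(graph: torch.fx.Graph, original_order):
    # Move each node so it sits immediately after its predecessor in the
    # original execution order (Node.append relocates an existing node).
    cursor = None
    for node in original_order:
        if cursor is not None:
            cursor.append(node)
        cursor = node
    graph.lint()  # sanity-check that the new order is still topologically valid
```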
Test Plan:
Unit test
I'm not sure how to add tests in FX as there's no TARGETS, so I added them in the TorchRec folder
Differential Revision: D39567314
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85188
Approved by: https://github.com/SherlockNoMad
Fixes some errors you run into in dynamo when turning on fake tensors. I'm waiting on flipping the switch because I need to also get some fixes into dynamo + do benchmarking.
I could manually turn off fake tensors in functorch in dynamo, and then turn it on here if requested, although the changes here are pretty minimal.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84432
Approved by: https://github.com/Chillee
Summary:
Encountered `Error: bad label format` from dot (i.e. graphviz) when benchmarking models that have dict-like structure.
The root cause was that curly brackets were not properly escaped, like this example P522499127 (unescaped curly brackets in target= string)
This diff inserts the fix in FxGraphDrawer, since many of these graph generation code paths rely on that class.
(Modified summary before exporting to GitHub PR)
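Conceptually, the fix boils down to escaping the brackets before the label reaches dot; a minimal sketch (the helper name is hypothetical, not the actual FxGraphDrawer code):
```
def _escape_dot_label(label: str) -> str:
    # Curly brackets are special inside graphviz record-shaped labels and
    # trigger "bad label format"; escape them before emitting the dot source.
    return label.replace("{", r"\{").replace("}", r"\}")
```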
Test Plan:
```
CUDA_VISIBLE_DEVICES=7 buck run mode/opt -c python.package_style=inplace //hpc/new/models/feed/benchmark:feed_lower_benchmark -- --model-name={INSERT IFR QE MODEL NAME HERE} --batch-iter 100 --batch-size 768 --num-gpu 1 --lower-presets {INSERT ITS PRESET}
```
Will not encounter dot errors after this diff.
(Modified test plan before exporting to GitHub PR)
Reviewed By: yinghai
Differential Revision: D38758827
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83604
Approved by: https://github.com/yinghai, https://github.com/jianyuh
Summary: Currently `split_by_tags` determines submodule output order by iterating over `used_in_main`. Since this is a `Set`, insertion order is not retained so we run into problems with submodule output order being "randomized" & inconsistent between splits. By using `Dict[Node, None]` we can implement `used_in_main` as an ordered set so that output order is consistent when splitting the same model.
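For reference, a minimal sketch of the dict-as-ordered-set pattern (names are illustrative):
```
from typing import Dict, Iterable, List
from torch.fx import Node

def ordered_unique(nodes: Iterable[Node]) -> List[Node]:
    # Dict keys preserve insertion order, so this behaves like an ordered set.
    used_in_main: Dict[Node, None] = {}
    for n in nodes:
        used_in_main.setdefault(n)
    return list(used_in_main)
```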
Test Plan: CI
Differential Revision: D39039268
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84136
Approved by: https://github.com/houseroad
Summary: Before this change, wrapped_fn was only supposed to take mutating passes, but we don't actually have any way to detect whether a pass is mutating before running it. To avoid a precondition that depends on the PassManager run, we relax the precondition to accept any kind of pass and conditionally return the original pass based on the pass result.
Test Plan: eyes
Reviewed By: qihqi, angelayi
Differential Revision: D39086343
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84232
Approved by: https://github.com/angelayi
Example:
```
======================================================================
ERROR: test_pass_manager_error (fx.test_pass_infra.TestPassManager)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/angelayi/Projects/pytorch/torch/fx/passes/infra/pass_manager.py", line 285, in __call__
res = fn(module)
File "/Users/angelayi/Projects/pytorch/test/fx/test_pass_infra.py", line 164, in pass_fail
raise RuntimeError("bad")
RuntimeError: bad
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/angelayi/Projects/pytorch/test/fx/test_pass_infra.py", line 170, in test_pass_manager_error
pm(traced_m)
File "/Users/angelayi/Projects/pytorch/torch/fx/passes/infra/pass_manager.py", line 289, in __call__
raise RuntimeError(msg) from e
RuntimeError: An error occured when running the 'pass_fail' pass after the following passes: ['replace_add_with_mul_pass', 'replace_mul_with_div_pass']
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83933
Approved by: https://github.com/SherlockNoMad
There is already special handling in the reinplacing pass for removing `{view}_scatter` ops, but there is another case that needs special handling. In this code:
```
def f():
    a = torch.zeros(4, 4, 4)
    a[:, 2:] = torch.ones(4, 2, 4)
    return a
```
Tracing normally with `make_fx()` gives you:
```
def forward(self):
    zeros = torch.ops.aten.zeros.default([4, 4, 4], device = device(type='cpu'), pin_memory = False)
    ones = torch.ops.aten.ones.default([4, 2, 4], device = device(type='cpu'), pin_memory = False)
    slice_tensor = torch.ops.aten.slice.Tensor(zeros, 0, 0, 9223372036854775807)
    slice_tensor_1 = torch.ops.aten.slice.Tensor(slice_tensor, 1, 2, 9223372036854775807); slice_tensor = None
    copy__default = torch.ops.aten.copy_.default(slice_tensor_1, ones); slice_tensor_1 = ones = None
    return zeros
```
Functionalizing it gives you:
```
def forward(self):
    zeros = torch.ops.aten.zeros.default([4, 4, 4], device = device(type='cpu'), pin_memory = False)
    ones = torch.ops.aten.ones.default([4, 2, 4], device = device(type='cpu'), pin_memory = False)
    slice_tensor = torch.ops.aten.slice.Tensor(zeros, 0, 0, 9223372036854775807)
    slice_tensor_1 = torch.ops.aten.slice.Tensor(slice_tensor, 1, 2, 9223372036854775807); slice_tensor = None
    slice_tensor_2 = torch.ops.aten.slice.Tensor(zeros, 0, 0, 9223372036854775807)
    slice_scatter_default = torch.ops.aten.slice_scatter.default(slice_tensor_2, ones, 1, 2, 9223372036854775807); slice_tensor_2 = ones = None
    slice_scatter_default_1 = torch.ops.aten.slice_scatter.default(zeros, slice_scatter_default, 0, 0, 9223372036854775807); zeros = slice_scatter_default = None
    return slice_scatter_default_1
```
Notice that there are not any functional ops to directly re-inplace! What actually happened is that functionalization turned the `copy_()` into a `copy()`, but the out-of-place `copy()` operator gets optimized away because it's a no-op (when the input and output metadata are the same, `out = copy(a, b)` just returns `b`).
What we actually want is to replace this line:
```
slice_scatter_default = torch.ops.aten.slice_scatter.default(slice_tensor_2, ones, 1, 2, ...);
```
with this:
```
new_slice = torch.ops.aten.slice.Tensor(slice_tensor_2, 1, 2, ...);
_ = torch.ops.aten.copy_.default(new_slice, ones)
```
In the above, we're taking a fresh slice of the "base" tensor, and performing a `copy_()` on the slice, adding back what functionalization removed.
We actually need to create a fresh "slice" node, because we're not guaranteed that one already exists in the graph (technically there should be one, but it might have been DCE'd by the time we hit re-inplacing).
I also updated the docs for re-inplacing to more closely match the order of the logic.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83846
Approved by: https://github.com/ezyang
Cleaned up some of the arg replacement logic to use tree_map, so it handles FX nodes that have nested containers.
See the added test: when you write a function that returns a list, the `output` node in the FX graph shows up as having `node.args = tuple(immutable_list(...))`
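A sketch of the pattern (the helper name is illustrative; `old_node`/`new_node` are the nodes being swapped):
```
from torch.utils._pytree import tree_map

def replace_node_arg(node, old_node, new_node):
    # Swap old_node for new_node wherever it appears in node.args/kwargs,
    # including inside nested containers such as the immutable_list held by
    # the output node.
    def swap(a):
        return new_node if a is old_node else a

    node.args = tree_map(swap, node.args)
    node.kwargs = tree_map(swap, node.kwargs)
```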
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83845
Approved by: https://github.com/ezyang
I'm testing out turning on re-inplacing + functionalization by default with the AOTAutograd + eager backend on torchbench + huggingface models. This PR contains a few bug fixes from turning re-inplacing on:
(1) Handle more gracefully when FakeTensorMode is already turned on when you call reinplace
(2) More robust detection for when an inplace variant of an op exists (the dumb bug was that `pow.Scalar` doesn't have an inplace variant, even though there are several overloads of `pow_`; none of them are eligible though)
(3) Avoid re-inplacing when it would require resizing the input buffer. This isn't allowed, because inplace ops aren't allowed to resize their inputs.
For the last one, I gave the two main examples in more detail in the comments. Important cases are:
```
# This should not be re-inplaced at all; the op broadcasts, so this would require resizing the self tensor
torch.add(tensor[1, 4], tensor[4, 4])
# This should not be re-inplaced, because the inplace and out-of-place variants of the op return different dtypes
torch.ge(a, b)
# However, this means that today when functionalization functionalizes a `torch.ge_(a, b)` call, reinplacing won't properly de-functionalize it. I mentioned in the comments that this optimization is worth adding later
```
(4) There's some logic around keeping `storage_to_nodes` up to date when we see a view op: if we re-inplace `out = a.add(...)` and later in the program we encounter a "later_node" `out.view(...)` that needs to be replaced with `a.view(...)`, we need to update some metadata structures. I had to fix that logic: specifically, if "later_node" isn't a dispatcher op (e.g. if it's an FX output node), I wasn't properly handling the case where the node's fake_meta info was not a tensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83626
Approved by: https://github.com/ezyang
pseudo.any is a wildcard node that can be matched with any fx node with an arbitrary number of inputs and outputs.
For example, to match relu followed by one fx node:
```
def pattern(a):
    y = a.relu()
    z = torch.ops.pseudo.any(y)
    return z
```
pseudo.oneof is a special node that can be matched with an fx node whose target is in the permissible list.
`targets` must be a list of qualified names for operators, e.g. ["operator.add", "torch.sigmoid",
"torch.ops.aten.foo", "torch.ops.prims.bar"]
For example, using the following pattern with pseudo.oneof
```
def pattern(a):
    y = a.relu()
    z = torch.ops.pseudo.oneof(y, targets=["relu", "torch.sigmoid", "operator.add"])
    return z
```
It will have 3 matches in the following function
```
def forward(y):
    z = y.relu()
    x = z.relu()          # first match
    x = x.relu()
    x = torch.sigmoid(x)  # second match
    x = x.relu()
    return x + 1          # third match
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82853
Approved by: https://github.com/ezyang
This new version of subgraph matcher further supports
- optionally match with pattern's placeholder and output nodes
- patterns with multiple outputs
- filtering out non-containing matches
- filtering out overlapping matches
TODOs:
- [x] Update replace_pattern() to use this matcher
- [x] Fix cases with identical anchor
- [x] Introduce wildcard matching, such as Any, OneOf
- [ ] Improve node comparer to match args and kwargs values
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82090
Approved by: https://github.com/ezyang
Adds a "reinplacing" FX transform, that goes through an FX graph and tries to convert out-of-place op calls into inplace calls whenever possible.
Followups from this PR include:
- Set up torchbench, and run the whole torchbench suite using AOTAutograd + functionalize + re-inplacing transforms to surface any issues (this is what I'm currently working on). Right now, I have some basic unit tests just to sanity check that the general logic makes sense.
- Add any missing inplace ops. This is mostly the `*_scatter*` ops, e.g. `diagonal_scatter_`, because these ops will commonly show up in an FX graph after running functionalization.
The criteria for when you can swap an op `b = a.add(...)` with `a.add_(...)` is:
(1) An inplace variant of the operator with the same schema needs to exist (`aten.add` -> `aten.add_`)
(2) `a` (**or any of its aliases**) can't be used as an input to any other operators later on in the graph
(3) `a` can't be one of the inputs to the entire graph. It also can't be an **alias** of any of the inputs ***
*** One thing to note: (3) means that we can't technically guarantee that we'll get back **all** memory usage that we lost from functionalization. Functionalization converts input mutations into out-of-place calls, and then adds a `copy_()` to the end of the graph to preserve semantics.
I added logic to handle `copy_()` in this PR because it's a pretty important optimization in the context of functionalization: any program that performs input mutations will have a `copy_()` in it after running functionalization.
There are some examples in the test file, but I think staring at an example of where re-inplacing is/isn't allowed to run is helpful:
```
// Before functionalization
def foo(a):
    tmp1 = a.add_(1)
    tmp2 = a.add(2)

// After functionalization
def foo(a):
    tmp1 = a.add(1)
    tmp2 = a.add(2)
    ....
    a.copy_(tmp1)

// After re-inplacing
def foo(a):
    // first add() is safe to re-inplace even though a is a program input,
    // because a's data is overwritten later by a copy_()
    tmp1 = a.add_(1)
    // second add() is NOT safe to re-inplace, because:
    // (1) a and tmp1 are aliased. Note that they weren't aliased in the original program,
    //     but they are now that we've done some re-inplacing.
    // (2) tmp1 is used as an input later in the program
    tmp2 = a.add(2)
    ....
    a.copy_(tmp1)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80897
Approved by: https://github.com/ezyang
PassManager is a class used to run multiple passes on a given graph module.
Class Attributes
* `passes: List[Callable]`: A list of callable passes
* `constraints: List[Callable]`: A list of constraints
* `run_checks_after_each_pass`: Flag for running checks after each pass
Class Methods:
* `__call__(graph_module: DispatchGraphModule)`:
* Runs the passes in the list until the graph stops changing, or until it has run `steps` times.
* Each time a pass is run, it will check that the graph module still maintains the required invariants by calling `check()` and will lint the graph to check that it’s well formed if the flag `run_checks_after_each_pass` is set.
* `check(graph_module: DispatchGraphModule)`: Runs various checks on the given graph module to make sure that it contains the needed data for passes
* `add_check(check: Callable)`: Adds the `check` function to the given pass manager instance
* `add_constraint(constraint: Callable)`: Adds a constraint to the current list of constraints
We can create a PassManager and run it by doing:
```
PassManager(passes=[pass1, pass2])(graph_module)
```
Differential Revision: [D37523159](https://our.internmc.facebook.com/intern/diff/D37523159)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80531
Approved by: https://github.com/SherlockNoMad
Passes should now return a `PassResult`, which (for now) contains the following fields (a minimal example follows the list):
* `graph_module`: The graph module modified during the pass
* `modified`: A flag for if the graph module has been modified
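A minimal sketch of a pass that follows this contract (import path as in current torch.fx; the pass name is illustrative):
```
import torch
import torch.fx
from torch.fx.passes.infra.pass_base import PassResult

def replace_add_with_mul_pass(gm: torch.fx.GraphModule) -> PassResult:
    modified = False
    for node in gm.graph.nodes:
        if node.op == "call_function" and node.target is torch.add:
            node.target = torch.mul
            modified = True
    gm.recompile()
    return PassResult(gm, modified)
```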
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81366
Approved by: https://github.com/SherlockNoMad
Summary:
Add an `ignore_parameters_and_buffers` parameter which will tell the graph drawer
to leave off adding parameter and buffer nodes in the dot graph.
This is useful for large networks, where we want to view the graph to get an idea of
the topology and the shapes without needing to see every detail. Removing these nodes
de-clutters the graph significantly without losing much information.
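A hedged usage sketch (the traced module is an assumption; the flag name comes from this summary):
```
from torch.fx.passes.graph_drawer import FxGraphDrawer

# `traced` is assumed to be a torch.fx.GraphModule of the model.
drawer = FxGraphDrawer(traced, "my_model", ignore_parameters_and_buffers=True)
drawer.get_dot_graph().write_svg("topology.svg")  # pydot renders the de-cluttered graph
```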
Reviewed By: jfix71
Differential Revision: D37317917
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79982
Approved by: https://github.com/jfix71
This PR introduces two components.
CapabilityBasedPartitioner for FX graphs: given a list of supported operators, this partitioner tries to form the largest subgraphs that only contain the supported ops.
Fuser utility: given a list of nodes in an FX graph, it lifts them into a sub-GraphModule within the original graph.
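A hedged usage sketch of the partitioner (import path and method names follow the current torch.fx API; treat exact signatures as assumptions):
```
from torch.fx.passes.infra.partitioner import CapabilityBasedPartitioner

# `gm` is an fx.GraphModule; `op_support` is an OperatorSupportBase describing
# which nodes the backend can handle.
partitioner = CapabilityBasedPartitioner(gm, op_support)
partitions = partitioner.propose_partitions()       # largest supported subgraphs
fused_gm = partitioner.fuse_partitions(partitions)  # each partition becomes a sub-GraphModule
```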
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79439
Approved by: https://github.com/jjsjann123, https://github.com/davidberard98
Summary: If the model contains a ModuleList, it's possible that we get some of the weight attributes as module.sub.0.weight. `getattr` doesn't work in this case and we have a dedicated function `getattr_recursive` for that. Just use that.
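Roughly what such a recursive getattr does (a sketch, not necessarily the exact helper in the codebase):
```
def getattr_recursive(obj, qualified_name: str):
    # Plain getattr(model, "sub.0.weight") fails because of the dots; walk the
    # dotted path one attribute at a time instead (nn.Module resolves the
    # ModuleList index "0" through its _modules lookup).
    for atom in qualified_name.split("."):
        obj = getattr(obj, atom)
    return obj
```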
Reviewed By: houseroad
Differential Revision: D37326955
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80011
Approved by: https://github.com/houseroad
Summary: We were handling constant attrs in a few different ways before, leading to confusion and missed handling for fused dtypes. This diff consolidates some of that code and unbreaks the current breakage.
Test Plan: CI. Recently broken tests now pass.
Differential Revision: D36335238
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77401
Approved by: https://github.com/jaybean-dev, https://github.com/jamesr66a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74972
This diff
* adds PassManager and supporting logic
Test Plan:
CI and
```
buck test //caffe2/torch/fx/passes:test_pass_manager
```
```
Building: finished in 3.1 sec (100%) 124/124 jobs, 30/124 updated
Total time: 3.7 sec
More details at https://www.internalfb.com/intern/buck/build/4f947267-671c-48bc-ad07-190e5a731d2d
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 1423fed7-4674-44ce-9b84-c634f28a0406
Trace available for this run at /tmp/tpx-20220309-144735.217835-1423fed7-4674-44ce-9b84-c634f28a0406/trace.log
RemoteExecution session id: reSessionID-1423fed7-4674-44ce-9b84-c634f28a0406-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/6473924544097816
✓ ListingSuccess: caffe2/torch/fx/passes:test_pass_manager : 3 tests discovered (0.639)
✓ Pass: caffe2/torch/fx/passes:test_pass_manager - test_these_before_those_pass_constraint (caffe2.torch.fx.passes.tests.test_pass_manager.TestPassManager) (0.335)
✓ Pass: caffe2/torch/fx/passes:test_pass_manager - test_this_before_that_pass_constraint (caffe2.torch.fx.passes.tests.test_pass_manager.TestPassManager) (0.336)
✓ Pass: caffe2/torch/fx/passes:test_pass_manager - test_pass_manager_builder (caffe2.torch.fx.passes.tests.test_pass_manager.TestPassManager) (0.344)
Summary
Pass: 3
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/6473924544097816
```
Reviewed By: yuhc, wushirong
Differential Revision: D31484770
fbshipit-source-id: 7a8cde4c23727ff612bf7bf0d7b7db5d0c47b1a9
(cherry picked from commit c281c288fe870624574d34cfc93d732d4607f7d0)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74891
As title, otherwise the below error is thrown:
```
TypeError: '>=' not supported between instances of 'int' and 'str'
```
Test Plan: easy
Reviewed By: jackm321
Differential Revision: D35206473
fbshipit-source-id: 200c83b9a19b6aae6f0da03abe99121e55893fd3
(cherry picked from commit 20744d2ce59ea07ecdb2570929dd5344c65b751a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73815
Add `skip_node_names_in_args` (default=`True`) which will skip including node names in args/kwargs during graph drawing.
Test Plan:
Default (`skip_node_names_in_args=True`):
{F707455583}
Vs. `skip_node_names_in_args=False`:
{F707046375}
Reviewed By: wushirong
Differential Revision: D34659144
fbshipit-source-id: 9f0bd7bee98dc1ca8eecdabc960804564d83777b
(cherry picked from commit a0ed64b51f0187115586f4001dc81148c7ed18b9)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73564
While maintaining API backward compatibility, add an optional output parameter to split_module() that returns a mapping from the new qualified names in the modules after the split to the old qualified names in the original module.
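A hedged usage sketch (the traced module, root module, and split callback are assumptions; the keyword name follows the current split_module signature):
```
from torch.fx.passes.split_module import split_module

qualname_map = {}  # filled by split_module: new qualified name -> original qualified name
split_gm = split_module(traced, my_module, split_callback, qualname_map=qualname_map)
```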
Test Plan:
1. Added a test (test_split_qualname_mapping) to test_fx_experimental.py to check the returned qualname mapping
```
$ python test_fx_experimental.py
...
Ran 1084 tests in 73.464s
OK (skipped=531, expected failures=4)
```
2. Ask test_fx.py to accept split_module's new signature
```
$ python test_fx.py --accept
```
Reviewed By: jamesr66a
Differential Revision: D34541792
fbshipit-source-id: e8ec7e77ec884e4db7cad0c0593e31861c76e42d
(cherry picked from commit d2e5a95a353ee5fb52cdba065f127489e9df47ae)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73519
Move `getattr_recursive()` and `setattr_recursive()` to fx main.
Test Plan: CI
Reviewed By: khabinov
Differential Revision: D34524723
fbshipit-source-id: a656e821d9dc1d446aa80cdc03a923bf0c05aeb5
(cherry picked from commit 4835965ac72d299487be14687823ea62394f4079)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73464
- Improve formatting of graph by centering everything
- Add num_users
- Add args/kwargs
- Don't print more than 10 of any list/tuple by default (this is necessary for very large concats)
Test Plan: tested locally
Reviewed By: khabinov
Differential Revision: D34492256
fbshipit-source-id: 8073992edb3efddcf8bfd72e2d3db49cc242db10
(cherry picked from commit b1b802965c143fdb0d308b70f51aa741f7d90f78)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71933
Add the functionalities provided by split.py to splitter_base.
- Propagate submodule inputs
- Create SplitResult to hold the split results.
Then removed split.py; to me this makes navigating the lowering code a bit easier.
Added default split and trace functions for use.
Next step is to add better error handling for each stage during lowering and create unit tests for each stage. I'll probably make some bootcamp tasks for the unit tests.
Test Plan: CI
Reviewed By: frank-wei, wushirong
Differential Revision: D33794322
fbshipit-source-id: f991893047a3701177f54cf22d9a6e48e0529472
(cherry picked from commit 1f3e13efba)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67261
Adds a `Pass(callable)`, `PassManager`, `PassConstraint(callable)`, `PassManagerBuilder`.
The idea is that a `Pass` modifies an IR in-place. `PassConstraint`s define a partial ordering on `Pass`es as a less-than callable. `PassManager` manages the collection of `Pass`es and `PassConstraint`s and ensures validation before execution. `PassManagerBuilder` creates `PassManager`s (example usage in a follow-up diff).
These are very loosely typed, so they could be applied to different IRs as well as to transformations between IRs.
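As a hedged sketch, a "this must run before that" constraint could look like the following (illustrative, not necessarily the exact helper shipped in torch/fx/passes/pass_manager.py):
```
def this_before_that_pass_constraint(this, that):
    def constraint(a, b):
        # For two passes scheduled in the order (a, b), the ordering is
        # invalid only when `that` would run before `this`.
        return not (a is that and b is this)
    return constraint
```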
Test Plan:
```
buck test mode/opt //caffe2/torch/fx/passes:test_pass_manager
```
```
More details at https://www.internalfb.com/intern/buck/build/210
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: c635415b-cdc4-4574-9739-a16d2b93ad3a
Trace available for this run at /tmp/tpx-20220203-114748.608700/trace.log
RemoteExecution session id: reSessionID-c635415b-cdc4-4574-9739-a16d2b93ad3a-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/1970324927640328
✓ ListingSuccess: caffe2/torch/fx/passes:test_pass_manager : 3 tests discovered (0.332)
✓ Pass: caffe2/torch/fx/passes:test_pass_manager - test_this_before_that_pass_constraint (caffe2.torch.fx.passes.tests.test_pass_manager.TestPassManager) (0.232)
✓ Pass: caffe2/torch/fx/passes:test_pass_manager - test_these_before_those_pass_constraint (caffe2.torch.fx.passes.tests.test_pass_manager.TestPassManager) (0.231)
✓ Pass: caffe2/torch/fx/passes:test_pass_manager - test_pass_manager_builder (caffe2.torch.fx.passes.tests.test_pass_manager.TestPassManager) (0.231)
Summary
Pass: 3
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/1970324927640328
```
Reviewed By: jfix71, kflu
Differential Revision: D31316086
fbshipit-source-id: 4302c39e221cfa43e2b2eda9f26d6d78da4db0f1
(cherry picked from commit 13c981ab00)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72145
- Added a predicate that allows us not to lower nodes with specific names.
- Added an observer function to help with the debugging
Reviewed By: jasonjk-park, houseroad
Differential Revision: D33785834
fbshipit-source-id: 7bdb7f33851da1118763c85f8e2121d01e4914a2
(cherry picked from commit 4e2268ed45)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71790
If a leaf module is specified, it means we should treat it as a blackbox and we should just avoid rewriting it too.
Test Plan:
```
buck test caffe2/test:test_fx_acc_tracer
```
with a new unit test.
Reviewed By: jfix71, houseroad, wushirong
Differential Revision: D33731903
fbshipit-source-id: 0560d9e8435b40f30d9b99dc3b2f47d1a04eb38b
(cherry picked from commit 747e9e44ee)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71016
I found out that `split_module` doesn't preserve default values for arguments. In trying to fix that, I noticed that `Graph.placeholder` doesn't make it easy to add a default argument when making a placeholder. This PR addresses both of those issues
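A minimal sketch of the placeholder-with-default part (keyword name as in the current torch.fx API):
```
import torch
import torch.fx

g = torch.fx.Graph()
# `default_value` lets the placeholder carry a default, which split_module can
# then preserve when rebuilding submodule signatures.
x = g.placeholder("x", default_value=None)
g.output(x)
gm = torch.fx.GraphModule(torch.nn.Module(), g)
print(gm())  # prints None: the placeholder's default is used when no arg is passed
```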
Test Plan: Imported from OSS
Reviewed By: ansley
Differential Revision: D33482218
Pulled By: jamesr66a
fbshipit-source-id: 57ebcdab25d267333fb1034994e08fc1bdb128ee
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71172
Break down div to smaller ops to make those div ops look like all other elementwise ops.
Use operator div ops instead of torch.div when possible to avoid converting literal numbers to torch tensors (as in the following).
```
a = 1
b = 2
// `c` would be 0.5
c = a / b
// `c` would be torch.tensor([0.5])
c = torch.div(a, b)
```
The problem we saw on shufflenet is that there's a size op followed by a div op, which results in int64 tensors in the acc-traced graph (the acc tracer turns operator.div into acc_ops.div, which uses torch.div). And the TRT splitter splits out the reshape op that consumes the div op, because we have a rule to split out ops that take int64 tensors as inputs.
Test Plan: Unit tests.
Reviewed By: wushirong
Differential Revision: D33482231
fbshipit-source-id: 508a171520c4e5b4188cfc5c30c1370ba9db1c55
Summary:
In the [docstring](https://github.com/pytorch/pytorch/blob/master/torch/fx/passes/graph_drawer.py#L54-L60) we mention `get_dot_graph` but it is not defined, so I defined it here.
I'm not sure if this is preferred, or whether we should update the docstring to use `get_main_dot_graph` instead.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70541
Test Plan:
```
g = FxGraphDrawer(symbolic_traced, "resnet18")
with open("a.svg", "w") as f:
    f.write(g.get_dot_graph().create_svg())
```
Reviewed By: khabinov
Differential Revision: D33378080
Pulled By: mostafaelhoushi
fbshipit-source-id: 7feea2425a12d5628ddca15beff0fe5110f4a111
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68818
OperatorSupport was blocking all nodes with dtype int64 from lowering. This diff eases the condition, allowing inputs coming from get_attr nodes (which are known not to be used for TRT compute) to have dtype int64.
Reviewed By: brad-mengchi, 842974287
Differential Revision: D32609457
fbshipit-source-id: ea255f3281349a4254cb6abdeed671ab2c0216ba
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68303
The result of the splitter is run either on an accelerator or directly on the GPU; rename the GPU part of the graph to run_on_gpu.
Test Plan: buck test mode/opt caffe2/test:trt_tools_test
Reviewed By: 842974287
Differential Revision: D32392492
fbshipit-source-id: b085376c00c1097752e856e22c631d74a0fbc38f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67569
splitter_base assumes that the first subgraph after the split must be the CPU subgraph if there exists a CPU node. This is wrong: the starting subgraph should be determined by which subgraph has a zero-dependency node.
Also add a unit test for the splitter.
Reviewed By: yinghai
Differential Revision: D32012549
fbshipit-source-id: e2639ccd7774b4295ca05c2ddbefff9726702b3f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66562
Adding shape inference for `acc_ops.quantize_per_channel`, and fixing some bugs.
Bugs were related to the fact that `quantize_per_channel` arguments `scales` and `zero_points` take tensors, so when we fetch the values (which needs to be done using `.tolist()` instead of `.item()`) we may get either a list or a scalar value.
Test Plan:
# Test Quantized Resnet
From sandbox with GPU that supports quantized types (tested with V100)
`buck run mode/opt -c python.package_style=inplace caffe2:fx2trt_quantized_resnet_test`
Output
```
...
[TensorRT] INFO: [MemUsageSnapshot] Builder end: CPU 0 MiB, GPU 1548 MiB
[TensorRT] INFO: [MemUsageSnapshot] ExecutionContext creation begin: CPU 0 MiB, GPU 1548 MiB
[TensorRT] VERBOSE: Using cublasLt a tactic source
[TensorRT] WARNING: TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.1.0
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 0, GPU 1556 (MiB)
[TensorRT] VERBOSE: Using cuDNN as a tactic source
[TensorRT] INFO: [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 0, GPU 1564 (MiB)
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.2.1 but loaded cuDNN 8.0.5
[TensorRT] VERBOSE: Total per-runner device memory is 23405056
[TensorRT] VERBOSE: Total per-runner host memory is 73760
[TensorRT] VERBOSE: Allocated activation device memory of size 154140672
[TensorRT] INFO: [MemUsageSnapshot] ExecutionContext creation end: CPU 0 MiB, GPU 1736 MiB
trt fp16 time (ms/iter) 1.252899169921875
trt int8 time (ms/iter) 1.3774776458740234
trt implicit int8 time (ms/iter) 1.3835883140563965
PyTorch time (CUDA) (ms/iter) 4.34483528137207
PyTorch time (CPU) (ms/iter) 55.687150955200195
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 0, GPU 1918 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 0, GPU 1866 (MiB)
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 0, GPU 1738 (MiB)
WARNING: Logging before InitGoogleLogging() is written to STDERR
W1012 12:07:23.556475 711816 DynoConfigLoader.cpp:32] Failed to read config: No dyno config client
```
# Test shape inference
`buck test mode/opt glow/fb/fx/acc_tracer:test_acc_shape_inference`
Output
```
...
Summary
Pass: 95
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/1407375092088240
```
Reviewed By: jfix71, jerryzh168
Differential Revision: D31457323
fbshipit-source-id: 8ccc4a9b0ca655fb30838e88575aff2bf3a387a6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65542
Add docstring for torch.fx.passes.split_module that conforms to Google Python Style conventions.
Changed original example to the example from this diff:
https://www.internalfb.com/diff/D24925283 (9734c042b8)
Test Plan:
Ran buck test //caffe2/test:fx. No errors detected
https://pxl.cl/1QCch
Reviewed By: jamesr66a
Differential Revision: D31145694
fbshipit-source-id: 8e54f3b1be3dca1c4d414fdeeab71b9f2b5d9f3e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65933
We use `split_module` to split the input model that we want to const fold into const and non-const subgraphs. Previously we were taking the non-const graph and trying to hack it back into the same signature as the input model. However this was complex/buggy.
Instead, refactor to just keep using the base split module that contains both const and non-const graphs. This means we:
- Inline the non-const graph into the split module
- Remove the const graph from the module and replace it with a getattr that will be run to insert that attr when we `run_folding`
Test Plan: Added test coverage to cover newly supported folding, and updated other tests for new strategy.
Reviewed By: yinghai
Differential Revision: D31293307
fbshipit-source-id: 6e283a8c7222cf07b14e30e74dffc8ae5ee8b55f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64787
This PR added support for lowering per-channel quantization and dequantization operators
in fx2trt. It also extends TensorMeta with extra arguments corresponding to per-channel quantized Tensors.
Initially I was thinking of adding a qparam that can capture everything, but currently we still have some lowering support
for fbgemm ops (which have scale and zero_point in the operator interface). I think we can move everything to qparams
after we deprecate lowering support for fbgemm ops in the future.
Test Plan:
Test for per channel weight:
```
python torch/fx/experimental/fx2trt/example/quantized_resnet_test.py
```
change BC compatibility test expect for TensorMeta
```
python test/test_fx.py TestFXAPIBackwardCompatibility.test_class_member_back_compat --accept
```
Imported from OSS
Reviewed By: jfix71, mrshenli, 842974287
Differential Revision: D30879848
fbshipit-source-id: 76c3804bb1d9343183ae53d9f02c1a3bf6c79e1c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65848
This diff includes:
* [fix]: The initialization of `OperatorSupport._support_dict` makes it a class variable, so we need to move its initialization into the constructor.
* Add an abstract class (more of an interface) `OperatorSupportBase`, since `OperatorSupport`'s purpose is too specific.
* [refactor]: What `TRTOperatorSupport` really does is populate an `OperatorSupport._support_dict`, so there really is no reason for subclassing. Remove it, and instead instantiate an `OperatorSupport` with a properly populated `_support_dict`.
* Add a framework for defining simple, basic op-support logic and composing it into more complex rules (a usage sketch follows below):
  1. `create_op_support` wraps a function into an `OperatorSupportBase` instance
  2. `chain` can combine several simple `OperatorSupportBase` instances into more complex ones
  3. `OpSupports` provides a set of pre-defined, simple `OperatorSupportBase` instances that can be composed together using `chain`.
     1. Currently the only pre-defined one is `decline_if_input_dtype(..)`, which declares a node unsupported if its args are of a user-specified dtype
* Fix `TRTOperatorSupport` so that it not only looks for registered converters, but also declines a node if its arg is of int64
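As referenced above, a hedged sketch of composing these helpers (import path and names taken from this summary; the extra rule is hypothetical):
```
import torch
from torch.fx.passes.operator_support import OpSupports, chain, create_op_support

def _decline_call_modules(submodules, node) -> bool:
    # Hypothetical extra rule, just to show composition: only support
    # call_function / call_method nodes.
    return node.op != "call_module"

op_support = chain(
    OpSupports.decline_if_input_dtype(torch.int64),  # pre-defined rule
    create_op_support(_decline_call_modules),        # wrap a plain function
)
```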
Test Plan: linter and CI
Reviewed By: 842974287
Differential Revision: D31275525
fbshipit-source-id: bbc02f7ccf4902a7912bb98ba5be2c2fbd53b606
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65136
Opportunistically add type annotation for operator_support.py
Test Plan: run linter, CI
Reviewed By: yinghai
Differential Revision: D30928464
fbshipit-source-id: 615c75152b9938792f03cdceb2a113bda6ab28c7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64286
During graph splitting, `_SplitterBase` supports taking into consideration whether the subnet boundary nodes
produce "supported" outputs that will cross the acc/non-acc boundary. Specifically, if the backend only
supports Tensor-based data passing across the boundary, then we cannot split the graph at a place where the node
output is a non-Tensor type (e.g., `Tuple[Tensor]`).
There's currently a bug in this logic: it does not correctly detect the output type of a Node. Instead of
using `Node.meta['tensor_meta']`, we should check `Node.meta['type']`.
`Node.meta['tensor_meta']` is not appropriate because this key will exist if the node output is an iterable
and one of the elements is of type `Tensor`, so `Tuple[Tensor]` would be wrongly considered "supported".
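Roughly what the corrected check looks like (a sketch; the helper name is illustrative):
```
import torch
from torch.fx import Node

def is_node_output_tensor(node: Node) -> bool:
    # Rely on the recorded Python type rather than the presence of
    # 'tensor_meta', which also exists for e.g. Tuple[Tensor] outputs.
    node_type = node.meta.get("type", None)
    return node_type is not None and issubclass(node_type, torch.Tensor)
```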
Test Plan:
arc lint
run CI tests
Reviewed By: yinghai, 842974287
Differential Revision: D30617147
fbshipit-source-id: e8ba70dfaddc05cafb8037d58fca73b7ccbb1a49
Summary: Add logging so we know which nodes are currently being visited
Test Plan: lint & SC tests
Reviewed By: 842974287
Differential Revision: D30509865
fbshipit-source-id: 09e77e44c97c825242e0b24f90463b50f3ca19c6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62234
There was a typo that we didn't catch until recently, hence this fix.
Reviewed By: 842974287
Differential Revision: D29924190
fbshipit-source-id: ee6259fcd41358aefe9680b419acc87c0c2821cb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60972
For PyTorch model memory requirement calculation, requires_grad is needed. Output tensors with requires_grad are saved in the module context and increase memory during the forward pass.
Test Plan: Existing test cases
Reviewed By: jamesr66a
Differential Revision: D29024932
fbshipit-source-id: def990f8c6ff6fa4537bfc377c646b9d44464ebd
Summary:
During development it is common practice to put `type: ignore` comments on lines that are correct but that `mypy` doesn't recognize as such. This often stems from the fact that the `mypy` version in use wasn't able to handle the pattern in question.
With every new release `mypy` gets better at handling complex code. In addition to fixing all the previously accepted but now failing patterns, we should also revisit all `type: ignore` comments to see if they are still needed. Fortunately, we don't need to do this manually: by adding `warn_unused_ignores = True` to the configuration, `mypy` will error out whenever it encounters a `type: ignore` that is no longer needed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60006
Reviewed By: jbschlosser, malfet
Differential Revision: D29133237
Pulled By: albanD
fbshipit-source-id: 41e82edc5cd5affa7ccedad044b59b94dad4425a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58699
Make `call_function`/`call_method` nodes random colors based on their target name. This coloring is stable according to the name of the target. Also handle tensor_meta more elegantly for quantized types, including printing q_scale/q_zero_point if they're used.
Test Plan: Tested locally
Reviewed By: chenccfb, 842974287
Differential Revision: D28580333
fbshipit-source-id: ad9961e1106a1bfa5a018d009b0ddb8802d2163c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57483
Pull Request resolved: https://github.com/pytorch/glow/pull/5622
Quantized linear has packed parameters. We want to unpack it so that it would be easier for graph optimization and importer to deal with the weight and bias. A customized remapping function is used to unpack quantized linear and map it to acc_op.linear.
Test Plan: `buck test glow/fb/fx/nnpi_importer:test_importer`
Reviewed By: gcatron, jfix71, khabinov
Differential Revision: D27451237
fbshipit-source-id: e46e961734788fd5333e227ca6143fd37c33204e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57280
We've found an issue where a fusion group can result in a circular dependency. For example:
```
a -> b -> c -> d
|              ^
+--------------+
```
Only a has a non-tensor output, and currently we would create a fusion group (a, b, d). This results in a circular dependency, because the fusion group now depends on c while c depends on the fusion group as well.
This diff implements the solution discussed before: when we add a node to a fusion group, we also add all the nodes that sit between the fusion group and the newly added node.
Use the same logic in minimizer to build fusion group.
Test Plan: split_tests and net_min_tests
Reviewed By: khabinov
Differential Revision: D27917432
fbshipit-source-id: a3d99fe5929dbc9f8eb0f45bccd83fd7b173795a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57279
Added an option "return_intermediate". If true, when building the submodule we want to run, we will replace the output with all the nodes, so that the intermediate results of all the nodes are returned as output.
This is recommended to use with `run_node()` function.
Test Plan: `buck test glow/fb/nnpi/lowering:net_min_tests`
Reviewed By: khabinov
Differential Revision: D27913887
fbshipit-source-id: 5a3eab02da05214fb9adeb25656c267b58075b1d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56201
Refactor Splitter and Minimizer into the superclasses `_SplitterBase` and `_MinimizerBase` and move them to OSS. This is needed to create an OSS example of GPU lowering with those tools.
Test Plan: CI
Reviewed By: jackm321
Differential Revision: D27629598
fbshipit-source-id: 0d4da02105ca509b31f1a6c4a39b1122c2bc7bf0
Summary:
Commandeered from https://github.com/pytorch/pytorch/pull/54563
Primary changes from first PR:
1. Refactored primary `normalize_function` logic into `operator_schemas.py` so that non-FX users can use it.
2. Refactored tests a bit, and added a path to call `normalize_function` directly.
3. Moved check for `boolean_dispatch` so that `torch.lu` also gets properly handled.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55992
Reviewed By: mruberry
Differential Revision: D27774396
Pulled By: Chillee
fbshipit-source-id: 7f65632e1d608e4abd55aec5ccbfdc3f67f52b8e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56212
The current design doesn't make it easy to use `node.copy()`. Explicitly copy over the node's meta.
Test Plan: Updated `test_subgraph_creation` in `test_fx_experimental`
Reviewed By: jamesr66a
Differential Revision: D27808477
fbshipit-source-id: 7fe7b6428c830307dbd1e395f16fa2774936d3b3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55405
Pull Request resolved: https://github.com/pytorch/glow/pull/5516
Allows FXIRImport to import quantized model.
This diff doesn't include support for per-channel weights, linear and conv. Will address them in the next diff.
Test Plan: buck test glow/fb/fx/nnpi_importer:test_importer
Reviewed By: jackm321, jfix71
Differential Revision: D27313543
fbshipit-source-id: bf5c96ef5f2ff1835c09db981e0ceefaec56dd5b