To match nodes within the graph, the matcher currently flattens the arguments and compares them element by element. However, if it believes that a list input contains only literals, it does not flatten the list and instead compares the lists directly. It decides whether a list contains only literals by checking whether the first element is a node, which doesn't work in some cases (like the test cases added here).
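A minimal sketch of the heuristic described above and one possible fix (hypothetical helper functions, not the actual matcher code):

```python
from torch.fx import Node

def is_all_literal_buggy(xs):
    # Old heuristic: only inspect the first element, so a mixed list like
    # [node, 1, 2] is wrongly treated as all-literal and never flattened.
    return isinstance(xs, (list, tuple)) and not isinstance(xs[0], Node)

def is_all_literal_fixed(xs):
    # Check every element instead of just the first one.
    return isinstance(xs, (list, tuple)) and all(not isinstance(x, Node) for x in xs)
```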
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94375
Approved by: https://github.com/SherlockNoMad
Fixes https://github.com/pytorch/pytorch/issues/89421
The strategy is to patch the given function wrapped with `@torch.fx.wrap` so that if a tensor tracer is active, we will `proxy_call` the function.
`proxy_call` will also skip certain checks if the function to proxy-call is not a torch op (checked with `isinstance(.., OpOverload)`).
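For context, a minimal reminder of how `@torch.fx.wrap` is used; the change described here applies analogous `proxy_call` handling when a tensor tracer is active (the sketch below only shows plain symbolic tracing):

```python
import torch
import torch.fx

@torch.fx.wrap  # must be applied at module level
def add_one(x):
    return x + 1

class M(torch.nn.Module):
    def forward(self, x):
        return add_one(x)

gm = torch.fx.symbolic_trace(M())
print(gm.graph)  # add_one appears as a single call_function node instead of being traced through
```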
@IvanYashchuk @ezyang @Chillee
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93273
Approved by: https://github.com/ezyang
Summary:
One such place where a circular reference can occur: `_load_state_dict_pre_hooks` contains a `_WrappedHook`, and the `_WrappedHook` has a weakref to the same module.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93038
Approved by: https://github.com/jerryzh168
In 3.11 the bytecode instruction size is not constant, so in order to get from `f_lasti` to an opcode index, one needs to search for the closest offset in the disassembled instructions.
Update `_patch_function` to construct code with all the properties that exist in 3.11 runtime.
Update `_torchscript_schema_to_signature` to mark the `from` named arg as positional-only, as `from` is a reserved keyword in Python and as such is checked by the `inspect` package in 3.11.
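A rough sketch of the offset search mentioned above (hypothetical helper using `dis` instruction offsets, not the actual implementation):

```python
import dis

def instruction_index_for_lasti(code, f_lasti):
    # In 3.11 instructions are variable-sized, so f_lasti cannot be divided by
    # a fixed instruction size; instead find the instruction whose offset is
    # the closest one not exceeding f_lasti.
    offsets = [inst.offset for inst in dis.get_instructions(code)]
    candidates = [i for i, off in enumerate(offsets) if off <= f_lasti]
    return candidates[-1] if candidates else 0
```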
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92895
Approved by: https://github.com/albanD
# Summary
In preparation for the PT 2.0 launch, this PR updates SDPA's API and makes the function a public `nn.functional` function.
## Changes
### API
Previously the function signature was:
`scaled_dot_product_attention(query, key, value, attn_mask=None, need_attn_weights=False, dropout_p=0.0, is_causal=False) -> (Tensor, Tensor)`
Updated signature:
`scaled_dot_product_attention(query, key, value, attn_mask=None, dropout_p=0.0, is_causal=False) -> Tensor`
This PR removes the `need_attn_weights` optional boolean argument and updates the return type to a single tensor.
#### Reasoning:
The main goal of this function is to provide an easy interface for users to call into fused attention kernels, e.g. FlashAttention. The fused kernels do not currently support arbitrary attn_mask or dropout, but there is a PR against mem-efficient attention to enable these. We want to have the API surface ready for when the backing kernels get updated.
The fused kernels save memory by not materializing the attention weights, and it is unlikely that a fast fused implementation will ever support returning them, so we are removing the option.
Discussed with folks at FAIR/Xformers, who +1'd this API change.
#### Make function Public
In preparation for the PT 2.0 launch, we make the function public to start generating user feedback.
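A minimal usage sketch of the updated API (shapes chosen arbitrarily):

```python
import torch
import torch.nn.functional as F

# (batch, num_heads, seq_len, head_dim)
query = torch.randn(2, 8, 128, 64)
key = torch.randn(2, 8, 128, 64)
value = torch.randn(2, 8, 128, 64)

# The updated signature returns a single tensor; attention weights are no longer returned.
out = F.scaled_dot_product_attention(query, key, value, attn_mask=None, dropout_p=0.0, is_causal=True)
print(out.shape)  # torch.Size([2, 8, 128, 64])
```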
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92189
Approved by: https://github.com/cpuhrsch
Summary:
This PR supports the following feature for QConfigMapping:
```
qconfig_mapping = QConfigMapping().set_object_type(torch.nn.Conv2d, qconfig)
backend_config = get_qnnpack_pt2e_backend_config()
m = prepare_pt2e(m, qconfig_mapping, example_inputs, backend_config)
```
which means users can set the qconfig for all calls to `torch.nn.Conv2d` to use `qconfig`. Note this is only verified for the case where the module is broken down to a single aten op right now, e.g. `torch.nn.Conv2d` becomes the `torch.ops.aten.convolution` op when traced through. We will need to support more complicated modules that are broken down into multiple operators later, e.g. MaxPool.
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_qconfig_module_type
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92355
Approved by: https://github.com/jcaip
This PR:
- Updates the docs to say it is deprecated
- Raises a UserWarning
- Changes most of the callsites inside PyTorch to use
torch.func.functional_call, minus the test_stateless testing.
The motivation behind this is that we can now align behind a single
functional_call API in PyTorch.
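A minimal sketch of the `torch.func.functional_call` API that callsites are being moved to:

```python
import torch
from torch.func import functional_call

model = torch.nn.Linear(3, 3)
x = torch.randn(2, 3)

# Run the module with an externally supplied set of parameters/buffers.
params = {name: p.detach().clone() for name, p in model.named_parameters()}
out = functional_call(model, params, (x,))
```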
Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92280
Approved by: https://github.com/albanD
Fixes: https://github.com/pytorch/pytorch/issues/88010
This PR does a couple things to stop slow gradcheck from timing out:
- Splits out test_ops_fwd_gradients from test_ops_gradients, and factors out TestFwdGradients and TestBwdGradients which both inherit from TestGradients, now situated in common_utils (maybe there is a better place?)
- Skips CompositeCompliance (and several other test files) for slow gradcheck CI since they do not use gradcheck
- because test times for test_ops_fwd_gradients and test_ops_gradients are either unknown or wrong, we hardcode them for now to prevent them from being put together. We can undo the hack after we see actual test times are updated. ("def calculate_shards" randomly divides tests with unknown test times in a round-robin fashion.)
- Updates references to test_ops_gradients and TestGradients
- Test files that are skipped for slow gradcheck CI are now centrally located in run_tests.py. This reduces how fine-grained we can be with the skips, so for some skips (one so far) we still use the old skipping mechanism, e.g. for test_mps
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88216
Approved by: https://github.com/albanD
Re-submit of gh-72302
This still has a small performance hit, but it is much smaller. On my
machine I see `_record_function_exit._RecordFunction` take 1.05 us
compared to the `Tensor` overload taking 0.79 us.
In an overall comparison, I see a 0.7 us slowdown from 6.0 us to
6.7 us for this timeit benchmark
```python
import torch

def foo():
    with torch.profiler.record_function("foo"):
        return torch.eye(3)

%timeit foo()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76420
Approved by: https://github.com/robieta
Summary:
Given the original execution order and the node dependency relationship (note that the same dependency relationship can generate multiple valid execution orders, i.e. topological orders), after reunion we may find that the execution order of the new GraphModule differs from the original one, which is not what we want.
For example, assume NewLeaf_1 is EmbeddingLookup (calling EmbeddingLookup is awaitable: we keep executing the following nodes rather than waiting for the result until we actually need it), and NewLeaf_4 is the node that HAS to consume the lookup result to interact with NewLeaf_3. NewLeaf_1 launches a lookup kernel and an all2all communication stream to distribute the result to all ranks; in the meantime we want to keep executing NewLeaf_2 and NewLeaf_3 to avoid pointless waiting. However, given the new execution order, we have to wait for the lookup kernel and all2all communication to finish because the next node, NewLeaf_4, needs the result; only then can we execute NewLeaf_2, etc. This fails to overlap computation with the communication stream and hurts QPS a lot.
So while constructing the GraphModule, we have to switch from the topological order back to the original order.
Test Plan:
Unit test
Not sure how to add tests in FX as there's no TARGETS, so I added them in the TorchRec folder
Differential Revision: D39567314
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85188
Approved by: https://github.com/SherlockNoMad
Ran across this issue while using torch.fx.wrap on a decorated function: it triggered a `KeyError: 'wrapper_inside_decorator'`. torch.fx.wrap stores `function.__code__.co_name`, but for decorated functions that isn't set correctly (and doesn't match the function's name in the global namespace); `function.__name__` is set correctly.
Also adjusted the check to test for callable rather than for the existence of `__code__`, to allow a broader variety of functions to be passed in. E.g. `functools.cache` returns a callable that won't have a `__code__` attribute.
I added a unit test (that incidentally fails every test in the suite before the fix commit -- because it affects the global state), and then a fix that addresses it.
```
In [1]: import functools
In [2]: def decorator(f):
   ...:     @functools.wraps(f)
   ...:     def wrapper(*args, **kwargs):
   ...:         return f(*args, **kwargs)
   ...:     return wrapper
   ...:
In [3]: @decorator
   ...: def some_function(x):
   ...:     return x
   ...:
In [4]: some_function.__name__
Out[4]: 'some_function'
In [5]: some_function.__code__.co_name
Out[5]: 'wrapper'
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84373
Approved by: https://github.com/jamesr66a, https://github.com/SherlockNoMad
Fixing [failing](https://github.com/pytorch/pytorch/runs/8083404365?check_suite_focus=true) tests by adjusting the input size for S3D. The test is failing because S3D requires a larger input size than the one previously passed.
As noted before, TorchVision already checks that its models are FX traceable and ensures all the tests are updated and work properly prior to adding new architectures. The tests here seem to duplicate our efforts and often break because they don't factor in details about each model. It might be worth considering running TorchVision's tests instead.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84526
Approved by: https://github.com/pbelevich
Resolves [breakages](https://github.com/pytorch/pytorch/runs/7762125339?check_suite_focus=true) observed at #82560
Context:
The current FX tests assume that every public method under `torchvision.models` is a model builder method. To get a list of those methods, they query the `__dict__` attribute of the module. Unfortunately this assumption is not true and the tests already contain some workarounds to filter some methods. A better approach would be to query TorchVision for all of its available models under a specific module. This is exactly what the new Registration API can help us do and that's what we use in this PR.
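A sketch of the query, assuming the registration API entry points are `list_models`/`get_model` under `torchvision.models` (names as they exist in recent TorchVision releases):

```python
import torchvision
from torchvision.models import list_models, get_model

# Ask TorchVision for the models registered under a specific module,
# instead of scraping torchvision.models.__dict__ for public callables.
names = list_models(module=torchvision.models)
model = get_model(names[0], weights=None)
```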
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83187
Approved by: https://github.com/ezyang
PassManager is a class used to run multiple passes on a given graph module.
Class Attributes
* `passes: List[Callable]`: A list of callable passes
* `constraints: List[Callable]`: A list of constraints
* `run_checks_after_each_pass`: Flag for running checks after each pass
Class Methods:
* `__call__(graph_module: DispatchGraphModule)`:
* Runs the passes in order until the graph stops changing, or until it has run `steps` times.
* Each time a pass is run, it checks that the graph module still maintains the required invariants by calling `check()`, and lints the graph to check that it is well formed if the `run_checks_after_each_pass` flag is set.
* `check(graph_module: DispatchGraphModule)`: Runs various checks on the given graph module to make sure that it contains the needed data for passes
* `add_check(check: Callable)`: Adds the `check` function to the given pass manager instance
* `add_constraint(constraint: Callable)`: Adds a constraint to the current list of constraints
We can create a PassManager and run it by doing:
```
PassManager(passes=[pass1, pass2])(graph_module)
```
Differential Revision: [D37523159](https://our.internmc.facebook.com/intern/diff/D37523159)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80531
Approved by: https://github.com/SherlockNoMad
There are small typos in:
- caffe2/python/recurrent.py
- test/distributed/test_c10d_nccl.py
- test/test_fx.py
- torch/csrc/jit/runtime/autodiff.cpp
- torchgen/gen.py
Fixes:
- Should read `propagation` rather than `propogation`.
- Should read `multiplied` rather than `multuplied`.
- Should read `eliminate` rather than `elminate`.
- Should read `dispatcher` rather than `disaptcher`.
Semi-automated pull request generated by
https://github.com/timgates42/meticulous/blob/master/docs/NOTE.md
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81435
Approved by: https://github.com/ngimel
If installed with pep517 support, `torchvision` will be built against the released version of PyTorch rather than against the one currently installed on the system.
Also update `torchvision` hash to 8a45147f9d and:
- Added `maskrcnn_resnet50_fpn_v2`, `maskrcnn_resnet50_fpn_v2`, `retinanet_resnet50_fpn_v2`, `ssd300_vgg16`, `fcos_resnet50_fpn` and `ssdlite320_mobilenet_v3_large` to the list of untraceable models
- Set default input size to (1, 3, 16, 224, 224) for `mvit_v1_b` model
- Skipped `test_roi_aligned`,`test_batched_nms`, `test_roi_pooled` and `test_roi_align_aligned` ONNX test (tracked in https://github.com/pytorch/pytorch/issues/81121 )
- Skipped TorchVision integration tests in `test_package` (tracked in https://github.com/pytorch/pytorch/issues/81115 )
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81074
Approved by: https://github.com/kit1980
Summary: The root module may have different forward functions. The current implementation assumes only the `forward` function can be traced. In this PR, we add a function-name attribute to the Tracer class to enable users to trace different functions.
Test Plan:
python3 test/test_fx.py TestFX.test_trace_multiple_funcs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77502
Approved by: https://github.com/jamesr66a
Summary: The root module may have different forward functions. The current implementation assumes only the `forward` function can be traced. In this diff, we add a forward-function-name argument to enable users to trace different forward functions.
Test Plan: N1903198
Differential Revision: D36157032
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77109
Approved by: https://github.com/jamesr66a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76253
We're observing a large QPS regression from the original PR https://github.com/pytorch/pytorch/pull/72302. For the training job we had, it regressed from 720k QPS to 450k QPS (see the test plan in FB internal). We suspect this is because the API was changed from `_record_function_enter` to `_record_function_enter_new`, and we're running experiments to confirm that. Will add more details when the runs in the test plan have finished. For now, it's better to revert the diff to unblock internal use cases, and we can think about how to reland it later.
Original commit changeset: dc9939f1fa6d
Original Phabricator Diff: D35257354
Test Plan:
on trunk: f338665947
with this diff: f338502850
Reviewed By: malfet, robieta
Differential Revision: D35853300
fbshipit-source-id: dd38042aeacb848f66756491a4c849c7c652a0e1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74662
Previously, we would not emit a check that `concrete_args` with value `None` actually matched that value at runtime. This fixes that and improves some of the warning messages.
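A minimal sketch of the kind of specialization this check covers:

```python
import torch
import torch.fx

def f(x, flag=None):
    if flag is None:
        return x + 1
    return x - 1

# Specializing on flag=None bakes the `flag is None` branch into the trace;
# with this change the generated code also verifies at runtime that the
# caller really passed None for flag.
gm = torch.fx.symbolic_trace(f, concrete_args={"flag": None})
print(gm.code)
```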
Test Plan: Imported from OSS
Reviewed By: Chillee
Differential Revision: D35137362
Pulled By: jamesr66a
fbshipit-source-id: 222a2c8a907748f90290f1c1b4ab8012b46099a0
(cherry picked from commit b960405ad87e57dcf62ca25dd4d4bdfc34c8744c)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74637
Forgot to update the expect file in https://github.com/pytorch/pytorch/pull/74242. Reland to include changes in expect file.
Test Plan: unit test
Reviewed By: yinghai
Differential Revision: D35089989
fbshipit-source-id: 5e3ad9c696cf31cbc691d34fdb77eff26f92e38d
(cherry picked from commit 110ac12f5e2bcca7552d4b4691c7d98fafb21a57)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74242
The inputs and outputs of the graph module might be different from the graph inputs and outputs if users are using a custom codegen. The Interpreter runs the graph instead of the generated forward function, so it might not work if the user provides the graph module's inputs. To fill the gap, we call `process_inputs` and `process_outputs` inside the Interpreter.
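A minimal reminder of `Interpreter` usage; with this change, a graph module whose graph carries a custom codegen can also be interpreted with the module-level inputs, since they now flow through `process_inputs`/`process_outputs`:

```python
import torch
import torch.fx

gm = torch.fx.symbolic_trace(torch.nn.Linear(3, 3))
out = torch.fx.Interpreter(gm).run(torch.randn(2, 3))
```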
Test Plan: unit test: test_interpreter_with_codegen
Reviewed By: jamesr66a, Chillee
Differential Revision: D34898108
fbshipit-source-id: 250bd236f6c8c1268a363cf19a09521a4f64b3a9
(cherry picked from commit b33076fa3b10788d455cecc590bc01c4ad8ef94c)
Summary:
The fx test case wasn't disabled properly because it didn't call the parent class' setUp().
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74216
Reviewed By: zou3519
Differential Revision: D34898707
Pulled By: janeyx99
fbshipit-source-id: 83e56f5a1efc50d24646c182160f7cfcb5bc9935
(cherry picked from commit bb8dd72d1640c1ef0201d615c5d405479afdf078)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74189
Use the codegen on the original graph module for the new graph module produced by transformer.
Test Plan: Added a unit test: test_custom_codegen_with_transformer
Reviewed By: yinghai
Differential Revision: D34867938
fbshipit-source-id: fcda6600faeccfa7a650ba7226ca125e8440b19c
(cherry picked from commit d098c12081f61ddcf69052db5b8a1f31b0a0b67b)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73763
The test that is enabled generates a graph as such:
```
linear_25 --> sigmoid_14 --> output_1
\--> output_2
```
Before this diff, (unpadding) layout_transform nodes would be added as follows:
```
linear_25 --> layout_xform1 --> sigmoid_14 --> layout_xform2--> output_1
\--> output_2
```
This causes an assertion to fail for the sigmoid node where the input and output types
don't match due to padding differences.
This diff modifies the replacement algorithm to not affect users of an output's parent node
when the user requires padded inputs. This yields the following graph instead:
```
linear_25 --> sigmoid_14 --> layout_xform2--> output_1
\--> layout_xform1 --> output_2
```
Test Plan: Manually and CI
Reviewed By: jfix71, dborkovic
Differential Revision: D34623590
fbshipit-source-id: 3834b06c95fc5626eccc282216cbe039ac5a3242
(cherry picked from commit af012372ae1a6bb654b0ed9b765993960d5251e4)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73198
Previously, if an arg to an FX node was a subclass of tuple, it got sanitized back to the plain base class. An example is setting an arg to a TensorMetadata object, which is a NamedTuple: it would be set as a plain tuple instead.
- Change `map_aggregate` to repack the tuple to `type(a)` when it's not directly a tuple (try/except for best attempt); see the sketch after this list
- During codegen, call `add_global` for `type(a)` if it's not directly a tuple.
- Add an option for an arg to provide a `_custom_fx_repr_fn` for use inside stringifying via `_format_arg`
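As referenced above, a minimal sketch of the `map_aggregate` repacking behavior this change enables:

```python
from typing import NamedTuple
from torch.fx.node import map_aggregate

class Point(NamedTuple):
    x: int
    y: int

# With this change, map_aggregate repacks the result into the original
# NamedTuple subclass instead of collapsing it to a plain tuple.
out = map_aggregate(Point(1, 2), lambda v: v * 2)
print(type(out).__name__)  # Point
```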
Test Plan: Added unit test coverage, where we inline the named tuple into arg/kwarg.
Reviewed By: jamesr66a
Differential Revision: D34381888
fbshipit-source-id: bd672a8542e2bba5aa604b448bec920efc256440
(cherry picked from commit 68f99c12dd)
Summary:
The goal of this is to make FX's codegen extensible. I've refactored it into a class with 5 extensibility points on it.
```
class Codegen(object):
    def generate_prologue(self, free_vars: List[str], maybe_return_annotation: str) -> str:
        """
        Given the free variables and a return annotation, generates the beginning of the FX function.
        By default, `generate_prologue(['a', 'b'], '') == 'def forward(a, b):'`
        """

    def generate_output(self, output_args: Argument) -> str:
        """
        Given the output arguments, generates the return statement of the FX function.
        """

    def process_inputs(self, args: Any) -> Any:
        """
        Transforms the inputs so that the graph can take them as arguments, as
        non-default codegen may result in the inputs to the function being
        different from the inputs to the graph.
        If the graph was directly runnable, this invariant should hold true
        `f.process_outputs(f.graph(*f.process_inputs(*inputs))) == f(*inputs)`
        """

    def process_outputs(self, outputs: Any) -> Any:
        """
        Transforms the outputs of the graph to be identical to the codegen.
        See ``process_inputs`` for more details.
        """

    def additional_globals(self) -> List[Tuple[str, Any]]:
        """
        If your codegen uses extra global values, add them here.
        For example, return ['List', typing.List] if you need ``List`` in the global context.
        """
```
So, for example, the `ListCodeGen` we want for AOTAutograd looks like this
```
class ListCodeGen(CodeGen):
    def generate_prologue(self, free_vars, maybe_return_annotation):
        lst_unpack = f"""
def forward(self, args_list: List[torch.Tensor]){maybe_return_annotation}:
    {', '.join(free_vars)} = args_list"""
        return lst_unpack

    def additional_globals(self):
        return [('List', typing.List)]

    def process_inputs(self, *inputs):
        assert(len(inputs) == 1)
        return inputs[0]
and
```
def f(a, b):
    return a + b

nf = fx.symbolic_trace(f)
nf.graph.set_codegen(ListCodeGen())
nf.recompile()
print(nf.code)
```
would result in
```
def forward(self, args_list: List[torch.Tensor]):
    a, b = args_list
    add = a + b; a = b = None
    return add
```
Backwards compatibility changes - I added `process_outputs` and `process_inputs` to `fx.Graph`, while removing `flatten_inputs` and `flatten_outputs` - those didn't have `backwards_compatibility` on them, so I *think* it's probably fine?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72566
Reviewed By: desertfire
Differential Revision: D34160424
Pulled By: Chillee
fbshipit-source-id: ebf6411312b373e3fbcb13288a34befa449a2375
(cherry picked from commit 13cd12eaa1)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61608
See #61544 for an example of issues created by functional wrappers. In this
case, these are directly wrapping the native function with no added
functionality. One exception was `bilinear` which was just missing the default
argument in C++, but was otherwise the same.
I've kept the symbol `torch.functional.istft` because it looks like public API,
but it could just as easily be moved to `_torch_docs.py`.
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D31401361
Pulled By: albanD
fbshipit-source-id: 162b74d0b2d4f2e5c4834687a94541960cefdd52
(cherry picked from commit 700cd73ca1)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70960
This patch uses some bytecode introspection logic to see if a boolean is being used as an assert condition; if so, it records the assert in the FX graph and allows the trace to continue.
Test Plan: Imported from OSS
Reviewed By: jamesr66a
Differential Revision: D33570397
Pulled By: zdevito
fbshipit-source-id: 99d26cede8fe42c96d4032d9353c1ede7eb3d969
(cherry picked from commit 30d002da25)
Summary:
Pretty print inplace operators (`a+=b`, etc) in generated FX code. This is useful because it allows `torch.jit.script()` to parse these operators without error.
I don't believe FX tracing supports inplace ops yet, though I am generating them in torchdynamo and want to be able to lower them with torchscript.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71887
Reviewed By: jamesr66a
Differential Revision: D33806248
Pulled By: jansel
fbshipit-source-id: 5eb9f744caab2f745cefc83ea658e12e9e7a817d
(cherry picked from commit eacbd6bb83)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69368
Before this PR, copying a node would lose the stack trace. This PR
ensures that the stack trace is preserved across copies.
This is useful because quantization passes would like to start
allowing the user to preserve stack traces, and we use the copy
behavior.
Test Plan:
```
python test/test_fx.py TestFX.test_stack_traces
```
Imported from OSS
Reviewed By: jamesr66a
Differential Revision: D32835248
fbshipit-source-id: 91610fd8d05f5683cfa5e11fb6f9f3feacb8e241
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69032
I am removing it because, for packaging-related reasons, it's easier if
torch.fx is a pure Python module.
I don't think there is much reason to keep it: this functionality was
experimental, has no known users currently, and we didn't have a clear
path to turning it on by default due to regressions in tracing
performance. Also, it was only ever enabled for `rand` and friends.
Technically the removal of the `enable_cpatching` arguments on
`symbolic_trace` and `Tracer.__init__` are BC-breaking, but the
docstrings clearly state that the argument is experimental and BC is not
guaranteed, so I think it's fine.
Test Plan: Imported from OSS
Reviewed By: soulitzer
Differential Revision: D32706344
Pulled By: suo
fbshipit-source-id: 501648b5c3610ae71829b5e7db74e3b8c9e1a480
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67357
This PR adds OpInfos for:
- new_ones, new_zeros, new_full, new_empty
- rand_like, randint_like
I forgot to add the _like functions in a previous PR, so here they are.
Test Plan: - wait for tests
Reviewed By: mruberry
Differential Revision: D31969533
Pulled By: zou3519
fbshipit-source-id: 236d70d66e82f1d6f8e5254b55ca2a37b54c9494
Summary:
Most of the failing tests fail because the test doesn't work with Python functions (only builtins like `torch.add`).
I added a check for that and ported the remaining skips into the `skips` field.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67520
Reviewed By: ZolotukhinM
Differential Revision: D32046856
Pulled By: Chillee
fbshipit-source-id: 05fa3e3c40fa6cc4f776e0c24f667629b14afd25
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67068
Prepending a node to itself results in the node being removed from the graph.
Usually people won't prepend a node to itself, but they might accidentally try to append a node that's already next to the `self` node, which ends up prepending `self` to `self`.
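A minimal sketch of the accidental case described above (`Node.append` is implemented by prepending onto the next node, so appending a node that already directly follows `self` boils down to prepending the node to itself):

```python
import torch
import torch.fx

g = torch.fx.Graph()
x = g.placeholder("x")
y = g.call_function(torch.add, (x, x))

# y already directly follows x, so this append amounts to y.prepend(y),
# which used to silently drop y from the graph.
x.append(y)
```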
Test Plan: Added a unit test
Reviewed By: jamesr66a
Differential Revision: D31849030
fbshipit-source-id: b0fdfbb893f785f268595acd823b426d57c15e61
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64282
OpInfos for:
- Tensor.bfloat16, Tensor.bool, Tensor.byte, Tensor.char
- Tensor.double, Tensor.float, Tensor.half, Tensor.int
- Tensor.short, Tensor.long
None of these are supported by TorchScript. Also, the OpInfo autograd
test runner assumes that the operation is not allowed to change the
dtype of the argument, so only Tensor.double has
`supports_autograd=True` (in theory Tensor.bfloat16, Tensor.float,
Tensor.half should be differentiable).
Test Plan: - run tests
Reviewed By: dagitses
Differential Revision: D31452627
Pulled By: zou3519
fbshipit-source-id: b7f272e558558412c47aefe947af7f060dfb45c5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66430
On the whole, I'm not totally satisfied with this approach. I think we should be building a prefix tree data structure during initial iteration over the submodules and querying that when deleting submodules. But I think this approach works and I want to see if we can get it in before 1.10
Test Plan: Imported from OSS
Reviewed By: Chillee
Differential Revision: D31546137
Pulled By: jamesr66a
fbshipit-source-id: f08b8409a3cf511277017ccccb916097b7c4c4fe
Summary:
Currently, if the same tensor constant is reused multiple times, we'll store a tensor constant for each time we use it.
For example
```
val = torch.randn(5)
for _ in range(10):
    x = x + val
```
ends up storing 10 tensor constants.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66211
Reviewed By: jamesr66a
Differential Revision: D31437089
Pulled By: Chillee
fbshipit-source-id: 401169c8d58ce0afb7025ae11060680ef544419f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66048
Previously, create_arg would fail if it encountered a non-`None` layout argument. Adding it to the `BaseArgumentTypes` list should be enough to fix that.
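A minimal sketch of the kind of call that previously tripped `create_arg` (hypothetical module; the non-None `layout` kwarg is the relevant part):

```python
import torch
import torch.fx

class M(torch.nn.Module):
    def forward(self, x):
        # The layout kwarg goes through create_arg when this call is
        # recorded as a node in the traced graph.
        return x + torch.ones(2, 2, layout=torch.strided, device=x.device)

gm = torch.fx.symbolic_trace(M())
```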
Test Plan: Added unittest
Reviewed By: jamesr66a
Differential Revision: D31362662
fbshipit-source-id: 20049971e18c17e9c75e50540500c567266daa55
Summary:
Previously this resulted in `AttributeError: module 'operator' has no attribute 'and'`.
`and`/`or` are Python keywords, so the corresponding functions are named `operator.and_` and `operator.or_`.
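A minimal sketch of where this shows up:

```python
import torch.fx

def f(a, b):
    # `&` on proxies is recorded as a call_function node targeting
    # operator.and_ ("and"/"or" are keywords, hence the trailing underscore).
    return a & b

gm = torch.fx.symbolic_trace(f)
print(gm.code)
```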
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65196
Reviewed By: Chillee
Differential Revision: D31020336
Pulled By: jansel
fbshipit-source-id: 51d888151fe78c0c1197ecaf161976b219c59694
Summary:
OpInfo tracker: https://github.com/pytorch/pytorch/issues/54261
- Eliminate duplicated testing logic in test_autograd
- Moved tests that rely on this testing logic to use OpInfos
- `cat` already has OpInfo (no action needed)
- Created OpInfo for `block_diag` and `broadcast_tensors`
Running into some FX errors. Added op to skip-list and created an issue here: https://github.com/pytorch/pytorch/issues/64997
Both `block_diag` and `broadcast_tensors` are variadic, so skipping `test_variant_consistency_jit` (from comments on other OpInfos, it looks like JIT does not support variadic tensors)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64993
Reviewed By: jbschlosser
Differential Revision: D30961736
Pulled By: soulitzer
fbshipit-source-id: e169305384a683acae1178c4e12e9e214a67226a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64271
Closes #60417
Modified emit_node() in fx/graph.py to generate a getattr() call with a default value when len(node.args) != 2, instead of emitting attribute access.
Added test_torch_fx_getattr() in test/test_fx.py.
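A minimal sketch of the node shape this handles, building the graph by hand (attribute name chosen arbitrarily):

```python
import torch
import torch.fx

g = torch.fx.Graph()
x = g.placeholder("x")
# Three-argument getattr (with a default): emit_node now generates
# `getattr(x, 'foo', None)` instead of attribute-access syntax `x.foo`.
attr = g.call_function(getattr, (x, "foo", None))
g.output(attr)

gm = torch.fx.GraphModule(torch.nn.Module(), g)
print(gm.code)
```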
Test Plan:
pytest test/test_fx.py
Imported from OSS
Reviewed By: jamesr66a
Differential Revision: D30671265
fbshipit-source-id: f2db9ea47e0cb247547e200684f715aab006c374
Summary:
Implements feature request https://github.com/pytorch/pytorch/issues/62021
Test it out with
```python
from torch import fx
from torch import nn
def fx_int(x):
    return int(x)

class MyModule(nn.Module):
    def forward(self, x):
        return fx_int(x.shape[0] / 2)

tracer = fx.Tracer(autowrap_functions=(fx_int,))  # or remove kwarg to demonstrate symbolic trace error
tracer.trace(MyModule())
```
First time contributor, so please advise if I could have done anything to make lives easier for next time.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62106
Reviewed By: SplitInfinity, driazati
Differential Revision: D30080834
Pulled By: jamesr66a
fbshipit-source-id: 68fadf8c881ea7930e7afd62b642874010fe4903
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62436
## Problem
Given two modules and a tracer that indiscriminately marks all modules as a leaf:
```
class InnerModule(torch.nn.Module):
    def forward(self, t):
        return t + t

class MyModule(torch.nn.Module):
    def __init__(self, inner):
        super().__init__()
        self.inner = inner

    def forward(self, t):
        x = self.inner(t)
        y = self.inner(t)
        return x + y

class MyTracer(torch.fx.Tracer):
    def is_leaf_module(self, module, name):
        return True
```
One might generally expect the following behavior (note call_module nodes):
```
print(">> Outer GraphModule (with inner module as nn.Module):")
inner = InnerModule()
m = MyModule(inner)
gm = torch.fx.GraphModule(m, MyTracer().trace(m))
print(gm.graph.print_tabular())
>> Outer GraphModule (with inner module as nn.Module):
opcode name target args kwargs
------------- ------- ----------------------- ---------------- --------
placeholder t t () {}
call_module inner inner (t,) {}
call_module inner_1 inner (t,) {}
call_function add <built-in function add> (inner, inner_1) {}
output output output (add,) {}
None
```
However, when the inner module is first symbolically traced, the symbolic trace of the outer module ignores `is_leaf_module` entirely and traces through the whole module (note the call_function nodes).
```
print(">> Inner module as GraphModule:")
inner = InnerModule()
inner_gm = torch.fx.GraphModule(inner, MyTracer().trace(inner))
print(inner_gm.graph.print_tabular())
print(">> Outer GraphModule (with inner module as GraphModule):")
m = MyModule(inner_gm)
gm = torch.fx.GraphModule(m, MyTracer().trace(m))
print(gm.graph.print_tabular())
>> Inner module as GraphModule:
opcode name target args kwargs
------------- ------ ----------------------- ------ --------
placeholder t t () {}
call_function add <built-in function add> (t, t) {}
output output output (add,) {}
None
>> Outer GraphModule (with inner module as GraphModule):
opcode name target args kwargs
------------- ------ ----------------------- ------------ --------
placeholder t t () {}
call_function add <built-in function add> (t, t) {}
call_function add_1 <built-in function add> (t, t) {}
call_function add_2 <built-in function add> (add, add_1) {}
output output output (add_2,) {}
None
```
This is surprising behavior and at first glance violates the tracer's intent. As I understand it, `torch.fx.symbolic_trace.Tracer.trace` intends to patch `torch.nn.Module.__call__` with a `module_call_wrapper()` that records a `call_module` node if the module is a leaf, and otherwise executes `torch.fx._symbolic_trace._orig_module_call` (which is set to `torch.nn.Module.__call__` at module loading time).
**Every submodule should be a leaf, but no `call_module` nodes are created when that submodule is a `GraphModule`. Why?**
Upon further inspection, I found:
- The constructor for GraphModule includes a path to `GraphModule.recompile()` via the setter for a `fx.Graph`:
```
inner_gm = torch.fx.GraphModule(inner, MyTracer().trace(inner))
  File "/torch/fx/graph_module.py", line 252, in __init__
    self.graph = graph
  File "/torch/nn/modules/module.py", line 1183, in __setattr__
    object.__setattr__(self, name, value)
  File "/torch/fx/graph_module.py", line 277, in graph
    self.recompile()
```
- `recompile()` wraps the `__call__` method by holding a reference to the `__call__` method at the time of recompilation:
```
cls = type(self)
cls_call = cls.__call__
...
def wrapped_call(self, *args, **kwargs):
    try:
        return cls_call(self, *args, **kwargs)
    except Exception as e:
        ...
cls.__call__ = wrapped_call
```
- Recompilation of the inner GraphModule happens on initialization, before creation or tracing of the outer module. Adding some old-fashioned print debug statements gives:
```
Inner Module:
_orig_module_call: <function Module._call_impl at 0x7faaebfee8b0>
recompile: cls.__call__ now wraps _orig_module_call, <function Module._call_impl at 0x7faaebfee8b0>
Outer Module:
_orig_module_call: <function Module._call_impl at 0x7faaebfee8b0>
tracing: patching method <class 'torch.nn.modules.module.Module'>.__call__ <function Module._call_impl at 0x7faaebfee8b0> with <function Module._call_impl at 0x7fa9d42bce50>
outer module MRO before tracing:
(0) <class '__main__.MyModule'>: <function Module._call_impl at 0x7faaebfee8b0>
(1) <class 'torch.nn.modules.module.Module'>: <function Module._call_impl at 0x7faaebfee8b0>
(2) <class 'object'>: <method-wrapper '__call__' of type object at 0x7fac3cd15f00>
outer module MRO during tracing:
(0) <class '__main__.MyModule'>: <function Module._call_impl at 0x7fa9d42bce50>
(1) <class 'torch.nn.modules.module.Module'>: <function Module._call_impl at 0x7fa9d42bce50>
(2) <class 'object'>: <method-wrapper '__call__' of type object at 0x7fac3cd15f00>
inner module MRO before tracing:
(0) <class 'torch.fx.graph_module.GraphModule.__new__.<locals>.GraphModuleImpl'>: <function x.y.z.wrapped_call at 0x7fa9d42a8670>
(1) <class 'torch.fx.graph_module.GraphModule'>: <function Module._call_impl at 0x7faaebfee8b0>
(2) <class 'torch.nn.modules.module.Module'>: <function Module._call_impl at 0x7faaebfee8b0>
(3) <class 'object'>: <method-wrapper '__call__' of type object at 0x7fac3cd15f00>
inner module MRO during tracing:
(0) <class 'torch.fx.graph_module.GraphModule.__new__.<locals>.GraphModuleImpl'>: <function x.y.z.wrapped_call at 0x7fa9d42a8670>
(1) <class 'torch.fx.graph_module.GraphModule'>: <function Module._call_impl at 0x7fa9d42bce50>
(2) <class 'torch.nn.modules.module.Module'>: <function Module._call_impl at 0x7fa9d42bce50>
(3) <class 'object'>: <method-wrapper '__call__' of type object at 0x7fac3cd15f00>
```
- The outer module is patched correctly, but the inner module's first element in its MRO is the `wrapped_call` from `recompile` that still invokes `<function Module._call_impl at 0x7faaebfee8b0>` directly. Therefore, no call_module nodes are created.
## In Practice
In practice, this behavior affects the ability of `torch.package` to package `GraphModules` whose submodules are `GraphModules`. In our case, the `GraphModule` submodules are not passed through a constructor, but created separately and installed on the root `GraphModule` via `setattr`. This means that prior to packaging, there appear to be no issues with the module, since the root's graph was created before any call_module targets were replaced with `GraphModules`.
When unpackaging such a model with `torch.package`, `torch.fx.graph_module._deserialize_graph_module` uses an inline `KeepModules` tracer that sets all submodules to leaves; the unpackaged module is implicitly and surprisingly inlined in the process.
## Potential Solution
This behavior was previously not understood by us, and so the current workaround is a gnarly process of wrapping all submodules with a `nn.Module` with a manually installed forward method.
Changing `wrapped_call` to `return super(type(self), self).__call__(*args, **kwargs)` whenever `__call__` is inherited at least appears to solve the issue. Does this seem like an acceptable approach?
## Other Thoughts
- Repeated calls to `recompile` create nested `wrapped_calls`, all for the purpose of error handling. This seems probably unnecessary ¯\\_(ツ)\_/¯
- If a root module with an overridden `__call__` method is symbolically traced, the override is ignored
Test Plan:
```
buck test:
✓ ListingSuccess: caffe2/test:fx - main (12.570)
✓ Pass: caffe2/test:fx - test_tracing_graphmodules_as_leaf_submodules (test_fx.TestFX) (11.982)
```
Reviewed By: ansley
Differential Revision: D29997935
fbshipit-source-id: 1988fbb025b14188da26a3e73e94fb789c3c1f74
Summary:
Fixes https://github.com/pytorch/pytorch/issues/61733
Allow FX tracer to trace control flow (if/while) statements when parameter shapes are in the condition.
If the user specifies the new "param_shapes_constant" option when constructing a tracer, the model's parameter shape attribute will be evaluated and the resulting constant will be emitted into the IR during tracing.
Also added a new test
```
python test/fx/test_fx_param_shape_control_flow.py
```
The test also performs some whitebox-style checks on the Python code generated from the IR.
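A minimal sketch of the option in use:

```python
import torch
import torch.fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.w = torch.nn.Parameter(torch.randn(4, 4))

    def forward(self, x):
        # With param_shapes_constant=True the shape check below is evaluated
        # at trace time and the chosen branch is baked into the graph.
        if self.w.shape[0] == 4:
            return x + 1
        return x - 1

graph = torch.fx.Tracer(param_shapes_constant=True).trace(M())
```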
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61820
Reviewed By: bdhirsh
Differential Revision: D29969299
Pulled By: jerryzhenleicai
fbshipit-source-id: 99aae824bdfec880be69258de7ead5c8cd59eddc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62292
This PR adds pytree support for namedtuples. The challenge about namedtuple
is that each namedtuple class is actually different. This PR does the
following:
- it adds a namedtuple flatten/unflatten. The flatten function returns
a context that is the actual type of the namedtuple subclass. The
unflatten function uses that type to reconstruct the namedtuple
- Special cases all pytree logic to consider all namedtuples the same.
This is done by creating a `_get_node_type(pytree)` helper function that
returns `namedtuple` if `pytree` is any namedtuple subclass. The effect
of this is that all namedtuple subclasses will go through the namedtuple
flatten/unflatten functions
- Adds a `_namedtuple_flatten_spec` function for FX pytrees. This function
flattens the namedtuple based on the spec and is equivalent to the
`_tuple_flatten_spec`.
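A minimal sketch of the resulting behavior, using the private `torch.utils._pytree` helpers:

```python
from collections import namedtuple
import torch.utils._pytree as pytree

Point = namedtuple("Point", ["x", "y"])

leaves, spec = pytree.tree_flatten(Point(1, 2))
print(leaves)                               # [1, 2]
print(pytree.tree_unflatten([3, 4], spec))  # Point(x=3, y=4)
```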
Test Plan
- new tests in test/test_pytree.py and test/test_fx.py
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D29947302
Pulled By: zou3519
fbshipit-source-id: 19c00665b13546642c315df0f243ad99b8e7ff7c
Summary:
### Issue
Build PyTorch wheel packages during build stage for pull requests and install during test stage.
### Fix
Update all tests which call lib*.so (under the `./build` folder) to instead call lib*.so in `{ent}/pytorch/lib/python3.8/site-packages/torch`
### Diff
This diff starts by updating test_fx, test_backend and test_torchbind to check whether the current CI passes
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61960
Test Plan: check that all ci workflows pass
Reviewed By: malfet, saketh-are
Differential Revision: D29823235
Pulled By: tktrungna
fbshipit-source-id: e7f652def698e303d4843fbaedf4859f5eca2fd9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61780
These changes would allow objects to control, from within their own source, how they are handled when they are an argument to a torch.fx call_module node. Previously, we had been using a custom Tracer with an overridden create_arg() method, branching based on class name to handle unusual args (data classes, etc.).
Reviewed By: suo, houseroad
Differential Revision: D27976120
fbshipit-source-id: 0c5249c5f8398368ca0fbec0ad8a07ccf99b7da4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61463
Seems like a small oversight(?): the current test fails when warnings are recorded. Discovered this when calling `graph.call_module(existing_call_module_node.target)` and it raised a warning.
Test Plan: `buck test //caffe2/test:fx`
Reviewed By: ansley
Differential Revision: D29637799
fbshipit-source-id: 2305629863230235f76a926fe2e4de480cbf853c
Summary:
Reference https://github.com/pytorch/pytorch/issues/50345
`zeta` was already present in the codebase to support computation of `polygamma`.
However, `zeta` only had a `double(double, double)` signature **for CPU** before this PR (which meant that `polygamma` computations were always upcast to `double` for the zeta part).
With this PR, float computations will take place in float and double in double.
Have also refactored the code and moved the duplicate code from `Math.cuh` to `Math.h`
**Note**: For scipy, q is optional, and if it is `None` it defaults to `1`, which corresponds to the Riemann zeta function. However, for `torch.special.zeta` I made it mandatory, because it feels odd that without `q` this is the Riemann zeta and with `q` it is the general Hurwitz zeta. I think sticking to just the general form makes more sense, as passing `1` for q is trivial.
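A minimal usage sketch of the resulting API:

```python
import torch

x = torch.tensor([2.0, 4.0])
q = torch.ones(2)
# q is mandatory; q == 1 recovers the Riemann zeta function, other values
# give the general Hurwitz zeta.
print(torch.special.zeta(x, q))
```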
Verify:
* [x] Docs https://14234587-65600975-gh.circle-artifacts.com/0/docs/special.html#torch.special.zeta
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59623
Reviewed By: ngimel
Differential Revision: D29348269
Pulled By: mruberry
fbshipit-source-id: a3f9ebe1f7724dbe66de2b391afb9da1cfc3e4bb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60057
This ensures that if a function was `wrap`'d before symbolic tracing and then passed into the Transformer, it will still be wrapped.
Test Plan: Added test to `test_fx.py`
Reviewed By: jamesr66a
Differential Revision: D29151191
fbshipit-source-id: 93560be59505bdcfe8d4f013e21d4719788afd59