# How to write tests using FileCheck

## What is FileCheck

FileCheck can be seen as an advanced version of grep. We use it for writing
small annotated unit tests for optimization passes. The FileCheck used in PyTorch is
inspired by the [LLVM FileCheck
Tool](https://llvm.org/docs/CommandGuide/FileCheck.html), but is not the same.
FileCheck is available for writing both C++ and Python tests.

## How does it work

Let's look at a test written with FileCheck. The following test verifies that
the CSE pass removes one of two identical `aten::mul` nodes. Here is what the
test looks like:

```python
def test_cse():
    input_str = """graph(%a : Tensor, %b : Tensor):
      # CHECK: aten::mul
      %x : Tensor = aten::mul(%a, %b)
      # Check that the second aten::mul is removed by CSE.
      # CHECK-NOT: aten::mul
      %y : Tensor = aten::mul(%a, %b)
      # CHECK: return
      return (%x, %y)
      """
    parsed = parse_ir(input_str)
    optimized = run_cse(parsed)
    FileCheck().run(input_str, optimized)
```

Let's look in detail at how it works. First, the input string is parsed by
`parse_ir`. At that stage all annotations are ignored since they are written in
comments, so this is essentially what the parser sees:

```
graph(%a : Tensor, %b : Tensor):
  %x : Tensor = aten::mul(%a, %b)
  %y : Tensor = aten::mul(%a, %b)
  return (%x, %y)
```

We then run CSE on the parsed IR and expect it to remove the second `aten::mul`,
which is redundant. After CSE our IR looks like this:

```
graph(%a : Tensor, %b : Tensor):
  %x : Tensor = aten::mul(%a, %b)
  return (%x, %x)
```

We now run `FileCheck`, passing it both the original input string and the
optimized IR. From the input string, `FileCheck` ignores everything except the
`# CHECK` pragmas, so it essentially sees the input string like this:

```
# CHECK: aten::mul       (1)
# CHECK-NOT: aten::mul   (2)
# CHECK: return          (3)
```

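
The filtering step can be pictured with a few lines of Python. This is only a minimal sketch: `extract_pragmas` and `PRAGMA_RE` are hypothetical names introduced for illustration, and the regular expression is an assumption about the comment syntax, not the parser that PyTorch's FileCheck actually uses.

```python
import re

# Hypothetical pattern for illustration: a `#` comment, a directive that starts
# with CHECK (e.g. CHECK, CHECK-NOT, CHECK-COUNT-2), a colon, then the pattern.
PRAGMA_RE = re.compile(r"#\s*(CHECK[A-Z0-9-]*):\s*(.*\S)")


def extract_pragmas(input_str):
    """Return (directive, pattern) pairs, ignoring every non-pragma line."""
    pragmas = []
    for line in input_str.splitlines():
        match = PRAGMA_RE.search(line)
        if match:
            pragmas.append((match.group(1), match.group(2)))
    return pragmas


input_str = """graph(%a : Tensor, %b : Tensor):
  # CHECK: aten::mul
  %x : Tensor = aten::mul(%a, %b)
  # Check that the second aten::mul is removed by CSE.
  # CHECK-NOT: aten::mul
  %y : Tensor = aten::mul(%a, %b)
  # CHECK: return
  return (%x, %y)
"""

pragmas = extract_pragmas(input_str)
for directive, pattern in pragmas:
    print(f"{directive}: {pattern}")
```

The ordinary comment `# Check that the second aten::mul is removed by CSE.` is skipped because the sketch matches the upper-case `CHECK` keyword only.
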
`FileCheck` then checks that the optimized IR satisfies the specified
annotations. It first finds the string `%x : Tensor = aten::mul(%a, %b)`, which
matches annotation (1), then it finds the string `return (%x, %x)`, which
matches annotation (3). Since no line between match (1) and match (3) contains
`aten::mul`, annotation (2) is also satisfied.

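
The matching order described above can be sketched as a toy checker. It assumes only the `CHECK` and `CHECK-NOT` pragmas and plain substring matching; `satisfies` is a hypothetical helper written for illustration and is not part of the FileCheck API.

```python
def satisfies(pragmas, text):
    """Toy matcher: return True if `text` satisfies the CHECK / CHECK-NOT pragmas."""
    pos = 0            # offset just past the most recent CHECK match
    pending_not = []   # CHECK-NOT patterns waiting for the next CHECK match
    for directive, pattern in pragmas:
        if directive == "CHECK-NOT":
            pending_not.append(pattern)
            continue
        if directive != "CHECK":
            raise ValueError(f"unsupported directive in this sketch: {directive}")
        found = text.find(pattern, pos)
        if found == -1:
            return False
        # Forbidden patterns must not appear between the previous match and this one.
        if any(forbidden in text[pos:found] for forbidden in pending_not):
            return False
        pending_not = []
        pos = found + len(pattern)
    # CHECK-NOT pragmas with no following CHECK cover the rest of the input.
    return not any(forbidden in text[pos:] for forbidden in pending_not)


pragmas = [("CHECK", "aten::mul"), ("CHECK-NOT", "aten::mul"), ("CHECK", "return")]

optimized = """graph(%a : Tensor, %b : Tensor):
  %x : Tensor = aten::mul(%a, %b)
  return (%x, %x)
"""

unoptimized = """graph(%a : Tensor, %b : Tensor):
  %x : Tensor = aten::mul(%a, %b)
  %y : Tensor = aten::mul(%a, %b)
  return (%x, %y)
"""

ok_after_cse = satisfies(pragmas, optimized)
ok_before_cse = satisfies(pragmas, unoptimized)
```

With these inputs the optimized IR passes, while the unoptimized IR fails because the second `aten::mul` sits between the first match and `return`.
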
FileCheck annotations can also be registered using a builder API. To generate
the annotations from the example above, one would write:
```python
FileCheck().check("aten::mul") \
           .check_not("aten::mul") \
           .check("return") \
           .run(optimized)
```

## Supported pragmas

* `CHECK: <pattern>`
  Scans the input until `<pattern>` is found. Fails if the pattern is not found.
* `CHECK-NEXT: <pattern>`
  Scans the line immediately following the previous `CHECK` match for
  `<pattern>`. Fails if the pattern is not found on that line.
* `CHECK-NOT: <pattern>`
  Scans the input and fails if `<pattern>` is found on any line. The scan stops
  when a match for the next `CHECK` is found.
* `CHECK-SAME: <pattern>`
  Checks that `<pattern>` is found on the same line as the last match.
* `CHECK-COUNT-<num>: <pattern>`
  Scans the input and succeeds when a line containing at least `<num>` entries of
  `<pattern>` is found.
* `CHECK-COUNT-EXACTLY-<num>: <pattern>`
  Scans the input and succeeds when a line containing exactly `<num>` entries of
  `<pattern>` is found.
* `CHECK-DAG: <pattern>`
  Works similarly to the usual `CHECK` pragma, but also matches if there exists a
  way to reorder the `CHECK-DAG` pragmas to satisfy all patterns.
  For example, the following pattern:
  ```
  # CHECK: foo
  # CHECK-DAG: bar
  # CHECK-DAG: ham
  # CHECK: end
  ```
  would match the following input (note that `ham` and `bar` are swapped):
  ```
  foo
  ham
  bar
  end
  ```
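
The reordering allowed by `CHECK-DAG` can be illustrated with a small Python sketch. It is a simplification under stated assumptions: `match_dag_group` is a hypothetical helper, it uses plain substring search, and it does not try to prevent overlapping matches, so it should not be read as the actual FileCheck implementation.

```python
def match_dag_group(patterns, text, start=0):
    """Toy CHECK-DAG: every pattern may match anywhere at or after `start`.

    Returns the offset just past the furthest match, or -1 if a pattern is missing.
    """
    end = start
    for pattern in patterns:
        found = text.find(pattern, start)  # always search from the group start
        if found == -1:
            return -1
        end = max(end, found + len(pattern))
    return end


text = "foo\nham\nbar\nend\n"

# CHECK: foo
after_foo = text.find("foo") + len("foo")
# CHECK-DAG: bar, CHECK-DAG: ham (written in the opposite order to the input)
after_dag = match_dag_group(["bar", "ham"], text, after_foo)
# CHECK: end
dag_matches = after_dag != -1 and text.find("end", after_dag) != -1

# For comparison, two ordinary CHECKs are order-sensitive: after matching
# `bar`, searching forward for `ham` finds nothing.
after_bar = text.find("bar", after_foo) + len("bar")
ordered_matches = text.find("ham", after_bar) != -1
```

Here the `CHECK-DAG` group accepts the swapped input, while the same two patterns written as consecutive plain `CHECK` pragmas would not.
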