This PR adds enforcement of testing header only APIs.
The benefit of torch/header_only_apis.txt is twofold:
1) this gives us a clear view of what we expect to be header only
2) this allows us to enforce testing
The enforcement added in this PR is very basic--we literally string match that a symbol in `torch/header_only_apis.txt` is in a cpp test. This is meant to be a first step in verifying our APIs are properly tested and can get fancier over time. For now, I've added myself as a codeowner to learn what to look out for in terms of proper tests. Over time, I anticipate we can automate more steps, but right now let's just get something out the door.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153635
Approved by: https://github.com/albanD
ghstack dependencies: #153965
## Improvements to `docstring_linter`
* Add a "grandfather list" of existing undocumented classes and functions (`--grandfather`, `--grandfather-tolerance`, `--no-grandfather`, `--write-grandfather`)
* In classes, now just one of the class itself or its `__init__()` method needs to be documented (`--lint-init` turns the old behavior back on)
* Now classes and functions defined local to other functions do not need to be documented (`--lint-local` turns the old behavior back on)
* New `--report` flag produces a compact report of long, undocumented classes or function definitions: see attached example run over all pytorch: [pytorch-docs.json](https://github.com/user-attachments/files/18455981/pytorch-docs.json)
## Help text
```
$ python tools/linter/adapters/docstring_linter.py --help
usage: docstring_linter.py [-h] [-l] [-v] [--grandfather GRANDFATHER] [--grandfather-tolerance GRANDFATHER_TOLERANCE] [--lint-init]
[--lint-local] [--lint-protected] [--max-class MAX_CLASS] [--max-def MAX_DEF]
[--min-docstring MIN_DOCSTRING] [--no-grandfather] [--report] [--write-grandfather]
[files ...]
`docstring_linter` reports on long functions, methods or classes without docstrings
positional arguments:
files A list of files or directories to lint
optional arguments:
-h, --help show this help message and exit
-l, --lintrunner Run for lintrunner and print LintMessages which aren't edits
-v, --verbose Print more debug info
--grandfather GRANDFATHER, -g GRANDFATHER
Set the grandfather list
--grandfather-tolerance GRANDFATHER_TOLERANCE, -t GRANDFATHER_TOLERANCE
Tolerance for grandfather sizes, in percent
--lint-init, -i Lint __init__ and class separately
--lint-local, -o Lint definitions inside other functions
--lint-protected, -p Lint functions, methods and classes that start with _
--max-class MAX_CLASS, -c MAX_CLASS
Maximum number of lines for an undocumented class
--max-def MAX_DEF, -d MAX_DEF
Maximum number of lines for an undocumented function
--min-docstring MIN_DOCSTRING, -s MIN_DOCSTRING
Minimum number of characters for a docstring
--no-grandfather, -n Disable the grandfather list
--report, -r Print a report on all classes and defs
--write-grandfather, -w
Rewrite the grandfather list
```
---
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145834
Approved by: https://github.com/amjames, https://github.com/eellison
### `set_linter` only
* Fix gnarly [bug](dbed747aae/tools/test/set_linter_testdata/python_code.py.txt.python (L42)) which would have garbled Python files involving sets contained in sets.
* Better handling of new Python3.12 token types
### Both linters.
* Recover from and report on unparseable Python files
* Remove `ParseError.check()` (it made it harder to read the code)
* FileLinter is now generic on `PythonFile`
### Notes
As I started working on new docstring features, I found a nasty bug and an edge case bug in set linter, and realized both the linters crash when there is a badly-formed Python file in the repo.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144620
Approved by: https://github.com/amjames, https://github.com/jansel
Because Clang-tidy 19 has more powerful clang-analyzer checks to detect subtle bugs. New checks such as misc-use-internal-linkage can help identify potential static variables or functions, thus reducing binary sizes.
Some new checks are disabled temporarily for later enabling. Additional warnings have been fixed or suppressed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148648
Approved by: https://github.com/Skylion007
If the command is too long, the linter fails with
```
Failed due to OSError:
[Errno 7] Argument list too long: 'grep'
```
Fix this by batching the command so it is shorter
Limit of 750k was chosen due to `getconf ARG_MAX` returns ~1M on my mac. My guess is that most people shouldn't hit this unless they run --all-files and the directory length is long.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145950
Approved by: https://github.com/wdvr
This is an attempt to fix flaky mypy errors in CI that look like:
```
dmypy status --verbose
connection_name : /var/folders/rf/qrn1jkgj0b9_tcznwp8ck46w0000gn/T/tmpjoqsid7_/dmypy.sock
pid : 32233
error : timed out
Daemon is stuck; consider /Users/zainr/pytorch/venv/bin/dmypy kill
```
"Fix" it by not using the daemon at all, since it doesn't actually provide any perf benefits in CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145961
Approved by: https://github.com/malfet
This is one of a series of PRs to update us to PEP585 (changing Dict -> dict, List -> list, etc). Most of the PRs were completely automated with RUFF as follows:
Since RUFF UP006 is considered an "unsafe" fix first we need to enable unsafe fixes:
```
--- a/tools/linter/adapters/ruff_linter.py
+++ b/tools/linter/adapters/ruff_linter.py
@@ -313,6 +313,7 @@
"ruff",
"check",
"--fix-only",
+ "--unsafe-fixes",
"--exit-zero",
*([f"--config={config}"] if config else []),
"--stdin-filename",
```
Then we need to tell RUFF to allow UP006 (as a final PR once all of these have landed this will be made permanent):
```
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -40,7 +40,7 @@
[tool.ruff]
-target-version = "py38"
+target-version = "py39"
line-length = 88
src = ["caffe2", "torch", "torchgen", "functorch", "test"]
@@ -87,7 +87,6 @@
"SIM116", # Disable Use a dictionary instead of consecutive `if` statements
"SIM117",
"SIM118",
- "UP006", # keep-runtime-typing
"UP007", # keep-runtime-typing
]
select = [
```
Finally running `lintrunner -a --take RUFF` will fix up the deprecated uses.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145101
Approved by: https://github.com/bobrenjc93
Lintrunner can only apply changes (-a) if only one suggestion is made per file. The grep_linter makes a suggestion for every line it finds incorrect, so it creates multiple suggestions per file if there are multiple lines that it wants to change
This sets the `line` parameter of the LintMessage to None for all of grep_linter, but I'm not sure if that entry did anything
I'm not sure if enabling -a is the best idea, since its currently used for tabs and tab width might differ each time? I had one instance where running with -a cause the spacing to change. On the other hand, -a would have already worked if only one line was bad
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144589
Approved by: https://github.com/huydhn
Changes by apply order:
1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`.
2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`.
3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first.
`.parent{...}.absolute()` -> `.absolute().parent{...}`
4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.)
`.parent.parent.parent.parent` -> `.parents[3]`
5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~
~`.parents[3]` -> `.parents[4 - 1]`~
6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374
Approved by: https://github.com/justinchuby, https://github.com/malfet
Changes by apply order:
1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`.
2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`.
3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first.
`.parent{...}.absolute()` -> `.absolute().parent{...}`
4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.)
`.parent.parent.parent.parent` -> `.parents[3]`
5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~
~`.parents[3]` -> `.parents[4 - 1]`~
6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374
Approved by: https://github.com/justinchuby, https://github.com/malfet
As title, this patch prevents developers from importing third party
libraries to patch things in Dynamo, unless there's no other easy
workaround (in which case one would add the library to the allowlist in
`import_linter.py`, as instructed by the lint error).
For instance, if we remove `einops` from the allowlist, we'd get this
```verbatim
>>> Lint for torch/_dynamo/decorators.py:
Error (IMPORT) Disallowed import
importing from einops is not allowed, if you believe there's a valid
reason, please add it to import_linter.py
608 |# Note: this carefully avoids eagerly import einops.
609 |# TODO: we should delete this whole _allow_in_graph_einops logic by approximately 2024 Q2
610 |def _allow_in_graph_einops():
>>> 611 | import einops
612 |
613 | try:
614 | # requires einops > 0.6.1, torch >= 2.0
Error (IMPORT) Disallowed import
importing from einops is not allowed, if you believe there's a valid
reason, please add it to import_linter.py
612 |
613 | try:
614 | # requires einops > 0.6.1, torch >= 2.0
>>> 615 | from einops._torch_specific import ( # type: ignore[attr-defined] # noqa: F401
616 | _ops_were_registered_in_torchdynamo,
617 | )
618 |
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143312
Approved by: https://github.com/zou3519
Now syntax errors look like this:
```
torch/_dynamo/variables/base.py:20:
Error (RUFF) E999
SyntaxError: Expected ',', found indent.
See https://beta.ruff.rs/docs/rules/
>>> 19 |class SourceType(Enum]:
20 | """
21 | This Enum divides VariableTracker into 2 cases, depending on the variable
22 | it represents:
[...more errors...]
```
Note that most syntax errors lead to a cascade of other errors, so the exception is generally wrong, but the location and name are good.
Before they looked like this:
```
>>> General linter failure:
Error (RUFF) Linter failed
Linter failed. This a bug, please file an issue against the linter
maintainer.
CONTEXT:
Linter command failed with non-zero exit code.
STDERR:
<MainThread:DEBUG> $ /home/rec/.conda/envs/pytorch-dev-constant/
bin/python3 -m ruff check --exit-zero --quiet --output-format=json
--config=pyproject.toml /home/rec/git-constant/pytorch/torch/_dynamo/
variables/base.py
<MainThread:DEBUG> took 38ms
Traceback (most recent call last):
File "/home/rec/git-constant/pytorch/tools/linter/adapters/
ruff_linter.py", line 465, in <module>
main()
File "/home/rec/git-constant/pytorch/tools/linter/adapters/
ruff_linter.py", line 424, in main
lint_messages = check_files(
File "/home/rec/git-constant/pytorch/tools/linter/adapters/
ruff_linter.py", line 273, in check_files
return [
File "/home/rec/git-constant/pytorch/tools/linter/adapters/
ruff_linter.py", line 288, in <listcomp>
severity=severities.get(vuln["code"],
get_issue_severity(vuln["code"])),
File "/home/rec/git-constant/pytorch/tools/linter/adapters/
ruff_linter.py", line 172, in get_issue_severity
if any(
File "/home/rec/git-constant/pytorch/tools/linter/adapters/
ruff_linter.py", line 173, in <genexpr>
code.startswith(x)
AttributeError: 'NoneType' object has no attribute 'startswith'
STDOUT:
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142312
Approved by: https://github.com/Skylion007
MInor, adds a linter that ensures that all jobs run on pull_request, schedule, push etc have a `if: github.repository_owner == 'pytorch'` or are dependent on a job that has that check
There is also a setting in Github repos that can disable all workflows for that repo
A lot of these are unnecessary because many jobs use reusable workflows that have that check. However, this is a one time change so I'm not that bothered
Unfortunately I can't put this at the workflow level, which would make this better
Lots of weird string parsing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138849
Approved by: https://github.com/malfet