Adds a ruff lint rule to ban raising raw exceptions. Most of these should at the very least be runtime errors, value errors, type errors, or some other more specific error. There are hundreds of instances of these bad exception types already in the codebase, so I have noqa'd most of them. Hopefully this error code will get committers to rethink what exception type they should raise when they submit a PR.
I also encourage people to gradually fix the existing noqas that have been added, so they can be removed over time and our exception typing can be improved.
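As a rough illustration of what the rule asks for, here is a minimal sketch. The helper names are hypothetical, and the `TRY002` code in the suppression comment is an assumption (ruff's raise-vanilla-class rule); the message above does not name the rule code.

```python
def parse_port(value):
    """Hypothetical helper: parse a TCP port from user input."""
    if not isinstance(value, (str, int)):
        # Wrong kind of argument: raise TypeError, not a bare Exception.
        raise TypeError(f"port must be str or int, got {type(value).__name__}")
    port = int(value)  # int() itself raises ValueError for non-numeric strings
    if not 0 < port < 65536:
        # Right type, unacceptable value: raise ValueError.
        raise ValueError(f"port out of range: {port}")
    return port


def legacy_parse_port(value):
    """Hypothetical pre-existing code that still raises a raw exception."""
    if value is None:
        # Assumed rule code; the suppression should go away once this is fixed.
        raise Exception("no port given")  # noqa: TRY002
    return parse_port(value)
```

Callers can then catch `TypeError` or `ValueError` specifically instead of a blanket `except Exception`.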
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124570
Approved by: https://github.com/ezyang
Update ruff to 0.4.1.
This version fixes a lot of false negatives/false positives, is 20-40% faster, and has various other bug fixes.
Below is a before-and-after table showing the execution time of ruff lint and ruff format in milliseconds, courtesy of https://astral.sh/blog/ruff-v0.4.0:
| Repository | Linter (v0.3) | Linter (v0.4) | Formatter (v0.3) | Formatter (v0.4) |
|----------------------------------------------------|---------------|---------------|------------------|------------------|
| [pytorch/pytorch](https://github.com/pytorch/pytorch) | 328.7 | 251.8 | 351.1 | 274.9 |
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124549
Approved by: https://github.com/ezyang
A user-defined `Mapping` type may contain some metadata (e.g., pytorch/tensordict#679, https://github.com/pytorch/pytorch/pull/120195#issue-2141716712). Simply using `type(mapping)({k: v for k, v in mapping.items()})` does not take this metadata into account. This PR uses `copy.copy(mapping)` to create a clone of the original collection and iteratively updates the elements in the cloned collection. This preserves the metadata in the original collection via `copy.copy(...)` rather than relying on the `__init__` method of the user-defined class.
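The difference between the two strategies can be sketched as follows. `TaggedDict` and both helper functions are hypothetical names made up for this illustration; they are not the code touched by this PR.

```python
import copy


class TaggedDict(dict):
    """Hypothetical mapping that carries metadata outside its items."""

    def __init__(self, data=(), tag="untagged"):
        super().__init__(data)
        self.tag = tag


def rebuild_via_init(mapping, fn):
    # Goes through __init__, so any metadata falls back to its default.
    return type(mapping)({k: fn(v) for k, v in mapping.items()})


def rebuild_via_copy(mapping, fn):
    # Shallow-copies the container (including instance attributes),
    # then overwrites each element in the clone.
    clone = copy.copy(mapping)
    for k, v in mapping.items():
        clone[k] = fn(v)
    return clone


original = TaggedDict({"a": 1, "b": 2}, tag="train")
lost = rebuild_via_init(original, lambda v: v * 10)
kept = rebuild_via_copy(original, lambda v: v * 10)
```

Here `lost.tag` falls back to the constructor default, while `kept.tag` keeps the value from `original`, and `original` itself is left unmodified.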
Reference:
- pytorch/tensordict#679
- #120195

Closes #120195
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120553
Approved by: https://github.com/vmoens
Fixes https://github.com/pytorch/pytorch/issues/118129
Suppressions automatically added with
```
import re

# Read the saved mypy output, one error per line.
with open("error_file.txt", "r") as f:
    errors = f.readlines()

# Map file path -> {line number -> mypy error code}.
error_lines = {}
for error in errors:
    match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", error)
    if match:
        file_path, line_number, error_type = match.groups()
        if file_path not in error_lines:
            error_lines[file_path] = {}
        error_lines[file_path][int(line_number)] = error_type

# Append a matching `# type: ignore[...]` comment to each offending line.
for file_path, lines in error_lines.items():
    with open(file_path, "r") as f:
        code = f.readlines()
    for line_number, error_type in sorted(lines.items(), key=lambda x: x[0], reverse=True):
        code[line_number - 1] = code[line_number - 1].rstrip() + f"  # type: ignore[{error_type}]\n"
    with open(file_path, "w") as f:
        f.writelines(code)
```
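To make the parsing step concrete, here is a small sketch that runs the same regular expression over a few made-up mypy output lines. The file paths and messages are hypothetical; note that the pattern expects a column number after the line number.

```python
import re

# Same pattern as in the script above.
PATTERN = re.compile(r"(.*):(\d+):\d+: error:.*\[(.*)\]")

samples = [
    "torch/foo.py:12:5: error: Incompatible types in assignment  [assignment]",
    'torch/bar.py:7:1: error: Argument 1 has incompatible type "List[int]"  [arg-type]',
    "Found 2 errors in 2 files (checked 2 source files)",  # summary line, no match
]

parsed = []
for line in samples:
    m = PATTERN.match(line)
    if m:
        path, lineno, code = m.groups()
        parsed.append((path, int(lineno), code))
```

Because the `.*` before `\[` is greedy, the captured code is the last bracketed group on the line, so brackets inside the message text (such as `List[int]`) do not confuse it.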
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Co-authored-by: Catherine Lee <csl@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118533
Approved by: https://github.com/Skylion007, https://github.com/zou3519
Fixes https://github.com/pytorch/pytorch/issues/118129
Suppressions automatically added with
```
import re

# Read the saved mypy output, one error per line.
with open("error_file.txt", "r") as f:
    errors = f.readlines()

# Map file path -> {line number -> mypy error code}.
error_lines = {}
for error in errors:
    match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", error)
    if match:
        file_path, line_number, error_type = match.groups()
        if file_path not in error_lines:
            error_lines[file_path] = {}
        error_lines[file_path][int(line_number)] = error_type

# Append a matching `# type: ignore[...]` comment to each offending line.
for file_path, lines in error_lines.items():
    with open(file_path, "r") as f:
        code = f.readlines()
    for line_number, error_type in sorted(lines.items(), key=lambda x: x[0], reverse=True):
        code[line_number - 1] = code[line_number - 1].rstrip() + f"  # type: ignore[{error_type}]\n"
    with open(file_path, "w") as f:
        f.writelines(code)
```
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118533
Approved by: https://github.com/Skylion007, https://github.com/zou3519
This is a lot of files changed! Don't panic! Here's how it works:
* Previously, we set `follow_imports = silent` for our mypy.ini configuration. Per https://mypy.readthedocs.io/en/stable/running_mypy.html#follow-imports, this means that whenever we import a module which is not listed as a file to be typechecked in mypy, we typecheck it as normal but suppress all errors that occurred in that file.
* When mypy is run inside lintrunner, the list of files is precisely the files covered by the glob in .lintrunner.toml, minus the files matched by the exclude list.
* The top-level directive `# mypy: ignore-errors` instructs mypy to typecheck the file as normal, but ignore all errors.
* Therefore, it should be equivalent to set `follow_imports = normal`, if we put `# mypy: ignore-errors` on all files that were previously excluded from the file list.
* Having done this, we can remove the exclude list from .lintrunner.toml, since excluding a file from typechecking is baked into the files themselves.
* torch/_dynamo and torch/_inductor were previously in the exclude list, because they were covered by MYPYINDUCTOR. It is not OK to mark these as `# mypy: ignore-errors` as this will impede typechecking on the alternate configuration. So they are temporarily being checked twice, but I am suppressing the errors in these files as the configurations are not quite the same. I plan to unify the configurations so this is only a temporary state.
* There were some straggler type errors after these changes somehow, so I fixed them as needed. There weren't that many.
In the future, to start type checking a file, just remove the ignore-errors directive from the top of the file.
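Opting a file back in could be sketched like this. The function name is hypothetical, and it assumes the directive sits on the very first line followed by one blank line, which is the layout the codemod in this message writes.

```python
DIRECTIVE = "# mypy: ignore-errors"


def strip_ignore_errors(source: str) -> str:
    """Return `source` without a leading `# mypy: ignore-errors` line."""
    lines = source.splitlines(keepends=True)
    if lines and lines[0].strip() == DIRECTIVE:
        lines = lines[1:]
        # Also drop the single blank line that followed the directive.
        if lines and lines[0].strip() == "":
            lines = lines[1:]
    return "".join(lines)
```

Files that never had the directive pass through unchanged, so the helper is safe to run over a whole directory.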
The codemod was done with this script authored by GPT-4:
```
import glob

exclude_patterns = [
    ...
]

for pattern in exclude_patterns:
    for filepath in glob.glob(pattern, recursive=True):
        if filepath.endswith('.py'):
            with open(filepath, 'r+') as f:
                content = f.read()
                f.seek(0, 0)
                f.write('# mypy: ignore-errors\n\n' + content)
```
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118414
Approved by: https://github.com/thiagocrepaldi, https://github.com/albanD
Removes unnecessary duplicated utility code and relies on itertools instead. Since the file is low traffic, I also added the modified files to the UFMT'd files and formatted them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116192
Approved by: https://github.com/malfet
Re-enable type checking for distributed_c10d.py
Type checking for distributed_c10d.py was inadvertently turned off, and issues have accumulated since.
Note: the backwards compatibility linter does not like some of these changes. But they were incorrect before. This needs human verification, however.
#suppress-api-compatibility-check
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115223
Approved by: https://github.com/wconstab
Applies PLW0108, which removes useless lambda calls in Python. The rule is in preview, so it is not ready to be enabled by default just yet. These are the autofixes from the rule.
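For readers unfamiliar with the rule, here is a minimal sketch of the kind of rewrite it performs. The data is made up and is not taken from the changed files.

```python
words = ["pear", "fig", "banana"]

# Flagged: the lambda only forwards its argument to `len`.
by_len_lambda = sorted(words, key=lambda w: len(w))

# After the autofix: pass the callable directly.
by_len_direct = sorted(words, key=len)

# Not flagged: this lambda does more than forward its argument.
by_len_plus_one = [(lambda w: len(w) + 1)(w) for w in words]
```

Both sorted calls produce the same ordering; the second simply avoids an extra function call per element.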
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113602
Approved by: https://github.com/albanD
Fixes #112636
Before: 265
```
torch/utils/data/datapipes/dataframe/structures.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/dataframe/structures.py:8 in public class `DataChunkDF`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/dataframe/structures.py:8 in public class `DataChunkDF`:
D208: Docstring is over-indented
torch/utils/data/datapipes/dataframe/structures.py:8 in public class `DataChunkDF`:
D400: First line should end with a period (not ',')
torch/utils/data/datapipes/dataframe/structures.py:13 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/dataframe/structures.py:17 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/datapipe.py:43 in public class `IterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/datapipe.py:119 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:122 in public method `__getattr__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:135 in public method `register_function`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:139 in public method `register_datapipe_as_function`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:161 in public method `__getstate__`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/datapipe.py:161 in public method `__getstate__`:
D401: First line should be in imperative mood; try rephrasing (found 'This')
torch/utils/data/datapipes/datapipe.py:171 in public method `__reduce_ex__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:180 in public method `set_getstate_hook`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:186 in public method `set_reduce_ex_hook`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:191 in public method `__repr__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:197 in public method `__str__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:203 in public method `__dir__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:208 in public method `reset`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/datapipe.py:208 in public method `reset`:
D400: First line should end with a period (not ',')
torch/utils/data/datapipes/datapipe.py:217 in public class `DFIterDataPipe`:
D101: Missing docstring in public class
torch/utils/data/datapipes/datapipe.py:223 in public class `MapDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/datapipe.py:261 in public method `__getattr__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:274 in public method `register_function`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:278 in public method `register_datapipe_as_function`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:293 in public method `__getstate__`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/datapipe.py:293 in public method `__getstate__`:
D401: First line should be in imperative mood; try rephrasing (found 'This')
torch/utils/data/datapipes/datapipe.py:303 in public method `__reduce_ex__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:312 in public method `set_getstate_hook`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:318 in public method `set_reduce_ex_hook`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:323 in public method `__repr__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:329 in public method `__str__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:335 in public method `__dir__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:392 in public class `DataChunk`:
D101: Missing docstring in public class
torch/utils/data/datapipes/datapipe.py:393 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/datapipe.py:397 in public method `as_str`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:401 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:404 in public method `raw_iterator`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/callable.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/callable.py:23 in public class `MapperIterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/callable.py:23 in public class `MapperIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/callable.py:63 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/callable.py:121 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/callable.py:125 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/callable.py:173 in public class `CollatorIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/callable.py:213 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/combinatorics.py:18 in public class `SamplerIterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combinatorics.py:29 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:44 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:47 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:56 in public class `ShufflerIterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combinatorics.py:56 in public class `ShufflerIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combinatorics.py:56 in public class `ShufflerIterDataPipe`:
D400: First line should end with a period (not 'r')
torch/utils/data/datapipes/iter/combinatorics.py:94 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:114 in public method `set_shuffle`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:118 in public method `set_seed`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:122 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:137 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:142 in public method `reset`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:150 in public method `__getstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:165 in public method `__setstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:179 in public method `__del__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/combining.py:26 in public class `ConcaterIterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:26 in public class `ConcaterIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:26 in public class `ConcaterIterDataPipe`:
D400: First line should end with a period (not 'l')
torch/utils/data/datapipes/iter/combining.py:44 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:51 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:55 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:64 in public class `ForkerIterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:92 in public method `__new__`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:108 in private class `_ContainerTemplate`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:108 in private class `_ContainerTemplate`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:108 in private class `_ContainerTemplate`:
D400: First line should end with a period (not 'd')
torch/utils/data/datapipes/iter/combining.py:126 in private method `get_length_by_instance`:
D200: One-line docstring should fit on one line with quotes (found 3)
torch/utils/data/datapipes/iter/combining.py:126 in private method `get_length_by_instance`:
D400: First line should end with a period (not '`')
torch/utils/data/datapipes/iter/combining.py:136 in private class `_ForkerIterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:136 in private class `_ForkerIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:136 in private class `_ForkerIterDataPipe`:
D400: First line should end with a period (not 's')
torch/utils/data/datapipes/iter/combining.py:275 in private class `_ChildDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:275 in private class `_ChildDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:275 in private class `_ChildDataPipe`:
D400: First line should end with a period (not 's')
torch/utils/data/datapipes/iter/combining.py:320 in private method `_set_main_datapipe_valid_iterator_id`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:343 in private method `_check_valid_iterator_id`:
D200: One-line docstring should fit on one line with quotes (found 3)
torch/utils/data/datapipes/iter/combining.py:351 in public class `DemultiplexerIterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:351 in public class `DemultiplexerIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:351 in public class `DemultiplexerIterDataPipe`:
D400: First line should end with a period (not 'n')
torch/utils/data/datapipes/iter/combining.py:384 in public method `__new__`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:399 in private class `_DemultiplexerIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:399 in private class `_DemultiplexerIterDataPipe`:
D400: First line should end with a period (not 's')
torch/utils/data/datapipes/iter/combining.py:534 in public class `MultiplexerIterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:534 in public class `MultiplexerIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:534 in public class `MultiplexerIterDataPipe`:
D400: First line should end with a period (not ',')
torch/utils/data/datapipes/iter/combining.py:549 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:553 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:566 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:572 in public method `reset`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:575 in public method `__getstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:585 in public method `__setstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:593 in public method `__del__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:599 in public class `ZipperIterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/combining.py:599 in public class `ZipperIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/combining.py:615 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:622 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:626 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/filelister.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/filelister.py:15 in public class `FileListerIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/filelister.py:36 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/filelister.py:58 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/filelister.py:62 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/fileopener.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/fileopener.py:15 in public class `FileOpenerIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/fileopener.py:15 in public class `FileOpenerIterDataPipe`:
D400: First line should end with a period (not 'm')
torch/utils/data/datapipes/iter/fileopener.py:42 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/fileopener.py:66 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/fileopener.py:69 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/grouping.py:31 in public class `BatcherIterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/grouping.py:31 in public class `BatcherIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/grouping.py:31 in public class `BatcherIterDataPipe`:
D400: First line should end with a period (not 's')
torch/utils/data/datapipes/iter/grouping.py:55 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:68 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:79 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:91 in public class `UnBatcherIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/grouping.py:91 in public class `UnBatcherIterDataPipe`:
D400: First line should end with a period (not 'l')
torch/utils/data/datapipes/iter/grouping.py:112 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:118 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:143 in public class `GrouperIterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/grouping.py:143 in public class `GrouperIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/grouping.py:143 in public class `GrouperIterDataPipe`:
D400: First line should end with a period (not ',')
torch/utils/data/datapipes/iter/grouping.py:185 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:233 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:257 in public method `reset`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/grouping.py:261 in public method `__getstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:278 in public method `__setstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:294 in public method `__del__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/routeddecoder.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/routeddecoder.py:19 in public class `RoutedDecoderIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/routeddecoder.py:19 in public class `RoutedDecoderIterDataPipe`:
D400: First line should end with a period (not 'a')
torch/utils/data/datapipes/iter/routeddecoder.py:37 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/routeddecoder.py:53 in public method `add_handler`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/routeddecoder.py:56 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/routeddecoder.py:62 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/selecting.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/selecting.py:21 in public class `FilterIterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/selecting.py:46 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/selecting.py:70 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/sharding.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/sharding.py:17 in public class `SHARDING_PRIORITIES`:
D101: Missing docstring in public class
torch/utils/data/datapipes/iter/sharding.py:30 in public class `ShardingFilterIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/sharding.py:30 in public class `ShardingFilterIterDataPipe`:
D400: First line should end with a period (not 's')
torch/utils/data/datapipes/iter/sharding.py:39 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/sharding.py:47 in public method `apply_sharding`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/sharding.py:74 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/sharding.py:79 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/streamreader.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/streamreader.py:10 in public class `StreamReaderIterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/streamreader.py:10 in public class `StreamReaderIterDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/iter/streamreader.py:10 in public class `StreamReaderIterDataPipe`:
D400: First line should end with a period (not 'l')
torch/utils/data/datapipes/iter/streamreader.py:27 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/streamreader.py:31 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/utils.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/utils.py:9 in public class `IterableWrapperIterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/iter/utils.py:29 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/utils.py:33 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/utils.py:49 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/callable.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/map/callable.py:14 in public function `default_fn`:
D103: Missing docstring in public function
torch/utils/data/datapipes/map/callable.py:20 in public class `MapperMapDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/callable.py:20 in public class `MapperMapDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/map/callable.py:45 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/map/callable.py:55 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/callable.py:58 in public method `__getitem__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/map/combinatorics.py:15 in public class `ShufflerIterDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/combinatorics.py:55 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combinatorics.py:68 in public method `set_shuffle`:
D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:72 in public method `set_seed`:
D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:76 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:85 in public method `reset`:
D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:92 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:95 in public method `__getstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:110 in public method `__setstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/map/combining.py:12 in public class `ConcaterMapDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/combining.py:12 in public class `ConcaterMapDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/map/combining.py:34 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combining.py:43 in public method `__getitem__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:52 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:58 in public class `ZipperMapDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/combining.py:58 in public class `ZipperMapDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/map/combining.py:76 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combining.py:85 in public method `__getitem__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:94 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/grouping.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/map/grouping.py:12 in public class `BatcherMapDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/grouping.py:12 in public class `BatcherMapDataPipe`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/map/grouping.py:12 in public class `BatcherMapDataPipe`:
D400: First line should end with a period (not 's')
torch/utils/data/datapipes/map/grouping.py:34 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/map/grouping.py:47 in public method `__getitem__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/grouping.py:60 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/utils.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/map/utils.py:9 in public class `SequenceWrapperMapDataPipe`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/map/utils.py:32 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/map/utils.py:45 in public method `__getitem__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/utils.py:48 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/utils/common.py:26 in public function `validate_input_col`:
D400: First line should end with a period (not 'n')
torch/utils/data/datapipes/utils/common.py:26 in public function `validate_input_col`:
D401: First line should be in imperative mood (perhaps 'Check', not 'Checks')
torch/utils/data/datapipes/utils/common.py:127 in private function `_check_unpickable_fn`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/common.py:127 in private function `_check_unpickable_fn`:
D400: First line should end with a period (not 'g')
torch/utils/data/datapipes/utils/common.py:127 in private function `_check_unpickable_fn`:
D401: First line should be in imperative mood (perhaps 'Check', not 'Checks')
torch/utils/data/datapipes/utils/common.py:156 in public function `match_masks`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:170 in public function `get_file_pathnames_from_root`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:207 in public function `get_file_binaries_from_pathnames`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:220 in public function `validate_pathname_binary_tuple`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:290 in public class `StreamWrapper`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/utils/common.py:290 in public class `StreamWrapper`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/common.py:290 in public class `StreamWrapper`:
D400: First line should end with a period (not 'y')
torch/utils/data/datapipes/utils/common.py:298 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/common.py:315 in public method `close_streams`:
D200: One-line docstring should fit on one line with quotes (found 3)
torch/utils/data/datapipes/utils/common.py:331 in public method `__getattr__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:335 in public method `close`:
D102: Missing docstring in public method
torch/utils/data/datapipes/utils/common.py:351 in public method `autoclose`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/common.py:351 in public method `autoclose`:
D400: First line should end with a period (not 's')
torch/utils/data/datapipes/utils/common.py:359 in public method `__dir__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:364 in public method `__del__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:368 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:371 in public method `__next__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:374 in public method `__repr__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:380 in public method `__getstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:383 in public method `__setstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/decoder.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/utils/decoder.py:31 in public function `basichandlers`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:87 in public function `handle_extension`:
D202: No blank lines allowed after function docstring (found 1)
torch/utils/data/datapipes/utils/decoder.py:87 in public function `handle_extension`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/decoder.py:87 in public function `handle_extension`:
D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
torch/utils/data/datapipes/utils/decoder.py:115 in public class `ImageHandler`:
D204: 1 blank line required after class docstring (found 0)
torch/utils/data/datapipes/utils/decoder.py:115 in public class `ImageHandler`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/decoder.py:139 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:143 in public method `__call__`:
D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:187 in public function `imagehandler`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:194 in public function `videohandler`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:215 in public function `audiohandler`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:236 in public class `MatHandler`:
D101: Missing docstring in public class
torch/utils/data/datapipes/utils/decoder.py:237 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:247 in public method `__call__`:
D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:253 in public function `mathandler`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:261 in public function `extension_extract_fn`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:270 in public class `Decoder`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/decoder.py:276 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:282 in public method `add_handler`:
D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:292 in public method `decode1`:
D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:309 in public method `decode`:
D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:326 in public method `__call__`:
D102: Missing docstring in public method
torch/utils/data/datapipes/utils/snapshot.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/utils/snapshot.py:11 in private function `_simple_graph_snapshot_restoration`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/data/datapipes/utils/snapshot.py:11 in private function `_simple_graph_snapshot_restoration`:
D400: First line should end with a period (not ',')
torch/utils/data/datapipes/utils/snapshot.py:11 in private function `_simple_graph_snapshot_restoration`:
D401: First line should be in imperative mood; try rephrasing (found 'This')
torch/utils/tensorboard/_convert_np.py:1 at module level:
D200: One-line docstring should fit on one line with quotes (found 3)
torch/utils/tensorboard/_convert_np.py:9 in public function `make_np`:
D205: 1 blank line required between summary line and description (found 0)
torch/utils/tensorboard/_convert_np.py:9 in public function `make_np`:
D400: First line should end with a period (not ':')
265
```
After: 166
```
torch/utils/data/datapipes/dataframe/structures.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/dataframe/structures.py:10 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/dataframe/structures.py:14 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/datapipe.py:120 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:123 in public method `__getattr__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:136 in public method `register_function`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:140 in public method `register_datapipe_as_function`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:173 in public method `__reduce_ex__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:182 in public method `set_getstate_hook`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:188 in public method `set_reduce_ex_hook`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:193 in public method `__repr__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:199 in public method `__str__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:205 in public method `__dir__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:221 in public class `DFIterDataPipe`:
D101: Missing docstring in public class
torch/utils/data/datapipes/datapipe.py:266 in public method `__getattr__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:279 in public method `register_function`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:283 in public method `register_datapipe_as_function`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:309 in public method `__reduce_ex__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:318 in public method `set_getstate_hook`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:324 in public method `set_reduce_ex_hook`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:329 in public method `__repr__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:335 in public method `__str__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:341 in public method `__dir__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:398 in public class `DataChunk`:
D101: Missing docstring in public class
torch/utils/data/datapipes/datapipe.py:399 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/datapipe.py:403 in public method `as_str`:
D102: Missing docstring in public method
torch/utils/data/datapipes/datapipe.py:407 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/datapipe.py:410 in public method `raw_iterator`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/callable.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/callable.py:65 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/callable.py:123 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/callable.py:127 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/callable.py:216 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/combinatorics.py:30 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:45 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:48 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:97 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combinatorics.py:117 in public method `set_shuffle`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:121 in public method `set_seed`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:125 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:140 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:145 in public method `reset`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combinatorics.py:153 in public method `__getstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:168 in public method `__setstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combinatorics.py:182 in public method `__del__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/combining.py:46 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:53 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:57 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:95 in public method `__new__`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:388 in public method `__new__`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:556 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:560 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:573 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:579 in public method `reset`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/combining.py:582 in public method `__getstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:592 in public method `__setstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:600 in public method `__del__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:624 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/combining.py:631 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/combining.py:635 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/filelister.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/filelister.py:37 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/filelister.py:59 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/filelister.py:63 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/fileopener.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/fileopener.py:41 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/fileopener.py:65 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/fileopener.py:68 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/grouping.py:57 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:70 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:81 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:115 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:121 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:190 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/grouping.py:238 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:262 in public method `reset`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/grouping.py:266 in public method `__getstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:283 in public method `__setstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/grouping.py:299 in public method `__del__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/routeddecoder.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/routeddecoder.py:38 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/routeddecoder.py:54 in public method `add_handler`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/routeddecoder.py:57 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/routeddecoder.py:63 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/selecting.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/selecting.py:47 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/selecting.py:71 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/sharding.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/sharding.py:17 in public class `SHARDING_PRIORITIES`:
D101: Missing docstring in public class
torch/utils/data/datapipes/iter/sharding.py:40 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/sharding.py:48 in public method `apply_sharding`:
D102: Missing docstring in public method
torch/utils/data/datapipes/iter/sharding.py:75 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/sharding.py:80 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/streamreader.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/streamreader.py:29 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/streamreader.py:33 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/utils.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/iter/utils.py:30 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/iter/utils.py:34 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/iter/utils.py:50 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/callable.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/map/callable.py:14 in public function `default_fn`:
D103: Missing docstring in public function
torch/utils/data/datapipes/map/callable.py:47 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/map/callable.py:57 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/callable.py:60 in public method `__getitem__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/map/combinatorics.py:56 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combinatorics.py:69 in public method `set_shuffle`:
D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:73 in public method `set_seed`:
D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:77 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:86 in public method `reset`:
D102: Missing docstring in public method
torch/utils/data/datapipes/map/combinatorics.py:93 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:96 in public method `__getstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combinatorics.py:111 in public method `__setstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/map/combining.py:36 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combining.py:45 in public method `__getitem__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:54 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:80 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/map/combining.py:89 in public method `__getitem__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/combining.py:98 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/grouping.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/map/grouping.py:36 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/map/grouping.py:49 in public method `__getitem__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/grouping.py:62 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/utils.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/map/utils.py:33 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/map/utils.py:46 in public method `__getitem__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/map/utils.py:49 in public method `__len__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/utils/common.py:157 in public function `match_masks`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:171 in public function `get_file_pathnames_from_root`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:208 in public function `get_file_binaries_from_pathnames`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:221 in public function `validate_pathname_binary_tuple`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/common.py:300 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/common.py:331 in public method `__getattr__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:335 in public method `close`:
D102: Missing docstring in public method
torch/utils/data/datapipes/utils/common.py:356 in public method `__dir__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:361 in public method `__del__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:365 in public method `__iter__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:368 in public method `__next__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:371 in public method `__repr__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:377 in public method `__getstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/common.py:380 in public method `__setstate__`:
D105: Missing docstring in magic method
torch/utils/data/datapipes/utils/decoder.py:1 at module level:
D100: Missing docstring in public module
torch/utils/data/datapipes/utils/decoder.py:31 in public function `basichandlers`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:141 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:145 in public method `__call__`:
D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:189 in public function `imagehandler`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:196 in public function `videohandler`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:217 in public function `audiohandler`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:238 in public class `MatHandler`:
D101: Missing docstring in public class
torch/utils/data/datapipes/utils/decoder.py:239 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:249 in public method `__call__`:
D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:255 in public function `mathandler`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:263 in public function `extension_extract_fn`:
D103: Missing docstring in public function
torch/utils/data/datapipes/utils/decoder.py:279 in public method `__init__`:
D107: Missing docstring in __init__
torch/utils/data/datapipes/utils/decoder.py:285 in public method `add_handler`:
D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:295 in public method `decode1`:
D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:312 in public method `decode`:
D102: Missing docstring in public method
torch/utils/data/datapipes/utils/decoder.py:329 in public method `__call__`:
D102: Missing docstring in public method
torch/utils/data/datapipes/utils/snapshot.py:1 at module level:
D100: Missing docstring in public module
166
```
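For readers unfamiliar with the codes in the listings above, the following is a minimal sketch of a docstring that would satisfy D205, D400, and D401. The function `validate_columns` is hypothetical and is not taken from the codebase.

```python
import inspect


def validate_columns(columns, width):
    """Check that every column index is within the row width.

    The summary line above is imperative ("Check", not "Checks") for D401,
    ends with a period for D400, and is separated from this description
    by one blank line for D205.
    """
    return all(0 <= c < width for c in columns)


# The first docstring line is the summary that D400 and D401 inspect.
doc_lines = inspect.getdoc(validate_columns).splitlines()
summary = doc_lines[0]
print(summary)
```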
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112765
Approved by: https://github.com/ejguan
This updates ruff to 0.285, which is faster and fixes a number of false negatives with regard to f-strings.
I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, there seem to be no instances of it in our codebase, so I am enabling it to keep it that way. :)
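As an illustration of what RUF017 looks for, here is a small sketch contrasting the flagged pattern with one linear-time alternative. The variable names are made up for the example, and `itertools.chain.from_iterable` is only one of several possible replacements.

```python
from itertools import chain

batches = [[1, 2], [3], [4, 5, 6]]

# Pattern flagged by RUF017: each `+` copies the accumulated list,
# so total work grows quadratically with the number of sublists.
flat_quadratic = sum(batches, [])

# Linear-time alternative: chain the sublists lazily and build one list.
flat_linear = list(chain.from_iterable(batches))

print(flat_quadratic == flat_linear)
```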
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
Alternative to https://github.com/pytorch/pytorch/pull/107034; implements @ezyang's suggestion from https://github.com/pytorch/pytorch/pull/107034#discussion_r1292857201.
This PR addresses https://fb.workplace.com/groups/pytorch.oss.dev/posts/1699944830430051 and does a bunch of stacked changes:
- Make the `Generator` class support GC; this makes all `Generator` instances tracked and accessible through Python's GC.
- Use the GC to retrieve all existing `Generator` instances in `DataLoader`'s `_worker_loop` and re-seed them: this extends what is already applied to the global/default `Generator`, which is already re-seeded.
~TODO: a bit of docs and justification, which I'll do if this PR is mergeable.~ -- Done
CC @albanD @ezyang as previously discussed
BC-Breaking Note
-------------------
We now re-seed all `Generator` instances within the `DataLoader` worker loop to ensure that their RNG state differs across workers.
Previously, user-defined `Generator` instances had the same RNG state in every worker, which could lead to incorrect training procedures. This only affects user-defined `Generator` instances, not the default `Generator` (which was already re-seeded).
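The idea of finding live instances through the garbage collector and re-seeding them per worker can be sketched roughly as follows. This is not the code from this PR: `ToyGenerator`, `reseed_all`, and the `base_seed + worker_id` formula are hypothetical stand-ins so that the example runs without PyTorch.

```python
import gc
import random


class ToyGenerator:
    """Hypothetical stand-in for a GC-tracked RNG object (not torch.Generator)."""

    def __init__(self, seed):
        self.rng = random.Random(seed)

    def manual_seed(self, seed):
        self.rng.seed(seed)


def reseed_all(base_seed, worker_id):
    """Re-seed every live ToyGenerator found through the garbage collector."""
    found = [obj for obj in gc.get_objects() if type(obj) is ToyGenerator]
    for gen in found:
        # Assumed seeding scheme, for illustration only.
        gen.manual_seed(base_seed + worker_id)
    return found


# A user-defined generator that would otherwise be identical in each worker.
user_gen = ToyGenerator(seed=0)

reseed_all(base_seed=1234, worker_id=0)
draw_worker0 = user_gen.rng.random()

reseed_all(base_seed=1234, worker_id=1)
draw_worker1 = user_gen.rng.random()

print(draw_worker0 != draw_worker1)
```

The lookup only works for objects the collector tracks, which is why the first bullet above makes `Generator` support GC.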
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107131
Approved by: https://github.com/ezyang
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)
These were reverted due to a conflict with the internal source repo.
Mostly fixes for PEP 484 violations (i.e. when a default argument is set to `None` but its type is not annotated as `Optional`).
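A minimal sketch of the kind of PEP 484 violation described above, and its fix. The functions `scale_implicit` and `scale_explicit` are hypothetical examples, not code touched by this PR.

```python
from typing import Optional


# Violation: the default is None, but the annotation claims a plain `int`.
def scale_implicit(x: float, factor: int = None) -> float:
    return x if factor is None else x * factor


# Fix: state explicitly that `None` is an accepted value.
def scale_explicit(x: float, factor: Optional[int] = None) -> float:
    return x if factor is None else x * factor


# Both behave the same at runtime; only the declared type differs.
print(scale_implicit.__annotations__["factor"])
print(scale_explicit.__annotations__["factor"])
```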
Plus a few real fixes:
- Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
- Add missing return statement to `torch._export.deserialize_graph`
- Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
- Add assert in `torch/optim/optimizer.py` that Optional list is not None
TODO (in followup PR):
- Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Unrelated, to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04:
- Add hack to squash the older libstdc++ from the conda environment in favor of the one from the OS to `.ci/docker/install_conda.sh`
- Update bazel cuda builds to focal, as with libstdc++-6.0.32 bazel builds lose the ability to catch exceptions (probably because they link with cupti statically, but I could not find where that is done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
In our DDP training workloads, each rank was initializing a `RandomSampler` for a dataset with a length of 3.5 billion items. We noticed that when this sampler was in scope, `gc.collect` calls were taking on the order of seconds to run, which would slow down the entire training iteration. This is because when we call `torch.randperm(n).tolist()`, we create a Python list of 3.5 billion items, which massively slows down the periodic mark & sweep garbage collection.
This PR swaps out the `.tolist()` call with a `.numpy()` call and manually calls `.item()` on each element as it is being requested. This has two benefits:
1. The first call to `RandomSampler::__next__` should be about twice as fast, since `.numpy` does not copy the contents of the original tensor
2. The runtime of `gc.collect()` calls no longer scales linearly with the size of the dataset passed to `RandomSampler`
I've attached some `timeit` samples to illustrate the speedups with this PR:
```
Main (no GC): 51.72115747816861
Main (10 GC calls) 83.61965207383037
PR (no GC) 33.06403830461204
PR (10 GC calls) 33.959467427805066
```
Code
```python
from timeit import timeit

# Baseline: materialize the permutation as a Python list, no explicit GC.
baseline_no_gc = """
import torch

n = int(1e9)
steps = n // 100

x = torch.randperm(n).tolist()
x_iter = iter(x)

for i in range(steps):
    next(x_iter)
"""

# Baseline with 10 explicit gc.collect() calls spread over the loop.
baseline_gc = """
import torch
import gc

n = int(1e9)
steps = n // 100
gc_every = steps // 10

x = torch.randperm(n).tolist()
x_iter = iter(x)

for i in range(steps):
    next(x_iter)
    if i % gc_every == 0:
        gc.collect()
"""

# PR: keep the permutation as a NumPy array and convert items lazily.
numpy_no_gc = """
import torch

n = int(1e9)
steps = n // 100

x = torch.randperm(n).numpy()
x_iter = (i.item() for i in x)

for i in range(steps):
    next(x_iter)
"""

# PR with 10 explicit gc.collect() calls spread over the loop.
numpy_gc = """
import torch
import gc

n = int(1e9)
steps = n // 100
gc_every = steps // 10

x = torch.randperm(n).numpy()
x_iter = (i.item() for i in x)

for i in range(steps):
    next(x_iter)
    if i % gc_every == 0:
        gc.collect()
"""

if __name__ == "__main__":
    print("Main (no GC): ", timeit(baseline_no_gc, number=1))
    print("Main (10 GC calls)", timeit(baseline_gc, number=1))
    print("PR (no GC)", timeit(numpy_no_gc, number=1))
    print("PR (10 GC calls)", timeit(numpy_gc, number=1))
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103339
Approved by: https://github.com/kit1980
The dataset wrappers in `torch.utils.data` currently include:
- `TensorDataset`
- `ConcatDataset`
- `ChainDataset`
`TensorDataset` is useful for stacking sets of tensors, but it cannot work with objects that lack a `.size()` method.
This PR proposes `StackDataset`, which is similar to `TensorDataset` but handles the general case, as `ConcatDataset` does.
Possible uses of `StackDataset` include multimodal networks with different inputs (such as image + text), or stacking a non-tensor input with the property to predict.
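The stacking idea can be illustrated with a pure-Python sketch. This is not the implementation added by the PR: `StackedSequences` is a hypothetical class that only shows the intended behavior, where sample `i` is the tuple of the `i`-th item of every wrapped source, regardless of whether the sources are tensors.

```python
from typing import Any, Sequence, Tuple


class StackedSequences:
    """Hypothetical pure-Python sketch of the stacking idea (not the PR's class)."""

    def __init__(self, *sources: Sequence[Any]) -> None:
        if not sources:
            raise ValueError("at least one source is required")
        length = len(sources[0])
        if any(len(src) != length for src in sources):
            raise ValueError("all sources must have the same length")
        self.sources = sources
        self._length = length

    def __getitem__(self, index: int) -> Tuple[Any, ...]:
        # One sample is the tuple of the index-th item of every source.
        return tuple(src[index] for src in self.sources)

    def __len__(self) -> int:
        return self._length


images = ["img0.png", "img1.png", "img2.png"]  # stand-ins for image samples
captions = ["a cat", "a dog", "a bird"]        # text modality
labels = [0, 1, 2]                             # property to predict
stacked = StackedSequences(images, captions, labels)
```

Only `len()` and indexing are required from each source, which is what lets non-tensor objects participate.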
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101338
Approved by: https://github.com/ejguan, https://github.com/NivekT
`DataLoader` supports batched loading from map-style datasets.
This is the fetcher's implementation of the auto-detection of batch loading support, in `torch.utils.data._utils.fetch._MapDatasetFetcher`:
```
class _MapDatasetFetcher(_BaseDatasetFetcher):
    def fetch(self, possibly_batched_index):
        if self.auto_collation:
            if hasattr(self.dataset, "__getitems__") and self.dataset.__getitems__:
                data = self.dataset.__getitems__(possibly_batched_index)
            else:
                data = [self.dataset[idx] for idx in possibly_batched_index]
```
The description of the `Dataset` API now documents this feature.
Additionally, the `Subset` dataset now supports `__getitems__` if its parent dataset supports it.
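From the dataset author's side, opting in means defining `__getitems__` next to `__getitem__`. The sketch below uses a hypothetical `SquaresDataset` and a simplified `fetch_batch` helper that mirrors the detection logic quoted above; neither name exists in PyTorch.

```python
from typing import List, Sequence


class SquaresDataset:
    """Hypothetical map-style dataset that also supports batched lookup."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.batched_calls = 0

    def __len__(self) -> int:
        return self.size

    def __getitem__(self, index: int) -> int:
        return index * index

    def __getitems__(self, indices: Sequence[int]) -> List[int]:
        # A real dataset could issue a single bulk read or query here.
        self.batched_calls += 1
        return [i * i for i in indices]


def fetch_batch(dataset, indices: Sequence[int]) -> list:
    """Simplified version of the detection logic quoted above."""
    getitems = getattr(dataset, "__getitems__", None)
    if getitems:
        return getitems(indices)
    # Fallback: one __getitem__ call per index.
    return [dataset[i] for i in indices]


dataset = SquaresDataset(10)
batch = fetch_batch(dataset, [1, 2, 3])
fallback = fetch_batch([10, 20, 30], [0, 2])  # a plain list has no __getitems__
```

The batched path is taken once per batch rather than once per sample, which is where a dataset backed by a database or remote store can save round trips.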
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100375
Approved by: https://github.com/ejguan, https://github.com/NivekT
Add a helpful context message to the `NotImplementedError`s raised by `Dataset` and `IterableDataset`, reminding users that they must implement `__getitem__`/`__iter__` in subclasses. Currently, users are presented with a bare `NotImplementedError` that does not describe the remedy.
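A minimal sketch of the pattern, assuming a hypothetical `BaseDataset` class; the message wording here is illustrative and is not the text added by the PR:

```python
class BaseDataset:
    """Hypothetical base class; the message text is illustrative only."""

    def __getitem__(self, index):
        # Name the missing method and the offending subclass in the error.
        raise NotImplementedError(
            "Subclasses of BaseDataset should implement __getitem__ "
            f"({type(self).__name__} does not)."
        )


class Incomplete(BaseDataset):
    pass


try:
    Incomplete()[0]
except NotImplementedError as exc:
    message = str(exc)
```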
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100667
Approved by: https://github.com/NivekT
Fixes #96975
Changes:
- Make sure custom ShardingDataPipe with `apply_sharding` can be used by `DataLoader`
- Allow the `apply_sharding` function without the last argument of `sharding_group`
- Make `DataLoader` not rely on `sharding_group`
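One way to tolerate both signatures of `apply_sharding` is to inspect the method before calling it. The sketch below is an assumption about how such a compatibility check could look, not the code in this PR; `LegacySharder`, `GroupAwareSharder`, and `call_apply_sharding` are hypothetical names.

```python
import inspect


class LegacySharder:
    """Hypothetical datapipe whose apply_sharding has no sharding_group."""

    def apply_sharding(self, num_of_instances, instance_id):
        self.config = (num_of_instances, instance_id, None)


class GroupAwareSharder:
    """Hypothetical datapipe that accepts the extra sharding_group argument."""

    def apply_sharding(self, num_of_instances, instance_id, sharding_group=None):
        self.config = (num_of_instances, instance_id, sharding_group)


def call_apply_sharding(datapipe, num_of_instances, instance_id, sharding_group):
    # Only pass sharding_group when the method declares it.
    params = inspect.signature(datapipe.apply_sharding).parameters
    if "sharding_group" in params:
        datapipe.apply_sharding(
            num_of_instances, instance_id, sharding_group=sharding_group
        )
    else:
        datapipe.apply_sharding(num_of_instances, instance_id)


legacy = LegacySharder()
modern = GroupAwareSharder()
call_apply_sharding(legacy, 4, 1, "group")
call_apply_sharding(modern, 4, 1, "group")
```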
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97287
Approved by: https://github.com/NivekT
Changes:
- #95200
1. Recognize `.py.in` and `.pyi.in` files as Python in VS Code for a better development experience.
2. Fix deep setting merge in `tools/vscode_settings.py`.
- #95267
3. Use `NamedTuple` rather than `namedtuple + __annotations__` for `torch.nn.utils.rnn.PackedSequence_`:
`namedtuple + __annotations__`:
```python
PackedSequence_ = namedtuple('PackedSequence_',
                             ['data', 'batch_sizes', 'sorted_indices', 'unsorted_indices'])

# type annotation for PackedSequence_ to make it compatible with TorchScript
PackedSequence_.__annotations__ = {'data': torch.Tensor, 'batch_sizes': torch.Tensor,
                                   'sorted_indices': Optional[torch.Tensor],
                                   'unsorted_indices': Optional[torch.Tensor]}
```
`NamedTuple`: Python 3.6+
```python
class PackedSequence_(NamedTuple):
    data: torch.Tensor
    batch_sizes: torch.Tensor
    sorted_indices: Optional[torch.Tensor]
    unsorted_indices: Optional[torch.Tensor]
```
- => this PR: #95268
4. Sort import statements and remove unnecessary imports in `.pyi`, `.pyi.in` files.
5. Format `.pyi`, `.pyi.in` files and remove unnecessary ellipsis `...` in type stubs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95268
Approved by: https://github.com/huydhn
I don't think the docstring explaining `pin_memory_device` is very clear. If it weren't for the string type, I would not have guessed that this was about the device that is referred to in the `pin_memory` option (and honestly, it took me a few minutes before noticing the type).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94349
Approved by: https://github.com/ejguan
Applies the remaining flake8-comprehension fixes and checks. This change replaces all remaining unnecessary generator expressions with list/dict/set comprehensions, which are more succinct, more performant, and better supported by our torch.jit compiler. It also removes useless generators such as `set(a for a in b)`, reducing them to just the `set` call.
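A few before/after pairs illustrate the kind of rewrite involved. The variable names are made up for the example and the snippets are not taken from the PyTorch codebase:

```python
words = ["alpha", "beta", "gamma", "beta"]

# Generator passed to a constructor vs. the equivalent comprehension.
lengths_before = list(len(w) for w in words)
lengths_after = [len(w) for w in words]

unique_before = set(w.upper() for w in words)
unique_after = {w.upper() for w in words}

index_before = dict((w, len(w)) for w in words)
index_after = {w: len(w) for w in words}

# A generator that only forwards its items is redundant: call set() directly.
copy_before = set(w for w in words)
copy_after = set(words)
```

Each pair produces the same value; the comprehension form simply skips the intermediate generator object.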
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94676
Approved by: https://github.com/ezyang
Optimizes away unnecessary collection casts and unnecessary calls to `list`, `tuple`, and `dict`, and simplifies calls to the `sorted` builtin. This should strictly improve both speed and readability.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94323
Approved by: https://github.com/albanD
Applies some more harmless pyupgrades. This one gets rid of deprecated aliases in unit_tests and upgrades more loops that `yield` each item of a `for` loop into `yield from` generators, which are more performant and propagate more information / exceptions from the original generator. This is the modern recommended way of forwarding generators.
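The difference in what gets propagated can be seen with `send()`. In this small sketch (all function names are hypothetical), the loop-based forwarder swallows the sent value, while `yield from` passes it to the inner generator:

```python
def numbers():
    received = yield 1
    yield received


def forward_with_loop():
    # The value passed to send() stops here; the inner generator sees None.
    for item in numbers():
        yield item


def forward_with_yield_from():
    # send() and throw() are delegated to the inner generator.
    yield from numbers()


loop_gen = forward_with_loop()
first_loop = next(loop_gen)
sent_loop = loop_gen.send("hello")

yf_gen = forward_with_yield_from()
first_yf = next(yf_gen)
sent_yf = yf_gen.send("hello")
```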
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94309
Approved by: https://github.com/albanD
Move `ShardingFilterIterDataPipe` into a dedicated file.
Also, propose a dedicated parent class (`_ShardingIterDataPipe`) for sharding data pipes, as this seems more like a "system/engine-level" datapipe that gives strong hints to RS on how to execute, and needs first-class citizen treatment in RS (compared with other "user-level" datapipes that are mostly composable `Callable[[Iterable], Iterable]`). This way, `graph_settings.py` no longer needs to check whether `is_shardable` and `apply_sharding` are present on a DataPipe. Open to other discussions.
Open question: Should [ShardingRoundRobinDispatcherIterDataPipe](01fc762003/torchdata/datapipes/iter/util/sharding.py (L16-L17)) also be considered a `_ShardingIterDataPipe`? (E.g. this sharding is executed by replicating (the metadata), while `ShardingRoundRobinDispatcherIterDataPipe` hints that the data is too expensive to replicate and so requires round-robin data exchange/dispatch.)
Differential Revision: D43014692
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94095
Approved by: https://github.com/ejguan, https://github.com/NivekT
In `combinatorics.py`, the check of the collection length is redundant:
`if isinstance(self.sampler, Sized) and len(self.sampler) >= 0:`
Here, the right side of the `and` always evaluates to true, because a length is never negative.
I suggest removing the length check since it is redundant.
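The claim can be checked directly: the builtin `len` never returns a negative number, and it raises `ValueError` if an object's `__len__` tries to. `NegativeLength` below is a hypothetical class used only for the demonstration.

```python
class NegativeLength:
    def __len__(self):
        return -1


# len() rejects a negative result instead of returning it.
try:
    len(NegativeLength())
    outcome = "returned"
except ValueError:
    outcome = "ValueError"

# For any object where len() succeeds, the comparison is trivially true.
always_true = all(len(c) >= 0 for c in ([], (), "", {}, range(5)))
```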
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93025
Approved by: https://github.com/albanD
Changes:
- Allow multiple `sharding_filter` in the pipeline as long as they are not on the same branch
- [x] Add test
Example:
```mermaid
graph TD;
DP1-->sharding_filter_1;
sharding_filter_1-->DP3;
DP2-->sharding_filter_2;
sharding_filter_2-->DP4;
DP3-->DP4;
DP4-->output;
```
In order to properly shard `DP1` and `DP2`, we should allow multiple `sharding_filter`s
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90769
Approved by: https://github.com/NivekT
Fixes #88074
Several datapipes have their lengths cached on being executed for the first time. However, source datapipes might change in length (most prominently, whenever `apply_sharding` is called). The behaviour is counter-intuitive because we do not expect `__len__` to have side-effects.
This PR makes `__len__` dynamically computed.
Changes:
- Add note to the `datapipes` README that `__len__` should be dynamic and why.
- Remove caching of length computations in `ConcaterIterDataPipe`, `MultiplexerIterDataPipe`, `ZipperIterDataPipe`, `BatcherIterDataPipe`, `ConcaterMapDataPipe`, and `BatcherMapDataPipe`.
- This required removal of the `length` attribute in setstate/getstate of `MultiplexerIterDataPipe`. I am unsure whether to remove this completely and risk breaking saved checkpoints (as I did) or whether to just ignore the `length` of the loaded `state`.
- This also means the classes above no longer have a `length` attribute. I have found no uses of this, though.
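The stale-length problem can be reproduced with a toy example. The classes below are hypothetical and are not the datapipes touched by this PR; shrinking `src.items` stands in for what `apply_sharding` does to a source.

```python
class Source:
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class CachedConcat:
    """Caches the length on first use (the old behavior)."""

    def __init__(self, *sources):
        self.sources = sources
        self.length = None

    def __len__(self):
        if self.length is None:
            self.length = sum(len(s) for s in self.sources)
        return self.length


class DynamicConcat:
    """Recomputes the length on every call (the new behavior)."""

    def __init__(self, *sources):
        self.sources = sources

    def __len__(self):
        return sum(len(s) for s in self.sources)


src = Source(range(10))
cached, dynamic = CachedConcat(src), DynamicConcat(src)
len(cached), len(dynamic)    # first call; CachedConcat stores 10
src.items = src.items[::2]   # the source shrinks to 5 items
stale, fresh = len(cached), len(dynamic)
```

After the source changes, the cached variant still reports the old length while the dynamic one tracks the source.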
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88302
Approved by: https://github.com/NivekT
Fixes: https://github.com/pytorch/data/issues/865
I will add another PR in torchdata to validate that this change solves the infinite datapipe problem (I have tested locally). This is one of the most annoying stacks of PRs caused by the separation between TorchData and PyTorch.
There is a case where `file.close` is never called because the generator function never reaches its end. A simple example would be to `zip` two datapipes with different lengths. The longer DataPipe would never reach the end of its generator, which is then cleaned up by `gc`, so the `file.close` line is never executed. (This is the reason that Vitaly had to create this [hack](4451eb24e6/torch/utils/data/datapipes/iter/combining.py (L573-L583)) to retrieve all remaining data to make sure the generator function is fully executed.)
However, this hack introduces another problem where an infinite datapipe would make `zip` never end as it would try to deplete the infinite iterator. See: https://github.com/pytorch/data/issues/865
So, in this PR, I am adding a `try-finally` clause to make sure that `file.close` is always executed during the destruction of the `generator` object. Then, we don't need the hack within `zip` any more.
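The mechanism can be shown with plain Python generators. This is a sketch with hypothetical names (`FakeFile`, `read_lines`), not the datapipe code: when a suspended generator is closed or destroyed, `GeneratorExit` is raised at the `yield`, so a `finally` block still runs even though the loop never finished.

```python
class FakeFile:
    """Hypothetical stand-in for a file handle."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def read_lines(handle, lines):
    try:
        for line in lines:
            yield line
    finally:
        # Runs on normal exhaustion and when the generator is closed early.
        handle.close()


short_handle, long_handle = FakeFile(), FakeFile()
short_gen = read_lines(short_handle, ["a"])
long_gen = read_lines(long_handle, ["x", "y", "z"])

pairs = list(zip(short_gen, long_gen))  # stops once short_gen is exhausted
closed_before = long_handle.closed      # long_gen is still suspended here
long_gen.close()                        # explicit stand-in for destruction
closed_after = long_handle.closed
```

The example calls `close()` explicitly so the result does not depend on when the interpreter destroys the generator object.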
Differential Revision: [D41699469](https://our.internmc.facebook.com/intern/diff/D41699469)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89974
Approved by: https://github.com/NivekT, https://github.com/wenleix
There is a case where `file.close` is never called because the generator function never reaches its end. A simple example would be to `zip` two datapipes with different lengths. The longer DataPipe would never reach the end of its generator, which is then cleaned up by `gc`, so the `file.close` line is never executed. (This is the reason that Vitaly had to create this [hack](4451eb24e6/torch/utils/data/datapipes/iter/combining.py (L573-L583)) to retrieve all remaining data to make sure the generator function is fully executed.)
However, this hack introduces another problem where an infinite datapipe would make `zip` never end as it would try to deplete the infinite iterator. See: https://github.com/pytorch/data/issues/865
So, in this PR, I am adding a `try-finally` clause to make sure that `file.close` is always executed during the destruction of the `generator` object. Then, we don't need the hack within `zip` any more.
Differential Revision: [D41699470](https://our.internmc.facebook.com/intern/diff/D41699470)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89973
Approved by: https://github.com/NivekT
- This would remove the hard-coded check within `_ChildDataPipe`.
- Add `get_length_by_instance` to the parent class so that child DataPipes can have different lengths
- Prevent an error when `__del__` is executed after the object has already been removed
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89216
Approved by: https://github.com/NivekT
This is a temporary fix for an internal SEV. We have run three different workflows to validate that this fix would unblock the internal SEV.
Follow-up tasks:
- [ ] Create reproducible test for multithreading with generator
- [ ] Figure out how to make fullsynciterator work properly with generator
- [ ] Move Wrapper back to generator if needed
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87459
Approved by: https://github.com/NivekT