Commit Graph

55 Commits

Author SHA1 Message Date
Kiuk Chung
abd16a8c64 [torch/multiprocessing] Use multiprocessing.reduction.register instead of ForkingPickler.register to register custom tensor and storage reductions (#135030)
Right now `multiprocessing.reduction.register()` is simply an alias for `multiprocessing.reduction.ForkingPickler.register()` (https://github.com/python/cpython/blame/main/Lib/multiprocessing/reduction.py#L56), but the top-level `register()` function exposes fewer of the internal details of the `multiprocessing.reduction` module.
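As a sketch of what this buys callers (the `Handle` type and its reducer below are hypothetical, not from the PR): registering a custom reduction through the top-level helper behaves exactly like going through `ForkingPickler` directly, without naming the pickler class.

```python
import pickle
import multiprocessing.reduction as reduction


# A hypothetical user-defined type we want child processes to receive.
class Handle:
    def __init__(self, value):
        self.value = value


def rebuild_handle(value):
    # Called in the receiving process to reconstruct the object.
    return Handle(value)


def reduce_handle(obj):
    # Return (callable, args) telling multiprocessing how to rebuild obj.
    return (rebuild_handle, (obj.value,))


# The top-level helper -- in CPython this is literally
# `register = ForkingPickler.register`, but it hides the pickler class.
reduction.register(Handle, reduce_handle)

# Round-trip through the ForkingPickler, as multiprocessing itself would.
data = reduction.ForkingPickler.dumps(Handle(42))
restored = pickle.loads(data)
```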
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135030
Approved by: https://github.com/albanD
2024-09-16 20:07:29 +00:00
Aaron Gokaslan
31715be72a [BE]: Update mypy to 1.11.2 (#133816)
Updates mypy to 1.11.2 to improve type inference.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133816
Approved by: https://github.com/ezyang
2024-09-16 19:44:11 +00:00
PyTorch MergeBot
3117f2cf67 Revert "[BE]: Update mypy to 1.11.2 (#133816)"
This reverts commit 55299cfc22.

Reverted https://github.com/pytorch/pytorch/pull/133816 on behalf of https://github.com/jeanschmidt due to seems to have broken https://github.com/pytorch/pytorch/actions/runs/10865710499/job/30155699792 on main ([comment](https://github.com/pytorch/pytorch/pull/133816#issuecomment-2352377684))
2024-09-16 09:11:16 +00:00
Aaron Gokaslan
55299cfc22 [BE]: Update mypy to 1.11.2 (#133816)
Updates mypy to 1.11.2 to improve type inference.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133816
Approved by: https://github.com/ezyang
2024-09-14 21:40:36 +00:00
Oguz Ulgen
72d2dba992 Add None return type to init (#132335)
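The change amounts to annotating `__init__` with an explicit `-> None` return type; a minimal hypothetical sketch (the `Linear` class below is illustrative, not code from the PR) of why this matters under mypy:

```python
class Linear:
    # Per PEP 484, __init__ should be annotated "-> None". With the
    # annotation, mypy treats the method as typed and checks its body;
    # an unannotated __init__ is skipped under default settings.
    def __init__(self, in_features: int, out_features: int) -> None:
        self.in_features = in_features
        self.out_features = out_features
```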
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132335
Approved by: https://github.com/albanD
2024-08-01 15:26:45 +00:00
Kurt Mohler
e590168865 Enable sharing meta tensors between processes (#129520)
Fixes #129436

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129520
Approved by: https://github.com/ezyang
2024-07-04 20:29:48 +00:00
Xuehai Pan
83bb9b7c53 [BE] explicitly export subpackage torch.utils (#128342)
Resolves #126401

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128342
Approved by: https://github.com/Skylion007
ghstack dependencies: #127707
2024-06-13 04:39:16 +00:00
Aaron Orenstein
038b927590 Flip default value for mypy disallow_untyped_defs [7/11] (#127844)
See #127836 for details.
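Flipping this default corresponds to enabling mypy's strictness flag in configuration; a hedged sketch of one common shape (the `mypy.ini` section names below are illustrative):

```ini
[mypy]
# Error on any function definition without type annotations.
disallow_untyped_defs = True

[mypy-torch.legacy.*]
# Hypothetical per-module escape hatch for not-yet-typed code.
disallow_untyped_defs = False
```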

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127844
Approved by: https://github.com/oulgen
ghstack dependencies: #127842, #127843
2024-06-08 18:49:45 +00:00
Pearu Peterson
0bd4d1f4ab Add sparse tensors support to dataloader. (#112842)
Fixes https://github.com/pytorch/pytorch/issues/106837

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112842
Approved by: https://github.com/cpuhrsch, https://github.com/gokulavasan
2023-11-19 16:05:27 +00:00
IvanLauLinTiong
91c90f232a Fix docstring errors in reductions.py, spawn.py, pool.py, parameter.py, cpp.py, grad.py, __init__.py, profiler.py, queue.py, graph.py (#113052)
Fixes #112595
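Most of these fixes are mechanical; as a hypothetical before/after for D205, D400, and D401 (illustrative only, not code from the PR):

```python
def fetch_result_before():
    """Returns the result
    after computing it lazily."""  # triggers D205, D400, D401


def fetch_result_after():
    """Return the result.

    Compute the value lazily and cache it for later calls.
    """
```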
- `torch/autograd/profiler.py` </br>
**Before: 37**

```
torch/autograd/profiler.py:1 at module level:
        D100: Missing docstring in public module
torch/autograd/profiler.py:91 in public class `profile`:
        D205: 1 blank line required between summary line and description (found 0)
torch/autograd/profiler.py:175 in public method `__init__`:
        D107: Missing docstring in __init__
torch/autograd/profiler.py:261 in public method `config`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:272 in public method `__enter__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:290 in public method `__exit__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:308 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:313 in public method `__str__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:322 in public method `table`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:346 in public method `export_chrome_trace`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:355 in public method `export_stacks`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:361 in public method `key_averages`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:368 in public method `total_average`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:377 in public method `self_cpu_time_total`:
        D205: 1 blank line required between summary line and description (found 0)
torch/autograd/profiler.py:377 in public method `self_cpu_time_total`:
        D400: First line should end with a period (not 'f')
torch/autograd/profiler.py:555 in public class `record_function`:
        D205: 1 blank line required between summary line and description (found 0)
torch/autograd/profiler.py:555 in public class `record_function`:
        D400: First line should end with a period (not 'f')
torch/autograd/profiler.py:591 in public method `__init__`:
        D107: Missing docstring in __init__
torch/autograd/profiler.py:602 in public method `__enter__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:608 in public method `__exit__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:625 in private method `_call_end_callbacks_on_future`:
        D205: 1 blank line required between summary line and description (found 0)
torch/autograd/profiler.py:625 in private method `_call_end_callbacks_on_future`:
        D400: First line should end with a period (not 'c')
torch/autograd/profiler.py:707 in public method `__init__`:
        D107: Missing docstring in __init__
torch/autograd/profiler.py:712 in public method `__enter__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:733 in public method `__exit__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:826 in public method `__init__`:
        D107: Missing docstring in __init__
torch/autograd/profiler.py:831 in public method `__enter__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:853 in public method `__exit__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:863 in public function `load_nvprof`:
        D401: First line should be in imperative mood (perhaps 'Open', not 'Opens')
torch/autograd/profiler.py:874 in public method `__init__`:
        D107: Missing docstring in __init__
torch/autograd/profiler.py:877 in public method `see`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:883 in public function `parse_nvprof_trace`:
        D103: Missing docstring in public function
torch/autograd/profiler.py:951 in public class `KinetoStepTracker`:
        D205: 1 blank line required between summary line and description (found 0)
torch/autograd/profiler.py:991 in public method `init_step_count`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:995 in public method `erase_step_count`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:1000 in public method `increment_step`:
        D205: 1 blank line required between summary line and description (found 0)
torch/autograd/profiler.py:1023 in public method `current_step`:
        D102: Missing docstring in public method
37
```

**After: 27**

```
torch/autograd/profiler.py:1 at module level:
        D100: Missing docstring in public module
torch/autograd/profiler.py:176 in public method `__init__`:
        D107: Missing docstring in __init__
torch/autograd/profiler.py:262 in public method `config`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:273 in public method `__enter__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:291 in public method `__exit__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:309 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:314 in public method `__str__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:323 in public method `table`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:347 in public method `export_chrome_trace`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:356 in public method `export_stacks`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:362 in public method `key_averages`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:369 in public method `total_average`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:593 in public method `__init__`:
        D107: Missing docstring in __init__
torch/autograd/profiler.py:604 in public method `__enter__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:610 in public method `__exit__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:708 in public method `__init__`:
        D107: Missing docstring in __init__
torch/autograd/profiler.py:713 in public method `__enter__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:734 in public method `__exit__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:827 in public method `__init__`:
        D107: Missing docstring in __init__
torch/autograd/profiler.py:832 in public method `__enter__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:854 in public method `__exit__`:
        D105: Missing docstring in magic method
torch/autograd/profiler.py:875 in public method `__init__`:
        D107: Missing docstring in __init__
torch/autograd/profiler.py:878 in public method `see`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:884 in public function `parse_nvprof_trace`:
        D103: Missing docstring in public function
torch/autograd/profiler.py:993 in public method `init_step_count`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:997 in public method `erase_step_count`:
        D102: Missing docstring in public method
torch/autograd/profiler.py:1025 in public method `current_step`:
        D102: Missing docstring in public method
27
```

- `torch/autograd/graph.py` </br>
**Before: 22**

```
torch/autograd/graph.py:1 at module level:
        D100: Missing docstring in public module
torch/autograd/graph.py:24 in public class `Node`:
        D101: Missing docstring in public class
torch/autograd/graph.py:27 in public method `name`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
torch/autograd/graph.py:42 in public method `next_functions`:
        D102: Missing docstring in public method
torch/autograd/graph.py:47 in public method `metadata`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
torch/autograd/graph.py:56 in public method `register_hook`:
        D401: First line should be in imperative mood (perhaps 'Register', not 'Registers')
torch/autograd/graph.py:94 in public method `register_prehook`:
        D401: First line should be in imperative mood (perhaps 'Register', not 'Registers')
torch/autograd/graph.py:129 in public method `__subclasshook__`:
        D105: Missing docstring in magic method
torch/autograd/graph.py:147 in public function `get_gradient_edge`:
        D205: 1 blank line required between summary line and description (found 0)
torch/autograd/graph.py:147 in public function `get_gradient_edge`:
        D400: First line should end with a period (not 'f')
torch/autograd/graph.py:147 in public function `get_gradient_edge`:
        D401: First line should be in imperative mood; try rephrasing (found 'This')
torch/autograd/graph.py:166 in public function `increment_version`:
        D205: 1 blank line required between summary line and description (found 0)
torch/autograd/graph.py:166 in public function `increment_version`:
        D400: First line should end with a period (not 'd')
torch/autograd/graph.py:166 in public function `increment_version`:
        D401: First line should be in imperative mood; try rephrasing (found 'This')
torch/autograd/graph.py:243 in public method `__init__`:
        D107: Missing docstring in __init__
torch/autograd/graph.py:251 in public method `__enter__`:
        D105: Missing docstring in magic method
torch/autograd/graph.py:256 in public method `__exit__`:
        D105: Missing docstring in magic method
torch/autograd/graph.py:261 in public class `save_on_cpu`:
        D205: 1 blank line required between summary line and description (found 0)
torch/autograd/graph.py:261 in public class `save_on_cpu`:
        D400: First line should end with a period (not 'e')
torch/autograd/graph.py:303 in public method `__init__`:
        D107: Missing docstring in __init__
torch/autograd/graph.py:365 in public function `register_multi_grad_hook`:
        D401: First line should be in imperative mood (perhaps 'Register', not 'Registers')
torch/autograd/graph.py:588 in public function `allow_mutation_on_saved_tensors`:
        D400: First line should end with a period (not 'd')
22
```

**After: 8**

```
torch/autograd/graph.py:1 at module level:
        D100: Missing docstring in public module
torch/autograd/graph.py:24 in public class `Node`:
        D101: Missing docstring in public class
torch/autograd/graph.py:42 in public method `next_functions`:
        D102: Missing docstring in public method
torch/autograd/graph.py:129 in public method `__subclasshook__`:
        D105: Missing docstring in magic method
torch/autograd/graph.py:244 in public method `__init__`:
        D107: Missing docstring in __init__
torch/autograd/graph.py:252 in public method `__enter__`:
        D105: Missing docstring in magic method
torch/autograd/graph.py:257 in public method `__exit__`:
        D105: Missing docstring in magic method
torch/autograd/graph.py:303 in public method `__init__`:
        D107: Missing docstring in __init__
8
```

- `torch/multiprocessing/pool.py` </br>
**Before: 6**

```
torch/multiprocessing/pool.py:1 at module level:
        D100: Missing docstring in public module
torch/multiprocessing/pool.py:7 in public function `clean_worker`:
        D103: Missing docstring in public function
torch/multiprocessing/pool.py:18 in public class `Pool`:
        D205: 1 blank line required between summary line and description (found 0)
torch/multiprocessing/pool.py:18 in public class `Pool`:
        D209: Multi-line docstring closing quotes should be on a separate line
torch/multiprocessing/pool.py:29 in private method `_repopulate_pool`:
        D205: 1 blank line required between summary line and description (found 0)
torch/multiprocessing/pool.py:29 in private method `_repopulate_pool`:
        D400: First line should end with a period (not ',')
6
```

**After: 2**

```
torch/multiprocessing/pool.py:1 at module level:
        D100: Missing docstring in public module
torch/multiprocessing/pool.py:7 in public function `clean_worker`:
        D103: Missing docstring in public function
2
```

- `torch/multiprocessing/queue.py` </br>
**Before: 11**

```
torch/multiprocessing/queue.py:1 at module level:
        D100: Missing docstring in public module
torch/multiprocessing/queue.py:8 in public class `ConnectionWrapper`:
        D205: 1 blank line required between summary line and description (found 0)
torch/multiprocessing/queue.py:8 in public class `ConnectionWrapper`:
        D209: Multi-line docstring closing quotes should be on a separate line
torch/multiprocessing/queue.py:8 in public class `ConnectionWrapper`:
        D400: First line should end with a period (not 'o')
torch/multiprocessing/queue.py:11 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/queue.py:14 in public method `send`:
        D102: Missing docstring in public method
torch/multiprocessing/queue.py:19 in public method `recv`:
        D102: Missing docstring in public method
torch/multiprocessing/queue.py:23 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/multiprocessing/queue.py:29 in public class `Queue`:
        D101: Missing docstring in public class
torch/multiprocessing/queue.py:30 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/queue.py:38 in public class `SimpleQueue`:
        D101: Missing docstring in public class
11
```

**After: 8**

```
torch/multiprocessing/queue.py:1 at module level:
        D100: Missing docstring in public module
torch/multiprocessing/queue.py:10 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/queue.py:13 in public method `send`:
        D102: Missing docstring in public method
torch/multiprocessing/queue.py:18 in public method `recv`:
        D102: Missing docstring in public method
torch/multiprocessing/queue.py:22 in public method `__getattr__`:
        D105: Missing docstring in magic method
torch/multiprocessing/queue.py:28 in public class `Queue`:
        D101: Missing docstring in public class
torch/multiprocessing/queue.py:29 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/queue.py:37 in public class `SimpleQueue`:
        D101: Missing docstring in public class
8
```

- `torch/multiprocessing/reductions.py` </br>
**Before: 31**

```
torch/multiprocessing/reductions.py:1 at module level:
        D100: Missing docstring in public module
torch/multiprocessing/reductions.py:24 in public class `StorageWeakRef`:
        D209: Multi-line docstring closing quotes should be on a separate line
torch/multiprocessing/reductions.py:31 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/reductions.py:38 in public method `from_weakref`:
        D102: Missing docstring in public method
torch/multiprocessing/reductions.py:44 in public method `expired`:
        D102: Missing docstring in public method
torch/multiprocessing/reductions.py:47 in public method `__del__`:
        D105: Missing docstring in magic method
torch/multiprocessing/reductions.py:50 in public method `__hash__`:
        D105: Missing docstring in magic method
torch/multiprocessing/reductions.py:53 in public method `__eq__`:
        D105: Missing docstring in magic method
torch/multiprocessing/reductions.py:60 in public class `SharedCache`:
        D400: First line should end with a period (not 'f')
torch/multiprocessing/reductions.py:62 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/reductions.py:75 in public method `get`:
        D102: Missing docstring in public method
torch/multiprocessing/reductions.py:79 in public method `__setitem__`:
        D105: Missing docstring in magic method
torch/multiprocessing/reductions.py:85 in public method `free_dead_references`:
        D102: Missing docstring in public method
torch/multiprocessing/reductions.py:99 in public function `rebuild_event`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:103 in public function `reduce_event`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:108 in public function `rebuild_tensor`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:121 in public function `rebuild_cuda_tensor`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:189 in public function `reduce_tensor`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:347 in public function `rebuild_nested_tensor`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:364 in public function `reduce_nested_tensor`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:389 in public function `fd_id`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:397 in public function `storage_from_cache`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:404 in public function `rebuild_storage_fd`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:417 in public function `rebuild_storage_filename`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:437 in public function `rebuild_storage_empty`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:441 in public function `rebuild_typed_storage`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:446 in public function `reduce_typed_storage`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:450 in public function `rebuild_typed_storage_child`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:455 in public function `reduce_typed_storage_child`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:459 in public function `reduce_storage`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:488 in public function `init_reductions`:
        D103: Missing docstring in public function
31
```

**After: 29**

```
torch/multiprocessing/reductions.py:1 at module level:
        D100: Missing docstring in public module
torch/multiprocessing/reductions.py:32 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/reductions.py:39 in public method `from_weakref`:
        D102: Missing docstring in public method
torch/multiprocessing/reductions.py:45 in public method `expired`:
        D102: Missing docstring in public method
torch/multiprocessing/reductions.py:48 in public method `__del__`:
        D105: Missing docstring in magic method
torch/multiprocessing/reductions.py:51 in public method `__hash__`:
        D105: Missing docstring in magic method
torch/multiprocessing/reductions.py:54 in public method `__eq__`:
        D105: Missing docstring in magic method
torch/multiprocessing/reductions.py:63 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/reductions.py:76 in public method `get`:
        D102: Missing docstring in public method
torch/multiprocessing/reductions.py:80 in public method `__setitem__`:
        D105: Missing docstring in magic method
torch/multiprocessing/reductions.py:86 in public method `free_dead_references`:
        D102: Missing docstring in public method
torch/multiprocessing/reductions.py:100 in public function `rebuild_event`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:104 in public function `reduce_event`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:109 in public function `rebuild_tensor`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:122 in public function `rebuild_cuda_tensor`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:190 in public function `reduce_tensor`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:348 in public function `rebuild_nested_tensor`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:365 in public function `reduce_nested_tensor`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:390 in public function `fd_id`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:398 in public function `storage_from_cache`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:405 in public function `rebuild_storage_fd`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:418 in public function `rebuild_storage_filename`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:438 in public function `rebuild_storage_empty`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:442 in public function `rebuild_typed_storage`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:447 in public function `reduce_typed_storage`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:451 in public function `rebuild_typed_storage_child`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:456 in public function `reduce_typed_storage_child`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:460 in public function `reduce_storage`:
        D103: Missing docstring in public function
torch/multiprocessing/reductions.py:489 in public function `init_reductions`:
        D103: Missing docstring in public function
29
```

- `torch/multiprocessing/spawn.py` </br>
**Before: 19**

```
torch/multiprocessing/spawn.py:1 at module level:
        D100: Missing docstring in public module
torch/multiprocessing/spawn.py:11 in public class `ProcessException`:
        D101: Missing docstring in public class
torch/multiprocessing/spawn.py:14 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/spawn.py:20 in public method `__reduce__`:
        D105: Missing docstring in magic method
torch/multiprocessing/spawn.py:25 in public class `ProcessRaisedException`:
        D205: 1 blank line required between summary line and description (found 0)
torch/multiprocessing/spawn.py:25 in public class `ProcessRaisedException`:
        D400: First line should end with a period (not 'n')
torch/multiprocessing/spawn.py:30 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/spawn.py:40 in public class `ProcessExitedException`:
        D205: 1 blank line required between summary line and description (found 0)
torch/multiprocessing/spawn.py:40 in public class `ProcessExitedException`:
        D400: First line should end with a period (not 'l')
torch/multiprocessing/spawn.py:47 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/spawn.py:59 in public method `__reduce__`:
        D105: Missing docstring in magic method
torch/multiprocessing/spawn.py:85 in public class `ProcessContext`:
        D101: Missing docstring in public class
torch/multiprocessing/spawn.py:86 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/spawn.py:93 in public method `pids`:
        D102: Missing docstring in public method
torch/multiprocessing/spawn.py:97 in public method `join`:
        D205: 1 blank line required between summary line and description (found 0)
torch/multiprocessing/spawn.py:97 in public method `join`:
        D401: First line should be in imperative mood (perhaps 'Try', not 'Tries')
torch/multiprocessing/spawn.py:166 in public class `SpawnContext`:
        D101: Missing docstring in public class
torch/multiprocessing/spawn.py:167 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/spawn.py:180 in public function `start_processes`:
        D103: Missing docstring in public function
19
```

**After: 13**

```
torch/multiprocessing/spawn.py:1 at module level:
        D100: Missing docstring in public module
torch/multiprocessing/spawn.py:11 in public class `ProcessException`:
        D101: Missing docstring in public class
torch/multiprocessing/spawn.py:14 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/spawn.py:20 in public method `__reduce__`:
        D105: Missing docstring in magic method
torch/multiprocessing/spawn.py:27 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/spawn.py:41 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/spawn.py:53 in public method `__reduce__`:
        D105: Missing docstring in magic method
torch/multiprocessing/spawn.py:79 in public class `ProcessContext`:
        D101: Missing docstring in public class
torch/multiprocessing/spawn.py:80 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/spawn.py:87 in public method `pids`:
        D102: Missing docstring in public method
torch/multiprocessing/spawn.py:161 in public class `SpawnContext`:
        D101: Missing docstring in public class
torch/multiprocessing/spawn.py:162 in public method `__init__`:
        D107: Missing docstring in __init__
torch/multiprocessing/spawn.py:175 in public function `start_processes`:
        D103: Missing docstring in public function
13
```

- `torch/multiprocessing/__init__.py` </br>
**Before: 5**

```
torch/multiprocessing/__init__.py:1 at module level:
        D205: 1 blank line required between summary line and description (found 0)
torch/multiprocessing/__init__.py:1 at module level:
        D400: First line should end with a period (not '`')
torch/multiprocessing/__init__.py:57 in public function `set_sharing_strategy`:
        D401: First line should be in imperative mood (perhaps 'Set', not 'Sets')
torch/multiprocessing/__init__.py:69 in public function `get_sharing_strategy`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
torch/multiprocessing/__init__.py:74 in public function `get_all_sharing_strategies`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
5
```

**After: 0**

- `torch/nn/__init__.py` </br>
**Before: 3**

```
torch/nn/__init__.py:1 at module level:
        D104: Missing docstring in public package
torch/nn/__init__.py:14 in public function `factory_kwargs`:
        D205: 1 blank line required between summary line and description (found 0)
torch/nn/__init__.py:14 in public function `factory_kwargs`:
        D400: First line should end with a period (not 'd')
3
```

**After: 1**

```
torch/nn/__init__.py:1 at module level:
        D104: Missing docstring in public package
1
```

- `torch/nn/cpp.py` </br>
**Before: 16**

```
torch/nn/cpp.py:7 in public class `OrderedDictWrapper`:
        D205: 1 blank line required between summary line and description (found 0)
torch/nn/cpp.py:7 in public class `OrderedDictWrapper`:
        D400: First line should end with a period (not 'e')
torch/nn/cpp.py:16 in public method `__init__`:
        D107: Missing docstring in __init__
torch/nn/cpp.py:21 in public method `cpp_dict`:
        D102: Missing docstring in public method
torch/nn/cpp.py:27 in public method `items`:
        D102: Missing docstring in public method
torch/nn/cpp.py:30 in public method `keys`:
        D102: Missing docstring in public method
torch/nn/cpp.py:33 in public method `values`:
        D102: Missing docstring in public method
torch/nn/cpp.py:36 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/nn/cpp.py:39 in public method `__len__`:
        D105: Missing docstring in magic method
torch/nn/cpp.py:42 in public method `__contains__`:
        D105: Missing docstring in magic method
torch/nn/cpp.py:45 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/nn/cpp.py:50 in public class `ModuleWrapper`:
        D205: 1 blank line required between summary line and description (found 0)
torch/nn/cpp.py:50 in public class `ModuleWrapper`:
        D400: First line should end with a period (not 'd')
torch/nn/cpp.py:55 in public method `__init__`:
        D107: Missing docstring in __init__
torch/nn/cpp.py:83 in public method `training`:
        D102: Missing docstring in public method
torch/nn/cpp.py:90 in public method `__repr__`:
        D105: Missing docstring in magic method
16
```

**After: 12**

```
torch/nn/cpp.py:16 in public method `__init__`:
        D107: Missing docstring in __init__
torch/nn/cpp.py:21 in public method `cpp_dict`:
        D102: Missing docstring in public method
torch/nn/cpp.py:27 in public method `items`:
        D102: Missing docstring in public method
torch/nn/cpp.py:30 in public method `keys`:
        D102: Missing docstring in public method
torch/nn/cpp.py:33 in public method `values`:
        D102: Missing docstring in public method
torch/nn/cpp.py:36 in public method `__iter__`:
        D105: Missing docstring in magic method
torch/nn/cpp.py:39 in public method `__len__`:
        D105: Missing docstring in magic method
torch/nn/cpp.py:42 in public method `__contains__`:
        D105: Missing docstring in magic method
torch/nn/cpp.py:45 in public method `__getitem__`:
        D105: Missing docstring in magic method
torch/nn/cpp.py:52 in public method `__init__`:
        D107: Missing docstring in __init__
torch/nn/cpp.py:80 in public method `training`:
        D102: Missing docstring in public method
torch/nn/cpp.py:87 in public method `__repr__`:
        D105: Missing docstring in magic method
12
```

- `torch/nn/grad.py` </br>
**Before: 10**

```
torch/nn/grad.py:1 at module level:
        D400: First line should end with a period (not 'e')
torch/nn/grad.py:8 in public function `conv1d_input`:
        D205: 1 blank line required between summary line and description (found 0)
torch/nn/grad.py:8 in public function `conv1d_input`:
        D401: First line should be in imperative mood (perhaps 'Compute', not 'Computes')
torch/nn/grad.py:40 in public function `conv1d_weight`:
        D401: First line should be in imperative mood (perhaps 'Compute', not 'Computes')
torch/nn/grad.py:71 in public function `conv2d_input`:
        D205: 1 blank line required between summary line and description (found 0)
torch/nn/grad.py:71 in public function `conv2d_input`:
        D401: First line should be in imperative mood (perhaps 'Compute', not 'Computes')
torch/nn/grad.py:103 in public function `conv2d_weight`:
        D401: First line should be in imperative mood (perhaps 'Compute', not 'Computes')
torch/nn/grad.py:134 in public function `conv3d_input`:
        D205: 1 blank line required between summary line and description (found 0)
torch/nn/grad.py:134 in public function `conv3d_input`:
        D401: First line should be in imperative mood (perhaps 'Compute', not 'Computes')
torch/nn/grad.py:166 in public function `conv3d_weight`:
        D401: First line should be in imperative mood (perhaps 'Compute', not 'Computes')
10
```

**After: 0**

- `torch/nn/parameter.py` </br>
**Before: 17**

```
torch/nn/parameter.py:1 at module level:
        D100: Missing docstring in public module
torch/nn/parameter.py:14 in public class `Parameter`:
        D204: 1 blank line required after class docstring (found 0)
torch/nn/parameter.py:33 in public method `__new__`:
        D102: Missing docstring in public method
torch/nn/parameter.py:54 in public method `__deepcopy__`:
        D105: Missing docstring in magic method
torch/nn/parameter.py:62 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/nn/parameter.py:65 in public method `__reduce_ex__`:
        D105: Missing docstring in magic method
torch/nn/parameter.py:84 in public class `UninitializedTensorMixin`:
        D101: Missing docstring in public class
torch/nn/parameter.py:105 in public method `materialize`:
        D205: 1 blank line required between summary line and description (found 0)
torch/nn/parameter.py:125 in public method `shape`:
        D102: Missing docstring in public method
torch/nn/parameter.py:132 in public method `share_memory_`:
        D102: Missing docstring in public method
torch/nn/parameter.py:138 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/nn/parameter.py:141 in public method `__reduce_ex__`:
        D105: Missing docstring in magic method
torch/nn/parameter.py:149 in public method `__torch_function__`:
        D105: Missing docstring in magic method
torch/nn/parameter.py:164 in public function `is_lazy`:
        D103: Missing docstring in public function
torch/nn/parameter.py:186 in public method `__new__`:
        D102: Missing docstring in public method
torch/nn/parameter.py:191 in public method `__deepcopy__`:
        D105: Missing docstring in magic method
torch/nn/parameter.py:217 in public method `__new__`:
        D102: Missing docstring in public method
17
```

**After: 15**

```
torch/nn/parameter.py:1 at module level:
        D100: Missing docstring in public module
torch/nn/parameter.py:34 in public method `__new__`:
        D102: Missing docstring in public method
torch/nn/parameter.py:55 in public method `__deepcopy__`:
        D105: Missing docstring in magic method
torch/nn/parameter.py:63 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/nn/parameter.py:66 in public method `__reduce_ex__`:
        D105: Missing docstring in magic method
torch/nn/parameter.py:85 in public class `UninitializedTensorMixin`:
        D101: Missing docstring in public class
torch/nn/parameter.py:127 in public method `shape`:
        D102: Missing docstring in public method
torch/nn/parameter.py:134 in public method `share_memory_`:
        D102: Missing docstring in public method
torch/nn/parameter.py:140 in public method `__repr__`:
        D105: Missing docstring in magic method
torch/nn/parameter.py:143 in public method `__reduce_ex__`:
        D105: Missing docstring in magic method
torch/nn/parameter.py:151 in public method `__torch_function__`:
        D105: Missing docstring in magic method
torch/nn/parameter.py:166 in public function `is_lazy`:
        D103: Missing docstring in public function
torch/nn/parameter.py:188 in public method `__new__`:
        D102: Missing docstring in public method
torch/nn/parameter.py:193 in public method `__deepcopy__`:
        D105: Missing docstring in magic method
torch/nn/parameter.py:219 in public method `__new__`:
        D102: Missing docstring in public method
15
```
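To illustrate the kind of fixes these codes call for, here is a hypothetical before/after sketch (not the actual diff from this PR) of a docstring violating D205, D400, and D401, and its corrected form:

```python
def conv1d_input_before(input_size, weight, grad_output):
    """Computes the gradient of conv1d with respect to the input of the convolution
    args and shapes omitted for brevity
    """
    # D205: no blank line between summary and description
    # D400: summary line does not end with a period
    # D401: "Computes" is not imperative mood


def conv1d_input_after(input_size, weight, grad_output):
    """Compute the gradient of conv1d with respect to the input of the convolution.

    Args and shapes omitted for brevity.
    """
```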

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113052
Approved by: https://github.com/mikaylagawarecki, https://github.com/soulitzer
2023-11-10 21:19:17 +00:00
Joel Schlosser
43ea782af3 Multiprocessing support for NT (#110292)
Fixes #110161

Allows NTs to be used in DataLoaders with `num_workers > 1`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110292
Approved by: https://github.com/cpuhrsch, https://github.com/albanD
2023-10-10 21:58:19 +00:00
PyTorch MergeBot
dac895c10a Revert "Multiprocessing support for NT (#110292)"
This reverts commit f17fe89e14.

Reverted https://github.com/pytorch/pytorch/pull/110292 on behalf of https://github.com/kit1980 due to Causes CUDA memory leaks ([comment](https://github.com/pytorch/pytorch/pull/110292#issuecomment-1749852095))
2023-10-06 01:07:40 +00:00
Joel Schlosser
f17fe89e14 Multiprocessing support for NT (#110292)
Fixes #110161

Allows NTs to be used in DataLoaders with `num_workers > 1`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110292
Approved by: https://github.com/cpuhrsch, https://github.com/albanD
2023-10-05 15:04:48 +00:00
PyTorch MergeBot
7e6cf04a84 Revert "Multiprocessing support for NT (#110292)"
This reverts commit 881e7304d6.

Reverted https://github.com/pytorch/pytorch/pull/110292 on behalf of https://github.com/jbschlosser due to Address review comments ([comment](https://github.com/pytorch/pytorch/pull/110292#issuecomment-1743524901))
2023-10-02 18:27:13 +00:00
Joel Schlosser
881e7304d6 Multiprocessing support for NT (#110292)
Fixes #110161

Allows NTs to be used in DataLoaders with `num_workers > 1`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110292
Approved by: https://github.com/cpuhrsch
ghstack dependencies: #110219
2023-10-02 18:14:34 +00:00
Edward Z. Yang
3bf922a6ce Apply UFMT to low traffic torch modules (#106249)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106249
Approved by: https://github.com/Skylion007
2023-07-29 23:37:30 +00:00
Elias Ellison
5c8fea5647 Reduce overhead in CUDAGraph Trees (#98529)
Significantly reduces the overhead of constructing Tensors and Storages and of checking Storage liveness. Removes the regression for the HF models that I tested and removes 75% of the overhead of the extremely overhead-bound resnet50 training we have in torchbench (0.91x base commit, 1.02x torchinductor default, 1.16x this PR, 1.25x previous cudagraphs impl).

This PR takes care of all of the lower hanging fruit.

- Computes storage aliasing at record time instead of at runtime. We no longer need a runtime storage cache, and can instead index directly into the existing alias if there is one, or construct a new Storage

- Moves the heavyweight C++ calls into a batch - getting storage weakrefs and constructing tensors

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98529
Approved by: https://github.com/jansel, https://github.com/ngimel
2023-04-07 05:46:08 +00:00
Aaron Gokaslan
8fce9a09cd [BE]: pyupgrade Python to 3.8 - imports and object inheritance only (#94308)
Apply parts of pyupgrade to torch (starting with the safest changes).
This PR only does two things: removes the need to inherit from object and removes unused future imports.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94308
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-07 21:10:56 +00:00
Kurt Mohler
ee28b865ee Deprecate TypedStorage, its derived classes, and all of their public methods (#85303)
Part of #85302

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85303
Approved by: https://github.com/ezyang
2022-11-08 18:11:01 +00:00
Kurt Mohler
14d0296e5c Rename _Typed/_UntypedStorage to Typed/UntypedStorage and update docs (#82438)
### Description

Since the major changes for `_TypedStorage` and `_UntypedStorage` are now complete, they can be renamed to be public.

`TypedStorage._untyped()` is renamed to `TypedStorage.untyped()`.

Documentation for storages is improved as well.

### Issue
Fixes #82436

### Testing
N/A

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82438
Approved by: https://github.com/ezyang
2022-07-30 19:37:08 +00:00
Elias Ellison
1058b47562 Weak-ref-ify MetaConverter and FakeTensorConverter (#80544)
Make `MetaConverter` and `FakeTensorConverter` hold weak references to their memoized tensors, and also have `MetaConverter` hold a weak reference to Tensor storage. Otherwise it can be tricky for users to make sure all existing FakeTensors or FakeTensorModes are deleted, which would leak memory.

I ran into https://github.com/pytorch/pytorch/issues/7733 which I was able to get around with the following (see comment from code):

```
# torch.Tensors cannot be used as a key in a dictionary
# because they define a custom __eq__ function which when used
# to resolve hash collisions will throw when comparing tensors:
# "RuntimeError: bool value of Tensor with more than one value is ambiguous."
# To avoid that, we use an object which will hold a Tensor and use
# its id for both hashing and equality.
# In order to use this as a weak key reference, we cannot
# simply use weakref.WeakKeyDictionary because the newly constructed
# WeakTensorRefKey's only use would be as a dictionary key, so it would have no strong
# references.
# To get around this issue, we can use it as a normal key, and then set
# `weakref.finalize` to delete the key when its contained tensor dies.
```
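A minimal stand-alone sketch of this pattern (with hypothetical names mirroring the comment; no torch involved):

```python
import weakref


class WeakTensorRefKey:
    # Hash and compare by id() so the tensor's own __eq__ is never called,
    # avoiding the ambiguous-bool error described above.
    def __init__(self, tensor):
        self.ref = weakref.ref(tensor)
        self._id = id(tensor)

    def __hash__(self):
        return self._id

    def __eq__(self, other):
        return self._id == other._id


class FakeTensorStandIn:  # stand-in for torch.Tensor
    pass


tensor_memo = {}
t = FakeTensorStandIn()
key = WeakTensorRefKey(t)
tensor_memo[key] = "memoized meta tensor"
# Remove the memo entry when the tensor dies; neither the key nor the
# finalizer holds a strong reference to the tensor itself.
weakref.finalize(t, tensor_memo.pop, key, None)

del t  # in CPython the finalizer runs here and empties the memo
```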

While for the tensor memo we can set a `weakref.finalize` callback that will remove the corresponding `WeakTensorRefKey` from the tensor memo, at the point that this callback is invoked the tensor storage is not yet deallocated. See comment from code:

```
# [expired-storages]
# NB: even though the tensor has died,
# the deallocation of its storage can take longer,
# even when the storage has no other uses/views.
# In this case, the StorageWeakRef object will be kept alive
# longer than it needs to be, however the storage itself
# will be deallocated. We retain the possibly dead storages
# and periodically check if any of them are expired and
# can be freed.
```

partial fix for https://github.com/pytorch/torchdynamo/issues/468
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80544
Approved by: https://github.com/ezyang
2022-06-29 23:36:35 +00:00
Kurt Mohler
cecb2ad95e Restore old names for private funcs in legacy storages (#77861)
Followup from #75459

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77861
Approved by: https://github.com/ezyang
2022-05-20 02:03:34 +00:00
Kurt Mohler
aea6e2c396 Merge torch.cuda._UntypedStorage into torch._UntypedStorage (#75459)
Fixes #74933

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75459
Approved by: https://github.com/ezyang
2022-05-19 13:54:39 +00:00
Kurt Mohler
79ddc72b85 Virtualize <type>Storage classes (#66970)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/66228

cc ezyang bhosmer smessmer ljk53 bdhirsh

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66970

Reviewed By: bdhirsh

Differential Revision: D33245612

Pulled By: ezyang

fbshipit-source-id: 4c61c2cb029e2b94b0e68927c377d3e1c358dd7c
(cherry picked from commit d29fcdfb4bc2cc17b1795d4349e4b56fa0d1cf12)
2022-03-22 23:44:48 +00:00
Kurt Mohler
8e7fe87630 Rename Typed/UntypedStorage to _Typed/_UntypedStorage (#72540)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72540

Reviewed By: jbschlosser

Differential Revision: D34216823

Pulled By: bdhirsh

fbshipit-source-id: 1bc9930ab582771ebf02308e035576cd1a0dbe47
(cherry picked from commit 329238f612)
2022-02-15 23:53:01 +00:00
Kurt Mohler
5883523c1d Remove dtype from torch.Storage and use only torch.ByteStorage (#62030)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62030

Remove dtype tracking from Python Storage interface, remove all the different `<type>Storage` classes except for `ByteStorage`, and update serialization accordingly, while maintaining as much FC/BC as possible

Fixes https://github.com/pytorch/pytorch/issues/47442

* **THE SERIALIZATION FORMAT IS FULLY FC/BC.** We worked very hard to make sure this is the case. We will probably want to break FC at some point to make the serialization structure of tensors make more sense, but not today.
* There is now only a single torch.ByteStorage class. Methods like `Tensor.set_` no longer check that the dtype of storage is appropriate.
* As we no longer know what dtype of a storage is, we've **removed** the size method from Storage, replacing it with nbytes. This is to help catch otherwise silent errors where you confuse number of elements with number of bytes.
* `Storage._new_shared` takes a `nbytes` kwarg and will reject previous positional only calls.  `Storage._new_with_file` and `_set_from_file` require explicit element size arguments.
* It's no longer possible to convert storages to different types using the float/double/etc methods. Instead, do the conversion using a tensor.
* It's no longer possible to allocate a typed storage directly using FloatStorage/DoubleStorage/etc constructors. Instead, construct a tensor and extract its storage. The classes still exist but they are used purely for unpickling.
* The preexisting serialization format stores dtype with storage, and in fact this dtype is used to determine the dtype of the tensor overall.
 To accommodate this case, we introduce a new TypedStorage concept that exists only during unpickling time which is used to temporarily store the dtype so we can construct a tensor. **If you overrode the handling of pickling/unpickling, you MUST add handling for TypedStorage** or your serialization code will degrade to standard file-based serialization.

Original pull request: https://github.com/pytorch/pytorch/pull/59671

Reviewed By: soulitzer, ngimel

Differential Revision: D29466819

Pulled By: ezyang

fbshipit-source-id: 4a14e5d3c2b08e06e558683d97f7378a3180b00e
2021-10-05 13:50:34 -07:00
Vasiliy Alekseev
bac4cfd54d Fix mp serialization for integer nn.Parameter on CUDA (#56529)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/56342

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56529

Reviewed By: albanD

Differential Revision: D27896094

Pulled By: ngimel

fbshipit-source-id: fe817781eb7139ea57c78acfd56e7c11b61eb4ed
2021-04-22 16:21:04 -07:00
Robert Gmyr
93bbbeccf7 Make SharedCache thread-safe (#53750)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/53731

Make SharedCache thread-safe by using explicit locks instead of relying on the atomicity of certain Python operations.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53750

Reviewed By: malfet

Differential Revision: D27304793

Pulled By: albanD

fbshipit-source-id: 7c62babe4357bed57df3056fbda6801fb6168846
2021-03-25 06:35:03 -07:00
Guilherme Leobas
cf92b0f3a0 add type annotations to multiprocessing module (#47756)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/47757

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47756

Reviewed By: malfet

Differential Revision: D24970773

Pulled By: ezyang

fbshipit-source-id: b0b9edb9cc1057829c6320e78174c6d5f7a77477
2020-11-16 13:05:49 -08:00
David Reiss
e75fb4356b Remove (most) Python 2 support from Python code (#35615)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35615

Python 2 has reached end-of-life and is no longer supported by PyTorch.
Now we can clean up a lot of cruft that we put in place to support it.
These changes were all done manually, and I skipped anything that seemed
like it would take more than a few seconds, so I think it makes sense to
review it manually as well (though using side-by-side view and ignoring
whitespace change might be helpful).

Test Plan: CI

Differential Revision: D20842886

Pulled By: dreiss

fbshipit-source-id: 8cad4e87c45895e7ce3938a88e61157a79504aed
2020-04-22 09:23:14 -07:00
Brian Wignall
f326045b37 Fix typos, via a Levenshtein-type corrector (#31523)
Summary:
Should be non-semantic.

Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos, with https://github.com/bwignall/typochecker to help automate the checking.

Uses an updated version of the tool used in https://github.com/pytorch/pytorch/pull/30606 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31523

Differential Revision: D19216749

Pulled By: mrshenli

fbshipit-source-id: 7fd489cb9a77cd7e4950c1046f925d57524960ea
2020-01-17 16:03:19 -08:00
Brian Wignall
e7fe64f6a6 Fix typos (#30606)
Summary:
Should be non-semantic.

Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30606

Differential Revision: D18763028

Pulled By: mrshenli

fbshipit-source-id: 896515a2156d062653408852e6c04b429fc5955c
2019-12-02 20:17:42 -08:00
Richard Zou
277d442d18 Rename torch.namedtensor -> torch._namedtensor_internals (#26349)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26349

The directory holds a lot of private helper functions that help
implement named tensor functionality. Instead of naming each helper
function with a leading underscore, I change the name of the import to
`_namedtensor_internals` to signal it should not be used directly.

Test Plan: - [namedtensor ci]

Differential Revision: D17424178

Pulled By: zou3519

fbshipit-source-id: 8f7b74346765759303480e581038a661021acf53
2019-09-18 05:47:09 -07:00
Richard Zou
2513ca66ca Add guards for using named tensor with serialization and multiprocessing (#25345)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25345

Test Plan
- New tests [namedtensor ci]

Test Plan: Imported from OSS

Differential Revision: D17101486

Pulled By: zou3519

fbshipit-source-id: 58e803b042056ee6abab8551517f74078f2b81d5
2019-08-29 14:10:33 -07:00
SsnL
eb756746ab Fix possible deadlock in SharedCache inside a forked child proc (#25158)
Summary:
Related: https://github.com/pytorch/pytorch/issues/24927#issuecomment-524608021

`fork` inherits lock state, so if we are unlucky enough to fork while the `SharedCache` lock is held, we could deadlock in the child process when some code tries to acquire it.

Following pytorch multiprocessing library design, this patch resets the lock to a new object after fork. A similar example from python core lib for `multiprocessing.Queue` is :

```py
class Queue(object):
    def __init__(self, ...):
        ...
        self._after_fork()
        if sys.platform != 'win32':
            register_after_fork(self, Queue._after_fork)

    def _after_fork(self):
        debug('Queue._after_fork()')
        self._notempty = threading.Condition(threading.Lock())
        self._buffer = collections.deque()
        self._thread = None
        self._jointhread = None
        self._joincancelled = False
        self._closed = False
        self._close = None
        self._send_bytes = self._writer.send_bytes
        self._recv_bytes = self._reader.recv_bytes
        self._poll = self._reader.poll
```

d4d60134b2/Lib/multiprocessing/queues.py (L54-L78)
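The pytorch-side fix follows the same shape; a hedged sketch (class and method names assumed, not the exact patch):

```python
import threading
from multiprocessing.util import register_after_fork


class SharedCache(dict):
    """Sketch of a cache whose lock is reset in the child after fork()."""

    def __init__(self):
        super().__init__()
        self.lock = threading.Lock()
        # multiprocessing will call SharedCache._after_fork(self) in the
        # child process after every fork().
        register_after_fork(self, SharedCache._after_fork)

    @staticmethod
    def _after_fork(cache):
        # The child may have inherited a lock that was held at fork time;
        # replacing it with a fresh lock avoids the deadlock.
        cache.lock = threading.Lock()
```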
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25158

Differential Revision: D17091227

Pulled By: soumith

fbshipit-source-id: ee7130f47d7bbd42fc34a2598f1f6974d8d7cdb7
2019-08-28 13:34:03 -07:00
Tongzhou Wang
bc6281028c rebuild_storage_fd retry on EINTR (#21723)
Summary:
Some data loader tests are flaky on py 2 with the following error
```
Jun 12 22:17:31 Traceback (most recent call last):
Jun 12 22:17:31   File "test_dataloader.py", line 798, in test_iterable_dataset
Jun 12 22:17:31     fetched = sorted([d.item() for d in dataloader_iter])
Jun 12 22:17:31   File "/opt/python/2.7.9/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 697, in __next__
Jun 12 22:17:31     idx, data = self._get_data()
Jun 12 22:17:31   File "/opt/python/2.7.9/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 664, in _get_data
Jun 12 22:17:31     success, data = self._try_get_data()
Jun 12 22:17:31   File "/opt/python/2.7.9/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 617, in _try_get_data
Jun 12 22:17:31     data = self.data_queue.get(timeout=timeout)
Jun 12 22:17:31   File "/opt/python/2.7.9/lib/python2.7/multiprocessing/queues.py", line 135, in get
Jun 12 22:17:31     res = self._recv()
Jun 12 22:17:31   File "/opt/python/2.7.9/lib/python2.7/site-packages/torch/multiprocessing/queue.py", line 22, in recv
Jun 12 22:17:31     return pickle.loads(buf)
Jun 12 22:17:31   File "/opt/python/2.7.9/lib/python2.7/pickle.py", line 1382, in loads
Jun 12 22:17:31     return Unpickler(file).load()
Jun 12 22:17:31   File "/opt/python/2.7.9/lib/python2.7/pickle.py", line 858, in load
Jun 12 22:17:31     dispatch[key](self)
Jun 12 22:17:31   File "/opt/python/2.7.9/lib/python2.7/pickle.py", line 1133, in load_reduce
Jun 12 22:17:31     value = func(*args)
Jun 12 22:17:31   File "/opt/python/2.7.9/lib/python2.7/site-packages/torch/multiprocessing/reductions.py", line 274, in rebuild_storage_fd
Jun 12 22:17:31     fd = multiprocessing.reduction.rebuild_handle(df)
Jun 12 22:17:31   File "/opt/python/2.7.9/lib/python2.7/multiprocessing/reduction.py", line 157, in rebuild_handle
Jun 12 22:17:31     new_handle = recv_handle(conn)
Jun 12 22:17:31   File "/opt/python/2.7.9/lib/python2.7/multiprocessing/reduction.py", line 83, in recv_handle
Jun 12 22:17:31     return _multiprocessing.recvfd(conn.fileno())
Jun 12 22:17:31 OSError: [Errno 4] Interrupted system call
```

Apparently, Python 2.7's `recvfd` calls `recvmsg` without EINTR retry: https://github.com/python/cpython/blob/2.7/Modules/_multiprocessing/multiprocessing.c#L174
So we should call it with an outer try-catch loop.
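Such a retry wrapper might look like this (hypothetical helper name; `recv_handle` stands in for the `multiprocessing.reduction` receive call):

```python
import errno


def recv_with_retry(conn, recv_handle):
    # Retry the receive whenever the underlying syscall is interrupted
    # by a signal (EINTR), instead of propagating OSError errno 4.
    while True:
        try:
            return recv_handle(conn)
        except OSError as e:
            if e.errno != errno.EINTR:
                raise
```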
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21723

Differential Revision: D15806247

Pulled By: ezyang

fbshipit-source-id: 16cb661cc0fb418fd37353a1fef7ceeb634f02b7
2019-06-14 09:10:00 -07:00
Soumith Chintala
2e029db2f9 fixes multiprocessing serialization for integer nn.Parameter (#18639)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/17345
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18639

Differential Revision: D14711565

Pulled By: soumith

fbshipit-source-id: 0063ed138a215b95d6571dcd68b18569714abe19
2019-04-01 17:15:42 -07:00
Edward Yang
173f224570 Turn on F401: Unused import warning. (#18598)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**

This was requested by someone at Facebook; this lint is turned
on for Facebook by default.  "Sure, why not."

I had to noqa a number of imports in __init__.  Hypothetically
we're supposed to use __all__ in this case, but I was too lazy
to fix it.  Left for future work.

Be careful!  flake8-2 and flake8-3 behave differently with
respect to import resolution for # type: comments.  flake8-3 will
report an import unused; flake8-2 will not.  For now, I just
noqa'd all these sites.

All the changes were done by hand.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D14687478

fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
2019-03-30 09:01:17 -07:00
Vitaly Fedyunin
5653a914f7 Implement reference counting for shared IPC CUDA tensors (#16854)
Summary:
This is to fix #16141 and similar issues.

The idea is to track a reference to every shared CUDA Storage and deallocate memory only after a consumer process deallocates received Storage.

ezyang Done with cleanup. Same (insignificantly better) performance as the file-per-share solution, but it handles millions of shared tensors easily. Note [ ] documentation in progress.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16854

Differential Revision: D13994490

Pulled By: VitalyFedyunin

fbshipit-source-id: 565148ec3ac4fafb32d37fde0486b325bed6fbd1
2019-03-25 10:24:38 -07:00
hysts
cbefd0323b Fix typo
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17521

Differential Revision: D14237482

Pulled By: soumith

fbshipit-source-id: 636e0fbe2c667d15fcb649136a65ae64937fa0cb
2019-02-26 20:23:34 -08:00
Shen Li
24f4d3987e Move all Stream and Event Python implementation to C++ (#15937)
Summary:
1. Added `torch/csrc/cuda/Event.h` and `torch/csrc/cuda/Event.cpp` to bind Python Event class to C++ implementation.
2. Move all CUDA runtime invocations from `torch/cuda/streams.py` to C++
3. Added tests to cover Stream and Event APIs. ~(event IPC handle tests is introduced in #15974)~
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15937

Differential Revision: D13649001

Pulled By: mrshenli

fbshipit-source-id: 84ca58f35f6ba679a4ba33150ceba678d760d240
2019-01-17 07:29:22 -08:00
Ailing Zhang
be47470c91 Fix cuda multiprocessing cached memory (#14736)
Summary:
This PR fixes #11422

In the old world of CUDA IPC, when we want to share a tensor T from A to B, we have to share the whole CUDA mem allocation that T's storage sits in, and we cast it to the same type of storage as T's.

This causes a problem when two different types of storage get allocated to the same CUDA mem block: when we try to reconstruct the second tensor, it complains about the wrong storage type.

In this PR we reconstruct the storage only (not the entire mem block). However, since CUDA only allows a memHandle to be opened once per process, we have to save the device pointer in a global cache so that we can reconstruct tensors as they come.
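The global-cache idea can be sketched like this (illustrative names only; the real code lives in torch's reduction internals, and `cuda_ipc_open` stands in for `cudaIpcOpenMemHandle`):

```python
# Map each CUDA IPC handle to the base device pointer obtained the one
# time it was opened; later storages sharing the handle reuse the entry.
ipc_handle_cache = {}


def open_handle_once(handle, cuda_ipc_open):
    if handle not in ipc_handle_cache:
        # The handle may only be opened once per process.
        ipc_handle_cache[handle] = cuda_ipc_open(handle)
    return ipc_handle_cache[handle]


def rebuild_storage_ptr(handle, offset, cuda_ipc_open):
    # A real implementation would wrap (base + offset) in a Storage object.
    return open_handle_once(handle, cuda_ipc_open) + offset
```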

Thanks a ton to ezyang who helped design the solution and debugged the issue!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14736

Differential Revision: D13335899

Pulled By: ailzhang

fbshipit-source-id: cad69db392ed6f8fdc2b93a9dc2899f6d378c371
2018-12-05 10:55:43 -08:00
Edward Yang
3bfa7258b3 Don't serialize hooks (#11705)
Summary:
Fixes #11683.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11705

Differential Revision: D9833057

Pulled By: ezyang

fbshipit-source-id: 18af9bcd77b088326738d567100fbe4a4c869dd6
2018-10-16 20:11:03 -07:00
Sam Gross
0b63d12db6 Don't call into Python during Storage destruction. (#10407)
Summary:
```
This removes PyObjectFinalizer. We were seeing SIGSEGV at exit in some
programs that use multiprocessing. The backtrace pointed to
StorageRef.__del__ being called from subtype_dealloc. My guess is that
the Python interpreter was shut down before all C++ Storage objects were
deallocated. Deallocating the C++ Storage called the finalizer which
called back into Python after it was no longer safe to do so.

This avoids a callback from C++ into Python during Storage finalization.
Instead, dead Storage objects (expired weak references) are collected
periodically when shared_cache exceeds a limit. The limit is scaled with
2x the number of live references, which places an upper bound on the
amount of extra memory held by dead Storage objects. In practice, this
should be very small.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10407

Differential Revision: D9272400

Pulled By: colesbury

fbshipit-source-id: ecb14d9c6d54ffc91e134c34a4e770a4d09048a2
2018-08-13 11:20:07 -07:00
Edward Yang
674f7a9778 Correctly share CUDA Parameters. (#10220)
Summary:
```
    Correctly share CUDA Parameters, requires_grad and hooks.

    Previously, the following was true:

    - If you put a Parameter for a CUDA tensor
      in multiprocessing queue (or otherwise tried to transfer it),
      this failed, saying that we cannot pickle CUDA storage.
      This is issue #9996.

    - If you put a leaf Tensor that requires_grad=True through the
      multiprocessing queue, it would come out the other end as
      requires_grad=False (It should have come out the other end
      as requires_grad=True).  Similarly, backwards hooks were
      lost.

    - If you put a non-leaf Tensor that requires_grad=True through
      the multiprocessing queue, it would come out the other end
      as requires_grad=False.

    The root cause for the first issue was that implementation of
    reductions for Parameter used the superclass implementation
    (tensor) in __reduce_ex__, but this always picks up the
    non-ForkingPickler reduction, which doesn't work with CUDA tensors.
    So, we registered a new ForkingPickler specifically for Parameter,
    and adjusted the code to correctly rewrap a Tensor in a Parameter
    if it was originally a parameter.

    While working on this, we realized that requires_grad and backwards
    hooks would not be preserved in the ForkingPickler reduction
    implementation.  We fixed the reducer to save these parameters.
    However, Adam Paszke pointed out that we shouldn't allow sending
    requires_grad=True, non-leaf Tensors over a multiprocessing
    queue, since we don't actually support autograd over process
    boundaries.  We now throw an error in this case; this may cause
    previously working code to fail, but this is easy enough to fix;
    just detach() the tensor before sending it.  The error message says
    so.

    Fixes #9996.
```
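The rewrapping pattern can be sketched with stand-in classes (these are not the real torch types; only the `ForkingPickler.register` call is the actual multiprocessing API):

```python
import pickle
from multiprocessing.reduction import ForkingPickler


class Tensor:
    def __init__(self, data, requires_grad=False):
        self.data = data
        self.requires_grad = requires_grad


class Parameter(Tensor):
    pass


def rebuild_parameter(data, requires_grad):
    # Rewrap on the receiving side so a Parameter stays a Parameter and
    # keeps its requires_grad flag.
    return Parameter(data, requires_grad)


def reduce_parameter(param):
    return (rebuild_parameter, (param.data, param.requires_grad))


# Register with ForkingPickler so multiprocessing transfers use this
# reduction instead of the default (superclass) one.
ForkingPickler.register(Parameter, reduce_parameter)
```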
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10220

Differential Revision: D9160746

Pulled By: ezyang

fbshipit-source-id: a39c0dbc012ba5afc7a9e646da5c7f325b3cf05c
2018-08-10 13:54:56 -07:00
Edward Yang
976f9253a5 Eliminate storage views. (#9466)
Summary:
Storage views were previously used to implement CUDA IPC sharing,
but they weren't necessary.  The new strategy is described in
Note [CUDA IPC and the caching allocator].

This also fixes an unrelated bug, where we weren't actually using
the Tensor forking pickler, because we didn't register a pickler
for torch.Tensor.

Fixes #9447.  Fixes #46.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

CC apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9466

Reviewed By: apaszke

Differential Revision: D8859698

Pulled By: ezyang

fbshipit-source-id: 3362cb92f6ae4aa37084c57d79b31004bd0b4a97
2018-07-16 15:40:24 -07:00
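The idea of eliminating storage-view objects can be illustrated at the Python level with stdlib shared memory: instead of a dedicated "view" object, a sharer sends a (handle, offset, size) tuple, and the consumer reattaches the base allocation and slices it. This is an analogy under assumed semantics, not the CUDA IPC code itself:

```python
from multiprocessing import shared_memory

# Producer: one base allocation; "tensors" are (handle, offset, size)
# descriptions of regions within it, not separate view objects.
base = shared_memory.SharedMemory(create=True, size=64)
base.buf[16:20] = b"\x01\x02\x03\x04"
handle = (base.name, 16, 4)  # what would be sent over the queue

# Consumer: reattach by name and slice at the offset.
name, offset, size = handle
attached = shared_memory.SharedMemory(name=name)
view = bytes(attached.buf[offset:offset + size])

attached.close()
base.close()
base.unlink()
```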
Edward Yang
d0d1820814 Add weak pointer and finalizer support directly to THStorage. (#9148)
Summary:
The underlying use-case is the file descriptor to storage cache in
torch.multiprocessing.reductions.  Previously, this was implemented by wrapping
an existing allocator with a "weak ref" allocator which also knew to null out
the weak reference when the storage died.  This is terribly oblique, and
prevents us from refactoring the allocators to get rid of per-storage allocator
state.

So instead of going through this fiasco, we instead directly implement weak
pointers and finalizers in THStorage.  Weak pointers to THStorage retain the
THStorage struct, but not the data_ptr.  When all strong references die,
data_ptr dies and the finalizers get invoked.

There is one major hazard in this patch, which is what happens if you
repeatedly call _weak_ref on a storage.  For cleanliness, we no longer
shove our grubby fingers into the finalizer struct to see if there is already
a Python object for the weak reference and return it; we just create a new one
(no one is checking these Python objects for identity).  This means if you
keep calling it, we'll keep piling on finalizers.  That's bad! But I am
not going to fix it until it is actually a problem for someone, because
then we need to add another caching layer.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9148

Differential Revision: D8729106

Pulled By: ezyang

fbshipit-source-id: 69710ca3b7c7e05069090e1b263f8b6b9f1cf72f
2018-07-10 06:25:33 -07:00
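The weak-pointer-plus-finalizer scheme in the commit above has a direct Python analogue in `weakref.finalize`: the finalizer fires exactly when the last strong reference dies, which is the moment the fd-to-storage cache entry should be invalidated. A toy sketch (the `Storage` class is a hypothetical stand-in for THStorage):

```python
import weakref

class Storage:
    """Toy stand-in for THStorage: holds a data buffer."""
    def __init__(self, nbytes):
        self.data = bytearray(nbytes)

events = []
s = Storage(8)
# The finalizer runs when the last strong reference dies; a weak
# reference keeps observing the object without keeping it alive.
finalizer = weakref.finalize(s, events.append, "storage freed")
weak = weakref.ref(s)

del s  # last strong reference gone: finalizer fires, weak ref goes dead
```

Note that, as the commit warns, calling `weakref.finalize` repeatedly on the same object piles up independent finalizers rather than reusing one.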
Richard Zou
e831ad6204 Fix sharing of empty tensor in multiprocessing (#6229)
Fixes #5719

Previously, the following would error out with an "Invalid file
descriptor" error:
```
import torch
import torch.multiprocessing as mp

q = mp.Queue()
t = torch.tensor([])
q.put(t)
```
on some OSes.  The problem was that because one cannot mmap data of
size 0, and an empty tensor has a storage of size 0, the file
descriptor for the storage (referencing shared memory) was never
set.  The multiprocessing sharing code then called DupFD on that
uninitialized file descriptor, leading to an error.

This PR special cases sharing an empty tensor on the CPU. CUDA does not
have this problem.

Unit tests for both cpu and cuda empty tensors
2018-04-03 11:49:40 -04:00
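The special-casing of empty storages described above amounts to guarding the mmap call on size, so the sharing code never sees an uninitialized file descriptor. A minimal sketch (`map_shared` and the `None` sentinel are illustrative, not PyTorch's actual code):

```python
import mmap

def map_shared(nbytes):
    # mmap cannot map zero bytes, so special-case the empty storage
    # instead of leaving the file descriptor uninitialized.
    if nbytes == 0:
        return None  # sentinel: empty storage, nothing to share
    return mmap.mmap(-1, nbytes)  # anonymous shared mapping
```

Size-0 requests return the sentinel; nonzero requests get a writable mapping.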
Adam Lerer
e71cf20192 improved serialization (no tar copy) (#713) 2017-02-22 22:24:20 +01:00
Sam Gross
bd5303010d Refactor autograd package to separate Python dependencies. (#662)
The core autograd Variable, Function, and Engine no longer depend on
the Python API.  This lets us implement functions in C++.  In the
future, we can also multithread the engine and release the GIL for
most of the non-Python backward passes.
2017-02-13 16:00:16 -08:00