Commit Graph

7 Commits

Xuehai Pan
4226ed1585 [BE] Format uncategorized Python files with ruff format (#132576)
Remove patterns `**`, `test/**`, and `torch/**` in `tools/linter/adapters/pyfmt_linter.py` and run `lintrunner`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132576
Approved by: https://github.com/ezyang, https://github.com/Skylion007
ghstack dependencies: #132574
2024-08-04 17:13:31 +00:00
Yuanhao Ji
d5182bb75b Enable UFMT on test/test_cuda*.py (#124352)
Part of: #123062

Ran lintrunner on:

- test/test_cuda.py
- test/test_cuda_expandable_segments.py
- test/test_cuda_multigpu.py
- test/test_cuda_nvml_based_avail.py
- test/test_cuda_primary_ctx.py
- test/test_cuda_sanitizer.py
- test/test_cuda_trace.py

Detail:

```bash
$ lintrunner -a --take UFMT --all-files
ok No lint issues.
Successfully applied all patches.
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124352
Approved by: https://github.com/ezyang
2024-04-25 18:31:08 +00:00
shibo19
bb2fcc7659 unify TEST_CUDA (#106685)
Fixes #ISSUE_NUMBER
As the title says, unify TEST_CUDA.
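
A minimal sketch of what the unification might look like from a test file's point of view (the `torch.testing._internal.common_utils` location of the shared flag is an assumption, not spelled out in this commit message):

```python
# Hedged sketch: instead of each test file defining its own
#     TEST_CUDA = torch.cuda.is_available()
# tests import a single shared flag. Module path assumed, not stated here.
import unittest

import torch
from torch.testing._internal.common_utils import TEST_CUDA


@unittest.skipIf(not TEST_CUDA, "CUDA not available")
class TestCudaExample(unittest.TestCase):
    def test_add_on_gpu(self):
        x = torch.ones(2, device="cuda")
        self.assertEqual((x + x).sum().item(), 4.0)
```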
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106685
Approved by: https://github.com/zou3519
2023-08-10 09:01:36 +00:00
Catherine Lee
eea0733045 Reduce pytest blocklist (#96016)
`TestCase = object` or variations of it get switched to `TestCase = NoTest`.

unittest collects tests based on subclassing unittest.TestCase, so setting `TestCase = object` removes a class from unittest test collection. pytest collects based on name (https://docs.pytest.org/en/7.1.x/reference/reference.html#confval-python_classes) but can be told to ignore a class (bottom of https://docs.pytest.org/en/7.1.x/example/pythoncollection.html#changing-naming-conventions).
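
A small, hedged sketch of the mechanism described above (the class and gating condition are illustrative, not the exact PyTorch definitions):

```python
import sys
import unittest


class NoTest:
    # pytest checks a class's __test__ attribute during collection; a falsy
    # value tells it to skip the class even if the name matches Test*.
    __test__ = False


# Hypothetical gating condition, for illustration only.
if sys.platform == "win32":
    TestCase = NoTest            # neither unittest nor pytest collects these tests
else:
    TestCase = unittest.TestCase


class TestExample(TestCase):
    def test_trivial(self):
        assert 1 + 1 == 2
```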
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96016
Approved by: https://github.com/ZainRizvi, https://github.com/huydhn
2023-03-07 18:30:27 +00:00
Mateusz Sypniewski
b70c254ebb Rework printing tensor aliases in CSAN error message (#85008)
Small rework of how the error message is formatted; it introduces a distinction between the arguments and the outputs of kernels. Verified manually on multiple examples that the message is printed as expected.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85008
Approved by: https://github.com/lw
2022-09-21 13:41:52 +00:00
Mateusz Sypniewski
8e57ce63a1 Add CSAN support for CPU synchronizations (#84428)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84428
Approved by: https://github.com/ngimel, https://github.com/lw
2022-09-09 15:19:33 +00:00
Mateusz Sypniewski
2b2e0fddf8 Add CUDA Sanitizer (#83984)
Example of a simple synchronization error:
```
import torch

# `second_stream` was only implied in the original snippet; it is defined
# here so the example is self-contained.
second_stream = torch.cuda.Stream()

a = torch.rand(4, 2, device="cuda")

with torch.cuda.stream(second_stream):
    torch.mul(a, 5, out=a)
```
Output produced by CSAN:
```
============================
CSAN detected a possible data race on tensor with data pointer 139719969079296
Access by stream 94646435460352 during kernel:
aten::mul.out(Tensor self, Tensor other, *, Tensor(a!) out) -> Tensor(a!)
writing to argument: self, out, output
With stack trace:
  File "/private/home/sypniewski/pytorch/torch/cuda/_sanitizer.py", line 364, in _handle_kernel_launch
    stack_trace = traceback.StackSummary.extract(
  File "/private/home/sypniewski/pytorch/torch/cuda/_sanitizer.py", line 544, in __torch_dispatch__
    errors = self.event_handler._handle_kernel_launch(
  File "/private/home/sypniewski/pytorch/torch/utils/_python_dispatch.py", line 76, in wrapped
    return f(self, *args, **kwargs)
  File "/private/home/sypniewski/pytorch/tester.py", line 9, in <module>
    torch.mul(a, 5, out=a)

Previous access by stream 0 during kernel:
aten::rand(int[] size, *, int? dtype=None, int? layout=None, Device? device=None, bool? pin_memory=None) -> Tensor
writing to argument: output
With stack trace:
  File "/private/home/sypniewski/pytorch/torch/cuda/_sanitizer.py", line 364, in _handle_kernel_launch
    stack_trace = traceback.StackSummary.extract(
  File "/private/home/sypniewski/pytorch/torch/cuda/_sanitizer.py", line 544, in __torch_dispatch__
    errors = self.event_handler._handle_kernel_launch(
  File "/private/home/sypniewski/pytorch/torch/utils/_python_dispatch.py", line 76, in wrapped
    return f(self, *args, **kwargs)
  File "/private/home/sypniewski/pytorch/tester.py", line 6, in <module>
    a = torch.rand(10000, device="cuda")

Tensor was allocated with stack trace:
  File "/private/home/sypniewski/pytorch/torch/cuda/_sanitizer.py", line 420, in _handle_memory_allocation
    traceback.StackSummary.extract(
  File "/private/home/sypniewski/pytorch/torch/utils/_cuda_trace.py", line 23, in fire_callbacks
    cb(*args, **kwargs)
  File "/private/home/sypniewski/pytorch/torch/_ops.py", line 60, in __call__
    return self._op(*args, **kwargs or {})
  File "/private/home/sypniewski/pytorch/torch/cuda/_sanitizer.py", line 541, in __torch_dispatch__
    outputs = func(*args, **kwargs)
  File "/private/home/sypniewski/pytorch/torch/utils/_python_dispatch.py", line 76, in wrapped
    return f(self, *args, **kwargs)
  File "/private/home/sypniewski/pytorch/tester.py", line 6, in <module>
    a = torch.rand(10000, device="cuda")
```
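
For completeness, a hedged sketch of how the example above might be run with the sanitizer turned on; `enable_cuda_sanitizer()` is assumed to be the entry point in `torch.cuda._sanitizer` and is not named in this commit message:

```python
import torch
import torch.cuda._sanitizer as csan

# Assumption: enabling the sanitizer installs the __torch_dispatch__ hooks
# seen in the stack traces above, so subsequent kernels are checked.
csan.enable_cuda_sanitizer()

a = torch.rand(4, 2, device="cuda")
second_stream = torch.cuda.Stream()

with torch.cuda.stream(second_stream):
    torch.mul(a, 5, out=a)  # unsynchronized cross-stream write -> CSAN report
```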

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83984
Approved by: https://github.com/ezyang
2022-09-07 16:55:03 +00:00