This PR enables the **-Wmissing-prototypes** warning in clang-tidy to prevent further coding errors such as the one fixed by PR #96714.
### <samp>🤖 Generated by Copilot at fd2cf2a</samp>
This pull request makes several internal functions static to improve performance and avoid name clashes. It also fixes some typos, formatting, and missing includes in various files. It adds a new .clang-tidy check to warn about missing prototypes for non-static functions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96805
Approved by: https://github.com/malfet, https://github.com/albanD
Fixes for PyTorch/XLA functionalization integration
---
Some notable changes include:
- More asserts in `FunctionalTensorWrapper`, so bugs show up more cleanly in cases where we e.g. forget to wrap an output
- Make the *_scatter ops `CompositeExplicitAutogradNonFunctional`, so we get a better error message and XLA doesn't accidentally try to use them
- Fix LTC/XLA codegen in core to handle multi-tensor out= ops with no returns
- Better erroring: allow XLA to use the CPU fallback from core in a way that always errors on view ops, which XLA should no longer see.
- Update MetaConverter to exclude XLA tensors in raising NotImplemented…
- Add `_propagate_xla_data` op
- Add meta tensor support for some ops
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94537
Approved by: https://github.com/bdhirsh
Follow up from: Quansight-Labs/numpy_pytorch_interop#3
This PR adds support for NumPy scalars for `torch.asarray`.
**Before:** the scalar is treated as an object that implements the buffer protocol, so its data is interpreted as the default data type (`float32`):
```python
>>> torch.asarray(numpy.float64(0.5))
tensor([0.0000, 1.7500])
```
**After:** the NumPy scalar is identified and handled correctly, i.e. a 0-dimensional tensor is created from the NumPy scalar without sharing its memory:
```python
>>> torch.asarray(numpy.float64(0.5))
tensor(0.5000, dtype=torch.float64)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90914
Approved by: https://github.com/lezcano, https://github.com/mruberry
This PR is a copy of https://github.com/pytorch/pytorch/pull/90849 whose merge was reverted.
The PR adds a "check sparse tensor invariants" flag to Context that, when enabled, triggers sparse tensor data invariant checks in the unsafe methods for constructing sparse COO/CSR/CSC/BSR/BSC tensors. The feature includes the following UI changes:
- The `torch.sparse.check_sparse_tensor_invariants` class provides different ways to enable/disable the invariant checking.
- The `torch.sparse_coo/csr/csc/bsr/bsc/compressed_tensor` functions have a new optional argument `check_invariants` to enable/disable the invariant checks explicitly. When the `check_invariants` argument is specified, the global state of the feature is temporarily overridden.
The PR fixes https://github.com/pytorch/pytorch/issues/90833
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92094
Approved by: https://github.com/cpuhrsch
This PR adds a "check sparse tensor invariants" flag to Context that, when enabled, triggers sparse tensor data invariant checks in the unsafe methods for constructing sparse COO/CSR/CSC/BSR/BSC tensors. The feature includes the following UI changes:
- `torch.enable_check_sparse_tensor_invariants` and `torch.is_check_sparse_tensor_invariants_enabled` functions to globally enable/disable the invariant checks and to retrieve the state of the feature, respectively.
- The `torch.sparse_coo/csr/csc/bsr/bsc/compressed_tensor` functions have a new optional argument `check_invariants` to enable/disable the invariant checks explicitly. When the `check_invariants` argument is specified, the global state of the feature is temporarily overridden.
The PR also fixes https://github.com/pytorch/pytorch/issues/90833
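A minimal usage sketch of this UI, based only on the names listed above (illustrative rather than authoritative; the final API may differ in detail):
```python
import torch

i = torch.tensor([[0, 1, 1],
                  [2, 0, 2]])
v = torch.tensor([3.0, 4.0, 5.0])

# Per-call override via the new `check_invariants` argument:
t = torch.sparse_coo_tensor(i, v, (2, 3), check_invariants=True)

# Scoped enabling of the checks, e.g. using the class as a context manager:
with torch.sparse.check_sparse_tensor_invariants():
    t = torch.sparse_coo_tensor(i, v, (2, 3))
```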
# Main issue
*The following content is outdated after merging the PRs in this ghstack but kept for the record.*
The importance of this feature is that when the invariant checks are enabled by default, say, via
<details>
```
$ git diff
diff --git a/torch/__init__.py b/torch/__init__.py
index c8543057c7..19a91d0482 100644
--- a/torch/__init__.py
+++ b/torch/__init__.py
@@ -1239,3 +1239,8 @@ if 'TORCH_CUDA_SANITIZER' in os.environ:
# Populate magic methods on SymInt and SymFloat
import torch.fx.experimental.symbolic_shapes
+
+# temporarily enable sparse tensor arguments validation in unsafe
+# constructors:
+
+torch._C._set_check_sparse_tensor_invariants(True)
```
</details>
a massive number of test failures/errors occur in test_sparse_csr.py tests:
```
$ pytest -sv test/test_sparse_csr.py
<snip>
==== 4293 failed, 1557 passed, 237 skipped, 2744 errors in 69.71s (0:01:09) ====
```
This means that we are silently constructing sparse compressed tensors that do not satisfy the sparse tensor invariants. In particular, the following errors are raised:
```
AssertionError: "resize_as_sparse_compressed_tensor_: self and src must have the same layout" does not match "expected values to be a strided and contiguous tensor"
RuntimeError: CUDA error: device-side assert triggered
RuntimeError: `col_indices[..., crow_indices[..., i - 1]:crow_indices[..., i]] for all i = 1, ..., nrows are sorted and distinct along the last dimension values` is not satisfied.
RuntimeError: expected col_indices to be a strided and contiguous tensor
RuntimeError: expected row_indices to be a strided and contiguous tensor
RuntimeError: expected values to be a strided and contiguous tensor
RuntimeError: for_each: failed to synchronize: cudaErrorAssert: device-side assert triggered
RuntimeError: tensor dimensionality must be sum of batch, base, and dense dimensionalities (=0 + 2 + 0) but got 3
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90849
Approved by: https://github.com/amjames, https://github.com/cpuhrsch
NumPy versions 1.22 and 1.23 (including their respective bugfix releases) have a buggy implementation of the DLPack deleter that doesn't account for no-GIL contexts. Since we now release the GIL when deallocating tensors in `THPVariable_clear`, this leads to a failure of internal consistency checks when freeing a DLPack-backed tensor from NumPy.
This PR adds a check for the buggy NumPy versions and overrides the `DLManagedTensor` deleter to reacquire the GIL before deallocation.
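For context, a hedged sketch of the user-visible path that exercises this code (assuming NumPy's `__dlpack__` support available since 1.22; not taken verbatim from the PR's tests):
```python
import numpy as np
import torch

arr = np.arange(4, dtype=np.float64)
t = torch.from_dlpack(arr)  # tensor backed by NumPy-owned memory via DLPack
del t                       # deallocation invokes NumPy's DLPack deleter; with
                            # NumPy 1.22/1.23 that deleter assumes the GIL is
                            # held, which THPVariable_clear no longer guarantees
```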
### Rationale for this implementation
The version check was added to `tensor_numpy.h/cpp`, as it seemed like a more logical location than creating a new translation unit. Overriding the deleter was originally attempted by directly modifying `at::fromDLPack`, but the lack of a build dependency on the Python C API in ATen prevented that. So, I extended the ATen DLPack API instead to additionally accept a custom deleter functor.
Fixes #88082
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89759
Approved by: https://github.com/albanD
We define specializations for pybind11 defined templates
(in particular, PYBIND11_DECLARE_HOLDER_TYPE) and consequently
it is important that these specializations *always* be #include'd
when making use of pybind11 templates whose behavior depends on
these specializations, otherwise we can cause an ODR violation.
The easiest way to ensure that all the specializations are always
loaded is to designate a header (in this case, torch/csrc/util/pybind.h)
that ensures the specializations are defined, and then add a lint
to ensure this header is included whenever pybind11 headers are
included.
The existing grep linter didn't have enough knobs to do this
conveniently, so I added some features. I'm open to suggestions
for how to structure the features better. The main changes:
- Added an --allowlist-pattern flag, which turns off the grep lint
if some other line exists. This is used to stop the grep
lint from complaining about pybind11 includes if the util
include already exists.
- Added --match-first-only flag, which lets grep only match against
the first matching line. This is because, even if there are multiple
includes that are problematic, I only need to fix one of them.
We don't /really/ need this, but when I was running lintrunner -a
to fixup the preexisting codebase it was annoying without this,
as the lintrunner overall driver fails if there are multiple edits
on the same file.
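To make the two flags concrete, here is a hedged, self-contained sketch of their semantics (this is not the actual grep-linter implementation; the pattern and header path are taken from the description above):
```python
import re

def grep_lint(lines, pattern, allowlist_pattern=None, match_first_only=False):
    """Return 1-based line numbers that violate the lint."""
    # --allowlist-pattern: if any line matches it, suppress the lint for the
    # whole file (e.g. the util pybind header is already included).
    if allowlist_pattern and any(re.search(allowlist_pattern, l) for l in lines):
        return []
    hits = [i for i, l in enumerate(lines, 1) if re.search(pattern, l)]
    # --match-first-only: report only the first match, so `lintrunner -a`
    # produces a single edit per file.
    return hits[:1] if match_first_only else hits

src = [
    "#include <pybind11/pybind11.h>",
    "#include <pybind11/stl.h>",
]
print(grep_lint(src, r"#include <pybind11/",
                allowlist_pattern=r"torch/csrc/util/pybind\.h",
                match_first_only=True))  # -> [1]
```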
I excluded any files that didn't otherwise have a dependency on
torch/ATen, this was mostly caffe2 and the valgrind wrapper compat
bindings.
Note the grep replacement is kind of crappy, but clang-tidy lint
cleaned it up in most cases.
See also https://github.com/pybind/pybind11/issues/4099
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82552
Approved by: https://github.com/albanD
### Description
Since the major changes for `_TypedStorage` and `_UntypedStorage` are now complete, they can be renamed to be public.
`TypedStorage._untyped()` is renamed to `TypedStorage.untyped()`.
Documentation for storages is improved as well.
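A small sketch of the renamed surface (assuming the post-rename public names):
```python
import torch

t = torch.tensor([1.0, 2.0, 3.0])
s = t.storage()   # torch.TypedStorage, formerly torch._TypedStorage
u = s.untyped()   # torch.UntypedStorage, formerly obtained via s._untyped()
print(type(s).__name__, type(u).__name__)
```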
### Issue
Fixes #82436
### Testing
N/A
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82438
Approved by: https://github.com/ezyang
This PR is doing a few interrelated things, all of which are necessary to get correctness. Read the comment in torch/fx/experimental/proxy_tensor.py for the high level overview.
Let's break down the parts of this PR:
* Bug fix where `enable_torch_dispatch_mode` with `None` doesn't work. This makes `enable_torch_dispatch_mode(current_mode.inner)` work, which is the basis for how we temporarily disable fake tensor mode.
* Bug fix for when fake tensor mode is combined with a non-mode tensor subclass. This could actually be ablated from this PR, but it affects where the logic for allowing non-fake-tensor inputs with lift goes, so it's all in here in one go. There are some relevant tests for the fix in fake tensor, but it turns out I didn't need this because I'm always using proxy tensors as a mode (which ensures the ordering is right).
* New `lift_fresh` view operator. Note that like lift, we have to manually write the functionalize kernel for these functions.
* The actual change, which is to save constants when we see them in the proxy tensor mode, and then propagate them as we go (because otherwise you'll handle mutations on constants incorrectly--see test.)
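A hedged sketch of the situation the constant tracking addresses (the traced function here is illustrative; `make_fx` from proxy_tensor.py is assumed as the tracing entry point):
```python
import torch
from torch.fx.experimental.proxy_tensor import make_fx

def f(x):
    const = torch.tensor([1.0, 2.0])  # constant created during tracing (lifted)
    const.add_(x)                     # mutation of the constant must be tracked,
    return const                      # or the trace would bake in stale data

gm = make_fx(f)(torch.tensor([3.0, 4.0]))
print(gm.code)
```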
This is mildly BC-breaking if anyone was previously interposing on
at::lift, but this operator was relatively new and I checked
functorch which has no explicit reference to lift. So I think it
should not be too disruptive.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81192
Approved by: https://github.com/samdow, https://github.com/bdhirsh
## Motivation
Add `__torch_function__` override protocol support to the factory functions defined in `python_torch_functions_manual.cpp`.
## Solution
By moving the PythonArgParser logic out of `tensor_new.cpp` and adding torch function handle dispatching for these APIs in the `torch` namespace:
- `as_tensor`
- `sparse_coo_tensor`
- `_sparse_coo_tensor_unsafe`
- `sparse_csr_tensor`
- `_sparse_csr_tensor_unsafe`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75639
Approved by: https://github.com/ezyang
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73850
Previously, torch.Tensor was treated as if it were torch.FloatTensor
(where Float is whatever the default dtype was). This is not good
behavior for tensor subclasses, which inherit from torch.Tensor and
will want to super() call into it and will only notice later that
only float works as a dtype. So in this PR I relax the behavior
for this case to make the torch.Tensor constructor more useful for
subclasses.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D34707396
Pulled By: ezyang
fbshipit-source-id: a995d601007b6fcd0317d89f66ca7e08c4d6053e
(cherry picked from commit e8d0d7b3e8b17681b931cbe4f5729de2e80cf3de)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73822
I guess hypothetically the logic duplication here is a false friend,
because we could say that the constructor and the new method should
evolve their APIs independently... but nah, it's not worth it.
There are only very slight differences between the two functions:
different error messages, and the new method does extra checks
to make sure the requested types are consistent with the base
Tensor. But I need to refactor this code and I really don't want
to do the refactor twice. So dedupe first.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: anjali411
Differential Revision: D34665171
Pulled By: ezyang
fbshipit-source-id: bd40ec7f6e694bfeff4e4aaab2f4e95cea250b65
(cherry picked from commit 10a03926d8d8f36506c9a3d62cf2c380f559b00b)
Summary:
Reland of https://github.com/pytorch/pytorch/pull/72623, which was reverted; in this reland the tls cleanup has been removed.
From close inspection of how the number of available keys is counted, I think there is one more available, since the guard is actually one past the last usable key. With this updated assert, the last usable key will still be <= 63, which fits just fine.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72832
Reviewed By: H-Huang
Differential Revision: D34228571
Pulled By: albanD
fbshipit-source-id: ce5e10a841ea87386727346cfc8d9327252574c4
(cherry picked from commit 59d3b86353)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65331
ghstack-source-id: 148862595
This is a performance optimization for the use case:
```
tensor = torch.tensor(<large_data>, device='meta')
```
where the current implementation requires a superfluous memory allocation on CPU even though the target device is a meta device.
Test Plan: Run existing tests since no behavioral change is introduced.
Reviewed By: ezyang
Differential Revision: D31055036
fbshipit-source-id: 04d6c13594a71fc65bf2fbd567ee71833a879851
(cherry picked from commit 489d0a151a)