Summary:
When we compute contiguity for a tensor with dynamic shapes we first:
1) Try to compute it without guarding.
2) If all shapes hinted, compute it with potentially adding guards.
3) if any input is not hinted, compute it symbolically.
sym_is_contiguous return a SymBool that is then either evaluated or guard_or_false can be called
on it to avoid data dependent errors.
ex:
bool is_contiguous = input.sym_is_contiguous().guard_or_false(__FILE__, __LINE__);
is_contiguous_or_false is a helper function that does that.
In this PR I only handle default contiguity, will follow up with changes for other formats like channel_last .
We use this patter in this PR for several locations to avoid DDEs.
Test Plan:
contbuild & OSS CI,
Rollback Plan:
Reviewed By: malfet
Differential Revision: D77639021
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157472
Approved by: https://github.com/aorenste
When we compute contiguity for a tensor with dynamic shapes we first:
1) Try to compute it without guarding.
2) If all shapes hinted, compute it with potentially adding guards.
3) if any input is not hinted, compute it symbolically.
sym_is_contiguous return a SymBool that is then either evaluated or guard_or_false can be called
on it to avoid data dependent errors.
ex:
bool is_contiguous = input.sym_is_contiguous().guard_or_false(__FILE__, __LINE__);
is_contiguous_or_false is a helper function that does that.
In this PR I only handle default contiguity, will follow up with changes for other formats like channel_last .
We use this patter in this PR for several locations to avoid DDEs.
Differential Revision: [D77183032](https://our.internmc.facebook.com/intern/diff/D77183032)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155590
Approved by: https://github.com/ezyang
When we compute contiguity for a tensor with dynamic shapes we first:
1) Try to compute it without guarding.
2) If all shapes hinted, compute it with potentially adding guards.
3) if any input is not hinted, compute it symbolically.
sym_is_contiguous return a SymBool that is then either evaluated or guard_or_false can be called
on it to avoid data dependent errors.
ex:
bool is_contiguous = input.sym_is_contiguous().guard_or_false(__FILE__, __LINE__);
is_contiguous_or_false is a helper function that does that.
In this PR I only handle default contiguity, will follow up with changes for other formats like channel_last .
We use this patter in this PR for several locations to avoid DDEs.
Differential Revision: [D77183032](https://our.internmc.facebook.com/intern/diff/D77183032)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155590
Approved by: https://github.com/ezyang
Fixes https://github.com/pytorch/pytorch/issues/134798
In the regular Tensor case, when you call Tensor.data, there's a check
for if inference mode is active. If it is active, then we don't set the
version counter. We replicate this check for Tensor Subclasses (the bug
was we were trying to set the version counter on a FakeTensor in
inference_mode).
Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134878
Approved by: https://github.com/bdhirsh
tensorA.data = tensorB will call shallow_copy_from function to copy tensorB metadata and storage to tensorA metadata and storage. If tensorB extra_meta_ is nullptr,then tensorA extra_meta_ still keep in tensorA. This will contaminate new meta data in tensorA.
@ezyang @bdhirsh
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127616
Approved by: https://github.com/ezyang
Currently whenever the sizes or strides are modified for a `TensorImpl` we
eagerly recompute the numel and memory format flags. This is fine for static
shapes as it's all fast C++ code, but for symbolic shapes it runs slow python code.
This instead changes the `SymbolicShapeMeta` object to compute the derived
quantities lazily at the first request. This has the added benefit that we can
now pass assumptions in `empty_tensor_restride` which remove the need to compute
some contiguity flags at all.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112785
Approved by: https://github.com/ezyang
ghstack dependencies: #112689, #112890
This should be the last of the "it used to work with static shapes but
it doesn't work with dynamic shapes" hard errors. Now we will just
specialize if you hit it from C++.
The strategy here is a bit clever. We shunt the size() call to Python
binding if an error would have occurred. Importantly, we already have
logic to make sure the newly allocated ints stay live for the duration
of the ArrayRef access.
storage_offset is intentionally omitted because there are some problems
with it. I will fix them next.
This should let us get rid of the aotautograd_static test configuration.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111935
Approved by: https://github.com/zou3519
In cudagraph trees, we invalidate tensors at some point and drop their storage. Then, when they are accessed with .data_ptr(), a custom error message is thrown. Previously, this invalidation didn't also make untyped_storage()/storage() error which could result in a segfault.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109750
Approved by: https://github.com/zou3519
Unlike TORCH_CHECK, these always show C++ stacktrace on error. Put it
on errors where you frequently seem to need this information.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109373
Approved by: https://github.com/bdhirsh
ghstack dependencies: #109372
This PR introduces **-Wmissing-prototypes** of clang-tidy to prevent further coding errors such as the one fixed by PR #96714.
<!--
copilot:summary
-->
### <samp>🤖 Generated by Copilot at fd2cf2a</samp>
This pull request makes several internal functions static to improve performance and avoid name clashes. It also fixes some typos, formatting, and missing includes in various files. It adds a new .clang-tidy check to warn about missing prototypes for non-static functions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96805
Approved by: https://github.com/malfet, https://github.com/albanD
The strategy is that we will heap allocate a LargeNegativeIntSymNodeImpl whenever we have a large negative int, so that we can keep the old `is_symbolic` test (now called `is_heap_allocated`) on SymInt. Whenever we need to do something with these ints, though, we convert them back into a plain `int64_t` (and then, e.g., wrap it in whatever user specificed SymNodeImpl they need.) We cannot wrap directly in the user specified SymNodeImpl as we generally do not know what the "tracing context" is from C++. We expect large negative ints to be rare, so we don't apply optimizations like singleton-ifying INT_MIN. Here's the order to review:
* c10/core/SymInt.h and cpp
* `is_symbolic` renamed to `is_heap_allocated` as I needed to audit all use sites: the old `is_symbolic` test would return true for large negative int, but it would be wrong to then try to dispatch on the LargeNegativeIntSymNodeImpl which supports very few operations. In this file, I had to update expect_int,
* If you pass in a large negative integer, we instead heap allocate it in `promote_to_negative`. The function is written in a funny way to keep compact constructor code for SymInt (the heap allocation happens out of line)
* clone is now moved out-of-line
* New method maybe_as_int which will give you a constant int if it is possible, either because it's stored inline or in LargeNegativeIntSymNodeImpl. This is the preferred replacement for previous use of is_symbolic() and then as_int_unchecked().
* Rename toSymNodeImpl to toSymNode, which is more correct (since it returns a SymNode)
* Complete rewrite of `normalize_symints.cpp` to use new `maybe_as_int`. Cannot easily use the old code structure, so it's now done doing a macro and typing out each case manually (it's actually not that bad.)
* Reimplementations of all the unary operators by hand to use `maybe_as_int`, relatively simple.
* c10/core/LargeNegativeIntSymNodeImpl.h - Just stores a int64_t value, but it has to be big and negative. Most methods are not implemented, since we will rewrap the large negative int in the real SymNodeImpl subclass before doing operations with it
* The rest of the files are just rewriting code to use `maybe_as_int`. There is a nontrivial comment in c10/core/SymIntArrayRef.h
Very minor test adjustment in c10/test/core/SymInt_test.cpp . Plan to exercise this properly in next PR.
Companion XLA PR: https://github.com/pytorch/xla/pull/4882
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99157
Approved by: https://github.com/albanD