To codegen deferred runtime asserts, I need to be able to convert sympy expressions back into regular Python expressions that I can put in FX graphs. This PR adds some of the machinery to do this: it adds a new sympy analysis that runs operations on all FX traceable operations that can also be run with plain Python int/float/bool/etc. It's tested by symbolic tracing through the analysis, and then testing that this traced graph gives the same result as running the Python analysis directly.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113978
Approved by: https://github.com/aakhundov, https://github.com/lezcano
This PR also contains a basket of fixes that were turned up by now testing more arguments with SymInt. I fixed as many of the easy ones as I could easily get earlier in this stack and a bunch here, but there are some more annoying ones I xfailed.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113452
Approved by: https://github.com/Chillee
ghstack dependencies: #113877, #113911
`@classproperty` decorator adds another wrapper, so warning with default stacklevel (2) would always point to the wrapper implementation rather than at callee.
For example, before this change following code
```python
import torch
print(torch.FloatStorage.dtype)
```
will produce inactionable warning:
```
/Users/nshulga/git/pytorch/pytorch/torch/_utils.py:836: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
```
But after the change warning turns into:
```
/Users/nshulga/test/bar.py:2: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
print(torch.FloatStorage.dtype)
```
Discovered while reading https://github.com/pytorch/pytorch/issues/109108
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113601
Approved by: https://github.com/kit1980
`install_config_module` makes a regular module into a ConfigModule with
extra methods defined on it. mypy thinks those extra methods (or module
functions) are undefined since it cannot analyze something so
dynamic. As a workaround, I've created a fake module that defines these
extra functions, which I import into the config modules during type
checking.
As part of this change, I've also added more types to config_utils.py
and enabled typechecking for torch/_dynamo/config.py.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112130
Approved by: https://github.com/jansel
Summary:
CUDA Graph does not work well with CUPTI teardown.
1) crashes on 1st lazy CUPTI re-init after teardown (CUDA 11)
2) crashes on 2nd non-lazy CUPTI re-init after teardown (CUDA 12)
Workaround: turn off CUPTI teardown when using CUDA Graphs completely.
Test Plan: CI
Differential Revision: D50811284
Pulled By: aaronenyeshi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112507
Approved by: https://github.com/davidberard98
This PR:
- Moves TrueDiv, LShift, RShift, IsNonOverlappingAndDenseIndicator to `_sympy.functions.py`
- Moves SymNode to `fx.experimental.sym_node`.
- This file does not have any SymPy dependencies at import time
- It installs the magic methods in Sym{Bool,Int,Float}.
- N.b. With this split, we may be able to move Sym{Bool,Int,Float} to this file, and remove quite a few of the hacks around these classes
- Imports `sym_node` in `torch/__init__.py` rather than the whole `symbolic_shapes.py`.
This breaks the import-time dependency between torch and SymPy
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112037
Approved by: https://github.com/peterbell10
ghstack dependencies: #112035, #112036
This PR supports sym_ite. This is useful for converting SymBool to SymInt in e.g. #109916. Internally, it uses sympy.Piecewise. We cannot use sympy.ITE because it expects the arguments and output all to be boolean type but we want return SymInt type when converting a SymBool to SymInt. So we use sympy.Piecewise to denote the symbolic relationship.
Note that this pr uses the range analysis for sympy.Piecewise implemented in https://github.com/pytorch/pytorch/blob/main/torch/utils/_sympy/value_ranges.py.
Test Plan:
See added test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111440
Approved by: https://github.com/ezyang
This pr expose torch._higher_order_ops.cond as torch.cond.
1. Need to add #noqa: F811 to the _check calls in torch/__init__.py to address some confusing linter error "Redefinition of unused 'cond'" but only one cond is imported and for these lines that have this error, they don't define the cond but just use it as an argument.
2. Also add cond to the list that allows it to be traced through so as dynamo could trigger the CondHigherOrder logic instead of creating a TorchVariable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110293
Approved by: https://github.com/zou3519
This pr expose torch._higher_order_ops.cond as torch.cond.
1. Need to add #noqa: F811 to the _check calls in torch/__init__.py to address some confusing linter error "Redefinition of unused 'cond'" but only one cond is imported and for these lines that have this error, they don't define the cond but just use it as an argument.
2. Also add cond to the list that allows it to be traced through so as dynamo could trigger the CondHigherOrder logic instead of creating a TorchVariable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110293
Approved by: https://github.com/zou3519
Changelog:
- torch.library.impl_abstract optionally accepts a torch.library.Library
object. If passed in, then the lifetime of the registration is tied to
the Library object.
- we've also changed torch.library.impl_abstract to work on all
operators, including overloads.
- we refactored the `torch._custom_ops.*` and `torch._custom_op.*`
impl_abstract APIs and put them under torch._library. This is the
final resting place for them. I will follow-up with deleting
all the `torch._custom_ops.*` stuff later.
- There is a new "SimpleOperatorRegistry" where we actually collect the
abstract_impl. We will expand this to also hold the other
torch._custom_ops.* APIs when we move those to torch.library
NB: Previously we had designed
`impl_abstract` assuming a very high-level Python-only custom op API.
We've revisited that since; now, impl_abstract works for all custom ops,
no matter python or C++, no matter the schema. The new refactored design
reflects this better.
Test Plan:
- existing and new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109912
Approved by: https://github.com/ezyang
Bugfix:
- previously, SymBool does not implement `__eq__`, Python falls back to default `__eq__ `and `__hash__`
- in this PR, we make SymBool implement `__eq__`
- symbolic SymBool now raises an error when hashed just like SymInt/SymFloat
New feature:
- previously, SymInt and SymFloat are unhashable (even if you are singleton or constant)
- in this PR, SymInt and SymBool are hashable if singleton/constant
Stay the same:
- SymNode are hashable due to default Python behavior
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109170
Approved by: https://github.com/ezyang
ghstack dependencies: #109169
In this PR:
- When Constant SymNode are detected in unary/binary ops demote them to plain int/bool before proceeding. Sometimes this means doing a unary op with a Constant SymNode would result in a plain bool.
- Introduce an is_symbolic method, only available from Python. We need this because isinstance(x, SymInt) is no longer sufficient to check whether a given int/SymInt is symbolic or not. See later PR in the stack to see how this is used.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109169
Approved by: https://github.com/ezyang
Reland - the previous PR was reverted by internal with this error:
```
File "/data/sandcastle/boxes/eden-trunk-hg-fbcode-fbsource/buck-out/v2/gen/fbcode/363cd7e240f5d021/caffe2/torch/fb/trainer/data_modules/tests/__test_dataloader__/test_dataloader#link-tree/torch/__init__.py", line 29, in <module>
from ._utils_internal import _functionalize_sync as _sync
ImportError: cannot import name '_functionalize_sync' from 'torch._utils_internal'
```
I couldn't figure out why internal was unhappy with the import. One potential reason is that I see a build rule for *another* `_utils_internal.py` in the fb folder here ([link](https://www.internalfb.com/code/fbsource/[30ed85cd88409af98b7490be137aaa5dfd7afd01]/fbcode/caffe2/TARGETS?lines=444))
Rather than burn more time investigating, I confirmed internally that the error goes away if I move the util from `torch/_utils_internal.py` to `torch/_utils.py`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109518
Approved by: https://github.com/albanD
Added two new utils to help with turning python functionalization on in AOTAutograd (next PR):
(1) updated `torch._sync()`. Previously, this API could only handle `torch.Tensor` instances that had a `FunctionalTensorWrapper` TensorImpl. It now needs to handle python `FunctionalTensor`'s. In theory I can probably break BC and change this API (since it's private?), but I decided not to do it in this PR stack do minimize the chance of reverts. Instead of updating that API directly (which is in C++), I just added a python shim that first tries to unwrap the python `FunctionalTensor` if there is one, then calls the existing C++ logic
(2) `mirror_autograd_meta` is now a standalone API that tries to mirror the `requires_grad` and `is_leaf` autograd metadata from one tensor to another. Previously this was hardcoded into `torch._to_functional_tensor()`. But I now need to use it in a more standalone way: later in AOTAutograd when we unwrap and re-wrap a tensor subclasses, we need to manually mirror the autograd metadata from the original to the updated version of the subclass.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107917
Approved by: https://github.com/ezyang
ghstack dependencies: #106404
Or any other distro that have different purelib and platlib paths Regression was introduced, when small wheel base dependency was migrated from CUDA-11 to CUDA-12
Not sure why, but minor version of the package is no longer shipped with following CUDA-12:
- nvidia_cuda_nvrtc_cu12-12.1.105
- nvidia-cuda-cupti-cu12-12.1.105
- nvidia-cuda-cupti-cu12-12.1.105
But those were present in CUDA-11 release, i.e:
``` shell
bash-5.2# curl -OL 922c5996aa/nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl; unzip -t nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl |grep \.so
testing: nvidia/cuda_nvrtc/lib/libnvrtc-builtins.so.11.7 OK
testing: nvidia/cuda_nvrtc/lib/libnvrtc.so.11.2 OK
bash-5.2# curl -OL c64c03f49d/nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl; unzip -t nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl|grep \.so
testing: nvidia/cuda_nvrtc/lib/libnvrtc-builtins.so.12.1 OK
testing: nvidia/cuda_nvrtc/lib/libnvrtc.so.12 OK
```
Fixes https://github.com/pytorch/pytorch/issues/109221
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109244
Approved by: https://github.com/huydhn
This updates ruff to 0.285 which is faster, better, and have fixes a bunch of false negatives with regards to fstrings.
I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
This updates ruff to 0.285 which is faster, better, and have fixes a bunch of false negatives with regards to fstrings.
I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
The documentation for `torch.set_float32_matmul_precision()` mentions a datatype called "bfloat16_3x". This doesn't appear to be a very standard term, and I had a hard time figuring out what exactly it meant. I now assume it refers to [[Henry2019]](http://arxiv.org/abs/1904.06376), which describes an algorithm by which a float32 multiplication is approximated via three bfloat16 multiplications. This PR updates the documentation to include this reference and to briefly describe how this algorithm works.
Note that I just learned everything that I wrote here, so I'd appreciate if someone more expert in this topic could check to make sure that I didn't get anything significantly wrong.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107169
Approved by: https://github.com/colesbury
This pattern shows up in torchrec KeyedJaggedTensor. Most
of the change in this PR is mechanical: whenever we failed
an unbacked symint test due to just error checking, replace the
conditional with something that calls expect_true (e.g.,
torch._check or TORCH_SYM_CHECK).
Some of the changes are a bit more nuanced, I've commented on the PR
accordingly.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106788
Approved by: https://github.com/lezcano
ghstack dependencies: #106720