Commit Graph

382 Commits

Author SHA1 Message Date
Sahan Paliskara
936e93058b Delete torch::deploy from pytorch core (#85953)
As we have migrated torch::deploy over to https://github.com/pytorch/multipy, we can now delete it from pytorch core as ongoing development will happen there.

This PR was created due to syncing issues with https://github.com/pytorch/pytorch/pull/85443 which is where the review history can be found.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85953
Approved by: https://github.com/seemethere, https://github.com/malfet
2022-10-06 07:20:16 +00:00
Elias Ellison
9da5646cdb Add device logic handling for functions which allow scalar inputs as tensors (#86149)
Some functions allow scalars as tensor inputs. Add handling for them in the device propagation logic.

Fix for https://github.com/pytorch/torchdynamo/issues/1445
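
For illustration, a minimal sketch of the case being handled (module path and fake-mode usage assumed from torch internals, not taken from this PR):
```
import torch
from torch._subclasses.fake_tensor import FakeTensorMode

# Ops like aten.add accept a Python scalar (a wrapped 0-dim CPU tensor)
# alongside a device tensor; device propagation should follow the
# non-scalar tensor instead of erroring out.
with FakeTensorMode():
    x = torch.randn(4, device="cuda")  # fake CUDA tensor, no real GPU needed
    y = x + 1.0
    print(y.device)  # expected: cuda:0
```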
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86149
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2022-10-04 18:54:00 +00:00
Driss Guessous
cd6477617c Custom sdp implementations dense (#85984)
# Summary

- This code creates the runtime dispatch system for choosing a performant fused SDP kernel. The only fused kernel available so far is flash_attention. It also adds Python flags and a context manager that can be used to toggle dispatch behavior on and off.
- This also adds support for flash_attention with dense tensors.
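
For illustration, a hedged usage sketch (the flag and function spellings are assumed from later releases of torch.backends.cuda, not verified against this PR):
```
import torch
import torch.nn.functional as F

q = torch.randn(2, 8, 128, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 8, 128, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 8, 128, 64, device="cuda", dtype=torch.float16)

# Restrict runtime dispatch to the flash attention kernel; disabling the
# math fallback makes dispatch fail loudly if flash cannot be used.
with torch.backends.cuda.sdp_kernel(enable_flash=True, enable_math=False):
    out = F.scaled_dot_product_attention(q, k, v)
```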

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85984
Approved by: https://github.com/cpuhrsch
2022-10-03 17:36:37 +00:00
Elias Ellison
f183a989a2 Fix fake tensor kernel nesting (#85920)
If you e.g. printed within a decomp (which would call `in_kernel_invocation_manager`), then on exit from the manager it would unilaterally remove Meta from the TLS and set the tensor to return its real device. We should instead restore whatever the existing state was.
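
An illustrative sketch of the fix (names hypothetical, not the actual implementation):
```
from contextlib import contextmanager

@contextmanager
def in_kernel_invocation_manager(fake_mode):
    # Save whatever the existing state is on entry...
    prev = fake_mode.in_kernel_invocation
    fake_mode.in_kernel_invocation = True
    try:
        yield
    finally:
        # ...and restore it on exit, rather than unconditionally clearing
        # it, so a nested entry (e.g. a print inside a decomp) does not
        # clobber the outer invocation's state.
        fake_mode.in_kernel_invocation = prev
```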

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85920
Approved by: https://github.com/ezyang, https://github.com/bdhirsh, https://github.com/huydhn
2022-10-02 04:19:40 +00:00
PyTorch MergeBot
b562987c28 Revert "Fix fake tensor kernel nesting (#85920)"
This reverts commit c2d9ea7f4b.

Reverted https://github.com/pytorch/pytorch/pull/85920 on behalf of https://github.com/huydhn due to Sorry for reverting your PR but I suspect that it causes a flaky memory leak issue in TestFakeTensorCUDA.test_fake_crossref_backward_amp_linalg_lstsq_cuda_float32
2022-10-01 19:30:21 +00:00
Elias Ellison
c2d9ea7f4b Fix fake tensor kernel nesting (#85920)
If you e.g. printed within a decomp (which would call `in_kernel_invocation_manager`), then on exit from the manager it would unilaterally remove Meta from the TLS and set the tensor to return its real device. We should instead restore whatever the existing state was.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85920
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2022-09-30 23:11:20 +00:00
soulitzer
a876432aea Expose torch._will_engine_execute_node (#84773)
Addresses: https://github.com/pytorch/pytorch/issues/83617

This PR adds a way to query the TLS graph task's exec_info, a map from Node to a bool indicating whether that node will be executed in the current backward pass (as determined by the inputs= argument of .grad or .backward).
- this works with both custom Function nodes and normal codegened nodes
- to be able to verify whether the pyobject passed is an actual node, we now store pointers to PyTypeObjects into a set on registration
- error out when .backward is called without inputs=, to avoid silently returning True

Alternatives:
- not sure if it is possible to bind to Python from a raw pointer to Node. At least we wouldn't be able to use existing logic, and the Python object should only hold a weak reference to the Node.
- other solutions to the motivating issue seem to require more extensive modification to the engine

See the issue linked for an example of usage
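A minimal usage sketch based on the description above (the linked issue has the canonical example):
```
import torch

a = torch.tensor(2.0, requires_grad=True)
b = a * 3
c = b.sin()

def hook(grad):
    # Only valid while a backward pass is executing.
    print(torch._will_engine_execute_node(b.grad_fn))  # True: on the path to `a`

c.register_hook(hook)
c.backward(inputs=(a,))  # inputs= must be passed, per the note above
```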
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84773
Approved by: https://github.com/albanD
2022-09-28 20:13:52 +00:00
Sherlock Huang
01dbbeeeb5 Expose cpp_backtrace to python binding (#84896)
We can now get the C++ stack trace by calling `torch.utils.get_cpp_backtrace()`.
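
For example (a sketch; default arguments assumed):
```
import torch

# Returns the current C++ stack trace of the calling thread as a string;
# readable frame names require a build where symbols are available.
print(torch.utils.get_cpp_backtrace())
```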

Sample output when calling from a torch_dispatch stack:
```
<omitting python frames>
frame #23: torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<pybind11::handle>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName) (0x7f69330bab90 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/utils/python_arg_parser.cpp:323)
frame #24: <unknown function> (0x7f6932a09e79 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/python_variable.cpp:2252)
frame #25: <unknown function> (0x7f69261aee33 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/PythonFallbackKernel.cpp:56)
frame #26: <unknown function> (0x7f69261afef9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:19)
frame #27: c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadced in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:41)
frame #28: <unknown function> (0x7f6926fae9b9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/boxing.h:227)
frame #29: at::Tensor c10::Dispatcher::redispatch<at::Tensor, at::Tensor const&>(c10::TypedOperatorHandle<at::Tensor (at::Tensor const&)> const&, c10::DispatchKeySet, at::Tensor const&) const (0x7f6926e821f5 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:106)
frame #30: at::_ops::alias::redispatch(c10::DispatchKeySet, at::Tensor const&) (0x7f6927142c31 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:438)
frame #31: <unknown function> (0x7f692ae4f8be in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/ADInplaceOrViewType_1.cpp:1361)
frame #32: <unknown function> (0x7f692ae4f9b1 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/ADInplaceOrViewType_1.cpp:1362)
frame #33: <unknown function> (0x7f692aef77e9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13)
frame #34: <unknown function> (0x7f6926fae7d8 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:50)
frame #35: at::Tensor c10::Dispatcher::redispatch<at::Tensor, at::Tensor const&>(c10::TypedOperatorHandle<at::Tensor (at::Tensor const&)> const&, c10::DispatchKeySet, at::Tensor const&) const (0x7f6926e821c9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:97)
frame #36: at::_ops::alias::redispatch(c10::DispatchKeySet, at::Tensor const&) (0x7f6927142c31 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:438)
frame #37: <unknown function> (0x7f6929ec654a in /fsx/users/bahuang/repos/pytorch_fsx/build/aten/src/ATen/RedispatchFunctions.h:10697)
frame #38: <unknown function> (0x7f6929d9edae in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/VariableType_1.cpp:2837)
frame #39: <unknown function> (0x7f6929d9f043 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/VariableType_1.cpp:2838)
frame #40: <unknown function> (0x7f6929e7d2f9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13)
frame #41: <unknown function> (0x7f6929eb1344 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:478)
frame #42: <unknown function> (0x7f6929ea7b99 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:490)
frame #43: <unknown function> (0x7f6929e7d370 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:563)
frame #44: <unknown function> (0x7f6929e7d43a in /fsx/users/bahuang/repos/pytorch_fsx/c10/util/C++17.h:239)
frame #45: <unknown function> (0x7f6929e7d48c in /fsx/users/bahuang/repos/pytorch_fsx/c10/util/C++17.h:364)
frame #46: <unknown function> (0x7f6929e7d50a in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:554)
frame #47: c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadced in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:41)
frame #48: c10::KernelFunction::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadd26 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:43)
frame #49: c10::Dispatcher::redispatchBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f692603890a in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:652)
frame #50: <unknown function> (0x7f69260387f9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:388)
frame #51: <unknown function> (0x7f69261af0ef in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/PythonFallbackKernel.cpp:96)
frame #52: <unknown function> (0x7f69261aff2b in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:25)
frame #53: c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadced in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:41)
frame #54: c10::KernelFunction::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadd26 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:43)
frame #55: c10::Dispatcher::callBoxed(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6925fd6ab2 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:628)
frame #56: <unknown function> (0x7f6925fd6690 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:376)
frame #57: <unknown function> (0x7f692bf5b525 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:380)
frame #58: <unknown function> (0x7f692bf59fac in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/runtime/register_c10_ops.cpp:15)
frame #59: <unknown function> (0x7f692bf5af41 in /usr/include/c++/7/bits/std_function.h:316)
frame #60: std::function<void (std::vector<c10::IValue, std::allocator<c10::IValue> >&)>::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >&) const (0x7f6932ab9a0f in /usr/include/c++/7/bits/std_function.h:706)
frame #61: <unknown function> (0x7f6932aad541 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/stack.h:41)
frame #62: <unknown function> (0x7f6932ab3102 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/python/pybind_utils.h:1206 (discriminator 1))
frame #63: <unknown function> (0x7f6932ab3943 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/python/pybind_utils.h:1272)
frame #64: <unknown function> (0x7f6932a46120 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/python/init.cpp:1767)
frame #65: <unknown function> (0x7f6932a997be in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/cast.h:1441)
frame #66: <unknown function> (0x7f6932a8a985 in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/cast.h:1410)
frame #67: <unknown function> (0x7f6932a66e1e in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/pybind11.h:249)
frame #68: <unknown function> (0x7f6932a66ec2 in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/pybind11.h:224)
frame #69: <unknown function> (0x7f6932473111 in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/pybind11.h:929)
frame #104: __libc_start_main (0x7f693485dc87 in /build/glibc-uZu3wS/glibc-2.27/csu/../csu/libc-start.c:310)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84896
Approved by: https://github.com/ezyang
2022-09-27 14:59:08 +00:00
Elias Ellison
bcc544e9d7 Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417
Approved by: https://github.com/ezyang
2022-09-26 17:08:14 +00:00
PyTorch MergeBot
d10de31cc8 Revert "Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)"
This reverts commit 78afa0cf0c.

Reverted https://github.com/pytorch/pytorch/pull/85417 on behalf of https://github.com/clee2000 due to broke tests on trunk 78afa0cf0c
2022-09-23 17:21:43 +00:00
Elias Ellison
78afa0cf0c Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417
Approved by: https://github.com/ezyang
2022-09-23 15:50:03 +00:00
Richard Zou
5e5c319549 Move functorch python bindings to torch/csrc (#85426)
This moves functorch's python bindings to torch/csrc/functorch/init.cpp.
Coming next is the torchdim move. I didn't do torchdim yet because
moving functorch's python bindings unblocks some other things that I
want to do first.

Test Plan:
- tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85426
Approved by: https://github.com/ezyang
2022-09-22 18:47:12 +00:00
PyTorch MergeBot
5043457a8e Revert "Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)"
This reverts commit 9c77083965.

Reverted https://github.com/pytorch/pytorch/pull/85417 on behalf of https://github.com/clee2000 due to broke tests on trunk (and pull somehow) 9c77083965
2022-09-22 15:44:38 +00:00
Elias Ellison
9c77083965 Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417
Approved by: https://github.com/ezyang
2022-09-22 13:03:57 +00:00
PyTorch MergeBot
3122a96ee4 Revert "Improve and expose cpp_backtrace to python binding (#84896)"
This reverts commit 73fbca1ea6.

Reverted https://github.com/pytorch/pytorch/pull/84896 on behalf of https://github.com/kit1980 due to Broke libtorch and linux-binary-manywheel - 73fbca1ea6
2022-09-21 03:13:20 +00:00
Sherlock Huang
73fbca1ea6 Improve and expose cpp_backtrace to python binding (#84896)
We can now get the C++ stack trace by calling `torch.utils.get_cpp_backtrace()`.

Sample output when calling from a torch_dispatch stack:
```
<omitting python frames>
frame #23: torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<pybind11::handle>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName) (0x7f69330bab90 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/utils/python_arg_parser.cpp:323)
frame #24: <unknown function> (0x7f6932a09e79 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/python_variable.cpp:2252)
frame #25: <unknown function> (0x7f69261aee33 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/PythonFallbackKernel.cpp:56)
frame #26: <unknown function> (0x7f69261afef9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:19)
frame #27: c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadced in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:41)
frame #28: <unknown function> (0x7f6926fae9b9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/boxing.h:227)
frame #29: at::Tensor c10::Dispatcher::redispatch<at::Tensor, at::Tensor const&>(c10::TypedOperatorHandle<at::Tensor (at::Tensor const&)> const&, c10::DispatchKeySet, at::Tensor const&) const (0x7f6926e821f5 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:106)
frame #30: at::_ops::alias::redispatch(c10::DispatchKeySet, at::Tensor const&) (0x7f6927142c31 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:438)
frame #31: <unknown function> (0x7f692ae4f8be in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/ADInplaceOrViewType_1.cpp:1361)
frame #32: <unknown function> (0x7f692ae4f9b1 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/ADInplaceOrViewType_1.cpp:1362)
frame #33: <unknown function> (0x7f692aef77e9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13)
frame #34: <unknown function> (0x7f6926fae7d8 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:50)
frame #35: at::Tensor c10::Dispatcher::redispatch<at::Tensor, at::Tensor const&>(c10::TypedOperatorHandle<at::Tensor (at::Tensor const&)> const&, c10::DispatchKeySet, at::Tensor const&) const (0x7f6926e821c9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:97)
frame #36: at::_ops::alias::redispatch(c10::DispatchKeySet, at::Tensor const&) (0x7f6927142c31 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:438)
frame #37: <unknown function> (0x7f6929ec654a in /fsx/users/bahuang/repos/pytorch_fsx/build/aten/src/ATen/RedispatchFunctions.h:10697)
frame #38: <unknown function> (0x7f6929d9edae in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/VariableType_1.cpp:2837)
frame #39: <unknown function> (0x7f6929d9f043 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/autograd/generated/VariableType_1.cpp:2838)
frame #40: <unknown function> (0x7f6929e7d2f9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/WrapFunctionIntoFunctor.h:13)
frame #41: <unknown function> (0x7f6929eb1344 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:478)
frame #42: <unknown function> (0x7f6929ea7b99 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:490)
frame #43: <unknown function> (0x7f6929e7d370 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:563)
frame #44: <unknown function> (0x7f6929e7d43a in /fsx/users/bahuang/repos/pytorch_fsx/c10/util/C++17.h:239)
frame #45: <unknown function> (0x7f6929e7d48c in /fsx/users/bahuang/repos/pytorch_fsx/c10/util/C++17.h:364)
frame #46: <unknown function> (0x7f6929e7d50a in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:554)
frame #47: c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadced in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:41)
frame #48: c10::KernelFunction::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadd26 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:43)
frame #49: c10::Dispatcher::redispatchBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f692603890a in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:652)
frame #50: <unknown function> (0x7f69260387f9 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:388)
frame #51: <unknown function> (0x7f69261af0ef in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/PythonFallbackKernel.cpp:96)
frame #52: <unknown function> (0x7f69261aff2b in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:25)
frame #53: c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadced in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/BoxedKernel_impl.h:41)
frame #54: c10::KernelFunction::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6932aadd26 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/boxing/KernelFunction_impl.h:43)
frame #55: c10::Dispatcher::callBoxed(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const (0x7f6925fd6ab2 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:628)
frame #56: <unknown function> (0x7f6925fd6690 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:376)
frame #57: <unknown function> (0x7f692bf5b525 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/dispatch/Dispatcher.h:380)
frame #58: <unknown function> (0x7f692bf59fac in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/runtime/register_c10_ops.cpp:15)
frame #59: <unknown function> (0x7f692bf5af41 in /usr/include/c++/7/bits/std_function.h:316)
frame #60: std::function<void (std::vector<c10::IValue, std::allocator<c10::IValue> >&)>::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >&) const (0x7f6932ab9a0f in /usr/include/c++/7/bits/std_function.h:706)
frame #61: <unknown function> (0x7f6932aad541 in /fsx/users/bahuang/repos/pytorch_fsx/aten/src/ATen/core/stack.h:41)
frame #62: <unknown function> (0x7f6932ab3102 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/python/pybind_utils.h:1206 (discriminator 1))
frame #63: <unknown function> (0x7f6932ab3943 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/python/pybind_utils.h:1272)
frame #64: <unknown function> (0x7f6932a46120 in /fsx/users/bahuang/repos/pytorch_fsx/torch/csrc/jit/python/init.cpp:1767)
frame #65: <unknown function> (0x7f6932a997be in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/cast.h:1441)
frame #66: <unknown function> (0x7f6932a8a985 in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/cast.h:1410)
frame #67: <unknown function> (0x7f6932a66e1e in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/pybind11.h:249)
frame #68: <unknown function> (0x7f6932a66ec2 in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/pybind11.h:224)
frame #69: <unknown function> (0x7f6932473111 in /fsx/users/bahuang/repos/pytorch_fsx/third_party/pybind11/include/pybind11/pybind11.h:929)
frame #104: __libc_start_main (0x7f693485dc87 in /build/glibc-uZu3wS/glibc-2.27/csu/../csu/libc-start.c:310)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84896
Approved by: https://github.com/ezyang
2022-09-21 01:32:33 +00:00
Aidyn-A
1456cca1fc Fix exception handling, improve overheads and avoid constructing storage for element size (#84612)
These changes were proposed by @MatthiasKohl in #84271 and #84542, which fix #84267 and #84056 respectively.
The reason I am creating this pull request is the CLA check (see the original PRs).

cc @ptrblck @malfet
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84612
Approved by: https://github.com/ngimel
2022-09-19 20:21:46 +00:00
soulitzer
7f88934a8f [reland 2] Call jit decomp in VariableType to improve forward AD coverage (#84976)
Reland of https://github.com/pytorch/pytorch/pull/84675
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84976
Approved by: https://github.com/zou3519
2022-09-15 22:46:19 +00:00
PyTorch MergeBot
36d79143ce Revert "[reland] Call jit decomposition in VariableType to increase forward AD coverage (#84151) (#84675)"
This reverts commit bb4e96c964.

Reverted https://github.com/pytorch/pytorch/pull/84675 on behalf of https://github.com/osalpekar due to causing asan xplat link-time errors like ld.lld: error: undefined symbol: torch::jit::has_jit_decomposition(c10::FunctionSchema const&)
2022-09-13 22:54:54 +00:00
soulitzer
bb4e96c964 [reland] Call jit decomposition in VariableType to increase forward AD coverage (#84151) (#84675)
This reverts commit acb4a09628.

In addition, we fix a memory leak in layer norm.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84675
Approved by: https://github.com/zou3519
2022-09-12 20:33:14 +00:00
Mikayla Gawarecki
e217b30b0f Add torch.nested namespace (#84102)
First step towards #83775
- only `to_padded_tensor` is moved to the nested namespace for now
- following the schema used for `special`, `fft`, `linalg` and other namespaces, nested functions are registered in native_functions.yaml as `nested_{function_name}` and are bound to the desired Python name in
`torch/nested/__init__.py`, and the desired C++ name in `torch/csrc/api/include/torch/nested.h`.

~~**Question**: should we keep the documentation for `Tensor.to_padded_tensor`, or can it be deleted since it is shared by `torch.nested.to_padded_tensor`?~~

[generated nested docs](https://docs-preview.pytorch.org/84102/nested.html?highlight=nested#module-torch.nested)
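
For illustration, a hedged sketch of the namespace in use (the nested_tensor constructor spelling is assumed from later releases; this PR only moves to_padded_tensor):
```
import torch

# Build a nested tensor from variably-sized components, then pad it out
# into a regular dense tensor through the new namespace.
nt = torch.nested.nested_tensor([torch.randn(2, 3), torch.randn(4, 3)])
padded = torch.nested.to_padded_tensor(nt, padding=0.0)
print(padded.shape)  # torch.Size([2, 4, 3])
```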

Differential Revision: [D39361148](https://our.internmc.facebook.com/intern/diff/D39361148)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84102
Approved by: https://github.com/drisspg
2022-09-12 16:31:05 +00:00
Taylor Robie
1fa9a377d0 [Profiler] Start moving python bindings out of autograd (#82584)
A lot of profiler code still lives in autograd for historical reasons. However, as we formalize and clean up profiler internals, it makes sense to pull more and more into the profiler folders/namespace. For now I'm just moving some of the core config data structures and those related to `torch::profiler::impl::Result` to keep the scope manageable.

Differential Revision: [D37961462](https://our.internmc.facebook.com/intern/diff/D37961462/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D37961462/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82584
Approved by: https://github.com/albanD, https://github.com/Gamrix
2022-08-19 17:15:18 +00:00
Edward Z. Yang
df69660832 Revert "Revert "Add a lint rule for torch/csrc/util/pybind.h include (#82552)"" (#82599)
This reverts commit 532b8a9e00.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82599
Approved by: https://github.com/albanD
2022-08-02 19:37:02 +00:00
Elias Ellison
642aed8b99 Add Autocast Support for FakeTensors / use fake device dispatch keys (#82449)
From PR:
```
Note: [Fake Tensor Dispatch Keys]
In order to model the behavior of device-specific autocast
and autograd logic, we update the dispatch keys of FakeTensors
to reflect their fake device. This includes the BackendComponent
(DispatchKey::Meta -> DispatchKey::CUDA), and also the BackendComponent
related Autocast and Autograd keys. __torch_dispatch__ sits below
Autocast and Autograd, and is only invoked when we are at the
kernel for the BackendComponent. Then, we add Meta to the
thread-local dispatch include set to hit the meta kernel
instead of the kernel of the BackendComponent for the fake device.
```

Also adds the `conv1/2/3d.padding` operators to the Autocast rule set. Without that fix, the FakeTensor dtype would diverge.

See: https://github.com/pytorch/pytorch/issues/81608
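
A hedged sketch of what this enables (module path assumed from torch internals):
```
import torch
from torch._subclasses.fake_tensor import FakeTensorMode

# Because fake tensors now carry CUDA dispatch keys, device-specific
# autocast rules apply even with no real GPU present.
with FakeTensorMode():
    x = torch.randn(8, 8, device="cuda")
    w = torch.randn(8, 8, device="cuda")
    with torch.autocast("cuda"):
        y = x @ w
    print(y.dtype)  # expected: torch.float16 under CUDA autocast
```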

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82449
Approved by: https://github.com/ezyang
2022-08-01 21:40:36 +00:00
PyTorch MergeBot
532b8a9e00 Revert "Add a lint rule for torch/csrc/util/pybind.h include (#82552)"
This reverts commit 9465c0e0b5.

Reverted https://github.com/pytorch/pytorch/pull/82552 on behalf of https://github.com/zengk95 due to This seems to be breaking windows binary wheels
2022-08-01 20:25:35 +00:00
Edward Z. Yang
9465c0e0b5 Add a lint rule for torch/csrc/util/pybind.h include (#82552)
We define specializations for pybind11 defined templates
(in particular, PYBIND11_DECLARE_HOLDER_TYPE) and consequently
it is important that these specializations *always* be #include'd
when making use of pybind11 templates whose behavior depends on
these specializations, otherwise we can cause an ODR violation.

The easiest way to ensure that all the specializations are always
loaded is to designate a header (in this case, torch/csrc/util/pybind.h)
that ensures the specializations are defined, and then add a lint
to ensure this header is included whenever pybind11 headers are
included.

The existing grep linter didn't have enough knobs to do this
conveniently, so I added some features.  I'm open to suggestions
for how to structure the features better.  The main changes:

- Added an --allowlist-pattern flag, which turns off the grep lint
  if some other line exists.  This is used to stop the grep
  lint from complaining about pybind11 includes if the util
  include already exists.

- Added --match-first-only flag, which lets grep only match against
  the first matching line.  This is because, even if there are multiple
  includes that are problematic, I only need to fix one of them.
  We don't /really/ need this, but when I was running lintrunner -a
  to fixup the preexisting codebase it was annoying without this,
  as the lintrunner overall driver fails if there are multiple edits
  on the same file.

I excluded any files that didn't otherwise have a dependency on
torch/ATen, this was mostly caffe2 and the valgrind wrapper compat
bindings.

Note the grep replacement is kind of crappy, but clang-tidy lint
cleaned it up in most cases.

See also https://github.com/pybind/pybind11/issues/4099

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82552
Approved by: https://github.com/albanD
2022-08-01 17:16:58 +00:00
Jing Xu
0e95746580 [RFC] enable oneMKL&oneDNN on-demand verbose functionality (#63212)
**RFC: Problem statement**
Intel oneMKL and oneDNN are used to accelerate performance on Intel platforms. Both libraries provide verbose functionality to dump detailed operator execution information as well as execution time. These verbose messages are very helpful for performance profiling. However, the verbose functionality covers the entire execution, while in many scenarios we would like to profile only part of the execution process. This feature exposes PyTorch API functions to control the oneDNN and oneMKL verbose functionality at runtime.

**Additional context**  
The most common performance profiling steps are shown in the following code snippet:

```
def inference(model, inputs):
    # step0 (optional): jit
    model = torch.jit.trace(model, inputs)

    # step1: warmup
    for _ in range(100):
        model(inputs)

    # step2: performance profiling. We only care the profiling result, as well as oneDNN and oneMKL verbose messages, of this step
    model(inputs)

    # step3 (optional): benchmarking
    t0 = time.time()
    for _ in range(100):
        model(inputs)
    t1 = time.time()
    print('dur: {}'.format((t1-t0)/100))
    return model(inputs)
```

Since the environment variables MKL_VERBOSE and DNNL_VERBOSE affect the entire process, we would get a great number of verbose messages for all 101 iterations (if step3 is not involved). However, we only care about the verbose messages dumped in step2. It is very difficult to filter the unnecessary verbose messages out in a complicated usage scenario. Also, jit trace brings in more undesired verbose messages.

Furthermore, there are more complicated topologies and usages, like the cascaded topologies below:

```
model1 = Model1()
model2 = Model2()
model3 = Model3()
x1 = inference(model1, x)
x2 = inference(model2, x1)
y = inference(model3, x2)
```

In many cases it is very hard to split these child topologies out, so it is not possible to investigate the performance of each individual topology with `DNNL_VERBOSE` and `MKL_VERBOSE`.

To solve this issue, oneDNN and oneMKL provide API functions to control the verbose functionality at runtime.
```
int mkl_verbose (int enable)
status dnnl::set_verbose(int level)
```

oneDNN and oneMKL print verbose messages to stdout when oneMKL or oneDNN ops are executed.
Sample verbose messages:
```
MKL_VERBOSE SGEMM(t,n,768,2048,3072,0x7fff64115800,0x7fa1aca58040,3072,0x1041f5c0,3072,0x7fff64115820,0x981f0c0,768) 8.52ms CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:44
dnnl_verbose,exec,cpu,inner_product,brgemm:avx512_core,forward_training,src_f32::blocked:ab:f0 wei_f32::blocked:AB16b64a:f0 bia_f32::blocked:a:f0 dst_f32::blocked:ab:f0,,,mb16ic768oc768,0.0839844
```

**Design and implementation** 
The design is to add Python-facing wrapper functions that invoke the mkl_verbose and dnnl::set_verbose functions.

**Design concern**  

- Need to add wrapper C++ functions for mkl_verbose and dnnl::set_verbose in torch/csrc and aten/src.
- Python API functions will be added to device-specific backends
  - with torch.backends.mkl.verbose(1):
  - with torch.backends.mkldnn.verbose(1):

**Use cases**  
```
def inference(model, inputs):
    # step0 (optional): jit
    model = torch.jit.trace(model, inputs)

    # step1: warmup
    for _ in range(100):
        model(inputs)

    # step2: performance profiling
    with torch.backends.mkl.verbose(1), torch.backends.mkldnn.verbose(1):
        model(inputs)

    # step3 (optional): benchmarking
    t0 = time.time()
    for _ in range(100):
        model(inputs)
    t1 = time.time()
    print('dur: {}'.format((t1-t0)/100))
    return model(inputs)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63212
Approved by: https://github.com/VitalyFedyunin, https://github.com/malfet
2022-07-27 23:29:35 +00:00
Jing Xu
3c7044728b Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289)
A more detailed description of the benefits can be found at #41001. This is Intel's counterpart of NVIDIA's NVTX (https://pytorch.org/docs/stable/autograd.html#torch.autograd.profiler.emit_nvtx).

ITT is functionality for labeling trace data during application execution across different Intel tools.
For integrating Intel(R) VTune Profiler into Kineto, ITT needs to be integrated into PyTorch first. It works with both the standalone VTune Profiler (https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html) and, in the future, Kineto-integrated VTune functionality.
It works for both Intel CPU and Intel XPU devices.

Pitch
Add VTune Profiler's ITT API function calls to annotate PyTorch ops, as well as developer-customized code scopes on CPU, analogous to NVTX for NVIDIA GPUs.

This PR rebases the code changes at https://github.com/pytorch/pytorch/pull/61335 to the latest master branch.

Usage example:
```
with torch.autograd.profiler.emit_itt():
    for i in range(10):
        torch.itt.range_push('step_{}'.format(i))
        model(input)
        torch.itt.range_pop()
```

cc @ilia-cher @robieta @chaekit @gdankel @bitfort @ngimel @orionr @nbcsm @guotuofeng @guyang3532 @gaoteng-git
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63289
Approved by: https://github.com/malfet
2022-07-13 13:50:15 +00:00
PyTorch MergeBot
1454515253 Revert "Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289)"
This reverts commit f988aa2b3f.

Reverted https://github.com/pytorch/pytorch/pull/63289 on behalf of https://github.com/malfet due to broke trunk, see f988aa2b3f
2022-06-30 12:49:41 +00:00
Jing Xu
f988aa2b3f Enable Intel® VTune™ Profiler's Instrumentation and Tracing Technology APIs (ITT) to PyTorch (#63289)
A more detailed description of the benefits can be found at #41001. This is Intel's counterpart of NVIDIA's NVTX (https://pytorch.org/docs/stable/autograd.html#torch.autograd.profiler.emit_nvtx).

ITT is functionality for labeling trace data during application execution across different Intel tools.
For integrating Intel(R) VTune Profiler into Kineto, ITT needs to be integrated into PyTorch first. It works with both the standalone VTune Profiler (https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html) and, in the future, Kineto-integrated VTune functionality.
It works for both Intel CPU and Intel XPU devices.

Pitch
Add VTune Profiler's ITT API function calls to annotate PyTorch ops, as well as developer-customized code scopes on CPU, analogous to NVTX for NVIDIA GPUs.

This PR rebases the code changes at https://github.com/pytorch/pytorch/pull/61335 to the latest master branch.

Usage example:
```
with torch.autograd.profiler.emit_itt():
    for i in range(10):
        torch.itt.range_push('step_{}'.format(i))
        model(input)
        torch.itt.range_pop()
```

cc @ilia-cher @robieta @chaekit @gdankel @bitfort @ngimel @orionr @nbcsm @guotuofeng @guyang3532 @gaoteng-git
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63289
Approved by: https://github.com/malfet
2022-06-30 05:14:03 +00:00
Michael Suo
30fb2c4aba [lint] autoformat test/cpp and torch/csrc
Let's have some fun.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78828

Approved by: https://github.com/ezyang
2022-06-11 21:11:16 +00:00
anjali411
38350acf8f Autogen Tags enum, and allow specifying tags while defining an op
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79322

Approved by: https://github.com/albanD
2022-06-11 00:29:32 +00:00
Elias Ellison
3c5a3ca9e8 Make FakeTensors return meta within kernel invocation, add FakeTensor op tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78972

Approved by: https://github.com/ezyang
2022-06-09 01:39:27 +00:00
PyTorch MergeBot
954522a485 Revert "Autogen Tags enum, and allow specifying tags while defining an op"
This reverts commit 9476a78f37.

Reverted https://github.com/pytorch/pytorch/pull/77313 on behalf of https://github.com/malfet due to Broke OSS buck builds, see 9476a78f37
2022-06-03 01:53:53 +00:00
anjali411
9476a78f37 Autogen Tags enum, and allow specifying tags while defining an op
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77313

Approved by: https://github.com/ezyang, https://github.com/albanD
2022-06-03 01:13:44 +00:00
PyTorch MergeBot
b994ce359e Revert "[cuDNN V8 API] (reopen) Allow the number of kernels profiled under torch.backends.cudnn.benchmark = True to be limited (#77002)"
This reverts commit c274f2ad52.

Reverted https://github.com/pytorch/pytorch/pull/77002 on behalf of https://github.com/malfet due to please, as it breaks internal CI; also, no CUDA headers should be included from `torch/csrc/Module.cpp` — this should rather be implemented/registered in `torch/csrc/cuda/Module.cpp`
2022-05-24 21:52:35 +00:00
Nikita Shulga
6244daa6a9 [MPS] Fix torch.mps.is_available() (#78121)
By introducing `at::mps::is_available()` and changing `torch._C._is_mps_available` from a property to a memoizable callable.

Also, if `_mtl_device` is released in the MPSDevice destructor, shouldn't it be retained in the constructor?

Looks like the GitHub Actions Mac runner does not have any Metal devices available, according to https://github.com/malfet/deleteme/runs/6560871657?check_suite_focus=true#step:3:15
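
A usage sketch (the torch.backends.mps spelling is assumed from current releases):
```
import torch

# Memoized availability check; safe to call on machines (such as CI
# runners) that expose no Metal device.
if torch.backends.mps.is_available():
    x = torch.ones(3, device="mps")
```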

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78121
Approved by: https://github.com/albanD
2022-05-24 05:10:38 +00:00
Eddie Yan
c274f2ad52 [cuDNN V8 API] (reopen) Allow the number of kernels profiled under torch.backends.cudnn.benchmark = True to be limited (#77002)
(reopening due to botched merge)
The cuDNN V8 API (main support merged in https://github.com/pytorch/pytorch/pull/60755) potentially exposes many more kernels with benchmark=True. While these additional kernels can improve performance, it is often unnecessary to run every kernel returned by the heuristic, and doing so may degrade the user experience by causing the first model iteration to be very slow. To alleviate this issue, this PR introduces `torch.backends.cudnn.benchmark_limit`. benchmark_limit specifies the maximum number of working cuDNN kernels to try for a given workload, with the default being 10 (similar to what TensorFlow does). benchmark_limit = 0 yields the current behavior of trying every kernel returned by the heuristic.
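
A usage sketch of the new knob (values per the description above):
```
import torch

torch.backends.cudnn.benchmark = True
# Try at most 10 candidate kernels per workload (the default);
# 0 restores the old behavior of trying every kernel the heuristic returns.
torch.backends.cudnn.benchmark_limit = 10
```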

CC @ptrblck @ngimel @xwang233
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77002
Approved by: https://github.com/ngimel
2022-05-24 00:11:47 +00:00
Jeff Daily
9aed30d3ad [ROCm] support benchmark flag for MIOpen (#77438)
Fixes #68172.  Generally, this corrects multiple flaky convolution unit-test behaviors seen on ROCm.

The MIOpen integration has been forcing benchmark=True when calling `torch._C._set_cudnn_benchmark(False)`, typically called by `torch.backends.cudnn.set_flags(enabled=True, benchmark=False)`.  We now add support for MIOpen immediate mode to avoid benchmarking during MIOpen solution selection.
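
For reference, a minimal sketch of the flag now being honored:
```
import torch

# Previously the MIOpen integration benchmarked regardless of this flag;
# with immediate-mode support, benchmark=False is now honored on ROCm.
torch.backends.cudnn.benchmark = False
```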
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77438
Approved by: https://github.com/ngimel, https://github.com/malfet
2022-05-23 17:10:24 +00:00
Kurt Mohler
aea6e2c396 Merge torch.cuda._UntypedStorage into torch._UntypedStorage (#75459)
Fixes #74933

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75459
Approved by: https://github.com/ezyang
2022-05-19 13:54:39 +00:00
Kulin Seth
f348b1b2b5 Add the Runtime components for MPS backend. (#76725)
The PR adds the runtime components and a few basic operations, like copy and as_strided, for the MPS backend.

The current list of identified TODOs:

-  https://github.com/pytorch/pytorch/issues/77176
- Unify the logic with CUDACachingAllocator and remove redundant code.
-  https://github.com/pytorch/pytorch/issues/77170
- Look into using C++ smart pointers where possible with ObjC code
- Use empty_strided_generic() to implement the `empty_strided_mps` code
- https://github.com/pytorch/pytorch/issues/77144
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76725
Approved by: https://github.com/albanD
2022-05-11 17:19:45 +00:00
Eddie Yan
e838137b3e Add high level control of fp32 matmul precision; disable TF32 for matmuls by default
#76440
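
A usage sketch of the new control (function spelling assumed from current releases):
```
import torch

# "highest" (now the default) keeps float32 matmuls in full precision;
# "high"/"medium" permit faster internal math such as TF32 where available.
torch.set_float32_matmul_precision("highest")
print(torch.get_float32_matmul_precision())
```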

CC @mruberry @ptrblck

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76509
Approved by: https://github.com/ngimel
2022-05-04 20:40:13 +00:00
Nikita Shulga
8473173c36 Remove breakpad dependency
This functionality does not seem to be used
and there are some requests to update the dependency.

Add `third_party` to torch_cpu include directories if compiling with
Caffe2 support, as `caffe2/quantization/server/conv_dnnlowp_op.cc` depends on `third_party/fbgemm/src/RefImplementations.h`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75394
Approved by: https://github.com/janeyx99, https://github.com/seemethere
2022-05-03 20:21:55 +00:00
Kulin Seth
54c75e1e8f Add "mps" device to PyTorch framework.
Remove the "mlc" device for Mac platforms.

This commit will be followed up with:

* adding MPS runtime components
* PyTorch ops for MPS device

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76291
Approved by: https://github.com/albanD
2022-04-27 19:21:57 +00:00
Peter Bell
58fb3f018e Fix conjugate bit discrepancy in composite compliance
When testing composite compliance, the conj bit and neg bit are not
propagated to the wrapper tensor. This leads to problems when a
composite operator has two paths depending on whether one of these
bits is set, since the non-conjugated path will always be taken.

For example, `at::real` effectively does
```cpp
view_as_real(tensor.is_conj() ? tensor.conj() : tensor)
```
which will never call `conj()` because the `CompositeCompliantTensor`
never has the conj bit set. The result is that `view_as_real` fails
when `r.elem` does have the conj bit set.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75830

Approved by: https://github.com/zou3519
2022-04-19 13:59:28 +00:00
PyTorch MergeBot
d79d9fa283 Revert "Remove breakpad dependency"
This reverts commit 9aa3c7fd83.

Reverted https://github.com/pytorch/pytorch/pull/75394 on behalf of https://github.com/malfet
2022-04-17 17:58:51 +00:00
Nikita Shulga
9aa3c7fd83 Remove breakpad dependency
This functionality does not seem to be used
and there are some requests to update the dependency.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75394
Approved by: https://github.com/janeyx99, https://github.com/seemethere
2022-04-17 17:43:45 +00:00
Edward Z. Yang
35cfa74f97 Add a default implementation of __torch_dispatch__
I was working on an explanation of how to call into the "super"
implementation of some given ATen operation inside of __torch_dispatch__
(https://github.com/albanD/subclass_zoo/blob/main/trivial_tensors.py)
and I kept thinking to myself "Why doesn't just calling super() on
__torch_dispatch__ work"?  Well, after this patch, it does!  The idea
is that if you don't actually unwrap the input tensors, you can call
super().__torch_dispatch__ to get at the original behavior.

Internally, this is implemented by disabling PythonKey and then
redispatching.  This implementation of disabled_torch_dispatch is
not /quite/ right, and some reasons why are commented in the code.
There is then some extra work I have to do to make sure we recognize
disabled_torch_dispatch as the "default" implementation (so we don't
start slapping PythonKey on all tensors, including base Tensors),
which is modeled the same way as how disabled_torch_function is done.
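
A sketch of the resulting idiom, in the spirit of the trivial_tensors.py example linked above (not code from this patch):
```
import torch

class TrivialTensor(torch.Tensor):
    @staticmethod
    def __new__(cls, elem):
        # Non-wrapper subclass: reuses the underlying tensor's storage.
        return torch.Tensor._make_subclass(cls, elem)

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        print(f"op: {func}")
        # No unwrapping here, so the default ("super") implementation can
        # disable the Python key and redispatch to the regular kernel.
        return super().__torch_dispatch__(func, types, args, kwargs)

t = TrivialTensor(torch.randn(3))
t + t  # prints the dispatched aten op, then computes as usual
```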

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73684

Approved by: albanD
2022-03-03 20:19:33 +00:00
Can Balioglu
7366724e07 Introduce an environment variable to change c10 log level (#71746)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71746

This PR contains the following improvements:

- It exposes a new environment variable `TORCH_CPP_LOG_LEVEL` that enables users to set the log level of c10 logging facility (supports both GLOG and c10 loggers). Valid values are `INFO`, `WARNING`, `ERROR`, and `FATAL` or their numerical equivalents `0`, `1`, `2`, and `3`.
- It implements an `initLogging()` function and calls it as part of `torch._C` module import to ensure that the underlying logging facility is correctly initialized in Python.

With these changes a user can dynamically set the log level of c10 as in the following example:

```
$ TORCH_CPP_LOG_LEVEL=INFO python my_torch_script.py
```
ghstack-source-id: 149822703

Test Plan: Run existing tests.

Reviewed By: malfet

Differential Revision: D33756252

fbshipit-source-id: 7fd078c03a598595d992de0b474a23cec91838af
(cherry picked from commit 01d6ec6207faedf259ed1368730e9e197cb3e1c6)
2022-02-24 14:34:01 +00:00
Alban Desmaison
7807a83f6e Fix error handling in TestSetDefaultMobileCPUAllocator
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73207
2022-02-22 19:45:49 +00:00