Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67004
Trace custom classes.
This is a new version of the change, because the earlier one could not be rebased.
Test Plan: CI.
Reviewed By: dhruvbird
Differential Revision: D31818978
fbshipit-source-id: daa22ccb153e32685bcca43a303ba9e21042d052
Summary:
Switches most of the simple for loops outside of `jit` directories to use `c10::irange`.
Generated with D28874212.
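For readers unfamiliar with the pattern, here is a rough sketch of the kind of rewrite this change applies. `sketch::irange` is a simplified, hypothetical stand-in written for this example, not the real `c10::irange` implementation, and `sum_classic`, `sum_irange` and `check_loops_agree` are illustrative names only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

// Hypothetical, simplified stand-in for c10::irange: the half-open
// integer range [0, stop). Assumes stop is non-negative.
template <typename T>
class IntRange {
 public:
  class Iterator {
   public:
    explicit Iterator(T value) : value_(value) {}
    T operator*() const { return value_; }
    Iterator& operator++() {
      ++value_;
      return *this;
    }
    bool operator!=(const Iterator& other) const {
      return value_ != other.value_;
    }

   private:
    T value_;
  };

  explicit IntRange(T stop) : stop_(stop) {}
  Iterator begin() const { return Iterator(T{0}); }
  Iterator end() const { return Iterator(stop_); }

 private:
  T stop_;
};

template <typename T>
IntRange<T> irange(T stop) {
  return IntRange<T>(stop);
}

} // namespace sketch

// Before: the index type and the bound are spelled out by hand.
inline long sum_classic(const std::vector<int>& v) {
  long total = 0;
  for (std::size_t i = 0; i < v.size(); i++) {
    total += v[i];
  }
  return total;
}

// After: the index type is deduced from the bound, and the index is const.
inline long sum_irange(const std::vector<int>& v) {
  long total = 0;
  for (const auto i : sketch::irange(v.size())) {
    total += v[i];
  }
  return total;
}

// Usage check: both loops visit the same indices.
inline void check_loops_agree() {
  const std::vector<int> v{1, 2, 3, 4};
  assert(sum_classic(v) == sum_irange(v));
}
```

Because the index type follows the bound, the range form avoids a hand-picked index type that does not match the container's size type.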
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59481
Test Plan: Sandcastle
Reviewed By: ngimel
Differential Revision: D28909681
fbshipit-source-id: ec9ab1bd602933238d9d0f73d4d8d027b75d9d85
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54548
We don't need to inline most of this class; doing so bloats code size and build time.
ghstack-source-id: 129765666
Test Plan:
Existing CI
Ran buildsizebot on some mobile apps
Reviewed By: jamesr66a
Differential Revision: D27277317
fbshipit-source-id: 7643aa35e4d794fee0a48a3bbe0890c2e428ae78
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51319
We were going out of our way to accommodate `IValue::to<Tensor>` returning a copy of the inner Tensor. `IValue::toTensor` is capable of returning a reference without copying, so if we use it directly, we can allow kernels that want to take `Tensor &` to do so!
As a bonus, we get reduced build times.
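To illustrate the refcount point, here is a minimal sketch using stand-in types. `FakeTensor`, `FakeIValue` and `toCopy` are hypothetical names invented for this example; only the idea (returning by value bumps a refcount, returning by reference does not) is taken from the description above. The real `IValue` and `Tensor` are not used.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for a refcounted tensor handle.
struct FakeTensor {
  std::shared_ptr<int> impl = std::make_shared<int>(0);
  long use_count() const { return impl.use_count(); }
};

// Hypothetical stand-in for a boxed value that holds a tensor.
class FakeIValue {
 public:
  explicit FakeIValue(FakeTensor t) : tensor_(std::move(t)) {}

  // Models an accessor that returns a copy: one extra reference per call.
  FakeTensor toCopy() const { return tensor_; }

  // Models an accessor that returns a reference: no extra reference.
  const FakeTensor& toTensor() const { return tensor_; }

 private:
  FakeTensor tensor_;
};

// Two "kernels" that report how many references exist while they run.
inline long refs_seen_by_value(FakeTensor t) { return t.use_count(); }
inline long refs_seen_by_ref(const FakeTensor& t) { return t.use_count(); }

// Unboxing through the copying accessor: the kernel sees a second reference.
inline long call_with_copy(const FakeIValue& v) {
  return refs_seen_by_value(v.toCopy());
}

// Unboxing through the reference accessor: the kernel sees only the original.
inline long call_with_ref(const FakeIValue& v) {
  return refs_seen_by_ref(v.toTensor());
}
```

In this model, a kernel that takes a reference can be fed straight from the boxed value without touching the refcount, which is the saving the benchmark note in the test plan refers to.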
ghstack-source-id: 121378961
Test Plan:
Rely on CI for correctness.
Profiled build time with -ftime-trace for RegisterCPU.cpp using an extracted build invocation.
Before: P168244900
After: P168245014
Note reduced time spent compiling make_boxed_from_unboxed_functor.
I also ran the AdIndexer benchmark (https://fb.quip.com/ztERAYjuzdlr) with static runtime disabled and batch size 1 to see how big the effect on boxed call performance was (any kernels that take `Tensor&` or `const Tensor&` should now actually save a refcount bump). Looks like it was roughly 1% better:
Before: 124-125 usec/iter
After: 122-123 usec/iter
Reviewed By: bhosmer
Differential Revision: D26138549
fbshipit-source-id: b0f830527da360c542c815bef2f7e1692615b32a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35411
The file and class names in ATen/core/boxing were quite confusing.
Let's rename them for readability.
Also move function schema inference out of the boxing logic into op_registration.h where it belongs.
ghstack-source-id: 101539206
Test Plan: waitforsandcastle
Differential Revision: D20653621
fbshipit-source-id: 6a79c73d5758bee1e072d543c030913b18a69c7c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35218
We should express the ownership semantics directly here. Using
`shared_ptr` makes it too easy to leak ownership by inadvertently
storing a copy.
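A small sketch of the hazard, assuming the replacement is `std::unique_ptr` (the message does not name it). `Kernel`, `SharedHolder` and `UniqueHolder` are hypothetical types made up for this example and do not correspond to any class in the change.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical payload whose ownership is being handed over.
struct Kernel {
  int id;
};

// With shared_ptr, the hand-off silently copies: the caller stays an owner.
class SharedHolder {
 public:
  void set(std::shared_ptr<Kernel> k) { kernel_ = std::move(k); }

 private:
  std::shared_ptr<Kernel> kernel_;
};

// With unique_ptr, the hand-off must be an explicit move.
class UniqueHolder {
 public:
  void set(std::unique_ptr<Kernel> k) { kernel_ = std::move(k); }
  int id() const { return kernel_->id; }

 private:
  std::unique_ptr<Kernel> kernel_;
};

// Returns the number of owners after handing a shared_ptr to the holder.
inline long owners_after_shared_handoff() {
  auto k = std::make_shared<Kernel>(Kernel{1});
  SharedHolder holder;
  holder.set(k); // compiles without std::move, so k is copied
  return k.use_count();
}

// Returns true if the caller no longer owns anything after the hand-off.
inline bool caller_empty_after_unique_handoff() {
  std::unique_ptr<Kernel> k(new Kernel{1});
  UniqueHolder holder;
  holder.set(std::move(k)); // passing k without std::move would not compile
  assert(holder.id() == 1);
  return k == nullptr; // a moved-from unique_ptr is guaranteed to be null
}
```

The shared version leaves two owners without any visible sign at the call site, while the unique version makes the transfer explicit and leaves exactly one.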
Test Plan: Imported from OSS
Differential Revision: D20682673
Pulled By: suo
fbshipit-source-id: 32002ee515eb8bb7b37e6d0aac3c0695df4eec79