Commit Graph

144 Commits

Author SHA1 Message Date
Edward Z. Yang
f7ee061638 Wconstab/reland pysymint (#79795)
Rebased https://github.com/pytorch/pytorch/pull/79617/ to see if the issues are reproducible.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79795
Approved by: https://github.com/malfet
2022-06-20 22:55:06 +00:00
Scott Wolchok
6e90572bb9 [PyTorch] Don't create a new Storage in FreeMemory unnecessarily
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79573

No reason to go through an extra heap allocation.

Differential Revision: [D37157595](https://our.internmc.facebook.com/intern/diff/D37157595/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D37157595/)!

Approved by: https://github.com/ezyang
2022-06-17 00:46:16 +00:00
PyTorch MergeBot
44436947bc Revert "Reland PySymInt (#79617)"
This reverts commit 8ef6356f26.

Reverted https://github.com/pytorch/pytorch/pull/79617 on behalf of https://github.com/zengk95 because it is breaking periodic jobs (and maybe pull) on trunk
2022-06-16 19:40:27 +00:00
Nikolay Korovaiko
8ef6356f26 Reland PySymInt (#79617)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79617
Approved by: https://github.com/Chillee
2022-06-16 04:18:06 +00:00
PyTorch MergeBot
b8db0a0475 Revert "Python Bindings for SymInts (#78135)"
This reverts commit d332724071.

Reverted https://github.com/pytorch/pytorch/pull/78135 on behalf of https://github.com/ezyang because it broke torchvision tests
2022-06-15 13:52:14 +00:00
Nikolay Korovaiko
d332724071 Python Bindings for SymInts (#78135)
This PR adds support for `SymInt`s in Python. Namely:
* `THPVariable_size` now returns `sym_sizes()`
* the Python arg parser is modified to parse PyObjects into ints and `SymbolicIntNode`s
* pybind11 bindings for `SymbolicIntNode` are added, so size expressions can be traced (a minimal binding pattern is sketched below)
* a large number of tests are added to demonstrate how to implement Python symints.
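A minimal sketch, assuming a toy `SymIntNode` class and module name rather than PyTorch's actual `SymbolicIntNode` API, of how a pybind11 binding of this shape lets size arithmetic be traced from Python:

```cpp
#include <pybind11/pybind11.h>

#include <memory>
#include <string>

namespace py = pybind11;

struct SymIntNode {
  std::string expr;  // symbolic expression this node represents
  explicit SymIntNode(std::string e) : expr(std::move(e)) {}
  std::shared_ptr<SymIntNode> add(const SymIntNode& other) const {
    // arithmetic on sizes builds up a traced expression
    return std::make_shared<SymIntNode>("(" + expr + " + " + other.expr + ")");
  }
};

PYBIND11_MODULE(symint_demo, m) {
  py::class_<SymIntNode, std::shared_ptr<SymIntNode>>(m, "SymIntNode")
      .def(py::init<std::string>())
      .def("__add__", &SymIntNode::add)
      .def("__repr__", [](const SymIntNode& n) { return n.expr; });
}
```

From Python, `SymIntNode("s0") + SymIntNode("s1")` would then yield a node whose repr is `(s0 + s1)`.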
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78135
Approved by: https://github.com/ezyang
2022-06-14 02:17:59 +00:00
George Qi
05624bcf7b add sizes to slowpath
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79295

Approved by: https://github.com/ezyang
2022-06-14 01:19:59 +00:00
Scott Wolchok
fff1948b02 [PyTorch] intrusive_ptr: don't guarantee release_resources will be called
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76767

We're spending a virtual function call in the common case
where there are no weak references just to save a small amount of care
in intrusive_ptr_target subclasses that override release_resources, of
which there aren't very many.
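A minimal sketch of the trade-off, assuming simplified refcounting (this is not PyTorch's actual `intrusive_ptr` code): the virtual `release_resources()` call is only paid when weak references exist.

```cpp
#include <atomic>
#include <cstdint>

struct TargetLike {
  std::atomic<int64_t> refcount_{1};
  std::atomic<int64_t> weakcount_{0};
  virtual ~TargetLike() = default;
  // A few subclasses override this to free heavy members early; the
  // change above means it is no longer guaranteed to be called.
  virtual void release_resources() {}
};

void decref(TargetLike* self) {
  if (self->refcount_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
    if (self->weakcount_.load(std::memory_order_acquire) == 0) {
      delete self;  // common case: no weak refs, skip the virtual call
    } else {
      // weak refs keep the allocation alive, so release resources now
      self->release_resources();
    }
  }
}
```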

Differential Revision: [D36109757](https://our.internmc.facebook.com/intern/diff/D36109757/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36109757/)!

Approved by: https://github.com/ezyang
2022-06-10 19:30:35 +00:00
George Qi
a90f006fe5 add strides to slow path
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78610

Approved by: https://github.com/ezyang
2022-06-10 16:59:14 +00:00
Brian Hirsh
19228205ae functionalization: update wrapper to propagate symints
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78820

Approved by: https://github.com/ezyang
2022-06-06 14:14:06 +00:00
Michael Suo
22b10873f3 Allow torchdispatch to customize dim()
This follows the template in
https://github.com/pytorch/pytorch/pull/77396

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78691

Approved by: https://github.com/ezyang
2022-06-02 20:54:13 +00:00
Michael Suo
49979c4021 [symint] Make TensorImpl::sizes_and_strides_ contain SymInt
Change our representation of sizes and strides to contain SymInts
instead of int64_t.

Right now it's not actually possible to create a Tensor with symbolic
shape, so this change is intended to be a no-op.

But the intended behavior is:
- If you create a Tensor with symbolic shape, a `CustomSizes` policy
will be set, and the `has_symbolic_sizes_strides_` bit will be set. (not
currently implemented)
- Calling any TensorImpl function that naively interacts with sizes and
strides will throw. For hot-path functions (`sizes()`, `strides()`), we
make use of the existing policy check to throw. For others, we just have
a regular `TORCH_CHECK(!has_symbolic_sizes_strides_)`.

This also undoes the explicit constructor I made in
https://github.com/pytorch/pytorch/pull/77666; it ended up being more
annoying than useful when making these changes.
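A minimal sketch, assuming simplified stand-in types, of the guard described above (the real `TensorImpl` differs in detail):

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

struct SymIntLike { int64_t val; };  // stand-in for c10::SymInt

struct TensorImplLike {
  // sizes followed by strides, stored as SymInts rather than int64_t
  std::vector<SymIntLike> sizes_and_strides_;
  bool has_symbolic_sizes_strides_ = false;

  int64_t dim() const {
    // non-hot-path function: a plain check, in the spirit of
    // TORCH_CHECK(!has_symbolic_sizes_strides_)
    if (has_symbolic_sizes_strides_) {
      throw std::runtime_error("dim(): tensor has symbolic sizes/strides");
    }
    return static_cast<int64_t>(sizes_and_strides_.size() / 2);
  }
};
```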

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78272

Approved by: https://github.com/Krovatkin, https://github.com/Chillee
2022-05-25 20:54:51 +00:00
Elias Ellison
2d93e1fada Add slow path for device
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77684

Approved by: https://github.com/ezyang
2022-05-24 21:56:01 +00:00
PyTorch MergeBot
fb84f2223c Revert "[symint] Make TensorImpl::sizes_and_strides_ contain SymInt"
This reverts commit a7a818d9e2.

Reverted https://github.com/pytorch/pytorch/pull/77994 on behalf of https://github.com/seemethere: talked with @suo and we decided to revert because of broken [internal builds](https://www.internalfb.com/intern/sandcastle/job/678535557/). It also appears that internal codegen might be broken as well.
2022-05-24 00:14:02 +00:00
Michael Suo
a7a818d9e2 [symint] Make TensorImpl::sizes_and_strides_ contain SymInt
Change our representation of sizes and strides to contain SymInts
instead of int64_t.

Right now it's not actually possible to create a Tensor with symbolic
shape, so this change is intended to be a no-op.

But the intended behavior is:
- If you create a Tensor with symbolic shape, a `CustomSizes` policy
will be set, and the `has_symbolic_sizes_strides_` bit will be set. (not
currently implemented)
- Calling any TensorImpl function that naively interacts with sizes and
strides will throw. For hot-path functions (`sizes()`, `strides()`), we
make use of the existing policy check to throw. For others, we just have
a regular `TORCH_CHECK(!has_symbolic_sizes_strides_)`.

This also undoes the explicit constructor I made in
https://github.com/pytorch/pytorch/pull/77666; it ended up being more
annoying than useful when making these changes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77994

Approved by: https://github.com/Krovatkin
2022-05-20 20:17:06 +00:00
Nikolay Korovaiko
df1f9b9840 Implement sym_sizes to create proper IR for sym ints representing tensor sizes (#77756)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77756
Approved by: https://github.com/desertfire
2022-05-20 05:39:03 +00:00
George Qi
294fff16ec add slow path for is_contiguous (#77906)
Test Plan: CI

Reviewed By: malfet, b0noI

Differential Revision: D36493890

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77906
Approved by: https://github.com/malfet
2022-05-19 22:52:45 +00:00
PyTorch MergeBot
00a187c373 Revert "add slow path for is_contiguous"
This reverts commit f6beda89c6.

Reverted https://github.com/pytorch/pytorch/pull/77396 on behalf of https://github.com/malfet
2022-05-19 17:07:54 +00:00
PyTorch MergeBot
e9d660c331 Revert "Revert "Revert "Implement sym_sizes to create proper IR for sym ints representing tensor sizes (#76836)"""
This reverts commit acf7136a52.

Reverted https://github.com/pytorch/pytorch/pull/77719 on behalf of https://github.com/suo
2022-05-18 05:06:50 +00:00
Edward Z. Yang
acf7136a52 Revert "Revert "Implement sym_sizes to create proper IR for sym ints representing tensor sizes (#76836)""
This reverts commit c35bd8d423.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77719

Approved by: https://github.com/Chillee, https://github.com/malfet
2022-05-18 03:25:43 +00:00
PyTorch MergeBot
c35bd8d423 Revert "Implement sym_sizes to create proper IR for sym ints representing tensor sizes (#76836)"
This reverts commit fc4c3c9bc7.

Reverted https://github.com/pytorch/pytorch/pull/76836 on behalf of https://github.com/suo
2022-05-18 02:45:25 +00:00
George Qi
f6beda89c6 add slow path for is_contiguous
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77396

Approved by: https://github.com/ezyang, https://github.com/cpuhrsch
2022-05-18 02:25:27 +00:00
Nikolay Korovaiko
fc4c3c9bc7 Implement sym_sizes to create proper IR for sym ints representing tensor sizes (#76836)
LTC Tensors now create real IR (SizeNode) for sym_sizes() in LTCTensorImpl.cpp.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76836
Approved by: https://github.com/ezyang
2022-05-18 00:40:42 +00:00
Brian Hirsh
5762c7b25b fix StridesPolicy logic for FunctionalTensorWrapper
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77358

Approved by: https://github.com/ezyang
2022-05-13 13:27:06 +00:00
Edward Z. Yang
2896f81dd4 Consolidate customization contiguous/sizes policy into unified policy
Prior to this PR, we had a mish-mash of ways of getting unconventional
sizes/strides behavior:

- In OSS (but not in fbcode), some methods are virtual and you
  can override them directly

- There is an is_contiguous policy, which is a bitfield tag that lets
  you toggle is_contiguous to error or hit a virtual method
  is_contiguous_custom if it is set.  Ordinarily is_contiguous()
  is virtual and you can just override it, but this works EVEN IF
  is_contiguous() is non-virtual (e.g., in fbcode)

- There is also a sizes policy which is the same idea but for sizes

This PR unifies these mechanisms, and in doing so, eliminates the
maybe virtual/not-virtualness of the methods in question.  The primary
downside of this change is that it is BC-breaking (but the BC break is
very easy to fix!)

The new scheme works like this: we have three levels of policy for
sizes/strides (order matters).

- The Default policy is a conventional dense tensor, where we use
  all of the built-in fields to directly represent the
  sizes/strides/numel/contiguity of the tensor, and it is possible
  to bypass virtual call entirely.

- The CustomStrides policy represents tensors which have a custom
  notion of strides (most typically, that they don't support them),
  shunting strides() and is_contiguous() to virtual methods
  strides_custom() and is_contiguous_custom().  This INCLUDES handling
  for contiguity, since they typically go hand-in-hand (although
  the situation is murky with batched tensors).  The default
  implementations of these functions raise errors saying the tensor
  doesn't support them.

- The CustomSizes policy represents tensors which have a custom
  notion of sizes (the two notable examples are nested tensor, which
  doesn't have a representation of sizes in the conventional form, and
  XLA/LTC tensor, which synchronizes its sizes with an underlying
  compiler backend).  This shunts sizes(), numel() and dim() (along
  with everything from strides) to _custom() variants.

There is no special policy for erroring; instead, we just do a vcall
and expect the virtual method to raise an exception (the performance
hit from the vcall doesn't matter because you're about to raise a C++
exception anyway).  The default implementations of all overridable
functions are available at _default() which is helpful in some
situations when you just want to do a "sync" and then run the
conventional semantics.

This PR could be extended further in two ways but I did not do them
due to time constraints:

- Ideally, all TENSORIMPL_MAYBE_VIRTUAL would be eliminated from
  TensorImpl, by using the same policy trick.

- set_size and set_stride are still virtual; it's not entirely clear
  the same trick should be used here though as these methods are
  deprecated.
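A minimal sketch, with illustrative names, of the unified policy scheme described above (the real `TensorImpl` differs in detail): the fast path is a cheap enum comparison, and only tensors with a custom policy pay for a virtual call.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

enum class SizesStridesPolicy : uint8_t { Default, CustomStrides, CustomSizes };

struct TensorImplLike {
  virtual ~TensorImplLike() = default;

  const std::vector<int64_t>& sizes() const {
    if (policy_ >= SizesStridesPolicy::CustomSizes) {
      return sizes_custom();  // nested/XLA-style tensors land here
    }
    return sizes_default();
  }

  // the _default() variant stays available for "sync, then run the
  // conventional semantics" use cases
  const std::vector<int64_t>& sizes_default() const { return sizes_; }

  // default implementation errors via an ordinary C++ exception
  virtual const std::vector<int64_t>& sizes_custom() const {
    throw std::runtime_error("this tensor does not support sizes()");
  }

  std::vector<int64_t> sizes_;
  SizesStridesPolicy policy_ = SizesStridesPolicy::Default;
};
```

`strides()` and `is_contiguous()` would check `policy_ >= SizesStridesPolicy::CustomStrides` in the same way.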

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77036

Approved by: https://github.com/bdhirsh
2022-05-11 00:23:07 +00:00
Edward Z. Yang
4bd5b1614b Move legacy Caffe2 TensorImpl methods out of header
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77028

Approved by: https://github.com/bdhirsh
2022-05-11 00:23:07 +00:00
Edward Z. Yang
337e3932aa Fix data race on owns_pyobj_ accesses with non-GIL protected threads
Fixes https://github.com/pytorch/pytorch/issues/75529

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75563

Approved by: https://github.com/swolchok
2022-04-15 16:24:15 +00:00
Edward Z. Yang
2772870860 Preserve Python dispatch keys upon copy_tensor_metadata_except_version_counter
Whether or not this is a reasonable operation to do in the presence of
subclasses is a good question in and of itself, but this fixes an
obvious invariant violation, which is that if a Tensor reports that
it is a tensor subclass, it had better have the Python dispatch key.
Previously, the dispatch key would have gotten unconditionally cleared;
now we preserve whatever the original bit was.
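A minimal sketch of the fix, assuming a hypothetical bit position for the Python dispatch key (the real `DispatchKeySet` is more involved):

```cpp
#include <bitset>
#include <cstddef>

using KeySetLike = std::bitset<64>;
constexpr std::size_t kPythonBit = 5;  // hypothetical bit index

void copy_keys_preserving_python(const KeySetLike& src, KeySetLike& dest) {
  const bool had_python = dest[kPythonBit];  // remember the original bit
  dest = src;
  dest[kPythonBit] = had_python;  // previously this was always cleared
}
```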

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75644

Approved by: https://github.com/albanD
2022-04-15 13:26:23 +00:00
Edward Z. Yang
de6353ba88 Introduce SafePyObject, make TorchDispatchTypeObject use it
The pattern of a PyObject* bundled with a PyInterpreter* is pretty
useful in many contexts (e.g., TorchDispatchTypeObject) so I have turned
it into a dedicated class SafePyObject.  In the process I fixed a
bug with the old TorchDispatchTypeObject (copy constructor/assignment
was not deleted), made the API more safe (retrieving the PyObject*
pointer requires verification that the PyInterpreter* matches) and
fixed some minor inefficiencies in C++ code.
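A minimal sketch, with stand-in types, of the `SafePyObject` pattern described above: copies are deleted, and retrieving the pointer requires presenting the matching interpreter.

```cpp
#include <stdexcept>

struct PyInterpreter;       // opaque stand-in
using PyObjectPtr = void*;  // stand-in for PyObject*

class SafePyObjectLike {
 public:
  SafePyObjectLike(PyObjectPtr data, PyInterpreter* interp)
      : data_(data), pyinterpreter_(interp) {}

  // the bug fixed above: copy construction/assignment must be deleted
  SafePyObjectLike(const SafePyObjectLike&) = delete;
  SafePyObjectLike& operator=(const SafePyObjectLike&) = delete;

  PyObjectPtr ptr(const PyInterpreter* interp) const {
    if (interp != pyinterpreter_) {
      throw std::runtime_error("PyObject used from the wrong interpreter");
    }
    return data_;
  }

 private:
  PyObjectPtr data_;
  PyInterpreter* pyinterpreter_;
};
```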

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75142

Approved by: https://github.com/zou3519
2022-04-04 14:35:01 +00:00
Edward Z. Yang
1faf1cdf12 Split PyInterpreter into its own file.
I also took the opportunity to update the documentation a little
for clarity.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75141

Approved by: https://github.com/zou3519
2022-04-04 14:35:01 +00:00
Brian Hirsh
1b7d7d9327 Reland: "free up dispatch key space (in C++)" (#74963)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74963

This is a re-land of D35192346 (9872a06d77) and D35192317 (a9216cde6c), which together are a diff that changes the internal representation of `DispatchKeySet` in pytorch core to free up the number of dispatch keys that we have available. See a more detailed description of the design in the original PR: https://github.com/pytorch/pytorch/pull/69633.

The original PR broke Milan workflows, which use a pytorch mobile build, and manifested as a memory corruption bug inside of `liboacrmerged.so`.

**Background: Existing Mobile Optimization**
Pytorch mobile builds have an existing optimization (here cc23725e89/c10/core/DispatchKey.h (L382) and here cc23725e89/aten/src/ATen/core/dispatch/OperatorEntry.h (L214)), which works as follows:

Every operator in pytorch has a "dispatch table" of function pointers, corresponding to all of the (up to 64) different kernels that we might dispatch to when we run an operator in pytorch (autograd, cpu, cuda, complex number support, etc).

In mobile builds, the size of that table is shrunk from 64 to 8 to save a bunch of space, because mobile doesn't end up using the functionality associated with most dispatch keys.

The dispatcher also has a notion of "fallback kernels", which are kernels that you can register to a particular dispatch key, but should be able to work for "any operator". The array of fallback kernels is defined here: cc23725e89/aten/src/ATen/core/dispatch/Dispatcher.h (L294).

The mobile optimization currently does **not** extend to this array (it wouldn't be that useful anyway, because there is only one array of fallback kernels globally, vs. a separate dispatch table of function pointers per operator). So the per-operator tables on mobile are size 8, while the fallback table is size 64.

**The Bug**
This PR actually made it difficult to enable that optimization separately for the per-operator arrays vs. the fallback array, and incidentally shrank the size of the fallback array from 64 to 8 for mobile (that happened on this line: https://github.com/pytorch/pytorch/pull/69633/files#diff-f735cd7aa68f15b624100cbc4bb3b5ea76ffc7c9d3bec3b0ccabaa09609e5319R294).

That isn't a problem by itself (since mobile doesn't actually use any of the fallbacks that can no longer be stored). However, pytorch core will still register all of those fallback kernels on startup in mobile builds, even if they aren't used. When we tried to register one of those fallbacks on startup, it would try to dump the kernel somewhere in memory past the bounds of the (now smaller) array inside of the `Dispatcher` object, `backendFallbackKernels_`.
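A minimal sketch, with illustrative sizes and names (not the actual dispatcher code), of the memory layout involved: per-operator tables can safely shrink on mobile, but the single global fallback table cannot, because startup registers fallbacks for all key indices.

```cpp
#include <array>
#include <cstddef>

using KernelFn = void (*)();
constexpr std::size_t kFullNumKeys = 64;  // all runtime dispatch keys

#ifdef MOBILE_BUILD
constexpr std::size_t kPerOpSlots = 8;    // the existing mobile shrink
#else
constexpr std::size_t kPerOpSlots = 64;
#endif

struct OperatorEntryLike {
  // one table per operator: shrinking these saves real space on mobile
  std::array<KernelFn, kPerOpSlots> dispatchTable_{};
};

// One table globally. Shrinking this to 8 was the bug: startup still
// registers fallbacks for indices in [8, 64), writing past the end.
std::array<KernelFn, kFullNumKeys> backendFallbackKernels_{};
```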

**Why didn't this problem show up in OSS CI? Why didn't it break other internal mobile workflows aside from Milan?**

Ideally, this failure would show up as part of the OSS signal on GitHub, since we already have mobile OSS builds. Given that it was another memory corruption issue that only affected Milan (subset of mobile), I'm not sure what's specific about Milan's builds that caused it only to manifest there. dreiss I wonder if there's another flavor of mobile builds we could run in OSS CI that could potentially help catch this?

**The debugging experience was pretty difficult**

Debugging the Milan-specific failure was made difficult by the following:

(1) lack of CI
- the original Milan failure didn't surface on my original diff, because the Milan job(s) that failed weren't triggered to run on pytorch changes. There's probably a balance to strike here, since those jobs will only be useful if they aren't flaky, and if they can produce reliable failure logs for debugging.

(2) It's difficult to get a repro.
- my work laptop doesn't have the right specs to run the Milan development workflow (not enough disk space)
- There is an existing OnDemand workflow for Milan, but it appears to be relatively new, and after a bunch of help from MarcioPorto, we ran into issues forwarding the log output from Milan tests on the emulator back to the terminal (see the original discussion here: https://fb.workplace.com/groups/OnDemandFRL/permalink/1424937774645433/)

(3) Lack of stack-traces.
- Most Milan failures didn't include actionable stack traces. phding generously helped me debug by running my suggested patches locally, and reporting back if there were any failures. The failing test didn't include a stack trace though (just the line where the crash appeared), so I ended up making some educated guesses about what the issue was based on the area of the crash.
ghstack-source-id: 152688542

Test Plan: Confirmed with phding that the broken Milan workflow from the previous version of this diff is now passing.

Reviewed By: phding, albanD

Differential Revision: D35222806

fbshipit-source-id: 0ad115a0f768bc8ea5d4c203b2990254c7092d30
(cherry picked from commit 002b91966f11fd55ab3fa3801b636fa39a6dd12c)
2022-03-31 21:52:38 +00:00
Brian Hirsh
9872a06d77 Back out "free up dispatch key space (in C++)" (#74859)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74859

Original commit changeset: 6d1dd0fd8144

Original Phabricator Diff: D34227616 (2cbddc0e9b)
ghstack-source-id: 152381077

(Note: this ignores all push blocking failures!)

Test Plan:
Test on Milan with "get weather utterance"
buck build fbsource//fbandroid/mode/opt fbsource//fbandroid/mode/milan_build_rdk  //fbandroid/apps/wearable/system/speechservice:speechservice_target30_xhdpi_armv7_release_debug_keystore -c  pt.has_backtaces=1

Reviewed By: phding

Differential Revision: D35192346

fbshipit-source-id: b962de5d5effaf23f9aa8afd3ef36f8c6383de5b
(cherry picked from commit 913e3027a11457aaa2d97a9d89ebc6133b14213c)
2022-03-29 15:39:17 +00:00
Brian Hirsh
a9216cde6c Back out "DispatchKeySet perf improvements" (#74858)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74858

Original commit changeset: c7695e16dba3

Original Phabricator Diff: D34227615 (c0491c9179)

The Milan Assistant is crashing, see P489979944

After backing out the changes D34227615 (c0491c9179) and D34227616 (2cbddc0e9b), it works fine now
ghstack-source-id: 152380988

(Note: this ignores all push blocking failures!)

Test Plan:
Test on Milan with "get weather utterance"

buck build fbsource//fbandroid/mode/opt fbsource//fbandroid/mode/milan_build_rdk  //fbandroid/apps/wearable/system/speechservice:speechservice_target30_xhdpi_armv7_release_debug_keystore -c  pt.has_backtaces=1

Reviewed By: phding

Differential Revision: D35192317

fbshipit-source-id: e38081810a569b45ca037e019ec1c8773971534d
(cherry picked from commit 78833ac6997fbc8e20bd0f3ee0e0fe55a075054c)
2022-03-29 15:39:17 +00:00
Brian Hirsh
c0491c9179 DispatchKeySet perf improvements (#72828)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72828

Reland of D34034847 (8aa3620d73)
ghstack-source-id: 152161453

Test Plan: confirm that Milan tests are passing

Reviewed By: ezyang, albanD

Differential Revision: D34227615

fbshipit-source-id: c7695e16dba3076e8ab9df8654327c5d57e92c77
(cherry picked from commit 940717db1551b799964894e0bb97757ecae14235)
2022-03-25 17:04:51 +00:00
Brian Hirsh
2cbddc0e9b free up dispatch key space (in C++) (#72827)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72827

Reland of D34034848 (6690256021)
ghstack-source-id: 152161452

Test Plan: Confirm that Milan tests are passing

Reviewed By: ezyang

Differential Revision: D34227616

fbshipit-source-id: 6d1dd0fd8144dfbd9e194cd7564cce017e7db968
(cherry picked from commit e5c1b29fedd5c2a0bad810cedc94aa784136b6aa)
2022-03-25 17:04:51 +00:00
Scott Wolchok
90be8fa279 [PyTorch] Make TensorImpl::sizes() customizable and disable it for NestedTensorImpl (#73817)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73817

NestedTensorImpl doesn't have sizes(). Silently getting wrong results back from it is not conducive to efficient software development. Make it throw while allowing sizes() to be inlined in the common case anyway, just like is_contiguous(). Thanks ezyang for the reminder that we could do this.
ghstack-source-id: 151302903

Test Plan: Updated test_nestedtensor.py

Reviewed By: ezyang

Differential Revision: D34660829

fbshipit-source-id: 1289f21127d6a8359893f9174f3c430a290f2c7f
(cherry picked from commit 7098b9fcfbd25a03bac19e1148426ff073810edd)
2022-03-15 19:24:57 +00:00
CodemodService FBSourceClangFormatLinterBot
f395a75c67 [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D34263670

fbshipit-source-id: 9479899031c817ad8cbefba30db7d0203804fd99
(cherry picked from commit c13e2138f4)
2022-02-16 15:59:57 +00:00
Alban Desmaison
a7cac05ca6 Add new tls snapshot feature (#72832)
Summary:
Reland of https://github.com/pytorch/pytorch/pull/72623, which was reverted because the TLS cleanup was removed.

From close inspection of how the number of available keys is counted, I think there is one more available, since the guard is actually one past the last usable key. With this updated assert, the last usable key will still be <= 63, which will fit just fine.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72832

Reviewed By: H-Huang

Differential Revision: D34228571

Pulled By: albanD

fbshipit-source-id: ce5e10a841ea87386727346cfc8d9327252574c4
(cherry picked from commit 59d3b86353)
2022-02-15 19:02:05 +00:00
Brian Hirsh
22ccf448e8 Revert D34034848: free up dispatch key space (in C++)
Test Plan: revert-hammer

Differential Revision:
D34034848 (6690256021)

Original commit changeset: 9677ee2c0a1a

Original Phabricator Diff: D34034848 (6690256021)

fbshipit-source-id: fd50943d915ef813bb9f9ab278fb582429eea3b1
(cherry picked from commit 3acefee1cd)
2022-02-14 23:29:00 +00:00
Brian Hirsh
7f560fb3e0 Revert D34034847: DispatchKeySet perf improvements
Test Plan: revert-hammer

Differential Revision:
D34034847 (8aa3620d73)

Original commit changeset: a930e44513a7

Original Phabricator Diff: D34034847 (8aa3620d73)

fbshipit-source-id: 57b8b7dee252bb8d10316189a034517a28c42199
(cherry picked from commit c3151d4e73)
2022-02-14 23:29:00 +00:00
Brian Hirsh
f1a9650e4f Revert D34214953: Add new tls snapshot feature
Test Plan: revert-hammer

Differential Revision:
D34214953 (6199b5231f)

Original commit changeset: 7aa5d5e3540a

Original Phabricator Diff: D34214953 (6199b5231f)

fbshipit-source-id: 5d271e9a5ab021b8202402630dbf917b43c55421
(cherry picked from commit a12c630198)
2022-02-14 23:14:19 +00:00
Alban Desmaison
6199b5231f Add new tls snapshot feature (#72623)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72623

Test Plan: Imported from OSS

Reviewed By: samdow

Differential Revision: D34214953

Pulled By: albanD

fbshipit-source-id: 7aa5d5e3540a45a0ae70c5af3a4495c755908aa9
(cherry picked from commit dc0a1ab54a)
2022-02-14 20:46:54 +00:00
Brian Hirsh
8aa3620d73 DispatchKeySet perf improvements (#72403)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72403

reland of D33301590 (18cbe80f23)
ghstack-source-id: 148830729

Test Plan: CI, and running explicit mobile test: `buck test //fbandroid/instrumentation_tests/com/facebook/pytorch/bi_xray:instrumentation_test -c test.external_runner=tpx -- --regex 'testBIXRayModel.*PyTorchBIXRayInstrumentationTest' --force-remote-execution --run-disabled`

Reviewed By: albanD

Differential Revision: D34034847

fbshipit-source-id: a930e44513a76c0c82c9d27f0fc2d2a6d7d90cf9
(cherry picked from commit 7f1ea7584c)
2022-02-14 16:02:29 +00:00
Brian Hirsh
6690256021 free up dispatch key space (in C++) (#72402)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72402

The original PR had an array-out-of-bounds access in `DispatchKeyExtractor.cpp` that wasn't caught by ASAN and appeared to only manifest in a subset of android internal tests. After fixing the OOB access (and adding more asserts), I confirmed that the android internal test passes.

Reland of D33255193 (20b8653dfa)
ghstack-source-id: 148830728

Test Plan:
Steps to test:

(1) connect to a mobile OD

(2) run `one_world android emulator android-29` in a terminal to start the android emulator

(3) In a separate terminal, run the test: `buck test //fbandroid/instrumentation_tests/com/facebook/pytorch/bi_xray:instrumentation_test -c test.external_runner=tpx -- --regex 'testBIXRayModel.*PyTorchBIXRayInstrumentationTest' --force-remote-execution --run-disabled`

I also ran `buck test fbandroid/mode/dbg //fbandroid/instrumentation_tests/com/facebook/pytorch/bi_xray:instrumentation_test`, which failed before and passed after the PR.

Reviewed By: albanD

Differential Revision: D34034848

fbshipit-source-id: 9677ee2c0a1afd1183896f7055009445712523c5
(cherry picked from commit 9ab9b12d35)
2022-02-14 16:02:29 +00:00
Jacob Szwejbka
791e7df7d9 Back out "free up dispatch key space (in C++)"
Summary: I think this diff stack broke all the related tasks below.

Test Plan:
For our failing tests:

buck test //fbandroid/instrumentation_tests/com/facebook/pytorch/bi_xray:instrumentation_test -c test.external_runner=tpx -- --regex 'testBIXRayModel.*PyTorchBIXRayInstrumentationTest' --force-remote-execution --run-disabled

For the ubn:

Not really sure what to do, trying to build the app and see if I can use an effect?

Reviewed By: shoumikhin

Differential Revision: D34018849

fbshipit-source-id: 3571718cb6621931af931b494e0a70d6e0164e65
(cherry picked from commit 3cc63cb2ea)
2022-02-05 01:25:42 +00:00
Jacob Szwejbka
888e3fbcb5 Back out "DispatchKeySet perf improvements"
Summary: D34018849

Test Plan: D34018849

Reviewed By: shoumikhin

Differential Revision: D34018840

fbshipit-source-id: a78e3ea5b8ac93e9e002e2583961fd3a545a0abd
(cherry picked from commit 57b7c51f74)
2022-02-05 01:06:33 +00:00
Brian Hirsh
18cbe80f23 DispatchKeySet perf improvements (#70364)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70364

A bunch of optimizations I made while staring at callgrind, after the DispatchKeySet changes further down in this stack.

There are basically three optimizations in this PR:
- Making `DispatchKeySet`s constexpr (where previously they weren't)
- Condensing multiple keyset membership calls into a single function call
- Making `TensorImpl::layout()` a fastpath (see the sketch below). The common case is to return `kStrided`, but we were doing a bunch of checks before returning it in most cases.
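A minimal sketch, with made-up key bits, of the `layout()` fastpath idea: one constexpr keyset intersection replaces several separate membership checks.

```cpp
#include <cstdint>

enum class Layout : int8_t { Strided, Sparse, Mkldnn };

struct KeySetLike {
  uint64_t repr;
  constexpr explicit KeySetLike(uint64_t r) : repr(r) {}
  constexpr bool intersects(KeySetLike o) const { return (repr & o.repr) != 0; }
};

// illustrative bit assignments, constexpr so they fold at compile time
constexpr KeySetLike kSparseKeys{0b0011};
constexpr KeySetLike kMkldnnKeys{0b0100};
constexpr KeySetLike kNonStrided{kSparseKeys.repr | kMkldnnKeys.repr};

Layout layout(KeySetLike key_set) {
  if (!key_set.intersects(kNonStrided)) {
    return Layout::Strided;  // common case: a single branch
  }
  return key_set.intersects(kSparseKeys) ? Layout::Sparse : Layout::Mkldnn;
}
```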

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33301590

Pulled By: bdhirsh

fbshipit-source-id: 6ec28e66e7fe21f9decae317e8a4013dcf44e2fb
(cherry picked from commit 5defa1676e)
2022-02-04 17:57:38 +00:00
Brian Hirsh
20b8653dfa free up dispatch key space (in C++) (#69633)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69633

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33255193

Pulled By: bdhirsh

fbshipit-source-id: 79773e9c15bf4f2f27675121a49ff5ffd1375238
(cherry picked from commit eac0b13005)
2022-02-04 17:57:38 +00:00
Jin Luo
c83eaf5c26 add the default destructor of TensorImpl (#72190)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72190

# Context
Some compilers could not generate the destructor correctly.

# Mitigation
Add the default destructor.
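A minimal sketch of the mitigation: declare the destructor in the header and default it in one translation unit, so no compiler has to synthesize it implicitly.

```cpp
// header
struct TensorImplLike {
  virtual ~TensorImplLike();  // declared, not defined inline
};

// .cpp file: the single, explicitly defaulted definition
TensorImplLike::~TensorImplLike() = default;
```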

Test Plan: ^CI

Reviewed By: albanD

Differential Revision: D33936970

fbshipit-source-id: c21aa1cce8565d8c25389de8970880392737afb1
(cherry picked from commit 7ab4b8b14e)
2022-02-03 01:40:25 +00:00
Can Balioglu
f45e217c01 Consolidate the overloads of TensorImpl::shallow_copy_and_detach (#68953)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68953

This PR consolidates the almost identical lvalue and rvalue implementations of shallow_copy_and_detach into a single templated function.
ghstack-source-id: 147238376
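A minimal sketch, with illustrative types, of the consolidation: a single forwarding-reference template serves both the lvalue and rvalue overloads.

```cpp
#include <memory>
#include <utility>

struct VersionCounter { int value = 0; };

struct TensorImplLike {
  VersionCounter vc;

  // VC deduces to const VersionCounter& (copy) or VersionCounter (move)
  template <typename VC>
  std::unique_ptr<TensorImplLike> shallow_copy_and_detach_core(VC&& counter) const {
    auto copy = std::make_unique<TensorImplLike>(*this);
    copy->vc = std::forward<VC>(counter);
    return copy;
  }

  std::unique_ptr<TensorImplLike> shallow_copy_and_detach(
      const VersionCounter& c) const {
    return shallow_copy_and_detach_core(c);
  }
  std::unique_ptr<TensorImplLike> shallow_copy_and_detach(
      VersionCounter&& c) const {
    return shallow_copy_and_detach_core(std::move(c));
  }
};
```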

Test Plan: Run existing unit tests.

Reviewed By: fduwjj

Differential Revision: D32679741

fbshipit-source-id: 89a870335d2e09ffd005c943733a787d20d352f9
(cherry picked from commit 750344c860)
2022-01-19 21:52:13 +00:00