Commit Graph

25787 Commits

Author SHA1 Message Date
Edward Z. Yang
f7365eca90 Add unbacked symints support; item works now (#90624)
The big idea is to add `create_unbacked_symfloat` and `create_unbacked_symint` to ShapeEnv, allowing you to allocate symbolic floats/ints corresponding to data you don't know about at compile time. Then, instead of immediately erroring out when you try to call local_scalar_dense on a FakeTensor, we instead create a fresh symint/symfloat and return that.

There are a bunch of odds and ends that need to be handled:

* A number of `numel` calls converted to `sym_numel`
* When we finally return from item(), we need to ensure we actually produce a SymInt/SymFloat when appropriate. The previous binding code assumed that you would have to get a normal Python item. I add a pybind11 binding for Scalar (to PyObject only) and refactor the code to use that. There is some trickiness where you are NOT allowed to go through c10::SymInt if there isn't actually any SymInt involved. See comment.
* One of our unit tests tripped an implicit data dependent access which occurs when you pass a Tensor as an argument to a sizes parameter. This is also converted to support symbolic shapes
* We now support tracking bare SymInt/SymFloat returns in proxy tensor mode (this was already in symbolic-shapes branch)
* Whenever we allocate an unbacked symint, we record the stack trace it was allocated at. These get printed when you attempt data dependent access on the symint (e.g., you try to guard on it)
* Subtlety: unbacked symints are not necessarily > 1. I added a test for this.

These unbacked symints are not very useful right now as you will almost always immediately raise an error later when you try to guard on them. The next logical step is adding an assertion refinement system that lets ShapeEnv learn facts about unbacked symints so it can do a better job eliding guards that are unnecessary.
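
Below is a minimal sketch of the new ShapeEnv entry points described above; the method names come from this description, but the exact return types and guard behavior are assumptions for illustration, not a verbatim API reference.

```python
# Hedged sketch: allocate data-dependent symbolic values directly on a ShapeEnv.
from torch.fx.experimental.symbolic_shapes import ShapeEnv

shape_env = ShapeEnv()

# These are the entry points this PR adds; each call mints a fresh symbol whose
# value is unknown at compile time (this is what FakeTensor.item() now returns).
u0 = shape_env.create_unbacked_symint()
f0 = shape_env.create_unbacked_symfloat()

# Guarding on an unbacked symbol (e.g. `if u0 > 5:`) is expected to raise a
# data-dependent-access error that prints the allocation stack trace.
```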

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90624
Approved by: https://github.com/Skylion007, https://github.com/voznesenskym
2022-12-12 13:33:07 +00:00
Michael Voznesensky
5adc18dcbc Shape guard structure (#90679)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90679
Approved by: https://github.com/ezyang
2022-12-12 09:50:00 +00:00
Yanbo Liang
2e0ce24890 [Dynamo] Support access nn.Module keys (#90502)
Fixes https://github.com/pytorch/torchdynamo/issues/1973

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90502
Approved by: https://github.com/jansel
2022-12-12 09:15:42 +00:00
Yuxin Wu
c8ed84ad06 Fix a static initialization order fiasco in c10d (#90149)
The `TORCH_LIBRARY_IMPL` registrations in `OpsImpl.cpp` need to happen after `ProcessGroup` is registered as a torch class -- which happens in `Ops.cpp`. However, the order of the registrations is undefined between the two files.

If the registration in `OpsImpl.cpp` runs before `Ops.cpp`, we get a crash at program launch similar to #83255. This happens in our internal build.

This PR moves `OpsImpl.cpp` to the end of `Ops.cpp` because, according to the omniscient lord of ChatGPT:
<img width="600" alt="2022-12-04_19-25" src="https://user-images.githubusercontent.com/1381301/205542847-3535b319-3c2a-4e8e-bc11-27913f6afb39.png">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90149
Approved by: https://github.com/kwen2501, https://github.com/H-Huang, https://github.com/soumith
2022-12-12 08:21:54 +00:00
XiaobingSuper
4ca2fc485c inductor(CPU): add Conv+binary+unary fusion filter (#90259)
For Conv+binary+unary fusion, we currently only support conv+add+relu; this PR adds such a check to fix failing TIMM models.
TODO: enable more Conv+binary+unary fusion to improve TIMM models' performance.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90259
Approved by: https://github.com/EikanWang, https://github.com/jgong5, https://github.com/jansel
2022-12-12 06:04:55 +00:00
Bert Maher
c318de4274 [dynamo] Get GPU names without calling nvidia-smi (#90474)
Believe it or not, inductor can sometimes be used on machines that
have CUDA GPUs but no nvidia-smi.  Let's use torch APIs instead of subprocess.
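
A small sketch of the torch-API approach the commit describes; this is an illustration, not necessarily the exact call dynamo uses.

```python
import torch

# Query device names through torch instead of shelling out to nvidia-smi.
if torch.cuda.is_available():
    names = [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]
    print(names)  # e.g. ['NVIDIA A100-SXM4-40GB', ...]
```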

Differential Revision: [D41841930](https://our.internmc.facebook.com/intern/diff/D41841930)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90474
Approved by: https://github.com/voznesenskym, https://github.com/anijain2305
2022-12-12 05:31:50 +00:00
Bert Maher
b95ea4f149 [pt2] Reset dynamo log level when exiting inductor debug context (#90473)
When entering an inductor debug context we increase the log level of
dynamo; I guess this makes sense, since if we're debugging inductor, and
inductor calls into dynamo, we probably want visibility into what dynamo is
doing.

But when we exit that context, we probably want to go back to whatever level of
dynamo-specific logging was in place before.  Dynamo generates lots of debug
info (guards, bytecode), and it's a lot to sift through if you're not
specifically interested in it.
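
A minimal sketch of the save/restore pattern this change applies; the logger name used here is an assumption for illustration, not the exact code path inside the debug context.

```python
import contextlib
import logging

@contextlib.contextmanager
def verbose_dynamo(level=logging.DEBUG):
    log = logging.getLogger("torch._dynamo")  # assumed logger name
    prev = log.level                # remember whatever was configured before
    log.setLevel(level)             # crank up verbosity while debugging inductor
    try:
        yield
    finally:
        log.setLevel(prev)          # restore the previous level on exit
```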

Differential Revision: [D41841879](https://our.internmc.facebook.com/intern/diff/D41841879)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90473
Approved by: https://github.com/mlazos, https://github.com/jansel
2022-12-12 04:39:37 +00:00
Bert Maher
d3d85e1c3b Emit torch.cuda.synchronize() after every kernel call in inductor (#90472)
Debugging illegal memory accesses is hard; even CUDA_LAUNCH_BLOCKING=1 and
C10_CUDA_KERNEL_LAUNCH_CHECK don't necessarily get you a stack trace pointing to
the right kernel.  This diff adds a config option to force a CUDA synchronize
after every kernel call in inductor, for debugging those tricky cases.
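
A hedged sketch of how the option might be toggled; the config attribute name here is an assumption based on the description above, not confirmed by this commit message.

```python
import torch._inductor.config as inductor_config

# Assumed knob name: force a device sync after each generated kernel launch so an
# illegal memory access surfaces right at the offending kernel.
inductor_config.triton.debug_sync_kernel = True
```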

Differential Revision: [D41744967](https://our.internmc.facebook.com/intern/diff/D41744967)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90472
Approved by: https://github.com/jansel
2022-12-12 04:35:10 +00:00
Edward Z. Yang
8fd31ac4da Preserve original GraphArgs for shape guard codegen (#90665)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90665
Approved by: https://github.com/voznesenskym
2022-12-12 02:35:23 +00:00
Edward Z. Yang
9447005ae3 Improve dynamo debug logging (#90664)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90664
Approved by: https://github.com/voznesenskym
2022-12-12 02:35:23 +00:00
Edward Z. Yang
450bd282e0 Slightly improve error messages on sympy failure (#90655)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90655
Approved by: https://github.com/Skylion007, https://github.com/voznesenskym
2022-12-12 01:58:34 +00:00
Yuxin Wu
8127724c3b Skip some unittests (#90609)
* Skip a unittest that needs FFT if not built with FFT
* Mark a test with "slow": `python test/test_ops.py -k TestCompositeComplianceCUDA.test_forward_ad_svd_lowrank_cuda_float32` took >5min on my machine.
* Skip a flaky test that's marked "expectedFailure", similar to #90233
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90609
Approved by: https://github.com/soumith
2022-12-11 23:53:05 +00:00
Michael Voznesensky
11442accc6 Make torch._guards, shuffle structures around for migration (#90636)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90636
Approved by: https://github.com/ezyang
2022-12-11 23:16:07 +00:00
Edward Z. Yang
e33f1eeeb7 SymIntify resize_ and deduplicate memory format logic (#90442)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90442
Approved by: https://github.com/bdhirsh
2022-12-11 14:38:38 +00:00
PyTorch MergeBot
15a4c60383 Revert "Make torch._guards, shuffle structures around for migration (#90636)"
This reverts commit 933b6c4eed.

Reverted https://github.com/pytorch/pytorch/pull/90636 on behalf of https://github.com/huydhn due to Breaking lint on master. Please rebase and run lintrunner -a before re-merging the PR
2022-12-11 10:15:47 +00:00
Shen Li
7ec1cb8553 [FSDP] Fix _pre_forward type annotation (#90621)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90621
Approved by: https://github.com/awgu, https://github.com/Skylion007
2022-12-11 06:39:38 +00:00
Shen Li
80542add73 [FSDP] Allow MixedPrecision to skip inputs (#90620)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90620
Approved by: https://github.com/rohan-varma, https://github.com/awgu
2022-12-11 06:39:38 +00:00
Andrew Gu
31351c61dd [FSDP] Tighten post-bwd cast to reduce_dtype (#90615)
This lowers the `reduce_dtype` retrieval to the `handle` instead of the `state` in preparation for `fully_shard`, and this adds a guard to avoid a no-op `to()` call.

Note that this change pretty much gets overridden in following PRs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90615
Approved by: https://github.com/rohan-varma
2022-12-11 06:39:34 +00:00
Michael Voznesensky
933b6c4eed Make torch._guards, shuffle structures around for migration (#90636)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90636
Approved by: https://github.com/ezyang
2022-12-11 06:04:17 +00:00
Rohan Varma
c7d2fb7f86 Adopt state_dict_pre_hook in FSDP (#90436)
Use register_state_dict_pre_hook in FSDP to simplify state_dict implementations & remove hacks. This removes `def state_dict` entirely and paves the path for composable API as well.
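
For context, a minimal sketch of the `nn.Module` state-dict pre-hook mechanism FSDP adopts here; the hook name, body, and signature are assumptions for illustration.

```python
import torch.nn as nn

def _my_pre_state_dict_hook(module, prefix, keep_vars):
    # Runs right before state_dict() collects this module's tensors,
    # e.g. a chance to unshard or patch parameters first.
    pass

m = nn.Linear(4, 4)
m.register_state_dict_pre_hook(_my_pre_state_dict_hook)
sd = m.state_dict()  # the hook fires before the dict is populated
```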

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90436
Approved by: https://github.com/fegin
2022-12-11 03:54:26 +00:00
Andrew Gu
e7efeb5282 [FSDP] Save _stream_to_name for debugging (#90611)
This saves a data structure `_stream_to_name: Dict[torch.cuda.Stream, str]` that maps each FSDP stream to its name. This can help in debugging by checking `_stream_to_name[torch.cuda.current_stream()]` to see if it is `"default"` or `"unshard"` in the post-backward hook for example.
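
A hypothetical sketch of the debugging pattern described above (requires a CUDA device); the names are illustrative, not FSDP's internal attributes.

```python
import torch

unshard_stream = torch.cuda.Stream()
_stream_to_name = {
    torch.cuda.current_stream(): "default",
    unshard_stream: "unshard",
}

def current_stream_name():
    # e.g. call this from a post-backward hook to see which stream it runs in
    return _stream_to_name.get(torch.cuda.current_stream(), "unknown")

print(current_stream_name())  # "default" here, outside any stream context
```
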
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90611
Approved by: https://github.com/rohan-varma
2022-12-11 03:46:18 +00:00
Andrew Gu
9eccfedca2 [Reland][FSDP] Another fix for DTensor, use_orig_params=True (#90562)
This is a reland of https://github.com/pytorch/pytorch/pull/89845 with nothing changed. This should avoid the internal breakage now that `DTensor` does not import `torchgen` (https://github.com/pytorch/pytorch/pull/90106).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90562
Approved by: https://github.com/fduwjj
2022-12-10 22:50:30 +00:00
Shen Li
a69cdd9cf8 Add global registry to composable API contract (#90579)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90579
Approved by: https://github.com/awgu, https://github.com/yhcharles
2022-12-10 22:41:10 +00:00
Aaron Gokaslan
12671fe620 Reserve space for std::vector output in extract_tensors for nccl python bindings (#88203)
Optimizes the nccl python bindings to reserve space when converting PyObject* into Tensors. This should reduce the number of unnecessary allocations in the nccl bindings as the std::vector grows.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88203
Approved by: https://github.com/ezyang
2022-12-10 20:28:19 +00:00
Sergii Dymchenko
9ef1d55e6b Fix non-existing parameters in docstrings in torch/nn (#90596)
This is a continuation of https://github.com/pytorch/pytorch/pull/90505

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90596
Approved by: https://github.com/lezcano
2022-12-10 14:37:31 +00:00
Edward Z. Yang
45109ec30a Completely redo how ShapeEnv guards are generated (#90528)
Instead of inferring shape mappings from a bunch of data structures that were plumbed in InstructionTranslator, we instead work out mappings by just iterating over the GraphArgs and mapping symbols to arguments as they show up. If multiple argument sizes/strides/offset map to the same symbol, this means they are duck sized, so we also generate extra equality tests that they must be equal. Finally, we generate 0/1 specialization guards. The resulting code is much shorter, and I think also easier to understand.

TODO: Delete all the tensor ref tracking code, it's unnecessary

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90528
Approved by: https://github.com/voznesenskym
2022-12-10 13:35:04 +00:00
Edward Z. Yang
49c674e155 Revert guaranteed symint allocation (#90381)
So, uh, I have a new strategy for generating dupe guards, one where I don't actually need to allocate symints for every tensor that is fakeified. So I'm reverting the changes I made from earlier PRs in this one.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90381
Approved by: https://github.com/voznesenskym
2022-12-10 13:17:34 +00:00
Edward Z. Yang
b68dead20c Keep track of source name on all allocated SymInts (#90295)
Wow, I had to sweat so much to get this PR out lol.

This PR enforces the invariant that whenever we allocate SymInts as part of fakeification, the SymInt is associated with a Source, and in fact we store the string source name on SymbolWithSourceName. We use 'sname' as the shorthand for source name, as 'name' is already used by sympy to name symbols.

In order to store source names, we have to plumb source names from Dynamo to PyTorch. This made doing this PR a bit bone crushing, because there are many points in the Dynamo codebase where we are improperly converting intermediate tensors into fake tensors, where there is no source (and there cannot be, because it's a frickin' intermediate tensor). I've fixed all of the really awful cases in earlier PRs in the stack. This PR is just plumbing in source names from places where we do have it.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90295
Approved by: https://github.com/voznesenskym
2022-12-10 13:17:34 +00:00
blzheng
f9aa099074 [Inductor] fix issue: redeclaration of float g_tmp_buffer_xxx (#90270)
This PR fixes the issue: redeclaration of 'float g_tmp_buffer_in_ptr1[16] = {0};'.
If a bool or uint8 tensor is used by multiple ops, it will be loaded multiple times. Each load writes the declaration of this variable, i.e., `self.loads.writeline(f"float {g_tmp_buf}[{nelements}] = {{0}};")`, which introduces a redeclaration error.

![image](https://user-images.githubusercontent.com/69951214/205869956-5c325761-dc09-4aa8-a9ed-fad7f4c85917.png)
![image](https://user-images.githubusercontent.com/69951214/205870695-ee252f17-8f54-484f-9b0a-3a424c479327.png)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90270
Approved by: https://github.com/EikanWang, https://github.com/jgong5, https://github.com/desertfire, https://github.com/jansel
2022-12-10 12:59:30 +00:00
Jiewen Tan
5a665a39d1 [LTC] Make some LazyGraphExecutor private data structures protected (#90598)
Summary:
This pull request makes some LazyGraphExecutor private data structures protected such that XLAGraphExecutor can reuse them.

Here is the list:
1. DeviceLocker.
2. DeviceLockerArena.
3. DataCacheArena.

In addition, it also introduces LazyGraphExecutor::ResetTrimCounter() such that XLAGraphExecutor can reuse the trim counter.

Test Plan:
CI.

P.S. This is to re-land #90457.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90598
Approved by: https://github.com/JackCaoG
2022-12-10 08:19:12 +00:00
Zachary DeVito
3b3ed25109 Add a way to visualize memory snapshot traces (#90348)
This adds a d3-based interactive visualization for exploring the memory
allocation traces that the caching allocator can capture. This visualization
code can also be attached to kineto trace information in the future to also
provide visualization for the memory events captured there, which come with
additional information about the graph.
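
A hedged sketch of capturing an allocator trace to feed this visualization; these are private APIs whose exact signatures may vary by version, and a CUDA device is required.

```python
import pickle
import torch

torch.cuda.memory._record_memory_history(True)  # start recording allocation events

x = torch.randn(1024, 1024, device="cuda")
del x

snapshot = torch.cuda.memory._snapshot()        # segments plus the captured trace
with open("snapshot.pickle", "wb") as f:
    pickle.dump(snapshot, f)                    # load this file into the d3 viewer
```
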
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90348
Approved by: https://github.com/robieta
2022-12-10 02:45:11 +00:00
Yanli Zhao
2bac4d1fae [reland] add save and load stats in memory_tracker (#90510)
Reland of https://github.com/pytorch/pytorch/pull/90144; this PR removes the temporary path "memory.trace" in the unit test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90510
Approved by: https://github.com/rohan-varma
2022-12-10 01:39:22 +00:00
BowenBao
1b2c59ad24 [ONNX] Introduce ONNX reference evaluator for verification (#89808)
The reference evaluator requires ONNX >= 1.13. Running it in CI is blocked by being unable to bump the onnx submodule version, as in #83201. Local tests pass.
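
A standalone sketch of the ONNX reference evaluator being hooked into verification (needs onnx >= 1.13); the model path and input name are hypothetical.

```python
import numpy as np
import onnx
from onnx.reference import ReferenceEvaluator

model = onnx.load("model.onnx")                    # hypothetical exported model
evaluator = ReferenceEvaluator(model)
outputs = evaluator.run(None, {"input": np.ones((1, 3), dtype=np.float32)})
print(outputs)
```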

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89808
Approved by: https://github.com/justinchuby
2022-12-10 01:29:12 +00:00
Wanchao Liang
7afba50508 [dtensor] delete unused torch_function (#90449)
torch_function is not actually used today; delete it for now, and we can revisit once we really need it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90449
Approved by: https://github.com/fduwjj
2022-12-10 01:29:02 +00:00
BowenBao
79f9672249 [ONNX] Use VerificationOptions to wrap option arguments (#89807)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89807
Approved by: https://github.com/justinchuby, https://github.com/titaiwangms
2022-12-09 23:49:51 +00:00
Angela Yi
6de216a2e8 [fx] Have replace_pattern return replaced nodes (#90244)
Summary: Modified replace_pattern in the subgraph rewriter to return a list of pairs of matches along with their corresponding replacement nodes in the modified graph (`List[Tuple[Match, List[Node]]]`). This allows us to easily modify the replaced nodes, including setting the metadata.
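
A small usage sketch of the FX subgraph rewriter this change extends; per the summary, the return value now also exposes the replacement nodes (noted only in comments here).

```python
import torch
from torch.fx import symbolic_trace, subgraph_rewriter

def pattern(x, y):
    return torch.add(x, y)

def replacement(x, y):
    return torch.sub(x, y)

class M(torch.nn.Module):
    def forward(self, x, y):
        return torch.add(x, y) * 2

gm = symbolic_trace(M())
matches = subgraph_rewriter.replace_pattern(gm, pattern, replacement)
# After this change, each entry in `matches` also carries the nodes that were
# inserted as replacements, so callers can fix up metadata on them.
print(matches)
```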

Test Plan: CI

Differential Revision: D41737056

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90244
Approved by: https://github.com/SherlockNoMad
2022-12-09 23:43:16 +00:00
Jiawen Liu
4a1633ca69 [Inductor] GEMM Shape Padding Optimization (#90425)
Summary:
Optimize the shape padding in the following perspectives:
- Add BFloat16 support for AMP training and Float16 support for inference
- Optimize the microbenchmark to avoid peak-memory issues, and include profiling of memory ops to make more accurate decisions
- Add a flag to turn padding of dims N and M in `torch.bmm` on/off, since the `.contiguous()` memory copy is expensive and can cause peak-memory issues in internal models

Test Plan: CI

Differential Revision: D41724868

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90425
Approved by: https://github.com/jianyuh
2022-12-09 22:48:02 +00:00
PyTorch MergeBot
b7dfbf876f Revert "[LTC] Make some LazyGraphExecutor private data structures protected (#90457)"
This reverts commit 93aa6e3e36.

Reverted https://github.com/pytorch/pytorch/pull/90457 on behalf of https://github.com/clee2000 due to broke xla somehow 93aa6e3e36 https://github.com/pytorch/pytorch/actions/runs/3659842773/jobs/6186552659
2022-12-09 22:28:24 +00:00
Angela Yi
02eb0bdbc1 [fx] Added better tests to pass infra (#90432)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90432
Approved by: https://github.com/SherlockNoMad
2022-12-09 21:43:18 +00:00
Sergii Dymchenko
f51f6aa387 Fix non-existing parameters in docstrings (#90505)
Continuation after https://github.com/pytorch/pytorch/pull/90163.

Here is a script I used to find all the non-existing arguments in the docstrings (the script can give false positives in the presence of *args/**kwargs or decorators):

_Edit:_
I've realized that the indentation is wrong for the last `break` in the script, so the script only gives output for a function if the first docstring argument is wrong. I'll create a separate PR if I find more issues with the corrected script.

``` python
import ast
import os
import docstring_parser

for root, dirs, files in os.walk('.'):
    for name in files:
        if root.startswith("./.git/") or root.startswith("./third_party/"):
            continue
        if name.endswith(".py"):
            full_name = os.path.join(root, name)
            with open(full_name, "r") as source:
                tree = ast.parse(source.read())
                for node in ast.walk(tree):
                    if isinstance(node, ast.FunctionDef):
                        all_node_args = node.args.args
                        if node.args.vararg is not None:
                            all_node_args.append(node.args.vararg)
                        if node.args.kwarg is not None:
                            all_node_args.append(node.args.kwarg)
                        if node.args.posonlyargs is not None:
                            all_node_args.extend(node.args.posonlyargs)
                        if node.args.kwonlyargs is not None:
                            all_node_args.extend(node.args.kwonlyargs)
                        args = [a.arg for a in all_node_args]
                        docstring = docstring_parser.parse(ast.get_docstring(node))
                        doc_args = [a.arg_name for a in docstring.params]
                        clean_doc_args = []
                        for a in doc_args:
                            clean_a = ""
                            for c in a.split()[0]:
                                if c.isalnum() or c == '_':
                                    clean_a += c
                            if clean_a:
                                clean_doc_args.append(clean_a)
                        doc_args = clean_doc_args
                        for a in doc_args:
                            if a not in args:
                                print(full_name, node.lineno, args, doc_args)
                            break  # NOTE: mis-indented (see the edit note above); only the first documented arg is checked

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90505
Approved by: https://github.com/malfet, https://github.com/ZainRizvi
2022-12-09 21:43:09 +00:00
Driss Guessous
912748e3b7 [SDP] Fix alignment check for efficient_attention (#90413)
Fixes a bug found using head_dim_size==100 on an A100 GPU. This PR contains stricter guards on the input shape. These constraints are taken from xformers: https://github.com/facebookresearch/xformers/blob/gh/danthe3rd/60/orig/xformers/ops/fmha/cutlass.py#L23
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90413
Approved by: https://github.com/mikekgfb
2022-12-09 21:09:25 +00:00
Michael Lazos
9c4189f82d [dynamo] Add is_compiling for dynamo (#90329)
`is_compiling` returns True during dynamo tracing and False when run in eager mode
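
A minimal sketch of the intended usage, assuming the helper is exposed as `torch._dynamo.is_compiling` (per the title); the eager backend is used only to show the branch flip under tracing.

```python
import torch
import torch._dynamo as dynamo

def f(x):
    if dynamo.is_compiling():
        return x + 1   # branch taken while dynamo traces/compiles
    return x - 1       # branch taken in plain eager execution

print(f(torch.ones(2)))                             # eager path
print(dynamo.optimize("eager")(f)(torch.ones(2)))   # compiled path
```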

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90329
Approved by: https://github.com/jansel
2022-12-09 20:19:41 +00:00
Shen Li
082450609c [FSDP] Allow nested FSDP wrapper to use different mixed precision (#90523)
The main change is to move the `args` and `kwargs` dtype conversion
from `_root_pre_forward` to `_pre_forward`, so that every
FSDP instance has a chance to apply its own precision.
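
A hedged sketch of the nested-wrapping pattern this enables; it assumes `torch.distributed` is already initialized (e.g. via torchrun) and a CUDA device is available.

```python
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, MixedPrecision

model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 8)).cuda()

# Inner wrapper casts its args/params to bfloat16...
model[0] = FSDP(model[0], mixed_precision=MixedPrecision(param_dtype=torch.bfloat16))
# ...while the outer wrapper uses float16; with this change each _pre_forward
# applies its own wrapper's precision instead of only the root's.
model = FSDP(model, mixed_precision=MixedPrecision(param_dtype=torch.float16))
```
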
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90523
Approved by: https://github.com/awgu, https://github.com/rohan-varma
2022-12-09 20:06:05 +00:00
mfkasim1
eedf7a4989 Log1p complex for CUDA (#90422)
Another pull request in the direction of solving #89205: log1p for complex numbers in CUDA.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90422
Approved by: https://github.com/lezcano
2022-12-09 19:53:22 +00:00
PyTorch MergeBot
b2795d3c4e Revert "[inductor] New approach for computing triton load/store masks (#89566)"
This reverts commit c6c2de586d.

Reverted https://github.com/pytorch/pytorch/pull/89566 on behalf of https://github.com/clee2000 due to broke test_invalid_operand_issue1_cuda in inductor/test_torchinductor on https://github.com/pytorch/pytorch/actions/runs/3657444733/jobs/6181700572
2022-12-09 19:36:25 +00:00
Frederik Gerzer
09ccda0d94 Fix: Make __len__ of datapipes dynamic (#88302)
Fixes #88074

Several datapipes have their lengths cached the first time they are executed. However, source datapipes might change in length (most prominently, whenever `apply_sharding` is called). The behaviour is counter-intuitive because we do not expect `__len__` to have side effects.

This PR makes `__len__` dynamically computed.

Changes:
- Add note to the `datapipes` README that `__len__` should be dynamic and why.
- Remove caching of length computations in `ConcaterIterDataPipe`, `MultiplexerIterDataPipe`, `ZipperIterDataPipe`, `BatcherIterDataPipe`, `ConcaterMapDataPipe`, and `BatcherMapDataPipe`.
- This required removal of the `length` attribute in setstate/getstate of `MultiplexerIterDataPipe`. I am unsure whether to remove this completely and risk breaking saved checkpoints (as I did) or whether to just ignore the `length` of the loaded `state`.
- This also means the classes above no longer have a `length` attribute. I have found no uses of this, though.
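
A small sketch of the now-dynamic behavior for one of the datapipes listed above; the numbers are illustrative.

```python
from torch.utils.data.datapipes.iter import IterableWrapper

a = IterableWrapper(list(range(4)))
b = IterableWrapper(list(range(6)))
combined = a.concat(b)   # ConcaterIterDataPipe
print(len(combined))     # 10, recomputed from the source pipes on every call
```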

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88302
Approved by: https://github.com/NivekT
2022-12-09 19:15:53 +00:00
Jiewen Tan
93aa6e3e36 [LTC] Make some LazyGraphExecutor private data structures protected (#90457)
Summary:
This pull request makes some LazyGraphExecutor private data structures protected such that XLAGraphExecutor can reuse them.

Here is the list:
1. DeviceLocker.
2. DeviceLockerArena.
3. DataCacheArena.

In addition, it also introduces LazyGraphExecutor::ResetTrimCounter() such that XLAGraphExecutor can reuse the trim counter.

Test Plan:
CI.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90457
Approved by: https://github.com/JackCaoG
2022-12-09 18:28:13 +00:00
Michael Lazos
730e44bbc7 Add logging for aot autograd and unified debug flag (#88987)
- Adds `log_level` to aot's config
- Outputs the log to `<graph_name>_<log_level>.log` in the aot_torchinductor subfolder of the debug directory
- Modifies the Inductor debug context to use the graph name instead of the OS pid when naming the folder
- Adds `TORCH_COMPILE_DEBUG` flag to enable it (as well as separate dynamo and inductor logs); see the sketch after this list
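
A hedged sketch of flipping the flag before compiling; where the artifacts land follows the bullets above, and the environment variable must be set before dynamo/inductor read their configs.

```python
import os
os.environ["TORCH_COMPILE_DEBUG"] = "1"  # set before torch._dynamo/_inductor are used

import torch
import torch._dynamo as dynamo

def f(x):
    return torch.sin(x) + 1

compiled = dynamo.optimize("inductor")(f)
compiled(torch.randn(8))
# Dynamo, AOT, and inductor debug logs/artifacts are written under the debug directory.
```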

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88987
Approved by: https://github.com/Chillee
2022-12-09 17:28:10 +00:00
Bin Bao
282dfe8ba4 [inductor][Reland] Use decomposition for _to_copy (#90494)
Summary: also contains a fix for https://github.com/pytorch/pytorch/issues/89633

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90494
Approved by: https://github.com/ngimel
2022-12-09 16:51:50 +00:00
PyTorch MergeBot
6581063583 Revert "Dynamo, FX, Inductor Progress Bars (#88384)"
This reverts commit db0ce4acf3.

Reverted https://github.com/pytorch/pytorch/pull/88384 on behalf of https://github.com/malfet due to Broke test_public_bindings across the board
2022-12-09 16:32:25 +00:00