The big idea is to add `create_unbacked_symfloat` and `create_unbacked_symint` to ShapeEnv, allowing you to allocate symbolic floats/ints corresponding to data you don't know about at compile time. Then, instead of immediately erroring out when you try to call local_scalar_dense on a FakeTensor, we create a fresh symint/symfloat and return it.
There are a bunch of odds and ends that need to be handled:
* A number of `numel` calls converted to `sym_numel`
* When we finally return from item(), we need to ensure we actually produce a SymInt/SymFloat when appropriate. The previous binding code assumed you would always get a plain Python number. I add a pybind11 binding for Scalar (to PyObject only) and refactor the code to use it. There is some trickiness: you are NOT allowed to go through c10::SymInt if there isn't actually any SymInt involved. See the comment.
* One of our unit tests tripped an implicit data-dependent access, which occurs when you pass a Tensor as an argument to a sizes parameter. That path is also converted to support symbolic shapes.
* We now support tracking bare SymInt/SymFloat returns in proxy tensor mode (this was already in the symbolic-shapes branch).
* Whenever we allocate an unbacked symint, we record the stack trace it was allocated at. These get printed when you attempt data-dependent access on the symint (e.g., when you try to guard on it).
* Subtlety: unbacked symints are not necessarily > 1. I added a test for this.
These unbacked symints are not very useful right now, as you will almost always hit an error the moment you try to guard on them. The next logical step is adding an assertion refinement system that lets ShapeEnv learn facts about unbacked symints, so it can do a better job of eliding unnecessary guards.
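Below is a minimal sketch of what allocating such a symbol looks like, assuming the `ShapeEnv` API in `torch.fx.experimental.symbolic_shapes`; the exact module path, signatures, and return types may differ at this point in the stack.
```python
# A minimal sketch, assuming ShapeEnv from torch.fx.experimental.symbolic_shapes
# (exact signatures may differ at this point in the stack).
from torch.fx.experimental.symbolic_shapes import ShapeEnv

shape_env = ShapeEnv()
# An unbacked SymInt: no backing value is known at compile time, and it is not
# assumed to be > 1 (see the subtlety above).
u0 = shape_env.create_unbacked_symint()
# Guarding on it (e.g. bool(u0 > 1)) is a data-dependent access and raises, printing
# the stack trace that was recorded when the symbol was allocated.
```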
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90624
Approved by: https://github.com/Skylion007, https://github.com/voznesenskym
* Skip a unittest that needs FFT if not built with FFT
* Mark a test with "slow": `python test/test_ops.py -k TestCompositeComplianceCUDA.test_forward_ad_svd_lowrank_cuda_float32` took >5min on my machine.
* Skip a flaky test that's marked "expectedFailure", similar to #90233
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90609
Approved by: https://github.com/soumith
This lowers the `reduce_dtype` retrieval from the `state` to the `handle` in preparation for `fully_shard`, and it adds a guard to avoid a no-op `to()` call.
Note that this change pretty much gets overridden in the following PRs.
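For context, a minimal sketch of the no-op guard idea, using hypothetical names rather than FSDP's actual code:
```python
# A minimal sketch of the no-op to() guard (hypothetical names; not FSDP's actual code).
def maybe_cast(tensor, reduce_dtype):
    # Only cast when it would actually change the dtype; otherwise the .to() call
    # would be a no-op we can skip entirely.
    if reduce_dtype is not None and tensor.dtype != reduce_dtype:
        return tensor.to(reduce_dtype)
    return tensor
```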
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90615
Approved by: https://github.com/rohan-varma
Use register_state_dict_pre_hook in FSDP to simplify the state_dict implementations and remove hacks. This removes `def state_dict` entirely and paves the way for the composable API as well.
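For reference, a hedged sketch of the hook mechanism this builds on, assuming `nn.Module.register_state_dict_pre_hook` with a `hook(module, prefix, keep_vars)` signature:
```python
# Hedged sketch of nn.Module.register_state_dict_pre_hook (hook signature assumed to be
# hook(module, prefix, keep_vars)); FSDP registers hooks like this instead of overriding
# state_dict.
import torch
import torch.nn as nn

def pre_hook(module, prefix, keep_vars):
    # Runs right before state_dict() collects this module's entries.
    print(f"collecting state_dict entries under prefix {prefix!r}")

m = nn.Linear(4, 4)
m.register_state_dict_pre_hook(pre_hook)
sd = m.state_dict()
```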
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90436
Approved by: https://github.com/fegin
This saves a data structure `_stream_to_name: Dict[torch.cuda.Stream, str]` that maps each FSDP stream to its name. This can help with debugging, e.g. checking `_stream_to_name[torch.cuda.current_stream()]` in the post-backward hook to see whether it is `"default"` or `"unshard"`.
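A hypothetical debugging snippet using this mapping (the `_stream_to_name` attribute comes from the summary above and is private; `fsdp_state` stands in for wherever the FSDP state object lives):
```python
# Hypothetical debugging snippet; _stream_to_name is a private mapping and fsdp_state
# is a placeholder for the FSDP state object that holds it.
import torch

current = torch.cuda.current_stream()
name = fsdp_state._stream_to_name.get(current, "unknown")
print(f"running on FSDP stream: {name}")  # e.g. "default" or "unshard"
```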
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90611
Approved by: https://github.com/rohan-varma
Optimizes the NCCL Python bindings to reserve space up front when converting PyObject* inputs into Tensors. This should reduce the number of unnecessary allocations in the NCCL bindings as the std::vector grows.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88203
Approved by: https://github.com/ezyang
Instead of inferring shape mappings from a bunch of data structures plumbed through InstructionTranslator, we work out the mappings by iterating over the GraphArgs and mapping symbols to arguments as they show up. If multiple argument sizes/strides/offsets map to the same symbol, they are duck sized, so we also generate extra guards testing that they are equal. Finally, we generate 0/1 specialization guards. The resulting code is much shorter, and I think also easier to understand.
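To make the duck-sizing step concrete, here is a purely illustrative Python sketch; it is not Dynamo's actual code, and the source strings are made up for the example.
```python
# Purely illustrative sketch of duck sizing: sources that map to the same symbol
# must be equal at runtime, so equality guards are emitted between them.
symbol_to_sources = {}
for source, symbol in [("x.size()[0]", "s0"), ("y.size()[0]", "s0"), ("y.size()[1]", "s1")]:
    symbol_to_sources.setdefault(symbol, []).append(source)

guards = []
for symbol, sources in symbol_to_sources.items():
    first, *rest = sources
    guards.extend(f"{first} == {other}" for other in rest)

print(guards)  # ['x.size()[0] == y.size()[0]']
```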
TODO: Delete all the tensor ref tracking code, it's unnecessary
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90528
Approved by: https://github.com/voznesenskym
I have a new strategy for generating dupe guards, one where I don't actually need to allocate SymInts for every tensor that is fakeified. So in this PR I'm reverting the changes I made in the earlier PRs.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90381
Approved by: https://github.com/voznesenskym
Wow, I had to sweat so much to get this PR out lol.
This PR enforces the invariant that whenever we allocate SymInts as part of fakeification, the SymInt is associated with a Source, and in fact we store the string source name on SymbolWithSourceName. We use 'sname' as the shorthand for source name, as 'name' is already used by sympy to name symbols.
In order to store source names, we have to plumb source names from Dynamo into PyTorch. This made the PR a bit bone-crushing, because there are many points in the Dynamo codebase where we were improperly converting intermediate tensors into fake tensors, where there is no source (and there cannot be, because it's a frickin' intermediate tensor). I've fixed all of the really awful cases in earlier PRs in the stack; this PR just plumbs source names in from the places where we do have them.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90295
Approved by: https://github.com/voznesenskym
Summary:
This pull request makes some LazyGraphExecutor private data structures protected such that XLAGraphExecutor can reuse them.
Here is the list:
1. DeviceLocker.
2. DeviceLockerArena.
3. DataCacheArena.
In addition, this PR introduces LazyGraphExecutor::ResetTrimCounter() so that XLAGraphExecutor can reuse the trim counter.
Test Plan:
CI.
P.S. This is to re-land #90457.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90598
Approved by: https://github.com/JackCaoG
This adds a d3-based interactive visualization for exploring the memory allocation traces that the caching allocator can capture. The visualization code can also be attached to Kineto trace information in the future to visualize the memory events captured there, which come with additional information about the graph.
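A hedged sketch of capturing the allocator traces that feed such a visualization, using the private `torch.cuda.memory` APIs of this era (names and arguments may differ, and a CUDA device is required):
```python
# Hedged sketch: record allocation stack traces and take a snapshot of allocator state.
import torch

torch.cuda.memory._record_memory_history(True)  # start recording a trace per allocation
x = torch.randn(1024, 1024, device="cuda")
y = x @ x
snapshot = torch.cuda.memory._snapshot()  # blocks in the snapshot now carry their traces
# The d3 visualization is generated from a snapshot like this one.
```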
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90348
Approved by: https://github.com/robieta
Summary: Modified replace_pattern in the subgraph rewriter to return a list of pairs of matches along with their corresponding replacement nodes in the modified graph (`List[Tuple[Match, List[Node]]]`). This allows us to easily modify the replaced nodes, including setting the metadata.
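For context, a hedged sketch of the rewriter API this return type is aimed at; the pair-of-match-and-replacement-nodes shape is taken from the summary above rather than the final signature.
```python
# Hedged sketch of torch.fx.subgraph_rewriter.replace_pattern usage; the enriched return
# value described above makes it possible to set metadata on the replacement nodes.
import torch
from torch.fx import symbolic_trace, subgraph_rewriter

def f(x):
    return torch.relu(x) + 1

def pattern(x):
    return torch.relu(x)

def replacement(x):
    return torch.sigmoid(x)

gm = symbolic_trace(f)
results = subgraph_rewriter.replace_pattern(gm, pattern, replacement)
# Each result pairs the original match with the nodes that replaced it in gm.graph.
```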
Test Plan: CI
Differential Revision: D41737056
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90244
Approved by: https://github.com/SherlockNoMad
Summary:
Optimize the shape padding in the following ways (a short config sketch follows the list):
- Add BFloat16 support for AMP training and Float16 support for inference
- Optimize the microbenchmark to avoid peak-memory issues, and profile the memory ops as well so the padding decision is more accurate
- Add a flag to turn padding of dims N and M in `torch.bmm` on/off, since the `.contiguous()` memory copy is expensive and causes peak-memory issues in internal models
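A hedged sketch of toggling the Inductor shape-padding pass; `torch._inductor.config.shape_padding` is assumed to be the relevant knob, and the new bmm N/M flag is not shown because its exact name isn't given here.
```python
# Hedged sketch: enable Inductor's shape padding (config name assumed; may differ by version).
import torch._inductor.config as inductor_config

inductor_config.shape_padding = True  # pad matmul/bmm shapes to more GEMM-friendly sizes
```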
Test Plan: CI
Differential Revision: D41724868
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90425
Approved by: https://github.com/jianyuh
Continuation after https://github.com/pytorch/pytorch/pull/90163.
Here is a script I used to find all the non-existent arguments in the docstrings (the script can give false positives in the presence of *args/**kwargs or decorators):
_Edit:_
I've realized that the indentation of the last `break` in the script is wrong, so the script only reports a function if its first docstring argument is wrong. I'll create a separate PR if I find more issues with the corrected script.
``` python
import ast
import os
import docstring_parser
for root, dirs, files in os.walk('.'):
    for name in files:
        if root.startswith("./.git/") or root.startswith("./third_party/"):
            continue
        if name.endswith(".py"):
            full_name = os.path.join(root, name)
            with open(full_name, "r") as source:
                tree = ast.parse(source.read())
            for node in ast.walk(tree):
                if isinstance(node, ast.FunctionDef):
                    all_node_args = node.args.args
                    if node.args.vararg is not None:
                        all_node_args.append(node.args.vararg)
                    if node.args.kwarg is not None:
                        all_node_args.append(node.args.kwarg)
                    if node.args.posonlyargs is not None:
                        all_node_args.extend(node.args.posonlyargs)
                    if node.args.kwonlyargs is not None:
                        all_node_args.extend(node.args.kwonlyargs)
                    args = [a.arg for a in all_node_args]
                    docstring = docstring_parser.parse(ast.get_docstring(node))
                    doc_args = [a.arg_name for a in docstring.params]
                    clean_doc_args = []
                    for a in doc_args:
                        clean_a = ""
                        for c in a.split()[0]:
                            if c.isalnum() or c == '_':
                                clean_a += c
                        if clean_a:
                            clean_doc_args.append(clean_a)
                    doc_args = clean_doc_args
                    for a in doc_args:
                        if a not in args:
                            print(full_name, node.lineno, args, doc_args)
                        break
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90505
Approved by: https://github.com/malfet, https://github.com/ZainRizvi
Fixes #88074
Several datapipes cache their length after it is first computed. However, source datapipes might change in length (most prominently, whenever `apply_sharding` is called). This behaviour is counter-intuitive because we do not expect `__len__` to have side effects.
This PR makes `__len__` dynamically computed.
Changes:
- Add note to the `datapipes` README that `__len__` should be dynamic and why.
- Remove caching of length computations in `ConcaterIterDataPipe`, `MultiplexerIterDataPipe`, `ZipperIterDataPipe`, `BatcherIterDataPipe`, `ConcaterMapDataPipe`, and `BatcherMapDataPipe`.
- This required removal of the `length` attribute in setstate/getstate of `MultiplexerIterDataPipe`. I am unsure whether to remove this completely and risk breaking saved checkpoints (as I did) or whether to just ignore the `length` of the loaded `state`.
- This also means the classes above no longer have a `length` attribute. I have found no uses of this, though.
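A hedged illustration of the dynamic `__len__` behaviour, using `IterableWrapper` and the functional `.batch()` datapipe:
```python
# Hedged illustration: len() on a batcher pipe is now computed from the source each time.
from torch.utils.data.datapipes.iter import IterableWrapper

dp = IterableWrapper(range(10)).batch(2)
print(len(dp))  # 5, derived from the current length of the source pipe
# If the source pipe's length later changes (e.g. after apply_sharding), len(dp)
# reflects the new value instead of a stale cached one.
```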
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88302
Approved by: https://github.com/NivekT
Summary:
This pull request makes some LazyGraphExecutor private data structures protected such that XLAGraphExecutor can reuse them.
Here is the list:
1. DeviceLocker.
2. DeviceLockerArena.
3. DataCacheArena.
In addition, it introduces LazyGraphExecutor::ResetTrimCounter() so that XLAGraphExecutor can reuse the trim counter.
Test Plan:
CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90457
Approved by: https://github.com/JackCaoG
- Adds `log_level` to aot's config
- Outputs logs to `<graph_name>_<log_level>.log` in the aot_torchinductor subfolder of the debug directory
- Modifies the Inductor debug context to use the graph name when naming the folder instead of the OS pid
- Adds a `TORCH_COMPILE_DEBUG` flag to enable it, as well as separate dynamo and inductor logs (see the usage sketch below)
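A hedged usage sketch; the `TORCH_COMPILE_DEBUG` flag comes from the list above, while the `torch._dynamo.optimize` entry point and the exact directory layout are assumptions for this era of the stack.
```python
# Hedged usage sketch: set the flag before compiling so the debug artifacts get written.
import os
os.environ["TORCH_COMPILE_DEBUG"] = "1"

import torch
import torch._dynamo

def f(x):
    return torch.relu(x) + 1

compiled = torch._dynamo.optimize("inductor")(f)
compiled(torch.randn(8))
# Per-graph logs named <graph_name>_<log_level>.log land in the aot_torchinductor
# subfolder of the debug directory, alongside the separate dynamo and inductor logs.
```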
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88987
Approved by: https://github.com/Chillee