pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Animesh Jain	514f9279f8	[dynamo][compile-time] Manually implement nn.Module.__getattr__ to reduce compile time (#129315 ) # Compile time for eager backend ## AlbertForMaskedLM No inlining - 3.65 seconds Inlining on main - 7.48 seconds Inlining + this PR - 6.70 seconds ## MobileBertForMaskedLM No inlining - 26.90 seconds Inlining on main - 48.21 seconds Inlining + this PR - 43.85 seconds Next PR in the stack makes the total compile time better/comparable to no inlining Pull Request resolved: https://github.com/pytorch/pytorch/pull/129315 Approved by: https://github.com/jansel ghstack dependencies: #129316	2024-06-25 01:31:26 +00:00
Simon Fan	f0443ad174	[compiled autograd] flatten runtime inputs with fast path (#129116 ) covered by test_compiled_autograd.py and test_standalone_compile.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/129116 Approved by: https://github.com/jansel ghstack dependencies: #127960, #128905, #128982, #128987, #129181	2024-06-21 08:16:33 +00:00
Simon Fan	123812790b	[compiled autograd] update benchmarks to use cli flags for fullgraph/dynamic (#127960 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/127960 Approved by: https://github.com/jansel	2024-06-21 08:16:33 +00:00
Animesh Jain	6b5fbc544e	[dynamo] Use polyfill to trace through the attributes of torch.jit.* and lru_cache_wrapper (#128336 ) Earlier we were taking the vt for `obj` and then monkeypatching that `vt.source` to be `obj._torchdynamo_inline`. If one accesses `obj.attr_a`, this would cause problems because Dynamo would then search for it in `obj._torchdynamo_inline.attr_a`. This PR makes it more functional, so that we have different vts for `obj` and `obj._torchdynamo_inline`. Fixes https://github.com/pytorch/pytorch/issues/93698 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128336 Approved by: https://github.com/jansel, https://github.com/yanboliang ghstack dependencies: #129117	2024-06-21 07:44:44 +00:00
Yanbo Liang	acefc5c016	[torch.compile] Enable bwd compilation metrics (#128973 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/128973 Approved by: https://github.com/dshi7	2024-06-19 03:45:41 +00:00
Simon Fan	4b96575a09	[dynamo][aot autograd] Silently disable default saved tensor hooks during tracing (#123196 ) FIXES #113263. Same idea as in https://github.com/pytorch/pytorch/pull/113417, but we need a more intrusive C API to silently nop default saved tensor hooks, in order to support user code that uses torch.autograd.disable_saved_tensors_hooks (see test_unpack_hooks_can_be_disabled). We mock the output of get_hooks while leaving push/pop untouched. For compiled autograd, we're firing pack hooks once and unpack hooks twice right now; I'll look into this separately from this issue. Pull Request resolved: https://github.com/pytorch/pytorch/pull/123196 Approved by: https://github.com/soulitzer	2024-06-14 20:28:08 +00:00
chilli	c486e2ab64	Add coloring to fx graph print out (#128476 ) Note: Won't land immediately, at least I'll need to add a color option to the field. But curious if any tests fail. Old: <img width="1294" alt="image" src="https://github.com/pytorch/pytorch/assets/6355099/c3a750ed-5e54-4621-b2e4-be5481be15b6"> New: <img width="1303" alt="image" src="https://github.com/pytorch/pytorch/assets/6355099/3a1f1adc-6f3a-413e-8b87-ee53da9bf4ed"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/128476 Approved by: https://github.com/ezyang	2024-06-13 23:39:04 +00:00
rzou	87072dcfdb	Change Dynamo's custom ops warning message to be less spammy (#128456 ) This is a short-term fix (for 2.4). In the longer term we should fix https://github.com/pytorch/pytorch/issues/128430 The problem is that `warnings.warn` calls inside Dynamo print every time. Python warnings are supposed to print once unless their cache is reset, but Dynamo ends up resetting that cache every time it runs. As a workaround we provide our own warn_once cache that is keyed on the warning msg. I am not worried about this increasing memory usage because that's effectively what Python's warnings.warn cache does. Test Plan: - fix tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/128456 Approved by: https://github.com/anijain2305	2024-06-12 21:57:12 +00:00
Animesh Jain	c0b87afcad	[RELAND2][dynamo][nn-modules] Trace through nn.Module dunder methods for UnspecializedNNModule (#126578 ) Tracing through `__init__` is important because it initializes (calls STORE_ATTR) on members. By doing that, we kick in the mutation tracking for these objects. So, things like mutating `_modules` etc is tracked automatically. Fixes https://github.com/pytorch/pytorch/issues/111837 Pull Request resolved: https://github.com/pytorch/pytorch/pull/126578 Approved by: https://github.com/jansel	2024-06-12 04:09:23 +00:00
PyTorch MergeBot	adb699189b	Revert "[RELAND][dynamo][nn-modules] Trace through nn.Module dunder methods for UnspecializedNNModule (#126578 )" This reverts commit `b2d602306a`. Reverted https://github.com/pytorch/pytorch/pull/126578 on behalf of https://github.com/clee2000 due to failed internal test D58394084. Author has forward fix but includes external changes so reverting is a bit easier to coordinate ([comment](https://github.com/pytorch/pytorch/pull/126578#issuecomment-2161481839))	2024-06-11 19:41:41 +00:00
BowenBao	61f922c2ca	Fix 'get_real_value' on placeholder nodes (#127698 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/127698 Approved by: https://github.com/jansel ghstack dependencies: #127695, #127696	2024-06-11 18:57:25 +00:00
BowenBao	984b1a8c35	Fix 'get_attr' call in dynamo 'run_node' (#127696 ) Fixes #124858 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127696 Approved by: https://github.com/jansel ghstack dependencies: #127695	2024-06-11 18:57:25 +00:00
Animesh Jain	b2d602306a	[RELAND][dynamo][nn-modules] Trace through nn.Module dunder methods for UnspecializedNNModule (#126578 ) Tracing through `__init__` is important because it initializes (calls STORE_ATTR) on members. By doing that, we kick in the mutation tracking for these objects. So, things like mutating `_modules` etc is tracked automatically. Fixes https://github.com/pytorch/pytorch/issues/111837 Pull Request resolved: https://github.com/pytorch/pytorch/pull/126578 Approved by: https://github.com/jansel ghstack dependencies: #128295	2024-06-10 23:11:04 +00:00
PyTorch MergeBot	ca561d639b	Revert "Fix 'get_attr' call in dynamo 'run_node' (#127696 )" This reverts commit `b741819b05`. Reverted https://github.com/pytorch/pytorch/pull/127696 on behalf of https://github.com/clee2000 due to broke (executorch?) internal tests D58295865 ([comment](https://github.com/pytorch/pytorch/pull/127696#issuecomment-2158820093))	2024-06-10 16:29:20 +00:00
PyTorch MergeBot	d22287d1ad	Revert "Fix 'get_real_value' on placeholder nodes (#127698 )" This reverts commit `19b31d899a`. Reverted https://github.com/pytorch/pytorch/pull/127698 on behalf of https://github.com/clee2000 due to broke (executorch?) internal tests D58295865 ([comment](https://github.com/pytorch/pytorch/pull/127696#issuecomment-2158820093))	2024-06-10 16:29:20 +00:00
Aaron Orenstein	dcfa7702c3	Flip default value for mypy disallow_untyped_defs [1/11] (#127838 ) See #127836 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127838 Approved by: https://github.com/oulgen	2024-06-08 18:16:33 +00:00
PyTorch MergeBot	44371bd432	Revert "[dynamo][nn-modules] Trace through nn.Module dunder methods for UnspecializedNNModule (#126578 )" This reverts commit `7ede78f9f5`. Reverted https://github.com/pytorch/pytorch/pull/126578 on behalf of https://github.com/anijain2305 due to pippy tests fail ([comment](https://github.com/pytorch/pytorch/pull/126578#issuecomment-2155836555))	2024-06-08 06:35:34 +00:00
dshi7	3a620a0f65	bug fix of dynamo_timed in cprofile (#128203 ) Fixes #ISSUE_NUMBER fb-only: "Entire Frame" was missing before this change. Before: https://interncache-all.fbcdn.net/manifold/tlparse_reports/tree/logs/f565966006-TrainingApplication/20240527/rank_0/5_0_1/compilation_metrics_23.html After: https://interncache-all.fbcdn.net/manifold/tlparse_reports/tree/logs/f569854578-TrainingApplication/20240606/rank_0/0_0_0/compilation_metrics_16.html Pull Request resolved: https://github.com/pytorch/pytorch/pull/128203 Approved by: https://github.com/Chillee	2024-06-07 20:47:27 +00:00
BowenBao	19b31d899a	Fix 'get_real_value' on placeholder nodes (#127698 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/127698 Approved by: https://github.com/jansel ghstack dependencies: #127695, #127696	2024-06-07 17:13:43 +00:00
BowenBao	b741819b05	Fix 'get_attr' call in dynamo 'run_node' (#127696 ) Fixes #124858 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127696 Approved by: https://github.com/jansel ghstack dependencies: #127695	2024-06-07 17:13:43 +00:00
Animesh Jain	7ede78f9f5	[dynamo][nn-modules] Trace through nn.Module dunder methods for UnspecializedNNModule (#126578 ) Tracing through `__init__` is important because it initializes (calls STORE_ATTR) on members. By doing that, we kick in the mutation tracking for these objects. So, things like mutating `_modules` etc is tracked automatically. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126578 Approved by: https://github.com/jansel ghstack dependencies: #128001	2024-06-06 23:05:49 +00:00
Animesh Jain	569c5e72e7	[dynamo] Unspec nn module when global backward hooks are present (#127802 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/127802 Approved by: https://github.com/jansel ghstack dependencies: #127785	2024-06-04 18:25:46 +00:00
Yanbo Liang	7e97b33fbb	[Dynamo] Log backward graph compilation metrics (#126629 ) Fixes #125313 Compilation metric logs for the code example at #125313: ``` %s CompilationMetrics(compile_id='0/0', frame_key='1', co_name='forward', co_filename='/data/users/ybliang/debug/debug2.py', co_firstlineno=10, cache_size=0, accumulated_cache_size=0, guard_count=11, shape_env_guard_count=0, graph_op_count=1, graph_node_count=3, graph_input_count=1, start_time=1716247236.6165977, entire_frame_compile_time_s=7.926939964294434, backend_compile_time_s=7.887059926986694, inductor_compile_time_s=4.108498811721802, code_gen_time_s=3.97833514213562, fail_type=None, fail_reason=None, fail_user_frame_filename=None, fail_user_frame_lineno=None, non_compliant_ops=set(), compliant_custom_ops=set(), restart_reasons={"'skip function graph_break in file /home/ybliang/local/pytorch/torch/_dynamo/decorators.py'"}, dynamo_time_before_restart_s=0.025330543518066406, has_guarded_code=True, is_fwd=True) %s CompilationMetrics(compile_id='1/0', frame_key='2', co_name='torch_dynamo_resume_in_forward_at_12', co_filename='/data/users/ybliang/debug/debug2.py', co_firstlineno=12, cache_size=0, accumulated_cache_size=0, guard_count=10, shape_env_guard_count=0, graph_op_count=2, graph_node_count=5, graph_input_count=1, start_time=1716247244.544928, entire_frame_compile_time_s=0.10148310661315918, backend_compile_time_s=0.08753013610839844, inductor_compile_time_s=0.03691983222961426, code_gen_time_s=0.022417306900024414, fail_type=None, fail_reason=None, fail_user_frame_filename=None, fail_user_frame_lineno=None, non_compliant_ops=set(), compliant_custom_ops=set(), restart_reasons=set(), dynamo_time_before_restart_s=0.0, has_guarded_code=True, is_fwd=True) tensor([[-0.1622, -0.0000, -0.0000, 0.5643, -0.0000, 0.0000, -0.5087, 0.0914, -0.0000, -0.0421]], grad_fn=<CompiledFunctionBackward>) %s CompilationMetrics(compile_id='1/0', frame_key=None, co_name=None, co_filename=None, co_firstlineno=None, 
cache_size=None, accumulated_cache_size=None, guard_count=None, shape_env_guard_count=None, graph_op_count=None, graph_node_count=None, graph_input_count=None, start_time=None, entire_frame_compile_time_s=None, backend_compile_time_s=None, inductor_compile_time_s=0.026738643646240234, code_gen_time_s=0.016446352005004883, fail_type=None, fail_reason=None, fail_user_frame_filename=None, fail_user_frame_lineno=None, non_compliant_ops=None, compliant_custom_ops=None, restart_reasons=None, dynamo_time_before_restart_s=None, has_guarded_code=None, is_fwd=False) %s CompilationMetrics(compile_id='0/0', frame_key=None, co_name=None, co_filename=None, co_firstlineno=None, cache_size=None, accumulated_cache_size=None, guard_count=None, shape_env_guard_count=None, graph_op_count=None, graph_node_count=None, graph_input_count=None, start_time=None, entire_frame_compile_time_s=None, backend_compile_time_s=None, inductor_compile_time_s=0.14563536643981934, code_gen_time_s=0.08652091026306152, fail_type=None, fail_reason=None, fail_user_frame_filename=None, fail_user_frame_lineno=None, non_compliant_ops=None, compliant_custom_ops=None, restart_reasons=None, dynamo_time_before_restart_s=None, has_guarded_code=None, is_fwd=False) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/126629 Approved by: https://github.com/ezyang	2024-06-03 03:55:33 +00:00
Michael Lazos	2129903aa3	Properly detect nested torch function args (#127496 ) Dynamo was not detecting nested torch function classes in containers. This was due to pytree compatibility for variable trackers being removed. Fixes https://github.com/pytorch/pytorch/issues/127174 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127496 Approved by: https://github.com/anijain2305	2024-06-02 03:43:22 +00:00
dshi7	932e04142d	extract calculate_time_spent from print_time_report (#127362 ) Fixes #ISSUE_NUMBER wrap certain steps in a separate function for easier TTFB instrumentation (fb internal use case) Pull Request resolved: https://github.com/pytorch/pytorch/pull/127362 Approved by: https://github.com/yanboliang, https://github.com/mengluy0125	2024-05-29 04:37:15 +00:00
Aaron Gokaslan	3cb16ebf08	[BE]: Update ruff to 0.4.5 (#126979 ) Update ruff to 0.4.5 and address some false negatives that have been found in the newer version. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126979 Approved by: https://github.com/ezyang	2024-05-24 18:38:35 +00:00
Matthew Hoffman	81277baa0c	Remove removed ruff rule TRY200 (#126256 ) My TOML linter is complaining that "TRY200" is not acceptable for the `tool.ruff.lint` schema. From the ruff docs: https://docs.astral.sh/ruff/rules/reraise-no-cause/ > This rule has been removed and its documentation is only available for historical reasons. > > This rule is identical to [B904](https://docs.astral.sh/ruff/rules/raise-without-from-inside-except/) which should be used instead. and we are currently explicitly ignoring B904. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126256 Approved by: https://github.com/Skylion007	2024-05-17 16:31:05 +00:00
Edward Z. Yang	9c9d0c2fab	Add VariableTracker.debug_repr (#126299 ) Now you can print arbitrary values at compile time with comptime.print() Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/126299 Approved by: https://github.com/jansel ghstack dependencies: #126292	2024-05-15 23:55:29 +00:00
dshi7	9df2f8687f	cprofile every compile id [x/y] to keep consistent with tlparse (#125659 ) This PR moves cprofile decorator to keep consistent with `torch_inductor_stats` logging and is needed by fbcode diffs of profiling enablement in internal e2e jobs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125659 Approved by: https://github.com/ezyang	2024-05-14 17:09:28 +00:00
PyTorch MergeBot	7ffa5558ee	Revert "[FX] Update type hints in `torch.fx._compatibility.py` (#125469 )" This reverts commit `235b4d6ec2`. Reverted https://github.com/pytorch/pytorch/pull/125469 on behalf of https://github.com/izaitsevfb due to breaks pyre in dependent projects (internal: see D56986361) ([comment](https://github.com/pytorch/pytorch/pull/125469#issuecomment-2096665396))	2024-05-06 18:36:43 +00:00
Xuehai Pan	235b4d6ec2	[FX] Update type hints in `torch.fx._compatibility.py` (#125469 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/125469 Approved by: https://github.com/Skylion007 ghstack dependencies: #125468	2024-05-05 19:30:22 +00:00
Avik Chaudhuri	746da8755c	switch tests from constrain_as* to torch._check* (#125253 ) To fix data-dependent errors we want to recommend that people use `torch._check` APIs. The `constrain_as` APIs should be fully subsumed by them, and in the future we should kill them entirely. Differential Revision: D56774333 Pull Request resolved: https://github.com/pytorch/pytorch/pull/125253 Approved by: https://github.com/ezyang	2024-05-01 21:01:27 +00:00
Edward Z. Yang	c3c4465f50	Add has_guarded_code to CompilationMetrics (#125279 ) While studying some tlparse, I noticed that CompilationMetrics was reporting that there was no error for frames that have no nodes. I'm pretty sure we don't actually install a frame in this situation. has_guarded_code will tell us if that's the case, because it says whether the GuardedCode object is None or not. While working on this, I also wondered whether we can ever trigger the "skip this frame entirely, do not trace it ever again" codepath. As best I could tell, it's impossible for this to happen by the time we get to the compilation metrics block. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/125279 Approved by: https://github.com/yanboliang	2024-05-01 06:12:05 +00:00
Daohang Shi	b7d67e476d	upload pt2 cprofile stats to manifold (#125162 ) Summary: https://fb.workplace.com/groups/257735836456307/permalink/657458576484029/ upload cprofile to manifold D56696397 has a script to convert profiler stats to dot graphs (see its test plan) Test Plan: non-MAST `TORCH_COMPILE_CPROFILE=1 buck2 run mode/opt mode/inplace //pytorch/benchmark:run -- ads_mc_igctr_mc3_v0 -d cuda -t train --torchdynamo inductor --profile --profile-export-chrome-trace` https://www.internalfb.com/manifold/explorer/pyper_traces/tree/compilation_cprofile/test/20240428_234002_7562397568 MAST `buck2 run mode/opt aps_models/ads/icvr:icvr_launcher -- mode=mast_ctr_cvr_cmf_rep launcher.fbl_entitlement=ai_infra_training_rnd_tc features=ctr_cvr_conso_cmf_pipeline_features_455876776_3teach model=ctr_cvr_cmf_when_rep_config_msmn_3teach model_name=ctr_cvr_when model.when_arch.use_extended_residual_contexts=True optimizers.dense_default.lr_schedule.0.max_iters=20000 training.planner.storage_reservation_policy=FixedPercentage training.planner.storage_reservation_percentage=0.72 data_loader.dataset.batch_size=2048 trainer.garbage_collection.garbage_collection_interval=100 model.when_arch.layer_norm_init_weight=0.3 optimizers.dense_default.lr_schedule.0.value=0.001 model.when_arch.customized_mlp_init_scale=0.3 launcher.num_workers=128 launcher.max_retries=10 launcher.data_project=oncall_ads_model_platform launcher.hardware=ZIONEX_80G data_loader.dataset.table_ds="[2024-01-01]" launcher.job_name=test_inductor_logging` https://www.internalfb.com/manifold/explorer/pyper_traces/tree/compilation_cprofile/aps-test_inductor_logging-745febb51a Generating dotty files from D56696397 ``` Generating dot file from cprofile stats /home/daohang/aps-test_inductor_logging-745febb51a/0/0/_compile1.profile ... 
P1225733598: https://www.internalfb.com/intern/paste/P1225733598/ Dotty: https://www.internalfb.com/intern/graphviz/?paste=1225733598 Generating dot file from cprofile stats /home/daohang/aps-test_inductor_logging-745febb51a/0/0/_compile10.profile ... P1225733629: https://www.internalfb.com/intern/paste/P1225733629/ Dotty: https://www.internalfb.com/intern/graphviz/?paste=1225733629 Generating dot file from cprofile stats /home/daohang/aps-test_inductor_logging-745febb51a/0/0/_compile0.profile ... P1225733649: https://www.internalfb.com/intern/paste/P1225733649/ Dotty: https://www.internalfb.com/intern/graphviz/?paste=1225733649 ``` Differential Revision: D56679561 Pull Request resolved: https://github.com/pytorch/pytorch/pull/125162 Approved by: https://github.com/anijain2305	2024-04-30 15:05:01 +00:00
Edward Z. Yang	b04dca1502	Add pending_fresh_unbacked_symbols, populate unbacked_bindings for Dynamo (#124290 ) The important comment: ``` # Whenever we allocate a fresh unbacked Symbol, we add it to this # pending list. Unbacked symbol allocation can occur at unpredictable # points during meta tensor propagation, but at some point, the we # have to know what the binding site for an unbacked symbol is, and # this is computed when we actually place the node in the graph. The # important thing is that we always actually handle every unaccounted # for unbacked symbol, so this list helps us keep track of them and # then make sure they are all accounted for. # # We could potentially give rise to errors earlier by lexically # scoping when we do propagation, and only allowing unbacked symbols # to be allocated at this point in time. However this is inconvenient # to do in Dynamo, because fake tensor propagation is far from when we # analyze binding sites (set_example_value), so we do it in a more # mutatey way. # # NB: fresh unbacked symbols NEVER get substitutions applied to them, # they are binding sites! ``` The compute_unbacked_bindings is the other half of the equation: the thing that actually consumes the pending_fresh_unbacked_symbols and does something with them. Important comment: ``` After having run fake tensor propagation and producing example_value result, traverse example_value looking for freshly bound unbacked symbols and record their paths for later. It is an error if we have allocated an unbacked SymInt but it cannot be found in example_value. (NB: this means if you have a multi-output function, you must call this on the tuple of tensor output, you cannot wait!) ``` For example, if I return a tensor with size `[u0, u1]`, and u1 is a fresh unbacked SymInt, then I'll have `{u1: KeyPath(".size(1)")}`, telling me I can get u1 by running `size(1)` on the result of this node. 
u0 is not fresh (it probably flowed in as an argument), so I don't generate a binding for it. I eventually intend to propagate this information all the way to Inductor lowering, where extra metadata about unbacked symbol binding will be canonically used for codegen, instead of trying to infer it from defs/uses. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/124290 Approved by: https://github.com/lezcano	2024-04-24 09:11:34 +00:00
Edward Z. Yang	f34905f61d	Assert that TracingContext is available when set_example_value is called (#124284 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/124284 Approved by: https://github.com/Chillee ghstack dependencies: #124105, #124059, #124176, #124283	2024-04-21 11:23:13 +00:00
Boyuan Feng	aa2da0cdd2	[Export] Add runtime assert to non-strict export (#123681 ) This PR moves insert_deferred_runtime_asserts from dynamo to torch.fx.passes and uses it to add runtime assertion for non-strict export. Differential Revision: D55944267 Pull Request resolved: https://github.com/pytorch/pytorch/pull/123681 Approved by: https://github.com/tugsbayasgalan, https://github.com/angelayi	2024-04-18 16:13:27 +00:00
Edward Z. Yang	bebdbb63ce	Introduce set_example_value and use it throughout Dynamo (#124176 ) I'm going to set up some extra behavior when we set example value, so I need a convenient place to interpose. I cannot easily do it on meta itself because it's a generic dict with no interposition point. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/124176 Approved by: https://github.com/oulgen ghstack dependencies: #124105, #124059	2024-04-17 22:57:11 +00:00
Aaron Gokaslan	1d6c5972c1	[BE]: Optimize min/max/sum comprehensions C419 (#123960 ) Automatic fixes that replace certain list comprehensions with generator expressions where appropriate, so that they are consumed immediately. This is preview functionality in ruff for rule C419 and it was automatically applied. Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/123960 Approved by: https://github.com/malfet	2024-04-12 23:54:15 +00:00
Aart Bik	d564fe7dca	[sparse] add proper path for cloning sparse tensors (#123127 ) The code does the right thing (rather than crashing). This is a small step towards https://github.com/pytorch/pytorch/issues/117188 Pull Request resolved: https://github.com/pytorch/pytorch/pull/123127 Approved by: https://github.com/pearu, https://github.com/cpuhrsch	2024-04-12 23:19:51 +00:00
Simon Fan	7fc3aa5f81	[compiled autograd][aot] Trim runtime refs for list inputs from dynamo (#122535 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/122535 Approved by: https://github.com/bdhirsh ghstack dependencies: #123630, #123674, #122353, #123359	2024-04-12 10:29:09 +00:00
Simon Fan	d274d57037	[compiled autograd][dynamo] Make compiled graph take in boxed inputs (#122353 ) ### Context In today's Dynamo, we lift all tensors encountered during tracing to be individual graph inputs, even when they were in a container. And [Dynamo generates](`fdc281f258/torch/_dynamo/codegen.py (L371)`) the runtime function's signature using the graph's graphargs. This means that the generated function will have each grapharg as an argument, which is problematic if we want to free the inputs in inductor codegen. See [python function arguments are kept alive for the duration of the function call](https://github.com/pytorch/pytorch/pull/83137#issuecomment-1211320670). ```python # original code def forward(inputs): a, b, c, d, e = inputs inputs.clear() out = a out += b del b # frees memory out += c del c # frees memory out += d del d # frees memory out += e del e # frees memory return out # compiled code: def forward(a, b, c, d, e): # b, c, d, e can't be freed before end of function ``` This isn't a concern when compiling forward because a, b, c, d, e are all from user code, and should be kept alive. But when compiling backwards, a, b, c, d, e may be intermediate results, i.e. activations, that we DO want to clear ASAP to remain on par with eager peak memory. ### Solution We have encountered similar memory problems in AOTAutograd before, where we adopted the boxed calling convention (wrapping to-be-freed objects in a list), added list clearing to inductor codegen, and were careful about holding references to elements in the input list. We need to do something similar, but for inputs from the user program (compiled autograd fx graph in this case). This PR supports lists as graphargs/placeholder nodes. When tracing a list of tensors, we create a node for it, and pre-emptively initialize variable trackers for its elements before they are used in the user program. Subsequent uses of those variables will find hits in the lookup table `input_source_to_var`. 
With the inputs as a list in the graph args, our compiled code can free inputs just like in the eager case. ```python def forward(inputs): # a, b, c, d, e can be freed within the function now ``` Currently, AOT/Inductor flattens list input via [flatten_graph_inputs wrapper](`597f479643/torch/_inductor/compile_fx.py (L1454-L1478)`), which is why this PR's CI can be green. Additional changes are needed to its runtime wrapper, done in the next PR. The next step is to ensure that we are careful in forwarding the list to inductor codegen without holding additional references. Pull Request resolved: https://github.com/pytorch/pytorch/pull/122353 Approved by: https://github.com/jansel ghstack dependencies: #123630, #123674	2024-04-12 10:29:09 +00:00
Brian Hirsh	fa013f69bb	dynamo assertion that graph has no fake-tensor constants should check for subclasses (#118644 ) This would have caught some of the nasty errors in https://github.com/pytorch/pytorch/pull/118191 Pull Request resolved: https://github.com/pytorch/pytorch/pull/118644 Approved by: https://github.com/tugsbayasgalan, https://github.com/zou3519 ghstack dependencies: #118647	2024-04-11 20:10:15 +00:00
Simon Fan	8ac0f072e6	[aot eager] Support frontend graphs with list arguments (#123212 ) We already support bumpy inputs for 3rd party frontend and compiled backward graph, we should add the behavior to aot_eager too Pull Request resolved: https://github.com/pytorch/pytorch/pull/123212 Approved by: https://github.com/jansel ghstack dependencies: #122691, #122746, #123007	2024-04-03 17:07:52 +00:00
Aaron Orenstein	a8b7480f0d	fix dynamo.explain examples (#122745 ) `dynamo.explain()` was updated to return a structure but the docs weren't updated to match. - Update the docs to use the new API - Remove some dead code left when `explain` was updated. - Drive-by: Fix some `nopython` uses that I noticed - Drive-by: I noticed an ignored error coming from CleanupHook on shutdown - make it check the global before setting it. Fixes #122573 Pull Request resolved: https://github.com/pytorch/pytorch/pull/122745 Approved by: https://github.com/jansel	2024-03-27 22:53:27 +00:00
chilli	a54ea7bbd8	Made several changes to min-cut partitioner that allow it to recompute more things (#121692 ) Perf results <img width="862" alt="image" src="https://github.com/pytorch/pytorch/assets/6355099/8d44e633-8941-46a6-8e7d-806330a8c890"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/121692 Approved by: https://github.com/shunting314, https://github.com/eellison ghstack dependencies: #122686, #122688	2024-03-27 22:45:52 +00:00
William Wen	71d40ff861	[dynamo, 3.12] fix typing variable tracing (#122741 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/122741 Approved by: https://github.com/jansel ghstack dependencies: #122146, #122335, #122354, #122355, #122356, #122449, #122455, #122456, #122530, #122737, #122738, #122739, #122740	2024-03-27 20:39:39 +00:00
chilli	67a4d6d6cb	Stopped TORCH_COMPILE_DEBUG from printing out a bunch of logs (#122688 ) @ezyang suggests using TORCH_TRACE for dumping out all intermediate logs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/122688 Approved by: https://github.com/ezyang, https://github.com/mlazos ghstack dependencies: #122686	2024-03-27 00:24:40 +00:00
Edward Z. Yang	7e176ebb47	Log compilation_metrics to TORCH_TRACE (#122638 ) It's not technically needed as you can get it from Scuba too, but it's more convenient for tlparse to get at it this way. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/122638 Approved by: https://github.com/albanD	2024-03-26 14:10:55 +00:00
Guilherme Leobas	4eaa000acc	Teach dynamo about torch.func.jvp (#119926 ) List of changes: - Replace JVP_NESTING by torch._C._functorch.maybe_current_level() - Remove all increment nesting functions from wrap_fx_proxy_cls - fwAD.make_dual receives the dual_level as keyword argument - Add jvp_increment_nesting, set_fwd_grad_enabled and dual_level context managers to dynamo Pull Request resolved: https://github.com/pytorch/pytorch/pull/119926 Approved by: https://github.com/zou3519	2024-03-22 20:25:47 +00:00
Peter Bell	5790096059	[dynamo] Remove uses of `raise unimplemented` (#122136 ) `unimplemented` is a function that raises an error, so `raise unimplemented(...)` never reaches the `raise`. Another related issue is that `raise unimplemented(...) from e` doesn't attach the exception cause correctly. I fix this by adding a `from_exc` argument to `unimplemented`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/122136 Approved by: https://github.com/lezcano	2024-03-22 19:29:58 +00:00
William Wen	23524710e6	[dynamo] use proxies to nn.Module in dynamo generated GraphModules (#120756 ) Fixes remaining refleaks found when debugging https://github.com/pytorch/pytorch/issues/119607, tests added in https://github.com/pytorch/pytorch/pull/120657. Also fixes some tests that xfail: https://github.com/pytorch/pytorch/issues/120631 (not entirely sure why), but introduced tests now fail. Pull Request resolved: https://github.com/pytorch/pytorch/pull/120756 Approved by: https://github.com/jansel	2024-03-21 21:23:12 +00:00
PyTorch MergeBot	0696db8202	Revert "Teach dynamo about torch.func.jvp (#119926 )" This reverts commit `17489784b6`. Reverted https://github.com/pytorch/pytorch/pull/119926 on behalf of https://github.com/peterbell10 due to broken mac jobs on main ([comment](https://github.com/pytorch/pytorch/pull/119926#issuecomment-2010327997))	2024-03-20 18:34:43 +00:00
Guilherme Leobas	17489784b6	Teach dynamo about torch.func.jvp (#119926 ) List of changes: - Replace JVP_NESTING by torch._C._functorch.maybe_current_level() - Remove all increment nesting functions from wrap_fx_proxy_cls - fwAD.make_dual receives the dual_level as keyword argument - Add jvp_increment_nesting, set_fwd_grad_enabled and dual_level context managers to dynamo Pull Request resolved: https://github.com/pytorch/pytorch/pull/119926 Approved by: https://github.com/zou3519	2024-03-20 13:09:19 +00:00
Oguz Ulgen	c0b2e56c8f	Support triton.language.dtype with torch.compile -- Second Attempt (#122141 ) This PR is the second attempt at supporting `triton.language.dtype`, now instead of putting it on the graph, we put it on the side table since it is a constant. Pull Request resolved: https://github.com/pytorch/pytorch/pull/122141 Approved by: https://github.com/jansel ghstack dependencies: #122140	2024-03-19 19:40:52 +00:00
PyTorch MergeBot	36e5c1dcab	Revert "Teach dynamo about torch.func.jvp (#119926 )" This reverts commit `edd04b7c16`. Reverted https://github.com/pytorch/pytorch/pull/119926 on behalf of https://github.com/jeanschmidt due to lots of breakages in pull jobs, checking if reverting this one will help ([comment](https://github.com/pytorch/pytorch/pull/119926#issuecomment-2007915919))	2024-03-19 18:59:46 +00:00
Guilherme Leobas	edd04b7c16	Teach dynamo about torch.func.jvp (#119926 ) List of changes: - Replace JVP_NESTING by torch._C._functorch.maybe_current_level() - Remove all increment nesting functions from wrap_fx_proxy_cls - fwAD.make_dual receives the dual_level as keyword argument - Add jvp_increment_nesting, set_fwd_grad_enabled and dual_level context managers to dynamo Pull Request resolved: https://github.com/pytorch/pytorch/pull/119926 Approved by: https://github.com/zou3519	2024-03-19 13:06:42 +00:00
Oguz Ulgen	7c5e29ae71	Back out "Support `triton.language.dtype` with `torch.compile` (#121690 )" (#122108 ) Summary: Some hard to deal with package import/export related problems. Lets revert and start with clean slate. Test Plan: CI Differential Revision: D55024877 Pull Request resolved: https://github.com/pytorch/pytorch/pull/122108 Approved by: https://github.com/ezyang	2024-03-18 20:50:28 +00:00
James Wu	df1cdaedeb	Log restart reasons and extra compile time in CompilationMetrics (#121827 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/121827 Approved by: https://github.com/ezyang, https://github.com/yanboliang	2024-03-18 18:59:25 +00:00
Jason Ansel	4034873a31	[dynamo] Optimize builtin handling (#122035 ) Improves `benchmarks/dynamo/microbenchmarks/dynamo_microbenchmarks.py` from 7.3s to 6.7s. Pull Request resolved: https://github.com/pytorch/pytorch/pull/122035 Approved by: https://github.com/Skylion007 ghstack dependencies: #122032, #122033, #122034	2024-03-18 18:08:06 +00:00
lezcano	d0d09f5977	Fix torch.compile links (#121824 ) Fixes https://github.com/pytorch/pytorch.github.io/issues/1567 Pull Request resolved: https://github.com/pytorch/pytorch/pull/121824 Approved by: https://github.com/svekars, https://github.com/peterbell10, https://github.com/malfet ghstack dependencies: #121823	2024-03-15 19:49:37 +00:00
lezcano	8a5a377190	Move doc links to point to main (#121823 ) The previous links were pointing to an outdated branch. Command: `find . -type f -exec sed -i "s:docs/master:docs/main:g" {} +` Pull Request resolved: https://github.com/pytorch/pytorch/pull/121823 Approved by: https://github.com/albanD, https://github.com/malfet	2024-03-15 19:49:37 +00:00
Jason Ansel	5a2b4fc8f0	[dynamo] Convert invalid args into graph breaks (#121784 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/121784 Approved by: https://github.com/yanboliang	2024-03-15 06:51:27 +00:00
PyTorch MergeBot	70c6f542f2	Revert "[dynamo] Convert invalid args into graph breaks (#121784 )" This reverts commit `0df39480f6`. Reverted https://github.com/pytorch/pytorch/pull/121784 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I think it breaks ONNX test in trunk `0c1ac4484d` ([comment](https://github.com/pytorch/pytorch/pull/121784#issuecomment-1995979435))	2024-03-13 22:12:43 +00:00
Jason Ansel	0df39480f6	[dynamo] Convert invalid args into graph breaks (#121784 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/121784 Approved by: https://github.com/yanboliang ghstack dependencies: #121615, #121616	2024-03-13 20:02:33 +00:00
Jason Ansel	a13dd92d88	[dynamo] Minor compile time optimizations in torch.py (#121615 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/121615 Approved by: https://github.com/oulgen	2024-03-13 05:36:22 +00:00
Yanan Cao	7d05c4c093	Remove error anti-pattern when dealing with dynamic shape output (#121681 ) There are cases where capture_dynamic_output_shape_ops=True and we will still see DynamicOutputShapeException. For example, when an op doesn't have a meta kernel implemented to return the correct dynamic shape output. If we blindly give users instructions to set capture_dynamic_output_shape_ops to True, users would try it and see no change. As witnessed in this issue: https://github.com/pytorch/pytorch/issues/121036#issuecomment-1985221435 Pull Request resolved: https://github.com/pytorch/pytorch/pull/121681 Approved by: https://github.com/tugsbayasgalan	2024-03-13 00:45:23 +00:00
Oguz Ulgen	79ee6bbde3	Support `triton.language.dtype` with `torch.compile` (#121690 ) Putting this PR as an RFC since I have resorted to some horrible hacks in order to make this work. ``` (Pdb) p triton.language.float32 triton.language.fp32 (Pdb) p str(triton.language.float32) 'fp32' (Pdb) p repr(triton.language.float32) 'triton.language.fp32' ``` This means that we need to "rewrite" them for fx graph and inductor execution. This PR allows Mamba2 to work with `torch.compile`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/121690 Approved by: https://github.com/Skylion007	2024-03-12 23:21:46 +00:00
Shunting Zhang	522d972924	[eazy] add more log when accuracy check fail (#121656 ) Add these logs to debug the regression of the accuracy test for the dm_nfnet_f0 model in training. With these extra logs when the accuracy check fails, we can verify whether it is close to succeeding. If so, that indicates there is no real issue, just flakiness, and we can probably tune the tolerance to fix it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/121656 Approved by: https://github.com/jansel, https://github.com/Skylion007	2024-03-12 20:58:20 +00:00
rzou	3ef0befdc9	Better error messages for impl_abstract_pystub (#120959 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/120959 Approved by: https://github.com/drisspg	2024-03-04 15:24:36 +00:00
Animesh Jain	b7f2522692	[dynamo][compile-time] Remove unnecessary tree_map_only (#121052 ) Reduces the torch.compile(backend="eager") for this code by 1-2 seconds. ~~~ def fn(x): for _ in range(10000): # x = torch.sin(x) x = torch.ops.aten.sin(x) # x = sin(x) return x ~~~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/121052 Approved by: https://github.com/jansel ghstack dependencies: #121053	2024-03-03 06:59:43 +00:00
Guilherme Leobas	491c2b4665	Let torch dynamo inline torch.func.grad (#118407 ) When dynamo sees torch.func.grad, it tries to inline all frames related to. Pull Request resolved: https://github.com/pytorch/pytorch/pull/118407 Approved by: https://github.com/zou3519	2024-02-28 20:05:00 +00:00
Yanbo Liang	5a0a964444	[Dynamo] Fix guards for script_if_tracing or lru_cache fn with default args (#120390 ) Fixes #120387 Pull Request resolved: https://github.com/pytorch/pytorch/pull/120390 Approved by: https://github.com/anijain2305	2024-02-26 19:40:14 +00:00
Michael Lazos	56203fc407	Add profiling for backward (#120540 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/120540 Approved by: https://github.com/anijain2305	2024-02-24 16:53:28 +00:00
Thiago Crepaldi	3588e7f265	Ignore .numpy() under FakeTensorMode() (#120261 ) Fixes #120259 Pull Request resolved: https://github.com/pytorch/pytorch/pull/120261 Approved by: https://github.com/jansel	2024-02-22 22:49:20 +00:00
PyTorch MergeBot	8fa6340701	Revert "Ignore .numpy() under FakeTensorMode() (#120261 )" This reverts commit `952b37145b`. Reverted https://github.com/pytorch/pytorch/pull/120261 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems breaking trunk on Python 3.12 `952b37145b` ([comment](https://github.com/pytorch/pytorch/pull/120261#issuecomment-1958267417))	2024-02-21 23:09:27 +00:00
Thiago Crepaldi	952b37145b	Ignore .numpy() under FakeTensorMode() (#120261 ) Fixes #120259 Pull Request resolved: https://github.com/pytorch/pytorch/pull/120261 Approved by: https://github.com/jansel	2024-02-21 22:06:29 +00:00
Yanbo Liang	d42ede8ae4	[torch.compile] Log compilation start time for timeline view (#120220 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/120220 Approved by: https://github.com/angelayi	2024-02-20 21:07:40 +00:00
Shunting Zhang	becfda005e	tiny improvement to the cprofile wrapper (#120100 ) 1. right now we double increment the profile counter. The PR avoid that so we don't end up with profile_0, profile_2, profile_4 ... 2. log the latency to run the passed in function with profiling on so we can easily skip those _compile call which returns quickly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/120100 Approved by: https://github.com/eellison	2024-02-17 02:10:25 +00:00
Menglu Yu	7b1f5c874f	[PT2][Optimus][Observability] Log the optimus graph transformation to the scuba (#119745 ) Summary: The current everstore upload logging may cause excessive compilation time when the model has lots of graph breaks (post: https://fb.workplace.com/groups/257735836456307/permalink/633533465543207/), so we now log the transformation only when the graph has changed. Test Plan: timeout flows: f528209775 f530084719 Differential Revision: D53692344 Pull Request resolved: https://github.com/pytorch/pytorch/pull/119745 Approved by: https://github.com/jackiexu1992	2024-02-16 21:32:04 +00:00
laith sakka	3693d8f467	Do not convert UnsupportedFakeTensorException into RuntimeError in runNode for proper graph breaking. (#120026 ) Fix: https://github.com/pytorch/pytorch/issues/119779 by properly graph breaking; a complete solution would be to handle quantized tensors. If UnsupportedFakeTensorException is thrown when generating a fake tensor, it is handled and converted into an Unimplemented inside wrap_fake_exception, which is then translated to a graph break. However, run_node used to convert UnsupportedFakeTensorException into a runtime error, creating runtime errors instead of graph breaks whenever generating a fake tensor for a quantized tensor fails. Pull Request resolved: https://github.com/pytorch/pytorch/pull/120026 Approved by: https://github.com/jansel	2024-02-16 09:21:58 +00:00
Yanbo Liang	7f5b87c953	[torch.compile] Log more compilation time breakdown (#119865 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/119865 Approved by: https://github.com/ezyang	2024-02-15 02:20:07 +00:00
laith sakka	edd9ddf73f	Propagate allow_non_graph_fake between get_fake_values_from_nodes and get_fake_values (#119731 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119731 Approved by: https://github.com/jansel, https://github.com/anijain2305 ghstack dependencies: #119314, #119435	2024-02-14 15:26:17 +00:00
laith sakka	ea8e4fd5ac	Support FunctoolsPartialVariable::get_function, fix NamedTupleVariable::as_proxy and handle call_function in get_fake_values_from_nodes (#119435 ) partially address https://github.com/pytorch/pytorch/issues/118785 This diff fixes three things: 1. add get_function to FunctoolsPartialVariable note that it will be available only if all args constant otherwise, it would throw unimplemented in the call to asPythonConstant. 2. NamedTupleVariable takes args dispatched not as list ex: NamedTuple(a, b, c) vs NamedTuple([a, b, c]), hence fix that by specializing asProxy. 3. A call to create_arg from within create_proxy, changes a python NamedTuple to a function call node without associating an example value! Updated get_fake_values_from_nodes to handle such case. Pull Request resolved: https://github.com/pytorch/pytorch/pull/119435 Approved by: https://github.com/jansel, https://github.com/anijain2305 ghstack dependencies: #119314	2024-02-13 01:44:08 +00:00
Jason Ansel	e1c1b8c2b2	[dynamo] Improve support for backwards hooks (#119525 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119525 Approved by: https://github.com/yanboliang, https://github.com/anijain2305	2024-02-10 01:14:03 +00:00
PyTorch MergeBot	25a0fa6d13	Revert "[dynamo] Improve support for backwards hooks (#119525 )" This reverts commit `b1f4b2a63c`. Reverted https://github.com/pytorch/pytorch/pull/119525 on behalf of https://github.com/clee2000 due to broke test_autograd.py::TestAutograd::test_post_accumulate_grad_hook_gets_cleaned_up on dynamo https://github.com/pytorch/pytorch/actions/runs/7847212828/job/21416215820 `b1f4b2a63c`. The failure exists on the PR as well, but got masked by the other test. Putting this as no signal? ([comment](https://github.com/pytorch/pytorch/pull/119525#issuecomment-1936447169))	2024-02-09 18:58:55 +00:00
Jason Ansel	b1f4b2a63c	[dynamo] Improve support for backwards hooks (#119525 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119525 Approved by: https://github.com/yanboliang	2024-02-09 17:02:40 +00:00
Yanbo Liang	0f478d9d61	[Dynamo][15/N] Merge allow_in_graph/inline/skip trace rules check into trace_rule.lookup (#118971 ) Finally we have this PR to merge allow_in_graph/inline/skip trace rules into ```trace_rules.lookup_inner```, where we can define and look up trace rules at both function level and file level. Going forward, this is the central place where we define and consult the Dynamo trace rule for any function. * ```trace_rules.lookup``` is the API that can return allow_in_graph, inline or skip. * ```skipfiles.check``` is the API that can return inline or skip, since we have multiple places that only do an inline/skip check. * I'll move ```skipfiles.check``` to ```trace_rules.check``` as one of the follow-ups. * Both functions consult ```trace_rules.lookup_inner``` to get the tracing rule. To avoid a single big PR, I left a few items as follow-ups: * Remove ```skipfiles.py``` and merge the code into ```trace_rules.py```. * We do a double check in ```symbolic_convert.check_inlineable```; I will refactor and simplify it. We should only do the inline/skip check before generating ```SkipFilesVariable``` and ```UserFunctionVariable```. * Rename ```SkipFilesVariable``` as ```SkipFunctionVariable```, since we only handle functions. * The inline/skip reasons are not logged for some cases, since the new lookup framework doesn't always return inline/skip reasons. I'll refactor the logging to record the inline/skip reason in the next step. Pull Request resolved: https://github.com/pytorch/pytorch/pull/118971 Approved by: https://github.com/jansel	2024-02-07 05:15:39 +00:00
Jason Ansel	ec31d11580	[dynamo] Skip dynamo when inside a functorch context (#118901 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/118901 Approved by: https://github.com/zou3519	2024-02-06 20:22:24 +00:00
Edward Z. Yang	abc09b27b9	Some minor type stub improvements (#118529 ) I was just playing around with improving the typing of symbolic_shapes. The PR is not "complete" but I in particular wanted to get feedback on whether or not people liked making ValueRanges Generic; it seems that distinguishing if you have an Expr ValueRange or a SympyBoolean ValueRange is a lot of trouble for downstream. Using TypeGuard, we can perform refinements on the generic parameter inside methods, although we still have to cast back to ValueRange[T] due to https://github.com/python/mypy/issues/14425#issuecomment-1914852707 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/118529 Approved by: https://github.com/Skylion007	2024-02-04 00:19:00 +00:00
PyTorch MergeBot	dbba1d4bf5	Revert "Some minor type stub improvements (#118529 )" This reverts commit `c978f38bd4`. Reverted https://github.com/pytorch/pytorch/pull/118529 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/118529#issuecomment-1922362331))	2024-02-01 22:18:36 +00:00
Edward Z. Yang	c978f38bd4	Some minor type stub improvements (#118529 ) I was just playing around with improving the typing of symbolic_shapes. The PR is not "complete" but I in particular wanted to get feedback on whether or not people liked making ValueRanges Generic; it seems that distinguishing if you have an Expr ValueRange or a SympyBoolean ValueRange is a lot of trouble for downstream. Using TypeGuard, we can perform refinements on the generic parameter inside methods, although we still have to cast back to ValueRange[T] due to https://github.com/python/mypy/issues/14425#issuecomment-1914852707 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/118529 Approved by: https://github.com/Skylion007	2024-01-31 20:56:56 +00:00
Catherine Lee	4f5785b6b3	Enable possibly-undefined error code (#118533 ) Fixes https://github.com/pytorch/pytorch/issues/118129 Suppressions automatically added with ``` import re with open("error_file.txt", "r") as f: errors = f.readlines() error_lines = {} for error in errors: match = re.match(r"(.):(\d+):\d+: error:.\[(.*)\]", error) if match: file_path, line_number, error_type = match.groups() if file_path not in error_lines: error_lines[file_path] = {} error_lines[file_path][int(line_number)] = error_type for file_path, lines in error_lines.items(): with open(file_path, "r") as f: code = f.readlines() for line_number, error_type in sorted(lines.items(), key=lambda x: x[0], reverse=True): code[line_number - 1] = code[line_number - 1].rstrip() + f" # type: ignore[{error_type}]\n" with open(file_path, "w") as f: f.writelines(code) ``` Signed-off-by: Edward Z. Yang <ezyang@meta.com> Co-authored-by: Catherine Lee <csl@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/118533 Approved by: https://github.com/Skylion007, https://github.com/zou3519	2024-01-30 21:07:01 +00:00
PyTorch MergeBot	40ece2e579	Revert "Enable possibly-undefined error code (#118533 )" This reverts commit `4f13f69a45`. Reverted https://github.com/pytorch/pytorch/pull/118533 on behalf of https://github.com/clee2000 due to sorry i'm trying to figure out a codev merge conflict, if this works i'll be back to rebase and merge ([comment](https://github.com/pytorch/pytorch/pull/118533#issuecomment-1917695185))	2024-01-30 19:00:34 +00:00
Edward Z. Yang	4f13f69a45	Enable possibly-undefined error code (#118533 ) Fixes https://github.com/pytorch/pytorch/issues/118129 Suppressions automatically added with ``` import re with open("error_file.txt", "r") as f: errors = f.readlines() error_lines = {} for error in errors: match = re.match(r"(.):(\d+):\d+: error:.\[(.*)\]", error) if match: file_path, line_number, error_type = match.groups() if file_path not in error_lines: error_lines[file_path] = {} error_lines[file_path][int(line_number)] = error_type for file_path, lines in error_lines.items(): with open(file_path, "r") as f: code = f.readlines() for line_number, error_type in sorted(lines.items(), key=lambda x: x[0], reverse=True): code[line_number - 1] = code[line_number - 1].rstrip() + f" # type: ignore[{error_type}]\n" with open(file_path, "w") as f: f.writelines(code) ``` Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/118533 Approved by: https://github.com/Skylion007, https://github.com/zou3519	2024-01-30 05:08:10 +00:00
Yanbo Liang	ca1d70632d	[14/N][Dynamo] Make trace_rules.lookup only handle function + callable type (#118366 ) Step by step changes to unblock #118264 Pull Request resolved: https://github.com/pytorch/pytorch/pull/118366 Approved by: https://github.com/angelayi	2024-01-27 23:02:44 +00:00
rzou	5e0ef84b01	[dynamo] Refactor install_global_once, remove usages of install_global_unsafe (#118100 ) We split install_global_once into two APIs: - `install_global_by_id(prefix, value) -> name`: installs a global if it hasn't been installed yet - `install_global(prefix, value) -> name`: always installs the global (and generates a unique name for it) Then, we refactor most callsites of `install_global_unsafe` to one of the previous. Some callsites cannot be refactored because we create the global name first, do a lot of stuff with it, and then install it. This fixes more test flakiness. Test Plan: - Existing tests; I can't reliably repro the flakiness Pull Request resolved: https://github.com/pytorch/pytorch/pull/118100 Approved by: https://github.com/ezyang, https://github.com/mlazos	2024-01-24 23:25:44 +00:00
Yanbo Liang	c0732c8d5e	[Dynamo] Add complex to literal constant (#117819 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/117819 Approved by: https://github.com/zou3519	2024-01-23 23:46:46 +00:00
rzou	e309d6fa1c	Better unsupported op error message (#117770 ) Previously, if someone wrote a python abstract impl but didn't import the module it is in, then we would raise an error message suggesting that the user needs to add an abstract impl for the operator. In addition to this, we suggest that the user try importing the module associated with the operator in the pystub (it's not guaranteed that an abstract impl does exist) to avoid confusion. Test Plan: - new test Pull Request resolved: https://github.com/pytorch/pytorch/pull/117770 Approved by: https://github.com/ydwu4, https://github.com/williamwen42	2024-01-23 15:05:16 +00:00
lezcano	f4df0f061c	Implement set in terms of dict (#110524 ) This allows us to heavily simplify the implementation of set, which was "quite unique". Now we represent a set as a dict where all its values are None. Pull Request resolved: https://github.com/pytorch/pytorch/pull/110524 Approved by: https://github.com/jansel ghstack dependencies: #112252, #117630	2024-01-18 09:36:41 +00:00
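The representation named in #110524 (a set stored as a dict whose values are all None) can be illustrated in plain Python. This is only a sketch of the idea: `DictBackedSet` is a hypothetical class, not Dynamo's variable tracker for sets.

```python
class DictBackedSet:
    """Hypothetical illustration: a set stored as a dict with None values."""

    def __init__(self, items=()):
        # dict.fromkeys deduplicates keys and maps each one to None.
        self._items = dict.fromkeys(items)

    def add(self, item):
        # Adding an element is just a dict store with a None value.
        self._items[item] = None

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        # Iterating the dict yields its keys, i.e. the set elements.
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def union(self, other):
        # Build a new set from the elements of both operands.
        return DictBackedSet([*self, *other])
```

Membership, insertion and iteration all reduce to existing dict operations, which is why a dict-backed set needs little code of its own.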
Simon Fan	88bf84f106	[benchmark] add --compile-autograd to dynamo benchmarks (#117196 ) Adds `--compile-autograd` flag to benchmark suite to run accuracy and performance tests. Also adds autograd_captures and autograd_compiles to dynamo stats e.g. accuracy_inductor.csv ``` dev,name,batch_size,accuracy,calls_captured,unique_graphs,graph_breaks,unique_graph_breaks,autograd_captures,autograd_compiles cuda,BERT_pytorch,4,pass,2655,2,8,7,1,1 cuda,Background_Matting,4,pass_due_to_skip,0,0,0,0,0,0 cuda,DALLE2_pytorch,0,eager_fail_to_run,0,0,0,0,0,0 cuda,LearningToPaint,4,pass,639,2,8,7,1,1 ... ``` e.g. speedup_inductor.csv ``` dev,name,batch_size,speedup,abs_latency,compilation_latency,compression_ratio,eager_peak_mem,dynamo_peak_mem,calls_captured,unique_graphs,graph_breaks,unique_graph_breaks,autograd_captures,autograd_compiles cuda,hf_T5,8,1.214311,136.236793,88.350570,0.751322,18.754706,24.962275,3298,2,8,8,1,1 cuda,hf_T5,8,1.226645,135.431856,52.461461,1.040973,18.754706,18.016508,795,1,7,7,0,0 ... ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/117196 Approved by: https://github.com/jansel	2024-01-11 20:12:58 +00:00
Edward Z. Yang	5b24877663	Improve uint{16,32,64} dlpack/numpy compatibility (#116808 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/116808 Approved by: https://github.com/malfet, https://github.com/albanD	2024-01-11 17:01:54 +00:00
voznesenskym	4c0d63180a	Support NNModules as dict keys (#116723 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116723 Approved by: https://github.com/lezcano	2024-01-09 03:32:47 +00:00
voznesenskym	de005b14ab	[dynamo] fix more broken dict tests (#116943 ) Forward fixing after #111196 Pull Request resolved: https://github.com/pytorch/pytorch/pull/116943 Approved by: https://github.com/huydhn	2024-01-07 08:00:16 +00:00
voznesenskym	83e8a0721d	Reland #111196 (take 4) "Support tensors as Dict keys" (#116934 ) Fixes #ISSUE_NUMBER See that PR Pull Request resolved: https://github.com/pytorch/pytorch/pull/116934 Approved by: https://github.com/ezyang, https://github.com/huydhn	2024-01-07 01:37:26 +00:00
PyTorch MergeBot	2dca3e99eb	Revert "Support tensors as Dict keys Re-PR of #111196 (#116785 )" This reverts commit `1badad9ce9`. Reverted https://github.com/pytorch/pytorch/pull/116785 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/116785#issuecomment-1879592261))	2024-01-06 08:22:33 +00:00
voznesenskym	1badad9ce9	Support tensors as Dict keys Re-PR of #111196 (#116785 ) This prepares the PR where we implement sets in terms of dicts. To do so, rather than storing internally a dictionary that maps literals to VariableTrackers, it stores (pretty much) a dictionary from VTs to VTs. To do so, keys are wrapped in an opaque internal class _Hashable. The Hashable class is opaque on purpose so that it fails hard if it inadvertently leaks back into user code. We also found and fixed a number of latent bugs and inconsistencies in the way dynamo checked what can be a dict key. More generally, we make much clearer what are the things that need to be modified to add a new supported key type to Dicts. Fixes [#107595](https://www.internalfb.com/tasks?t=107595) Fixes [#111603](https://www.internalfb.com/tasks?t=111603) Re-PR of https://github.com/pytorch/pytorch/pull/111196 sadly due to reverts, we could not reuse @lezcano's original PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116785 Approved by: https://github.com/mlazos	2024-01-06 03:35:35 +00:00
Edward Z. Yang	0249c4a785	Add config toggle suggestions for data-dependent/dynamic output shape (#114337 ) Fixes https://github.com/pytorch/pytorch/issues/114220 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/114337 Approved by: https://github.com/aakhundov	2024-01-05 14:01:01 +00:00
Aaron Gokaslan	86cd6655a1	[BE]: Use exist_ok arg for os.makedirs calls (#116561 ) Optimize os.makedirs calls to use exist_ok parameter when possible to avoid unnecessary checks. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116561 Approved by: https://github.com/malfet	2023-12-30 21:12:53 +00:00
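For reference, the pattern behind #116561 looks roughly like the following. The `ensure_dir` helper and the temporary paths are illustrative assumptions; only `os.makedirs(..., exist_ok=True)` is taken from the commit message.

```python
import os
import tempfile


def ensure_dir(path):
    # exist_ok=True turns the call into a no-op when the directory already
    # exists, so no separate os.path.exists() check is needed beforehand.
    os.makedirs(path, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "a", "b")
    ensure_dir(target)
    ensure_dir(target)  # the second call does not raise FileExistsError
    created = os.path.isdir(target)
```

Besides being shorter, this avoids the race between checking for the directory and creating it.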
Yanbo Liang	d59350cc1c	[Dynamo] Consolidate common constant types (#116366 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/116366 Approved by: https://github.com/Skylion007	2023-12-27 23:54:35 +00:00
Yanbo Liang	f657b2b1f8	[Dynamo][10/N] Remove TorchVariable and is_allowed (#116312 ) After this refactor: * ```TorchVariable``` definition and all references are removed. * All ```is_allowed``` references except one are removed. - The only left one is in ```torch/_dynamo/decorators:_disallow_in_graph_helper```. It was called when users put ```disallow_in_graph``` decorator on a function. Since we use the lists in ```trace_rules``` to decide the function's trace rule, so the decorator would only be used as customer function rather than torch functions. I'll defer this to a separate decorator refactor PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116312 Approved by: https://github.com/jansel	2023-12-27 18:47:05 +00:00
PyTorch MergeBot	3b709d7c1e	Revert "[Dynamo][10/N] Remove TorchVariable and is_allowed (#116312 )" This reverts commit `015bd0e0a1`. Reverted https://github.com/pytorch/pytorch/pull/116312 on behalf of https://github.com/kit1980 due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/116312#issuecomment-1869825506))	2023-12-26 23:47:15 +00:00
PyTorch MergeBot	0edc348788	Revert "[Dynamo] Consolidate common constant types (#116366 )" This reverts commit `36dccc2aba`. Reverted https://github.com/pytorch/pytorch/pull/116366 on behalf of https://github.com/kit1980 due to Need to revert this because of https://github.com/pytorch/pytorch/pull/116312 ([comment](https://github.com/pytorch/pytorch/pull/116366#issuecomment-1869821625))	2023-12-26 23:36:52 +00:00
Yanbo Liang	36dccc2aba	[Dynamo] Consolidate common constant types (#116366 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/116366 Approved by: https://github.com/Skylion007	2023-12-24 22:58:01 +00:00
Yanbo Liang	015bd0e0a1	[Dynamo][10/N] Remove TorchVariable and is_allowed (#116312 ) After this refactor: * ```TorchVariable``` definition and all references are removed. * All ```is_allowed``` references except one are removed. - The only left one is in ```torch/_dynamo/decorators:_disallow_in_graph_helper```. It was called when users put ```disallow_in_graph``` decorator on a function. Since we use the lists in ```trace_rules``` to decide the function's trace rule, so the decorator would only be used as customer function rather than torch functions. I'll defer this to a separate decorator refactor PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/116312 Approved by: https://github.com/jansel	2023-12-23 09:44:09 +00:00
Shunting Zhang	99f7e721fe	[inductor] make inductor work with new triton compile interface (#115878 ) Recent 2 triton PRs (https://github.com/openai/triton/pull/2701, https://github.com/openai/triton/pull/2756) change the interface for triton.compile, this PR added the necessary change on inductor side to work with both old and new compile API. Also there is some simplification between compilation call in subprocess and the one in main process - previously we pass warm_cache_only=True if the compilation happens in subprocess. But triton never use that argument in the currently used pin. So I removed that - previously we only pass compute_capability if compilation happens in subprocess. The PR change that to always passing compute_capability to triton.compile no matter if the compilation happens in main or sub process. Updated: There are more interface change from triton side. E.g. - tl.math.{min, max} now requires a propagate_nan argument - JITFunction.run now requires a warmup argument. This affect the benchmarking phase of matmul max-autotune; on the other hand, JITFunction.run forbids stream argument now. Simply removing passing this in when benchmarking matmul triton kernel will work for both old and new version of triton. - triton Autotuner change attribute name from 'warmup' to 'num_warmup' and from 'rep' to 'num_rep'. This cause dynamo failed to handle triton Autotuner object since dynamo TritonKernelVariable makes assumption about attribute names. It's used in some test cases that a model call triton Autotuner directly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115878 Approved by: https://github.com/jansel	2023-12-22 00:09:29 +00:00
PyTorch MergeBot	db35ccf463	Revert "[innductor] make inductor work with new triton compile interface (#115878 )" This reverts commit `bbded928b3`. Reverted https://github.com/pytorch/pytorch/pull/115878 on behalf of https://github.com/kit1980 due to Broke ROCm https://github.com/pytorch/pytorch/actions/runs/7282149837/job/19844618618 ([comment](https://github.com/pytorch/pytorch/pull/115878#issuecomment-1865369349))	2023-12-21 02:00:17 +00:00
Yanbo Liang	be9de33240	[Dynamo][9/N] Make SkipFilesVariable wrap functions only (#115963 ) Make ```SkipFilesVariable``` only handle function type, and route skipped classes to ```UserDefinedClassVariable```. The reasons behind this are: * We'd like to remove ```is_allowed```, so the allowed/disallowed torch classes should have a proper place to handle. We can put them in either ```SkipFilesVariable``` and ```UserDefinedClassVariable``` under the current architecture, but it's confusing to have two places do one thing. - Going forward, let's make ```SkipFilesVariable``` only handle functions, and probably I'll rename it to ```SkippedFunctionVariable``` in the following PRs. - Let's do dispatch by value's type, all torch classes stuff would go to ```UserDefinedClassVariable``` in the next PR. * We'd merge in_graph/skip/inline trace decision into the same API ```trace_rule.lookup```, so probably we have to limit the input to only function for better organizing ```VariableBuilder._wrap``` logics. - Next step, I'll merge ```skipfiles.check``` into ```trace_rules.lookup```, and do the skipfile check before wrapping them into correct variable tracker. - Though the ```TorchCtxManagerClassVariable``` is decided by ```trace_rules.lookup```, I'll refactor it out in the following PRs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115963 Approved by: https://github.com/jansel	2023-12-21 01:35:07 +00:00
Shunting Zhang	bbded928b3	[inductor] make inductor work with new triton compile interface (#115878 ) Two recent triton PRs (https://github.com/openai/triton/pull/2701, https://github.com/openai/triton/pull/2756) change the interface of triton.compile; this PR adds the necessary changes on the inductor side to work with both the old and new compile APIs. It also simplifies the differences between the compilation call in a subprocess and the one in the main process: - previously we passed warm_cache_only=True if the compilation happened in a subprocess, but triton never uses that argument in the currently used pin, so I removed it - previously we only passed compute_capability if the compilation happened in a subprocess; this PR changes that to always pass compute_capability to triton.compile, whether the compilation happens in the main process or a subprocess. Updated: There are more interface changes on the triton side, e.g. - tl.math.{min, max} now requires a propagate_nan argument - JITFunction.run now requires a warmup argument. This affects the benchmarking phase of matmul max-autotune; on the other hand, JITFunction.run now forbids the stream argument. Simply not passing it when benchmarking the matmul triton kernel works for both old and new versions of triton. - the triton Autotuner changed attribute names from 'warmup' to 'num_warmup' and from 'rep' to 'num_rep'. This caused dynamo to fail to handle triton Autotuner objects, since dynamo's TritonKernelVariable makes assumptions about attribute names. This matters for some test cases in which a model calls the triton Autotuner directly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115878 Approved by: https://github.com/jansel	2023-12-21 00:03:38 +00:00
Michael Lazos	8eb7f6276b	Ensure wrapping subclasses with `as_subclass` is supported (#116091 ) As title Pull Request resolved: https://github.com/pytorch/pytorch/pull/116091 Approved by: https://github.com/pmeier, https://github.com/zou3519	2023-12-20 14:37:08 +00:00
PyTorch MergeBot	bdfabe5e7d	Revert "[Dynamo][9/N] Make SkipFilesVariable wrap functions only (#115963 )" This reverts commit `bb5a27052f`. Reverted https://github.com/pytorch/pytorch/pull/115963 on behalf of https://github.com/jeanschmidt due to causing significant performance regression, identified by number of ops in ads, please check internal diff ([comment](https://github.com/pytorch/pytorch/pull/115963#issuecomment-1864361697))	2023-12-20 12:06:55 +00:00
Yanbo Liang	bb5a27052f	[Dynamo][9/N] Make SkipFilesVariable wrap functions only (#115963 ) Make ```SkipFilesVariable``` handle only function types, and route skipped classes to ```UserDefinedClassVariable```. The reasons behind this are: * We'd like to remove ```is_allowed```, so the allowed/disallowed torch classes need a proper place to be handled. Under the current architecture we could put them in either ```SkipFilesVariable``` or ```UserDefinedClassVariable```, but it's confusing to have two places do one thing. - Going forward, let's make ```SkipFilesVariable``` handle only functions; I'll probably rename it to ```SkippedFunctionVariable``` in the following PRs. - Let's dispatch by the value's type; all torch class handling will go to ```UserDefinedClassVariable``` in the next PR. * We'd like to merge the in_graph/skip/inline trace decision into a single API, ```trace_rules.lookup```, so we probably have to limit the input to functions only to better organize the ```VariableBuilder._wrap``` logic. - As a next step, I'll merge ```skipfiles.check``` into ```trace_rules.lookup```, and do the skipfile check before wrapping values into the correct variable tracker. - Though ```TorchCtxManagerClassVariable``` is decided by ```trace_rules.lookup```, I'll refactor it out in the following PRs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115963 Approved by: https://github.com/jansel	2023-12-19 02:01:47 +00:00
David Berard	054f9548b4	[dynamo] Store CompilationEvents in a buffer in torch._dynamo.utils (#115788 ) Motivation: it would be nice to be able to test using the metrics in log_compilation_event; currently it only dumps logs (or logs to a database in fbcode), and these are hard to use in unit tests. This change: * always records the information in torch._dynamo.utils.record_compilation_metrics; here, it logs into a limited-size deque to prevent the list of metrics from getting too long * if config.log_compilation_metrics is set, calls back into the original log_compilation_event function Pull Request resolved: https://github.com/pytorch/pytorch/pull/115788 Approved by: https://github.com/yanboliang	2023-12-18 23:26:13 +00:00
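The bounded-buffer idea described in the commit above can be sketched as follows. This is a minimal illustration, not the code in `torch._dynamo.utils`: the `Metrics` record, the `MAX_BUFFERED` cap, and the `record`/`get_buffered` helpers are hypothetical names chosen for the example.

```python
from collections import deque
from dataclasses import dataclass


# Hypothetical stand-in for a compilation metrics record (assumption,
# not the real CompilationMetrics class).
@dataclass
class Metrics:
    co_name: str
    guard_count: int


MAX_BUFFERED = 3  # assumed small cap, chosen only for illustration

# A deque with maxlen drops the oldest entry once the cap is reached,
# so the buffer cannot grow without bound.
_buffer = deque(maxlen=MAX_BUFFERED)


def record(metrics, log_enabled=False, sink=None):
    """Always buffer the record; forward it to `sink` only when logging is on."""
    _buffer.append(metrics)
    if log_enabled and sink is not None:
        sink(metrics)


def get_buffered():
    """Return a snapshot that unit tests can inspect."""
    return list(_buffer)


for i in range(5):
    record(Metrics(co_name=f"fn{i}", guard_count=i))
```

A test can then assert on `get_buffered()` directly instead of parsing log output.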
Yanbo Liang	b4d6443bcf	[Dynamo] Log innermost user frame filename & lineno for better error aggregation (#115899 ) CompilationMetrics example: ``` frame_key='1', co_name='fn', co_filename='/data/users/ybliang/debug/debug1.py', co_firstlineno=58, cache_size=0, accumulated_cache_size=0, guard_count=None, graph_op_count=None, graph_node_count=None, graph_input_count=None, entire_frame_compile_time_s=None, backend_compile_time_s=None, fail_type="<class 'torch._dynamo.exc.Unsupported'>", fail_reason='custome dict init with args/kwargs unimplemented', fail_user_frame_filename='/data/users/ybliang/debug/debug1.py', fail_user_frame_lineno=61 ``` where: * ```fail_type``` and ```fail_reason``` are exceptions inside of Dynamo. * ```fail_user_frame_filename``` and ```fail_user_frame_lineno``` are where the original user code triggered the exception. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115899 Approved by: https://github.com/davidberard98, https://github.com/ydwu4	2023-12-15 08:24:55 +00:00
David Berard	67232199b1	[dynamo] Log shape_env_guard_count separately from guard_count (#115776 ) guard_count counts all the shape_env guards as a single guard; log the shape_env_guard_count separately so those metrics can be used. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115776 Approved by: https://github.com/yanboliang	2023-12-14 20:12:49 +00:00
Michael Lazos	869e52e3dd	Support torch function user objects (#111765 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/111765 Approved by: https://github.com/jansel	2023-12-13 22:11:52 +00:00
Michael Lazos	fbeca60b1f	Remove replace_all and make VTs mutable (#113725 ) 1. Removes calls to `replace_all` and `clone` and makes VTs mutable. 2. Properly handles Tuple Iterator mutation. Previously TupleIterator variables would only be properly reconstructed if they were advanced at least once in a frame. On calls to `next`, the source information would be lost (due to constructing a new iterator without using builder), which would ensure that during codegen the variable would be reconstructed from scratch. Now that VTs are mutated, the source is never lost, so we need to properly track mutation and handle it by replaying calls to `next` at the end of the modified bytecode. 3. Added test for checking iadd side effects, this was missing in our unit test coverage. 4. Fixed two incorrect sources, DelayGraphBreakVariable, and UserMethodVariable both relied on setting the source to AttrSource(parent, name) at the callsite of `var_getattr`. 5. Fixed a bug in inplace adding for lists, it would set the resulting VariableTracker's source to `None` which would utilize a different reconstruct path in codegen. Now this is handled explicitly by reconstructing vars when allow_cache=`False`, so that during side effect replay, the mutated var is correctly updated. In subsequent PRs: * Refactoring side effect tracking to be significantly simpler (I think we only need an `is_modified` flag) * Refactor `next_variables` iterator to match the signature of `next` * Remove all references to `options` in the code * Refactor VTs representing mutable collections to implement their own mutation update handling * Remove clone and/or make it specific to lists for creating slices * Add mutation tracking/replay for sets * Add mutation tracking/replay for iter.py * Removing setting source in builder (it's set at the top level after a var is returned) Pull Request resolved: https://github.com/pytorch/pytorch/pull/113725 Approved by: https://github.com/jansel	2023-12-10 09:31:21 +00:00
Yanbo Liang	da341d0d48	[Dynamo][6.1/N] Refactor out TorchInGraphFunctionVariable and improve heuristic (#113432 ) This was split out of #113009; please see https://github.com/pytorch/pytorch/pull/113009#issuecomment-1804417925 for more details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113432 Approved by: https://github.com/ezyang, https://github.com/jansel	2023-12-09 05:11:44 +00:00
PyTorch MergeBot	e8e4141773	Revert "[Dynamo][6.1/N] Refactor out TorchInGraphFunctionVariable and improve heuristic (#113432 )" This reverts commit `e61d6b42f0`. Reverted https://github.com/pytorch/pytorch/pull/113432 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is failing dynamo tests in trunk `e61d6b42f0`, landrace? ([comment](https://github.com/pytorch/pytorch/pull/113432#issuecomment-1847787981))	2023-12-08 20:15:39 +00:00
Yanbo Liang	e61d6b42f0	[Dynamo][6.1/N] Refactor out TorchInGraphFunctionVariable and improve heuristic (#113432 ) This was split out of #113009; please see https://github.com/pytorch/pytorch/pull/113009#issuecomment-1804417925 for more details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113432 Approved by: https://github.com/ezyang, https://github.com/jansel	2023-12-08 17:15:14 +00:00
Yanbo Liang	4620170008	[Dynamo] Revert multiple PRs since they triggered compilation stuck internally (#115126 ) Revert the following PRs to mitigate compilation getting stuck internally: #113432 #114016 #114507 #114196 #114739 #114669 Pull Request resolved: https://github.com/pytorch/pytorch/pull/115126 Approved by: https://github.com/xush6528	2023-12-05 22:35:37 +00:00
rzou	c56d91ba39	Log pt2_compliant custom ops used with torch.compile (#115083 ) Summary: We already log non-pt2_compliant ops. This PR extends the logging to include pt2_compliant custom ops. We do not log all pt2_compliant ops (i.e. including builtin ops) because it would probably take too much memory Test Plan: Tested locally Pull Request resolved: https://github.com/pytorch/pytorch/pull/115083 Approved by: https://github.com/yanboliang, https://github.com/williamwen42	2023-12-05 00:51:33 +00:00
Yanbo Liang	ab5385fc50	[Dynamo][6.3/N] Further cleanup torch.py (#114669 ) A follow-up PR to clean up what I found during the refactor of torch.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/114669 Approved by: https://github.com/jansel	2023-12-01 04:08:29 +00:00
Aaron Gokaslan	4bb3a02d02	[BE]: Enable Ruff + Flake8 G201,G202 logging format rule. (#114474 ) Standardizes logging calls to always use logging.exception instead of logging.error where appropriate and enforces it with a lint. Pull Request resolved: https://github.com/pytorch/pytorch/pull/114474 Approved by: https://github.com/jansel, https://github.com/malfet	2023-11-27 17:38:08 +00:00
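As I understand the G201/G202 rules named above, they target `logging.error(..., exc_info=True)` inside exception handlers and a redundant `exc_info` argument on `logging.exception`. A minimal sketch of the preferred form follows; the `safe_div` function and the logger name are made up for illustration.

```python
import logging

log = logging.getLogger("g201_demo")  # hypothetical logger name


def safe_div(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        # Flagged style:   log.error("division failed", exc_info=True)
        # Preferred style: log.exception(...) logs at ERROR level and
        # attaches the active traceback automatically.
        log.exception("division failed: %s / %s", a, b)
        return None
```

`log.exception` is only meaningful inside an `except` block, since it records the exception currently being handled.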
PyTorch MergeBot	8232d4d1c3	Revert "[BE]: Enable Ruff + Flake8 G201,G202 logging format rule. (#114474 )" This reverts commit `d30497f6b6`. Reverted https://github.com/pytorch/pytorch/pull/114474 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but I see a bunch of inductor failure after the commit `d30497f6b6`, trying to revert to see if it helps fix the issues ([comment](https://github.com/pytorch/pytorch/pull/114474#issuecomment-1827271887))	2023-11-27 07:36:08 +00:00
voznesenskym	081c5b3adc	Add Stateful/Stateless symbolic contexts, use fresh fake mode for dynamo backends (#113926 ) (#114526 ) Summary: The primary problem we are setting out to solve here is fake tensor freshness. Before this PR, fake tensors after dynamo represented fake tensors at the end of trace, so subsequent retraces like aot_autograd would start off with fake tensors in the wrong (end result) state, rather than their expected fresh state. The solution here is to start a fresh fake mode, and re-fakify the tensors. The nuance comes from ensuring that symbols are uniformly created for the symbolic sizes and strides of the tensor. This PR is the result of a lot of back and forth with ezyang and eellison. Initially, the first pass at this was not super different from what we have in the PR - the broad strokes were the same: 1) We cache source->symbol in shape_env 2) We pass policy objects around, stored at dynamo fakification time, and reused for later fakification 3) We create a new fake mode for backends (from https://github.com/pytorch/pytorch/pull/113605/files) This is ugly, and has some layering violations. We detoured our decision making through a few other alternatives. Immutable/mutable fake tensor mode was the most interesting alternative, https://github.com/pytorch/pytorch/pull/113653, and was struck down on concerns of complexity in fake mode combined with it not covering all edge cases. We also detoured on what to do about tensor memoization returning back potentially different tensors than requested, and if that was an anti pattern (it is) we want to hack in with the symbol cache (we don't). 
We went back to the drawing board here, but with a few concessions: 1) the cache for source->symbol must live outside of shape_env, for both lifecycle, and layering reasons 2) A good amount of work needs to be done to pipe policy around fake_mode and meta_utils correctly, to cover all the cases (ezyang did this) cc penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 aakhundov kadeng imported-using-ghimport Test Plan: Imported from OSS Reviewed By: huydhn, Chillee Differential Revision: D51566250 Pulled By: voznesenskym Pull Request resolved: https://github.com/pytorch/pytorch/pull/114526 Approved by: https://github.com/Chillee, https://github.com/huydhn	2023-11-26 23:40:32 +00:00
Aaron Gokaslan	d30497f6b6	[BE]: Enable Ruff + Flake8 G201,G202 logging format rule. (#114474 ) Standardizes logging calls to always use logging.exception instead of logging.error where appropriate and enforces it with a lint. Pull Request resolved: https://github.com/pytorch/pytorch/pull/114474 Approved by: https://github.com/jansel	2023-11-24 23:29:51 +00:00
PyTorch MergeBot	2f3beb715c	Revert "Add Stateful/Stateless symbolic contexts, use fresh fake mode for dynamo backends (#113926 )" This reverts commit `2ca1119d53`. Reverted https://github.com/pytorch/pytorch/pull/113926 on behalf of https://github.com/DanilBaibak due to Break internal build ([comment](https://github.com/pytorch/pytorch/pull/113926#issuecomment-1822713852))	2023-11-22 12:52:33 +00:00
voznesenskym	2ca1119d53	Add Stateful/Stateless symbolic contexts, use fresh fake mode for dynamo backends (#113926 ) The primary problem we are setting out to solve here is fake tensor freshness. Before this PR, fake tensors after dynamo represented fake tensors at the end of trace, so subsequent retraces like aot_autograd would start off with fake tensors in the wrong (end result) state, rather than their expected fresh state. The solution here is to start a fresh fake mode, and re-fakify the tensors. The nuance comes from ensuring that symbols are uniformly created for the symbolic sizes and strides of the tensor. This PR is the result of a lot of back and forth with @ezyang and @eellison. Initially, the first pass at this was not super different from what we have in the PR - the broad strokes were the same: 1) We cache source->symbol in shape_env 2) We pass policy objects around, stored at dynamo fakification time, and reused for later fakification 3) We create a new fake mode for backends (from https://github.com/pytorch/pytorch/pull/113605/files) This is ugly, and has some layering violations. We detoured our decision making through a few other alternatives. Immutable/mutable fake tensor mode was the most interesting alternative, https://github.com/pytorch/pytorch/pull/113653, and was struck down on concerns of complexity in fake mode combined with it not covering all edge cases. We also detoured on what to do about tensor memoization returning back potentially different tensors than requested, and if that was an anti pattern (it is) we want to hack in with the symbol cache (we don't). 
We went back to the drawing board here, but with a few concessions: 1) the cache for source->symbol must live outside of shape_env, for both lifecycle, and layering reasons 2) A good amount of work needs to be done to pipe policy around fake_mode and meta_utils correctly, to cover all the cases (@ezyang did this) Pull Request resolved: https://github.com/pytorch/pytorch/pull/113926 Approved by: https://github.com/ezyang, https://github.com/eellison	2023-11-20 23:06:37 +00:00
Jez Ng	631fb33fd6	Enable import following in MYPYNOFOLLOW (now MYPYINDUCTOR) (#113830 ) Skipping importing some packages for now to make this change more tractable. For some reason, lintrunner on CI raises errors in all imported `.pyi` files, even though it doesn't on my local machine. The errors are all from missing generic types, as the MYPYINDUCTOR config has `disallow_any_generics` set. I have thus added `disable-error-code` comments to the relevant files, though I fixed a few that were easy enough. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113830 Approved by: https://github.com/Skylion007 ghstack dependencies: #113722, #113721	2023-11-17 18:24:21 +00:00
William Wen	2530d47cbe	[dynamo] re-add option to log all guard check fails (#113585 ) Followup to https://github.com/pytorch/pytorch/pull/110325 - re-add the `report_all_guard_failures` config as a logging artifact `recompiles_verbose` with the following changes:
- evaluating the check must be wrapped with exception handling because subsequent code parts following the first failure may result in errors if evaluated (e.g. if a guard checks first for size, then tries to index - a guard failure due to insufficient size would result in an index error for the latter check).
- Adding a test for this case

Sample:
```python
import torch

def fn(x):
    return torch.rand(x[-1], len(x))

opt_fn = torch.compile(fn)
opt_fn([4, 5, 6])
opt_fn([7, 8])
opt_fn([9])
```
Output (with `TORCH_LOGS="recompiles_verbose"`):
```bash
[2023-11-15 16:13:26,741] torch._dynamo.guards.__recompiles_verbose: [DEBUG] Recompiling function fn in /data/users/williamwen/pytorch/playground5.py:15
[2023-11-15 16:13:26,741] torch._dynamo.guards.__recompiles_verbose: [DEBUG] triggered by the following guard failure(s):
[2023-11-15 16:13:26,741] torch._dynamo.guards.__recompiles_verbose: [DEBUG] guard 0 failures:
[2023-11-15 16:13:26,741] torch._dynamo.guards.__recompiles_verbose: [DEBUG] - len(L['x']) == 3
[2023-11-15 16:13:26,741] torch._dynamo.guards.__recompiles_verbose: [DEBUG] - L['x'][0] == 4
[2023-11-15 16:13:26,741] torch._dynamo.guards.__recompiles_verbose: [DEBUG] - L['x'][1] == 5
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG] Recompiling function fn in /data/users/williamwen/pytorch/playground5.py:15
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG] triggered by the following guard failure(s):
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG] guard 0 failures:
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG] - len(L['x']) == 2
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG]
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG] guard 1 failures:
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG] - len(L['x']) == 3
[2023-11-15 16:13:26,970] torch._dynamo.guards.__recompiles_verbose: [DEBUG] - L['x'][0] == 4
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113585 Approved by: https://github.com/jon-chuang, https://github.com/ezyang	2023-11-16 21:20:29 +00:00
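The exception-handling point in the commit above can be illustrated with a small sketch. It is not Dynamo's guard code: `failing_checks` is a hypothetical helper, guard checks are modelled as plain expression strings, and a check that raises is skipped rather than reported. That last choice is an assumption, made because it is consistent with the sample output, where `L['x'][1] == 5` is absent for the input `[9]`.

```python
def failing_checks(checks, local_vars):
    """Evaluate every check expression and collect the ones that are false.

    Hypothetical helper: `checks` are strings such as "len(L['x']) == 3",
    evaluated with `L` bound to `local_vars`.
    """
    failures = []
    for expr in checks:
        try:
            ok = bool(eval(expr, {"L": local_vars}))
        except Exception:
            # A later check may be invalid once an earlier one has failed,
            # e.g. indexing past the end of a list that is too short.
            # Assumption: such checks are skipped instead of reported.
            continue
        if not ok:
            failures.append(expr)
    return failures


checks = ["len(L['x']) == 3", "L['x'][0] == 4", "L['x'][1] == 5"]
short_input = failing_checks(checks, {"x": [9]})
```

Without the `try`/`except`, evaluating the third check against `[9]` would raise `IndexError` and abort the report.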
Jez Ng	0a9dbbbaad	Make _inductor/fx_utils.py, _dynamo/utils.py pass follow_imports typechecking (#113722 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/113722 Approved by: https://github.com/lezcano	2023-11-16 05:44:15 +00:00
Jez Ng	a3b859fc67	Drop dynamo-specific type hints on Tensor in favor of type-ignores (#113720 ) Per [this][1] discussion, plus some offline discussion. The summary: @albanD considers the core PyTorch types like Tensor to be extremely brittle, and does not think adding these typed attributes is worth the risk. @eellison mentioned that we could use `WeakTensorKeyDictionary` instead. However, based on the sparse usage of these bonus attributes, I think that would be overkill. So I've opted to go with a few more type-ignore comments instead. [1]: https://github.com/pytorch/pytorch/pull/113610#discussion_r1392907367 Pull Request resolved: https://github.com/pytorch/pytorch/pull/113720 Approved by: https://github.com/ezyang, https://github.com/albanD, https://github.com/eellison ghstack dependencies: #113534, #113610	2023-11-16 01:54:00 +00:00
PyTorch MergeBot	5d170fce29	Revert "Support tensors as Dict keys (#111196 )" This reverts commit `b0805fa5d0`. Reverted https://github.com/pytorch/pytorch/pull/111196 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is failing internally. I will provide the details there ([comment](https://github.com/pytorch/pytorch/pull/111196#issuecomment-1813410149))	2023-11-15 23:08:00 +00:00
Yanbo Liang	6b01126df5	[Easy] [Dynamo] Catch OSError when calling inspect.getfile (#113671 ) Fixes #111328 Pull Request resolved: https://github.com/pytorch/pytorch/pull/113671 Approved by: https://github.com/Skylion007, https://github.com/williamwen42	2023-11-14 22:15:32 +00:00
Aaron Gokaslan	18d7b8e4f7	[BE]: ruff apply rule PLW1510 to find silent subprocess errors (#113644 ) Reopens #111682 that I messed up due to a bad rebase and triggered some issues with CLA. This explicitly adds check=True or False to any subprocess calls where appropriate. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113644 Approved by: https://github.com/ezyang, https://github.com/kit1980	2023-11-14 20:59:40 +00:00
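The effect of an explicit `check=` argument, which the commit above adds to `subprocess` calls, can be shown with a short sketch. The `run_py` helper and the failing child command are made up for the example.

```python
import subprocess
import sys


def run_py(code, check):
    """Run a Python snippet in a child interpreter (hypothetical helper)."""
    return subprocess.run([sys.executable, "-c", code], check=check)


failing = "import sys; sys.exit(3)"

# check=False: the failure is silent unless the caller inspects returncode.
result = run_py(failing, check=False)

# check=True: a non-zero exit status raises CalledProcessError.
try:
    run_py(failing, check=True)
    raised = False
except subprocess.CalledProcessError as err:
    raised = err.returncode == 3
```

Writing `check=False` explicitly documents that ignoring the exit status is intentional.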
Aaron Gokaslan	b7b2178204	[BE]: Remove useless lambdas (#113602 ) Applies PLW0108 which removes useless lambda calls in Python, the rule is in preview so it is not ready to be enabled by default just yet. These are the autofixes from the rule. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113602 Approved by: https://github.com/albanD	2023-11-14 20:06:48 +00:00
lezcano	b0805fa5d0	Support tensors as Dict keys (#111196 ) This prepares the PR where we implement sets in terms of dicts. To do so, rather than storing internally a dictionary that maps literals to VariableTrackers, it stores (pretty much) a dictionary from VTs to VTs. To do so, keys are wrapped in an opaque internal class `_Hashable`. The `_Hashable` class is opaque on purpose so that it fails hard if it inadvertently leaks back into user code. We also found and fixed a number of latent bugs and inconsistencies in the way dynamo checked what can be a dict key. More generally, we make much clearer what are the things that need to be modified to add a new supported key type to Dicts. Fixes https://github.com/pytorch/pytorch/issues/107595 Fixes https://github.com/pytorch/pytorch/issues/111603 Pull Request resolved: https://github.com/pytorch/pytorch/pull/111196 Approved by: https://github.com/jansel	2023-11-14 19:14:03 +00:00
Jez Ng	d00c983b63	[dynamo] Make {testing,debug_utils,utils}.py pass follow_imports typechecking (#113519 ) Notes: * `debug_insert_nops` in testing.py was passing `None` to the compiler_fn parameter of `OutputGraph`, hence the modifications there. * I added `disable-error-code="method-assign"` to debug_utils.py as it does several such assignments. I guess mypy doesn't like it because it makes code near-impossible to safely typecheck. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113519 Approved by: https://github.com/Skylion007 ghstack dependencies: #113413, #113518	2023-11-11 22:15:46 +00:00
Jez Ng	c1fa708b03	[dynamo] Enable typechecking for utils.py (#112971 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/112971 Approved by: https://github.com/lezcano, https://github.com/jansel ghstack dependencies: #112130, #112970	2023-11-08 21:17:45 +00:00
Jason Ansel	3914566c73	[dynamo] Refactor OrderedDict to dict (#113234 ) In Python3 all dicts are ordered. Pull Request resolved: https://github.com/pytorch/pytorch/pull/113234 Approved by: https://github.com/oulgen, https://github.com/lezcano	2023-11-08 09:27:08 +00:00
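A quick check of the claim in the commit above: plain dicts preserve insertion order (guaranteed by the language since Python 3.7). The two types still differ in one respect relevant to such a refactor: equality between two `OrderedDict` objects is order-sensitive, while equality between plain dicts is not. The variable names below are illustrative only.

```python
from collections import OrderedDict

# Plain dicts iterate in insertion order.
plain = {}
plain["b"] = 1
plain["a"] = 2
keys_in_order = list(plain)

# Same items, different insertion order.
same_items_plain = {"a": 2, "b": 1}
ordered_1 = OrderedDict([("b", 1), ("a", 2)])
ordered_2 = OrderedDict([("a", 2), ("b", 1)])

plain_equal = plain == same_items_plain  # order ignored for dict == dict
ordered_equal = ordered_1 == ordered_2   # order matters for OrderedDict == OrderedDict
```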
William Wen	ad1c3467e2	[dynamo] run guard fail hooks for each cache entry for which there is a cache miss (#110325 ) Attempt number 2 at https://github.com/pytorch/pytorch/issues/108950. Improves debugging for guard failures/recompilations by: - only running guard fail reason generation during recompilation, instead of when a guard fails during dynamo cache lookup (so generating guard failure reasons is not on the critical path) - ~~always reporting all guard failures~~ Reports the first-failing guard failure for each cache entry. We don't expect a performance hit since the guard fail reasons are only generated at recompile time rather than runtime. Perf benchmark to check this (https://hud.pytorch.org/benchmark/torchbench/inductor_with_cudagraphs?startTime=Fri,%2027%20Oct%202023%2017:42:43%20GMT&stopTime=Fri,%2003%20Nov%202023%2017:42:43%20GMT&granularity=hour&mode=training&dtype=amp&lBranch=gh/williamwen42/62/head&lCommit=f4724f5ffc6d17ceae513a42fc18627be7b85482&rBranch=main&rCommit=29f3d392bf230072e3bffae37b078e770cae1956). We may also need to verify this on benchmarks where guard fails are common. Sample script: ```python import torch def generate_data(b): return ( torch.randn(b, 3, 32, 32).to(torch.float32).cuda(), torch.randint(1000, (b,)).cuda(), ) from torchvision.models import resnet18 def init_model(): return resnet18().to(torch.float32).cuda() model = init_model() model_opt = torch.compile(model, dynamic=False) for b in range(16, 32): data = generate_data(b) model_opt(data[0]) ``` Sample logs: ```bash (/data/users/williamwen/py310-env) [williamwen@devgpu020.odn1 /data/users/williamwen/pytorch (wwen/log-all-guards)]$ python playground5.py /data/users/williamwen/pytorch/torch/_inductor/compile_fx.py:141: UserWarning: TensorFloat32 tensor cores for float32 matrix multiplication available but not enabled. Consider setting `torch.set_float32_matmul_precision('high')` for better performance. 
warnings.warn( [2023-11-06 14:50:47,605] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (8) [2023-11-06 14:50:47,605] torch._dynamo.convert_frame: [WARNING] function: 'forward' (/data/users/williamwen/torchvision/torchvision/models/resnet.py:284) [2023-11-06 14:50:47,605] torch._dynamo.convert_frame: [WARNING] last reason: tensor 'L['x']' size mismatch at index 0. expected 16, actual 24 [2023-11-06 14:50:47,605] torch._dynamo.convert_frame: [WARNING] To log all recompilation reasons, use TORCH_LOGS="recompiles". [2023-11-06 14:50:47,605] torch._dynamo.convert_frame: [WARNING] To diagnose recompilation issues, see https://pytorch.org/docs/master/compile/troubleshooting.html. (/data/users/williamwen/py310-env) [williamwen@devgpu020.odn1 /data/users/williamwen/pytorch (wwen/log-all-guards)]$ TORCH_LOGS="recompiles" python playground5.py /data/users/williamwen/pytorch/torch/_inductor/compile_fx.py:141: UserWarning: TensorFloat32 tensor cores for float32 matrix multiplication available but not enabled. Consider setting `torch.set_float32_matmul_precision('high')` for better performance. warnings.warn( [2023-11-06 14:53:31,591] torch._dynamo.guards.__recompiles: [DEBUG] Recompiling function forward in /data/users/williamwen/torchvision/torchvision/models/resnet.py:284 [2023-11-06 14:53:31,591] torch._dynamo.guards.__recompiles: [DEBUG] triggered by the following guard failure(s): [2023-11-06 14:53:31,591] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 16, actual 17 [2023-11-06 14:53:41,333] torch._dynamo.guards.__recompiles: [DEBUG] Recompiling function forward in /data/users/williamwen/torchvision/torchvision/models/resnet.py:284 [2023-11-06 14:53:41,333] torch._dynamo.guards.__recompiles: [DEBUG] triggered by the following guard failure(s): [2023-11-06 14:53:41,333] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. 
expected 17, actual 18 [2023-11-06 14:53:41,333] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 16, actual 18 [2023-11-06 14:53:50,463] torch._dynamo.guards.__recompiles: [DEBUG] Recompiling function forward in /data/users/williamwen/torchvision/torchvision/models/resnet.py:284 [2023-11-06 14:53:50,463] torch._dynamo.guards.__recompiles: [DEBUG] triggered by the following guard failure(s): [2023-11-06 14:53:50,463] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 18, actual 19 [2023-11-06 14:53:50,463] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 17, actual 19 [2023-11-06 14:53:50,463] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 16, actual 19 [2023-11-06 14:53:59,848] torch._dynamo.guards.__recompiles: [DEBUG] Recompiling function forward in /data/users/williamwen/torchvision/torchvision/models/resnet.py:284 [2023-11-06 14:53:59,848] torch._dynamo.guards.__recompiles: [DEBUG] triggered by the following guard failure(s): [2023-11-06 14:53:59,848] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 19, actual 20 [2023-11-06 14:53:59,848] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 18, actual 20 [2023-11-06 14:53:59,848] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 17, actual 20 [2023-11-06 14:53:59,848] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. 
expected 16, actual 20 [2023-11-06 14:54:08,549] torch._dynamo.guards.__recompiles: [DEBUG] Recompiling function forward in /data/users/williamwen/torchvision/torchvision/models/resnet.py:284 [2023-11-06 14:54:08,549] torch._dynamo.guards.__recompiles: [DEBUG] triggered by the following guard failure(s): [2023-11-06 14:54:08,549] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 20, actual 21 [2023-11-06 14:54:08,549] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 19, actual 21 [2023-11-06 14:54:08,549] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 18, actual 21 [2023-11-06 14:54:08,549] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 17, actual 21 [2023-11-06 14:54:08,549] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 16, actual 21 [2023-11-06 14:54:17,795] torch._dynamo.guards.__recompiles: [DEBUG] Recompiling function forward in /data/users/williamwen/torchvision/torchvision/models/resnet.py:284 [2023-11-06 14:54:17,795] torch._dynamo.guards.__recompiles: [DEBUG] triggered by the following guard failure(s): [2023-11-06 14:54:17,795] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 21, actual 22 [2023-11-06 14:54:17,795] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 20, actual 22 [2023-11-06 14:54:17,795] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 19, actual 22 [2023-11-06 14:54:17,795] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 18, actual 22 [2023-11-06 14:54:17,795] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. 
expected 17, actual 22 [2023-11-06 14:54:17,795] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 16, actual 22 [2023-11-06 14:54:27,430] torch._dynamo.guards.__recompiles: [DEBUG] Recompiling function forward in /data/users/williamwen/torchvision/torchvision/models/resnet.py:284 [2023-11-06 14:54:27,430] torch._dynamo.guards.__recompiles: [DEBUG] triggered by the following guard failure(s): [2023-11-06 14:54:27,430] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 22, actual 23 [2023-11-06 14:54:27,430] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 21, actual 23 [2023-11-06 14:54:27,430] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 20, actual 23 [2023-11-06 14:54:27,430] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 19, actual 23 [2023-11-06 14:54:27,430] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 18, actual 23 [2023-11-06 14:54:27,430] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 17, actual 23 [2023-11-06 14:54:27,430] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 16, actual 23 [2023-11-06 14:54:36,744] torch._dynamo.guards.__recompiles: [DEBUG] Recompiling function forward in /data/users/williamwen/torchvision/torchvision/models/resnet.py:284 [2023-11-06 14:54:36,744] torch._dynamo.guards.__recompiles: [DEBUG] triggered by the following guard failure(s): [2023-11-06 14:54:36,744] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 23, actual 24 [2023-11-06 14:54:36,744] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. 
expected 22, actual 24 [2023-11-06 14:54:36,744] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 21, actual 24 [2023-11-06 14:54:36,744] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 20, actual 24 [2023-11-06 14:54:36,744] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 19, actual 24 [2023-11-06 14:54:36,744] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 18, actual 24 [2023-11-06 14:54:36,744] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 17, actual 24 [2023-11-06 14:54:36,744] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 16, actual 24 [2023-11-06 14:54:36,744] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (8) [2023-11-06 14:54:36,744] torch._dynamo.convert_frame: [WARNING] function: 'forward' (/data/users/williamwen/torchvision/torchvision/models/resnet.py:284) [2023-11-06 14:54:36,744] torch._dynamo.convert_frame: [WARNING] last reason: tensor 'L['x']' size mismatch at index 0. expected 16, actual 24 [2023-11-06 14:54:36,744] torch._dynamo.convert_frame: [WARNING] To log all recompilation reasons, use TORCH_LOGS="recompiles". [2023-11-06 14:54:36,744] torch._dynamo.convert_frame: [WARNING] To diagnose recompilation issues, see https://pytorch.org/docs/master/compile/troubleshooting.html. [2023-11-06 14:54:45,922] torch._dynamo.guards.__recompiles: [DEBUG] Recompiling function _forward_impl in /data/users/williamwen/torchvision/torchvision/models/resnet.py:266 [2023-11-06 14:54:45,922] torch._dynamo.guards.__recompiles: [DEBUG] triggered by the following guard failure(s): [2023-11-06 14:54:45,922] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. 
expected 24, actual 25 [2023-11-06 14:54:54,691] torch._dynamo.guards.__recompiles: [DEBUG] Recompiling function _forward_impl in /data/users/williamwen/torchvision/torchvision/models/resnet.py:266 [2023-11-06 14:54:54,691] torch._dynamo.guards.__recompiles: [DEBUG] triggered by the following guard failure(s): [2023-11-06 14:54:54,691] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 25, actual 26 [2023-11-06 14:54:54,691] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 24, actual 26 [2023-11-06 14:55:03,591] torch._dynamo.guards.__recompiles: [DEBUG] Recompiling function _forward_impl in /data/users/williamwen/torchvision/torchvision/models/resnet.py:266 [2023-11-06 14:55:03,591] torch._dynamo.guards.__recompiles: [DEBUG] triggered by the following guard failure(s): [2023-11-06 14:55:03,591] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 26, actual 27 [2023-11-06 14:55:03,591] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 25, actual 27 [2023-11-06 14:55:03,591] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 24, actual 27 [2023-11-06 14:55:12,384] torch._dynamo.guards.__recompiles: [DEBUG] Recompiling function _forward_impl in /data/users/williamwen/torchvision/torchvision/models/resnet.py:266 [2023-11-06 14:55:12,384] torch._dynamo.guards.__recompiles: [DEBUG] triggered by the following guard failure(s): [2023-11-06 14:55:12,384] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 27, actual 28 [2023-11-06 14:55:12,384] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 26, actual 28 [2023-11-06 14:55:12,384] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. 
expected 25, actual 28 [2023-11-06 14:55:12,384] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 24, actual 28 [2023-11-06 14:55:21,442] torch._dynamo.guards.__recompiles: [DEBUG] Recompiling function _forward_impl in /data/users/williamwen/torchvision/torchvision/models/resnet.py:266 [2023-11-06 14:55:21,442] torch._dynamo.guards.__recompiles: [DEBUG] triggered by the following guard failure(s): [2023-11-06 14:55:21,442] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 28, actual 29 [2023-11-06 14:55:21,442] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 27, actual 29 [2023-11-06 14:55:21,442] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 26, actual 29 [2023-11-06 14:55:21,442] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 25, actual 29 [2023-11-06 14:55:21,442] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 24, actual 29 [2023-11-06 14:55:30,315] torch._dynamo.guards.__recompiles: [DEBUG] Recompiling function _forward_impl in /data/users/williamwen/torchvision/torchvision/models/resnet.py:266 [2023-11-06 14:55:30,315] torch._dynamo.guards.__recompiles: [DEBUG] triggered by the following guard failure(s): [2023-11-06 14:55:30,315] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 29, actual 30 [2023-11-06 14:55:30,315] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 28, actual 30 [2023-11-06 14:55:30,315] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 27, actual 30 [2023-11-06 14:55:30,315] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. 
expected 26, actual 30 [2023-11-06 14:55:30,315] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 25, actual 30 [2023-11-06 14:55:30,315] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 24, actual 30 [2023-11-06 14:55:39,839] torch._dynamo.guards.__recompiles: [DEBUG] Recompiling function _forward_impl in /data/users/williamwen/torchvision/torchvision/models/resnet.py:266 [2023-11-06 14:55:39,839] torch._dynamo.guards.__recompiles: [DEBUG] triggered by the following guard failure(s): [2023-11-06 14:55:39,839] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 30, actual 31 [2023-11-06 14:55:39,839] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 29, actual 31 [2023-11-06 14:55:39,839] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 28, actual 31 [2023-11-06 14:55:39,839] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 27, actual 31 [2023-11-06 14:55:39,839] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 26, actual 31 [2023-11-06 14:55:39,839] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 25, actual 31 [2023-11-06 14:55:39,839] torch._dynamo.guards.__recompiles: [DEBUG] - tensor 'L['x']' size mismatch at index 0. expected 24, actual 31 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/110325 Approved by: https://github.com/ezyang, https://github.com/jon-chuang	2023-11-07 20:10:59 +00:00
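The log above follows a regular pattern: each new batch size fails the size guard of every cached entry (newest first), a new entry is compiled, and once eight entries exist the `cache_size_limit` warning fires. A pure-Python sketch of that bookkeeping follows. `ToyGuardCache` and its method names are hypothetical and only model the pattern visible in the log; this is not Dynamo's implementation.

```python
class ToyGuardCache:
    """Toy model of a per-function compile cache keyed by batch size."""

    def __init__(self, cache_size_limit=8):
        self.cache_size_limit = cache_size_limit
        self.entries = []  # cached batch sizes, newest first

    def call(self, batch_size):
        """Return (status, guard_failure_reasons) for one call."""
        if batch_size in self.entries:
            return "hit", []
        # Every cached entry's size guard fails; report newest first.
        reasons = [
            f"size mismatch at index 0. expected {cached}, actual {batch_size}"
            for cached in self.entries
        ]
        if len(self.entries) >= self.cache_size_limit:
            # Cache is full: report the failures but do not add an entry.
            return "limit_hit", reasons
        self.entries.insert(0, batch_size)
        return "compiled", reasons


cache = ToyGuardCache(cache_size_limit=8)
results = [cache.call(size) for size in range(16, 25)]
for size, (status, reasons) in zip(range(16, 25), results):
    print(size, status, len(reasons))
```

In the real log, once `forward` hits the limit, the inner frame `_forward_impl` starts its own sequence of recompiles beginning at batch size 24, which the toy model would represent as a second, empty cache.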
Jason Ansel	5fe96eaaf4	[dynamo] Remove VariableTracker.propagate (#111726 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/111726 Approved by: https://github.com/voznesenskym ghstack dependencies: #111306, #111415, #111725	2023-11-07 19:55:19 +00:00
Richard Zou	4f5acf8329	Log non-pt2_compliant ops encountered by Dynamo (#112581 ) Summary: See internal diff for more changes. Whenever we encounter a non-compliant op, we add it to a set on the OutputGraph. When a compilation event happens, we log the contents of this set. I'm planning on flipping the `only_allow_pt2_compliant_ops` config from False to True after the logging determines that existing models do not use non-compliant ops. Test Plan: - Tested the logging internally locally Differential Revision: D50884828 Pull Request resolved: https://github.com/pytorch/pytorch/pull/112581 Approved by: https://github.com/yanboliang	2023-11-01 22:53:16 +00:00
rzou	1483097679	Update how Dynamo decides to graph break on an OpOverloadPacket (#112200 ) Previously, under config.only_allow_pt2_compliant_ops, Dynamo graph breaks when it sees an OpOverloadPacket where any overloads are not PT2 compliant. This is potentially brittle: if someone (unlikely) adds a new overload for a custom operator, then this would cause a previously non-graph-breaking call to the OpOverloadPacket to graph break. In this PR: - When Dynamo is about to write a call to an operator to the FX graph, we check if it is PT2 compliant. - For OpOverload, we check to see if the tag is on it. - For OpOverloadPacket, we do overload resolution and check to see if the tag is on the OpOverload that it resolves to. Test Plan: - new tests, existing tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/112200 Approved by: https://github.com/bdhirsh	2023-10-31 19:10:37 +00:00
Jason Ansel	4b8a5e1854	[dynamo] Remove VariableTracker.as_specialized (#112363 ) My local testing can't seem to find this function actually doing anything. Pull Request resolved: https://github.com/pytorch/pytorch/pull/112363 Approved by: https://github.com/yanboliang	2023-10-30 20:07:55 +00:00
Yanbo Liang	061bf1a153	[5/N] Make torch context manager a TorchCtxManagerClassVariable (#111622 ) The major change in this PR is to make the torch context manager class a separate `TorchCtxManagerClassVariable`, since we have Dynamo implementations for these context managers. I considered wrapping them as `UserDefinedClassVariable` and dispatching at `USCVariable.call_function`, but that is almost the same amount of work and this way is clearer. This is a step towards moving `TorchVariable` to `TorchFunctionVariable`, which will only handle functions that are allowed in the graph (e.g., `torch.sin`) or constant folded (e.g., `torch.is_floating_point`). All other torch functions would go through the skip/inline rules and be wrapped as `UserFunctionVariable` (for inlined) or `SkipFilesVariable` (for skipped). The next steps: * Wrap torch modules, classes, objects as regular `PythonModuleVariable`, `UserDefinedClassVariable` and `UserDefinedObjectVariable`. * Generate the list of allow-in-graph torch functions and wrap them as `TorchFunctionVariable`. * Finally, merge `skipfiles.check` and `is_allowed` into one function `allow_skip.check(fn)`, which would return an Enum of allow, skip and inline. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111622 Approved by: https://github.com/jansel	2023-10-27 21:26:54 +00:00
lezcano	1dcbd1c088	[dynamo] [easy] Move Set to dicts.py (#110522 ) A set is more of a dict than a list if you ask me. This comes before the refactor where we implement sets and dicts via the same logic. Pull Request resolved: https://github.com/pytorch/pytorch/pull/110522 Approved by: https://github.com/jansel	2023-10-27 20:17:10 +00:00
Michael Lazos	a6e556f8b0	Support calling __torch_function__ attribute access (#111737 ) Triggers `__torch_function__` tracing on attribute/method/property access, matching the eager behavior for non-overridden attributes/methods/properties that are present on `torch.Tensor`. Some caveats: 1. For methods, there doesn't seem to be a way to check whether the original implementation of a method has been overridden via monkey patching. For example:
```
import torch

class LocalSubclass(torch.Tensor):
    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        if kwargs is None:
            kwargs = {}
        return super().__torch_function__(func, types, args, kwargs)

x = torch.ones(2, 2).as_subclass(LocalSubclass)
> x.sigmoid
<built-in method sigmoid of LocalSubclass object at 0x7f8d305bb5e0>
```
There isn't a way to verify that this built-in method is equivalent to the base `torch.Tensor` implementation, as each instance will have a different built-in method object that can't be traced back to the original `torch.Tensor` impl. You can check that the class itself has the original implementation via
```
> inspect.getattr_static(LocalSubclass, "sigmoid")
<method 'sigmoid' of 'torch._C.TensorBase' objects>
```
But we can't detect if the user dynamically patches an object with a built-in method called sigmoid which does something completely different. 2. If a user overrides a method but calls the original implementation, we will still graph break. This will require modifying `SuperVariable` (and any other way to get the original impl) to handle tensor subclasses. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111737 Approved by: https://github.com/jansel, https://github.com/ezyang	2023-10-27 04:57:19 +00:00
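The first caveat in the commit above can be illustrated without PyTorch. The sketch below is an analogy using plain Python classes (`Base` and `Sub` are hypothetical stand-ins, not `torch.Tensor` subclasses): a class-level `inspect.getattr_static` lookup recovers the original function, but it cannot see that one particular instance has been patched.

```python
import inspect


class Base:
    def sigmoid(self):
        return "base sigmoid"


class Sub(Base):
    pass


x, y = Sub(), Sub()

# Bound methods are created per instance, so the ones obtained from two
# different instances do not compare equal.
different_bound_methods = x.sigmoid != y.sigmoid

# The static class-level lookup skips the descriptor protocol and returns
# the original function object defined on Base.
original = Base.__dict__["sigmoid"]
class_level_is_original = inspect.getattr_static(Sub, "sigmoid") is original

# Patching one instance changes what x.sigmoid does...
x.sigmoid = lambda: "something else"
patched_result = x.sigmoid()

# ...but the class-level check still reports the original implementation.
still_original = inspect.getattr_static(Sub, "sigmoid") is original
```

The design consequence is the one the commit message states: a check at the class level is the best available signal, and per-instance patching goes undetected.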
Jon Chuang	0ed461ae4c	[dynamo] Ensure Dynamo uses this graph's fakes for `Tensor` `example_value`s (#111954 ) Fixes https://github.com/pytorch/pytorch/issues/111869, Fixes (detailed list of cases handled): https://github.com/pytorch/pytorch/pull/111913#discussion_r1370267313, fully fixes: https://github.com/pytorch/pytorch/issues/111873 Adds sanity checks ensuring that Dynamo uses this graph's fakes for Tensor `example_values` Handles the main (and only?) entrypoints for new `FakeTensor`s in a Dynamo graph: - `wrap_fx_proxy_cls` - `VariableTracker.wrap_tensor` Ensures that `get_fake_value` returns a fake except when we know we are going to properly wrap non-fakes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111954 Approved by: https://github.com/ezyang	2023-10-25 23:54:18 +00:00
Jon Chuang	6d78f34a06	fix regression which creates a new fake tensor (#111864 ) Fixes regression identified here: `ccd6b373b5 (r1369334484)` Now that `get_fake_value` will identify aliases, we should not try to wrap the fake value again. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111864 Approved by: https://github.com/eellison	2023-10-24 05:11:48 +00:00
Jon Chuang	47eed65481	[dynamo] Add `is_` support for `Tensor`s, force `get_fake_value` to reuse previously computed `example_value` if available (#111565 ) Use FakeTensor id match as equivalent to object identity match cc Pull Request resolved: https://github.com/pytorch/pytorch/pull/111565 Approved by: https://github.com/ezyang	2023-10-21 13:56:30 +00:00
Michael Lazos	62df159c3f	move tf override tensor to torch_function.py (#111714 ) Moves TensorWithTFOverride to torch_function.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/111714 Approved by: https://github.com/eellison, https://github.com/voznesenskym	2023-10-21 02:29:01 +00:00
Edward Z. Yang	14c2f296e0	Don't suppress original error message for data-dependent value (#111596 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/111596 Approved by: https://github.com/suo	2023-10-20 19:38:50 +00:00
Michael Lazos	a55ecec195	[dynamo][`__torch_function__` 2/n] Refactor TensorWithTFOverrideVariable (#109556 ) This is purely a refactor that preserves the existing behavior and tests. The main contributions of the PR are to refactor the dispatch of `__torch_function__` to enable calling it with TF override objects in any argument position and matching the eager dispatch behavior. This will allow for the following in upcoming PRs: 1) have TensorWithTFOverrideVariable inherit from TensorVariable 2) enable tracing through the base `__torch_function__` implementation. Note: this depends on https://github.com/pytorch/pytorch/pull/109542 towards tracing for https://github.com/pytorch/pytorch/issues/93723 Pull Request resolved: https://github.com/pytorch/pytorch/pull/109556 Approved by: https://github.com/jansel, https://github.com/ezyang	2023-10-20 18:53:38 +00:00
Aaron Gokaslan	cb856b08b2	[BE]: Attach cause to some exceptions and enable RUFF TRY200 (#111496 ) Did some easy fixes from enabling TRY200. Most of these seem like oversights instead of intentional. The proper way to silence intentional errors is with `from None` to note that you thought about whether it should contain the cause and decided against it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111496 Approved by: https://github.com/malfet	2023-10-19 21:56:36 +00:00
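The two patterns the commit above refers to are standard Python exception chaining. A minimal sketch (the functions `parse_port` and `lookup` are hypothetical examples, not code from the PR):

```python
def parse_port(text):
    """Convert text to a port number, keeping the original error as the cause."""
    try:
        return int(text)
    except ValueError as err:
        # `from err` records the original exception as __cause__.
        raise RuntimeError(f"invalid port: {text!r}") from err


def lookup(table, key):
    """Look up key, deliberately hiding the internal KeyError."""
    try:
        return table[key]
    except KeyError:
        # `from None` marks the implicit context as suppressed, so the
        # KeyError is not shown in the traceback.
        raise LookupError(f"unknown key: {key!r}") from None


try:
    parse_port("http")
except RuntimeError as exc:
    chained = exc

try:
    lookup({}, "missing")
except LookupError as exc:
    silenced = exc
```

After running this, `chained.__cause__` is the original `ValueError`, while `silenced.__cause__` is `None` and `silenced.__suppress_context__` is `True`.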
ydwu4	9f562a3de3	[dynamo] make disable_cache_limit also disable accumulated cache limit (#111334 ) Fixes #111329. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111334 Approved by: https://github.com/yanboliang	2023-10-16 17:59:04 +00:00
PyTorch MergeBot	33403336fa	Revert "[user errors] compulsory case names, allow multiple (#110878 )" This reverts commit `2ae71c4598`. Reverted https://github.com/pytorch/pytorch/pull/110878 on behalf of https://github.com/kit1980 due to export/test_export.py::TestExport::test_multiple_definitions_same_name_dim - TypeError: UserError.__init__() missing 1 required positional argument: 'case_names' ([comment](https://github.com/pytorch/pytorch/pull/110878#issuecomment-1754360051))	2023-10-10 04:44:40 +00:00
Avik Chaudhuri	2ae71c4598	[user errors] compulsory case names, allow multiple (#110878 ) We want to get to a point where most UserErrors link to exportdb examples. This PR makes passing case names non-optional to make this intent clearer and encourage developers who raise UserErrors to make or point to examples that make fixing such errors more obvious for users. In addition, sometimes there are multiple examples that are relevant to an error. Thus this PR also enables passing multiple case names. Retry of #110733 which was reverted due to a landrace. Differential Revision: [D50087148](https://our.internmc.facebook.com/intern/diff/D50087148/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/110878 Approved by: https://github.com/gmagogsfm, https://github.com/tugsbayasgalan	2023-10-10 03:48:07 +00:00
Huy Do	18f0d3af72	Revert "[user errors] compulsory case names, allow multiple (#110733 )" (#110783 ) This reverts commit `983f6f36db`. I have no idea how to revert https://github.com/pytorch/pytorch/pull/110733 with the bot. So reverting it manually for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/110783 Approved by: https://github.com/ZainRizvi, https://github.com/kit1980	2023-10-07 07:32:39 +00:00
Avik Chaudhuri	983f6f36db	[user errors] compulsory case names, allow multiple (#110733 ) We want to get to a point where most `UserError`s link to `exportdb` examples. This PR makes passing case names non-optional to make this intent clearer and encourage developers who raise `UserError`s to make or point to examples that make fixing such errors more obvious for users. In addition, sometimes there are multiple examples that are relevant to an error. Thus this PR also enables passing multiple case names. Differential Revision: [D50020465](https://our.internmc.facebook.com/intern/diff/D50020465/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/110733 Approved by: https://github.com/zhxchen17	2023-10-07 01:25:12 +00:00
chilli	f767a6c57a	Made pattern-matcher diagnostics lazily reported + added TORCH_COMPILE_CPROFILE (#110504 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/110504 Approved by: https://github.com/mlazos, https://github.com/eellison ghstack dependencies: #110501	2023-10-05 15:47:30 +00:00
PyTorch MergeBot	1e4c0641ce	Revert "Made pattern-matcher diagnostics lazily reported + added TORCH_COMPILE_CPROFILE (#110504 )" This reverts commit `9648df1a6a`. Reverted https://github.com/pytorch/pytorch/pull/110504 on behalf of https://github.com/PaliC due to temporarily will revert as it's causing problems with difftrain import ([comment](https://github.com/pytorch/pytorch/pull/110504#issuecomment-1749132253))	2023-10-05 15:28:23 +00:00
Avik Chaudhuri	416eca9736	export db links for user errors (#110555 ) Ideally all `_dynamo.exc.UserError`s should have "case names", i.e., link to examples in `exportdb`. This PR adds case names to several instances of `_dynamo.exc.UserError`. In particular, looking at coverage based on `UserErrorType`: * `DYNAMIC_CONTROL_FLOW`, `ANTI_PATTERN`, and `STANDARD_LIBRARY` are fully covered. * `CONSTRAINT_VIOLATION` and `DYNAMIC_DIM` have no coverage. We don't seem to have any dedicated examples of specifying dynamic shapes in `exportdb` (although they are used in some other examples without explanation, to avoid some specialization that would make such examples moot). * `INVALID_INPUT` is only partly covered. Frankly this is tedious to cover via examples. Differential Revision: [D49928518](https://our.internmc.facebook.com/intern/diff/D49928518/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/110555 Approved by: https://github.com/angelayi, https://github.com/ydwu4	2023-10-05 05:03:04 +00:00
chilli	9648df1a6a	Made pattern-matcher diagnostics lazily reported + added TORCH_COMPILE_CPROFILE (#110504 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/110504 Approved by: https://github.com/mlazos, https://github.com/eellison ghstack dependencies: #110501	2023-10-05 01:34:57 +00:00
Xu Zhao	2e31fae5c5	Cleanup the code in the `dynamo` userbenchmark (#110519 ) Summary: Skip importing the modules that are only available in the pytorch source code, not pytorch nightly release. Make dynamo benchmark work on both OSS and internal. X-link: https://github.com/pytorch/benchmark/pull/1960 Test Plan:
```
$ python run_benchmark.py dynamo --only alexnet --training --performance --inductor
loading model: 0it [00:05, ?it/s]
cuda train alexnet
running benchmark: 100%|█████████████████| 30/30 [00:00<00:00, 41.46it/s]
1.129x
```
```
$ buck2 run mode/opt //pytorch/benchmark:run_benchmark -- dynamo --only alexnet --training --inductor --performance --output-directory $HOME
loading model: 0it [00:16, ?it/s]
running benchmark: 100%|█████████████████| 30/30 [00:00<00:00, 37.94it/s]
cuda train alexnet
1.120x
```
Differential Revision: D49912006 Pulled By: xuzhao9 Pull Request resolved: https://github.com/pytorch/pytorch/pull/110519 Approved by: https://github.com/desertfire, https://github.com/jansel	2023-10-04 23:26:30 +00:00
Kazuaki Ishizaki	2c1b009e39	Fix typo under torch/_dynamo directory (#110459 ) This PR fixes typo of comments in files under `torch/_dynamo` directory Pull Request resolved: https://github.com/pytorch/pytorch/pull/110459 Approved by: https://github.com/colesbury	2023-10-04 16:05:05 +00:00
chilli	005e8ddcb9	cache the hash construction on Guard (#110464 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/110464 Approved by: https://github.com/zou3519, https://github.com/voznesenskym	2023-10-04 04:49:18 +00:00
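The commit above carries only a title, so the following is a general illustration of caching a hash on an object that is hashed repeatedly, not a description of the actual change. `ToyGuard` and its fields are hypothetical.

```python
class ToyGuard:
    """Immutable-by-convention record whose hash is computed once and reused."""

    hash_computations = 0  # counts how often the hash is actually recomputed

    def __init__(self, name, source, create_fn_name):
        self.name = name
        self.source = source
        self.create_fn_name = create_fn_name
        self._hash = None  # filled in lazily on first __hash__ call

    def _key(self):
        return (self.name, self.source, self.create_fn_name)

    def __eq__(self, other):
        if not isinstance(other, ToyGuard):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        if self._hash is None:
            ToyGuard.hash_computations += 1
            self._hash = hash(self._key())
        return self._hash


guard = ToyGuard("L['x']", "local", "TENSOR_MATCH")
guards = set()
for _ in range(1000):
    guards.add(guard)  # each add() calls __hash__, but the tuple is hashed once
```

The cache is only safe if the fields that feed the hash never change after the first call, which is why the sketch treats the object as immutable by convention.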
eellison	98c8550158	Fix Triplet Margin Loss Opinfo (#110302 ) Triplet Margin Loss takes in a Callable `distance_function` parameter which is not supported as an argument on the fx graph. See previous error: > File "/scratch/eellison/work/pytorch/torch/_dynamo/symbolic_convert.py", line 562, in call_function self.push(fn.call_function(self, args, kwargs)) File "/scratch/eellison/work/pytorch/torch/_dynamo/variables/torch.py", line 723, in call_function proxy_args_kwargs(args, kwargs), File "/scratch/eellison/work/pytorch/torch/_dynamo/utils.py", line 504, in proxy_args_kwargs f"call_function args: {typestr(args)} {typestr(*list(kwargs.values()))}" File "/scratch/eellison/work/pytorch/torch/_dynamo/exc.py", line 143, in unimplemented raise Unsupported(msg) torch._dynamo.exc.Unsupported: call_function args: TensorVariable() TensorVariable() TensorVariable() ConstantVariable(float) NNModuleVariable() This is fixable by just inlining into `triplet_margin_loss` and continuing to compile it. This required support for `has_torch_function_variadic`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/110302 Approved by: https://github.com/mlazos	2023-10-03 20:26:13 +00:00
Animesh Jain	8ed08e5a7c	[dynamo] Graph break on rng get/set state - remove GeneratorStateSource (#109410 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/109410 Approved by: https://github.com/ezyang ghstack dependencies: #109411	2023-09-22 22:31:55 +00:00
Oguz Ulgen	1df14f1bf8	Move has_triton to top level triton utils so that dynamo can also access (#109832 ) it without creating cyclic dependencies Pull Request resolved: https://github.com/pytorch/pytorch/pull/109832 Approved by: https://github.com/zou3519	2023-09-22 19:33:41 +00:00
lezcano	8597d37536	Implement numpy(force=True) (#109636 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/109636 Approved by: https://github.com/ezyang ghstack dependencies: #109634	2023-09-20 20:06:13 +00:00
Edward Z. Yang	103260a43b	Re-define check for `typing` classes. (#109201 ) This PR fixes the `is_typing` function, which checks whether a value is an instance of a class from the `typing` package. This reverts commit b09c09f7bb3adb6a5b8a107a5b96757b569daa8d. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109201 Approved by: https://github.com/ezyang	2023-09-20 00:04:56 +00:00
leslie-fang-intel	55c19a3c6d	Inductor: Increase multiplier to 3 for Inductor AMP benchmark correctness check (#109097 ) Summary: As reported in https://github.com/pytorch/pytorch/issues/108333, we find some of the models have failed the benchmark's correctness check. However, the end-to-end model's accuracy ([test script](https://gist.github.com/leslie-fang-intel/aac8b3c2b450532fd0517c758bb845e0)) when comparing AMP with FP32 is within a difference of less than 0.1%. Thus, it's possible that the correctness check failures for these models are false alarms. We use a multiplier of 3 instead of 2 in this PR to avoid these false alarms. Model end-to-end accuracy test results are:

| SPR | FP32 Imperative TOP1 Accuracy | FP32 Imperative TOP5 Accuracy | BF16 AMP Inductor TOP1 Accuracy | BF16 AMP Inductor TOP5 Accuracy | BF16/FP32 Relative Loss TOP1 Accuracy | BF16/FP32 Relative Loss TOP5 Accuracy |
| -- | -- | -- | -- | -- | -- | -- |
| gluon_inception_v3 | 73.262 | 90.774 | 73.256 | 90.802 | -0.01% | 0.03% |
| mobilenetv2_100 | 72.89 | 90.996 | 72.826 | 90.946 | -0.09% | -0.05% |
| mobilenetv3_large_100 | 75.72 | 92.55 | 75.764 | 92.554 | 0.06% | 0.00% |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109097 Approved by: https://github.com/jgong5, https://github.com/jansel	2023-09-16 10:02:56 +00:00
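The commit message above does not spell out the correctness check itself. One plausible form, assumed here purely for illustration, accepts the compiled result when its error against an fp64 reference is at most `multiplier` times the eager result's error plus a small slack. The function names, the `tol` parameter and the sample numbers below are all hypothetical.

```python
import math


def rmse(reference, values):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(
        sum((r - v) ** 2 for r, v in zip(reference, values)) / len(reference)
    )


def passes_check(fp64_ref, eager, compiled, multiplier, tol=1e-4):
    """Accept the compiled result if its error is at most `multiplier` times
    the eager error, plus a small absolute slack (assumed form of the check)."""
    compiled_error = rmse(fp64_ref, compiled)
    eager_error = rmse(fp64_ref, eager)
    return compiled_error <= multiplier * eager_error + tol / 10.0


# Made-up values: eager is off by about 0.01, compiled by about 0.025.
fp64_ref = [1.0, 2.0, 3.0, 4.0]
eager = [1.01, 2.01, 3.01, 4.01]
compiled = [1.025, 2.025, 3.025, 4.025]

with_multiplier_2 = passes_check(fp64_ref, eager, compiled, multiplier=2.0)
with_multiplier_3 = passes_check(fp64_ref, eager, compiled, multiplier=3.0)
```

With these made-up numbers the compiled error is 2.5 times the eager error, so the check rejects the result at a multiplier of 2 and accepts it at 3, which is the kind of borderline case the commit describes as a false alarm.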
Yukio Siraichi	dfdc0b63c9	Bisect FX node asserts on `ValidationException`. (#107493 ) This PR introduces binary search for finding smaller validation errors, when they occur. We do that by bisecting the sequence of `torch._assert` FX nodes recorded as the source expression of the translation validator (TV) by `ShapeEnv.evaluate_expr` calls. Then, we raise the error caused by the earliest node. In summary, the changes are: - Call `bisect` on `ValidationError` @ _torch/_dynamo/convert_frame.py_ - Implement the binary search @ _torch/fx/experimental/symbolic_shapes.py_ Edit: moved `ShapeEnv` replay-recording to #107989 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107493 Approved by: https://github.com/ezyang ghstack dependencies: #107989	2023-09-15 15:18:12 +00:00
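The bisection idea in the commit above can be sketched in plain Python. This is a generic binary search under two stated assumptions, not the code from `symbolic_shapes.py`: validating an empty prefix of asserts passes, and failure is monotonic (if the first n asserts fail validation, so do the first n + 1). The names `bisect_earliest_failure` and `prefix_fails` are hypothetical.

```python
def bisect_earliest_failure(num_asserts, prefix_fails):
    """Return the 0-based index of the earliest assert whose inclusion makes
    validation fail, or None if validating all asserts succeeds.

    prefix_fails(n) must report whether validating only the first n asserts
    fails. Assumes prefix_fails(0) is False and that failure is monotonic.
    """
    if not prefix_fails(num_asserts):
        return None
    lo, hi = 0, num_asserts  # invariant: prefix lo passes, prefix hi fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if prefix_fails(mid):
            hi = mid
        else:
            lo = mid
    return hi - 1


# Toy scenario: 8 recorded asserts, the one at index 5 is the first bad one.
calls = []


def toy_prefix_fails(n):
    calls.append(n)
    return n > 5


earliest = bisect_earliest_failure(8, toy_prefix_fails)
```

Here the search probes prefixes of length 8, 4, 6 and 5 and reports index 5, so only four validations are needed instead of eight.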
Emil Laftchiev	f2639a2c37	Back out "Dynamo support for autograd.Function w/ once_differentiable (#108686 )" (#109199 ) Summary: Original commit changeset: e11cddf1fecc Original Phabricator Diff: D49064185 Test Plan: Comparing PT1 and PT2 performance on the IG Feed Model with this diff backed out: N4274204 Comparing the PT1 and PT2 performance on IG Feed with this diff committed: N4271093 Reviewed By: zou3519 Differential Revision: D49230047 Pull Request resolved: https://github.com/pytorch/pytorch/pull/109199 Approved by: https://github.com/zou3519, https://github.com/xw285cornell	2023-09-13 15:43:20 +00:00
Nakul Camsamudram	3b265e021f	Support Optional typehint without graph breaking (#108970 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/108970 Approved by: https://github.com/anijain2305	2023-09-11 16:42:44 +00:00
Richard Zou	ef2bbe1ae1	Dynamo support for autograd.Function w/ once_differentiable (#108686 ) Fixes #106893 There are two main changes: - Before this PR, the function returned by once_differentiable was included in skipfiles (because its .co_code is torch/autograd/function.py). This PR adds a mechanism to tell Dynamo to inline a function, no matter if it is included in skipfiles. - A bugfix: when we are introspecting the backward, we need to turn the grad mode off. This is to accurately model the eager-mode semantics: In eager-mode PyTorch, if second-order gradients were not requested, then the grad mode is off. torch.compile does not work with higher-order gradients and just assumes we do first-order gradients, so this is OK. Test Plan: - new test Differential Revision: [D49064185](https://our.internmc.facebook.com/intern/diff/D49064185) Pull Request resolved: https://github.com/pytorch/pytorch/pull/108686 Approved by: https://github.com/voznesenskym	2023-09-08 16:10:32 +00:00
Evgeni Burovski	1f20531939	fall back to eager on `NotImplementedError` (#107863 ) Follow-up to https://github.com/pytorch/pytorch/pull/107710: Help dynamo fall back to eager when compiling unimplemented numpy constructs:
- arrays of strings
- (arg){min, max} for complex types
- various arguments typed as NotImplemented (`np.ones(4, order="F")` etc)
- numpy functions which torch._numpy does not implement

To test, run (we do not implement arrays of strings)
```
import torch
import numpy as np

@torch.compile(fullgraph=False)
def fn():
    return np.asarray(["L", "U"])
```
and observe it compiles with fullgraph=False and fails with fullgraph=True. Fixes https://github.com/pytorch/pytorch/issues/107970 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107863 Approved by: https://github.com/ezyang, https://github.com/lezcano	2023-09-07 21:22:20 +00:00
Elias Ellison	e18f512b81	Update accuracy checking for nan, floats (#108202 ) Fixes inference accuracy for `doctr_reco_predictor` and `pyhpc_turbulent_kinetic_energy`. For the `same(float, float)` comparison we weren't going through the more rigorous tensor comparison path which takes into account the fp64 base results. Also return True when the fp64 base results are not well formed (nan). I debugged these models and the sources of divergence were innocuous: `doctr_reco_predictor` - can be fixed by turning off layout optimization, decomp for batch norm `pyhpc_turbulent_kinetic_energy` - divergence caused because the fused kernel keeps precision in fp32 instead of casting back and forth from/to fp32/bf16. The fused kernel has better precision, anyway. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108202 Approved by: https://github.com/jansel	2023-09-01 02:54:01 +00:00
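To make the idea of comparing against an fp64 base result concrete, here is a rough pure-Python sketch. It is not the actual `same` implementation in PyTorch; the function name `same_float`, the `tol` and `multiplier` parameters, and the exact error formula are assumptions chosen for illustration.

```python
import math


def same_float(ref, res, fp64_ref, tol=1e-4, multiplier=2.0):
    """Hypothetical check: is `res` about as close to the fp64 baseline as `ref`?"""
    if math.isnan(fp64_ref):
        # The baseline itself is not well formed, so there is nothing to compare.
        return True
    if math.isnan(ref) or math.isnan(res):
        # A nan on either side of a well-formed baseline counts as a mismatch.
        return False
    ref_error = abs(ref - fp64_ref)
    res_error = abs(res - fp64_ref)
    # Accept the result if its error is within a multiple of the reference error.
    return res_error <= multiplier * ref_error + tol
```

The design choice sketched here is that the compiled result is judged relative to how far the eager result already is from the higher-precision baseline, rather than by a direct `ref == res` comparison.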
Brian Hirsh	5efd63b1b8	better support for fakeifying and dynamoing through torch_dispatch subclasses (with dynamic shapes) (#107415 ) There is already some support for plumbing `__torch_dispatch__` tensor subclasses through dynamo, but this PR beefs it up a bit and adds a test. In particular: (1) Fakeifying tensor subclasses didn't properly set autograd metadata (requires_grad, is_leaf) on the newly fakeified wrapper subclass. I don't actually have a test for this in this PR, but it's tested pretty heavily later in my aot autograd tests (2) Fakeifying tensor subclasses didn't properly track source information for dynamic shapes on the inner tensors. I added a new `WrapperSubclassFieldSource` subclass, that represents a source coming from a tensor field on a wrapper subclass, which I use in the fakeifying logic, and again in symbolic_shapes.py to generate proper guards. (3) `_make_wrapper_subclass()` marginally updated this code to work better with dynamic shapes. One thing that's a bit weird about `_make_wrapper_subclass`: it has two overloads, and the first explicitly does not support dynamic shapes (and the second.. does not support kwargs). I think that later we probably want to consolidate / at least make the first overload work with dynamic shapes, but I didn't want to handle that in this PR (so these smaller changes seemed like a strict improvement). Pull Request resolved: https://github.com/pytorch/pytorch/pull/107415 Approved by: https://github.com/ezyang	2023-08-29 02:36:48 +00:00
Animesh Jain	9d2ffc5dfa	[reland][Dynamo] cache_size policy #107496 (#108069 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/108069 Approved by: https://github.com/yanboliang	2023-08-28 22:06:54 +00:00
Tugsbayasgalan Manlaibaatar	485de73004	Improve unbacked symint error msg (#107806 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107806 Approved by: https://github.com/avikchaudhuri	2023-08-25 01:07:09 +00:00
lezcano	207b06d099	[dynamo] Wrap ndarray dunder methods (#107689 ) Fixes https://github.com/pytorch/pytorch/issues/107437 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107689 Approved by: https://github.com/ezyang ghstack dependencies: #107687, #107688, #107710, #107711, #107746	2023-08-23 13:55:36 +00:00
lezcano	612c8a8c84	Guard numpy imports in the dynamo folder (#107299 ) Fixes https://github.com/pytorch/pytorch/issues/107228 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107299 Approved by: https://github.com/atalman	2023-08-21 19:07:20 +00:00
Edward Z. Yang	36bb7a1f42	Add fast traceback utilities (#107358 ) This adds some utilities for conveniently working with fast combined CapturedTraceback from Python. The main goal of these utilities is to make it easier for people to use CapturedTraceback as a drop-in replacement for `traceback.extract_stack`, which is 20x slower than CapturedTraceback. I port symbolic shapes to use the new CapturedTraceback code, to validate that the APIs work and are useful. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/107358 Approved by: https://github.com/zdevito, https://github.com/albanD ghstack dependencies: #107438	2023-08-18 19:05:54 +00:00
Michael Lazos	e0d6072f69	Add API to mark input tensors static for cudagraphs (#107154 ) Adds API to mark tensor as a static input - To make this trigger recompiles properly, I'll need to update tensor match checks to also check for this new attribute Additional concern is memory - the tensors will be kept alive, but this is the current behavior for nn modules and parameters. Pull Request resolved: https://github.com/pytorch/pytorch/pull/107154 Approved by: https://github.com/eellison	2023-08-16 04:38:19 +00:00
Yanbo Liang	fbfb9a1648	[Dynamo] Improve PT2 fbcode logging observability (#106932 ) Summary: https://docs.google.com/document/d/1D5K3_ELsda3tIUeSyNL_2yee-M3jVWbirqSQ5BDNvHQ/edit This is the revamped version of D47908299. For each frame, we will record a list of compilation metrics: e.g, backend_compile time, entire_frame_compile time, cache_size, co_filename, co_firstlineno, co_name, guards, graph input_count, graph node_count, graph op_count. With the help of job info: mast_job_name, global_rank, we can satisfy the requirements from `Things I’ve used/wanted to use our logging to determine` in https://docs.google.com/document/d/1D5K3_ELsda3tIUeSyNL_2yee-M3jVWbirqSQ5BDNvHQ/edit (or add more metrics for this framework) Test Plan: ``` buck2 test //caffe2/test:test_dynamo ``` Differential Revision: D48142400 Pull Request resolved: https://github.com/pytorch/pytorch/pull/106932 Approved by: https://github.com/anijain2305	2023-08-11 20:46:04 +00:00
lezcano	a9dca53438	NumPy support in torch.compile (#106211 ) RFC: https://github.com/pytorch/rfcs/pull/54 First commit is the contents of https://github.com/Quansight-Labs/numpy_pytorch_interop/ We have already been using this in core for the last few months as an external dependency. This PR pulls all these into core. In the next commits, I do a number of things in this order - Fix a few small issues - Make the tests that this PR adds pass - Bend over backwards until lintrunner passes - Remove the optional dependency on `torch_np` and simply rely on the upstreamed code - Fix a number of dynamo tests that were passing before (they were not testing anything I think) and are not passing now. Missing from this PR (but not blocking): - Have a flag that deactivates tracing NumPy functions and simply breaks. There used to be one but after the merge it stopped working and I removed it. @lezcano to investigate. - https://github.com/pytorch/pytorch/pull/106431#issuecomment-1667079543. @voznesenskym to submit a fix after we merge. All the tests in `tests/torch_np` take about 75s to run. This was joint work by @ev-br, @rgommers, @honno and me. I did not create this PR via ghstack (which would have been convenient) as this is a collaboration, and ghstack doesn't allow for shared contributions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106211 Approved by: https://github.com/ezyang	2023-08-11 00:39:32 +00:00
Jason Lu	bc88028e8e	Back out "Reland "Make adding buffers more like adding parameters (#104069 )" (#106224 )" (#106743 ) Summary: Original commit changeset: 81319beb97f3 Original Phabricator Diff: D47961182 Test Plan: revert to maintain backward compat with legacy ads_dper3 production package. Read details in: S357822 Reviewed By: atuljangra Differential Revision: D48131623 @diff-train-skip-merge (D48131623 landed internally) Pull Request resolved: https://github.com/pytorch/pytorch/pull/106743 Approved by: https://github.com/malfet	2023-08-08 15:27:34 +00:00
Thomas Ortner	cc21fa75a3	Enable dynamic shapes of torch.nn.Parameter (#105855 ) This PR adds a new configuration that enables shapes of torch.nn.Parameter to be treated as dynamic in order to avoid extensive recompilation when Parameters are used instead of Tensors. This feature addresses part of issue #105279 Pull Request resolved: https://github.com/pytorch/pytorch/pull/105855 Approved by: https://github.com/ezyang	2023-08-08 05:40:01 +00:00
Mikayla Gawarecki	d8e5f2aa6d	Reland "Make adding buffers more like adding parameters (#104069 )" (#106224 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/106224 Approved by: https://github.com/atalman, https://github.com/albanD	2023-07-31 17:18:56 +00:00
Michael Voznesensky	8549abc347	Grab bag of DTensor enablement stuff (Enable whole graph capture for DTensor) (#105787 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105787 Approved by: https://github.com/ezyang	2023-07-30 00:17:45 +00:00
Aaron Gokaslan	6d43c89f37	[BE]: Update Ruff to 0.0.280 (#105724 ) Removes unused loop values in python dictionary iteration. Automated fix from Ruff master Pull Request resolved: https://github.com/pytorch/pytorch/pull/105724 Approved by: https://github.com/ezyang, https://github.com/janeyx99	2023-07-22 23:03:34 +00:00
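As an illustration of the kind of cleanup the commit above describes, the snippet below shows a dictionary loop that unpacks a value it never uses, next to the simpler form that iterates over keys only. The dictionary and variable names are made up for this example; the exact rewrites applied in the PR are not shown in the log.

```python
# Hypothetical data used only for this illustration.
config = {"lr": 0.1, "momentum": 0.9}

# Before: the value is unpacked from .items() but never used.
names_before = []
for name, _value in config.items():
    names_before.append(name)

# After: iterate over the keys directly.
names_after = []
for name in config:
    names_after.append(name)
```

Both loops visit the keys in insertion order, so the two lists are identical; the second form simply avoids building unused `(key, value)` pairs.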
angelayi	b0a04331b4	[dynamo] Fix import if numpy is not installed (#105711 ) This [line](https://github.com/pytorch/pytorch/blob/main/torch/_dynamo/allowed_functions.py#L18) results in an import issue if numpy is not installed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/105711 Approved by: https://github.com/yanboliang, https://github.com/ezyang	2023-07-21 05:52:32 +00:00
William Wen	777fc0bb58	[dynamo] fine-grained bytecode-source attribution in python 3.11 (#104676 ) Since Python 3.11 bytecode contains endline and column information, for each bytecode, we attribute the source code corresponding to the bytecode in a more accurate way. For example, we can highlight a function call in a series of nested function calls, or highlight a function call spanning multiple lines. Sample: ```python import torch import torch._dynamo from functorch.experimental.control_flow import cond def h(x): return x * 5 def true_fn(x): return x * 2 def false_fn(x): return x * 3 def f(pred, x): x = h( h(h(x)) ) x = x[1:][:2] torch._dynamo.graph_break() x = cond(pred, true_fn, false_fn, [x]) opt_f = torch.compile(f, backend="eager") opt_f(torch.tensor(True), torch.randn(3, 3, 3, 3)) ``` Output: ``` $ TORCH_LOGS="trace_call" python playground9.py TRACE inlined call h from f /scratch/williamwen/work/pytorch/playground9.py:16 h(h(x)) ~^^^ TRACE FX call mul from h /scratch/williamwen/work/pytorch/playground9.py:6 (inline depth: 1) return x * 5 ~~^~~ TRACE inlined call h from f /scratch/williamwen/work/pytorch/playground9.py:16 h(h(x)) ~^^^^^^ TRACE FX call mul_1 from h /scratch/williamwen/work/pytorch/playground9.py:6 (inline depth: 1) return x * 5 ~~^~~ TRACE inlined call h from f /scratch/williamwen/work/pytorch/playground9.py:15 x = h( ~^ h(h(x)) ^^^^^^^ ) ^ TRACE FX call mul_2 from h /scratch/williamwen/work/pytorch/playground9.py:6 (inline depth: 1) return x * 5 ~~^~~ TRACE FX call getitem from f /scratch/williamwen/work/pytorch/playground9.py:18 x = x[1:][:2] ~^^^^ TRACE FX call getitem_1 from f /scratch/williamwen/work/pytorch/playground9.py:18 x = x[1:][:2] ~~~~~^^^^ TRACE inlined call true_fn from <resume in f> /scratch/williamwen/work/pytorch/playground9.py:20 x = cond(pred, true_fn, false_fn, [x]) ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ TRACE FX call mul from true_fn /scratch/williamwen/work/pytorch/playground9.py:9 (inline depth: 1) return x * 2 ~~^~~ 
TRACE inlined call false_fn from <resume in f> /scratch/williamwen/work/pytorch/playground9.py:20 x = cond(pred, true_fn, false_fn, [x]) ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ TRACE FX call mul from false_fn /scratch/williamwen/work/pytorch/playground9.py:12 (inline depth: 1) return x * 3 ~~^~~ TRACE FX call cond from <resume in f> /scratch/williamwen/work/pytorch/playground9.py:20 x = cond(pred, true_fn, false_fn, [x]) ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/104676 Approved by: https://github.com/ezyang	2023-07-20 17:18:52 +00:00
Andrey Talman	c6653b65d8	Back out "Make adding buffers more like adding parameters (#104069 )" (#105581 ) Summary: D47537831 is breaking pyper tests: https://fb.workplace.com/groups/802176577445480/posts/1018902842439518/ with `TypeError: register_buffer() takes 3 positional arguments but 4 were given` Original commit changeset: d4b4069fbd38 Original Phabricator Diff: D47537831 Test Plan: ``` buck2 run //caffe2/torch/fb/training_toolkit/integration_tests/training_lifecycle/cogwheel_tests/pyper_release_v2:cogwheel_smallworld_inline_cvr_infer_pyper_pyper__canary_offline_training-launcher -- --run-harness-in-tupperware --build-fbpkg ads_dper3 --build-fbpkg training_platform ``` Reviewed By: atalman Differential Revision: D47600140 Pull Request resolved: https://github.com/pytorch/pytorch/pull/105581 Approved by: https://github.com/mikaylagawarecki	2023-07-20 03:39:53 +00:00
kshitij12345	e137ac6c59	[dynamo][torch_np] support linalg, random and fft module (#105320 ) Support tracing through `np.linalg` with `torch_np` installed. Will update with other modules if this approach makes sense. TODO: * [x] Add test for `fft` and `random`. Fixes https://github.com/pytorch/pytorch/issues/105269 Pull Request resolved: https://github.com/pytorch/pytorch/pull/105320 Approved by: https://github.com/ezyang, https://github.com/lezcano	2023-07-19 11:06:37 +00:00
Michael Lazos	1597dd7a54	Report guard failures with recompiles logging (#105500 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/105500 Approved by: https://github.com/Chillee, https://github.com/anijain2305	2023-07-19 02:20:44 +00:00
Wanchao Liang	cb23373264	[dynamo] allow tensor subclass fakification in dynamo (#105308 ) This PR adds the necessary plumbing through torchdynamo to allow tensor subclasses with a certain contract (i.e. with `__tensor_flatten__` and `__tensor_unflatten__`) to go through the dynamo fakification pass by fakifying the tensor subclass's internal components. Some of the tensor subclass contract logic is mostly borrowed from https://github.com/pytorch/pytorch/pull/97540 Added some tests to verify that simply passing a tensor subclass (i.e. DTensor) through dynamo eager works as expected. Pull Request resolved: https://github.com/pytorch/pytorch/pull/105308 Approved by: https://github.com/ezyang	2023-07-18 17:28:04 +00:00
Aleksandar Samardžić	5d473a950f	Make conversions from/to sparse semi-structured always @torch.compile-d (#105272 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105272 Approved by: https://github.com/ezyang	2023-07-18 04:51:28 +00:00
lezcano	a26afb9848	Better comparisons for np.ndarrays in dynamo (#105333 ) This takes tolerances into account. Pull Request resolved: https://github.com/pytorch/pytorch/pull/105333 Approved by: https://github.com/larryliu0820	2023-07-17 20:20:50 +00:00
ekamiti	32d422f335	Make adding buffers more like adding parameters (#104069 ) Add semantics for creating a buffer object that are similar to those for creating a parameter. This is done by introducing a new `Buffer` class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same as the `register_buffer` method has not been changed. The `persistent` parameter in the `Buffer` type indicates whether a buffer object should be persistent or not. Other non-test changes have to do with getting the new `Buffer` type recognized by inductor and dynamo. Remaining changes are test changes to make sure that the `Buffer` type can be used as a drop-in replacement for `register_buffer` as it just leads to `register_buffer` being called. The addition of this new functionality still allows for normal tensors to be used as buffers so these changes are intended to be backwards compatible. Fixes #35735 Pull Request resolved: https://github.com/pytorch/pytorch/pull/104069 Approved by: https://github.com/mikaylagawarecki	2023-07-17 17:59:05 +00:00
Animesh Jain	95232c216b	[dynamo] Bugfix for enums (#105306 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105306 Approved by: https://github.com/yanboliang	2023-07-17 16:39:16 +00:00
lezcano	b190f46514	Allow NumPy code in torch.compile to run on cuda (#104699 ) This can be achieved by doing `torch.set_default_device("cuda")`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104699 Approved by: https://github.com/ezyang, https://github.com/larryliu0820	2023-07-06 18:43:09 +00:00
Animesh Jain	8c191d8eef	[dynamo][ac] Reland #104397 - Remove disable monkeypatching of utils.checkpoint (#104665 ) NO CHANGE from before. The ancestor diff was reverted, so this diff got reverted as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104665 Approved by: https://github.com/wconstab	2023-07-06 00:48:02 +00:00
Animesh Jain	4005152b92	[dynamo] Organize higherorderops variable trackers (#104565 ) The main change is moving the higherorderops from torch.py to higher_order_ops.py. And creating smaller subclasses of HigherOrderOp for cond, map etc Pull Request resolved: https://github.com/pytorch/pytorch/pull/104565 Approved by: https://github.com/zou3519	2023-07-05 22:19:26 +00:00
PyTorch MergeBot	40f53912cf	Revert "[dynamo][ac] Remove disable monkeypatching of utils.checkpoint (#104397 )" This reverts commit `537a6c0651`. Reverted https://github.com/pytorch/pytorch/pull/104397 on behalf of https://github.com/huydhn due to This has been reverted internally by D47216591, so I need to also revert it on OSS to keep them in sync ([comment](https://github.com/pytorch/pytorch/pull/104397#issuecomment-1621086360))	2023-07-05 06:11:08 +00:00
Animesh Jain	537a6c0651	[dynamo][ac] Remove disable monkeypatching of utils.checkpoint (#104397 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/104397 Approved by: https://github.com/wconstab	2023-06-30 02:27:06 +00:00
Animesh Jain	2bb83cd45c	[dynamo][ac] Minor refactor for better code organization and a bugfix (#104276 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/104276 Approved by: https://github.com/zou3519	2023-06-29 12:57:59 +00:00
cdzhan	c06bb82ba1	fix specialization when you pass an unspec int into slicing on a Python list. (#104142 ) Fixes #103545 Pull Request resolved: https://github.com/pytorch/pytorch/pull/104142 Approved by: https://github.com/malfet, https://github.com/jansel	2023-06-28 13:13:07 +00:00
Animesh Jain	75dab587ef	[dynamo] FSDP + AC + torch.compile (#103953 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/103953 Approved by: https://github.com/wanchaol	2023-06-24 01:40:56 +00:00
Vinay Kumar Burugu	3c28431a0f	Feature: Dump compile_times when TORCH_LOGS=dynamo is enabled. (#104057 ) Partial implementation of https://github.com/pytorch/pytorch/issues/103173. This PR only implements the feature to dump compile_times at the end of the session using the atexit handler. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104057 Approved by: https://github.com/ezyang	2023-06-23 05:25:09 +00:00
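The mechanism mentioned in the commit above (dumping timings at the end of the session via an `atexit` handler) can be sketched with the standard library alone. The registry `_compile_times` and the helper names below are hypothetical and do not correspond to Dynamo's internal functions; only `atexit.register` is a real standard-library call.

```python
import atexit
from collections import defaultdict

# Hypothetical registry: phase name -> list of durations in seconds.
_compile_times = defaultdict(list)


def record_compile_time(phase, seconds):
    """Store one timing sample for a named phase."""
    _compile_times[phase].append(seconds)


def format_compile_times():
    """Render one summary line per phase, in insertion order."""
    lines = []
    for phase, times in _compile_times.items():
        lines.append(f"{phase}: {sum(times):.3f}s over {len(times)} call(s)")
    return "\n".join(lines)


def _dump_at_exit():
    # Runs once when the interpreter shuts down normally.
    print(format_compile_times())


atexit.register(_dump_at_exit)

record_compile_time("backend_compile", 1.25)
record_compile_time("backend_compile", 0.75)
record_compile_time("entire_frame_compile", 3.0)
```

Registering the handler once at import time means the summary is printed even if the program never asks for it explicitly, which is the behaviour the commit message describes.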
Thiago Crepaldi	6f655d4195	Add symbolic tracing support to torch._dynamo.export (fake input + weights) (#100017 ) Fixes #95900 Using the following repro as guide:
```python
import torch
import torch._dynamo
from torch._subclasses import fake_tensor
from torch.fx.experimental.symbolic_shapes import ShapeEnv
from torch._dynamo.output_graph import config


class Model(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(2, 2)
        self.linear2 = torch.nn.Linear(2, 2)

    def forward(self, x):
        out = self.linear(x)
        out = self.linear2(out)
        return out


fake_mode = fake_tensor.FakeTensorMode(
    allow_non_fake_inputs=False,
    allow_fallback_kernels=True,
    shape_env=ShapeEnv(
        allow_scalar_outputs=config.capture_scalar_outputs,
        allow_dynamic_output_shape_ops=config.capture_dynamic_output_shape_ops,
        frame_id=0,
    ),
)

# Fakeifying input/model before calling torch._dynamo.export
with fake_mode:
    fake_x = torch.rand(5, 2, 2)
    model = Model()

# Calling torch._dynamo.export without active fake mode
graph_module, guards = torch._dynamo.export(
    model, fake_x, aten_graph=True, fake_mode=fake_mode
)
graph_module.print_readable()
graph_module.graph.print_tabular()
```
Summary of changes:
* Plumb fake_mode through torch.export API. When specified, it replaces the creation of a new FakeTensorMode at InstructionTranslator on behalf of OutputGraph Hacks FakeTensor.__new__ to prevent a torch.tensor._make_subclass call for inputs that are already fakeified by the user. This probably needs to be fixed in a nicer way. Any idea?
* Removed a few asserts that didn't want faked tensors coming from user script
* Added torch._subclasses.fake_tensor.FakeTensor to the type list on a few assert checks to allow fake inputs
The changes above allowed symbolic tracing with both static and dynamic shapes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100017 Approved by: https://github.com/ezyang	2023-06-15 21:28:10 +00:00
Mengwei Liu	96c23fe212	[dynamo][numpy] Add support for builtin functions (#103457 ) In order to be able to run stuff like:
```
def f(x):
    a = x.numpy()
    return a + a
```
This PR adds a branch in `BuiltinVariable` to handle `NumpyNdarrayVariable` case. Pull Request resolved: https://github.com/pytorch/pytorch/pull/103457 Approved by: https://github.com/ezyang	2023-06-15 09:18:45 +00:00
Animesh Jain	16c2090b2d	[benchmark][compile] Limit number of bounding boxes to 5 (#103413 ) Depends on https://github.com/pytorch/benchmark/pull/1729 Pull Request resolved: https://github.com/pytorch/pytorch/pull/103413 Approved by: https://github.com/ezyang	2023-06-15 01:06:40 +00:00
Edward Z. Yang	ddf4cd69ec	Delete ifdyn and ifunspec combinators (#103596 ) Replaced with expect tests for ease of updating. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/103596 Approved by: https://github.com/voznesenskym	2023-06-15 00:14:17 +00:00
Animesh Jain	bd0ed940b7	[activation checkpoint][dynamo] Wrap AC into Tag based higher order op (#102935 ) These are the numbers with this PR ![image](https://github.com/pytorch/pytorch/assets/13822661/63e991d5-80e2-4e94-8e4b-243621c3990e) There are 3 main followups * A naive partitioner gives better memory footprint than min-cut partitioner here. Currently, we are using min-cut partitioner. Waiting for @Chillee to discuss this further to either modify min-cut or add a naive partitioner. * aot_eager is < 1x memory footprint. This is true even for non AC models. This could hide some inefficiency somewhere. * inductor is giving very different memory numbers between AOT-traced-AC (duplicate early) vs this implementation. This leads to some inefficiency in inductor that we need to resolve. Pull Request resolved: https://github.com/pytorch/pytorch/pull/102935 Approved by: https://github.com/jansel	2023-06-14 20:15:43 +00:00
Edward Z. Yang	8b015c166c	Don't test dynamic_shapes in tensor_always_has_static_shape (#103517 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/103517 Approved by: https://github.com/anijain2305	2023-06-14 07:04:17 +00:00
Mengwei Liu	2eac8bd2b8	[dynamo][numpy] Support ndarray methods (#97537 ) This PR adds universal support for ndarray methods. After #100839 each `NumpyNdarrayVariable` should wrap a `torch.Tensor`. This PR adds a `numpy_method_wrapper` which converts the `torch.Tensor` to `torch_np.ndarray` and then call the numpy ndarray method. Then we also try to return a `torch.Tensor` (return as-is if the value is not ndarray-like) Pull Request resolved: https://github.com/pytorch/pytorch/pull/97537 Approved by: https://github.com/ezyang	2023-06-12 17:21:31 +00:00
Edward Z. Yang	12cd1dbba0	Handle recursive tuple in clone_inputs (#102979 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/102979 Approved by: https://github.com/wconstab	2023-06-05 22:11:48 +00:00
Michael Lazos	c46af25bb3	Initialize optimizer in dynamo to avoid graph break and tracing slowness (#102640 ) On calls to `_init_group` rather than tracing through it, extract python values from the arguments, and call the initialization. This avoids having to trace this function which is very slow with large parameters, and also avoids graph breaking on it. This is sound in this case because the state is only initialized once in the eager case. Guards on the state and params are generated explicitly rather than via tracing the initialization. Caveats: `_init_group` also gathers various state tensors into lists via mutating list arguments to pass to the functional optimizer implementation. These state tensors exist on the optimizer itself, but we don't know exactly how the gathering is done and which tensors correspond to which attributes of the optimizer module (each optimizer has different states). To rectify this, we keep weak_ptrs to all of the tensors collected in the lists in globals (similar to how parameter keys are stored for dictionaries). These pointers are guaranteed to be alive as long as the optimizer object is alive if the internal state is not interfered with and they are guarded with weakref guards Pull Request resolved: https://github.com/pytorch/pytorch/pull/102640 Approved by: https://github.com/jansel	2023-06-03 15:49:51 +00:00
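The weak-pointer bookkeeping described in the caveats above can be illustrated with Python's `weakref` module. This is only a toy sketch: `StateTensor`, `ToyOptimizer`, `tracked` and `track_state` are hypothetical stand-ins and do not reflect how Dynamo actually stores or guards optimizer state.

```python
import gc
import weakref


class StateTensor:
    """Stand-in for an optimizer state tensor (a plain weak-referenceable object)."""

    def __init__(self, name):
        self.name = name


class ToyOptimizer:
    """Stand-in optimizer that owns its state objects."""

    def __init__(self):
        self.state = {
            "exp_avg": StateTensor("exp_avg"),
            "exp_avg_sq": StateTensor("exp_avg_sq"),
        }


# Hypothetical global table of weak references, keyed by state name.
tracked = {}


def track_state(optimizer):
    for key, tensor in optimizer.state.items():
        # A weak reference does not keep the state object alive by itself.
        tracked[key] = weakref.ref(tensor)


def all_alive():
    return all(ref() is not None for ref in tracked.values())


opt = ToyOptimizer()
track_state(opt)
alive_while_owned = all_alive()

del opt
gc.collect()
alive_after_delete = all_alive()
```

While the optimizer is alive every weak reference resolves to its object; once the optimizer is dropped the references go dead, which is the property the commit message relies on when it says the pointers are valid as long as the optimizer object is alive.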
Mengwei Liu	c304fddf68	[dynamo][numpy] Support graph break for numpy ndarray (#100839 ) Issue: #93684 In previous PRs #95849 #99560 we redirect `numpy.`, `<tensor>.numpy()` calls to `torch_np.` methods and attributes, by creating `NumpyNdarrayVariable` for those calls. We need to handle `NumpyNdarrayVariable` when a graph break happens. This PR did 2 things: 1. In `codegen.py` we made sure we can reconstruct the value wrapped by `NumpyNdarrayVariable`, to be `torch_np.ndarray` in the stack whenever we recompile the subgraph. 2. In `builder.py` we can wrap the value to be `NumpyNdarrayVariable` and save it as graph input. ----- Starting from commit 6: ## A new design for supporting numpy in dynamo In short the core concept doesn't change: we still convert `numpy` API calls to `torch_np` API calls. However, instead of wrapping a `torch_np.ndarray` in `NumpyNdarrayVariable`, the new design wraps a `torch.Tensor`. The reason for this change is that we need to keep `torch.Tensor` everywhere in the captured graph, so that it works well with the backend of dynamo. See discussions in https://github.com/Quansight-Labs/numpy_pytorch_interop/issues/142 for details. 
### Flow This is an example showing how do we think about dynamo working on a simple function: ```python def f(x: torch.Tensor, y: torch.Tensor): a, b = x.numpy(), y.numpy() c = np.add(x, y) return torch.from_numpy(c) ``` ``` +------------+ +------------+ torch.Tensor \| \|numpy.ndarray\| \| -------------- .numpy() --------------\| \| \| \| \| \| +------------------+ +------------+ \| numpy.add \|numpy.ndarray\| \|torch.Tensor +------------+ \| --------------\| torch.from_numpy -------------- torch.Tensor \| \|numpy.ndarray\| \| \| \| -------------- .numpy() --------------\| \| +------------------+ \| \| \| \| +------------+ +------------+ +------------+ +----------------+ torch.Tensor \| \|torch.Tensor \| \| -------------- .detach() --------------\| \| \| \| \| \| +----------------+ +------------+ +------------+ \| \|torch_np.ndarray\| \|torch.Tensor\| \|torch.Tensor \| torch_np.add -----------------\| util.to_tensor -------------\| .detach() -------------- +------------+ \| \| \| \| \| \| torch.Tensor \| \|torch.Tensor \| \| +----------------+ +------------+ -------------- .detach() --------------\| \| \| \| \| \| +------------+ \| +----------------+ \| \| wrapper on torch_np.add \| +--------------------------------------------------------+ ``` ### Approach `torch_np` APIs can take both `torch_np.ndarray` as well as `torch.Tensor`. What we need to do is to have a wrapper for these APIs to convert the return value back to `torch.Tensor`. This way only the wrapper is showing up in the captured graph, with `torch.Tensor`s as input and `torch.Tensor` as output. If we have a graph break or we've traced to the end of the program, we need to inspect all the `NumpyNdarrayVariable` in the stack and convert them back to `numpy.ndarray`, to make sure the compiled version is still behaving the same as the eager version. 
### Examples Here's an example of the graph generated: ```python def fn(x: np.ndarray, y: np.ndarray): a = x.real b = y.real torch._dynamo.graph_break() return np.add(a, 1), np.add(b, 1) ``` Graph generated: ``` [2023-05-16 10:31:48,737] torch._dynamo.output_graph.__graph: [DEBUG] TRACED GRAPH __compiled_fn_0 <eval_with_key>.0 opcode name target args kwargs ------------- -------------- ---------------------------------------------------------- ---------------------- -------- placeholder l_x_ L_x_ () {} placeholder l_y_ L_y_ () {} call_function from_numpy <built-in method from_numpy of type object at 0x12b1fdc80> (l_x_,) {} call_function from_numpy_1 <built-in method from_numpy of type object at 0x12b1fdc80> (l_y_,) {} call_function attr_wrapper <function attr_wrapper at 0x12e8693a0> (from_numpy, 'real') {} call_function attr_wrapper_1 <function attr_wrapper at 0x12e8693a0> (from_numpy_1, 'real') {} output output output ((),) {} [2023-05-16 10:31:48,908] torch._dynamo.output_graph.__graph: [DEBUG] TRACED GRAPH __compiled_fn_2 <eval_with_key>.1 opcode name target args kwargs ------------- ------------- ---------------------------------------------------------- ------------------------------- -------- placeholder l_a_ L_a_ () {} placeholder l_b_ L_b_ () {} call_function from_numpy <built-in method from_numpy of type object at 0x12b1fdc80> (l_a_,) {} call_function from_numpy_1 <built-in method from_numpy of type object at 0x12b1fdc80> (l_b_,) {} call_function wrapped_add <Wrapped function <original add>> (from_numpy, 1) {} call_function wrapped_add_1 <Wrapped function <original add>> (from_numpy_1, 1) {} output output output ((wrapped_add, wrapped_add_1),) {} ``` ### Changes * `codegen.py`: reconstruct `numpy.ndarray` from `NumpyNdarrayVariable` by adding bytecode to call `utils.to_numpy_helper()`. * `output_graph.py`: getting rid of legacy code that does exactly what `codegen.py` does, which only handling return case but not graph break case. 
* `utils.py`: added helpers to convert `numpy.ndarray` to `torch.Tensor` and vice versa. Also added a wrapper class that takes in a function. In `__call__` it calls the function and converts its output to `torch.Tensor` (or a list of them). * `builder.py`: add method to wrap `numpy.ndarray` graph inputs into `NumpyNdarrayVariable`, by calling `torch.numpy` in the proxy. * `misc.py`: `numpy` API calls go into `NumpyVariable` and we find the function with the same name in the `torch_np` module, then wrap it with the wrapper defined in `utils.py`. * `tensor.py`, `torch.py`: proxy `tensor.numpy()` to be `torch.detach()` but wrap it with `NumpyNdarrayVariable`. Similarly, `torch.from_numpy()` -> `torch.detach()` but wrap it with `TensorVariable`. In `NumpyNdarrayVariable`, do the similar `torch_np.ndarray` to `torch.Tensor` wrapping for attributes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100839 Approved by: https://github.com/ezyang	2023-06-03 00:54:25 +00:00
Edward Z. Yang	90b1b17c9f	Fix string concatenation with non-string (#102728 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/102728 Approved by: https://github.com/Skylion007	2023-06-01 20:02:03 +00:00
Animesh Jain	2fa1b563da	[dynamo] Activation checkpoint higher order ops - Reland 101028 (#101790 ) https://github.com/pytorch/pytorch/pull/101028 was reverted due to internal breakage. Relanding. Pull Request resolved: https://github.com/pytorch/pytorch/pull/101790 Approved by: https://github.com/zou3519	2023-05-18 19:09:14 +00:00
Yanbo Liang	7052fb37bd	[Dynamo] Improve handling UnspecializedNNModuleVariable side effect (#101141 ) Fixes #101102 Pull Request resolved: https://github.com/pytorch/pytorch/pull/101141 Approved by: https://github.com/jansel	2023-05-16 03:57:13 +00:00
PyTorch MergeBot	d0db7d624d	Revert "[dynamo] Activation checkpointing as higher order op (#101028 )" This reverts commit `de15e740a1`. Reverted https://github.com/pytorch/pytorch/pull/101028 on behalf of https://github.com/jeanschmidt due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/101028#issuecomment-1548280970))	2023-05-15 17:47:08 +00:00
Michael Lazos	d75f93603a	Flatten exceptions in dynamo (#100779 ) Fixes https://github.com/pytorch/pytorch/issues/93571 [before and after](https://gist.github.com/mlazos/256b0e8f0f98495752a22b960e9f4fcb) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100779 Approved by: https://github.com/ezyang	2023-05-13 00:58:57 +00:00
Animesh Jain	de15e740a1	[dynamo] Activation checkpointing as higher order op (#101028 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/101028 Approved by: https://github.com/voznesenskym, https://github.com/zou3519	2023-05-12 03:17:41 +00:00
Jerry Zhang	c3f3cb5b0f	[quant][pt2e] Support conv bn fusion in convert step for QAT flow (#100442 ) Summary: This PR adds support for folding bn weights into conv for the QAT flow. This is equivalent to the QAT branch of `from_float` in the eager mode quantized conv module: https://github.com/pytorch/pytorch/blob/main/torch/ao/nn/quantized/modules/conv.py#L223 Items that need follow-up: * I added some workarounds because quantize_per_tensor uses float/int args, which dynamo does not support; these need to be fixed after we change the quantized model representation and also change these args to Tensor Test Plan: buck2 test @//mode/opt //caffe2/test:quantization_pt2e -- --exact 'caffe2/test:quantization_pt2e - test_convert_qat_conv_bn_fusion (quantization.pt2e.test_quantize_pt2e.TestQuantizePT2E)' Reviewed By: andrewor14 Differential Revision: D45344281 Pull Request resolved: https://github.com/pytorch/pytorch/pull/100442 Approved by: https://github.com/kimishpatel	2023-05-09 19:43:51 +00:00
Bin Bao	86ddfc7f68	[inductor] Move cpp wrapper trigger logic to inner_compile (#100611 ) Summary: This enables cpp wrapper for backward as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100611 Approved by: https://github.com/jansel	2023-05-08 15:24:02 +00:00
Animesh Jain	3f025c607c	summarize graph breaks (#100696 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100696 Approved by: https://github.com/yanboliang	2023-05-05 22:27:47 +00:00
Edward Z. Yang	ce1ad1c143	Add load_storage (#100519 ) This adds a new operator debugprims::load_storage which does the unusual thing of loading a tensor from disk (via ContentStoreReader). This will be used in a later PR to implement delta debugging in the minifier, even when the repro is too big to fit into memory. The way it works is that you specify a name of the tensor you want to load, as well as enough metadata to reconstruct the tensor, if the store isn't available. If there is an active content store, we read and return the tensor from that store; otherwise we use `rand_strided` to create it. I needed some infra improvements to do this: * `custom_op` now supports factory functions. Factory functions have to be registered specially via `impl_factory` * I modified `clone_input` to also support dtype conversion, which I use to change the dtype of a loaded tensor if necessary. * ContentStore needs to work with a device argument, so we torch.load directly to the correct device. This is for fake tensor support. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100519 Approved by: https://github.com/zou3519, https://github.com/anijain2305	2023-05-05 05:25:03 +00:00
Animesh Jain	8994d9e610	[dynamo] Hide guard_fail_hook behind a flag to improve cache lookup time (+10% DebertaV2) (#100590 ) For TorchDynamo eager backend, DebertaV2 speedup improves from 0.77x to 0.87x. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100590 Approved by: https://github.com/voznesenskym, https://github.com/wconstab	2023-05-04 18:52:21 +00:00
Edward Z. Yang	c7e9f40653	Misc accuracy improvements on minifier (#100447 ) The changes: * Add config knob `same_two_models_use_fp64` for toggling whether or not to use fp64 * Add a test showing that RMSE is superior to atol/rtol * Add `--strict-accuracy` options, which allows for testing against integral/boolean accuracy. Regular accuracy by default now ONLY. There's a test which exercises this, it's a little delicate but I had trouble thinking of a good test otherwise. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100447 Approved by: https://github.com/voznesenskym	2023-05-04 02:51:26 +00:00
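As a rough illustration of why an RMSE check against an fp64 reference can be preferable to an elementwise atol/rtol check, consider the sketch below. It is not the minifier's code: the function names, the `multiplier` and `tol` parameters, and the sample values are all assumptions made for the example.

```python
import math


def rmse(ref, res):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ref, res)) / len(ref))


def allclose(ref, res, rtol=1e-4, atol=1e-4):
    """Elementwise check: |res - ref| <= atol + rtol * |ref| for every element."""
    return all(abs(b - a) <= atol + rtol * abs(a) for a, b in zip(ref, res))


def rmse_close(fp64_ref, eager, compiled, multiplier=2.0, tol=1e-4):
    """Accept the compiled result if its error against the fp64 reference
    is not much worse than the eager result's error against the same reference."""
    return rmse(fp64_ref, compiled) <= multiplier * rmse(fp64_ref, eager) + tol


# Made-up values: eager and compiled are equally far from the fp64 reference,
# but they err in opposite directions, so they differ from each other by 1.0.
fp64_ref = [1000.0, 2000.0, 3000.0]
eager = [1000.5, 1999.5, 3000.5]
compiled = [999.5, 2000.5, 2999.5]

elementwise_ok = allclose(eager, compiled)         # compares compiled to eager directly
rmse_ok = rmse_close(fp64_ref, eager, compiled)    # compares both to the fp64 reference
```

Here the elementwise comparison of compiled against eager fails even though the compiled result is exactly as accurate as eager, while the RMSE comparison against the higher-precision reference accepts it.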
kshitij12345	8b64dee5d2	[fix] torch_compile_debug don't log with 0 (#100462 ) Fixes https://github.com/pytorch/pytorch/issues/99906 Tested locally. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100462 Approved by: https://github.com/mlazos	2023-05-03 08:23:09 +00:00
Richard Zou	984a2397ba	Refactor OutputGraph (#99987 ) This PR splits OutputGraph into two classes: - SubgraphTracer (handles FX-tracing) - OutputGraph (handles Dynamo-specific output graph logic, like tracking graph inputs, compiling the graph, and executing it). The motivation behind this is in the next PR up in the stack. TL;DR is: in order to do higher-order operators, we need nested SubgraphTracer, one for each level of nesting of the higher-order operators. I'm happy to flatten the stack into a single PR, but this separation made it easier for me to test. Lmk if you want the stack flattened. Test Plan: - existing tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/99987 Approved by: https://github.com/anijain2305, https://github.com/voznesenskym	2023-05-02 17:11:02 +00:00
Michael Voznesensky	aafc6ce8cc	Produce constant variables in cases where a SymNode is created with a constant (#100144 ) ` AOT_DYNAMIC_SHAPES=1 TORCHDYNAMO_DYNAMIC_SHAPES=1 benchmarks/dynamo/huggingface.py --performance --training --amp --backend eager --disable-cudagraphs --device cuda --only AllenaiLongformerBase --explain` Looks promising! Goes from: Dynamo produced 173 graphs covering 2760 ops with 160 graph breaks (14 unique) To: Dynamo produced 6 graphs covering 2298 ops with 15 graph breaks (7 unique) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100144 Approved by: https://github.com/ezyang	2023-05-01 21:32:11 +00:00
Edward Z. Yang	2d8deffc1e	Refactor repro/minifier into CLI; add analyze (#100226 ) This is a two-part PR; I can split it if you really want me to. The first part is a refactor of the after aot repro/minifier scripts to come with a command line interface. I maintain exact BC with the previous interface (so, e.g., you still get a repro.py and a run_minifier.py that do the same thing as before), but each of these scripts also takes command line arguments now which you can use to customize what actually happens. Check `run_repro` for full documentation on the arguments. The second part of this is an implementation of the `analyze` subcommand on the new CLI for any repro. <img width="1277" alt="image" src="https://user-images.githubusercontent.com/13564/235045677-8545aab7-5e83-4813-bbec-47783dc60122.png"> This facility is oriented towards accuracy debugging. It does several things: 1. It will run your model twice and check for nondeterminism in inductor/float64, even on intermediate inputs (our benchmarking nondeterminism test only checks for nondeterminism on the final output). This makes localizing which operator is nondeterministic easy. 2. It will run your compiled model side-by-side with eager and float64 variants, and then report when things diverge too far from RMSE delta from float64. Importantly, it does all this without requiring every intermediate to be held in memory (which will cause an OOM on large repros, such as the one I tested this on.) 
Some other minor improvements: * MinifierTestBase now has an easy-to-comment-out spot that you can use to retain the temporary directory; good for debugging * We print "running minifier" and "running repro" in MinifierTestBase to make it easier to orient where logs are coming from * `same` takes an optional `log_error` argument which you can use to reroute the error logs when things mismatch * counters["inductor"]["intermediate_hooks"] tracks the number of intermediate hooks we've codegen'ed; good for populating the tqdm interface * torch.fx.interpreter gets an official `boxed_run` interface which uses the boxed arguments calling convention and doesn't retain inputs unnecessarily long * torch.utils._content_store gets compute_tensor_metadata/read_tensor_metadata helper functions for computing tensor information without serializing it Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/100226 Approved by: https://github.com/bertmaher, https://github.com/bdhirsh, https://github.com/anijain2305	2023-05-01 11:12:38 +00:00
PyTorch MergeBot	89c43f4108	Revert "Produce constant variables in cases where a SymNode is created with a constant (#100144 )" This reverts commit `d7bdfd3454`. Reverted https://github.com/pytorch/pytorch/pull/100144 on behalf of https://github.com/ezyang due to ci failure is real ([comment](https://github.com/pytorch/pytorch/pull/100144#issuecomment-1529587039))	2023-05-01 11:10:48 +00:00

545 Commits