Previously, globals of inlined functions from other files were not handled correctly: we were not tracking mutations on them, and they collided with globals of the same name in the parent function's module. This PR overrides LOAD_GLOBAL/STORE_GLOBAL for the inlining instruction translator (inline tx) and tracks mutations on these globals separately.
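A hypothetical two-file sketch of the kind of pattern affected (module and variable names are made up for illustration):
```python
# other_module.py (hypothetical)
counter = 0

def bump():
    global counter        # this global lives in other_module, not in the caller's module
    counter += 1
    return counter

# main.py (hypothetical)
import torch
from other_module import bump

counter = 100             # unrelated global with the same name

@torch.compile
def f(x):
    # When bump() is inlined, its LOAD/STORE_GLOBAL must resolve against
    # other_module's globals, and the mutation of its `counter` must be
    # tracked there rather than conflated with main.counter.
    return x + bump() + counter
```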
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125002
Approved by: https://github.com/jansel
ghstack dependencies: #125097, #125107
This PR adds support for the tensor is_complex method in dynamo. Take the following code as an example:
```python
def test_tensor_is_complex(x):
    if x.is_complex():
        return x + 1
    else:
        return x - 1
```
Before this fix, the is_complex() call caused a graph break: "torch.* op returned non-Tensor bool call_method is_complex". After this fix, the graph break is avoided.
Fixes #122692
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124927
Approved by: https://github.com/ezyang
This removes the duplicate handling of comparison ops between symbolic_convert and builtin and refactors the handling to use the binop infrastructure. This change regresses overheads a bit, but that is fixed in the next PR.
The new test skips are variants of `type(e) is np.ndarray` that previously fell back to eager.
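A hypothetical example of that pattern (the exact skipped tests live in the test suite):
```python
import numpy as np
import torch

@torch.compile
def check(e):
    # an identity check on the type; variants of this previously fell back to eager
    if type(e) is np.ndarray:
        return e + 1
    return e - 1
```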
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122043
Approved by: https://github.com/anijain2305
ghstack dependencies: #122039
Partially addresses https://github.com/pytorch/pytorch/issues/118785
This diff fixes three things:
1. Add get_function to FunctoolsPartialVariable. Note that it is available only if all args are constant; otherwise it throws unimplemented in the call to asPythonConstant.
2. NamedTupleVariable takes its args dispatched individually rather than as a list, e.g. NamedTuple(a, b, c) vs NamedTuple([a, b, c]); fix that by specializing asProxy (see the sketch after this list).
3. A call to create_arg from within create_proxy changes a Python NamedTuple into a function call node without associating an example value. Updated get_fake_values_from_nodes to handle this case.
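A minimal plain-Python sketch of the calling-convention difference in item 2 (names made up for illustration):
```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y", "z"])

# namedtuple subclasses take their fields as separate positional args...
p = Point(1, 2, 3)

# ...while sequence constructors such as tuple take a single iterable, so
# NamedTupleVariable cannot reuse the generic "args as one list" proxy logic.
t = tuple([1, 2, 3])
```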
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119435
Approved by: https://github.com/jansel, https://github.com/anijain2305
ghstack dependencies: #119314
Fix https://github.com/pytorch/pytorch/issues/118787
In the compiled function, calls to random() are replaced with a single call to a function that generates all the random values.
The random calls encountered during compilation used to be tracked in a variable stored inside the instruction translator, and when there are nested translators, the tracked calls were lost when the inner instruction translator was popped.
This diff fixes that by moving the tracked calls to the output graph, which is shared across the translators generating the same function.
More details about the issue and why this solution was picked are in the GitHub issue above.
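A hypothetical sketch of the kind of code affected (not the exact repro from the issue), where random() is traced both in the outer frame and in an inlined helper:
```python
import random
import torch

def helper(x):
    # traced by a nested (inlined) instruction translator
    return x * random.random()

@torch.compile
def fn(x):
    # both random() calls must end up in the single wrapper generated
    # on the shared output graph
    return helper(x) + random.random()

fn(torch.ones(3))
```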
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119218
Approved by: https://github.com/jansel, https://github.com/anijain2305
```python
def f():
    def g():
        return ()
    print(g.__name__)

f()
```
The script above should print `g` (with or without torch.compile), but prints `f.<locals>.g` with torch.compile.
The problem is that we use the co_qualname when reconstructing the NestedUserFunctionVariable; I switched this over to use the co_name.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118768
Approved by: https://github.com/yanboliang, https://github.com/jansel
Before this PR, we had a graph break for code like this:
```python
def test_get_device_properties_tensor_device(a):
    x = a.to("cuda")
    prop = torch.cuda.get_device_properties(x.device)
    if prop.major == 8:
        return x + prop.multi_processor_count
    return x + prop.max_threads_per_multi_processor
```
This PR constant folds torch.cuda.get_device_properties, and we get the following dynamo graph:
```python
[2024-01-26 13:28:13,253] [0/0] torch._dynamo.output_graph.__graph: [DEBUG] <eval_with_key>.0 class GraphModule(torch.nn.Module):
[2024-01-26 13:28:13,253] [0/0] torch._dynamo.output_graph.__graph: [DEBUG]     def forward(self, L_a_ : torch.Tensor):
[2024-01-26 13:28:13,253] [0/0] torch._dynamo.output_graph.__graph: [DEBUG]         l_a_ = L_a_
[2024-01-26 13:28:13,253] [0/0] torch._dynamo.output_graph.__graph: [DEBUG]
[2024-01-26 13:28:13,253] [0/0] torch._dynamo.output_graph.__graph: [DEBUG]         # File: /home/yidi/local/pytorch/test/dynamo/test_functions.py:544 in test_get_device_properties_tensor_device, code: x = a.to("cuda")
[2024-01-26 13:28:13,253] [0/0] torch._dynamo.output_graph.__graph: [DEBUG]         x = l_a_.to('cuda'); l_a_ = None
[2024-01-26 13:28:13,253] [0/0] torch._dynamo.output_graph.__graph: [DEBUG]
[2024-01-26 13:28:13,253] [0/0] torch._dynamo.output_graph.__graph: [DEBUG]         # File: /home/yidi/local/pytorch/test/dynamo/test_functions.py:547 in test_get_device_properties_tensor_device, code: return x + prop.multi_processor_count
[2024-01-26 13:28:13,253] [0/0] torch._dynamo.output_graph.__graph: [DEBUG]         add = x + 108; x = None
[2024-01-26 13:28:13,253] [0/0] torch._dynamo.output_graph.__graph: [DEBUG]         return (add,)
[2024-01-26 13:28:13,253] [0/0] torch._dynamo.output_graph.__graph: [DEBUG]
```
The signature of get_device_properties is:
```python
def get_device_properties(device: _device_t) -> _CudaDeviceProperties:
```
I think it's safe to constant fold get_device_properties():
1. torch.cuda.get_device_properties(tensor.device). In this case, tensor.device.index is guarded in _check_tensor.
2. torch.cuda.get_device_properties(device_int_id). We don't expect the GPU properties for a particular index to change during a torch.compile run, so it makes sense to specialize the properties for a concrete device_int_id.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118422
Approved by: https://github.com/yanboliang, https://github.com/jansel
Before this PR, we had a graph break for the following test:
```python
def test_cublas_allow_tf32(x):
    if torch.backends.cuda.matmul.allow_tf32:
        return x.sin() + 1
    return x.cos() - 1
```
In this PR, we first add "torch.backends.cuda" to MOD_INLINELIST to trace through the Python binding and reach the actual call, torch._C._get_cublas_allow_tf32, which is already a TorchInGraphVariable. Because _get_cublas_allow_tf32 accesses the same variable as at::globalContext().allowTF32CuBLAS(), which is guarded by dynamo as global state [here](https://github.com/pytorch/pytorch/blob/main/torch/csrc/dynamo/guards.cpp#L443), we can safely assume it returns a ConstantVariable during tracing.
After this PR, we get the following graph:
```python
[2024-01-24 15:31:01,501] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] <eval_with_key>.0 class GraphModule(torch.nn.Module):
[2024-01-24 15:31:01,501] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]     def forward(self, L_x_ : torch.Tensor):
[2024-01-24 15:31:01,501] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         l_x_ = L_x_
[2024-01-24 15:31:01,501] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]
[2024-01-24 15:31:01,501] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         # File: /home/yidi/local/pytorch/test/dynamo/test_functions.py:515 in test_cublas_allow_tf32, code: return x.cos() - 1
[2024-01-24 15:31:01,501] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         cos = l_x_.cos(); l_x_ = None
[2024-01-24 15:31:01,501] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         sub = cos - 1; cos = None
[2024-01-24 15:31:01,501] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         return (sub,)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118236
Approved by: https://github.com/yanboliang, https://github.com/anijain2305
* This is an old builtin function equivalent to the bool constructor. It is easy enough to add support for.
* I also realized the tests were in the wrong class (the one reserved for testing default args), so I moved them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117463
Approved by: https://github.com/jansel
The `initial` argument in `functools.reduce` can be `None`.
```python
initial_missing = object()
def reduce(function, iterable, initial=initial_missing, /):
    it = iter(iterable)
    if initial is initial_missing:
        value = next(it)
    else:
        value = initial
    for element in it:
        value = function(value, element)
    return value
```
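For illustration, a small usage sketch (not from the PR) showing that `None` is a legitimate initial value, distinct from "no initial given":
```python
import functools

# None is used as the starting accumulator, not treated as "missing"
print(functools.reduce(lambda acc, x: (acc, x), [1, 2], None))  # ((None, 1), 2)

# whereas omitting `initial` starts from the first element
print(functools.reduce(lambda acc, x: (acc, x), [1, 2]))        # (1, 2)
```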
Reference:
- python/cpython#102759
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116398
Approved by: https://github.com/Skylion007
Two recent Triton PRs (https://github.com/openai/triton/pull/2701, https://github.com/openai/triton/pull/2756) changed the interface of triton.compile; this PR adds the necessary changes on the Inductor side to work with both the old and new compile APIs.
There is also some simplification between the compilation call in the subprocess and the one in the main process:
- Previously we passed warm_cache_only=True if the compilation happened in a subprocess, but Triton never uses that argument at the currently pinned commit, so I removed it.
- Previously we only passed compute_capability if the compilation happened in a subprocess. This PR changes that to always pass compute_capability to triton.compile, whether the compilation happens in the main process or a subprocess.
Updated:
There are more interface changes on the Triton side, e.g.:
- tl.math.{min, max} now require a propagate_nan argument.
- JITFunction.run now requires a warmup argument. This affects the benchmarking phase of matmul max-autotune. On the other hand, JITFunction.run now forbids the stream argument; simply not passing it when benchmarking matmul Triton kernels works for both the old and new versions of Triton.
- The Triton Autotuner changed its attribute names from 'warmup' to 'num_warmup' and from 'rep' to 'num_rep'. This caused dynamo to fail on Triton Autotuner objects, since dynamo's TritonKernelVariable makes assumptions about the attribute names; the Autotuner is called directly by the model in some test cases (a rough sketch of tolerating both namings follows this list).
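A hypothetical sketch (not the actual dynamo change) of tolerating both attribute namings mentioned above:
```python
def autotuner_warmup_and_rep(autotuner):
    # Prefer the new Triton Autotuner attribute names, fall back to the old ones.
    warmup = getattr(autotuner, "num_warmup", getattr(autotuner, "warmup", None))
    rep = getattr(autotuner, "num_rep", getattr(autotuner, "rep", None))
    return warmup, rep
```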
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115878
Approved by: https://github.com/jansel
Summary: Currently, we [`clone`](19207b9183/torch/_inductor/lowering.py (L5273)) every `TensorBox` argument of custom Triton kernels while lowering them to the Inductor IR, during which the stride information of the kernel inputs is lost. This is problematic in the common case when the strides of a `torch.Tensor` argument are passed as scalars to a custom Triton kernel alongside the tensor itself (due to the underlying Triton code interpreting the tensors as raw pointers, so the contained stride semantics of the `torch.Tensor` is lost).
In this PR, we add an extended version of the existing [`clone` lowering](19207b9183/torch/_inductor/lowering.py (L2289))---`clone_preserve_reinterpret_view`---which carries over the `ir.ReinterpretView` layers (if any) from the source `TensorBox` to the cloned one. The rationale behind adding a new function (and switching to it in `triton_kernel_wrap` only for now), as opposed to extending the existing `clone`, is to keep the semantics of the latter untouched, as it is a lowering of `torch.clone` (albeit incomplete, as the `memory_format` is currently ignored). Changing the existing `clone` would change those semantics, which is not necessarily desirable in general. Open to suggestions, though.
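A hypothetical illustration of the pattern described above (not the test from the PR), assuming a CUDA device with Triton available; the kernel receives both a tensor and one of its strides as a scalar:
```python
import torch
import triton
import triton.language as tl

@triton.jit
def row_sum_kernel(x_ptr, out_ptr, stride_row, N, BLOCK: tl.constexpr):
    # x_ptr is a raw pointer, so the row stride must be passed in explicitly.
    row = tl.program_id(0)
    offs = tl.arange(0, BLOCK)
    mask = offs < N
    vals = tl.load(x_ptr + row * stride_row + offs, mask=mask, other=0.0)
    tl.store(out_ptr + row, tl.sum(vals, axis=0))

def row_sum(x: torch.Tensor) -> torch.Tensor:
    M, N = x.shape
    out = torch.empty(M, device=x.device, dtype=x.dtype)
    # If the lowering clones x without preserving its ReinterpretView layout,
    # the stride passed below no longer matches the buffer the kernel sees.
    row_sum_kernel[(M,)](x, out, x.stride(0), N, BLOCK=triton.next_power_of_2(N))
    return out
```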
Test Plan:
```
$ python test/dynamo/test_functions.py -k test_triton_kernel_strided_input
...
----------------------------------------------------------------------
Ran 1 test in 5.568s
OK
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116219
Approved by: https://github.com/jansel
This PR fixes two bugs:
1) Constant folding a Triton kernel results in the kernel's inputs being returned without any modification. Disable constant folding for Triton kernels; this needs more investigation.
2) NoneLayout buffers should not be deleted, as they do not exist.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115908
Approved by: https://github.com/aakhundov, https://github.com/jansel