pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
David Berard	df1e855313	[fake_impls] fix max_seqlen return values in efficient_attention_forward (#120842 ) To match the actual implementation, we should return the max_seqlen_q/k, not M, N, when in the sparse case `7e185277cd/aten/src/ATen/native/transformers/cuda/attention.cu (L981-L996)` Note that although the .cu file sets max_seqlen_k = 0 in the sparse case, it actually returns max_seqlen_k or N: `7e185277cd/aten/src/ATen/native/transformers/cuda/attention.cu (L1224-L1231)` Tests - added in the next PR (#102839, which also fixes other parts of the test_fake tests so that we can un-xfail them and actually run the tests) Pull Request resolved: https://github.com/pytorch/pytorch/pull/120842 Approved by: https://github.com/YuqingJ ghstack dependencies: #120682	2024-02-29 07:12:27 +00:00
PyTorch MergeBot	dbe0967a0a	Revert "Add test to check that COW inputs are not materialized (#119507 )" This reverts commit `2ebf2c88ba`. Reverted https://github.com/pytorch/pytorch/pull/119507 on behalf of https://github.com/izaitsevfb due to breaks xla jobs ([comment](https://github.com/pytorch/pytorch/pull/119507#issuecomment-1970022840))	2024-02-28 22:26:59 +00:00
Kurt Mohler	2ebf2c88ba	Add test to check that COW inputs are not materialized (#119507 ) Part of #97856 Pull Request resolved: https://github.com/pytorch/pytorch/pull/119507 Approved by: https://github.com/ezyang ghstack dependencies: #120455	2024-02-28 00:37:33 +00:00
Isuru Fernando	b7df3bba62	add decomposition for frexp (#119217 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119217 Approved by: https://github.com/peterbell10 ghstack dependencies: #119284, #120027	2024-02-23 21:52:42 +00:00
Joel Schlosser	9ec8dd2467	Reify view_func() closures as ViewFuncs (#118404 ) Replaces `view_func()` closures with a reified `ViewFunc` data structure. Codegen generates a `ViewFunc` subclass for each view op (e.g. `NarrowViewFunc`) containing state needed to reconstruct the view. The `ViewFunc` API allows for querying and hot-swapping any `SymInt`s or `Tensors` in the state through `get_symints()` / `get_tensors()` / `clone_and_set()`, which will be essential for fake-ification later on. ```cpp /// Base class for view functions, providing reapplication of a view on a new base. /// Each view op should get a codegenerated subclass of this class containing /// any state needed to reconstruct the view. The class also provides convenience /// accessors for saved SymInts / tensor state. This is useful for e.g. fake-ification, /// where we want to use symbolic values or fake tensors instead. struct TORCH_API ViewFunc { virtual ~ViewFunc() {} /// Returns any SymInts in the saved state. virtual std::vector<c10::SymInt> get_symints() const { return {}; } /// Returns the number of SymInts in the saved state. virtual size_t num_symints() const { return 0; } /// Returns any tensors in the saved state. virtual std::vector<at::Tensor> get_tensors() const { return {}; } /// Returns the number of tensors in the saved state. virtual size_t num_tensors() const { return 0; } /// Reapplies the view on the given base using the saved state. virtual at::Tensor operator()(const at::Tensor&) const = 0; /// Returns a clone of this ViewFunc, optionally with the specified saved state. virtual std::unique_ptr<ViewFunc> clone_and_set( std::optional<std::vector<c10::SymInt>> = c10::nullopt, std::optional<std::vector<at::Tensor>> = c10::nullopt) const = 0; protected: /// Sets the values of any SymInts in the saved state. The input vector size must /// match the number of SymInts in the saved state (i.e. the size of the list /// returned by get_symints()). 
virtual void set_symints(std::vector<c10::SymInt>) {} /// Sets the values of any Tensors in the saved state. The input vector size must /// match the number of Tensors in the saved state (i.e. the size of the list /// returned by get_tensors()). virtual void set_tensors(std::vector<at::Tensor>) {} }; ``` New codegen files: * `torch/csrc/autograd/generated/ViewFuncs.h` * `torch/csrc/autograd/generated/ViewFuncs.cpp` The templates for these also contain impls for `ChainedViewFunc` and `ErroringViewFunc`, which are used in a few places within autograd. Example codegen for `slice.Tensor`: ```cpp // torch/csrc/autograd/generated/ViewFuncs.h #define SLICE_TENSOR_VIEW_FUNC_AVAILABLE struct SliceTensorViewFunc : public torch::autograd::ViewFunc { SliceTensorViewFunc(int64_t dim, c10::optional<c10::SymInt> start, c10::optional<c10::SymInt> end, c10::SymInt step) : dim(dim), start(start), end(end), step(step) {}; virtual ~SliceTensorViewFunc() override {}; virtual std::vector<c10::SymInt> get_symints() const override; virtual size_t num_symints() const override; virtual std::vector<at::Tensor> get_tensors() const override; virtual size_t num_tensors() const override; virtual at::Tensor operator()(const at::Tensor&) const override; virtual std::unique_ptr<ViewFunc> clone_and_set( std::optional<std::vector<c10::SymInt>> = c10::nullopt, std::optional<std::vector<at::Tensor>> = c10::nullopt) const override; protected: virtual void set_symints(std::vector<c10::SymInt>) override; virtual void set_tensors(std::vector<at::Tensor>) override; private: int64_t dim; c10::optional<c10::SymInt> start; c10::optional<c10::SymInt> end; c10::SymInt step; }; ... // torch/csrc/autograd/generated/ViewFuncs.cpp std::vector<c10::SymInt> SliceTensorViewFunc::get_symints() const { ::std::vector<c10::SymInt> symints; symints.reserve((start.has_value() ? 1 : 0) + (end.has_value() ? 
1 : 0) + 1); if(start.has_value()) symints.insert(symints.end(), *(start)); if(end.has_value()) symints.insert(symints.end(), *(end)); symints.push_back(step); return symints; } size_t SliceTensorViewFunc::num_symints() const { return static_cast<size_t>((start.has_value() ? 1 : 0) + (end.has_value() ? 1 : 0) + 1); } void SliceTensorViewFunc::set_symints(std::vector<c10::SymInt> symints) { TORCH_INTERNAL_ASSERT(symints.size() == num_symints()); auto i = 0; if(start.has_value()) start = symints[i]; i += (start.has_value() ? 1 : 0); if(end.has_value()) end = symints[i]; i += (end.has_value() ? 1 : 0); step = symints[i]; } std::vector<at::Tensor> SliceTensorViewFunc::get_tensors() const { ::std::vector<at::Tensor> tensors; return tensors; } size_t SliceTensorViewFunc::num_tensors() const { return static_cast<size_t>(0); } void SliceTensorViewFunc::set_tensors(std::vector<at::Tensor> tensors) { TORCH_INTERNAL_ASSERT(tensors.size() == num_tensors()); } at::Tensor SliceTensorViewFunc::operator()(const at::Tensor& input_base) const { return at::_ops::slice_Tensor::call(input_base, dim, start, end, step); } std::unique_ptr<ViewFunc> SliceTensorViewFunc::clone_and_set( std::optional<std::vector<c10::SymInt>> symints, std::optional<std::vector<at::Tensor>> tensors) const { auto output = std::make_unique<SliceTensorViewFunc>(dim, start, end, step); if (symints.has_value()) { output->set_symints(std::move(*(symints))); } if (tensors.has_value()) { output->set_tensors(std::move(*(tensors))); } return output; } ``` The `_view_func()` / `_view_func_unsafe()` methods now accept two additional (optional) args for `symint_visitor_fn` / `tensor_visitor_fn`. If these are defined, they are expected to be Python callables that operate on a single SymInt / tensor and return a new one. This allows for the hot-swapping needed during fake-ification. For testing, there are extensive pre-existing tests, and I added a test to ensure that hot-swapping functions correctly. 
```sh python test/test_autograd.py -k test_view_func_replay python test/test_ops.py -k test_view_replay ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/118404 Approved by: https://github.com/ezyang	2024-02-14 22:00:43 +00:00
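The hot-swapping idea from the commit above can be sketched in plain Python. This is only an illustrative analog operating on lists, not the generated C++ or the real `_view_func()` binding: the class `SliceViewFunc`, the helper `replay_view`, and the use of integers in place of `SymInt`s are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class SliceViewFunc:
    """Hypothetical stand-in for a generated ViewFunc subclass (lists, not tensors)."""

    start: Optional[int]
    end: Optional[int]
    step: int

    def get_symints(self) -> List[int]:
        # Only the values that are actually present are part of the saved state.
        return [v for v in (self.start, self.end) if v is not None] + [self.step]

    def clone_and_set(self, symints: Optional[Sequence[int]] = None) -> "SliceViewFunc":
        # Returns a copy, optionally with the saved integers replaced in order.
        if symints is None:
            return replace(self)
        assert len(symints) == len(self.get_symints())
        values = iter(symints)
        start = next(values) if self.start is not None else None
        end = next(values) if self.end is not None else None
        return replace(self, start=start, end=end, step=next(values))

    def __call__(self, base: list) -> list:
        # Reapplies the saved view on a new base.
        return base[self.start:self.end:self.step]


def replay_view(view_func: SliceViewFunc, base: list,
                symint_visitor_fn: Optional[Callable[[int], int]] = None) -> list:
    # The visitor sees one saved value at a time and returns its replacement.
    if symint_visitor_fn is not None:
        swapped = [symint_visitor_fn(s) for s in view_func.get_symints()]
        view_func = view_func.clone_and_set(swapped)
    return view_func(base)


vf = SliceViewFunc(start=1, end=None, step=2)
base = list(range(10))
same = replay_view(vf, base)                                        # replays base[1::2]
swapped = replay_view(vf, base, symint_visitor_fn=lambda s: s + 1)  # replays base[2::3]
```

The original `vf` is never mutated; each replay with a visitor works on a clone, which mirrors the role `clone_and_set()` plays in the commit message.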
blorange-amd	df9b44436a	[ROCm] Enable float16/complex32 fft tests on ROCm (#117296 ) This PR is to enable float16/complex32 fft tests on ROCm. Sample results are attached here: [test_spectral_ops_results.log](https://github.com/pytorch/pytorch/files/13908533/test_spectral_ops_results.log) test_decomp::TestDecompCUDA::test_comprehensive_fft* test_decomp::TestDecompCUDA::test_quick_fft* test_jit_fuser_te::TestNNCOpInfoCUDA::test_nnc_correctness_fft* test_meta::TestMetaCUDA::test_dispatch_meta_inplace_fft* test_meta::TestMetaCUDA::test_dispatch_meta_outplace_fft* test_meta::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft* test_meta::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft* test_meta::TestMetaCUDA::test_meta_inplace_fft* test_meta::TestMetaCUDA::test_meta_outplace_fft* test_ops::TestCommonCUDA::test_complex_half_reference_testing_fft* test_ops::TestCommonCUDA::test_python_ref__refs_fft* test_ops::TestCommonCUDA::test_python_ref_executor__refs_fft* test_ops::TestCommonCUDA::test_python_ref_meta__refs* test_ops::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft* test_schema_check::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft* test_spectral_ops::TestFFTCUDA::test_empty_fft__refs_fft* test_spectral_ops::TestFFTCUDA::test_empty_fft_fft* test_spectral_ops::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft* test_spectral_ops::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft* test_spectral_ops::TestFFTCUDA::test_fft_round_trip_cuda* test_spectral_ops::TestFFTCUDA::test_fft_type_promotion_cuda* test_spectral_ops::TestFFTCUDA::test_fftn_round_trip_cuda* test_spectral_ops::TestFFTCUDA::test_hfftn_cuda_float16 test_spectral_ops::TestFFTCUDA::test_ihfftn_cuda_float16 test_utils::TestDeviceUtilsCUDA::test_device_mode_ops_fft Pull Request resolved: https://github.com/pytorch/pytorch/pull/117296 Approved by: https://github.com/pruthvistony, https://github.com/malfet	2024-02-13 22:35:32 +00:00
PyTorch MergeBot	24bdd03d23	Revert "Reify view_func() closures as ViewFuncs (#118404 )" This reverts commit `d5a6762263`. Reverted https://github.com/pytorch/pytorch/pull/118404 on behalf of https://github.com/DanilBaibak due to Broken trunk ([comment](https://github.com/pytorch/pytorch/pull/118404#issuecomment-1938600260))	2024-02-12 12:38:51 +00:00
Joel Schlosser	d5a6762263	Reify view_func() closures as ViewFuncs (#118404 ) Replaces `view_func()` closures with a reified `ViewFunc` data structure. Codegen generates a `ViewFunc` subclass for each view op (e.g. `NarrowViewFunc`) containing state needed to reconstruct the view. The `ViewFunc` API allows for querying and hot-swapping any `SymInt`s or `Tensors` in the state through `get_symints()` / `get_tensors()` / `clone_and_set()`, which will be essential for fake-ification later on. ```cpp /// Base class for view functions, providing reapplication of a view on a new base. /// Each view op should get a codegenerated subclass of this class containing /// any state needed to reconstruct the view. The class also provides convenience /// accessors for saved SymInts / tensor state. This is useful for e.g. fake-ification, /// where we want to use symbolic values or fake tensors instead. struct TORCH_API ViewFunc { virtual ~ViewFunc() {} /// Returns any SymInts in the saved state. virtual std::vector<c10::SymInt> get_symints() const { return {}; } /// Returns the number of SymInts in the saved state. virtual size_t num_symints() const { return 0; } /// Returns any tensors in the saved state. virtual std::vector<at::Tensor> get_tensors() const { return {}; } /// Returns the number of tensors in the saved state. virtual size_t num_tensors() const { return 0; } /// Reapplies the view on the given base using the saved state. virtual at::Tensor operator()(const at::Tensor&) const = 0; /// Returns a clone of this ViewFunc, optionally with the specified saved state. virtual std::unique_ptr<ViewFunc> clone_and_set( std::optional<std::vector<c10::SymInt>> = c10::nullopt, std::optional<std::vector<at::Tensor>> = c10::nullopt) const = 0; protected: /// Sets the values of any SymInts in the saved state. The input vector size must /// match the number of SymInts in the saved state (i.e. the size of the list /// returned by get_symints()). 
virtual void set_symints(std::vector<c10::SymInt>) {} /// Sets the values of any Tensors in the saved state. The input vector size must /// match the number of Tensors in the saved state (i.e. the size of the list /// returned by get_tensors()). virtual void set_tensors(std::vector<at::Tensor>) {} }; ``` New codegen files: * `torch/csrc/autograd/generated/ViewFuncs.h` * `torch/csrc/autograd/generated/ViewFuncs.cpp` The templates for these also contain impls for `ChainedViewFunc` and `ErroringViewFunc`, which are used in a few places within autograd. Example codegen for `slice.Tensor`: ```cpp // torch/csrc/autograd/generated/ViewFuncs.h #define SLICE_TENSOR_VIEW_FUNC_AVAILABLE struct SliceTensorViewFunc : public torch::autograd::ViewFunc { SliceTensorViewFunc(int64_t dim, c10::optional<c10::SymInt> start, c10::optional<c10::SymInt> end, c10::SymInt step) : dim(dim), start(start), end(end), step(step) {}; virtual ~SliceTensorViewFunc() override {}; virtual std::vector<c10::SymInt> get_symints() const override; virtual size_t num_symints() const override; virtual std::vector<at::Tensor> get_tensors() const override; virtual size_t num_tensors() const override; virtual at::Tensor operator()(const at::Tensor&) const override; virtual std::unique_ptr<ViewFunc> clone_and_set( std::optional<std::vector<c10::SymInt>> = c10::nullopt, std::optional<std::vector<at::Tensor>> = c10::nullopt) const override; protected: virtual void set_symints(std::vector<c10::SymInt>) override; virtual void set_tensors(std::vector<at::Tensor>) override; private: int64_t dim; c10::optional<c10::SymInt> start; c10::optional<c10::SymInt> end; c10::SymInt step; }; ... // torch/csrc/autograd/generated/ViewFuncs.cpp std::vector<c10::SymInt> SliceTensorViewFunc::get_symints() const { ::std::vector<c10::SymInt> symints; symints.reserve((start.has_value() ? 1 : 0) + (end.has_value() ? 
1 : 0) + 1); if(start.has_value()) symints.insert(symints.end(), *(start)); if(end.has_value()) symints.insert(symints.end(), *(end)); symints.push_back(step); return symints; } size_t SliceTensorViewFunc::num_symints() const { return static_cast<size_t>((start.has_value() ? 1 : 0) + (end.has_value() ? 1 : 0) + 1); } void SliceTensorViewFunc::set_symints(std::vector<c10::SymInt> symints) { TORCH_INTERNAL_ASSERT(symints.size() == num_symints()); auto i = 0; if(start.has_value()) start = symints[i]; i += (start.has_value() ? 1 : 0); if(end.has_value()) end = symints[i]; i += (end.has_value() ? 1 : 0); step = symints[i]; } std::vector<at::Tensor> SliceTensorViewFunc::get_tensors() const { ::std::vector<at::Tensor> tensors; return tensors; } size_t SliceTensorViewFunc::num_tensors() const { return static_cast<size_t>(0); } void SliceTensorViewFunc::set_tensors(std::vector<at::Tensor> tensors) { TORCH_INTERNAL_ASSERT(tensors.size() == num_tensors()); } at::Tensor SliceTensorViewFunc::operator()(const at::Tensor& input_base) const { return at::_ops::slice_Tensor::call(input_base, dim, start, end, step); } std::unique_ptr<ViewFunc> SliceTensorViewFunc::clone_and_set( std::optional<std::vector<c10::SymInt>> symints, std::optional<std::vector<at::Tensor>> tensors) const { auto output = std::make_unique<SliceTensorViewFunc>(dim, start, end, step); if (symints.has_value()) { output->set_symints(std::move(*(symints))); } if (tensors.has_value()) { output->set_tensors(std::move(*(tensors))); } return output; } ``` The `_view_func()` / `_view_func_unsafe()` methods now accept two additional (optional) args for `symint_visitor_fn` / `tensor_visitor_fn`. If these are defined, they are expected to be Python callables that operate on a single SymInt / tensor and return a new one. This allows for the hot-swapping needed during fake-ification. For testing, there are extensive pre-existing tests, and I added a test to ensure that hot-swapping functions correctly. 
```sh python test/test_autograd.py -k test_view_func_replay python test/test_ops.py -k test_view_replay ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/118404 Approved by: https://github.com/ezyang	2024-02-09 18:51:36 +00:00
Isuru Fernando	3e79ef6db8	Complete decomposition for aten.round (#118635 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/118635 Approved by: https://github.com/peterbell10	2024-02-01 17:14:44 +00:00
Isuru Fernando	2f7839e6db	register decomposition for rsub in torch._refs (#118288 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/118288 Approved by: https://github.com/lezcano ghstack dependencies: #118398	2024-01-30 22:18:15 +00:00
Alexander Grund	f1aef2c094	Don't check is_conj for `_refs.linalg.svd` (#117972 ) The flag is not correctly set when PyTorch is compiled with GPU support resulting in failures in `test_ops.py::test_python_ref_meta__refs_linalg_svd_cpu_complex`. Use a similar approach to test_meta and skip the check for this function. Workaround for #105068 Pull Request resolved: https://github.com/pytorch/pytorch/pull/117972 Approved by: https://github.com/lezcano	2024-01-26 15:24:29 +00:00
Sam Larsen	208e64a9ba	Initial implementation of FakeTensor caching (#113873 ) Summary: Cache the result of FakeTensor dispatch and skip re-evaluation on cache hits. Test Plan: New unit tests. Caching is enabled in this diff, so all existing tests exercise the cache as well. Differential Revision: [D52841637](https://our.internmc.facebook.com/intern/diff/D52841637) Pull Request resolved: https://github.com/pytorch/pytorch/pull/113873 Approved by: https://github.com/eellison	2024-01-17 20:38:54 +00:00
Joel Schlosser	3c21264c9b	Introduce reverse view_funcs (#115894 ) Part 2 of implementation for general [subclass view fake-ification](https://docs.google.com/document/d/1C5taWiplmX7nKiURXDOAZG2W5VNJ2iV0fQFq92H0Cxw). Details: * Codegen `rev_view_func()` alongside `view_func()` * Reverse view_func gives you a "base" from a "view": `rev_view_func(new_view) -> new_base` AKA it plays the original view backwards * Utilizes the functional inverses defined in `FunctionalInverses.cpp`, passing `InverseReturnMode::AlwaysView` * Manually implements functional inverses for `narrow()` and `chunk()` * NB: Multi-output views now set view_func() / rev_view_func() for each of the output views! * Due to this, the `as_view()` overload that operates on a list of views is scrapped in favor of iteration via codegen Example codegen in `ADInplaceOrViewTypeN.cpp`: ```cpp at::Tensor narrow(c10::DispatchKeySet ks, const at::Tensor & self, int64_t dim, c10::SymInt start, c10::SymInt length) { auto _tmp = ([&]() { at::AutoDispatchBelowADInplaceOrView guard; return at::_ops::narrow::redispatch(ks & c10::after_ADInplaceOrView_keyset, self, dim, start, length); })(); std::function<at::Tensor(const at::Tensor&)> func=nullptr; std::function<at::Tensor(const at::Tensor&)> rev_func=nullptr; if (false || !self.unsafeGetTensorImpl()->support_as_strided() || c10::AutogradState::get_tls_state().get_view_replay_enabled()) { func = [=](const at::Tensor& input_base) { return at::_ops::narrow::call(input_base, dim, start, length); }; rev_func = [=](const at::Tensor& input_view) { // NB: args from narrow() signature are passed along to the inverse return at::functionalization::FunctionalInverses::narrow_copy_inverse(self, input_view, at::functionalization::InverseReturnMode::AlwaysView, dim, start, length); }; } auto result = as_view(/* base */ self, /* output */ _tmp, /* is_bw_differentiable */ true, /* is_fw_differentiable */ true, /* view_func */ func, /* rev_view_func */ rev_func, /* creation_meta */ 
InferenceMode::is_enabled() ? CreationMeta::INFERENCE_MODE : (at::GradMode::is_enabled() ? CreationMeta::DEFAULT : CreationMeta::NO_GRAD_MODE)); return result; } ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/115894 Approved by: https://github.com/soulitzer	2024-01-05 16:48:12 +00:00
Aaron Gokaslan	3fe437b24b	[BE]: Update flake8 to v6.1.0 and fix lints (#116591 ) Updates flake8 to v6.1.0 and fixes a few lints using sed and some ruff tooling. - Replace `assert(0)` with `raise AssertionError()` - Remove extraneous parentheses, i.e. - `assert(a == b)` -> `assert a == b` - `if(x > y or y < z):`->`if x > y or y < z:` - And `return('...')` -> `return '...'` Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/116591 Approved by: https://github.com/albanD, https://github.com/malfet	2024-01-03 06:04:44 +00:00
kflu	c5dcb50c00	[easy] aten ops: support passing all args as kwargs, including `self` (#114920 ) Summary: This is important for writing aten-IR-based graph transformations. ``` In [4]: [x.name for x in torch.ops.aten.reshape.default._schema.arguments] Out[4]: ['self', 'shape'] In [8]: torch.ops.aten.reshape.default(torch.rand(1,2), shape=[2]) Out[8]: tensor([0.7584, 0.4834]) # === CANNOT CALL `self` BY KWARGS === In [7]: torch.ops.aten.reshape.default(self=torch.rand(1,2), shape=[2]) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) Cell In[7], line 1 ----> 1 torch.ops.aten.reshape.default(self=torch.rand(1,2), shape=[2]) TypeError: OpOverload.__call__() got multiple values for argument 'self' ``` # Where's the problem? 1. The first arg of an aten op is usually named `self` (aten/src/ATen/native/native_functions.yaml) 2. Unfortunately, in `torch._ops.{OpOverload, OpOverloadPacket}.__call__()`, the first arg is (by Python convention) named `self` too. So when `self` is passed as a kwarg, `OpOverloadPacket.__call__` receives: ``` OpOverloadPacket.__call__(self, {"self": ...}) ``` Python does not allow the same argument name to be bound twice, hence > TypeError: OpOverload.__call__() got multiple values for argument 'self' # How to fix? Note that, in the above, `self` is an instance of `OpOverloadPacket`, and the "self" kwarg is the input tensor to the aten op. To fix this, we only need to differentiate the two `self`s. In Python, the first arg of a method does not need to be named `self`, so we change the `__call__` definition to: ``` def __call__(_self, ...): ``` Now the call becomes: ``` OpOverloadPacket.__call__(_self, {"self": ...}) ``` where: * `_self` is the `OpOverloadPacket` instance * `"self"` is the input tensor to the aten op. 
Test Plan: ``` In [4]: [x.name for x in torch.ops.aten.reshape.default._schema.arguments] Out[4]: ['self', 'shape'] In [3]: torch.ops.aten.reshape.default(self=torch.rand(1,2), shape=[2]) Out[3]: tensor([0.5127, 0.3051]) ``` Differential Revision: D51731996 Pull Request resolved: https://github.com/pytorch/pytorch/pull/114920 Approved by: https://github.com/houseroad	2023-12-16 18:32:58 +00:00
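The name clash described in the commit above can be reproduced without PyTorch. The following is a minimal sketch; `ClashingOp` and `RenamedOp` are hypothetical classes invented for the illustration and are not the real `OpOverload` / `OpOverloadPacket` implementations.

```python
class ClashingOp:
    # The instance parameter is named `self`, so a `self=` keyword collides with it.
    def __call__(self, *args, **kwargs):
        return kwargs


class RenamedOp:
    # Renaming the instance parameter frees the name `self` for use as a keyword.
    def __call__(_self, *args, **kwargs):
        return kwargs


try:
    ClashingOp()(self="tensor", shape=[2])
    clash_error = None
except TypeError as exc:
    # Python refuses to bind the name `self` twice.
    clash_error = exc

forwarded = RenamedOp()(self="tensor", shape=[2])
```

With the renamed parameter, the `self` keyword simply arrives in `kwargs` and can be forwarded to whatever the callable wraps.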
rzou	3477a2ee03	unMarkDynamoStrictTest on OpInfo-based tests (#115856 ) These take too long to run under strict mode. We'll worry about them later. Note that these decorators don't do anything yet (unless we flip the default from non-strict to strict). Pull Request resolved: https://github.com/pytorch/pytorch/pull/115856 Approved by: https://github.com/voznesenskym ghstack dependencies: #115845, #115855	2023-12-15 01:22:31 +00:00
Isuru Fernando	505574c46a	Add decomposition for torch.block_diag (#115096 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/115096 Approved by: https://github.com/peterbell10	2023-12-11 20:04:22 +00:00
Aaron Gokaslan	794545c11f	[BE]: Enable RUF015 codebase wide (#115507 ) Constant-time access to the first value in a collection: taking the first item directly is a constant-time operation, whereas converting the collection to a list just to read its first item is linear. Turning the rule on applies the autofix and enforces this going forward. Pull Request resolved: https://github.com/pytorch/pytorch/pull/115507 Approved by: https://github.com/malfet	2023-12-11 15:51:01 +00:00
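A small sketch of the difference that rule targets, assuming a generator as the collection. The helper `counted` is hypothetical and exists only to record how many items each approach pulls from the source.

```python
def counted(n, counter):
    # Yields 0..n-1 and records how many items were actually produced.
    for i in range(n):
        counter[0] += 1
        yield i


# Linear: materializes every element just to read the first one.
pulled_by_list = [0]
first_via_list = list(counted(1000, pulled_by_list))[0]

# Constant: asks the iterator for exactly one element.
pulled_by_next = [0]
first_via_next = next(iter(counted(1000, pulled_by_next)))
```

Both expressions return the same first element; only the amount of work differs. Note that `next(iter(x))` raises `StopIteration` on an empty collection where `list(x)[0]` raises `IndexError`.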
Isuru Fernando	e4a88d9581	Convert SymInts to SymFloats with SymPy (#113683 ) Fixes #109365 Pull Request resolved: https://github.com/pytorch/pytorch/pull/113683 Approved by: https://github.com/ezyang, https://github.com/lezcano	2023-11-20 23:35:40 +00:00
Evgeni Burovski	237cbd5be6	BUG: trace frames with numpy scalar -> ndarray functions (#112959 ) Fixes #112951 Make dynamo detect that `np.arange(3)` returns a FakeTensor, so the frame needs to be traced. Pull Request resolved: https://github.com/pytorch/pytorch/pull/112959 Approved by: https://github.com/lezcano	2023-11-17 03:00:24 +00:00
Aryan Gupta	8cee0a25bd	fix: Flake8-BugBear code B-026 for PyTorch (#111362 ) Fixes #106571 This fixes the B-026 errors reported by Flake8 across the codebase. Please review and let me know if anything else is needed. Thanks, and excited for this first contribution to PyTorch. For reference, see the issue that introduced [B-026](https://github.com/PyCQA/flake8-bugbear/issues/286) in `flake8-bugbear`, which discusses the error code. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111362 Approved by: https://github.com/Skylion007	2023-11-07 21:38:18 +00:00
Peter Bell	66c32d099a	Use `pytree.arg_tree_leaves` everywhere (#112394 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/112394 Approved by: https://github.com/lezcano ghstack dependencies: #112391, #112392, #112393	2023-10-31 15:57:06 +00:00
Peter Bell	bbd5b935e4	Use `pytree.tree_leaves` everywhere (#112324 ) This changes all the instances I could find of `tree_flatten(...)[0]` or `x, _ = tree_flatten` to use `tree_leaves`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/112324 Approved by: https://github.com/lezcano ghstack dependencies: #112327, #112323	2023-10-30 03:39:04 +00:00
William Wen	a380bf3297	[dynamo, test] skip flaky dynamo-wrapped tests (#112310 ) ghstack-source-id: 7a87e33e7513e7924e4513b6473284562989ed4c Pull Request resolved: https://github.com/pytorch/pytorch/pull/112309 Skip flaky tests reported by - https://github.com/pytorch/pytorch/issues/111825 - https://github.com/pytorch/pytorch/issues/111826 - https://github.com/pytorch/pytorch/issues/111909 - https://github.com/pytorch/pytorch/issues/112142 - https://github.com/pytorch/pytorch/issues/112220 Pull Request resolved: https://github.com/pytorch/pytorch/pull/112310 Approved by: https://github.com/xmfan	2023-10-28 04:14:57 +00:00
Isuru Fernando	c120e5606e	Use ops_and_refs in test_ops.py instead of _ops_and_refs (#112022 ) `ops_and_refs` and `_ops_and_refs` have the same definition. Pull Request resolved: https://github.com/pytorch/pytorch/pull/112022 Approved by: https://github.com/lezcano	2023-10-27 18:37:05 +00:00
Isuru Fernando	fdbb73fa4e	Check both ops and refs in test_strided_layout (#112160 ) Trying #112023 again to see if CLA issue is fixed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/112160 Approved by: https://github.com/lezcano, https://github.com/Neilblaze	2023-10-27 15:35:34 +00:00
alhridoy	0c64ac0d3a	Add tests for strided layout in factory functions (#111463 ) Fixes #111222 This pull request adds tests for factory functions that create tensors with a strided layout. The tests are added to the `test_ops.py` file and check the behavior of the `empty`, `zeros`, `ones`, and `rand` factory functions when used with the `layout=torch.strided` argument. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111463 Approved by: https://github.com/lezcano	2023-10-24 17:05:44 +00:00
Philip Meier	973c87b320	raise instead of skip in test/test_meta.py (#110939 ) Supersedes #109004. Pull Request resolved: https://github.com/pytorch/pytorch/pull/110939 Approved by: https://github.com/lezcano, https://github.com/kurtamohler	2023-10-17 10:17:43 +00:00
Jez Ng	ddb0c26511	[inductor] Re-enable more fixed tests (#110798 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/110798 Approved by: https://github.com/Skylion007	2023-10-09 04:36:51 +00:00
Jez Ng	dddf581da7	[dynamo] Add graph break on requires_grad_() (#110053 ) Fixes #107861. Pull Request resolved: https://github.com/pytorch/pytorch/pull/110053 Approved by: https://github.com/eellison	2023-10-04 06:22:16 +00:00
SS-JIA	5df8aca994	[core IR] Add a core decomposition for floor_divide (#110046 ) ## Context Introduce a core decomposition for `aten.floor_divide` into other `aten` ops, and add it to the core ATen decomposition table. This replaces the decomposition of `floor_divide` that was used by Inductor. I noticed there was a note on that decomposition ``` # TorchInductor-only decomposition. It should not be taken to core. # See https://github.com/pytorch/torchdynamo/pull/1120 ``` but couldn't discern the reason why this is the case. cc: @lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/110046 Approved by: https://github.com/peterbell10	2023-09-26 08:39:21 +00:00
SS-JIA	7de669f2f9	[core IR] Remove trunc decomp and add trunc to core (#109902 ) Following up from [this comment](https://github.com/pytorch/pytorch/pull/109319#discussion_r1330803226). Remove the decomposition for `trunc`, and add it as a core operator. Going forward, provide similar treatment for operators that map cleanly to hardware instructions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109902 Approved by: https://github.com/peterbell10	2023-09-25 18:18:06 +00:00
Mwiza Kunda	6b7b9c796e	Fix registering jit decompositions for jvp for out wrapped decomps (#109367 ) Python decompositions wrapped by `out_wrapper` need to be unwrapped before compiling with TorchScript since: - `out_wrapper` extends the decomposition's signature with an `out` parameter; however, this `out` parameter is not present in the source code of the original decomposition, so the resulting `ScriptFunction` will not have an `out` parameter - `out_wrapper` is in the `torch._prims_common.wrappers` module, so its `globals()` differ from the globals of the decomposition being wrapped. This may cause symbol resolution to fail in the TorchScript compiler, since it compiles the unwrapped decomposition's source code rather than the wrapper The Python decomposition for `aten.trace` is wrapped as an example; other decompositions are to be fixed in https://github.com/pytorch/pytorch/pull/107707 Pull Request resolved: https://github.com/pytorch/pytorch/pull/109367 Approved by: https://github.com/lezcano	2023-09-21 16:36:51 +00:00
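The unwrapping step described in the commit above can be sketched with the standard library alone. `out_wrapper_sketch` and `trace_decomp` below are hypothetical stand-ins working on nested lists; whether the real `out_wrapper` exposes the original function through `__wrapped__` is an assumption of this sketch, not something the commit message states.

```python
import functools
import inspect


def out_wrapper_sketch(fn):
    # Adds an `out=` keyword that the wrapped function's own source never mentions.
    @functools.wraps(fn)  # records the original function as wrapped.__wrapped__
    def wrapped(*args, out=None, **kwargs):
        result = fn(*args, **kwargs)
        if out is not None:
            out[:] = result  # copy the result into the caller's buffer
            return out
        return result

    return wrapped


@out_wrapper_sketch
def trace_decomp(matrix):
    # Sum of the main diagonal, returned as a one-element list.
    size = min(len(matrix), len(matrix[0]))
    return [sum(matrix[i][i] for i in range(size))]


# A source-based compiler should be handed the original function, not the wrapper.
original = inspect.unwrap(trace_decomp)

buffer = [0]
via_wrapper = trace_decomp([[1, 2], [3, 4]], out=buffer)
via_original = original([[1, 2], [3, 4]])
```

`inspect.unwrap` follows the `__wrapped__` chain, so `original` has no `out` parameter and carries the globals of the module where the decomposition was defined.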
Salil Desai	2e721aab98	[Decomposition] Trunc (#109319 ) Summary: Add Decomp for Trunc and add it to core_aten_decompositions Differential Revision: D49042033 Pull Request resolved: https://github.com/pytorch/pytorch/pull/109319 Approved by: https://github.com/SherlockNoMad	2023-09-19 13:30:13 +00:00
Jez Ng	7f3885137f	Add meta function for _segment_reduce (#109359 ) This fixes numerous tests which were xfailing. For instance, the `_segment_reduce.lengths` OpInfo test previously relied on the fallback kernel to determine the shape of the meta tensor. The fallback kernel would fail with `segment_reduce(): Expected all rows of lengths along axis to sum to data.size(lengths.dim()-1) when !unsafe.` as it was trying to read the values of a meta tensor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109359 Approved by: https://github.com/ezyang	2023-09-16 13:31:03 +00:00
PyTorch MergeBot	41bd0fde7e	Revert "Remove fixed skips (#108674 )" This reverts commit `ab9fb03d6f`. Reverted https://github.com/pytorch/pytorch/pull/108674 on behalf of https://github.com/huydhn due to Sorry for picking this up a bit late, but with https://github.com/pytorch/pytorch/pull/108647 reverted, these tests are failing again. So we need to wait for the PR to reland before we can land this change ([comment](https://github.com/pytorch/pytorch/pull/108674#issuecomment-1715202692))	2023-09-12 08:04:32 +00:00
Ken Jin	c458fa0d35	Decompose/add reference for `view_as_complex` (#108005 ) Aten source: `d4a99631dd/aten/src/ATen/native/ComplexHelper.h (L78)` Documentation reference: https://pytorch.org/docs/stable/generated/torch.view_as_complex.html Note: this adds a new primitive `view_of_dtype`, which is trivially implemented, as its meta function is already implemented elsewhere. Finally, this is not registered as a decomposition (yet), because TorchInductor does not yet support complex types. It should be added once we do. Closes https://github.com/pytorch/pytorch/issues/108020 as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108005 Approved by: https://github.com/peterbell10, https://github.com/ezyang	2023-09-07 23:49:20 +00:00
eellison	ab9fb03d6f	Remove fixed skips (#108674 ) These no longer fail with TEST_WITH_TORCHINDUCTOR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108674 Approved by: https://github.com/desertfire	2023-09-07 17:36:56 +00:00
Kurt Mohler	3f88e3105f	Reland: Remove remaining global `set_default_dtype` calls from tests (#108088 ) Fixes #68972 Relands #107246 To avoid causing Meta-internal CI failures, this PR avoids always asserting that the default dtype is float in the `TestCase.setUp/tearDown` methods. Instead, the assert is only done if `TestCase._default_dtype_check_enabled == True`. `_default_dtype_check_enabled` is set to True in the `if __name__ == "__main__":` blocks of all the relevant test files that have required changes for this issue Pull Request resolved: https://github.com/pytorch/pytorch/pull/108088 Approved by: https://github.com/ezyang	2023-09-07 03:04:34 +00:00
PyTorch MergeBot	43527d41a2	Revert "Remove fixed skips (#108674 )" This reverts commit `518cfda2dd`. Reverted https://github.com/pytorch/pytorch/pull/108674 on behalf of https://github.com/huydhn due to Sorry for reverting this, but one test is failing on inductor `518cfda2dd`, and it seems easier to revert this than disabling the test ([comment](https://github.com/pytorch/pytorch/pull/108674#issuecomment-1709310192))	2023-09-07 00:56:46 +00:00
eellison	518cfda2dd	Remove fixed skips (#108674 ) These no longer fail with TEST_WITH_TORCHINDUCTOR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108674 Approved by: https://github.com/desertfire	2023-09-06 22:33:43 +00:00
Guilherme Leobas	7e878c9d10	Add decomposition for `aten.take_along_dim` (#108185 ) xref #107875 Pull Request resolved: https://github.com/pytorch/pytorch/pull/108185 Approved by: https://github.com/lezcano	2023-09-04 13:49:53 +00:00
lezcano	239ee76177	Add refs/decomps for dot/vdot (#108194 ) Follow-up on https://github.com/pytorch/pytorch/issues/108127#issuecomment-1698142427 Pull Request resolved: https://github.com/pytorch/pytorch/pull/108194 Approved by: https://github.com/peterbell10 ghstack dependencies: #108188	2023-08-31 15:30:23 +00:00
Sherlock Huang	ee4b99cc3a	Decomp for aten.dropout (#106274 ) When exporting dropout with a CPU tensor, we get the following graph module ``` class GraphModule(torch.nn.Module): def forward(self, arg0_1: f32[512, 10]): empty_memory_format: f32[512, 10] = torch.ops.aten.empty.memory_format([512, 10], dtype = torch.float32, layout = torch.strided, device = device(type='cpu'), pin_memory = False, memory_format = torch.contiguous_format) bernoulli_p: f32[512, 10] = torch.ops.aten.bernoulli.p(empty_memory_format, 0.9); empty_memory_format = None div_scalar: f32[512, 10] = torch.ops.aten.div.Scalar(bernoulli_p, 0.9); bernoulli_p = None mul_tensor: f32[512, 10] = torch.ops.aten.mul.Tensor(arg0_1, div_scalar); arg0_1 = div_scalar = None return (mul_tensor,) ``` In addition, if we export in eval() mode, we get an empty graph. However, when exporting with a CUDA tensor, we get ``` class GraphModule(torch.nn.Module): def forward(self, arg0_1: f32[512, 10]): native_dropout_default = torch.ops.aten.native_dropout.default(arg0_1, 0.1, True); arg0_1 = None getitem: f32[512, 10] = native_dropout_default[0]; native_dropout_default = None return (getitem,) ``` and exporting in eval() mode still leaves a dropout node in the graph. This PR makes exporting with a CPU tensor also produce aten.native_dropout. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106274 Approved by: https://github.com/ezyang	2023-08-23 21:12:37 +00:00
Aaron Gokaslan	660e8060ad	[BE]: Update ruff to 0.285 (#107519 ) This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings. I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, there seem to be no instances of it in our codebase, so I am enabling it so that it stays that way. :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519 Approved by: https://github.com/ezyang	2023-08-22 23:16:38 +00:00
PyTorch MergeBot	d59a6864fb	Revert "[BE]: Update ruff to 0.285 (#107519 )" This reverts commit `88ab3e4322`. Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please help them get unblocked? It seems like one of the strings was probably accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))	2023-08-22 19:53:32 +00:00
Aaron Gokaslan	88ab3e4322	[BE]: Update ruff to 0.285 (#107519 ) This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings. I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, there seem to be no instances of it in our codebase, so I am enabling it so that it stays that way. :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519 Approved by: https://github.com/ezyang	2023-08-20 01:36:18 +00:00
Ivan Yashchuk	c913f3857f	Remove dynamo+nvfuser (#105789 ) This PR removes unmaintained Dynamo+nvFuser. Pull Request resolved: https://github.com/pytorch/pytorch/pull/105789 Approved by: https://github.com/jansel, https://github.com/jjsjann123, https://github.com/albanD	2023-08-08 22:29:32 +00:00
PyTorch MergeBot	891bb259f8	Revert "Remove dynamo+nvfuser (#105789 )" This reverts commit `6030151d37`. Reverted https://github.com/pytorch/pytorch/pull/105789 on behalf of https://github.com/DanilBaibak due to Break a lot of tests on main. ([comment](https://github.com/pytorch/pytorch/pull/105789#issuecomment-1669710571))	2023-08-08 14:20:32 +00:00
Ivan Yashchuk	6030151d37	Remove dynamo+nvfuser (#105789 ) This PR removes unmaintained Dynamo+nvFuser. Pull Request resolved: https://github.com/pytorch/pytorch/pull/105789 Approved by: https://github.com/jansel, https://github.com/jjsjann123, https://github.com/albanD	2023-08-08 13:29:31 +00:00
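
The CPU export graph quoted in the `aten.dropout` entry (ee4b99cc3a) samples a Bernoulli mask with keep probability 0.9, divides it by 0.9, and multiplies the result into the input. The arithmetic can be sketched in plain Python as below. This is an illustration only, using the standard library rather than PyTorch; the name `dropout_reference` and its parameters are hypothetical and are not part of any PyTorch API.

```python
import random


def dropout_reference(values, p=0.1, train=True, rng=None):
    """Illustrative inverted dropout over a list of floats (hypothetical helper)."""
    if not train or p == 0.0:
        # Outside training, dropout is the identity.
        return list(values)
    if p >= 1.0:
        # Everything is dropped; avoid dividing by a zero keep probability.
        return [0.0 for _ in values]
    rng = rng or random.Random()
    keep = 1.0 - p
    # Step 1: Bernoulli(keep) mask, 1.0 where the element survives.
    mask = [1.0 if rng.random() < keep else 0.0 for _ in values]
    # Step 2: rescale the mask by 1 / keep so the expected value is preserved.
    scale = [m / keep for m in mask]
    # Step 3: multiply the input by the rescaled mask.
    return [v * s for v, s in zip(values, scale)]


out = dropout_reference([1.0] * 1000, p=0.1, rng=random.Random(0))
```

With an input of ones, every output element is either 0.0 (dropped) or 1/0.9 (kept and rescaled), mirroring the `bernoulli.p`, `div.Scalar`, `mul.Tensor` sequence in the quoted graph.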

384 Commits