pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Ivan Yashchuk	fba13d94a1	Remove deprecated torch.symeig (#70988 ) The time has come to remove deprecated linear algebra related functions. This PR removes `torch.symeig`. - [x] XLA PR: https://github.com/pytorch/xla/pull/4498 Pull Request resolved: https://github.com/pytorch/pytorch/pull/70988 Approved by: https://github.com/lezcano, https://github.com/kit1980, https://github.com/malfet	2023-01-31 11:59:11 +00:00
PyTorch MergeBot	acdd462b1a	Revert "Remove deprecated torch.symeig (#70988 )" This reverts commit `d70ed68162`. Reverted https://github.com/pytorch/pytorch/pull/70988 on behalf of https://github.com/kit1980 due to Failing XLA tests, forward fix unsuccessful	2023-01-24 19:03:40 +00:00
Michael Gschwind	7265f60ad0	Regularize mask handling for attn_mask and key_padding_mask (#92733 ) Summary: Regularize mask handling for attn_mask and key_padding_mask * Update documentation to remove reference to byte masks (which were deprecated long ago) * Introduce check and warn about deprecation if attn_mask and key_padding_mask types mismatch * Convert all masks to float before combining * Combine by adding Test Plan: sandcastle & github CI Differential Revision: D42653215 Pull Request resolved: https://github.com/pytorch/pytorch/pull/92733 Approved by: https://github.com/ngimel, https://github.com/drisspg	2023-01-24 14:12:05 +00:00
Ivan Yashchuk	d70ed68162	Remove deprecated torch.symeig (#70988 ) The time has come to remove deprecated linear algebra related functions. This PR removes `torch.symeig`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/70988 Approved by: https://github.com/lezcano, https://github.com/kit1980	2023-01-23 22:51:40 +00:00
Driss Guessous	df14650f0b	[SDPA] Update SDPA API and make function Public (#92189 ) # Summary In preparation for pt 2.0 launch this PR updates SDPA's API and makes the function a nn.functional public function. ## Changes ### API Previously the function signature was: `scaled_dot_product_attention(query, key, value, attn_mask=None, need_attn_weights=False, dropout_p=0.0, is_causal=False) -> (Tensor, Tensor)` Updated signature: `scaled_dot_product_attention(query, key, value, attn_mask=None, dropout_p=0.0, is_causal=False) -> Tensor` This PR removes the need_attn_weights optional boolean variable and updates the return type to a single tensor. #### Reasoning: The main goal of this function is to provide an easy interface for users to call into fused attention kernels (e.g. FlashAttention). The fused kernels do not currently support arbitrary attn_mask or dropout but there is a PR to mem-efficient attention to enable these. We want to have the API surface ready for when the backing kernels get updated. The fused kernels save on memory usage by not materializing the weights and it is unlikely that a fast fused implementation will enable this feature so we are removing it. Discussed with folks at FAIR/Xformers and +1 this API change. #### Make function Public In preparation for the pt 2.0 launch we make the function public to start to generate user feedback Pull Request resolved: https://github.com/pytorch/pytorch/pull/92189 Approved by: https://github.com/cpuhrsch	2023-01-23 20:50:46 +00:00
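The updated signature in the commit above returns only the attention output, not the weights. A minimal plain-Python sketch of the underlying math follows. It is not the PyTorch implementation: `sdpa_reference` is a hypothetical name, it works on 2-D lists rather than tensors, assumes `attn_mask` is an additive float mask, and leaves out dropout.

```python
import math


def sdpa_reference(query, key, value, attn_mask=None, is_causal=False):
    """softmax(Q K^T / sqrt(d) + mask) V on 2-D lists; returns only the output.

    Hypothetical reference helper: attn_mask is assumed to be an additive
    float mask, and dropout is omitted (equivalent to dropout_p=0.0).
    """
    scale = 1.0 / math.sqrt(len(query[0]))
    output = []
    for i, q in enumerate(query):
        scores = []
        for j, k in enumerate(key):
            s = scale * sum(qc * kc for qc, kc in zip(q, k))
            if is_causal and j > i:
                s = -math.inf  # position i may not attend to later positions
            if attn_mask is not None:
                s += attn_mask[i][j]
            scores.append(s)
        # Numerically stable softmax over the scores of this query row.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # The weights are consumed here and never returned to the caller.
        output.append([
            sum(w * v[c] for w, v in zip(weights, value)) / total
            for c in range(len(value[0]))
        ])
    return output
```

Because the weights only exist inside the loop, a fused kernel is free to never materialize the full weight matrix, which is the memory argument the commit message makes.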
kshitij12345	d2728bb6a7	[functorch] add is_any_true (#92686 ) Adds `is_any_true` similar to `is_all_true` (https://github.com/pytorch/pytorch/pull/89097/files) This would unblock https://github.com/pytorch/functorch/issues/1049 Pull Request resolved: https://github.com/pytorch/pytorch/pull/92686 Approved by: https://github.com/Chillee	2023-01-21 03:36:05 +00:00
Edward Z. Yang	5c6f5439b7	Implement SymBool (#92149 ) We have known for a while that we should in principle support SymBool as a separate concept from SymInt and SymFloat (in particular, every distinct numeric type should get its own API). However, recent work with unbacked SymInts in, e.g., https://github.com/pytorch/pytorch/pull/90985 has made this a priority to implement. The essential problem is that our logic for computing the contiguity of tensors performs branches on the passed in input sizes, and this causes us to require guards when constructing tensors from unbacked SymInts. Morally, this should not be a big deal because we only really care about the regular (non-channels-last) contiguity of the tensor, which should be guaranteed since most people aren't calling `empty_strided` on the tensor. However, because we store a bool (not a SymBool, prior to this PR it doesn't exist) on TensorImpl, we are forced to immediately compute these values, even if the value ends up not being used at all. In particular, even when a user allocates a contiguous tensor, we still must compute channels-last contiguity (as some contiguous tensors are also channels-last contiguous, but others are not.) This PR implements SymBool, and makes TensorImpl use SymBool to store the contiguity information in ExtraMeta. There are a number of knock on effects, which I now discuss below. * I introduce a new C++ type SymBool, analogous to SymInt and SymFloat. This type supports logical and, logical or and logical negation. I support the bitwise operations on this class (but not the conventional logic operators) to make it clear that logical operations on SymBool are NOT short-circuiting. I also, for now, do NOT support implicit conversion of SymBool to bool (creating a guard in this case). This does not matter too much in practice, as in this PR I did not modify the equality operations (e.g., `==` on SymInt) to return SymBool, so all preexisting implicit guards did not need to be changed.
I also introduced symbolic comparison functions `sym_eq`, etc. on SymInt to make it possible to create SymBool. The current implementation of comparison functions makes it unfortunately easy to accidentally introduce guards when you do not mean to (as both `s0 == s1` and `s0.sym_eq(s1)` are valid spellings of the equality operation); in the short term, I intend to prevent excess guarding in this situation by unit testing; in the long term making the equality operators return SymBool is probably the correct fix. * ~~I modify TensorImpl to store SymBool for the `is_contiguous` fields and friends on `ExtraMeta`. In practice, this essentially meant reverting most of the changes from https://github.com/pytorch/pytorch/pull/85936 . In particular, the fields on ExtraMeta are no longer strongly typed; at the time I was particularly concerned about the giant lambda I was using as the setter getting a desynchronized argument order, but now that I have individual setters for each field the only "big list" of boolean arguments is in the constructor of ExtraMeta, which seems like an acceptable risk. The semantics of TensorImpl are now that we guard only when you actually attempt to access the contiguity of the tensor via, e.g., `is_contiguous`. By and large, the contiguity calculation in the implementations now needs to be duplicated (as the boolean version can short circuit, but the SymBool version cannot); you should carefully review the duplicate new implementations. I typically use the `identity` template to disambiguate which version of the function I need, and rely on overloading to allow for implementation sharing. The changes to the `compute_` functions are particularly interesting; for most of the functions, I preserved their original non-symbolic implementation, and then introduced a new symbolic implementation that is branch-less (making use of our new SymBool operations).
However, `compute_non_overlapping_and_dense` is special, see next bullet.~~ This appears to cause performance problems, so I am leaving this to an update PR. * (Update: the Python side pieces for this are still in this PR, but they are not wired up until later PRs.) While the contiguity calculations are relatively easy to write in a branch-free way, `compute_non_overlapping_and_dense` is not: it involves a sort on the strides. While in principle we can still make it go through by using a data oblivious sorting network, this seems like too much complication for a field that is likely never used (because typically, it will be obvious that a tensor is non overlapping and dense, because the tensor is contiguous.) So we take a different approach: instead of trying to trace through the logic computation of non-overlapping and dense, we instead introduce a new opaque operator IsNonOverlappingAndDenseIndicator which represents all of the compute that would have been done here. This function returns an integer 0 if `is_non_overlapping_and_dense` would have returned `False`, and an integer 1 otherwise, for technical reasons (Sympy does not easily allow defining custom functions that return booleans). The function itself only knows how to evaluate itself if all of its arguments are integers; otherwise it is left unevaluated. This means we can always guard on it (as `size_hint` will always be able to evaluate through it), but otherwise its insides are left a black box. We typically do NOT expect this custom function to show up in actual boolean expressions, because we will typically shortcut it due to the tensor being contiguous. It's possible we should apply this treatment to all of the other `compute_` operations, more investigation necessary. 
As a technical note, because this operator takes a pair of lists of SymInts, we need to support converting `ArrayRef<SymNode>` to Python, and I also unpack the pair of lists into a single list because I don't know if Sympy operations can actually validly take lists of Sympy expressions as inputs. See for example `_make_node_sizes_strides`. * On the Python side, we also introduce a SymBool class, and update SymNode to track bool as a valid pytype. There is some subtlety here: bool is a subclass of int, so one has to be careful about `isinstance` checks (in fact, in most cases I replaced `isinstance(x, int)` with `type(x) is int` for expressly this reason.) Additionally, unlike C++, I do NOT define bitwise inverse on SymBool, because it does not do the correct thing when run on booleans, e.g., `~True` is `-2`. (For that matter, they don't do the right thing in C++ either, but at least in principle the compiler can warn you about it with `-Wbool-operation`, and so the rule is simple in C++; only use logical operations if the types are statically known to be SymBool). Alas, logical negation is not overrideable, so we have to introduce `sym_not` which must be used in place of `not` whenever a SymBool can turn up. To avoid confusion with `__not__` which may imply that `operator.__not__` might be acceptable to use (it isn't), our magic method is called `__sym_not__`. The other bitwise operators `&` and `|` do the right thing with booleans and are acceptable to use. * There is some annoyance working with booleans in Sympy. Unlike int and float, booleans live in their own algebra and they support fewer operations than regular numbers. In particular, `sympy.expand` does not work on them. To get around this, I introduce `safe_expand` which only calls expand on operations which are known to be expandable. TODO: this PR appears to greatly regress performance of symbolic reasoning.
In particular, `python test/functorch/test_aotdispatch.py -k max_pool2d` performs really poorly with these changes. Need to investigate. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/92149 Approved by: https://github.com/albanD, https://github.com/Skylion007	2023-01-21 02:21:56 +00:00
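The Python-side subtleties listed in the SymBool commit above (bool subclasses int, `~True` is `-2`, `not` cannot be overridden) can be checked in plain Python. In the sketch below, `ToySymBool` and this `sym_not` are hypothetical stand-ins written for illustration; they are not the torch implementation.

```python
# Plain-Python illustration; ToySymBool and this sym_not are hypothetical
# stand-ins, not the torch implementation.

# bool is a subclass of int, so isinstance cannot tell the two apart.
assert isinstance(True, int)
assert type(True) is not int

# Bitwise inverse treats True as the integer 1.
assert ~True == -2

# & and | do behave as expected on plain booleans.
assert (True & False) is False
assert (True | False) is True


def _expr(value):
    """Render either a ToySymBool or a plain Python value as text."""
    return value.expr if isinstance(value, ToySymBool) else repr(value)


class ToySymBool:
    """Records boolean operations as a string instead of evaluating them."""

    def __init__(self, expr):
        self.expr = expr

    def __and__(self, other):
        return ToySymBool(f"({self.expr} & {_expr(other)})")

    def __or__(self, other):
        return ToySymBool(f"({self.expr} | {_expr(other)})")

    def __sym_not__(self):
        return ToySymBool(f"~{self.expr}")

    def __bool__(self):
        # `not x` always calls __bool__ and must produce a concrete bool,
        # so there is no hook that could return another symbolic value.
        raise TypeError("ToySymBool has no concrete truth value")


def sym_not(value):
    """Negate a plain bool, or defer to __sym_not__ when it is defined."""
    if hasattr(value, "__sym_not__"):
        return value.__sym_not__()
    return not value
```

Calling `sym_not(ToySymBool("a") & ToySymBool("b"))` yields a new symbolic value, while `not ToySymBool("a")` raises, which is the reason a separate function is needed.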
Kurt Mohler	647b8f8e3e	Add TORCH_CHECK_TENSOR_ALL (#89097 ) `TORCH_CHECK_TENSOR_ALL(cond, ...)` is a wrapper around `TORCH_CHECK` which allows the condition argument to be a tensor, batched or unbatched. `cond` can be a boolean tensor of any size. If any element is False, or if `cond.numel() == 0`, then `TORCH_CHECK_TENSOR_ALL` raises an error Part of #72948 Pull Request resolved: https://github.com/pytorch/pytorch/pull/89097 Approved by: https://github.com/zou3519	2023-01-19 21:04:09 +00:00
Edward Z. Yang	6420fecdc4	Introduce sym_min and sym_max (#92107 ) It turns out our old max/min implementation didn't do anything, because `__max__` and `__min__` are not actually magic methods in Python. So I give 'em the `sym_` treatment, similar to the other non-overrideable builtins. NB: I would like to use `sym_max` when computing contiguous strides but this appears to make `python test/functorch/test_aotdispatch.py -v -k test_aot_autograd_symbolic_exhaustive_nn_functional_max_pool2d_cpu_float32` run extremely slowly. Needs investigating. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/92107 Approved by: https://github.com/albanD, https://github.com/voznesenskym, https://github.com/Skylion007	2023-01-18 20:57:27 +00:00
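The observation in the commit above, that `__max__` is not a magic method, is easy to confirm: the builtin `max` only uses rich comparisons. The sketch below uses a hypothetical `Tracked` class and a toy `sym_max` dispatcher; the `__sym_max__` hook name here is an assumption made for the illustration.

```python
calls = []  # records which methods Python actually invokes


class Tracked:
    """Toy value type, hypothetical and for illustration only."""

    def __init__(self, value):
        self.value = value

    def __max__(self, other):
        # Never reached: the builtin max() does not look for this name.
        calls.append("__max__")
        return self

    def __gt__(self, other):
        calls.append("__gt__")
        return self.value > other.value

    def __sym_max__(self, other):
        calls.append("__sym_max__")
        other_value = other.value if isinstance(other, Tracked) else other
        return Tracked(max(self.value, other_value))


def sym_max(a, b):
    """Toy dispatcher: use a __sym_max__ hook if present, else builtin max."""
    if hasattr(a, "__sym_max__"):
        return a.__sym_max__(b)
    if hasattr(b, "__sym_max__"):
        return b.__sym_max__(a)
    return max(a, b)


a, b = Tracked(1), Tracked(2)
largest = max(a, b)  # goes through __gt__, not __max__
```

An explicit function such as `sym_max` is therefore the only place where custom behavior can be attached.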
yanbing-j	94a7c01159	Enable oneDNN implementation in LSTM op (#91158 ) ### Description This PR is to enable oneDNN implementation in LSTM op to improve the performance of it. Both FP32 and BF16 are supported. ### Performance improvement In CPX 28C, with setting iomp and jemalloc. We choose 8 LSTM input options (including input_size, hidden_size, num_layers, bidirectional, bias, batch_first, dropout, batch_size, seq_len), and the final option is a real input from train-clean-100 in LibriSpeech dataset. The performance improvements are shown in the following figures. We can see that LSTM with oneDNN implementation can perform better than the original. In single socket: ![image](https://user-images.githubusercontent.com/61222868/211182994-833debec-518a-4b35-8504-6b0fadb17930.png) ![image](https://user-images.githubusercontent.com/61222868/211183012-31e1253f-2c60-4c92-a656-c239a971b453.png) In single core: ![image](https://user-images.githubusercontent.com/61222868/211183017-186e5d47-cb9a-4c1e-914f-fa718e769f1c.png) ![image](https://user-images.githubusercontent.com/61222868/211183022-53266857-5a9e-4a95-b300-33fa34811d08.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/91158 Approved by: https://github.com/jgong5, https://github.com/malfet	2023-01-18 04:41:18 +00:00
Edward Z. Yang	333540a458	Reland "Add torch.utils.device_mode" (#91796 ) Original PR https://github.com/pytorch/pytorch/pull/91525 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/91796 Approved by: https://github.com/albanD	2023-01-09 20:57:12 +00:00
PyTorch MergeBot	9b415240d4	Revert "Reland "Add torch.utils.device_mode" (#91796 )" This reverts commit `81b5eff3c3`. Reverted https://github.com/pytorch/pytorch/pull/91796 on behalf of https://github.com/huydhn due to This breaks trunk with the following failed test https://hud.pytorch.org/failure/test_jit_save%2CTestTracer	2023-01-09 04:45:47 +00:00
Edward Z. Yang	81b5eff3c3	Reland "Add torch.utils.device_mode" (#91796 ) Original PR https://github.com/pytorch/pytorch/pull/91525 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/91796 Approved by: https://github.com/albanD	2023-01-08 03:44:56 +00:00
PyTorch MergeBot	f571ae4fdb	Revert "Make torch.device usable as a context manager (#91525 )" This reverts commit `619d52a5d2`. Reverted https://github.com/pytorch/pytorch/pull/91525 on behalf of https://github.com/mehtanirav due to Internal breakages	2023-01-05 21:34:50 +00:00
Edward Z. Yang	619d52a5d2	Make torch.device usable as a context manager (#91525 ) Fixes https://github.com/pytorch/pytorch/issues/82296 Fixes https://github.com/pytorch/pytorch/issues/27878 Fixes https://github.com/pytorch/pytorch/issues/260 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/91525 Approved by: https://github.com/albanD	2023-01-04 01:32:00 +00:00
Kurt Mohler	08a47549af	Rename `Tensor._storage` to `Tensor.untyped_storage` and update docs (#91414 ) Fixes #89224 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91414 Approved by: https://github.com/ezyang	2022-12-28 19:21:34 +00:00
Joel Schlosser	8b55b86dbd	Move sym_int and sym_float alongside SymInt / SymFloat in base torch package (#91317 ) This PR moves the definitions for: * `sym_int` * `sym_ceil` (used only for `sym_int`) * `sym_floor` (used only for `sym_int`) * `sym_float` from `torch/fx/experimental/symbolic_shapes.py` to `torch/__init__.py`, where `SymInt` and `SymFloat` are already defined. This removes the need for several in-line imports, and enables proper JIT script gating for #91318. I'm very open to doing this in a better way! Pull Request resolved: https://github.com/pytorch/pytorch/pull/91317 Approved by: https://github.com/ezyang, https://github.com/anijain2305	2022-12-28 16:08:16 +00:00
Richard Zou	fb2e1878cb	[torch.func] alias torch.func.vmap as torch.vmap (#91026 ) This PR also redirects torch.vmap to torch.func.vmap instead of the old vmap prototype. Test Plan: - tests - view docs preview Pull Request resolved: https://github.com/pytorch/pytorch/pull/91026 Approved by: https://github.com/albanD, https://github.com/samdow	2022-12-21 20:51:49 +00:00
Michael Gschwind	512ec181ec	Introduce causal mask (#90508 ) Summary: Introduce causal mask This PR introduces a causal mask option _causal_mask (as well as causal mask detection if attn_mask is provided), since current custom kernels do not support arbitrary masks. Test Plan: sandcastle & github ci/cd Differential Revision: D41723137 Pull Request resolved: https://github.com/pytorch/pytorch/pull/90508 Approved by: https://github.com/albanD	2022-12-16 21:39:42 +00:00
Sergii Dymchenko	f51f6aa387	Fix non-existing parameters in docstrings (#90505 ) Continuation after https://github.com/pytorch/pytorch/pull/90163. Here is a script I used to find all the non-existing arguments in the docstrings (the script can give false positives in presence of `*args`/`**kwargs` or decorators): _Edit:_ I've realized that the indentation is wrong for the last `break` in the script, so the script only gives output for a function if the first docstring argument is wrong. I'll create a separate PR if I find more issues with corrected script.
``` python
import ast
import os
import docstring_parser

for root, dirs, files in os.walk('.'):
    for name in files:
        if root.startswith("./.git/") or root.startswith("./third_party/"):
            continue
        if name.endswith(".py"):
            full_name = os.path.join(root, name)
            with open(full_name, "r") as source:
                tree = ast.parse(source.read())
            for node in ast.walk(tree):
                if isinstance(node, ast.FunctionDef):
                    all_node_args = node.args.args
                    if node.args.vararg is not None:
                        all_node_args.append(node.args.vararg)
                    if node.args.kwarg is not None:
                        all_node_args.append(node.args.kwarg)
                    if node.args.posonlyargs is not None:
                        all_node_args.extend(node.args.posonlyargs)
                    if node.args.kwonlyargs is not None:
                        all_node_args.extend(node.args.kwonlyargs)
                    args = [a.arg for a in all_node_args]
                    docstring = docstring_parser.parse(ast.get_docstring(node))
                    doc_args = [a.arg_name for a in docstring.params]
                    clean_doc_args = []
                    for a in doc_args:
                        clean_a = ""
                        for c in a.split()[0]:
                            if c.isalnum() or c == '_':
                                clean_a += c
                        if clean_a:
                            clean_doc_args.append(clean_a)
                    doc_args = clean_doc_args
                    for a in doc_args:
                        if a not in args:
                            print(full_name, node.lineno, args, doc_args)
                        # Mis-indented per the Edit note: stops after the first doc arg.
                        break
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90505 Approved by: https://github.com/malfet, https://github.com/ZainRizvi	2022-12-09 21:43:09 +00:00
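The `break` problem described in the Edit note of the commit above can be isolated in a few lines. This is a sketch under an assumption: that the `break` sat at the loop-body level rather than inside the `if`, which matches the described symptom. Both function names are hypothetical.

```python
def first_only_check(args, doc_args):
    """Mirrors the script as posted: the loop ends after the first doc arg."""
    reported = []
    for a in doc_args:
        if a not in args:
            reported.append(a)
        break  # wrong level: leaves the loop after one iteration
    return reported


def report_once_check(args, doc_args):
    """Presumed intent: report the function once if any doc arg is unknown."""
    reported = []
    for a in doc_args:
        if a not in args:
            reported.append(a)
            break  # stop only after a mismatch was found
    return reported
```

With `args = ["x", "y"]` and `doc_args = ["x", "z"]`, the first version reports nothing because it never looks past `"x"`, while the second reports `"z"`.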
Nikita Shulga	768bd3fb4a	Add `torch.compile` implementation (#89607 ) `torch.compile` can be used either as a decorator or to optimize a model directly, for example:
```
@torch.compile
def foo(x):
    return torch.sin(x) + x.max()
```
or
```
mod = torch.nn.ReLU()
optimized_mod = torch.compile(mod, mode="max-autotune")
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89607 Approved by: https://github.com/soumith	2022-12-01 20:17:52 +00:00
albanD	c79489c8e6	Expose to python the backward AD view_func (#89586 ) This will be useful for other systems (AOTAutograd) that want to replay autograd views. FYI @bdhirsh Pull Request resolved: https://github.com/pytorch/pytorch/pull/89586 Approved by: https://github.com/soulitzer	2022-11-24 03:39:58 +00:00
Jane Xu	8695f0cced	Rectify `native_batch_norm` schema by splitting it into two legit schemas (#88697 ) Using the same repro from the issue (but with BatchNorm2D) Rectifies native_batch_norm schema by splitting the schema into 2: 1. one will have NON-optional alias-able running_mean and running_var inputs 2. the other will just not have those parameters at all (no_stats variation) Calling for name suggestions! ## test plan I've added tests in test_functionalization.py as well as an entry in common_method_invocations.py for `native_batch_norm_legit` CI should pass. ## next steps Because of bc/fc reasons, we reroute native_batch_norm to call our new schemas ONLY through the python dispatcher, but in 2 weeks or so, we should make `native_batch_norm_legit` the official batch_norm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88697 Approved by: https://github.com/albanD	2022-11-23 23:23:17 +00:00
Kurt Mohler	ee28b865ee	Deprecate TypedStorage, its derived classes, and all of their public methods (#85303 ) Part of #85302 Pull Request resolved: https://github.com/pytorch/pytorch/pull/85303 Approved by: https://github.com/ezyang	2022-11-08 18:11:01 +00:00
Kazuaki Ishizaki	2ddefbdc3c	Fix typos used in documents under torch directory (#88300 ) This PR fixes typos, in comments of Python files, that are found from a search box at https://pytorch.org/docs/master/search.html Pull Request resolved: https://github.com/pytorch/pytorch/pull/88300 Approved by: https://github.com/lezcano	2022-11-02 09:38:13 +00:00
Kazuaki Ishizaki	0fab8df0b6	Fix incorrect param names in get_testing_overrides (#87625 ) This PR fixes incorrect parameter names for lambda in `get_testing_overrides()` Pull Request resolved: https://github.com/pytorch/pytorch/pull/87625 Approved by: https://github.com/kit1980	2022-10-25 02:49:14 +00:00
samdow	169ec120ef	[Modes] refactor modes to only use a stack in cpp (#86458 ) Refactors the mode code to only have the C++ mode stack and not the "C++ mode" like we originally had. This also simplifies the mode logic in a number of places Pull Request resolved: https://github.com/pytorch/pytorch/pull/86458 Approved by: https://github.com/zou3519	2022-10-21 19:18:23 +00:00
Antoni Viros i Martin	cdbffa7f66	🦊 [AI Accelerators] Consolidate native_layer_norm for nested tensor (#86295 ) Summary: In order to make the layer normalization implementation for nested tensors public, it needs to be generalized to accept a normalized_shape argument instead of assuming it to be the last dimension of the nested_tensor. This commit does that, as well as adding extra unit tests to ensure the implementation is correct. Test Plan: All unit tests designed to test different ways of using the function work: `buck test //caffe2/test:nested -- test_layer_norm` Differential Revision: D40105207 Pull Request resolved: https://github.com/pytorch/pytorch/pull/86295 Approved by: https://github.com/drisspg	2022-10-06 13:10:25 +00:00
Ivan Yashchuk	b00a5359f7	Add a way to skip lowering to nvprims (#85811 ) This PR adds `skip_ops` argument to `TorchRefsNvfuserCapabilityMode` and `NvfuserPrimsMode` which is an iterable of function names to be skipped in the translation to nvprims process. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85811 Approved by: https://github.com/mruberry, https://github.com/jjsjann123	2022-09-30 12:01:45 +00:00
Mikayla Gawarecki	afaee00fec	Add python `nested_tensor` and `as_nested_tensor` constructors in `torch.nested` (#85593 ) Remove `torch.nested_tensor` which has erroneous behavior wrt gradients (could be either leaf or not leaf). Introduce `torch.nested.nested_tensor` and `torch.nested.as_nested_tensor` in the vein of `torch.tensor` and `torch.as_tensor`. Done in nested `__init__.py` for now but can move to pybind in future (when we want to load from numpy/nested lists ). Discussed offline with @cpuhrsch and pybind constructor (https://github.com/pytorch/pytorch/pull/85536) was more gnarly than expected, so we can move to that when we do need loading from numpy etc. Differential Revision: [D39806622](https://our.internmc.facebook.com/intern/diff/D39806622) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85593 Approved by: https://github.com/drisspg, https://github.com/cpuhrsch	2022-09-28 20:15:02 +00:00
Edward Z. Yang	24a268143d	Directly access has_symbolic_sizes_strides, avoid expensive test (#85754 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/85754 Approved by: https://github.com/albanD	2022-09-28 00:26:11 +00:00
samdow	a106611055	[Modes] fix handle_torch_function logic (#85707 ) Fixes #85696. I didn't totally get what was happening in handle_torch_function and so was trying to recreate the original logic instead of following what the C++ is doing. This fixes that. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85707 Approved by: https://github.com/ezyang	2022-09-27 18:35:51 +00:00
samdow	18d8c548f4	[Modes] remove enable and rewrite mode stack (squashed) (#84774 ) Based on @ezyang's suggestion, mode stack now has "one true mode" which is the _only_ mode that can ever be active at the C++ level. That mode's torch dispatch is just to take the top mode in the stack, reenable itself (if we aren't at the end of the mode stack), and run the top mode's torch_{dispatch|function}. This maintains that in the middle of a mode's torch dispatch, the mode itself will not be active. It changes the function the user has to call to see what the current mode is (no longer queries the C++, it's python only) but allows the user to also see the entire mode stack easily. Removes `enable_torch_dispatch_mode` and `.restore()` since neither makes sense in this new setup. ### Background Why do we want this? Well, a pretty common pattern that was coming up was that users had to do something like
```python
## PRE-PR UX
def f(mode):
  with mode.restore():  # user needs to understand this restore thing?
    ...

with Mode() as m:
  pass
f(m)
```
Many users were getting errors from forgetting to call `.restore` or from forgetting to add the (tbh weird) "mode instantiation" step where they use the mode as a context manager with an empty body. Really, they wanted to treat modes like context managers and just write
```python
## FROM FEEDBACK, USER DESIRED CODE. POSSIBLE POST-PR
def f(mode):
  with mode:
    ...
f(Mode())
```
Technical Details
With the old mode stack, we basically had a linked list so the mode itself could only be used once and had a fixed parent. In this new design, the mode stack is just a python list that we're pushing to and popping from. There's only one mode that's ever active at the C++ level and it runs the next mode in the Python list. The modes don't have state on them anymore. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84774 Approved by: https://github.com/ezyang, https://github.com/zou3519	2022-09-27 01:04:35 +00:00
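The design described in the commit above, a mode stack that is "just a python list that we're pushing to and popping from", can be sketched without torch. `ToyMode`, `_mode_stack`, and `current_mode` are hypothetical names for this illustration and do not correspond to the real torch internals.

```python
_mode_stack = []  # hypothetical module-level stack of active modes


class ToyMode:
    """Stateless toy mode: entering pushes it, exiting pops it."""

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        _mode_stack.append(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        popped = _mode_stack.pop()
        assert popped is self, "modes must be exited in LIFO order"
        return False  # never swallow exceptions


def current_mode():
    """Return the innermost active mode, or None outside any mode."""
    return _mode_stack[-1] if _mode_stack else None


def f(mode):
    # The post-PR usage from the commit message: just `with mode:`.
    with mode:
        return [m.name for m in _mode_stack]
```

Because the mode object holds no parent pointer, the same instance can be entered repeatedly and under different outer modes, which the old linked-list design did not allow.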
Ivan Yashchuk	539076e2c2	Remove deprecated torch.lstsq (#70980 ) The time has come to remove deprecated linear algebra related functions. This PR removes `torch.lstsq`. There's a note in `tools/codegen/gen.py` about `lstsq` schema in `native_function.yaml` that I will not remove: `87139d8532/tools/codegen/gen.py (L734-L770)` cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/70980 Approved by: https://github.com/lezcano, https://github.com/kit1980	2022-09-23 00:16:55 +00:00
Ivan Yashchuk	bcf93181a0	Remove deprecated torch.matrix_rank (#70981 ) The time has come to remove deprecated linear algebra related functions. This PR removes `torch.matrix_rank`. cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/70981 Approved by: https://github.com/lezcano, https://github.com/kit1980	2022-09-22 17:40:46 +00:00
Mikayla Gawarecki	77f1f98479	Re-introduce `torch.Tensor.to_padded_tensor` (#85293 ) Differential Revision: [D39629004](https://our.internmc.facebook.com/intern/diff/D39629004) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85293 Approved by: https://github.com/cpuhrsch	2022-09-21 18:45:56 +00:00
Khushi Agrawal	2386cd2945	[reland] [numpy] add torch.concatenate, alias of torch.cat (#85073 ) Previous PR: #82946 Fixes #81161 Pull Request resolved: https://github.com/pytorch/pytorch/pull/85073 Approved by: https://github.com/mruberry	2022-09-15 19:34:44 +00:00
PyTorch MergeBot	fa7bf3e2dc	Revert "[numpy] add `torch.concatenate`, alias of torch.cat (#82946 )" This reverts commit `270e5e519d`. Reverted https://github.com/pytorch/pytorch/pull/82946 on behalf of https://github.com/malfet due to Broke M1 tests, see `270e5e519d`	2022-09-14 21:32:11 +00:00
Khushi Agrawal	270e5e519d	[numpy] add `torch.concatenate`, alias of torch.cat (#82946 ) As per the title. Fixes: #81161 - [x] add ErrorInputs - ~[ ] dtype argument?~ - ~[ ] casting argument?~ As discussed offline with @kshitij12345, we can currently ignore `dtype` and `casting` arguments. cc: @kshitij12345! Pull Request resolved: https://github.com/pytorch/pytorch/pull/82946 Approved by: https://github.com/mruberry	2022-09-14 19:28:43 +00:00
Mikayla Gawarecki	e217b30b0f	Add `torch.nested` namespace (#84102 ) First step towards #83775 - only `to_padded_tensor` is moved to the nested namespace for now - following the schema used for `special`, `fft`, `linalg` and other namespaces, nested functions are registered in native_functions.yaml as `nested_{function_name}` and are bound to the desired Python name in `torch/nested/__init__.py`, and the desired C++ name in `torch/csrc/api/include/torch/nested.h`. ~~Question: should we keep the documentation for `Tensor.to_padded_tensor` or can this deleted since it is shared by `torch.nested.to_padded_tensor`?~~ [generated nested docs](https://docs-preview.pytorch.org/84102/nested.html?highlight=nested#module-torch.nested) Differential Revision: [D39361148](https://our.internmc.facebook.com/intern/diff/D39361148) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84102 Approved by: https://github.com/drisspg	2022-09-12 16:31:05 +00:00
Ivan Yashchuk	01c54ad6de	Remove deprecated torch.eig (#70982 ) The time has come to remove deprecated linear algebra related functions. This PR removes `torch.eig`. cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/70982 Approved by: https://github.com/Lezcano, https://github.com/malfet	2022-09-09 21:31:57 +00:00
samdow	7532d5b125	[Modes] remove inner constructor kwarg (#83925 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83925 Approved by: https://github.com/ezyang, https://github.com/zou3519	2022-08-31 00:05:56 +00:00
Michael Gschwind	cf2c94e6de	NestedTensor Softmax (#83435 ) Summary: Simple mask compute and softmax Test Plan: unit test Differential Revision: D38711915 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83435 Approved by: https://github.com/erichan1, https://github.com/huydhn	2022-08-17 21:57:42 +00:00
PyTorch MergeBot	0061e67629	Revert "NestedTensor Softmax (#83435 )" This reverts commit `d7fc76a1ed`. Reverted https://github.com/pytorch/pytorch/pull/83435 on behalf of https://github.com/huydhn due to This is suspected to break functorch tests in trunk `d7fc76a1ed`	2022-08-17 16:19:38 +00:00
Michael Gschwind	d7fc76a1ed	NestedTensor Softmax (#83435 ) Summary: Simple mask compute and softmax Test Plan: unit test Differential Revision: D38711915 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83435 Approved by: https://github.com/erichan1	2022-08-17 04:19:23 +00:00
soulitzer	31fad3926a	Add option to run anomaly mode without nan checking (#83481 ) Fixes https://github.com/pytorch/pytorch/issues/83117 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83481 Approved by: https://github.com/albanD	2022-08-16 22:56:23 +00:00
Jeff Daily	d52d2bd5a9	[ROCm] MIOpen fused convolution relu (#82002 ) Adds MIOpen fused convolution relu for fp32 and contiguous memory format. Adds fallbacks for conv + z + bias + relu, fp16, and channels last until MIOpen adds these features. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82002 Approved by: https://github.com/ngimel, https://github.com/malfet	2022-08-16 20:49:33 +00:00
albanD	e4ea751810	Fix hash for Tensor subclasses (#83174 ) Fixes https://github.com/pytorch/pytorch/issues/82832 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83174 Approved by: https://github.com/ezyang	2022-08-10 19:23:56 +00:00
Fabio Rocha	fd84c458f4	Add torch.unflatten and improve its docs (#81399 ) unflatten now has a free function version in torch.unflatten in addition to the method in torch.Tensor.unflatten. Updated docs to reflect this and polished them a little. For consistency, changed the signature of the int version of unflatten in native_functions.yaml. Some override tests were failing because unflatten has unusual characteristics in terms of the .int and .Dimname versions having a different number of arguments, so this required some changes to test/test_override.py. Removed support for using a mix of integer and string arguments when specifying dimensions in unflatten. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81399 Approved by: https://github.com/Lezcano, https://github.com/ngimel	2022-07-29 15:02:42 +00:00
samdow	2ac24675cc	get rid of push_torch_{dispatch, function}_mode (#78215 ) Currently we have 2 ways of doing the same thing for torch dispatch and function modes: `with push_torch_dispatch_mode(X)` or `with X.push(...)` is now the equivalent of doing `with X()` This removes the first API (which is older and private so we don't need to go through a deprecation cycle) There is some risk here that this might land race with a PR that uses the old API but in general it seems like most are using the `with X()` API or `enable_torch_dispatch_mode(X())` which isn't getting removed. EDIT: left the `with X.push(...)` API since there were ~3 land races with that over the past day or so. But made it give a warning and ask users to use the other API Pull Request resolved: https://github.com/pytorch/pytorch/pull/78215 Approved by: https://github.com/ezyang	2022-07-22 18:56:37 +00:00
Edward Z. Yang	d4f065d261	Return mode object from __enter__ (#80998 ) This makes `with Mode() as m:` work. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/80998 Approved by: https://github.com/samdow	2022-07-12 23:22:26 +00:00
lezcano	e505796a2c	[Array API] Add linalg.vecdot (#70542 ) This PR adds the function `linalg.vecdot` specified by the [Array API](https://data-apis.org/array-api/latest/API_specification/linear_algebra_functions.html#function-vecdot) For the complex case, it chooses to implement \sum x_i y_i. See the discussion in https://github.com/data-apis/array-api/issues/356 Edit. When it comes to testing, this function is not quite a binopt, nor a reduction opt. As such, we're this close to be able to get the extra testing, but we don't quite make it. Now, it's such a simple op that I think we'll make it without this. Resolves https://github.com/pytorch/pytorch/issues/18027. cc @mruberry @rgommers @pmeier @asmeurer @leofang @AnirudhDagar @asi1024 @emcastillo @kmaehashi Pull Request resolved: https://github.com/pytorch/pytorch/pull/70542 Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry	2022-07-12 14:28:54 +00:00
PyTorch MergeBot	39f659c3ba	Revert "[Array API] Add linalg.vecdot (#70542 )" This reverts commit `74208a9c68`. Reverted https://github.com/pytorch/pytorch/pull/70542 on behalf of https://github.com/malfet due to Broke CUDA-10.2 for vecdot_bfloat16, see `74208a9c68`	2022-07-08 22:56:51 +00:00
lezcano	74208a9c68	[Array API] Add linalg.vecdot (#70542 ) This PR adds the function `linalg.vecdot` specified by the [Array API](https://data-apis.org/array-api/latest/API_specification/linear_algebra_functions.html#function-vecdot) For the complex case, it chooses to implement \sum x_i y_i. See the discussion in https://github.com/data-apis/array-api/issues/356 Edit. When it comes to testing, this function is not quite a binopt, nor a reduction opt. As such, we're this close to be able to get the extra testing, but we don't quite make it. Now, it's such a simple op that I think we'll make it without this. Resolves https://github.com/pytorch/pytorch/issues/18027. cc @mruberry @rgommers @pmeier @asmeurer @leofang @AnirudhDagar @asi1024 @emcastillo @kmaehashi Pull Request resolved: https://github.com/pytorch/pytorch/pull/70542 Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry	2022-07-08 15:37:58 +00:00
Nikolay Korovaiko	8389ccbcd8	reinstate size and shape returning symints (#79560 ) This PR redirects `size` and `.shape` to call `sym_sizes` Pull Request resolved: https://github.com/pytorch/pytorch/pull/79560 Approved by: https://github.com/Chillee	2022-07-08 01:17:33 +00:00
lezcano	19f3d4d795	Expose linalg.solve_ex (#80073 ) This prepares for making `linalg.inv_ex` just a call into this function Pull Request resolved: https://github.com/pytorch/pytorch/pull/80073 Approved by: https://github.com/IvanYashchuk, https://github.com/albanD	2022-07-01 16:09:23 +00:00
Allen Goodman	63ef2a03e5	torch.special.scaled_modified_bessel_k0 (#78900 ) ```Python scaled_modified_bessel_k0(input, *, out=None) -> Tensor ``` Scaled modified Bessel function of the second kind of order $0$. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78900 Approved by: https://github.com/mruberry	2022-06-29 14:53:37 +00:00
Nikolay Korovaiko	7e34edf12d	adding sym_size override (#80357 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/80357 Approved by: https://github.com/ezyang	2022-06-29 00:53:45 +00:00
PyTorch MergeBot	602c38ff63	Revert "torch.special.gamma (#78904 )" This reverts commit `f563f25efd`. Reverted https://github.com/pytorch/pytorch/pull/78904 on behalf of https://github.com/suo due to This PR appears to have broken mac tests on master `f563f25efd`	2022-06-28 00:54:22 +00:00
Allen Goodman	ab8797d69b	torch.special.spherical_bessel_j0 (#78912 ) ```Python spherical_bessel_j0(input, *, out=None) -> Tensor ``` Spherical Bessel function of the first kind of order $0$. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78912 Approved by: https://github.com/mruberry	2022-06-27 20:14:46 +00:00
Allen Goodman	f563f25efd	torch.special.gamma (#78904 ) ```Python gamma(input, *, out=None) -> Tensor ``` Gamma function $\Gamma\left(\text{input}\right)$. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78904 Approved by: https://github.com/mruberry	2022-06-27 19:36:17 +00:00
Allen Goodman	b3ca3638be	torch.special.scaled_modified_bessel_k1 (#78901 ) ```Python scaled_modified_bessel_k1(input, *, out=None) -> Tensor ``` Scaled modified Bessel function of the second kind of order $1$. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78901 Approved by: https://github.com/mruberry	2022-06-24 20:57:38 +00:00
Allen Goodman	b3308e21bf	torch.special.airy_ai (#78902 ) ```Python airy_ai(input, *, out=None) -> Tensor ``` Airy function $\text{Ai}\left(\text{input}\right)$. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78902 Approved by: https://github.com/mruberry, https://github.com/linbinyu, https://github.com/seemethere	2022-06-23 19:33:40 +00:00
Edward Z. Yang	f7ee061638	Wconstab/reland pysymint (#79795 ) rebased https://github.com/pytorch/pytorch/pull/79617/ to see if issues are reproducible. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79795 Approved by: https://github.com/malfet	2022-06-20 22:55:06 +00:00
Mikayla Gawarecki	7360b53ff3	reland Add offsets-based reduction to segment_reduce (CPU, CUDA) Pull Request resolved: https://github.com/pytorch/pytorch/pull/79725 Approved by: https://github.com/george-qi	2022-06-17 15:49:31 +00:00
PyTorch MergeBot	44436947bc	Revert "Reland PySymInt (#79617 )" This reverts commit `8ef6356f26`. Reverted https://github.com/pytorch/pytorch/pull/79617 on behalf of https://github.com/zengk95 due to this is breaking periodic jobs (and maybe pull) on trunk	2022-06-16 19:40:27 +00:00
Nikolay Korovaiko	8ef6356f26	Reland PySymInt (#79617 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/79617 Approved by: https://github.com/Chillee	2022-06-16 04:18:06 +00:00
drisspg	b9f83cb737	use is_same_size in autograd init (#79553 ) Broke: #79446 into a smaller commit that just adds is_same_size to the autograd __init_file. This function is_same_size will be dispatched to the original behavior for regular tensors Pull Request resolved: https://github.com/pytorch/pytorch/pull/79553 Approved by: https://github.com/soulitzer	2022-06-15 19:49:42 +00:00
Joel Benjamin Schlosser	2d73c8e6e0	Add Dropout1d module Pull Request resolved: https://github.com/pytorch/pytorch/pull/79545 Approved by: https://github.com/ngimel, https://github.com/albanD	2022-06-15 14:39:07 +00:00
PyTorch MergeBot	b8db0a0475	Revert "Python Bindings for SymInts (#78135 )" This reverts commit `d332724071`. Reverted https://github.com/pytorch/pytorch/pull/78135 on behalf of https://github.com/ezyang due to broke torchvision tests	2022-06-15 13:52:14 +00:00
Nikolay Korovaiko	d332724071	Python Bindings for SymInts (#78135 ) This PR adds support for `SymInt`s in python. Namely, * `THPVariable_size` now returns `sym_sizes()` * python arg parser is modified to parse PyObjects into ints and `SymbolicIntNode`s * pybind11 bindings for `SymbolicIntNode` are added, so size expressions can be traced * a large number of tests added to demonstrate how to implement python symints. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78135 Approved by: https://github.com/ezyang	2022-06-14 02:17:59 +00:00
George Qi	05624bcf7b	add sizes to slowpath Pull Request resolved: https://github.com/pytorch/pytorch/pull/79295 Approved by: https://github.com/ezyang	2022-06-14 01:19:59 +00:00
PyTorch MergeBot	3b194fd532	Revert "Add offsets-based reduction to segment_reduce (CPU, CUDA)" This reverts commit `1ec30a6647`. Reverted https://github.com/pytorch/pytorch/pull/78907 on behalf of https://github.com/osalpekar due to Caused Typecasting errors in PT Distributed and fx2trt builds internally	2022-06-13 22:37:25 +00:00
Mikayla Gawarecki	1ec30a6647	Add offsets-based reduction to segment_reduce (CPU, CUDA) Pull Request resolved: https://github.com/pytorch/pytorch/pull/78907 Approved by: https://github.com/cpuhrsch	2022-06-11 17:43:42 +00:00
lezcano	54949a5abc	Simplify and optimize linalg.solve This PR heavily simplifies the code of `linalg.solve`. At the same time, this implementation saves quite a few copies of the input data in some cases (e.g. A is contiguous) We also implement it in such a way that the derivative goes from computing two LU decompositions and two LU solves to no LU decompositions and one LU solves. It also avoids a number of unnecessary copies the derivative was unnecessarily performing (at least the copy of two matrices). On top of this, we add a `left` kw-only arg that allows the user to solve `XA = B` rather concisely. Pull Request resolved: https://github.com/pytorch/pytorch/pull/74046 Approved by: https://github.com/nikitaved, https://github.com/IvanYashchuk, https://github.com/mruberry	2022-06-11 04:06:40 +00:00
samdow	3734fcc8f8	add ability to push a mode if the current mode is an ancestor Pull Request resolved: https://github.com/pytorch/pytorch/pull/78822 Approved by: https://github.com/ezyang, https://github.com/zou3519	2022-06-10 18:27:04 +00:00
George Qi	a90f006fe5	add strides to slow path Pull Request resolved: https://github.com/pytorch/pytorch/pull/78610 Approved by: https://github.com/ezyang	2022-06-10 16:59:14 +00:00
lezcano	c7d6cec078	Add linalg.lu_solve This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA when calling the batched MAGMA backend with trans=True. We work around that by solving the system as two triangular systems. We also update the heuristics for this function, as they were fairly outdated. We found that cuSolver is king, so luckily we do not need to rely on the buggy backend from magma for this function. We added tests testing this function left and right. We also added tests for the different backends. We also activated the tests for AMD, as those should work as well. Fixes https://github.com/pytorch/pytorch/issues/61657 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77634 Approved by: https://github.com/malfet	2022-06-07 22:28:28 +00:00
vitrioil	ebb7f424b8	Add Tensor.is_cpu (#78887 ) Fixes #76872 Not sure if this is also required. `ac8c6d09d1/torch/csrc/tensor/python_tensor.cpp (L146)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/78887 Approved by: https://github.com/ezyang	2022-06-06 22:01:12 +00:00
samdow	184e0065b3	add better error message for class method Pull Request resolved: https://github.com/pytorch/pytorch/pull/78821 Approved by: https://github.com/ezyang	2022-06-06 13:31:32 +00:00
Allen Goodman	bc84143152	Orthogonal Polynomials (#78304 ) ```Python chebyshev_polynomial_v(input, n, *, out=None) -> Tensor ``` Chebyshev polynomial of the third kind $V_{n}(\text{input})$. ```Python chebyshev_polynomial_w(input, n, *, out=None) -> Tensor ``` Chebyshev polynomial of the fourth kind $W_{n}(\text{input})$. ```Python legendre_polynomial_p(input, n, *, out=None) -> Tensor ``` Legendre polynomial $P_{n}(\text{input})$. ```Python shifted_chebyshev_polynomial_t(input, n, *, out=None) -> Tensor ``` Shifted Chebyshev polynomial of the first kind $T_{n}^{\ast}(\text{input})$. ```Python shifted_chebyshev_polynomial_u(input, n, *, out=None) -> Tensor ``` Shifted Chebyshev polynomial of the second kind $U_{n}^{\ast}(\text{input})$. ```Python shifted_chebyshev_polynomial_v(input, n, *, out=None) -> Tensor ``` Shifted Chebyshev polynomial of the third kind $V_{n}^{\ast}(\text{input})$. ```Python shifted_chebyshev_polynomial_w(input, n, *, out=None) -> Tensor ``` Shifted Chebyshev polynomial of the fourth kind $W_{n}^{\ast}(\text{input})$. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78304 Approved by: https://github.com/mruberry	2022-06-03 22:38:56 +00:00
Allen Goodman	4a5381ab40	Bessel functions (#78451 ) Adds: ```Python bessel_j0(input, *, out=None) -> Tensor ``` Bessel function of the first kind of order $0$, $J_{0}(\text{input})$. ```Python bessel_j1(input, *, out=None) -> Tensor ``` Bessel function of the first kind of order $1$, $J_{1}(\text{input})$. ```Python bessel_y0(input, *, out=None) -> Tensor ``` Bessel function of the second kind of order $0$, $Y_{0}(\text{input})$. ```Python bessel_y1(input, *, out=None) -> Tensor ``` Bessel function of the second kind of order $1$, $Y_{1}(\text{input})$. ```Python modified_bessel_i0(input, *, out=None) -> Tensor ``` Modified Bessel function of the first kind of order $0$, $I_{0}(\text{input})$. ```Python modified_bessel_i1(input, *, out=None) -> Tensor ``` Modified Bessel function of the first kind of order $1$, $I_{1}(\text{input})$. ```Python modified_bessel_k0(input, *, out=None) -> Tensor ``` Modified Bessel function of the second kind of order $0$, $K_{0}(\text{input})$. ```Python modified_bessel_k1(input, *, out=None) -> Tensor ``` Modified Bessel function of the second kind of order $1$, $K_{1}(\text{input})$. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78451 Approved by: https://github.com/mruberry	2022-06-02 14:06:20 +00:00
samdow	aa06d05297	enable with semantics Pull Request resolved: https://github.com/pytorch/pytorch/pull/78214 Approved by: https://github.com/ezyang, https://github.com/zou3519	2022-06-01 21:14:45 +00:00
Allen Goodman	64e0d0c4fe	Laguerre polynomial (#78366 ) Adds: ```Python laguerre_polynomial_l(input, n, *, out=None) -> Tensor ``` Laguerre polynomial $L_{n}(\text{input})$. ## Derivatives Recommended $k$-derivative formula with respect to $\text{input}$: $$\frac{d^{k}}{d \times \text{input}^{k}} L_{n}(\text{input}) = (-1)^{k} \times L_{-k + n}^{k}(\text{input})$$ where $L_{n}^{\alpha}$ is the associated Laguerre polynomial. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78366 Approved by: https://github.com/mruberry	2022-05-30 17:24:00 +00:00
Allen Goodman	9dc6d42c18	Probabilist’s Hermite polynomial (#78357 ) Adds: ```Python hermite_polynomial_he(input, n, *, out=None) -> Tensor ``` Probabilist’s Hermite polynomial $He_{n}(\text{input})$. If $n = 0$, $1$ is returned. If $n = 1$, $\text{input}$ is returned. Otherwise, the recursion: $$He_{n + 1}(\text{input}) = \text{input} \times He_{n}(\text{input}) - n \times He_{n - 1}(\text{input})$$ is evaluated. ## Derivatives Recommended $k$-derivative formula with respect to $\text{input}$: $$\frac{d^{k}}{d \times \text{input}^{k}} He_{n}^{(k)} = \frac{n!}{(n - k)!}He_{n - k}(\text{input}).$$ Pull Request resolved: https://github.com/pytorch/pytorch/pull/78357 Approved by: https://github.com/mruberry	2022-05-28 13:56:12 +00:00
Allen Goodman	18273c39da	Physicist’s Hermite polynomial (#78352 ) Adds: ```Python hermite_polynomial_h(input, n, *, out=None) -> Tensor ``` Physicist’s Hermite polynomial $H_{n}(\text{input})$. If $n = 0$, $1$ is returned. If $n = 1$, $2 \times \text{input}$ is returned. Otherwise, the recursion: $$H_{n + 1}(\text{input}) = 2 \times \text{input} \times H_{n}(\text{input}) - 2 \times n \times H_{n - 1}(\text{input})$$ is evaluated. ## Derivatives Recommended $k$-derivative formula with respect to $\text{input}$: $$\frac{d^{k}}{d \times \text{input}^{k}} H_{n}^{(k)} = 2^{k} \times \frac{n!}{(n - k)!}H_{n - k}(\text{input})$$ Pull Request resolved: https://github.com/pytorch/pytorch/pull/78352 Approved by: https://github.com/mruberry	2022-05-28 02:26:30 +00:00
Allen Goodman	40a6cc6cc6	Chebyshev polynomial of the second kind (#78293 ) Adds: ```Python chebyshev_polynomial_u(input, n, *, out=None) -> Tensor ``` Chebyshev polynomial of the second kind $U_{n}(\text{input})$. If $n = 0$, $1$ is returned. If $n = 1$, $2 \times \text{input}$ is returned. If $n < 6$ or $\|\text{input}\| > 1$ the recursion: $$U_{n + 1}(\text{input}) = 2 \times \text{input} \times U_{n}(\text{input}) - U_{n - 1}(\text{input})$$ is evaluated. Otherwise, the explicit trigonometric formula: $$\frac{\text{sin}((n + 1) \times \text{arccos}(\text{input}))}{\text{sin}(\text{arccos}(\text{input}))}$$ is evaluated. ## Derivatives Recommended first derivative formula with respect to $\text{input}$: $$\frac{(-1 - n)\times U_{-1 + n}(\text{input}) + n \times \text{input} \times U_{n}(\text{input})}{-1 + \text{input}^{2}}.$$ Recommended $k$-derivative formula with respect to $\text{n}$: $$\frac{\text{arccos}(\text{input})^{k} \times \text{sin}(\frac{k \times \pi}{2} + (1 + n) \times \text{arccos}(\text{input}))}{\sqrt{1 - \text{input}^{2}}}.$$ ## Example ```Python x = torch.linspace(-1.0, 1.0, 256) matplotlib.pyplot.plot(x, torch.special.chebyshev_polynomial_u(x, 10)) ``` ![image](https://user-images.githubusercontent.com/315821/170352780-12af63d3-ce31-4948-8b68-8ecc37c71ac5.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/78293 Approved by: https://github.com/mruberry	2022-05-27 18:32:11 +00:00
Allen Goodman	029bbe4995	Chebyshev polynomial of the first kind (#78196 ) Adds: ```Python chebyshev_polynomial_t(input, n, *, out=None) -> Tensor ``` Chebyshev polynomial of the first kind $T_{n}(\text{input})$. If $n = 0$, $1$ is returned. If $n = 1$, $\text{input}$ is returned. If $n < 6$ or $\|\text{input}\| > 1$ the recursion: $$T_{n + 1}(\text{input}) = 2 \times \text{input} \times T_{n}(\text{input}) - T_{n - 1}(\text{input})$$ is evaluated. Otherwise, the explicit trigonometric formula: $$T_{n}(\text{input}) = \text{cos}(n \times \text{arccos}(\text{input}))$$ is evaluated. ## Derivatives Recommended $k$-derivative formula with respect to $\text{input}$: $$2^{-1 + k} \times n \times \Gamma(k) \times C_{-k + n}^{k}(\text{input})$$ where $C$ is the Gegenbauer polynomial. Recommended $k$-derivative formula with respect to $\text{n}$: $$\text{arccos}(\text{input})^{k} \times \text{cos}(\frac{k \times \pi}{2} + n \times \text{arccos}(\text{input})).$$ ## Example ```Python x = torch.linspace(-1, 1, 256) matplotlib.pyplot.plot(x, torch.special.chebyshev_polynomial_t(x, 10)) ``` ![image](https://user-images.githubusercontent.com/315821/170125525-60415735-4d49-4cbd-9278-26286413f635.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/78196 Approved by: https://github.com/mruberry	2022-05-26 21:06:44 +00:00
PyTorch MergeBot	d450034f24	Revert "Beta function (#78031 )" This reverts commit `da16450360`. Reverted https://github.com/pytorch/pytorch/pull/78031 on behalf of https://github.com/suo due to broke trunk, see the above message	2022-05-24 22:55:06 +00:00
Brian Hirsh	07e4533403	reland of as_strided support for functionalization; introduce as_strided_scatter This reverts commit `a95f1edd85`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78199 Approved by: https://github.com/ezyang	2022-05-24 22:40:44 +00:00
Allen Goodman	da16450360	Beta function (#78031 ) Euler beta function: ```Python torch.special.beta(input, other, *, out=None) → Tensor ``` `reentrant_gamma` and `reentrant_ln_gamma` implementations (using Stirling’s approximation) are provided. I started working on this before I realized we were missing a gamma implementation (despite providing incomplete gamma implementations). Uses the coefficients computed by Steve Moshier to replicate SciPy’s implementation. Likewise, it mimics SciPy’s behavior (instead of the behavior in Cephes). Pull Request resolved: https://github.com/pytorch/pytorch/pull/78031 Approved by: https://github.com/mruberry	2022-05-24 21:07:25 +00:00
PyTorch MergeBot	a95f1edd85	Revert "as_strided support for functionalization; introduce as_strided_scatter" This reverts commit `3a921f2d26`. Reverted https://github.com/pytorch/pytorch/pull/77128 on behalf of https://github.com/suo due to This broke rocm tests on master `3a921f2d26`. rocm tests are no longer run on PRs, you should add a `ciflow/trunk` label if you want to run them	2022-05-24 20:19:12 +00:00
Brian Hirsh	3a921f2d26	as_strided support for functionalization; introduce as_strided_scatter Pull Request resolved: https://github.com/pytorch/pytorch/pull/77128 Approved by: https://github.com/ezyang	2022-05-24 18:20:31 +00:00
Edward Z. Yang	4941e72e40	Revert "Revert "Implement sym_sizes to create proper IR for sym ints representing tensor sizes (#76836 )"" This reverts commit `c35bd8d423`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77719 Approved by: https://github.com/Chillee, https://github.com/malfet	2022-05-18 18:40:57 +00:00
PyTorch MergeBot	48581d74ad	Revert "Add dispatch mode testing for meta tensors and other stuff" This reverts commit `c1cdb1216b`. Reverted https://github.com/pytorch/pytorch/pull/77477 on behalf of https://github.com/malfet	2022-05-18 02:56:48 +00:00
Edward Z. Yang	c1cdb1216b	Add dispatch mode testing for meta tensors and other stuff We don't have any coverage for meta tensor correctness for backwards because torch function mode can only allow us to interpose on Python torch API calls, but backwards invocations happen from C++. To make this possible, I add torch_dispatch_meta test which runs the tests with __torch_dispatch__. While doing this, I needed to generate fresh expected failure / skip lists for the new test suite, and I discovered that my original scaffolding for this purpose was woefully insufficient. So I rewrote how the test framework worked, and at the same time rewrote the __torch_function__ code to also use the new logic. Here's what's new: - Expected failure / skip is now done on a per function call basis, rather than the entire test. This means that separate OpInfo samples for a function don't affect each other. - There are now only two lists: expect failure list (where the test consistently fails on all runs) and skip list (where the test sometimes passes and fails). - We explicitly notate the dtype that failed. I considered detecting when something failed on all dtypes, but this was complicated and listing everything out seemed to be nice and simple. To keep the dtypes short, I introduce a shorthand notation for dtypes. - Conversion to meta tensors is factored into its own class MetaConverter - To regenerate the expected failure / skip lists, just run with PYTORCH_COLLECT_EXPECT and filter on a specific test type (test_meta or test_dispatch_meta) for whichever you want to update.
Other misc fixes: - Fix max_pool1d to work with BFloat16 in all circumstances, by making it dispatch and then fixing a minor compile error (constexpr doesn't work with BFloat16) - Add resolve_name for turning random torch API functions into string names - Add push classmethod to the Mode classes, so that you can more easily push a mode onto the mode stack - Add some more skips for missing LAPACK - Added an API to let you query if there's already a registration for a function, added a test to check that we register_meta for all decompositions (except detach, that decomp is wrong lol), and then update all the necessary sites to make the test pass. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/77477 Approved by: https://github.com/zou3519	2022-05-18 00:18:34 +00:00
Christian Puhrsch	8c608a79b4	Compressed sparse layout conversion stubs (#77489 ) This PR unifies sparse layout conversions into a single location and adds stubs to raise a Runtime error for unsupported conversions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77489 Approved by: https://github.com/pearu, https://github.com/mruberry	2022-05-16 18:37:42 +00:00
Pearu Peterson	88205886d7	Add ccol_indices and row_indices methods. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77503 Approved by: https://github.com/cpuhrsch	2022-05-16 00:23:54 +00:00
Christian Puhrsch	289192199a	Add to_sparse_bsr (#77366 ) Conversion function of CSR to BSR. Follow up work includes - Conversion from strided, COO, CSC, BSC - autograd Pull Request resolved: https://github.com/pytorch/pytorch/pull/77366 Approved by: https://github.com/IvanYashchuk, https://github.com/mikaylagawarecki	2022-05-13 20:16:03 +00:00
Mikayla Gawarecki	841c65f499	Unprivate _index_reduce and add documentation Pull Request resolved: https://github.com/pytorch/pytorch/pull/76997 Approved by: https://github.com/cpuhrsch	2022-05-13 19:48:38 +00:00
Ivan Yashchuk	890bdf13e1	Remove deprecated torch.solve (#70986 ) The time has come to remove deprecated linear algebra related functions. This PR removes `torch.solve`. cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/70986 Approved by: https://github.com/Lezcano, https://github.com/albanD	2022-05-10 13:44:07 +00:00
PyTorch MergeBot	4ebc4890dd	Revert "Add linalg.lu_solve" This reverts commit `fc5b4a5a33`. Reverted https://github.com/pytorch/pytorch/pull/72935 on behalf of https://github.com/malfet	2022-05-09 19:12:30 +00:00
lezcano	621ff0f973	Add linalg.vander This PR adds `linalg.vander`, the linalg version of `torch.vander`. We add autograd support and support for batched inputs. We also take this chance to improve the docs (TODO: Check that they render correctly!) and add an OpInfo. Discussion: The current default for the `increasing` kwarg is extremely odd as it is the opposite of the classical definition (see [wiki](https://en.wikipedia.org/wiki/Vandermonde_matrix)). This is reflected in the docs, where I make explicit both the odd defaults that we use and the classical definition. See also [this stackoverflow post](https://stackoverflow.com/a/71758047/5280578), which shows how people are confused by these defaults. My take on this would be to correct the default to be `increasing=True` and document the divergence with NumPy (as we do for other `linalg` functions) as: - It is what people expect - It gives the correct determinant called "the Vandermonde determinant" rather than (-1)^{n-1} times the Vandermonde det (ugh). - [Minor] It is more efficient (no `flip` needed) - Since it's under `linalg.vander`, it's strictly not a drop-in replacement for `np.vander`. We will deprecate `torch.vander` in a PR after this one in this stack (once we settle on what's the correct default). Thoughts? mruberry cc kgryte rgommers as they might have some context for the defaults of NumPy. Fixes https://github.com/pytorch/pytorch/issues/60197 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76303 Approved by: https://github.com/albanD, https://github.com/mruberry	2022-05-06 08:44:14 +00:00
lezcano	fc5b4a5a33	Add linalg.lu_solve This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA when calling the batched MAGMA backend with trans=True. We work around that by solving the system as two triangular systems. We also update the heuristics for this function, as they were fairly outdated. We found that cuSolver is king, so luckily we do not need to rely on the buggy backend from magma for this function. We added tests testing this function left and right. We also added tests for the different backends. We also activated the tests for AMD, as those should work as well. Fixes https://github.com/pytorch/pytorch/issues/61657 Pull Request resolved: https://github.com/pytorch/pytorch/pull/72935 Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry	2022-05-05 19:02:13 +00:00
lezcano	7cb7cd5802	Add linalg.lu This PR modifies `lu_unpack` by: - Using less memory when unpacking `L` and `U` - Fusing the subtraction by `-1` with `unpack_pivots_stub` - Defining tensors of the correct types to avoid copies - Porting `lu_unpack` to be a structured kernel so that its `_out` version does not incur extra copies Then we implement `linalg.lu` as a structured kernel, as we want to compute its derivative manually. We do so because composing the derivatives of `torch.lu_factor` and `torch.lu_unpack` would be less efficient. This new function and `lu_unpack` come with all the things they can come with: forward and backward AD, decent docs, correctness tests, OpInfo, complex support, support for metatensors and support for vmap and vmap over the gradients. I really hope we don't continue adding more features. This PR also avoids saving some of the tensors that were previously saved unnecessarily for the backward in `lu_factor_ex_backward` and `lu_backward` and does some other general improvements here and there to the forward and backward AD formulae of other related functions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/67833 Approved by: https://github.com/IvanYashchuk, https://github.com/nikitaved, https://github.com/mruberry	2022-05-05 09:17:05 +00:00
Edward Z. Yang	48eb8d6aad	Use TorchFunctionMode to implement PrimTorch tracing context Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/76735 Approved by: https://github.com/mruberry	2022-05-04 23:49:46 +00:00
Eddie Yan	e838137b3e	Add high level control of fp32 matmul precision; disable TF32 for matmuls by default #76440 CC @mruberry @ptrblck Pull Request resolved: https://github.com/pytorch/pytorch/pull/76509 Approved by: https://github.com/ngimel	2022-05-04 20:40:13 +00:00
samdow	6779366f27	add nested mode to python mode Pull Request resolved: https://github.com/pytorch/pytorch/pull/75965 Approved by: https://github.com/albanD, https://github.com/ezyang, https://github.com/zou3519	2022-05-04 13:01:06 +00:00
Pearu Peterson	436a7be059	Factory functions for sparse CSC, BSR, and BSC tensors Pull Request resolved: https://github.com/pytorch/pytorch/pull/76634 Tests for Sparse Compressed factory functions Pull Request resolved: https://github.com/pytorch/pytorch/pull/76746 Approved by: https://github.com/cpuhrsch	2022-05-04 03:30:41 +00:00
PyTorch MergeBot	bc5307347f	Revert "Add linalg.vander" This reverts commit `1ea49c68d0`. Reverted https://github.com/pytorch/pytorch/pull/76303 on behalf of https://github.com/malfet	2022-05-02 18:50:08 +00:00
Pearu Peterson	e6b4d77c3e	Sparse Compressed tensor factory function 2 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76623 Approved by: https://github.com/cpuhrsch	2022-05-02 17:38:30 +00:00
lezcano	1ea49c68d0	Add linalg.vander This PR adds `linalg.vander`, the linalg version of `torch.vander`. We add autograd support and support for batched inputs. We also take this chance to improve the docs (TODO: Check that they render correctly!) and add an OpInfo. Discussion: The current default for the `increasing` kwarg is extremely odd as it is the opposite of the classical definition (see [wiki](https://en.wikipedia.org/wiki/Vandermonde_matrix)). This is reflected in the docs, where I make explicit both the odd default that we use and the classical definition. See also [this stackoverflow post](https://stackoverflow.com/a/71758047/5280578), which shows how people are confused by this default. My take on this would be to correct the default to be `increasing=True` and document the divergence with NumPy (as we do for other `linalg` functions) as: - It is what people expect - It gives the correct determinant called "the Vandermonde determinant" rather than (-1)^{n-1} times the Vandermonde det (ugh). - [Minor] It is more efficient (no `flip` needed) - Since it's under `linalg.vander`, it's strictly not a drop-in replacement for `np.vander`. We will deprecate `torch.vander` in a PR after this one in this stack (once we settle on what's the correct default). Thoughts? mruberry cc kgryte rgommers as they might have some context for the defaults of NumPy. Fixes https://github.com/pytorch/pytorch/issues/60197 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76303 Approved by: https://github.com/albanD	2022-05-02 15:26:44 +00:00
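The `increasing` discussion in the `linalg.vander` commit above comes down to column order. The following is a minimal pure-Python sketch, not the PyTorch implementation; the helper name `vander` and its list-of-rows return type are assumptions made for illustration.

```python
def vander(xs, n=None, increasing=False):
    """Build a Vandermonde matrix of xs as a list of rows (illustrative helper)."""
    n = len(xs) if n is None else n
    # increasing=True gives the classical layout [1, x, x^2, ...];
    # increasing=False gives the NumPy-style layout [..., x^2, x, 1].
    powers = range(n) if increasing else range(n - 1, -1, -1)
    return [[x ** p for p in powers] for x in xs]


xs = [1, 2, 3]
classical = vander(xs, increasing=True)  # rows are [1, x, x^2]
numpy_style = vander(xs)                 # rows are [x^2, x, 1]
```

The two layouts differ only by a flip of the columns, which is why the commit message notes that the classical ordering needs no `flip`.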
Ivan Yashchuk	8bb7203049	Add torch.linalg.ldl_factor_ex and torch.linalg.ldl_solve This PR adds a function for computing the LDL decomposition and a function that can solve systems of linear equations using this decomposition. The result of `torch.linalg.ldl_factor_ex` is in a compact form and it's required to use it only through `torch.linalg.ldl_solve`. In the future, we could provide `ldl_unpack` function that transforms the compact representation into explicit matrices. Fixes https://github.com/pytorch/pytorch/issues/54847. cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/69828 Approved by: https://github.com/Lezcano, https://github.com/mruberry, https://github.com/albanD	2022-04-28 19:23:37 +00:00
Mikayla Gawarecki	676a4a3969	Prototype _index_reduce (CPU-only) Pull Request resolved: https://github.com/pytorch/pytorch/pull/75981 Approved by: https://github.com/cpuhrsch	2022-04-27 23:01:00 +00:00
Joel Benjamin Schlosser	bc34cf5fe4	Support for tensor subclasses as parameters Pull Request resolved: https://github.com/pytorch/pytorch/pull/73459 Approved by: https://github.com/ezyang, https://github.com/albanD	2022-04-27 19:28:55 +00:00
Kulin Seth	54c75e1e8f	Add "mps" device to PyTorch framework. Remove the "mlc" device for Mac platforms. This commit will be followed up with: * adding MPS runtime components * PyTorch ops for MPS device Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/76291 Approved by: https://github.com/albanD	2022-04-27 19:21:57 +00:00
Brian Hirsh	ea5209c9fd	functionalization: add native fill() op Pull Request resolved: https://github.com/pytorch/pytorch/pull/76084 Approved by: https://github.com/ezyang	2022-04-25 21:34:16 +00:00
kshitij12345	aa51704ce5	[complex32] add chalf alias for complex32 and chalf method Reference: https://github.com/pytorch/pytorch/issues/74537 Adds chalf alias for complex32 and also adds method `chalf` similar to `cfloat, cdouble` TODO: * [x] Add docs * [x] Add override Pull Request resolved: https://github.com/pytorch/pytorch/pull/75320 Approved by: https://github.com/anjali411	2022-04-20 23:44:47 +00:00
albanD	cd0591dff3	Change default TLS behavior in dispatch to favor is-a style Pull Request resolved: https://github.com/pytorch/pytorch/pull/75827 Approved by: https://github.com/ezyang	2022-04-20 17:32:29 +00:00
Edward Z. Yang	ee955b8bb9	Cannibalize noarch CI job into crossref CI job crossref is a new strategy for performing tests when you want to run a normal PyTorch API call, separately run some variation of the API call (e.g., same thing but all the arguments are meta tensors) and then cross-reference the results to see that they are consistent. Any logic you add to CrossRefMode will get run on every PyTorch API call that is called in the course of PyTorch's test suite. This can be a good choice for correctness testing if OpInfo testing is not exhaustive enough. For now, the crossref test doesn't do anything except verify that we can validly push a mode onto the torch function mode stack for all functions. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/75988 Approved by: https://github.com/seemethere	2022-04-20 11:56:25 +00:00
Edward Z. Yang	d9219d2944	Add torch.nn.init to list of overridable functions Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/76014 Approved by: https://github.com/zou3519	2022-04-20 11:55:56 +00:00
Alban Desmaison	3467f3fa80	Remove spurious warning when using disabled torch function Pull Request resolved: https://github.com/pytorch/pytorch/pull/75826 Approved by: https://github.com/ezyang	2022-04-15 17:08:45 +00:00
Scott Wolchok	97c993ca7a	[PyTorch] Add NestedTensor support functions for transformers Pull Request resolved: https://github.com/pytorch/pytorch/pull/75491 Here are the NestedTensor kernels we'll need for the improved transformer implementation. Differential Revision: [D35409275](https://our.internmc.facebook.com/intern/diff/D35409275/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35409275/)! Approved by: https://github.com/cpuhrsch	2022-04-14 16:30:23 +00:00
Brian Hirsh	23b8414391	code-generate non-aliasing {view}_copy kernels (#73442 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73442 Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D35016025 Pulled By: bdhirsh fbshipit-source-id: 2a7f303ec76f5913b744c7822a531d55a57589c9 (cherry picked from commit 3abe13c2a787bcbe9c41b0a335c96e5a3d3642fb)	2022-04-11 19:48:55 +00:00
Edward Z. Yang	0a1bc5f501	Miscellaneous __torch_function__ fixes I figured these out by unconditionally turning on a no-op torch function mode on the test suite and then fixing errors as they showed up. Here's what I found: - _parse_to failed internal assert when __torch_function__'ed because it claims its name is "to" to the argument parser; added a name override so we know how to find the correct name - Infix operator magic methods on Tensor did not uniformly handle __torch_function__ and TypeError to NotImplemented. Now, we always do the __torch_function__ handling in _wrap_type_error_to_not_implemented and your implementation of __torch_function__ gets its TypeErrors converted to NotImplemented (for better or for worse; see https://github.com/pytorch/pytorch/issues/75462 ) - A few cases where code was incorrectly testing if a Tensor was Tensor-like in the wrong way, now use is_tensor_like (in grad and in distributions). Also update docs for has_torch_function to push people to use is_tensor_like. - is_grads_batched was dropped from grad in handle_torch_function, now fixed - Report that you have a torch function even if torch function is disabled if a mode is enabled. This makes it possible for a mode to return NotImplemented, pass to a subclass which does some processing and then pass back to the mode even after the subclass disables __torch_function__ (so the tensors are treated "as if" they are regular Tensors). This brings the C++ handling behavior in line with the Python behavior. - Make the Python implementation of overloaded types computation match the C++ version: when torch function is disabled, there are no overloaded types (because they all report they are not overloaded). Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/75484 Approved by: https://github.com/zou3519	2022-04-11 16:52:16 +00:00
Scott Wolchok	48147675f2	[PyTorch] _addmm_activation native function for matmul/bias/activation fusion Pull Request resolved: https://github.com/pytorch/pytorch/pull/74490 Here's an extended version of addmm that takes advantage of cublasLt's fused addmm + relu/gelu support. Differential Revision: [D35019612](https://our.internmc.facebook.com/intern/diff/D35019612/) Approved by: https://github.com/ngimel	2022-04-08 17:54:09 +00:00
Anthony Barbier	ce9e27a0fc	Add new keys for Graphcore IPU (DispatchKey / Backend / DeviceType) We need a key to register our out of tree backend: https://github.com/graphcore/poptorch Pull Request resolved: https://github.com/pytorch/pytorch/pull/74763 Approved by: https://github.com/bdhirsh	2022-04-07 17:18:45 +00:00
Edward Z. Yang	31c86625cc	__torch_function__ mode Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/75154 Approved by: https://github.com/albanD, https://github.com/zou3519	2022-04-07 02:23:29 +00:00
Peter Bell	1ab03a0f6f	Deprecate `__torch_function__` as instance method in C++ Ref #63767 This has already been deprecated in the python code for a long time, but was never deprecated in the C++ api so it's possible users might not have had sufficient warning yet. Pull Request resolved: https://github.com/pytorch/pytorch/pull/74829 Approved by: https://github.com/ezyang	2022-04-06 02:28:00 +00:00
Mikayla Gawarecki	e9a8e6f74a	Add include_self flag to scatter_reduce Pull Request resolved: https://github.com/pytorch/pytorch/pull/74607 Approved by: https://github.com/cpuhrsch	2022-04-05 16:31:39 +00:00
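The `include_self` flag added in the commit above controls whether the values already in the destination take part in the reduction. Below is a one-dimensional pure-Python sketch of that behavior as we understand it; the helper name `scatter_reduce_1d` is hypothetical, only four reductions are covered, and the real operator works on tensors along an arbitrary `dim`.

```python
import operator


def scatter_reduce_1d(dest, index, src, reduce="sum", include_self=True):
    """Reduce src[k] into position index[k] of a copy of dest (illustrative sketch)."""
    ops = {"sum": operator.add, "prod": operator.mul, "amax": max, "amin": min}
    op = ops[reduce]
    out = list(dest)
    seen = set()
    for i, s in zip(index, src):
        if not include_self and i not in seen:
            # First write to this slot: drop the original value entirely.
            out[i] = s
        else:
            out[i] = op(out[i], s)
        seen.add(i)
    return out


dest, index, src = [10, 20, 30], [0, 0, 2], [1, 2, 3]
with_self = scatter_reduce_1d(dest, index, src, "sum", include_self=True)
without_self = scatter_reduce_1d(dest, index, src, "sum", include_self=False)
```

Slots that no index points to (position 1 here) keep their original value in both modes.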
Peter Bell	bf16552617	Restore TestTorchFunctionOverride Fixes #74122 This re-enables TestTorchFunctionOverride and fixes a bunch of test failures that had crept in while it was disabled. Pull Request resolved: https://github.com/pytorch/pytorch/pull/74202 Approved by: https://github.com/ezyang	2022-04-04 01:26:20 +00:00
Mikayla Gawarecki	2bfa018462	[BC-breaking] Use ScatterGatherKernel for scatter_reduce (CPU-only) (#74226 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74226 Update signature of `scatter_reduce_` to match `scatter_/scatter_add_` `Tensor.scatter_reduce_(int64 dim, Tensor index, Tensor src, str reduce)` - Add new reduction options in ScatterGatherKernel.cpp and update `scatter_reduce` to call into the cpu kernel for `scatter.reduce` - `scatter_reduce` now has the same shape constraints as `scatter_` and `scatter_add_` - Migrate `test/test_torch.py:test_scatter_reduce` to `test/test_scatter_gather_ops.py` Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D35222842 Pulled By: mikaylagawarecki fbshipit-source-id: 84930add2ad30baf872c495251373313cb7428bd (cherry picked from commit 1b45139482e22eb0dc8b6aec2a7b25a4b58e31df)	2022-04-01 05:57:45 +00:00
Sherlockk Huang	bbf7e159e0	Implement torch.special.log_ndtr Implements torch.special.log_ndtr Issue: https://github.com/pytorch/pytorch/issues/50345 TODO: - [x] adding proper reference to scipy implementation - [x] double check if the changes in test/test_unary_ufuncs.py is really necessary - [x] check setting for UnaryUfuncInfo cc: @kshitij12345 @mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/74795 Approved by: https://github.com/anjali411	2022-03-29 23:13:37 +00:00
Scott Wolchok	f9d0bc5338	[PyTorch] Delete NestedTensor Python wrapper (#74691 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74691 The wrapper just called through to methods on the underlying Tensor. ghstack-source-id: 152433754 Test Plan: existing tests Reviewed By: ezyang Differential Revision: D34689789 fbshipit-source-id: cf53476780cf3ed00a3aa4add441300bfe8e27ce (cherry picked from commit 5a9e5eb6bc13eb30be6e3c3bc4ac954c92704198)	2022-03-29 19:13:40 +00:00
Christian Puhrsch	e55b73d65a	Add strided layout support for to_dense Fixes #59958 Pull Request resolved: https://github.com/pytorch/pytorch/pull/74486 Approved by: https://github.com/pearu, https://github.com/suo	2022-03-29 00:12:48 +00:00
Christian Puhrsch	7fe0b6a5cd	mul(sparse_csr, sparse_csr) using mul(sparse, sparse) Basic fallback implementation. Let's make this faster once used. NOTE: This is stacked on top of https://github.com/pytorch/pytorch/pull/74294 Pull Request resolved: https://github.com/pytorch/pytorch/pull/74266 Approved by: https://github.com/pearu, https://github.com/malfet	2022-03-25 17:10:33 +00:00
Edward Z. Yang	a5b848aec1	Use has_torch_function_unary instead of manual type test. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/74278 Approved by: https://github.com/albanD	2022-03-17 02:14:40 +00:00
Scott Wolchok	d4a4430059	[PyTorch] Add Tensor.is_nested (#73999 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73999 Seems to be the typical way to detect a flavor of TensorImpl. ghstack-source-id: 151440167 Test Plan: Existing tests? Reviewed By: ezyang Differential Revision: D34665269 fbshipit-source-id: 5081a00928933e0c5252eeddca43bae0b026013d (cherry picked from commit 7cf62a3f69f158a33c5108f7e96ea4c5520f0f15)	2022-03-16 17:04:30 +00:00
Edward Z. Yang	35cfa74f97	Add a default implementation of __torch_dispatch__ I was working on an explanation of how to call into the "super" implementation of some given ATen operation inside of __torch_dispatch__ (https://github.com/albanD/subclass_zoo/blob/main/trivial_tensors.py) and I kept thinking to myself "Why doesn't just calling super() on __torch_dispatch__ work"? Well, after this patch, it does! The idea is if you don't actually unwrap the input tensors, you can call super().__torch_dispatch__ to get at the original behavior. Internally, this is implemented by disabling PythonKey and then redispatching. This implementation of disabled_torch_dispatch is not /quite/ right, and some reasons why are commented in the code. There is then some extra work I have to do to make sure we recognize disabled_torch_dispatch as the "default" implementation (so we don't start slapping PythonKey on all tensors, including base Tensors), which is modeled the same way as how disabled_torch_function is done. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/73684 Approved by: albanD	2022-03-03 20:19:33 +00:00
Nikita Shulga	cfb6c942fe	`scatter_reduce` documentation (#73125 ) Summary: Reland of https://github.com/pytorch/pytorch/issues/68580 (which were milestoned for 1.11) plus partial revert of https://github.com/pytorch/pytorch/pull/72543 Pull Request resolved: https://github.com/pytorch/pytorch/pull/73125 Reviewed By: bdhirsh Differential Revision: D34355217 Pulled By: malfet fbshipit-source-id: 325ecdeaf53183d653b44ee5e6e8839ceefd9200 (cherry picked from commit `71db31748a`)	2022-02-22 19:33:46 +00:00
Scott Wolchok	79a216ce57	Move native MHA code out of PyTorch core (#72944 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72944 Doesn't make sense to develop it in core right now. ghstack-source-id: 149456040 Test Plan: CI run MHA benchmark in benchmark_transformers.py to make sure it doesn't crash Reviewed By: zrphercule Differential Revision: D34283104 fbshipit-source-id: 4f0c7a6bc066f938ceac891320d4cf4c3f8a9cd6 (cherry picked from commit `b9df65e97c`)	2022-02-18 21:34:06 +00:00
Brian Hirsh	f87f753bb9	avoiding adding some functions to the public python API before 1.11 release (#72543 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72543 Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D34085724 Pulled By: bdhirsh fbshipit-source-id: 941d5a90a6fa5328268d623e0e2b01577e4132ca (cherry picked from commit `6676a0c79a`)	2022-02-14 19:49:01 +00:00
Ryan Spring	4f8b986e28	Implement Tanh Gelu Approximation (#61439 ) Summary: 1. Implements https://github.com/pytorch/pytorch/issues/39853 2. Adds approximate string flag to Gelu 3. Enables Tanh Gelu approximation 4. Adds double backward support for Gelu 5. Enable Tanh Gelu in NvFuser
```
def gelu(x, approximate : str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```
Linking XLA PR - https://github.com/pytorch/xla/pull/3039 Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439 Reviewed By: VitalyFedyunin Differential Revision: D33894937 Pulled By: jbschlosser fbshipit-source-id: b65e8fb6ea66168af8f34f45ed50e92737a33851 (cherry picked from commit `6e986f91a9`)	2022-02-14 03:40:32 +00:00
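The pseudocode in the commit message above leaves `normcdf` undefined and depends on `torch`. As a rough, scalar-only sketch of the same two branches using only the standard library (the exact branch is written with `math.erf`, which is an equivalent form of `x * normcdf(x)`; this is not the kernel the PR adds):

```python
import math


def gelu(x, approximate="none"):
    """Scalar GELU with the exact form or the tanh approximation."""
    if approximate == "tanh":
        c = math.sqrt(2.0 / math.pi)  # about 0.7978845608028654
        return 0.5 * x * (1.0 + math.tanh(c * (x + 0.044715 * x ** 3)))
    # Exact form: x * Phi(x), with Phi the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Compare the two branches at a few sample points.
samples = [-3.0, -1.0, 0.0, 0.5, 1.0, 3.0]
max_gap = max(abs(gelu(x) - gelu(x, "tanh")) for x in samples)
```

On these sample points the two branches agree closely, which is the reason the tanh form is usable as a cheaper drop-in.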
Brian Muse	8bf3179f6e	#71946 Remove Python 3.6 references (#72211 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/71946 This commit removes some bits of code that were hard coded for Python 3.6 support from the `.circleci` and `torch` folders. It should only be merged if https://github.com/pytorch/pytorch/issues/66462 is complete. Pull Request resolved: https://github.com/pytorch/pytorch/pull/72211 Reviewed By: dagitses, seemethere Differential Revision: D33982604 Pulled By: musebc fbshipit-source-id: 8f453bf9909df615addd59538adb369c65484044 (cherry picked from commit `944a9970fe`)	2022-02-08 03:46:20 +00:00
Rui Zhu	541773d268	Make native MHA private for release 1.11 (#72200 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72200 This op should remain private in release 1.11; add an underscore before the op name to make that happen Test Plan: buck run mode/opt -c fbcode.enable_gpu_sections=true pytext/fb/tools:benchmark_transformers -- mha --batch-size=10 --max-sequence-length=16 Reviewed By: bdhirsh Differential Revision: D33952191 fbshipit-source-id: 3f8525ac9c23bb286f51476342113ebc31b8ed59 (cherry picked from commit `6e41bfa4fc`)	2022-02-03 04:15:18 +00:00
Nikita Shulga	74c44ba9d6	Revert D33850228: [pytorch][PR] Implement Tanh Gelu Approximation Test Plan: revert-hammer Differential Revision: D33850228 (`23d03025dc`) Original commit changeset: 3cc33fb298e4 Original Phabricator Diff: D33850228 (`23d03025dc`) fbshipit-source-id: 9436e7df73c2b2e2011f321674f24973316d3692 (cherry picked from commit `c9efb58223`)	2022-01-31 17:44:19 +00:00
Ryan Spring	23d03025dc	Implement Tanh Gelu Approximation (#61439 ) Summary: 1. Implements https://github.com/pytorch/pytorch/issues/39853 2. Adds approximate string flag to Gelu 3. Enables Tanh Gelu approximation 4. Adds double backward support for Gelu 5. Enable Tanh Gelu in NvFuser
```
def gelu(x, approximate : str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```
Linking XLA PR - https://github.com/pytorch/xla/pull/3039 Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439 Reviewed By: cpuhrsch Differential Revision: D33850228 Pulled By: jbschlosser fbshipit-source-id: 3cc33fb298e480d7ecc5c67716da019d60c6ab33 (cherry picked from commit `3a53b3e94f`)	2022-01-31 17:07:45 +00:00
Joel Schlosser	cb823d9f07	Revert D33744717: [pytorch][PR] Implement Tanh Gelu Approximation Test Plan: revert-hammer Differential Revision: D33744717 (`f499ab9cef`) Original commit changeset: d64532a562ed Original Phabricator Diff: D33744717 (`f499ab9cef`) fbshipit-source-id: 396c3f63de5865f894dbc353d0790a01a624be93 (cherry picked from commit `e9fb2d1db1`)	2022-01-28 18:35:01 +00:00
Ryan Spring	f499ab9cef	Implement Tanh Gelu Approximation (#61439 ) Summary: 1. Implements https://github.com/pytorch/pytorch/issues/39853 2. Adds approximate string flag to Gelu 3. Enables Tanh Gelu approximation 4. Adds double backward support for Gelu 5. Enable Tanh Gelu in NvFuser
```
def gelu(x, approximate : str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```
Linking XLA PR - https://github.com/pytorch/xla/pull/3039 Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439 Reviewed By: mikaylagawarecki Differential Revision: D33744717 Pulled By: jbschlosser fbshipit-source-id: d64532a562ed53247bb4fa52bb16722634d5c187 (cherry picked from commit `4713dd9cca`)	2022-01-28 16:59:09 +00:00
Mikayla Gawarecki	fdec94504f	Rename _scatter_reduce to scatter_reduce and make it unstructured (#71787 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71787 Test Plan: Imported from OSS Reviewed By: mikaylagawarecki Differential Revision: D33778524 Pulled By: cpuhrsch fbshipit-source-id: 55a330e1c2227c0eaaa1c0d2f9205a4dee24a11b (cherry picked from commit `6e4a8a91da`)	2022-01-27 16:29:13 +00:00
lezcano	108b37db84	[Array API] Add linalg.diagonal (#70599 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70599 This PR adds `linalg.diagonal` following the Array API: https://data-apis.org/array-api/latest/extensions/linear_algebra_functions.html#linalg-diagonal-x-axis1-0-axis2-1-offset-0 Fixes https://github.com/pytorch/pytorch/issues/62813 cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano rgommers pmeier asmeurer leofang AnirudhDagar asi1024 emcastillo kmaehashi Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D33760506 Pulled By: mruberry fbshipit-source-id: e32c3490321d8c3f31b3bb538bc1f72b39bd2854 (cherry picked from commit `44f41f8e39`)	2022-01-26 08:08:32 +00:00
mingfeima	054b90f0d6	add channels last support for ChannelShuffle (#50247 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50247 Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D26007052 Pulled By: VitalyFedyunin fbshipit-source-id: 08f737d64a65791c8002ffd56b79b02cf14d6159	2022-01-14 11:55:21 -08:00
Rui Zhu	9267fd8d73	[WIP] [ATen] Add native_multi_attention_self_attention CPU + GPU implementation (#70649 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70649 As described in https://fb.quip.com/oxpiA1uDBjgP This implements the first parts of the RFC, and is a rough draft showing the approach. The idea is that for the first cut we can maintain very close (identical I believe in this diff) numerical equivalence to the existing nn.MHA implementation, which is what this diff attempts to do. In subsequent implementations, once we have a working and adopted native self-attention implementation, we could then explore alternative implementations, etc. The current implementation is similar to existing dedicated implementations such as LightSeq/FasterTransformer/DeepSpeed, and for MHA on both CPUs and GPUs is between 1.2x and 2x faster depending on the setting. It makes some approximations/restrictions (doesn't handle masking in masked softmax, etc), but these shouldn't materially impact performance. This does the first few items: * add native_multi_head_attention(...) , native_multi_head_attention_backward(..) to native_functions.yaml * Implement native_multi_head_attention(..) on GPU, extracting bits and pieces out of LS/DS/FT as appropriate * Implement native_multi_head_attention(..) on CPU The backward implementation is still WIP, but the idea would be to: * Hook these up in derivatives.yaml * Implement native_multi_head_attention_backward(..) on GPU, extracting bits and pieces out of LS/DS (not FT since it’s inference only) * Implement native_multi_head_attention_backward(..) on CPU * In torch.nn.functional.multi_head_attention_forward `23321ba7a3/torch/nn/functional.py (L4953)`, add some conditionals to check if we are being called in a BERT/ViT-style encoder fashion, and invoke the native function directly.
Test Plan: TODO Reviewed By: mikekgfb Differential Revision: D31829981 fbshipit-source-id: c430344d91ba7a5fbee3138e50b3e62efbb33d96	2022-01-08 21:50:41 -08:00
lezcano	a35b4b49d2	Add linalg.lu_factor (#66933 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66933 This PR exposes `torch.lu` as `torch.linalg.lu_factor` and `torch.linalg.lu_factor_ex`. This PR also adds support for matrices with zero elements both in the size of the matrix and the batch. Note that this function simply returns empty tensors of the correct size in this case. We add a test and an OpInfo for the new function. This PR also adds documentation for this new function in line with the documentation in the rest of `torch.linalg`. Fixes https://github.com/pytorch/pytorch/issues/56590 Fixes https://github.com/pytorch/pytorch/issues/64014 cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano Test Plan: Imported from OSS Reviewed By: gchanan Differential Revision: D32834069 Pulled By: mruberry fbshipit-source-id: 51ef12535fa91d292f419acf83b800b86ee9c7eb	2022-01-05 20:32:12 -08:00
Heitor Schueroff	34c49d3d3b	Document torch.quantile interpolation kwarg (#70637 ) Summary: clone of https://github.com/pytorch/pytorch/pull/59397 This PR documents the interpolation kwarg parameter added in https://github.com/pytorch/pytorch/issues/49267. Now that the forward compatibility period is over, we can expose this parameter. Pull Request resolved: https://github.com/pytorch/pytorch/pull/70637 Reviewed By: jbschlosser Differential Revision: D33411707 Pulled By: anjali411 fbshipit-source-id: f5f2d0a6739b3a855bbdf58fc671ac2f0342ce69	2022-01-05 11:02:13 -08:00
Joel Schlosser	e6c3aa3880	Remove backward ops for mkldnn convolution (#70467 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70467 Test Plan: Imported from OSS Reviewed By: mikaylagawarecki Differential Revision: D33342476 Pulled By: jbschlosser fbshipit-source-id: 9811d02b16adea0dd1dd2500261f4b3b294d2dee	2021-12-30 14:29:22 -08:00
anjali411	3e6164449f	Add efficient zero tensors (#64837 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64837 Test Plan: Imported from OSS Reviewed By: gchanan Differential Revision: D32834987 Pulled By: anjali411 fbshipit-source-id: 20ea08ade0db0044ca633d9c1a117a6a2e65d1fd	2021-12-08 10:37:39 -08:00
Mark Richardson	834bd3134e	Back out "Add efficient zero tensors" (#69327 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69327 Original commit changeset: d44096d88265 Original Phabricator Diff: D32144240 (`668574af4a`) Test Plan: CI original diff failed 175 builds in CI Reviewed By: airboyang, anjali411 Differential Revision: D32809407 fbshipit-source-id: c7c8e69bcee0274992e2d5da901f035332e60071	2021-12-02 19:11:41 -08:00
anjali411	668574af4a	Add efficient zero tensors (#64837 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64837 Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D32144240 Pulled By: anjali411 fbshipit-source-id: d44096d882657c7f9270a16636900e0b73cefa40	2021-12-02 08:47:45 -08:00
Mike Ruberry	6ae34ea6f8	Revert D32521980: Add linalg.lu_factor Test Plan: revert-hammer Differential Revision: D32521980 (`b10929a14a`) Original commit changeset: 26a49ebd87f8 fbshipit-source-id: e1a6bb9c2ece9bd78190fe17e16a46e3358c5c82	2021-11-28 17:22:15 -08:00
lezcano	b10929a14a	Add linalg.lu_factor (#66933 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66933 This PR exposes `torch.lu` as `torch.linalg.lu_factor` and `torch.linalg.lu_factor_ex`. This PR also adds support for matrices with zero elements both in the size of the matrix and the batch. Note that this function simply returns empty tensors of the correct size in this case. We add a test and an OpInfo for the new function. This PR also adds documentation for this new function in line with the documentation in the rest of `torch.linalg`. Fixes https://github.com/pytorch/pytorch/issues/56590 Fixes https://github.com/pytorch/pytorch/issues/64014 cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D32521980 Pulled By: mruberry fbshipit-source-id: 26a49ebd87f8a41472f8cd4e9de4ddfb7f5581fb	2021-11-27 17:52:48 -08:00
lezcano	b46c89d950	Add linalg.solve_triangular (#63568 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63568 This PR adds the first solver with structure to `linalg`. This solver has an API compatible with that of `linalg.solve` preparing these for a possible future merge of the APIs. The new API: - Just returns the solution, rather than the solution and a copy of `A` - Removes the confusing `transpose` argument and replaces it by a correct handling of conj and strides within the call - Adds a `left=True` kwarg. This can be achieved via transposes of the inputs and the result, but it's exposed for convenience. This PR also implements a dataflow that minimises the number of copies needed before calling LAPACK / MAGMA / cuBLAS and takes advantage of the conjugate and neg bits. This algorithm is implemented for `solve_triangular` (which, for this, is the most complex of all the solvers due to the `upper` parameters). Once more solvers are added, we will factor out this calling algorithm, so that all of them can take advantage of it. Given the complexity of this algorithm, we implement some thorough testing. We also added tests for all the backends, which was not done before. We also add forward AD support for `linalg.solve_triangular` and improve the docs of `linalg.solve_triangular`. We also fix a few issues with those of `torch.triangular_solve`. Resolves https://github.com/pytorch/pytorch/issues/54258 Resolves https://github.com/pytorch/pytorch/issues/56327 Resolves https://github.com/pytorch/pytorch/issues/45734 cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano Test Plan: Imported from OSS Reviewed By: jbschlosser Differential Revision: D32588230 Pulled By: mruberry fbshipit-source-id: 69e484849deb9ad7bb992cc97905df29c8915910	2021-11-22 12:41:06 -08:00
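The commit above notes that `left=False` "can be achieved via transposes of the inputs and the result". A small pure-Python sketch of that idea for a single right-hand-side vector follows; the function here is a hypothetical stand-in that mirrors the `upper` and `left` keyword names, not the LAPACK/MAGMA/cuBLAS-backed implementation, and it ignores batching, strides, and conjugation.

```python
def solve_triangular(A, b, *, upper, left=True):
    """Solve A x = b (left=True) or x A = b (left=False) for triangular A."""
    n = len(A)
    if not left:
        # x A = b is the same as A^T x = b, and transposing A
        # turns an upper-triangular matrix into a lower-triangular one.
        A = [[A[j][i] for j in range(n)] for i in range(n)]
        upper = not upper
    x = [0.0] * n
    # Back substitution walks rows bottom-up; forward substitution top-down.
    rows = range(n - 1, -1, -1) if upper else range(n)
    for i in rows:
        cols = range(i + 1, n) if upper else range(i)
        acc = sum(A[i][j] * x[j] for j in cols)
        x[i] = (b[i] - acc) / A[i][i]
    return x


A = [[2.0, 1.0], [0.0, 4.0]]
b = [4.0, 8.0]
x_left = solve_triangular(A, b, upper=True)               # solves A x = b
x_right = solve_triangular(A, b, upper=True, left=False)  # solves x A = b
```

Only the solution is returned, matching the commit's point that the new API no longer hands back a copy of `A`.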
jiej	ca92111758	Add native_dropout (#63937 ) Summary: Adds native_dropout to have a reasonable target for torchscript in auto diff. native_dropout has scale and train as arguments in its signature, this makes native_dropout more consistent with other operators and removes conditionals in the autodiff definition. cc gmagogsfm Pull Request resolved: https://github.com/pytorch/pytorch/pull/63937 Reviewed By: mruberry Differential Revision: D32477657 Pulled By: ngimel fbshipit-source-id: d37b137a37acafa50990f60c77f5cea2818454e4	2021-11-18 19:41:10 -08:00
Jane Xu	9f4e004abd	Revert D32283178: Add linalg.solve_triangular Test Plan: revert-hammer Differential Revision: D32283178 (`0706607abc`) Original commit changeset: deb672e6e52f fbshipit-source-id: d2a3421292147426cc61c2f063b721acf9004755	2021-11-18 14:46:10 -08:00
lezcano	0706607abc	Add linalg.solve_triangular (#63568 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63568 This PR adds the first solver with structure to `linalg`. This solver has an API compatible with that of `linalg.solve` preparing these for a possible future merge of the APIs. The new API: - Just returns the solution, rather than the solution and a copy of `A` - Removes the confusing `transpose` argument and replaces it by a correct handling of conj and strides within the call - Adds a `left=True` kwarg. This can be achieved via transposes of the inputs and the result, but it's exposed for convenience. This PR also implements a dataflow that minimises the number of copies needed before calling LAPACK / MAGMA / cuBLAS and takes advantage of the conjugate and neg bits. This algorithm is implemented for `solve_triangular` (which, for this, is the most complex of all the solvers due to the `upper` parameters). Once more solvers are added, we will factor out this calling algorithm, so that all of them can take advantage of it. Given the complexity of this algorithm, we implement some thorough testing. We also added tests for all the backends, which was not done before. We also add forward AD support for `linalg.solve_triangular` and improve the docs of `linalg.solve_triangular`. We also fix a few issues with those of `torch.triangular_solve`. Resolves https://github.com/pytorch/pytorch/issues/54258 Resolves https://github.com/pytorch/pytorch/issues/56327 Resolves https://github.com/pytorch/pytorch/issues/45734 cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano Test Plan: Imported from OSS Reviewed By: zou3519, JacobSzwejbka Differential Revision: D32283178 Pulled By: mruberry fbshipit-source-id: deb672e6e52f58b76536ab4158073927a35e43a8	2021-11-18 09:45:51 -08:00
Rok	952ca25daa	Sparse CSR: add `convert_indices_from_csr_to_coo` (#66774 ) Summary: This PR adds conversion from CSR to COO. Fixes https://github.com/pytorch/pytorch/issues/56959 cc nikitaved pearu cpuhrsch IvanYashchuk gchanan mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/66774 Reviewed By: zou3519 Differential Revision: D32288415 Pulled By: cpuhrsch fbshipit-source-id: 683ba658dc46835fdf3c0e24645c0c2bb243b968	2021-11-17 22:28:30 -08:00
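To make the CSR-to-COO index conversion concrete: a CSR matrix stores compressed row pointers, while COO stores one explicit row index per non-zero. The sketch below expands row pointers into row indices in plain Python. The name `csr_row_pointers_to_coo_rows` is hypothetical and the code only illustrates the idea; it is not the operator added by the PR.

```python
def csr_row_pointers_to_coo_rows(crow_indices):
    """Expand CSR compressed row pointers into per-element COO row indices.

    crow_indices has length n_rows + 1; row r owns the non-zeros in the
    half-open range [crow_indices[r], crow_indices[r + 1]).
    Hypothetical helper for illustration only.
    """
    rows = []
    for row in range(len(crow_indices) - 1):
        count = crow_indices[row + 1] - crow_indices[row]
        rows.extend([row] * count)
    return rows
```

With `crow_indices = [0, 2, 2, 3]`, row 0 holds two non-zeros, row 1 is empty, and row 2 holds one, so the COO row indices are `[0, 0, 2]`. The column indices are the same in both formats and need no conversion.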
rusty1s	9807787135	`scatter_reduce` (#68115 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/63780 Basic functionality of a `scatter_reduce` algorithm with `reduce="sum"`: * `scatter_reduce` is named as `scatter_reduce2` due to compiling issues * It currently re-uses functionality from `scatter_add` * Tests are missing: WIP The error when the `scatter_reduce` naming is used: ``` In file included from aten/src/ATen/core/TensorBody.h:3, from ../aten/src/ATen/core/Tensor.h:3, from ../aten/src/ATen/DeviceGuard.h:4, from ../aten/src/ATen/ATen.h:11, from aten/src/ATen/native/cpu/CopyKernel.cpp.DEFAULT.cpp:1: aten/src/ATen/Operators.h:13949:18: error: redefinition of ‘struct at::_ops::scatter_reduce’ 13949 \| struct TORCH_API scatter_reduce { \| ^~~~~~~~~~~~~~ aten/src/ATen/Operators.h:13817:18: note: previous definition of ‘struct at::_ops::scatter_reduce’ 13817 \| struct TORCH_API scatter_reduce { \| ^~~~~~~~~~~~~~ aten/src/ATen/Operators.h:13960:18: error: redefinition of ‘struct at::_ops::scatter_reduce_out’ 13960 \| struct TORCH_API scatter_reduce_out { \| ^~~~~~~~~~~~~~~~~~ aten/src/ATen/Operators.h:13839:18: note: previous definition of ‘struct at::_ops::scatter_reduce_out’ 13839 \| struct TORCH_API scatter_reduce_out { \| ^~~~~~~~~~~~~~~~~~ In file included from ../aten/src/ATen/core/Tensor.h:3, from ../aten/src/ATen/DeviceGuard.h:4, from ../aten/src/ATen/ATen.h:11, from aten/src/ATen/native/cpu/CopyKernel.cpp.DEFAULT.cpp:1: aten/src/ATen/core/TensorBody.h: In member function ‘at::Tensor at::Tensor::scatter_reduce(int64_t, const at::Tensor&, c10::string_view, c10::optional<long int>) const’: aten/src/ATen/core/TensorBody.h:3976:83: error: cannot convert ‘c10::string_view’ {aka ‘c10::basic_string_view<char>’} to ‘const at::Tensor&’ 3976 \| return at::_ops::scatter_reduce::call(const_cast<Tensor&>(*this), dim, index, reduce, output_size); \| ^~~~~~ \| \| \| c10::string_view {aka c10::basic_string_view<char>} In file included from 
aten/src/ATen/core/TensorBody.h:3, from ../aten/src/ATen/core/Tensor.h:3, from ../aten/src/ATen/DeviceGuard.h:4, from ../aten/src/ATen/ATen.h:11, from aten/src/ATen/native/cpu/CopyKernel.cpp.DEFAULT.cpp:1: aten/src/ATen/Operators.h:13824:109: note: initializing argument 4 of ‘static at::Tensor at::_ops::scatter_reduce::call(const at::Tensor&, int64_t, const at::Tensor&, const at::Tensor&, c10::string_view)’ 13824 \| static at::Tensor call(const at::Tensor & self, int64_t dim, const at::Tensor & index, const at::Tensor & src, c10::string_view reduce); \| ~~~~~~~~~~~~~~~~~~~^~~ In file included from ../aten/src/ATen/ATen.h:15, from aten/src/ATen/native/cpu/CopyKernel.cpp.DEFAULT.cpp:1: aten/src/ATen/Functions.h: In function ‘at::Tensor at::scatter_reduce(const at::Tensor&, int64_t, const at::Tensor&, c10::string_view, c10::optional<long int>)’: aten/src/ATen/Functions.h:7119:61: error: cannot convert ‘c10::string_view’ {aka ‘c10::basic_string_view<char>’} to ‘const at::Tensor&’ 7119 \| return at::_ops::scatter_reduce::call(self, dim, index, reduce, output_size); \| ^~~~~~ \| \| \| c10::string_view {aka c10::basic_string_view<char>} In file included from aten/src/ATen/core/TensorBody.h:3, from ../aten/src/ATen/core/Tensor.h:3, from ../aten/src/ATen/DeviceGuard.h:4, from ../aten/src/ATen/ATen.h:11, from aten/src/ATen/native/cpu/CopyKernel.cpp.DEFAULT.cpp:1: aten/src/ATen/Operators.h:13824:109: note: initializing argument 4 of ‘static at::Tensor at::_ops::scatter_reduce::call(const at::Tensor&, int64_t, const at::Tensor&, const at::Tensor&, c10::string_view)’ 13824 \| static at::Tensor call(const at::Tensor & self, int64_t dim, const at::Tensor & index, const at::Tensor & src, c10::string_view reduce); \| ~~~~~~~~~~~~~~~~~~~^~~ In file included from ../aten/src/ATen/ATen.h:15, from aten/src/ATen/native/cpu/CopyKernel.cpp.DEFAULT.cpp:1: aten/src/ATen/Functions.h: In function ‘at::Tensor& at::scatter_reduce_out(at::Tensor&, const at::Tensor&, int64_t, const at::Tensor&, 
c10::string_view, c10::optional<long int>)’: aten/src/ATen/Functions.h:7124:65: error: cannot convert ‘c10::string_view’ {aka ‘c10::basic_string_view<char>’} to ‘const at::Tensor&’ 7124 \| return at::_ops::scatter_reduce_out::call(self, dim, index, reduce, output_size, out); \| ^~~~~~ \| \| \| c10::string_view {aka c10::basic_string_view<char>} In file included from aten/src/ATen/core/TensorBody.h:3, from ../aten/src/ATen/core/Tensor.h:3, from ../aten/src/ATen/DeviceGuard.h:4, from ../aten/src/ATen/ATen.h:11, from aten/src/ATen/native/cpu/CopyKernel.cpp.DEFAULT.cpp:1: aten/src/ATen/Operators.h:13846:111: note: initializing argument 4 of ‘static at::Tensor& at::_ops::scatter_reduce_out::call(const at::Tensor&, int64_t, const at::Tensor&, const at::Tensor&, c10::string_view, at::Tensor&)’ 13846 \| static at::Tensor & call(const at::Tensor & self, int64_t dim, const at::Tensor & index, const at::Tensor & src, c10::string_view reduce, at::Tensor & out); \| ~~~~~~~~~~~~~~~~~~~^~~ In file included from ../aten/src/ATen/ATen.h:15, from aten/src/ATen/native/cpu/CopyKernel.cpp.DEFAULT.cpp:1: aten/src/ATen/Functions.h: In function ‘at::Tensor& at::scatter_reduce_outf(const at::Tensor&, int64_t, const at::Tensor&, c10::string_view, c10::optional<long int>, at::Tensor&)’: aten/src/ATen/Functions.h:7129:65: error: cannot convert ‘c10::string_view’ {aka ‘c10::basic_string_view<char>’} to ‘const at::Tensor&’ 7129 \| return at::_ops::scatter_reduce_out::call(self, dim, index, reduce, output_size, out); \| ^~~~~~ \| \| \| c10::string_view {aka c10::basic_string_view<char>} In file included from aten/src/ATen/core/TensorBody.h:3, from ../aten/src/ATen/core/Tensor.h:3, from ../aten/src/ATen/DeviceGuard.h:4, from ../aten/src/ATen/ATen.h:11, from aten/src/ATen/native/cpu/CopyKernel.cpp.DEFAULT.cpp:1: aten/src/ATen/Operators.h:13846:111: note: initializing argument 4 of ‘static at::Tensor& at::_ops::scatter_reduce_out::call(const at::Tensor&, int64_t, const at::Tensor&, const 
at::Tensor&, c10::string_view, at::Tensor&)’ 13846 \| static at::Tensor & call(const at::Tensor & self, int64_t dim, const at::Tensor & index, const at::Tensor & src, c10::string_view reduce, at::Tensor & out); \| ~~~~~~~~~~~~~~~~~~~^~~ In file included from aten/src/ATen/NativeFunctions.h:6, from ../aten/src/ATen/TensorIndexing.h:12, from ../aten/src/ATen/ATen.h:20, from aten/src/ATen/native/cpu/CopyKernel.cpp.DEFAULT.cpp:1: aten/src/ATen/NativeMetaFunctions.h: At global scope: aten/src/ATen/NativeMetaFunctions.h:496:18: error: redefinition of ‘struct at::meta::structured_scatter_reduce’ 496 \| struct TORCH_API structured_scatter_reduce : public at::impl::MetaBase { \| ^~~~~~~~~~~~~~~~~~~~~~~~~ aten/src/ATen/NativeMetaFunctions.h:481:18: note: previous definition of ‘struct at::meta::structured_scatter_reduce’ 481 \| struct TORCH_API structured_scatter_reduce : public at::impl::MetaBase { \| ^~~~~~~~~~~~~~~~~~~~~~~~~ ninja: build stopped: subcommand failed. ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/68115 Reviewed By: albanD Differential Revision: D32488450 Pulled By: cpuhrsch fbshipit-source-id: 65e79c6d0555c0d5715535bb52aade8d5fcd9722	2021-11-17 19:53:12 -08:00
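The `reduce="sum"` behaviour described in this commit can be pictured with a one-dimensional pure-Python sketch: every source value is added into the output slot named by its index. The function name `scatter_sum_1d` is hypothetical, and the default for `output_size` (one past the largest index) is an assumption made for this example rather than something the commit message states.

```python
def scatter_sum_1d(src, index, output_size=None):
    """Sum-reduce src into an output list, grouping values by index.

    out[index[i]] accumulates src[i] for every i.
    Hypothetical 1-D illustration; not the ATen implementation.
    """
    if len(src) != len(index):
        raise ValueError("src and index must have the same length")
    if output_size is None:
        # Assumption for this sketch: size the output to fit the largest index.
        output_size = max(index) + 1 if index else 0
    out = [0] * output_size
    for value, slot in zip(src, index):
        out[slot] += value
    return out
```

For instance, `scatter_sum_1d([1, 2, 3, 4], [0, 1, 0, 1])` groups the values into slots 0 and 1, and passing `output_size=3` leaves an extra trailing slot at zero.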
vfdev-5	3da2e09c9b	Added antialias flag to interpolate (CPU only, bilinear) (#65142 ) Summary: Description: - Added antialias flag to interpolate (CPU only) - forward and backward for bilinear mode - added tests ### Benchmarks <details> <summary> Forward pass, CPU. PTH interpolation vs PIL </summary> Cases: - PTH RGB 3 Channels, float32 vs PIL RGB uint8 (apples vs pears) - PTH 1 Channel, float32 vs PIL 1 Channel Float Code: https://gist.github.com/vfdev-5/b173761a567f2283b3c649c3c0574112 ``` # OMP_NUM_THREADS=1 python bench_interp_aa_vs_pillow.py Torch config: PyTorch built with: - GCC 9.3 - C++ Version: 201402 - OpenMP 201511 (a.k.a. OpenMP 4.5) - CPU capability usage: AVX2 - CUDA Runtime 11.1 - NVCC architecture flags: -gencode;arch=compute_75,code=sm_75 - CuDNN 8.0.5 - Build settings: BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_PYTORCH_QNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.0, USE_CUDA=1, USE_CUDNN=1, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=0, USE_OPENMP=ON, Num threads: 
1 [------------------------ Downsampling: torch.Size([1, 3, 906, 438]) -> (320, 196) ------------------------] \| Reference, PIL 8.3.2, mode: RGB \| 1.10.0a0+git1e87d91 1 threads: ------------------------------------------------------------------------------------------------- channels_first contiguous torch.float32 \| 2.9 \| 3.1 channels_last non-contiguous torch.float32 \| 2.6 \| 3.6 Times are in milliseconds (ms). [------------------------ Downsampling: torch.Size([1, 3, 906, 438]) -> (460, 220) ------------------------] \| Reference, PIL 8.3.2, mode: RGB \| 1.10.0a0+git1e87d91 1 threads: ------------------------------------------------------------------------------------------------- channels_first contiguous torch.float32 \| 3.4 \| 4.0 channels_last non-contiguous torch.float32 \| 3.4 \| 4.8 Times are in milliseconds (ms). [------------------------ Downsampling: torch.Size([1, 3, 906, 438]) -> (120, 96) -------------------------] \| Reference, PIL 8.3.2, mode: RGB \| 1.10.0a0+git1e87d91 1 threads: ------------------------------------------------------------------------------------------------- channels_first contiguous torch.float32 \| 1.6 \| 1.8 channels_last non-contiguous torch.float32 \| 1.6 \| 1.9 Times are in milliseconds (ms). [----------------------- Downsampling: torch.Size([1, 3, 906, 438]) -> (1200, 196) ------------------------] \| Reference, PIL 8.3.2, mode: RGB \| 1.10.0a0+git1e87d91 1 threads: ------------------------------------------------------------------------------------------------- channels_first contiguous torch.float32 \| 9.0 \| 11.3 channels_last non-contiguous torch.float32 \| 8.9 \| 12.5 Times are in milliseconds (ms). 
[----------------------- Downsampling: torch.Size([1, 3, 906, 438]) -> (120, 1200) ------------------------] \| Reference, PIL 8.3.2, mode: RGB \| 1.10.0a0+git1e87d91 1 threads: ------------------------------------------------------------------------------------------------- channels_first contiguous torch.float32 \| 2.1 \| 1.8 channels_last non-contiguous torch.float32 \| 2.1 \| 3.4 Times are in milliseconds (ms). [--------------- Downsampling: torch.Size([1, 1, 906, 438]) -> (320, 196) --------------] \| Reference, PIL 8.3.2, mode: F \| 1.10.0a0+git1e87d91 1 threads: ------------------------------------------------------------------------------ contiguous torch.float32 \| 1.2 \| 1.0 Times are in milliseconds (ms). [--------------- Downsampling: torch.Size([1, 1, 906, 438]) -> (460, 220) --------------] \| Reference, PIL 8.3.2, mode: F \| 1.10.0a0+git1e87d91 1 threads: ------------------------------------------------------------------------------ contiguous torch.float32 \| 1.4 \| 1.3 Times are in milliseconds (ms). [--------------- Downsampling: torch.Size([1, 1, 906, 438]) -> (120, 96) ---------------] \| Reference, PIL 8.3.2, mode: F \| 1.10.0a0+git1e87d91 1 threads: ------------------------------------------------------------------------------ contiguous torch.float32 \| 719.9 \| 599.9 Times are in microseconds (us). [-------------- Downsampling: torch.Size([1, 1, 906, 438]) -> (1200, 196) --------------] \| Reference, PIL 8.3.2, mode: F \| 1.10.0a0+git1e87d91 1 threads: ------------------------------------------------------------------------------ contiguous torch.float32 \| 3.7 \| 3.5 Times are in milliseconds (ms). [-------------- Downsampling: torch.Size([1, 1, 906, 438]) -> (120, 1200) --------------] \| Reference, PIL 8.3.2, mode: F \| 1.10.0a0+git1e87d91 1 threads: ------------------------------------------------------------------------------ contiguous torch.float32 \| 834.4 \| 605.7 Times are in microseconds (us). 
``` </details> Code is moved from torchvision: https://github.com/pytorch/vision/pull/4208 Pull Request resolved: https://github.com/pytorch/pytorch/pull/65142 Reviewed By: mrshenli Differential Revision: D32432405 Pulled By: jbschlosser fbshipit-source-id: b66c548347f257c522c36105868532e8bc1d4c6d	2021-11-17 09:10:15 -08:00
Thomas Metcalfe	ba16b1eca7	[numpy] Alias `arctan2` to `atan2` (#67010 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/65906 Adds an alias `arctan2` to improve numpy compatibility cc mruberry rgommers Pull Request resolved: https://github.com/pytorch/pytorch/pull/67010 Reviewed By: anjali411 Differential Revision: D32378998 Pulled By: mruberry fbshipit-source-id: 424c5c10c12b49c20ee83ccd109325c480b5b6cf	2021-11-16 09:41:09 -08:00
David Dang	f7366ca51b	implemented quantize_per_tensor_dynamic and added a corresponding test script (#68004 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68004 Test Plan: Imported from OSS Reviewed By: jerryzh168 Differential Revision: D32301792 Pulled By: dzdang fbshipit-source-id: f680557ba4736d095efc33e8c92111265f25aee0	2021-11-13 06:34:36 -08:00
Anirudh Dagar	b07a11929d	Array API: Add torch.linalg.cross (#63285 ) Summary: ### Create `linalg.cross` Fixes https://github.com/pytorch/pytorch/issues/62810 As discussed in the corresponding issue, this PR adds `cross` to the `linalg` namespace (Note: There is no method variant) which is slightly different in behaviour compared to `torch.cross`. Note: this is NOT an alias as suggested in mruberry's [https://github.com/pytorch/pytorch/issues/62810 comment](https://github.com/pytorch/pytorch/issues/62810#issuecomment-897504372) below > linalg.cross being consistent with the Python Array API (over NumPy) makes sense because NumPy has no linalg.cross. I also think we can implement linalg.cross without immediately deprecating torch.cross, although we should definitely refer users to linalg.cross. Deprecating torch.cross will require additional review. While it's not used often it is used, and it's unclear if users are relying on its unique behavior or not. The current default implementation of `torch.cross` is extremely weird and confusing. This has also been reported multiple times previously. (See https://github.com/pytorch/pytorch/issues/17229, https://github.com/pytorch/pytorch/issues/39310, https://github.com/pytorch/pytorch/issues/41850, https://github.com/pytorch/pytorch/issues/50273) - [x] Add `torch.linalg.cross` with default `dim=-1` - [x] Add OpInfo and other tests for `torch.linalg.cross` - [x] Add broadcasting support to `torch.cross` and `torch.linalg.cross` - [x] Remove out skip from `torch.cross` OpInfo - [x] Add docs for `torch.linalg.cross`. Improve docs for `torch.cross` mentioning `linalg.cross` and the difference between the two. Also adds a warning to `torch.cross`, that it may change in the future (we might want to deprecate it later) --- ### Additional Fixes to `torch.cross` - [x] Fix Doc for Tensor.cross - [x] Fix torch.cross in `torch/overrides.py` While working on `linalg.cross` I noticed these small issues with `torch.cross` itself. 
[Tensor.cross docs](https://pytorch.org/docs/stable/generated/torch.Tensor.cross.html) still mentions `dim=-1` default which is actually wrong. It should be `dim=None` after the behaviour was updated in PR https://github.com/pytorch/pytorch/issues/17582 but the documentation for the `method` or `function` variant wasn’t updated. Later PR https://github.com/pytorch/pytorch/issues/41850 updated the documentation for the `function` variant i.e `torch.cross` and also added the following warning about the weird behaviour. > If `dim` is not given, it defaults to the first dimension found with the size 3. Note that this might be unexpected. But still, the `Tensor.cross` docs were missed and remained outdated. I’m finally fixing that here. Also fixing `torch/overrides.py` for `torch.cross` as well now, with `dim=None`. To verify according to the docs the default behaviour of `dim=-1` should raise, you can try the following. ```python a = torch.randn(3, 4) b = torch.randn(3, 4) b.cross(a) # this works because the implementation finds 3 in the first dimension and the default behaviour as shown in documentation is actually not true. >>> tensor([[ 0.7171, -1.1059, 0.4162, 1.3026], [ 0.4320, -2.1591, -1.1423, 1.2314], [-0.6034, -1.6592, -0.8016, 1.6467]]) b.cross(a, dim=-1) # this raises as expected since the last dimension doesn't have a 3 >>> RuntimeError: dimension -1 does not have size 3 ``` Please take a closer look (particularly the autograd part, this is the first time I'm dealing with `derivatives.yaml`). If there is something missing, wrong or needs more explanation, please let me know. Looking forward to the feedback. cc mruberry Lezcano IvanYashchuk rgommers Pull Request resolved: https://github.com/pytorch/pytorch/pull/63285 Reviewed By: gchanan Differential Revision: D32313346 Pulled By: mruberry fbshipit-source-id: e68c2687c57367274e8ddb7ef28ee92dcd4c9f2c	2021-11-11 12:49:41 -08:00
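The difference between the two defaults discussed in this commit (the legacy rule "first dimension found with the size 3" versus an explicit `dim=-1`) can be sketched on shapes alone. Both function names below are hypothetical and the code only models the dimension selection described in the message; it does not compute a cross product and is not PyTorch code.

```python
def legacy_cross_default_dim(shape):
    """Model the legacy default: pick the first dimension whose size is 3."""
    for dim, size in enumerate(shape):
        if size == 3:
            return dim
    raise RuntimeError("no dimension of size 3 in input")


def linalg_cross_dim(shape, dim=-1):
    """Model the explicit default dim=-1: the chosen dimension must have size 3."""
    if shape[dim] != 3:
        raise RuntimeError(f"dimension {dim} does not have size 3")
    # Normalise a negative dim to its non-negative position.
    return dim % len(shape)
```

For the `(3, 4)` tensors in the example above, the legacy rule silently selects dimension 0, whereas the `dim=-1` rule raises because the last dimension has size 4. For a `(4, 3)` shape both rules agree on dimension 1.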
Kurt Mohler	db014b8529	Add `set_deterministic_debug_mode` and `get_deterministic_debug_mode` (#67778 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/67386 Pull Request resolved: https://github.com/pytorch/pytorch/pull/67778 Reviewed By: ngimel Differential Revision: D32310661 Pulled By: mruberry fbshipit-source-id: 300129e96ca51c22fa711182ce6a9f4d4d2ce57f	2021-11-11 12:48:29 -08:00
kshitij12345	510e3026a9	[numpy] add torch.argwhere (#64257 ) Summary: Adds `torch.argwhere` as an alias to `torch.nonzero`. Currently, `torch.nonzero` actually provides functionality equivalent to `np.argwhere`. From NumPy docs, > np.argwhere(a) is almost the same as np.transpose(np.nonzero(a)), but produces a result of the correct shape for a 0D array. Pull Request resolved: https://github.com/pytorch/pytorch/pull/64257 Reviewed By: qihqi Differential Revision: D32049884 Pulled By: saketh-are fbshipit-source-id: 016e49884698daa53b83e384435c3f8f6b5bf6bb	2021-10-30 15:26:11 -07:00
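As a small illustration of the `argwhere` semantics quoted from the NumPy docs, the sketch below returns one `[row, column]` pair per non-zero element of a 2-D nested list. The name `argwhere_2d` is hypothetical and the code is a pure-Python model of the behaviour, not the PyTorch or NumPy implementation.

```python
def argwhere_2d(a):
    """Return the [row, col] index pairs of all non-zero entries of a 2-D list.

    Pairs are produced in row-major order, one pair per non-zero element.
    Hypothetical helper for illustration only.
    """
    return [
        [i, j]
        for i, row in enumerate(a)
        for j, value in enumerate(row)
        if value != 0
    ]
```

For `[[0, 1], [2, 0]]` the non-zero entries sit at positions `(0, 1)` and `(1, 0)`, so the result has one row per hit and one column per input dimension.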
Brian Hirsh	03f3a0331b	add slice/select/diagonal_scatter variants as primitive ops (#64430 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64430 The functionalization pass needs `{view}_scatter` versions of the slice/select/diagonal ops in order to correctly propagate mutations from a view to its base. On top of that, the implementations need to be primitive w.r.t. autograd, because they look something like `...slice().copy_()`, and the functionalization pass can't use views + mutations inside of its own alias-removal machinery! I added some basic tests that I tried to base off of existing tests for views (particularly around testing the derivative formulas), but I'm wondering if I should add something more comprehensive. Also, as_strided fits into this category - the functionalization pass will need an `as_strided_scatter` op that's primitive w.r.t. autograd. I didn't add it for now, because it'll involve duplicating a bunch of logic from the current `as_strided_backward()` function, and also writing a derivative formula that I wasn't sure how to write :) Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D31942092 Pulled By: bdhirsh fbshipit-source-id: c702a57c2748a7c771c14e4bcc3e996b48fcc4c8	2021-10-28 10:51:12 -07:00
jjsjann123	1ec732bc46	Add fp16/fp32 autocasting to JIT/TorchScript (#63939 ) Summary: Adds mixed precision autocasting support between fp32/fp16 to torchscript/JIT. A more in-depth description can be found at [torch/csrc/jit/JIT-AUTOCAST.md](https://github.com/pytorch/pytorch/pull/63939/files#diff-1f1772aaa508841c5bb58b74ab98f49a1e577612cd9ea5c386c8714a75db830b) This PR implemented an autocast optimization pass that inserts casting ops per AMP rule (torch/csrc/jit/passes/autocast.cpp), that mimics the behavior of eager autocast. The pass also takes into consideration the context of `torch.cuda.amp.autocast` and only inserts casting ops within the enabled context manager, giving feature parity with eager AMP autocast. We currently provide JIT AMP autocast as a prototyping feature, so it is off by default and can be turned on via `torch._C._jit_set_autocast_mode(True)` The JIT support for autocast is subject to different constraints compared to the eager mode implementation (mostly related to the fact that TorchScript is statically typed); restrictions on the user-facing Python code are described in doc torch/csrc/jit/JIT-AUTOCAST.md This is a prototype; there are also implementation limitations that were necessary to keep this PR small and get something functioning quickly on upstream, so we can iterate on designs. A few limitations/challenges that are not properly resolved in this PR: 1. Autocast inserts cast operations, which affect the scalar type of the output tensor feeding downstream operations. We are not currently propagating the updated scalar types; this would give issues/wrong results on operations in promotion rules. 2. Backward for autodiff in JIT misses the casting of dgrad to input scalar type, as what autograd does in eager. This forces us to explicitly mark the casting operation for certain operations (e.g. binary ops), otherwise, we might be feeding dgrad with mismatched scalar type to input. This could potentially break gradient functions consuming dgrad. 
(e.g. gemm backwards, which assumes grad_output to be of the same scalar type as input) 3. `torch.autocast` api has an optional argument `dtype` which is not currently supported in the JIT autocast and we require a static value. Credit goes mostly to: tlemo kevinstephano Pull Request resolved: https://github.com/pytorch/pytorch/pull/63939 Reviewed By: navahgar Differential Revision: D31093381 Pulled By: eellison fbshipit-source-id: da6e26c668c38b01e296f304507048d6c1794314	2021-10-27 12:11:36 -07:00
Saketh Are	33790c4e06	Implement histogramdd on CPU (#65318 ) Summary: Implements `torch.histogramdd` analogous to `numpy.histogramdd`. Builds on https://github.com/pytorch/pytorch/pull/58780, generalizing the existing `torch.histogram` kernel to handle D-dimensional inputs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/65318 Reviewed By: soulitzer Differential Revision: D31654555 Pulled By: saketh-are fbshipit-source-id: 14b781fac0fd3698b052dbd6f0fda46e50d4c5f1	2021-10-21 16:09:31 -07:00
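To show what a D-dimensional histogram computes, here is a minimal pure-Python sketch with equal-width bins per dimension. The name `histogramdd_sketch`, the sparse dictionary output, and the choice to include the right-most edge in the last bin (modelled on NumPy's convention) are assumptions of this example; the PyTorch kernel returns dense tensors and is not reproduced here.

```python
def histogramdd_sketch(points, bins, ranges):
    """Count D-dimensional points into equal-width bins.

    points: list of D-tuples of floats
    bins:   list of D bin counts
    ranges: list of D (low, high) pairs; points outside are ignored
    Returns a dict mapping a D-tuple of bin indices to its count.
    Hypothetical helper for illustration only.
    """
    counts = {}
    for point in points:
        key = []
        for x, n_bins, (low, high) in zip(point, bins, ranges):
            if x < low or x > high:
                break  # point lies outside the histogram range
            # Assumption: the last bin is closed on the right, as in NumPy.
            key.append(min(int((x - low) / (high - low) * n_bins), n_bins - 1))
        else:
            key = tuple(key)
            counts[key] = counts.get(key, 0) + 1
    return counts
```

With two bins per axis over `[0, 1] x [0, 1]`, the point `(1.0, 0.0)` lands in bin `(1, 0)` because of the closed right edge, and a point such as `(2.0, 0.5)` is dropped.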
Natalia Gimelshein	f29e5220a6	Revert D31474901: [pytorch][PR] [numpy] add torch.argwhere Test Plan: revert-hammer Differential Revision: D31474901 Original commit changeset: 335327a4986f fbshipit-source-id: 534093e459762ff7a888c58d76e49e362015f2ba	2021-10-21 15:50:54 -07:00
kshitij12345	462f333c01	[numpy] add torch.argwhere (#64257 ) Summary: Adds `torch.argwhere` as an alias to `torch.nonzero`. Currently, `torch.nonzero` actually provides functionality equivalent to `np.argwhere`. From NumPy docs, > np.argwhere(a) is almost the same as np.transpose(np.nonzero(a)), but produces a result of the correct shape for a 0D array. Pull Request resolved: https://github.com/pytorch/pytorch/pull/64257 Reviewed By: dagitses Differential Revision: D31474901 Pulled By: saketh-are fbshipit-source-id: 335327a4986fa327da74e1fb8624cc1e56959c70	2021-10-21 14:02:11 -07:00
lezcano	a2e94b80fa	Create linalg.matrix_exp (#62715 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62715 Fixes https://github.com/pytorch/pytorch/issues/61648 Test Plan: Imported from OSS Reviewed By: H-Huang Differential Revision: D31641698 Pulled By: mruberry fbshipit-source-id: 2e2965d14807b6b4fada4b809d539066dd0ba277	2021-10-19 09:07:15 -07:00
Yukio Siraichi	8854817f44	Implement Python Array API `asarray` function. (#60627 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60627 In this PR, the core of `frombuffer` and `fromDLPack` is refactored onto _tensor_new.cpp_. `asarray` uses these refactored functions to interpret the object as a tensor. We follow the Python Array API standard found at: https://data-apis.org/array-api/latest/API_specification/creation_functions.html?highlight=asarray Test Plan: Imported from OSS Reviewed By: H-Huang Differential Revision: D31640510 Pulled By: mruberry fbshipit-source-id: d0869e0d73cb50023d5866b001dac5d34ca30dfd	2021-10-16 21:11:31 -07:00
lezcano	82a216c45b	Add tensor.{adjoint(),H,mT,mH} methods and properties (#64179 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64179 This PR follows the discussion in https://github.com/pytorch/pytorch/issues/45063#issuecomment-904431478 Fixes https://github.com/pytorch/pytorch/issues/45063 cc ezyang anjali411 dylanbespalko mruberry Lezcano nikitaved rgommers pmeier asmeurer leofang AnirudhDagar asi1024 emcastillo kmaehashi heitorschueroff Test Plan: Imported from OSS Reviewed By: bertmaher Differential Revision: D30730483 Pulled By: anjali411 fbshipit-source-id: 821d25083f5f682450f6812bf852dc96a1cdf9f2	2021-10-13 07:44:43 -07:00
Kurt Mohler	5883523c1d	Remove dtype from torch.Storage and use only torch.ByteStorage (#62030 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62030 Remove dtype tracking from Python Storage interface, remove all the different `<type>Storage` classes except for `ByteStorage`, and update serialization accordingly, while maintaining as much FC/BC as possible Fixes https://github.com/pytorch/pytorch/issues/47442 * THE SERIALIZATION FORMAT IS FULLY FC/BC. We worked very hard to make sure this is the case. We will probably want to break FC at some point to make the serialization structure of tensors make more sense, but not today. * There is now only a single torch.ByteStorage class. Methods like `Tensor.set_` no longer check that the dtype of storage is appropriate. * As we no longer know what dtype of a storage is, we've removed the size method from Storage, replacing it with nbytes. This is to help catch otherwise silent errors where you confuse number of elements with number of bytes. * `Storage._new_shared` takes a `nbytes` kwarg and will reject previous positional only calls. `Storage._new_with_file` and `_set_from_file` require explicit element size arguments. * It's no longer possible to convert storages to different types using the float/double/etc methods. Instead, do the conversion using a tensor. * It's no longer possible to allocate a typed storage directly using FloatStorage/DoubleStorage/etc constructors. Instead, construct a tensor and extract its storage. The classes still exist but they are used purely for unpickling. * The preexisting serialization format stores dtype with storage, and in fact this dtype is used to determine the dtype of the tensor overall. To accommodate this case, we introduce a new TypedStorage concept that exists only during unpickling time which is used to temporarily store the dtype so we can construct a tensor. 
If you overrode the handling of pickling/unpickling, you MUST add handling for TypedStorage or your serialization code will degrade to standard file-based serialization. Original pull request: https://github.com/pytorch/pytorch/pull/59671 Reviewed By: soulitzer, ngimel Differential Revision: D29466819 Pulled By: ezyang fbshipit-source-id: 4a14e5d3c2b08e06e558683d97f7378a3180b00e	2021-10-05 13:50:34 -07:00
Supriya Rao	458a00bacb	Back out "[quant] update fused_obs_fake_quant op to accept output_fake_quant argument" (#66063 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66063 Original commit changeset: bffe776216d0 Test Plan: CI Reviewed By: vkuzo Differential Revision: D31347042 fbshipit-source-id: f56f628dc4690187bf284a8f2fda4c6aae10c1d6	2021-10-05 11:02:54 -07:00
kshitij12345	c1447f06a8	[special] special alias for softmax (#62251 ) Summary: Reference: https://github.com/pytorch/pytorch/issues/50345 Pull Request resolved: https://github.com/pytorch/pytorch/pull/62251 Reviewed By: H-Huang Differential Revision: D31141834 Pulled By: mruberry fbshipit-source-id: aecaf62af248e9034ef589159ce0fb325c729493	2021-10-01 03:55:32 -07:00
Peter Bell	6285348f06	Implement n-dimensional hermitian FFTs (#63890 ) Summary: Closes https://github.com/pytorch/pytorch/issues/59127 cc mruberry peterbell10 walterddr Pull Request resolved: https://github.com/pytorch/pytorch/pull/63890 Reviewed By: ngimel Differential Revision: D30761909 Pulled By: mruberry fbshipit-source-id: 06e1e4dc65726f35c99a74f18b9fa36eb7d694a5	2021-09-30 16:02:28 -07:00
Supriya Rao	4666e3f192	[quant] update fused_obs_fake_quant op to accept output_fake_quant argument (#65621 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65621 Add a new attribute to the FusedMovingAvgObsFakeQuantize that controls if the Fake Quant operation should be applied at the output of a particular layer. The motivation is to give users additional control over the numerics of the fake_quant operators during training. It defaults to always fake quant the output (True). Note: We will still observe the tensors as before (only the fake_quant operation is controlled using this flag) For example ``` input model x -> fc1 -> fc2 -> non_quantizable_op -> fc3 After fake_quant x -> fake_quant(x) -> fc1 -> fake_quant(fc1) -> fc2 -> fake_quant(fc2) -> non_quantizable_op -> fake_quant() -> fc3 -> fake_quantize(fc3) With output_fake_quant disabled at the output of fc2 and fc3 (since their outputs are non-quantizable) x -> fake_quant(x) -> fc1 -> fake_quant(fc1) -> fc2 -> non_quantizable_op -> fake_quant() -> fc3 ``` Test Plan: ./buck-out/gen/caffe2/test/quantization_fx\#binary.par -r test_disable_output_fake_quant Reviewed By: jerryzh168 Differential Revision: D31174526 fbshipit-source-id: bffe776216d041fb09133a6fb09bfc2c0bb46b89	2021-09-30 01:08:01 -07:00
Edward Yang	70a545b21e	Add Tensor._make_wrapper_subclass (#65340 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65340 I thought about a few possible ways of doing this. The main hazard is that if I create a CPU tensor that doesn't have any real storage, the moment I actually try to access the data on the tensor I will segfault. So I don't want to use _make_subclass on a "cpu meta tensor" because the CPU meta tensor (with no subclass) is radioactive: printing it will immediately cause a segfault. So instead, I have to create the CPU meta tensor AND subclass all in one go, and that means I need another function for it. One downside to doing it this way is I need another overload for explicit strides, and in general it is difficult to get the view relationships to all work out properly; tracked at https://github.com/pytorch/pytorch/issues/65339 Fixes https://github.com/pytorch/pytorch/issues/62972 Fixes https://github.com/pytorch/pytorch/issues/62730 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D31057231 Pulled By: ezyang fbshipit-source-id: 73522769e093ae8a1bf0c7f7e594659bfb827b28	2021-09-22 11:10:47 -07:00
albanD	6eafe7f15e	Actually deprecate __torch_function__ as plain methods (#64843 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64843 Fix for https://github.com/pytorch/pytorch/issues/63767 Test Plan: Imported from OSS Reviewed By: heitorschueroff Differential Revision: D30991425 Pulled By: albanD fbshipit-source-id: 1214143b8aea87e6ff406c7fc13096bd15d1a768	2021-09-17 08:32:53 -07:00
albanD	473e55d5b2	Use classmethods for overrides (#64841 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64841 Test Plan: Imported from OSS Reviewed By: heitorschueroff Differential Revision: D30991424 Pulled By: albanD fbshipit-source-id: 551e2119768f3a4292713f3bfa83930f5506adbd	2021-09-17 08:32:49 -07:00
Heitor Schueroff	b37503e452	Initial implementation of nanmean (#62671 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62671 Very crude first implementation of `torch.nanmean`. The current reduction kernels do not have good support for implementing nan* variants. Rather than implementing new kernels for each nan* operator, I will work on new reduction kernels with support for a `nan_policy` flag and then I will port `nanmean` to use that. TODO - [x] Fix autograd issue Test Plan: Imported from OSS Reviewed By: malfet Differential Revision: D30515181 Pulled By: heitorschueroff fbshipit-source-id: 303004ebd7ac9cf963dc4f8e2553eaded5f013f0	2021-09-13 05:53:58 -07:00
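A minimal sketch of the behaviour the `nanmean` commit above describes: a mean that skips NaN entries. This is plain Python written for illustration, not the reduction kernel from the PR; the helper name `nanmean_sketch` is hypothetical, and returning NaN for an all-NaN input is an assumption of this sketch.

```python
import math


def nanmean_sketch(values):
    """Mean of the non-NaN entries of `values` (hypothetical helper)."""
    # Drop NaN entries before reducing, which is the core idea of a nan* variant.
    kept = [v for v in values if not math.isnan(v)]
    if not kept:
        # Assumption of this sketch: nothing left to average, so return NaN.
        return float("nan")
    return sum(kept) / len(kept)


if __name__ == "__main__":
    print(nanmean_sketch([1.0, float("nan"), 3.0]))
```

A `nan_policy`-style flag, as mentioned in the commit message, would generalise this by letting the caller choose between propagating and skipping NaN values.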
Emilio Castillo	1cb3507ed3	Adds DLPack support (#57110 ) Summary: Partially Fixes https://github.com/pytorch/pytorch/issues/55090 Depends on https://github.com/pytorch/pytorch/issues/55365 Inspired by https://github.com/dmlc/dlpack/issues/57#issuecomment-774482973 Questions, in PyTorch we can't create streams or easily synchronize them from just an integer. Should we add an [`ExternalStream`](https://docs.cupy.dev/en/stable/reference/generated/cupy.cuda.ExternalStream.html) object like the one we have in CuPy? TODO: Add tests Would like some feedback as this design needs quite a few iterations rgommers leofang Pull Request resolved: https://github.com/pytorch/pytorch/pull/57110 Reviewed By: saketh-are Differential Revision: D30761481 Pulled By: mruberry fbshipit-source-id: e85d78df3c1f8defc2a698878da89cd843cb1209	2021-09-12 19:47:15 -07:00
Edward Yang	d4b1016850	Filter out _disabled_torch_function_impl from handle_torch_function (#64689 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64689 This brings it in line with the C++ implementation. Fixes https://github.com/pytorch/pytorch/issues/64687 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D30816215 Pulled By: ezyang fbshipit-source-id: ed36af6c35467ae678d9548197efd97c36d38dec	2021-09-09 07:29:09 -07:00
leslie-fang-intel	768014b3e6	Allow disabling cache in autocast (automatic mixed precision) (#63552 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63552 In this PR, we want to exclude these 2 cases in the `Autocast` weight cache usages: - Using `torch.jit.trace` under the `Autocast` As reported in https://github.com/pytorch/pytorch/issues/50231 and several other discussions, when `torch.jit.trace` is used under `Autocast`, the trace process hits Autocast's weight cache and fails. So we should disable the weight cache during tracing. - Using `Autocast` with `Grad mode` - `Grad mode` is usually used for training. Since the weights change at every step in the training phase, we don't need to cache them. - For the recommended `Autocast` training case in the [doc](https://pytorch.org/docs/stable/amp.html), `Autocast` will clear the cache at every step when leaving the context. We should disable it to save the clear operations. ``` model = Net().cuda() optimizer = optim.SGD(model.parameters(), ...) for input, target in data: optimizer.zero_grad() with autocast(): output = model(input) loss = loss_fn(output, target) loss.backward() optimizer.step() ``` Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D30644913 Pulled By: ezyang fbshipit-source-id: ad7bc87372e554e7aa1aa0795e9676871b3974e7	2021-09-08 07:47:18 -07:00
kshitij12345	2c351c76e0	[special] Alias igamma, igammac to special.gammainc, special.gammaincc (#61902 ) Summary: Reference: https://github.com/pytorch/pytorch/issues/50345 Also added relevant OpInfo TODO: * [x] Check rendered docs gammainc : https://docs-preview.pytorch.org/61902/special.html#torch.special.gammainc * [x] Check rendered docs gammaincc: https://docs-preview.pytorch.org/61902/special.html#torch.special.gammaincc Pull Request resolved: https://github.com/pytorch/pytorch/pull/61902 Reviewed By: ngimel Differential Revision: D30761428 Pulled By: mruberry fbshipit-source-id: 06a16432873357958d53364f12a4e91c29779d26	2021-09-07 15:31:26 -07:00
Anirudh Dagar	337c71be05	Array API: Add `torch.linalg.matmul` alias to `torch.matmul` (#63227 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/62811 Add `torch.linalg.matmul` alias to `torch.matmul`. Note that the `linalg.matmul` doesn't have a `method` variant. Also cleaning up `torch/_torch_docs.py` when formatting is not needed. cc IvanYashchuk Lezcano mruberry rgommers Pull Request resolved: https://github.com/pytorch/pytorch/pull/63227 Reviewed By: mrshenli Differential Revision: D30770235 Pulled By: mruberry fbshipit-source-id: bfba77dfcbb61fcd44f22ba41bd8d84c21132403	2021-09-07 12:35:32 -07:00
Anirudh Dagar	1a1fb31cfa	Support `torch.concat` alias, add `cat` OpInfo & remove OpInfo test_out skips {cat, stack, hstack, vstack, dstack} (#62560 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/61767 ## Changes - [x] Add `torch.concat` alias to `torch.cat` - [x] Add OpInfo for `cat`/`concat` - [x] Fix `test_out` skips (Use `at::native::resize_output` or `at::native::resize_output_check`) - [x] `cat`/`concat` - [x] `stack` - [x] `hstack` - [x] `dstack` - [x] `vstack`/`row_stack` - [x] Remove redundant tests for `cat`/`stack` ~I've not added `cat`/`concat` to OpInfo `op_db` yet, since cat is a little more tricky than other OpInfos (should have a lot of tests) and currently there are no OpInfos for that. I can try to add that in a subsequent PR or maybe here itself, whatever is suggested.~ Edit: cat/concat OpInfo has been added. Note: I've added the named tensor support for `concat` alias as well, maybe that's out of spec in `array-api` but it is still useful for consistency in PyTorch. Thanks to krshrimali for guidance on my first PR :)) cc mruberry rgommers pmeier asmeurer leofang AnirudhDagar asi1024 emcastillo kmaehashi heitorschueroff krshrimali Pull Request resolved: https://github.com/pytorch/pytorch/pull/62560 Reviewed By: saketh-are Differential Revision: D30762069 Pulled By: mruberry fbshipit-source-id: 6985159d1d9756238890488a0ab3ae7699d94337	2021-09-06 23:57:18 -07:00
Thomas J. Fan	d3bcba5f85	ENH Adds label_smoothing to cross entropy loss (#63122 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/7455 Partially resolves pytorch/vision#4281 Pull Request resolved: https://github.com/pytorch/pytorch/pull/63122 Reviewed By: iramazanli Differential Revision: D30586076 Pulled By: jbschlosser fbshipit-source-id: 06afc3aa1f8b9edb07fe9ed68c58968ad1926924	2021-08-29 23:33:04 -07:00
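The `label_smoothing` commit above gives no formula, so here is a small sketch of what label smoothing does to a single-sample cross entropy loss. It assumes the usual definition, where the target distribution is `(1 - eps)` times the one-hot target plus `eps / C` spread uniformly over the `C` classes. The function names are hypothetical and this is plain Python for illustration, not the code added by the PR.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def smoothed_cross_entropy(logits, target, label_smoothing=0.0):
    """Cross entropy for one sample with a smoothed target (hypothetical helper)."""
    log_probs = log_softmax(logits)
    num_classes = len(logits)
    # Standard negative log-likelihood of the true class.
    nll = -log_probs[target]
    # Cross entropy against the uniform distribution over all classes.
    uniform = -sum(log_probs) / num_classes
    # Mix the two terms with weights (1 - eps) and eps.
    return (1.0 - label_smoothing) * nll + label_smoothing * uniform


if __name__ == "__main__":
    print(smoothed_cross_entropy([2.0, 0.0], target=0, label_smoothing=0.2))
```

With `label_smoothing=0.0` the function reduces to the ordinary negative log-likelihood; larger values penalise confident predictions more.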
Aaron Bockover	c78ab28441	Add support for the ONNX Runtime Eager Mode backend (#58248 ) Summary: This PR implements the necessary hooks/stubs/enums/etc for complete ONNX Runtime (ORT) Eager Mode integration. The actual extension will live out of tree at https://github.com/pytorch/ort. We have been [working on this at Microsoft](https://github.com/microsoft/onnxruntime-pytorch/tree/eager-ort/torch_onnxruntime) for the last few months, and are finally ready to contribute the PyTorch core changes upstream (nothing major or exciting, just the usual boilerplate for adding new backends). The ORT backend will allow us to ferry [almost] all torch ops into granular ONNX kernels that ORT will eagerly execute against any devices it supports (therefore, we only need a single ORT backend from a PyTorch perspective). Pull Request resolved: https://github.com/pytorch/pytorch/pull/58248 Reviewed By: astaff Differential Revision: D30344992 Pulled By: albanD fbshipit-source-id: 69082b32121246340d686e16653626114b7714b2	2021-08-20 11:17:13 -07:00
Shen Li	1022443168	Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: revert-hammer Differential Revision: D30279364 (`b004307252`) Original commit changeset: c1ed77dfe43a fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e	2021-08-12 11:45:01 -07:00
Zsolt Dollenstein	b004307252	[codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: manual inspection & sandcastle Reviewed By: zertosh Differential Revision: D30279364 fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a	2021-08-12 10:58:35 -07:00

572 Commits