Commit Graph

588 Commits

Author SHA1 Message Date
M.L. Croci
1f0223d6bb Fix bug in gaussian_nll_loss (#56469)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/53964. cc albanD almson

## Major changes:
- Overhauled the actual loss calculation so that the shapes are now correct (in functional.py)
- Added the missing doc in nn.functional.rst

## Minor changes (in functional.py):
- I removed the previous check on whether input and target were the same shape. This is to allow for broadcasting, say when you have 10 predictions that all have the same target.
- I added some comments to explain each shape check in detail. Let me know if these should be shortened/cut.

Screenshots of updated docs attached.
Let me know what you think, thanks!

## Edit: Description of change of behaviour (affecting BC):
Backwards compatibility is affected only for the `reduction='none'` mode, which was the source of the bug. For tensors with size (N, D), the old returned loss had size (N) because an incorrect summation was happening. It now has size (N, D) as expected.

### Example
Define input tensors, all with size (2, 3).
```
input = torch.tensor([[0., 1., 3.], [2., 4., 0.]], requires_grad=True)
target = torch.tensor([[1., 4., 2.], [-1., 2., 3.]])
var = 2*torch.ones(size=(2, 3), requires_grad=True)
```

Initialise the loss with reduction mode 'none'. We expect the returned loss to have the same size as the input tensors, (2, 3).
```
loss = torch.nn.GaussianNLLLoss(reduction='none')
```

Old behaviour:
```
print(loss(input, target, var))
# Gives tensor([3.7897, 6.5397], grad_fn=<MulBackward0>). This has size (2).
```

New behaviour:
```
print(loss(input, target, var))
# Gives tensor([[0.5966, 2.5966, 0.5966], [2.5966, 1.3466, 2.5966]], grad_fn=<MulBackward0>)
# This has the expected size, (2, 3).
```

To recover the old behaviour, sum along all dimensions except the 0th:
```
print(loss(input, target, var).sum(dim=1))
# Gives tensor([3.7897, 6.5397], grad_fn=<SumBackward1>).
```

![doc1](https://user-images.githubusercontent.com/26558092/115391089-f7f47b00-a1d6-11eb-8726-e4da9057aee0.png)
![doc2](https://user-images.githubusercontent.com/26558092/115391094-f925a800-a1d6-11eb-954b-afd187f42bc7.png)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56469

Reviewed By: jbschlosser, agolynski

Differential Revision: D27894170

Pulled By: albanD

fbshipit-source-id: 197890189c97c22109491c47f469336b5b03a23f
2021-04-22 07:43:48 -07:00
Nikita Shulga
6d7d36d255 s/“pad”/"pad"/ in files introduced by #56065 (#56618)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56618

Reviewed By: albanD

Differential Revision: D27919343

Pulled By: malfet

fbshipit-source-id: 2fac8ba5f399e050463141eba225da935c97a5ce
2021-04-21 17:40:29 -07:00
Joel Schlosser
8a81c4dc27 Update padding_idx docs for EmbeddingBag to better match Embedding's (#56065)
Summary:
Match updated `Embedding` docs from https://github.com/pytorch/pytorch/pull/54026 as closely as possible. Additionally, update the C++ side `Embedding` docs, since those were missed in the previous PR.

There are 6 (!) places for docs:
1. Python module form in `sparse.py` - includes an additional line about newly constructed `Embedding`s / `EmbeddingBag`s
2. Python `from_pretrained()` in `sparse.py` (refers back to module docs)
3. Python functional form in `functional.py`
4. C++ module options - includes an additional line about newly constructed `Embedding`s / `EmbeddingBag`s
5. C++ `from_pretrained()` options
6. C++ functional options

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56065

Reviewed By: malfet

Differential Revision: D27908383

Pulled By: jbschlosser

fbshipit-source-id: c5891fed1c9d33b4b8cd63500a14c1a77d92cc78
2021-04-21 12:10:37 -07:00
Sam Estep
e3900d2ba5 Add lint for unqualified noqa (#56272)
Summary:
As this diff shows, currently there are a couple hundred instances of raw `noqa` in the codebase, which just ignore all errors on a given line. That isn't great, so this PR changes all existing instances of that antipattern to qualify the `noqa` with respect to a specific error code, and adds a lint to prevent more of this from happening in the future.

Interestingly, some of the examples the `noqa` lint catches are genuine attempts to qualify the `noqa` with a specific error code, such as these two:
```
test/jit/test_misc.py:27:            print(f"{hello + ' ' + test}, I'm a {test}") # noqa E999
test/jit/test_misc.py:28:            print(f"format blank") # noqa F541
```
However, those are still wrong because they are [missing a colon](https://flake8.pycqa.org/en/3.9.1/user/violations.html#in-line-ignoring-errors), which actually causes the error code to be completely ignored:

- If you change them to anything else, the warnings will still be suppressed.
- If you add the necessary colons then it is revealed that `E261` was also being suppressed, unintentionally:
  ```
  test/jit/test_misc.py:27:57: E261 at least two spaces before inline comment
  test/jit/test_misc.py:28:35: E261 at least two spaces before inline comment
  ```
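
For reference, a correctly qualified suppression includes the colon. A minimal sketch (the specific error codes are just examples):
```
import os, sys  # noqa: E401
print(f"format blank")  # noqa: F541
```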

I did try using [flake8-noqa](https://pypi.org/project/flake8-noqa/) instead of a custom `git grep` lint, but it didn't seem to work. This PR is definitely missing some of the functionality that flake8-noqa is supposed to provide, though, so if someone can figure out how to use it, we should do that instead.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56272

Test Plan:
CI should pass on the tip of this PR, and we know that the lint works because the following CI run (before this PR was finished) failed:

- https://github.com/pytorch/pytorch/runs/2365189927

Reviewed By: janeyx99

Differential Revision: D27830127

Pulled By: samestep

fbshipit-source-id: d6dcf4f945ebd18cd76c46a07f3b408296864fcb
2021-04-19 13:16:18 -07:00
Kurt Mohler
3fe4718d16 Add padding_idx argument to EmbeddingBag (#49237)
Summary:
This PR adds a `padding_idx` parameter to `nn.EmbeddingBag` and `nn.functional.embedding_bag`. As with `nn.Embedding`'s `padding_idx` argument, if an embedding's index is equal to `padding_idx` it is ignored, so it is not included in the reduction.

This PR does not add support for `padding_idx` for quantized or ONNX `EmbeddingBag` for opset10/11 (opset9 is supported). In these cases, an error is thrown if `padding_idx` is provided.

Fixes https://github.com/pytorch/pytorch/issues/3194
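
A minimal usage sketch of the new argument (sizes and `mode='mean'` are illustrative):
```
import torch

bag = torch.nn.EmbeddingBag(num_embeddings=10, embedding_dim=3, mode='mean', padding_idx=0)
indices = torch.tensor([1, 2, 0, 0])  # entries equal to padding_idx are ignored
offsets = torch.tensor([0, 2])        # two bags: [1, 2] and [0, 0]
print(bag(indices, offsets).shape)    # torch.Size([2, 3])
```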

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49237

Reviewed By: walterddr, VitalyFedyunin

Differential Revision: D26948258

Pulled By: jbschlosser

fbshipit-source-id: 3ca672f7e768941f3261ab405fc7597c97ce3dfc
2021-04-14 09:38:01 -07:00
Jeff Yang
263d8ef4ef docs: fix formatting for embedding_bag (#54666)
Summary:
fixes https://github.com/pytorch/pytorch/issues/43499

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54666

Reviewed By: H-Huang

Differential Revision: D27411027

Pulled By: jbschlosser

fbshipit-source-id: a84cc174155bd725e108d8f953a21bb8de8d9d23
2021-04-07 06:32:16 -07:00
Hameer Abbasi
db3a9d7f8a Fix __torch_function__ tests. (#54492)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54492

Test Plan: Imported from OSS

Reviewed By: ailzhang

Differential Revision: D27292567

Pulled By: ezyang

fbshipit-source-id: dc29daea967c6d8aaf63bdbcb4aff0bb13d7a5f7
2021-03-26 10:59:15 -07:00
Bel H
645119eaef Lowering NLLLoss/CrossEntropyLoss to ATen code (#53789)
Summary:
* Lowering NLLLoss/CrossEntropyLoss to ATen dispatch
* This allows the MLC device to override these ops
* Reduces code duplication between the Python and C++ APIs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53789

Reviewed By: ailzhang

Differential Revision: D27345793

Pulled By: albanD

fbshipit-source-id: 99c0d617ed5e7ee8f27f7a495a25ab4158d9aad6
2021-03-26 07:31:08 -07:00
Edward Yang
33b95c6bac Add __torch_function__ support for torch.nn.functional.embedding (#54478)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54478

Fixes #54292

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D27264179

Pulled By: ezyang

fbshipit-source-id: cd267e2e668fdd8d7f958bf70a0b93e058ec7c23
2021-03-23 17:22:39 -07:00
Peter Bell
04e0cbf5a9 Add padding='same' mode to conv{1,2,3}d (#45667)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45667

First part of #3867 (Pooling operators still to do)

This adds a `padding='same'` mode to the interface of `conv{n}d` and `nn.Conv{n}d`. This should match the behaviour of `tensorflow`. I couldn't find it explicitly documented, but through experimentation I found that `tensorflow` returns the shape `ceil(len/stride)` and always adds any extra asymmetric padding onto the right side of the input.

Since the `native_functions.yaml` schema doesn't seem to support strings or enums, I've moved the function interface into Python, and it now dispatches between the numerically padded `conv{n}d` and the `_conv{n}d_same` variant. Underscores because I couldn't see any way to avoid exporting a function into the `torch` namespace.

A note on asymmetric padding: the total padding required can be odd if the kernel length is even and the dilation is odd. mkldnn has native support for asymmetric padding, so there is no overhead there, but for other backends I resort to padding the input tensor by 1 on the right-hand side to make the remaining padding symmetrical. In these cases, I use `TORCH_WARN_ONCE` to notify the user of the performance implications.
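
A minimal sketch of the new interface (shapes are illustrative):
```
import torch
import torch.nn.functional as F

x = torch.randn(1, 4, 10)  # (batch, channels, length)
w = torch.randn(8, 4, 3)   # (out_channels, in_channels, kernel_size)
y = F.conv1d(x, w, padding='same')
print(y.shape)             # torch.Size([1, 8, 10]): length is preserved
```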

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D27170744

Pulled By: jbschlosser

fbshipit-source-id: b3d8a0380e0787ae781f2e5d8ee365a7bfd49f22
2021-03-18 16:22:03 -07:00
kshitij12345
c1a39620b8 [nn] nn.Embedding : padding_idx doc update (#53809)
Summary:
Follow-up of https://github.com/pytorch/pytorch/pull/53447

Reference: https://github.com/pytorch/pytorch/pull/53447#discussion_r590521051

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53809

Reviewed By: bdhirsh

Differential Revision: D27049643

Pulled By: jbschlosser

fbshipit-source-id: 623a2a254783b86391dc2b0777b688506adb4c0e
2021-03-15 11:54:51 -07:00
kshitij12345
45ddf113c9 [fix] nn.Embedding: allow changing the padding vector (#53447)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/53368

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53447

Reviewed By: albanD

Differential Revision: D26946284

Pulled By: jbschlosser

fbshipit-source-id: 54e5eec7da86fa02b1b6e4a235d66976a80764fc
2021-03-10 09:53:27 -08:00
James Reed
215950e2be Convert type annotations in nn/functional.py to py3 syntax (#53656)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53656

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D26926018

Pulled By: jamesr66a

fbshipit-source-id: 2381583cf93c9c9d0c9eeaa6e41eddce3729942d
2021-03-09 22:26:22 -08:00
Evelyn Fitzgerald
b4395b046a Edit SiLU documentation (#53239)
Summary:
I edited the documentation for `nn.SiLU` and `F.silu` to:
- Explain that SiLU is also known as swish and that it stands for "Sigmoid Linear Unit."
- Ensure that "SiLU" is correctly capitalized.

I believe these changes will help users find the function they're looking for by adding relevant keywords to the docs.

Fixes: N/A

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53239

Reviewed By: jbschlosser

Differential Revision: D26816998

Pulled By: albanD

fbshipit-source-id: b4e9976e6b7e88686e3fa7061c0e9b693bd6d198
2021-03-04 12:51:25 -08:00
Joel Schlosser
e86476f736 Huber loss (#50553)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/48595.

## Background

This PR implements HuberLoss, which differs from SmoothL1Loss by a factor of beta. The current implementation does not share logic between the two. Feedback is welcome for the optimal way to minimize code duplication while remaining performant.

I've done some early [benchmarking](https://pytorch.org/tutorials/recipes/recipes/benchmark.html#collecting-instruction-counts-with-callgrind) with Huber calling in to the Smooth L1 kernel and scaling afterwards; for the simple test case I used, instruction counts are as follows:
```
Huber loss calls dedicated Huber kernel: 2,795,300
Huber loss calls Smooth L1 kernel and scales afterwards: 4,523,612
```
With these numbers, instruction counts are ~62% higher when using the pre-existing Smooth L1 kernel.
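
A minimal usage sketch of the new loss (values are illustrative):
```
import torch

loss = torch.nn.HuberLoss(delta=1.5)  # quadratic below delta, linear above
input = torch.randn(4, requires_grad=True)
target = torch.randn(4)
out = loss(input, target)
out.backward()
```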

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50553

Test Plan:
```
python test/test_nn.py TestNN.test_HuberLoss
python test/test_nn.py TestNN.test_HuberLoss_delta
python test/test_nn.py TestNN.test_huber_loss_invalid_delta
python test/test_nn.py TestNNDeviceTypeCPU.test_smooth_l1_loss_vs_huber_loss_cpu
python test/test_nn.py TestNNDeviceTypeCUDA.test_smooth_l1_loss_vs_huber_loss_cuda
python test/test_nn.py TestNNDeviceTypeCPU.test_invalid_reduction_strings_cpu
python test/test_nn.py TestNNDeviceTypeCUDA.test_invalid_reduction_strings_cuda
python test/test_nn.py TestNN.test_loss_equal_input_target_shape
python test/test_nn.py TestNN.test_pointwise_loss_broadcast
python test/test_overrides.py
python test/test_jit.py TestJitGeneratedFunctional.test_nn_huber_loss
python test/test_type_hints.py
python test/test_cpp_api_parity.py
build/bin/test_api
```

## Documentation
<img width="677" alt="Screen Shot 2021-01-14 at 4 25 08 PM" src="https://user-images.githubusercontent.com/75754324/104651224-5a445980-5685-11eb-884b-14ea517958c2.png">
<img width="677" alt="Screen Shot 2021-01-14 at 4 24 35 PM" src="https://user-images.githubusercontent.com/75754324/104651190-4e589780-5685-11eb-974d-8c63a89c050e.png">
<img width="661" alt="Screen Shot 2021-01-14 at 4 24 45 PM" src="https://user-images.githubusercontent.com/75754324/104651198-50225b00-5685-11eb-958e-136b36f6f8a8.png">
<img width="869" alt="Screen Shot 2021-01-14 at 4 25 27 PM" src="https://user-images.githubusercontent.com/75754324/104651208-53b5e200-5685-11eb-9fe4-5ff433aa13c5.png">
<img width="862" alt="Screen Shot 2021-01-14 at 4 25 48 PM" src="https://user-images.githubusercontent.com/75754324/104651209-53b5e200-5685-11eb-8051-b0cfddcb07d3.png">

Reviewed By: H-Huang

Differential Revision: D26734071

Pulled By: jbschlosser

fbshipit-source-id: c98c1b5f32a16f7a2a4e04bdce678080eceed5d5
2021-03-02 17:30:45 -08:00
Jeff Yang
316eabe9ba fix(docs): remove redundant hardsigmoid() in docstring to show up inplace parameter (#52559)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50016

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52559

Reviewed By: ailzhang

Differential Revision: D26636347

Pulled By: vkuzo

fbshipit-source-id: da615d0eb6372637a6441e53698e86252591f6d8
2021-02-25 09:09:32 -08:00
Bel H
99a428ab22 Lower ReLu6 to aten (#52723)
Summary:
- Lower ReLU6 to ATen
- Change Python and C++ to reflect the change
- Add an entry in native_functions.yaml for the new function
- This is needed as we would like to intercept ReLU6 at a higher level with an XLA-approach codegen.
- Functional C++ tests should pass, but please let me know if more tests are required.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52723

Reviewed By: ailzhang

Differential Revision: D26641414

Pulled By: albanD

fbshipit-source-id: dacfc70a236c4313f95901524f5f021503f6a60f
2021-02-25 08:38:11 -08:00
Jeff Yang
f111ec48c1 docs: add fractional_max_pool in nn.functional (#52557)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/51708

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52557

Reviewed By: bdhirsh

Differential Revision: D26591388

Pulled By: jbschlosser

fbshipit-source-id: 42643864df92ea014e69a8ec5c29333735e98898
2021-02-22 20:45:07 -08:00
Joel Schlosser
a39b1c42c1 MHA: Fix regression and apply bias flag to both in/out proj (#52537)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/52257

## Background
Reverts MHA behavior for `bias` flag to that of v1.5: flag enables or disables both in and out projection biases.

Updates type annotations for both in and out projections biases from `Tensor` to `Optional[Tensor]` for `torch.jit.script` usage.

Note: With this change, `_LinearWithBias` defined in `torch/nn/modules/linear.py` is no longer utilized. Completely removing it would require updates to quantization logic in the following files:
```
test/quantization/test_quantized_module.py
torch/nn/quantizable/modules/activation.py
torch/nn/quantized/dynamic/modules/linear.py
torch/nn/quantized/modules/linear.py
torch/quantization/quantization_mappings.py
```
This PR takes a conservative initial approach and leaves these files unchanged.

**Is it safe to fully remove `_LinearWithBias`?**

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52537

Test Plan:
```
python test/test_nn.py TestNN.test_multihead_attn_no_bias
```

## BC-Breaking Note
In v1.6, the behavior of `MultiheadAttention`'s `bias` flag was incorrectly changed to affect only the in projection layer. That is, setting `bias=False` would fail to disable the bias for the out projection layer. This regression has been fixed, and the `bias` flag now correctly applies to both the in and out projection layers.
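
A minimal sketch of the restored behaviour (dimensions are illustrative):
```
import torch

mha = torch.nn.MultiheadAttention(embed_dim=8, num_heads=2, bias=False)
# with the fix, both projection biases are disabled
print(mha.in_proj_bias, mha.out_proj.bias)  # None None
```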

Reviewed By: bdhirsh

Differential Revision: D26583639

Pulled By: jbschlosser

fbshipit-source-id: b805f3a052628efb28b89377a41e06f71747ac5b
2021-02-22 14:47:12 -08:00
Mike Ruberry
594a66d778 Warn about floor_divide performing incorrect rounding (#50281) (#50281)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50281

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51745

Test Plan: Imported from OSS

Reviewed By: ngimel

Pulled By: mruberry

Differential Revision: D26257855

fbshipit-source-id: e5d497cf07b0c746838ed081c5d0e82fb4cb701b
2021-02-10 03:13:34 -08:00
Brandon Lin
35b3e16091 [pytorch] Fix torch.nn.functional.normalize to be properly scriptable (#51909)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51909

Several scenarios don't work when trying to script `F.normalize`, notably when you try to symbolically trace through it while using the default argument:

```
import torch.nn.functional as F
import torch
from torch.fx import symbolic_trace

def f(x):
    return F.normalize(x)

gm = symbolic_trace(f)
torch.jit.script(gm)
```
which leads to the error
```
RuntimeError:

normalize(Tensor input, float p=2., int dim=1, float eps=9.9999999999999998e-13, Tensor? out=None) -> (Tensor):
Expected a value of type 'float' for argument 'p' but instead found type 'int'.
:
def forward(self, x):
    normalize_1 = torch.nn.functional.normalize(x, p = 2, dim = 1, eps = 1e-12, out = None);  x = None
                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    return normalize_1
```
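
With the fix, scripting the default call works. A minimal sketch (input shape is illustrative):
```
import torch
import torch.nn.functional as F

def f(x):
    return F.normalize(x)  # the default p no longer trips the int-vs-float check

scripted = torch.jit.script(f)
print(scripted(torch.randn(2, 3)).shape)  # torch.Size([2, 3])
```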

Reviewed By: jamesr66a

Differential Revision: D26324308

fbshipit-source-id: 30dd944a6011795d17164f2c746068daac570cea
2021-02-09 07:26:57 -08:00
jiej
4d703d040b Linear autodiff revert revert (#51613)
Summary:
Patch PR https://github.com/pytorch/pytorch/issues/50856 and roll back the revert D26105797 (e488e3c443)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51613

Reviewed By: mruberry

Differential Revision: D26253999

Pulled By: ngimel

fbshipit-source-id: a20b1591de06dd277e4cd95542e3291a2f5a252c
2021-02-04 16:32:05 -08:00
Natalia Gimelshein
26f9ac98e5 Revert D26105797: [pytorch][PR] Exposing linear layer to fuser
Test Plan: revert-hammer

Differential Revision:
D26105797 (e488e3c443)

Original commit changeset: 6f7cedb9f6e3

fbshipit-source-id: f0858cefed76d726e9dba61e51e1eaf2af4c99c5
2021-02-02 17:39:17 -08:00
jiej
e488e3c443 Exposing linear layer to fuser (#50856)
Summary:
1. Enable linear in autodiff;
2. Remove control flow in Python for linear.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50856

Reviewed By: pbelevich

Differential Revision: D26105797

Pulled By: eellison

fbshipit-source-id: 6f7cedb9f6e3e46daa24223d2a6080880498deb4
2021-02-02 15:39:01 -08:00
M.L. Croci
8eb90d4865 Add Gaussian NLL Loss (#50886)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/48520.

cc albanD (This is a clean retry PR https://github.com/pytorch/pytorch/issues/49807)
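
A minimal usage sketch of the new loss (sizes are illustrative):
```
import torch

loss = torch.nn.GaussianNLLLoss()
input = torch.randn(5, 2, requires_grad=True)  # predicted means
target = torch.randn(5, 2)
var = torch.ones(5, 2, requires_grad=True)     # predicted variances
loss(input, target, var).backward()
```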

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50886

Reviewed By: ejguan

Differential Revision: D26007435

Pulled By: albanD

fbshipit-source-id: 88fe91b40dea6f72e093e6301f0f04fcc842d2f0
2021-01-22 06:56:49 -08:00
Taylor Robie
6a3fc0c21c Treat has_torch_function and object_has_torch_function as static False when scripting (#48966)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48966

This PR lets us skip the `if not torch.jit.is_scripting():` guards on `functional` and `nn.functional` by directly registering `has_torch_function` and `object_has_torch_function` to the JIT as statically False.

**Benchmarks**

The benchmark script is kind of long. The reason is that it's testing all four PRs in the stack, plus threading and subprocessing so that the benchmark can utilize multiple cores while still collecting good numbers. Both wall times and instruction counts were collected. This stack changes dozens of operators / functions, but very mechanically such that there are only a handful of codepath changes. Each row is a slightly different code path (e.g. testing in Python, testing in the arg parser, different input types, etc.)

<details>

<summary> Test script </summary>

```
import argparse
import multiprocessing
import multiprocessing.dummy
import os
import pickle
import queue
import random
import sys
import subprocess
import tempfile
import time

import torch
from torch.utils.benchmark import Timer, Compare, Measurement

NUM_CORES = multiprocessing.cpu_count()
ENVS = {
    "ref": "HEAD (current)",
    "torch_fn_overhead_stack_0": "#48963",
    "torch_fn_overhead_stack_1": "#48964",
    "torch_fn_overhead_stack_2": "#48965",
    "torch_fn_overhead_stack_3": "#48966",
}

CALLGRIND_ENVS = tuple(ENVS.keys())

MIN_RUN_TIME = 3
REPLICATES = {
    "longer": 1_000,
    "long": 300,
    "short": 50,
}

CALLGRIND_NUMBER = {
    "overnight": 500_000,
    "long": 250_000,
    "short": 10_000,
}

CALLGRIND_TIMEOUT = {
    "overnight": 800,
    "long": 400,
    "short": 100,
}

SETUP = """
    x = torch.ones((1, 1))
    y = torch.ones((1, 1))
    w_tensor = torch.ones((1, 1), requires_grad=True)
    linear = torch.nn.Linear(1, 1, bias=False)
    linear_w = linear.weight
"""

TASKS = {
    "C++: unary                 `.t()`": "w_tensor.t()",
    "C++: unary  (Parameter)    `.t()`": "linear_w.t()",
    "C++: binary (Parameter)    `mul` ": "x + linear_w",
    "tensor.py: _wrap_type_error_to_not_implemented `__floordiv__`": "x // y",
    "tensor.py: method          `__hash__`": "hash(x)",
    "Python scalar              `__rsub__`": "1 - x",
    "functional.py: (unary)     `unique`": "torch.functional.unique(x)",
    "functional.py: (args)      `atleast_1d`": "torch.functional.atleast_1d((x, y))",
    "nn/functional.py: (unary)  `relu`": "torch.nn.functional.relu(x)",
    "nn/functional.py: (args)   `linear`": "torch.nn.functional.linear(x, w_tensor)",
    "nn/functional.py: (args)   `linear (Parameter)`": "torch.nn.functional.linear(x, linear_w)",
    "Linear(..., bias=False)": "linear(x)",
}

def _worker_main(argv, fn):
    parser = argparse.ArgumentParser()
    parser.add_argument("--output_file", type=str)
    parser.add_argument("--single_task", type=int, default=None)
    parser.add_argument("--length", type=str)
    args = parser.parse_args(argv)
    single_task = args.single_task

    conda_prefix = os.getenv("CONDA_PREFIX")
    assert torch.__file__.startswith(conda_prefix)

    env = os.path.split(conda_prefix)[1]
    assert env in ENVS

    results = []
    for i, (k, stmt) in enumerate(TASKS.items()):
        if single_task is not None and single_task != i:
            continue

        timer = Timer(
            stmt=stmt,
            setup=SETUP,
            sub_label=k,
            description=ENVS[env],
        )
        results.append(fn(timer, args.length))

    with open(args.output_file, "wb") as f:
        pickle.dump(results, f)

def worker_main(argv):
    _worker_main(
        argv,
        lambda timer, _: timer.blocked_autorange(min_run_time=MIN_RUN_TIME)
    )

def callgrind_worker_main(argv):
    _worker_main(
        argv,
        lambda timer, length: timer.collect_callgrind(number=CALLGRIND_NUMBER[length], collect_baseline=False))

def main(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--long", action="store_true")
    parser.add_argument("--longer", action="store_true")
    args = parser.parse_args(argv)

    if args.longer:
        length = "longer"
    elif args.long:
        length = "long"
    else:
        length = "short"
    replicates = REPLICATES[length]

    num_workers = int(NUM_CORES // 2)
    tasks = list(ENVS.keys()) * replicates
    random.shuffle(tasks)
    task_queue = queue.Queue()
    for _ in range(replicates):
        envs = list(ENVS.keys())
        random.shuffle(envs)
        for e in envs:
            task_queue.put((e, None))

    callgrind_task_queue = queue.Queue()
    for e in CALLGRIND_ENVS:
        for i, _ in enumerate(TASKS):
            callgrind_task_queue.put((e, i))

    results = []
    callgrind_results = []

    def map_fn(worker_id):
        # Adjacent cores often share cache and maxing out a machine can distort
        # timings so we space them out.
        callgrind_cores = f"{worker_id * 2}-{worker_id * 2 + 1}"
        time_cores = str(worker_id * 2)
        _, output_file = tempfile.mkstemp(suffix=".pkl")
        try:
            loop_tasks = (
                # Callgrind is long running, and then the workers can help with
                # timing after they finish collecting counts.
                (callgrind_task_queue, callgrind_results, "callgrind_worker", callgrind_cores, CALLGRIND_TIMEOUT[length]),
                (task_queue, results, "worker", time_cores, None))

            for queue_i, results_i, mode_i, cores, timeout in loop_tasks:
                while True:
                    try:
                        env, task_i = queue_i.get_nowait()
                    except queue.Empty:
                        break

                    remaining_attempts = 3
                    while True:
                        try:
                            subprocess.run(
                                " ".join([
                                    "source", "activate", env, "&&",
                                    "taskset", "--cpu-list", cores,
                                    "python", os.path.abspath(__file__),
                                    "--mode", mode_i,
                                    "--length", length,
                                    "--output_file", output_file
                                ] + ([] if task_i is None else ["--single_task", str(task_i)])),
                                shell=True,
                                check=True,
                                timeout=timeout,
                            )
                            break

                        except subprocess.TimeoutExpired:
                            # Sometimes Valgrind will hang if there are too many
                            # concurrent runs.
                            remaining_attempts -= 1
                            if not remaining_attempts:
                                print("Too many failed attempts.")
                                raise
                            print(f"Timeout after {timeout} sec. Retrying.")

                    # We don't need a lock, as the GIL is enough.
                    with open(output_file, "rb") as f:
                        results_i.extend(pickle.load(f))

        finally:
            os.remove(output_file)

    with multiprocessing.dummy.Pool(num_workers) as pool:
        st, st_estimate, eta, n_total = time.time(), None, "", len(tasks) * len(TASKS)
        map_job = pool.map_async(map_fn, range(num_workers))
        while not map_job.ready():
            n_complete = len(results)
            if n_complete and len(callgrind_results):
                if st_estimate is None:
                    st_estimate = time.time()
                else:
                    sec_per_element = (time.time() - st_estimate) / n_complete
                    n_remaining = n_total - n_complete
                    eta = f"ETA: {n_remaining * sec_per_element:.0f} sec"

            print(
                f"\r{n_complete} / {n_total}  "
                f"({len(callgrind_results)} / {len(CALLGRIND_ENVS) * len(TASKS)})   "
                f"{eta}".ljust(40), end="")
            sys.stdout.flush()
            time.sleep(2)
    total_time = int(time.time() - st)
    print(f"\nTotal time: {int(total_time // 60)} min, {total_time % 60} sec")

    desc_to_ind = {k: i for i, k in enumerate(ENVS.values())}
    results.sort(key=lambda r: desc_to_ind[r.description])

    # TODO: Compare should be richer and more modular.
    compare = Compare(results)
    compare.trim_significant_figures()
    compare.colorize(rowwise=True)

    # Manually add master vs. overall relative delta t.
    merged_results = {
        (r.description, r.sub_label): r
        for r in Measurement.merge(results)
    }

    cmp_lines = str(compare).splitlines(False)
    print(cmp_lines[0][:-1] + "-" * 15 + "]")
    print(f"{cmp_lines[1]} |{'':>10}\u0394t")
    print(cmp_lines[2] + "-" * 15)
    for l, t in zip(cmp_lines[3:3 + len(TASKS)], TASKS.keys()):
        assert l.strip().startswith(t)
        t0 = merged_results[(ENVS["ref"], t)].median
        t1 = merged_results[(ENVS["torch_fn_overhead_stack_3"], t)].median
        print(f"{l} |{'':>5}{(t1 / t0 - 1) * 100:>6.1f}%")
    print("\n".join(cmp_lines[3 + len(TASKS):]))

    counts_dict = {
        (r.task_spec.description, r.task_spec.sub_label): r.counts(denoise=True)
        for r in callgrind_results
    }

    def rel_diff(x, x0):
        return f"{(x / x0 - 1) * 100:>6.1f}%"

    task_pad = max(len(t) for t in TASKS)
    print(f"\n\nInstruction % change (relative to `{CALLGRIND_ENVS[0]}`)")
    print(" " * (task_pad + 8)  + (" " * 7).join([ENVS[env] for env in CALLGRIND_ENVS[1:]]))
    for t in TASKS:
        values = [counts_dict[(ENVS[env], t)] for env in CALLGRIND_ENVS]

        print(t.ljust(task_pad + 3) + "  ".join([
            rel_diff(v, values[0]).rjust(len(ENVS[env]) + 5)
            for v, env in zip(values[1:], CALLGRIND_ENVS[1:])]))

        print("\033[4m" + "    Instructions per invocation".ljust(task_pad + 3) + "  ".join([
            f"{v // CALLGRIND_NUMBER[length]:.0f}".rjust(len(ENVS[env]) + 5)
            for v, env in zip(values[1:], CALLGRIND_ENVS[1:])]) + "\033[0m")
        print()

    import pdb
    pdb.set_trace()

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--mode", type=str, choices=("main", "worker", "callgrind_worker"), default="main")
    args, remaining = parser.parse_known_args()

    if args.mode == "main":
        main(remaining)

    elif args.mode == "callgrind_worker":
        callgrind_worker_main(remaining)

    else:
        worker_main(remaining)

```

</details>

**Wall time**
<img width="1178" alt="Screen Shot 2020-12-12 at 12 28 13 PM" src="https://user-images.githubusercontent.com/13089297/101994419-284f6a00-3c77-11eb-8dc8-4f69a890302e.png">

<details>

<summary> Longer run (`python test.py --long`) is basically identical. </summary>

<img width="1184" alt="Screen Shot 2020-12-12 at 5 02 47 PM" src="https://user-images.githubusercontent.com/13089297/102000425-2350e180-3c9c-11eb-999e-a95b37e9ef54.png">

</details>

**Callgrind**
<img width="936" alt="Screen Shot 2020-12-12 at 12 28 54 PM" src="https://user-images.githubusercontent.com/13089297/101994421-2e454b00-3c77-11eb-9cd3-8cde550f536e.png">

Test Plan: existing unit tests.

Reviewed By: ezyang

Differential Revision: D25590731

Pulled By: robieta

fbshipit-source-id: fe05305ff22b0e34ced44b60f2e9f07907a099dd
2021-01-10 19:23:38 -08:00
Taylor Robie
d31a760be4 move has_torch_function to C++, and make a special case object_has_torch_function (#48965)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48965

This PR pulls `__torch_function__` checking entirely into C++, and adds a special `object_has_torch_function` method for ops which only have one arg, as this lets us skip tuple construction and unpacking. We can now also do away with the Python-side fast bailout for `Tensor` (e.g. `if any(type(t) is not Tensor for t in tensors) and has_torch_function(tensors)`) because it is actually slower than checking with the Python C API.
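
For reference, the check is also exposed in Python; a minimal sketch (plain Tensors take the fast path):
```
import torch
from torch.overrides import has_torch_function

t = torch.ones(2)
print(has_torch_function((t,)))  # False: no override, so the C++ fast path applies
```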

Test Plan: Existing unit tests. Benchmarks are in #48966

Reviewed By: ezyang

Differential Revision: D25590732

Pulled By: robieta

fbshipit-source-id: 6bd74788f06cdd673f3a2db898143d18c577eb42
2021-01-10 19:23:35 -08:00
Richard Barnes
2bceee785f Clean up simple type annotations in nn/functional.py (#50106)
Summary:
Also reformats code to pass linters.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50106

Test Plan: Sandcastle tests

Reviewed By: xush6528

Differential Revision: D25787566

fbshipit-source-id: 39c86b4021e279f92f8ccf30252a6cfae1063c3c
2021-01-07 15:33:40 -08:00
Samuel Marks
e6779d4357 [*.py] Rename "Arguments:" to "Args:" (#49736)
Summary:
I've written custom parsers and emitters for everything from docstrings to classes and functions. However, I recently came across an issue when I was parsing/generating from the TensorFlow codebase: inconsistent use of `Args:` and `Arguments:` in its docstrings.

```sh
(pytorch#c348fae)$ for name in 'Args:' 'Arguments:'; do
    printf '%-10s %04d\n' "$name" "$(rg -IFtpy --count-matches "$name" | paste -s -d+ -- | bc)"; done
Args:      1095
Arguments: 0336
```

It is easy enough to extend my parsers to support both variants, however it looks like `Arguments:` is wrong anyway, as per:

  - https://google.github.io/styleguide/pyguide.html#doc-function-args @ [`ddccc0f`](https://github.com/google/styleguide/blob/ddccc0f/pyguide.md)

  - https://chromium.googlesource.com/chromiumos/docs/+/master/styleguide/python.md#describing-arguments-in-docstrings @ [`9fc0fc0`](https://chromium.googlesource.com/chromiumos/docs/+/9fc0fc0/styleguide/python.md)

  - https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html @ [`c0ae8e3`](https://github.com/sphinx-contrib/napoleon/blob/c0ae8e3/docs/source/example_google.rst)

Therefore, only `Args:` is valid. This PR replaces them throughout the codebase.
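
For illustration, the accepted Google-style form (the function here is hypothetical):
```
def scale(x, factor):
    """Scale a value.

    Args:
        x: the input value.
        factor: the multiplier.

    Returns:
        The scaled value.
    """
    return x * factor
```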

PS: For related PRs, see tensorflow/tensorflow/pull/45420

PPS: The trackbacks automatically appearing below are sending the same changes to other repositories in the [PyTorch](https://github.com/pytorch) organisation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49736

Reviewed By: albanD

Differential Revision: D25710534

Pulled By: soumith

fbshipit-source-id: 61e8ff01abb433e9f78185c2d1d0cbd7c22c1619
2020-12-28 09:34:47 -08:00
Joel Schlosser
68d438c9da Add PixelUnshuffle (#49334)
Summary:
Adds an implementation of `torch.nn.PixelUnshuffle` as the inverse operation of `torch.nn.PixelShuffle`. This addresses https://github.com/pytorch/pytorch/issues/2456
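
A minimal sketch of the inverse relationship (sizes are illustrative):
```
import torch

shuffle = torch.nn.PixelShuffle(2)
unshuffle = torch.nn.PixelUnshuffle(2)
x = torch.randn(1, 4, 3, 3)
assert torch.equal(unshuffle(shuffle(x)), x)  # exact round trip
```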

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49334

Test Plan:
```
# Unit tests.
python test/test_nn.py TestNN.test_pixel_shuffle_unshuffle

# Module test.
python test/test_nn.py TestNN.test_PixelUnshuffle

# C++ API tests.
build/bin/test_api

# C++ / python parity tests.
python test/test_cpp_api_parity.py

# JIT test.
python test/test_jit.py TestJitGeneratedFunctional.test_nn_pixel_unshuffle

# Override tests.
python test/test_overrides.py

# Type hint tests.
python test/test_type_hints.py
```

Screenshots of rendered docs:
<img width="876" alt="Screen Shot 2020-12-18 at 12 19 05 PM" src="https://user-images.githubusercontent.com/75754324/102642255-6b07bb00-412b-11eb-88fa-e53e7e8ba720.png">
<img width="984" alt="Screen Shot 2020-12-18 at 12 19 26 PM" src="https://user-images.githubusercontent.com/75754324/102642276-70fd9c00-412b-11eb-8548-445082a2db02.png">
<img width="932" alt="Screen Shot 2020-12-18 at 12 19 34 PM" src="https://user-images.githubusercontent.com/75754324/102642704-19abfb80-412c-11eb-9546-95bdd1c3cf22.png">
<img width="876" alt="Screen Shot 2020-12-22 at 12 51 36 PM" src="https://user-images.githubusercontent.com/75754324/102918259-986aa680-4454-11eb-99e7-a0b4c8b3e283.png">
<img width="869" alt="Screen Shot 2020-12-22 at 12 51 44 PM" src="https://user-images.githubusercontent.com/75754324/102918274-9ef91e00-4454-11eb-94bb-91b58aff47d3.png">

Reviewed By: mruberry

Differential Revision: D25401439

Pulled By: jbschlosser

fbshipit-source-id: 209d92ce7295e51699e83616d0c62170a7ce75c8
2020-12-22 20:14:55 -08:00
Guanheng Zhang
e2b4c63dd9 Enable the faster combined weight branch in MHA when query/key/value is same object with nan (#48126)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/47979

For the MHA module, it is preferred to use the combined-weight branch as much as possible when query/key/value are the same (either equal values per `torch.equal` or the exact same object per the `is` operator). This PR enables the faster branch when a single object containing `nan` is passed to MHA.

For the background knowledge
```
import torch
a = torch.tensor([float('NaN'), 1, float('NaN'), 2, 3])
print(a is a) # True
print(torch.equal(a, a)) # False
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48126

Reviewed By: gchanan

Differential Revision: D25042082

Pulled By: zhangguanheng66

fbshipit-source-id: 6bb17a520e176ddbb326ddf30ee091a84fcbbf27
2020-11-18 08:24:41 -08:00
Qi Zhou
0ec717c830 Support int32 indices and offsets in nn.EmbeddingBag (#46758)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46758

It is generally helpful to support int32 indices and offsets, especially when such tensors are large and need to be transferred to accelerator backends. Since it may not be very useful to support the combination of int32 indices and int64 offsets, here we enforce that the two must have the same type.
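
A minimal sketch of the relaxed dtype requirement (values are illustrative):
```
import torch

bag = torch.nn.EmbeddingBag(10, 3, mode='sum')
# indices and offsets may now be int32, but must share the same dtype
indices = torch.tensor([1, 2, 4, 5, 4, 3], dtype=torch.int32)
offsets = torch.tensor([0, 3], dtype=torch.int32)
print(bag(indices, offsets).shape)  # torch.Size([2, 3])
```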

Test Plan: unit tests

Reviewed By: ngimel

Differential Revision: D24470808

fbshipit-source-id: 94b8a1d0b7fc9fe3d128247aa042c04d7c227f0b
2020-11-03 23:33:50 -08:00
pomelyu
f41f3e3cd1 Implement bicubic grid sampler (#44780)
Summary:
Fix https://github.com/pytorch/pytorch/issues/44601

I added the bicubic grid sampler on both the CPU and CUDA sides, but not yet in AVX2.

There is a [colab notebook](https://colab.research.google.com/drive/1mIh6TLLj5WWM_NcmKDRvY5Gltbb781oU?usp=sharing) showing some test results. The notebook uses bilinear for testing, since I could only use a distributed version of pytorch in it. You can download it and modify `mode_torch=bicubic` to show the results.

There is some duplicate code for getting and setting values, since the helper function used for bilinear first clips coordinates beyond the boundary and then gets or sets the value, whereas bicubic has more points to consider. I can refactor that part after making sure the overall calculations are correct.
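
A minimal sketch of the new mode (an identity grid, for illustration):
```
import torch
import torch.nn.functional as F

inp = torch.randn(1, 1, 8, 8)
theta = torch.eye(2, 3).unsqueeze(0)  # identity affine transform
grid = F.affine_grid(theta, inp.shape, align_corners=False)
out = F.grid_sample(inp, grid, mode='bicubic', align_corners=False)
print(out.shape)  # torch.Size([1, 1, 8, 8])
```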

Thanks

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44780

Reviewed By: mrshenli

Differential Revision: D24681114

Pulled By: mruberry

fbshipit-source-id: d39c8715e2093a5a5906cb0ef040d62bde578567
2020-11-03 15:34:59 -08:00
Ollin Boer Bohan
ac4ee0ef5d Fix typo in docs for interpolate (#46589)
Summary:
Removes a spurious backtick in [the docs for `torch.nn.functional.interpolate`](https://pytorch.org/docs/stable/nn.functional.html?highlight=grid_sample#torch.nn.functional.interpolate)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46589

Reviewed By: zou3519

Differential Revision: D24422550

Pulled By: ezyang

fbshipit-source-id: c1e6b7de4584b2a3f68b458801a33b3fc71c1944
2020-10-21 11:31:53 -07:00
n-v-k
64b0686986 Expose ChannelShuffle (#46000)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/45999
Also a small fix for the caffe2 counterpart.
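
A minimal usage sketch (channel count is illustrative):
```
import torch

shuffle = torch.nn.ChannelShuffle(2)
x = torch.arange(8.).view(1, 4, 1, 2)  # channels [0, 1, 2, 3]
y = shuffle(x)                         # channel order becomes [0, 2, 1, 3]
```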

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46000

Reviewed By: mruberry

Differential Revision: D24185855

Pulled By: ngimel

fbshipit-source-id: c5d599bb8100b86b81c6901f1b8b8baefc12cb16
2020-10-08 16:00:01 -07:00
Natalia Gimelshein
52f2db752d unify reproducibility notes (#45748)
Summary:
Many of our functions contain the same warning about result reproducibility. Make them use a common template.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45748

Reviewed By: colesbury

Differential Revision: D24089114

Pulled By: ngimel

fbshipit-source-id: e6aa4ce6082f6e0f4ce2713c2bf1864ee1c3712a
2020-10-08 02:14:57 -07:00
Ansley Ussery
7726754e70 Add function signature for pixel_shuffle (#45661)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45661

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D24078627

Pulled By: ansleyadelaide

fbshipit-source-id: 44917ff5932e4d0adcc18ce24ecfc0b5686818e3
2020-10-02 11:46:35 -07:00
Guilherme Leobas
c1e6592964 Enable type-checking of torch.nn.quantized.* modules (#43110)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/43029

I am not changing the following files in this PR:
* `torch/nn/quantized/dynamic/modules/rnn.py` due to https://github.com/pytorch/pytorch/issues/43072
* `torch/nn/quantized/modules/conv.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43110

Reviewed By: gchanan

Differential Revision: D23963258

Pulled By: ezyang

fbshipit-source-id: 0fb0fd13af283f6f7b3434e7bbf62165357d1f98
2020-09-29 18:14:29 -07:00
Brian Hirsh
439930c81b adding a beta parameter to the smooth_l1 loss fn (#44433)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44433

Not entirely sure why, but changing the type of beta from `float` to `double` in autocast_mode.cpp and FunctionsManual.h fixes my compiler errors, failing instead at link time.

Fixed some type errors and updated the fn signature in a few more files.

Removed my usage of Scalar, making beta a double everywhere instead.

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D23636720

Pulled By: bdhirsh

fbshipit-source-id: caea2a1f8dd72b3b5fd1d72dd886b2fcd690af6d
2020-09-25 16:36:28 -07:00
Kurt Mohler
d1c68a7069 Clarify that 5-D 'bilinear' grid_sample is actually trilinear (#45090)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/41528

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45090

Reviewed By: ailzhang

Differential Revision: D23841046

Pulled By: zou3519

fbshipit-source-id: 941770cd5b3e705608957739026e9113e5f0c616
2020-09-22 15:10:22 -07:00
Mike Ruberry
ef885c10d8 [pytorch] Add triplet margin loss with custom distance (#43680)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43680

As discussed [here](https://github.com/pytorch/pytorch/issues/43342),
adding in a Python-only implementation of the triplet-margin loss that takes a
custom distance function.  Still discussing whether this is necessary to add to
PyTorch Core.
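
A minimal sketch of the custom-distance API (the cosine distance here is just an example):
```
import torch
import torch.nn.functional as F

loss_fn = torch.nn.TripletMarginWithDistanceLoss(
    distance_function=lambda a, b: 1.0 - F.cosine_similarity(a, b))
anchor = torch.randn(4, 8, requires_grad=True)
positive = torch.randn(4, 8)
negative = torch.randn(4, 8)
loss_fn(anchor, positive, negative).backward()
```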

Test Plan:
python test/run_tests.py

Imported from OSS

Reviewed By: albanD

Differential Revision: D23363898

fbshipit-source-id: 1cafc05abecdbe7812b41deaa1e50ea11239d0cb
2020-09-22 11:35:52 -07:00
Xiang Gao
e48201c5cf Mention TF32 on related docs (#44690)
Summary:
cc: ptrblck

![image](https://user-images.githubusercontent.com/1032377/93168022-cbbfcb80-f6d6-11ea-8f6e-f2c8a15c5bea.png)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44690

Reviewed By: ngimel

Differential Revision: D23727921

Pulled By: mruberry

fbshipit-source-id: db7cc8e74cde09c13d6a57683129fd839863b914
2020-09-16 19:18:30 -07:00
Xiang Gao
20ac736200 Remove py2 compatible future imports (#44735)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44735

Reviewed By: mruberry

Differential Revision: D23731306

Pulled By: ezyang

fbshipit-source-id: 0ba009a99e475ddbe22981be8ac636f8a1c8b02f
2020-09-16 12:55:57 -07:00
Gregory Chanan
5579b53a7f Fix SmoothL1Loss when target.requires_grad is True. (#44486)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44486

SmoothL1Loss had a completely different (and incorrect, see #43228) path when target.requires_grad was True.

This PR does the following:

1) adds derivative support for target via the normal derivatives.yaml route
2) kill the different (and incorrect) path for when target.requires_grad was True
3) modify the SmoothL1Loss CriterionTests to verify that the target derivative is checked.
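
A minimal sketch of the fixed behaviour (values are illustrative):
```
import torch
import torch.nn.functional as F

input = torch.randn(3, requires_grad=True)
target = torch.randn(3, requires_grad=True)
F.smooth_l1_loss(input, target).backward()
print(target.grad)  # target now receives a gradient via the standard path
```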

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D23630699

Pulled By: gchanan

fbshipit-source-id: 0f94d1a928002122d6b6875182867618e713a917
2020-09-11 12:13:36 -07:00
David Reiss
7d78a6fcdd Update interpolate to use new upsample overloads (#43025)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43025

- Use new overloads that better reflect the arguments to interpolate.
- More uniform interface for upsample ops allows simplifying the Python code.
- Also reorder overloads in native_functions.yaml to give them priority.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/37177

ghstack-source-id: 106938111

Test Plan:
test_nn has pretty good coverage.

Relying on CI for ONNX, etc.

Didn't test FC because this change is *not* forward compatible.

To ensure backwards compatibility, I ran this code before this change

```python
def test_func(arg):
    interp = torch.nn.functional.interpolate
    with_size = interp(arg, size=(16,16))
    with_scale = interp(arg, scale_factor=[2.1, 2.2], recompute_scale_factor=False)
    with_compute = interp(arg, scale_factor=[2.1, 2.2])
    return (with_size, with_scale, with_compute)

traced_func = torch.jit.trace(test_func, torch.randn(1,1,1,1))

sample = torch.randn(1, 3, 7, 7)
output = traced_func(sample)

assert not torch.allclose(output[1], output[2])

torch.jit.save(traced_func, "model.pt")
torch.save((sample, output), "data.pt")
```

then this code after this change

```python
model = torch.jit.load("model.pt")
sample, golden = torch.load("data.pt")
result = model(sample)
for r, g in zip(result, golden):
    assert torch.allclose(r, g)
```

Reviewed By: AshkanAliabadi

Differential Revision: D21209991

fbshipit-source-id: 5b2ebb7c3ed76947361fe532d1dbdd6faa3544c8
2020-09-11 09:59:14 -07:00
Gregory Chanan
3de2c0b42f Fix L1Loss when target.requires_grad is True. (#44471)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44471

L1Loss had a completely different (and incorrect, see #43228) path when target.requires_grad was True.

This PR does the following:

1) adds derivative support for target via the normal derivatives.yaml route
2) kill the different (and incorrect) path for when target.requires_grad was True
3) modify the L1Loss CriterionTests to verify that the target derivative is checked.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D23626008

Pulled By: gchanan

fbshipit-source-id: 2828be16b56b8dabe114962223d71b0e9a85f0f5
2020-09-11 09:51:16 -07:00
Gregory Chanan
d07d25a8c5 Fix MSELoss when target.requires_grad is True. (#44437)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44437

MSELoss had a completely different (and incorrect, see https://github.com/pytorch/pytorch/issues/43228) path when target.requires_grad was True.

This PR does the following:
1) adds derivative support for target via the normal derivatives.yaml route
2) kill the different (and incorrect) path for when target.requires_grad was True
3) modify the MSELoss CriterionTests to verify that the target derivative is checked.

TODO:
1) do we still need check_criterion_jacobian when we run grad/gradgrad checks?
2) ensure the Module tests check when target.requires_grad
3) do we actually test when reduction='none' and reduction='mean'?

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D23612166

Pulled By: gchanan

fbshipit-source-id: 4f74d38d8a81063c74e002e07fbb7837b2172a10
2020-09-11 08:51:28 -07:00
Chris Huynh
7b547f086f To fix extra memory allocation when using circular padding (#39273)
Summary:
For fixing https://github.com/pytorch/pytorch/issues/39256

Pull Request resolved: https://github.com/pytorch/pytorch/pull/39273

Reviewed By: anjali411

Differential Revision: D23471811

Pulled By: mruberry

fbshipit-source-id: fb324b51baea765311715cdf14642b334f335733
2020-09-10 00:15:31 -07:00
Nikita Shulga
442684cb25 Enable typechecks for torch.nn.modules.[activation|upsampling] (#44093)
Summary:
Add missing `hardsigmoid`, `silu`, `hardswish` and `multi_head_attention_forward` to functional.pyi.in
 Embed some typing annotations into functional.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44093

Reviewed By: ezyang

Differential Revision: D23494384

Pulled By: malfet

fbshipit-source-id: 27023c16ff5951ceaebb78799c4629efa25f7c5c
2020-09-03 13:20:04 -07:00
Vincent QB
fab012aa28 Revert "Added support for Huber Loss (#37599)" (#43351)
Summary:
This reverts commit 11e5174926 due to [comment](https://github.com/pytorch/pytorch/pull/37599#pullrequestreview-471950192).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43351

Reviewed By: pbelevich, seemethere

Differential Revision: D23249511

Pulled By: vincentqb

fbshipit-source-id: 18b8b346f00eaf0ef7376b06579d404a84add4de
2020-09-01 06:34:26 -07:00
Gregory Chanan
42c895de4d Properly check that reduction strings are valid for l1_loss, smoothl1_loss, and mse_loss. (#43527)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43527

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D23306786

Pulled By: gchanan

fbshipit-source-id: f3b7c9c02ae02813da116cb6b247a95727c47587
2020-08-31 09:53:56 -07:00
Gregory Chanan
1dcc4fb6b7 Kill unused _pointwise_loss function. (#43523)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43523

The code is also wrong, see https://github.com/pytorch/pytorch/issues/43228.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D23305461

Pulled By: gchanan

fbshipit-source-id: 9fe516d87a4243d5ce3c29e8822417709a1d6346
2020-08-31 07:58:04 -07:00
David Reiss
31788ae151 Trim trailing whitespace
Test Plan: CI

Reviewed By: linbinyu

Differential Revision: D23108919

fbshipit-source-id: 913c982351a94080944f350641d7966c6c2cc508
2020-08-14 09:18:40 -07:00
Nikita Shulga
3cf2551f2f Fix torch.nn.functional.grid_sample crashes if grid has NaNs (#42703)
Summary:
In `clip_coordinates`, replace the `minimum(maximum(in))` composition with `clamp_max(clamp_min(in))`.
Swap the order of `clamp_min` operands to clamp NaNs in the grid to 0.

Fixes https://github.com/pytorch/pytorch/issues/42616
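
A minimal sketch of the fixed behaviour (values are illustrative; per the fix, NaN grid values are clamped to 0 instead of crashing):
```
import torch
import torch.nn.functional as F

inp = torch.arange(16.).view(1, 1, 4, 4)
grid = torch.full((1, 2, 2, 2), float('nan'))  # all-NaN sampling grid
out = F.grid_sample(inp, grid, padding_mode='border', align_corners=True)
```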

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42703

Reviewed By: ezyang

Differential Revision: D22987447

Pulled By: malfet

fbshipit-source-id: a8a2d6de8043d6b77c8707326c5412d0250efae6
2020-08-10 16:20:09 -07:00
Hameer Abbasi
3d46e02ea1 Add __torch_function__ for methods (#37091)
Summary:
According to pytorch/rfcs#3

From the goals in the RFC:

1. Support subclassing `torch.Tensor` in Python (done here)
2. Preserve `torch.Tensor` subclasses when calling `torch` functions on them (done here)
3. Use the PyTorch API with `torch.Tensor`-like objects that are _not_ `torch.Tensor`
   subclasses (done in https://github.com/pytorch/pytorch/issues/30730)
4. Preserve `torch.Tensor` subclasses when calling `torch.Tensor` methods. (done here)
5. Propagating subclass instances correctly also with operators, using
   views/slices/indexing/etc. (done here)
6. Preserve subclass attributes when using methods or views/slices/indexing. (done here)
7. A way to insert code that operates on both functions and methods uniformly
   (so we can write a single function that overrides all operators). (done here)
8. The ability to give external libraries a way to also define
   functions/methods that follow the `__torch_function__` protocol. (will be addressed in a separate PR)

This PR makes the following changes:

1. Adds the `self` argument to the arg parser.
2. Dispatches on `self` as well if `self` is not `nullptr`.
3. Adds a `torch._C.DisableTorchFunction` context manager to disable `__torch_function__`.
4. Adds a `torch::torch_function_enabled()` and `torch._C._torch_function_enabled()` to check the state of `__torch_function__`.
5. Dispatches all `torch._C.TensorBase` and `torch.Tensor` methods via `__torch_function__`.

TODO:

- [x] Sequence Methods
- [x] Docs
- [x] Tests

Closes https://github.com/pytorch/pytorch/issues/28361

Benchmarks in https://github.com/pytorch/pytorch/pull/37091#issuecomment-633657778

Pull Request resolved: https://github.com/pytorch/pytorch/pull/37091

Reviewed By: ngimel

Differential Revision: D22765678

Pulled By: ezyang

fbshipit-source-id: 53f8aa17ddb8b1108c0997f6a7aa13cb5be73de0
2020-08-05 20:44:13 -07:00
Yanan Cao
bdcf320bed Support custom exception message (#41907)
Summary:
`raise` and `assert` used to have a hard-coded error message ("Exception"); the user-provided error message was ignored. This PR adds support for representing the user's error message in TorchScript.

This breaks backward compatibility because now we actually need to script the user's error message, which can potentially contain unscriptable expressions. Such programs can break when scripting, but saved models can still continue to work.

Increased an op count in test_mobile_optimizer.py because now we need aten::format to form the actual exception message.

This is built upon an WIP PR:  https://github.com/pytorch/pytorch/pull/34112 by driazati

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41907

Reviewed By: ngimel

Differential Revision: D22778301

Pulled By: gmagogsfm

fbshipit-source-id: 2b94f0db4ae9fe70c4cd03f4048e519ea96323ad
2020-08-01 13:03:45 -07:00
alexandrosstergiou
11e5174926 Added support for Huber Loss (#37599)
Summary:
Current losses in PyTorch include only a (partial) implementation of Huber loss through `smooth l1`, based on Fast RCNN, which essentially uses a delta value of 1. Changing/renaming the [`_smooth_l1_loss()`](3e1859959a/torch/nn/functional.py (L2487)) and refactoring it to include delta enables use of the actual function.

Supplementary to this, I have also made functional and criterion versions for anyone who wants to set the delta explicitly, based on the functional `smooth_l1_loss()` and the criterion `Smooth_L1_Loss()`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/37599

Differential Revision: D21559311

Pulled By: vincentqb

fbshipit-source-id: 34b2a5a237462e119920d6f55ba5ab9b8e086a8c
2020-07-27 10:42:30 -07:00
Oren Amsalem
b6690eb29a Might be good for newcomers to read what N means (#41851)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41851

Reviewed By: izdeby

Differential Revision: D22703602

Pulled By: mrshenli

fbshipit-source-id: 44905f43cdf53b38e383347e5002a28c9363a446
2020-07-23 16:10:38 -07:00
wudenggang
9600ed9af3 typo fixes (#41632)
Summary:
typo fixes

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41632

Reviewed By: ezyang

Differential Revision: D22617827

Pulled By: mrshenli

fbshipit-source-id: c2bfcb7cc36913a8dd32f13fc9adc3aa0a9b682f
2020-07-20 07:23:00 -07:00
Nathan Goldbaum
1e230a5c52 rewrite C++ __torch_function__ handling to work with TensorList operands (#41575)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41575

Fixes https://github.com/pytorch/pytorch/issues/34294

This updates the C++ argument parser to correctly handle `TensorList` operands. I've also included a number of updates to the testing infrastructure, this is because we're now doing a much more careful job of testing the signatures of aten kernels, using the type information about the arguments as read in from `Declarations.yaml`. The changes to the tests are required because we're now only checking for `__torch_function__` attributes on `Tensor`, `Optional[Tensor]` and elements of `TensorList` operands, whereas before we were checking for `__torch_function__` on all operands, so the relatively simplistic approach the tests were using before -- assuming all positional arguments might be tensors -- doesn't work anymore. I now think that checking for `__torch_function__` on all operands was a mistake in the original design.

The updates to the signatures of the `lambda` functions are to handle this new, more stringent checking of signatures.

I also added override support for `torch.nn.functional.threshold` `torch.nn.functional.layer_norm`, which did not yet have python-level support.

Benchmarks are still WIP.
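
A hedged sketch of what the new `TensorList` handling enables (hypothetical `Wrapper` class, following the classmethod form of the protocol):

```
import torch

class Wrapper:
    def __init__(self, data):
        self.data = data

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}

        def unwrap(a):
            if isinstance(a, Wrapper):
                return a.data
            if isinstance(a, (list, tuple)):
                return type(a)(unwrap(x) for x in a)
            return a

        return func(*[unwrap(a) for a in args],
                    **{k: unwrap(v) for k, v in kwargs.items()})

# Wrapper instances inside a TensorList argument are now dispatched on:
out = torch.cat([Wrapper(torch.randn(2)), Wrapper(torch.randn(2))])
```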

Pull Request resolved: https://github.com/pytorch/pytorch/pull/34725

Reviewed By: mruberry

Differential Revision: D22357738

Pulled By: ezyang

fbshipit-source-id: 0e7f4a58517867b2e3f193a0a8390e2ed294e1f3
2020-07-17 08:54:29 -07:00
Kurt Mohler
0b73ea0ea2 Change BCELoss size mismatch warning into an error (#41426)
Summary:
BCELoss currently uses different broadcasting semantics than numpy. Since previous versions of PyTorch have thrown a warning in these cases telling the user that input sizes should match, and since the CUDA and CPU results differ when sizes do not match, it makes sense to upgrade the size mismatch warning to an error.

We can consider supporting numpy broadcasting semantics in BCELoss in the future if needed.
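
A short sketch of the now-enforced contract:

```
import torch
import torch.nn.functional as F

input = torch.rand(3, 2, requires_grad=True)
target = torch.rand(3, 2)  # must match input's shape exactly
loss = F.binary_cross_entropy(input, target)

# A mismatched target such as torch.rand(3) used to trigger only the
# size-mismatch warning (with CPU/CUDA results differing); it now raises.
```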

Closes https://github.com/pytorch/pytorch/issues/40023

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41426

Reviewed By: zou3519

Differential Revision: D22540841

Pulled By: ezyang

fbshipit-source-id: 6c6d94c78fa0ae30ebe385d05a9e3501a42b3652
2020-07-14 20:34:06 -07:00
Heitor Schueroff de Souza
75a4862f63 Added SiLU activation function (#41034)
Summary:
Implemented the SiLU activation function as discussed in https://github.com/pytorch/pytorch/issues/3169.
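
A quick check of the added module and functional forms against the definition silu(x) = x * sigmoid(x):

```
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(4)
assert torch.allclose(nn.SiLU()(x), x * torch.sigmoid(x))
assert torch.allclose(F.silu(x), x * torch.sigmoid(x))
```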

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41034

Reviewed By: glaringlee

Differential Revision: D22465203

Pulled By: heitorschueroff

fbshipit-source-id: b27d064529fc99600c586ad49b594b52b718b0d2
2020-07-10 07:37:30 -07:00
Negin Raoof
f69d6a7ea3 [ONNX] Update Default Value of recompute_scale_factor in Interpolate (#39453)
Summary:
This is a duplicate of https://github.com/pytorch/pytorch/pull/38362

"This PR completes Interpolate's deprecation process for recomputing the scales values, by updating the default value of the parameter recompute_scale_factor as planned for pytorch 1.6.0.
The warning message is also updated accordingly."

I'm recreating this PR as the previous one is not being updated.
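
For callers who want behavior that is stable across versions, passing the parameter explicitly sidesteps the default change (a minimal sketch):

```
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)
# Pinning recompute_scale_factor explicitly avoids relying on the
# default that changes in 1.6.0.
y = F.interpolate(x, scale_factor=2.0, mode="nearest",
                  recompute_scale_factor=False)
```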

cc gchanan

Pull Request resolved: https://github.com/pytorch/pytorch/pull/39453

Reviewed By: hl475

Differential Revision: D21955284

Pulled By: houseroad

fbshipit-source-id: 911585d39273a9f8de30d47e88f57562216968d8
2020-07-09 11:32:49 -07:00
David Reiss
4dad829ea3 In interpolate, inline the call to _interp_output_size (#37173)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37173

This function is only called in one place, so inline it.  This eliminates
boilerplate related to overloads and allows for further simplification
of shared logic in later diffs.

All shared local variables have the same names (from closed_over_args),
and no local variables accidentally collide.
ghstack-source-id: 106938108

Test Plan: Existing tests for interpolate.

Differential Revision: D21209995

fbshipit-source-id: acfadf31936296b2aac0833f704764669194b06f
2020-07-07 13:52:18 -07:00
David Reiss
3c1c74c366 In interpolate, move exceptional cases to the bottom (#37172)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37172

This improves readability by keeping cases with similar behavior close
together.  It should also have a very tiny positive impact on perf.
ghstack-source-id: 106938109

Test Plan: Existing tests for interpolate.

Differential Revision: D21209996

fbshipit-source-id: c813e56aa6ba7370b89a2784fcb62cc146005258
2020-07-07 13:52:16 -07:00
David Reiss
8f0e254790 In interpolate, use if instead of elif (#37171)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37171

Every one of these branches returns or raises, so there's no need for elif.
This makes it a little easier to reorder and move conditions.
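
A toy illustration of the pattern (not the interpolate code itself; hypothetical function):

```
def pick(mode: str) -> str:
    # Every branch returns or raises, so independent ifs behave exactly
    # like an if/elif chain here and can be reordered freely.
    if mode == "nearest":
        return "upsample_nearest"
    if mode == "area":
        return "adaptive_avg_pool"
    raise NotImplementedError(mode)
```
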
ghstack-source-id: 106938110

Test Plan: Existing test for interpolate.

Differential Revision: D21209992

fbshipit-source-id: 5c517e61ced91464b713f7ccf53349b05e27461c
2020-07-07 13:49:53 -07:00
Nayef Ahmed
71af538e31 Updated assert to remove check on 3rd dim for MHA (#39402)
Summary:
## Description
* Updated the assert statement to remove the check on the 3rd dimension (features) for keys and values in MultiheadAttention / Transformer
* The feature dimension for keys and values can now be of different sizes (see the sketch after this list)
* Refer to https://github.com/pytorch/pytorch/issues/27623
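
A minimal sketch of the now-supported usage, via the existing `kdim`/`vdim` constructor arguments:

```
import torch
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=16, num_heads=4, kdim=32, vdim=24)
q = torch.randn(5, 2, 16)  # (seq_len, batch, embed_dim)
k = torch.randn(7, 2, 32)  # key feature size differs from the value's
v = torch.randn(7, 2, 24)
out, attn_weights = mha(q, k, v)
print(out.shape)  # torch.Size([5, 2, 16])
```
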
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39402

Reviewed By: zhangguanheng66

Differential Revision: D21841678

Pulled By: Nayef211

fbshipit-source-id: f0c9e5e0f33259ae2abb6bf9e7fb14e3aa9008eb
2020-06-02 13:35:39 -07:00
Vasiliy Kuznetsov
e5ada042b1 QAT ConvBN: remove explicit folding and use BN instead (#38478)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38478

Before this PR, the QAT ConvBN module inlined the batch normalization code
in order to reproduce Conv+BN folding.

This PR updates the module to use BN directly.  This is mathematically
equivalent to previous behavior as long as we properly scale
and fake quant the conv weights, but allows us to reuse the BN code
instead of reimplementing it.

In particular, this should help with speed since we can use dedicated
BN kernels, and also with DDP since we can hook up SyncBatchNorm.
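
The key identity, as a hedged sketch (hypothetical helper, not the module's code): scaling each output channel of the conv weight by the BN factor makes Conv followed by BN match the folded Conv+BN, so this scaled weight is what gets fake-quantized.

```
import torch

def bn_scaled_conv_weight(conv_weight, bn_gamma, bn_running_var, eps=1e-5):
    # gamma / sqrt(running_var + eps) is the usual Conv+BN folding factor,
    # applied per output channel.
    scale = bn_gamma / torch.sqrt(bn_running_var + eps)
    return conv_weight * scale.reshape(-1, 1, 1, 1)
```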

Test Plan:
```
python test/test_quantization.py TestQATModule
```

Imported from OSS

Differential Revision: D21603230

fbshipit-source-id: ecf8afdd833b67c2fbd21a8fd14366079fa55e64
2020-05-19 08:58:42 -07:00
Bharat123rox
8752d6a736 DOC: Correct upsample doc to match interpolation (#38455)
Summary:
Fix https://github.com/pytorch/pytorch/issues/38334 and correct the docs of `torch.nn.functional.upsample`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38455

Differential Revision: D21583515

Pulled By: driazati

fbshipit-source-id: 6ac5a79ba489bdcdd3fab34e4eddb4864e20a29e
2020-05-15 17:09:26 -07:00
Donna Choi
4c99a9b672 Add documentation for hardswish (#37989)
Summary:
Fix issue https://github.com/pytorch/pytorch/issues/37431.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37989

Differential Revision: D21502182

Pulled By: zou3519

fbshipit-source-id: 245586fb555f7f1d9ec8d87269035b6fe626b47b
2020-05-12 06:48:51 -07:00
Donna Choi
ca2206d071 Add documentation for FeatureAlphaDropout (#36295)
Summary:
These changes add documentation for FeatureAlphaDropout, based on a need raised in an issue by SsnL (Issue https://github.com/pytorch/pytorch/issues/9886).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36295

Differential Revision: D21478591

Pulled By: zou3519

fbshipit-source-id: a73c40bf1c7e3b1f301dc3347cef7b32e9842320
2020-05-08 15:09:01 -07:00
Richard Zou
172bcdb8c8 Add documentation for nn.Hardsigmoid and nn.functional.hardsigmoid. (#38120)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38120

Test Plan: build docs locally and attach a screenshot to this PR.

Differential Revision: D21477815

Pulled By: zou3519

fbshipit-source-id: 420bbcfcbd191d1a8e33cdf4a90c95bf00a5d226
2020-05-08 13:56:45 -07:00
David Reiss
d6b51e4adf In interpolate, join short lines (#37170)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37170

ghstack-source-id: 102773588

Test Plan: CI

Reviewed By: kimishpatel

Differential Revision: D21209998

fbshipit-source-id: 9386e54aa85a5576678d21d443017079028f8dca
2020-05-06 13:03:45 -07:00
David Reiss
59f03c69ab In interpolate, give a short name to scale_factor_list (#37169)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37169

This allows some cleanup of the code below by making lines shorter.
ghstack-source-id: 102773593

Test Plan: Existing tests for interpolate.

Reviewed By: kimishpatel

Differential Revision: D21209988

fbshipit-source-id: cffcdf9a580b15c4f1fa83e3f27b5a69f66bf6f7
2020-05-06 13:03:39 -07:00
David Reiss
4996961826 In interpolate, only call _interp_output_size in one place (#37168)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37168

It looks like this was made a separate function because of the `dim` argument,
but that argument is always equal to `input.dim() - 2`.  Remove the argument
and consolidate all call sites into one.  This also means that this will be
called on paths that previously didn't call it, but all those cases throw
exceptions anyway.
ghstack-source-id: 102773596

Test Plan: Existing tests for interpolate.

Reviewed By: kimishpatel

Differential Revision: D21209993

fbshipit-source-id: 2c274a3a6900ebfdb8d60b311a4c3bd956fa7c37
2020-05-06 13:03:33 -07:00
Edward Yang
4fef3763dd Revert "Revert D21337640: [pytorch][PR] Split up documentation into subpages and clean up some warnings" (#37778)
Summary:
Original PR: https://github.com/pytorch/pytorch/pull/37419

cc mattip suo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37778

Differential Revision: D21385774

Pulled By: ezyang

fbshipit-source-id: 5de532faab8bae132736b6b5189e0ee2ac9935be
2020-05-04 14:32:35 -07:00
Michael Suo
20f7e62b1d Revert D21337640: [pytorch][PR] Split up documentation into subpages and clean up some warnings
Test Plan: revert-hammer

Differential Revision:
D21337640

Original commit changeset: d4ad198780c3

fbshipit-source-id: fa9ba6ac542173a50bdb45bfa12f3fec0ed704fb
2020-05-04 10:57:55 -07:00
mattip
f10fbcc820 Split up documentation into subpages and clean up some warnings (#37419)
Summary:
xref gh-32838, gh-34032

This is a major refactor of parts of the documentation to split it up using sphinx's `autosummary` feature, which builds out `autofunction` and `autoclass` stub files and links to them. The end result is that the top module pages like torch.nn.rst and torch.rst are now more like tables of contents for the actual single-class or single-function documentation pages.

Along the way, I modified many of the docstrings to eliminate sphinx warnings when building. I think the only thing I changed from a non-documentation perspective is to add names to `__all__` when adding them to `globals()` in `torch.__init__.py`.

I do not know the CI system: are the documentation build artifacts available after the build, so reviewers can preview before merging?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37419

Differential Revision: D21337640

Pulled By: ezyang

fbshipit-source-id: d4ad198780c3ae7a96a9f22651e00ff2d31a0c0f
2020-05-04 09:39:22 -07:00
Kimish Patel
df31ddbd98 Add channel shuffle op fp32 + quantized. (#36815)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36815

PyTorch does not have a native channel shuffle op.
This diff adds one for both fp and quantized tensors.
The fp implementation is an inefficient reference one; for quantized tensors there is a native QNNPACK op.
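
An illustration of the op's semantics (assuming it is exposed as `torch.channel_shuffle`):

```
import torch

# With groups=2, channels [0 1 2 3 4 5] are interleaved to [0 3 1 4 2 5].
x = torch.arange(6, dtype=torch.float32).reshape(1, 6, 1, 1)
y = torch.channel_shuffle(x, groups=2)
print(y.flatten().tolist())  # [0.0, 3.0, 1.0, 4.0, 2.0, 5.0]
```
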
ghstack-source-id: 103267234

Test Plan:
buck run caffe2/test:quantization --
quantization.test_quantized.TestQuantizedOps.test_channel_shuffle
The x86 implementation in QNNPACK is SSE2, so this may not be the most efficient on x86.

Reviewed By: dreiss

Differential Revision: D21093841

fbshipit-source-id: 5282945f352df43fdffaa8544fe34dba99a5b97e
2020-05-01 10:07:15 -07:00
Ning Dong
5bb9357345 Update assertion in MHA forward to support FP16 training (#37539)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37539

Bug fix

Test Plan:
This passed the fbtranslate local integration test when I toggled fp16 to true on GPU.

It also passed with D21312488.

Reviewed By: zhangguanheng66

Differential Revision: D21311505

fbshipit-source-id: 7ebd7375ef2c1b2ba4ac6fe7be5e7be1a490a319
2020-04-29 16:29:23 -07:00
Alban Desmaison
3799d1d74a Fix many doc issues (#37099)
Summary:
Fix https://github.com/pytorch/pytorch/issues/35643 https://github.com/pytorch/pytorch/issues/37063 https://github.com/pytorch/pytorch/issues/36307 https://github.com/pytorch/pytorch/issues/35861 https://github.com/pytorch/pytorch/issues/35299 https://github.com/pytorch/pytorch/issues/23108 https://github.com/pytorch/pytorch/issues/4661

Just a bunch of small updates on the doc.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37099

Differential Revision: D21185713

Pulled By: albanD

fbshipit-source-id: 4ac06d6709dc0da6109a6ad3daae75667ee5863e
2020-04-23 10:01:03 -07:00
Ralf Gommers
78d5707041 Fix type annotations and make MyPy run on torch/ (#36584)
Summary:
This PR fixes a couple of syntax errors in `torch/` that prevent MyPy from running, fixes simple type annotation errors (e.g. missing `from typing import List, Tuple, Optional`), and adds granular ignores for errors in particular modules as well as for missing typing in third party packages.
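
Representative of the simplest class of fixes (the "Name 'Optional' is not defined" errors in the list below); the signature is illustrative only:

```
from typing import Optional, Tuple

from torch import Tensor

def lowrank_sketch(a: Tensor, q: Optional[int] = None) -> Tuple[Tensor, Tensor]:
    ...
```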

As a result, running `mypy` in the root dir of the repo now runs on:
- `torch/`
- `aten/src/ATen/function_wrapper.py` (the only file already covered in CI)

In CI this runs on GitHub Actions, job Lint, sub-job "quick-checks", task "MyPy typecheck". It should give (right now): `Success: no issues found in 329 source files`.

Here are the details of the original 855 errors when running `mypy torch` on current master (after fixing the couple of syntax errors that prevent `mypy` from running through):

<details>

```
torch/utils/tensorboard/_proto_graph.py:1: error: Cannot find implementation or library stub for module named 'tensorboard.compat.proto.node_def_pb2'
torch/utils/tensorboard/_proto_graph.py:2: error: Cannot find implementation or library stub for module named 'tensorboard.compat.proto.attr_value_pb2'
torch/utils/tensorboard/_proto_graph.py:3: error: Cannot find implementation or library stub for module named 'tensorboard.compat.proto.tensor_shape_pb2'
torch/utils/backcompat/__init__.py:1: error: Cannot find implementation or library stub for module named 'torch._C'
torch/for_onnx/__init__.py:1: error: Cannot find implementation or library stub for module named 'torch.for_onnx.onnx'
torch/cuda/nvtx.py:2: error: Cannot find implementation or library stub for module named 'torch._C'
torch/utils/show_pickle.py:59: error: Name 'pickle._Unpickler' is not defined
torch/utils/show_pickle.py:113: error: "Type[PrettyPrinter]" has no attribute "_dispatch"
torch/utils/tensorboard/_onnx_graph.py:1: error: Cannot find implementation or library stub for module named 'tensorboard.compat.proto.graph_pb2'
torch/utils/tensorboard/_onnx_graph.py:2: error: Cannot find implementation or library stub for module named 'tensorboard.compat.proto.node_def_pb2'
torch/utils/tensorboard/_onnx_graph.py:3: error: Cannot find implementation or library stub for module named 'tensorboard.compat.proto.versions_pb2'
torch/utils/tensorboard/_onnx_graph.py:4: error: Cannot find implementation or library stub for module named 'tensorboard.compat.proto.attr_value_pb2'
torch/utils/tensorboard/_onnx_graph.py:5: error: Cannot find implementation or library stub for module named 'tensorboard.compat.proto.tensor_shape_pb2'
torch/utils/tensorboard/_onnx_graph.py:9: error: Cannot find implementation or library stub for module named 'onnx'
torch/contrib/_tensorboard_vis.py:10: error: Cannot find implementation or library stub for module named 'tensorflow.core.util'
torch/contrib/_tensorboard_vis.py:11: error: Cannot find implementation or library stub for module named 'tensorflow.core.framework'
torch/contrib/_tensorboard_vis.py:12: error: Cannot find implementation or library stub for module named 'tensorflow.python.summary.writer.writer'
torch/utils/hipify/hipify_python.py:43: error: Need type annotation for 'CAFFE2_TEMPLATE_MAP' (hint: "CAFFE2_TEMPLATE_MAP: Dict[<type>, <type>] = ...")
torch/utils/hipify/hipify_python.py:636: error: "object" has no attribute "items"
torch/nn/_reduction.py:27: error: Name 'Optional' is not defined
torch/nn/_reduction.py:27: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/nn/_reduction.py:47: error: Name 'Optional' is not defined
torch/nn/_reduction.py:47: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/utils/tensorboard/_utils.py:17: error: Skipping analyzing 'matplotlib.pyplot': found module but no type hints or library stubs
torch/utils/tensorboard/_utils.py:17: error: Skipping analyzing 'matplotlib': found module but no type hints or library stubs
torch/utils/tensorboard/_utils.py:18: error: Skipping analyzing 'matplotlib.backends.backend_agg': found module but no type hints or library stubs
torch/utils/tensorboard/_utils.py:18: error: Skipping analyzing 'matplotlib.backends': found module but no type hints or library stubs
torch/nn/modules/utils.py:27: error: Name 'List' is not defined
torch/nn/modules/utils.py:27: note: Did you forget to import it from "typing"? (Suggestion: "from typing import List")
caffe2/proto/caffe2_pb2.py:17: error: Unexpected keyword argument "serialized_options" for "FileDescriptor"; did you mean "serialized_pb"?
caffe2/proto/caffe2_pb2.py:25: error: Unexpected keyword argument "serialized_options" for "EnumDescriptor"
caffe2/proto/caffe2_pb2.py:31: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:35: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:39: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:43: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:47: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:51: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:55: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:59: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:63: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:67: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:71: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:75: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:102: error: Unexpected keyword argument "serialized_options" for "EnumDescriptor"
caffe2/proto/caffe2_pb2.py:108: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:112: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:124: error: Unexpected keyword argument "serialized_options" for "EnumDescriptor"
caffe2/proto/caffe2_pb2.py:130: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:134: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:138: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:142: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:146: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:150: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:154: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:158: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:162: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:166: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:170: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:174: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:178: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:182: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:194: error: Unexpected keyword argument "serialized_options" for "EnumDescriptor"
caffe2/proto/caffe2_pb2.py:200: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:204: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:208: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:212: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:224: error: Unexpected keyword argument "serialized_options" for "EnumDescriptor"
caffe2/proto/caffe2_pb2.py:230: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:234: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:238: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:242: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:246: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:250: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:254: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/caffe2_pb2.py:267: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:274: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:281: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:288: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:295: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:302: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:327: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:334: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:341: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:364: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:371: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:378: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:385: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:392: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:399: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:406: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:413: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:420: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:427: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:434: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:441: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:448: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:455: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:462: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:488: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:495: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:502: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:509: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:516: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:523: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:530: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:537: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:544: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:551: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:558: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:565: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:572: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:596: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:603: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:627: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:634: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:641: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:648: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:655: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:662: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:686: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:693: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:717: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:724: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:731: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:738: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:763: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:770: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:777: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:784: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:808: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:815: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:822: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:829: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:836: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:843: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:850: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:857: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:864: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:871: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:878: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:885: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:892: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:916: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:923: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:930: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:937: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:944: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:951: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:958: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:982: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:989: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:996: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1003: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1010: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1017: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1024: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1031: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1038: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1045: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1052: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1059: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1066: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1090: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:1097: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1104: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1128: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:1135: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1142: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1166: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:1173: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1180: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1187: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1194: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1218: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:1225: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1232: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1239: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1246: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1253: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1260: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1267: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1274: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1281: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1305: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:1312: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1319: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1326: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1333: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1340: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1347: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1354: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1361: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1368: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1375: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1382: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1389: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1396: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1420: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:1427: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1434: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1441: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1465: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:1472: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1479: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1486: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1493: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1500: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1507: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1514: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1538: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/caffe2_pb2.py:1545: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1552: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1559: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1566: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/caffe2_pb2.py:1667: error: "GeneratedProtocolMessageType" has no attribute "Segment"
torch/multiprocessing/queue.py:4: error: No library stub file for standard library module 'multiprocessing.reduction'
caffe2/proto/torch_pb2.py:18: error: Unexpected keyword argument "serialized_options" for "FileDescriptor"; did you mean "serialized_pb"?
caffe2/proto/torch_pb2.py:27: error: Unexpected keyword argument "serialized_options" for "EnumDescriptor"
caffe2/proto/torch_pb2.py:33: error: Unexpected keyword argument "serialized_options" for "EnumValueDescriptor"
caffe2/proto/torch_pb2.py:50: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/torch_pb2.py:57: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:81: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/torch_pb2.py:88: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:95: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:102: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:109: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:116: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:123: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:130: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:137: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:144: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:151: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:175: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/torch_pb2.py:182: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:189: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:196: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:220: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/torch_pb2.py:227: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:234: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:241: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:265: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/torch_pb2.py:272: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:279: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:286: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:293: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:300: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:307: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:314: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:321: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:328: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:335: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:342: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:366: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/torch_pb2.py:373: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:397: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/torch_pb2.py:404: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:411: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:418: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:425: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/torch_pb2.py:432: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:17: error: Unexpected keyword argument "serialized_options" for "FileDescriptor"; did you mean "serialized_pb"?
caffe2/proto/metanet_pb2.py:29: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/metanet_pb2.py:36: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:43: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:50: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:57: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:64: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:88: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/metanet_pb2.py:95: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:102: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:126: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/metanet_pb2.py:133: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:140: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:164: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/metanet_pb2.py:171: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:178: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:202: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/metanet_pb2.py:209: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:216: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:240: error: Unexpected keyword argument "serialized_options" for "Descriptor"
caffe2/proto/metanet_pb2.py:247: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:254: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:261: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:268: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:275: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:282: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:289: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/metanet_pb2.py:296: error: Unexpected keyword argument "serialized_options" for "FieldDescriptor"
caffe2/proto/__init__.py:13: error: Skipping analyzing 'caffe2.caffe2.fb.session.proto': found module but no type hints or library stubs
torch/multiprocessing/pool.py:3: error: No library stub file for standard library module 'multiprocessing.util'
torch/multiprocessing/pool.py:3: note: (Stub files are from https://github.com/python/typeshed)
caffe2/python/scope.py:10: error: Skipping analyzing 'past.builtins': found module but no type hints or library stubs
caffe2/python/__init__.py:7: error: Module has no attribute "CPU"
caffe2/python/__init__.py:8: error: Module has no attribute "CUDA"
caffe2/python/__init__.py:9: error: Module has no attribute "MKLDNN"
caffe2/python/__init__.py:10: error: Module has no attribute "OPENGL"
caffe2/python/__init__.py:11: error: Module has no attribute "OPENCL"
caffe2/python/__init__.py:12: error: Module has no attribute "IDEEP"
caffe2/python/__init__.py:13: error: Module has no attribute "HIP"
caffe2/python/__init__.py:14: error: Module has no attribute "COMPILE_TIME_MAX_DEVICE_TYPES"; maybe "PROTO_COMPILE_TIME_MAX_DEVICE_TYPES"?
caffe2/python/__init__.py:15: error: Module has no attribute "ONLY_FOR_TEST"; maybe "PROTO_ONLY_FOR_TEST"?
caffe2/python/__init__.py:34: error: Item "_Loader" of "Optional[_Loader]" has no attribute "exec_module"
caffe2/python/__init__.py:34: error: Item "None" of "Optional[_Loader]" has no attribute "exec_module"
caffe2/python/__init__.py:35: error: Module has no attribute "cuda"
caffe2/python/__init__.py:37: error: Module has no attribute "cuda"
caffe2/python/__init__.py:49: error: Module has no attribute "add_dll_directory"
torch/random.py:4: error: Cannot find implementation or library stub for module named 'torch._C'
torch/_classes.py:2: error: Cannot find implementation or library stub for module named 'torch._C'
torch/onnx/__init__.py:1: error: Cannot find implementation or library stub for module named 'torch._C'
torch/hub.py:21: error: Skipping analyzing 'tqdm.auto': found module but no type hints or library stubs
torch/hub.py:24: error: Skipping analyzing 'tqdm': found module but no type hints or library stubs
torch/hub.py:27: error: Name 'tqdm' already defined (possibly by an import)
torch/_tensor_str.py:164: error: Not all arguments converted during string formatting
torch/_ops.py:1: error: Cannot find implementation or library stub for module named 'torch._C'
torch/_linalg_utils.py:26: error: Name 'Optional' is not defined
torch/_linalg_utils.py:26: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/_linalg_utils.py:26: error: Name 'Tensor' is not defined
torch/_linalg_utils.py:63: error: Name 'Tensor' is not defined
torch/_linalg_utils.py:63: error: Name 'Optional' is not defined
torch/_linalg_utils.py:63: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/_linalg_utils.py:70: error: Name 'Optional' is not defined
torch/_linalg_utils.py:70: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/_linalg_utils.py:70: error: Name 'Tensor' is not defined
torch/_linalg_utils.py:88: error: Name 'Tensor' is not defined
torch/_linalg_utils.py:88: error: Name 'Optional' is not defined
torch/_linalg_utils.py:88: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/_linalg_utils.py:88: error: Name 'Tuple' is not defined
torch/_linalg_utils.py:88: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Tuple")
torch/_jit_internal.py:17: error: Need type annotation for 'boolean_dispatched'
torch/_jit_internal.py:474: error: Need type annotation for '_overloaded_fns' (hint: "_overloaded_fns: Dict[<type>, <type>] = ...")
torch/_jit_internal.py:512: error: Need type annotation for '_overloaded_methods' (hint: "_overloaded_methods: Dict[<type>, <type>] = ...")
torch/_jit_internal.py:648: error: Incompatible types in assignment (expression has type "FinalCls", variable has type "_SpecialForm")
torch/sparse/__init__.py:11: error: Name 'Tensor' is not defined
torch/sparse/__init__.py:71: error: Name 'Tensor' is not defined
torch/sparse/__init__.py:71: error: Name 'Optional' is not defined
torch/sparse/__init__.py:71: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/sparse/__init__.py:71: error: Name 'Tuple' is not defined
torch/sparse/__init__.py:71: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Tuple")
torch/nn/init.py:109: error: Name 'Tensor' is not defined
torch/nn/init.py:126: error: Name 'Tensor' is not defined
torch/nn/init.py:142: error: Name 'Tensor' is not defined
torch/nn/init.py:165: error: Name 'Tensor' is not defined
torch/nn/init.py:180: error: Name 'Tensor' is not defined
torch/nn/init.py:194: error: Name 'Tensor' is not defined
torch/nn/init.py:287: error: Name 'Tensor' is not defined
torch/nn/init.py:315: error: Name 'Tensor' is not defined
torch/multiprocessing/reductions.py:8: error: No library stub file for standard library module 'multiprocessing.util'
torch/multiprocessing/reductions.py:9: error: No library stub file for standard library module 'multiprocessing.reduction'
torch/multiprocessing/reductions.py:17: error: No library stub file for standard library module 'multiprocessing.resource_sharer'
torch/jit/_builtins.py:72: error: Module has no attribute "_no_grad_embedding_renorm_"
torch/jit/_builtins.py:80: error: Module has no attribute "stft"
torch/jit/_builtins.py:81: error: Module has no attribute "cdist"
torch/jit/_builtins.py:82: error: Module has no attribute "norm"
torch/jit/_builtins.py:83: error: Module has no attribute "nuclear_norm"
torch/jit/_builtins.py:84: error: Module has no attribute "frobenius_norm"
torch/backends/cudnn/__init__.py:8: error: Cannot find implementation or library stub for module named 'torch._C'
torch/backends/cudnn/__init__.py:86: error: Need type annotation for '_handles' (hint: "_handles: Dict[<type>, <type>] = ...")
torch/autograd/profiler.py:13: error: Name 'ContextDecorator' already defined (possibly by an import)
torch/autograd/function.py:2: error: Cannot find implementation or library stub for module named 'torch._C'
torch/autograd/function.py:2: note: See https://mypy.readthedocs.io/en/latest/running_mypy.html#missing-imports
torch/autograd/function.py:109: error: Unsupported dynamic base class "with_metaclass"
torch/serialization.py:609: error: "Callable[[Any], Any]" has no attribute "cache"
torch/_lowrank.py:11: error: Name 'Tensor' is not defined
torch/_lowrank.py:13: error: Name 'Optional' is not defined
torch/_lowrank.py:13: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/_lowrank.py:14: error: Name 'Optional' is not defined
torch/_lowrank.py:14: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/_lowrank.py:14: error: Name 'Tensor' is not defined
torch/_lowrank.py:82: error: Name 'Tensor' is not defined
torch/_lowrank.py:82: error: Name 'Optional' is not defined
torch/_lowrank.py:82: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/_lowrank.py:82: error: Name 'Tuple' is not defined
torch/_lowrank.py:82: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Tuple")
torch/_lowrank.py:130: error: Name 'Tensor' is not defined
torch/_lowrank.py:130: error: Name 'Optional' is not defined
torch/_lowrank.py:130: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/_lowrank.py:130: error: Name 'Tuple' is not defined
torch/_lowrank.py:130: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Tuple")
torch/_lowrank.py:167: error: Name 'Tensor' is not defined
torch/_lowrank.py:167: error: Name 'Optional' is not defined
torch/_lowrank.py:167: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/_lowrank.py:167: error: Name 'Tuple' is not defined
torch/_lowrank.py:167: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Tuple")
torch/quantization/observer.py:45: error: Variable "torch.quantization.observer.ABC" is not valid as a type
torch/quantization/observer.py:45: note: See https://mypy.readthedocs.io/en/latest/common_issues.html#variables-vs-type-aliases
torch/quantization/observer.py:45: error: Invalid base class "ABC"
torch/quantization/observer.py:127: error: Name 'Tensor' is not defined
torch/quantization/observer.py:127: error: Name 'Tuple' is not defined
torch/quantization/observer.py:127: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Tuple")
torch/quantization/observer.py:172: error: Module has no attribute "per_tensor_symmetric"
torch/quantization/observer.py:172: error: Module has no attribute "per_channel_symmetric"
torch/quantization/observer.py:192: error: Name 'Tensor' is not defined
torch/quantization/observer.py:192: error: Name 'Tuple' is not defined
torch/quantization/observer.py:192: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Tuple")
torch/quantization/observer.py:233: error: Module has no attribute "per_tensor_symmetric"
torch/quantization/observer.py:233: error: Module has no attribute "per_channel_symmetric"
torch/quantization/observer.py:534: error: Name 'Tensor' is not defined
torch/quantization/observer.py:885: error: Name 'Tensor' is not defined
torch/quantization/observer.py:885: error: Name 'Tuple' is not defined
torch/quantization/observer.py:885: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Tuple")
torch/quantization/observer.py:894: error: Cannot determine type of 'max_val'
torch/quantization/observer.py:894: error: Cannot determine type of 'min_val'
torch/quantization/observer.py:899: error: Cannot determine type of 'min_val'
torch/quantization/observer.py:902: error: Name 'Tensor' is not defined
torch/quantization/observer.py:925: error: Name 'Tensor' is not defined
torch/quantization/observer.py:928: error: Cannot determine type of 'min_val'
torch/quantization/observer.py:929: error: Cannot determine type of 'max_val'
torch/quantization/observer.py:946: error: Argument "min" to "histc" has incompatible type "Tuple[Tensor, Tensor]"; expected "Union[int, float, bool]"
torch/quantization/observer.py:946: error: Argument "max" to "histc" has incompatible type "Tuple[Tensor, Tensor]"; expected "Union[int, float, bool]"
torch/quantization/observer.py:1056: error: Module has no attribute "per_tensor_symmetric"
torch/quantization/observer.py:1058: error: Module has no attribute "per_channel_symmetric"
torch/nn/quantized/functional.py:76: error: Name 'Tensor' is not defined
torch/nn/quantized/functional.py:76: error: Name 'BroadcastingList2' is not defined
torch/nn/quantized/functional.py:259: error: Name 'Tensor' is not defined
torch/nn/quantized/functional.py:259: error: Name 'Optional' is not defined
torch/nn/quantized/functional.py:259: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/nn/quantized/functional.py:289: error: Module has no attribute "ops"
torch/nn/quantized/functional.py:290: error: Module has no attribute "ops"
torch/nn/quantized/functional.py:308: error: Name 'Tensor' is not defined
torch/nn/quantized/functional.py:326: error: Name 'Tensor' is not defined
torch/nn/quantized/functional.py:356: error: Name 'Tensor' is not defined
torch/nn/quantized/functional.py:371: error: Name 'Tensor' is not defined
torch/nn/quantized/functional.py:400: error: Name 'Tensor' is not defined
torch/nn/quantized/functional.py:400: error: Name 'Optional' is not defined
torch/nn/quantized/functional.py:400: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/nn/quantized/functional.py:430: error: Name 'Tensor' is not defined
torch/nn/quantized/functional.py:448: error: Name 'Tensor' is not defined
torch/nn/quantized/modules/linear.py:26: error: Module has no attribute "ops"
torch/nn/quantized/modules/linear.py:28: error: Module has no attribute "ops"
torch/nn/quantized/modules/functional_modules.py:40: error: Name 'Tensor' is not defined
torch/nn/quantized/modules/functional_modules.py:47: error: Name 'Tensor' is not defined
torch/nn/quantized/modules/functional_modules.py:54: error: Name 'Tensor' is not defined
torch/nn/quantized/modules/functional_modules.py:61: error: Name 'Tensor' is not defined
torch/nn/quantized/modules/functional_modules.py:68: error: Name 'List' is not defined
torch/nn/quantized/modules/functional_modules.py:68: note: Did you forget to import it from "typing"? (Suggestion: "from typing import List")
torch/nn/quantized/modules/functional_modules.py:68: error: Name 'Tensor' is not defined
torch/nn/quantized/modules/functional_modules.py:75: error: Name 'Tensor' is not defined
torch/nn/quantized/modules/functional_modules.py:140: error: Name 'Tensor' is not defined
torch/nn/quantized/modules/functional_modules.py:146: error: Name 'Tensor' is not defined
torch/nn/quantized/modules/functional_modules.py:151: error: Name 'Tensor' is not defined
torch/nn/quantized/modules/functional_modules.py:157: error: Name 'Tensor' is not defined
torch/nn/quantized/modules/functional_modules.py:162: error: Name 'List' is not defined
torch/nn/quantized/modules/functional_modules.py:162: note: Did you forget to import it from "typing"? (Suggestion: "from typing import List")
torch/nn/quantized/modules/functional_modules.py:162: error: Name 'Tensor' is not defined
torch/nn/quantized/modules/functional_modules.py:168: error: Name 'Tensor' is not defined
torch/multiprocessing/spawn.py:9: error: Module 'torch.multiprocessing' has no attribute '_prctl_pr_set_pdeathsig'
torch/multiprocessing/__init__.py:28: error: Module has no attribute "__all__"
torch/jit/frontend.py:9: error: Cannot find implementation or library stub for module named 'torch._C._jit_tree_views'
torch/jit/annotations.py:6: error: Module 'torch._jit_internal' has no attribute 'BroadcastingList2'; maybe "BroadcastingList1" or "BroadcastingListCls"?
torch/jit/annotations.py:6: error: Module 'torch._jit_internal' has no attribute 'BroadcastingList3'; maybe "BroadcastingList1" or "BroadcastingListCls"?
torch/jit/annotations.py:9: error: Cannot find implementation or library stub for module named 'torch._C'
torch/distributions/distribution.py:16: error: Need type annotation for 'arg_constraints' (hint: "arg_constraints: Dict[<type>, <type>] = ...")
torch/distributions/distribution.py:74: error: Name 'arg_constraints' already defined on line 16
torch/distributions/distribution.py:84: error: Name 'support' already defined on line 15
torch/functional.py:114: error: Name 'Tuple' is not defined
torch/functional.py:114: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Tuple")
torch/functional.py:114: error: Name 'Optional' is not defined
torch/functional.py:114: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/functional.py:189: error: Incompatible types in assignment (expression has type "None", variable has type "Tensor")
torch/functional.py:200: error: Argument 1 to "_indices_product" has incompatible type "Tuple[int, ...]"; expected "List[int]"
torch/functional.py:204: error: No overload variant of "__setitem__" of "list" matches argument types "Tensor", "int"
torch/functional.py:204: note: Possible overload variants:
torch/functional.py:204: note:     def __setitem__(self, int, int) -> None
torch/functional.py:204: note:     def __setitem__(self, slice, Iterable[int]) -> None
torch/functional.py:204: error: No overload variant of "__getitem__" of "list" matches argument type "Tensor"
torch/functional.py:204: note:     def __getitem__(self, int) -> int
torch/functional.py:204: note:     def __getitem__(self, slice) -> List[int]
torch/functional.py:207: error: "Tensor" has no attribute "copy_"
torch/functional.py:212: error: No overload variant of "__setitem__" of "list" matches argument types "Tensor", "int"
torch/functional.py:212: note: Possible overload variants:
torch/functional.py:212: note:     def __setitem__(self, int, int) -> None
torch/functional.py:212: note:     def __setitem__(self, slice, Iterable[int]) -> None
torch/functional.py:212: error: No overload variant of "__getitem__" of "list" matches argument type "Tensor"
torch/functional.py:212: note:     def __getitem__(self, int) -> int
torch/functional.py:212: note:     def __getitem__(self, slice) -> List[int]
torch/functional.py:215: error: Incompatible types in assignment (expression has type "None", variable has type "Tensor")
torch/functional.py:334: error: Name 'Optional' is not defined
torch/functional.py:334: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/functional.py:429: error: Argument 2 to "pad" has incompatible type "Tuple[int, int]"; expected "List[int]"
torch/functional.py:431: error: Module has no attribute "stft"
torch/functional.py:766: error: Module has no attribute "cdist"
torch/functional.py:768: error: Module has no attribute "cdist"
torch/functional.py:770: error: Module has no attribute "cdist"
torch/functional.py:775: error: Name 'Optional' is not defined
torch/functional.py:775: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/functional.py:780: error: Name 'Optional' is not defined
torch/functional.py:780: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/functional.py:780: error: Name 'number' is not defined
torch/functional.py:780: error: Name 'norm' already defined on line 775
torch/functional.py:785: error: Name 'Optional' is not defined
torch/functional.py:785: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/functional.py:785: error: Name 'number' is not defined
torch/functional.py:785: error: Name 'norm' already defined on line 775
torch/functional.py:790: error: Name 'Optional' is not defined
torch/functional.py:790: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/functional.py:790: error: Name 'norm' already defined on line 775
torch/functional.py:795: error: Name 'norm' already defined on line 775
torch/functional.py:960: error: Name 'Any' is not defined
torch/functional.py:960: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Any")
torch/functional.py:960: error: Name 'Tuple' is not defined
torch/functional.py:960: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Tuple")
torch/functional.py:1036: error: Argument 1 to "len" has incompatible type "int"; expected "Sized"
torch/functional.py:1041: error: Name 'Optional' is not defined
torch/functional.py:1041: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/functional.py:1041: error: Name 'Tuple' is not defined
torch/functional.py:1041: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Tuple")
torch/functional.py:1056: error: Name 'Optional' is not defined
torch/functional.py:1056: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/functional.py:1056: error: Name 'Tuple' is not defined
torch/functional.py:1056: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Tuple")
torch/distributions/von_mises.py:87: error: Incompatible types in assignment (expression has type "_Real", base class "Distribution" defined the type as "None")
torch/distributions/negative_binomial.py:25: error: Incompatible types in assignment (expression has type "_IntegerGreaterThan", base class "Distribution" defined the type as "None")
torch/distributions/multivariate_normal.py:116: error: Incompatible types in assignment (expression has type "_Real", base class "Distribution" defined the type as "None")
torch/distributions/laplace.py:23: error: Incompatible types in assignment (expression has type "_Real", base class "Distribution" defined the type as "None")
torch/distributions/independent.py:34: error: Need type annotation for 'arg_constraints' (hint: "arg_constraints: Dict[<type>, <type>] = ...")
torch/distributions/cauchy.py:28: error: Incompatible types in assignment (expression has type "_Real", base class "Distribution" defined the type as "None")
torch/distributions/poisson.py:28: error: Incompatible types in assignment (expression has type "_IntegerGreaterThan", base class "Distribution" defined the type as "None")
torch/distributions/one_hot_categorical.py:32: error: Incompatible types in assignment (expression has type "_Simplex", base class "Distribution" defined the type as "None")
torch/distributions/normal.py:27: error: Incompatible types in assignment (expression has type "_Real", base class "Distribution" defined the type as "None")
torch/distributions/lowrank_multivariate_normal.py:79: error: Incompatible types in assignment (expression has type "_Real", base class "Distribution" defined the type as "None")
torch/distributions/gamma.py:30: error: Incompatible types in assignment (expression has type "_GreaterThan", base class "Distribution" defined the type as "None")
torch/distributions/exponential.py:23: error: Incompatible types in assignment (expression has type "_GreaterThan", base class "Distribution" defined the type as "None")
torch/distributions/fishersnedecor.py:25: error: Incompatible types in assignment (expression has type "_GreaterThan", base class "Distribution" defined the type as "None")
torch/distributions/dirichlet.py:44: error: Incompatible types in assignment (expression has type "_Simplex", base class "Distribution" defined the type as "None")
torch/nn/quantized/dynamic/modules/rnn.py:230: error: Incompatible types in assignment (expression has type "int", variable has type "Tensor")
torch/nn/quantized/dynamic/modules/rnn.py:232: error: Incompatible types in assignment (expression has type "int", variable has type "Tensor")
torch/nn/quantized/dynamic/modules/rnn.py:236: error: Incompatible return value type (got "Tuple[Any, Tensor, Any]", expected "Tuple[int, int, int]")
torch/nn/quantized/dynamic/modules/rnn.py:351: error: Incompatible types in assignment (expression has type "Type[LSTM]", base class "RNNBase" defined the type as "Type[RNNBase]")
torch/nn/quantized/dynamic/modules/rnn.py:381: error: Module has no attribute "quantized_lstm"
torch/nn/quantized/dynamic/modules/rnn.py:385: error: Module has no attribute "quantized_lstm"
torch/nn/quantized/dynamic/modules/rnn.py:414: error: Argument 1 to "forward_impl" of "LSTM" has incompatible type "PackedSequence"; expected "Tensor"
torch/nn/quantized/dynamic/modules/rnn.py:416: error: Incompatible types in assignment (expression has type "PackedSequence", variable has type "Tensor")
torch/nn/quantized/dynamic/modules/rnn.py:418: error: Incompatible return value type (got "Tuple[Tensor, Tuple[Tensor, Tensor]]", expected "Tuple[PackedSequence, Tuple[Tensor, Tensor]]")
torch/nn/quantized/dynamic/modules/rnn.py:420: error: Argument 1 of "permute_hidden" is incompatible with supertype "RNNBase"; supertype defines the argument type as "Tensor"
torch/nn/quantized/dynamic/modules/rnn.py:420: error: Return type "Tuple[Tensor, Tensor]" of "permute_hidden" incompatible with return type "Tensor" in supertype "RNNBase"
torch/nn/quantized/dynamic/modules/rnn.py:426: error: Argument 2 of "check_forward_args" is incompatible with supertype "RNNBase"; supertype defines the argument type as "Tensor"
torch/nn/intrinsic/qat/modules/conv_fused.py:232: error: Incompatible types in assignment (expression has type "Type[ConvBnReLU2d]", base class "ConvBn2d" defined the type as "Type[ConvBn2d]")
torch/distributions/beta.py:27: error: Incompatible types in assignment (expression has type "_Interval", base class "Distribution" defined the type as "None")
torch/distributions/geometric.py:31: error: Incompatible types in assignment (expression has type "_IntegerGreaterThan", base class "Distribution" defined the type as "None")
torch/distributions/continuous_bernoulli.py:38: error: Incompatible types in assignment (expression has type "_Interval", base class "Distribution" defined the type as "None")
torch/distributions/bernoulli.py:30: error: Incompatible types in assignment (expression has type "_Boolean", base class "Distribution" defined the type as "None")
torch/quantization/fake_quantize.py:126: error: Module has no attribute "per_tensor_symmetric"
torch/quantization/fake_quantize.py:132: error: Module has no attribute "per_channel_symmetric"
torch/distributions/transformed_distribution.py:41: error: Need type annotation for 'arg_constraints' (hint: "arg_constraints: Dict[<type>, <type>] = ...")
torch/jit/__init__.py:1: error: Cannot find implementation or library stub for module named 'torch._C'
torch/jit/__init__.py:15: error: Module 'torch.utils' has no attribute 'set_module'
torch/jit/__init__.py:70: error: Name 'Attribute' already defined on line 68
torch/jit/__init__.py:213: error: On Python 3 '{}'.format(b'abc') produces "b'abc'"; use !r if this is a desired behavior
torch/jit/__init__.py:215: error: On Python 3 '{}'.format(b'abc') produces "b'abc'"; use !r if this is a desired behavior
torch/jit/__init__.py:1524: error: Unsupported dynamic base class "with_metaclass"
torch/jit/__init__.py:1869: error: Name 'ScriptModule' already defined on line 1524
torch/jit/__init__.py:1998: error: Need type annotation for '_jit_caching_layer'
torch/jit/__init__.py:1999: error: Need type annotation for '_jit_function_overload_caching'
torch/distributions/relaxed_categorical.py:34: error: Incompatible types in assignment (expression has type "_Real", base class "Distribution" defined the type as "None")
torch/distributions/relaxed_categorical.py:108: error: Incompatible types in assignment (expression has type "_Simplex", base class "Distribution" defined the type as "None")
torch/distributions/relaxed_bernoulli.py:31: error: Incompatible types in assignment (expression has type "_Real", base class "Distribution" defined the type as "None")
torch/distributions/relaxed_bernoulli.py:114: error: Incompatible types in assignment (expression has type "_Interval", base class "Distribution" defined the type as "None")
torch/distributions/logistic_normal.py:31: error: Incompatible types in assignment (expression has type "_Simplex", base class "Distribution" defined the type as "None")
torch/distributions/log_normal.py:26: error: Incompatible types in assignment (expression has type "_GreaterThan", base class "Distribution" defined the type as "None")
torch/distributions/half_normal.py:27: error: Incompatible types in assignment (expression has type "_GreaterThan", base class "Distribution" defined the type as "None")
torch/distributions/half_cauchy.py:28: error: Incompatible types in assignment (expression has type "_GreaterThan", base class "Distribution" defined the type as "None")
torch/distributions/gumbel.py:28: error: Incompatible types in assignment (expression has type "_Real", base class "Distribution" defined the type as "None")
torch/nn/quantized/modules/conv.py:18: error: Module 'torch.nn.utils' has no attribute 'fuse_conv_bn_weights'
torch/nn/quantized/modules/conv.py:209: error: Name 'Optional' is not defined
torch/nn/quantized/modules/conv.py:209: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/nn/quantized/modules/conv.py:214: error: Module has no attribute "ops"
torch/nn/quantized/modules/conv.py:321: error: Name 'Optional' is not defined
torch/nn/quantized/modules/conv.py:321: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/nn/quantized/modules/conv.py:323: error: Module has no attribute "ops"
torch/nn/quantized/modules/conv.py:447: error: Name 'Optional' is not defined
torch/nn/quantized/modules/conv.py:447: note: Did you forget to import it from "typing"? (Suggestion: "from typing import Optional")
torch/nn/quantized/modules/conv.py:449: error: Module has no attribute "ops"
torch/nn/quantized/modules/conv.py:513: error: Name 'nn.modules.conv._ConvTransposeNd' is not defined
torch/nn/quantized/modules/conv.py:525: error: Name 'List' is not defined
torch/nn/quantized/modules/conv.py:525: note: Did you forget to import it from "typing"? (Suggestion: "from typing import List")
torch/nn/quantized/modules/conv.py:527: error: Name 'List' is not defined
torch/nn/quantized/modules/conv.py:527: note: Did you forget to import it from "typing"? (Suggestion: "from typing import List")
torch/nn/intrinsic/quantized/modules/conv_relu.py:8: error: Module 'torch.nn.utils' has no attribute 'fuse_conv_bn_weights'
torch/nn/intrinsic/quantized/modules/conv_relu.py:21: error: Incompatible types in assignment (expression has type "Type[ConvReLU2d]", base class "Conv2d" defined the type as "Type[Conv2d]")
torch/nn/intrinsic/quantized/modules/conv_relu.py:62: error: Incompatible types in assignment (expression has type "Type[ConvReLU3d]", base class "Conv3d" defined the type as "Type[Conv3d]")
torch/distributions/weibull.py:25: error: Incompatible types in assignment (expression has type "_GreaterThan", base class "Distribution" defined the type as "None")
torch/distributions/kl.py:35: error: Need type annotation for '_KL_MEMOIZE' (hint: "_KL_MEMOIZE: Dict[<type>, <type>] = ...")
torch/distributions/studentT.py:27: error: Incompatible types in assignment (expression has type "_Real", base class "Distribution" defined the type as "None")
torch/distributions/mixture_same_family.py:48: error: Need type annotation for 'arg_constraints' (hint: "arg_constraints: Dict[<type>, <type>] = ...")
torch/distributions/__init__.py:158: error: Name 'transforms' is not defined
torch/onnx/utils.py:21: error: Cannot find implementation or library stub for module named 'torch._C'
torch/distributed/rendezvous.py:4: error: Cannot find implementation or library stub for module named 'urlparse'
torch/distributed/rendezvous.py:4: error: Name 'urlparse' already defined (possibly by an import)
torch/distributed/rendezvous.py:4: error: Name 'urlunparse' already defined (possibly by an import)
torch/distributed/rendezvous.py:9: error: Module 'torch.distributed' has no attribute 'FileStore'
torch/distributed/rendezvous.py:9: error: Module 'torch.distributed' has no attribute 'TCPStore'
torch/distributed/rendezvous.py:65: error: On Python 3 '{}'.format(b'abc') produces "b'abc'"; use !r if this is a desired behavior
torch/distributed/distributed_c10d.py:11: error: Module 'torch.distributed' has no attribute 'AllreduceOptions'; maybe "ReduceOptions" or "AllreduceCoalescedOptions"?
torch/distributed/distributed_c10d.py:11: error: Module 'torch.distributed' has no attribute 'AllreduceCoalescedOptions'; maybe "AllreduceOptions"?
torch/distributed/distributed_c10d.py:11: error: Module 'torch.distributed' has no attribute 'AllToAllOptions'
torch/distributed/distributed_c10d.py:11: error: Module 'torch.distributed' has no attribute 'BroadcastOptions'
torch/distributed/distributed_c10d.py:11: error: Module 'torch.distributed' has no attribute 'GatherOptions'; maybe "ScatterOptions"?
torch/distributed/distributed_c10d.py:11: error: Module 'torch.distributed' has no attribute 'ReduceOptions'; maybe "AllreduceOptions", "ReduceScatterOptions", or "ReduceOp"?
torch/distributed/distributed_c10d.py:11: error: Module 'torch.distributed' has no attribute 'ReduceScatterOptions'; maybe "ScatterOptions" or "ReduceOptions"?
torch/distributed/distributed_c10d.py:11: error: Module 'torch.distributed' has no attribute 'ScatterOptions'; maybe "ReduceScatterOptions" or
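Most of the recurring `Name 'Tensor' is not defined` / `Did you forget to import it from "typing"?` errors above are resolved by importing the annotated names at module level. A minimal illustrative sketch (the function and its body are hypothetical, not taken from the files listed):

```python
from typing import Optional, Tuple

from torch import Tensor

def _example_qparams(min_val: Tensor, max_val: Tensor,
                     eps: Optional[float] = None) -> Tuple[Tensor, Tensor]:
    # With Tensor, Optional, and Tuple imported, mypy can resolve these
    # annotations instead of reporting "Name '...' is not defined".
    scale = (max_val - min_val) / 255.0
    zero_point = -min_val / scale.clamp(min=eps if eps is not None else 1e-8)
    return scale, zero_point
```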
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36584

Reviewed By: seemethere, ailzhang

Differential Revision: D21155985

Pulled By: ezyang

fbshipit-source-id: f628d4293992576207167e7c417998fad15898d1
2020-04-22 14:17:08 -07:00
Guanheng Zhang
b607c83a26 Add support for bool/byte attn_mask tensor in MultiheadAttention/Transformer modules (#33763)
Summary:
Add support for accepting float, byte, and bool tensors for `attn_mask`. No breakage is expected.

- If a bool tensor is provided, positions with `True` are not allowed to attend while `False` values will be unchanged.
- If a byte tensor is provided, it will be converted to a bool tensor. Positions with non-zero values are not allowed to attend while zero values will be unchanged.
- If a float tensor is provided, it will be added to the attention weight.

Note: the behavior of the float mask tensor is slightly different from the first two options because it is added to the attention weight rather than used with the `masked_fill_` function. Also, converting a byte tensor to a bool tensor within `multi_head_attention_forward` causes extra overhead. Therefore, a bool mask is recommended here (a usage sketch follows below).

For `key_padding_mask`:
- If a bool tensor is provided, the positions with the value of `True` will be ignored while the positions with the value of `False` will be unchanged.
- If a byte tensor is provided, it will be converted to a bool tensor; the positions with non-zero values will be ignored while the positions with the value of zero will be unchanged.
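A short usage sketch of the `attn_mask` variants (module size and sequence lengths here are illustrative):

```python
import torch
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=8, num_heads=2)
q = k = v = torch.randn(4, 1, 8)  # (seq_len, batch, embed_dim)

# Bool mask: True means "not allowed to attend" (recommended).
bool_mask = torch.triu(torch.ones(4, 4), diagonal=1).bool()
# Float mask: added to the attention weights; -inf masks a position out.
float_mask = torch.zeros(4, 4).masked_fill(bool_mask, float('-inf'))

out_b, _ = mha(q, k, v, attn_mask=bool_mask)
out_f, _ = mha(q, k, v, attn_mask=float_mask)
assert torch.allclose(out_b, out_f, atol=1e-6)
```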
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33763

Differential Revision: D20925358

Pulled By: zhangguanheng66

fbshipit-source-id: de174056be183cdad0f3de8024ee0a3c5eb364c9
2020-04-21 14:06:59 -07:00
Bo Chang
2e93808cde Update functional.py (#36600)
Summary:
Fix a latex typo in the docstring.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36600

Differential Revision: D21106164

Pulled By: mruberry

fbshipit-source-id: b611f0eac209b59b06ace1017e418a68341c4105
2020-04-18 02:16:54 -07:00
Lin Yang
cc5befc461 [Format] format a few files (#35187)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35187

When I touch these files, lint always introduces some unintended changes; to prevent that, we need to format the code first.
The change is generated by:
  arc f

Test Plan: integration test.

Differential Revision: D20587596

fbshipit-source-id: 512cf6b86bd6632a61c80ed53e3a9e229feecc2a
2020-04-17 14:30:01 -07:00
Robin Lobel
34a10238d5 fix is_float_scale_factor warning (c++) (#35601)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35601

Differential Revision: D20925642

Pulled By: yf225

fbshipit-source-id: a4e1f953efce04b3f399a8e526fb6c055cc2971c
2020-04-08 17:52:09 -07:00
Vasiliy Kuznetsov
4ef383d5db add type hints on recently added ops to make them scriptable (#35885)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35885

For the ops I added recently, ensure all the type hints are
present, so that the JIT can script them.

We might want to look into a test for this in the future.

Test Plan:
scripting works for all of them now:
https://gist.github.com/vkuzo/1d92fdea548ad596310fffcbe95e4438

Imported from OSS

Differential Revision: D20818431

fbshipit-source-id: 0de61eaf70c08d625128c6fffd05788e6e5bb920
2020-04-06 12:17:16 -07:00
Nik Ved
35cdb78522 Make kl_div accept target in log space (#34586)
Summary:
Fixes [32520](https://github.com/pytorch/pytorch/issues/32520), implements [34536](https://github.com/pytorch/pytorch/issues/34536).

Here are some benchmarks:
```python
import torch
import torch.nn.functional as F
from IPython import get_ipython

ipython = get_ipython()

torch.set_num_threads(1)

for d in [5, 10, 20, 50, 100, 1000]:
    i = torch.rand(d, d)
    t = torch.rand(d, d)
    print(f"Size: {d}x{d}")
    ipython.magic("timeit F.kl_div(i, t, reduction='none', log_target=False)")
    ipython.magic("timeit F.kl_div(i, t.log(), reduction='none', log_target=True)")
```
Output:
```
Size: 5x5
16 µs ± 33 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
8.24 µs ± 17.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Size: 10x10
16.7 µs ± 17.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
8.7 µs ± 20.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Size: 20x20
17.7 µs ± 47.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
9.7 µs ± 28.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Size: 50x50
23.6 µs ± 60.1 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
15 µs ± 33.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Size: 100x100
42.8 µs ± 223 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
34 µs ± 17.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Size: 1000x1000
3.9 ms ± 1.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
3.45 ms ± 364 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)

```
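The two calls being timed compute the same values; a quick sketch of the equivalence:

```python
import torch
import torch.nn.functional as F

i = torch.randn(5, 5).log_softmax(dim=1)  # input is expected in log space
t = torch.rand(5, 5).softmax(dim=1)       # target as probabilities

out_prob = F.kl_div(i, t, reduction='none', log_target=False)
out_log = F.kl_div(i, t.log(), reduction='none', log_target=True)
assert torch.allclose(out_prob, out_log, atol=1e-6)
```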
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34586

Differential Revision: D20652726

Pulled By: ezyang

fbshipit-source-id: 480697b4cd01341bbeee7514a8b812705a0600ea
2020-04-01 12:26:58 -07:00
Vasiliy Kuznetsov
b4c4342747 hswish and hardsigmoid: improve docs (#35431)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35431

Resolves z-a-f's comments on earlier PRs by making the docblocks easier to read.

Test Plan:
render the new docblocks in http://rst.aaroniles.net/

CI

Imported from OSS

Differential Revision: D20658668

fbshipit-source-id: 5ea4a21d6b8dc9d744e2f4ede2f9d5d799fb902f
2020-03-31 10:01:07 -07:00
davidriazati
43928effee [jit] Remove _assert_int_or_pair op (#34509)
Summary:
This one doesn't actually do anything, so we don't need an op for it.

It is used inside `torch.nn.functional.unfold`, which is already tested.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/34509

Pulled By: driazati

Differential Revision: D20676445

fbshipit-source-id: b72d1308bdec593367ec4e14bf9a901d0b62e1cc
2020-03-27 18:37:49 -07:00
Vasiliy Kuznetsov
f3e9fa6122 add hardswish FP operator (#34747)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34747

Adds the hardswish FP operator from MobileNetV3 to PyTorch. This is for common operator coverage, since it is widely used. A future PR will add the quantized version; CUDA support is likewise left for a future PR.
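For reference, a minimal sketch of the definition this operator implements, assuming the standard MobileNetV3 formulation `hardswish(x) = x * relu6(x + 3) / 6`:

```python
import torch
import torch.nn.functional as F

x = torch.linspace(-4.0, 4.0, steps=9)
manual = x * F.relu6(x + 3.0) / 6.0  # MobileNetV3 definition
assert torch.allclose(F.hardswish(x), manual)
```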

Test Plan:
tests pass:
```
python test/test_torch.py TestTorchDeviceTypeCPU.test_hardswish_cpu_float32
```

microbenchmark:
https://gist.github.com/vkuzo/b10d3b238f24e58c585314e8b5385aca
(batch_size == 1: 11.5GiB/s, batch_size == 4: 11.9GiB/s)

Imported from OSS

Differential Revision: D20451404

fbshipit-source-id: c7e13c9ab1a83e27a1ba18182947c82c896efae2
2020-03-24 15:15:34 -07:00
Vasiliy Kuznetsov
1bac5fd0d3 add hardsigmoid FP operator to PyTorch (#34545)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34545

This is for common operator coverage, since the op is widely used. A future PR will add the quantized version.
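For reference, a minimal sketch of the piecewise definition, assuming `hardsigmoid(x) = clamp((x + 3) / 6, 0, 1)`:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-4.0, -3.0, 0.0, 3.0, 4.0])
manual = torch.clamp((x + 3.0) / 6.0, min=0.0, max=1.0)
assert torch.allclose(F.hardsigmoid(x), manual)  # 0, 0, 0.5, 1, 1
```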

Some initial questions for reviewers, since it's my first FP operator
diff:
* do we need a backwards.out method for this?
* do we need CUDA? If yes, should it be this PR or is it ok to split

Test Plan:
```
// test
python test/test_torch.py TestTorchDeviceTypeCPU.test_hardsigmoid_cpu_float32

// benchmark
python -m pt.hardsigmoid_test
...
Forward Execution Time (us) : 40.315

Forward Execution Time (us) : 42.603
```

Imported from OSS

Differential Revision: D20371692

fbshipit-source-id: 95668400da9577fd1002ce3f76b9777c6f96c327
2020-03-16 15:24:12 -07:00
eellison
44256199a9 [JIT] remove specialized list ops (#34520)
Summary:
Now that lists are no longer specialized, we can register only one operator for list ops that are generic to their element type.
This PR reorgs lists into three sets of ops:
- CREATE_GENERIC_LIST_OPS
- CREATE_SPECIALIZED_LIST_OPS
- CREATE_COMPARATOR_LIST_OPS_SPECIALIZED (we didn't bind certain specialized ops to Tensor)

This is important to land quickly because mobile is finalizing its bytecode soon, after which we would no longer be able to remove these ops.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34520

Reviewed By: iseeyuan

Differential Revision: D20429775

Pulled By: eellison

fbshipit-source-id: ae6519f9b0f731eaa2bf4ac20736317d0a66b8a0
2020-03-12 17:49:23 -07:00
Elias Ellison
514cba0661 [JIT] remove builtin interpolate functions (#34514)
Summary:
`torch.nn.functional.interpolate` was written as a builtin op when we scripted the standard library, because it has four possible overloads. As a result, whenever we make a change to `interpolate`, we need to make changes in two places, and it also makes it impossible to optimize the interpolate op. The builtin is tech debt.

I talked with ailzhang, and the symbolic script changes are good to remove (I guess that makes a third place we needed to re-implement interpolate).

I'm trying to get rid of unnecessary builtin operators because we're standardizing mobile bytecode soon, so we should try to get this landed as soon as possible.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34514

Differential Revision: D20391089

Pulled By: eellison

fbshipit-source-id: abc84cdecfac67332bcba6b308fca4db44303121
2020-03-12 09:21:33 -07:00
Elias Ellison
fddf73250d [JIT] fix resolving of functions in torch/functional. fix compilation of torch.stft (#33504)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33504

Fix resolution of functions that are bound onto torch in torch/functional.py. This does not fix compilation of all of those functions; those will be done in follow-ups. Does torch.stft as a start.

Fixes #21478

Test Plan: Imported from OSS

Differential Revision: D20014591

Pulled By: eellison

fbshipit-source-id: bb362f1b5479adbb890e72a54111ef716679d127
2020-02-26 18:35:43 -08:00
Natalia Gimelshein
a9cef05f5d improve EmbeddingBag performance on cuda (#33589)
Summary:
This PR improves performance of EmbeddingBag on cuda by removing 5 kernel launches (2 of those are synchronizing memcopies).
- 2 memcopies check that the values of offsets[0] and offsets[-1] are in the expected range (0 for the former, less than the number of indices for the latter). It seems strange to check only those 2 values: if users provide invalid offsets, invalid values can be anywhere in the array, not only in the first and last elements. After this PR, the checks are skipped on cuda; the first value is forced to 0, and if the last value is larger than expected, the cuda kernel will assert. That is less nice than a ValueError, but then again, the kernel could already have asserted if other offset values were invalid. On the cpu, the checks are moved from functional.py into the cpu implementation and will throw RuntimeError instead of ValueError (see the sketch of the offsets contract after this list).
- 3 or 4 initializations (depending on the mode) of the output tensors with .zeros() are unnecessary, because every element of those tensors is written to, so their data can start out uninitialized.
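A minimal sketch of the offsets contract discussed above (values are illustrative): `offsets[0]` must be 0, and every offset must stay below the number of indices.

```python
import torch
import torch.nn as nn

eb = nn.EmbeddingBag(100, 8, mode='sum')
indices = torch.tensor([1, 2, 4, 5, 4, 3, 2, 9])  # 8 indices in total
offsets = torch.tensor([0, 3, 6])                 # 3 bags: [1,2,4], [5,4,3], [2,9]
out = eb(indices, offsets)                        # shape (3, 8)
```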
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33589

Reviewed By: jianyuh

Differential Revision: D20078011

Pulled By: ngimel

fbshipit-source-id: 2fb2e2080313af64adc5cf1b9fc6ffbdc6efaf16
2020-02-24 21:37:34 -08:00
Nathan Goldbaum
fa80299bdf __torch_function__ overrides for torch.functional and torch.nn.functional (#32799)
Summary:
This adds `__torch_function__` support for all functions in `torch.functional` and `torch.nn.functional`.

The changes to C++ code and codegen scripts are to facilitate adding `__torch_function__` support for the native functions in `torch._C._nn`. Note that I moved the `handle_torch_function` C++ function to a header that both `python_torch_functions.cpp` and `python_nn_functions.cpp` include. The changes to `python_nn_functions.cpp` mirror the changes I made to `python_torch_functions.cpp` when `__torch_function__` support was first added in https://github.com/pytorch/pytorch/issues/27064. Due to the somewhat different way the `torch._C` and `torch._C._nn` namespaces are initialized I needed to create a new static reference to the `torch._C._nn` namespace (`THPNNVariableFunctions`). I'm not sure if that is the best way to do this. In principle I could import these namespaces in each kernel and avoid the global variable but that would have a runtime cost.

I added `__torch_function__` support to the Python functions in `torch.nn.functional` following the approach in https://github.com/pytorch/pytorch/issues/32194.

I re-enabled the test that checks if all functions in the `torch` namespace are explicitly tested for `__torch_function__` support. I also generalized the check to work for `torch.functional` and `torch.nn.functional` as well. This test was explicitly disabled in https://github.com/pytorch/pytorch/issues/30730 and I'm happy to disable it again if you think that's appropriate. I figured now was as good a time as any to try to re-enable it.

Finally I adjusted the existing torch API tests to suppress deprecation warnings and add keyword arguments used by some of the code in `torch.nn.functional` that were missed when I originally added the tests in https://github.com/pytorch/pytorch/issues/27064.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32799

Differential Revision: D19956809

Pulled By: ezyang

fbshipit-source-id: 40d34e0109cc4b9f3ef62f409d2d35a1d84e3d22
2020-02-21 08:38:37 -08:00
Vasil Khalidov
cfb4862673 [pytorch] correct input size check for GroupNorm (#33008)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33008

Corrects D19373507 to allow valid use cases that fail now. Multiplies batch size by the number of elements in a group to get the correct number of elements over which statistics are computed.

**Details**:
The current implementation disallows applying GroupNorm to tensors of shape e.g. `(1, C, 1, 1)`, to prevent cases where statistics are computed over 1 element and thus produce a tensor filled with zeros.
However, in GroupNorm the statistics are calculated across channels. So in a case where one has an input tensor of shape `(1, 256, 1, 1)` for `GroupNorm(32, 256)`, the statistics will be computed over 8 elements and thus be meaningful.

One use case is [Atrous Spatial Pyramid Pooling (ASPPPooling)](791c172a33/torchvision/models/segmentation/deeplabv3.py (L50)), where GroupNorm could be used in place of BatchNorm [here](791c172a33/torchvision/models/segmentation/deeplabv3.py (L55)). However, now this is prohibited and results in failures.

Proposed solution consists in correcting the computation of the number of elements over which statistics are computed. The number of elements per group is taken into account in the batch size.
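A sketch of the use case the corrected check allows:

```python
import torch
import torch.nn as nn

gn = nn.GroupNorm(32, 256)     # 32 groups over 256 channels -> 8 channels per group
x = torch.randn(1, 256, 1, 1)  # statistics span 8 elements per group, so they are meaningful
print(gn(x).shape)             # torch.Size([1, 256, 1, 1])
```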

Test Plan: check that existing tests pass

Reviewed By: fmassa

Differential Revision: D19723407

fbshipit-source-id: c85c244c832e6592e9aedb279d0acc867eef8f0c
2020-02-18 06:43:53 -08:00
Lara
4502d8c391 Interpolate Float [] support in ONNX (#32554)
Summary:
The PR https://github.com/pytorch/pytorch/pull/31791 adds support for float[] constants, which affects some cases of ONNX interpolate support.
This PR adds float[] constants support in ONNX, updates interpolate in ONNX, and re-enables the disabled tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32554

Reviewed By: hl475

Differential Revision: D19566596

Pulled By: houseroad

fbshipit-source-id: 843f62c86126fdf4f9c0117b65965682a776e7e9
2020-02-04 16:14:40 -08:00
Hong Xu
3cac9900ca Clarify when softplus is reverted to linear. (#32945)
Summary:
The default value is removed because it is explained right below.
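For context, a small sketch of the behavior the docstring explains: for numerical stability, softplus reverts to the identity when `input * beta > threshold`.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([1.0, 25.0])
print(F.softplus(x, beta=1, threshold=20))
# tensor([ 1.3133, 25.0000]) -- 25 * 1 > 20, so the second value is passed through unchanged
```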
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32945

Reviewed By: soumith

Differential Revision: D19706567

Pulled By: ailzhang

fbshipit-source-id: 1b7cc87991532f69b81aaae2451d944f70dda427
2020-02-03 17:54:31 -08:00
Sameer Deshmukh
602394e996 verify input sizes for instance norm and group norm (#29082)
Summary:
Fix for https://github.com/pytorch/pytorch/issues/19250
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29082

Differential Revision: D19373507

Pulled By: ezyang

fbshipit-source-id: 231a79280f4cd7db2c26218a60869356a124bf77
2020-01-27 09:05:56 -08:00
Jianyu Huang
3ada2e0d64 [pytorch][embeddingbag] Parallelize the EmbeddingBag operator (#4049)
Summary:
Pull Request resolved: https://github.com/pytorch/glow/pull/4049

Pull Request resolved: https://github.com/pytorch/pytorch/pull/27477

We would like to add intra-op parallelization support for the EmbeddingBag operator.

This should bring speedup for the DLRM benchmark:
https://github.com/pytorch/pytorch/pull/24385

Benchmark code:
```
from __future__ import absolute_import, division, print_function, unicode_literals

import torch
import time

eb = torch.nn.EmbeddingBag(1000000, 64, mode='sum')

input = torch.LongTensor(1500).random_(0, 1000000)
offsets = torch.zeros(64, dtype=torch.int64)

niter = 10000
s = time.time()
for _ in range(niter):
    out = eb(input, offsets)
time_per_iter = (time.time() - s) / niter
print('time_per_iter', time_per_iter)
print('GB/s', (input.numel() * 64 * 4 + out.numel() * 4) / time_per_iter / 1e9)
```

The following results are single core on Skylake T6:
- Before our change (with the original caffe2::EmbeddingLookup)
time_per_iter 6.313693523406982e-05
GB/s 6.341517821789133

- After our change using the EmbeddingLookupIdx API which takes the offsets instead of lengths.
time_per_iter 5.7627105712890626e-05
GB/s 6.947841559053659

- With Intel's PR: https://github.com/pytorch/pytorch/pull/24385
time_per_iter 7.393271923065185e-05
GB/s 5.415518381664018

As for multi-core performance, because Clang doesn't work with OMP, I can only measure the single-core performance on SKL T6.
ghstack-source-id: 97124557

Test Plan:
With D16990830:
```
buck run mode/dev //caffe2/caffe2/perfkernels:embedding_bench
```

With D17750961:
```
buck run mode/opt //experimental/jianyuhuang/embeddingbag:eb
buck run mode/opt-lto //experimental/jianyuhuang/embeddingbag:eb
```

OSS test
```
python run_test.py -i nn -- TestNNDeviceTypeCPU.test_EmbeddingBag_per_sample_weights_and_new_offsets_cpu
```

Buck test
```
buck test mode/dev-nosan //caffe2/test:nn -- "test_EmbeddingBag_per_sample_weights_and_new_offsets_cpu"

OMP_NUM_THREADS=3 buck test mode/opt -c pytorch.parallel_backend=tbb //caffe2/test:nn -- "test_EmbeddingBag_per_sample_weights_and_new_offsets"  --print-passing-details
```

Generate the AVX2 code for embedding_lookup_idx_avx2.cc:
```
python hp_emblookup_codegen.py --use-offsets
```

Differential Revision: D17768404

fbshipit-source-id: 8dcd15a62d75b737fa97e0eff17f347052675700
2020-01-23 21:29:44 -08:00
Guanheng Zhang
db02a4e4ce Support 3D attention mask in MultiheadAttention. (#31996)
Summary:
Support a 3D attention mask for MultiheadAttention. If `attn_mask` has the batch dimension, it will not be unsqueezed (see the sketch below). Fixes https://github.com/pytorch/pytorch/issues/30678
Relevant issues/pr:
https://github.com/pytorch/pytorch/pull/25359
https://github.com/pytorch/pytorch/issues/29520
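A rough sketch of the newly supported batched mask shape, `(batch_size * num_heads, target_len, source_len)` (sizes here are illustrative):

```python
import torch
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=16, num_heads=4)
q = k = v = torch.randn(5, 2, 16)                  # (seq_len, batch, embed_dim)
mask = torch.zeros(2 * 4, 5, 5, dtype=torch.bool)  # 3D mask is used as-is, not unsqueezed
out, weights = mha(q, k, v, attn_mask=mask)
```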
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31996

Differential Revision: D19332816

Pulled By: zhangguanheng66

fbshipit-source-id: 3448af4b219607af60e02655affe59997ad212d9
2020-01-23 13:16:48 -08:00
Tongzhou Wang
cc2d5b15ad F.normalize uses clamp_min_ inplace (#32360)
Summary:
We don't care about autograd when `out!=None` anyway
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32360

Differential Revision: D19452402

Pulled By: colesbury

fbshipit-source-id: c54775289f8a700019ca61e951d59ff4894ac980
2020-01-21 10:38:06 -08:00
Alban Desmaison
77c78b7d28 remove .data from torch/nn doc
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31481

Test Plan: Imported from OSS

Differential Revision: D19303242

Pulled By: albanD

fbshipit-source-id: 4f650df9e9e302a299175967bcc6e30a5099fa2a
2020-01-14 07:30:42 -08:00
BowenBao
c4f10e0fe7 Renaming scales parameter for interpolate (#31526)
Summary:
PR separated from https://github.com/pytorch/pytorch/pull/31274.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31526

Reviewed By: zou3519

Differential Revision: D19221931

Pulled By: gchanan

fbshipit-source-id: 81958a9910867ac9d62f2b47abc49384526c4e51
2020-01-02 08:19:30 -08:00
Lara
97c1e90f46 ONNX Interpolate Add Scales Params (#28324)
Summary:
Fix for : https://github.com/pytorch/pytorch/issues/27176
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28324

Reviewed By: hl475

Differential Revision: D18309133

Pulled By: houseroad

fbshipit-source-id: 348bb41393442c6b107d88fc2cd3224e0afa3ccf
2019-12-11 20:09:15 -08:00
Tongzhou Wang
d6ca93b353 add doc for F.softplus
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30055

Differential Revision: D18762624

Pulled By: zou3519

fbshipit-source-id: 61da88cbb8cd0f37ac26b0fb8aaacdbe85c724ba
2019-12-04 07:16:30 -08:00
Brian Wignall
e7fe64f6a6 Fix typos (#30606)
Summary:
Should be non-semantic.

Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30606

Differential Revision: D18763028

Pulled By: mrshenli

fbshipit-source-id: 896515a2156d062653408852e6c04b429fc5955c
2019-12-02 20:17:42 -08:00
Christian Puhrsch
7903fb118f Move qkv_same, kv_same into branch (#30142)
Summary:
Perf improvements to multi_head_attention_forward

- qkv_same and kv_same were not used outside of that branch. Further, kv_same was calculated even though it is not used when qkv_same is true.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30142

Differential Revision: D18610938

Pulled By: cpuhrsch

fbshipit-source-id: 19b7456f20aef90032b0f42d7da8c8a2d5563ee3
2019-11-22 10:40:02 -08:00
Vitaly Fedyunin
a4f60b64dc explicitly provide memory format when calling to *_like operators
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29391

Test Plan: Imported from OSS

Differential Revision: D18429726

Pulled By: VitalyFedyunin

fbshipit-source-id: 07dfff568ad776cf792122913530566d53be55fa
2019-11-18 21:47:52 -08:00
Andreas Koepf
c7ed89cf65 Migrate nll_loss2d from TH to ATen (CPU) (#28304)
Summary:
Added check for indicies in Reduction::None case.

### Benchmark results

Note: Due to the size of the input tensors, the random number generation is this time responsible for a significant portion of the total time. It is better to look at the individual net time outputs (which do not include the input preparation).
Script used for benchmark.: [nnl_loss2d_benchmark.py](https://gist.github.com/andreaskoepf/5864aa91e243317cb282c1e7fe576e1b)

#### WITH PR applied
```
using reduction:  none
CPU forward 1000 took 7.916500908322632e-05
CPU forward 10000 took 0.0002642290201038122
CPU forward 100000 took 0.003828087996225804
CPU forward 1000000 took 0.037140720000024885
CPU forward 10000000 took 0.33387596398824826
CPU forward TOTAL time 7.218988707987592

using reduction:  mean
CPU forward 1000 took 9.165197843685746e-05
CPU forward 10000 took 0.0005258890159893781
CPU forward 100000 took 0.0050761590246111155
CPU forward 1000000 took 0.047345594997750595
CPU forward 10000000 took 0.4790863030066248
CPU forward TOTAL time 7.9106070210109465
CPU for- & backward 1000 took 0.0005489500181283802
CPU for- & backward 10000 took 0.0015284279943443835
CPU for- & backward 100000 took 0.015138130984269083
CPU for- & backward 1000000 took 0.15741890601930209
CPU for- & backward 10000000 took 1.6703072849777527
CPU for- & backward TOTAL time 9.555764263990568

using reduction:  sum
CPU forward 1000 took 8.789298590272665e-05
CPU forward 10000 took 0.000514078012201935
CPU forward 100000 took 0.005135576997417957
CPU forward 1000000 took 0.04715992201818153
CPU forward 10000000 took 0.4821214270195924
CPU forward TOTAL time 7.9119505700073205
CPU for- & backward 1000 took 0.00047759301378391683
CPU for- & backward 10000 took 0.0015945070190355182
CPU for- & backward 100000 took 0.018208994006272405
CPU for- & backward 1000000 took 0.15904426100314595
CPU for- & backward 10000000 took 1.5679037219961174
CPU for- & backward TOTAL time 9.495157692988869
```

#### WITHOUT the PR (original TH impl)
```
using reduction:  none
CPU forward 1000 took 0.0003981560003012419
CPU forward 10000 took 0.0035912430030293763
CPU forward 100000 took 0.035353766987100244
CPU forward 1000000 took 0.3428319719969295
CPU forward 10000000 took 3.364342701010173
CPU forward TOTAL time 11.166179805004504

using reduction:  mean
CPU forward 1000 took 8.63690220285207e-05
CPU forward 10000 took 0.0004704220045823604
CPU forward 100000 took 0.0045734510058537126
CPU forward 1000000 took 0.046232511987909675
CPU forward 10000000 took 0.4191019559802953
CPU forward TOTAL time 7.846049971994944
CPU for- & backward 1000 took 0.0005974550149403512
CPU for- & backward 10000 took 0.0014057719963602722
CPU for- & backward 100000 took 0.013776941981632262
CPU for- & backward 1000000 took 0.13876214998890646
CPU for- & backward 10000000 took 1.3666698939923663
CPU for- & backward TOTAL time 9.10526105100871

using reduction:  sum
CPU forward 1000 took 7.598899537697434e-05
CPU forward 10000 took 0.00046885499614290893
CPU forward 100000 took 0.0044489419960882515
CPU forward 1000000 took 0.04495517900795676
CPU forward 10000000 took 0.418376043002354
CPU forward TOTAL time 7.789334400993539
CPU for- & backward 1000 took 0.0004464260127861053
CPU for- & backward 10000 took 0.0017732900159899145
CPU for- & backward 100000 took 0.01626713399309665
CPU for- & backward 1000000 took 0.11790941300569102
CPU for- & backward 10000000 took 1.4346664609911386
CPU for- & backward TOTAL time 9.294745502003934
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28304

Differential Revision: D18350157

Pulled By: ezyang

fbshipit-source-id: e9437debe51386a483f4265193c475cdc90b28e4
2019-11-09 18:31:20 -08:00
Xiaomeng Yang
2460dced8f Add torch.nn.GELU for GELU activation (#28944)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28944

Add torch.nn.GELU for GELU activation
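A one-line sketch of the new module (`GELU(x) = x * Φ(x)`, with Φ the standard normal CDF):

```python
import torch
import torch.nn as nn

x = torch.tensor([-1.0, 0.0, 1.0])
print(nn.GELU()(x))  # tensor([-0.1587, 0.0000, 0.8413])
```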

Test Plan: buck test mode/dev-nosan //caffe2/test:nn -- "GELU"

Reviewed By: hl475, houseroad

Differential Revision: D18240946

fbshipit-source-id: 6284b30def9bd4c12bf7fb2ed08b1b2f0310bb78
2019-11-03 21:55:05 -08:00
Lu Fang
e9a91756cd Back out "[pytorch][PR] Migrate soft_margin_loss from the TH to Aten (CUDA+CPU)"
Summary: Original commit changeset: 9ddffe4dbbfa

Test Plan: ci

Reviewed By: yf225

Differential Revision: D17939581

fbshipit-source-id: 44a3b843bf1e7059fec57b9e3d12ed4886816145
2019-10-15 21:12:10 -07:00
Edward Yang
2aa84d927b Revert D17939700: Revert D17889288: [pytorch][PR] Migrate soft_margin_loss from the TH to Aten (CUDA+CPU)
Test Plan: revert-hammer

Differential Revision:
D17939700

Original commit changeset: 4fc6156ba388

fbshipit-source-id: dded0a2140d2c14cd2f2a574987ecc164b0e5bfe
2019-10-15 15:24:36 -07:00
Edward Yang
c44e33b578 Revert D17889288: [pytorch][PR] Migrate soft_margin_loss from the TH to Aten (CUDA+CPU)
Test Plan: revert-hammer

Differential Revision:
D17889288

Original commit changeset: 9ddffe4dbbfa

fbshipit-source-id: 4fc6156ba38834512b2f735ac0d03e34e69b7286
2019-10-15 14:35:28 -07:00
Andreas Koepf
9033ace9c4 Migrate soft_margin_loss from the TH to Aten (CUDA+CPU) (#27673)
Summary:
Replaces fused TH kernels with a 2-liner of regular Tensor functions.
Benchmarking revealed that performance improves compared to PyTorch 1.2.

Refs: https://github.com/pytorch/pytorch/issues/24631, https://github.com/pytorch/pytorch/issues/24632, https://github.com/pytorch/pytorch/issues/24764, https://github.com/pytorch/pytorch/issues/24765
VitalyFedyunin

### Benchmarking results on my laptop:

## 1.4.0a0+f63c9e8 output
```
PyTorch version: 1.4.0a0+f63c9e8
CPU Operator sanity check:
tensor(0.5926, grad_fn=<MeanBackward0>)
tensor([-0.0159, -0.0170, -0.0011, -0.0083, -0.0140, -0.0217, -0.0290, -0.0262,
        -0.0078, -0.0129])
double backward
tensor(-0.1540, grad_fn=<SumBackward0>)
ok

GPU Operator sanity check:
tensor(0.5601, device='cuda:0', grad_fn=<MeanBackward0>)
tensor([-0.0393, -0.0316, -0.0233, -0.0140, -0.0141, -0.0161, -0.0322, -0.0238,
        -0.0054, -0.0151], device='cuda:0')
double backward
tensor(-0.2148, device='cuda:0', grad_fn=<SumBackward0>)
ok

CPU warmup 1000 took 9.025700273923576e-05
CPU warmup 10000 took 0.0009383050055475906
CPU warmup 100000 took 0.0015631120040779933
CPU warmup TOTAL time 0.0026368020044174045
CPU forward 1000 took 6.919399311300367e-05
CPU forward 10000 took 0.00014462800754699856
CPU forward 100000 took 0.0011234670091653243
CPU forward 1000000 took 0.014555767003912479
CPU forward 10000000 took 0.13409724000666756
CPU forward 100000000 took 1.246048310000333
CPU forward TOTAL time 1.3961777170043206
CPU for- & backward 1000 took 0.0003219560021534562
CPU for- & backward 10000 took 0.00037290599721018225
CPU for- & backward 100000 took 0.001975035003852099
CPU for- & backward 1000000 took 0.02621342398924753
CPU for- & backward 10000000 took 0.2944270490115741
CPU for- & backward 100000000 took 1.6856628700043075
CPU for- & backward TOTAL time 2.0091958299890393

GPU warmup 1000 took 0.0002462909906171262
GPU warmup 10000 took 9.991199476644397e-05
GPU warmup 100000 took 0.00034347400651313365
GPU warmup TOTAL time 0.0007382350013358518
GPU forward 1000 took 9.67290106927976e-05
GPU forward 10000 took 9.349700121674687e-05
GPU forward 100000 took 9.384499571751803e-05
GPU forward 1000000 took 0.0004975290066795424
GPU forward 10000000 took 0.0017606960027478635
GPU forward 100000000 took 0.003572814996005036
GPU forward TOTAL time 0.006185991995153017
GPU for- & backward 1000 took 0.00035818999458570033
GPU for- & backward 10000 took 0.0003240450023440644
GPU for- & backward 100000 took 0.0003223370003979653
GPU for- & backward 1000000 took 0.00036740700306836516
GPU for- & backward 10000000 took 0.0003690610028570518
GPU for- & backward 100000000 took 0.0003672500024549663
GPU for- & backward TOTAL time 0.002197896988946013
```

## 1.2 output
```
PyTorch version: 1.2.0
CPU Operator sanity check:
tensor(0.5926, grad_fn=<SoftMarginLossBackward>)
tensor([-0.0159, -0.0170, -0.0011, -0.0083, -0.0140, -0.0217, -0.0290, -0.0262,
        -0.0078, -0.0129])
double backward
tensor(-0.1540, grad_fn=<SumBackward0>)
ok

GPU Operator sanity check:
tensor(0.5601, device='cuda:0', grad_fn=<SoftMarginLossBackward>)
tensor([-0.0393, -0.0316, -0.0233, -0.0140, -0.0141, -0.0161, -0.0322, -0.0238,
        -0.0054, -0.0151], device='cuda:0')
double backward
tensor(-0.2148, device='cuda:0', grad_fn=<SumBackward0>)
ok

CPU warmup 1000 took 8.422900282312185e-05
CPU warmup 10000 took 0.00036992700188420713
CPU warmup 100000 took 0.003682684007799253
CPU warmup TOTAL time 0.004169487991021015
CPU forward 1000 took 5.521099956240505e-05
CPU forward 10000 took 0.00036948200431652367
CPU forward 100000 took 0.003762389998883009
CPU forward 1000000 took 0.03725024699815549
CPU forward 10000000 took 0.3614480490068672
CPU forward 100000000 took 3.6139175269927364
CPU forward TOTAL time 4.016912263003178
CPU for- & backward 1000 took 0.0002734809968387708
CPU for- & backward 10000 took 0.0006605249946005642
CPU for- & backward 100000 took 0.005437346000690013
CPU for- & backward 1000000 took 0.051245586000732146
CPU for- & backward 10000000 took 0.5291594529990107
CPU for- & backward 100000000 took 5.23841712900321
CPU for- & backward TOTAL time 5.8253340990049765

GPU warmup 1000 took 0.0005757809994975105
GPU warmup 10000 took 0.0004058420017827302
GPU warmup 100000 took 0.0003764610009966418
GPU warmup TOTAL time 0.0013992580061312765
GPU forward 1000 took 0.0003543390048434958
GPU forward 10000 took 0.0003633670130511746
GPU forward 100000 took 0.0004807310033356771
GPU forward 1000000 took 0.0005875999922864139
GPU forward 10000000 took 0.0016903509967960417
GPU forward 100000000 took 0.014400018990272656
GPU forward TOTAL time 0.0179396449966589
GPU for- & backward 1000 took 0.0006167769897729158
GPU for- & backward 10000 took 0.0006845899915788323
GPU for- & backward 100000 took 0.000631830989732407
GPU for- & backward 1000000 took 0.0010741150035755709
GPU for- & backward 10000000 took 0.0017265130009036511
GPU for- & backward 100000000 took 0.014847910992102697
GPU for- & backward TOTAL time 0.01965981800458394
```

### Code used for performance test
```
import torch
import torch.nn.functional as F
import torch.nn as nn

from timeit import default_timer

torch.manual_seed(0)
cpu = torch.device('cpu')
gpu = torch.device('cuda')

loss_fn = F.soft_margin_loss

def run_benchmark(name, depth, require_grad, device, fn):
    total_start = default_timer()
    for i in range(3, 3 + depth):
        start = default_timer()
        n = 10 ** i
        a = torch.rand(n, requires_grad=require_grad, device=device)
        b = torch.rand(n, device=device)
        fn(a, b)
        end = default_timer()
        print('{} {} took {}'.format(name, n, end-start))
    total_end = default_timer()
    print('{} TOTAL time {}'.format(name, total_end-total_start))

def fwd_only(a, b):
    out = loss_fn(a, b)

def fwd_bck(a, b):
    out = loss_fn(a, b)
    out.backward()

def sanity_check(name, device):
    print('{} Operator sanity check:'.format(name))
    a = torch.rand(10, requires_grad=True, device=device)
    b = torch.rand(10, device=device)
    out = loss_fn(a,b)
    print(out)
    out.backward()
    print(a.grad)
    print('double backward')
    loss = loss_fn(a, b)
    loss2 = torch.autograd.grad(loss, a, create_graph=True)
    z = loss2[0].sum()
    print(z)
    z.backward()
    print('ok')
    print()

print('PyTorch version:', torch.__version__)
sanity_check('CPU', cpu)
sanity_check('GPU', gpu)
print()

run_benchmark('CPU warmup', 3, False, cpu, fwd_only)
run_benchmark('CPU forward', 6, False, cpu, fwd_only)
run_benchmark('CPU for- & backward', 6, True, cpu, fwd_bck)
print()

run_benchmark('GPU warmup', 3, False, gpu, fwd_only)
run_benchmark('GPU forward', 6, False, gpu, fwd_only)
run_benchmark('GPU for- & backward', 6, True, gpu, fwd_bck)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27673

Differential Revision: D17889288

Pulled By: ezyang

fbshipit-source-id: 9ddffe4dbbfab6180847a8fec32443910f18f0a9
2019-10-15 08:44:57 -07:00
zou3519
e5d6b75319 Bag of documentation fixes; fix more sphinx warnings (#27850)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27850

Many of these are real problems in the documentation (e.g., a link or
bullet point doesn't display correctly).

Test Plan: - built and viewed the documentation for each change locally.

Differential Revision: D17908123

Pulled By: zou3519

fbshipit-source-id: 65c92a352c89b90fb6b508c388b0874233a3817a
2019-10-15 07:31:14 -07:00
Ailing Zhang
15f9fe1d92 Add missing Optional annotation.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27564

Differential Revision: D17816121

Pulled By: ailzhang

fbshipit-source-id: 5a4ac12ed81bf5d900ec3e7ab616082cb98d832d
2019-10-11 09:04:29 -07:00
Guanheng Zhang
eb93200321 Fix DDP incompatibility issue with nn.MultiheadAttention. (#26826)
Summary:
Fix issue https://github.com/pytorch/pytorch/issues/26698.

With different query/key/value dimensions, `nn.MultiheadAttention` has a DDP incompatibility issue because in that case the `in_proj_weight` attribute is created but not used. Fix it and add a distributed unit test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26826

Differential Revision: D17583807

Pulled By: zhangguanheng66

fbshipit-source-id: c393584c331ed4f57ebaf2d4015ef04589c973f6
2019-10-08 12:13:34 -07:00
Lara
d396c7332a Update ONNX Export for Interpolate in Opset 11 (#26778)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26778

- Add support for linear and cubic interpolate in opset 11.
- Add support for 1d and 3d interpolate in nearest mode for opset 7 and 8.
- Add tests for all cases of interpolate in ORT tests (nearest/linear/cubic, 1d/2d/3d, upsample/downsample).
Original PR resolved: https://github.com/pytorch/pytorch/pull/24805

Reviewed By: hl475

Differential Revision: D17564911

Pulled By: houseroad

fbshipit-source-id: 591e1f5b361854ace322eca1590f8f84d29c1a5d
2019-09-25 05:43:20 -07:00
Edward Yang
1bb895e1c1 Revert D17330801: [pytorch][PR] Update ONNX Export for Interpolate in Opset 11
Test Plan: revert-hammer

Differential Revision:
D17330801

Original commit changeset: 1bdefff9e72f

fbshipit-source-id: dff07477403170c27260f736ab6e6010f0deca9f
2019-09-24 18:56:45 -07:00
Lara
de3d4686ca Update ONNX Export for Interpolate in Opset 11 (#24805)
Summary:
- Add support for linear and cubic interpolate in opset 11.
- Add support for 1d and 3d interpolate in nearest mode for opset 7 and 8.
- Add tests for all cases of interpolate in ORT tests (nearest/linear/cubic, 1d/2d/3d, upsample/downsample).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24805

Reviewed By: hl475

Differential Revision: D17330801

Pulled By: houseroad

fbshipit-source-id: 1bdefff9e72f5e70c51f4721e1d7347478b7505b
2019-09-24 16:29:57 -07:00
Patrick Donnelly
883628cb5c Added documentation for nn.functional.bilinear (#24951)
Summary:
Adds documentation for `nn.functional.bilinear`, as requested in https://github.com/pytorch/pytorch/issues/9886.

The format follows that of `nn.functional.linear`, and borrows from `nn.bilinear` in its description of `Tensor` shapes.

I am happy to add more extensive documentation (e.g. "Args," "Example(s)"). From what I gather, the format of comments is inconsistent across functions in `nn.functional.py` and between modules (e.g. `nn.functional` and `nn`). It's my first PR, so guidance for contributing documentation and other code would be greatly appreciated!
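A minimal sketch of the shapes the new documentation describes (values are illustrative):
```
import torch
import torch.nn.functional as F

x1 = torch.rand(8, 10)          # (N, in1_features)
x2 = torch.rand(8, 20)          # (N, in2_features)
weight = torch.rand(5, 10, 20)  # (out_features, in1_features, in2_features)
bias = torch.rand(5)            # (out_features,)
out = F.bilinear(x1, x2, weight, bias)
print(out.shape)  # torch.Size([8, 5])
```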
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24951

Differential Revision: D17091261

Pulled By: soumith

fbshipit-source-id: efe2ad764700dfd6f30eedc03de4e1cd0d10ac72
2019-08-28 08:19:25 -07:00
bnehoran
74b65c32be Add align_corners option to grid_sample and affine_grid, change default to False (#24929)
Summary:
Resolves: https://github.com/pytorch/pytorch/issues/20785
Addresses https://github.com/pytorch/pytorch/issues/24470 for `affine_grid`
Subsumes and closes: https://github.com/pytorch/pytorch/pull/24878 and likewise closes: https://github.com/pytorch/pytorch/issues/24821

Adds the `align_corners` option to `grid_sample` and `affine_grid`, paralleling the option that was added to `interpolate` in version 0.4.0.

In short, setting `align_corners` to `False` allows these functions to be resolution agnostic.
This ensures, for example, that a grid generated from a neural net trained to warp 1024x1024 images will also work to warp the same image upsampled/downsampled to other resolutions like 512x512 or 2048x2048 without producing scaling/stretching artifacts.

Refer to the documentation and https://github.com/pytorch/pytorch/issues/20785 for more details.
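A minimal sketch of the resolution-agnostic behavior (sizes are illustrative): the same identity `theta` yields an identity warp at every resolution when both calls use `align_corners=False`:
```
import torch
import torch.nn.functional as F

theta = torch.tensor([[[1., 0., 0.],
                       [0., 1., 0.]]])  # identity affine transform
for size in (4, 8, 16):
    img = torch.arange(float(size * size)).view(1, 1, size, size)
    grid = F.affine_grid(theta, (1, 1, size, size), align_corners=False)
    out = F.grid_sample(img, grid, align_corners=False)
    print(torch.allclose(out, img))  # True at every resolution
```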

#### BC-Breaking Changes

- **Important**: BC-Breaking change because of new default for `align_corners`
The old functionality can still be achieved by setting `align_corners=True`, but the default is now set to `align_corners=False`, since this is the more correct setting, and since this matches the default setting of `interpolate`.

- **Should not cause BC issues**: BC-Breaking change for pathological use case
2D affine transforms on 1D coordinates and 3D affine transforms on 2D coordinates (that is, when one of the spatial dimensions has an empty span) are ill-defined, and not an intended use case of `affine_grid`. Whereas before, all grid point components along such a dimension were set arbitrarily to `-1` (that is, before multiplying by the affine matrix), they are now all set instead to `0`, which is a much more consistent and defensible arbitrary choice. A warning is triggered for such cases.

#### Documentation

- Update `affine_grid` documentation to express that it does indeed support 3D affine transforms. This support was already there but not documented.
- Add documentation warnings for BC-breaking changes in `grid_sample` and `affine_grid` (see above).

#### Refactors

- `affine_grid` no longer dispatches to cuDNN under any circumstances.
The decision point for when the cuDNN `affine_grid_generator` is compatible with the native PyTorch version and when it fails is a headache to maintain (see the conditions in `torch/nn/_functions/vision.py`, lines 7-8, as of commit `5377478e94`). The native PyTorch kernel is now used in all cases.

- The kernels for `grid_sample` are slightly refactored to make maintenance easier.

#### Tests
Two new tests are added in `test_nn.py`:
- `test_affine_grid_error_checking` for errors and warnings in `affine_grid`
- `test_affine_grid_3D` for testing `affine_grid`'s 3D functionality. The functionality existed prior to this, but wasn't tested.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24929

Differential Revision: D16949064

Pulled By: ailzhang

fbshipit-source-id: b133ce0d47a2a5b3e2140b9d05fb05fca9140926
2019-08-21 21:17:49 -07:00
Ailing Zhang
b0737ccdc1 Revert D16887357: [pytorch][PR] [BC-BREAKING] Add align_corners option to grid_sample and affine_grid, change default to False
Differential Revision:
D16887357

Original commit changeset: ea09aad7853e

fbshipit-source-id: 0bebb159be4e6ebe479771b42c0b483f5a84a094
2019-08-19 22:05:56 -07:00
Barak Nehoran
87217cfd2a Add align_corners option to grid_sample and affine_grid, change default to False (#23923)
Summary:
Resolves: https://github.com/pytorch/pytorch/issues/20785

Adds the `align_corners` option to `grid_sample` and `affine_grid`, paralleling the option that was added to `interpolate` in version 0.4.0.

In short, setting `align_corners` to `False` allows these functions to be resolution agnostic.
This ensures, for example, that a grid generated from a neural net trained to warp 1024x1024 images will also work to warp the same image upsampled/downsampled to other resolutions like 512x512 or 2048x2048 without producing scaling/stretching artifacts.

Refer to the documentation and https://github.com/pytorch/pytorch/issues/20785 for more details.

**Important**: BC-Breaking Change because of new default
The old functionality can still be achieved by setting `align_corners=True`, but the default is now set to `align_corners=False`, since this is the more correct setting, and since this matches the default setting of `interpolate`.

The vectorized 2D cpu version of `grid_sampler` is refactored a bit. I don’t suspect that this refactor would affect the runtime much, since it is mostly done in inlined functions, but I may be wrong, and this has to be verified by profiling.

~The tests are not yet updated to reflect the new default. New tests should probably also be added to test both settings of `align_corners`.~ _Tests are now updated._
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23923

Differential Revision: D16887357

Pulled By: ailzhang

fbshipit-source-id: ea09aad7853ef16536e719a898db8ba31595daa5
2019-08-19 09:45:44 -07:00
Elias Ellison
33a1c30cb1 cleanup torch/nn/functional.py (#23977)
Summary:
Cleanup torch/nn/functional.py now that the JIT:
- Handles multiple returns
- Typechecks early exits (exceptions)
- Lets assertions refine types
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23977

Differential Revision: D16697750

Pulled By: eellison

fbshipit-source-id: 1f777d6b9ead1105de50120fffd46d523e1e6797
2019-08-07 16:31:36 -07:00
Tongzhou Wang
3107f1dcd5 fix align_corners doc
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23707

Differential Revision: D16617565

Pulled By: ezyang

fbshipit-source-id: 9ae581e9233d8c2b92f35b9486af1dab30ce8e3a
2019-08-02 12:43:35 -07:00
Ailing Zhang
b7d90332ea add notes about overshoot in bicubic mode (#23321)
Summary:
fix https://github.com/pytorch/pytorch/issues/21044

Bicubic interpolation can cause overshoot.

OpenCV keeps the result dtype aligned with the input dtype:
- If input is uint8, the result is clamped [0, 255]
- If input is float, the result is unclamped.

In PyTorch's case, we only accept float input, so we keep the result unclamped and add some notes so that users can explicitly call `torch.clamp()` when necessary.
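A minimal sketch of the recommended explicit clamp (the range is illustrative):
```
import torch
import torch.nn.functional as F

x = torch.rand(1, 1, 4, 4)  # values in [0, 1]
up = F.interpolate(x, scale_factor=2, mode='bicubic', align_corners=False)
# Bicubic interpolation may overshoot the input range; clamp explicitly
# when the result must stay in [0, 1].
up = up.clamp(min=0.0, max=1.0)
```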
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23321

Differential Revision: D16464796

Pulled By: ailzhang

fbshipit-source-id: 177915e525d1f54c2209e277cf73e40699ed1acd
2019-07-24 14:46:37 -07:00
Igor Fedan
c2df54d6d0 avg_pool2d avg_pool3d for LongTensor (#22433)
Summary:
Generate avg_pool2d/avg_pool3d for LongTensor on CPU.
Added the divisor_override parameter.
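A minimal sketch of the new parameter (assuming a CPU LongTensor input, per the summary):
```
import torch
import torch.nn.functional as F

x = torch.arange(16, dtype=torch.long).view(1, 1, 4, 4)
avg = F.avg_pool2d(x, kernel_size=2)
# divisor_override replaces the pooling-window size as the divisor,
# so divisor_override=1 turns the average into a window sum.
sums = F.avg_pool2d(x, kernel_size=2, divisor_override=1)
```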
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22433

Differential Revision: D16108809

Pulled By: ifedan

fbshipit-source-id: 8de7ff585a0479702cceafb5ccf9dfea62a9cc50
2019-07-17 19:59:09 -07:00
Tongzhou Wang
332824551c Fix F.one_hot doc signature
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22929
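For reference, a quick illustration of the documented signature (`num_classes` is inferred from the data when omitted):
```
import torch
import torch.nn.functional as F

F.one_hot(torch.tensor([0, 2, 1]), num_classes=3)
# tensor([[1, 0, 0],
#         [0, 0, 1],
#         [0, 1, 0]])
```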

Differential Revision: D16290741

Pulled By: ezyang

fbshipit-source-id: d8b979e64d92b94c5a70bb4ffe2a83042ed6abfc
2019-07-17 13:23:25 -07:00
David Riazati
10c4b98ade Remove weak script (#22212)
Summary:
* Deletes all weak script decorators / associated data structures / methods
   * In order to keep supporting the standard library in script, this enables recursive script on any function defined in `torch.nn`
   * Most changes in `torch/nn` are the result of `ag -Q "weak" torch/nn/ -l | xargs sed -i '/weak/d'`; only `rnn.py` needed manual editing, using `ignore` and `export` to continue supporting the overloaded `forward` methods
* `Sequential`/`ModuleList` no longer need to be added to constants since they are compiled on demand

This should also fix https://github.com/pytorch/pytorch/issues/22212
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22212

Differential Revision: D15988346

Pulled By: driazati

fbshipit-source-id: af223e3ad0580be895377312949997a70e988e4f
2019-07-03 17:28:25 -07:00
Guanheng Zhang
bb0f299f27 Update MultiheadAttention module support key/value with different number of features and allow static key/value (#21288)
Summary:
The changes include:

1. Allow key/value to have a different number of features from the query. This supports the case when key and value have different feature dimensions.
2. Support three separate proj_weights, in addition to a single in_proj_weight. The proj_weight of key and value may have a different dimension from that of query, so three separate proj_weights are necessary. In case key and value have the same dimension as query, it is preferred to use a single large proj_weight for performance reasons. However, it should be noted that using a single large weight or three separate weights is a size-dependent decision.
3. Give an option to use static k and v in the multihead_attn operator (see saved_k and saved_v). Those static key/value tensors can now be re-used when training the model.
4. Add more test cases to cover the arguments.

Note: current users should not be affected by the changes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21288

Differential Revision: D15738808

Pulled By: zhangguanheng66

fbshipit-source-id: 288b995787ad55fba374184b3d15b5c6fe9abb5c
2019-07-02 18:06:25 -07:00
Lara
34aee933f9 ONNX Export Interpolate (Resize) for opset version 10
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21434

Reviewed By: zrphercule

Differential Revision: D15777197

Pulled By: houseroad

fbshipit-source-id: 517b06a54a234ffdb762401e83f5a732023ed259
2019-06-19 13:40:27 -07:00
Ivan Ogasawara
0f675f9cbc Port im2col and vol2col (#21769)
Summary:
Partially resolves https://github.com/pytorch/pytorch/issues/18353
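im2col underlies the public `F.unfold` (and vol2col is its volumetric analogue); a minimal sketch of the round trip (shapes are illustrative):
```
import torch
import torch.nn.functional as F

x = torch.rand(1, 3, 8, 8)
cols = F.unfold(x, kernel_size=3)  # im2col: (1, 3*3*3, 36) sliding blocks
x_back = F.fold(cols, output_size=(8, 8), kernel_size=3)  # col2im counterpart; overlaps are summed
print(cols.shape)  # torch.Size([1, 27, 36])
```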
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21769

Differential Revision: D15854530

Pulled By: ezyang

fbshipit-source-id: 574853c068010d1b7588047d2ab7450077471447
2019-06-17 10:06:26 -07:00
Natalia Gimelshein
efd20de276 fix multihead attention for half (#21658)
Summary:
Currently, multihead attention for the half type is broken:
```
  File "/home/ngimel/pytorch/torch/nn/functional.py", line 3279, in multi_head_attention_forward
    attn_output = torch.bmm(attn_output_weights, v)
RuntimeError: Expected object of scalar type Float but got scalar type Half for argument #2 'mat2'
```
because softmax converts half inputs into fp32 inputs. This is unnecessary - all the computations in softmax will be done in fp32 anyway, and the results need to be converted into fp16 for the subsequent batch matrix multiply, so nothing is gained by writing them out in fp32. This PR gets rid of type casting in softmax, so that half works.
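A minimal sketch of the fixed dtype flow (assuming a CUDA device is available):
```
import torch
import torch.nn.functional as F

scores = torch.randn(2, 4, 5, device='cuda', dtype=torch.half)
v = torch.randn(2, 5, 8, device='cuda', dtype=torch.half)
attn = F.softmax(scores, dim=-1)  # stays half after the fix
out = torch.bmm(attn, v)          # dtypes now match
print(out.dtype)                  # torch.float16
```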
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21658

Differential Revision: D15807487

Pulled By: zhangguanheng66

fbshipit-source-id: 4709ec71a36383d0d35a8f01021e12e22b94992d
2019-06-13 15:17:04 -07:00
Kabir Kwatra
26bcadcc61 Gumbel-Softmax Arxiv Docs Link Fix (#21376)
Summary:
Links separated #20297
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21376

Differential Revision: D15696413

Pulled By: ezyang

fbshipit-source-id: 513bd430e41c109aa2d0fbaa9a242acb2a12059b
2019-06-06 10:11:18 -07:00
Xiaomeng Yang
0c6efbd410 Fix gelu documents (#21265)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21265

Fix gelu documents

Reviewed By: hl475

Differential Revision: D15598958

fbshipit-source-id: 483040069102daada705401c36c8990598142d3d
2019-06-02 20:17:56 -07:00
Xiaomeng Yang
93ae040ff0 Add gelu activation in pytorch (#20665)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20665

Add gelu activation forward on CPU in pytorch

Compared to the current Python-implemented version of gelu in BERT models, like

  def gelu(self, x):
      return x * 0.5 * (1.0 + torch.erf(x / self.sqrt_two))

The torch.nn.functional.gelu function can reduce the forward time from 333ms to 109ms (with MKL) / 112ms (without MKL) for input size = [64, 128, 56, 56] on a devvm.
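A minimal usage sketch of the new operator (the input size matches the benchmark above):
```
import torch
import torch.nn.functional as F

x = torch.randn(64, 128, 56, 56)
y = F.gelu(x)  # exact erf-based GELU, computed in native code
```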

Reviewed By: zheng-xq

Differential Revision: D15400974

fbshipit-source-id: f606b43d1dd64e3c42a12c4991411d47551a8121
2019-06-02 09:08:47 -07:00
Guanheng Zhang
8e3311c5e2 Remove functionality unsupported by the JIT from multi_head_attention_forward. (#20653)
Summary:
Remove the internal functions in multi_head_attention_forward. Those internal functions caused a 10-15% performance regression, and there is possibly a JIT issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20653

Differential Revision: D15398888

Pulled By: cpuhrsch

fbshipit-source-id: 0a3f053a4ade5009e73d3974fa6733c2bff9d929
2019-05-27 15:12:58 -07:00
daquexian
a3a458ed30 Fix align corner docs (#20961)
Summary:
I believe the `True` and `False` in the doc are reversed :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20961

Differential Revision: D15510806

Pulled By: soumith

fbshipit-source-id: 62566bb595e187506b23dedc24892e48f35b1147
2019-05-26 14:57:37 -07:00
Yifu Wang
5e69e76aba Remove padding_mode from torch.nn.functional.conv{1,2,3}d's docstr (#20891)
Summary:
Fixes #20694
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20891

Differential Revision: D15510790

Pulled By: soumith

fbshipit-source-id: aa3630693c7446bf18a390cb49c4df9bc9c59eea
2019-05-26 14:52:51 -07:00
Josef Lindman Hörnlund
87040af498 Fix documentation for attention mask shape (#20850)
Summary:
Attention mask should be of shape `(L, S)` since it is added to `attn_output_weights`.
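A minimal sketch of the corrected shape (an all-zero additive mask, so the output is unchanged; dimensions are illustrative):
```
import torch
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=8, num_heads=2)
q = torch.rand(4, 1, 8)        # target length L = 4
kv = torch.rand(5, 1, 8)       # source length S = 5
attn_mask = torch.zeros(4, 5)  # (L, S), added to attn_output_weights
out, weights = mha(q, kv, kv, attn_mask=attn_mask)
```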
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20850

Differential Revision: D15495587

Pulled By: ezyang

fbshipit-source-id: 61d6801da5291df960daab273e874df28aedbf6e
2019-05-24 09:10:11 -07:00
Guanheng Zhang
3caf4e6985 Remove weak_script in MultiheadAttention function. (#20563)
Summary:
Remove weak_script. After the recent split of the forward() function in the MultiheadAttention module, we noticed a memory leak on GPU. Fix the problem by removing the "weak_script" decorators.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20563

Differential Revision: D15368262

Pulled By: zhangguanheng66

fbshipit-source-id: 475db93c9ee0dbaea8fb914c004e7d1e0d419bc2
2019-05-15 20:10:39 -07:00
Jason Lian
6e82b1c77d Split nn.MultiHeadAttention into Module + functional (#20415)
Summary:
Move functions from torch/nn/modules/activation.py to torch/nn/functional.py. For functions not yet implemented (_get_input_buffer and _set_input_buffer), a TODO is added.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20415

Differential Revision: D15318078

Pulled By: jamarshon

fbshipit-source-id: 5ca698e2913821442cf8609cc61ac8190496a3c6
2019-05-14 08:41:28 -07:00
interesaaat
35fed93b1e Adding Poisson NLL loss to libtorch (#19316)
Summary:
This PR adds Poisson NLL loss to ATen and substitutes the Python implementation with a call to the C++ one.

Fixes #19186.
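A minimal usage sketch of the functional form (values are illustrative):
```
import torch
import torch.nn.functional as F

log_rate = torch.randn(5, requires_grad=True)  # interpreted as log(lambda)
target = torch.poisson(torch.rand(5) * 5)      # Poisson-distributed counts
loss = F.poisson_nll_loss(log_rate, target, log_input=True)
loss.backward()
```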
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19316

Differential Revision: D15012957

Pulled By: ezyang

fbshipit-source-id: 0a3f56e8307969c2f9cc321b5357a496c3d1784e
2019-05-10 11:57:49 -07:00
Ailing Zhang
899bddeeb6 fix typo in adaptive methods annotation (#20306)
Summary:
fixes #20215
The confusing behavior was caused by typos in the type annotations :(
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20306

Differential Revision: D15276216

Pulled By: ailzhang

fbshipit-source-id: 1b0c9635a72a05c9b537f80d85b117b5077fbec7
2019-05-09 09:29:37 -07:00
Mikhail Zolotukhin
3a0727e58b Fix flake8. (#19832)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19832
ghimport-source-id: 7360a52dbcf83458797c27002afc1fd53ee5907f

Differential Revision: D15115620

Pulled By: ZolotukhinM

fbshipit-source-id: aa62b04facc1e1824a8889a32dace5804daa21df
2019-04-30 12:09:10 -07:00
Tongzhou Wang
42fbeef5d7 update F.grid_sample doc for clarity (#19754)
Summary:
https://github.com/pytorch/pytorch/issues/19717
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19754

Differential Revision: D15085449

Pulled By: soumith

fbshipit-source-id: 0dda05bd395d58a496bf397ca7f1c50a239b0ed1
2019-04-26 16:01:24 -07:00