466 Commits

Author SHA1 Message Date
Natalia Gimelshein
26f9ac98e5 Revert D26105797: [pytorch][PR] Exposing linear layer to fuser
Test Plan: revert-hammer

Differential Revision:
D26105797 (e488e3c443)

Original commit changeset: 6f7cedb9f6e3

fbshipit-source-id: f0858cefed76d726e9dba61e51e1eaf2af4c99c5
2021-02-02 17:39:17 -08:00
jiej
e488e3c443 Exposing linear layer to fuser (#50856)
Summary:
1. Enable linear in autodiff.
2. Remove the control flow in Python for linear.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50856

Reviewed By: pbelevich

Differential Revision: D26105797

Pulled By: eellison

fbshipit-source-id: 6f7cedb9f6e3e46daa24223d2a6080880498deb4
2021-02-02 15:39:01 -08:00
M.L. Croci
8eb90d4865 Add Gaussian NLL Loss (#50886)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/48520.

cc albanD (This is a clean retry PR https://github.com/pytorch/pytorch/issues/49807)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50886

Reviewed By: ejguan

Differential Revision: D26007435

Pulled By: albanD

fbshipit-source-id: 88fe91b40dea6f72e093e6301f0f04fcc842d2f0
2021-01-22 06:56:49 -08:00
Taylor Robie
6a3fc0c21c Treat has_torch_function and object_has_torch_function as static False when scripting (#48966)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48966

This PR lets us skip the `if not torch.jit.is_scripting():` guards on `functional` and `nn.functional` by directly registering `has_torch_function` and `object_has_torch_function` to the JIT as statically False.

**Benchmarks**

The benchmark script is fairly long because it tests all four PRs in the stack and uses threads and subprocesses so that the benchmark can use multiple cores while still collecting good numbers. Both wall times and instruction counts were collected. This stack changes dozens of operators and functions, but so mechanically that there are only a handful of code path changes. Each row is a slightly different code path (e.g. testing in Python, testing in the arg parser, different input types).

<details>

<summary> Test script </summary>

```python
import argparse
import multiprocessing
import multiprocessing.dummy
import os
import pickle
import queue
import random
import sys
import subprocess
import tempfile
import time

import torch
from torch.utils.benchmark import Timer, Compare, Measurement

NUM_CORES = multiprocessing.cpu_count()
ENVS = {
    "ref": "HEAD (current)",
    "torch_fn_overhead_stack_0": "#48963",
    "torch_fn_overhead_stack_1": "#48964",
    "torch_fn_overhead_stack_2": "#48965",
    "torch_fn_overhead_stack_3": "#48966",
}

CALLGRIND_ENVS = tuple(ENVS.keys())

MIN_RUN_TIME = 3
REPLICATES = {
    "longer": 1_000,
    "long": 300,
    "short": 50,
}

CALLGRIND_NUMBER = {
    "overnight": 500_000,
    "long": 250_000,
    "short": 10_000,
}

CALLGRIND_TIMEOUT = {
    "overnight": 800,
    "long": 400,
    "short": 100,
}

SETUP = """
    x = torch.ones((1, 1))
    y = torch.ones((1, 1))
    w_tensor = torch.ones((1, 1), requires_grad=True)
    linear = torch.nn.Linear(1, 1, bias=False)
    linear_w = linear.weight
"""

TASKS = {
    "C++: unary                 `.t()`": "w_tensor.t()",
    "C++: unary  (Parameter)    `.t()`": "linear_w.t()",
    "C++: binary (Parameter)    `mul` ": "x + linear_w",
    "tensor.py: _wrap_type_error_to_not_implemented `__floordiv__`": "x // y",
    "tensor.py: method          `__hash__`": "hash(x)",
    "Python scalar              `__rsub__`": "1 - x",
    "functional.py: (unary)     `unique`": "torch.functional.unique(x)",
    "functional.py: (args)      `atleast_1d`": "torch.functional.atleast_1d((x, y))",
    "nn/functional.py: (unary)  `relu`": "torch.nn.functional.relu(x)",
    "nn/functional.py: (args)   `linear`": "torch.nn.functional.linear(x, w_tensor)",
    "nn/functional.py: (args)   `linear (Parameter)`": "torch.nn.functional.linear(x, linear_w)",
    "Linear(..., bias=False)": "linear(x)",
}

def _worker_main(argv, fn):
    parser = argparse.ArgumentParser()
    parser.add_argument("--output_file", type=str)
    parser.add_argument("--single_task", type=int, default=None)
    parser.add_argument("--length", type=str)
    args = parser.parse_args(argv)
    single_task = args.single_task

    conda_prefix = os.getenv("CONDA_PREFIX")
    assert torch.__file__.startswith(conda_prefix)

    env = os.path.split(conda_prefix)[1]
    assert env in ENVS

    results = []
    for i, (k, stmt) in enumerate(TASKS.items()):
        if single_task is not None and single_task != i:
            continue

        timer = Timer(
            stmt=stmt,
            setup=SETUP,
            sub_label=k,
            description=ENVS[env],
        )
        results.append(fn(timer, args.length))

    with open(args.output_file, "wb") as f:
        pickle.dump(results, f)

def worker_main(argv):
    _worker_main(
        argv,
        lambda timer, _: timer.blocked_autorange(min_run_time=MIN_RUN_TIME)
    )

def callgrind_worker_main(argv):
    _worker_main(
        argv,
        lambda timer, length: timer.collect_callgrind(number=CALLGRIND_NUMBER[length], collect_baseline=False))

def main(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--long", action="store_true")
    parser.add_argument("--longer", action="store_true")
    args = parser.parse_args(argv)

    if args.longer:
        length = "longer"
    elif args.long:
        length = "long"
    else:
        length = "short"
    replicates = REPLICATES[length]

    num_workers = int(NUM_CORES // 2)
    tasks = list(ENVS.keys()) * replicates
    random.shuffle(tasks)
    task_queue = queue.Queue()
    for _ in range(replicates):
        envs = list(ENVS.keys())
        random.shuffle(envs)
        for e in envs:
            task_queue.put((e, None))

    callgrind_task_queue = queue.Queue()
    for e in CALLGRIND_ENVS:
        for i, _ in enumerate(TASKS):
            callgrind_task_queue.put((e, i))

    results = []
    callgrind_results = []

    def map_fn(worker_id):
        # Adjacent cores often share cache and maxing out a machine can distort
        # timings so we space them out.
        callgrind_cores = f"{worker_id * 2}-{worker_id * 2 + 1}"
        time_cores = str(worker_id * 2)
        # mkstemp returns an open file descriptor; close it so it is not leaked.
        fd, output_file = tempfile.mkstemp(suffix=".pkl")
        os.close(fd)
        try:
            loop_tasks = (
                # Callgrind is long running, and then the workers can help with
                # timing after they finish collecting counts.
                (callgrind_task_queue, callgrind_results, "callgrind_worker", callgrind_cores, CALLGRIND_TIMEOUT[length]),
                (task_queue, results, "worker", time_cores, None))

            for queue_i, results_i, mode_i, cores, timeout in loop_tasks:
                while True:
                    try:
                        env, task_i = queue_i.get_nowait()
                    except queue.Empty:
                        break

                    remaining_attempts = 3
                    while True:
                        try:
                            subprocess.run(
                                " ".join([
                                    "source", "activate", env, "&&",
                                    "taskset", "--cpu-list", cores,
                                    "python", os.path.abspath(__file__),
                                    "--mode", mode_i,
                                    "--length", length,
                                    "--output_file", output_file
                                ] + ([] if task_i is None else ["--single_task", str(task_i)])),
                                shell=True,
                                check=True,
                                timeout=timeout,
                            )
                            break

                        except subprocess.TimeoutExpired:
                            # Sometimes Valgrind will hang if there are too many
                            # concurrent runs.
                            remaining_attempts -= 1
                            if not remaining_attempts:
                                print("Too many failed attempts.")
                                raise
                            print(f"Timeout after {timeout} sec. Retrying.")

                    # We don't need a lock, as the GIL is enough.
                    with open(output_file, "rb") as f:
                        results_i.extend(pickle.load(f))

        finally:
            os.remove(output_file)

    with multiprocessing.dummy.Pool(num_workers) as pool:
        st, st_estimate, eta, n_total = time.time(), None, "", len(tasks) * len(TASKS)
        map_job = pool.map_async(map_fn, range(num_workers))
        while not map_job.ready():
            n_complete = len(results)
            if n_complete and len(callgrind_results):
                if st_estimate is None:
                    st_estimate = time.time()
                else:
                    sec_per_element = (time.time() - st_estimate) / n_complete
                    n_remaining = n_total - n_complete
                    eta = f"ETA: {n_remaining * sec_per_element:.0f} sec"

            print(
                f"\r{n_complete} / {n_total}  "
                f"({len(callgrind_results)} / {len(CALLGRIND_ENVS) * len(TASKS)})   "
                f"{eta}".ljust(40), end="")
            sys.stdout.flush()
            time.sleep(2)
    total_time = int(time.time() - st)
    print(f"\nTotal time: {int(total_time // 60)} min, {total_time % 60} sec")

    desc_to_ind = {k: i for i, k in enumerate(ENVS.values())}
    results.sort(key=lambda r: desc_to_ind[r.description])

    # TODO: Compare should be richer and more modular.
    compare = Compare(results)
    compare.trim_significant_figures()
    compare.colorize(rowwise=True)

    # Manually add master vs. overall relative delta t.
    merged_results = {
        (r.description, r.sub_label): r
        for r in Measurement.merge(results)
    }

    cmp_lines = str(compare).splitlines(False)
    print(cmp_lines[0][:-1] + "-" * 15 + "]")
    print(f"{cmp_lines[1]} |{'':>10}\u0394t")
    print(cmp_lines[2] + "-" * 15)
    for l, t in zip(cmp_lines[3:3 + len(TASKS)], TASKS.keys()):
        assert l.strip().startswith(t)
        t0 = merged_results[(ENVS["ref"], t)].median
        t1 = merged_results[(ENVS["torch_fn_overhead_stack_3"], t)].median
        print(f"{l} |{'':>5}{(t1 / t0 - 1) * 100:>6.1f}%")
    print("\n".join(cmp_lines[3 + len(TASKS):]))

    counts_dict = {
        (r.task_spec.description, r.task_spec.sub_label): r.counts(denoise=True)
        for r in callgrind_results
    }

    def rel_diff(x, x0):
        return f"{(x / x0 - 1) * 100:>6.1f}%"

    task_pad = max(len(t) for t in TASKS)
    print(f"\n\nInstruction % change (relative to `{CALLGRIND_ENVS[0]}`)")
    print(" " * (task_pad + 8)  + (" " * 7).join([ENVS[env] for env in CALLGRIND_ENVS[1:]]))
    for t in TASKS:
        values = [counts_dict[(ENVS[env], t)] for env in CALLGRIND_ENVS]

        print(t.ljust(task_pad + 3) + "  ".join([
            rel_diff(v, values[0]).rjust(len(ENVS[env]) + 5)
            for v, env in zip(values[1:], CALLGRIND_ENVS[1:])]))

        print("\033[4m" + "    Instructions per invocation".ljust(task_pad + 3) + "  ".join([
            f"{v // CALLGRIND_NUMBER[length]:.0f}".rjust(len(ENVS[env]) + 5)
            for v, env in zip(values[1:], CALLGRIND_ENVS[1:])]) + "\033[0m")
        print()

    import pdb
    pdb.set_trace()

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--mode", type=str, choices=("main", "worker", "callgrind_worker"), default="main")
    args, remaining = parser.parse_known_args()

    if args.mode == "main":
        main(remaining)

    elif args.mode == "callgrind_worker":
        callgrind_worker_main(remaining)

    else:
        worker_main(remaining)

```

</details>

**Wall time**
<img width="1178" alt="Screen Shot 2020-12-12 at 12 28 13 PM" src="https://user-images.githubusercontent.com/13089297/101994419-284f6a00-3c77-11eb-8dc8-4f69a890302e.png">

<details>

<summary> Longer run (`python test.py --long`) is basically identical. </summary>

<img width="1184" alt="Screen Shot 2020-12-12 at 5 02 47 PM" src="https://user-images.githubusercontent.com/13089297/102000425-2350e180-3c9c-11eb-999e-a95b37e9ef54.png">

</details>

**Callgrind**
<img width="936" alt="Screen Shot 2020-12-12 at 12 28 54 PM" src="https://user-images.githubusercontent.com/13089297/101994421-2e454b00-3c77-11eb-9cd3-8cde550f536e.png">

Test Plan: existing unit tests.

Reviewed By: ezyang

Differential Revision: D25590731

Pulled By: robieta

fbshipit-source-id: fe05305ff22b0e34ced44b60f2e9f07907a099dd
2021-01-10 19:23:38 -08:00
Taylor Robie
d31a760be4 move has_torch_function to C++, and make a special case object_has_torch_function (#48965)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48965

This PR pulls `__torch_function__` checking entirely into C++ and adds a special `object_has_torch_function` method for ops that take only one arg, which lets us skip tuple construction and unpacking. We can now also do away with the Python-side fast bailouts for `Tensor` (e.g. `if any(type(t) is not Tensor for t in tensors) and has_torch_function(tensors)`) because they're actually slower than checking with the Python C API.
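
As a rough illustration of what the two entry points decide, here is a pure-Python sketch. It is a simplified stand-in and not the C++ implementation: the classes `PlainValue` and `Overriding` and both `*_sketch` helpers are hypothetical names, and the real check handles more cases than a bare attribute lookup.

```python
class PlainValue:
    """Hypothetical argument type that defines no override."""


class Overriding:
    """Hypothetical argument type that opts in to the protocol."""

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        return NotImplemented


def has_torch_function_sketch(relevant_args):
    # Multi-argument form: the caller must first pack the arguments into a tuple.
    return any(hasattr(type(a), "__torch_function__") for a in relevant_args)


def object_has_torch_function_sketch(obj):
    # Single-argument form: no tuple has to be built or unpacked.
    return hasattr(type(obj), "__torch_function__")


plain, special = PlainValue(), Overriding()
print(has_torch_function_sketch((plain, plain)))    # False
print(has_torch_function_sketch((plain, special)))  # True
print(object_has_torch_function_sketch(special))    # True
```

The single-argument helper exists only to avoid the temporary tuple; both helpers answer the same question.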

Test Plan: Existing unit tests. Benchmarks are in #48966

Reviewed By: ezyang

Differential Revision: D25590732

Pulled By: robieta

fbshipit-source-id: 6bd74788f06cdd673f3a2db898143d18c577eb42
2021-01-10 19:23:35 -08:00
Richard Barnes
2bceee785f Clean up simple type annotations in nn/functional.py (#50106)
Summary:
Also reformats code to pass linters.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50106

Test Plan: Sandcastle tests

Reviewed By: xush6528

Differential Revision: D25787566

fbshipit-source-id: 39c86b4021e279f92f8ccf30252a6cfae1063c3c
2021-01-07 15:33:40 -08:00
Samuel Marks
e6779d4357 [*.py] Rename "Arguments:" to "Args:" (#49736)
Summary:
I've written custom parsers and emitters for everything from docstrings to classes and functions. However, I recently came across an issue when I was parsing/generating from the TensorFlow codebase: inconsistent use of `Args:` and `Arguments:` in its docstrings.

```sh
(pytorch#c348fae)$ for name in 'Args:' 'Arguments:'; do
    printf '%-10s %04d\n' "$name" "$(rg -IFtpy --count-matches "$name" | paste -s -d+ -- | bc)"; done
Args:      1095
Arguments: 0336
```

It is easy enough to extend my parsers to support both variants; however, it looks like `Arguments:` is wrong anyway, as per:

  - https://google.github.io/styleguide/pyguide.html#doc-function-args @ [`ddccc0f`](https://github.com/google/styleguide/blob/ddccc0f/pyguide.md)

  - https://chromium.googlesource.com/chromiumos/docs/+/master/styleguide/python.md#describing-arguments-in-docstrings @ [`9fc0fc0`](https://chromium.googlesource.com/chromiumos/docs/+/9fc0fc0/styleguide/python.md)

  - https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html @ [`c0ae8e3`](https://github.com/sphinx-contrib/napoleon/blob/c0ae8e3/docs/source/example_google.rst)

Therefore, only `Args:` is valid. This PR replaces them throughout the codebase.
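
For illustration only, a replacement of this kind could be sketched with a line-anchored regular expression. This is not the tooling used for this PR; the helper name `rename_arguments_header` and the sample docstring are assumptions made for the example.

```python
import re

# Match a docstring section header that consists solely of "Arguments:",
# keeping its indentation so the surrounding layout is untouched.
_HEADER = re.compile(r"^(?P<indent>[ \t]*)Arguments:[ \t]*$", flags=re.MULTILINE)


def rename_arguments_header(source: str) -> str:
    """Rewrite 'Arguments:' section headers to 'Args:'."""
    return _HEADER.sub(r"\g<indent>Args:", source)


sample = '''def scale(x, factor):
    """Scale a value.

    Arguments:
        x: input value
        factor: multiplier
    """
    return x * factor
'''

print(rename_arguments_header(sample))
```

Anchoring on the whole line avoids touching prose that merely mentions the word "Arguments:" mid-sentence.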

PS: For related PRs, see tensorflow/tensorflow/pull/45420

PPS: The trackbacks automatically appearing below are sending the same changes to other repositories in the [PyTorch](https://github.com/pytorch) organisation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49736

Reviewed By: albanD

Differential Revision: D25710534

Pulled By: soumith

fbshipit-source-id: 61e8ff01abb433e9f78185c2d1d0cbd7c22c1619
2020-12-28 09:34:47 -08:00
Joel Schlosser
68d438c9da Add PixelUnshuffle (#49334)
Summary:
Adds an implementation of `torch.nn.PixelUnshuffle` as the inverse operation of `torch.nn.PixelShuffle`. This addresses https://github.com/pytorch/pytorch/issues/2456
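
The inverse relationship can be sketched on nested lists in plain Python. This is an illustrative sketch, not the PyTorch implementation: it omits the batch dimension, and it assumes the output channel ordering `c * r * r + i * r + j` for the pixel at row offset `i` and column offset `j` within each `r x r` block.

```python
def pixel_shuffle(x, r):
    """Rearrange (C*r*r, H, W) nested lists into (C, H*r, W*r)."""
    c_out = len(x) // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[None] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for y in range(h):
                    for z in range(w):
                        out[c][y * r + i][z * r + j] = src[y][z]
    return out


def pixel_unshuffle(x, r):
    """Rearrange (C, H*r, W*r) nested lists into (C*r*r, H, W)."""
    h, w = len(x[0]) // r, len(x[0][0]) // r
    out = []
    for channel in x:
        for i in range(r):
            for j in range(r):
                out.append(
                    [[channel[y * r + i][z * r + j] for z in range(w)] for y in range(h)]
                )
    return out


image = [[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]]
packed = pixel_unshuffle(image, 2)
print(packed[0])                          # [[0, 2], [8, 10]]
print(pixel_shuffle(packed, 2) == image)  # True
```

One 4x4 channel becomes four 2x2 channels, and shuffling them back recovers the original.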

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49334

Test Plan:
```
# Unit tests.
python test/test_nn.py TestNN.test_pixel_shuffle_unshuffle

# Module test.
python test/test_nn.py TestNN.test_PixelUnshuffle

# C++ API tests.
build/bin/test_api

# C++ / python parity tests.
python test/test_cpp_api_parity.py

# JIT test.
python test/test_jit.py TestJitGeneratedFunctional.test_nn_pixel_unshuffle

# Override tests.
python test/test_overrides.py

# Type hint tests.
python test/test_type_hints.py
```

Screenshots of rendered docs:
<img width="876" alt="Screen Shot 2020-12-18 at 12 19 05 PM" src="https://user-images.githubusercontent.com/75754324/102642255-6b07bb00-412b-11eb-88fa-e53e7e8ba720.png">
<img width="984" alt="Screen Shot 2020-12-18 at 12 19 26 PM" src="https://user-images.githubusercontent.com/75754324/102642276-70fd9c00-412b-11eb-8548-445082a2db02.png">
<img width="932" alt="Screen Shot 2020-12-18 at 12 19 34 PM" src="https://user-images.githubusercontent.com/75754324/102642704-19abfb80-412c-11eb-9546-95bdd1c3cf22.png">
<img width="876" alt="Screen Shot 2020-12-22 at 12 51 36 PM" src="https://user-images.githubusercontent.com/75754324/102918259-986aa680-4454-11eb-99e7-a0b4c8b3e283.png">
<img width="869" alt="Screen Shot 2020-12-22 at 12 51 44 PM" src="https://user-images.githubusercontent.com/75754324/102918274-9ef91e00-4454-11eb-94bb-91b58aff47d3.png">

Reviewed By: mruberry

Differential Revision: D25401439

Pulled By: jbschlosser

fbshipit-source-id: 209d92ce7295e51699e83616d0c62170a7ce75c8
2020-12-22 20:14:55 -08:00
Guanheng Zhang
e2b4c63dd9 Enable the faster combined weight branch in MHA when query/key/value is same object with nan (#48126)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/47979

For the MHA module, the combined weight branch is preferred whenever query/key/value are the same (either equal in value per `torch.equal`, or the very same object per `is`). This PR enables the faster branch when a single object containing `nan` is passed to MHA.

For background:
```python
import torch
a = torch.tensor([float('NaN'), 1, float('NaN'), 2, 3])
print(a is a) # True
print(torch.equal(a, a)) # False
```
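
A hedged sketch of how such a branch decision can be written so that an identity check short-circuits the value comparison. It uses plain Python lists instead of tensors; `elementwise_equal` stands in for a value comparison such as `torch.equal`, and `use_combined_weight_branch` is a hypothetical name, not the function in the MHA code.

```python
def elementwise_equal(a, b):
    # NaN never compares equal to anything, itself included, so any NaN makes this False.
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


def use_combined_weight_branch(query, key, value):
    # Test identity first: the same object always qualifies, even with NaNs inside.
    return (query is key or elementwise_equal(query, key)) and (
        key is value or elementwise_equal(key, value)
    )


x = [float("nan"), 1.0, 2.0]
copy_of_x = list(x)

print(use_combined_weight_branch(x, x, x))          # True
print(use_combined_weight_branch(x, copy_of_x, x))  # False
```

With only the value comparison, the first call would also return `False`, which is the slow-path behaviour the PR removes for the single-object case.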

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48126

Reviewed By: gchanan

Differential Revision: D25042082

Pulled By: zhangguanheng66

fbshipit-source-id: 6bb17a520e176ddbb326ddf30ee091a84fcbbf27
2020-11-18 08:24:41 -08:00
Qi Zhou
0ec717c830 Support int32 indices and offsets in nn.EmbeddingBag (#46758)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46758

It's generally helpful to support int32 indices and offsets, especially when such tensors are large and need to be transferred to accelerator backends. Since supporting the combination of int32 indices and int64 offsets is unlikely to be useful, we enforce that the two have the same type.

Test Plan: unit tests

Reviewed By: ngimel

Differential Revision: D24470808

fbshipit-source-id: 94b8a1d0b7fc9fe3d128247aa042c04d7c227f0b
2020-11-03 23:33:50 -08:00
pomelyu
f41f3e3cd1 Implement bicubic grid sampler (#44780)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/44601

I added a bicubic grid sampler on both the CPU and CUDA sides, but not yet for AVX2.

There is a [colab notebook](https://colab.research.google.com/drive/1mIh6TLLj5WWM_NcmKDRvY5Gltbb781oU?usp=sharing) showing some test results. The notebook uses bilinear for the test, since I could only use the distributed version of PyTorch in it. You can download it and set `mode_torch=bicubic` to show the results.

There is some duplicated code for getting and setting values, since the helper function used in bilinear first clips coordinates that fall beyond the boundary and then gets or sets the value. In bicubic, however, more points have to be considered. I can refactor that part after making sure the overall calculation is correct.

Thanks

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44780

Reviewed By: mrshenli

Differential Revision: D24681114

Pulled By: mruberry

fbshipit-source-id: d39c8715e2093a5a5906cb0ef040d62bde578567
2020-11-03 15:34:59 -08:00
Ollin Boer Bohan
ac4ee0ef5d Fix typo in docs for interpolate (#46589)
Summary:
Removes a spurious backtick in [the docs for `torch.nn.functional.interpolate`](https://pytorch.org/docs/stable/nn.functional.html?highlight=grid_sample#torch.nn.functional.interpolate)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46589

Reviewed By: zou3519

Differential Revision: D24422550

Pulled By: ezyang

fbshipit-source-id: c1e6b7de4584b2a3f68b458801a33b3fc71c1944
2020-10-21 11:31:53 -07:00
n-v-k
64b0686986 Expose ChannelShuffle (#46000)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/45999
Also small fix for caffe2 counterpart

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46000

Reviewed By: mruberry

Differential Revision: D24185855

Pulled By: ngimel

fbshipit-source-id: c5d599bb8100b86b81c6901f1b8b8baefc12cb16
2020-10-08 16:00:01 -07:00
Natalia Gimelshein
52f2db752d unify reproducibility notes (#45748)
Summary:
Many of our functions contain the same warnings about the reproducibility of results. Make them use a common template.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45748

Reviewed By: colesbury

Differential Revision: D24089114

Pulled By: ngimel

fbshipit-source-id: e6aa4ce6082f6e0f4ce2713c2bf1864ee1c3712a
2020-10-08 02:14:57 -07:00
Ansley Ussery
7726754e70 Add function signature for pixel_shuffle (#45661)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45661

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D24078627

Pulled By: ansleyadelaide

fbshipit-source-id: 44917ff5932e4d0adcc18ce24ecfc0b5686818e3
2020-10-02 11:46:35 -07:00
Guilherme Leobas
c1e6592964 Enable type-checking of torch.nn.quantized.* modules (#43110)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/43029

I am not changing the following files in this PR:
* `torch/nn/quantized/dynamic/modules/rnn.py` due to https://github.com/pytorch/pytorch/issues/43072
* `torch/nn/quantized/modules/conv.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43110

Reviewed By: gchanan

Differential Revision: D23963258

Pulled By: ezyang

fbshipit-source-id: 0fb0fd13af283f6f7b3434e7bbf62165357d1f98
2020-09-29 18:14:29 -07:00
Brian Hirsh
439930c81b adding a beta parameter to the smooth_l1 loss fn (#44433)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44433

Not entirely sure why, but changing the type of beta from `float` to `double` in autocast_mode.cpp and FunctionsManual.h fixes my compiler errors, failing instead at link time

fixing some type errors, updated fn signature in a few more files

removing my usage of Scalar, making beta a double everywhere instead
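
For reference, the role of `beta` can be sketched for a single element in plain Python. This follows the standard smooth L1 definition (quadratic below `beta`, linear above it) and is not the C++ kernel from this PR; the function name `smooth_l1` and the sample values are assumptions for the example.

```python
def smooth_l1(prediction, target, beta=1.0):
    """Elementwise smooth L1 loss for one scalar pair, without any reduction."""
    diff = abs(prediction - target)
    if diff < beta:
        # Quadratic region; unreachable when beta == 0, so no division by zero.
        return 0.5 * diff * diff / beta
    # Linear region, shifted so both pieces meet at diff == beta.
    return diff - 0.5 * beta


print(smooth_l1(0.5, 0.0))             # 0.125 (quadratic region)
print(smooth_l1(3.0, 0.0))             # 2.5   (linear region)
print(smooth_l1(0.5, 0.0, beta=0.25))  # 0.375 (a smaller beta makes 0.5 linear)
```

A smaller `beta` shrinks the quadratic region, and `beta=0` reduces the function to plain L1.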

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D23636720

Pulled By: bdhirsh

fbshipit-source-id: caea2a1f8dd72b3b5fd1d72dd886b2fcd690af6d
2020-09-25 16:36:28 -07:00
Kurt Mohler
d1c68a7069 Clarify that 5-D 'bilinear' grid_sample is actually trilinear (#45090)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/41528

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45090

Reviewed By: ailzhang

Differential Revision: D23841046

Pulled By: zou3519

fbshipit-source-id: 941770cd5b3e705608957739026e9113e5f0c616
2020-09-22 15:10:22 -07:00
Mike Ruberry
ef885c10d8 [pytorch] Add triplet margin loss with custom distance (#43680)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43680

As discussed [here](https://github.com/pytorch/pytorch/issues/43342),
adding in a Python-only implementation of the triplet-margin loss that takes a
custom distance function.  Still discussing whether this is necessary to add to
PyTorch Core.

Test Plan:
python test/run_tests.py

Imported from OSS

Reviewed By: albanD

Differential Revision: D23363898

fbshipit-source-id: 1cafc05abecdbe7812b41deaa1e50ea11239d0cb
2020-09-22 11:35:52 -07:00
Xiang Gao
e48201c5cf Mention TF32 on related docs (#44690)
Summary:
cc: ptrblck

![image](https://user-images.githubusercontent.com/1032377/93168022-cbbfcb80-f6d6-11ea-8f6e-f2c8a15c5bea.png)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44690

Reviewed By: ngimel

Differential Revision: D23727921

Pulled By: mruberry

fbshipit-source-id: db7cc8e74cde09c13d6a57683129fd839863b914
2020-09-16 19:18:30 -07:00
Xiang Gao
20ac736200 Remove py2 compatible future imports (#44735)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44735

Reviewed By: mruberry

Differential Revision: D23731306

Pulled By: ezyang

fbshipit-source-id: 0ba009a99e475ddbe22981be8ac636f8a1c8b02f
2020-09-16 12:55:57 -07:00
Gregory Chanan
5579b53a7f Fix SmoothL1Loss when target.requires_grad is True. (#44486)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44486

SmoothL1Loss had a completely different (and incorrect, see #43228) path when target.requires_grad was True.

This PR does the following:

1) adds derivative support for target via the normal derivatives.yaml route
2) kills the different (and incorrect) path for when target.requires_grad was True
3) modifies the SmoothL1Loss CriterionTests to verify that the target derivative is checked.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D23630699

Pulled By: gchanan

fbshipit-source-id: 0f94d1a928002122d6b6875182867618e713a917
2020-09-11 12:13:36 -07:00
David Reiss
7d78a6fcdd Update interpolate to use new upsample overloads (#43025)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43025

- Use new overloads that better reflect the arguments to interpolate.
- More uniform interface for upsample ops allows simplifying the Python code.
- Also reorder overloads in native_functions.yaml to give them priority.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/37177

ghstack-source-id: 106938111

Test Plan:
test_nn has pretty good coverage.

Relying on CI for ONNX, etc.

Didn't test FC because this change is *not* forward compatible.

To ensure backwards compatibility, I ran this code before this change

```python
import torch


def test_func(arg):
    interp = torch.nn.functional.interpolate
    with_size = interp(arg, size=(16,16))
    with_scale = interp(arg, scale_factor=[2.1, 2.2], recompute_scale_factor=False)
    with_compute = interp(arg, scale_factor=[2.1, 2.2])
    return (with_size, with_scale, with_compute)

traced_func = torch.jit.trace(test_func, torch.randn(1,1,1,1))

sample = torch.randn(1, 3, 7, 7)
output = traced_func(sample)

assert not torch.allclose(output[1], output[2])

torch.jit.save(traced_func, "model.pt")
torch.save((sample, output), "data.pt")
```

then this code after this change

```python
import torch

model = torch.jit.load("model.pt")
sample, golden = torch.load("data.pt")
result = model(sample)
for r, g in zip(result, golden):
    assert torch.allclose(r, g)
```

Reviewed By: AshkanAliabadi

Differential Revision: D21209991

fbshipit-source-id: 5b2ebb7c3ed76947361fe532d1dbdd6faa3544c8
2020-09-11 09:59:14 -07:00
Gregory Chanan
3de2c0b42f Fix L1Loss when target.requires_grad is True. (#44471)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44471

L1Loss had a completely different (and incorrect, see #43228) path when target.requires_grad was True.

This PR does the following:

1) adds derivative support for target via the normal derivatives.yaml route
2) kills the different (and incorrect) path for when target.requires_grad was True
3) modifies the L1Loss CriterionTests to verify that the target derivative is checked.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D23626008

Pulled By: gchanan

fbshipit-source-id: 2828be16b56b8dabe114962223d71b0e9a85f0f5
2020-09-11 09:51:16 -07:00
Gregory Chanan
d07d25a8c5 Fix MSELoss when target.requires_grad is True. (#44437)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44437

MSELoss had a completely different (and incorrect, see https://github.com/pytorch/pytorch/issues/43228) path when target.requires_grad was True.

This PR does the following:
1) adds derivative support for target via the normal derivatives.yaml route
2) kills the different (and incorrect) path for when target.requires_grad was True
3) modifies the MSELoss CriterionTests to verify that the target derivative is checked.
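
To make concrete why the target has a well-defined derivative, here is a scalar sketch that checks it numerically. It is a plain-Python illustration using central differences, not the derivatives.yaml entry; `mse` and `central_difference` are hypothetical helper names.

```python
def mse(prediction, target):
    """Squared error for one scalar pair, without any reduction."""
    return (prediction - target) ** 2


def central_difference(fn, x, eps=1e-6):
    """Numerical derivative of fn at x."""
    return (fn(x + eps) - fn(x - eps)) / (2 * eps)


prediction, target = 1.5, 0.25
grad_prediction = central_difference(lambda p: mse(p, target), prediction)
grad_target = central_difference(lambda t: mse(prediction, t), target)

# Analytically: d/dprediction = 2 * (prediction - target) = 2.5, and
# d/dtarget is its negation, -2.5.
print(round(grad_prediction, 4), round(grad_target, 4))
```

The two gradients differ only in sign, so a target that requires grad needs no separate code path.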

TODO:
1) do we still need check_criterion_jacobian when we run grad/gradgrad checks?
2) ensure the Module tests check when target.requires_grad
3) do we actually test when reduction='none' and reduction='mean'?

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D23612166

Pulled By: gchanan

fbshipit-source-id: 4f74d38d8a81063c74e002e07fbb7837b2172a10
2020-09-11 08:51:28 -07:00
Chris Huynh
7b547f086f To fix extra memory allocation when using circular padding (#39273)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/39256

Pull Request resolved: https://github.com/pytorch/pytorch/pull/39273

Reviewed By: anjali411

Differential Revision: D23471811

Pulled By: mruberry

fbshipit-source-id: fb324b51baea765311715cdf14642b334f335733
2020-09-10 00:15:31 -07:00
Nikita Shulga
442684cb25 Enable typechecks for torch.nn.modules.[activation|upsampling] (#44093)
Summary:
Add missing `hardsigmoid`, `silu`, `hardswish` and `multi_head_attention_forward` to functional.pyi.in.
Embed some typing annotations into functional.py.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44093

Reviewed By: ezyang

Differential Revision: D23494384

Pulled By: malfet

fbshipit-source-id: 27023c16ff5951ceaebb78799c4629efa25f7c5c
2020-09-03 13:20:04 -07:00
Vincent QB
fab012aa28 Revert "Added support for Huber Loss (#37599)" (#43351)
Summary:
This reverts commit 11e5174926 due to [comment](https://github.com/pytorch/pytorch/pull/37599#pullrequestreview-471950192).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43351

Reviewed By: pbelevich, seemethere

Differential Revision: D23249511

Pulled By: vincentqb

fbshipit-source-id: 18b8b346f00eaf0ef7376b06579d404a84add4de
2020-09-01 06:34:26 -07:00
Gregory Chanan
42c895de4d Properly check that reduction strings are valid for l1_loss, smoothl1_loss, and mse_loss. (#43527)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43527

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D23306786

Pulled By: gchanan

fbshipit-source-id: f3b7c9c02ae02813da116cb6b247a95727c47587
2020-08-31 09:53:56 -07:00
Gregory Chanan
1dcc4fb6b7 Kill unused _pointwise_loss function. (#43523)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43523

The code is also wrong, see https://github.com/pytorch/pytorch/issues/43228.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D23305461

Pulled By: gchanan

fbshipit-source-id: 9fe516d87a4243d5ce3c29e8822417709a1d6346
2020-08-31 07:58:04 -07:00
David Reiss
31788ae151 Trim trailing whitespace
Test Plan: CI

Reviewed By: linbinyu

Differential Revision: D23108919

fbshipit-source-id: 913c982351a94080944f350641d7966c6c2cc508
2020-08-14 09:18:40 -07:00
Nikita Shulga
3cf2551f2f Fix torch.nn.functional.grid_sample crashes if grid has NaNs (#42703)
Summary:
In `clip_coordinates` replace `minimum(maximum(in))` composition with `clamp_max(clamp_min(in))`
Swap order of `clamp_min` operands to clamp NaNs in grid to 0
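
Why operand order matters for NaNs can be illustrated with Python's built-in `min` and `max`, which are comparison-based and keep the first operand whenever a comparison with NaN evaluates to false. This is only an analogy for the ordering issue, not the ATen code; `clip_naive` and `clip_nan_safe` are hypothetical names.

```python
import math


def clip_naive(x, size):
    # NaN comes first: every comparison against it is False, so it is kept.
    return min(max(x, 0.0), size - 1.0)


def clip_nan_safe(x, size):
    # The bound comes first: a NaN never compares greater, so 0.0 wins.
    return min(max(0.0, x), size - 1.0)


nan = float("nan")
print(math.isnan(clip_naive(nan, 8)))  # True: the NaN leaks through
print(clip_nan_safe(nan, 8))           # 0.0
print(clip_nan_safe(9.5, 8))           # 7.0
print(clip_nan_safe(-2.0, 8))          # 0.0
```

Ordinary coordinates clip identically in both versions; only the NaN case differs.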

Fixes https://github.com/pytorch/pytorch/issues/42616

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42703

Reviewed By: ezyang

Differential Revision: D22987447

Pulled By: malfet

fbshipit-source-id: a8a2d6de8043d6b77c8707326c5412d0250efae6
2020-08-10 16:20:09 -07:00
Hameer Abbasi
3d46e02ea1 Add __torch_function__ for methods (#37091)
Summary:
According to pytorch/rfcs#3

From the goals in the RFC:

1. Support subclassing `torch.Tensor` in Python (done here)
2. Preserve `torch.Tensor` subclasses when calling `torch` functions on them (done here)
3. Use the PyTorch API with `torch.Tensor`-like objects that are _not_ `torch.Tensor`
   subclasses (done in https://github.com/pytorch/pytorch/issues/30730)
4. Preserve `torch.Tensor` subclasses when calling `torch.Tensor` methods. (done here)
5. Propagating subclass instances correctly also with operators, using
   views/slices/indexing/etc. (done here)
6. Preserve subclass attributes when using methods or views/slices/indexing. (done here)
7. A way to insert code that operates on both functions and methods uniformly
   (so we can write a single function that overrides all operators). (done here)
8. The ability to give external libraries a way to also define
   functions/methods that follow the `__torch_function__` protocol. (will be addressed in a separate PR)

This PR makes the following changes:

1. Adds the `self` argument to the arg parser.
2. Dispatches on `self` as well if `self` is not `nullptr`.
3. Adds a `torch._C.DisableTorchFunction` context manager to disable `__torch_function__`.
4. Adds `torch::torch_function_enabled()` and `torch._C._torch_function_enabled()` to check the state of `__torch_function__`.
5. Dispatches all `torch._C.TensorBase` and `torch.Tensor` methods via `__torch_function__`.

TODO:

- [x] Sequence Methods
- [x] Docs
- [x] Tests

Closes https://github.com/pytorch/pytorch/issues/28361

Benchmarks in https://github.com/pytorch/pytorch/pull/37091#issuecomment-633657778

Pull Request resolved: https://github.com/pytorch/pytorch/pull/37091

Reviewed By: ngimel

Differential Revision: D22765678

Pulled By: ezyang

fbshipit-source-id: 53f8aa17ddb8b1108c0997f6a7aa13cb5be73de0
2020-08-05 20:44:13 -07:00
Yanan Cao
bdcf320bed Support custom exception message (#41907)
Summary:
`raise` and `assert` used to have a hard-coded error message, "Exception"; the user-provided error message was ignored. This PR adds support for representing the user's error message in TorchScript.

This breaks backward compatibility because now we actually need to script the user's error message, which can potentially contain unscriptable expressions. Such programs can break when scripting, but saved models can still continue to work.

Increased an op count in test_mobile_optimizer.py because now we need aten::format to form the actual exception message.

This builds on a WIP PR, https://github.com/pytorch/pytorch/pull/34112, by driazati.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41907

Reviewed By: ngimel

Differential Revision: D22778301

Pulled By: gmagogsfm

fbshipit-source-id: 2b94f0db4ae9fe70c4cd03f4048e519ea96323ad
2020-08-01 13:03:45 -07:00
alexandrosstergiou
11e5174926 Added support for Huber Loss (#37599)
Summary:
Current losses in PyTorch include only a partial implementation of Huber loss through `smooth l1`, based on Fast RCNN, which essentially uses a delta value of 1. Renaming [`_smooth_l1_loss()`](3e1859959a/torch/nn/functional.py (L2487)) and refactoring it to take a delta makes it possible to use the actual Huber function.

In addition, I have added functional and criterion versions for anyone who wants to set the delta explicitly, based on the functional `smooth_l1_loss()` and the criterion `Smooth_L1_Loss()`.
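
For reference, the relationship between the two losses can be sketched on a single scalar error in plain Python. This is a minimal illustration of the standard Huber definition, not the code added by this PR; the function names and signatures are assumptions.

```python
def huber_loss(error: float, delta: float = 1.0) -> float:
    """Standard Huber loss for one residual: quadratic near zero, linear beyond delta."""
    abs_error = abs(error)
    if abs_error <= delta:
        return 0.5 * error * error
    return delta * (abs_error - 0.5 * delta)


def smooth_l1_loss(error: float) -> float:
    """Smooth L1 as described above: the Huber loss with delta fixed to 1."""
    return huber_loss(error, delta=1.0)


small = huber_loss(0.5)              # quadratic branch
large = huber_loss(3.0)              # linear branch with delta = 1
large_delta2 = huber_loss(3.0, 2.0)  # linear branch with delta = 2
```

Exposing `delta` lets the caller move the point at which the loss switches from quadratic to linear, which the fixed-delta smooth L1 form cannot do.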

Pull Request resolved: https://github.com/pytorch/pytorch/pull/37599

Differential Revision: D21559311

Pulled By: vincentqb

fbshipit-source-id: 34b2a5a237462e119920d6f55ba5ab9b8e086a8c
2020-07-27 10:42:30 -07:00
Oren Amsalem
b6690eb29a Might be good for newcomers to read what N means (#41851)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41851

Reviewed By: izdeby

Differential Revision: D22703602

Pulled By: mrshenli

fbshipit-source-id: 44905f43cdf53b38e383347e5002a28c9363a446
2020-07-23 16:10:38 -07:00
wudenggang
9600ed9af3 typo fixes (#41632)
Summary:
typo fixes

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41632

Reviewed By: ezyang

Differential Revision: D22617827

Pulled By: mrshenli

fbshipit-source-id: c2bfcb7cc36913a8dd32f13fc9adc3aa0a9b682f
2020-07-20 07:23:00 -07:00
Nathan Goldbaum
1e230a5c52 rewrite C++ __torch_function__ handling to work with TensorList operands (#41575)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41575

Fixes https://github.com/pytorch/pytorch/issues/34294

This updates the C++ argument parser to correctly handle `TensorList` operands. I've also included a number of updates to the testing infrastructure, because we're now doing a much more careful job of testing the signatures of aten kernels, using the type information about the arguments as read in from `Declarations.yaml`. The changes to the tests are required because we're now only checking for `__torch_function__` attributes on `Tensor`, `Optional[Tensor]` and elements of `TensorList` operands, whereas before we were checking for `__torch_function__` on all operands. The relatively simplistic approach the tests were using before -- assuming all positional arguments might be tensors -- therefore doesn't work anymore. I now think that checking for `__torch_function__` on all operands was a mistake in the original design.

The updates to the signatures of the `lambda` functions are to handle this new, more stringent checking of signatures.

I also added override support for `torch.nn.functional.threshold` and `torch.nn.functional.layer_norm`, which did not yet have python-level support.

Benchmarks are still WIP.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/34725

Reviewed By: mruberry

Differential Revision: D22357738

Pulled By: ezyang

fbshipit-source-id: 0e7f4a58517867b2e3f193a0a8390e2ed294e1f3
2020-07-17 08:54:29 -07:00
Kurt Mohler
0b73ea0ea2 Change BCELoss size mismatch warning into an error (#41426)
Summary:
BCELoss currently uses different broadcasting semantics than numpy. Since previous versions of PyTorch have thrown a warning in these cases telling the user that input sizes should match, and since the CUDA and CPU results differ when sizes do not match, it makes sense to upgrade the size mismatch warning to an error.

We can consider supporting numpy broadcasting semantics in BCELoss in the future if needed.
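
The stricter behavior can be sketched in plain Python on flat lists. This is only an illustration of "raise instead of warn on a size mismatch"; `binary_cross_entropy` here is a hypothetical helper, and its error message is not the one PyTorch emits.

```python
import math


def binary_cross_entropy(inputs, targets):
    """Mean binary cross-entropy over two flat, non-empty lists.

    `inputs` are probabilities strictly between 0 and 1; `targets` are labels in [0, 1].
    """
    if len(inputs) != len(targets):
        # Hypothetical message: a mismatch is an error, with no silent broadcasting.
        raise ValueError(
            f"target size ({len(targets)}) must match input size ({len(inputs)})"
        )
    total = 0.0
    for p, t in zip(inputs, targets):
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(inputs)


loss = binary_cross_entropy([0.5, 0.5], [1.0, 0.0])

try:
    binary_cross_entropy([0.5, 0.5], [1.0])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```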

Closes https://github.com/pytorch/pytorch/issues/40023

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41426

Reviewed By: zou3519

Differential Revision: D22540841

Pulled By: ezyang

fbshipit-source-id: 6c6d94c78fa0ae30ebe385d05a9e3501a42b3652
2020-07-14 20:34:06 -07:00
Heitor Schueroff de Souza
75a4862f63 Added SiLU activation function (#41034)
Summary:
Implemented the SiLU activation function as discussed in https://github.com/pytorch/pytorch/issues/3169.
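
For readers unfamiliar with it, SiLU is the input multiplied by its own sigmoid, `silu(x) = x * sigmoid(x)`. A scalar sketch in plain Python follows; the helper names are illustrative and this is not the kernel added by the PR.

```python
import math


def sigmoid(x: float) -> float:
    # Numerically stable form: never calls exp() on a large positive number.
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def silu(x: float) -> float:
    """SiLU activation for one scalar: x * sigmoid(x)."""
    return x * sigmoid(x)


at_zero = silu(0.0)
at_one = silu(1.0)
far_negative = silu(-1000.0)  # sigmoid underflows to 0, so the product is 0
```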

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41034

Reviewed By: glaringlee

Differential Revision: D22465203

Pulled By: heitorschueroff

fbshipit-source-id: b27d064529fc99600c586ad49b594b52b718b0d2
2020-07-10 07:37:30 -07:00
Negin Raoof
f69d6a7ea3 [ONNX] Update Default Value of recompute_scale_factor in Interpolate (#39453)
Summary:
This is a duplicate of https://github.com/pytorch/pytorch/pull/38362

"This PR completes Interpolate's deprecation process for recomputing the scales values, by updating the default value of the parameter recompute_scale_factor as planned for pytorch 1.6.0.
The warning message is also updated accordingly."

I'm recreating this PR as the previous one is not being updated.

cc gchanan

Pull Request resolved: https://github.com/pytorch/pytorch/pull/39453

Reviewed By: hl475

Differential Revision: D21955284

Pulled By: houseroad

fbshipit-source-id: 911585d39273a9f8de30d47e88f57562216968d8
2020-07-09 11:32:49 -07:00
David Reiss
4dad829ea3 In interpolate, inline the call to _interp_output_size (#37173)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37173

This function is only called in one place, so inline it.  This eliminates
boilerplate related to overloads and allows for further simplification
of shared logic in later diffs.

All shared local variables have the same names (from closed_over_args),
and no local variables accidentally collide.
ghstack-source-id: 106938108

Test Plan: Existing tests for interpolate.

Differential Revision: D21209995

fbshipit-source-id: acfadf31936296b2aac0833f704764669194b06f
2020-07-07 13:52:18 -07:00
David Reiss
3c1c74c366 In interpolate, move exceptional cases to the bottom (#37172)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37172

This improves readability by keeping cases with similar behavior close
together.  It should also have a very tiny positive impact on perf.
ghstack-source-id: 106938109

Test Plan: Existing tests for interpolate.

Differential Revision: D21209996

fbshipit-source-id: c813e56aa6ba7370b89a2784fcb62cc146005258
2020-07-07 13:52:16 -07:00
David Reiss
8f0e254790 In interpolate, use if instead of elif (#37171)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37171

Every one of these branches returns or raises, so there's no need for elif.
This makes it a little easier to reorder and move conditions.
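
The equivalence is easy to see on a small stand-in. The two functions below are hypothetical and unrelated to the real `interpolate` body; they only show that when every branch returns or raises, `if` and `elif` chains behave identically.

```python
def describe_with_elif(mode: str) -> str:
    if mode == "nearest":
        return "copy the closest input value"
    elif mode == "linear":
        return "blend the two closest input values"
    else:
        raise NotImplementedError(f"unsupported mode: {mode}")


def describe_with_if(mode: str) -> str:
    # Each branch exits the function, so later checks run only if earlier ones failed.
    if mode == "nearest":
        return "copy the closest input value"
    if mode == "linear":
        return "blend the two closest input values"
    raise NotImplementedError(f"unsupported mode: {mode}")


same_results = all(
    describe_with_elif(m) == describe_with_if(m) for m in ("nearest", "linear")
)
```

With independent `if` blocks, a branch can be moved or reordered without touching the `elif`/`else` keywords of its neighbors.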
ghstack-source-id: 106938110

Test Plan: Existing test for interpolate.

Differential Revision: D21209992

fbshipit-source-id: 5c517e61ced91464b713f7ccf53349b05e27461c
2020-07-07 13:49:53 -07:00
Nayef Ahmed
71af538e31 Updated assert to remove check on 3rd dim for MHA (#39402)
Summary:
## Description
* Updated assert statement to remove check on 3rd dimension (features) for keys and values in MultiheadAttention / Transform
* The feature dimension for keys and values can now be of different sizes
* Refer to https://github.com/pytorch/pytorch/issues/27623
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39402

Reviewed By: zhangguanheng66

Differential Revision: D21841678

Pulled By: Nayef211

fbshipit-source-id: f0c9e5e0f33259ae2abb6bf9e7fb14e3aa9008eb
2020-06-02 13:35:39 -07:00
Vasiliy Kuznetsov
e5ada042b1 QAT ConvBN: remove explicit folding and use BN instead (#38478)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38478

Before this PR, the QAT ConvBN module inlined the batch normalization code
in order to reproduce Conv+BN folding.

This PR updates the module to use BN directly.  This is mathematically
equivalent to previous behavior as long as we properly scale
and fake quant the conv weights, but allows us to reuse the BN code
instead of reimplementing it.

In particular, this should help with speed since we can use dedicated
BN kernels, and also with DDP since we can hook up SyncBatchNorm.
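
The folding identity behind the equivalence can be checked on scalars in plain Python. This sketch assumes fixed batch-norm statistics and ignores fake quantization entirely; all names are illustrative and none of this is the module's code.

```python
import math


def conv_then_bn(x, w, b, gamma, beta, mean, var, eps=1e-5):
    """Apply a 1x1 'convolution' (w * x + b) followed by batch norm with fixed stats."""
    y = w * x + b
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold the batch-norm affine transform into the conv weight and bias."""
    scale = gamma / math.sqrt(var + eps)
    return w * scale, (b - mean) * scale + beta


# Arbitrary illustrative values.
x, w, b = 1.7, 0.4, -0.2
gamma, beta, mean, var = 1.3, 0.1, 0.5, 2.0

w_folded, b_folded = fold_bn(w, b, gamma, beta, mean, var)
unfolded_out = conv_then_bn(x, w, b, gamma, beta, mean, var)
folded_out = w_folded * x + b_folded
```

Both paths give the same output up to floating-point rounding, which is why running BN as a separate step on appropriately scaled conv weights can stand in for explicit folding.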

Test Plan:
```
python test/test_quantization.py TestQATModule
```

Imported from OSS

Differential Revision: D21603230

fbshipit-source-id: ecf8afdd833b67c2fbd21a8fd14366079fa55e64
2020-05-19 08:58:42 -07:00
Bharat123rox
8752d6a736 DOC: Correct upsample doc to match interpolation (#38455)
Summary:
Fix https://github.com/pytorch/pytorch/issues/38334 and correct the docs of `torch.nn.functional.upsample`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38455

Differential Revision: D21583515

Pulled By: driazati

fbshipit-source-id: 6ac5a79ba489bdcdd3fab34e4eddb4864e20a29e
2020-05-15 17:09:26 -07:00
Donna Choi
4c99a9b672 Add documentation for hardswish (#37989)
Summary:
Fix issue https://github.com/pytorch/pytorch/issues/37431.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37989

Differential Revision: D21502182

Pulled By: zou3519

fbshipit-source-id: 245586fb555f7f1d9ec8d87269035b6fe626b47b
2020-05-12 06:48:51 -07:00
Donna Choi
ca2206d071 Add documentation for FeatureAlphaDropout (#36295)
Summary:
These changes add documentation for FeatureAlphaDropout, based on a need raised in an issue by SsnL (Issue https://github.com/pytorch/pytorch/issues/9886).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36295

Differential Revision: D21478591

Pulled By: zou3519

fbshipit-source-id: a73c40bf1c7e3b1f301dc3347cef7b32e9842320
2020-05-08 15:09:01 -07:00
Richard Zou
172bcdb8c8 Add documentation for nn.Hardsigmoid and nn.functional.hardsigmoid. (#38120)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38120

Test Plan: build docs locally and attach a screenshot to this PR.

Differential Revision: D21477815

Pulled By: zou3519

fbshipit-source-id: 420bbcfcbd191d1a8e33cdf4a90c95bf00a5d226
2020-05-08 13:56:45 -07:00