Commit Graph

267 Commits

Author SHA1 Message Date
Pavel Belevich
5cc75e46fa Split test_c10d.py to test_c10d_common.py, test_c10d_gloo.py, test_c10d_nccl.py (#56598)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56598

Test Plan: NA

Reviewed By: SciPioneer

Differential Revision: D27913170

fbshipit-source-id: 3439d18141131b02d55f2ca399a4c795cba2b04b
2021-04-21 22:10:41 -07:00
Joel Schlosser
12b2bc94d7 Revert D27909732: [pytorch][PR] Support factory kwargs in torch.nn modules
Test Plan: revert-hammer

Differential Revision:
D27909732 (5a09def9b0)

Original commit changeset: d8684b2403ab

fbshipit-source-id: d00d69fae4fa4ed58d9e97e70b27a06a0dcb39e4
2021-04-21 13:44:03 -07:00
Joel Schlosser
5a09def9b0 Support factory kwargs in torch.nn modules (#54508)
Summary:
Continuation of https://github.com/pytorch/pytorch/pull/53144

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54508

Reviewed By: malfet

Differential Revision: D27909732

Pulled By: jbschlosser

fbshipit-source-id: d8684b2403ab7eb336371d118799146a2520bd76
2021-04-21 13:20:11 -07:00
Aliaksandr Ivanou
c5c5230890 Pytorch resolve bug around incorrect rdzv handler resolution (#56386)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56386

The diff resolves bug around incorrect handler resolution:
_create_static_handler pointed towards etcd, and _create_etcd_handler pointed towards static.

Test Plan:
buck test mode/dev-nosan //caffe2/test/distributed:test_launcher

Added test_launcher to the ci/cd tests

Reviewed By: cbalioglu

Differential Revision: D27858897

fbshipit-source-id: 440155789958c091ce5755e7c9524e4bb704203a
2021-04-19 23:50:28 -07:00
Natalia Gimelshein
92d24e3060 Revert D27855386: [pytorch][PR] Support factory kwargs in torch.nn modules
Test Plan: revert-hammer

Differential Revision:
D27855386 (40483acc51)

Original commit changeset: dabd505d2a04

fbshipit-source-id: f5bf3120d87861b30a8e1bf11977ad7d27cd8500
2021-04-19 20:07:20 -07:00
Joel Schlosser
40483acc51 Support factory kwargs in torch.nn modules (#54508)
Summary:
Continuation of https://github.com/pytorch/pytorch/pull/53144

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54508

Reviewed By: bdhirsh

Differential Revision: D27855386

Pulled By: jbschlosser

fbshipit-source-id: dabd505d2a04208e74b158570fb2859c736eea2c
2021-04-19 12:24:58 -07:00
Sam Estep
d05e7c163f Revert D27600457: [pytorch][PR] Support factory kwargs in torch.nn modules
Test Plan: revert-hammer

Differential Revision:
D27600457 (1077f87269)

Original commit changeset: b58bfee61c39

fbshipit-source-id: 19d5bfc5133a3880383731d0332503ca1f3bce0c
2021-04-19 07:47:24 -07:00
Joel Schlosser
1077f87269 Support factory kwargs in torch.nn modules (#54508)
Summary:
Continuation of https://github.com/pytorch/pytorch/pull/53144

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54508

Reviewed By: mrshenli

Differential Revision: D27600457

Pulled By: jbschlosser

fbshipit-source-id: b58bfee61c3917524b4622f63ef216c27a588eb1
2021-04-19 06:58:40 -07:00
Sam Estep
1e9c7ad4cb Add a test to measure import torch time (#56041)
Summary:
This PR adds a couple very simple tests which (as the code comment says) measure the time it takes to `import torch` and ask for the CUDA device count.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56041

Test Plan:
```
$ rm -r /tmp/reports ; python3 test/test_import_time.py --save-xml=/tmp/reports

Running tests...
----------------------------------------------------------------------
..
----------------------------------------------------------------------
Ran 2 tests in 1.855s

OK

Generating XML reports...
```
```
$ tools/print_test_stats.py /tmp/reports
No scribe access token provided, skip sending report!
class TestImportTime:
    tests: 2 failed: 0 skipped: 0 errored: 0
    run_time: 1.85 seconds
    avg_time: 0.93 seconds
    median_time: 0.93 seconds
    2 longest tests:
        test_time_cuda_device_count time: 1.10 seconds
        test_time_import_torch time: 0.75 seconds

Total runtime is 0:00:01
2 longest tests of entire run:
    TestImportTime.test_time_cuda_device_count  time: 1.10 seconds
    TestImportTime.test_time_import_torch  time: 0.75 seconds
```

Reviewed By: driazati

Differential Revision: D27770908

Pulled By: samestep

fbshipit-source-id: 01bbf5a339f41d3a1f493e6fa8c946ff7567daec
2021-04-15 00:53:30 -07:00
Edward Yang
bc86358cf5 Make run_test.py work even if s3_stat_parser fails to import (#56039)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56039

Python will try to eagerly resolve the name references even if
the import failed.  Quote them so that it doesn't.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: janeyx99

Differential Revision: D27770536

Pulled By: ezyang

fbshipit-source-id: b111739289498f9bab856fb9424f3080efee4ee0
2021-04-14 13:21:50 -07:00
Luca Wehrstedt
3f8d476857 Split out CUDA RPC tests (#55695)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55695

In order to be able to run CUDA tests on their own (e.g., to avoid running CPU tests on GPU machines).

Done by moving test methods to a separate class (and sometimes introducing a "common" base class for utils), and then providing new entry points inside a `cuda/` subdirectory.

Test Plan: Checked they are run on Sandcastle.

Reviewed By: mrshenli

Differential Revision: D27618198

fbshipit-source-id: 8f671657f79c8ae115748ab7752fe0066705893b
2021-04-12 07:48:08 -07:00
Rong Rong (AI Infra)
55db156229 remove test_jit_py3.py entirely (#55560)
Summary:
1. move module related stuff to test_module_container
2. created test_types for types and annotation
3. created test_misc for the rest

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55560

Reviewed By: VitalyFedyunin

Differential Revision: D27650911

Pulled By: walterddr

fbshipit-source-id: d895a7da9e9c3d25a662a37faf4daabc276b9c1a
2021-04-08 14:28:54 -07:00
Erjia Guan
f9a0bbbeb8 [DataPipe] Remove duplicate dataset (#54553)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54553

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D27279301

Pulled By: ejguan

fbshipit-source-id: 112a83e7061e3f35dc517eb623bd9ca93c2f034c
2021-04-07 10:11:22 -07:00
Jane Xu
bf37bf7da4 Make JSON files more human readable (#55335)
Summary:
Prettifies JSON files .pytorch-test-times and .pytorch-slow-tests so that not everything is on one single line.

This is of slightly more importance as generated  .pytorch-slow-tests ends up getting stored in our test-infra repo ([example](ad9cd87565)), and it is nice to not have that lil red symbol at the end.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55335

Reviewed By: samestep

Differential Revision: D27576930

Pulled By: janeyx99

fbshipit-source-id: be58565b8c8593a9bfcfab383ee19facc79f0572
2021-04-05 17:23:36 -07:00
Jane Xu
717e70a824 (BE) Refactor get-test-times-from-S3 into s3_stat_parser (#54808)
Summary:
Moves more s3 parsing code to s3_stat_parser.py. This is another step in modularizing the parsing code more correctly. I will also be using this exact function in future slowTest code.

Also replaces some Any's in the code to be Report.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54808

Test Plan:
.pytorch-test-times generated before the code and after this code is the same.
CI should pass, specifically the test tools GHA.

Reviewed By: walterddr

Differential Revision: D27375783

Pulled By: janeyx99

fbshipit-source-id: bec28551668b2eb3fdd60d802200993e493eac83
2021-03-29 08:45:22 -07:00
Rong Rong (AI Infra)
d4045e9aa1 initial commit to refactor all s3 access codes to s3_stats_parser (#54681)
Summary:
First step to move all S3 related operations into S3 parser utils.
in the end we provide APIs from s3_stats_parser:
1. downloading data as reports and uploading data as reports
2. filter by job name

and handle all compression, formatting inside.

TODO
- [ ] Refactor out upload into s3_stats_parser
- [ ] Remove all S3/BOTO related checkers and try/catch blocks outside of s3_stats_parser

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54681

Test Plan:
1. Running tools/test/* covers the refactoring logic (test_test_history.py and test_stats.py as entrypoint and both using the 2 new APIs in s3_stats_parser after the refactoring.
2. print_test_stats.py's main argparse entrypoint is covered by CI step Report Test Result step.
3. run `python test/run_test.py --export-past-test-times` before and after this PR should result in the same file content in .pytorch-test-times

Reviewed By: ailzhang

Differential Revision: D27346742

Pulled By: walterddr

fbshipit-source-id: fb40162e631e007fed9d5821fe4f190bda2cb52e
2021-03-26 06:49:15 -07:00
Jane Xu
792f5ffb83 Also strip slow_test (#54528)
Summary:
Since `_test1`, `_test2` and `_build` and `test` are all stripped, `slow_test` should be stripped as well. This way, the _slow_test stats will be considered as a part of all stats relating to a particular build job, though currently, it doesn't do much because the jobs don't share a common stemmed name--the build has `_gcc7` while the slow_test CI job does not.

This makes me think...do we omit the `gcc7` intentionally? Are there other things I should strip, e.g., `multigpu_test`?

See:
ci/circleci: pytorch_linux_xenial_cuda10_2_cudnn7_py3_slow_test
ci/circleci: pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test1
ci/circleci: pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test2

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54528

Reviewed By: samestep

Differential Revision: D27270393

Pulled By: janeyx99

fbshipit-source-id: ffb7289cfe4dba52ded67f50a89f3e75e7bad68d
2021-03-23 14:44:21 -07:00
Jane Xu
635595f706 Change sharding in ci (#54228)
Summary:
Step three (landing this should fix https://github.com/pytorch/pytorch/issues/53882)!

Modifying CI to compute job times during build so that the exported job times can be used for sharding future test jobs.
The builds that are exempted from this:
- `bazel` (no python tests so no need)
- `libtorch` (no python stuff so no need)
- `onnx` (the test shards are not calculated the same way)
- `asan` (runs into error I don't know how to debug/we can debug later: [logs](https://app.circleci.com/pipelines/github/pytorch/pytorch/288019/workflows/57f95f67-1a1b-44a0-9b02-9652b57f2a5f/jobs/11693962)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54228

Test Plan: CI

Reviewed By: samestep

Differential Revision: D27192978

Pulled By: janeyx99

fbshipit-source-id: 3cb20d14f4989e61873043b81dfd6b0f82d17ccd
2021-03-22 08:40:34 -07:00
Jane Xu
0645e2b490 Use shard file if present, improve functions used for sharding (#54210)
Summary:
Step 2 to fixing https://github.com/pytorch/pytorch/issues/53882 :)

This changes TARGET_DET_LIST and sharding automation by checking if there's already cached data from the commit in `.pytorch-test-times`. If not, it pulls data from S3 and updates the file to have the stats. This way, S3 pulling does not need to happen more than once for the same commit.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54210

Test Plan:
the following methods should run the same set of tests.
First `export CIRCLE_JOB=pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test2` or your favorite CIRCLE JOB.

1. Pull data first and use it:
Download the data from S3 and write it to the cache file with `python test/run_test.py --export-historic-test-times .pytorch-test-times`
Now run `python test/run_test.py --shard 1 10`

2. Make the sharding job pull data:
Delete the file you just created: `rm .pytorch-test-times`
Now run `python test/run_test.py --shard 1 10`

Reviewed By: walterddr

Differential Revision: D27136849

Pulled By: janeyx99

fbshipit-source-id: 51a42c4e2fa3f8cf15e682679dd3eb6130aad927
2021-03-18 13:25:51 -07:00
Jane Xu
2e7311ef25 First step to refactoring S3 reading logic (#53755)
Summary:
This is an initial attempt in refactoring and consolidating our S3 read logic for print_test_stats.py, test_history.py, and run_test.py. This way, boto3 and botocore do not need to be imported in various places throughout the code base, and duplicated logic (such as the many type definitions) can exist in one place: `tools/stat_utils/s3_stat_parser.py`. walterddr contributed to this PR by moving print_test_stats.py to the tools folder and the corresponding tests a subfolder within tools.

**NOTE: this removes those tests from CI as the new `tools/test/test_stats.py` is not in the test/ directory as the other tests in TESTS in run_test.py.**

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53755

Test Plan:
This refactoring change should not break anything, so running the files as before should work as they did previously.
To make sure that print_test_stats.py still functions: run `python tools/test/test_stats.py` and make sure all tests pass.
To make sure that test_history.py works, run the example commands from `tools/test_history.py --help` and check that their output matches that shown. Note that the script will continue printing for a while, so don't be alarmed.

Some next steps:
- Actually coming up with similarities among the three current use cases and further refactoring/consolidating of functions (e.g., combining simplify and get_cases)
- Moving more parsing logic to s3_stat_parser.py to have better abstraction between our files
- Adding tests for s3_stat_parser.py when there is more functionality in it

Reviewed By: agolynski, samestep

Differential Revision: D27030285

Pulled By: janeyx99

fbshipit-source-id: e664781324ef7c0c30943bfd7f17c895075ef7a7
2021-03-17 12:38:09 -07:00
Jane Xu
f30a7a2739 Add export-historic-test-times option to dump S3 test times into a JSON file (#54083)
Summary:
This will allow for future work to use the test times file (which will save computation time and also allow for more consistency). (Step one to fixing https://github.com/pytorch/pytorch/issues/53882)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54083

Test Plan:
export CIRCLE_JOB=your-favorite-circleci-job e.g., pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test2
`python test/run_test.py --export-historic-test-times` OR
`python test/run_test.py --export-historic-test-times .your-favorite-file`

When opening either .pytorch-test-times or .your-favorite-file, you should see something like:
```
{"commit": "2d559a09392aabb84dfb4a498010b2f01d99818c", "job_times": {"distributed/test_distributed_spawn": 583.5889999999973, "distributed/test_data_parallel": 4.866999999999997, "test_binary_ufuncs": 171.1569999999998, "test_numpy_interop": 2.5649999999999995, "test_public_bindings": 0.011,...}}
```

Note that no tests will be run when this option is specified.

Reviewed By: walterddr

Differential Revision: D27091351

Pulled By: janeyx99

fbshipit-source-id: e191d739268d86de0a0ba0eea0006969859d1940
2021-03-17 12:22:00 -07:00
Jane Xu
ee35060888 Fix sharding algo + test it (#53942)
Summary:
This PR:
1. moves sharding algorithm from run_test.py to framework_utils.py (let me know if you have a better place for it)
2. adds tests for the algorithm in test_testing.py
3. fixes the algorithm so that it doesn't tack on the unknown jobs all to the shard with the minimum time, but instead distributes them around the shards.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53942

Test Plan: python test/test_testing.py -k TestFrameworkUtils

Reviewed By: samestep

Differential Revision: D27047223

Pulled By: janeyx99

fbshipit-source-id: 824b20009c0bb707aa5361de445cdec795d5e3f1
2021-03-15 16:33:56 -07:00
Nikita Shulga
b00cdfe136 Fix run_test_module logic (#53884)
Summary:
First argument is either file name or test module name, but key to `CUSTOM_HANDLERS` is test module name.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53884

Test Plan: Run `python3 run_test.py -i distributed/test_distributed_spawn.py`

Reviewed By: janeyx99

Differential Revision: D27006164

Pulled By: malfet

fbshipit-source-id: f30b42856cd2754e5981c1c69618f84e392c986a
2021-03-12 09:53:58 -08:00
Aliaksandr Ivanou
ec484981c6 [3/n][torch/elastic][upstream] Move torchelastic/events to torch/distributed/events (#53760)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53760

Pull Request resolved: https://github.com/pytorch/elastic/pull/143

The diff upsteams torchelastic/events to the torch.

Test Plan:
buck test mode/dev-nosan //pytorch/elastic/torchelastic/agent/...
    buck test mode/dev-nosan //caffe2/test/distributed/elastic/events/fb/...

Reviewed By: kiukchung

Differential Revision: D26932830

fbshipit-source-id: 23fc10d2ead5af7f7ed510ae0d2581cc2421cf76
2021-03-11 11:25:24 -08:00
Guilherme Leobas
cb68039363 Port NumPy typing testing style to PyTorch (#52408)
Summary:
ref: https://github.com/pytorch/pytorch/issues/16574

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52408

Reviewed By: anjali411

Differential Revision: D26654687

Pulled By: malfet

fbshipit-source-id: 6feb603d8fb03c2ba2a01468bfde1a9901e889fd
2021-03-10 12:18:01 -08:00
Jane Xu
bcbe07200c Improve logic for S3 stats gathering. Uses automatic SLOW_TESTS. (#53549)
Summary:
This PR:
1. refactors the logic for S3 stats gathering.
2. Renames SLOW_TESTS to TARGET_DET_LIST to disambiguate and remove confusion with slowTest
2. detects slow tests (tests with time > 5min) to add to the TARGET_DET_LIST based on results in S3 from the previous nightly.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53549

Test Plan:
Set CIRCLE_JOB to your favorite CI job (like `pytorch_linux_bionic_py3_8_gcc9_coverage_test1`).
Run `python test/run_test.py --determine-from=<your fave pytorch files>`
e.g., `python test/run_test.py --determine-from=test/run_test.py`

Reviewed By: mrshenli

Differential Revision: D26904478

Pulled By: janeyx99

fbshipit-source-id: 9576b34f4fee09291d60e36ff2631753a3925094
2021-03-10 09:37:06 -08:00
Sam Estep
8c798e0622 Forbid trailing whitespace (#53406)
Summary:
Context: https://github.com/pytorch/pytorch/pull/53299#discussion_r587882857

These are the only hand-written parts of this diff:
- the addition to `.github/workflows/lint.yml`
- the file endings changed in these four files (to appease FB-internal land-blocking lints):
  - `GLOSSARY.md`
  - `aten/src/ATen/core/op_registration/README.md`
  - `scripts/README.md`
  - `torch/csrc/jit/codegen/fuser/README.md`

The rest was generated by running this command (on macOS):
```
git grep -I -l ' $' -- . ':(exclude)**/contrib/**' ':(exclude)third_party' | xargs gsed -i 's/ *$//'
```

I looked over the auto-generated changes and didn't see anything that looked problematic.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53406

Test Plan:
This run (after adding the lint but before removing existing trailing spaces) failed:
- https://github.com/pytorch/pytorch/runs/2043032377

This run (on the tip of this PR) succeeded:
- https://github.com/pytorch/pytorch/runs/2043296348

Reviewed By: walterddr, seemethere

Differential Revision: D26856620

Pulled By: samestep

fbshipit-source-id: 3f0de7f7c2e4b0f1c089eac9b5085a58dd7e0d97
2021-03-05 17:22:55 -08:00
Jane Xu
c0adabe172 automate sharding using S3 test time stats (#53269)
Summary:
Uses nightly commit stats to automatically shard tests based on execution time.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53269

Test Plan:
set CIRCLE_JOB to an existing job, like `pytorch_linux_bionic_py3_6_clang9_test`
Then you can run something like: `python test/run_test.py --shard 1 10`

Reviewed By: malfet

Differential Revision: D26819440

Pulled By: janeyx99

fbshipit-source-id: 6bc73d6aa3d52d9850817536be15d7b54a72780e
2021-03-05 13:40:24 -08:00
Yi Zhang
fd582af06c enable coverage test for dataloader on Windows (#52550)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50661
For coverage,
The class qualified name is `'SimpleCustomBatch': <class '__mp_main__.SimpleCustomBatch'>`

For pytest
The class qualified name is `'SimpleCustomBatch': <class 'test_dataloader.SimpleCustomBatch'>`

So move the class to one separate file

![image](https://user-images.githubusercontent.com/16190118/108611869-d6b51f80-741d-11eb-908e-be7a64da916d.png)

As malfet suggestion, use __import__ to avoid adding new file.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52550

Reviewed By: walterddr

Differential Revision: D26754023

Pulled By: malfet

fbshipit-source-id: 34b0fbe7336b9303cedc28ec6116ab752a2d3630
2021-03-02 18:40:47 -08:00
Meghan Lele
1d6bd15790 [JIT] Add torch._C._jit submodule (#52910)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52910

**Summary**
PR #52158 tried to move all JIT bindings from `torch._C` to a new
submodule `torch._C._jit`, but that...did not go well. This pull request
adds the new `torch._C._jit` submodule, but does not migrate the
existing bindings. Instead, it adds a unit test that fails if any new
bindings are added to `torch._C`. A comment in the test instructs
developers to add their new binding to the allowlist if it really should
be in `torch._C`, or to add it to the appropriate submodule (e.g
`torch._C._jit`, for example). The idea is to prevent the issue
described in #51691 from getting *worse* if it cannot be fixed.

**Test Plan**
Continuous integration.

**Fixes**
This commit fixes #51691.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D26698373

Pulled By: SplitInfinity

fbshipit-source-id: ec9f5426051227a513d4fd09512b624420e0100b
2021-02-26 16:05:05 -08:00
Kimish Patel
a6e94d274f [Pytorch] Add python binding to use mobile cpu allocator. (#52323)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52323

Using default cpu allocator for ops executed on qnnpack backend will result in
asan failures with heap overflow since qnnpack (and xnnpack) can access input
beyond their and/beginning.

Here we are enabling this feature specifically to enable dynamic sparse linear op test
using qnnpack engine. In dynamic linear op, the fp32 bias is not packed and
hence can result in out-of-bound access.

Test Plan: test_set_default_mobile_cpu_allocator.py

Reviewed By: z-a-f

Differential Revision: D26263481

fbshipit-source-id: a49227cac7e6781b0db4a156ca734d7671972d9f
2021-02-17 08:42:23 -08:00
Chester Liu
58eb23378f Clean up usage of torch._six partially (#49785)
Summary:
See https://github.com/pytorch/pytorch/issues/42919

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49785

Reviewed By: mruberry

Differential Revision: D25963833

Pulled By: bugra

fbshipit-source-id: 11c90d6b8d3f206c9d0a4d8621b773beb10c6ba2
2021-02-08 13:58:34 -08:00
mattip
9cbefad83f concantenate LICENSE files when building a wheel (#51634)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50695

I checked locally that the concatenated license file appears at `torch-<version>.dist-info/LICENSE` in the wheel.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51634

Reviewed By: zhangguanheng66

Differential Revision: D26225550

Pulled By: walterddr

fbshipit-source-id: 830c59fb7aea0eb50b99e295edddad9edab6ba3a
2021-02-08 08:28:46 -08:00
vfdev
b106250047 Introduced AliasInfo for OpInfo (#50368)
Summary:
Introduced AliasInfo for OpInfo.

Context: Split of https://github.com/pytorch/pytorch/issues/49158

cc mruberry , please let me know if you'd like to see here more code to cover

> [ ] fold test_op_aliases.py into OpInfo-based testing in test_ops.py

from https://github.com/pytorch/pytorch/issues/50006

and/or add `UnaryUfuncInfo('abs')` as discussed https://github.com/pytorch/pytorch/pull/49158/files#r548774221

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50368

Reviewed By: ngimel

Differential Revision: D26177261

Pulled By: mruberry

fbshipit-source-id: 2e3884a387e8d5365fe05945375f0a9d1b5f5d82
2021-02-02 00:10:09 -08:00
Radhakrishnan Venkataramani
3397919dcf Rowwise Prune op (Add the test to OSS run_test), Make the op private. (#46131)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46131

Refer to the title.

Test Plan: `buck test caffe2/test:pruning`

Reviewed By: raghuramank100

Differential Revision: D24230472

fbshipit-source-id: 8f0a83446c23fdf30d0313b8c3f5ff1a463b50c7
2021-01-29 06:08:18 -08:00
lixinyu
5ed0ad4b6a DataPipe naming convension update (#51262)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51262

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D26120628

Pulled By: glaringlee

fbshipit-source-id: 6855a0dd6d4a93ff93adce1039960ffd7057a827
2021-01-28 17:44:36 -08:00
Benjamin Lefaudeux
87fb3707d9 ZeroRedundancyOptimizer: an implementation of a standalone sharded optimizer wrapper (#46750)
Summary:
Implement the first stage of ZeRO, sharding of the optimizer state, as described in [this blog post](https://www.microsoft.com/en-us/research/blog/zero-2-deepspeed-shattering-barriers-of-deep-learning-speed-scale/) and [this paper](https://arxiv.org/abs/1910.02054). This implementation is completely independent from the [DeepSpeed](https://github.com/microsoft/DeepSpeed) framework, and aims at providing ZeRO-compliant building blocks within the PyTorch scheme of things.

This works by:
- acting as a wrapper to a pytorch optimizer. ZeROptimizer does not optimize anything by itself, it only shards optimizers for distributed jobs
- each rank distributes parameters according to a given partitioning scheme (could be updated), and owns the update of a given shard only
- the .step() is called on each rank as expected, the fact that the optimizer actually works on a shard of the model is not visible from the outside
- when the update is completed, each rank broadcasts the updated model shard to all the other ranks

This can be used with DDP, although some communications are wasted in that case (gradients are all-reduced to all ranks). This implementation was initially developed in [Fairscale](https://github.com/facebookresearch/fairscale), and can also be used with an optimized DDP which only reduces to the relevant ranks. More context on ZeRO and PyTorch can be found in [this RFC](https://github.com/pytorch/pytorch/issues/42849)

The API with respect to loading and saving the state is a known pain point and should probably be discussed an updated. Other possible follow ups include integrating more closely to a [modularized DDP](https://github.com/pytorch/pytorch/issues/37002), [making the checkpoints partition-agnostic](https://github.com/facebookresearch/fairscale/issues/164), [exposing a gradient clipping option](https://github.com/facebookresearch/fairscale/issues/98) and making sure that mixed precision states are properly handled.

original authors include msbaines, min-xu-ai and myself

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46750

Reviewed By: mruberry

Differential Revision: D25958918

Pulled By: blefaudeux

fbshipit-source-id: 14280f2fd90cf251eee8ef9ac0f1fa6025ae9c50
2021-01-20 14:36:16 -08:00
peter
a1b1d0cdc0 Better split of the windows test jobs (#50660)
Summary:
See discussion in https://github.com/pytorch/pytorch/pull/50320#discussion_r554447365.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50660

Reviewed By: xuzhao9, samestep

Differential Revision: D25959021

Pulled By: seemethere

fbshipit-source-id: 7623bddc09e7d55208b8a1af4b5a23fba2cdeb14
2021-01-19 15:07:33 -08:00
Mikhail Zolotukhin
e9dc8fc162 [TensorExpr] Add python bindings. (#49698)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49698

Reincarnation of #47620 by jamesr66a.

It's just an initial bunch of things that we're exposing to python, more
is expected to come in future. Some things can probably be done better,
but I'm putting this out anyway, since some other people were interested
in using and/or developing this.

Differential Revision: D25668694

Test Plan: Imported from OSS

Reviewed By: bertmaher

Pulled By: ZolotukhinM

fbshipit-source-id: fb0fd1b31e851ef9ab724686b9ac2d172fa4905a
2021-01-14 21:02:47 -08:00
Nikita Shulga
22bd277891 Run test_type_hints first (#49748)
Summary:
Since it sort of a liner check and fails frequently

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49748

Reviewed By: vkuzo

Differential Revision: D25682980

Pulled By: malfet

fbshipit-source-id: 7dba28242dced0277bad56dc887d3273c1e9e575
2021-01-04 09:33:13 -08:00
Samuel Marks
e6779d4357 [*.py] Rename "Arguments:" to "Args:" (#49736)
Summary:
I've written custom parsers and emitters for everything from docstrings to classes and functions. However, I recently came across an issue when I was parsing/generating from the TensorFlow codebase: inconsistent use of `Args:` and `Arguments:` in its docstrings.

```sh
(pytorch#c348fae)$ for name in 'Args:' 'Arguments:'; do
    printf '%-10s %04d\n' "$name" "$(rg -IFtpy --count-matches "$name" | paste -s -d+ -- | bc)"; done
Args:      1095
Arguments: 0336
```

It is easy enough to extend my parsers to support both variants, however it looks like `Arguments:` is wrong anyway, as per:

  - https://google.github.io/styleguide/pyguide.html#doc-function-args @ [`ddccc0f`](https://github.com/google/styleguide/blob/ddccc0f/pyguide.md)

  - https://chromium.googlesource.com/chromiumos/docs/+/master/styleguide/python.md#describing-arguments-in-docstrings @ [`9fc0fc0`](https://chromium.googlesource.com/chromiumos/docs/+/9fc0fc0/styleguide/python.md)

  - https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html @ [`c0ae8e3`](https://github.com/sphinx-contrib/napoleon/blob/c0ae8e3/docs/source/example_google.rst)

Therefore, only `Args:` is valid. This PR replaces them throughout the codebase.

PS: For related PRs, see tensorflow/tensorflow/pull/45420

PPS: The trackbacks automatically appearing below are sending the same changes to other repositories in the [PyTorch](https://github.com/pytorch) organisation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49736

Reviewed By: albanD

Differential Revision: D25710534

Pulled By: soumith

fbshipit-source-id: 61e8ff01abb433e9f78185c2d1d0cbd7c22c1619
2020-12-28 09:34:47 -08:00
Nikita Shulga
12942ea52b [BE] Introduce set_cwd context manager (#49657)
Summary:
Used to temporarily change working directory, but restore it even if exception is raised
Use it in test_type_hints and during code coverage collection

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49657

Reviewed By: walterddr

Differential Revision: D25660543

Pulled By: malfet

fbshipit-source-id: 77f08d57e4b60b95daa4068d0dacf7c25f978526
2020-12-21 12:08:48 -08:00
Erjia Guan
1b6fc1fd42 [WIP][DataLoader] CollateIterableDataset prototype (#48933)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48933

Prototype for CollateIterableDataset.
Move `collate_batch_fn` to BatchIterableDataset

- CollateIterableDataset
  - [x] Prototype
  - [x] Tests
- BatchIterableDataset
  - [x] Prototype
  - [x] Tests
- SamplerIterableDataset
  - [x] Prototype
  - [x] Tests

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D25623635

Pulled By: ejguan

fbshipit-source-id: 99ba077619f672551ac15367baaba985db35a9c2
2020-12-21 07:04:25 -08:00
Nikita Shulga
6f381de006 Inline coverage report combining/reporting (#49615)
Summary:
Instead of calling coverage frontend import coverage module and call combine() and html_report()

Fixes https://github.com/pytorch/pytorch/issues/49596 by not using a strict mode when combining those reports

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49615

Reviewed By: seemethere

Differential Revision: D25645196

Pulled By: malfet

fbshipit-source-id: be55b5c23a3569a331cbdf3f86d8c89bc27d5fe1
2020-12-18 17:08:46 -08:00
Pritam Damania
9d91360b5d Cleanup APIs for pipeline parallelism. (#48630)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48630

1) Make torch.distributed.pipeline package public.
2) Make several helper methods private.
ghstack-source-id: 118820803

Test Plan: waitforbuildbot

Reviewed By: rohan-varma

Differential Revision: D25235688

fbshipit-source-id: c32833ebf090ddbd4eaf06fcb5e3f9d421623a60
2020-12-18 15:17:13 -08:00
Rong Rong (AI Infra)
df2337097d add files to SLOW_TESTS for target determinator (#49500)
Summary:
- test_torch was split into 6 in https://github.com/pytorch/pytorch/issues/47356.
- also test_linalg has 10 slowtest marking.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49500

Reviewed By: ezyang, malfet

Differential Revision: D25598085

Pulled By: walterddr

fbshipit-source-id: 74b0b433897721db86c00e236d1dd925d7a6d3d0
2020-12-16 19:10:56 -08:00
Brian Hirsh
9908b93dcf fix test_dispatch tests to error on duplicate def (#49254)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49254

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D25505170

Pulled By: bdhirsh

fbshipit-source-id: 6796f4ce022c3141934ee69c7caaa08e663adf39
2020-12-15 08:27:52 -08:00
Pritam Damania
df027bfd2c Modify Pipe to return an RRef. (#47829)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47829

As per proposal in https://github.com/pytorch/pytorch/issues/44827,
the API needs to return an RRef to support inter-host pipelining.

For now, we just return a local RRef and only support pipeline on a single
host. But having this change in the API upfront ensures we don't make any BC
breaking changes later.
ghstack-source-id: 118366784

Test Plan: waitforbuildbot

Reviewed By: rohan-varma

Differential Revision: D24914022

fbshipit-source-id: e711e7d12efa45645f752f0e5e776a3d845f3ef5
2020-12-11 14:55:16 -08:00
Rong Rong
ef50c94e7c reenabling MPI test (#48725)
Summary:
fixes https://github.com/pytorch/pytorch/issues/47443.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48725

Reviewed By: mrshenli

Differential Revision: D25278758

Pulled By: walterddr

fbshipit-source-id: a02d0fef99a7941c8e98da16a45d840e12b8b0c3
2020-12-03 06:50:36 -08:00
neerajprad
5489a98cd3 Add support for CorrCholeskyTransform (#48041)
Summary:
This adds a transform to convert a real vector of (D * (D-1))/2 dimension into the cholesky factor of a D x D correlation matrix. This follows the implementation in [NumPyro](https://github.com/pyro-ppl/numpyro/blob/master/numpyro/distributions/transforms.py) by fehiepsi. This is needed for the LKJDistribution which will be added in a subsequent PR.

Also in line with the ongoing effort to refactor distributions test, this moves the transforms test into its own file that uses pytest with parametrized fixtures.

For review:
 fehiepsi - could you help review the math?
 fritzo - do you have any suggestions for what to do about the event dimension (more details are in the comment below)?
 ezyang - could you review the changes in `run_test.py`? Instead of a separate `PYTEST_TESTS`, I have clubbed these tests in `USE_PYTEST_LIST` to avoid duplicate logic. The only difference is that we do not anymore check if pytest is not installed and exclude the tests in the list. I figured that if existing tests are already using pytest, this should not matter.

TODOs (probably not all can be satisfied at the same time):
 - [x] Use operations that are JIT friendly, i.e. the transform works with different sized input under JIT.
 - [x] Resolve test failures - currently `arange(scalar_tensor)` fails on certain backends but this is needed for JIT. Maybe we should only support same sized tensor under JIT?
 - [x] Add tests to check that the transform gives correct gradients and is in agreement with the `log_det_jacobian`.
 - [x] Add `input_event_dim` and `output_event_dim` to `CorrCholeskyTransform`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48041

Reviewed By: zhangguanheng66

Differential Revision: D25262505

Pulled By: neerajprad

fbshipit-source-id: 5a57e1c19d8230b53592437590b9169bdf2f71e9
2020-12-03 03:21:08 -08:00