Commit Graph

167 Commits

Author SHA1 Message Date
Ilia Cherniavskii
d8c384544e Destroy CUDA events after profiling (#39962)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39962

Adding a simple wrapper with ref count for cuda event and
destroying cuda event after the last copy is destroyed

Test Plan: CI cuda profiler tests

Differential Revision: D22027092

Pulled By: ilia-cher

fbshipit-source-id: e0810388aa60b2291eb010896e13af1fad92e472
2020-06-23 10:44:39 -07:00
Pritam Damania
e632bf8d57 Add thrift and tensorpipe backend tests for test_ddp_under_dist_autograd. (#40210)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40210

ghstack-source-id: 106300839

Test Plan: waitforbuildbot

Differential Revision: D22110065

fbshipit-source-id: d9ebd009b8d451c75708eadc7eb3f2b788e875aa
2020-06-20 22:59:59 -07:00
Ivan Kobzarev
3852215170 [vulkan] jit passes for vulkan conv2 prepack and fuse with clamp (#39282)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39282

Test Plan: Imported from OSS

Differential Revision: D21962424

Pulled By: IvanKobzarev

fbshipit-source-id: 2d20e827d2c3836b7e6b443293377c68dc1ffa5a
2020-06-20 14:12:21 -07:00
Jeff Daily
89ef8f8141 add test_openmp to ROCM_BLACKLIST (#40204)
Summary:
This test is flaky for rocm platform.  Add to blacklist until it can be further reviewed.

CC ezyang xw285cornell sunway513
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40204

Differential Revision: D22108295

Pulled By: xw285cornell

fbshipit-source-id: 802444a7b41260edcb6ce393237784f3e6c52a74
2020-06-18 15:15:35 -07:00
Shihao Xu
00651b8c93 [distribtued.nn] Implement TorchScript-compatible RemoteModule API (#37139)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37139

See design doc in https://github.com/pytorch/pytorch/issues/37136

ghstack-source-id: 105926270

Test Plan:
TODO:

- Make the generated Interface usable. https://github.com/pytorch/pytorch/pull/37139#discussion_r434190978
-
- Avoid generating the same template instances for Module that is not scriptable.
- Remove "infer_module_interface_cls".
- Use Python format instead of a CodeTemplate
- Use Python tempfile to track and delete file. Does it work if there is crash.

```
buck test mode/dev-nosan //caffe2/test/distributed/nn/jit:test_instantiator

buck build mode/dev-nosan //caffe2/test/distributed/nn/jit:test_instantiator && \
buck-out/gen/caffe2/test/distributed/nn/jit/test_instantiator\#binary.par -r test_instantiate_scripted_remote_module_template

buck build mode/dev-nosan //caffe2/test/distributed/nn/jit:test_instantiator && \
buck-out/gen/caffe2/test/distributed/nn/jit/test_instantiator\#binary.par -r test_instantiate_non_scripted_remote_module_template
```

```
buck test mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_spawn
```

```
buck test mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork

buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \
buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_user_provided_global_unique_name

buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \
buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_forward_async_script

buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \
buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_forward_sync_script

buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \
buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_forward_with_kwargs

buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \
buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_user_provided_global_unique_name
```

```
buck test mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork
```

buck test mode/opt-asan //caffe2/test:jit -- 'test_script_forward_method_replacement

buck build mode/dev-nosan //caffe2/test:jit && \
buck-out/gen/caffe2/test/jit\#binary.par -r 'test_script_forward_method_replacement'

buck build mode/dev-nosan //caffe2/test:jit && \
buck-out/gen/caffe2/test/jit\#binary.par -r 'test_imported_classes'

Differential Revision: D20499658

fbshipit-source-id: dd9383ae4eb2343366c11127664f845b91ca3b0a
2020-06-15 19:07:35 -07:00
Ilia Cherniavskii
cc3fc786b7 [resubmit] [pytorch][PR] Fix for num_threads==1 in OpenMP "parallel for" (#39533)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39533

Test Plan: CI

Reviewed By: ngimel

Differential Revision: D21889269

fbshipit-source-id: 5ba13a0a3ec11edd0b6a7c3fdb35396b847a3d9e
2020-06-15 13:14:59 -07:00
HC Zhu
acc13ac828 [PyTorch] Make DDP reducer work under distributed autograd (#37998)
Summary:
## Why doesn’t DDP work under dist_autograd?
DDP follows the steps below
1. [DDP Python constructor](8d6a8d2b3f/torch/nn/parallel/distributed.py (L389-L393)) (on a module) creates a [C++ Reducer](https://github.com/pytorch/pytorch/blob/master/torch/csrc/distributed/c10d/reducer.cpp), which holds references to all parameters (or variables in C++ code).
2. The reducer installs a post hook on each model parameter.
3. The backward run starts and triggers the post hooks installed above.
4. The post hook of a parameter simply marks the parameter ready for all-reduce.
5. Once all parameters in a bucket are ready, an all-reduce process starts by reading variable `.grad` and writes to variable `.grad`.

But under dist_autograd, `.grad` of a variable is not populated at all. Instead, grads are in a global map in distributed context from variables to their grads.

## Solution of this PR
The distributed engine to set a thread_local variable in a backward run indicating we're running in distributed mode. DDP reducer can then appropriately use `.grad` or the distributed context based on the thread local. More precisely, the thread local is set before calling the post hooks installed by the DDP reducer so that DDP post hooks can retrieve this thread local.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37998

Test Plan:
```
python test/distributed/test_ddp_under_dist_autograd.py
```

FB repo
```
buck test caffe2/test/distributed/...
```

DDP accuracy benchmark workflow run

```
flow-cli canary pytorch.benchmark.accuracy_comparison.workflow --parameters-json '{"node_world_size": 4, "dist_backend": "nccl"}' --run-as-secure-group fblearner_flow --entitlement gpu_prod
```

f196173157

Reviewed By: pritamdamania87

Differential Revision: D21513795

Pulled By: hczhu

fbshipit-source-id: fe21e68ecdc9274182db4d4bb5a1e2d68ef927a2
2020-06-10 08:38:14 -07:00
Jithun Nair
545a3e1eca Remove test_nccl from ROCM_BLACKLIST and enable only a couple of test_nccl tests (#39354)
Summary:
All individual test_nccl unit tests have been disabled for ROCm in bf9395438f
test_nccl was also added to the ROCM_BLACKLIST in 87b198d309
However, the issue only arises when running the test_nccl suite as a whole (as opposed to any one test individually). More details in comments here: https://github.com/pytorch/pytorch/pull/38689

This PR enables test_nccl suite with only two tests so as to workaround the as-yet unresolved issue above, while allowing at least one test_nccl collective test to run on ROCm. This is also needed as a precursor for: https://github.com/pytorch/pytorch/pull/38515
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39354

Differential Revision: D21843194

Pulled By: mrshenli

fbshipit-source-id: b28d1e073d8d0fdc1b59928fc3b00187cfd02a35
2020-06-05 13:52:23 -07:00
mattip
ada2652ca6 Restore docs coverage test via sphinx (#39331)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39331

Fixes gh-37590

Adds an extra `make coverage` to document building, which uses the built-in facility in sphinx to check docstring coverage. Also fixes a failure to import `torch/jit/supported_ops.py` which broke the [Torchscript Builtins](https://pytorch.org/docs/stable/jit_builtin_functions.html) page.

This also adds the required `SPHINXOPTS` to turn warnings into error, but this is commented out. Note that since documentation of `torchvision` is merged in here, failures there would cause failures here if this is made active. Some thought might be needed about pinning the torchvision version merged into documentation.

The first commit should fail, since the "ScriptModule" class is commented out. I did that in order to check that a CI failure is properly reported.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38244

Differential Revision: D21640589

Pulled By: ezyang

fbshipit-source-id: 1e240d81669b5f21404d596de4a27d192dc9fd8a
2020-06-04 10:49:38 -07:00
Oguz Ulgen
4a0a38c17a Revert D21652452: [pytorch][PR] Fix for num_threads==1 in OpenMP "parallel for"
Test Plan: revert-hammer

Differential Revision:
D21652452

Original commit changeset: 2cda7777c0ea

fbshipit-source-id: fdd9a0346ce32a962766f57e13357dd2bc60d8b8
2020-06-03 22:51:51 -07:00
Luca Wehrstedt
5beb3b0c53 [TensorPipe] Re-enable dist optimizer tests (#39441)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39441

This is the last test suite to be enabled for TensorPipe.
ghstack-source-id: 105166757

Test Plan: Ran the tests, hundreds of times each, in different build modes.

Differential Revision: D21858975

fbshipit-source-id: ee0a7e64b77b4b1974f031207031cc14afb3a8c2
2020-06-03 09:00:52 -07:00
Luca Wehrstedt
b1dab266f7 [TensorPipe] Re-enable dist autograd tests (#39440)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39440

After the RPC tests, re-enable the second test suite: dist autograd.
ghstack-source-id: 105165393

Test Plan: Ran the tests, several times each, in different build configs.

Differential Revision: D21858974

fbshipit-source-id: 409377d564c36fecae51b9e4c776d94187b434a2
2020-06-03 08:59:19 -07:00
Luca Wehrstedt
3f099879f7 [TensorPipe] Re-enable RPC tests (#39406)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39406

For now, just the RPC test (no dist autograd or dist optimizer).

I removed the skipping decorator from all the tests except those that explicitly use the ProcessGroup options.

Includes #39027.
ghstack-source-id: 105159974

Test Plan: Ran the tests several hundred times, in various build modes. Saw some flakes, but at a rate of about 0.1%

Differential Revision: D21716069

fbshipit-source-id: 9d2a99e112049a63745772c18e7a58266ed8e74e
2020-06-03 07:14:30 -07:00
mattip
a952f9bb06 Fix for num_threads==1 in OpenMP "parallel for" (#36479)
Summary:
fixes gh-32284

Move the non-parallel stanza out of the parallel context, and use `num_threads` to limit nesting `parallel for`s. The nesting caused a memory leak in the test script in the issue.

This should probably have a test somewhere: are there tests for ParallelOpenMP?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36479

Differential Revision: D21652452

Pulled By: ilia-cher

fbshipit-source-id: 2cda7777c0eafbe268550a82fed306e52fb6eb25
2020-06-02 18:56:13 -07:00
Shen Li
bb0377bb24 Expose torch.futures.Future (#39008)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39008

This commit adds a `torch.futures.Future` type and exposes its ctor,
`wait`, `then`, and `set_result` APIs. This type is currently a
wrapper of `c10::ivalue::Future` and mainly used by RPC for now. Later,
we could revamp c10d APIs to return this `Future` type as well. More
utils will be added into `torch.futures` package in followup PRs.

Test Plan: Imported from OSS

Differential Revision: D21723022

Pulled By: mrshenli

fbshipit-source-id: 92e56160544e9bf00d11db3e8347a1b9707882c9
2020-06-02 10:12:56 -07:00
Nikita Shulga
39d037253c Test PyTorch using python-3.8 + GCC-9 on Bionic (Reland) (#39121)
Summary:
Enable new test config in .circleci/config.yml
Skip scanning several 3rd-party packages to work around https://bugs.python.org/issue40350
Remove pre python-3.5 checks from `test.sh` and update `scikit-learn` to python-3.8 compatible version

This is a reland of https://github.com/pytorch/pytorch/pull/39030
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39121

Differential Revision: D21820375

Pulled By: malfet

fbshipit-source-id: d0be79b7d204cf692e055d42b9be42402dc4c1c0
2020-06-01 11:11:12 -07:00
Rohan Varma
988e31c788 Revert D21752017: [pytorch][PR] Test PyTorch using python-3.8 + GCC-9 on Bionic
Test Plan: revert-hammer

Differential Revision:
D21752017

Original commit changeset: 56c841636349

fbshipit-source-id: adf08e03ba9610050fc5440ef453789f805fdc6b
2020-05-27 17:42:22 -07:00
Nikita Shulga
30dd4acbf6 Test PyTorch using python-3.8 + GCC-9 on Bionic (#39030)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39030

Differential Revision: D21752017

Pulled By: malfet

fbshipit-source-id: 56c841636349e24c9ebef8dac18c283de3664fa5
2020-05-27 15:56:37 -07:00
Nikolay Korovaiko
4fcd1c3123 run te only for profiling executor (#38591)
Summary:
* Disable the mode where PE can still run the old fuser.
* Clean up
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38591

Differential Revision: D21643664

Pulled By: Krovatkin

fbshipit-source-id: 6753ed6bdc544698a1340e59a624608ff3abf7f9
2020-05-26 18:35:25 -07:00
Shen Li
40ce90bfc1 Revert D21560096: [Tensorpipe Agent] Enabling tests with OSS CI
Test Plan: revert-hammer

Differential Revision:
D21560096

Original commit changeset: 7d61cc1c354e

fbshipit-source-id: 6adfd87e354545031203d65d04f0bad4687a93cd
2020-05-19 19:39:33 -07:00
Jeff Daily
87b198d309 add distributed/test_nccl to ROCM_BLACKLIST (#38730)
Summary:
CC ezyang xw285cornell sunway513

Work-around for recent ROCm CI failures due to 9cfc10d52e (https://github.com/pytorch/pytorch/issues/37294).  Replaces full revert suggested by PR https://github.com/pytorch/pytorch/issues/38689.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38730

Differential Revision: D21648707

Pulled By: xw285cornell

fbshipit-source-id: 627b11b229c7eadca1f6e0c6192c6b5b6416e6a1
2020-05-19 14:45:50 -07:00
Omkar Salpekar
87aa2d25ae [Tensorpipe Agent] Enabling tests with OSS CI (#38447)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38447

This PR modifies `run_tests.py` to enable running Tensorpipe Agent tests with the OSS CI.
ghstack-source-id: 104321881

Test Plan: CI

Differential Revision: D21560096

fbshipit-source-id: 7d61cc1c354e9353c4a586dd2b56690c28d51d10
2020-05-19 13:34:06 -07:00
Nikita Shulga
72e5b7ae5b Add option to run python unittests in parallel (#37180)
Summary:
So far results looks quite promising: test_nn is purely sequential tests and can be accelerated 3x
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37180

Differential Revision: D21437871

Pulled By: malfet

fbshipit-source-id: 8679a8af355f839f2c9dae3bf36d2e102af05425
2020-05-06 22:14:11 -07:00
Kimish Patel
b1b6bc36a5 Enable xnnpack_integration test in CI. (#37838)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37838

Test Plan: oss: python test/test_xnnpack_integration.py

Reviewed By: xcheng16

Differential Revision: D21405850

fbshipit-source-id: ba4ba06692b49315f110653d9492b2e14b618574
2020-05-06 13:53:03 -07:00
ashishfarmer
402f635bbe Enable ahead of time compilation for HIPExtensions using ninja (#37800)
Summary:
This pull request enables ahead of time compilation of HIPExtensions with ninja by setting appropriate compilation flags for ROCm environment. Also, this enables the unit test for testing cuda_extensions on ROCm as well as removing test for ahead of time compilation of extensions with ninja from ROCM_BLACKLIST

ezyang jeffdaily
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37800

Differential Revision: D21408148

Pulled By: soumith

fbshipit-source-id: 146f4ffb3418f3534e6ce86805d3fe9c3eae84e1
2020-05-05 20:53:35 -07:00
ashishfarmer
bbd2350c99 Disable tests failing on test2 in ROCm CI (#37427)
Summary:
This pull request disables the unit tests that were observed to be failing once `test2` was enabled. These tests will be one by one looked at and fixed at the earliest, but until then disabling them to unblock `test2`
The pull request also disables fftPlanDestroy for rocFFT to avoid double-freeing FFT handles

cc: ezyang jeffdaily
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37427

Differential Revision: D21302909

Pulled By: ezyang

fbshipit-source-id: ecadda3778e65b7f4f97e24b932b96b9ce928616
2020-04-29 09:56:28 -07:00
Nikolay Korovaiko
edc5ef1afb run the simple executor for jit tests by default, add profiling jobs … (#37017)
Summary:
…for fusion tests

fix flake8 warnings

fix ci failures

fix test_determination.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37017

Differential Revision: D21238446

Pulled By: Krovatkin

fbshipit-source-id: 393e6135883dc5ac57bdff580de96c66829d454c
2020-04-28 19:16:52 -07:00
Nikita Shulga
47c4dca1ab Remove python-2 or python<3.5 checks from unit tests (#37252)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37252

Test Plan: CI

Differential Revision: D21241083

Pulled By: malfet

fbshipit-source-id: 44164b822f7905288abb2beda0175d2162d86143
2020-04-24 17:42:04 -07:00
Jerry Zhang
230b68168b [quant] Refactor test files (#36964)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36964

Rename and restructure quantization related tests
https://github.com/pytorch/pytorch/issues/31625

Test Plan:
.

Imported from OSS

Differential Revision: D21192509

fbshipit-source-id: 148c93e86e0ea68ab18a067fe74a8035a29a1e4e
2020-04-23 10:28:56 -07:00
David Reiss
e75fb4356b Remove (most) Python 2 support from Python code (#35615)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35615

Python 2 has reached end-of-life and is no longer supported by PyTorch.
Now we can clean up a lot of cruft that we put in place to support it.
These changes were all done manually, and I skipped anything that seemed
like it would take more than a few seconds, so I think it makes sense to
review it manually as well (though using side-by-side view and ignoring
whitespace change might be helpful).

Test Plan: CI

Differential Revision: D20842886

Pulled By: dreiss

fbshipit-source-id: 8cad4e87c45895e7ce3938a88e61157a79504aed
2020-04-22 09:23:14 -07:00
Jerry Zhang
57c50db441 [reland][quant] Add backward compatiblity test (#36842)
Summary:
re-created the same PR: https://github.com/pytorch/pytorch/pull/36639
because ghimport does not support importing binary files right now
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36842

Test Plan: python test/quantization/test_backward_compatibility.py

Differential Revision: D21100689

Pulled By: jerryzh168

fbshipit-source-id: 625a0f9da98138c9c2891b9d99fc45d85fa27cca
2020-04-17 21:24:31 -07:00
Xingying Cheng
86f354c530 Python binding api to optimize for mobile model on script module. (#36357)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36357
ghstack-source-id: 101907180

Creating a python api entry to optimize mobile model which takes a scripted module as argument and returns an optimized scripted module. The initial optimization features includes inserting and folding prepack ops.

Test Plan: python test/test_optimizer.py

Differential Revision: D20946076

fbshipit-source-id: 93cb4a5bb2371128f802d738eb26d0a4f3b2fe10
2020-04-17 16:21:27 -07:00
Mike Ruberry
f00014b790 Revert D21080503: [pytorch][PR] [quant] Add backward compatiblity test
Test Plan: revert-hammer

Differential Revision:
D21080503

Original commit changeset: 1dca08208bcc

fbshipit-source-id: 5cd8c22130ff28b9231f657f80961e94b65b5792
2020-04-16 22:03:12 -07:00
Jerry Zhang
484a00b2d3 [quant] Add backward compatiblity test (#36771)
Summary:
re-created the same PR: https://github.com/pytorch/pytorch/pull/36639
because ghimport does not support importing binary files right now
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36771

Test Plan: python test/quantization/test_backward_compatibility.py

Differential Revision: D21080503

Pulled By: jerryzh168

fbshipit-source-id: 1dca08208bccead60bba03e5fb5d39e1a1d7c20d
2020-04-16 19:00:30 -07:00
Haixin Liu
455d4aab64 [PyTorch Numeric Suite] Add weight compare API (#36186)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36186

Start PyTorch Numeric Suite under PyTorch quantization and add weight compare API to it.
ghstack-source-id: 102062165

Test Plan: buck test mode/dev caffe2/test:quantization -- 'test_compare_weights'

Differential Revision: D20903395

fbshipit-source-id: 125d84569837142626a0e2119b3b7657a32dbf4e
2020-04-13 19:02:00 -07:00
Thomas Viehmann
d070c0bcf0 ROCm: enable cpp_extensions.load/load_inline (#35897)
Summary:
This enables cpp_extensions.load/load_inline. This works by hipify-ing cuda sources.
Also enable tests.
CuDNN/MIOpen extensions aren't yet supported, I propose to not do this in this PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35897

Differential Revision: D20983279

Pulled By: ezyang

fbshipit-source-id: a5d0f5ac592d04488a6a46522c58e2ee0a6fd57c
2020-04-13 11:44:08 -07:00
David Reiss
fab06bfb75 Add utility for bundling sample inputs with models (#35631)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35631

Bundling sample inputs with our models with a standardized interface
will make it possible to write benchmarking and code-coverage tools that
call all models in a uniform way.  The intent is to make this a standard
for mobile models within Facebook.  Putting it in torch/utils so tests
can run on GitHub and because it might be useful for others as well.

`augment_model_with_bundled_inputs` is the primary entry point.  See
its docstring for usage information and the test for some example uses.

One design question I had was how much power should be available for
automatic deflating and inflating of inputs.  The current scheme gives
some automatic handling and a reasonable escape hatch
("_bundled_input_inflate_format") for top-level tensor arguments, but no
automatic support for (e.g.) tensors in tuples or long strings.  For
more complex cases, we have the ultimate escape hatch of just defining
_generate_bundled_inputs in the model.

Another design question was whether to add the inputs to the model or
wrap the model in a wrapper module that had these methods and delegated
calls to `forward`.  Because models can have other exposed methods and
attributes, the wrapped seemed too onerous.

Test Plan: Unit test.

Differential Revision: D20925013

Pulled By: dreiss

fbshipit-source-id: 4dbbb4cce41e5752133b4ecdb05e1c92bac6b2d5
2020-04-08 13:10:36 -07:00
Johannes M Dieterich
45fc881f05 [ROCm] Hotfix: Black list tensorexpr test set that has failures on ROCm (#36049)
Summary:
Test set got enabled with ROCm failures in https://github.com/pytorch/pytorch/pull/35914 - black list it for now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36049

Differential Revision: D20869814

Pulled By: zou3519

fbshipit-source-id: fcdb2abc9f3407344b56cf8d48b7740008317020
2020-04-06 13:26:05 -07:00
David Reiss
a054d05707 Add torch.utils.show_pickle for showing pickle contents in saved models (#35168)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35168

Sometimes when a saved model isn't working, it's nice to be able to look
at the contents of the pickle files.  Unfortunately, pickletools output
isn't particularly readable, and unpickling is often either not possible
or runs so much post-processing code that it's not possible to tell
exactly what is present in the pickled data.

This script uses a custom Unpickler to unpickle (almost) any data into
stub objects that have no dependency on torch or any other runtime types
and suppress (almost) any postprocessing code.

As a convenience, the wrapper can search through zip files, supporting
command lines like

`python -m torch.utils.show_pickle /path/to/model.pt1@*/data.pkl`

When the module is invoked as main, we also install a hack in pprint to
allow semi-resonable formatting of our stub objects.

Test Plan: Ran it on a data.pkl, constants.pkl, and a debug pkl

Differential Revision: D20842550

Pulled By: dreiss

fbshipit-source-id: ef662d8915fc5795039054d1f8fef2e1c51cf40a
2020-04-03 15:11:20 -07:00
Mikhail Zolotukhin
ba3cec867f Reenable test/test_tensorexpr.py (#35914)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35914

Test Plan: Imported from OSS

Differential Revision: D20827188

Pulled By: ZolotukhinM

fbshipit-source-id: ffcc1bb0396a0a19afb577a7ab4ca95c7e4ced37
2020-04-03 12:20:31 -07:00
Will Feng (FAIAR)
2fa3c1570d Refactor C++ API parity test mechanism and turn it on in CI again (#35190)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35190

The following are the main changes:
- The main logic of C++ API parity test mechanism is moved from `test/test_cpp_api_parity.py` to `test/cpp_api_parity/module_impl_check.py` and `test/cpp_api_parity/functional_impl_check.py`, so that there is a clear separation between module tests and functional tests, although they still share a lot of common utility functions which are all in `test/cpp_api_parity/utils.py`.
- Module init tests (i.e. testing whether C++ module accepts the same constructor options as the corresponding Python module) is removed and will be added again in the future.
- `cpp_constructor_args` / `cpp_options_args` / `cpp_function_call` are added as appropriate to all test params dict in `torch/testing/_internal/common_nn.py`, to indicate how to run C++ API parity test for this test params dict.

Test Plan: Imported from OSS

Differential Revision: D20588198

Pulled By: yf225

fbshipit-source-id: 11238c560c8247129584b9b49df73fff40c4d81d
2020-04-03 11:20:36 -07:00
Feng Tian
762270c51f add c10d dynamic loading mechanism and unit test (#28068)
Summary:
The original behavior of pytorch c10d only supports built-in c10d backends, such as
nccl/gloo/mpi. This patch is used to extend the c10d capability to support dynamically
loading 3rd party communication libraries which are derived from ProcessGroup base class.

related RFC is in: https://github.com/pytorch/pytorch/issues/27955

Through this way, user just need specify a 3rd party c10d backend name when invoking
torch.distributed.init_process_group(). The proposed logic will try to load corresponding
c10d backend cpp extension automatically. as for how to develop a new 3rd party c10d backend
through cpp extension, pls refer to test/cpp_extensions/cpp_c10d_extension.cpp
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28068

Differential Revision: D19174838

Pulled By: agolynski

fbshipit-source-id: 3409a504a43ce7260e6f9d1207c00e87471fac62
2020-04-02 15:46:51 -07:00
Nick Korovaiko
ddcad5b9ca temp disable test_tensorexpr.py
Test Plan: test on CI

Reviewed By: soumith

Differential Revision: D20823336

fbshipit-source-id: 65c04bc57c6a120003cb561613645d2d7e60189c
2020-04-02 14:28:22 -07:00
Christian Sarofeen
6d24f8fe21 Infrastructure for a new CUDA Fuser (#34785)
Summary:
**Summary:** This PR contains the infrastructure of a new CUDA fuser. This CUDA fuser is based on many of the same principles of TensorExpressions and Halide, however the implementation is ground up. The fusion pass itself is similar to the default CUDA fuser, however, it has undergone some refactoring and is using the new code generation infrastructure. For those who are interested in how the code generation in this PR works, I would recommend reviewing _test/cpp/jit/test_gpu_fusion.cpp_ as well as the long comment section at the beginning of _torch/csrc/jit/codegen/cuda/transform_replay.h_  One of the largest differences between our approach and that of TVM/Halide, is the concept of "TensorView". TensorView from a high level should be thought of similarly to how we think of working with Tensors in PyTorch. It's an N-D object which can undergo transformations that change its dimensionality. Dimensionality changes are done through the operations split/merge/reorder/computeAt. These transformations are similar to split/fuse/reorder/compute_at of TVM, they modify how a tensor is iterated over to generate GPU code. Interestingly, in our scheme these transformations are applied to tensors and only impact how that tensor is generated.

**Warning:** This PR is purposefully not feature complete with the current fuser. We wanted to separate out the infrastructure from the fusion capabilities. Once in, smaller incremental PRs will be submitted to expand capabilities of the fuser.

**Short term goals:**

Parity with current CUDA fuser (including performance):
- Dynamic shapes (no recompilation)
- Implicit handling of braodcast (broadcasted tensors are treated as tensors of the braodcasted size in the generated code)
- Dropout

**Mid-term goals:**

- Transposes fused with pointwise operations where transpose involves only 2 axes (across the fused operation).
- 1-D reductions fused with pointwise operations
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34785

Reviewed By: ZolotukhinM

Differential Revision: D20650977

Pulled By: soumith

fbshipit-source-id: ee39c95a880e1b9822e874ed4cc180971572bf63
2020-04-02 09:22:42 -07:00
Nick Korovaiko
2f50c11954 add test_tensorexpr.py (#35776)
Summary:
Adding `test_tensorexpr.py` to our CI. There's a few complications: the first one is that we now always run `SimpleIREVal` as a part of simplifier, so the counts will always be greater than one. We can potentially invest some effort to differentiate between a real codegen call to `SimpleIREval` and calls in simplifier, but it's probably not that important and the second change to turn not being able to retrieve a counter into a default value of 0 since the test are structured to test for either an llvm or simpleireval backends, so it only seems appropriate to not fail the test too early.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35776

Differential Revision: D20799333

Pulled By: Krovatkin

fbshipit-source-id: 2a94ff98e647180c6e6aea141a411c3376c509f9
2020-04-01 22:05:37 -07:00
Jerry Zhang
ab26dfb44e [quant] Move quantization tests into test/quantization (#35812)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35812

Test Plan:
.

Imported from OSS

Differential Revision: D20795329

fbshipit-source-id: 42cc905c44ce7b86720aeef512d747ff6788d7a2
2020-04-01 12:44:19 -07:00
Michael Suo
319aee1afb Revert D20771828: [quant] Move quantization tests into test/quantization
Test Plan: revert-hammer

Differential Revision:
D20771828

Original commit changeset: 5f1df5e86c29

fbshipit-source-id: d14f915f291ae8a90026c5b65624459211495f47
2020-03-31 23:01:00 -07:00
Jerry Zhang
fef6c617d4 [quant] Move quantization tests into test/quantization (#35688)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35688

Test Plan:
.

Imported from OSS

Differential Revision: D20771828

fbshipit-source-id: 5f1df5e86c29f7bdfbdc6563450e909b3bfdc07a
2020-03-31 20:30:57 -07:00
Johannes M Dieterich
0eb26fb01e [ROCm] Properly blacklist (#35230)
Summary:
test_python_all_except_nn
+ /usr/bin/python3.6 test/run_test.py --exclude test_nn test_jit_simple
test_jit_legacy test_jit_fuser_legacy --verbose --bring-to-front
test_quantization test_quantized test_quantized_tensor
test_quantized_nn_mods --determine-from=

test_nn continues to be run as part of test1 target

This will allows us to run run_test.py and correctly disabling these sets for ROCm.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35230

Differential Revision: D20735851

Pulled By: ezyang

fbshipit-source-id: 255d21374c9605c8f8b6ffa1b08f58fb10d8e543
2020-03-30 08:57:03 -07:00
Omkar Salpekar
4025729e88 [1.5 Release][RPC Reliability] RRef Idempotency and RPC Retry enablement (#33636)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33636

Fixes https://github.com/pytorch/pytorch/issues/32119, https://github.com/pytorch/pytorch/issues/26116,
https://github.com/pytorch/pytorch/issues/33072

Makes RRef control messages idempotent and enables sending with retries for distributed autograd cleanup and RRef internal messages.

In order to effectively test whether these RRef and distributed autograd cleanup work with network failures/retries, I implemented an  RPC Agent with a faulty send function, and enabled running tests using this as a third backend (in addition to Thrift and PGA). The tests using this backend are in a separate class (the test cases are similar but with minor changes to ensure short-running tests wait for retried RPCs to finish).

This faulty RPC Agent is pretty configurable. The tests can configure which messages types to fail, and how many messages to fail, but going forward, other RPC functionality can be overriden with faulty methods to test with failures injected.

Differential Revision: D20019236

fbshipit-source-id: 540a977e96b2e29aa0393ff12621fa293fe92b48
2020-03-20 20:07:47 -07:00