pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Mike Ruberry	36c87f1243	Refactors test_torch.py to be fewer than 10k lines (#47356 ) Summary: Creates multiple new test suites to have fewer tests in test_torch.py, consistent with previous test suite creation like test_unary_ufuncs.py and test_linalg.py. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47356 Reviewed By: ngimel Differential Revision: D25202268 Pulled By: mruberry fbshipit-source-id: 75fde3ca76545d1b32b86d432a5cb7a5ba8f5bb6	2020-11-28 20:11:40 -08:00
Jithun Nair	f1c985695c	Enabled gloo backend in test_distributed unit tests for ROCm (#40395 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40395 Reviewed By: ngimel Differential Revision: D25181692 Pulled By: mrshenli fbshipit-source-id: 29f478c974791efc0acea210c8c9e574944746a5	2020-11-25 19:51:40 -08:00
Sam Estep	c4a6df989c	Pass any verbosity from test/run_test.py to pytest (#48204 ) Summary: Previously it was only possible to pass up to one [verbosity level](https://adamj.eu/tech/2019/10/03/my-most-used-pytest-commandline-flags/) to `pytest` when running a test via `test/run_test.py`. Presumably that behavior was never added because `unittest` [doesn't do anything extra](https://stackoverflow.com/a/1322648/5044950) when given more than one `--verbose` flag. This PR removes that limitation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/48204 Test Plan: Make a dummy `pytest`-style file `test/test_foo.py`:
```py
def test_bar():
    assert 'hello\n' * 10 == 'hello\n' * 20
```
Then add `'test_foo'` to both `TESTS` and `USE_PYTEST_LIST` in `test/run_test.py`, and run this command:
```sh
test/run_test.py -vvi test_foo
```
Reviewed By: walterddr Differential Revision: D25069147 Pulled By: samestep fbshipit-source-id: 2765ee78d18cc84ea0e262520838993f9e9ee04f	2020-11-19 08:06:26 -08:00
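The verbosity pass-through described in the commit above can be sketched as follows. This is a minimal illustration, not the code in `test/run_test.py`: the function name `build_pytest_args` and the exact option set are assumptions made for the example. It counts repeated `-v` flags with `argparse` and forwards the same number of `v`s to `pytest`.

```python
import argparse


def build_pytest_args(argv):
    """Parse runner-style flags and derive the verbosity flag to forward.

    Hypothetical helper for illustration; not the real run_test.py logic.
    """
    parser = argparse.ArgumentParser()
    # action="count" turns -v, -vv, -vvv into 1, 2, 3.
    parser.add_argument("-v", "--verbose", action="count", default=0)
    parser.add_argument("-i", "--include", nargs="+", default=[])
    options = parser.parse_args(argv)

    pytest_args = []
    if options.verbose:
        # Forward every requested level instead of capping at a single -v.
        pytest_args.append("-" + "v" * options.verbose)
    return options, pytest_args


options, pytest_args = build_pytest_args(["-vvi", "test_foo"])
print(options.include, pytest_args)
```

With the command from the test plan, `-vvi test_foo`, the combined short flags parse as two `-v` flags followed by `-i test_foo`.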
Wanchao Liang	bc484cfed1	[c10d][jit] initial torchbind bindings for ProcessGroupNCCL (#42944 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42944 Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D23228682 Pulled By: wanchaol fbshipit-source-id: 30f4258ec2a90202264745511b897f4e1f5550f7	2020-11-17 21:01:55 -08:00
Xiang Gao	6e42b77be1	Add '--allow-run-as-root' to mpiexec to allow running distributed test inside a container (#43794 ) Summary: Inside a container, the user is often root. We should allow this use case so that people can easily run `run_test.py` inside a container. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43794 Reviewed By: ezyang Differential Revision: D24904469 Pulled By: malfet fbshipit-source-id: f96cb9dda3e7bd18b29801cde4c5b0616c750016	2020-11-13 15:31:06 -08:00
Jane Xu	579cfc6641	Moving test order to rebalance test1 and test2 times (#47290 ) Summary: asan testing diff is absurd right now, moving some heftier tests to be in shard2 (test_nn and test_quantization) Pull Request resolved: https://github.com/pytorch/pytorch/pull/47290 Reviewed By: malfet Differential Revision: D24706877 Pulled By: janeyx99 fbshipit-source-id: 35069d1e425857f85775f9be76501d6a158e0376	2020-11-03 09:39:29 -08:00
Pritam Damania	78de12f588	Replace -f with -x for pytest tests. (#46967 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46967 Tests under `tests/distributed/_pipeline/sync` use pytest and specifying the `-f` option for such tests as follows: `python test/run_test.py -i distributed/_pipeline/sync/skip/test_api -- -f` doesn't work. The equivalent option for pytest is `-x`. To resolve this issue, I've updated `run_test.py` to replace `-f` with `-x` for pytest tests. More details in https://github.com/pytorch/pytorch/issues/46782 #Closes: https://github.com/pytorch/pytorch/issues/46782 ghstack-source-id: 115440558 Test Plan: 1) waitforbuildbot 2) `python test/run_test.py -i distributed/_pipeline/sync/skip/test_api -- -f` Reviewed By: malfet Differential Revision: D24584556 fbshipit-source-id: bd87f5b4953504e5659fe72fc8615e126e5490ff	2020-10-29 15:28:06 -07:00
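The flag translation described in the commit above amounts to a one-line rewrite of the argument list. The sketch below assumes a hypothetical helper name, `translate_failfast`, and a boolean telling it whether the test is run with pytest; the real `run_test.py` may structure this differently.

```python
def translate_failfast(unittest_args, use_pytest):
    """Map unittest's fail-fast flag (-f) to pytest's equivalent (-x).

    Hypothetical helper; only pytest-driven tests need the translation.
    """
    if not use_pytest:
        return list(unittest_args)
    return ["-x" if arg == "-f" else arg for arg in unittest_args]


print(translate_failfast(["-v", "-f"], use_pytest=True))
```

Arguments other than `-f` pass through unchanged, and tests run with `unittest` keep their original flags.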
Jane Xu	85954164a4	fix minor bug, message variable does not exist (#46777 ) Summary: When run with `--continue-through-error`, the script ends with the following error:
```
Traceback (most recent call last):
  File "run_test.py", line 745, in <module>
    main()
  File "run_test.py", line 741, in main
    print_to_stderr(message)
NameError: name 'message' is not defined
make: *** [macos-compat] Error 1
```
This PR just changes `message` to `err`, which is the intended variable. Pull Request resolved: https://github.com/pytorch/pytorch/pull/46777 Reviewed By: seemethere Differential Revision: D24510460 Pulled By: janeyx99 fbshipit-source-id: be1124b6fc72b178d62acc168d0cbc74962de52b	2020-10-23 14:20:23 -07:00
Pritam Damania	06d50b5eb0	Pull in fairscale.nn.Pipe into PyTorch. (#44090 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44090 This is an initial commit pulling in the torchgpipe fork at https://github.com/facebookresearch/fairscale. The purpose of this commit is to just pull in the code and ensure all tests and builds work fine. We will slowly modify this to match our intended API mentioned in https://fb.quip.com/txurAV3zIFox#RPZACAfAKMq. Follow up PRs would address further changes needed on top of the initial commit.. We're pulling the code into the `torch.distributed._pipeline.sync` package. The package is private on purpose since there is a lot of work (ex: docs, API changes etc.) that needs to go in before we can actually officially support this. ghstack-source-id: 114864254 Test Plan: 1) waitforbuildbot 2) Ran all tests on my devgpu Reviewed By: mrshenli Differential Revision: D23493316 fbshipit-source-id: fe3c8b7dadeeb86abdc00e8a8652491b0b16743a	2020-10-22 10:59:02 -07:00
Richard Zou	0285618a11	Add utilities to support handling of nested python data structures (#46287 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46287 This adds a lightweight `pytree` implementation that is similar to and inspired by JAX pytrees, tensorflow.nest, deepmind/tree, TorchBeast's TensorNest, etc. A pytree is a nested Python data structure. It is a tree in the sense that nodes are Python collections (e.g., list, tuple, dict) and the leaves are Python values. Furthermore, a pytree should not contain reference cycles. This PR:
- adds support for flattening and unflattening nested Python list/dict/tuples

Context: nested Tensor inputs for vmap
--------------------------------------
Right now, vmap is restricted to taking in flat lists of tensors. This is because vmap needs to be able to convert every tensor in the input that is being vmapped over into a BatchedTensor. With a pytree library, we can simply flatten the input data structure (returning the leaves), map all of the Tensors in the flat input to BatchedTensors, and unflatten the flat list of BatchedTensors into a new input. Or equivalently, with a `tree_map` function, we can map a nested python data structure containing Tensors into one containing BatchedTensors.

Future work
-----------
In some future PRs, we'll add nested input support for vmap. The prerequisites for that are:
- a `broadcast_to(small, big)` that broadcasts `small` up to `big`. This is for handling the in_dims to vmap: the in_dims structure must be compatible with the structure of the inputs.

Test Plan
---------
- New tests in test/test_pytree.py

Test Plan: Imported from OSS Reviewed By: heitorschueroff Differential Revision: D24392890 Pulled By: zou3519 fbshipit-source-id: 7daf7430c5a38354e7d203a72882bd7a9b24cfb1	2020-10-20 07:45:45 -07:00
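The flatten / map / unflatten idea in the commit above can be sketched in plain Python. This is a standalone illustration, not the implementation added by the PR: the function names follow the commit message (`tree_map`) and common pytree vocabulary, while the tuple-based spec format is an assumption made for the example.

```python
def tree_flatten(tree):
    """Return (leaves, spec). spec is None for a leaf, else (kind, keys, child_specs)."""
    if isinstance(tree, dict):
        kind, keys, children = dict, list(tree.keys()), list(tree.values())
    elif isinstance(tree, list):
        kind, keys, children = list, None, tree
    elif isinstance(tree, tuple):
        kind, keys, children = tuple, None, tree
    else:
        return [tree], None  # anything that is not a container is a leaf

    leaves, child_specs = [], []
    for child in children:
        child_leaves, child_spec = tree_flatten(child)
        leaves.extend(child_leaves)
        child_specs.append(child_spec)
    return leaves, (kind, keys, child_specs)


def tree_unflatten(leaves, spec):
    """Rebuild the nested structure described by spec from a flat list of leaves."""
    leaf_iter = iter(leaves)

    def build(node_spec):
        if node_spec is None:
            return next(leaf_iter)
        kind, keys, child_specs = node_spec
        children = [build(s) for s in child_specs]
        if kind is dict:
            return dict(zip(keys, children))
        return kind(children)

    return build(spec)


def tree_map(fn, tree):
    """Apply fn to every leaf while keeping the container structure."""
    leaves, spec = tree_flatten(tree)
    return tree_unflatten([fn(leaf) for leaf in leaves], spec)


nested = {"a": [1, (2, 3)], "b": 4}
leaves, spec = tree_flatten(nested)
print(leaves)
print(tree_map(lambda x: x * 10, nested))
```

In the vmap use case from the commit message, `fn` would wrap each Tensor leaf in a BatchedTensor; here integers stand in for tensors.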
jiej	ac146c4820	[nvFuser] Switching to `CudaFusionGuard` from `BailOut` for nvfuser - update 2 (#46452 ) Summary: 1. Added CudaFusionGuard as the custom TypeCheck for nvfuser; enabled dynamic shape support with profiling executor; 2. dropped support for legacy fuser; 3. re-enabled nvfuser tests; 4. added registration for profiling record to allow profiling on user specified nodes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/46452 Reviewed By: zou3519, anjali411 Differential Revision: D24364642 Pulled By: ngimel fbshipit-source-id: daf53a9a6b6636e1ede420a3a6d0397d4a8b450b	2020-10-19 15:44:31 -07:00
Taylor Robie	dda95e6914	More Timer refinement (#46023 ) Summary: This PR just adds more polish to the benchmark utils: 1) `common.py`, `timer.py`, and `valgrind_wrapper/timer_interface.py` are now MyPy strict compliant. (except for three violations due to external deps.) Compare and Fuzzer will be covered in a future PR. 2) `CallgrindStats` now uses `TaskSpec` rather than accepting the individual fields which brings it closer to `Measurement`. 3) Some `__repr__` logic has been moved into `TaskSpec` (which `Measurement` and `CallgrindStats` use in their own `__repr__`s) for a more unified feel and less horrible f-string hacking, and the repr's have been given a cleanup pass. 4) `Tuple[FunctionCount, ...]` has been formalized as the `FunctionCounts` class, which has a much nicer `__repr__` than just the raw tuple, as well as some convenience methods (`__add__`, `__sub__`, `filter`, `transform`) for easier DIY stat exploration. (I find myself using the latter two a lot now.) My personal experience is that manipulating `FunctionCounts` is massively more pleasant than the raw tuples of `FunctionCount`. (Though it's still possible to get at the raw data if you want.) 5) Better support for multi-line `stmt` and `setup`. 6) Compare now also supports rowwise coloring, which is often the more natural layout for A/B testing. 7) Limited support for `globals` in `collect_callgrind`. This should make it easier to benchmark JIT models. (CC ZolotukhinM) 8) More unit tests, including extensive tests for the Callgrind stats manipulation APIs. 9) Mitigate issue with `MKL_THREADING_LAYER` when run in Jupyter. (https://github.com/pytorch/pytorch/issues/37377) Pull Request resolved: https://github.com/pytorch/pytorch/pull/46023 Test Plan: changes should be covered by existing and new unit tests. Reviewed By: navahgar, malfet Differential Revision: D24313911 Pulled By: robieta fbshipit-source-id: 835d4b5cde336fb7ff0adef3c0fd614d64df0f77	2020-10-15 16:32:53 -07:00
Wang Xu	62d37b9f26	add size_based_partition final (#46282 ) Summary: Reopen the PR: https://github.com/pytorch/pytorch/pull/45837 This PR adds a new feature to the Partitioner() class called size_based_partition. Given a list of devices with the same memory size, this function can distribute graph nodes across the devices. To implement this feature, several helper functions are created in Partitioner.py and GraphManipulation.py. A unit test is also added in test/test_fx_experimental.py. Pull Request resolved: https://github.com/pytorch/pytorch/pull/46282 Reviewed By: gcatron Differential Revision: D24288470 Pulled By: scottxu0730 fbshipit-source-id: e81b1e0c56e34f61e497d868882126216eba7538	2020-10-14 03:44:05 -07:00
Neeraj Pradhan	faa9c22a51	Support pytest for distribution testing (#45648 ) Summary: In response to https://github.com/pytorch/pytorch/issues/11578. This is a test run to see if CI (and other internal systems) works fine with pytest style tests. - Creates a separate `distributions` directory within `test`. - For testing, this rewrites the `constraint` tests as parameterized tests in pytest. I don't plan to convert any other tests to pytest style, but only expose this option for adding new tests, if required. If this is a success, we can move `EXAMPLES` in `test_distributions` into a separate file that can be imported by both pytest and unittest style tests. cc. fritzo Pull Request resolved: https://github.com/pytorch/pytorch/pull/45648 Reviewed By: ezyang, colesbury Differential Revision: D24080248 Pulled By: neerajprad fbshipit-source-id: 1f2e7d169c3c291a3051d0cece17851560fe9ea9	2020-10-13 10:56:50 -07:00
Jane Xu	ba78eb80ff	including tensorexpr tests in CI for all configs (#46188 ) Summary: Removed test_tensorexpr from the JIT-EXECUTOR exclude list. CI will now run those tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/46188 Reviewed By: glaringlee Differential Revision: D24255433 Pulled By: janeyx99 fbshipit-source-id: f18e5b41d49b439407c1c24ef6190ef68bc809bf	2020-10-12 12:03:06 -07:00
Jane (Yuan) Xu	be137e45cd	reorganizing tests so that test1 and test2 are balanced in timing (#45778 ) Summary: Used the `--shard` option to split up the Python tests run from `test/run_test.py` in the testing script run in CI. Also revised the help message for `--shard` to be more accurate. Test results:

BEFORE:

| EVENT | TIMING |
|---|---|
| TEST1 | |
| | |
| test_python_nn | 35m19s |
| test_cpp_extensions | 30s |
| total | 35m49s |
| TEST2 | |
| | |
| install_torchvision | 35s |
| test_python_all_except_nn_and_cpp_extensions | 255m37s |
| test_aten | SKIPPED |
| test_libtorch | 9m8s |
| test_custom_script_ops | SKIPPED |
| test_custom_backend | SKIPPED |
| test_torch_function_benchmark | 10s |
| total | 4hr24m |

AFTER THIS SHARD:

| EVENT | TIMING |
|---|---|
| TEST1 | |
| | |
| test_autograd | 26m30s |
| test_foreach | 69m |
| test_nn | test_nn is 35m38s |
| total | 3h1m |
| TEST2 | |
| | |
| test-quantization | 41m28s |
| test_spectral_ops | 17m37s |
| test_torch | 8m56s |
| test_jit_legacy | 16m21s |
| total | 2h18m |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45778 Reviewed By: albanD Differential Revision: D24137156 Pulled By: janeyx99 fbshipit-source-id: 5873fec47aedb9f699ebbda653a4d32a9950fc13	2020-10-06 07:57:08 -07:00
Jane Xu	8bc0c755be	adding option to move excluding to run_test.py instead of test.sh (#45868 ) Summary: Cleaning up test.sh a tiny bit Pull Request resolved: https://github.com/pytorch/pytorch/pull/45868 Reviewed By: albanD Differential Revision: D24122726 Pulled By: janeyx99 fbshipit-source-id: e8254accad15ad887a000ec1401c401389393c92	2020-10-06 07:13:27 -07:00
Jane (Yuan) Xu	6acd7b686c	adding sharding option to run_test.py (#45583 ) Summary: Added a sharding option to run_test.py to enable users to run a subset of the many tests. The new `--shard` argument takes in two integer values, `x` and `y`, where the larger value would denote the number of shards and the smaller value would denote which shard to run. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45583 Reviewed By: malfet Differential Revision: D24083469 Pulled By: janeyx99 fbshipit-source-id: 1777bd7822c95b3bf37079deff9381c6f8eaf4cc	2020-10-02 11:21:51 -07:00
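The `--shard` semantics described in the commit above (two integers, the larger being the number of shards and the smaller the shard to run) can be sketched like this. The helper name `select_shard` and the strided assignment of tests to shards are assumptions for illustration; the commit message does not say how tests are distributed.

```python
def select_shard(tests, x, y):
    """Return the tests belonging to one shard.

    Hypothetical helper: the larger of (x, y) is the number of shards and the
    smaller is the 1-based shard to run, as the commit message describes.
    Tests are assigned round-robin, which is an assumption of this sketch.
    """
    which_shard, num_shards = min(x, y), max(x, y)
    if which_shard < 1:
        raise ValueError("shard indices start at 1")
    # Every num_shards-th test, starting at this shard's offset.
    return tests[which_shard - 1::num_shards]


tests = ["test_autograd", "test_nn", "test_torch", "test_jit", "test_ops"]
print(select_shard(tests, 1, 2))
print(select_shard(tests, 2, 2))
```

Because the two values are distinguished by size rather than position, `select_shard(tests, 1, 2)` and `select_shard(tests, 2, 1)` pick the same shard.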
Thomas Viehmann	22a34bcf4e	ROCm ❤ TensorExpr (#45506 ) Summary: This might be an alternative to reverting https://github.com/pytorch/pytorch/issues/45396 . The obvious rough edge is that I'm not really seeing the work group limits that TensorExpr produces. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45506 Reviewed By: zhangguanheng66 Differential Revision: D23991410 Pulled By: Krovatkin fbshipit-source-id: 11d3fc4600e4bffb1d1192c6b8dd2fe22c1e064e	2020-09-29 16:52:16 -07:00
gunandrose4u	f07ac6a004	Fix Windows build failure after DDP PR merged (#45335 ) Summary: Fixes #{issue number} This is resubmit for PR https://github.com/pytorch/pytorch/issues/42897 . Together with fix for Windows build issue introduced by PR https://github.com/pytorch/pytorch/issues/44344 . Pull Request resolved: https://github.com/pytorch/pytorch/pull/45335 Reviewed By: zou3519 Differential Revision: D23931471 Pulled By: mrshenli fbshipit-source-id: f49b5a114944c1450b32934b3292170be064f494	2020-09-25 12:37:50 -07:00
Mike Ruberry	95df8657c9	Enables test linalg (#45278 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/45271. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45278 Reviewed By: ngimel Differential Revision: D23926124 Pulled By: mruberry fbshipit-source-id: 26692597f9a1988e5fa846f97b8430c3689cac27	2020-09-24 23:09:38 -07:00
Mike Ruberry	103fa3894a	Revert D23841786: [pytorch][PR] Enable distributed package on windows, Gloo backend supported only Test Plan: revert-hammer Differential Revision: D23841786 (`0122299f9b`) Original commit changeset: 334ba1ed73ef fbshipit-source-id: ec95432f9957df56a5a04e52661f5db920b7f57f	2020-09-24 22:44:33 -07:00
gunandrose4u	0122299f9b	Enable distributed package on windows, Gloo backend supported only (#42897 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/42095 For test case part will be committed to this PR later mrshenli, please help to review Pull Request resolved: https://github.com/pytorch/pytorch/pull/42897 Reviewed By: osalpekar Differential Revision: D23841786 Pulled By: mrshenli fbshipit-source-id: 334ba1ed73eff2f668857390fc32d1bc7f08e5f3	2020-09-24 21:13:55 -07:00
Zachary DeVito	cb75addee4	torch.package - a way to package models and code (#45015 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45015 torch.package allows you to write packages of code, pickled python data, and arbitrary binary and text resources into a self-contained package. torch.package.PackageExporter writes the packages and torch.package.PackageImporter reads them. The importers can load this code in a hermetic way, such that code is loaded from the package rather than the normal python import system. This allows for the packaging of PyTorch model code and data so that it can be run on a server or used in the future for transfer learning. The code contained in packages is copied file-by-file from the original source when it is created, and the file format is a specially organized zip file. Future users of the package can unzip the package, and edit the code in order to perform custom modifications to it. The importer for packages ensures that code in the module can only be loaded from within the package, except for modules explicitly listed as external using :method:`extern_module`. The file `extern_modules` in the zip archive lists all the modules that a package externally depends on. This prevents "implicit" dependencies where the package runs locally because it is importing a locally-installed package, but then fails when the package is copied to another machine. Test Plan: Imported from OSS Reviewed By: SplitInfinity Differential Revision: D23824337 Pulled By: zdevito fbshipit-source-id: 1247c34ba9b656f9db68a83e31f2a0fbe3bea6bd	2020-09-22 21:21:21 -07:00
Richard Zou	07cba8b1fc	Run vmap tests in CI (#44656 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44656 All this time, test_vmap wasn't running in the CI. Fortunately all the tests pass locally for me. h/t to anjali411 for pointing this out. Test Plan: - Wait for CI Reviewed By: anjali411 Differential Revision: D23689355 Pulled By: zou3519 fbshipit-source-id: 543c3e6aed0af77bfd6ea7a7549337f8230e3d32	2020-09-15 10:59:00 -07:00
Nikita Shulga	fc51047af5	Small fixes in Dependency.cmake and run_test.py (#44414 ) Summary: Do not add gencode flags to NVCC_FLAGS twice: First time they are added in `cmake/public/cuda.cmake` no need to do it again in `cmake/Dependencies.cmake` Copy `additional_unittest_args` before appending local options to it in `run_test()` method Pull Request resolved: https://github.com/pytorch/pytorch/pull/44414 Reviewed By: seemethere Differential Revision: D23605733 Pulled By: malfet fbshipit-source-id: 782a0da61650356a978a892fb03c66cb1a1ea26b	2020-09-09 15:09:33 -07:00
Rohan Varma	106459acac	Rename test_distributed to test_distributed_fork (#42932 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42932 Follow up from https://github.com/pytorch/pytorch/pull/41769, rename `test_distributed` to `test_distributed_fork` to make it explicit that it forks. New command to run test: `python test/run_test.py -i distributed/test_distributed_fork -v` ghstack-source-id: 111632568 Test Plan: `python test/run_test.py -i distributed/test_distributed_fork -v` Reviewed By: izdeby Differential Revision: D23072201 fbshipit-source-id: 48581688b6c5193a309e803c3de38e70be980872	2020-09-08 23:13:37 -07:00
Rohan Varma	b22abbe381	Enable test_distributed to work with spawn mode (#41769 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41769 Currently the tests in `test_distributed` only work with the `fork` mode multiprocessing, this PR introduces support for `spawn` mode multiprocessing as well (while keeping the `fork` mode intact). Motivations for the change: 1) Spawn multiprocessing is the default on MacOS, so it better emulates how MacOS users would use distributed 2) With python 3.8+, spawn is the default on linux, so we should have test coverage for this 3) PT multiprocessing suggests using spawn/forkserver over fork, for sharing cuda tensors: https://pytorch.org/docs/stable/multiprocessing.html 4) Spawn is better supported with respect to certain sanitizers such as TSAN, so adding this sanitizer coverage may help us uncover issues. How it is done: 1) Move `test_distributed` tests in `_DistTestBase` class to a shared file `distributed_test` (similar to how the RPC tests are structured) 2) For `Barrier`, refactor the setup of temp directories, as the current version did not work with spawn, each process would get a different randomly generated directory and thus would write to different barriers. 3) Add all the relevant builds to run internally and in OSS. Running test_distributed with spawn mode in OSS can be done with: `python test/run_test.py -i distributed/test_distributed_spawn -v` Reviewed By: izdeby Differential Revision: D22408023 fbshipit-source-id: e206be16961fd80438f995e221f18139d7e6d2a9	2020-09-08 23:11:12 -07:00
Mike Ruberry	665feda15b	Adds opinfo-based autograd tests and (un)supported dtype tests (#43451 ) Summary: This PR adds a new test suite, test_ops.py, designed for generic tests across all operators with OpInfos. It currently has two kinds of tests: - it validates that the OpInfo has the correct supported dtypes by verifying that unsupported dtypes throw an error and supported dtypes do not - it runs grad and gradgrad checks on each op and its variants (method and inplace) that has an OpInfo This is a significant expansion and simplification of the current autogenerated autograd tests, which spend considerable time processing their inputs. As an alternative, this PR extends OpInfos with "SampleInputs" that are much easier to use. These sample inputs are analogous to the existing tuples in `method_tests()`. Future PRs will extend OpInfo-based testing to other uses of `method_tests()`, like test_jit.py, to ensure that new operator tests can be implemented entirely using an OpInfo. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43451 Reviewed By: albanD Differential Revision: D23481723 Pulled By: mruberry fbshipit-source-id: 0c2cdeacc1fdaaf8c69bcd060d623fa3db3d6459	2020-09-03 02:50:48 -07:00
Sinan Nasir	1a79d7bb28	DDP communication hook examples (#43310 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43310 In this diff, we prepared some example DDP communication hooks [#40848](https://github.com/pytorch/pytorch/pull/40848):

1. `allreduce_hook`: This DDP communication hook just calls ``allreduce`` using ``GradBucket`` tensors. Once gradient tensors are aggregated across all workers, its ``then`` callback takes the mean and returns the result. If the user registers this hook, DDP results are expected to be the same as when no hook is registered. Hence, this won't change the behavior of DDP, and users can use it as a reference or modify it to log useful information, or for any other purpose, without affecting DDP behavior.
2. `allgather_then_aggregate_hook`: Similar to ``allreduce_hook``, this hook first gathers ``GradBucket`` tensors, and its ``then`` callback aggregates the gathered gradient tensors and takes the mean. Instead of ``allreduce`` this hook uses ``allgather``. Note that with W workers, both the computation and communication time scale as O(W) for allgather compared to O(logW) for allreduce. Therefore, this hook is expected to be much slower than ``allreduce_hook`` although both essentially do the same thing with the gradients.
3. `fp16_compress_hook`: This DDP communication hook implements a simple gradient compression approach that converts ``GradBucket`` tensors whose type is assumed to be ``torch.float32`` to half-precision floating point format (``torch.float16``). It allreduces those ``float16`` gradient tensors. Once compressed gradient tensors are allreduced, its ``then`` callback, called ``decompress``, converts the aggregated result back to ``float32`` and takes the mean.
4. `quantization_pertensor_hook` does quantization per tensor and uses the idea in https://pytorch.org/docs/master/generated/torch.quantize_per_tensor.html. Note that we separately send scale and zero_point (two floats per rank) before quantized tensors.
5. `quantization_perchannel_hook` does quantization per channel similar to https://pytorch.org/docs/master/generated/torch.quantize_per_channel.html. The main motivation is that after the initial QSGD study diff, we realized that for considerably large gradient tensors, such as a tensor that contains 6 million floats, dividing the tensor into smaller channels (512-float chunks) and quantizing each chunk independently may significantly increase the resolution and result in lower error.

ghstack-source-id: 110923269 Test Plan: python torch/distributed/algorithms/ddp_comm_hooks/test_ddp_hooks.py
Couldn't download test skip set, leaving all tests enabled...
.....
----------------------------------------------------------------------
Ran 4 tests in 26.724s
OK
Internal testing:
```
buck run mode/dev-nosan //caffe2/test/distributed/algorithms/ddp_comm_hooks:test_ddp_hooks
```
Reviewed By: malfet Differential Revision: D22937999 fbshipit-source-id: 274452e7932414570999cb978ae77a97eb3fb0ec	2020-08-28 18:59:14 -07:00
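The motivation given for `quantization_perchannel_hook` in the commit above, that quantizing small chunks independently can lower the error, can be illustrated without PyTorch. The sketch below is a simplified stand-in, not the hook's code: it uses a plain min/max affine quantizer to 256 levels, stores the chunk minimum instead of a `zero_point`, and the helper names and example data are assumptions made for illustration.

```python
def quantize(values, levels=256):
    """Affine-quantize floats to integer codes in [0, levels - 1]."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (levels - 1) or 1.0  # avoid a zero scale for constant input
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    return [lo + scale * c for c in codes]


def roundtrip(values, chunk_size):
    """Quantize and restore values, treating each chunk independently."""
    restored = []
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        restored.extend(dequantize(*quantize(chunk)))
    return restored


def max_abs_error(original, restored):
    return max(abs(a - b) for a, b in zip(original, restored))


# Hypothetical gradient: 512 small entries followed by 512 large ones.
small = [0.001 * i for i in range(512)]
large = [100.0 * i for i in range(512)]
grads = small + large

per_tensor = roundtrip(grads, chunk_size=len(grads))  # one scale for everything
per_chunk = roundtrip(grads, chunk_size=512)          # one scale per 512 floats

err_tensor = max_abs_error(small, per_tensor[:512])
err_chunk = max_abs_error(small, per_chunk[:512])
print(err_tensor, err_chunk)
```

With a single scale for the whole tensor, the large entries dominate the range and the small entries all collapse to the same code. With a scale per 512-float chunk, the small entries get their own range and are restored far more accurately.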
Nikita Shulga	1bda5e480c	Add Python code coverage (#43600 ) Summary: Replace `test` with `coverage_test` stage for `pytorch-linux-bionic-py3.8-gcc9` configuration Add `coverage.xml` to the list of ignored files Add `codecov.yml` that maps installed pytorch folders back to original locations Cleanup coverage option utilization in `run_test.py` and adapt it towards combining coverage reports across the runs Pull Request resolved: https://github.com/pytorch/pytorch/pull/43600 Reviewed By: seemethere Differential Revision: D23351877 Pulled By: malfet fbshipit-source-id: acf78ae4c8f3e23920a76cce1d50f2821b83eb06	2020-08-26 16:16:03 -07:00
albanD	e08e93f946	Reland of benchmark code (#43428 ) Summary: Reland of the benchmark code that broke the slow tests because the GPU were running out of memory Pull Request resolved: https://github.com/pytorch/pytorch/pull/43428 Reviewed By: ngimel Differential Revision: D23296136 Pulled By: albanD fbshipit-source-id: 0002ae23dc82f401604e33d0905d6b9eedebc851	2020-08-24 13:27:26 -07:00
Mike Ruberry	4dc8f3be8c	Creates test_tensor_creation_ops.py test suite (#43104 ) Summary: As part of our continued refactoring of test_torch.py, this takes tests for tensor creation ops like torch.eye, torch.randint, and torch.ones_like and puts them in test_tensor_creation_ops.py. There are three test classes in the new test suite: TestTensorCreation, TestRandomTensorCreation, TestLikeTensorCreation. TestViewOps and tests for construction of tensors from NumPy arrays have been left in test_torch.py. These might be refactored separately into test_view_ops.py and test_numpy_interop.py in the future. Most of the tests ported from test_torch.py were left as is or received a signature change to make them nominally "device generic." Future work will need to review test coverage and update the tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43104 Reviewed By: ngimel Differential Revision: D23280358 Pulled By: mruberry fbshipit-source-id: 469325dd1a734509dd478cc7fe0413e276ffb192	2020-08-22 23:18:54 -07:00
Alban Desmaison	74781ab5b8	Revert D23242101: [pytorch][PR] Implement first draft of autograd benchmark. Test Plan: revert-hammer Differential Revision: D23242101 (`c2511bdfa4`) Original commit changeset: a2b92d5a4341 fbshipit-source-id: bda562d15565f074b448022d180ec8f959c6ecc9	2020-08-21 12:22:57 -07:00
albanD	c2511bdfa4	Implement first draft of autograd benchmark. (#40586 ) Summary: It is quite a lot of code because I pulled some code from torchaudio and torchvision to avoid issues I had getting their latest versions to work with PyTorch built from source, since I can't build those libs from source (dependency missing for torchaudio). The compare script generates a table as follows:

| model | task | speedup | mean (before) | var (before) | mean (after) | var (after) |
| -- | -- | -- | -- | -- | -- | -- |
| resnet18 | vjp | 1.021151844124464 | 1.5627719163894653 | 0.005164200905710459 | 1.5304011106491089 | 0.003979875706136227 |
| resnet18 | vhp | 0.9919114430761606 | 6.8089728355407715 | 0.019538333639502525 | 6.86449670791626 | 0.014775685034692287 |
| resnet18 | jvp | 0.9715963084255123 | 5.720699310302734 | 0.08197150379419327 | 5.887938499450684 | 0.018408503383398056 |
| ppl_simple_reg | vjp | 0.9529183269165618 | 0.000362396240234375 | 7.526952949810095e-10 | 0.00038030146970413625 | 7.726220357939795e-11 |
| ppl_simple_reg | vhp | 0.9317708619586977 | 0.00048058031825348735 | 5.035701855504726e-10 | 0.0005157709238119423 | 3.250243477137538e-11 |
| ppl_simple_reg | jvp | 0.8609755877018406 | 0.00045447348384186625 | 9.646707044286273e-11 | 0.0005278587341308594 | 1.4493808930815533e-10 |
| ppl_simple_reg | hvp | 0.9764100147808232 | 0.0005881547695025802 | 7.618464747949361e-10 | 0.0006023645401000977 | 6.370915461850757e-10 |
| ppl_simple_reg | jacobian | 1.0019173715134297 | 0.0003612995205912739 | 2.2979899233499523e-11 | 0.0003606081008911133 | 1.2609764794835332e-11 |
| ppl_simple_reg | hessian | 1.0358429970264393 | 0.00206911563873291 | 2.590938796842579e-09 | 0.0019975185859948397 | 2.8916853356264482e-09 |
| ppl_robust_reg | vjp | 1.0669910916521521 | 0.0017304659122601151 | 3.1047047155396967e-09 | 0.0016218185191974044 | 4.926861585374809e-09 |
| ppl_robust_reg | vhp | 1.0181130455462972 | 0.0029563189018517733 | 2.6359153082466946e-08 | 0.0029037236236035824 | 1.020585038702393e-08 |
| ppl_robust_reg | jvp | 0.9818360373406179 | 0.0026934861671179533 | 6.981357714153091e-09 | 0.00274331565015018 | 3.589908459389335e-08 |
| ppl_robust_reg | hvp | 1.0270848910527002 | 0.005576515104621649 | 3.2798087801211295e-08 | 0.005429458804428577 | 6.438724398094564e-08 |
| ppl_robust_reg | jacobian | 1.0543611284155785 | 0.00167675013653934 | 2.3236829349571053e-08 | 0.001590299652889371 | 1.2011492245278532e-08 |
| ppl_robust_reg | hessian | 1.0535378727082656 | 0.01643357239663601 | 1.8450685956850066e-06 | 0.015598463825881481 | 2.1876705602608126e-07 |
| wav2letter | vjp | 1.0060408105086573 | 0.3516994118690491 | 1.4463969819189515e-05 | 0.349587619304657 | 9.897866402752697e-05 |
| wav2letter | vhp | 0.9873655295086051 | 1.1196287870407104 | 0.00474404776468873 | 1.133955717086792 | 0.009759620763361454 |
| wav2letter | jvp | 0.9741820317882822 | 0.7888165712356567 | 0.0017476462526246905 | 0.8097219467163086 | 0.0018235758179798722 |
| transfo | vjp | 0.9883954031921641 | 2.8865864276885986 | 0.008410997688770294 | 2.9204773902893066 | 0.006901870481669903 |
| transfo | vhp | 1.0111290842971339 | 8.374398231506348 | 0.014904373325407505 | 8.282224655151367 | 0.04449500888586044 |
| transfo | jvp | 1.0080534543381963 | 6.293097972869873 | 0.03796082362532616 | 6.24282169342041 | 0.010179692879319191 |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/40586 Reviewed By: pbelevich Differential Revision: D23242101 Pulled By: albanD fbshipit-source-id: a2b92d5a4341fe1472711a685ca425ec257d6384	2020-08-21 07:36:26 -07:00
Mike Ruberry	e2eb0cb1a9	Adds arccosh alias for acosh and adds an alias consistency test (#43107 ) Summary: This adds the torch.arccosh alias and updates alias testing to validate the consistency of the aliased and original operations. The alias testing is also updated to run on CPU and CUDA, which revealed a memory leak when tracing (see https://github.com/pytorch/pytorch/issues/43119). Pull Request resolved: https://github.com/pytorch/pytorch/pull/43107 Reviewed By: ngimel Differential Revision: D23156472 Pulled By: mruberry fbshipit-source-id: 6155fac7954fcc49b95e7c72ed917c85e0eabfcd	2020-08-16 22:12:25 -07:00
Mike Ruberry	bee174dc3f	Adds linalg.det alias, fixes outer alias, updates alias testing (#42802 ) Summary: This PR: - updates test_op_normalization.py, which verifies that aliases are correctly translated in the JIT - adds torch.linalg.det as an alias for torch.det - moves the torch.linalg.outer alias to torch.outer (to be consistent with NumPy) The torch.linalg.outer alias was put in the linalg namespace erroneously as a placeholder since it's a "linear algebra op" according to NumPy but is actually still in the main NumPy namespace. The updates to test_op_normalization are necessary. Previously it was using method_tests to generate tests, and method_tests assumes test suites using it also use the device generic framework, which test_op_normalization did not. For example, some ops require decorators like `skipCPUIfNoLapack`, which only works in device generic test classes. Moving test_op_normalization to the device generic framework also lets these tests run on CPU and CUDA. Continued reliance on method_tests() is excessive since the test suite is only interested in testing aliasing, and a simpler and more readable `AliasInfo` class is used for the required information. An example impedance mismatch between method_tests and the new tests was how to handle ops in namespaces like torch.linalg.det. In the future this information will likely be folded into a common 'OpInfo' registry in the test suite. The actual tests performed are similar to what they were previously: a scripted and traced version of the op is run and the test verifies that both graphs do not contain the alias name and do contain the aliased name. The guidance for adding an alias has been updated accordingly. 
cc mattip Note: ngimel suggests: - deprecating and then removing the `torch.ger` name - reviewing the implementation of `torch.outer` Pull Request resolved: https://github.com/pytorch/pytorch/pull/42802 Reviewed By: zou3519 Differential Revision: D23059883 Pulled By: mruberry fbshipit-source-id: 11321c2a7fb283a6e7c0d8899849ad7476be42d1	2020-08-11 21:48:31 -07:00
Mike Ruberry	4bafca1a69	Adds list of operator-related information for testing (#41662 ) Summary: This PR adds: - an "OpInfo" class in common_method_invocations that can contain useful information about an operator, like what dtypes it supports - a more specialized "UnaryUfuncInfo" class designed to help test the unary ufuncs - the `ops` decorator, which can generate test variants from lists of OpInfos - test_unary_ufuncs.py, a new test suite stub that shows how the `ops` decorator and operator information can be used to improve the thoroughness of our testing The single test in test_unary_ufuncs.py simply ensures that the dtypes associated with a unary ufunc operator in its OpInfo entry are correct. Writing a test like this previously, however, would have required manually constructing test-specific operator information and writing a custom test generator. The `ops` decorator and a common place to put operator information make writing tests like this easier and allows what would have been test-specific information to be reused. The `ops` decorator extends and composes with the existing device generic test framework, allowing its decorators to be reused. For example, the `onlyOnCPUAndCUDA` decorator works with the new `ops` decorator. This should keep the tests readable and consistent. Future PRs will likely: - continue refactoring the too large test_torch.py into more verticals (unary ufuncs, binary ufuncs, reductions...) - add more operator information to common_method_invocations.py - refactor tests for unary ufuncs into test_unary_ufunc Examples of possible future extensions are [here](`616747e50d`), where an example unary ufunc test is added, and [here](`d0b624f110`), where example autograd tests are added. Both tests leverage the operator info in common_method_invocations to simplify testing. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41662 Reviewed By: ngimel Differential Revision: D23048416 Pulled By: mruberry fbshipit-source-id: ecce279ac8767f742150d45854404921a6855f2c	2020-08-11 11:34:53 -07:00
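The idea of generating test variants from a list of operator records, as the commit above describes, can be sketched in plain Python. This is an illustration only and not PyTorch's implementation: the `OpInfo` fields, the `ops` decorator body, and the `instantiate_op_tests` helper are all hypothetical, and Python's `math` functions and built-in numeric types stand in for torch operators and dtypes.

```python
import math
import unittest
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class OpInfo:
    """Hypothetical operator record: a name, a callable, supported types."""
    name: str
    op: Callable
    dtypes: Tuple[type, ...]


def ops(op_list):
    """Mark a test template so that one variant is generated per OpInfo."""
    def decorator(fn):
        fn.op_list = tuple(op_list)
        return fn
    return decorator


def instantiate_op_tests(cls):
    """Replace each marked template with one concrete test per operator."""
    for attr, fn in list(vars(cls).items()):
        op_list = getattr(fn, "op_list", None)
        if op_list is None:
            continue
        delattr(cls, attr)  # the template itself must not be collected
        for info in op_list:
            # Default arguments bind fn and info at definition time.
            def variant(self, fn=fn, info=info):
                fn(self, info)
            variant.__name__ = f"{attr}_{info.name}"
            setattr(cls, variant.__name__, variant)
    return cls


UNARY_OPS = [
    OpInfo("sqrt", math.sqrt, (float, int)),
    OpInfo("floor", math.floor, (float, int)),
]


@instantiate_op_tests
class TestUnary(unittest.TestCase):
    @ops(UNARY_OPS)
    def test_supported_dtypes(self, info):
        # Every type listed for the operator must be accepted by it.
        for dtype in info.dtypes:
            info.op(dtype(4))
```

Keeping the operator records in one shared list is what lets a new test reuse information that would otherwise be rebuilt inside each test.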
James Reed	575e7497f6	Introduce experimental FX library (#42741 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42741 Test Plan: Imported from OSS Reviewed By: dzhulgakov Differential Revision: D23006383 Pulled By: jamesr66a fbshipit-source-id: 6cb6d921981fcae47a07df581ffcf900fb8a7fe8	2020-08-11 10:01:47 -07:00
Luca Wehrstedt	935fcc9580	[RPC tests] Merge process group tests into single entry point (#40818 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40818 Summary of the entire stack: -- This diff is part of an attempt to refactor the RPC tests. They currently suffer from several problems: - Several ways to specify the agent to use: there exists one "generic" fixture that uses the global variable TEST_CONFIG to look up the agent name, and is used for process group and Thrift, and then there are separate fixtures for the flaky agent and the TensorPipe one. - These two ways lead to having two separate decorators (`requires_process_group_agent` and `@_skip_if_tensorpipe_agent`) which must both be specified, making it unclear what the effect of each of them is and what happens if only one is given. - Thrift must override the TEST_CONFIG global variable before any other import (in order for the `requires_process_group_agent` decorator to work correctly) and for that it must use a "trap" file, which makes it even harder to track which agent is being used, and which is specific to Buck, and thus cannot be used in OSS by other agents. - Even if the TensorPipe fixture doesn't use TEST_CONFIG, it still needs to set it to the right value for other parts of the code to work. (This is done in `dist_init`). - There are a few functions in dist_utils.py that return some properties of the agent (e.g., a regexp to match against the error it returns in case of shutdown). These functions are effectively chained if/elses on the various agents, which has the effect of "leaking" some part of the Thrift agent into OSS. - Each test suite (RPC, dist autograd/dist optimizer, their JIT versions, remote module, ...) must be run on each agent (or almost; the faulty one is an exception) in both fork and spawn mode. Each of these combinations is a separate file, which leads to a proliferation of scripts. 
- There is no "master list" of what combinations make sense and should be run. Therefore it has happened that when adding new tests or new agents we forgot to enroll them into the right tests. (TensorPipe is still missing a few tests, it turns out). - All of these tiny "entry point" files contain almost the same duplicated boilerplate. This makes it very easy to get the wrong content into one of them due to a bad copy-paste. This refactoring aims to address these problems by: - Avoiding global state, defaults/override, traps, if/elses, ... and have a single way to specify the agent, based on an abstract base class and several concrete subclasses which can be "mixed in" to any test suite. - Instead of enabling/disabling tests using decorators, the tests that are specific to a certain agent are now in a separate class (which is a subclass of the "generic" test suite) so that they are only picked up by the agent they apply to. - Instead of having one separate entry point script for each combination, it uses one entry point for each agent, and in that script it provides a list of all the test suites it wants to run on that agent. And it does that by trying to deduplicate the boilerplate as much as possible. (In fact, the various agent-suite combinations could be grouped in any way, not necessarily by agent as I did here). It provides further advantages: - It puts all the agents on equal standing, by not having any of them be the default, making it thus easier to migrate from process group to TensorPipe. - It will make it easier to add more versions of the TensorPipe tests (e.g., one that disables the same-machine backends in order to test the TCP-based ones) without a further duplication of entry points, of boilerplate, ... Summary of this commit -- This diff does the changes described above for the process group agent. It defines a fixture for it (instead of using the generic fixture in its default behavior) and then merges all the entry points into a single script. 
Note that after this change there will no longer be a "vanilla" RPC test: all test scripts now specify what agent they are using. This puts all agents on equal standing. ghstack-source-id: 109229474 Test Plan: Sandcastle and CircleCI Reviewed By: pritamdamania87 Differential Revision: D22283182 fbshipit-source-id: 7e3626bbbf37d88b892077a03725f0598576b370	2020-08-05 15:10:07 -07:00
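The structure described in the commit above (an abstract fixture, concrete per-agent fixtures that are mixed into generic suites, and one entry point per agent listing its suites) can be sketched as follows. All class names, the backend strings, and the `make_test_classes` helper are hypothetical; the sketch shows the mixin pattern only and does not reproduce the actual test code.

```python
import abc
import unittest

KNOWN_BACKENDS = {"PROCESS_GROUP", "TENSORPIPE"}


class RpcAgentTestFixture(abc.ABC):
    """Abstract fixture: the single way for a suite to learn its agent."""

    @property
    @abc.abstractmethod
    def rpc_backend_name(self) -> str:
        ...


class ProcessGroupFixture(RpcAgentTestFixture):
    @property
    def rpc_backend_name(self) -> str:
        return "PROCESS_GROUP"


class TensorPipeFixture(RpcAgentTestFixture):
    @property
    def rpc_backend_name(self) -> str:
        return "TENSORPIPE"


# Generic suites depend only on the abstract fixture. They are not
# TestCase subclasses, so they are never collected on their own.
class RpcTest(RpcAgentTestFixture):
    def test_backend_is_known(self):
        self.assertIn(self.rpc_backend_name, KNOWN_BACKENDS)


class DistAutogradTest(RpcAgentTestFixture):
    def test_backend_is_str(self):
        self.assertIsInstance(self.rpc_backend_name, str)


def make_test_classes(fixture, suites, prefix):
    """Mix one concrete fixture into every listed suite."""
    return {
        f"{prefix}{suite.__name__}": type(
            f"{prefix}{suite.__name__}", (fixture, suite, unittest.TestCase), {}
        )
        for suite in suites
    }


# A per-agent entry point would contain little more than this list.
PROCESS_GROUP_TESTS = make_test_classes(
    ProcessGroupFixture, [RpcTest, DistAutogradTest], "ProcessGroup"
)
```

Because the fixture comes first in the base-class list, its concrete `rpc_backend_name` overrides the abstract one inherited through the suite, so no global configuration or decorator is needed to select the agent.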
Luca Wehrstedt	b93c7c54eb	[RPC tests] Merge tests for faulty agent into single script (#40817 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40817 Summary of the entire stack: -- This diff is part of an attempt to refactor the RPC tests. They currently suffer from several problems: - Several ways to specify the agent to use: there exists one "generic" fixture that uses the global variable TEST_CONFIG to look up the agent name, and is used for process group and Thrift, and then there are separate fixtures for the flaky agent and the TensorPipe one. - These two ways lead to having two separate decorators (`requires_process_group_agent` and `@_skip_if_tensorpipe_agent`) which must both be specified, making it unclear what the effect of each of them is and what happens if only one is given. - Thrift must override the TEST_CONFIG global variable before any other import (in order for the `requires_process_group_agent` decorator to work correctly) and for that it must use a "trap" file, which makes it even harder to track which agent is being used, and which is specific to Buck, and thus cannot be used in OSS by other agents. - Even if the TensorPipe fixture doesn't use TEST_CONFIG, it still needs to set it to the right value for other parts of the code to work. (This is done in `dist_init`). - There are a few functions in dist_utils.py that return some properties of the agent (e.g., a regexp to match against the error it returns in case of shutdown). These functions are effectively chained if/elses on the various agents, which has the effect of "leaking" some part of the Thrift agent into OSS. - Each test suite (RPC, dist autograd/dist optimizer, their JIT versions, remote module, ...) must be run on each agent (or almost; the faulty one is an exception) in both fork and spawn mode. Each of these combinations is a separate file, which leads to a proliferation of scripts. 
- There is no "master list" of what combinations make sense and should be run. Therefore it has happened that when adding new tests or new agents we forgot to enroll them into the right tests. (TensorPipe is still missing a few tests, it turns out). - All of these tiny "entry point" files contain almost the same duplicated boilerplate. This makes it very easy to get the wrong content into one of them due to a bad copy-paste. This refactoring aims to address these problems by: - Avoiding global state, defaults/override, traps, if/elses, ... and have a single way to specify the agent, based on an abstract base class and several concrete subclasses which can be "mixed in" to any test suite. - Instead of enabling/disabling tests using decorators, the tests that are specific to a certain agent are now in a separate class (which is a subclass of the "generic" test suite) so that they are only picked up by the agent they apply to. - Instead of having one separate entry point script for each combination, it uses one entry point for each agent, and in that script it provides a list of all the test suites it wants to run on that agent. And it does that by trying to deduplicate the boilerplate as much as possible. (In fact, the various agent-suite combinations could be grouped in any way, not necessarily by agent as I did here). It provides further advantages: - It puts all the agents on equal standing, by not having any of them be the default, making it thus easier to migrate from process group to TensorPipe. - It will make it easier to add more versions of the TensorPipe tests (e.g., one that disables the same-machine backends in order to test the TCP-based ones) without a further duplication of entry points, of boilerplate, ... Summary of this commit -- This diff does the changes described above for the faulty agent, which is its own strange beast. It merges all the test entry points (i.e., the combinations of agent, suite and fork/spawn) into a single file. 
It also modifies the test suites that are intended to be run only on the faulty agent, which used to inherit from its fixture, to inherit from the generic fixture, as they will be mixed in with the faulty fixture at the very end, inside the entry point script. ghstack-source-id: 109229477 Test Plan: Sandcastle and CircleCI Reviewed By: pritamdamania87 Differential Revision: D22283178 fbshipit-source-id: 72659efe6652dac8450473642a578933030f2c74	2020-08-05 15:10:04 -07:00
Luca Wehrstedt	edf6c4bc4d	[RPC tests] Merge TensorPipe tests into single entry point (#40816 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40816 Summary of the entire stack: -- This diff is part of an attempt to refactor the RPC tests. They currently suffer from several problems: - Several ways to specify the agent to use: there exists one "generic" fixture that uses the global variable TEST_CONFIG to look up the agent name, and is used for process group and Thrift, and then there are separate fixtures for the flaky agent and the TensorPipe one. - These two ways lead to having two separate decorators (`requires_process_group_agent` and `@_skip_if_tensorpipe_agent`) which must both be specified, making it unclear what the effect of each of them is and what happens if only one is given. - Thrift must override the TEST_CONFIG global variable before any other import (in order for the `requires_process_group_agent` decorator to work correctly) and for that it must use a "trap" file, which makes it even harder to track which agent is being used, and which is specific to Buck, and thus cannot be used in OSS by other agents. - Even if the TensorPipe fixture doesn't use TEST_CONFIG, it still needs to set it to the right value for other parts of the code to work. (This is done in `dist_init`). - There are a few functions in dist_utils.py that return some properties of the agent (e.g., a regexp to match against the error it returns in case of shutdown). These functions are effectively chained if/elses on the various agents, which has the effect of "leaking" some part of the Thrift agent into OSS. - Each test suite (RPC, dist autograd/dist optimizer, their JIT versions, remote module, ...) must be run on each agent (or almost; the faulty one is an exception) in both fork and spawn mode. Each of these combinations is a separate file, which leads to a proliferation of scripts. 
- There is no "master list" of what combinations make sense and should be run. Therefore it has happened that when adding new tests or new agents we forgot to enroll them into the right tests. (TensorPipe is still missing a few tests, it turns out). - All of these tiny "entry point" files contain almost the same duplicated boilerplate. This makes it very easy to get the wrong content into one of them due to a bad copy-paste. This refactoring aims to address these problems by: - Avoiding global state, defaults/override, traps, if/elses, ... and have a single way to specify the agent, based on an abstract base class and several concrete subclasses which can be "mixed in" to any test suite. - Instead of enabling/disabling tests using decorators, the tests that are specific to a certain agent are now in a separate class (which is a subclass of the "generic" test suite) so that they are only picked up by the agent they apply to. - Instead of having one separate entry point script for each combination, it uses one entry point for each agent, and in that script it provides a list of all the test suites it wants to run on that agent. And it does that by trying to deduplicate the boilerplate as much as possible. (In fact, the various agent-suite combinations could be grouped in any way, not necessarily by agent as I did here). It provides further advantages: - It puts all the agents on equal standing, by not having any of them be the default, making it thus easier to migrate from process group to TensorPipe. - It will make it easier to add more versions of the TensorPipe tests (e.g., one that disables the same-machine backends in order to test the TCP-based ones) without a further duplication of entry points, of boilerplate, ... Summary of this commit -- This diff does the changes described above for the TensorPipe agent. 
It fixes its fixture (making it inherit from the generic fixture) and merges all the entry point scripts into a single one, so that it's easier to have a clear overview of all the test suites which we run on TensorPipe (you'll notice that many are missing: the JIT ones, the remote module one, ...). ghstack-source-id: 109229476 Test Plan: Sandcastle and CircleCI Reviewed By: pritamdamania87 Differential Revision: D22283180 fbshipit-source-id: d5e9f9f4e6d4bfd6fbcae7ae56eed63d2567a02f	2020-08-05 15:08:32 -07:00
iurii zdebskyi	e995c3d21e	Add private API to support tensor lists: _foreach_add(TensorList tensors, Scalar scalar) (#41554 ) Summary: Initial PR for the Tensor List functionality. Motivation [GitHub issue](https://github.com/pytorch/pytorch/issues/38655) Current PyTorch optimizer implementations are not efficient in cases when we work with a lot of small feature tensors. Starting a lot of kernels slows down the whole process. We need to reduce the number of kernels that we start. As an example, we should be looking at [NVIDIA's Apex](https://github.com/NVIDIA/apex). In order to track progress, we will pick PyTorch's DCGAN model with the Adam optimizer and, once the optimizer is reimplemented with tensor lists, benchmark the model performance against the original model version, Apex's version with the original Adam optimizer, and its FusedAdam optimizer. In this PR - Adding `multi_tensor_apply` mechanism which will help to efficiently apply passed functor on a given list of tensors on CUDA. - Adding a first private API - `std::vector<Tensor> _foreach_add(TensorList tensors, Scalar scalar)` Tests Tested via unit tests Plan for the next PRs 1. Cover these ops with `multi_tensor_apply` support - exponent - division - mul_ - add_ - addcmul_ - addcdiv_ - Sqrt 2. Rewrite PyTorch optimizers to use for-each operators in order to get performance gains. Pull Request resolved: https://github.com/pytorch/pytorch/pull/41554 Reviewed By: cpuhrsch Differential Revision: D22829724 Pulled By: izdeby fbshipit-source-id: 47febdbf7845cf931958a638567b7428a24782b1	2020-08-04 15:01:09 -07:00
Mike Ruberry	4b6e5f42a4	Creates spectral ops test suite (#42157 ) Summary: In preparation for creating the new torch.fft namespace and NumPy-like fft functions, as well as supporting our goal of refactoring and reducing the size of test_torch.py, this PR creates a test suite for our spectral ops. The existing spectral op tests from test_torch.py and test_cuda.py are moved to test_spectral_ops.py and updated to run under the device generic test framework. Pull Request resolved: https://github.com/pytorch/pytorch/pull/42157 Reviewed By: albanD Differential Revision: D22811096 Pulled By: mruberry fbshipit-source-id: e5c50f0016ea6bb8b093cd6df2dbcef6db9bb6b6	2020-07-29 11:36:18 -07:00
Alexander Grund	86492410bc	Don't run tests with custom arguments with pytest (#41397 ) Summary: This patch basically removes the `-m pytest` parameters when `extra_unittest_args` is used (e.g. `--subprocess`) Fixes https://github.com/pytorch/pytorch/issues/41393 Pull Request resolved: https://github.com/pytorch/pytorch/pull/41397 Reviewed By: pbelevich Differential Revision: D22792133 Pulled By: ezyang fbshipit-source-id: 29930d703666f4ecc0d727356bbab4a5f7ed4860	2020-07-28 08:17:36 -07:00
Noman Arshad	1a8269a566	Replace blacklist with blocklist in test/run_test.py file. (#42011 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/41716 test/run_test.py file updated with an appropriate replacement for blacklist and whitelist. Pull Request resolved: https://github.com/pytorch/pytorch/pull/42011 Reviewed By: pbelevich Differential Revision: D22791836 Pulled By: malfet fbshipit-source-id: 8139649c5b70c876b711e25c33f3051ea8461063	2020-07-28 07:56:01 -07:00
Eli Uriegas	f71cccc457	test: Add option to continue testing through error (#41136 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41136 Running this within CI seems impossible since this script exits out after one failed test, so let's just add an option that CI can use to power through these errors. Should not affect current functionality. Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Test Plan: Imported from OSS Differential Revision: D22441694 Pulled By: seemethere fbshipit-source-id: 7f152fea15af9d47a964062ad43830818de5a109	2020-07-08 17:26:13 -07:00
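The behavior described in the commit above, where the runner stops at the first failing test unless told to keep going, can be sketched as below. The function names, the `run_one` callback, and the exact flag spelling are assumptions made for illustration and are not taken from `test/run_test.py`.

```python
import argparse


def parse_args(argv=None):
    """Parse a hypothetical opt-in flag; the default keeps fail-fast behavior."""
    parser = argparse.ArgumentParser(description="Sketch of a test runner")
    parser.add_argument(
        "--continue-through-error",
        action="store_true",
        help="run the remaining tests even after one of them fails",
    )
    return parser.parse_args(argv)


def run_all(tests, run_one, continue_through_error=False):
    """Run tests in order and return the names of the failing ones.

    run_one(name) is assumed to return a process-style exit code,
    where 0 means success.
    """
    failures = []
    for name in tests:
        if run_one(name) != 0:
            failures.append(name)
            if not continue_through_error:
                break  # original behavior: stop at the first failure
    return failures
```

With the flag off, the returned list holds at most one name, which matches the commit's note that current functionality should not be affected.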
David Reiss	5e03a1e926	Add support for int[]? arguments in native_functions.yaml (#37174 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37174 ghstack-source-id: 106938112 Test Plan: Upcoming diffs use this for upsampling. Differential Revision: D21210002 fbshipit-source-id: d6a55ab6420c05a92873a569221b613149aa0daa	2020-07-07 13:52:20 -07:00
Christian Sarofeen	b9b4f05abf	[nvFuser] Working towards reductions, codegen improvements (#40864 ) Summary: Have basic reduction fusion working, and have improved code generator to approach performance of eager mode reductions. Coming soon will be pointwise-reduction fusions in a way that should prevent the possibility of hitting regressions. Also working on performant softmax kernels in the code generator which may be our next fusion target. Pull Request resolved: https://github.com/pytorch/pytorch/pull/40864 Reviewed By: ngimel Differential Revision: D22392877 Pulled By: soumith fbshipit-source-id: 457448a807d628b1035f6d90bc0abe8a87bf8447	2020-07-06 14:52:49 -07:00
Jeff Daily	ac8c8b028d	[ROCm] restore jit tests (#40447 ) Summary: Remove `skipIfRocm` from most jit tests and enable `RUN_CUDA_HALF` tests for ROCm. These changes passed more than three rounds of CI testing against the ROCm CI. CC ezyang xw285cornell sunway513 Pull Request resolved: https://github.com/pytorch/pytorch/pull/40447 Differential Revision: D22190711 Pulled By: xw285cornell fbshipit-source-id: bac44825a2675d247b3abe2ec2f80420a95348a3	2020-06-27 01:03:59 -07:00
Ilia Cherniavskii	d8c384544e	Destroy CUDA events after profiling (#39962 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39962 Adding a simple wrapper with ref count for cuda event and destroying cuda event after the last copy is destroyed Test Plan: CI cuda profiler tests Differential Revision: D22027092 Pulled By: ilia-cher fbshipit-source-id: e0810388aa60b2291eb010896e13af1fad92e472	2020-06-23 10:44:39 -07:00
Pritam Damania	e632bf8d57	Add thrift and tensorpipe backend tests for test_ddp_under_dist_autograd. (#40210 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40210 ghstack-source-id: 106300839 Test Plan: waitforbuildbot Differential Revision: D22110065 fbshipit-source-id: d9ebd009b8d451c75708eadc7eb3f2b788e875aa	2020-06-20 22:59:59 -07:00
Ivan Kobzarev	3852215170	[vulkan] jit passes for vulkan conv2 prepack and fuse with clamp (#39282 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39282 Test Plan: Imported from OSS Differential Revision: D21962424 Pulled By: IvanKobzarev fbshipit-source-id: 2d20e827d2c3836b7e6b443293377c68dc1ffa5a	2020-06-20 14:12:21 -07:00
Jeff Daily	89ef8f8141	add test_openmp to ROCM_BLACKLIST (#40204 ) Summary: This test is flaky for rocm platform. Add to blacklist until it can be further reviewed. CC ezyang xw285cornell sunway513 Pull Request resolved: https://github.com/pytorch/pytorch/pull/40204 Differential Revision: D22108295 Pulled By: xw285cornell fbshipit-source-id: 802444a7b41260edcb6ce393237784f3e6c52a74	2020-06-18 15:15:35 -07:00
Shihao Xu	00651b8c93	[distribtued.nn] Implement TorchScript-compatible RemoteModule API (#37139 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37139 See design doc in https://github.com/pytorch/pytorch/issues/37136 ghstack-source-id: 105926270 Test Plan: TODO: - Make the generated Interface usable. https://github.com/pytorch/pytorch/pull/37139#discussion_r434190978 - - Avoid generating the same template instances for Module that is not scriptable. - Remove "infer_module_interface_cls". - Use Python format instead of a CodeTemplate - Use Python tempfile to track and delete file. Does it work if there is crash. ``` buck test mode/dev-nosan //caffe2/test/distributed/nn/jit:test_instantiator buck build mode/dev-nosan //caffe2/test/distributed/nn/jit:test_instantiator && \ buck-out/gen/caffe2/test/distributed/nn/jit/test_instantiator\#binary.par -r test_instantiate_scripted_remote_module_template buck build mode/dev-nosan //caffe2/test/distributed/nn/jit:test_instantiator && \ buck-out/gen/caffe2/test/distributed/nn/jit/test_instantiator\#binary.par -r test_instantiate_non_scripted_remote_module_template ``` ``` buck test mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_spawn ``` ``` buck test mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \ buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_user_provided_global_unique_name buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \ buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_forward_async_script buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \ buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_forward_sync_script buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \ 
buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_forward_with_kwargs buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \ buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_user_provided_global_unique_name ``` ``` buck test mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork ``` buck test mode/opt-asan //caffe2/test:jit -- 'test_script_forward_method_replacement buck build mode/dev-nosan //caffe2/test:jit && \ buck-out/gen/caffe2/test/jit\#binary.par -r 'test_script_forward_method_replacement' buck build mode/dev-nosan //caffe2/test:jit && \ buck-out/gen/caffe2/test/jit\#binary.par -r 'test_imported_classes' Differential Revision: D20499658 fbshipit-source-id: dd9383ae4eb2343366c11127664f845b91ca3b0a	2020-06-15 19:07:35 -07:00
Ilia Cherniavskii	cc3fc786b7	[resubmit] [pytorch][PR] Fix for num_threads==1 in OpenMP "parallel for" (#39533 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39533 Test Plan: CI Reviewed By: ngimel Differential Revision: D21889269 fbshipit-source-id: 5ba13a0a3ec11edd0b6a7c3fdb35396b847a3d9e	2020-06-15 13:14:59 -07:00
HC Zhu	acc13ac828	[PyTorch] Make DDP reducer work under distributed autograd (#37998 ) Summary: ## Why doesn’t DDP work under dist_autograd? DDP follows the steps below 1. [DDP Python constructor](`8d6a8d2b3f/torch/nn/parallel/distributed.py (L389-L393)`) (on a module) creates a [C++ Reducer](https://github.com/pytorch/pytorch/blob/master/torch/csrc/distributed/c10d/reducer.cpp), which holds references to all parameters (or variables in C++ code). 2. The reducer installs a post hook on each model parameter. 3. The backward run starts and triggers the post hooks installed above. 4. The post hook of a parameter simply marks the parameter ready for all-reduce. 5. Once all parameters in a bucket are ready, an all-reduce process starts by reading variable `.grad` and writes to variable `.grad`. But under dist_autograd, `.grad` of a variable is not populated at all. Instead, grads are kept in a global map in the distributed context from variables to their grads. ## Solution of this PR The distributed engine sets a thread_local variable in a backward run indicating we're running in distributed mode. The DDP reducer can then appropriately use `.grad` or the distributed context based on the thread local. More precisely, the thread local is set before calling the post hooks installed by the DDP reducer so that DDP post hooks can retrieve this thread local. Pull Request resolved: https://github.com/pytorch/pytorch/pull/37998 Test Plan: ``` python test/distributed/test_ddp_under_dist_autograd.py ``` FB repo ``` buck test caffe2/test/distributed/... ``` DDP accuracy benchmark workflow run ``` flow-cli canary pytorch.benchmark.accuracy_comparison.workflow --parameters-json '{"node_world_size": 4, "dist_backend": "nccl"}' --run-as-secure-group fblearner_flow --entitlement gpu_prod ``` f196173157 Reviewed By: pritamdamania87 Differential Revision: D21513795 Pulled By: hczhu fbshipit-source-id: fe21e68ecdc9274182db4d4bb5a1e2d68ef927a2	2020-06-10 08:38:14 -07:00
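The thread-local mechanism described in the commit above can be illustrated with Python's `threading.local`. This is a conceptual sketch under stated assumptions: the real change lives in the C++ engine and reducer, and every name here (`Param`, `read_grad`, `distributed_backward`, the dictionary standing in for the distributed context) is hypothetical.

```python
import threading

# Per-thread state set by the (hypothetical) distributed engine.
_engine_state = threading.local()


def in_distributed_backward() -> bool:
    return getattr(_engine_state, "distributed", False)


class Param:
    def __init__(self, name, grad=None):
        self.name = name
        self.grad = grad


def read_grad(param, dist_context):
    """Choose the gradient source the way the commit describes.

    dist_context is a plain dict from parameter name to gradient,
    standing in for the distributed context's map.
    """
    if in_distributed_backward():
        return dist_context[param.name]
    return param.grad


def distributed_backward(params, post_hook):
    """Set the flag before the post hooks run, and always clear it after."""
    _engine_state.distributed = True
    try:
        for param in params:
            post_hook(param)
    finally:
        _engine_state.distributed = False
```

Setting the flag before the hooks fire is the important ordering: a hook that runs on the same thread sees the flag without any extra argument being passed through the autograd machinery.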
Jithun Nair	545a3e1eca	Remove test_nccl from ROCM_BLACKLIST and enable only a couple of test_nccl tests (#39354 ) Summary: All individual test_nccl unit tests have been disabled for ROCm in `bf9395438f` test_nccl was also added to the ROCM_BLACKLIST in `87b198d309` However, the issue only arises when running the test_nccl suite as a whole (as opposed to any one test individually). More details in comments here: https://github.com/pytorch/pytorch/pull/38689 This PR enables test_nccl suite with only two tests so as to workaround the as-yet unresolved issue above, while allowing at least one test_nccl collective test to run on ROCm. This is also needed as a precursor for: https://github.com/pytorch/pytorch/pull/38515 Pull Request resolved: https://github.com/pytorch/pytorch/pull/39354 Differential Revision: D21843194 Pulled By: mrshenli fbshipit-source-id: b28d1e073d8d0fdc1b59928fc3b00187cfd02a35	2020-06-05 13:52:23 -07:00
mattip	ada2652ca6	Restore docs coverage test via sphinx (#39331 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39331 Fixes gh-37590 Adds an extra `make coverage` to document building, which uses the built-in facility in sphinx to check docstring coverage. Also fixes a failure to import `torch/jit/supported_ops.py` which broke the [Torchscript Builtins](https://pytorch.org/docs/stable/jit_builtin_functions.html) page. This also adds the required `SPHINXOPTS` to turn warnings into errors, but this is commented out. Note that since documentation of `torchvision` is merged in here, failures there would cause failures here if this is made active. Some thought might be needed about pinning the torchvision version merged into documentation. The first commit should fail, since the "ScriptModule" class is commented out. I did that in order to check that a CI failure is properly reported. Pull Request resolved: https://github.com/pytorch/pytorch/pull/38244 Differential Revision: D21640589 Pulled By: ezyang fbshipit-source-id: 1e240d81669b5f21404d596de4a27d192dc9fd8a	2020-06-04 10:49:38 -07:00
Oguz Ulgen	4a0a38c17a	Revert D21652452: [pytorch][PR] Fix for num_threads==1 in OpenMP "parallel for" Test Plan: revert-hammer Differential Revision: D21652452 Original commit changeset: 2cda7777c0ea fbshipit-source-id: fdd9a0346ce32a962766f57e13357dd2bc60d8b8	2020-06-03 22:51:51 -07:00
Luca Wehrstedt	5beb3b0c53	[TensorPipe] Re-enable dist optimizer tests (#39441 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39441 This is the last test suite to be enabled for TensorPipe. ghstack-source-id: 105166757 Test Plan: Ran the tests, hundreds of times each, in different build modes. Differential Revision: D21858975 fbshipit-source-id: ee0a7e64b77b4b1974f031207031cc14afb3a8c2	2020-06-03 09:00:52 -07:00
Luca Wehrstedt	b1dab266f7	[TensorPipe] Re-enable dist autograd tests (#39440 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39440 After the RPC tests, re-enable the second test suite: dist autograd. ghstack-source-id: 105165393 Test Plan: Ran the tests, several times each, in different build configs. Differential Revision: D21858974 fbshipit-source-id: 409377d564c36fecae51b9e4c776d94187b434a2	2020-06-03 08:59:19 -07:00
Luca Wehrstedt	3f099879f7	[TensorPipe] Re-enable RPC tests (#39406 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39406 For now, just the RPC test (no dist autograd or dist optimizer). I removed the skipping decorator from all the tests except those that explicitly use the ProcessGroup options. Includes #39027. ghstack-source-id: 105159974 Test Plan: Ran the tests several hundred times, in various build modes. Saw some flakes, but at a rate of about 0.1% Differential Revision: D21716069 fbshipit-source-id: 9d2a99e112049a63745772c18e7a58266ed8e74e	2020-06-03 07:14:30 -07:00
mattip	a952f9bb06	Fix for num_threads==1 in OpenMP "parallel for" (#36479 ) Summary: fixes gh-32284 Move the non-parallel stanza out of the parallel context, and use `num_threads` to limit nesting `parallel for`s. The nesting caused a memory leak in the test script in the issue. This should probably have a test somewhere: are there tests for ParallelOpenMP? Pull Request resolved: https://github.com/pytorch/pytorch/pull/36479 Differential Revision: D21652452 Pulled By: ilia-cher fbshipit-source-id: 2cda7777c0eafbe268550a82fed306e52fb6eb25	2020-06-02 18:56:13 -07:00
Shen Li	bb0377bb24	Expose torch.futures.Future (#39008 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39008 This commit adds a `torch.futures.Future` type and exposes its ctor, `wait`, `then`, and `set_result` APIs. This type is currently a wrapper of `c10::ivalue::Future` and mainly used by RPC for now. Later, we could revamp c10d APIs to return this `Future` type as well. More utils will be added into `torch.futures` package in followup PRs. Test Plan: Imported from OSS Differential Revision: D21723022 Pulled By: mrshenli fbshipit-source-id: 92e56160544e9bf00d11db3e8347a1b9707882c9	2020-06-02 10:12:56 -07:00
Nikita Shulga	39d037253c	Test PyTorch using python-3.8 + GCC-9 on Bionic (Reland) (#39121 ) Summary: Enable new test config in .circleci/config.yml Skip scanning several 3rd-party packages to work around https://bugs.python.org/issue40350 Remove pre python-3.5 checks from `test.sh` and update `scikit-learn` to python-3.8 compatible version This is a reland of https://github.com/pytorch/pytorch/pull/39030 Pull Request resolved: https://github.com/pytorch/pytorch/pull/39121 Differential Revision: D21820375 Pulled By: malfet fbshipit-source-id: d0be79b7d204cf692e055d42b9be42402dc4c1c0	2020-06-01 11:11:12 -07:00
Rohan Varma	988e31c788	Revert D21752017: [pytorch][PR] Test PyTorch using python-3.8 + GCC-9 on Bionic Test Plan: revert-hammer Differential Revision: D21752017 Original commit changeset: 56c841636349 fbshipit-source-id: adf08e03ba9610050fc5440ef453789f805fdc6b	2020-05-27 17:42:22 -07:00
Nikita Shulga	30dd4acbf6	Test PyTorch using python-3.8 + GCC-9 on Bionic (#39030 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39030 Differential Revision: D21752017 Pulled By: malfet fbshipit-source-id: 56c841636349e24c9ebef8dac18c283de3664fa5	2020-05-27 15:56:37 -07:00
Nikolay Korovaiko	4fcd1c3123	run te only for profiling executor (#38591 ) Summary: * Disable the mode where PE can still run the old fuser. * Clean up Pull Request resolved: https://github.com/pytorch/pytorch/pull/38591 Differential Revision: D21643664 Pulled By: Krovatkin fbshipit-source-id: 6753ed6bdc544698a1340e59a624608ff3abf7f9	2020-05-26 18:35:25 -07:00
Shen Li	40ce90bfc1	Revert D21560096: [Tensorpipe Agent] Enabling tests with OSS CI Test Plan: revert-hammer Differential Revision: D21560096 Original commit changeset: 7d61cc1c354e fbshipit-source-id: 6adfd87e354545031203d65d04f0bad4687a93cd	2020-05-19 19:39:33 -07:00
Jeff Daily	87b198d309	add distributed/test_nccl to ROCM_BLACKLIST (#38730 ) Summary: CC ezyang xw285cornell sunway513 Work-around for recent ROCm CI failures due to `9cfc10d52e` (https://github.com/pytorch/pytorch/issues/37294). Replaces full revert suggested by PR https://github.com/pytorch/pytorch/issues/38689. Pull Request resolved: https://github.com/pytorch/pytorch/pull/38730 Differential Revision: D21648707 Pulled By: xw285cornell fbshipit-source-id: 627b11b229c7eadca1f6e0c6192c6b5b6416e6a1	2020-05-19 14:45:50 -07:00
Omkar Salpekar	87aa2d25ae	[Tensorpipe Agent] Enabling tests with OSS CI (#38447 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38447 This PR modifies `run_tests.py` to enable running Tensorpipe Agent tests with the OSS CI. ghstack-source-id: 104321881 Test Plan: CI Differential Revision: D21560096 fbshipit-source-id: 7d61cc1c354e9353c4a586dd2b56690c28d51d10	2020-05-19 13:34:06 -07:00
Nikita Shulga	72e5b7ae5b	Add option to run python unittests in parallel (#37180 ) Summary: So far the results look quite promising: test_nn consists of purely sequential tests and can be accelerated 3x Pull Request resolved: https://github.com/pytorch/pytorch/pull/37180 Differential Revision: D21437871 Pulled By: malfet fbshipit-source-id: 8679a8af355f839f2c9dae3bf36d2e102af05425	2020-05-06 22:14:11 -07:00
Kimish Patel	b1b6bc36a5	Enable xnnpack_integration test in CI. (#37838 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37838 Test Plan: oss: python test/test_xnnpack_integration.py Reviewed By: xcheng16 Differential Revision: D21405850 fbshipit-source-id: ba4ba06692b49315f110653d9492b2e14b618574	2020-05-06 13:53:03 -07:00
ashishfarmer	402f635bbe	Enable ahead of time compilation for HIPExtensions using ninja (#37800 ) Summary: This pull request enables ahead of time compilation of HIPExtensions with ninja by setting appropriate compilation flags for ROCm environment. Also, this enables the unit test for testing cuda_extensions on ROCm as well as removing test for ahead of time compilation of extensions with ninja from ROCM_BLACKLIST ezyang jeffdaily Pull Request resolved: https://github.com/pytorch/pytorch/pull/37800 Differential Revision: D21408148 Pulled By: soumith fbshipit-source-id: 146f4ffb3418f3534e6ce86805d3fe9c3eae84e1	2020-05-05 20:53:35 -07:00
ashishfarmer	bbd2350c99	Disable tests failing on test2 in ROCm CI (#37427 ) Summary: This pull request disables the unit tests that were observed to be failing once `test2` was enabled. These tests will be one by one looked at and fixed at the earliest, but until then disabling them to unblock `test2` The pull request also disables fftPlanDestroy for rocFFT to avoid double-freeing FFT handles cc: ezyang jeffdaily Pull Request resolved: https://github.com/pytorch/pytorch/pull/37427 Differential Revision: D21302909 Pulled By: ezyang fbshipit-source-id: ecadda3778e65b7f4f97e24b932b96b9ce928616	2020-04-29 09:56:28 -07:00
Nikolay Korovaiko	edc5ef1afb	run the simple executor for jit tests by default, add profiling jobs … (#37017 ) Summary: …for fusion tests fix flake8 warnings fix ci failures fix test_determination.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/37017 Differential Revision: D21238446 Pulled By: Krovatkin fbshipit-source-id: 393e6135883dc5ac57bdff580de96c66829d454c	2020-04-28 19:16:52 -07:00
Nikita Shulga	47c4dca1ab	Remove python-2 or python<3.5 checks from unit tests (#37252 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37252 Test Plan: CI Differential Revision: D21241083 Pulled By: malfet fbshipit-source-id: 44164b822f7905288abb2beda0175d2162d86143	2020-04-24 17:42:04 -07:00
Jerry Zhang	230b68168b	[quant] Refactor test files (#36964 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36964 Rename and restructure quantization related tests https://github.com/pytorch/pytorch/issues/31625 Test Plan: . Imported from OSS Differential Revision: D21192509 fbshipit-source-id: 148c93e86e0ea68ab18a067fe74a8035a29a1e4e	2020-04-23 10:28:56 -07:00
David Reiss	e75fb4356b	Remove (most) Python 2 support from Python code (#35615 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35615 Python 2 has reached end-of-life and is no longer supported by PyTorch. Now we can clean up a lot of cruft that we put in place to support it. These changes were all done manually, and I skipped anything that seemed like it would take more than a few seconds, so I think it makes sense to review it manually as well (though using side-by-side view and ignoring whitespace change might be helpful). Test Plan: CI Differential Revision: D20842886 Pulled By: dreiss fbshipit-source-id: 8cad4e87c45895e7ce3938a88e61157a79504aed	2020-04-22 09:23:14 -07:00
Jerry Zhang	57c50db441	[reland][quant] Add backward compatiblity test (#36842 ) Summary: re-created the same PR: https://github.com/pytorch/pytorch/pull/36639 because ghimport does not support importing binary files right now Pull Request resolved: https://github.com/pytorch/pytorch/pull/36842 Test Plan: python test/quantization/test_backward_compatibility.py Differential Revision: D21100689 Pulled By: jerryzh168 fbshipit-source-id: 625a0f9da98138c9c2891b9d99fc45d85fa27cca	2020-04-17 21:24:31 -07:00
Xingying Cheng	86f354c530	Python binding api to optimize for mobile model on script module. (#36357 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36357 ghstack-source-id: 101907180 Creating a python api entry to optimize mobile model which takes a scripted module as argument and returns an optimized scripted module. The initial optimization features includes inserting and folding prepack ops. Test Plan: python test/test_optimizer.py Differential Revision: D20946076 fbshipit-source-id: 93cb4a5bb2371128f802d738eb26d0a4f3b2fe10	2020-04-17 16:21:27 -07:00
Mike Ruberry	f00014b790	Revert D21080503: [pytorch][PR] [quant] Add backward compatiblity test Test Plan: revert-hammer Differential Revision: D21080503 Original commit changeset: 1dca08208bcc fbshipit-source-id: 5cd8c22130ff28b9231f657f80961e94b65b5792	2020-04-16 22:03:12 -07:00
Jerry Zhang	484a00b2d3	[quant] Add backward compatiblity test (#36771 ) Summary: re-created the same PR: https://github.com/pytorch/pytorch/pull/36639 because ghimport does not support importing binary files right now Pull Request resolved: https://github.com/pytorch/pytorch/pull/36771 Test Plan: python test/quantization/test_backward_compatibility.py Differential Revision: D21080503 Pulled By: jerryzh168 fbshipit-source-id: 1dca08208bccead60bba03e5fb5d39e1a1d7c20d	2020-04-16 19:00:30 -07:00
Haixin Liu	455d4aab64	[PyTorch Numeric Suite] Add weight compare API (#36186 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36186 Start PyTorch Numeric Suite under PyTorch quantization and add weight compare API to it. ghstack-source-id: 102062165 Test Plan: buck test mode/dev caffe2/test:quantization -- 'test_compare_weights' Differential Revision: D20903395 fbshipit-source-id: 125d84569837142626a0e2119b3b7657a32dbf4e	2020-04-13 19:02:00 -07:00
Thomas Viehmann	d070c0bcf0	ROCm: enable cpp_extensions.load/load_inline (#35897 ) Summary: This enables cpp_extensions.load/load_inline. This works by hipify-ing cuda sources. Also enable tests. CuDNN/MIOpen extensions aren't yet supported, I propose to not do this in this PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35897 Differential Revision: D20983279 Pulled By: ezyang fbshipit-source-id: a5d0f5ac592d04488a6a46522c58e2ee0a6fd57c	2020-04-13 11:44:08 -07:00
David Reiss	fab06bfb75	Add utility for bundling sample inputs with models (#35631 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35631 Bundling sample inputs with our models with a standardized interface will make it possible to write benchmarking and code-coverage tools that call all models in a uniform way. The intent is to make this a standard for mobile models within Facebook. Putting it in torch/utils so tests can run on GitHub and because it might be useful for others as well. `augment_model_with_bundled_inputs` is the primary entry point. See its docstring for usage information and the test for some example uses. One design question I had was how much power should be available for automatic deflating and inflating of inputs. The current scheme gives some automatic handling and a reasonable escape hatch ("_bundled_input_inflate_format") for top-level tensor arguments, but no automatic support for (e.g.) tensors in tuples or long strings. For more complex cases, we have the ultimate escape hatch of just defining _generate_bundled_inputs in the model. Another design question was whether to add the inputs to the model or wrap the model in a wrapper module that had these methods and delegated calls to `forward`. Because models can have other exposed methods and attributes, the wrapped seemed too onerous. Test Plan: Unit test. Differential Revision: D20925013 Pulled By: dreiss fbshipit-source-id: 4dbbb4cce41e5752133b4ecdb05e1c92bac6b2d5	2020-04-08 13:10:36 -07:00
Johannes M Dieterich	45fc881f05	[ROCm] Hotfix: Black list tensorexpr test set that has failures on ROCm (#36049 ) Summary: Test set got enabled with ROCm failures in https://github.com/pytorch/pytorch/pull/35914 - black list it for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/36049 Differential Revision: D20869814 Pulled By: zou3519 fbshipit-source-id: fcdb2abc9f3407344b56cf8d48b7740008317020	2020-04-06 13:26:05 -07:00
David Reiss	a054d05707	Add torch.utils.show_pickle for showing pickle contents in saved models (#35168 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35168 Sometimes when a saved model isn't working, it's nice to be able to look at the contents of the pickle files. Unfortunately, pickletools output isn't particularly readable, and unpickling is often either not possible or runs so much post-processing code that it's not possible to tell exactly what is present in the pickled data. This script uses a custom Unpickler to unpickle (almost) any data into stub objects that have no dependency on torch or any other runtime types and suppress (almost) any postprocessing code. As a convenience, the wrapper can search through zip files, supporting command lines like `python -m torch.utils.show_pickle /path/to/model.pt1@*/data.pkl` When the module is invoked as main, we also install a hack in pprint to allow semi-reasonable formatting of our stub objects. Test Plan: Ran it on a data.pkl, constants.pkl, and a debug pkl Differential Revision: D20842550 Pulled By: dreiss fbshipit-source-id: ef662d8915fc5795039054d1f8fef2e1c51cf40a	2020-04-03 15:11:20 -07:00
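The stub-object idea described in the commit above can be illustrated with the standard library alone. The following is a minimal sketch, not the actual `torch.utils.show_pickle` code: the names `StubBase`, `StubUnpickler`, and `load_as_stubs` are hypothetical, and the sketch assumes every global referenced by the pickle can be replaced by a freshly created placeholder type.

```python
import io
import pickle


class StubBase:
    """Placeholder base class; instances only record what the pickle carried."""

    def __new__(cls, *args, **kwargs):
        # Accept any constructor arguments and keep them for inspection.
        obj = super().__new__(cls)
        obj.ctor_args = args
        return obj

    def __init__(self, *args, **kwargs):
        # Deliberately do nothing: no real post-processing code is run.
        pass

    def __repr__(self):
        cls = type(self)
        return f"<stub {cls.__module__}.{cls.__qualname__} {vars(self)!r}>"


class StubUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that never imports the classes named in the data."""

    def find_class(self, module, name):
        # Instead of importing `module` and looking up `name`, build a
        # placeholder type that merely remembers where it claims to come from.
        return type(name, (StubBase,), {"__module__": module})


def load_as_stubs(data):
    """Unpickle `data` (bytes) into stub objects."""
    return StubUnpickler(io.BytesIO(data)).load()
```

For example, `load_as_stubs(pickle.dumps(argparse.Namespace(a=1)))` yields a stub whose type is named `Namespace` but which is not an `argparse.Namespace`. The sketch is intentionally limited: a stubbed container such as `collections.OrderedDict` would not accept item assignments during unpickling, a case the real tool has to handle.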
Mikhail Zolotukhin	ba3cec867f	Reenable test/test_tensorexpr.py (#35914 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35914 Test Plan: Imported from OSS Differential Revision: D20827188 Pulled By: ZolotukhinM fbshipit-source-id: ffcc1bb0396a0a19afb577a7ab4ca95c7e4ced37	2020-04-03 12:20:31 -07:00
Will Feng (FAIAR)	2fa3c1570d	Refactor C++ API parity test mechanism and turn it on in CI again (#35190 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35190 The following are the main changes: - The main logic of C++ API parity test mechanism is moved from `test/test_cpp_api_parity.py` to `test/cpp_api_parity/module_impl_check.py` and `test/cpp_api_parity/functional_impl_check.py`, so that there is a clear separation between module tests and functional tests, although they still share a lot of common utility functions which are all in `test/cpp_api_parity/utils.py`. - Module init tests (i.e. testing whether C++ module accepts the same constructor options as the corresponding Python module) is removed and will be added again in the future. - `cpp_constructor_args` / `cpp_options_args` / `cpp_function_call` are added as appropriate to all test params dict in `torch/testing/_internal/common_nn.py`, to indicate how to run C++ API parity test for this test params dict. Test Plan: Imported from OSS Differential Revision: D20588198 Pulled By: yf225 fbshipit-source-id: 11238c560c8247129584b9b49df73fff40c4d81d	2020-04-03 11:20:36 -07:00
Feng Tian	762270c51f	add c10d dynamic loading mechanism and unit test (#28068 ) Summary: The original behavior of pytorch c10d only supports built-in c10d backends, such as nccl/gloo/mpi. This patch extends the c10d capability to support dynamically loading 3rd party communication libraries which are derived from the ProcessGroup base class. The related RFC is in: https://github.com/pytorch/pytorch/issues/27955 This way, the user just needs to specify a 3rd party c10d backend name when invoking torch.distributed.init_process_group(). The proposed logic will try to load the corresponding c10d backend cpp extension automatically. For how to develop a new 3rd party c10d backend through a cpp extension, please refer to test/cpp_extensions/cpp_c10d_extension.cpp Pull Request resolved: https://github.com/pytorch/pytorch/pull/28068 Differential Revision: D19174838 Pulled By: agolynski fbshipit-source-id: 3409a504a43ce7260e6f9d1207c00e87471fac62	2020-04-02 15:46:51 -07:00
Nick Korovaiko	ddcad5b9ca	temp disable test_tensorexpr.py Test Plan: test on CI Reviewed By: soumith Differential Revision: D20823336 fbshipit-source-id: 65c04bc57c6a120003cb561613645d2d7e60189c	2020-04-02 14:28:22 -07:00
Christian Sarofeen	6d24f8fe21	Infrastructure for a new CUDA Fuser (#34785 ) Summary: This PR contains the infrastructure of a new CUDA fuser. This CUDA fuser is based on many of the same principles of TensorExpressions and Halide, however the implementation is ground up. The fusion pass itself is similar to the default CUDA fuser, however, it has undergone some refactoring and is using the new code generation infrastructure. For those who are interested in how the code generation in this PR works, I would recommend reviewing _test/cpp/jit/test_gpu_fusion.cpp_ as well as the long comment section at the beginning of _torch/csrc/jit/codegen/cuda/transform_replay.h_ One of the largest differences between our approach and that of TVM/Halide, is the concept of "TensorView". TensorView from a high level should be thought of similarly to how we think of working with Tensors in PyTorch. It's an N-D object which can undergo transformations that change its dimensionality. Dimensionality changes are done through the operations split/merge/reorder/computeAt. These transformations are similar to split/fuse/reorder/compute_at of TVM, they modify how a tensor is iterated over to generate GPU code. Interestingly, in our scheme these transformations are applied to tensors and only impact how that tensor is generated. Warning: This PR is purposefully not feature complete with the current fuser. We wanted to separate out the infrastructure from the fusion capabilities. Once in, smaller incremental PRs will be submitted to expand capabilities of the fuser. Short term goals: Parity with current CUDA fuser (including performance): - Dynamic shapes (no recompilation) - Implicit handling of broadcast (broadcasted tensors are treated as tensors of the broadcasted size in the generated code) - Dropout Mid-term goals: - Transposes fused with pointwise operations where transpose involves only 2 axes (across the fused operation).
- 1-D reductions fused with pointwise operations Pull Request resolved: https://github.com/pytorch/pytorch/pull/34785 Reviewed By: ZolotukhinM Differential Revision: D20650977 Pulled By: soumith fbshipit-source-id: ee39c95a880e1b9822e874ed4cc180971572bf63	2020-04-02 09:22:42 -07:00
Nick Korovaiko	2f50c11954	add test_tensorexpr.py (#35776 ) Summary: Adding `test_tensorexpr.py` to our CI. There are a few complications. The first is that we now always run `SimpleIREVal` as part of the simplifier, so the counts will always be greater than one. We could invest some effort to differentiate between a real codegen call to `SimpleIREval` and calls in the simplifier, but it's probably not that important. The second change turns a failure to retrieve a counter into a default value of 0: since the tests are structured to test for either an llvm or a simpleireval backend, it seems appropriate not to fail the test too early. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35776 Differential Revision: D20799333 Pulled By: Krovatkin fbshipit-source-id: 2a94ff98e647180c6e6aea141a411c3376c509f9	2020-04-01 22:05:37 -07:00
Jerry Zhang	ab26dfb44e	[quant] Move quantization tests into test/quantization (#35812 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35812 Test Plan: . Imported from OSS Differential Revision: D20795329 fbshipit-source-id: 42cc905c44ce7b86720aeef512d747ff6788d7a2	2020-04-01 12:44:19 -07:00
Michael Suo	319aee1afb	Revert D20771828: [quant] Move quantization tests into test/quantization Test Plan: revert-hammer Differential Revision: D20771828 Original commit changeset: 5f1df5e86c29 fbshipit-source-id: d14f915f291ae8a90026c5b65624459211495f47	2020-03-31 23:01:00 -07:00
Jerry Zhang	fef6c617d4	[quant] Move quantization tests into test/quantization (#35688 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35688 Test Plan: . Imported from OSS Differential Revision: D20771828 fbshipit-source-id: 5f1df5e86c29f7bdfbdc6563450e909b3bfdc07a	2020-03-31 20:30:57 -07:00
Johannes M Dieterich	0eb26fb01e	[ROCm] Properly blacklist (#35230 ) Summary: test_python_all_except_nn + /usr/bin/python3.6 test/run_test.py --exclude test_nn test_jit_simple test_jit_legacy test_jit_fuser_legacy --verbose --bring-to-front test_quantization test_quantized test_quantized_tensor test_quantized_nn_mods --determine-from= test_nn continues to be run as part of test1 target This will allow us to run run_test.py while correctly disabling these sets for ROCm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35230 Differential Revision: D20735851 Pulled By: ezyang fbshipit-source-id: 255d21374c9605c8f8b6ffa1b08f58fb10d8e543	2020-03-30 08:57:03 -07:00
Omkar Salpekar	4025729e88	[1.5 Release][RPC Reliability] RRef Idempotency and RPC Retry enablement (#33636 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33636 Fixes https://github.com/pytorch/pytorch/issues/32119, https://github.com/pytorch/pytorch/issues/26116, https://github.com/pytorch/pytorch/issues/33072 Makes RRef control messages idempotent and enables sending with retries for distributed autograd cleanup and RRef internal messages. In order to effectively test whether these RRef and distributed autograd cleanup work with network failures/retries, I implemented an RPC Agent with a faulty send function, and enabled running tests using this as a third backend (in addition to Thrift and PGA). The tests using this backend are in a separate class (the test cases are similar but with minor changes to ensure short-running tests wait for retried RPCs to finish). This faulty RPC Agent is pretty configurable. The tests can configure which message types to fail, and how many messages to fail, but going forward, other RPC functionality can be overridden with faulty methods to test with failures injected. Differential Revision: D20019236 fbshipit-source-id: 540a977e96b2e29aa0393ff12621fa293fe92b48	2020-03-20 20:07:47 -07:00
Mikhail Zolotukhin	12f0052eee	Add TensorExpr Fuser tests (resubmit). (#35085 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35085 Test Plan: Imported from OSS Differential Revision: D20552334 Pulled By: ZolotukhinM fbshipit-source-id: 628fcf4719a879f18978ff8a0a64afbb045df645	2020-03-20 13:19:31 -07:00
Natalia Gimelshein	3c90a90730	Revert D20540599: Add TensorExpr Fuser tests. Test Plan: revert-hammer Differential Revision: D20540599 Original commit changeset: ced9b6657fe7 fbshipit-source-id: e8fa11f20207c35f39b3fbe6f45fc627715377c1	2020-03-19 18:37:32 -07:00
Mikhail Zolotukhin	7b59f41009	Add TensorExpr Fuser tests. (#35052 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35052 Differential Revision: D20540599 Test Plan: Imported from OSS Pulled By: ZolotukhinM fbshipit-source-id: ced9b6657fe72bca61833ab5d59bdaddcacd114b	2020-03-19 14:31:54 -07:00
Mikhail Zolotukhin	42b2c8c65d	[TensorExpr] Add a fuser pass based on tensor expressions. (#34226 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34226 LLVM and Cuda backends are added in subsequent PRs, so at this point the fuser is pretty useless, but it still can be tested and its logic is not going to change with addition of the codegens. Differential Revision: D20251838 Test Plan: Imported from OSS Pulled By: ZolotukhinM fbshipit-source-id: 82b0d221fa89904ed526689d02a6c7676a8ce8de	2020-03-16 11:49:24 -07:00
Yunus Rahbar	ed11e2536a	[pytorch_ci] Skip determination tests in rocm Summary: I don't know why, but this segfaults on rocm. Test Plan: Can only be tested on master Reviewed By: mrshenli Differential Revision: D20286011 fbshipit-source-id: dde952449bf54ae459d36020f3e3db6fa087b39f	2020-03-05 11:23:02 -08:00
Shihao Xu	e2ddf935bb	Run RPC JIT tests with variable type hints only in Python >=3.6 (#34284 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34284 Python 3.5 only supports function type hints. Variable type hints are introduced in Python 3.6. So these tests with JIT type hints will fail with "Syntax Error" in Python 3.5 environment. ghstack-source-id: 99542199 Test Plan: ` Differential Revision: D7348891 fbshipit-source-id: c4c71ac021f35b5e6f7ce4d3e6af10dd1d2600cc	2020-03-04 18:59:08 -08:00
Yunus Rahbar	1546d2afeb	[pytorch_ci] Don't run determination tests in py35 Test Plan: Can only really be tested in PyTorch master Reviewed By: mrshenli Differential Revision: D20260023 fbshipit-source-id: b5444c376894bfccd6524cf04a71cf76eea72275	2020-03-04 14:23:40 -08:00
Yunus Rahbar	7cee787a19	[pytorch_ci] Python target determinator (#33577 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33577 Pull Request resolved: https://github.com/pytorch/pytorch/pull/33221 This will make it so that if a pull request is just pure Python files, then we'll only run the Python tests that are connected to the dependency graph of the touched files. Assumptions made: - the Python code does not do dynamic imports - test_X.py never imports from test_Y.py Right now this is only done for test_nn (presumably the largest test entrypoint), but it's not much more work to do it for all the other test entrypoints too. Test Plan: CircleCI results when touching just a few Python files: - pytorch_macos_10_13_py3_test: 41 ->13 minutes https://circleci.com/gh/pytorch/pytorch/4550574?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link - pytorch_windows_vs2019_py36_cuda10.1_test1: 11 -> 2 minutes https://circleci.com/gh/pytorch/pytorch/4550846?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link - pytorch_windows_vs2019_py36_cuda10.1_test2: 51 -> 21 minutes https://circleci.com/gh/pytorch/pytorch/4550845?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link - pytorch_linux_xenial_py3_6_gcc5_4_test: 41 -> 14 minutes https://circleci.com/gh/pytorch/pytorch/4550543?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link Differential Revision: D20009089 fbshipit-source-id: 41708cc301d1c866eb92a04421d8346feb0e3cb5	2020-03-03 18:01:12 -08:00
Shihao Xu	a1862468d0	Add missing test launchers for JitRpcTest and JitDistAutogradTest (#32891 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32891 - Add JitDistAutoGradTest into fork/spawn test launcher - Add JitRpcTest into fork/spawn test launcher ghstack-source-id: 98900090 Test Plan: ``` buck test mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork buck test mode/dev-nosan //caffe2/test/distributed/rpc:rpc_spawn ``` ``` buck test mode/dev-nosan //caffe2/test/distributed/rpc:dist_autograd_fork buck test mode/dev-nosan //caffe2/test/distributed/rpc:dist_autograd_spawn ``` ``` buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork_thrift buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_spawn buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_spawn_thrift ``` ``` buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:dist_autograd_fork buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:dist_autograd_fork_thrift buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:dist_autograd_spawn buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:dist_autograd_spawn_thrift ``` Differential Revision: D5785394 fbshipit-source-id: 335a85424d22f1a83874be81a8139499c9a68ce2	2020-02-24 21:42:47 -08:00
ashish	616beb1412	[ROCm] Added support for pytorch extensions to use HIP (#32669 ) Summary: This pull request has changes for: 1. Enabling a torch module with HIP code to be compiled by cpp_extensions.py 2. Fixes for hipify module to be able to be used by a torch extension cc: ezyang iotamudelta jeffdaily Pull Request resolved: https://github.com/pytorch/pytorch/pull/32669 Differential Revision: D20033893 Pulled By: zou3519 fbshipit-source-id: fd6ddc8cdcd3930f41008636bb2bc9dd26cdb008	2020-02-21 12:10:02 -08:00
anjali411	13e4ee7883	Added tensor.is_complex(), is_complex and dtype.is_complex py binding, tensor printing, and dixed the scalar type returned for complex float (#33268 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33268 Test Plan: Imported from OSS Differential Revision: D19907698 Pulled By: anjali411 fbshipit-source-id: c3ce2e99fc09da91a90a8fb94e5525a00bb23703	2020-02-20 13:38:01 -08:00
Jithun Nair	3c4cec56aa	Enable test_distributed for ROCm but only with nccl backend [REDUX] (#32551 ) Summary: This is a redux of the original PR https://github.com/pytorch/pytorch/issues/28814 which was reverted in PR https://github.com/pytorch/pytorch/issues/29736 due to test_DistributedDataParallel being suspected as being flaky. Further investigation revealed it wasn't flakiness, but a bug in the PyTorch source code which has been now fixed in PR https://github.com/pytorch/pytorch/issues/32356. This PR is another attempt at enabling the test_distributed unit test suite only for the nccl backend. Pull Request resolved: https://github.com/pytorch/pytorch/pull/32551 Differential Revision: D19729966 Pulled By: bddppq fbshipit-source-id: 12a0d850991a903cc7723d63693b6157071d7115	2020-02-10 12:42:36 -08:00
Richard Zou	6209412647	Add option to use ninja to compile ahead-of-time cpp_extensions (#32495 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32495 Background ------------------------------ Previously, ninja was used to compile+link inline cpp_extensions and ahead-of-time cpp_extensions were compiled with distutils. This PR adds the ability to compile (but not link) ahead-of-time cpp_extensions with ninja. The main motivation for this is to speed up cpp_extension builds: distutils does not make use of parallelism. With this PR, using the new option, on my machine, - torchvision compilation goes from 3m43s to 49s - nestedtensor compilation goes from 2m0s to 28s. User-facing changes ------------------------------ I added a `use_ninja` flag to BuildExtension. This defaults to `True`. When `use_ninja` is True: - it will attempt to use ninja. - If we cannot use ninja, then this throws a warning and falls back to distutils. - Situations we cannot use ninja: Windows (NYI, I'll open a new issue for this), if ninja cannot be found on the system. Implementation Details ------------------------------ This PR makes this change in two steps. Please let me know if it would be easier to review this if I split this up into a stacked diff. Those changes are: 1) refactor _write_ninja_file to separate the policy (what compiler flags to pass) from the mechanism (how to write the ninja file and do compilation). 2) call _write_ninja_file and _run_ninja_build while building ahead-of-time cpp_extensions. These are only used to compile objects; distutils still handles the linking. Change 1: refactor _write_ninja_file to separate policy from mechanism - I split _write_ninja_file into: _write_ninja_file and _write_ninja_file_to_build_library - I renamed _build_extension_module to _run_ninja_build Change 2: Call _write_ninja_file while building ahead-of-time cpp_extensions - _write_ninja_file_and_compile_objects calls _write_ninja_file to only build object files. 
- We monkey-patch distutils.CCompiler.compile to call _write_ninja_file_and_compile_objects - distutils still handles the linking step. The linking step is not a bottleneck so it was not a concern. - This change only works on Unix-based systems. Our code for Windows goes down a different codepath and I did not want to mess with that. - If a system does not support ninja, we raise a warning and fall back to the original compilation path. Test Plan ------------------------------ Adhoc testing - I built torchvision using pytorch master and printed out the build commands. Next, I used this branch to build torchvision and looked at the ninja file. I compared the ninja file with the build commands and asserted that they were functionally the same. - I repeated the above for pytorch/nestedtensor. PyTorch test suite - I split `test_cpp_extensions` into `test_cpp_extensions_aot` and `test_cpp_extensions_jit`. The AOT (ahead-of-time) version tests ahead-of-time and the JIT version tests just-in-time (not to be confused with TorchScript) - `test_cpp_extensions_aot` gets run TWICE by run_test.py, once with a module that was built with ninja, and once with a module that was built without ninja. - run_test.py asserts that when we are building with use_ninja=True, ninja is actually available on the system. Test Plan: Imported from OSS Differential Revision: D19730432 Pulled By: zou3519 fbshipit-source-id: 819590d01cf65e8da5a1e8019b8b3084792fee90	2020-02-05 18:49:29 -08:00
Edward Yang	6874278985	Revert D19611800: [PyTorch][TorchScript] Add support for join on List of strings in TorchScript Test Plan: revert-hammer Differential Revision: D19611800 Original commit changeset: cef66356abc1 fbshipit-source-id: 41af9e0de83b1fb808b17255ec905e137909457d	2020-01-30 06:46:28 -08:00
Sampath Mummadi	8ead65a946	[PyTorch][TorchScript] Add support for join on List of strings in TorchScript Summary: Add support for join on List of strings in TorchScript. Test Plan: (pytorch) smummadi@smummadi-mbp pytorch % python test/test_jit_string.py Fail to import hypothesis in common_utils, tests are not derandomized . ---------------------------------------------------------------------- Ran 1 test in 1.090s OK Differential Revision: D19611800 fbshipit-source-id: cef66356abc14dfd100a806d25dd1a8bc9af0a11	2020-01-29 18:22:52 -08:00
davidriazati	2060e0a9dd	Split serialization tests to their own file (#32241 ) Summary: Stacked PRs * #32244 - Make zip serialization the default * #32241 - Split serialization tests to their own file This makes them all easier to run as a batch. This PR is just a code move / fixing up imports. There are still some serialization tests in `test_torch.py` as part of `TestDeviceType`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/32241 Pulled By: driazati Differential Revision: D19415826 fbshipit-source-id: a3f6cfe1626ff2f9b9631c409bf525bd32e4639b	2020-01-28 15:04:05 -08:00
Pritam Damania	f050b16dd9	Move pytorch distributed tests to separate folder for contbuild. (#30445 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30445 Create distributed and rpc directories under caffe/test for better management of unit tests. Differential Revision: D18702786 fbshipit-source-id: e9daeed0cfb846ef68806f6decfcb57c0e0e3606	2020-01-22 21:16:59 -08:00
Peter Bell	7fdc6cb74e	Fix test_data_parallel name errors and add to run_test.py (#32428 ) Summary: While working on https://github.com/pytorch/pytorch/issues/31768 and trying to add tests for `DataParallel`, I discovered that: - `test_data_parallel.py` can't be run through `run_test.py` - running it with `pytest` fails with many name errors `test_data_parallel.py` seems to have been split from `test_nn.py` in https://github.com/pytorch/pytorch/issues/28297 but not in a state where it can actually be run. Presumably `DataParallel` hasn't been tested by CI in the time since. Pull Request resolved: https://github.com/pytorch/pytorch/pull/32428 Differential Revision: D19499345 Pulled By: ezyang fbshipit-source-id: f9b748a99a5c85fc6675c22506cf10bbfd9c8a4d	2020-01-21 15:11:03 -08:00
Nathan Goldbaum	9d3402e4cb	Add the __torch_function__ API override mechanism (#30730 ) Summary: This is a re-do of https://github.com/pytorch/pytorch/issues/27064, which was reverted (`b8792c0438`). This was landed at the same time as other work that added new operators to the `torch` namespace so the check for whether the `torch` namespace is exhaustively checked for overridability was triggering test failures. I've temporarily disabled that check and added an explanatory comment that the check will be re-enabled in a future PR that will be merged during a time when the commit velocity on PyTorch is lower. Pull Request resolved: https://github.com/pytorch/pytorch/pull/30730 Differential Revision: D18813270 Pulled By: ezyang fbshipit-source-id: 70477c4656dca8fea6e7bc59259555041fcfbf68	2019-12-04 13:19:07 -08:00
Edward Yang	b8792c0438	Revert D18645954: add __torch_function__ API override mechanism Test Plan: revert-hammer Differential Revision: D18645954 Original commit changeset: 54b5e4344d7a fbshipit-source-id: 4a7aebb483e6b001130d6f384ccc53c5a808ab13	2019-12-04 07:41:47 -08:00
Prasun Anand	d12786b24f	add __torch_function__ API override mechanism (#27064 ) Summary: Closes https://github.com/pytorch/pytorch/issues/24015 (see description of that issue for more details). For a toy example, see the `DiagonalTensor` and `SubDiagonalTensor` class in test/test_overrides.py. This PR currently contains: * tests for `__torch_function__` behavior * modification to `gen_python_functions` and `parse` function signatures and dispatched to correct overloaded argument. This feature is inspired by and analogous to NumPy's `__array_function__` protocol ([see NumPy Enhancement Proposal 18](https://numpy.org/neps/nep-0018-array-function-protocol.html#trying-array-function-methods-until-the-right-one-works)). ### Benchmarks: See Nathan's comment below: https://github.com/pytorch/pytorch/pull/27064#issuecomment-554601189 Pull Request resolved: https://github.com/pytorch/pytorch/pull/27064 Differential Revision: D18645954 Pulled By: ezyang fbshipit-source-id: 54b5e4344d7afdbcf996bb57191b0bdadc7b1767	2019-12-04 05:56:46 -08:00
Brian Wignall	e7fe64f6a6	Fix typos (#30606 ) Summary: Should be non-semantic. Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos. Pull Request resolved: https://github.com/pytorch/pytorch/pull/30606 Differential Revision: D18763028 Pulled By: mrshenli fbshipit-source-id: 896515a2156d062653408852e6c04b429fc5955c	2019-12-02 20:17:42 -08:00
Michael Suo	4b0a6d299c	test reporting (#29658 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29658 This PR makes our test scripts output artifacts that CircleCI can understand. This has a few benefits: 1. We can actually see failed tests and their output in the job screen (instead of having to scroll through logs) 2. We can use the CircleCI test metadata API to track failed tests programmatically. it looks like this (old ui): https://circleci.com/gh/pytorch/pytorch/3546584?pipelines-ui-opt-out or this (new ui): https://app.circleci.com/jobs/github/pytorch/pytorch/3546584/tests Test Plan: Imported from OSS Differential Revision: D18597261 Pulled By: suo fbshipit-source-id: 07fc7d26bbb834e13cc4cc0e48178645ae6579f5	2019-11-19 11:15:31 -08:00
Edward Yang	7d287688eb	Revert D5689636: Add RpcAgentTestFixture to extract duplicate code Test Plan: revert-hammer Differential Revision: D5689636 Original commit changeset: f35eea1359ad fbshipit-source-id: 31928fce5e96b3beceefbc9a03f54769f10b7e1a	2019-11-19 08:14:44 -08:00
Yanli Zhao	861ef05015	Remove rpc fork and dist autograd fork tests from PyTorch repo (#29827 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29827 There are known issues with "fork tests + OMP" in PyTorch. The rpc and dist autograd tests use OMP thread pools, which caused the rpc fork and dist autograd fork tests to be flaky, so this removes these fork tests from the PyTorch repo. The rpc spawn and dist autograd spawn tests are still running. Test Plan: unit tests Differential Revision: D18507384 fbshipit-source-id: 9e239f13850832b4b84724828537f73512f3fca9	2019-11-19 07:02:59 -08:00
Shihao Xu	8dd67057f1	Add RpcAgentTestFixture to extract duplicate code (#29747 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29747 There is duplicate code for components that rely on RpcAgent. Extract it into a reusable test fixture class. Test Plan: ### RPC + RRef ``` buck test mode/dev-nosan //caffe2/test:rpc_fork buck test mode/dev-nosan //caffe2/test:rpc_spawn ``` ``` buck test mode/dev-nosan //caffe2/test:rpc_fork_thrift buck test mode/dev-nosan //caffe2/test:rpc_spawn_thrift ``` ### Dist Autograd ``` buck test mode/dev-nosan //caffe2/test:dist_autograd_fork buck test mode/dev-nosan //caffe2/test:dist_autograd_spawn ``` ``` buck test mode/dev-nosan //caffe2/test:dist_autograd_fork_thrift buck test mode/dev-nosan //caffe2/test:dist_autograd_spawn_thrift ``` ### Dist Optimizer ``` buck test mode/dev-nosan //caffe2/test:dist_optimizer_fork buck test mode/dev-nosan //caffe2/test:dist_optimizer_spawn ``` ``` buck test mode/dev-nosan //caffe2/test:dist_optimizer_fork_thrift buck test mode/dev-nosan //caffe2/test:dist_optimizer_spawn_thrift ``` Differential Revision: D5689636 fbshipit-source-id: f35eea1359addaaac9bd8d00d0a5df228a236511	2019-11-18 12:54:17 -08:00
Junjie Bai	2b05ae0704	Revert "Enable test_distributed for ROCm but only with nccl backend" (#29736 ) Summary: This reverts commit `7073ee2090`. They are flaky on master: https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py3.6-clang7-rocmdeb-ubuntu16.04-test2/6830//console https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py3.6-clang7-rocmdeb-ubuntu16.04-test2/6824//console https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py3.6-clang7-rocmdeb-ubuntu16.04-test2/6802//console cc jithunnair-amd Pull Request resolved: https://github.com/pytorch/pytorch/pull/29736 Differential Revision: D18480543 Pulled By: bddppq fbshipit-source-id: 9a1dd9aa5f5959dc6fbbfdab0df997514221217a	2019-11-13 13:53:05 -08:00
Jithun Nair	7073ee2090	Enable test_distributed for ROCm but only with nccl backend Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28814 Differential Revision: D18437300 Pulled By: ezyang fbshipit-source-id: bf1ab68e0fde683e0082f6c9fe2fc20e2bc8fc06	2019-11-12 07:52:30 -08:00
Nikolay Korovaiko	5b702ab52b	switching to a simple/full executor Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29230 Differential Revision: D18402229 Pulled By: Krovatkin fbshipit-source-id: 62f4bc9bc89c0c7369359bba1359c22a2fa80f46	2019-11-11 13:41:35 -08:00
Jerry Zhang	1c436ded44	Remove `test_quantizer.py` and reuse one of its test in `test_quantization.py` (#27269 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27269 Remove `test_quantizer.py`, add and rewrite one of the tests in `test_quantizer` in `test_quantization.py` The conv test is removed for now since conv pattern is still broken, we'll add another test later ghstack-source-id: 92869823 Test Plan: python test/test_quantization.py Imported from OSS Differential Revision: D18182916 fbshipit-source-id: 325b5d8e877228d6a513e3ddf52c974479250d42	2019-10-29 19:04:21 -07:00
Yanli Zhao	3214f134b6	fix python rpc handler exit crash (#27251 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27251 Explicitly clean up py::objects to avoid segmentation faults when py::objects with CPython are cleaned up later at program exit. See similar issues reported in https://github.com/pybind/pybind11/issues/1598 and https://github.com/pybind/pybind11/issues/1493. Our local tests also caught these segmentation faults when py::objects are cleaned up at program exit. The explanation is: CPython cleans up most critical utilities before cleaning up the PythonRpcHandler singleton, so when the PythonRpcHandler singleton cleans up py::objects and calls dec_ref(), it will crash. The solution is to clean up py::objects earlier, when the Rpc agent calls join(). Note that py::objects cannot be cleaned up when the Rpc agent is destroyed either, as the Rpc agent is a global variable and would have the same issue as PythonRpcHandler. close #27182 ghstack-source-id: 92035069 Test Plan: unit tests on python 3.6 and python 3.5 Differential Revision: D17727362 fbshipit-source-id: c254023f6a85acce35528ba756a4efabba9a519f	2019-10-16 16:57:38 -07:00
Will Feng	c67d3533a7	Update C++ torch::nn parity table, and temporarily disable C++ API parity test (#28117 ) Summary: This PR updates `test/cpp_api_parity/parity-tracker.md` to reflect our progress on C++ `torch::nn` parity. It also disables the C++ API parity test temporarily, and as the next step I will refactor the parity test to make it simpler. Pull Request resolved: https://github.com/pytorch/pytorch/pull/28117 Differential Revision: D17957948 Pulled By: yf225 fbshipit-source-id: 1dd836c25665f57ba8efc6d1abf671a95c03eff7	2019-10-16 11:54:13 -07:00
Jithun Nair	6eef469074	Enable mgpu unit tests for rocm Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27518 Differential Revision: D17880153 Pulled By: bddppq fbshipit-source-id: 5b6210104ec66747558a08f97dda1e7796f681df	2019-10-11 14:35:36 -07:00
Pieter Noordhuis	c5ec0a7ede	Don't run dist_autograd_fork on Python 2 (#27612 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27612 The file imports from torch.distributed.rpc, which won't be initialized when running on Python 2. Test Plan: Imported from OSS Differential Revision: D17855033 Pulled By: pietern fbshipit-source-id: 6e6b0ca248d0512dac5a44e10e153c710cefe02c	2019-10-11 11:18:46 -07:00
Yanli Zhao	fc249c7924	skip all rpc and dist autograd spawn tests for <PY36 (#27191 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27191 skip rpc and distautograd spawns tests for <python 3.6 ghstack-source-id: 91231565 close #27157 Test Plan: unit tests Differential Revision: D17697368 fbshipit-source-id: bb8cf1f47de41f9d350fd60afe37fece293d8680	2019-10-02 23:05:51 -07:00
Shihao Xu	00e588290b	Add test case for init_rpc_backend (#26997 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26997 Reverting accidental change in https://github.com/pytorch/pytorch/pull/26919 ghstack-source-id: 91126906 Reviewed By: zhaojuanmao Differential Revision: D17637468 fbshipit-source-id: 9ffcf4b15b37effe6b5d5f82338ff89298c82a52	2019-10-01 15:44:34 -07:00
Shen Li	bb8983e936	Revert D17694691: Enable distributed autograd tests for >py36 Test Plan: revert-hammer Differential Revision: D17694691 Original commit changeset: 6e7b74064589 fbshipit-source-id: 7da10f478adbbde05f16eb6095acb000d7945c99	2019-10-01 15:00:33 -07:00
Shen Li	7bbb2df6d9	Enable distributed autograd tests for >py36 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27166 Test Plan: Imported from OSS Reviewed By: zhaojuanmao Differential Revision: D17694691 Pulled By: mrshenli fbshipit-source-id: 6e7b740645891fd3cc67600de26346f7b336773b	2019-10-01 14:46:06 -07:00
Yanli Zhao	1d2d59dd79	make rpc and dist-autograd multiprocess test to use both fork and spawn (#25656 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25656 Spawn multiprocessing can catch some issues that fork multiprocessing cannot. Meanwhile, fork works properly with asan tests, but spawn does not yet work with asan tests for some use cases. This diff adds support for launching both spawn and fork tests in the multiProcessingTestCase class, and makes test_rpc and test_dist_autograd run both spawn and fork tests. ghstack-source-id: 91096705 Test Plan: unit tests Reviewed By: xush6528 Differential Revision: D17086007 fbshipit-source-id: af2446e7abe948c37081cff24ed060fd87f84922	2019-10-01 11:15:22 -07:00
Mike Ruberry	a9a9d362e2	Makes test_indexing.py device generic (#26634 ) Summary: - Makes test_indexing.py device generic - Removes test_indexing_cuda.py Note: a couple tests in test_indexing.py were already CPU and CUDA tests, meaning these tests were run multiple times when CUDA was available. Genericizing test_indexing.py corrects this and lets these tests be run on other device types, like XLA, too. Pull Request resolved: https://github.com/pytorch/pytorch/pull/26634 Differential Revision: D17529001 Pulled By: mruberry fbshipit-source-id: e71ba28d947749255a0aceeb7b77a42c4811439d	2019-09-23 11:52:48 -07:00
peter	2ce8c83f67	Enable CPU fused kernel on Windows Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25578 Differential Revision: D17397156 Pulled By: ezyang fbshipit-source-id: b243528c2bfd5a0d401897833048429e67fe40ef	2019-09-17 07:29:40 -07:00
Pieter Noordhuis	e4cd807cdb	Make running Gloo tests conditional on availability Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25913 Test Plan: Imported from OSS Differential Revision: D17313283 Pulled By: pietern fbshipit-source-id: f07cb456e79a0067eac0f7abbc378fbd05c5565f	2019-09-11 02:20:32 -07:00
Lu Fang	75cac0fe69	expose parse_schema and __eq__ function to python and add round trip tests (#23208 ) Summary: expose necessary functions to python, and add round-way tests for function schema str() and parsing functions. We iterate over all the registered function schemas and get the string, then parse the string. We compare the schema generated from parsing with the original one, and make sure they are equal. Pull Request resolved: https://github.com/pytorch/pytorch/pull/23208 ghstack-source-id: 89638026 Test Plan: buck test //caffe2/test:function_schema Reviewed By: zrphercule Differential Revision: D16435471 fbshipit-source-id: 6961ab096335eb88a96b132575996c24090fd4c0	2019-09-06 15:50:56 -07:00
Brian Vaughan	88e4cee3e7	Improve handling of mixed-type tensor operations (#22273 ) Summary: Improve handling of mixed-type tensor operations. This PR affects the arithmetic (add, sub, mul, and div) operators implemented via TensorIterator (so dense but not sparse tensor ops). For these operators, we will now promote to reasonable types where possible, following the rules defined in https://github.com/pytorch/pytorch/issues/9515, and error in cases where the cast would require floating point -> integral or non-boolean to boolean downcasts. The details of the promotion rules are described here: https://github.com/nairbv/pytorch/blob/promote_types_strict/docs/source/tensor_attributes.rst Some specific backwards incompatible examples: * now `int_tensor * float` will result in a float tensor, whereas previously the floating point operand was first cast to an int. Previously `torch.tensor(10) * 1.9` => `tensor(10)` because the 1.9 was downcast to `1`. Now the result will be the more intuitive `tensor(19)` * Now `int_tensor *= float` will error, since the floating point result of this operation can't be cast into the in-place integral type result. See more examples/detail in the original issue (https://github.com/pytorch/pytorch/issues/9515), in the above linked tensor_attributes.rst doc, or in the test_type_promotion.py tests added in this PR: https://github.com/nairbv/pytorch/blob/promote_types_strict/test/test_type_promotion.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/22273 Reviewed By: gchanan Differential Revision: D16582230 Pulled By: nairbv fbshipit-source-id: 4029cca891908cdbf4253e4513c617bba7306cb3	2019-09-05 18:26:09 -07:00
Pritam Damania	7818e7e5d4	Basic framework for Distributed Autograd context. (#24875 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24875 As per https://github.com/pytorch/pytorch/issues/23110, each autograd pass would be assigned a unique autograd_context_id. In this change we introduce a DistAutogradContainer per worker which holds information for each autograd pass currently running. DistAutogradContainer has a map from the autograd_context_id to DistAutogradContext (which holds all the relevant information for the autograd pass). DistAutogradContext currently only stores the autograd_context_id and more information would be added to it later as we build out the rest of the framework. The autograd_context_id is a 64 bit globally unique integer where the first 16 bits are the worker_id and next 48 bits are auto-incrementing for uniqueness. Sample python code on how this would be used for distributed autograd: ``` import torch.distributed.autograd as dist_autograd worker_id = 0 dist_autograd.init(worker_id) with dist_autograd.context() as context_id: # forward pass... # backward pass... # optimizer step... ``` ghstack-source-id: 89119248 Test Plan: unit tests. Differential Revision: D16356694 fbshipit-source-id: d1a8678da0c2af611758dbb5d624d554212330ce	2019-08-28 18:51:56 -07:00
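The ID layout described in the commit above (16 bits of worker_id followed by 48 auto-incrementing bits) can be illustrated with a minimal Python sketch. It assumes that "first 16 bits" means the high-order bits; the names `make_context_id` and `split_context_id` are hypothetical and are not part of `torch.distributed.autograd`.

```python
# Sketch only: assumes the worker_id occupies the 16 high-order bits.
# make_context_id / split_context_id are hypothetical helper names.
WORKER_ID_BITS = 16
COUNTER_BITS = 48
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_context_id(worker_id: int, counter: int) -> int:
    """Pack a worker id and a per-worker counter into one 64-bit integer."""
    if not 0 <= worker_id < (1 << WORKER_ID_BITS):
        raise ValueError("worker_id must fit in 16 bits")
    if not 0 <= counter <= COUNTER_MASK:
        raise ValueError("counter must fit in 48 bits")
    return (worker_id << COUNTER_BITS) | counter


def split_context_id(context_id: int) -> tuple:
    """Recover (worker_id, counter) from a packed context id."""
    return context_id >> COUNTER_BITS, context_id & COUNTER_MASK
```

Because each worker owns a disjoint range of IDs, two workers can allocate context IDs independently without coordinating, which is presumably why the worker_id is embedded in the ID.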
Raghuraman Krishnamoorthi	9945c0cea6	Work around for bias quantization for conv and linear operators (#25212 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25212 In eager mode, all modules need to work with input tensors that can change qparams dynamically. This issue https://github.com/pytorch/pytorch/issues/23874 will address this via FBGEMM modifications. This is a work around before that. ghstack-source-id: 89118038 Test Plan: buck test caffe2/test:quantized -- 'test_conv_api $test_quantized_nn_mods\.ModuleAPITest$' --print-passing-details Summary (total time 65.86s): PASS: 1 FAIL: 0 SKIP: 0 FATAL: 0 TIMEOUT: 0 OMIT: 0 Differential Revision: D17064471 fbshipit-source-id: 3c192442b19bf2d9d88d4e52de6c24dc134a846f	2019-08-28 07:24:03 -07:00
Elias Ellison	277cd748f9	skip fstrings test if not py36 (#25184 ) Summary: Fixes py35 job on master Pull Request resolved: https://github.com/pytorch/pytorch/pull/25184 Differential Revision: D17057957 Pulled By: eellison fbshipit-source-id: 53decc408680d9436395698cbd4b4ede98933159	2019-08-26 13:58:45 -07:00
Will Feng	1bf1970fe2	Add Python/C++ torch.nn API parity test harness (#23852 ) Summary: This PR adds test harness for checking Python / C++ API parity for `torch.nn.Module` subclasses. Under the hood, we use JIT tracing to transfer `nn.Module` state from Python to C++, so that we can test initialization / forward / backward on Python / C++ modules with the same parameters and buffers. Pull Request resolved: https://github.com/pytorch/pytorch/pull/23852 Differential Revision: D16830204 Pulled By: yf225 fbshipit-source-id: 9b5298c0e8cd30e341a9f026e6f05604a82d6002	2019-08-26 08:02:25 -07:00
Elias Ellison	ab38059bc7	fix annotated assignment (#25094 ) Summary: Fixing parsing for annotated assignment `List[int] a = []`. See https://github.com/pytorch/pytorch/pull/24989/files?file-filters%5B%5D=.py for changes to the test_jit_py3 & run_test files. follow up to https://github.com/pytorch/pytorch/pull/24477 and fix for https://github.com/pytorch/pytorch/issues/25086 Pull Request resolved: https://github.com/pytorch/pytorch/pull/25094 Differential Revision: D16985016 Pulled By: eellison fbshipit-source-id: 6be1363f2503303b96bd2e6a9f188ad72441f4eb	2019-08-23 13:14:38 -07:00
Zachary DeVito	f9f5af0ed7	Revert D16949314: [jit] Fix bugs in assignment to optionals Test Plan: revert-hammer Differential Revision: D16949314 Original commit changeset: 7f63d88b30a3 fbshipit-source-id: d1f00de2ad9c3484b731ad1b24205ca60024355d	2019-08-22 16:50:48 -07:00
Zachary DeVito	bb79b61ce7	Fix bugs in assignment to optionals (#24989 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24989 This fixes the cases where a type annotated with optional cannot be conditionally assigned to none: ``` x : Optional[int] = 4 if ...: x = None ``` Test Plan: Imported from OSS Differential Revision: D16949314 Pulled By: zdevito fbshipit-source-id: 7f63d88b30a3f5b024c2a539aa74967c9202af00	2019-08-22 16:27:46 -07:00
Michael Suo	ef14d88f27	Make torch.jit.Attribute work with PYTORCH_ENABLED=0 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23851 Test Plan: Imported from OSS Differential Revision: D16840394 Pulled By: suo fbshipit-source-id: b72e081513de73f565f3aeaa61125b7d3aa9c3e7	2019-08-19 15:23:21 -07:00
Michael Suo	0ce7264ed6	Don't require slow test reporting in `run_tests.py --pytest` (#24448 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24448 The setting `--durations=10` was hard-coded, which is annoying as I don't necessarily care. A good alternative to get the same behavior is: ``` python run_tests.py --pytest -- --durations=10 ``` Test Plan: Imported from OSS Differential Revision: D16876380 Pulled By: suo fbshipit-source-id: 1e14d366db45b6b9bf4a4ab1633b0f6ece29f6bc	2019-08-17 01:26:07 -07:00
James Reed	7597741159	Run quantization tests first Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24366 Test Plan: Imported from OSS Differential Revision: D16815295 Pulled By: jamesr66a fbshipit-source-id: 01478ce2fcbe0620cd5cf9854121602e0663c057	2019-08-14 18:09:32 -07:00
James Reed	e7f1977bae	test_nn_quantized -> test_quantized_nn_mods (#24201 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24201 It turns out that the `run_test` script uses a blacklist of "exclude" tests and tests if the test name [starts with](https://github.com/pytorch/pytorch/blob/master/test/run_test.py#L342) the given blacklist item. `nn` was passed as a blacklist item in CI, and that meant that not only was test_nn skipped, but also test_nn_quantized. This renames the test to avoid this situation, and imo puts it in a better position lexicographically next to the other quantization tests. Test Plan: Imported from OSS Differential Revision: D16772820 Pulled By: jamesr66a fbshipit-source-id: 4cde0729b48ae3e36fcedab9c98197831af82dde	2019-08-13 17:07:15 -07:00
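The prefix-matching pitfall described in the commit above can be reproduced with a small Python sketch. The function names and the short test names (`nn`, `nn_quantized`) are illustrative assumptions, not the actual contents of `test/run_test.py`.

```python
# Sketch only: a simplified stand-in for an exclude list matched by prefix.
# select_tests_by_prefix / select_tests_exact are hypothetical names.
def select_tests_by_prefix(tests, exclude):
    """Drop every test whose name starts with any excluded item."""
    return [t for t in tests if not any(t.startswith(e) for e in exclude)]


def select_tests_exact(tests, exclude):
    """Drop only tests whose name exactly equals an excluded item."""
    excluded = set(exclude)
    return [t for t in tests if t not in excluded]


before_rename = ["nn", "nn_quantized", "quantized"]
after_rename = ["nn", "quantized_nn_mods", "quantized"]

# Prefix matching on "nn" also removes "nn_quantized".
kept_before = select_tests_by_prefix(before_rename, ["nn"])
# After the rename, the quantized module tests no longer share the prefix.
kept_after = select_tests_by_prefix(after_rename, ["nn"])
```

Renaming the test sidesteps the collision without changing the matching rule; exact matching, as in `select_tests_exact`, would be the alternative fix on the runner side.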
Shen Li	8b349073ce	sync and async torch.distributed.rpc for builtin operators (#23228 ) Summary: Features: * sync and async RPC for builtin operators * RpcAgent API * ProcessGroupAgent implementation Goal: * have a minimum working and testable RPC implementation * make sure the RpcAgent API is sufficient for future ThriftAgent and TensorPipeAgent implementation * For tensor pipe implementation, it might allocate multiple underlying communication channels with different types, and might also use streaming serialization/deserialization for large tensors. To support this requirement, the current implementation only converts a BuiltinOp into a Message which contains a byte vector and a tensor table. It is up to the RpcAgent implementation to determine how it would like to serialize a Message object. * For ThriftAgent, as Thrift has its own request/response matching solution, the Message.id is no longer necessary. Hence the id can be dropped during serialization. All it needs to do is to pass the response Message object to the Future returned by send(...). * support blocking and non-blocking RequestCallback * blocking means the callback won't return before sending out the response * non-blocking can be achieved by enqueuing the `(from, request, RpcAgent&)` tuple and using a different thread to process it. That is why there is an `RpcAgent&` arg in the param list. We are not exporting this diff until we finalize distributed autograd design and publish the API review publicly. https://fb.quip.com/FabTAZKVgQpf Pull Request resolved: https://github.com/pytorch/pytorch/pull/23228 ghstack-source-id: 87816717 Reviewed By: zhaojuanmao Differential Revision: D15194693 fbshipit-source-id: 7adb600796613cde6073db6c227451b89940ecaf	2019-08-06 16:03:01 -07:00
James Reed	40f0b1c844	Enable OSS quantization tests (#23858 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23858 Pull Request resolved: https://github.com/pytorch/pytorch/pull/23718 Changes: - Enable tests for quantization test files in `run_tests.py` - Remove `__future__` imports from `torch/nn/qat/modules/__init__.py`, since `unicode_literals` messes up imports on python2 because the elements in `__all__` will be Unicode and not string - Skip PostTrainingQuantTests if the build doesn't have FBGEMM (only a small subset of targets in tests) or if testing under UBSAN (the suppression file doesn't seem to work) Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D16639467 Pulled By: jamesr66a fbshipit-source-id: 532766797c216976dd7e07d751f768ff8e0fc207	2019-08-06 11:20:30 -07:00
SsnL	8482efb203	pin_memory malloc now uses existing context if available. (#22229 ) Summary: This is achieved by using `cuDevicePrimaryCtxGetState` as a way to check whether a primary context exists on a device. It is not too slow, from this benchmark of a single call to it on CUDA 10.1, Titan Xp, driver 415.27: ``` --------------------------------------------------------------------- Benchmark Time CPU Iterations --------------------------------------------------------------------- BM_cuDevicePrimaryCtxGetState 301 ns 301 ns 2319746 ``` Commits: 1. Add `CUDAHooks::getDeviceWithPrimaryContext` which returns a device index with primary context (if exists). Link `c10/cuda` against `libcuda` for device API calls. 2. Use `getDeviceWithPrimaryContext` to check primary context in `pin_memory`. Fix `OptionalDeviceGuard` doc. 3. Refactor `test_cuda_primary_ctx.py` to support multiple tests. Add test for this in that file. Fixes https://github.com/pytorch/pytorch/issues/21081. Pull Request resolved: https://github.com/pytorch/pytorch/pull/22229 Differential Revision: D16170194 Pulled By: zou3519 fbshipit-source-id: 485a45f211b7844c9e69c63f3b3b75194a796c5d	2019-07-16 10:18:30 -07:00
Pieter Noordhuis	6ff0c6ca3f	Remove THD (#22065 ) Summary: It's been ~9 months since moving THD to the `torch.distributed.deprecated` namespace (see https://github.com/pytorch/pytorch/issues/11405) and we haven't seen issues related to it, so it's time to remove it. Closes https://github.com/pytorch/pytorch/issues/18967. Pull Request resolved: https://github.com/pytorch/pytorch/pull/22065 Reviewed By: mrshenli Differential Revision: D15983669 Pulled By: pietern fbshipit-source-id: 2a2f5866f9a63040bc7cef3956d5fd215aba7165	2019-06-25 12:19:13 -07:00
Shen Li	25d1496d58	Fix Process Group for tensors shared across processes (#21449 ) Summary: Ops on a Process Group (pg) instance will hit an error when input/output tensors are created on a different process, because, pg calls `recordStream` on `CUDACachingAllocator` which only knows tensors created within the same process. The proposed solution is to add a `suppressError` arg (suggestions for better names?) to `recordStream`. See comments in code for arguments. CC pichuang1984 Pull Request resolved: https://github.com/pytorch/pytorch/pull/21449 Differential Revision: D15689736 Pulled By: mrshenli fbshipit-source-id: e7fc81b167868f8666536067eaa7ae2c8584d88e	2019-06-11 11:50:25 -07:00
Elias Ellison	f6e5846a67	add handle to run all jit tests (#21161 ) Summary: Now you can run `python test/run_tests --jit` to run all jit tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/21161 Differential Revision: D15563912 Pulled By: eellison fbshipit-source-id: 4bb0285cda4168b72a3dc4bba471485566a59873	2019-05-30 14:12:21 -07:00
Dmytro Dzhulgakov	c25e33789e	Lightweight at-most-once logging for API usage (#20745 ) Summary: Resubmit #20698 which got messed up. Idea is that when PyTorch is used in a custom build environment (e.g. Facebook), it's useful to track usage of various APIs centrally. This PR introduces a simple very lightweight mechanism to do so - only first invocation of a trigger point would be logged. This is significantly more lightweight than #18235 and thus we can allow to put logging in e.g. TensorImpl. Also adds an initial list of trigger points. Trigger points are added in such a way that no static initialization triggers them, i.e. just linking with libtorch.so will not cause any logging. Further suggestions of what to log are welcomed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20745 Differential Revision: D15429196 Pulled By: dzhulgakov fbshipit-source-id: a5e41a709a65b7ebccc6b95f93854e583cf20aca	2019-05-23 23:17:59 -07:00
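The at-most-once idea in the commit above (only the first invocation of a trigger point is logged) can be sketched in plain Python. This illustrates the pattern only; the PR's actual mechanism lives in C++, and the names `log_once` and `sink` here are hypothetical.

```python
# Sketch only: at-most-once logging keyed by event name.
# log_once and its sink argument are hypothetical, not PyTorch API.
import threading

_seen = set()
_seen_lock = threading.Lock()


def log_once(event: str, sink) -> bool:
    """Forward `event` to `sink` the first time it is seen; return True if logged."""
    with _seen_lock:
        if event in _seen:
            return False
        _seen.add(event)
    # Call the sink outside the lock so a slow sink does not block other threads.
    sink(event)
    return True


records = []
first = log_once("tensor.create", records.append)
second = log_once("tensor.create", records.append)
```

After the first call, a repeated trigger costs only a lock acquisition and a set lookup, which is the property that makes it reasonable to place such trigger points on hot paths.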
Richard Zou	83a80d2b31	Add test/test_namedtensor.py (#20168 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20168 ghimport-source-id: 78bd3c4b6bc87c216ce33dba13b61feb87e5fe53 Reviewed By: gchanan Differential Revision: D15278222 Pulled By: zou3519 fbshipit-source-id: 3bcdb1cb654400350d42464dd9e0d5e0a7116e1e	2019-05-09 09:09:22 -07:00
Tzu-Wei Huang	98e312cf96	TensorBoard support within PyTorch (#16196 ) Summary: This PR adds TensorBoard logging support natively within PyTorch. It is based on the tensorboardX code developed by lanpa and relies on changes inside the tensorflow/tensorboard repo landing at https://github.com/tensorflow/tensorboard/pull/2065. With these changes users can simply `pip install tensorboard; pip install torch` and then log PyTorch data directly to the TensorBoard protobuf format using ``` import torch from torch.utils.tensorboard import SummaryWriter writer = SummaryWriter() s1 = torch.rand(1) writer.add_scalar('data/scalar1', s1[0], 0) writer.close() ``` Design: - `EventFileWriter` and `RecordWriter` from tensorboardX now live in tensorflow/tensorboard - `SummaryWriter` and PyTorch-specific conversion from tensors, nn modules, etc. now live in pytorch/pytorch. We also support Caffe2 blobs and nets. Action items: - [x] `from torch.utils.tensorboard import SummaryWriter` - [x] rename functions - [x] unittests - [x] move actual writing function to tensorflow/tensorboard in https://github.com/tensorflow/tensorboard/pull/2065 Review: - Please review for PyTorch standard formatting, code usage, etc. - Please verify unittest usage is correct and executing in CI Any significant changes made here will likely be synced back to github.com/lanpa/tensorboardX/ in the future. cc orionr, ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/16196 Differential Revision: D15062901 Pulled By: orionr fbshipit-source-id: 3812eb6aa07a2811979c5c7b70810261f9ea169e	2019-04-25 21:30:23 -07:00
Junjie Bai	ef499cd567	Remove no-fork workaround for running tests with ROCm (#19436 ) Summary: This should have been fixed in the newest ROCm version. Pull Request resolved: https://github.com/pytorch/pytorch/pull/19436 Reviewed By: ezyang Differential Revision: D15004685 Pulled By: bddppq fbshipit-source-id: 19fd4cca94c914dc54aabfbb4e62b328aa348a35	2019-04-19 09:51:03 -07:00
Zafar Takhirov	c145c34a7b	Basic implementation of QRelu in C10 (#19091 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19091 Implements a basic quantized ReLU (uint8). This is a temporary solution before using the `QTensor` type instead of the tuple. Reviewed By: dzhulgakov Differential Revision: D14565413 fbshipit-source-id: 7d53cf5628cf9ec135603d6a1fb7c79cd9383019	2019-04-11 08:47:56 -07:00
jgong5	3ad710b837	Add MKL-DNN Tensor (#17748 ) Summary: This is a minimalist PR to add the MKL-DNN tensor, per the discussion in GitHub issue https://github.com/pytorch/pytorch/issues/16038. Ops with MKL-DNN tensors will be supported in follow-up PRs to speed up the imperative path. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17748 Reviewed By: dzhulgakov Differential Revision: D14614640 Pulled By: bddppq fbshipit-source-id: c58de98e244b0c63ae11e10d752a8e8ed920c533	2019-04-08 21:41:38 -07:00
Elias Ellison	a5ddecd00c	Move fuser to test_jit_fuser (#18590 ) Summary: Start of breaking up test_jit.py New files will have the format test_jit_* so they are easily grepable but remain in the same directory so we don't have to go through multiple sources for imports. I am adding a test that's expected to fail to be sure it's running. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18590 Reviewed By: wanchaol Differential Revision: D14677094 Pulled By: eellison fbshipit-source-id: 9782c6aa9525bb6f332fc75cfff004c83a417522	2019-03-29 18:13:26 -07:00
Edward Yang	4bea15f580	Fix lint in run_test.py Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17815 Reviewed By: eellison Differential Revision: D14390308 fbshipit-source-id: 22efd62a1bbd1fc8155a942d7160d5b7d3158e6b	2019-03-08 14:41:36 -08:00
peter	c78da0c6ed	Enable using CMD when building cpp extensions on Windows Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17706 Differential Revision: D14346482 Pulled By: ezyang fbshipit-source-id: 7c85e51c701f6c0947ad324ef19fafda40ae1cb9	2019-03-06 14:45:31 -08:00
Gao, Xiang	b6b99fd7d3	Add namedtuple return for min, median, mode, kthvalue, add test for namedtuple return API (#16186 ) Summary: This partially fixes https://github.com/pytorch/pytorch/issues/394 and depends on https://github.com/pytorch/pytorch/pull/15429. I suggest reviewing this only after https://github.com/pytorch/pytorch/pull/15429 has landed; otherwise the diff might be too large to review. The test only allows explicitly whitelisted operators to have a named return. Differential Revision: D14070735 Pulled By: ezyang fbshipit-source-id: ace2a672998b4e4a8094f52cbda5aa1cea6e3b42	2019-02-16 00:01:33 -08:00
Xiang Gao	07b5782ff7	Add some missing docs to torch.rst, new unittest to enforce torch.rst no longer miss anything (#16039 ) Summary: This prevents people (reviewer, PR author) from forgetting to add things to `torch.rst`. When something new is added to `_torch_doc.py` or `functional.py` but is intentionally left out of `torch.rst`, people should manually whitelist it in `test_docs_coverage.py`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16039 Differential Revision: D14070903 Pulled By: ezyang fbshipit-source-id: 60f2a42eb5efe81be073ed64e54525d143eb643e	2019-02-15 07:02:31 -08:00
Thomas Viehmann	6a6983ed7f	create type hint stub files for module torch (#12500 ) Summary: We have: - This is an initial stab at creating a type stub `torch/__init__.pyi` . - This is only tested on Python 3, since that's the only Python version mypy works on. - So far, we only aim at doing this for torch functions and torch.Tensor. - Quite a few methods and functions have to be typed manually. These are done in `torch/__init__.pyi.in` For me, PyCharm (the non-paid one) didn't seem to indicate errors in the .pyi when opening and seemed to be able to get the type hint for the few functions I tried, but I don't use PyCharm for my usual PyTorch activities, so I didn't extensively try this out. An example of a generated PYI is at [this gist](https://gist.github.com/ezyang/bf9b6a5fa8827c52152858169bcb61b1). Pull Request resolved: https://github.com/pytorch/pytorch/pull/12500 Differential Revision: D13695553 Pulled By: ezyang fbshipit-source-id: 4566c71913ede4e4c23ebc4a72c17151f94e8e21	2019-01-29 12:14:17 -08:00
Syed Tousif Ahmed	17e3ab957a	Report the slowest 10 tests when using pytest (#16423 ) Summary: This flag is useful in identifying if a test is taking way too long like the ones in the following snippet when running the test suite with pytest. `9757ad35b0/test/common_utils.py (L814-L835)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/16423 Differential Revision: D13843507 Pulled By: ezyang fbshipit-source-id: 643e1766a85905b3b112ea5ca562135a17896a72	2019-01-28 10:33:05 -08:00
SsnL	ffd613800f	Add IS_PYTORCH_CI flag for testing (#16006 ) Summary: Use case: Some data loader tests rely on `psutil` (a third party lib). So they are guarded by `skipIf`. But we want to always test them on CI envs. With `IS_PYTORCH_CI`, we can raise if `psutil` is not found. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16006 Reviewed By: ezyang Differential Revision: D13673957 Pulled By: yf225 fbshipit-source-id: c63a7138093f45333c0b371fed0bcc88b67f2a22	2019-01-16 23:07:38 -08:00
Mickaël Schoentgen	71c6e24373	Fix several ResourceWarning: unclosed file (#15746 ) Summary: Hello, This is a patch to fix `ResourceWarning: unclosed file`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/15746 Differential Revision: D13587286 Pulled By: soumith fbshipit-source-id: 08ac34c5b51d9334867f65a2927bff11511553f3	2019-01-09 15:36:53 -08:00
bddppq	2db742fc95	Do not use fork to invoke test scripts in pytorch rocm CI Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14600 Differential Revision: D13523937 Pulled By: bddppq fbshipit-source-id: 1493fdd051283650081d7944bb2bd7f0c4c44990	2018-12-19 21:35:16 -08:00
Peter Goldsborough	6f2307ba6a	Allow building libraries with setuptools that dont have abi suffix (#14130 ) Summary: When using `setuptools` to build a Python extension, setuptools will automatically add an ABI suffix like `cpython-37m-x86_64-linux-gnu` to the shared library name when using Python 3. This is required for extensions meant to be imported as Python modules. When we use setuptools to build shared libraries not meant as Python modules, for example libraries that define and register TorchScript custom ops, having your library called `my_ops.cpython-37m-x86_64-linux-gnu.so` is a bit annoying compared to just `my_ops.so`, especially since you have to reference the library name when loading it with `torch.ops.load_library` in Python. This PR fixes this by adding a `with_options` class method to the `torch.utils.cpp_extension.BuildExtension` which allows configuring the `BuildExtension`. In this case, the first option we add is `no_python_abi_suffix`, which we then use in `get_ext_filename` (override from `setuptools.build_ext`) to throw away the ABI suffix. I've added a test `setup.py` in a `no_python_abi_suffix_test` folder. Fixes https://github.com/pytorch/pytorch/issues/14188 t-vi fmassa soumith Pull Request resolved: https://github.com/pytorch/pytorch/pull/14130 Differential Revision: D13216575 Pulled By: goldsborough fbshipit-source-id: 67dc345c1278a1a4ee4ca907d848bc1fb4956cfa	2018-11-27 17:35:53 -08:00
Sam Gross	006505bb8f	Speed-up "advanced" indexing operations (#13420 ) Summary: This speeds-up "advanced" indexing (indexing a tensor by a tensor) on CPU and GPU. There's still a bunch of work to do, including speeding up indexing by a byte (boolean) mask and speeding up the derivative calculation for advanced indexing. Here's some speed comparisons to indexing on master using a little [benchmark script](https://gist.github.com/colesbury/c369db72aad594e5e032c8fda557d909) with 16 OpenMP threads and on a P100. The test cases are listed as (input shape -> output shape). \| Test case \| CPU (old vs. new) \| CUDA (old vs. new) \| \|-----------------------\|---------------------\|------------------------\| \| 1024x1024 -> 512x1024 \| 225 us vs. 57 us \| 297 us vs. 47 us \| \| 1024x1024 -> 1024x512 \| 208 us vs. 153 us \| 335 us vs. 54 us \| \| 50x50 -> 20000x50 \| 617 us vs. 77 us \| 239 us vs. 54 us \| \| 50x50 -> 50x20000 \| 575 us vs. 236 us \| 262 us vs. 58 us \| \| 2x5x10 -> 10 \| 65 us vs. 18 us \| 612 us vs. 93 us \| See #11647 Pull Request resolved: https://github.com/pytorch/pytorch/pull/13420 Reviewed By: soumith Differential Revision: D13088936 Pulled By: colesbury fbshipit-source-id: 0a5c2ee9aa54e15f96d06692d1694c3b24b924e2	2018-11-27 15:23:59 -08:00
Johannes M Dieterich	ce48958606	enable more unit tests (#13166 ) Summary: This enables the distributions and utils test sets for ROCm. Individual tests are enabled that now pass due to fixes in HIP/HCC/libraries versions in white rabbit. For attention: bddppq ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/13166 Differential Revision: D12814759 Pulled By: bddppq fbshipit-source-id: ea70e775c707d7a8d2776fede6154a755adef43e	2018-11-12 18:49:52 -08:00
Peter Goldsborough	7978ba45ba	Update path in CI script to access ninja (#13646 ) Summary: We weren't running C++ extensions tests in CI. Also, let's error hard when `ninja` is not available instead of skipping C++ extensions tests. Fixes https://github.com/pytorch/pytorch/issues/13622 ezyang soumith yf225 Pull Request resolved: https://github.com/pytorch/pytorch/pull/13646 Differential Revision: D12961468 Pulled By: goldsborough fbshipit-source-id: 917c8a14063dc40e6ab79a0f7d345ae2d3566ba4	2018-11-07 14:31:29 -08:00
Pieter Noordhuis	be424de869	Add torch.multiprocessing.spawn helper (#13518 ) Summary: This helper addresses a common pattern where one spawns N processes to work on some common task (e.g. parallel preprocessing or multiple training loops). A straightforward approach is to use the multiprocessing API directly and then consecutively call join on the resulting processes. This pattern breaks down in the face of errors. If one of the processes terminates with an exception or via some signal, and it is not the first process that was launched, the join call on the first process won't be affected. This helper seeks to solve this by waiting on termination from any of the spawned processes. When any process terminates with a non-zero exit status, it terminates the remaining processes, and raises an exception in the parent process. If the process terminated with an exception, it is propagated to the parent. If the process terminated via a signal (e.g. SIGINT, SIGSEGV), this is mentioned in the exception as well. Requires Python >= 3.4. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13518 Reviewed By: orionr Differential Revision: D12929045 Pulled By: pietern fbshipit-source-id: 00df19fa16a568d1e22f37a2ba65677ab0cce3fd	2018-11-06 14:08:37 -08:00
Tongzhou Wang	6d2b3cc869	Fix pytest, make it work with run_test.py (#13416 ) Summary: Fixes #13326 Also now you can use `run_test.py` with `pytest`. E.g., ``` python run_test.py -vci distributed -pt ``` Yes it works with `distributed` and `cpp_extension`. cc zou3519 vishwakftw Pull Request resolved: https://github.com/pytorch/pytorch/pull/13416 Differential Revision: D12895622 Pulled By: SsnL fbshipit-source-id: 2d18106f3a118d642a666bfb1318f41c859c3df7	2018-11-01 19:08:06 -07:00
verhoek	33b00bdbb8	cwd arg in shell function of run_test set to optional (#13247 ) Summary: Tiny fix. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13247 Differential Revision: D12830311 Pulled By: soumith fbshipit-source-id: 405620e3a1de5bfc7e039f9aaf2f7cb7a3bca1b1	2018-10-29 15:17:00 -07:00
Jesse Hellemn	448a32e0ee	Adding timestamps to the beginning of every test file in run_test Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12994 Reviewed By: anderspapitto Differential Revision: D10515291 Pulled By: pjh5 fbshipit-source-id: 191054cdacff308b63e9063d22d62314398e4f88	2018-10-24 11:42:31 -07:00
Edward Yang	bc1d96ca98	Add support for inline expect tests. (#12825 ) Summary: expecttest and test_expecttest are the implementation and tests for this functionality. I wired it up to the --accept flag, but there's also a new environment variable EXPECTTEST_ACCEPT which may be more convenient to trigger. Haven't tested if this works in fbcode. There may be a few expect tests which will benefit from inline treatment, but I just did one to show it works. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/12825 Reviewed By: teng-li Differential Revision: D10448630 Pulled By: ezyang fbshipit-source-id: 3d339f82e2d00891309620a60e13039fa1ed8b46	2018-10-22 19:29:04 -07:00
James Sun	f4944f0f8a	Rename test/common.py to test/common_utils.py (#12794 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12794 common.py is used in base_module for almost all tests in test/. The name of this file is so common that it can easily conflict with other dependencies if they happen to have another common.py in the base module. Rename the file to avoid the conflict. Reviewed By: orionr Differential Revision: D10438204 fbshipit-source-id: 6a996c14980722330be0a9fd3a54c20af4b3d380	2018-10-17 23:04:29 -07:00
Benoit Steiner	bbe6ef3864	torch.finfo and torch.iinfo to mimic the numpy equivalent (#12472 ) Summary: This pull request intends to provide the functionality requested in https://github.com/pytorch/pytorch/issues/10742 by adding a new torch.finfo and torch.iinfo API. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12472 Differential Revision: D10250829 Pulled By: benoitsteiner fbshipit-source-id: eb22ca55d5b0064bef381fa7f1eb75989977df30	2018-10-15 13:43:52 -07:00
Alex Ford	7a1b668283	Implement Tensor.__cuda_array_interface__. (#11984 ) Summary: _Implements pytorch/pytorch#11914, cc: ezyang_ Implements `__cuda_array_interface__` for non-sparse cuda tensors, providing compatibility with numba (and other cuda projects...). Adds `numba` installation to the `xenial-cuda9` jenkins test environments via direct installation in `.jenkins/pytorch/test.sh` and numba-oriented test suite in `test/test_numba_integration.py`. See interface reference at: https://numba.pydata.org/numba-doc/latest/cuda/cuda_array_interface.html Pull Request resolved: https://github.com/pytorch/pytorch/pull/11984 Differential Revision: D10361430 Pulled By: ezyang fbshipit-source-id: 6e7742a7ae4e8d5f534afd794ab6f54f67808b63	2018-10-12 13:41:05 -07:00
Christian Puhrsch	d8f6be686d	Remove torch/legacy (#11823 ) Summary: Largely unused and hinders current development Pull Request resolved: https://github.com/pytorch/pytorch/pull/11823 Differential Revision: D9925094 Pulled By: cpuhrsch fbshipit-source-id: c797f62180e2128f9a567b0c57c8347957470ea5	2018-09-20 14:00:54 -07:00
Gregory Chanan	85ff72348d	Only involve tensor device in CUDA -> CPU copy, not current device. (#11592 ) Summary: This also unifies the device usage between the async and sync case. Fixes https://github.com/pytorch/pytorch/issues/10832. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11592 Differential Revision: D9797355 Pulled By: gchanan fbshipit-source-id: e496cd371111cfaf9a6c664167967b395e3d72e9	2018-09-13 16:32:46 -07:00
Teng Li	0988bbad2d	C10d release to torch.distributed for PT1 (#11405 ) Summary: The old `torch.distributed` will go to `torch.distributed.deprecated` The old DDP will go to `torch.nn.parallel.deprecated` Now `torch.nn.parallel.DDP` will use c10d DDP Now `torch.distributed` will use C10d frontend API Pull Request resolved: https://github.com/pytorch/pytorch/pull/11405 Reviewed By: pietern Differential Revision: D9733733 Pulled By: teng-li fbshipit-source-id: d6a3f3e73f8d3a7fcb1f4baef53c78063b8cbb08	2018-09-10 23:27:22 -07:00
Tongzhou Wang	0d5e4a2c66	Allow passing through arguments to unittest (#11209 ) Summary: Example: ```sh python run_test.py -i sparse -- TestSparse.test_factory_size_check -f ``` With this, the `--verbose` option is redundant (one can call `python run_test.py -- -v` instead of `python run_test.py -v`). But since this is (probably) a frequently used flag, I didn't remove the existing easier-to-use option. cc ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/11209 Differential Revision: D9632215 Pulled By: SsnL fbshipit-source-id: ff522802da11ef0a0714578be46e4a44f6343d44	2018-09-03 20:09:08 -07:00
iotamudelta	33c7cc13ca	improve docker packages, fix bugs, enable tests, enable FFT (#10893 ) Summary: * improve docker packages (install OpenBLAS to have at-compile-time LAPACK functionality w/ optimizations for both Intel and AMD CPUs) * integrate rocFFT (i.e., enable Fourier functionality) * fix bugs in ROCm caused by wrong warp size * enable more test sets, skip the tests that don't work on ROCm yet * don't disable asserts any longer in hipification * small improvements Pull Request resolved: https://github.com/pytorch/pytorch/pull/10893 Differential Revision: D9615053 Pulled By: ezyang fbshipit-source-id: 864b4d27bf089421f7dfd8065e5017f9ea2f7b3b	2018-09-02 08:54:42 -07:00
Teng Li	56539f5fe1	PT1 Distributed Release MileStone No.1 - Completed Distributed Package and CI tests (#10871 ) Summary: The PR includes: (1) torch.distributed.c10d, which now includes the complete backward compatible frontend API for `torch.distributed` (2) `env://` init method functionality (3) Minor change to `test_distributed.py`, which is now a test for `torch.distributed.c10d`. (4) The old `test_distributed.py` is now moved to `test_distributed_thd` (5) Miscellaneous bug fixes. (6) DDP CPU test is removed since c10d doesn't have this support yet, but this is a very easy test after moving DDP CPU's dependency to torch.distributed.c10d. (7) CI config to test MPI, NCCL, and Gloo backend of c10d Now all the distributed tests, including c10d DDP, pass with the c10d frontend API TODO: (in a separate PR) MPI subgroup support, once this is added, CI group test will be enabled. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10871 Differential Revision: D9554514 Pulled By: teng-li fbshipit-source-id: fb686ad42258526c8b4372148e82969fac4f42dd	2018-08-29 12:55:57 -07:00
Teng Li	a88463cd9a	Working async version of AllGather, test fix and compiler warnings, and CI (#10932 ) Summary: The previous NCCL all gather doesn't work as expected. This is a fully working async version. Tested on both C++ and Python Frontend. Multi-node: ``` tengli@learnfair042:~/new_pytorch/pytorch/torch/lib/build/c10d/test$ TMPFILE="/private/home/tengli/temp/tengli-test" RANK=0 WORLD_SIZE=2 ./ProcessGroupNCCLTest Multi-node world size: 2 rank: 0 Allreduce test successful Broadcast test successful Reduce test successful Allgather test successful tengli@learnfair117:~/new_pytorch/pytorch/torch/lib/build/c10d/test$ TMPFILE="/private/home/tengli/temp/tengli-test" RANK=1 WORLD_SIZE=2 ./ProcessGroupNCCLTest Multi-node world size: 2 rank: 1 Allreduce test successful Broadcast test successful Reduce test successful Allgather test successful ``` CI test: ``` test_set_get (__main__.FileStoreTest) ... ok test_set_get (__main__.PrefixFileStoreTest) ... ok test_set_get (__main__.PrefixTCPStoreTest) ... ok test_allreduce_ops (__main__.ProcessGroupGlooTest) ... ok test_broadcast_ops (__main__.ProcessGroupGlooTest) ... ok test_allgather_ops (__main__.ProcessGroupNCCLTest) ... ok test_allreduce_ops (__main__.ProcessGroupNCCLTest) ... ok test_broadcast_ops (__main__.ProcessGroupNCCLTest) ... ok test_reduce_ops (__main__.ProcessGroupNCCLTest) ... ok test_common_errors (__main__.RendezvousFileTest) ... ok test_nominal (__main__.RendezvousFileTest) ... ok test_common_errors (__main__.RendezvousTCPTest) ... ok test_nominal (__main__.RendezvousTCPTest) ... ok test_unknown_handler (__main__.RendezvousTest) ... ok test_set_get (__main__.TCPStoreTest) ... ok ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/10932 Differential Revision: D9542067 Pulled By: teng-li fbshipit-source-id: 25513eddcc3119fd736875d69dfb631b10f4ac86	2018-08-28 12:40:14 -07:00
Johannes M Dieterich	a4c59a9dab	MIOpen integration, more tests enabled, bug fixes (#10612 ) Summary: * first integration of MIOpen for batch norm and conv on ROCm * workaround a ROCm compiler bug exposed by elementwise_kernel through explicit capture of variables in the densest packing * workaround a ROCm compiler bug exposed by having `extern "C" __host__` as a definition and just `__host__` in the implementation through the hipify script * use fabs() in accordance with C++11 for double absolute, not ::abs() which is integer-only on ROCm * enable test_sparse set on CI, skip tests that don't work currently on ROCm * enable more tests in test_optim after the elementwise_bug got fixed * enable more tests in test_dataloader * improvements to hipification and ROCm build With this, resnet18 on CIFAR data trains without hang or crash in our tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10612 Reviewed By: bddppq Differential Revision: D9423872 Pulled By: ezyang fbshipit-source-id: 22c0c985217d65c593f35762b3eb16969ad96bdd	2018-08-23 15:24:47 -07:00
iotamudelta	75651d5b58	improve use of ROCm libraries, enable more tests, small fixes (#10406 ) Summary: * some small leftovers from the last PR review * enable more unit test sets for CI * replace use of hcRNG w/ rocRAND (docker image was already updated w/ newer rocRAND) * use rocBLAS instead of hipBLAS to allow convergence w/ Caffe2 * use strided_batched gemm interface also from the batched internal interface * re-enable Dropout.cu as we now have philox w/ rocRAND Pull Request resolved: https://github.com/pytorch/pytorch/pull/10406 Reviewed By: Jorghi12 Differential Revision: D9277093 Pulled By: ezyang fbshipit-source-id: 7ef2f6fe4ead77e501ed7aea5c3743afe2466ca2	2018-08-13 11:39:43 -07:00
iotamudelta	a38b572de3	enable unit tests and other changes (#10266 ) Summary: This PR for the ROCm target does the following: * enable some unit tests on ROCm * fix a missing static_cast that breaks BatchNorm call on ROCm * fix BatchNorm to work on ROCm w/ ROCm warp sizes etc * improve the pyhipify script by introducing kernel scope to some transpilations and other improvements * fix a linking issue on ROCm * for more unit test sets: mark currently broken tests broken (to be fixed) * enable THINLTO (phase one) to parallelize linking * address the first failing of the elementwise kernel by removing non-working ROCm specialization Pull Request resolved: https://github.com/pytorch/pytorch/pull/10266 Differential Revision: D9184178 Pulled By: ezyang fbshipit-source-id: 03bcd1fe4ca4dd3241f09634dbd42b6a4c350297	2018-08-06 14:54:01 -07:00
Pieter Noordhuis	695d40efc2	Create initial Python bindings for c10d (#8119 ) * Build and install c10d from tools/build_pytorch_libs.sh * Create initial Python bindings for c10d * clang-format * Switch link order to include more symbols * Add bindings and tests for ProcessGroupGloo * Add broadcast test * Separate build flag for c10d * Explicit PIC property * Skip c10d tests if not available * Remove c10d from Windows blacklist Let it skip by itself because it won't be available anyway. * Make lint happy * Comments * Move c10d module into torch.distributed * Close tempfile such that it is deleted	2018-06-08 12:59:51 -07:00
Tongzhou Wang	85ee94b7be	Add memory leak check in CUDA tests (#7270 ) * Add memory leak check in CUDA tests * Tracking multi-GPU too * fix run_test.py not running __name__ == '__main__' content; add test for make_cuda_memory_checked_test * add a comment * skip if cuda * 1. Change the wrapper to a method in common.py:TestCase 2. Refactor common constants/method that initialize CUDA context into common_cuda.py 3. Update some test files to use TEST_CUDA and TEST_MULTIGPU * Fix MaxUnpool3d forward memory leak * Fix MultiLabelMarginCriterion forward memory leak * Fix MultiMarginLoss backward memory leak * default doCUDAMemoryCheck to False * make the wrapper skip-able * use TEST_MULTIGPU * add align_corners=True/False tests for Upsample; fix TEST_CUDNN * finalize interface * VolumetricMaxUnpooling_updateOutput * fix test_nccl * rename THC caching allocator methods to be clearer * make the wrapped function a method * address comments; revert changes to aten/src/THC/THCCachingAllocator.cpp * fix renamed var	2018-05-31 15:09:54 -04:00
Francisco Massa	b240cc9b87	Add support for dotted names in CPP Extensions (#6986 ) * Add support for dotted names in CPP Extensions * Modify tests for cpp extensions Test that dotted names work * Py2 fixes * Make run_test cpp_extensions Win-compatible	2018-04-29 18:10:03 +02:00
Simeon Monov	dc94182db0	Check for --noprefix option for mpiexec in run_test.py (#6690 ) * Check for --noprefix option for mpiexec --noprefix option to mpiexec is not part of the MPI standard. It is needed in certain configurations when using OpenMPI but not supported with other MPI implementations such as MPICH and maybe others. This commit adds a check if the option is supported by the current mpiexec. Also this commit fixes Issue #4965 and MPI tests can be enabled in the CI. Fixes: #4965 * Update run_test.py	2018-04-17 23:34:33 -04:00
xhzhao	f2c9975378	Add DistributedDataParallelCPU (#5919 )	2018-04-17 15:36:47 +02:00
Simeon Monov	24b4931462	Improve run_test.py to support running individual test classes and methods (#6344 ) * Improve run_test.py to support running individual test classes and methods Added support in run_test.py for running individual test classes and methods. The -i/--include option can specify a list of test modules, classes or methods like this: python run_test.py -i autograd torch.TestTorch.test_abs \ torch.TestTorch.test_add utils.TestBottleneck -f, -l and -x behaviour stays the same as before * Fixed some code formatting * Multiple fixes according to the reviews in #6344	2018-04-16 14:33:50 -04:00
peterjc123	d45f3d0d5c	Skip cpp_extensions test when possible on Windows (#6423 )	2018-04-12 12:12:39 +02:00
Peter Goldsborough	6f10978e7b	Skip C++ extensions test when ninja is not available (#6480 )	2018-04-10 14:50:24 -07:00
Peter Goldsborough	c3f7e5ff55	Install signal handler for SIGCHLD in run_test.py (#6436 ) Handle exit signal in run_test.py	2018-04-10 11:31:23 -07:00
peterjc123	63af898d46	Fix extension test on Windows (#5548 ) * Change cpp_extensions.py to make it work on Windows * Fix linting * Show python paths * Debug * Debug 1 * set PYTHONPATH * Add ATen into library * expose essential libs and functions, and copy _C.lib * Specify dir in header * Update check_abi for MSVC * Activate cl environment to compile cpp extensions * change version string * Redirect stderr to stdout * Add monkey patch for windows * Remove unnecessary self * Fix various issues * Append necessary flags * add /MD flag to cuda * Install ninja * Use THP_API instead of THP_CLASS * Beautify the paths * Revert "Use THP_API instead of THP_CLASS" This reverts commit dd7e74c44db48e4c5f85bb8e3c698ff9de71ba2d. * Use THP_API instead of THP_CLASS(new)	2018-04-02 13:53:25 -04:00
Edward Z. Yang	2ad972c9eb	A complete revamp of our test scripts. (#5904 ) - All of the scripts are based off of the idea that they should be as simple as possible, and all the heavy lifting done in the construction of the Docker file. The scripts are really simple now. A bigger philosophical discussion can be found in .jenkins/README.md - build-asan.sh is split out of build.sh, as ASAN builds are a bit specialized and it's inappropriate to run many of the other builds as part of them. - We now build and run with mkl/mkl-include on the CPU only builds - We now report sccache and ccache stats at the end of all builds. - run_test.py flushes stdout/stderr before making a subprocess call, which should solve our interleaving problems. Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2018-03-22 16:31:50 -04:00
Peter Goldsborough	4613eef69e	Simplify run_test.py and dont use shell=True (#5767 ) * Simplify run_test.py and dont use shell=True * Fix non-shell output for check_output and always print to stderr * Use shlex.split instead of str.split * s/log/print_to_stderr * with_init -> with_init_file * Remove bufsize argument	2018-03-15 01:12:51 -04:00
Edward Z. Yang	3f3b686056	Refactor run_test.py to pass all options, not just verbose. (#5760 ) I need this because run_test is going to need to read other options than just verbose when I implement JUnit XML dumping. (JUnit XML dumping cannot be implemented solely by frobbing --python because the XML file to dump to must vary based on the test name.) Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2018-03-14 07:44:58 -04:00
Edward Z. Yang	cadeb0cb17	Revert "ATen ReduceOps (#5481 )" (#5765 ) * Revert "ATen ReduceOps (#5481)" This reverts commit `310c3735b9`. * Revert "Check that new cpuinfo and tbb submodules exist (#5714)" This reverts commit `1a23c9901d`.	2018-03-13 23:50:16 -04:00
Peter Goldsborough	16fa12214d	raise RuntimeError on test failure (#5754 )	2018-03-13 18:53:43 -04:00
cpuhrsch	310c3735b9	ATen ReduceOps (#5481 ) This diff adds vectorization to ATen. It uses Intel intrinsics to build a general vec256 class that represents types of 256-bit width. These can then be treated like regular variables. Using those it implements torch.sum() for the contiguous case. It uses Intel TBB for multithreading, which allows workstealing and chunks the reduction operations based on an experimentally chosen value (_THRESHOLD). It uses cpuinfo to pick the right code depending on the host's capabilities. The kernels are implemented under native/cpu. Each .cpp file is compiled with -avx, -avx2 and no additional flags. A macro is used to append AVX, AVX2 or NONE to the function name. The header then needs to define the functions three times, one for each capability. This could be improved by either changing the cmake file a bit or possibly generating source code using a Python script etc. For the non-contiguous case this defaults to the current implementation within TH. For CUDA it entirely defaults to the implementation within THC. There probably needs to be a bit of a debate around the design decisions here, the additional dependencies, parallelization strategy, clarity, etc. The numerical results also diverge from numpy with larger tensors, which is expected since we're summing, for example, 8 numbers and then adding the result to the running sum, instead of each number one by one. But there might be something to be said about accumulating into a double for floats or the degree of divergence, the behavior with respect to CUDA, etc. I wrote a [small Python script](https://github.com/cpuhrsch/benchmark/blob/sumall/benchmarks/sum_bench.py) to compare the results with numpy numerically as well as on timing. I ran this script to create timings both on master and this branch. Here is the command for 1 core `OMP_NUM_THREAD=1 taskset -c 0 python sum_bench.py --enable_numpy 200` Here is the command for all cores `python sum_bench.py --enable_numpy 200` Here are the results of each: [Master, 1 core](https://paste.fedoraproject.org/paste/Nho9JzHpPVK9av8a6mByjQ) [This branch, 1 core](https://paste.fedoraproject.org/paste/6xLHkYvcVJx9z~5MoHxN4w) [Master, all cores](https://paste.fedoraproject.org/paste/5l3V1d5zGqvJcMXIUteMRw) [This branch, all cores](https://paste.fedoraproject.org/paste/J4RuDU-0Drz0aZwtphQwEA) To test, the command is `python sum_bench.py --test 200` [This branch, test results](https://paste.fedoraproject.org/paste/kTEoUC~oWgXA6XWMAfNfNw) For this test we look at the average absolute value of the differences. This does not take into account the relative magnitude of the numbers. The numbers are sampled from a standard normal distribution. In terms of performance this diff should bring PyTorch on par with Numpy and usually exceed it by 1.5 to 2x.	2018-03-12 15:19:12 -04:00
Peter Goldsborough	6404904d8a	Fix run_test.py (#5693 )	2018-03-10 19:16:40 -05:00
Peter Goldsborough	53876c4606	Rewrite run_test.sh in Python (#5615 )	2018-03-09 22:02:02 +01:00
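
Commit 6f2307ba6a above describes a `no_python_abi_suffix` option that throws away the ABI tag setuptools adds to an extension's file name (`my_ops.cpython-37m-x86_64-linux-gnu.so` becoming `my_ops.so`). The string handling involved can be sketched roughly as follows. This is a minimal illustration, assuming the tag is always the second-to-last dot-separated component; `strip_abi_suffix` is a hypothetical helper name, not the actual `BuildExtension.get_ext_filename` override.

```python
def strip_abi_suffix(ext_filename: str) -> str:
    """Drop the ABI tag from a setuptools-style extension file name.

    Assumption: the name looks like <module>.<abi-tag>.<ext>, so the tag is
    the second-to-last dot-separated part. Names with fewer than three
    parts are returned unchanged.
    """
    parts = ext_filename.split(".")
    if len(parts) < 3:
        return ext_filename
    # Keep everything before the tag, plus the final file extension.
    return ".".join(parts[:-2] + parts[-1:])


# Example from the commit message: the ABI-tagged name versus the plain one.
print(strip_abi_suffix("my_ops.cpython-37m-x86_64-linux-gnu.so"))
```

The sketch cannot tell an ABI tag from any other dotted component, so it should only be applied to names that are known to carry the tag.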

717 Commits