pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-08 07:39:33 +01:00

Author	SHA1	Message	Date
Catherine Lee	49e10c1598	[ci] test_ops in parallel, ci tests log to file (#85528 ) part one of splitting up https://github.com/pytorch/pytorch/pull/84961 into (probably 2) parts contains * logging to file * testing test_ops in parallel Pull Request resolved: https://github.com/pytorch/pytorch/pull/85528 Approved by: https://github.com/huydhn	2022-09-23 20:45:20 +00:00
Nikita Shulga	46a6a50f4e	Skip MPS test from generic M1 testsuite (#85500 ) As there is separate Run MPS shard right now See if this reduces the number of crashes Pull Request resolved: https://github.com/pytorch/pytorch/pull/85500 Approved by: https://github.com/clee2000, https://github.com/kit1980, https://github.com/huydhn	2022-09-22 22:13:47 +00:00
PyTorch MergeBot	3dce26635f	Revert "test in parallel at file granularity (#84961 )" This reverts commit `8107666c6a`. Reverted https://github.com/pytorch/pytorch/pull/84961 on behalf of https://github.com/clee2000 due to makes test_forward_ad_nn_functional_max_unpool2d_cuda_float32 flakily unexpectedly pass	2022-09-21 20:21:25 +00:00
Catherine Lee	8107666c6a	test in parallel at file granularity (#84961 ) run tests in parallel at the test file granularity runs 3 files in parallel using multiprocessing pool, output goes to a file, which is then printed when the test finishes. Some tests cannot be run in parallel (usually due to lacking memory), so we run those after. Sharding is changed to attempt to mask large files with other large files/run them on the same shard. test_ops* gets a custom handler to run it because it is simply too big (2hrs on windows) and linalg_cholesky fails (I would really like a solution to this if possible, but until then we use the custom handler). reduces cuda tests by a lot, reduces total windows test time by ~1hr Ref. https://github.com/pytorch/pytorch/issues/82894 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84961 Approved by: https://github.com/huydhn	2022-09-21 16:58:11 +00:00
Arindam Roy	a185dc2e63	[ROCm] re-enable tensorexpr and test_openmp (#81367 ) The following tests are being re-enabled for ROCm: - test_openmp.py - TestTensorExprPyBind tests in test_tensorexpr_pybind.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/81367 Approved by: https://github.com/jeffdaily, https://github.com/malfet	2022-09-14 00:41:16 +00:00
Xiang Gao	08c4f8c7a7	ProcessGroupUCC tests (#83285 ) - [x] Direct dependency on UCX is completely removed, UCC active set API always enabled - [x] Remove `TORCH_UCC_PROFILING_ENABLE`, always enable profiling - [x] Fixes profiling of `recv` and `all_gather` - [x] Use the NCCL TL of UCC on CUDA, as the UCP TL is not well supported on CUDA Most tests are passing, but there are a few skipped tests: - `scatter` and `gather` are not supported by the UCP TL of UCC on CPU tensors - A few flaky tests in PyTorch's CI environment - Profiler-related failures, some of them will be fixed by @Fuzzkatt in https://github.com/pytorch/pytorch/pull/84368 After this PR is merged, I will continue to work on these skipped failures. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83285 Approved by: https://github.com/vtlam, https://github.com/malfet, https://github.com/kwen2501	2022-09-10 10:56:05 +00:00
Huy Do	90d6112a94	Test distributed backends in parallel (#84034 ) This allows multiple backends (nccl, gloo) to be tested in parallel and speed up the process. The improvement is mainly in the 1st distributed CUDA shard where the long pole `distributed/test_distributed_spawn` test is executed: * [linux-bionic-cuda11.6-py3.10-gcc7 / test (distributed, 1, 2, linux.8xlarge.nvidia.gpu)](https://github.com/pytorch/pytorch/runs/8007596825?check_suite_focus=true#logs) takes 1h24m. This is better than the current average expectation of 2h12m On the other hand, there is no improvement for the following two jobs: * [linux-focal-py3.7-gcc7 / test (distributed, 1, 1, linux.2xlarge)](https://github.com/pytorch/pytorch/runs/8007417353?check_suite_focus=true#logs) takes 1h47m * [linux-bionic-cuda11.6-py3.10-gcc7 / test (distributed, 2, 2, linux.8xlarge.nvidia.gpu)](https://github.com/pytorch/pytorch/runs/8007596870?check_suite_focus=true#logs) takes 1h40m This is still a gain though because it allows us to add more shards for distributed test if needed. Issue https://github.com/pytorch/pytorch/issues/83694 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84034 Approved by: https://github.com/wanchaol	2022-09-01 03:48:54 +00:00
PyTorch MergeBot	772721a4b7	Revert "Test distributed backends in parallel (#84034 )" This reverts commit `3ae5be74ac`. Reverted https://github.com/pytorch/pytorch/pull/84034 on behalf of https://github.com/huydhn due to This somehow revives the flaky test https://github.com/pytorch/pytorch/issues/76428	2022-08-30 21:01:25 +00:00
Huy Do	3ae5be74ac	Test distributed backends in parallel (#84034 ) This allows multiple backends (nccl, gloo) to be tested in parallel and speed up the process. The improvement is mainly in the 1st distributed CUDA shard where the long pole `distributed/test_distributed_spawn` test is executed: * [linux-bionic-cuda11.6-py3.10-gcc7 / test (distributed, 1, 2, linux.8xlarge.nvidia.gpu)](https://github.com/pytorch/pytorch/runs/8007596825?check_suite_focus=true#logs) takes 1h24m. This is better than the current average expectation of 2h12m On the other hand, there is no improvement for the following two jobs: * [linux-focal-py3.7-gcc7 / test (distributed, 1, 1, linux.2xlarge)](https://github.com/pytorch/pytorch/runs/8007417353?check_suite_focus=true#logs) takes 1h47m * [linux-bionic-cuda11.6-py3.10-gcc7 / test (distributed, 2, 2, linux.8xlarge.nvidia.gpu)](https://github.com/pytorch/pytorch/runs/8007596870?check_suite_focus=true#logs) takes 1h40m This is still a gain though because it allows us to add more shards for distributed test if needed. Issue https://github.com/pytorch/pytorch/issues/83694 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84034 Approved by: https://github.com/wanchaol	2022-08-30 19:06:49 +00:00
joncrall	b136f3f310	More doctest refinements. (#83317 ) Follow up to #82797 Now that the doctests themselves are in a better state, we should be able to enable xdoctest on the CI so they stay that way. @ezyang @vadimkantorov Pull Request resolved: https://github.com/pytorch/pytorch/pull/83317 Approved by: https://github.com/ezyang	2022-08-22 20:07:26 +00:00
joncrall	4618371da5	Integrate xdoctest - Rebased (#82797 ) This is a new version of #15648 based on the latest master branch. Unlike the previous PR where I fixed a lot of the doctests in addition to integrating xdoctest, I'm going to reduce the scope here. I'm simply going to integrate xdoctest, and then I'm going to mark all of the failing tests as "SKIP". This will let xdoctest run on the dashboards, provide some value, and still let the dashboards pass. I'll leave fixing the doctests themselves to another PR. In my initial commit, I do the bare minimum to get something running with failing dashboards. The few tests that I marked as skip are causing segfaults. Running xdoctest results in 293 failed, 201 passed tests. The next commits will be to disable those tests. (unfortunately I don't have a tool that will insert the `#xdoctest: +SKIP` directive over every failing test, so I'm going to do this mostly manually.) Fixes https://github.com/pytorch/pytorch/issues/71105 @ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/82797 Approved by: https://github.com/ezyang	2022-08-12 02:08:01 +00:00
Mateusz Sypniewski	916def84d4	CUDA trace Python hooks (#82824 ) ### Description This adds Python hooks into PyTorch that allow the user to register their own callbacks for events such as tensor allocation, stream allocation, event record / wait etc. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82824 Approved by: https://github.com/lw, https://github.com/ezyang, https://github.com/malfet	2022-08-11 10:21:40 +00:00
pbialecki	b4f7e22640	Enable periodic builds for CUDA 11.7 (#81688 ) CC @atalman Pull Request resolved: https://github.com/pytorch/pytorch/pull/81688 Approved by: https://github.com/atalman	2022-08-10 00:03:51 +00:00
Nikita Shulga	b9cdd6d0ac	Do not assume what is in `os.environ` (#82495 ) `os.environ['FOO']` will raise `KeyError` if `FOO` is not found, while `os.environ.get('FOO')` would simply return `None` Fixes https://github.com/pytorch/pytorch/issues/82492 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82495 Approved by: https://github.com/huydhn, https://github.com/kit1980	2022-07-29 22:57:18 +00:00
Catherine Lee	86f038dd56	download test times during build to avoid race conditions (#81915 ) After https://github.com/pytorch/pytorch/pull/81116, we started pulling test times straight from the source instead of first downloading them in the build job and then having the test job take the build job's version. This can cause issues where different shards pull different versions of the file, leading to incorrect sharding (e.g. two shards running the same test file by accident). This generally happens if the test jobs run while the test times file is being updated (unlikely, but not impossible) or if someone reruns a test job the next day. In this PR, I return to the old method of downloading the test times file during the build job and having the test jobs pull from the build job's uploaded artifacts. If there is no test times file in the build job's artifacts, we fall back to the default sharding plan. Notes: * script moved to a new file to avoid needing to import torch, which would require torch to be built, which can cause issues with asan * I got errors with asan (`ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.`), so I put the script at the beginning of the build ### Test Plan Verified that the number of tests run in the pull and trunk workflows is similar to workflows run on master. Checked logs to see if artifacts were being used for sharding. Spot checked a few test configs to check that their lists of selected tests didn't overlap. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81915 Approved by: https://github.com/huydhn	2022-07-28 16:35:01 +00:00
Richard Zou	5f4e8c0a4d	Add ability to run functorch tests via run_test.py (#82012 ) This PR: - adds the ability to run functorch tests via run_test.py - changes the functorch shards in PyTorch CI to invoke functorch tests via run_test.py The main motivation for this is so that functorch tests hook into the standard PyTorch test infrastructure. Questions for reviewers: - the functorch tests are located outside of the pytorch/test folder (they're in the pytorch/functorch/test folder). Is this OK? (run_test.py works locally for me). Test Plan: - checked that `python run_test.py --functorch` ran functorch tests locally - Local mock test: added `{"test_compilation_for_dynamic_shape (__main__.TestCompileCache)": ["https://github.com/pytorch/pytorch/issues/82016", ["linux"]]}` to .pytorch-disabled-tests.json, ran functorch tests, verified that the test was skipped. - Wait for CI Pull Request resolved: https://github.com/pytorch/pytorch/pull/82012 Approved by: https://github.com/janeyx99	2022-07-25 14:23:18 +00:00
Michael Suo	9f58d5d7ce	[test stats] use published test stats for sharding (#81116 ) Use the nightly-published test stats to perform sharding, instead of calculating it in every build job. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81116 Approved by: https://github.com/janeyx99	2022-07-12 04:50:19 +00:00
Brian Hirsh	282de5539d	add open device registration test with cpp extensions (#80477 ) Adding a test for open device registration using cpp extensions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80477 Approved by: https://github.com/albanD, https://github.com/malfet	2022-07-12 01:46:16 +00:00
Charlie Yan	ffae7308c9	Enable test: distributed/algorithms/quantization/test_quantization (#80097 ) fixes https://github.com/pytorch/pytorch/issues/69017 Pull Request resolved: https://github.com/pytorch/pytorch/pull/80097 Approved by: https://github.com/wanchaol	2022-07-01 01:32:33 +00:00
Charlie Yan	14eadf937b	Enable test: test/distributed/algorithms/ddp_comm_hooks/test_ddp_hooks.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/80077 Approved by: https://github.com/wanchaol	2022-06-23 00:11:53 +00:00
Alex Hedges	cb2b7b1e57	Fix code that triggers BytesWarning (#79868 ) Fixes #74812. I have fixed the multiple instances in the repository that trigger `BytesWarning`, and I have enabled the `-bb` option when tests are run to prevent regressions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79868 Approved by: https://github.com/janeyx99	2022-06-21 01:12:21 +00:00
PyTorch MergeBot	e10cbe3880	Revert "Fix BytesWarning in torch.load() (#74813 )" This reverts commit `6c2e8119dd`. Reverted https://github.com/pytorch/pytorch/pull/74813 on behalf of https://github.com/janeyx99 due to Broke slow tests in cuda 10.2 https://github.com/pytorch/pytorch/runs/6944238177?check_suite_focus=true	2022-06-18 03:53:54 +00:00
Alex Hedges	6c2e8119dd	Fix BytesWarning in torch.load() (#74813 ) Fixes #74812. I have enabled the `-bb` option when tests are run to prevent regressions. I don't think it will make CI run more slowly, but I'm not entirely sure. Pull Request resolved: https://github.com/pytorch/pytorch/pull/74813 Approved by: https://github.com/kit1980	2022-06-17 22:56:43 +00:00
Michael Suo	842da8a5de	[ci] remove TD + test specification code from run_test.py In the case of target determination, this is just removing comments that refer to non-existent code. In the case of the test specification code; this removes (what I believe to be) an unused feature. If we're using this somehow let me know and I can revise the PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79372 Approved by: https://github.com/janeyx99	2022-06-13 16:09:53 +00:00
Michael Suo	943c09a53e	[ci] clean up dead code related to PR test selection This is never used and not tested, so removing it for clarity. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79363 Approved by: https://github.com/janeyx99	2022-06-13 16:09:51 +00:00
Michael Suo	c978b609f7	[ci] remove IN_CI env var The conventional env var to set is CI. Both circle and GHA set it, so IN_CI is unnecessary Pull Request resolved: https://github.com/pytorch/pytorch/pull/79229 Approved by: https://github.com/janeyx99	2022-06-11 17:16:30 +00:00
Jagadish Krishnamoorthy	2d354cdc2a	[ROCm] Enable test_instantiator, test_type_hints (#78633 ) Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/78633 Approved by: https://github.com/malfet, https://github.com/pruthvistony	2022-06-06 06:09:34 +00:00
Xiao Wang	ef0332e36d	Allow relocatable device code linking in pytorch CUDA extensions (#78225 ) Close https://github.com/pytorch/pytorch/issues/57543 Doc: check `Relocatable device code linking:` in https://docs-preview.pytorch.org/78225/cpp_extension.html#torch.utils.cpp_extension.CUDAExtension Pull Request resolved: https://github.com/pytorch/pytorch/pull/78225 Approved by: https://github.com/ezyang, https://github.com/malfet	2022-06-02 21:35:56 +00:00
Kurt Mohler	1705be8ff7	Fix `_free_weak_ref` error (#78575 ) Fixes #74016 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78575 Approved by: https://github.com/ezyang	2022-06-01 00:07:48 +00:00
pritam	37eb31599c	[reland] Add sharding tests to multigpu-test.sh and fix custom operator decorator (#77987 ) 1. Enabled multigpu tests. 2. Fixed failing multigpu tests. 3. Fixed custom operator decorator to be first preference in operator dispatch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77987 Approved by: https://github.com/fduwjj, https://github.com/wanchaol, https://github.com/janeyx99	2022-05-21 22:33:58 +00:00
PyTorch MergeBot	0f74b44f1a	Revert "Add sharding tests to multigpu-test.sh and fix custom operator decorator (#77825 )" This reverts commit `8d4c8df33a`. Reverted https://github.com/pytorch/pytorch/pull/77825 on behalf of https://github.com/janeyx99 due to as it will break multigpu test reporting	2022-05-20 17:59:03 +00:00
pritam	8d4c8df33a	Add sharding tests to multigpu-test.sh and fix custom operator decorator (#77825 ) 1. Enabled multigpu tests. 2. Fixed failing multigpu tests. 3. Fixed custom operator decorator to be first preference in operator dispatch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77825 Approved by: https://github.com/wanchaol, https://github.com/fduwjj	2022-05-20 16:53:27 +00:00
PyTorch MergeBot	5e0f559d23	Revert "Add sharding tests to multigpu-test.sh (#77708 )" This reverts commit `a7cf95a609`. Reverted https://github.com/pytorch/pytorch/pull/77708 on behalf of https://github.com/suo	2022-05-18 21:47:11 +00:00
pritam	a7cf95a609	Add sharding tests to multigpu-test.sh (#77708 ) Summary: These tests were being skipped since they don't run on multigpu jobs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77708 Approved by: https://github.com/wanchaol	2022-05-18 17:37:55 +00:00
Wanchao Liang	25fa964d96	[shard] add clone/detach and set requires_grad for ShardedTensor This PR adds clone/detach and set requires_grad to ShardedTensor Pull Request resolved: https://github.com/pytorch/pytorch/pull/77367 Approved by: https://github.com/pritamdamania87	2022-05-16 21:42:27 +00:00
pritam	9e52b50e34	Additional ops for ShardedTensor, ReplicatedTensor and PartialTensor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76477 Adding the following ops: 1) softmax for ShardedTensor 2) getitem and unsqueeze for ReplicatedTensor 3) transpose and cat for PartialTensor Differential Revision: [D35979510](https://our.internmc.facebook.com/intern/diff/D35979510/) Approved by: https://github.com/fduwjj, https://github.com/wanchaol	2022-05-06 16:28:04 +00:00
Catherine Lee	56ea57de61	shard `pull / linux-xenial-cuda11.3-py3.7-gcc7 / test (distributed` 1->2 Fixes #ISSUE_NUMBER shard `pull / linux-xenial-cuda11.3-py3.7-gcc7 / test (distributed ...` from 1 shard to 2 Pros: - It currently takes about 2.6 hours and is 3rd longest running job on pull - Theoretically minimal overhead Cons: - Requires changes to the run_test.py which might have correctness issues Notes: - Cannot shard further as one of the test files is responsible for about half of the total run time spreadsheet regarding sharding: https://docs.google.com/spreadsheets/d/1BdtVsjRr0Is9LXMNilR02FEdPXNq7zEWl8AmR3ArsLQ/edit#gid=1153012347 Test Plan: <details><summary>expand to see test plan (its long)</summary> tests from a commit ran on master (90 tests ran) ``` 2022-05-03T12:45:34.7974184Z Selected tests: 2022-05-03T12:45:34.7974495Z distributed/_shard/sharded_optim/test_sharded_optim 2022-05-03T12:45:34.7974839Z distributed/_shard/sharded_tensor/ops/test_binary_cmp 2022-05-03T12:45:34.7975209Z distributed/_shard/sharded_tensor/ops/test_elementwise_ops 2022-05-03T12:45:34.7975575Z distributed/_shard/sharded_tensor/ops/test_embedding 2022-05-03T12:45:34.7976180Z distributed/_shard/sharded_tensor/ops/test_embedding_bag 2022-05-03T12:45:34.7976802Z distributed/_shard/sharded_tensor/ops/test_init 2022-05-03T12:45:34.7977361Z distributed/_shard/sharded_tensor/ops/test_linear 2022-05-03T12:45:34.7978157Z distributed/_shard/sharded_tensor/ops/test_math_ops 2022-05-03T12:45:34.7978879Z distributed/_shard/sharded_tensor/test_megatron_prototype 2022-05-03T12:45:34.7979594Z distributed/_shard/sharded_tensor/test_sharded_tensor 2022-05-03T12:45:34.7980366Z distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 2022-05-03T12:45:34.7981066Z distributed/_shard/sharding_plan/test_sharding_plan 2022-05-03T12:45:34.7981877Z distributed/_shard/sharding_spec/test_sharding_spec 2022-05-03T12:45:34.7982387Z distributed/_shard/test_partial_tensor 2022-05-03T12:45:34.7982691Z 
distributed/_shard/test_replicated_tensor 2022-05-03T12:45:34.7982994Z distributed/_shard/test_sharder 2022-05-03T12:45:34.7983280Z distributed/algorithms/test_join 2022-05-03T12:45:34.7983695Z distributed/elastic/events/lib_test 2022-05-03T12:45:34.7983984Z distributed/elastic/metrics/api_test 2022-05-03T12:45:34.7984308Z distributed/elastic/multiprocessing/api_test 2022-05-03T12:45:34.7984624Z distributed/elastic/timer/api_test 2022-05-03T12:45:34.7984924Z distributed/elastic/timer/local_timer_example 2022-05-03T12:45:34.7985254Z distributed/elastic/timer/local_timer_test 2022-05-03T12:45:34.7985575Z distributed/elastic/utils/distributed_test 2022-05-03T12:45:34.7985889Z distributed/elastic/utils/logging_test 2022-05-03T12:45:34.7986176Z distributed/elastic/utils/util_test 2022-05-03T12:45:34.7986492Z distributed/fsdp/test_flatten_params_wrapper 2022-05-03T12:45:34.7986799Z distributed/fsdp/test_fsdp_apply 2022-05-03T12:45:34.7987078Z distributed/fsdp/test_fsdp_checkpoint 2022-05-03T12:45:34.7987388Z distributed/fsdp/test_fsdp_clip_grad_norm 2022-05-03T12:45:34.7987691Z distributed/fsdp/test_fsdp_comm 2022-05-03T12:45:34.7987961Z distributed/fsdp/test_fsdp_core 2022-05-03T12:45:34.7988251Z distributed/fsdp/test_fsdp_exec_order 2022-05-03T12:45:34.7988570Z distributed/fsdp/test_fsdp_freezing_weights 2022-05-03T12:45:34.7988865Z distributed/fsdp/test_fsdp_grad_acc 2022-05-03T12:45:34.7989176Z distributed/fsdp/test_fsdp_ignored_modules 2022-05-03T12:45:34.7989478Z distributed/fsdp/test_fsdp_input 2022-05-03T12:45:34.7989950Z distributed/fsdp/test_fsdp_memory 2022-05-03T12:45:34.7990241Z distributed/fsdp/test_fsdp_meta 2022-05-03T12:45:34.7990640Z distributed/fsdp/test_fsdp_mixed_precision 2022-05-03T12:45:34.7990964Z distributed/fsdp/test_fsdp_multiple_forward 2022-05-03T12:45:34.7991293Z distributed/fsdp/test_fsdp_multiple_wrapping 2022-05-03T12:45:34.7991610Z distributed/fsdp/test_fsdp_optim_state 2022-05-03T12:45:34.7991895Z distributed/fsdp/test_fsdp_overlap 
2022-05-03T12:45:34.7992195Z distributed/fsdp/test_fsdp_pure_fp16 2022-05-03T12:45:34.7992500Z distributed/fsdp/test_fsdp_state_dict 2022-05-03T12:45:34.7992818Z distributed/fsdp/test_fsdp_summon_full_params 2022-05-03T12:45:34.7993117Z distributed/fsdp/test_fsdp_traversal 2022-05-03T12:45:34.7993861Z distributed/fsdp/test_fsdp_uneven 2022-05-03T12:45:34.7994181Z distributed/fsdp/test_shard_utils 2022-05-03T12:45:34.7994447Z distributed/fsdp/test_utils 2022-05-03T12:45:34.7994721Z distributed/fsdp/test_wrap 2022-05-03T12:45:34.7995015Z distributed/nn/jit/test_instantiator 2022-05-03T12:45:34.7995328Z distributed/optim/test_zero_redundancy_optimizer 2022-05-03T12:45:34.7995664Z distributed/pipeline/sync/skip/test_api 2022-05-03T12:45:34.7995983Z distributed/pipeline/sync/skip/test_gpipe 2022-05-03T12:45:34.7996315Z distributed/pipeline/sync/skip/test_inspect_skip_layout 2022-05-03T12:45:34.7996652Z distributed/pipeline/sync/skip/test_leak 2022-05-03T12:45:34.7996977Z distributed/pipeline/sync/skip/test_portal 2022-05-03T12:45:34.7997292Z distributed/pipeline/sync/skip/test_stash_pop 2022-05-03T12:45:34.7997623Z distributed/pipeline/sync/skip/test_tracker 2022-05-03T12:45:34.7997968Z distributed/pipeline/sync/skip/test_verify_skippables 2022-05-03T12:45:34.7998301Z distributed/pipeline/sync/test_balance 2022-05-03T12:45:34.7998591Z distributed/pipeline/sync/test_bugs 2022-05-03T12:45:34.7998927Z distributed/pipeline/sync/test_checkpoint 2022-05-03T12:45:34.7999243Z distributed/pipeline/sync/test_copy 2022-05-03T12:45:34.7999557Z distributed/pipeline/sync/test_deferred_batch_norm 2022-05-03T12:45:34.7999896Z distributed/pipeline/sync/test_dependency 2022-05-03T12:45:34.8000215Z distributed/pipeline/sync/test_inplace 2022-05-03T12:45:34.8000516Z distributed/pipeline/sync/test_microbatch 2022-05-03T12:45:34.8000826Z distributed/pipeline/sync/test_phony 2022-05-03T12:45:34.8001130Z distributed/pipeline/sync/test_pipe 2022-05-03T12:45:34.8001424Z 
distributed/pipeline/sync/test_pipeline 2022-05-03T12:45:34.8001733Z distributed/pipeline/sync/test_stream 2022-05-03T12:45:34.8002055Z distributed/pipeline/sync/test_transparency 2022-05-03T12:45:34.8002353Z distributed/pipeline/sync/test_worker 2022-05-03T12:45:34.8002672Z distributed/rpc/cuda/test_tensorpipe_agent 2022-05-03T12:45:34.8002982Z distributed/rpc/test_faulty_agent 2022-05-03T12:45:34.8003270Z distributed/rpc/test_tensorpipe_agent 2022-05-03T12:45:34.8003568Z distributed/test_c10d_common 2022-05-03T12:45:34.8003839Z distributed/test_c10d_gloo 2022-05-03T12:45:34.8004088Z distributed/test_c10d_nccl 2022-05-03T12:45:34.8004369Z distributed/test_c10d_spawn_gloo 2022-05-03T12:45:34.8004656Z distributed/test_c10d_spawn_nccl 2022-05-03T12:45:34.8004938Z distributed/test_data_parallel 2022-05-03T12:45:34.8005212Z distributed/test_distributed_spawn 2022-05-03T12:45:34.8005496Z distributed/test_launcher 2022-05-03T12:45:34.8005767Z distributed/test_nccl 2022-05-03T12:45:34.8006019Z distributed/test_pg_wrapper 2022-05-03T12:45:34.8006285Z distributed/test_store ``` tests ran on first shard for distributed on this PR (34 tests) ``` 2022-05-02T21:26:00.1385256Z Selected tests: 2022-05-02T21:26:00.1385767Z distributed/test_distributed_spawn 2022-05-02T21:26:00.1386403Z distributed/elastic/multiprocessing/api_test 2022-05-02T21:26:00.1387051Z distributed/fsdp/test_fsdp_memory 2022-05-02T21:26:00.1387607Z distributed/fsdp/test_fsdp_ignored_modules 2022-05-02T21:26:00.1388179Z distributed/fsdp/test_fsdp_apply 2022-05-02T21:26:00.1388600Z distributed/_shard/sharded_tensor/ops/test_binary_cmp 2022-05-02T21:26:00.1389181Z distributed/_shard/sharding_spec/test_sharding_spec 2022-05-02T21:26:00.1389545Z distributed/_shard/sharded_tensor/ops/test_linear 2022-05-02T21:26:00.1389878Z distributed/fsdp/test_fsdp_uneven 2022-05-02T21:26:00.1390186Z distributed/fsdp/test_fsdp_multiple_wrapping 2022-05-02T21:26:00.1390526Z distributed/fsdp/test_fsdp_multiple_forward 
2022-05-02T21:26:00.1390877Z distributed/_shard/sharded_tensor/ops/test_embedding 2022-05-02T21:26:00.1391219Z distributed/_shard/test_partial_tensor 2022-05-02T21:26:00.1391542Z distributed/_shard/sharded_optim/test_sharded_optim 2022-05-02T21:26:00.1391915Z distributed/_shard/sharded_tensor/ops/test_elementwise_ops 2022-05-02T21:26:00.1392297Z distributed/fsdp/test_flatten_params_wrapper 2022-05-02T21:26:00.1392585Z distributed/fsdp/test_utils 2022-05-02T21:26:00.1392883Z distributed/nn/jit/test_instantiator 2022-05-02T21:26:00.1393167Z distributed/test_nccl 2022-05-02T21:26:00.1393466Z distributed/_shard/sharding_plan/test_sharding_plan 2022-05-02T21:26:00.1393787Z distributed/_shard/test_sharder 2022-05-02T21:26:00.1394085Z distributed/elastic/timer/api_test 2022-05-02T21:26:00.1394383Z distributed/pipeline/sync/skip/test_api 2022-05-02T21:26:00.1394738Z distributed/pipeline/sync/skip/test_inspect_skip_layout 2022-05-02T21:26:00.1395090Z distributed/pipeline/sync/skip/test_portal 2022-05-02T21:26:00.1395424Z distributed/pipeline/sync/skip/test_tracker 2022-05-02T21:26:00.1395935Z distributed/pipeline/sync/test_balance 2022-05-02T21:26:00.1396288Z distributed/pipeline/sync/test_checkpoint 2022-05-02T21:26:00.1396635Z distributed/pipeline/sync/test_deferred_batch_norm 2022-05-02T21:26:00.1396953Z distributed/pipeline/sync/test_inplace 2022-05-02T21:26:00.1397269Z distributed/pipeline/sync/test_phony 2022-05-02T21:26:00.1397587Z distributed/pipeline/sync/test_pipeline 2022-05-02T21:26:00.1397903Z distributed/pipeline/sync/test_transparency 2022-05-02T21:26:00.1398221Z distributed/rpc/test_faulty_agent ``` tests ran on second shard for distributed on this PR (56 tests) ``` 2022-05-02T21:26:55.1342892Z Selected tests: 2022-05-02T21:26:55.1343201Z distributed/rpc/cuda/test_tensorpipe_agent 2022-05-02T21:26:55.1343526Z distributed/fsdp/test_fsdp_core 2022-05-02T21:26:55.1343829Z distributed/test_c10d_nccl 2022-05-02T21:26:55.1344089Z distributed/test_c10d_gloo 
2022-05-02T21:26:55.1344408Z distributed/fsdp/test_fsdp_summon_full_params 2022-05-02T21:26:55.1344749Z distributed/fsdp/test_fsdp_mixed_precision 2022-05-02T21:26:55.1345085Z distributed/optim/test_zero_redundancy_optimizer 2022-05-02T21:26:55.1345423Z distributed/fsdp/test_fsdp_optim_state 2022-05-02T21:26:55.1345773Z distributed/_shard/sharded_tensor/test_sharded_tensor 2022-05-02T21:26:55.1346088Z distributed/fsdp/test_fsdp_state_dict 2022-05-02T21:26:55.1346379Z distributed/test_store 2022-05-02T21:26:55.1346661Z distributed/test_c10d_spawn_gloo 2022-05-02T21:26:55.1346966Z distributed/test_pg_wrapper 2022-05-02T21:26:55.1347252Z distributed/test_c10d_spawn_nccl 2022-05-02T21:26:55.1347565Z distributed/fsdp/test_fsdp_clip_grad_norm 2022-05-02T21:26:55.1347871Z distributed/fsdp/test_wrap 2022-05-02T21:26:55.1348369Z distributed/fsdp/test_fsdp_grad_acc 2022-05-02T21:26:55.1348679Z distributed/algorithms/test_join 2022-05-02T21:26:55.1349004Z distributed/fsdp/test_fsdp_freezing_weights 2022-05-02T21:26:55.1349305Z distributed/fsdp/test_fsdp_comm 2022-05-02T21:26:55.1349593Z distributed/test_c10d_common 2022-05-02T21:26:55.1349885Z distributed/fsdp/test_fsdp_meta 2022-05-02T21:26:55.1350171Z distributed/fsdp/test_fsdp_exec_order 2022-05-02T21:26:55.1350486Z distributed/fsdp/test_fsdp_checkpoint 2022-05-02T21:26:55.1350798Z distributed/fsdp/test_fsdp_overlap 2022-05-02T21:26:55.1351105Z distributed/elastic/timer/local_timer_example 2022-05-02T21:26:55.1351423Z distributed/fsdp/test_fsdp_input 2022-05-02T21:26:55.1351749Z distributed/_shard/sharded_tensor/ops/test_init 2022-05-02T21:26:55.1352190Z distributed/elastic/timer/local_timer_test 2022-05-02T21:26:55.1352520Z distributed/elastic/utils/distributed_test 2022-05-02T21:26:55.1352841Z distributed/fsdp/test_fsdp_pure_fp16 2022-05-02T21:26:55.1353150Z distributed/test_data_parallel 2022-05-02T21:26:55.1353437Z distributed/fsdp/test_fsdp_traversal 2022-05-02T21:26:55.1353792Z 
distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 2022-05-02T21:26:55.1354174Z distributed/_shard/sharded_tensor/ops/test_embedding_bag 2022-05-02T21:26:55.1354534Z distributed/_shard/sharded_tensor/test_megatron_prototype 2022-05-02T21:26:55.1354858Z distributed/test_launcher 2022-05-02T21:26:55.1355149Z distributed/elastic/utils/util_test 2022-05-02T21:26:55.1355441Z distributed/elastic/utils/logging_test 2022-05-02T21:26:55.1355755Z distributed/elastic/metrics/api_test 2022-05-02T21:26:55.1356095Z distributed/_shard/sharded_tensor/ops/test_math_ops 2022-05-02T21:26:55.1356455Z distributed/_shard/test_replicated_tensor 2022-05-02T21:26:55.1356754Z distributed/elastic/events/lib_test 2022-05-02T21:26:55.1357065Z distributed/fsdp/test_shard_utils 2022-05-02T21:26:55.1357387Z distributed/pipeline/sync/skip/test_gpipe 2022-05-02T21:26:55.1357702Z distributed/pipeline/sync/skip/test_leak 2022-05-02T21:26:55.1358040Z distributed/pipeline/sync/skip/test_stash_pop 2022-05-02T21:26:55.1358396Z distributed/pipeline/sync/skip/test_verify_skippables 2022-05-02T21:26:55.1358716Z distributed/pipeline/sync/test_bugs 2022-05-02T21:26:55.1359027Z distributed/pipeline/sync/test_copy 2022-05-02T21:26:55.1359350Z distributed/pipeline/sync/test_dependency 2022-05-02T21:26:55.1359662Z distributed/pipeline/sync/test_microbatch 2022-05-02T21:26:55.1359983Z distributed/pipeline/sync/test_pipe 2022-05-02T21:26:55.1360299Z distributed/pipeline/sync/test_stream 2022-05-02T21:26:55.1360593Z distributed/pipeline/sync/test_worker 2022-05-02T21:26:55.1360912Z distributed/rpc/test_tensorpipe_agent ``` </details> Pull Request resolved: https://github.com/pytorch/pytorch/pull/76564 Approved by: https://github.com/jeffdaily, https://github.com/janeyx99	2022-05-03 23:01:42 +00:00
Junjie Wang (PyTorch)	7c44d560ba	[PT-D][Sharding] Enable ops needed in the transformer model training (#75374 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/75374 From the FairSeq and MetaSeq codebases (essentially transformer models), we found that many ops are not supported by sharded tensor. We now implement simple versions so that we can at least run a transformer example. Ops include: chunk, transpose, view, masked_fill, dropout, softmax and type_as. We also isolate the common logic of registering simple ops into a function, so that registering a new op in the future requires implementing at most three functions. ghstack-source-id: 155309147 Test Plan: CI Reviewed By: pritamdamania87 Differential Revision: D35123021 fbshipit-source-id: 660e559fb8b4a910eb63e0586c63ab927873a2ce (cherry picked from commit 83a87ebf627d863448dfe1019c7c5f7112cc14ab)	2022-05-03 17:20:28 +00:00
Junjie Wang (PyTorch)	c1037d0d4c	[PT-D][Sharding] Move Partial Tensor to the _shard folder and add logic to remove padding (#76199 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76199 Since Partial Tensor is somewhat isolated from sharded tensor, we now move it to the _shard folder. Also, we added the logic to remove padding when the size is not divisible by the world size. Modify the unit test to reflect these changes. Finally, we need to consider the placement order for the resharding spec for partial tensor; related logic is added in this change. Furthermore, for sharded linear, we will need to order the placement by rank to get the expected local result. ghstack-source-id: 154853290 Test Plan: CI Reviewed By: pritamdamania87, wanchaol Differential Revision: D35827894 fbshipit-source-id: 58dab77969b8b6557f42afa7e8f5a8a053dd5793 (cherry picked from commit abeb28f16582dcf707c2e165f39df6caf692384d)	2022-04-28 06:22:02 +00:00
Alban Desmaison	3d7abc0e55	Make -h work with run_test.py As per title. ### When running `python run_test.py -h` It used to show: - The general unittest parser help that we print via a second thread `35545d85dc/torch/testing/_internal/common_utils.py (L467-L470)` - The common_utils's parser help <details><summary>Full result</summary> <p> ```bash $ python run_test.py -h usage: run_test.py [-h] [-v] [-q] [--locals] [-f] [-c] [-b] [-k TESTNAMEPATTERNS] [tests [tests ...]] positional arguments: tests a list of any number of test modules, classes and test methods. optional arguments: -h, --help show this help message and exit -v, --verbose Verbose output -q, --quiet Quiet output --locals Show local variables in tracebacks -f, --failfast Stop on first fail or error -c, --catch Catch Ctrl-C and display results so far -b, --buffer Buffer stdout and stderr during tests -k TESTNAMEPATTERNS Only run tests which match the given substring Examples: run_test.py - run default set of tests run_test.py MyTestSuite - run suite 'MyTestSuite' run_test.py MyTestCase.testSomething - run MyTestCase.testSomething run_test.py MyTestCase - run all 'test' test methods in MyTestCase usage: run_test.py [-h] [--subprocess] [--seed SEED] [--accept] [--jit_executor JIT_EXECUTOR] [--repeat REPEAT] [--test_bailouts] [--save-xml [SAVE_XML]] [--discover-tests] [--log-suffix LOG_SUFFIX] [--run-parallel RUN_PARALLEL] [--import-slow-tests [IMPORT_SLOW_TESTS]] [--import-disabled-tests [IMPORT_DISABLED_TESTS]] optional arguments: -h, --help show this help message and exit --subprocess whether to run each test in a subprocess --seed SEED --accept --jit_executor JIT_EXECUTOR --repeat REPEAT --test_bailouts --save-xml [SAVE_XML] --discover-tests --log-suffix LOG_SUFFIX --run-parallel RUN_PARALLEL --import-slow-tests [IMPORT_SLOW_TESTS] --import-disabled-tests [IMPORT_DISABLED_TESTS] ``` </p> </details> It now prints: - The general unittest parser help the same way. Should we remove this? 
We can't merge them unfortunately as unittest does not accept parent / does not expose the parser for us to take it as a parent. - The combined common_utils + run_test parsers help <details><summary>Full result</summary> <p> ```bash $ python run_test.py -h usage: run_test.py [-h] [-v] [-q] [--locals] [-f] [-c] [-b] [-k TESTNAMEPATTERNS] [tests [tests ...]] positional arguments: tests a list of any number of test modules, classes and test methods. optional arguments: -h, --help show this help message and exit -v, --verbose Verbose output -q, --quiet Quiet output --locals Show local variables in tracebacks -f, --failfast Stop on first fail or error -c, --catch Catch Ctrl-C and display results so far -b, --buffer Buffer stdout and stderr during tests -k TESTNAMEPATTERNS Only run tests which match the given substring Examples: run_test.py - run default set of tests run_test.py MyTestSuite - run suite 'MyTestSuite' run_test.py MyTestCase.testSomething - run MyTestCase.testSomething run_test.py MyTestCase - run all 'test' test methods in MyTestCase Ignoring disabled issues: [] usage: run_test.py [-h] [--subprocess] [--seed SEED] [--accept] [--jit_executor JIT_EXECUTOR] [--repeat REPEAT] [--test_bailouts] [--save-xml [SAVE_XML]] [--discover-tests] [--log-suffix LOG_SUFFIX] [--run-parallel RUN_PARALLEL] [--import-slow-tests [IMPORT_SLOW_TESTS]] [--import-disabled-tests [IMPORT_DISABLED_TESTS]] [-v] [--jit] [--distributed-tests] [-core] [-pt] [-c] [-i TESTS [TESTS ...]] [-x TESTS [TESTS ...]] [-f TESTS] [-l TESTS] [--bring-to-front TESTS [TESTS ...]] [--ignore-win-blocklist] [--continue-through-error] [--export-past-test-times [EXPORT_PAST_TEST_TIMES]] [--shard SHARD SHARD] [--exclude-jit-executor] [--exclude-distributed-tests] [--run-specified-test-cases [RUN_SPECIFIED_TEST_CASES]] [--use-specified-test-cases-by {include,bring-to-front}] [--dry-run] [additional_unittest_args [additional_unittest_args ...]] Run the PyTorch unit test suite positional arguments: 
additional_unittest_args additional arguments passed through to unittest, e.g., python run_test.py -i sparse -- TestSparse.test_factory_size_check optional arguments: -h, --help show this help message and exit --subprocess whether to run each test in a subprocess --seed SEED --accept --jit_executor JIT_EXECUTOR --repeat REPEAT --test_bailouts --save-xml [SAVE_XML] --discover-tests --log-suffix LOG_SUFFIX --run-parallel RUN_PARALLEL --import-slow-tests [IMPORT_SLOW_TESTS] --import-disabled-tests [IMPORT_DISABLED_TESTS] -v, --verbose print verbose information and test-by-test results --jit, --jit run all jit tests --distributed-tests, --distributed-tests run all distributed tests -core, --core Only run core tests, or tests that validate PyTorch's ops, modules,and autograd. They are defined by CORE_TEST_LIST. -pt, --pytest If true, use `pytest` to execute the tests. E.g., this runs TestTorch with pytest in verbose and coverage mode: python run_test.py -vci torch -pt -c, --coverage enable coverage -i TESTS [TESTS ...], --include TESTS [TESTS ...] select a set of tests to include (defaults to ALL tests). tests must be a part of the TESTS list defined in run_test.py -x TESTS [TESTS ...], --exclude TESTS [TESTS ...] select a set of tests to exclude -f TESTS, --first TESTS select the test to start from (excludes previous tests) -l TESTS, --last TESTS select the last test to run (excludes following tests) --bring-to-front TESTS [TESTS ...] select a set of tests to run first. This can be used in situations where you want to run all tests, but care more about some set, e.g. 
after making a change to a specific component --ignore-win-blocklist always run blocklisted windows tests --continue-through-error Runs the full test suite despite one of the tests failing --export-past-test-times [EXPORT_PAST_TEST_TIMES] dumps test times from previous S3 stats into a file, format JSON --shard SHARD SHARD runs a shard of the tests (taking into account other selections), e.g., --shard 2 3 will break up the selected tests into 3 shards and run the tests in the 2nd shard (the first number should not exceed the second) --exclude-jit-executor exclude tests that are run for a specific jit config --exclude-distributed-tests exclude distributed tests --run-specified-test-cases [RUN_SPECIFIED_TEST_CASES] load specified test cases file dumped from previous OSS CI stats, format CSV. If all test cases should run for a <test_module> please add a single row: test_filename,test_case_name ... <test_module>,__all__ ... how we use the stats will be based on option "--use-specified-test-cases-by". --use-specified-test-cases-by {include,bring-to-front} used together with option "--run-specified-test-cases". When specified test case file is set, this option allows the user to control whether to only run the specified test modules or to simply bring the specified modules to front and also run the remaining modules. Note: regardless of this option, we will only run the specified test cases within a specified test module. For unspecified test modules with the bring-to-front option, all test cases will be run, as one may expect. --dry-run Only list the test that will run. 
where TESTS is any of: benchmark_utils/test_benchmark_utils, distributed/_shard/sharded_optim/test_sharded_optim, distributed/_shard/sharded_tensor/ops/test_binary_cmp, distributed/_shard/sharded_tensor/ops/test_elementwise_ops, distributed/_shard/sharded_tensor/ops/test_embedding, distributed/_shard/sharded_tensor/ops/test_embedding_bag, distributed/_shard/sharded_tensor/ops/test_init, distributed/_shard/sharded_tensor/ops/test_linear, distributed/_shard/sharded_tensor/ops/test_math_ops, distributed/_shard/sharded_tensor/test_megatron_prototype, distributed/_shard/sharded_tensor/test_partial_tensor, distributed/_shard/sharded_tensor/test_sharded_tensor, distributed/_shard/sharded_tensor/test_sharded_tensor_reshard, distributed/_shard/sharding_spec/test_sharding_spec, distributed/_shard/test_replicated_tensor, distributed/algorithms/test_join, distributed/elastic/events/lib_test, distributed/elastic/metrics/api_test, distributed/elastic/multiprocessing/api_test, distributed/elastic/timer/api_test, distributed/elastic/timer/local_timer_example, distributed/elastic/timer/local_timer_test, distributed/elastic/utils/distributed_test, distributed/elastic/utils/logging_test, distributed/elastic/utils/util_test, distributed/fsdp/test_flatten_params_wrapper, distributed/fsdp/test_fsdp_apply, distributed/fsdp/test_fsdp_checkpoint, distributed/fsdp/test_fsdp_clip_grad_norm, distributed/fsdp/test_fsdp_comm, distributed/fsdp/test_fsdp_core, distributed/fsdp/test_fsdp_freezing_weights, distributed/fsdp/test_fsdp_grad_acc, distributed/fsdp/test_fsdp_ignored_modules, distributed/fsdp/test_fsdp_input, distributed/fsdp/test_fsdp_memory, distributed/fsdp/test_fsdp_mixed_precision, distributed/fsdp/test_fsdp_multiple_forward, distributed/fsdp/test_fsdp_multiple_wrapping, distributed/fsdp/test_fsdp_optim_state, distributed/fsdp/test_fsdp_overlap, distributed/fsdp/test_fsdp_pure_fp16, distributed/fsdp/test_fsdp_state_dict, distributed/fsdp/test_fsdp_summon_full_params, 
distributed/fsdp/test_fsdp_traversal, distributed/fsdp/test_fsdp_uneven, distributed/fsdp/test_shard_utils, distributed/fsdp/test_utils, distributed/fsdp/test_wrap, distributed/nn/jit/test_instantiator, distributed/optim/test_zero_redundancy_optimizer, distributed/pipeline/sync/skip/test_api, distributed/pipeline/sync/skip/test_gpipe, distributed/pipeline/sync/skip/test_inspect_skip_layout, distributed/pipeline/sync/skip/test_leak, distributed/pipeline/sync/skip/test_portal, distributed/pipeline/sync/skip/test_stash_pop, distributed/pipeline/sync/skip/test_tracker, distributed/pipeline/sync/skip/test_verify_skippables, distributed/pipeline/sync/test_balance, distributed/pipeline/sync/test_bugs, distributed/pipeline/sync/test_checkpoint, distributed/pipeline/sync/test_copy, distributed/pipeline/sync/test_deferred_batch_norm, distributed/pipeline/sync/test_dependency, distributed/pipeline/sync/test_inplace, distributed/pipeline/sync/test_microbatch, distributed/pipeline/sync/test_phony, distributed/pipeline/sync/test_pipe, distributed/pipeline/sync/test_pipeline, distributed/pipeline/sync/test_stream, distributed/pipeline/sync/test_transparency, distributed/pipeline/sync/test_worker, distributed/rpc/cuda/test_tensorpipe_agent, distributed/rpc/test_faulty_agent, distributed/rpc/test_tensorpipe_agent, distributed/test_c10d_common, distributed/test_c10d_gloo, distributed/test_c10d_nccl, distributed/test_c10d_spawn_gloo, distributed/test_c10d_spawn_nccl, distributed/test_data_parallel, distributed/test_distributed_spawn, distributed/test_launcher, distributed/test_nccl, distributed/test_pg_wrapper, distributed/test_store, distributions/test_constraints, distributions/test_distributions, lazy/test_bindings, lazy/test_extract_compiled_graph, lazy/test_ts_opinfo, test_ao_sparsity, test_autocast, test_autograd, test_binary_ufuncs, test_bundled_inputs, test_complex, test_cpp_api_parity, test_cpp_extensions_aot_ninja, test_cpp_extensions_aot_no_ninja, test_cpp_extensions_jit, 
test_cuda, test_cuda_primary_ctx, test_dataloader, test_datapipe, test_deploy, test_deploy, test_dispatch, test_expanded_weights, test_foreach, test_function_schema, test_functional_autograd_benchmark, test_functional_optim, test_functionalization, test_futures, test_fx, test_fx_experimental, test_hub, test_import_stats, test_indexing, test_jit, test_jit_autocast, test_jit_cuda_fuser, test_jit_disabled, test_jit_fuser_legacy, test_jit_fuser_te, test_jit_legacy, test_jit_profiling, test_license, test_linalg, test_logging, test_masked, test_mkldnn, test_mobile_optimizer, test_model_dump, test_module_init, test_modules, test_monitor, test_multiprocessing, test_multiprocessing_spawn, test_namedtensor, test_namedtuple_return_api, test_native_functions, test_nestedtensor, test_nn, test_numba_integration, test_numpy_interop, test_openmp, test_ops, test_ops_gradients, test_ops_jit, test_optim, test_overrides, test_package, test_per_overload_api, test_profiler, test_pruning_op, test_public_bindings, test_python_dispatch, test_pytree, test_quantization, test_reductions, test_scatter_gather_ops, test_serialization, test_set_default_mobile_cpu_allocator, test_shape_ops, test_show_pickle, test_sort_and_select, test_sparse, test_sparse_csr, test_spectral_ops, test_stateless, test_tensor_creation_ops, test_tensorboard, test_tensorexpr, test_tensorexpr_pybind, test_testing, test_torch, test_type_hints, test_type_info, test_type_promotion, test_unary_ufuncs, test_utils, test_view_ops, test_vmap, test_vulkan, test_xnnpack_integration ``` </p> </details> ### When running anything else (for example `python test_autograd.py -h`) It did not change and still does: - The general unittest parser help that we print via a second thread - The common_utils's parser help Pull Request resolved: https://github.com/pytorch/pytorch/pull/76152 Approved by: https://github.com/malfet, https://github.com/seemethere	2022-04-25 14:01:33 +00:00
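The `--shard SHARD SHARD` help text in the commit above (`--shard 2 3` splits the selected tests into 3 shards and runs the 2nd) can be pictured with a minimal sketch. The helper name `select_shard` and the round-robin assignment are assumptions made for illustration only; the commit message does not say how run_test.py actually distributes tests across shards.

```python
from typing import List, Sequence


def select_shard(tests: Sequence[str], which_shard: int, num_shards: int) -> List[str]:
    """Return the tests belonging to shard `which_shard` (1-indexed) out of `num_shards`.

    Hypothetical helper: round-robin assignment is an assumption of this sketch,
    not a description of run_test.py's real sharding.
    """
    # Mirrors the help text: the first number should not exceed the second.
    if not 1 <= which_shard <= num_shards:
        raise ValueError("the first number should not exceed the second")
    return [t for i, t in enumerate(tests) if i % num_shards == which_shard - 1]


# Test names taken from the TESTS list printed by `run_test.py -h`.
selected = ["test_autograd", "test_nn", "test_ops", "test_torch", "test_jit"]
second_of_three = select_shard(selected, 2, 3)
print(second_of_three)
```

Every selected test lands in exactly one shard, so running shards 1, 2 and 3 on separate machines covers the whole selection once.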
Wanchao Liang	78ea86a445	[shard] Sharder and ShardingPlan prototype (#73873 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73873 Basic ShardingPlan interface and Sharder implementation: 1. We provide `ShardingPlan` to allow users to specify all parameter sharding strategies for a given model, including `plan` for sharding the parameters, `output_plan` for tagging the output layout, and `return_local_tensor` for converting back to DDP. 2. Introduce the `shard_module` API, which takes an nn.Module and a ShardingPlan, then shards the module according to the plan. TODO: next PR we will introduce Extensible Sharder and ShardingPlanner. ghstack-source-id: 154682421 Test Plan: test_sharding_plann.py Reviewed By: pritamdamania87, fduwjj Differential Revision: D34695159 fbshipit-source-id: 3d695803c4b7e9a7543177ade5b709b5f847baa9 (cherry picked from commit 670cd279b0e5304a9bf0ce6e6651a08273a77035)	2022-04-25 13:01:24 +00:00
Jeff Daily	44bbb247a6	[ROCm] enable fsdp tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/75632 Approved by: https://github.com/kumpera, https://github.com/malfet	2022-04-22 19:50:36 +00:00
wanchaol	be354d8139	[shard] Add basic math ops to ShardedTensor and add ReplicatedTensor inter-op Pull Request resolved: https://github.com/pytorch/pytorch/pull/73703 This PR adds basic math ops to ShardedTensor (+-/), and adds ReplicatedTensor inter-op with ShardedTensor for those math ops. This enables ShardedTensor (op) ReplicatedTensor to avoid communication in certain cases. Differential Revision: [D34560867](https://our.internmc.facebook.com/intern/diff/D34560867/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D34560867/)! Approved by: https://github.com/pritamdamania87	2022-04-12 04:25:10 +00:00
Andrey Talman	622cff3e95	Cuda 11.6 Disable failing tests (#75420 ) Summary: This mitigates a number of issues with the CUDA 11.6 update and updates the Linux driver. New issues discovered: #[75391](https://github.com/pytorch/pytorch/issues/75391) #[75375](https://github.com/pytorch/pytorch/issues/75375) Old issues present since 11.3: #[57482](https://github.com/pytorch/pytorch/issues/57482) #[70111](https://github.com/pytorch/pytorch/issues/70111) These changes were already tested in WIP PR: #[75337](https://github.com/pytorch/pytorch/pull/75337) Pull Request resolved: https://github.com/pytorch/pytorch/pull/75420 Reviewed By: seemethere Differential Revision: D35481973 Pulled By: atalman fbshipit-source-id: 4db00c646e2df4f8650404763963c3b215110f1f (cherry picked from commit 518e19dc361b43273f5bd6bdfff942614e8466f5)	2022-04-07 22:43:15 +00:00
Brian Hirsh	9429dbb434	make functionalization work better with subclasses Pull Request resolved: https://github.com/pytorch/pytorch/pull/73441 Approved by: https://github.com/ezyang, https://github.com/albanD	2022-04-04 15:33:27 +00:00
David Berard	27deefb5e1	[JIT] Enable NVFuser tests in OSS CI (#73322 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73322 These tests have been disabled in OSS CI since #34785. Test Plan: Imported from OSS Reviewed By: eellison Differential Revision: D34436844 Pulled By: davidberard98 fbshipit-source-id: c5b14b33e7f369a6fa1e9cfbcb484a30dffc659e (cherry picked from commit b08f51587c0203c3e8b69f06ea613759e740aa4f)	2022-04-01 23:48:30 +00:00
Wanchao Liang	0524b2829a	[shard] Add ReplicatedTensor (#73529 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73529 Add ReplicatedTensor. A ReplicatedTensor is a type of tensor that has the same value on all ranks across the world_size. ReplicatedTensor is a :class:`~torch.Tensor` subclass, and it can be used together with ShardedTensor/Tensor to express different types of computation. The inter-op rules are defined as (using torch.add as an example op): ReplicatedTensor + ReplicatedTensor = ReplicatedTensor; ReplicatedTensor + torch.Tensor = torch.Tensor; ReplicatedTensor + ShardedTensor = ShardedTensor. We also added a `validate()` API to help users validate whether a replicated tensor on a certain process_group is truly replicated. TODO: the next PR will add ShardedTensor/PartialTensor logic to handle ReplicatedTensor. ghstack-source-id: 152064781 Test Plan: test_replicated_tensor Reviewed By: pritamdamania87, fduwjj Differential Revision: D34529374 fbshipit-source-id: 16ccb300e9f9c47ac29a17eb6d46d029ab7d60b8 (cherry picked from commit 44f4e11e795a1bf330a8108bda256950ca769525)	2022-03-24 12:41:17 +00:00
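The three inter-op rules listed in the ReplicatedTensor commit above can be written down as a small lookup. This is a sketch with plain strings standing in for the tensor types; `add_result_kind` is a hypothetical name, not a torch API. Treating the rules as symmetric in their operands is an assumption, since the commit message only lists ReplicatedTensor on the left-hand side.

```python
# Plain-string stand-ins for the three tensor kinds; these are not the real torch classes.
PLAIN = "torch.Tensor"
REPLICATED = "ReplicatedTensor"
SHARDED = "ShardedTensor"
KNOWN_KINDS = {PLAIN, REPLICATED, SHARDED}


def add_result_kind(lhs: str, rhs: str) -> str:
    """Return the result kind of `lhs + rhs` under the rules listed in the commit.

    Hypothetical helper. Only combinations involving a ReplicatedTensor are
    covered, because those are the only rules the commit message states.
    """
    if lhs not in KNOWN_KINDS or rhs not in KNOWN_KINDS:
        raise ValueError("unknown tensor kind")
    if REPLICATED not in (lhs, rhs):
        raise NotImplementedError("the commit lists no rule for this combination")
    # A replicated operand defers to whatever the other operand is
    # (assumed to hold regardless of operand order).
    return rhs if lhs == REPLICATED else lhs


for other in (REPLICATED, PLAIN, SHARDED):
    print(f"{REPLICATED} + {other} = {add_result_kind(REPLICATED, other)}")
```

Read this way, a ReplicatedTensor behaves like a neutral operand: the result takes the kind of the other operand.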
Jeff Daily	956a028b55	[ROCm] enable HIP IPC Enables code paths that use hipIpc* functions. Also enables test_multiprocessing.py. Pull Request resolved: https://github.com/pytorch/pytorch/pull/74383 Approved by: https://github.com/osalpekar	2022-03-21 19:32:01 +00:00
Sahan Paliskara	0bfa2f8255	Move torch::deploy tests to their own workflow job (#73676 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73676 For some reason https://github.com/pytorch/pytorch/pull/72637 ended up in getting messed up during rebasing so please refer to that pr for review history. This PR creates a new workflow called ` deploy-linux-xenial-cuda11.3-py3.7-gcc7` for torch::deploy tests. For testing go to https://www.torch-ci.com/pytorch/pytorch/pull/73676 and check if a build and test job occur with ` deploy-linux-xenial-cuda11.3-py3.7-gcc7` Test Plan: Imported from OSS Reviewed By: soulitzer Differential Revision: D34586702 Pulled By: PaliC fbshipit-source-id: 5627cf4ff411a4a04030f8b7726f84af979da213 (cherry picked from commit df6dddebb9fe078a6053a31033b5a40cc742fcf3)	2022-03-17 12:19:48 +00:00
atalman	ebca80ed08	Move test ops gradients and test ops jit to separate files Fixes #72368 As per reference issue, the test_ops in single file takes around 3:30-4:00Hrs to execute on asan jobs: Reference : pytorch_test_times.json ``` { "commit": "39535fec6c3ff5bf7c2d322d096c59571c3295ed", "JOB_BASE_NAME": "linux-xenial-py3.7-clang7-asan", "job_times": { "test_ops": 14928.355000000636, <- This test group is over 4hrs alone ``` ---- Hence separating test_ops into following parts: 1. TestGradients 2. TestJit 3. TestCommon and TestMathBits Pull Request resolved: https://github.com/pytorch/pytorch/pull/74297 Approved by: https://github.com/malfet	2022-03-17 02:07:50 +00:00
PyTorch MergeBot	232faeacf8	Revert "Move test ops gradients and test ops jit to separate files" This reverts commit `7cf9b942da`. Reverted https://github.com/pytorch/pytorch/pull/74297 on behalf of https://github.com/atalman	2022-03-16 20:08:23 +00:00
atalman	7cf9b942da	Move test ops gradients and test ops jit to separate files Fixes #72368 As per reference issue, the test_ops in single file takes around 3:30-4:00Hrs to execute on asan jobs: Reference : pytorch_test_times.json ``` { "commit": "39535fec6c3ff5bf7c2d322d096c59571c3295ed", "JOB_BASE_NAME": "linux-xenial-py3.7-clang7-asan", "job_times": { "test_ops": 14928.355000000636, <- This test group is over 4hrs alone ``` ---- Hence separating test_ops into following parts: 1. TestGradients 2. TestJit 3. TestCommon and TestMathBits Pull Request resolved: https://github.com/pytorch/pytorch/pull/74297 Approved by: https://github.com/malfet	2022-03-16 19:30:22 +00:00
Wanchao Liang	8b2ae86f02	[shard] disable rocm and windows for sharding_spec test (#74040 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74040 fixes https://github.com/pytorch/pytorch/issues/73552 ghstack-source-id: 151046817 Test Plan: wait for ci Reviewed By: rohan-varma Differential Revision: D34792398 fbshipit-source-id: 84d08f01db8375817f48537505e7d988cb39d1f4 (cherry picked from commit 18b21ef0db91ddd22dc57a5b413e3e3ad594bb14)	2022-03-10 20:23:59 +00:00
Alban Desmaison	701fa16eed	only run complex autograd tests once Pull Request resolved: https://github.com/pytorch/pytorch/pull/73210	2022-03-01 23:42:07 +00:00
Alban Desmaison	f275b3f9a1	simplify run_test for distributed tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/73209	2022-03-01 23:37:37 +00:00
Alban Desmaison	7e919bd3c6	add dry run option and improve test list printing Pull Request resolved: https://github.com/pytorch/pytorch/pull/73208	2022-02-22 20:45:41 +00:00
Ilya Persky	1b089292df	Fix test failure when compiled without LAPACK support (#70671 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/70670 Pull Request resolved: https://github.com/pytorch/pytorch/pull/70671 Reviewed By: H-Huang Differential Revision: D34242339 Pulled By: janeyx99 fbshipit-source-id: 8cd13c13588007c60e9c3f17dbf707dcfa2e0e04 (cherry picked from commit `cf6dbe3e81`)	2022-02-15 16:38:47 +00:00
wushirong	4d01789f69	Remove fx2trt from oss CI (#72595 ) Summary: Remove fx2trt test from oss CI Pull Request resolved: https://github.com/pytorch/pytorch/pull/72595 Test Plan: CI Reviewed By: houseroad Differential Revision: D34112595 Pulled By: wushirong fbshipit-source-id: 02376ef0f25381eff31b72dcbf964c1966af9793 (cherry picked from commit `e3d698a942`)	2022-02-10 18:49:31 +00:00
Junjie Wang (PyTorch)	88547396eb	[PT-D] Enable megatron-lm style MLP layers (Changes mainly on sharded linear op) (#69735 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69735 We want to build a prototype of Megatron-LM so that we can apply PT-D op to models like transformer and other Meta flagship models. The basic idea of Megatron-LM is as follows: 1. Col-wise sharding of linear weight. Perform the linear op for the first layer. 2. Perform a math op (optional), such as ReLU or GeLU. We use GeLU in our example unit test. The input is from step 1. 3. Row-wise sharding of linear weight. Perform the linear op for the second layer. The input is from step 2. We thereby save the communication needed to concatenate the col-wise sharding results and to spread the input to different ranks for row-wise sharding. The change is as follows: 1. Return a ShardedTensor for the col-wise sharding in the sharded_linear op. 2. Return a PartialTensor for the row-wise sharding in the sharded_linear op. 3. Leverage APIs already defined for `reshard` to merge/aggregate local results to a fully synced local result if needed. 4. Add helper function to create sharded tensor based on the local result. 5. Add a unit test to test the Megatron-LM idea mentioned above and compare with local ops, including the grad and optimizer so that we can ensure the correctness of the implementation. 6. Refactor the unit test of sharded linear to reflect the changes in the code. ghstack-source-id: 148273049 Test Plan: Unit test + CI Reviewed By: pritamdamania87 Differential Revision: D32978221 fbshipit-source-id: 565fc92e7807e19d53b0261f8ace3945bef69e3e (cherry picked from commit `344abe7520`)	2022-02-03 06:12:15 +00:00
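The three Megatron-LM steps in the commit above (col-wise sharded first linear, elementwise GeLU, row-wise sharded second linear) can be checked numerically with a small pure-Python sketch. It uses nested lists instead of torch tensors and a simulated loop over two "ranks"; all helper names, matrix values, and the final summation standing in for the aggregation of partial results are assumptions made for illustration, not the sharded_linear implementation.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gelu(m):
    """Apply exact GeLU elementwise."""
    return [[0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0))) for v in row] for row in m]


def add(a, b):
    """Add two matrices of the same shape elementwise."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


x = [[1.0, -2.0], [0.5, 3.0]]                              # input, 2 x 2
w1 = [[0.1, -0.2, 0.3, 0.4], [0.5, 0.6, -0.7, 0.8]]        # first layer, 2 x 4
w2 = [[1.0, 0.0], [0.5, -1.0], [2.0, 1.0], [-0.5, 0.25]]   # second layer, 4 x 2

# Unsharded reference: linear -> GeLU -> linear.
reference = matmul(gelu(matmul(x, w1)), w2)

world_size = 2  # simulated ranks
cols = len(w1[0]) // world_size
partials = []
for rank in range(world_size):
    lo, hi = rank * cols, (rank + 1) * cols
    w1_shard = [row[lo:hi] for row in w1]   # step 1: col-wise shard of the first weight
    hidden = gelu(matmul(x, w1_shard))      # step 2: elementwise op needs no communication
    w2_shard = w2[lo:hi]                    # step 3: matching row-wise shard of the second weight
    partials.append(matmul(hidden, w2_shard))

# Each rank holds a partial result; summing them recovers the full output.
combined = partials[0]
for partial in partials[1:]:
    combined = add(combined, partial)

max_diff = max(abs(a - b) for ra, rb in zip(reference, combined) for a, b in zip(ra, rb))
print(max_diff < 1e-9)
```

The hidden activations are never concatenated: because GeLU is elementwise, each rank's column slice of the hidden layer pairs directly with its row slice of the second weight, and only the final partial outputs need to be aggregated.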
Junjie Wang (PyTorch)	19d0de8a57	[PT-D][RFC] Resharding related API implement for ShardedTensor and Partial Tensor (#70079 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70079 We defined a new concept named `PartialTensor`, which is an abstraction to represent Tensors that need aggregation across multiple devices and multiple processes. We also defined an API `reshard_output` to reshard a `PartialTensor` to `Tensor` or reshard a `ShardedTensor` to `ShardedTensor/Tensor`. This is done via class `ModuleResharder`, which acts like a wrapper of original modules plus a reshard in the final step. The `reshard` logic is defined in each class (`ShardedTensor` and `PartialTensor`). ghstack-source-id: 148273050 Test Plan: Unit test is in the next PR. Reviewed By: pritamdamania87 Differential Revision: D33121037 fbshipit-source-id: 5f56617ea526b857c5b73df6e069697d428ec359 (cherry picked from commit `58b1457cbc`)	2022-02-03 05:26:02 +00:00
Pritam Damania	64670e414e	[reland] Create torch.distributed._shard package. (#72141 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72141 We have many sharding components currently: torch.distributed._sharded_tensor, torch.distributed._sharding_spec, torch.distributed._sharded_optimizer and more coming. As a result, organizing all of this under the `torch.distributed._shard` package. For BC reasons, I'm still keeping the old packages and have them just reference the new package. ghstack-source-id: 148150861 Test Plan: waitforbuildbot Reviewed By: fduwjj Differential Revision: D33904585 fbshipit-source-id: 057e847eb7521b536a3ee4e0f94871aacc752062 (cherry picked from commit `29a70dd7af`)	2022-02-02 06:58:20 +00:00
Nikita Shulga	34494e6252	Back out "Create torch.distributed.shard package." (#72062 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72062 Original commit changeset: dc692b31e260 Original Phabricator Diff: D33755913 (`87bbcf70f7`) Test Plan: CI Reviewed By: pbelevich Differential Revision: D33891115 fbshipit-source-id: 37286e03d743d8691319f07c95e9561d54f3d6d0 (cherry picked from commit `0c1b3fe008`)	2022-01-31 18:29:27 +00:00
Pritam Damania	87bbcf70f7	Create torch.distributed.shard package. (#71742 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71742 We have many sharding components currently: torch.distributed._sharded_tensor, torch.distributed._sharding_spec, torch.distributed._sharded_optimizer and more coming. As a result, organizing all of this under the `torch.distributed.shard` package. For BC reasons, I'm still keeping the old packages and have them just reference the new package. ghstack-source-id: 147899768 Test Plan: waitforbuildbot Reviewed By: fduwjj, wanchaol Differential Revision: D33755913 fbshipit-source-id: dc692b31e2607063d55dfcb3db33ec53961d5a5b (cherry picked from commit `5b6885f358`)	2022-01-29 00:48:06 +00:00
Shirong Wu	7a08030903	Fix fx2trt CI test trigger condition (#71014 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71014 Replace test trigger with test_config matching. Test Plan: CI https://github.com/pytorch/pytorch/runs/4746717568?check_suite_focus=true Reviewed By: janeyx99 Differential Revision: D33480971 fbshipit-source-id: 9513e464753343a7ae47fcfaf48119f34bae94c5	2022-01-10 13:37:24 -08:00
Rodrigo Kumpera	2378421340	Implement torch.allclose for sharded tensor. (#70331 ) Summary: Implement torch.allclose op for sharded tensors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/70331 Test Plan: Automated test added. pritamdamania87 Fixes https://github.com/pytorch/pytorch/issues/67112 cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang Reviewed By: pritamdamania87 Differential Revision: D33339137 Pulled By: kumpera fbshipit-source-id: 4263e468eaa117317b190f69877bf3f8bbac5658	2022-01-07 08:37:04 -08:00
Ilya Persky	bc514cb425	Skip distributed tests if built with USE_DISTRIBUTED=0 (#70677 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/70676 Pull Request resolved: https://github.com/pytorch/pytorch/pull/70677 Reviewed By: albanD Differential Revision: D33439808 Pulled By: janeyx99 fbshipit-source-id: 7f9971eb564dbbb6625fe5f78328c3abe3808719	2022-01-06 08:55:05 -08:00
Brian Hirsh	bb5b4cceb6	Revert "Revert D32498569: allow external backend codegen to toggle whether to generate out= and inplace kernels" (#69950 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69950 This reverts commit `f6cad53443`. Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D33113545 Pulled By: bdhirsh fbshipit-source-id: d6590294662588d36c09662dea65919ad4e1e288	2022-01-04 14:52:00 -08:00
wushirong	31c7e5d629	Install TensorRT lib on oss docker and enable fx2trt unit test (#70203 ) Summary: CI Lib installed and unit test run on https://github.com/pytorch/pytorch/actions/runs/1604076060 Pull Request resolved: https://github.com/pytorch/pytorch/pull/70203 Reviewed By: malfet Differential Revision: D33264641 Pulled By: wushirong fbshipit-source-id: ba30010bbd06e70d31415d8c52086d1779371bcf	2021-12-22 08:50:48 -08:00
Pritam Damania	0544f975e1	[reland] Support torch.equal for ShardedTensor. (#70145 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70145 Added support for torch.equal to ShardedTensor. This is really helpful in terms of comparing two ShardedTensors. ghstack-source-id: 146066939 Test Plan: waitforbuildbot Reviewed By: wanchaol Differential Revision: D33201714 fbshipit-source-id: 56adfc36e345d512c9901c56c07759bf658c745b	2021-12-21 13:22:52 -08:00
Michael Suo	19f898402d	Revert D33241684: [pytorch][PR] Install TensorRT lib on oss docker and enable fx2trt unit test Test Plan: revert-hammer Differential Revision: D33241684 (`dab3d3132b`) Original commit changeset: cd498908b00f Original Phabricator Diff: D33241684 (`dab3d3132b`) fbshipit-source-id: d5b2e663b5b0c9e570bd799b9f6111cd2a0de4f7	2021-12-20 23:14:35 -08:00
wushirong	dab3d3132b	Install TensorRT lib on oss docker and enable fx2trt unit test (#70203 ) Summary: CI Lib installed and unit test run on https://github.com/pytorch/pytorch/actions/runs/1604076060 Pull Request resolved: https://github.com/pytorch/pytorch/pull/70203 Reviewed By: janeyx99 Differential Revision: D33241684 Pulled By: wushirong fbshipit-source-id: cd498908b00f3417bdeb5ede78f5576b3b71087c	2021-12-20 18:51:48 -08:00
Michael Suo	a406a427ae	Revert D33004315: Support torch.equal for ShardedTensor. Test Plan: revert-hammer Differential Revision: D33004315 (`1c4c81622c`) Original commit changeset: 786fe26baf82 Original Phabricator Diff: D33004315 (`1c4c81622c`) fbshipit-source-id: e1dda70fea656834fdf0f2a9f874415f7b460c6e	2021-12-15 14:14:06 -08:00
Pritam Damania	1c4c81622c	Support torch.equal for ShardedTensor. (#69734 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69734 Added support for `torch.equal` to ShardedTensor. This is really helpful in terms of comparing two ShardedTensors. Will implement `allclose` in a follow-up PR. ghstack-source-id: 145301451 Test Plan: waitforbuildbot Reviewed By: fduwjj, wanchaol Differential Revision: D33004315 fbshipit-source-id: 786fe26baf82e1bb4fecfdbfc9ad4b64e704877f	2021-12-15 13:07:36 -08:00
Brian Hirsh	f6cad53443	Revert D32498569: allow external backend codegen to toggle whether to generate out= and inplace kernels Test Plan: revert-hammer Differential Revision: D32498569 (`aa0cf68c17`) Original commit changeset: ebd932d042b9 Original Phabricator Diff: D32498569 (`aa0cf68c17`) fbshipit-source-id: 21a393fa339510d926512a7983d33ece327b743d	2021-12-14 15:27:24 -08:00
Nikita Shulga	24ee1d13f6	Another attempt to fix version comparison check (#69939 ) Summary: Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/69939 Reviewed By: atalman Differential Revision: D33108135 Pulled By: malfet fbshipit-source-id: cadadfe5b04c4378f149136f8e1f8e8d6266775c	2021-12-14 14:54:15 -08:00
Wanchao Liang	800a457b6f	[shard] add ShardedOptimizer (#68607 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68607 This PR adds ShardedOptimizer and an API to get module parameters along with ShardedTensor params. It allows users to use this optimizer wrapper to construct an optimizer that involves ShardedTensor. The state_dict support will be a follow-up diff. ghstack-source-id: 145532834 Test Plan: python test_sharded_optim.py Reviewed By: pritamdamania87 Differential Revision: D32539994 fbshipit-source-id: a3313c6870d1f1817fc3e08dc2fc27dc43bef743	2021-12-14 12:15:20 -08:00
Nikita Shulga	fef9981998	Update run_test.py (#69920 ) Summary: Do not compare LooseVersion against string Pull Request resolved: https://github.com/pytorch/pytorch/pull/69920 Reviewed By: atalman Differential Revision: D33101166 Pulled By: malfet fbshipit-source-id: a2df9e01d17663262718f11e580c8b009764f7b5	2021-12-14 11:26:56 -08:00
Brian Hirsh	aa0cf68c17	allow external backend codegen to toggle whether to generate out= and inplace kernels (#68530 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68530 Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D32498569 Pulled By: bdhirsh fbshipit-source-id: ebd932d042b988e19c71aa04a21677db9bdc9f04	2021-12-14 10:25:02 -08:00
Nikita Shulga	07767569c9	Properly import LooseVersion (#69904 ) Summary: This fixes a regression introduced by https://github.com/pytorch/pytorch/pull/57040 Somehow importing `distutils` from `setuptools` caused an import of `distutils.version`, which is not a documented dependency and changed with the release of [setuptools-59.6.0](https://github.com/pypa/setuptools/tree/v59.6.0) We should not rely on that, as `import distutils` never re-imports `distutils.version`, which one can see by observing https://github.com/python/cpython/blob/3.9/Lib/distutils/__init__.py or by running: ``` % python3 -c "import distutils;print(distutils.__version__, dir(distutils))" 3.7.5 ['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', '__version__', 'sys'] % python3 -c "from setuptools import distutils;print(distutils.__version__, dir(distutils))" 3.7.5 ['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', '__version__', 'archive_util', 'ccompiler', 'cmd', 'config', 'core', 'debug', 'dep_util', 'dir_util', 'dist', 'errors', 'extension', 'fancy_getopt', 'file_util', 'filelist', 'log', 'spawn', 'sys', 'sysconfig', 'util', 'version'] ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/69904 Reviewed By: albanD, atalman, janeyx99 Differential Revision: D33094453 Pulled By: malfet fbshipit-source-id: aaf1adb7c6f293c4e376ccff21c64cd6ba625e97	2021-12-14 09:28:19 -08:00
Andrey Talman	77a4b89411	Adding windows cuda 11.5 workflows (#69377 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/69081 Pull Request resolved: https://github.com/pytorch/pytorch/pull/69377 Reviewed By: ngimel Differential Revision: D33076022 Pulled By: atalman fbshipit-source-id: aeb2791fc15d7b491976f57a74c1989c6ca61b81	2021-12-13 20:49:02 -08:00
Alban Desmaison	8b20dde932	add python dispatch test back to CI and fix typo in test (#69565 ) Summary: The error message was changed following a PR comment. And since the test doesn't run on CI, I forgot to update the test to catch the new error message. Pull Request resolved: https://github.com/pytorch/pytorch/pull/69565 Reviewed By: mrshenli Differential Revision: D32932982 Pulled By: albanD fbshipit-source-id: a1da72b0ca735e72b481bc944039233094f1c422	2021-12-08 08:44:49 -08:00
Rohan Varma	3bd7dbf119	[Dist CI][BE] Remainder of c10d/store tests run in subprocess (#68822 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68822 Per title, we switched over c10d_gloo and nccl and results look good so far, so switch the rest of them as well. After this, the only dist tests that won't run in subprocess are the pipe and fsdp tests, which historically haven't had much flakiness. ghstack-source-id: 144213522 Test Plan: CI Reviewed By: H-Huang Differential Revision: D32624330 fbshipit-source-id: 469f613e5b0e4529e6b23ef259d948837d4af26b	2021-11-29 10:59:39 -08:00
Rohan Varma	250d0bd20b	[RPC][Dist CI][BE] RPC tests run in subprocess (#68821 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68821 Continuing effort to move most distributed tests to run in subprocess for better reproducibility + reduce flakiness. ghstack-source-id: 144213520 Test Plan: CI Reviewed By: H-Huang Differential Revision: D32624199 fbshipit-source-id: 04448636320554d7a3ab29ae92bc1ca9fbe37da2	2021-11-29 10:58:08 -08:00
Nikita Shulga	b5b62b3408	Cleanup old TD logic (#68842 ) Summary: Remove `--determine-from` option from run_test.py and remove all references from corresponding test scripts Followup after https://github.com/pytorch/pytorch/pull/64921 Pull Request resolved: https://github.com/pytorch/pytorch/pull/68842 Reviewed By: seemethere, janeyx99 Differential Revision: D32631418 Pulled By: malfet fbshipit-source-id: bdb5dd888c1d97dfaf95c1f299bf8073f3de9588	2021-11-23 18:45:42 -08:00
Rohan Varma	9554ebe44e	[Dist CI][BE] c10d gloo tests run in subprocess (#68504 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68504 Per title ghstack-source-id: 143928767 Test Plan: CI Reviewed By: H-Huang Differential Revision: D32485100 fbshipit-source-id: a55687aea4af69e3830aee6f0278550c72f142c2	2021-11-22 09:54:07 -08:00
Rohan Varma	ddc22ea3b2	[Dist CI][BE] test_c10d_nccl run in subprocess (#68503 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68503 Per title ghstack-source-id: 143928768 Test Plan: CI Reviewed By: H-Huang Differential Revision: D32484990 fbshipit-source-id: 6682f46256af0da5153e5087a91a7044156dd17f	2021-11-22 09:52:58 -08:00
Wanchao Liang	fb556c91ce	[BE] delete frontend.cpp (#67400 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67400 c10d/frontend.cpp was originally proposed to introduce a pure C++ API and use TorchBind to share the python level API with TorchScript. This is no longer needed, so delete this to reduce code redundancy. ghstack-source-id: 143910066 Test Plan: wait for ci Reviewed By: navahgar Differential Revision: D31979270 fbshipit-source-id: 6ceb8b53d67ab8f9aef44b34da79346dfbb51225	2021-11-21 23:30:52 -08:00
Rohan Varma	f02efc749a	[Dist CI][BE] Run each test in its own process for test_distributed_spawn (#67901 ) Summary: Context: https://github.com/pytorch/pytorch/issues/67061 Use `run_test.py`'s provided flag `"--subprocess"`, passed in like `extra_unittest_args=["--subprocess"]` when running test_distributed_spawn. This will ensure that each test is run separately in its own process. The goal is to more closely simulate how a developer would run a single test when reproducing a CI failure and make reproducibility easier in general. Also, when a test fails, print out the exact command that was issued so the developer knows how to reproduce it. For example, when a test fails, it will print something like the following to the logs: ``` Test exited with non-zero exitcode 1. Command to reproduce: BACKEND=gloo WORLD_SIZE=3 /fsx/users/rvarm1/conda/envs/pytorch/bin/python distributed/test_distributed_spawn.py -v TestDistBackendWithSpawn.test_Backend_enum_class ``` Running test_distributed_spawn still uses the same cmd as before: ` python test/run_test.py --verbose -i distributed/test_distributed_spawn ` as seen in the [distributed contributing](https://github.com/pytorch/pytorch/blob/master/torch/distributed/CONTRIBUTING.md) guide. Pull Request resolved: https://github.com/pytorch/pytorch/pull/67901 Reviewed By: cbalioglu, mruberry Differential Revision: D32225172 Pulled By: rohan-varma fbshipit-source-id: 7e8d4c7a41858044bd2a4e0d1f0bf8f1ac671d67	2021-11-11 06:11:00 -08:00
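The per-test subprocess behaviour described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `run_in_subprocess` and its parameters are hypothetical names, and only the shape of the failure message is taken from the commit message.

```python
import os
import shlex
import subprocess
import sys


def run_in_subprocess(cmd, env_overrides=None):
    """Run one test command in a fresh process (hypothetical helper).

    Returns (exit_code, repro_command). On a non-zero exit code, prints
    the command needed to reproduce the failure, including any
    environment overrides such as BACKEND or WORLD_SIZE.
    """
    env_overrides = env_overrides or {}
    env = dict(os.environ)
    env.update(env_overrides)
    completed = subprocess.run(cmd, env=env)

    # Build a copy-pasteable shell command: VAR=value ... program args
    prefix = " ".join(f"{k}={shlex.quote(v)}" for k, v in env_overrides.items())
    repro = (prefix + " " if prefix else "") + shlex.join(cmd)

    if completed.returncode != 0:
        print(
            f"Test exited with non-zero exitcode {completed.returncode}. "
            f"Command to reproduce: {repro}"
        )
    return completed.returncode, repro
```

Calling it with `[sys.executable, "some_test.py", "-v", "SomeTest.test_case"]` and `{"BACKEND": "gloo"}` would run that single test in isolation and, on failure, print a one-line reproduction command.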
Brian Hirsh	7c90bd77ec	Test functionalization pass in python (#66101 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66101 Updated description: This PR tests the functionalization pass in python in two ways. For each of the test programs that I have in `test_functionalization.py`, it: - runs the program with and without functionalization, and asserts the outputs and (potentially mutated) inputs are equal in both cases - runs the program with `LoggingTensor`, and uses expecttests on the resulting graph. I manually confirm that the graphs look reasonable and only contain functional ops. Mechanically, the changes include: - factoring out `LoggingTensor` into a testing util so it can be re-used in multiple tests - adding some private python api's in the `torch` namespace as hooks that I can use during testing In the original version of this PR, I also added some fixes to the `_make_subclass()` function in python: allowing you to pass in strides and storage_offset. I kept them in mainly because the changes were already there. Test Plan: Imported from OSS Reviewed By: zou3519 Differential Revision: D31942095 Pulled By: bdhirsh fbshipit-source-id: 90ff4c88d461089704922e779571eee09c21d707	2021-11-09 14:34:05 -08:00
Junjie Wang	2766662ca9	[PyTorch][2/N] Basic implementation of ShardedEmbeddingBag using ShardedTensor. (#67188 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67188 This diff/PR implements the ShardedEmbeddingBag using the ShardedTensor. We support both row-wise and column-wise sharding of the embedding bag. The detailed logic can be found in the comment. Several caveats: 1. Only the sharding of one weight is supported now. 2. We support limited input params for the op. Support for more params is on the way. 3. We only support chunk sharding for now. 4. We only support a single local shard per rank for now. Some other changes include: 1. Refactor the ShardedEmbedding code so that the common logic can be reused. 2. Fix tiny typos and corner cases in API `get_chunked_dim_size`, which would return -1 if we set dim_size = 5, split_size = 2, idx = 3. (This is a valid case because when chunks = 4, dim_size = 5, then the split_size = 2) ghstack-source-id: 142325915 Test Plan: Unit test and CI Reviewed By: pritamdamania87 Differential Revision: D31749458 fbshipit-source-id: ed77e05e4ec94ef1a01b1feda8bbf32dc5d5da1b	2021-11-03 17:39:18 -07:00
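The `get_chunked_dim_size` corner case mentioned above can be worked through with a small sketch. The function name comes from the commit message; the body below is an assumed clamped formulation written for illustration, not necessarily the exact code in the PR, and `split_size_for` is a hypothetical helper.

```python
def split_size_for(dim_size: int, chunks: int) -> int:
    """Ceiling division: the per-chunk size when splitting dim_size into chunks."""
    return -(-dim_size // chunks)


def get_chunked_dim_size(dim_size: int, split_size: int, idx: int) -> int:
    """Size of chunk `idx` along a dimension of length `dim_size`.

    Without the outer max(..., 0), a chunk that starts past the end of
    the dimension yields a negative size: for dim_size=5, split_size=2,
    idx=3 the unclamped value is min(5, 8) - 6 = -1.
    """
    return max(min(dim_size, split_size * (idx + 1)) - split_size * idx, 0)


split = split_size_for(5, 4)  # chunks = 4, dim_size = 5
sizes = [get_chunked_dim_size(5, split, i) for i in range(4)]
print(split, sizes)
```

With `dim_size = 5` and `chunks = 4`, the last chunk is empty rather than negative, and the chunk sizes still sum to the full dimension.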
Bo Wang	b6df043f1f	Add torch.nn.init.uniform_ operator to ShardedTensor. (#63997 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63997 Use torch_function to extend torch.nn.init.uniform_ The init is done in SPMD fashion. Note that ideally we want to aggregate sharded tensors into a global tensor, init it and reshard. It's fine to run it SPMD since uniform samples are i.i.d. (independent and identically distributed). Also enable unit test for test_linear.py for OSS test Test Plan: a) Unit Test (pytorch) ... $ python test/distributed/_sharded_tensor/ops/test_init.py TestShardedTensorNNInit --v (pytorch) ... $ python test/distributed/_sharded_tensor/ops/test_linear.py --v (before this change, this command was a no-op) or b) Manual run: Instruction here: https://docs.google.com/document/d/1_m1Hdo5w51-hhPlZ_F8Y6PIWrN7UgJZqiSpARYvhsaE/edit# Imported from OSS Reviewed By: pritamdamania87, anjali411 Differential Revision: D30563017 fbshipit-source-id: d1859f7682235bcb44515efc69ca92bc5e34fce1	2021-10-21 00:17:13 -07:00
Junjie Wang	08cb31a03e	[PyTorch][1/N] Basic implementation of ShardedEmbedding using ShardedTensor. (#66604 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66604 This diff/PR implements the ShardedEmbedding using the ShardedTensor. Several caveats: 1. We support limited input params for the op. Support for more params is on the way. 2. We only support chunk sharding for now. 3. We only support a single local shard per rank for now. ghstack-source-id: 141056130 Test Plan: Unit test and CI Reviewed By: pritamdamania87 Differential Revision: D31544556 fbshipit-source-id: cc867dcba8c11e6f4c7c3722488908f5108cc67f	2021-10-20 15:16:49 -07:00
Yanli Zhao	61fca037d6	[Part 1] upstreaming fairscale fsdp to PyTorch -- sharding, core data flow and hooks (#63881 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63881 This PR includes the minimal sets of features to make FSDP work, like sharding, core data flow and hooks. More tests will be added in the follow up PRs. Tests are refactored to utilize common PyTorch utils. Codes are also refactored a little bit. Alternative ways to replace ".data" usage in this PR are still being discussed offline. Test Plan: unit tests Reviewed By: mrshenli Differential Revision: D30521673 fbshipit-source-id: 9a23390dd7c925749604c6860e08fbe39ddc5500	2021-10-07 09:06:44 -07:00
Pritam Damania	0dc98728bc	Basic implementation of ShardedLinear using ShardedTensor. (#64128 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64128 This PR implements a sharded nn.Linear layer using ShardedTensors with the following limitations: 1) Works only for ChunkShardingSpec. 2) Implementation is only aimed to demonstrate functionality and is most likely not performant at all. The PR also introduces a `shard_parameter` API to easily shard parameters of `nn.Modules`. This also has the following limitations: 1) Works only for ChunkShardingSpec. 2) Is not performant since it uses broadcast instead of scatter since ProcessGroupNCCL doesn't yet support scatter. Overall user API for running a sharded linear would be something like this: ``` # SPMD programming paradigm running same code on all nodes. fc = nn.Linear(10, 10) # Setup sharding. sharding_spec=ChunkShardingSpec(...) shard_parameter(fc, 'weight', sharding_spec, src_rank=0) # Run as a normal linear layer. inp = torch.rand(10, 10) output = fc(inp) ``` ghstack-source-id: 138500985 Test Plan: 1) unit tests. 2) waitforbuildbot Reviewed By: wanchaol, bowangbj Differential Revision: D30621215 fbshipit-source-id: 1aa7478568c18a4572f6c3462fdf24a4cbde01d6	2021-09-20 18:31:11 -07:00
Nikita Shulga	01cfea9485	Disable target determination for now (#64921 ) Summary: There were several reports of target determinator incorrectly skipping tests, most recent one is https://github.com/pytorch/pytorch/issues/64902 Let's disable it until it could be further stabilized Pull Request resolved: https://github.com/pytorch/pytorch/pull/64921 Reviewed By: seemethere, janeyx99 Differential Revision: D30901186 Pulled By: malfet fbshipit-source-id: 531afd2d390c6b51f727330d5dd1882d70b6fdde	2021-09-14 09:40:13 -07:00
Rohan Varma	d067f15622	[Dist CI] Move rest of distributed tests to their own CI job (#64253 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64253 Follow up to D30496178 (`f4aff3a346`) to move the rest of distributed tests to their own jobs for Linux GHA. ghstack-source-id: 137233785 Test Plan: CI Reviewed By: walterddr Differential Revision: D30662999 fbshipit-source-id: f7cfbc0d1223aca52120f17f9da987d70fda8de6	2021-09-01 21:43:41 -07:00
Nikita Shulga	c2da103fe6	Discover new tests in run_tests.py (#64246 ) Summary: Introduce `discover_tests` function that globs for all Python files starting with `test_` in test folder excluding subfolders which are executed differently Fixes https://github.com/pytorch/pytorch/issues/64178 Pull Request resolved: https://github.com/pytorch/pytorch/pull/64246 Reviewed By: walterddr, seemethere Differential Revision: D30661652 Pulled By: malfet fbshipit-source-id: a52e78ec717b6846add267579dd8d9ae75326bf9	2021-08-31 17:32:55 -07:00
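A rough sketch of the test discovery described in the commit above: glob for Python files whose names start with `test_`, skipping subfolders that are run differently. The function name `discover_tests` is from the commit message; the parameters, the recursion into subfolders, and the return format are assumptions made for illustration.

```python
from pathlib import Path


def discover_tests(base_dir, blocklisted_dirs=()):
    """Return sorted test names (relative paths without .py) under base_dir.

    `blocklisted_dirs` is a hypothetical parameter naming top-level
    subfolders to skip because they are executed by other means.
    """
    base = Path(base_dir)
    found = []
    for path in base.rglob("test_*.py"):
        rel = path.relative_to(base)
        # Only files inside a subfolder can be excluded by folder name.
        if len(rel.parts) > 1 and rel.parts[0] in blocklisted_dirs:
            continue
        found.append(rel.with_suffix("").as_posix())
    return sorted(found)
```

Because the list is derived from the file system, a newly added `test_*.py` file is picked up without anyone having to register it by hand.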
Richard Zou	0457a85d45	Revert D30543236: Add python mode Test Plan: revert-hammer Differential Revision: D30543236 (`4bd03b0242`) Original commit changeset: ef5444d96a5a fbshipit-source-id: b0042ac2c22765fa11d6d00bf751f6a4489eb6d8	2021-08-31 15:28:33 -07:00
Rohan Varma	1c2b5e59ae	Remove ref to test_distributed_fork (#64197 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64197 Removes this line as test is gone. ghstack-source-id: 136986275 Test Plan: CI Reviewed By: walterddr Differential Revision: D30642929 fbshipit-source-id: a0c7dfdfb35a4a7f7ec1b881dbea53d85136012c	2021-08-31 13:31:27 -07:00
leslie-fang-intel	09dfaa0339	add operation list for AutocastCPU (#63534 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63534 In this PR: * We have changed the default dtype of `AutocastCPU` from `float16` to `bfloat16` as discussed here `https://github.com/pytorch/pytorch/pull/61002` * We also update the operation list which needs casting to `lower_precision_fp` or `float32`. Test Plan: Imported from OSS Reviewed By: zou3519 Differential Revision: D30644914 Pulled By: ezyang fbshipit-source-id: 8b93485ba452b3759611e3f0ac88e920fe495ac1	2021-08-30 19:30:33 -07:00
Richard Zou	4bd03b0242	Add python mode (#63496 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63496 This PR adds a (private) enable_python_mode context manager. (see torch/utils/_python_dispatch.py). enable_python_mode accepts the type of a __torch_dispatch__ object as its argument. Whenever an operator gets called inside of the context manager, it dispatches to the __torch_dispatch__ of the passed-in type. Example usage: ``` with enable_python_mode(LoggingTensor): z = torch.empty([]) assert isinstance(z, LoggingTensor) ``` There are quite a few changes that were made to support this. First, we added TorchDispatchTypeObject, a C++ struct that represents the type of a `__torch_dispatch__` object (e.g. LoggingTensor). It holds both the PyObject* representing the class and a PyInterpreter* so we know which Python interpreter it came from. Next, we updated the concrete_dispatch_fn in python_variable.cpp to accept a `const std::shared_ptr<TorchDispatchTypeObject>&` argument. When this is null, dispatching happens as usual. When it is non-null, we prepend the TorchDispatchTypeObject's PyObject* to the overloaded args list so that it is considered first for dispatch. To get that to work, we changed how `handle_torch_dispatch_no_python_arg_parser` works. The "overloaded args list" previously only consisted of Tensor PyObjects, but now it can have types in addition to Tensors! - We renamed `append_overloaded_arg` to `append_overloaded_tensor` - We added a new `append_overloaded_type` that appends a type to overloaded_args - We added special handling in `handle_torch_dispatch_no_python_arg_parser` and `append_overloaded_arg` to handle types in addition to Tensors. Then, there is PythonMode and PythonModeTLS. - We reuse the DispatchKey::Python dispatch key as a mode key - We use PythonMode::enter and PythonMode::exit to enable/disable DispatchKey::Python and set the PythonModeTLS. - PythonModeTLS stores a TorchDispatchTypeObject as metadata.
- PythonMode is in libtorch_python, and PythonModeTLS is in ATen. This split is due to the libtorch_python library boundary (because we need to save TLS in ATen/ThreadLocalState) - We modify the PythonFallbackKernel to look up the relevant TorchDispatchTypeObject (if Python Mode is active) and dispatch using it. There are two more miscellaneous changes: - internal_new_from_data (torch/csrc/utils/tensor_new.cpp) gets an exclude guard. enable_python_mode currently does not handle torch.tensor and the exclude guard is to prevent a bug. Future: - This PR does not allow for the nesting of Python modes. In the future we should be able to enable this with a more sane no_dispatch API and by changing the TLS to a stack. For now I did not need this for CompositeImplicitAutograd testing. Test Plan: - new tests Reviewed By: malfet, albanD Differential Revision: D30543236 Pulled By: zou3519 fbshipit-source-id: ef5444d96a5a957d1657b7e37dce80f9a497d452	2021-08-30 18:44:35 -07:00
Jane Xu	1354ee417a	run_test.py: add option to run only core tests (#63976 ) Summary: This is in response to a feature request from some folks in the core team to have a local command that would only run relevant "core" tests. The idea is to have a local smoke test option for developers to run locally before making a PR in order to verify their changes did not break core functionality. These smoke tests are not targeted to be short but rather relevant. This PR enables that by allowing developers to run `python test/run_test.py --core` or `python test/run_test.py -core` in order to run the CORE_TEST_LIST, which is currently test_nn.py, test_torch.py, and test_ops.py. I am not the best person to judge what should be considered "core", so please comment which tests should be included and/or excluded from the CORE_TEST_LIST! Pull Request resolved: https://github.com/pytorch/pytorch/pull/63976 Test Plan: ``` (pytorch) janeyx@janeyx-mbp test % python run_test.py --core -v Selected tests: test_nn, test_ops, test_torch Running test_nn ... [2021-08-25 14:48:28.865078] Executing ['/Users/janeyx/miniconda3/envs/pytorch/bin/python', 'test_nn.py', '-v'] ... [2021-08-25 14:48:28.865123] test_to (__main__.PackedSequenceTest) ... ok test_to_memory_format (__main__.PackedSequenceTest) ... ok ``` Reviewed By: walterddr Differential Revision: D30575560 Pulled By: janeyx99 fbshipit-source-id: 3f151982c1e315e50e60cb0d818adaea34556a04	2021-08-26 09:29:57 -07:00
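The `--core` / `-core` option described above can be sketched with `argparse`. `CORE_TEST_LIST` and its three entries come from the commit message; `ALL_TESTS`, the extra test names in it, and `select_tests` are hypothetical stand-ins, not the actual contents of `run_test.py`.

```python
import argparse

# From the commit message: the current core tests.
CORE_TEST_LIST = ["test_nn", "test_ops", "test_torch"]

# Hypothetical full list, for illustration only.
ALL_TESTS = ["test_autograd", "test_jit", "test_nn", "test_ops", "test_torch"]


def select_tests(argv):
    """Return the tests to run for the given command-line arguments."""
    parser = argparse.ArgumentParser(description="Run a subset of the test suite")
    parser.add_argument(
        "-core",
        "--core",
        action="store_true",
        help="only run the tests in CORE_TEST_LIST",
    )
    options = parser.parse_args(argv)
    if options.core:
        return [t for t in ALL_TESTS if t in CORE_TEST_LIST]
    return list(ALL_TESTS)


print("Selected tests:", ", ".join(select_tests(["--core"])))
```

Registering both spellings on one `add_argument` call keeps them in sync, since they share a single destination attribute.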
driazati	ab5cf5a1eb	Move existing target determinator to tools (#63809 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63809 This moves out the modulefinder determinator to `tools/testing` since it is supposed to be CI-only. This also simplifies run_test.py a little bit. Test Plan: Imported from OSS Reviewed By: malfet, seemethere, janeyx99 Differential Revision: D30497438 Pulled By: driazati fbshipit-source-id: 1d203037af5af6a20c1e7812da935e7cbb5cd82f	2021-08-25 13:03:53 -07:00
driazati	67d8e7b659	Reformat run_test.py (#63808 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63808 `black run_test.py` Test Plan: Imported from OSS Reviewed By: seemethere Differential Revision: D30497437 Pulled By: driazati fbshipit-source-id: 41b29b73f41fa4bb15fce5eaa69f8efe614e02f7	2021-08-25 11:27:18 -07:00
Rong Rong (AI Infra)	f4aff3a346	[BE] add distributed run_test options (#63147 ) Summary: Currently distributed tests are mixed within test_python. We would like to split the distributed tests into its own batch thus we need to split them out. Adding an option to include/exclude distributed tests with CUSTOM_HANDLERS. Pull Request resolved: https://github.com/pytorch/pytorch/pull/63147 Test Plan: - locally run with the addition run_test.py options. - CI Dependency: found a bug in mpiexec test and need https://github.com/pytorch/pytorch/issues/63580 to fix it first. Reviewed By: bdhirsh Differential Revision: D30496178 Pulled By: walterddr fbshipit-source-id: 7903a57b619f2425028028f944211938823918a6	2021-08-24 08:03:01 -07:00
Pritam Damania	2d671ca41b	[8/N] Remove c10d/ddp fork tests. (#63454 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63454 Continuation of https://github.com/pytorch/pytorch/pull/63443, this PR removes all fork tests from torch.distributed. ghstack-source-id: 136285511 Test Plan: waitforbuildbot Reviewed By: SciPioneer Differential Revision: D30387872 fbshipit-source-id: f6d6313db126ae7b95b86f78a1e0726887c5c513	2021-08-20 12:23:18 -07:00
Jeff Daily	be9be9bfdd	add distributed/_sharded_tensor/test_sharded_tensor to ROCM_BLOCKLIST (#63508 ) Summary: Fixes current ROCm CI test2 brokenness until tensorpipe is fully supported by ROCm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/63508 Reviewed By: ejguan Differential Revision: D30406450 Pulled By: walterddr fbshipit-source-id: c07509271d5d33901f3eaf7ffb916dc3626e1f9a	2021-08-19 07:50:55 -07:00
Eli Uriegas	4982fc4ecf	test: Add ability to set CONTINUE_THROUGH_ERROR (#63357 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63357 Adds the ability to set CONTINUE_THROUGH_ERROR as an environment variable so that we can easily set it without having to add the flag directly Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Test Plan: Imported from OSS Reviewed By: astaff Differential Revision: D30351108 Pulled By: seemethere fbshipit-source-id: 767fa9bd24e1399f359eb24d16f6cc985a2d7173	2021-08-16 15:35:40 -07:00
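One way to let an environment variable stand in for a command-line flag, as the commit above describes for `CONTINUE_THROUGH_ERROR`, is to read the variable when computing the flag's default. The sketch below is an assumption about the general pattern: the flag spelling `--continue-through-error`, the accepted truthy values, and the `parse_options` helper are illustrative, not confirmed details of `run_test.py`.

```python
import argparse
import os

TRUTHY = {"1", "true", "yes", "on"}  # assumed set of accepted values


def parse_options(argv, environ=None):
    """Parse options, taking the flag's default from the environment."""
    environ = os.environ if environ is None else environ
    env_default = environ.get("CONTINUE_THROUGH_ERROR", "").strip().lower() in TRUTHY

    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--continue-through-error",
        action="store_true",
        default=env_default,
        help="keep running remaining tests after a failure",
    )
    return parser.parse_args(argv)


options = parse_options([], environ={"CONTINUE_THROUGH_ERROR": "1"})
print(options.continue_through_error)
```

Passing the flag explicitly still works, so CI can set the variable once while local runs keep using the command line.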
Shen Li	1022443168	Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: revert-hammer Differential Revision: D30279364 (`b004307252`) Original commit changeset: c1ed77dfe43a fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e	2021-08-12 11:45:01 -07:00
Zsolt Dollenstein	b004307252	[codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: manual inspection & sandcastle Reviewed By: zertosh Differential Revision: D30279364 fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a	2021-08-12 10:58:35 -07:00
Pritam Damania	91525d42d9	Fix sharded tensor tests. (#63054 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63054 1) Ensure these tests are skipped in environments without any GPUs. 2) Add the test to run_test.py ghstack-source-id: 135595698 Test Plan: waitforbuildbot Reviewed By: wanchaol Differential Revision: D30239159 fbshipit-source-id: 21b543ba72e8d10182bc77e7ae1fd34fd4096509	2021-08-11 21:46:45 -07:00
Rohan Varma	39ec1da935	[reland] Gate DistributedOptimizers on RPC availability (#62937 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62937 reland due to windows + cuda failure, fix by running it on gloo on windows even with cuda. ghstack-source-id: 135306176 Test Plan: ci Reviewed By: mrshenli Differential Revision: D30177734 fbshipit-source-id: 7625746984c8f858648c1b3632394b98bd4518d2	2021-08-09 14:41:06 -07:00
Natalia Gimelshein	b45cf9b81b	Revert D30117838: [WIP] Gate DistributedOptimizers on RPC availability Test Plan: revert-hammer Differential Revision: D30117838 (`3f09485d7e`) Original commit changeset: e6365a910a3d fbshipit-source-id: f276b2b2bdf5f7bd27df473fca0eebaee9f7aef2	2021-08-06 22:10:41 -07:00
Rohan Varma	3f09485d7e	[WIP] Gate DistributedOptimizers on RPC availability (#62774 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62774 Gates DistributedOptimizer, which relies on RRef, based on whether RPC is available. This should enable ZeRO to work with Windows, as Windows should not try to import the DistributedOptimizer. If this works as expected we can enable the windows tests for functional/local sgd optimizers as well. ghstack-source-id: 135216642 Test Plan: CI Reviewed By: pbelevich Differential Revision: D30117838 fbshipit-source-id: e6365a910a3d1ca40d95fa6777a7019c561957db	2021-08-06 10:59:00 -07:00
Joel Schlosser	a0309f89f4	Initial ModuleInfo implementation (#61935 ) Summary: This PR contains the initial version of `ModuleInfo` for use in testing modules. The design philosophy taken here is to start small and simple and build out / refactor as needed when more test coverage or `ModuleInfo` entries are added. As such, it's not intended for general usage yet. The PR contains the following: * (new file) `torch/testing/_internal/common_modules.py` * `ModuleInfo` definition - metadata for each module to use in testing * `module_db` - the actual `ModuleInfo` database; currently contains entries for two modules * `ModuleInput` - analogous to `SampleInput` from OpInfo; contains `FunctionInput`s for both constructor and forward pass inputs * Constructor and forward pass inputs are tied together within a `ModuleInput` because they are likely correlated * `FunctionInput` - just contains args and kwargs to pass to a function (is there a nicer way to do this?) * `modules` decorator - analogous to `ops`; specifies a set of modules to run a test over * Some constants used to keep track of all modules under torch.nn: * `MODULE_NAMESPACES` - list of all namespaces containing modules * `MODULE_CLASSES` - list of all module class objects * `MODULE_CLASS_NAMES` - dict from module class object to nice name (e.g. torch.nn.Linear -> "nn.Linear") * (new file) `test/test_modules.py` * Uses the above to define tests over modules * Currently, there is one test for demonstration, `test_forward`, which instantiates a module, runs its forward pass, and compares it to a reference, if one is defined Pull Request resolved: https://github.com/pytorch/pytorch/pull/61935 Reviewed By: mruberry Differential Revision: D29881832 Pulled By: jbschlosser fbshipit-source-id: cc05c7d85f190a3aa42d55d4c8b01847d1efd57f	2021-07-27 07:42:07 -07:00
Rohan Varma	69adb21940	Parity tests for functional optimizer step_param (#61756 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61756 DDP will support running optimizer as communication hook with optimizers that support a per-parameter/gradient step function `step_param`. Add parity tests as we implement more optimizers that support step_param to ensure parity with regular optimizers. ghstack-source-id: 134330378 Test Plan: Ci Reviewed By: SciPioneer Differential Revision: D29727549 fbshipit-source-id: 18977c896f12b8e478298488b298fd107affcf5f	2021-07-26 19:03:22 -07:00
Yukio Siraichi	5224490ae9	Implement NumPy-like `frombuffer` tensor constructor. (#59077 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59077 Fixes #58549 `from_buffer` constructs a tensor object from an already allocated buffer through CPython's buffer protocol. Besides the standard `dtype`, `count`, and `offset` parameters, this function also accepts: - `device`: where the buffer lives - `requires_grad`: should autograd record operations on the new tensor A new test file _test_buffer_protocol.py_ was created. Currently, only CPU tests were implemented. That's because neither PyTorch nor Numba implements CPython's buffer protocol. Therefore, there's no way to create a CUDA buffer with the existing dependencies (could use PyCUDA for that, though). At the moment, if `device` differs from the device the buffer actually lives, two things may happen: - `RuntimeError`, if `device='cuda'` - Segmentation fault (not tested -- see above), if `device='cpu'` Test Plan: Imported from OSS Reviewed By: jbschlosser Differential Revision: D29870914 Pulled By: mruberry fbshipit-source-id: 9fa8611aeffedfe39c9af74558178157a11326bb	2021-07-23 13:17:48 -07:00
Andrew Gu	c2cc6a9396	Add generic join unit tests (#61786 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61786 This adds unit tests for the generic join context manager. ``` gpurun python test/distributed/algorithms/test_join.py ``` Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D29746646 Pulled By: andwgu fbshipit-source-id: 2933d85783c2225574c4b77bfb90064690c6e668	2021-07-20 12:13:05 -07:00
Rong Rong (AI Infra)	a5a10fe353	Move all downloading logic out of common_utils.py (#61479 ) Summary: and into tools/ folder Currently run_tests.py invokes tools/test_selections.py 1. download and analyze what test_file to run 2. download and parse S3 stats and pass the info to local files. 3. common_utils.py uses download S3 stats to determine what test cases to run. Pull Request resolved: https://github.com/pytorch/pytorch/pull/61479 Reviewed By: janeyx99 Differential Revision: D29661986 Pulled By: walterddr fbshipit-source-id: bebd8c474bcc2444e135bfd2fa4bdd1eefafe595	2021-07-12 11:23:22 -07:00
Rong Rong (AI Infra)	718db968b8	move CI related functions out of run_test.py (#61124 ) Summary: run_test.py currently does lots of downloading and test file/suite/case parsing. It doesn't work well outside of the CI environment Restructured the run_test.py and created tools/test/test_selections.py and move all test selection logic (reordering, categorizing slow test, creating shards) Follow up PRs should: - refactor those file read/write logic entangled inside test_selections.py into stats/ folder - restructure and add network independent test logics to test_test_selections.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/61124 Test Plan: - tools/test - CI Related PR: This follows the refactoring example in: https://github.com/pytorch/pytorch/issues/60373 Reviewed By: malfet Differential Revision: D29558981 Pulled By: walterddr fbshipit-source-id: 7f0fd9b4720a918d82918766c002295e8df04169	2021-07-06 09:06:42 -07:00
Zafar	509b1ef9d5	[sparsity] Add sparsity tests to run_test.py (#60887 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60887 Test Plan: ``` ./test/run_test.py -i test_ao_sparsity ``` Differential Revision: D29465834 Reviewed By: mruberry Pulled By: z-a-f fbshipit-source-id: 144f940363a20dd65c2bbfe70924c266d8791dc7	2021-07-02 11:11:20 -07:00
Sam Estep	d5a44f9f12	Use expecttest from PyPI (#60658 ) Summary: This PR removes `torch/testing/_internal/expecttest.py` in favor of https://github.com/ezyang/expecttest. See also https://github.com/ezyang/ghstack/pull/71. Pull Request resolved: https://github.com/pytorch/pytorch/pull/60658 Test Plan: CI. Reviewed By: ezyang Differential Revision: D29430763 Pulled By: samestep fbshipit-source-id: b7cdc7ba37330176149fd465312118e2254ae92e	2021-06-28 15:43:34 -07:00
Rong Rong (AI Infra)	7e619b9588	First step to rearrange files in tools folder (#60473 ) Summary: Changes include: - introduce `linter/`, `testing/`, `stats/` folders in `tools/` - move appropriate scripts into these folders - change grepped references in the pytorch/pytorch repo Next step: - introduce `build/` folder for build scripts Pull Request resolved: https://github.com/pytorch/pytorch/pull/60473 Test Plan: - CI (this is important because pytorch/test-infra also relies on some script references) - tools/tests/ Reviewed By: albanD Differential Revision: D29352716 Pulled By: walterddr fbshipit-source-id: bad40b5ce130b35dfd9e59b8af34f9025f3285fd	2021-06-24 10:13:58 -07:00
Rong Rong (AI Infra)	40d2fe1053	correct filename issue for test_cpp_extensions_aot (#60604 ) Summary: Use file copies to create actual ninja- and no_ninja-suffixed Python test files. This tricks xmlrunner into reporting test cases in the correct folder. Pull Request resolved: https://github.com/pytorch/pytorch/pull/60604 Test Plan: - CI reports correctly into the corresponding folders - With the downloaded test statistics, shard calculation no longer needs custom logic to handle `test_cpp_extensions_aot` CI results show it is working properly: https://github.com/pytorch/pytorch/pull/60604/checks?check_run_id=2900038654 vs https://github.com/pytorch/pytorch/pull/60604/checks?check_run_id=2900038673 Reviewed By: albanD Differential Revision: D29349562 Pulled By: walterddr fbshipit-source-id: e86e6bc0db288a2a57bea3c5f8edf03be1773944	2021-06-24 09:20:19 -07:00
Jane Xu	6385621003	Use JOB_BASE_NAME throughout code--consolidate CIRCLE_JOB (#60425 ) Summary: This PR is a first step in unifying our environment variables across CI (so that we don't have `CIRCLE_BLAH` in our GHA workflows, for example), though I'd like for this PR to be more for discussion about how best to consolidate these variables. This small change only changes most CIRCLE_JOB references in our code to be JOB_BASE_NAME, as that seems the closest GHA (and ROCm) equivalent. Currently, JOB_BASE_NAME is defined as: - in Circle: CIRCLE_JOB (name of the job, like `pytorch_linux_bionic_py3_8_gcc9_coverage_test1`) - in GHA: the build_environment with a `-build` or `-test` tacked to the end , e.g., `pytorch-linux-xenial-cuda10.2-cudnn7-py3.6-gcc7-test` - in ROCm: I don't actually know, but it's important for ROCm test sharding as shown in https://github.com/pytorch/pytorch/pull/60409 I am not sure if this is the intention for JOB_BASE_NAME so it is open to discussion what variable we should use if not JOB_BASE_NAME. I also don't know if it's worth the effort consolidating all these variables, so discussion is also highly encouraged there! Next steps: - Consolidate more CIRCLE_* references, maybe into CI_* equivalents? - We use BUILD_ENVIRONMENT everywhere in Circle though the variable is inconsistent across binary vs CI jobs and across platforms. For example, for linux tests and builds, BUILD_ENVIRONMENT contains the `_test` and `_build` suffixes, but the windows jobs don't. In GHA, BUILD_ENVIRONMENT is similar to how it's defined in windows jobs on Circle. This inconsistency is confusing, and we can probably do something about it. I'm thinking of switching out BUILD_ENVIRONMENT for JOB_BASE_NAME in our test scripts where we actually mean JOB_BASE_NAME. - We should probably document the meaning of the variables we consolidate somewhere, preferably in a README in some unified `ci/` folder. 
For example, it seems BUILD_ENVIRONMENT is supposed to capture the build environment, whereas JOB_BASE_NAME is supposed to capture the environment _and_ whether we're building or testing. Notes: - I did not replace CIRCLE_JOB references in third_party directories - Previously, print_test_stats reported CIRCLE_JOB as only the build environment for GHA workflows, and I think tacking on the `build` or `test` will not harm anything, though I may be wrong. Pull Request resolved: https://github.com/pytorch/pytorch/pull/60425 Reviewed By: seemethere, samestep Differential Revision: D29333882 Pulled By: janeyx99 fbshipit-source-id: a82080e6205a03a1183035011ce59698eca06748	2021-06-23 11:11:21 -07:00
Howard Huang	ff3678eec2	Disable group backend rpc tests from running on CI (#60407 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60407 Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D29278179 Pulled By: H-Huang fbshipit-source-id: ee78085eeb04d81842c95236b8c3a33de7142a3a	2021-06-23 10:58:31 -07:00
Jane Xu	c63a0d0cfe	Adding windows CUDA smoke tests on PRs (#59686 ) Summary: Adding windows CUDA smoke tests on PRs (master should run the full suite). Next step: - Automate data update so we get a new smoke test list without manual effort Pull Request resolved: https://github.com/pytorch/pytorch/pull/59686 Test Plan: https://github.com/pytorch/pytorch/actions/runs/958296267 The sharded smoke tests take long still because of dependencies installation Reviewed By: walterddr Differential Revision: D29243533 Pulled By: janeyx99 fbshipit-source-id: dde7ba127fa15c95bda0e833cc5311598fb85e2b	2021-06-23 10:13:50 -07:00
Jane Xu	462448f07a	Enable GHA sharding on linux (#60124 ) Summary: This is a branch off of https://github.com/pytorch/pytorch/issues/59970 to only shard on linux so far (we're running into issues with windows gflags). This would enable sharding of tests on a few Linux jobs on GHA, allowing tts to be essentially halved. Pull Request resolved: https://github.com/pytorch/pytorch/pull/60124 Reviewed By: zou3519 Differential Revision: D29204211 Pulled By: janeyx99 fbshipit-source-id: 1cc31d1eccd564d96e2aef14c0acae96a3f0fcd0	2021-06-17 13:00:23 -07:00
Rong Rong (AI Infra)	b2fc6de2c4	support parsing of PR stats in run_test.py (#60026 ) Summary: Currently S3 test stats don't support PR stats parsing. Changes to s3_stats_parser: 1. stats are uploaded to `test_times/{sha1}/{job}` and `pr_test_times/{pr}/{sha1}/{job}` separately, so we need parsing logic for both 2. need to attach time for PR stats parsing for ordering since PR commits can be force-pushed Changes to run_test.py 1. Reordering based on previous PR stats if available 2. Falling back to file change option if not enabled. Pull Request resolved: https://github.com/pytorch/pytorch/pull/60026 Test Plan: - CI. - local repro: please run: ``` CIRCLE_JOB="pytorch_linux_bionic_py3_6_clang9_noarch_test" CIRCLE_PR_NUMBER=60057 IN_CI=1 ENABLE_PR_HISTORY_REORDERING=1 python test/run_test.py ``` Reviewed By: samestep Differential Revision: D29164754 Pulled By: walterddr fbshipit-source-id: 206688e0fb0b78d1c9042c07243da1fbf88a924b	2021-06-16 13:32:31 -07:00
Jane Xu	d88fbf0fbc	fix minor typo in run_test.py (#60055 ) Summary: Fixes typo in run_test.py for option use_specified_test_cases_by Pull Request resolved: https://github.com/pytorch/pytorch/pull/60055 Reviewed By: walterddr Differential Revision: D29150156 Pulled By: janeyx99 fbshipit-source-id: 375e594d09c83188bfa80762c8b833a0b7c5cca4	2021-06-16 09:30:45 -07:00
Rohan Varma	c2098487e8	[c10d] Move pg wrapper tests to their own file. (#59840 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59840 moving these tests to their own standalone file. No meaningful code changes. ghstack-source-id: 131359162 Test Plan: CI Reviewed By: cbalioglu Differential Revision: D29012664 fbshipit-source-id: 348870016509a6ed7e69240fa82bccef4a12d674	2021-06-14 15:05:55 -07:00
Rong Rong	e41bc31eb2	make --run-specified-test-case use --include (#59704 ) Summary: instead of having specific logic to handle run-specific-test-case, we provide the flag to override include or bring-to-front with the SPECIFIED_TEST_CASES_FILE. Pull Request resolved: https://github.com/pytorch/pytorch/pull/59704 Reviewed By: janeyx99 Differential Revision: D29038425 Pulled By: walterddr fbshipit-source-id: 803d3555813437c7f287a22f7704106b0c609919	2021-06-11 13:57:13 -07:00
Jane Xu	9bb5663979	Use commit stats from viable/strict instead of nightlies for sharding (#59727 ) Summary: Currently, not all of CI runs on nightlies, so it's better to use viable/strict. For example, current 11.1 test jobs do not get to use automatic sharding because of the lack of stats: https://app.circleci.com/jobs/github/pytorch/pytorch/14010983?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link Pull Request resolved: https://github.com/pytorch/pytorch/pull/59727 Reviewed By: heitorschueroff Differential Revision: D29004910 Pulled By: janeyx99 fbshipit-source-id: eb0c54a7e7947decba8134a1d67e4b0434151a06	2021-06-09 13:52:15 -07:00
Jane Xu	97dfc7e300	[Reland] Adding run specified tests option to run_test.py (#59649 ) Summary: Reland of https://github.com/pytorch/pytorch/issues/59487 Pull Request resolved: https://github.com/pytorch/pytorch/pull/59649 Reviewed By: samestep Differential Revision: D28970751 Pulled By: janeyx99 fbshipit-source-id: 6e28d4dcfdab8a49da4b6a02c57516b08bacd7b5	2021-06-08 16:04:46 -07:00
Alban Desmaison	5d6a10a765	Revert D28913223: [pytorch][PR] Adding run-specified-test-cases option in run_test.py Test Plan: revert-hammer Differential Revision: D28913223 (`24432eaa29`) Original commit changeset: 0d1f99109734 fbshipit-source-id: 47c073720cff23a5d4cb64556381c46025e90937	2021-06-08 02:18:16 -07:00
Rong Rong (AI Infra)	57d8bccd00	only reorder tests based on git diff if IN_CI (#59565 ) Summary: Do not reorder tests unless IN_CI is set; reordering makes local development test ordering nondeterministic, since most of us branch out from viable/strict, not the head of master. Pull Request resolved: https://github.com/pytorch/pytorch/pull/59565 Reviewed By: ejguan Differential Revision: D28943906 Pulled By: walterddr fbshipit-source-id: e742e7ce4b3fc017d7563b01e93c4cd774d0a537	2021-06-07 17:54:19 -07:00
Jane Xu	24432eaa29	Adding run-specified-test-cases option in run_test.py (#59487 ) Summary: The run-specified-test-cases option would allow us to specify a list of test cases to run by having a CSV with minimally two columns: test_filename and test_case_name. This PR also adds .json to some files we use for better clarity. Usage: `python test/run_test.py --run-specified-test-cases <csv_file>` where the csv file can look like: ``` test_filename,test_case_name,test_total_time,windows_only_failure_sha_count,total_sha_count,windows_failure_count,linux_failure_count,windows_total_count,linux_total_count test_cuda,test_cudnn_multiple_threads_same_device,8068.8409659525,46,3768,53,0,2181,6750 test_utils,test_load_standalone,8308.8062920459,14,4630,65,0,2718,8729 test_ops,test_forward_mode_AD_acosh_cuda_complex128,91.652619369806,11,1971,26,1,1197,3825 test_ops,test_forward_mode_AD_acos_cuda_complex128,91.825633094915,11,1971,26,1,1197,3825 test_profiler,test_source,60.93786725749,9,4656,21,3,2742,8805 test_profiler,test_profiler_tracing,203.09352795241,9,4662,21,3,2737,8807 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/59487 Test Plan: Without specifying the option, everything should be as they were before. Running `python test/run_test.py --run-specified-test-cases windows_smoke_tests.csv` resulted in this paste P420276949 (you can see internally). A snippet looks like: ``` (pytorch) janeyx@janeyx-mbp pytorch % python test/run_test.py --run-specified-test-cases windows_smoke_tests.csv Loading specified test cases to run from windows_smoke_tests.csv. Processed 28 test cases. Running test_cpp_extensions_jit ... [2021-06-04 17:24:41.213644] Executing ['/Users/janeyx/miniconda3/envs/pytorch/bin/python', 'test_cpp_extensions_jit.py', '-k', 'test_jit_cuda_archflags'] ... [2021-06-04 17:24:41.213781] s ---------------------------------------------------------------------- Ran 1 test in 0.000s OK (skipped=1) ... 
``` With pytest, an example executable would be: `Running test_dataloader ... [2021-06-04 17:37:57.643039] Executing ['/Users/janeyx/miniconda3/envs/pytorch/bin/python', '-m', 'pytest', 'test_dataloader.py', '-v', '-k', 'test_segfault or test_timeout'] ... [2021-06-04 17:37:57.643327]` Reviewed By: samestep Differential Revision: D28913223 Pulled By: janeyx99 fbshipit-source-id: 0d1f9910973426b8756815c697b483160517b127	2021-06-07 16:27:43 -07:00
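The CSV-driven selection described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code in `run_test.py`: the helper names `load_specified_test_cases` and `build_k_expression` are hypothetical, only the two required columns are read, and the `or`-joined expression simply mirrors the `-k 'test_segfault or test_timeout'` invocation quoted in the log.

```python
import csv
import io
from collections import defaultdict


def load_specified_test_cases(csv_text):
    """Group test case names by test file (hypothetical helper).

    Only the two columns the commit message calls mandatory are used:
    test_filename and test_case_name. Extra columns are ignored.
    """
    cases = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        cases[row["test_filename"]].append(row["test_case_name"])
    return dict(cases)


def build_k_expression(case_names):
    """Join case names into a single `-k` filter expression."""
    return " or ".join(case_names)


# Illustrative input with just the two required columns.
SAMPLE = """test_filename,test_case_name
test_dataloader,test_segfault
test_dataloader,test_timeout
test_profiler,test_source
"""

specified = load_specified_test_cases(SAMPLE)
for test_file, names in specified.items():
    print(test_file, "-k", build_k_expression(names))
```

Grouping by file first means each test file is launched once with a combined filter rather than once per test case.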
Jane Xu	caf76c2445	Move sharding to after all tests have been excluded (#59583 ) Summary: It would be most accurate if sharding occurred after all other changes to selected_tests were complete. Pull Request resolved: https://github.com/pytorch/pytorch/pull/59583 Reviewed By: ejguan Differential Revision: D28944737 Pulled By: janeyx99 fbshipit-source-id: a851473948a5ec942ffeeedeefdc645536a3d9f7	2021-06-07 15:04:36 -07:00
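Why the ordering matters can be seen in a small sketch. The greedy balancer below is an assumption for illustration (the log does not show the real sharding algorithm), and `shard_tests`, the test names, and the timings are all hypothetical. The point is only that exclusions are applied before the shards are computed, so no shard is sized around a test that will never run.

```python
def shard_tests(tests, test_times, num_shards):
    """Greedy balancing: longest tests first, each into the lightest shard."""
    totals = [0.0] * num_shards
    shards = [[] for _ in range(num_shards)]
    for test in sorted(tests, key=lambda t: test_times.get(t, 0.0), reverse=True):
        lightest = totals.index(min(totals))
        shards[lightest].append(test)
        totals[lightest] += test_times.get(test, 0.0)
    return shards


# Hypothetical timings in seconds.
times = {"test_a": 10.0, "test_b": 8.0, "test_c": 5.0, "test_d": 3.0}
excluded = {"test_b"}

# Apply every exclusion first, then shard what is left.
selected = [t for t in times if t not in excluded]
shards = shard_tests(selected, times, num_shards=2)
print(shards)
```

If sharding ran before the exclusion, `test_b` would still occupy a slot in one shard and the remaining work would be split unevenly.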
Mike Ruberry	de40c8e495	Adds remaining OpInfos and removes redundant test generators (#55558 ) Summary: Per title. Pull Request resolved: https://github.com/pytorch/pytorch/pull/55558 Reviewed By: ngimel Differential Revision: D28922522 Pulled By: mruberry fbshipit-source-id: 89cefd93788bc8aa0683f4583cf5caa81aa2dc93	2021-06-06 14:52:26 -07:00
Andrew Gu	2ad4b8e58c	Extract c10d Store tests to dedicated test file (#59271 ) Summary: Partially addresses https://github.com/pytorch/pytorch/issues/55340 Overview This factors out `FileStoreTest`, `HashStoreTest`, `PrefixFileStoreTest`, `TCPStoreTest`, `PrefixTCPStoreTest`, `PythonStoreTest`, `RendezvousTest`, `RendezvousEnvTest`, `RendezvousFileTest`, and `RendezvousTCPTest` from `test_c10d_common.py` to a new file `test_store.py`. Additionally, unused import/initialization statements are removed from `test_c10d_common.py`, and the minimal set of import/initialization statements are used for `test_store.py`. Also, this changes `.jenkins/pytorch/multigpu-test.sh`, `.jenkins/pytorch/win-test-helpers/test_distributed.bat`, and `test/run_test.py` to include the new `test_store.py`. Testing All commands shown are run on an AI AWS cluster. I check the Store tests: ``` python test/distributed/test_store.py ``` I also check `test_c10d_common.py` since it is the source of the refactored code. In addition, I check `test_c10d_nccl.py` and `test_c10d_gloo.py` since they import from `test_c10d_common.py`; those two should be the only test files depending on `test_c10d_common.py`. ``` python test/distributed/test_c10d_common.py python test/distributed/test_c10d_nccl.py python test/distributed/test_c10d_gloo.py ``` `test_c10d_gloo.py` produces warnings about how using sparse tensors in TorchScript is experimental, but the warnings do not result from this PR's changes. 
Testing Issues (To Be Revisited) ``` WORLD_SIZE=4 BACKEND=gloo gpurun pytest test/distributed/test_c10d_gloo.py ``` Running the above command fails three tests (written as `[Test]`: `[Error]`): - `ProcessGroupGlooWrapperTest.test_collective_hang`: `RuntimeError: [../third_party/gloo/gloo/transport/tcp/pair.cc:598] Connection closed by peer [10.200.24.101]:15580` - `CommTest.test_broadcast_coalesced_gloo_cuda`: `RuntimeError: cuda runtime error (3) : initialization error at ../aten/src/THC/THCGeneral.cpp:54` - `CommTest.test_sequence_num_incremented_gloo_default`: `RuntimeError: cuda runtime error (3) : initialization error at ../aten/src/THC/THCGeneral.cpp:54` However, running each of the following yields no errors: ``` WORLD_SIZE=4 BACKEND=gloo gpurun pytest test/distributed/test_c10d_gloo.py -k test_collective_hang WORLD_SIZE=4 BACKEND=gloo gpurun pytest test/distributed/test_c10d_gloo.py -k test_broadcast_coalesced_gloo_cuda WORLD_SIZE=4 BACKEND=gloo gpurun pytest test/distributed/test_c10d_gloo.py -k test_sequence_num_incremented_gloo_default ``` This suggests the existence of some inadvertent state dependency between tests (e.g. improper cleanup). I have not explored this further yet. In particular, I do not have a solid understanding of the tests to be able to explain why using `pytest` and `gpurun` induces the failure (since notably, running the `.py` directly shows no issue). Similarly, running the following yields 47 errors: ``` WORLD_SIZE=4 BACKEND=nccl gpurun pytest test/distributed/test_c10d_nccl.py ``` The errors seem to all be simply complaining about the usage of `fork()` instead of `spawn()` for CUDA multiprocessing. Though, most of the tests in `test_c10d_nccl.py` ask for at least 2 CUDA devices, so I think that the `gpurun` is warranted (assuming that the test file does not need to be run partially on different machines). Both `test_c10d_common.py` and `test_store.py` work fine with `pytest`. 
Other Notes I noticed that `torch.distributed` is imported both as `dist` and as `c10d` and that `c10d` is used throughout the Store tests. I was curious if this is intentional (as opposed to using `dist` to refer to `torch.distributed`). Also, the original [issue](https://github.com/pytorch/pytorch/issues/55340) suggests that the Store tests do not use multiprocessing, but I saw that `torch.multiprocessing` is still used in `TCPStoreTest`. The links for the Store files in the `CONTRIBUTING.md` [file](https://github.com/pytorch/pytorch/blob/master/torch/distributed/CONTRIBUTING.md) are broken. This can be fixed in a separate PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/59271 Reviewed By: jbschlosser, mrshenli Differential Revision: D28856920 Pulled By: andwgu fbshipit-source-id: 630950cba18d34e6b5de661f5a748f2cddc1b446	2021-06-03 10:53:33 -07:00
Pritam Damania	0d6fa1adc5	Introduce ChunkShardingSpec as a model sharding specification. (#55728 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55728 Full design: https://github.com/pytorch/pytorch/issues/55207 This PR introduces ChunkShardingSpec (SingleShardingSpec in the design). Used the name ChunkShardingSpec since it is very similar to `torch.chunk` in terms of how a Tensor is split up and feels more clear compared to SingleShardingSpec. ghstack-source-id: 129603318 Test Plan: waitforbuildbot Reviewed By: SciPioneer Differential Revision: D27694108 fbshipit-source-id: c8764abe6a4d5fc56d023fda29b74b5af2a73b49	2021-05-23 16:04:57 -07:00
Rong Rong (AI Infra)	a70020465b	adding test_sparse_csr to run_test (#58666 ) Summary: fixes https://github.com/pytorch/pytorch/issues/58632. Added several skips that relate to test asserts and MKL. Will address them in a separate PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/58666 Reviewed By: seemethere, janeyx99 Differential Revision: D28607966 Pulled By: walterddr fbshipit-source-id: 066d4afce2672e4026334528233e69f68da04965	2021-05-22 13:17:46 -07:00
Sam Estep	2e26976ad3	Disallow versionless Python shebangs (#58275 ) Summary: Some machines don't have a versionless `python` on their PATH, which breaks these existing shebangs. I'm assuming that all the existing versionless `python` shebangs are meant to be `python3` and not `python2`; please let me know if my assumption was incorrect for any of these. Pull Request resolved: https://github.com/pytorch/pytorch/pull/58275 Test Plan: CI. Reviewed By: zhouzhuojie Differential Revision: D28428143 Pulled By: samestep fbshipit-source-id: 6562be3d12924db72a92a0207b060ef740f61ebf	2021-05-14 08:26:02 -07:00
Nikita Shulga	b587354e4c	Add Python-3.9 CI testing (#50992 ) Summary: Skip a number of tests and adjust typing handling Pull Request resolved: https://github.com/pytorch/pytorch/pull/50992 Reviewed By: walterddr Differential Revision: D26170388 Pulled By: malfet fbshipit-source-id: 47852512aa3d5c25faf6687bcd0b1cbb332b0b20	2021-05-10 10:51:39 -07:00
Aliaksandr Ivanou	7fe4c1d0e7	Torchelastic: add multiprocessing tests to ci/cd (#56842 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56842 Add elastic multiprocessing test to ci/cd Test Plan: buck test mode/opt-tsan //caffe2/test/distributed/elastic/multiprocessing/... -- --run-disabled Reviewed By: wilson100hong Differential Revision: D27982226 fbshipit-source-id: 1b4e6f1a20867a6aa7ca409e280fdb04e8db198b	2021-05-02 14:03:47 -07:00
Aliaksandr Ivanou	5c8ceefe46	Pytorch add agent api tests (#56985 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56985 Pytorch add agent api tests Test Plan: ci/cd Reviewed By: cbalioglu Differential Revision: D28020485 fbshipit-source-id: e6acf095f26ce4b99cddfbf7641fb4fa885b0c86	2021-04-29 06:14:39 -07:00
Aliaksandr Ivanou	6ff0002b12	Pytorch: enable many torchelastic tests (#56970 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56970 The diff enables metrics, events, utils and timer tests on ci/cd pipeline Test Plan: ci/cd Reviewed By: cbalioglu Differential Revision: D28015200 fbshipit-source-id: 6b419aaf9e62a10a747b6511bff90c82cfb7bcd6	2021-04-28 17:05:09 -07:00
David Reiss	89377e3e45	model_dump tool for model inspection (#56868 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56868 See __init__.py for a summary of the tool. The following sections are present in this initial version: - Model Size. Shows the total model size, as well as a breakdown by stored files, compressed files, and zip overhead. (I expect this breakdown to be a bit more useful once data.pkl is compressed.) - Model Structure. This is basically the output of `show_pickle(data.pkl)`, but as a hierarchical structure. Some structures cause this view to crash right now, but it can be improved incrementally. - Zip Contents. This is basically the output of `zipinfo -l`. - Code. This is the TorchScript code. It's integrated with a blame window at the bottom, so you can click "Blame Code", then click a bit of code to see where it came from (based on the debug_pkl). This currently doesn't render properly if debug_pkl is missing or incomplete. - Extra files (JSON). JSON dumps of each json file under /extra/, up to a size limit. - Extra Pickles. For each .pkl file in the model, we safely unpickle it with `show_pickle`, then render it with `pprint` and include it here if the size is not too large. We aren't able to install the pprint hack that the show_pickle CLI uses, so we get one-line rendering for custom objects, which is not very useful. Built-in types look fine, though. In particular, bytecode.pkl seems to look fine (and we hard-code that file to ignore the size limit). I'm checking in the JS dependencies to avoid a network dependency at runtime. They were retrieved from the following URLs, then passed through a JS minifier: https://unpkg.com/htm@3.0.4/dist/htm.module.js?module https://unpkg.com/preact@10.5.13/dist/preact.module.js?module Test Plan: Manually ran on a few models I had lying around. Mostly tested in Chrome, but I also poked around in Firefox. 
Reviewed By: dhruvbird Differential Revision: D28020849 Pulled By: dreiss fbshipit-source-id: 421c30ed7ca55244e9fda1a03b8aab830466536d	2021-04-28 07:33:10 -07:00
Philip Meier	759cfb7495	add missing comma to `run_test.py` (#57010 ) Summary: Factored out from https://github.com/pytorch/pytorch/pull/57008#discussion_r621137121: > Without this comma, the strings are concatenated to `test_binary_ufuncstest_numpy_interop` Pull Request resolved: https://github.com/pytorch/pytorch/pull/57010 Reviewed By: malfet Differential Revision: D28028061 Pulled By: walterddr fbshipit-source-id: 97c64b79a6aaaf0242def03c8808c1a032537258	2021-04-27 08:00:13 -07:00
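The bug fixed by the commit above comes from Python's implicit concatenation of adjacent string literals. A minimal reproduction, using the two names quoted in the commit message (the list variable names here are illustrative, not the ones in `run_test.py`):

```python
# Adjacent string literals are joined into one string, so a missing comma
# silently merges two list entries instead of raising an error.
TESTS_BUGGY = [
    "test_binary_ufuncs"   # <- no comma here
    "test_numpy_interop",
]

TESTS_FIXED = [
    "test_binary_ufuncs",
    "test_numpy_interop",
]

print(TESTS_BUGGY)  # one merged entry
print(TESTS_FIXED)  # two separate entries
```

Because the merged name matches no real test file, neither test would be selected by that entry, which is why the problem is easy to miss.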
Joel Schlosser	febff45900	Support factory kwargs in torch.nn modules (#54508 ) Summary: Continuation of https://github.com/pytorch/pytorch/pull/53144 Pull Request resolved: https://github.com/pytorch/pytorch/pull/54508 Reviewed By: albanD Differential Revision: D27939544 Pulled By: jbschlosser fbshipit-source-id: 4bf517e5f74f093e27ca38a85e732da65e44d805	2021-04-22 16:16:53 -07:00
driazati	187a524249	Re-order tests based on changed files (#56666 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56666 Addresses some of #56557 by checking for changed files when running tests. This will help deliver signal faster when a failing test is run. It should always be safe to at least try to re-order the tests, so there's no option to turn it off, and any error ends up bailing out of the sorting process. Time saved will change between tests, with more improvement for things that are further down the static list here: `1e9c7ad4cb/test/run_test.py (L32)` The results vary from not much improvement ([before: 11m](https://app.circleci.com/pipelines/github/pytorch/pytorch/307580/workflows/6ab3def6-8d63-4f41-9b8d-9c2c50f6266b/jobs/12712819/steps), [after: 10m](https://app.circleci.com/pipelines/github/pytorch/pytorch/307578/workflows/157407b4-f850-431c-b641-d2ac97916a04/jobs/12712802/steps)) to a lot ([before: 75m](https://app.circleci.com/pipelines/github/pytorch/pytorch/307580/workflows/6ab3def6-8d63-4f41-9b8d-9c2c50f6266b/jobs/12712884/steps), [after: 8m](https://app.circleci.com/pipelines/github/pytorch/pytorch/307578/workflows/157407b4-f850-431c-b641-d2ac97916a04/jobs/12712865/steps)), but overall there shouldn't be any regression in test timing. These results are also probably a little confounded since the test sharding will be different after re-ordering. As a follow up we can use the target determination logic to figure out which tests to bring to front based on the actual code instead of just edits to test files Test Plan: Imported from OSS Reviewed By: samestep Differential Revision: D27934076 Pulled By: driazati fbshipit-source-id: 747d09ad732289d7693101803d46e9fa8e6d2f59	2021-04-22 10:27:07 -07:00
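The re-ordering idea in the commit above can be sketched as a stable partition of the test list. This is an assumption-laden illustration rather than the logic in `run_test.py`: the helper names are hypothetical, test names are assumed to be paths relative to `test/` without the `.py` suffix, and the changed-file list is passed in directly instead of being read from git.

```python
from pathlib import PurePosixPath


def changed_test_names(changed_files, test_dir="test"):
    """Map changed paths such as 'test/test_nn.py' to names such as 'test_nn'."""
    names = set()
    for f in changed_files:
        p = PurePosixPath(f)
        if len(p.parts) > 1 and p.parts[0] == test_dir and p.suffix == ".py":
            names.add(str(PurePosixPath(*p.parts[1:]).with_suffix("")))
    return names


def reorder_tests(tests, changed_files):
    """Bring tests whose files changed to the front, keeping relative order."""
    try:
        changed = changed_test_names(changed_files)
    except Exception:
        # Any error bails out of the sorting and keeps the original order.
        return list(tests)
    front = [t for t in tests if t in changed]
    rest = [t for t in tests if t not in changed]
    return front + rest


tests = ["test_autograd", "test_nn", "distributed/test_store"]
changed = ["test/distributed/test_store.py", "torch/nn/functional.py"]
print(reorder_tests(tests, changed))
```

Since the partition never drops or duplicates a test, attempting the re-order is safe even when the changed-file list is empty or unrelated to the test suite.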
Pavel Belevich	426852b4f0	Split test_c10d_spawn.py to test_c10d_spawn_gloo.py,test_c10d_spawn_nccl.py (#56599 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56599 Test Plan: NA Reviewed By: SciPioneer Differential Revision: D27913955 fbshipit-source-id: 7206e589fb7d08c55d08a58a3d57dc3d210a795e	2021-04-21 22:11:49 -07:00
Pavel Belevich	5cc75e46fa	Split test_c10d.py to test_c10d_common.py, test_c10d_gloo.py, test_c10d_nccl.py (#56598 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56598 Test Plan: NA Reviewed By: SciPioneer Differential Revision: D27913170 fbshipit-source-id: 3439d18141131b02d55f2ca399a4c795cba2b04b	2021-04-21 22:10:41 -07:00
Joel Schlosser	12b2bc94d7	Revert D27909732: [pytorch][PR] Support factory kwargs in torch.nn modules Test Plan: revert-hammer Differential Revision: D27909732 (`5a09def9b0`) Original commit changeset: d8684b2403ab fbshipit-source-id: d00d69fae4fa4ed58d9e97e70b27a06a0dcb39e4	2021-04-21 13:44:03 -07:00
Joel Schlosser	5a09def9b0	Support factory kwargs in torch.nn modules (#54508 ) Summary: Continuation of https://github.com/pytorch/pytorch/pull/53144 Pull Request resolved: https://github.com/pytorch/pytorch/pull/54508 Reviewed By: malfet Differential Revision: D27909732 Pulled By: jbschlosser fbshipit-source-id: d8684b2403ab7eb336371d118799146a2520bd76	2021-04-21 13:20:11 -07:00
Aliaksandr Ivanou	c5c5230890	Pytorch resolve bug around incorrect rdzv handler resolution (#56386 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56386 The diff resolves bug around incorrect handler resolution: _create_static_handler pointed towards etcd, and _create_etcd_handler pointed towards static. Test Plan: buck test mode/dev-nosan //caffe2/test/distributed:test_launcher Added test_launcher to the ci/cd tests Reviewed By: cbalioglu Differential Revision: D27858897 fbshipit-source-id: 440155789958c091ce5755e7c9524e4bb704203a	2021-04-19 23:50:28 -07:00
Natalia Gimelshein	92d24e3060	Revert D27855386: [pytorch][PR] Support factory kwargs in torch.nn modules Test Plan: revert-hammer Differential Revision: D27855386 (`40483acc51`) Original commit changeset: dabd505d2a04 fbshipit-source-id: f5bf3120d87861b30a8e1bf11977ad7d27cd8500	2021-04-19 20:07:20 -07:00
Joel Schlosser	40483acc51	Support factory kwargs in torch.nn modules (#54508 ) Summary: Continuation of https://github.com/pytorch/pytorch/pull/53144 Pull Request resolved: https://github.com/pytorch/pytorch/pull/54508 Reviewed By: bdhirsh Differential Revision: D27855386 Pulled By: jbschlosser fbshipit-source-id: dabd505d2a04208e74b158570fb2859c736eea2c	2021-04-19 12:24:58 -07:00
Sam Estep	d05e7c163f	Revert D27600457: [pytorch][PR] Support factory kwargs in torch.nn modules Test Plan: revert-hammer Differential Revision: D27600457 (`1077f87269`) Original commit changeset: b58bfee61c39 fbshipit-source-id: 19d5bfc5133a3880383731d0332503ca1f3bce0c	2021-04-19 07:47:24 -07:00
Joel Schlosser	1077f87269	Support factory kwargs in torch.nn modules (#54508 ) Summary: Continuation of https://github.com/pytorch/pytorch/pull/53144 Pull Request resolved: https://github.com/pytorch/pytorch/pull/54508 Reviewed By: mrshenli Differential Revision: D27600457 Pulled By: jbschlosser fbshipit-source-id: b58bfee61c3917524b4622f63ef216c27a588eb1	2021-04-19 06:58:40 -07:00
Sam Estep	1e9c7ad4cb	Add a test to measure `import torch` time (#56041 ) Summary: This PR adds a couple very simple tests which (as the code comment says) measure the time it takes to `import torch` and ask for the CUDA device count. Pull Request resolved: https://github.com/pytorch/pytorch/pull/56041 Test Plan: ``` $ rm -r /tmp/reports ; python3 test/test_import_time.py --save-xml=/tmp/reports Running tests... ---------------------------------------------------------------------- .. ---------------------------------------------------------------------- Ran 2 tests in 1.855s OK Generating XML reports... ``` ``` $ tools/print_test_stats.py /tmp/reports No scribe access token provided, skip sending report! class TestImportTime: tests: 2 failed: 0 skipped: 0 errored: 0 run_time: 1.85 seconds avg_time: 0.93 seconds median_time: 0.93 seconds 2 longest tests: test_time_cuda_device_count time: 1.10 seconds test_time_import_torch time: 0.75 seconds Total runtime is 0:00:01 2 longest tests of entire run: TestImportTime.test_time_cuda_device_count time: 1.10 seconds TestImportTime.test_time_import_torch time: 0.75 seconds ``` Reviewed By: driazati Differential Revision: D27770908 Pulled By: samestep fbshipit-source-id: 01bbf5a339f41d3a1f493e6fa8c946ff7567daec	2021-04-15 00:53:30 -07:00
Edward Yang	bc86358cf5	Make run_test.py work even if s3_stat_parser fails to import (#56039 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56039 Python will try to eagerly resolve the name references even if the import failed. Quote them so that it doesn't. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: janeyx99 Differential Revision: D27770536 Pulled By: ezyang fbshipit-source-id: b111739289498f9bab856fb9424f3080efee4ee0	2021-04-14 13:21:50 -07:00
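The quoting trick mentioned in the commit above can be illustrated with a small sketch. The module name `hypothetical_missing_stats_module` and the `Report` type are made up for the example and stand in for an optional import that fails; on Python versions that evaluate annotations eagerly, an unquoted `Report` in the signature would raise `NameError` when the function is defined.

```python
try:
    # Hypothetical optional dependency, assumed to be absent here.
    from hypothetical_missing_stats_module import Report
    HAVE_STATS = True
except ImportError:
    HAVE_STATS = False


def summarize(report: "Report") -> str:
    # The quoted annotation is stored as a plain string and is never looked
    # up at definition time, so the module still imports without Report.
    return f"report with {len(report)} entries"


print(HAVE_STATS)
print(summarize.__annotations__["report"])
print(summarize([1, 2]))
```

Functions that actually need the optional module can then check a flag such as `HAVE_STATS` at call time instead of failing at import time.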
Luca Wehrstedt	3f8d476857	Split out CUDA RPC tests (#55695 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55695 In order to be able to run CUDA tests on their own (e.g., to avoid running CPU tests on GPU machines). Done by moving test methods to a separate class (and sometimes introducing a "common" base class for utils), and then providing new entry points inside a `cuda/` subdirectory. Test Plan: Checked they are run on Sandcastle. Reviewed By: mrshenli Differential Revision: D27618198 fbshipit-source-id: 8f671657f79c8ae115748ab7752fe0066705893b	2021-04-12 07:48:08 -07:00
Rong Rong (AI Infra)	55db156229	remove test_jit_py3.py entirely (#55560 ) Summary: 1. move module related stuff to test_module_container 2. created test_types for types and annotation 3. created test_misc for the rest Pull Request resolved: https://github.com/pytorch/pytorch/pull/55560 Reviewed By: VitalyFedyunin Differential Revision: D27650911 Pulled By: walterddr fbshipit-source-id: d895a7da9e9c3d25a662a37faf4daabc276b9c1a	2021-04-08 14:28:54 -07:00
Erjia Guan	f9a0bbbeb8	[DataPipe] Remove duplicate dataset (#54553 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54553 Test Plan: Imported from OSS Reviewed By: VitalyFedyunin Differential Revision: D27279301 Pulled By: ejguan fbshipit-source-id: 112a83e7061e3f35dc517eb623bd9ca93c2f034c	2021-04-07 10:11:22 -07:00
Jane Xu	bf37bf7da4	Make JSON files more human readable (#55335 ) Summary: Prettifies JSON files .pytorch-test-times and .pytorch-slow-tests so that not everything is on one single line. This is of slightly more importance as generated .pytorch-slow-tests ends up getting stored in our test-infra repo ([example](`ad9cd87565`)), and it is nice to not have that lil red symbol at the end. Pull Request resolved: https://github.com/pytorch/pytorch/pull/55335 Reviewed By: samestep Differential Revision: D27576930 Pulled By: janeyx99 fbshipit-source-id: be58565b8c8593a9bfcfab383ee19facc79f0572	2021-04-05 17:23:36 -07:00
Jane Xu	717e70a824	(BE) Refactor get-test-times-from-S3 into s3_stat_parser (#54808 ) Summary: Moves more s3 parsing code to s3_stat_parser.py. This is another step in modularizing the parsing code more correctly. I will also be using this exact function in future slowTest code. Also replaces some Any's in the code to be Report. Pull Request resolved: https://github.com/pytorch/pytorch/pull/54808 Test Plan: .pytorch-test-times generated before the code and after this code is the same. CI should pass, specifically the test tools GHA. Reviewed By: walterddr Differential Revision: D27375783 Pulled By: janeyx99 fbshipit-source-id: bec28551668b2eb3fdd60d802200993e493eac83	2021-03-29 08:45:22 -07:00
Rong Rong (AI Infra)	d4045e9aa1	initial commit to refactor all s3 access codes to s3_stats_parser (#54681 ) Summary: First step to move all S3 related operations into S3 parser utils. In the end we provide APIs from s3_stats_parser: 1. downloading data as reports and uploading data as reports 2. filter by job name and handle all compression, formatting inside. TODO - [ ] Refactor out upload into s3_stats_parser - [ ] Remove all S3/BOTO related checkers and try/catch blocks outside of s3_stats_parser Pull Request resolved: https://github.com/pytorch/pytorch/pull/54681 Test Plan: 1. Running tools/test/* covers the refactoring logic (test_test_history.py and test_stats.py as entrypoints, both using the 2 new APIs in s3_stats_parser after the refactoring). 2. print_test_stats.py's main argparse entrypoint is covered by the CI step Report Test Result. 3. run `python test/run_test.py --export-past-test-times` before and after this PR should result in the same file content in .pytorch-test-times Reviewed By: ailzhang Differential Revision: D27346742 Pulled By: walterddr fbshipit-source-id: fb40162e631e007fed9d5821fe4f190bda2cb52e	2021-03-26 06:49:15 -07:00
Jane Xu	792f5ffb83	Also strip slow_test (#54528 ) Summary: Since `_test1`, `_test2` and `_build` and `test` are all stripped, `slow_test` should be stripped as well. This way, the _slow_test stats will be considered as a part of all stats relating to a particular build job, though currently, it doesn't do much because the jobs don't share a common stemmed name--the build has `_gcc7` while the slow_test CI job does not. This makes me think...do we omit the `gcc7` intentionally? Are there other things I should strip, e.g., `multigpu_test`? See: ci/circleci: pytorch_linux_xenial_cuda10_2_cudnn7_py3_slow_test ci/circleci: pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test1 ci/circleci: pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test2 Pull Request resolved: https://github.com/pytorch/pytorch/pull/54528 Reviewed By: samestep Differential Revision: D27270393 Pulled By: janeyx99 fbshipit-source-id: ffb7289cfe4dba52ded67f50a89f3e75e7bad68d	2021-03-23 14:44:21 -07:00
Jane Xu	635595f706	Change sharding in ci (#54228 ) Summary: Step three (landing this should fix https://github.com/pytorch/pytorch/issues/53882)! Modifying CI to compute job times during build so that the exported job times can be used for sharding future test jobs. The builds that are exempted from this: - `bazel` (no python tests so no need) - `libtorch` (no python stuff so no need) - `onnx` (the test shards are not calculated the same way) - `asan` (runs into an error I don't know how to debug/we can debug later: [logs](https://app.circleci.com/pipelines/github/pytorch/pytorch/288019/workflows/57f95f67-1a1b-44a0-9b02-9652b57f2a5f/jobs/11693962)) Pull Request resolved: https://github.com/pytorch/pytorch/pull/54228 Test Plan: CI Reviewed By: samestep Differential Revision: D27192978 Pulled By: janeyx99 fbshipit-source-id: 3cb20d14f4989e61873043b81dfd6b0f82d17ccd	2021-03-22 08:40:34 -07:00
Jane Xu	0645e2b490	Use shard file if present, improve functions used for sharding (#54210 ) Summary: Step 2 to fixing https://github.com/pytorch/pytorch/issues/53882 :) This changes TARGET_DET_LIST and sharding automation by checking if there's already cached data from the commit in `.pytorch-test-times`. If not, it pulls data from S3 and updates the file to have the stats. This way, S3 pulling does not need to happen more than once for the same commit. Pull Request resolved: https://github.com/pytorch/pytorch/pull/54210 Test Plan: the following methods should run the same set of tests. First `export CIRCLE_JOB=pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test2` or your favorite CIRCLE JOB. 1. Pull data first and use it: Download the data from S3 and write it to the cache file with `python test/run_test.py --export-historic-test-times .pytorch-test-times` Now run `python test/run_test.py --shard 1 10` 2. Make the sharding job pull data: Delete the file you just created: `rm .pytorch-test-times` Now run `python test/run_test.py --shard 1 10` Reviewed By: walterddr Differential Revision: D27136849 Pulled By: janeyx99 fbshipit-source-id: 51a42c4e2fa3f8cf15e682679dd3eb6130aad927	2021-03-18 13:25:51 -07:00
Jane Xu	2e7311ef25	First step to refactoring S3 reading logic (#53755 ) Summary: This is an initial attempt in refactoring and consolidating our S3 read logic for print_test_stats.py, test_history.py, and run_test.py. This way, boto3 and botocore do not need to be imported in various places throughout the code base, and duplicated logic (such as the many type definitions) can exist in one place: `tools/stat_utils/s3_stat_parser.py`. walterddr contributed to this PR by moving print_test_stats.py to the tools folder and the corresponding tests to a subfolder within tools. NOTE: this removes those tests from CI as the new `tools/test/test_stats.py` is not in the test/ directory like the other tests in TESTS in run_test.py. Pull Request resolved: https://github.com/pytorch/pytorch/pull/53755 Test Plan: This refactoring change should not break anything, so running the files as before should work as they did previously. To make sure that print_test_stats.py still functions: run `python tools/test/test_stats.py` and make sure all tests pass. To make sure that test_history.py works, run the example commands from `tools/test_history.py --help` and check that their output matches that shown. Note that the script will continue printing for a while, so don't be alarmed. Some next steps: - Actually coming up with similarities among the three current use cases and further refactoring/consolidating of functions (e.g., combining simplify and get_cases) - Moving more parsing logic to s3_stat_parser.py to have better abstraction between our files - Adding tests for s3_stat_parser.py when there is more functionality in it Reviewed By: agolynski, samestep Differential Revision: D27030285 Pulled By: janeyx99 fbshipit-source-id: e664781324ef7c0c30943bfd7f17c895075ef7a7	2021-03-17 12:38:09 -07:00
Jane Xu	f30a7a2739	Add export-historic-test-times option to dump S3 test times into a JSON file (#54083 ) Summary: This will allow for future work to use the test times file (which will save computation time and also allow for more consistency). (Step one to fixing https://github.com/pytorch/pytorch/issues/53882) Pull Request resolved: https://github.com/pytorch/pytorch/pull/54083 Test Plan: export CIRCLE_JOB=your-favorite-circleci-job e.g., pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test2 `python test/run_test.py --export-historic-test-times` OR `python test/run_test.py --export-historic-test-times .your-favorite-file` When opening either .pytorch-test-times or .your-favorite-file, you should see something like: ``` {"commit": "2d559a09392aabb84dfb4a498010b2f01d99818c", "job_times": {"distributed/test_distributed_spawn": 583.5889999999973, "distributed/test_data_parallel": 4.866999999999997, "test_binary_ufuncs": 171.1569999999998, "test_numpy_interop": 2.5649999999999995, "test_public_bindings": 0.011,...}} ``` Note that no tests will be run when this option is specified. Reviewed By: walterddr Differential Revision: D27091351 Pulled By: janeyx99 fbshipit-source-id: e191d739268d86de0a0ba0eea0006969859d1940	2021-03-17 12:22:00 -07:00
Jane Xu	ee35060888	Fix sharding algo + test it (#53942 ) Summary: This PR: 1. moves sharding algorithm from run_test.py to framework_utils.py (let me know if you have a better place for it) 2. adds tests for the algorithm in test_testing.py 3. fixes the algorithm so that it doesn't tack on the unknown jobs all to the shard with the minimum time, but instead distributes them around the shards. Pull Request resolved: https://github.com/pytorch/pytorch/pull/53942 Test Plan: python test/test_testing.py -k TestFrameworkUtils Reviewed By: samestep Differential Revision: D27047223 Pulled By: janeyx99 fbshipit-source-id: 824b20009c0bb707aa5361de445cdec795d5e3f1	2021-03-15 16:33:56 -07:00
Nikita Shulga	b00cdfe136	Fix run_test_module logic (#53884 ) Summary: First argument is either file name or test module name, but key to `CUSTOM_HANDLERS` is test module name. Pull Request resolved: https://github.com/pytorch/pytorch/pull/53884 Test Plan: Run `python3 run_test.py -i distributed/test_distributed_spawn.py` Reviewed By: janeyx99 Differential Revision: D27006164 Pulled By: malfet fbshipit-source-id: f30b42856cd2754e5981c1c69618f84e392c986a	2021-03-12 09:53:58 -08:00
Aliaksandr Ivanou	ec484981c6	[3/n][torch/elastic][upstream] Move torchelastic/events to torch/distributed/events (#53760 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53760 Pull Request resolved: https://github.com/pytorch/elastic/pull/143 The diff upstreams torchelastic/events to torch. Test Plan: buck test mode/dev-nosan //pytorch/elastic/torchelastic/agent/... buck test mode/dev-nosan //caffe2/test/distributed/elastic/events/fb/... Reviewed By: kiukchung Differential Revision: D26932830 fbshipit-source-id: 23fc10d2ead5af7f7ed510ae0d2581cc2421cf76	2021-03-11 11:25:24 -08:00
Guilherme Leobas	cb68039363	Port NumPy typing testing style to PyTorch (#52408 ) Summary: ref: https://github.com/pytorch/pytorch/issues/16574 Pull Request resolved: https://github.com/pytorch/pytorch/pull/52408 Reviewed By: anjali411 Differential Revision: D26654687 Pulled By: malfet fbshipit-source-id: 6feb603d8fb03c2ba2a01468bfde1a9901e889fd	2021-03-10 12:18:01 -08:00
Jane Xu	bcbe07200c	Improve logic for S3 stats gathering. Uses automatic SLOW_TESTS. (#53549 ) Summary: This PR: 1. refactors the logic for S3 stats gathering. 2. Renames SLOW_TESTS to TARGET_DET_LIST to disambiguate and remove confusion with slowTest 3. detects slow tests (tests with time > 5min) to add to the TARGET_DET_LIST based on results in S3 from the previous nightly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/53549 Test Plan: Set CIRCLE_JOB to your favorite CI job (like `pytorch_linux_bionic_py3_8_gcc9_coverage_test1`). Run `python test/run_test.py --determine-from=<your fave pytorch files>` e.g., `python test/run_test.py --determine-from=test/run_test.py` Reviewed By: mrshenli Differential Revision: D26904478 Pulled By: janeyx99 fbshipit-source-id: 9576b34f4fee09291d60e36ff2631753a3925094	2021-03-10 09:37:06 -08:00
Sam Estep	8c798e0622	Forbid trailing whitespace (#53406 ) Summary: Context: https://github.com/pytorch/pytorch/pull/53299#discussion_r587882857 These are the only hand-written parts of this diff: - the addition to `.github/workflows/lint.yml` - the file endings changed in these four files (to appease FB-internal land-blocking lints): - `GLOSSARY.md` - `aten/src/ATen/core/op_registration/README.md` - `scripts/README.md` - `torch/csrc/jit/codegen/fuser/README.md` The rest was generated by running this command (on macOS): ``` git grep -I -l ' $' -- . ':(exclude)/contrib/' ':(exclude)third_party' | xargs gsed -i 's/ *$//' ``` I looked over the auto-generated changes and didn't see anything that looked problematic. Pull Request resolved: https://github.com/pytorch/pytorch/pull/53406 Test Plan: This run (after adding the lint but before removing existing trailing spaces) failed: - https://github.com/pytorch/pytorch/runs/2043032377 This run (on the tip of this PR) succeeded: - https://github.com/pytorch/pytorch/runs/2043296348 Reviewed By: walterddr, seemethere Differential Revision: D26856620 Pulled By: samestep fbshipit-source-id: 3f0de7f7c2e4b0f1c089eac9b5085a58dd7e0d97	2021-03-05 17:22:55 -08:00
Jane Xu	c0adabe172	automate sharding using S3 test time stats (#53269 ) Summary: Uses nightly commit stats to automatically shard tests based on execution time. Pull Request resolved: https://github.com/pytorch/pytorch/pull/53269 Test Plan: set CIRCLE_JOB to an existing job, like `pytorch_linux_bionic_py3_6_clang9_test` Then you can run something like: `python test/run_test.py --shard 1 10` Reviewed By: malfet Differential Revision: D26819440 Pulled By: janeyx99 fbshipit-source-id: 6bc73d6aa3d52d9850817536be15d7b54a72780e	2021-03-05 13:40:24 -08:00
Yi Zhang	fd582af06c	enable coverage test for dataloader on Windows (#52550 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/50661 For coverage, the class qualified name is `'SimpleCustomBatch': <class '__mp_main__.SimpleCustomBatch'>` For pytest, the class qualified name is `'SimpleCustomBatch': <class 'test_dataloader.SimpleCustomBatch'>` So move the class to a separate file ![image](https://user-images.githubusercontent.com/16190118/108611869-d6b51f80-741d-11eb-908e-be7a64da916d.png) Per malfet's suggestion, use __import__ to avoid adding a new file. Pull Request resolved: https://github.com/pytorch/pytorch/pull/52550 Reviewed By: walterddr Differential Revision: D26754023 Pulled By: malfet fbshipit-source-id: 34b0fbe7336b9303cedc28ec6116ab752a2d3630	2021-03-02 18:40:47 -08:00
Meghan Lele	1d6bd15790	[JIT] Add torch._C._jit submodule (#52910 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52910 Summary PR #52158 tried to move all JIT bindings from `torch._C` to a new submodule `torch._C._jit`, but that...did not go well. This pull request adds the new `torch._C._jit` submodule, but does not migrate the existing bindings. Instead, it adds a unit test that fails if any new bindings are added to `torch._C`. A comment in the test instructs developers to add their new binding to the allowlist if it really should be in `torch._C`, or to add it to the appropriate submodule (e.g `torch._C._jit`, for example). The idea is to prevent the issue described in #51691 from getting worse if it cannot be fixed. Test Plan Continuous integration. Fixes This commit fixes #51691. Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D26698373 Pulled By: SplitInfinity fbshipit-source-id: ec9f5426051227a513d4fd09512b624420e0100b	2021-02-26 16:05:05 -08:00
Kimish Patel	a6e94d274f	[Pytorch] Add python binding to use mobile cpu allocator. (#52323 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52323 Using the default cpu allocator for ops executed on the qnnpack backend will result in asan failures with heap overflow, since qnnpack (and xnnpack) can access input beyond its end/beginning. Here we are enabling this feature specifically to enable the dynamic sparse linear op test using the qnnpack engine. In the dynamic linear op, the fp32 bias is not packed and hence can result in out-of-bound access. Test Plan: test_set_default_mobile_cpu_allocator.py Reviewed By: z-a-f Differential Revision: D26263481 fbshipit-source-id: a49227cac7e6781b0db4a156ca734d7671972d9f	2021-02-17 08:42:23 -08:00
Chester Liu	58eb23378f	Clean up usage of torch._six partially (#49785 ) Summary: See https://github.com/pytorch/pytorch/issues/42919 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49785 Reviewed By: mruberry Differential Revision: D25963833 Pulled By: bugra fbshipit-source-id: 11c90d6b8d3f206c9d0a4d8621b773beb10c6ba2	2021-02-08 13:58:34 -08:00
mattip	9cbefad83f	concatenate LICENSE files when building a wheel (#51634 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/50695 I checked locally that the concatenated license file appears at `torch-<version>.dist-info/LICENSE` in the wheel. Pull Request resolved: https://github.com/pytorch/pytorch/pull/51634 Reviewed By: zhangguanheng66 Differential Revision: D26225550 Pulled By: walterddr fbshipit-source-id: 830c59fb7aea0eb50b99e295edddad9edab6ba3a	2021-02-08 08:28:46 -08:00
vfdev	b106250047	Introduced AliasInfo for OpInfo (#50368 ) Summary: Introduced AliasInfo for OpInfo. Context: Split of https://github.com/pytorch/pytorch/issues/49158 cc mruberry , please let me know if you'd like to see here more code to cover > [ ] fold test_op_aliases.py into OpInfo-based testing in test_ops.py from https://github.com/pytorch/pytorch/issues/50006 and/or add `UnaryUfuncInfo('abs')` as discussed https://github.com/pytorch/pytorch/pull/49158/files#r548774221 Pull Request resolved: https://github.com/pytorch/pytorch/pull/50368 Reviewed By: ngimel Differential Revision: D26177261 Pulled By: mruberry fbshipit-source-id: 2e3884a387e8d5365fe05945375f0a9d1b5f5d82	2021-02-02 00:10:09 -08:00
Radhakrishnan Venkataramani	3397919dcf	Rowwise Prune op (Add the test to OSS run_test), Make the op private. (#46131 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46131 Refer to the title. Test Plan: `buck test caffe2/test:pruning` Reviewed By: raghuramank100 Differential Revision: D24230472 fbshipit-source-id: 8f0a83446c23fdf30d0313b8c3f5ff1a463b50c7	2021-01-29 06:08:18 -08:00
lixinyu	5ed0ad4b6a	DataPipe naming convention update (#51262 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51262 Test Plan: Imported from OSS Reviewed By: ejguan Differential Revision: D26120628 Pulled By: glaringlee fbshipit-source-id: 6855a0dd6d4a93ff93adce1039960ffd7057a827	2021-01-28 17:44:36 -08:00
Benjamin Lefaudeux	87fb3707d9	ZeroRedundancyOptimizer: an implementation of a standalone sharded optimizer wrapper (#46750 ) Summary: Implement the first stage of ZeRO, sharding of the optimizer state, as described in [this blog post](https://www.microsoft.com/en-us/research/blog/zero-2-deepspeed-shattering-barriers-of-deep-learning-speed-scale/) and [this paper](https://arxiv.org/abs/1910.02054). This implementation is completely independent from the [DeepSpeed](https://github.com/microsoft/DeepSpeed) framework, and aims at providing ZeRO-compliant building blocks within the PyTorch scheme of things. This works by: - acting as a wrapper to a pytorch optimizer. ZeROptimizer does not optimize anything by itself, it only shards optimizers for distributed jobs - each rank distributes parameters according to a given partitioning scheme (could be updated), and owns the update of a given shard only - the .step() is called on each rank as expected, the fact that the optimizer actually works on a shard of the model is not visible from the outside - when the update is completed, each rank broadcasts the updated model shard to all the other ranks This can be used with DDP, although some communications are wasted in that case (gradients are all-reduced to all ranks). This implementation was initially developed in [Fairscale](https://github.com/facebookresearch/fairscale), and can also be used with an optimized DDP which only reduces to the relevant ranks. More context on ZeRO and PyTorch can be found in [this RFC](https://github.com/pytorch/pytorch/issues/42849). The API with respect to loading and saving the state is a known pain point and should probably be discussed and updated. 
Other possible follow ups include integrating more closely to a [modularized DDP](https://github.com/pytorch/pytorch/issues/37002), [making the checkpoints partition-agnostic](https://github.com/facebookresearch/fairscale/issues/164), [exposing a gradient clipping option](https://github.com/facebookresearch/fairscale/issues/98) and making sure that mixed precision states are properly handled. original authors include msbaines, min-xu-ai and myself Pull Request resolved: https://github.com/pytorch/pytorch/pull/46750 Reviewed By: mruberry Differential Revision: D25958918 Pulled By: blefaudeux fbshipit-source-id: 14280f2fd90cf251eee8ef9ac0f1fa6025ae9c50	2021-01-20 14:36:16 -08:00
peter	a1b1d0cdc0	Better split of the windows test jobs (#50660 ) Summary: See discussion in https://github.com/pytorch/pytorch/pull/50320#discussion_r554447365. Pull Request resolved: https://github.com/pytorch/pytorch/pull/50660 Reviewed By: xuzhao9, samestep Differential Revision: D25959021 Pulled By: seemethere fbshipit-source-id: 7623bddc09e7d55208b8a1af4b5a23fba2cdeb14	2021-01-19 15:07:33 -08:00
Mikhail Zolotukhin	e9dc8fc162	[TensorExpr] Add python bindings. (#49698 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49698 Reincarnation of #47620 by jamesr66a. It's just an initial bunch of things that we're exposing to python, more is expected to come in future. Some things can probably be done better, but I'm putting this out anyway, since some other people were interested in using and/or developing this. Differential Revision: D25668694 Test Plan: Imported from OSS Reviewed By: bertmaher Pulled By: ZolotukhinM fbshipit-source-id: fb0fd1b31e851ef9ab724686b9ac2d172fa4905a	2021-01-14 21:02:47 -08:00
Nikita Shulga	22bd277891	Run test_type_hints first (#49748 ) Summary: Since it is sort of a linter check and fails frequently Pull Request resolved: https://github.com/pytorch/pytorch/pull/49748 Reviewed By: vkuzo Differential Revision: D25682980 Pulled By: malfet fbshipit-source-id: 7dba28242dced0277bad56dc887d3273c1e9e575	2021-01-04 09:33:13 -08:00
Samuel Marks	e6779d4357	[*.py] Rename "Arguments:" to "Args:" (#49736 ) Summary: I've written custom parsers and emitters for everything from docstrings to classes and functions. However, I recently came across an issue when I was parsing/generating from the TensorFlow codebase: inconsistent use of `Args:` and `Arguments:` in its docstrings. ```sh (pytorch#c348fae)$ for name in 'Args:' 'Arguments:'; do printf '%-10s %04d\n' "$name" "$(rg -IFtpy --count-matches "$name" | paste -s -d+ -- | bc)"; done Args: 1095 Arguments: 0336 ``` It is easy enough to extend my parsers to support both variants, however it looks like `Arguments:` is wrong anyway, as per: - https://google.github.io/styleguide/pyguide.html#doc-function-args @ [`ddccc0f`](https://github.com/google/styleguide/blob/ddccc0f/pyguide.md) - https://chromium.googlesource.com/chromiumos/docs/+/master/styleguide/python.md#describing-arguments-in-docstrings @ [`9fc0fc0`](https://chromium.googlesource.com/chromiumos/docs/+/9fc0fc0/styleguide/python.md) - https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html @ [`c0ae8e3`](https://github.com/sphinx-contrib/napoleon/blob/c0ae8e3/docs/source/example_google.rst) Therefore, only `Args:` is valid. This PR replaces them throughout the codebase. PS: For related PRs, see tensorflow/tensorflow/pull/45420 PPS: The trackbacks automatically appearing below are sending the same changes to other repositories in the [PyTorch](https://github.com/pytorch) organisation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49736 Reviewed By: albanD Differential Revision: D25710534 Pulled By: soumith fbshipit-source-id: 61e8ff01abb433e9f78185c2d1d0cbd7c22c1619	2020-12-28 09:34:47 -08:00
Nikita Shulga	12942ea52b	[BE] Introduce `set_cwd` context manager (#49657 ) Summary: Used to temporarily change the working directory and restore it even if an exception is raised. Use it in test_type_hints and during code coverage collection. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49657 Reviewed By: walterddr Differential Revision: D25660543 Pulled By: malfet fbshipit-source-id: 77f08d57e4b60b95daa4068d0dacf7c25f978526	2020-12-21 12:08:48 -08:00
Erjia Guan	1b6fc1fd42	[WIP][DataLoader] CollateIterableDataset prototype (#48933 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48933 Prototype for CollateIterableDataset. Move `collate_batch_fn` to BatchIterableDataset - CollateIterableDataset - [x] Prototype - [x] Tests - BatchIterableDataset - [x] Prototype - [x] Tests - SamplerIterableDataset - [x] Prototype - [x] Tests Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D25623635 Pulled By: ejguan fbshipit-source-id: 99ba077619f672551ac15367baaba985db35a9c2	2020-12-21 07:04:25 -08:00
Nikita Shulga	6f381de006	Inline coverage report combining/reporting (#49615 ) Summary: Instead of calling the coverage frontend, import the coverage module and call combine() and html_report(). Fixes https://github.com/pytorch/pytorch/issues/49596 by not using strict mode when combining those reports Pull Request resolved: https://github.com/pytorch/pytorch/pull/49615 Reviewed By: seemethere Differential Revision: D25645196 Pulled By: malfet fbshipit-source-id: be55b5c23a3569a331cbdf3f86d8c89bc27d5fe1	2020-12-18 17:08:46 -08:00
Pritam Damania	9d91360b5d	Cleanup APIs for pipeline parallelism. (#48630 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48630 1) Make torch.distributed.pipeline package public. 2) Make several helper methods private. ghstack-source-id: 118820803 Test Plan: waitforbuildbot Reviewed By: rohan-varma Differential Revision: D25235688 fbshipit-source-id: c32833ebf090ddbd4eaf06fcb5e3f9d421623a60	2020-12-18 15:17:13 -08:00
Rong Rong (AI Infra)	df2337097d	add files to SLOW_TESTS for target determinator (#49500 ) Summary: - test_torch was split into 6 in https://github.com/pytorch/pytorch/issues/47356. - also test_linalg has 10 slowtest marking. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49500 Reviewed By: ezyang, malfet Differential Revision: D25598085 Pulled By: walterddr fbshipit-source-id: 74b0b433897721db86c00e236d1dd925d7a6d3d0	2020-12-16 19:10:56 -08:00
Brian Hirsh	9908b93dcf	fix test_dispatch tests to error on duplicate def (#49254 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49254 Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D25505170 Pulled By: bdhirsh fbshipit-source-id: 6796f4ce022c3141934ee69c7caaa08e663adf39	2020-12-15 08:27:52 -08:00
Pritam Damania	df027bfd2c	Modify Pipe to return an RRef. (#47829 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47829 As per proposal in https://github.com/pytorch/pytorch/issues/44827, the API needs to return an RRef to support inter-host pipelining. For now, we just return a local RRef and only support pipeline on a single host. But having this change in the API upfront ensures we don't make any BC breaking changes later. ghstack-source-id: 118366784 Test Plan: waitforbuildbot Reviewed By: rohan-varma Differential Revision: D24914022 fbshipit-source-id: e711e7d12efa45645f752f0e5e776a3d845f3ef5	2020-12-11 14:55:16 -08:00
Rong Rong	ef50c94e7c	reenabling MPI test (#48725 ) Summary: fixes https://github.com/pytorch/pytorch/issues/47443. Pull Request resolved: https://github.com/pytorch/pytorch/pull/48725 Reviewed By: mrshenli Differential Revision: D25278758 Pulled By: walterddr fbshipit-source-id: a02d0fef99a7941c8e98da16a45d840e12b8b0c3	2020-12-03 06:50:36 -08:00
neerajprad	5489a98cd3	Add support for CorrCholeskyTransform (#48041 ) Summary: This adds a transform to convert a real vector of (D * (D-1))/2 dimension into the cholesky factor of a D x D correlation matrix. This follows the implementation in [NumPyro](https://github.com/pyro-ppl/numpyro/blob/master/numpyro/distributions/transforms.py) by fehiepsi. This is needed for the LKJDistribution which will be added in a subsequent PR. Also in line with the ongoing effort to refactor distributions test, this moves the transforms test into its own file that uses pytest with parametrized fixtures. For review: fehiepsi - could you help review the math? fritzo - do you have any suggestions for what to do about the event dimension (more details are in the comment below)? ezyang - could you review the changes in `run_test.py`? Instead of a separate `PYTEST_TESTS`, I have clubbed these tests in `USE_PYTEST_LIST` to avoid duplicate logic. The only difference is that we do not anymore check if pytest is not installed and exclude the tests in the list. I figured that if existing tests are already using pytest, this should not matter. TODOs (probably not all can be satisfied at the same time): - [x] Use operations that are JIT friendly, i.e. the transform works with different sized input under JIT. - [x] Resolve test failures - currently `arange(scalar_tensor)` fails on certain backends but this is needed for JIT. Maybe we should only support same sized tensor under JIT? - [x] Add tests to check that the transform gives correct gradients and is in agreement with the `log_det_jacobian`. - [x] Add `input_event_dim` and `output_event_dim` to `CorrCholeskyTransform`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/48041 Reviewed By: zhangguanheng66 Differential Revision: D25262505 Pulled By: neerajprad fbshipit-source-id: 5a57e1c19d8230b53592437590b9169bdf2f71e9	2020-12-03 03:21:08 -08:00
Mike Ruberry	36c87f1243	Refactors test_torch.py to be fewer than 10k lines (#47356 ) Summary: Creates multiple new test suites to have fewer tests in test_torch.py, consistent with previous test suite creation like test_unary_ufuncs.py and test_linalg.py. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47356 Reviewed By: ngimel Differential Revision: D25202268 Pulled By: mruberry fbshipit-source-id: 75fde3ca76545d1b32b86d432a5cb7a5ba8f5bb6	2020-11-28 20:11:40 -08:00
Jithun Nair	f1c985695c	Enabled gloo backend in test_distributed unit tests for ROCm (#40395 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40395 Reviewed By: ngimel Differential Revision: D25181692 Pulled By: mrshenli fbshipit-source-id: 29f478c974791efc0acea210c8c9e574944746a5	2020-11-25 19:51:40 -08:00
Sam Estep	c4a6df989c	Pass any verbosity from test/run_test.py to pytest (#48204 ) Summary: Previously it was only possible to pass up to one [verbosity level](https://adamj.eu/tech/2019/10/03/my-most-used-pytest-commandline-flags/) to `pytest` when running a test via `test/run_test.py`. Presumably that behavior was never added because `unittest` [doesn't do anything extra](https://stackoverflow.com/a/1322648/5044950) when given more than one `--verbose` flag. This PR removes that limitation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/48204 Test Plan: Make a dummy `pytest`-style file `test/test_foo.py`: ```py def test_bar(): assert 'hello\n' * 10 == 'hello\n' * 20 ``` Then add `'test_foo'` to both `TESTS` and `USE_PYTEST_LIST` in `test/run_test.py`, and run this command: ```sh test/run_test.py -vvi test_foo ``` Reviewed By: walterddr Differential Revision: D25069147 Pulled By: samestep fbshipit-source-id: 2765ee78d18cc84ea0e262520838993f9e9ee04f	2020-11-19 08:06:26 -08:00
Wanchao Liang	bc484cfed1	[c10d][jit] initial torchbind bindings for ProcessGroupNCCL (#42944 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42944 Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D23228682 Pulled By: wanchaol fbshipit-source-id: 30f4258ec2a90202264745511b897f4e1f5550f7	2020-11-17 21:01:55 -08:00
Xiang Gao	6e42b77be1	Add '--allow-run-as-root' to mpiexec to allow running distributed test inside a container (#43794 ) Summary: Inside a container, the user is often root. We should allow this use case so that people can easily run `run_test.py` inside a container. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43794 Reviewed By: ezyang Differential Revision: D24904469 Pulled By: malfet fbshipit-source-id: f96cb9dda3e7bd18b29801cde4c5b0616c750016	2020-11-13 15:31:06 -08:00
Jane Xu	579cfc6641	Moving test order to rebalance test1 and test2 times (#47290 ) Summary: asan testing diff is absurd right now, moving some heftier tests to be in shard2 (test_nn and test_quantization) Pull Request resolved: https://github.com/pytorch/pytorch/pull/47290 Reviewed By: malfet Differential Revision: D24706877 Pulled By: janeyx99 fbshipit-source-id: 35069d1e425857f85775f9be76501d6a158e0376	2020-11-03 09:39:29 -08:00
Pritam Damania	78de12f588	Replace -f with -x for pytest tests. (#46967 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46967 Tests under `tests/distributed/_pipeline/sync` use pytest and specifying the `-f` option for such tests as follows: `python test/run_test.py -i distributed/_pipeline/sync/skip/test_api -- -f` doesn't work. The equivalent option for pytest is `-x`. To resolve this issue, I've updated `run_test.py` to replace `-f` with `-x` for pytest tests. More details in https://github.com/pytorch/pytorch/issues/46782 #Closes: https://github.com/pytorch/pytorch/issues/46782 ghstack-source-id: 115440558 Test Plan: 1) waitforbuildbot 2) `python test/run_test.py -i distributed/_pipeline/sync/skip/test_api -- -f` Reviewed By: malfet Differential Revision: D24584556 fbshipit-source-id: bd87f5b4953504e5659fe72fc8615e126e5490ff	2020-10-29 15:28:06 -07:00
Jane Xu	85954164a4	fix minor bug, message variable does not exist (#46777 ) Summary: When run with `--continue-through-error`, the script ends with the following error: ``` Traceback (most recent call last): File "run_test.py", line 745, in <module> main() File "run_test.py", line 741, in main print_to_stderr(message) NameError: name 'message' is not defined make: *** [macos-compat] Error 1 ``` This PR just changes `message` to `err`, which is the intended variable. Pull Request resolved: https://github.com/pytorch/pytorch/pull/46777 Reviewed By: seemethere Differential Revision: D24510460 Pulled By: janeyx99 fbshipit-source-id: be1124b6fc72b178d62acc168d0cbc74962de52b	2020-10-23 14:20:23 -07:00
Pritam Damania	06d50b5eb0	Pull in fairscale.nn.Pipe into PyTorch. (#44090 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44090 This is an initial commit pulling in the torchgpipe fork at https://github.com/facebookresearch/fairscale. The purpose of this commit is to just pull in the code and ensure all tests and builds work fine. We will slowly modify this to match our intended API mentioned in https://fb.quip.com/txurAV3zIFox#RPZACAfAKMq. Follow up PRs would address further changes needed on top of the initial commit.. We're pulling the code into the `torch.distributed._pipeline.sync` package. The package is private on purpose since there is a lot of work (ex: docs, API changes etc.) that needs to go in before we can actually officially support this. ghstack-source-id: 114864254 Test Plan: 1) waitforbuildbot 2) Ran all tests on my devgpu Reviewed By: mrshenli Differential Revision: D23493316 fbshipit-source-id: fe3c8b7dadeeb86abdc00e8a8652491b0b16743a	2020-10-22 10:59:02 -07:00
Richard Zou	0285618a11	Add utilities to support handling of nested python data structures (#46287 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46287 This adds a lightweight `pytree` implementation that is similar to and inspired by JAX pytrees, tensorflow.nest, deepmind/tree, TorchBeast's TensorNest, etc. A pytree is a Python nested data structure. It is a tree in the sense that nodes are Python collections (e.g., list, tuple, dict) and the leaves are Python values. Furthermore, a pytree should not contain reference cycles. This PR: - adds support for flattening and unflattening nested Python list/dict/tuples Context: nested Tensor inputs for vmap -------------------------------------- Right now, vmap is restricted to taking in flat lists of tensors. This is because vmap needs to be able to convert every tensor in the input that is being vmapped over into a BatchedTensor. With a pytree library, we can simply flatten the input data structure (returning the leaves), map all of the Tensors in the flat input to BatchedTensors, and unflatten the flat list of BatchedTensors into a new input. Or equivalently, with a `tree_map` function, we can map a nested python data structure containing Tensors into one containing BatchedTensors. Future work ----------- In some future PRs, we'll add nested input support for vmap. The prerequisites for that are: - a `broadcast_to(small, big)` that broadcasts `small` up to `big`. This is for handling the in_dims to vmap: the in_dims structure must be compatible with the structure of the inputs. Test Plan --------- - New tests in test/test_pytree.py Test Plan: Imported from OSS Reviewed By: heitorschueroff Differential Revision: D24392890 Pulled By: zou3519 fbshipit-source-id: 7daf7430c5a38354e7d203a72882bd7a9b24cfb1	2020-10-20 07:45:45 -07:00
jiej	ac146c4820	[nvFuser] Switching to `CudaFusionGuard` from `BailOut` for nvfuser - update 2 (#46452 ) Summary: 1. Added CudaFusionGuard as the custom TypeCheck for nvfuser; enabled dynamic shape support with profiling executor; 2. dropped support for legacy fuser; 3. re-enabled nvfuser tests; 4. added registration for profiling record to allow profiling on user specified nodes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/46452 Reviewed By: zou3519, anjali411 Differential Revision: D24364642 Pulled By: ngimel fbshipit-source-id: daf53a9a6b6636e1ede420a3a6d0397d4a8b450b	2020-10-19 15:44:31 -07:00
Taylor Robie	dda95e6914	More Timer refinement (#46023 ) Summary: This PR just adds more polish to the benchmark utils: 1) `common.py`, `timer.py`, and `valgrind_wrapper/timer_interface.py` are now MyPy strict compliant. (except for three violations due to external deps.) Compare and Fuzzer will be covered in a future PR. 2) `CallgrindStats` now uses `TaskSpec` rather than accepting the individual fields which brings it closer to `Measurement`. 3) Some `__repr__` logic has been moved into `TaskSpec` (which `Measurement` and `CallgrindStats` use in their own `__repr__`s) for a more unified feel and less horrible f-string hacking, and the repr's have been given a cleanup pass. 4) `Tuple[FunctionCount, ...]` has been formalized as the `FunctionCounts` class, which has a much nicer `__repr__` than just the raw tuple, as well as some convenience methods (`__add__`, `__sub__`, `filter`, `transform`) for easier DIY stat exploration. (I find myself using the latter two a lot now.) My personal experience is that manipulating `FunctionCounts` is massively more pleasant than the raw tuples of `FunctionCount`. (Though it's still possible to get at the raw data if you want.) 5) Better support for multi-line `stmt` and `setup`. 6) Compare now also supports rowwise coloring, which is often the more natural layout for A/B testing. 7) Limited support for `globals` in `collect_callgrind`. This should make it easier to benchmark JIT models. (CC ZolotukhinM) 8) More unit tests, including extensive tests for the Callgrind stats manipulation APIs. 9) Mitigate issue with `MKL_THREADING_LAYER` when run in Jupyter. (https://github.com/pytorch/pytorch/issues/37377) Pull Request resolved: https://github.com/pytorch/pytorch/pull/46023 Test Plan: changes should be covered by existing and new unit tests. Reviewed By: navahgar, malfet Differential Revision: D24313911 Pulled By: robieta fbshipit-source-id: 835d4b5cde336fb7ff0adef3c0fd614d64df0f77	2020-10-15 16:32:53 -07:00
Wang Xu	62d37b9f26	add size_based_partition final (#46282 ) Summary: Reopen the PR: https://github.com/pytorch/pytorch/pull/45837 This PR adds a new feature for the Partitioner() class called size_based_partition. Given a list of devices with the same memory size, this function can distribute graph nodes across the devices. To implement this feature, several helper functions are created in Partitioner.py and GraphManipulation.py. A unit test is also added in test/test_fx_experimental.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/46282 Reviewed By: gcatron Differential Revision: D24288470 Pulled By: scottxu0730 fbshipit-source-id: e81b1e0c56e34f61e497d868882126216eba7538	2020-10-14 03:44:05 -07:00
Neeraj Pradhan	faa9c22a51	Support pytest for distribution testing (#45648 ) Summary: In response to https://github.com/pytorch/pytorch/issues/11578. This is a test run to see if CI (and other internal systems) works fine with pytest style tests. - Creates a separate `distributions` directory within `test`. - For testing, this rewrites the `constraint` tests as parameterized tests in pytest. I don't plan to convert any other tests to pytest style, but only expose this option for adding new tests, if required. If this is a success, we can move `EXAMPLES` in `test_distributions` into a separate file that can be imported by both pytest and unittest style tests. cc. fritzo Pull Request resolved: https://github.com/pytorch/pytorch/pull/45648 Reviewed By: ezyang, colesbury Differential Revision: D24080248 Pulled By: neerajprad fbshipit-source-id: 1f2e7d169c3c291a3051d0cece17851560fe9ea9	2020-10-13 10:56:50 -07:00
Jane Xu	ba78eb80ff	including tensorexpr tests in CI for all configs (#46188 ) Summary: Removed test_tensorexpr from the JIT-EXECUTOR exclude list. CI will now run those tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/46188 Reviewed By: glaringlee Differential Revision: D24255433 Pulled By: janeyx99 fbshipit-source-id: f18e5b41d49b439407c1c24ef6190ef68bc809bf	2020-10-12 12:03:06 -07:00
Jane (Yuan) Xu	be137e45cd	reorganizing tests so that test1 and test2 are balanced in timing (#45778 ) Summary: used --shard option to split up python tests ran from `test/run_test.py` in the testing script run in CI also revised a help message to be more accurate for --shard. Test results: BEFORE: \| EVENT \| TIMING \| \|---\|---\| \| TEST1 \| \| \| \| \| \| test_python_nn \| 35m19s \| \| test_cpp_extensions \| 30s \| \| total \| 35m49s \| \| TEST2 \| \| \| \| \| \| install_torchvision \| 35s \| \| test_python_all_except_nn_and_cpp_extensions \| 255m37s \| \| test_aten \| SKIPPED \| \| test_libtorch \| 9m8s \| \| test_custom_script_ops \| SKIPPED \| \| test_custom_backend \| SKIPPED \| \| test_torch_function_benchmark \| 10s \| \| total \| 4hr24m \| AFTER THIS SHARD: \| EVENT \| TIMING \| \|---\|---\| \| TEST1 \| \| \| \| \| \| test_autograd \| 26m30s \| \| test_foreach \| 69m \| \| test_nn \| test_nn is 35m38s \| \| total \| 3h1m \| \| TEST2 \| \| \| \| \| \| test-quantization \| 41m28s \| \| test_spectral_ops \| 17m37s \| \| test_torch \| 8m56s \| \| test_jit_legacy \| 16m21s \| \| total \| 2h18m \| Pull Request resolved: https://github.com/pytorch/pytorch/pull/45778 Reviewed By: albanD Differential Revision: D24137156 Pulled By: janeyx99 fbshipit-source-id: 5873fec47aedb9f699ebbda653a4d32a9950fc13	2020-10-06 07:57:08 -07:00
Jane Xu	8bc0c755be	adding option to move excluding to run_test.py instead of test.sh (#45868 ) Summary: Cleaning up test.sh a tiny bit Pull Request resolved: https://github.com/pytorch/pytorch/pull/45868 Reviewed By: albanD Differential Revision: D24122726 Pulled By: janeyx99 fbshipit-source-id: e8254accad15ad887a000ec1401c401389393c92	2020-10-06 07:13:27 -07:00
Jane (Yuan) Xu	6acd7b686c	adding sharding option to run_test.py (#45583 ) Summary: Added a sharding option to run_test.py to enable users to run a subset of the many tests. The new `--shard` argument takes in two integer values, `x` and `y`, where the larger value would denote the number of shards and the smaller value would denote which shard to run. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45583 Reviewed By: malfet Differential Revision: D24083469 Pulled By: janeyx99 fbshipit-source-id: 1777bd7822c95b3bf37079deff9381c6f8eaf4cc	2020-10-02 11:21:51 -07:00
Thomas Viehmann	22a34bcf4e	ROCm ❤ TensorExpr (#45506 ) Summary: This might be an alternative to reverting https://github.com/pytorch/pytorch/issues/45396 . The obvious rough edge is that I'm not really seeing the work group limits that TensorExpr produces. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45506 Reviewed By: zhangguanheng66 Differential Revision: D23991410 Pulled By: Krovatkin fbshipit-source-id: 11d3fc4600e4bffb1d1192c6b8dd2fe22c1e064e	2020-09-29 16:52:16 -07:00
gunandrose4u	f07ac6a004	Fix Windows build failure after DDP PR merged (#45335 ) Summary: Fixes #{issue number} This is a resubmit of PR https://github.com/pytorch/pytorch/issues/42897 , together with a fix for the Windows build issue introduced by PR https://github.com/pytorch/pytorch/issues/44344 . Pull Request resolved: https://github.com/pytorch/pytorch/pull/45335 Reviewed By: zou3519 Differential Revision: D23931471 Pulled By: mrshenli fbshipit-source-id: f49b5a114944c1450b32934b3292170be064f494	2020-09-25 12:37:50 -07:00
Mike Ruberry	95df8657c9	Enables test linalg (#45278 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/45271. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45278 Reviewed By: ngimel Differential Revision: D23926124 Pulled By: mruberry fbshipit-source-id: 26692597f9a1988e5fa846f97b8430c3689cac27	2020-09-24 23:09:38 -07:00
Mike Ruberry	103fa3894a	Revert D23841786: [pytorch][PR] Enable distributed package on windows, Gloo backend supported only Test Plan: revert-hammer Differential Revision: D23841786 (`0122299f9b`) Original commit changeset: 334ba1ed73ef fbshipit-source-id: ec95432f9957df56a5a04e52661f5db920b7f57f	2020-09-24 22:44:33 -07:00
gunandrose4u	0122299f9b	Enable distributed package on windows, Gloo backend supported only (#42897 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/42095 The test cases will be committed to this PR later. mrshenli, please help review. Pull Request resolved: https://github.com/pytorch/pytorch/pull/42897 Reviewed By: osalpekar Differential Revision: D23841786 Pulled By: mrshenli fbshipit-source-id: 334ba1ed73eff2f668857390fc32d1bc7f08e5f3	2020-09-24 21:13:55 -07:00
Zachary DeVito	cb75addee4	torch.package - a way to package models and code (#45015 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45015 torch.package allows you to write packages of code, pickled python data, and arbitrary binary and text resources into a self-contained package. torch.package.PackageExporter writes the packages and torch.package.PackageImporter reads them. The importers can load this code in a hermetic way, such that code is loaded from the package rather than the normal python import system. This allows for the packaging of PyTorch model code and data so that it can be run on a server or used in the future for transfer learning. The code contained in packages is copied file-by-file from the original source when it is created, and the file format is a specially organized zip file. Future users of the package can unzip the package, and edit the code in order to perform custom modifications to it. The importer for packages ensures that code in the module can only be loaded from within the package, except for modules explicitly listed as external using :method:`extern_module`. The file `extern_modules` in the zip archive lists all the modules that a package externally depends on. This prevents "implicit" dependencies where the package runs locally because it is importing a locally-installed package, but then fails when the package is copied to another machine. Test Plan: Imported from OSS Reviewed By: SplitInfinity Differential Revision: D23824337 Pulled By: zdevito fbshipit-source-id: 1247c34ba9b656f9db68a83e31f2a0fbe3bea6bd	2020-09-22 21:21:21 -07:00
Richard Zou	07cba8b1fc	Run vmap tests in CI (#44656 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44656 All this time, test_vmap wasn't running in the CI. Fortunately all the tests pass locally for me. h/t to anjali411 for pointing this out. Test Plan: - Wait for CI Reviewed By: anjali411 Differential Revision: D23689355 Pulled By: zou3519 fbshipit-source-id: 543c3e6aed0af77bfd6ea7a7549337f8230e3d32	2020-09-15 10:59:00 -07:00
Nikita Shulga	fc51047af5	Small fixes in Dependency.cmake and run_test.py (#44414 ) Summary: Do not add gencode flags to NVCC_FLAGS twice: they are first added in `cmake/public/cuda.cmake`, so there is no need to add them again in `cmake/Dependencies.cmake`. Copy `additional_unittest_args` before appending local options to it in the `run_test()` method. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44414 Reviewed By: seemethere Differential Revision: D23605733 Pulled By: malfet fbshipit-source-id: 782a0da61650356a978a892fb03c66cb1a1ea26b	2020-09-09 15:09:33 -07:00
Rohan Varma	106459acac	Rename test_distributed to test_distributed_fork (#42932 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42932 Follow up from https://github.com/pytorch/pytorch/pull/41769, rename `test_distributed` to `test_distributed_fork` to make it explicit that it forks. New command to run test: `python test/run_test.py -i distributed/test_distributed_fork -v` ghstack-source-id: 111632568 Test Plan: `python test/run_test.py -i distributed/test_distributed_fork -v` Reviewed By: izdeby Differential Revision: D23072201 fbshipit-source-id: 48581688b6c5193a309e803c3de38e70be980872	2020-09-08 23:13:37 -07:00
Rohan Varma	b22abbe381	Enable test_distributed to work with spawn mode (#41769 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41769 Currently the tests in `test_distributed` only work with the `fork` mode multiprocessing, this PR introduces support for `spawn` mode multiprocessing as well (while keeping the `fork` mode intact). Motivations for the change: 1) Spawn multiprocessing is the default on MacOS, so it better emulates how MacOS users would use distributed 2) With python 3.8+, spawn is the default on linux, so we should have test coverage for this 3) PT multiprocessing suggests using spawn/forkserver over fork, for sharing cuda tensors: https://pytorch.org/docs/stable/multiprocessing.html 4) Spawn is better supported with respect to certain sanitizers such as TSAN, so adding this sanitizer coverage may help us uncover issues. How it is done: 1) Move `test_distributed` tests in `_DistTestBase` class to a shared file `distributed_test` (similar to how the RPC tests are structured) 2) For `Barrier`, refactor the setup of temp directories, as the current version did not work with spawn, each process would get a different randomly generated directory and thus would write to different barriers. 3) Add all the relevant builds to run internally and in OSS. Running test_distributed with spawn mode in OSS can be done with: `python test/run_test.py -i distributed/test_distributed_spawn -v` Reviewed By: izdeby Differential Revision: D22408023 fbshipit-source-id: e206be16961fd80438f995e221f18139d7e6d2a9	2020-09-08 23:11:12 -07:00
Mike Ruberry	665feda15b	Adds opinfo-based autograd tests and (un)supported dtype tests (#43451 ) Summary: This PR adds a new test suite, test_ops.py, designed for generic tests across all operators with OpInfos. It currently has two kinds of tests: - it validates that the OpInfo has the correct supported dtypes by verifying that unsupported dtypes throw an error and supported dtypes do not - it runs grad and gradgrad checks on each op and its variants (method and inplace) that has an OpInfo This is a significant expansion and simplification of the current autogenerated autograd tests, which spend considerable time processing their inputs. As an alternative, this PR extends OpInfos with "SampleInputs" that are much easier to use. These sample inputs are analogous to the existing tuples in `method_tests()`. Future PRs will extend OpInfo-based testing to other uses of `method_tests()`, like test_jit.py, to ensure that new operator tests can be implemented entirely using an OpInfo. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43451 Reviewed By: albanD Differential Revision: D23481723 Pulled By: mruberry fbshipit-source-id: 0c2cdeacc1fdaaf8c69bcd060d623fa3db3d6459	2020-09-03 02:50:48 -07:00
Sinan Nasir	1a79d7bb28	DDP communication hook examples (#43310 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43310 In this diff, we prepared some example DDP communication hooks [#40848](https://github.com/pytorch/pytorch/pull/40848): 1\. `allreduce_hook`: This DDP communication hook just calls ``allreduce`` using ``GradBucket`` tensors. Once gradient tensors are aggregated across all workers, its ``then`` callback takes the mean and returns the result. If the user registers this hook, DDP results are expected to be the same as when no hook is registered. Hence, this won't change the behavior of DDP, and the user can use it as a reference or modify it to log useful information or for other purposes without affecting DDP behavior. 2\. `allgather_then_aggregate_hook` Similar to ``allreduce_hook``, this hook first gathers ``GradBucket`` tensors and its ``then`` callback aggregates the gathered gradient tensors and takes the mean. Instead of ``allreduce`` this hook uses ``allgather``. Note that with W workers, both the computation and communication time scale as O(W) for allgather compared to O(logW) for allreduce. Therefore, this hook is expected to be much slower than ``allreduce_hook`` although both essentially do the same thing with the gradients. 3\. `fp16_compress_hook` This DDP communication hook implements a simple gradient compression approach that converts ``GradBucket`` tensors whose type is assumed to be ``torch.float32`` to half-precision floating point format (``torch.float16``). It allreduces those ``float16`` gradient tensors. Once the compressed gradient tensors are allreduced, its ``then`` callback, called ``decompress``, converts the aggregated result back to ``float32`` and takes the mean. 4\. `quantization_pertensor_hook` does quantization per tensor and uses the idea in https://pytorch.org/docs/master/generated/torch.quantize_per_tensor.html. 
Note that we separately send scale and zero_point (two floats per rank) before quantized tensors. 5\. `quantization_perchannel_hook` does quantization per channel similar to https://pytorch.org/docs/master/generated/torch.quantize_per_channel.html. The main motivation is that after the initial QSGD study diff, we realized that for considerably large gradient tensors, such as a tensor that contains 6 million floats, dividing it into smaller channels (512 float chunks) and quantizing each independently may significantly increase the resolution and result in lower error. ghstack-source-id: 110923269 Test Plan: python torch/distributed/algorithms/ddp_comm_hooks/test_ddp_hooks.py Couldn't download test skip set, leaving all tests enabled... ..... ---------------------------------------------------------------------- Ran 4 tests in 26.724s OK Internal testing: ``` buck run mode/dev-nosan //caffe2/test/distributed/algorithms/ddp_comm_hooks:test_ddp_hooks ``` Reviewed By: malfet Differential Revision: D22937999 fbshipit-source-id: 274452e7932414570999cb978ae77a97eb3fb0ec	2020-08-28 18:59:14 -07:00
Nikita Shulga	1bda5e480c	Add Python code coverage (#43600 ) Summary: Replace `test` with `coverage_test` stage for `pytorch-linux-bionic-py3.8-gcc9` configuration Add `coverage.xml` to the list of ignored files Add `codecov.yml` that maps installed pytorch folders back to original locations Cleanup coverage option utilization in `run_test.py` and adapt it towards combining coverage reports across the runs Pull Request resolved: https://github.com/pytorch/pytorch/pull/43600 Reviewed By: seemethere Differential Revision: D23351877 Pulled By: malfet fbshipit-source-id: acf78ae4c8f3e23920a76cce1d50f2821b83eb06	2020-08-26 16:16:03 -07:00
albanD	e08e93f946	Reland of benchmark code (#43428 ) Summary: Reland of the benchmark code that broke the slow tests because the GPU were running out of memory Pull Request resolved: https://github.com/pytorch/pytorch/pull/43428 Reviewed By: ngimel Differential Revision: D23296136 Pulled By: albanD fbshipit-source-id: 0002ae23dc82f401604e33d0905d6b9eedebc851	2020-08-24 13:27:26 -07:00
Mike Ruberry	4dc8f3be8c	Creates test_tensor_creation_ops.py test suite (#43104 ) Summary: As part of our continued refactoring of test_torch.py, this takes tests for tensor creation ops like torch.eye, torch.randint, and torch.ones_like and puts them in test_tensor_creation_ops.py. There are three test classes in the new test suite: TestTensorCreation, TestRandomTensorCreation, TestLikeTensorCreation. TestViewOps and tests for construction of tensors from NumPy arrays have been left in test_torch.py. These might be refactored separately into test_view_ops.py and test_numpy_interop.py in the future. Most of the tests ported from test_torch.py were left as is or received a signature change to make them nominally "device generic." Future work will need to review test coverage and update the tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43104 Reviewed By: ngimel Differential Revision: D23280358 Pulled By: mruberry fbshipit-source-id: 469325dd1a734509dd478cc7fe0413e276ffb192	2020-08-22 23:18:54 -07:00
Alban Desmaison	74781ab5b8	Revert D23242101: [pytorch][PR] Implement first draft of autograd benchmark. Test Plan: revert-hammer Differential Revision: D23242101 (`c2511bdfa4`) Original commit changeset: a2b92d5a4341 fbshipit-source-id: bda562d15565f074b448022d180ec8f959c6ecc9	2020-08-21 12:22:57 -07:00
albanD	c2511bdfa4	Implement first draft of autograd benchmark. (#40586 ) Summary: It is quite a lot of code because I pulled some code from torchaudio and torchvision to remove issues I had to get latest version with pytorch built from source while I can't build there libs from source (dependency missing for torchaudio). The compare script generates table as follows: \| model \| task \| speedup \| mean (before) \| var (before) \| mean (after) \| var (after) \| \| -- \| -- \| -- \| -- \| -- \| -- \| -- \| \| resnet18 \| vjp \| 1.021151844124464 \| 1.5627719163894653 \| 0.005164200905710459 \| 1.5304011106491089 \| 0.003979875706136227 \| \| resnet18 \| vhp \| 0.9919114430761606 \| 6.8089728355407715 \| 0.019538333639502525 \| 6.86449670791626 \| 0.014775685034692287 \| \| resnet18 \| jvp \| 0.9715963084255123 \| 5.720699310302734 \| 0.08197150379419327 \| 5.887938499450684 \| 0.018408503383398056 \| \| ppl_simple_reg \| vjp \| 0.9529183269165618 \| 0.000362396240234375 \| 7.526952949810095e-10 \| 0.00038030146970413625 \| 7.726220357939795e-11 \| \| ppl_simple_reg \| vhp \| 0.9317708619586977 \| 0.00048058031825348735 \| 5.035701855504726e-10 \| 0.0005157709238119423 \| 3.250243477137538e-11 \| \| ppl_simple_reg \| jvp \| 0.8609755877018406 \| 0.00045447348384186625 \| 9.646707044286273e-11 \| 0.0005278587341308594 \| 1.4493808930815533e-10 \| \| ppl_simple_reg \| hvp \| 0.9764100147808232 \| 0.0005881547695025802 \| 7.618464747949361e-10 \| 0.0006023645401000977 \| 6.370915461850757e-10 \| \| ppl_simple_reg \| jacobian \| 1.0019173715134297 \| 0.0003612995205912739 \| 2.2979899233499523e-11 \| 0.0003606081008911133 \| 1.2609764794835332e-11 \| \| ppl_simple_reg \| hessian \| 1.0358429970264393 \| 0.00206911563873291 \| 2.590938796842579e-09 \| 0.0019975185859948397 \| 2.8916853356264482e-09 \| \| ppl_robust_reg \| vjp \| 1.0669910916521521 \| 0.0017304659122601151 \| 3.1047047155396967e-09 \| 0.0016218185191974044 \| 4.926861585374809e-09 \| \| ppl_robust_reg \| 
vhp \| 1.0181130455462972 \| 0.0029563189018517733 \| 2.6359153082466946e-08 \| 0.0029037236236035824 \| 1.020585038702393e-08 \| \| ppl_robust_reg \| jvp \| 0.9818360373406179 \| 0.0026934861671179533 \| 6.981357714153091e-09 \| 0.00274331565015018 \| 3.589908459389335e-08 \| \| ppl_robust_reg \| hvp \| 1.0270848910527002 \| 0.005576515104621649 \| 3.2798087801211295e-08 \| 0.005429458804428577 \| 6.438724398094564e-08 \| \| ppl_robust_reg \| jacobian \| 1.0543611284155785 \| 0.00167675013653934 \| 2.3236829349571053e-08 \| 0.001590299652889371 \| 1.2011492245278532e-08 \| \| ppl_robust_reg \| hessian \| 1.0535378727082656 \| 0.01643357239663601 \| 1.8450685956850066e-06 \| 0.015598463825881481 \| 2.1876705602608126e-07 \| \| wav2letter \| vjp \| 1.0060408105086573 \| 0.3516994118690491 \| 1.4463969819189515e-05 \| 0.349587619304657 \| 9.897866402752697e-05 \| \| wav2letter \| vhp \| 0.9873655295086051 \| 1.1196287870407104 \| 0.00474404776468873 \| 1.133955717086792 \| 0.009759620763361454 \| \| wav2letter \| jvp \| 0.9741820317882822 \| 0.7888165712356567 \| 0.0017476462526246905 \| 0.8097219467163086 \| 0.0018235758179798722 \| \| transfo \| vjp \| 0.9883954031921641 \| 2.8865864276885986 \| 0.008410997688770294 \| 2.9204773902893066 \| 0.006901870481669903 \| \| transfo \| vhp \| 1.0111290842971339 \| 8.374398231506348 \| 0.014904373325407505 \| 8.282224655151367 \| 0.04449500888586044 \| \| transfo \| jvp \| 1.0080534543381963 \| 6.293097972869873 \| 0.03796082362532616 \| 6.24282169342041 \| 0.010179692879319191 \| Pull Request resolved: https://github.com/pytorch/pytorch/pull/40586 Reviewed By: pbelevich Differential Revision: D23242101 Pulled By: albanD fbshipit-source-id: a2b92d5a4341fe1472711a685ca425ec257d6384	2020-08-21 07:36:26 -07:00
Mike Ruberry	e2eb0cb1a9	Adds arccosh alias for acosh and adds an alias consistency test (#43107 ) Summary: This adds the torch.arccosh alias and updates alias testing to validate the consistency of the aliased and original operations. The alias testing is also updated to run on CPU and CUDA, which revealed a memory leak when tracing (see https://github.com/pytorch/pytorch/issues/43119). Pull Request resolved: https://github.com/pytorch/pytorch/pull/43107 Reviewed By: ngimel Differential Revision: D23156472 Pulled By: mruberry fbshipit-source-id: 6155fac7954fcc49b95e7c72ed917c85e0eabfcd	2020-08-16 22:12:25 -07:00
Mike Ruberry	bee174dc3f	Adds linalg.det alias, fixes outer alias, updates alias testing (#42802 ) Summary: This PR: - updates test_op_normalization.py, which verifies that aliases are correctly translated in the JIT - adds torch.linalg.det as an alias for torch.det - moves the torch.linalg.outer alias to torch.outer (to be consistent with NumPy) The torch.linalg.outer alias was put in the linalg namespace erroneously as a placeholder since it's a "linear algebra op" according to NumPy but is actually still in the main NumPy namespace. The updates to test_op_normalization are necessary. Previously it was using method_tests to generate tests, and method_tests assumes test suites using it also use the device generic framework, which test_op_normalization did not. For example, some ops require decorators like `skipCPUIfNoLapack`, which only works in device generic test classes. Moving test_op_normalization to the device generic framework also lets these tests run on CPU and CUDA. Continued reliance on method_tests() is excessive since the test suite is only interested in testing aliasing, and a simpler and more readable `AliasInfo` class is used for the required information. One impedance mismatch between method_tests and the new tests, for example, was how to handle ops in namespaces like torch.linalg.det. In the future this information will likely be folded into a common 'OpInfo' registry in the test suite. The actual tests performed are similar to what they were previously: a scripted and traced version of the op is run and the test verifies that both graphs do not contain the alias name and do contain the aliased name. The guidance for adding an alias has been updated accordingly. 
cc mattip Note: ngimel suggests: - deprecating and then removing the `torch.ger` name - reviewing the implementation of `torch.outer` Pull Request resolved: https://github.com/pytorch/pytorch/pull/42802 Reviewed By: zou3519 Differential Revision: D23059883 Pulled By: mruberry fbshipit-source-id: 11321c2a7fb283a6e7c0d8899849ad7476be42d1	2020-08-11 21:48:31 -07:00
Mike Ruberry	4bafca1a69	Adds list of operator-related information for testing (#41662 ) Summary: This PR adds: - an "OpInfo" class in common_method_invocations that can contain useful information about an operator, like what dtypes it supports - a more specialized "UnaryUfuncInfo" class designed to help test the unary ufuncs - the `ops` decorator, which can generate test variants from lists of OpInfos - test_unary_ufuncs.py, a new test suite stub that shows how the `ops` decorator and operator information can be used to improve the thoroughness of our testing The single test in test_unary_ufuncs.py simply ensures that the dtypes associated with a unary ufunc operator in its OpInfo entry are correct. Writing a test like this previously, however, would have required manually constructing test-specific operator information and writing a custom test generator. The `ops` decorator and a common place to put operator information make writing tests like this easier and allows what would have been test-specific information to be reused. The `ops` decorator extends and composes with the existing device generic test framework, allowing its decorators to be reused. For example, the `onlyOnCPUAndCUDA` decorator works with the new `ops` decorator. This should keep the tests readable and consistent. Future PRs will likely: - continue refactoring the too large test_torch.py into more verticals (unary ufuncs, binary ufuncs, reductions...) - add more operator information to common_method_invocations.py - refactor tests for unary ufuncs into test_unary_ufunc Examples of possible future extensions are [here](`616747e50d`), where an example unary ufunc test is added, and [here](`d0b624f110`), where example autograd tests are added. Both tests leverage the operator info in common_method_invocations to simplify testing. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41662 Reviewed By: ngimel Differential Revision: D23048416 Pulled By: mruberry fbshipit-source-id: ecce279ac8767f742150d45854404921a6855f2c	2020-08-11 11:34:53 -07:00
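The idea of generating test variants from a shared list of operator records can be sketched in plain Python. This is a simplified illustration, not the PyTorch implementation: the `OpInfo` fields, the `op_db` list, the `expand_ops` class decorator, and the `_op_test_` naming convention are all assumptions made for the example, and `math` functions stand in for real operators.

```python
import math
import unittest
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class OpInfo:
    """Hypothetical, simplified operator record (not the real class)."""
    name: str
    op: Callable[[float], float]
    dtypes: Tuple[type, ...] = (float,)


# Illustrative registry; stdlib math functions stand in for real operators.
op_db = [
    OpInfo("cos", math.cos),
    OpInfo("sqrt", math.sqrt),
]


def expand_ops(op_list):
    """Class decorator: turn each `_op_test_*` template into one test per OpInfo."""
    def decorate(cls):
        for attr, template in list(vars(cls).items()):
            if not attr.startswith("_op_test_"):
                continue
            for info in op_list:
                # Bind the template and record as defaults to avoid late binding.
                def test(self, _template=template, _info=info):
                    _template(self, _info)
                test_name = "test_" + attr[len("_op_test_"):] + "_" + info.name
                setattr(cls, test_name, test)
        return cls
    return decorate


@expand_ops(op_db)
class TestUnaryOps(unittest.TestCase):
    def _op_test_returns_supported_dtype(self, info):
        # The record, not the test, carries the per-operator expectations.
        result = info.op(1.0)
        self.assertIsInstance(result, info.dtypes)
```

Because the expectations live in the records, adding an operator to `op_db` adds a test variant without touching the test body.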
James Reed	575e7497f6	Introduce experimental FX library (#42741 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42741 Test Plan: Imported from OSS Reviewed By: dzhulgakov Differential Revision: D23006383 Pulled By: jamesr66a fbshipit-source-id: 6cb6d921981fcae47a07df581ffcf900fb8a7fe8	2020-08-11 10:01:47 -07:00
Luca Wehrstedt	935fcc9580	[RPC tests] Merge process group tests into single entry point (#40818 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40818 Summary of the entire stack: -- This diff is part of an attempt to refactor the RPC tests. They currently suffer from several problems: - Several ways to specify the agent to use: there exists one "generic" fixture that uses the global variable TEST_CONFIG to look up the agent name, and is used for process group and Thrift, and then there are separate fixtures for the flaky agent and the TensorPipe one. - These two ways lead to having two separate decorators (`requires_process_group_agent` and `@_skip_if_tensorpipe_agent`) which must both be specified, making it unclear what the effect of each of them is and what happens if only one is given. - Thrift must override the TEST_CONFIG global variable before any other import (in order for the `requires_process_group_agent` decorator to work correctly) and for that it must use a "trap" file, which makes it even harder to track which agent is being used, and which is specific to Buck, and thus cannot be used in OSS by other agents. - Even if the TensorPipe fixture doesn't use TEST_CONFIG, it still needs to set it to the right value for other parts of the code to work. (This is done in `dist_init`). - There are a few functions in dist_utils.py that return some properties of the agent (e.g., a regexp to match against the error it returns in case of shutdown). These functions are effectively chained if/elses on the various agents, which has the effect of "leaking" some part of the Thrift agent into OSS. - Each test suite (RPC, dist autograd/dist optimizer, their JIT versions, remote module, ...) must be run on each agent (or almost; the faulty one is an exception) in both fork and spawn mode. Each of these combinations is a separate file, which leads to a proliferation of scripts. 
- There is no "master list" of what combinations make sense and should be run. Therefore it has happened that when adding new tests or new agents we forgot to enroll them into the right tests. (TensorPipe is still missing a few tests, it turns out). - All of these tiny "entry point" files contain almost the same duplicated boilerplate. This makes it very easy to get the wrong content into one of them due to a bad copy-paste. This refactoring aims to address these problems by: - Avoiding global state, defaults/override, traps, if/elses, ... and have a single way to specify the agent, based on an abstract base class and several concrete subclasses which can be "mixed in" to any test suite. - Instead of enabling/disabling tests using decorators, the tests that are specific to a certain agent are now in a separate class (which is a subclass of the "generic" test suite) so that they are only picked up by the agent they apply to. - Instead of having one separate entry point script for each combination, it uses one entry point for each agent, and in that script it provides a list of all the test suites it wants to run on that agent. And it does that by trying to deduplicate the boilerplate as much as possible. (In fact, the various agent-suite combinations could be grouped in any way, not necessarily by agent as I did here). It provides further advantages: - It puts all the agents on equal standing, by not having any of them be the default, making it thus easier to migrate from process group to TensorPipe. - It will make it easier to add more versions of the TensorPipe tests (e.g., one that disables the same-machine backends in order to test the TCP-based ones) without a further duplication of entry points, of boilerplate, ... Summary of this commit -- This diff does the changes described above for the process group agent. It defines a fixture for it (instead of using the generic fixture in its default behavior) and then merges all the entry points into a single script. 
Note that after this change there will no longer be a "vanilla" RPC test: all test scripts now specify what agent they are using. This puts all agents on equal standing. ghstack-source-id: 109229474 Test Plan: Sandcastle and CircleCI Reviewed By: pritamdamania87 Differential Revision: D22283182 fbshipit-source-id: 7e3626bbbf37d88b892077a03725f0598576b370	2020-08-05 15:10:07 -07:00
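The structure described in this stack (an abstract fixture, concrete per-agent subclasses that are mixed in, and one entry point per agent listing its suites) can be sketched as follows. All class and function names here are hypothetical stand-ins chosen for the example, and the test bodies are placeholders rather than real RPC tests.

```python
import unittest
from abc import ABC, abstractmethod


class RpcAgentFixture(ABC):
    """Hypothetical base fixture: each agent states its own backend name."""

    @property
    @abstractmethod
    def rpc_backend_name(self) -> str:
        ...


class ProcessGroupFixture(RpcAgentFixture):
    @property
    def rpc_backend_name(self) -> str:
        return "PROCESS_GROUP"


class TensorPipeFixture(RpcAgentFixture):
    @property
    def rpc_backend_name(self) -> str:
        return "TENSORPIPE"


class RpcSuite(RpcAgentFixture):
    """Generic suite, written against the abstract fixture only."""

    def test_backend_is_named(self):
        self.assertTrue(self.rpc_backend_name)


class DistAutogradSuite(RpcAgentFixture):
    def test_backend_is_upper_case(self):
        self.assertEqual(self.rpc_backend_name, self.rpc_backend_name.upper())


def make_tests(fixture, suites):
    """Mix one agent fixture into every listed suite and return the classes."""
    generated = {}
    for suite in suites:
        name = fixture.__name__.replace("Fixture", "") + suite.__name__
        # The fixture comes first so its concrete property wins in the MRO.
        generated[name] = type(name, (fixture, suite, unittest.TestCase), {})
    return generated


# Entry point for one agent: this list is the only place suites are enrolled.
PROCESS_GROUP_TESTS = make_tests(ProcessGroupFixture, [RpcSuite, DistAutogradSuite])
```

In an entry point script, something like `globals().update(PROCESS_GROUP_TESTS)` would make the generated classes discoverable by a test loader. The generic suites are not `TestCase` subclasses themselves, so they only run once an agent fixture has been mixed in.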
Luca Wehrstedt	b93c7c54eb	[RPC tests] Merge tests for faulty agent into single script (#40817 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40817 Summary of the entire stack: -- This diff is part of an attempt to refactor the RPC tests. They currently suffer from several problems: - Several ways to specify the agent to use: there exists one "generic" fixture that uses the global variable TEST_CONFIG to look up the agent name, and is used for process group and Thrift, and then there are separate fixtures for the flaky agent and the TensorPipe one. - These two ways lead to having two separate decorators (`requires_process_group_agent` and `@_skip_if_tensorpipe_agent`) which must both be specified, making it unclear what the effect of each of them is and what happens if only one is given. - Thrift must override the TEST_CONFIG global variable before any other import (in order for the `requires_process_group_agent` decorator to work correctly) and for that it must use a "trap" file, which makes it even harder to track which agent is being used, and which is specific to Buck, and thus cannot be used in OSS by other agents. - Even if the TensorPipe fixture doesn't use TEST_CONFIG, it still needs to set it to the right value for other parts of the code to work. (This is done in `dist_init`). - There are a few functions in dist_utils.py that return some properties of the agent (e.g., a regexp to match against the error it returns in case of shutdown). These functions are effectively chained if/elses on the various agents, which has the effect of "leaking" some part of the Thrift agent into OSS. - Each test suite (RPC, dist autograd/dist optimizer, their JIT versions, remote module, ...) must be run on each agent (or almost; the faulty one is an exception) in both fork and spawn mode. Each of these combinations is a separate file, which leads to a proliferation of scripts. 
- There is no "master list" of what combinations make sense and should be run. Therefore it has happened that when adding new tests or new agents we forgot to enroll them into the right tests. (TensorPipe is still missing a few tests, it turns out). - All of these tiny "entry point" files contain almost the same duplicated boilerplate. This makes it very easy to get the wrong content into one of them due to a bad copy-paste. This refactoring aims to address these problems by: - Avoiding global state, defaults/override, traps, if/elses, ... and have a single way to specify the agent, based on an abstract base class and several concrete subclasses which can be "mixed in" to any test suite. - Instead of enabling/disabling tests using decorators, the tests that are specific to a certain agent are now in a separate class (which is a subclass of the "generic" test suite) so that they are only picked up by the agent they apply to. - Instead of having one separate entry point script for each combination, it uses one entry point for each agent, and in that script it provides a list of all the test suites it wants to run on that agent. And it does that by trying to deduplicate the boilerplate as much as possible. (In fact, the various agent-suite combinations could be grouped in any way, not necessarily by agent as I did here). It provides further advantages: - It puts all the agents on equal standing, by not having any of them be the default, making it thus easier to migrate from process group to TensorPipe. - It will make it easier to add more versions of the TensorPipe tests (e.g., one that disables the same-machine backends in order to test the TCP-based ones) without a further duplication of entry points, of boilerplate, ... Summary of this commit -- This diff does the changes described above for the faulty agent, which is its own strange beast. It merges all the test entry points (i.e., the combinations of agent, suite and fork/spawn) into a single file. 
It also modifies the test suites that are intended to be run only on the faulty agent, which used to inherit from its fixture, to inherit from the generic fixture, as they will be mixed in with the faulty fixture at the very end, inside the entry point script. ghstack-source-id: 109229477 Test Plan: Sandcastle and CircleCI Reviewed By: pritamdamania87 Differential Revision: D22283178 fbshipit-source-id: 72659efe6652dac8450473642a578933030f2c74	2020-08-05 15:10:04 -07:00
Luca Wehrstedt	edf6c4bc4d	[RPC tests] Merge TensorPipe tests into single entry point (#40816 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40816 Summary of the entire stack: -- This diff is part of an attempt to refactor the RPC tests. They currently suffer from several problems: - Several ways to specify the agent to use: there exists one "generic" fixture that uses the global variable TEST_CONFIG to look up the agent name, and is used for process group and Thrift, and then there are separate fixtures for the flaky agent and the TensorPipe one. - These two ways lead to having two separate decorators (`requires_process_group_agent` and `@_skip_if_tensorpipe_agent`) which must both be specified, making it unclear what the effect of each of them is and what happens if only one is given. - Thrift must override the TEST_CONFIG global variable before any other import (in order for the `requires_process_group_agent` decorator to work correctly) and for that it must use a "trap" file, which makes it even harder to track which agent is being used, and which is specific to Buck, and thus cannot be used in OSS by other agents. - Even if the TensorPipe fixture doesn't use TEST_CONFIG, it still needs to set it to the right value for other parts of the code to work. (This is done in `dist_init`). - There are a few functions in dist_utils.py that return some properties of the agent (e.g., a regexp to match against the error it returns in case of shutdown). These functions are effectively chained if/elses on the various agents, which has the effect of "leaking" some part of the Thrift agent into OSS. - Each test suite (RPC, dist autograd/dist optimizer, their JIT versions, remote module, ...) must be run on each agent (or almost; the faulty one is an exception) in both fork and spawn mode. Each of these combinations is a separate file, which leads to a proliferation of scripts. 
- There is no "master list" of what combinations make sense and should be run. Therefore it has happened that when adding new tests or new agents we forgot to enroll them into the right tests. (TensorPipe is still missing a few tests, it turns out). - All of these tiny "entry point" files contain almost the same duplicated boilerplate. This makes it very easy to get the wrong content into one of them due to a bad copy-paste. This refactoring aims to address these problems by: - Avoiding global state, defaults/override, traps, if/elses, ... and have a single way to specify the agent, based on an abstract base class and several concrete subclasses which can be "mixed in" to any test suite. - Instead of enabling/disabling tests using decorators, the tests that are specific to a certain agent are now in a separate class (which is a subclass of the "generic" test suite) so that they are only picked up by the agent they apply to. - Instead of having one separate entry point script for each combination, it uses one entry point for each agent, and in that script it provides a list of all the test suites it wants to run on that agent. And it does that by trying to deduplicate the boilerplate as much as possible. (In fact, the various agent-suite combinations could be grouped in any way, not necessarily by agent as I did here). It provides further advantages: - It puts all the agents on equal standing, by not having any of them be the default, making it thus easier to migrate from process group to TensorPipe. - It will make it easier to add more versions of the TensorPipe tests (e.g., one that disables the same-machine backends in order to test the TCP-based ones) without a further duplication of entry points, of boilerplate, ... Summary of this commit -- This diff does the changes described above for the TensorPipe agent. 
It fixes its fixture (making it inherit from the generic fixture) and merges all the entry point scripts into a single one, so that it's easier to have a clear overview of all the test suites which we run on TensorPipe (you'll notice that many are missing: the JIT ones, the remote module one, ...). ghstack-source-id: 109229476 Test Plan: Sandcastle and CircleCI Reviewed By: pritamdamania87 Differential Revision: D22283180 fbshipit-source-id: d5e9f9f4e6d4bfd6fbcae7ae56eed63d2567a02f	2020-08-05 15:08:32 -07:00
iurii zdebskyi	e995c3d21e	Add private API to support tensor lists: _foreach_add(TensorList tensors, Scalar scalar) (#41554 ) Summary: Initial PR for the Tensor List functionality. Motivation [GitHub issue](https://github.com/pytorch/pytorch/issues/38655) Current PyTorch optimizer implementations are not efficient in cases when we work with a lot of small feature tensors. Starting a lot of kernels slows down the whole process. We need to reduce the number of kernels that we start. As an example, we should be looking at [NVIDIA's Apex](https://github.com/NVIDIA/apex). In order to track progress, we will pick PyTorch's DCGAN model with the Adam optimizer and, once the optimizer is reimplemented with tensor lists, benchmark the model performance against the original model version, Apex's version with the original Adam optimizer, and its FusedAdam optimizer. In this PR - Adding `multi_tensor_apply` mechanism which will help to efficiently apply passed functor on a given list of tensors on CUDA. - Adding a first private API - `std::vector<Tensor> _foreach_add(TensorList tensors, Scalar scalar)` Tests Tested via unit tests Plan for the next PRs 1. Cover these ops with `multi_tensor_apply` support - exponent - division - mul_ - add_ - addcmul_ - addcdiv_ - Sqrt 2. Rewrite PyTorch optimizers to use for-each operators in order to get performance gains. Pull Request resolved: https://github.com/pytorch/pytorch/pull/41554 Reviewed By: cpuhrsch Differential Revision: D22829724 Pulled By: izdeby fbshipit-source-id: 47febdbf7845cf931958a638567b7428a24782b1	2020-08-04 15:01:09 -07:00
Mike Ruberry	4b6e5f42a4	Creates spectral ops test suite (#42157 ) Summary: In preparation for creating the new torch.fft namespace and NumPy-like fft functions, as well as supporting our goal of refactoring and reducing the size of test_torch.py, this PR creates a test suite for our spectral ops. The existing spectral op tests from test_torch.py and test_cuda.py are moved to test_spectral_ops.py and updated to run under the device generic test framework. Pull Request resolved: https://github.com/pytorch/pytorch/pull/42157 Reviewed By: albanD Differential Revision: D22811096 Pulled By: mruberry fbshipit-source-id: e5c50f0016ea6bb8b093cd6df2dbcef6db9bb6b6	2020-07-29 11:36:18 -07:00
Alexander Grund	86492410bc	Don't run tests with custom arguments with pytest (#41397 ) Summary: This patch basically removes the `-m pytest` parameters when `extra_unittest_args` is used (e.g. `--subprocess`) Fixes https://github.com/pytorch/pytorch/issues/41393 Pull Request resolved: https://github.com/pytorch/pytorch/pull/41397 Reviewed By: pbelevich Differential Revision: D22792133 Pulled By: ezyang fbshipit-source-id: 29930d703666f4ecc0d727356bbab4a5f7ed4860	2020-07-28 08:17:36 -07:00
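The behavior described above, dropping `-m pytest` whenever custom unittest arguments are passed, amounts to a small branch when the command line is assembled. The sketch below assumes a hypothetical helper `build_test_command`; it is not the code in `run_test.py`, only an illustration of the rule.

```python
import sys


def build_test_command(test_module, use_pytest, extra_unittest_args=()):
    """Return the argv used to launch one test file (hypothetical helper)."""
    argv = [sys.executable]
    # Custom unittest arguments such as --subprocess are not understood by
    # pytest, so run the file directly whenever any are present.
    if use_pytest and not extra_unittest_args:
        argv += ["-m", "pytest"]
    argv.append(test_module + ".py")
    argv.extend(extra_unittest_args)
    return argv
```

For example, `build_test_command("test_nn", True, ["--subprocess"])` yields the interpreter path followed by `test_nn.py` and `--subprocess`, with no `-m pytest` in between.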
Noman Arshad	1a8269a566	Replace blacklist with blocklist in test/run_test.py file. (#42011 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/41716 test/run_test.py file updated with an appropriate replacement for blacklist and whitelist. Pull Request resolved: https://github.com/pytorch/pytorch/pull/42011 Reviewed By: pbelevich Differential Revision: D22791836 Pulled By: malfet fbshipit-source-id: 8139649c5b70c876b711e25c33f3051ea8461063	2020-07-28 07:56:01 -07:00
Eli Uriegas	f71cccc457	test: Add option to continue testing through error (#41136 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41136 Running this within CI seems impossible since this script exits out after one failed test, so let's just add an option that CI can use to power through these errors. Should not affect current functionality. Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Test Plan: Imported from OSS Differential Revision: D22441694 Pulled By: seemethere fbshipit-source-id: 7f152fea15af9d47a964062ad43830818de5a109	2020-07-08 17:26:13 -07:00
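A minimal sketch of such an option, assuming a runner that executes tests in order and stops at the first failure by default. The flag spelling, `parse_args`, and `run_all` are illustrative choices for this example and may not match the actual script.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Run test modules in order")
    # Flag name chosen for illustration; check the script for the real spelling.
    parser.add_argument(
        "--continue-through-error",
        action="store_true",
        help="keep running the remaining tests after a failure",
    )
    return parser.parse_args(argv)


def run_all(tests, continue_through_error):
    """Run (name, callable) pairs; each callable returns an exit code."""
    failures = []
    for name, run in tests:
        if run() != 0:
            failures.append(name)
            if not continue_through_error:
                break  # default behavior: stop at the first failure
    return failures
```

Leaving the default as "stop at the first failure" keeps existing behavior unchanged, which is the property the commit message calls out.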
David Reiss	5e03a1e926	Add support for int[]? arguments in native_functions.yaml (#37174 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37174 ghstack-source-id: 106938112 Test Plan: Upcoming diffs use this for upsampling. Differential Revision: D21210002 fbshipit-source-id: d6a55ab6420c05a92873a569221b613149aa0daa	2020-07-07 13:52:20 -07:00
Christian Sarofeen	b9b4f05abf	[nvFuser] Working towards reductions, codegen improvements (#40864 ) Summary: Have basic reduction fusion working, and have improved code generator to approach performance of eager mode reductions. Coming soon will be pointwise-reduction fusions in a way that should prevent the possibility of hitting regressions. Also working on performant softmax kernels in the code generator which may be our next fusion target. Pull Request resolved: https://github.com/pytorch/pytorch/pull/40864 Reviewed By: ngimel Differential Revision: D22392877 Pulled By: soumith fbshipit-source-id: 457448a807d628b1035f6d90bc0abe8a87bf8447	2020-07-06 14:52:49 -07:00
Jeff Daily	ac8c8b028d	[ROCm] restore jit tests (#40447 ) Summary: Remove `skipIfRocm` from most jit tests and enable `RUN_CUDA_HALF` tests for ROCm. These changes passed more than three rounds of CI testing against the ROCm CI. CC ezyang xw285cornell sunway513 Pull Request resolved: https://github.com/pytorch/pytorch/pull/40447 Differential Revision: D22190711 Pulled By: xw285cornell fbshipit-source-id: bac44825a2675d247b3abe2ec2f80420a95348a3	2020-06-27 01:03:59 -07:00
Ilia Cherniavskii	d8c384544e	Destroy CUDA events after profiling (#39962 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39962 Adding a simple wrapper with ref count for cuda event and destroying cuda event after the last copy is destroyed Test Plan: CI cuda profiler tests Differential Revision: D22027092 Pulled By: ilia-cher fbshipit-source-id: e0810388aa60b2291eb010896e13af1fad92e472	2020-06-23 10:44:39 -07:00
Pritam Damania	e632bf8d57	Add thrift and tensorpipe backend tests for test_ddp_under_dist_autograd. (#40210 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40210 ghstack-source-id: 106300839 Test Plan: waitforbuildbot Differential Revision: D22110065 fbshipit-source-id: d9ebd009b8d451c75708eadc7eb3f2b788e875aa	2020-06-20 22:59:59 -07:00
Ivan Kobzarev	3852215170	[vulkan] jit passes for vulkan conv2 prepack and fuse with clamp (#39282 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39282 Test Plan: Imported from OSS Differential Revision: D21962424 Pulled By: IvanKobzarev fbshipit-source-id: 2d20e827d2c3836b7e6b443293377c68dc1ffa5a	2020-06-20 14:12:21 -07:00
Jeff Daily	89ef8f8141	add test_openmp to ROCM_BLACKLIST (#40204 ) Summary: This test is flaky for rocm platform. Add to blacklist until it can be further reviewed. CC ezyang xw285cornell sunway513 Pull Request resolved: https://github.com/pytorch/pytorch/pull/40204 Differential Revision: D22108295 Pulled By: xw285cornell fbshipit-source-id: 802444a7b41260edcb6ce393237784f3e6c52a74	2020-06-18 15:15:35 -07:00
Shihao Xu	00651b8c93	[distribtued.nn] Implement TorchScript-compatible RemoteModule API (#37139 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37139 See design doc in https://github.com/pytorch/pytorch/issues/37136 ghstack-source-id: 105926270 Test Plan: TODO: - Make the generated Interface usable. https://github.com/pytorch/pytorch/pull/37139#discussion_r434190978 - - Avoid generating the same template instances for Module that is not scriptable. - Remove "infer_module_interface_cls". - Use Python format instead of a CodeTemplate - Use Python tempfile to track and delete file. Does it work if there is crash. ``` buck test mode/dev-nosan //caffe2/test/distributed/nn/jit:test_instantiator buck build mode/dev-nosan //caffe2/test/distributed/nn/jit:test_instantiator && \ buck-out/gen/caffe2/test/distributed/nn/jit/test_instantiator\#binary.par -r test_instantiate_scripted_remote_module_template buck build mode/dev-nosan //caffe2/test/distributed/nn/jit:test_instantiator && \ buck-out/gen/caffe2/test/distributed/nn/jit/test_instantiator\#binary.par -r test_instantiate_non_scripted_remote_module_template ``` ``` buck test mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_spawn ``` ``` buck test mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \ buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_user_provided_global_unique_name buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \ buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_forward_async_script buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \ buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_forward_sync_script buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \ 
buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_forward_with_kwargs buck build mode/dev-nosan //caffe2/test/distributed/nn/api:remote_module_fork && \ buck-out/gen/caffe2/test/distributed/nn/api/remote_module_fork\#binary.par -r test_user_provided_global_unique_name ``` ``` buck test mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork ``` buck test mode/opt-asan //caffe2/test:jit -- 'test_script_forward_method_replacement buck build mode/dev-nosan //caffe2/test:jit && \ buck-out/gen/caffe2/test/jit\#binary.par -r 'test_script_forward_method_replacement' buck build mode/dev-nosan //caffe2/test:jit && \ buck-out/gen/caffe2/test/jit\#binary.par -r 'test_imported_classes' Differential Revision: D20499658 fbshipit-source-id: dd9383ae4eb2343366c11127664f845b91ca3b0a	2020-06-15 19:07:35 -07:00
Ilia Cherniavskii	cc3fc786b7	[resubmit] [pytorch][PR] Fix for num_threads==1 in OpenMP "parallel for" (#39533 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39533 Test Plan: CI Reviewed By: ngimel Differential Revision: D21889269 fbshipit-source-id: 5ba13a0a3ec11edd0b6a7c3fdb35396b847a3d9e	2020-06-15 13:14:59 -07:00
HC Zhu	acc13ac828	[PyTorch] Make DDP reducer work under distributed autograd (#37998 ) Summary: ## Why doesn’t DDP work under dist_autograd? DDP follows the steps below 1. [DDP Python constructor](`8d6a8d2b3f/torch/nn/parallel/distributed.py (L389-L393)`) (on a module) creates a [C++ Reducer](https://github.com/pytorch/pytorch/blob/master/torch/csrc/distributed/c10d/reducer.cpp), which holds references to all parameters (or variables in C++ code). 2. The reducer installs a post hook on each model parameter. 3. The backward run starts and triggers the post hooks installed above. 4. The post hook of a parameter simply marks the parameter ready for all-reduce. 5. Once all parameters in a bucket are ready, an all-reduce process starts by reading variable `.grad` and writing to variable `.grad`. But under dist_autograd, `.grad` of a variable is not populated at all. Instead, grads are in a global map in distributed context from variables to their grads. ## Solution of this PR The distributed engine sets a thread_local variable in a backward run indicating we're running in distributed mode. DDP reducer can then appropriately use `.grad` or the distributed context based on the thread local. More precisely, the thread local is set before calling the post hooks installed by the DDP reducer so that DDP post hooks can retrieve this thread local. Pull Request resolved: https://github.com/pytorch/pytorch/pull/37998 Test Plan: ``` python test/distributed/test_ddp_under_dist_autograd.py ``` FB repo ``` buck test caffe2/test/distributed/... ``` DDP accuracy benchmark workflow run ``` flow-cli canary pytorch.benchmark.accuracy_comparison.workflow --parameters-json '{"node_world_size": 4, "dist_backend": "nccl"}' --run-as-secure-group fblearner_flow --entitlement gpu_prod ``` f196173157 Reviewed By: pritamdamania87 Differential Revision: D21513795 Pulled By: hczhu fbshipit-source-id: fe21e68ecdc9274182db4d4bb5a1e2d68ef927a2	2020-06-10 08:38:14 -07:00
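The thread-local handshake between the engine and the reducer hook can be illustrated with Python's `threading.local`. The actual change is in C++; every name below (`Param`, `DistContext`, `run_post_hook`, `read_grad_hook`) is a hypothetical stand-in, and the "gradients" are plain floats.

```python
import threading

_state = threading.local()


def in_distributed_backward():
    """True only while the engine runs a hook in distributed mode."""
    return getattr(_state, "distributed", False)


class Param:
    def __init__(self, name):
        self.name = name
        self.grad = None


class DistContext:
    """Stand-in for the per-context map from parameters to gradients."""

    def __init__(self):
        self.grads = {}


def run_post_hook(param, hook, context=None):
    """Engine side: set the flag around the hook call, then restore it."""
    previous = in_distributed_backward()
    _state.distributed = context is not None
    try:
        return hook(param, context)
    finally:
        _state.distributed = previous


def read_grad_hook(param, context):
    """Reducer side: pick the gradient source based on the thread-local flag."""
    if in_distributed_backward():
        return context.grads[param.name]
    return param.grad
```

Setting the flag immediately before the hook runs, and restoring it afterwards, means the hook sees the correct mode without any change to its signature.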
Jithun Nair	545a3e1eca	Remove test_nccl from ROCM_BLACKLIST and enable only a couple of test_nccl tests (#39354 ) Summary: All individual test_nccl unit tests have been disabled for ROCm in `bf9395438f`. test_nccl was also added to the ROCM_BLACKLIST in `87b198d309`. However, the issue only arises when running the test_nccl suite as a whole (as opposed to any one test individually). More details in comments here: https://github.com/pytorch/pytorch/pull/38689 This PR enables the test_nccl suite with only two tests to work around the as-yet unresolved issue above, while allowing at least one test_nccl collective test to run on ROCm. This is also needed as a precursor for: https://github.com/pytorch/pytorch/pull/38515 Pull Request resolved: https://github.com/pytorch/pytorch/pull/39354 Differential Revision: D21843194 Pulled By: mrshenli fbshipit-source-id: b28d1e073d8d0fdc1b59928fc3b00187cfd02a35	2020-06-05 13:52:23 -07:00
mattip	ada2652ca6	Restore docs coverage test via sphinx (#39331 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39331 Fixes gh-37590 Adds an extra `make coverage` to document building, which uses the built-in facility in sphinx to check docstring coverage. Also fixes a failure to import `torch/jit/supported_ops.py` which broke the [Torchscript Builtins](https://pytorch.org/docs/stable/jit_builtin_functions.html) page. This also adds the required `SPHINXOPTS` to turn warnings into errors, but this is commented out. Note that since documentation of `torchvision` is merged in here, failures there would cause failures here if this is made active. Some thought might be needed about pinning the torchvision version merged into documentation. The first commit should fail, since the "ScriptModule" class is commented out. I did that in order to check that a CI failure is properly reported. Pull Request resolved: https://github.com/pytorch/pytorch/pull/38244 Differential Revision: D21640589 Pulled By: ezyang fbshipit-source-id: 1e240d81669b5f21404d596de4a27d192dc9fd8a	2020-06-04 10:49:38 -07:00
Oguz Ulgen	4a0a38c17a	Revert D21652452: [pytorch][PR] Fix for num_threads==1 in OpenMP "parallel for" Test Plan: revert-hammer Differential Revision: D21652452 Original commit changeset: 2cda7777c0ea fbshipit-source-id: fdd9a0346ce32a962766f57e13357dd2bc60d8b8	2020-06-03 22:51:51 -07:00
Luca Wehrstedt	5beb3b0c53	[TensorPipe] Re-enable dist optimizer tests (#39441 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39441 This is the last test suite to be enabled for TensorPipe. ghstack-source-id: 105166757 Test Plan: Ran the tests, hundreds of times each, in different build modes. Differential Revision: D21858975 fbshipit-source-id: ee0a7e64b77b4b1974f031207031cc14afb3a8c2	2020-06-03 09:00:52 -07:00
Luca Wehrstedt	b1dab266f7	[TensorPipe] Re-enable dist autograd tests (#39440 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39440 After the RPC tests, re-enable the second test suite: dist autograd. ghstack-source-id: 105165393 Test Plan: Ran the tests, several times each, in different build configs. Differential Revision: D21858974 fbshipit-source-id: 409377d564c36fecae51b9e4c776d94187b434a2	2020-06-03 08:59:19 -07:00
Luca Wehrstedt	3f099879f7	[TensorPipe] Re-enable RPC tests (#39406 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39406 For now, just the RPC test (no dist autograd or dist optimizer). I removed the skipping decorator from all the tests except those that explicitly use the ProcessGroup options. Includes #39027. ghstack-source-id: 105159974 Test Plan: Ran the tests several hundred times, in various build modes. Saw some flakes, but at a rate of about 0.1% Differential Revision: D21716069 fbshipit-source-id: 9d2a99e112049a63745772c18e7a58266ed8e74e	2020-06-03 07:14:30 -07:00
mattip	a952f9bb06	Fix for num_threads==1 in OpenMP "parallel for" (#36479 ) Summary: fixes gh-32284 Move the non-parallel stanza out of the parallel context, and use `num_threads` to limit nesting `parallel for`s. The nesting caused a memory leak in the test script in the issue. This should probably have a test somewhere: are there tests for ParallelOpenMP? Pull Request resolved: https://github.com/pytorch/pytorch/pull/36479 Differential Revision: D21652452 Pulled By: ilia-cher fbshipit-source-id: 2cda7777c0eafbe268550a82fed306e52fb6eb25	2020-06-02 18:56:13 -07:00
Shen Li	bb0377bb24	Expose torch.futures.Future (#39008 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39008 This commit adds a `torch.futures.Future` type and exposes its ctor, `wait`, `then`, and `set_result` APIs. This type is currently a wrapper of `c10::ivalue::Future` and mainly used by RPC for now. Later, we could revamp c10d APIs to return this `Future` type as well. More utils will be added into `torch.futures` package in followup PRs. Test Plan: Imported from OSS Differential Revision: D21723022 Pulled By: mrshenli fbshipit-source-id: 92e56160544e9bf00d11db3e8347a1b9707882c9	2020-06-02 10:12:56 -07:00
Nikita Shulga	39d037253c	Test PyTorch using python-3.8 + GCC-9 on Bionic (Reland) (#39121 ) Summary: Enable new test config in .circleci/config.yml Skip scanning several 3rd-party packages to work around https://bugs.python.org/issue40350 Remove pre python-3.5 checks from `test.sh` and update `scikit-learn` to python-3.8 compatible version This is a reland of https://github.com/pytorch/pytorch/pull/39030 Pull Request resolved: https://github.com/pytorch/pytorch/pull/39121 Differential Revision: D21820375 Pulled By: malfet fbshipit-source-id: d0be79b7d204cf692e055d42b9be42402dc4c1c0	2020-06-01 11:11:12 -07:00
Rohan Varma	988e31c788	Revert D21752017: [pytorch][PR] Test PyTorch using python-3.8 + GCC-9 on Bionic Test Plan: revert-hammer Differential Revision: D21752017 Original commit changeset: 56c841636349 fbshipit-source-id: adf08e03ba9610050fc5440ef453789f805fdc6b	2020-05-27 17:42:22 -07:00
Nikita Shulga	30dd4acbf6	Test PyTorch using python-3.8 + GCC-9 on Bionic (#39030 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39030 Differential Revision: D21752017 Pulled By: malfet fbshipit-source-id: 56c841636349e24c9ebef8dac18c283de3664fa5	2020-05-27 15:56:37 -07:00
Nikolay Korovaiko	4fcd1c3123	run te only for profiling executor (#38591 ) Summary: * Disable the mode where PE can still run the old fuser. * Clean up Pull Request resolved: https://github.com/pytorch/pytorch/pull/38591 Differential Revision: D21643664 Pulled By: Krovatkin fbshipit-source-id: 6753ed6bdc544698a1340e59a624608ff3abf7f9	2020-05-26 18:35:25 -07:00
Shen Li	40ce90bfc1	Revert D21560096: [Tensorpipe Agent] Enabling tests with OSS CI Test Plan: revert-hammer Differential Revision: D21560096 Original commit changeset: 7d61cc1c354e fbshipit-source-id: 6adfd87e354545031203d65d04f0bad4687a93cd	2020-05-19 19:39:33 -07:00
Jeff Daily	87b198d309	add distributed/test_nccl to ROCM_BLACKLIST (#38730 ) Summary: CC ezyang xw285cornell sunway513 Work-around for recent ROCm CI failures due to `9cfc10d52e` (https://github.com/pytorch/pytorch/issues/37294). Replaces full revert suggested by PR https://github.com/pytorch/pytorch/issues/38689. Pull Request resolved: https://github.com/pytorch/pytorch/pull/38730 Differential Revision: D21648707 Pulled By: xw285cornell fbshipit-source-id: 627b11b229c7eadca1f6e0c6192c6b5b6416e6a1	2020-05-19 14:45:50 -07:00
Omkar Salpekar	87aa2d25ae	[Tensorpipe Agent] Enabling tests with OSS CI (#38447 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38447 This PR modifies `run_tests.py` to enable running Tensorpipe Agent tests with the OSS CI. ghstack-source-id: 104321881 Test Plan: CI Differential Revision: D21560096 fbshipit-source-id: 7d61cc1c354e9353c4a586dd2b56690c28d51d10	2020-05-19 13:34:06 -07:00
Nikita Shulga	72e5b7ae5b	Add option to run python unittests in parallel (#37180 ) Summary: So far results look quite promising: test_nn consists of purely sequential tests and can be accelerated 3x Pull Request resolved: https://github.com/pytorch/pytorch/pull/37180 Differential Revision: D21437871 Pulled By: malfet fbshipit-source-id: 8679a8af355f839f2c9dae3bf36d2e102af05425	2020-05-06 22:14:11 -07:00
Kimish Patel	b1b6bc36a5	Enable xnnpack_integration test in CI. (#37838 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37838 Test Plan: oss: python test/test_xnnpack_integration.py Reviewed By: xcheng16 Differential Revision: D21405850 fbshipit-source-id: ba4ba06692b49315f110653d9492b2e14b618574	2020-05-06 13:53:03 -07:00
ashishfarmer	402f635bbe	Enable ahead of time compilation for HIPExtensions using ninja (#37800 ) Summary: This pull request enables ahead of time compilation of HIPExtensions with ninja by setting appropriate compilation flags for ROCm environment. Also, this enables the unit test for testing cuda_extensions on ROCm as well as removing test for ahead of time compilation of extensions with ninja from ROCM_BLACKLIST ezyang jeffdaily Pull Request resolved: https://github.com/pytorch/pytorch/pull/37800 Differential Revision: D21408148 Pulled By: soumith fbshipit-source-id: 146f4ffb3418f3534e6ce86805d3fe9c3eae84e1	2020-05-05 20:53:35 -07:00
ashishfarmer	bbd2350c99	Disable tests failing on test2 in ROCm CI (#37427 ) Summary: This pull request disables the unit tests that were observed to be failing once `test2` was enabled. These tests will be looked at one by one and fixed as soon as possible; until then they are disabled to unblock `test2`. The pull request also disables fftPlanDestroy for rocFFT to avoid double-freeing FFT handles. cc: ezyang jeffdaily Pull Request resolved: https://github.com/pytorch/pytorch/pull/37427 Differential Revision: D21302909 Pulled By: ezyang fbshipit-source-id: ecadda3778e65b7f4f97e24b932b96b9ce928616	2020-04-29 09:56:28 -07:00
Nikolay Korovaiko	edc5ef1afb	run the simple executor for jit tests by default, add profiling jobs … (#37017 ) Summary: …for fusion tests fix flake8 warnings fix ci failures fix test_determination.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/37017 Differential Revision: D21238446 Pulled By: Krovatkin fbshipit-source-id: 393e6135883dc5ac57bdff580de96c66829d454c	2020-04-28 19:16:52 -07:00
Nikita Shulga	47c4dca1ab	Remove python-2 or python<3.5 checks from unit tests (#37252 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37252 Test Plan: CI Differential Revision: D21241083 Pulled By: malfet fbshipit-source-id: 44164b822f7905288abb2beda0175d2162d86143	2020-04-24 17:42:04 -07:00
Jerry Zhang	230b68168b	[quant] Refactor test files (#36964 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36964 Rename and restructure quantization related tests https://github.com/pytorch/pytorch/issues/31625 Test Plan: . Imported from OSS Differential Revision: D21192509 fbshipit-source-id: 148c93e86e0ea68ab18a067fe74a8035a29a1e4e	2020-04-23 10:28:56 -07:00
David Reiss	e75fb4356b	Remove (most) Python 2 support from Python code (#35615 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35615 Python 2 has reached end-of-life and is no longer supported by PyTorch. Now we can clean up a lot of cruft that we put in place to support it. These changes were all done manually, and I skipped anything that seemed like it would take more than a few seconds, so I think it makes sense to review it manually as well (though using side-by-side view and ignoring whitespace change might be helpful). Test Plan: CI Differential Revision: D20842886 Pulled By: dreiss fbshipit-source-id: 8cad4e87c45895e7ce3938a88e61157a79504aed	2020-04-22 09:23:14 -07:00
Jerry Zhang	57c50db441	[reland][quant] Add backward compatiblity test (#36842 ) Summary: re-created the same PR: https://github.com/pytorch/pytorch/pull/36639 because ghimport does not support importing binary files right now Pull Request resolved: https://github.com/pytorch/pytorch/pull/36842 Test Plan: python test/quantization/test_backward_compatibility.py Differential Revision: D21100689 Pulled By: jerryzh168 fbshipit-source-id: 625a0f9da98138c9c2891b9d99fc45d85fa27cca	2020-04-17 21:24:31 -07:00
Xingying Cheng	86f354c530	Python binding api to optimize for mobile model on script module. (#36357 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36357 ghstack-source-id: 101907180 Creating a python api entry to optimize mobile model which takes a scripted module as argument and returns an optimized scripted module. The initial optimization features includes inserting and folding prepack ops. Test Plan: python test/test_optimizer.py Differential Revision: D20946076 fbshipit-source-id: 93cb4a5bb2371128f802d738eb26d0a4f3b2fe10	2020-04-17 16:21:27 -07:00
Mike Ruberry	f00014b790	Revert D21080503: [pytorch][PR] [quant] Add backward compatiblity test Test Plan: revert-hammer Differential Revision: D21080503 Original commit changeset: 1dca08208bcc fbshipit-source-id: 5cd8c22130ff28b9231f657f80961e94b65b5792	2020-04-16 22:03:12 -07:00
Jerry Zhang	484a00b2d3	[quant] Add backward compatiblity test (#36771 ) Summary: re-created the same PR: https://github.com/pytorch/pytorch/pull/36639 because ghimport does not support importing binary files right now Pull Request resolved: https://github.com/pytorch/pytorch/pull/36771 Test Plan: python test/quantization/test_backward_compatibility.py Differential Revision: D21080503 Pulled By: jerryzh168 fbshipit-source-id: 1dca08208bccead60bba03e5fb5d39e1a1d7c20d	2020-04-16 19:00:30 -07:00
Haixin Liu	455d4aab64	[PyTorch Numeric Suite] Add weight compare API (#36186 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36186 Start PyTorch Numeric Suite under PyTorch quantization and add weight compare API to it. ghstack-source-id: 102062165 Test Plan: buck test mode/dev caffe2/test:quantization -- 'test_compare_weights' Differential Revision: D20903395 fbshipit-source-id: 125d84569837142626a0e2119b3b7657a32dbf4e	2020-04-13 19:02:00 -07:00
Thomas Viehmann	d070c0bcf0	ROCm: enable cpp_extensions.load/load_inline (#35897 ) Summary: This enables cpp_extensions.load/load_inline. This works by hipify-ing cuda sources. Also enable tests. CuDNN/MIOpen extensions aren't yet supported, I propose to not do this in this PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35897 Differential Revision: D20983279 Pulled By: ezyang fbshipit-source-id: a5d0f5ac592d04488a6a46522c58e2ee0a6fd57c	2020-04-13 11:44:08 -07:00
David Reiss	fab06bfb75	Add utility for bundling sample inputs with models (#35631 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35631 Bundling sample inputs with our models with a standardized interface will make it possible to write benchmarking and code-coverage tools that call all models in a uniform way. The intent is to make this a standard for mobile models within Facebook. Putting it in torch/utils so tests can run on GitHub and because it might be useful for others as well. `augment_model_with_bundled_inputs` is the primary entry point. See its docstring for usage information and the test for some example uses. One design question I had was how much power should be available for automatic deflating and inflating of inputs. The current scheme gives some automatic handling and a reasonable escape hatch ("_bundled_input_inflate_format") for top-level tensor arguments, but no automatic support for (e.g.) tensors in tuples or long strings. For more complex cases, we have the ultimate escape hatch of just defining _generate_bundled_inputs in the model. Another design question was whether to add the inputs to the model or wrap the model in a wrapper module that had these methods and delegated calls to `forward`. Because models can have other exposed methods and attributes, the wrapper seemed too onerous. Test Plan: Unit test. Differential Revision: D20925013 Pulled By: dreiss fbshipit-source-id: 4dbbb4cce41e5752133b4ecdb05e1c92bac6b2d5	2020-04-08 13:10:36 -07:00
Johannes M Dieterich	45fc881f05	[ROCm] Hotfix: Black list tensorexpr test set that has failures on ROCm (#36049 ) Summary: Test set got enabled with ROCm failures in https://github.com/pytorch/pytorch/pull/35914 - black list it for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/36049 Differential Revision: D20869814 Pulled By: zou3519 fbshipit-source-id: fcdb2abc9f3407344b56cf8d48b7740008317020	2020-04-06 13:26:05 -07:00
David Reiss	a054d05707	Add torch.utils.show_pickle for showing pickle contents in saved models (#35168 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35168 Sometimes when a saved model isn't working, it's nice to be able to look at the contents of the pickle files. Unfortunately, pickletools output isn't particularly readable, and unpickling is often either not possible or runs so much post-processing code that it's not possible to tell exactly what is present in the pickled data. This script uses a custom Unpickler to unpickle (almost) any data into stub objects that have no dependency on torch or any other runtime types and suppress (almost) any postprocessing code. As a convenience, the wrapper can search through zip files, supporting command lines like `python -m torch.utils.show_pickle /path/to/model.pt1@*/data.pkl` When the module is invoked as main, we also install a hack in pprint to allow semi-reasonable formatting of our stub objects. Test Plan: Ran it on a data.pkl, constants.pkl, and a debug pkl Differential Revision: D20842550 Pulled By: dreiss fbshipit-source-id: ef662d8915fc5795039054d1f8fef2e1c51cf40a	2020-04-03 15:11:20 -07:00
Mikhail Zolotukhin	ba3cec867f	Reenable test/test_tensorexpr.py (#35914 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35914 Test Plan: Imported from OSS Differential Revision: D20827188 Pulled By: ZolotukhinM fbshipit-source-id: ffcc1bb0396a0a19afb577a7ab4ca95c7e4ced37	2020-04-03 12:20:31 -07:00
Will Feng (FAIAR)	2fa3c1570d	Refactor C++ API parity test mechanism and turn it on in CI again (#35190 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35190 The following are the main changes: - The main logic of C++ API parity test mechanism is moved from `test/test_cpp_api_parity.py` to `test/cpp_api_parity/module_impl_check.py` and `test/cpp_api_parity/functional_impl_check.py`, so that there is a clear separation between module tests and functional tests, although they still share a lot of common utility functions which are all in `test/cpp_api_parity/utils.py`. - Module init tests (i.e. testing whether C++ module accepts the same constructor options as the corresponding Python module) is removed and will be added again in the future. - `cpp_constructor_args` / `cpp_options_args` / `cpp_function_call` are added as appropriate to all test params dict in `torch/testing/_internal/common_nn.py`, to indicate how to run C++ API parity test for this test params dict. Test Plan: Imported from OSS Differential Revision: D20588198 Pulled By: yf225 fbshipit-source-id: 11238c560c8247129584b9b49df73fff40c4d81d	2020-04-03 11:20:36 -07:00
Feng Tian	762270c51f	add c10d dynamic loading mechanism and unit test (#28068 ) Summary: The original behavior of pytorch c10d only supports built-in c10d backends, such as nccl/gloo/mpi. This patch extends c10d to support dynamically loading 3rd party communication libraries which are derived from the ProcessGroup base class. The related RFC is in: https://github.com/pytorch/pytorch/issues/27955 This way, the user just needs to specify a 3rd party c10d backend name when invoking torch.distributed.init_process_group(). The proposed logic will try to load the corresponding c10d backend cpp extension automatically. For how to develop a new 3rd party c10d backend through a cpp extension, please refer to test/cpp_extensions/cpp_c10d_extension.cpp Pull Request resolved: https://github.com/pytorch/pytorch/pull/28068 Differential Revision: D19174838 Pulled By: agolynski fbshipit-source-id: 3409a504a43ce7260e6f9d1207c00e87471fac62	2020-04-02 15:46:51 -07:00
Nick Korovaiko	ddcad5b9ca	temp disable test_tensorexpr.py Test Plan: test on CI Reviewed By: soumith Differential Revision: D20823336 fbshipit-source-id: 65c04bc57c6a120003cb561613645d2d7e60189c	2020-04-02 14:28:22 -07:00
Christian Sarofeen	6d24f8fe21	Infrastructure for a new CUDA Fuser (#34785 ) Summary: This PR contains the infrastructure of a new CUDA fuser. This CUDA fuser is based on many of the same principles of TensorExpressions and Halide, however the implementation is ground up. The fusion pass itself is similar to the default CUDA fuser, however, it has undergone some refactoring and is using the new code generation infrastructure. For those who are interested in how the code generation in this PR works, I would recommend reviewing _test/cpp/jit/test_gpu_fusion.cpp_ as well as the long comment section at the beginning of _torch/csrc/jit/codegen/cuda/transform_replay.h_ One of the largest differences between our approach and that of TVM/Halide, is the concept of "TensorView". TensorView from a high level should be thought of similarly to how we think of working with Tensors in PyTorch. It's an N-D object which can undergo transformations that change its dimensionality. Dimensionality changes are done through the operations split/merge/reorder/computeAt. These transformations are similar to split/fuse/reorder/compute_at of TVM, they modify how a tensor is iterated over to generate GPU code. Interestingly, in our scheme these transformations are applied to tensors and only impact how that tensor is generated. Warning: This PR is purposefully not feature complete with the current fuser. We wanted to separate out the infrastructure from the fusion capabilities. Once in, smaller incremental PRs will be submitted to expand capabilities of the fuser. Short term goals: Parity with current CUDA fuser (including performance): - Dynamic shapes (no recompilation) - Implicit handling of broadcast (broadcasted tensors are treated as tensors of the broadcasted size in the generated code) - Dropout Mid-term goals: - Transposes fused with pointwise operations where transpose involves only 2 axes (across the fused operation). - 1-D reductions fused with pointwise operations Pull Request resolved: https://github.com/pytorch/pytorch/pull/34785 Reviewed By: ZolotukhinM Differential Revision: D20650977 Pulled By: soumith fbshipit-source-id: ee39c95a880e1b9822e874ed4cc180971572bf63	2020-04-02 09:22:42 -07:00
Nick Korovaiko	2f50c11954	add test_tensorexpr.py (#35776 ) Summary: Adding `test_tensorexpr.py` to our CI. There are a few complications. The first is that we now always run `SimpleIREval` as part of the simplifier, so the counts will always be greater than one. We could invest some effort to differentiate between a real codegen call to `SimpleIREval` and calls in the simplifier, but it's probably not that important. The second change turns a failure to retrieve a counter into a default value of 0: since the tests are structured to test for either an LLVM or a SimpleIREval backend, it seems appropriate not to fail the test too early. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35776 Differential Revision: D20799333 Pulled By: Krovatkin fbshipit-source-id: 2a94ff98e647180c6e6aea141a411c3376c509f9	2020-04-01 22:05:37 -07:00
Jerry Zhang	ab26dfb44e	[quant] Move quantization tests into test/quantization (#35812 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35812 Test Plan: . Imported from OSS Differential Revision: D20795329 fbshipit-source-id: 42cc905c44ce7b86720aeef512d747ff6788d7a2	2020-04-01 12:44:19 -07:00
Michael Suo	319aee1afb	Revert D20771828: [quant] Move quantization tests into test/quantization Test Plan: revert-hammer Differential Revision: D20771828 Original commit changeset: 5f1df5e86c29 fbshipit-source-id: d14f915f291ae8a90026c5b65624459211495f47	2020-03-31 23:01:00 -07:00
Jerry Zhang	fef6c617d4	[quant] Move quantization tests into test/quantization (#35688 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35688 Test Plan: . Imported from OSS Differential Revision: D20771828 fbshipit-source-id: 5f1df5e86c29f7bdfbdc6563450e909b3bfdc07a	2020-03-31 20:30:57 -07:00

669 Commits