Summary:
Part of migrating from Circle.
Once we get a successful force_on_cpu test, we can move it to trunk only.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65094
Reviewed By: seemethere
Differential Revision: D31086289
Pulled By: janeyx99
fbshipit-source-id: e1d135cc844d51f0b243b40efb49edca277d9de8
Summary:
Moving distributed to its own job.
- [x] ensure there is a distributed test job for every default test job matrix (on GHA)
- [x] ensure that circleci jobs work for distributed as well
- [x] waiting for test distributed to have its own run_test.py launch options, see https://github.com/pytorch/pytorch/issues/63147
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62896
Reviewed By: seemethere
Differential Revision: D30230856
Pulled By: walterddr
fbshipit-source-id: 0cad620f6cd9e56c727c105458d76539a5ae976f
Summary:
Another step forward in fixing https://github.com/pytorch/pytorch/issues/62359
Disclaimer: this only works with GHA for now, as circleci would require changes in probot.
The test plan can be seen in a previous version of the description, which I modified to include linked issues. I've since removed them because the actual PR doesn't fix any of them.
It works! In the [periodic 11.3 test1](https://github.com/pytorch/pytorch/pull/62851/checks?check_run_id=3263109970), we get this in the logs and we see that PYTORCH_IGNORE_DISABLED_ISSUES is properly set:
```
test_jit_cuda_extension (__main__.TestCppExtensionJIT) ... Using /var/lib/jenkins/.cache/torch_extensions/py36_cu113 as PyTorch extensions root...
Creating extension directory /var/lib/jenkins/.cache/torch_extensions/py36_cu113/torch_test_cuda_extension...
Detected CUDA files, patching ldflags
Emitting ninja build file /var/lib/jenkins/.cache/torch_extensions/py36_cu113/torch_test_cuda_extension/build.ninja...
Building extension module torch_test_cuda_extension...
Using envvar MAX_JOBS (30) as the number of workers...
[1/3] c++ -MMD -MF cuda_extension.o.d -DTORCH_EXTENSION_NAME=torch_test_cuda_extension -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /opt/conda/lib/python3.6/site-packages/torch/include -isystem /opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.6/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.6/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=1 -fPIC -std=c++14 -c /var/lib/jenkins/workspace/test/cpp_extensions/cuda_extension.cpp -o cuda_extension.o
[2/3] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=torch_test_cuda_extension -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /opt/conda/lib/python3.6/site-packages/torch/include -isystem /opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.6/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.6/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_50,code=sm_50 -gencode=arch=compute_52,code=compute_52 -gencode=arch=compute_52,code=sm_52 --compiler-options '-fPIC' -O2 -std=c++14 -c /var/lib/jenkins/workspace/test/cpp_extensions/cuda_extension.cu -o cuda_extension.cuda.o
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
[3/3] c++ cuda_extension.o cuda_extension.cuda.o -shared -L/opt/conda/lib/python3.6/site-packages/torch/lib -lc10 -lc10_cuda -ltorch_cpu -ltorch_cuda_cu -ltorch_cuda_cpp -ltorch -ltorch_python -L/usr/local/cuda/lib64 -lcudart -o torch_test_cuda_extension.so
Loading extension module torch_test_cuda_extension...
ok (26.161s)
```
whereas on the latest master periodic 11.1 windows [test](https://github.com/pytorch/pytorch/runs/3263762478?check_suite_focus=true), we see
```
test_jit_cuda_extension (__main__.TestCppExtensionJIT) ... skip (0.000s)
```
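The difference between the two logs can be sketched roughly as follows. This is only an illustration, assuming `PYTORCH_IGNORE_DISABLED_ISSUES` holds a comma-separated list of issue numbers and that each disabled test maps to the issue that disabled it. The `should_skip` helper, the `disabled` mapping, and the issue number `12345` are hypothetical and are not the actual implementation.

```python
import os
from typing import Dict, Mapping


def should_skip(
    test_name: str,
    disabled: Dict[str, str],
    env: Mapping[str, str] = os.environ,
) -> bool:
    """Return True if a disabled test should still be skipped.

    `disabled` maps a test name to the issue number that disabled it
    (hypothetical data shape, assumed for this sketch).
    """
    # Assumption: the variable is a comma-separated list of issue numbers.
    raw = env.get("PYTORCH_IGNORE_DISABLED_ISSUES", "")
    ignored = {part.strip() for part in raw.split(",") if part.strip()}
    issue = disabled.get(test_name)
    # Skip only when the test is disabled and its issue is not being ignored.
    return issue is not None and issue not in ignored


if __name__ == "__main__":
    disabled = {"test_jit_cuda_extension": "12345"}  # hypothetical issue number
    print(should_skip("test_jit_cuda_extension", disabled, env={}))
    print(should_skip(
        "test_jit_cuda_extension",
        disabled,
        env={"PYTORCH_IGNORE_DISABLED_ISSUES": "12345"},
    ))
```

Under these assumptions, the first call corresponds to the `skip` seen on master and the second to the run where the test actually executes.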
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62851
Reviewed By: walterddr, tktrungna
Differential Revision: D30192029
Pulled By: janeyx99
fbshipit-source-id: fd2ecc59d2b2bb5c31522a630dd805070d59f584
Summary:
- [x] add the jobs to the matrix
  - [x] `jit_legacy`
  - [x] `nogpu_NO_AVX`
  - [x] `nogpu_NO_AVX2`
  - [x] `slow`
- [x] use the test config properly to enable the different test conditions
- [x] validate that it works
- [x] disable on pull requests before merging
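
The checklist above could be pictured with a small sketch of a GitHub Actions style `include` matrix built from a set of enabled test configs. The `build_matrix` function, the `config` field name, and the on/off flags are hypothetical and chosen for illustration; they are not taken from the actual workflow generator.

```python
import json
from typing import Dict, List


def build_matrix(enabled: Dict[str, bool]) -> Dict[str, List[Dict[str, str]]]:
    """Build an `include`-style job matrix from per-config on/off flags."""
    # Keep only the configs that are switched on, preserving insertion order.
    include = [{"config": name} for name, on in enabled.items() if on]
    return {"include": include}


if __name__ == "__main__":
    # Hypothetical flags: the job names come from the checklist above.
    enabled = {
        "default": True,
        "jit_legacy": True,
        "nogpu_NO_AVX": True,
        "nogpu_NO_AVX2": True,
        "slow": False,
    }
    print(json.dumps(build_matrix(enabled), indent=2))
```

Driving the matrix from flags like these is one way to switch a job off for pull requests without deleting its definition.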
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61055
Test Plan: CI. Example run: https://github.com/pytorch/pytorch/actions/runs/1013240987
Reviewed By: walterddr
Differential Revision: D29594080
Pulled By: samestep
fbshipit-source-id: 02c531ebc42feae81ecaea0785915f95e0f53ed7
Summary:
- [x] add to test matrix
- [x] enable on PRs for testing
- [x] modify the scripts so it actually runs the multigpu tests
- [x] put `num_shards` after `shard` number
- [x] use a separate test-reports artifact
- [x] run on `linux.16xlarge.nvidia.gpu`
- [x] validate that it works
- [x] disable on PRs before merging
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60221
Test Plan: CI. Example run: https://github.com/pytorch/pytorch/actions/runs/984347177
Reviewed By: malfet
Differential Revision: D29430567
Pulled By: samestep
fbshipit-source-id: 09f8e208e524579b603611479ca00515c8a1b5aa
Summary:
This is a branch off of https://github.com/pytorch/pytorch/issues/59970 that only shards on Linux for now (we're running into issues with Windows gflags).
This enables sharding of tests on a few Linux jobs on GHA, allowing time to signal (TTS) to be essentially halved.
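As a rough illustration of why two shards can roughly halve the wall-clock time, the sketch below splits a list of test files round-robin across shards. The `select_shard` function and the example test names are hypothetical; this is not the actual selection logic in `run_test.py`.

```python
from typing import List


def select_shard(tests: List[str], shard: int, num_shards: int) -> List[str]:
    """Return the tests assigned to a 1-indexed shard (round-robin split)."""
    if not 1 <= shard <= num_shards:
        raise ValueError(f"shard must be in [1, {num_shards}], got {shard}")
    # Sort first so every shard computes the same partition independently.
    ordered = sorted(tests)
    return ordered[shard - 1::num_shards]


if __name__ == "__main__":
    tests = ["test_autograd", "test_jit", "test_nn", "test_torch"]
    for shard in (1, 2):
        print(shard, select_shard(tests, shard, 2))
```

A round-robin split balances the number of test files per shard but not their runtimes, so the actual speedup depends on how evenly the slow tests are distributed.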
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60124
Reviewed By: zou3519
Differential Revision: D29204211
Pulled By: janeyx99
fbshipit-source-id: 1cc31d1eccd564d96e2aef14c0acae96a3f0fcd0