Summary:
This PR adds Azure DevOps support for running custom PyTorch unit tests on PyTorch PR and Nightly builds.
PR Builds on Azure DevOps:
- Ensures that the wheel artifacts for a given PR build are ready
- Once the wheels are ready, PyTorch custom tests are run against the torch installation from the built wheels
Nightly Builds on Azure DevOps:
- Queues 4 builds {Win,Linux}*{cpu, CUDA} to run PyTorch custom unit tests on nightly PyTorch builds.
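The nightly matrix above is the Cartesian product of two platforms and two compute targets. A minimal sketch of that enumeration follows; the job-name format is an assumption for illustration, and the actual pipeline definitions may name the jobs differently.

```python
from itertools import product

PLATFORMS = ["Win", "Linux"]
TARGETS = ["cpu", "CUDA"]


def nightly_builds():
    """Return one (platform, target) pair per nightly build."""
    return list(product(PLATFORMS, TARGETS))


if __name__ == "__main__":
    for platform, target in nightly_builds():
        # Hypothetical job name, for illustration only.
        print(f"{platform}-{target}")
```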
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58007
Reviewed By: seemethere, mruberry
Differential Revision: D28342428
Pulled By: malfet
fbshipit-source-id: a454accf69163f9ba77845eeb54831ef91437981
Summary:
These changes provide the user with an additional option to choose the DNNL+BLIS path for PyTorch.
This assumes BLIS is already downloaded or built from source, with the library file available at $BLIS_HOME/lib/libblis.so and the include files available at $BLIS_HOME/include/blis/blis.h and $BLIS_HOME/include/blis/cblas.h.
Export the variables below to build PyTorch with MKLDNN+BLIS, then proceed with the regular installation procedure:
$ export BLIS_HOME=path-to-BLIS
$ export PATH=$BLIS_HOME/include/blis:$PATH LD_LIBRARY_PATH=$BLIS_HOME/lib:$LD_LIBRARY_PATH
$ export BLAS=BLIS USE_MKLDNN_CBLAS=ON WITH_BLAS=blis
$ python setup.py install
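Before starting the build, it can help to confirm that the BLIS layout matches the assumption stated above. The following is a rough sketch, not part of this PR: the helper names `missing_blis_files` and `blis_build_env` are hypothetical, and the environment variables are the ones listed in the export commands.

```python
import os
from pathlib import Path


def missing_blis_files(blis_home):
    """Return the expected BLIS files that are absent under blis_home."""
    root = Path(blis_home)
    expected = [
        root / "lib" / "libblis.so",
        root / "include" / "blis" / "blis.h",
        root / "include" / "blis" / "cblas.h",
    ]
    return [str(path) for path in expected if not path.is_file()]


def blis_build_env(blis_home, base_env=None):
    """Return a copy of the environment with the build variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["BLIS_HOME"] = str(blis_home)
    # Prepend, mirroring the export commands.
    env["PATH"] = f"{blis_home}/include/blis:{env.get('PATH', '')}"
    env["LD_LIBRARY_PATH"] = f"{blis_home}/lib:{env.get('LD_LIBRARY_PATH', '')}"
    env.update({"BLAS": "BLIS", "USE_MKLDNN_CBLAS": "ON", "WITH_BLAS": "blis"})
    return env
```

The returned dictionary could be passed as `env=` to `subprocess.run(["python", "setup.py", "install"], ...)` once `missing_blis_files` returns an empty list.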
A CPU-only Dockerfile to build PyTorch with AMD BLIS is available at: docker/cpu-blis/Dockerfile
Example command line to build using the Dockerfile:
sudo DOCKER_BUILDKIT=1 docker build . -t docker-image-repo-name
Example command line to run the built docker container:
sudo docker run --name container-name -it docker-image-repo-name
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54953
Reviewed By: glaringlee
Differential Revision: D27466799
Pulled By: malfet
fbshipit-source-id: e03bae9561be3a67429df3b1be95a79005c63050
Summary:
This PR adds Azure Pipelines build steps for PyTorch. It adds 3 pipelines.
1) CI Build
- Runs when a PR is opened or when new commits are added to an open PR. This build must succeed before the PR can be merged.
- Currently only TestTorch unit tests are run.
- Only the CI Build configurations are run.
2) Daily Build
- Runs once a day during inactive hours to ensure the current PyTorch repo performs as expected.
- Runs all unit tests.
- Note: I do not have access to the current [determine-from](b9e900ee52/test/run_test.py (L737)) unit tests that are skipped on Windows builds. This `determine-from` filter can be added once a clear way to skip certain unit tests given the build configuration is explained.
- Runs on All Build configurations.
3) Official Build
- Runs once a day during inactive hours to publish official PyTorch artifacts to Azure DevOps Artifacts for consumption.
- No unit tests are run.
- Runs in three stages: Build, Verify, Publish, where PyTorch is built, then its wheel is installed in a clean Conda environment for verification, and then the wheel is published to Azure Artifacts as a Universal Package.
- Runs on All Build configurations.
Ubuntu builds run on Docker with the specified Dockerfile configuration. Windows builds run directly on configured Windows VMs (CPU, CUDA/cuDNN).
CI Build configurations:
1. Ubuntu 18.04
   1. Python 3.9
      a. CUDA 11.2/cuDNN 8.1.0
   2. Python 3.8
      a. CPU
2. Windows 2019
   1. Python 3.8
      b. CUDA 10.2/cuDNN 7.6.5
   2. Python 3.7
      a. CPU
All Build configurations:
1. Ubuntu 18.04
   1. Python 3.9
      a. CUDA 11.2/cuDNN 8.1.0
   2. Python 3.8
      a. CPU
      b. CUDA 10.2/cuDNN 8.1.0
   3. Python 3.7
      a. CPU
      b. CUDA 10.1/cuDNN 7.6.5
2. Windows 2019
   1. Python 3.9
      a. CUDA 11.2/cuDNN 8.1.0
   2. Python 3.8
      a. CPU
      b. CUDA 10.2/cuDNN 7.6.5
   3. Python 3.7
      a. CPU
      b. CUDA 10.1/cuDNN 7.6.4
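For reference, the "All Build configurations" list can be written as nested data and flattened into individual jobs. This is only an illustrative sketch: the pipelines themselves are defined in Azure Pipelines YAML, and the dictionary layout and the `flatten` helper are assumptions made here.

```python
# OS -> Python version -> accelerator variants, copied from the list above.
ALL_BUILD_CONFIGS = {
    "Ubuntu 18.04": {
        "3.9": ["CUDA 11.2/cuDNN 8.1.0"],
        "3.8": ["CPU", "CUDA 10.2/cuDNN 8.1.0"],
        "3.7": ["CPU", "CUDA 10.1/cuDNN 7.6.5"],
    },
    "Windows 2019": {
        "3.9": ["CUDA 11.2/cuDNN 8.1.0"],
        "3.8": ["CPU", "CUDA 10.2/cuDNN 7.6.5"],
        "3.7": ["CPU", "CUDA 10.1/cuDNN 7.6.4"],
    },
}


def flatten(configs):
    """Expand the nested mapping into (os, python, accelerator) tuples."""
    return [
        (os_name, python_version, accelerator)
        for os_name, pythons in configs.items()
        for python_version, accelerators in pythons.items()
        for accelerator in accelerators
    ]


if __name__ == "__main__":
    for job in flatten(ALL_BUILD_CONFIGS):
        print(" / ".join(job))
```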
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54039
Reviewed By: ezyang
Differential Revision: D27373310
Pulled By: malfet
fbshipit-source-id: 06dcfe2d99da0e9876b6deb224272800dae46028
Summary:
I don't think the docker/ folder is used anymore. Creating this draft to verify.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54729
Reviewed By: ezyang
Differential Revision: D27364811
Pulled By: walterddr
fbshipit-source-id: 3e4a9d061b0e5f00015a805dd8b4474105467572
Summary:
Context: https://github.com/pytorch/pytorch/pull/53299#discussion_r587882857
These are the only hand-written parts of this diff:
- the addition to `.github/workflows/lint.yml`
- the file endings changed in these four files (to appease FB-internal land-blocking lints):
- `GLOSSARY.md`
- `aten/src/ATen/core/op_registration/README.md`
- `scripts/README.md`
- `torch/csrc/jit/codegen/fuser/README.md`
The rest was generated by running this command (on macOS):
```
git grep -I -l ' $' -- . ':(exclude)**/contrib/**' ':(exclude)third_party' | xargs gsed -i 's/ *$//'
```
I looked over the auto-generated changes and didn't see anything that looked problematic.
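For readers without `gsed`, the same cleanup and the corresponding lint check can be sketched in Python. This is an illustration only and not the check added to `.github/workflows/lint.yml`; the function names are hypothetical. Like the sed expression above, it strips trailing spaces only and leaves tabs alone.

```python
import re

# One or more spaces immediately before the end of a line.
_TRAILING_SPACES = re.compile(r" +$", flags=re.MULTILINE)


def lines_with_trailing_spaces(text):
    """Return 1-based numbers of the lines that end in a space."""
    return [
        number
        for number, line in enumerate(text.split("\n"), start=1)
        if line.endswith(" ")
    ]


def strip_trailing_spaces(text):
    """Remove trailing spaces from every line, keeping the newlines."""
    return _TRAILING_SPACES.sub("", text)
```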
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53406
Test Plan:
This run (after adding the lint but before removing existing trailing spaces) failed:
- https://github.com/pytorch/pytorch/runs/2043032377
This run (on the tip of this PR) succeeded:
- https://github.com/pytorch/pytorch/runs/2043296348
Reviewed By: walterddr, seemethere
Differential Revision: D26856620
Pulled By: samestep
fbshipit-source-id: 3f0de7f7c2e4b0f1c089eac9b5085a58dd7e0d97
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49486
Remove code for Python 3.5 and lower.
There's more that can be removed/modernised, but sticking mainly to redundant version checks here, to keep the diff/PR smaller.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46579
Reviewed By: zou3519
Differential Revision: D24453571
Pulled By: ezyang
fbshipit-source-id: c2cfcf05d6c5f65df64d89c331692c9aec09248e
Summary:
Fixes deprecated use of the HIP_PLATFORM env var. This env var no longer needs to be set explicitly. Instead, HIP_PLATFORM is automatically detected by hipcc.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47241
Reviewed By: mruberry
Differential Revision: D24699982
Pulled By: ngimel
fbshipit-source-id: 9cd2f32e7c0c8d662832b0cbbc2988835a45961a
Summary:
Request to update ROCm CI dockers to release 3.1
Changes required to the PyTorch source base attached:
* switch to the fast path for the Caffe2 ReLU operator
* switch to the new hipMemcpyWithStream(stream) API to replace hipMemcpyAsync(stream) && hipStreamSynchronize(stream) paradigm in an optimized fashion
* disable two regressed unit tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33930
Differential Revision: D20589048
Pulled By: ezyang
fbshipit-source-id: 568f40c1b90f311eb2ba57f02a9901114d8364af
Summary:
Submodule URLs may change between commits. Have the Dockerfile
sync submodules before updating them.
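The ordering matters because `git submodule sync` copies the URLs from `.gitmodules` into the local configuration, so a later update fetches from the current remote. A minimal sketch of that sequence, assuming it is driven from Python rather than from the Dockerfile itself (the helper names are hypothetical):

```python
import subprocess


def submodule_refresh_commands():
    """Return the git commands in the order they should run."""
    return [
        # Pick up any URL changes recorded in .gitmodules first.
        ["git", "submodule", "sync", "--recursive"],
        # Then fetch and check out the recorded submodule commits.
        ["git", "submodule", "update", "--init", "--recursive"],
    ]


def refresh_submodules(repo_dir):
    """Run the commands in repo_dir, stopping at the first failure."""
    for command in submodule_refresh_commands():
        subprocess.run(command, cwd=repo_dir, check=True)
```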
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35423
Differential Revision: D20658464
Pulled By: ngimel
fbshipit-source-id: 9c101338437f9e86432d3502766858fa5156a800
Summary:
Done by editing `.circleci/cimodel/data/dimensions.py` to include `3.8` and then regenerating using `.circleci/regenerate.sh`
cc kostmo, mingbowan, ezyang, soumith
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31948
Differential Revision: D19602069
Pulled By: seemethere
fbshipit-source-id: ac57fde9d0c491c7d948a3f5944c3cb324d403c0
Summary:
In order to support Ubuntu18.04, some changes to the scripts are required.
* install dependencies with -y flag
* mark install noninteractive
* install some required dependencies (gpg-agent, python3-distutils, libidn11)
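The three changes above amount to running `apt-get install` with `-y` and with `DEBIAN_FRONTEND=noninteractive` in the environment. A rough sketch, assuming the install is driven from Python for illustration (the real changes live in the docker shell scripts, and the helper names here are hypothetical):

```python
import os

# Packages named in the summary above.
UBUNTU_1804_EXTRA_PACKAGES = ["gpg-agent", "python3-distutils", "libidn11"]


def apt_install_command(packages):
    """Build an apt-get command that never prompts for confirmation."""
    return ["apt-get", "install", "-y", *packages]


def noninteractive_env(base_env=None):
    """Return a copy of the environment that marks the install noninteractive."""
    env = dict(os.environ if base_env is None else base_env)
    env["DEBIAN_FRONTEND"] = "noninteractive"
    return env
```

The two pieces would be combined as `subprocess.run(apt_install_command(UBUNTU_1804_EXTRA_PACKAGES), env=noninteractive_env(), check=True)`.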
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31886
Differential Revision: D19300586
Pulled By: bddppq
fbshipit-source-id: d7fb815a3845697ce63af191a5bc449d661ff1de
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26704
NCCL 2.1.15 isn't available for CUDA 10.1 and 2.4.8 isn't available for CUDA 9.1 :(
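One way to express this constraint is a lookup that skips the unavailable combinations. The sketch below encodes only the two facts stated above; the preference order (newest first) and the function name are assumptions, not the logic used in the docker scripts.

```python
# (NCCL version, CUDA version) pairs reported as unavailable above.
UNAVAILABLE = {("2.1.15", "10.1"), ("2.4.8", "9.1")}

# Assumption: prefer the newer NCCL release when it exists for the CUDA version.
PREFERRED_NCCL_VERSIONS = ["2.4.8", "2.1.15"]


def pick_nccl_version(cuda_version):
    """Return the first preferred NCCL version available for cuda_version."""
    for nccl_version in PREFERRED_NCCL_VERSIONS:
        if (nccl_version, cuda_version) not in UNAVAILABLE:
            return nccl_version
    raise ValueError(f"no NCCL version available for CUDA {cuda_version}")
```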
ghstack-source-id: 90714191
Test Plan: build docker images on Jenkins
Differential Revision: D17543120
fbshipit-source-id: 882c5a005a9a3ef78f9209dea9dcec1782060b25
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26656
Updating the NDK to r18 or newer triggers a path in our CI scripts so that we now build with clang instead of gcc.
Google discontinued gcc support for Android quite a while ago; clang is the only way forward.
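The version switch described above can be sketched as follows. This is a hypothetical illustration of the rule "r18 or newer means clang"; the actual CI scripts are shell and may detect the NDK revision differently.

```python
import re


def ndk_major(revision):
    """Extract the major number from an NDK revision string such as 'r18b'."""
    match = re.fullmatch(r"r(\d+)[a-z]?", revision)
    if match is None:
        raise ValueError(f"unrecognized NDK revision: {revision!r}")
    return int(match.group(1))


def android_compiler(revision):
    """Pick the compiler family for the given NDK revision."""
    return "clang" if ndk_major(revision) >= 18 else "gcc"
```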
ghstack-source-id: 90698985
Test Plan: CI
Reviewed By: dreiss
Differential Revision: D17533570
fbshipit-source-id: 5eef4d5a539d8bb1a6682f000d0b5d33b3752819
Summary:
There is an issue with the torchvision version not matching the pytorch version if one builds the docker from a tag, see issue https://github.com/pytorch/pytorch/issues/25917. The current solution requires one to re-init the submodules or manually change the version of torchvision. This PR allows one to build the docker image without torchvision, which not only fixes the above mentioned bug but also frees non-image pytorch users from the tyranny of torchvision 😆.
In all seriousness, for NLP researchers especially torchvision isn't a necessity for pytorch and all non-essential items shouldn't be in the docker. This option removes one extra thing that can go wrong.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26168
Differential Revision: D17550001
Pulled By: soumith
fbshipit-source-id: 48b8b9e22b75eef3afb392c618742215d3920e9d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25620
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25602
Enable rocThrust with hipCUB and rocPRIM for ROCm. They are the ROCm implementations of the thrust and cub APIs and replace the older hip-thrust and cub-hip packages going forward. ROCm 2.5 is the first release to contain the new packages as an option; as of 2.6 they will be the only available option.
Add hipification rules to correctly hipify thrust::cuda to thrust::hip and cub:: to hipcub:: going forward. Add hipification rules to hipify specific cub headers to the general hipcub header.
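As an illustration of what such rules do, here is a simplified sketch of namespace and header rewriting. It is not the project's hipify implementation: the rule tables, the `hipify` function, and the two example cub headers are assumptions chosen for the example.

```python
import re

# Namespace rewrites. The lookbehind keeps an existing "hipcub::" untouched.
NAMESPACE_RULES = [
    (re.compile(r"\bthrust::cuda\b"), "thrust::hip"),
    (re.compile(r"(?<![A-Za-z0-9_])cub::"), "hipcub::"),
]

# Specific cub headers collapse to the general hipcub header (illustrative subset).
HEADER_RULES = {
    "cub/cub.cuh": "hipcub/hipcub.hpp",
    "cub/block/block_reduce.cuh": "hipcub/hipcub.hpp",
}

INCLUDE_RE = re.compile(r"(#include\s*<)([^>]+)(>)")


def hipify(source):
    """Apply the header rules first, then the namespace rules."""

    def fix_include(match):
        header = HEADER_RULES.get(match.group(2), match.group(2))
        return f"{match.group(1)}{header}{match.group(3)}"

    source = INCLUDE_RE.sub(fix_include, source)
    for pattern, replacement in NAMESPACE_RULES:
        source = pattern.sub(replacement, source)
    return source
```

Running the function on already hipified text leaves it unchanged, which is the property the lookbehind is there to protect.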
Infrastructure work to correctly find, include and link against the new packages. Add the macro definition to choose the HIP backend to Thrust.
Since include chains are now a little different from CUDA's Thrust, add includes for functionality used where applicable.
Skip four tests that fail with the new rocThrust for now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21864
Reviewed By: xw285cornell
Differential Revision: D16940768
Pulled By: bddppq
fbshipit-source-id: 3dba8a8f1763dd23d89eb0dd26d1db109973dbe5
Summary:
Only check for cmake dependencies we directly depend on (e.g., hipsparse but not rocsparse)
Use cmake targets for ROCm where possible.
While there, update the docker CI build infrastructure to only pull in packages by name we directly depend on (anticipating the demise of, e.g., miopengemm). I do not anticipate a docker rebuild to be necessary at this stage as the changes are somewhat cosmetic.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23527
Differential Revision: D16561010
Pulled By: ezyang
fbshipit-source-id: 87cd9d8a15a74caf9baca85a3e840e9d19ad5d9f
Summary:
- Do not install unnecessary packages in the Docker image.
- In the Docker image, use conda to install ninja (saving one layer)
- When workdir is set, use "." to refer to it to reduce redundancy.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20881
Differential Revision: D15495769
Pulled By: ezyang
fbshipit-source-id: dab7df71ac107c85fb1447697e25978daffc7e0b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16911
I think the Thrust package has what we want for /opt/rocm/include/thrust. We probably can stop patching it now.
Reviewed By: bddppq
Differential Revision: D14015177
fbshipit-source-id: 8d9128783a790c39083a1b8b4771c2c18bd67d46
Summary:
* we do not need EAP packages any longer as the antistatic feature is now in the release
* consistently install the rccl package
* Skip one unit test that has regressed with 2.1
* Follow-up PRs will use 2.1 features once deployed on CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16808
Differential Revision: D13992645
Pulled By: bddppq
fbshipit-source-id: 37ca9a1f104bb140bd2b56d403e32f04c4fbf4f0
Summary:
Drop custom hcc/hip as the 1.9.2 release should contain the relevant patches therein.
Most notable feature in 1.9.2 is mixed precision support in rocBLAS and MIOpen. These features will be enabled by subsequent PRs.
bddppq ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14216
Differential Revision: D13354294
Pulled By: bddppq
fbshipit-source-id: 2541d4a196af21c9432c1aff7f6e65b572628028
Summary:
1) Use the hip-thrust version of Thrust as opposed to the GH master. (ROCm 267)
2) CentOS 7.5 docker (ROCm 279)
* Always install the libraries at docker creation for ubuntu.
* Add Dockerfile for CentOS ROCm
* Enable the centos build
* Source devtoolset in bashrc
* Set locales correctly depending on whether we are on Ubuntu or CentOS
* Install a newer cmake for CentOS
* Checkout thrust as there is no package for CentOS yet.
PyTorch/Caffe2 on ROCm passed tests: https://github.com/ROCmSoftwarePlatform/pytorch/pull/280
For attention: bddppq ezyang
Docker rebuild for Ubuntu is not urgent (getting rid of the Thrust checkout and package install is mainly cosmetic). If a docker image for CentOS 7.5 is wanted, a build is necessary. I tested the PyTorch build in the CentOS docker. PyTorch unit tests mostly work; however, a test in test_jit causes a Python recursion error that seems to be due to the python2 on CentOS, as we haven't ever seen this on Ubuntu. Hence, please do not enable unit tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12899
Differential Revision: D13029424
Pulled By: bddppq
fbshipit-source-id: 1ca8f4337ec6a603f2742fc81046d5b8f8717c76
Summary:
* switches docker files over to white rabbit release - removed custom package installs
* skips five tests that regressed in that release
* fixes some case-sensitivity issues in ROCm supplied cmake files by sed'ing them in the docker
* includes first changes to the infrastructure to support upcoming hip-clang compiler
* prints ROCm library versions as part of the build (as discussed w/ ezyang )
* explicitly searches for miopengemm
* installs the new hip-thrust package to be able to remove the explicit Thrust checkout in a future revision
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12577
Differential Revision: D10350165
Pulled By: bddppq
fbshipit-source-id: 60f9c9caf04a48cfa90f4c37e242d944a175ab31
Summary:
* purge hcSPARSE now that rocSPARSE is available
* integrate a custom hcc and HIP
* hcc brings two important compiler fixes (fixes hundreds of unit tests)
* HIP brings a smart dispatcher that allows us to avoid a lot of static_casts (we haven't yet removed the automatic static_casts but this catches some occurrences the script did not catch)
* mark as skipped 5 unit tests that have regressed w/ the new hcc (we don't know yet what is at fault)
* optimize bitonic sort - the comparator is always an empty struct - therefore passing it by value saves at least 3 bytes. It also removes an ambiguity around passing references to `__global__` functions
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11198
Differential Revision: D9652340
Pulled By: ezyang
fbshipit-source-id: f5af1d891189da820e3d13b7bed91a7a43154690
Summary:
* improve docker packages (install OpenBLAS to have at-compile-time LAPACK functionality w/ optimizations for both Intel and AMD CPUs)
* integrate rocFFT (i.e., enable Fourier functionality)
* fix bugs in ROCm caused by wrong warp size
* enable more test sets, skip the tests that don't work on ROCm yet
* don't disable asserts any longer in hipification
* small improvements
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10893
Differential Revision: D9615053
Pulled By: ezyang
fbshipit-source-id: 864b4d27bf089421f7dfd8065e5017f9ea2f7b3b
Summary:
Set the build environment before installing sccache in order to make sure the docker images have the links set up.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10640
Reviewed By: yf225
Differential Revision: D9399593
Pulled By: Jorghi12
fbshipit-source-id: a062fed8b7e83460fe9d50a7a27c0f20bcd766c4
Summary:
* some small leftovers from the last PR review
* enable more unit test sets for CI
* replace use of hcRNG w/ rocRAND (docker image was already updated w/ newer rocRAND)
* use rocBLAS instead of hipBLAS to allow convergence w/ Caffe2
* use strided_batched gemm interface also from the batched internal interface
* re-enable Dropout.cu as we now have philox w/ rocRAND
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10406
Reviewed By: Jorghi12
Differential Revision: D9277093
Pulled By: ezyang
fbshipit-source-id: 7ef2f6fe4ead77e501ed7aea5c3743afe2466ca2
Summary:
The current Dockerfile builds pytorch using the default python within miniconda, which happens to be Python 3.6.
This patch allows users to specify which python should be installed in the default miniconda environment used by the pytorch dockerfile. I have tested the build for python 2.7, 3.5, 3.6 and 3.7. Python 2.7 required typing and cython.
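A build parameterized this way is typically driven through `docker build --build-arg`. The sketch below assumes the Dockerfile exposes the version as a build argument named `PYTHON_VERSION`; that name, the image tag, and the helper functions are assumptions for illustration, so check the Dockerfile for the actual argument.

```python
def docker_build_command(python_version, tag, context="."):
    """Build a docker command line that selects the Python version."""
    return [
        "docker", "build",
        # PYTHON_VERSION is an assumed build-arg name.
        "--build-arg", f"PYTHON_VERSION={python_version}",
        "-t", tag,
        context,
    ]


def extra_packages(python_version):
    """Return the extra packages the summary reports as needed for 2.7."""
    return ["typing", "cython"] if python_version == "2.7" else []
```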
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10317
Differential Revision: D9204401
Pulled By: ezyang
fbshipit-source-id: 11355cab3bf448bbe8369a2ed1de0d409c9a2d6e