As usual, almost no work on the PyTorch side; all changes are on the builder end, namely:
- 8b67d32929 - depend on `blas * mkl` only on x86 machines
- eb78393f1e - install arm64 conda when running on Apple Silicon
- 0d3aea4ee0 - constrain llvmdev-9 to x86 machines only
- 6c6a33b271 - set correct DEVELOPER_DIR path
TODO:
- We should auto-detect this `DEVELOPER_DIR` via `xcode-select`
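A minimal sketch of what that auto-detection might look like, assuming the script is free to shell out to `xcode-select -p` (which prints the active developer directory). The names `detect_developer_dir`, `runner`, and `fallback` are hypothetical and not part of the builder scripts:

```python
import subprocess
from typing import Callable, Optional


def _run_xcode_select() -> str:
    # `xcode-select -p` prints the path of the active developer directory.
    return subprocess.check_output(["xcode-select", "-p"], text=True)


def detect_developer_dir(
    runner: Callable[[], str] = _run_xcode_select,
    fallback: Optional[str] = None,
) -> Optional[str]:
    """Return the active developer directory, or `fallback` if detection fails."""
    try:
        path = runner().strip()
    except (OSError, subprocess.CalledProcessError):
        # xcode-select is missing (non-macOS host) or exited with an error.
        return fallback
    # An empty answer is treated the same as a failed detection.
    return path or fallback
```

The `runner` parameter exists only so the lookup can be exercised on machines without Xcode; a real script would likely export the result as `DEVELOPER_DIR` before invoking the build.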
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117801
Approved by: https://github.com/atalman
Adding Workflows for building aarch64 Linux PyTorch PIP wheels
Updates:
* Created aarch64 template for generated workflows
* Updated generate_ci_workflows.py to include aarch64
* Generated the aarch64 wheel workflow
* Added _binary-build-aarch64.yml for building the aarch64 wheel
* Added _binary-test-aarch64.yml for a sanity check of the aarch64 wheel
* Updated binary_linux_test.sh to use --extra-index-url for aarch64 until the needed aarch64 dependencies are available at https://download.pytorch.org/whl/nightly/cpu
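The index selection in the last item can be sketched as follows. This is only an illustration in Python, not the contents of binary_linux_test.sh; the function name `pip_install_args` is hypothetical, and the use of `--index-url` on other architectures is an assumption:

```python
import platform
from typing import List

NIGHTLY_CPU_INDEX = "https://download.pytorch.org/whl/nightly/cpu"


def pip_install_args(package: str, machine: str = platform.machine()) -> List[str]:
    """Build a pip command line that picks the index flag by architecture."""
    args = ["pip", "install", package]
    if machine == "aarch64":
        # --extra-index-url keeps PyPI as the primary index, so dependencies
        # that are missing from the nightly index can still be resolved.
        args += ["--extra-index-url", NIGHTLY_CPU_INDEX]
    else:
        # Assumption: other architectures resolve everything from the nightly index.
        args += ["--index-url", NIGHTLY_CPU_INDEX]
    return args
```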
NOTES:
* The build and test workflows are using arm64v8/alpine and quay.io/pypa/manylinux2014_aarch64:latest docker images at this time.
* The Conda generated workflow is not included at this time; it is being worked on.
Workflows were successfully tested at https://github.com/xncqr/pytorch/actions/runs/5351891068
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104109
Approved by: https://github.com/malfet, https://github.com/atalman
Per the discussion with @malfet, there is no need to run the Windows binary build for every PR. We will keep it running in trunk (on push) though, just in case.
This also moves the workflow back from unstable after the symlink copy fix in 860d444515
Another data point to back this up is the high correlation between the Windows binary debug and release builds and the Windows CPU CI job. The numbers are:
* `libtorch-cpu-shared-with-deps-debug` and `win-vs2019-cpu-py3` have a 0.95 correlation
* `libtorch-cpu-shared-with-deps-release` and `win-vs2019-cpu-py3` have the same 0.95 correlation
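The message does not say how these correlations were computed. As a rough sketch, assuming a Pearson correlation over per-commit job outcomes (1 = failed, 0 = passed), the calculation could look like this; the two outcome lists are made up purely for illustration and are not real CI data:

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical outcomes for eight commits (1 = job failed, 0 = job passed).
binary_build_failed = [0, 0, 1, 0, 1, 0, 0, 1]
cpu_ci_failed = [0, 0, 1, 0, 1, 0, 0, 0]

print(f"correlation: {pearson(binary_build_failed, cpu_ci_failed):.2f}")
```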
The rest is noise, eh?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100638
Approved by: https://github.com/atalman
As CUDA-11.7 is getting deprecated anyway.
Also, fix a problem where the script generated the same workflow twice, overriding the 11.8 ones with 11.7+11.7-with-pypi
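One way to guard against this class of bug is to check for repeated output file names before writing anything. A small sketch, assuming the generator can list its target file names up front; `find_duplicate_outputs` and the example file names are hypothetical and not taken from the script:

```python
from collections import Counter
from typing import Iterable, List


def find_duplicate_outputs(filenames: Iterable[str]) -> List[str]:
    """Return the file names that would be written more than once."""
    counts = Counter(filenames)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical list of files a generator run intends to write.
planned = [
    "generated-linux-binary-manywheel-nightly.yml",
    "generated-linux-binary-conda-nightly.yml",
    "generated-linux-binary-manywheel-nightly.yml",
]
duplicates = find_duplicate_outputs(planned)
if duplicates:
    print("would overwrite:", ", ".join(duplicates))
```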
### <samp>🤖 Generated by Copilot at 0c6c182</samp>
> _Oh we are the PyTorch crew and we have a job to do_
> _We build and test the manywheel package with CUDA 11.8_
> _So heave away, me hearties, heave away with all your might_
> _We'll smoke the Linux binary and make sure it runs all right_
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99458
Approved by: https://github.com/dagitses, https://github.com/atalman
Mostly `s/@master/@main` in numerous `.yml` files.
Keep `master` in `weekly.yml` as it refers to `xla` repo and in `test_trymerge.py` as it refers to a branch PR originates from.
This has been bugging me for a while as I'm working on these Python scripts and they are not tracked by the ufmt linter. So I am adding these scripts to that linter.
```
[[linter]]
code = 'UFMT'
include_patterns = [
'.github/**/*.py',
'test/run_test.py',
```
This change should just work and not break anything, as the ufmt (black + usort) linter is very safe to use for standalone util scripts.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97588
Approved by: https://github.com/kit1980
Changes:
1. `typing_extensions -> typing-extensions` in dependency. Use a dash rather than an underscore to fit the [PEP 503: Normalized Names](https://peps.python.org/pep-0503/#normalized-names) convention.
```python
import re

def normalize(name):
    return re.sub(r"[-_.]+", "-", name).lower()
```
2. Import `Literal`, `Protocol`, and `Final` from the standard library, where they are available as of Python 3.8+
3. Replace `Union[Literal[XXX], Literal[YYY]]` with `Literal[XXX, YYY]`.
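Items 2 and 3 can be illustrated with a short, self-contained sketch. The names `Mode`, `SupportsBuild`, `WheelBuilder`, and `run` are made up for the example and do not come from the PyTorch code base:

```python
from typing import Final, Literal, Protocol, Union, get_args

# Before: one Literal per value, joined with Union.
ModeOld = Union[Literal["debug"], Literal["release"]]
# After: a single Literal listing all allowed values.
Mode = Literal["debug", "release"]

# Final marks a name that type checkers should treat as a constant.
DEFAULT_MODE: Final = "release"


class SupportsBuild(Protocol):
    """Structural type: anything with a matching `build` method qualifies."""

    def build(self, mode: Mode) -> str: ...


class WheelBuilder:
    def build(self, mode: Mode = DEFAULT_MODE) -> str:
        return f"building wheel in {mode} mode"


def run(builder: SupportsBuild, mode: Mode) -> str:
    # WheelBuilder never inherits from SupportsBuild; it matches by shape.
    return builder.build(mode)


print(get_args(Mode))
print(run(WheelBuilder(), "debug"))
```

All three names are imported from `typing` here, so no `typing_extensions` fallback is needed on Python 3.8 or newer.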
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94490
Approved by: https://github.com/ezyang, https://github.com/albanD
These issues are pretty much self-explanatory.
Two typos in the generate binary script caused workflows to be generated with invalid parameters:
1. generated-linux-binary-libtorch-pre-cxx11-master.yml
2. generated-macos-arm64-binary-wheel-nightly.yml
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89153
Approved by: https://github.com/malfet
- get rid of a lot of stuff in generate_ci_workflows.py b/c it's only used for binaries now
- get rid of generated-ciflow-ruleset.json b/c it's super outdated
- add an if statement for `tags:` in templates b/c it used to cause the tags to be put under branches
helps w/ #74478
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75372
Approved by: https://github.com/seemethere
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73748
This adds CPU-only slow test jobs, which previously would never run.
Includes fixes/skips for slow tests which fail (they need to be skipped now because they used to never run)
Test Plan: Imported from OSS
Reviewed By: malfet
Differential Revision: D34628803
Pulled By: davidberard98
fbshipit-source-id: c090ab7bf7bda9e24ec5cdefa6fd35c6310dbac0
(cherry picked from commit 06f7a94a57cc7023e9c5442be8298d20cd011144)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73676
For some reason https://github.com/pytorch/pytorch/pull/72637 got messed up during rebasing, so please refer to that PR for the review history.
This PR creates a new workflow called `deploy-linux-xenial-cuda11.3-py3.7-gcc7` for torch::deploy tests.
For testing, go to https://www.torch-ci.com/pytorch/pytorch/pull/73676 and check whether build and test jobs occur with `deploy-linux-xenial-cuda11.3-py3.7-gcc7`
Test Plan: Imported from OSS
Reviewed By: soulitzer
Differential Revision: D34586702
Pulled By: PaliC
fbshipit-source-id: 5627cf4ff411a4a04030f8b7726f84af979da213
(cherry picked from commit df6dddebb9fe078a6053a31033b5a40cc742fcf3)
Summary:
RFC: https://github.com/pytorch/rfcs/pull/40
This PR (re)introduces python codegen for unboxing wrappers. Given an entry of `native_functions.yaml` the codegen should be able to generate the corresponding C++ code to convert ivalues from the stack to their proper types. To trigger the codegen, run
```
tools/jit/gen_unboxing.py -d cg/torch/share/ATen
```
Merged changes on CI test. In https://github.com/pytorch/pytorch/issues/71782 I added an e2e test for static dispatch + codegen unboxing. The test exports a mobile model of mobilenetv2, then loads and runs it on a new binary for the lite interpreter: `test/mobile/custom_build/lite_predictor.cpp`.
## Lite predictor build specifics
1. Codegen: `gen.py` generates `RegisterCPU.cpp` and `RegisterSchema.cpp`. Now with this PR, once `static_dispatch` mode is enabled, `gen.py` will not generate `TORCH_LIBRARY` API calls in those cpp files, hence avoiding interaction with the dispatcher. Once `USE_LIGHTWEIGHT_DISPATCH` is turned on, `cmake/Codegen.cmake` calls `gen_unboxing.py`, which generates `UnboxingFunctions.h`, `UnboxingFunctions_[0-4].cpp` and `RegisterCodegenUnboxedKernels_[0-4].cpp`.
2. Build: `USE_LIGHTWEIGHT_DISPATCH` adds generated sources into `all_cpu_cpp` in `aten/src/ATen/CMakeLists.txt`. All other files remain unchanged. In reality the `Operators_[0-4].cpp` files are not necessary, but we can rely on the linker to strip them out.
## Current CI job test coverage update
Created a new CI job `linux-xenial-py3-clang5-mobile-lightweight-dispatch-build` that enables the following build options:
* `USE_LIGHTWEIGHT_DISPATCH=1`
* `BUILD_LITE_INTERPRETER=1`
* `STATIC_DISPATCH_BACKEND=CPU`
This job triggers `test/mobile/lightweight_dispatch/build.sh` and builds `libtorch`. Then the script runs C++ tests written in `test_lightweight_dispatch.cpp` and `test_codegen_unboxing.cpp`. Recent commits added tests to cover as many C++ argument types as possible: in `build.sh` we install the PyTorch Python API so that we can export test models in `tests_setup.py`. Then we run the C++ test binary to run these models on the lightweight-dispatch-enabled runtime.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69881
Reviewed By: iseeyuan
Differential Revision: D33692299
Pulled By: larryliu0820
fbshipit-source-id: 211e59f2364100703359b4a3d2ab48ca5155a023
(cherry picked from commit 58e1c9a25e3d1b5b656282cf3ac2f548d98d530b)
Rather than hardcode the value to 240 min, use the `timeout_after` argument
to specify different limits depending on the config
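A minimal sketch of the idea, assuming each workflow is described by a config object that the generator turns into YAML. Only the `timeout_after` name and the 240-minute figure come from this message; `WorkflowConfig`, `build_name`, `render_timeout`, and the example values are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class WorkflowConfig:
    build_name: str
    # Per-config limit in minutes; 240 matches the previously hardcoded value.
    timeout_after: int = 240


def render_timeout(config: WorkflowConfig) -> str:
    """Render the GitHub Actions job-level timeout for one config."""
    return f"timeout-minutes: {config.timeout_after}"


# Hypothetical configs: one keeps the default, one asks for a longer limit.
default_job = WorkflowConfig(build_name="linux-cpu")
slow_job = WorkflowConfig(build_name="linux-cuda", timeout_after=360)

for job in (default_job, slow_job):
    print(job.build_name, "->", render_timeout(job))
```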
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73508
We're deprecating support for CUDA 11.1 so moving all of our CUDA 11.1
workflows to CUDA 11.3
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73449
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Today, we have two pieces that conspire to determine what workflows we run:
- `generate_ci_workflows.py`, which takes a declarative description of what we want the workflow to do and uses jinja to generate a workflow yaml file
- `generate-test-matrix`, which runs at CI time to dynamically generate test jobs.
This is bad:
- Having one layer of code generation is unfortunate; having two is confusing.
- You cannot tell from a workflow yaml file what test jobs will be run.
- We have to do this careful dance of plumbing the args to `generate-test-matrix` through setting env vars and other such ugliness.
- In cases where the build job fails and prevents `generate-test-matrix` from running, a ghost `test` job that doesn't actually exist noises up the HUD and our stats.
- A bunch of useless `generate-test-matrix` jobs (8 on PRs) noise up our signal.
As far as I can tell, this complexity is unnecessary--we have all the information we need to generate the test matrix statically. There does not appear to be an advantage in retaining `generate-test-matrix`, so I am removing it to simplify the CI.
The *only* place where we were actually doing something dynamic is in our Windows GPU workflow, where we would check at runtime whether the workflow was triggered from a PR or master and behave accordingly. This is more simply done by just having two separate workflows with different trigger conditions, which avoids the madness of needing to parse labels and fork the behavior dynamically, which has been a source of confusion in the past.
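To make "generate the matrix statically" concrete, here is a rough sketch of how a generator could expand a declarative description into an explicit list of test jobs at generation time. The function name, the dictionary keys, and the example configs are all hypothetical and are not the real `generate_ci_workflows.py` schema:

```python
from typing import Dict, List


def static_test_matrix(shards: Dict[str, int], runner: str) -> List[Dict[str, object]]:
    """Expand {config name: shard count} into one explicit entry per test job."""
    matrix: List[Dict[str, object]] = []
    for config, num_shards in shards.items():
        for shard in range(1, num_shards + 1):
            matrix.append(
                {
                    "config": config,
                    "shard": shard,
                    "num_shards": num_shards,
                    "runner": runner,
                }
            )
    return matrix


# Hypothetical description: two shards of default tests, one of slow tests.
jobs = static_test_matrix({"default": 2, "slow": 1}, runner="linux.2xlarge")
for job in jobs:
    print(job)
```

Because the list is computed when the workflow file is generated, every test job is visible in the resulting YAML and no CI-time matrix step is needed.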
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73001