Enable CUDA forward-compatibility mode in all RBE jobs by default.
Forward-compatibility mode in hermetic CUDA lets the linker use the user-mode driver (UMD) from the Bazel cache, so there is no need to install the UMD in the RBE Docker image.
The UMD on RBE machines is rarely updated, so RBE jobs need forward-compatibility mode to exercise the most recent CUDA features in tests.
Non-RBE job runners are updated more often, so we can update the drivers on those machines and avoid relying on forward-compatibility mode.
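As a rough sketch of how an RBE job might opt in (the `@cuda_driver` flag label is an assumption based on the hermetic CUDA rules and may differ in a given checkout):
```
# Hypothetical invocation; substitute the CUDA RBE config the job already uses.
bazel test \
  --config=rbe_linux_cuda \
  --@cuda_driver//:enable_forward_compatibility=true \
  //tensorflow/...
```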
PiperOrigin-RevId: 810595379
This change makes the parsing of NVIDIA dependencies consistent with the [JAX repository](https://github.com/jax-ml/jax/pull/30706).
`nvidia-requirements.txt` is used in the Bazel hermetic Python lock files and in `tools/pip_package/setup.py` for the package requirements.
The file content is saved in the `nvidia_wheel_versions` repository and passed as an argument to `modify_setup_py.py`, which populates the `setup_py.tpl` template.
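For illustration, entries in such a requirements file follow standard pip requirement syntax; the package names and version pins below are hypothetical examples, not the actual file content:
```
nvidia-cublas-cu12>=12.5.3
nvidia-cudnn-cu12>=9.3.0
nvidia-nccl-cu12>=2.23.4
```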
PiperOrigin-RevId: 802755816
Upgrading the manylinux compliance tag in this [JAX PR](https://github.com/jax-ml/jax/pull/29672) enabled building targets that link the `nvshmem` libraries.
PiperOrigin-RevId: 786533277
Hermetic toolchains give us builds that are isolated from the host system, cutting down on unexpected dependencies and side effects.
With these changes, TensorFlow will build for Linux x86_64 architectures (both CPU and CUDA-enabled GPU) using self-contained C++ toolchains. If you need to use a non-hermetic toolchain, you can do so by adding the flag --config=clang_local. For remote builds with a non-hermetic toolchain, simply append _clang_local to your existing RBE flag. For example, if your hermetic RBE build uses --config=rbe_linux_cpu, the non-hermetic version would be --config=rbe_linux_cpu_clang_local.
Example: Run CPU tests for Linux x86_64
For hermetic tests, run the following command (no environment variables such as CC, CXX, BAZEL_COMPILER, CLANG_COMPILER_PATH):
```
bazel test \
  --config=avx_linux \
  --config=release_linux_base \
  --config=linux_cpu_pycpp_test_filters \
  --repo_env=HERMETIC_PYTHON_VERSION=3.11 \
  //tensorflow/... -- -//tensorflow/compiler/tf2tensorrt/... -//tensorflow/core/tpu/... -//tensorflow/lite/... -//tensorflow/tools/toolchains/...
```
For Linux x86_64 non-hermetic tests, use the `--config=clang_local` flag and set the environment variables CC, CXX, BAZEL_COMPILER, CLANG_COMPILER_PATH, etc.:
```
bazel test \
  --config=clang_local \
  --config=avx_linux \
  --config=release_linux_base \
  --config=linux_cpu_pycpp_test_filters \
  --repo_env=HERMETIC_PYTHON_VERSION=3.11 \
  --action_env=CLANG_COMPILER_PATH=/usr/lib/llvm-18/bin/clang \
  --host_action_env=CLANG_COMPILER_PATH=/usr/lib/llvm-18/bin/clang \
  --repo_env=CC=/usr/lib/llvm-18/bin/clang \
  --repo_env=CXX=/usr/lib/llvm-18/bin/clang++ \
  --repo_env=BAZEL_COMPILER=/usr/lib/llvm-18/bin/clang \
  //tensorflow/... -- -//tensorflow/compiler/tf2tensorrt/... -//tensorflow/core/tpu/... -//tensorflow/lite/... -//tensorflow/tools/toolchains/...
```
PiperOrigin-RevId: 783911228
Hermetic C++ toolchains and CUDA are enabled for the Linux x86_64 platform by default. The list of covered operating systems will be extended over the next few months. Developers can still use non-hermetic toolchains with the --config=clang_local flag.
Replace std::reduce with a traditional for loop. This is necessary because GCC 8 offers only partial support for C++17, and using std::reduce in this environment leads to an "Undefined method" error.
PiperOrigin-RevId: 775771057
- Protobuf 5.28.3
- gRPC 1.68.2
- Abseil: LTS 20240116.3
- Plus some other transitive/related deps, riegeli and pybind11 in particular.
- rules_python & rules_cc will be updated in a subsequent CL as they are their own can of worms, plus there are a few pending changes in rules_python that have not been pushed yet.
This also switches the default protobuf implementation we rely on for Bazel builds from cpp to upb, meaning all projects depending on this one must be built with `--action_env=PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=upb`.
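For example, a dependent project can set the flag in its `.bazelrc`:
```
build --action_env=PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=upb
```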
PiperOrigin-RevId: 773586210
The TF wheel build rule implementation is also updated to exclude accidental dependencies on NVSHMEM libraries from the wheel content. If the wheel needs to be built with these dependencies, pass `--@local_config_nvshmem//:override_include_nvshmem_libs=True` in the Bazel options.
`NVSHMEM` binaries are included in the dependencies if `CUDA` binary dependencies are added as well, e.g. `--@local_config_cuda//:enable_cuda`.
`NVSHMEM` libraries are included in the dependencies if `--@local_config_nvshmem//:include_nvshmem_libs=True` (the default flag value is `False`). Please note that this is a temporary solution, and it should be removed after GLIBC is updated on the RBE runners. At the moment the `libnvshmem.so` files can't be linked into the targets because they are built with a GLIBC version higher than the one on the RBE runners. In the future `--@local_config_cuda//cuda:include_cuda_libs=True` should be used.
The next change will add `NVSHMEM` deps to individual Bazel targets via `select`.
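For example, a CUDA wheel build that force-includes the NVSHMEM libraries could look like the following sketch; the wheel target label is the conventional pip-package target and may need adjusting for a given checkout:
```
# Sketch: build a CUDA-enabled wheel and explicitly include NVSHMEM libraries.
bazel build //tensorflow/tools/pip_package:wheel \
  --config=cuda \
  --@local_config_nvshmem//:include_nvshmem_libs=True \
  --@local_config_nvshmem//:override_include_nvshmem_libs=True
```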
PiperOrigin-RevId: 769344482
The code below adds a condition to run Python 3.13-specific Python.h code in the C++ sources, because the following functions changed in Python 3.13:
_PyArg_NoKeywords (removed)
_PyObject_VisitManagedDict (renamed to PyObject_VisitManagedDict)
_PyObject_ClearManagedDict (renamed to PyObject_ClearManagedDict)
PiperOrigin-RevId: 745284443
Mostly updating `WORKSPACE` and related files from `@tsl//third_party` -> `@xla//third_party`. Please tag me if this change breaks you.
PiperOrigin-RevId: 733489153
This change introduces a uniform way of building the TF wheel and controlling the filename version suffixes.
A new repository rule `python_wheel_version_suffix_repository` provides information about the project and wheel version suffixes. The final value depends on environment variables passed to the Bazel command: `_ML_WHEEL_WHEEL_TYPE`, `_ML_WHEEL_BUILD_DATE`, `_ML_WHEEL_GIT_HASH`, `_ML_WHEEL_VERSION_SUFFIX`.
`tf_version.bzl` defines the TF project version and loads the version suffix information calculated by `python_wheel_version_suffix_repository`.
The targets `//tensorflow/core/public:release_version`, `//tensorflow:tensorflow_bzl`, and `//tensorflow/tools/pip_package:setup_py` use the version chunks defined above.
The version of the wheel in the build rule output depends on the environment variables.
Environment variable combinations for creating wheels with different versions (a full example command follows this list):
* snapshot (default build rule behavior): `--repo_env=ML_WHEEL_TYPE=snapshot`
* release: `--repo_env=ML_WHEEL_TYPE=release`
* release candidate: `--repo_env=ML_WHEEL_TYPE=release --repo_env=ML_WHEEL_VERSION_SUFFIX=-rc1`
* nightly build with date as version suffix: `--repo_env=ML_WHEEL_TYPE=nightly --repo_env=ML_WHEEL_BUILD_DATE=<YYYYmmdd>`
* build with git data as version suffix: `--repo_env=ML_WHEEL_TYPE=custom --repo_env=ML_WHEEL_BUILD_DATE=$(git show -s --format=%as HEAD) --repo_env=ML_WHEEL_GIT_HASH=$(git rev-parse HEAD)`
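As a concrete illustration, a nightly wheel build could combine these variables with the usual pip-package target; the target label and the date below are examples, not prescriptive values:
```
bazel build //tensorflow/tools/pip_package:wheel \
  --repo_env=HERMETIC_PYTHON_VERSION=3.11 \
  --repo_env=ML_WHEEL_TYPE=nightly \
  --repo_env=ML_WHEEL_BUILD_DATE=20250101
```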
PiperOrigin-RevId: 733444080
The gist of this change (the new rules implementation) is contained within the `rules_pywrap` folder. The rules are generic and not TensorFlow-specific.
1) (internal-specific)
2) (internal-specific)
3) It provides the same linking strategy for final artifacts on all three supported platforms (no major differences between Linux, Mac, and Windows).
4) It makes it possible to abandon the use of header-only targets to prevent ODR violations. Simply put, you can now depend on generated protobuf message classes normally, without needing to worry about how they are linked afterwards.
5) The current version is backward-compatible and, unless explicitly enabled, is a no-op. To enable the new rules, pass the `--repo_env=USE_PYWRAP_RULES=True` flag to the build/test command (see the example after this list).
6) The `if_pywrap` construct is temporary and will be removed once the full migration is completed. Currently if_pywrap is mainly used to pass normal dependencies (instead of header-only ones). The header-only stuff is kept for backward compatibility and a smoother migration but will eventually be removed.
7) This CL migrates TF, the most problematic of all the Google ML repositories. Once TF is stabilized, the other repositories, such as JAX and XLA, will be migrated too (which should be much easier than migrating TF anyway).
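A minimal opt-in invocation might look like this (the target pattern is only an example):
```
# Sketch: enable the new pywrap rules for a single test invocation.
bazel test --repo_env=USE_PYWRAP_RULES=True //tensorflow/...
```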
PiperOrigin-RevId: 684324990
This test verifies whether the API v2 packages can be imported from the
current build. It utilizes the `_api/v2/api_packages.txt` list of packages from
the local wheel file specified in the `requirements_lock_<python_version>.txt`.
The test should be executed after the TF wheel has been built and placed in the `dist` directory inside the TensorFlow repository.
PiperOrigin-RevId: 676893008
1) Hermetic CUDA rules allow building wheels with GPU support on a machine without GPUs, as well as running Bazel GPU tests on a machine that has only GPUs and the NVIDIA driver installed. When `--config=cuda` is provided in the Bazel options, Bazel downloads the CUDA, CUDNN, and NCCL redistributions into the cache and uses them during the build and test phases.
[Default location of CUDNN redistributions](https://developer.download.nvidia.com/compute/cudnn/redist/)
[Default location of CUDA redistributions](https://developer.download.nvidia.com/compute/cuda/redist/)
[Default location of NCCL redistributions](https://pypi.org/project/nvidia-nccl-cu12/#history)
2) To include hermetic CUDA rules in your project, add the following to the WORKSPACE file of the downstream project that depends on XLA.
Note: use `@local_tsl` instead of `@tsl` in Tensorflow project.
```
load(
    "@tsl//third_party/gpus/cuda/hermetic:cuda_json_init_repository.bzl",
    "cuda_json_init_repository",
)
cuda_json_init_repository()
load(
    "@cuda_redist_json//:distributions.bzl",
    "CUDA_REDISTRIBUTIONS",
    "CUDNN_REDISTRIBUTIONS",
)
load(
    "@tsl//third_party/gpus/cuda/hermetic:cuda_redist_init_repositories.bzl",
    "cuda_redist_init_repositories",
    "cudnn_redist_init_repository",
)
cuda_redist_init_repositories(
    cuda_redistributions = CUDA_REDISTRIBUTIONS,
)
cudnn_redist_init_repository(
    cudnn_redistributions = CUDNN_REDISTRIBUTIONS,
)
load(
    "@tsl//third_party/gpus/cuda/hermetic:cuda_configure.bzl",
    "cuda_configure",
)
cuda_configure(name = "local_config_cuda")
load(
    "@tsl//third_party/nccl/hermetic:nccl_redist_init_repository.bzl",
    "nccl_redist_init_repository",
)
nccl_redist_init_repository()
load(
    "@tsl//third_party/nccl/hermetic:nccl_configure.bzl",
    "nccl_configure",
)
nccl_configure(name = "local_config_nccl")
```
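Once the repositories are set up, a GPU build or test can optionally pin the hermetic CUDA/CUDNN versions through repo environment variables; the version numbers and target pattern below are illustrative only:
```
bazel test --config=cuda \
  --repo_env=HERMETIC_CUDA_VERSION=12.3.2 \
  --repo_env=HERMETIC_CUDNN_VERSION=9.1.1 \
  //path/to/your:gpu_tests
```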
PiperOrigin-RevId: 662981325
619575611 by A. Unique TensorFlower<gardener@tensorflow.org>:
Run buildifier on all files where it sorts loads differently
--
619498661 by A. Unique TensorFlower<gardener@tensorflow.org>:
[XLA:GPU][IndexAnalysis] Rename GetDefaultThreadIdToOutputIndexingMap to GetDefaultThreadIdIndexingMap.
The "output" part was a bit confusing. We use this function for threadId->input
mapping as well.
--
619490165 by A. Unique TensorFlower<gardener@tensorflow.org>:
Convert S8 to BF16 in one step without going through F32.
--
PiperOrigin-RevId: 619575611
Repositories depending on TensorFlow should use the content of the WORKSPACE file to initialize TensorFlow and its dependencies. This will make it much less likely for us to break dependent projects when we add/change TensorFlow's dependencies.
PiperOrigin-RevId: 345391447
Change-Id: Ia5f66a341247d0da491e40aee39f460ac10d5c9b
- Add libgpr
Newer grpc-1.28 has a libgpr.so that is also needed at link time,
so add it to the linkopts.
- Add starlark files
Several starlark files are load()'d from the GRPC repo; vendor them
or add stubs as appropriate when using the system version of grpc.
- grpc WORKSPACE deps
Several deps needed by grpc were loaded in WORKSPACE; they are not
needed when building against the system version but are difficult to
stub out, causing the build to fail.
grpc_extra_deps.bzl is provided to load all the requirements, so use
that from WORKSPACE instead of directly loading each individually.
This is also more maintainable going forward since there is less to
keep in sync in TF's WORKSPACE file.
Signed-off-by: Jason Zaman <jason@perfinion.com>
Add two repository rules:
- @local_execution_config_platform: local platform to allow selecting locally
executed tools on
- @local_execution_config_python: python configured for execution on the local
machine during otherwise remote builds
PiperOrigin-RevId: 307862682
Change-Id: Ie0320f2f137a40b418632989981c9dc072ef80e6
- @local_execution_config_platform: local platform to allow selecting locally
executed tools on
- @local_execution_config_python: python configured for execution on the local
machine during otherwise remote builds
Mark rules that are required to run locally to require our local platform.
This allows python paths to differ between the remote docker image and local
machine.
PiperOrigin-RevId: 307771596
Change-Id: If1f0013ec88a35d507b2b622894208aab2416fe5
- @local_execution_config_platform: local platform to allow selecting locally
executed tools on
- @local_execution_config_python: python configured for execution on the local
machine during otherwise remote builds
Mark rules that are required to run locally to require our local platform.
This allows python paths to differ between the remote docker image and local
machine.
For example, the local machine might have python 3.7 installed in
/usr/bin/python3, while the remote docker should use a python installed
in /usr/local/bin/python3.8.
PiperOrigin-RevId: 307585019
Change-Id: I29313121beb967b77ae123e7d1b614c688cb40ca
- @local_execution_config_platform: local platform to allow selecting locally
executed tools on
- @local_execution_config_python: python configured for execution on the local
machine during otherwise remote builds
Mark rules that are required to run locally to require our local platform.
This allows python paths to differ between the remote docker image and local
machine.
For example, the local machine might have python 3.7 installed in
/usr/bin/python3, while the remote docker should use a python installed
in /usr/local/bin/python3.8.
PiperOrigin-RevId: 307558811
Change-Id: I0dc2d877a7c26b294bf2b569b4f121cf6506e7fc
Fixes #33758
Downstream projects depending on TensorFlow: If bazel complains, please substitute `@zlib_archive` with `@zlib`, and `@grpc` with `@com_github_grpc_grpc` in WORKSPACE.
PiperOrigin-RevId: 295824868
Change-Id: If2259d59e9d82543369e5670916b1398374c9889
--python_path will be removed in a future Bazel release; we should switch to the Python toolchain. But currently we want Bazel to always use the same Python binary specified in configure.py, regardless of what is specified in the py_binary rule (PY2 or PY3). So we point both the py2 and py3 runtimes to the same PYTHON_BIN_PATH.
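A minimal sketch of what this looks like in Bazel's Python toolchain terms; the interpreter path below is an example, and the real rules are generated from PYTHON_BIN_PATH by configure.py:
```
load("@bazel_tools//tools/python:toolchain.bzl", "py_runtime_pair")

py_runtime(
    name = "py2_runtime",
    interpreter_path = "/usr/bin/python3",  # PYTHON_BIN_PATH chosen in configure.py
    python_version = "PY2",
)

py_runtime(
    name = "py3_runtime",
    interpreter_path = "/usr/bin/python3",  # the same binary for both runtimes
    python_version = "PY3",
)

py_runtime_pair(
    name = "py_runtime_pair",
    py2_runtime = ":py2_runtime",
    py3_runtime = ":py3_runtime",
)
```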
PiperOrigin-RevId: 273032026