Commit Graph

568 Commits

Author SHA1 Message Date
David Dunleavy
eaa584b953 Give XLA its own .bazelrc, remove the TensorFlow bazelrc from openxla/xla
PiperOrigin-RevId: 729602048
2025-02-21 11:50:12 -08:00
Vladimir Belitskiy
c2a59b4cb9 Remove the Windows 2019 configs.
CI now runs completely on Windows 2022.

PiperOrigin-RevId: 729152614
2025-02-20 10:13:01 -08:00
A. Unique TensorFlower
8065bb7383 Use "common" instead of "build" for some flags in .bazelrc
Setting "build" options in the RC file prevents applying the flags to the query command. "common" works for both build and query commands.

Flags like `--experimental_cc_shared_library` change the Starlark semantics, which forces re-fetching all repo rules when switching between commands.

Ideally, more flags should be common instead of build.

PiperOrigin-RevId: 728658294
2025-02-19 07:17:22 -08:00
Alex
6bfc0a6475 [ROCm] Suppress offsetof-extensions warning for rocm build
This closes https://github.com/openxla/xla/pull/22347

PiperOrigin-RevId: 725618260
2025-02-11 07:18:07 -08:00
Alexander Grund
762ea6c2bd
Merge branch 'tensorflow:master' into remove-unused-vars 2025-02-10 09:59:44 +01:00
Alex
bd562209b8 PR #21901: Ci add rocm6.1 deps for ubuntu 20.04
Imported from GitHub PR https://github.com/openxla/xla/pull/21901

Add rocm 6.1.0 dependency for ubuntu 20.04
Copybara import of the project:

--
0acf028eeca5923c7f2aa5762297686836eda310 by Alexandros Theodoridis <atheodor@amd.com>:

Add rocm6.1 deps for ubuntu 20.04

--
fc88c83061d6efff2482599489d622ab3114b9a7 by Alexandros Theodoridis <atheodor@amd.com>:

Fix hermetic build for 6.0

--
73ace5591f4731e1b95b6d3e6a349b528977c580 by Alexandros Theodoridis <atheodor@amd.com>:

Add ci config for hermetic build

--
bbc048bcffd9d35bfad76ff816ed22f3e3f761f8 by Alexandros Theodoridis <atheodor@amd.com>:

Introduce rocm 6.1.0 dependency for 22.04

--
9776f398c2711ba37333d29b934d6ba67c55dbef by Alexandros Theodoridis <atheodor@amd.com>:

Add missing 24.04 redist

--
acf275d57cc185b9c2122d5930d8cf54e473ad95 by Alexandros Theodoridis <atheodor@amd.com>:

Fix test

--
3e49285b0f55597ab5f44c1d0a422bf931d72cda by Alexandros Theodoridis <atheodor@amd.com>:

Add comment explaining the reason for a new target

--
35838bf8d6e678717e9b1c551f840918b00a91f8 by Alexandros Theodoridis <atheodor@amd.com>:

Revert force verbose in the compiler wrapper

--
2952e115b044e1a8ac8aadc7eac7802e8d79cf91 by Alexandros Theodoridis <atheodor@amd.com>:

Add explanation comment for the new target

Merging this change closes #21901

PiperOrigin-RevId: 721043735
2025-01-29 11:20:47 -08:00
Vladimir Belitskiy
01a5dd9a30 Update a missed value for windows_x86_cpu_2022 config.
PiperOrigin-RevId: 720183409
2025-01-27 08:47:03 -08:00
Vladimir Belitskiy
82ba59a646 Fix up the 2022 Win RBE config.
PiperOrigin-RevId: 718920772
2025-01-23 10:29:55 -08:00
David Dunleavy
63323a38c3 Move XLA specific bits out of TensorFlow's bazelrc
There are far fewer of these than expected!

PiperOrigin-RevId: 717978053
2025-01-21 10:43:05 -08:00
mmakevic-amd
55a19c8399 PR #21396: [ROCm] Fix build break due to XNNPACK update and add NCCL_MAX_NCHANNELS to multi gpu tests
Imported from GitHub PR https://github.com/openxla/xla/pull/21396

`NCCL_MAX_NCHANNELS=1`  is necessary for collective ops tests to pass in CI.

As for the XNNPACK problem, a similar fix has already been made for single-gpu tests: https://github.com/openxla/xla/pull/20975
Copybara import of the project:

--
631fa6b7fc859c083e0735d2ce47167cbf57c174 by Milica Makevic <Milica.Makevic@amd.com>:

Fix build break due to XNNPACK update

--
d226a07701ddd88d45e7b27406d2915d032832a4 by Milica Makevic <Milica.Makevic@amd.com>:

Add NCCL_MAX_NCHANNELS env variable to multi gpu tests

--
b826eee7ab0e2a1f4d062871465ab4dac48ead37 by Milica Makevic <Milica.Makevic@amd.com>:

Split bazel command arguments in multiple lines

Merging this change closes #21396

PiperOrigin-RevId: 715779335
2025-01-15 07:00:16 -08:00
David Dunleavy
dee7f1feb4 Use @com_google_googletest//:gtest_main instead of tsl/platform:test_main
Also sets `test --test_env="GTEST_INSTALL_FAILURE_SIGNAL_HANDLER=1"` in the bazelrc to retain stacktrace behavior when tests are killed

PiperOrigin-RevId: 715072546
2025-01-13 12:55:18 -08:00
Vladimir Belitskiy
fa3b3c1a0d Update scripts/configs for Windows nightly/release builds.
`set -u` (does not allow unbound variables) has been removed from all scripts.
This is due to Docker on Windows treating variables in an env file,
set to an empty value (`MY_VAR=`), as unbound variables. Consequently,
these variables, even though they are "set", do not make it into the Docker
container at all, and various checks for those variables fail outright.

PiperOrigin-RevId: 713717958
2025-01-09 10:37:25 -08:00
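The `set -u` behaviour described in fa3b3c1a0d can be illustrated with a small sketch. It only shows how a shell treats an unset variable versus one set to an empty value; it does not reproduce the Docker-on-Windows env-file handling. `MY_VAR` is the placeholder name from the commit message, and `check_var` is a hypothetical helper introduced here.

```shell
#!/usr/bin/env bash
# Sketch: how `set -u` treats unset vs. empty variables.

check_var() {
  # Run in a subshell so that a `set -u` failure does not abort the caller.
  ( set -u; : "$MY_VAR" ) 2>/dev/null && echo "bound" || echo "unbound"
}

unset MY_VAR
check_var    # MY_VAR was never set, so `set -u` rejects the expansion

MY_VAR=
check_var    # MY_VAR is empty but set, so `set -u` accepts it

# A default expansion tolerates both cases, with or without `set -u`.
unset MY_VAR
echo "value: '${MY_VAR:-}'"
```

If the empty variable never reaches the container, a script running there sees the first case, which is presumably why the checks failed outright.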
Jacques Pienaar
95def861c7 Remove experimental TOSA convert python API
In preparation for larger changes, this entry point is being disabled here for now.

PiperOrigin-RevId: 713316210
2025-01-08 09:30:57 -08:00
A. Unique TensorFlower
75a4b2e04f Make TF wheel API tests manual.
PiperOrigin-RevId: 707632318
2024-12-18 12:29:34 -08:00
Nitin Srinivasan
5f9cbc8ac4 Remove duplicated XLA .bazelrc configs
These were created to be able to set a different path to the toolchain configs when building XLA. Instead of creating duplicated configs, we will use copybara to transform paths in the .bazelrc between TF and XLA.

PiperOrigin-RevId: 707109121
2024-12-17 08:32:31 -08:00
Quoc Truong
d244f415c4 Add new ml-build-rbe container to the configurations of RBE used with remote config.
Update .bazelrc file to use the new RBE config.

PiperOrigin-RevId: 706005323
2024-12-13 14:43:43 -08:00
A. Unique TensorFlower
2d3abecfda Add TF package import tests in CPU presubmit, continuous and nightly jobs.
PiperOrigin-RevId: 705964187
2024-12-13 12:53:24 -08:00
Alex
28df421ebd PR #19660: [ROCm] switch rocm build to clang
Imported from GitHub PR https://github.com/openxla/xla/pull/19660

This PR switches the default rocm build to clang as the gcc config is broken at the moment.

Copybara import of the project:

--
ea48f7c480d110eab3f133ed6ea8989da0e1e724 by Alexandros Theodoridis <atheodor@amd.com>:

[ROCm] switch rocm build to clang

--
2743fabafd6a358c05e858781064e7fa2e389c78 by Alexandros Theodoridis <atheodor@amd.com>:

Remove explicit clang path from the bazelrc rocm config

--
202dea0a80602cafdbee6067d8f20dc3055c6bbb by Alexandros Theodoridis <atheodor@amd.com>:

Address review comments

Merging this change closes #19660

PiperOrigin-RevId: 699222609
2024-11-22 10:59:36 -08:00
Vadym Matsishevskyi
641de23853 Fix windows-specific issues for pywrap rules
PiperOrigin-RevId: 694978320
2024-11-09 21:31:30 -08:00
A. Unique TensorFlower
6be047c796 Rollback of PR #76831
Roll back https://github.com/tensorflow/tensorflow/pull/76831 because it broke `--config=linux_arm64_pycpp_test` with `--test_env=TF_ENABLE_ONEDNN_OPTS=1`.

There are many places where `-fopenmp` is still appended for aarch64 config, e.g. [1](ca3fc6a119/xla/tsl/tsl.bzl (L352)), [2](ca3fc6a119/xla/tsl/tsl.bzl (L352))

Reverts 805ae2c264

PiperOrigin-RevId: 690760453
2024-10-28 15:02:31 -07:00
TensorFlower Gardener
805ae2c264 Merge pull request #76831 from snadampal:compute_library_def_runtime_fix
PiperOrigin-RevId: 690199035
2024-10-26 17:26:30 -07:00
A. Unique TensorFlower
14fe7ddb7d Add TF wheel API Bazel test for Linux and MacOS platforms.
PiperOrigin-RevId: 689025141
2024-10-23 11:48:35 -07:00
Sunita Nadampalli
89bb98e2bc aarch64: disable openmp runtime in Arm Compute library for threadpool build 2024-10-22 22:14:44 +00:00
A. Unique TensorFlower
82d1840d20 Move wheel_dependency flag to TSL repository.
PiperOrigin-RevId: 688223277
2024-10-21 12:11:27 -07:00
A. Unique TensorFlower
7ba00ec863 Fix import_api_packages_test for the cases when WHEEL_NAME is passed to Bazel options.
Add `--@xla//xla/tsl:wheel_dependency=true` flag to wheel tests.

PiperOrigin-RevId: 684610277
2024-10-10 16:22:54 -07:00
A. Unique TensorFlower
cdf0d2dd49 Rename nvcc_clang to cuda_nvcc according to the changes in JAX
Fix "Line too long" error in xla/.../configure.py issued by PyLint

PiperOrigin-RevId: 681940132
2024-10-03 10:57:50 -07:00
tchatow
3f4b2fda6f PR #16882: Symlink hermetic cuda headers to permit clang cuda version detection
Imported from GitHub PR https://github.com/openxla/xla/pull/16882

Fixes #16877
Copybara import of the project:

--
1ff356ac0870002b369c3ec09547aae2a62c70e2 by tchatow <tchatow@users.noreply.github.com>:

Symlink hermetic cuda headers to permit clang cuda version detection

Fixes #16877

Merging this change closes #16882

PiperOrigin-RevId: 679764212
2024-09-27 16:30:47 -07:00
A. Unique TensorFlower
ee313c5f35 Remove -Wno-error=unused-but-set-variable from .bazelrc
This flag is no longer available since clang 10.0.0.

PiperOrigin-RevId: 679177375
2024-09-26 09:28:12 -07:00
A. Unique TensorFlower
55b572cbdb Remove --@xla//xla/tsl:wheel_dependency=true to fix nightlies CI.
PiperOrigin-RevId: 677017563
2024-09-20 17:02:01 -07:00
A. Unique TensorFlower
0546d214f9 Fix wheel_test configurations.
PiperOrigin-RevId: 676939344
2024-09-20 12:56:29 -07:00
A. Unique TensorFlower
55ca3b150c Add TF wheel API test.
This test verifies whether the API v2 packages can be imported from the
current build. It utilizes the `_api/v2/api_packages.txt` list of packages from
the local wheel file specified in the `requirements_lock_<python_version>.txt`.

The test should be executed after the TF wheel has been built and put into the `dist` dir inside the TensorFlow repository.

PiperOrigin-RevId: 676893008
2024-09-20 10:43:17 -07:00
Vladimir Belitskiy
9b4fae4332 Disable xnn_enable_avxvnniint8 for Android.
This is only supported on the very latest compilers at the moment.

PiperOrigin-RevId: 676469843
2024-09-19 10:49:43 -07:00
Thomas Köppe
58374ae498 Remove obsolete op registrations from c_api_no_xla.
PiperOrigin-RevId: 675958129
2024-09-18 06:07:16 -07:00
Alexander Grund
a51fa4b859
Merge branch 'tensorflow:master' into remove-unused-vars 2024-09-17 12:49:48 +02:00
A. Unique TensorFlower
b26c4dff6e Update cuda_clang_official config with CUDA 12.5.1.
PiperOrigin-RevId: 675148340
2024-09-16 08:06:37 -07:00
Henning Becker
6411af23a9 Update default CUDA Toolkit version to 12.5.1
This updates CUDA for both TF and XLA. It also enables the CUDA driver forward
compatibility mode for XLA since XLA's CUDA graph integration needs a newer
driver version.

PiperOrigin-RevId: 673974335
2024-09-12 13:18:54 -07:00
A. Unique TensorFlower
f7c0688e34 Delete remote tensorrt repository rule calls from TF configs.
Starting from v.2.18.0, TensorFlow doesn't support TensorRT.

PiperOrigin-RevId: 673440422
2024-09-11 10:24:08 -07:00
Kanglan Tang
5fb6e85f96 Remove unsupported_*_linux configs from .bazelrc as planned
These configs are no longer needed as the old GCC toolchain is no longer supported.

If you still need these configs, add them to a user-specific bazelrc file for your project. Refer to the Bazel documentation https://bazel.build/run/bazelrc for more guidance.

PiperOrigin-RevId: 673047905
2024-09-10 12:22:40 -07:00
A. Unique TensorFlower
55de680725 Delete remote python repository rule calls from TF configs.
Remote configurations of python repositories are removed because hermetic Python repository rules install and configure python modules in Bazel cache on the host machine. The cache is shared across host and remote machines.

PiperOrigin-RevId: 671512134
2024-09-05 14:39:55 -07:00
A. Unique TensorFlower
2f7f2f9d67 Remove CUDA and NCCL repository rules calls from RBE configs.
The CUDA and NCCL repositories are created on a host machine now and shared via Bazel cache between host and remote machines.

PiperOrigin-RevId: 671089856
2024-09-04 14:03:13 -07:00
A. Unique TensorFlower
00cc85e759 Exclude CUDA dependencies from libtensorflow build.
PiperOrigin-RevId: 668435722
2024-08-28 07:29:20 -07:00
A. Unique TensorFlower
d9071b91c4 Refactor hermetic CUDA flags and update --config=cuda to add CUDA dependencies both for bazel build and bazel test phases.
Add `--@local_config_cuda//cuda:override_include_cuda_libs` to override settings for TF wheel.

Forbid building TF wheel with `--@local_config_cuda//cuda:include_cuda_libs=true`

PiperOrigin-RevId: 666848518
2024-08-23 11:43:31 -07:00
Henning Becker
510dc6fe76 Update cuDNN to version 9.3.0 in TF's and XLA's CI
Due to the amazing hermetic CUDA change this is now just a
one line change and all tests automatically run as presubmits.

PiperOrigin-RevId: 666780598
2024-08-23 07:18:02 -07:00
A. Unique TensorFlower
9b5fa66dc6 Introduce hermetic CUDA in Google ML projects.
1) Hermetic CUDA rules allow building wheels with GPU support on a machine without GPUs, as well as running Bazel GPU tests on a machine with only GPUs and NVIDIA driver installed. When `--config=cuda` is provided in Bazel options, Bazel will download CUDA, CUDNN and NCCL redistributions in the cache, and use them during build and test phases.

    [Default location of cuDNN redistributions](https://developer.download.nvidia.com/compute/cudnn/redist/)

    [Default location of CUDA redistributions](https://developer.download.nvidia.com/compute/cuda/redist/)

    [Default location of NCCL redistributions](https://pypi.org/project/nvidia-nccl-cu12/#history)

2) To include hermetic CUDA rules in your project, add the following in the WORKSPACE of the downstream project dependent on XLA.

   Note: use `@local_tsl` instead of `@tsl` in the TensorFlow project.

   ```
   load(
      "@tsl//third_party/gpus/cuda/hermetic:cuda_json_init_repository.bzl",
      "cuda_json_init_repository",
   )

   cuda_json_init_repository()

   load(
      "@cuda_redist_json//:distributions.bzl",
      "CUDA_REDISTRIBUTIONS",
      "CUDNN_REDISTRIBUTIONS",
   )
   load(
      "@tsl//third_party/gpus/cuda/hermetic:cuda_redist_init_repositories.bzl",
      "cuda_redist_init_repositories",
      "cudnn_redist_init_repository",
   )

   cuda_redist_init_repositories(
      cuda_redistributions = CUDA_REDISTRIBUTIONS,
   )

   cudnn_redist_init_repository(
      cudnn_redistributions = CUDNN_REDISTRIBUTIONS,
   )

   load(
      "@tsl//third_party/gpus/cuda/hermetic:cuda_configure.bzl",
      "cuda_configure",
   )

   cuda_configure(name = "local_config_cuda")

   load(
      "@tsl//third_party/nccl/hermetic:nccl_redist_init_repository.bzl",
      "nccl_redist_init_repository",
   )

   nccl_redist_init_repository()

   load(
      "@tsl//third_party/nccl/hermetic:nccl_configure.bzl",
      "nccl_configure",
   )

   nccl_configure(name = "local_config_nccl")
   ```

PiperOrigin-RevId: 662981325
2024-08-14 11:47:44 -07:00
A. Unique TensorFlower
caa2b33e73 Disable TensorRT in TF, XLA and JAX.
This is needed for hermetic CUDA integration in Google ML projects since TensorRT is not distributed as freely as the other CUDA/cuDNN redistributions.

PiperOrigin-RevId: 662601190
2024-08-13 12:17:00 -07:00
A. Unique TensorFlower
96eef05d59 Reverts 2a0d14de06
PiperOrigin-RevId: 661349081
2024-08-09 12:47:16 -07:00
Nitin Srinivasan
2a0d14de06 Remove --flaky_test_attempts from Linux Arm64 presubmit/continuous test filters
We want to use this flag only in the release/nightly wheel builds

PiperOrigin-RevId: 660929972
2024-08-08 12:37:15 -07:00
Vladimir Belitskiy
7f03e159cd Set up a new Windows presubmit, and have it use the pycpp.sh script.
PiperOrigin-RevId: 660554802
2024-08-07 16:01:25 -07:00
Eugene Zhulenev
8bfd1d69fa [xla:cpu] Optimize ThunkExecutor::Execute part #2
Use std::aligned_storage_t trick to avoid default-initializing Node struct on a hot path.

name                                     old cpu/op   new cpu/op   delta
BM_SelectAndScatterF32/128/process_time   791µs ± 4%   720µs ± 2%  -8.93%
BM_SelectAndScatterF32/256/process_time  3.20ms ± 4%  2.96ms ± 2%  -7.46%
BM_SelectAndScatterF32/512/process_time  13.7ms ± 5%  12.8ms ± 2%  -6.80%

name                                     old time/op          new time/op          delta
BM_SelectAndScatterF32/128/process_time   790µs ± 5%           719µs ± 1%   -9.00%
BM_SelectAndScatterF32/256/process_time  3.20ms ± 3%          2.96ms ± 1%   -7.58%
BM_SelectAndScatterF32/512/process_time  13.2ms ± 4%          12.3ms ± 1%   -6.82%

PiperOrigin-RevId: 658139935
2024-07-31 14:59:06 -07:00
David Dunleavy
5d8e3738d0 Remove unused aws_support and hdfs_support configs
PiperOrigin-RevId: 655272587
2024-07-23 13:48:57 -07:00
David Dunleavy
e7c17d6fbf Delete unused //tensorflow/python/integration_testing/... and all users
PiperOrigin-RevId: 651843853
2024-07-12 13:16:57 -07:00
Michael Hudgins
a79c87342c Upgrade to support and default to clang 18 for the OSS compiler
PiperOrigin-RevId: 651080905
2024-07-10 11:56:24 -07:00
David Dunleavy
8a69fee806 Add bazelrc change that should've gone in with https://github.com/openxla/xla/pull/13408
This change fixes mlir_hlo tests on Windows

PiperOrigin-RevId: 642299075
2024-06-11 10:12:54 -07:00
Nitin Srinivasan
43eb24b614 Disable BES uploads and set BEP uploads to minimal for Mac RBE cross-compile builds
PiperOrigin-RevId: 630135925
2024-05-02 12:34:20 -07:00
Nitin Srinivasan
96959d2a0d Add --remote_download_minimal to RBE cross-compile Mac config
Fixes https://github.com/bazelbuild/bazel/issues/21568

PiperOrigin-RevId: 629519684
2024-04-30 14:10:14 -07:00
A. Unique TensorFlower
36304aadbd Add RBE toolchains for Clang on Windows.
PiperOrigin-RevId: 627830607
2024-04-24 13:50:35 -07:00
mraunak
5904f2db12 PR #11299: [XLA:CPU] Enable XLA on Windows
Imported from GitHub PR https://github.com/openxla/xla/pull/11299

This PR aims to enable the XLA test cases on the Windows Platform. The changes made:

1. Changed the .bazelrc file to use the correct toolchain and platform
This change will allow the user to successfully run XLA tests on the Windows platform using the Clang compiler using '--config=win_clang' in the bazel command

2. Added conditions to a few test cases to successfully run on the Windows platform
These test cases check the exit/termination status of a process
WIFEXITED is typically supported in POSIX-compliant operating systems like Unix and Linux to check if a process has terminated normally. WEXITSTATUS allows examining the termination status of child processes. However, these macros are not Windows compliant, hence the additional condition block was added to check the exit/termination status of process or child process for the Windows platform
Copybara import of the project:

--
ece9eefa224a6d051bcac089fe2a9a393af16a2b by Raunak <mayank.kumar.raunak@intel.com>:

Enable XLA Windows

--
347c0326af8f608047f06345cad4dfbb53a52150 by mraunak <83710963+mraunak@users.noreply.github.com>:

Update interactive_graphviz_bin_test.cc
--
2d4a3c2bb2ea23f12029583c53087d8739da0319 by mraunak <83710963+mraunak@users.noreply.github.com>:

Update xla/tools/interactive_graphviz_bin_test.cc

Co-authored-by: Penporn Koanantakool <38085909+penpornk@users.noreply.github.com>
--
90ad8b2730900d7f82b0fb1a83a73ffa2e452e0e by mraunak <83710963+mraunak@users.noreply.github.com>:

Update run_hlo_module_bin_test.cc
--
7f31412f6b57e53bd56ba92149b015fcba92b07c by mraunak <83710963+mraunak@users.noreply.github.com>:

Update xla/tools/run_hlo_module_bin_test.cc

Co-authored-by: Penporn Koanantakool <38085909+penpornk@users.noreply.github.com>
--
4d39e35461f6977f36404a435c68b4809fb51a44 by mraunak <83710963+mraunak@users.noreply.github.com>:

Update hlo_expand_test.cc
--
816b9ae0498831b55139c72d627d07f05e51213b by mraunak <83710963+mraunak@users.noreply.github.com>:

Update hlo_expand_test.cc
--
a728fff1aca4258602c3d7c78afcc7d38b545b7a by mraunak <83710963+mraunak@users.noreply.github.com>:

Update hlo_expand_test.cc
--
ffcb6861becf8de7a5c4e64ef0f19d977475d281 by mraunak <83710963+mraunak@users.noreply.github.com>:

Update interactive_graphviz_bin_test.cc
--
f181497793c2c48079d72b545e6efb837f490504 by mraunak <83710963+mraunak@users.noreply.github.com>:

Update interactive_graphviz_bin_test.cc
--
f9af75677e4663b3348e9e89376262eaff389ea9 by mraunak <83710963+mraunak@users.noreply.github.com>:

Update run_hlo_module_bin_test.cc

Merging this change closes #11299

PiperOrigin-RevId: 627216606
2024-04-22 18:21:54 -07:00
Yang Sheng
8db95db399 Add sycl build target #10244
PiperOrigin-RevId: 623599115
2024-04-10 14:49:38 -07:00
A. Unique TensorFlower
87a99438f8 Create and use 2.17 toolchains in Docker build containers.
PiperOrigin-RevId: 623491147
2024-04-10 08:40:34 -07:00
Jake Harmon
41bfc5d180 Set release_base for all release platforms
PiperOrigin-RevId: 620111958
2024-03-28 18:37:22 -07:00
David Dunleavy
aed79b8b9a Generate XLA's warnings.bazelrc automatically by inspecting the toolchain
PiperOrigin-RevId: 619326423
2024-03-26 16:27:38 -07:00
David Dunleavy
5a6898987e Enable more warnings for XLA
PiperOrigin-RevId: 618259828
2024-03-22 13:10:22 -07:00
Alexander Grund
389f1354d4
Merge branch 'tensorflow:master' into remove-unused-vars 2024-03-09 11:24:30 +01:00
Quoc Truong
727affbef5 Add support for cross-compilation build for Linux XLA ARM64 build. The current bazel config does not work out of the box for the XLA repository because of some path differences. Additionally, the build needs to be generated with Standalone Genrule strategy because the XLA build does not use hermetic Python.
PiperOrigin-RevId: 613373529
2024-03-06 16:35:18 -08:00
TensorFlower Gardener
39abfdc1af Merge pull request #60161 from RoboTux:fix_building_for_host_arch
PiperOrigin-RevId: 611657552
2024-02-29 18:13:40 -08:00
A. Unique TensorFlower
c71bcf6fe2 Add support for sm_89 and remove support for sm_50 compute capability
We expect a noticeable performance improvement on NVIDIA GPUs that support
sm_89 (NVIDIA L4 and L40) over just providing sm_80 (which is compatible).

To avoid an increase in package size we remove support for the oldest generation of GPUs (sm_50). The last GPU that does not support
sm_60 was released in early 2016 (8 years ago) and it's a laptop GPU.
The last desktop GPU without sm_60 support was released in early 2015 (9 years ago).

PiperOrigin-RevId: 610648659
2024-02-26 23:50:04 -08:00
Nitin Srinivasan
c5adaecd9b Change bes_upload_mode to nowait_for_upload_complete for the cross-compile macOS build
`fully_async` lets Bazel upload the build event in the background and does not prematurely mark the build as failed if the upload takes longer than the value in `--bes_timeout`. However, BES uploads sometimes take a very long time (>5hrs) to finish. Having test results show up in the result store would be nice, but since it is not a priority for these builds right now, we can disable BES upload when cross-compiling for macOS.

PiperOrigin-RevId: 608994569
2024-02-21 08:10:00 -08:00
Nitin Srinivasan
5adeb8894a Fix typo in config name
Only macOS RBE configs need bes upload mode to be set to fully async

PiperOrigin-RevId: 607511822
2024-02-15 17:50:07 -08:00
Nitin Srinivasan
684cfabcff Change bes_upload_mode to fully async for macOS RBE build
The experimental cross-compile build sometimes fails because the build event protocol upload for these builds is flaky and sometimes times out. This causes the job to be reported as failed even if `bazel build` ran successfully. By changing the mode to fully async, Bazel uploads BES asynchronously, so it can exit while the upload continues in the background.

PiperOrigin-RevId: 607416907
2024-02-15 12:31:00 -08:00
Nitin Srinivasan
26c102c580 Make the experimental macOS cross-compile build to be "build-only"
PiperOrigin-RevId: 606757596
2024-02-13 15:15:52 -08:00
A. Unique TensorFlower
87a10faa4d Minor fix for addressing TF query analysis errors
PiperOrigin-RevId: 604669820
2024-02-06 09:29:44 -08:00
Nitin Srinivasan
ebed0b006a Set --jobs for macOS cross-compile builds in .bazelrc to avoid being overridden
`rbe_cross_compile_macos_x86` expands `rbe_cross_compile_base` which in turn expands `rbe_base` which has `--jobs=800`. We avoid the jobs setting being overridden by setting it after `--config rbe_cross_compile_base`.

PiperOrigin-RevId: 603804012
2024-02-02 15:38:12 -08:00
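The ordering argument in ebed0b006a can be sketched with a toy model. This is not Bazel itself: `effective_value` is a hypothetical helper that takes flags in their already expanded order and reports the last `--jobs` value, on the assumption that the last occurrence of a single-valued flag is the one that takes effect. The value 100 is made up for illustration; only `--jobs=800` comes from the commit message.

```shell
# Toy model (not Bazel): report the --jobs value that takes effect
# when flags are read left to right and the last occurrence wins.
effective_value() {
  value=""
  for arg in "$@"; do
    case "$arg" in
      --jobs=*) value="${arg#--jobs=}" ;;
    esac
  done
  echo "$value"
}

# --jobs=800 arrives through the nested --config expansion (rbe_base).
# Placing the explicit setting after it lets the explicit value win.
effective_value --jobs=800 --jobs=100

# Placing it before the expansion would let rbe_base override it.
effective_value --jobs=100 --jobs=800
```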
Nitin Srinivasan
960ced3b2e Add experimental build-only Linux Arm64 continuous job
This is an experimental job that only builds the tests and does not run them. The ML DevInfra team is using this to collect data on how these builds perform in production when compared to the regular continuous job (build + test) and to see if their runtime is fast enough to be enabled as presubmit (~20-25 mins)

"pycpp.sh" is modified such that it can be used either in build or test mode. Set `TFCI_PYCPP_SWAP_TO_BUILD_ENABLE` to 1 to enable build mode (default is test mode). Since test configs inherit from build, `linux_arm64_pycpp_test` and `linux_arm64_pycpp_test_filters` are changed to be prefixed with "build:" so that we can run both `bazel build` and `bazel test` commands with the same config.

PiperOrigin-RevId: 603543484
2024-02-01 18:54:25 -08:00
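The build/test switch described in 960ced3b2e might look roughly like the sketch below. It is not the real `pycpp.sh`: `pycpp_bazel_subcommand` is a hypothetical helper, and the actual script passes more flags. Only the variable `TFCI_PYCPP_SWAP_TO_BUILD_ENABLE` and the config name `linux_arm64_pycpp_test` come from the commit message.

```shell
# Hypothetical helper, not the real pycpp.sh: choose the Bazel subcommand.
# TFCI_PYCPP_SWAP_TO_BUILD_ENABLE=1 selects build mode; test mode is the default.
pycpp_bazel_subcommand() {
  if [ "${TFCI_PYCPP_SWAP_TO_BUILD_ENABLE:-0}" = "1" ]; then
    echo "build"
  else
    echo "test"
  fi
}

# Print the command instead of running it, so the sketch works without Bazel.
echo "bazel $(pycpp_bazel_subcommand) --config=linux_arm64_pycpp_test"
```

Because the config is declared under "build:", the same `--config` name is valid for both subcommands, which is what makes a single switch like this sufficient.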
David Dunleavy
6c2ee9b54d Add non-rbe nvcc+clang config to bazelrc
PiperOrigin-RevId: 603457045
2024-02-01 13:06:32 -08:00
David Dunleavy
277eed5e83 Remake configure.py to be XLA specific
Also, move it to `build_tools/configure/configure.py` and add tests

PiperOrigin-RevId: 603158260
2024-01-31 14:28:59 -08:00
A. Unique TensorFlower
4f5b948a7a Addressing TF query analysis errors
PiperOrigin-RevId: 602866660
2024-01-30 16:10:46 -08:00
Michael Hudgins
3d0d85522a Reset the cache for the Linux builds
PiperOrigin-RevId: 600891749
2024-01-23 14:38:00 -08:00
Nitin Srinivasan
a4f0d05366 Increase test timeouts for the cross-compile Mac build
Even though we cross-compile with RBE, the tests are still run locally on the host Mac, which requires these increased timeout thresholds.

PiperOrigin-RevId: 599673837
2024-01-18 17:49:27 -08:00
Nitin Srinivasan
89419e9949 Disable tests that fail in cross-compile setting for macOS x86
This only affects the experimental cross-compile continuous build.

PiperOrigin-RevId: 599626561
2024-01-18 14:40:05 -08:00
A. Unique TensorFlower
d1631c5e5b Update CUDA to 12.3 in JAX/TF/XLA CIs
This is updating CUDA to version 12.3. Related libraries (notably cuDNN) are also getting updated.

PiperOrigin-RevId: 596515360
2024-01-08 01:32:19 -08:00
Nitin Srinivasan
be288d2446 Only run small and medium sized tests in the macOS continuous build
PiperOrigin-RevId: 595791283
2024-01-04 14:06:06 -08:00
Dr. Christoph Mittendorf
4d0f3dad38
Fixed multiple typos
Fixed multiple typos (english)
2023-12-31 16:43:05 +01:00
Nitin Srinivasan
7db05f8a3f Add experimental cross-compile build for macOS
This adds experimental nightly and continuous CI builds that cross-compile TensorFlow for macOS on remote Linux x86 VMs.

PiperOrigin-RevId: 592688193
2023-12-20 16:45:47 -08:00
Nitin Srinivasan
1ef3cd7195 Add Bazel toolchain configs for cross-compiling TensorFlow for macOS
PiperOrigin-RevId: 592059267
2023-12-18 17:53:09 -08:00
Nitin Srinivasan
db579439ee Migrate experimental macOS x86 nightly builds to the new CI folder
We need to install Bazelisk and Pyenv manually as these are not present on the x86 Mac VMs. Note that the uploads from these new jobs are disabled as they are not yet ready. However, the old Mac x86 nightly builds will keep running and uploading to tf-nightly, so there won't be any missing nightly packages while we are doing this migration.

PiperOrigin-RevId: 587930871
2023-12-04 21:22:40 -08:00
A. Unique TensorFlower
4f1d4bc65b Update XLA GPU config with NVCC compiler.
PiperOrigin-RevId: 586163360
2023-11-28 18:51:47 -08:00
Blake Hechtman
1218ceb4b1 XRT is no longer in use. Remove the code.
PiperOrigin-RevId: 584283434
2023-11-21 04:25:39 -08:00
Nitin Srinivasan
e316f08b70 Skip failing test temporarily in macOS Arm64 continuous builds
PiperOrigin-RevId: 584162939
2023-11-20 16:57:59 -08:00
Nitin Srinivasan
dfc0ccc507 Add Bazel toolchain configs for cross-compiling TensorFlow for Linux Aarch64
This adds support for cross-compiling TensorFlow targets for Linux Aarch64 on a Linux x86 machine. We use Clang as the cross-compiler, `ld.lld` as the linker and build in a special cross-compile supported [Docker image](http://gcr.io/tensorflow-testing/ml-devinfra-linux-aarch64-cross-compile@sha256:de26c1dbddcb42b48e665972f62f128f2c69e0f1aa6f0ba6c7411dd23d4de785) that contains all the necessary build tools and the sysroots for both Linux Aarch64 and Linux x86. We need a Linux x86 toolchain because Bazel needs it to be able to build the tools used during the build—such as Protoc, llvm-tablegen, flatc—correctly for our execution platform (Linux x86).

In addition, this adds support for cross-compiling using RBE. We do this by invoking a Bazel remote build from a Linux Aarch64 host, which then sends build requests to remote Linux x86 VMs. The Linux x86 VMs build inside the cross-compile Docker image using the cross-compile toolchain configs to build the targets for Aarch64. The targets, once built, are automatically transferred to the host. RBE cross-compiling is necessary for us to be able to run `bazel test` commands. Note that a lot of the "host_" flags, such as "host_cpu" and "host_crosstool_top", seem to actually be used to specify the execution platform details. It seems it is this way because these flags are old and predate the distinction between host and execution platform.

The toolchain configs can be found in "tensorflow/tools/toolchains/cross_compile/cc/BUILD" and the RBE platform configs can be found in "tensorflow/tools/toolchains/cross_compile/config/BUILD".

If trying to cross-compile without RBE, run your build from a Linux x86 host in the [Docker image](http://gcr.io/tensorflow-testing/ml-devinfra-linux-aarch64-cross-compile@sha256:de26c1dbddcb42b48e665972f62f128f2c69e0f1aa6f0ba6c7411dd23d4de785)
and use `--config=cross_compile_linux_arm64`.

If you are trying to cross-compile with RBE, run your build from a Linux Aarch64 host and use `--config=rbe_cross_compile_linux_arm64`. Since RBE uses GCP VM instances and requires authentication, it is only available to Googlers and TF CI builds. Tests can only be run with RBE.

PiperOrigin-RevId: 583062010
2023-11-16 09:04:54 -08:00
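The two invocation modes described in dfc0ccc507 can be summarised in a short sketch. `cross_compile_config` is a hypothetical helper and `//some/package:target` is a placeholder label; only the two `--config` names come from the commit message. The sketch assumes `uname -m` reports `x86_64` or `aarch64` on the respective Linux hosts.

```shell
# Hypothetical helper: map the host architecture to the config named above.
cross_compile_config() {
  case "$1" in
    # Linux x86 host: build locally inside the cross-compile Docker image.
    x86_64)  echo "--config=cross_compile_linux_arm64" ;;
    # Linux Aarch64 host: build through RBE (requires GCP authentication).
    aarch64) echo "--config=rbe_cross_compile_linux_arm64" ;;
    *)       echo "unsupported host architecture: $1" >&2; return 1 ;;
  esac
}

# Print the command for the current host instead of running it.
if config="$(cross_compile_config "$(uname -m)")"; then
  echo "bazel build ${config} //some/package:target"
fi
```

Per the commit message, `bazel test` is only possible in the RBE mode, so the sketch prints a `bazel build` command for both hosts.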
A. Unique TensorFlower
cf0dfaeedf Turn on clang+nvcc compiler for rbe_linux_cuda_nvcc config.
PiperOrigin-RevId: 582773576
2023-11-15 13:04:02 -08:00
Nitin Srinivasan
f01d901e43 Enable macOS Arm64 nightly builds
PiperOrigin-RevId: 582177793
2023-11-13 21:18:03 -08:00
A. Unique TensorFlower
a3b3e7dfba Make sure benchmark-test tagged targets are excluded from all test configs.
Adjust shard counts for a couple more test targets that run empty shards.

PiperOrigin-RevId: 580579935
2023-11-08 10:47:24 -08:00
Deqiang Chen
847d314488 ifrt test should not be built when the dependency is not built
PiperOrigin-RevId: 578683545
2023-11-01 17:02:05 -07:00
A. Unique TensorFlower
88e5914db5 Add Kokoro continuous job for testing XLA Linux GPU with NVCC.
PiperOrigin-RevId: 577849947
2023-10-30 08:25:17 -07:00
A. Unique TensorFlower
16afd0591c Disables flaky test flag after 2.15 branch cut.
PiperOrigin-RevId: 577393994
2023-10-27 23:27:44 -07:00
A. Unique TensorFlower
5bbc6ed4ca Add SM 9.0 / Remove SM 7.5 from TF builds
With the recent CUDA update we gained support for SM 9.0 compute capabilities (Hopper-based GPUs),
but so far we hadn't enabled this by default in the nightly builds.

To keep file sizes in check we remove the build for SM 7.5. Users of the
Turing architecture (SM 7.5) will fall back to Volta (SM 7.0) which is compatible.

The main difference between Turing and Volta is TensorCores, which TF doesn't target from custom CUDA kernels, hence we don't expect any performance disadvantages
for Turing users.

It also brings us in line with what JAX is doing.

PiperOrigin-RevId: 576835080
2023-10-26 05:33:43 -07:00
A. Unique TensorFlower
8ea8e69e25 Create and use 2.16 toolchains in Docker build containers.
PiperOrigin-RevId: 576310310
2023-10-24 15:57:14 -07:00
Michael Hudgins
56cf3e14d3 Enable flaky test attempts for experimental arm64 config
PiperOrigin-RevId: 572947090
2023-10-12 10:43:01 -07:00
TensorFlower Gardener
dc4fbf0408 Merge pull request #61871 from Intel-tensorflow:mraunak/Clang_TF_Win
PiperOrigin-RevId: 572096891
2023-10-09 18:46:06 -07:00
Nitin Srinivasan
f36d7785e0 Add CI configs for macOS Arm64 builds
PiperOrigin-RevId: 571390990
2023-10-06 11:45:51 -07:00