mirror of https://github.com/zebrajr/tensorflow.git
synced 2025-12-06 12:20:11 +01:00
master
4 Commits
5dd1758095
Add support for local wheel whitelisting and blacklisting
Also fix the Python version matching logic for wheels that do not require a specific Python version.
PiperOrigin-RevId: 650383841
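For context, the matching fix is easiest to picture in terms of PEP 425 wheel tags: a wheel tagged `py3-none-any` declares no specific interpreter requirement and should match any Python 3 interpreter, while a `cp311` wheel requires exactly CPython 3.11. Below is a minimal C++ sketch of that matching rule; `PythonTagMatches` is a hypothetical helper, not the actual Bazel rule logic.

```cpp
#include <iostream>
#include <string>

// Returns true if a wheel's Python tag (PEP 425) is compatible with the
// requested interpreter tag, e.g. "cp311". Generic "py3" wheels declare no
// specific version requirement and match any Python 3 interpreter.
// Hypothetical helper; the real logic lives in TensorFlow's Bazel rules.
bool PythonTagMatches(const std::string& wheel_tag,
                      const std::string& interpreter_tag) {
  if (wheel_tag == "py3") return true;   // version-agnostic wheel
  return wheel_tag == interpreter_tag;   // e.g. "cp311" == "cp311"
}

int main() {
  std::cout << PythonTagMatches("py3", "cp311") << "\n";    // 1: generic wheel
  std::cout << PythonTagMatches("cp310", "cp311") << "\n";  // 0: wrong version
  std::cout << PythonTagMatches("cp311", "cp311") << "\n";  // 1: exact match
}
```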
24a9d7b038
Merged commit includes the following changes:
637889039 by A. Unique TensorFlower<gardener@tensorflow.org>:
Remove experimental_adaptive_avx_optimization flag from XNNPACK delegate options
It's always on now.
--
637886275 by A. Unique TensorFlower<gardener@tensorflow.org>:
[XLA:GPU][IndexAnalysis] Use a flag for IsKnownEmpty instead of recomputing every time.
Previously, we would try to simplify or compose indexing maps that have a known-empty domain. That is incorrect, and checking whether the domain is empty on every call is expensive; the result can be computed once and cached.
--
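The change described above amounts to memoizing an expensive predicate. A minimal sketch of the pattern, with `IndexingMap` and `ComputeIsEmpty` as stand-ins rather than the actual XLA classes:

```cpp
#include <iostream>
#include <optional>

// Stand-in for XLA's indexing map class; only the memoized emptiness flag
// matters here.
class IndexingMap {
 public:
  bool IsKnownEmpty() const {
    if (!is_known_empty_.has_value()) {
      is_known_empty_ = ComputeIsEmpty();  // expensive check, done once
    }
    return *is_known_empty_;
  }

 private:
  bool ComputeIsEmpty() const {
    // Placeholder for the expensive domain-emptiness computation.
    return false;
  }
  mutable std::optional<bool> is_known_empty_;  // cached result
};

int main() {
  IndexingMap map;
  std::cout << map.IsKnownEmpty() << "\n";  // computes and caches
  std::cout << map.IsKnownEmpty() << "\n";  // served from the cache
}
```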
637876088 by A. Unique TensorFlower<gardener@tensorflow.org>:
Internal config change
--
637864812 by A. Unique TensorFlower<gardener@tensorflow.org>:
PR #13088: [ROCm] Fix reduce_atomic_min.hlo.test
Imported from GitHub PR https://github.com/openxla/xla/pull/13088
Copybara import of the project:
--
b241e076198c03fffd8c7e3a6568070ef0223653 by mmakevic <Milica.Makevic@amd.com>:
Fix reduce_atomic_min.hlo.test
--
f894f1954513019f0ca6890a27e09e0fee9d462e by mmakevic <Milica.Makevic@amd.com>:
Remove extra space
Merging this change closes #13088
--
637860531 by A. Unique TensorFlower<gardener@tensorflow.org>:
Remove xla_gpu_normalize_layouts flag.
By now, this is no longer experimental.
--
637857834 by A. Unique TensorFlower<gardener@tensorflow.org>:
Add heuristic for when to treat Gather ops as coalesced.
--
637820064 by A. Unique TensorFlower<gardener@tensorflow.org>:
compat: Update forward compatibility horizon to 2024-05-28
--
637820063 by A. Unique TensorFlower<gardener@tensorflow.org>:
Update GraphDef version to 1876.
--
637756070 by A. Unique TensorFlower<gardener@tensorflow.org>:
Automated rollback of changelist 636206934.
637674999 by A. Unique TensorFlower<gardener@tensorflow.org>:
[xla:cpu] Add initial support for Thunk-based execution to CpuCompiler and CpuExecutable
Add support for compiling an XLA:CPU HloModule to a ThunkSequence instead of an LLVM module and a JIT-compiled function.
--
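The thunk model replaces a single JIT-compiled function with an ordered sequence of small executable steps. A self-contained sketch of the general idea only; XLA's real Thunk classes additionally carry buffer assignments, stream handles, and so on.

```cpp
#include <functional>
#include <iostream>
#include <vector>

// A thunk is a small unit of host-side work; a program is a sequence of
// thunks executed in order. Stand-in types, not XLA's actual Thunk API.
using Thunk = std::function<void()>;
using ThunkSequence = std::vector<Thunk>;

int main() {
  ThunkSequence sequence;
  sequence.push_back([] { std::cout << "copy parameters\n"; });
  sequence.push_back([] { std::cout << "run kernel\n"; });
  sequence.push_back([] { std::cout << "copy result\n"; });
  for (const Thunk& thunk : sequence) thunk();  // execute in order
}
```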
637666734 by A. Unique TensorFlower<gardener@tensorflow.org>:
Don't fuse inside computations that are already fused.
--
637657345 by A. Unique TensorFlower<gardener@tensorflow.org>:
Automated rollback of changelist 636208997.
637651034 by A. Unique TensorFlower<gardener@tensorflow.org>:
Integrate LLVM at llvm/llvm-project@fddf350f96
Updates LLVM usage to match
[fddf350f9640](https://github.com/llvm/llvm-project/commit/fddf350f9640)
--
637639233 by A. Unique TensorFlower<gardener@tensorflow.org>:
PR #12940: [ROCm] Fix dot_bf16.hlo.test on ROCm
Imported from GitHub PR https://github.com/openxla/xla/pull/12940
Added additional params for `hlo_lit_tests` as a workaround, so that `mi200.txtpb` is used in `dot_bf16.hlo.test` for ROCm.
Copybara import of the project:
--
c3bb3a7349266a51ff22a2e18dab0afb6e81bad4 by mmakevic <Milica.Makevic@amd.com>:
Have dot_bf16.hlo.test use mi200.txtpb for rocm
Merging this change closes #12940
--
637632492 by A. Unique TensorFlower<gardener@tensorflow.org>:
PR #13089: Fix reduce_large_row_to_scalar.hlo.test
Imported from GitHub PR https://github.com/openxla/xla/pull/13089
Copybara import of the project:
--
ae97058c01ca57107a2566a6f190d51f5ad4ca0e by mmakevic <Milica.Makevic@amd.com>:
Fix reduce_large_row_to_scalar.hlo.test
Merging this change closes #13089
--
637623329 by A. Unique TensorFlower<gardener@tensorflow.org>:
Automated rollback of changelist 637594837.
637607386 by A. Unique TensorFlower<gardener@tensorflow.org>:
Automated rollback of changelist 636926669.
637594837 by A. Unique TensorFlower<gardener@tensorflow.org>:
[XLA:GPU] Pass CUDA_VERSION explicitly into CudnnFusedConvRewriter.
Passing the cuDNN version will be the next step.
--
637580666 by A. Unique TensorFlower<gardener@tensorflow.org>:
Remove usage of --xla_gpu_enable_triton_hopper in autotuner
--
637578573 by A. Unique TensorFlower<gardener@tensorflow.org>:
[XLA:GPU] Add documentation about RTVars.
--
637570959 by A. Unique TensorFlower<gardener@tensorflow.org>:
Update GraphDef version to 1875.
--
637570942 by A. Unique TensorFlower<gardener@tensorflow.org>:
compat: Update forward compatibility horizon to 2024-05-27
--
637561798 by A. Unique TensorFlower<gardener@tensorflow.org>:
PR #12979: [NVIDIA] Fix PGLE for latency estimation of p2p instructions
Imported from GitHub PR https://github.com/openxla/xla/pull/12979
PGLE doesn't recognize p2p instructions such as send or recv as async operations.
This adds a utility to check whether an instruction is a p2p communication instruction.
Copybara import of the project:
--
469b2d31ff6b0270dda28f8754462681514d0e04 by TJ Xu <tjx@nvidia.com>:
fix pgle not recognizing p2p instructions
Merging this change closes #12979
--
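The utility in question reduces to classifying opcodes. A self-contained sketch with a stand-in opcode enum; the real check works on XLA's HloInstruction and HloOpcode:

```cpp
#include <iostream>

// Stand-in for XLA's HloOpcode; only the opcodes relevant here.
enum class HloOpcode { kAdd, kSend, kSendDone, kRecv, kRecvDone };

// True for point-to-point communication instructions, which PGLE should
// treat as asynchronous rather than as ordinary compute.
bool IsP2PCommunication(HloOpcode opcode) {
  switch (opcode) {
    case HloOpcode::kSend:
    case HloOpcode::kSendDone:
    case HloOpcode::kRecv:
    case HloOpcode::kRecvDone:
      return true;
    default:
      return false;
  }
}

int main() {
  std::cout << IsP2PCommunication(HloOpcode::kSend) << "\n";  // 1
  std::cout << IsP2PCommunication(HloOpcode::kAdd) << "\n";   // 0
}
```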
637560035 by A. Unique TensorFlower<gardener@tensorflow.org>:
[xla:gpu] Track loop iteration counter of a WhileThunk in thread local variable
--
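A thread_local counter gives each executing thread its own iteration count with no locking, so concurrent while-loop executions do not interfere. A minimal sketch of the pattern; the names are hypothetical, not the actual WhileThunk code:

```cpp
#include <cstdint>
#include <iostream>
#include <string>
#include <thread>

// Each thread gets its own copy, so concurrent while-loop executions on
// different threads track their iterations independently, with no locks.
thread_local int64_t loop_iteration = 0;

void RunWhileLoop(int64_t trip_count) {
  for (loop_iteration = 0; loop_iteration < trip_count; ++loop_iteration) {
    // Loop body thunks could read loop_iteration here.
  }
  std::cout << ("finished after " + std::to_string(loop_iteration) +
                " iterations\n");
}

int main() {
  std::thread t1(RunWhileLoop, 3);
  std::thread t2(RunWhileLoop, 5);  // independent counter on this thread
  t1.join();
  t2.join();
}
```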
637552495 by A. Unique TensorFlower<gardener@tensorflow.org>:
PR #13056: Use `operator->` with XLA FFI Result Buffers in custom call docs
Imported from GitHub PR https://github.com/openxla/xla/pull/13056
Copybara import of the project:
--
7940a1a02a0f93736a88406958edf62488bdbe19 by Andrey Portnoy <aportnoy@nvidia.com>:
Use `operator->` with XLA FFI Result Buffers in custom call docs
Merging this change closes #13056
--
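The docs change is ergonomic: FFI result buffers are accessed through a smart-pointer-like wrapper, so members are reached with `operator->` rather than `.`. A self-contained stand-in illustrating the pattern, not the actual XLA FFI types:

```cpp
#include <cstdint>
#include <iostream>

// Minimal stand-in for an FFI result buffer. The real XLA FFI Result<T>
// similarly exposes the wrapped buffer through operator->.
struct Buffer {
  float* data;
  int64_t element_count;
};

template <typename T>
class Result {
 public:
  explicit Result(T value) : value_(value) {}
  T* operator->() { return &value_; }  // smart-pointer-style access
 private:
  T value_;
};

int main() {
  float storage[4] = {1, 2, 3, 4};
  Result<Buffer> ret(Buffer{storage, 4});
  ret->data[0] = 42.0f;                     // access members via operator->
  std::cout << ret->element_count << "\n";  // 4
}
```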
637547404 by A. Unique TensorFlower<gardener@tensorflow.org>:
PR #13068: Introduce the Blackwell compute capability.
Imported from GitHub PR https://github.com/openxla/xla/pull/13068
Introduce the Blackwell compute capability. Future Blackwell-specific changes can be guarded by this capability.
Copybara import of the project:
--
cc1adebc95166b2d3979cc01de954a1895515ad4 by Dimitris Vardoulakis <dvardoulakis@nvidia.com>:
Introduce the Blackwell compute capability. Future Blackwell-specific changes can be guarded by this capability.
Merging this change closes #13068
--
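Guarding on a compute capability is an ordered (major, minor) comparison. A minimal sketch with a stand-in struct, assuming Blackwell corresponds to compute capability 10.0 as introduced by this PR; XLA's own class for this is stream_executor's CudaComputeCapability:

```cpp
#include <iostream>
#include <tuple>

// Stand-in for a CUDA compute capability; not XLA's actual class.
struct ComputeCapability {
  int major;
  int minor;
  bool IsAtLeast(int other_major, int other_minor = 0) const {
    return std::tie(major, minor) >= std::tie(other_major, other_minor);
  }
};

int main() {
  // Assumption: Blackwell is compute capability 10.0.
  constexpr int kBlackwellMajor = 10;
  ComputeCapability hopper{9, 0}, blackwell{10, 0};
  std::cout << hopper.IsAtLeast(kBlackwellMajor) << "\n";     // 0: guard off
  std::cout << blackwell.IsAtLeast(kBlackwellMajor) << "\n";  // 1: guard on
}
```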
637541058 by A. Unique TensorFlower<gardener@tensorflow.org>:
PR #13061: Add Triton support for XLA clamp
Imported from GitHub PR https://github.com/openxla/xla/pull/13061
Add Triton support for the XLA clamp instruction. Clamp is a common instruction in FP8 fusions and will be used in cuDNN fusions.
This is a fix for a previously rolled-back PR that was reverted due to an internal ir_emitter_triton test failure.
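For reference, XLA's clamp is an elementwise clamp(lo, x, hi) = min(max(x, lo), hi). A short plain-C++ illustration of those semantics, not Triton or XLA emitter code:

```cpp
#include <algorithm>
#include <iostream>
#include <vector>

// Elementwise clamp(lo, x, hi): pin each value into [lo, hi]. Common in FP8
// fusions, where values are clamped into the representable range before
// narrowing to an 8-bit float type.
std::vector<float> Clamp(const std::vector<float>& x, float lo, float hi) {
  std::vector<float> out(x.size());
  for (size_t i = 0; i < x.size(); ++i) {
    out[i] = std::clamp(x[i], lo, hi);  // min(max(x, lo), hi)
  }
  return out;
}

int main() {
  for (float v : Clamp({-500.f, 3.f, 500.f}, -448.f, 448.f)) {
    std::cout << v << " ";  // -448 3 448 (FP8 E4M3's max finite value is 448)
  }
  std::cout << "\n";
}
```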
36f7c31dc6
Add support for local wheel files in hermetic python
This allows users to specify a list of workspaces that contain pre-built local wheels without needing to manually add them to requirements.txt files. The wheels are automatically processed by Bazel rules and injected into requirements_lock_<py_version>.txt on the fly (assuming `HERMETIC_PYTHON_VERSION=py_version`). This feature is mainly driven by pytorch/xla, since building pytorch/xla means first building the pytorch repo locally and then pointing to its artifacts (both raw .so files and entire .whl files) in the pytorch/xla build. It also helps JAX support the build_jaxlib=false case, as it eliminates the need to manually update requirements_lock.txt files in JAX CI.
PiperOrigin-RevId: 636691616
3bf2ac3b77
Use hermetic Python in TSL and XLA
PiperOrigin-RevId: 634094641