This change introduces a new `kTfLiteInt2` type to the TFLite schema and MLIR converter. It includes:
- Adding `INT2` to the flatbuffer schema.
- Mapping `TensorType_INT2` to `kTfLiteInt2` in flatbuffer conversions.
- Updating `tflite_types.h` to include `kTfLiteInt2`.
- Modifying `flatbuffer_export.cc` to handle 2-bit integer types from MLIR and pack them densely.
- Generalizing low-bit utility functions (`PackLowBitValuesDensely`, `UnpackDenseLowBitIntoInt8`) to support both 2-bit and 4-bit values.
- Updating type conversion utilities to recognize and handle `kTfLiteInt2`.
- Adjusting `util.cc` to correctly report the size and byte requirements for `kTfLiteInt2` tensors, considering their dense packing.
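The dense-packing scheme can be illustrated with a small Python sketch. The function names below mirror the C++ utilities but are hypothetical Python stand-ins, and the LSB-first ordering within each byte is an assumption, not necessarily TFLite's actual layout:

```python
def pack_low_bits(values, bit_width):
    """Pack signed 2- or 4-bit integers densely into bytes, LSB-first."""
    assert bit_width in (2, 4)
    per_byte = 8 // bit_width          # 4 values per byte for int2, 2 for int4
    mask = (1 << bit_width) - 1
    packed = bytearray((len(values) + per_byte - 1) // per_byte)
    for i, v in enumerate(values):
        packed[i // per_byte] |= (v & mask) << ((i % per_byte) * bit_width)
    return bytes(packed)


def unpack_to_int8(packed, bit_width, count):
    """Inverse of pack_low_bits: recover sign-extended int8 values."""
    per_byte = 8 // bit_width
    mask = (1 << bit_width) - 1
    sign_bit = 1 << (bit_width - 1)
    out = []
    for i in range(count):
        raw = (packed[i // per_byte] >> ((i % per_byte) * bit_width)) & mask
        out.append(raw - 2 * (raw & sign_bit))  # sign-extend low-bit value
    return out


# Four 2-bit values fit in one byte, so an int2 tensor of N elements
# occupies ceil(N / 4) bytes -- the size the byte-requirement logic reports.
assert pack_low_bits([-1, 0, 1, -2], bit_width=2) == bytes([0x93])
assert unpack_to_int8(bytes([0x93]), bit_width=2, count=4) == [-1, 0, 1, -2]
```

The same routines cover the 4-bit case (two values per byte), which is what makes the generalization of the low-bit utilities possible.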
PiperOrigin-RevId: 819821231
The dependency has been causing a number of issues, see:
https://github.com/tensorflow/tensorflow/pull/82771
Support for the package is currently uncertain and has been limited for a
while: there have been no Windows wheels since 0.32.0, for example, and it has
been capped to Python < 3.12 for some time now.
PiperOrigin-RevId: 731313825
1. Fix indentation. The indentation of the first three bullet points in the markdown sources did not match the indentation of the fourth and fifth bullet points, nor of the bullet points further below.
2. Wrap some long lines in the markdown sources, in particular in bullet point
lists where some lines were wrapped but others were not.
3. Use "Python API" rather than "Interpreter" as the subheading for changes
affecting the `tf.lite.Interpreter` Python class, for consistency with the earlier
heading "C++ API" in the same bullet point list.
PiperOrigin-RevId: 730380377
This CL enables interpreter execution and quantization for the latest bfloat16 TFLite flatbuffers. It is required because quantization validates the TFLite float flatbuffer with the interpreter.
PiperOrigin-RevId: 695529144
in the TF Lite C++ API.
This is to enable better API compatibility with TF Lite in Play services
while preserving the implementation flexibility of changing those constants
in future releases.
PiperOrigin-RevId: 693503483
This change moves the release notes item about SignatureRunner supporting
models with no signatures from the 'Breaking Changes' section to the
'Major Features and Improvements' section, since it is not a breaking
change.
PiperOrigin-RevId: 688541180
This change fixes a bug in the automatically generated code. The fix comes from the new version of the flatbuffers code generator that TensorFlow updated to in c17d64df85, which includes https://github.com/google/flatbuffers/pull/7813, the change that fixed the underlying flatbuffers code generator bug.
PiperOrigin-RevId: 686567841
1) Hermetic CUDA rules allow building wheels with GPU support on a machine without GPUs, as well as running Bazel GPU tests on a machine with only GPUs and the NVIDIA driver installed. When `--config=cuda` is provided in Bazel options, Bazel downloads the CUDA, CUDNN and NCCL redistributions into the cache and uses them during the build and test phases.
[Default location of CUDNN redistributions](https://developer.download.nvidia.com/compute/cudnn/redist/)
[Default location of CUDA redistributions](https://developer.download.nvidia.com/compute/cuda/redist/)
[Default location of NCCL redistributions](https://pypi.org/project/nvidia-nccl-cu12/#history)
2) To include hermetic CUDA rules in your project, add the following to the WORKSPACE file of the downstream project that depends on XLA.
Note: use `@local_tsl` instead of `@tsl` in the TensorFlow project.
```
load(
    "@tsl//third_party/gpus/cuda/hermetic:cuda_json_init_repository.bzl",
    "cuda_json_init_repository",
)

cuda_json_init_repository()

load(
    "@cuda_redist_json//:distributions.bzl",
    "CUDA_REDISTRIBUTIONS",
    "CUDNN_REDISTRIBUTIONS",
)
load(
    "@tsl//third_party/gpus/cuda/hermetic:cuda_redist_init_repositories.bzl",
    "cuda_redist_init_repositories",
    "cudnn_redist_init_repository",
)

cuda_redist_init_repositories(
    cuda_redistributions = CUDA_REDISTRIBUTIONS,
)

cudnn_redist_init_repository(
    cudnn_redistributions = CUDNN_REDISTRIBUTIONS,
)

load(
    "@tsl//third_party/gpus/cuda/hermetic:cuda_configure.bzl",
    "cuda_configure",
)

cuda_configure(name = "local_config_cuda")

load(
    "@tsl//third_party/nccl/hermetic:nccl_redist_init_repository.bzl",
    "nccl_redist_init_repository",
)

nccl_redist_init_repository()

load(
    "@tsl//third_party/nccl/hermetic:nccl_configure.bzl",
    "nccl_configure",
)

nccl_configure(name = "local_config_nccl")
```
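With the repositories above initialized, a hermetic GPU build can then be invoked along these lines. The target name is illustrative only, and the `HERMETIC_CUDA_VERSION`/`HERMETIC_CUDNN_VERSION` repo env vars for pinning specific redistribution versions are shown as an example, not a required setting:

```
bazel build --config=cuda \
    --repo_env=HERMETIC_CUDA_VERSION="12.3.1" \
    --repo_env=HERMETIC_CUDNN_VERSION="9.1.1" \
    //your:target
```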
PiperOrigin-RevId: 662981325
In the absence of a model signature, with this change a SignatureRunner and an
AsyncSignatureRunner take the names for their input/output tensors from the
model. This is necessary to ensure correct functionality for the several
(Async)SignatureRunner methods that take an input/output name as an argument.
Note that this change alters the behavior of AsyncSignatureRunner, an experimental API, which previously assumed nullptr input/output names for models with no signatures.
PiperOrigin-RevId: 662908295
This is needed for hermetic CUDA integration in Google ML projects, since TensorRT is not distributed as freely as the other CUDA/CUDNN distributions.
PiperOrigin-RevId: 662601190
When `synchronous` is set to `True`, the `map` will always run synchronously, even when `options.experimental_optimization.map_parallelization=True`. Setting `synchronous=True` is useful for saving memory, since it buffers one less element than setting `num_parallel_calls=1`.
PiperOrigin-RevId: 642418430