Commit Graph

186466 Commits

Author SHA1 Message Date
A. Unique TensorFlower
d25ccb438d Reverts cef240807a
PiperOrigin-RevId: 826374657
2025-10-31 01:32:52 -07:00
A. Unique TensorFlower
4cfaa7e25c Automated Code Change
PiperOrigin-RevId: 826372270
2025-10-31 01:19:21 -07:00
A. Unique TensorFlower
5133f83425 Automated Code Change
PiperOrigin-RevId: 826363842
2025-10-31 00:50:37 -07:00
A. Unique TensorFlower
ebacf2a211 Automated Code Change
PiperOrigin-RevId: 826342599
2025-10-30 23:59:59 -07:00
Bill Varcho
cef240807a [ReplicaGroupV3][MeshAxesReplicaGroupList][1/2] Add initial class definition for V3 replica group.
PiperOrigin-RevId: 826334561
2025-10-30 23:18:40 -07:00
Felix Wang
d9c76aafeb Adjust the collective-permute cross host type to MULTI_HOST_NON_WORLD_LEVEL only.
PiperOrigin-RevId: 826327580
2025-10-30 22:54:49 -07:00
Eugene Zhulenev
d90723f48e [xla:pjrt:cpu] Add e2e test for YnnFusion + PJRT client
PiperOrigin-RevId: 826323865
2025-10-30 22:41:49 -07:00
Eugene Zhulenev
7ad55e8818 [xla:cpu] Add an end-to-end test for ynn fusions
PiperOrigin-RevId: 826318525
2025-10-30 22:20:44 -07:00
Eugene Zhulenev
bf23bf1b32 [xla:cpu] Pass HloModule pointer to Thunk SerDes
PiperOrigin-RevId: 826312546
2025-10-30 22:11:41 -07:00
Eugene Zhulenev
56d3b19280 [xla:cpu] NFC: Rename protos for Xnn/Ynn fusion options
PiperOrigin-RevId: 826304955
2025-10-30 22:01:47 -07:00
A. Unique TensorFlower
a95c558dc4 Save compile options with the compiled IFRT IR program to be used later for serialization
PiperOrigin-RevId: 826301016
2025-10-30 21:54:24 -07:00
A. Unique TensorFlower
e61bac51b1 Automated Code Change
PiperOrigin-RevId: 826298597
2025-10-30 21:47:00 -07:00
A. Unique TensorFlower
b2334ac330 Integrate LLVM at llvm/llvm-project@22079e3f36
Updates LLVM usage to match
[22079e3f3698](https://github.com/llvm/llvm-project/commit/22079e3f3698)

PiperOrigin-RevId: 826294004
2025-10-30 20:44:41 -07:00
A. Unique TensorFlower
6d86cff5f3 Automated Code Change
PiperOrigin-RevId: 826286610
2025-10-30 20:12:04 -07:00
Eugene Zhulenev
db273660ba [xla:pjrt] Remove PjRtFuture type alias
Cleaning up BUILD files and includes will be done separately.

PiperOrigin-RevId: 826280389
2025-10-30 19:44:40 -07:00
Eugene Zhulenev
429a0cf1c7 [xla:cpu] Add target machine features to the error message
PiperOrigin-RevId: 826253599
2025-10-30 17:49:12 -07:00
Eugene Zhulenev
d9024af6d4 [xla:cpu] Do not register legacy runtime symbols with XLA:CPU custom calls
PiperOrigin-RevId: 826208548
2025-10-30 16:25:55 -07:00
Niklas Vangerow
31bb7c01ff Migrate multioutput_fusion_test to use PjRt.
PiperOrigin-RevId: 826203532
2025-10-30 15:18:22 -07:00
Parker Schuh
c3d0bf7023 Add an additional way to poison a connection (to allow testing different poisoning strategies).

PiperOrigin-RevId: 826193232
2025-10-30 14:52:58 -07:00
A. Unique TensorFlower
c40bb10b96 Add the option to dump before/after autotuned instructions in AutotunerConfig.
- This change is required to keep supporting the functionality of `xla_gpu_dump_autotuned_gemm_fusions` in the new infra.

PiperOrigin-RevId: 826161466
2025-10-30 14:39:24 -07:00
A. Unique TensorFlower
8f60516a86 Refactor: Move common SymbolicMapTest setup to the fixture.
This change moves the initialization of commonly used `SymbolicExpr` instances and a sample `SymbolicMap` into the `SymbolicMapTest` fixture to reduce code duplication across tests.
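For illustration, a minimal sketch of the fixture pattern this refactor applies, using compilable stand-in types rather than XLA's actual `SymbolicExpr`/`SymbolicMap` API:

```cpp
#include <string>

#include <gtest/gtest.h>

// Stand-ins for the real SymbolicExpr/SymbolicMap types, only to make the
// fixture pattern concrete; the real setup builds XLA objects instead.
struct FakeExpr {
  std::string repr;
};
struct FakeMap {
  int num_dims;
  int num_symbols;
};

class SymbolicMapLikeTest : public ::testing::Test {
 protected:
  // Shared setup runs once per test, replacing the duplicated per-test code.
  void SetUp() override {
    d0_ = FakeExpr{"d0"};
    s0_ = FakeExpr{"s0"};
    map_ = FakeMap{/*num_dims=*/1, /*num_symbols=*/1};
  }

  FakeExpr d0_, s0_;
  FakeMap map_{};
};

TEST_F(SymbolicMapLikeTest, UsesSharedFixtureState) {
  EXPECT_EQ(map_.num_dims, 1);
  EXPECT_EQ(d0_.repr, "d0");
}
```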

PiperOrigin-RevId: 826161168
2025-10-30 14:19:16 -07:00
A. Unique TensorFlower
7736af79a6 Only enable YNNPACK for bf16 and int8 for now.
We plan to enable this in stages, starting with int8 and bf16, where the improvement is more significant.
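As a rough sketch, the staged gating could look like the predicate below (a hypothetical function; the real flag plumbing in XLA:CPU differs):

```cpp
#include "xla/xla_data.pb.h"

// Hypothetical predicate for the staged rollout described above: only the
// dtypes with the largest measured wins go through YNNPACK for now.
bool ShouldUseYnnpack(xla::PrimitiveType type) {
  switch (type) {
    case xla::BF16:
    case xla::S8:  // signed int8
      return true;
    default:
      return false;  // other dtypes keep the existing path for now
  }
}
```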

PiperOrigin-RevId: 826160602
2025-10-30 14:05:02 -07:00
Karlo Basioli
f4ebf9d47d [XLA][codegen] Migrate triton operations for which shared dialect lowerings are implemented.
These were missed in previous commits.
Addresses transpose and bitcast.

PiperOrigin-RevId: 826158776
2025-10-30 13:54:31 -07:00
Niklas Vangerow
1424c4f739 Migrate slice_test to use PjRt.
PiperOrigin-RevId: 826158235
2025-10-30 13:45:44 -07:00
Karlo Basioli
5973848600 [XLA][codegen] Emit shlo reshape from the fusion emitter and lower it to triton for the triton backend.
PiperOrigin-RevId: 826147865
2025-10-30 13:32:56 -07:00
Quoc Truong
f01a7fea8c Update ML Build Docker container to use hermetic C++
PiperOrigin-RevId: 826147864
2025-10-30 13:25:44 -07:00
A. Unique TensorFlower
7e7b1a3015 Allow empty dimension list in SymbolicMap::ReplaceDimsAndSymbols
I originally assumed the caller was always providing a full list of replacements, but IndexingMap has some uses where the dim_replacement list is empty, resulting in a CHECK-fail.

So I'm allowing the user to provide an empty dim or symbol list to ReplaceDimsAndSymbols; in that case, the corresponding dims/symbols won't be replaced.
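A compilable sketch of the relaxed contract, using a stand-in expression type (names and signature are illustrative, not the real SymbolicMap API):

```cpp
#include <vector>

// Stand-in expression type: just enough structure to show the new rule.
struct Expr {
  bool is_variable;
  int variable_id;
};

// Empty replacement lists now mean "leave that class of variables alone"
// instead of CHECK-failing.
Expr ReplaceVariable(const Expr& e, int num_dims,
                     const std::vector<Expr>& dim_repls,
                     const std::vector<Expr>& sym_repls) {
  if (!e.is_variable) return e;  // non-variables recurse in the real code
  if (e.variable_id < num_dims) {
    return dim_repls.empty() ? e : dim_repls[e.variable_id];
  }
  return sym_repls.empty() ? e : sym_repls[e.variable_id - num_dims];
}
```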

PiperOrigin-RevId: 826138814
2025-10-30 13:18:28 -07:00
Niklas Vangerow
175774337e Migrate params_test to use PjRt.
PiperOrigin-RevId: 826137636
2025-10-30 13:07:24 -07:00
Daniel Sosa
5dcb571931 Remove unused code from convert.py
PiperOrigin-RevId: 826137491
2025-10-30 12:59:13 -07:00
Zixuan Jiang
146c4f56b7 Clear frontend attributes for get-tuple-elements of GlobalToLocal and LocalToGlobal custom-calls.
The GlobalToLocal and LocalToGlobal custom-calls are used for the Shardy round trip. These get-tuple-elements will be removed when we import the Shardy dialect, so they do not need to hold frontend attributes.

This can reduce the size of the generated HLO module text.
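A minimal sketch of the cleanup, assuming it runs as a standalone walk over the module (the literal custom-call target strings are taken from the message above and may differ from the registered names):

```cpp
#include "xla/hlo/ir/hlo_computation.h"
#include "xla/hlo/ir/hlo_instruction.h"
#include "xla/hlo/ir/hlo_module.h"
#include "xla/hlo/ir/hlo_opcode.h"

// Clear frontend attributes on get-tuple-elements of the Shardy round-trip
// custom-calls; they are dropped on Shardy import anyway, so removing them
// only shrinks the generated HLO module text.
void ClearShardyGteFrontendAttributes(xla::HloModule* module) {
  for (xla::HloComputation* computation : module->computations()) {
    for (xla::HloInstruction* instr : computation->instructions()) {
      if (instr->opcode() == xla::HloOpcode::kGetTupleElement &&
          instr->operand(0)->IsCustomCall(
              {"GlobalToLocal", "LocalToGlobal"})) {
        instr->set_frontend_attributes(xla::FrontendAttributes());
      }
    }
  }
}
```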

PiperOrigin-RevId: 826134489
2025-10-30 12:43:45 -07:00
William S. Moses
a94890b1f9 Improve CUDNN error messages
PiperOrigin-RevId: 826124080
2025-10-30 12:34:16 -07:00
Oleg Shyshkov
9eeebc9be5 [XLA:GPU] Use a single intra-host ragged-all-to-all in the decomposition.
Instead of two ragged-all-to-alls plus a concat, we can double the output buffer and adjust the output offsets. This way we save on latency by having only one multi-GPU synchronization.
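A toy illustration (not XLA code) of the offset adjustment: rows that previously went to the second buffer land past the end of the first buffer's region, so the concat disappears.

```cpp
#include <cstdint>
#include <vector>

// Combine the per-chunk output offsets of two would-be ragged-all-to-alls
// into offsets for a single call writing into a doubled output buffer.
std::vector<int64_t> CombineOutputOffsets(
    const std::vector<int64_t>& first_offsets,
    const std::vector<int64_t>& second_offsets,
    int64_t first_buffer_size) {
  std::vector<int64_t> combined = first_offsets;
  combined.reserve(first_offsets.size() + second_offsets.size());
  for (int64_t offset : second_offsets) {
    // Shift the second half past the first buffer's region.
    combined.push_back(first_buffer_size + offset);
  }
  return combined;
}
```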

PiperOrigin-RevId: 826122665
2025-10-30 12:24:20 -07:00
Eugene Zhulenev
9b51864c7b [xla:ffi] Add example of async custom call in XLA:GPU
PiperOrigin-RevId: 826121283
2025-10-30 12:11:20 -07:00
Niklas Vangerow
061041963e Migrate map_test to use PjRt.
PiperOrigin-RevId: 826107887
2025-10-30 11:52:36 -07:00
A. Unique TensorFlower
dd3a14ace4 [Autotuner] Add sharding support using KeyValueStore Interface.
- The logic is ported from gemm_fusion_autotuner. I have changed the key of the key-value store to be just the module fingerprint; earlier it was the module fingerprint plus the autotunable fusion set derived from it. The module fingerprint should already represent the fusion sets contained in the module. (A sketch of the key scheme follows below.)
- We can improve or simply remove this functionality when we design storage for offline autotuning.
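A compilable sketch of the sharing flow under an assumed minimal key-value interface (the real KeyValueStore interface in XLA differs):

```cpp
#include <map>
#include <string>

// Stand-in for the real key-value store interface; only here to make the
// key scheme concrete.
struct KeyValueStore {
  std::map<std::string, std::string> data;
  void Set(const std::string& key, const std::string& value) {
    data[key] = value;
  }
};

// Each process autotunes only its shard and publishes the results under a
// key derived from the module fingerprint alone: the fingerprint already
// determines the module's autotunable fusion set.
void PublishShardResults(KeyValueStore& kv_store,
                         const std::string& module_fingerprint, int shard_id,
                         const std::string& serialized_results) {
  kv_store.Set(module_fingerprint + "/shard/" + std::to_string(shard_id),
               serialized_results);
}
```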

PiperOrigin-RevId: 826103885
2025-10-30 11:43:43 -07:00
Karlo Basioli
4ffcba9004 [XLA][codegen] Emit stablehlo reduce op from the fusion emitter and lower it to triton for the triton backend.
PiperOrigin-RevId: 826102479
2025-10-30 11:30:55 -07:00
Niklas Vangerow
0c87bef802 Migrate reshape_test to use PjRt.
PiperOrigin-RevId: 826087067
2025-10-30 11:21:47 -07:00
Will Froom
6dd75c4e8b [XTile] Modify Stable HLO check on iota to restrict it to the 1D case.
PiperOrigin-RevId: 826085272
2025-10-30 11:01:37 -07:00
A. Unique TensorFlower
f2b36d1780 Integrate LLVM at llvm/llvm-project@4c46ae3948
Updates LLVM usage to match
[4c46ae394841](https://github.com/llvm/llvm-project/commit/4c46ae394841)

PiperOrigin-RevId: 826082725
2025-10-30 10:40:50 -07:00
Christian Sigg
3943b53326 Increase the maximum HLO op chain length for profiling from 8192 to 16384.
This prevents maximum-length chains of trivial ops (e.g. add.fp32) from running faster than copying the data, which would result in a 'too fast to measure' error.

PiperOrigin-RevId: 826079017
2025-10-30 10:34:18 -07:00
Yun Peng
71e640f242 Update Bazel version to 7.7.0.
This change updates the Bazel version used in the TensorFlow, JAX, and XLA projects from 7.4.1 to 7.7.0 in `.bazelversion` files and build scripts.

PiperOrigin-RevId: 826075658
2025-10-30 10:27:38 -07:00
Kanish Anand
1bef3e80b5 Reuse tuple elements field from existing HloSharding
PiperOrigin-RevId: 826058519
2025-10-30 10:16:11 -07:00
dependabot[bot]
d638f84b90 PR #33278: Bump keras from 3.11.3 to 3.12.0 in /xla/backends/cpu/benchmarks/e2e/gemma2/keras
Imported from GitHub PR https://github.com/openxla/xla/pull/33278

Bumps [keras](https://github.com/keras-team/keras) from 3.11.3 to 3.12.0.
Release notes (*sourced from [keras's releases](https://github.com/keras-team/keras/releases)*):

Keras 3.12.0

Highlights

Keras has a new model distillation API!

You now have access to an easy-to-use API for distilling large models into small models while minimizing performance drop on a reference dataset -- compatible with all existing Keras models. You can specify a range of different distillation losses, or create your own losses. The API supports multiple concurrent distillation losses at the same time.

Example:

```python
# Load a model to distill
teacher = ...
# This is the model we want to distill it into
student = ...

# Configure the process
distiller = Distiller(
    teacher=teacher,
    student=student,
    distillation_losses=LogitsDistillation(temperature=3.0),
)
distiller.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Train the distilled model
distiller.fit(x_train, y_train, epochs=10)
```

Keras supports GPTQ quantization!

GPTQ is now built into the Keras API. GPTQ is a post-training, weights-only quantization method that compresses a model to int4 layer by layer. For each layer, it uses a second-order method to update weights while minimizing the error on a calibration dataset.

Learn how to use it [in this guide](https://keras.io/guides/gptq_quantization_in_keras/).

Example:

```python
model = keras_hub.models.Gemma3CausalLM.from_preset("gemma3_1b")
gptq_config = keras.quantizers.GPTQConfig(
    dataset=calibration_dataset,
    tokenizer=model.preprocessor.tokenizer,
    weight_bits=4,
    group_size=128,
    num_samples=256,
    sequence_length=256,
    hessian_damping=0.01,
    symmetric=False,
)
```

... (truncated)
Commits:
- adbfd13426 Add warning to `set_backend` and more detailed example. ([#21787](https://redirect.github.com/keras-team/keras/issues/21787))
- 70598b7903 Fix typo in Distiller docstring
- eecd34f406 Fix: `keras.ops.quantile` works with tf graph execution ([#21782](https://redirect.github.com/keras-team/keras/issues/21782))
- c2bc6cfcc7 Suport keras.op.view() to view the same data bitwise at a new dtype ([#21763](https://redirect.github.com/keras-team/keras/issues/21763))
- 10b51ce5a5 Make confusion metrics compilable. ([#21775](https://redirect.github.com/keras-team/keras/issues/21775))
- 18f79d69c9 Fix negative index handling in MultiHeadAttention attention_axes ([#21721](https://redirect.github.com/keras-team/keras/issues/21721))
- 18e0364cbc Support for extracting volume patches ([#21759](https://redirect.github.com/keras-team/keras/issues/21759))
- dc5e42cca4 fix sas metrics in jax `fit` ([#21765](https://redirect.github.com/keras-team/keras/issues/21765))
- 1ba3b8f896 Fix discretization discrepancy ([#21769](https://redirect.github.com/keras-team/keras/issues/21769))
- 53987a768d Document that `set_backend` requires re-importing keras. ([#21764](https://redirect.github.com/keras-team/keras/issues/21764))
- Additional commits viewable in the [compare view](https://github.com/keras-team/keras/compare/v3.11.3...v3.12.0)
Copybara import of the project:

--
b37d94a32428d62ed3e73765f4e7b61bc6ed8549 by dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>:

Bump keras in /xla/backends/cpu/benchmarks/e2e/gemma2/keras

Bumps [keras](https://github.com/keras-team/keras) from 3.11.3 to 3.12.0.
- [Release notes](https://github.com/keras-team/keras/releases)
- [Commits](https://github.com/keras-team/keras/compare/v3.11.3...v3.12.0)

---
updated-dependencies:
- dependency-name: keras
  dependency-version: 3.12.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Merging this change closes #33278

PiperOrigin-RevId: 826053656
2025-10-30 10:07:36 -07:00
A. Unique TensorFlower
a28d4bf9f8 Add helper functions for creating and inspecting symbolic dimensions and symbols
Symbolic dimensions and symbols are both implemented as SymbolicExpr variables, with symbols offset by the number of dimensions. This implementation detail was previously exposed to users of SymbolicExprContext, who had to calculate variable IDs manually. I ended up with a hidden bug in the implementation of IndexingMap, so I added these free functions to make the translation less bug-prone.

The SymbolicMap tests have been updated to use these new helper functions.
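The ID convention the helpers hide, spelled out as illustrative free functions (the real helper names may differ):

```cpp
// Dimensions and symbols share one variable-ID space; symbols come after
// all dimensions. This is exactly the off-by-num_dims arithmetic callers
// previously hand-rolled (and occasionally got wrong).
inline int DimensionVariableId(int dim_index) { return dim_index; }

inline int SymbolVariableId(int symbol_index, int num_dims) {
  return num_dims + symbol_index;
}

inline bool IsSymbolVariable(int variable_id, int num_dims) {
  return variable_id >= num_dims;
}
```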

PiperOrigin-RevId: 826053249
2025-10-30 09:54:07 -07:00
Graham
8d5e1015aa PR #4272: Qualcomm AI Engine Direct - Fix android-arm64 CMake build bug
Imported from GitHub PR https://github.com/google-ai-edge/LiteRT/pull/4272

Summary:
- Add typename to avoid cmake failure

Test Result:
======================== Test Summary ========================
//litert/vendors/qualcomm/core/utils:utils_test
[==========] 12 tests from 3 test suites ran. (0 ms total)
[  PASSED  ] 12 tests.
YOU HAVE 2 DISABLED TESTS

//litert/vendors/qualcomm/core/backends:qnn_backend_test
[==========] 0 tests from 0 test suites ran. (0 ms total)
[  PASSED  ] 0 tests.
YOU HAVE 4 DISABLED TESTS

//litert/vendors/qualcomm/core/wrappers/tests:op_wrapper_test
[==========] 7 tests from 1 test suite ran. (0 ms total)
[  PASSED  ] 7 tests.

//litert/vendors/qualcomm/core/wrappers/tests:tensor_wrapper_test
[==========] 18 tests from 1 test suite ran. (0 ms total)
[  PASSED  ] 18 tests.

//litert/vendors/qualcomm/core/wrappers/tests:param_wrapper_test
[==========] 16 tests from 2 test suites ran. (0 ms total)
[  PASSED  ] 16 tests.

//litert/vendors/qualcomm/core/wrappers/tests:quantize_params_wrapper_test
[==========] 13 tests from 3 test suites ran. (0 ms total)
[  PASSED  ] 13 tests.

//litert/vendors/qualcomm/core:common_test
[==========] 13 tests from 1 test suite ran. (0 ms total)
[  PASSED  ] 13 tests.

//litert/vendors/qualcomm/core:tensor_pool_test
[==========] 8 tests from 1 test suite ran. (0 ms total)
[  PASSED  ] 8 tests.

//litert/vendors/qualcomm:qnn_manager_test
[==========] 3 tests from 1 test suite ran. (259 ms total)
[  PASSED  ] 3 tests.

//litert/c/options:litert_qualcomm_options_test
[==========] 17 tests from 2 test suites ran. (0 ms total)
[  PASSED  ] 17 tests.

//litert/c:litert_op_options_test

//litert/tools/flags/vendors:qualcomm_flags_test
[==========] 8 tests from 5 test suites ran. (0 ms total)
[  PASSED  ] 8 tests.

//litert/vendors/qualcomm/compiler:qnn_compiler_plugin_test
[==========] 232 tests from 4 test suites ran. (61193 ms total)
[  PASSED  ] 232 tests.
Copybara import of the project:

--
57e7429920d1950d635b29c4f8a18ccd51ade496 by chuntl-qti <chuntl@qti.qualcomm.com>:

Qualcomm AI Engine Direct - Fix android-arm64 CMake build bug

Summary:
- Add typename to avoid cmake failure

Merging this change closes #4272

PiperOrigin-RevId: 826045177
2025-10-30 09:41:23 -07:00
Marcin Radomski
a0921d9997 [XLA:GPU] CustomCallThunk: enable use of lambdas with captures
Add CustomCallThunk::OwnedHandlerBundle, a bag of `unique_ptr<ffi::Ffi>` that enables using lambdas with captures in CustomCallThunk. Lambda captures must outlive the created thunk.

The functionality is similar to what is possible with "old-style" callbacks,
but doesn't depend on them, and adds support for other handlers available via
XLA_FFI_Handler_Bundle.
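A rough sketch of binding a capturing lambda into an owned handler (the exact binding call and how the result is stored in the bundle are assumptions based on the message above):

```cpp
#include <memory>

#include "absl/status/status.h"
#include "xla/ffi/ffi.h"

// The capturing lambda is bound into an owned ffi::Ffi handler; the caller
// must keep `call_counter` alive for as long as the thunk can execute.
std::unique_ptr<xla::ffi::Ffi> MakeExecuteHandler(int* call_counter) {
  return xla::ffi::Ffi::Bind().To([call_counter]() {
    ++*call_counter;  // captured state lives outside the thunk
    return absl::OkStatus();
  });
}
```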

PiperOrigin-RevId: 826043689
2025-10-30 09:32:58 -07:00
Karlo Basioli
4461afa7ef [XLA:CPU] Compare the host CPU features against the compilation machine's features when loading an AOT result
PiperOrigin-RevId: 826043058
2025-10-30 09:23:42 -07:00
A. Unique TensorFlower
fd85062199 Add hashing support for SymbolicMap
This change implements AbslHashValue and llvm::hash_value for xla::gpu::SymbolicMap.

This is a prerequisite for correctly implementing AbslHashValue for xla::IndexingMap after its internal migration to use SymbolicMap. Specifically, it needs to be used in IndexingMap::AbslHashValue.
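The change follows the standard Abseil hashing pattern, sketched below; the fields combined here are placeholders, not SymbolicMap's actual representation:

```cpp
#include <utility>

#include "absl/hash/hash.h"

// A SymbolicMap-like type made hashable via the AbslHashValue friend
// function, so it can be used in absl hash containers and, ultimately,
// inside IndexingMap::AbslHashValue.
class SymbolicMapLike {
 public:
  template <typename H>
  friend H AbslHashValue(H h, const SymbolicMapLike& m) {
    return H::combine(std::move(h), m.num_dims_, m.num_symbols_);
  }

 private:
  int num_dims_ = 0;
  int num_symbols_ = 0;
};
```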

PiperOrigin-RevId: 826038011
2025-10-30 09:11:19 -07:00
A. Unique TensorFlower
bec8916f32 [XLA:Collective] Remove unnecessary const for a function argument
PiperOrigin-RevId: 826036516
2025-10-30 09:04:33 -07:00
Henning Becker
772ed8bbc7 Add serialization for ffi::Attribute
`ffi::Attribute` and related types are members of the `CustomCallThunk`, so we need to be able to serialize these types to proto messages if we want to be able to serialize instances of CustomCallThunk.

This change adds a proto message representation for each of these types, along with `ToProto` and `FromProto` functions. Most of the types were previously defined as type aliases of some `std::variant` instantiation; this change replaces the aliases with classes that inherit from the std::variant type, and these new classes carry the proto serialization functions.
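An illustrative version of the alias-to-class move (the `AttributeProto` type and its fields are invented for the sketch; the real variant alternatives differ):

```cpp
#include <cstdint>
#include <string>
#include <variant>

// Hypothetical proto generated from a message with `oneof { i64, f32, str }`.
// Before: `using Attribute = std::variant<int64_t, float, std::string>;`
// After: a class inheriting the variant so it can carry SerDes functions.
class Attribute : public std::variant<int64_t, float, std::string> {
 public:
  using Base = std::variant<int64_t, float, std::string>;
  using Base::Base;  // keep the alias's construction ergonomics

  AttributeProto ToProto() const {
    const Base& v = *this;
    AttributeProto proto;
    if (const auto* i = std::get_if<int64_t>(&v)) {
      proto.set_i64(*i);
    } else if (const auto* f = std::get_if<float>(&v)) {
      proto.set_f32(*f);
    } else {
      proto.set_str(std::get<std::string>(v));
    }
    return proto;
  }

  static Attribute FromProto(const AttributeProto& proto) {
    switch (proto.value_case()) {
      case AttributeProto::kI64:
        return Attribute(proto.i64());
      case AttributeProto::kF32:
        return Attribute(proto.f32());
      default:
        return Attribute(proto.str());
    }
  }
};
```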

PiperOrigin-RevId: 826035931
2025-10-30 08:57:22 -07:00