Commit Graph

153857 Commits

Adrian Kuegel
f4529e80ab [XLA:GPU] Move buffer sharing logic into a separate target (NFC).
This will allow using the same CanShareBuffer function in gpu_compiler and
the hlo_to_llvm_ir testing tool.

PiperOrigin-RevId: 566615614
2023-09-19 06:49:54 -07:00
A. Unique TensorFlower
b9209839ea No public description
PiperOrigin-RevId: 566612185
2023-09-19 06:30:30 -07:00
A. Unique TensorFlower
ab6bbb417f No public description
PiperOrigin-RevId: 566595405
2023-09-19 05:05:42 -07:00
Quentin Khan
ebb5828ec7 Make the MHLO dialect legal in the MHLO -> TFLite conversion pass.
PiperOrigin-RevId: 566593832
2023-09-19 04:58:23 -07:00
A. Unique TensorFlower
2f6c00fd5d Internal Code Change
PiperOrigin-RevId: 566575065
2023-09-19 03:27:21 -07:00
A. Unique TensorFlower
63c478552e Update TFRT dependency to use revision
bf279def1a.

PiperOrigin-RevId: 566574665
2023-09-19 03:19:52 -07:00
Ilia Sergachev
6cbc40ce10 [XLA:GPU][NFC] Make test TritonGemmTest.NondefaultOperandLayoutIsSupported faster.
And clean it up a little.

PiperOrigin-RevId: 566571071
2023-09-19 03:02:48 -07:00
A. Unique TensorFlower
5ead0b6356 Integrate LLVM at llvm/llvm-project@4176ce61f1
Updates LLVM usage to match
[4176ce61f156](https://github.com/llvm/llvm-project/commit/4176ce61f156)

PiperOrigin-RevId: 566566342
2023-09-19 02:38:36 -07:00
Zhi An Ng
92cfa0e8d0 Update XNNPack dependency
PiperOrigin-RevId: 566564208
2023-09-19 02:29:01 -07:00
A. Unique TensorFlower
fe2da355b5 compat: Update forward compatibility horizon to 2023-09-19
PiperOrigin-RevId: 566559310
2023-09-19 02:19:06 -07:00
A. Unique TensorFlower
ef01cc1d01 Update GraphDef version to 1624.
PiperOrigin-RevId: 566558824
2023-09-19 02:11:46 -07:00
Christian Sigg
c4952b2070 Fixing bug introduced in cl/560954035. The NVIDIA RTX A6000 is an sm_86 GPU, not sm_89.
PiperOrigin-RevId: 566557943
2023-09-19 02:03:08 -07:00
A. Unique TensorFlower
154eb9d524 Internal Code Change
PiperOrigin-RevId: 566517153
2023-09-18 22:56:06 -07:00
Berkin Ilbeyi
1af702b984 [XLA] Verify that async computations are trivial (contain only a root instruction and parameter instructions).
PiperOrigin-RevId: 566509445
2023-09-18 22:19:21 -07:00
Jake Harmon
bf9ebc7c3b Move TSL/XLA headers to their original locations in tensorflow/include
PiperOrigin-RevId: 566498953
2023-09-18 21:20:34 -07:00
Wilsin Gosti
4f638671ed #tf.data Allow the DataServiceClient to increase max_outstanding_requests only when there is enough memory, unless its value is explicitly set by users.
PiperOrigin-RevId: 566497277
2023-09-18 21:09:24 -07:00
Fangrui Song
dfcf1d40e4 Add third_party/triton/cl565664892.patch which is missing from a previous llvm integration
PiperOrigin-RevId: 566490078
2023-09-18 20:31:22 -07:00
Peter Hawkins
1169ec8093 Remove :cudnn_wrappers BUILD target.
PiperOrigin-RevId: 566466412
2023-09-18 18:12:00 -07:00
Fangrui Song
a86c6c5839 Integrate LLVM at llvm/llvm-project@45735770ee
Updates LLVM usage to match
[45735770ee87](https://github.com/llvm/llvm-project/commit/45735770ee87)

PiperOrigin-RevId: 566460975
2023-09-18 17:48:00 -07:00
Brian Wieder
00a17d7451 Add error checking to make sure the API generator only writes to files that are passed in as output_files.
PiperOrigin-RevId: 566423310
2023-09-18 15:11:47 -07:00
Majid Dadashi
94e73c964e [tflite] Outline the stablehlo.scatter flatbuffer conversion logic
PiperOrigin-RevId: 566420937
2023-09-18 15:03:26 -07:00
Yash Katariya
f0d131703d Delete TransferPjRtBufferBetweenMemories and replace it with CopyToMemorySpace, which is more robust, fully async, and transfers between any memory spaces.
PiperOrigin-RevId: 566420233
2023-09-18 14:55:23 -07:00
Fangrui Song
8818cf8285 Integrate LLVM at llvm/llvm-project@14882d6b74
Updates LLVM usage to match
[14882d6b7440](https://github.com/llvm/llvm-project/commit/14882d6b7440)

PiperOrigin-RevId: 566410108
2023-09-18 14:22:51 -07:00
Peter Hawkins
2e7993a63a [XLA:GPU] Shard topk_kernel.cu.cc.
I've seen this file take over 5 minutes to build. Shard it by type.

PiperOrigin-RevId: 566409094
2023-09-18 14:15:20 -07:00
Haibo Huang
28ab1668fe Don't register ops when TF APIs are not available
PiperOrigin-RevId: 566404976
2023-09-18 14:04:53 -07:00
A. Unique TensorFlower
3de4416895 Upgrade to LLVM 17, CUDA 12.2, and CuDNN 8.9.4
This is updating TF's default toolchain to LLVM 17, as well as
CUDA and cuDNN to the latest releases.

PiperOrigin-RevId: 566403707
2023-09-18 13:57:18 -07:00
David Majnemer
3a67329732 [XLA] Clarify code regarding dynamic literals
No functional change, let's just remove some ambiguity.

PiperOrigin-RevId: 566401107
2023-09-18 13:51:16 -07:00
A. Unique TensorFlower
26c6758362 Update TFRT dependency to use revision
44239b5afd.

PiperOrigin-RevId: 566400358
2023-09-18 13:44:28 -07:00
David Dunleavy
a7e764fc71 Make diff_parser have visibility outside its own package and fix a formatting error.
PiperOrigin-RevId: 566397309
2023-09-18 13:38:28 -07:00
Derek Murray
5d285bc149 Annotate the dummy constant and Send/Recv pair created for control edges.
This change adds "/ctrl/" to the name of the dummy constant added by graph partitioning. As a result, it becomes easier to determine in a profile trace whether a Send/Recv edge was added due to a control or data dependency.

PiperOrigin-RevId: 566393419
2023-09-18 13:25:35 -07:00
A. Unique TensorFlower
110b707e5e Add an experimental option to SavedModel to suppress the addition of SavedModel-native save and restore ops to the SavedModel. This is for cases where users already build custom save/restore ops and checkpoint formats for the model being saved, and the SavedModel-native save/restore ops only lengthen graph generation times.
PiperOrigin-RevId: 566393107
2023-09-18 13:19:59 -07:00
Chandra Devarakonda
716e1cc977 Enable MLIR bridge in TF-TPU builds
PiperOrigin-RevId: 566387489
2023-09-18 13:09:08 -07:00
Jake Harmon
82b3111214 Move TSL/XLA headers to their original locations in tensorflow/include
PiperOrigin-RevId: 566385285
2023-09-18 13:00:42 -07:00
Siqiao Wu
1c15b63a12 Make PeriodicFunction accept AnyInvocable. std::function does not support move-only callbacks.
PiperOrigin-RevId: 566385242
2023-09-18 12:54:08 -07:00
Ce Zheng
59a577efd0 [XLA] Disable async call syntax sugar when the async computation is nontrivial (i.e., does not expand to a single instruction), as that is not lossless. This ensures the two formats can round-trip.
PiperOrigin-RevId: 566376729
2023-09-18 12:24:58 -07:00
Ilia Sergachev
57a86066fe [XLA:GPU][NFC] Increase triton emitter test shard count.
PiperOrigin-RevId: 566375014
2023-09-18 12:18:39 -07:00
Gunhyun Park
2177f8a43d Add an hlo-expand tool to run passes on an HloModule.
This tool lets you convert an HloModule from stdin or a file to another format,
run a set of expander passes, and dump the output to stdout or a file.

This expander tool is divided into the following steps:
1. Load HloModule from stdin or file.
2. Add a set of passes to the HloPassPipeline.
3. Run a set of passes on the module.
4. Optionally print the output to stdout.
5. Optionally write the output to file in the specified format.

Usage:

  hlo-expand \
    [--input_format=[hlo|pb|pbtxt]] \
    [--optional_flags] \
    [path/to/hlo_module]

PiperOrigin-RevId: 566374517
2023-09-18 12:13:59 -07:00
A. Unique TensorFlower
5baed505a8 Always consume the diagnostic handler status, and return failure if it catches an error that running the passes does not.
PiperOrigin-RevId: 566370251
2023-09-18 12:07:22 -07:00
TensorFlower Gardener
b3088931e9 Merge pull request #61894 from MichaelHudgins:arm-64-docker
PiperOrigin-RevId: 566369155
2023-09-18 12:00:37 -07:00
A. Unique TensorFlower
49c00089c0 [XLA:GPU] Add a conditional return to SoftmaxRewriterTriton::Run so that it returns false if no fusible diamond chains are matched.
PiperOrigin-RevId: 566368003
2023-09-18 11:52:53 -07:00
Shanbin Ke
8618e98cb9 PR #5184: [XLA:GPU] fused attention runner clean up
Imported from GitHub PR https://github.com/openxla/xla/pull/5184

This is a follow-up PR that cleans up the redundant, nearly duplicated runners and MLIR ops.
There are currently 4 runners and 3 MLIR ops defined for fmha fwd; a different runner is chosen based on whether there is a bias/mask.
There are currently 2 runners and 2 MLIR ops defined for fmha bwd; a different runner is chosen based on whether there is a mask.
With `AttrSizedOperandSegments` we can merge all fwd runners/MLIR ops into one and all bwd runners/MLIR ops into one, since flash attention will add more optional buffers. There is no point in keeping multiple runners/MLIR ops: it hurts maintainability and introduces code bloat.
Copybara import of the project:

--
41063ec170108374701721038f4feb602dfee436 by cjkkkk <ske@nvidia.com>:

clean up fmha runner

--
fa04b2e5396e2cce1ecaad8e28a96a0b6f257cd2 by cjkkkk <ske@nvidia.com>:

fix compilation error

--
f3a25aaebd3110529ab383ceb17652eeb22da5e2 by cjkkkk <ske@nvidia.com>:

rebased on fmha runtime changes

--
0007719d2292d5750b8666ba5e1e5a223a420e38 by cjkkkk <ske@nvidia.com>:

add std::optional to guard optional buffers

--
60437ffbe17bbd13173c2ec773f5bf945ecf6432 by cjkkkk <ske@nvidia.com>:

use more informative uid

--
25aa1baa87abb259e3e0350737c4d39a715183a0 by cjkkkk <ske@nvidia.com>:

fix data_ptrs_vec.size() == data_uids_vec.size() check

Merging this change closes #5184

PiperOrigin-RevId: 566359023
2023-09-18 11:22:26 -07:00
Katherine Wu
87694ad5b9 Fix AsyncCheckpoint issues.
1) Move `_copy_trackable_to_cpu` to `ShardedVariable` (from `ShardedVariableMixin`, which is inherited by other objects).
2) Fix a bug that excluded the handling of the TPUEmbedding object.

PiperOrigin-RevId: 566355521
2023-09-18 11:13:38 -07:00
Francois Chollet
c581f9b1c3 Rewire TensorFlow to rely on tf_keras target.
PiperOrigin-RevId: 566353493
2023-09-18 11:06:18 -07:00
A. Unique TensorFlower
f85a264ba0 Fix race condition in SnapshotAssignmentManager
I stumbled upon this by chance and found a solution.

Here is how I understand it:

- Multiple instances of SnapshotManager share one instance of SnapshotAssignmentManager.
- SnapshotManager recently became thread-safe and multiple instances are being used from multiple threads.
- SnapshotAssignmentManager is NOT thread-safe, but it is being used from multiple instances of SnapshotManager
  from multiple threads.

My fix just adds a mutex to SnapshotAssignmentManager.

PiperOrigin-RevId: 566347773
2023-09-18 10:54:43 -07:00
James Mullenbach
b756908e26 Enable retries for initial connection by default for PSS to mitigate startup issues.
Add a test for startup fault tolerance.

Make the connection step more explicit; previously, connection only happened within context initialization during a list_logical_devices call, which was unintuitive.

PiperOrigin-RevId: 566344263
2023-09-18 10:47:53 -07:00
Matthias Kramm
dbd362e386 Don't modify linked list while reading it.
PiperOrigin-RevId: 566341841
2023-09-18 10:39:40 -07:00
A. Unique TensorFlower
0e8c6fbea6 Added placeholder for internal RPC option.
PiperOrigin-RevId: 566341265
2023-09-18 10:33:16 -07:00
Peter Hawkins
57f33cabbe Rollback of PR #5300
PR #5300: A new pass to optimize the AllGather->Binary_Op order sequence

Imported from GitHub PR https://github.com/openxla/xla/pull/5300

This is a new GPU SPMD optimization pass for the following pattern:
binary-op(all-gather(a), all-gather(b))
to
all-gather(binary-op(a, b))

PiperOrigin-RevId: 566340142
2023-09-18 10:24:36 -07:00
A. Unique TensorFlower
0f6d8fd48c Fix potential leak in SnapshotManager
In the error case, `snapshot_manager_` will not be deleted.
To fix this, create the `unique_ptr` before calling `Start` instead of afterwards.

I just saw this by chance when looking into cl/566268125.

PiperOrigin-RevId: 566334445
2023-09-18 10:08:24 -07:00
Michael Hudgins
ee70e8d292 Update copyright headers
2023-09-18 15:53:15 +00:00