tensorflow

mirror of https://github.com/zebrajr/tensorflow.git synced 2025-12-07 12:20:24 +01:00

Author	SHA1	Message	Date
A. Unique TensorFlower	a69eea2814	Internal changes only. PiperOrigin-RevId: 565383371	2023-09-14 09:04:21 -07:00
Bixia Zheng	c6b4dd5ef2	[xla] Rename LatencyHidingSchedulerPreparation pass to P2PSchedulePreparation pass. This is because the pass is needed to linearize point-to-point Send and Recv chains for an HLO scheduler. Modify the GPU HLO scheduler to call P2PSchedulePreparation pass regardless of whether the latency hiding scheduler is on. PiperOrigin-RevId: 565374605	2023-09-14 08:39:30 -07:00
Pat Notz	ad428bcfb9	Fixes to avoid deadlocks with collectives in the pipelining while loop PiperOrigin-RevId: 565355251	2023-09-14 07:04:02 -07:00
A. Unique TensorFlower	9d833cd42c	Internal Code Change PiperOrigin-RevId: 565353457	2023-09-14 06:55:39 -07:00
Benjamin Kramer	98b0549d68	Integrate LLVM at llvm/llvm-project@bf8fd086d0 Updates LLVM usage to match [bf8fd086d09c](https://github.com/llvm/llvm-project/commit/bf8fd086d09c) PiperOrigin-RevId: 565351116	2023-09-14 06:43:23 -07:00
Johannes Reifferscheid	b24592bcda	Const-correctness fixes for GetFusionRoots. The output is mutable for no good reason, which causes issues when we want to express "this fusion instruction's roots or the instruction if it's not a fusion". PiperOrigin-RevId: 565341699	2023-09-14 05:48:24 -07:00
Jiyoun (Jen) Ha	cd3d3c25b4	Add StableHLO Quantizer as an option in TF Quantizer. PiperOrigin-RevId: 565339452	2023-09-14 05:36:22 -07:00
Andrew Goodbody	3da3565572	[Linaro:ARM_CI] Update to clang-17 Update compiler in use to be clang-17 to maintain sync with other builds.	2023-09-14 12:09:14 +01:00
A. Unique TensorFlower	e585ea7203	Added placeholder for internal RPC option. PiperOrigin-RevId: 565319212	2023-09-14 03:38:16 -07:00
Ilia Sergachev	2736ef7a65	[XLA:GPU] Handle kTranspose in optimized HLO in Triton emitters. PiperOrigin-RevId: 565307989	2023-09-14 02:44:21 -07:00
Alan Kelly	ac9aa167ff	Select correct FC 8x16 path PiperOrigin-RevId: 565302587	2023-09-14 02:23:29 -07:00
A. Unique TensorFlower	f17c9f5ef5	Update GraphDef version to 1619. PiperOrigin-RevId: 565301847	2023-09-14 02:16:28 -07:00
A. Unique TensorFlower	f3b30f5a0b	compat: Update forward compatibility horizon to 2023-09-14 PiperOrigin-RevId: 565301845	2023-09-14 02:08:52 -07:00
Zichuan Wei	ab7c193a3d	lite: enable group conv -> conv2d conversion PiperOrigin-RevId: 565268295	2023-09-13 23:28:05 -07:00
Hye Soo Yang	20196d5398	Add highway c++ library as a dep to TensorFlow PiperOrigin-RevId: 565237467	2023-09-13 20:22:29 -07:00
Ryan M. Lefever	d8a17da3f0	Change 3/6 for making MSA repacking slice aware. Changed SlicedAllocationFinder to - accept a method to determine if allocations are permitted to begin at a given offset - expose a method to test if a sliced allocation can fit at a specific offset PiperOrigin-RevId: 565231151	2023-09-13 19:45:35 -07:00
Jake Harmon	ebe4f498df	Fix comment bug in tsl's clean_dep PiperOrigin-RevId: 565223624	2023-09-13 18:57:41 -07:00
Ryan M. Lefever	0d333c8ae0	Change 2/6 for making MSA repacking slice aware. Fix a bug in which we over-allocate space for slices, when they are colocated with larger buffers. The interaction causing this behavior is as follows: A) GlobalDecreasingSizeBestFitHeap::FindChunkCandidates() adds additional space to the last chunk in a sliced allocation, to account for max_colocation_size. B) When AlternateMemoryBestFitHeap::CheckPrefetchFit() computes slices_for_pending_chunks, it recomputes the size of the sliced allocation as the sum of the sizes of the chunks returned from A. Note, we do not recompute the size for the allocation in a non-sliced world. C) Before committing a chunk, GlobalDecreasingSizeBestFitHeap::CommitChunk() changes the chunk's size to fit the size from B. Thus, in the sliced case we keep the extra max_colocation_size space, since we recalculated the allocation size with it. In the non-sliced case, we adjust the chunk size back to what is needed for the request. So, this change is a no-op for non-slices. PiperOrigin-RevId: 565217603	2023-09-13 18:23:50 -07:00
James Mullenbach	4404175d0d	Add configurable retries for SetServerDef, for fault tolerance amidst preemptions. There's a short period during ParameterServerStrategy initialization / cluster connection in which worker preemptions will lead to UnavailableErrors from CreateContext calls. This adds configurable retries to SetServerDef so that a single connection failure does not stop the whole job. Retries will be enabled as the default behavior for PSS in a followup change. PiperOrigin-RevId: 565214961	2023-09-13 18:13:05 -07:00
A. Unique TensorFlower	75fb8c8e8c	#tf-data Provide autotune with fresh values for cpu/ram budget The loop that runs Autotune will fetch current values for available CPU and RAM on each iteration. This helps in situations where the hardware resources available to tf.data may be vertically scaled up or down based on usage during the process' lifetime. PiperOrigin-RevId: 565197940	2023-09-13 16:50:56 -07:00
Fergus Henderson	7cf3460cd1	Add missing backquotes in a couple of places in the release notes. PiperOrigin-RevId: 565191065	2023-09-13 16:27:18 -07:00
Yu Feng	96d793172d	Open source mesh_util_test.py PiperOrigin-RevId: 565189870	2023-09-13 16:19:34 -07:00
Clive Verghese	caac4ac308	Add 5c9f72faadaca7250b341b99da358e855a8d902e from abseil-cpp. PiperOrigin-RevId: 565187417	2023-09-13 16:12:21 -07:00
Fergus Henderson	310715d91f	Add a test using the FlatBuffer C API (rather than the FlatBuffer C++ API) to construct the TFLiteSettings FlatBuffer. PiperOrigin-RevId: 565184361	2023-09-13 15:56:42 -07:00
Son Tuan Vu	e54cae4089	[XLA:GPU] Limit unroll factor for column reductions Vectorized column reductions might exceed shmem budget. Limit the unroll factors to avoid this. PiperOrigin-RevId: 565170403	2023-09-13 15:25:44 -07:00
TensorFlower Gardener	adcfd3f69c	Merge pull request #61809 from terryheo:use-ndk-r26 PiperOrigin-RevId: 565170354	2023-09-13 15:19:18 -07:00
Swachhand Lokhande	f80b7460db	Use PjRtFuture returned by ExecutePortable to extend the lifetime of PjRtBuffers. The owned PjRtBuffers in `owned_executable_args` need to live until execution is complete. Currently this is achieved by blocking until all the executable outputs are ready. However, this seemed to cause performance overheads, see b/299683272 and b/300102691. With this change, we don't block until execution is complete. The ownership of `owned_executable_args` is moved to a lambda which is executed as a callback when the PjRtFuture returned by ExecutePortable is ready (which happens when the execution is complete). PiperOrigin-RevId: 565169152	2023-09-13 15:10:57 -07:00
Hye Soo Yang	92691ae0ac	Open source op for `GetMinibatchesInCsrWithPhysicalReplica` for SparseCore. PiperOrigin-RevId: 565168980	2023-09-13 15:06:49 -07:00
Yu Feng	4f75b24b0f	UnimplementedError prints the type name, so that users can act on the classes by adding the override methods there. PiperOrigin-RevId: 565168942	2023-09-13 15:00:50 -07:00
A. Unique TensorFlower	59e2bcf692	[XLA] Add WithReplicaGroups in pattern matcher and modify tests to conform to the new pattern matching format -Add WithReplicaGroups implementation for HloInstructionPattern to match with the collective instruction's replica groups. PiperOrigin-RevId: 565160403	2023-09-13 14:36:25 -07:00
Hye Soo Yang	9331ee5476	Adding sparse_core_ops_stats_handler for recording metrics PiperOrigin-RevId: 565158613	2023-09-13 14:29:32 -07:00
Yishuang Pang	b090b29c0a	Update CocoaPods specs for TFLite 2.13.0 PiperOrigin-RevId: 565157773	2023-09-13 14:21:30 -07:00
Hye Soo Yang	b99dffb460	Open source kernel for `GetMinibatchesInCsrWithPhysicalReplicaOp` for SparseCore. Open source sparse_core_ops_utils* PiperOrigin-RevId: 565156105	2023-09-13 14:13:45 -07:00
Son Tuan Vu	67c0625e50	[XLA:GPU] Fix theoretical bug where shmem_usage > shmem_budget PiperOrigin-RevId: 565152217	2023-09-13 14:02:44 -07:00
A. Unique TensorFlower	6ed6cf087d	Remove PluginConfig class. It was always set to PluginConfig::kDefault. PiperOrigin-RevId: 565140913	2023-09-13 13:23:44 -07:00
A. Unique TensorFlower	646de0c120	Defines an interface for makespan evaluation. PiperOrigin-RevId: 565134913	2023-09-13 13:03:26 -07:00
Matt Callanan	96591d0419	#tf-data Scale up `"inject_io_prefetch"` experiment to 50% job level. PiperOrigin-RevId: 565132363	2023-09-13 12:53:27 -07:00
A. Unique TensorFlower	5c004b4b94	Adds a new cost component for makespan. PiperOrigin-RevId: 565121864	2023-09-13 12:14:13 -07:00
A. Unique TensorFlower	65e5b764a4	[XLA:GPU] Add missing tolerance for BF16 tests in Triton Softmax tests. PiperOrigin-RevId: 565119660	2023-09-13 12:06:12 -07:00
Grant Jensen	74a206527e	[tflite-gpu] Push select_v2 dim check up from inference to parser. PiperOrigin-RevId: 565113938	2023-09-13 11:47:13 -07:00
A. Unique TensorFlower	c91917d7dc	Use tf2xla implementation of SliceOp instead of MLIR. PiperOrigin-RevId: 565111478	2023-09-13 11:39:29 -07:00
TensorFlower Gardener	56edb7fc3b	Merge pull request #58400 from SaoirseARM:toupstream/int6x8_32_accum PiperOrigin-RevId: 565106502	2023-09-13 11:27:45 -07:00
Hye Soo Yang	66dbd1d599	Open source global_iter_id.cc for SparseCore. PiperOrigin-RevId: 565097655	2023-09-13 10:54:59 -07:00
TensorFlower Gardener	3ff7a3ba05	Merge pull request #61634 from 0o001:master PiperOrigin-RevId: 565094585	2023-09-13 10:47:25 -07:00
Jieying Luo	26aa3c84c6	Excludes building pjrt_c_api_gpu_plugin.so (GPU only target) on MAC. PiperOrigin-RevId: 565092976	2023-09-13 10:40:49 -07:00
Marcello Maggioni	be0d1151a6	[XLA] Rework dot() sharding propagation to look ahead at instructions' sharding and choose a sharding for dot() that agrees with the users if possible. PiperOrigin-RevId: 565086052	2023-09-13 10:19:33 -07:00
Oleg Shyshkov	d11423a4e9	[mhlo] Remove unused HloLegalizeToLhlo pass. PiperOrigin-RevId: 565083641	2023-09-13 10:11:38 -07:00
Marcello Maggioni	f28e73538a	[XLA] Add support to CollectivePipeliner to sink collectives. Small collectives might be better off when sunk, and there are other potential use cases. Also fix a bug where we were accepting reuse of the data that we were storing, and change the tests using that pattern to match the fix. PiperOrigin-RevId: 565080772	2023-09-13 10:02:14 -07:00
Bixia Zheng	65bd69126a	[xla] Change the collective-permute-decomposer to not chain the Send and Recv instructions through control dependence. This is because the generated HLO program is correct even without the control dependence chaining. The purpose of the control dependence chaining is to support a scheduler, such as the latency hiding scheduler, and thus will be added to the latency hiding scheduler preparation pass. Not producing the control dependence chaining while decomposing collective-permute can also simplify the implementation of collective-pipeliner in pipelining Send and Recv instructions. PiperOrigin-RevId: 565073772	2023-09-13 09:38:54 -07:00
Benjamin Kramer	74d7ec3be9	Integrate LLVM at llvm/llvm-project@8ebe1d1cc1 Updates LLVM usage to match [8ebe1d1cc1e4](https://github.com/llvm/llvm-project/commit/8ebe1d1cc1e4) PiperOrigin-RevId: 565069541	2023-09-13 09:25:33 -07:00

153857 Commits