TFLM aims to reduce the RAM used for model allocation; this change is the first phase of reducing that overhead. Instead of allocating TfLiteTensor items directly on TfLiteContext, TF Micro will return instances through function pointers on TfLiteContext. The new TfLiteEvalTensor will be used in TFLM kernel implementations to reduce memory overhead during TfLiteRegistration::Eval() calls.
NOTE: TfLiteEvalTensor can be moved into TFLM-only build rules when internal builds use TF_LITE_STATIC_MEMORY by default (b/160955687). Additionally, TfLiteContext contains many fields not used by TFLM and should be forked to reduce memory overhead as well.
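The intended access pattern can be sketched abstractly in Python (hedged: `EvalTensor`, `Context`, and `get_eval_tensor` are illustrative stand-ins for the C structs and function pointers TFLM adds, not the real API):

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class FullTensor:
    """Stand-in for TfLiteTensor: carries many bookkeeping fields."""
    data: list
    dims: tuple
    dtype: str
    name: str = ""
    allocation_type: int = 0
    is_variable: bool = False

@dataclass
class EvalTensor:
    """Stand-in for TfLiteEvalTensor: only what Eval() actually needs."""
    data: list
    dims: tuple
    dtype: str

class Context:
    """Stand-in for TfLiteContext."""
    def __init__(self, eval_tensors: List[EvalTensor]):
        # Instead of exposing an array of full tensors, the context hands
        # kernels lightweight eval tensors through an accessor.
        self._eval_tensors = eval_tensors
        self.get_eval_tensor: Callable[[int], EvalTensor] = (
            lambda i: self._eval_tensors[i])

def eval_add(ctx: Context, in0: int, in1: int, out: int) -> None:
    # A kernel's Eval() fetches tensors via the context accessor.
    a = ctx.get_eval_tensor(in0)
    b = ctx.get_eval_tensor(in1)
    o = ctx.get_eval_tensor(out)
    o.data[:] = [x + y for x, y in zip(a.data, b.data)]

ctx = Context([
    EvalTensor([1, 2, 3], (3,), "int32"),
    EvalTensor([10, 20, 30], (3,), "int32"),
    EvalTensor([0, 0, 0], (3,), "int32"),
])
eval_add(ctx, 0, 1, 2)
print(ctx.get_eval_tensor(2).data)  # [11, 22, 33]
```

The point of the slimmer struct is that per-tensor state needed only at init time never has to be resident during Eval().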
PiperOrigin-RevId: 320634141
Change-Id: I26d49dfd5fa8f96bea8e098202d191d7ae6f1957
Imported from GitHub PR https://github.com/tensorflow/tensorflow/pull/39429
This PR fixes and enables the TopK op and its tests for ROCm.
Copybara import of the project:
--
7ef56e66a29608b16d3b5cbd7cfad114ede3b3c1 by Eugene Kuznetsov <eugene.kuznetsov@amd.com>:
Fixing and enabling TopK on ROCm
COPYBARA_INTEGRATE_REVIEW=https://github.com/tensorflow/tensorflow/pull/39429 from ekuznetsov139:google-upstream-topk 7ef56e66a29608b16d3b5cbd7cfad114ede3b3c1
PiperOrigin-RevId: 320631998
Change-Id: If5291d7ff5ff0e98f953645e0adf085bf891cee8
The root cause is probably related to the Keras global graph and a potential memory leak. This change adds a tearDown method that force-cleans any Keras model created in the test, followed by a forced GC.
Also clean up some assertion methods to make the test code more readable.
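A minimal sketch of the tearDown pattern (hedged: `Model` is a stand-in object, not an actual Keras model, and the real change's test-framework details may differ):

```python
import gc
import unittest
import weakref

class Model:
    """Stand-in for a Keras model that holds onto memory."""
    def __init__(self):
        self.weights = [0.0] * 1024

class ModelTest(unittest.TestCase):
    def setUp(self):
        self._models = []

    def tearDown(self):
        # Drop every model created during the test, then force a GC so
        # nothing lingers between test cases.
        self._models.clear()
        gc.collect()

    def _make_model(self):
        model = Model()
        self._models.append(model)
        return model

    def test_model_is_created(self):
        ref = weakref.ref(self._make_model())
        self.assertIsNotNone(ref())

result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(ModelTest))
print(result.wasSuccessful())  # True
```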
PiperOrigin-RevId: 320630906
Change-Id: Ib72a39c184613e431d6eab03b18f3b217cb8e506
It is possible for a tf.StatefulPartitionedCall to return the result of a tf.ReadVariableOp (potentially forwarded through ops like tf.Identity). Because the return op's operands were captured before tf.ReadVariableOp results were replaced with function args, the new function's return operands could be incorrect. Now the operands of the new return are updated at the time tf.ReadVariableOp results are replaced with function args.
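The ordering bug can be modeled abstractly (hedged: `Value`, `ReturnOp`, and `replace_all_uses_with` are illustrative stand-ins for the MLIR constructs, not the real API):

```python
class Value:
    def __init__(self, name):
        self.name = name

class ReturnOp:
    def __init__(self, operands):
        self.operands = list(operands)

def replace_all_uses_with(old, new, ops):
    # When a tf.ReadVariableOp result is replaced by a function argument,
    # every op that uses it -- including the function's return -- must be
    # updated, or the return keeps referring to the stale value.
    for op in ops:
        op.operands = [new if v is old else v for v in op.operands]

read_result = Value("read_variable_result")
func_arg = Value("function_arg")
ret = ReturnOp([read_result])

# Buggy order: snapshot the return operands first, then replace uses.
snapshot = list(ret.operands)          # still holds the stale value
replace_all_uses_with(read_result, func_arg, [ret])

print([v.name for v in snapshot])      # ['read_variable_result'] (stale)
print([v.name for v in ret.operands])  # ['function_arg'] (updated)
```

Updating operands during replacement (rather than building the return from a pre-replacement snapshot) avoids the stale reference.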
PiperOrigin-RevId: 320597502
Change-Id: I81f614e0b89670c978da376d5810ff82502f601f
Implemented the doctest for module.py. Also fixed the doctest for the Dense class in the same file: it now runs properly, and the variable names are consistent.
The doctest should work now, though I had some trouble testing it locally even with the requirements installed.
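For reference, this is how a docstring doctest is exercised (a generic sketch; `scale` is a hypothetical function, not from module.py):

```python
import doctest

def scale(values, factor):
    """Scale each element of `values` by `factor`.

    >>> scale([1, 2, 3], 2)
    [2, 4, 6]
    >>> scale([], 10)
    []
    """
    return [v * factor for v in values]

# Collect and run the examples embedded in scale's docstring.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(scale, "scale"):
    runner.run(test)
print(runner.failures, runner.tries)  # 0 2
```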
Before this change, the registered graph optimization passes were executed on both the main-function side and the component-function side when running distributed functions. This is inefficient and can cause graph compilation problems. This change annotates component-function execution so that the graph passes are skipped when the component functions are instantiated, avoiding repeating passes that have already run on the main-function side.
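Abstractly, the mechanism looks like this (a hedged sketch; `is_component_function` and the pass-runner shape are illustrative, not the actual TF runtime API):

```python
def run_graph_passes(graph, passes, is_component_function=False):
    # Component functions carry an annotation so that instantiation skips
    # the optimization passes already applied on the main-function side.
    if is_component_function:
        return graph, 0
    for p in passes:
        graph = p(graph)
    return graph, len(passes)

passes = [lambda g: g + "|constant_folding", lambda g: g + "|inlining"]

main_graph, main_runs = run_graph_passes("main", passes)
comp_graph, comp_runs = run_graph_passes(
    "component", passes, is_component_function=True)
print(main_runs, comp_runs)  # 2 0
```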
PiperOrigin-RevId: 320540983
Change-Id: I4816240bcd5b54c738114c36f17ecc1b0b6c920d
tf.sign(NaN) should be NaN via XLA to match TensorFlow's normal behavior.
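The expected semantics, sketched in plain Python (an illustration of the behavior, not the XLA implementation):

```python
import math

def sign(x: float) -> float:
    # NaN propagates: sign(NaN) is NaN, matching TensorFlow's behavior
    # rather than mapping NaN to 0 or 1.
    if math.isnan(x):
        return math.nan
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0

print(sign(3.5), sign(-2.0), sign(0.0))  # 1.0 -1.0 0.0
print(math.isnan(sign(math.nan)))        # True
```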
PiperOrigin-RevId: 320539210
Change-Id: I1cfa4175f88cb1083b2a20222785a372801dacc8
Minor signature change of AddInputList.
Implement AddInputList for GraphContext; this is needed for the downstream
implementation of gradients.
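Conceptually, the operation looks like this (a hedged Python model; the real GraphContext is C++ and the exact signature differs):

```python
from typing import List

class Output:
    """Stand-in for an op output fed as an input to another op."""
    def __init__(self, op_name: str, index: int = 0):
        self.op_name, self.index = op_name, index

class OpBuilder:
    """Stand-in for an op being built in a graph context."""
    def __init__(self, op_type: str):
        self.op_type = op_type
        self.inputs: List[Output] = []

    def add_input(self, out: Output) -> None:
        self.inputs.append(out)

    def add_input_list(self, outs: List[Output]) -> None:
        # A list-valued input (e.g. the variadic inputs of AddN) is added
        # as a group; gradient code needs this to emit per-input grads.
        self.inputs.extend(outs)

op = OpBuilder("AddN")
op.add_input_list([Output("a"), Output("b"), Output("c")])
print(len(op.inputs))  # 3
```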
PiperOrigin-RevId: 320513498
Change-Id: I16ee672c6d6f1f9b260b319c954158694c82a1db
Many TF ops require that all operands and results have the same (element) type,
but previously this was often not checked, leading to failures later in the flow
that are harder to debug. This change adds several such checks.
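The kind of check being added, as a hedged Python sketch (an abstract verifier, not the actual MLIR verifier code):

```python
def verify_same_element_type(op_name, operand_types, result_types):
    # Fail early if operands and results do not share one element type,
    # instead of letting the mismatch surface later in the flow.
    types = set(operand_types) | set(result_types)
    if len(types) > 1:
        raise TypeError(
            f"{op_name}: operands/results must have the same element type, "
            f"got {sorted(types)}")

verify_same_element_type("tf.AddV2", ["f32", "f32"], ["f32"])  # passes

try:
    verify_same_element_type("tf.AddV2", ["f32", "i32"], ["f32"])
except TypeError as e:
    print("caught:", e)
```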
PiperOrigin-RevId: 320511211
Change-Id: Iaba7d7449085d4fc7a3e3c18b0dd26c185b7282c