Make the RMSPropOptimizer docstring more explicit about sparse vs. dense

PiperOrigin-RevId: 165335237
Author: Allen Lavoie (committed by TensorFlower Gardener)
Date: 2017-08-15 11:33:44 -07:00
parent 75a9c4b5c8
commit 03a33c08dd


@@ -63,9 +63,17 @@ class RMSPropOptimizer(optimizer.Optimizer):
                name="RMSProp"):
     """Construct a new RMSProp optimizer.
-    Note that in dense implement of this algorithm, m_t and v_t will
-    update even if g is zero, but in sparse implement, m_t and v_t
-    will not update in iterations g is zero.
+    Note that in the dense implementation of this algorithm, variables and their
+    corresponding accumulators (momentum, gradient moving average, square
+    gradient moving average) will be updated even if the gradient is zero
+    (i.e. accumulators will decay, momentum will be applied). The sparse
+    implementation (used when the gradient is an `IndexedSlices` object,
+    typically because of `tf.gather` or an embedding lookup in the forward pass)
+    will not update variable slices or their accumulators unless those slices
+    were used in the forward pass (nor is there an "eventual" correction to
+    account for these omitted updates). This leads to more efficient updates for
+    large embedding lookup tables (where most of the slices are not accessed in
+    a particular graph execution), but differs from the published algorithm.
     Args:
       learning_rate: A Tensor or a floating point value. The learning rate.
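
The behavior described in the new docstring text can be seen with a small TF 1.x graph-mode sketch. This example is not part of the commit; the variable, shapes, and learning rate are illustrative. A `tf.gather` in the forward pass makes the gradient with respect to the gathered variable an `IndexedSlices`, so RMSProp's sparse path updates only the gathered rows and their accumulator slices, whereas a dense gradient would update every row (and decay every accumulator) even where the gradient is zero.

import tensorflow as tf

embeddings = tf.Variable(tf.ones([4, 2]), name="embeddings")

# The forward pass only touches rows 0 and 2, so the gradient w.r.t.
# `embeddings` is an `IndexedSlices` and the sparse RMSProp path is taken:
# rows 1 and 3, and their accumulator slices, are not updated at all.
gathered = tf.gather(embeddings, [0, 2])
loss = tf.reduce_sum(tf.square(gathered))

optimizer = tf.train.RMSPropOptimizer(learning_rate=0.1)
train_op = optimizer.minimize(loss)

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  sess.run(train_op)
  # Rows 1 and 3 are still exactly [1., 1.]; rows 0 and 2 have moved.
  print(sess.run(embeddings))

If the loss were instead computed from the whole variable (a dense gradient), all four rows and all accumulator entries would be updated on every step, which matches the published algorithm but is wasteful for large embedding tables where most rows are untouched in a given step.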