Commit Graph

9 Commits

Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* The LICENSE file contains the details, so they are being removed from individual source files.
2018-03-27 13:10:18 -07:00
Frank Jiang
c809d89810 Fix RowWiseSparseAdam implementation
Summary: The original implementation averaged the first-moment (momentum) estimate across the embedding dimensions, which makes no sense: every embedding dimension received the same update, effectively reducing the parameter to a very memory-expensive one-dimensional embedding (see the sketch after this entry).

Differential Revision: D7003135

fbshipit-source-id: ed54e3427bc13895a4e949e96b4b17f6ebfb6d53
2018-02-16 13:28:26 -08:00
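
A minimal NumPy sketch of the problem described above, not the caffe2 kernel; the function names and hyperparameters are illustrative assumptions. Averaging the first moment over the embedding dimension collapses the per-dimension gradient signal, so every entry of a row gets the same scalar step, while keeping the moment per dimension preserves distinct updates even when the second moment is reduced to one value per row.

```python
import numpy as np

def adam_row_buggy(grad_row, m_row, v_row, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    # Averaging the first moment across dimensions: the whole row receives
    # one identical scalar update.
    m_avg = (b1 * m_row + (1 - b1) * grad_row).mean()
    v_avg = (b2 * v_row + (1 - b2) * grad_row ** 2).mean()
    return np.full_like(grad_row, lr * m_avg / (np.sqrt(v_avg) + eps))

def adam_row_per_dim(grad_row, m_row, v_row, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    # Per-dimension first moment; only the second moment is reduced per row.
    m_new = b1 * m_row + (1 - b1) * grad_row
    v_new = (b2 * v_row + (1 - b2) * grad_row ** 2).mean()
    return lr * m_new / (np.sqrt(v_new) + eps)

g = np.array([0.1, -0.5, 2.0])
m, v = np.zeros(3), np.zeros(3)
print(adam_row_buggy(g, m, v))    # identical value in every dimension
print(adam_row_per_dim(g, m, v))  # distinct per-dimension updates
```
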
Frank Jiang
304e607b70 Fix adam test
Reviewed By: pietern

Differential Revision: D6787780

fbshipit-source-id: a2d1428b0e028d6f3d8f7c312c90f3fa411cd0a2
2018-01-25 12:59:54 -08:00
Frank Jiang
6f0bb28afb Stop running RowWiseSparseAdam test on GPU
Reviewed By: pietern

Differential Revision: D6739194

fbshipit-source-id: 0892cdc6a575a84147f86984c67e7b4bf605a197
2018-01-17 15:05:21 -08:00
Frank Jiang
61356cbadc RowWiseSparseAdam operator
Summary: Added the RowWise functionality for SparseAdam, which saves roughly 2/3 of the memory usage by keeping only one first- and second-moment term for each row of the parameter tensor, rather than one for each individual parameter (a back-of-the-envelope calculation follows this entry).

Differential Revision: D6679342

fbshipit-source-id: ce6fb27e35ce41a890c66f6089cd2748d10e7a44
2018-01-16 19:39:31 -08:00
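
A back-of-the-envelope check of the roughly-2/3 figure, using a made-up table size: full Adam keeps the parameters plus two full-sized moment tensors (about 3·N·D floats for an N×D embedding), while the row-wise variant keeps one moment pair per row (N·D + 2·N floats), so for large D roughly two thirds of the total disappears.

```python
N, D = 1_000_000, 64                 # hypothetical embedding table size
full_adam = 3 * N * D                # params + moment1 + moment2, all N*D sized
row_wise  = N * D + 2 * N            # params + one (m, v) pair per row
print(1 - row_wise / full_adam)      # ~0.66, i.e. roughly 2/3 saved
```
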
Pieter Noordhuis
9835ca9bac Ensure indices list in sparse optimizer tests is unique
Summary:
There were no dimensionality constraints on the generated indices
array, which caused many examples to be generated and then filtered out.
Instead, we should ensure that the probability of unique indices is high.

A better fix is to use the `unique` keyword argument to
`hypothesis.extra.numpy.arrays` (sketched after this entry), but that
argument is only available in hypothesis version 3.28.0 and later.

This is related to #1536 and #1599.

Once this change has proven to be OK, we can modify the other tests
that now have health check suppression enabled as well.
Closes https://github.com/caffe2/caffe2/pull/1686

Reviewed By: Yangqing

Differential Revision: D6651789

Pulled By: pietern

fbshipit-source-id: d80886c9ccf0a7a842a7580a279f33a2d6cca97c
2018-01-03 12:19:14 -08:00
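
A minimal sketch of the `unique` alternative mentioned above, assuming hypothesis >= 3.28.0; the test name and index ranges are made up for illustration.

```python
import numpy as np
import hypothesis.strategies as st
from hypothesis import given
from hypothesis.extra.numpy import arrays

@given(indices=arrays(dtype=np.int64,
                      shape=st.integers(min_value=1, max_value=16),
                      elements=st.integers(min_value=0, max_value=1000),
                      unique=True))   # duplicates are never generated
def test_indices_are_unique(indices):
    assert len(np.unique(indices)) == indices.shape[0]
```
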
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Luke Yeager
a47652379f Fix SparseAdagrad for indices.ndim>1
Summary:
Same fix as https://github.com/caffe2/caffe2/pull/249, but for SparseAdagrad.

Also update the tests for both ops to cover this case (a reference sketch follows this entry).
Closes https://github.com/caffe2/caffe2/pull/675

Differential Revision: D5148750

Pulled By: akyrola

fbshipit-source-id: d30b722429bc547fd53400c1a29e4ee9e2e6ed18
2017-05-30 12:02:18 -07:00
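
A hedged NumPy reference for the behavior the updated tests exercise, assuming indices with more than one dimension behave as if flattened (one parameter-row update per index entry); this is a sketch, not the caffe2 operator.

```python
import numpy as np

def sparse_adagrad_reference(param, history, indices, grad, lr=0.1, eps=1e-6):
    idx = indices.reshape(-1)            # indices.ndim > 1 acts like 1-D
    g = grad.reshape(idx.shape[0], -1)   # one gradient row per index entry
    for i, row in zip(idx, g):
        history[i] += row * row
        param[i] -= lr * row / (np.sqrt(history[i]) + eps)
    return param, history

param, history = np.zeros((5, 3)), np.zeros((5, 3))
indices = np.array([[0, 2], [4, 2]])     # indices.ndim == 2
grad = np.ones((2, 2, 3))
sparse_adagrad_reference(param, history, indices, grad)
```
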
Luke Yeager
8bd0522c20 Add tests and GPU impls for sparse optimizers
Summary:
These GPU paths are probably even buggier than the CPU paths for sparse gradients with duplicate indices. Both paths apply multiple momentum updates in a single iteration when an index repeats, but only the GPU path is non-deterministic (a small illustration follows this entry). Depending on how we decide to address the issues on the CPU path, pooyadavoodi has a good idea for how to match dense behavior with the sparse GPU ops.
Closes https://github.com/caffe2/caffe2/pull/254

Reviewed By: bwasti

Differential Revision: D4871680

Pulled By: dzhulgakov

fbshipit-source-id: 220be57a0f699a22ea85ed4f7022d92d362d06b3
2017-04-13 11:07:40 -07:00
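
An illustrative NumPy example of the duplicate-index issue described above, using Adagrad-style history for brevity: applying the update once per occurrence accumulates state differently from summing the duplicate gradients first, which is what the dense path effectively does.

```python
import numpy as np

grads = np.array([[1.0, 2.0],
                  [1.0, 2.0]])           # two gradients for the same row

# Sparse path: the row's state is read-modify-written once per occurrence.
hist_per_occurrence = np.zeros(2)
for g in grads:
    hist_per_occurrence += g * g         # g^2 added twice  -> [2., 8.]

# Dense-equivalent path: duplicates are summed before a single update.
g_sum = grads.sum(axis=0)
hist_summed = g_sum * g_sum              # (2g)^2           -> [4., 16.]

print(hist_per_occurrence, hist_summed)  # the two trajectories diverge
```
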