Summary: The original implementation averaged the momentum across the embedding dimensions, which is incorrect: every embedding dimension received the same update, effectively reducing the embedding to a very memory-expensive one-dimensional one.
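The difference can be sketched as follows (a hypothetical NumPy illustration of the described bug, not the actual Caffe2 op):

```python
import numpy as np

# Momentum SGD state for one embedding row, per dimension.
emb_dim = 4
momentum = np.array([0.5, -1.0, 2.0, 0.0])
grad = np.array([0.1, 0.2, -0.3, 0.4])
mu = 0.9

# Buggy behavior described above: the momentum term is averaged across
# the embedding dimensions, so every dimension sees one shared scalar.
buggy_m = np.full(emb_dim, (mu * momentum + grad).mean())

# Fixed behavior: momentum is tracked independently per dimension.
fixed_m = mu * momentum + grad

assert np.allclose(buggy_m, buggy_m[0])      # all dims get the same update
assert not np.allclose(fixed_m, fixed_m[0])  # dims differ, as intended
```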
Differential Revision: D7003135
fbshipit-source-id: ed54e3427bc13895a4e949e96b4b17f6ebfb6d53
Summary: Added RowWise functionality for SparseAdam, which saves roughly 2/3 of memory usage by keeping only one first- and second-moment term per row of the parameter tensor, rather than one per individual parameter.
Differential Revision: D6679342
fbshipit-source-id: ce6fb27e35ce41a890c66f6089cd2748d10e7a44
Summary:
There were no dimensionality constraints on the generated indices
array, causing many examples to be generated and then filtered out. Instead,
we should ensure the probability of generating unique indices is high.
There is a better fix for this by using the `unique` keyword argument
to `hypothesis.extra.numpy.arrays`, but this is available only in
hypothesis version 3.28.0 and later.
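The preferred fix would look roughly like this (a sketch assuming hypothesis >= 3.28.0; the test name and bounds are illustrative): the `unique` argument makes the strategy generate distinct elements directly instead of relying on a filter to discard duplicates.

```python
import numpy as np
import hypothesis.extra.numpy as hnp
from hypothesis import given, strategies as st

@given(hnp.arrays(np.int64, shape=10,
                  elements=st.integers(0, 1000),
                  unique=True))
def test_indices_are_unique(indices):
    # No rejection sampling needed: every generated array already has
    # pairwise-distinct indices.
    assert len(np.unique(indices)) == len(indices)
```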
This is related to #1536 and #1599.
Once this change has proven to be OK, we can modify the other tests
that now have health check suppression enabled as well.
Closes https://github.com/caffe2/caffe2/pull/1686
Reviewed By: Yangqing
Differential Revision: D6651789
Pulled By: pietern
fbshipit-source-id: d80886c9ccf0a7a842a7580a279f33a2d6cca97c
Summary:
These GPU paths are probably even buggier than the CPU paths for sparse gradients with duplicate indices. Both paths cause multiple momentum updates in a single iteration, but only the GPU path is non-deterministic. Depending on how we decide to address the issues on the CPU path, pooyadavoodi has a good idea for how to match dense behavior with the sparse GPU ops.
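The duplicate-index problem can be sketched in plain NumPy (an illustration of the issue described above, not the Caffe2 op): when the same row appears twice in a sparse gradient, applying momentum per occurrence updates that row's momentum twice in one iteration, which diverges from the dense behavior of summing duplicates first.

```python
import numpy as np

indices = np.array([3, 3])   # the same row appears twice in one gradient
grads = np.array([0.1, 0.2])
mu = 0.9

# Sparse path: momentum for row 3 is updated once per occurrence.
m_sparse = 0.0
for g in grads:
    m_sparse = mu * m_sparse + g

# Dense-equivalent behavior: duplicates summed, then one momentum update.
m_dense = mu * 0.0 + grads.sum()

assert not np.isclose(m_sparse, m_dense)  # the two paths disagree
```

On the GPU the per-occurrence updates additionally race with each other, so the result can also vary from run to run, which is the non-determinism noted above.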
Closes https://github.com/caffe2/caffe2/pull/254
Reviewed By: bwasti
Differential Revision: D4871680
Pulled By: dzhulgakov
fbshipit-source-id: 220be57a0f699a22ea85ed4f7022d92d362d06b3