Summary: New hybrid randomized sparse nn, which allows layers of sparse NN model to be randomized, semi-random, or learnable
Reviewed By: chocjy
Differential Revision: D5416489
fbshipit-source-id: eb8640ddf463865097ba054b9f8d63da7403024d
Summary:
The current implementation for s=0 doesn't support backward pass.
Switching to using pow op instead as a temporary solution.
Reviewed By: jackielxu
Differential Revision: D5551742
fbshipit-source-id: 33db18325b3166d60933284ca1c4e2f88675c3d3
Summary:
To achive this, I modified the blob name scheme defined in a layer.
Before it was scope/fc_w and scope/fc_w_auto_0 (if there is another fc
within the same scope).
Now I change it to scope/fc/w and scope/fc_auto_0/w.
That is, we rely on the uniqueness of the scoped layer name to define
names for blobs.
I also overwrote the create_param method in LayerModelHelper to let it
use the resolved name for blobs given the sharingparameter context.
There are some details such as making the initializer more structured
that I need to finalize.
Reviewed By: kennyhorror
Differential Revision: D5435132
fbshipit-source-id: a0525f5ea0977e255dd5ea765b38913f5951d455
Summary:
Updated the semi-random layer model for multi-layer models using semi-random layers.
Notable changes:
- Input and outputs for the semi-random layer is now a Struct with "full" and "random" components
- Flag was added to choose to initialize output schema in Arc Cosine or not (if output schema initialization will happen in Semi Random layer)
Reviewed By: chocjy
Differential Revision: D5496034
fbshipit-source-id: 5245e287a5b1cbffd5e8d2e3da31477c65b41e04
Summary:
The original issue was that the initialized parameters for randomized layers (Arc Cosine and Semi-Random) were not fixed across distributed runs of the layers. Moreover, as the weights are initialized as (constant) parameters, when the layer is added to the preprocessing part, these weights won't be saved after training since they don't exist on the trainer.
I fixed the issue here by building an option to add the randomized parameters to the model global constants so that the same parameter values can be accessed. Also, the parameters can be saved when the training is finished.
In this diff, I've:
- Updated randomized parameters to be added as a global constant across distributed runs of Arc Cosine Feature Map and Semi Random Feature layers
- Updated unit tests
- Ran an end-to-end test, enabling multiple readers to test the fixed issue
Reviewed By: chocjy
Differential Revision: D5483372
fbshipit-source-id: b4617f9ffc1c414d5a381dbded723a31a8be3ccd