Commit Graph

40 Commits

Author SHA1 Message Date
Anshul Verma
a340d141de Check num_elements > num_samples in UniformSampling
Summary: When num_elements is less than num_samples, a workflow should fail during net construction time. Currently, it fails at run time.

Reviewed By: kittipatv

Differential Revision: D5858085

fbshipit-source-id: e2ab3e59848bca58806eff00adefe7c30e9ad891
2017-09-21 16:37:20 -07:00
Badri Narayan Bhaskar
9507cae9e0 Create MergeIdListsLayer
Summary: We create a layer for MergeIdListsOp

Differential Revision: D5531348

fbshipit-source-id: a2e227e1abda05cefa893fd41a2c3ca997851e25
2017-08-22 17:00:55 -07:00
Yan Shang
57c93435e3 Dedup name in functional layer
Summary:
Before this fix, a functional layer name can appear several time in a
blob and causes confusion. This diff fix this issue.

Reviewed By: kittipatv

Differential Revision: D5641354

fbshipit-source-id: d19349b313aab927e6cb82c5504f89dbab60c2f2
2017-08-17 17:50:34 -07:00
Long Jin
ef64a4f6b2 Add conv layer and layer tests
Reviewed By: xianjiec

Differential Revision: D5569206

fbshipit-source-id: ed836315f3ee4d7983da94f2633a3085fe99194d
2017-08-08 10:57:43 -07:00
Jacqueline Xu
a1bf14d8e6 Building new randomized sparse nn model
Summary: New hybrid randomized sparse nn, which allows layers of sparse NN model to be randomized, semi-random, or learnable

Reviewed By: chocjy

Differential Revision: D5416489

fbshipit-source-id: eb8640ddf463865097ba054b9f8d63da7403024d
2017-08-07 12:48:58 -07:00
Jiyan Yang
4b80ff89e2 Use softsign op for s=0 in arc-cosine feature map
Summary:
The current implementation for s=0 doesn't support backward pass.
Switching to using pow op instead as a temporary solution.

Reviewed By: jackielxu

Differential Revision: D5551742

fbshipit-source-id: 33db18325b3166d60933284ca1c4e2f88675c3d3
2017-08-03 23:35:11 -07:00
Jacqueline Xu
13569c9aa0 Fixing semi-random layer model for multi-layer models
Summary:
Updated the semi-random layer model for multi-layer models using semi-random layers.

Notable changes:
- Input and outputs for the semi-random layer is now a Struct with "full" and "random" components
- Flag was added to choose to initialize output schema in Arc Cosine or not (if output schema initialization will happen in Semi Random layer)

Reviewed By: chocjy

Differential Revision: D5496034

fbshipit-source-id: 5245e287a5b1cbffd5e8d2e3da31477c65b41e04
2017-07-27 15:25:19 -07:00
Jacqueline Xu
9bec54bbf1 Modify arc cosine feature map and semi random layers to initialize parameters as global constants
Summary:
The original issue was that the initialized parameters for randomized layers (Arc Cosine and Semi-Random) were not fixed across distributed runs of the layers. Moreover, as the weights are initialized as (constant) parameters, when the layer is added to the preprocessing part, these weights won't be saved after training since they don't exist on the trainer.

I fixed the issue here by building an option to add the randomized parameters to the model global constants so that the same parameter values can be accessed. Also, the parameters can be saved when the training is finished.

In this diff, I've:
- Updated randomized parameters to be added as a global constant across distributed runs of Arc Cosine Feature Map and Semi Random Feature layers
- Updated unit tests
- Ran an end-to-end test, enabling multiple readers to test the fixed issue

Reviewed By: chocjy

Differential Revision: D5483372

fbshipit-source-id: b4617f9ffc1c414d5a381dbded723a31a8be3ccd
2017-07-26 16:37:00 -07:00
Honghao Wei
290acab2c7 implement drelu and unittest
Summary:
In this revision, I mainly implemented the DRelu activation. See https://arxiv.org/pdf/1706.06978v1.pdf for details.
To sum up, different from standard relu and purely, which divide the scope into two parts with boundary at zero, DRelu calculate another value p to divide the activation into two part. P is the softmax value of the output of Batch Normalization. For f(x)=x part in relu, you can find similar patten in f(x)=px, and for f(x)=0 part in rely, you can find similar pattern in f(x)=a(1-p)x, in which a is a parameter to tune. Drelu activation result is the sum of these two parts, f(x) = a(1-p)x + px.

To implement DRelu, I take BatchNormalization as super class and then use the above formula for computation. In order to allow users to choose activation methods, which usually takes place when calling add_mlp function in processor_util.py, I pass the parameter transfer in model_option from UI to the details, just as what dropout do. Currently, I place it in extra_option, but can modify it if AML team needs to redesign the UI.

I also add units test for DRelu. We check the shape of output and also do the numeric unit tests.
For Unit test, I first check the numeric value of BatchNormalization, since there is no similar test before. I then compute the value of DRelu outputs and compare the results with current DRelu layer.

Reviewed By: chocjy

Differential Revision: D5341464

fbshipit-source-id: 896b4dcc49cfd5493d97a8b448401b19e9c80630
2017-07-20 11:50:08 -07:00
Honghao Wei
b68adec7bb adding model loss logic
Summary: Add api model.add_loss(), which allows adding loss, such as optimization and regularization. See change in sparse_nn.py, in which 'model.loss = loss' is changed to 'model.add_loss(loss)'.

Reviewed By: xianjiec

Differential Revision: D5399056

fbshipit-source-id: 13b2ced4b75d129a5ee4a9b0e989606c04d2ca8b
2017-07-14 16:25:23 -07:00
Jacqueline Xu
2aa8fc7e8d Implementing Semi-Random Features Layer
Summary:
- (Split diff from Arc Cosine)
- Implemented [[ https://arxiv.org/pdf/1702.08882.pdf | Semi-Random Features ]] Layer
- Created a buck unit test for SRF Layer

Reviewed By: chocjy

Differential Revision: D5374803

fbshipit-source-id: 0293fd91ed5bc19614d418c2fce9c1cfdd1128ae
2017-07-14 13:15:50 -07:00
Jiyan Yang
043640c3eb Return top K classes
Reviewed By: kittipatv

Differential Revision: D5363481

fbshipit-source-id: 27ce37878434917c1a7c5f325ed77c989a1448af
2017-07-13 00:20:00 -07:00
Tao Wu
02aa5ad9fb make functional layer return scalar if only one output
Summary: This diff makes functional layer return scalar if only one output. This diff also corrects all other corresponding implementations.

Reviewed By: kittipatv

Differential Revision: D5386853

fbshipit-source-id: 1f00582f6ec23384b2a6db94e19952836755ef42
2017-07-12 11:34:31 -07:00
Jacqueline Xu
e89e71c595 Simplifying Random Fourier Features and layer test
Summary:
- Condensed operators in RFF layer
- Adjusted RFF layer test; made test code more concise

Reviewed By: chocjy

Differential Revision: D5391436

fbshipit-source-id: 08748861cd6fb4a9e4cc9c8762996371492020a1
2017-07-11 00:40:53 -07:00
Jacqueline Xu
6ea71155c1 Implementing Arc Cosine Layer
Summary:
- Implemented the [[ http://cseweb.ucsd.edu/~saul/papers/nips09_kernel.pdf | Arc Cosine ]] layer
  - Developed buck unit test for Arc Cosine

Reviewed By: chocjy

Differential Revision: D5367604

fbshipit-source-id: ffd3ee081bc055b06c075c34aa6ce329b62ce2e0
2017-07-10 10:10:36 -07:00
Jacqueline Xu
25bd5dda27 Implementing random fourier features layer
Summary:
- Created the random fourier features layer
- Generated a unit test to test the random fourier features layer is built correctly
- Inspired by the paper [[ https://people.eecs.berkeley.edu/~brecht/papers/07.rah.rec.nips.pdf |   Random Features for Large-Scale Kernel Machines]]

Reviewed By: chocjy

Differential Revision: D5318105

fbshipit-source-id: c3885cb5ad1358853d4fc13c780fec3141609176
2017-07-04 23:48:42 -07:00
Tao Wu
4be5337cca add support for weight in batch_softmax_loss
Summary: weighted batch_softmax_loss when weight exists in input_record

Reviewed By: kittipatv

Differential Revision: D5291646

fbshipit-source-id: f1bcd386ad1fc0e95e0a0315ec1c36531c792495
2017-06-21 10:32:15 -07:00
Jacqueline Xu
6150d9bef2 Building dropout as layer
Summary: Dropout layer and unittest for DPer2

Reviewed By: chocjy

Differential Revision: D5254866

fbshipit-source-id: 5eaea81808ddf8e0c7a7d76209ea44cda2ee28aa
2017-06-19 14:46:52 -07:00
Thomas Dudziak
60c78d6160 Fixes range/xrange for Python 3
Summary: As title

Differential Revision: D5151894

fbshipit-source-id: 7badce5d3122e8f2526a7170fbdcf0d0b66e2638
2017-06-07 00:04:26 -07:00
Jiyan Yang
6aff754dbc Add batch normalization layer
Summary: As desc.

Reviewed By: xianjiec

Differential Revision: D5077230

fbshipit-source-id: f73cdedac6d9a3542f8ef829b54fb4c713dcafd0
2017-05-26 16:46:52 -07:00
Kittipat Virochsiri
211eae127c LastNWindowCollector
Summary: Layer for LastNWindowCollector op. We need this since it's an in-place operator.

Reviewed By: chocjy

Differential Revision: D4981772

fbshipit-source-id: ec85dbf247d0944db422ad396771fa9308650883
2017-05-04 17:32:09 -07:00
Kittipat Virochsiri
22d4eaeb9e JoinContext
Summary:
Layer to allow model to follow different paths for each instantiation context and join later. Together with tagging system cleanup (this is a separate issue), this should reduce the need to write a layer to differentiate between context.

Re: tagging system clean up, we should make exclusion more explicit: EXCLUDE_FROM_<CONTEXT>. This would simplify instation code. TRAIN_ONLY should become a set of all EXCLUDE_FROM_*, except EXCLUDE_FROM_TRAIN.

Reviewed By: kennyhorror

Differential Revision: D4964949

fbshipit-source-id: ba6453b0deb92d1989404efb9d86e1ed25297202
2017-05-02 17:32:26 -07:00
Chonglin Sun
e8e93066e7 add workflow for user complicated embedding
Summary: Correctly propagate request_only tag to all layer.

Reviewed By: kennyhorror

Differential Revision: D4751496

fbshipit-source-id: e65fd8cfe56d2989213d44e684a528ede691d316
2017-05-02 10:46:52 -07:00
Jiyan Yang
795dc1c326 Remove loss ops from eval net
Summary: Current eval nets contain loss operators; see example: https://fburl.com/6otbe0n7, which is unnecessary. This diff is to remove them from the eval net.

Differential Revision: D4934589

fbshipit-source-id: 1ba96c20a3a7ef720414acb4124002fb54cabfc7
2017-04-26 12:46:25 -07:00
Jiyan Yang
ef2701a57e MapToRange layer
Summary: A layer that takes raw ids as inputs and outputs the indices which can be used as labels. The mapping will be stored with the model.

Reviewed By: kittipatv

Differential Revision: D4902556

fbshipit-source-id: 647db47b0362142cdba997effa2ef7a5294c84ee
2017-04-25 16:03:58 -07:00
Kittipat Virochsiri
fd9185ab21 fix getting empty struct
Summary: `not field` calls `__len__()`, causing the field to appear to be missing even when it's not

Differential Revision: D4910587

fbshipit-source-id: bc2b2fadab96571ae43c4af97b30e50c084437af
2017-04-19 22:36:05 -07:00
Huazhong Ning
ad6b53e401 allow to specify output dtypes for functional layers
Summary:
Currently, the functional layer infers the output types and shapes by running the operator once.
But in cases where special input data are needed to run the operator, the inferrence may fail.
This diff allows the caller to manually specify the output types and shapes if the auto infererence may fail.

Reviewed By: kennyhorror

Differential Revision: D4864003

fbshipit-source-id: ba242586ea384f76d745b29a450497135717bdcc
2017-04-18 16:34:52 -07:00
Kittipat Virochsiri
0a726af42e Coerce input of FunctionalLayer to record
Summary: Having to pack the input to schema doesn't make much sense since the structure is not recognized by operators anyway.

Differential Revision: D4895686

fbshipit-source-id: df78884ed331f7bd0c69db4f86c682c52829ec76
2017-04-17 19:26:06 -07:00
Kittipat Virochsiri
baf33161d4 GatherRecord layer
Summary: Perform gather on the whole record. This will be used for negative random sampling.

Reviewed By: kennyhorror

Differential Revision: D4882430

fbshipit-source-id: 19e20f7307064755dc4140afb5ba47a699260289
2017-04-13 15:02:44 -07:00
Kittipat Virochsiri
5c32c82a6d Add option to subtract log odd from sampled trained prediction.
Summary: Useful for sampled softmax training

Differential Revision: D4782673

fbshipit-source-id: 88195de60070a0bc16f5e06b9aad4dffd0484546
2017-04-03 17:50:58 -07:00
Xianjie Chen
9fc56793dd fix trunk for push and small cleanup
Summary:
multiple places broken, blocking the push :(

- fix the weighted training for ads and feeds
- fix the publishing if no exporter model is selected
- fix the feeds retrieval evaluation
- added the default config for retrieval workflows. plan to use for flow test (in next diff)
- clean up not used code
- smaller hash size for faster canary test

Reviewed By: chocjy

Differential Revision: D4817829

fbshipit-source-id: e3d407314268b6487c22b1ee91f158532dda8807
2017-04-02 23:35:49 -07:00
Kittipat Virochsiri
3eb3507367 uniform_sampling layer
Summary: This layer will be used to sample negative labels for sampled softmax.

Differential Revision: D4773444

fbshipit-source-id: 605a979c09d07531293dd9472da9d2fa7439c619
2017-03-29 14:36:12 -07:00
Andrey Malevich
7cc92b1260 Add eval net for layer_model_helper
Summary:
This diff is adding eval nets to layer model helper. It should be useful for
the cases when train/eval nets need some extra input (usually some supervision)
for train/eval. For example various sampled layers, etc.

Differential Revision: D4769453

fbshipit-source-id: 7a8ec7024051eab73b8869ec21e20b5f10fd9acb
2017-03-29 04:03:40 -07:00
Kittipat Virochsiri
da36212259 SamplingTrain layer
Summary:
`SamplingTrain` layer is a wrapper around another layer subclassing `SamplingTrainableMixin`. When initiated in the training context, `SamplingTrain` produces sparse output of the wrapped layer. Output can be paired with `indices` to create Map schema.  When initiated in prediction context, the full output of the wrap layer is produced.

This is liked the SampledFC function in model helper, https://fburl.com/gi9g1awh, with the ability to initiated in both trainig and prediction context.

I'd like to get consensus whether we should introduce the `SamplingTrain` layer and the accompaying mixin. This can probably be accomplished in some other way, but I think this is not too bad.

Reviewed By: xianjiec

Differential Revision: D4689887

fbshipit-source-id: 7be8a52d82f3a09a053378146262df1047ab26a8
2017-03-27 23:31:55 -07:00
Huazhong Ning
8168e8ac25 allows to specify output names for functional layers
Summary:
currently the output schema and blobs are names as "field_i" which is
bad for debugging. This diff allows us to specify output names.

Reviewed By: kennyhorror

Differential Revision: D4744949

fbshipit-source-id: 8ac4d3c75cacbb4c9b5f55793ac969fe1cf20467
2017-03-23 13:18:58 -07:00
Kittipat Virochsiri
4829bdb1ea BatchSoftmaxLoss layer
Summary: Similar to BatchLRLoss layer

Reviewed By: xianjiec

Differential Revision: D4689609

fbshipit-source-id: 89fa4b9d4145ce77cb2aaa7a5c0c1a24f901d88f
2017-03-17 10:19:06 -07:00
Kittipat Virochsiri
cea16ff7cd BatchSigmoidCrossEntropyLoss
Summary: To support feed interset team

Reviewed By: kdub0

Differential Revision: D4719213

fbshipit-source-id: 8deb3544377fb06593399b101de66f3f845f93b5
2017-03-17 09:35:51 -07:00
Kittipat Virochsiri
61dd35f1d6 FCWithoutBias layer
Summary: For some embedding task, we don't want to include bias term in embedding computation.

Reviewed By: xianjiec

Differential Revision: D4689620

fbshipit-source-id: 4168584681d30c0eaa1d17ceaf68edda11924644
2017-03-15 11:03:37 -07:00
Kittipat Virochsiri
25b1221579 Allow scalar output in functional layer
Summary: Some operators, e.g., SoftmaxWithLoss, returns scalar-typed tensor. This would allow us to use those ops without having to write layer manually.

Reviewed By: xianjiec, kennyhorror

Differential Revision: D4703982

fbshipit-source-id: f33969971c57fc037c9b44adb37af1caba4084b6
2017-03-14 15:32:47 -07:00
Andrey Malevich
a3726759c6 Add a way do describe layers in a more AdHoc manner.
Summary:
This diff is trying to address one of the concerns that Xianjie have had - requirements create a layer for all operators and attach pass shapes and other info around.

The basic idea of the diff:
1. Try to create a layer with a given name, but if it's not available try to fallback on operator with that name (that is expected to have no parameters).
2. For all operators that we're adding through this functional style of creation - try to use C2 Shape/Type inference logic to get output type. If we fail to get - it just return untyped record and expect user to annotate it when it's really needed.

Reviewed By: xianjiec

Differential Revision: D4408771

fbshipit-source-id: aced7487571940d726424269970df0eb62670c39
2017-02-27 23:30:39 -08:00