Commit Graph

59 Commits

Artem Volkhin
b2cf0fad15 Convert SparseLookup layer's embedding to fp16 blobs for predictor
Summary:
First part of adding half-float support to DPER 2.0. Let's add an option, use_half_floats, to enable converting some weights of the model from fp32 to fp16 before saving them to the predictor model parts; for now it covers the SparseLookup layer's embeddings. All conversion is done after training is finished, and the saved models are ready to be used on remote predictors as-is (they will be stored compacted in memory). The new fp16 blobs are saved to the model in place of the original ones, under the same names, so we don't modify MetaNetDef at all.
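
A minimal sketch of the conversion step, assuming Caffe2's FloatToHalf operator (blob name and shape are illustrative):

  import numpy as np
  from caffe2.python import core, workspace

  # Stand-in for a SparseLookup layer's fp32 embedding table.
  workspace.FeedBlob(
      "sparse_lookup/w", np.random.rand(1000, 64).astype(np.float32))

  # Post-training fp32 -> fp16 conversion. In the real flow the fp16
  # result is saved under the original blob's name, which is why
  # MetaNetDef needs no changes.
  workspace.RunOperatorOnce(
      core.CreateOperator(
          "FloatToHalf", ["sparse_lookup/w"], ["sparse_lookup/w_fp16"]))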

Next steps:
1) support on the delivery side -- operators working with these blobs should support both float and float16 input types
2) benchmark performance to make sure there is no regression
 a) of serialization
 b) of delivery
3) support realtime training (I'm thinking about adding a new pre-publishing net that will be executed each time the realtime trainer stops to publish a new snapshot)

Depends on D4567304

Reviewed By: kennyhorror

Differential Revision: D4571710

fbshipit-source-id: 19967a17d3bd84878d66e8c0ed8c5342bf38d979
2017-02-22 11:05:49 -08:00
Xianjie Chen
d0621a2449 NextScopedBlob with well-defined behavior and respect namescope
Summary:
Remove the use of `NextName` in the layer model helper, so that the same function returns a `model_helper` that constructs an identical `Net` when run under the same NameScope.

`NextScopedBlob` should only take effect when there is a real name conflict; otherwise it returns a `ScopedBlobReference`.

This is critical for parameter blobs. In the long run we need to be able to specify parameter blobs more explicitly (kennyhorror is working on this). This solution works in the short term, e.g., for two-tower sparse NN models.
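
A minimal pure-Python sketch of the intended naming rule (hypothetical helper, not the actual Caffe2 implementation):

  def next_scoped_blob(existing_blobs, namescope, base_name):
      # Deterministic scoped name first, like ScopedBlobReference.
      name = namescope + "/" + base_name if namescope else base_name
      if name not in existing_blobs:
          return name
      # Only on a real conflict, fall back to a numbered variant.
      suffix = 1
      while "%s_%d" % (name, suffix) in existing_blobs:
          suffix += 1
      return "%s_%d" % (name, suffix)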

Reviewed By: kennyhorror

Differential Revision: D4555423

fbshipit-source-id: 2c4b99a61392e5d51aa878f7346466a8f14be187
2017-02-16 17:16:36 -08:00
Andrey Malevich
86fb25cefa Rely on embedding size in split
Summary: As desc.

Differential Revision: D4471823

fbshipit-source-id: 2685c64c22556da1749b3e3e6b21a684a7231e7b
2017-01-27 19:44:31 -08:00
Vsevolod Oparin
5e5486491d Replace Gather + RowMul by SparseLengthsWeightedSum
Summary:
Improving performance using the SparseLengthsWeightedSum operator (a toy equivalence sketch follows the timings below). Results for my run:
Before:

  8.98474 RowMul
  6.89952 Gather
  0.80991 LengthsSum
  2.02056 SparseLengthsWeightedSum
  Total: 18.71

After:

  1.075 Gather
  6.54999 SparseLengthsWeightedSum
  Total: 7.62
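
A toy sketch of the equivalence being exploited, using the operator names from the timings above (shapes and values are illustrative):

  import numpy as np
  from caffe2.python import core, workspace

  workspace.FeedBlob("data", np.random.rand(10, 4).astype(np.float32))
  workspace.FeedBlob("indices", np.array([0, 2, 3, 5], dtype=np.int64))
  workspace.FeedBlob("weights", np.array([0.5, 1.0, 2.0, 0.25], dtype=np.float32))
  workspace.FeedBlob("lengths", np.array([2, 2], dtype=np.int32))

  # Before: gather rows, scale each row by its weight, pool per segment.
  before = core.Net("before")
  before.Gather(["data", "indices"], ["rows"])
  before.RowMul(["rows", "weights"], ["weighted"])
  before.LengthsSum(["weighted", "lengths"], ["pooled_before"])

  # After: one fused operator does all three steps.
  after = core.Net("after")
  after.SparseLengthsWeightedSum(
      ["data", "weights", "indices", "lengths"], ["pooled_after"])

  workspace.RunNetOnce(before)
  workspace.RunNetOnce(after)
  assert np.allclose(workspace.FetchBlob("pooled_before"),
                     workspace.FetchBlob("pooled_after"))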

Log of run: P56992396

With skip_backward. Command:

  CLASSPATH=/mnt/vol/gfsetlprocstore-oregon/users/cxj/hivereader-wrapper-1.0-SNAPSHOT-standalone.jar OMP_NUM_THREADS=1 MKL_NUM_THREADS=1 MKL_DYNAMIC=FALSE ./buck-out/gen/caffe2/caffe2/fb/dper/tools/speed_benchmark.par -loader_param /mnt/vol/gfsfblearner-altoona/flow/data/2017-01-22/d832bb7b-5598-422e-9fee-b3299a9c8c1f -negDownsampleRate 0.1 -hidden 'unary(dot{"num_dense": 6, "pooling_method": "PositionWeighted"}(128, 64)128-128, 1)' -model_type mlp_sparse -warmup_runs 10 -main_runs 1000 -run_individual -skip_backward 2>&1 | tee /tmp/log.txt

Before: P56993234$7509
After: P56992503$7344

Command:

  ./fblearner/nn/ads/canary all

https://our.intern.facebook.com/intern/fblearner/details/13320564/?notif_channel=cli

Cloned "caffe2 ads sparse nn canary" run: https://our.intern.facebook.com/intern/fblearner/details/13322337/

Reviewed By: xianjiec

Differential Revision: D4451073

fbshipit-source-id: 0a4e9693d7b8b0372b2efefa61154e987a493210
2017-01-24 20:44:21 -08:00
Ievgen Soboliev
1632f053e5 implement user-only metadata for input_record
Summary:
We want to implement a request-only net, and to do this we decided to split the work into two parts: the first part propagates the required metadata, and the second part will cut the nets properly.
This diff propagates the request_only metadata across the layers.

A few notes about implementation:
  - Each layer contains a field request_only that can be set based on the input_record: if all the scalars from the input_record are marked request_only, we mark the layer as request_only (a sketch of this rule follows the list);
  - The Sparse-To-Dense layer sets the request_only metadata;
  - The SigridTransformation and SparseLookup layers propagate the request_only status;
  - For now we join request_only and other sparse features together in the input_record, but ideally we may want to separate them, because request_only features should be served separately.
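
A minimal sketch of the propagation rule from the first note (hypothetical structures, not the actual layers-framework code):

  # A layer is request_only only when every scalar feeding it is.
  def layer_is_request_only(input_scalars):
      scalars = list(input_scalars)
      return bool(scalars) and all(
          s.get("request_only", False) for s in scalars)

  assert layer_is_request_only(
      [{"request_only": True}, {"request_only": True}])
  assert not layer_is_request_only(
      [{"request_only": True}, {"name": "dense_f1"}])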

Reviewed By: xianjiec

Differential Revision: D4259505

fbshipit-source-id: db8a30ef92cba84f1a843981b9dde3a8b9633608
2016-12-15 12:01:29 -08:00
Xianjie Chen
c70e8115a1 dper_example use RowMul for speed
Summary:
Faster (~65k vs. 25k):

After: 11444089
Before: 11259149

Differential Revision: D4275671

fbshipit-source-id: 57de414676799980632c1d29142ee698965b1b68
2016-12-15 12:01:28 -08:00
Xianjie Chen
2045a5de9f add position-based weighting
Summary: Adding more methods to the layer representation. The corresponding implementation in DPER is https://fburl.com/563869364
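
A brief numpy sketch of what position-based weighting does for one pooled sparse feature (shapes are illustrative; in the actual layer the per-position weights are learned parameters):

  import numpy as np

  emb = np.random.rand(100, 8).astype(np.float32)  # embedding table
  ids = np.array([3, 7, 42])                       # one example's id-list
  pos_w = np.array([1.0, 0.5, 0.25], dtype=np.float32)  # per-position weights

  # Plain sum pooling vs. position-weighted pooling.
  pooled_sum = emb[ids].sum(axis=0)
  pooled_pos = (pos_w[:, None] * emb[ids]).sum(axis=0)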

Differential Revision: D4256583

fbshipit-source-id: 91326b7bb9e960a5bc70b5a13812fce90054eceb
2016-12-05 11:53:26 -08:00
Yangqing Jia
589398950f fbsync at f5a877 2016-11-18 15:41:06 -08:00
Yangqing Jia
238ceab825 fbsync. TODO: check if build files need update. 2016-11-15 00:00:46 -08:00