Summary:
In this diff:
[1] Change the output from all root-to-label paths to a TreeProto.
TreeProto itself is required by inference, and hsm_util can derive the
paths from it.
[2] Fix hsm_util index assignment.
Differential Revision: D4416731
fbshipit-source-id: 657d8b9b4df6fa30c9f92d391cf7e07b5c5db1f8
Summary: Change label indices to be in the half-open range [0, num_classes)
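As an illustration of what a dense label range means, here is a minimal
sketch that maps arbitrary label values onto indices 0 .. num_classes - 1.
The function name `remap_labels` is hypothetical and is not taken from this
diff; the actual change may simply shift an existing offset.

```python
def remap_labels(labels):
    """Map arbitrary label values to dense indices in [0, num_classes)."""
    index_of = {}  # original label value -> dense index
    dense = []
    for label in labels:
        if label not in index_of:
            # Assign the next unused index on first sight of a label.
            index_of[label] = len(index_of)
        dense.append(index_of[label])
    return dense, index_of


dense, index_of = remap_labels([7, 3, 7, 9])
num_classes = len(index_of)
```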
Differential Revision: D4416685
fbshipit-source-id: b16ca8539fd538ad62bf1298dbad3f1553956241
Summary:
An operator that reads labels, computes their counts, and generates a Huffman
tree hierarchy. It emits all paths from the root node to the leaf labels as a
serialized HierarchyProto to be used as an input to the HSoftmax operator.
The tree is constructed greedily, bottom-up, keeping indices to parent
nodes in order to generate the code and the path from root to leaf in
a bottom-up traversal.
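The construction described above can be sketched roughly as follows. This
is an illustrative Python version, not the operator's code: the names
`build_huffman` and `paths_and_codes` are hypothetical, the result is a
plain dict rather than a HierarchyProto, and at least one label is assumed.

```python
import heapq


def build_huffman(counts):
    """Greedy bottom-up Huffman construction over label counts.

    Leaves are nodes 0 .. len(counts) - 1; internal nodes are appended
    after them. Returns (parent, bit, root).
    """
    n = len(counts)
    parent = [-1] * n  # parent[i]: index of node i's parent, -1 for root
    bit = [0] * n      # bit[i]: branch (0 or 1) taken from parent to i
    heap = [(c, i) for i, c in enumerate(counts)]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent nodes under a new internal node.
        c0, a = heapq.heappop(heap)
        c1, b = heapq.heappop(heap)
        new = len(parent)
        parent.append(-1)
        bit.append(0)
        parent[a], bit[a] = new, 0
        parent[b], bit[b] = new, 1
        heapq.heappush(heap, (c0 + c1, new))
    return parent, bit, heap[0][1]


def paths_and_codes(parent, bit, num_labels):
    """Walk from each leaf up to the root, then reverse to root-to-leaf."""
    result = {}
    for leaf in range(num_labels):
        path, code = [], []
        node = leaf
        while parent[node] != -1:
            code.append(bit[node])
            node = parent[node]
            path.append(node)
        result[leaf] = (path[::-1], code[::-1])
    return result


parent, bit, root = build_huffman([5, 1, 1, 2])
result = paths_and_codes(parent, bit, 4)
```

In this sketch the most frequent label ends up with the shortest code, and
every path starts at the root node.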
Note:
HSoftmax handles a generic hierarchy, which means that for the binary case
we could save one matrix-vector operation per node by representing every node
as a logistic function, and also reduce the paths proto size by producing only
one integer list to represent the path/indices and one bytes list for the code
per label.
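To make the binary-case idea concrete, here is a minimal sketch in which
each internal node is a single logistic unit (one dot product instead of a
matrix-vector product) and a label's probability is the product of the
branch decisions along its path. The function `label_probability`, the
weight layout, and the convention that code bit 1 corresponds to the
sigmoid output are assumptions for illustration, not HSoftmax's behavior.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def label_probability(x, node_weights, path, code):
    """P(label | x) as a product of per-node logistic decisions.

    path: internal node indices from root to the label's parent.
    code: one bit per node on the path (1 = sigmoid branch, 0 = other).
    """
    p = 1.0
    for node, b in zip(path, code):
        # One dot product per node on the path.
        z = sum(w * xi for w, xi in zip(node_weights[node], x))
        p_one = sigmoid(z)
        p *= p_one if b == 1 else 1.0 - p_one
    return p


# Hypothetical three-label tree with two internal nodes (0 is the root).
node_weights = {0: [0.5, -1.0], 1: [2.0, 0.3]}
x = [1.0, 2.0]
labels = {
    "a": ([0], [1]),
    "b": ([0, 1], [0, 0]),
    "c": ([0, 1], [0, 1]),
}
total = sum(label_probability(x, node_weights, p, c) for p, c in labels.values())
```

Because the two branches at every node have probabilities that sum to one,
the probabilities over all labels also sum to one.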
Differential Revision: D4303294
fbshipit-source-id: c7f0d3c204536234c26bb2a4228cb3a1892db395