Commit Graph

12 Commits

Author SHA1 Message Date
Bugra Akyildiz
27c7158166 Remove __future__ imports for legacy Python2 supports (#45033)
Summary:
There is a module called `2to3` which you can target for future specifically to remove these, the directory of `caffe2` has the most redundant imports:

```2to3 -f future -w caffe2```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45033

Reviewed By: seemethere

Differential Revision: D23808648

Pulled By: bugra

fbshipit-source-id: 38971900f0fe43ab44a9168e57f2307580d36a38
2020-09-23 17:57:02 -07:00
Yangxin Zhong
ed788ec780 Linearizable Label: Class Weights, Allow Missing Label, and Average by Batch Size (#29707)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29707

In D17885977, Linearizable label (a multi-class classification) was implemented in MTML.

In this diff, we add several items for Linearizable label:

- Assigning different weights to each class through ```model_def.tasks[i].class_weights```.

  - This option is a dictionary, the keys of which are indices of the classes and the values of which are weights for each class.

  - For example, if a linearizable-label task has 4 classes and its ```class_weights = {"0": 1, "1": 0.1, "2": 0.1, "3": 0.01}```, it means that in the loss function of this task, we assign weight 1 to its first class, weight 0.1 to its second and third class, and weight 0.01 to its forth class. The index/order of classes follows the logic of linearizable label.

  - Note that when you assign different weights to different classes, you need to correct the calibration by setting an appropriate ```model_def.tasks[i].calibration.linearizable_class_weight```. Basically, the class weights in calibration should be the reciprocals of the class weights in loss function. So the ```calibration.linearizable_class_weight = {"0": 1, "1": 10, "2": 10, "3": 100}``` for the example above.

  - Example FBLearner job: f150763093

- We also support ```model_def.allow_missing_label_with_zero_weight``` for linearizable label, which will ignore those examples with first label missing, by assigning zero weights to them in loss function.

  - We need to set ```allow_missing_label_with_zero_weight = true``` to enable it.

  - Example FBLearner job: f150763093

- Last but not least, we update caffe2 operator ```SoftmaxWithLoss``` to support loss averaged by batch size.

  - We need to set ```model_def.tasks[i].loss.softmaxLoss.average_by_batch_size = true``` to enable it.

  - Previously, the loss was averaged by weight sum of examples in batch, which is still the default behavior now (when ```average_by_batch_size = null``` or ```average_by_batch_size = false```).

  - Without this new feature, the calibration will be incorrect when applying non-equal-weight training among different classes to a linearizable task.

  - Example FBLearner job with ```average_by_batch_size = true``` results in a correct calibration: f150763093

  - Example FBLearner job with ```average_by_batch_size = null``` results in an incorrect calibration: f150762990

Test Plan:
buck test caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_linearizable_label_task_with_class_weights
buck test caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_linearizable_label_task_with_zero_weight
buck test caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_linearizable_label_task_average_by_batch_size

All tests passed.

full canary: https://fburl.com/fblearner/troznfgh

Reviewed By: chenshouyuan

Differential Revision: D18461163

fbshipit-source-id: aaf3df031406ae94f74e2e365b57e47409ef0bfe
2019-11-13 16:52:27 -08:00
Long Jin
76bf8f62f7 fix loss_weight for self_supervision
Summary: previously loss_weight is not used correctly for self-supervision branch

Test Plan: buck test mode/dev-nosan //caffe2/caffe2/fb/dper/layer_models/models/experimental/tests:tum_test

Reviewed By: xianjiec

Differential Revision: D17862312

fbshipit-source-id: 554b793a5caa3886946c54333c81a0d8a10230d9
2019-10-15 10:40:48 -07:00
Xiaolong Wang
c909abd85f [GanH] Label Smooth: Add Layer and Integrate to SparseNN
as titled
2018-03-27 18:10:39 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
sf-wind
602a09dde7 Update caffe2 from facebook 4f527ef46abf (#2234)
* [GanH]: two_task_discriminator

as titled

and adding label smooth

* [Dper2] Simplified UI options needed for blob magnitude visualization

* [GanH]: fix tags

as titled

* Added type and shape inference for GatherRange operator

This helps with type / shape inference when using this operator in layers.
Also just a nice to have in general.

* Demonstrate Caffe2 exception handling with StoreHandlerTimeoutError in Python

We'd like to catch and recover from certain Caffe2 net exceptions. Use this diff to demonstrate a pattern of registering a pybind exception mapping and catching in Pythonusing caffe2::StoreHandlerTimeoutException.

* Bind Gloo IoException to IoError in Python

Allow peer failure handling and recovery using an exception based mechanism. This diff registers gloo::IoException with pybind.

* [GanH]: add label smoothing to softmax with loss

as titled

* [C2] Enable LARS in Adagrad and hook it to DPER

* [DPER] Don't pass LayerModelHelper in create_trainer_nodes

Since we're planning to get rid of it eventually and I want to get access to
NetDef only interface ASAP - I'm looking towards removing all references to
LMH, where we don't really need them.

* fix bugs in LambdaRankNdcgOp

the loss and gradient in LambdaRankNdcgOp are incorrect. The loss should be negative log of probs instead of log.

* Restrict thread pool on iOS to only big cores

Historically, iPhones exposed only one type of cores, and Caffe2 thread pool used all of them.
However, iPhone 8/iPhone X exposes 2 big + 4 LITTLE cores. As our thread pool doesn't support work stealing or other forms of load balancing, fast cores end up waiting for the slow ones, and it may be better to restrict execution to only 2 fast cores, like we do on Android.

* Remove SparseLength Sum/WeightedSum/Mean operators with fp16 engine

Remove SparseLength Sum/WeightedSum/Mean operators with fp16 engine

* make clang happy and get fewer warnings

make clang happy and get fewer warnings

* [Personalization] Support add_output_schema() in layer_model_helper

Problem:
Currently the output_schema of sparse_nn can only be set once. https://fburl.com/efth5zer.

Solution:
For flexibility, we want to add fields to output_schema incrementally.

Plan:
Wrap the change of `model._output_schema` into a new function `add_output_schema()` for adding additional output_schema.

Callsite:
The add_output_schema() should be called instead at https://fburl.com/efth5zer

Reference:
The newly added `add_output_schema()` will be similar to `add_loss()` in https://fburl.com/t2ii8njh
2018-03-12 12:22:59 -07:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Jiyan Yang
a8695178aa Adding parameter sharing API to Dper2
Summary:
To achive this, I modified the blob name scheme defined in a layer.
Before it was scope/fc_w and scope/fc_w_auto_0 (if there is another fc
    within the same scope).
Now I change it to scope/fc/w and scope/fc_auto_0/w.
That is, we rely on the uniqueness of the scoped layer name to define
names for blobs.

I also overwrote the create_param method in LayerModelHelper to let it
use the resolved name for blobs given the sharingparameter context.

There are some details such as making the initializer more structured
that I need to finalize.

Reviewed By: kennyhorror

Differential Revision: D5435132

fbshipit-source-id: a0525f5ea0977e255dd5ea765b38913f5951d455
2017-08-03 00:33:18 -07:00
Tao Wu
4be5337cca add support for weight in batch_softmax_loss
Summary: weighted batch_softmax_loss when weight exists in input_record

Reviewed By: kittipatv

Differential Revision: D5291646

fbshipit-source-id: f1bcd386ad1fc0e95e0a0315ec1c36531c792495
2017-06-21 10:32:15 -07:00
Jiyan Yang
a458aa4b2a Fix tags to be based on EXCLUDE_FROM_{CONTEXT}
Summary: Cleaning up the tagging system. Introducing tags EXCLUDE_FROM_{CONTEXT}.

Reviewed By: kennyhorror

Differential Revision: D4974842

fbshipit-source-id: b0fa6772299bb70afa2192c39e45191c9f41336a
2017-05-02 09:32:27 -07:00
Aaron Markham
58f7f2b441 doxygen python block added
Summary: Closes https://github.com/caffe2/caffe2/pull/226

Differential Revision: D4793550

Pulled By: JoelMarcey

fbshipit-source-id: cc33e58186304fa8dcac2ee9115dcc271d785b1e
2017-03-29 06:46:16 -07:00
Kittipat Virochsiri
4829bdb1ea BatchSoftmaxLoss layer
Summary: Similar to BatchLRLoss layer

Reviewed By: xianjiec

Differential Revision: D4689609

fbshipit-source-id: 89fa4b9d4145ce77cb2aaa7a5c0c1a24f901d88f
2017-03-17 10:19:06 -07:00