pytorch/caffe2/python
Taiqing Wang ad91a3a11f Skipping L2 regularization on sparse biases
Summary:
# Motivations
As explained in [this answer](https://stats.stackexchange.com/questions/86991/reason-for-not-shrinking-the-bias-intercept-term-in-regression/161689#161689), regularizing biases causes miscalibration of predicted probabilities.
In SparseNN, the unary processor may use 1d embedding tables for the sparse features to serve as biases.
In this diff, the regularization term is automatically skipped for the 1d sparse parameters to avoid regularizing biases.
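
A minimal sketch of the skipping rule described above, in plain Python rather than the actual Caffe2 optimizer code. The names `should_skip_l2` and `l2_penalty`, the `is_sparse` flag, and the 0.5 * weight_decay * ||w||^2 form of the penalty are assumptions made for illustration; the only behavior taken from this summary is that 1d sparse parameters are excluded from regularization.

```python
def should_skip_l2(param_shape, is_sparse):
    """Hypothetical helper: True for 1d sparse parameters, which are
    assumed to act as biases and so should not be regularized."""
    return is_sparse and len(param_shape) == 1


def l2_penalty(values, param_shape, is_sparse, weight_decay):
    """Hypothetical helper: L2 penalty over a flat list of parameter values.

    Returns 0.0 when the parameter is a 1d sparse table; otherwise
    0.5 * weight_decay * sum(w^2). How 1d *dense* parameters are handled
    is outside this sketch; they are simply left regularized here.
    """
    if should_skip_l2(param_shape, is_sparse):
        return 0.0
    return 0.5 * weight_decay * sum(v * v for v in values)


# A 1d sparse table (bias-like) is skipped; a 2d sparse embedding is not.
bias_like = l2_penalty([1.0, 2.0], (2,), is_sparse=True, weight_decay=0.1)
embedding = l2_penalty([1.0, 2.0], (2, 1), is_sparse=True, weight_decay=0.1)
```

The check keys only on dimensionality and sparsity, so no per-parameter configuration is needed to exempt the bias tables.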

# Experiments
Experiments were conducted to verify that skipping the regularization on 1d sparse parameters has no significant impact on the NE.
Baseline.1 (no L2 regularization): f193105372
Baseline.2 (L2 regularization in prod): f193105522
Treatment (skipping L2 regularization on 1d sparse params): f193105708

{F239859690}

Test Plan:
Experiments were conducted using a canary package, `aml.dper2.canary:9efc576b35b24361bb600dcbf94d31ea`, to verify that skipping the regularization on 1d sparse parameters has no significant impact on the NE.

Baseline.1 (no L2 regularization): f193105372
Baseline.2 (L2 regularization in prod): f193105522
Treatment (skipping L2 regularization on 1d sparse params): f193105708

Reviewed By: zhongyx12

Differential Revision: D21757902

fbshipit-source-id: ced126e1eab270669b9981c9ecc287dfc9dee995
2020-06-11 11:21:25 -07:00
..
benchmarks [caffe2] open source 2/4-bit SLS operators (#34903) 2020-03-17 22:55:10 -07:00
docs Fix several DeprecationWarning: invalid escape sequence (#15733) 2019-01-05 08:53:35 -08:00
examples fix ROCm bench CI by increasing first iter timeout (#37633) 2020-05-04 20:49:32 -07:00
fakelowp Make FakeLowP tests work (#36525) 2020-04-13 20:16:33 -07:00
helpers Fix typos, via a Levenshtein-type corrector (#31523) 2020-01-17 16:03:19 -08:00
ideep Remove (most) Python 2 support from Python code (#35615) 2020-04-22 09:23:14 -07:00
layers Add LN after specialzied output embeddings and flexible LCE (#35178) 2020-04-30 15:32:09 -07:00
mint re-enable copy of python files, but be careful that the copy is only … (#14982) 2018-12-11 16:54:08 -08:00
mkl Fix typos, via a Levenshtein-type corrector (#31523) 2020-01-17 16:03:19 -08:00
modeling [c2] fix compute_norm test (#38529) 2020-05-14 20:49:36 -07:00
models Skip c2_ref_tests on network failures (#37972) 2020-05-06 22:19:28 -07:00
onnx [ONNX] Bump up ONNX submodule to a82c6a7010e2e332d8f74ad5b0c726fd47c85376 (#39372) 2020-06-02 21:08:14 -07:00
operator_test Update cafe2 hypothesis_test_util to support hypothesis-5 (#39498) 2020-06-05 08:27:50 -07:00
predictor [online trainer] Add blob reorder (#39534) 2020-06-05 17:33:08 -07:00
rnn Unify cuda and hip device types in Caffe2 python front end (#14221) 2018-11-29 14:00:16 -08:00
serialized_test [Caffe2] Fix shape inference for element-wise operators (#33431) 2020-02-25 09:03:06 -08:00
test [caffe2] Fix of initializing ATen's CUDA before using caching allocator (#39759) 2020-06-09 17:25:42 -07:00
trt Fix typos, via a Levenshtein-type corrector (#31523) 2020-01-17 16:03:19 -08:00
__init__.py Fix dll load logic for Python 3.8 on Windows (#32215) 2020-01-22 08:33:34 -08:00
_import_c_extension.py [AMD] Remove num_gpu check for remote execution (#34318) 2020-03-06 09:53:57 -08:00
allcompare_test.py
attention.py
benchmark_generator.py
binarysize.py
brew_test.py
brew.py Testing for folded conv_bn_relu (#19298) 2019-04-16 19:04:06 -07:00
build.py
cached_reader.py Pass loop_over optional parameter for cached reader properly. (#21929) 2019-06-19 18:15:32 -07:00
caffe_translator_test.py Fix several ResourceWarning: unclosed file (#15746) 2019-01-09 15:36:53 -08:00
caffe_translator.py Fix several ResourceWarning: unclosed file (#15746) 2019-01-09 15:36:53 -08:00
checkpoint_test.py Revert D9566744: [New Checkpoint] Kill the dummy TaskOutput when task.get_step() (#11164) 2018-08-31 22:25:57 -07:00
checkpoint.py Fix typos, via a Levenshtein-type corrector (#31523) 2020-01-17 16:03:19 -08:00
CMakeLists.txt Fix CMakeLists.txt for Int8 python bindings (#15047) 2018-12-11 10:48:47 -08:00
cnn.py Unify cuda and hip device types in Caffe2 python front end (#14221) 2018-11-29 14:00:16 -08:00
compatibility.py migrating deprecated calls without abc module for containers (#11515) 2018-09-13 15:09:22 -07:00
context_test.py
context.py
control_ops_grad_test.py Fix the weird bug in control_flow_op_test.py (#26931) 2019-09-26 20:44:03 -07:00
control_ops_grad.py DeviceScope support for CUDA and testing (#15357) 2019-01-30 18:42:12 -08:00
control_ops_util.py
control_test.py
control.py
convert_test.py New serialization format (#12384) 2018-10-16 16:36:58 -07:00
convert.py New serialization format (#12384) 2018-10-16 16:36:58 -07:00
convnet_benchmarks_test.py Skip convnets benchmark in rocm CI (#17331) 2019-02-20 21:12:24 -08:00
convnet_benchmarks.py
core_gradients_test.py Back out "Back out "[Caffe2] Fix device_option propagation"" (#25908) 2019-09-17 04:01:36 -07:00
core_test.py Extend Net.RunAllOnGPU() to support RecurrentNetwork op (#15713) 2019-02-08 15:48:42 -08:00
core.py [net_transform] only skip ConstantFill for autogen_grad (#34628) 2020-03-11 19:09:52 -07:00
crf_predict.py Move crf in caffe2 from fb to oss (#12200) 2018-10-01 18:31:41 -07:00
crf_viterbi_test.py Move crf in caffe2 from fb to oss (#12200) 2018-10-01 18:31:41 -07:00
crf.py Fix typos, via a Levenshtein-type corrector (#31523) 2020-01-17 16:03:19 -08:00
data_parallel_model_test.py Skips test_equiv_recurrent (#29255) 2019-11-06 13:29:23 -08:00
data_parallel_model.py [Caffe2] raise exceptions instead of str (#37744) 2020-05-05 13:34:33 -07:00
data_workers_test.py Disables test_atomic_ops and testInputOrder (#29145) 2019-11-05 16:53:53 -08:00
data_workers.py Fixed log message (#10874) 2018-09-05 09:55:52 -07:00
dataio_test.py Fix typos, via a Levenshtein-type corrector (#31523) 2020-01-17 16:03:19 -08:00
dataio.py Rearrange stopping condition in CompositeReader (#20062) 2019-05-06 15:06:32 -07:00
dataset.py
db_file_reader.py Adding support for manifold files in DBReader (#37727) 2020-05-15 07:18:30 -07:00
db_test.py
device_checker.py
dlpack.h Fix typos (#30606) 2019-12-02 20:17:42 -08:00
dyndep.py guard dyndep with a lock (#26153) 2019-09-13 11:38:14 -07:00
embedding_generation_benchmark.py
experiment_util.py
extension_loader.py always restore dlopen flag in dyndep (#22958) 2019-07-17 10:26:25 -07:00
filler_test.py caffe2 - Expose tensor filler util to Python (#18886) 2019-04-08 11:54:10 -07:00
functional_test.py
functional.py Fix typos, via a Levenshtein-type corrector (#31523) 2020-01-17 16:03:19 -08:00
fused_8bit_rowwise_conversion_ops_test.py [caffe2] make fused rowwise quant/dequant op work for N-dim tensors (#33426) 2020-02-19 23:29:42 -08:00
gradient_check_test.py Unify gpu_support variable in python tests (#16748) 2019-02-07 00:29:51 -08:00
gradient_checker.py [caffe2] fix type and shape inference for common gradient ops (#35857) 2020-04-02 11:17:04 -07:00
gru_cell.py
hip_test_util.py Make CUDNN an alias of MIOPEN for HIP ops (#12278) 2018-10-24 17:07:31 -07:00
hsm_util.py
hypothesis_test_util.py Update cafe2 hypothesis_test_util to support hypothesis-5 (#39498) 2020-06-05 08:27:50 -07:00
hypothesis_test.py Update cafe2 hypothesis_test_util to support hypothesis-5 (#39498) 2020-06-05 08:27:50 -07:00
ideep_test_util.py
layer_model_helper.py Add transfer_learning_blob_name_mappings into layer_model_helper to support layer model transfer learning 2020-03-18 15:28:00 -07:00
layer_model_instantiator.py
layer_parameter_sharing_test.py Add validator for optimizers when parameters are shared 2019-04-17 21:10:38 -07:00
layer_test_util.py
layers_test.py FCTransposed to FbFCPacked (#29766) 2019-12-10 10:18:21 -08:00
lengths_reducer_fused_8bit_rowwise_ops_test.py [caffe2] fix how np.clip is used in lengths_reducer_fused_{4,8}_rowwise_ops_test (#32086) 2020-01-14 22:53:28 -08:00
lengths_reducer_rowwise_8bit_ops_test.py
lstm_benchmark.py Fix typos (#30606) 2019-12-02 20:17:42 -08:00
memonger_test.py Unify gpu_support variable in python tests (#16748) 2019-02-07 00:29:51 -08:00
memonger.py [pyfi] override TP2 networkx -> PyFI networkx (#37764) 2020-05-11 13:20:00 -07:00
mkl_test_util.py
model_device_test.py Unify gpu_support variable in python tests (#16748) 2019-02-07 00:29:51 -08:00
model_helper_test.py keep net type info when generating model complete net (#11032) 2018-09-04 21:10:06 -07:00
model_helper.py Fix TensorProtosDBInput AttributeError (#32274) 2020-01-29 12:05:43 -08:00
modifier_context.py Fix typos (#30606) 2019-12-02 20:17:42 -08:00
mpi_python.cc Replace c10::guts::stuff with std::stuff (#30915) 2019-12-16 13:57:19 -08:00
muji_test.py Unify cuda and hip device types in Caffe2 python front end (#14221) 2018-11-29 14:00:16 -08:00
muji.py Unify cuda and hip device types in Caffe2 python front end (#14221) 2018-11-29 14:00:16 -08:00
net_builder_test.py
net_builder.py
net_drawer.py Fix typos, via a Levenshtein-type corrector (#31523) 2020-01-17 16:03:19 -08:00
net_printer_test.py
net_printer.py Fix spelling errors (#21665) 2019-06-13 15:21:55 -07:00
nomnigraph_test.py nomnigraph - support subgraph visualization (#13795) 2018-11-16 08:19:20 -08:00
nomnigraph_transformations_test.py Add transpose network pass (#13437) 2018-11-01 14:27:07 -07:00
nomnigraph_transformations.py Add transpose network pass (#13437) 2018-11-01 14:27:07 -07:00
nomnigraph.py [caffe2/nomnigraph] handle when PATH env is not defined (#39373) 2020-06-10 17:09:59 -07:00
normalizer_context.py Fix typos (#30606) 2019-12-02 20:17:42 -08:00
normalizer_test.py
normalizer.py Scale init for batch-norm and layer-norm (#31983) 2020-01-10 11:55:56 -08:00
numa_benchmark.py Revert D13205604: Move numa.{h, cc} to c10/util 2018-12-07 10:01:25 -08:00
numa_test.py Move numa.{h, cc} to c10/util (#15024) 2018-12-12 12:21:10 -08:00
observer_test.py
operator_fp_exceptions_test.py Caffe2 - Add flag to fails if float point exceptions is detected in operator runs (#18040) 2019-03-16 12:28:05 -07:00
optimizer_context.py Fix typos (#30606) 2019-12-02 20:17:42 -08:00
optimizer_test_util.py Fix typos (#30606) 2019-12-02 20:17:42 -08:00
optimizer_test.py Implementation of STORM optimizer caffe2 python wrapper (#36399) 2020-04-14 23:05:45 -07:00
optimizer.py Skipping L2 regularization on sparse biases 2020-06-11 11:21:25 -07:00
parallel_workers_test.py ParallelWorkersTest.testParallelWorkersInitFun is flaky (#29045) 2019-11-01 13:59:02 -07:00
parallel_workers.py get rid of deprecated thread.isAlive() to use py2.6 modern form is_alive() 2019-10-22 15:37:31 -07:00
parallelize_bmuf_distributed_test.py Unify gpu_support variable in python tests (#16748) 2019-02-07 00:29:51 -08:00
pipeline_test.py
pipeline.py Fix typos, via a Levenshtein-type corrector (#31523) 2020-01-17 16:03:19 -08:00
predictor_constants.py
pybind_state_dlpack.cc Upgrade DLPack 2018-11-12 15:59:46 -08:00
pybind_state_dlpack.h Remove PythonOp non-CPU path and PytorchOp (#15417) 2019-01-02 16:36:37 -08:00
pybind_state_gpu.cc add simple memory analyzer and log warning if GPU underutilized (#21024) 2019-05-28 19:58:54 -07:00
pybind_state_hip.cc Make caffe2/fb folder compatible with AMD (#29131) 2019-11-04 16:40:29 -08:00
pybind_state_ideep.cc Upgrade MKL-DNN to DNNL v1.2 (#32422) 2020-03-26 22:07:59 -07:00
pybind_state_int8.cc Renaming meta() to dtype() - 2/2 (#13334) 2018-10-30 18:24:30 -07:00
pybind_state_nomni.cc nomnigraph - support subgraph visualization (#13795) 2018-11-16 08:19:20 -08:00
pybind_state_registry.cc Move registry fully to c10 (#12077) 2018-09-27 03:09:54 -07:00
pybind_state_registry.h Move registry fully to c10 (#12077) 2018-09-27 03:09:54 -07:00
pybind_state.cc [caffe2] create and register child ws in pybind (#36741) 2020-04-16 14:53:55 -07:00
pybind_state.h caffe2: preserve python exception type from PythonOp (#36267) 2020-04-09 12:43:24 -07:00
python_op_test.py caffe2: preserve python exception type from PythonOp (#36267) 2020-04-09 12:43:24 -07:00
queue_util.py
record_queue.py
recurrent.py
regularizer_context.py Fix typos (#30606) 2019-12-02 20:17:42 -08:00
regularizer_test.py Implement "trimmed lasso" regularization and support all available regularization in a single interface (#22966) 2019-07-17 16:12:31 -07:00
regularizer.py Adding LpNorm regularization for sparse features in DPER3 (#38582) 2020-06-09 12:34:50 -07:00
rnn_cell.py [caffe2] Remove python2 from operator_test (#33977) 2020-03-02 08:55:53 -08:00
schema_test.py Pass LRU hash output evicted_values to SparseLookup (#21389) 2019-07-02 11:27:37 -07:00
schema.py Fix typos, via a Levenshtein-type corrector (#31523) 2020-01-17 16:03:19 -08:00
scope_test.py Add EmptyNameScope to allow you jump out from current scope. (#14631) 2018-12-12 01:39:50 -08:00
scope.py Add EmptyNameScope to allow you jump out from current scope. (#14631) 2018-12-12 01:39:50 -08:00
session_test.py
session.py Fix typos (#30606) 2019-12-02 20:17:42 -08:00
sparse_to_dense_mask_test.py Increase static tolerance for negative feature ids 2019-05-20 19:09:22 -07:00
sparse_to_dense_test.py
task_test.py caffe2/python/task: added __repr__ methods to all task definitions (#15250) 2018-12-17 16:02:16 -08:00
task.py Fix typos (#30606) 2019-12-02 20:17:42 -08:00
test_util.py caffe2 - support flaky operator tests for caffe2 build (#18155) 2019-03-25 16:58:34 -07:00
text_file_reader.py Create Node2Vec ModuleKeeper 2019-04-01 10:36:23 -07:00
timeout_guard.py
toy_regression_test.py Enable junk fill for the default CPU allocator (#13377) 2018-11-08 00:02:37 -08:00
transformations_test.py Remove sinkMaxPool transformation (#17694) 2019-03-12 20:10:46 -07:00
transformations.py support pre-convert filter format for mkldnn training mode and change 'OptimizeForIdeep' to 'OptimizeForMkldnn' (#15171) 2019-03-29 19:00:48 -07:00
tt_core_test.py
tt_core.py
utils_test.py [C2] Introduce extra_info force CPU tags for auto-generated iteration counter blobs (#32607) 2020-02-05 23:49:27 -08:00
utils.py [C2] Introduce extra_info force CPU tags for auto-generated iteration counter blobs (#32607) 2020-02-05 23:49:27 -08:00
visualize.py Fix typos, via a Levenshtein-type corrector (#31523) 2020-01-17 16:03:19 -08:00
workspace_test.py Revert "Revert D18171156: Merge Tensor and Variable." (#29299) 2019-11-08 09:11:20 -08:00
workspace.py Fix typos, via a Levenshtein-type corrector (#31523) 2020-01-17 16:03:19 -08:00