* Add full impl of GroupNorm
* Fix comments in math.h
* Remove unused buffers
* Add #include <array> in gpu version
* Remove unused moments_buffer_
* Make inverse std a template.
* Add detailed comments
* fix unit test for sqrt op
From the error logging:
[idx, grad, grad_estimate] are:
[[ 146. 0.5 0.45776367]
[ 147. 0.5 0.45776367]
The gradient == 0.5 is correct, which means the SqrtOp and its gradient are doing the right job. (Because y = sqrt(x), loss = y^2/2 = x/2, and so d(loss)/dx = 1/2 = 0.5.)
The test failed because of a numerical problem in grad_estimate (in the unit test). This is likely because the step_size is small and float precision is not high (when there are multiple elements in the tensor, we do sum(y^2) to compute the loss).
This diff
- increases the step size, and also moves the test cases further away from 0 (where the gradient of sqrt(x) is not well defined) to be safe :)
- also cleans up and merges the test cases for in-place vs. non-in-place
Tested with:
`CAFFE2_HYPOTHESIS_PROFILE=debug ai_bt caffe2/caffe2/python/operator_test:elementwise_ops_test -- "test_sqrt"`
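For reference, a minimal numpy sketch (not the actual hypothesis-based unit test) of the check being discussed, showing why the analytic gradient is 0.5 and why the finite-difference estimate needs a reasonable step size and inputs away from 0:

```python
# Minimal numpy sketch (not the actual unit test) of the reasoning above:
# loss = sum(sqrt(x)^2) / 2 = sum(x) / 2, so d(loss)/dx_i = 0.5 for every element,
# and the finite-difference estimate only matches when the step is not too small
# for float32 and x stays away from 0.
import numpy as np

def loss(x):
    y = np.sqrt(x)
    return np.sum(y * y) / 2.0

x = np.random.uniform(1.0, 2.0, size=100).astype(np.float32)  # away from 0
step = 1e-2  # a much smaller step drowns in float32 rounding error

i = 0  # check a single element
xp, xm = x.copy(), x.copy()
xp[i] += step
xm[i] -= step
grad_estimate = (loss(xp) - loss(xm)) / (2.0 * step)
assert abs(grad_estimate - 0.5) < 1e-2
```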
* CompositeReader & CompositeReaderBuilder
A new type of reader that glues multiple readers together.
* Back out "Revert D7394363: [GanH]: Log D Trick for Cross Entropy with Sigmoid"
Original commit changeset: 9325a4356dbe
* [dai][WIP] convert params to int8 on ps before sending to trainer
Add float->uint8 conversion in addition to float->fp16 conversion in model_saver.
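The exact packing used by model_saver is not shown here; purely as a rough illustration, a generic min/max affine float->uint8 quantization (with the scale and offset kept alongside the payload) looks like the sketch below.

```python
# Rough illustration only: generic min/max affine float -> uint8 quantization.
# This is NOT necessarily the scheme model_saver uses; names are hypothetical.
import numpy as np

def quantize_uint8(w):
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    q = np.round((w - lo) / scale).astype(np.uint8)
    return q, scale, lo  # payload plus the metadata needed to dequantize

def dequantize_uint8(q, scale, lo):
    return q.astype(np.float32) * scale + lo

w = np.random.randn(4, 8).astype(np.float32)
q, scale, lo = quantize_uint8(w)
assert np.abs(dequantize_uint8(q, scale, lo) - w).max() <= scale / 2 + 1e-6
```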
* [easy] improve unit test for sparse length sum ops
as desc.
#accept2ship
* Update GitHub upstream to 771fcb3455
* move sparse hash unique ops to OSS and add unit tests
- move the SparseHash version to OSS, since 'sparsehash' is already a dependency of caffe2 OSS: https://fburl.com/arssw4n1
- The 'SparseHash' engine is also being used in OSS, so the SparseHash version should live in OSS to reduce confusion: https://fburl.com/o5ea7ah2
- fix the CUDA UniqueOp for the case when the batch is empty.
- add unit test
* group_norm_op for caffe2
This is the cuda op for Group Normalization (GN): https://arxiv.org/abs/1803.08494
This code implements GN in one op that computes Y = gamma * (X - mu) / sigma + beta, together with its gradients. It is expected to have minimal memory consumption (similar to the BN op), without creating the new blobs that would be needed if GN were implemented as several ops (e.g., reshape, norm_mean/std, affine_channel).
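As a reference for what the fused op computes (a numpy sketch for illustration, not the CUDA kernel): split the C channels into G groups, normalize each (group, spatial) slice, then apply the per-channel gamma/beta.

```python
# Numpy sketch of the GroupNorm forward pass that the op fuses into one kernel;
# for illustration only, not the CUDA implementation.
import numpy as np

def group_norm_ref(X, gamma, beta, num_groups, eps=1e-5):
    N, C, H, W = X.shape
    G = num_groups
    Xg = X.reshape(N, G, C // G, H, W)
    mu = Xg.mean(axis=(2, 3, 4), keepdims=True)
    var = Xg.var(axis=(2, 3, 4), keepdims=True)
    Y = ((Xg - mu) / np.sqrt(var + eps)).reshape(N, C, H, W)
    return Y * gamma.reshape(1, C, 1, 1) + beta.reshape(1, C, 1, 1)

X = np.random.randn(2, 32, 7, 7).astype(np.float32)
Y = group_norm_ref(X, np.ones(32, np.float32), np.zeros(32, np.float32), num_groups=8)
```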
* Resubmit D7405233: disappeared in D7464958
The OSS publish caused the op to go missing; however, the test was still there.
* [c2] add sparse hash engine for cuda unique op
The SparseHash version of UniqueOp copies the input tensor to the CPU, uses a sparse hash map to compute the unique output, and then copies the result back to the GPU.
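Conceptually (a Python sketch of the idea, not the C++ op), the hash-map pass produces the unique values plus the remapping indices, and an empty input naturally yields empty outputs:

```python
# Python sketch of the hash-map-based unique-with-remapping idea behind the
# SparseHash engine; the real op does this in C++ on a CPU copy of the tensor.
import numpy as np

def unique_with_remapping(values):
    table, uniques, remap = {}, [], []
    for v in values:  # an empty input simply produces empty outputs
        if v not in table:
            table[v] = len(uniques)
            uniques.append(v)
        remap.append(table[v])
    return np.asarray(uniques, dtype=np.int64), np.asarray(remap, dtype=np.int32)

u, r = unique_with_remapping([3, 1, 3, 7, 1])
# u == [3, 1, 7], r == [0, 1, 0, 2, 1]
```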
* [dper][gpu] enable unit testing gpu trainer for sparse nn
This lets us debug the GPU trainer using mock data in a unit test, and makes it easier to develop the GPU trainer for new models.
* Reuse Gloo context for Synchronize() calls
Previously we were creating (and leaking) the Gloo context on each call to Synchronize(). Now we run the common world op and create the barrier net only once, then run the barrier net on each Synchronize() call. Since the timeout is associated with the Gloo context, assert that the timeout is fixed instead of trying to handle the complexity of multiple timeouts (and associated contexts).
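A sketch of the create-once/reuse pattern described above, with stubbed helpers standing in for the real common-world and barrier-net construction (hypothetical names, not the actual data_parallel_model code):

```python
# Create-once/reuse pattern: build the common world op and barrier net a single
# time, then only run the barrier net on later Synchronize() calls.
_barrier_net = None
_barrier_timeout = None

def _build_barrier_net(timeout_sec):
    # Stand-in for: run the common world op and create the barrier net once.
    return {"timeout": timeout_sec}

def _run_barrier_net(net):
    # Stand-in for: run the already-created barrier net.
    pass

def synchronize(timeout_sec):
    global _barrier_net, _barrier_timeout
    if _barrier_net is None:
        _barrier_net = _build_barrier_net(timeout_sec)
        _barrier_timeout = timeout_sec
    # The timeout lives in the Gloo context, so it must stay fixed across calls.
    assert timeout_sec == _barrier_timeout, "Synchronize() timeout cannot change"
    _run_barrier_net(_barrier_net)
```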
* [GanH/WGAN][1/n]: add FC param clipping
as titled
* [mobile] minimizing changes between caffe2_benchmark and speed_benchmark
* [GanH]: enable diagnose within model
Instead of looking up blob names, enable the diagnostics directly inside the model.
* Add `net_transformer_fun` option to DPM
This callback allows various transformations to be made to the model after gradient operators have been added. The immediate motivation for this is to allow transformations such as "checkpoint-and-recompute", which trade off memory for additional compute (a sketch follows below).
Adding several callbacks like this has made DPM's API less than ideal at this
stage. However, I could not find any reasonable alternative.
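A hedged sketch of how such a callback might be wired up; the callback signature and the surrounding Parallelize arguments are assumptions for illustration, not taken from this diff:

```python
# Hedged sketch: wiring a net transformer callback into data_parallel_model.
# The callback signature and the Parallelize keyword usage below are assumptions;
# consult data_parallel_model.Parallelize for the real contract.
from caffe2.python import data_parallel_model

def recompute_transformer(model, *args, **kwargs):
    # Runs after gradient operators have been added; a real implementation could
    # rewrite model.net here, e.g. checkpoint-and-recompute activations to trade
    # memory for extra compute.
    pass

# data_parallel_model.Parallelize(
#     model,
#     input_builder_fun=add_inputs,
#     forward_pass_builder_fun=build_forward,
#     optimizer_builder_fun=build_optimizer,
#     net_transformer_fun=recompute_transformer,
#     devices=range(4),
# )
```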
* [DT] [33/n] Compile flow task groups
Task groups need to be compiled in order to pickle the object in fblearner. I also changed the Job's compile function, since creating a new object is not necessary.
* Initial commit for sparse_normalize vectorization and benchmark
* [GanH]: LB Calibration for JSD
as titled
* Tracing event in async executor
Add event tracing via the TRACE_EVENT macro in the async executor.
* [Resubmit] D7409751 Resetting book-keeping blobs when the reservoir is reset
D7409751 got lost in D7464958
* Visualizing realtime weight values
We want to visualize the weight values as the optimizer iterates. This diff adds support for visualizing the weights at an assigned index.
Currently, we assume the blob is 2-dimensional.
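A rough sketch of the idea (hypothetical helper, not the diff's code): fetch the 2-D blob each iteration and record the value at the assigned index for plotting.

```python
# Rough sketch of the idea (not the diff's code): fetch the 2-D weight blob and
# record the value at an assigned (row, col) index on each optimizer iteration.
from caffe2.python import workspace

def sample_weight(blob_name, row, col):
    w = workspace.FetchBlob(blob_name)  # assumed to be a 2-D numpy array
    return float(w[row, col])
```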
* [GanH][Easy]: Fix Homotopy Weighting
Apparently, there was a bug in the homotopy weight (alpha, beta) update.
* [c2] move sparse hash unique op out of OSS
so that OSS does not need to depend on the Google hash map.
* Get rid of std::round as it's not supported on Android
* Revert changes on setup.py
* Skip shaky test on Dataio
* fix
Summary: Quick fix for unit test broken by D6454290. This is my fault for approving while the tests covering the single callsite were broken.
Reviewed By: goldsborough
Differential Revision: D6466566
fbshipit-source-id: 2683be3d6bb184286e64fbde3e572946e39030c7
Summary:
While working on layer normalization for LSTMs I encountered an issue where the layer norm parameters (which are the scale/gain and bias/shift from the paper) were not registered in the model for `brew.layer_norm`. salexspb explained that this is because it was using the `init_net_param` API instead of `create_param`. This diff fixes this.
While fixing I noticed that I noticed that `brew.layer_norm` actually had a bug where it was multiplying with the bias instead of adding it. Another issue was that the function giving the scale and bias a shape of `[1]`, however the paper (https://arxiv.org/pdf/1607.06450.pdf) specifies that, like for batch norm, there is one scale and bias parameter per neuron, i.e. the shape should be `[1, axis_dimension]`. The API now takes an explicit `dim_in` parameter (also more consistent with other normalization functions in that module) so that this can be specified. See tests for how this now looks.
Reviewed By: jhcross
Differential Revision: D6454290
fbshipit-source-id: fc00ca614de3190c40ab743e8984bec9e85fb58c
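For reference, a hedged usage sketch of the fixed helper; apart from `dim_in` (added by this diff), the argument names and defaults are assumed from the other brew normalization helpers:

```python
# Hedged usage sketch of brew.layer_norm after the fix; argument names other
# than dim_in are assumptions based on the surrounding brew helpers.
from caffe2.python import brew, model_helper

model = model_helper.ModelHelper(name="layer_norm_example")
hidden = brew.fc(model, "input", "hidden", dim_in=64, dim_out=256)
# One scale and one bias per neuron, i.e. parameters of shape [1, 256],
# registered via create_param so they show up in model.params.
normed = brew.layer_norm(model, hidden, "hidden_ln", dim_in=256)
```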
Summary: Updated brew SpatialBN to use initializers similar to other brew ops such as conv and fc, instead of initializing all of its parameters itself within the brew call.
Reviewed By: asaadaldien
Differential Revision: D5840359
fbshipit-source-id: 9f3d688d4957605eaf7ecd2488bc26bfb1da3f78
Summary: Implement a brew wrapper for the LayerNorm op. This adds the scalar weight and bias terms to the op.
Reviewed By: jmp84
Differential Revision: D5595836
fbshipit-source-id: 467b2e1158b0c454a149d4b26c47719826e98752
Summary:
This diff is the first step in the effort to refactor all parameters. As a first step, I'm merging the concepts of params and computed_params, which are going to be based on tags instead (in the first version it's still using the old data structs to store all the BlobReferences).
Renaming computed_params to non-trainable/non-backprop params should be done in some other diff.
Reviewed By: salexspb
Differential Revision: D5171159
fbshipit-source-id: 68031ca779f053fb266a7c4a2e5b482a3bd9c832
Summary:
This diff is the first step in the effort to refactor all parameters. As a first step, I'm merging the concepts of params and computed_params, which are going to be based on tags instead (in the first version it's still using the old data structs to store all the BlobReferences).
Renaming computed_params to non-trainable/non-backprop params should be done in some other diff.
Reviewed By: salexspb
Differential Revision: D5119830
fbshipit-source-id: 2001090a37346eb12abbb234e13e727c288eb8a7
Summary:
Correct schema generation was previously broken, leading to invalid gradient op creation.
This was also exhibited in model_device_helper, where invalid schemas were being created on the CPU when kwargs['engine'] == 'CUDNN'.
Closes https://github.com/caffe2/caffe2/pull/617
Reviewed By: asaadaldien
Differential Revision: D5097062
Pulled By: akyrola
fbshipit-source-id: e22181f857deccb7b4395e87271e2cbf1226eb64
Summary:
Update the rnn_cell.py and char_rnn.py examples with the new `brew` model.
- Deprecate CNNModelHelper
- Replace all helper functions with brew helper functions
- Use the `model.net.<SingleOp>` format to create bare-bone operators for better clarity (see the sketch below)
Reviewed By: salexspb
Differential Revision: D5062963
fbshipit-source-id: 254f7b9059a29621027d2b09e932f3f81db2e0ce
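As a quick illustration of that last point (blob names and dimensions here are made up for the example):

```python
# Quick illustration of the brew-helper vs. bare-bone-operator styles mentioned
# above; blob names and sizes are invented for the example.
from caffe2.python import brew, model_helper

model = model_helper.ModelHelper(name="char_rnn_style_example")
# brew helper: creates and registers the FC weight/bias and adds the op.
fc_out = brew.fc(model, "embedded", "fc_out", dim_in=128, dim_out=256)
# bare-bone operator via model.net.<SingleOp>: just the op, no parameters.
softmax = model.net.Softmax(fc_out, "softmax")
```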
Summary:
cuDNN versions of dropout and LRN (for native fp16 support), and a port of Caffe's max pooling algo that uses an explicit mask to store locations (also supports fp16 storage)
Closes https://github.com/caffe2/caffe2/pull/396
Reviewed By: akyrola
Differential Revision: D4990880
Pulled By: asaadaldien
fbshipit-source-id: a716acffb656843e9b31e3e6808bd2d8aa959d03
Summary:
rename model_helpers to brew. This is a big diff now. I did these things:
1. replace model_helpers with brew:
find . -type f -exec sed -i 's/model_helpers/brew/g' {} +
2. rename model_helpers.py and model_helpers_test.py
3. rename ModelHelpersTest to BrewTest
4. lowercase all the helper functions to distinguish them from single ops
5. run my unit tests
6. run convergence tests
Reviewed By: salexspb
Differential Revision: D4930465
fbshipit-source-id: f420a1b03238df1cbe9f4426e0b9c43a12119661