* Add criterion scalar tests.
This exposed an issue in MarginRankingLoss with scalars, but the cleanest fix is to wait
until forward runs on Variables (so we don't have to wait for the backward pass to check
whether something is a scalar).
* Fix flake8.
* Add error message for margin_ranking_loss with scalars.
* Get test_nn working.
* Fix some incorrect scalar assumptions.
* Don't use Variables when we don't have to.
* Use Variable Mixin.
* Fix NLLLoss reference function when WITH_SCALARS is not enabled.
* Allow device to be optional in cuda().
* Fix multilabelmarginloss_reference.
* Add some more builder scripts from ossci-job-dsl
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Relax precision requirement on test_Upsample_trilinear_scale_3d_cuda
Partially addresses #5006.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
This commit fixes double-backwards on batch norm. There were two
bugs:
- Returned buffers from batchnorm backwards were being marked as differentiable
when they shouldn't be. The fix for this is "easy": use 'grad' instead of
'grads[0]' in cudnn_batch_norm's backward definition. (More on this below.)
- I was using toTensor on a Scalar, which gives me a Tensor of the wrong
type when I'm in CUDA world. Using the Scalar add() overload directly
solves the problem.
The differentiability of returned buffers was annoyingly subtle. I nearly went off and
implemented a big pile of infrastructure to "tell" the codegen how to distinguish
differentiable from non-differentiable outputs before realizing that there must already be
a legitimate way to do this, because it works for THNN. I documented this in
derivatives.yaml, and also added tests for the problem in load_derivatives.py to catch the
various ways you could "get it wrong". Hope this helps someone else.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Previously, we only tested double-backwards on the CPU, which is bad!
Testing on CUDA as well would have caught #4422 (still not fixed, so those tests
are manually disabled) and also uncovered #4500 (not yet diagnosed).
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
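As a rough illustration of what these double-backward tests cover, here is a minimal
sketch of a CPU and CUDA check on batch norm. It is written against a current PyTorch
build (torch.autograd.gradgradcheck and F.batch_norm are assumed to be available); the
actual test suite uses its own helpers.

    import torch
    import torch.nn.functional as F
    from torch.autograd import gradgradcheck

    def bn(inp, weight, bias):
        # training=True exercises the batch-statistics path whose
        # double-backward the fix above repairs.
        return F.batch_norm(inp, running_mean=None, running_var=None,
                            weight=weight, bias=bias, training=True)

    def check(device):
        # Double precision keeps the numerical Jacobian comparison stable.
        x = torch.randn(4, 3, 8, 8, dtype=torch.double, device=device, requires_grad=True)
        w = torch.randn(3, dtype=torch.double, device=device, requires_grad=True)
        b = torch.randn(3, dtype=torch.double, device=device, requires_grad=True)
        assert gradgradcheck(bn, (x, w, b))

    check('cpu')
    if torch.cuda.is_available():
        check('cuda')  # the previously untested double-backward path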
* API changes
* Implement reduce for THNN ClassNLLCriterion
* Implement reduce keyword for THCUNN ClassNLLCriterion
* Implement reduce for THNN SpatialClassNLLCriterion
* Implement reduce for THCUNN SpatialClassNLLCriterion
* Make legacy NLLLoss work
* Docs for NLLLoss reduce
* reduce keyword for double backwards NLLLoss
* reduce=False tests
* Addressed comments
* Fix trailing whitespace
* Fix test failures in legacy nn
* Rebase: add reduce keyword to aten declarations of NLLLoss
* Add reference functions for all NLLLoss and NLLLoss2d test cases
* Replaced slow get/set fns. Don't use int64_t in kernels.
* Use TH_INDEX_BASE in NLLLoss for consistency
* Fix legacy ClassNLLCriterion tests
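For reference, a small usage sketch of the keyword added above. At the time of this
change the argument was spelled reduce=False; later releases renamed it to
reduction='none', which is what the snippet uses, so treat the exact spelling as
version-dependent.

    import torch
    import torch.nn.functional as F

    log_probs = F.log_softmax(torch.randn(5, 10), dim=1)
    target = torch.randint(0, 10, (5,))

    # Default behaviour: losses are averaged into a single scalar.
    averaged = F.nll_loss(log_probs, target)

    # With reduction disabled, nll_loss returns one loss per example,
    # which is what the reduce=False tests above exercise.
    per_example = F.nll_loss(log_probs, target, reduction='none')  # shape (5,)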
- Cleaned up THNN and THCUNN code and kernels
- Improved THCUNN kernel performance 5x, making it match cuDNN performance
- Added support for computing softmax over arbitrary dims
NOTE: The default dim for 3D inputs is now 1 (used to be 0)
- Both functions now accept inputs with arbitrarily many dimensions
- Autograd functions no longer save the input (it's unnecessary)
- Added cuDNN bindings for softmax, but they are unused as THCUNN
matches or even exceeds cuDNN performance
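A quick sketch of the dim behavior described above, using the functional interface
(the explicit dim keyword is assumed to be exposed as in current releases):

    import torch
    import torch.nn.functional as F

    x = torch.randn(2, 3, 4)       # 3D input
    probs = F.softmax(x, dim=1)    # normalize over dim 1, the new default for 3D inputs
    # Each slice along dim 1 now sums to 1.
    assert torch.allclose(probs.sum(dim=1), torch.ones(2, 4))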
* Add dropout2d and dropout3d to functional (usage sketch below).
Add some loss functions to functional.
Add tests.
Use dropout from the backend.
Add docs.
Misc fixes.
* Edit loss modules to call functional.
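A minimal sketch of the channel-wise dropout now exposed through torch.nn.functional
(function names as in current releases):

    import torch
    import torch.nn.functional as F

    x = torch.randn(8, 16, 32, 32)        # (N, C, H, W)
    # dropout2d zeroes entire channels at once rather than individual
    # elements, which is the usual choice after a convolution.
    y = F.dropout2d(x, p=0.5, training=True)

    vol = torch.randn(8, 16, 4, 32, 32)   # (N, C, D, H, W)
    z = F.dropout3d(vol, p=0.5, training=True)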
1) Line up trailing dimensions in broadcast docs (see the sketch after this list).
2) Remove an unnecessary expand_as in the common_nn test.
3) Use view in tensor_str instead of resize_.
4) Remove the raiseErrors change from newExpand.
5) Clarify the expandedSizes/expandedStrides parameters in inferExpandGeometry.
6) Simplify the inferSize2/inferSizeN implementations.
7) Use new-style classes for the warning.
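For item 1, the documented rule is that sizes are compared starting from the trailing
dimensions and working backwards; a small sketch:

    import torch

    a = torch.randn(5, 3, 4)
    b = torch.randn(   3, 1)
    # Trailing dimensions are lined up right-to-left:
    #   a: 5 x 3 x 4
    #   b:     3 x 1   -> broadcasts to 5 x 3 x 4
    c = a + b
    assert c.shape == (5, 3, 4)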
Because of this, Variables can no longer appear in the graph.
Every usage of a leaf Variable will leave an AccumulateGrad
function that has no outputs, but modifies var.grad as a side
effect.
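A small sketch of that side effect, written against the current tensor API (where
requires_grad=True marks a leaf):

    import torch

    w = torch.randn(3, requires_grad=True)   # leaf: gets an AccumulateGrad node
    (w * 2).sum().backward()
    (w * 3).sum().backward()
    # AccumulateGrad has no outputs in the graph; it just adds into w.grad,
    # so after two backward passes each entry holds 2 + 3 = 5.
    print(w.grad)   # tensor([5., 5., 5.])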
Here's the command I used to invoke autopep8 (in parallel!):
git ls-files | grep '\.py$' | xargs -n1 -P`nproc` autopep8 -i
Several rules are ignored in setup.cfg. The goal is to let autopep8
handle everything which it can handle safely, and to disable any rules
which are tricky or controversial to address. We may want to come back
and re-enable some of these rules later, but I'm trying to make this
patch as safe as possible.
Also configures flake8 to match pep8's behavior.
Also configures TravisCI to check the whole project for lint.
- Don't use cuDNN for half inputs, because weight, bias, running_mean,
etc. are required to be of a different type than for THCUNN.
- Accept 3D inputs (N, C, L) in BatchNorm1d (see the sketch after this list).
- Remove an accidental 'use_cudnn=False'.
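A sketch of the newly accepted (N, C, L) input shape for BatchNorm1d, using the
current module API:

    import torch
    import torch.nn as nn

    bn = nn.BatchNorm1d(16)        # C = 16 channels / features
    x = torch.randn(8, 16, 50)     # (N, C, L) input is now accepted
    y = bn(x)                      # normalized per channel across N and L
    assert y.shape == x.shape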
* Conv2d, MaxPool2d, and AvgPool2d take one argument for each of kernel size,
stride, and padding. Each argument can be either a single number or a
tuple of (h, w); see the sketch below.
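A sketch of the two accepted argument forms, using the current module names:

    import torch.nn as nn

    # Single numbers apply to both spatial dimensions...
    conv = nn.Conv2d(3, 8, kernel_size=3, stride=1, padding=1)
    pool = nn.MaxPool2d(kernel_size=2, stride=2)

    # ...while (h, w) tuples set them independently.
    conv_rect = nn.Conv2d(3, 8, kernel_size=(3, 5), stride=(1, 2), padding=(1, 2))
    avg_rect = nn.AvgPool2d(kernel_size=(2, 3), stride=(2, 1))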