* ENH: add to method for PackedSequence
* ENH: return self if possible
* TST: remove extra data
* DOC: add more explanation
* TST: remove extra data
* DOC: minor fix
* Codemod to update our codebase to 0.4 standard
* Update some of the test scri[ts
* remove Variable in test_clip_grad_value
* fix _symbolic_override_wrapper_maker
* Unit test for pack_padded tracing
* Move monkeypatching stuff
* Switch symbolic
* Fix stack traces and update test
* Fixup and confirm e2e working
* lint
* Move monkeypatch back to onnx
* Address comments
* remove extraneous import
* Add gradient checking
* lint
* Address comments
* improve test case
* Namespaced symbols
- Our interned strings now have structure, "ns::symname" rather than just
"symname" before. We support efficient namespace testing for uniques
by encoding the namespace in one byte in the Symbol internal representation.
See torch/csrc/jit/interned_strings.h for a more in-depth implementation
discussion.
- All uses of ksymbol are now attr::symbol (or some appropriate namespace).
The valid namespaces are prim, attr, onnx and aten.
- Symbol is bound in Python as a qualified string "attr::symbol", EXCEPT for the
attribute setting/getting API, whose symbols must always be attr
symbols; they get special cased to assume strings are passed.
There's a little bit of naughtiness in the implementation, maybe you know
how to solve it.
- However, the g.op() convenience function assumes that you're generating
ONNX operators, unless you explicitly qualify.
- All ATen operators and nodes have built-in interned strings generated
for them, so you should never have to write a string literal ever again.
The tracing code is adjusted to use it.
- ONNX exporter now properly tests to see that all operators are in
onnx namespace before accepting the export. This is way more
robust than the previous exporter, which would be willing to
export capitalized operators which were not actually ONNX operators.
- A slight organizational change for symbolic.py; this module now ONLY
contains aten operators. In particular, the exporter for Constant
has moved into utils.py (along with Undefined, from the C++ side),
since primitive ops get "special treatment."
- The un-inplacing logic in recording is more robust, so that we don't
delete a trailing underscore from __and__. This never affected us
before because we didn't have any tests for it.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Fix some minor errors in existing docs.
* Fix Convolution and Pooling docs in torch.nn.functional
* Cleaned up torch.nn.functional docs
* Address @SsnL 's comments
* Add multiplication sign missing in docs
* Fix more typos, and clear some warnings
* Change infinity symbol in LPPool2d
* Revert some changes in torch.nn.functional
* Few more minor changes
* Improvize documentation
1. Add formula for erf, erfinv
2. Make exp, expm1 similar to log, log1p
3. Symbol change in ge, le, ne, isnan
* Fix minor nit in the docstring
* More doc improvements
1. Added some formulae
2. Complete scanning till "Other Operations" in Tensor docs
* Add more changes
1. Modify all torch.Tensor wherever required
* Fix Conv docs
1. Fix minor nits in the references for LAPACK routines
* Improve Pooling docs
1. Fix lint error
* Improve docs for RNN, Normalization and Padding
1. Fix flake8 error for pooling
* Final fixes for torch.nn.* docs.
1. Improve Loss Function documentation
2. Improve Vision Layers documentation
* Fix lint error
* Improve docstrings in torch.nn.init
* Fix lint error
* Fix minor error in torch.nn.init.sparse
* Fix Activation and Utils Docs
1. Fix Math Errors
2. Add explicit clean to Makefile in docs to prevent running graph generation script
while cleaning
3. Fix utils docs
* Make PYCMD a Makefile argument, clear up prints in the build_activation_images.py
* Fix batch norm doc error
This was accidentally lost while addressing review comments on
https://github.com/pytorch/pytorch/pull/4695
pack_padded_sequence may be called either with a list or with a
Variable. If called with a list we convert to Variable internally.
I added to test_nn to test the new codepath. The bug was also caught
by the onnx-fb-universe tests (which rely on passing in Variable).
* PackedSequence: store batch_sizes as tensor
rather than converting to a list of python integers. This maintains
the invariant that module's inputs/outputs are collections of
Variables.
In particular, this causes the JIT to no longer choke when flattening
and unflattening arguments.
* Handle sequence lengths correctly when exporting RNNs to ONNX
- when uniform sequence lengths are provided, correctly omit the
argument when constructing the ONNX graph, so as to not fix the
graph to the batch size.
- handle PackedSequences by floating them through the graph and
eliminating them in an optimization pass. ONNX does not have packed
sequences, but operates on a representation equivalent to
PaddedSequence, so we hide the representation-switching from ONNX
- as a preliminary step towards handling PackedSequences, not directly
tied to ONNX export, change batch_sizes from being an argument to
the RNN operators into being an argument to the forward() function
of those RNN operators. This more closely models the reality that
batch_sizes are effectively part of the input sequences.
Implements basic and advanced indexing using ATen tensors/variables.
Basic indexing is translated at the Python-binding level
(python_variable_indexing.cpp) to slice/squeeze/unsqueeze/select calls.
Advanced indexing is implemented in ATen in terms of take() and put()
calls.
* Specifying the value used for padding
The "pad_packed_sequence" function fills padded elements with zeros, but sometimes it is not useful. For example, some previous papers on NLP, including my recent paper [1], use a max-pooling technique for RNN-based sentence representations. More specifically, the max-pooling technique selects the maximum value from all time steps (i.e., hidden states) for each dimension. In such a case, we do not want the padded zeros to be selected. To overcome this situation, we can simply use a very small value instead of zero.
An LSTM example is shown below:
input = embedding(Variable(batchInput))
packedInput = nn.utils.rnn.pack_padded_sequence(input, lengths, batch_first = True)
h, (hn, cn) = self.encoder(packedInput, (h0, c0))
h, _ = nn.utils.rnn.pad_packed_sequence(h, -1024.0 batch_first = True)
sentenceRep, _ = torch.max(h, 1, keepdim = True)
[1] A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks. Kazuma Hashimoto, Caiming Xiong, Yoshimasa Tsuruoka, and Richard Socher. The 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017).
https://arxiv.org/abs/1611.01587 (Equation (4))
* Modified the order of the arguments
Following the suggestion, I modified the order of the arguments.