Commit Graph

38 Commits

Author SHA1 Message Date
Adam Paszke
aa7af94656 Make JIT tracing a thread-local property (#9414)
Summary:
As in the title. Lets us simplify a lot of code.

Depends on #9363, so please review only the last commit.

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9414

Reviewed By: zdevito

Differential Revision: D8836496

Pulled By: apaszke

fbshipit-source-id: 9b3c3d1f001a9dc522f8478abc005b6b86cfa3e3
2018-07-19 19:09:39 -07:00
Richard Zou
0656ef483d remove sort requirement from pad-sequence (#7928)
* pad-sequence no longer requires sorting entries

pad_sequence can get max_len from the list of sequences. Entries only need to be sorted if the output will be used for pack_padded_sequence, which can throw the error itself (see the sketch below).

* remove sort requirement from pad-sequence

Picks up from #5974.

Removes the requirement that input sequences to pad_sequence have to be
sorted. Addressed the comments in the PR:
- Updated docstring for pad_sequence
- Remove sort requirement in pad_sequence test
- Test unsorted and sorted sequences in pad_sequence test
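
A minimal sketch (not part of the PR) of the behavior this enables: pad_sequence infers the padded length from the list and accepts sequences in any length order.

```
import torch
from torch.nn.utils.rnn import pad_sequence

# Lengths deliberately unsorted; max_len (4) is inferred from the list.
seqs = [torch.ones(2, 5), torch.ones(4, 5), torch.ones(3, 5)]
padded = pad_sequence(seqs, batch_first=True)
print(padded.shape)  # torch.Size([3, 4, 5])
```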
2018-05-30 16:36:55 -04:00
James Reed
5e50993be7 Better type checking for pack_padded_sequence symbolic (#7874) 2018-05-26 11:16:41 -04:00
bddppq
f94ae3ba1d
Update from facebook (#7696)
* Fix handling of empty batches in SumReduceDimsOp

As titled

* Deferrable async_scheduling finishRun fix

Proper order of finishing run operations in deferrable_async_scheduling net

* Simplify exception handling in async_scheduling

Simplify exception handling; no need to busy wait, the thread that processes the
last task can finish the run.

* [C2]worker_coordinator_memorize_worker_ids

As titled. This is related to T28689868, where the number of blobs we want to create is equal to the number of worker ids

* Add unit test for nets with no type set

* Ignore total length argument in symbolic_pad_packed_sequence

1. There was a mistake in the code: total_length was added to the wrong symbolic function (pack_padded_sequence) instead of pad_packed_sequence.
2. No need to throw an exception if total_length is given, since it is only used to enable data_parallel training on multiple GPUs and has nothing to do with ONNX export, so just ignore it. https://fburl.com/tk4gciqp

* Add support for MKLDNN to async_scheduling

Just add MKLDNN as a possible CPU option to async_scheduling's pool function

* [AuFL][ensemble] support branch output for prediction

This diff supports using predictions from different branches and thus enables model ensembling (not fully independent).

* Fix a bug in add_loss in layer_model_helper

As titled.

* Support lradaption for adam

1. lr adaption operator
2. apply to dense adam

* Perf tweaks for async_scheduling

Restore single pool option + remove unnecessary (no-ops) calls

* add quantization to SparseSimdAdagradOp

add a bunch of quantization signatures to SparseSimdAdagradOp, implementations to come next

* [sr] [codemod] Change all SR callsites to use new API

@allow-large-files

This diff refactors all callsites of SR to use the slightly changed API introduced in the diff below. Really what this means is that you need to include the correct header. Also if you were using `ClientFactory::newFactory` you need to not prefix it with `ClientFactory::`.

```
cd ~/fbsource/fbcode
find ./ -type f -exec sed -i -e 's:#include "servicerouter/client/cpp2/ClientFactory.h":#include "servicerouter/client/cpp2/ServiceRouter.h":' -e 's:#include <servicerouter/client/cpp2/ClientFactory.h>:#include <servicerouter/client/cpp2/ServiceRouter.h>:' -e 's/ClientFactory::newFactory(/newFactory(/g' {} \;
```

Also manually fixed spots that couldn't be done automatically (or broke because they depended on transitive includes).

* Back out "Fix handling of empty batches in SumReduceDimsOp"

Original commit changeset: 282da1730cc2 This commit is blocking the
Github->fbcode sync, which really needs to get merged ASAP. D7881937 which this
diff depends on will be reverted in the sync D7990948 which causes this to
break. The sync diff cannot be patched with this reversion because it must be
landed against base revision 5c8c099, and D7881937 must not be included in the
sync diff because it is breaking GPU tests that are not available in sandcastle
: https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-cuda8.0-cudnn6-ubuntu16.04-test/3638/console
for one example.

* Add the flow to support operator benchmark

1) generate model with the operator 2) upload to everstore 3) generate model spec into json file 4) start running the benchmark

* [tum][gpu] Connect DPM trainer with flow and unit tests

This diff:
- Fix some small bugs for Yiming's recent changes to parallelizer, so it suits real use cases.
- Add correct tags to the TUM code, so we can do data parallel transform
- Pass extra info at instantiation.
- add unit test for using DPM in TUM model

After this diff, we can do simple box, multi-gpu fully-sync trainer for TUM in Fblearner workflow, but may still need to do speed benchmarking.

* w/o normalized lradaption for adam dense only

The previous lr adaption included a normalization step when performing the dot product operation. This is not exactly the same as what is proposed in the paper. I add normalization as an option: without it, the operator does exactly what the paper proposes; with it, the normalization step is added.

* [fb] Use SharedPromise in DeferrableAsyncSchedulingNet

This code is to simplify DeferrableAsyncSchedulingNet by removing condition
variable + small fixes

* [tum] implement cuda sparseLengthsMean and LengthsMean

as title

* Adding an optional parameter to allow use of protobufs in InferShapesAndTypes function.

Adding an optional parameter to allow use of protobufs in InferShapesAndTypes function.

* Move feature_to_index to FeatureSpec.feature_to_index

Move feature_to_index to FeatureSpec.feature_to_index to avoid overriding other fields.

* [Caffe2] Rename bytes_moved to bytes_written

Just a rename in preparation for supporting bytes_read.

* [c2] fix ReduceFrontSumOp for empty case by setting 0

Otherwise, it may use the results from the last iteration when the batch is empty.

* [Caffe2] [Int8] Improve Intel CPU performance

* [Easy] Improve PrependDim op logging

as titled

* DBFileReader expand db_path using os.path.expanduser(..)

Since there are a lot of possible use cases where `DBFileReader` reads from a user home path, like `~/local/sample.db`, I want to save people the trouble of calling `os.path.expanduser(db_path)` themselves.

* [Caffe2] Add bytes_read to cost structure

We're adding analytical read bytes to cost functions.  This extends the structure accordingly for all CostInference defined operators.
Additionally, some small bug fixes were performed:
1) Cost functions now extract type information of operands instead of assuming float

* Fix sleef on aarch64 for hhvm

@bypass-lint

Rename flag

* Remove duplicated part in caffe2/ideep/operators/conv_op.cc

Should be a sync error.

* Rename test helper function test_adagrad_sparse_helper to adagrad_sparse_test_helper to avoid confusing pytest
2018-05-19 23:10:48 -07:00
Konpat
9789602814 Fix excess ']' in nn.utils.rnn.pack_sequence (#7475) 2018-05-10 14:41:17 -04:00
Yan Facai (颜发才)
f1e38725bf add to method for PackedSequence (#7319)
* ENH: add to method for PackedSequence

* ENH: return self if possible

* TST: remove extra data

* DOC: add more explanation

* TST: remove extra data

* DOC: minor fix
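
A small usage sketch of the `to` method this PR adds; the exact arguments mirror `torch.Tensor.to` and are an assumption here, not taken from the diff.

```
import torch
from torch.nn.utils.rnn import pack_padded_sequence

x = torch.randn(4, 3, 8)                      # (T, B, features)
packed = pack_padded_sequence(x, [4, 3, 2])   # lengths sorted descending
packed64 = packed.to(torch.float64)           # converts the underlying data
packed64 = packed64.to(torch.float64)         # already float64: returns self
```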
2018-05-07 14:39:03 -04:00
Sergei Lebedev
a0c1e5faea Change the error message in pad_sequence to be more user-friendly (#7283) 2018-05-04 12:29:21 -07:00
li-roy
d564ecb4a5 Update docs with new tensor repr (#6454)
* Update docs with new tensor repr

* remove cuda in dtype

* remove changes to gloo submodule

* [docs] document tensor.new_* ctor

* [docs] Add docs for tensor.to(), tensor.float(), etc

* [docs] Moar examples for docs.

* [docs] Warning for tensor ctor copy behavior

* Quick fix

* [docs] Document requires_grad_()

* [docs] Add example for requires_grad_()

* update slogdet and *fft

* update tensor rst

* small fixes

* update some docs

* additional doc changes

* update torch and tensor docs

* finish changing tensor docs

* fix flake8

* slogdet with negative det

* Update functional.py tensor ctors

* Fix nll_loss docs

* reorder to move device up

* torch.LongTensor -> torch.tensor or torch.empty in docs

* update tensor constructors in docs

* change tensor constructors

* change constructors

* change more Tensor() to tensor()

* Show requires_grads_ docs

* Fix set_default_dtype docs

* Link to torch.no_grad, etc, from torch doc

* Add dtype aliases to table

* regen docs again

* Tensor attributes stub page

* link to inplace sampling

* Link torch.dtype, device, and layout

* fix dots after nonfinite floats

* better layout docs
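
As an illustration of the constructor conventions the updated docs settle on (a sketch, not text from the PR): `torch.tensor` infers dtype from data, while `torch.empty`/`torch.zeros` take sizes.

```
import torch

a = torch.tensor([1, 2, 3])                    # dtype inferred: torch.int64
b = torch.empty(2, 3)                          # uninitialized, sizes as args
c = torch.zeros(2, 3, dtype=torch.float64)     # explicit dtype keyword
print(a.dtype, b.shape, c.dtype)
```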
2018-04-21 07:35:37 -04:00
Tongzhou Wang
1c01eabd3c
Codemod to update our codebase to 0.4 standard (#6641)
* Codemod to update our codebase to 0.4 standard

* Update some of the test scripts

* remove Variable in test_clip_grad_value

* fix _symbolic_override_wrapper_maker
2018-04-17 22:06:54 -04:00
James Reed
ad5d421554
[JIT] Implement staged symbolics for pack_padded_sequence/pad_packed_sequence (#6256)
* Unit test for pack_padded tracing

* Move monkeypatching stuff

* Switch symbolic

* Fix stack traces and update test

* Fixup and confirm e2e working

* lint

* Move monkeypatch back to onnx

* Address comments

* remove extraneous import

* Add gradient checking

* lint

* Address comments

* improve test case
2018-04-10 11:30:50 -07:00
Tongzhou Wang
4d15442ebc
Add total_length option to pad_packed_sequence (#6327)
* add total_length to pad_packed_sequence; add example on how to use pack->rnn->unpack with DP

* address comments

* fix typo
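
A short sketch of the pack -> RNN -> unpack pattern with `total_length`, as described above; shapes and module sizes are illustrative only.

```
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

total_length = 10                      # padded length of the full batch
x = torch.randn(3, total_length, 8)    # (B, T, features), batch_first
lengths = [7, 5, 4]                    # this chunk's true lengths (descending)

rnn = nn.GRU(8, 16, batch_first=True)
packed = pack_padded_sequence(x, lengths, batch_first=True)
out_packed, _ = rnn(packed)

# Without total_length the output is padded only to max(lengths) == 7,
# which breaks gathering across DataParallel replicas; with it, T == 10.
out, _ = pad_packed_sequence(out_packed, batch_first=True,
                             total_length=total_length)
print(out.shape)  # torch.Size([3, 10, 16])
```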
2018-04-08 20:25:48 -04:00
Kento NOZAWA
3b58b859b2 Fix typos in docs (#6389) 2018-04-07 12:41:15 -04:00
Edward Z. Yang
acc409396b
Namespaced symbols (#5820)
* Namespaced symbols

- Our interned strings now have structure, "ns::symname" rather than just
  "symname" before.  We support efficient namespace testing for uniques
  by encoding the namespace in one byte in the Symbol internal representation.
  See torch/csrc/jit/interned_strings.h for a more in-depth implementation
  discussion.

- All uses of ksymbol are now attr::symbol (or some appropriate namespace).
  The valid namespaces are prim, attr, onnx and aten.

- Symbol is bound in Python as a qualified string "attr::symbol", EXCEPT for the
  attribute setting/getting API, whose symbols must always be attr
  symbols; they get special cased to assume strings are passed.
  There's a little bit of naughtiness in the implementation, maybe you know
  how to solve it.

- However, the g.op() convenience function assumes that you're generating
  ONNX operators, unless you explicitly qualify.

- All ATen operators and nodes have built-in interned strings generated
  for them, so you should never have to write a string literal ever again.
  The tracing code is adjusted to use it.

- ONNX exporter now properly tests to see that all operators are in
  onnx namespace before accepting the export.  This is way more
  robust than the previous exporter, which would be willing to
  export capitalized operators which were not actually ONNX operators.

- A slight organizational change for symbolic.py; this module now ONLY
  contains aten operators.  In particular, the exporter for Constant
  has moved into utils.py (along with Undefined, from the C++ side),
  since primitive ops get "special treatment."

- The un-inplacing logic in recording is more robust, so that we don't
  delete a trailing underscore from __and__.  This never affected us
  before because we didn't have any tests for it.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2018-03-16 13:36:11 -04:00
anderspapitto
5c51bb6c0f bugfix in onnx export of batch_first = True (#5753) 2018-03-15 12:23:21 -04:00
Vishwak Srinivasan
76a283db40 [ready] General Documentation Improvements - 2 (#5685)
* Fix some minor errors in existing docs.

* Fix Convolution and Pooling docs in torch.nn.functional

* Cleaned up torch.nn.functional docs

* Address @SsnL 's comments

* Add multiplication sign missing in docs

* Fix more typos, and clear some warnings

* Change infinity symbol in LPPool2d

* Revert some changes in torch.nn.functional

* Few more minor changes
2018-03-13 09:47:43 -04:00
anderspapitto
e9d1a5f6d5 support non-Variable arguments to functions in symbolic overrides (#5645)
Simply pass them through unmodified. These are just the final tweaks,
after the bulk of the work getting rid of ExportProxy.
2018-03-10 17:51:49 -05:00
Vishwak Srinivasan
32b3841553 [ready] General documentation improvements (#5450)
* Improve documentation
1. Add formula for erf, erfinv
2. Make exp, expm1 similar to log, log1p
3. Symbol change in ge, le, ne, isnan

* Fix minor nit in the docstring

* More doc improvements
1. Added some formulae
2. Complete scanning till "Other Operations" in Tensor docs

* Add more changes
1. Modify all torch.Tensor wherever required

* Fix Conv docs
1. Fix minor nits in the references for LAPACK routines

* Improve Pooling docs
1. Fix lint error

* Improve docs for RNN, Normalization and Padding
1. Fix flake8 error for pooling

* Final fixes for torch.nn.* docs.
1. Improve Loss Function documentation
2. Improve Vision Layers documentation

* Fix lint error

* Improve docstrings in torch.nn.init

* Fix lint error

* Fix minor error in torch.nn.init.sparse

* Fix Activation and Utils Docs
1. Fix Math Errors
2. Add explicit clean to Makefile in docs to prevent running graph generation script
while cleaning
3. Fix utils docs

* Make PYCMD a Makefile argument, clear up prints in the build_activation_images.py

* Fix batch norm doc error
2018-03-08 13:21:12 -05:00
anderspapitto
28b1c94f0f allow application of @symbolic decorators without circular imports (#5595) 2018-03-08 12:44:16 -05:00
Evpok
c93076495d add: padding_value to torch.nn.utils.rnn.pad_sequence (#5540) 2018-03-05 11:39:03 +01:00
anderspapitto
b6854ee012 support batch-first in ONNX export of padded sequences (#5360) 2018-02-22 20:24:56 -05:00
anderspapitto
ef14590209 Support calling pack_padded_sequence with Variable lengths (#5113)
This was accidentally lost while addressing review comments on
https://github.com/pytorch/pytorch/pull/4695

pack_padded_sequence may be called either with a list or with a
Variable. If called with a list we convert to Variable internally.

I added a test to test_nn for the new codepath. The bug was also caught
by the onnx-fb-universe tests (which rely on passing in Variable).
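
A minimal sketch of the two call paths described above (list vs. tensor lengths); in current PyTorch the Variable case is just a tensor.

```
import torch
from torch.nn.utils.rnn import pack_padded_sequence

x = torch.randn(5, 2, 4)                                    # (T, B, features)
from_list = pack_padded_sequence(x, [5, 3])                 # converted internally
from_tensor = pack_padded_sequence(x, torch.tensor([5, 3]))
assert torch.equal(from_list.data, from_tensor.data)
```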
2018-02-07 17:11:33 -05:00
anderspapitto
b2cfd961d3 Handle sequence lengths correctly when exporting RNNs to ONNX (#4695)
* PackedSequence: store batch_sizes as tensor

rather than converting to a list of python integers. This maintains
the invariant that module's inputs/outputs are collections of
Variables.

In particular, this causes the JIT to no longer choke when flattening
and unflattening arguments.

* Handle sequence lengths correctly when exporting RNNs to ONNX

- when uniform sequence lengths are provided, correctly omit the
  argument when constructing the ONNX graph, so as to not fix the
  graph to the batch size.

- handle PackedSequences by floating them through the graph and
  eliminating them in an optimization pass. ONNX does not have packed
  sequences, but operates on a representation equivalent to
  PaddedSequence, so we hide the representation-switching from ONNX

- as a preliminary step towards handling PackedSequences, not directly
  tied to ONNX export, change batch_sizes from being an argument to
  the RNN operators into being an argument to the forward() function
  of those RNN operators. This more closely models the reality that
  batch_sizes are effectively part of the input sequences.
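
A quick sketch of the invariant the first bullet describes: after packing, both `.data` and `.batch_sizes` are tensors rather than Python lists.

```
import torch
from torch.nn.utils.rnn import pack_padded_sequence

packed = pack_padded_sequence(torch.randn(4, 2, 3), [4, 2])
print(type(packed.data), type(packed.batch_sizes))
# <class 'torch.Tensor'> <class 'torch.Tensor'>
```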
2018-02-06 21:40:27 -05:00
Justus Schwabedal
a20ac05c8b Added method cuda to PackedSequence. (#4430) 2018-01-10 21:42:37 +01:00
Marcin Elantkowski
a0ab48575e Implement backward pass for pack_padded_sequence 2018-01-09 12:33:15 +01:00
William Falcon
ff9d1aeab5 removes duplicate variable reference crash from pad_sequences (#4383) 2017-12-29 08:34:53 +09:00
Sherin Thomas
492e26fbcd Pad sequences and Pack sequences (#3875) 2017-12-22 16:14:09 +01:00
Tristan Zajonc
3a5bbc2140 improve PackedSequence docs to explain batch_sizes (#3878) 2017-11-26 21:11:54 +01:00
Sam Gross
4518793aa2
Implement indexing in ATen (#3725)
Implements basic and advanced indexing using ATen tensors/variables.
Basic indexing is translated at the Python-binding level
(python_variable_indexing.cpp) to slice/squeeze/unsqueeze/select calls.
Advanced indexing is implemented in ATen in terms of take() and put()
calls.
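
An illustrative sketch (Python-level only, not the ATen internals) of the two indexing paths the message distinguishes.

```
import torch

t = torch.arange(12).view(3, 4)
basic = t[1:, 0]                                          # slice + select
advanced = t[torch.tensor([0, 2]), torch.tensor([1, 3])]  # take-style gather
print(basic)     # tensor([4, 8])
print(advanced)  # tensor([ 1, 11])
```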
2017-11-21 13:19:00 -05:00
Ozan Çağlayan
dd6d04ddf2 doc: Normalize all true/false in docstrings to `True|False` (#3593)
* doc: Normalize all true/false in docstrings to ``True|False``

This makes them more apparent in the documentation.

* doc: fix flake8
2017-11-09 08:12:29 -05:00
SsnL
de1f4e69dd raw text (#3327) 2017-10-28 01:24:02 +05:30
Adam Paszke
5760b036fb Fix pack_padded_sequence to accept inputs of arbitrary sizes 2017-10-26 17:40:03 +02:00
Dirk Weissenborn
21ff182809 improve padding code 2017-10-26 12:03:17 +02:00
Dirk Weissenborn
b5170c8bf1 improves pack padded sequence operation runtime #1788 (#3278)
* improves pack padded sequence operation runtime #1788

* error message
2017-10-25 13:16:32 +02:00
Kazuma Hashimoto
ea8b09365c Specifying the value used for padding (#2751)
* Specifying the value used for padding

The "pad_packed_sequence" function fills padded elements with zeros, but sometimes it is not useful. For example, some previous papers on NLP, including my recent paper [1], use a max-pooling technique for RNN-based sentence representations. More specifically, the max-pooling technique selects the maximum value from all time steps (i.e., hidden states) for each dimension. In such a case, we do not want the padded zeros to be selected. To overcome this situation, we can simply use a very small value instead of zero.

An LSTM example is shown below:

input = embedding(Variable(batchInput))
packedInput = nn.utils.rnn.pack_padded_sequence(input, lengths, batch_first = True)
h, (hn, cn) = self.encoder(packedInput, (h0, c0))
h, _ = nn.utils.rnn.pad_packed_sequence(h, batch_first = True, padding_value = -1024.0)
sentenceRep, _ = torch.max(h, 1, keepdim = True)

[1] A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks. Kazuma Hashimoto, Caiming Xiong, Yoshimasa Tsuruoka, and Richard Socher. The 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017).
https://arxiv.org/abs/1611.01587 (Equation (4))

* Modified the order of the arguments

Following the suggestion, I modified the order of the arguments.
2017-09-18 14:48:10 -04:00
Leonid Vlasenkov
46a868dab7 [Ready] Limit docs line length (#1900)
* some docs are ready

* docs

* docs

* fix some more

* fix some more
2017-07-10 10:24:54 -04:00
BenFielding
a8e6610e3d Fix argument typo in pad_packed_sequence docstring (#1300) 2017-04-19 13:50:59 -04:00
Yugnaynehc
37ebbc2809 the length of any item in padded_sequence should be greater than 0 (#1013) 2017-03-16 17:32:43 +01:00
Adam Paszke
da725830c2 Add support for variable length sequences in RNNs (#873) 2017-03-01 17:36:32 +01:00