Commit Graph

5 Commits

Author SHA1 Message Date
Bugra Akyildiz
27c7158166 Remove __future__ imports for legacy Python2 supports (#45033)
Summary:
There is a module called `2to3` which you can target for future specifically to remove these, the directory of `caffe2` has the most redundant imports:

```2to3 -f future -w caffe2```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45033

Reviewed By: seemethere

Differential Revision: D23808648

Pulled By: bugra

fbshipit-source-id: 38971900f0fe43ab44a9168e57f2307580d36a38
2020-09-23 17:57:02 -07:00
Ansha Yu
98aebed88e Refactor tests part 1 (#11350)
Summary:
Followup to [the serialized test framework](https://github.com/pytorch/pytorch/pull/10594)

Round 1 for refactoring tests, starting alphabetically. I added some functionality, so I wanted to send out some of these initial changes sooner.

I'm skipping all tests that don't explicitly call assertReferenceChecks. Some tests directly call np.allclose, and others are simply TestCase (rather than HypothesisTestCase).

1. Start alphabetically producing serialized outputs for test functions, annotating those we want to include with `serialized_test_util.given`. So far I've only added one test per operator, but this already does seem to add quite a few tests.
2. Add functionality to allow us to generate outputs using pytest by adding pytest argument options. This allows us to skip adding a `__main__` function to quite a few tests.
3. Catch any exceptions generating the gradient operator and skip serializing/reading it, since certain operators don't have gradients.
4. Add functionality to better handle jagged array inputs, which numpy doesn't handle very well. We simply explicitly do the conversion to dtype=object.
5. Make only one file per test function, rather than 4, to reduce the number of files in the github repo.

I also noticed that there is some hypothesis handling that makes `serialized_test_util.given` not compatible with adding more hypothesis decorators on top. For example, there are tests that do
```
settings(...)
given(...)
def test_my_stuff(...)
```
But there is a hypothesis handler that explicitly checks that `given` is called below `settings`, so we cannot refactor this to `serialized_test_util.given`. I've just avoided decorating these kinds of tests for now, I hope that's alright.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11350

Reviewed By: houseroad

Differential Revision: D9693857

Pulled By: ajyu

fbshipit-source-id: a9b4279afbe51c90cf2025c5ac6b2db2111f4af7
2018-09-18 10:42:10 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Curtis Huang
c9238671ee Use char-ngram embedding for out-of-vocabulary words
Summary:
**Description**

Provide DeepText model with the functionality to load a secondary index (pre-trained char-ngram embedding, e.g. FastText) during training/test.  Embeddings of out-of-vocabulary words will be computed on-the-fly during training/test by averaging the char-ngram embeddings.

**Approach**

This diff provides two custom operators to accomplish this task – ConditionalOp and IndexCharNgramGetOp.  We first use IndexCharNgramGetOp to perform char-ngram index lookup and return a sparse tensor segmented by lengths for each token.  The sparse tensor is then used to compute the average embedding provided by the char-ngram index.  Finally, we use a ConditionalOp to replace those whose embeddings were not found in the original index during the feature apply stage.  Please refer to documentations of the code for more details.

Reviewed By: jamesr66a

Differential Revision: D5666924

fbshipit-source-id: f76605d093154a014d5b9ebf9510de9d79874eee
2017-09-01 19:16:49 -07:00