Commit Graph

4952 Commits

Author SHA1 Message Date
Chandler Zuo
472be69a73 Avoid Outputting Uninitialized Blobs in Load with load_all=1 (#19133)
Summary:
When output blob names are specified while load_all=1, the output blob names are ignored. However, this behavior is not documented. In this diff, we simply disallow users from providing output blob names when load_all=1 (see the sketch below).

See discussion at https://fb.workplace.com/groups/1405155842844877/permalink/2714909788536136/
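A minimal sketch of the check, not the actual diff: `load_all` is the real argument, but the helper and check site here are illustrative.

```cpp
#include "caffe2/core/operator.h"  // CAFFE_ENFORCE_EQ, OperatorBase

// Hedged sketch: when load_all=1 the op loads every blob in the file, so
// explicitly named outputs would be silently ignored; reject them instead.
void ValidateLoadArgs(bool load_all, int output_size) {
  if (load_all) {
    CAFFE_ENFORCE_EQ(
        output_size, 0,
        "Do not specify output blob names when load_all=1.");
  }
}
```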
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19133

Reviewed By: dzhulgakov

Differential Revision: D14883698

Pulled By: chandlerzuo

fbshipit-source-id: 6e4171e36c4ccc4f857e79da98b858a06b7d8ad6
2019-04-27 10:45:44 -07:00
Michael Suo
a25b79531c use fully qualified name for ScriptClasses (#19239)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19239
ghimport-source-id: 830aad6dc11d2a7247760a9c7c9fc8556f70a706

Differential Revision: D14928293

Reviewed By: eellison

Pulled By: suo

fbshipit-source-id: d2efa5d7f7397526083278d6650b9cee8d967b1a
2019-04-26 19:17:21 -07:00
Xiaomeng Yang
2ce39de3fc Add elementwise_affine for layer_norm_op (#19713)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19713

Add elementwise_affine for layer_norm_op
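Since the change is only named here, a small reference sketch (not the Caffe2 kernel) of layer norm with the optional elementwise affine step; `gamma`/`beta` are the learned per-feature scale and shift:

```cpp
#include <cmath>
#include <vector>

// y[i,j] = (x[i,j] - mean_i) / sqrt(var_i + eps), optionally * gamma[j] + beta[j]
void LayerNormRef(const std::vector<float>& x, int rows, int cols, float eps,
                  const float* gamma, const float* beta, std::vector<float>* y) {
  y->resize(x.size());
  for (int i = 0; i < rows; ++i) {
    const float* xi = x.data() + i * cols;
    float mean = 0.f, var = 0.f;
    for (int j = 0; j < cols; ++j) mean += xi[j];
    mean /= cols;
    for (int j = 0; j < cols; ++j) var += (xi[j] - mean) * (xi[j] - mean);
    var /= cols;
    const float inv_std = 1.f / std::sqrt(var + eps);
    for (int j = 0; j < cols; ++j) {
      const float n = (xi[j] - mean) * inv_std;
      // elementwise_affine applies a learned scale/shift per feature.
      (*y)[i * cols + j] = gamma ? n * gamma[j] + beta[j] : n;
    }
  }
}
```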

Reviewed By: houseroad

Differential Revision: D15075454

fbshipit-source-id: e8a7d3da1c81e49fa55323f5e74a68bc4ef8d83f
2019-04-26 17:20:01 -07:00
Yinghai Lu
9d180e602f More topi support (#19728)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19728

Added `Tanh`, `Transpose` and `Mul` support.

Reviewed By: hlu1

Differential Revision: D15078878

fbshipit-source-id: 0a0df6b0d453bc38987b6d744774c127dd6875fe
2019-04-26 00:53:11 -07:00
Jack Montgomery
48d5ab54a8 Automatic update of fbcode/foxi to 8f74bc4df3a4cfc69b1a3eadf62aa29d9961c72d AND update Glow AND update C2 (#19792)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19792

This diff also contains the contents of D15092641 and D15090411, so as not to let c2, foxi, and Glow get out of sync.

Previous import was 81e1683d6348eee4b5ed1145222dc2c41be4269c

Included changes:
- **[8f74bc4](https://github.com/houseroad/foxi/commit/8f74bc4)**: Small fixes (#12) <Jack Montgomery>
- **[72097e4](https://github.com/houseroad/foxi/commit/72097e4)**: Add multiple quantization params per tensor (#11) <Jack Montgomery>
- **[b681fe0](https://github.com/houseroad/foxi/commit/b681fe0)**: Merge pull request #10 from jackm321/add_autoinstrument_graph_prop <Jack Montgomery>
- **[a68d835](https://github.com/houseroad/foxi/commit/a68d835)**: Add ONNXIFI_GRAPH_PROPERTY_AUTO_INSTRUMENT_NODES <Jack Montgomery>

Reviewed By: rdzhabarov, zrphercule

Differential Revision: D15086794

fbshipit-source-id: 8df02c62303b580e16a218d6be7791747e3d7213
2019-04-25 21:03:32 -07:00
Spandan Tiwari
9ef8eb4cbc Fix case for activations attribute in nn.RNN ONNX export. (#19368)
Summary:
This PR addresses the https://github.com/pytorch/pytorch/issues/19366 issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19368

Reviewed By: zrphercule

Differential Revision: D15043949

Pulled By: houseroad

fbshipit-source-id: 9b90410307d31bc5f2fd14aa0cdd33b22572ed7c
2019-04-25 16:31:25 -07:00
Oleg Bogdanov
bf5a5c2a31 caffe2 | Use _aligned_free in WorkerPool destruction (#19751)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19751

This has probably never been tested on Windows, but destruction of WorkersPool crashes because it uses _aligned_malloc to allocate and free() to deallocate, which is not symmetric. The fix is to use _aligned_free for deallocation.
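A minimal Windows-only illustration of the symmetry requirement (not the WorkersPool code itself):

```cpp
#ifdef _MSC_VER
#include <malloc.h>

int main() {
  void* p = _aligned_malloc(1024, 64);  // 1 KiB, 64-byte aligned
  // free(p);         // wrong: mismatched allocator; crashes in the CRT
  _aligned_free(p);   // correct: symmetric with _aligned_malloc
  return 0;
}
#endif
```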

Reviewed By: hlu1

Differential Revision: D15083472

fbshipit-source-id: 42243fce8f2dfea7554b52e6b289d9fea81d7681
2019-04-25 14:54:50 -07:00
Yinghai Lu
65496e4e67 Bug fix in bound shape inferencer (#19729)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19729

Accessing dims() without a boundary check is unsafe.
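A hedged sketch of the guarded access (the helper name is illustrative; TensorShape is the Caffe2 proto with a repeated `dims` field):

```cpp
#include "caffe2/proto/caffe2_pb.h"  // caffe2::TensorShape

// Check the repeated-field size before indexing, instead of calling
// shape.dims(axis) unconditionally.
int64_t DimAtOrDefault(const caffe2::TensorShape& shape, int axis, int64_t dflt) {
  return (axis >= 0 && axis < shape.dims_size()) ? shape.dims(axis) : dflt;
}
```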

Reviewed By: zrphercule

Differential Revision: D15078912

fbshipit-source-id: 3746d0c18261abeec0c4880c30430125928c3309
2019-04-25 14:50:19 -07:00
Lu Fang
5025d1d5e4 Automatic update of fbcode/onnx to 27d4b617e7097cda7d0d4c45ff2b09d248f33179 (#19718)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19718

Previous import was 0e8d2bc5e51455c70ef790b9f65aa632ed9bc8a7

Included changes:
- **[27d4b617](https://github.com/onnx/onnx/commit/27d4b617)**: Adding RoIAlign operator (#1869) <Sam Pepose>
- **[70c9026c](https://github.com/onnx/onnx/commit/70c9026c)**: add ReverseSequence op (#1927) <Guoliang Hua>
- **[ed2db02a](https://github.com/onnx/onnx/commit/ed2db02a)**: README.md: Update badge style for build status (#1942) <Yulong Wang>
- **[e36d3b54](https://github.com/onnx/onnx/commit/e36d3b54)**: Enable python 3.7 in CI for Windows (#1943) <Raymond Yang>

Differential Revision: D15077516

fbshipit-source-id: c8c6935381ff5a96ab9a4ee519685814f4ea6e59
2019-04-25 10:54:15 -07:00
Summer Deng
cbd0a2d3c9 Fix the depthwise 3x3x3 fast path criteria for the stride (#19692)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19692

Remove the requirement on stride for the optimized depthwise 3x3x3 kernels.

Reviewed By: jspark1105

Differential Revision: D15070214

fbshipit-source-id: 9fe2d8e96930166e4eb0e2dd2288f6a0c4831e0a
2019-04-24 21:35:27 -07:00
David Goodwin
c855e04d5f Caffe2 shouldn't fail if CUDA peer access is already enabled
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19586
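A plausible sketch of the tolerance, assuming the standard CUDA runtime API (`cudaErrorPeerAccessAlreadyEnabled` is a real error code; the wrapper is illustrative):

```cpp
#include <cuda_runtime.h>

// Treat "already enabled" as success instead of a fatal error.
cudaError_t EnablePeerAccessIfNeeded(int peer_device) {
  cudaError_t err = cudaDeviceEnablePeerAccess(peer_device, /*flags=*/0);
  if (err == cudaErrorPeerAccessAlreadyEnabled) {
    cudaGetLastError();  // clear the sticky error state and carry on
    return cudaSuccess;
  }
  return err;
}
```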

Differential Revision: D15061544

Pulled By: dzhulgakov

fbshipit-source-id: 6a5f9f4fe45259d689671f58ad5206cdaf15c5bd
2019-04-24 13:22:27 -07:00
Gu, Jinghui
b675f07bb6 Remove useless input shape checker in conv (#19608)
Summary:
The input shape checker in the conv/int8_conv operators aims to avoid an issue when running with MKL-DNN Winograd: the weights have to be reordered each time the input shape changes.
However, the checker causes a big performance regression due to the frequent reordering.

Meanwhile, in mkldnn-bridge this case has already been fixed by correcting the prop_kind.
Therefore, we remove the now-useless checker to fix the performance regression.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19608

Differential Revision: D15061169

Pulled By: yinghai

fbshipit-source-id: 649a43ae6fce989e84939210f6dffb143ec3d350
2019-04-24 11:39:43 -07:00
Zachary DeVito
87a6974193 Make it possible for self.forward to return a ScriptMethod (#19217)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19217
ghimport-source-id: 6fdd7f5ac041dae950b47ca316f30682ede0b083

Reviewed By: suo

Differential Revision: D14922120

Pulled By: zdevito

fbshipit-source-id: 5e82e5d7ee72df6f401146d2519c80ea336ff40e
2019-04-24 11:14:34 -07:00
Rui Zhu
2f73b3d26e Add if ops support for onnxifi and ssa-rewrite (#19585)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19585

Originally we unrolled every If op into many different subnets.
Now we no longer unroll; instead, we add all external inputs of its subnets to the If op and SSA-rewrite all external inputs/outputs. That is enough.

Reviewed By: yinghai

Differential Revision: D15038139

fbshipit-source-id: 8532216d8749068acd5558ad0d8cb1d98463a063
2019-04-24 11:01:13 -07:00
Xiaomeng Yang
fb9fc42a0c optimize BatchMatmulOp (#18612)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18612

optimize BatchMatmulOp

Reviewed By: houseroad

Differential Revision: D14681665

fbshipit-source-id: cf5ea4909ace58fd44fe6fa634531102ac84e851
2019-04-23 15:34:59 -07:00
Oleg Bogdanov
70b82d28b8 caffe2 | Windows compat fixes
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19531

Reviewed By: hlu1

Differential Revision: D15024541

fbshipit-source-id: cd8249a6d529afb65fa8afd74a05dbfe73eb1fb0
2019-04-23 14:30:19 -07:00
Huamin Li
55e53d3d7e correct comments in group_norm_op (#19621)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19621

The comments for group_norm_op are not accurate (specifically the math part); this diff fixes them.

Reviewed By: BIT-silence

Differential Revision: D15048695

fbshipit-source-id: 27d41d3ae21054257967815254134849944d56ca
2019-04-23 13:31:15 -07:00
Yinghai Lu
4e8cc8ee90 Surface the Glow traces to C2 (#19087)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19087

att

Reviewed By: jackm321

Differential Revision: D14863112

fbshipit-source-id: 2680161b9f05391e73bb8dac4fbbeabb87a82c05
2019-04-23 12:27:49 -07:00
Priya Goyal
0d0acba3bd Allow extracting element-wise loss in softmax (#19579)
Summary:
Oftentimes we want to experiment with the loss per element (per image, etc.). This changeset allows getting the per-element loss as well. This output is optional.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19579

Reviewed By: jerryzh168

Differential Revision: D15035797

Pulled By: prigoyal

fbshipit-source-id: 562dea514f49c1f2f1cbbc083a1938dc019a75c4
2019-04-23 11:49:49 -07:00
Jiyan Yang
714344a976 Specify to use Float16UniformFill if necessary in sparse lookup layer (#18499)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18499

If the init op is not fp16 compatible, it should throw. However, in the special case where the original init op is UniformFill, we replace it with Float16UniformFill (see the sketch below).
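A hedged C++ rendering of that rule (the real logic lives in the Python sparse-lookup layer; `IsFp16Compatible` is a hypothetical helper):

```cpp
#include <string>
#include "caffe2/core/logging.h"     // CAFFE_THROW
#include "caffe2/proto/caffe2_pb.h"  // caffe2::OperatorDef

bool IsFp16Compatible(const std::string& op_type);  // hypothetical predicate

void MakeInitOpFp16(caffe2::OperatorDef* init_op) {
  if (!IsFp16Compatible(init_op->type())) {
    if (init_op->type() == "UniformFill") {
      init_op->set_type("Float16UniformFill");  // the special-cased swap
    } else {
      CAFFE_THROW("Init op ", init_op->type(), " is not fp16 compatible");
    }
  }
}
```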

Reviewed By: kennyhorror

Differential Revision: D14627209

fbshipit-source-id: eb427772874a732ca8b3a25d06670d119ce8ac14
2019-04-23 10:14:08 -07:00
Lu Fang
5a796d15be Automatic update of fbcode/onnx to 0e8d2bc5e51455c70ef790b9f65aa632ed9bc8a7 (#19568)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19568

Previous import was 83dd62659fc07d5b7fa93b5d1c1879f93509c7db

Included changes:
- **[0e8d2bc5](https://github.com/onnx/onnx/commit/0e8d2bc5)**: [Minor need to be in 1.5]Fix an issue in NMS test data which introduce wrong shape. (#1953) <Hector Li>
- **[9346dd5d](https://github.com/onnx/onnx/commit/9346dd5d)**: adding modulus operator (#1874) <Jeff Saremi>
- **[414dbc73](https://github.com/onnx/onnx/commit/414dbc73)**: Fix shape inference for slice (#1950) <Hariharan Seshadri>
- **[6fb0775d](https://github.com/onnx/onnx/commit/6fb0775d)**: Fix shape inference for ConstantOfShape op (#1951) <Ashwini Khade>

Reviewed By: bddppq, zrphercule, benoitsteiner

Differential Revision: D15033070

fbshipit-source-id: f7eb90b142cbdc9bf1600cfd33e5a8df709045fb
2019-04-22 17:36:36 -07:00
Yinghai Lu
767d184b77 Add back option to not adjust output batch size (#19442)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19442

For cases like CV, some ops like Transpose and Tile mangle the batch size so that we don't know how to adjust the output batch size. In this case, the current solution is to fix the input batch size statically and not adjust the output batch size.

Reviewed By: zrphercule

Differential Revision: D15007237

fbshipit-source-id: a21b943a52ee5462d9d7804dfae44360f579f8cf
2019-04-22 12:29:24 -07:00
Michael Antonov
7655b857f7 Add debug logic to c2_ref_test and its helpers (#19359)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19359

Even with file IO exception handling, some of the sandcastle c2_ref_tests are still failing in the length-check assert, as can be seen here:
https://our.intern.facebook.com/intern/test/844424932589974?ref_report_id=0

This is an attempt to add printing logic to debug what's going on.

Reviewed By: dzhulgakov

Differential Revision: D14966274

fbshipit-source-id: adce6d4780d664c5ef59f9341b6133b0d09324cb
2019-04-22 12:08:55 -07:00
Dehua Cheng
a09240b0a0 fix variable shadowing issues (#19567)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19567

fix variable shadowing

Reviewed By: bddppq, wx1988

Differential Revision: D15032114

fbshipit-source-id: 895ea21f22b87db8c7c8684f54fa186d22f24d10
2019-04-22 11:55:30 -07:00
Lu Fang
e714429bf4 Automatic update of fbcode/onnx to 83dd62659fc07d5b7fa93b5d1c1879f93509c7db (#19454)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19454

Previous import was ad7313470a9119d7e1afda7edf1d654497ee80ab

Included changes:
- **[83dd6265](https://github.com/onnx/onnx/commit/83dd6265)**: Add NonMaxSuppression operator (#1703) <Hector Li>
- **[31ca5d6f](https://github.com/onnx/onnx/commit/31ca5d6f)**: add node tests for quantized ops (#1944) <Ashwini Khade>
- **[e6076c1d](https://github.com/onnx/onnx/commit/e6076c1d)**: Fix test stat coverage script (#1948) <Raymond Yang>
- **[ad036405](https://github.com/onnx/onnx/commit/ad036405)**: Add IsInf to detect infinity values (#1884) <Wei-Sheng Chin>

Reviewed By: benoitsteiner

Differential Revision: D15010015

fbshipit-source-id: 4b29de21de60f8e6a2db75309809a4e619c92532
2019-04-22 10:46:08 -07:00
Jiyan Yang
deadf3ba89 Add assertion to make sure init op is always fp16 compatible in fp16 training
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18498

Reviewed By: kennyhorror

Differential Revision: D14626755

fbshipit-source-id: d8a0b3c02920ab3835911a21bf05e8956853fcd7
2019-04-21 23:43:13 -07:00
Gu, Jinghui
c96c91da22 Improve optimizations for DNNLOWP support on MKL-DNN (#18843)
Summary:
In this PR, the fusion algorithms are improved to support DNNLOWP.
1. Enabled conv fusions for DNNLOWP.
2. Fused the order-switch op into the following quantize op.
3. Improved conv+sum fusion to parse a larger scope/window.
4. Reorganized the fusion code to fix a random crash caused by mutating the graph.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18843

Differential Revision: D15021030

Pulled By: yinghai

fbshipit-source-id: 88d2199d9fc69f392de9bfbe1f291e0ebf78ab08
2019-04-20 02:12:06 -07:00
Sam Leeman-Munk
9f4f7e1621 Support compilation on gcc-7.4.0 (#19470)
Summary:
There are two corrections in this pull request.
The first is specific to gcc-7.4.0: compiled with -std=c++14, gcc-7.4.0 reports __cplusplus = 201402L.
This does not satisfy the check in Deprecated.h, which asks for >201402L.
The compiler falls through to the __GNUC__ check, which passes and sets C10_DEPRECATED_MESSAGE to a value that C++14 does not appear to support or even recognize, leading to a compile-time error.
My recommended solution, which worked for my case, was to change the > into a >=, as sketched below.
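A sketch of the corrected guard (the branches are paraphrased from the description above, not copied from Deprecated.h):

```cpp
// `>=` accepts the exact C++14 value 201402L that gcc-7.4.0 reports,
// instead of falling through to the __GNUC__ branch below.
#if __cplusplus >= 201402L
#define C10_DEPRECATED_MESSAGE(message) [[deprecated(message)]]
#elif defined(__GNUC__)
#define C10_DEPRECATED_MESSAGE(message) __attribute__((deprecated(message)))
#endif
```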

The second correction comes in response to this error:
caffe2/operators/crash_op.cc: In member function ‘virtual bool caffe2::CrashOp::RunOnDevice()’:
caffe2/operators/crash_op.cc:14:11: error: ‘SIGABRT’ was not declared in this scope

I am merely committing to the repository the solution suggested here (which worked for me):
https://discuss.pytorch.org/t/building-pytorch-from-source-in-conda-fails-in-pytorch-caffe2-operators-crash-op-cc/42859
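Presumably the fix amounts to making SIGABRT visible in that translation unit:

```cpp
#include <csignal>  // declares SIGABRT, used by caffe2::CrashOp::RunOnDevice
```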
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19470

Differential Revision: D15019529

Pulled By: ailzhang

fbshipit-source-id: 9ce9d713c860ee5fd4266e5c2a7f336a97d7a90d
2019-04-19 21:41:36 -07:00
James Reed
d17c22d024 Improve embedding_bag add kernel (#19329)
Summary:
This was actually getting pretty poor throughput with respect to memory bandwidth. I used this test to measure the memory bandwidth specifically for the AXPY call: https://gist.github.com/jamesr66a/b27ff9ecbe036eed5ec310c0a3cc53c5

And I got ~8 GB/s before this change, but ~14 GB/s after this change.

This seems to speed up the operator overall by around 1.3x (benchmark: https://gist.github.com/jamesr66a/c533817c334d0be432720ef5e54a4166):

== Before ==

time_per_iter 0.0001298875093460083
GB/s 3.082544287868467

== After ==

time_per_iter 0.00010104801654815674
GB/s 3.9623142905451076

The large difference between the local BW increase and the full-op BW increase likely indicates significant time is being spent elsewhere in the op, so I will investigate that.

EDIT: I updated this PR to include a call into caffe2/perfkernels. This is the progression:

Before

time_per_iter 8.983819484710693e-05
GB/s 4.456723564864611

After no axpy
time_per_iter 7.19951868057251e-05
GB/s 5.56126065872172

After perfkernels
time_per_iter 5.6699180603027346e-05
GB/s 7.061548257694262

After perfkernels no grad
time_per_iter 4.388842582702637e-05
GB/s 9.122769670026413
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19329

Reviewed By: dzhulgakov

Differential Revision: D14969630

Pulled By: jamesr66a

fbshipit-source-id: 42d1015772c87bedd119e33c0aa2c8105160a738
2019-04-19 19:16:24 -07:00
Xiaomeng Yang
f5fe7aa0b2 Fix relu bug for empty tensor (#19451)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19451

Fix relu bug for empty tensor

Reviewed By: xianjiec

Differential Revision: D15009811

fbshipit-source-id: b75e567c3bec08d7d12b950d8f1380c50c138704
2019-04-19 15:21:07 -07:00
Yinghai Lu
b85edac16f Fix out-of-topological-order issue in Nomnigraph (#19458)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19458

The algorithm in https://fburl.com/ggh9iyvc fails to truly ensure a topological ordering of the nodes. The fix is ugly but effective; I think we need a real topological sort to fix this issue more cleanly (see the sketch below). cc Mikhail Zolotukhin, Bram Wasti.
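For reference, a "real topological sort" in the Kahn style the author alludes to; the adjacency-list representation is an assumption for illustration:

```cpp
#include <queue>
#include <vector>

// Kahn's algorithm: repeatedly emit nodes with zero remaining in-degree.
std::vector<int> TopoSort(const std::vector<std::vector<int>>& adj) {
  const int n = static_cast<int>(adj.size());
  std::vector<int> indegree(n, 0), order;
  for (const auto& outs : adj)
    for (int v : outs) ++indegree[v];
  std::queue<int> ready;
  for (int v = 0; v < n; ++v)
    if (indegree[v] == 0) ready.push(v);
  while (!ready.empty()) {
    int u = ready.front(); ready.pop();
    order.push_back(u);
    for (int v : adj[u])
      if (--indegree[v] == 0) ready.push(v);
  }
  return order;  // order.size() < n means the graph has a cycle
}
```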

Differential Revision: D15011893

fbshipit-source-id: 130c3aa442f5d578adfb14fbe5f16aa722434942
2019-04-19 12:19:39 -07:00
Sebastian Messmer
17f05ad5e5 Moving at::Tensor into caffe2::Tensor without bumping refcount (#19388)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19388

The old implementation forced a refcount bump when converting at::Tensor to caffe2::Tensor.
Now, it is possible to move it without a refcount bump.
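A simplified illustration of the copy/move distinction, with std::shared_ptr standing in for the intrusive pointer actually used (names are illustrative, not the real caffe2::Tensor API):

```cpp
#include <memory>
#include <utility>

struct TensorImpl { /* shared tensor state */ };

struct Tensor {
  std::shared_ptr<TensorImpl> impl_;
  // Copy conversion: bumps the refcount (an atomic increment).
  explicit Tensor(const std::shared_ptr<TensorImpl>& impl) : impl_(impl) {}
  // Move conversion: steals the pointer; no refcount traffic at all.
  explicit Tensor(std::shared_ptr<TensorImpl>&& impl) : impl_(std::move(impl)) {}
};
```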

Reviewed By: dzhulgakov

Differential Revision: D14986815

fbshipit-source-id: 92b4b0a6f323ed38376ffad75f960cad250ecd9b
2019-04-18 14:13:26 -07:00
Sebastian Messmer
601f36bacc Use string based schema for exposing caffe2 ops (#19287)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19287

Since we now have a string-schema-based op registration API, we can also use it when exposing caffe2 operators.

Reviewed By: dzhulgakov

Differential Revision: D14931925

fbshipit-source-id: ec162469d2d94965e8c99d431c801ae7c43849c8
2019-04-18 02:04:50 -07:00
Jiyan Yang
c48e1679f9 Add validator for optimizers when parameters are shared
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18497

Reviewed By: kennyhorror

Differential Revision: D14614738

fbshipit-source-id: beddd8349827dcc8ccae36f21e5d29627056afcd
2019-04-17 21:10:38 -07:00
Yinghai Lu
5fa1aad670 Remove unused template parameter in OnnxifiOp (#19362)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19362

The `float` template parameter is never used in OnnxifiOp.

Reviewed By: bddppq

Differential Revision: D14977970

fbshipit-source-id: 8fee02659dbe408e5a3e0ff95d74c04836c5c281
2019-04-17 16:48:14 -07:00
Yinghai Lu
f1f31b634d Eliminate AdjustBatch ops (#19083)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19083

As we have discussed, there are too many AdjustBatch ops; they incur reallocation overhead and hurt performance. We eliminate these ops by
- inlining the input adjust-batch op into Glow
- inlining the output adjust-batch op into OnnxifiOp, and doing so only conditionally.

This is the C2 part of the change and requires a change on the Glow side to work end to end.

Reviewed By: rdzhabarov

Differential Revision: D14860582

fbshipit-source-id: ac2588b894bac25735babb62b1924acc559face6
2019-04-17 10:00:25 -07:00
Sebastian Messmer
db611b7caf Delete C10Tensor (#19328)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19328

Plans changed and we don't want this class anymore.

Reviewed By: dzhulgakov

Differential Revision: D14966746

fbshipit-source-id: 09ea4c95b352bc1a250834d32f35a94e401f2347
2019-04-17 00:02:27 -07:00
Jerry Zhang
ff0a7ae43f Testing for folded conv_bn_relu (#19298)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19298

Proper testing for conv_bn_relu folding

Differential Revision: D13998891

fbshipit-source-id: ceb58ccec19885cbbf38964ee0d0db070e098b4a
2019-04-16 19:04:06 -07:00
Mark Santaniello
20fc7b6ec7 Avoid undefined symbol error when building AdIndexer LTO (#19009)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19009

Move the definition of `MulFunctor<>::Backward()` into a header file.
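The reason this helps, with an illustrative (not actual) signature: a template member defined only in one .cc file is never instantiated for other translation units, and an LTO link then reports it as an undefined symbol. Defining it in the header lets every including TU instantiate it:

```cpp
// mul_functor.h (sketch): declaration and definition together.
template <class Context>
struct MulFunctor {
  template <typename T>
  void Backward(const T& dY, const T& A, const T& B, T* dA, T* dB) const {
    *dA = dY * B;  // d(A*B)/dA = B
    *dB = dY * A;  // d(A*B)/dB = A
  }
};
```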

Reviewed By: BIT-silence

Differential Revision: D14823230

fbshipit-source-id: 1efaec01863fcc02dcbe7e788d376e72f8564501
2019-04-15 23:43:13 -07:00
Summer Deng
84b264b17d Add NHWC order support in the cost inference function of 3d conv (#19170)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19170

As title
The quantized resnext3d model in production hit the following failure without the fix:

```
 Caffe2 operator Int8ConvRelu logging error: [enforce fail at conv_pool_op_base.h:463] order == StorageOrder::NCHW. 1 vs 2. Conv3D only supports NCHW on the production quantized model
```

Reviewed By: jspark1105

Differential Revision: D14894276

fbshipit-source-id: ef97772277f322ed45215e382c3b4a3702e47e59
2019-04-15 16:47:22 -07:00
Jongsoo Park
ffc9e29844 unit test with multiple op invocations (#19118)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19118

A bug introduced by D14700576 and reported by Yufei (fixed by D14778810 and D14785256) was not detected by our unit tests.
This diff improves the unit tests to catch such errors (with this diff and without D14778810, we can reproduce the bug Yufei reported).
The improvement also revealed a bug that affects accuracy when we pre-pack weight and bias together and the pre-packed weight/bias are used by multiple nets: we were modifying the pre-packed bias in place, even though it was supposed to be constant.

Reviewed By: csummersea

Differential Revision: D14806077

fbshipit-source-id: aa9049c74b6ea98d21fbd097de306447a662a46d
2019-04-15 14:41:28 -07:00
Gemfield
6ed57e052d Fix the return value of ParseFromString (#19262)
Summary:
Fix the return value of ParseFromString.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19262

Differential Revision: D14937605

Pulled By: ezyang

fbshipit-source-id: 3f441086517186a075efb3d74f09160463b696b3
2019-04-15 12:39:29 -07:00
Yinghai Lu
0e435afc3c Add more debugging helper to net transformer (#19176)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19176

Add some conveniences for debugging.

Reviewed By: llyfacebook

Differential Revision: D14901740

fbshipit-source-id: 2c4018fdbf7e3aba2a754b6b4103a72893c229c2
2019-04-12 14:28:37 -07:00
Huamin Li
c480798a1c use C10_REGISTER for GELU op
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19090

Reviewed By: BIT-silence

Differential Revision: D14864737

fbshipit-source-id: 8debd53171f7068726f0ab777a13ca46becbfbdf
2019-04-12 11:41:04 -07:00
Will Feng
c7b5a8a876 Change is_variable() to check existence of AutogradMeta, and remove is_variable_ (#19139)
Summary:
Currently, a TensorImpl's `is_variable_` is true if and only if the TensorImpl has AutogradMeta. This PR unifies these two concepts by removing `is_variable_` and changing `is_variable()` to check for the existence of AutogradMeta instead.

Removing `is_variable_` is part of the work in Variable/Tensor merge.
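A sketch of the unified check, using the names from the description above (`autograd_meta_` as the member is an assumption):

```cpp
#include <memory>

struct AutogradMeta { /* grad, grad_fn, hooks, ... */ };

struct TensorImpl {
  std::unique_ptr<AutogradMeta> autograd_meta_;  // replaces the is_variable_ flag
  // is_variable() is now derived from the presence of AutogradMeta.
  bool is_variable() const { return autograd_meta_ != nullptr; }
};
```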
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19139

Differential Revision: D14893339

Pulled By: yf225

fbshipit-source-id: ceb5e22c3c01f79b5d21d5bdbf4a7d1bc397796a
2019-04-11 14:03:33 -07:00
Gregory Chanan
b6ee83a5b4 Materialize a non-default device for C2 legacy storage. (#18605)
Summary:
It's not intended that Storages have 'default' CUDA devices, but this is allowable via the Storage::create_legacy codepath.

This also messes with device caching, because the initial cache is obtained from the Storage, which may have a 'default' device.

Instead, we materialize a device by allocating 0 bytes via the allocator.
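A sketch of the materialization trick, assuming the c10 Allocator interface (a zero-byte allocation is cheap but still reports a concrete device):

```cpp
#include <c10/core/Allocator.h>

c10::Device MaterializeDevice(c10::Allocator* allocator) {
  c10::DataPtr ptr = allocator->allocate(0);  // zero bytes, concrete device
  return ptr.device();
}
```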
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18605

Differential Revision: D14680620

Pulled By: gchanan

fbshipit-source-id: 6d43383d836e90beaf12bfe37c3f0506843f5432
2019-04-11 13:50:41 -07:00
Yinghai Lu
bbe648dffb Allow empty net type (#19154)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19154

I recently saw a weird workflow error due to an empty but set net_type. Maybe we should just fall back to the simple net in this case.
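The fallback as a one-liner sketch (NetDef's `type` field is real; the helper is illustrative):

```cpp
#include <string>
#include "caffe2/proto/caffe2_pb.h"  // caffe2::NetDef

// An empty-but-set net type behaves like an unset one: use the simple net.
std::string EffectiveNetType(const caffe2::NetDef& net_def) {
  return net_def.type().empty() ? "simple" : net_def.type();
}
```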

Reviewed By: dzhulgakov

Differential Revision: D14890072

fbshipit-source-id: 4e9edf8232298000713bebb0bfdec61e9c5df17d
2019-04-11 12:43:07 -07:00
Xing Wang
b6f130aa70 try to enable uncertainty for lr loss (#17236)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17236

Following the paper at https://papers.nips.cc/paper/7141-what-uncertainties-do-we-need-in-bayesian-deep-learning-for-computer-vision.pdf, approximate the classification case with the regression formulation. For the LRLoss, add a penalty based on the variance and regularize the variance with a tunable parameter lambda.
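A hedged sketch of the heteroscedastic weighting from the cited paper, with s = log(sigma^2) predicted per example and the commit's tunable lambda scaling the variance regularizer (the helper is illustrative):

```cpp
#include <cmath>

// total = exp(-s) * base_loss + lambda * s, where s = log(sigma^2).
float UncertaintyWeightedLoss(float base_loss, float s, float lambda) {
  return std::exp(-s) * base_loss + lambda * s;
}
```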

Reviewed By: chocjy

Differential Revision: D14077106

fbshipit-source-id: 4405d8995cebdc7275a0dd07857d32a8915d78ef
2019-04-11 07:35:19 -07:00
Xiaomeng Yang
821b5f138a Optimize SoftmaxOp on CPU (#18635)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18635

Optimize SoftmaxOp on CPU

Reviewed By: houseroad

Differential Revision: D14689516

fbshipit-source-id: d2dcee2476d1a3a21f428e99bce9835f1d229d64
2019-04-10 18:52:15 -07:00
Hao Lu
226a358136 Move ConcatBatchMatMulBatchGatherOp to OSS
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19059

Reviewed By: bwasti

Differential Revision: D14849735

fbshipit-source-id: fefd1887d38e51151c07a8b187e9c7c50ef02c6e
2019-04-10 15:29:03 -07:00