Commit Graph

2236 Commits

Author SHA1 Message Date
Jongsoo Park
d53012b4fe add NCHW2NHWC and NHWC2NCHW in utils.py (#15588)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15588

Use the NHWC2NCHW or NCHW2NHWC functions, which are easier to understand than code using transpose and generalize to non-2D convolutions.
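
A minimal sketch of the layout conversion described above, written as pure-Python axis permutations so it works for any number of spatial dimensions. The function names `nchw_to_nhwc_axes`, `nhwc_to_nchw_axes`, and `permute_shape` are hypothetical and are not the helpers added to utils.py; the returned tuples could be passed to an array library's transpose call.

```python
def nchw_to_nhwc_axes(ndim):
    """Axis order that moves the channel axis (index 1) to the last position."""
    if ndim < 2:
        raise ValueError("expected at least a batch axis and a channel axis")
    return (0,) + tuple(range(2, ndim)) + (1,)


def nhwc_to_nchw_axes(ndim):
    """Axis order that moves the last (channel) axis to index 1."""
    if ndim < 2:
        raise ValueError("expected at least a batch axis and a channel axis")
    return (0, ndim - 1) + tuple(range(1, ndim - 1))


def permute_shape(shape, axes):
    """Shape after a transpose: output axis i takes input axis axes[i]."""
    return tuple(shape[a] for a in axes)


# 2D convolution layout (N, C, H, W) and a 3D one (N, C, D, H, W).
print(permute_shape((8, 3, 32, 32), nchw_to_nhwc_axes(4)))
print(permute_shape((8, 3, 4, 32, 32), nchw_to_nhwc_axes(5)))
```

Because the permutation is derived from the rank rather than hard-coded, the same two functions cover 1D, 2D, and 3D convolution layouts.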

Reviewed By: csummersea

Differential Revision: D13557674

fbshipit-source-id: c4fdb8850503ea58f6b17b188513ae2b29691ec0
2018-12-28 17:34:50 -08:00
Jongsoo Park
4e4ef0cffb add rowwise adagrad lp test (#15082)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15082

We didn't have a unit test for low-precision rowwise Adagrad.

Reviewed By: chocjy

Differential Revision: D13300732

fbshipit-source-id: 46e7bdfc82c5a6855eeb6f653c0a96b0b3a20546
2018-12-22 10:25:39 -08:00
Jongsoo Park
e012b183dd handle empty inputs to SparseLengthsMean correctly (#15389)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15389

SparseLengthsMean was generating uninitialized data for empty inputs (lengths == 0). We should return zeros.
The unit tests also did not cover this special case; this diff adds that coverage.
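
A minimal reference sketch of the intended behavior, assuming the usual segment layout (a flat list of row indices plus a list of segment lengths). The function `sparse_lengths_mean` is a hypothetical pure-Python stand-in, not the operator's implementation; it only illustrates that a segment of length 0 should yield zeros.

```python
def sparse_lengths_mean(data, indices, lengths):
    """Mean of the selected rows per segment; empty segments produce zeros.

    data:    list of rows, each a list of floats of equal width
    indices: flat list of row ids into data
    lengths: number of indices consumed by each output segment
    """
    if sum(lengths) != len(indices):
        raise ValueError("sum(lengths) must equal len(indices)")
    width = len(data[0]) if data else 0
    out = []
    pos = 0
    for n in lengths:
        # Start from zeros so an empty segment is well defined.
        acc = [0.0] * width
        for idx in indices[pos:pos + n]:
            for j, value in enumerate(data[idx]):
                acc[j] += value
        if n > 0:
            acc = [value / n for value in acc]
        out.append(acc)
        pos += n
    return out


rows = [[1.0, 2.0], [3.0, 4.0]]
# Second segment has length 0 and should come back as zeros.
print(sparse_lengths_mean(rows, [0, 1], [2, 0]))
```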

Reviewed By: salexspb

Differential Revision: D13515970

fbshipit-source-id: 3c35265638f64f13f0262cee930c94f8628005da
2018-12-21 22:20:14 -08:00
Bram Wasti
235d47760b Relax check on outputs (#15458)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15458

Many nets in the wild seem to have outputs that are never produced by the net.

Reviewed By: ZolotukhinM

Differential Revision: D13534185

fbshipit-source-id: 2b23b39c28404c53f68868f3bf6df53c5fea9eab
2018-12-21 14:19:37 -08:00
Bram Wasti
ac506f5820 Back out "[nomnigraph][executor] computeChains with nomnigraph" (#15451)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15451

Original commit changeset: ccd050bfead6

Reviewed By: ilia-cher

Differential Revision: D13533161

fbshipit-source-id: 1d0dcd54c2e3875aab015f3e996693e67a449b87
2018-12-21 11:09:27 -08:00
Jongsoo Park
f52f68bcf9 format specialized_segment_ops_test.py to prepare D13515970 (#15408)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15408

Applied formatting to specialized_segment_ops_test.py to prepare D13515970

Reviewed By: salexspb

Differential Revision: D13520300

fbshipit-source-id: c3250b6abe8087c607f65ae60d1da61bd46c342b
2018-12-20 23:44:47 -08:00
Yinghai Lu
cb79e1b3a5 Clean up onnxifi transformation code (#15453)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15453

Just move things around to facilitate further development. No logic change.

Reviewed By: rdzhabarov

Differential Revision: D13533959

fbshipit-source-id: eebab1306939e802aacffb24a711d372fd67916c
2018-12-20 22:06:47 -08:00
Edward Yang
26b04523b1 Record Caffe2's current stream ID in c10_cuda. (#15174)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15174

Previously, Caffe2 maintained a separate per-thread per-device
current logical CUDA stream ID.  In this PR, we switch Caffe2 over
to using c10::Stream to manage the current stream, and also
manage the allocation of cudaStream_t objects.

This results in a slight behavior change: previously, Caffe2
would have been willing to allocate an arbitrary number of
CUDA streams, depending on how high the logical stream IDs
went.  The c10::Stream pool has a fixed number of streams; once
you exceed it, the pool wraps around.
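
The wrap-around behavior can be illustrated with a small sketch. The class `StreamPool`, its size, and the modulo mapping are assumptions made for illustration; the actual c10 pool size and allocation policy are not stated in this message.

```python
class StreamPool:
    """Hypothetical fixed-size pool: logical IDs beyond the size reuse streams."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("pool size must be positive")
        # Placeholder objects standing in for real stream handles.
        self._streams = ["stream-%d" % i for i in range(size)]

    def get(self, logical_id):
        # Wrap around instead of allocating a new stream for a high ID.
        return self._streams[logical_id % len(self._streams)]


pool = StreamPool(4)
for logical_id in (0, 3, 4, 5):
    print(logical_id, pool.get(logical_id))
```

Under this scheme two different logical IDs can share one underlying stream, which is the behavior change the message warns about.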

Reviewed By: dzhulgakov

Differential Revision: D13451550

fbshipit-source-id: da6cf33ee026932a2d873835f6e090f7b8a7d8dc
2018-12-20 21:54:05 -08:00
Bram Wasti
055de167d5 computeChains with nomnigraph (#15366)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15366

Swap the old implementation for one that is slightly easier to understand.

I ran the tests and compared the number of chains against the old algorithm.  This one produces fewer chains on every test, but we have yet to see if that impacts performance at all.

old chain 34 nomnigraph chain 25
old chain 46 nomnigraph chain 34
old chain 228 nomnigraph chain 188
old chain 397 nomnigraph chain 338
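
As a quick way to read the counts above, the relative reduction per test can be computed as follows. The pairs are copied from the list above; the helper `relative_reduction` is a hypothetical name introduced only for this illustration.

```python
# (old chain count, nomnigraph chain count) pairs from the list above.
chain_counts = [(34, 25), (46, 34), (228, 188), (397, 338)]


def relative_reduction(old, new):
    """Fraction of chains removed relative to the old count."""
    return (old - new) / old


for old, new in chain_counts:
    pct = 100 * relative_reduction(old, new)
    print("old %d -> nomnigraph %d: %.1f%% fewer chains" % (old, new, pct))
```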

Reviewed By: ilia-cher

Differential Revision: D13057451

fbshipit-source-id: ccd050bfead6eb94ab9c7b0a70b09a22c2b9e499
2018-12-19 15:04:23 -08:00
Bill Li
3681bf7cff add dense vector to id_list operator (#15090)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15090

as title
step 2 of the linked task

Reviewed By: ellie-wen

Differential Revision: D13425977

fbshipit-source-id: f3538ed68f42470ba39c5b779af764d4a5591a9d
2018-12-18 16:27:38 -08:00
Tristan Rice
e650a84872 caffe2/python/task: added __repr__ methods to all task definitions (#15250)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15250

This adds `__repr__` methods to all of the classes under task.py. This makes the objects much easier to interact with when using them in an interactive manner, such as in a Jupyter notebook.

The default `__repr__` method just returns the class name and the object's address, which is very unhelpful.
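
A minimal sketch of the idea, using a hypothetical `Task` class with made-up fields; it is not one of the classes in task.py and only shows how a custom `__repr__` changes what an interactive session displays.

```python
class Task:
    """Hypothetical task definition with a readable __repr__."""

    def __init__(self, name, steps):
        self.name = name
        self.steps = list(steps)

    def __repr__(self):
        # Show the fields a reader needs, instead of the default address form.
        return "Task(name={!r}, steps={!r})".format(self.name, self.steps)


task = Task("train", ["load", "fit"])
print(repr(task))
```

In a notebook, evaluating `task` in a cell would then show the name and steps rather than an opaque object reference.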

Reviewed By: hanli0612

Differential Revision: D13475758

fbshipit-source-id: 6e1b166ec35163b9776c797b6a2e0d002560cd29
2018-12-17 16:02:16 -08:00
peter
216ab259fb Fix the missing caffe2 proto files for Windows (#15157)
Summary:
Fixes #15156
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15157

Differential Revision: D13490420

Pulled By: orionr

fbshipit-source-id: 4387d707f634a5975238af915b1befb2277f8ec7
2018-12-17 15:21:47 -08:00
rohithkrn
763b9954f3 FP16MomentumSGDUpdate Op fix and enable for ROCm (#15150)
Summary:
1. Fix a bug in FP16MomentumSGDUpdate operator
2. Enable operator for ROCm
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15150

Differential Revision: D13473145

Pulled By: bddppq

fbshipit-source-id: 4c5c5f30cb9bba658e3639dbe193fa08a304d306
2018-12-14 16:33:45 -08:00
Alexander Sidorov
e596d23137 Start unittesting our main observer (#15191)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15191

OSS:

Just splitting out the basic flags from a unit test so I can extend them in another test that needs additional flags.

Reviewed By: yinghai

Differential Revision: D13159184

fbshipit-source-id: 9823e792cf0ed8d0379235c44564862b7d784845
2018-12-14 16:24:38 -08:00
bddppq
855d9e1f19 Run ONNX cuda backend test cases via ROCm
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15069

Differential Revision: D13427757

Pulled By: bddppq

fbshipit-source-id: ba0273d75986cd5b146f7041a83c63ddf9c6c0cf
2018-12-13 15:10:00 -08:00
Xianjie Chen
fabd23cb2d support casting to string (#15110)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15110

support casting to string on CPU

Reviewed By: intermilan

Differential Revision: D13429381

fbshipit-source-id: b737a1ba1237b10f692d5c42b42a544b94ba9fd1
2018-12-12 21:33:58 -08:00
Cheng,Penghui
1717ea1da0 Implementation of ChannelShuffle Op for MKLDNN (#15106)
Summary:
The speed-up of a single operation is up to 3X.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15106

Differential Revision: D13429596

Pulled By: bddppq

fbshipit-source-id: f8d987cafeac9bef9c3daf7e43ede8c6a4ee2ce5
2018-12-12 20:25:12 -08:00
Brett Koonce
d8260239a0 docs: minor spelling tweaks
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15148

Differential Revision: D13443708

Pulled By: suo

fbshipit-source-id: 5e3ec0afd3416ab8ce207f2d04105c49e1c04611
2018-12-12 18:17:14 -08:00
Jerry Zhang
63e77ab6c4 Move numa.{h, cc} to c10/util (#15024)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15024

Pull Request resolved: https://github.com/pytorch/pytorch/pull/14393

att

Reviewed By: dzhulgakov

Differential Revision: D13380559

fbshipit-source-id: abc3fc7321cf37323f756dfd614c7b41978734e4
2018-12-12 12:21:10 -08:00
Zhiping Xiu
1423c0d9f1 Add EmptyNameScope to allow you jump out from current scope. (#14631)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14631

Adding an empty name scope to allow people to jump out of the current namescope.

This could be useful when you want to access blob from parent or sibling scope.
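
A self-contained sketch of how such a scope could behave. The helpers `name_scope`, `empty_name_scope`, `current_name_scope`, and `scoped_name` are hypothetical stand-ins written for this illustration and are not the Caffe2 scope API; they only model a prefix that is temporarily cleared and then restored.

```python
import contextlib
import threading

# Per-thread name prefix, standing in for the framework's scope state.
_state = threading.local()


def current_name_scope():
    return getattr(_state, "prefix", "")


@contextlib.contextmanager
def name_scope(prefix):
    """Append a prefix to the current scope for the duration of the block."""
    old = current_name_scope()
    _state.prefix = old + prefix + "/"
    try:
        yield
    finally:
        _state.prefix = old


@contextlib.contextmanager
def empty_name_scope():
    """Temporarily reset the scope to the root, then restore it."""
    old = current_name_scope()
    _state.prefix = ""
    try:
        yield
    finally:
        _state.prefix = old


def scoped_name(blob):
    return current_name_scope() + blob


with name_scope("gpu_0"):
    print(scoped_name("w"))
    with empty_name_scope():
        # Refers to a blob in the root scope from inside the device scope.
        print(scoped_name("lr"))
    print(scoped_name("b"))
```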

 Facebook:

E.g., we encountered a potential use case in D13124249 (it's a large diff; please search for EmptyNameScope in that diff): we need to access a blob declared in the root namescope from a device namescope (the device namescope is used by the parallel_GPU API). `EmptyNameScope` lets us do that with ease.

I referred to `EmptyDeviceScope` (D6103412) while implementing this one.

Reviewed By: yinghai

Differential Revision: D13272240

fbshipit-source-id: d4cde5abcc2336e456b6c6ef086266ef94d86da8
2018-12-12 01:39:50 -08:00
bddppq
479481b6cb Remove linker and dlopen flags that allowed undefined symbols in rocm build (#15091)
Summary:
Previously the undefined symbols were caused by disabled_modules in tools/amd_build/disabled_features.json (now it's cleared).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15091

Differential Revision: D13429595

Pulled By: bddppq

fbshipit-source-id: b341e83f9e5a8d16440a364e837b045a8a4fd6e1
2018-12-11 23:23:47 -08:00
Daniel Ingram
5c2c40ad87 Add error type to raise statement
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15039

Differential Revision: D13419566

Pulled By: zou3519

fbshipit-source-id: f67a3aebce937e3e640e91e81eb3e184cfdf269c
2018-12-11 17:41:44 -08:00
Zachary DeVito
92314c83fa re-enable copy of python files, but be careful that the copy is only … (#14982)
Summary:
…done once

This allows no-op builds to work correctly even when BUILD_CAFFE2_OPS is on.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14982

Differential Revision: D13413960

Pulled By: zdevito

fbshipit-source-id: 6e5412a8c375af8a47c76f548cdd31cff15f3853
2018-12-11 16:54:08 -08:00
TerryTsao
c2a754c58b Fix CMakeLists.txt for Int8 python bindings (#15047)
Summary:
Currently in caffe2, one cannot properly fetch the content of Int8 blobs.

Upon digging into the source code, it turns out that the relevant source code is not being compiled. Adding the source to CMakeLists.txt fixes this issue.

First time ever doing a pull request. Please let me know if there's any rule I should follow. Thanks.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15047

Differential Revision: D13417583

Pulled By: bddppq

fbshipit-source-id: dd39575971a3012635edbf97a045d80e4b62a8eb
2018-12-11 10:48:47 -08:00
Jongsoo Park
cff509e2b1 share code between adagrad and rowwise adagrad tests (#14692)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14692

Remove some code duplication

Reviewed By: chocjy

Differential Revision: D13296731

fbshipit-source-id: 5924e037ca64fc4b89234be922bc5ca47fb8bd32
2018-12-10 22:10:39 -08:00
bddppq
45dfc6764e Enable more caffe2 fp16 rocm tests (#15040)
Summary:
cc rohithkrn petrex
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15040

Reviewed By: houseroad

Differential Revision: D13413068

Pulled By: bddppq

fbshipit-source-id: b2967f16f8da0b9e80083138fb8632c14e9e9b63
2018-12-10 21:30:21 -08:00
Ilia Cherniavskii
e9cd781681 Back out "Revert D13043261: [caffe2] Task graph and task future abstractions in executor"
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15030

Reviewed By: bddppq

Differential Revision: D13408998

fbshipit-source-id: 9eb675e09fbc4829eab34df7aa660a0590816feb
2018-12-10 19:30:58 -08:00
rohithkrn
7e2b074219 Integrate rocBLAS fp16 api into Caffe2 (#14882)
Summary:
This PR integrates rocBLAS half and mixed precision APIs into Caffe2.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14882

Differential Revision: D13407840

Pulled By: bddppq

fbshipit-source-id: 75cb0d74da066776fa66575f1d255e879d36121e
2018-12-10 17:54:06 -08:00
Junjie Bai
4a145cd95c Revert D13043261: [caffe2] Task graph and task future abstractions in executor
Differential Revision:
D13043261

Original commit changeset: d89424354aea

fbshipit-source-id: b307e3281c4d83b60ba2bfadcbcf69afb7a41412
2018-12-10 16:03:59 -08:00
Ilia Cherniavskii
029600813e Task graph and task future abstractions in executor
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14116

Reviewed By: dmudiger

Differential Revision: D13043261

fbshipit-source-id: d89424354aea14d1d14eb8320fb3aa34908a4e81
2018-12-10 14:28:56 -08:00
Jerry Zhang
a51fe386c8 caffe2/caffe2/contrib/script (#15007)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15007

Pull Request resolved: https://github.com/pytorch/pytorch/pull/14979

att

Reviewed By: dzhulgakov

Differential Revision: D13286191

fbshipit-source-id: b8a6bc7aea44487aea4dcf7f44c858fd30c6293c
2018-12-10 14:23:31 -08:00
Yiming Wu
a1494efdfa fix auto grad summing for IfOp where intermediate output needs renaming (#14772)
Summary:
Fix auto grad summing for IfOp where an intermediate output needs renaming.

Bug before this diff:
- we only renamed the output of IfOp without changing the outputs of the subnet ops
- this resulted in a blob-not-found error

The unit test provides an example.
This diff fixes that for IfOp.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14772

Differential Revision: D13327090

Pulled By: harouwu

fbshipit-source-id: ec40ee88526ace3619c54551e223dd71158a02f8
2018-12-09 08:26:46 -08:00
Your Name
5e06fa0baf ONNX changes to use int32_t (instead of enum) to store data type
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14926

Reviewed By: houseroad

Differential Revision: D13390642

Pulled By: bddppq

fbshipit-source-id: c2314b24d9384f188fda2b9a5cc16465ad39581e
2018-12-08 01:06:08 -08:00
Lu Fang
5be28ade66 Automatic update of fbcode/onnx to aca8473a40cf43f01958c81b648efcee7f3a755a (#14865)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14865

Previous import was 42804705bdbf179d1a98394008417e1392013547

Included changes:
- **[aca8473](https://github.com/onnx/onnx/commit/aca8473)**: Add Erf operator for computing error function (#1675) <bddppq>
- **[3fc82ca](https://github.com/onnx/onnx/commit/3fc82ca)**: Add IsNaN operator. (#1656) <Pranav Sharma>
- **[0685f01](https://github.com/onnx/onnx/commit/0685f01)**: Add Sign Op (#1658) <Rui Zhu>
- **[2a8fae8](https://github.com/onnx/onnx/commit/2a8fae8)**: Fix unused var warning (#1669) <Yinghai Lu>
- **[e212833](https://github.com/onnx/onnx/commit/e212833)**: Update scan (#1653) <G. Ramalingam>

Reviewed By: zrphercule

Differential Revision: D13370727

fbshipit-source-id: 13a93d5acc8d4758f682278ea162ec9124ced22d
2018-12-07 17:37:42 -08:00
rohithkrn
11a9248d01 Enable fp16 for MIOPEN operators in Caffe2 (#14905)
Summary:
This PR enables fp16 MIOPEN operators in Caffe2.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14905

Differential Revision: D13383439

Pulled By: bddppq

fbshipit-source-id: 840afa8d08bef2952ca0039dee2423f1542bb330
2018-12-07 17:26:44 -08:00
Sergei Nikolaev
a0ee3a279c USE_TENSORRT support and TensorRT 5 compatibility
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13945

Differential Revision: D13317525

Pulled By: yinghai

fbshipit-source-id: 8630dfec1bbc5aac19539e344e7c38a7fd8b051d
2018-12-07 14:01:11 -08:00
Orion Reblitz-Richardson
febc7ff99f Add __init__.py so files get picked up on install (#14898)
Summary:
This will let us install tests and other Caffe2 python code as a part of running Caffe2 tests in PyTorch.

Broken out of https://github.com/pytorch/pytorch/pull/13733/

cc pjh5 yf225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14898

Reviewed By: pjh5

Differential Revision: D13381123

Pulled By: orionr

fbshipit-source-id: 0ec96629b0570f6cc2abb1d1d6fce084e7464dbe
2018-12-07 13:40:23 -08:00
PenghuiCheng
939877bf4b Implementation of WeightedSum op for mkl-dnn and fix FC op output shape issue.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14407

Reviewed By: yinghai

Differential Revision: D13364364

Pulled By: wesolwsk

fbshipit-source-id: e69bcd1bc52e35b2f0e45e5dc40184f1bd66605d
2018-12-07 12:35:19 -08:00
Yudong Guang
265b55d028 Revert D13205604: Move numa.{h, cc} to c10/util
Differential Revision:
D13205604

Original commit changeset: 54166492d318

fbshipit-source-id: 89b6833518c0b554668c88ae38d97fbc47e2de17
2018-12-07 10:01:25 -08:00
Jerry Zhang
1d111853ae Move numa.{h, cc} to c10/util (#14393)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14393

att

Reviewed By: ezyang

Differential Revision: D13205604

fbshipit-source-id: 54166492d31827b0343ed070cc36a825dd86e2ed
2018-12-06 11:30:13 -08:00
lcskrishna
12addc64a6 Fixed MIOpen RNN Segfault issue and enabled RNN test (#14810)
Summary:
This pull request:
1. Adds the MIOpen RNN APIs miopenGetRNNLayerBiasSize and miopenGetRNNLayerParamSize.
2. Fixes usage of the miopenGetRNNLayerParam API.
3. Modifies the RNN test to run using the MIOpen engine.

Differential Revision: D13355699

Pulled By: bddppq

fbshipit-source-id: 6f750657f8049c5446eca893880b397804120b69
2018-12-05 23:54:31 -08:00
Huan Gui
ba287eebca Fix clip gradient with empty input (#14709)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14709

As titled

Reviewed By: Wakeupbuddy

Differential Revision: D13305554

fbshipit-source-id: 380062d4b0e4f9dc0207a27766cac7b8d05384d5
2018-12-05 22:53:25 -08:00
Jerry Zhang
a597c0ca05 Add inplace FeedTensor for python frontend (#14512)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14512

att

Reviewed By: dzhulgakov

Differential Revision: D13243278

fbshipit-source-id: 78af417d0fcd9b9791ee839d62095903e49205cb
2018-12-04 12:45:11 -08:00
Michael Antonov
773f4d8081 Implements Gather operator for arbitrary axis, sharing the code with BatchGather. (#13756)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13756

This implements general Gather operator for arbitrary axis, sharing the code with BatchGather.
 - CPU gather & batch gather logic is now shared through caffe2::gather_helper, for any axis.
 - Shared CUDA kernel moved to gather_op.cuh, for any axis.
 - Gradients of axis > 0 delegate to BatchGatherGradientOp which now has axis argument.
 - BatchGatherOp doc strings updated to have correct rank (q + (r -1)) and output.
 - Added tests for axis == 2.

GatherOp supports index wrapping for axis == 0 by default, which was added earlier for ONNX.
This diff also extends it to work in the CUDA kernel. Added a "wrap_indices" argument which specifies
whether this wrapping should be done; set it to true if you'd like wrapping for any axis.

TBD: Update gradients to support negative indices (separate diff).
TBD: Once we have operator versioning, we'd like to update GatherOp to NOT support axis 0 wrapping
by default, but rather do it only if wrap_indices is set.
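
A minimal sketch of gathering along an arbitrary axis with optional index wrapping, written over nested Python lists. The function `gather` is a hypothetical illustration, not the operator's code. Unlike the operator described above, this sketch does not wrap for axis 0 by default: negative indices are wrapped only when `wrap_indices` is set.

```python
def gather(data, indices, axis=0, wrap_indices=False):
    """Select entries of a nested list along the given axis.

    indices is a flat list of integers. With wrap_indices=True, a negative
    index i refers to position size + i along the gathered axis.
    """
    if axis > 0:
        # Recurse into each sub-list until the target axis is the outer one.
        return [gather(sub, indices, axis - 1, wrap_indices) for sub in data]
    size = len(data)
    out = []
    for i in indices:
        if wrap_indices and i < 0:
            i += size
        if not 0 <= i < size:
            raise IndexError("index %d out of range for size %d" % (i, size))
        out.append(data[i])
    return out


table = [[1, 2, 3], [4, 5, 6]]
print(gather(table, [1, 0]))                           # rows, axis 0
print(gather(table, [2, 0], axis=1))                   # columns, axis 1
print(gather(table, [-1], axis=1, wrap_indices=True))  # wrapped negative index
```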

Reviewed By: dzhulgakov

Differential Revision: D12983815

fbshipit-source-id: 8add9d67b47fe8c5ba7a335f581ca0530b205cd7
2018-12-04 11:54:28 -08:00
Lu Fang
44894915d6 Automatic update of fbcode/onnx to 6b34743d2e361bbc0acb29dd73536478cb92562e (#14637)
Summary:
Previous import was f461f7aad9987635b4aff108620ed7918f002d19

Included changes:
- **[6b34743](https://github.com/onnx/onnx/commit/6b34743)**: fix the const map initializatoin (#1662) <Lu Fang>
- **[ae80999](https://github.com/onnx/onnx/commit/ae80999)**: Fuse Pad into Conv optimizer (#1580) <vloncar>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14637

Differential Revision: D13281338

Pulled By: houseroad

fbshipit-source-id: c31429914bf5954fdc85e0c02168836ef47d635c
2018-12-03 20:11:17 -08:00
Yan Zhu
aeb38cfcea cuda implementation for PackSegment to support presence mask (#14635)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14635

as title

Reviewed By: enosair

Differential Revision: D13254097

fbshipit-source-id: b9f40109e2889907c925f9a4df9da14f67f45f38
2018-11-30 16:54:10 -08:00
Lu Fang
2752ad8045 Automatic update of fbcode/onnx to f461f7aad9987635b4aff108620ed7918f002d19 (#14568)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14568

Previous import was 882c5283c54345d131e8fe5c859e4844dcf7ca8e

Included changes:
- **[f461f7a](https://github.com/onnx/onnx/commit/f461f7a)**: Show the op's type and name when the shape inference is failed. (#1623) <Jerry>
- **[ab8aaf9](https://github.com/onnx/onnx/commit/ab8aaf9)**: Add scan test case (#1586) <G. Ramalingam>
- **[c95357e](https://github.com/onnx/onnx/commit/c95357e)**: link the tutorial (#1650) <Lu Fang>
- **[d7e2420](https://github.com/onnx/onnx/commit/d7e2420)**: Upgrade label encoder to support more input types (#1596) <Wei-Sheng Chin>
- **[6425108](https://github.com/onnx/onnx/commit/6425108)**: Add Doc about Adding New Operator into ONNX (#1647) <Lu Fang>
- **[295889c](https://github.com/onnx/onnx/commit/295889c)**: use an empty initializer to create map (#1643) <Lu Fang>
- **[e38f3ec](https://github.com/onnx/onnx/commit/e38f3ec)**: Remove redundant const (#1639) <daquexian>
- **[ea694bf](https://github.com/onnx/onnx/commit/ea694bf)**: implement fuse reduce->unsqueeze + fix assumption in nop_dropout pass (#1565) <Armen>
- **[6db386e](https://github.com/onnx/onnx/commit/6db386e)**: make output shape clear enough for Softmax family (#1634) <Lu Fang>
- **[2b67c6e](https://github.com/onnx/onnx/commit/2b67c6e)**: fix batchnorm doc (#1633) <Lu Fang>
- **[c901784](https://github.com/onnx/onnx/commit/c901784)**: remove inappropriate consts (#1632) <Lu Fang>
- **[de82119](https://github.com/onnx/onnx/commit/de82119)**: Shape inference fix for broadcast, concat and scan (#1594) <KeDengMS>
- **[d7ffe3b](https://github.com/onnx/onnx/commit/d7ffe3b)**: Update Optimizer Docs (#1607) <Armen>
- **[d09d139](https://github.com/onnx/onnx/commit/d09d139)**: mark PROTOBUF_INCLUDE_DIRS as BUILD_INTERFACE (#1466) <Yuta Okamoto>
- **[eb4b7c2](https://github.com/onnx/onnx/commit/eb4b7c2)**: allow variadic parameters of different types (#1615) <G. Ramalingam>
- **[4166246](https://github.com/onnx/onnx/commit/4166246)**: Fix onnxifi test (#1617) <Yinghai Lu>
- **[6706a4d](https://github.com/onnx/onnx/commit/6706a4d)**: Fix a bug in vector address access (#1598) <Raymond Yang>
- **[ae39866](https://github.com/onnx/onnx/commit/ae39866)**: Separate types of inputs 1 and 2 in OneHot op. (#1610) <Spandan Tiwari>
- **[45ba661](https://github.com/onnx/onnx/commit/45ba661)**: Handle new types in the switch. (#1608) <Dmitri Smirnov>
- **[14853b6](https://github.com/onnx/onnx/commit/14853b6)**: Bump docker image version to 230 used in CircleCI (#1606) <bddppq>
- **[e0993b8](https://github.com/onnx/onnx/commit/e0993b8)**: [onnxifi] Make sure that backend handles run async. (#1599) <Roman Dzhabarov>
- **[e6965cc](https://github.com/onnx/onnx/commit/e6965cc)**: Introduce SparseTensor ML proto (#1554) <Dmitri Smirnov>
- **[75b782f](https://github.com/onnx/onnx/commit/75b782f)**: In driver test check the return status of onnxGetBackendIDs (#1597) <bddppq>
- **[c05b364](https://github.com/onnx/onnx/commit/c05b364)**: Make CI log less verbose (#1595) <bddppq>
- **[fa568e4](https://github.com/onnx/onnx/commit/fa568e4)**: Loop type shape inferencing (#1591) <Scott McKay>
- **[937e64c](https://github.com/onnx/onnx/commit/937e64c)**: add uint8 (#1590) <Lu Fang>
- **[f86e951](https://github.com/onnx/onnx/commit/f86e951)**: Add domain as an optional parameter for make_node function (#1588) <Young Kim>
- **[ff45588](https://github.com/onnx/onnx/commit/ff45588)**: Remove unreachable code in shape_inference.h (#1585) <Changming Sun>
- **[f7dcad0](https://github.com/onnx/onnx/commit/f7dcad0)**: Add several hyperbolic function ops. (#1499) <Sergii Dymchenko>
- **[a60ac7d](https://github.com/onnx/onnx/commit/a60ac7d)**: Add OneHot op to ONNX. (#1567) <Spandan Tiwari>
- **[f6c3a7e](https://github.com/onnx/onnx/commit/f6c3a7e)**: [compiler flag] Issue a warning if class has virtual method but missing virtual dtor. (#1583) <Roman Dzhabarov>
- **[88d1784](https://github.com/onnx/onnx/commit/88d1784)**: Fix MaxUnpool shape inference when output_shape is provided as input (#1578) <Spandan Tiwari>
- **[20041b7](https://github.com/onnx/onnx/commit/20041b7)**: Add type shape inferencing for the If operator (#1571) <Scott McKay>
- **[d6c4c75](https://github.com/onnx/onnx/commit/d6c4c75)**: Add a virtual destructor to GraphInferencer (#1574) <Changming Sun>
- **[a339598](https://github.com/onnx/onnx/commit/a339598)**: fix ConvTranspose spec (#1566) <Wenhao Hu>

Reviewed By: zrphercule

Differential Revision: D13263831

fbshipit-source-id: a2ff22c6454e2430429e5a7d18d21661a7ffb0cb
2018-11-29 16:31:56 -08:00
rohithkrn
0d663cec30 Unify cuda and hip device types in Caffe2 python front end (#14221)
Summary:
The goal of this PR is to unify cuda and hip device types in the caffe2 python front end.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14221

Differential Revision: D13148564

Pulled By: bddppq

fbshipit-source-id: ef9bd2c7d238200165f217097ac5727e686d887b
2018-11-29 14:00:16 -08:00
Dmytro Dzhulgakov
0cfbbceac3 Change Tensor::CopyFrom to a simple double dispatch (#14268)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14268

Removes the need for Context in Tensor by doing simple dispatch for CopyBytes. It'd eventually be subsumed by Roy Li's changes of proper copy_ op, but before that is done, let's get a clear logic of how copies are implemented and clean up some cruft in the CopyFrom implementation.

Note that with these changes one can probably get rid of Context::CopyFromCPU/CopyToCPU, but that's a matter for follow-up diffs.

This diff doesn't change the API of Tensor yet, but relies on the fact that passing `Context` to CopyFrom makes copy async if the device is CUDA and doesn't have any effect otherwise (that's how Context methods are implemented).

This doesn't change semantics of copy async implementation - as before it blindly calls cudaMemcpyAsync which probably means that it can be misused if invoked separately outside of operator body. I'll leave it for the follow up copy_ unification.

For Extend() we always do async copy - it makes sense as it's an in-place device-device operation and only any further op would be observable.

Note: there are now three ways of invoking copy in C2 code - templated CopyBytes, virtual CopyFromCPU/etc, and double-dispatch free method here. Hopefully we can get rid of the second one.
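
The double-dispatch idea can be sketched as a table keyed by the source and destination device types. Everything here (`register_copy`, `copy_bytes`, the device strings, and the host copy function) is a hypothetical illustration in Python, not the C++ code from this diff; only a CPU-to-CPU copy is registered.

```python
# Copy functions keyed by (source device type, destination device type).
_COPY_TABLE = {}


def register_copy(src_device, dst_device, fn):
    _COPY_TABLE[(src_device, dst_device)] = fn


def copy_bytes(nbytes, src, src_device, dst, dst_device):
    """Pick the copy routine from both device types, then run it."""
    try:
        fn = _COPY_TABLE[(src_device, dst_device)]
    except KeyError:
        raise NotImplementedError(
            "no copy registered for %s -> %s" % (src_device, dst_device))
    fn(nbytes, src, dst)


def _host_copy(nbytes, src, dst):
    # Plain byte copy between two host buffers.
    dst[:nbytes] = src[:nbytes]


register_copy("cpu", "cpu", _host_copy)

source = bytearray(b"abcd")
target = bytearray(4)
copy_bytes(3, source, "cpu", target, "cpu")
print(target)
```

With this shape, the caller does not need a context object to choose the routine; the pair of device types alone selects it.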

Also, please advise whether it's c10-worthy :)

Reviewed By: ezyang

Differential Revision: D13117987

fbshipit-source-id: a6772d6dcf3effaf06717da3a656fc9873b310b5
2018-11-28 15:45:37 -08:00
Jiyan Yang
a2fcd4dee5 Ensure FP16 rowwise Adagrad can be run
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12317

Reviewed By: hyuen

Differential Revision: D10190778

fbshipit-source-id: 720a9aaa4e6b1736023d8c6326a613e4ea592b31
2018-11-28 02:15:36 -08:00