1630 Commits

Author SHA1 Message Date
Chandler Zuo
472be69a73 Avoid Output Uninitialized Blobs in Load with load_all=1 (#19133)
Summary:
When output blob names are specified while load_all=1, the output blob names are ignored. However, this behavior is not documented. In this diff, we simply disallow providing blob names when load_all=1.

See discussion at https://fb.workplace.com/groups/1405155842844877/permalink/2714909788536136/
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19133

Reviewed By: dzhulgakov

Differential Revision: D14883698

Pulled By: chandlerzuo

fbshipit-source-id: 6e4171e36c4ccc4f857e79da98b858a06b7d8ad6
2019-04-27 10:45:44 -07:00
Xiaomeng Yang
2ce39de3fc Add elementwise_affine for layer_norm_op (#19713)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19713

Add elementwise_affine for layer_norm_op

Reviewed By: houseroad

Differential Revision: D15075454

fbshipit-source-id: e8a7d3da1c81e49fa55323f5e74a68bc4ef8d83f
2019-04-26 17:20:01 -07:00
Jack Montgomery
48d5ab54a8 Automatic update of fbcode/foxi to 8f74bc4df3a4cfc69b1a3eadf62aa29d9961c72d AND update Glow AND update C2 (#19792)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19792

This diff also contains the contents of D15092641 and D15090411 so as to not let c2, foxi, and glow get out of sync

Previous import was 81e1683d6348eee4b5ed1145222dc2c41be4269c

Included changes:
- **[8f74bc4](https://github.com/houseroad/foxi/commit/8f74bc4)**: Small fixes (#12) <Jack Montgomery>
- **[72097e4](https://github.com/houseroad/foxi/commit/72097e4)**: Add multiple quantization params per tensor (#11) <Jack Montgomery>
- **[b681fe0](https://github.com/houseroad/foxi/commit/b681fe0)**: Merge pull request #10 from jackm321/add_autoinstrument_graph_prop <Jack Montgomery>
- **[a68d835](https://github.com/houseroad/foxi/commit/a68d835)**: Add ONNXIFI_GRAPH_PROPERTY_AUTO_INSTRUMENT_NODES <Jack Montgomery>

Reviewed By: rdzhabarov, zrphercule

Differential Revision: D15086794

fbshipit-source-id: 8df02c62303b580e16a218d6be7791747e3d7213
2019-04-25 21:03:32 -07:00
Xiaomeng Yang
fb9fc42a0c optimize BatchMatmulOp (#18612)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18612

optimize BatchMatmulOp

Reviewed By: houseroad

Differential Revision: D14681665

fbshipit-source-id: cf5ea4909ace58fd44fe6fa634531102ac84e851
2019-04-23 15:34:59 -07:00
Huamin Li
55e53d3d7e correct comments in group_norm_op (#19621)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19621

The comments for group_norm_op (specifically the math part) are not accurate; this diff fixes them.

Reviewed By: BIT-silence

Differential Revision: D15048695

fbshipit-source-id: 27d41d3ae21054257967815254134849944d56ca
2019-04-23 13:31:15 -07:00
Yinghai Lu
4e8cc8ee90 Surface the Glow traces to C2 (#19087)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19087

att

Reviewed By: jackm321

Differential Revision: D14863112

fbshipit-source-id: 2680161b9f05391e73bb8dac4fbbeabb87a82c05
2019-04-23 12:27:49 -07:00
Priya Goyal
0d0acba3bd Allow extracting element-wise loss in softmax (#19579)
Summary:
Oftentimes, we want to experiment with the loss per element (per image, etc.). This changeset allows extracting the per-element loss as well. This output is optional.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19579

Reviewed By: jerryzh168

Differential Revision: D15035797

Pulled By: prigoyal

fbshipit-source-id: 562dea514f49c1f2f1cbbc083a1938dc019a75c4
2019-04-23 11:49:49 -07:00
Yinghai Lu
767d184b77 Add back option to not adjust output batch size (#19442)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19442

For cases like CV, some ops such as transpose and tile mangle the batch size, so we don't know how to adjust the output batch size. In this case, the current solution is to fix the input batch size statically and not adjust the output batch size.

Reviewed By: zrphercule

Differential Revision: D15007237

fbshipit-source-id: a21b943a52ee5462d9d7804dfae44360f579f8cf
2019-04-22 12:29:24 -07:00
Sam Leeman-Munk
9f4f7e1621 Support compilation on gcc-7.4.0 (#19470)
Summary:
There are two corrections in this pull request.
The first is specific to gcc-7.4.0.
Compiled with -std=c++14, gcc-7.4.0 has __cplusplus = 201402L.
This does not meet the check set in Deprecated.h, which asks for >201402L.
The compiler falls through to the __GNUC__ check, which passes and sets C10_DEPRECATED_MESSAGE to a value that c++14 does not appear to support or even recognize, leading to a compile-time error.
My recommended solution, which worked for my case, was to change the > into a >=.
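
A minimal sketch of the kind of check described above, assuming a macro shaped like the one in Deprecated.h. The names `SKETCH_DEPRECATED_MESSAGE`, `SKETCH_USES_STANDARD_ATTRIBUTE`, and `old_api` are hypothetical and exist only for illustration; the real header may differ in its branches.

```cpp
// Hypothetical stand-in for a deprecation macro; not the actual contents
// of Deprecated.h.
#if defined(__cplusplus) && __cplusplus >= 201402L
// With ">=", a compiler reporting exactly 201402L (C++14) takes this
// branch and uses the standard [[deprecated]] attribute.
#define SKETCH_DEPRECATED_MESSAGE(msg) [[deprecated(msg)]]
#define SKETCH_USES_STANDARD_ATTRIBUTE 1
#elif defined(__GNUC__)
// Fallback for older language modes on GCC-compatible compilers.
#define SKETCH_DEPRECATED_MESSAGE(msg) __attribute__((deprecated(msg)))
#define SKETCH_USES_STANDARD_ATTRIBUTE 0
#else
// Unknown compiler: expand to nothing.
#define SKETCH_DEPRECATED_MESSAGE(msg)
#define SKETCH_USES_STANDARD_ATTRIBUTE 0
#endif

// Example use: callers of old_api() get a deprecation warning.
SKETCH_DEPRECATED_MESSAGE("use a newer API instead")
inline int old_api() { return 1; }
```

With a strict `>` comparison, a compiler reporting `__cplusplus == 201402L` would skip the first branch and land in the `__GNUC__` fallback instead.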

The second correction comes in response to this error:
caffe2/operators/crash_op.cc: In member function ‘virtual bool caffe2::CrashOp::RunOnDevice()’:
caffe2/operators/crash_op.cc:14:11: error: ‘SIGABRT’ was not declared in this scope

I am merely committing to the repository the solution suggested here (which worked for me)
https://discuss.pytorch.org/t/building-pytorch-from-source-in-conda-fails-in-pytorch-caffe2-operators-crash-op-cc/42859
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19470

Differential Revision: D15019529

Pulled By: ailzhang

fbshipit-source-id: 9ce9d713c860ee5fd4266e5c2a7f336a97d7a90d
2019-04-19 21:41:36 -07:00
Xiaomeng Yang
f5fe7aa0b2 Fix relu bug for empty tensor (#19451)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19451

Fix relu bug for empty tensor

Reviewed By: xianjiec

Differential Revision: D15009811

fbshipit-source-id: b75e567c3bec08d7d12b950d8f1380c50c138704
2019-04-19 15:21:07 -07:00
Sebastian Messmer
601f36bacc Use string based schema for exposing caffe2 ops (#19287)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19287

Since we now have a string-schema-based op registration API, we can also use it when exposing caffe2 operators.

Reviewed By: dzhulgakov

Differential Revision: D14931925

fbshipit-source-id: ec162469d2d94965e8c99d431c801ae7c43849c8
2019-04-18 02:04:50 -07:00
Yinghai Lu
5fa1aad670 Remove unused template parameter in OnnxifiOp (#19362)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19362

`float` type is never used in OnnxifiOp....

Reviewed By: bddppq

Differential Revision: D14977970

fbshipit-source-id: 8fee02659dbe408e5a3e0ff95d74c04836c5c281
2019-04-17 16:48:14 -07:00
Yinghai Lu
f1f31b634d Eliminate AdjustBatch ops (#19083)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19083

As we have discussed, there are too many AdjustBatch ops; they incur reallocation overhead and hurt performance. We will eliminate these ops by
- inlining the input adjust batch op into Glow
- inlining the output adjust batch op into OnnxifiOp, and doing that only conditionally.

This is the C2 part of the change and requires a change on the Glow side to work e2e.

Reviewed By: rdzhabarov

Differential Revision: D14860582

fbshipit-source-id: ac2588b894bac25735babb62b1924acc559face6
2019-04-17 10:00:25 -07:00
Sebastian Messmer
db611b7caf Delete C10Tensor (#19328)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19328

Plans changed and we don't want this class anymore.

Reviewed By: dzhulgakov

Differential Revision: D14966746

fbshipit-source-id: 09ea4c95b352bc1a250834d32f35a94e401f2347
2019-04-17 00:02:27 -07:00
Mark Santaniello
20fc7b6ec7 Avoid undefined symbol error when building AdIndexer LTO (#19009)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19009

Move the definition of `MulFunctor<>::Backward()` into a header file.

Reviewed By: BIT-silence

Differential Revision: D14823230

fbshipit-source-id: 1efaec01863fcc02dcbe7e788d376e72f8564501
2019-04-15 23:43:13 -07:00
Summer Deng
84b264b17d Add NHWC order support in the cost inference function of 3d conv (#19170)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19170

As title
The quantized resnext3d model in production got the following failures without the fix:

```
 Caffe2 operator Int8ConvRelu logging error: [enforce fail at conv_pool_op_base.h:463] order == StorageOrder::NCHW. 1 vs 2. Conv3D only supports NCHW on the production quantized model
```

Reviewed By: jspark1105

Differential Revision: D14894276

fbshipit-source-id: ef97772277f322ed45215e382c3b4a3702e47e59
2019-04-15 16:47:22 -07:00
Huamin Li
c480798a1c use C10_REGISTER for GELU op
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19090

Reviewed By: BIT-silence

Differential Revision: D14864737

fbshipit-source-id: 8debd53171f7068726f0ab777a13ca46becbfbdf
2019-04-12 11:41:04 -07:00
Xiaomeng Yang
821b5f138a Optimize SoftmaxOp on CPU (#18635)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18635

Optimize SoftmaxOp on CPU

Reviewed By: houseroad

Differential Revision: D14689516

fbshipit-source-id: d2dcee2476d1a3a21f428e99bce9835f1d229d64
2019-04-10 18:52:15 -07:00
Hao Lu
226a358136 Move ConcatBatchMatMulBatchGatherOp to OSS
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19059

Reviewed By: bwasti

Differential Revision: D14849735

fbshipit-source-id: fefd1887d38e51151c07a8b187e9c7c50ef02c6e
2019-04-10 15:29:03 -07:00
Yinghai Lu
b461689cfd Clear input/output shape cache for each inference (#19085)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19085

This is a bug where input_shapes_ and output_shapes_ will grow indefinitely. Fix it here.
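
A rough sketch of the pattern, not the actual OnnxifiOp code: per-inference shape caches are cleared at the start of every run so they cannot accumulate across calls. The class `ShapeCachingOp` and its methods are hypothetical; only the member names `input_shapes_` and `output_shapes_` come from the message above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Shape = std::vector<std::int64_t>;

// Hypothetical operator that records the shapes seen during one inference.
class ShapeCachingOp {
 public:
  void run(const std::vector<Shape>& inputs) {
    // Clear both caches first; without this they grow on every call.
    input_shapes_.clear();
    output_shapes_.clear();
    for (const Shape& s : inputs) {
      assert(!s.empty());
      input_shapes_.push_back(s);
      output_shapes_.push_back(s);  // assumption: output shape mirrors input
    }
  }

  std::size_t cached_inputs() const { return input_shapes_.size(); }
  std::size_t cached_outputs() const { return output_shapes_.size(); }

 private:
  std::vector<Shape> input_shapes_;
  std::vector<Shape> output_shapes_;
};
```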

Reviewed By: bertmaher, rdzhabarov

Differential Revision: D14861695

fbshipit-source-id: d59116f27c3b54f5cc5a33533de4b9222dbb7afc
2019-04-10 10:21:37 -07:00
Liang Xiong
b1bea0b733 add logging to make the saving action visible (#19042)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19042

show the model saving step in the log.

Reviewed By: kennyhorror

Differential Revision: D14809385

fbshipit-source-id: c7a1e50ff92bb45b16b1c501d9325b304b07fbd3
2019-04-09 09:35:43 -07:00
Edward Yang
48a35135fb Convert all tabs to spaces, add CI. (#18959)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18959
ghimport-source-id: a934163fa34cb2019732d5f49dc7290c376bf156

Differential Revision: D14831246

Pulled By: ezyang

fbshipit-source-id: beb92dc4ee8c82f4c8259c081dd72e477fe7a9d0
2019-04-09 08:12:26 -07:00
Xiaomeng Yang
fd40c0eba0 Add gelu op (#18992)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18992

Add gelu op

Reviewed By: houseroad

Differential Revision: D14814811

fbshipit-source-id: 00f126b8b83763c57ebbf28fbd2de5a8fab6d491
2019-04-08 21:58:29 -07:00
Yinghai Lu
1d263ed92a Add backward pass to infer single missing input shape for Concat opportunistically (#18911)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18911

Att.

Reviewed By: bddppq

Differential Revision: D14791295

fbshipit-source-id: 4b7a775924f0eadb0cb73aa6c434a6a5be8b92be
2019-04-05 10:11:58 -07:00
Junjie Bai
814c1df29a Fix caffe2 miopen conv transpose gradient op for case of no dX gradient
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18809

Reviewed By: ezyang

Differential Revision: D14759762

Pulled By: bddppq

fbshipit-source-id: ff795b7e58c82f67a1d7284b5ab06b0e0e5fd3ae
2019-04-04 17:29:30 -07:00
Xiaomeng Yang
b145dcca04 Add support for group ConvTranspose (#18794)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18794

Add support for group ConvTranspose

Reviewed By: houseroad

Differential Revision: D14741327

fbshipit-source-id: 5d947ca044bf8495dd7f8f56122441ebbcc6c7e4
2019-04-04 11:52:06 -07:00
Yinghai Lu
e5e2110a8e Add shape inference function for Split (#18838)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18838

It turns out that we don't have shape inference function of `Split` op at all. This diff adds that.

Reviewed By: bertmaher

Differential Revision: D14766871

fbshipit-source-id: 535cb4f24bdada603c76579e00e7a39aee93e19f
2019-04-04 00:22:22 -07:00
Jerry Zhang
dfcd7b0185 QTensor (#18230)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18230

Implementing minimum qtensor API to unblock other workstreams in quantization

Changes:
- Added Quantizer which represents different quantization schemes
- Added qint8 as a data type for QTensor
- Added a new ScalarType QInt8
- Added QTensorImpl for QTensor
- Added following user facing APIs
  - quantize_linear(scale, zero_point)
  - dequantize()
  - q_scale()
  - q_zero_point()
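
For orientation, here is a scalar sketch of the usual affine quantization arithmetic that a `quantize_linear(scale, zero_point)` / `dequantize()` pair is generally built on. It is an assumption that the diff uses this exact rounding and clamping; `QParams`, `quantize_value`, and `dequantize_value` are hypothetical names, not part of the QTensor API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical container for the two quantization parameters.
struct QParams {
  float scale;
  std::int32_t zero_point;
};

// q = clamp(round(x / scale) + zero_point, -128, 127)
inline std::int8_t quantize_value(float x, QParams p) {
  assert(p.scale > 0.0f);
  const long q = std::lround(x / p.scale) + p.zero_point;
  return static_cast<std::int8_t>(std::min(127L, std::max(-128L, q)));
}

// x' = (q - zero_point) * scale
inline float dequantize_value(std::int8_t q, QParams p) {
  return static_cast<float>(static_cast<std::int32_t>(q) - p.zero_point) *
         p.scale;
}
```

Values that fall outside the representable int8 range saturate, so a quantize/dequantize round trip is only approximately the identity.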

Reviewed By: dzhulgakov

Differential Revision: D14524641

fbshipit-source-id: c1c0ae0978fb500d47cdb23fb15b747773429e6c
2019-04-03 13:17:11 -07:00
Eli Amesefe
385a755b68 Undefined behavior with memset of std::string to 0 (#18703)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18703

`zeroPtr` is sometimes a `std::string` tensor, so `memset` to 0 is undefined behavior.

This might be accidentally safe with `std::string` implementations that use SSO (Small String Optimization), but will crash otherwise.
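
One way to avoid this class of bug, sketched under the assumption that the buffer's element type is known at compile time: use `memset` only for arithmetic types and assign a value-initialized element otherwise. The helper `zero_fill` is hypothetical and is not the fix made in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Arithmetic types: zeroing the bytes is fine on mainstream platforms.
template <typename T>
void zero_fill_impl(T* data, std::size_t n, std::true_type) {
  if (n > 0) {
    assert(data != nullptr);
    std::memset(data, 0, n * sizeof(T));
  }
}

// Everything else (e.g. std::string): assign a value-initialized object
// so the type's own assignment operator runs.
template <typename T>
void zero_fill_impl(T* data, std::size_t n, std::false_type) {
  for (std::size_t i = 0; i < n; ++i) {
    data[i] = T();
  }
}

// Dispatches at compile time on whether T is an arithmetic type.
template <typename T>
void zero_fill(T* data, std::size_t n) {
  zero_fill_impl(data, n, std::is_arithmetic<T>{});
}
```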

Reviewed By: zheng-xq

Differential Revision: D14714458

fbshipit-source-id: 012a18464e6514d38ff791509b88ddc3fc55b2b1
2019-04-02 10:10:11 -07:00
Junjie Bai
246f5c412e Revert "Tensor construction codemod(raw_mutable_data) (#16373)" (#18680)
Summary:
This reverts commit d73c830e23.

We have observed a significant perf drop when training ResNext101 with multiple AMD GPUs:

Before:
https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang7-rocmdeb-ubuntu16.04-bench/1636/console
2 GPUs ResNext training got 150\~160 imgs/sec
4 GPUs ResNext training got 270\~280 imgs/sec

After:
https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang7-rocmdeb-ubuntu16.04-bench/1637/console
Both 2 and 4 GPUs ResNext training drop to 110\~120 imgs/sec

Similar perf drops are seen on ResNet50 training jobs as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18680

Differential Revision: D14702941

Pulled By: bddppq

fbshipit-source-id: 828141805afc23f25c08d4a2eb6d4b99f817c128
2019-04-01 14:39:13 -07:00
Ru Li
3749d65a7e Create Node2Vec ModuleKeeper
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18504

Reviewed By: sunnieshang

Differential Revision: D14632091

fbshipit-source-id: d4544866552dc6bcbc7515be9e88cb11e7622a44
2019-04-01 10:36:23 -07:00
Rui Zhu
19fe2b9db4 Adding quantized tensor shape/type info support for caffe2=>glow in caffe2 side (#18621)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18621

This diff added caffe2 support for onnxifi quantization.

Reviewed By: yinghai

Differential Revision: D14648767

fbshipit-source-id: 4ddb492cacbba6142305866e6dbb875880acaea3
2019-03-31 17:42:27 -07:00
Sebastian Messmer
14c28fabd2 Check kernel against function schema in c10 op registration (#18256)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18256

This diff infers the function schema from the kernel function/functor and checks that it matches the specified function schema.

This diff does not yet allow omitting the function schema in the registration API. That will come in a future diff.

Reviewed By: dzhulgakov

Differential Revision: D14552738

fbshipit-source-id: 00202b489ede19f26ae686c97416b38c72c11532
2019-03-30 00:07:22 -07:00
Sebastian Messmer
c4bb09cc42 Add functor- and function-based kernel registration API (#18162)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18162

- Adds the API to register a functor- and function-based kernel.
- Changes the experimental c10 ops to use this new API instead of the old one.
- Deletes the old APIs in KernelRegistration.h and OpSchemaRegistration.h.

Reviewed By: dzhulgakov

Differential Revision: D14514239

fbshipit-source-id: 35b2f6e8f62964e54886450a6a5fac812ed20f26
2019-03-30 00:07:19 -07:00
Jerry Zhang
d73c830e23 Tensor construction codemod(raw_mutable_data) (#16373)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16373

motivation: https://github.com/pytorch/pytorch/pull/12407
This is a manual diff.
most of the fixes should be:

```
auto* Y = Output(0);
Y->Resize(dims);
Y->raw_mutable_data(dtype);
```
-->
```
auto* Y = Output(0, dims, at::dtype(dtype));
```
But there might be other cases.

Reviewed By: dzhulgakov

Differential Revision: D13725460

fbshipit-source-id: 649a4b0e42f62cda1a60171dd9fa3e440dc9dca1
2019-03-29 18:36:46 -07:00
Yanghan Wang
f4e35d30ed register BoxWithNMSLimit with C10
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17956

Reviewed By: houseroad

Differential Revision: D14417300

fbshipit-source-id: eb5e2ba84513b3b7bfa509dc442424b13fe9148f
2019-03-29 13:41:40 -07:00
Xiaomeng Yang
c21e763cd6 Optimize relu op on GPU (#18506)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18506

Optimize relu op on GPU

Reviewed By: houseroad

Differential Revision: D14633171

fbshipit-source-id: bd3afa9a0bae1325d32ad4153736a0c7ecb0ec64
2019-03-29 00:23:24 -07:00
Jing Huang
11ac0cf276 Implement rotated generate_proposals_op without opencv dependency (CPU version)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18533

Reviewed By: ezyang

Differential Revision: D14648083

fbshipit-source-id: e53e8f537100862f8015c4efa4efe4d387cef551
2019-03-28 17:02:50 -07:00
Ahmed Aly
1ae2c1950c Use SetOutputTensor instead of copying outputs manually (#17770)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17770

As title

Reviewed By: dzhulgakov

Differential Revision: D14370937

fbshipit-source-id: f415490c38556cf03bb13dce3643775331483448
2019-03-28 16:01:33 -07:00
Yinghai Lu
f3ddc40ca4 Move weight offload inside backend construction functor (#18385)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18385

By moving the weight offload into the backend initialization function, we can instantiate the backend once by creating the OnnxifiOp once and then clean up the parameter workspace. We need to hold on to that instantiated net (OnnxifiOp) without cleaning it up. Subsequent constructions of OnnxifiOp for the same model will hit the cached backend and will not attempt weight offloading, which is safe because the weights are already gone.

Reviewed By: ipiszy

Differential Revision: D14590379

fbshipit-source-id: f7f34016e09777ad3df0af487885cd14658e1044
2019-03-26 21:03:17 -07:00
Sameer Indarapu
bdd098c694 Fix typo in Github links in elementwise_ops_schema.cc (#18018)
Summary:
s/elementwise_op_schema.cc/elementwise_ops_schema.cc
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18018

Differential Revision: D14612291

Pulled By: soumith

fbshipit-source-id: 09276283b9ff92c039ce530165c62cc8421fb443
2019-03-26 15:37:26 -07:00
Sebastian Messmer
c6bfcb854b Expose c10 operators to caffe2 by operator name (#18160)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18160

When exposing a c10 operator to the caffe2 frontend, don't use the operator schema but use the operator name instead.
This allows us to get rid of the existing mechanism for operator schema registration in a diff stacked on top.

Reviewed By: dzhulgakov

Differential Revision: D14513420

fbshipit-source-id: 6b08a9c6d9497eaf18b62361dd44bc07c7b4b76b
2019-03-26 12:36:11 -07:00
Xiaomeng Yang
265fa0ce4d Move math::Axpy function to elementwise lib (#18316)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18316

Move math::Axpy function to elementwise lib

i-am-not-moving-c2-to-c10

Reviewed By: houseroad

Differential Revision: D14574697

fbshipit-source-id: 7cfbb2da295c8966c5328bd6b577cce2638eea62
2019-03-26 12:19:19 -07:00
Iurii Zdebskyi
1a742075ee Resolving comments from Bool Tensor for CPU PR (#18165)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18165
ghimport-source-id: 55cb3fb63a25c2faab1725b4ec14c688bf45bd38

Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18166 Bool Tensor for CUDA
* **#18165 Resolved comments from Bool Tensor for CPU PR**
-------
------------
This is a follow-up PR that resolves some additional feedback on one of the previous Bool Tensor PRs.

gchanan, here is a list of almost all the comments from the original PR with respective fixes and replies:

**[utils/python_scalars.h]** why is this converting from uint8_t and not bool? (comment?)
When I was adding this, I was testing by creating a tensor and then calling its .tolist(). It worked equally well for bool and uint8_t, so I left uint8_t, as I thought it made more sense since we are calling PyBool_FromLong. Changing it to bool.

**[ATen/Dispatch.h]** better name?
fixed.

**[test/test_torch.py]** what about other factories, such as full? (and more).
There is a test that goes through the factory methods - test_tensor_factories_empty. I added some bool cases above it and added a comment that once CUDA is done, I will unite them so that it iterates not just over CUDA and CPU but also over all types. Adding all bool cases now. Will unite in CUDA PR.

**[generic/THTensorMath.h]** any changes in this file actually needed?
Bad merge. Fixed.

**[TH/THTensor.h]** this generates code for random, clampedRandom, and cappedRandom -- do we have tests for all of these with bool?
Added

**[c10/core/ScalarType.h]** I'm not very confident about the lack of Bool here -- can you look at the call sites and see what makes sense to do here?
Added bool to the macro and created a similar one without for a single case which fails the build with errors:

_./torch/csrc/jit/symbolic_variable.h:79:20: error: ambiguous overload for ‘operator*’ (operand types are ‘const torch::jit::SymbolicVariable’ and ‘torch::jit::Value*’)
return (*this) * insertConstant(rhs);_

Differential Revision: D14605105

fbshipit-source-id: abf82d50e8f8c50b386545ac068268651b28496d
2019-03-26 09:59:34 -07:00
Edward Yang
d9960fbdb2 Revert D14446895: [C2] Implement rotated generate_proposals_op without opencv dependency (~2x faster)
Differential Revision:
D14446895

Original commit changeset: 847f2443e645

fbshipit-source-id: fc6ab5ee59e027f125f5ab0f7ee51ad7db37d4a4
2019-03-23 09:38:55 -07:00
Jing Huang
6052d04100 Implement rotated generate_proposals_op without opencv dependency (1.8x faster) (#18010)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18010

[C2] Implement rotated generate_proposals_op without opencv dependency.

Reviewed By: newstzpz

Differential Revision: D14446895

fbshipit-source-id: 847f2443e645f8cae1327dfbaa111c48875ca9be
2019-03-22 18:15:27 -07:00
Alexander Sidorov
d4c52158c7 Caffe2: crash op (#18207)
Summary:
this is handy when testing various core dump related
things. If in the future we want to unit test our future gdb debugger
extensions, we can use this op to generate a core dump for us within a
unit test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18207

Differential Revision: D14482186

Pulled By: salexspb

fbshipit-source-id: 39a9fffbdd4bd083597f544d1c783a82cf023a89
2019-03-22 11:52:01 -07:00
Junjie Bai
46439c78d0 Replace the remaining usages of IntList in caffe2 to IntArrayRef
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18282

Differential Revision: D14569269

Pulled By: bddppq

fbshipit-source-id: 5fc33701b83f9efdec4b456d2691764831d10e7f
2019-03-21 16:34:38 -07:00
Sebastian Messmer
1877087df2 Allow registering same operator schema multiple times (#18038)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18038

Now that we have named overloads, we can allow registering the same function schema multiple times and just check it's identical.

This is going to be used in custom op registration since they register the schema every time a kernel is registered.
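
The rule described above can be sketched as a small registry: registering an identical schema again is accepted, while a conflicting schema under the same name is rejected. This is an illustration only; `SchemaRegistry`, `register_schema`, and the use of plain strings for schemas are assumptions, not the c10 dispatcher's actual types.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical registry keyed by operator name.
class SchemaRegistry {
 public:
  // Accepts a repeated registration only if the schema is identical.
  void register_schema(const std::string& name, const std::string& schema) {
    assert(!name.empty());
    // emplace() leaves the map unchanged if `name` is already present.
    auto result = schemas_.emplace(name, schema);
    const bool already_registered = !result.second;
    if (already_registered && result.first->second != schema) {
      throw std::logic_error("conflicting schema registered for " + name);
    }
  }

  std::size_t size() const { return schemas_.size(); }

 private:
  std::map<std::string, std::string> schemas_;
};
```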

Reviewed By: dzhulgakov

Differential Revision: D14467494

fbshipit-source-id: 2c26cf72a64b65f120afe05e989302ec42597515
2019-03-21 14:57:28 -07:00
Xiaomeng Yang
43a5c636e2 Optimize group_norm_op (#17945)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17945

Optimize group_norm_op

Reviewed By: houseroad

Differential Revision: D14419908

fbshipit-source-id: 4024b5c5dbeff97f4f026d61fc44af1f0e98ed68
2019-03-21 13:05:01 -07:00