Commit Graph

2801 Commits

Author SHA1 Message Date
Devin He
b46fddf506 idtt + zch distributed inference (#35763)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35763

Adds inference function and test for ScatterAssign

Test Plan: Updated unit test

Reviewed By: yyetim, shunting1986

Differential Revision: D20501079

fbshipit-source-id: 7ec6ef0127a151250dd699c90c2b80c35cfb1fe4
2020-04-03 12:09:34 -07:00
Tristan Rice
676fc929b7 [caffe2] fix type and shape inference for common gradient ops (#35857)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35857

This fixes shape and type inference (InferBlobShapesAndTypes) for a lot of common ops and adds support for testing the inferred shapes and types of gradient ops.

Ops:
* Concat
* Split
* LeakyReLU
* Relu
* Prelu
* Gelu
* Elu
* Sinh, Tanh, Cosh
* Abs
* ... and a number of other simple element wise ops

Test Plan:
Added support to the hypothesis test to check the shape and type of gradient ops.

Enabled it for all the ops I fixed the shape and type inference for.

  buck test caffe2/caffe2/python/operator_test:

Reviewed By: pradeepd24

Differential Revision: D20806284

fbshipit-source-id: 77f796d9ff208e09e871bdbadf9a0a7c196b77f2
2020-04-02 11:17:04 -07:00
Yinghai Lu
dd98abb453 Enable splitSparseLengthsSumSparse in onnxifi (#35555)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35555

Att. So that we can lower the SparseLengthsSum* part of SparseLengthsSum*Sparse. We update the tying policy between Gather and SparseLengthsWeightSum* so that we don't bother lowering a single Gather into the backend, which is inefficient to execute on the card and creates bubbles between continuous lowering graphs.

Test Plan:
```
buck test glow/fb/test:test_onnxifinnpi
```

Reviewed By: ipiszy

Differential Revision: D20688525

fbshipit-source-id: cb8e38239057ff13a8d385ed09d0d019421de78b
2020-03-30 13:34:59 -07:00
Yinghai Lu
af4d86788c Split SparseLengthsSumSparse into SparseLengthsSumSparseLookup + SparseLengthsSum (#35507)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35507

We want to split up the SparseLengthsSumSparse op into an indirection op and the SparseLengthsSum op so that we can lower the latter part. The indirection part is a plain impl for now.
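
A rough numpy sketch of the intended split, for illustration only (names and the handling of pruned/missing rows are simplified; the real semantics are defined by the caffe2 operator schemas):

```
import numpy as np

# Indirection stage (the "plain impl"): remap the sparse indices through a
# compression mapping before the pooled lookup.
def sparse_lengths_sum_sparse_lookup(indices, compressed_mapping):
    return compressed_mapping[indices]

# SparseLengthsSum stage (the part we want to lower): segment-wise sums over
# rows of the embedding table, with segments defined by `lengths`.
def sparse_lengths_sum(data, indices, lengths):
    out, offset = [], 0
    for n in lengths:
        out.append(data[indices[offset:offset + n]].sum(axis=0))
        offset += n
    return np.stack(out)
```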

Test Plan:
```
for i in `seq 10`; do buck test caffe2/caffe2/python/operator_test:lengths_reducer_fused_nbit_rowwise_ops_test -- test_sparse_lengths_sum_rowwise_sparse; done
```

Reviewed By: jspark1105

Differential Revision: D20683478

fbshipit-source-id: 509effe88719d20aa0c4783bbe0ce1f183ee473c
2020-03-30 13:33:29 -07:00
anjali411
96eec95ece torch.from_numpy for complex dtypes (#35531)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35531
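
A minimal sketch of what this enables, assuming a build with complex dtype support:

```
import numpy as np
import torch

a = np.array([1 + 2j, 3 - 4j], dtype=np.complex64)
t = torch.from_numpy(a)   # wraps the array without copying
print(t.dtype)            # torch.complex64
```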

Differential Revision: D20693581

Pulled By: anjali411

fbshipit-source-id: d53e26b4175452fa00b287efbfceea18104c1364
2020-03-27 14:40:28 -07:00
pinzhenx
bd604cb5b7 Upgrade MKL-DNN to DNNL v1.2 (#32422)
Summary:
## Motivation

This PR upgrades MKL-DNN from v0.20 to DNNL v1.2 and resolves https://github.com/pytorch/pytorch/issues/30300.

DNNL (Deep Neural Network Library) is the new brand of MKL-DNN, which improves performance, quality, and usability over the old version.

This PR focuses on the migration of all existing functionalities, including minor fixes, performance improvement and code clean up. It serves as the cornerstone of our future efforts to accommodate new features like OpenCL support, BF16 training, INT8 inference, etc. and to let the Pytorch community derive more benefits from the Intel Architecture.

<br>

## What's included?

Even though DNNL has many breaking changes to the API, we managed to absorb most of them in ideep. This PR contains minimal changes to the integration code in PyTorch. Below is a summary of the changes:

<br>

**General:**

1. Replace op-level allocator with global-registered allocator

```
// before
ideep::sum::compute<AllocForMKLDNN>(scales, {x, y}, z);

// after
ideep::sum::compute(scales, {x, y}, z);
```

The allocator is now registered at `aten/src/ATen/native/mkldnn/IDeepRegistration.cpp`. Thereafter all tensors derived from the `cpu_engine` (by default) will use the c10 allocator.

```
RegisterEngineAllocator cpu_alloc(
  ideep::engine::cpu_engine(),
  [](size_t size) {
    return c10::GetAllocator(c10::DeviceType::CPU)->raw_allocate(size);
  },
  [](void* p) {
    c10::GetAllocator(c10::DeviceType::CPU)->raw_deallocate(p);
  }
);
```
------

2. Simplify group convolution

We had a scenario in convolution where the ideep tensor shape mismatched the aten tensor shape: when `groups > 1`, DNNL expects weight tensors to be 5-d with an extra group dimension, e.g. `goihw` instead of `oihw` in the 2d conv case.

As shown below, a lot of extra checks came with this difference in shape before. Now we've completely hidden this difference in ideep, and all tensors align with PyTorch's definition, so we can safely remove these checks from both the aten and c2 integration code.

```
// aten/src/ATen/native/mkldnn/Conv.cpp

if (w.ndims() == x.ndims() + 1) {
  AT_ASSERTM(
      groups > 1,
      "Only group _mkldnn_conv2d weights could have been reordered to 5d");
  kernel_size[0] = w.get_dim(0) * w.get_dim(1);
  std::copy_n(
      w.get_dims().cbegin() + 2, x.ndims() - 1, kernel_size.begin() + 1);
} else {
  std::copy_n(w.get_dims().cbegin(), x.ndims(), kernel_size.begin());
}
```

------

3. Enable DNNL built-in cache

Previously, we stored DNNL jitted kernels along with intermediate buffers inside ideep using an LRU cache. Now we are switching to the newly added DNNL built-in cache, and **no longer** caching buffers in order to reduce memory footprint.

This change will mainly show up as lower memory usage in memory profiling results. On the code side, we removed a couple of lines of `op_key_` code that depended on the ideep cache before.

------

4. Use 64-bit integer to denote dimensions

We changed the type of `ideep::dims` from `vector<int32_t>` to `vector<int64_t>`. This renders ideep dims no longer compatible with the 32-bit dims used by caffe2, so we use something like `{stride_.begin(), stride_.end()}` to cast the parameter `stride_` into an int64 vector.

<br>

**Misc changes in each commit:**

**Commit:** change build options

Some build options were slightly changed, mainly to avoid name collisions with other projects that include DNNL as a subproject. In addition, DNNL built-in cache is enabled by option `DNNL_ENABLE_PRIMITIVE_CACHE`.

Old | New
-- | --
WITH_EXAMPLE | MKLDNN_BUILD_EXAMPLES
WITH_TEST | MKLDNN_BUILD_TESTS
MKLDNN_THREADING | MKLDNN_CPU_RUNTIME
MKLDNN_USE_MKL | N/A (MKL is no longer used)

------

**Commit:** aten reintegration

- aten/src/ATen/native/mkldnn/BinaryOps.cpp

    Implement binary ops using new operation `binary` provided by DNNL

- aten/src/ATen/native/mkldnn/Conv.cpp

    Clean up group convolution checks
    Simplify conv backward integration

- aten/src/ATen/native/mkldnn/MKLDNNConversions.cpp

    Simplify prepacking convolution weights

- test/test_mkldnn.py

    Fixed an issue in the conv2d unit test: it previously didn't compare conv results between the mkldnn and aten implementations; it compared mkldnn with mkldnn, since the default CPU path also goes into mkldnn. Now we use `torch.backends.mkldnn.flags` to fix this issue (see the sketch after this list).

- torch/utils/mkldnn.py

    Prepack the weight tensor in the module's `__init__` to significantly improve performance
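
A hedged sketch of the fixed comparison described in the test/test_mkldnn.py item above (not the test's exact code; assumes a build with MKL-DNN enabled):

```
import copy
import torch
import torch.utils.mkldnn as mkldnn_utils

conv = torch.nn.Conv2d(3, 8, kernel_size=3)
x = torch.randn(1, 3, 16, 16)

# Reference result with mkldnn disabled, so it takes the plain aten CPU path.
with torch.backends.mkldnn.flags(enabled=False):
    ref = conv(x)

# Explicit mkldnn path: prepacked module + mkldnn tensor layout.
mkldnn_conv = mkldnn_utils.to_mkldnn(copy.deepcopy(conv))
out = mkldnn_conv(x.to_mkldnn()).to_dense()

assert torch.allclose(ref, out, atol=1e-4)
```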

------

**Commit:** caffe2 reintegration

- caffe2/ideep/ideep_utils.h

    Clean up unused type definitions

- caffe2/ideep/operators/adam_op.cc & caffe2/ideep/operators/momentum_sgd_op.cc

   Unify tensor initialization with `ideep::tensor::init`. Obsolete `ideep::tensor::reinit`

- caffe2/ideep/operators/conv_op.cc & caffe2/ideep/operators/quantization/int8_conv_op.cc

    Clean up group convolution checks
    Revamp convolution API

- caffe2/ideep/operators/conv_transpose_op.cc

    Clean up group convolution checks
    Clean up deconv workaround code

------

**Commit:** custom allocator

- Register c10 allocator as mentioned above

<br><br>

## Performance

We tested inference on some common models based on user scenarios, and most performance numbers are either better than or on par with MKL-DNN v0.20.

ratio: new / old | Latency (batch=1 4T) | Throughput (batch=64 56T)
-- | -- | --
pytorch resnet18 | 121.4% | 99.7%
pytorch resnet50 | 123.1% | 106.9%
pytorch resnext101_32x8d | 116.3% | 100.1%
pytorch resnext50_32x4d | 141.9% | 104.4%
pytorch mobilenet_v2 | 163.0% | 105.8%
caffe2 alexnet | 303.0% | 99.2%
caffe2 googlenet-v3 | 101.1% | 99.2%
caffe2 inception-v1 | 102.2% | 101.7%
caffe2 mobilenet-v1 | 356.1% | 253.7%
caffe2 resnet101 | 100.4% | 99.8%
caffe2 resnet152 | 99.8% | 99.8%
caffe2 shufflenet | 141.1% | 69.0% †
caffe2 squeezenet | 98.5% | 99.2%
caffe2 vgg16 | 136.8% | 100.6%
caffe2 googlenet-v3 int8 | 100.0% | 100.7%
caffe2 mobilenet-v1 int8 | 779.2% | 943.0%
caffe2 resnet50 int8 | 99.5% | 95.5%

_Configuration:
Platform: Skylake 8180
Latency Test: 4 threads, warmup 30, iteration 500, batch size 1
Throughput Test: 56 threads, warmup 30, iteration 200, batch size 64_

† Shufflenet is one of the few models that require temp buffers during inference. The performance degradation is an expected issue since we no longer cache any buffers in ideep. As a solution, we suggest users opt for a caching allocator like **jemalloc** as a drop-in replacement for the system allocator in such heavy workloads.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32422

Test Plan:
Perf results: https://our.intern.facebook.com/intern/fblearner/details/177790608?tab=Experiment%20Results

10% improvement for ResNext with avx512, neutral on avx2

More results: https://fb.quip.com/ob10AL0bCDXW#NNNACAUoHJP

Reviewed By: yinghai

Differential Revision: D20381325

Pulled By: dzhulgakov

fbshipit-source-id: 803b906fd89ed8b723c5fcab55039efe3e4bcb77
2020-03-26 22:07:59 -07:00
Tristan Rice
d4f3bc7f8e [dt] [caffe2] add/fix shape inference for StumpFunc, SliceGradient and ResizeLike (#35430)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35430

This fixes and adds tests for several commonly used operators.

There are some formatting differences due to running clang-format on one of the files.

Test Plan: buck test //caffe2/caffe2/fb/operators:hypothesis_test //caffe2/caffe2/python/operator_test:utility_ops_test //caffe2/caffe2/python/operator_test:concat_split_op_test

Reviewed By: yyetim

Differential Revision: D20657405

fbshipit-source-id: 51d86d0834003b8ac8d6acb5149ae13d7bbfc6ab
2020-03-26 17:50:32 -07:00
Chunli Fu
de3b2f98db [Shape Inference] Add ssaRewrite pybind func (#35410)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35410

Reviewed By: yinghai

Differential Revision: D20653042

fbshipit-source-id: 3845413d4e80b9be4fb97dc1eb8e824a55fb7576
2020-03-26 00:46:28 -07:00
Xiaodong Wang
53fceff1e1 Change weight scale test to cpu only (#35346)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35346

The weight scale op doesn't have a GPU impl. This is breaking OSS CI from D20506032. Making it CPU only.

Test Plan: OSS CI

Reviewed By: ustctf

Differential Revision: D20637440

fbshipit-source-id: 9aa6cce63ce637ab7856788e5d02f527decb2a26
2020-03-25 09:18:58 -07:00
Fei Tian
845b19c4ef Add weight_scale in Adagrad (#34944)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34944

Reviewed By: chonglinsun

Differential Revision: D20506032

fbshipit-source-id: ef025e536da01fdcabc783466bc065685b80ab9a
2020-03-20 22:36:51 -07:00
Zhonghao Liu
e3272559e4 [caffe2] SWA operator (#34394)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34394

# SWA operator
In this diff, we added a new operator `SWA` which will be used in `AdaGradOptimizer`.

The algorithm looks like:

{F230902995}
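
The operator's exact semantics are defined by the figure above; for reference, the textbook SWA running-average update looks like this sketch (variable names are ours, not the operator's blob names):

```
# w_swa tracks the running average of the weights observed so far.
def swa_update(w_swa, w, n_averaged):
    return (w_swa * n_averaged + w) / (n_averaged + 1.0)
```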

# Background

In our testing, we found that this operator could improve our models' reproducibility a lot (KT: 0.86 -> 0.92).

So we hope to land this operator and, in the future, enable it by default in our models.

Test Plan:
Local build `aml.dper3:30f068668cfb408fbb40141fb17129f2` and bento kernel.
- Local test: n215857
- f174600345

Reviewed By: chocjy

Differential Revision: D20165239

fbshipit-source-id: c03cdd048cb10b091e5f06323f4c0f3999f95d8a
2020-03-20 08:17:08 -07:00
Chunli Fu
b3fccda4a9 [DPER3][Shape inference] Update Shape Information in dper3 backend (#34475)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34475

Differential Revision: D20332799

fbshipit-source-id: 16aa7399eb48ce4d1d0f8431941ae1252322c382
2020-03-19 13:49:34 -07:00
Li Zhang (DAI)
69e701fbf9 Add transfer_learning_blob_name_mappings into layer_model_helper to support layer model transfer learning
Summary: Add transfer_learning_blob_name_mappings into layer_model_helper to support layer model transfer learning

Reviewed By: mraway

Differential Revision: D20286298

fbshipit-source-id: de3e029611d843f38d3f42ecd4148358f7e14a2b
2020-03-18 15:28:00 -07:00
Edward Yang
d927d58c2a Revert D20289209: Support RowWiseSparseAdam on GPU
Test Plan: revert-hammer

Differential Revision:
D20289209

Original commit changeset: a7a8a21bd18c

fbshipit-source-id: 4a8ae684d099a5499c28b7e65578fc7ab10b248d
2020-03-18 07:35:07 -07:00
Jongsoo Park
bcbdba450c [caffe2] open source 2/4-bit SLS operators (#34903)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34903

Reattempt of D20461609

Moving 2/4-bit SLS and row-wise 2/4-bit conversion operator to open source to be used by DLRM

Test Plan: CI

Reviewed By: jianyuh

Differential Revision: D20495304

fbshipit-source-id: 66a99677583f50fd40e29c514710c7b1a8cdbc29
2020-03-17 22:55:10 -07:00
Yan Xie
959a7138fd Support RowWiseSparseAdam on GPU (#34341)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34341

Implement RowWiseSparseAdam on CUDA

Reviewed By: xianjiec

Differential Revision: D20289209

fbshipit-source-id: a7a8a21bd18c1b9891f04f202d3ecaf183e30cad
2020-03-17 15:08:24 -07:00
Edward Yang
3e68d0c5d0 Revert D20461609: [caffe2] open source 2/4-bit SLS operators
Test Plan: revert-hammer

Differential Revision:
D20461609

Original commit changeset: b3ef73ff10f2

fbshipit-source-id: e90ee5e34b1feab5b0bd582ed7e96e37de7044b0
2020-03-17 11:10:10 -07:00
Jongsoo Park
d9b97a4ffd [caffe2] open source 2/4-bit SLS operators (#34783)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34783

Moving 2/4-bit SLS and row-wise 2/4-bit conversion operator to open source to be used by DLRM

Test Plan: CI

Reviewed By: yinghai

Differential Revision: D20461609

fbshipit-source-id: b3ef73ff10f2433afe06ffa73fe1145282d9ec4c
2020-03-17 01:00:31 -07:00
Xinyi Zhang
99b91ee2ad [fix][tiny][caffe2] Avoid triggering errors when allow ratio is 100% (#34757)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34757

Reviewed By: Wakeupbuddy

Differential Revision: D20451255

fbshipit-source-id: 07997cf31dba653b61d082ec3f28357c3b90c4eb
2020-03-16 11:39:32 -07:00
Bangsheng Tang
8f854fb9e2 [1/n][multi-tower] add partition info in predictor construction (#34175)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34175

to incorporate PartitionInfo added in D20015493

Test Plan: unit tests

Reviewed By: yinghai

Differential Revision: D20133759

fbshipit-source-id: 130db2d80bca3c05a7ec91292159f857046718e0
2020-03-13 09:23:39 -07:00
Edward Yang
d5f8c8f3ba Revert D20121169: [pytorch][PR] ONNX Export Support for CrossEntropyLoss
Test Plan: revert-hammer

Differential Revision:
D20121169

Original commit changeset: 7b56617e8c60

fbshipit-source-id: d7f302d1e54f3c978c3be0a0ad1ee600790a5b27
2020-03-12 20:30:54 -07:00
Chunli Fu
fe9b4e3cba [DPER3] Blob Reorder (#33579)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33579

Differential Revision: D20008865

fbshipit-source-id: f35aded311d9d1d7d438d828ccabd2bab5575e5c
2020-03-12 12:28:12 -07:00
Ksenija Stanojevic
944ea4c334 ONNX Export Support for CrossEntropyLoss (#33767)
Summary:
Add ONNX export support for torch.nn.CrossEntropyLoss.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33767
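
A hedged sketch of the kind of export this enables; the module, shapes, and opset_version below are illustrative, not the PR's test settings:

```
import torch

class ModelWithLoss(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(10, 5)
        self.loss = torch.nn.CrossEntropyLoss()

    def forward(self, x, target):
        return self.loss(self.fc(x), target)

m = ModelWithLoss()
x = torch.randn(4, 10)
target = torch.randint(0, 5, (4,))
torch.onnx.export(m, (x, target), "model_with_ce_loss.onnx", opset_version=12)
```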

Reviewed By: hl475

Differential Revision: D20121169

Pulled By: houseroad

fbshipit-source-id: 7b56617e8c60617b922949fc8b4ecc626eedf7ed
2020-03-12 11:46:58 -07:00
Michael Suo
c235be42dd [jit] kill script namespace (#34515)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34515

Once upon a time we thought this was necessary. In reality it is not, so
removing it.

For backcompat, our public interface (defined in `api/`) still has
typedefs to the old `script::` names.

There was only one collision: `Pass` as a `Stmt` and `Pass` as a graph
transform. I renamed one of them.

Test Plan: Imported from OSS

Differential Revision: D20353503

Pulled By: suo

fbshipit-source-id: 48bb911ce75120a8c9e0c6fb65262ef775dfba93
2020-03-11 23:32:48 -07:00
Xianjie Chen
0dc0fffca1 [net_transform] only skip ConstantFill for autogen_grad (#34628)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34628

Differential Revision: D20370564

fbshipit-source-id: 854c8ab44ba262e5020383447ed6bb629064ec33
2020-03-11 19:09:52 -07:00
Xiaodong Wang
4c99351de6 [AMD] Remove num_gpu check for remote execution (#34318)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34318

Stop checking whether we have AMD GPU devices on the host, because we may be constructing a net on a machine without a GPU and running the net on another one with a GPU.

Reviewed By: ajauhri

Differential Revision: D20269562

fbshipit-source-id: 1f561086cacdcead3ce7c03c2d02c25336c8b11a
2020-03-06 09:53:57 -08:00
Orion Reblitz-Richardson
ad17dafc50 [caffe2] Remove python2 from operator_test (#33977)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33977

Removing python2 from operator_test so we can retire python2 support for PyTorch.

Test Plan: waitforsandcastle

Reviewed By: seemethere

Differential Revision: D20129500

fbshipit-source-id: d4c82e4acfc795be9bec6a162c713e37ffb9f5ff
2020-03-02 08:55:53 -08:00
Michael Suo
dbe850af5b [jit] do the code reorg (#33851)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33851

Rationale and context described in #33828.

Script to reproduce the move:
https://gist.github.com/suo/16cbefaaeb67ca5a7c6caffd49b7f6e9
ghstack-source-id: 99079645

Test Plan: Make sure CI passes

Reviewed By: jamesr66a

Differential Revision: D20133869

fbshipit-source-id: 390e9241a9c85366d9005c492ac31f10aa96488e
2020-02-27 13:02:51 -08:00
Alex Cheparukhin
ee23944f46 [Caffe2] Fix shape inference for element-wise operators (#33431)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33431

Some elementwise operators don't have shape and type inference specified for the output tensor: `BitwiseOr`, `BitwiseAnd`, `BitwiseXor`, `Not`, `Sign`.

This change fixes this issue:
- For `Not` and `Sign` operators, the output has the same type and shape as the input, so `IdenticalTypeAndShapeOfInput` function is used to specify that.
- For bitwise operators created by `CAFFE2_SCHEMA_FOR_BINARY_BITWISE_OP` macro, the type and shape inference rules should be the same as for other binary element-wise operators, so `TensorInferenceFunction(ElementwiseOpShapeInference)` is used to specify that.

Also some tests were modified to ensure that the shape and type are inferred (`ensure_outputs_are_inferred` parameter)

Test Plan:
```
CAFFE2_ASSERT_SHAPEINFERENCE=1 buck test caffe2/caffe2/python/operator_test:elementwise_ops_test
CAFFE2_ASSERT_SHAPEINFERENCE=1 buck test caffe2/caffe2/python/operator_test:math_ops_test
```

Note that the tests have to be executed with `CAFFE2_ASSERT_SHAPEINFERENCE=1` in order to fail upon shape inference failure.

Reviewed By: idning

Differential Revision: D19880164

fbshipit-source-id: 5d7902e045d79e5669e5e98dfb13a39711294939
2020-02-25 09:03:06 -08:00
Xinyi Zhang
696527e659 [caffe2] Add embedding empty ratio checker (disabled by default) (#33145)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33145

Reviewed By: xianjiec

Differential Revision: D19716574

fbshipit-source-id: 42a636600ac3977910d35093916865790bbe5b10
2020-02-24 16:10:01 -08:00
Jongsoo Park
e95282ab28 [caffe2] make fused rowwise quant/dequant op work for N-dim tensors (#33426)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33426

Make 2/4/8-bit fused rowwise conversion operators more general to work for N-dim tensors

Test Plan: CI

Reviewed By: ellie-wen

Differential Revision: D19943136

fbshipit-source-id: 47008544dd7e1d11a346d34f35449e0fcc0e7ee0
2020-02-19 23:29:42 -08:00
Huayu Li
c75d06d854 Move gating part of SparseFeatureGating to local
Summary: In dper2, the local net is hard-coded by whitelisting some layers. Add SparseFeatureGating-related layers to the local net explicitly.

Test Plan:
* workflow: f167812211
* QRT: fall back looks normal

{F228442018}

Differential Revision: D19852280

fbshipit-source-id: 6fecc3d745c3f742d029575a7b9fe320618f1863
2020-02-16 14:18:27 -08:00
Johannes M Dieterich
6ade7e3a15 [ROCm] Enable 3D convolutions through ROCm (#33067)
Summary:
For both the Caffe2 and PyTorch backends, enable 3D convolutions through MIOpen.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33067

Reviewed By: BIT-silence

Differential Revision: D19880495

Pulled By: bddppq

fbshipit-source-id: 8f6f970910654c1c5aa871b48a04c1054875691c
2020-02-14 13:19:10 -08:00
Lu Fang
e5c7b7b8b5 Automatic update of fbcode/onnx to 04a29addfd5b912812addb8dea5f8763fbfaad01 (#33328)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33328

Previous import was 8b3f7e2e7a0f2aba0e629e23d89f07c7fc0e6a5e

Included changes:
- **[04a29add](https://github.com/onnx/onnx/commit/04a29add)**: Use // instead of # (#2598) <Lu Fang>
- **[f8e140a9](https://github.com/onnx/onnx/commit/f8e140a9)**: Kezhan/function update (#2596) <Ke Zhang>
- **[6185faae](https://github.com/onnx/onnx/commit/6185faae)**: fix the attribute types section in IR.md (#2590) <Ke Zhang>
- **[f254647a](https://github.com/onnx/onnx/commit/f254647a)**: Allow Constant operator to promote scalar and list to tensors. (#2592) <Jeremy Cochoy>
- **[f12ec799](https://github.com/onnx/onnx/commit/f12ec799)**: Add NegativeLogLikelihood(NllLoss) op (#2551) <liqunfu>

Test Plan: ci

Reviewed By: hl475

Differential Revision: D19897554

fbshipit-source-id: d8efb5c5ac8f9d71727de33c67af681ed8ec8123
2020-02-13 21:03:17 -08:00
Chaitanya Sri Krishna Lolla
2635055229 [ROCm] Enable 3D batch norms through MIOpen (#33262)
Summary:
Enable test for Caffe2
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33262

Differential Revision: D19880486

Pulled By: bddppq

fbshipit-source-id: af663a11137a53302e55198f38117ab6bdc9ec89
2020-02-13 11:29:51 -08:00
Lin Yang
9d9fa2eace [2/3] Bind Bucketize to PyTorch (#33014)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33014

Export Bucketize to PyTorch.

Test Plan: buck test caffe2/caffe2/python/operator_test:torch_integration_test

Reviewed By: bddppq

Differential Revision: D19737534

fbshipit-source-id: be1c892bb8d01da9892f221f150f1a2788ac732e
2020-02-11 23:20:10 -08:00
Lin Yang
6f46962f21 [1/3] Bind IndexHash to PyTorch (#33015)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33015

Export IndexHash to PyTorch

Test Plan:
buck test caffe2/caffe2/python/operator_test:torch_integration_test

      ✓ caffe2/caffe2/python/operator_test:torch_integration_test-2.7 - test_index_hash_op (caffe2.caffe2.python.operator_test.torch_integration_test.TorchIntegration) 0.151 44/50 (passed)

Reviewed By: bddppq

Differential Revision: D19727301

fbshipit-source-id: a65c954539e81a15577fe5c3c0deb3614e983534
2020-02-10 17:47:38 -08:00
Lu Fang
674dca0831 Automatic update of fbcode/onnx to 8b3f7e2e7a0f2aba0e629e23d89f07c7fc0e6a5e (#33075)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33075

Previous import was 65020daafa9183c769938b4512ce543fd5740f8f

Included changes:
- **[8b3f7e2e](https://github.com/onnx/onnx/commit/8b3f7e2e)**: Update Dropout and  BatchNorm to be Training Friendly (#2568) <Lara Haidar>
- **[61f0bbc5](https://github.com/onnx/onnx/commit/61f0bbc5)**: Fix a bug in ScatterND shape inference (#2577) <Bowen Bao>
- **[05bce9cf](https://github.com/onnx/onnx/commit/05bce9cf)**: add utility function to make reference attribute whose name is not the same as the attribute it refers. (#2583) <Ke Zhang>
- **[71181c83](https://github.com/onnx/onnx/commit/71181c83)**: Clarify spec for constant of shape with dim_n = 0 (#2567) <Negin Raoof>
- **[eadba733](https://github.com/onnx/onnx/commit/eadba733)**: Update sigs.md with link to calendar page (#2579) <Prasanth Pulavarthi>
- **[08562f8e](https://github.com/onnx/onnx/commit/08562f8e)**: Update working-groups.md (#2580) <Prasanth Pulavarthi>
- **[0e718913](https://github.com/onnx/onnx/commit/0e718913)**: Fix Slice op's shape inference logic (#2526) <Hariharan Seshadri>
- **[12111410](https://github.com/onnx/onnx/commit/12111410)**: Add missing spaces to Random*Like doc (#2572) <Takeshi Watanabe>
- **[7e6e61d6](https://github.com/onnx/onnx/commit/7e6e61d6)**: Contributing: fix typos (#2571) <Maher Jendoubi>
- **[bbd604ef](https://github.com/onnx/onnx/commit/bbd604ef)**: Add Einsum op (#2504) <Negin Raoof>
- **[fd3ab73a](https://github.com/onnx/onnx/commit/fd3ab73a)**: Clarify split supports zero length splits (#2544) <Negin Raoof>
- **[6dd73774](https://github.com/onnx/onnx/commit/6dd73774)**: Fix circleci build and drop unsupported Windows builds (#2565) <Wei-Sheng Chin>
- **[b3d201a2](https://github.com/onnx/onnx/commit/b3d201a2)**: Fix the formula of intermediate zero calculation for DynamicQuantizeLinear (#2556) <Yufeng Li>
- **[3613eb25](https://github.com/onnx/onnx/commit/3613eb25)**: Add wording to clarify. (#2555) <Dwayne Robinson>
- **[dfa4384c](https://github.com/onnx/onnx/commit/dfa4384c)**: Fix shape inference for Split with split attribute (#2328) <Shinichiro Hamaji>
- **[684fc1bc](https://github.com/onnx/onnx/commit/684fc1bc)**: Keep symbolic dims in Concat with a single input (#2418) <Shinichiro Hamaji>

Test Plan: ci

Reviewed By: hl475

Differential Revision: D19784487

fbshipit-source-id: 421cdc3394faeff0168853f4ff065fc599ca3967
2020-02-07 02:18:57 -08:00
Andrey Malevich
e76fa9822d [C2] Introduce extra_info force CPU tags for auto-generated iteration counter blobs (#32607)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32607

As desc.

Test Plan: Unit-test.

Reviewed By: xw285cornell, chocjy

Differential Revision: D19551567

fbshipit-source-id: 3a121351d2b4016e99a1536dec746be970698664
2020-02-05 23:49:27 -08:00
peng
18d1896ba0 Fix confusing "does not have GPU support" warning message (#30721)
Summary:
Many people who use caffe2 are confused by the "does not have GPU support" warning message.
https://github.com/facebookresearch/video-nonlocal-net/issues/6
facebookarchive/caffe2#346
facebookarchive/caffe2#1634
facebookarchive/caffe2#197

Many non-GPU reasons can cause this warning message. It is better to surface the underlying error info.
![image](https://user-images.githubusercontent.com/13826327/70129721-41175e00-16ba-11ea-85df-a4b1a1690149.png)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30721

Differential Revision: D19697413

Pulled By: ezyang

fbshipit-source-id: bd24b7c814e7e677352068b9e9f77a68de080159
2020-02-04 14:20:00 -08:00
James Reed
341fb6d11d Make caffe2/caffe2/python/models/seq2seq python3 compatible
Test Plan: waitforsadcastle

Reviewed By: dzhulgakov

Differential Revision: D19698403

fbshipit-source-id: 36b73e07e598c848abbe368e522484da9ba4c78f
2020-02-04 10:51:47 -08:00
Edward Yang
c47c78d0bf Revert D19597036: More code fakefp16 mapping unification
Test Plan: revert-hammer

Differential Revision:
D19597036

Original commit changeset: deed61945884

fbshipit-source-id: c057e57810a99464aefb00b645613ecd6a7c5533
2020-01-29 13:32:42 -08:00
Deepali Chourasia
e84f9d9d0c Fix TensorProtosDBInput AttributeError (#32274)
Summary:
https://github.com/pytorch/pytorch/issues/6794
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32274

Differential Revision: D19621889

Pulled By: ezyang

fbshipit-source-id: 1bdd042b6421a2798c7f1e9030dfc6dfc1246989
2020-01-29 12:05:43 -08:00
Yinghai Lu
642c9ef922 More code fakefp16 mapping unification
Summary: ATT

Reviewed By: amylittleyang

Differential Revision: D19597036

fbshipit-source-id: deed61945884fb4b01d058f3c72c75f5a937a41c
2020-01-29 11:01:24 -08:00
Xinyi Zhang
1f78bd0774 [caffe2] Early error throwing for currupted embeddings
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32717

Reviewed By: xianjiec

Differential Revision: D19604954

fbshipit-source-id: c02eccf048c0dba3f66d729ab1fda50f3cacef63
2020-01-28 16:55:29 -08:00
Jongsoo Park
e735395fc6 [caffe2] use 2-stage EmbeddingSpMDM interface (#32271)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32271

Use the 2-stage EmbeddingSpMDM interface in D19425982 to reduce the overhead of code cache lookup and lock contention.
Fix an issue in sparse_lengths_sum_benchmarks that generated empty indices when the average length is small, e.g. 1.

Test Plan: CI

Reviewed By: dskhudia

Differential Revision: D19425987

fbshipit-source-id: d5c5f0d46e0072403901809c31d516fa0f4b9b31
2020-01-22 19:05:36 -08:00
Dehua Cheng
685f090ac8 [Rowwise Pruning][c2 op] Add Quantile Op (#32448)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32448

Use binary search to compute the value at the given quantile among the input tensors.
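
An illustrative numpy sketch of the idea (not the operator's actual implementation): binary-search the value range for the point whose cumulative fraction of elements matches the requested quantile.

```
import numpy as np

def quantile_by_binary_search(tensors, q, iters=64):
    flat = np.concatenate([t.ravel() for t in tensors])
    lo, hi = float(flat.min()), float(flat.max())
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        # Fraction of elements <= mid; tighten whichever bound keeps q bracketed.
        if (flat <= mid).mean() < q:
            lo = mid
        else:
            hi = mid
    return hi
```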

Test Plan: Newly added unittests;

Reviewed By: jspark1105

Differential Revision: D19487604

fbshipit-source-id: 0dc6627b78d1310ac35b3f1d53b89cc89a697ece
2020-01-22 16:59:56 -08:00
Jongsoo Park
14e0bec9f2 [caffe2] remove unnecessary np.set_printoptions and fix test errors (#32475)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32475

As title

Test Plan: CI

Reviewed By: houseroad

Differential Revision: D19508778

fbshipit-source-id: fd9ad63607535980505d155f3e3c3b7c6b95daf7
2020-01-22 14:49:47 -08:00
peter
b77c25dec0 Fix dll load logic for Python 3.8 on Windows (#32215)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/31181 and https://github.com/pytorch/pytorch/pull/31162#discussion_r362495611.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32215

Differential Revision: D19501869

Pulled By: ezyang

fbshipit-source-id: 363824e52d2592ad968ecf1df345aa4c0daff915
2020-01-22 08:33:34 -08:00
Brian Wignall
f326045b37 Fix typos, via a Levenshtein-type corrector (#31523)
Summary:
Should be non-semantic.

Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos, with https://github.com/bwignall/typochecker to help automate the checking.

Uses an updated version of the tool used in https://github.com/pytorch/pytorch/pull/30606 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31523

Differential Revision: D19216749

Pulled By: mrshenli

fbshipit-source-id: 7fd489cb9a77cd7e4950c1046f925d57524960ea
2020-01-17 16:03:19 -08:00
Yanghan Wang
9b6ec61bfd exposing CPU/GPU Copy ops (#32248)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32248

expose CPU/GPU copy ops

Test Plan: buck test mode/dev-nosan caffe2/caffe2/python/operator_test:torch_integration_test

Reviewed By: houseroad

Differential Revision: D19405856

fbshipit-source-id: 1df4aa202e26647cb81e9fe7e4478e594a5f7f3e
2020-01-17 12:40:43 -08:00
Alexander Melnikov
4e69352713 Add 64bit atomic fetch add (#32354)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32354

adding an int64 version of AtomicFetchAdd

Reviewed By: bwasti

Differential Revision: D19434349

fbshipit-source-id: b2358e8c5c6b7cd7e7b21de974b4ee1b5258fcf4
2020-01-17 11:43:43 -08:00
David Gisser
91bdb872ce fix spelling mistake: excpected -> expected
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28817

Differential Revision: D18544562

Pulled By: dgisser

fbshipit-source-id: 51f728e807f9c4bb30f58585d5b6f436cb880153
2020-01-17 00:11:08 -08:00
Jing Huang
ef5ae4823a Register RoIAlignRotated with C10
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30785

Reviewed By: wat3rBro

Differential Revision: D18415056

fbshipit-source-id: e00376bec948309d53f2172697cd477449f769b2
2020-01-16 16:32:28 -08:00
Tim Gates
0392e8384b Fix simple typo: whos -> whose (#31288)
Summary:
Closes https://github.com/pytorch/pytorch/issues/31287
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31288

Differential Revision: D19166753

Pulled By: zou3519

fbshipit-source-id: da31ad323b8fafa7cbc502fda4e2eb6e02facfb6
2020-01-15 11:47:21 -08:00
Shu Liu
8c3ee9f2ba [Python] Deprecate use of scipy.misc.logsumexp and scipy.misc.comb (#32209)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32209

* Deprecate use of scipy.misc.logsumexp and scipy.misc.comb.
* Removed in 1.0.0 https://docs.scipy.org/doc/scipy-1.1.0/reference/generated/scipy.misc.logsumexp.html and https://docs.scipy.org/doc/scipy-1.2.1/reference/generated/scipy.misc.comb.html
* Use scipy.special.logsumexp and scipy.special.comb instead (see the sketch after this list).
* This diff updates most usages, except those in experimental folders.
* This diff does NOT fix existing lint/code/TARGETS issues.
* This diff does NOT autoformat code.
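
A minimal sketch of the replacement:

```
# Old imports (removed from recent scipy): from scipy.misc import logsumexp, comb
from scipy.special import logsumexp, comb

logsumexp([0.0, 0.0])   # log(2)
comb(5, 2)              # 10.0
```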

Test Plan: sandcastle auto unittests

Differential Revision: D19406460

fbshipit-source-id: 2103fa0d674d9671a0175f4ce54b3c887d22f04e
2020-01-15 10:40:47 -08:00
Jongsoo Park
879620e85e [caffe2] fix how np.clip is used in lengths_reducer_fused_{4,8}_rowwise_ops_test (#32086)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32086

np.clip(1, num_indices // 2, 10) -> np.clip(num_indices // 2, 1, 10)
Also rename batchsize -> num_rows to match what the variable actually does.
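
For reference, np.clip's signature is np.clip(a, a_min, a_max): the first argument is the value being clipped, so the fixed call clips num_indices // 2 into [1, 10] as intended (illustrative values below):

```
import numpy as np

num_indices = 8
np.clip(num_indices // 2, 1, 10)       # -> 4, the intended bounded count
np.clip(np.array([0, 3, 25]), 1, 10)   # -> array([ 1,  3, 10])
```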

Test Plan: CI

Reviewed By: hx89

Differential Revision: D19361521

fbshipit-source-id: 9ce864c7d7da046dc606afa5207da677ccf80f52
2020-01-14 22:53:28 -08:00
Lu Fang
f6f1e0aef5 Automatic update of fbcode/onnx to 65020daafa9183c769938b4512ce543fd5740f8f (#32125)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32125

Previous import was 57ebc587fcf3913b4be93653b0dd58c686447298

Included changes:
- **[65020daa](https://github.com/onnx/onnx/commit/65020daa)**: better error message for undefined inputs (#2540) <Yuxin Wu>
- **[8afff0e9](https://github.com/onnx/onnx/commit/8afff0e9)**: bump ORT version (#2538) <Lu Fang>
- **[3d9ca57e](https://github.com/onnx/onnx/commit/3d9ca57e)**: fix name of directory (#2537) <Prasanth Pulavarthi>
- **[df8fa2c9](https://github.com/onnx/onnx/commit/df8fa2c9)**: Repository guidelines (#2539) <Prasanth Pulavarthi>
- **[49cc2f02](https://github.com/onnx/onnx/commit/49cc2f02)**: Update CircleCI job to use Python3.6 (#2527) <bddppq>
- **[25ff79a4](https://github.com/onnx/onnx/commit/25ff79a4)**: Fix wrong model version, it's not 12 (the onnx_opset_version()), not 11 (the opset version of the latest stable), but 10 (#2478) <daquexian>
- **[7cebaed5](https://github.com/onnx/onnx/commit/7cebaed5)**: Fix Windows py3.5 CI (#2529) <bddppq>
- **[eddae00e](https://github.com/onnx/onnx/commit/eddae00e)**: Correct the order of arguments of InferShapes (#2500) <Shinichiro Hamaji>
- **[41b5afe6](https://github.com/onnx/onnx/commit/41b5afe6)**: Include <ostream> in common/status.h (#2519) <Casey Carter>
- **[423f1977](https://github.com/onnx/onnx/commit/423f1977)**: add 8 bit support to maxpool op (#2510) <Ashwini Khade>
- **[78593c2f](https://github.com/onnx/onnx/commit/78593c2f)**: add 8 bit support to reducemin and reducemax ops (#2516) <Ashwini Khade>

Test Plan: cont build

Reviewed By: benoitsteiner

Differential Revision: D19380034

fbshipit-source-id: ddce8450864a611773b2a32e2f0254c9bb6b6906
2020-01-14 15:21:37 -08:00
Silun Wang
28c1258f18 Scale init for batch-norm and layer-norm (#31983)
Summary:
Per discussion with Fei Tian, we need to add a `scale_init_value` to scale down the output of normalization such as batch-norm and layer-norm.

Currently we have `sparse_normalization_options` to normalize embedding pooling output. By default, scale = 1.0; we found it's better to set scale between 0.025 and 0.1: https://fb.quip.com/MiKUAibEaYhH

Besides, I am removing the tags from normalizers because it makes more sense to calculate norm ops in distributed trainers, not on parameter servers.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31983

Test Plan:
Testing LN and BN after sum-pooling --
baseline f160348514
LN: f160348609
BN: f160348710

{F226106518}

Layer norm after sum-pooling fwd_net https://fburl.com/sa4j207n
Layer norm after dot-prod fwd_net https://fburl.com/twggwyvb

## Unit Tests
Testing normalization after pooling
```
buck test caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test_4 -- test_sparse_pooling_batch_normalization
buck test caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test_4 -- test_dense_sparse_pooling_batch_normalization
buck test caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test_4 -- test_sparse_pooling_layer_normalization
buck test caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test_4 -- test_dense_sparse_pooling_layer_normalization
```

Testing normalization after dot-prod
```
buck test caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test -- test_last_layer_use_batch_norm
buck test caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test -- test_last_layer_use_layer_norm
```

Differential Revision: D19277618

Pulled By: SilunWang

fbshipit-source-id: ea323e33e3647ba55d2e808ef09d94ad7b45b934
2020-01-10 11:55:56 -08:00
Hector Yuen
9e9ca6ec37 add conversion functions to embedding tables (#31083)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31083

add (fp32/fp16) <-> (int8 rowwise quantized with fp32/fp16 scale and bias) conversions

Test Plan:
added unit tests
enhanced shape inference tests

Reviewed By: jspark1105

Differential Revision: D18920547

fbshipit-source-id: 6b3d7cb93f9d1669ecf511817d73976177632891
2020-01-08 16:56:12 -08:00
Yinghai Lu
d2fdf140af Combine all the user inputs together and convert them to fp16 (#31898)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31898

Att

Reviewed By: tracelogfb

Differential Revision: D19291357

fbshipit-source-id: 747ed5234ca042ceeaff2d094701ead7597ac3ee
2020-01-08 14:36:42 -08:00
Fei Tian
809ee9d04c Enable personalized FC weight_init and sparse_emb weight_init (#31707)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31707

Change the initialization value for FC weight init and sparse embedding lookup init.

The previous default initialization was uniform(-\sqrt(1/input_dim), \sqrt(1/input_dim)). Now we pass in a flexible hyperparameter, say \alpha, to change it to uniform(-\sqrt(\alpha/input_dim), \sqrt(\alpha/input_dim)).
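
An illustrative numpy sketch of the parameterized init (alpha is the new hyperparameter; the function name and shapes are ours, not dper's config keys):

```
import numpy as np

def uniform_fc_init(alpha, input_dim, output_dim, rng=np.random):
    bound = np.sqrt(alpha / input_dim)   # alpha = 1.0 recovers the old default
    return rng.uniform(-bound, bound, size=(output_dim, input_dim)).astype(np.float32)
```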

Reviewed By: chonglinsun

Differential Revision: D18825615

fbshipit-source-id: 4c5f2e07f2b3f5d642fd96d64dbf68892ebeb30b
2020-01-07 10:10:54 -08:00
Jiyan Yang
b102550d2c Allow to pass in masks through db (#31676)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31676

Facebook:

Previously we assumed the mask is passed in as a tensor, which is not feasible for sparse parameters.
Here we allow passing in the mask through a db path, which requires the masks to be stored in some db first.

Test Plan: unit tests

Reviewed By: ellie-wen

Differential Revision: D18928753

fbshipit-source-id: 75ca894de0f0dcd64ce17b13652484b3550cbdac
2019-12-30 20:54:27 -08:00
Xinyi Zhang
f4e955ff62 Change PackSegments to ensure consistent behavior between CPU and GPU
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31673

Reviewed By: Wakeupbuddy, BIT-silence

Differential Revision: D18925762

fbshipit-source-id: e0c318e97f69b14a54f43c176af57d98fbc16c9f
2019-12-30 13:31:45 -08:00
Jiyan Yang
90a187618e Integrate masked sparse Adagrad (#31641)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31641

Assuming mask is provided as a tensor

Test Plan: unit test

Reviewed By: ellie-wen

Differential Revision: D18928737

fbshipit-source-id: a4f3dd51769c2b56e5890043e91c18e6128be082
2019-12-27 18:40:50 -08:00
Dehua Cheng
35bee0c729 separate op for rowwise counter (#31612)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31612

Count the number of recent updates on rows. Exponential decay is applied to the counter with decay rate r, such that
    r^{counter_halflife} = 0.5;
If counter_halflife is nonpositive, this operator is turned off.
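
As a quick check, the decay rate implied by the half-life constraint:

```
counter_halflife = 10
r = 0.5 ** (1.0 / counter_halflife)   # r ** counter_halflife == 0.5
print(r)                              # ~0.933; the counter halves after counter_halflife decay steps
```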

Test Plan: added unittest

Reviewed By: chocjy

Differential Revision: D19217921

fbshipit-source-id: 96d850123e339212cc0e0ef352ea8a1b1bf61dfa
2019-12-27 12:18:39 -08:00
Jiyan Yang
4983ef8de1 Integrating MaskedAdagrad
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31640

Test Plan: unit test

Reviewed By: ellie-wen

Differential Revision: D18805278

fbshipit-source-id: 1def4a89b7e4e04385c762bf127d95c5e513180e
2019-12-26 17:18:39 -08:00
Fan Wang
39508501a4 Create byte-aware word lstm benchmark (#31260)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31260

1. Update the LiteLM dataset conversion script (fbcode/pytext/fb/tools/lite_lm_dataset_to_tensorproto.py)
2. Created a benchmark json file for byte-aware lstm word model (xplat/aibench/specifications/models/caffe2/assistant/lite_lm_len5.json)
3. In order to run the model -- created an int64 Tensor for the model, added batch gather ops to the BUCK file

Test Plan:
```
1. Create tensorproto of the model input
buck run mode/opt //pytext/fb/tools:byte_lm_dataset_to_tensorproto -- --in-path /mnt/vol/pytext/smart_keyboard/aibench/test_5.txt --out-path /mnt/vol/pytext/smart_keyboard/aibench/byteAwareWordLM/ --hidden_dim 203 --layers_num 2 --max_seq_len 64 --max_byte_len 15

2. Run the aibench command
buck run fbsource//xplat/aibench:run_bench -- -b aibench/specifications/models/caffe2/assistant/lm_byte_lstm_len5.json --remote --devices SM-G960U-8.0.0-26
```

Reviewed By: gardenia22

Differential Revision: D17785682

fbshipit-source-id: 351c3c8bae16449e72ac641522803b23a83349be
2019-12-26 16:44:30 -08:00
Jongsoo Park
7a12ccd003 optimize FloatToFused8BitRowwiseQuantized and Fused8BitRowwiseQuantizedToFloat (#31470)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31470

Optimize performance of these two operators.
Additionally use nearbyint instead of round to be consistent with 4-bit embedding table quantization.
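
A rough numpy sketch of the 8-bit rowwise scheme these operators implement; the real fused blob appends the per-row scale and bias to each row, while this sketch keeps them separate and uses np.rint as a stand-in for nearbyint:

```
import numpy as np

def float_to_8bit_rowwise(x):
    mins = x.min(axis=1, keepdims=True)
    scales = (x.max(axis=1, keepdims=True) - mins) / 255.0
    q = np.rint((x - mins) / np.maximum(scales, 1e-12)).astype(np.uint8)
    return q, scales.astype(np.float32), mins.astype(np.float32)

def float_from_8bit_rowwise(q, scales, mins):
    return q.astype(np.float32) * scales + mins
```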

Reviewed By: hyuen

Differential Revision: D19072103

fbshipit-source-id: efe96f14aeff7958cceb453ed625d3fd693891ff
2019-12-20 10:09:26 -08:00
Yanghan Wang
d08250c223 fix zero-batch handling in convtranspose (#24341)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24341

ConvTransposeOp doesn't crash for zero-batch, but it doesn't modify the output blob. This leads to buggy behaviour, especially when running the same network twice with different inputs, or when backpropagating during training.

It seems `ConvTransposeUnpoolBase<Context>::GetOutputSize` works for zero-batch, so I removed the check for `input.numel() > 0` and reshape the output blob before returning.

For CudnnConvTransposeGradientOp, it's a bit verbose to set `dfilter` and `dbias`, and it seems cuDNN can handle it, so simply remove the `X.numel() == 0` branch.

Test Plan: buck test mode/dev-nosan caffe2/caffe2/python/operator_test:conv_transpose_test -- --run-disabled

Reviewed By: BIT-silence

Differential Revision: D16807606

fbshipit-source-id: 0d72c5bd8f2e03c34465e7b530cca548d9bdd5e1
2019-12-18 15:06:36 -08:00
Vitaly Fedyunin
c5d2758c35 Disable flaky TestMomentumSGD.test_fp16momentum_sgd (#31369)
Summary:
Related to https://github.com/pytorch/pytorch/issues/31368
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31369

Differential Revision: D19147072

Pulled By: VitalyFedyunin

fbshipit-source-id: 6fad13be7b35f992d84a20f23877cad05ff18616
2019-12-17 19:16:54 -08:00
Yanghan Wang
52b8a52e4d move AliasWithNameOp to caffe2/operators
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31281

Reviewed By: houseroad

Differential Revision: D19053453

fbshipit-source-id: 350bfd5c001db9c17916dcae7ade8f56db1e9841
2019-12-17 02:39:40 -08:00
Sebastian Messmer
643ca5def2 Replace c10::guts::stuff with std::stuff (#30915)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30915

Since we now have C++14, we don't need these c10::guts helpers anymore
ghstack-source-id: 95777609

Test Plan: waitforsandcastle

Differential Revision: D18869639

fbshipit-source-id: 97716f932297c64c6e814410ac47b444c33d4e2e
2019-12-16 13:57:19 -08:00
Yuchen Hao
4a751dfc20 optimize MulGradient for common shapes (#19705)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19705

Optimizing for the case where there is a run of consecutive dims that are not broadcast followed by a run of consecutive dims that are broadcast.
For example, MulGradient(["dC", "A", "B"], ["dA", "dB"], broadcast=True, axis=0), where A.shape == dC.shape == [9508, 80] and B.shape == [80].
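
What the broadcasted gradients compute in that example, as a numpy sketch (for C = A * B with B broadcast along axis 0):

```
import numpy as np

dC = np.random.randn(9508, 80).astype(np.float32)
A = np.random.randn(9508, 80).astype(np.float32)
B = np.random.randn(80).astype(np.float32)

dA = dC * B                 # broadcast multiply, same shape as A
dB = (dC * A).sum(axis=0)   # reduce over the non-broadcast dim, shape [80]
```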

Test Plan:
In SKL T6,

Running mul_gradient_benchmark without this optimization
Operator #0 (dA, MulGradient) 11.9119 ms/iter

After this optimization,
Operator #0 (dA, MulGradient) 0.672759 ms/iter

Need to land D15291800 beforehand to fix the unit test error

Reviewed By: dmudiger

Differential Revision: D15075415

fbshipit-source-id: 0f97be17cf8f1dacbafa34cd637fb8bc1c5e5387
2019-12-11 11:39:52 -08:00
Summer Deng
a42d093db2 FCTransposed to FbFCPacked (#29766)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29766

Add FbgemmPackTranspose op to support packing of FCTransposed weights

Add the FCTransposed-to-FbFCPacked transformation to the Dper fp16 exporter

Test Plan:
```
buck test mode/opt caffe2/caffe2/fb/fbgemm:fb_fc_packed_op_test
```

```
buck test mode/opt caffe2/caffe2/python:layers_test
```

Differential Revision: D18482306

fbshipit-source-id: e8f1947b3d0d04892293509ebf88742f5f0f5997
2019-12-10 10:18:21 -08:00
Lu Fang
c34ef1aa2e Automatic update of fbcode/onnx to c08a7b76cf7c1555ae37186f12be4d62b2c39b3b (#30619)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30619

Previous import was fea8568cac61a482ed208748fdc0e1a8e47f62f5

Included changes:
- **[c08a7b76](https://github.com/onnx/onnx/commit/c08a7b76)**: doc: fix some typos at ONNXIFI (#2473) <Yorkie Liu>
- **[4be12d46](https://github.com/onnx/onnx/commit/4be12d46)**: remove workshop update since it is done (#2460) <Prasanth Pulavarthi>
- **[86107d1b](https://github.com/onnx/onnx/commit/86107d1b)**: Updated with correct URL to LICENSE (#2468) <Ryan Loney>
- **[9bf6fbb6](https://github.com/onnx/onnx/commit/9bf6fbb6)**: Update Argmin/Argmax (#2461) <Lara Haidar>
- **[748d81b8](https://github.com/onnx/onnx/commit/748d81b8)**: Fix windows conda build (#2452) <Ashwini Khade>
- **[a32db1c5](https://github.com/onnx/onnx/commit/a32db1c5)**: Delete duplicate word in comment (#2439) <Haibo Hao>
- **[e108da9a](https://github.com/onnx/onnx/commit/e108da9a)**: Fix bug in function body verifier (#2390) <G. Ramalingam>
- **[c3d3ef82](https://github.com/onnx/onnx/commit/c3d3ef82)**: docs: fix typo in IR.md (#2441) <Elliot Waite>

Test Plan: ci

Reviewed By: hl475

Differential Revision: D18766132

fbshipit-source-id: 13c04f21399579acb87a8f9fac2e4c329b0720b8
2019-12-10 10:15:08 -08:00
Chunli Fu
42324cb6e8 Change interface from map of TensorShape to shapeInfoMap (#30802)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30802

Change shape_hints from map<string, TensorShape> to ShapeInfoMap to capture dimType info from the model file.

Reviewed By: ipiszy

Differential Revision: D18821486

fbshipit-source-id: c5d9ed72e158d3698aba38900aeda00f776745b4
2019-12-10 00:35:11 -08:00
Supriya Rao
a51c5f5cbf Add JIT pass to insert permutes for conv ops (#30679)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30679

Caffe2 expects quantized ops to be in NHWC format while PyTorch inputs are in NCHW.
Add a JIT pass that inserts an NCHW-to-NHWC permute before each conv op and an NHWC-to-NCHW permute after it.
A graph rewriter then finds consecutive redundant permutes and removes them from the graph.

Test Plan:
python test/onnx/test_pytorch_onnx_caffe2_quantized.py TestQuantizedOps

Imported from OSS

Differential Revision: D18790518

fbshipit-source-id: 4dd39cf0b31b21f5586c0edfdce2260d4e245112
2019-12-05 18:51:16 -08:00
Brian Wignall
e7fe64f6a6 Fix typos (#30606)
Summary:
Should be non-semantic.

Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30606

Differential Revision: D18763028

Pulled By: mrshenli

fbshipit-source-id: 896515a2156d062653408852e6c04b429fc5955c
2019-12-02 20:17:42 -08:00
Chuan Jiang
6c9b188262 Support in-place update in IndexHashOp (#30275)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30275

`IndexHash` did not support in-place update.

Reviewed By: kennyhorror

Differential Revision: D18612231

fbshipit-source-id: adeccdf1ceb6107454555ff9cdf66fd5e5773f2a
2019-11-22 14:49:28 -08:00
Mengshi Zhang
5b6dd52e3c Build Unit Test of SparseRAdam
Summary: We added caffe2 python wrapper and unit test for the SparseRAdam C++ operator.

Test Plan:
Unit test is constructed following the design pattern of [Wngrad optimizer](https://our.intern.facebook.com/intern/diff/D8655724/). Test passed smoothly.
buck test //caffe2/caffe2/python:optimizer_test -- TestSparseRAdam

Test result:
{F221144048}

Reviewed By: wx1988

Differential Revision: D18330650

fbshipit-source-id: e0f4724c2b616b665e2a0fe2e5c3430696cca7ee
2019-11-18 15:22:37 -08:00
Lei Zhang
b45069b59f fix fc fp16 quantization (#29469)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29469

The original approach was to save both fp16 and fp32 weights for all models, which increased the file size and memory usage.

This diff saves only the 'used' blobs into the predictor file.

Test Plan:
fc clone workflow :
f149878151

ctr mbl feed test with fc fp16 quantization:
f149996395

No fp32 in local file
{F221750392}

QRT after the fix:
https://fburl.com/qrt/cp8r8263

Reviewed By: wx1988

Differential Revision: D18382503

fbshipit-source-id: 231c41668f25b1d35ca8d4358ce9b12ba60a4f91
2019-11-18 11:26:49 -08:00
James Reed
7a6c3b36a1 Switch ScriptModuleOp to use a unique_ptr
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29856

Test Plan: waitforsadcastle

Reviewed By: dzhulgakov

Differential Revision: D18516553

fbshipit-source-id: d1e2d49ec613d07b21cd30bd777fbd300032cba1
2019-11-14 19:36:00 -08:00
Yangxin Zhong
ed788ec780 Linearizable Label: Class Weights, Allow Missing Label, and Average by Batch Size (#29707)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29707

In D17885977, Linearizable label (a multi-class classification) was implemented in MTML.

In this diff, we add several items for Linearizable label:

- Assigning different weights to each class through ```model_def.tasks[i].class_weights```.

  - This option is a dictionary, the keys of which are indices of the classes and the values of which are weights for each class.

  - For example, if a linearizable-label task has 4 classes and its ```class_weights = {"0": 1, "1": 0.1, "2": 0.1, "3": 0.01}```, it means that in the loss function of this task, we assign weight 1 to its first class, weight 0.1 to its second and third class, and weight 0.01 to its fourth class. The index/order of classes follows the logic of linearizable label.

  - Note that when you assign different weights to different classes, you need to correct the calibration by setting an appropriate ```model_def.tasks[i].calibration.linearizable_class_weight```. Basically, the class weights in calibration should be the reciprocals of the class weights in loss function. So the ```calibration.linearizable_class_weight = {"0": 1, "1": 10, "2": 10, "3": 100}``` for the example above.

  - Example FBLearner job: f150763093

- We also support ```model_def.allow_missing_label_with_zero_weight``` for linearizable label, which will ignore those examples with first label missing, by assigning zero weights to them in loss function.

  - We need to set ```allow_missing_label_with_zero_weight = true``` to enable it.

  - Example FBLearner job: f150763093

- Last but not least, we update caffe2 operator ```SoftmaxWithLoss``` to support loss averaged by batch size.

  - We need to set ```model_def.tasks[i].loss.softmaxLoss.average_by_batch_size = true``` to enable it.

  - Previously, the loss was averaged by weight sum of examples in batch, which is still the default behavior now (when ```average_by_batch_size = null``` or ```average_by_batch_size = false```).

  - Without this new feature, the calibration will be incorrect when applying non-equal-weight training among different classes to a linearizable task.

  - Example FBLearner job with ```average_by_batch_size = true``` results in a correct calibration: f150763093

  - Example FBLearner job with ```average_by_batch_size = null``` results in an incorrect calibration: f150762990

Test Plan:
buck test caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_linearizable_label_task_with_class_weights
buck test caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_linearizable_label_task_with_zero_weight
buck test caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_linearizable_label_task_average_by_batch_size

All tests passed.

full canary: https://fburl.com/fblearner/troznfgh

Reviewed By: chenshouyuan

Differential Revision: D18461163

fbshipit-source-id: aaf3df031406ae94f74e2e365b57e47409ef0bfe
2019-11-13 16:52:27 -08:00
Yinghai Lu
f0dd7517f2 Add option to clean up allocated activations between c2 runs (#29619)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29619

att

Reviewed By: houseroad

Differential Revision: D18415190

fbshipit-source-id: 739aaf436578fac635df10de42b35e2b4368df37
2019-11-13 10:30:10 -08:00
Huan Gui
be757957ba Support softmax with D == 0 (#29167)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29167

As titled.

This fix is crucial as multi_channel splitting would create history that has no items (i.e., D == 0), which leads to flow failure.

Test Plan:
Unittest

flow test:

before fix: f148783160

after fix: f149082299

buck test mode/dev-nosan caffe2/caffe2/python/operator_test:softmax_ops_test

Reviewed By: xianjiec

Differential Revision: D18296081

fbshipit-source-id: e0bb2dc2c4e5b465e213f31e5c5ced3a7e1fd574
2019-11-11 00:46:10 -08:00
Mike Ruberry
991c2ac383 Disables flaky test_rand_quantization (#29463)
Summary:
See https://github.com/pytorch/pytorch/issues/28550.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29463

Differential Revision: D18405669

Pulled By: mruberry

fbshipit-source-id: 2984c3896a9260a06fbf052afb06e0cb8d28b53d
2019-11-08 13:51:22 -08:00
Xiaodong Wang
36b73d5a1b Hipify contrib/nccl (#29385)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29385

hipify contrib/gloo

Test Plan: OSS & sandcastle build

Reviewed By: bddppq

Differential Revision: D18373308

fbshipit-source-id: 39c232db36318af116c341f64d03642639575ecd
2019-11-08 10:39:17 -08:00
Edward Yang
4e21157e01 Revert "Revert D18171156: Merge Tensor and Variable." (#29299)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29299

This reverts commit 9c43b16df9, but also
with the changes from D18348622.  Comments there:

thpp-compatibility is used by admarket/adreview/service:adreviewservice and
libtorch is too big for the service to deal with.

thpp-compatibility doesn't support autograd, so we hack around dispatching
variables by using AutoNonVariableTypeMode everywhere we call into ATen,
so we never attempt to call into Variable stubs.  If you get it wrong,
you'll get an error like:

```
what():  Could not run 'aten::empty' with arguments from the 'VariableTensorId' backend. 'aten::empty' is only available for these backends: [SparseCPUTensorId, CPUTensorId, MkldnnCPUTensorId]. (lookup_ at caffe2/aten/src/ATen/core/dispatch/DispatchTable.h:298)
```

Test Plan:
Imported from OSS

```
buck test //thpp-compatibility/...
buck build mode/opt-clang admarket/adreview/service:adreviewservice
```

adreviewservice canary: https://our.intern.facebook.com/intern/ads/canary/422290029716387895 (comparing against parent comment due to current breakage) ==> experiment store https://our.intern.facebook.com/intern/experiment_store/experiment/43990006/
adfinder canary: https://our.intern.facebook.com/intern/ads/canary/422268535840333934
adindexer canary: https://our.intern.facebook.com/intern/ads/canary/422268550559034675

adreview second canary:  https://our.intern.facebook.com/intern/ads/canary/422307863515591925

canary without thpp-compat fixups https://our.intern.facebook.com/intern/ads/canary/422308951649168772

Reviewed By: dreiss

Differential Revision: D18353504

Pulled By: ezyang

fbshipit-source-id: 65feaba39fa07bb66762810909aeb38868668a30
2019-11-08 09:11:20 -08:00
Mike Ruberry
74b2d9ed2e Skips test_equiv_recurrent (#29255)
Summary:
This test is flaky, per issue https://github.com/pytorch/pytorch/issues/10322.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29255

Differential Revision: D18350782

Pulled By: mruberry

fbshipit-source-id: 53a7d33e17428c2484211618cb71e870ce2d6a03
2019-11-06 13:29:23 -08:00
Edward Yang
9c43b16df9 Revert D18171156: Merge Tensor and Variable.
Test Plan: revert-hammer

Differential Revision:
D18171156

Original commit changeset: 5b6a045beba3

fbshipit-source-id: f5581d902c2305018ea49f8473592be2a465560b
2019-11-06 10:57:00 -08:00
Mike Ruberry
2f2a0d1607 Disables test_atomic_ops and testInputOrder (#29145)
Summary:
These tests have been flaky for some time, see:

- https://github.com/pytorch/pytorch/issues/28179
- https://github.com/pytorch/pytorch/issues/9064

This PR disables them. The actual tests were added/updated 2+ years ago. It's unclear who, if anyone, would own them now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29145

Differential Revision: D18327937

Pulled By: mruberry

fbshipit-source-id: d02731d662aff3545b581272e5ae8db4e3097d87
2019-11-05 16:53:53 -08:00
Huan Gui
8a2dcff189 Add cuda version for operators BatchSparseToDense and BatchDenseToSparse (#29166)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29166

As titled

Test Plan:
unittest

 buck test  mode/dev-nosan  caffe2/caffe2/python/operator_test:batch_sparse_to_dense_op_test

Reviewed By: xianjiec

Differential Revision: D18197966

fbshipit-source-id: 7486300c509dd552ddb7484c2d83099f62878278
2019-11-05 13:06:23 -08:00
Kevin Chen
1189f559cc Creating new layer FCWithBootstrap used in bootstrapping uncertainty approach (#29152)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29152

Bootstrapping uncertainty approach: bootstrap the last layer before the last fully-connected layer. FCWithBootstrap is a new layer to handle the logic for the bootstrapping process.

Goal:
- return a struct with the bootstrapped indices and bootstrapped predictions from this layer
- separate the functionality in the train_net and eval_net
- save the bootstrapped FC in this object so that the eval_net can use them during prediction time

Reviewed By: wx1988

Differential Revision: D17822429

fbshipit-source-id: 15dec501503d581aeb69cb9ae9e8c3a3fbc7e7b5
2019-11-04 21:18:15 -08:00
Kevin Chen
56f7415795 L0 norm approx with budget (#29155)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29155

Update the L0 norm regularizer with a budget feature to penalize features over this limit

Formula and summary:

{F212248495}
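
The formula image above isn't reproduced in this log, so the following is only a hedged numpy sketch of the idea as described (the smooth surrogate, names, and constants are illustrative assumptions, not the actual dper implementation): penalize the approximate feature count only where it exceeds the budget.

```
import numpy as np

def approx_feature_count(w, beta=10.0):
    # Illustrative smooth surrogate for the number of non-zero weights:
    # each term tends to 1 as |w_i| grows and to 0 as w_i -> 0.
    return np.sum(1.0 - np.exp(-beta * np.abs(w)))

def budgeted_l0_penalty(w, budget, lam=1.0, beta=10.0):
    # Only the excess over the allowed feature budget is penalized.
    return lam * max(0.0, approx_feature_count(w, beta) - budget)

w = np.array([0.001, -0.002, 0.8, -1.2, 0.9])
print(budgeted_l0_penalty(w, budget=2))  # ~1.0: roughly one feature over budget
```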

Test Plan: * Unit test located in: ~/fbsource/fbcode/caffe2/caffe2/fb/dper/layer_models/tests/split_1/fsparse_nn_test.py

Reviewed By: un-disclosed, wx1988

Differential Revision: D17458138

fbshipit-source-id: 2ed9ce6f55573b0bfc0fefbfd392f90c7542a0fd
2019-11-04 21:09:53 -08:00
Xiaodong Wang
cb72c9f5b1 Make caffe2/fb folder compatible with AMD (#29131)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29131

caffe2_pb2.CUDA --> workspace.GpuDeviceType
workspace.NumCudaDevices() --> workspace.NumGpuDevices()

Also added the totalGlobalMem into get_device_properties(), which is needed by multi_gpu_utils.py

Test Plan:
sandcastle

f148921769

Reviewed By: bddppq

Differential Revision: D18290090

fbshipit-source-id: bde7c175d1fb6ff59a062266c1b17de39d113b24
2019-11-04 16:40:29 -08:00
Edward Yang
25261a4776 Merge Tensor and Variable. (#28620)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28620

All Tensors are Variables now, they just happen to have requires_grad=False. Tensors ALWAYS have `VariableTensorId` in their type set.

When constructing this patch, I had to make decisions about what I would fix in this patch, and what I would leave for follow up PRs. Here is the cleanup that happens in this patch:

- The `is_variable` property is removed from TensorOptions. I removed this immediately because unlike Tensor::is_variable, TensorOptions::is_variable doesn't respect our VariableTensorId thread-local state. This means that there were a bunch of places where TensorOptions::is_variable was false, which is obviously bogus in the world when tensor and variable are merged. Instead of keeping the method as a function that always returns true, I just opted to remove it entirely (it's not public API.) All places we set `is_variable` are deleted.
  - Knock on effect: there is no longer a separate DeprecatedTypeProperties for the variable and non-variable versions of type.
  - Knock on effect: instead of asserting on TensorOptions::is_variable, instead we just test `at::impl::variable_is_excluded()`
- There is now only one copy of the cuDNN RNN dropout cache, not two (I'm not sure why we had two to begin with)

Some cleanup that doesn't happen in this patch:
- Eliminating unnecessary uses of `make_variable`
- Eliminating `Tensor::is_variable`

The most subtle part of this patch is retaining tracing behavior: the fact that everything is a Variable means that more code gets routed to VariableType than before; this can change traces. I identified two places where we didn't appropriately turn off VariableType, mostly factory functions:

- `torch.tensor` must turn off VariableType before invoking `at::empty` to construct the tensor, as it subsequently does direct data access
- `tensor_slow` (invoked when you pass a Python scalar to a tensor argument) must turn off VariableType before calling `scalar_to_tensor` so the scalar gets traced as constant, rather than as a call to `scalar_to_tensor`.

Honestly, these are all giant hacks, and should be replaced with a more specialized guard that just toggles tracing.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: dreiss

Differential Revision: D18171156

Pulled By: ezyang

fbshipit-source-id: 5b6a045beba37492647e350190f495114e86504d
2019-11-04 14:59:57 -08:00
Kevin Wilfong
cddda17394 ParallelWorkersTest.testParallelWorkersInitFun is flaky (#29045)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29045

Addressing an issue seen in GitHub https://github.com/pytorch/pytorch/issues/28958

It seems sometimes the workers in this test don't stop cleanly.  The purpose of this test is to check that the init_fun in init_workers works as expected, which is captured by the assertEqual in the for loop in the test.  The behavior of stop() is not really important here.

The fact that it's returning false probably indicates that a worker is getting blocked, but that doesn't affect the correctness of the test.

Test Plan: Ran the test 100 times, it consistently succeeds.

Reviewed By: akyrola

Differential Revision: D18273064

fbshipit-source-id: 5fdff8cf80ec7ba04acf4666a3116e081d96ffec
2019-11-01 13:59:02 -07:00
Sergei Nikolaev
1e2049c566 #26426 fixed (#28715)
Summary:
This is the fix for reverted https://github.com/pytorch/pytorch/issues/26426
houseroad bddppq soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28715

Reviewed By: hl475

Differential Revision: D18146731

Pulled By: houseroad

fbshipit-source-id: 247366451a6334e84df82d00339521f797b33130
2019-11-01 12:53:01 -07:00
Xinyi Zhang
5821b9bf0f Remove error logging of high empty range ratio
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28854

Reviewed By: xianjiec

Differential Revision: D18206695

fbshipit-source-id: 4ce471f0236b2ceaf54ba1b1ce96e193feca720b
2019-10-30 12:55:25 -07:00
Huayu Li
793e2914e4 Support full id interations (#28769)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28769

Support full id interaction.

Test Plan:
* unit-tests
  * buck test caffe2/caffe2/python/operator_test:pack_ops_test --
  * buck test caffe2/caffe2/fb/dper/layer_models/tests:sparse_nn_attention_test -- test_sparse_nn_full_id

* canary
  * apply SUM + full id with max_length as 20 on SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID: f147253340 (v1: f146340704)

# of embeddings for this feature is 20:
{F219139816}

The corresponding ops: two lookups, which is as expected.
```
op {
  input: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_0/Repeat_0/sparse_lookup/w"
  input: "feature_preproc/output_features:SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM:values"
  input: "feature_preproc/output_features:SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM:lengths"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_0/Repeat_0/sparse_lookup/output"
  name: ""
  type: "SparseLengthsSum"
}
op {
  input: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/sparse_lookup/w"
  input: "feature_preproc/output_features:SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM:values"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/sparse_lookup/output"
  name: ""
  type: "Gather"
}
op {
  input: "feature_preproc/output_features:SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM:lengths"
  input: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/sparse_lookup/output"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/PackSegments/embedding_packed"
  name: ""
  type: "PackSegments"
  arg {
    name: "max_length"
    i: 20
  }
  arg {
    name: "pad_minf"
    i: 0
  }
}
op {
  input: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/PackSegments/embedding_packed"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/Reshape/reshaped_record"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/Reshape/old_shape"
  name: ""
  type: "Reshape"
  arg {
    name: "shape"
    ints: -1
    ints: 1280
  }
}
op {
  input: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/Reshape/reshaped_record"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_0"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_1"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_2"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_3"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_4"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_5"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_6"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_7"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_8"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_9"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_10"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_11"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_12"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_13"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_14"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_15"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_16"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_17"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_18"
  output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_19"
  name: ""
  type: "Split"
  arg {
    name: "axis"
    i: 1
  }
}
```

Reviewed By: chonglinsun

Differential Revision: D18083520

fbshipit-source-id: f592fb7734dd4e3e712ba42dc0afcd0b32a4afa0
2019-10-29 14:56:18 -07:00
Xinyi Zhang
f5ea2ca34a Reduce logging frequency for empty range tolerance
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28704

Reviewed By: xianjiec

Differential Revision: D18138828

fbshipit-source-id: 4f3c376502cb6e30b931217702c4ca537c9eb644
2019-10-28 09:52:17 -07:00
Lu Fang
c89340f068 Extend HasElements to support multiple inputs (#28717)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28717

Make HasElements support multiple inputs. If any input has elements, return true.
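
A minimal Python sketch of the extended semantics (illustrative only, not the C++ op):

```
import numpy as np

def has_elements(*tensors):
    # True if any of the inputs is non-empty.
    return any(t.size > 0 for t in tensors)

print(has_elements(np.empty((0, 4)), np.empty((0,))))   # False
print(has_elements(np.empty((0, 4)), np.ones((2, 3))))  # True
```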

Test Plan: to be added

Reviewed By: BIT-silence

Differential Revision: D17972759

fbshipit-source-id: 3ecdea74a30fcfaaa6490fef1debc6cde68db922
2019-10-27 23:00:07 -07:00
Junjie Bai
d37c2d7c8d Revert D17495965: TensorRT 6.0 support and PyTorch->ONNX->TRT6 unit test
Test Plan: revert-hammer

Differential Revision:
D17495965

Original commit changeset: 3e8dbe8943f5

fbshipit-source-id: d47fcbec22b0d61df41d7dbf15cfdde196ac818f
2019-10-25 13:58:16 -07:00
Sergei Nikolaev
4996e3aca2 TensorRT 6.0 support and PyTorch->ONNX->TRT6 unit test (#26426)
Summary:
This PR makes Caffe2 compatible with TensorRT 6. To make sure it works well, new unit test is added. This test checks PyTorch->ONNX->TRT6 inference flow for all classification models from TorhchVision Zoo.
Note on CMake changes: it has to be done in order to import onnx-tensorrt project. See https://github.com/pytorch/pytorch/issues/18524 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26426

Reviewed By: hl475

Differential Revision: D17495965

Pulled By: houseroad

fbshipit-source-id: 3e8dbe8943f5a28a51368fd5686c8d6e86e7f693
2019-10-25 13:01:57 -07:00
Xinyi Zhang
2f16284231 change empty range tolerance logging
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28489

Differential Revision: D18067322

fbshipit-source-id: 2096d1cce820f4ebe28db0045a2ddacc022e07da
2019-10-23 09:39:39 -07:00
Jason Fried
9705d60a2f get rid of deprecated thread.isAlive() to use py2.6 modern form is_alive()
Summary:
Codemod to remove all thread.isAlive() calls, since the deprecation warning breaks some tests that monitor the output of their CLIs.

is_alive() was added in Python 2.6, so this is super safe.

This is a codemod; it doesn't matter whether the code supports Python 3, only that it's Python code.

Test Plan: unittests

Reviewed By: cooperlees

Differential Revision: D18069520

fbshipit-source-id: 4ca4dcb541c0b0debeb194aba5d060152ad0ef0e
2019-10-22 15:37:31 -07:00
Jiyan Yang
07a181da1d Add more logging in net modifier
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28327

Test Plan:
Failed as expected and the full protobuf is logged
f145060005

Reviewed By: ffjiang, wx1988

Differential Revision: D17975560

fbshipit-source-id: 5375acffc1f9dede16622b06eb58b6c3a26ebe5a
2019-10-21 17:53:00 -07:00
Xinyi Zhang
06bb74ce96 Tolerate small amount of embedding corruptions
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28371

Reviewed By: xianjiec

Differential Revision: D18031155

fbshipit-source-id: a51d2a62a919f032dc04372b30cf9071aa2dd629
2019-10-21 16:23:25 -07:00
Jiang Wu
29f56eb920 Revert D17937850: Tolerate small amount of embedding corruptions
Test Plan: revert-hammer

Differential Revision:
D17937850

Original commit changeset: e9c633768d98

fbshipit-source-id: 5c2c837c7867504392b19965d91a60cadd3b8101
2019-10-19 14:17:01 -07:00
Xinyi Zhang
ca6ba06f95 Tolerate small amount of embedding corruptions
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28299

Reviewed By: Wakeupbuddy

Differential Revision: D17937850

fbshipit-source-id: e9c633768d9819fd734ddd59017c33688ebbdcca
2019-10-18 14:59:06 -07:00
Peiyao Zhou
46fefc98e2 Change dper3 loss module to match dper2 (#28265)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28265

Fix the difference between dper3 and dper2 when regressionLoss is used.

Test Plan:
test using dper2 model id f134632386
Comparison tool output before change:
```
FOUND OP DIFFERENT WITH DPER2!!!
OP is of type ExpandDims
OP inputs ['supervision:label']
OP outputs ['sparse_nn/regression_loss/mean_squared_error_loss/ExpandDims:0']
===============================
Finished all dper3 ops, number of good ops 11, bad ops 1, skipped 26
run_comparison for dper2 / dper3 nets running time: 0.0020143985748291016
result type: <class 'NoneType'> result: None
```

After change:

```
FOUND OP DIFFERENT WITH DPER2!!!
OP is of type ExpandDims
OP inputs ['sparse_nn_2/regression_loss_2/mean_squared_error_loss_8/Squeeze:0_grad']
OP outputs ['sparse_nn_2/over_arch_2/linear_2/FC_grad']
===============================
Finished all dper3 ops, number of good ops 19, bad ops 1, skipped 16
run_comparison for dper2 / dper3 nets running time: 0.0017991065979003906
result type: <class 'NoneType'> result: None
```

dper2  label part of net P111794577
dper3  label part of net after change P116817194

Reviewed By: kennyhorror

Differential Revision: D17795740

fbshipit-source-id: 9faf96f5140f5a1efdf2985820bda3ca400f61fa
2019-10-18 10:08:38 -07:00
Long Jin
76bf8f62f7 fix loss_weight for self_supervision
Summary: Previously, loss_weight was not used correctly for the self-supervision branch.

Test Plan: buck test mode/dev-nosan //caffe2/caffe2/fb/dper/layer_models/models/experimental/tests:tum_test

Reviewed By: xianjiec

Differential Revision: D17862312

fbshipit-source-id: 554b793a5caa3886946c54333c81a0d8a10230d9
2019-10-15 10:40:48 -07:00
Alyssa Wang
4b1096c652 Fix predict net issue with LRU hash eviction
Summary:
We are seeing error "[enforce fail at BlackBoxPredictor.cpp:134] ! !parameter_workspace->HasBlob(out). Net REMOTE of type predict_net writes to blob cat/NGRAM_QRT_VERSIONS_x_EVENT_TYPE_AUTO_FIRST_X/Pool_Option_0/Repeat_0/sparse_lookup/w which exists in the parameter workspace" in online testing for calibration models.
I suspect it's due to the op CopyRowsToTensorOp being used in prediction.

Test Plan:
f143080108 offline predict net does not contain CopyRowsToTensorNet, which looks right.
Waiting for Olga to test online behavior
dper2 canary:
https://fburl.com/fblearner/sv3o3yj1

Differential Revision: D17741823

fbshipit-source-id: 19721b632b5ea9ebfa1ef9ae0e99d3a10c926287
2019-10-14 16:08:14 -07:00
Benny Chen
d23d62cb1e Fix unaries to export fp16 instead of fp32 when rest of the model export to int8
Summary: Currently, accelerators have no concept of fp32; they only understand fp16 and int8 data inputs. To fix the issue here, we want to make sure unaries are exported as fp16 when the int8 exporter is turned on.

Reviewed By: kennyhorror

Differential Revision: D17743791

fbshipit-source-id: 7322d23eb12ac3f813b525fc0ddd066f95c8ca85
2019-10-14 10:51:17 -07:00
Lei Zhang
0e8d4836e4 add feature name into module and update position weighted to match dper2
Test Plan:
The notebook showed no diff for id score list
https://our.intern.facebook.com/intern/anp/view/?id=154764

Reviewed By: alyssawangqq

Differential Revision: D17649974

fbshipit-source-id: 84cb4ae372fc215295c2d0b139d65f4eacafae4a
2019-10-14 08:06:19 -07:00
Kevin Chen
275dfa3485 Initial commit for L0 norm approx (#27756)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27756

Implement approximate L0 norm for use in the dense feature regularizer that will be used for feature importance. The formula is as follows:
{F212246801}
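
The attached formula isn't reproduced here; as a stand-in for intuition, below is one common smooth L0 surrogate (the actual formula in this diff may differ):

```
import numpy as np

def approx_l0(w, beta=10.0):
    # Each term approaches 1 for large |w_i| and 0 as w_i -> 0, so the sum
    # approximates the count of non-zero (i.e. "important") weights while
    # remaining differentiable.
    return np.sum(1.0 - np.exp(-beta * np.abs(w)))

print(approx_l0(np.array([0.0, 0.001, 0.5, -2.0])))  # roughly 2
```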

Reviewed By: wx1988

Differential Revision: D17432708

fbshipit-source-id: 57d6c9c3dd1b4e210b9f10264075c57dbc9c8cb6
2019-10-11 11:24:34 -07:00
Kutta Srinivasan
415b17e81c Fix for flaky caffe2 dataio test (test_time_limit_reader_with_short_limit) (#27592)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27592

The caffe2 data reader test `test_time_limit_reader_with_short_limit` is flaky as-written because it places an upper bound on how much can be read, but under stress it is possible for fewer records to be read. The fix is to make the assertion check a fuzzy/range check rather than exact equality, since there's not a straightforward way to precisely test a timer-based feature.
ghstack-source-id: 91543898

Test Plan:
`buck test mode/dev-tsan //caffe2/caffe2/python:dataio_test-2.7 -- --stress-runs 20` -> P117156924 (with fix, 100% pass)

P117158750 - without fix, lots of failures in this test

Reviewed By: boryiingsu

Differential Revision: D17816775

fbshipit-source-id: 2ab0d3304fbd9c9806d37a4fe2912c840616db61
2019-10-10 13:53:58 -07:00
Jason Fried
b96f49885f caffe2 python ideep conv_op test_int8_convolution skip for python 3
Summary: This test was failing in 3.7; it turns out it was omitted by the test director in 3.6, so I added a skip for both versions.

Test Plan: the unit test is skipped in 3.7 and 3.6; all other tests pass.

Reviewed By: tomdz

Differential Revision: D17820967

fbshipit-source-id: 571f0ec7fe1b0cb50ead4e0d18c00151a701f36a
2019-10-08 21:31:11 -07:00
Lin Jiang
1f158adeee Add support for attention weight in SparseLookup (#26748)
Summary:
Support attention weights input to SparseLookup. In attention sum pooling, if attention weights can be pre-calculated before embedding lookup,  they can be passed to SparseLookup and processed by SparseLengthsWeightedSum op. One example is id_score attention sum pooling.

Essentially the net is converted from:
  LengthsSum(Mul(Gather(keys, w), att_weight))
to:
  SparseLengthsWeightedSum(keys, w, att_weight)

It unblocks potential efficiency gain with distributed training.
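
A rough numpy illustration of the equivalence, with made-up values and names standing in for the Caffe2 ops:

```
import numpy as np

w = np.arange(20.0).reshape(10, 2)          # embedding table
keys = np.array([1, 3, 7, 2, 2])            # flattened ids for the whole batch
lengths = np.array([3, 2])                  # ids per example
att = np.array([0.5, 1.0, 0.25, 2.0, 1.0])  # pre-computed attention weights

offsets = np.concatenate([[0], np.cumsum(lengths)])

# LengthsSum(Mul(Gather(w, keys), att))
weighted = w[keys] * att[:, None]
before = np.stack([weighted[offsets[i]:offsets[i + 1]].sum(axis=0)
                   for i in range(len(lengths))])

# SparseLengthsWeightedSum fuses the gather, scale, and segment sum into one op;
# the per-segment result is identical.
after = np.stack([(w[keys[offsets[i]:offsets[i + 1]]]
                   * att[offsets[i]:offsets[i + 1], None]).sum(axis=0)
                  for i in range(len(lengths))])

assert np.allclose(before, after)
print(before)
```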

Pull Request resolved: https://github.com/pytorch/pytorch/pull/26748

Test Plan: unit test

Reviewed By: chocjy

Differential Revision: D17553345

Pulled By: wheatkit

fbshipit-source-id: 60cc3c4b0bc1eade5459ac598e85286f3849a412
2019-10-08 20:22:25 -07:00
Swati Rallapalli
e63addfff6 Exponential decay of the weight of task loss (#27508)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27508

Implemented a simple exponential decay of the weight of the lr loss function, with a lower bound.
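
A hedged sketch of what such a schedule typically looks like (parameter names are illustrative; the real dper option names may differ):

```
def decayed_loss_weight(initial_weight, decay_rate, step, min_weight):
    # Exponential decay of the task-loss weight, clamped at a lower bound.
    return max(min_weight, initial_weight * decay_rate ** step)

# e.g. start at 1.0, decay by 0.9 per epoch, never drop below 0.2
print([round(decayed_loss_weight(1.0, 0.9, s, 0.2), 3) for s in range(20)])
```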

Test Plan:
buck test //caffe2/caffe2/fb/dper/layer_models/tests:mtml_test -- test_task_weight_decay
https://our.intern.facebook.com/intern/testinfra/testrun/3377699729136308

canary: f140103452

Reviewed By: chenshouyuan

Differential Revision: D17524101

fbshipit-source-id: 9a653e21a4ecb74dfc4ac949c9e3388f36ef3a20
2019-10-08 09:15:41 -07:00
Kevin Chen
c2223df578 Implement LpNorm regularizer to be used on the inputs for feature importance (#26376)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26376

* Create the new dense_feature_reg (FCInputLpNorm) for feature importance to be applied to the fully-connected layer for feature-importance.

Test Plan: * Unit test located in: `caffe2/caffe2/fb/dper/layer_models/tests/split_1/sparse_nn_test.py`

Reviewed By: un-disclosed

Differential Revision: D17360361

fbshipit-source-id: 1a0e119eeb17199a13dfffe58b3036ea4255e301
2019-10-03 09:39:42 -07:00
Xing Wang
a1513dced3 Integrate FC fp16 exporter into Dper2 (#26582)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26582

Add the blob quantization.
replace the op in the eval/predictor net.

Test Plan:
# Unit test:

-----

buck build fblearner/flow/projects/dper/tests/validators:test_exporter_options_validators

./buck-out/gen/fblearner/flow/projects/dper/tests/validators/test_exporter_options_validators#binary.par

----

buck build caffe2/caffe2/fb/dper/layer_models/tests:exporter_test

./buck-out/gen/caffe2/caffe2/fb/dper/layer_models/tests/exporter_test-2.7#binary.par

Reviewed By: chocjy

Differential Revision: D17439720

fbshipit-source-id: 68de5d0322b0111aeca5ed552210bf80a4cddc78
2019-09-29 10:19:28 -07:00
Simran Suresh Motwani
d63d7ab997 Expose PiecewiseLinearTransform to PyTorch
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26903

Test Plan: Unit Test

Reviewed By: bddppq

Differential Revision: D17585637

fbshipit-source-id: fe669aaf3301d7efb5c28ec0097945d55a71773d
2019-09-27 12:49:04 -07:00
Lu Fang
7163bfdf58 Fix the weird bug in control_flow_op_test.py (#26931)
Summary:
In some versions of Python, then_net and else_net may appear in a different order. Let's make sure we are iterating over the right arg node.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26931

Reviewed By: hl475

Differential Revision: D17614829

Pulled By: houseroad

fbshipit-source-id: 3f1b4eb91ecf4d808f58c34896d3e628aa2e0af0
2019-09-26 20:44:03 -07:00
Jongsoo Park
8fb756d3b2 batch size 0 support in ChannelShuffle DNNLOWP op (#26858)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26858

Handle batch size = 0 in ChannelShuffle operator

Test Plan: CI

Reviewed By: jianyuh

Differential Revision: D17591041

fbshipit-source-id: 63373aa752406c1f38401c3e93d8e1954ce7281e
2019-09-26 00:40:07 -07:00
Lu Fang
d6ee58494f Automatic update of fbcode/onnx to 23bb6ea1a71f08e200114a153f48bd7adb66d486 (#26441)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26441

Previous import was 1316afc9f972f81340faa05763e2898f38bcc3b0

Included changes:
- **[23bb6ea1](https://github.com/onnx/onnx/commit/23bb6ea1)**: Gemm optional bias (#2330) <James Allingham>
- **[1ac1f219](https://github.com/onnx/onnx/commit/1ac1f219)**: Changes for AIX platform (#1913) <kavanabhat>
- **[13b026f5](https://github.com/onnx/onnx/commit/13b026f5)**: Updated test cases for reshape (#2127) <James Allingham>
- **[97fcfe30](https://github.com/onnx/onnx/commit/97fcfe30)**: Replace is by == (#2326) <G. Ramalingam>
- **[3b5601e6](https://github.com/onnx/onnx/commit/3b5601e6)**: Updated docs for strides and dilations attributes  (#2291) <James Allingham>
- **[d0c697b1](https://github.com/onnx/onnx/commit/d0c697b1)**: Revamped test cases for Gemm (#2060) <James Allingham>
- **[a3955c3c](https://github.com/onnx/onnx/commit/a3955c3c)**: Add more shape inference tests for Logical operators to improve coverage (#2133) <Hariharan Seshadri>
- **[e2e12d97](https://github.com/onnx/onnx/commit/e2e12d97)**: Change incorrect use of ValueError to TypeError (#2304) <prcvih>
- **[1f4b5f8c](https://github.com/onnx/onnx/commit/1f4b5f8c)**: Support dynamic 'pads' and 'value' in Pad operator (#2031) <Hariharan Seshadri>

Test Plan: ci

Reviewed By: hl475

Differential Revision: D17466717

fbshipit-source-id: 0f89a7a5a821d2c693492c99b4bebd5966e21d9f
2019-09-24 05:38:52 -07:00
Aapo Kyrola
aeb6532e7f BlobReference __getattr__ can only throw AttributeError (#26654)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26654

As per the Python contract, __getattr__ may only raise AttributeError. Raising anything else breaks hasattr() and causes upstream issues.

A similar bug existed in PyTorch earlier.
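
A minimal sketch of why this matters; BlobReference here is a simplified stand-in for the real class, and _lookup is a hypothetical helper:

```
class BlobReference:
    def __getattr__(self, name):
        op = self._lookup(name)  # hypothetical helper
        if op is None:
            # Must be AttributeError: hasattr(blob, "Foo") relies on it.
            # Raising e.g. RuntimeError here would propagate out of hasattr()
            # and break callers that only wanted a yes/no answer.
            raise AttributeError(name)
        return op

    def _lookup(self, name):
        return {"Sum": lambda: "Sum op"}.get(name)

blob = BlobReference()
print(hasattr(blob, "Sum"))      # True
print(hasattr(blob, "Missing"))  # False, instead of an exception
```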

Test Plan: builds

Differential Revision: D17529471

fbshipit-source-id: bb6ac6c9e3be8b80fa2967e6a2e293afd1594cf9
2019-09-23 13:01:00 -07:00
Xing Wang
73ae23a4ea add support for real4bits quant (#25426)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25426

Add embedding table 4bit quantization support.

* add the conversion from fp32 to int4.
* using brew to pass the context so that the 4bit operators are added when generating the predictor net.

Reviewed By: kennyhorror, chocjy

Differential Revision: D16859892

fbshipit-source-id: a06c3f0b56a7eabf9ca4a2b2cb6c63735030d70b
2019-09-20 13:45:23 -07:00
Huan Gui
a8386d2a7d fix composite learning rate (#26227)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26227

In the previous implementation of composite lr, the lr_scale for each sub-policy would be overwritten by the last lr_scale.

Due to another bug in the unit test (policy_lr_scale was the same for all sub-policies), this bug was not detected by the unit test...

Fix: add an additional field in CompositeLearningRateItem so that we store lr_scale values for all sub-policies.

If the unit test is fixed, the error from the previous implementation is:
https://fburl.com/testinfra/ikdbnmey

With the fix,
https://fburl.com/testinfra/m694ehl1

Test Plan:
unittest

buck test  caffe2/caffe2/python/operator_test:learning_rate_op_test -- test_composite_learning_rate_op

Reviewed By: chocjy, alex1o1o7cloud

Differential Revision: D17380363

fbshipit-source-id: 161e9cb71bb2ea7f0734a3361e270616057a08e4
2019-09-18 17:34:17 -07:00
Xiaodong Wang
f341291bfb Support unpickle py2 NetDef object in py3 (#26147)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26147

We may try to unpickle in Python 3 a byte string that was pickled from Python 2. Therefore we need to pass encoding='latin1'.
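
For reference, Python 3's pickle API takes an encoding argument for exactly this case; a minimal sketch:

```
import pickle

def load_py2_blob(raw_bytes):
    # Python 2 str objects may hold arbitrary bytes; latin1 maps every byte
    # value 1:1 to a code point, so nothing is lost or rejected on load.
    # Illustrative helper, not the actual caffe2 call site.
    return pickle.loads(raw_bytes, encoding='latin1')
```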

Reviewed By: kennyhorror

Differential Revision: D17305677

fbshipit-source-id: c0c8a51909629a65eb72bb81cccfbabaee9f8d01
2019-09-18 02:02:34 -07:00
Lu Fang
bebc3d6aad Automatic update of fbcode/onnx to 1316afc9f972f81340faa05763e2898f38bcc3b0 (#26309)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26309

Previous import was 95252c2adec185e305e34486c6756ece9aa8f57f

Included changes:
- **[1316afc9](https://github.com/onnx/onnx/commit/1316afc9)**: Update IR doc to clarify initializers are permitted as node inputs (#2320) <G. Ramalingam>
- **[5e920d0c](https://github.com/onnx/onnx/commit/5e920d0c)**: Avoid uses of special chars (#2315) <Wei-Sheng Chin>
- **[2fa08b0f](https://github.com/onnx/onnx/commit/2fa08b0f)**: Regenerate ONNX proto and add release date to ver 6 IR (#2316) <Wei-Sheng Chin>
- **[adf9c7a3](https://github.com/onnx/onnx/commit/adf9c7a3)**: Add description of default type about y_zero_point (#2110) <Takeshi Watanabe>
- **[ee7072c7](https://github.com/onnx/onnx/commit/ee7072c7)**: Support make_attribute empty string (#2129) <shjwudp>
- **[f913b6e7](https://github.com/onnx/onnx/commit/f913b6e7)**: More unsqueeze tests (#2200) <James Allingham>
- **[57b51937](https://github.com/onnx/onnx/commit/57b51937)**: Fix resize shape inference issue in opset10 (#2294) <Bowen Bao>
- **[d7595f34](https://github.com/onnx/onnx/commit/d7595f34)**: Sequence related ops (#2249) <Bowen Bao>
- **[599f3da9](https://github.com/onnx/onnx/commit/599f3da9)**: Add helper function update_inputs_outputs_dims to tools (#2148) <Bowen Bao>
- **[3e6382bc](https://github.com/onnx/onnx/commit/3e6382bc)**: Update documentation about required input output types (#2310) <G. Ramalingam>
- **[0c765d9b](https://github.com/onnx/onnx/commit/0c765d9b)**: Shape inference for NMS (#2269) <Hariharan Seshadri>
- **[89266710](https://github.com/onnx/onnx/commit/89266710)**: Fix extra collect_snippets warning (#2277) (#2307) <Lutz Roeder>

Test Plan: ci

Reviewed By: hl475

Differential Revision: D17403954

fbshipit-source-id: 78a9c3ecf5aa7f7a0ba8ea30286eab61ee903772
2019-09-17 06:46:59 -07:00
Andrey Malevich
28d3eb8156 Back out "Back out "[Caffe2] Fix device_option propagation"" (#25908)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25908

Original commit changeset: f6e961e88c01

device_option propagation is completely broken in Caffe2 when pass-through operators are used. As an example, the Gather operator doesn't have a gradient and passes through its inputs, which results in incorrect detection of the components for sparse parameter aggregation (the component will be empty instead of the real device).
This diff is trying to fix this issue.

The original diff had a problem: Caffe2 did not handle cases where the device option is present but contains only metadata (for example, the ones auto-generated for reduction ops in the backward pass). This diff addresses that by merging device options during the backward pass.

Test Plan:
1. net_transform is finally working with Gather + FloatToHalf transformed model instead of failing because of incorrect number of components.
2. New unit-test.
3. Verify that previously broken benchmark is now passing

ezyang do you have suggestions what else I should test?

Reviewed By: ezyang

Differential Revision: D17281528

fbshipit-source-id: 4a1bc386f29f6a34fbf8008effde9d4890abebfa
2019-09-17 04:01:36 -07:00
Aapo Kyrola
20124c4814 guard dyndep with a lock (#26153)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26153

I suspect that our multithreaded test system causes issues with dyndep if two places try to call InitOpsLibrary concurrently. So perhaps we just guard this with a lock. This is just a guess fix, as it is impossible to repro.
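
The guess-fix amounts to the standard pattern below (a sketch; _do_load is a hypothetical stand-in for the actual library loading/registration):

```
import threading

_init_lock = threading.Lock()
_loaded_libs = set()

def InitOpsLibrary(path):
    # Serialize concurrent initialization from multiple test threads and make
    # repeated calls for the same library a no-op.
    with _init_lock:
        if path not in _loaded_libs:
            _do_load(path)          # hypothetical: the actual dlopen/registration
            _loaded_libs.add(path)
```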

Test Plan: sandcastle

Reviewed By: bddppq

Differential Revision: D17361310

fbshipit-source-id: 596634a2098b18881abbd26a5a727a5ba0d03b6e
2019-09-13 11:38:14 -07:00
Qi Zhou
076eaf4ccf Exposing Fused8BitRowwiseQuantizedToFloat in PyTorch (#26080)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26080

Will be used in the c2 ctr_mbl_feed model-to-PyTorch conversion.

Test Plan: Unit test

Reviewed By: yinghai

Differential Revision: D17337604

fbshipit-source-id: a90d9f5dc38301608d1562c6f2418e7f4616e753
2019-09-12 12:36:33 -07:00
Lu Fang
7e4ac8b851 Automatic update of fbcode/onnx to 7988d8360b11e6003560076e9b1d4aa426db3244 (#25959)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25959

Previous import was 28ca699b69b5a31892619defca2391044a9a6052

Included changes:
- **[7988d836](https://github.com/onnx/onnx/commit/7988d836)**: Supporting negative axes for all existing onnx ops (#2281) <Negin Raoof>
- **[5ca0a09e](https://github.com/onnx/onnx/commit/5ca0a09e)**: Update managingexperimentalops.md (#1981) <Joseph Spisak>
- **[bc0495c1](https://github.com/onnx/onnx/commit/bc0495c1)**: Fix link to community docs in readme (#2261) <Prasanth Pulavarthi>
- **[2fdb3ef6](https://github.com/onnx/onnx/commit/2fdb3ef6)**: move map and sequence types to onnx domain, (#2244) <Ke Zhang>
- **[568b65aa](https://github.com/onnx/onnx/commit/568b65aa)**: Improve compatiblity with proto3 and enable reading attributes (#2288) <Dmitri Smirnov>
- **[1f350f2c](https://github.com/onnx/onnx/commit/1f350f2c)**: Remove type info for loop variadic input in Loop op used to compose the Range op (#2287) <Hariharan Seshadri>
- **[eb139446](https://github.com/onnx/onnx/commit/eb139446)**: Add Foundation WG to working-groups.md (#2276) <Ryan Loney>
- **[4eabc4b3](https://github.com/onnx/onnx/commit/4eabc4b3)**: Fix testdata model for CumSum. Add exclusive attribute. (#2271) <jignparm>
- **[1a62afdb](https://github.com/onnx/onnx/commit/1a62afdb)**: Support GatherND operator in ONNX (#2106) <Hariharan Seshadri>
- **[0e330e9d](https://github.com/onnx/onnx/commit/0e330e9d)**: Support ScatterND operator in ONNX (#2220) <Bowen Bao>
- **[733f7a6a](https://github.com/onnx/onnx/commit/733f7a6a)**: Add Det to ONNX (#2233) <Bowen Bao>
- **[52187738](https://github.com/onnx/onnx/commit/52187738)**: Update the description of nearest_mode of resize op (#2257) <daquexian>
- **[64b4b686](https://github.com/onnx/onnx/commit/64b4b686)**: Adding sparse tensor to ONNX (#2019) <G. Ramalingam>
- **[c8a8b7cc](https://github.com/onnx/onnx/commit/c8a8b7cc)**: Support Range operator in ONNX (#2242) <Hariharan Seshadri>
- **[44b0d6d5](https://github.com/onnx/onnx/commit/44b0d6d5)**: Update resize op (#2057) <daquexian>
- **[7d907964](https://github.com/onnx/onnx/commit/7d907964)**: Add function to fuse dynamic quantization graph into 1 node (#2187) <Ashwini Khade>
- **[36f8e6d9](https://github.com/onnx/onnx/commit/36f8e6d9)**: Update logo_request.md (#2231) <Prasanth Pulavarthi>
- **[4eb737c8](https://github.com/onnx/onnx/commit/4eb737c8)**: Update Clip in opset 11 to support min/max as inputs instead of attributes (#2096) <Bowen Bao>
- **[a25e1388](https://github.com/onnx/onnx/commit/a25e1388)**: Fix segfault in tile shape inference (#2221) <daquexian>
- **[2dc273c7](https://github.com/onnx/onnx/commit/2dc273c7)**: update onehot shape inference to reflect the spec for depth input (#2224) <Ashwini Khade>
- **[665211c1](https://github.com/onnx/onnx/commit/665211c1)**: Add GatherElements Op and Rename ScatterElements (#2143) <Lara Haidar>
- **[3ba2e31a](https://github.com/onnx/onnx/commit/3ba2e31a)**: Unique (#2141) <liqunfu>
- **[5a5588ad](https://github.com/onnx/onnx/commit/5a5588ad)**: Clarify dimension variable scoping (#2211) <G. Ramalingam>
- **[fabe39d5](https://github.com/onnx/onnx/commit/fabe39d5)**: Liqun/topk sort (#2126) <liqunfu>
- **[453aa644](https://github.com/onnx/onnx/commit/453aa644)**: Update document for NMS (#2193) <Hector Li>
- **[34e28ec2](https://github.com/onnx/onnx/commit/34e28ec2)**: Handle negative 'axis' value in Split type and shape inferencing (#2177) <Scott McKay>
- **[28ec4583](https://github.com/onnx/onnx/commit/28ec4583)**: depth to space shuffle order (#2163) <Negin Raoof>
- **[98f72629](https://github.com/onnx/onnx/commit/98f72629)**: minor updates to fix links in readme (#2189) <Prasanth Pulavarthi>
- **[321d1467](https://github.com/onnx/onnx/commit/321d1467)**: Add check to disallow squeezing input axes which are not 1 (#2204) <Ashwini Khade>
- **[573f0dc9](https://github.com/onnx/onnx/commit/573f0dc9)**: fix a bug in fun shape inference (#2188) <Tang, Cheng>
- **[36dc7110](https://github.com/onnx/onnx/commit/36dc7110)**: Clarify ambiguity in gather spec regarding indices expectation (#2202) <Ashwini Khade>
- **[a2449673](https://github.com/onnx/onnx/commit/a2449673)**: Fix some minor issues in IR.md and Versioning.md (#2108) <edgchen1>
- **[349aff69](https://github.com/onnx/onnx/commit/349aff69)**: Skip install typing package for python >=3.5 (#2199) <bddppq>

Test Plan: ci

Reviewed By: bddppq, benoitsteiner

Differential Revision: D17296390

fbshipit-source-id: 9f9f5ce85d9694128008d756c2ea393bd4e0cb71
2019-09-12 12:15:03 -07:00
Dmytro Dzhulgakov
a6a7f35481 Better error messages in C2 ONNX backend (#25809)
Summary:
Just a tiny fix to make debugging easier (output errors to stderr and include in the exception message)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25809

Reviewed By: zrphercule

Differential Revision: D17329957

Pulled By: houseroad

fbshipit-source-id: 0d73dd9f62c735fbc5096e6a7c0e5f58e4cd90ae
2019-09-11 17:00:42 -07:00
Junjie Bai
a7eb18e243 Enable Unique operator tests on ROCm
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26046

Differential Revision: D17331522

Pulled By: bddppq

fbshipit-source-id: 729624d1df15a1c0c7ba2b7e7e3c3a903fb13abf
2019-09-11 16:36:14 -07:00
Swati Rallapalli
c47ccfd01d Enable variable size embedding (#25782)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25782

Enable variable-size embeddings for the dot processor. We split the embedding matrix into multiple towers based on embedding size, perform the dot product in a loop over each tower, and finally concatenate all the dot product outputs.
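
A loose numpy illustration of the tower idea; exactly what "dot product" means for the dot processor isn't spelled out here, so this pairwise-dot reading, plus all shapes and names, is an assumption:

```
import numpy as np

def tower_dot(towers):
    # towers: one [num_features_i, dim_i] matrix per embedding size.
    outputs = []
    for emb in towers:
        # pairwise dot products between features that share an embedding size
        dots = emb @ emb.T
        iu = np.triu_indices(emb.shape[0], k=1)
        outputs.append(dots[iu])
    # concatenate the per-tower results into a single dense output
    return np.concatenate(outputs)

towers = [np.random.rand(4, 8), np.random.rand(3, 16)]  # dims 8 and 16
print(tower_dot(towers).shape)  # (4 choose 2) + (3 choose 2) = 6 + 3 = 9
```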

Test Plan:
buck test //caffe2/caffe2/fb/dper/layer_models/tests/split_1:
https://our.intern.facebook.com/intern/testinfra/testrun/3659174703037560

Specific unit tests --
buck test //caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test -- test_per_feature_emb_dim
https://our.intern.facebook.com/intern/testinfra/testrun/3377699726358808

Reviewed By: chenshouyuan

Differential Revision: D16690811

fbshipit-source-id: 8f5bce5aa5b272f5f795d4ac32bba814cc55210b
2019-09-09 22:08:32 -07:00
Edward Yang
f70ef229ce Back out "[Caffe2] Fix device_option propagation"
Summary: Original commit changeset: 916551b93346

Test Plan: none

Reviewed By: nairbv

Differential Revision: D17259017

fbshipit-source-id: f6e961e88c01126393ed2b6be0adeb6fcc68cb3c
2019-09-09 07:22:42 -07:00
Andrey Malevich
bd0e564d40 Fix device_option propagation (#25203)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25203

device_option propagation is completely broken in Caffe2 when pass-through
operators are used. As an example, the Gather operator doesn't have a gradient
and passes through its inputs, which results in incorrect detection of the
components for sparse parameter aggregation (the component will be empty
instead of the real device).

This diff is trying to fix this issue.

Test Plan:
net_transform is finally working with Gather + FloatToHalf transformed model
instead of failing because of incorrect number of components.

Reviewed By: dzhulgakov

Differential Revision: D16936041

fbshipit-source-id: 916551b933469f04e32ddf86ec4b2c07f76c9176
2019-09-06 19:05:04 -07:00
Frank Jiang
3be1745b3c Make SparseNormalize backwards compatible (#25660)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25660

As title

Test Plan:
buck test caffe2/caffe2/python/operator_test:sparse_normalize_test
https://our.intern.facebook.com/intern/testinfra/testrun/5910974517813190

Reviewed By: boryiingsu

Differential Revision: D17187839

fbshipit-source-id: 1e5a6eaac0e825db4ae969540a1f689444070579
2019-09-05 15:14:21 -07:00
Jongsoo Park
8199bb3dd3 add options to flush cache in SLS benchmarks (#25530)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25530

Add an option to flush cache for more consistent benchmarking.

Test Plan:
buck run mode/opt caffe2/caffe2/fb/python/benchmarks:sparse_lengths_sum_4bit_benchmark -- --flush-cache
buck run mode/opt caffe2/caffe2/python/operator_test:sparse_lengths_sum_benchmark -- --flush-cache

Reviewed By: hyuen

Differential Revision: D17148087

fbshipit-source-id: 7eb782986676620254c1619a9a48c656cb1a6856
2019-09-03 05:09:03 -07:00
Jongsoo Park
f1059d4e6a format sparse_lengths_sum_benchmark (#25529)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25529

To prepare D17148087

Test Plan: Just formatting

Reviewed By: hyuen

Differential Revision: D17148085

fbshipit-source-id: faff90ee7dfec543d47037d20ce00f251144bc06
2019-09-03 05:08:59 -07:00
Xing Wang
8a8844dc83 Add the sparse feature information during logging in sparse lookup layer (#24863)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24863

Add the sparse feature name in logging for ease of debugging

Test Plan:
./buck-out/gen/caffe2/caffe2/fb/dper/layer_models/sparse_nn/pooling_test#binary.par  -r test_simple_sum_pooling_named_exception

Another test for id_score_list. The original sparse_key is equivalent to get_key(self.input_record)()
P98343716

./buck-out/gen/caffe2/caffe2/python/layers_test-2.7#binary.par -r test_get_key

Reviewed By: chocjy

Differential Revision: D16901964

fbshipit-source-id: 2523de2e290aca20afd0b909111541d3d152a588
2019-08-27 23:25:26 -07:00
Yanghan Wang
e34ef04301 register HeatmapMaxKeypoint with C10 (#25191)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25191

registering as C10.

Test Plan: buck test mode/dev-nosan caffe2/caffe2/python/operator_test:heatmap_max_keypoint_op_test

Reviewed By: newstzpz

Differential Revision: D17056321

fbshipit-source-id: 989b72d7e3c9f23684b10d5fc9b98177ad4ee47b
2019-08-27 20:13:57 -07:00
Yu Shi
43a2fd0e24 Support focal loss in MTML
Summary:
[Not in need of review at this time]
Support focal loss in MTML (effectively dper2 in general) as described in https://arxiv.org/pdf/1708.02002.pdf. Adopt an approach similar to Yuchen He's WIP diff D14008545.
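
For reference, the focal loss from the cited paper in its basic binary form (a numpy sketch; the dper-specific lr_loss_based variants and the stop-grad option are not reproduced):

```
import numpy as np

def focal_loss(p, y, gamma=2.0, eps=1e-7):
    # p: predicted probability of the positive class, y: 0/1 labels.
    # The (1 - p_t)^gamma factor down-weights easy, well-classified examples.
    pt = np.where(y == 1, p, 1.0 - p)
    return -np.mean((1.0 - pt) ** gamma * np.log(pt + eps))

print(focal_loss(np.array([0.9, 0.2, 0.6]), np.array([1, 0, 1])))
```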

Test Plan:
Passed the following unit tests
buck test //caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test -- test_lr_loss_based_focal_loss
buck test //caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_mtml_with_lr_loss_based_focal_loss
buck test //caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test -- test_lr_loss_based_focal_loss_with_stop_grad_in_focal_factor

Passed ./fblearner/flow/projects/dper/canary.sh; URL to track workflow runs: https://fburl.com/fblearner/446ix5q6

Model based on V10 of this diff
f133367092
Baseline model
f133297603

Protobuf of train_net_1 https://our.intern.facebook.com/intern/everpaste/?color=0&handle=GEq30QIFW_7HJJoCAAAAAABMgz4Jbr0LAAAz

Reviewed By: hychyc90, ellie-wen

Differential Revision: D16795972

fbshipit-source-id: 7bacae3e2255293d337951c896e9104208235f33
2019-08-25 01:42:25 -07:00
Xiao Fang
3385693edd gradient clipping by norm
Summary: as titled

Reviewed By: hbjerry, alyssawangqq

Differential Revision: D16797498

fbshipit-source-id: 4ea05ab9f06b309d32faa3218e79899c9f8d9cf2
2019-08-22 11:20:40 -07:00
Frank Jiang
d7c6debc14 Remove gradient value as input from SparseNormalize op (#24357)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24357

SparseNormalize does not need to know the gradient value to the lookup table, only the indices of the embeddings that need to be updated. By removing this input, we allow SparseNormalize to be used alongside SparseAdagradFusion

Differential Revision: D16809919

fbshipit-source-id: cc19692ba4dea8854663ae1ed8cf9365e90c99bc
2019-08-19 14:47:09 -07:00
Yanghan Wang
3b22bbeb5b enable "keeps" from BoxWithNMSLimit and caffe2_fastrcnn_outputs_inference
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24451

Reviewed By: newstzpz

Differential Revision: D16850259

fbshipit-source-id: 22f69d71a558d63c32a27d271a7557fc35a55176
2019-08-19 10:54:22 -07:00
Bin Wen
e78dad3593 Add BPR loss to TTSN (#24439)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24439

Much of the literature mentions that BPR is useful for improving recommendation quality. Add a BPR loss so that we can train TTSN with it. We would like to see if it can improve retrieval models.

reference: https://arxiv.org/pdf/1205.2618.pdf
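
For reference, the pairwise BPR objective from the cited paper in minimal numpy form (a sketch; how positive/negative pairs are sampled for TTSN is not shown):

```
import numpy as np

def bpr_loss(pos_scores, neg_scores):
    # Maximize log sigmoid(score(pos) - score(neg)) over sampled (pos, neg)
    # pairs, i.e. push positive items above negative ones for each query.
    diff = pos_scores - neg_scores
    return -np.mean(np.log(1.0 / (1.0 + np.exp(-diff))))

print(bpr_loss(np.array([2.0, 1.5]), np.array([0.5, 1.7])))
```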

Reviewed By: dragonxlwang

Differential Revision: D16812513

fbshipit-source-id: 74488c714a37ccd10e0666d225751a845019eb94
2019-08-15 23:20:15 -07:00
neginraoof
3574d9ff70 updated pixel_shuffle in opset 11 to use depthToSpace
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23739

Differential Revision: D16800355

Pulled By: bddppq

fbshipit-source-id: 1502c5b7ec1495286bad17b6ffa359cf995f78fb
2019-08-15 11:37:44 -07:00
Fan Wang
59094c409e Refactor and expose metadata of tum_history layer for online prediction
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24290

Reviewed By: xianjiec

Differential Revision: D16570968

fbshipit-source-id: f68d42f3a8e1a6c8d30e00c2dd7f7efc1fb35d7c
2019-08-15 00:27:11 -07:00
Kevin Wilfong
88b1f6619e Return list of AccessedFeatures from get_accessed_features (#23983)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23983

While testing I realized that model layers can extract different types of features from the same column.  For example, MultifeedFeaturesTransform uses float and ID list features from the "features" column.

get_accessed_features returns a map from column to AccessedFeatures, and AccessedFeatures only has the feature IDs for one feature type. This is incompatible with having multiple types of features per column; one type ends up overwriting another in the map.

To fix this, I've modified get_accessed_features to return a map from column to a list of AccessedFeatures objects.

Reviewed By: itomatik

Differential Revision: D16693845

fbshipit-source-id: 2099aac8dc3920dd61de6b6ad5cf343c864803bc
2019-08-14 10:50:27 -07:00
Frank Jiang
1439152e72 Make hashing default for bucket-weighted pooling (#24266)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24266

As title

Reviewed By: huginhuangfb

Differential Revision: D16775870

fbshipit-source-id: f919fdffa014ef3ce9a69fe173dd240e91813c3e
2019-08-13 13:56:32 -07:00
Sergio Giro
dc870a3761 Hypothesis tests: add ability to enforce shape inference (#23935)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23935

Add parameter to enforce that outputs are inferred

Reviewed By: yinghai

Differential Revision: D16667772

fbshipit-source-id: 44f9c47133749b48c0db25a54f9bd9f4698f3e7d
2019-08-13 05:32:41 -07:00
Tongliang Liao
4f254c3c33 Fix typo "properlyh"
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24067

Differential Revision: D16732526

Pulled By: ezyang

fbshipit-source-id: 0f3a5b53c0e46bd40a6e5c838504301766c00a82
2019-08-09 11:43:04 -07:00
Yanghan Wang
ad64789a1e add aligned option to RoIAlign
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23706

Reviewed By: ppwwyyxx

Differential Revision: D16615823

fbshipit-source-id: fd9152af8bc979cb04044413e66af349b032a99d
2019-08-07 21:22:33 -07:00
Shali Jiang
15d3f0242b support Gather different indices for different examples in one batch (#23813)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23813

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23285

for example:

Inputs:
  data:
   [[[2 4 2 0],
     [0 1 2 0],
     [1 1 0 0]],
    [[3 4 1 3],
     [0 3 2 2],
     [4 1 0 4]]]

  idx:
    [[0 2],
     [0 1]]

outputs:
  [[[2 4 2 0],
    [1 1 0 0]],
   [[3 4 1 3],
    [0 3 2 2]]]

data and idx must have the same outer dimension

call Gather or BatchGather with argument match_outer=True
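
The semantics mirror the example above and can be sketched in a few lines of numpy (not the C2 implementation):

```
import numpy as np

def batch_gather_match_outer(data, idx):
    # For each example b in the batch, gather rows idx[b] from data[b].
    return np.stack([data[b][idx[b]] for b in range(data.shape[0])])

data = np.array([[[2, 4, 2, 0], [0, 1, 2, 0], [1, 1, 0, 0]],
                 [[3, 4, 1, 3], [0, 3, 2, 2], [4, 1, 0, 4]]])
idx = np.array([[0, 2], [0, 1]])
print(batch_gather_match_outer(data, idx))
```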

Reviewed By: huayuli00

Differential Revision: D16652485

fbshipit-source-id: 9e144e97a8d6fceaf3b5714df1534338068f4a10
2019-08-07 21:14:30 -07:00
Amy Yang
9588cd921e weight_names bug fix (#23848)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23848

Problem:
In an experiment running feed model 127607201 (/mnt/public/tracelog/feed_repro2/127607201_0.predictor), we encountered a blob dimensionality mismatch error when running the onnxified net. This is because the model initializes input blobs in the current workspace with blob size 0, and onnxifi() falsely identified those input blobs as weight blobs and assigned the wrong dimensions.

Solution:
Add an option to pass the correct weight blob names to onnxifi() instead of using all blobs in the current workspace.

Reviewed By: yinghai

Differential Revision: D16661396

fbshipit-source-id: cabe44db6b64e6538bef4b65e380312214b3ba9f
2019-08-06 10:58:43 -07:00
Andrey Malevich
d58059bc6f Fix SliceGradientOp to handle properly empty batches (#23784)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23784

The backward path does nothing when the input is empty; as a result, the
workspace can preserve gradient values from the previous iteration and feed
inconsistent inputs to some of the backward-pass operators. This diff fixes
this discrepancy by always reinitializing the output during the backward path.

Reviewed By: dzhulgakov

Differential Revision: D16646096

fbshipit-source-id: 8ca68dfad17a63fc87c033cce7b36b40bd77245c
2019-08-06 02:43:32 -07:00
Michael Suo
a3c165f9d2 Revert D16452539: support Gather different indices for different examples in one batch
Differential Revision:
D16452539

Original commit changeset: 7229489f4a9c

fbshipit-source-id: 010c177e551cb81521d2af84ce951bf964cdab44
2019-08-05 10:22:01 -07:00
Shali Jiang
f87a4cc23f support Gather different indices for different examples in one batch (#23285)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23285

for example:

Inputs:
  data:
   [[[2 4 2 0],
     [0 1 2 0],
     [1 1 0 0]],
    [[3 4 1 3],
     [0 3 2 2],
     [4 1 0 4]]]

  idx:
    [[0 2],
     [0 1]]

outputs:
  [[[2 4 2 0],
    [1 1 0 0]],
   [[3 4 1 3],
    [0 3 2 2]]]

data and idx must have the same outer dimension

call Gather or BatchGather with argument match_outer=True

Reviewed By: huayuli00

Differential Revision: D16452539

fbshipit-source-id: 7229489f4a9c02ee9f3c6a8a24bcd02925d96e07
2019-08-04 21:17:49 -07:00
Le Fang
a1b10270c2 Fix the bug in regularizer matching (#23485)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23485

In the previous diff D16326492, the "regularizer" in the dot processor is defined according to the input regularizer options through the function "get_emb_weighting_reg" in processor_utils.py. The option matching was only valid in local tests and doesn't work in workflows. This bug caused the regularizer not to be added in actual models and made the previous trimmed lasso implementation useless.

One piece of evidence: before D16326492, flow f126010621 had the elastic regularizer added:
https://our.intern.facebook.com/intern/chronos/jobinstance/?jobinstanceid=5375243255&smc=chronos_gp_admin_client

{F171862755}

while after D16326492, the regularizer is gone in flow f127262007
https://our.intern.facebook.com/intern/chronos/jobinstance/?jobinstanceid=5428982684&smc=chronos_gp_admin_client

{F171862770}

Differential Revision: D16535466

fbshipit-source-id: 6b0b5e95b2b14a0d6c6d65f96bab89529f4e79c5
2019-08-02 15:54:48 -07:00
Jiexian Li
302adf1d20 add LambdaRank DCG Loss Option (#23679)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23679
Full Canary: https://fburl.com/fblearner/sa1pkpya
Add LambdaRank DCG Loss Option
* when use_idcg_normalization == true, regular LambdaRank with NDCG loss
* when use_idcg_normalization == false, gradient and loss functions are not normalized by idcg.

Differential Revision: D16605459

fbshipit-source-id: a16f071e69516974e48d27bef4ca179019ca4ae7
2019-08-02 11:47:46 -07:00
Jiexian Li
fc6aec9491 format only change (#23685)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23685

format only changes.

Differential Revision: D16607482

fbshipit-source-id: 572afb59c6ff9f8a8842ba044fed6c87f8506843
2019-08-02 11:47:42 -07:00
Levent Ertoz
8d4956fd02 hook up dropout sparse with replacement operator
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23183

Reviewed By: ffjiang

Differential Revision: D16428262

fbshipit-source-id: 0d6e17d15c898629bbd2826441f2c9701a78b0bd
2019-07-23 14:34:25 -07:00
Levent Ertoz
6f01d13728 Implement dropout with replacement for id list features. (#22880)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22880

Implement sparse dropout with replacement value.

Reviewed By: xianjiec

Differential Revision: D16267012

fbshipit-source-id: 8c4878230f61bb3ac333291e2c6aaf2fbdc5f9ce
2019-07-23 14:34:21 -07:00
Kevin Wilfong
3ca7c0ffdb Add get_accessed_features function to ModelLayer class (#23036)
Summary:
We need a way to get a complete list of features that are used in training a model. One way to do this is to make it possible to get the list of features used in each Model Layer. Then, once the model is complete, we can go through the layers and aggregate the features.

I've introduced a function to expose that information here, get_accessed_features, and implemented it in the FeatureSparseToDense layer to start with.

I've tried to include the minimum amount of information to make this useful, while making it easy to integrate into the full variety of model layers. This is, for example, why AccessedFeatures does not contain feature_names, which is not always present in a model layer. I debated whether or not to include feature_type, but I think it's useful enough, and easy enough to figure out in a model layer, that it's worth including.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23036

Test Plan:
Added a unit test to verify the behavior of get_accessed_features in FeatureSparseToDense.

aml_dper2-fblearner-flow-integration-tests failed due to a known issue D16355865
aml_dper3-fblearner-flow-integration-tests failed due to a known issue T47197113

I verified no tests in the integration tests failed to issues other than those known ones.

DPER2 canaries: https://fburl.com/fblearner/1217voga

Reviewed By: volkhin

Differential Revision: D16365380

Pulled By: kevinwilfong

fbshipit-source-id: 2dbb4d832628180336533f29f7d917cbad171950
2019-07-22 15:04:28 -07:00
Le Fang
442dd7b906 Implement "trimmed lasso" regularization and support all available regularization in a single interface (#22966)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22966

We want to implement "trimmed lasso" for feature selection with learnable and regularizable weights. Trimmed lasso is a simple yet powerful improvement over the traditional lasso. More reference can be found at https://arxiv.org/abs/1708.04527 and http://proceedings.mlr.press/v97/yun19a.html. For a quick and sufficient intro, please refer to pages 1-3 of the paper at https://arxiv.org/abs/1708.04527.

Given n weights, traditional lasso sums all weights' l1 norms. The trimmed lasso takes an input integer k (how many weights you want to select from n) and only sums over the smallest n - k weights. Given lambda as the regularization constant, the penalty term applies only to the smallest n - k weights, not to the larger ones. If lambda becomes larger than a certain threshold, the smallest n - k weights are shrunk to zero, which means those weights are "dropped". With this property, k is the number of weights left after the lasso, which we can easily control.

Meanwhile, we further support all available regularization in a single interface. Currently supported regularizers on weights include no reg, l1, l2, elastic, trimmed l1, elastic with trimmed l1, group l1, and logbarrier.
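
A minimal NumPy sketch of the trimmed-lasso penalty described above (illustrative only, not the production regularizer):

```python
import numpy as np

def trimmed_lasso_penalty(weights, k, lam):
    """Penalize only the n - k smallest-magnitude weights.

    weights: 1-d array of n learnable weights
    k:       number of weights to keep (exempt from the penalty)
    lam:     regularization constant (lambda)
    """
    abs_w = np.abs(np.asarray(weights, dtype=np.float64))
    # Sort magnitudes ascending and keep the smallest n - k for the penalty.
    smallest = np.sort(abs_w)[: max(len(abs_w) - k, 0)]
    return lam * smallest.sum()

# Example: with k = 2, only the two largest-magnitude weights escape the penalty.
print(trimmed_lasso_penalty([0.01, -0.02, 1.5, -2.0, 0.03], k=2, lam=0.1))
```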

Differential Revision: D16326492

fbshipit-source-id: 6e1fd75606005d9bc09d6650435c96a7984ba69c
2019-07-17 16:12:31 -07:00
Lu Fang
796a39ba85 Automatic update of fbcode/onnx to 707064980b9825b8705b9d1c9aad34d8b022d5dd (#22981)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22981

Previous import was 806aa863020fa180e57f576cb032ec44ce8ddcca

Included changes:
- **[70706498](https://github.com/onnx/onnx/commit/70706498)**: TensorProto::INT8 & INT16 were missed here (#2164) <ZINEKS>
- **[8218a4ea](https://github.com/onnx/onnx/commit/8218a4ea)**: Fix LabelEncoder's shape inference (#2170) <Wei-Sheng Chin>
- **[0f1a9a1c](https://github.com/onnx/onnx/commit/0f1a9a1c)**: Fixing a unit test in Cumsum Operator (#2157) <Jeff Saremi>
- **[2c03cff0](https://github.com/onnx/onnx/commit/2c03cff0)**: [New Operator] CumSum (#2030) <Jeff Saremi>
- **[220b8300](https://github.com/onnx/onnx/commit/220b8300)**: Fix globalpool output shape (#2147) <daquexian>

Reviewed By: benoitsteiner

Differential Revision: D16341736

fbshipit-source-id: 7e7a2684d8c821991231bfd6558f9f6cb4fb05fb
2019-07-17 14:05:14 -07:00
Xiaodong Wang
2630109727 always restore dlopen flag in dyndep (#22958)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22958

When we use `extension_loader.DlopenGuard()` to dyndep or import modules, it sets the `RTLD_GLOBAL` flag and restores the original flags after the `yield`. However, if the module is not there, the yield will fail and the flags won't be restored, creating all kinds of symbol conflict problems.
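
A minimal sketch of the fix pattern, assuming a Python-level guard around the dlopen flags (the actual extension_loader code may differ in detail):

```python
import contextlib
import ctypes
import sys

@contextlib.contextmanager
def dlopen_guard():
    old_flags = sys.getdlopenflags()
    sys.setdlopenflags(old_flags | ctypes.RTLD_GLOBAL)
    try:
        yield
    finally:
        # Restore the original flags even if the import inside the guard raises.
        sys.setdlopenflags(old_flags)
```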

Reviewed By: bddppq

Differential Revision: D16311949

fbshipit-source-id: 7b9ec6d60423ec5e78cae694b66c2f17493840b0
2019-07-17 10:26:25 -07:00
Xiaodong Wang
9b8d771733 skip import nccl and gloo_gpu in cpu machine (#22522)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22522

Skip importing the nccl and gloo_gpu modules on CPU-only machines

Reviewed By: bddppq

Differential Revision: D16115827

fbshipit-source-id: 329b7a0bb5eccb78c9e772bdab5db7c79b546d55
2019-07-10 11:56:56 -07:00
Will Feng
3a12520844 Pass Variable into Caffe2 ops, by requiring that the Variable doesn't require grad (#22473)
Summary:
As part of the Variable/Tensor merge, we want to be able to pass Variables into Caffe2 without doing an extra shallow copy, to improve performance and also allow in-place mutations in Caffe2 ops. There are a few approaches outlined in https://github.com/pytorch/pytorch/pull/22418, and this PR is the chosen approach.

Specifically, we can assume that we won't be connecting autograd to C2 gradients at any point (as it's too tricky and not that useful). Therefore, we can pass Variables into Caffe2 ops by requiring that all Variables in Caffe2 don't require grad. For code paths in Caffe2 that might potentially track gradients (e.g. `ScriptModuleOp` and `call_caffe2_op_from_c10`), we use the `torch::NoGradGuard` to make sure gradients are not tracked.

This supersedes https://github.com/pytorch/pytorch/pull/22418.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22473
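
A Python-side sketch of the invariant (illustrative only; the PR itself relies on the C++ `torch::NoGradGuard` in paths like `ScriptModuleOp` and `call_caffe2_op_from_c10`):

```python
import torch
from caffe2.python import workspace

def feed_variable_to_caffe2(blob_name, tensor):
    # Invariant from this change: tensors handed to Caffe2 must not require grad,
    # since Caffe2 gradients are never connected to autograd.
    assert not tensor.requires_grad, "Variables passed to Caffe2 must not require grad"
    with torch.no_grad():
        workspace.FeedBlob(blob_name, tensor.numpy())
```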

Differential Revision: D16099042

Pulled By: yf225

fbshipit-source-id: 57efc3c7cfb3048d9abe90e63759acc14ebd2972
2019-07-08 11:31:10 -07:00
Du Tran
d2ceab2766 update video input (#22471)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22471

update C2 video input with latest augmentation

Reviewed By: HengCV

Differential Revision: D16096127

fbshipit-source-id: bb07394e211cd52b50005d801b6d03250248ea9e
2019-07-05 00:56:33 -07:00
Alyssa Wang
d9e15bccb0 Perform weight re-init for embedding table in sparse_lookup.py (#22348)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22348

This is the last step of LRU hash eviction weight re-init. This diff checks whether there are evicted values in sparse_lookup and, if so, calls the op created in D15709866 to re-init the values for the indices in evicted_values. It also creates a gradient op for the operator; the gradient op just passes the output gradient through as the input gradient.

Reviewed By: itomatik

Differential Revision: D16044736

fbshipit-source-id: 9afb85209b0de1038c5153bcb7dfc5f52e0b2abb
2019-07-03 10:33:40 -07:00
Duke Vijitbenjaronk
d684112ec9 Output sequence probability with CTC beam search, optional multiple output sequences (#21927)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21927

Add `OUTPUT_PROB` output to CTCBeamSearchDecoderOp to return a probability for each sequence.

Add an argument to output the top-k instead of the top-1 decoded sequences.

Reviewed By: SuperIRabbit

Differential Revision: D15797371

fbshipit-source-id: 737ca5cc4f90a0bcc3660ac9f58519a175977b69
2019-07-02 17:29:13 -07:00
Alyssa Wang
34f950c800 Create C2 operator to replace values in embedding table (#22279)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22279

This new operator is used for embedding table weight re-init. After we get the evicted indices, they identify the rows that need resetting in the embedding table. We can then create a 1-d tensor with default values and apply this operator to copy that tensor into all evicted rows of the embedding table.

Will add the gradient op in the next diff.
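
Functionally, the operator behaves like this NumPy sketch (names are illustrative; the real implementation is a Caffe2 operator acting on workspace blobs):

```python
import numpy as np

def reset_evicted_rows(embedding, evicted_indices, default_row):
    """Copy a 1-d default-value tensor into every evicted row of the embedding table."""
    embedding = np.asarray(embedding, dtype=np.float32)
    default_row = np.asarray(default_row, dtype=np.float32)
    assert default_row.shape == (embedding.shape[1],)
    embedding[np.asarray(evicted_indices, dtype=np.int64)] = default_row
    return embedding
```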

Reviewed By: itomatik

Differential Revision: D15709866

fbshipit-source-id: 2297b70a7326591524d0be09c73a588da245cc08
2019-07-02 15:26:22 -07:00
Alyssa Wang
bb07f2d063 Pass LRU hash output evicted_values to SparseLookup (#21389)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21389

As titled. To do weight re-init on evicted rows in the embedding table, we need to pass the info about the evicted hashed values to SparseLookup, which is the model layer responsible for constructing the embedding table and doing the pooling.

To pass evicted values, we need to adjust the output record of lru_sparse_hash to include the evicted values, and add an optional input to all processors that need to take in a sparse segment. For SparseLookup to get the evicted values, its input record needs to be adjusted. Now the input record can be of type IdList, IdScoreList, or a struct of feature + evicted values.
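
A sketch of what such a combined record could look like with `caffe2.python.schema` (the field names here are assumptions for illustration):

```python
import numpy as np
from caffe2.python import schema

# An IdList-style sparse segment: lengths + item ids.
id_list = schema.List(schema.Scalar(np.int64))

# Hypothetical combined record: the sparse feature plus the evicted hashed values
# produced by lru_sparse_hash, so SparseLookup can see both.
feature_with_evictions = schema.Struct(
    ('feature', id_list),
    ('evicted_values', schema.Scalar(np.int64)),
)
```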

Reviewed By: itomatik

Differential Revision: D15590307

fbshipit-source-id: e493881909830d5ca5806a743a2a713198c100c2
2019-07-02 11:27:37 -07:00
Xianjie Chen
2dd1323379 Fix the GPU trainer for NoneCalibration and RNN
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22385

Reviewed By: Wakeupbuddy

Differential Revision: D16053190

fbshipit-source-id: 6304c5c51f33691c201c78d4c921a9c250d9b4f5
2019-07-01 22:55:18 -07:00
Lu Fang
dfa6fca1c6 Supporting Manifold DB in Predictor Exporter (#22334)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22334

Improve the function signatures of save_to_db and load_from_db in predictor_exporter.

Reviewed By: akyrola

Differential Revision: D16047208

fbshipit-source-id: a4e947f86e00ef3b3dd32c57efe58f76a38fcec7
2019-07-01 16:17:02 -07:00
Xiaomeng Yang
10e4137396 Optimize InstanceNormGradientOp (#22288)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22288

Optimize InstanceNormGradientOp

Benchmarks:

CPU with [N, C, H, W] = [128, 256, 56, 56],
NCHW order: 616ms -> 128ms
NHWC order: 1612ms -> 174ms

GPU with [N, C, H, W] = [128, 256, 112, 112],
NCHW order: 6450ms -> 37ms
NHWC order: 1419ms -> 82ms

Reviewed By: houseroad

Differential Revision: D16023630

fbshipit-source-id: 5af9bf1103cde2fc2bcb5cd5a057d039732f052e
2019-07-01 15:10:17 -07:00
Xiaomeng Yang
29b53b0259 Fix bug in caffe2 transpose on GPU (#22233)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22233

Fix bug in caffe2 transpose on GPU

Reviewed By: hl475

Differential Revision: D15994973

fbshipit-source-id: 542dc8757b51a6322fffa55826c1d4e32927398d
2019-06-26 11:33:25 -07:00
Cheng,Penghui
7ee82d48a8 Removed work around for convolution transpose op since the bug has be… (#22184)
Summary:
…en fixed in v0.18
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22184

Differential Revision: D15982627

Pulled By: bddppq

fbshipit-source-id: 8725d5b5e5b68e029ffb08af12b416bd310c9638
2019-06-25 14:34:34 -07:00
Zachary DeVito
5b87049c66 remove uses of std::shared_ptr<Module> (#21934)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21934
ghimport-source-id: e64ab9096f43749ead3ac5567675b815da295664

Test Plan: Imported from OSS

Differential Revision: D15892401

Pulled By: zdevito

fbshipit-source-id: 6424139206593ff944556c69d8a54723884eacaf
2019-06-25 13:24:38 -07:00
Le Fang
ac4913ee62 support both regularizable and sofmax re-weighting on sparse features in dot product (#22176)
Summary:
In order to select the more important features in the dot product among a list of candidate sparse features, we can assign one learnable weight to each feature and re-weight each feature by multiplying the weight onto its embedding before the dot product. We finally select features based on weight magnitude after training.

We can perform L1 and/or L2 regularization on the weights. To summarize, the weights tend to shrink their values (avoiding overfitting) due to L2 regularization, and some weights will vanish to zero due to L1. To avoid sparse feature embeddings being ignored due to an early collapse of the weights, a piecewise lr warm-up policy is used in optimizing the regularization term, such that regularization is weak in the first stage and gets stronger afterwards (a small lr constant in iters below threshold 1, a medium lr constant in stage 2, and a final, reasonably large lr constant in all iters after threshold 2). The features with nonzero and relatively large weights (in absolute value) will be selected for the module.

We can also apply softmax on the original weights to make them sum to 1. We can even boost the softmaxed weights by multiplying by the number of softmax components, which essentially makes them sum to the number of softmax components and average to 1. In this scheme, all the weights are positive and sum to a constant. Regularization is not a must, since we can count on the competition between the softmax weights themselves to achieve reasonable re-weighting. We expect these weights to be denser than the sparse ones from L1 regularization, and we can select features based on the top-k weights.

Overall, we aim to demonstrate that the selected feature set outperforms the current v0 feature set in experiments. Special acknowledgement goes to Shouyuan Chen, who initiated the work on regularizable weighting.
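
A minimal NumPy sketch of the softmax re-weighting described above (illustrative only; the production version is built from Caffe2 layers):

```python
import numpy as np

def softmax_reweight(embeddings, raw_weights, boost=True):
    """Re-weight per-feature embeddings before the dot product.

    embeddings:  array of shape [num_features, emb_dim]
    raw_weights: one learnable scalar per feature, shape [num_features]
    boost:       multiply softmaxed weights by num_features so they average to 1
    """
    w = np.exp(raw_weights - np.max(raw_weights))
    w = w / w.sum()                      # softmax: positive weights summing to 1
    if boost:
        w = w * len(raw_weights)         # weights now sum to num_features, average 1
    return embeddings * w[:, None]       # scale each embedding row by its weight
```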

 ---

Pull Request resolved: https://github.com/pytorch/pytorch/pull/22176

The diff will export updates to the GitHub repository, as stated below.

{F162787228}

Basically, the updates to the files are summarized below:

- adding logger messages
`caffe2/python/layer_model_helper.py`
- add ElasticNet regularizer, which combines both L1 and L2 regularization
`caffe2/python/regularizer.py`
- implement piecewarmup, specifically warm up with three constant pieces
`caffe2/sgd/learning_rate_functors.h, caffe2/sgd/learning_rate_op.cc, caffe2/sgd/learning_rate_op.h`

Differential Revision: D15923430

fbshipit-source-id: ee18902cb88c23b1b7b367cc727d690a21e4cda9
2019-06-24 21:27:33 -07:00
Frank Jiang
84a2d5d7aa Add hashing to bucket-weighted pooling (#20673)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20673

Add option to bucket-weighted pooling to hash the bucket so that any cardinality score can be used.
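
As a rough illustration of the idea (the hash used by the actual operator may differ):

```python
def hashed_bucket(bucket_id, num_buckets):
    """Map a bucket id of arbitrary cardinality into a fixed number of weight buckets."""
    # Simple multiplicative hash for illustration only.
    return (bucket_id * 2654435761) % (2 ** 32) % num_buckets
```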

Reviewed By: huginhuangfb

Differential Revision: D15003509

fbshipit-source-id: 575a149de395f18fd7759f3edb485619f8aa5363
2019-06-20 15:12:36 -07:00
Zhanibek Datbayev
4fee532de6 Pass loop_over optional parameter for cached reader properly. (#21929)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21929

Just need to pass `loop_over` argument properly.

Reviewed By: noname01

Differential Revision: D15885401

fbshipit-source-id: f1928277262a80e5b41f4c4f3945c2f378a4e233
2019-06-19 18:15:32 -07:00
hexiaoting
34536e207a Fix: convert Onnx DynamicSlice operator with 4 inputs to caffe2 fa… (#20846)
Summary:
I reported an issue at https://github.com/pytorch/pytorch/issues/20743 and made this pull request for it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20846

Reviewed By: zrphercule

Differential Revision: D15569135

Pulled By: houseroad

fbshipit-source-id: 96a2c818ef666a7d79b96decfa347d7154b34d5c
2019-06-19 00:09:15 -07:00
Hong Xu
3bdde56907 Fix incorrect usage of __HIP_PLATFORM_HCC__ (#21757)
Summary:
This avoids using `__HIP_PLATFORM_HCC__` directly, in case it changes in the future.

Following up on https://github.com/pytorch/pytorch/issues/21718
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21757

Reviewed By: xw285cornell

Differential Revision: D15891867

Pulled By: bddppq

fbshipit-source-id: 5de55687ab1c86eddf6b4d8d25fee48d96ec72ad
2019-06-18 18:56:32 -07:00
Hongyu Xiong
76a250d590 Add new regression loss function type to FBLearner (#21080)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21080

Add Huber loss as a new option for regression training (refer to TensorFlow implementation: https://fburl.com/9va71wwo)
  # Huber loss (NumPy version of the formula)
  import numpy as np

  def huber(true, pred, delta):
      error = np.abs(true - pred)
      loss = 0.5 * np.minimum(error, delta) ** 2 + delta * np.maximum(error - delta, 0)
      return np.mean(loss)

As a combination of MSE loss (`x < delta`) and MAE loss (`x >= delta`), the advantage of Huber loss is that it reduces the training's dependence on outliers.

One thing worth noting is that the Huber loss is not twice differentiable at `x = delta`. To further address this problem, one could consider adopting the `log(cosh(x))` loss.

Reviewed By: chintak

Differential Revision: D15524377

fbshipit-source-id: 73acbe2728ce160c075f9acc65a1c21e3eb64e84
2019-06-17 17:43:00 -07:00
Benny Chen
1e7bd7586d Query caffe2 operator stats for detailed execution info (#20924)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20924

I found a Python 3 bug when deserializing Caffe2 code. The exception thrown is a Unicode-related error instead of just a decode error, and we need to catch that as well.
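
A minimal sketch of the broader exception handling, assuming protobuf-based deserialization (the exact call site in the diff may differ):

```python
from google.protobuf.message import DecodeError
from caffe2.proto import caffe2_pb2

def try_parse_operator_def(raw_bytes):
    op = caffe2_pb2.OperatorDef()
    try:
        op.ParseFromString(raw_bytes)
    except (DecodeError, UnicodeDecodeError):
        # Python 3 can surface a UnicodeDecodeError here instead of a DecodeError.
        return None
    return op
```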

Reviewed By: ipiszy

Differential Revision: D15293221

fbshipit-source-id: 29820800d1b4cbe5bf3f5a189fe2023e655d0508
2019-06-13 23:41:04 -07:00
Sungmann Cho
f59581218f Fix spelling errors (#21665)
Summary:
alloctor -> allocator
excutable -> executable
excution -> execution
foward -> forward
initiaize -> initialize
paralell -> parallel
preprocesor -> preprocessor
tranpose -> transpose
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21665

Differential Revision: D15806155

Pulled By: soumith

fbshipit-source-id: d92b21ec8650a2b32f05faf9af0b7d2b073e992c
2019-06-13 15:21:55 -07:00
Natalia Lunova
63a7c7bb2a Add event and event_counter columns to caffe2_usage_tracer table (#21739)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21739

Added event and event_counter columns for PyTorch/Caffe2 API usage metrics

Reviewed By: dzhulgakov

Differential Revision: D15119119

fbshipit-source-id: a71010bd659109a8e4f3a8bad84b22c1d15dc528
2019-06-13 12:06:02 -07:00
Xiaodong Wang
5a7e2ccc0b Add use_rocm flag to detect AMD build in the runtime (#21718)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21718

Adds a detection method for whether the package is built for AMD.

Reviewed By: bddppq

Differential Revision: D15795893

fbshipit-source-id: 91a21ee76b2273b1032507bdebe57e016717181d
2019-06-13 09:30:49 -07:00
Jiyan Yang
2c91ba3bbc Add div hashing
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21422

Reviewed By: xianjiec

Differential Revision: D15589181

fbshipit-source-id: f6ff0726164f88da45e4b090b4d5ad05305b3225
2019-06-12 11:27:37 -07:00
Lu Fang
c2a08d339b Automatic update of fbcode/onnx to dd599b05f424eb161a31f3e059566a33310dbe5e (#21641)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21641

Previous import was 5160f3ac3380302224998f1c95e111cd961c4bc5

Included changes:
- **[dd599b05](https://github.com/onnx/onnx/commit/dd599b05)**: Fix type s/depracted/deprecated/ (#2092) <Takeshi Watanabe>
- **[abb1702a](https://github.com/onnx/onnx/commit/abb1702a)**: Add shape inference for Tile op (#2076) <Hariharan Seshadri>
- **[67638d9c](https://github.com/onnx/onnx/commit/67638d9c)**: [New Operator] Round (#2053) <Jeff Saremi>
- **[584e4477](https://github.com/onnx/onnx/commit/584e4477)**: Add dilations support in ConvTranspose shape inference and update docs (#2068) <daquexian>

Reviewed By: zrphercule

Differential Revision: D15762382

fbshipit-source-id: 590f25fb733e1565eb90fcdeb797b0ba34e2d2c3
2019-06-11 16:54:47 -07:00
Jinghui
29c849ff34 implement transpose operator for MKLDNN (#19955)
Summary:
implement transpose operator for MKLDNN
1. upgrade mkldnn-bridge to support ND transpose
2. implement transpose operator in caffe2.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19955

Differential Revision: D15701832

Pulled By: bddppq

fbshipit-source-id: e4337cd0ba6f8180a35c8c70cbb6830a0a84182f
2019-06-11 01:55:13 -07:00
Cheng,Penghui
74f6c55f0f support negative axis in concat and split operators
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17955

Differential Revision: D14476031

Pulled By: ezyang

fbshipit-source-id: e0e57e8595ed2005ded9e923572a40fe62aca5a7
2019-06-10 15:26:29 -07:00
Haixin Liu
4bdbd30b96 Add python binding to deserialize blob (#21532)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21532

Add python binding to deserialize blob

Reviewed By: yinghai

Differential Revision: D15706816

fbshipit-source-id: f498c7e0f7392f055b13810bbf81cba59f25e1d2
2019-06-10 10:49:21 -07:00