Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15884
Codemod generated with clangr shard mode, 25 files per diff.
To eliminate partially initialized Tensors, we split the initialization of local Tensor variables into two steps: first declare an uninitialized Tensor, then call `ReinitializeTensor` to initialize it.
Motivation: https://github.com/pytorch/pytorch/pull/12407
Reviewed By: hyuen
Differential Revision: D13586737
fbshipit-source-id: dc8e49e9f29505b8898bb19f84c1a983f2d811ab
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15841
Fix the bugs in dnnlowp to support int8/int16 quantization for sparsenn.
Reviewed By: jspark1105
Differential Revision: D13600878
fbshipit-source-id: 27f06d7c54a663208320c8f211714220a9b49540
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15759
Some flags had overly long names. Also includes a few other minor cleanups.
Reviewed By: jianyuh
Differential Revision: D13587353
fbshipit-source-id: f8aee7f167505644f5d8f80fe2eed70201ef1e54
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15758
DNNLOWP Conv operators became very complex due to many options. This diff simplifies them by not allowing fp32 in/out. This is OK because Conv operators are usually used in deep networks, where quantizing and dequantizing with separate operators adds little overhead.
Reviewed By: csummersea
Differential Revision: D13587341
fbshipit-source-id: e88c919dae79d1c5b7d787ea539edf5bcb064afc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15708
nbits_in_non_outlier == 0 doesn't make sense because it means everything is an outlier, in which case we can just use 32-bit accumulation.
Depending on the architecture, the break-even point between acc16 and acc32 can differ. This adds thresholds for falling back to acc32.
Reviewed By: jianyuh
Differential Revision: D13574832
fbshipit-source-id: b7a37aacbfdc7867e31838dafcdd5f7c2ac282af
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15625
3D group convolution (in both NCHW and NHWC layouts) was incorrect.
Added group=2 to test_1d_convolution and test_3d_convolution in conv_test.
Reviewed By: protonu
Differential Revision: D13562099
fbshipit-source-id: 586e8a7574a2764f2a3b559db6c2415b3ab90453
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15685
The declaration of "Dequantize" is in "fbsource/fbcode/deeplearning/fbgemm2/QuantUtils.h", so it requires the fbgemm namespace.
<T> is actually optional, since the type can be deduced from the first argument.
In some places we have "Dequantize<T>(...)", while in other places we have "Dequantize(...)". We should unify them. As a reference, all occurrences of "Quantize" use "fbgemm::Quantize<T>(...)".
Reviewed By: jspark1105
Differential Revision: D13570847
fbshipit-source-id: 7fca9f7f9e4e0d9e5eb27ac44b8707adc3c80717
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15588
Use the NHWC2NCHW and NCHW2NHWC functions, which are easier to understand than code using transpose and generalize to non-2D convolutions.
Reviewed By: csummersea
Differential Revision: D13557674
fbshipit-source-id: c4fdb8850503ea58f6b17b188513ae2b29691ec0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15582
Following the convention of using the caffe2_ prefix in command-line options.
Reviewed By: viswanathgs
Differential Revision: D13252055
fbshipit-source-id: 142a6395b832f211f34d0a87ec2d62c1e5fcdc69
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15371
Similar to D13387692:
Never call mutable_data from an OpenMP region!!!
Reviewed By: jspark1105
Differential Revision: D13511259
fbshipit-source-id: 100812d2a547c0a1d5018749d5fdc88162375673
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15147
Forgot to remove dnnlowp.cc from the avx2 list in a previous diff.
Reviewed By: dskhudia
Differential Revision: D13440686
fbshipit-source-id: 9ada98b6e885c7d5f22c91a735ff60304480b4cb
Summary:
…done once
This allows no-op builds to work correctly even when BUILD_CAFFE2_OPS is on.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14982
Differential Revision: D13413960
Pulled By: zdevito
fbshipit-source-id: 6e5412a8c375af8a47c76f548cdd31cff15f3853
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14881
This diff allows us to pre-quantize and pre-pack the weight matrix used in DNNLOWP_ACC16.
The intended usage pattern is to run Int8ConvPackWeight in init_net to generate a packed weight, which Int8Conv with the DNNLOWP_ACC16 engine then uses.
Reviewed By: csummersea
Differential Revision: D13374662
fbshipit-source-id: dd02b9a4eb7af1fe208aa857fcd0b445e6e395af
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14725
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/33
Renaming FbgemmI8Depthwise.h to FbgemmI8DepthwiseAvx2.h and FbgemmI8Depthwise.cc to FbgemmI8DepthwiseAvx2.cc, since FbgemmI8DepthwiseAvx2.cc will be compiled with avx2 flags.
Reviewed By: jianyuh
Differential Revision: D13313898
fbshipit-source-id: a8111eacf3d79a466ce0565bfe5f2f0b200a5c33
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14547
Unit tests used in dnnlowp need a better compilation flow as some of them need avx. Disabling for now so that pytorch builds with fbgemm.
Reviewed By: jianyuh
Differential Revision: D13240933
fbshipit-source-id: e2e187b758c5d89e524470cd261ce35493f427a2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14350
acc32 for now. A separate diff will follow for acc16, but that will need another output processing step that does sparse convolution without im2col.
Reviewed By: dskhudia
Differential Revision: D13188595
fbshipit-source-id: e8faee46c7ea43e4a600aecb8b8e93e6c860a8c8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14340
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/25
Per-group and per-channel quantization in fbgemm
This diff also cleans up explicit template instantiation using macro expansion
This diff also changes the randFill interface, which made it easy to mistakenly generate integer random numbers for floating-point vectors.
Using this in DNNLOWP operators will be done in a separate diff.
Reviewed By: dskhudia
Differential Revision: D13176386
fbshipit-source-id: e46c53e31e21520bded71b8ed86e8b19e010e2dd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14163
Some of the header guard names we were using were too short (e.g. DYNAMIC_HISTOGRAM_H).
Reviewed By: csummersea
Differential Revision: D13115451
fbshipit-source-id: cef8c84c62922616ceea17effff7bdf8d67302a2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14192
We can only use C10_* in OSS. The build only breaks when built with USE_FBGEMM=ON.
Reviewed By: jianyuh
Differential Revision: D13121781
fbshipit-source-id: f0ee9a75997766e63e1da8a53de7ddb98296a171
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14133
Experiment with ultra low precisions on the Resnext-101 URU trunk model
Reviewed By: jspark1105
Differential Revision: D10108518
fbshipit-source-id: f04d74fbe1c9e75efafcd9845719bdb2efbbfe9c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14135
Updates the atol scale of the dnnlowp test. The flaky test error could not be reproduced locally even with the same seed value, but according to the comments in check_quantized_results_close(), atol_scale should be 1/1.9 = 0.526315789473684, which is larger than the current value of 0.51. So the atol_scale is increased to 0.53.
Reviewed By: jspark1105
Differential Revision: D13108415
fbshipit-source-id: 1e8840659fdf0092f51b439cf499858795f9706a
Summary:
Pull Request resolved: https://github.com/pytorch/FBGEMM/pull/9
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13960
The vectorized code was rounding to even in halfway cases via _mm256_round_ps with (_MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC) (see more details in https://software.intel.com/en-us/node/523819), but we were still using std::round in a couple of places, which rounds away from zero in halfway cases.
With this diff, we use std::nearbyint in all scalar code (except a few cases where we don't care about the exact rounding mode and use rint, which is the fastest in general) to be more consistent. nearbyint matches what the vectorized code does only when the current rounding mode is FE_TONEAREST, but in practice this is OK because we almost always use the default rounding mode, FE_TONEAREST.
This is inspired by Marat's diff for mobile quantization.
Reviewed By: dskhudia
Differential Revision: D13017719
fbshipit-source-id: 6b8f99db7ea2e233aa2e3bd2adf622e03ed6258e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13814
D10333829 implemented 3D conv in NHWC in fp32 ops so int8 ops don't need special handling anymore.
Reviewed By: hx89
Differential Revision: D13017666
fbshipit-source-id: 41df449f5e21c4c7134cc5c480e559f8c247069b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13740
We would like to rename the old fbgemm to "fbgemm0" and the new fbgemm2 to "fbgemm".
This diff changes namespace fbgemm2 to namespace fbgemm.
The purpose is to avoid confusion over "fbgemm2" when we release FBGEMM as open source.
Reviewed By: jspark1105
Differential Revision: D12850449
fbshipit-source-id: 08cc47864b157e36fbceddb7a10bf26218c67bd8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13701
We would like to rename the old fbgemm to "fbgemm0" and the new fbgemm2 to "fbgemm".
This diff changes namespace fbgemm to namespace fbgemm0.
Reviewed By: jspark1105
Differential Revision: D12848727
fbshipit-source-id: 47935e9e2c4714a7ce1bfc3f7e4d6a334130132e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13660
Any change to a server-side quantized operator was triggering ios-sanity-check, with more than 5 hours of testing time. I suspect this was because the operator code was synced with the xplat directory. This diff moves the server-side quantized operators to caffe2/caffe2/quantization/server to avoid this issue.
Reviewed By: hx89
Differential Revision: D12955420
fbshipit-source-id: b6c824b9de5e2a696f8c748e1b2c77d81d46746b