Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19319
Quantized SUM + ReLU (fused). The implementation is the same as the one in DNNLOWP.
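For reference, a minimal NumPy sketch of the fused semantics (the function name and reference-style arithmetic here are illustrative; the actual DNNLOWP kernel is an optimized native implementation):

```python
import numpy as np

def qadd_relu(a, b, a_scale, a_zp, b_scale, b_zp, out_scale, out_zp):
    """Fused quantized add + ReLU over uint8 inputs (reference semantics).

    Dequantize, add, and requantize; the ReLU is "fused" by raising the
    lower clamp bound to the output zero point, since float 0.0 maps to
    out_zp in the quantized domain.
    """
    acc = a_scale * (a.astype(np.int32) - a_zp) + b_scale * (b.astype(np.int32) - b_zp)
    q = np.round(acc / out_scale) + out_zp
    return np.clip(q, out_zp, 255).astype(np.uint8)  # lower bound out_zp == fused ReLU
```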
Reviewed By: jianyuh
Differential Revision: D14866442
fbshipit-source-id: c8c737a37e35b6ce3c1c2077c07546aba16e0612
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19312
Replaces the tuple hack with `QTensor`. Note that this can land ONLY after #18960 (D14810261) lands.
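For illustration, a before/after sketch using the present-day `torch.quantize_per_tensor` API (this PR predates that name, so treat the exact calls as an assumption about the modern equivalent):

```python
import torch

x = torch.randn(4)
scale, zero_point = 0.1, 128

# Before: quantized data carried around as a plain (int_tensor, scale, zero_point) tuple.
q_tuple = ((x / scale + zero_point).round().clamp(0, 255).to(torch.uint8),
           scale, zero_point)

# After: a single quantized tensor that carries its quantization parameters with it.
qx = torch.quantize_per_tensor(x, scale=scale, zero_point=zero_point, dtype=torch.quint8)
print(qx.q_scale(), qx.q_zero_point(), qx.int_repr())
```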
Reviewed By: raghuramank100
Differential Revision: D14819460
fbshipit-source-id: 75ca649304b1619cb3cfe845962c9f226b8f884a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19273
Some CI jobs fail if protobuf is not installed. Protobuf is imported as part of `caffe2.python.core`, so this adds a skip decorator that avoids running tests that depend on `caffe2.python.core`.
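A minimal sketch of the skip pattern; the `skipIfNoCaffe2` name is hypothetical, the guard-on-import idea is the point:

```python
import unittest

try:
    from caffe2.python import core  # noqa: F401 -- fails if protobuf is missing
    HAS_CAFFE2_CORE = True
except ImportError:
    HAS_CAFFE2_CORE = False

# Hypothetical decorator name; skips any test that needs caffe2.python.core.
skipIfNoCaffe2 = unittest.skipIf(
    not HAS_CAFFE2_CORE, "caffe2.python.core (and protobuf) not available")

@skipIfNoCaffe2
class TestQuantizedOps(unittest.TestCase):
    def test_something(self):
        ...
```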
Reviewed By: jianyuh
Differential Revision: D14936387
fbshipit-source-id: e508a1858727bbd52c951d3018e2328e14f126be
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19091
Implements a basic quantized ReLU (uint8). This is a temporary solution until the `QTensor` type replaces the tuple.
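The underlying arithmetic is a single clamp; a NumPy reference sketch (not the code landed here):

```python
import numpy as np

def qrelu_uint8(q, zero_point):
    """ReLU on a uint8 quantized tensor: float 0.0 corresponds to
    `zero_point`, so ReLU reduces to a lower clamp at the zero point."""
    return np.maximum(q, np.uint8(zero_point))
```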
Reviewed By: dzhulgakov
Differential Revision: D14565413
fbshipit-source-id: 7d53cf5628cf9ec135603d6a1fb7c79cd9383019
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**
This was requested by someone at Facebook; this lint is turned
on for Facebook by default. "Sure, why not."
I had to noqa a number of imports in `__init__`. Hypothetically
we're supposed to use `__all__` in this case, but I was too lazy
to fix it. Left for future work.
Be careful! flake8-2 and flake8-3 behave differently with
respect to import resolution for `# type:` comments. flake8-3 will
report an import as unused; flake8-2 will not. For now, I just
noqa'd all these sites.
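An illustrative example of the kind of site that needed a `noqa` (flake8-3 flags the import as unused because it only appears in a type comment):

```python
from typing import List  # noqa: F401 -- used only in the type comment below

def sizes(xs):
    # type: (List[str]) -> List[int]
    return [len(x) for x in xs]
```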
All the changes were done by hand.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14687478
fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18267
Motivation: we don't actually want to use this for real under any circumstances; it's an idea to unblock our internal progress and parallelize workstreams. We can easily define schemas for all the ops in question and implement forwarding to C2 ops, which is NOT going to be performant. Then several things can happen in parallel:
* move code of ops outside of C2 ops that depend on protobuf into c10
* development of optimization/fusion passes
* building python-level wrappers with clean API
* improving perf
This demonstrates Relu, quant, and dequant. It seems to cover all the necessary use cases (except perhaps weight prepacking). Ideally I'd demonstrate Conv too, but I'll get to it later in a separate PR (contributions welcome).
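To make the quant/dequant pair concrete, a reference sketch of the affine scheme these ops implement (illustrative NumPy, not the C2-forwarding code itself):

```python
import numpy as np

def quantize(x, scale, zero_point):
    # Affine quantization: float -> uint8.
    return np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    # Inverse mapping: uint8 -> float (lossy by at most scale/2 per element).
    return scale * (q.astype(np.float32) - zero_point)

x = np.array([-1.0, 0.0, 0.5, 2.0], dtype=np.float32)
q = quantize(x, scale=0.1, zero_point=10)
assert np.allclose(dequantize(q, 0.1, 10), x, atol=0.05 + 1e-6)
```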
Reviewed By: ezyang
Differential Revision: D14531232
fbshipit-source-id: 4cd4a71ae0cb373c6c0e81f965c442b82a1b4069