Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37937
Sometimes traced models don't preserve aten::linear ops; they are decomposed
into addmm or matmul + add. Adding this preprocessing step helps us catch more
lowerable linear nodes.
Please see the test for an example.
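As a minimal illustration (not from this PR; module and shapes are made up), tracing a module that wraps nn.Linear may record the decomposed form rather than a single aten::linear node, which is what this pass targets:

```python
import torch

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.fc(x)

# Depending on the build and input rank, the traced graph may contain
# aten::addmm or aten::matmul + aten::add instead of aten::linear.
traced = torch.jit.trace(M(), torch.randn(2, 3, 4))
print(traced.graph)
```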
Test Plan: python test/test_xnnpack_integration.py
Reviewed By: xcheng16
Differential Revision: D21428069
fbshipit-source-id: 6c4ea3335eaf5722852c639fb4ee593746bb408f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35350
Currently we call input.contiguous() on the input tensor, resulting in an
unnecessary allocation and copy when the input is not contiguous with
respect to the requested memory format. In such scenarios, this call
re-allocates and copies the input tensor into contiguous storage, only for
this newly allocated tensor to be used as the source of another copy to
the final destination. If we instead copy into the destination directly,
we save an allocation and a copy.
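The idea in miniature (a Python sketch of the pattern, not the actual C++ change; Tensor.copy_ already handles the layout conversion, so the intermediate contiguous tensor is pure overhead):

```python
import torch

src = torch.randn(1, 3, 8, 8)  # contiguous in the default (NCHW) format
dst = torch.empty(1, 3, 8, 8, memory_format=torch.channels_last)

# Wasteful pattern: materialize a channels-last intermediate first.
tmp = src.contiguous(memory_format=torch.channels_last)  # extra alloc + copy
dst.copy_(tmp)                                           # second copy

# Cheaper: copy_ performs the layout permutation itself.
dst.copy_(src)                                           # one copy, no alloc
```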
Differential Revision: D20656798
Test Plan: Imported from OSS
Pulled By: AshkanAliabadi
fbshipit-source-id: 3f8c51df4d1fd386fa9473e7024621a7b7c6e86c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35010
This PR moves all the XNNPACK-specific interfaces to a generic interface.
Accordingly, it removes XNNPACK-specific references from the API and from
some variable names.
What has not yet changed (TODO):
USE_XNNPACK is still used. It can be removed wherever nothing
XNNPACK-specific is done, e.g., RegisterOpContext.cpp and
xnnpack_rewrite.cpp.
The filenames and structure also remain as before. Some of the generic
class definitions can be moved to a non-XNNPACK-specific folder.
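A hypothetical Python analogue of the resulting shape (the real interfaces are C++ classes; every name below is illustrative, not the one in this PR): a backend-agnostic context with a backend-specific implementation behind it.

```python
import torch

class OpContext:
    """Backend-agnostic interface: holds prepacked state, runs the op."""
    def run(self, x):
        raise NotImplementedError

class XNNPackLinearOpContext(OpContext):
    def __init__(self, weight, bias):
        # A real backend would prepack weight/bias into its own layout here.
        self.weight, self.bias = weight, bias

    def run(self, x):
        return x @ self.weight.t() + self.bias

ctx = XNNPackLinearOpContext(torch.randn(4, 4), torch.randn(4))
y = ctx.run(torch.randn(2, 4))
```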
Test Plan:
python test/test_xnnpack_integration.py
Imported from OSS
Differential Revision: D20526416
fbshipit-source-id: 2e1725345c44bbb26bdc448097a7384eca121387
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34319
Removes prepacking ops and installs them as attributes of the top-level
module. Freezing needs to run as the first pass.
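A sketch of the intended flow, assuming the torch.jit.freeze entry point (at the time this was a lower-level graph pass; the module here is illustrative):

```python
import torch

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.fc(x)

scripted = torch.jit.script(M().eval())
frozen = torch.jit.freeze(scripted)  # fold weights into the graph first
# The pass in this PR then hoists each prepack call out of the graph and
# installs its result as a top-level module attribute, so weights are
# packed once at load time rather than on every forward.
```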
Test Plan:
python test/test_xnnpack_integration.py
Imported from OSS
Differential Revision: D20290726
fbshipit-source-id: 633ceaa867ff7d5c8e69bd814c0362018394cb3a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34048
Rewrites the graph to insert xnnpack prepack and packed run ops for
conv2d and linear.
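For illustration, the later user-facing wrapper that applies these rewrites (optimize_for_mobile postdates this PR and requires an XNNPACK-enabled build; the module is made up):

```python
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, 3)

    def forward(self, x):
        return self.conv(x)

scripted = torch.jit.script(M().eval())
opt = optimize_for_mobile(scripted)  # applies this rewrite, among others
print(opt.graph)  # conv2d replaced by a prepack + packed-run op pair
```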
Test Plan:
python test/test_xnnpack_integration.py
Imported from OSS
Differential Revision: D20185658
fbshipit-source-id: c4c073c912ad33e822e7beb4ed86c9f895129d55
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34047
This PR integrates the added xnnpack conv2d and linear ops via
custom class registration for packed weights. The packed struct
is serializable.
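A hypothetical Python sketch of the serialization contract (the actual class is a C++ custom class registered with TorchScript; names and packing here are stand-ins):

```python
import torch

class PackedLinearWeight:
    def __init__(self, weight, bias):
        self.orig = (weight, bias)                 # kept so the object can serialize
        self.packed = (weight.contiguous(), bias)  # stand-in for the backend blob

    def __getstate__(self):
        return self.orig            # save the unpacked tensors

    def __setstate__(self, state):
        self.__init__(*state)       # repack on load
```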
Test Plan:
python test/test_xnnpack_integration.py
Imported from OSS
Differential Revision: D20185657
fbshipit-source-id: fc7e692d8f913e493b293b02d92f4e78536d7698