Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54986
If the input is 1D, xnnpack::linear fails, while aten::linear reshapes it to (1, D) and continues.
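A minimal sketch of the shape handling at issue (variable names are illustrative; the explicit reshape shown is what aten::linear does implicitly):

```python
import torch

# aten::linear accepts a 1D input by treating it as a single row,
# i.e. implicitly viewing (D,) as (1, D) before the matmul.
weight = torch.randn(4, 8)
bias = torch.randn(4)
x = torch.randn(8)  # 1D input of shape (D,)

out = torch.nn.functional.linear(x, weight, bias)  # works; shape (4,)
# Equivalent explicit reshape, which the XNNPACK path needs:
out2 = torch.nn.functional.linear(x.unsqueeze(0), weight, bias).squeeze(0)
assert torch.allclose(out, out2)
```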
Test Plan: buck test //caffe2/test:xnnpack_integration -- TestXNNPACKOps
Reviewed By: kimishpatel
Differential Revision: D27441966
fbshipit-source-id: dfb2c23b91247632e0e3fd2482056a503c246c39
Summary:
`TCPStoreTest.test_numkeys_delkeys` takes 5+ min (mostly in idle wait for socket timeout)
`TestDataLoader.test_proper_exit` and `TestDataLoaderPersistentWorkers.test_proper_exit` take 2.5 min each
`TestXNNPACKConv1dTransformPass.test_conv1d_with_relu_fc` takes 2 min to finish
Add an option to `print_test_stats.py` to skip reporting test classes that run for less than a second, and speed up `TestTorchDeviceTypeCUDA.test_matmul_45724_cuda`.
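A minimal sketch of the reporting filter, with hypothetical names (the real logic and flag live in `print_test_stats.py`):

```python
MIN_REPORT_SECONDS = 1.0  # hypothetical threshold

def classes_worth_reporting(class_durations):
    """Keep only test classes whose total runtime meets the threshold.

    class_durations: dict mapping test-class name -> total seconds.
    """
    return {name: secs for name, secs in class_durations.items()
            if secs >= MIN_REPORT_SECONDS}

print(classes_worth_reporting({"TestFast": 0.3, "TCPStoreTest": 312.0}))
# -> {'TCPStoreTest': 312.0}
```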
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46068
Reviewed By: mruberry
Differential Revision: D24208660
Pulled By: malfet
fbshipit-source-id: 780e0d8be4f0cf69ea28de79e423291a1f3349b7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45850
Under TSAN, most xnnpack integration tests appear to fail. The reason for
the failure is not entirely clear, nor is it clear whether the failures are spurious.
Test Plan: python test/test_xnnpack_integration.py
Reviewed By: xcheng16
Differential Revision: D24113885
fbshipit-source-id: dc3de3ad3d4bf0210ad67211383dbe0e842b09dd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44035
change
Also added a test to capture such cases in the future.
Test Plan:
python test/test_xnnpack_integration.py
Imported from OSS
Reviewed By: iseeyuan
Differential Revision: D23476773
fbshipit-source-id: a62c4429351c909245106a70b4c60b1bacffa817
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43233
XNNPACK is already used for the conv2d operation. Add the ability to use
it for transposed convolution as well.
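For reference, the eager-mode op this lowering targets (shapes are arbitrary):

```python
import torch

# A standard transposed convolution; with this change, such ops can be
# handled by the XNNPACK backend on supported builds.
x = torch.randn(1, 3, 8, 8)
deconv = torch.nn.ConvTranspose2d(in_channels=3, out_channels=6,
                                  kernel_size=3, stride=2)
y = deconv(x)
print(y.shape)  # torch.Size([1, 6, 17, 17])
```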
Test Plan: buck run caffe2/test:xnnpack_integration
Reviewed By: kimishpatel
Differential Revision: D23184249
fbshipit-source-id: 3fa728ce1eaca154d24e60f800d5e946d768c8b7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37937
Sometimes traced models don't preserve aten::linear ops, and they are decomposed
into addmm or mul + add. Adding this preprocessing step helps us catch more
lowerable linear nodes.
Please see the test for example.
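A quick way to observe the decomposition (the exact nodes depend on the PyTorch version):

```python
import torch

# Tracing an nn.Linear may record aten::addmm (or a decomposed form)
# rather than a single aten::linear node; the preprocessing pass
# described above recognizes these decomposed forms.
model = torch.nn.Linear(8, 4)
traced = torch.jit.trace(model, torch.randn(2, 8))
print(traced.graph)  # may show aten::addmm instead of aten::linear
```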
Test Plan: python test/test_xnnpack_integration.py
Reviewed By: xcheng16
Differential Revision: D21428069
fbshipit-source-id: 6c4ea3335eaf5722852c639fb4ee593746bb408f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35350
Currently we call input.contiguous() on the input tensor, resulting in an
unnecessary allocation and copy when the input is not contiguous with
regard to the requested memory format. In such scenarios, this call
re-allocates and copies the input tensor into contiguous storage, only for
the newly allocated tensor to be used as the source of another copy to the
final destination. Instead, by copying into the destination directly in
these circumstances, we save an allocation and a copy.
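A Python-level sketch of the two paths (illustrative only; the actual change is in the C++ operator):

```python
import torch

src = torch.randn(2, 3, 4, 5).permute(0, 2, 3, 1)  # non-contiguous view
dst = torch.empty(2, 4, 5, 3)

# Wasteful path: allocate a contiguous temporary, then copy again.
tmp = src.contiguous()  # extra allocation + copy
dst.copy_(tmp)

# Direct path: a single copy into the destination, no intermediate.
dst.copy_(src)
```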
Differential Revision: D20656798
Test Plan: Imported from OSS
Pulled By: AshkanAliabadi
fbshipit-source-id: 3f8c51df4d1fd386fa9473e7024621a7b7c6e86c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35010
This PR moves all the XNNPACK-specific interfaces to a generic interface.
Accordingly, it removes XNNPACK-specific references from the API and some
variable names.
What has not yet changed (TODO):
USE_XNNPACK is still used. It can be removed where nothing XNNPACK-specific
is done, e.g., RegisterOpContext.cpp and xnnpack_rewrite.cpp.
The filenames and structure also remain; some of the generic class
definitions can be moved to a non-XNNPACK-specific folder.
Test Plan:
python test/test_xnnpack_integration.py
Imported from OSS
Differential Revision: D20526416
fbshipit-source-id: 2e1725345c44bbb26bdc448097a7384eca121387
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34319
Removes prepacking ops and installs them as attributes of the top-level
module. Freezing needs to run as the first pass.
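A sketch of the intended workflow, assuming the present-day torch.utils.mobile_optimizer API, which bundles these passes:

```python
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

# Freezing runs first; the mobile rewrite then folds the prepacking ops
# into attributes of the top-level module.
model = torch.nn.Sequential(torch.nn.Linear(8, 4), torch.nn.ReLU()).eval()
frozen = torch.jit.freeze(torch.jit.script(model))
opt = optimize_for_mobile(frozen)
```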
Test Plan:
python test/test_xnnpack_integration.py
Imported from OSS
Differential Revision: D20290726
fbshipit-source-id: 633ceaa867ff7d5c8e69bd814c0362018394cb3a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34048
Rewrites the graph to insert xnnpack prepack and packed run ops for
conv2d and linear.
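One way to observe the rewrite (hedged: the prepacked::* op names are how the rewrite surfaces in recent builds, and they require XNNPACK support):

```python
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

m = torch.jit.script(torch.nn.Linear(8, 4).eval())
opt = optimize_for_mobile(m)
# On an XNNPACK-enabled build, the graph shows packed-run ops in place
# of aten::linear, e.g. prepacked::linear_clamp_run.
print(opt.graph)
```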
Test Plan:
python test/test_xnnpack_integration.py
Imported from OSS
Differential Revision: D20185658
fbshipit-source-id: c4c073c912ad33e822e7beb4ed86c9f895129d55
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34047
This PR integrates the added XNNPACK conv2d and linear ops via
custom class registration for packed weights. The packed struct
is serializable.
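A sketch of the op pair using the packed-weight custom class (op names per the XNNPACK integration; requires a build with XNNPACK enabled):

```python
import torch

# The prepack op returns a serializable custom-class object holding the
# packed weights; the run op consumes it.
w, b = torch.randn(4, 8), torch.randn(4)
packed = torch.ops.prepacked.linear_clamp_prepack(w, b)
y = torch.ops.prepacked.linear_clamp_run(torch.randn(2, 8), packed)
print(y.shape)  # torch.Size([2, 4])
```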
Test Plan:
python test/test_xnnpack_integration.py
Imported from OSS
Differential Revision: D20185657
fbshipit-source-id: fc7e692d8f913e493b293b02d92f4e78536d7698