mirror of
https://github.com/zebrajr/pytorch.git
synced 2025-12-07 00:21:07 +01:00
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26125 We already had some optimization implementation using AVX2 for improve the quantized kernel performance. In this diff, we want to enable the runtime dispatch. Test Plan: Sandcastle build and test Also test with a python binary calling into vectorized op. torch.__config__.show() PyTorch built with: - GCC 4.2 - clang 8.0.20181009 - Intel(R) Math Kernel Library Version 2017.0.3 Product Build 20170413 for Intel(R) 64 architecture applications - Intel(R) MKL-DNN v0.18.1 (Git Hash N/A) - OpenMP 1 - **CPU capability usage: AVX2** - Build settings: Reviewed By: jamesr66a Differential Revision: D17337251 fbshipit-source-id: 8e22d10011a12a4eaf54cea3485353eb1811d828 |
||
|---|---|---|
| .. | ||
| any.cpp | ||
| autograd.cpp | ||
| CMakeLists.txt | ||
| dataloader.cpp | ||
| dispatch.cpp | ||
| enum.cpp | ||
| expanding-array.cpp | ||
| functional.cpp | ||
| init_baseline.h | ||
| init_baseline.py | ||
| init.cpp | ||
| integration.cpp | ||
| jit.cpp | ||
| memory.cpp | ||
| misc.cpp | ||
| module.cpp | ||
| modulelist.cpp | ||
| modules.cpp | ||
| namespace.cpp | ||
| nn_utils.cpp | ||
| optim_baseline.h | ||
| optim_baseline.py | ||
| optim.cpp | ||
| ordered_dict.cpp | ||
| parallel.cpp | ||
| README.md | ||
| rnn.cpp | ||
| sequential.cpp | ||
| serialize.cpp | ||
| static.cpp | ||
| support.cpp | ||
| support.h | ||
| tensor_cuda.cpp | ||
| tensor_indexing.cpp | ||
| tensor_options_cuda.cpp | ||
| tensor_options.cpp | ||
| tensor.cpp | ||
| torch_include.cpp | ||
C++ Frontend Tests
In this folder live the tests for PyTorch's C++ Frontend. They use the GoogleTest test framework.
CUDA Tests
To make a test runnable only on platforms with CUDA, you should suffix your
test with _CUDA, e.g.
TEST(MyTestSuite, MyTestCase_CUDA) { }
To make it runnable only on platforms with at least two CUDA machines, suffix
it with _MultiCUDA instead of _CUDA, e.g.
TEST(MyTestSuite, MyTestCase_MultiCUDA) { }
There is logic in main.cpp that detects the availability and number of CUDA
devices and supplies the appropriate negative filters to GoogleTest.
Integration Tests
Integration tests use the MNIST dataset. You must download it by running the following command from the PyTorch root folder:
$ python tools/download_mnist.py -d test/cpp/api/mnist
The required paths will be referenced as test/cpp/api/mnist/... in the test
code, so you must run the integration tests from the PyTorch root folder.