mirror of https://github.com/zebrajr/ollama.git synced 2025-12-06 00:19:51 +01:00

Daniel Hiltgen bc8909fb38 Use runners for GPU discovery (#12090) This revamps how we discover GPUs in the system by leveraging the Ollama runner. This should eliminate inconsistency between our GPU discovery and the runner's capabilities at runtime, particularly for cases where we try to filter out unsupported GPUs. Now the runner does that implicitly based on the actual device list. In some cases free VRAM reporting can be unreliable, which can lead to scheduling mistakes, so this also includes a patch to leverage more reliable VRAM reporting libraries if available. Automatic workarounds have been removed as only one GPU leveraged this, which is now documented. This GPU will soon fall off the support matrix with the next ROCm bump. Additional cleanup of the scheduler and discovery packages can be done in the future once we have switched on the new memory management code, and removed support for the llama runner.		2025-10-01 15:12:32 -07:00
testdata	test: improve scheduler/concurrency stress tests (#11906)	2025-08-15 14:37:54 -07:00
api_test.go	tests: add single threaded history test (#12295)	2025-09-22 11:23:14 -07:00
basic_test.go	tests: add single threaded history test (#12295)	2025-09-22 11:23:14 -07:00
concurrency_test.go	tests: reduce stress on CPU to 2 models (#12161)	2025-09-09 09:32:15 -07:00
context_test.go	tests: add single threaded history test (#12295)	2025-09-22 11:23:14 -07:00
embed_test.go	fix(integration): check truncated length (#12337)	2025-09-18 14:00:21 -07:00
library_models_test.go	tests: add single threaded history test (#12295)	2025-09-22 11:23:14 -07:00
llm_image_test.go	perf: build graph for next batch async to keep GPU busy (#11863)	2025-08-29 14:20:28 -07:00
max_queue_test.go	perf: build graph for next batch async to keep GPU busy (#11863)	2025-08-29 14:20:28 -07:00
model_arch_test.go	tests: add single threaded history test (#12295)	2025-09-22 11:23:14 -07:00
model_perf_test.go	tests: add single threaded history test (#12295)	2025-09-22 11:23:14 -07:00
quantization_test.go	tests: add single threaded history test (#12295)	2025-09-22 11:23:14 -07:00
README.md	tests: add single threaded history test (#12295)	2025-09-22 11:23:14 -07:00
utils_test.go	Use runners for GPU discovery (#12090)	2025-10-01 15:12:32 -07:00

README.md

Integration Tests

This directory contains integration tests that exercise Ollama end-to-end to verify its behavior.

By default, these tests are disabled, so `go test ./...` exercises only the unit tests. To run the integration tests, pass the integration tag: `go test -tags=integration ./...`. Some tests require additional tags, which allows scoped testing and keeps the duration reasonable. For example, testing a broad set of models requires `-tags=integration,models` and a longer timeout (about 60 minutes or more, depending on the speed of your GPU). To view the current set of tag combinations, run `find integration -type f | xargs grep "go:build"` from the top of the source tree.
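The tag combinations described above can be sketched as a small shell helper. The function name `integration_test_args` and the scope names (`unit`, `integration`, `models`) are hypothetical and not part of the repository; the `-timeout 60m` value simply mirrors the rough estimate given above and will likely need adjusting for your hardware.

```shell
# Hypothetical helper (not part of the repository): print the `go test`
# arguments for a given test scope.
integration_test_args() {
  case "$1" in
    # Unit tests only: no build tag, so integration tests are excluded.
    unit) echo "./..." ;;
    # Standard integration tests.
    integration) echo "-tags=integration ./..." ;;
    # Broad model coverage: extra tag plus a longer timeout (assumed 60m).
    models) echo "-tags=integration,models -timeout 60m ./..." ;;
    *) echo "unknown scope: $1" >&2; return 1 ;;
  esac
}
```

With the helper loaded, something like `go test $(integration_test_args models)` run from the top of the source tree would expand to the corresponding command; the command substitution is left unquoted on purpose so that the arguments are split into separate words.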

The integration tests have two modes of operation:

1. By default, they start the server on a random port, run the tests, and then shut down the server.
2. If `OLLAMA_TEST_EXISTING` is set to a non-empty string, the tests run against an existing running server, which can be remote depending on your `OLLAMA_HOST` environment variable.
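As a rough illustration of the two modes, the sketch below reports which mode a test run would select from the current environment. The function name `integration_test_mode` and the labels it prints are hypothetical; only the "non-empty string" rule for `OLLAMA_TEST_EXISTING` comes from the description above, and the real selection happens inside the test code.

```shell
# Hypothetical helper (not part of the repository): report which mode the
# integration tests would use, based on OLLAMA_TEST_EXISTING.
integration_test_mode() {
  if [ -n "${OLLAMA_TEST_EXISTING:-}" ]; then
    # Non-empty: run against an already running server (see OLLAMA_HOST).
    echo "existing"
  else
    # Unset or empty: the tests start and stop their own server.
    echo "managed"
  fi
}
```

For example, a run against a remote server might look like `OLLAMA_TEST_EXISTING=1 OLLAMA_HOST=<host:port> go test -tags=integration ./...`, where `<host:port>` is a placeholder for your own server address.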

Important

Before running the tests locally without the "test existing" setting, compile ollama from the top of the source tree with `go build .`, and also build GPU support with cmake if applicable on your platform. The integration tests expect to find an ollama binary at the top of the tree.
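A minimal pre-flight check for this requirement might look like the sketch below. The function name `have_ollama_binary` is hypothetical, and the check assumes the binary is named `ollama` (on Windows the file name would differ); it does not verify GPU support.

```shell
# Hypothetical pre-flight check (not part of the repository): succeed only if
# an executable file named "ollama" exists in the given directory
# (defaults to the current directory).
have_ollama_binary() {
  dir="${1:-.}"
  [ -f "$dir/ollama" ] && [ -x "$dir/ollama" ]
}
```

From the top of the source tree, `have_ollama_binary . || go build .` would then build the binary only when it is missing.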

Many tests use a small default model that is suitable for running on many systems. You can override this default by setting `OLLAMA_TEST_DEFAULT_MODEL`.