ollama/server
Daniel Hiltgen 1fdb351c37
New engine: vision models and auto-fallback (#9113)
* Include unified vision layers in memory prediction

For newer vision models with a single gguf, include
the projection estimates.

* Adjust CLI to handle both styles of vision model metadata

* Wire up new tokenizers for new engine

If we're loading the new engine, utilize the new model
text processor instead of calling into cgo wrappers for
llama.cpp. This also cleans up some tech debt from the
older tokenization flow for the C++ server, which was
no longer used.

This also adjusts the grammar handling logic to pass
through to the new engine instead of utilizing the cgo
schema-to-grammar call.

* Lay foundation for auto selection of new engine
2025-03-04 09:03:46 -08:00
internal server/internal/registry: reintroduce pruning on model deletion (#9489) 2025-03-03 19:11:16 -08:00
testdata/tools all: fix typos in documentation, code, and comments (#7021) 2024-12-10 12:58:06 -08:00
auth.go fix nil deref in auth.go 2024-07-26 14:14:48 -07:00
create_test.go server: validate local path on safetensor create (#9379) 2025-02-28 16:10:43 -08:00
create.go server: validate local path on safetensor create (#9379) 2025-02-28 16:10:43 -08:00
download.go server: increase timeout in stall detection from 5s to 30s (#8831) 2025-02-05 10:00:26 -08:00
fixblobs_test.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
fixblobs.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
images.go next ollama runner (#7913) 2025-02-13 16:31:21 -08:00
layer.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest_test.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
model_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
model.go next ollama runner (#7913) 2025-02-13 16:31:21 -08:00
modelpath_test.go server: more support for mixed-case model names (#8017) 2024-12-11 15:29:59 -08:00
modelpath.go server: more support for mixed-case model names (#8017) 2024-12-11 15:29:59 -08:00
prompt_test.go prompt: Don't trim whitespace from prompts 2024-12-09 11:02:55 -08:00
prompt.go New engine: vision models and auto-fallback (#9113) 2025-03-04 09:03:46 -08:00
routes_create_test.go next ollama runner (#7913) 2025-02-13 16:31:21 -08:00
routes_delete_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
routes_generate_test.go next ollama runner (#7913) 2025-02-13 16:31:21 -08:00
routes_list_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
routes_test.go server/internal/client/ollama: hold DiskCache on Registry (#9463) 2025-03-02 20:55:44 -08:00
routes.go New engine: vision models and auto-fallback (#9113) 2025-03-04 09:03:46 -08:00
sched_test.go next ollama runner (#7913) 2025-02-13 16:31:21 -08:00
sched.go server: add missing function parens to debug log (#9255) 2025-02-20 12:10:15 -08:00
sparse_common.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
sparse_windows.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
upload.go server: always print upload/download part info (#8832) 2025-02-04 19:30:49 -08:00