ollama/llm
nicole pardal 5d347f6d6f
server: Consolidate embedding truncation in runner (#12730)
Currently, checking that embedding prompts fit in the context window
(and truncating them when necessary) happens in two places: the
Ollama server and the runner. This can lead to inconsistencies in
both the checks and the reported number of tokens processed. Since
this processing has to happen in the runner anyway, this change
consolidates all of the logic there.
2025-10-27 11:59:12 -07:00
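The kind of check-and-truncate logic the commit message describes can be sketched in Go. This is an illustrative sketch only, not the actual runner code: the function name `truncateTokens` and its parameters (`numCtx` for the context window size, `truncate` for whether truncation is permitted) are hypothetical stand-ins.

```go
package main

import "fmt"

// truncateTokens is a hypothetical sketch of consolidated embedding
// truncation: verify a tokenized prompt fits in the context window,
// truncate it when allowed, and otherwise return an error. The caller
// reports len(result) as the number of tokens actually processed,
// which stays consistent because the logic lives in one place.
func truncateTokens(tokens []int, numCtx int, truncate bool) ([]int, error) {
	if len(tokens) <= numCtx {
		return tokens, nil
	}
	if !truncate {
		return nil, fmt.Errorf("input length (%d) exceeds context length (%d)", len(tokens), numCtx)
	}
	// Keep only the first numCtx tokens.
	return tokens[:numCtx], nil
}

func main() {
	tokens := []int{1, 2, 3, 4, 5, 6}

	out, err := truncateTokens(tokens, 4, true)
	fmt.Println(len(out), err) // 4 <nil>

	_, err = truncateTokens(tokens, 4, false)
	fmt.Println(err != nil) // true
}
```

Doing this once in the runner, rather than in both the server and the runner, means there is a single source of truth for both the error path and the reported token count.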
llm_darwin.go Optimize container images for startup (#6547) 2024-09-12 12:10:30 -07:00
llm_linux.go Optimize container images for startup (#6547) 2024-09-12 12:10:30 -07:00
llm_windows.go win: lint fix (#10571) 2025-05-05 11:08:12 -07:00
memory_test.go DRY out the runner lifecycle code (#12540) 2025-10-23 11:20:02 -07:00
memory.go DRY out the runner lifecycle code (#12540) 2025-10-23 11:20:02 -07:00
server_test.go DRY out the runner lifecycle code (#12540) 2025-10-23 11:20:02 -07:00
server.go server: Consolidate embedding truncation in runner (#12730) 2025-10-27 11:59:12 -07:00
status.go Improve crash reporting (#7728) 2024-11-19 16:26:57 -08:00