ollama/server
nicole pardal 5d347f6d6f
server: Consolidate embedding truncation in runner (#12730)
Currently, the length of embedding prompts is checked against the
context window (and the prompt possibly truncated) in two places:
the Ollama server and the runner. This can lead to inconsistencies
in both the checks and the reported number of tokens processed.
Since this processing has to happen in the runner anyway, this
change consolidates all of the logic there.
2025-10-27 11:59:12 -07:00
internal refactor: use the built-in max/min to simplify the code (#12280) 2025-09-16 17:14:21 -07:00
auth.go fix nil deref in auth.go 2024-07-26 14:14:48 -07:00
create_test.go engine: add remote proxy (#12307) 2025-09-17 14:40:53 -07:00
create.go engine: add remote proxy (#12307) 2025-09-17 14:40:53 -07:00
download.go server: abort download on empty digest 2025-05-27 11:28:48 -07:00
fixblobs_test.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
fixblobs.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
images_test.go Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119) 2025-06-20 11:11:40 -07:00
images.go templates: fix crash in improperly defined templates (#12483) 2025-10-02 17:25:55 -07:00
layer.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest_test.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
model.go tools: refactor tool call parsing and enable streaming (#10415) 2025-05-23 14:19:31 -07:00
modelpath_test.go lint: enable usetesting, disable tenv (#10594) 2025-05-08 11:42:14 -07:00
modelpath.go server: add hint to the error message when model path access fails (#10843) 2025-05-24 13:17:04 -07:00
prompt_test.go Reapply "add truncate and shift parameters" (#12582) 2025-10-11 16:06:14 -07:00
prompt.go add registries for parsers/renderers 2025-10-14 01:13:54 -07:00
quantization_test.go Reapply "feat: incremental gguf parser (#10822)" (#11114) (#11119) 2025-06-20 11:11:40 -07:00
quantization.go skip quantizing per_layer_token_embd (#11207) 2025-06-26 21:49:35 -07:00
routes_create_test.go fs(ggml): fill in arch prefix if necessary (#12646) 2025-10-20 16:42:18 -07:00
routes_debug_test.go DRY out the runner lifecycle code (#12540) 2025-10-23 11:20:02 -07:00
routes_delete_test.go fs(ggml): fill in arch prefix if necessary (#12646) 2025-10-20 16:42:18 -07:00
routes_generate_renderer_test.go DRY out the runner lifecycle code (#12540) 2025-10-23 11:20:02 -07:00
routes_generate_test.go DRY out the runner lifecycle code (#12540) 2025-10-23 11:20:02 -07:00
routes_harmony_streaming_test.go DRY out the runner lifecycle code (#12540) 2025-10-23 11:20:02 -07:00
routes_list_test.go Update the /api/create endpoint to use JSON (#7935) 2024-12-31 18:02:30 -08:00
routes_test.go engine: add remote proxy (#12307) 2025-09-17 14:40:53 -07:00
routes.go server: Consolidate embedding truncation in runner (#12730) 2025-10-27 11:59:12 -07:00
sched_test.go server: Consolidate embedding truncation in runner (#12730) 2025-10-27 11:59:12 -07:00
sched.go DRY out the runner lifecycle code (#12540) 2025-10-23 11:20:02 -07:00
sparse_common.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
sparse_windows.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
upload.go server: always print upload/download part info (#8832) 2025-02-04 19:30:49 -08:00