ollama

mirror of https://github.com/zebrajr/ollama.git synced 2025-12-06 12:19:56 +01:00

Jesse Gross 282bfaaa95 ollamarunner: Use a separate context per multimodal input	2025-03-14 15:38:54 -07:00

Currently there is a single context per sequence, shared by all multimodal inputs. Since we build a vision encoder graph per image, with a large number of inputs we can eventually hit the maximum number of graph nodes per context. This changes to use a separate context for each image, ensuring that available resource limits are consistent.

imageproc	imageproc mllama refactor (#7537)	2024-12-14 19:50:15 -08:00
input	ml: Allow models to constrain inputs to a single batch	2025-03-14 15:38:54 -07:00
models	ollamarunner: Use a separate context per multimodal input	2025-03-14 15:38:54 -07:00
testdata	gemma2 impl	2025-03-11 14:35:08 -07:00
model_test.go	model: Update encoder cache to use multimodal input processing handler	2025-03-09 17:05:26 -07:00
model.go	ollamarunner: Use a separate context per multimodal input	2025-03-14 15:38:54 -07:00
process_text_spm_test.go	model: add more spm tokenizer tests	2025-03-11 14:49:20 -07:00
process_text_spm.go	model: validate left and right pairs before merging them	2025-03-11 14:49:20 -07:00
process_text_test.go	model: Don't unconditionally add special tokens	2025-03-06 16:54:16 -08:00
process_text.go	set non-causal attention	2025-03-11 14:49:18 -07:00