ollama/model/models
Michael Yang 6c833d5f8d fix(qwen3): deepseek distill
deepseek's qwen3 distill uses a different rope scheme so support both
2025-10-13 13:30:30 -07:00
Name       Last commit                                      Date
bert       embed: cleanup (#12299)                          2025-09-16 09:48:42 -07:00
deepseek2  Fixed Deepseek2 adding nil tensor error          2025-10-03 14:20:06 -07:00
gemma2     gemma: fix rope scaling for qat models (#12348)  2025-09-19 15:04:40 -07:00
gemma3     gemma: fix rope scaling for qat models (#12348)  2025-09-19 15:04:40 -07:00
gemma3n    fix(llama): other llama flavours (#12308)        2025-09-17 12:12:21 -07:00
gptoss     multi-regexp pretokenizer (#12325)               2025-09-23 13:21:47 -07:00
llama      multi-regexp pretokenizer (#12325)               2025-09-23 13:21:47 -07:00
llama4     refactor: use builtin max and min                2025-10-09 16:17:52 -07:00
mistral3   multi-regexp pretokenizer (#12325)               2025-09-23 13:21:47 -07:00
mllama     refactor: use builtin max and min                2025-10-09 16:17:52 -07:00
qwen2      multi-regexp pretokenizer (#12325)               2025-09-23 13:21:47 -07:00
qwen3      fix(qwen3): deepseek distill                     2025-10-13 13:30:30 -07:00
qwen25vl   multi-regexp pretokenizer (#12325)               2025-09-23 13:21:47 -07:00
models.go  Grace/deepseek v3 migration (#12385)             2025-09-24 15:19:47 -07:00