ollama/server
Jesse Gross 6cd566872b sched: Lift parallel restriction for multimodal models except mllama
The Go runner does not have a problem with supporting parallel
requests for most multimodal models. Now that we won't be potentially
falling back to server.cpp, this restriction can be lifted.

However, the new mllama model can't support parallel requests, so we
will need to keep a restriction for that.
2024-11-06 13:32:18 -08:00
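The rule in the commit message above can be sketched roughly as follows. This is a minimal illustration and not the code in sched.go: the helper name `effectiveParallel`, the idea of passing model families as a string slice, and the family names used in `main` are assumptions made for the example.

```go
package main

import "fmt"

// effectiveParallel is a hypothetical helper, not a function from sched.go.
// families lists the model's architecture families (assumed representation);
// requested is the parallel request count the scheduler would otherwise use.
func effectiveParallel(families []string, requested int) int {
	// Never go below one request at a time.
	if requested < 1 {
		return 1
	}
	// Per the commit message, mllama cannot support parallel requests,
	// so it stays pinned to a single request.
	for _, f := range families {
		if f == "mllama" {
			return 1
		}
	}
	// Other models, including other multimodal ones, keep the requested value.
	return requested
}

func main() {
	fmt.Println(effectiveParallel([]string{"llama", "clip"}, 4)) // prints 4
	fmt.Println(effectiveParallel([]string{"mllama"}, 4))        // prints 1
}
```

Keeping the check keyed on the one family that needs it means any other multimodal model gets parallel requests by default, which matches the intent stated in the commit.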
imageproc add more tests for getting the optimal tiled canvas (#7411) 2024-10-29 16:28:02 -07:00
testdata/tools server: add tool parsing support for nemotron-mini (#6849) 2024-09-17 18:06:16 -07:00
auth.go fix nil deref in auth.go 2024-07-26 14:14:48 -07:00
download.go server: fix blob download when receiving a 200 response (#6656) 2024-09-05 10:48:26 -07:00
fixblobs_test.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
fixblobs.go server: replace blob prefix separator from ':' to '-' (#3146) 2024-03-14 20:18:06 -07:00
images.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
layer.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest_test.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
manifest.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
model_test.go server: add tool parsing support for nemotron-mini (#6849) 2024-09-17 18:06:16 -07:00
model.go image processing for llama3.2 (#6963) 2024-10-18 16:12:35 -07:00
modelpath_test.go validate model path 2024-08-28 09:32:57 -07:00
modelpath.go validate model path 2024-08-28 09:32:57 -07:00
prompt_test.go runner.go: Better abstract vision model integration 2024-10-30 14:53:43 -07:00
prompt.go prompt: Use a single token when estimating mllama context size 2024-11-05 10:11:50 -08:00
routes_create_test.go Merge pull request #6534 from ollama/mxyng/messages 2024-08-30 09:39:59 -07:00
routes_delete_test.go server: clean up route names for consistency (#6524) 2024-08-26 19:36:11 -07:00
routes_generate_test.go image processing for llama3.2 (#6963) 2024-10-18 16:12:35 -07:00
routes_list_test.go server: clean up route names for consistency (#6524) 2024-08-26 19:36:11 -07:00
routes_test.go image processing for llama3.2 (#6963) 2024-10-18 16:12:35 -07:00
routes.go One corrupt manifest should not wedge model operations (#7515) 2024-11-05 14:21:45 -08:00
sched_test.go Rename gpu package discover (#7143) 2024-10-16 17:45:00 -07:00
sched.go sched: Lift parallel restriction for multimodal models except mllama 2024-11-06 13:32:18 -08:00
sparse_common.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
sparse_windows.go Don't hard fail on sparse setup error 2024-08-09 12:16:19 -07:00
upload.go server: limit upload parts to 16 (#6411) 2024-08-19 09:20:52 -07:00