ollama

mirror of https://github.com/zebrajr/ollama.git synced 2025-12-06 12:19:56 +01:00


Parth Sareen 0682dae027 sample: improve ollama engine sampler performance (#9374) This change brings in various interface cleanups and greatly improves the performance of the sampler. Tested with llama3.2 on a local machine: throughput improves from ~70 tokens/s to 135 tokens/s with topK(40) enabled. Without topK, performance is ~110 tokens/s.		2025-03-07 12:37:48 -08:00
samplers_benchmark_test.go	sample: improve ollama engine sampler performance (#9374)	2025-03-07 12:37:48 -08:00
samplers_test.go	sample: improve ollama engine sampler performance (#9374)	2025-03-07 12:37:48 -08:00
samplers.go	sample: improve ollama engine sampler performance (#9374)	2025-03-07 12:37:48 -08:00
transforms_test.go	sample: improve ollama engine sampler performance (#9374)	2025-03-07 12:37:48 -08:00
transforms.go	sample: improve ollama engine sampler performance (#9374)	2025-03-07 12:37:48 -08:00
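
The commit message above mentions a top-K step (`topK(40)`) in the sampler. For readers unfamiliar with the idea, here is a minimal sketch of top-K filtering over a logit slice using a bounded min-heap. It is not taken from `transforms.go` and may differ from what the commit does; the names `token`, `minHeap`, and `topK` are hypothetical and chosen only for illustration.

```go
package main

import (
	"container/heap"
	"sort"
)

// token pairs a vocabulary id with its logit.
// Hypothetical type for illustration; not the type used in the repository.
type token struct {
	id    int
	value float32
}

// minHeap keeps the smallest logit at the root so it can be replaced cheaply.
type minHeap []token

func (h minHeap) Len() int           { return len(h) }
func (h minHeap) Less(i, j int) bool { return h[i].value < h[j].value }
func (h minHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x any)        { *h = append(*h, x.(token)) }
func (h *minHeap) Pop() any {
	old := *h
	n := len(old)
	t := old[n-1]
	*h = old[:n-1]
	return t
}

// topK returns the k highest-logit tokens in descending order of logit.
// A k that is non-positive or larger than the input keeps every token.
func topK(logits []float32, k int) []token {
	if k <= 0 || k > len(logits) {
		k = len(logits)
	}
	h := make(minHeap, 0, k)
	for id, v := range logits {
		if len(h) < k {
			// Heap not full yet: always keep the token.
			heap.Push(&h, token{id, v})
		} else if v > h[0].value {
			// Larger than the smallest kept logit: replace the root.
			h[0] = token{id, v}
			heap.Fix(&h, 0)
		}
	}
	// Order the survivors from highest to lowest logit.
	sort.Slice(h, func(i, j int) bool { return h[i].value > h[j].value })
	return h
}
```

Calling `topK(logits, 40)` would keep the 40 highest-scoring candidates. The heap keeps the selection pass at O(n log k) rather than sorting the whole vocabulary, which is one common reason a top-K filter can make sampling cheaper; whether that explains the numbers quoted in the commit message is not stated in the source.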