mirror of
https://github.com/zebrajr/ollama.git
synced 2025-12-06 12:19:56 +01:00
This change brings in various interface cleanups and greatly improves the performance of the sampler. Tested with llama3.2 on a local machine: throughput improves from ~70 tokens/s to 135 tokens/s with topK(40) enabled. Without topK, throughput is ~110 tokens/s.

| File |
|---|
| samplers_benchmark_test.go |
| samplers_test.go |
| samplers.go |
| transforms_test.go |
| transforms.go |
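
The commit message reports higher throughput with topK(40) enabled than without it. One plausible reason is that trimming the candidate set to 40 tokens early leaves far less work for every later sampling step. The sketch below is not the code in `transforms.go`; it is a minimal illustration of one common way to select the top k logits with a bounded min-heap, so the full vocabulary is never sorted. The `token` type, the `minHeap` type, and the `topK` signature are all hypothetical names chosen for this example.

```go
package main

import (
	"container/heap"
	"sort"
)

// token pairs a vocabulary id with its logit.
// Hypothetical type for this sketch, not taken from the repository.
type token struct {
	id    int32
	value float32
}

// minHeap keeps the smallest value at the root so it can be evicted cheaply.
type minHeap []token

func (h minHeap) Len() int           { return len(h) }
func (h minHeap) Less(i, j int) bool { return h[i].value < h[j].value }
func (h minHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x any)        { *h = append(*h, x.(token)) }
func (h *minHeap) Pop() any {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// topK returns the k highest-valued tokens in descending order.
// It runs in O(n log k) time instead of the O(n log n) of a full sort.
func topK(ts []token, k int) []token {
	if k <= 0 {
		return nil
	}
	if k >= len(ts) {
		// Nothing to trim: copy and sort so the input is left untouched.
		out := append([]token(nil), ts...)
		sort.Slice(out, func(i, j int) bool { return out[i].value > out[j].value })
		return out
	}

	// Seed the heap with the first k tokens.
	h := make(minHeap, k)
	copy(h, ts[:k])
	heap.Init(&h)

	// Replace the current minimum whenever a larger value appears.
	for _, t := range ts[k:] {
		if t.value > h[0].value {
			h[0] = t
			heap.Fix(&h, 0)
		}
	}

	// Sort the k survivors in descending order.
	out := []token(h)
	sort.Slice(out, func(i, j int) bool { return out[i].value > out[j].value })
	return out
}
```

Calling `topK(logits, 40)` on a full-vocabulary slice would return the 40 strongest candidates in descending order. Whether this beats a plain sort on real vocabularies is something only a benchmark can settle.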