mirror of
https://github.com/zebrajr/ollama.git
synced 2025-12-06 00:19:51 +01:00
If we create a memory layout that should fit based on report free VRAM but allocation still fails, we start applying a backoff. This reduces free VRAM by an exponential percentage (1%, 2%, 4%...). However, the points chosen tend to be too dense at the beginning and too sparse at the end. Therefore, this switches to an incremental backoff (10%, 20%, 30%...). |
||
|---|---|---|
| .. | ||
| llm_darwin.go | ||
| llm_linux.go | ||
| llm_windows.go | ||
| memory_test.go | ||
| memory.go | ||
| server_test.go | ||
| server.go | ||
| status.go | ||