Mirror of https://github.com/zebrajr/ollama.git, synced 2025-12-06 00:19:51 +01:00
Defragmenting the KV cache can generate a large number of operations, so we need to be careful not to exceed the number of nodes that the graph can support. We currently account for all of the nodes that we add to the graph for each move, but we also need to include the original cache tensors. Fixes #9904
| Name |
|---|
| cache.go |
| causal_test.go |
| causal.go |
| encoder.go |
| wrapper.go |