ollama/kvcache
Jesse Gross 1c093e97af kvcache: Remove special case for reservation mask
We currently short-circuit generation of the cache mask during
reservation and just generate an empty tensor of the correct size.
However, in some cases, this also skips a cast operation, so the
reserved worst-case graph can end up smaller than the true worst case.

We don't actually need the fast path for mask generation, so it's
better to just use the normal code path.
2025-10-22 17:38:04 -07:00
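
To illustrate the problem described in the commit message above, here is a minimal, self-contained sketch. It is not the code in causal.go: the `graph` type, the op names, and the `buildMask` and `opCount` helpers are all hypothetical, and the `fastPath` flag exists only so the old and new behaviour can be compared side by side. The assumption is simply that a mask may need an extra cast op, and that a reservation shortcut returning an empty tensor would not record it.

```go
package main

import "fmt"

// graph is a hypothetical stand-in for a compute graph; it only records op names.
type graph struct {
	ops []string
}

func (g *graph) add(op string) { g.ops = append(g.ops, op) }

// buildMask records the ops needed to produce an attention mask.
// reserve marks a worst-case reservation pass; fastPath re-enables the
// shortcut so that both behaviours can be compared (illustrative only).
func buildMask(g *graph, reserve, fastPath, castToF16 bool) {
	if reserve && fastPath {
		// Shortcut: allocate an empty tensor and return early,
		// which means the cast below is never recorded.
		g.add("empty")
		return
	}
	g.add("from_floats")
	if castToF16 {
		g.add("cast_f16")
	}
}

// opCount builds a fresh graph and reports how many ops it recorded.
func opCount(reserve, fastPath, castToF16 bool) int {
	g := &graph{}
	buildMask(g, reserve, fastPath, castToF16)
	return len(g.ops)
}

func main() {
	actual := opCount(false, false, true)  // normal forward pass
	shortcut := opCount(true, true, true)  // reservation with the fast path
	normal := opCount(true, false, true)   // reservation on the normal path
	fmt.Println(actual, shortcut, normal)  // prints "2 1 2"
}
```

In this toy model the reservation with the shortcut records fewer ops than a real pass, so whatever is sized from it is too small; sending reservation through the normal path makes the two counts match.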
cache.go ollamarunner: Preallocate worst case graph at startup 2025-04-08 10:01:28 -07:00
causal_test.go kvcache: Clean up sliding window state with independent batches 2025-10-08 16:43:14 -07:00
causal.go kvcache: Remove special case for reservation mask 2025-10-22 17:38:04 -07:00
encoder.go ollamarunner: Preallocate worst case graph at startup 2025-04-08 10:01:28 -07:00
wrapper.go ollamarunner: Preallocate worst case graph at startup 2025-04-08 10:01:28 -07:00