ollama/ml
Michael Yang 59412fbb43
convert(gptoss): mxfp4 to ggml layout to avoid jit conversion (#12018)
* convert: return bytes written

* ggml flavor mxfp4

* simplify jit conversion

* comment
2025-08-26 16:41:02 -07:00
backend     convert(gptoss): mxfp4 to ggml layout to avoid jit conversion (#12018)     2025-08-26 16:41:02 -07:00
nn          update vendored llama.cpp and ggml (#11823)                                2025-08-14 14:42:58 -07:00
backend.go  kvcache: Use Cast instead of Copy for flash attention masks                2025-08-19 12:36:28 -07:00