llama.cpp b9483 improves profiling and OpenCL — Intelligence Feed

TL;DR: llama.cpp b9483 removes redundant "NONE" entries from the Hexagon profiler output and switches OpenCL GEMV to the flat q4_K and q6_K variants for very large M.

Summary: This release removes redundant "NONE" entries from the Hexagon profiler output and updates the profiling script. It also switches OpenCL GEMV to the flat variants of q4_K and q6_K when M is very large, a change aimed at improving performance in that case.

Why it matters: These changes affect anyone running local LLM inference on the Hexagon or OpenCL backends. On OpenCL devices, watch for GEMV performance gains with q4_K and q6_K models when M is very large. Update your llama.cpp build to b9483 to pick up both changes.

Source: github.com