TL;DR: llama.cpp b9483 removes redundant "NONE" entries from the Hexagon profiler output and switches OpenCL gemv to the flat q4_K and q6_K variants for very large M.
Summary: This release fixes redundant "NONE" entries in the Hexagon profiler output and updates the profiling script. It also switches OpenCL gemv (matrix-vector multiplication) to the flat variants of the q4_K and q6_K kernels when M is very large, a change aimed at improving performance in that case.
Why it matters: The OpenCL change targets local LLM inference with q4_K- and q6_K-quantized models on OpenCL devices, while the profiler fix matters only if you profile on Hexagon. If you run on OpenCL, update your llama.cpp build to b9483 and check whether gemv performance improves when M is very large.
Source: github.com