TL;DR: A new fine-tuned alternative to the Gemma 4 heretic collection has been released; it changes the model's divergence and refusal behavior.
Summary: An independent developer has built an unquantized version of a Gemma 4 heretic alternative that modifies the divergence and refusal characteristics of the original model. Users report that it holds up better than the original in long-context scenarios (up to 20k tokens) and during complex tool-chain execution, where the original model often degrades. The community is currently asking for a 4-bit quantization (specifically Q4_K_M).
Why it matters: It offers AI developers an alternative alignment profile for Gemma 4 that is less hyper-vigilant and more robust during long-context tool use. Builders who want to run the model efficiently on local hardware should watch for 4-bit quantizations, which have been requested but are not yet available.
Source: reddit.com