Homunculus 12B and GLM-4-32B-Base-32K: two new research-oriented models from Arcee AI
In this video, I introduce two research-oriented models that Arcee AI recently released on Hugging Face.
Homunculus is a 12-billion-parameter instruction model distilled from Qwen3-235B onto the Mistral Nemo backbone. It was purpose-built to preserve Qwen's two-mode interaction style, /think (deliberate chain-of-thought reasoning) and /nothink (concise answers), while running on a single consumer GPU, and even on CPU, as demonstrated in the video.
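To make the two modes concrete, here is a minimal sketch of how one might toggle them with Hugging Face transformers. The repo id `arcee-ai/Homunculus` and the convention of passing the mode tag in the system message are assumptions on my part; check the model card for the exact prompt format.

```python
# Sketch only: toggling Homunculus's /think and /nothink modes.
# The repo id and the system-message convention are assumptions,
# not taken from the video; consult the model card before use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arcee-ai/Homunculus"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

def ask(question: str, mode: str = "/think") -> str:
    # The mode tag selects deliberate chain-of-thought (/think)
    # or a concise answer (/nothink).
    messages = [
        {"role": "system", "content": mode},
        {"role": "user", "content": question},
    ]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=512)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )

print(ask("What is 17 * 23?", mode="/nothink"))
```

The same `ask` call with `mode="/think"` should instead produce an explicit reasoning trace before the answer, which is the behavior distilled from Qwen.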
GLM-4-32B-Base-32K is an enhanced version of GLM-4-32B-Base-0414 by THUDM (Tsinghua University), engineered for robust performance over an extended context window. While the original model's capabilities degraded beyond 8,192 tokens, this version maintains strong performance up to a 32,000-token context, making it well suited to long-context tasks.
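Since this is a base (non-instruct) model, the natural way to exercise the long context is plain next-token completion over a lengthy document. Below is a minimal sketch, assuming the repo id `arcee-ai/GLM-4-32B-Base-32K` and a local text file; adjust dtype and memory settings for your hardware.

```python
# Sketch only: long-context completion with the 32K base model.
# The repo id is assumed from the post; "report.txt" is a
# hypothetical long input of roughly 30K tokens.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arcee-ai/GLM-4-32B-Base-32K"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# A base model does plain completion, so prompt it with the long
# document followed by the text you want it to continue.
long_document = open("report.txt").read()
prompt = long_document + "\n\nSummary:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(f"Prompt length: {inputs.input_ids.shape[-1]} tokens (up to ~32K supported)")

output_ids = model.generate(**inputs, max_new_tokens=300)
print(tokenizer.decode(
    output_ids[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True
))
```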