HYDERABAD, India (GizTimes) - Google has gone a full year without updating its open-source AI models. That changes with Gemma 4, a release that doesn’t just improve performance but reshapes how developers can use and deploy AI. The company released Gemma 4 today, built on the Gemini 3 architecture in sizes from 2B to 31B parameters. What sets it apart is the switch from restrictive custom terms to Apache 2.0 licensing, which gives developers and commercial users far more freedom.
These models are designed to run locally across devices, including phones, laptops, and workstations. The 31B dense variant is aimed at complex reasoning tasks and, in early benchmarks, slightly outperforms Qwen 3.5 397B. Alongside it, a 26B Mixture-of-Experts model (26B-A4B) activates roughly 4 billion of its parameters per token. This allows it to deliver performance comparable to larger systems like Qwen 3.5 27B while remaining more efficient. The smaller 2B and 4B variants (E2B and E4B) are built for mobile use and include native audio support, expanding their practical applications.
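For a rough sense of what the 26B-A4B figure implies, the short Python sketch below computes the share of parameters active per token. It assumes the rounded figures quoted above (26 billion total, about 4 billion active); the function name is illustrative and is not part of any Gemma tooling.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the share of a model's parameters used per token.

    Both arguments are parameter counts in billions. The values passed
    below are the rounded figures from the article, not exact counts.
    """
    if total_params_b <= 0:
        raise ValueError("total_params_b must be positive")
    if not 0 <= active_params_b <= total_params_b:
        raise ValueError("active_params_b must be between 0 and total_params_b")
    return active_params_b / total_params_b


# 26B-A4B: about 4B of 26B parameters are active for each token.
fraction = active_fraction(26.0, 4.0)
print(f"Active per token: {fraction:.1%}")  # Active per token: 15.4%
```

On these rounded numbers, only about 15 percent of the model's parameters do work on any given token, which is where the efficiency claim comes from. The full set of weights typically still has to be stored, so the saving is mainly in compute rather than in memory.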
Gemma 4 excels at agentic tasks such as following multi-step instructions and making tool calls. It runs on NVIDIA RTX hardware or on Android via AICore, with a developer preview now available for app integration. The weights are public, giving developers an open counterpart to the proprietary Gemini 3 technology, though the training data remains internal.
Support is rolling out quickly across Google Cloud, Hugging Face, and NVIDIA stacks.
What really stands out is the change in licensing. Gemma 4 drops several restrictions that were present in Gemma 3, including limits on redistribution and distillation, and it also allows air-gapped deployments. Compared with Llama 4, which still has commercial caps at scale, this feels far more open. It could encourage a shift away from GPU-heavy approaches toward running AI more efficiently on local systems.
Reactions on X (formerly Twitter) tilt positive, with users buzzing over the rise of local AI and the open terms.
Users foresee home data centers paired with phone models ending app overload. As one put it: "This is huge news everyone will be running their own model locally. This could change the ‘app’ model we’re currently drowning in."



