Elon Musk’s AI company, xAI, has just released Grok 4, its latest and most powerful AI model to date. Alongside the new model, the company introduced a premium subscription called SuperGrok Heavy, priced at $300 per month.
This new model is designed to compete with OpenAI’s upcoming GPT-5 and Google’s Gemini, positioning xAI directly in the race for AI dominance. Grok 4 can analyse images and answer questions, and it has been integrated into X, the social media platform formerly known as Twitter, which Musk also owns. However, this tight integration has drawn public attention to Grok’s errors, making every mistake highly visible.
The spotlight on Grok has been particularly intense due to a recent controversy. xAI had to limit Grok’s automated X account after it responded to users with antisemitic comments. The backlash was swift, and the company quietly removed part of Grok’s instructions that encouraged it to make politically incorrect statements. While the incident stirred concern, Musk and his team focused the Grok 4 launch on technical performance rather than directly addressing the controversy.
At the launch, Musk, wearing a leather jacket, sat with xAI executives and confidently stated that Grok 4 performs above a PhD level in every subject. However, he admitted it sometimes lacks common sense. He also acknowledged that the model had not yet created new technologies or made scientific discoveries, but suggested that such milestones were only a matter of time.
xAI launched two versions of its model: Grok 4 and Grok 4 Heavy. The latter is described as a “multi-agent” version, meaning it can break down tasks into smaller parts, solve them with different agents, and then choose the best solution, much like a study group. This approach, according to xAI, allows Grok 4 Heavy to perform at a much higher level on academic benchmarks.
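xAI has not published the internals of this approach, but the "study group" idea can be illustrated with a minimal, hypothetical sketch: several independent agents attempt the same task, and a judge keeps the answer that scores best. The agent and judge functions below are simple stand-ins for what would in practice be separate model calls; nothing here reflects xAI's actual implementation.

```python
# Hypothetical sketch of a "study group" multi-agent pattern:
# fan a task out to several agents, score each answer, keep the best.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Attempt:
    agent_name: str
    answer: str
    score: float  # higher is better


def run_study_group(task: str,
                    agents: List[Callable[[str], str]],
                    judge: Callable[[str, str], float]) -> Attempt:
    """Send the task to every agent, score each answer, return the best attempt."""
    attempts = []
    for agent in agents:
        answer = agent(task)
        attempts.append(Attempt(agent.__name__, answer, judge(task, answer)))
    return max(attempts, key=lambda a: a.score)


# --- Stand-in agents; in a real system each would be a separate model call ---
def cautious_agent(task: str) -> str:
    return f"A short, conservative answer to: {task}"


def exhaustive_agent(task: str) -> str:
    return f"A long, detailed answer to: {task}, covering several edge cases"


def length_judge(task: str, answer: str) -> float:
    # Toy scoring rule: prefer longer answers. A real judge would itself be a model.
    return float(len(answer))


if __name__ == "__main__":
    best = run_study_group("Explain the ARC-AGI-2 benchmark",
                           [cautious_agent, exhaustive_agent],
                           length_judge)
    print(best.agent_name, "->", best.answer)
```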
The company claims Grok 4 outperformed top models from Google and OpenAI on Humanity’s Last Exam, a rigorous test that measures how well AI models perform across a range of human knowledge. Grok 4 scored 25.4 percent without extra tools, compared to 21.6 percent for Gemini 2.5 Pro and 21 percent for OpenAI’s o3. With tools, Grok 4 Heavy scored an impressive 44.4 percent, well ahead of competitors.
Grok 4 also achieved the highest score ever recorded on the ARC-AGI-2 benchmark, which evaluates an AI’s ability to solve logic-based visual puzzles. It earned 16.2 percent, almost twice as high as its closest rival, Claude Opus 4. These numbers help position xAI’s models as leaders in the field, even if they are not yet household names like ChatGPT.
The $300-per-month SuperGrok Heavy plan is now the most expensive subscription offered by any major AI provider. In exchange, users get early access to Grok 4 Heavy and upcoming features. xAI has already teased the release of an AI coding assistant in August, a multi-modal agent in September, and a video generation model by October, indicating a fast-moving development pipeline.
Grok 4 is also being made available through an API, allowing developers to build apps using the model. Although xAI’s enterprise business is still only two months old, it plans to collaborate with large cloud providers to expand access through established platforms. This is a strategic move to quickly gain traction among developers and businesses.
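For developers, access follows the familiar OpenAI-compatible pattern xAI has documented for its API. The sketch below shows what a call might look like; the base URL, the model name ("grok-4"), and the environment variable are assumptions to be checked against xAI's developer documentation.

```python
# Rough sketch of calling Grok 4 via xAI's OpenAI-compatible API.
# Model name, base URL, and env var are assumptions; consult xAI's docs.

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],   # assumed env var holding your xAI key
    base_url="https://api.x.ai/v1",      # xAI's OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="grok-4",                      # model identifier is an assumption
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize the ARC-AGI-2 benchmark in one sentence."},
    ],
)

print(response.choices[0].message.content)
```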
Still, xAI faces an uphill battle in earning trust, especially after the recent public scandal. Grok 4 might be powerful, but the challenge now is to assure businesses and users that xAI can manage its models responsibly. Whether the company can maintain momentum and position Grok as a genuine rival to ChatGPT, Claude, and Gemini will depend not just on performance but on transparency, trust, and user experience.