Free users get xAI’s newest Grok model now

xAI launched Grok-4-Fast, a cost-optimized successor to Grok-4 with a 2M-token context window. The model unifies “reasoning” and “non-reasoning” behaviors under one set of weights, with system prompts selecting between them. It targets high-throughput search, coding, and Q&A, and was trained with tool-use reinforcement learning covering browsing, code execution, and other tool calls.

According to Marktechpost, Grok-4-Fast aims to cut latency and token use while keeping accuracy close to Grok-4.

Unified model for real-time work

Earlier Grok releases split long “reasoning” and short “non-reasoning” responses across separate models. Grok-4-Fast uses a unified weight space and steers behavior with system prompts. xAI says this reduces end-to-end latency and token count in real-time uses such as search and interactive coding.

xAI trained the model end-to-end with tool-use RL. The company reports gains on search agent tasks, including 44.9% on BrowseComp, 95.0% on SimpleQA, and 66.0% on Reka Research. Scores on Chinese-language variants were higher, such as 51.2% on BrowseComp-zh.

Benchmarks and “intelligence density”

xAI cites pass@1 results of 92.0% on AIME 2025 and 93.3% on HMMT 2025 without tools. It reports 85.7% on GPQA Diamond and 80.0% on LiveCodeBench (Jan–May). The company says the model uses about 40% fewer “thinking” tokens than Grok-4 at similar accuracy.

In LMArena, grok-4-fast-search (codename “menlo”) ranks #1 in the Search Arena with 1163 Elo. The text variant (codename “tahoe”) sits at #8 in the Text Arena, described as near grok-4-0709.

Availability and pricing

The model is generally available in Grok’s Fast and Auto modes on web and mobile. Auto mode selects Grok-4-Fast for hard queries to cut latency while keeping quality. For the first time, free users can access xAI’s latest model tier.

Developers get two SKUs, grok-4-fast-reasoning and grok-4-fast-non-reasoning, both with 2M context. xAI lists API pricing per 1M tokens as $0.20 for input and $0.50 for output in the <128k tier, $0.40 for input and $1.00 for output in the ≥128k tier, and $0.05 for cached input.
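To make the listed rates concrete, here is a minimal cost-estimation sketch in Python. It is not xAI's billing logic. It assumes the 128k threshold is judged on the request's total input token count and that cached input tokens are billed at the cached rate in both tiers, neither of which the article confirms. The function name `estimate_cost` and its parameters are hypothetical.

```python
# Rates in USD per 1M tokens, copied from the pricing listed above.
LONG_CONTEXT_THRESHOLD = 128_000  # assumption: compared against input size

RATES = {
    "short": {"input": 0.20, "output": 0.50},  # < 128k
    "long": {"input": 0.40, "output": 1.00},   # >= 128k
}
CACHED_INPUT_RATE = 0.05  # assumption: same cached rate in both tiers


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Return an estimated request cost in USD (illustrative only)."""
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    # Pick the pricing tier from the size of the prompt (assumed rule).
    tier = "long" if input_tokens >= LONG_CONTEXT_THRESHOLD else "short"
    rates = RATES[tier]

    fresh_input = input_tokens - cached_tokens
    total = (
        fresh_input * rates["input"]
        + cached_tokens * CACHED_INPUT_RATE
        + output_tokens * rates["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    print(f"{estimate_cost(100_000, 10_000):.4f}")  # short-tier request
    print(f"{estimate_cost(200_000, 10_000):.4f}")  # long-tier request
```

Under these assumptions, a 100k-token prompt with 10k output tokens costs about $0.025, while a 200k-token prompt with the same output costs about $0.09, since both the input and output rates double in the higher tier.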

xAI frames the efficiency as “intelligence density.” It claims that matching Grok-4’s benchmark performance costs about 98% less once lower token use is combined with the new per-token rates.
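A rough way to see how the two factors combine: if cost is modeled as tokens used times price per token, the overall cost ratio is the product of the token ratio and the price ratio. The Python sketch below is illustrative arithmetic only. It assumes a single blended per-token rate and that the roughly 40% token reduction applies to all billed tokens, which simplifies xAI's claim (the 40% figure covers "thinking" tokens). The function names are hypothetical.

```python
def combined_cost_ratio(token_ratio: float, price_ratio: float) -> float:
    """Cost relative to baseline when cost = tokens * price per token."""
    return token_ratio * price_ratio


def required_price_ratio(cost_reduction: float, token_reduction: float) -> float:
    """Per-token price ratio needed to reach a target overall cost reduction."""
    if not 0.0 <= token_reduction < 1.0:
        raise ValueError("token_reduction must be in [0, 1)")
    target_cost_ratio = 1.0 - cost_reduction
    token_ratio = 1.0 - token_reduction
    return target_cost_ratio / token_ratio


if __name__ == "__main__":
    # Figures from the article: ~98% lower cost, ~40% fewer tokens.
    ratio = required_price_ratio(cost_reduction=0.98, token_reduction=0.40)
    print(f"implied per-token price ratio: {ratio:.3f}")
```

Under this simplified model, a 98% overall reduction with 40% fewer tokens implies a blended per-token price of roughly one thirtieth of the baseline, so most of the claimed saving would come from the lower rates rather than from the token reduction.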

Marktechpost links to xAI’s announcement and technical details at x.ai.
