
Murati's Interaction Models: Real-Time AI or Clever Branding?

Key Points

• Thinking Machines announces 'interaction models' for continuous multimodal AI perception
• Models maintain real-time awareness, unlike current systems that wait for prompts
• Technical components (multimodal, real-time) exist separately in current AI
• Pricing and availability remain undisclosed
• Competitors (OpenAI, Google) already pursuing similar real-time AI capabilities
• Former OpenAI CTO Murati is leveraging her credibility but faces scrutiny on differentiation

References (1)

[1] Mira Murati's Thinking Machines unveils 'interaction models' for real-time AI, The Verge AI

We call them "assistants," but today's AI models spend most of their time blind. They wait. They have no idea what you're doing until you hit enter. Mira Murati's new company, Thinking Machines, is calling this the wrong paradigm—and on Monday, it unveiled what it believes should replace it.

The company announced "interaction models," a category it defines as AI systems that continuously perceive audio, video, and text while monitoring user actions in real time. Unlike current assistants that activate only when prompted, these models maintain persistent context awareness. The result, according to Thinking Machines, is collaboration that mirrors how humans work together—responsive, continuous, and contextually grounded.

The distinction sounds subtle but represents a fundamental shift in architecture. Today's large language models process inputs in discrete turns. You type a query, the model responds, the conversation pauses. Interaction models, as Murati's team describes them, would observe your cursor movements, note hesitations, watch facial expressions through your camera, and factor all of it into ongoing reasoning. The model never truly "starts" or "stops"—it simply thinks alongside you.

This raises an obvious question: is this a genuine paradigm shift or polished rebranding? The technical components (multimodal processing, real-time inference, sustained context windows) exist individually in current systems. GPT-4o demonstrated early real-time audio capabilities. Google's Project Astra showed continuous video perception. What Thinking Machines appears to be proposing is tighter integration: a unified system where all modalities inform each other simultaneously, without the user explicitly invoking each one.

The competitive pressure is real. OpenAI, Anthropic, and Google are all racing toward AI that feels more present and less transactional. Murati knows this territory intimately—she helped architect the systems she's now challenging. Her credibility buys the concept legitimacy, but it also invites scrutiny. Industry observers will ask whether interaction models represent technical breakthroughs or architectural choices competitors could replicate quickly.

Pricing and access details remain sparse. Thinking Machines has not announced when interaction models will be available or at what price. For a technology defined by continuous operation, the business model implications are significant, because sustained real-time inference is expensive. Whether developers and enterprises will pay a premium for perpetual awareness, rather than paying per query, remains uncertain.

What is clear is the framing. Murati is not asking for incremental improvement. She's arguing that the current input-output model of AI is fundamentally broken—that waiting for typed commands is the wrong metaphor for how humans and machines should collaborate. If she's right, interaction models could become the standard interface layer for the next generation of AI applications. If she's overpromising, the industry's collective attention span is short enough that the window for proof won't stay open long.

The first implementations will tell the story. Until then, "interaction models" remains a label somewhere between technical promise and marketing architecture, a category waiting to be earned.