The number that matters most from Meta's AI launch this week isn't in a benchmark table. It's the jump from #57 to #5 on the iOS App Store within 24 hours of Muse Spark's release.
That 52-place climb is the real signal. Muse Spark is Meta's first non-Llama model, built on a completely new internal stack, and the market is voting with downloads. Simon Willison's hands-on testing of the model's two modes, Instant and Thinking, offers the clearest picture of what users actually experience. His "pelican test", which asks a model to draw a pelican riding a bicycle as an SVG, produced a mangled bird on a mangled bicycle in Instant mode, then a clearly recognizable pelican wearing a blue cycling helmet in Thinking mode. The difference isn't cosmetic. It's the gap between a toy and a tool.
Thinking mode is what makes Muse Spark worth watching. Meta exposes two distinct inference modes on meta.ai: a fast "Instant" response and a slower "Thinking" path that produces meaningfully better outputs. On benchmarks, Meta's self-reported numbers place Muse Spark alongside Opus 4.6, Gemini 3.1 Pro, and GPT-5.4 on selected tests. But the company acknowledges it trails on Terminal-Bench 2.0, and it specifically flags long-horizon agentic tasks and coding workflows as areas needing work. A planned "Contemplating" mode promises much longer reasoning chains, which would position Meta against GPT-5.4 Pro's depth.
The API situation remains tightly controlled. Muse Spark is currently hosted-only, with no open weights, and API access is limited to a private preview for select partners. No public pricing has been announced. This puts Muse Spark in direct competition with OpenAI's and Google's hosted offerings rather than in the open-weights ecosystem Llama once dominated.
But here's what the App Store climb suggests: users are choosing quality over brand inertia. The pelican SVG that Willison generated isn't a curated demo. It's a test anyone can rerun and check for themselves. When people download an app, try it, and recommend it to others, that's not marketing spend. That's product-market fit emerging in real time.
Meta spent years as an AI also-ran, shipping Llama models that impressed researchers while its consumer product lagged. Muse Spark suggests something has shifted. The question now isn't whether Meta can build a competitive model. It's whether the company can hold users' attention once the novelty of the App Store charts fades. The next few weeks will reveal whether this surge reflects genuine switching or temporary curiosity. Either way, Meta just proved it can build something people actually want to use.