The side button has this satisfying click—like an old Leica camera shutter. But after twenty minutes of "vibe coding" a half-baked tool, the back of the Rabbit R1 gets noticeably warm. Not hot, but that specific, localized warmth of a processor struggling to keep up with a Large Action Model (LAM) in real-time. It’s a weird, orange plastic brick that finally does more than play Spotify, but man, the road to RabbitOS 2 was bumpy.
I bought the R1 in September 2025, right after the OS 2 overhaul dropped. I'd read the 2024 disaster reviews—the half-baked demos, the connectivity that came and went like a bad Wi-Fi signal. But I’m a sucker for weird hardware. And after my own 2 AM server crash in January 2025—where a rogue agent deleted system files just to "free up space"—I needed something that felt tamed. I needed an AI with guardrails.
The 1,000% Tip Incident
My first real attempt at a "Creation" was innocent enough: a tip splitter for a 15-person dinner. I held the button and spoke clearly: "Make a tool that splits the bill evenly, adds 18% tip, and handles uneven amounts." The R1 thought for six seconds, then generated a digital card.
At dinner, I tapped it. It suggested a 1,000% tip.
I stared at the screen. The group laughed. I paid manually and spent the walk home trying three more prompts to fix the logic. Turns out, I’d said "eighteen percent," but the LAM parsed it as "1800%" because of the background restaurant noise? Still not sure. The "Critic Agent" that was supposed to catch these errors? Silent. It took "eighteen, not one thousand eight hundred" and a full redo to get it right.
That’s Vibe Coding in a nutshell: you’re not writing code; you’re negotiating with a very literal, very fast, and sometimes very confused intern. But here is the kicker—I didn't have to open the App Store once.
I bought the R1 in September 2025, right after RabbitOS 2 dropped. I'd read the 2024 disaster reviews—half-baked demos, connectivity that came and went. But I'm a sucker for weird hardware. And after my own 2AM server crash in January 2025 (long story: an agent deleted system files to "free up space"), I needed something that felt tamed. Something with guardrails.
The 1,000% tip incident
My first real attempt at a "Creation" was innocent enough: a tip splitter for a 15-person dinner. I held the button, spoke clearly: "Make a tool that splits the bill evenly, adds 18% tip, and handles uneven amounts." The R1 thought for a few seconds, then generated a card. At dinner, I tapped it. It suggested a 1,000% tip. I stared at the screen. The group laughed. I paid manually and spent the walk home trying three more prompts to fix the logic. Turns out, I'd said "eighteen percent" but the LAM parsed it as "1800%"? Still not sure. The critic agent? Silent. It took "eighteen, not one thousand eight hundred" and a full redo to get it right. That's vibe coding: you're not writing code, you're negotiating with a very literal intern.
So what is vibe coding, really?
It's not "the end of programming." It's more like talking to someone who half-listens. You describe what you want, it builds a card. But the card is often wrong in delightful or infuriating ways. The consensus on the Rabbit subreddit is that you need at least three back-and-forths for anything non-trivial. Someone there posted: "I'm not a developer, I'm a prompt nagger." That's the vibe.
The device runs on a Large Action Model (LAM) and something they call Rabbit Intern. The Intern is supposed to ask clarifying questions. Sometimes it does. Sometimes it just... guesses. Like that time I asked for a "meeting note taker" and it made a tool that ordered pizza for every attendee. (I wish I was making this up.)
“It's like pairing a very eager, very confused intern with a slightly overprotective babysitter (the critic agent).”— u/agent_fail, r/RabbitR1
Under the hood: cards, critics, and that warm back
RabbitOS 2 replaced the old app list with a "deck of cards." Swipe left to dismiss, right to duplicate. It's tactile. I like it. But the real magic is the critic agent—a second AI that watches the first one. After my 2025 server meltdown, I'm obsessed with oversight. The critic isn't perfect (see: tip incident), but it catches stuff like "hey, this creation is trying to access your texts at 3am" or "that flight is $900 over your usual budget."
Here's a real constitution from a health Creation I built, based on the health agent guide:
{
"triggers": { "oura_hrv": "<35" },
"actions": [
{ "reschedule": "calendar_events with 'meeting' before 10am" },
{ "order": "smoothie", "budget_max": 18 }
],
"critic_rules": {
"never_reschedule": "events tagged 'doctor'",
"confirm_if": "total_cost > 20"
},
"privacy": { "anonymize_health_data": true }
}
It works. Mostly. But the smoothie order once tried to deliver to my old office because the GPS flickered. The critic caught it—"delivery address changed, confirm?"—and I sighed relief.
How it compares (I tested, you're welcome)
Task
RabbitOS 2
Cloud AutoGPT
Email draft + send
1.8s · 98%
4.2s · 91%
Multi-step travel
3.1s · 94%
8.7s · 88%
Health analysis
2.3s · 96%
5.5s · 87% (privacy concerns)
Energy cost? The R1 sips power. Cloud agents? $29/month subscription. Local Llama 4? 2.4 kWh/day — about $0.36. Numbers from my orchestration deep dive.
But do I miss the App Store? (yes, for some things)
Here's the contrarian take everyone hates: I still use my phone for banking. The Chase app is just faster. The R1's banking Creation? It worked twice, then failed to log in, then asked for MFA via text, then... I gave up. The polish of a native app—the milliseconds, the muscle memory—matters for stuff you do daily. Creations shine for weird tasks, not core ones. I'm not alone: on the Rabbit subreddit, someone posted "I love my R1 but I'll never ditch my phone for it." And that's fine. It's a companion, not a replacement.
The App Store isn't dying. But it's shrinking at the edges. The long-tail, the hyper-personal, the "I need this one specific tool for this weekend"—that's where Creations win. And they win without giving Apple 30%.
Four Creations I Actually Use
The Recovery Guardian (health)
Monitors my Oura ring. When HRV tanks, it shifts my calendar and texts my partner. It once blocked a 9am meeting after a bad night. I slept in. Worth the price of entry.
The Bill Negotiator (finance)
Drafted an ISP email, scheduled a callback, saved $25/month. The critic flagged that I shouldn't auto-pay until I confirmed—smart.
The Family Organizer (household)
Fridge inventory + meal suggestions. It once added "buy cat food" because it misheard "cat food" in a podcast. We were out anyway.
The "Auto-Shopper" (failure)
Tried to automate household supplies. The critic kept yelling: "budget exceeded," "new vendor not approved." After a week, I gave up. Some things need human eyes.
“I'm starting to think the real value isn't the Creations that work perfectly. It's the ones that fail in ways that make me laugh, or think, or text a friend.”
R2, swarms, and the doubt that won't leave
The rumored Rabbit R2 (late 2026) is supposed to have a dedicated AI chip, fully local critic, and "agent swarms"—multiple Creations talking to each other. Sounds cool. But I keep coming back to a question: do I want my devices to talk that much? The 2AM crash still echoes. Every time the R1 gets warm in my hand, I think about that server deleting files. The critic helps. But it's not perfect. And I'm not sure I want it to be.
Stuff people actually ask
How accurate is vibe coding? About 85% for multi-step. Plan on 2-3 tries.
Does it cost more than apps? Creations are free. You pay for the hardware ($199) and subscription ($15/mo). I've saved more than that.
Can you trust stranger's Creations? Inspect the constitution. The critic enforces limits. I still don't grant bank access.
RabbitR1 VibeCoding AgenticAI 2026
I still don't know if this is the future. But it's a weird, warm, orange present. And I'm here for it.
References: Health agent guide · Orchestration 2.0 · r/RabbitR1 · last updated Feb 2026
Hashtags
#RabbitR1 #RabbitOS #VibeCoding #AgenticAI #AIHardware #PostAppEra #RabbitIntern #TechTrends2026 #LargeActionModel #AI