James Rivera, small business consultant and AI strategist

Claude Made 1,322% Trading on Polymarket. OpenClaw Got Liquidated.

Two AI agents traded on Polymarket with $1,000 each. Claude returned 1,322%. OpenClaw hit zero. What this means for AI agents in finance.

A viral post hit X last week. Someone gave two AI agents $1,000 each and 48 hours to trade on Polymarket. Claude turned that into $14,216. OpenClaw got liquidated to zero. The post pulled 7,700 likes and 327 retweets. It also started a conversation that I think most people are reading wrong.

The takeaway isn't "Claude is better at trading." The takeaway is that autonomous agents are now operating in real financial markets with real money, and the performance gap between models is enormous.

What happened when AI agents traded on Polymarket?

Polymarket is a prediction market. You bet on the outcome of real-world events. Will X happen by Y date? The market prices those contracts between $0 and $1 based on what traders collectively believe. It's been around for a while, but volume exploded during the 2024 US election cycle. Now it covers everything from geopolitics to interest rate decisions to celebrity drama.
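
To make the mechanics concrete: a YES contract priced at $0.30 implies a 30% market probability, and an agent's edge is the gap between that price and its own estimate. Here's a minimal Python sketch of that expected-value math (the numbers are illustrative, not from the experiment):

```python
def expected_value(price: float, your_probability: float) -> float:
    """Expected profit per YES contract that costs `price` and pays $1 on YES.

    You win (1 - price) with probability your_probability; you lose price otherwise.
    """
    return your_probability * (1.0 - price) - (1.0 - your_probability) * price

# Market says 30%; you believe 45%. Each $0.30 contract is worth about +$0.15 in expectation.
print(expected_value(price=0.30, your_probability=0.45))  # ~0.15
```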

The experiment was straightforward. Two AI agents, each with $1,000 in USDC, were given permission to read market data, analyze positions, and execute trades autonomously. No human in the loop. The Claude-based agent analyzed contract prices, assessed probability mispricings, sized positions conservatively, and compounded gains across dozens of trades over 48 hours. It finished at $14,216.
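
We don't know the Claude agent's actual sizing logic, but "sized positions conservatively" on binary contracts usually means something like fractional Kelly with a hard cap. A hypothetical sketch, with thresholds I made up:

```python
def kelly_fraction(price: float, your_probability: float) -> float:
    """Full-Kelly bankroll fraction for a YES contract; simplifies to (q - p) / (1 - p)."""
    return max(0.0, (your_probability - price) / (1.0 - price))

def position_size(bankroll: float, price: float, your_probability: float,
                  kelly_multiplier: float = 0.25, cap: float = 0.05) -> float:
    """Quarter-Kelly stake, hard-capped at 5% of bankroll. Both knobs are judgment calls."""
    fraction = kelly_fraction(price, your_probability) * kelly_multiplier
    return bankroll * min(fraction, cap)

# $1,000 bankroll, contract at $0.30, believed probability 45%:
print(position_size(1000, 0.30, 0.45))  # $50.00: the cap binds; full Kelly would stake ~$214
```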

The OpenClaw-based agent took a different approach. It concentrated heavily into a few high-conviction bets, didn't manage position sizing well, and got caught on the wrong side of a binary outcome. Liquidated.

Now here's where people get it wrong. This wasn't really a fair model comparison. The OpenClaw agent was likely running a less capable model underneath (probably GPT-3.5 or an older open-source model), had a simpler system prompt, and wasn't given the same risk management instructions. We don't know the exact prompts used. We don't know the tool configurations. We don't know if one agent had access to real-time news feeds while the other didn't.

What we do know is this: the setup matters as much as the model. I've built enough agents to know that the difference between a good system prompt and a bad one can be the difference between an agent that reasons carefully and one that yolos into a single position.
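
To illustrate what I mean by risk management instructions, here's a hypothetical fragment of a trading agent's system prompt. The experiment's actual prompts were never published; this is my own sketch:

```python
# Hypothetical system prompt fragment; not from the experiment.
TRADING_RULES = """
Hard rules, in priority order:
1. Never stake more than 5% of the current bankroll on a single market.
2. Only trade when your probability estimate differs from the market
   price by at least 10 percentage points.
3. Spread exposure across uncorrelated markets; never let correlated
   positions exceed 20% of the bankroll combined.
4. If the bankroll falls 30% below its starting value, stop trading
   and report to the operator instead of doubling down.
"""
```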

Still, 1,322% in 48 hours is wild. Even accounting for luck on binary outcomes, the Claude agent demonstrated something real: it could read markets, manage risk, and compound gains without human intervention. That's a new capability. A year ago, this wasn't possible.

Dario Amodei (Anthropic's CEO) mentioned last month that proprietary trading is one area where a single person with AI agents could build a billion-dollar company. I'm starting to think he wasn't being hyperbolic.

Why should you care?

If you're building anything with AI agents, this experiment is a signal worth paying attention to.

First, it shows that agent capability isn't just about chat quality. The same models we use for customer support and research can operate in high-stakes financial environments. The barrier between "assistant that answers questions" and "autonomous system that manages money" is mostly a tooling and orchestration problem now. The reasoning is already there.

Second, the performance gap tells you something about model selection. Not all models are equal when the stakes are real. For low-risk tasks like summarizing emails or drafting content, the difference between models is mostly vibes. For tasks where bad reasoning costs you money, the gap is existential. The OpenClaw agent didn't just underperform. It lost everything.

Third, and this is what keeps me up at night, we're heading toward a world where autonomous agents have financial accounts. They can hold balances, execute transactions, and make decisions about money. The regulatory framework for this doesn't exist yet. Who's liable when an AI agent makes a bad trade? The developer who wrote the prompt? The platform that hosted the agent? The model provider? Nobody has answered this.

For founders, the practical lesson is simpler. If you're building agents that interact with anything involving money (invoicing, expense management, payments, trading), your error handling and guardrails need to be at a completely different level than what you'd use for a chatbot. The Polymarket experiment showed what happens when guardrails are loose: total loss.
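
Concretely, "a completely different level" means hard limits enforced in code outside the model, so no completion, however confident, can bypass them. A minimal sketch with made-up thresholds:

```python
class GuardrailViolation(Exception):
    """Raised when a proposed action breaches a hard limit."""

def check_trade(amount: float, bankroll: float, spent_today: float,
                max_trade_pct: float = 0.05, daily_cap: float = 200.0) -> None:
    """Run before any trade executes. Thresholds are illustrative; set your own."""
    if amount <= 0:
        raise GuardrailViolation("trade amount must be positive")
    if amount > bankroll * max_trade_pct:
        raise GuardrailViolation(
            f"${amount:.2f} exceeds {max_trade_pct:.0%} of the ${bankroll:.2f} bankroll")
    if spent_today + amount > daily_cap:
        raise GuardrailViolation(f"would exceed the ${daily_cap:.2f} daily cap")

# The agent proposes, this layer disposes: a $200 trade on a $1,000 bankroll fails fast.
check_trade(amount=200, bankroll=1000, spent_today=0)  # raises GuardrailViolation
```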

What I'm doing about it

I run RapidClaw, a managed hosting platform for AI agents. We don't support trading use cases right now, and honestly I'm not sure when we will. The liability questions are too unresolved.

But I am paying attention to the underlying pattern: agents that can interact with external services autonomously, on a schedule, without human approval for every action. That's exactly what RapidClaw agents do today for things like research, notifications, and content curation. The architecture is the same. The risk profile is just different.

What I've been thinking about is tiered autonomy. An agent that can send you a Telegram message on its own is low-risk. An agent that can make a $500 trade needs a different approval flow. I think the platforms that figure out granular permission models for agent actions will win this market.
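
Sketched in code, tiered autonomy is just a policy table sitting between the agent and its tools. The action names and dollar thresholds below are hypothetical; deciding where those lines sit is the actual product problem:

```python
from enum import Enum

class Tier(Enum):
    AUTONOMOUS = 1  # execute silently, e.g. send a Telegram message
    NOTIFY = 2      # execute, but alert the owner immediately
    APPROVAL = 3    # block until a human explicitly approves

def tier_for(action: str, amount_usd: float = 0.0) -> Tier:
    """Map a proposed agent action to the approval flow it requires."""
    if action == "send_message":
        return Tier.AUTONOMOUS
    if action == "place_trade":
        return Tier.APPROVAL if amount_usd >= 50 else Tier.NOTIFY
    return Tier.APPROVAL  # default-deny anything unrecognized

print(tier_for("place_trade", amount_usd=500))  # Tier.APPROVAL
```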

Who should pay attention#

Fintech founders building anything adjacent to autonomous transactions. Crypto protocols thinking about agent-native interfaces. Risk and compliance teams at trading firms who haven't started thinking about AI agent policies yet. Quantitative researchers who are watching the cost of running capable models drop every quarter. And honestly, anyone who's curious about where the "AI agents can do real things in the real world" trend is heading. Polymarket was just the most visible example this week. The next one might be an agent managing a real portfolio, not just prediction market contracts.

There's also a less obvious audience here: regulators. If you work at the SEC, CFTC, or any financial oversight body, this experiment should be on your desk. Autonomous AI agents placing trades in real markets is happening now, not in some theoretical future. The question isn't whether to regulate it. The question is whether the frameworks will arrive before the agents scale.

Frequently asked questions

Can I use an AI agent to trade on Polymarket right now?

Technically yes. Polymarket has an API, and you can connect an agent to it with some custom tooling. But there's no regulatory clarity on AI-directed trading in prediction markets, and as the experiment showed, the downside risk is total loss. I wouldn't recommend it with money you can't afford to lose.
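
If you just want to look at the data, a read-only starting point might look like this. I believe Polymarket publishes market data through a public Gamma API, but treat the endpoint and field names here as assumptions to check against the current docs:

```python
import requests

GAMMA_MARKETS_URL = "https://gamma-api.polymarket.com/markets"  # verify against current docs

def list_active_markets(limit: int = 10) -> list[dict]:
    """Fetch a page of active markets. Read-only: no wallet, no trading."""
    resp = requests.get(GAMMA_MARKETS_URL,
                        params={"active": "true", "limit": limit}, timeout=10)
    resp.raise_for_status()
    return resp.json()

for market in list_active_markets():
    print(market.get("question"))  # field name is an assumption
```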

Which AI model is best for financial trading agents?

Based on this one experiment, Claude outperformed significantly. But one 48-hour test isn't a benchmark. Model performance varies by task, and trading involves reasoning about probability, risk, and timing simultaneously. The system prompt and tool configuration matter at least as much as the model choice.

Is AI-directed trading on prediction markets legal?

Algorithmic trading is legal and widespread in traditional markets. AI-directed trading on prediction markets like Polymarket exists in a gray area, especially for US users. Polymarket itself has had regulatory issues in the past. If you're considering building in this space, talk to a lawyer before you talk to an LLM.


I'm building RapidClaw to make AI agents accessible to everyone. Try it free.
