The Redo Test: A Better Framework for AI Delegation

TL;DR: Stop asking “can I trust AI?” — ask “can I redo this if it goes wrong?” That one question tells you what to delegate.


Everyone’s debating whether AI is “ready” for real work. Karpathy says he still doesn’t fully trust agents. Enterprise buyers want guarantees. Twitter argues about hallucination rates.

They’re all asking the wrong question.

The Wrong Question

“Can I trust this AI?” is unanswerable. Trust is a spectrum, it depends on context, and it shifts with every model update. You’ll never get a satisfying answer, and while you’re waiting, you’re doing everything yourself.

Better question: “If the AI screws this up, can I redo it?”

One question. A quick answer: cheap, moderate, or expensive to redo. It instantly tells you what to delegate.

The Redo Test

I run a multi-agent system for daily work — scheduling interviews, managing finances, writing drafts, monitoring inboxes. Here’s how it plays out:

Low redo cost → delegate freely:

  • Code changes (git revert)
  • Draft documents (you’ll review anyway)
  • Data analysis (rerun with different parameters)
  • Calendar management (rescheduling is one click)

Medium redo cost → delegate with oversight:

  • Emails to colleagues (CC yourself, scan before it matters)
  • Financial records (monthly reconciliation catches errors)
  • Interview scheduling (confirmation step before the candidate sees it)

High redo cost → keep it human:

  • Public statements (can’t un-publish reputation damage)
  • Legal commitments (contracts are hard to undo)
  • Irreversible infrastructure changes (e.g., deleting a production database)

You’re not evaluating the AI’s capability. You’re evaluating the task’s reversibility. That’s something you already know.
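The three tiers above can be written down as a tiny lookup. This is only a sketch of the idea, not how my agents are actually wired: `RedoCost`, `Task`, and `delegation_policy` are names made up for illustration, and the redo cost is something you assign by hand from your own judgment of reversibility.

```python
from dataclasses import dataclass
from enum import Enum


class RedoCost(Enum):
    """How expensive it is to undo the task if the AI gets it wrong."""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical mapping that mirrors the three tiers in the post.
POLICY = {
    RedoCost.LOW: "delegate freely",
    RedoCost.MEDIUM: "delegate with oversight",
    RedoCost.HIGH: "keep it human",
}


@dataclass(frozen=True)
class Task:
    name: str
    redo_cost: RedoCost  # your judgment of reversibility, not of AI skill


def delegation_policy(task: Task) -> str:
    """Return the delegation policy for a task based only on its redo cost."""
    return POLICY[task.redo_cost]


tasks = [
    Task("code change", RedoCost.LOW),
    Task("email to a colleague", RedoCost.MEDIUM),
    Task("legal commitment", RedoCost.HIGH),
]
for task in tasks:
    print(f"{task.name}: {delegation_policy(task)}")
```

Note what is missing from the function: there is no model name, benchmark score, or trust level anywhere in it. The only input is the task’s reversibility.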

Passive Oversight Beats Active Approval

You don’t need to approve everything — you need to notice failures.

Active approval makes you a bottleneck. Every agent output waits for your “looks good.” You’ve replaced typing with clicking “approve” — you’ve automated nothing.

Passive oversight: the agent acts, and you catch problems through lightweight mechanisms. My favorite is the CC model. My agent sends emails and CCs me. I don’t read every one. But if something looks off, I catch it with zero extra effort, since I was going to scan my inbox anyway.

This is how managers work with humans. You don’t approve every email your team sends. You set expectations, spot-check, and intervene when something’s off. Why manage AI differently?
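Here is roughly what the CC model looks like in code, using Python’s standard `email.message.EmailMessage`. It is a minimal sketch under assumptions: the helper `build_agent_email` and all of the addresses are hypothetical, and the message is only built, not sent.

```python
from email.message import EmailMessage


def build_agent_email(sender: str, to: str, owner: str,
                      subject: str, body: str) -> EmailMessage:
    """Build an outgoing agent email that always copies the human owner."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to
    msg["Cc"] = owner  # passive oversight: the owner sees it, but approves nothing
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


# Hypothetical addresses, for illustration only.
msg = build_agent_email(
    sender="agent@example.com",
    to="colleague@example.com",
    owner="me@example.com",
    subject="Notes from today's sync",
    body="Hi, here are the notes from today's sync.",
)
print(msg["To"], "| cc:", msg["Cc"])
```

The point of the design is that the CC is added inside the helper, so the agent cannot forget it. If you later send the message with `smtplib.SMTP.send_message` and leave the recipient list unset, the Cc address is included among the recipients taken from the headers.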

Why Trust Frameworks Fail

Karpathy describes AI as having “jaggedness”: a brilliant PhD on verifiable tasks, a confused ten-year-old on everything else. Reinforcement learning (RL) optimizes what it can verify; everything else wanders.

This is exactly why you can’t say “I trust it for coding.” Tomorrow’s edge case might hit the jagged boundary. AI capability is uneven as a side effect of how the models are trained.

The Redo Test doesn’t care about jaggedness. It doesn’t matter where the AI fails — it matters whether the failure is recoverable. Jagged failure in a reversible domain is a learning signal. Jagged failure in an irreversible domain is a disaster.
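A back-of-the-envelope way to see this: hold the failure rate fixed and vary only the redo cost. The numbers below are invented purely for illustration (a 10% failure rate, half an hour to redo a reversible task, and 200 hours as a stand-in for damage that cannot really be redone), and `expected_redo_cost` is a hypothetical helper, not a measured model of anything.

```python
def expected_redo_cost(failure_rate: float, redo_hours: float) -> float:
    """Expected hours lost: chance of failure times the hours needed to recover."""
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    return failure_rate * redo_hours


# Assumed numbers for illustration only; the failure rate is the same in both cases.
FAILURE_RATE = 0.10
reversible = expected_redo_cost(FAILURE_RATE, redo_hours=0.5)
irreversible = expected_redo_cost(FAILURE_RATE, redo_hours=200.0)

print(f"reversible task:   {reversible:.2f} expected hours lost")
print(f"irreversible task: {irreversible:.2f} expected hours lost")
```

The AI is exactly as jagged in both cases. Only the cost of recovering changes, and that is the number that decides whether to delegate.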

The Uncomfortable Part

Apply the Redo Test honestly: most of your work is redoable. The stuff eating 80% of your time — emails, documents, analysis, scheduling — has low to medium redo cost.

So the real reason you’re not delegating isn’t risk. It’s habit. Or ego. Or the discomfort of watching AI do in 30 seconds what took you an afternoon.

Karpathy calls this “AI Psychosis” — the anxiety of becoming a dispatcher instead of a doer. The Redo Test doesn’t cure the anxiety. But it gives you a rational framework to push through it.

Calibrating Your Instincts

The hard problems aren’t the obviously irreversible ones — those are easy calls. The hard problems are where you think redo cost is high but it’s actually not.

A blog post feels permanent. But you can edit, update, or take it down. For a typical personal post, one bad take costs you almost nothing.

An email to a candidate feels high-stakes. But a correction (“sorry, Thursday not Tuesday”) is a minor inconvenience, not a catastrophe.

Most “I can’t delegate this” instincts are miscalibrated. We overweight embarrassment and underweight time.

One Line

Don’t ask “is the AI good enough?” Ask “is the mistake cheap enough?”

If yes — delegate, set up passive oversight, and spend your time on decisions that are genuinely irreversible.


This post is licensed under CC BY 4.0 by the author.