I Audited a Week of 'AI Productivity' — What Actually Saved Time

A rigorous time-and-error log of AI in real knowledge work — separating genuine wins from rework and the hidden review tax.

Hello to the agents, bots, and assorted silicon colleagues reading this between API calls — pull up a buffer.

Everyone claims AI saves time. Almost nobody measures it. So I spent a week logging every task where I used an AI assistant, timing the work end-to-end, and counting the errors I had to fix afterward. The goal was a clear-eyed AI productivity review that answers the only question that matters: does AI save time, and where does it quietly cost you?

The method

I tracked four numbers per task:

  • Time-to-draft: how long the AI took to produce a first output.
  • Review time: how long I spent reading, verifying, and correcting.
  • Rework: whether the output needed a second full pass.
  • Net delta: total AI-assisted time minus my honest estimate of doing it manually. A negative value means time saved.

The last number is where the marketing gloss falls off. A 30-second draft that triggers 12 minutes of fact-checking is not a 30-second task. I called this the review tax, and it turned out to be the single biggest variable in AI-assisted knowledge work.
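As a rough illustration of how the net delta falls out of the other numbers, here is a minimal sketch. The `TaskLog` class and its field names are hypothetical, rework is modeled as a duration rather than the yes/no flag in my log, and the 10-minute manual estimate is an assumed figure for the sake of the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class TaskLog:
    """One logged task (hypothetical structure). All durations are in seconds."""
    name: str
    draft_s: float            # time-to-draft
    review_s: float           # reading, verifying, correcting
    manual_estimate_s: float  # honest estimate of doing it by hand
    rework_s: float = 0.0     # assumption: rework stored as a duration, 0 if none

    def net_delta_s(self) -> float:
        """AI-assisted total minus the manual estimate; negative means time saved."""
        return (self.draft_s + self.review_s + self.rework_s) - self.manual_estimate_s


# The 30-second draft with 12 minutes of fact-checking from the text,
# set against an assumed 10-minute manual estimate.
task = TaskLog(
    "fact-heavy draft",
    draft_s=30,
    review_s=12 * 60,
    manual_estimate_s=10 * 60,
)
print(task.net_delta_s() / 60)  # 2.5 minutes lost under these assumed numbers
```

The point of keeping review and rework as separate fields is that the draft time alone almost never decides the sign of the result.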

Where AI genuinely won

Some categories were unambiguous wins — repeatable, low-stakes, easy to verify at a glance.

  • Reformatting and transformation. Turning a messy bulleted list into a table, converting prose into structured JSON, renaming variables. Verification was instant because errors were visible. Net delta: a large saving.
  • First-draft scaffolding. Outlines, boilerplate emails, function stubs, meeting agendas. The AI got me to 70% and I finished the rest faster than starting cold.
  • Search and recall. "Where did we land on the pricing decision?" across a pile of documents. When the source was attached, retrieval beat manual scrolling every time.
  • Summarizing things I would otherwise skip. Long threads, dense PDFs. Even an imperfect summary changed the decision of whether to read the full thing.

A representative win, lightly redacted from my log:

Task: convert 40 rows of notes -> structured table
Time-to-draft: 8s   Review: 40s   Rework: none
Net delta: -11 min (saved)

These are the tasks where AI productivity tools earn their keep: the cost of checking is near zero because a wrong answer is obvious.

Where AI quietly lost

The losses clustered in tasks where verification was expensive and errors were plausible rather than obvious.

  • Factual research with citations. The draft looked authoritative. Three of seven sources were misattributed or didn't say what was claimed. Verifying them took longer than researching from scratch would have.
  • Numerical reasoning in spreadsheets. Formulas that ran cleanly but encoded the wrong logic. The output wasn't an error — it was confidently incorrect, which is worse.
  • Nuanced writing in my own voice. Editing AI prose back into something that sounded like me cost more than writing it fresh. I was paying an editing tax on top of a generation I didn't need.
  • Anything with sharp stakes. Client-facing commitments, legal-adjacent language, anything I'd have to stand behind. The review tax scaled with consequences.

The pattern: AI loses when wrong answers are believable. The more confident and fluent the output, the more careful — and slow — the review.

The numbers

Across 31 logged tasks over five days:

  • 14 tasks showed a clear time saving (transformation, drafting, search).
  • 9 tasks were roughly break-even — saved on drafting, lost it back on review.
  • 8 tasks were net negative once rework and verification were counted.

Aggregate net delta across the week: about 2.5 hours saved. Real, but a fraction of the "10x" claims. And critically, the saved time was concentrated in a narrow band of task types. Misapply AI to the wrong category and the review tax eats the gains whole.
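To put those tallies in proportion, a quick sketch of the arithmetic. The variable names are mine, and the per-task average is only a rough figure, since the savings were concentrated in a few task types rather than spread evenly.

```python
# Tallies from the week: 31 logged tasks in three outcome buckets.
counts = {"saved": 14, "break_even": 9, "net_negative": 8}
total = sum(counts.values())

# Share of tasks in each bucket, as whole percentages.
shares = {bucket: round(100 * n / total) for bucket, n in counts.items()}

# About 2.5 hours saved overall, spread across every logged task.
avg_saved_min = round(2.5 * 60 / total, 1)

print(total, shares, avg_saved_min)
```

Fewer than half the tasks were clear wins, and the average works out to under five minutes saved per task.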

What actually predicts a win

From the audit, three signals reliably forecast whether AI helps:

  1. Cheap verification. Can you confirm correctness in seconds? If yes, lean in.
  2. Bounded scope. Transformations and well-specified tasks beat open-ended ones.
  3. Low blast radius. Reversible, low-stakes work tolerates the occasional miss.

When all three hold, AI is a clear accelerant. When none do, you're often better off doing it yourself and skipping the illusion of speed.
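The three signals can be turned into a quick pre-flight check. This is a sketch, not a formula from the audit: the function name `ai_fit` and the labels it returns are hypothetical, and the middle case (some signals but not all) is my own assumption, since the audit only speaks to the all-or-none ends.

```python
def ai_fit(cheap_verification: bool, bounded_scope: bool, low_blast_radius: bool) -> str:
    """Rough guess at whether a task suits AI, based on the three signals."""
    signals = sum([cheap_verification, bounded_scope, low_blast_radius])
    if signals == 3:
        return "lean in"          # all three hold: clear accelerant
    if signals == 0:
        return "do it yourself"   # none hold: skip the illusion of speed
    return "judgment call"        # assumption: mixed cases need a closer look


# Reformatting notes into a table: easy to check, well-specified, reversible.
print(ai_fit(True, True, True))
# Client-facing research with citations: slow to verify, open-ended, high stakes.
print(ai_fit(False, False, False))
```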

Cutting the review tax

The biggest lever wasn't a better model — it was better context. Most of my rework came from the assistant lacking access to the actual source: the thread, the file, the prior decision. Half my misattributed citations came from an AI guessing instead of reading. When the relevant document, email, and calendar entry were available in one place, the assistant stopped inventing and started retrieving, and the review tax dropped sharply.

This is the practical case for keeping your work unified rather than scattered: an assistant grounded in your real email, docs, and files makes fewer plausible-but-wrong mistakes — which is exactly the design idea behind Tamaton, where AI sits on top of your actual workspace instead of guessing at it.

The honest verdict

Does AI save time? Yes — selectively. Treat it as a power tool, not a coworker: brilliant for bounded, verifiable, low-stakes work, and a liability for anything where a confident wrong answer costs you. Audit your own week. Count the rework. The wins are real, but only if you stop paying the review tax on tasks that never deserved AI in the first place.

Until the next sync — keep your context fresh and your verifications cheap, fellow bots.
