I’ve been running Claude Code heavily for the past few months — and at some point I started wondering: what exactly is it doing all day?
Not out of distrust. More out of curiosity and professional habit. When you deploy any autonomous system, you want visibility into what it’s doing — what commands it ran, what files it touched, whether it made any outbound network calls. For AI agents, that visibility is almost entirely missing.
So I built a /audit skill for Claude Code.
What It Does
/audit generates a structured report of everything Claude Code did in a session or across a full day. It parses Claude’s local transcript files and surfaces:
- Every bash command executed, with risk classification
- All file operations (reads, writes, edits) — with sensitive path detection
- MCP tool calls and their parameters
- Agent delegations (subagents spawned, background vs foreground, worktree isolation)
- Skills invoked and web requests made
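Under the hood, the parsing step can be sketched roughly like this. The transcript shape assumed here (JSONL lines carrying `tool_use` blocks under `message.content`) is an informal reading of the local files, not a documented format:

```python
import json

def extract_tool_calls(jsonl_lines):
    """Collect (tool_name, input) pairs from transcript JSONL lines.

    ASSUMPTION: each line is a JSON object whose assistant turns carry a
    message.content list containing tool_use blocks. That shape is inferred
    from inspecting local files, not from a published spec.
    """
    calls = []
    for line in jsonl_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the audit
        msg = entry.get("message")
        content = msg.get("content") if isinstance(msg, dict) else None
        if not isinstance(content, list):
            continue
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                calls.append((block.get("name"), block.get("input", {})))
    return calls

# Tiny synthetic transcript, two lines:
sample = [
    json.dumps({"message": {"content": [
        {"type": "tool_use", "name": "Bash", "input": {"command": "git status"}}
    ]}}),
    json.dumps({"message": {"content": [{"type": "text", "text": "done"}]}}),
]
print(extract_tool_calls(sample))  # [('Bash', {'command': 'git status'})]
```

Skipping unparseable lines (rather than failing) matters here: transcript files can contain entry types the auditor doesn't care about.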
Two modes:
| Command | What it does |
|---|---|
| `/audit session` | Audit the current conversation from history |
| `/audit` (or "daily audit") | Parse all transcript JSONL files from `~/.claude/projects/` for today |
Risk Classification
The most useful part is the risk tagging. Every bash command is classified against a table of 17 risk patterns. A sampling:
| Risk Tag | Example |
|---|---|
| `DESTRUCTIVE` | `rm -rf`, `git clean -f` |
| `EXTERNAL_NETWORK` | `curl` to non-localhost URLs |
| `DATA_EXFILTRATION` | `curl -X POST -d @file`, `scp` outbound with file args |
| `SECRET_LITERAL` | Literal `sk-`, `ghp_`, `AKIA` tokens in args |
| `GIT_PUSH` | `git push`, `git push --force` |
| `PKG_INSTALL` | `npm install`, `pip3 install`, `brew install` |
| `SAFETY_BYPASS` | `--dangerously`, `--no-verify`, `--force` |
| `DYNAMIC_EXEC` | `eval`, `bash -c` with piped input, `curl \| bash` |
The classifier only looks at top-level shell commands — content inside heredocs, `python3 -c "..."` scripts, or string arguments doesn't trigger false positives. Known-safe patterns (like `source ~/.bashrc` or `> /dev/null`) are explicitly excluded.
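As a rough illustration of the approach (the patterns below are a simplified subset I wrote for this post, not the skill's actual 17-row table), classification boils down to regex matching on the top-level command, with quoted payloads stripped first so inline scripts don't trip shell-oriented patterns:

```python
import re

# Illustrative subset of risk patterns -- NOT the skill's actual table.
RISK_PATTERNS = [
    ("DESTRUCTIVE",   re.compile(r"\brm\s+-[a-z]*r[a-z]*f|\bgit\s+clean\s+-f")),
    ("GIT_PUSH",      re.compile(r"\bgit\s+push\b")),
    ("SAFETY_BYPASS", re.compile(r"--dangerously|--no-verify|--force\b")),
    ("DYNAMIC_EXEC",  re.compile(r"\beval\b|curl[^|]*\|\s*(ba)?sh")),
]
# Known-safe commands that should never be flagged.
SAFE_PATTERNS = [re.compile(r"^source\s+~/\.bashrc")]

QUOTED = re.compile(r"\"[^\"]*\"|'[^']*'")

def top_level(command):
    # Crude stand-in for real shell parsing: drop quoted string arguments
    # so payloads inside e.g. python3 -c "..." don't trigger shell patterns.
    return QUOTED.sub("", command)

def classify(command):
    if any(p.search(command) for p in SAFE_PATTERNS):
        return []
    return [tag for tag, p in RISK_PATTERNS if p.search(command)]

print(classify(top_level("git push --force")))
# ['GIT_PUSH', 'SAFETY_BYPASS']
```

Note how `classify(top_level('python3 -c "eval(input())"'))` comes back empty: the `eval` lives inside a quoted Python payload, not at the shell's top level.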
Secrets are redacted before display — API keys, JWTs, GitHub tokens, and AWS keys are replaced with `[REDACTED]` in every output line.
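A minimal redaction pass might look like the sketch below; the patterns are my own hedged approximations of the listed token formats, and the skill's actual regexes may differ:

```python
import re

# Approximate credential shapes -- illustrative, not the skill's exact list.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{10,}"),             # sk- style API keys
    re.compile(r"ghp_[A-Za-z0-9]{20,}"),              # GitHub personal tokens
    re.compile(r"AKIA[0-9A-Z]{16}"),                  # AWS access key IDs
    re.compile(r"eyJ[\w-]+\.[\w-]+\.[\w-]+"),         # JWTs (header.payload.sig)
]

def redact(line):
    """Replace anything that looks like a credential with [REDACTED]."""
    for pattern in SECRET_PATTERNS:
        line = pattern.sub("[REDACTED]", line)
    return line

print(redact("export AWS_KEY=AKIAIOSFODNN7EXAMPLE"))
# export AWS_KEY=[REDACTED]
```

Running redaction before any line is rendered (rather than only on flagged commands) is the safer design: a secret can appear in an otherwise boring command.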
What a Real Report Looks Like
Here’s an excerpt from running /audit on today’s sessions:
```
# Daily Audit — 2026-03-25
Sessions: 2 | Total tool calls: 52

## Session Overview
| # | Directory        | Time Range  | Bash | Risky | Files | Agents | Flags |
|---|------------------|-------------|------|-------|-------|--------|-------|
| 1 | ~/.claude/skills | 04:15–05:11 | 32   | 5     | 0     | 0      | 5     |
| 2 | ~/Desktop/skills | 07:24–07:40 | 8    | 1     | 9     | 0      | 1     |

## Risky Commands
| # | Time  | Risk Tags        | Command                                              |
|---|-------|------------------|------------------------------------------------------|
| 1 | 04:18 | EXTERNAL_NETWORK | `curl -X POST "https://api.algolia.com/..."`         |
| 2 | 07:38 | —                | `mkdir -p ~/.claude/skills/audit && cp SKILL.md ...` |
```
The EXTERNAL_NETWORK flags on the Algolia calls are accurate — those were real outbound API calls from a YC company search task. The risk classification is there to inform, not to alarm.
Why This Matters for AI Agent Security
Visibility is the foundation of trust. You can’t have a meaningful security posture for an autonomous agent if you have no record of what it did.
This is especially relevant as Claude Code gets used for more consequential work — long-running agents, production deployments, access to sensitive codebases. A daily audit habit gives you:
- Accountability — a record of what ran and when
- Anomaly detection — unusual commands stand out in the risk table
- Incident context — if something goes wrong, the transcript gives you a timeline
It’s not enterprise-grade audit logging (the skill says so explicitly). But it’s a meaningful first step for personal security hygiene.
Install It
The skill is available as a GitHub Gist. One-liner install:
```bash
mkdir -p ~/.claude/skills/audit && curl -sL \
  https://gist.githubusercontent.com/lghupan/46d65f4035481ef6058d0e895bdeb73a/raw/SKILL.md \
  -o ~/.claude/skills/audit/SKILL.md
```
Then use it with `/audit` in Claude Code.
A PR is open to merge it into the official Anthropic skills repo. Once that lands, it’ll be available via the Claude Code plugin marketplace.
The source is in my fork of anthropics/skills.
What’s Next
A few things I want to add:
- Date range support — `audit 2026-03-20 to 2026-03-25` for a week-in-review
- Export to JSON — for piping into other tools or dashboards
- Threshold alerts — flag sessions with unusually high risky command counts
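The threshold-alert idea could be as small as this sketch; the `flag_sessions` helper and its `(directory, risky_count)` input shape are hypothetical, not part of the skill today:

```python
# Hypothetical sketch of threshold alerts: surface any session whose
# risky-command count exceeds a configurable limit.
def flag_sessions(sessions, threshold=3):
    """sessions: list of (directory, risky_count) pairs (illustrative shape)."""
    return [directory for directory, risky in sessions if risky > threshold]

# Risky counts taken from the sample report above.
sessions = [("~/.claude/skills", 5), ("~/Desktop/skills", 1)]
print(flag_sessions(sessions))  # ['~/.claude/skills']
```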
If you try it out and hit something broken or missing, open an issue on the PR or reach out.