From Usability Test Video to GitHub Issues

May 20, 2026

We run usability tests on Trip To Japan. Someone records their screen while planning a trip, talking through what they're doing and where they get stuck. The recording is gold. It's also a 20-minute video sitting in my Downloads folder, and watching it is the easy part. The annoying part is everything after: scrubbing back to the exact second a button didn't respond, grabbing a screenshot, writing up the bug, filing the issue, then doing it all again for the next finding.

So I built a Claude Code skill that does the whole thing. I hand it a video path. It hands back a set of GitHub issues with screenshots and GIFs attached, plus one summary gist that ties everything together.

The pipeline

/user-testing ~/Downloads/video-382132.mp4 runs six steps:

1. Summarize the video with timestamps: a chronological timeline of every frustration, confusion, feature request, and bug
2. Categorize the findings into bugs, enhancements, and things that worked well
3. Extract visual evidence at each timestamp with ffmpeg
4. Upload the visuals to object storage so they have public URLs
5. Create one GitHub issue per finding with a structured body
6. Write a summary gist consolidating everything

A few of those steps took some thought.

Video to timeline

The first step is the one that makes the rest possible. It runs the video through summarize, a CLI that hands the file to a model that can actually watch it, with a prompt asking for approximate MM:SS timestamps on every significant moment.

The output is a chronological list: at 02:14 the user hesitated on the date picker, at 05:40 a click on "Add to itinerary" produced no visible response, at 11:02 they said the price felt hidden. That timeline is the spine of everything downstream — every issue and every screenshot hangs off one of those timestamps.
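
The extraction commands in the skill file take a start position in seconds, so the MM:SS values from the timeline need converting somewhere along the way. A minimal sketch of that conversion, assuming a hypothetical helper named `to_seconds` that is not part of the skill itself (ffmpeg's `-ss` also accepts MM:SS directly, but whole seconds make later timestamp arithmetic simpler):

```bash
# Convert an MM:SS (or HH:MM:SS) timestamp to whole seconds.
# Hypothetical helper for illustration; not part of the skill file.
to_seconds() {
  local ts="$1" total=0 part
  local IFS=':'
  for part in $ts; do
    # Drop one leading zero so "08" and "09" are not parsed as octal.
    part="${part#0}"
    total=$(( total * 60 + ${part:-0} ))
  done
  echo "$total"
}

to_seconds "02:14"   # prints 134
to_seconds "05:40"   # prints 340
to_seconds "11:02"   # prints 662
```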

Screenshots versus GIFs

Static problems get a screenshot. Motion problems get a GIF.

That distinction matters more than it sounds. If a button click produces no response, a screenshot of the unchanged page proves nothing — it just looks like a normal page. You need the GIF to show the click landing and nothing happening. Same for confusing transitions and ambiguous loading states. The skill encodes the rule: GIF when motion is the evidence, screenshot when a single frame tells the story.

ffmpeg does both — a single frame for screenshots, a few seconds at 6fps scaled down for GIFs.
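
As a rough illustration of the rule, the two ffmpeg invocations from the skill file can be wrapped in one function that picks the right form. This is only a sketch: `evidence_cmd` is a hypothetical name, the 4-second default GIF length is an assumption standing in for "a few seconds", and the function prints the command instead of running it so the choice is easy to inspect.

```bash
# Print the ffmpeg command for one piece of evidence without running it.
# Usage: evidence_cmd <screenshot|gif> <start_seconds> <video> <name> [gif_seconds]
# Hypothetical helper; the 4-second default duration is an assumption.
evidence_cmd() {
  local kind="$1" start="$2" video="$3" name="$4" duration="${5:-4}"
  local dir="/tmp/usability-screenshots"
  case "$kind" in
    screenshot)
      # One frame is enough when a static state is the evidence.
      printf 'ffmpeg -ss %s -i "%s" -frames:v 1 -q:v 2 %s/%s.jpg -y\n' \
        "$start" "$video" "$dir" "$name"
      ;;
    gif)
      # A short clip at 6fps, scaled down, when motion is the evidence.
      printf 'ffmpeg -ss %s -t %s -i "%s" -vf "fps=6,scale=1024:-1:flags=lanczos" -loop 0 %s/%s.gif -y\n' \
        "$start" "$duration" "$video" "$dir" "$name"
      ;;
    *)
      echo "unknown evidence kind: $kind" >&2
      return 1
      ;;
  esac
}

evidence_cmd gif 340 video.mp4 add-to-itinerary-no-response
evidence_cmd screenshot 662 video.mp4 hidden-price
```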

Verifying the frame

The timestamps from the summary are approximate, so extracting at exactly that second often grabs the wrong frame: a second early, a second late, or mid-scroll.

So the skill makes Claude read each extracted frame back before uploading it. If the frame doesn't show the thing the issue is about, it nudges the timestamp and re-extracts. This is the step a naive script can't do: it requires actually looking at the image and judging whether it's evidence or noise.
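
The judgment call stays with Claude, but the mechanical half of the retry, deciding which timestamps to try next, can be sketched in a few lines. The helper below is hypothetical: it lists the original timestamp first, then alternates earlier and later offsets. The 5-second step and 15-second ceiling are illustrative defaults borrowed from the range the skill file mentions.

```bash
# Print candidate timestamps (in seconds) around an approximate one:
# the original first, then alternating earlier/later offsets.
# Hypothetical helper; step and max offset are illustrative defaults.
candidate_timestamps() {
  local base="$1" step="${2:-5}" max="${3:-15}" off
  echo "$base"
  off="$step"
  while [ "$off" -le "$max" ]; do
    # Skip candidates that would fall before the start of the video.
    if [ $(( base - off )) -ge 0 ]; then
      echo $(( base - off ))
    fi
    echo $(( base + off ))
    off=$(( off + step ))
  done
}

candidate_timestamps 340   # prints 340, 335, 345, 330, 350, 325, 355, one per line
```

Each candidate would be extracted and read back in turn, stopping at the first frame that actually shows the problem.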

One issue per finding, one gist for the session

Each finding becomes a GitHub issue with a consistent body — source video, timestamp, what the user was trying to do, exact quotes, expected behavior, and a hint about where to look in the codebase. The screenshot or GIF goes on as a comment.

Then everything rolls up into a single gist: the full timeline table, every bug and enhancement with its image inline, and a "what worked well" section. Issues are for the people who'll fix things. The gist is the artifact I actually send to the team.
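
The timeline table is the most mechanical part of the gist. A minimal sketch of how it could be assembled, assuming the findings are available as tab-separated lines of time, event, and severity. That input format, the function name, and the severity values in the example are all assumptions for illustration.

```bash
# Turn tab-separated findings (time, event, severity) read from stdin
# into a Markdown table. The input format is an assumption of this sketch.
timeline_table() {
  local tab time event severity
  tab="$(printf '\t')"
  printf '| Time | Event | Severity |\n'
  printf '| --- | --- | --- |\n'
  while IFS="$tab" read -r time event severity; do
    [ -n "$time" ] || continue   # skip blank lines
    printf '| %s | %s | %s |\n' "$time" "$event" "$severity"
  done
}

# Example rows; the severity values are made up for illustration.
printf '02:14\tHesitated on the date picker\tlow\n05:40\tClick on "Add to itinerary" did nothing\thigh\n' \
  | timeline_table
```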

The skill file

Here's the whole thing — the SKILL.md that drives the pipeline:

---
name: user-testing
description: Ingest user testing videos to extract findings, create GitHub issues with screenshots/GIFs, and produce a summary gist. Triggers on "/user-testing", "/user-testing <path>", "process user test video", "ingest testing video".
user_invocable: true
argument: path to video file
---

# User Testing Video Ingestion

Process a user testing video into actionable GitHub issues with visual evidence.

## Input

A local video file path (e.g. `~/Downloads/video-382132.mp4`). If no path given, ask the user.

## Pipeline

### 1. Summarize with timestamps

Use `summarize` CLI to get a timestamped analysis:

```bash
summarize "<video_path>" --video-mode understand --plain --prompt "You are analyzing a usability test video of a travel planning website called Trip To Japan. Provide a detailed timeline with approximate timestamps (MM:SS) for every significant moment, frustration, confusion, or insight. Format as a chronological list with timestamps. Focus on: 1) Pain points and frustrations 2) Confusion moments 3) Feature requests 4) What worked well 5) Bugs encountered. Be very specific about what the user was trying to do and what went wrong at each moment."
```

### 2. Identify issues

From the timestamped summary, categorize findings into:

- **Bugs** (label: `bug`) — things that are broken or don't work as expected
- **Enhancements** (label: `enhancement`) — feature requests or product investigations
- **Positive findings** — what worked well (note but don't create issues)

Each finding needs: title, timestamp, description, user quotes, severity.

### 3. Extract visuals at each timestamp

Use `ffmpeg` to extract evidence from the video:

**Screenshots** (for static moments):
```bash
mkdir -p /tmp/usability-screenshots
ffmpeg -ss <seconds> -i "<video_path>" -frames:v 1 -q:v 2 /tmp/usability-screenshots/<filename>.jpg -y
```

**GIFs** (for transitions, interactions, or moments where motion matters):
```bash
ffmpeg -ss <start_seconds> -t <duration> -i "<video_path>" -vf "fps=6,scale=1024:-1:flags=lanczos" -loop 0 /tmp/usability-screenshots/<filename>.gif -y
```

Use GIFs when:
- A button click produces no response (show the non-reaction)
- A page transition is confusing (show what happens)
- Loading/spinner behavior is unclear

Use screenshots when:
- Showing a static UI state (duplicate content, missing features)
- Capturing a clear/readable view (checkout page, form state)

**Always verify extracted frames** by reading them with the Read tool before uploading. If a frame doesn't capture the right moment, adjust the timestamp by +/- 5-15 seconds and retry.

### 4. Upload visuals to object storage

```bash
mc cp /tmp/usability-screenshots/<file> <bucket>/usability-test/<file>
```

Public URL: `<public-base-url>/usability-test/<file>`

### 5. Create GitHub issues

Create one issue per finding using `gh issue create`:

```bash
gh issue create --title "<short title>" --label "<bug|enhancement>" --body "$(cat <<'EOF'
## Source
User testing video (<video filename>)

## Timestamp
<MM:SS> — <what's happening>

## Description
<What the user was trying to do and what went wrong>

## User quotes
- *"<exact quote>"*

## Expected behavior
<What should have happened>

## Investigation
<Suggestions for where to look in the codebase>
EOF
)"
```

### 6. Add visuals to issues

Comment on each issue with the relevant screenshot or GIF:

```bash
gh issue comment <number> --body "$(cat <<'EOF'
## Screenshot

<context about what the image shows>

![<alt text>](<public-base-url>/usability-test/<file>)
EOF
)"
```

### 7. Create summary gist

Create a public gist (`.md` file) consolidating all findings:

```bash
gh gist create --public -d "Usability test findings — Trip To Japan (<date>)" usability-findings.md
```

The gist should include:
- Timeline table (time, event, severity)
- Each bug/enhancement with description, quotes, embedded images, and issue link
- "What worked well" section
- All images inline

**Important:** When content is piped to `gh gist create` via stdin, the gist file gets a `.txt` name, so the Markdown will not render. Either write to a temp `.md` file first (as shown here), or rename the file after creation:

```bash
# Write to temp file first
cat > /tmp/usability-findings.md <<'EOF'
...
EOF
gh gist create --public -d "description" /tmp/usability-findings.md
```

## Output

When done, present to the user:
1. Summary table of all issues created (with links)
2. Link to the gist
3. Count: X bugs, Y enhancements, Z positive findings

Why a skill and not a script

This could be a shell script. Most of it is ffmpeg, gh, and a file upload. But the parts that aren't mechanical are the parts that matter: deciding whether a moment is a bug or an enhancement, choosing a GIF over a screenshot, checking that the frame is real evidence, writing an issue title someone will actually understand.

The skill is just a markdown playbook. It pins down the mechanical steps and leaves the judgment to Claude. A 20-minute video becomes a triaged set of issues and a shareable summary, and I never touch the scrubber.
