How to Do a Paper Edit, Step by Step

The ScriptCut Team / June 9, 2026 / 10 min read

To do a paper edit, transcribe your footage with timecode, read it and highlight the strongest lines, copy those selects into a fresh document, arrange them into a structure that tells a story, then verify each line against the actual clip before you build the timeline. That is the whole method in one sentence. The rest of this is how to do each step without wasting a day.

I have cut interviews both ways, straight into a timeline and from a paper edit first, and the paper-first version wins on every project longer than about three minutes. Not because it is fancier. Because the structural decisions happen in text, where changing your mind costs nothing.

Step 1: Transcribe with timecode

You need a transcript where every line carries a timecode, ideally word-level. Without timecode the paper edit is a nice essay you cannot turn into a cut. Word-level timing is what lets a single highlighted sentence become a precise in and out point later.
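Here is a minimal sketch of why word-level timing matters, assuming a transcript stored as a list of words with start and end times in seconds. The `Word` class, the `find_in_out` helper, and the sample line are all hypothetical, made up for illustration; your transcription tool will have its own format.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the clip
    end: float


def normalize(token):
    """Lowercase a token and drop surrounding punctuation for matching."""
    return token.lower().strip(".,!?;:'\"")


def find_in_out(words, phrase):
    """Return (in_point, out_point) in seconds for the first match of phrase,
    or None if the phrase does not appear in the transcript."""
    target = [normalize(t) for t in phrase.split()]
    tokens = [normalize(w.text) for w in words]
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            first, last = words[i], words[i + len(target) - 1]
            return first.start, last.end
    return None


# Made-up sample data: one spoken line with word-level timing.
transcript = [
    Word("We", 12.40, 12.55),
    Word("almost", 12.55, 12.90),
    Word("ran", 12.90, 13.10),
    Word("out", 13.10, 13.30),
    Word("of", 13.30, 13.40),
    Word("money.", 13.40, 13.95),
    Word("Twice.", 14.60, 15.10),
]

print(find_in_out(transcript, "ran out of money"))  # (12.9, 13.95)
```

With only line-level timecode you would get the start of the whole line and have to trim by ear. With word-level timing the highlight itself defines the cut points.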

Auto-transcription is good enough now that hand-typing is a waste of your life. Clean up names and jargon afterward; do not retype the whole thing. If you want the options, 'How to transcribe an interview' covers them.

Step 2: Read it once, all the way through, before you touch anything

Resist the urge to start highlighting on line one. Read the whole transcript first so you know where the story actually goes. The best moment in an interview is often buried 40 minutes in, and if you start cutting at the top you will anchor on the wrong material.

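Reading is the fast part. Average adult reading speed is around 238 words per minute against roughly 150 words per minute of speech, so reading runs about 1.6 times faster than listening. Even an hour of nonstop talk reads in under 40 minutes, and a brisk read of a typical interview can land closer to 20. Spend them.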
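As a rough check on that arithmetic, this sketch converts minutes of speech into minutes of reading at the two rates quoted above. It assumes wall-to-wall talking at a steady pace, which likely overstates most interviews, so treat the result as an upper bound. The function name and defaults are mine, not from any tool.

```python
SPEECH_WPM = 150   # rough speaking rate quoted above
READING_WPM = 238  # average adult reading speed quoted above


def estimated_read_minutes(speech_minutes,
                           speech_wpm=SPEECH_WPM,
                           reading_wpm=READING_WPM):
    """Estimate how long it takes to read a transcript of continuous speech."""
    words = speech_minutes * speech_wpm
    return words / reading_wpm


print(round(estimated_read_minutes(60), 1))  # 37.8
print(round(estimated_read_minutes(40), 1))  # 25.2
```

A 20-minute read at 238 words per minute covers about 4,760 words, so that figure fits an interview with plenty of pauses or a reader who moves faster than average.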

Step 3: Highlight your selects, ruthlessly

Now go back through and mark the lines worth keeping, your selects. The test is not 'is this fine,' it is 'would I fight to keep this.' If you are highlighting half the page, you are rereading, not deciding.

A trick that works: use two colors. One for must-have lines, one for maybe lines. When you assemble, build from the must-haves first and pull in maybes only where the story needs a bridge. This maps cleanly onto how a tool like ScriptCut marks selections, must-have versus nice-to-have, so the page stage and the build stage speak the same language. For finding the genuinely strong moments, 'How to find the best soundbites' has a deeper method.

Step 4: Pull selects into a new document and group them

Copy your highlighted lines, with their timecodes and speaker names, into a clean doc. Do not arrange yet. Just group them by theme or beat: the setup, the conflict, the change, the resolution. Most stories have a shape like that even when the interview rambled.

This grouped document is your raw material. It is the text equivalent of a stringout, everything good, nothing ordered for flow yet.
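If your selects live in a spreadsheet or a script rather than a doc, the grouping step might look like the sketch below. It assumes each select is a plain dict with a timecode, speaker, beat, and a must/maybe priority (the two-color trick from Step 3). The field names and sample lines are invented for illustration.

```python
from collections import defaultdict

# The four beats named above, in story order.
BEATS = ["setup", "conflict", "change", "resolution"]

# Made-up sample selects, listed in the order they were pulled.
selects = [
    {"tc": "00:03:10", "speaker": "Founder", "beat": "setup",
     "priority": "maybe", "text": "We started in a garage."},
    {"tc": "00:41:12", "speaker": "Founder", "beat": "conflict",
     "priority": "must", "text": "I thought we were done."},
    {"tc": "00:12:45", "speaker": "Founder", "beat": "setup",
     "priority": "must", "text": "Nobody believed it would work."},
    {"tc": "00:52:30", "speaker": "Founder", "beat": "resolution",
     "priority": "must", "text": "Then the first order came in."},
]


def group_by_beat(selects):
    """Bucket selects by beat, must-haves first inside each bucket.

    Beats with no selects are left out; selects tagged with a beat that
    is not in BEATS are dropped, so check your tags first.
    """
    groups = defaultdict(list)
    for s in selects:
        groups[s["beat"]].append(s)
    # sorted() is stable, so lines keep their pulled order within a priority.
    return {
        beat: sorted(groups[beat], key=lambda s: s["priority"] != "must")
        for beat in BEATS
        if beat in groups
    }


for beat, lines in group_by_beat(selects).items():
    print(beat.upper())
    for s in lines:
        print(f"  [{s['tc']}] {s['speaker']} ({s['priority']}): {s['text']}")
```

Note that this only groups. It deliberately does not try to order lines for flow, because that is the judgment call you make in the next step.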

Step 5: Arrange it into a story, then read it out loud

This is the actual work. Order the lines so they build. A good opening line earns attention, the middle escalates, the end pays off. Cut transitions that do not move you forward. Reorder freely, you are dragging text, not re-cutting clips.

Then read the whole thing out loud. Reading aloud exposes every soft spot: the repeat you did not notice, the answer that needs its question, the place where two thoughts collide. If it does not flow when you say it, it will not flow on screen. Fix it on the page.

Step 6: Verify every line against the footage

This is the step beginners skip, and it is the one that saves the project. A line can read perfectly and play terribly: the delivery is flat, there is a stumble mid-sentence, the eyeline is wrong, a phone rings. Errol Morris refuses paper edits for exactly this reason. In his Transom interview he said 'paper cuts give you a very false idea' because the page hides performance.

He is right, and the answer is not to abandon the paper edit, it is to check your work. Watch each select. Mark the ones that do not play and find an alternate take or cut them. ScriptCut lets you play any line straight from the transcript, so verifying is a click, not a hunt through bins.

Step 7: Build the assembly from the plan

Now open your NLE and conform the plan. Because every decision is made, the assembly comes together fast. Drop each select on the timeline in order. Where two lines butt against each other awkwardly, plan a J-cut or L-cut or cover the seam with b-roll. From the assembly you refine toward a rough cut and beyond.
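To show why the assembly is mechanical once the plan is locked, here is a small sketch that lays ordered selects end to end and works out where each one lands on the timeline. It assumes source in and out points in seconds and straight cuts with no overlap. The `PlannedClip` type, file name, and times are hypothetical, and this is not any NLE's import format.

```python
from typing import NamedTuple


class PlannedClip(NamedTuple):
    source: str     # file the select comes from
    src_in: float   # in point in the source, seconds
    src_out: float  # out point in the source, seconds


def build_assembly(ordered_clips):
    """Place clips back to back and return one event per clip:
    (source, src_in, src_out, timeline_in, timeline_out)."""
    events = []
    playhead = 0.0
    for clip in ordered_clips:
        duration = clip.src_out - clip.src_in
        events.append((clip.source, clip.src_in, clip.src_out,
                       playhead, playhead + duration))
        playhead += duration
    return events


# Made-up plan: story order, not source order.
plan = [
    PlannedClip("interview_A.mov", 2472.0, 2480.0),
    PlannedClip("interview_A.mov", 754.0, 760.5),
    PlannedClip("interview_A.mov", 95.5, 99.0),
]

assembly = build_assembly(plan)
for source, src_in, src_out, t_in, t_out in assembly:
    print(f"{source}  {src_in:>7.1f}-{src_out:<7.1f} -> {t_in:>5.1f}-{t_out:<5.1f}")
print("running time:", assembly[-1][4], "seconds")  # 18.0 seconds
```

J-cuts, L-cuts, and b-roll cover are exactly what this leaves out. Those are the refinements you add by hand once the straight assembly plays.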

A worked example

I cut a 90-second founder testimonial from a 40-minute interview. Reading and highlighting took 25 minutes and produced 22 selects. Arranging took another 20: I opened on the founder's biggest fear, not the company origin, because the fear was the hook. Reading it aloud killed three redundant lines. Verifying the footage caught one great line ruined by a cough, so I swapped to a weaker phrasing that actually played. The timeline assembly took 15 minutes. Total: about an hour for a clean first cut. Straight into the timeline, that same job used to eat an afternoon.

How to handle multiple speakers and multiple files

Single-interview paper edits are easy. The method earns its money on messier jobs: four interviews, a panel, a multi-cam shoot. The trick is to keep speaker attribution and source attached to every line from the start. Each select should carry who said it, which file it is from, and its timecode. Lose that and your beautifully arranged document becomes a treasure hunt when you try to build it.
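One way to enforce that habit is a quick check that flags any select missing its speaker, source file, or timecode before you start arranging. This is a sketch under my own assumptions: the `Select` fields and the HH:MM:SS or HH:MM:SS:FF timecode pattern are illustrative, not a standard your tools necessarily follow.

```python
import re
from dataclasses import dataclass

# Accepts HH:MM:SS or HH:MM:SS:FF (frames); adjust to your own format.
TIMECODE = re.compile(r"^\d{2}:\d{2}:\d{2}(:\d{2})?$")


@dataclass
class Select:
    text: str
    speaker: str
    source_file: str
    timecode: str


def missing_attribution(select):
    """Return the names of the attribution fields that are empty or malformed."""
    problems = []
    if not select.speaker.strip():
        problems.append("speaker")
    if not select.source_file.strip():
        problems.append("source_file")
    if not TIMECODE.match(select.timecode.strip()):
        problems.append("timecode")
    return problems


# Made-up sample selects from a two-person shoot.
selects = [
    Select("That was the turning point.", "Dana", "cam1_dana.mov", "00:18:42:10"),
    Select("We never planned for it.", "", "cam2_lee.mov", "00:37:05"),
    Select("It changed everything.", "Lee", "", "37 minutes in"),
]

for s in selects:
    problems = missing_attribution(s)
    if problems:
        print(f"FIX '{s.text}': missing {', '.join(problems)}")
```

Run it when you pull selects, not when you build. Ten seconds of tagging at the source beats ten minutes of scrubbing for a line you can no longer place.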

For a panel or a multi-speaker piece, I tag selects by speaker as I pull them, then build the structure across speakers. You will often find the best version cuts between two people answering the same question, even though they said it 20 minutes apart. That cross-cutting is exactly the kind of move that is trivial on paper and tedious to discover by scrubbing. If your footage is sprawling, organizing interview footage first makes the paper edit far smoother.

Tools: doc, cards, or transcript software

You can paper edit three ways, and they trade off the same way they always have.

A document. Free, flexible, universal. Highlight in one color, copy selects into a new file, reorder. The downside is the manual conform later: your doc has timecodes, but you still have to find each one in the NLE.

Index cards. One select per card, spread them on a table, physically reorder. Some editors swear by it for feature docs because you can see the whole structure at once. Slow to set up, but powerful for finding act breaks.

Transcript-based software. The modern default. You highlight on the transcript and the tool keeps the timecode bound to the line, so the plan becomes the cut with no re-finding. This is the category Adobe entered with Text-Based Editing and that ScriptCut is built around. It collapses steps 3 through 7 into one continuous flow.

What to watch for

Do not over-highlight. Do not arrange before you have read everything. Do not skip the read-aloud, it is your cheapest quality check. And never lock a structure you have not watched. The paper edit is a hypothesis you prove against the footage, not a substitute for it.

One more, easy to miss: do not strip a quote of the context it needs. A killer line that depends on the question before it, or on the sentence that sets up the joke, is not a standalone select. Pull the setup with it or the moment dies. Reading your arrangement out loud is what catches this, because your ear notices the missing rung the instant you hit it.

If you want the bigger picture on why this works, 'What is a paper edit' covers the history and the theory. For a faster end-to-end interview workflow, see 'How to edit an interview faster'. To go from this plan straight into your NLE, 'Transcript to timeline' covers the export.

How it changes by content type

The seven steps are constant, but what you are hunting for shifts with the format.

Interviews and documentary. You are after a narrative arc and the lines that carry emotion or information. Cross-cut between speakers freely. This is the classic case the method was built for. See cutting a documentary interview.

Podcasts to video or clips. You are hunting self-contained moments that make sense without the rest of the episode. Each clip is almost its own tiny paper edit. Turning a podcast into clips applies the same selecting instinct at clip scale.

Courses and webinars. The structure is often already there in the agenda, so your paper edit is mostly subtraction, cutting the tangents, the tech hiccups, the dead air, and tightening each point. Editing a course video and repurposing a webinar lean on this.

Testimonials. One clean arc from a rambling chat: the problem, the turn, the result. The paper edit is where you find that arc inside 30 minutes of friendly conversation. More in editing a testimonial video.

A realistic time budget

People assume paper editing adds time. It moves time earlier and removes more later. For one hour of interview targeting a few minutes of finished video, a fair budget is roughly 20 minutes to read, 25 to highlight and pull, 30 to arrange and read aloud, and 20 to verify your selects. Call it an hour and a half before you open the timeline. Then the assembly is fast because it is mechanical. Compare that to cutting blind, where you can lose a full day scrubbing and rebuilding structure you never settled. The paper edit front-loads the thinking and the total drops.
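The budget above adds up like this. The numbers are the ones from the paragraph; the dictionary is just a convenient way to total them and swap in your own.

```python
# Minutes per stage for one hour of interview, from the budget above.
budget_minutes = {
    "read": 20,
    "highlight and pull": 25,
    "arrange and read aloud": 30,
    "verify selects": 20,
}

total = sum(budget_minutes.values())
hours, minutes = divmod(total, 60)
print(f"{total} minutes = {hours} h {minutes} min")  # 95 minutes = 1 h 35 min
```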

Run the whole thing in one place with ScriptCut: transcribe, highlight, arrange, verify, export a timeline. Try it on your next interview and time the difference.

Frequently asked questions

What do I need before I start a paper edit?

A transcript with timecodes, ideally word-level, and a way to highlight and reorder text. That is it. Auto-transcription plus a document or a transcript-based editor covers it.

Should I highlight as I read the first time?

No. Read the whole transcript once first so you know where the story lands, then go back and highlight. Highlighting on the first pass anchors you to the wrong material.

Why read the arranged selects out loud?

Because your ear catches what your eye skips: repeats, missing context, answers that need their question. If it does not flow when spoken, it will not flow on screen.

Can I skip verifying against the footage?

Do not. A line can read great and play badly because of delivery, stumbles, or eyeline. Watch every select before you lock the structure.