How to Remove Filler Words From an Interview

The ScriptCut Team / June 9, 2026 / 8 min read

To remove filler words fast, do it on the transcript, not the timeline: cut the ums, uhs, repeated words, and false starts across the whole recording in one pass, then export the cleaned selection to your editor. Hunting each filler on the timeline, one scrub at a time, is the slow, soul-draining way. Working on the words, you can clear an entire interview before you would have finished the first ten minutes by hand.

Fillers are everywhere because that is how people talk. Behavioral research summarized by Toastmasters puts the average speaker around five filler sounds per minute, roughly one every twelve seconds, and casual on-camera speech often runs higher. A 2024 study in the Journal of Applied Behavior Analysis found that a low rate, around five a minute, did not hurt how a speaker was perceived, but higher rates of filler sounds noticeably did. So the goal is not zero. It is getting under the threshold where the ums start to distract.
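To see where a recording sits against that five-a-minute mark, you can count the vocalized fillers in the transcript and divide by the running time. This is a minimal sketch: the `fillers_per_minute` helper, the filler list, and the sample sentence are illustrative assumptions, not part of any particular tool.

```python
# Hypothetical helper: vocalized fillers per minute of recording.
FILLERS = {"um", "uh", "er"}  # assumed list; extend it for your speaker

def fillers_per_minute(words, duration_seconds):
    """words: transcript tokens; duration_seconds: length of the recording."""
    # Strip trailing punctuation so "um," still counts as "um".
    count = sum(1 for w in words if w.lower().strip(".,!?") in FILLERS)
    return count * 60 / duration_seconds

sample = "So um I think uh we started er in 2019".split()
rate = fillers_per_minute(sample, 36)   # 3 fillers in 36 seconds
print(rate)        # 5.0 fillers per minute
print(60 / rate)   # 12.0 seconds between fillers, on average
```

A rate well above five is the signal that a cleanup pass is worth the time.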

Which fillers to cut, and which to leave

Not all fillers are equal. The clear cuts are the vocalized non-words (um, uh, er) and the obvious false starts where someone abandons a sentence and restarts it. Those carry no meaning and almost always tighten the line when removed.

The judgment calls are the discourse markers, the 'you know,' 'I mean,' 'like,' 'so' that pepper natural speech. Some are pure filler. Others do real work, signaling a shift, softening a claim, giving the listener a beat to catch up. Strip every single one and the speaker can sound clipped and unnatural, almost robotic. Read the line and ask whether removing it changes the rhythm or the meaning. If it does, leave it.
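The split between clear cuts and judgment calls can be sketched as a triage step: cut the non-words automatically, and only flag the discourse markers for a human to read. The word lists and the `triage` function below are assumptions for illustration, not how any specific tool decides.

```python
# Assumed lists: non-words to cut outright, markers to flag for review.
CLEAR_CUTS = {"um", "uh", "er"}
REVIEW_PHRASES = [("you", "know"), ("i", "mean"), ("like",), ("so",)]

def normalize(token):
    return token.lower().strip(".,!?;:'\"")

def triage(tokens):
    """Return (cut, review): indices to cut, and (start, end) spans to read by hand."""
    norm = [normalize(t) for t in tokens]
    cut = [i for i, w in enumerate(norm) if w in CLEAR_CUTS]
    review = []
    for i in range(len(norm)):
        for phrase in REVIEW_PHRASES:
            if tuple(norm[i:i + len(phrase)]) == phrase:
                review.append((i, i + len(phrase)))
    return cut, review

tokens = "So, um, I mean, we like the plan, you know.".split()
cut, review = triage(tokens)
print(cut)     # [1]
print(review)  # [(0, 1), (2, 4), (5, 6), (8, 10)]
```

Note that the flagged "like" in this sample is a verb, not a filler. That is exactly why these go to review instead of being cut automatically.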

The one-pass method

Get the interview transcribed with word-level timecode. Now every filler in the transcript is anchored to a frame, so removing the word removes the audio and video with it. A transcript tool can sweep the whole recording for the obvious fillers at once. In ScriptCut, Remove Fillers clears the ums and uhs across the entire interview in a single action, and you can trim repeated words and false starts as you select. Every cut carries through your paper edit and into the exported timeline, so you are not redoing the work in your NLE.
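The idea that removing a word removes its audio and video can be sketched in a few lines. Assuming a transcript where each word carries start and end times in seconds, the pass below drops fillers and immediate repeats, then merges the surviving words into keep ranges an editor could cut to. The `Word` type, the filler list, and `keep_segments` are hypothetical names for illustration; this is not ScriptCut's implementation.

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float

FILLERS = {"um", "uh", "er"}  # assumed list

def normalize(text):
    return text.lower().strip(".,!?;:")

def keep_segments(words):
    """Drop fillers and immediate repeats; merge surviving words into (start, end) ranges."""
    segments = []
    prev_kept = None   # index of the last word we kept
    prev_token = None  # its normalized text, used to spot repeats
    for i, w in enumerate(words):
        token = normalize(w.text)
        # Naive repeat rule: it would also cut a legitimate "very very".
        if token in FILLERS or token == prev_token:
            continue
        if segments and prev_kept == i - 1:
            segments[-1][1] = w.end               # adjacent word: extend the range
        else:
            segments.append([w.start, w.end])     # something was removed: new range
        prev_kept, prev_token = i, token
    return [tuple(s) for s in segments]

words = [
    Word("I", 0.0, 0.2), Word("um", 0.3, 0.7), Word("think", 0.8, 1.1),
    Word("the", 1.2, 1.3), Word("the", 1.4, 1.5), Word("plan", 1.6, 2.0),
    Word("works", 2.1, 2.5),
]
print(keep_segments(words))  # [(0.0, 0.2), (0.8, 1.3), (1.6, 2.5)]
```

Each boundary between two ranges is one cut on the timeline, which is why a single pass over the words can replace a scrub, a mark, and a delete for every filler.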

Reading-based cleanup is also just faster than scrubbing. You scan the page, see the clutter, and clear it, instead of playing the same ten seconds three times to catch a filler by ear. Silent reading of non-fiction runs around 238 words a minute, per Brysbaert's 2019 meta-analysis, against roughly 150 for conversational speech.

A worked example

You have a 30-minute interview with a nervous first-time speaker, so the filler count is high. By hand on the timeline, you would scrub for each um, mark it, ripple-delete, and repeat for what could be a couple hundred fillers, an afternoon of tedium. On the transcript, you run the one-pass filler removal, which clears the bulk of them, then read through and manually trim a dozen false starts the auto pass left, plus a handful of repeated words. Fifteen minutes, not an afternoon. You play a few of the cleaned passages to make sure they still breathe, then export. The speaker now sounds composed, and you did not change a word they meant.

Common mistakes

Cutting fillers on the timeline. It is the slowest possible method. Each one is a scrub, a mark, a delete. Do it on the words in bulk.

Removing every discourse marker. Strip all the 'you know's and 'I mean's and the delivery turns mechanical. Keep the ones that carry rhythm or meaning. Clean, not sterile.

Creating noticeable jumps. Removing a filler from a continuous shot can leave a tiny jump in picture or a too-abrupt audio splice. On a talking head, plan for the jump cuts the same way you would for any tightening, and listen for clipped breaths at each cut.

Not verifying the result. A pass that looks clean on the page can sound choppy. Play back the cleaned sections before you lock.

The honest tradeoffs

Aggressive filler removal makes a speaker sound more confident, but push it too far and you erase their natural cadence, which can read as inauthentic, especially in a testimonial or a documentary where realness is the point. There is a real tension between polish and authenticity, and the right setting depends on the format. A corporate explainer wants tight; a personal story wants room to breathe.

The other cost is the visual side. Every removed word in a single continuous shot is a potential jump cut, so heavy cleanup on a static talking head can leave you with a lot of visual hops to cover with b-roll or hide with a small punch-in. The audio cleanup is fast; the picture cleanup that follows is the part to budget for.
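To budget for the picture side, it helps to count the splices a cleanup pass leaves behind. The sketch below takes keep ranges from one continuous shot and lists each cut with the amount of footage removed. The `jump_cuts` function and the half-second `cover_threshold` are assumptions, and treating longer removals as more visible is only a rough heuristic. Your eye is still the final check.

```python
def jump_cuts(segments, cover_threshold=0.5):
    """segments: sorted (start, end) keep ranges in seconds from one continuous shot.

    Returns one (source_time, removed_seconds, needs_cover) tuple per splice.
    cover_threshold is an assumed cutoff, not a standard.
    """
    cuts = []
    for (_, prev_end), (next_start, _) in zip(segments, segments[1:]):
        removed = next_start - prev_end
        cuts.append((prev_end, removed, removed >= cover_threshold))
    return cuts

segments = [(0.0, 2.0), (2.25, 5.0), (6.0, 9.0)]
cuts = jump_cuts(segments)
print(cuts)                          # [(2.0, 0.25, False), (5.0, 1.0, True)]
print(sum(1 for c in cuts if c[2]))  # 1 splice flagged for b-roll or a punch-in
```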

The takeaway

Remove filler words on the transcript in one pass, keep the markers that carry meaning, watch the jump cuts, and verify the result by ear. The principle is the same as the rest of transcript-first editing: change the words and the video follows, far faster than hunting clip by clip. Pair this with editing the interview faster and tightening a talking-head video, then export the clean cut straight to your timeline.

Clean up your interview in ScriptCut.

Frequently asked questions

How do I remove filler words from a video?

Work from the transcript. With word-level timecode, each filler is anchored to a frame, so a transcript tool can sweep the whole recording for ums and uhs in one pass, and you trim false starts and repeats as you go. Then export the cleaned selection to your editor.

Should I remove every filler word?

No. Cut the vocalized non-words (um, uh) and obvious false starts, but keep the discourse markers ('you know,' 'I mean') that carry rhythm or meaning. Research suggests around five fillers a minute does not hurt how a speaker comes across, so aim for natural, not zero.

Will removing fillers create jump cuts?

On a single continuous shot, yes, every removed word is a potential visual jump. Plan to cover the worst ones with b-roll or a small punch-in, or keep them as a deliberate style. The audio cleanup is fast; budget time for the picture side.

Does removing fillers change what the person said?

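No, as long as you stay within what they said. Vocalized fillers like um and uh carry no meaning, so removing them tightens delivery without altering the message. Discourse markers can carry meaning, so read those before you cut them. The line you do not cross is splicing words from different moments together. Trim within what they actually said and verify the result by ear before locking.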