The four-beat structure (and the timestamps that matter)
Almost every UGC video that holds attention on TikTok follows the same skeleton, not as a gimmick but because it matches how the feed actually works: the first frame decides the swipe, and the last few seconds decide whether anyone taps the product link.
The four beats are Hook → Problem → Proof → CTA. What separates a video that scrolls past from one that sells is not script length but where each beat lands on the clock. A 30-second video whose hook takes 6 seconds to arrive has likely lost most of its viewers before the hook even lands.
| Beat | Timestamp | Job | Common mistake |
|---|---|---|---|
| Hook | 0–3s | Stop the scroll and name who this is for | Opening with a logo or brand intro |
| Problem | 3–8s | Make the viewer feel the pain or desire | Listing features instead of the felt problem |
| Proof | 8–22s | Show the product working, not just talking | Telling instead of demonstrating on camera |
| CTA | 22–30s | One clear next action, low friction | Three asks at once, or no ask at all |
Use these as ceilings, not targets. A strong hook can land in 1.5 seconds. The point is that by second 8 the viewer should already know who you are talking to, why it matters, and that proof is coming.
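One way to apply the "ceilings, not targets" rule is to time a rough cut and check when each beat finishes. The sketch below is only an illustration: the ceiling values come from the table above, while the function name `late_beats` and the idea of recording beat end times by hand are assumptions, not part of any TikTok tooling.

```python
# Latest second by which each beat should be finished (from the table above).
BEAT_END_CEILINGS = [
    ("hook", 3.0),
    ("problem", 8.0),
    ("proof", 22.0),
    ("cta", 30.0),
]


def late_beats(beat_end_times: dict[str, float]) -> list[str]:
    """Return the beats that finish after their ceiling.

    beat_end_times maps a beat name to the second at which that beat ends
    in your cut. A beat missing from the dict is reported as late.
    """
    return [
        name
        for name, ceiling in BEAT_END_CEILINGS
        if beat_end_times.get(name, float("inf")) > ceiling
    ]


if __name__ == "__main__":
    # A cut whose hook lands early passes; a slow opener gets flagged.
    print(late_beats({"hook": 1.5, "problem": 7.0, "proof": 21.0, "cta": 29.0}))
    print(late_beats({"hook": 6.0, "problem": 11.0, "proof": 24.0, "cta": 30.0}))
```

Finishing a beat early is never flagged, which matches the point that a strong hook can land in 1.5 seconds.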
Copy-paste the template
Below is the full fill-in-the-blank template. The bracketed parts are the only things you change. Keep the spoken word count near 2.2 words per second of runtime (a 30-second video is roughly 65–70 spoken words) so you are not rushing or padding.
Two rules make a bigger difference than anything else: (1) say the hook out loud rather than relying on the on-screen caption alone, and (2) show the product physically in frame before second 10.
RUNTIME: ~30s | SPOKEN BUDGET: ~65 words | PRODUCT: [product name + price]

[0–3s] HOOK (say it out loud)
"[Call out WHO this is for] + [the result or pain]."
-> e.g. "If you [problem], this is the [product] that [outcome]."
-> Write 3–5 hook variants. Test these, hold everything below constant.

[3–8s] PROBLEM (make them feel it)
"[The honest, specific frustration the viewer already has]."
-> First person beats lecturing: "I used to ___ and it never ___."

[8–22s] PROOF (show, don't tell; product on screen)
"[Demonstrate the product working]. Watch."
-> Use observable, honest language: looks / feels / noticed.
-> AVOID: cures, heals, clinically proven (unless the brand can substantiate it).

[22–30s] CTA (one ask, low friction)
"[One next action]. [One tiny tip to start]."
-> e.g. "It's linked below. Start with [easiest step]."

---

QA before recording:
- [ ] Hook names the viewer/problem in <3s and is spoken aloud
- [ ] Product physically on screen before 0:10
- [ ] Main claim is honest + substantiatable
- [ ] Exactly ONE CTA
- [ ] Spoken script fits the word budget (~2.2 words/sec)
- [ ] 3–5 hook variants written for this same body
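The word budget in the template is simple arithmetic: runtime in seconds times roughly 2.2 words per second. Here is a minimal sketch for checking a draft against it. The 2.2 figure comes from the text; the helper names and the word-counting rule (whitespace-separated tokens that contain at least one letter or digit) are assumptions chosen for illustration.

```python
import re

WORDS_PER_SECOND = 2.2  # speaking pace suggested in the text


def word_budget(runtime_seconds: float) -> int:
    """Approximate number of spoken words that fit in the runtime."""
    return round(runtime_seconds * WORDS_PER_SECOND)


def count_spoken_words(script: str) -> int:
    """Count whitespace-separated tokens that contain a letter or digit.

    Stand-alone punctuation (a lone dash, an arrow) is not counted.
    """
    return sum(1 for token in script.split() if re.search(r"[A-Za-z0-9]", token))


def estimated_runtime(script: str) -> float:
    """Estimate how many seconds the script takes to say at the target pace."""
    return count_spoken_words(script) / WORDS_PER_SECOND


if __name__ == "__main__":
    draft = "If you hate lint on black clothes, this is the roller that actually works."
    print(word_budget(30))               # budget for a 30-second video
    print(count_spoken_words(draft))     # words in this one line
    print(round(estimated_runtime(draft), 1))
```

If the estimated runtime of the full script exceeds the planned runtime, trim lines before recording rather than speeding up the delivery.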
Two worked examples
Here is the template filled in for two different DTC categories so you can see how the blanks become real lines.
Example 1 — a $34 vitamin-C serum (skincare):
Hook: "If your skincare shelf is chaos and your skin still looks dull, watch this."
Problem: "I had six half-used bottles and zero routine. My skin looked tired by 3pm."
Proof: "This is the one serum I actually finished. Two weeks in, my skin looks brighter and a lot of people have noticed — here's the texture, here's how it sinks in."
CTA: "It's linked below if you want to try it. Start with the morning step."
Notice that the proof line says "looks brighter" and "people have noticed": observable, honest claims that a brand can stand behind. It deliberately avoids "clears acne" or "clinically proven," which are claims a brand would have to substantiate and which can get an ad rejected.
Example 2 — a $59 cordless handheld vacuum (home / DTC gadget):
Hook: "POV: you have a car, a couch, and a dog that sheds like it's a full-time job."
Problem: "The big vacuum is too heavy for the car, and the little dustbuster died in a month."
Proof: "This one charges on USB-C and pulls dog hair out of the seat seam in one pass — watch."
CTA: "Link's in the bio. Grab the one with the crevice tool."
The proof beat in both cases is a demonstration on camera, not an adjective. That single habit — show, do not describe — is the highest-leverage edit you can make to any UGC script.
The pre-shoot checklist
Before you (or your creator) hit record, run the script against this list. If any answer is no, fix the script, not the footage.
- Does the hook name the target viewer or their problem in the first 3 seconds?
- Is the hook spoken out loud, not just shown as a caption?
- Is the product physically on screen before second 10?
- Is the main claim something the brand can honestly back up (observable result, not a medical or "clinically proven" claim)?
- Is there exactly one CTA, and is it the easiest possible action ("link below," "comment a word")?
- Is the spoken script within the word budget for the runtime (~2.2 words/sec)?
- Did you write at least 3–5 hook variants for the same body, so you can test which one holds?
That last point is the one most brands skip. Because the first 3 seconds decide whether a viewer stays at all, the hook is the variable worth iterating on first. Keep the problem, proof, and CTA constant, swap the hook, and let the data tell you which opener earns the watch.
How to test hooks without guessing
One script is not a campaign. The fastest way to find what converts is to treat the hook as the test variable and everything else as the control.
- Hold the body and CTA constant. Change only the first 3 seconds across 3–5 cuts of the same video.
- Watch 3-second view rate and average watch time, not just likes. A hook that wins on retention is the one worth scaling.
- Kill losers fast. Our internal rule is a 7-day kill: if a hook hasn't earned its keep in a week, we stop spending creative energy on it and move budget to the winner.
- Reinvest into the winning angle. Once a hook proves out, write five more variations of that angle rather than starting from a blank page.
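The comparison described in the list above can be sketched in a few lines. Everything here is an assumption made for illustration: the `HookCut` fields, the definition of 3-second view rate as 3-second views divided by total views, average watch time as total watch seconds divided by total views, and ranking by view rate first with watch time as the tiebreaker. Only the 7-day kill window comes from the text. Export the underlying numbers from whatever analytics tool you use.

```python
from dataclasses import dataclass

KILL_AFTER_DAYS = 7  # the "7-day kill" rule described above


@dataclass
class HookCut:
    """One cut of the same video with a different hook (hypothetical schema)."""
    name: str
    views: int
    three_second_views: int
    total_watch_seconds: float
    days_live: int

    @property
    def view_rate_3s(self) -> float:
        # Assumed definition: share of views that lasted at least 3 seconds.
        return self.three_second_views / self.views if self.views else 0.0

    @property
    def avg_watch_seconds(self) -> float:
        return self.total_watch_seconds / self.views if self.views else 0.0


def split_winner_and_kills(cuts: list[HookCut]) -> tuple[HookCut, list[HookCut]]:
    """Rank cuts by retention and list the losers that are past the kill window."""
    if not cuts:
        raise ValueError("need at least one hook cut to compare")
    ranked = sorted(
        cuts,
        key=lambda c: (c.view_rate_3s, c.avg_watch_seconds),
        reverse=True,
    )
    winner = ranked[0]
    kills = [c for c in ranked[1:] if c.days_live >= KILL_AFTER_DAYS]
    return winner, kills


if __name__ == "__main__":
    # Placeholder numbers, not real campaign data.
    cuts = [
        HookCut("if-you", views=1000, three_second_views=310, total_watch_seconds=7000.0, days_live=7),
        HookCut("question", views=1000, three_second_views=300, total_watch_seconds=6500.0, days_live=3),
        HookCut("pov", views=1000, three_second_views=420, total_watch_seconds=9000.0, days_live=7),
    ]
    winner, kills = split_winner_and_kills(cuts)
    print("scale:", winner.name)
    print("kill:", [c.name for c in kills])
```

Cuts that are losing but have been live for less than a week are left alone, so a slow starter still gets its full test window.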
This is a general pattern, not a guarantee, because different products and audiences behave differently. But the discipline of testing the hook in isolation is what turns a decent template into a repeatable system, and it is why volume matters: three videos are rarely enough to surface a winning hook.



