AI finds the clips.
You pick the best ones.
shortube.pro reads the meaning of every sentence in your video, scores each moment on four dimensions, and surfaces the top 10 clip candidates — no timeline scrubbing required.
No credit card required · First project free
Clip analysis scores
Hook strength: Does the opening sentence create curiosity or make a bold claim?
Topic completeness: Does the clip make a full, standalone point?
Pacing: Are the speech pace and edit rhythm suited to short-form attention spans?
Overall score: Combines all dimensions and is displayed as a 0–100 index.
Under the hood
How the AI actually works
Not a simple algorithm — a multi-stage AI pipeline that understands what is being said before deciding what to clip.
Semantic understanding
A language model reads the full transcript and identifies where ideas begin and end — not just where silence falls or the camera cuts.
Four-dimensional scoring
Every candidate clip is scored on hook strength, topic completeness, pacing and viral potential. The composite score determines ranking.
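To make the idea of a composite score concrete, here is a minimal Python sketch of how four dimension scores could roll up into one ranking. The equal weights, the `ClipScores` class and the `composite` and `rank` functions are assumptions made for illustration; the actual shortube.pro formula is not published here.

```python
from dataclasses import dataclass


@dataclass
class ClipScores:
    """Per-dimension scores, each assumed to be on a 0-100 scale."""
    hook: float
    completeness: float
    pacing: float
    viral: float


# Hypothetical equal weights; the real weighting is not documented.
WEIGHTS = {"hook": 0.25, "completeness": 0.25, "pacing": 0.25, "viral": 0.25}


def composite(scores: ClipScores, weights: dict = WEIGHTS) -> float:
    """Weighted average of the four dimensions, kept on the 0-100 scale."""
    total = sum(weights.values())
    raw = sum(getattr(scores, name) * w for name, w in weights.items())
    return round(raw / total, 1)


def rank(candidates: dict) -> list:
    """Return clip ids ordered from highest to lowest composite score."""
    return sorted(candidates, key=lambda cid: composite(candidates[cid]), reverse=True)
```

Dividing by the sum of the weights keeps the result on the same 0–100 scale even if the weights do not add up to 1.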
Virality modelling
The model is trained on signal patterns from high-performing short-form content — it learns which moments drive watch-through and subscription.
Word-level precision
Whisper's word-level timestamps let clip boundaries land on sentence starts and ends, so clips do not cut off mid-word.
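One simple way to place boundaries on sentence edges is to widen a rough time range until it lines up with them. The sketch below assumes you already have a sorted, non-empty list of sentence start and end times in seconds; `snap_to_sentences` is a hypothetical helper, not part of shortube.pro.

```python
from bisect import bisect_left, bisect_right


def snap_to_sentences(start: float, end: float, sentences: list) -> tuple:
    """Widen a rough (start, end) range so it begins at a sentence start
    and finishes at a sentence end.

    `sentences` is a non-empty list of (start_sec, end_sec) tuples sorted by time.
    """
    starts = [s for s, _ in sentences]
    ends = [e for _, e in sentences]
    # Last sentence that starts at or before the rough start.
    i = max(bisect_right(starts, start) - 1, 0)
    # First sentence that ends at or after the rough end.
    j = min(bisect_left(ends, end), len(ends) - 1)
    return starts[i], ends[j]
```

For sentences at `(0.0, 2.4)`, `(2.6, 5.1)` and `(5.3, 9.0)`, a rough range of 3.0 to 6.2 seconds widens to `(2.6, 9.0)`, so the clip covers the second and third sentences in full.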
5-minute processing
A 60-minute video is transcribed, analysed and rendered into 10 clips in under 5 minutes — faster than watching the source once.
Transparent scores
Every clip shows its score breakdown. You can see exactly why the AI ranked each candidate where it did.
Semantic clipping vs everything else
Why it matters that the AI reads meaning, not just audio waveforms.
| Method | Traditional tools | shortube.pro AI |
|---|---|---|
| Silence detection | ✗ Cuts on pauses — ignores content meaning | ✓ Semantic analysis — understands what is being said |
| Scene cut detection | ✗ Cuts on camera angles — misses best spoken moments | ✓ Transcript-based selection — finds the idea, not just the cut |
| Keyword matching | ✗ Finds words, not insights | ✓ LLM understanding — surfaces complete, valuable thoughts |
| Manual timeline | ✗ Hours of scrubbing for one good clip | ✓ 10 ranked candidates ready in minutes |
Works for any long-form content
One pipeline, every content type.
Example: 90-minute interview
Best insights, stories and quotable moments extracted automatically.
Example: 60-minute training
Key frameworks and teaching moments identified without watching the full video.
Example: 45-minute talk
The strongest arguments and reveals surfaced as standalone Shorts.
Example: Last 10 minutes
Publish event highlights while the audience is still engaged.
Example: Any length
Turn every upload into a week of Shorts with zero extra recording.
Example: 30-minute class
Repurpose educational content for social media discovery.
From URL to published Short in 6 steps
Ingest
Paste a YouTube URL or upload a video file. shortube.pro downloads and prepares the source in the background.
Transcribe
OpenAI Whisper converts every spoken word to text and attaches a timestamp to each word.
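Timestamped words become useful for clipping once they are grouped into sentences. The sketch below assumes word entries shaped like the ones the open-source `openai-whisper` package returns when transcribing with `word_timestamps=True` (a `word` string with a leading space, plus `start` and `end` in seconds). The punctuation-based split and the function names are illustrative assumptions.

```python
SENTENCE_END = (".", "!", "?")


def words_to_sentences(words: list) -> list:
    """Group timestamped words into sentences, splitting on ., ! and ?."""
    sentences, current = [], []
    for w in words:
        current.append(w)
        if w["word"].strip().endswith(SENTENCE_END):
            sentences.append(_close(current))
            current = []
    if current:  # trailing words with no closing punctuation
        sentences.append(_close(current))
    return sentences


def _close(words: list) -> dict:
    """Collapse a run of words into one sentence with start and end times."""
    return {
        "text": "".join(w["word"] for w in words).strip(),
        "start": words[0]["start"],
        "end": words[-1]["end"],
    }
```

Each sentence inherits the start time of its first word and the end time of its last word, which gives later stages a clean set of candidate cut points.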
Analyse
A language model reads the full transcript, identifies topic boundaries, scores moments and ranks candidates.
Select
The top 10 clips are selected based on combined scores. You review, approve and optionally re-rank.
Render
Each clip is reframed to 9:16, animated word-level captions are burned in, and the result is delivered as a 1080×1920 MP4.
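For a sense of the geometry behind a 9:16 reframe, the sketch below computes a centered crop and returns it as an ffmpeg video filter string using the standard `crop` and `scale` filters. Center cropping is an assumption for illustration; the product may frame the speaker differently, and caption rendering is not covered here.

```python
def vertical_crop_filter(src_w: int, src_h: int, out_w: int = 1080, out_h: int = 1920) -> str:
    """Build an ffmpeg -vf string that center-crops to the output aspect
    ratio and scales to out_w x out_h."""
    if src_w * out_h > src_h * out_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_w, crop_h = src_h * out_w // out_h, src_h
    else:
        # Source is as narrow or narrower: keep full width, trim top and bottom.
        crop_w, crop_h = src_w, src_w * out_h // out_w
    # Round down to even dimensions, which most video encoders expect.
    crop_w -= crop_w % 2
    crop_h -= crop_h % 2
    x = (src_w - crop_w) // 2
    y = (src_h - crop_h) // 2
    return f"crop={crop_w}:{crop_h}:{x}:{y},scale={out_w}:{out_h}"
```

A 1920×1080 landscape source yields `crop=606:1080:657:0,scale=1080:1920`, which could be passed to ffmpeg's `-vf` option.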
Publish
Upload directly to YouTube or download for other platforms. Schedule for optimal posting times.
Go from 1–2 manually cut clips per video to as many as 10 AI-selected candidates per batch.
A 60-minute video is transcribed, scored and rendered in under 5 minutes.
Clips ranked #1 by shortube.pro AI average a 94/100 virality score in internal benchmarks.
Frequently asked questions
How is this different from just cutting on silence?
Silence detection finds gaps in audio — it has no understanding of what was said. shortube.pro reads the meaning of the conversation, so a quiet, insightful moment scores higher than a loud but low-value exchange.
Can it clip videos in languages other than English?
Whisper supports transcription in dozens of languages. The AI clip analysis currently works best on English-language content, with multilingual support in development.
What is the maximum video length?
There is no hard limit. We have successfully processed videos up to 3 hours long. Processing time scales linearly with video length.
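If processing time really is linear, the figure quoted on this page (a 60-minute video in under 5 minutes) gives a quick upper-bound estimate for other lengths. This is a back-of-the-envelope sketch; `estimate_processing_minutes` is a hypothetical helper, and real times will depend on load and content.

```python
def estimate_processing_minutes(
    video_minutes: float,
    ref_video_minutes: float = 60.0,
    ref_processing_minutes: float = 5.0,
) -> float:
    """Scale the reference ratio (5 minutes per 60 minutes of video) linearly."""
    return video_minutes * ref_processing_minutes / ref_video_minutes
```

Under that assumption, a 3-hour video (180 minutes) comes out at roughly 15 minutes or less.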
Can I adjust the scoring to prioritise different types of clips?
The preset you choose (podcast, webinar, talking head, etc.) adjusts the scoring weights. The podcast preset puts more weight on topic completeness and quotability; the talking head preset puts more weight on hook density and pacing.
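The effect of a preset can be pictured as a different set of weights applied to the same scores. In the sketch below, the weight values, the `PRESETS` table and the mapping onto four dimension names are all assumptions for illustration; quotability and hook density are not modelled separately.

```python
# Hypothetical weights per preset; each row sums to 1.
PRESETS = {
    "podcast": {"hook": 0.2, "completeness": 0.4, "pacing": 0.1, "viral": 0.3},
    "talking_head": {"hook": 0.4, "completeness": 0.1, "pacing": 0.3, "viral": 0.2},
}


def preset_score(scores: dict, preset: str) -> float:
    """Weighted sum of 0-100 dimension scores under the chosen preset."""
    weights = PRESETS[preset]
    return round(sum(scores[name] * w for name, w in weights.items()), 1)


def best_clip(clips: dict, preset: str) -> str:
    """Return the id of the clip that scores highest under the preset."""
    return max(clips, key=lambda cid: preset_score(clips[cid], preset))
```

With these example weights, a clip that makes a complete point but opens slowly can win under the podcast preset, while a fast clip with a strong hook wins under the talking head preset, even though the underlying scores are unchanged.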
What if I disagree with the AI's top picks?
All generated clips are shown with their scores. You can review every candidate, not just the top 10, and promote any clip the AI ranked lower.
Is the clip timing exactly on the sentence boundaries?
Yes. Because Whisper provides word-level timestamps, clip boundaries are placed precisely at sentence starts and ends — no abrupt cuts mid-word.
Start clipping smarter today
Paste your first video URL and get 10 AI-selected clips in minutes. Free to start.
Create free account