Kapwing Drops the “AI Slop” Bomb: 1 in 5 Brand-New YouTube Sessions Now Start With Algorithmic Gruel—And a Cartoon Monkey From India Is Cashing $4 M a Year for It
SAN FRANCISCO - If you’ve recently opened YouTube with a wiped browser history and felt an inexplicable urge to watch a 3-D-rendered monkey lecture a CGI toddler about dental hygiene, you are not alone. Video-editing startup Kapwing just served up the largest empirical look yet at what it bluntly labels “AI slop”: low-effort, algorithm-chasing videos that are scripted, storyboarded, voiced and often “acted” entirely by generative models. The headline number, 21 % of the first 500 recommendations served to a zero-history account, means that roughly one in five videos the platform’s vaunted recommendation engine offers a first-time viewer is content no human hand noticeably touched. Worse, the economics work: the top slop channel, India’s Bandar Apna Dost (Hindi for “Our Monkey Friend”), has racked up 2.05 billion views and an estimated $4.25 million in annual AdSense, all while starring an anthropomorphic primate whose facial expressions are clearly diffusion-model drift.
Methodology: how Kapwing caught the slop
Researchers built a clean-slate persona—“Viewer Zero”—residing on a fresh Chrome profile behind a rotating residential proxy. They logged out of Google services, blocked cookies, then let YouTube autoplay for 48 hours while scraping every recommendation, transcript and metadata blob. Kapwing’s in-house classifier (a fine-tuned RoBERTa fed 8,000 manually labeled videos) flagged clips as slop when they met four criteria:
- Synthetic voice-over score > 0.82 (using ElevenLabs detector),
- 80 % of keyframes flagged as AI-generated by Hive’s image model,
- Script perplexity < 25 (indicating templated text), and
- Channel upload cadence ≥ 15 long-form videos per week.
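The four criteria above read naturally as a single conjunctive rule. The following is a minimal sketch of that rule only, not Kapwing’s classifier: the thresholds come from the list, while the class name, field names and function name are hypothetical, and the keyframe test is assumed to mean “at least 80 %”.

```python
from dataclasses import dataclass

# Thresholds quoted in the article. All identifiers below are hypothetical.
VOICE_SCORE_MIN = 0.82        # synthetic voice-over score must exceed this
AI_KEYFRAME_SHARE_MIN = 0.80  # assumed to mean "at least 80 % of keyframes"
PERPLEXITY_MAX = 25.0         # script perplexity must be below this
UPLOADS_PER_WEEK_MIN = 15     # long-form uploads per week, inclusive


@dataclass
class VideoSignals:
    """Per-video signals, assumed to be computed upstream by the detectors."""
    synthetic_voice_score: float
    ai_keyframe_fraction: float
    script_perplexity: float
    channel_uploads_per_week: float


def is_slop(v: VideoSignals) -> bool:
    """Return True only when all four criteria hold at once."""
    return (
        v.synthetic_voice_score > VOICE_SCORE_MIN
        and v.ai_keyframe_fraction >= AI_KEYFRAME_SHARE_MIN
        and v.script_perplexity < PERPLEXITY_MAX
        and v.channel_uploads_per_week >= UPLOADS_PER_WEEK_MIN
    )


# Illustrative inputs, not measurements from the report.
flagged = VideoSignals(0.91, 0.95, 18.0, 40)
spared = VideoSignals(0.91, 0.95, 18.0, 3)  # fails only the cadence test
print(is_slop(flagged), is_slop(spared))
```

Because the rule is a conjunction, a channel that posts rarely escapes the label however synthetic its videos look, which is one reason a figure such as 21 % may understate the true share.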
The 500-video sample covered 24 languages; manual review by two bilingual annotators confirmed 94 % precision. YouTube’s API does not expose monetization data, so Kapwing back-filled CPM estimates using Social Blade ranges and public rate-cards for each geography. The final spreadsheet: 105 channels, 12.7 billion cumulative views, $42 million in inferred yearly revenue—numbers big enough to make even the most idealistic product manager wince.
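For readers who want to check the back-filled revenue figures, here is a rough sketch of the arithmetic involved. It assumes revenue is estimated as views divided by 1,000 and multiplied by a CPM drawn from a low-to-high range; the function names and the sample range are hypothetical and are not taken from Kapwing’s spreadsheet.

```python
def revenue_range_usd(views, cpm_low_usd, cpm_high_usd):
    """Bracket revenue for a view count given a low and a high CPM."""
    thousands = views / 1000
    return thousands * cpm_low_usd, thousands * cpm_high_usd


def implied_rate_per_1000(revenue_usd, views):
    """Average revenue per 1,000 views implied by two reported totals."""
    return revenue_usd / views * 1000


# Hypothetical channel: one million views at a CPM between $2 and $4.
low, high = revenue_range_usd(1_000_000, 2.0, 4.0)
print(f"${low:,.0f} to ${high:,.0f}")

# The report's totals: $42 million inferred across 12.7 billion views.
print(round(implied_rate_per_1000(42e6, 12.7e9), 2))
```

On the report’s own totals, the implied average works out to a little over $3 per 1,000 views.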
Global slop map: South Korea watches, Spain subscribes
Country-level consumption defies easy narratives. South Korea logged 8.45 billion slop views, topping even India (6.12 B) and Pakistan (5.34 B). Analysts blame “Alggul-tube,” a local meme that equates mindless viewing with decompression; Korean slop channels simply swap the monkey for a pastel raccoon lecturing about kimchi etiquette. The United States sits fourth at 3.39 billion views, but with the highest CPMs ($7.80 average), American eyeballs deliver the fattest paychecks. Spain, meanwhile, shows the highest slop-subscription conversion (31 % of viewers hit subscribe), suggesting Latin European audiences value narrative closure—even if the narrator is a procedurally generated tomato.
Genre deep-dive: three blueprints that print money
- Nursery-time morality plays
3-D animals scold toddlers for refusing broccoli; 11-minute episodes, 8-second loops of royalty-free lullabies. Average watch-time: 7:42 minutes.
- “Interesting facts” conveyor belts
Synthetic voice reads ChatGPT bulletins over B-roll stolen from Pexels: “Bananas are berries, but strawberries are not!” Upload cadence: 28 clips per day.
- Disaster click-bait
Title: “CGI volcano erupts inside Walmart—shoppers flee!” Thumbnail shows lava in aisle 7. Video delivers 58 seconds of Unreal Engine smoke. CTR: 18 %.
All three formats cost < $15 per clip using off-the-shelf tools (Midjourney, Runway, ElevenLabs, CapCut), so break-even arrives at roughly 4,000 views, a threshold almost any botnet can clear.
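The break-even claim can be checked on the back of an envelope. The sketch below assumes a clip’s only cost is the $15 production ceiling and that earnings scale linearly with views; the function names are hypothetical, and the per-1,000-view rate is derived from the article’s two numbers, not reported by Kapwing.

```python
def implied_rate_per_1000(cost_usd, break_even_views):
    """Earnings per 1,000 views needed to recoup cost at a given view count."""
    return cost_usd / break_even_views * 1000


def break_even_views(cost_usd, rate_per_1000_usd):
    """Views needed to recoup cost at a given per-1,000-view rate."""
    return cost_usd / rate_per_1000_usd * 1000


# A $15 clip that breaks even at 4,000 views implies about $3.75 per 1,000 views.
print(round(implied_rate_per_1000(15, 4000), 2))

# Run the other way: at $3.75 per 1,000 views, a $15 clip needs 4,000 views.
print(break_even_views(15, 3.75))
```

A cheaper clip or a richer ad market pushes the threshold lower still.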
The feedback loop: why slop outranks substance
YouTube’s algorithm maximizes predicted watch-time per impression. Slop factories exploit three levers:
- Infinite content moat: 15 uploads a day trains the recommender to treat the channel as a high-volume supplier.
- Session length hack: nursery rhymes are autoplay catnip for kids handed tablets by exhausted parents; average session climbs.
- Low abandon rate: because narratives are non-existent, viewers don’t feel the cognitive “punctuation” that normally triggers closure.
Once a channel crosses ~50 million views in a niche, the algorithm begins to recommend it to look-alike audiences abroad, exporting local slop globally—hence Korean raccoons on American TVs.
Dead-Internet Theory, now in 4K
Kapwing’s report dedicates an entire section to the cultural fallout. Comments on slop videos are themselves increasingly bot-like: repetitive emoji chains, timestamps no human would need, and verbatim praise across languages. Bandar Apna Dost’s latest upload attracted 42,000 comments within 90 seconds—statistically impossible without automation. The implication: engagement may be generating more engagement, with organic humans an ever-shrinking minority. YouTube declined to comment on specific channels, but reiterated that “manipulated content” violating spam policies is removed—yet Bandar Apna Dost remains monetized and featured in YouTube Kids.
Platform incentives: follow the money
YouTube keeps 45 % of AdSense. Applied to Bandar’s estimated $4.25 M a year, Google’s cut comes to about $1.9 M for zero licensing cost: pure margin. Premium subscriptions, which promise ad-free kids’ content, actually rise in regions where slop dominates viewing time, suggesting slop indirectly nudges parents toward paid tiers. Until slop CPMs collapse (unlikely while toddler eyeballs remain scarce), YouTube’s quarterly earnings call has no metric that rewards slop removal.
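The platform-cut figure is simple enough to reproduce. This sketch assumes the $4.25 M estimate is the pool to which the 45 % share is applied, which is how the arithmetic above treats it; the function name is hypothetical.

```python
PLATFORM_SHARE = 0.45  # share quoted in the article


def platform_cut_usd(ad_revenue_usd, platform_share=PLATFORM_SHARE):
    """Portion of ad revenue retained by the platform."""
    return ad_revenue_usd * platform_share


cut = platform_cut_usd(4_250_000)
print(f"${cut / 1e6:.1f} M")
```

If the estimate instead describes what the channel keeps after the split, the platform’s share would be larger, so the number above is best read as a lower bound under that reading.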
What happens next: three scenarios
- Regulatory whack-a-mole
The EU’s Digital Services Act already flags “systemic risks” to minors. Expect fines in 2025 that force YouTube to throttle channels with > 50 % synthetic content.
- Creator counter-culture
Educated Gen-Z viewers begin bragging about “100 % human-made” watch histories, spawning boutique channels that live-stream script-writing sessions.
- Slop arms race
Tools get cheaper, CPMs hold, and slop creeps into long-form; by 2026 Netflix competes with procedurally generated 8-hour “movies” stitched in real time to match viewer biometric data.
For now, Kapwing’s data is a mirror. The 21 % figure is not a ceiling—it’s a floor measured before Sora-style video generators reach Fiverr prices. The company has open-sourced its classification pipeline, inviting researchers, regulators and horrified parents to replicate the audit. Download it, point it at your own recommendations, and you may discover that your personalized corner of the world’s largest video platform is, statistically speaking, already a cartoon monkey’s personal ATM.