Crawlora
Crawlora

Structured public web data APIs for search, maps, geocoding, streaming, travel, real estate, marketplaces, apps, social, audio, crypto, finance, and AI workflows with managed execution and credit-based usage.

Product

Web Scraping API, Features, Infrastructure Features, Platforms, Travel APIs, Real Estate APIs, Pricing

Platforms

Google Search, Google Trends, Bing, Brave, Google Maps, Datasets, Geocoding, JustWatch, Airbnb, TripAdvisor, Zillow, CoinGecko, Yahoo Finance, Google Finance, Amazon

Developers

Docs, Getting Started, Authentication, API Examples, Recipes, Showcases, Blog, Changelog, Playground, SDKs, Integrations, MCP, GitHub

Use cases

SERP Monitoring, Google Maps Leads, Travel & Hospitality Research, Property Market Intelligence, App Review Analysis, Review & Reputation Monitoring, TikTok Trend Intelligence, YouTube Creator Intelligence, Amazon Product Monitoring, Music Catalog / Playlist Intelligence, Podcast & Audio Intelligence, Crypto Market Research, Finance Market Data, AI Agent Web Data

Legal

Terms, Privacy

© 2026 Built with 💖 by Tony Wang

YouTube video intelligence showcase

John Schulman on Reasoning, RLHF, and the Road to AGI

John Schulman explains how pre-training and post-training shape AI behavior, why long-horizon training may unlock more useful models, and what could still bottleneck progress toward AGI.

Dwarkesh Patel | Pre-training and post-training | Long-horizon tasks | Generalization and robustness | 1 hr 35 min | May 15, 2024 | 6 comment sample

Build this with Crawlora

Video intelligence API workflow

Video ID: Wo95ob_s_NI
Available APIs: Transcript, Comments, Metadata

cURL

curl "https://api.crawlora.net/api/v1/youtube/transcript/Wo95ob_s_NI" \
  -H "x-api-key: $CRAWLORA_API_KEY"
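
The same request can be sketched in Python with only the standard library. The URL and the x-api-key header are taken from the cURL example; the helper names build_transcript_request and fetch_transcript are hypothetical, and the page does not document the response format, so decoding the body as JSON is an assumption.

```python
import json
import urllib.request

# Base URL taken from the cURL example on this page.
API_BASE = "https://api.crawlora.net/api/v1"


def build_transcript_request(video_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request shown in the cURL example."""
    url = f"{API_BASE}/youtube/transcript/{video_id}"
    return urllib.request.Request(url, headers={"x-api-key": api_key}, method="GET")


def fetch_transcript(video_id: str, api_key: str, timeout: float = 30.0):
    """Send the request and decode the body.

    Assumption: the endpoint returns a JSON document. The page does not
    show a response sample, so adjust the decoding if that is not the case.
    """
    request = build_transcript_request(video_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical call would read the key from the CRAWLORA_API_KEY environment variable and pass it to fetch_transcript("Wo95ob_s_NI", key), keeping the key on the server side rather than in client code.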

Video summary

John Schulman on reasoning, RLHF, and AGI progress

In this Dwarkesh Patel interview, OpenAI cofounder John Schulman discusses reasoning, RLHF-style post-training, and what it may take for models to handle longer, more complex tasks. The conversation covers coding agents, generalization, bottlenecks, and possible paths toward more capable AI systems.

Pre-training vs. post-training

Schulman explains how pre-training builds a broad web-trained model, while post-training narrows it into a helpful chat assistant.

Long-horizon task capability

He discusses how models may move from short chatbot responses to longer, more autonomous coding and planning tasks.

Generalization and robustness

The interview explores sample efficiency, recovery from errors, and how better generalization may help models get unstuck.

AI-friendly interfaces

He also touches on UI design, multimodal use, and why human websites may still work well for AI agents.

Topics

Pre-training and post-training

How pre-training learns from web-scale data and why post-training aims for a more helpful assistant persona.

Long-horizon tasks

Why future models may handle multi-file coding projects and other longer, more autonomous tasks.

Generalization and robustness

The role of generalization, sample efficiency, and recovering from errors when models get stuck.

Audience comments snapshot

What viewers are saying

Comments praise the depth of the interview and Dwarkesh Patel’s persistent follow-up questions. Several viewers highlight the discussion of pre-training vs. post-training, long-horizon tasks, and the possibility of near-term AGI timelines as especially memorable.

Sampled comments: 6
Visible likes: 56
Public replies: 0
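
These snapshot figures can be reproduced from the six representative comments listed further down this page. The sketch below is only illustrative: the like and reply counts are copied from those comments, while the tuple layout and the summarize helper are assumptions, not the format returned by the comments API.

```python
# (likes, replies) for the six sampled comments, in the order they
# appear on this page. The tuple layout is a hypothetical stand-in
# for whatever structure the comments API actually returns.
SAMPLED_COMMENTS = [
    (24, 0),
    (8, 0),
    (16, 0),
    (1, 0),
    (4, 0),
    (3, 0),
]


def summarize(comments):
    """Aggregate a comment sample into the three snapshot figures."""
    return {
        "sampled_comments": len(comments),
        "visible_likes": sum(likes for likes, _ in comments),
        "public_replies": sum(replies for _, replies in comments),
    }


print(summarize(SAMPLED_COMMENTS))
```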

Comment themes

AI training concepts

Commenters in the sample appreciate the clear explanation of how pre-training and post-training differ.

Long-horizon capability

The audience was especially interested in how models may progress toward longer, more coherent task execution.

Thoughtful interview dynamics

Listeners valued the interviewer’s persistent questioning to surface clearer answers.

Audience signals

Strong positive reception

Multiple comments call the episode great and engaging, with appreciation for the interview style.

AGI timeline discussion stood out

Viewers specifically mention the discussion of dangerous AGI potentially emerging within a few years.

Notable takeaways were easy to follow

One comment summarizes key moments, including autonomous coding and long-horizon task ability.

Minor audio feedback

A viewer notes the audio and suggests lowering mic gain for cleaner sound.
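
The page does not say how themes and signals were derived from the comments. As one possible approach, a minimal keyword-matching sketch is shown below; the theme labels echo the headings above, but the regular expressions and the tag_themes helper are assumptions chosen for illustration.

```python
import re

# Hypothetical keyword patterns per theme. The labels follow this page's
# headings; the patterns themselves are illustrative guesses.
THEME_PATTERNS = {
    "AI training concepts": r"\b(pre|post)-training\b",
    "Long-horizon capability": r"\blong-horizon\b|\bautonomous(ly)?\b",
    "Thoughtful interview dynamics": r"\binterviewer\b|\bfollow-ups?\b|\bpressing\b",
    "AGI timeline discussion": r"\bagi\b",
    "Minor audio feedback": r"\baudio\b|\bmic gain\b",
}


def tag_themes(text: str) -> list:
    """Return the themes whose pattern matches the comment text."""
    return [
        theme
        for theme, pattern in THEME_PATTERNS.items()
        if re.search(pattern, text, flags=re.IGNORECASE)
    ]
```

Word boundaries keep a short keyword such as "agi" from matching inside unrelated words like "imagine". Keyword matching is a coarse baseline; a production workflow would likely need a classifier or manual review.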

Representative public comments

@vrai4913 2024-05-30

great episode, john schulman was interesting. i appreciated you pressing him on his view that dangerous AGI could emerge within "two or three years", at least with some likelihood where he found this topic worth discussing. i don't have enough info for a strong opinion on that myself, but i've noticed it's almost a...

24 likes, 0 replies

@ashh3051 2024-05-30

Great delving there. Thanks guys.

8 likes, 0 replies

@moonsonate5631 2025-05-30

00:30 Pre-training creates a model that can generate content from the web. Post-training targets a narrower range of behaviors like being a chat assistant. 03:44 Models evolving to perform complex coding tasks autonomously 10:29 Improvement in the ability to do long-horizon tasks is key to AI capabilities. 13:52 Mod...

16 likes, 0 replies

@justinrce 2025-05-30

Great interview, appreciate the interviewer challenging and persistent line of good questions and follow-ups to get the best answers

1 like, 0 replies

@muntazirabidi 2024-05-30

Another great episode. Thanks for such wonderful content.

4 likes, 0 replies

@peteyhayman 2024-05-30

great interview! if you want cleaner audio try reducing mic gain to avoid clipping ( it can be normalized later to get full volume)

3 likes, 0 replies

Build with YouTube comments data

Use Crawlora's YouTube comments API with the video and transcript endpoints to collect viewer language, thread activity, and audience signals.


Build this workflow

1. Fetch video metadata

Start with the video endpoint to capture ID, channel, publish date, duration, and source context.

2. Fetch transcript

Pull timestamped transcript data for summarization, search, citation, and RAG preparation.

3. Fetch public comments

Collect visible audience comments to identify themes, objections, questions, and engagement signals.

4. Store, analyze, report

Persist structured JSON, run analysis, and publish dashboards, alerts, or research reports.
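
The four steps can be sketched as a small orchestration function. Only the transcript endpoint URL appears on this page, so this sketch does not guess the metadata or comments paths: the three fetchers and the store callback are hypothetical callables supplied by the caller, and run_workflow is an illustrative name.

```python
import json
from typing import Any, Callable, Dict

# A fetcher takes a video ID and returns already-decoded data.
Fetcher = Callable[[str], Any]


def run_workflow(
    video_id: str,
    fetch_metadata: Fetcher,
    fetch_transcript: Fetcher,
    fetch_comments: Fetcher,
    store: Callable[[str], None],
) -> Dict[str, Any]:
    """Run steps 1 to 3 and persist the combined record as JSON (step 4)."""
    record = {
        "video_id": video_id,
        "metadata": fetch_metadata(video_id),      # step 1
        "transcript": fetch_transcript(video_id),  # step 2
        "comments": fetch_comments(video_id),      # step 3
    }
    # Step 4: persist structured JSON; analysis and reporting would
    # read the stored record afterwards.
    store(json.dumps(record, ensure_ascii=False))
    return record
```

Passing the fetchers in as arguments keeps the orchestration testable with stubs and leaves the choice of storage (a file, a database, a queue) to the caller.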

Public transcript excerpt

Transcript

Timestamped public transcript passages group captions into readable sections, making the video easier to scan, cite, and summarize.
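
One way to group captions into timestamped passages is to merge caption lines that start within a fixed window of the passage's first line. This is a sketch, not the page's actual grouping method: the caption shape (a dict with "start" in seconds and "text"), the 60-second default window, and both helper names are assumptions.

```python
from typing import Dict, Iterable, List


def format_timestamp(seconds: float) -> str:
    """Format seconds as M:SS, or H:MM:SS once past an hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def group_captions(captions: Iterable[Dict], window_seconds: float = 60.0) -> List[Dict]:
    """Merge captions into passages.

    A caption joins the current passage when it starts less than
    window_seconds after that passage's first caption; otherwise it
    opens a new passage. Captions are assumed to be sorted by start time.
    """
    passages: List[Dict] = []
    for caption in captions:
        start = caption["start"]
        text = caption["text"].strip()
        if passages and start - passages[-1]["start"] < window_seconds:
            passages[-1]["text"] += " " + text
        else:
            passages.append(
                {"start": start, "timestamp": format_timestamp(start), "text": text}
            )
    return passages
```

A fixed window is the simplest choice; splitting on sentence boundaries or speaker changes would give more natural passages at the cost of extra logic.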

Public excerpt
1:55

it can also assign probabilities to everything. The base model can effectively take on all of these different personas or generate all different kinds of content. When we do post-training, we're usually targeting a narrower range of behaviors where we want the model to behave like a kind of chat assistant. It's a more specific persona where it's trying to be helpful. It's not trying to imitate a person. It's answering your questions or doing your tasks. We're optimizing on a different objective, which is more about producing outputs that humans will like and find useful, as opposed to just imitating this raw content from the web.

2:46

Maybe I should take a step back and ask this. Right now we have these models that are pretty

Build with YouTube transcript data

Use Crawlora's YouTube transcript API to fetch fresh timestamped transcript data for your own server-side workflows.


Related showcases

More structured YouTube examples

Dwarkesh Patel

How GPT, Claude, and Gemini are actually trained and served – Reiner Pope

Reiner Pope explains the mechanics behind how GPT-style models are trained and served, focusing in this excerpt on inference economics. Using a roofline-style analysis of transformer execution on a GPU cluster, he shows how batch size, weight fetches, compute throughput, and KV cache access shape latency and cost. The discussion helps explain why higher-priced fast modes can stream tokens more quickly, and why serving many users together can dramatically improve efficiency.

Batch size and batching, Roofline analysis

Dwarkesh Patel

Jensen Huang on Nvidia’s Moat, Supply Chain Bottlenecks, and Whether AI Software Gets Commoditized

Jensen Huang argues that Nvidia’s moat is not just software, but the hard-to-replicate system that turns electrons into valuable tokens across a broad AI ecosystem. He also discusses supply chain constraints, upstream investments, and how Nvidia plans years ahead to scale through bottlenecks.

Nvidia’s value creation, Supply chain and ecosystem

Dwarkesh Patel

Michael Nielsen on scientific progress, falsification, and the road to special relativity

Michael Nielsen and Dwarkesh Patel discuss how scientific progress is actually recognized in practice, using the history of the ether, Michelson-Morley, Lorentz, Poincaré, Einstein, and later muon experiments to show why the standard falsification story is often too simple.

Michelson-Morley and the myth of simple falsification, Multiple theories, not one target
