Developer guides
LangChain Integration Guide
Use Crawlora APIs to provide structured public web data to LangChain workflows without relying on raw HTML as the primary input.
Verified HTTP pattern
POST /google/search
Request
POST https://api.crawlora.net/api/v1/google/search
x-api-key: $CRAWLORA_API_KEY
Content-Type: application/json
{
"country": "us",
"keyword": "best CRM software",
"language": "en",
"limit": 10,
"page": 1
}

Base URL
https://api.crawlora.net/api/v1
Auth header
x-api-key
Example endpoint
POST /google/search
There is no official Crawlora LangChain package. Treat this guide as a custom tool and loader pattern built on the HTTP API.
Developer workflow
Why use Crawlora with LangChain?
LangChain workflows often need external data. Crawlora can provide normalized JSON from supported public platforms, making it easier to summarize, classify, embed, or store results.
Integration patterns
- Custom tool.
- Custom document loader.
- Retrieval ingestion pipeline.
- Agent tool.
- Scheduled data refresh job.
Python custom tool example
Adapt the wrapper to your installed LangChain version's current tool API.
Custom function · python
import os
import requests
API_KEY = os.environ["CRAWLORA_API_KEY"]
BASE_URL = "https://api.crawlora.net/api/v1"
def crawlora_google_search(query: str) -> dict:
    response = requests.post(
        f"{BASE_URL}/google/search",
        headers={"x-api-key": API_KEY, "Content-Type": "application/json"},
        json={"keyword": query, "country": "us", "language": "en", "limit": 10, "page": 1},
        timeout=60,
    )
    response.raise_for_status()
    return response.json()

# Adapt this function to your installed LangChain version's tool wrapper.
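Because the tool API differs across LangChain versions, one option is to keep the wrapper framework-agnostic and adapt it at the edge. The sketch below is a minimal, hypothetical pattern: `make_tool` and `fake_search` are illustrative names, not part of Crawlora or LangChain, and the stub stands in for the real search function so the example runs without a network call or API key.

```python
from typing import Any, Callable


def make_tool(name: str, description: str, func: Callable[[str], Any]) -> dict:
    """Minimal, framework-agnostic tool record. Adapt this shape to the
    tool API of your installed LangChain version (e.g. a @tool decorator)."""
    return {"name": name, "description": description, "func": func}


# Hypothetical stub standing in for crawlora_google_search so the sketch
# runs offline; swap in the real function in your application.
def fake_search(query: str) -> dict:
    return {"data": {"result": [{"title": f"Result for {query}"}]}}


search_tool = make_tool(
    "crawlora_google_search",
    "Search Google via Crawlora and return normalized JSON.",
    fake_search,
)

payload = search_tool["func"]("best CRM software")
```

Registering the record with your agent framework is then a one-line adaptation rather than a rewrite of the HTTP logic.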
Document conversion example
Transform Crawlora JSON into simple document dictionaries before passing them into your retrieval or storage layer.
Document conversion · python
def crawlora_results_to_documents(payload: dict) -> list[dict]:
    results = payload.get("data", {}).get("result", [])
    return [
        {
            "page_content": item.get("Snippet") or item.get("title") or "",
            "metadata": {
                "title": item.get("title"),
                "url": item.get("link"),
                "position": item.get("position"),
                "source": "crawlora_google_search",
            },
        }
        for item in results
    ]
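To see the converter in action, the snippet below feeds it a sample payload whose shape mirrors the keys the converter reads; the real Crawlora response may carry additional fields. The converter is repeated here so the example runs standalone.

```python
def crawlora_results_to_documents(payload: dict) -> list[dict]:
    results = payload.get("data", {}).get("result", [])
    return [
        {
            "page_content": item.get("Snippet") or item.get("title") or "",
            "metadata": {
                "title": item.get("title"),
                "url": item.get("link"),
                "position": item.get("position"),
                "source": "crawlora_google_search",
            },
        }
        for item in results
    ]


# Illustrative payload; field values are made up for the example.
payload = {
    "data": {
        "result": [
            {"title": "Best CRM 2024", "link": "https://example.com/crm",
             "position": 1, "Snippet": "A comparison of CRM tools."},
            {"title": "CRM pricing guide", "link": "https://example.com/pricing",
             "position": 2},
        ]
    }
}

docs = crawlora_results_to_documents(payload)
```

Note the fallback: the second item has no snippet, so its title becomes the document body, keeping every result usable downstream.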
Use cases
- SERP research agent.
- YouTube transcript summarizer.
- App review clustering.
- Local business research.
- Product Hunt startup research.
- Amazon product monitoring.
Production tips
- Keep API keys server-side.
- Control result count.
- Add timeouts.
- Handle 429 responses.
- Cache where appropriate.
- Log request IDs where available.
- Avoid sending unnecessary personal data to LLMs.
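Handling 429 responses from the tips above can be sketched as exponential backoff around any request callable. This is a generic, illustrative helper, not a Crawlora SDK feature; the `send` and `is_rate_limited` callables and retry parameters are assumptions you would tune for your client.

```python
import time
from typing import Any, Callable


class RateLimitedError(Exception):
    """Raised when the API keeps returning 429 after all retries."""


def call_with_backoff(
    send: Callable[[], Any],
    is_rate_limited: Callable[[Any], bool],
    max_retries: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Retry `send` with exponential backoff while `is_rate_limited` is true."""
    for attempt in range(max_retries + 1):
        response = send()
        if not is_rate_limited(response):
            return response
        if attempt < max_retries:
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, 8s, ...
    raise RateLimitedError("gave up after repeated 429 responses")
```

With an HTTP client such as requests, `is_rate_limited` would be `lambda r: r.status_code == 429`; the injectable `sleep` also makes the helper easy to test.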
Responsible public web data workflows
Use Crawlora for structured public web data workflows. Customers are responsible for compliance with applicable laws, third-party rights, platform rules, and Crawlora terms. Keep API keys server-side, validate inputs, and avoid collecting or storing unnecessary sensitive data.
Read Crawlora terms
Related developer links
Use these pages to move between endpoint discovery, examples, pricing, and responsible-use guidance.
FAQ
Common questions for this Crawlora developer integration path.
Does Crawlora have an official LangChain integration?
This frontend repository does not contain an official Crawlora LangChain package. Use a custom tool or loader wrapper around the HTTP API.
Should I build a LangChain tool or loader?
Use a tool for agent-time decisions and a loader for scheduled ingestion or retrieval indexing.
Can I use Crawlora results in a vector database?
Yes. Convert normalized result items into documents with clear metadata before embedding.
Can I summarize YouTube transcripts with LangChain?
Yes, if the selected YouTube transcript endpoint fits your workflow. Keep result counts and token budgets bounded.
How do I control cost?
Bound result counts, cache repeated requests, and monitor credits on the pricing and console surfaces.
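Caching repeated requests can be as simple as an in-memory TTL wrapper keyed by the query string. The sketch below is a minimal illustration, not a Crawlora feature; the `ttl_seconds` value and the injectable `clock` are assumptions chosen to keep the example testable.

```python
import time
from typing import Any, Callable


def make_cached(
    fetch: Callable[[str], Any],
    ttl_seconds: float = 300.0,
    clock: Callable[[], float] = time.monotonic,
) -> Callable[[str], Any]:
    """Wrap `fetch` with an in-memory TTL cache keyed by the query string.
    Repeated queries within the TTL reuse the cached result, spending no credits."""
    cache: dict[str, tuple[float, Any]] = {}

    def cached_fetch(query: str) -> Any:
        now = clock()
        hit = cache.get(query)
        if hit is not None and now - hit[0] < ttl_seconds:
            return hit[1]
        result = fetch(query)
        cache[query] = (now, result)
        return result

    return cached_fetch
```

For multi-process deployments you would swap the dict for a shared store such as Redis, but the keying and TTL logic stay the same.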
How should I handle rate limits?
Back off on 429 responses, reduce concurrency, and avoid aggressive retry loops.
How is this different from a general web loader?
Crawlora returns platform-specific JSON for supported sources instead of relying on raw HTML extraction.
Next step
Wrap one endpoint as a LangChain tool
Start with Google Search or YouTube, normalize the response, then connect it to your agent or retrieval flow.