LangChain Integration Guide

Use Crawlora APIs to provide structured public web data to LangChain workflows without relying on raw HTML as the primary input.

Custom tools · Loader pattern · Retrieval ingestion · Python

Verified HTTP pattern

POST /google/search · Normalized JSON

Request

POST https://api.crawlora.net/api/v1/google/search
x-api-key: $CRAWLORA_API_KEY
Content-Type: application/json

{
  "country": "us",
  "keyword": "best CRM software",
  "language": "en",
  "limit": 10,
  "page": 1
}

Base URL: https://api.crawlora.net/api/v1
Auth header: x-api-key
Example endpoint: POST /google/search

This guide does not describe an official LangChain package; treat it as a custom tool and loader pattern built on the Crawlora HTTP API.

Why use Crawlora with LangChain?

LangChain workflows often need external data. Crawlora can provide normalized JSON from supported public platforms, making it easier to summarize, classify, embed, or store results.

Integration patterns

  • Custom tool.
  • Custom document loader.
  • Retrieval ingestion pipeline.
  • Agent tool.
  • Scheduled data refresh job.

Python custom tool example

Adapt the wrapper to your installed LangChain version's current tool API.

Custom function · python

import os
import requests

API_KEY = os.environ["CRAWLORA_API_KEY"]
BASE_URL = "https://api.crawlora.net/api/v1"

def crawlora_google_search(query: str) -> dict:
    response = requests.post(
        f"{BASE_URL}/google/search",
        headers={"x-api-key": API_KEY, "Content-Type": "application/json"},
        json={"keyword": query, "country": "us", "language": "en", "limit": 10, "page": 1},
        timeout=60,
    )
    response.raise_for_status()
    return response.json()

# Adapt this function to your installed LangChain version's tool wrapper.
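
If your installed LangChain version exposes the @tool decorator from langchain_core.tools, a minimal wrapper sketch could look like the following. The tool name and docstring are illustrative; adjust to whichever tool API your release provides.

Tool wrapper sketch · python

# A minimal sketch, assuming langchain-core provides the @tool decorator.
from langchain_core.tools import tool

@tool
def crawlora_serp_search(query: str) -> dict:
    """Search Google via Crawlora and return normalized JSON results."""
    return crawlora_google_search(query)

# Pass the wrapped tool to your agent, e.g. tools=[crawlora_serp_search].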

Document conversion example

Transform Crawlora JSON into simple document dictionaries before passing them into your retrieval or storage layer.

Document conversion · python

def crawlora_results_to_documents(payload: dict) -> list[dict]:
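    # Result items sit under data.result in the normalized response payload;
    # field casing (e.g. "Snippet" vs "title") follows the endpoint's schema.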
    results = payload.get("data", {}).get("result", [])

    return [
        {
            "page_content": item.get("Snippet") or item.get("title") or "",
            "metadata": {
                "title": item.get("title"),
                "url": item.get("link"),
                "position": item.get("position"),
                "source": "crawlora_google_search",
            },
        }
        for item in results
    ]
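
If you prefer native LangChain Document objects over plain dictionaries, a minimal conversion sketch (assuming langchain-core is installed; the helper name is illustrative) follows.

Document objects · python

# A minimal sketch, assuming langchain-core is installed.
from langchain_core.documents import Document

def to_langchain_documents(payload: dict) -> list[Document]:
    # Reuse the dictionary conversion above, then wrap each item as a Document.
    return [
        Document(page_content=doc["page_content"], metadata=doc["metadata"])
        for doc in crawlora_results_to_documents(payload)
    ]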

Use cases

  • SERP research agent.
  • YouTube transcript summarizer.
  • App review clustering.
  • Local business research.
  • Product Hunt startup research.
  • Amazon product monitoring.

Production tips

  • Keep API keys server-side.
  • Control result count.
  • Add timeouts.
  • Handle 429 responses (see the backoff sketch below).
  • Cache where appropriate.
  • Log request IDs where available.
  • Avoid sending unnecessary personal data to LLMs.
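
The backoff sketch below shows one way to handle 429 responses; the helper name, retry count, and delays are illustrative rather than a Crawlora-specified policy.

Backoff sketch · python

import time
import requests

def post_with_backoff(url: str, headers: dict, payload: dict, retries: int = 3) -> dict:
    # Retry only on 429 responses; raise immediately on other HTTP errors.
    for attempt in range(retries):
        response = requests.post(url, headers=headers, json=payload, timeout=60)
        if response.status_code != 429:
            response.raise_for_status()
            return response.json()
        time.sleep(2 ** attempt)  # back off 1s, 2s, 4s between attempts
    raise RuntimeError("Still rate limited after retries; reduce concurrency or request volume.")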

Responsible public web data workflows

Use Crawlora for structured public web data workflows. Customers are responsible for compliance with applicable laws, third-party rights, platform rules, and Crawlora terms. Keep API keys server-side, validate inputs, and avoid collecting or storing unnecessary sensitive data.

Read Crawlora terms

Related developer links

Use these pages to move between endpoint discovery, examples, pricing, and responsible-use guidance.

FAQ

Common questions for this Crawlora developer integration path.

Does Crawlora have an official LangChain integration?

No official Crawlora LangChain package is provided here. Use a custom tool or loader wrapper around the HTTP API, as shown in this guide.

Should I build a LangChain tool or loader?

Use a tool for agent-time decisions and a loader for scheduled ingestion or retrieval indexing.

Can I use Crawlora results in a vector database?

Yes. Convert normalized result items into documents with clear metadata before embedding.
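
A minimal indexing sketch, assuming langchain-core plus an embeddings integration such as langchain-openai are installed; the to_langchain_documents helper comes from the conversion sketch above, and the query string is illustrative.

Vector store sketch · python

# A minimal sketch, assuming langchain-core and langchain-openai are installed
# and OPENAI_API_KEY is set in the environment.
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings

docs = to_langchain_documents(payload)  # payload: a Crawlora search response
store = InMemoryVectorStore.from_documents(docs, OpenAIEmbeddings())
hits = store.similarity_search("open source CRM tools", k=3)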

Can I summarize YouTube transcripts with LangChain?

Yes, if the selected YouTube transcript endpoint fits your workflow. Keep result counts and token budgets bounded.

How do I control cost?

Bound result counts, cache repeated requests, and monitor credits on the pricing and console surfaces.
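
One in-process caching sketch is shown below; the cache size and helper name are illustrative, and a shared cache is a better fit for multi-process deployments.

Caching sketch · python

# A minimal sketch: cache identical queries in-process to avoid spending credits twice.
from functools import lru_cache

@lru_cache(maxsize=256)
def cached_google_search(query: str) -> dict:
    # Cached results are returned by reference; avoid mutating them downstream.
    return crawlora_google_search(query)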

How should I handle rate limits?

Back off on 429 responses, reduce concurrency, and avoid aggressive retry loops.

How is this different from a general web loader?

Crawlora returns platform-specific JSON for supported sources instead of relying on raw HTML extraction.

Next step

Wrap one endpoint as a LangChain tool

Start with Google Search or YouTube, normalize the response, then connect it to your agent or retrieval flow.