Developer guides

Python Integration Guide

Connect Python scripts, notebooks, ETL jobs, and AI workflows to Crawlora's structured public web data APIs.

Verified HTTP pattern: POST /google/search returns normalized JSON.

Request

POST https://api.crawlora.net/api/v1/google/search
x-api-key: $CRAWLORA_API_KEY
Content-Type: application/json

{
  "country": "us",
  "keyword": "best CRM software",
  "language": "en",
  "limit": 10,
  "page": 1
}

Base URL: https://api.crawlora.net/api/v1

Auth header: `x-api-key`

Example endpoint: POST /google/search

This repository does not contain an official Crawlora Python package, so the examples use standard HTTP requests.

Developer workflow

Install dependencies

Use `requests` for a simple synchronous integration.

Install requests · bash

pip install requests

Environment variable

.env · bash

export CRAWLORA_API_KEY="your_api_key_here"
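In Python code, the key can then be read back from the environment. The helper below is a minimal sketch (the function name `load_api_key` is illustrative, not part of any Crawlora package) that fails fast with a clear message when the variable is unset:

```python
import os

def load_api_key(name: str = "CRAWLORA_API_KEY") -> str:
    """Read the API key from the environment, failing fast if it is missing."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it in your shell or load it from a .env file"
        )
    return key
```

Failing at startup is preferable to sending unauthenticated requests and debugging 401 responses later.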

Basic request

This request uses the verified Google Search path and `x-api-key` authentication.

Python request · python

import os
import requests

API_KEY = os.environ["CRAWLORA_API_KEY"]
BASE_URL = "https://api.crawlora.net/api/v1"

def crawlora_request(path: str, payload: dict) -> dict:
    response = requests.post(
        f"{BASE_URL}{path}",
        headers={
            "x-api-key": API_KEY,
            "Content-Type": "application/json",
        },
        json=payload,
        timeout=60,
    )

    if not response.ok:
        raise RuntimeError(
            f"Crawlora request failed: {response.status_code} {response.text}"
        )

    return response.json()

if __name__ == "__main__":
    data = crawlora_request(
        "/google/search",
        {
            "country": "us",
            "keyword": "best CRM software",
            "language": "en",
            "limit": 10,
            "page": 1,
        },
    )

    print(data)

Data pipeline example

Keep the pipeline dependency-light by writing JSONL directly from the standard library.

JSONL pipeline · python

import json

keywords = ["project management software", "crm for startups", "sales automation"]

with open("crawlora-search-results.jsonl", "w", encoding="utf-8") as output:
    for keyword in keywords:
        try:
            payload = crawlora_request(
                "/google/search",
                {"keyword": keyword, "country": "us", "language": "en", "limit": 10, "page": 1},
            )
            output.write(json.dumps({"keyword": keyword, "response": payload}) + "\n")
        except RuntimeError as exc:
            output.write(json.dumps({"keyword": keyword, "error": str(exc)}) + "\n")

Notebook / AI workflow example

Normalize Crawlora output before passing it into a summarization, clustering, or retrieval step.

Structured agent input · python

def summarize_search_results(keyword: str) -> dict:
    data = crawlora_request(
        "/google/search",
        {"keyword": keyword, "country": "us", "language": "en", "limit": 10, "page": 1},
    )

    results = data.get("data", {}).get("result", [])
    return {
        "keyword": keyword,
        "items": [
            {
                "title": item.get("title"),
                "url": item.get("link"),
                "snippet": item.get("Snippet"),
            }
            for item in results
        ],
    }

# Pass this structured JSON into your LLM or agent layer.

Error handling

Catch request exceptions, inspect 4xx and temporary 5xx responses, and log response context where available.

| Status / code | Meaning | How to handle |
| --- | --- | --- |
| 400 | Invalid request or missing required input. | Validate request bodies before calling Crawlora and surface useful messages to users. |
| 401 | Missing or invalid API key. | Check the `x-api-key` header and rotate the key from the console if needed. |
| 402/403 | Plan, permission, or billing issue where applicable. | Check plan access, credit state, and endpoint availability. |
| 429 | Rate limit exceeded. | Back off with jitter and reduce concurrency. |
| 5xx | Temporary execution or upstream failure. | Retry safe jobs with exponential backoff and keep the failure visible. |
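The backoff-with-jitter advice for 429 and 5xx responses can be wrapped around any callable. This is a sketch, not an official helper; in production you would inspect the status code and retry only the retryable rows:

```python
import random
import time

def retry_with_backoff(call, *, retries=4, base_delay=0.5, retry_on=(RuntimeError,)):
    """Run `call`, retrying on the given exceptions with exponential backoff plus jitter."""
    for attempt in range(retries + 1):
        try:
            return call()
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: keep the failure visible
            # Exponential delay (0.5s, 1s, 2s, ...) plus random jitter
            # so concurrent workers do not retry in lockstep.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

With the `crawlora_request` helper above, a guarded call becomes `retry_with_backoff(lambda: crawlora_request("/google/search", payload))`.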

Production checklist

  • Store the key in an environment variable.
  • Set a timeout.
  • Avoid unbounded concurrency.
  • Handle 429 responses.
  • Store raw responses only when needed.
  • Sanitize user input.
  • Respect responsible-use rules.
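For the concurrency item above, a fixed-size thread pool is usually enough for I/O-bound API calls. This sketch fans a keyword list out over a bounded pool and assumes a `fetch` callable such as a wrapped `crawlora_request` from earlier in this guide:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_all(keywords, fetch, max_workers=4):
    """Run `fetch(keyword)` across a bounded thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, keywords))
```

Keeping `max_workers` small also reduces the chance of hitting 429 responses in the first place.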

Responsible public web data workflows

Use Crawlora for structured public web data workflows. Customers are responsible for compliance with applicable laws, third-party rights, platform rules, and Crawlora terms. Keep API keys server-side, validate inputs, and avoid collecting or storing unnecessary sensitive data.

Read Crawlora terms

Related developer links

Use these pages to move between endpoint discovery, examples, pricing, and responsible-use guidance.

FAQ

Common questions for this Crawlora developer integration path.

Is there an official Crawlora Python SDK?

This repository does not include an official Crawlora Python SDK. Use `requests` or another standard HTTP client with the documented API.

Should I use requests or httpx?

Use requests for straightforward scripts. Use httpx if your project already needs async clients, connection pooling, or richer timeout controls.

Can I use Crawlora in notebooks?

Yes. Store your key outside the notebook when possible and keep result counts bounded.
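One common way to keep the key out of the notebook file, sketched here as an assumption rather than an official pattern, is to prefer the environment variable and fall back to an interactive prompt:

```python
import os
from getpass import getpass

def notebook_api_key() -> str:
    """Prefer the environment variable; prompt interactively only if it is missing."""
    return os.environ.get("CRAWLORA_API_KEY") or getpass("Crawlora API key: ")
```

The prompted value lives only in memory, so it is never committed alongside notebook outputs.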

Can I run Crawlora jobs in Airflow or cron?

Yes. Add retries with backoff, timeout controls, and logging around each endpoint call.

How do I handle rate limits?

Reduce concurrency and retry 429 responses with exponential backoff. Avoid tight retry loops.

Can I use Crawlora with LangChain?

Yes. Wrap a Crawlora HTTP call as a custom tool, loader, or retrieval ingestion step.

Where are response schemas documented?

Endpoint detail pages in the docs catalog show available examples, parameters, and schema references.

Next step

Run your first Python request

Copy the Google Search example, swap in a keyword, then inspect the endpoint docs for platform-specific schemas.