Datasets API endpoint
Use Crawlora's GitHub users dataset item API to search or inspect stored structured datasets as JSON. This page includes request parameters, cURL examples, response schema, validation behavior, credit cost, and a Playground link for testing before integration. Dataset endpoints read indexed records and do not apply proxy routing.
GET /datasets/github-users/items/{login}
Returns one enriched GitHub user record by login from the dataset with id `github-users`. Developers commonly use this endpoint for repeatable record lookups, profile enrichment, analytics, exports, and internal tools that need structured GitHub user records. Authentication uses the x-api-key header, usage is metered with the credit cost shown on this page, and the request does not trigger live scraping or proxy routing.
Request parameters are generated from the active endpoint catalog. Dataset parameters filter, page, facet, or locate stored structured records; they do not configure a live scraper or proxy path.
| Parameter | Type | Required | Default | Description | Example |
|---|---|---|---|---|---|
| login (path) | string | Yes | | GitHub login, max 128 characters | |
| x-api-key (header) | string | Yes | | API key required | |
curl -X GET "https://api.crawlora.net/api/v1/datasets/github-users/items/%3Clogin%3E" \
  -H "x-api-key: $CRAWLORA_API_KEY"
Send your scraping API key in the x-api-key header. Use the console API Keys page to rotate or select the active key.
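The same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names `build_item_request` and `fetch_github_user` are hypothetical, the base URL is copied from the cURL example, and the key is assumed to live in the `CRAWLORA_API_KEY` environment variable.

```python
import json
import os
import urllib.parse
import urllib.request

BASE_URL = "https://api.crawlora.net/api/v1"  # taken from the cURL example


def build_item_request(login: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for one github-users item."""
    if not login or len(login) > 128:
        # The parameter table limits login to 128 characters.
        raise ValueError("login must be 1-128 characters")
    # Percent-encode the login so it is safe as a single path segment.
    path_login = urllib.parse.quote(login, safe="")
    url = f"{BASE_URL}/datasets/github-users/items/{path_login}"
    return urllib.request.Request(url, headers={"x-api-key": api_key}, method="GET")


def fetch_github_user(login: str) -> dict:
    """Send the request and decode the JSON body (hypothetical helper)."""
    request = build_item_request(login, os.environ["CRAWLORA_API_KEY"])
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the URL and header logic easy to check without spending credits.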
Endpoint usage is metered in credits. The plan prices, included credits, limits, and overage rates below match the active backend billing configuration.
| Plan | Price | Included credits | Daily cap | Rate limit | Overage |
|---|---|---|---|---|---|
| Free | $0/mo | 2,000 | 500 daily credits | 5/min | No overage |
| Starter | $9/mo | 20,000 | 5,000 daily credits | 15/min | $0.75/1,000 overage credits when enabled |
| Growth | $29/mo | 100,000 | 25,000 daily credits | 45/min | $0.45/1,000 overage credits when enabled |
| Pro | $79/mo | 400,000 | No daily cap | 120/min | $0.30/1,000 overage credits |
| Business | $199/mo | 1,200,000 | No daily cap | 300/min | $0.20/1,000 overage credits |
| Enterprise | $499/mo | 5,000,000 | No daily cap | 1,000/min | $0.12/1,000 overage credits |
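To make the overage column concrete, the following sketch estimates a monthly bill from the figures in the table. It rests on assumptions the page does not confirm: overage is enabled on plans where it is optional, overage is billed pro rata per credit rather than in blocks of 1,000, and daily caps and rate limits are ignored. The function name `estimate_monthly_cost` is illustrative.

```python
from decimal import Decimal

# plan: (monthly price in USD, included credits, overage USD per 1,000 credits or None)
PLANS = {
    "Free": (Decimal("0"), 2_000, None),
    "Starter": (Decimal("9"), 20_000, Decimal("0.75")),
    "Growth": (Decimal("29"), 100_000, Decimal("0.45")),
    "Pro": (Decimal("79"), 400_000, Decimal("0.30")),
    "Business": (Decimal("199"), 1_200_000, Decimal("0.20")),
    "Enterprise": (Decimal("499"), 5_000_000, Decimal("0.12")),
}


def estimate_monthly_cost(plan: str, credits_used: int) -> Decimal:
    """Estimate the monthly cost in USD for a plan and a credit volume."""
    price, included, overage_rate = PLANS[plan]
    extra = max(0, credits_used - included)
    if extra == 0:
        return price
    if overage_rate is None:
        raise ValueError(f"{plan} plan has no overage beyond {included} credits")
    # Assumption: overage is billed pro rata per credit, not in blocks of 1,000.
    return price + overage_rate * extra / 1000
```

Under these assumptions, 30,000 credits on Starter comes to $9 plus 10 × $0.75, or $16.50.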
This endpoint reads stored indexed dataset records. It does not execute a live upstream GitHub request, browser session, or proxy-routed scraping job.
- Returns `404` when the login is not present in the stored dataset.
- Does not trigger live scraping.

Example response:

```json
{
  "code": 200,
  "msg": "OK",
  "data": {
    "login": "octodev",
    "name": "Octo Dev",
    "company_normalized": "google",
    "blog": "https://octo.dev",
    "twitter_username": "octodev",
    "influence_tier": "mid",
    "geo": {
      "country": "Germany",
      "country_code": "DE",
      "city": "Berlin"
    },
    "followers": 1200,
    "public_repos": 60,
    "account_age_years": 10.0,
    "reachable": true,
    "domains": ["ml-ai"],
    "rank_score": 88
  }
}
```
Crawlora does not silently return invalid dataset search results when filters, pagination, coordinates, or stored record lookups cannot be satisfied.
| Status | Common failure case |
|---|---|
| 400 | Invalid input, missing required parameter, invalid enum, bad coordinate pair, or result window beyond the dataset limit |
| 404 | Requested stored dataset item is not present |
| 429 | Plan or endpoint rate limit exceeded |
| 500 | Internal dataset query or storage error |
When possible, Crawlora returns structured error context so your integration can adjust filters, page size, location inputs, or lookup identifiers.
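One way a client might act on the status codes in the table above is sketched here. It is an illustration, not Crawlora's recommended policy: treating 429 and 500 as retryable, the backoff schedule, and the presence of a `msg` field in error bodies are all assumptions. The `send` callable is a hypothetical stand-in for whatever function performs the HTTP call and returns `(status, parsed_body)`.

```python
import time
from typing import Callable, Optional, Tuple

RETRYABLE = {429, 500}  # assumption: rate-limit and internal errors are worth retrying


def get_with_retry(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call send() until it succeeds or fails permanently.

    Returns the parsed body on 200, None on 404, and raises otherwise.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        if status == 404:
            return None  # the login is not present in the stored dataset
        if status in RETRYABLE and attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
            continue
        if status == 400:
            # Invalid input will not succeed on retry; surface it to the caller.
            raise ValueError(f"invalid request: {body.get('msg')}")
        raise RuntimeError(f"request failed with status {status}")
    raise RuntimeError("max_attempts must be at least 1")
```

Returning `None` for 404 lets callers treat a missing login as an ordinary outcome, while 400 raises immediately because the input itself needs to change.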
| Status | Description | Schema |
|---|---|---|
| 400 | Bad Request | #/definitions/app.Response |
| 404 | Not Found | #/definitions/app.Response |
| 429 | Too Many Requests | #/definitions/app.Response |
| 500 | Internal Server Error | #/definitions/app.Response |
Request schema
No body schema
Response schema
#/definitions/datasets.githubUserResponseDoc
| Field | Type | Required | Enum | Bounds | Example | Description |
|---|---|---|---|---|---|---|
| code | integer | No | | | 200 | |
| data | es.GithubUserRecord | No | ||||
| data.account_age_years | number | No | ||||
| data.active_90d | boolean | No | ||||
| data.avatar_url | string | No | ||||
| data.bio | string | No | ||||
| data.blog | string | No | ||||
| data.company | string | No | ||||
| data.company_normalized | string | No | ||||
| data.crawled_at | string | No | ||||
| data.created_at | string | No | ||||
| data.domains | array | No | ||||
| data.email | string | No | ||||
| data.follower_following_ratio | number | No | ||||
| data.followers | integer | No | ||||
| data.following | integer | No | ||||
| data.geo | es.GithubGeo | No | ||||
| data.geo.city | string | No | ||||
| data.geo.country | string | No | ||||
| data.geo.country_code | string | No | ||||
| data.geo.location | es.GithubGeoPoint | No | ||||
| data.geo.location.lat | number | No | ||||
| data.geo.location.lon | number | No | ||||
| data.geo.state | string | No | ||||
| data.has_blog | boolean | No | ||||
| data.has_email | boolean | No | ||||
| data.has_twitter | boolean | No | ||||
| data.hireable | boolean | No | ||||
| data.html_url | string | No | ||||
| data.id | integer | No | ||||
| data.influence_tier | string | No | ||||
| data.is_bot | boolean | No | ||||
| data.is_org | boolean | No | ||||
| data.last_active_at | string | No | ||||
| data.location_raw | string | No | ||||
| data.login | string | No | ||||
| data.name | string | No | ||||
| data.prs_30d | integer | No | ||||
| data.public_gists | integer | No | ||||
| data.public_repos | integer | No | ||||
| data.pushes_30d | integer | No | ||||
| data.rank_score | integer | No | ||||
| data.reachable | boolean | No | ||||
| data.reviews_30d | integer | No | ||||
| data.schema_version | integer | No | ||||
| data.social_accounts | array | No | ||||
| data.social_accounts[].provider | string | No | ||||
| data.social_accounts[].url | string | No | ||||
| data.social_count | integer | No | ||||
| data.twitter_username | string | No | ||||
| data.type | string | No | ||||
| msg | string | No | | | OK | |
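Because every field in the schema is marked as not required, client code should tolerate missing keys. The sketch below reads a handful of fields from a response into a small dataclass; the class `GithubUserSummary`, the function `parse_user`, and the choice of fields are illustrative and not part of the API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GithubUserSummary:
    """A small, illustrative subset of the fields in the response schema."""
    login: Optional[str] = None
    name: Optional[str] = None
    followers: Optional[int] = None
    country_code: Optional[str] = None
    reachable: Optional[bool] = None
    domains: List[str] = field(default_factory=list)


def parse_user(payload: dict) -> GithubUserSummary:
    """Extract selected fields, treating any missing key as absent."""
    data = payload.get("data") or {}
    geo = data.get("geo") or {}
    return GithubUserSummary(
        login=data.get("login"),
        name=data.get("name"),
        followers=data.get("followers"),
        country_code=geo.get("country_code"),
        reachable=data.get("reachable"),
        domains=list(data.get("domains") or []),
    )
```

Applied to the example response on this page, `parse_user` would yield login `octodev`, country code `DE`, and the single domain `ml-ai`; an empty payload yields a summary with every field unset.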
Use environment variables for secrets and keep Crawlora API keys server-side.
curl -X GET "https://api.crawlora.net/api/v1/datasets/github-users/items/%3Clogin%3E" \
  -H "x-api-key: $CRAWLORA_API_KEY"

Crawlora is designed for responsible structured public web data workflows. Customers are responsible for using Crawlora in compliance with applicable laws, third-party rights, target-platform rules, and the Crawlora terms.