Datasets API endpoint
Use Crawlora's Search nearby GitHub users API to search or inspect stored structured datasets as JSON. This page includes request parameters, cURL examples, response schema, validation behavior, credit cost, and a Playground link for testing before integration. Dataset endpoints read indexed records and do not apply proxy routing.
/datasets/github-users/nearby

Searches enriched GitHub users near a coordinate, sorted by distance, in the `github-users` dataset. The `influence_tier` enum accepts `nano`, `micro`, `mid`, `macro`, and `mega`. Developers commonly use this endpoint for repeatable dataset search, filtering, facets, analytics, exports, and internal tools that need structured records. Authentication uses the x-api-key header, usage is metered with the credit cost shown on this page, and the request does not trigger live scraping or proxy routing.
Request parameters are generated from the active endpoint catalog. Dataset parameters filter, page, facet, or locate stored structured records; they do not configure a live scraper or proxy path.
| Parameter | Type | Required | Default | Description | Example |
|---|---|---|---|---|---|
| lat | number | Yes | | Latitude | |
| lon | number | Yes | | Longitude | |
| radius_m | integer | Yes | | Radius in meters, max 50000 | |
| influence_tier | string | No | | Follower-tier enum: nano, micro, mid, macro, mega | |
| reachable | boolean | No | | Filter by any public contact channel | |
| min_followers | integer | No | | Minimum follower count | |
| page | integer | No | 1 | Page number, defaults to 1 | |
| page_size | integer | No | 20 | Page size, defaults to 20 and maxes at 100; page * page_size must be <= 10000 | |
| x-api-key (header) | string | Yes | | API key required | |
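Example request (the `lat`, `lon`, and `radius_m` values are illustrative; all three are required):

curl -X GET "https://api.crawlora.net/api/v1/datasets/github-users/nearby?lat=37.7749&lon=-122.4194&radius_m=5000&reachable=true&page=1" \
  -H "x-api-key: $CRAWLORA_API_KEY"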
Send your scraping API key in the x-api-key header. Use the console API Keys page to rotate or select the active key.
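The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: the function names `build_nearby_url` and `auth_headers` are hypothetical, and the latitude/longitude ranges and the "radius must be positive" check are assumptions about what the server treats as a bad coordinate pair.

```python
import os
from urllib.parse import urlencode

BASE_URL = "https://api.crawlora.net/api/v1/datasets/github-users/nearby"
TIERS = {"nano", "micro", "mid", "macro", "mega"}
MAX_RADIUS_M = 50_000   # documented maximum radius
MAX_PAGE_SIZE = 100     # documented maximum page size
MAX_WINDOW = 10_000     # documented limit on page * page_size


def build_nearby_url(lat, lon, radius_m, *, influence_tier=None,
                     reachable=None, min_followers=None,
                     page=1, page_size=20):
    """Validate the documented constraints locally and return the request URL."""
    # Assumption: standard geographic ranges; the page only says "bad coordinate pair".
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("lat must be in [-90, 90] and lon in [-180, 180]")
    # Assumption: a zero or negative radius is rejected.
    if not (0 < radius_m <= MAX_RADIUS_M):
        raise ValueError("radius_m must be between 1 and 50000")
    if influence_tier is not None and influence_tier not in TIERS:
        raise ValueError("influence_tier must be one of " + ", ".join(sorted(TIERS)))
    if page < 1 or not (1 <= page_size <= MAX_PAGE_SIZE):
        raise ValueError("page must be >= 1 and page_size between 1 and 100")
    if page * page_size > MAX_WINDOW:
        raise ValueError("page * page_size must not exceed 10000")

    params = {"lat": lat, "lon": lon, "radius_m": radius_m,
              "page": page, "page_size": page_size}
    if influence_tier is not None:
        params["influence_tier"] = influence_tier
    if reachable is not None:
        params["reachable"] = "true" if reachable else "false"
    if min_followers is not None:
        params["min_followers"] = min_followers
    return f"{BASE_URL}?{urlencode(params)}"


def auth_headers():
    """Read the key from the environment so it never appears in source code."""
    return {"x-api-key": os.environ["CRAWLORA_API_KEY"]}
```

The returned URL and headers can be passed to any HTTP client; failing fast on the client avoids spending a rate-limited request on a call that would return 400.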
Endpoint usage is metered in credits. The plan prices, included credits, limits, and overage rates below match the active backend billing configuration.
| Plan | Price | Included credits | Daily cap | Rate limit | Overage |
|---|---|---|---|---|---|
| Free | $0/mo | 2,000 | 500 daily credits | 5/min | No overage |
| Starter | $9/mo | 20,000 | 5,000 daily credits | 15/min | $0.75/1,000 overage credits when enabled |
| Growth | $29/mo | 100,000 | 25,000 daily credits | 45/min | $0.45/1,000 overage credits when enabled |
| Pro | $79/mo | 400,000 | No daily cap | 120/min | $0.30/1,000 overage credits |
| Business | $199/mo | 1,200,000 | No daily cap | 300/min | $0.20/1,000 overage credits |
| Enterprise | $499/mo | 5,000,000 | No daily cap | 1,000/min | $0.12/1,000 overage credits |
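The plan table can be turned into a rough budgeting helper. This sketch makes several assumptions that the page does not state: overage is billed pro rata per credit rather than in blocks of 1,000, overage is enabled on Starter and Growth, and daily caps are ignored. The names `PLANS`, `estimate_monthly_cost`, and `min_seconds_between_requests` are hypothetical.

```python
# (monthly price in USD, included credits, overage USD per 1,000 credits, requests per minute)
PLANS = {
    "Free":       (0,   2_000,     None, 5),
    "Starter":    (9,   20_000,    0.75, 15),
    "Growth":     (29,  100_000,   0.45, 45),
    "Pro":        (79,  400_000,   0.30, 120),
    "Business":   (199, 1_200_000, 0.20, 300),
    "Enterprise": (499, 5_000_000, 0.12, 1_000),
}


def estimate_monthly_cost(plan: str, credits_used: int) -> float:
    """Plan price plus linear overage for credits beyond the included amount."""
    price, included, overage_per_1000, _ = PLANS[plan]
    extra = max(0, credits_used - included)
    if extra == 0:
        return float(price)
    if overage_per_1000 is None:
        # The Free plan has no overage, so usage beyond the included credits is not billable.
        raise ValueError(f"{plan} plan has no overage; {extra} credits exceed the allowance")
    return price + extra / 1000 * overage_per_1000


def min_seconds_between_requests(plan: str) -> float:
    """Even spacing that stays within the plan's per-minute rate limit."""
    return 60 / PLANS[plan][3]
```

For example, 150,000 credits on Growth comes to 29 + 50 × 0.45 = 51.50 under these assumptions, and a Free-plan client would space requests 12 seconds apart.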
This endpoint reads stored indexed dataset records. It does not execute a live upstream request, browser session, or proxy-routed scraping job.
- Results are sorted by ascending distance; each item carries `distance_m` (meters from the query point).
- `lat`, `lon`, and `radius_m` are all required.
- The maximum result window is `10000`; `page * page_size` must not exceed `10000`.
- Only users with a geocoded location are returned.
- Returns an empty `items` array (not an error) when no users fall within the radius.
- Does not trigger live scraping.

Example response:

```json
{
  "code": 200,
  "msg": "OK",
  "data": {
    "dataset": "github-users",
    "items": [
      {
        "login": "octodev",
        "name": "Octo Dev",
        "influence_tier": "mid",
        "geo": {
          "country": "United States",
          "city": "San Francisco"
        },
        "followers": 1200,
        "reachable": true,
        "distance_m": 1840.5
      }
    ],
    "page": 1,
    "page_size": 20,
    "total": 1,
    "sort": "distance_asc"
  }
}
```
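The paging and ordering rules above suggest a simple client-side loop. The sketch below assumes that `total` is the count of all matching records (the page does not define it) and uses hypothetical helper names; it decides whether another page can be requested without exceeding the 10000-record window.

```python
MAX_WINDOW = 10_000  # documented limit on page * page_size


def next_page(data: dict):
    """Return the next page number to request, or None when paging should stop."""
    page, size, total = data["page"], data["page_size"], data["total"]
    if not data.get("items"):
        return None  # an empty items array is a normal "no more results" signal
    if page * size >= total:
        return None  # assumption: total counts every matching record
    if (page + 1) * size > MAX_WINDOW:
        return None  # the next request would fall outside the result window
    return page + 1


def is_sorted_by_distance(items: list) -> bool:
    """Check the documented ordering: distance_m ascending."""
    distances = [item["distance_m"] for item in items]
    return all(a <= b for a, b in zip(distances, distances[1:]))
```

If more than 10000 users match, one option is to shrink `radius_m` or add filters such as `min_followers` so that each query fits inside the window.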
Crawlora does not silently return invalid dataset search results when filters, pagination, coordinates, or stored record lookups cannot be satisfied.
| Status | Common failure case |
|---|---|
| 400 | Invalid input, missing required parameter, invalid enum, bad coordinate pair, or result window beyond the dataset limit |
| 404 | Requested stored dataset item is not present |
| 429 | Plan or endpoint rate limit exceeded |
| 500 | Internal dataset query or storage error |
When possible, Crawlora returns structured error context so your integration can adjust filters, page size, location inputs, or lookup identifiers.
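One way an integration might act on these status codes is sketched below. The retry policy is an assumption, not documented behavior: it treats 429 and 500 as retryable with capped exponential backoff, and treats 400 and 404 as errors the caller must fix. All function names are hypothetical.

```python
RETRYABLE = {429, 500}  # assumption: rate limits and server errors may succeed later


def should_retry(status: int) -> bool:
    """400 and 404 will not succeed on retry; 429 and 500 might."""
    return status in RETRYABLE


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff in seconds: base, 2*base, 4*base, ... capped at `cap`."""
    return min(cap, base * (2 ** attempt))


def describe_failure(status: int) -> str:
    """Map a documented status code to the action suggested on this page."""
    actions = {
        400: "fix filters, enum values, coordinates, or the page window",
        404: "the requested stored dataset item is not present",
        429: "slow down; the plan or endpoint rate limit was exceeded",
        500: "internal dataset query or storage error; retry later",
    }
    return actions.get(status, "unexpected status")
```

A caller would typically sleep for `backoff_delay(attempt)` between attempts and give up after a small, fixed number of retries.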
| Status | Description | Schema |
|---|---|---|
| 400 | Bad Request | #/definitions/app.Response |
| 429 | Too Many Requests | #/definitions/app.Response |
| 500 | Internal Server Error | #/definitions/app.Response |
Request schema
No body schema
Response schema
#/definitions/datasets.githubUsersSearchResponseDoc
| Field | Type | Required | Enum | Bounds | Example | Description |
|---|---|---|---|---|---|---|
| code | integer | No | | | 200 | |
| data | datasets.GithubUserSearchResponse | No | ||||
| data.dataset | string | No | ||||
| data.items | array | No | ||||
| data.items[].account_age_years | number | No | ||||
| data.items[].active_90d | boolean | No | ||||
| data.items[].avatar_url | string | No | ||||
| data.items[].bio | string | No | ||||
| data.items[].blog | string | No | ||||
| data.items[].company | string | No | ||||
| data.items[].company_normalized | string | No | ||||
| data.items[].crawled_at | string | No | ||||
| data.items[].created_at | string | No | ||||
| data.items[].distance_m | number | No | ||||
| data.items[].domains | array | No | ||||
| data.items[].email | string | No | ||||
| data.items[].follower_following_ratio | number | No | ||||
| data.items[].followers | integer | No | ||||
| data.items[].following | integer | No | ||||
| data.items[].geo | es.GithubGeo | No | ||||
| data.items[].geo.city | string | No | ||||
| data.items[].geo.country | string | No | ||||
| data.items[].geo.country_code | string | No | ||||
| data.items[].geo.location | es.GithubGeoPoint | No | ||||
| data.items[].geo.location.lat | number | No | ||||
| data.items[].geo.location.lon | number | No | ||||
| data.items[].geo.state | string | No | ||||
| data.items[].has_blog | boolean | No | ||||
| data.items[].has_email | boolean | No | ||||
| data.items[].has_twitter | boolean | No | ||||
| data.items[].hireable | boolean | No | ||||
| data.items[].html_url | string | No | ||||
| data.items[].id | integer | No | ||||
| data.items[].influence_tier | string | No | nano, micro, mid, macro, mega | | | |
| data.items[].is_bot | boolean | No | ||||
| data.items[].is_org | boolean | No | ||||
| data.items[].last_active_at | string | No | ||||
| data.items[].location_raw | string | No | ||||
| data.items[].login | string | No | ||||
| data.items[].name | string | No | ||||
| data.items[].prs_30d | integer | No | ||||
| data.items[].public_gists | integer | No | ||||
| data.items[].public_repos | integer | No | ||||
| data.items[].pushes_30d | integer | No | ||||
| data.items[].rank_score | integer | No | ||||
| data.items[].reachable | boolean | No | ||||
| data.items[].reviews_30d | integer | No | ||||
| data.items[].schema_version | integer | No | ||||
| data.items[].social_accounts | array | No | ||||
| data.items[].social_accounts[].provider | string | No | ||||
| data.items[].social_accounts[].url | string | No | ||||
| data.items[].social_count | integer | No | ||||
| data.items[].twitter_username | string | No | ||||
| data.items[].type | string | No | ||||
| data.page | integer | No | ||||
| data.page_size | integer | No | ||||
| data.sort | string | No | ||||
| data.total | integer | No | ||||
| msg | string | No | | | OK | |
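Because the schema marks every field as not required, client code should tolerate missing keys. The following sketch maps one item onto a small dataclass covering only a subset of the fields listed above; the class name `NearbyUser` and function `parse_user` are hypothetical, and absent fields simply become `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NearbyUser:
    # Only a subset of the documented fields; every one may be absent.
    login: Optional[str] = None
    name: Optional[str] = None
    influence_tier: Optional[str] = None
    followers: Optional[int] = None
    reachable: Optional[bool] = None
    distance_m: Optional[float] = None
    city: Optional[str] = None
    country: Optional[str] = None


def parse_user(item: dict) -> NearbyUser:
    """Build a NearbyUser from one entry of data.items without raising on missing keys."""
    geo = item.get("geo") or {}  # geo itself is optional
    return NearbyUser(
        login=item.get("login"),
        name=item.get("name"),
        influence_tier=item.get("influence_tier"),
        followers=item.get("followers"),
        reachable=item.get("reachable"),
        distance_m=item.get("distance_m"),
        city=geo.get("city"),
        country=geo.get("country"),
    )
```

Usage would look like `users = [parse_user(i) for i in body["data"]["items"]]`, where `body` is the decoded JSON response.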
Use environment variables for secrets and keep Crawlora API keys server-side.
curl -X GET "https://api.crawlora.net/api/v1/datasets/github-users/nearby?lat=37.7749&lon=-122.4194&radius_m=5000&reachable=true&page=1" \
  -H "x-api-key: $CRAWLORA_API_KEY"

Crawlora is designed for responsible structured public web data workflows. Customers are responsible for using Crawlora in compliance with applicable laws, third-party rights, target-platform rules, and Crawlora terms.
Read Crawlora terms