Datasets API

The Datasets API provides programmatic access to structured data collections.

All endpoints require an apiKey parameter (query param or in the request body). Your plan must include API access.

Data is billed per row with a minimum of 50 rows per request.

Premium Fields: If your plan does not include premium field access, those fields will return **HIDDEN** in the response data.

Important: Before making API requests, we strongly recommend exploring the dataset in the Datasets UI first. Use the filters and preview to ensure you're selecting exactly the data you need — this helps avoid purchasing unwanted rows.

The API follows an asynchronous request-response pattern:

  1. Request — Submit a dataset request with filters and format preferences
  2. Poll — Check request status until processing completes
  3. Download — Stream the data in your preferred format

Browse all available datasets and their schemas on the Datasets page.

// 1. Create a dataset request
const response = await fetch('https://rebrowser.net/api/datasets/requests/create', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    apiKey: 'YOUR_API_KEY',
    datasetSlug: 'iheart',
    entitySlug: 'stations',
    format: 'json',
    page: null,         // null = export all pages; omit or set number for single page
    allowPurchase: true,
  }),
})
const { datasetRequestId } = await response.json()

// 2. Poll for completion
let status
do {
  await new Promise(r => setTimeout(r, 5000))
  const statusRes = await fetch(
    `https://rebrowser.net/api/datasets/requests/${datasetRequestId}?apiKey=YOUR_API_KEY`
  )
  status = await statusRes.json()
} while (status.status === 'pending' || status.status === 'processing')

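// 3. Download the data (stop here if processing failed)
if (status.status !== 'success') throw new Error(status.error ?? 'Dataset request failed')
// downloadUrl is a relative path in the status response, so resolve it against the API origin
const downloadUrl = new URL(status.downloadUrl, 'https://rebrowser.net')
downloadUrl.searchParams.set('apiKey', 'YOUR_API_KEY')
const dataRes = await fetch(downloadUrl)
const rows = await dataRes.json()
console.log(rows)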

API Endpoints

POST /api/datasets/requests/create

Create a new dataset export request. Returns either a preview (with cost estimate) or creates a purchase request.

Request Body (JSON)

apiKey
Your API key from Dashboard / API.

datasetSlug
Dataset identifier (e.g., iheart, seatgeek, copart).

entitySlug
Entity within the dataset (e.g., stations, events, listingDetails).

format
Output format: json, jsonl, csv, or parquet. Format availability depends on your plan. Higher-tier plans include access to JSONL and Parquet formats.

page
  • null: export all pages (the entire dataset matching the filters)
  • 1, 2, ...: export a single page of results

If omitted, defaults to 1 (first page only). You must explicitly pass null to export all pages.

viewSettings
Object that specifies filters, sort, and page size. Structure:

  • filters: Filter rules (see Filtering & Sorting below)
  • sort: Sort configuration with field and dir (ASC or DESC)
  • pageSize: Number of rows per page (default: 50)
  • page: Page number for pagination (used when the page parameter is a number)

Note: Premium fields used in filters require premium plan access. Requests with premium field filters will be rejected if your plan doesn't include premium field access.

allowPurchase
Set to true to automatically purchase unowned rows and create the export request. When false or omitted, the endpoint returns a preview with a cost estimate only.

Response (Preview Mode)

When allowPurchase is not set or false:

{
  "success": true,
  "meta": {
    "datasetSlug": "iheart",
    "entitySlug": "stations",
    "format": "json",
    "totalRows": 15420,
    "ownedRows": 1000,
    "freeRows": 5000,
    "unownedRows": 9420,
    "estimatedCost": 47.10,
    "hasFreeAccess": true
  },
  "message": "Set allowPurchase: true to purchase 9,420 missing rows for $47.10 and create the export request.",
  "rows": [
    {
      "stationId": "1234",
      "name": "Station Name",
      "premiumField": "**HIDDEN**"  // Premium fields show this value if plan doesn't include premium access
    }
  ]
}
Response (Purchase Mode)

When allowPurchase=true:

{
  "success": true,
  "meta": {
    "datasetSlug": "iheart",
    "entitySlug": "stations",
    "format": "json",
    "totalRows": 15420,
    "ownedRows": 1000,
    "freeRows": 5000,
    "unownedRows": 9420,
    "estimatedCost": 47.10
  },
  "datasetRequestId": "507f1f77bcf86cd799439011",
  "status": "pending",
  "message": "Dataset request created. Poll GET /api/datasets/requests/507f1f77bcf86cd799439011 for status."
}
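
The two response modes suggest a simple guard: ask for a preview first, compare meta.estimatedCost with a budget, and only then resend the same body with allowPurchase: true. The following is a minimal sketch under assumptions, not an official client: it assumes a runtime with a global fetch (for example Node 18+), and the names withinBudget, postCreate, createWithBudget, and maxCost are hypothetical helpers introduced here for illustration.

```javascript
const CREATE_URL = 'https://rebrowser.net/api/datasets/requests/create'

// Pure check: is the preview successful and within the caller's budget?
function withinBudget(preview, maxCost) {
  return preview.success === true && preview.meta.estimatedCost <= maxCost
}

// POST a request body to the create endpoint and return the parsed JSON.
async function postCreate(body) {
  const res = await fetch(CREATE_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  })
  return res.json()
}

// Preview first; purchase only if the estimate fits within maxCost (hypothetical budget).
async function createWithBudget(params, maxCost) {
  const preview = await postCreate({ ...params, allowPurchase: false })
  if (!withinBudget(preview, maxCost)) {
    throw new Error(`Not purchasing: ${preview.error ?? preview.message}`)
  }
  const purchase = await postCreate({ ...params, allowPurchase: true })
  return purchase.datasetRequestId
}
```

Keeping the budget check in a separate pure function makes it easy to test without touching the network or spending credits.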

GET /api/datasets/requests/{datasetRequestId}

Check the status of a dataset request. Poll this endpoint until status is success or error.

Parameters

datasetRequestId (path)
MongoDB ObjectId returned from the create request endpoint.

apiKey (query)
Your API key.

Response
{
  "success": true,
  "requestId": "507f1f77bcf86cd799439011",
  "status": "success",
  "datasetSlug": "iheart",
  "entitySlug": "stations",
  "format": "json",
  "page": null,
  "estimatedRows": 15420,
  "estimatedCost": 47.10,
  "createdAt": "2026-02-05T10:30:00Z",
  "updatedAt": "2026-02-05T10:32:15Z",
  "completedAt": "2026-02-05T10:32:15Z",
  "rowCount": 15420,
  "expiresAt": "2026-02-12T10:32:15Z",
  "downloadUrl": "/api/datasets/requests/.../download?format=json",
  "downloadUrls": {
    "json": "/api/datasets/requests/.../download?format=json",
    "jsonl": "/api/datasets/requests/.../download?format=jsonl",
    "csv": "/api/datasets/requests/.../download?format=csv",
    "parquet": "/api/datasets/requests/.../download?format=parquet"
  }
}

Status Values

Status | Description
pending | Request queued for processing
processing | Currently being processed (includes progress field 0-100)
success | Ready for download
error | Processing failed (includes error field)
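
The four status values map naturally onto three client actions: keep waiting, download, or fail. A minimal sketch of that mapping follows; the function name nextAction and the action labels are hypothetical, and only the status, progress, error, and downloadUrl fields come from the responses documented here.

```javascript
// Map a status response to the next step a polling client should take.
function nextAction(statusResponse) {
  switch (statusResponse.status) {
    case 'pending':
      return { action: 'wait', detail: 'queued' }
    case 'processing':
      // progress is documented as 0-100 while processing; default to 0 if absent
      return { action: 'wait', detail: `${statusResponse.progress ?? 0}% done` }
    case 'success':
      return { action: 'download', detail: statusResponse.downloadUrl }
    case 'error':
      return { action: 'fail', detail: statusResponse.error }
    default:
      // Unknown status: fail rather than poll forever
      return { action: 'fail', detail: `unexpected status: ${statusResponse.status}` }
  }
}
```

Treating unknown values as failures is a design choice made for this sketch: it keeps a polling loop from running indefinitely if the response shape changes.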

GET /api/datasets/requests/{datasetRequestId}/download

Stream the completed dataset. Returns the data file directly.

Parameters

datasetRequestId (path)
Request ID from the status endpoint.

format (query)
json, jsonl, csv, or parquet.

apiKey (query)
Your API key.

Response Headers

Header | Description
Content-Type | application/json, text/csv, or application/octet-stream
Content-Disposition | attachment; filename="dataset_entity_requestId.ext"
X-Request-Rows | Total row count
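
To illustrate the download parameters and the X-Request-Rows header, here is a hedged sketch that builds the download URL and reads the row count from the response. The path shape is inferred from the downloadUrl values in the status response, and buildDownloadUrl and downloadText are hypothetical helper names; a global fetch (Node 18+ or a browser) is assumed.

```javascript
const API_ORIGIN = 'https://rebrowser.net'
const FORMATS = ['json', 'jsonl', 'csv', 'parquet']

// Build the download URL for a completed request (path shape is an assumption).
function buildDownloadUrl(requestId, format, apiKey) {
  if (!FORMATS.includes(format)) {
    throw new Error(`Unsupported format: ${format}`)
  }
  const path = `/api/datasets/requests/${encodeURIComponent(requestId)}/download`
  const url = new URL(path, API_ORIGIN)
  url.searchParams.set('format', format)
  url.searchParams.set('apiKey', apiKey)
  return url.toString()
}

// Fetch a text format (json, jsonl, csv); for parquet use res.arrayBuffer() instead.
async function downloadText(requestId, format, apiKey) {
  const res = await fetch(buildDownloadUrl(requestId, format, apiKey))
  if (!res.ok) throw new Error(`Download failed with HTTP ${res.status}`)
  const rowCount = Number(res.headers.get('X-Request-Rows'))
  return { rowCount, body: await res.text() }
}
```

Comparing rowCount with the number of parsed records is a cheap way to detect a truncated download.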

Data Formats

Premium Fields

Some datasets include premium fields that require a premium plan subscription. When accessing data:

  • Without premium access: Premium fields will contain the value **HIDDEN** in all responses
  • With premium access: Premium fields will contain the actual data values
  • Filtering by premium fields: Requests with filters on premium fields will be rejected if your plan doesn't include premium access
  • Free dataset rows: Premium fields are fully visible in free dataset rows regardless of plan tier

Check each dataset's field list in the UI to identify which fields require premium access.

JSON

Returns data as a JSON array of objects. Each row contains the full record with all fields.

[
  {"stationId":"1234","name":"KIIS-FM","market":"Los Angeles","genres":["Pop","Top 40"]},
  {"stationId":"5678","name":"KROQ-FM","market":"Los Angeles","genres":["Alternative","Rock"]}
]

Best for: General programmatic access, maximum flexibility

JSONL

Returns data as newline-delimited JSON. Each line is a standalone JSON object, with no wrapping array and no commas between records.

{"stationId":"1234","name":"KIIS-FM","market":"Los Angeles","genres":["Pop","Top 40"]}
{"stationId":"5678","name":"KROQ-FM","market":"Los Angeles","genres":["Alternative","Rock"]}

Best for: Streaming pipelines, line-by-line processing, large datasets

Plan requirement: Business plan and above
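
Because each JSONL line is a standalone object, a downloaded file can be parsed line by line instead of with a single JSON.parse call. A minimal sketch, assuming the whole file is already in memory as a string (parseJsonl is a hypothetical helper name; for very large files a streaming line reader would be the better fit):

```javascript
// Parse newline-delimited JSON into an array of objects, skipping blank lines.
function parseJsonl(text) {
  return text
    .split('\n')
    .map(line => line.trim()) // also strips a trailing \r from CRLF files
    .filter(line => line.length > 0)
    .map(line => JSON.parse(line))
}

const sample =
  '{"stationId":"1234","name":"KIIS-FM","market":"Los Angeles","genres":["Pop","Top 40"]}\n' +
  '{"stationId":"5678","name":"KROQ-FM","market":"Los Angeles","genres":["Alternative","Rock"]}\n'

const stations = parseJsonl(sample)
console.log(stations.length, stations[1].name)
```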

CSV

Standard comma-separated values with a header row.

Best for: Excel, Google Sheets, pandas, general data analysis

Parquet

Apache Parquet columnar format with zstd compression.

Best for: Big data pipelines, Spark, Athena, data lakes, efficient storage

Plan requirement: Business plan and above

Billing & Pricing

Rule | Description
Minimum | 50 rows minimum per request
Rounding | All purchases rounded up to nearest 50 rows
Rate | Varies by plan (see your plan details for pricing)

billableRows = Math.ceil(unownedRows / 50) * 50
totalCost = (billableRows / 1000) * ratePer1kRows

// Examples:
// 1 row → 50 billable rows
// 51 rows → 100 billable rows
// 235 rows → 250 billable rows

State | Description | Billing
OWNED | Previously purchased (within 30 days) | Not charged
FREE | Within free access window | Not charged
UNOWNED | Requires purchase | Charged

  • Row retention: Purchased rows expire after 30 days. Re-downloading after expiration requires a new purchase.
  • Request expiration: Completed download requests expire after 7 days.
  • Free access: Some datasets offer free access for older data (configurable per dataset).

Filtering & Sorting

Use the viewSettings object in the request body to specify filters, sort order, and page size:

const response = await fetch('https://rebrowser.net/api/datasets/requests/create', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    apiKey: 'YOUR_API_KEY',
    datasetSlug: 'iheart',
    entitySlug: 'stations',
    format: 'json',
    page: null,  // null = all pages, 1/2/3... = specific page
    viewSettings: {
      filters: {
        conjunction: 'and',
        rules: [
          { type: 'rule', field: 'market', operator: 'is', value: 'Los Angeles' },
          { type: 'rule', field: 'genres', operator: 'contains', value: 'Pop' },
        ],
      },
      sort: {
        field: 'cume',
        dir: 'DESC',
      },
      pageSize: 100,
    },
    allowPurchase: true,
  }),
})

Available operators by field type:

Text fields: is, isNot, isAnyOf, isNoneOf, contains, notContains, startsWith, endsWith, isEmpty, isNotEmpty

Number fields: eq, neq, gt, gte, lt, lte

DateTime fields: is, before, after

Boolean fields: isTrue, isFalse
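
Since operators are tied to field types, it can help to validate a rule on the client before sending it. The sketch below builds objects in the rule shape used in the viewSettings example and rejects operators that do not match the field type. It rests on assumptions: the helper names rule and allOf are hypothetical, the field type is supplied by the caller (the API itself does not take it), and omitting value for operators such as isEmpty or isTrue is a guess rather than documented behavior.

```javascript
// Operators per field type, as listed in this section.
const OPERATORS = {
  text: ['is', 'isNot', 'isAnyOf', 'isNoneOf', 'contains', 'notContains',
    'startsWith', 'endsWith', 'isEmpty', 'isNotEmpty'],
  number: ['eq', 'neq', 'gt', 'gte', 'lt', 'lte'],
  datetime: ['is', 'before', 'after'],
  boolean: ['isTrue', 'isFalse'],
}

// Build one filter rule, throwing if the operator is not valid for the field type.
function rule(fieldType, field, operator, value) {
  const allowed = OPERATORS[fieldType]
  if (!allowed) throw new Error(`Unknown field type: ${fieldType}`)
  if (!allowed.includes(operator)) {
    throw new Error(`Operator "${operator}" is not valid for ${fieldType} fields`)
  }
  const built = { type: 'rule', field, operator }
  if (value !== undefined) built.value = value // assumption: value-less operators omit it
  return built
}

// Combine rules with the 'and' conjunction shown in the request example.
function allOf(...rules) {
  return { conjunction: 'and', rules }
}

const filters = allOf(
  rule('text', 'market', 'is', 'Los Angeles'),
  rule('text', 'genres', 'contains', 'Pop'),
)
console.log(JSON.stringify(filters, null, 2))
```

The resulting filters object can be placed under viewSettings.filters in the create request body.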

Error Handling

Code | Description
200 | Success
400 | Bad Request: invalid parameters, invalid format, request not ready
401 | Unauthorized: missing or invalid API key
402 | Payment Required: insufficient balance or no active plan
403 | Forbidden: plan restrictions (no API access, format not allowed, premium fields required)
404 | Not Found: dataset, entity, view, or request not found
410 | Gone: request has expired
500 | Internal Server Error

{
  "success": false,
  "error": "Detailed error message"
}

Error | Cause | Solution
API key required | Missing API key | Add apiKey parameter
Your plan does not include API access | Plan doesn't allow API | Upgrade plan
Insufficient balance | Not enough credits | Add funds or reduce request size
No active plan | No active subscription | Subscribe to a plan
Dataset request has expired | Download link expired (7 days) | Create new request
Dataset request is not ready | Trying to download before completion | Poll status endpoint until success
Your plan does not support [format] downloads | Format not included in plan | Upgrade plan or use a different format (CSV and JSON are available on all plans)
Filtering by "[field]" requires a premium subscription | Premium field used in filter without premium access | Remove premium field filters or upgrade to premium plan
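
One way to act on these codes in client code is to turn a non-success response into a thrown error that carries a suggested next step. This is a sketch only: the action labels and the helper names suggestedAction and unwrap are hypothetical, while the status codes and the { success, error } body shape come from the tables above.

```javascript
// Suggested next step per documented HTTP status code (labels are illustrative).
const ACTIONS = {
  400: 'fix-request',
  401: 'check-api-key',
  402: 'add-funds-or-subscribe',
  403: 'upgrade-plan',
  404: 'check-identifiers',
  410: 'create-new-request',
  500: 'retry-later',
}

function suggestedAction(httpStatus) {
  return ACTIONS[httpStatus] ?? 'unknown'
}

// Return the body on success; otherwise throw an Error annotated with status and action.
function unwrap(httpStatus, body) {
  if (httpStatus === 200 && body.success) return body
  const err = new Error(body.error ?? `HTTP ${httpStatus}`)
  err.httpStatus = httpStatus
  err.action = suggestedAction(httpStatus)
  throw err
}
```

A caller would typically pass res.status and the parsed JSON body, then branch on err.action in a catch block.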

Best Practices

  • Preview First: Always test your filters in the Datasets UI before making API requests to ensure you're selecting the right data.
  • Use Preview Mode: Call without allowPurchase to see cost estimates and preview rows before committing to a purchase.
  • Check Premium Fields: Review which fields are marked as premium in the dataset UI. Premium fields will show **HIDDEN** values unless your plan includes premium access.
  • Export All vs. Single Page: Pass page: null to export all matching rows. If page is omitted, only page 1 is returned — this is a safety default to prevent accidental large purchases.
  • Paginate Large Requests: For large datasets, start with a single page to test your filters and control costs before exporting all pages.
  • Handle Polling Gracefully: Implement exponential backoff when polling for status (start at 5s, increase to 30s max).
  • Choose the Right Format: Use JSON or CSV for general use. Use JSONL for streaming large datasets. Use Parquet for data warehouse integrations.
  • Monitor Row Expiration: Track when purchased rows expire (30 days) if you need ongoing access.
  • Download Before Expiration: Download completed requests within 7 days before they expire.
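
The polling advice above (start at 5s, cap at 30s) can be sketched as follows. The doubling factor is an assumption, since only the start and maximum are specified; backoffDelays and pollUntilDone are hypothetical names, and fetchStatus stands for any caller-supplied function that returns the parsed status response.

```javascript
// Delay schedule in milliseconds: doubles each attempt, capped at maxMs.
function backoffDelays(attempts, startMs = 5000, maxMs = 30000) {
  return Array.from({ length: attempts }, (_, i) => Math.min(startMs * 2 ** i, maxMs))
}

// Poll until the request succeeds or fails, waiting longer between attempts.
async function pollUntilDone(fetchStatus, maxAttempts = 20) {
  for (const delay of backoffDelays(maxAttempts)) {
    await new Promise(resolve => setTimeout(resolve, delay))
    const status = await fetchStatus()
    if (status.status === 'success') return status
    if (status.status === 'error') throw new Error(status.error)
  }
  throw new Error('Gave up waiting for the dataset request')
}

console.log(backoffDelays(5)) // [ 5000, 10000, 20000, 30000, 30000 ]
```

Capping the number of attempts keeps a stuck request from blocking the caller indefinitely.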