Datasets API

The Datasets API provides programmatic access to structured data collections.

All endpoints require an apiKey parameter (query param or in the request body). Your plan must include API access.

Data is billed per row with a minimum of 50 rows per request.

Premium Fields: If your plan does not include premium field access, those fields will return **HIDDEN** in the response data.

Important: Before making API requests, we strongly recommend exploring the dataset in the Datasets UI first. Use the filters and preview to ensure you're selecting exactly the data you need — this helps avoid purchasing unwanted rows.

Overview

The API follows an asynchronous request-response pattern:

1. Request: Submit a dataset request with filters and format preferences.
2. Poll: Check the request status until processing completes.
3. Download: Stream the data in your preferred format.

Browse all available datasets and their schemas on the Datasets page.

Quick Example

const BASE_URL = 'https://rebrowser.net'
const API_KEY = 'YOUR_API_KEY'

// 1. Create a dataset request
const response = await fetch(`${BASE_URL}/api/datasets/requests/create`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    apiKey: API_KEY,
    datasetSlug: 'iheart',
    entitySlug: 'stations',
    format: 'json',
    page: null,         // null = export all pages; omit or set a number for a single page
    allowPurchase: true,
  }),
})
const created = await response.json()
if (!created.success) throw new Error(created.error)
const { datasetRequestId } = created

// 2. Poll every 5 seconds while the request is pending or processing
let status
do {
  await new Promise(r => setTimeout(r, 5000))
  const statusRes = await fetch(
    `${BASE_URL}/api/datasets/requests/${datasetRequestId}?apiKey=${API_KEY}`
  )
  status = await statusRes.json()
} while (status.status === 'pending' || status.status === 'processing')
if (status.status !== 'success') throw new Error(status.error)

// 3. Download the data (downloadUrl may be a relative path, so resolve it against BASE_URL)
const downloadUrl = new URL(status.downloadUrl, BASE_URL)
downloadUrl.searchParams.set('apiKey', API_KEY)
const dataRes = await fetch(downloadUrl)
const rows = await dataRes.json()
console.log(rows)

API Endpoints

POST /api/datasets/requests/create

Create a new dataset export request. Returns either a preview (with cost estimate) or creates a purchase request.

Request Body (JSON)

apiKey (string, required)

Your API key from Dashboard / API

datasetSlug (string, required)

Dataset identifier (e.g., iheart, seatgeek, copart)

entitySlug (string, required)

Entity within dataset (e.g., stations, events, listingDetails)

format (string, default: json)

Output format: json, jsonl, csv, or parquet

Format availability depends on your plan. Higher-tier plans include access to JSONL and Parquet formats.

page (number | null, default: 1)

null — export all pages (entire dataset matching filters)

1, 2, ... — export a single page of results

If omitted, defaults to 1 (first page only). You must explicitly pass null to export all pages.

viewSettings (object)

Object to specify filters, sort, and page size. Structure:

filters — Filter rules (see Filtering & Sorting below)
sort — Sort configuration with field and dir (ASC or DESC)
pageSize — Number of rows per page (default: 50)
page — Page number for pagination (used when page parameter is a number)

Note: Premium fields used in filters require premium plan access. Requests with premium field filters will be rejected if your plan doesn't include premium field access.

allowPurchase (boolean, default: false)

Set to true to automatically purchase unowned rows and create the export request.

When false, returns a preview with cost estimate only.

Response (Preview Mode)

When allowPurchase is not set or false:

{
  "success": true,
  "meta": {
    "datasetSlug": "iheart",
    "entitySlug": "stations",
    "format": "json",
    "totalRows": 15420,
    "ownedRows": 1000,
    "freeRows": 5000,
    "unownedRows": 9420,
    "estimatedCost": 47.10,
    "hasFreeAccess": true
  },
  "message": "Set allowPurchase: true to purchase 9,420 missing rows for $47.10 and create the export request.",
  "rows": [
    {
      "stationId": "1234",
      "name": "Station Name",
      "premiumField": "**HIDDEN**"
    }
  ]
}

Premium fields show the value **HIDDEN** if your plan doesn't include premium access.
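
The preview-then-purchase flow can be sketched as follows. This is a minimal illustration, not an official client: `createWithBudget`, `postJson`, and the `maxCost` threshold are hypothetical names, and `fetchImpl` is injectable so the function can be tried without network access. It assumes `meta.estimatedCost` is expressed in the same unit as `maxCost`.

```javascript
// Hypothetical helpers; only the endpoint URL and field names come from this page.
const CREATE_URL = 'https://rebrowser.net/api/datasets/requests/create'

async function postJson(fetchImpl, body) {
  const res = await fetchImpl(CREATE_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  })
  return res.json()
}

async function createWithBudget(params, maxCost, fetchImpl = fetch) {
  // Step 1: preview mode returns meta.estimatedCost without buying anything.
  const preview = await postJson(fetchImpl, { ...params, allowPurchase: false })
  if (!preview.success) throw new Error(preview.error)

  // Stop here when the estimate is above the caller's limit.
  if (preview.meta.estimatedCost > maxCost) {
    return { purchased: false, meta: preview.meta }
  }

  // Step 2: same parameters, now with allowPurchase: true to create the export.
  const created = await postJson(fetchImpl, { ...params, allowPurchase: true })
  if (!created.success) throw new Error(created.error)
  return {
    purchased: true,
    meta: created.meta,
    datasetRequestId: created.datasetRequestId,
  }
}
```

Calling `createWithBudget({ apiKey, datasetSlug: 'iheart', entitySlug: 'stations', page: null }, 50)` would only create the export when the preview estimate is 50 or less.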

Response (Purchase Mode)

When allowPurchase=true:

{
  "success": true,
  "meta": {
    "datasetSlug": "iheart",
    "entitySlug": "stations",
    "format": "json",
    "totalRows": 15420,
    "ownedRows": 1000,
    "freeRows": 5000,
    "unownedRows": 9420,
    "estimatedCost": 47.10
  },
  "datasetRequestId": "507f1f77bcf86cd799439011",
  "status": "pending",
  "message": "Dataset request created. Poll GET /api/datasets/requests/507f1f77bcf86cd799439011 for status."
}

GET /api/datasets/requests/:requestId

Check the status of a dataset request. Poll this endpoint until status is success or error.

Parameters

requestId (string, required)

MongoDB ObjectId returned from the create request endpoint

apiKey (string, required)

Your API key (query param)

Response

{
  "success": true,
  "requestId": "507f1f77bcf86cd799439011",
  "status": "success",
  "datasetSlug": "iheart",
  "entitySlug": "stations",
  "format": "json",
  "page": null,
  "estimatedRows": 15420,
  "estimatedCost": 47.10,
  "createdAt": "2026-02-05T10:30:00Z",
  "updatedAt": "2026-02-05T10:32:15Z",
  "completedAt": "2026-02-05T10:32:15Z",
  "rowCount": 15420,
  "expiresAt": "2026-02-12T10:32:15Z",
  "downloadUrl": "/api/datasets/requests/.../download?format=json",
  "downloadUrls": {
    "json": "/api/datasets/requests/.../download?format=json",
    "jsonl": "/api/datasets/requests/.../download?format=jsonl",
    "csv": "/api/datasets/requests/.../download?format=csv",
    "parquet": "/api/datasets/requests/.../download?format=parquet"
  }
}

Status Values

Status	Description
`pending`	Request queued for processing
`processing`	Currently being processed (includes `progress` field 0-100)
`success`	Ready for download
`error`	Processing failed (includes `error` field)
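
A polling loop over these status values might look like the sketch below. It applies the backoff suggested under Best Practices (start at 5 seconds, cap at 30 seconds). The names `pollUntilDone`, `backoffDelayMs`, and the `maxAttempts` limit are assumptions made for illustration, and `fetchImpl` and `sleep` are injectable so the loop can be exercised without real delays.

```javascript
// Hypothetical helpers; only the endpoint path and status values come from this page.
const defaultSleep = ms => new Promise(resolve => setTimeout(resolve, ms))

// Delay before poll number `attempt` (0-based): 5s, 10s, 20s, then capped at 30s.
function backoffDelayMs(attempt, baseMs = 5000, maxMs = 30000) {
  return Math.min(baseMs * 2 ** attempt, maxMs)
}

async function pollUntilDone(requestId, apiKey, options = {}) {
  const { fetchImpl = fetch, sleep = defaultSleep, maxAttempts = 60 } = options
  const url = `https://rebrowser.net/api/datasets/requests/${requestId}` +
    `?apiKey=${encodeURIComponent(apiKey)}`

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    await sleep(backoffDelayMs(attempt))
    const res = await fetchImpl(url)
    const status = await res.json()

    if (status.status === 'success') return status
    if (status.status === 'error' || status.success === false) {
      throw new Error(status.error || 'Dataset request failed')
    }
    // 'pending' or 'processing': wait and poll again.
  }
  throw new Error(`Request ${requestId} not finished after ${maxAttempts} polls`)
}
```

The returned object is the final status response, so `downloadUrl` and `rowCount` can be read from it directly.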

GET /api/datasets/requests/:requestId/download

Stream the completed dataset. Returns the data file directly.

Parameters

requestId (string, required)

Request ID from the status endpoint

format (string, default: json)

json, jsonl, csv, or parquet

apiKey (string, required)

Your API key (query param)

Response Headers

Header	Description
`Content-Type`	`application/json`, `text/csv`, or `application/octet-stream`
`Content-Disposition`	`attachment; filename="dataset_entity_requestId.ext"`
`X-Request-Rows`	Total row count
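
As an illustration of the parameters and headers above, the sketch below builds a download URL and reads the row count from a response. `buildDownloadUrl` and `rowCountFromHeaders` are hypothetical helper names, and the code assumes the `X-Request-Rows` header carries a plain integer.

```javascript
// Hypothetical helpers; the path, query parameters, and header name come from this page.
const BASE_URL = 'https://rebrowser.net'
const FORMATS = ['json', 'jsonl', 'csv', 'parquet']

function buildDownloadUrl(requestId, format, apiKey) {
  if (!FORMATS.includes(format)) throw new Error(`Unsupported format: ${format}`)
  const url = new URL(
    `/api/datasets/requests/${encodeURIComponent(requestId)}/download`,
    BASE_URL
  )
  url.searchParams.set('format', format)
  url.searchParams.set('apiKey', apiKey)
  return url.toString()
}

// Reads X-Request-Rows; returns null when the header is missing or not a number.
function rowCountFromHeaders(headers) {
  const raw = headers.get('X-Request-Rows')
  const n = raw == null ? NaN : Number(raw)
  return Number.isFinite(n) ? n : null
}
```

With a `fetch` response `res`, `rowCountFromHeaders(res.headers)` gives the total row count before the body is consumed.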

Data Formats

Premium Fields

Some datasets include premium fields that require a premium plan subscription. When accessing data:

Without premium access: Premium fields will contain the value **HIDDEN** in all responses
With premium access: Premium fields will contain the actual data values
Filtering by premium fields: Requests with filters on premium fields will be rejected if your plan doesn't include premium access
Free dataset rows: Premium fields are fully visible in free dataset rows regardless of plan tier

Check each dataset's field list in the UI to identify which fields require premium access.

JSON

Returns data as a JSON array of objects. Each row contains the full record with all fields.

[
  {"stationId":"1234","name":"KIIS-FM","market":"Los Angeles","genres":["Pop","Top 40"]},
  {"stationId":"5678","name":"KROQ-FM","market":"Los Angeles","genres":["Alternative","Rock"]}
]

Best for: General programmatic access, maximum flexibility

JSONL (JSON Lines)

Returns data as newline-delimited JSON. Each line is a standalone JSON object — no wrapping array, no commas between records.

{"stationId":"1234","name":"KIIS-FM","market":"Los Angeles","genres":["Pop","Top 40"]}
{"stationId":"5678","name":"KROQ-FM","market":"Los Angeles","genres":["Alternative","Rock"]}

Best for: Streaming pipelines, line-by-line processing, large datasets

Plan requirement: Business plan and above
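
Because every line is a complete JSON document, a JSONL payload can be parsed one line at a time. The helper below is a small sketch (`parseJsonl` is a hypothetical name) that works on text already held in memory; for very large files a streaming line reader would be the better fit.

```javascript
// Parses JSONL text into an array of objects, skipping blank lines.
function parseJsonl(text) {
  return text
    .split('\n')
    .map(line => line.trim())          // also removes a trailing \r from CRLF files
    .filter(line => line.length > 0)
    .map(line => JSON.parse(line))
}

const sample =
  '{"stationId":"1234","name":"KIIS-FM"}\n' +
  '{"stationId":"5678","name":"KROQ-FM"}\n'

console.log(parseJsonl(sample).map(row => row.name)) // [ 'KIIS-FM', 'KROQ-FM' ]
```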

CSV

Standard comma-separated values with header row.

Best for: Excel, Google Sheets, pandas, general data analysis

Parquet

Apache Parquet columnar format with zstd compression.

Best for: Big data pipelines, Spark, Athena, data lakes, efficient storage

Plan requirement: Business plan and above

Billing & Pricing

Row-Based Pricing

Rule	Description
Minimum	50 rows minimum per request
Rounding	All purchases rounded up to nearest 50 rows
Rate	Varies by plan (see your plan details for pricing)

Cost Calculation

billableRows = Math.ceil(unownedRows / 50) * 50
totalCost = (billableRows / 1000) * ratePer1kRows

// Examples:
// 1 row → 50 billable rows
// 51 rows → 100 billable rows
// 235 rows → 250 billable rows
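
The formula above can be wrapped in two small functions to estimate a charge before sending a request. This is a sketch: the function names are hypothetical, and the rate of 4 per 1,000 rows is a made-up value used only to make the arithmetic concrete.

```javascript
const ROW_BLOCK = 50 // purchases are rounded up to the nearest 50 rows

function billableRows(unownedRows) {
  return Math.ceil(unownedRows / ROW_BLOCK) * ROW_BLOCK
}

function totalCost(unownedRows, ratePer1kRows) {
  return (billableRows(unownedRows) / 1000) * ratePer1kRows
}

console.log(billableRows(1))    // 50
console.log(billableRows(51))   // 100
console.log(billableRows(235))  // 250
console.log(totalCost(235, 4))  // 1  (250 / 1000 * 4, with an illustrative rate)
```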

Row Ownership States

State	Description	Billing
`OWNED`	Previously purchased (within 30 days)	Not charged
`FREE`	Within free access window	Not charged
`UNOWNED`	Requires purchase	Charged

Important Limits

Row retention: Purchased rows expire after 30 days. Re-downloading after expiration requires a new purchase.
Request expiration: Completed download requests expire after 7 days.
Free access: Some datasets offer free access for older data (configurable per dataset).
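
To keep track of the 7-day download window, the `expiresAt` timestamp from the status response can be compared with the current time. The sketch below assumes `expiresAt` marks the moment the completed request expires; `daysUntilExpiry` and `isExpired` are hypothetical helper names.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000

// Days left before `expiresAt`; zero or negative once the request has expired.
function daysUntilExpiry(status, now = new Date()) {
  return (new Date(status.expiresAt).getTime() - now.getTime()) / DAY_MS
}

function isExpired(status, now = new Date()) {
  return daysUntilExpiry(status, now) <= 0
}

// Values taken from the sample status response on this page.
const status = { expiresAt: '2026-02-12T10:32:15Z' }
console.log(daysUntilExpiry(status, new Date('2026-02-05T10:32:15Z'))) // 7
```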

Filtering & Sorting

View Settings

Use the viewSettings object in the request body to specify filters, sort order, and page size:

const response = await fetch('https://rebrowser.net/api/datasets/requests/create', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    apiKey: 'YOUR_API_KEY',
    datasetSlug: 'iheart',
    entitySlug: 'stations',
    format: 'json',
    page: null,  // null = all pages, 1/2/3... = specific page
    viewSettings: {
      filters: {
        conjunction: 'and',
        rules: [
          { type: 'rule', field: 'market', operator: 'is', value: 'Los Angeles' },
          { type: 'rule', field: 'genres', operator: 'contains', value: 'Pop' },
        ],
      },
      sort: {
        field: 'cume',
        dir: 'DESC',
      },
      pageSize: 100,
    },
    allowPurchase: true,
  }),
})

Filter Operators

Text fields: is, isNot, isAnyOf, isNoneOf, contains, notContains, startsWith, endsWith, isEmpty, isNotEmpty

Number fields: eq, neq, gt, gte, lt, lte

DateTime fields: is, before, after

Boolean fields: isTrue, isFalse
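
One way to avoid sending an operator that does not match the field type is to build rules through a small checked helper. The sketch below is illustrative only: `rule` and `allOf` are hypothetical names, the field types passed in are the caller's own knowledge of the schema, and omitting `value` for operators such as `isEmpty` or `isTrue` is an assumption rather than documented behavior.

```javascript
// Operator lists copied from the section above, keyed by field type.
const OPERATORS = {
  text: ['is', 'isNot', 'isAnyOf', 'isNoneOf', 'contains', 'notContains',
    'startsWith', 'endsWith', 'isEmpty', 'isNotEmpty'],
  number: ['eq', 'neq', 'gt', 'gte', 'lt', 'lte'],
  datetime: ['is', 'before', 'after'],
  boolean: ['isTrue', 'isFalse'],
}

// Builds one filter rule, rejecting operators that do not belong to the field type.
function rule(fieldType, field, operator, value) {
  if (!OPERATORS[fieldType]?.includes(operator)) {
    throw new Error(`Operator "${operator}" is not valid for ${fieldType} fields`)
  }
  const built = { type: 'rule', field, operator }
  if (value !== undefined) built.value = value // assumption: value-less operators omit it
  return built
}

// Combines rules with the 'and' conjunction used in the example above.
function allOf(...rules) {
  return { conjunction: 'and', rules }
}

const filters = allOf(
  rule('text', 'market', 'is', 'Los Angeles'),
  rule('number', 'cume', 'gt', 100000) // assumes cume is a numeric field
)
console.log(JSON.stringify(filters, null, 2))
```

The resulting object can be passed as `viewSettings.filters` in the create request.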

Error Handling

HTTP Status Codes

Code	Description
`200`	Success
`400`	Bad Request — invalid parameters, invalid format, request not ready
`401`	Unauthorized — missing or invalid API key
`402`	Payment Required — insufficient balance or no active plan
`403`	Forbidden — plan restrictions (no API access, format not allowed, premium fields required)
`404`	Not Found — dataset, entity, view, or request not found
`410`	Gone — request has expired
`500`	Internal Server Error

Error Response Format

{
  "success": false,
  "error": "Detailed error message"
}

Common Errors

Error	Cause	Solution
`API key required`	Missing API key	Add `apiKey` parameter
`Your plan does not include API access`	Plan doesn't allow API	Upgrade plan
`Insufficient balance`	Not enough credits	Add funds or reduce request size
`No active plan`	No active subscription	Subscribe to a plan
`Dataset request has expired`	Download link expired (7 days)	Create new request
`Dataset request is not ready`	Trying to download before completion	Poll status endpoint until success
`Your plan does not support [format] downloads`	Format not included in plan	Upgrade plan or use a different format (CSV and JSON are available on all plans)
`Filtering by "[field]" requires a premium subscription`	Premium field used in filter without premium access	Remove premium field filters or upgrade to premium plan
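
The status codes and error format above can be handled in one place. The following sketch is one possible approach, not a prescribed client: `DatasetsApiError`, `parseApiResponse`, and the action labels returned by `suggestedAction` are all hypothetical names chosen for illustration.

```javascript
class DatasetsApiError extends Error {
  constructor(httpStatus, message) {
    super(message)
    this.name = 'DatasetsApiError'
    this.httpStatus = httpStatus
  }
}

// Returns the parsed body on success; throws DatasetsApiError otherwise.
async function parseApiResponse(res) {
  let body = null
  try {
    body = await res.json()
  } catch {
    // Non-JSON body (for example an HTML error page); keep body = null.
  }
  if (res.ok && body !== null && body.success !== false) return body
  const message = (body && body.error) || `HTTP ${res.status}`
  throw new DatasetsApiError(res.status, message)
}

// Rough mapping from status code to a next step, following the tables above.
function suggestedAction(httpStatus) {
  switch (httpStatus) {
    case 400: return 'fix-request-or-wait'
    case 401: return 'check-api-key'
    case 402: return 'add-funds-or-subscribe'
    case 403: return 'upgrade-plan-or-change-request'
    case 404: return 'check-identifiers'
    case 410: return 'create-new-request'
    default: return httpStatus >= 500 ? 'retry-later' : 'none'
  }
}
```

A caller would wrap each request as `parseApiResponse(await fetch(url))` and branch on `error.httpStatus` in the catch block.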

Best Practices

Preview First: Always test your filters in the Datasets UI before making API requests to ensure you're selecting the right data.
Use Preview Mode: Call without allowPurchase to see cost estimates and preview rows before committing to a purchase.
Check Premium Fields: Review which fields are marked as premium in the dataset UI. Premium fields will show **HIDDEN** values unless your plan includes premium access.
Export All vs. Single Page: Pass page: null to export all matching rows. If page is omitted, only page 1 is returned — this is a safety default to prevent accidental large purchases.
Paginate Large Requests: For large datasets, start with a single page to test your filters and control costs before exporting all pages.
Handle Polling Gracefully: Implement exponential backoff when polling for status (start at 5s, increase to 30s max).
Choose the Right Format: Use JSON or CSV for general use. Use JSONL for streaming large datasets. Use Parquet for data warehouse integrations.
Monitor Row Expiration: Track when purchased rows expire (30 days) if you need ongoing access.
Download Before Expiration: Download completed requests within 7 days before they expire.
