Overview

Tick-by-tick order book reconstructions from websocket feeds and REST data. Each row represents the full order book state at that timestamp. The book is reconstructed from multiple redundant websockets to ensure zero data loss, merging three separate data streams: price_change messages, book messages, and a scheduled download of all books from the REST API every 5 minutes. In the unlikely event the book drifts slightly out of sync, the 5-minute REST snapshot guarantees it is corrected. If you plan to download high volumes of order book data, please reach out: we support bulk Parquet exports that let you avoid parsing text. Email: [email protected].

Columns

Column              Type     Description
exchange_timestamp  integer  Exchange timestamp in milliseconds
local_timestamp     integer  Local capture timestamp in milliseconds (UTC)
ask_prices          string   Comma-separated ask prices (lowest to highest)
ask_sizes           string   Comma-separated ask sizes at each price level
bid_prices          string   Comma-separated bid prices (highest to lowest)
bid_sizes           string   Comma-separated bid sizes at each price level

Notes

  • Date format: YYYY-MM-DD
  • Prices and sizes are comma-separated strings for easy parsing
  • Book snapshots merge data from multiple websockets to ensure completeness
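Since the price and size columns are comma-separated strings, each row's book levels can be split into numeric pairs with a few lines of standard-library Python. This is a minimal sketch; the sample values below are illustrative, not real market data.

```python
def parse_levels(prices_str, sizes_str):
    """Split comma-separated price and size strings into (price, size) pairs."""
    prices = [float(p) for p in prices_str.split(",")]
    sizes = [float(s) for s in sizes_str.split(",")]
    return list(zip(prices, sizes))

# Illustrative values only. Asks are sorted lowest to highest,
# so the first pair is the best ask.
asks = parse_levels("0.53,0.54,0.55", "100,250,80")
print(asks[0])  # best ask: (0.53, 100.0)
```

The same helper works for the bid columns, where the first pair is the best bid because bids are sorted highest to lowest.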

Fetching Data

PredictionData.dev automatically joins the market slug and outcome to the asset ID, so you don't need to look up the CLOB token ID through Polymarket's API yourself.

Using Market Slug + Outcome

import requests

def download_order_book_data(market_slug, outcome, date_str, api_key):
    url = f"http://datasets.predictiondata.dev/polymarket/books/{market_slug}/{outcome}/{date_str}.csv.gz"
    params = {'slug': 'true', 'apikey': api_key}

    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()

    out_path = f'{market_slug}_{outcome}_{date_str}.csv.gz'
    with open(out_path, 'wb') as f:
        f.write(response.content)

    print(f"Downloaded to {out_path}")

if __name__ == "__main__":
    api_key = "YOUR_API_KEY"
    market_slug = "ramp-ipo-in-2025"
    outcome = "YES"
    date_str = "2025-11-16"

    download_order_book_data(market_slug, outcome, date_str, api_key)

Using Token ID

import requests

def download_order_book_data_by_token(token_id, date_str, api_key):
    url = f"http://datasets.predictiondata.dev/polymarket/books/{token_id}/{date_str}.csv.gz"
    params = {'apikey': api_key}

    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()

    out_path = f'{token_id}_{date_str}.csv.gz'
    with open(out_path, 'wb') as f:
        f.write(response.content)

    print(f"Downloaded to {out_path}")

if __name__ == "__main__":
    api_key = "YOUR_API_KEY"
    token_id = "6535996220481600525438454491949371553057652243233032166205012948847090204871"
    date_str = "2025-11-16"

    download_order_book_data_by_token(token_id, date_str, api_key)
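Once downloaded, the gzipped CSV can be read without decompressing it first. This is a minimal sketch using only the standard library; it assumes the column layout documented in the Columns table above.

```python
import csv
import gzip

def read_book_csv_gz(path):
    """Load a downloaded .csv.gz snapshot file into a list of dict rows.

    Each row maps column names (exchange_timestamp, local_timestamp,
    ask_prices, ask_sizes, bid_prices, bid_sizes) to their string values.
    """
    # gzip.open in text mode decompresses on the fly; csv.DictReader
    # handles the quoted, comma-containing price/size fields.
    with gzip.open(path, "rt", newline="") as f:
        return list(csv.DictReader(f))
```

For example, `read_book_csv_gz("ramp-ipo-in-2025_YES_2025-11-16.csv.gz")` (a filename produced by the download script above) returns one dict per book snapshot, with the price and size columns still as comma-separated strings ready for parsing.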