Set up your Batch API

Step-by-step guide to setting up your Similarweb Batch API

Welcome to Similarweb's Batch API, giving you scalable access to the world's largest digital measurement database!

Get Similarweb data for up to 1,000,000 domains, 5 years of history, and dozens of metrics in one API call!

This guide has 2 quick steps to get millions of data points from our API.

Get started checklist:

After you set up your API key, you can create your first API call, for up to 1,000,000 domains!

Step-by-step guide:

  1. Make a POST request to the endpoint below, with the JSON either in the request body or attached as a file (multipart/form-data).
https://api.similarweb.com/v3/batch/request-report
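import requests

url = "https://api.similarweb.com/v3/batch/request-report"

# Replace the placeholder with your own API key.
headers = {
    "api-key": "{{your_api_key}}"
}

# Attach the JSON request file as multipart/form-data under the "request" field.
# The file is closed automatically once the request has been sent.
with open("/Users/Batchexample.json", "rb") as request_file:
    files = [
        ("request", ("Batchexample.json", request_file, "application/json"))
    ]
    response = requests.post(url, headers=headers, files=files)

# Print the API response; save the report ID you receive.
print(response.text)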

👍

To estimate how many credits a report will cost before you request it, use the "request-validate" endpoint:

https://api.similarweb.com/v3/batch/request-validate

Example JSON:

{
    "domains":[
        "cnn.com"
    ],
    "countries": ["US"],
    "metrics":[
        "all_traffic_visits"
    ],
    "start_date": "2022-12-01",
    "end_date": "2023-01-01",
    "granularity": "daily", 
    "delivery_method": "download_link",
    "response_format": "json",
    "webhook_url": "foo.com"
}

When requesting a report, your JSON must include the mandatory parameters listed below.

Make sure to save the report ID you receive after your API request.

👍

Data credits are calculated for each report based on the number of results you actually receive:

Formula: Number of domains × Number of metrics × History × Cadence (daily/monthly) × Number of countries × Number of results
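
As a rough illustration of how the factors in the formula multiply, here is a minimal sketch. It assumes that "History × Cadence" can be read as the number of daily or monthly periods in an inclusive date range, which is our interpretation and not something this guide states. The function names `count_periods` and `estimate_credits` are hypothetical helpers, not part of the API. For the actual estimate, rely on the "request-validate" endpoint.

```python
from datetime import date


def count_periods(start: date, end: date, granularity: str) -> int:
    """Count periods in an inclusive date range (assumed reading of History x Cadence)."""
    if end < start:
        raise ValueError("end must not be before start")
    if granularity == "daily":
        return (end - start).days + 1
    if granularity == "monthly":
        return (end.year - start.year) * 12 + (end.month - start.month) + 1
    raise ValueError(f"unsupported granularity in this sketch: {granularity}")


def estimate_credits(domains: int, metrics: int, periods: int,
                     countries: int, results: int = 1) -> int:
    """Multiply the factors of the formula together."""
    return domains * metrics * periods * countries * results


# One domain, one metric, one country, daily data from 2022-12-01 to 2023-01-01.
periods = count_periods(date(2022, 12, 1), date(2023, 1, 1), "daily")
estimate = estimate_credits(domains=1, metrics=1, periods=periods, countries=1)
print(periods, estimate)
```

The value is only an illustration of the multiplication; the credits actually charged are reported by the API itself.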

❗️

The request limit per user is 100 pending requests. If you receive a '429' error, you have exceeded the limit of allowed pending requests. Reduce the frequency of your requests to stay within the limits of your account.
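
One possible way to react to a '429' is to pause and retry with increasing delays, so that pending reports have time to finish. The sketch below is an assumption about client-side handling, not a documented Similarweb recommendation. `send_with_backoff` is a hypothetical helper that accepts any callable returning an object with a `status_code` attribute (for example, a function wrapping `requests.post`). The demo uses simulated responses and makes no network calls.

```python
import time
from types import SimpleNamespace


def send_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() again, with growing pauses, while it returns HTTP 429."""
    response = send()
    for attempt in range(1, max_attempts):
        if response.status_code != 429:
            break
        # Wait 1x, 2x, 4x ... the base delay before trying again.
        sleep(base_delay * 2 ** (attempt - 1))
        response = send()
    return response


# Simulated server: two 429 responses, then success.
canned = iter([SimpleNamespace(status_code=429),
               SimpleNamespace(status_code=429),
               SimpleNamespace(status_code=200)])
waits = []  # records the delays instead of actually sleeping
result = send_with_backoff(lambda: next(canned), sleep=waits.append)
print(result.status_code, waits)
```

In real use you would leave `sleep` at its default and choose a base delay that suits how long your reports typically stay pending.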

Mandatory Parameters:

| Parameter | Description | Acceptable values |
| --- | --- | --- |
| domains | Characters in domain names can include letters, numbers, and hyphens. One request can include up to 1M domains. | amazon.com |
| countries | Two-letter ISO country codes, used with all metrics except desktop_top_geo. For worldwide, use "WW". Codes are case-sensitive and must be uppercase. When calling desktop_top_geo, remove countries from your JSON file. | WW, US, GB (all country codes) |
| metrics | List of metrics per dataset | all_traffic_visits |
| start_date, end_date | For daily granularity, format the start and end dates as YYYY-MM-DD. For monthly granularity, format them as YYYY-MM. | Daily: 2023-06-30; Monthly: 2023-06 |
| granularity | Time series granularity | monthly, weekly, daily |
| response_format | Output format of the API call | json, csv, parquet, orc |
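
The rules in the table above can be checked locally before a request is sent. The following is a minimal sketch, not an official validator: `check_request` is a hypothetical helper, it only covers the rules stated in this guide, and it skips the date check for weekly granularity because this guide does not specify a weekly date format. The server-side "request-validate" endpoint remains the authoritative check.

```python
import re

MANDATORY = ("domains", "countries", "metrics", "start_date",
             "end_date", "granularity", "response_format")
DATE_FORMATS = {
    "daily": re.compile(r"\d{4}-\d{2}-\d{2}"),  # YYYY-MM-DD
    "monthly": re.compile(r"\d{4}-\d{2}"),      # YYYY-MM
}


def check_request(payload):
    """Return a list of problems found in a request; empty if none were found."""
    skip = set()
    if payload.get("delivery_method") == "snowflake":
        skip.add("response_format")  # not required for Snowflake delivery
    geo = "desktop_top_geo" in payload.get("metrics", [])
    if geo:
        skip.add("countries")        # must be removed for desktop_top_geo
    problems = [f"missing parameter: {name}" for name in MANDATORY
                if name not in payload and name not in skip]
    if geo and payload.get("countries"):
        problems.append("remove countries when requesting desktop_top_geo")
    for code in payload.get("countries", []):
        if code != code.upper():
            problems.append(f"country code must be uppercase: {code}")
    date_format = DATE_FORMATS.get(payload.get("granularity"))
    if date_format is not None:
        for key in ("start_date", "end_date"):
            value = payload.get(key)
            if value is not None and not date_format.fullmatch(value):
                problems.append(f"{key} has the wrong format: {value}")
    return problems


example = {
    "domains": ["cnn.com"], "countries": ["US"],
    "metrics": ["all_traffic_visits"],
    "start_date": "2022-12-01", "end_date": "2023-01-01",
    "granularity": "daily", "response_format": "json",
}
print(check_request(example))
print(check_request(dict(example, countries=["us"], granularity="monthly")))
```

The first call reports no problems; the second flags the lowercase country code and both dates, which use the daily format with monthly granularity.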

📘

When requesting a report that includes multiple metrics, please create separate requests for metrics that aren't within the same Metric Group. Check the supported metrics datasets to verify which Metric Group each metric belongs to.
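
Splitting a request by Metric Group can be automated once you know which group each metric belongs to. The sketch below assumes you have built that mapping yourself from the supported metrics datasets. The metric names (`metric_a`, `metric_b`, `metric_c`) and group names (`group_1`, `group_2`) are placeholders, and `split_by_metric_group` is a hypothetical helper.

```python
import copy
from collections import defaultdict


def split_by_metric_group(payload, metric_groups):
    """Return one copy of the payload per Metric Group, each with only that group's metrics."""
    grouped = defaultdict(list)
    for metric in payload["metrics"]:
        grouped[metric_groups[metric]].append(metric)
    requests_out = []
    for metrics in grouped.values():
        request = copy.deepcopy(payload)  # keep the original payload untouched
        request["metrics"] = metrics
        requests_out.append(request)
    return requests_out


# Placeholder mapping; fill it in from the supported metrics datasets.
metric_groups = {"metric_a": "group_1", "metric_b": "group_2", "metric_c": "group_1"}
payload = {"domains": ["cnn.com"], "metrics": ["metric_a", "metric_b", "metric_c"]}

for request in split_by_metric_group(payload, metric_groups):
    print(request["metrics"])
```

Each resulting payload can then be submitted as its own report request.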

Optional Parameters:

| Parameter | Description | Acceptable values |
| --- | --- | --- |
| delivery_method | The default value is "download_link". When the delivery method is set to "snowflake", the "response_format" field is not required. | download_link, bucket_access, snowflake |
| delivery_method_params | Use this when requesting reports to be delivered to aggregated Snowflake tables. Input "table_name": "your_table_name". See the set-up guide for more details. | table_name |
| all_history | Boolean. When set to true, automatically overrides the dates with the minimum start date and maximum end date. Default is false. | true/false |
| latest | Boolean. When set to true, overrides the end date with the latest available date. If the start date is not specified, it is also set to that date. | true/false |
| window_size | String. When set, overrides the start date with a time relative to the end date. | Format {number}{y/m/d}, for example '12d', '3m', or '2y' |
| limit | Integer. Limits the number of results per entity selected. | Above 0; the default for most metrics is 100 |
| Include_subdomains | Boolean. Default is true. | true/false |
| webhook_url | The delivery URL you'd like us to ping when the status of your report changes. | URL |
| sort | Allows you to sort by a specific metric. | A specific metric, for example "sort": "all_traffic_visits" |
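
The window_size format `{number}{y/m/d}` is easy to check before sending a request. The sketch below is a local convenience only. `parse_window_size` is a hypothetical helper, and it assumes the number is a positive integer without leading zeros, which this guide does not state explicitly.

```python
import re

WINDOW_SIZE = re.compile(r"([1-9]\d*)([ymd])")
UNITS = {"y": "years", "m": "months", "d": "days"}


def parse_window_size(value):
    """Split a window_size string such as '12d' into (12, 'days')."""
    match = WINDOW_SIZE.fullmatch(value)
    if match is None:
        raise ValueError(f"window_size must look like '12d', '3m', or '2y': {value!r}")
    return int(match.group(1)), UNITS[match.group(2)]


for text in ("12d", "3m", "2y"):
    print(text, parse_window_size(text))
```

Any other shape, such as `'12w'` or `'d12'`, raises a `ValueError` in this sketch.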

  2. After you make your request and receive your report ID, use the Request Report Status endpoint to check the report status.

Once the status is "completed", the response includes the download URL for your report.

GET Request Report Status

https://api.similarweb.com/v3/batch/request-status/{{generated_report_id}}
import requests

# Replace the placeholders with your report ID and your API key.
url = "https://api.similarweb.com/v3/batch/request-status/{{generated_report_id}}"
headers = {
    "api-key": "{{your_api_key}}"
}

response = requests.get(url, headers=headers)

# Print the report status and, once completed, the download URL.
print(response.text)

Example responses:

Completed report:

{
    "data_points_count": 1779429,
    "download_url": "example_url.com",
    "status": "completed",
    "used_quota": 35589
}

Pending report:

{
    "status": "pending"
}

👍

The download link remains valid for 30 days. We recommend saving your download links for a period of time in case you need our assistance to troubleshoot an issue.
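
The status check in step 2 can be repeated until the report is no longer pending. Below is a minimal polling sketch under stated assumptions. `wait_for_report` is a hypothetical helper, the 60-second interval and the poll limit are arbitrary choices, and `fetch_status` stands for any function that returns the parsed JSON of the status response (for example, one wrapping `requests.get(...).json()`). The demo uses simulated responses and makes no network calls. If you set a webhook_url, you may not need polling at all.

```python
import time


def wait_for_report(fetch_status, poll_seconds=60.0, max_polls=60, sleep=time.sleep):
    """Poll until the status is no longer 'pending', then return the last response."""
    for poll in range(max_polls):
        status = fetch_status()
        if status.get("status") != "pending":
            return status
        if poll < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError("report is still pending after the maximum number of polls")


# Simulated status responses: pending twice, then completed.
canned = iter([
    {"status": "pending"},
    {"status": "pending"},
    {"status": "completed", "download_url": "example_url.com"},
])
report = wait_for_report(lambda: next(canned), sleep=lambda seconds: None)
print(report["status"], report["download_url"])
```

The function returns on any status other than "pending", so the caller should still confirm that the status is "completed" before reading the download URL.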

