
How to upload datasets through the API

Written by Michael McCarty
Updated over 3 weeks ago

Overview


API keys let you upload datasets to Minibase.ai programmatically, without using the web interface. Once you've created an API key, you can use it to upload JSONL training datasets directly to your account. This article explains how to upload datasets using cURL, Python, or JavaScript.

How to Use

Step 1: Retrieve your API key

Go to Settings → API Keys in your Minibase dashboard.
Click Create API Key. You can optionally give it a name (e.g., "Dataset Upload") and set an expiration date.
Copy the API key when it's shown; this is the only time you'll see the full key. Store it securely.
Your new API key will now appear in the list of active keys.
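
One way to keep the key out of your source code is to read it from an environment variable at run time. The sketch below assumes a hypothetical variable name, `MINIBASE_API_KEY`; Minibase does not prescribe one, so use whatever name your shell or secret manager provides.

```python
import os


def load_api_key(env_var="MINIBASE_API_KEY"):
    """Read the API key from an environment variable.

    The variable name is a hypothetical choice for this example,
    not something defined by Minibase.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return key
```

In the Python upload example, `API_KEY = load_api_key()` could then stand in for the hard-coded placeholder.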

Step 2: Prepare your dataset

Your dataset must be in JSONL format, where each line contains a JSON object with training data:
- `instruction`: The instruction or prompt for the model
- `input`: Additional context or input data (optional)
- `response` or `output`: The expected response from the model

Example JSONL format:

{"instruction": "What is machine learning?", "input": "Explain simply", "response": "ML is AI that learns from data."}
{"instruction": "Define neural networks", "response": "Computing systems inspired by biological neurons."}
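
If your training examples live in Python data structures, a file in this layout can be produced with the standard `json` module. This is a minimal sketch; the helper name `write_jsonl` and the output filename are illustrative choices, not part of the Minibase API.

```python
import json


def write_jsonl(records, path):
    """Write one JSON object per line, the layout JSONL requires."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            # json.dumps escapes embedded newlines, so each record stays on one line.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


if __name__ == "__main__":
    records = [
        {"instruction": "What is machine learning?", "input": "Explain simply",
         "response": "ML is AI that learns from data."},
        {"instruction": "Define neural networks",
         "response": "Computing systems inspired by biological neurons."},
    ]
    write_jsonl(records, "dataset.jsonl")
```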

Step 3: Upload your dataset

Every upload request is sent to the Minibase API endpoint with:

  • Your API key in the Authorization header.

  • A dataset name and the JSONL file to upload.

  • Optional parameters like description, dataset type, and privacy setting.

cURL example

# Use API key authentication via Authorization header.
API_KEY="YOUR_API_KEY"
BASE="https://minibase.ai/api.php"

# Upload dataset with API key
curl -X POST "$BASE?action=fu_uploadDatasetViaApi&format=json" \
-H "Authorization: Bearer $API_KEY" \
-F "dataset_name=My Training Dataset" \
-F "description=A dataset for fine-tuning my model" \
-F "dataset_type=SFT" \
-F "privacy_setting=private" \
-F "file=@dataset.jsonl"

Python example

# Use API key authentication via Authorization header.
import requests

API_KEY = "YOUR_API_KEY"
BASE = "https://minibase.ai/api.php"

# Upload dataset with API key
url = f"{BASE}?action=fu_uploadDatasetViaApi&format=json"
headers = {"Authorization": f"Bearer {API_KEY}"}

with open("dataset.jsonl", "rb") as f:
    files = {"file": f}
    data = {
        "dataset_name": "My Training Dataset",
        "description": "A dataset for fine-tuning my model",
        "dataset_type": "SFT",
        "privacy_setting": "private"
    }
    # Send the request while the file is still open
    response = requests.post(url, headers=headers, files=files, data=data)

print(response.json())

JavaScript example

// Use API key authentication via Authorization header.
const API_KEY = "YOUR_API_KEY";
const BASE = 'https://minibase.ai/api.php';

async function uploadDataset(file, datasetName) {
  const formData = new FormData();
  formData.append('dataset_name', datasetName);
  formData.append('description', 'A dataset for fine-tuning my model');
  formData.append('dataset_type', 'SFT');
  formData.append('privacy_setting', 'private');
  formData.append('file', file);

  const res = await fetch(`${BASE}?action=fu_uploadDatasetViaApi&format=json`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`
    },
    body: formData
  }).then(r => r.json());

  console.log(res);
}

// Usage:
// await uploadDataset(fileInput.files[0], 'My Training Dataset');


Tips & Best Practices

  • Copy your API key once: You'll only see the full key at creation time. Store it somewhere secure (e.g., a secret manager).

  • Validate JSONL format: Ensure each line in your file is valid JSON before uploading.

  • Use descriptive names: Give your datasets clear, descriptive names to help organize your training data.

  • Set appropriate privacy: Use `private` for sensitive data, `team` to share with your organization, or `public` for open datasets.
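
The validation tip above can be automated with a short local check before uploading. The sketch below assumes that `instruction` and one of `response` or `output` are required on every line, as the field list in Step 2 suggests; the server's actual rules may differ. The function name `find_invalid_lines` is illustrative.

```python
import json


def find_invalid_lines(path):
    """Return (line_number, reason) pairs for lines that look problematic."""
    problems = []
    with open(path, "r", encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # assumption: blank lines are ignored rather than rejected
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((number, f"invalid JSON: {exc.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((number, "not a JSON object"))
            elif "instruction" not in record:
                problems.append((number, "missing 'instruction'"))
            elif "response" not in record and "output" not in record:
                problems.append((number, "missing 'response' or 'output'"))
    return problems
```

An empty list means every line parsed and carried the assumed fields; otherwise each entry points at the line to fix.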

Troubleshooting

My request returns 401 Unauthorized.

Double-check that your Authorization header is set correctly (-H "Authorization: Bearer YOUR_API_KEY"). Also confirm that the key is active and not revoked.
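
Some 401 responses come from client-side slips such as a pasted newline or a leftover placeholder. The helper below is a sketch that guards against a few such mistakes; the specific checks are guesses at common causes, and it cannot tell whether a key has expired or been revoked, which only the dashboard can confirm.

```python
def build_auth_headers(api_key):
    """Return the Authorization header, rejecting common copy-paste mistakes."""
    key = api_key.strip()  # stray spaces or newlines often come from copy-paste
    if not key or key == "YOUR_API_KEY":
        raise ValueError("Replace the placeholder with your real API key.")
    if key.lower().startswith("bearer "):
        raise ValueError("Pass only the key; the 'Bearer ' prefix is added here.")
    return {"Authorization": f"Bearer {key}"}
```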

I get a "Dataset name is required" error.

Ensure you're providing a "dataset_name" parameter and that its value is under 100 characters.
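
A quick local check can catch a missing or overlong name before the request is sent. This sketch reads "under 100 characters" conservatively as fewer than 100; the exact boundary the server enforces is an assumption, and `check_dataset_name` is an illustrative helper.

```python
MAX_NAME_LENGTH = 100  # from the note above; the exact server-side boundary is assumed


def check_dataset_name(name):
    """Return the trimmed name, or raise ValueError if it is empty or too long."""
    name = name.strip()
    if not name:
        raise ValueError("dataset_name is required.")
    if len(name) >= MAX_NAME_LENGTH:
        raise ValueError(f"dataset_name must be under {MAX_NAME_LENGTH} characters.")
    return name
```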

Why does it say "You have already uploaded this exact file"?

The system detects duplicate files by checksum. Use a different file or modify the existing dataset.
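
To spot a duplicate before uploading, you could record a checksum of each file you send and compare new files against that record. The article does not say which algorithm the server uses, so SHA-256 here is purely illustrative; the digest is only useful for your own bookkeeping.

```python
import hashlib


def file_sha256(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read returns b"" at end of file
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with identical bytes produce the same digest, so even a one-character change to the dataset yields a different value.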

My upload fails with "File upload failed".

Check that your file is in valid JSONL format and under the size limit. Each line must be valid JSON.

Need More Help?

Join our Discord support server to chat with our team and get real-time assistance.
