Rate Limits
Per-service rate limits and per-user generation concurrency behavior.
How limits are applied
- Service rate limiting is per user + service category over a 60s sliding window.
- Categories: music, speech, sound-effects.
- Limits are plan-aware and resolved from effective runtime config.
- enterprise maps to effectively unlimited sentinel values.
- Concurrency limit is separate and global across active generation jobs.
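To make the "per user + service category over a 60s sliding window" rule concrete, here is a minimal in-memory sketch. It is only an illustration of the technique: the class name `SlidingWindowLimiter`, the `limit` value, and the timestamp-log approach are assumptions, and the real limiter is backed by Redis and plan-aware config rather than a local map.

```typescript
// Illustrative sliding-window limiter; NOT the service's actual implementation.
type Category = 'music' | 'speech' | 'sound-effects';

class SlidingWindowLimiter {
  // Request timestamps (ms) keyed by "userId:category".
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number; // hypothetical per-plan limit
  private readonly windowMs: number;

  constructor(limit: number, windowMs = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Records the request and returns true if the caller is under the limit. */
  tryAcquire(userId: string, category: Category, now = Date.now()): boolean {
    const key = `${userId}:${category}`;
    // Keep only timestamps still inside the window that ends at `now`.
    const recent = (this.hits.get(key) ?? []).filter((t) => now - t < this.windowMs);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.hits.set(key, recent);
    return allowed;
  }
}
```

With a limit of 2, a third music request from the same user inside 60 seconds is rejected, while a speech request from that user is still allowed because each category has its own counter.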
Rate-limit headers
Read rate-limit headers
const res = await fetch('https://prod-backup-backend.wubble.ai/v1/music/songs', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.WUBBLE_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ prompt: 'Upbeat electronic trailer music' }),
});
console.log({
  service: res.headers.get('x-ratelimit-service'),
  limit: res.headers.get('x-ratelimit-limit'),
  remaining: res.headers.get('x-ratelimit-remaining'),
  reset: res.headers.get('x-ratelimit-reset'),
  retryAfter: res.headers.get('retry-after'),
});
On throttling, expect 429 + Retry-After.
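Header values arrive as strings (or null when absent), so it can help to normalize them once before making decisions. The helper below is a sketch under assumptions: `RateLimitInfo` and `readRateLimitInfo` are hypothetical names, and the units of x-ratelimit-reset are not stated on this page, so the value is passed through as a raw number without interpretation.

```typescript
// Hypothetical helper for normalizing the rate-limit headers listed above.
interface RateLimitInfo {
  service: string | null;
  limit: number | null;
  remaining: number | null;
  reset: number | null; // raw value; units are not specified on this page
  retryAfterSeconds: number | null;
}

// Converts a header value to a finite number, or null if missing or non-numeric.
function toNumber(value: string | null): number | null {
  if (value === null || value.trim() === '') return null;
  const n = Number(value);
  return Number.isFinite(n) ? n : null;
}

function readRateLimitInfo(headers: Headers): RateLimitInfo {
  return {
    service: headers.get('x-ratelimit-service'),
    limit: toNumber(headers.get('x-ratelimit-limit')),
    remaining: toNumber(headers.get('x-ratelimit-remaining')),
    reset: toNumber(headers.get('x-ratelimit-reset')),
    retryAfterSeconds: toNumber(headers.get('retry-after')),
  };
}
```

Calling `readRateLimitInfo(res.headers)` on any response gives one object to inspect, for example to pause before sending more work when `remaining` reaches 0.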
Retry strategy
Use exponential backoff and honor Retry-After when present.
429 retry loop
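async function fetchWithRetry(url: string, init: RequestInit, maxRetries = 4) {
  let retries = 0;
  while (retries < maxRetries) {
    const res = await fetch(url, init);
    if (res.status !== 429) return res;
    // Retry-After may be absent or an HTTP date; fall back to 1 second
    // unless it parses to a positive integer.
    const parsed = parseInt(res.headers.get('retry-after') ?? '', 10);
    const retryAfter = Number.isFinite(parsed) && parsed > 0 ? parsed : 1;
    // The server hint is the base delay, doubled on each successive retry.
    const delayMs = retryAfter * 1000 * Math.pow(2, retries);
    await new Promise((r) => setTimeout(r, delayMs));
    retries++;
  }
  throw new Error('Max retries exceeded after 429');
}
Concurrency limit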
Generation routes may also be blocked by per-user active-job limits (default 1 unless overridden by plan/internal sync).
Typical concurrency block response
{
  "success": false,
  "data": null,
  "error": {
    "code": "CONCURRENT_GENERATION_LIMIT",
    "message": "...",
    "active_requests": 1,
    "concurrency_limit": 1
  }
}
Runtime notes
Redis limiter failure mode
The service rate limiter fails open on Redis errors: requests continue and the risk is logged.
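For readers unfamiliar with the term, fail-open means a limiter outage lets traffic through instead of rejecting it. The pattern can be sketched as below; `checkFailOpen`, `LimiterCheck`, and the default logger are hypothetical names used for illustration and do not describe the service's actual middleware.

```typescript
// Illustrative fail-open wrapper; names and logging are assumptions.
type LimiterCheck = () => Promise<boolean>;

async function checkFailOpen(
  check: LimiterCheck,
  logRisk: (err: unknown) => void = (err) =>
    console.warn('rate limiter unavailable, allowing request', err),
): Promise<boolean> {
  try {
    // Normal path: the limiter decides whether the request is allowed.
    return await check();
  } catch (err) {
    // Limiter backend failed (e.g. a Redis error): log the risk and allow.
    logRisk(err);
    return true;
  }
}
```

The trade-off is availability over strict enforcement: during a limiter outage, clients may briefly exceed their usual limits.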
Self-healing concurrency
Middleware attempts to release stale or orphaned jobs and to re-enqueue resumable ones for some music operations.
Credits checks run independently of rate limiting. Even with rate-limit headroom, a request can still fail with 402.
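Because throttling (429), the CONCURRENT_GENERATION_LIMIT block, and credit failures (402) call for different client reactions, it may be useful to classify a failed response before deciding whether to retry. The function below is a sketch: `classifyFailure` and the `Failure` type are hypothetical, and since this page does not state which HTTP status accompanies the concurrency block, that case is detected from the error code in the body rather than from the status.

```typescript
// Hypothetical classifier for the failure modes described on this page.
type Failure =
  | { kind: 'rate_limited'; retryAfterSeconds: number | null }
  | { kind: 'concurrency_blocked'; activeRequests: number; concurrencyLimit: number }
  | { kind: 'insufficient_credits' }
  | { kind: 'other'; status: number };

function classifyFailure(status: number, retryAfter: string | null, body: unknown): Failure {
  const error = (body as { error?: Record<string, unknown> } | null | undefined)?.error;
  // Check the body first: the concurrency block is identified by its error code.
  if (error && error.code === 'CONCURRENT_GENERATION_LIMIT') {
    return {
      kind: 'concurrency_blocked',
      activeRequests: Number(error.active_requests),
      concurrencyLimit: Number(error.concurrency_limit),
    };
  }
  if (status === 429) {
    const n = Number(retryAfter);
    const usable = retryAfter !== null && retryAfter.trim() !== '' && Number.isFinite(n);
    return { kind: 'rate_limited', retryAfterSeconds: usable ? n : null };
  }
  // 402 comes from the credits check, which runs independently of rate limiting.
  if (status === 402) return { kind: 'insufficient_credits' };
  return { kind: 'other', status };
}
```

A caller might back off and retry on `rate_limited`, wait for the active job to finish on `concurrency_blocked`, and surface `insufficient_credits` to the user without retrying.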