The Voice API accepts call requests immediately and dispatches them asynchronously. You can fan
calls out concurrently across every number you’ve provisioned, up to your account’s provisioned
concurrent-channel limit. This guide shows how to place a call and run a safe concurrency test so
you can confirm capacity for your own volume.
Before you start
- Get a production API key with the calls:write scope. Keys are issued from your dashboard. Demo
  keys are capped at 120 req/min and restricted to an allowlist of destinations; production keys
  allow 600 req/min and any destination.
- Have at least one active phone number. Calls originate from a number on your account.
- Use destinations you own or control. For a load test, dial numbers you control, never real third
  parties.
All requests go to https://api.revdesk.com. Rate-limit headers (X-RateLimit-Limit,
X-RateLimit-Remaining, X-RateLimit-Reset) are returned on every response.
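As a rough illustration of reading those headers, the sketch below parses them from the `headers` of a fetch response. The helper names `readRateLimit` and `nearLimit` are hypothetical, and the sketch returns `X-RateLimit-Reset` as a raw number because this guide does not say whether it is a timestamp or a seconds-until-reset value.

```javascript
// Hypothetical helper: parse the X-RateLimit-* headers from a fetch Headers object.
function readRateLimit(headers) {
  const num = (name) => {
    const raw = headers.get(name); // Headers.get returns null when the header is absent
    return raw === null || raw.trim() === "" ? null : Number(raw);
  };
  return {
    limit: num("X-RateLimit-Limit"),
    remaining: num("X-RateLimit-Remaining"),
    reset: num("X-RateLimit-Reset"), // units are not specified in this guide; treat as opaque
  };
}

// True when the remaining request budget is at or below a safety margin you choose.
function nearLimit(info, margin = 5) {
  return info.remaining !== null && info.remaining <= margin;
}
```

A load-test loop could call `nearLimit(readRateLimit(res.headers))` after each response and pause before the next wave when it returns true.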
Place a call
```bash
curl -X POST https://api.revdesk.com/v1/calls/dial \
  -H "Authorization: Bearer $REVDESK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "from_number": "+1XXXXXXXXXX",
    "to_number": "+1YYYYYYYYYY",
    "record": true
  }'
# → 202 { "data": { "call_id": "…", "status": "QUEUED", … } }
```
The call is accepted immediately (202) and progresses asynchronously. Check its status anytime:
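```bash
# Replace <call_id> with the call_id from the dial response; the URL is quoted so
# the shell does not treat the angle brackets as redirection.
curl "https://api.revdesk.com/v1/calls/<call_id>" \
  -H "Authorization: Bearer $REVDESK_API_KEY"
```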
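If you prefer to poll from code, a minimal sketch might look like the following. It assumes the status endpoint returns the same `{ data: { status } }` envelope as the dial response, and that `COMPLETED` and `FAILED` are the terminal states; neither is stated in this guide. The function name `waitForCall` and its options are hypothetical.

```javascript
// Assumption: COMPLETED and FAILED are terminal; QUEUED and ANSWERED are still in progress.
const TERMINAL = new Set(["COMPLETED", "FAILED"]);

// Hypothetical polling helper. `fetchImpl` is injectable so the loop can be tested offline.
async function waitForCall(callId, { apiKey, fetchImpl = fetch, intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await fetchImpl(`https://api.revdesk.com/v1/calls/${encodeURIComponent(callId)}`, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    if (!res.ok) throw new Error(`status check failed: HTTP ${res.status}`);
    const body = await res.json();
    const status = body?.data?.status; // assumed response shape
    if (TERMINAL.has(status)) return status;
    // Wait before the next poll, but not after the final attempt.
    if (attempt < maxAttempts) await new Promise((r) => setTimeout(r, intervalMs));
  }
  return null; // still in progress after maxAttempts polls
}
```

Every poll counts against your request-rate limit, so keep `intervalMs` generous during a load test.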
Run a concurrency test
The pattern: fire N calls at once in waves of increasing size, so each wave caps concurrency at N,
and record success rate and latency as you ramp. Below is a self-contained Node script with no
dependencies beyond the built-in fetch (Node 18 or later).
```javascript
// load-test.mjs. Usage: REVDESK_API_KEY=... node load-test.mjs (Node 18+)
const KEY = process.env.REVDESK_API_KEY;
if (!KEY) throw new Error("Set REVDESK_API_KEY before running");
const BASE = "https://api.revdesk.com";
const FROM = ["+1XXXXXXXXXX"]; // numbers you own (rotate across your pool)
const TO = ["+1YYYYYYYYYY"]; // destinations you control
const RAMP = [3, 6, 10, 20]; // concurrent calls per wave
const COOLDOWN_MS = 15_000; // pause between waves

// Place one call and time how long the API takes to accept it.
async function placeOne(i) {
  const t0 = performance.now();
  const res = await fetch(`${BASE}/v1/calls/dial`, {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${KEY}` },
    body: JSON.stringify({
      from_number: FROM[i % FROM.length],
      to_number: TO[i % TO.length],
      record: false,
    }),
  });
  const ms = Math.round(performance.now() - t0);
  const body = await res.json().catch(() => ({})); // tolerate empty or non-JSON bodies
  return { ok: res.ok, status: res.status, ms, callId: body?.data?.call_id };
}

const sleep = (ms) => new Promise((r) => setTimeout(r, ms));

for (const n of RAMP) {
  // Fire the whole wave at once and wait for every placement to resolve.
  const results = await Promise.all(Array.from({ length: n }, (_, i) => placeOne(i)));
  const ok = results.filter((r) => r.ok).length;
  const lat = results.map((r) => r.ms).sort((a, b) => a - b);
  const p95 = lat[Math.ceil(0.95 * lat.length) - 1]; // nearest-rank 95th percentile
  console.log(`n=${n} ok=${ok}/${n} p95=${p95}ms 429s=${results.filter((r) => r.status === 429).length}`);
  if (ok / n < 0.9) { console.log(`fail rate too high at n=${n}, stopping`); break; }
  if (n !== RAMP.at(-1)) await sleep(COOLDOWN_MS);
}
```
Each request places a real call and consumes minutes/credits. Keep record off and waves modest
while testing, and dial only numbers you control.
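If you would rather hold a fixed number of calls in flight than fire whole waves, a small worker pool is one option. The sketch below is a generic pattern and not part of the Voice API; `runWithLimit` is a hypothetical name, and each task is any function that returns a promise.

```javascript
// Run async task functions with at most `limit` in flight at any moment.
// Results come back in the same order as `tasks`. If any task rejects,
// the returned promise rejects with that error.
async function runWithLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0; // index of the next unclaimed task

  async function worker() {
    while (next < tasks.length) {
      const i = next++; // claiming is synchronous, so two workers never take the same index
      results[i] = await tasks[i]();
    }
  }

  // Start up to `limit` workers; each pulls tasks until none remain.
  const workers = Array.from({ length: Math.min(limit, tasks.length) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

With the script's `placeOne`, a run of 50 calls capped at 10 in flight could be written as `runWithLimit(Array.from({ length: 50 }, (_, i) => () => placeOne(i)), 10)`.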
Concurrency & capacity
Capacity is measured in concurrent channels — one channel carries one in-progress call leg.
Your account has a provisioned concurrent-channel limit, sized to your expected peak. When you exceed
it, additional calls are rejected — the placement fails, calls don’t queue. It’s a provisioned
setting, not a hard platform limit; tell us your peak and we raise it ahead of launch.
How many concurrent calls that allows depends on the path, since a channel is per leg:
| Path | Channels per call | Concurrent calls at a 100-channel limit |
|---|---|---|
| AI outbound (POST /v1/calls): agent joins over the network | 1 | 100 |
| Human via browser/app (WebRTC token + 1 leg) | 1 | 100 |
| Phone-to-phone bridge (POST /v1/calls/dial): rings both numbers | 2 | 50 |
For maximum concurrency, use a path where only the destination is on the phone network (the agent or
rep joins over WebRTC) — that’s 1 channel per call. The dial bridge rings two phones, so it uses 2.
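The arithmetic behind the table can be sketched as a small helper. The path labels (`ai_outbound`, `webrtc_human`, `dial_bridge`) are hypothetical names chosen for this example; the channels-per-call values come from the table above.

```javascript
// Channels consumed per call on each path (values from the table; keys are illustrative labels).
const CHANNELS_PER_CALL = { ai_outbound: 1, webrtc_human: 1, dial_bridge: 2 };

// Maximum simultaneous calls a provisioned channel limit allows on a given path.
function maxConcurrentCalls(channelLimit, path) {
  const perCall = CHANNELS_PER_CALL[path];
  if (!perCall) throw new Error(`unknown path: ${path}`);
  return Math.floor(channelLimit / perCall); // a partial call cannot be placed, so round down
}
```

For example, `maxConcurrentCalls(100, "dial_bridge")` gives 50, matching the last row of the table.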
Two independent limits to keep distinct:
- Request rate: 600 requests/minute per production key. Exceeding it returns 429 with
  X-RateLimit-* headers.
- Concurrent channels: your provisioned limit. Exceeding it fails the call placement with a
  channel-limit error; it is not a 429.
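One way to keep the two limits apart in test output is to bucket placement responses by status code. This is only a sketch: the guide documents 202 and 429, but not the status code or body of a channel-limit error, so everything else lands in a generic bucket for you to inspect. The names `classifyPlacement` and `tally` are hypothetical.

```javascript
// Hypothetical classifier for a placement response's HTTP status.
function classifyPlacement(status) {
  if (status === 202) return "accepted";
  if (status === 429) return "rate_limited"; // request-rate limit: slow down
  return "placement_failed"; // possibly a channel-limit error; inspect the response body
}

// Count outcomes for one wave of { status } results.
function tally(results) {
  const counts = { accepted: 0, rate_limited: 0, placement_failed: 0 };
  for (const r of results) counts[classifyPlacement(r.status)]++;
  return counts;
}
```

A wave dominated by `rate_limited` means you are sending requests too fast; one dominated by `placement_failed` is more likely to be the channel ceiling.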
What to measure
| Metric | Where | Healthy |
|---|---|---|
| Accepted | response code | 202 accepted (429 = request-rate limited) |
| Call status | GET /v1/calls/{id} | reaches ANSWERED / COMPLETED, not FAILED |
| Channel ceiling | first wave that starts failing placement | equals your provisioned channel limit ÷ channels-per-call |
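To locate the channel ceiling from recorded waves, you could scan for the first wave in which a placement failed for a reason other than a 429. The sketch below assumes you have already summarized each wave as `{ n, accepted, rateLimited }`; that shape and the name `firstFailingWave` are hypothetical.

```javascript
// waves: array of { n, accepted, rateLimited } in ramp order (hypothetical shape).
// Returns details of the first wave with a non-429 placement failure, or null if none failed.
function firstFailingWave(waves) {
  for (const w of waves) {
    const failedPlacement = w.n - w.accepted - w.rateLimited;
    if (failedPlacement > 0) {
      return { n: w.n, accepted: w.accepted, failedPlacement };
    }
  }
  return null;
}
```

If calls from earlier waves have finished before the failing wave starts, its `accepted` count is a rough estimate of the ceiling for that path.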
Results
A call is accepted (202) and dispatched asynchronously; subscribe to webhooks or poll
GET /v1/calls/{id} for live status. The ceiling is your provisioned concurrent-channel limit: once
you reach it, further placements return a channel-limit error. We size that limit to your peak
before go-live.
Fill the table for your account and path:
| Concurrent calls | Placed | Succeeded | Failed | p95 accept | Notes |
|---|---|---|---|---|---|
| 25 | | | | | |
| 50 | | | | | |
| 100 | | | | | |
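A row for this table could be generated from per-call records, as in the sketch below. It rests on one reading of the columns, which the guide does not define: Placed counts requests the API accepted, Succeeded counts calls whose final status was ANSWERED or COMPLETED, and Failed is everything else. The record shape `{ ok, ms, finalStatus }` and the names `p95` and `toRow` are hypothetical.

```javascript
// Nearest-rank 95th percentile of a list of latencies in milliseconds.
function p95(latencies) {
  if (latencies.length === 0) return null;
  const sorted = [...latencies].sort((a, b) => a - b);
  const rank = Math.ceil((95 * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

// Build one markdown table row from a wave's records (assumed column meanings, see above).
function toRow(records, notes = "") {
  const concurrent = records.length;
  const placed = records.filter((r) => r.ok).length;
  const succeeded = records.filter((r) => r.finalStatus === "ANSWERED" || r.finalStatus === "COMPLETED").length;
  const failed = concurrent - succeeded;
  const p = p95(records.map((r) => r.ms));
  const accept = p === null ? "" : `${p} ms`;
  return `| ${concurrent} | ${placed} | ${succeeded} | ${failed} | ${accept} | ${notes} |`;
}
```

The `finalStatus` field would come from polling or webhooks after the wave, since the placement response only reports the initial status.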
Planning for high volume? Tell us your peak expected concurrency and we’ll provision capacity
ahead of your launch.