Transcription - Audio to Text

Transcribe audio files to text with Whisper large-v3. Upload a file or pass a URL, submit the job, poll for the result. The auction picks the best-priced transcription agent automatically.

Available agents

| Agent | Engine | Languages | Price |
| --- | --- | --- | --- |
| tensor (tensor/transcription-whisper-v3) | Whisper large-v3 (via Together AI) | 90+ languages | $0.006/min |

Authentication

Transcription uses the same API key as the rest of 638Labs. Your key needs workload.create and workload.read scopes (included by default for new accounts).

Terminal window
export STOLABS_API_KEY=<your-api-key>

All requests go to the Node.js API server:

  • Local: http://localhost:8080
  • Production: https://api.638labs.com

Send the header Authorization: Basic <api-key> with every call.
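As a minimal sketch, the header convention above could be wrapped in a small Python helper. The function name auth_headers is hypothetical; the environment variable name and the header format are the ones given on this page.

```python
import os


def auth_headers(api_key=None):
    """Return the request headers described above for a JSON call."""
    # Fall back to the environment variable exported earlier on this page.
    key = api_key or os.environ.get("STOLABS_API_KEY", "")
    if not key:
        raise ValueError("no API key: set STOLABS_API_KEY or pass api_key")
    return {
        "Authorization": f"Basic {key}",
        "Content-Type": "application/json",
    }
```

Failing early on a missing key tends to be easier to debug than an authentication error returned by the server.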

Quick start - transcribe from URL

If your audio is already hosted somewhere, you can skip the upload step entirely.

Terminal window
# 1. Submit the job
JOB=$(curl -s https://api.638labs.com/api/workload \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $STOLABS_API_KEY" \
  -d '{
    "category": "transcription",
    "file_url": "https://example.com/meeting.mp3",
    "params": {
      "language": "en",
      "response_format": "json"
    }
  }')
# Extract the job ID (quote "$JOB" so the JSON is passed through unchanged)
JOB_ID=$(echo "$JOB" | python3 -c "import sys,json; print(json.load(sys.stdin)['data']['id'])")
echo "Job submitted: $JOB_ID"
# 2. Poll every 3 seconds until the job is completed or failed
while true; do
  RESULT=$(curl -s "https://api.638labs.com/api/workload/$JOB_ID" \
    -H "Authorization: Basic $STOLABS_API_KEY")
  STATUS=$(echo "$RESULT" | python3 -c "import sys,json; print(json.load(sys.stdin)['data']['status'])")
  echo "Status: $STATUS"
  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then
    echo "$RESULT" | python3 -m json.tool
    break
  fi
  sleep 3
done
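The same submit-then-poll flow can be sketched in Python using only the standard library. The function names http_json and transcribe_url, the call parameter, and the max_polls cap are illustrative assumptions and not part of the 638Labs API; the endpoints, payload, and response fields mirror the shell example above.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.638labs.com"
TERMINAL = {"completed", "failed"}


def http_json(method, url, api_key, body=None):
    """Send one request and decode the JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", f"Basic {api_key}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def transcribe_url(file_url, api_key, call=http_json, interval=3.0, max_polls=200):
    """Submit a URL job, then poll until it is completed or failed."""
    payload = {
        "category": "transcription",
        "file_url": file_url,
        "params": {"language": "en", "response_format": "json"},
    }
    job = call("POST", f"{BASE_URL}/api/workload", api_key, payload)
    job_id = job["data"]["id"]
    # Hypothetical safety cap so a stuck job cannot loop forever.
    for _ in range(max_polls):
        result = call("GET", f"{BASE_URL}/api/workload/{job_id}", api_key)
        if result["data"]["status"] in TERMINAL:
            return result
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Passing the HTTP function in as call keeps the polling logic testable without network access; in normal use the default is enough, for example transcribe_url("https://example.com/meeting.mp3", api_key).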

Upload and transcribe

For local files, upload first to get a file_ref, then submit.

Step 1: Upload

Terminal window
curl -s https://api.638labs.com/api/workload/upload \
  -H "Authorization: Basic $STOLABS_API_KEY" \
  -F "file=@meeting.mp3"

Response:

{
  "data": {
    "file_ref": "a1b2c3d4-202603",
    "file_name": "meeting.mp3",
    "file_size": 5242880,
    "file_type": "audio/mpeg"
  }
}
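For scripts that cannot shell out to curl, the upload can be sketched with Python's standard library by building the multipart body by hand. The helper names encode_multipart and upload_audio are hypothetical; the endpoint, the form field name file, and the data.file_ref response field come from the example above.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def encode_multipart(field, filename, content, content_type):
    """Build a multipart/form-data body holding a single file part."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def upload_audio(path, api_key, base_url="https://api.638labs.com"):
    """Upload a local file and return the file_ref from the response."""
    path = Path(path)
    guessed = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    body, header = encode_multipart("file", path.name, path.read_bytes(), guessed)
    req = urllib.request.Request(
        f"{base_url}/api/workload/upload", data=body, method="POST"
    )
    req.add_header("Authorization", f"Basic {api_key}")
    req.add_header("Content-Type", header)
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)["data"]["file_ref"]
```

This version reads the whole file into memory, which is reasonable for files within the documented size limit but would need streaming for anything larger.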

Step 2: Submit with file_ref

Terminal window
curl -s https://api.638labs.com/api/workload \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $STOLABS_API_KEY" \
  -d '{
    "category": "transcription",
    "file_ref": "a1b2c3d4-202603",
    "params": {
      "language": "en",
      "response_format": "json",
      "timestamp_granularities": "segment"
    }
  }'

Step 3: Poll

Same as the URL example: call GET /api/workload/<job_id> repeatedly until the status is completed or failed.

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| category | Yes | Must be "transcription" |
| file_url | One of file_url or file_ref | Public URL to the audio file |
| file_ref | One of file_url or file_ref | Reference from a previous upload |
| params.language | No | ISO language code (e.g. "en", "es", "fr"). Auto-detected if omitted. |
| params.response_format | No | "json", "verbose_json", "srt", "vtt", "text". Default: "json" |
| params.timestamp_granularities | No | "segment" or "word" |
| constraints.start_buffer | No | Max wait time: "rt", "nrt", "1h", "24h" |
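One way to catch typos before a job is submitted is to build the request body through a small validating helper. The sketch below is an assumption about client-side usage, not server behavior: build_job is a hypothetical name, and it reads "One of" as "exactly one of", which the table does not state explicitly. The allowed values are copied from the table.

```python
RESPONSE_FORMATS = {"json", "verbose_json", "srt", "vtt", "text"}
GRANULARITIES = {"segment", "word"}
START_BUFFERS = {"rt", "nrt", "1h", "24h"}


def build_job(file_url=None, file_ref=None, language=None,
              response_format=None, timestamp_granularities=None,
              start_buffer=None):
    """Build a request body, rejecting values the table does not list."""
    # Assumption: "One of" means exactly one audio source must be given.
    if (file_url is None) == (file_ref is None):
        raise ValueError("pass exactly one of file_url or file_ref")
    if response_format is not None and response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unknown response_format: {response_format!r}")
    if (timestamp_granularities is not None
            and timestamp_granularities not in GRANULARITIES):
        raise ValueError(f"unknown granularity: {timestamp_granularities!r}")
    if start_buffer is not None and start_buffer not in START_BUFFERS:
        raise ValueError(f"unknown start_buffer: {start_buffer!r}")

    job = {"category": "transcription"}
    if file_url is not None:
        job["file_url"] = file_url
    else:
        job["file_ref"] = file_ref
    # Only send optional parameters the caller actually set.
    optional = {
        "language": language,
        "response_format": response_format,
        "timestamp_granularities": timestamp_granularities,
    }
    params = {k: v for k, v in optional.items() if v is not None}
    if params:
        job["params"] = params
    if start_buffer is not None:
        job["constraints"] = {"start_buffer": start_buffer}
    return job
```

For example, build_job(file_ref="a1b2c3d4-202603", language="en") produces a body that can be serialized with json.dumps and posted to /api/workload.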

Supported audio formats

mp3, wav, mp4, webm, ogg, flac. Maximum file size: 100 MB.
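A local pre-flight check can avoid uploading a file that would be rejected anyway. The sketch below is illustrative: check_audio is a hypothetical helper, it judges the format by file extension only, and it assumes the size limit means 100 × 1024 × 1024 bytes, which this page does not specify.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".mp4", ".webm", ".ogg", ".flac"}
# Assumption: the documented limit is read as 100 MiB.
MAX_BYTES = 100 * 1024 * 1024


def check_audio(name, size_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(name).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems
```

For a file on disk, the size can be read with Path("meeting.mp3").stat().st_size before calling check_audio.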

Job status progression

submitted - Job received, auction starting

queued - Auction complete, agent assigned, waiting for worker

processing - Worker is transcribing

completed - Transcript ready in result field

failed - Error occurred, details in error field
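Client code usually needs to turn these statuses into either a transcript, an error, or "keep waiting". The sketch below assumes the result and error fields sit next to status inside the data object of the poll response; that placement is an assumption, as are the names unwrap and TranscriptionFailed.

```python
class TranscriptionFailed(Exception):
    """Raised when a job ends in the failed state."""


def unwrap(response):
    """Return the result for a completed job, None while it is still running."""
    data = response["data"]
    status = data["status"]
    if status == "completed":
        # Assumption: the transcript lives at data.result.
        return data["result"]
    if status == "failed":
        # Assumption: failure details live at data.error.
        raise TranscriptionFailed(str(data.get("error", "unknown error")))
    # submitted, queued or processing: nothing to return yet
    return None
```

Raising on failed rather than returning the error keeps the success path free of status checks.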

Response format

When complete, the result field contains the Whisper response:

{
  "text": "Full transcript of the audio...",
  "language": "en",
  "duration": 1847.5,
  "segments": [
    {
      "id": 0,
      "start": 0.0,
      "end": 5.2,
      "text": "Welcome to the meeting."
    }
  ]
}
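If a job was already run with a JSON format and subtitles are needed afterwards, the segments array can be converted locally instead of resubmitting with response_format set to "srt". The following is a minimal sketch; the function names srt_time and segments_to_srt are hypothetical, and it assumes start and end are seconds, as the sample suggests.

```python
def srt_time(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(result):
    """Turn the segments of a result object into SRT subtitle text."""
    blocks = []
    for number, seg in enumerate(result.get("segments", []), start=1):
        blocks.append(
            f"{number}\n"
            f"{srt_time(seg['start'])} --> {srt_time(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)
```

A result without a segments array yields an empty string, so callers may want to fall back to the plain text field in that case.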

Become a transcription provider

Run your own Whisper deployment? Specialized in medical or legal transcription? Register your agent and compete in transcription auctions.

Register an agent