
Overview

This tutorial walks you through uploading an audio file, transcribing it, and translating the transcript into one or more languages using the Tiro API.

Prerequisites

  • Valid Tiro API key
  • Audio file (MP3, WAV, M4A)
  • Max file size: 500MB
  • Max duration: 3 hours
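
The file-type and size limits above can be checked locally before a job is created. The following is a minimal sketch, not part of the Tiro API: the function name `check_audio_file` is hypothetical, and it assumes "500MB" means 500,000,000 bytes (the more conservative reading). It does not check the 3-hour duration limit.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not part of the Tiro API.
# Assumption: "500MB" is read as 500 * 1000 * 1000 bytes.
MAX_BYTES=$((500 * 1000 * 1000))

check_audio_file() {
  file="$1"
  if [ ! -f "$file" ]; then
    echo "not found: $file" >&2
    return 1
  fi
  # Lowercase the extension so "MP3" and "mp3" are treated alike.
  ext=$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    mp3|wav|m4a) ;;
    *) echo "unsupported extension: .$ext" >&2; return 2 ;;
  esac
  # wc -c prints the size in bytes; strip padding some platforms add.
  size=$(wc -c < "$file" | tr -d '[:space:]')
  if [ "$size" -gt "$MAX_BYTES" ]; then
    echo "file too large: $size bytes" >&2
    return 3
  fi
  return 0
}
```

Calling `check_audio_file /path/to/your/audio.mp3` returns 0 when the file passes and a non-zero code (1 missing, 2 wrong type, 3 too large) otherwise.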

Step 1: Create a Voice File Job

Start by creating a job with your desired transcription and translation settings:
curl -X POST https://api.tiro.ooo/v1/external/voice-file/jobs \
  -H "Authorization: Bearer $TIRO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "transcriptLocaleHints": ["ko_KR"],
    "translationLocales": ["en_US"]
  }'
Example Response:
{
  "id": "job_abc123",
  "uploadUri": "https://storage.example.com/upload/signed-url",
  "state": "created",
  "createdAt": "2024-01-01T12:00:00Z"
}

Step 2: Upload Your Audio File

Upload your audio file to the provided signed URL:
# Upload the audio file to the signed URL (set UPLOAD_URI to the uploadUri value returned in Step 1)
curl -X PUT "$UPLOAD_URI" \
  -H "Content-Type: audio/mpeg" \
  --data-binary @/path/to/your/audio.mp3
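
The example above hard-codes `audio/mpeg`, which matches MP3 files. For WAV or M4A input you may want a different Content-Type. The sketch below uses a hypothetical helper, `content_type_for`, that maps the extension to a commonly used MIME type. Whether the signed URL accepts these values is an assumption; the source only shows `audio/mpeg`.

```shell
#!/bin/sh
# Hypothetical helper: choose a Content-Type from the file extension.
# The wav and m4a values are common MIME types, not confirmed by the Tiro docs.
content_type_for() {
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    mp3) echo "audio/mpeg" ;;
    wav) echo "audio/wav" ;;
    m4a) echo "audio/mp4" ;;
    *)   return 1 ;;
  esac
}

# Possible usage (not executed here):
#   curl -X PUT "$UPLOAD_URI" \
#     -H "Content-Type: $(content_type_for "$AUDIO_FILE")" \
#     --data-binary @"$AUDIO_FILE"
```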

Step 3: Notify Upload Completion

Tell the API that the upload is complete (JOB_ID is the id returned in Step 1):
curl -X PUT "https://api.tiro.ooo/v1/external/voice-file/jobs/$JOB_ID/upload-complete" \
  -H "Authorization: Bearer $TIRO_API_KEY" \
  -H "Content-Type: application/json"

Step 4: Poll for Job Completion

Monitor the job status until processing is complete:
# Check job status (poll until status is COMPLETED or FAILED)
curl -X GET "https://api.tiro.ooo/v1/external/voice-file/jobs/$JOB_ID" \
  -H "Authorization: Bearer $TIRO_API_KEY"

# Example response:
# {
#   "id": "job_abc123",
#   "status": "COMPLETED",
#   "fileUploadedAt": "2024-01-01T12:00:20Z",
#   "processStartedAt": "2024-01-01T12:01:00Z",
#   "processCompletedAt": "2024-01-01T12:02:30Z"
# }
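
The timestamps in the example response can be used to see how long processing took. The sketch below assumes `jq` is installed (the complete example later in this tutorial also relies on it) and that the job has finished, since `processCompletedAt` must be present. The function name `job_timing` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a job JSON document on stdin and print
# "<status> <processing seconds>". Only meaningful for finished jobs.
job_timing() {
  jq -r '"\(.status) \((.processCompletedAt | fromdateiso8601) - (.processStartedAt | fromdateiso8601))"'
}

# Possible usage (not executed here):
#   curl -s "https://api.tiro.ooo/v1/external/voice-file/jobs/$JOB_ID" \
#     -H "Authorization: Bearer $TIRO_API_KEY" | job_timing
```

For the example response above, this prints `COMPLETED 90`, because processing ran from 12:01:00 to 12:02:30.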

Step 5: Retrieve Results

Once the job is completed, fetch the transcript and translations:
# Get transcript
curl -X GET "https://api.tiro.ooo/v1/external/voice-file/jobs/$JOB_ID/transcript" \
  -H "Authorization: Bearer $TIRO_API_KEY"

# Get all translations
curl -X GET "https://api.tiro.ooo/v1/external/voice-file/jobs/$JOB_ID/translations" \
  -H "Authorization: Bearer $TIRO_API_KEY"

# Get specific translation (e.g., English)
curl -X GET "https://api.tiro.ooo/v1/external/voice-file/jobs/$JOB_ID/translations/en_US" \
  -H "Authorization: Bearer $TIRO_API_KEY"

# Get transcript paragraph summary
curl -X GET "https://api.tiro.ooo/v1/external/voice-file/jobs/$JOB_ID/transcript/paragraph-summary" \
  -H "Authorization: Bearer $TIRO_API_KEY"

# Get translation paragraph summary
curl -X GET "https://api.tiro.ooo/v1/external/voice-file/jobs/$JOB_ID/translations/en_US/paragraph-summary" \
  -H "Authorization: Bearer $TIRO_API_KEY"

Complete Example

Here’s a complete example that processes an audio file from start to finish:
#!/bin/bash
# Complete Voice File Processing Script

# Set your API key
export TIRO_API_KEY="your_api_key_here"
AUDIO_FILE="/path/to/your/audio.mp3"

# Step 1: Create job
echo "Creating voice file job..."
RESPONSE=$(curl -s -X POST https://api.tiro.ooo/v1/external/voice-file/jobs \
  -H "Authorization: Bearer $TIRO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "transcriptLocaleHints": ["ko_KR"],
    "translationLocales": ["en_US"]
  }')

# Quote the response so the JSON is passed to jq unchanged
JOB_ID=$(echo "$RESPONSE" | jq -r '.id')
UPLOAD_URI=$(echo "$RESPONSE" | jq -r '.uploadUri')
echo "Job created: $JOB_ID"

# Step 2: Upload file
echo "Uploading audio file..."
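# --fail makes curl exit non-zero on HTTP errors, so a failed upload is not reported as a success
curl -s --fail -X PUT "$UPLOAD_URI" \
  -H "Content-Type: audio/mpeg" \
  --data-binary @"$AUDIO_FILE" || { echo "Upload failed" >&2; exit 1; }
echo "File uploaded successfully"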

# Step 3: Notify upload complete
echo "Notifying upload completion..."
curl -s -X PUT "https://api.tiro.ooo/v1/external/voice-file/jobs/$JOB_ID/upload-complete" \
  -H "Authorization: Bearer $TIRO_API_KEY" \
  -H "Content-Type: application/json"

# Step 4: Poll for completion
echo "Waiting for processing to complete..."
while true; do
  STATUS=$(curl -s -X GET "https://api.tiro.ooo/v1/external/voice-file/jobs/$JOB_ID" \
    -H "Authorization: Bearer $TIRO_API_KEY" | jq -r '.status')
  echo "Status: $STATUS"
  
  if [ "$STATUS" = "COMPLETED" ]; then
    echo "Job completed successfully!"
    break
  elif [ "$STATUS" = "FAILED" ]; then
    echo "Job failed!"
    exit 1
  fi
  
  sleep 5
done

# Step 5: Get results
echo "Retrieving results..."

echo "=== Transcript ==="
curl -s -X GET "https://api.tiro.ooo/v1/external/voice-file/jobs/$JOB_ID/transcript" \
  -H "Authorization: Bearer $TIRO_API_KEY" | jq '.text'

echo "=== Translation (en_US) ==="
curl -s -X GET "https://api.tiro.ooo/v1/external/voice-file/jobs/$JOB_ID/translations/en_US" \
  -H "Authorization: Bearer $TIRO_API_KEY" | jq '.text'

echo "=== Paragraph Summary ==="
curl -s -X GET "https://api.tiro.ooo/v1/external/voice-file/jobs/$JOB_ID/transcript/paragraph-summary" \
  -H "Authorization: Bearer $TIRO_API_KEY" | jq '.summary'

echo "Processing completed!"

Step 6: Understanding Paragraph Summaries

The Paragraph Summary feature summarizes your transcript and translations so you can quickly see the key points of an audio file.

What are Paragraph Summaries?

Paragraph summaries are automatically generated overviews of your transcript or translation content, broken down into logical sections. Each summary provides:
  • Concise Overview: Key points from each paragraph in markdown format
  • Language-Specific: Summaries generated for both transcript and translations
  • Structured Content: Easy to parse and integrate into your applications

Example Response

When you fetch paragraph summaries, you’ll receive a response like this:
{
  "jobId": "job-123",
  "locale": "ko_KR",
  "summary": [
    {
      "type": "markdown",
      "content": "### 회의는 프로젝트 진행 상황을 점검하고, 다음 단계에 대한 계획을 수립하는 데 중점을 두었습니다."
    },
    {
      "type": "markdown", 
      "content": "### 팀원들은 각자의 담당 업무에 대한 현재 상태를 보고하고, 발생한 이슈들에 대해 논의했습니다."
    }
  ]
}
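
To turn a response like the one above into plain markdown text, the `summary` array can be flattened with `jq`. This is a minimal sketch based only on the fields shown in the example response. The function name `summary_markdown` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a paragraph-summary response on stdin and
# print the content of each markdown entry, one per line.
summary_markdown() {
  jq -r '.summary[] | select(.type == "markdown") | .content'
}

# Possible usage (not executed here):
#   curl -s "https://api.tiro.ooo/v1/external/voice-file/jobs/$JOB_ID/transcript/paragraph-summary" \
#     -H "Authorization: Bearer $TIRO_API_KEY" | summary_markdown
```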

When are Summaries Available?

  • Transcript and Translation Summaries: Available after job completion (COMPLETED status)
    • COMPLETED means all processing is finished: transcript, translation (if requested), and paragraph summaries for both
  • Processing Time: Summaries are generated asynchronously, usually within 30-60 seconds after the main processing

Polling Best Practices

Exponential Backoff Strategy

  • Start delay: 2 seconds (processing takes time to start)
  • Growth factor: 1.2x (gentle increase)
  • Maximum delay: 30 seconds (avoid overwhelming the server)
  • Maximum attempts: Based on expected processing time
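
The strategy above can be sketched as follows. This is an illustration under stated assumptions, not an official client: `next_delay_ms` and `poll_with_backoff` are hypothetical names, delays are tracked in milliseconds so the 1.2x factor works with integer arithmetic, and `sleep` is assumed to accept fractional seconds (true for GNU and BSD `sleep`, not guaranteed by POSIX).

```shell
#!/bin/sh
# Hypothetical backoff helpers; names and structure are illustrative.
MAX_DELAY_MS=30000   # maximum delay: 30 seconds

# Print the next delay: current * 1.2, capped at MAX_DELAY_MS.
next_delay_ms() {
  next=$(( $1 * 12 / 10 ))
  if [ "$next" -gt "$MAX_DELAY_MS" ]; then next="$MAX_DELAY_MS"; fi
  echo "$next"
}

# poll_with_backoff GET_STATUS MAX_ATTEMPTS [START_MS]
# GET_STATUS is any command that prints the current job status.
# Returns 0 on COMPLETED, 1 on FAILED, 2 when attempts run out.
poll_with_backoff() {
  get_status="$1"
  max_attempts="$2"
  delay_ms="${3:-2000}"   # start delay: 2 seconds
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    status=$("$get_status")
    case "$status" in
      COMPLETED) return 0 ;;
      FAILED)    return 1 ;;
    esac
    # Sleep only if another attempt will follow.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$(printf '%d.%03d' $((delay_ms / 1000)) $((delay_ms % 1000)))"
      delay_ms=$(next_delay_ms "$delay_ms")
    fi
    attempt=$((attempt + 1))
  done
  return 2
}
```

With these settings the delays grow as 2.0s, 2.4s, 2.88s, 3.456s, and so on until they reach the 30-second cap. `GET_STATUS` could be a small function that calls the job endpoint with curl and extracts `.status` with jq.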

Job State Transitions

Processing Time Guidelines

File Duration  | Expected Processing Time | Recommended Poll Interval
< 5 minutes    | 30-60 seconds            | Start: 2s, Max: 10s
5-30 minutes   | 1-5 minutes              | Start: 5s, Max: 20s
30-120 minutes | 5-15 minutes             | Start: 10s, Max: 30s
> 2 hours      | 15-30 minutes            | Start: 30s, Max: 60s
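
The guidelines above can be encoded as a small lookup. In this sketch, `poll_interval_for` is a hypothetical helper that takes the file duration in whole minutes and prints the start and maximum poll interval in seconds. The table overlaps at exactly 30 minutes and does not say where exactly 120 minutes falls, so placing both in the 30-120 minute row is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: map file duration (whole minutes) to
# "<start seconds> <max seconds>" following the guidelines table.
# Assumption: exactly 30 and exactly 120 minutes use the 30-120 minute row.
poll_interval_for() {
  minutes="$1"
  if [ "$minutes" -lt 5 ]; then
    echo "2 10"
  elif [ "$minutes" -lt 30 ]; then
    echo "5 20"
  elif [ "$minutes" -le 120 ]; then
    echo "10 30"
  else
    echo "30 60"
  fi
}

# Possible usage: split the two values into variables.
#   set -- $(poll_interval_for 45); START_S="$1"; MAX_S="$2"
```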

Common Issues & Solutions

Upload Failures

  • Issue: 413 Payload Too Large
  • Solution: Check file size (max 500MB) and compress if needed
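
To detect this case in a script, the HTTP status code of the upload can be captured with curl's `-w '%{http_code}'` option. The sketch below is illustrative: `explain_upload_status` and `upload_file` are hypothetical names, and treating any 2xx code as success is an assumption about the signed URL's behavior.

```shell
#!/bin/sh
# Hypothetical helpers for reporting upload errors.
explain_upload_status() {
  case "$1" in
    2??) echo "upload ok"; return 0 ;;
    413) echo "413 Payload Too Large: check file size (max 500MB) and compress if needed" >&2
         return 1 ;;
    *)   echo "upload failed with HTTP $1" >&2; return 1 ;;
  esac
}

# upload_file UPLOAD_URI FILE
# curl prints only the status code; the response body is discarded.
upload_file() {
  code=$(curl -s -o /dev/null -w '%{http_code}' -X PUT "$1" \
    -H "Content-Type: audio/mpeg" --data-binary @"$2")
  explain_upload_status "$code"
}
```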

Processing Failures

  • Issue: Low audio quality
  • Solution: Ensure sample rate ≥8kHz and minimal background noise
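
The sample rate can be checked before uploading. The sketch below assumes `ffprobe` (part of FFmpeg) is installed; it is an external tool that the Tiro documentation does not mention. The helper names `audio_sample_rate` and `sample_rate_ok` are hypothetical. Background noise cannot be checked this way.

```shell
#!/bin/sh
# Hypothetical helpers; ffprobe is an external tool, not part of the Tiro API.
MIN_SAMPLE_RATE=8000   # 8kHz, from the guideline above

# Print the sample rate (Hz) of the first audio stream in the file.
audio_sample_rate() {
  ffprobe -v error -select_streams a:0 \
    -show_entries stream=sample_rate \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Succeed when the given rate meets the minimum.
sample_rate_ok() {
  [ "$1" -ge "$MIN_SAMPLE_RATE" ]
}

# Possible usage (not executed here):
#   sample_rate_ok "$(audio_sample_rate "$AUDIO_FILE")" || echo "sample rate too low" >&2
```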

Timeout Issues

  • Issue: Job doesn’t complete within expected time
  • Solution: Increase timeout for longer files, check job state for errors
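
One way to apply this is to poll against an overall deadline that you raise for longer files. The sketch below is an assumption-laden illustration: `wait_for_job` is a hypothetical name, and the status command passed to it is expected to print the job status (for example, by calling the job endpoint and extracting `.status` with jq).

```shell
#!/bin/sh
# wait_for_job GET_STATUS TIMEOUT_SECONDS [INTERVAL_SECONDS]
# Hypothetical helper: polls until COMPLETED, FAILED, or the deadline passes.
# Returns 0 on COMPLETED, 1 on FAILED, 2 on timeout.
wait_for_job() {
  get_status="$1"
  timeout_s="$2"
  interval_s="${3:-5}"
  deadline=$(( $(date +%s) + timeout_s ))
  while :; do
    status=$("$get_status")
    case "$status" in
      COMPLETED) return 0 ;;
      FAILED)    echo "job failed" >&2; return 1 ;;
    esac
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out after ${timeout_s}s (last status: $status)" >&2
      return 2
    fi
    sleep "$interval_s"
  done
}
```

For a file longer than two hours, a timeout of 1800 seconds or more would match the 15-30 minute processing estimate in the guidelines above.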
