Amazon Transcribe
An automatic speech recognition (ASR) service that converts speech to text.
Exam Tip: Transcribe = speech to text. If the question involves converting audio/video to text, the answer is Transcribe. Remember it also supports toxicity detection and custom vocabulary.
Key Capabilities
- Real-time streaming transcription
- Batch transcription from audio/video files
- Speaker diarization (labels which speaker said what in multi-speaker audio)
- Automatic punctuation and capitalization
- Channel identification (transcribes each audio channel separately, e.g. agent and customer on a two-channel call recording)
- Custom vocabulary (add domain-specific words)
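A minimal sketch of how a batch job with speaker or channel separation might be requested through boto3. The helper only builds the keyword arguments for `start_transcription_job`, so it runs without AWS credentials. The helper name `build_job_request`, the job name, and the S3 URI are hypothetical; the parameter names follow the Transcribe API as documented, and the two modes are kept mutually exclusive here as a conservative assumption.

```python
def build_job_request(job_name, media_uri, mode="speakers", max_speakers=2):
    """Build kwargs for boto3's transcribe.start_transcription_job (hypothetical helper)."""
    if mode == "speakers":
        # Diarization: label who spoke when within a single audio channel.
        settings = {"ShowSpeakerLabels": True, "MaxSpeakerLabels": max_speakers}
    elif mode == "channels":
        # Channel identification: transcribe each audio channel separately.
        settings = {"ChannelIdentification": True}
    else:
        raise ValueError("mode must be 'speakers' or 'channels'")
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "LanguageCode": "en-US",
        "Settings": settings,
    }


# Placeholder job name and bucket, for illustration only.
request = build_job_request("demo-call-001", "s3://example-bucket/call.wav", mode="channels")

# With credentials configured, the request could be submitted like this:
#   import boto3
#   boto3.client("transcribe").start_transcription_job(**request)
```

Building the request as a plain dict keeps the example testable offline and makes the difference between the two modes easy to see.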
Custom Language Models
- Train custom language models with your domain-specific text
- Improves accuracy for industry-specific terminology
- Provide text data from your domain (no audio needed)
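The steps above can be sketched as two request payloads: one to train a model from text in S3, and one to apply it to a transcription job. This is an illustration under assumptions, not a complete workflow: the helper names, model name, bucket, and role ARN are hypothetical, and the parameter names follow boto3's `create_language_model` and `start_transcription_job` as documented.

```python
def build_language_model_request(model_name, training_s3_uri, role_arn, base_model="WideBand"):
    """Build kwargs for boto3's transcribe.create_language_model (hypothetical helper)."""
    if base_model not in ("NarrowBand", "WideBand"):
        raise ValueError("base_model must be 'NarrowBand' or 'WideBand'")
    return {
        "ModelName": model_name,
        "LanguageCode": "en-US",
        # NarrowBand targets audio sampled below 16 kHz (e.g. telephony);
        # WideBand targets audio sampled at 16 kHz or above.
        "BaseModelName": base_model,
        # Plain-text training data in S3; no audio is involved.
        "InputDataConfig": {"S3Uri": training_s3_uri, "DataAccessRoleArn": role_arn},
    }


def use_language_model(model_name):
    """Fragment to merge into start_transcription_job kwargs once the model has trained."""
    return {"ModelSettings": {"LanguageModelName": model_name}}


# Placeholder values, for illustration only.
create_request = build_language_model_request(
    "example-legal-model",
    "s3://example-bucket/training-text/",
    "arn:aws:iam::123456789012:role/ExampleTranscribeRole",
)
job_fragment = use_language_model("example-legal-model")
```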
Improving Accuracy
- Custom Vocabulary: Add specific words, acronyms, or proper nouns
- Custom Language Models: Train with domain-specific text
- Vocabulary Filters: Remove, mask, or tag unwanted words in the transcript
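As a rough sketch, a custom vocabulary and a vocabulary filter are each created once and then referenced by name in the job settings. The helper names, resource names, and word lists below are hypothetical; the parameter names follow boto3's `create_vocabulary`, `create_vocabulary_filter`, and the `Settings` argument of `start_transcription_job` as documented.

```python
LANGUAGE = "en-US"


def vocabulary_request(name, phrases):
    """Kwargs for transcribe.create_vocabulary: terms the service should recognize."""
    return {"VocabularyName": name, "LanguageCode": LANGUAGE, "Phrases": list(phrases)}


def vocabulary_filter_request(name, words):
    """Kwargs for transcribe.create_vocabulary_filter: words to suppress in output."""
    return {"VocabularyFilterName": name, "LanguageCode": LANGUAGE, "Words": list(words)}


def job_settings(vocabulary_name, filter_name, filter_method="mask"):
    """Settings fragment for start_transcription_job referencing both resources."""
    if filter_method not in ("remove", "mask", "tag"):
        raise ValueError("filter_method must be 'remove', 'mask', or 'tag'")
    return {
        "VocabularyName": vocabulary_name,
        "VocabularyFilterName": filter_name,
        "VocabularyFilterMethod": filter_method,
    }


# Placeholder names and words, for illustration only.
vocab = vocabulary_request("example-tech-terms", ["Kubernetes", "Terraform"])
vfilter = vocabulary_filter_request("example-blocklist", ["darn", "heck"])
settings = job_settings("example-tech-terms", "example-blocklist", filter_method="mask")
```

Note the opposite roles: the vocabulary adds words the service should recognize, while the filter controls words that should not appear in the output.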
Toxicity Detection
- Identify toxic content in transcribed speech
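- Categories: profanity, hate speech, sexual content, insults, violence or threats, graphic content, harassment or abuse
- Confidence score (0 to 1) for each toxicity category, plus an overall toxicity score for each speech segment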
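To make the scores concrete, here is a small sketch that scans a transcript for segments above a toxicity threshold and reports the highest-scoring category. The JSON shape (`results.toxicity_detection` with `toxicity`, `categories`, `start_time`, `end_time`) is an assumption based on the documented output format, the sample values are invented, and `flag_toxic_segments` is a hypothetical helper. Toxicity detection itself is requested on a batch job with `ToxicityDetection=[{"ToxicityCategories": ["ALL"]}]`.

```python
def flag_toxic_segments(transcript, threshold=0.5):
    """Return (start, end, top_category) for segments at or above the threshold."""
    flagged = []
    for seg in transcript.get("results", {}).get("toxicity_detection", []):
        if seg["toxicity"] >= threshold:
            # Pick the category with the highest confidence score.
            top = max(seg["categories"], key=seg["categories"].get)
            flagged.append((seg["start_time"], seg["end_time"], top))
    return flagged


# Invented sample data in the assumed output shape.
sample = {
    "results": {
        "toxicity_detection": [
            {
                "text": "(redacted)",
                "toxicity": 0.82,
                "categories": {"profanity": 0.91, "insult": 0.40, "hate_speech": 0.02},
                "start_time": 0.0,
                "end_time": 3.5,
            },
            {
                "text": "thanks for calling",
                "toxicity": 0.05,
                "categories": {"profanity": 0.01, "insult": 0.02, "hate_speech": 0.0},
                "start_time": 3.5,
                "end_time": 5.0,
            },
        ]
    }
}

flagged = flag_toxic_segments(sample)
print(flagged)  # [(0.0, 3.5, 'profanity')]
```

The threshold of 0.5 is arbitrary; a real moderation workflow would tune it per category.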
Transcribe Medical
- Specialized version for medical speech-to-text
- HIPAA-eligible for processing protected health information (PHI)
- Supports medical dictation and telemedicine conversations
- Recognizes medical terminology (drug names, procedures, conditions)
- Available in streaming and batch modes
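A minimal sketch of a batch Transcribe Medical request, distinguishing dictation from conversation audio. The helper name, job name, and bucket names are hypothetical; the parameter names follow boto3's `start_medical_transcription_job` as documented, which takes its own job-name parameter plus a specialty and an audio type.

```python
def build_medical_job_request(job_name, media_uri, output_bucket, conversation=True):
    """Build kwargs for transcribe.start_medical_transcription_job (hypothetical helper)."""
    return {
        "MedicalTranscriptionJobName": job_name,
        "LanguageCode": "en-US",
        "Media": {"MediaFileUri": media_uri},
        # Medical jobs write their transcript to a bucket you own.
        "OutputBucketName": output_bucket,
        "Specialty": "PRIMARYCARE",
        # CONVERSATION suits clinician-patient dialogue; DICTATION suits a single speaker's notes.
        "Type": "CONVERSATION" if conversation else "DICTATION",
    }


# Placeholder names, for illustration only.
visit = build_medical_job_request(
    "example-telehealth-visit", "s3://example-bucket/visit.wav", "example-output-bucket"
)
notes = build_medical_job_request(
    "example-dictation", "s3://example-bucket/notes.wav", "example-output-bucket",
    conversation=False,
)
```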