Forced Alignment and Speech Recognition Systems
Overview
● Uses of automatic speech recognition technology
● Principles of forced alignment and speech recognition systems
● Some practicalities
● Evaluating alignment quality
ASR technology - Existing uses

As a toolbox: Pre-built generic application used as a tool
> speech recognition
> forced alignment for lexical transcription or time stamps

As a methodology: Forms an integral part of the experimental procedure
> tune for unusual pronunciations
> extract probability of words/phonemes matching models
> detect assimilation, deletion, insertion
Forced alignment
With transcription: Already know exactly what is in the audio.
[Figure labels: Align]

Forced alignment
With some transcription: Know what is in some of the audio.
[Figure labels: Align; Noise; Noise]

Automatic speech recognition
No transcription: Don't know what's in the audio.
[Figure label: Estimate]
Word spotting
Possible transcription: Looking for a word/phrase
Is P(“policy”) >> P(noise)?
[Animation: the test is repeated at successive positions in the audio. The first five positions are labelled Noise; at the last: Yes! P(“policy”) >> P(noise)]
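The word-spotting test above can be sketched as a log-likelihood ratio check at each position in the audio. This is a minimal illustration, not the method of any particular toolkit: the function name `spot_keyword`, the threshold value, and all scores are made up for the example, and a real system would obtain the scores from its keyword and noise (background) models.

```python
def spot_keyword(keyword_loglik, noise_loglik, threshold=5.0):
    """Return the positions where the keyword model clearly beats the noise model.

    keyword_loglik[i] and noise_loglik[i] are the log-likelihoods of the audio
    at position i under the keyword model and the noise model respectively.
    The threshold (an assumed value) stands in for the ">>" on the slide.
    """
    hits = []
    for i, (kw, bg) in enumerate(zip(keyword_loglik, noise_loglik)):
        # log P(keyword) - log P(noise) is the log-likelihood ratio
        if kw - bg > threshold:
            hits.append(i)
    return hits


# Made-up scores for six positions; only the last one favours the keyword.
keyword_scores = [-42.0, -40.5, -41.0, -39.0, -38.0, -20.0]
noise_scores = [-30.0, -31.0, -29.5, -30.5, -30.0, -33.0]

hits = spot_keyword(keyword_scores, noise_scores)
print(hits)
```

Working in the log domain turns the ratio P(keyword)/P(noise) into a difference, which avoids numerical underflow on long stretches of audio.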
Early Automatic Speech Recognition (ASR) Systems
[Block diagram: Acoustic Property Extraction → Decision Process]

Automatic Speech Recognition System
[Block diagram: Property Extraction → acoustic vectors → Evaluation: Likelihood score; inputs to the evaluation: Acoustic models in memory, Grammar or Language model]
Forced Alignment system
[Block diagram: Property Extraction → acoustic vectors → Evaluation: Likelihood score; inputs to the evaluation: Acoustic models in memory, Lexicon & Transcription]

Lexicon & Transcription
Words       phonemes
ADVERSITY   AE0 D V ER1 S AH0 T IY0
BLUEST      B L UW1 IH0 S T
BLUESTEIN   B L UH1 S T AY0 N
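The evaluation step of a forced aligner can be sketched as a dynamic-programming (Viterbi-style) search: the phone sequence is fixed by the lexicon and transcription, and the search only decides where each phone starts and ends. The sketch below is a simplification under stated assumptions: one state per phone, no transition probabilities, a 10 ms frame shift, and made-up per-frame log-likelihoods. The names `force_align` and `to_segments` are hypothetical.

```python
def force_align(loglik):
    """Find the best monotonic frame-to-phone alignment.

    loglik[t][j] is the log-likelihood of frame t under the j-th phone of the
    transcript. Returns one phone index per frame; the path starts at the
    first phone, ends at the last, and never skips or goes back.
    Requires at least as many frames as phones.
    """
    n_frames, n_phones = len(loglik), len(loglik[0])
    neg_inf = float("-inf")
    best = [[neg_inf] * n_phones for _ in range(n_frames)]
    back = [[0] * n_phones for _ in range(n_frames)]
    best[0][0] = loglik[0][0]
    for t in range(1, n_frames):
        for j in range(n_phones):
            stay = best[t - 1][j]                          # remain in phone j
            move = best[t - 1][j - 1] if j > 0 else neg_inf  # advance from j-1
            if move > stay:
                best[t][j], back[t][j] = move + loglik[t][j], j - 1
            else:
                best[t][j], back[t][j] = stay + loglik[t][j], j
    # Trace back from the last phone at the last frame.
    path = [n_phones - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


def to_segments(path, phones, shift_ms=10):
    """Collapse a per-frame path into (phone, start_ms, end_ms) segments."""
    segments, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segments.append((phones[path[start]], start * shift_ms, t * shift_ms))
            start = t
    return segments


# Made-up scores: 6 frames, 3 phones (the first three phones of BLUEST).
phones = ["B", "L", "UW1"]
scores = [
    [-1, -5, -9],
    [-1, -4, -9],
    [-6, -1, -7],
    [-7, -1, -5],
    [-8, -6, -1],
    [-9, -7, -1],
]
path = force_align(scores)
segments = to_segments(path, phones)
print(segments)
```

The output is the list of time stamps that forced alignment is used for: each phone with its start and end time in milliseconds.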
Speech transformation:
Analogue-to-digital conversion → digitized audio
Split digitized audio into overlapping frames (~10 ms)
Pre-processor/Front end (e.g. FFT, LPC, MFCC) → acoustic vectors
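The framing step can be sketched as follows. The slide gives only "~10 ms"; this example assumes that figure is the shift between frame starts and that each frame is 25 ms long, a common front-end default, so that neighbouring frames overlap. The function name `split_into_frames` and its parameters are illustrative only.

```python
def split_into_frames(samples, sample_rate=16000, frame_ms=25, shift_ms=10):
    """Split a list of audio samples into overlapping fixed-length frames.

    frame_ms (assumed 25 ms) is the length of each frame; shift_ms (assumed
    10 ms) is the step between frame starts. Trailing samples that do not
    fill a whole frame are dropped.
    """
    frame_len = sample_rate * frame_ms // 1000  # 400 samples at 16 kHz
    shift = sample_rate * shift_ms // 1000      # 160 samples at 16 kHz
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += shift
    return frames


# One second of silence at 16 kHz as stand-in audio.
samples = [0.0] * 16000
frames = split_into_frames(samples)
print(len(frames), len(frames[0]))
```

Each frame would then be passed to the front end (FFT, LPC or MFCC analysis) to produce one acoustic vector per frame.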
● Audio sampling rate (16 kHz best)
● Lists of phone and silence model names
● Dictionary/Lexicon (ARPAbet)

ADVERSE      AH0 D V ER1 S
ADVERSELY    AE0 D V ER1 S L IH0
ADVERSELY    AE0 D V ER1 S L IY0
ADVERSITIES  AH0 D V ER1 S IH0 T IH0 Z
ADVERSITY    AE0 D V ER1 S AH0 T IY0
BLUEST       B L UW1 IH0 S T
BLUESTEIN    B L UH1 S T AY0 N
BLUESTEIN    B L UH1 S T IY0 N
BLUESTINE    B L UW1 S T AY2 N
ZZZ          sp
ZZZ          ns
…
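A dictionary in this layout (a word followed by its ARPAbet phones, with repeated entries for alternative pronunciations) can be read with a few lines of code. This is a minimal sketch: `load_lexicon` and `transcript_to_phones` are hypothetical helper names, and taking the first listed variant is a simplification made here; an aligner would normally let the acoustic models choose between variants.

```python
from collections import defaultdict

# A few entries in the same layout as the dictionary above.
LEXICON_TEXT = """\
ADVERSITY    AE0 D V ER1 S AH0 T IY0
BLUEST       B L UW1 IH0 S T
BLUESTEIN    B L UH1 S T AY0 N
BLUESTEIN    B L UH1 S T IY0 N
"""


def load_lexicon(text):
    """Map each word to the list of its pronunciations (each a list of phones)."""
    lexicon = defaultdict(list)
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        word, phones = parts[0], parts[1:]
        lexicon[word].append(phones)
    return lexicon


def transcript_to_phones(transcript, lexicon):
    """Expand a word-level transcript into one flat phone sequence.

    Simplification: always uses the first pronunciation variant listed.
    """
    phones = []
    for word in transcript.upper().split():
        if word not in lexicon:
            raise KeyError(f"word not in lexicon: {word}")
        phones.extend(lexicon[word][0])
    return phones


lexicon = load_lexicon(LEXICON_TEXT)
phone_sequence = transcript_to_phones("bluest adversity", lexicon)
print(phone_sequence)
print(len(lexicon["BLUESTEIN"]), "pronunciations for BLUESTEIN")
```

Words missing from the dictionary raise an error here, which mirrors a common practical problem: every word in the transcription needs a lexicon entry before alignment can run.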