Time and Frequency – Formants and Harmonics

This post first discusses time-frequency uncertainty (yes, it is like Heisenberg's position-momentum uncertainty principle). We then look at formants and harmonics and the corresponding wideband and narrowband spectrograms.

So, what exactly is the time-frequency problem? To generate a spectrogram, the usual practice is to take a window of a certain size, compute the FFT of the signal inside that window, and then slide the window along the signal and repeat. Stacking these spectra side by side produces a time-frequency graph. The size of the window matters a great deal. In short:

Long window – Bad time resolution – Good frequency resolution – Narrowband spectrogram – Harmonics
Short window – Good time resolution – Bad frequency resolution – Wideband spectrogram – Formants
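To put rough numbers on the two lines above, here is a minimal sketch using the common rule of thumb that a window of duration T can separate frequencies about 1/T apart. The function name is made up for illustration, and the exact constant depends on the window shape (a Hann window, for instance, blurs a bit more than 1/T), so treat the values as order-of-magnitude estimates.

```python
def frequency_resolution_hz(window_seconds):
    """Approximate smallest frequency spacing (Hz) a window can separate: 1 / T."""
    if window_seconds <= 0:
        raise ValueError("window length must be positive")
    return 1.0 / window_seconds


# The two window lengths used in this post: 5 ms (wideband) and 30 ms (narrowband).
for label, window in [("wideband", 0.005), ("narrowband", 0.030)]:
    resolution = frequency_resolution_hz(window)
    print(f"{label}: {window * 1000:.0f} ms window -> ~{resolution:.0f} Hz")
```

So the 5 ms window blurs frequency over roughly 200 Hz, while the 30 ms window narrows that to roughly 33 Hz, at the cost of averaging over six times as much signal in time.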

What exactly do we mean by time and frequency resolution? Time resolution is how precisely we can point to a section of the audio and say that the analysed spectrum belongs to that particular part. Frequency resolution, on the other hand, is how precisely we can say that a given frequency component, accurate to within a few Hz, is present in the windowed speech sample.

The wideband spectrogram (5 ms window) shows clear formants. However, shrinking the window too far smears the frequency axis so much that the formants become hard to detect. Formants are acoustic resonances of the human vocal tract, which continuously changes shape to produce all the sounds that we make. Formants alone can be used to classify the vowels (essentially voiced sounds) that we produce; generally, the first two, F1 and F2, are used for this. Vowels are generally found to have around 4 formants. Other factors also play a role, such as plosives close to vowels, which lower the formant frequencies. Below is a picture of the formants.

Formant Structure - seen in red
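The idea of classifying vowels from F1 and F2 alone can be sketched as a nearest-neighbour lookup. The reference values below are approximate textbook averages for adult male speakers, included only as an illustration; they are not measured from the audio in this post, and the function name is hypothetical. A real classifier would need more vowels and some speaker normalisation.

```python
import math

# Approximate average (F1, F2) values in Hz for three vowels, adult male speakers.
# Illustrative reference points only, not measurements from this post.
REFERENCE_FORMANTS = {
    "i": (270.0, 2290.0),  # as in "beet"
    "a": (730.0, 1090.0),  # as in "father"
    "u": (300.0, 870.0),   # as in "boot"
}


def classify_vowel(f1_hz, f2_hz):
    """Return the reference vowel whose (F1, F2) point is closest in Hz."""
    def distance(vowel):
        ref_f1, ref_f2 = REFERENCE_FORMANTS[vowel]
        return math.hypot(f1_hz - ref_f1, f2_hz - ref_f2)

    return min(REFERENCE_FORMANTS, key=distance)


print(classify_vowel(290, 2200))  # lands nearest the "i" reference point
```

Plain Euclidean distance in Hz is the simplest choice here; it weights F2 differences heavily because F2 spans a wider range than F1.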


Now, coming to harmonics: these are simply integer multiples of the fundamental frequency (also called F0), which we perceive as the pitch. They are seen in narrowband spectrograms (30 ms window). Harmonics also have a role to play in speaker recognition and speech recognition. Below is a picture clearly showing the harmonics in speech.

Harmonics - Bands with darker regions of formants
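Why harmonics show up only in the narrowband spectrogram can be checked with a little arithmetic. The sketch below assumes an F0 of 120 Hz (a plausible adult male pitch, chosen purely for illustration) and reuses the rough rule that a window of duration T separates frequencies about 1/T apart. Both function names are made up for this example.

```python
def harmonic_frequencies(f0_hz, max_hz):
    """Integer multiples of the fundamental f0_hz, up to and including max_hz."""
    count = int(max_hz // f0_hz)
    return [k * f0_hz for k in range(1, count + 1)]


def resolves_harmonics(window_seconds, f0_hz):
    """Rule of thumb: harmonics separate when 1/T is smaller than their spacing, F0."""
    return (1.0 / window_seconds) < f0_hz


F0_HZ = 120  # assumed pitch, for illustration only

print(harmonic_frequencies(F0_HZ, 600))
print("30 ms window resolves harmonics:", resolves_harmonics(0.030, F0_HZ))
print("5 ms window resolves harmonics:", resolves_harmonics(0.005, F0_HZ))
```

With harmonics spaced 120 Hz apart, the 30 ms window (about 33 Hz resolution) keeps them as separate bands, while the 5 ms window (about 200 Hz) merges neighbouring harmonics, leaving only the broader formant envelope visible.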


These images were generated using a program called Praat. I may write a post on that sometime.

