Why 44.1kHz and not 40kHz for Audio

So, this was one of the questions I was asked during my Nvidia interview (when I said I was interested in multimedia): why use a 44.1 kHz sampling rate rather than 40 kHz (twice the highest audible frequency), when most humans can’t hear above about 18.5 kHz, and certainly not above the 20 kHz mark? Well, I didn’t know the answer then, but here it is now. It’s mostly quoted from John Watkinson’s book, The Art of Digital Audio.

So basically, “In the early days of digital audio research, the necessary bandwidth of about 1 Mbps per audio channel was difficult to store. Disk drives had the bandwidth but not the capacity for long recording time, so attention turned to video recorders. These were adapted to store audio samples by creating a pseudo-video waveform which would convey binary as black and white levels. The sampling rate of such a system is constrained to relate simply to the field rate and field structure of the television standard (NTSC / PAL – note the refresh rates used ahead), so that an integer number of samples can be stored on each usable TV line in the field. Such a recording can be made on a monochrome recorder, and these recordings are made in two standards, 525 lines at 60 Hz and 625 lines at 50 Hz. Thus it is possible to find a frequency which is a common multiple of the two and is also suitable for use as a sampling rate.

The allowable sampling rates in a pseudo-video system can be deduced by multiplying the field rate by the number of active lines in a field (blanking lines cannot be used) and again by the number of samples in a line. By careful choice of parameters it is possible to use either 525/60 or 625/50 video with a sampling rate of 44.1 kHz.”

Aha, so that’s the point. Digital audio stored on video recorders was sampled at 44.1 kHz to best suit the video setup, not the audio. But the tradition continues, and even today, in the days of MP3 (ripped again from CD / DVD / 44.1 kHz content), we still do not bother to resample the data (something which could harm quality). The book goes on to explain the calculation as follows.

“In 60 Hz video, there are 35 blanked lines, leaving 490 lines per frame or 245 lines per field, so the sampling rate is given by 60 × 245 × 3 = 44.1 kHz. While, in 50 Hz video, there are 37 lines of blanking, leaving 588 active lines per frame, or 294 per field, so the same sampling rate is given by 50 × 294 × 3 = 44.1 kHz.

The sampling rate of 44.1 kHz came to be that of the Compact Disc. Even though CD has no video circuitry, the equipment used to make CD masters is video based and determines the sampling rate.”
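If you want to check the arithmetic from the quote yourself, here is a small Python sketch. The function name and its parameters are my own for illustration, not anything from the book; the line counts, blanking figures and the 3 samples per line are taken straight from the quoted passage.

```python
def pseudo_video_sample_rate(field_rate_hz, lines_per_frame, blanked_lines_per_frame,
                             samples_per_line=3):
    """Sampling rate (Hz) of a pseudo-video audio recording.

    Hypothetical helper: rate = field rate x active lines per field x samples per line,
    using the figures given in the quoted passage.
    """
    active_per_frame = lines_per_frame - blanked_lines_per_frame
    # A frame is made of two interlaced fields, so the active lines split in half.
    assert active_per_frame % 2 == 0, "active lines must divide evenly into two fields"
    active_per_field = active_per_frame // 2
    return field_rate_hz * active_per_field * samples_per_line


# 525 lines at 60 Hz, 35 blanked lines -> 490 active per frame, 245 per field
rate_60 = pseudo_video_sample_rate(60, 525, 35)

# 625 lines at 50 Hz, 37 blanked lines -> 588 active per frame, 294 per field
rate_50 = pseudo_video_sample_rate(50, 625, 37)

print(rate_60, rate_50)  # 44100 44100
```

Both standards land on exactly 44,100 samples per second, which is the whole trick: one sampling rate that fits an integer number of samples per line in either video system.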

That’s the secret behind the fishy 44.1 kHz sampling rate 🙂


One thought on “Why 44.1kHz and not 40kHz for Audio”

  1. Paul

    Can anyone explain the remark “While, in 50 Hz video, there are 37 lines of blanking”? To the best of my knowledge, there are 50 lines of blanking per frame in a 50 Hz system (25 lines and 12 us per field). Some respected authors state 37 lines but only ever in the context of CD sampling rates. Does anyone know of the specification that defines 37 lines or whether digital audio data was in fact placed on 13 VBI lines plus the active lines?


