Audio codecs, keep it real.

After searching online for a decent qualitative analysis of the AAC / OGG codecs, I could not find one that was both correct and recent, so I decided to run a test myself.

The primary aim of these tests was to find the best candidate for audio storage on my Android phone. The possible choices are quite limited: it could only be AAC or OGG. The audio sample used as the source for the comparison was extracted from a compact disc using cdparanoia and was chosen as a good representative of what I usually listen to (English rock – Manchester / 1994).

The following tools were used for the compression :

  • FAAC 1.28 :
    faac input.wav -q <quality> -o output.aac
  • Vorbis tools 1.4.0 :
    oggenc -q <quality> input.wav
  • Spectral analysis was done using a Python script to compare spectral bands (available here) and Audacity to display the masking impact.
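
To give an idea of what a per-frequency attenuation comparison can look like, here is a minimal sketch. It is not the script mentioned above: it uses the Goertzel algorithm on two synthetic signals (a hypothetical "original" with 1 kHz and 15 kHz tones, and a hypothetical "encoded" version where the 15 kHz tone has been removed), and every function name in it is made up for the illustration.

```python
import math

def goertzel_power(samples, sample_rate, freq):
    """Return the signal power at the DFT bin nearest to `freq` (Goertzel)."""
    n = len(samples)
    k = round(n * freq / sample_rate)  # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    power = s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2
    return max(power, 0.0)  # guard against tiny negative rounding errors

def attenuation_db(original, encoded, sample_rate, freq, floor=1e-12):
    """Attenuation of `encoded` relative to `original` at `freq`, in dB."""
    p_orig = goertzel_power(original, sample_rate, freq) + floor
    p_enc = goertzel_power(encoded, sample_rate, freq) + floor
    return 10.0 * math.log10(p_enc / p_orig)

def tone(freq, amplitude, sample_rate, n):
    """Generate n samples of a sine wave (synthetic test data)."""
    return [amplitude * math.sin(2.0 * math.pi * freq * i / sample_rate)
            for i in range(n)]

RATE = 44100  # CD sample rate in Hz
N = 4410      # 0.1 s of audio

# Hypothetical signals: the "encoder" dropped the 15 kHz component.
original = [a + b for a, b in zip(tone(1000, 1.0, RATE, N),
                                  tone(15000, 0.5, RATE, N))]
encoded = tone(1000, 1.0, RATE, N)

for f in (1000, 15000):
    print(f, "Hz:", round(attenuation_db(original, encoded, RATE, f), 1), "dB")
```

With real files, the two sample lists would come from the decoded WAV data instead of `tone()`, and the probe frequencies would cover the whole spectrum.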

Here are the results (attenuation (dB) / frequency (Hz)). For AAC :
AAC

For OGG :
OGG

And the corresponding quality parameter / filesize :

OGG quality   OGG size   AAC quality   AAC size   Lossless
3             3.2 MB     70            3.1 MB     41 MB
5             4.5 MB     110           4.4 MB
6             5.6 MB     140           5.5 MB
7             6.5 MB     200           6.4 MB
8             7.8 MB     325           7.7 MB
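
The sizes above can be turned into compression ratios, and into rough average bitrates, with a few lines of Python. This is only a sketch: the bitrate estimate assumes that the 41 MB lossless file is an uncompressed CD-quality WAV (44.1 kHz, 16-bit, stereo) and that 1 MB means 10^6 bytes, neither of which is stated above.

```python
# File sizes in MB, copied from the table.
LOSSLESS_MB = 41.0
SIZES_MB = {
    "OGG": {3: 3.2, 5: 4.5, 6: 5.6, 7: 6.5, 8: 7.8},
    "AAC": {70: 3.1, 110: 4.4, 140: 5.5, 200: 6.4, 325: 7.7},
}

# ASSUMPTION: lossless file = uncompressed CD audio, 1 MB = 10**6 bytes.
CD_BYTES_PER_SECOND = 44100 * 2 * 2  # sample rate * 2 bytes * 2 channels

def compression_ratio(size_mb):
    """How many times smaller the encoded file is than the lossless one."""
    return LOSSLESS_MB / size_mb

def estimated_duration_s():
    """Track length implied by the lossless size (under the assumption above)."""
    return LOSSLESS_MB * 1e6 / CD_BYTES_PER_SECOND

def estimated_bitrate_kbps(size_mb):
    """Average bitrate implied by the file size and the estimated duration."""
    return size_mb * 1e6 * 8 / estimated_duration_s() / 1000

for codec, sizes in SIZES_MB.items():
    for quality, size_mb in sizes.items():
        print(codec, quality,
              "ratio %.1f" % compression_ratio(size_mb),
              "~%.0f kbps" % estimated_bitrate_kbps(size_mb))
```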

First thoughts :

  • AAC at a quality below 110 should be avoided unless the file is voice only, because it cuts middle and high frequencies.
  • At quality 200, AAC has very good spectral resolution, close to the original file.
  • As frequencies above 20 kHz are inaudible to humans or not rendered by audio devices [1], OGG at quality 5 is the right choice to encode music while minimizing space usage (better resolution than AAC at quality 110 for the same file size). This is also the quality level used by default by Spotify [2].
  • OGG at quality 6 provides quite good spectral resolution, very close to OGG at quality 7 and 8.
  • OGG at quality 6 is also the first level to enable lossless coupling [3], although (lossy) coupling can be disabled at lower levels [4] (quality 5 -> quality 5 without coupling: size increased by 7%, still smaller than quality 6).
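
As a quick check of the last point, the arithmetic can be written out as follows. The 7% overhead and the two file sizes come from the figures above; the variable names are only illustrative.

```python
Q5_MB = 4.5                 # OGG quality 5, with lossy coupling
Q6_MB = 5.6                 # OGG quality 6
UNCOUPLED_OVERHEAD = 0.07   # size increase when coupling is disabled at q5

# Estimated size of the quality 5 file once coupling is disabled.
q5_uncoupled_mb = Q5_MB * (1 + UNCOUPLED_OVERHEAD)

print("q5 without coupling: %.2f MB" % q5_uncoupled_mb)
print("still smaller than q6:", q5_uncoupled_mb < Q6_MB)
```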

More specifically, here is the representation of the spectral removal. Original file :
spectral_master
AAC at quality 200 :
spectral_aac
OGG at quality 7 :
spectral_ogg

As we can see in the middle and at the end of the sample, the amount of data required by the wide spectral resolution of AAC is offset by frequency cuts in the audible domain, mainly through more aggressive temporal masking [5] and, to a lesser extent, through heavier simultaneous masking [6] than for OGG, which only cuts high (and inaudible anyway) frequencies.

In summary, the first thing to do is to realize that you probably cannot hear high frequencies anymore [1, again] (and if you don’t trust your equipment, just look at your cat's or dog's face while running the test).
The best candidate for the specified usage seems to be OGG at quality 5, for its restrained masking in the audible domain, its limited size and its good spectral resolution. It could be improved by disabling (lossy) coupling, and it is also natively supported under Linux / Android.
If you are young enough to hear 20 kHz, OGG at quality 6 or AAC at quality 140 seems fine. And if, for weird reasons, you want to keep all the inaudible frequencies (music for your dog, maybe?), AAC at quality 200 is the right choice.

[1] : http://www.audiocheck.net/audiotests_frequencycheckhigh.php
[2] : http://support.spotify.com/se/learn-more/faq/#!/article/What-bitrate-does-Spotify-use-for-streaming
[3] : http://wiki.hydrogenaudio.org/index.php?title=Recommended_Ogg_Vorbis#Recommended_Encoder_Settings
[4] : http://wiki.hydrogenaudio.org/index.php?title=Recommended_Ogg_Vorbis#Enabling_and_disabling_Vorbis_5.1.2F7.1_Channel_Coupling_for_Use_in_Mainline
[5] : https://en.wikipedia.org/wiki/Auditory_masking#Temporal_masking
[6] : https://en.wikipedia.org/wiki/Auditory_masking#Simultaneous_masking