IMPORTANT INFORMATION!
The creator and maintainer of this web site Tony Sanderson died in June 2006. This web site is being maintained in his memory by others.
As a result information on this web site IS NOT CURRENT OR ACCURATE and should not be relied upon at all.

A console-window driven audio peak-limiter

Background


LEVEL is a non-realtime command-line-driven audio levelling (AGC) utility. It currently accepts 'WAV' format audio files only.

Its main purpose in life is to automatically adjust the volume of WAV files before burning them onto CDs or encoding them to MP3/RealAudio for this web site.

It can do this in one of two ways - either by continuously (dynamically) adjusting the peak volume level throughout the entire file (known as peak-limiting) or by determining the optimum fixed volume level and then applying that (normalising).
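The fixed-level (normalising) case is simple enough to sketch in a few lines. The following is only an illustration of the general idea, not level's actual source: the function name `normalise`, the use of floating-point samples in the range -1.0 to 1.0, and the `attenuation_db` parameter are all assumptions made for the example.

```python
def normalise(samples, full_scale=1.0, attenuation_db=0.0):
    """Apply ONE fixed gain so the loudest peak lands on the target level.

    samples        -- sequence of floats (assumed range -1.0 .. 1.0)
    full_scale     -- the maximum level the output format can hold
    attenuation_db -- optional amount to stay below full scale, in dB
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)          # pure silence: nothing to scale
    target = full_scale * 10 ** (-attenuation_db / 20.0)
    gain = target / peak              # the single "optimum fixed" gain
    return [s * gain for s in samples]


quiet = [0.0, 0.1, -0.25, 0.2]
loud = normalise(quiet)               # loudest sample now sits at full scale
backed_off = normalise(quiet, attenuation_db=2.0)
```

Because the same gain is applied to every sample, the dynamic range of the material is untouched. Peak-limiting differs in that the gain is allowed to change as the file plays.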

In dynamic mode, it's the equivalent of what's known in the broadcast and recording industry as a mono-band peak-limiting amplifier. Peak limiters are used in a program chain to provide a "brick wall" upper limit to peak program levels without the risk of introducing audible distortion due to waveform clipping. They're employed in AM and FM broadcast transmitters, for the mastering of audio discs and tapes, motion-picture film sound-tracks, and anywhere that the instantaneous peak signal level (usually of both polarities) needs to be maintained at or below some fixed maximum value.

Peak limiting amplifiers were originally implemented in the late 1940s using variable-mu pentode electronic vacuum tube (valve) amplifiers, with compression slopes which were generally limited to around 10:1. That is to say - above the limiting threshold, an increase of 10db in the input signal would result in an increase of 1db in the output signal. This compression slope limit of around 10:1 was a function of the technology used (and in particular, the relatively low feedback loop gain imposed by the use of multiple interstage coupling transformers).

In practice, a compression slope of 10:1 was adequate, because there was invariably one peak limiting amplifier feeding the main program line leaving the station and a second one located at the transmitter site. Both would be in circuit at all times and would be limiting to 2 or 3db on peaks, so the total compression slope (ignoring any phase-shift in the program line) was actually more like 100:1.
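The arithmetic behind the 10:1 and 100:1 figures can be checked with a small static model. This is an idealised sketch only: the function name `limiter_output_db` and the 0db threshold are assumptions for the example, and the model ignores time-constants and line phase-shift entirely.

```python
def limiter_output_db(input_db, threshold_db=0.0, slope=10.0):
    """Static transfer curve of an idealised limiter.

    Below the threshold the signal passes unchanged. Above it, every
    `slope` dB of extra input produces only 1 dB of extra output.
    """
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / slope


# One 10:1 limiter: 10 dB over the threshold at the input ...
after_station = limiter_output_db(10.0)

# ... then a second 10:1 limiter at the transmitter, assumed here to have
# the same threshold, so that it is also limiting on the same peak.
after_transmitter = limiter_output_db(after_station)

overall_slope = 10.0 / after_transmitter   # input rise / output rise
```

With both units limiting, the slopes multiply, which is where the combined figure of roughly 100:1 comes from.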

In the late 1960s, these monsters finally gave way to the first wave of solid-state (transistorised) versions which could achieve compression slopes of anywhere from 30:1 up to 200:1 or even greater. (Amazing what you can do in a circuit when all the iron-cored interstage transformers disappear!) These transistorised versions generally employed a junction FET as the gain-control element in a resistive L-pad configuration. I constructed (and even designed) a number of such peak limiters for various purposes in the 1970s, but all the "final" boxed versions that I ever built for my own use were limited to mono (mainly due to the scarcity of suitable audio transformers for isolating the 600 ohm input and output lines).

When the need arose to process some stereo one evening, I thought "Why not write a software version?"



Typical Applications

Audio CDs

Because I have CoolEdit Pro 2.0 and use that for processing much of my audio, I usually let it look after the "normalising" of audio tracks I'm putting onto CDs. I always convert any MP3s into WAV first to allow me to do this sort of thing (ie: correct the volume, trim the ends, and possibly optimise the frequency response as well). Even so, I do occasionally use level for normalising instead, because CoolEdit often gets it wrong, believe it or not. It obviously contains a bug, because if you select "Normalise to 100%", the audio is often visibly clipped. So if I can't be bothered messing around, I often save the file and run level -n (normalise) on it to do the job properly!

Spoken word material

With audio files recorded off radio, the average level often varies throughout for one reason or another, and in these cases (especially if I'm putting them up here on Bluehaze), I usually process them fairly gently using level. This keeps the average level up nice and high. A typical set of parameters for doing this without it sounding too obviously "squashed" is:

level -f3 -s 0.02 -l2 filename

This runs level in dynamic (peak-limiting) mode. Using -f3 (fast recovery rate = 3db/sec) reduces the amount of rapid gain increase between words or sentences from its normal default value of 12db/sec. And setting the slow recovery rate to 0.02 db/sec (-s 0.02) calms things down further still by reducing the slow rate of gain restoration during quiet sections (the default for -s is 1db/sec). Finally, I use -l2 to reduce the maximum output by 2db (ie: around 20%) to allow for the effect of phase-shift when files are eventually encoded into lossy MPG or RealAudio format. If you don't do the latter, the final encoded result will occasionally clip.
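The "2db is around 20%" figure follows from the standard relationship between decibels and amplitude ratio (ratio = 10^(db/20)). A minimal sketch, with helper names chosen just for this example:

```python
import math


def db_to_ratio(db):
    """Convert a level change in dB to an amplitude (voltage) ratio."""
    return 10 ** (db / 20.0)


def ratio_to_db(ratio):
    """Convert an amplitude ratio back to dB."""
    return 20.0 * math.log10(ratio)


# -l2 lowers the peak output by 2 dB
remaining = db_to_ratio(-2.0)              # fraction of full scale left
reduction_percent = (1.0 - remaining) * 100.0

print(f"2 dB down leaves {remaining:.3f} of full scale "
      f"({reduction_percent:.1f}% lower)")
```

The same conversion applies to any of the db values accepted by level's options.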

Optimising video sound tracks (with Premiere, etc)

When I'm capturing or editing videos (eg: with Adobe Premiere(c), or even via Virtual Dub), I make sure that the audio track is available as a separate WAV file so I can process it using CoolEdit and/or level. Particularly with live spoken-word material, I then process these soundtracks with level using the same parameters as above, before finally encoding into MPEGs, ie:

level -f3 -s0.02 -l2 filename

For musical material, I usually only normalise the sound-tracks, via

level -n -l2 filename

The use of -n here forces level to run in normalise mode only, ie: all dynamic gain adjustment is killed. Note again the use of -l (lower case L - the output level). Because video soundtracks are invariably encoded using compression such as MPEG2, the resulting phase-shifts can again cause clipping unless you reduce the output level by around 20% first.

For material which is just going onto an audio CD in normal Redbook format, reducing the output level like this is unnecessary, since no lossy compression encoding is involved. So all you need to do there (if you only want to normalise) is:

level -n filename

and then grab all the resulting files (ie: those with a "c_" prefix), and burn those onto your audio CD.


Limiting is not clipping

Having read this far, if you're not familiar with this particular area of audio engineering terminology, you might well be thinking "What the hell is this guy raving on about ...?"

Okay - well, a couple of graphs might help to describe how peak limiting works.

If you're copying audio tracks onto your PC, or even if you're simply playing music on your amplifier, there is a certain volume setting at which your system will overload. This effect is often called "clipping" (see Fig. 1 below). Notice particularly the flat tops on the signal near the middle of the graph. You can buy boxes for electric guitars that do this for you - they're called "fuzz boxes". But on complex music or vocals, this sort of "fuzz" effect usually sounds fairly horrid.

Clipping is simply a flattening off of your audio signal (the music) when you try to exceed the maximum signal level that your system can handle. As in - you turn anything up too loud, and it sounds distorted - right?

A peak-limiting amplifier (if you could afford it) is a box you would add to your stereo system, PC, or radio transmitter to maximise the volume without allowing clipping to occur.

Fig.2 shows the same audio signal processed using a peak-limiter. The peak volume levels are still maximised but the distortion due to flattening (clipping) the signal is now gone.
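The difference between the two behaviours can also be seen numerically. The sketch below is a deliberately crude illustration, not how level works internally: it compares hard clipping with simply turning the gain down for the whole burst, whereas a real peak-limiter varies its gain over time. All names here are made up for the example.

```python
import math

CEILING = 1.0  # the largest sample value the system can represent


def hard_clip(samples, ceiling=CEILING):
    """What an overloaded system does: chop off anything above the ceiling."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]


def reduce_gain(samples, ceiling=CEILING):
    """What a limiter does in spirit: lower the gain so the peak just fits."""
    peak = max(abs(s) for s in samples)
    gain = min(1.0, ceiling / peak) if peak > 0 else 1.0
    return [s * gain for s in samples]


# One cycle of a sine wave that is twice (about 6 dB) too loud
hot = [2.0 * math.sin(2 * math.pi * n / 32) for n in range(32)]

clipped = hard_clip(hot)
limited = reduce_gain(hot)

# Count the samples pinned at the ceiling ("flat tops")
flat_clipped = sum(1 for s in clipped if abs(s) >= CEILING - 1e-12)
flat_limited = sum(1 for s in limited if abs(s) >= CEILING - 1e-12)
print("samples at the ceiling: clipped =", flat_clipped,
      ", limited =", flat_limited)
```

The clipped version has long runs of samples stuck at the ceiling (the flat tops), while the gain-reduced version touches it only at the crests, so the shape of the waveform survives.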

In the context of maximising the volume of a set of music tracks before you burn them onto a CD, level is a bit of software you can use to process your tracks to push up the average volume level without the risk of clipping (distorting) the sound. Level constantly adjusts the volume to be as loud as possible - all the way through a track. Most radio (and TV) stations operate with their peak-limiters in this mode.

Think of level as a very smart audio operator who just sits there and constantly adjusts the volume level of your music tracks to keep them as loud as possible but without ever allowing them to overload or distort, and that'll give you the general idea.


Usage and brief description

Level is typically invoked as (eg) level fred.wav. This takes an existing file fred.wav and processes it into a peak-limited version called c_fred.wav. That is, the output file name is created by prepending a "c_" to the name of the input file.
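For anyone scripting around level, the naming rule is easy to reproduce. This sketch only mirrors the "c_" convention described here and the documented habit of skipping already-prefixed files when a wildcard is used. The helper names and the lower-case `*.wav` pattern are assumptions for the example, not part of level itself.

```python
from pathlib import Path

PREFIX = "c_"  # prefix level adds to each processed file


def output_path(input_path):
    """Return the name level would write to: 'c_' + the input file name."""
    p = Path(input_path)
    return p.with_name(PREFIX + p.name)


def files_to_process(directory):
    """List the WAV files in a directory that do not already carry the prefix."""
    return sorted(
        p for p in Path(directory).glob("*.wav")
        if not p.name.startswith(PREFIX)
    )


print(output_path("fred.wav"))
```

Keeping the directory part intact (via `with_name`) means the processed file lands beside the original.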

The attack time is instantaneous. Dual recovery time-constants are provided, and these can be modified via the -f, -s, -r and -t parameters if desired. Additionally, distortion on low-frequency content (during heavy limiting) is minimised by a fixed 45ms "sample-and-hold" circuit which defeats gain recovery during a single period of individual low-frequency sinusoids.

Compression ratio (above limiting threshold) is infinite. Distortion (below limiting threshold) is zero (in the sense that it will be identical to the original source distortion). Above limiting threshold, waveform "distortion" is a direct function of the time-constants set by the user. The distortion level is quite low if using the default settings.
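To make these terms concrete, here is a heavily simplified sketch of a limiter with instantaneous attack, an infinite compression ratio, a hold period and a single recovery rate. It is NOT level's algorithm: level uses dual recovery time-constants and its internals are not described here. The function and parameter names are hypothetical, and only the default figures (12db/sec, 30db, 6db, 45ms) are borrowed from the text.

```python
import math


def peak_limit(samples, sample_rate, recovery_db_per_sec=12.0,
               max_gain_db=30.0, initial_gain_db=6.0,
               hold_sec=0.045, ceiling=1.0):
    """Simplified single-rate peak limiter (illustration only)."""
    gain = 10 ** (initial_gain_db / 20.0)
    max_gain = 10 ** (max_gain_db / 20.0)
    # Per-sample multiplier equivalent to the recovery rate in dB/sec
    recover = 10 ** (recovery_db_per_sec / sample_rate / 20.0)
    hold_samples = int(hold_sec * sample_rate)
    hold_left = 0
    out = []
    for x in samples:
        level_in = abs(x)
        if level_in * gain > ceiling:
            # Instantaneous attack: drop the gain so this sample just fits.
            # Nothing ever exceeds the ceiling, ie: the ratio is infinite.
            gain = ceiling / level_in
            hold_left = hold_samples
        out.append(x * gain)
        if hold_left > 0:
            hold_left -= 1            # hold: no recovery straight after a peak
        else:
            gain = min(max_gain, gain * recover)   # slow gain restoration
    return out


RATE = 8000
tone = [math.sin(2 * math.pi * 440 * n / RATE) for n in range(RATE)]
signal = [0.05 * s for s in tone] + [0.9 * s for s in tone]  # quiet, then loud
limited = peak_limit(signal, RATE)
```

Below the threshold the gain changes only slowly, so the waveform shape is essentially that of the source. Above it, how audible the gain changes are depends on the recovery settings, which is the trade-off the -f, -s, -r and -t options exist to tune.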

For brief help on usage, type level -+ (or level -h) and you'll see:

level version 2.82 (built Aug 10, 2004)
A software implementation of a mono-band dual time-constant peak-limiting
amplifier with an infinite compression-ratio, instantaneous attack-time, and
low distortion.

Usage: level [options] [file(s) or *]. Options=[-n] [-f#] [-s#] [-r#] [-g#]
[-m#] [-l#] [-t#] [-p] [-d] [file(s) or *]. (# = a number, in db or db/sec.)
Processed WAV files are saved with a 'c_' prefix, ie: as 'c_',

Options (default values shown in parenthesis):
-f (12)  FAST gain-recovery rate (db/sec) for short peaks
-s (1)   SLOW gain recovery rate (db/sec) for sustained program peaks
-r (1)   RATIO (db) of fast to slow recovery on sustained program
-l (0)   LINE level - output attenuation (db)
-t (26)  TRANSIENT tracking rate (db/sec) of the slow recovery section
-m (30)  Maximum gain (db) - acts as an input attenuator
-g (6)   Initial gain (db) - the initial m value
-n       Normalise only (disables all dynamic compression)
-h       (or -+)  Show this help
-e       EXTENDED HELP - gives a few usage examples
-p       Displays some WAV file parameters, and -d prints debugging info.

Note 1: Type 'level -e' for some real usage examples (extended help).
Note 3: The order of parameters is irrelevant, but file name(s) must be last.
Note 4: Using * as the last parameter to process all WAV files in a directory
        skips any with a 'c_' prefix (these assumed to be already processed).
(c) Bluehaze Solutions http://www.bluehaze.com.au, EMail admin@bluehaze.com.au
This demo version processes 90 seconds of audio.  Full versions for PC &
unix available - see http://www.bluehaze.com.au/unix/level.html for details.

The above 'help' is quite brief. There are two ways of getting more info. The first, as mentioned in Note 1 above, is by typing level -e to get some handy usage examples and hints, as follows:

Usage examples for level:
level file.wav
This just compresses file.wav to c_file.wav using the built-in defaults,
ie: r = 1db, f = 12db/sec, s = 1db/sec, g = 6db, m = 30db.

level -s 0.5 -g6 -m12 file.wav
Processes file.wav (into c_file.wav) to maximise the average level somewhat
less obtrusively.  Gain-recovery for sustained signal peaks = 12db/sec (the
-f default), but then slows to 0.5db/sec (-s 0.5). Initial gain (-g) = 6db,
and the maximum gain (-m) will be 12db.

level -f2 -r.5 -s 0.02 -t50 -l .5 -m9 *
Processes every WAV file in the current directory except those with a 'c_'
prefix. Even gentler parameters - fast recovery rate (-f) reduced to 2db/sec,
ramp height (-r) down to 0.5db, slow recovery rate (-s) down to 0.02db/sec,
tracking rate (-t) bumped to 50db/sec, peak output level (-l) down by 0.5db
(ie: to approx 5% below maximum), and maximum gain (-m) down to +9db.

level -f 30 -s 15 -r6 .... heavy compression;  will sound like some of the
more 'punchy' rock/metallica radio stations (heavy 'pumping').  Fairly insane
for music.  Possibly useful for live speech with widely varying levels

level -f 20 -s 2 -r 1 .... not quite as heavy but still merciless.

Note 1: If a file has a low volume level and you need to start with higher
initial gain, use -g.  Eg: -g18 sets the *initial* gain to 18db (8 times).

Note 2: The fast/slow ramp height (-r) applies to sustained program levels
only.  Short peaks will result in a much higher percentage of fast recovery.
For example, a short transient +30db peak (such as a noise pulse or a speech
or music transient) will usually defeat the (-r) ramp-height almost entirely
resulting in a subsequent 30db gain recovery at the fast rate.  This is
intuitively what one would expect, and is normal design practice with
broadcast peak-limiters (or was in the 1960s :-).

Note 3: Reduce -t value to speed up gain recovery from isolated transients.
Increase it (eg: to 50 or 60) to reduce gain-chasing effects.

The second way of getting more usage information is to read this unix manual entry (PDF format) which I put together for a company that wanted copies of level for running on PCs and Sun workstations.


Level was first created back in 1995 using Borland (Turbo) C. And in spite of my general distaste for anything concerning DOS or Windoze (ie: Microsoft), I must admit that these early Borland compilers provided a fairly efficient and productive environment, especially when used in conjunction with something like Borland's own "Make" and the MKS Toolkit to provide a decent, unix-like shell within which to work.

The only thing I've really seen which is more productive than this would be Borland's Turbo Pascal. For whipping up quite complex compiled applications in a hurry, I found Pascal as a language terrific (and certainly far more pleasant and less error-prone than (yuk) C!).

And although I hate to admit it, the latest "32 bit" version of level for Windoze (2.71) was built using Microsoft's Visual C++ package. (I use a decent editor (VIM) for editing, though :-)

(*) GNU (sometimes called 'Linux') [x86] and Solaris [Sparc] versions are also available. These work properly with piped input, of course ... :-)

BTW, here's the actual man entry markup code if you prefer that to the abovementioned PDF.


Processed sample - before and after

The following comparison may make level's default mode of operation a little more obvious. The first graph is the original WAV file of an old hit song (Petula Clark's Don't sleep in the subway). The second graph shows the result of processing it via level -g30. (I used -g30 to set the initial gain to +30db only because the audio level in this particular file was quite low.)

You can click on the graphs to hear the results, but bear in mind that these web samples have been encoded into MP3 format, and phase-shifts from this type of lossy encoding can theoretically cause peaks to clip. They were converted to MP3 using Steve Lhomme's LAME-based MP3 encoder plug-in for Winamp.

Original
After processing with level

Of course, one doesn't always want this type of dynamic compression, particularly since the compression-ratio in the default (peak-limiting) mode as shown above is infinite. Peak-limiting like this is ideal for some situations (eg: listening in the car, broadcast processing, and even noisy parties), but it's not necessarily the best choice for general audio file processing.

You can of course use the -n (normalise) flag to disable the dynamic compression so that the loudest peak just reaches maximum level without altering the dynamic range. (But then if that's all you want, you probably don't need level - there are plenty of programs out there that will do that for you {:-)

Terms and conditions

The demo versions (Windoze or Linux x86) process a maximum of 90 seconds of audio. If you find the program useful, you can buy normal, full copies as follows:
Private (non-profit) usage $90 per binary (specify Windoze or Unix/Linux)
Commercial usage $1150 per binary (specify Windoze or Unix/Linux)
Source licence $5400 - full source with make files, etc (Windows and Unix/Linux)

Prices are in Australian dollars. If you wish, you can convert prices to your own currency using Personal Currency Assistant (window may take several seconds to 'activate').


Administrator (Bluehaze Solutions). Last revised: Sat 28-Aug-2004