Visual Sound with the Web Audio API

You'll need a modern browser for these demos. Internet Explorer won't work.

The Web Audio API allows us to play, analyze, manipulate, and even generate audio assets within the browser. Here is a demo of it in action to start things off. Turn your speakers up.

Above we are analyzing the audio at runtime, in JavaScript, and that analysis drives the visualization. Most major browsers now support this feature, with the (unsurprising) exception of Internet Explorer. Here is a quick introduction to working with the API. The code below is written in Dart, but the API translates over to JavaScript very closely.

The first thing we need to do is load our audio file. We don’t do this with the HTML5 audio tag, but with a standard HTTP Request.

// HttpRequest (from dart:html) is the Dart equivalent of JavaScript's XMLHttpRequest
var request = new HttpRequest();
request.open("GET", "my-audio-file.ogg", async: true);
request.responseType = "arraybuffer";
request.onLoad.listen(onLoadComplete);
request.send();

This loads the audio file into an ArrayBuffer. ArrayBuffer is a data type for representing generic binary data. We won't work with this data directly, however; instead, we will pass it into the Web Audio API and have it decode the binary data for us.

Here we create an AudioContext instance. This is the primary object we'll use for working with the audio data. In this first step we pass it the ArrayBuffer we loaded and then wait for it to hand back the decoded audio data buffer.

// e is the load event from the request above; e.target is the HttpRequest
var ctx = new AudioContext();
ctx.decodeAudioData(e.target.response).then((AudioBuffer buffer) {
    // now we are ready to work with the data
});
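To make it clearer where that snippet lives, here is a rough sketch of the full load handler, with the onLoadComplete function registered in the first snippet wrapping the decode call. The decodedBuffer variable is just a placeholder for wherever you want to keep the result:

import 'dart:html';
import 'dart:web_audio';

var ctx = new AudioContext();
AudioBuffer decodedBuffer; // placeholder for the decoded result

void onLoadComplete(ProgressEvent e) {
    // e.target is the HttpRequest we created and sent earlier
    var request = e.target as HttpRequest;
    ctx.decodeAudioData(request.response).then((AudioBuffer buffer) {
        decodedBuffer = buffer; // ready to attach to a buffer source
    });
}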

Once the audio context has finished decoding the audio data we can begin playing it. In order to do that we need to attach the AudioBuffer object to our AudioContext instance. Here is what that looks like:

var bufsrc = ctx.createBufferSource();
// attach the buffer returned by decodeAudioData
bufsrc.buffer = buffer;
// connect it to the AudioContext destination
// data that goes to the destination gets output as sound we can hear 
bufsrc.connectNode(ctx.destination);
// tell it to start playing
bufsrc.start();

In the above code snippet, we first create a buffer source then attach our loaded audio data to it. The buffer source is the interface we use for controlling the playback of the audio file. We then connect the buffer source to the audio context’s destination property, which can be thought of as your speakers. Any bytes that get sent to the destination property are output as actual sound. Finally we call start, and the sound file begins to play through our speakers. Adding a play/pause button, we get the following:
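The play/pause demo itself isn't reproduced here, but here is a sketch of one way to wire up such a button. It assumes a button element with the id "toggle" and uses the context's suspend and resume methods, since a buffer source's start can only be called once:

import 'dart:html';
import 'dart:web_audio';

void wireToggle(AudioContext ctx) {
    var button = querySelector("#toggle"); // assumed play/pause button
    var playing = true;

    button.onClick.listen((_) {
        if (playing) {
            ctx.suspend(); // pauses everything flowing through this context
            button.text = "Play";
        } else {
            ctx.resume(); // picks up where playback left off
            button.text = "Pause";
        }
        playing = !playing;
    });
}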

That’s a lot of work just to play a sound file. And if that’s all you wanted to do, you would be better off using the HTML5 audio tag. The real power of the Web Audio API shows up when you want to do more than just play some sounds.

In the first example we passed our audio data directly from the buffer source into our audio context’s destination. But if we want to manipulate those bytes before they hit the destination, we can pass the data through one or more nodes first. The flow of bytes might look like this:

[source] -> [manipulation node] -> [manipulation node] -> [destination]

There are several types of built-in nodes, such as a panner, an oscillator, a splitter, and more. The one we’ll look at is the AnalyserNode. The analyzer node provides us with real-time frequency data, but does not change the data passing through it. Frequency data can be used to create audio visualizations. Adding an analyzer node is a very small change from the previous example; it looks like this:

var analyser = ctx.createAnalyser();
var bufsrc = ctx.createBufferSource();
bufsrc.buffer = buffer;
bufsrc.connectNode(analyser);
analyser.connectNode(ctx.destination);
bufsrc.start();

Now the analyzer will contain frequency data as the sound plays, and we can query it during playback. In the following example we read the data, storing it in a list.

var data = new Uint8List(analyser.frequencyBinCount);
analyser.getByteFrequencyData(data);

This data is just a list of numbers, each ranging from 0-255. Each index corresponds to a frequency band, and a higher value means there is more energy in that band. If we average the values and use the result as a percentage width on an HTML element, we get the following:

Here is what the code looks like:

var data = new Uint8List(analyser.frequencyBinCount);
analyser.getByteFrequencyData(data);

// average the frequency bins into a single value between 0 and 100
var sum = 0.0;
for (int i = 0; i < data.length; i++) {
    sum += data[i] / 255;
}
sum = sum / data.length * 100;

// view is the HTML element whose width we are animating
view.style.width = "$sum%";

That’s a pretty simple example, but a little creativity can take this concept a lot further. The demo at the start of this page averages frequency exactly like this. It’s only what I’ve done with the average frequency that has changed.
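To reproduce something like the opening demo, the averaging code above just needs to run on every animation frame. Here is a minimal sketch, assuming the analyser and the view element from the earlier snippets:

import 'dart:html';
import 'dart:typed_data';
import 'dart:web_audio';

void animate(AnalyserNode analyser, Element view, num time) {
    var data = new Uint8List(analyser.frequencyBinCount);
    analyser.getByteFrequencyData(data);

    // average the bins into a 0-100 value, exactly as above
    var sum = 0.0;
    for (int i = 0; i < data.length; i++) {
        sum += data[i] / 255;
    }
    sum = sum / data.length * 100;
    view.style.width = "$sum%";

    // schedule the next frame so the element tracks the audio in real time
    window.animationFrame.then((t) => animate(analyser, view, t));
}

// kick the loop off once playback starts:
// window.animationFrame.then((t) => animate(analyser, view, t));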

More information about the Web Audio API can be found here.

Tagged:

JS, Dart, Web Audio API, HTML

You can contact me on Twitter.
