Creating an Audio Visualizer That Can Handle Multiple Audio Sources Dynamically - All In Vanilla JS!

rizz0s

Summer Rizzo

Posted on February 20, 2020


For one of my recent projects, I decided to delve into the world of data visualization by making an audio visualizer. There is a multitude of guides on how to write one - even for Vanilla JS - but I failed to find one that detailed taking in multiple sound inputs, which was a necessary feature of my project (a layer-able sound-scape mixer). Additionally, the inputs needed to be dynamic - users had the ability to add and remove sounds at will, and the visualizer needed to reflect that in real-time. I'll take you step-by-step through my solution to that problem.

First, I'll link you to the primary sources I used for the visualizer itself. To get a handle on how audio contexts work in JS with the Web Audio API, I referenced this CodePen to make a simple, single-source horizontal visualizer. After getting that up and running, I decided to rewrite the shape of the visualization to wrap around a circle. For that, I referenced this step-by-step guide. I'll focus on that implementation, since it's the one I built on to take in multiple sources.

NOTE // This is almost certainly not the most efficient way to implement a visualizer in the browser. Once multiple audio sources or, generally, larger files are added, it's a pretty hefty load for something client-side. Nonetheless, it can be done, and I'd like to argue that it's pretty cool considering no packages or frameworks are needed.

For context, all the sounds were associated with a specific flower object in my program, in case you're curious about the floral theme of some of the variable names.

Let's start by seeing how sounds are created.



    function createSound (flower) {
        const sound = document.createElement('audio');

        sound.id = flower.name; // set ID of sound to use as a key for global obj
        sound.src = `./sounds/${flower.sound}.mp3`; // set source to locally stored file
        sound.crossOrigin = "anonymous"; // avoid a CORS error
        sound.loop = true; // sounds need to loop to the beginning after they end
        sound.dataset.action = "off"; // for pausing feature
        document.getElementById("audio-container").append(sound); // append sound to HTML container
        allSoundsById[sound.id] = sound; // add to global object for later use

        return sound; // return sound to parent function
    }



When the sounds are rendered to the page on page load, the createSound function is called at the beginning to create an HTML <audio> tag and populate a global object that uses the id (in this case, the associated flower's name) as the key and the element as the value.
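For reference, that page-load setup might look roughly like this - a sketch rather than my exact code, with a hypothetical flowers array standing in for however your data is stored:

    // global lookups used throughout the visualizer
    const allSoundsById = {};
    const audioContextById = {};

    // on page load, create an <audio> element for each flower
    flowers.forEach((flower) => {
        createSound(flower);
    });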

There is a "click" event listener associated with each flower that will first play the sound, then call the renderVisualizer function that actually displays the sound data that's currently playing to the page. Let's take a look at that function next.
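The listener itself isn't the focus of this article, but a minimal sketch of it - assuming each flower element shares its id with its sound, and using the data-action attribute set in createSound for toggling - might look like this:

    flowerElement.addEventListener("click", () => {
        const sound = allSoundsById[flowerElement.id]; // look the sound up by the flower's name

        if (sound.dataset.action === "off") {
            sound.dataset.action = "on";
            sound.play();
            renderVisualizer(); // display the currently-playing sound data
        } else {
            sound.dataset.action = "off";
            sound.pause();
        }
    });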

Before we get into the nitty-gritty of taking in multiple sound inputs, I want to establish a few things about how the visualizer is set up. It is drawn on an HTML5 canvas element which, when animation frames are rendered, has a circle drawn in the center. The circle is divided equally into a fixed number of parts, equal to the number of bars the visualizer has. Each bar is associated with a bit of frequency data, and its height changes according to the sound every time an animation frame is rendered. So, the width is fixed, and the height represents the ever-changing frequency information of the sounds (what makes it move!). Reference my resources linked at the end of the article if you'd like a more bare-bones look at how the basics of this work.

Let's first gain access to the canvas element on the page. This is just an HTML element which you can choose to create within your script file, or have prepared in HTML already. I did the latter. Directly after, you have to get the context for the HTML canvas - we're working with 2D (as opposed to 3D). Note that canvasContext is what we'll be drawing to - canvas is just equal to the DOM element.



    function renderVisualizer () {
        // Get canvas
        const canvas = document.getElementById("vis");
        const canvasContext = canvas.getContext("2d");



Next, we need to create audio contexts for each sound. This is what gives us access to all that wonderful data. I mentioned before that all the sounds were stored in a global object for later use - this is where we'll use that! For each sound key-value pair in the object, I'm creating another object with the same key, and the value set to the necessary information:



    Object.keys(allSoundsById).forEach((id) => {
        // condition to avoid creating a duplicate context. the visualizer won't break without it, but you will get a console error.
        if (!audioContextById[id]) {
            audioContextById[id] = createAudioContextObj(allSoundsById[id]);
        }
    })



...and here's the createAudioContextObj function:



    function createAudioContextObj (sound) {
        // initialize new audio context
        const audioContext = new AudioContext();

        // create a media element source from the given sound
        const src = audioContext.createMediaElementSource(sound);

        // create analyser (gets lots o' data 'bout audio)
        const analyser = audioContext.createAnalyser();

        // connect audio source to analyser to get data for the sound
        src.connect(analyser);
        analyser.connect(audioContext.destination);
        analyser.fftSize = 512; // set the bin size to condense amount of data

        // array limited to unsigned int values 0-255
        const bufferLength = analyser.frequencyBinCount;
        const freqData = new Uint8Array(bufferLength);

        const audioContextObj = {
            freqData, // note: at this time, this array is unpopulated!
            analyser
        }

        return audioContextObj;
    }



Here, we're creating an audio context, connecting the sound to it, and returning the necessary tools in an object for later use in the parent function. I'm also setting the fftSize (FFT stands for Fast Fourier Transform) to 512 - the default is 2048, and we don't need that much data, so I'm condensing it. This makes the length of the freqData array 256 - a bit more fitting, considering our number of bars is only 130! I understand that, at this point, this can get a little convoluted; while I don't want to say that knowing the details of what's going on here doesn't matter, it's okay to not fully understand all of it yet. Essentially, we are using tools given to us to obtain information about sound frequencies that we'll use to draw the visualization.
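If you want to see that relationship for yourself, frequencyBinCount is always half of fftSize, so you can verify it in the console:

    const analyser = new AudioContext().createAnalyser();
    analyser.fftSize = 512;
    console.log(analyser.frequencyBinCount); // 256 - the length our freqData array will have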

Let's move forward. Before we call the renderFrame function that lives inside renderVisualizer, I'm going to set the fixed number of bars, their corresponding width, and initialize their height variable:



    const numBars = 130;

    let barWidth = 3;
    let barHeight;



All right, now we can get into the thick of it. We're inside the renderFrame function. This is responsible for continuously rendering data and drawing it to the canvas.



    function renderFrame() {
        const freqDataMany = []; // reset array that holds the sound data for given number of audio sources
        const agg = []; // reset array that holds aggregate sound data

        canvasContext.clearRect(0, 0, canvas.width, canvas.height); // clear canvas at each frame

        requestAnimationFrame(renderFrame); // this defines the callback function for what to do at each frame

        const audioContextArr = Object.values(audioContextById); // array with all the audio context information

        // for each element in that array, get the *current* frequency data and store it
        audioContextArr.forEach((audioContextObj) => {
            let freqData = audioContextObj.freqData;
            audioContextObj.analyser.getByteFrequencyData(freqData); // populate with data
            freqDataMany.push(freqData);
        })

        if (audioContextArr.length > 0) {
            // aggregate that data!
            for (let i = 0; i < freqDataMany[0].length; i++) {
                agg.push(0);
                freqDataMany.forEach((data) => {
                    agg[i] += data[i];
                });
            }


Okay, this is a lot of code! Let's step through it. First, at each frame, the renderFrame function is called. The first thing we do is reset the array that holds all the instances of frequency data, and the array that has all of that data added together. Remember, each audio context object's frequency data is currently an unpopulated array that will be filled in by its respective analyser. After all is said and done, think of it like this:



    freqDataMany = [ [freqDataForFirstSound], [freqDataForSecondSound], [freqDataForThirdSound]....];
    agg = [allFreqDataAddedTogether];



For your curiosity, here's a snippet of agg populated with some data:

(screenshot: the agg array logged to the console)

Ain't that somethin'? We'll do more with the aggregate data later, but first let's draw the circle that the bars will be drawn onto:



    // still inside if (audioContextArr.length > 0) 

        // set origin of circle to center of canvas
        const centerX = canvas.width / 2;
        const centerY = canvas.height / 2;
        const radius = 50; // set size of circle based on its radius

        // draw circle
        canvasContext.beginPath();
        canvasContext.arc(centerX, centerY, radius, 0, (2*Math.PI) );
        canvasContext.lineWidth = 1;
        canvasContext.stroke();
        canvasContext.closePath()



NOTE // If you wanted the circle to be drawn on the canvas at all times, you could write this outside of the renderFrame function. I wanted the canvas to be completely clear if no sounds were playing.

Here's where the magic happens. For each render, which happens every animation frame, this loop will run 130 (the number of bars defined above) times. It is responsible for drawing each bar around the circle.



    for (let i = 0; i < numBars; i++) {
        barHeight = (agg[i] * 0.4);

        let rads = (Math.PI * 2) / numBars;
        let x = centerX + Math.cos(rads * i) * (radius);
        let y = centerY + Math.sin(rads * i) * (radius);
        let x_end = centerX + Math.cos(rads * i) * (radius + barHeight);
        let y_end = centerY + Math.sin(rads * i) * (radius + barHeight);

        drawBar(canvasContext, x, y, x_end, y_end, barWidth)
    }



The bar height is being dynamically set to the ith bit of information in the aggregate frequency data array. Let's let that sink in. The frequency data is being split into 256 "bins". agg[0] is the first bin, agg[1] is the second... agg[129] is the 130th. Note that I could set numBars to 256 to gain access to every bit of frequency data in the array. However, I preferred to drop off the higher frequencies and have a lower number of bars (it normalized some high-freq bird-chirping sounds). Additionally, I'm multiplying that by 0.4 to limit the bar height so everything could fit on the canvas.

Let's move on to the math. Fear not - it's only some trig that will help us draw the bars along the circle. rads is the angle, in radians, that each bar takes up around the circle - it's a bit easier to work with for our purpose. We're going to be using a common formula to convert polar coordinates (which use radians) to Cartesian coordinates (or in other words, our familiar friends (x, y)):

x = radius × cos( θ )
y = radius × sin( θ )
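To make that concrete, here's a quick worked example for the second bar (i = 1), using hypothetical values of centerX = centerY = 150 and barHeight = 40:

    const rads = (Math.PI * 2) / 130; // ≈ 0.0483 radians per bar

    // x     = 150 + Math.cos(rads * 1) * 50        ≈ 199.9
    // y     = 150 + Math.sin(rads * 1) * 50        ≈ 152.4
    // x_end = 150 + Math.cos(rads * 1) * (50 + 40) ≈ 239.9
    // y_end = 150 + Math.sin(rads * 1) * (50 + 40) ≈ 154.3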

You can do a deeper dive into why this works (see below links), but if you'd rather move along, just know that we are using this formula to determine the starting and ending coordinates of our bar. Its starting point needs to be at a point along the circumference of the circle (which is what the above formula is being used for), and it needs to be incremented based on which cycle of the loop we're on (which is why we are multiplying by i - otherwise the bars would all be drawn on top of each other). The endpoint is based on the barHeight, which, if you recall, is based on its associated bit of data in the agg array. With all the necessary coordinates, and the fixed width of the bar we defined before the loop, we can draw the bar:



    function drawBar(canvasContext, x1, y1, x2, y2, width){
        const gradient = canvasContext.createLinearGradient(x1, y1, x2, y2); // set a gradient for the bar to be drawn with

        // color stops for the gradient
        gradient.addColorStop(0, "rgb(211, 197, 222)");
        gradient.addColorStop(0.8, "rgb(255, 230, 250)");
        gradient.addColorStop(1, "white");

        canvasContext.lineWidth = width; // set line width equal to passed in width
        canvasContext.strokeStyle = gradient; // set stroke style to gradient defined above
        // draw the line!
        canvasContext.beginPath();
        canvasContext.moveTo(x1,y1);
        canvasContext.lineTo(x2,y2);
        canvasContext.stroke();
        canvasContext.closePath();
    } 



We are almost there. All we have to do now is make sure that all these functions get invoked at the right time. With as many things collapsed as possible, here is the renderVisualizer function:

(screenshot: the full renderVisualizer function, collapsed)
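Since that's an image, here's a rough text sketch of the same structure, with the bodies collapsed into comments - everything follows the snippets above:

    function renderVisualizer () {
        // get the canvas and its 2D drawing context
        const canvas = document.getElementById("vis");
        const canvasContext = canvas.getContext("2d");

        // create an audio context object for any sound that doesn't have one yet
        Object.keys(allSoundsById).forEach((id) => {
            if (!audioContextById[id]) {
                audioContextById[id] = createAudioContextObj(allSoundsById[id]);
            }
        });

        const numBars = 130;
        let barWidth = 3;
        let barHeight;

        function renderFrame () {
            // ...get each sound's current frequency data, aggregate it, draw the circle and the bars...
        }

        renderFrame(); // kick off the animation loop
    }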

Directly after the renderFrame function definition, we invoke it. The renderVisualizer function is called on the click action when the sound is first played. When another sound is layered via click, its frequency data is aggregated with the current frequency data. When a sound is paused, there is no frequency data - remember, freqData and agg are getting reset at each rendered frame. If a sound isn't playing, its freqData is just a bunch of zeros - when it's aggregated with the currently playing sounds, it simply doesn't have any data to add.

Here's a gif of it in action:

(gif: the visualizer responding as sounds are added and removed)

For the sake of appropriately sized gifs, I only screen-recorded the visualizer. First, an initial sound is added - then another (notice the bars jump in height, especially in the lower left) - the second source is removed, then so is the first.

Voila! I implemented this in only a few days' time, so I am certainly open to any optimizations or critiques. Here is a useful list of references I used:

With ♡, happy coding.
