Wednesday, April 13, 2011

ActionScript foray into fast Fourier transforms

I am aware that this image matches my blog :) Preview my FFT implementation

Sometime late December/early January of this year, I realized that I would need to learn how to implement a fast Fourier transform in ActionScript and, quite honestly, in general. I was working on my game and as I said in a previous post, my game is partially reliant on a powerful audio engine I am working on.

Some people reading this may be aware that Adobe already has an FFT implementation. In their SoundMixer class, there is a computeSpectrum() function and if you choose true for the FFTMode boolean parameter, the spectrum returned will be a Fourier transform of the current frame of audio. However, this function is only available on the SoundMixer which is Adobe's static global sound control class and so the transform represents the transform of all sound going through the Flash application and not just, say, a single music track. This isn't a problem if all you are playing is the track you want to transform but this isn't always going to be the case. You may want to perform the transform on a music track, while sound effects playing in the background, for example.

I ended up writing an FFT implementation in ActionScript. However, it wasn't super fast and I shelved it for a few months. Recently, I decided to optimize it, so I spent the last few days cleaning it up and making a faster implementation. Although it is not as fast as perfect implementations written in C, I have managed to make it run fairly quickly, however it is optimized for real data such as audio (as opposed to data consisting of complex numbers).

What is an FFT?

For those who aren't sure what an FFT is, it is an algorithm used to compute the discrete Fourier transform much more quickly. That was probably the least helpful sentence imaginable if you don't know what either of those things are, so a fast Fourier transform is a transform on a set of data that brings it from one domain (often time) to the frequency domain. The most common example of this is the spectral analysis of audio.

Logic Pro's Channel EQ plugin with spectral analysis turned on

I am not going to go in depth describing how FFTs work because not only would I likely butcher it and make it more confusing (since I am by no means an expert), but there are plenty of sites on the web that go into much greater detail. The one, however, that made me finally understand what the fast Fourier transform was doing and how it was doing it is the FFT Guru. Another useful site, for a more math and less programming take on the FFT, is The FFT Demystified, specifically their page on real number FFTs. If you are building your own FFT and would like to test your results, this online FFT solver on Random Science Tools has proven invaluable to me.

My FFT

My first attempt at an FFT was a somewhat standard radix-2 implementation. Even though I was only trying to find the transform of audio (real, not complex data points), I was using a standard complex number implementation in which all of the imaginary number coefficients were set to 0. This did not run super fast; it was not too slow but not fast enough to be useable as far as I am concerned. It is bundled in the source code (linked below) as the function computeFFTRegular which is called when computeFFT is called with the parameter real set to false. The demo does not use this at all; it is included for the purposes of completeness and because it is clean enough that someone can get a better sense of what a simple non-real number FFT should look like in a code (since so many implementations are written in the most impenetrable form of eye-glazing C there is).

The implementation that the demo uses is very similar to the unused implementation. Both are radix-2 and even use the same butterfly function. The main difference is that the real number implementation which I use does not use N samples in the real number array and N 0s in the imaginary number array. Instead, it puts all the even samples (0, 2, 4,...) in the real number array (now of size N/2) and all of the odd samples (1, 3, 5,...) in the imaginary number array (also N/2). The FFT that has to be calculated is now only N/2 instead of N which saves a significant amount of time. The result is all muddled up but luckily, there is a way to unjumble the result and end up with an FFT that is the same as a standard complex number FFT, but the time it takes to calculate only half of the samples (plus some unjumbling time).

FFT for real numbers function:

private function computeFFTReal(outputArray:ByteArray, frequencies:Vector.<Number>):void {
 var i:int; //Used as loop iterator
 
 var evenReal:Array = new Array();
 var oddReal:Array = new Array();
 
 _currentBytesMono.position = 0;
 while (_currentBytesMono.bytesAvailable) {
  evenReal.push(_currentBytesMono.readFloat());
  oddReal.push(_currentBytesMono.readFloat());
 }
 
 while (evenReal.length < SAMPLES_PER_FRAME / 2) {
  evenReal.push(0);
  oddReal.push(0);
 }
 
 //=======================================
 
 butterfly(false, evenReal, oddReal);
 
 //=======================================
 
 var reverseEven:Array = new Array();
 var reverseOdd:Array = new Array();
 
 for (i = 0; i < evenReal.length; i++) {
  reverseEven[_bitReversalTableHalfSamples[i]] = evenReal[i];
  reverseOdd[_bitReversalTableHalfSamples[i]] = oddReal[i];
 }
 
 evenReal = reverseEven;
 oddReal = reverseOdd;
 
 //=======================================
 
 var evenImaginary:Array = new Array();
 var oddImaginary:Array = new Array();
 
 var newReal:Array = new Array();
 var newImaginary:Array = new Array();
 
 for (i = 1; i < (SAMPLES_PER_FRAME / 2) / 2; i++) {
  evenImaginary[i] = (oddReal[i] - oddReal[(SAMPLES_PER_FRAME / 2) - i]) / 2;
  evenImaginary[(SAMPLES_PER_FRAME / 2) - i] = -evenImaginary[i];
  
  oddImaginary[i] = -((evenReal[i] - evenReal[(SAMPLES_PER_FRAME / 2) - i]) / 2);
  oddImaginary[(SAMPLES_PER_FRAME / 2) - i] = -oddImaginary[i];
  
  evenReal[i] = (evenReal[i] + evenReal[(SAMPLES_PER_FRAME / 2) - i]) / 2;
  evenReal[(SAMPLES_PER_FRAME / 2) - i] = evenReal[i];
  
  oddReal[i] = (oddReal[i] + oddReal[(SAMPLES_PER_FRAME / 2) - i]) / 2;
  oddReal[(SAMPLES_PER_FRAME / 2) - i] = oddReal[i];
 }
 evenImaginary[(SAMPLES_PER_FRAME / 2) / 2] = 0;
 oddImaginary[(SAMPLES_PER_FRAME / 2) / 2] = 0;
 
 for (i = 1; i < SAMPLES_PER_FRAME / 2; i++) {
  newReal[i] = evenReal[i] + (oddReal[i] * _twiddleFactorTableCosineUnmuddle[i]) + (oddImaginary[i] * _twiddleFactorTableSineUnmuddle[i]);
  newImaginary[i] = evenImaginary[i] + (oddImaginary[i] * _twiddleFactorTableCosineUnmuddle[i]) - (oddReal[i] * _twiddleFactorTableSineUnmuddle[i]);
  newReal[SAMPLES_PER_FRAME - i] = newReal[i];
  newImaginary[SAMPLES_PER_FRAME - i] = -newImaginary[i];
 }
 newReal[0] = evenReal[0] + oddReal[0];
 newReal[SAMPLES_PER_FRAME / 2] = evenReal[0] - oddReal[0];
 newImaginary[0] = 0;
 newReal[SAMPLES_PER_FRAME/ 2] = 0;
 
 //=======================================
 
 var freqHz:Number;
 var amplitude:Number;
 
 outputArray.clear();
 outputArray.position = 0;
 frequencies.splice(0, frequencies.length);
 
 for (i = 0; i < newReal.length; i++) {
  if (i <= SAMPLES_PER_FRAME / 2) {
   freqHz = i * ((SAMPLES_PER_MILLISECOND * 1000) / SAMPLES_PER_FRAME);
   amplitude = (Math.sqrt(Math.pow(newReal[i], 2) + Math.pow(newImaginary[i], 2)));
   frequencies.push(freqHz);
   outputArray.writeFloat(amplitude);
  }
 }
}

Note that SAMPLES_PER_FRAME is how many samples of audio data we choose to handle through the Adobe Sound class. I personally use 4096. The number must be a power of 2. SAMPLES_PER_MILLISECOND is 44.1, which is better known as 44100Hz.

In the first part of the FFT function, we are staggering the audio samples into the even and odd arrays. The even array will be treated like the "real" data and the odd array will be treated like the "imaginary" data, as if the FFT were handling a complex number instead of just real numbers. On the off chance we were shorted samples, we pad with 0s. _currentBytesMono is a byte array holding the mono-ized audio sample data.

After that we call the butterfly function and then we bit reverse the order of the result.

Following that, we unjumble the results. To understand the theory behind this, visit Engineering Productivity Tools' The FFT Demystified guide's section on FFT's with real sequences. Also, for clarification that is more explanatory and less equations, read FFT Guru's Two N point real or a 2N point real with an N point complex FFT PDF, which should be linked to on his homepage (he has requested on his homepage not place any click throughs to his content).

After the results are unjumbled, the amplitude/magnitude is calculated and returned along with the associated frequency. This amplitude/frequency pairing is linear; some may prefer it in decibels. In my demo, I give the option of using decibels which essentially just runs a conversion on the amplitude as follows:

\[\mathrm{decibels}=20\log_{10}{(\mathrm{amplitude})}\]

Remember that ActionScript's Math.log() function is NOT \(\log_{10}\) but \(\ln\) and you will need to divide your logarithm result by \(\ln{10}\) to make it base 10. Conveniently, ActionScript has this as a constant, Math.LN10.

Butterfly function:

private function butterfly(fullSamples:Boolean, real:Array, imaginary:Array) {
 var outputSize:uint;
 if (fullSamples) {
  outputSize = SAMPLES_PER_FRAME;
 } else {
  outputSize = SAMPLES_PER_FRAME / 2;
 }
 
 var splitter:uint = 2;
 var splitterVal:uint = outputSize / splitter;
 var splitterStart:uint;
 
 var twiddleCounter:uint = splitter / 2;
 var twiddlePos:uint = 0;
 
 var siblingPoint:uint;
 
 var realBackup:Number;
 var imaginaryBackup:Number;
 
 while (splitterVal >= 1) {
  splitterStart = 0;
  while (splitterStart < outputSize) {
   for (var i:int = splitterStart; i < splitterStart + splitterVal; i++) {
    siblingPoint = i + splitterVal;
    
    realBackup = real[i];
    imaginaryBackup = imaginary[i];
    
    real[i] += real[siblingPoint];
    imaginary[i] += imaginary[siblingPoint];
    
    real[siblingPoint] -= realBackup;
    imaginary[siblingPoint] -= imaginaryBackup;
    
    realBackup = real[siblingPoint];
    imaginaryBackup = imaginary[siblingPoint];
    
    if (fullSamples) {
     real[siblingPoint] = (realBackup * _twiddleFactorTableRealFullSamples[twiddlePos]) - (imaginaryBackup * _twiddleFactorTableImaginaryFullSamples[twiddlePos]);
     imaginary[siblingPoint] = (realBackup * _twiddleFactorTableImaginaryFullSamples[twiddlePos]) + (imaginaryBackup * _twiddleFactorTableRealFullSamples[twiddlePos]);
    } else {
     real[siblingPoint] = (realBackup * _twiddleFactorTableRealHalfSamples[twiddlePos]) - (imaginaryBackup * _twiddleFactorTableImaginaryHalfSamples[twiddlePos]);
     imaginary[siblingPoint] = (realBackup * _twiddleFactorTableImaginaryHalfSamples[twiddlePos]) + (imaginaryBackup * _twiddleFactorTableRealHalfSamples[twiddlePos]);
    }
    
    twiddlePos += twiddleCounter;
   }
   splitterStart += splitterVal * 2;
   twiddlePos = 0;
  }
  twiddleCounter = splitter;
  splitter <<= 1;
  splitterVal >>= 1;
 }
}

This is my implementation of a butterfly algorithm for a radix-2 FFT. You would be best studying this with an accompanying guide to fast Fourier transforms and a knowledge of what they are actually supposed to do (once again, FFT Guru is a great place to start). While reading or trying to figure something out, you can come back here and look at how it appears when implemented in ActionScript, but I would do it an injustice by trying to describe it in a blog post. The above code should be used alongside an FFT explanation guide as a study aid, if you are a beginner. On its own, I do not think it will be as useful to learn from. Either way, enjoy!

A quick note on the demo

If you are looking through the demo code or using my code in an implementation, you may notice that just drawing the points straight doesn't net the same results. This is because I space the frequencies unevenly. This prevents "bunching up" of the rendering. Although to make it pretty, I just have a blue line dancing around with no labeling, it is actually unevenly divided into blocks. There are three larger blocks and six smaller blocks. If you look at the Logic Pro Channel EQ image again, you will see what I mean.

There is large block for the 20-50Hz range, a small block for the 50-100Hz, another small block for the 100-200Hz and then a large block for the 200-500Hz and now two large and two small blocks later, we are leaving the lower frequencies. Next, there are two small blocks for 500Hz-1kHz and 1-2kHz, a large block for 2-5kHz and another small block for 5-10kHz. So there are only three small blocks and a large block for the mid-range, and the mid-range is also noticeably larger. Whereas the lower end (20-500Hz) is 480Hz long, the mid-range is 9500Hz and it takes up less pixel space when we draw it out. Finally, on the high end is one small block for 10-20kHz. This is 10000Hz long and takes up the least amount of pixel space, when drawn.

Source and acknowledgements

Once again, I would like to thank Flexible Factory, the makers of MP3FileReferenceLoaderLib for AS3 which made the seemingly trivial task of loading MP3 locally into Flash actually trivial because its not actually as built in as you would expect.

You can download the source code under the MIT License. I am by no means 100% sure this code works completely correctly; my testing has been successful but it is entirely possible I've missed something. I consider myself to be more knowledgeable about FFTs after this whole ordeal but I am still by no means an expert unlike some of the people online to whom digital signal processing is a second nature.

Sample Track Credit

1 comment:

  1. God this is so fast. It's action packed and transforms like every second. It's like a Ditto!

    Ditto! Transform!

    ReplyDelete