Ideas for FFT implementation

Hi, I have made a rough outline to implement FFT-analysis on the DSO nano. I welcome everbody to join this thread and give feedback on my ideas. My plan is to implement FFT-analysis this way the coming month. Any comment and/or suggestions are welcome!

Memory size is an issue which gives limitations to the quality ans speed of the algorithm. The choice we can make is between N=256 or 512 bins. For bin size N, memory is needed for algorithm input of 2N shorts, that is 4N bytes. Also the same amount of memory is needed for algorithm output.
Therefore, for N=512, we need at least 4K RAM extra. For N=256, this is 2K.
Fortunately, we can re-use some existing buffers: F_Buff, the buffer for USB communication (see files.c) and Data_Buffer, for reading and writing Micro SD card (see memory.c) are both 512 bytes. By overlaying these buffers on the FFT-algorithm input-buffer, the amount of extra RAM for FFT is lowered by 1024 bytes.

I would like to choose N=512, as the quality of the FFT is higher with more bins.

As Jerry already described in thread viewtopic.php?f=12&t=971&start=10 , the company Embedded Signals has made made an open source 16 bit assember FFT routine that can be used for non-commercial purposes. One routine is available for N=64, 256 and 1024. ANother routine is available for N=128 and 512. Both are highly optimised assember functions that can be called from C.

FFT can be drawn instead of the reference wave. That way no extra memory is needed for storage of the fft wave.
Also the FFT wave can be drawn on the same screen as the normal wave, just like the reference is laid over the normal wave.
Only the first 300 bins of the 512 are shown on the display. Perhaps that some scrolling function could be implemented to view the other 212 bins. The frequency range that is calculated with the FFT algorithm is: 0 … sample_freq/2. Hence the
frequency range on the 300 pixels screen width is 0 … sample_freq/2 * 300/512.

The pseudo code for implementing FFT on the DSO Nano is:
(this code should be called every time the normal waveform is refreshed)

Step 1. Statically allocate memory

short FFT_in [ 1024 ]; //input data 16 bit
short FFT_out[ 1024 ]; //output data 16 bit

Step 2. Copy Scan_Buffer to input buffer, copy data around the triggerpoint

if ( Scan_buffer is empty )
break; // no fft calculation when no data
short scanindex = triggerpoint - 256;
for ( short i=0; i<1024; i=i+2 )
FFT_in[ i ] = Scan_Buffer[ scanindex++ ]; //real value
FFT_in[ i+1 ] = 0; //imaginair value

Step 3. Calculate Windowing function

// multiply each real input value with hanning windowing function which is multiplication by value between 0 and 1
for ( short i = 0; i < 1024; i=i+2 )
// windowfactor is array in flash memory
// multiplication is done with int values, therefore factors are between 0…65535
// and are precalculated as windowfactor[i] = 65535 * (0.5 - 0.5 * Cos(2 * PI * i / (1024 - 1)));
FFT_in[i] = ((int) FFT_in[i] * windowfactor[i]) / 65536;

4. compute FFT

fft512( FFT_out, FFT_in );

5. calculate magnitude of output (and store result back in FFT_in)

short j=0;
for ( short i = 0; i < 1024; i=i+2 )
FFT_in[ j++ ] = sqrt( FFT_in[i] * FFT_in[i] + FFT_in[i+1] * FFT_in[i+1] );

6. calculate logarithmic scale (dB)

for ( short i = 0; i < 512; i++ )
FFT_in[ i ] = 20 * Log10(FFT_in[ i ]);

7. display FFT values on screen


What do you think? Feel free to give suggestions/ideas/feedback.

Hi Paul
Your implementation looks good, and if we can squeeze 512 rather than 256 bins, then that’s great. I have one suggestion for the display that would make the NANO into a very useful tool for VLF signal displaying - that is to have a waterfall display as used in various PC programs such as Spectrum Lab. I guess this can be implemented with no additional memory requirement by writing a row of coloured pixels to the top or bottom of the screen and scrolling one row. I look forward to trying your code.

I would like this function.
regards Michael

Hello Paulvvz,
I’ve read your post with a lot of interest - adding a spectrum analyzer is something I’ve scratched my head against for some time. Unfortunately, I’m very far from the productivity of BenF: so I’m still struggling :slight_smile:

Don’t know if it could be useful to you, but I’ve been studying a working implementation on a similar hardware (Mini STM32) which should be quite portable:

The code is very readable. Most notably, task concurrency is obtained thanks to an interesting embedded OS, named CoOS.

I am still wondering if the current firmware could be reorganized in order to leave enough RAM free.

I plan to investigate the possibility of moving as many variables as possible into program (flash) memory - constant tables, for instance, or anything that gets initialized but never changes, could be declared with the modifiers __flash in IAR or attribute((progmem)) in GCC: I’d say that there might be some significant RAM savings in that, and flash memory space is still quite ample.

Just my 2c,


If those variables are declared with const, they will be allocated to the the read-only data section, which will be linked like a code (“text”) segment (or “CONST” for IAR) and end up in the flash memory. If you run “objdump -t” on the .elf file you can see the address of each variable, (hex) 08… is flash and 20… is RAM.

Since the library from Embedded Signals is not free, please consider using another library (for instance the one from ST). Otherwise we can not use our DSOs at work, and Seeed can not ship this firmware without going into legal contracts. I have no idea how high the licensing fees would be, but it would certainly be a hassle. IANAL applies.