7.3.11 Real-Time vs. Off-Line Processing

To this point, we’ve primarily considered off-line processing of audio data in the programs that we’ve asked you to write in the exercises.  This makes the concepts easier to grasp, but hides the very important issue of real-time processing, where operations have to keep pace with the rate at which sound is played.

Chapter 2 introduces the idea of audio streams and gives a simple program that evaluates a sine function at the frequencies of the desired notes and writes the output directly to the audio device, so that the notes are played when the program runs.  Chapter 5 gives a program that reads a raw audio file and writes it to the audio device to play it as the program runs.  The program from Chapter 5, with a few modifications, is given here for review.


/*Use option -lasound on compile line.  Send in number of samples and raw sound file name.*/

#include <alsa/asoundlib.h>
#include <stdio.h>
#include <stdlib.h>

static const char *device = "default";	/*default playback device */

int main(int argc, char *argv[])
{
        int err, numRead;
        snd_pcm_t *handle;
        snd_pcm_sframes_t frames;
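        if (argc < 3) {
            fprintf(stderr, "Usage: %s numSamples rawFile\n", argv[0]);
            exit(EXIT_FAILURE);
        }
        int numSamples = atoi(argv[1]);

        char* buffer = (char*) malloc((size_t) numSamples);
        FILE *inFile = fopen(argv[2], "rb");
        if (buffer == NULL || inFile == NULL) {
            fprintf(stderr, "Could not allocate buffer or open %s\n", argv[2]);
            exit(EXIT_FAILURE);
        }
        numRead = fread(buffer, 1, numSamples, inFile);   /*may be fewer than numSamples*/
        fclose(inFile);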

        if ((err = snd_pcm_open(&handle, device, SND_PCM_STREAM_PLAYBACK, 0)) < 0){
            printf("Playback open error: %s\n", snd_strerror(err));
            exit(EXIT_FAILURE);
        }
        if ((err = snd_pcm_set_params(handle,
				SND_PCM_FORMAT_U8,
				SND_PCM_ACCESS_RW_INTERLEAVED,
				1,
				44100, 1, 400000) ) < 0 ){
            printf("Playback set params error: %s\n", snd_strerror(err));
            exit(EXIT_FAILURE);
        }

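        /*Write only the samples actually read; for 8-bit mono, one frame is one byte.*/
        frames = snd_pcm_writei(handle, buffer, numRead);
        if (frames < 0)
          frames = snd_pcm_recover(handle, frames, 0);
        if (frames < 0) {
          printf("snd_pcm_writei failed: %s\n", snd_strerror(frames));
        }

        snd_pcm_drain(handle);   /*wait until all queued samples have been played*/
        snd_pcm_close(handle);
        free(buffer);
        return 0;
}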

Program 7.1 Reading and writing raw audio data

This program uses the library function snd_pcm_writei to send samples to the audio device to be played. The audio samples are read from an input file into a buffer and transmitted to the audio device without modification. The variable buffer indicates where the samples are stored, and the third argument to snd_pcm_writei gives the number of frames to write. Because this is 8-bit, single-channel audio, each frame is one sample occupying one byte, so the frame count equals the number of bytes in the buffer.

Consider what happens when you have a much larger stream of audio coming in and you want to process it in real time before writing it to the audio device. This entails continuously filling up and emptying the buffer at a rate that keeps up with the sampling rate.
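The fill-and-empty cycle described above can be sketched as a loop that repeatedly reads one block, processes it, and writes it out. The following is a minimal sketch under stated assumptions, not the book's implementation: the name runBlockLoop and its three callback parameters are hypothetical, chosen so that the loop does not depend on any particular audio device. In a program like Program 7.1, the read callback might wrap fread and the write callback might wrap snd_pcm_writei.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Sample = unsigned char;  // 8-bit unsigned audio, as in Program 7.1

// Hypothetical helper (not part of ALSA): moves audio from a source to a sink
// one block at a time, applying processBlock to each block in between.
//   readBlock(dst, max)    fills dst with up to max samples and returns how
//                          many it stored (0 means the stream has ended)
//   processBlock(buf, n)   modifies n samples in place
//   writeBlock(buf, n)     hands n samples to the output
// Returns the total number of samples written.
std::size_t runBlockLoop(
    std::size_t blockSize,
    const std::function<std::size_t(Sample*, std::size_t)>& readBlock,
    const std::function<void(Sample*, std::size_t)>& processBlock,
    const std::function<void(const Sample*, std::size_t)>& writeBlock)
{
    assert(blockSize > 0);
    std::vector<Sample> buffer(blockSize);  // reused for every block
    std::size_t total = 0;
    for (;;) {
        const std::size_t got = readBlock(buffer.data(), blockSize);
        if (got == 0)
            break;                          // no more input
        processBlock(buffer.data(), got);   // must finish before the next block is due
        writeBlock(buffer.data(), got);
        total += got;
    }
    return total;
}
```

For playback to proceed without gaps, each pass through the loop body has to complete in less time than the block takes to play.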

Let’s do some analysis to determine how much time is available for processing based on a given buffer size. For a buffer size of $$N$$ samples and a sampling rate of $$r$$ samples per second, $$N/r$$ seconds can pass before additional audio data is required for playing. For $$N=4096$$ and $$r=44100$$, this is $$\frac{4096}{44100}\approx 0.0929\: s$$, or about 93 ms. (This scheme implies that there will be latency between the input and output of at most $$N/r$$ seconds.)

What if you wanted to filter the input audio before sending it to the output? We’ve seen that filtering is more efficient in the frequency domain using the FFT. Assuming the input is in the time domain, our program has to do the following:

  • convert data to the frequency domain with the FFT
  • multiply the filter and the audio data
  • convert data back to the time domain with inverse FFT
  • write the data to the audio device
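The first three steps above can be sketched for a single block as follows. This is an illustrative sketch rather than a production filter: the names fft and filterBlock and the response parameter are hypothetical, samples are assumed to have been converted to double, and the block length is assumed to be a power of two. Multiplying spectra block by block corresponds to circular convolution, so a complete implementation would also need a scheme such as overlap-add to join consecutive blocks, which is omitted here.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Recursive radix-2 Cooley-Tukey FFT, computed in place.
// a.size() must be a power of two. With inverse == true the transform uses
// the opposite exponent sign and is NOT scaled by 1/N (the caller scales).
void fft(std::vector<Complex>& a, bool inverse)
{
    const std::size_t n = a.size();
    assert(n > 0 && (n & (n - 1)) == 0);
    if (n == 1)
        return;

    // Split into even-indexed and odd-indexed samples and transform each half.
    std::vector<Complex> even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = a[2 * i];
        odd[i] = a[2 * i + 1];
    }
    fft(even, inverse);
    fft(odd, inverse);

    // Combine the two half-size transforms with the twiddle factors.
    const double pi = std::acos(-1.0);
    const double sign = inverse ? 1.0 : -1.0;
    for (std::size_t k = 0; k < n / 2; ++k) {
        const Complex w = std::polar(1.0, sign * 2.0 * pi * k / n);
        a[k] = even[k] + w * odd[k];
        a[k + n / 2] = even[k] - w * odd[k];
    }
}

// Filters one block: FFT, multiply by the filter's frequency response, IFFT.
// response[k] is the filter's gain at frequency bin k; it should be conjugate
// symmetric if the output is expected to be real.
std::vector<double> filterBlock(const std::vector<double>& block,
                                const std::vector<Complex>& response)
{
    assert(block.size() == response.size());
    std::vector<Complex> spectrum(block.begin(), block.end());

    fft(spectrum, false);                        // time -> frequency
    for (std::size_t k = 0; k < spectrum.size(); ++k)
        spectrum[k] *= response[k];              // apply the filter
    fft(spectrum, true);                         // frequency -> time (unscaled)

    std::vector<double> out(block.size());
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = spectrum[i].real() / static_cast<double>(block.size());
    return out;
}
```

As a quick check, a response of all ones leaves the block unchanged, while a response that is 1 in bin 0 and 0 elsewhere replaces every sample with the block's average.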

The computational complexity of the FFT and IFFT is $$O\left ( N\log N \right )$$, which for $$N=4096$$ is on the order of $$4096\ast 12=49152$$ operations each, since $$\log_{2}4096=12$$. Multiplying the filter and the audio data is $$O\left ( N \right )$$, and writing the data to the audio device is also $$O\left ( N \right )$$, adding on the order of $$2\ast 4096=8192$$ operations. This yields on the order of $$2\ast 49152+8192=106496$$ operations to be done in $$0.0929\; s$$, or about $$0.9\; \mu s$$ per operation. Considering that today’s computers can do more than 100,000 MIPS (millions of instructions per second), which is on the order of $$10^{5}$$ instructions per microsecond, this is not unreasonable.
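The arithmetic in this estimate can be reproduced for other block sizes and sampling rates with a few lines of code. The sketch below uses hypothetical function names and the same rough operation counts as the text ($$N\log_{2}N$$ for each transform and $$N$$ each for the multiply and the write). It ignores constant factors, so the result is an order-of-magnitude guide rather than a measurement.

```cpp
#include <cassert>
#include <cstddef>

// Returns log2(n) for n a power of two (for example, 4096 -> 12).
unsigned log2OfPowerOfTwo(std::size_t n)
{
    assert(n > 0 && (n & (n - 1)) == 0);
    unsigned k = 0;
    while (n > 1) {
        n >>= 1;
        ++k;
    }
    return k;
}

// Rough operation count from the text: N*log2(N) for the FFT, the same for
// the IFFT, N for the spectral multiply, and N for the write to the device.
std::size_t estimatedOperations(std::size_t n)
{
    return 2 * n * log2OfPowerOfTwo(n) + 2 * n;
}

// Time available per operation, in seconds, if a block of n samples must be
// fully processed within the n / sampleRate seconds it takes to play.
double secondsPerOperation(std::size_t n, double sampleRate)
{
    assert(sampleRate > 0.0);
    const double blockSeconds = static_cast<double>(n) / sampleRate;
    return blockSeconds / static_cast<double>(estimatedOperations(n));
}
```

For $$N=4096$$ and $$r=44100$$, estimatedOperations gives 106496 and secondsPerOperation gives roughly $$0.87\; \mu s$$, matching the figures above.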

We refer the reader to Boulanger and Lazzarini’s Audio Programming Book for more examples of real-time audio processing.