In earlier chapters, we showed how audio signals can be represented in either the time domain or the frequency domain. In this section, you’ll see how mathematical operations are applied in these domains to implement filters, delays, reverberation, etc. Let’s start with the time domain.

Filtering in the time domain is done by a convolution operation. Convolution uses a convolution filter, whichis an array of N values that, when graphed, takes the basic shape shown in Figure 7.32. A convolution filter is also referred to as a convolution mask, an impulse response (IR), or a convolution kernel. There are two commonly-used time-domain convolution filters that are applied to digital audio. They are FIR filters (finite impulse response) and IIR filters (infinite impulse response).

Figure 7.32 Graph of time-domain convolution filter
Figure 7.32 Graph of time-domain convolution filter

Equation 7.1 describes FIR filtering mathematically.

[equation caption=”Equation 7.1 FIR filter”]

$$!\begin{matrix}\mathbf{y}\left ( n \right )=\mathbf{h}\left ( n \right )\otimes \mathbf{x}\left ( n \right )=\sum_{k=0}^{N-1}\mathbf{h}\left ( k \right )\mathbf{x}\left ( n-k \right )=\sum_{k=0}^{N-1}\mathbf{a}_{k}\mathbf{x}\left ( n-k \right )\\\textit{where N is the size of the filter and}\: x\left ( n-k \right )=0\: if\; n-k< 0\end{matrix}$$

[/equation]

[aside]

You can actually think of equations such as Equation 7.1 in one of two ways: x  and y could be understood as vectors of audio samples in the time domain, being the input and being the output. The equation must be executed N times to get each nth output sample. Alternatively,  could be understood as a function with input n. The function is executed N times to yield all N output samples.

[/aside]

By our convention, boldface variables refer to vectors (i.e., arrays). In this equation, $$\mathbf{h}=\left [ a_{0},a_{1},a_{2}\cdots,a_{N-1} \right ]$$ is the convolution filter – which is a vector of multipliers to be applied successively to audio samples. The number of multipliers is the order of a filter, N in this case. N is sometimes also referred to as the number of taps in the filter.

It’s helpful to think of Equation 7.1 algorithmically, as described Algorithm 7.1. The notation $$\mathbf{y}\left ( n \right )$$ indicates that the $$n^{th}$$ output sample is created from a convolution of input values from the audio signal x and the filter multipliers in h, as given in the summation. The equation is repeated in a loop for every sample in the audio input.

[equation class=”algorithm” caption=”Algorithm 7.1 Convolution with a finite impulse response (FIR) filter”]

/*Input:

x, an array of digitized audio samples (i.e., in the time domain) of size M

h, a convolution filter of size N (Specifically, a finite-impulse-response filter, FIR

Output:

y, the audio samples, filtered

*/

 

for $$\left ( n=0\: to\; N-1 \right )$$ {

$$\mathbf{y}\left ( n \right )=\mathbf{h}\left ( n \right )\otimes \mathbf{x}\left ( n \right )=\sum_{k=0}^{N-1}\mathbf{h}\left ( k \right )\mathbf{x}\left ( n-k \right )$$

where $$x\left ( n-k \right )=0\: if\; n-k< 0$$

}

[/equation]

The FIR convolution process is described diagrammatically in Figure 7.33.

 

Figure 7.33 Filtering in the time domain by convolving with an FIR filter
Figure 7.33 Filtering in the time domain by convolving with an FIR filter

You can implement convolution yourself as a function, or you can use MATLAB’s conv function. The conv function requires two arguments. The first is an array containing the coefficients of the filter, and the second is the array of time domain audio samples to be convolved. Say that we have an FIR filter of size 3 defined as

[equation caption=”Equation 7.2 Equation for an example FIR filter”]

$$!y\left ( n \right )=0.4\left ( n \right )+0.3x\left ( n-1 \right )+0.3x(n-2)$$

[/equation]

In MATLAB, the coefficients of the filter can be stored in an array h like this:


h = [0.4 0.3 0.3];

 

Then we can read in a sound file and convolve it and listen to the results.


x = wavread('ToccataAndFugue.wav');

y = conv(h,x);

sound(y, 44100);

 

(This is just an arbitrary example with a random filter to demonstrate how you convolve in MATLAB. We haven’t yet considered how the filter itself is constructed.)

IIR filters are also time domain filters, but the process by which they work is a little different. An IIR filter has an infinite length, given by this equation:

[equation caption=”Equation 7.3 IIR Filter, infinite form”]

$$!\begin{matrix}\mathbf{y}\left ( n \right )=\mathbf{h}\left ( n \right )\otimes \mathbf{x}\left ( n \right )=\sum_{k=0}^{\infty }\mathbf{h}\left ( k \right )\mathbf{x}\left ( n-k \right )\\where\; x\left ( n-k \right )=0\: if\; n-k< 0\\\textit{and k is theoretically infinite}\end{matrix}$$

[/equation]

We can’t deal with an infinite summation in practice, but Equation 7.3 can be transformed to a difference equation form which gives us something we can work with.

[equation caption=”Equation 7.4 IIR filter, difference equation form”]

$$!\begin{matrix}\mathbf{y}\left ( n \right )=\mathbf{h}\left ( n \right )\otimes \mathbf{x}\left ( n \right )=\sum_{k=0}^{N-1}\mathbf{a}_{k}\mathbf{x}\left ( n-k \right )-\sum_{k=1}^{M}\mathbf{b}_{k}\mathbf{y}\left ( n-k \right )\\\textit{where N is the size of the forward filter, M is the size of the feedback filter, and}\; x\left ( n-k \right )=0\: if\; n-k< 0\end{matrix}$$

[/equation]

In Equation 7.4, N is the size (also called order) of the forward filter and M is the size of the feedback filter. The output from an IIR filter is determined by convolving the input and combining it with the feedback of previous output. In contrast, the output from an FIR filter is determined solely by convolving the input.

[wpfilebase tag=file id=156 tpl=supplement /]

FIR and IIR filters each have their advantages and disadvantages. In general, FIR filters require more memory and processor time. IIR filters can more efficiently create a sharp cutoff between frequencies that are filtered out and those that are not. An FIR filter requires a larger filter size to accomplish the same sharp cutoff as an IIR filter. IIR filters also have the advantage of having analog equivalents, which facilitates their design. An advantage of FIR filters is that they can be constrained to have linear phase response, which means that phase shifts for frequency components are proportional to the frequencies. This is good for filtering music because harmonic frequencies are phase-shifted by the same proportions, preserving their harmonic relationship. Another advantage of FIR filters is that they’re not as sensitive to the noise that results from low bit depth and round-off error.

The exercise associated with this section shows you how to make a convolution filter by recording an impulse response in an acoustical space. Also, in Section 7.3.9, we’ll show a way to apply an IIR filter in MATLAB.

You may have noticed that in our discussion of frequency domain and time domain filters, we didn’t mention how we got the filters – we just had them and applied them. In the case of an FIR filter, the filter is represented in the coefficients $$\left [ a_{0},a_{1},a_{2},\cdots ,a_{N-1} \right ]$$. In the case of the IIR filter, the filter resides in coefficients $$\left [ a_{0},a_{1},a_{2},\cdots ,a_{N-1} \right ]$$ and $$\left [ b_{0},b_{1},b_{2},\cdots ,b_{M} \right ]$$.

Without explaining in detail the mathematics of filter creation, we can show you algorithms for creating low-pass, high-pass, bandpass, and bandstop filters when they are given the appropriate parameters as input. Low-pass filters allow only frequencies below a cutoff frequency $$f_{c}$$ to pass through. Thus, Algorithm 7.2 takes $$f_{c}$$ as input and outputs an N-element array constituting a low-pass filter. Similarly, Algorithm 7.3 takes $$f_{c}$$ as input and yields a high-pass filter, and Algorithm 7.4 and Algorithm 7.5 take $$f_{1}$$ and $$f_{2}$$ as input to yield bandpass and bandstop filters. These algorithms yield time-domain filters. If you’re interested in how these algorithms were derived, see (Ifeachor and Jervis 1993), (Steiglitz 1996), or (Burg 2008).

[equation class=”algorithm” caption=”Algorithm 7.2 Low-pass filter”]

algorithm FIR_low_pass filter

/*

Input:

f_c, the cutoff frequency for the low-pass filter, in Hz

f_samp, sampling frequency of the audio signal to be filtered, in Hz

N, the order of the filter; assume N is odd

Output:

h, a low-pass FIR filter in the form of an N-element array */

{

//Normalize f_c and ω _c so that pi is equal to the Nyquist angular frequency

f_c = f_c/f_samp

ω_c = 2*pi*f_c

middle = N/2    /*Integer division, dropping remainder*/

for i = −N/2 to N/2

   if (i == 0) h(middle) = 2*f_c

   else h(i + middle) = sin(ω_c*i)/(pi*i)

}

[/equation]

[equation class=”algorithm” caption=”Algorithm 7.3 High-pass filter”]

algorithm FIR_high_pass filter

/*

Input:

f_c, the cutoff frequency for the high pass filter, in Hz

f_samp, sampling frequency of the audio signal to be filtered, in Hz

N, the order of the filter; assume N is odd

Output:

h, a high-pass FIR filter in the form of an N-element array */

{

//Normalize f_c and ω _c so that pi is equal to the Nyquist angular frequency

f_c = f_c/f_samp

ω_c = 2*pi*f_c

middle = N/2    /*Integer division, dropping remainder*/

for i = −N/2 to N/2

   if (i == 0) h(middle) = 1 – 2*f_c

   else h(i + middle) = -sin(ω_c*i)/(pi*i)

}

[/equation]

[equation class=”algorithm” caption=”Algorithm 7.4 Bandpass filter”]

algorithm FIR_bandpass filter

/*

Input:

f1, the lowest frequency to be included, in Hz

f2, the highest frequency to be included, in Hz

f_samp, sampling frequency of the audio signal to be filtered, in Hz

N, the order of the filter; assume N is odd

Output:

h, a bandpass FIR filter in the form of an N-element array */

{

//Normalize f_c and ω _c so that pi is equal to the Nyquist angular frequency

  f1_c = f1/f_samp

  f2_c = f2/f_samp

  ω1_c = 2*pi*f1_c

  ω2_c = 2*pi*f2_c

  middle = N/2    /*Integer division, dropping remainder*/

  for i = −N/2 to N/2

   if (i == 0) h(middle) = 2*(f2_c – f1_c)

   else

  h(i + middle) = sin(ω2_c*i)/(pi*i) – sin(ω1_c*i)/(pi*i)

 }

[/equation]

[equation class=”algorithm” caption=”Algorithm 7.5 Bandstop filter”]

algorithm FIR_bandstop filter

/*

Input:

f1, the highest frequency to be included in the bottom band, in Hz

f2, the lowest frequency to be included in the top band, in Hz

Everything from f1 to f2 will be filtered out

f_samp, sampling frequency of the audio signal to be filtered, in Hz

N, the order of the filter; assume N is odd

Output:

h, a bandstop FIR filter in the form of an N-element array */

{

//Normalize f_c and ω _c so that pi is equal to the Nyquist angular frequency

f1_c = f1/f_samp

f2_c = f2/f_samp

ω1_c = 2*pi*f1_c

ω2_c = 2*pi*f2_c

middle = N/2    /*Integer division, dropping remainder*/

for i = −N/2 to N/2

   if (i == 0) h(middle) = 1 – 2*(f2_c – f1_c)

   else

h(i + middle) = sin(ω1_c*i)/(p*i) – sin(ω2_c*i)/(pi*i)

}

[/equation]

[wpfilebase tag=file id=163 tpl=supplement /]

[wpfilebase tag=file id=165 tpl=supplement /]

As an exercise, you can try implementing these algorithms in C++, Java, or MATLAB and see if they actually work. In Section 7.3.8, we’ll show you some higher level tools in MATLAB’s digital signal processing toolkit that create these types of filters for you.

If you continue to work in digital signal processing, you need to be familiar with the z-transform, a mathematical operation that converts a discrete time domain signal into a frequency domain signal represented as real or complex numbers. The formal definition is below.

Let $$\mathbf{x}\left ( n \right )$$ be a sequence of discrete values for $$n=0,1,\cdots $$. The one-sided z-transform of $$\mathbf{x}\left ( n \right )$$, $$\mathbf{X}\left ( z \right )$$, is a function that maps from complex numbers to complex numbers as follows:

[equation caption=”Equation 7.5 The one-sided z-transform”]

$$!\begin{matrix}\mathbf{X}\left ( z \right )=\sum_{n=0}^{\infty }\mathbf{x}\left ( n \right )z^{-n}\\\textit{where z is a complex number}\end{matrix}$$

[/equation]

A full z-transform sums from -∞ to ∞, but the one-sided transform suffices for us because n isan index into an array of audio samples and will never be negative.

In practice, our sector of time domain audio samples, has a finite length N. Thus, if we assume that for $$\mathbf{x}\left ( n \right )=0\; for\; n\geq N$$, then we can redefine the transform as

$$!\mathbf{X}\left ( z \right )=\sum_{n=0}^{N-1}\mathbf{x}\left ( n \right )z^{-n}$$

$$\mathbf{X}\left ( z \right )$$ is a discrete function from complex variables to complex variables. Now, look at what results if we set $$z=e^{\frac{i2\pi k}{N}}$$ and apply the z-transform to a vector of length N (applying the equation N times for $$0\leq k< N$$). This yields the following:

[equation caption=”Equation 7.6 The discrete Fourier transform as a special case of the z-transform”]

$$!for\; 0\leq k< N,\mathbf{X}\left ( Z_{k} \right )=\sum_{n=0}^{N-1}\mathbf{x}\left ( n \right )\left ( e^{i2\pi k} \right )^{-n}=\sum_{n=0}^{N-1}\mathbf{x}\left ( n \right )e^{\frac{-i2\pi kn}{N}}=\sum_{n=0}^{N-1}\mathbf{x}\left ( n \right )\cos ^{\frac{2\pi kn}{N}}-i\mathbf{x}\left ( n \right )\sin ^{\frac{2\pi kn}{N}}$$

[/equation]

The idea of Equation 7.6 is that we are evaluating the equation for each $$k^{th}$$ frequency component $$\left ( \frac{2\pi k}{N} \right )$$ to determine N frequency components in the audio data. You can see that this equation is equivalent to the definition of the discrete Fourier transform given in Chapter 2. The Fourier transform transforms from the time domain (real numbers) to the frequency domain (complex numbers), but since real numbers are a subset of the complex numbers, the Fourier transform is an instance of a z-transform.

In Chapter 2, we showed how the Fourier transform can be applied to transform audio data from the time to the frequency domain. The Fourier transform is a function that maps from real numbers (audio samples in the time domain) to complex numbers (frequencies, which have magnitude and phase). We showed in the previous section that the Fourier transform is a special case of the z-transform, a function that maps from complex numbers to complex numbers. Now let’s consider the equivalence of filtering in the time and frequency domains. We begin with some standard notation.

Say that you have a convolution filter as follows:

$$!\mathbf{y}\left ( n \right )=\mathbf{h}\left ( n \right )\otimes \mathbf{x}\left ( n \right )$$

By convention, the z-transform of $$\mathbf{x}\left ( n \right )$$ is called $$\mathbf{X}\left ( z \right )$$, the z-transform of $$\mathbf{y}\left ( n \right )$$is called $$\mathbf{Y}\left ( z \right )$$, and the z-transform of $$\mathbf{h}\left ( n \right )$$ is called $$\mathbf{H}\left ( z \right )$$, as shown in Table 7.1.

time domain frequency domain
input audio $$\mathbf{x}\left ( n \right )$$ $$\mathbf{X}\left ( z \right )$$
output, filtered audio $$\mathbf{y}\left ( n \right )$$ $$\mathbf{Y}\left ( z \right )$$
filter $$\mathbf{h}\left ( n \right )$$ $$\mathbf{H}\left ( z \right )$$
filter operation $$\mathbf{y}\left ( n \right )=\mathbf{h}\left ( n \right )\otimes \mathbf{x}\left ( n \right )$$ $$\mathbf{Y}\left ( z \right )=\mathbf{H}\left ( z \right )\ast \mathbf{X}\left ( z \right )$$

Table 7.1 Conventional notation for time and frequency domain filters

The convolution theorem is an important finding in digital signal processing that shows us the equivalence of filtering in the time domain vs. the frequency domain. By this theorem, instead of filtering by a convolution, as expressed in $$\mathbf{y}\left ( n \right )=\mathbf{h}\left ( n \right )\otimes \mathbf{x}\left ( n \right )$$, we can filter by multiplication, as expressed in $$\mathbf{Y}\left ( z \right )=\mathbf{H}\left ( z \right )\ast \mathbf{X}\left ( z \right )$$. (Note that $$\mathbf{H}\left ( z \right )\ast \mathbf{X}\left ( z \right )$$ is an element-by-element multiplication of order $$N$$.)  That is, if you take a time domain filter, transform it to the frequency domain, transform your audio data to the frequency domain, multiply the frequency domain filter and the frequency domain audio data, and do the inverse Fourier transform on the result, you get the same result as you would get by convolving the time domain filter with the time domain audio data. This is in essence the convolution theorem, explained diagrammatically in Figure 7.34. In fact, with a fast implementation of the Fourier transform, known as the Fast Fourier Transform (FFT), filtering in the frequency domain is more computationally efficient than filtering in the time domain. That is, it takes less time to do the operations.

Figure 7.34 The Convolution Theorem
Figure 7.34 The Convolution Theorem

MATLAB has a function called fft for performing the Fourier transform on a vector of audio data and a function called conv for doing convolutions. However, to get a closer view of these operations, it may be enlightening to try implementing these functions yourself. You can implement the Fourier transform either directly from Equation 7.6, or you can try the more efficient algorithm, the Fast Fourier transform. To check your accuracy, you can compare your results with MATLAB’s functions. You can also compare the run times of programs that filter in the time vs. the frequency domain. We leave this as an activity for you to try.

In Chapter 2, we discussed the necessity of applying the Fourier transform over relatively small segments of audio data at a time. These segments, called windows, are on the order of about 1024 to 4096 samples. If you use the Fast Fourier transform, the window size must be a power of 2. It’s not hard to get an intuitive understanding of why the window has to be relatively small. The purpose of the Fourier transform is to determine the frequency components of a segment of sound. Frequency components relate to pitches that we hear. In most sounds, these pitches change over time, so the frequency components change over time. If you do the Fourier transform on, say, five seconds of audio, you’ll get an imprecise view of the frequency components in that time window, called time slurring. However, what if you choose a very small window size, say just one sample? You couldn’t possibly determine any frequencies in one sample, which at a sampling rate of 44.1 kHz is just 1/44,100 second. Frequencies are determined by how a sound wave’s amplitude goes up and down as time passes, so some time must pass for there to be such a thing as frequency.

The upshot of this observation is that the discrete Fourier transform has to be applied over windows of samples where the windows are neither too large nor too small. Note that the window size has a direct relationship with the number of frequency components you detect. If your window has a size of N, then you get an output telling you the magnitudes of N/2 frequency bands from the discrete Fourier transform, ranging in frequency from 0 to the Nyquist frequency (i.e., ½ the sampling rate). An overly small window in the Fourier transform gives you very high time resolution, but tells you the magnitudes of only a small number of discrete, wide bands of frequencies. An overly large window yields many frequency bands, but with poor time resolution that leads to slurring. You want to have good enough time resolution to be able to reconstruct the resulting audio signal, but also enough frequency information to apply the filters with proper effect. Choosing the right window size is a balancing act.

Now let’s look at how the z-transform can be useful in describing, designing, and analyzing audio filters. We know that we can do filtering in the frequency domain with the operation

[equation caption=”Equation 7.7 Filtering in the frequency domain”]

$$!\mathbf{Y}\left ( z \right )=\mathbf{H}\left ( z \right )\ast \mathbf{X}\left ( z \right )$$

[/equation]

If we have the coefficients defining an FIR filter, $$\left [ a_{0},a_{1}.a_{2}\cdots ,a_{N-1} \right ]$$, then $$\mathbf{H}\left ( z \right )$$ is derived directly from the definition of the z-transform

[equation caption=”Equation 7.8 H(z), an FIR filter in the frequency domain”]

$$!\mathbf{H}\left ( z \right )=\sum_{n=0}^{N-1}\mathbf{h}\left ( n \right )z^{-n}=a_{0}+a_{1}z^{-1}+a_{2}z^{-2}+\cdots +a^{N-1}z^{-\left ( N-1 \right )}$$

[/equation]

Consider our previous example of an FIR filter,

$$!y\left ( n \right )=0.4x\left ( n \right )+0.3x\left ( n-1 \right )+0.3x\left ( n-2 \right )$$,

where $$\mathbf{h}=\left [ a_{0},a_{1},a_{2} \right ]=\left [ 0.4,0.3,0.3 \right ]$$. This filter can be transformed to the frequency domain by applying the z-transform.

$$!\mathbf{H}\left ( z \right )=\sum_{n=0}^{N-1}\mathbf{h}\left ( n \right )z^{-n}=a_{0}+a_{1}z^{-1}+a_{2}z^{-2}+\cdots +a^{N-1}z^{-\left ( N-1 \right )}=0.4+0.3z^{-1}+0.3z^{-2}$$

Filters can be expressed diagrammatically in terms of the z-transform, as illustrated in Figure 7.35. The $$z^{-m}$$ elements are delay elements.

Figure 7.35 Example FIR filter diagram
Figure 7.35 Example FIR filter diagram

To derive $$\mathbf{H}\left ( z \right )$$ for an IIR filter, we rearrange Equation 7.7 as a ratio of the output to the input.

[equation caption=”Equation 7.9 Filter in the frequency domain as a ratio of output to input”]

$$!\mathbf{H}\left ( z \right )=\frac{\mathbf{Y}\left ( z \right )}{\mathbf{X}\left ( z \right )}$$

[/equation]

This form of the filter is called the transfer function. Using the definition of the IIR filter, $$\mathbf{y}\left ( n \right )=\mathbf{h}\left ( n \right )\otimes \mathbf{x}\left ( n \right )=\sum_{k=0}^{N-1}a_{k}\mathbf{x}\left ( n-k \right )-\sum_{k=1}^{M}b_{k}\mathbf{y}\left ( n-k \right )$$, we can express the IIR filter in terms of its coefficients $$\left [ a_{0},a_{1},a_{2}\cdots ,a_{N-1} \right ]$$ and $$\left [ b_{1},b_{2},b_{3}\cdots ,b_{M} \right ]$$, which yields the following transfer function defining the IIR filter:

[equation caption=”Equation 7.10 Transfer function for an IIR filter”]

$$!\mathbf{H}\left ( z \right )=\frac{\mathbf{Y}\left ( z \right )}{\mathbf{X}\left ( z \right )}=\frac{a_{0}+a_{1}z^{-1}+a_{2}z^{-2}+\cdots +a_{N-1}z^{N-1}}{1+b_{1}z^{-1}+b_{2}z^{-2}+\cdots +b_{M}z^{N-M}}=\frac{a_{0}z^{N}+a_{1}z^{N-1}+\cdots +a_{N-1}z^{2N-1}}{z^{N}+b_{1}z^{N-1}+\cdots +b_{m}z^{2N-M}}$$

[/equation]

(The usefulness of transfer functions will become apparent in the next section.) An IIR filter in the frequency domain can be represented with a filter diagram as shown in Figure 7.36.

Figure 7.36 Example IIR filter diagram
Figure 7.36 Example IIR filter diagram

The function $$\mathbf{H}\left ( z \right )$$ defining a filter has a complex number z as its input parameter. Complex numbers are expressed as $$a+bi$$, where a is the real number component, b is the coefficient of the imaginary component, and i is $$\sqrt{-1}$$. The complex number plane can be graphed as shown in Figure 7.37, with the real component on the horizontal axis and the coefficient of the imaginary component on the vertical axis. Let’s look at how this works.

With regard to Figure 7.37, let x, y, and r denote the lengths of line segments $$\overline{AB}$$, $$\overline{BC}$$, and $$\overline{AC}$$ , respectively. Since the circle is a unit circle, $$r=1$$. From trigonometric relations, we get

$$!x=\cos \theta$$

$$!y=\sin \theta$$

Point C represents the number $$\cos \theta +i\sin \theta$$ on the complex number plane. By Euler’s identity, $$\cos \theta +i\sin \theta=e^{i\theta }$$.  $$\mathbf{H}\left ( z \right )$$ is a function that is evaluated around the unit circle and graphed on the complex number plane. Since we are working with discrete values, we are evaluating the filter’s response at N discrete points evenly distributed around the unit circle. Specifically, the response for the $$k^{th}$$ frequency component is found by evaluating $$\mathbf{H}\left ( z_{k} \right )$$ at $$z=e^{i\theta }=e^{i\frac{2\pi k}{N}}$$ where N is the number of frequency components. The output,  $$\mathbf{H}\left ( z_{k} \right )$$, is a complex number, but if you want to picture the output in the graph, you can consider only the magnitude of $$\mathbf{H}\left ( z_{k} \right )$$, which would be a real number coming out of the complex number plane (as we’ll show in Figure 7.41). This helps you to visualize how the filter attenuates or boosts the magnitudes of different frequency components.

Figure 7.37 A circle with radius 1 on the complex number plane
Figure 7.37 A circle with radius 1 on the complex number plane, with $$\theta =\frac{2\pi k}{N}$$

This leads us to a description of the zero-pole diagram, a graph of a filter’s transfer function that marks places where the filter does the most or least attenuation (the zeroes and poles, as shown in Figure 7.38). As the numerator of $$\frac{\mathbf{Y}\left ( z \right )}{\mathbf{X}\left ( z \right )}$$ goes to 0, the output frequency is getting smaller relative to the input frequency. This is called a zero. As the denominator goes to 0, the output frequency is getting larger relative to the input frequency. This is called a pole.

To see the behavior of the filter, we can move around the unit circle looking at each frequency component at point $$e^{i\theta }$$ where $$\theta =\frac{2\pi k}{N}$$ for $$0\leq k< N-1$$. Let $$d_{zero}$$ be the point’s distance from a zero and $$d_{pole}$$ be the point’s distance from a pole. If, as you move from point $$P_{k}$$ to point $$P_{k+1}$$ (these points representing input frequencies $$e^{i\frac{2\pi k}{N}}$$ and $$e^{i\frac{2\pi\left ( k+1 \right )}{N}}$$ respectively), the ratio $$\frac{d_{zero}}{d_{pole}}$$ gets larger, then the filter is progressively allowing more of the frequency $$e^{i\frac{2\pi k}{N}}$$ to “pass through” as opposed to $$e^{i\frac{2\pi\left ( k+1 \right )}{N}}$$.

Let’s try this mathematically with a couple of examples. These examples will get you started in your understanding of zero-pole diagrams and the graphing of filters, but the subject is complex. MATLAB’s Help has more examples and more explanations, and you can find more details in the books on digital signal processing listed in our references.

First let’s try a simple FIR filter. The coefficients for this filter are . In the time domain, this filter can be seen as a convolution represented as follows:

[equation caption=”Equation 7.11 Convolution for a simple filter in the time domain”]

$$!\mathbf{y}\left ( n \right )=\sum_{k=0}^{1}\mathbf{h}\left ( k \right )\mathbf{x}\left ( n-k \right )=\mathbf{x}\left ( n \right )-0.5\mathbf{x}\left ( n-1 \right )$$

[/equation]

Converting the filter to the frequency domain by the z-transform, we get

[equation caption=”Equation 7.12 Delay filter represented as a transfer function”]

$$!\mathbf{H}\left ( z \right )=1-0.5z^{-1}=\frac{z\left ( 1-0.5z^{-1} \right )}{z}=\frac{z-0.5}{z}$$

[/equation]

In the case of this simple filter, it’s easy to see where the zeroes and poles are. At $$z=0.5$$, the numerator is 0. Thus, a zero is at point (0.5, 0). At z = 0, the denominator is 0. Thus, a pole is at (0,0). There is no imaginary component to either the zero or the pole, so the coefficient of the imaginary part of the complex number is 0 in both cases.

Once we have the zeroes and poles, we can plot them on the graph representing the filter, $$\mathbf{H}\left ( z \right )$$, as shown in Figure 7.38. To analyze the filter’s effect on frequency components, consider the $$k^{th}$$ discrete point on the unit circle, point $$P_{k}$$, and the behavior of filter $$\mathbf{H}\left ( z \right )$$ at that point. Point $$P_{0}$$ is at the origin, and the angle formed between the x-axis and the line between $$P_{0}$$ and $$P_{k}$$is $$\theta =\frac{2\pi k}{N}$$. Each such point $$P_{k}$$ at angle $$\theta =\frac{2\pi k}{N}$$ corresponds to a frequency component $$e^{i}\frac{2\pi k}{N}$$ in the audio that is being filtered. Assume that the Nyquist frequency has been normalized to $$\pi$$. Thus, by the Nyquist theorem, the only valid frequency components to consider are between 0 and $$\pi$$. Let $$d_{zero}$$ be the distance between $$P_{k}$$ and a zero of the filter, and let $$d_{pole}$$ be the distance between $$P_{k}$$ and a pole. The magnitude of the frequency response, $$\mathbf{H}\left ( z \right )$$, will be smaller in proportion to $$\frac{d_{zero}}{d_{pole}}$$. For this example, the pole is at the origin so for all points $$P_{k}$$, $$d_{pole}$$ is 1. Thus, the magnitude of $$\frac{d_{zero}}{d_{pole}}$$ depends entirely on $$d_{zero}$$. As $$d_{zero}$$ gets larger, the frequency response gets larger. $$d_{zero}$$ increases as you move from $$\theta =0$$ to $$\theta =\pi$$. Since the frequency response increases as frequency increases, this filter is a high-pass filter, attenuating low frequencies more than high ones.

Figure 7.38 Zero-pole graph of simple delay filter
Figure 7.38 Zero-pole graph of simple delay filter

Now let’s try a simple IIR filter. Say that the equation for the filter is this:

[equation caption=”Equation 7.13 Equation for an IIR filter”]

$$!\mathbf{y}\left ( n \right )=\mathbf{x}\left ( n \right )-\mathbf{x}\left ( n-2 \right )-0.49\mathbf{y}\left ( n-2 \right )$$

[/equation]

The coefficients for the forward filter are $$\left [ 1,0,-1\right ]$$, and the coefficients for the feedback filter are $$\left [ 0, -0.49\right ]$$. Applying the transfer function form of the filter, we get

[equation caption=”Equation 7.14 Equation for IIR filter, version 2″]

$$!\mathbf{H}\left ( z \right )=\frac{1-z^{-2}}{1-0.49z^{-2}}=\frac{z^{2}-1}{z^{2}-0.49}=\frac{\left ( z+1 \right )\left ( z-1 \right )}{\left ( z+0.7i \right )\left ( z-0.7i \right )}$$

[/equation]

From Equation 7.14, we can see that there are zeroes at $$z=1$$ and $$z=-1$$ and poles at $$z=0.7i$$ and $$z=-0.7i$$. The zeroes and poles are plotted in Figure 7.39. This diagram is harder to analyze by inspection because there are two zeroes and two poles.

Figure 7.39 A zero-pole diagram for an IIR filter with two zeroes and two poles
Figure 7.39 A zero-pole diagram for an IIR filter with two zeroes and two poles

If we graph the frequency response of the filter, we can see that it’s a bandpass filter. We can do this with MATLAB’s fvtool (Filter Visualization Tool). The fvtool expects the coefficients of the transfer function (Equation 7.10) as arguments. The first argument of fvtool should be the coefficents of the numerator $$\left ( a_{0},a_{1},a_{2},\cdots \right )$$, and the second should be the coefficients of the denominator $$\left ( b_{1},b_{2},b_{3},\cdots \right )$$. Thus, the following MATLAB commands produce the graph of the frequency response, which shows that we have a bandpass filter with a rather wide band (Figure 7.40).


a = [1 0 -1];

b = [1 0 -0.49];

fvtool(a,b);

 

(You have to include the constant 1 as the first element of b.) MATLAB’s fvtool is discussed further in Section 7.3.9.

Figure 7.40 Frequency response of bandpass filter
Figure 7.40 Frequency response of bandpass filter

[wpfilebase tag=file id=167 tpl=supplement /]

We can also graph the frequency response on the complex number plane, as we did with the zero-pole diagram, but this time we show the magnitude of the frequency response in the third dimension by means of a surface plot. This plot is shown in Figure 7.41. Writing a MATLAB function to create a graph such as this is left as an exercise.

Figure 7.41 Surface plot of filter on complex number plane, with zeroes marked in red
Figure 7.41 Surface plot of filter on complex number plane, with zeroes marked in red

In Chapter 4, we discussed comb filtering in the context of acoustics and showed how comb filters can be made by means of delays. Let’s review comb filtering now.

A comb filter is created when two identical copies of a sound are summed (in the air, by means of computer processing, etc.), with one copy being offset in time from the other. The amount of offset determines which frequencies are combed out. The equation that predicts the combed frequency is given below.

[equation caption=”Equation 7.15 Comb filtering”]

$$!\begin{matrix}\textrm{Given a delay of }t\textrm{ seconds between two identical copies of a sound, then the frequencies of }f_{i}\textrm{ that will be combed are }\\ f_{i}=\frac{2i+1}{2t}\textrm{ for all integers }i\geq 0\end{matrix}$$

[/equation]

For a sampling rate of 44.1 kHz, 0.01 is equivalent to 441 samples. This delay is demonstrated in the MATLAB code below, which applies a delay of 0.01 seconds to white noise and graphs the resulting frequencies. (The white noise was created in Audition and saved in a raw PCM file.)


fid = fopen('WhiteNoise.pcm', 'r');

y = fread(fid, 'int16');

y = y(1:44100);

first = y(442:44100);

second = y(1:441);

y2 = cat(1, first, second);

y3 = y + y2;

plot(abs(fft(y3)));

 

You can see in Figure 7.42 that with a delay of 0.01 seconds, the frequencies that are combed out are 50 Hz, 150 Hz, 250 Hz, and so forth, while frequencies of 100 Hz, 200 Hz, 300 Hz, and so forth are boosted.

Figure 7.42 Comb filtering
Figure 7.42 Comb filtering

We created the comb filter above “by hand” by shifting the sample values and summing them. We can also use MATLAB’s fvtool.


a = [1  zeroes(1,440) 0.5];

b = [1];

fvtool(a,b);

axis([0 .008 -8 6]);

 

Figure 7.43 Comb filtering through fvtool
Figure 7.43 Comb filtering through fvtool

The x-axis in Figure 7.43 is normalized to the Nyquist frequency. That is, for a sampling rate of 44.1 kHz, the Nyquist frequency of 22,050 Hz is normalized to 1. Thus, 50 Hz falls at position 0.002 and 100 Hz at about 0.0045, etc.

Comb filters can be expressed as either non-recursive FIR filters or recursive IIR filters. A simple non-recursive comb filter is expressed by the following equation:

[equation caption=”Equation 7.16 Equation for an FIR comb filter”]

$$!\begin{matrix}\mathbf{y}\left ( n \right )=\mathbf{x}\left ( n \right )+g \ast \mathbf{x}\left ( n-m \right )\\\textit{where }\mathbf{y}\textit{ is the filtered audio in the time domain}\\\mathbf{x}\textit{ is the original audio in the time domain}\\m\textit{ is the amount of delay in samples, and}\left | g \right |\leq 1\end{matrix}$$

[/equation]

In an FIR comb filter, a fraction of an earlier sample is added to a later one, creating a repetition of the original sound. The distance between the two copies of the sound is controlled by m.

A simple recursive comb filter is expressed by the following equation:

[equation caption=”Equation 7.17 Equation for an IIR comb filter”]

$$!\begin{matrix}\mathbf{y}\left ( n \right )=\mathbf{x}\left ( n \right )+g \ast \mathbf{y}\left ( n-m \right )\\\textit{where }\mathbf{y}\textit{ is the filtered audio in the time domain}\\\mathbf{x}\textit{ is the original audio in the time domain}\\m\textit{ is the amount of delay in samples, and}\left | g \right |\leq 1\end{matrix}$$

[/equation]

You can see that Equation 7.17 represents a recursive filter because the output $$\mathbf{y}\left ( n-m \right )$$ is fed back in to create later output. With this feedback term, an earlier sound continues to have an effect on later sounds, but with decreasing intensity each time it is fed back in (because of the multiplier $$\left | g \right |\leq 1$$).

Flanging is defined as a time-varying comb filter effect. It results from the application of a delay between two identical copies of audio where the amount of delay varies over time. This time-varying delay results in different frequencies and their harmonics being combed out a different moments in time. The time delay can be controlled by an LFO.

Flanging originated in the analog world as a trick performed with two tape machines playing identical copies of audio. The sound engineer was able to delay one of the two copies of the recording by pressing his finger on the rim or “flange” of the tape deck. Varying the pressure would vary the amount of delay, creating a “swooshing” sound.

Flanging effects have many variations. In basic flanging, the frequencies combed out are in a harmonic series at any moment in time. Flanging by means of phase-shifting can result in the combing of non-harmonic frequencies, such that the space between combed frequencies is not regular. The amount of feedback in flanging can also be varied. Flanging with a relatively long delay between copies of the audio is sometimes referred to a chorusing.

Guitars are a favorite instrument for flanging. The rock song “Barracuda” by Heart is good example of guitar flanging, but there are many others throughout rock history.

We leave flanging as an exercise for the reader, referring you to The Audio Programming Book cited in the references for an example implementation.

Section 7.3.2 gives you algorithms for creating a variety of FIR filters. MATLAB also provides built-in functions for creating FIR and IIR filters. Let’s look at the IIR filters first.

MATLAB’s butter function creates an IIR filter called a Butterworth filter, named for its creator. Consider this sequence of commands:


N = 10;

f = 0.5;

[a,b] = butter(N, f);

 

The butter function call sends in two arguments: the order of the desired filter, N; and the and the cutoff frequency, f. The cutoff frequency is normalized so that the Nyquist frequency (½ the sampling rate) is 1, and all valid frequencies lie between 0 and 1. The function call returns two vectors, a and b, corresponding to the vectors a and b in Equation 7.4. (For a simple low-pass filter, an order of 6 is fine.  The order is the number of coeffients.)

Now with the filter in hand, you can apply it using MATLAB’s filter function. The filter function takes the coefficients and the vector of audio samples as arguments:


output = filter(a,b,audio);

 

You can analyze the filter with the fvtool.


fvtool(a,b)

 

The fvtool provides a GUI through which you can see multiple views of the filter, including those in Figure 7.44. In the first figure, the blue line is the magnitude frequency response and the green line is phase.

Figure 7.44 Filter Visualization tool in MATLAB
Figure 7.44 Filter Visualization tool in MATLAB

Another way to create and apply an IIR filter in MATLAB is by means of the function yulewalk. Let’s try a low-pass filter as a simple example. Figure 7.45 shows the idealized frequency response of a low-pass filter. The x-axis represents normalized frequencies, and f_c is the cutoff frequency. This particular filter allows frequencies that are up to ¼ the sampling rate to pass through, but filters out all the rest.

Figure 7.45 Frequency response of an ideal low-pass filter
Figure 7.45 Frequency response of an ideal low-pass filter

The first step in creating this filter is to store its “shape.” This information is stored in a pair of parallel vectors, which we’ll call f and m. For the four points on the graph in Figure 7.46, f stores the frequencies, and stores the corresponding magnitudes. That is, $$\mathbf{f}=\left [ f1\; f2\; f3\; f4 \right ]$$ and $$\mathbf{m}=\left [ m1\; m2\; m3\; m4 \right ]$$, as illustrated in the figure. For the example filter we have


f = [0 0.25 0.25 1];

m = [1 1 0 0];

 

Figure 7.46 Points corresponding to input parameters in yulewalk function
Figure 7.46 Points corresponding to input parameters in yulewalk function

[aside]The yulewalk function in MATLAB is named for the Yule-Walker equations, a set of linear equations used in auto-regression modeling.[/aside]

Now that you have an ideal response, you use the yulewalk function in MATLAB to determine what coefficients to use to approximate the ideal response.


[a,b] = yulewalk(N,f,m)

 

Again, an order N=6 filter is sufficient for the low-pass filter. You can use the same filter function as above to apply the filter. The resulting filter is given in Figure 7.47. Clearly, the filter cannot be perfectly created, as you can see from the large ripple after the cutoff point.

Figure 7.47 Frequency response of a yulewalk filter
Figure 7.47 Frequency response of a yulewalk filter

The finite counterpart to the yulewalk function is the fir2 function. Like butter, fir2 takes as input the order of the filter and two vectors corresponding to the shape of the filter’s frequency response. Thus, we can use the same f and m as before. fir2 returns the vector h constituting the filter.


h = fir2(N,f,m);

 

We need to use a higher order filter because this is an FIR. N=30 is probably high enough.

MATLAB’s extensive signal processing toolkit includes a Filter Designer with an easy-to-use graphical user interface that allows you to design the types of filters discussed in this chapter. A Butterworth filter is shown in Figure 7.48. You can adjust the parameters in the tool and see how the roll-off and ripples change accordingly. Experimenting with this tool helps you know what to expect from filters designed by different algorithms.

Figure 7.48 Butterworth filter designed in MATLAB’s Filter Design and Analysis tool
Figure 7.48 Butterworth filter designed in MATLAB’s Filter Design and Analysis tool

 

Vocoders were introduced in Section 7.1.8. The implementation of a vocoder is sketched in Algorithm 7.6 and diagrammed in Figure 7.49. The MATLAB and C++ exercises associated with this section encourage you to try your hand at the implementation.

[equation class=”algorithm” caption=”Algorithm 7.6 Sketch of an implementation of a vocoder”]

algorithm vocoder

/*

Input:

c, an array of audio samples constituting the carrier signal

m, n array of audio samples constituting the modulator signal

Output:

v, the carrier wave modulated with the modulator wave */

{

Initialize v with 0s

Divide the carrier into octave-separated frequency bands with bandpass filters

Divide the modulator into the same octave-separated frequency bands with bandpass filters for each band

use the modulator as an amplitude envelope for the carrier

}

[/equation]

Figure 7.49 Overview of vocoder implementation
Figure 7.49 Overview of vocoder implementation

[wpfilebase tag=file id=169 tpl=supplement /]

[wpfilebase tag=file id=103 tpl=supplement /]

[wpfilebase tag=file id=71 tpl=supplement /]

[wpfilebase tag=file id=101 tpl=supplement /]

Another interesting programming exercise is implementation of a pitch glide. A Risset pitch glide is an audio illusion that sounds like a constantly rising pitch. It is the aural equivalent of the visual image of a stripe on a barber pole that seems to be rising constantly. Implementing the pitch glide is suggested as an exercise for this section.

10/11