An Introduction to Signal Processing
in Chemical Analysis
An illustrated essay with software available for free download
Last updated July, 2011
Professor Emeritus
Department of Chemistry and Biochemistry
University of Maryland at College Park
E-mail: toh@umd.edu
Foreword
The interfacing of analytical instrumentation to small computers for the purpose of
on-line data acquisition has now become standard practice in the modern chemistry
laboratory. Using widely-available, low-cost microcomputers and off-the-shelf add-in
components, it is now easier than ever to acquire data quickly in digital form. In what
ways is on-line digital data acquisition superior to the old methods such as the chart
recorder? Some of the advantages are obvious, such as archival storage and retrieval of
data and post-run re-plotting with adjustable scale expansion. Even more important,
however, there is the possibility of performing post-run data analysis and signal
processing. There are a large number of computer-based numerical methods that can be
used to reduce noise, improve the resolution of overlapping peaks, compensate for
instrumental artifacts, test hypotheses, optimize measurement strategies, diagnose
measurement difficulties, and decompose complex signals into their component parts.
These techniques can often make difficult measurements easier by extracting more
information from the available data. Many of these techniques are based on laborious
mathematical procedures that were not practical before the advent of computerized
instrumentation. It is important for chemistry students to appreciate the capabilities and
the limitations of these modern signal processing techniques.
In the chemistry curriculum, signal processing may be covered as part of a course on
instrumental analysis (1, 2), electronics for chemists (3), laboratory interfacing (4), or
basic chemometrics (5). The purpose of this paper is to give a general introduction to
some of the most widely used signal processing techniques and to give illustrations of
their applications in analytical chemistry. This essay covers only elementary topics and is
limited to only basic mathematics. For more advanced topics and for a more rigorous
treatment of the underlying mathematics, refer to the extensive literature on
chemometrics.
This tutorial makes use of a freeware signal-processing program called SPECTRUM that
was used to produce many of the illustrations. Additional examples were developed in
Matlab, a high-performance commercial numerical computing environment and
programming language that is widely used in research. Paragraphs in gray at the end of
each section in this essay describe the related capabilities of each of these programs.
Signal arithmetic
The most basic signal processing functions are those that involve simple signal
arithmetic: point-by-point addition, subtraction, multiplication, or division of two signals
or of one signal and a constant. Despite their mathematical simplicity, these functions can
be very useful. For example, in the left part of Figure 1 (Window 1) the top curve is the
absorption spectrum of an extract of a sample of oil shale, a kind of rock that is a
source of petroleum.
Figure 1. A simple point-by-point subtraction of two signals allows the background
(bottom curve on the left) to be subtracted from a complex sample (top curve on the left),
resulting in a clearer picture of what is really in the sample (right).
This spectrum exhibits two absorption bands, at about 515 nm and 550 nm, that are due
to a class of molecular fossils of chlorophyll called porphyrins. (Porphyrins are used as
geomarkers in oil exploration). These bands are superimposed on a background
absorption caused by the extracting solvents and by non-porphyrin compounds extracted
from the shale. The bottom curve is the spectrum of an extract of a non-porphyrin-bearing
shale, showing only the background absorption. To obtain the spectrum of the shale
extract without the background, the background (bottom curve) is simply subtracted from
the sample spectrum (top curve). The difference is shown on the right in Window 2 (note
the change in Y-axis scale). In this case the removal of the background is not perfect,
because the background spectrum is measured on a separate shale sample. However, it
works well enough that the two bands are now seen more clearly and it is easier to
measure their absorbances and wavelengths precisely.
In this example and the one below, the assumption is being made that the two signals in
Window 1 have the same x-axis values, that is, that both spectra are digitized at the same
set of wavelengths. Strictly speaking this operation would not be valid if two spectra were
digitized over different wavelength ranges or with different intervals between adjacent
points. The x-axis values must match up point for point. In practice, this is very often the
case with data sets acquired within one experiment on one instrument, but the
experimenter must take care if the instrument settings are changed or if data from two
experiments or two instruments are combined. (Note: It is possible to use the mathematical
technique of interpolation to change the number of points or the x-axis intervals of
signals; the results are only approximate but often close enough in practice.)
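As a rough illustration of the interpolation mentioned in the note above, the following Python sketch resamples one spectrum onto the x-axis values of another by simple linear interpolation, so that the two can then be subtracted or divided point by point. The function name interpolate and the wavelength and absorbance values are made up for this example; other interpolation schemes could equally be used.

```python
from bisect import bisect_right

def interpolate(x_old, y_old, x_new):
    """Linearly interpolate the signal (x_old, y_old) at each point of x_new.

    Assumes x_old is sorted in increasing order and that every value in
    x_new lies inside the measured range (no extrapolation is attempted).
    """
    y_new = []
    for x in x_new:
        if not x_old[0] <= x <= x_old[-1]:
            raise ValueError(f"{x} is outside the measured range")
        # Index of the left-hand neighbor, clamped so that i + 1 stays valid.
        i = min(bisect_right(x_old, x) - 1, len(x_old) - 2)
        fraction = (x - x_old[i]) / (x_old[i + 1] - x_old[i])
        y_new.append(y_old[i] + fraction * (y_old[i + 1] - y_old[i]))
    return y_new

# Made-up spectrum sampled every 10 nm, resampled every 5 nm.
wavelengths = [500, 510, 520, 530]
absorbance = [0.10, 0.30, 0.20, 0.40]
target = [500, 505, 510, 515, 520, 525, 530]
resampled = interpolate(wavelengths, absorbance, target)
```

The interpolated points lie on straight lines between the measured points, which is why the result is only approximate when the true signal curves between samples.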
Sometimes one needs to know whether two signals have the same shape, for example in
comparing the spectrum of an unknown to a stored reference spectrum. Most likely the
concentrations of the unknown and reference, and therefore the amplitudes of the spectra,
will be different. Therefore a direct overlay or subtraction of the two spectra will not be
useful. One possibility is to compute the point-by-point ratio of the two signals; if they
have the same shape, the ratio will be a constant. For example, examine Figure 2.
Figure 2. Do the two spectra on the left have the same shape? They certainly do not
look the same, but that may simply be due to the fact that one is much weaker than the
other. The ratio of the two spectra, shown in the right part (Window 2), is relatively
constant from 300 to 440 nm, with a value of 10 +/- 0.2. This means that the shape of
these two signals is very nearly identical over this wavelength range.
The left part (Window 1) shows two superimposed spectra, one of which is much weaker
than the other. But do they have the same shape? The ratio of the two spectra, shown in
the right part (Window 2), is relatively constant from 300 to 440 nm, with a value of 10
+/- 0.2. This means that the shape of these two signals is the same, within about +/- 2%,
over this wavelength range, and that the top curve is about 10 times more intense than the
bottom one. Above 440 nm the ratio is not even approximately constant; this is caused by
noise, which is the topic of the next section.
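The ratio test can be reduced to two numbers: the average of the point-by-point ratio and its relative spread. The Python sketch below uses made-up values (not the data of Figure 2), and the function name shape_ratio is hypothetical. It assumes both signals share the same x-axis and that the reference is never zero.

```python
from statistics import mean, stdev

def shape_ratio(signal_a, signal_b):
    """Return (mean, relative standard deviation) of the point-by-point ratio a/b.

    Assumes both signals are sampled at the same x-axis values and that
    signal_b contains no zeros.
    """
    ratios = [a / b for a, b in zip(signal_a, signal_b)]
    average = mean(ratios)
    return average, stdev(ratios) / average

# Made-up data: the "unknown" is the reference scaled by roughly 10.
reference = [0.020, 0.050, 0.080, 0.050, 0.020]
unknown = [0.202, 0.495, 0.808, 0.500, 0.198]
average, relative_spread = shape_ratio(unknown, reference)
```

A small relative spread (here about 1%) suggests that the two signals have the same shape over the range tested; a large one suggests either a different shape or, as above 440 nm in Figure 2, a ratio dominated by noise.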
Simple signal arithmetic operations such as these are easily done in a spreadsheet, any
general-purpose programming language, or a dedicated signal-processing program such
as SPECTRUM, which is available for free download.
SPECTRUM includes addition and multiplication of a signal with a constant; addition,
subtraction, multiplication, and division of two signals; normalization; and a large
number of other basic math functions (log, ln, antilog, square root, reciprocal, etc.).
In Matlab, math operations on signals are especially powerful because the variables in
Matlab can be either scalar (single values), vector (like a row or a column in a
spreadsheet), representing one entire signal, spectrum or chromatogram, or matrix (like a
rectangular block of cells in a spreadsheet), representing a set of signals. For example, in
Matlab you could define two vectors a=[1 2 5 2 1] and b=[4 3 2 1 0]. Then
to subtract b from a you would just type a-b, which gives the result [-3 -1 3 1 1].
To multiply a times b point by point, you would type a.*b, which gives the result
[4 6 10 2 0]. If you have an entire spectrum in the variable a, you can plot it just by
typing plot(a). And if you also have a vector w of x-axis values (such as wavelengths),
you can plot a vs. w by typing plot(w,a). The subtraction of two spectra a and b, as in
Figure 1, can be performed simply by writing a-b. To plot the difference, you would
write plot(a-b). Likewise, to plot the ratio of two spectra, as in Figure 2, you would
write plot(a./b). Moreover, Matlab is a programming language that can automate
complex sequences of operations by saving them in scripts and functions.
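For readers without Matlab, the same point-by-point arithmetic can be sketched in a general-purpose language. The Python example below reuses the two vectors a and b from the text; the helper name pointwise is hypothetical. Unlike Matlab, plain Python lists have no built-in element-wise operators, so the helper loops over matching points and checks that the two signals have the same length.

```python
import operator

def pointwise(operation, a, b):
    """Apply a two-argument operation to matching points of two signals."""
    if len(a) != len(b):
        raise ValueError("signals must have the same number of points")
    return [operation(x, y) for x, y in zip(a, b)]

a = [1, 2, 5, 2, 1]
b = [4, 3, 2, 1, 0]

difference = pointwise(operator.sub, a, b)   # like a-b  in Matlab: [-3, -1, 3, 1, 1]
product = pointwise(operator.mul, a, b)      # like a.*b in Matlab: [4, 6, 10, 2, 0]
ratio = pointwise(operator.truediv, b, a)    # like b./a in Matlab
```

Note that the ratio is taken as b divided by a here because b contains a zero: dividing by it raises ZeroDivisionError in Python, whereas Matlab returns Inf for that point.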
Signals and noise
Experimental measurements are never perfect, even with sophisticated modern
instruments. Two main types of measurement error are recognized: systematic error, in
which every measurement is either less than or greater than the "correct" value by a fixed
percentage or amount, and random error, in which there are unpredictable variations in
the measured signal from moment to moment or from measurement to measurement. This
latter type of error is often called noise, by analogy to acoustic noise. There are many
sources of noise in physical measurements, such as building vibrations, air currents,
electric power fluctuations, stray radiation from nearby electrical apparatus, interference
from radio and TV transmissions, random thermal motion of molecules, and even the
basic quantum nature of matter and energy itself.
In spectroscopy, three fundamental types of noise are recognized: photon noise, detector
noise, and flicker (fluctuation) noise. Photon noise (often the limiting noise in
instruments that use photomultiplier detectors) is proportional to the square root of light
intensity, and therefore the signal-to-noise ratio (SNR) is proportional to the square root
of light intensity and directly proportional to the slit width. Detector noise (often the
limiting noise in instruments that use solid-state photodiode detectors) is independent of
the light intensity, and therefore the detector SNR is directly proportional to the light
intensity and to the square of the monochromator slit width. Flicker noise, caused by light
source instability, vibration, sample cell positioning errors, sample turbulence, light
scattering by suspended particles, dust, bubbles, etc., is directly proportional to the light
intensity, so the flicker SNR does not change when the slit width is increased. Flicker noise
can usually be reduced or eliminated by using specialized instrument designs such as
double-beam, dual wavelength, diode array, and wavelength modulation.
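These proportionalities can be checked with a few lines of arithmetic. The Python sketch below assumes, as the relationships above imply, that the light intensity (the signal) varies as the square of the slit width. The function name snr_gain and its arguments are hypothetical and chosen only for illustration.

```python
def snr_gain(slit_factor, noise_type):
    """Factor by which the SNR changes when the slit width is multiplied by slit_factor.

    Assumption: light intensity (the signal) is proportional to the
    square of the slit width.
    """
    signal_factor = slit_factor ** 2
    if noise_type == "photon":
        noise_factor = signal_factor ** 0.5   # noise grows as sqrt(intensity)
    elif noise_type == "detector":
        noise_factor = 1.0                    # noise independent of intensity
    elif noise_type == "flicker":
        noise_factor = signal_factor          # noise grows as intensity
    else:
        raise ValueError(f"unknown noise type: {noise_type}")
    return signal_factor / noise_factor

# Effect of doubling the slit width for each limiting noise source.
gains = {kind: snr_gain(2.0, kind) for kind in ("photon", "detector", "flicker")}
```

Doubling the slit width thus doubles the SNR when photon noise dominates, quadruples it when detector noise dominates, and leaves it unchanged when flicker noise dominates.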
The quality of a signal is often expressed quantitatively as the signal-to-noise ratio (SNR),
which is the ratio of the true signal amplitude (e.g. the average amplitude or the peak
height) to the standard deviation of the noise. Signal-to-noise ratio is inversely
proportional to the relative standard deviation of the signal amplitude. Measuring the
signal-to-noise ratio usually requires that the noise be measured separately, in the absence
of signal. Depending on the type of experiment, it may be possible to acquire readings of
the noise alone, for example on a segment of the baseline before or after the occurrence of
the signal. However, if the magnitude of the noise depends on the level of the signal (as
in photon noise or flicker noise in spectroscopy), then the experimenter must try to
produce a constant signal level to allow measurement of the noise on the signal. In a
few cases, where it is possible to model the shape of the signal exactly by means of a
mathematical function, the noise may be estimated by subtracting the model signal from
the experimental signal.
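A minimal sketch of the baseline approach, written in Python with made-up readings: the noise is taken as the standard deviation of a signal-free baseline segment, and the signal as the average of a constant-level segment measured relative to the baseline. The function name signal_to_noise is hypothetical, and the sketch assumes the noise is the same on the baseline and on the signal, which, as noted above, is not true for photon or flicker noise.

```python
from statistics import mean, stdev

def signal_to_noise(signal_segment, baseline_segment):
    """Estimate SNR as (mean signal above baseline) / (std. dev. of the baseline).

    Assumes the noise on the baseline is representative of the noise on
    the signal, i.e. the noise does not depend on the signal level.
    """
    amplitude = mean(signal_segment) - mean(baseline_segment)
    return amplitude / stdev(baseline_segment)

# Made-up readings: a baseline around 0 and a constant signal around 1.
baseline = [0.02, -0.01, 0.00, 0.01, -0.02]
plateau = [1.01, 0.99, 1.00, 1.02, 0.98]
snr = signal_to_noise(plateau, baseline)
```

With so few baseline points the noise estimate is itself uncertain; in practice a longer baseline segment gives a more reliable value.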
Figure 3. Window 1 (left) is a single measurement of a very noisy signal. There is
actually a broad peak near the center of this signal, but it is not possible to measure its
position, width, and height accurately because the signal-to-noise ratio is very poor (less
than 1). Window 2 (right) is the average of 9 repeated measurements of this signal,
clearly showing the peak emerging from the noise. The expected improvement in
signal-to-noise ratio is 3 (the square root of 9). Often it is possible to average hundreds of
measurements, resulting in a much more substantial improvement.
One of the fundamental problems in signal measurement is distinguishing the noise from
the signal. Sometimes the two can be partly distinguished on the basis of frequency
components: for example, the signal may contain mostly low-frequency components and
the noise may be located at higher frequencies. This is the basis of filtering and
smoothing. But the thing that really distinguishes signal from noise is that random noise
is not the same from one measurement of the signal to the next, whereas the genuine
signal is at least partially reproducible. So if the signal can be measured more than once,
use can be made of this fact by measuring the signal over and over again as fast as
practical and adding up all the measurements point-by-point. This is called ensemble
averaging, and it is one of the most powerful methods for improving signals, when it can
be applied. For this to work properly, the noise must be random and the signal must occur
at the same time in each repeat. An example is shown in Figure 3.
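Ensemble averaging can be tried out on simulated data. The Python sketch below is not the code used to produce Figure 3; the peak shape, noise level, number of points, and function names are all assumptions made for illustration. It adds independent normally-distributed noise to the same broad peak nine times, averages the nine measurements point by point, and compares the remaining noise with that of a single measurement.

```python
import math
import random
from statistics import stdev

def noisy_measurement(true_signal, noise_sd, rng):
    """Return one simulated measurement: the true signal plus random noise."""
    return [value + rng.gauss(0.0, noise_sd) for value in true_signal]

def ensemble_average(measurements):
    """Average several measurements of the same signal point by point."""
    count = len(measurements)
    return [sum(points) / count for points in zip(*measurements)]

def residual_noise(measured, true_signal):
    """Standard deviation of the difference between measured and true signal."""
    return stdev([m - t for m, t in zip(measured, true_signal)])

rng = random.Random(0)  # fixed seed so the simulation is repeatable
true_signal = [math.exp(-((i - 1000) / 300) ** 2) for i in range(2000)]

single = noisy_measurement(true_signal, 1.0, rng)
repeats = [noisy_measurement(true_signal, 1.0, rng) for _ in range(9)]
averaged = ensemble_average(repeats)

improvement = residual_noise(single, true_signal) / residual_noise(averaged, true_signal)
```

Because the signal is the same in every repeat while the noise is not, the noise in the average of n measurements falls as the square root of n; here improvement should come out close to 3, the square root of 9.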
SPECTRUM includes several functions for measuring signals and noise, plus a
signal-generator that can be used to generate artificial signals with Gaussian and Lorentzian
bands, sine waves, and normally-distributed random noise. Matlab has built-in functions
that can be used for measuring and plotting signals and noise, such as mean, max, and min.
User-defined functions can also be written to automate commonly-used algorithms. Some
examples that you can download and use are these user-defined
functions to calculate typical peak shapes commonly encountered in analytical chemistry,
gaussian and lorentzian, and typical types of random noise (whitenoise, pinknoise), which
can be useful in modeling and simulating analytical signals and testing measurement
techniques. (If you are viewing this document on-line, you can Ctrl-click on these links to
inspect the code.) Once you have created or downloaded those functions, you can use
them to plot a simulated noisy peak such as in Figure 3 by typing
x=[1:256];plot(x,gaussian(x,128,64)+whitenoise(x)).
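For comparison, here is a Python sketch of similar peak-shape and noise functions. These are not the downloadable Matlab functions themselves. In particular, it is an assumption here that the third argument is the full width at half maximum of a unit-height peak, and that whitenoise returns normally-distributed noise with zero mean and unit standard deviation, one value per x point.

```python
import math
import random

def gaussian(x, position, width):
    """Unit-height Gaussian peak; width is taken as the full width at half maximum."""
    return [math.exp(-4 * math.log(2) * ((xi - position) / width) ** 2) for xi in x]

def lorentzian(x, position, width):
    """Unit-height Lorentzian peak; width is taken as the full width at half maximum."""
    return [1 / (1 + ((xi - position) / (0.5 * width)) ** 2) for xi in x]

def whitenoise(x, rng=random):
    """Normally-distributed random noise (mean 0, std. dev. 1), one value per x point."""
    return [rng.gauss(0, 1) for _ in x]

# Simulated noisy peak on the same x-axis as the Matlab one-liner, x = 1..256.
x = list(range(1, 257))
peak = gaussian(x, 128, 64)
noise = whitenoise(x, random.Random(1))
noisy_peak = [p + n for p, n in zip(peak, noise)]
```

Plotting is left out because the Python standard library has no plotting function; the list noisy_peak can be passed to whatever plotting package is at hand.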