# How does Spectral Hard Modeling work?

This article answers a frequently asked question about the spectral analysis methods subsumed under the name Spectral Hard Modeling. It explains what spectral Hard Modeling is and how it works. Spectral Hard Modeling methods, most prominently Indirect Hard Modeling (IHM), are implemented in PEAXACT Software for Quantitative Spectroscopy from S-PACT. A software tutorial is available if you would like to try it on your own.

### What is spectral Hard Modeling?

Spectral Hard Modeling is a set of methods for predicting unknown concentrations from spectra of mixtures, in particular mid-infrared (MIR), Raman, and NMR spectra, although any signal composed of peaks may work. We also use the name to describe the procedure of building a spectral Hard Model. (Why Indirect Hard Modeling is considered "indirect" is explained later.)

### The spectral Hard Model

A mathematical function is called a physical model or Hard Model if it is derived from equations representing the physics behind an underlying process. In contrast, a function is called an empirical model or Soft Model if it is derived empirically or statistically, a polynomial being a typical example. In spectral Hard Modeling, the Hard Model is derived from the physics of molecular spectroscopy; the function is a mathematical representation of a mixture spectrum.

[Figure: Mixture spectrum (black) is composed of superimposed peaks (cyan)]

Physics tells us that a mixture spectrum is composed of superimposed peaks originating from the individual components in the mixture, with the components' concentrations being responsible for the peaks' intensities. This structural information is maintained in the Hard Model by means of a sum of peak-shaped curves. Groups of peak-curves which represent pure component spectra are referred to as Component Models; they are multiplied by concentration-related weight parameters. As is typical for Hard Models, all model parameters correspond to physical quantities like peak positions or peak widths, but of particular interest is the component weight parameter because of its meaning for the analysis of component concentrations.

[Figure: Hard Model (red) is a weighted sum of Component Models (blue)]

[Figure: Flexibility of peak functions]

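
Written out, the structure described above can be sketched as follows. The notation is chosen here for illustration and is not taken from the original publications: $S$ is the modeled mixture spectrum over the spectral axis $\nu$, $w_k$ is the weight of component $k$, $C_k$ is its Component Model, and $P$ is a peak function (for example a Gaussian or Lorentzian profile) with a parameter vector $\theta_{k,j}$ holding position, intensity, width, and shape of peak $j$.

$$
S(\nu) = \sum_{k=1}^{K} w_k \, C_k(\nu), \qquad C_k(\nu) = \sum_{j=1}^{J_k} P(\nu;\, \theta_{k,j})
$$

In this form the weights $w_k$ are the only parameters that scale a whole component, which is why they carry the concentration information.
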
To obtain a component’s concentration from a measured spectrum, one has to determine the component’s weight parameter in the model. This is done by a mathematical procedure called model fitting, in which the model’s parameters are automatically adjusted until the model fits the measured spectrum. During model fitting the Hard Model unfolds its full potential: interfering peak variations in the measured spectrum, such as shifts or shape changes, are accounted for by a suitable adjustment of the corresponding peak parameters, thus reducing error propagation to the relevant component weights. What remains to be done is a simple conversion of component weights to concentrations, known as calibration.
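
The effect of flexible peak parameters on the component weights can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two Component Models, the synthetic "measured" spectrum, and the restriction to a single flexible parameter (a common shift of component B). The fit itself is simplified to a scan over candidate shifts with a linear least-squares solve for the weights at each shift; a full Hard Model fit would typically adjust all peak parameters with a nonlinear optimizer instead.

```python
import numpy as np

def lorentzian(x, position, width):
    """Unit-height Lorentzian peak; `width` is the half width at half maximum."""
    return width**2 / ((x - position) ** 2 + width**2)

def component_model(x, peaks, shift=0.0):
    """Sum of peaks given as (position, height, width); `shift` moves every peak."""
    return sum(h * lorentzian(x, p + shift, w) for p, h, w in peaks)

# Hypothetical Component Models for two components A and B.
PEAKS_A = [(30.0, 1.0, 3.0), (55.0, 0.5, 4.0)]
PEAKS_B = [(50.0, 0.8, 3.0), (75.0, 1.0, 5.0)]

# Synthetic "measured" spectrum: weights 0.7 and 0.3, with component B
# shifted by 2.0 axis units and a little noise added.
x = np.linspace(0.0, 100.0, 501)
rng = np.random.default_rng(0)
measured = (0.7 * component_model(x, PEAKS_A)
            + 0.3 * component_model(x, PEAKS_B, shift=2.0)
            + rng.normal(0.0, 0.001, x.size))

def fit_weights(shift_b):
    """Least-squares component weights for a fixed shift of component B."""
    design = np.column_stack([component_model(x, PEAKS_A),
                              component_model(x, PEAKS_B, shift_b)])
    weights, *_ = np.linalg.lstsq(design, measured, rcond=None)
    residual = measured - design @ weights
    return weights, float(residual @ residual)

# Flexible fit: scan the shift and keep the value with the smallest residual.
candidate_shifts = np.linspace(-4.0, 4.0, 161)
fits = [fit_weights(s) for s in candidate_shifts]
best = int(np.argmin([rss for _, rss in fits]))
flexible_weights, best_shift = fits[best][0], float(candidate_shifts[best])

# Rigid fit: peak positions frozen at their pure-component values.
rigid_weights, _ = fit_weights(0.0)

print("flexible:", flexible_weights.round(3), "shift:", round(best_shift, 2))
print("rigid:   ", rigid_weights.round(3))
```

With the shift left free, the recovered weights should stay close to the values used to generate the data, whereas the rigid fit lets the unmodeled shift leak into both weights.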

### The Calibration Model in spectral Hard Modeling

The goal of calibration is to quantitatively convert measurements made on one measurement scale to another measurement scale. The functional relationship between both scales is established by regression. While a direct conversion from spectra to concentrations would require complex multivariate regression techniques (see e.g. Partial Least Squares regression), the relationship between Hard Model component weights (resulting from a fit of the Hard Model to a measured spectrum) and concentrations can be determined by simple univariate regression. Univariate regression offers a well-established methodology with a sound statistical underpinning, especially regarding validation of the regression function.

[Figure: Calibration model of spectral Hard Models: regression of known concentrations on component weights]

The regression function used in calibration (also known as calibration function, calibration curve, or calibration model) is defined in terms of parameters, so-called calibration constants, that are estimated from training samples. Training samples are measured spectra for which the concentrations are known, e.g. from weighing in the compounds on a balance or from a trustworthy reference analysis. The name “training sample” refers to its use in teaching the calibration model how to compute unknown concentrations from the component weights of new samples. Training the calibration model of spectral Hard Models comes with two useful properties. Firstly, the calibration model can be trained with concentrations outside the prediction region of interest; in other words, the model is capable of extrapolation. Secondly, since regression is done using component weights that are already adjusted for spectral disturbances, a robust calibration model can be trained with only a few samples.
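
As a minimal sketch of such a univariate calibration, the following example regresses known concentrations on component weights with a straight line. The training values, the concentration unit, and the choice of a first-order calibration function are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical training set: component weights obtained from Hard Model fits
# and the known concentrations of the same samples (e.g. in g/L).
train_weights = np.array([0.10, 0.21, 0.29, 0.41, 0.50])
train_conc = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Regress concentration on weight; np.polyfit returns the highest power first,
# so the two calibration constants come back as (slope, intercept).
slope, intercept = np.polyfit(train_weights, train_conc, deg=1)

def predict_concentration(weight):
    """Convert a fitted component weight into a concentration."""
    return slope * weight + intercept

# Coefficient of determination on the training samples as a simple check.
fitted = predict_concentration(train_weights)
ss_res = np.sum((train_conc - fitted) ** 2)
ss_tot = np.sum((train_conc - train_conc.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

# Apply the calibration to the component weight of a new sample.
new_weight = 0.35
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r_squared:.4f}")
print(f"predicted concentration: {predict_concentration(new_weight):.2f}")
```

Because only two calibration constants are estimated, standard tools of univariate regression such as prediction intervals can be applied to the result.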

### “Indirect” Model Building

Indirect Hard Modeling (IHM), published in 2004, was the first prominent Spectral Hard Modeling method.
Model building typically involves two steps: determination of the model’s structure and estimation of initial model parameters. The structure of a spectral Hard Model is already known to be a weighted sum of component models, which in turn are sums of peak-curves, e.g. Gaussian or Lorentzian profiles. Therefore, the task reduces to building the model for a very specific mixture, i.e. choosing the number of peak-curves and their initial parameters (position, intensity, width, shape) so that the model represents the mixture's spectrum.
Due to overlapping peaks in the mixture spectrum, one cannot simply choose peak-curves for all measured peaks directly. Instead, one needs an indirect approach with a little help from the pure component spectra. In a first step, a separate Hard Model is built for each pure component spectrum by automatic peak fitting. Setting up all individual Component Models requires knowledge of all pure component spectra. (See the related article about HMFA if not all pure component spectra are known.) In the final step, these Component Models are combined into a Mixture Model.
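
The two steps can be sketched in a few lines. This is an illustration under stated assumptions, not the published algorithm: the pure component spectra are synthetic, the peaks are plain Lorentzians, the rough initial guesses are supplied by hand (an automated method would pick them itself), and `scipy.optimize.curve_fit` stands in for the peak-fitting routine.

```python
import numpy as np
from scipy.optimize import curve_fit

def lorentzian(x, position, height, width):
    """Lorentzian peak; `width` is the half width at half maximum."""
    return height * width**2 / ((x - position) ** 2 + width**2)

def peak_sum(x, *params):
    """Sum of Lorentzians; `params` is flat: position, height, width, ..."""
    total = np.zeros_like(x, dtype=float)
    for i in range(0, len(params), 3):
        total += lorentzian(x, *params[i:i + 3])
    return total

x = np.linspace(0.0, 100.0, 501)
rng = np.random.default_rng(1)

# Hypothetical ground truth used only to synthesize pure component spectra.
pure_truth = {"A": [30.0, 1.0, 3.0, 55.0, 0.5, 4.0],
              "B": [50.0, 0.8, 3.0, 75.0, 1.0, 5.0]}
# Rough, hand-picked initial guesses for the peak parameters.
rough_guess = {"A": [28.0, 0.8, 2.0, 57.0, 0.4, 3.0],
               "B": [52.0, 0.6, 4.0, 73.0, 0.8, 4.0]}

# Step 1: build one Component Model per pure spectrum by peak fitting.
component_params = {}
for name, truth in pure_truth.items():
    pure_spectrum = peak_sum(x, *truth) + rng.normal(0.0, 0.002, x.size)
    fitted, _ = curve_fit(peak_sum, x, pure_spectrum,
                          p0=rough_guess[name], bounds=(0.0, np.inf))
    component_params[name] = fitted

# Step 2: combine the Component Models into a Mixture Model.
def mixture_model(x, weights):
    """Weighted sum of the fitted Component Models."""
    return sum(w * peak_sum(x, *component_params[name])
               for name, w in weights.items())

mixture = mixture_model(x, {"A": 0.7, "B": 0.3})
for name, params in component_params.items():
    print(name, np.round(params, 2))
```

The peak parameters found in step 1 then serve as initial values that remain adjustable when the Mixture Model is fitted to a measured mixture spectrum.
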
A question may arise at this point: why bother building a pure component model instead of simply using the measured pure component spectra? First of all, this idea has already been picked up and implemented in a method called Classical Least Squares. More importantly, the measured pure component spectra are rigid data vectors, whereas the model has flexible parameters that can be adjusted to the many spectral variations that may occur in a mixture spectrum.

### Properties of spectral Hard Modeling

#### Requirements

• The spectra under consideration should follow a peak shape described by the Hard Model equations. Modeling other spectral shapes with peak functions has no physical justification.
• Ideally, all pure components of the mixture should be known (or at least all components contributing to the spectral signal within the selected spectral range). If not, one of the alternative ways to generate the pure component Hard Models should be applicable.
• The Hard Model must fit the mixture spectrum well. A lack-of-fit would induce systematic errors, especially in case of small concentrations.

#### Benefits

• The physically motivated spectral Hard Model is easily interpretable because it resembles a mixture spectrum. This kind of modeling allows for better prediction and is less subject to variation than Soft Models (as long as the underlying physical process doesn't change).
• The component weight parameter of the Hard Model is a very suitable parameter for calibration because it is highly selective for a certain component of interest.
• The Hard Model contains flexible parameters that are automatically adjusted during model fitting to correct spectral effects like peak shifts and shape changes. As a result, error propagation to the important component weight parameter is reduced and fewer training samples are necessary for calibration.
• Using surrogate component weights instead of full spectra for calibration allows for simple univariate regression instead of complex multivariate regression. This enables thorough model validation, e.g. uncertainty estimation, prediction intervals, and other figures of merit.
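• Instead of classical calibration, spectral Hard Modeling uses inverse calibration, i.e. concentrations are regressed on component weights, which has been reported to predict more accurately, in particular for small data sets [Tellinghuisen2000, Krutchkoff1967].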

### Literature

• E. Kriesten, F. Alsmeyer, A. Bardow, and W. Marquardt (2008). “Fully automated indirect hard modeling of mixture spectra”, Chemometrics and Intelligent Laboratory Systems, Vol. 91, pp. 181–193.
• F. Alsmeyer, H.-J. Koß, and W. Marquardt (2004). “Indirect Spectral Hard Modeling for the Analysis of Reactive and Interacting Mixtures”, Applied Spectroscopy, Vol. 58 (8), pp. 975–986.
• J. Tellinghuisen (2000). “Inverse vs. classical calibration for small data sets”, Fresenius Journal of Analytical Chemistry, Vol. 368, pp. 585–588.
• R. G. Krutchkoff (1967). “Classical and Inverse Methods of Calibration”, Technometrics, Vol. 9, pp. 425–439.
