Oplecorr

From Eigenvector Research Documentation Wiki
Revision as of 13:08, 17 December 2013 by imported>Neal (→‎Algorithm)

Purpose

Optical path-length estimation and correction with closure constraints.

Synopsis

model = oplecorr(x,y,ncomp,options); %identifies model (calibration)
sx = oplecorr(x,model,options); %applies the model

Description

The OPLEC model is similar to EMSC but doesn't require estimates of the pure spectra for filtering. Instead it assumes closure on the chemical analyte contributions and uses a non-chemical signal basis P defined by the input (options.order). For example, if options.order = 2, then P = [ones(n,1), (1:n)', (1:n)'.^2], where n is the number of spectral variables, to account for offset, slope and curvature in the baseline.
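
As an illustration of the basis described above, the following base-MATLAB sketch builds P for a given polynomial order. The variable names and the small n are chosen for display only, and whether oplecorr scales or centers the index axis internally is not stated here, so treat this as a picture of the idea rather than the function's internals.

```matlab
n     = 5;                               % number of spectral variables (small, for display)
order = 2;                               % polynomial order, as in options.order = 2
% One column per power 0..order: offset, slope, curvature.
P     = bsxfun(@power, (1:n)', 0:order); % n by (order+1) matrix
disp(P)
```

The same line with order = 0 gives a column of ones, i.e. an offset-only basis.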

Inputs

  • x = X-block (2-way array class "double" or "dataset"), and
  • ncomp = number of components to be calculated (positive integer scalar).

1) Calibration: model = oplecorr(x,y,ncomp,options);

  • x = M by N matrix of spectra (class "double" or "dataset").
  • y = M by 1 matrix of known reference values.
  • ncomp = number of components to be used for the basis Z (positive integer scalar).
  • options = an optional input structure array described below.

2) Apply: sx = oplecorr(x,model,options);

  • x = M by N matrix of spectra to be corrected.
  • model = oplecorr model.

Outputs

  • model = oplecorr model is a model structure with the following fields (see Standard Model Structure for additional information):
  • modeltype: 'OPLECORR',
  • datasource: structure array with information about input data,
  • date: date of creation,
  • time: time of creation, ...
and
  • sx = an M by N matrix of filtered ("corrected") spectra.
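
A minimal calling sketch, assuming PLS_Toolbox is on the MATLAB path. The two oplecorr calls follow the synopsis; the synthetic spectra, the choice ncomp = 2, and omitting the optional options input in both calls are assumptions made for illustration, and the snippet has not been checked against the toolbox.

```matlab
% Synthetic two-analyte mixture spectra (made-up values, for illustration only).
rng(1);
M  = 20;  N = 100;                            % samples, spectral variables
ax = linspace(0,1,N);                         % channel axis (hypothetical)
S  = [exp(-((ax-0.3)/0.05).^2); ...
      exp(-((ax-0.7)/0.08).^2)];              % 2 by N pure spectra
c1 = rand(M,1);                               % fraction of analyte 1 (reference values)
C  = [c1, 1-c1];                              % closure: fractions sum to one
a  = 0.5 + rand(M,1);                         % path-length factors
x  = diag(a)*(C*S + 0.1*rand(M,1)*ones(1,N)); % scaled spectra with a random offset
y  = c1;                                      % M by 1 known reference values

ncomp = 2;                                    % arbitrary choice for this sketch
model = oplecorr(x,y,ncomp);                  % 1) calibration
sx    = oplecorr(x,model);                    % 2) apply the model to spectra
```

Here sx should have the same size as x, and model.modeltype should read 'OPLECORR' as listed above.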

Options

options = a structure array with the following fields:

  • display: [ {'off'}| 'on' ] governs level of display to the command window.
  • order: defines the order of polynomial to describe 'non-chemical' signal due to physical artifacts.
Alternatively, (order) can be an N by Kp matrix whose columns are basis vectors that account for non-chemical signal.
This portion of the signal is not included in the closure constraint. See Algorithm for a more complete description.
  • center: [ {false} | true] governs mean-centering of the PLS model that regresses the correction factors (model.b). No centering (the default) results in a force fit through zero.
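
The options above can be collected in a structure as sketched below, here with a custom N by Kp basis passed through the order field. The field names come from the list above; the scaled axis lam, the cubic basis, and the assumption that oplecorr accepts a structure built directly with struct (rather than one obtained from the toolbox with default values filled in) are illustrative and unverified.

```matlab
N     = 100;                                 % number of spectral variables (assumed)
lam   = linspace(-1,1,N)';                   % scaled channel axis (hypothetical)
Pcust = [ones(N,1), lam, lam.^2, lam.^3];    % N by Kp basis with Kp = 4

options = struct('display','off', ...        % no command-window output
                 'order',  Pcust, ...        % custom basis instead of a polynomial order
                 'center', true);            % mean-center the PLS model for model.b

% model = oplecorr(x,y,ncomp,options);       % calibration call from the synopsis
```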

Algorithm

The OPLEC algorithm is based on the work of Z-P Chen, J Morris, E Martin, “Extracting Chemical Information from Spectral Data with Multiplicative Light Scattering Effects by Optical Path-Length Estimation and Correction,” Anal. Chem., 78, 7674-7681 (2006). OPLEC is similar to extended multiplicative scatter correction (EMSC) except that it incorporates closure in the signal due to chemical analytes.

It is assumed that the measured signal, $\mathbf{x}$, can be modeled as

$$\mathbf{x}=a\left( \mathbf{Sc}+\mathbf{Pt} \right)+\mathbf{e}\text{ (1)}$$

where $\mathbf{x}$ is an $N\times 1$ column vector, $\mathbf{S}$ is an $N\times J$ matrix with columns corresponding to analyte spectra, $\mathbf{c}$ is a $J\times 1$ vector of contributions, $\mathbf{P}$ is a matrix with columns corresponding to physical artifacts in the spectra and $\mathbf{t}$ is a vector of the corresponding scores (or contributions for the artifacts).
The factor $a$ is a multiplicative factor (e.g. due to changes in path-length) identified by the OPLEC algorithm. The $J$ analyte contributions are subject to closure such that

$$\sum\limits_{j=1}^{J}{c_j}=1\text{ ; }c_1=1-\sum\limits_{j=2}^{J}{c_j}.\text{ (2)}$$

Closure also implies that the contributions are non-negative. It is assumed that the contributions of the first analyte are known (i.e., the $M\times 1$ column vector $\mathbf{c}_1$ is known). It is also assumed that the matrix $\mathbf{P}$ can be modeled a priori. Examples of physical artifacts include an offset, slope and curvature of the baseline that can be accounted for by the $N\times 3$ basis

$$\mathbf{P}=\left[ \begin{matrix} \mathbf{1} & \lambda & \lambda^{2} \\ \end{matrix} \right]\text{ (3)}$$

where $\lambda$ is the wavelength (or frequency) axis. More generally, $\mathbf{P}$ can be any matrix whose columns span physical artifacts not subject to closure. The $m^{th}$ measured signal, $\mathbf{x}_m$, $m=1,...,M$, orthogonal to $\mathbf{P}$ is

where , , and . The measurements can be collected into a matrix and it is recognized that a basis for the measurements, , can be obtained from a subset of linearly independent measurements. Partitioning into the basis and remaining measurements, , gives

This partitioning implies that the remaining measurements, , are linear combinations of such that

where

Expanding a single measurement in gives

Substitution of Equation (2) into (8) gives

where . The partitioned matrices in Equation (5) can now be written using the last expression of Equation (9) to give

Noting the relationship in Equation (6) gives

Equating terms in Equation (11) gives two additional relationships:

, and
$$\mathbf{a}_r=\mathbf{\Gamma}\mathbf{a}_b.\text{ (13)}$$

Substitution of Equation (13) into (12) gives

Recall that , and are known but is unknown. However, as with MSC where the reference used for correction is arbitrary (e.g., the mean of the calibration set is often used as the spectrum to “correct to”), any element of can be set to one. Setting the first element of to one and rearranging Equation (14) yields

Recognizing that the corrections, , must be non-negative implies that the remaining correction factors should be obtained by solving Equation (15) using non-negative least squares. The result is correction factors for all the basis vectors . that can be substituted into the sum of Equations (12) to give

The correction factors can be collected into a single vector given by .
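
As a reading aid, the calibration steps described in this section can be sketched compactly as follows. This is one self-consistent way to write them, patterned on Chen et al. (2006), and not necessarily the notation or the exact intermediate equations used by oplecorr. Only Equations (1), (2), (3) and (13) are taken from the text; the projector $\mathbf{I}-\mathbf{P}\mathbf{P}^{+}$ (with $\mathbf{P}^{+}$ the pseudo-inverse), the symbols $\mathbf{Z}$, $\mathbf{Z}_b$, $\mathbf{Z}_r$, $\mathbf{S}_z$, $\Delta\mathbf{s}_j$ and $\mathbf{H}$, the basis size $K$, and the choice to eliminate the $J^{th}$ contribution through closure are assumptions.

```latex
\begin{align*}
% Remove the part of each spectrum spanned by P (residual neglected)
\mathbf{z}_m &= (\mathbf{I}-\mathbf{P}\mathbf{P}^{+})\,\mathbf{x}_m
   \approx a_m \mathbf{S}_z \mathbf{c}_m ,
   \qquad \mathbf{S}_z = (\mathbf{I}-\mathbf{P}\mathbf{P}^{+})\,\mathbf{S} \\
% Stack the rows z_m^T and split into K basis samples and the remaining samples
\mathbf{Z} &= \begin{bmatrix} \mathbf{Z}_b \\ \mathbf{Z}_r \end{bmatrix},
   \qquad \mathbf{Z}_r = \mathbf{\Gamma}\mathbf{Z}_b ,
   \qquad \mathbf{\Gamma} = \mathbf{Z}_r \mathbf{Z}_b^{+} \\
% Closure, Equation (2), used to eliminate the J-th contribution
\mathbf{z}_m &\approx a_m \mathbf{s}_{z,J} + a_m c_{m,1}\,\Delta\mathbf{s}_1
   + a_m \sum_{j=2}^{J-1} c_{m,j}\,\Delta\mathbf{s}_j ,
   \qquad \Delta\mathbf{s}_j = \mathbf{s}_{z,j}-\mathbf{s}_{z,J} \\
% Equate the coefficients of s_{z,J} and \Delta s_1 on both sides of Z_r = \Gamma Z_b
\mathbf{a}_r &= \mathbf{\Gamma}\mathbf{a}_b ,
   \qquad \mathrm{diag}(\mathbf{c}_{1,r})\,\mathbf{a}_r
      = \mathbf{\Gamma}\,\mathrm{diag}(\mathbf{c}_{1,b})\,\mathbf{a}_b \\
% Eliminate a_r
\mathbf{H}\mathbf{a}_b &= \mathbf{0} ,
   \qquad \mathbf{H} = \mathrm{diag}(\mathbf{c}_{1,r})\,\mathbf{\Gamma}
      - \mathbf{\Gamma}\,\mathrm{diag}(\mathbf{c}_{1,b}) \\
% Fix the first basis factor to one; solve for the rest by non-negative least squares
\mathbf{H}_{2:K}\,\mathbf{a}_{b,2:K} &= -\mathbf{h}_1 ,
   \qquad \mathbf{a}_{b,2:K} \ge \mathbf{0}
\end{align*}
```

Here $\mathbf{h}_1$ is the first column of $\mathbf{H}$ and $\mathbf{H}_{2:K}$ holds the remaining columns. Equating coefficients relies on $\mathbf{s}_{z,J}$ and the $\Delta\mathbf{s}_j$ being linearly independent, and the subscripts $b$ and $r$ on $\mathbf{c}_1$ pick out the basis and remaining samples in the same way as for $\mathbf{a}$.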

Next, a regression model is obtained to allow estimation of correction factors for future test samples using the following

where the regression vector, , is estimated using PLS. Change options.center to true to use mean-centering for the PLS model. The correction factors for test samples are calculated using the following steps

, and

where the corrected spectrum is then given by
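
A numerical sketch of the calibration and prediction steps, in base MATLAB and on synthetic noise-free data, may help make the flow concrete. It is not the oplecorr implementation. The projector I - P*pinv(P), the use of the first and last samples as the basis, the matrix H that combines Equation (13) with its analogue weighted by the known contributions, truncated least squares (pinv) standing in for PLS, correction by division, and all variable names are assumptions made for illustration.

```matlab
% Synthetic, noise-free mixture data (all numbers are made up for illustration).
rng(0);
M = 8;  N = 50;                             % samples, spectral variables
lam = linspace(-1,1,N)';                    % scaled channel axis (assumed)
P   = [ones(N,1), lam, lam.^2];             % baseline basis as in Equation (3)
S   = [exp(-((lam+0.4)/0.10).^2), ...
       exp(-((lam-0.3)/0.15).^2)];          % N by 2 pure analyte spectra
c1  = linspace(0.1,0.9,M)';                 % known contributions of analyte 1
C   = [c1, 1-c1];                           % closure: each row sums to one
a   = 0.5 + rand(M,1);                      % true multiplicative factors
T   = randn(M,3);                           % baseline scores
X   = diag(a)*(C*S' + T*P');                % rows follow Equation (1) with e = 0

% Remove the part of every spectrum spanned by P.
Z  = X - X*(P*pinv(P));                     % rows are ((I - P*pinv(P))*x_m)'

% Split into basis and remaining samples (here: first and last sample as basis).
ib = [1; M];  ir = (2:M-1)';
G  = Z(ir,:)*pinv(Z(ib,:));                 % Gamma, so that Z(ir,:) = G*Z(ib,:)

% Combine a_r = G*a_b with diag(c1_r)*a_r = G*diag(c1_b)*a_b, fix a_b(1) = 1,
% and solve for the remaining basis factor by non-negative least squares.
H  = diag(c1(ir))*G - G*diag(c1(ib));
ab = [1; lsqnonneg(H(:,2:end), -H(:,1))];
ar = G*ab;                                  % Equation (13)
ahat = zeros(M,1);  ahat(ib) = ab;  ahat(ir) = ar;

% Regress the factors on Z (truncated least squares here, PLS in the text).
b  = pinv(Z,1e-8)*ahat;

% Estimate the factor for a new spectrum and correct it by division (assumed).
at = 1.7;                                   % true factor of the test sample
xt = at*(S*[0.35; 0.65] + P*[0.2; -0.1; 0.05]);
zt = xt - P*(pinv(P)*xt);                   % project the test spectrum
at_hat = zt'*b;                             % estimated correction factor
xt_corrected = xt/at_hat;                   % corrected test spectrum
```

Because one basis factor is fixed to one, the factors are only identified up to a common scale: with these noise-free data ahat should match a/a(1), and at_hat should match at/a(1), to numerical precision.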

See Also

emscorr, mscorr, stdfir