Emscorr: Difference between revisions

From Eigenvector Research Documentation Wiki
imported>Scott
(New page: ===Purpose=== Extended multiplicative scatter correction (EMSC) preprocessing. ===Synopsis=== :[sx,fx,xref,reg,res] = emscorr(x,xref,options) ===Description=== EMSCORR attempts to rem...)
 
imported>Mathias
 
(18 intermediate revisions by 4 users not shown)
Line 5: Line 5:
===Synopsis===
===Synopsis===


:[sx,fx,xref,reg,res] = emscorr(x,xref,options)
:[sx,fx,xref,reg,res] = emscorr(x,''xref'',''options'')


===Description===
===Description===


EMSCORR attempts to remove additive and multiplicative scattering effects in spectra. This can be thought of as a filter where some portions of the signal are passed and some are rejected. The input (x) is a MxN matrix (class "double") of M spectra measured at N channels. Each row of (x) is regressed against input (xref) and the results are used to 'correct' (x). If (xref) is not input then mean(x) is used.
EMSCORR attempts to remove additive and multiplicative scattering effects in spectra. This can be thought of as a filter where some portions of the signal are passed and some are rejected. Each row of input (x) is regressed against input (''xref'') and the results are used to "correct" (x). If (''xref'') is not input then <tt>mean(x)</tt> is used.


There are several options to allow for weighted least squares (i.e. to de-weight channels that should not be included in the regression), including different spectra that should be filtered out, and including spectra that should not be filtered out.
The spectra not to be filtered out correspond to the measured response for analytes you want to model.
The spectra to be filtered out correspond to the measured response for clutter (i.e., all measured signal not of interest).
For spectroscopic measurements, the former are typically the spectra of analytes that you want to detect or quantify, and the latter correspond to interferences and physical artifacts in the measurements.
These aren't always easy to identify or characterize, and it may take additional measurements (good design of experiments) to ensure that both the targets of interest and the interferences (both chemical and physical) are appropriately characterized.
There are several options to allow for weighted least squares (i.e., to de-weight channels that should not be included in the regression), for supplying spectra to be filtered out, and for supplying spectra not to be filtered out.


The outputs are (sx) the corrected spectra, (fx) the signal that was filtered out, and (xref) the reference spectrum. Outputs (reg) are the regression coefficients and (res) is a MxN matrix of residuals. For non-windowed filtering, (reg) is [number of coefficients] x M. The number of coefficients corresponds to the number of basis vectors included in the correction. The coefficients are ordered according to the following basis: xbase = [xref, 1 x x2 ..., options.p, options.s]. If a windowed filter is used, (reg) is [number of coefficients] x N x M. where mode 2 corresponds to the windows.
This method is also described in the document [http://www.eigenvector.com/whitepapers/EISC_Soil_Reflectance.pdf Extended Multiplicative Scatter Correction Applied to Mid-Infrared Reflectance Measurements of Soil (PDF)]


====Inputs====
====Inputs====
* '''first''' = first input is this.
* '''x''' = an ''M''x''N'' matrix (class "double") of ''M'' spectra measured at ''N'' channels.


====Optional Inputs====
====Optional Inputs====
* '''second''' = optional second input is this.
* '''xref''' = 1x''N'' reference spectrum to regress against. If not input, <tt>mean(x)</tt> is used.


====Outputs====
====Outputs====
* '''firstout''' = first output is this.
* '''sx''' = the corrected spectra.
* '''fx''' = the signal that was filtered out.
* '''xref''' = the reference spectrum.
* '''reg''' = the regression coefficients. For non-windowed filtering, (reg) is [number of coefficients] x ''M''. The number of coefficients corresponds to the number of basis vectors included in the correction. The coefficients are ordered according to the following: <tt>xbase = [xref, 1 x x2 ..., options.p, options.s]</tt>. If a windowed filter is used, (reg) is [number of coefficients] x ''N'' x ''M'' where mode 2 corresponds to the windows.
* '''res''' = ''M''x''N'' matrix of residuals.


===Options===
===Options===


options = a structure array with the following fields:
'''options''' = structure array with the following fields:
* '''order''': [ {2} ]  Order of the polynomial filter (positive integer).
* '''logax''': [ {'no'} | 'yes' ]  Use the log of the axisscale, <tt>x.axisscale{2}</tt> as a basis vector to regress against. If the axisscale is not present <tt>log(1:N)</tt> is used. When (options.logax) is used, (options.order) is typically set to zero.
* '''s''': [ ]  Dataset or matrix, ''K''x''N'' spectra to not filter out.
* '''p''': [ ]  Dataset or matrix, ''Kp''x''N'' spectra to filter out.
* '''algorithm''': [ {'cls'} | 'ils' ]  Governs correction model method.
: 'cls' uses Classical Least Squares i.e., EMSC.
: 'ils' uses Inverse Least Squares i.e., EISC.
* '''win''': [ ]  An odd scalar that defines the window width (number of variables) for piece-wise correction. If empty {the default} piece-wise is not used. Note that piece-wise correction can be slow.
* '''initwt''': [ ]  Empty or ''N''x1 vector of initial weights (0<=w<=1). Low weights are used for channels not to be included in the fit.
* '''condnum''': [1e6]  Maximum condition number for <tt>'''Z''''*'''Z'''</tt> used in the least squares estimates (see Algorithm).
* '''xrefS''': [{'no'} | 'yes']  Indicates whether input (xref) includes spectra contained in (options.s). If <tt>'yes'</tt> then the spectra in (options.s) are centered and an SVD estimate of (options.s) is used in EMSCORR instead of (options.s).
* '''robust''': [ {'none'} | 'lsq2top' ]  Governs the use of robust least squares. If 'lsq2top' is used then (options.trbflag), (options.tsqlim) and (options.stopcrit) are also used (see LSQ2TOP for descriptions of these fields).
* '''res''': [ ] Positive scalar (required with <tt>options.robust = 'lsq2top'</tt>). It is the input (res) to the LSQ2TOP function.
* '''trbflag''': [ 'top' | 'bottom' | {'middle'} ] Used only when <tt>options.robust = 'lsq2top'</tt>.
* '''tsqlim''': [ 0.99 ]  Used only when <tt>options.robust = 'lsq2top'</tt>.
* '''stopcrit''': [1e-4 1e-4 1000 360]  Used only when <tt>options.robust = 'lsq2top'</tt>.
* '''axisscale''': [ ] 1x''N'' axis scale for the spectral mode, if empty <tt>[1:N]</tt> is used.
* '''mag''': [ {'yes'} | 'no' ], performs slope correction when set to 'yes'.
* '''display''': Governs level of display to command window.


* '''plots''': [ {'none'} | 'final' ] governs plotting of results, and
===Algorithm===
* '''order''': positive integer for polynomial order {default = 1}.
In EMSC, a ''N''x1 signal vector <math>{\mathbf{x}}</math> is modeled as
:<math>{\mathbf{x}} = \left[ {\begin{array}{*{20}c}
  {{\mathbf{x}}_{ref} } & {\mathbf{V}} & {\mathbf{S}} & {\mathbf{P}}  \\
\end{array} } \right] {\mathbf{c}}</math>
where <math>{\mathbf{x}_{ref}}</math> is a ''N''x1 reference vector, and <math>{\mathbf{V}}</math> is a ''N''x<math>{\mathit{K}_{v}}</math> matrix consisting of polynomials of the axis scale. For example, if the axis scale is in wavenumbers <math>{\nu}</math> then
:<math>{\mathbf{V}} = \left[ {\begin{array}{*{20}c}
  {\mathbf{1}} & {\mathbf{\nu }} & {{\mathbf{\nu }}^2 } &  \ldots  \\
\end{array} } \right]</math> .
The ''N''x<math>{\mathit{K}_{s}}</math> matrix <math>{\mathbf{S}}</math> corresponds to signal allowed to pass the filter and the ''N''x<math>{\mathit{K}_{p}}</math> matrix <math>{\mathbf{P}}</math> corresponds to signal filtered out of the signal. Typically, <math>{\mathbf{S}}</math> will correspond to spectra of target signal and <math>{\mathbf{P}}</math> will correspond to basis vectors that capture clutter signal (e.g., loadings from PCA of clutter). The vector <math>{{\mathbf{c}}^T = \left[ {\begin{array}{*{20}c} {\mathit{c}_{ref}} & {\mathbf{c}_{v}^{T}} & {\mathbf{c}_{s}^{T}} & {\mathbf{c}_{p}^{T}} \\ \end{array} } \right]}</math> contains the corresponding coefficients, partitioned to match the basis blocks, to be estimated using least squares. The estimated coefficients and the basis vectors are used to "correct" the signal using the following
:<math>{\mathbf{x}_{corrected}} = {({\mathbf{x}} - {\mathbf{V}}{\mathbf{c}_{v}} - {\mathbf{P}}{\mathbf{c}_{p}})} / \mathit{c}_{ref} </math>  .
 
 
The original multiplicative scatter correction (MSC) is discussed in Geladi P., MacDougall D. and Martens H., “Linearization and scatter-correction for near-infrared reflectance spectra of meat,” Appl. Spectrosc. 1985; 39(3): 491-500.
 
Piece-wise approaches are discussed in Isaksson T. and Kowalski B., “Piece-wise multiplicative scatter correction applied to near-infrared diffuse transmittance data from meat products,” Appl. Spectrosc. 1993; 47(7): 702-709 and Blank T.B., Sum S.T., Brown S.D. and Monfre, S.L., “Transfer of near-infrared multivariate calibrations without standards,” Anal. Chem. 1996; 68(17): 2987–2995.
 
The inverse scatter correction approach is discussed in Helland I.S., Naes T. and Isaksson T., “Related Versions of the Multiplicative Scatter Correction Method for Preprocessing Spectroscopic Data,” Chemom. Intell. Lab. Syst. 1995; 29: 233–241.
 
The EMSC approach is given in Martens H. and Stark E., “Extended multiplicative signal correction and spectral interference subtraction: new preprocessing methods for near infrared spectroscopy,” Journal of Pharmaceutical and Biomedical Analysis 1991; 9: 625–635 and Martens H., Nielsen J.P. and Engelsen S.B., “Light scattering and light absorbance separated by extended multiplicative signal correction. Application to near-infrared transmission analysis of powder mixtures,” Anal. Chem. 2003; 75(3): 394–404.
 
An application of robust least squares to EMSC is given in Gallagher, N.B, Blake, T.A., and Gassman, P.L., “Application of Extended Inverse Multiplicative Scatter Correction to mid-Infrared Reflectance Spectroscopy of Soil,” J. Chemometrics., 19(5-7), 271-281 (2005). Another application is given in Kohler, A., Kirschner, C., Oust, A. and Martens, H., "EMSC as a tool for separation and characterisation of physical and chemical information in FT-IR microscopy images of cryo-sections of beef loin", Appl. Spectrosc. 2005; 59(6): 707-716.
 
Additional references are Thennadil, S.N., Martens, H. and Kohler, “Physics-Based Multiplicative Scatter Correction Approaches for Improving the Performance of Calibration Models,” Appl. Spectrosc. 2006; 60(3): 315-321; Thennadil, S.N. and Martens, E.B., “Empirical preprocessing methods and their impact on NIR calibrations: a simulation study,” J. Chemo. 2005; 19(2): 77-89; Gallagher, N.B, Blake, T.A. and Gassman, P.L., “Detection of Low Volatility Organic Analytes on Soils Using IR Reflection-Absorption Spectroscopy,” J. Near Infrared Spectrosc., 16(7): 179-187 (2008); Blake, T.A., Gassman, P.L., Gallagher, N.B, “Detection and Classification of Organic Analytes in Soil” International Journal of High Speed Electronics and Systems, 18(2), 319-336 (2008); and Gallagher, N.B., Gassman, P.L., Blake, T.A., “Strategies for Detecting Organic Liquids on Soils Using Mid-Infrared Reflection Spectroscopy,” Environ. Sci. Technol. 42(15), 5700-5705 (2008).


===Example===


<pre>
>>This is an example
Error: does not exist
</pre>


===See Also===
===See Also===


[[baselinew]], [[deresolv]]
[[mscorr]], [[stdfir]], [[emscorrdemo2]]

Latest revision as of 12:21, 20 April 2016

Purpose

Extended multiplicative scatter correction (EMSC) preprocessing.

Synopsis

[sx,fx,xref,reg,res] = emscorr(x,xref,options)

Description

EMSCORR attempts to remove additive and multiplicative scattering effects in spectra. This can be thought of as a filter where some portions of the signal are passed and some are rejected. Each row of input (x) is regressed against input (xref) and the results are used to "correct" (x). If (xref) is not input then mean(x) is used.
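
The regression described above can be illustrated with plain multiplicative scatter correction, the simplest case of this filter. The following is a minimal sketch in base MATLAB, not the EMSCORR implementation: the synthetic spectra and the variable names (truth, a, b, Z, c) are assumptions made for illustration. Each row of x is fit as an offset plus a scaled copy of the mean spectrum, and the fitted offset and scale are then removed.

```matlab
% Minimal MSC sketch in base MATLAB (illustrative, not the EMSCORR source).
N     = 50;                             % number of channels
M     = 4;                              % number of spectra
truth = exp(-((1:N) - 25).^2 / 40);     % hypothetical underlying spectrum, 1xN
a     = [0.0; 0.1; -0.2; 0.3];          % additive offsets, Mx1
b     = [1.0; 0.8; 1.5; 1.2];           % multiplicative gains, Mx1
x     = a*ones(1,N) + b*truth;          % MxN spectra with scatter effects

xref = mean(x, 1);                      % reference: mean spectrum, 1xN
Z    = [ones(N,1), xref'];              % Nx2 basis: offset and reference
c    = Z \ x';                          % 2xM least-squares coefficients

% Subtract the fitted offset and divide by the fitted scale, row by row.
sx   = (x - c(1,:)'*ones(1,N)) ./ (c(2,:)'*ones(1,N));
```

Because every synthetic row here is an exact offset-and-scale version of the same spectrum, all rows of sx collapse onto xref. Real spectra also carry chemical variation, which stays in the corrected rows.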

The spectra not to be filtered out correspond to the measured response for analytes you want to model. The spectra to be filtered out correspond to the measured response for clutter (i.e., all measured signal not of interest). For spectroscopic measurements, the former are typically the spectra of analytes that you want to detect or quantify, and the latter correspond to interferences and physical artifacts in the measurements. These aren't always easy to identify or characterize, and it may take additional measurements (good design of experiments) to ensure that both the targets of interest and the interferences (both chemical and physical) are appropriately characterized. There are several options to allow for weighted least squares (i.e., to de-weight channels that should not be included in the regression), for supplying spectra to be filtered out, and for supplying spectra not to be filtered out.

This method is also described in the document Extended Multiplicative Scatter Correction Applied to Mid-Infrared Reflectance Measurements of Soil (PDF), http://www.eigenvector.com/whitepapers/EISC_Soil_Reflectance.pdf

Inputs

  • x = an MxN matrix (class "double") of M spectra measured at N channels.

Optional Inputs

  • xref = 1xN reference spectrum to regress against. If not input, mean(x) is used.

Outputs

  • sx = the corrected spectra.
  • fx = the signal that was filtered out.
  • xref = the reference spectrum.
  • reg = the regression coefficients. For non-windowed filtering, (reg) is [number of coefficients] x M. The number of coefficients corresponds to the number of basis vectors included in the correction. The coefficients are ordered according to the following: xbase = [xref, 1 x x2 ..., options.p, options.s]. If a windowed filter is used, (reg) is [number of coefficients] x N x M where mode 2 corresponds to the windows.
  • res = MxN matrix of residuals.

Options

options = structure array with the following fields:

  • order: [ {2} ] Order of the polynomial filter (positive integer).
  • logax: [ {'no'} | 'yes' ] Use the log of the axisscale, x.axisscale{2} as a basis vector to regress against. If the axisscale is not present log(1:N) is used. When (options.logax) is used, (options.order) is typically set to zero.
  • s: [ ] Dataset or matrix, KxN spectra to not filter out.
  • p: [ ] Dataset or matrix, KpxN spectra to filter out.
  • algorithm: [ {'cls'} | 'ils' ] Governs correction model method.
      'cls' uses Classical Least Squares, i.e., EMSC.
      'ils' uses Inverse Least Squares, i.e., EISC.
  • win: [ ] An odd scalar that defines the window width (number of variables) for piece-wise correction. If empty {the default} piece-wise is not used. Note that piece-wise correction can be slow.
  • initwt: [ ] Empty or Nx1 vector of initial weights (0<=w<=1). Low weights are used for channels not to be included in the fit.
  • condnum: [1e6] Maximum condition number for Z'*Z used in the least squares estimates, where Z is the matrix of basis vectors (see Algorithm).
  • xrefS: [{'no'} | 'yes'] Indicates whether input (xref) includes spectra contained in (options.s). If 'yes' then the spectra in (options.s) are centered and an SVD estimate of (options.s) is used in EMSCORR instead of (options.s).
  • robust: [ {'none'} | 'lsq2top' ] Governs the use of robust least squares. If 'lsq2top' is used then (options.trbflag), (options.tsqlim) and (options.stopcrit) are also used (see LSQ2TOP for descriptions of these fields).
  • res: [ ] Positive scalar (required with options.robust = 'lsq2top'). It is the input (res) to the LSQ2TOP function.
  • trbflag: [ 'top' | 'bottom' | {'middle'} ] Used only when options.robust = 'lsq2top'.
  • tsqlim: [ 0.99 ] Used only when options.robust = 'lsq2top'.
  • stopcrit: [1e-4 1e-4 1000 360] Used only when options.robust = 'lsq2top'.
  • axisscale: [ ] 1xN axis scale for the spectral mode, if empty [1:N] is used.
  • mag: [ {'yes'} | 'no' ], performs slope correction when set to 'yes'.
  • display: Governs level of display to command window.
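
As a hedged usage sketch, the options listed above might be set as follows. The field names and the call signature come from this page. The synthetic data, and the assumption that EMSCORR fills any unset fields with their defaults, are not confirmed here. The call is guarded so the snippet also runs where the toolbox is not on the path.

```matlab
% Hypothetical usage sketch; field names are taken from the Options list.
N = 200;                                 % number of channels
M = 10;                                  % number of spectra
x = cumsum(randn(M, N), 2);              % synthetic stand-in for MxN spectra

options           = struct();
options.order     = 2;                   % quadratic polynomial baseline
options.algorithm = 'cls';               % classical least squares (EMSC)
options.initwt    = ones(N, 1);          % start with equal weight on every channel
options.initwt(1:10) = 0;                % de-weight the first ten channels

% Assumption: EMSCORR supplies defaults for fields not set above.
if exist('emscorr', 'file')
  % mean(x,1) is the documented default reference, passed explicitly here.
  [sx, fx, xref, reg, res] = emscorr(x, mean(x, 1), options);
end
```

Setting a weight to zero in initwt keeps that channel out of the fit, which is one way to exclude regions such as saturated bands from the regression.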

Algorithm

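In EMSC, an Nx1 signal vector <math>{\mathbf{x}}</math> is modeled as

<math>{\mathbf{x}} = \left[ {\begin{array}{*{20}c} {{\mathbf{x}}_{ref} } & {\mathbf{V}} & {\mathbf{S}} & {\mathbf{P}} \\ \end{array} } \right] {\mathbf{c}}</math>

where <math>{\mathbf{x}_{ref}}</math> is an Nx1 reference vector, and <math>{\mathbf{V}}</math> is an Nx<math>{\mathit{K}_{v}}</math> matrix consisting of polynomials of the axis scale. For example, if the axis scale is in wavenumbers <math>{\nu}</math> then

<math>{\mathbf{V}} = \left[ {\begin{array}{*{20}c} {\mathbf{1}} & {\mathbf{\nu }} & {{\mathbf{\nu }}^2 } & \ldots \\ \end{array} } \right]</math> .

The Nx<math>{\mathit{K}_{s}}</math> matrix <math>{\mathbf{S}}</math> corresponds to signal allowed to pass the filter and the Nx<math>{\mathit{K}_{p}}</math> matrix <math>{\mathbf{P}}</math> corresponds to signal filtered out of the signal. Typically, <math>{\mathbf{S}}</math> will correspond to spectra of target signal and <math>{\mathbf{P}}</math> will correspond to basis vectors that capture clutter signal (e.g., loadings from PCA of clutter). The vector <math>{{\mathbf{c}}^T = \left[ {\begin{array}{*{20}c} {\mathit{c}_{ref}} & {\mathbf{c}_{v}^{T}} & {\mathbf{c}_{s}^{T}} & {\mathbf{c}_{p}^{T}} \\ \end{array} } \right]}</math> contains the corresponding coefficients, partitioned to match the basis blocks, to be estimated using least squares. The estimated coefficients and the basis vectors are used to "correct" the signal using the following

<math>{\mathbf{x}_{corrected}} = {({\mathbf{x}} - {\mathbf{V}}{\mathbf{c}_{v}} - {\mathbf{P}}{\mathbf{c}_{p}})} / \mathit{c}_{ref} </math> .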
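
The model and correction step above can be sketched in a few lines of base MATLAB. This is an illustration under stated assumptions rather than the EMSCORR source: the synthetic reference, target, and clutter shapes, the variable names (nu, ctrue, Z), and the use of an unweighted backslash solve are all choices made for the example. The coefficient blocks follow the order of the model equation (reference, polynomial, S, P), which differs from the ordering documented for the (reg) output.

```matlab
% EMSC sketch in base MATLAB; illustrative only, not the EMSCORR source.
N    = 100;                              % number of channels
nu   = linspace(-1, 1, N)';              % hypothetical scaled axis, Nx1
xref = exp(-(nu - 0.2).^2 / 0.02);       % hypothetical reference spectrum, Nx1
S    = exp(-(nu + 0.5).^2 / 0.01);       % target spectrum to pass, Nx1
P    = sin(4*pi*nu);                     % clutter basis to reject, Nx1
V    = [ones(N,1), nu, nu.^2];           % polynomial basis (order 2), Nx3

% Simulate one measured spectrum from known coefficients.
ctrue = [0.8; 0.1; -0.05; 0.02; 0.3; 0.4];   % [c_ref; c_v; c_s; c_p]
Z     = [xref, V, S, P];                 % Nx6 matrix of basis vectors
x     = Z*ctrue;                         % Nx1 measured signal

% Least-squares estimate of the coefficients, split by block.
c    = Z \ x;
cref = c(1);                             % coefficient on the reference
cv   = c(2:4);                           % polynomial coefficients
cp   = c(6);                             % clutter coefficient

% Remove baseline and clutter, then rescale by the reference coefficient.
fx    = V*cv + P*cp;                     % signal filtered out
xcorr = (x - fx) / cref;                 % corrected signal
```

The corrected signal keeps the reference shape plus the target contribution scaled by 1/c_ref, which is why spectra supplied as S survive the filter while those supplied as P do not.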


The original multiplicative scatter correction (MSC) is discussed in Geladi P., MacDougall D. and Martens H., “Linearization and scatter-correction for near-infrared reflectance spectra of meat,” Appl. Spectrosc. 1985; 39(3): 491-500.

Piece-wise approaches are discussed in Isaksson T. and Kowalski B., “Piece-wise multiplicative scatter correction applied to near-infrared diffuse transmittance data from meat products,” Appl. Spectrosc. 1993; 47(7): 702-709 and Blank T.B., Sum S.T., Brown S.D. and Monfre, S.L., “Transfer of near-infrared multivariate calibrations without standards,” Anal. Chem. 1996; 68(17): 2987–2995.

The inverse scatter correction approach is discussed in Helland I.S., Naes T. and Isaksson T., “Related Versions of the Multiplicative Scatter Correction Method for Preprocessing Spectroscopic Data,” Chemom. Intell. Lab. Syst. 1995; 29: 233–241.

The EMSC approach is given in Martens H. and Stark E., “Extended multiplicative signal correction and spectral interference subtraction: new preprocessing methods for near infrared spectroscopy,” Journal of Pharmaceutical and Biomedical Analysis 1991; 9: 625–635 and Martens H., Nielsen J.P. and Engelsen S.B., “Light scattering and light absorbance separated by extended multiplicative signal correction. Application to near-infrared transmission analysis of powder mixtures,” Anal. Chem. 2003; 75(3): 394–404.

An application of robust least squares to EMSC is given in Gallagher, N.B, Blake, T.A., and Gassman, P.L., “Application of Extended Inverse Multiplicative Scatter Correction to mid-Infrared Reflectance Spectroscopy of Soil,” J. Chemometrics., 19(5-7), 271-281 (2005). Another application is given in Kohler, A., Kirschner, C., Oust, A. and Martens, H., "EMSC as a tool for separation and characterisation of physical and chemical information in FT-IR microscopy images of cryo-sections of beef loin", Appl. Spectrosc. 2005; 59(6): 707-716.

Additional references are Thennadil, S.N., Martens, H. and Kohler, A., “Physics-Based Multiplicative Scatter Correction Approaches for Improving the Performance of Calibration Models,” Appl. Spectrosc. 2006; 60(3): 315-321; Thennadil, S.N. and Martens, E.B., “Empirical preprocessing methods and their impact on NIR calibrations: a simulation study,” J. Chemo. 2005; 19(2): 77-89; Gallagher, N.B, Blake, T.A. and Gassman, P.L., “Detection of Low Volatility Organic Analytes on Soils Using IR Reflection-Absorption Spectroscopy,” J. Near Infrared Spectrosc., 16(7): 179-187 (2008); Blake, T.A., Gassman, P.L., Gallagher, N.B, “Detection and Classification of Organic Analytes in Soil” International Journal of High Speed Electronics and Systems, 18(2), 319-336 (2008); and Gallagher, N.B., Gassman, P.L., Blake, T.A., “Strategies for Detecting Organic Liquids on Soils Using Mid-Infrared Reflection Spectroscopy,” Environ. Sci. Technol. 42(15), 5700-5705 (2008).


See Also

mscorr, stdfir, emscorrdemo2