Advanced Preprocessing: Simple Mathematical Operations

From Eigenvector Research Documentation Wiki
Latest revision as of 23:47, 8 November 2015

Several preprocessing methods involve simple mathematical operations that are used to linearize or otherwise modify certain kinds of data.

Absolute Value

The absolute value method is used to remove any sign information from the data. Although unusual, this method may be useful following a derivative or other method which creates negative values. Such correction can allow the use of non-negativity constraints, or simply improve interpretability of derivatized spectra. It should be noted, however, that an absolute value following any method which centers data (such as mean- or median-centering) may create a non-linear response and complicate modeling.

There are no settings associated with this preprocessing method. The command line function to perform this operation is the MATLAB command abs.

Log10

A base 10 logarithm (that is, <math>\mathbf{X}_p = \log_{10}(\mathbf{X})</math>) can be used whenever the response of the data is linear to the function 10<sup>X</sup>. Because the logarithm of a negative value is undefined (NaN = Not a Number), this operation first sets negative data values to zero. This effect can be avoided by applying an absolute value preprocessing step before the Log10 step. A minimum value filter is also used to prevent huge negative log values when x is very small. This is achieved by adding a constant, c, to x before taking the logarithm. The constant is removed again during the "undo" step. By default, c = 10<sup>-5</sup>.

There are no settings associated with this preprocessing method. The command line function to perform this operation is the MATLAB command log10. Note, however, that the plain log10 command does not set negative values to zero before taking the logarithm and does not apply the minimum value filter.
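
As a rough illustration of the behavior described above, the following MATLAB sketch clips negative values to zero, adds the offset c before taking the logarithm, and removes it again when undoing the transform. The variable names and example data are made up for illustration, and the exact form of the "undo" step is an assumption based on the description, not the preprocessing method's actual implementation.

```matlab
% Hypothetical sketch of the Log10 preprocessing step (not the toolbox code).
c = 1e-5;                    % default offset stated in the text
X = [-2 0 1e-8 1 100];       % made-up example data

Xclip = max(X, 0);           % negative values are set to zero
Xp    = log10(Xclip + c);    % offset prevents huge negative logs for tiny x

% Assumed "undo": invert the logarithm, then remove the offset.
Xundo = 10.^Xp - c;
```

Under these assumptions the undo step recovers the clipped data (Xclip), not the original negative values, since the sign information is discarded before the logarithm is taken.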

Transmission to Absorbance (log(1/T))

This spectroscopic transformation is often used when data have been collected as transmission (the ratio of measured signal to incident signal). It converts the signal to "absorbance" but, more generally, can be applied to any data that follow an inverse log relationship.

There are no settings associated with this preprocessing method. The command line function to perform this operation is the MATLAB command log10(1./X).
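
A minimal sketch of the command above applied to a few made-up transmission values (the variable names T and A are illustrative only):

```matlab
% Convert example transmission values to absorbance.
T = [1 0.5 0.1 0.01];    % made-up transmission values (fraction of incident signal)
A = log10(1 ./ T);       % element-wise log(1/T); mathematically equal to -log10(T)
```

Here a transmission of 1 maps to an absorbance of 0, and each tenfold drop in transmission raises the absorbance by 1.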

Arithmetic

Apply simple arithmetic operations to all or part of a dataset. See [[arithmetic]].