Parafac2
Contents
Purpose
PARAFAC2 (PARAllel FACtor analysis2) for unevenly-sized multi-way arrays
Synopsis
- model = parafac2(x,ncomp); % identifies model with default options (calibration step)
- model = parafac2(x,ncomp,options); % identifies model (calibration step)
- model = parafac2(x,initval,options); % identifies model based on initial values in initval (calibration step)
- options = parafac2('options'); % returns a default options structure
- pred = parafac2(xnew,model); % find scores for new samples given old model
Description
The three-way PARAFAC2 model is best perceived as a model close to the ordinary PARAFAC model. The major difference is that strict trilinearity is no longer required, so PARAFAC2 can sometimes handle elution time shifts, varying batch trajectories etc. The ordinary PARAFAC model is also sometimes called the PARAFAC1 model to distinguish it from the PARAFAC2 model.
In the PARAFAC1 model, one loading matrix is found for each mode. That implies that this loading matrix is the same across all levels for the other modes. For example, in a PARAFAC1 model of a data set with chromatographic spectrally detected experiments, the PARAFAC1 model ideally provides a loading matrix for e.g. the chromatographic mode which holds the true elution profiles of the chemical analytes. Thus, the PARAFAC1 model assumes that these elution profiles do not change shape in different experiments (only their magnitude). Such an assumption may be too strict and invalid. A little model error is seldom problematic, but if the structure of the data deviates considerably from the assumptions of the model, it can be impossible to fit a reasonable model. In the PARAFAC2 model, this trilinearity assumption is relaxed in one mode. A PARAFAC1 model of a three-way array is given by A, B and C (loading matrices in first, second and third mode). In PARAFAC2, the loadings in one mode can change from level to level. That is, assume that the third mode (C) of dimension K holds different samples (it is common practice, to have samples in the last mode for PARAFAC2). Instead of having a fixed first mode loading A for all samples, A may now vary from sample to sample. Thus for each sample, k, there is an individual A called A_{k}. The only restriction on A_{k} is that the cross-product A_{k}^{T}A_{k} remains constant. This is in contrast to PARAFAC1 where A is simply the same for all k.
Another way of imposing this constraint (A_{k}^{T}A_{k} constant) is to say that each A_{k} is modeled as P_{k}H where P_{k} is an orthogonal matrix of the same size as A_{k} and where H is a small quadratic matrix with dimension equal to the number of components. This different interpretation of the concept shows that the individual components A_{k} only differ up to a rotation. Hence, the latent variables are the same for all samples but may manifest themselves through different rotations.
The situations in which the PARAFAC2 model is valid can be difficult to understand because the flexibility compared to the PARAFAC1 model is somewhat abstract. However, one simple way to see the applicability of the PARAFAC2 model is that PARAFAC2 is worth considering in situations in which PARAFAC1 should ideally be valid, but where practical applications show that it is not. For example, it is often observed that the differences in elution profiles from experiment to experiment in chromatography makes the PARAFAC1 model difficult to fit. Many times PARAFAC2 can still handle such deviations even when the shifts in retention times are quite severe.
It is possible to fit both the PARAFAC1 and the PARAFAC2 model. If both models give the same results (approximately), then PARAFAC1 is likely valid and then PARAFAC1 is preferred because it uses fewer degrees of freedom. If there are large deviations, PARAFAC2 may be preferred. Note, though, that the K matrices A_{k} may have a larger variability than the corresponding A from the PARAFAC1 model because of the smaller amount of data that it is estimated from. This does not imply inadequacy but simply that there are differences in the way that the parameters are estimated.
Another interesting type of application of PARAFAC2 follows from the insight that the constraint that A_{k}^{T}A_{k} is constant. This directly implies that the individual slabs, X_{k}, of the array can have different lengths, hence different size A_{k}, yet still fulfill the constraint that A_{k}^{T}A_{k} is constant. Thus, PARAFAC2 can also handle e.g. batch data where the data from each batch are obtained at different sampling rates or different sampling duration. This is a very powerful feature of the PARAFAC2 model compared to the PARAFAC1 model.
The three-way PARAFAC2 model is given
X_{k} = A_{k}D_{k}B^{T} + E_{k} = P_{k}HD_{k}B^{T} + E_{k}, k = 1, .., K
X_{k} is a slab of data (I x J) in which I may actually vary with K. K is the number of slabs and A_{k} (I x ncomp) are the first-mode loadings for the kth sample. D_{k} is a diagonal matrix that holds the kth row of C in its diagonal. C (K x ncomp) is the third mode loadings, H is an (ncomp x ncomp) matrix, and P_{k} is an (I x ncomp) orthogonal matrix. The output P is given as a cell array of length K where the kth cell element holds the (I x ncomp) matrix P_{k}. Thus, to get e.g. the second sample P, write P{2}, and to get the estimate of the first mode loadings, A_{k}, at this second frontal slab (k = 2), write P{2}*H.
The model can also be fitted to more than three-way data. It is important then to be aware which mode is supposed to be fitted by separate loadings for each sample. The convention is that the first mode is the mode that has individual loadings and that these are defined across the last (the sample) mode. For example, chromatographic data with spectral detection can be arranged as the first mode being elution, the second spectral and the third mode being different experiments. Then different elution profiles (mode one) are found for each experiment (mode three). For multivariate batch process data, the array is typically arranged as time x variables x batches, meaning that the time trajectories (mode one) can vary from batch to batch (mode three).
For assistance in preparing batch data for use in PARAFAC2 please see bspcgui.
Inputs
- x = the multiway array to be decomposed,
- If all slabs have similar size, x is an array. For example, for three-way data where the matrix of measurements for sample one is held in x1, for sample 2 in x2 etc. then X(:,:,1) = X1; X(:,:,2) = X2; etc. If the slabs have different size, X is a cell array (type <help cell> for more info on cells). Then X{1} = X1; X{2} = X2; etc., and
- ncomp =
- the number of factors (components) to use, OR
- model = a PARAFAC model structure (new data are fit to the model i.e. sample mode scores are calculated).
Optional Inputs
- initval =
- If a parafac2 model is input, the data are fit to this model where the loadings for the first mode (scores) are estimated found.
- If the loadings are input (e.g. model.loads) these are used as starting values.
- options = discussed below.
Outputs
Data that are input as a cell-array in PARAFAC2 are converted to an array by zero-padding each samples first mode dimension in case of different first mode dimensions for different samples. Residuals etc. are also output as arrays. The output model is a structure array with the following fields:
- modeltype: 'PARAFAC2',
- datasource: structure array with information about input data,
- date: date of creation,
- time: time of creation,
- info: additional model information,
- loads: 1 by K cell array with model loadings for each mode/dimension,
- pred: cell array with model predictions for each input data block,
- tsqs: cell array with T^{2} values for each mode,
- ssqresiduals: cell array with sum of squares residuals for each mode,
- description: cell array with text description of model, and
- detail: sub-structure with additional model details and results.
The output pred is a structure array that contains the approximation of the data if the options field blockdetails is set to 'all' (see options).
Options
options = a structure array with the following fields:
- display: [ {'on'} | 'off' ], governs level of display,
- plots: [ {'final'} | 'all' | 'none' ], governs level of plotting,
- weights: [], used for fitting a weighted loss function,
- stopcriteria: Structure defining when to stop iterations based on any one of four criteria
- relativechange: Default is 1e-6. When the relative change in fit gets below the threshold, the algorithm stops.
- absolutechange: Default is 1e-6. When the absolute change in fit gets below the threshold, the algorithm stops.
- iterations: Default is 10.000. When the number of iterations exceeds the threshold, the algorithm stops.
- seconds: Default is 3600 (seconds). When the time spent exceeds the threshold, the algorithm stops.
- stopcrit: [1e-6 1e-6 10000 3600] defines the stopping criteria as [(relative tolerance) (absolute tolerance) (maximum number of iterations) (maximum time in seconds)],
- init: [ 0 ], defines how parameters are initialized (discussed below),
- line: [ 0 | {1}] defines whether to use the line search {default uses it},
- algo: [{'als'} 'tld' 'swatld'}] Governs algorithm used (not recommended to modify from ALS unless data are close to perfect; i.e. low model error, low noise),
- algo: not applicable for PARAFAC2 as ALS is always used,
- iterative: settings for iterative reweighted least squares fitting,
- scaletype: [{'norm'} 'max'] Normally, loading vectors are scaled to norm one and the variance is in the sample loadings/scores. Can be changed so that loadings are scaled to have maximum value one,
- blockdetails: [ {'standard'} | 'compact' | 'all' ] level of detail (predictions, raw residuals, and calibration data) included in the model.
- ‘Standard’ - the predictions and raw residuals for the X-block as well as the X-block itself are not stored in the model to reduce its size in memory. Specifically, these fields in the model object are left empty: 'model.pred{1}', 'model.detail.res{1}', 'model.detail.data{1}'.
- ‘Compact’ - like 'Standard' only residual limits from old model is used and the core consistency field in the model structure is left empty. ('model.detail.reslim', 'model.detail.coreconsistency.consistency').
- 'All' - keep predictions, raw residuals for x-block as well as the X-block dataset itself.
- preprocessing: {[]}, one element cell array containing preprocessing structure (see PREPROCESS) defining preprocessing to use on the x-block
- missdat: this option is not yet active,
- samplemode: [3], defines which mode should be considered the sample or object mode (do not change in PARAFAC2),
- constraints: {3x1 cell}, defines constraints on parameters (see PARAFAC), and
- coreconsist: [ {'on'} | 'off' ], governs calculation of core consistency (turning off may save time with large data sets and many components).
- waitbar: [ {'on'} | 'off' ], display waitbar.
The default options can be retrieved using: options = parafac2('options');.
Note that samplemode should not be altered in PARAFAC2. See help on TUCKER for help on the use of options for PARAFAC2. One important difference from TUCKER is that constraints in the first mode do not apply to the estimated profiles, A_{k}, themselves but only to H. It is generally adviced not to use constraints in the first mode.
Examples
parafac2 demo for a demonstration of the use of the PARAFAC2 algorithm.
model = parafac2(X,5) fits a five-component PARAFAC2 model to the array X using default settings. options = parafac2('options'); generates a set of default settings for PARAFAC2. options.plots = 0; sets the plotting off. options.init = 3; sets the initialization of PARAFAC2 to orthogonalized random numbers. model = parafac2(X,2,options); fits a two-component PARAFAC2 model with the settings defined in options.
parafac2 io shows the I/O of the algorithm.
See Also
bspcgui, datahat, explode, gram, mpca, outerm, parafac, tld, tucker, unfoldm