Eigenvector Research Documentation Wiki: Constrainfit (revision by Rasmus, 2021-02-03)
<hr />
<div>===Purpose===<br />
<br />
Finds '''A''' minimizing ||X-A*B'|| subject to constraints, given the small cross-product matrices ('''X''' ' '''B''') and ('''B''' ' '''B''').<br />
<br />
===Synopsis===<br />
: [A,diagnostics]=constrainfit(XB,BtB,Aold,options); % Constrained<br />
: [A,diagnostics]=constrainfit(XB,BtB,Aold); % Unconstrained<br />
<br />
===Description===<br />
<br />
CONSTRAINFIT solves the least squares problem behind bilinear, trilinear and other multilinear models. Assuming a model '''X''' = '''A'''*'''B''' ' where '''X''' and '''B''' are known, the least squares estimate of '''A''' is obtained. Rather than using '''X''' and '''B''' directly, the algorithm uses the cross-product matrices ('''X''' ' '''B''') and ('''B''' ' '''B'''), which are generally smaller and less memory-demanding, especially in multi-way models.<br />
<br />
CONSTRAINFIT can do a number of general types of regression problems such as nonnegativity-constrained regression, regression with column-orthogonality of '''A''' etc. These constraints are simply set in the option field 'type', e.g. option.type='nonnegativity'. Thus, for most problems, only the 'type' field needs to be set. CONSTRAINFIT will provide a least squares solution to most of these problems.<br />
<br />
CONSTRAINFIT can also find '''A''' subject to different constraints on different columns. In this case, the update of '''A''' will be an improvement of the initially provided estimate '''Aold''', though not necessarily the least squares solution. As CONSTRAINFIT is used inside iterative algorithms, an improvement is sufficient to guarantee overall convergence.<br />
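The unconstrained case can be sketched in NumPy (an illustration of the underlying math, not the MATLAB toolbox code): the least squares estimate is computed from the two small cross-product matrices alone, and agrees with the solution obtained from the full data.

```python
import numpy as np

# Sketch of the unconstrained CONSTRAINFIT update, A = (X'B) * inv(B'B),
# computed from the small cross-product matrices only.
rng = np.random.default_rng(0)
I, J, F = 50, 40, 3
A_true = rng.standard_normal((I, F))
B = rng.standard_normal((J, F))
X = A_true @ B.T + 0.01 * rng.standard_normal((I, J))   # X = A*B' + noise

XB = X @ B          # the I x F "X'B" input (small compared to X)
BtB = B.T @ B       # the F x F "B'B" input

A = XB @ np.linalg.inv(BtB)    # unconstrained least squares estimate

# Identical to solving min ||X - A*B'|| directly from the full data:
A_direct = np.linalg.lstsq(B, X.T, rcond=None)[0].T
assert np.allclose(A, A_direct)
```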
<br />
====Inputs====<br />
* '''XB''' = The matrix '''X''' ' '''B'''.<br />
* '''BtB''' = The matrix '''B''' ' '''B'''.<br />
* '''Aold''' = An initial estimate of '''A'''.<br />
<br />
====Optional Inputs====<br />
* '''options''' = provides definitions for which type of constraint to impose.<br />
<br />
====Outputs====<br />
* '''A''' = The improved estimate of '''A'''.<br />
<br />
===Options===<br />
<br />
options = a structure array with the following fields:<br />
<br />
* '''type''': [ {'unconstrained'} | 'nonnegativity' | 'unimodality' | 'unimodality_nonon' | 'orthogonality' | 'columnorthogonal' | 'equality' | 'exponential' | 'rightprod' | 'L1 penalty' | 'columnwise']<br />
::: provides quick access to most important settings<br />
::: ''''unconstrained'''' - do unconstrained fit of '''A'''<br />
::: ''''nonnegativity'''' - '''A''' is all nonnegative<br />
::: ''''unimodality'''' - '''A''' has unimodal columns ''and'' is nonnegative<br />
::: ''''unimodality_nonon'''' - '''A''' has unimodal columns (without nonnegativity)<br />
::: ''''orthogonality'''' - '''A''' is orthogonal ('''A''' ' '''A''' = '''I''')<br />
::: ''''columnorthogonal'''' - '''A''' has orthogonal columns ('''A''' ' '''A''' = diagonal)<br />
::: ''''equality'''' - columns in '''A''' are subject to equality constraints. Useful for e.g. imposing closure (see settings under options.equality below)<br />
::: ''''exponential'''' - Columns are mono-exponentials<br />
::: ''''rightprod'''' - Fitting '''A''' subject to being of the form '''F*D''', where '''D''' is predefined (must be set in options.advanced.linearconstraints.matrix). If imposed, the columnwise constraints (see below) are applied to the columns of '''F''' rather than '''A'''. Hence options.columnconstraints must be set appropriately.<br />
::: ''''L1 penalty'''' - '''A''' is estimated under a penalty making '''A''' sparse (the higher options.L1.penalty, the sparser '''A''' will be). Note: '''A''' is also constrained to be nonnegative.<br />
<br />
::: ''''columnwise'''' - A has other constraints than the above. These have to be defined in options.columnconstraints (see below).<br />
<br />
* '''columnconstraints''': cell where element f defines constraints on column f (only applicable if options.type = 'columnwise'). <br />
::: columnconstraints is a cell vector {f1,f2,f3, ... fF}. Each element f1, f2, etc. corresponds to one column of '''A'''; f1 defines constraints on the first column of '''A''', and so on. Each constraint on a column is defined by a number. For example, if f1 is 1, then nonnegativity is imposed on the first column (see definitions below). If f1 = [1 4], then first nonnegativity is imposed and then smoothness. The following constraints are available for individual columns:<br />
::: a = 0 : Unconstrained<br />
::: a = 1 : Nonnegativity<br />
::: a = 2 : Unimodality<br />
::: a = 3 : Inequality (every element >= scalar). Scalar has to be in options.inequality.scalar. This is a vector of size F, one scalar for each factor<br />
::: a = 4 : Smoothness. options.smoothness.operator can be used to hold the smoothing operator (a speed-up; it then does not have to be estimated each time). options.smoothness.alpha (0&lt;alpha&lt;1) sets the degree: 0 means no smoothness, 1 means a high degree of smoothness.<br />
::: a = 5 : Fixed elements. The elements that are fixed are defined in options.fixed. <br />
::: a = 6 : Not applicable<br />
::: a = 7 : Approximate unimodality. See options.unimodality.weight<br />
::: a = 8 : Normalize the loading vectors to norm one<br />
::: a = 20: Functional constraint. Using simple predefined or user-defined functions, any functional constraint can be imposed on individual columns, for example that one column is exponential. Functional constraints require that a function is written (type HELP FITGAUSS for an example). <br />
<br />
::'''Example:''' Fitting the second loading vector as being Gaussian:<br />
<pre><br />
NumberFactors=3;<br />
options.functional=cell(NumberFactors,1);<br />
ToFix = 2; % This constraint is for the second column<br />
options.functional{ToFix}.functionhandle = @fitgauss;<br />
% Define starting parameters<br />
center = 100;width = 100;height = .1;<br />
options.functional{ToFix}.parameters = [center width height];<br />
options.functional{ToFix}.additional=[]; % no additional input<br />
</pre><br />
<br />
When a column has more than one constraint, the constraints are generally imposed sequentially, starting with the first one in options.columnconstraints. For most constraints the order will not matter; it is advisable to input constraints with smaller numbers first.<br />
<br />
* '''inequality''' : Defines a cutoff. If inequality is defined in columnwise constraints, all elements of that column will be >= options.inequality.scalar. Thus, when set to zero, nonnegativity is imposed.<br />
* '''nonnegativity''': defines which algorithm is used when options.type = 'nonnegativity'. 0 (default): the NNLS algorithm. 1: a faster columnwise update which only improves the current least squares fit. 2: an ad hoc approach where '''A''' is estimated in a least squares sense and negative numbers are then set to zero; this does not give a well-defined solution in terms of the least squares loss function. 3: the NMF algorithm, which requires that all elements of the data array are nonnegative in order to work properly.<br />
* '''smoothness''': defines how much smoothness is imposed when smoothness is imposed as a columnconstraint. smoothness.alpha is a number between 0 (no smoothness) and 1 (full smoothness)<br />
* '''fixed''': options.fixed.values is a matrix of the same size as the loading matrix ('''A''') with the actual numbers to be fixed in the positions corresponding to their position in '''A'''. The remaining positions must be NaN. The degree to which elements are fixed is set in options.fixed.weight (0&lt;=weight&lt;=1): zero means the constraint is not imposed whereas one means the elements are completely fixed.<br />
* '''advanced''': The field advanced.linearconstraints holds the settings for options.type = 'rightprod'. If '''A''' is IxF, then linearconstraints.matrix must be an SxF matrix '''D'''. '''A''' is then found as '''F*D''', where '''F''' is estimated. E.g. if '''A''' has three columns and the predefined matrix is D = [1 1 0; 0 0 1], then the first and second of the three columns in '''A''' will be identical. <br />
* '''equality''': Settings for using options.type = 'equality'. Two fields are held in equality, C and d. When imposed, CONSTRAINFIT solves for loading matrix A subject to A(i,:)*C' = d for all i. Hence if you want to impose closure and have three factors, set C=[1 1 1] and d=1.<br />
* '''unimodality''': Set weight in options.unimodality.weight. weight==1: exact unimodality. weight==0: no unimodality<br />
* '''functional''': For functional constraints (see above)<br />
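The ad hoc nonnegativity option (options.nonnegativity = 2) described above can be sketched in NumPy; this is an illustration of the approach, not the toolbox code, and it shows why the result is not a proper constrained least squares solution:

```python
import numpy as np

# Sketch of options.nonnegativity = 2: ordinary least squares followed
# by zeroing of negative entries.  Fast, but the result is not the
# minimizer of the least squares loss under the nonnegativity constraint.
rng = np.random.default_rng(1)
I, J, F = 30, 20, 2
A_true = np.abs(rng.standard_normal((I, F)))
B = np.abs(rng.standard_normal((J, F)))
X = A_true @ B.T + 0.05 * rng.standard_normal((I, J))

XB, BtB = X @ B, B.T @ B
A_ls = XB @ np.linalg.inv(BtB)          # unconstrained solution
A_adhoc = np.where(A_ls < 0, 0.0, A_ls) # clip negatives to zero

assert (A_adhoc >= 0).all()
```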
<br />
===Examples===<br />
<br />
<br />
'''Setting global constraints on A'''<br />
<pre><br />
opt = constrainfit('options');<br />
opt.type='nonnegativity';<br />
[A]=constrainfit(XB,BtB,Aold,opt); % Nonnegative<br />
</pre><br />
<br />
'''Setting constraints on just one column of A'''<br />
<pre><br />
opt = constrainfit('options');<br />
opt.type='columnwise';<br />
opt.columnconstraints={0;2;0}; % If three columns<br />
[A]=constrainfit(XB,BtB,Aold,opt); % Second column unimodal<br />
</pre><br />
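As an illustration of the 'equality' type described under Options (closure with three factors, C=[1 1 1], d=1), the row-wise update can be sketched in NumPy. The correction formula below is the standard KKT solution for equality-constrained least squares; that CONSTRAINFIT uses exactly this formula is an assumption, the sketch only illustrates the constraint.

```python
import numpy as np

# Sketch of options.type = 'equality' for a single row of A:
# minimize ||x - B*a||^2 subject to C*a = d (closure: sum(a) = 1).
rng = np.random.default_rng(4)
J, F = 25, 3
B = rng.standard_normal((J, F))
x = rng.standard_normal(J)
C = np.array([[1.0, 1.0, 1.0]])   # A(i,:)*C' = d with three factors
d = np.array([1.0])

BtB = B.T @ B
a_ls = np.linalg.solve(BtB, B.T @ x)   # unconstrained row solution

# Equality-constrained correction (KKT conditions):
# a = a_ls - inv(B'B) C' [C inv(B'B) C']^-1 (C a_ls - d)
G = np.linalg.solve(BtB, C.T)          # inv(B'B) * C'
a_eq = a_ls - G @ np.linalg.solve(C @ G, C @ a_ls - d)

assert np.allclose(C @ a_eq, d)        # closure holds exactly
```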
<br />
<pre><br />
% Make a noisy dataset such that PARAFAC gives noisy loadings<br />
load aminoacids<br />
x = X.data;<br />
x = x+randn(size(x))*100;<br />
<br />
% define parafac options<br />
op=parafac('options');<br />
<br />
% set constraints in second mode to be defined columnwise<br />
op.constraints{2}.type='columnwise';<br />
<br />
% Define that first column is smooth, second and third unconstrained<br />
op.constraints{2}.columnconstraints={4 0 0};<br />
<br />
% Fit model<br />
model = parafac(x,3,op);<br />
</pre><br />
<br />
Note how the first loading in the second mode is smoother than the others. If needed, the degree of smoothness can be adjusted between 0 (none) and 1 (maximal), e.g. op.constraints{2}.smoothness.alpha=0.6<br />
<br />
===Notes===<br />
<br />
* When combining the fixed constraint with nonnegativity or unimodality, use both the .fixed and .columnconstraints fields. That is, set the values in constraints{n,1}.fixed.values and add the corresponding codes (as a cell array) in constraints{n,1}.columnconstraints, using the valid columnconstraint values described above. <br />
<br />
===See Also===<br />
<br />
[[parafac]]</div>

Parafac (revision by Rasmus, 2020-11-13)
<hr />
<div>===Purpose===<br />
<br />
PARAFAC (PARAllel FACtor analysis) for multi-way arrays<br />
<br />
===Synopsis===<br />
<br />
:model = parafac(X,ncomp,''options'')<br />
:pred = parafac(Xnew,model)<br />
:parafac % Launches an analysis window with Parafac as the selected method<br />
<br />
Please note that the recommended way to build and apply a PARAFAC model from the command line is to use the Model Object. Please see [[EVRIModel_Objects | this wiki page on building and applying models using the Model Object]].<br />
<br />
===Description===<br />
<br />
PARAFAC will decompose an array of order ''N'' (where ''N'' >= 3) into the summation over the outer product of ''N'' vectors (a low-rank model). E.g. if ''N''=3 then the array is size ''I'' by ''J'' by ''K''. An example of three-way fluorescence data is shown below.<br />
<br />
For example, twenty-seven samples containing different amounts of dissolved hydroquinone, tryptophan, phenylalanine, and dopa are measured spectrofluorometrically using 233 emission wavelengths (250-482 nm) and 24 excitation wavelengths (200-315 nm in 5 nm steps). A typical sample is also shown.<br />
<br />
[[Image:Parafacdata.gif]]<br />
<br />
A four-component PARAFAC model of these data will give four factors, each corresponding to one of the chemical analytes. This is illustrated graphically below. The first mode scores (loadings in mode 1) in the matrix '''A''' (27x4) contain estimated relative concentrations of the four analytes in the 27 samples. The second mode loadings '''B''' (233x4) are estimated emission loadings and the third mode loadings '''C''' (24x4) are estimated excitation loadings.<br />
<br />
[[Image:Parafacresults.gif]]<br />
<br />
For more information about how to use PARAFAC, see the [https://www.youtube.com/playlist?list=PL4L59zaizb3E-Pgp-f90iKHdQQi15JJoL University of Copenhagen's Multi-Way Analysis Videos].<br />
<br />
In the PARAFAC algorithm, any missing values must be set to NaN or Inf and are then automatically handled by expectation maximization. This routine employs an alternating least squares (ALS) algorithm in combination with a line search. For 3-way data, the initial estimate of the loadings is usually obtained from the tri-linear decomposition (TLD).<br />
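One ALS step for a three-way array can be sketched in NumPy (an illustration of the standard ALS update with noise-free data; not the toolbox implementation, which additionally handles line search, constraints and missing values):

```python
import numpy as np

def khatri_rao(U, V):
    # Column-wise Kronecker product: from I x F and J x F to (I*J) x F.
    return np.stack([np.kron(U[:, f], V[:, f]) for f in range(U.shape[1])],
                    axis=1)

rng = np.random.default_rng(2)
I, J, K, F = 10, 8, 6, 2
A = rng.standard_normal((I, F))
B = rng.standard_normal((J, F))
C = rng.standard_normal((K, F))

# The PARAFAC model: sum over f of the outer products a_f o b_f o c_f.
X = np.einsum('if,jf,kf->ijk', A, B, C)

# One ALS update of A with B and C fixed: unfold X to I x (J*K) and
# solve the least squares problem against khatri_rao(B, C).
X1 = X.reshape(I, J * K)                    # mode-1 unfolding
Z = khatri_rao(B, C)                        # (J*K) x F
A_new = X1 @ Z @ np.linalg.inv((B.T @ B) * (C.T @ C))   # Z'Z = (B'B).*(C'C)

assert np.allclose(A_new, A)   # exact recovery for noise-free data
```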
<br />
For assistance in preparing batch data for use in PARAFAC please see [[bspcgui]].<br />
<br />
====Inputs====<br />
<br />
* '''x''' = the multiway array to be decomposed, and<br />
<br />
* '''ncomp''' = <br />
:* the number of factors (components) to use, OR<br />
:* a cell array of parameters such as {a,b,c} which will then be used as the starting point for the model. The cell array must have the same length as the number of modes, and element j must contain the scores/loadings for mode j. If one cell element is empty, this mode is guessed based on the remaining modes.<br />
<br />
====Optional Inputs====<br />
<br />
* '''''initval''''' = <br />
:* If a parafac model is input, the data are fit to this model where the loadings for the first mode (scores) are estimated. <br />
:* If the loadings are input (e.g. model.loads) these are used as starting values.<br />
<br />
*'''''options''''' = discussed below.<br />
<br />
====Outputs====<br />
<br />
The output model is a structure array with the following fields:<br />
<br />
* '''modeltype''': 'PARAFAC',<br />
<br />
* '''datasource''': structure array with information about input data,<br />
<br />
* '''date''': date of creation,<br />
<br />
* '''time''': time of creation,<br />
<br />
* '''info''': additional model information,<br />
<br />
* '''loads''': 1 by ''K'' cell array with model loadings for each mode/dimension,<br />
<br />
* '''pred''': cell array with model predictions for each input data block,<br />
<br />
* '''tsqs''': cell array with T<sup>2</sup> values for each mode,<br />
<br />
* '''ssqresiduals''': cell array with sum of squares residuals for each mode,<br />
<br />
* '''description''': cell array with text description of model, and<br />
<br />
* '''detail''': sub-structure with additional model details and results. E.g. .detail.ssq contains information on loss function and variance explained as well as variance per component.<br />
<br />
Note that the sum-squared captured table contains various statistics on the information captured by each component. Please see [[MCR and PARAFAC Variance Captured]] for details.<br />
The output pred is a structure array that contains the approximation of the data if the options field blockdetails is set to 'all' (see next).<br />
<br />
===Options===<br />
<br />
''options'' = a structure array with the following fields:<br />
<br />
* '''display''': [ {'on'} | 'off' ], governs level of display,<br />
<br />
* '''plots''': [ {'final'} | 'all' | 'none' ], governs level of plotting,<br />
<br />
* '''weights''': [], used for fitting a weighted loss function (discussed below),<br />
<br />
* '''stopcriteria''': Structure defining when to stop iterations based on any one of four criteria<br />
<br />
:* '''relativechange''': Default is 1e-6. When the relative change in fit gets below the threshold, the algorithm stops.<br />
:* '''absolutechange''': Default is 1e-6. When the absolute change in fit gets below the threshold, the algorithm stops.<br />
:* '''iterations''': Default is 10,000. When the number of iterations exceeds the threshold, the algorithm stops.<br />
:* '''seconds''': Default is 3600 (seconds). When the time spent exceeds the threshold, the algorithm stops.<br />
<br />
* '''init''': [ 0 ], defines how parameters are initialized (discussed below),<br />
<br />
* '''line''': [ 0 | {1}] defines whether to use the line search {default uses it},<br />
<br />
* '''algo''': [ {'ALS'} | 'tld' | 'swatld' ] governs algorithm used. Only ALS allows more than three-way and allows constraints,<br />
<br />
* '''iterative''': settings for iterative reweighted least squares fitting (see help on weights below),<br />
<br />
* '''validation.splithalf''': [ 'on' | {'off'} ], Allows doing [[splithalf]] analysis. See the help of SPLITHALF for more information,<br />
<br />
* '''auto_outlier.perform''': [ 'on' | {'off'} ], Will automatically remove detected outliers in an iterative fashion. See auto_outlier.help for more information,<br />
<br />
* '''scaletype''': Defines how loadings are scaled. See options.scaletype.text for help,<br />
<br />
* '''blockdetails''': [ {'standard'} | 'compact' | 'all' ] level of detail (predictions, raw residuals, and calibration data) included in the model.<br />
:* 'Standard' = the predictions and raw residuals for the X-block, as well as the X-block itself, are not stored in the model in order to reduce its size in memory. Specifically, these fields in the model object are left empty: 'model.pred{1}', 'model.detail.res{1}', 'model.detail.data{1}'.<br />
:* 'Compact' = like 'Standard' except that residual limits from the old model are used and the core consistency field in the model structure is left empty ('model.detail.reslim', 'model.detail.coreconsistency.consistency').<br />
:* 'All' = keep the predictions and raw residuals for the X-block as well as the X-block dataset itself.<br />
<br />
* '''preprocessing''': {[]}, one element cell array containing preprocessing structure (see PREPROCESS) defining preprocessing to use on the x-block <br />
<br />
* '''samplemode''': [1], defines which mode should be considered the sample or object mode,<br />
<br />
* '''constraints''': {3x1 cell}, defines constraints on parameters (discussed below),<br />
<br />
* '''coreconsist''': [ {'on'} | 'off' ], governs calculation of core consistency (turning off may save time with large data sets and many components), and<br />
<br />
* '''waitbar''': [ {'on'} | 'off' ], display waitbar. <br />
<br />
The default options can be retrieved using options = parafac('options');<br />
<br />
=====Weights=====<br />
<br />
Through the use of the ''options'' field weights it is possible to fit a PARAFAC model in a weighted least squares sense. The input is an array of the same size as the input data X holding an individual weight for each element. Instead of minimizing the Frobenius norm ||X-M||<sup>2</sup>, where M is the PARAFAC model, the norm ||(X-M).*weights||<sup>2</sup> is minimized. The algorithm used for weighted regression is based on a majorization step according to Kiers, ''Psychometrika'', '''62''', 251-266, 1997, which has the advantage of being computationally inexpensive.<br />
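The majorization idea can be sketched in NumPy with a rank-one SVD model standing in for the PARAFAC model. This shows one common form of the majorization step for weighted least squares (weights scaled to at most 1); it is an illustration under those assumptions, not the toolbox code:

```python
import numpy as np

# Iterative majorization for weighted least squares: repeatedly fit the
# UNWEIGHTED model to  W.^2.*X + (1 - W.^2).*M , which cannot increase
# the weighted loss when all weights are <= 1.
rng = np.random.default_rng(3)
I, J = 20, 15
X = np.outer(rng.standard_normal(I), rng.standard_normal(J))
X += 0.1 * rng.standard_normal((I, J))
W = rng.uniform(0.1, 1.0, size=(I, J))        # element weights, max <= 1

def rank1_fit(Y):
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return s[0] * np.outer(U[:, 0], Vt[0])    # best unweighted rank-1 fit

def wloss(M):
    return np.sum((W * (X - M)) ** 2)         # ||(X-M).*weights||^2

M = rank1_fit(X)
losses = [wloss(M)]
for _ in range(50):
    M = rank1_fit(W**2 * X + (1 - W**2) * M)  # majorization step
    losses.append(wloss(M))

# The weighted loss decreases monotonically.
assert all(b <= a + 1e-9 for a, b in zip(losses, losses[1:]))
```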
<br />
=====Init=====<br />
<br />
The ''options'' field init is used to govern how the initial guess for the loadings is obtained. If optional input ''initval'' is input then options.init is not used. The following choices for init are available.<br />
<br />
Generally, options.init = 0 will do for well-behaved data whereas options.init = 10 will be suitable for difficult models. Difficult models are typically those with many components, with very correlated loadings, or models where there are indications that local minima are present.<br />
<br />
* '''init''' = 0, PARAFAC chooses initialization {default},<br />
<br />
* '''init''' = 1, uses TLD (unless data is more than three-way. Then ATLD is used),<br />
<br />
* '''init''' = 2, based on singular value decomposition (good alternative to 1), <br />
<br />
* '''init''' = 3, based on orthogonalization of random values (good for checking local minima),<br />
<br />
* '''init''' = 4, based on approximate (sequentially fitted) PARAFAC model, <br />
<br />
* '''init''' = 5, based on compression which may be useful for large data, and<br />
<br />
* '''init''' > 5, based on best fit of many (the value options.init) small runs.<br />
<br />
=====Constraints=====<br />
<br />
The ''options'' field constraints is used to impose constraints on the parameters. It is a cell array with as many elements as the input data X has modes. Each cell contains a structure array that defines the constraints in that particular mode. Hence, options.constraints{2} defines constraints on the second mode loadings. For help on setting constraints see [[constrainfit]]. Note that if your dataset is, e.g., a five-way array, the default constraints field in options only defines the first three modes. You will have to create the constraint field for the remaining modes yourself, which can be done by copying from the other modes, for example: options.constraints{4} = options.constraints{1}; options.constraints{5} = options.constraints{1};<br />
<br />
===Examples===<br />
<br />
parafac demo gives a demonstration of the use of the PARAFAC algorithm.<br />
<br />
model = parafac(X,5) fits a five-component PARAFAC model to the array X using default settings.<br />
<br />
pred = parafac(Z,model) fits a parafac model to new data Z. The scores will be taken to be in the first mode, but you can change this by setting options.samplemodex to the mode which is the sample mode. Note that the sample-mode dimension may be different for the old model and the new data, but all other dimensions must be the same.<br />
<br />
options = parafac('options'); generates a set of default settings for PARAFAC. options.plots = 0; sets the plotting off.<br />
<br />
options.init = 3; sets the initialization of PARAFAC to orthogonalized random numbers.<br />
<br />
options.samplemodex = 2; Defines the second mode to be the sample-mode. Useful, for example, when fitting an existing model to new data where the scores are to be estimated in the second mode.<br />
<br />
model = parafac(X,2,options); fits a two-component PARAFAC model with the settings defined in options. <br />
<br />
parafac io shows the I/O of the algorithm.<br />
<br />
===See Also===<br />
<br />
[[analysis]], [[bspcgui]], [[datahat]], [[eemoutlier]], [[explode]], [[gram]], [[mpca]], [[npls]], [[outerm]], [[parafac2]], [[pca]], [[preprocess]], [[splithalf]], [[tld]], [[tucker]], [[unfoldm]], [[modelviewer]], [[EVRIModel_Objects]]</div>Rasmushttps://www.wiki.eigenvector.com/index.php?title=Parafac&diff=11028Parafac2020-02-14T09:31:05Z<p>Rasmus: /* Outputs */</p>
<hr />
<div>===Purpose===<br />
<br />
PARAFAC (PARAllel FACtor analysis) for multi-way arrays<br />
<br />
===Synopsis===<br />
<br />
:model = parafac(X,ncomp,''initval,options'')<br />
:pred = parafac(Xnew,model)<br />
:parafac % Launches an analysis window with Parafac as the selected method<br />
<br />
Please note that the recommended way to build and apply a PARAFAC model from the command line is to use the Model Object. Please see [[EVRIModel_Objects | this wiki page on building and applying models using the Model Object]]. <br />
<br />
===Description===<br />
<br />
PARAFAC will decompose an array of order ''N'' (where ''N'' >= 3) into the summation over the outer product of ''N'' vectors (a low-rank model). E.g. if ''N''=3 then the array is size ''I'' by ''J'' by ''K''. An example of three-way fluorescence data is shown below..<br />
<br />
For example, twenty-seven samples containing different amounts of dissolved hydroquinone, tryptophan, phenylalanine, and dopa are measured spectrofluoremetrically using 233 emission wavelengths (250-482 nm) and 24 excitation wavelengths (200-315 nm each 5 nm). A typical sample is also shown.<br />
<br />
[[Image:Parafacdata.gif]]<br />
<br />
A four-component PARAFAC model of these data will give four factors, each corresponding to one of the chemical analytes. This is illustrated graphically below. The first mode scores (loadings in mode 1) in the matrix '''A''' (27x4) contain estimated relative concentrations of the four analytes in the 27 samples. The second mode loadings '''B''' (233x4) are estimated emission loadings and the third mode loadings '''C''' (24x4) are estimated excitation loadings.<br />
<br />
[[Image:Parafacresults.gif]]<br />
<br />
For more information about how to use PARAFAC, see the [http://www.youtube.com/user/QualityAndTechnology/videos?view=1&flow=grid University of Copenhagen's Multi-Way Analysis Videos].<br />
<br />
In the PARAFAC algorithm, any missing values must be set to NaN or Inf and are then automatically handled by expectation maximization. This routine employs an alternating least squares (ALS) algorithm in combination with a line search. For 3-way data, the initial estimate of the loadings is usually obtained from the tri-linear decomposition (TLD).<br />
<br />
For assistance in preparing batch data for use in PARAFAC please see [[bspcgui]].<br />
<br />
====Inputs====<br />
<br />
* '''x''' = the multiway array to be decomposed, and<br />
<br />
* '''ncomp''' = <br />
:* the number of factors (components) to use, OR<br />
:* a cell array of parameters such as {a,b,c} which will then be used as starting point for the model. The cell array must be the same length as the number of modes and element j contain the scores/loadings for that mode. If one cell element is empty, this mode is guessed based on the remaining modes.<br />
<br />
====Optional Inputs====<br />
<br />
* '''''initval''''' = <br />
:* If a parafac model is input, the data are fit to this model where the loadings for the first mode (scores) are estimated. <br />
:* If the loadings are input (e.g. model.loads) these are used as starting values.<br />
<br />
*'''''options''''' = discussed below.<br />
<br />
====Outputs====<br />
<br />
The output model is a structure array with the following fields:<br />
<br />
* '''modeltype''': 'PARAFAC',<br />
<br />
* '''datasource''': structure array with information about input data,<br />
<br />
* '''date''': date of creation,<br />
<br />
* '''time''': time of creation,<br />
<br />
* '''info''': additional model information,<br />
<br />
* '''loads''': 1 by ''K'' cell array with model loadings for each mode/dimension,<br />
<br />
* '''pred''': cell array with model predictions for each input data block,<br />
<br />
* '''tsqs''': cell array with T<sup>2</sup> values for each mode,<br />
<br />
* '''ssqresiduals''': cell array with sum of squares residuals for each mode,<br />
<br />
* '''description''': cell array with text description of model, and<br />
<br />
* '''detail''': sub-structure with additional model details and results. E.g. .detail.ssq contains information on loss function and variance explained as well as variance per component.<br />
<br />
Note that the sum-squared captured table contains various statistics on the information captured by each component. Please see [[MCR and PARAFAC Variance Captured]] for details.<br />
The output pred is a structure array that contains the approximation of the data if the options field blockdetails is set to 'all' (see next).<br />
<br />
===Options===<br />
<br />
''options'' = a structure array with the following fields:<br />
<br />
* '''display''': [ {'on'} | 'off' ], governs level of display,<br />
<br />
* '''plots''': [ {'final'} | 'all' | 'none' ], governs level of plotting,<br />
<br />
* '''weights''': [], used for fitting a weighted loss function (discussed below),<br />
<br />
* '''stopcriteria''': Structure defining when to stop iterations based on any one of four criteria<br />
<br />
:* '''relativechange''': Default is 1e-6. When the relative change in fit gets below the threshold, the algorithm stops.<br />
:* '''absolutechange''': Default is 1e-6. When the absolute change in fit gets below the threshold, the algorithm stops.<br />
:* '''iterations''': Default is 10.000. When the number of iterations exceeds the threshold, the algorithm stops.<br />
:* '''seconds''': Default is 3600 (seconds). When the time spent exceeds the threshold, the algorithm stops.<br />
<br />
* '''init''': [ 0 ], defines how parameters are initialized (discussed below),<br />
<br />
* '''line''': [ 0 | {1}] defines whether to use the line search {default uses it},<br />
<br />
* '''algo''': [ {'ALS'} | 'tld' | 'swatld' ] governs algorithm used. Only ALS allows more than three-way and allows constraints,<br />
<br />
* '''iterative''': settings for iterative reweighted least squares fitting (see help on weights below),<br />
<br />
* '''validation.splithalf''': [ 'on' | {'off'} ], Allows doing [[splithalf]] analysis. See the help of SPLITHALF for more information,<br />
<br />
* '''auto_outlier.perform''': [ 'on' | {'off'} ], Will automatically remove detected outliers in an iterative fashion. See auto_outlier.help for more information,<br />
<br />
* '''scaletype''': Defines how loadings are scaled. See options.scaletype.text for help,<br />
<br />
* '''blockdetails''': [ {'standard'} | 'compact' | 'all' ] level of detail (predictions, raw residuals, and calibration data) included in the model.<br />
:* ‘Standard’ = the predictions and raw residuals for the X-block as well as the X-block itself are not stored in the model to reduce its size in memory. Specifically, these fields in the model object are left empty: 'model.pred{1}', 'model.detail.res{1}', 'model.detail.data{1}'.<br />
:* ‘Compact’ = like 'Standard' only residual limits from old model is used and the core consistency field in the model structure is left empty. ('model.detail.reslim', 'model.detail.coreconsistency.consistency').<br />
:* 'All' = keep predictions, raw residuals for x-block as well as the X-block dataset itself.<br />
<br />
* '''preprocessing''': {[]}, one element cell array containing preprocessing structure (see PREPROCESS) defining preprocessing to use on the x-block <br />
<br />
* '''samplemode''': [1], defines which mode should be considered the sample or object mode,<br />
<br />
* '''constraints''': {3x1 cell}, defines constraints on parameters (discussed below),<br />
<br />
* '''coreconsist''': [ {'on'} | 'off' ], governs calculation of core consistency (turning off may save time with large data sets and many components), and<br />
<br />
* '''waitbar''': [ {'on'} | 'off' ], display waitbar. <br />
<br />
The default options can be retrieved using: options = parafac('options');.<br />
<br />
=====Weights=====<br />
<br />
Through the use of the ''options'' field weights it is possible to fit a PARAFAC model in a weighted least squares sense The input is an array of the same size as the input data X holding individual weights for each element. The PARAFAC model is then fit in a weighted least squares sense. Instead of minimizing the frobenius norm ||x-M||<sup>2</sup> where M is the PARAFAC model, the norm ||(x-M).*weights||<sup>2</sup> is minimized. The algorithm used for weighted regression is based on a majorization step according to Kiers, ''Psychometrika'', '''62''', 251-266, 1997 which has the advantage of being computationally inexpensive.<br />
<br />
=====Init=====<br />
<br />
The ''options'' field init is used to govern how the initial guess for the loadings is obtained. If optional input ''initval'' is input then options.init is not used. The following choices for init are available.<br />
<br />
Generally, options.init = 0, will do for well-behaved data whereas options.init = 10, will be suitable for difficult models. Difficult models are typically those with many components, with very correlated loadings, or models where there are indications that local minima are present.<br />
<br />
* '''init''' = 0, PARAFAC chooses initialization {default},<br />
<br />
* '''init''' = 1, uses TLD (unless data is more than three-way. Then ATLD is used),<br />
<br />
* '''init''' = 2, based on singular value decomposition (good alternative to 1), <br />
<br />
* '''init''' = 3, based on orthogonalization of random values (good for checking local minima),<br />
<br />
* '''init''' = 4, based on approximate (sequentially fitted) PARAFAC model, <br />
<br />
* '''init''' = 5, based on compression which may be useful for large data, and<br />
<br />
* '''init''' > 5, based on best fit of many (the value options.init) small runs.<br />
<br />
=====Constraints=====<br />
<br />
The ''options'' field constraints is used to impose constraints on the parameters. It is a cell array with a number of elements equal to the number of modes of the input data X. Each cell contains a structure array that defines the constraints in that particular mode; hence, options.constraints{2} defines constraints on the second-mode loadings. For help on setting constraints see [[constrainfit]]. Note that if your dataset is, e.g., a five-way array, the default constraints field in options only defines the first three modes; you must define the constraints field for the remaining modes yourself, which can be done by copying from the other modes, for example: options.constraints{4} = options.constraints{1}; options.constraints{5} = options.constraints{1};<br />
<br />
===Examples===<br />
<br />
parafac demo gives a demonstration of the use of the PARAFAC algorithm.<br />
<br />
model = parafac(X,5) fits a five-component PARAFAC model to the array X using default settings.<br />
<br />
pred = parafac(Z,model) fits a parafac model to new data Z. The scores will be taken to be in the first mode, but you can change this by setting options.samplemodex to the mode which is the sample mode. Note, that the sample-mode dimension may be different for the old model and the new data, but all other dimensions must be the same.<br />
<br />
options = parafac('options'); generates a set of default settings for PARAFAC. options.plots = 'none'; turns plotting off.<br />
<br />
options.init = 3; sets the initialization of PARAFAC to orthogonalized random numbers.<br />
<br />
options.samplemodex = 2; Defines the second mode to be the sample-mode. Useful, for example, when fitting an existing model to new data where the scores are in the second mode.<br />
<br />
model = parafac(X,2,options); fits a two-component PARAFAC model with the settings defined in options. <br />
<br />
parafac io shows the I/O of the algorithm.<br />
<br />
===See Also===<br />
<br />
[[analysis]], [[bspcgui]], [[datahat]], [[eemoutlier]], [[explode]], [[gram]], [[mpca]], [[npls]], [[outerm]], [[parafac2]], [[pca]], [[preprocess]], [[splithalf]], [[tld]], [[tucker]], [[unfoldm]], [[modelviewer]], [[EVRIModel_Objects]]</div>Rasmushttps://www.wiki.eigenvector.com/index.php?title=Parafac&diff=10883Parafac2019-10-18T06:23:13Z<p>Rasmus: /* Init */ Corrected according to bug fix</p>
<hr />
<div>===Purpose===<br />
<br />
PARAFAC (PARAllel FACtor analysis) for multi-way arrays<br />
<br />
===Synopsis===<br />
<br />
:model = parafac(X,ncomp,''initval,options'')<br />
:pred = parafac(Xnew,model)<br />
:parafac % Launches an analysis window with Parafac as the selected method<br />
<br />
===Description===<br />
<br />
PARAFAC will decompose an array of order ''N'' (where ''N'' >= 3) into the summation over the outer product of ''N'' vectors (a low-rank model). E.g., if ''N''=3, then the array is of size ''I'' by ''J'' by ''K''. An example of three-way fluorescence data is shown below.<br />
<br />
For example, twenty-seven samples containing different amounts of dissolved hydroquinone, tryptophan, phenylalanine, and dopa are measured spectrofluorometrically using 233 emission wavelengths (250-482 nm) and 24 excitation wavelengths (200-315 nm in 5 nm steps). A typical sample is also shown.<br />
<br />
[[Image:Parafacdata.gif]]<br />
<br />
A four-component PARAFAC model of these data will give four factors, each corresponding to one of the chemical analytes. This is illustrated graphically below. The first mode scores (loadings in mode 1) in the matrix '''A''' (27x4) contain estimated relative concentrations of the four analytes in the 27 samples. The second mode loadings '''B''' (233x4) are estimated emission loadings and the third mode loadings '''C''' (24x4) are estimated excitation loadings.<br />
<br />
[[Image:Parafacresults.gif]]<br />
<br />
For more information about how to use PARAFAC, see the [http://www.youtube.com/user/QualityAndTechnology/videos?view=1&flow=grid University of Copenhagen's Multi-Way Analysis Videos].<br />
<br />
In the PARAFAC algorithm, any missing values must be set to NaN or Inf and are then automatically handled by expectation maximization. This routine employs an alternating least squares (ALS) algorithm in combination with a line search. For 3-way data, the initial estimate of the loadings is usually obtained from the tri-linear decomposition (TLD).<br />
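As a minimal sketch of the missing-data handling described above (the array and the indices are illustrative):<br />
<br />
```matlab
% Elements set to NaN are treated as missing and handled automatically
% by expectation maximization during the ALS iterations.
X(5,10:20,3) = NaN;       % mark a block of measurements as missing
model = parafac(X,3);     % the missing elements do not influence the fit
```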
<br />
For assistance in preparing batch data for use in PARAFAC please see [[bspcgui]].<br />
<br />
====Inputs====<br />
<br />
* '''x''' = the multiway array to be decomposed, and<br />
<br />
* '''ncomp''' = <br />
:* the number of factors (components) to use, OR<br />
:* a cell array of parameters such as {a,b,c} which will then be used as the starting point for the model. The cell array must have the same length as the number of modes, and element j must contain the scores/loadings for that mode. If one cell element is empty, that mode is estimated from the remaining modes.<br />
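For example, to start the algorithm from previously obtained loadings (a sketch; A, B, and C are assumed to be loading matrices with matching row dimensions and the same number of columns):<br />
<br />
```matlab
% Use previously obtained loadings as the starting point for a 3-way array.
starting = {A,B,C};            % one cell per mode
model = parafac(X,starting);

% Leaving one cell empty makes PARAFAC guess that mode from the others:
starting = {A,[],C};
model = parafac(X,starting);
```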
<br />
====Optional Inputs====<br />
<br />
* '''''initval''''' = <br />
:* If a parafac model is input, the data are fit to this model where the loadings for the first mode (scores) are estimated. <br />
:* If the loadings are input (e.g. model.loads) these are used as starting values.<br />
<br />
*'''''options''''' = discussed below.<br />
<br />
====Outputs====<br />
<br />
The output model is a structure array with the following fields:<br />
<br />
* '''modeltype''': 'PARAFAC',<br />
<br />
* '''datasource''': structure array with information about input data,<br />
<br />
* '''date''': date of creation,<br />
<br />
* '''time''': time of creation,<br />
<br />
* '''info''': additional model information,<br />
<br />
* '''loads''': 1 by ''K'' cell array with model loadings for each mode/dimension,<br />
<br />
* '''pred''': cell array with model predictions for each input data block,<br />
<br />
* '''tsqs''': cell array with T<sup>2</sup> values for each mode,<br />
<br />
* '''ssqresiduals''': cell array with sum of squares residuals for each mode,<br />
<br />
* '''description''': cell array with text description of model, and<br />
<br />
* '''detail''': sub-structure with additional model details and results.<br />
<br />
Note that the sum-squared captured table contains various statistics on the information captured by each component. Please see [[MCR and PARAFAC Variance Captured]] for details.<br />
The output pred is a structure array that contains the approximation of the data if the options field blockdetails is set to 'all' (see Options below).<br />
<br />
===Options===<br />
<br />
''options'' = a structure array with the following fields:<br />
<br />
* '''display''': [ {'on'} | 'off' ], governs level of display,<br />
<br />
* '''plots''': [ {'final'} | 'all' | 'none' ], governs level of plotting,<br />
<br />
* '''weights''': [], used for fitting a weighted loss function (discussed below),<br />
<br />
* '''stopcriteria''': Structure defining when to stop iterations based on any one of four criteria<br />
<br />
:* '''relativechange''': Default is 1e-6. When the relative change in fit gets below the threshold, the algorithm stops.<br />
:* '''absolutechange''': Default is 1e-6. When the absolute change in fit gets below the threshold, the algorithm stops.<br />
:* '''iterations''': Default is 10,000. When the number of iterations exceeds the threshold, the algorithm stops.<br />
:* '''seconds''': Default is 3600 (seconds). When the time spent exceeds the threshold, the algorithm stops.<br />
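The stopping criteria above can be tightened or relaxed via the options structure, e.g. (a sketch using the field names listed above; the values are illustrative):<br />
<br />
```matlab
options = parafac('options');
options.stopcriteria.relativechange = 1e-8;  % iterate to a tighter fit
options.stopcriteria.iterations    = 500;    % but give up after 500 iterations
options.stopcriteria.seconds       = 600;    % or after ten minutes
model = parafac(X,4,options);
```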
<br />
* '''init''': [ 0 ], defines how parameters are initialized (discussed below),<br />
<br />
* '''line''': [ 0 | {1}] defines whether to use the line search {default uses it},<br />
<br />
* '''algo''': [ {'ALS'} | 'tld' | 'swatld' ] governs the algorithm used. Only ALS supports more than three-way data and allows constraints,<br />
<br />
* '''iterative''': settings for iterative reweighted least squares fitting (see help on weights below),<br />
<br />
* '''validation.splithalf''': [ 'on' | {'off'} ], Allows doing [[splithalf]] analysis. See the help of SPLITHALF for more information,<br />
<br />
* '''auto_outlier.perform''': [ 'on' | {'off'} ], Will automatically remove detected outliers in an iterative fashion. See auto_outlier.help for more information,<br />
<br />
* '''scaletype''': Defines how loadings are scaled. See options.scaletype.text for help,<br />
<br />
* '''blockdetails''': [ {'standard'} | 'compact' | 'all' ] level of detail (predictions, raw residuals, and calibration data) included in the model.<br />
:* 'Standard' = the predictions and raw residuals for the X-block as well as the X-block itself are not stored in the model to reduce its size in memory. Specifically, these fields in the model object are left empty: 'model.pred{1}', 'model.detail.res{1}', 'model.detail.data{1}'.<br />
:* 'Compact' = like 'Standard' except that residual limits from the old model are used and the core consistency field in the model structure is left empty ('model.detail.reslim', 'model.detail.coreconsistency.consistency').<br />
:* 'All' = keep the predictions and raw residuals for the X-block as well as the X-block dataset itself.<br />
<br />
* '''preprocessing''': {[]}, one element cell array containing preprocessing structure (see PREPROCESS) defining preprocessing to use on the x-block <br />
<br />
* '''samplemode''': [1], defines which mode should be considered the sample or object mode,<br />
<br />
* '''constraints''': {3x1 cell}, defines constraints on parameters (discussed below),<br />
<br />
* '''coreconsist''': [ {'on'} | 'off' ], governs calculation of core consistency (turning off may save time with large data sets and many components), and<br />
<br />
* '''waitbar''': [ {'on'} | 'off' ], display waitbar. <br />
<br />
The default options can be retrieved using: options = parafac('options');.<br />
<br />
=====Weights=====<br />
<br />
Through the use of the ''options'' field weights it is possible to fit a PARAFAC model in a weighted least squares sense. The input is an array of the same size as the input data X holding individual weights for each element. Instead of minimizing the Frobenius norm ||x-M||<sup>2</sup>, where M is the PARAFAC model, the norm ||(x-M).*weights||<sup>2</sup> is minimized. The algorithm used for weighted regression is based on a majorization step according to Kiers, ''Psychometrika'', '''62''', 251-266, 1997, which has the advantage of being computationally inexpensive.<br />
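For instance, elements known to be unreliable can be down-weighted (a sketch; the indices and weight values are illustrative):<br />
<br />
```matlab
% Weighted least squares fit: the weights array has the same size as X,
% with one weight per element.
weights = ones(size(X));
weights(:,1:10,:) = 0.1;     % trust the first 10 emission channels less
options = parafac('options');
options.weights = weights;
model = parafac(X,4,options);
```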
<br />
=====Init=====<br />
<br />
The ''options'' field init is used to govern how the initial guess for the loadings is obtained. If the optional input ''initval'' is provided, options.init is not used. The following choices for init are available.<br />
<br />
Generally, options.init = 0 will suffice for well-behaved data, whereas options.init = 10 is suitable for difficult models. Difficult models are typically those with many components, with highly correlated loadings, or models where there are indications that local minima are present.<br />
<br />
* '''init''' = 0, PARAFAC chooses initialization {default},<br />
<br />
* '''init''' = 1, uses TLD (unless the data are more than three-way, in which case ATLD is used),<br />
<br />
* '''init''' = 2, based on singular value decomposition (good alternative to 1), <br />
<br />
* '''init''' = 3, based on orthogonalization of random values (good for checking local minima),<br />
<br />
* '''init''' = 4, based on approximate (sequentially fitted) PARAFAC model, <br />
<br />
* '''init''' = 5, based on compression which may be useful for large data, and<br />
<br />
* '''init''' > 5, based on best fit of many (the value options.init) small runs.<br />
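For difficult models, the multi-start initialization described above can be requested directly (a sketch):<br />
<br />
```matlab
options = parafac('options');
options.init = 10;            % best fit of 10 small preliminary runs
model = parafac(X,4,options);
```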
<br />
=====Constraints=====<br />
<br />
The ''options'' field constraints is used to impose constraints on the parameters. It is a cell array with a number of elements equal to the number of modes of the input data X. Each cell contains a structure array that defines the constraints in that particular mode; hence, options.constraints{2} defines constraints on the second-mode loadings. For help on setting constraints see [[constrainfit]]. Note that if your dataset is, e.g., a five-way array, the default constraints field in options only defines the first three modes; you must define the constraints field for the remaining modes yourself, which can be done by copying from the other modes, for example: options.constraints{4} = options.constraints{1}; options.constraints{5} = options.constraints{1};<br />
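As a sketch of the mechanism described above (this assumes, per [[constrainfit]], that each mode's constraint structure has a 'type' field and that the default type is unconstrained):<br />
<br />
```matlab
options = parafac('options');
% Constrain the first-mode loadings (scores) to be nonnegative:
options.constraints{1}.type = 'nonnegativity';
% For a five-way array the defaults only cover modes 1-3; copy an
% existing mode's (default) constraint structure to the remaining modes:
options.constraints{4} = options.constraints{2};
options.constraints{5} = options.constraints{2};
model = parafac(X,3,options);
```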
<br />
===Examples===<br />
<br />
parafac demo gives a demonstration of the use of the PARAFAC algorithm.<br />
<br />
model = parafac(X,5) fits a five-component PARAFAC model to the array X using default settings.<br />
<br />
pred = parafac(Z,model) fits a parafac model to new data Z. The scores will be taken to be in the first mode, but you can change this by setting options.samplemodex to the mode which is the sample mode. Note, that the sample-mode dimension may be different for the old model and the new data, but all other dimensions must be the same.<br />
<br />
options = parafac('options'); generates a set of default settings for PARAFAC. options.plots = 'none'; turns plotting off.<br />
<br />
options.init = 3; sets the initialization of PARAFAC to orthogonalized random numbers.<br />
<br />
options.samplemodex = 2; Defines the second mode to be the sample-mode. Useful, for example, when fitting an existing model to new data where the scores are in the second mode.<br />
<br />
model = parafac(X,2,options); fits a two-component PARAFAC model with the settings defined in options. <br />
<br />
parafac io shows the I/O of the algorithm.<br />
<br />
===See Also===<br />
<br />
[[analysis]], [[bspcgui]], [[datahat]], [[eemoutlier]], [[explode]], [[gram]], [[mpca]], [[npls]], [[outerm]], [[parafac2]], [[pca]], [[preprocess]], [[splithalf]], [[tld]], [[tucker]], [[unfoldm]], [[modelviewer]]</div>Rasmus