Doptimal: Difference between revisions

Latest revision as of 17:36, 8 October 2008

Purpose

Selects samples from a candidate matrix that satisfy the d-optimal condition.

Synopsis

isel = doptimal(x,nosamps,iint,tol)

Description

DOPTIMAL selects a number (nosamps) of samples from a candidate matrix x so as to maximize det(x(isel,:)'*x(isel,:)), where isel is a vector of indices of the selected samples.

The optional input iint is a vector of indices used to initialize the optimization algorithm. If iint is not supplied, the algorithm is initialized with samples that the DISTSLCT function identifies as lying on the exterior of the data set. This contrasts with the random starting subset used in many algorithms. The reason is that the routine is based on Fedorov's algorithm (de Aguiar, P.F., Bourguignon, B., Khots, M.S., Massart, D.L., and Phan-Than-Luu, R., "D-optimal designs", Chemo. Intell. Lab. Sys., 30, 199-210, 1995), which requires calculating inv(x(isel,:)'*x(isel,:)), and this inverse may not exist for a randomly chosen subset. The routine then exchanges the 'least informative' sample in the selected set with a 'more informative' sample in the candidate set. The optional input tol sets the tolerance for the minimum increase in the determinant {default = 1x10^-4}.
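The exchange idea can be sketched in base MATLAB as follows. This is a simplified, brute-force illustration and not the DOPTIMAL source: it recomputes the determinant for every trial swap rather than using Fedorov's update formulas, the candidate matrix and starting subset are hypothetical, and treating tol as a relative increase is an assumption.

```matlab
% Brute-force exchange sketch (NOT the DOPTIMAL implementation).
t     = linspace(-1,1,21)';        % 21 candidate settings of one factor
x     = [ones(21,1) t t.^2];       % candidate matrix for a quadratic model
isel  = [9 11 13];                 % hypothetical starting subset (x'x is invertible)
tol   = 1e-4;                      % assumed here to be a relative increase
dbest = det(x(isel,:)'*x(isel,:)); % criterion for the starting subset

improved = true;
while improved
  improved = false;
  for i = 1:numel(isel)
    % Try every candidate row that is not currently selected.
    for j = setdiff(1:size(x,1),isel)
      trial    = isel;
      trial(i) = j;                % swap selected sample i for candidate j
      dtrial   = det(x(trial,:)'*x(trial,:));
      if dtrial > dbest*(1+tol)    % keep the swap only if det grows enough
        isel     = trial;
        dbest    = dtrial;
        improved = true;
      end
    end
  end
end
isel = sort(isel);
```

For this small quadratic example the sketch settles on the two extremes and the center of the range (rows 1, 11, and 21). A different starting subset may take a different path, which is why the initialization matters.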

Note that nosamps must be ≥ rank(x) (it is necessary but not sufficient that nosamps ≥ size(x,2)) for a good solution to be found. This is required so that a good estimate of inv(x(isel,:)'*x(isel,:)) can be obtained. When nosamps < size(x,2), the scores from PCA or PLS can be used in place of x, provided nosamps ≥ the number of factors (principal components or latent variables) used. Also note that the solution can depend on the initial guess and that isel does not necessarily represent a global optimum.
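When the candidate matrix has more columns than the number of samples to be selected, a score matrix with fewer columns can stand in for it. The sketch below computes PCA scores with base MATLAB's svd. The data, the variable names (xc, k, t), and the use of svd instead of a dedicated PCA routine are assumptions made for illustration.

```matlab
% Deterministic example data: 30 samples, 10 variables (illustrative only).
x       = sin((1:30)'*(1:10));
nosamps = 5;                       % fewer samples wanted than columns in x
k       = 3;                       % hypothetical number of principal components

% Mean-center the columns, then take the SVD of the centered data.
xc      = x - repmat(mean(x,1),size(x,1),1);
[u,s,v] = svd(xc,'econ');          % v holds the loadings (not used below)

% PCA scores for the first k components (30 x k).
t = u(:,1:k)*s(1:k,1:k);

% Condition from the text: nosamps should be at least the number of factors.
ok = nosamps >= size(t,2);
```

The score matrix t would then take the place of x in the call, for example doptimal(t,nosamps).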

Inputs

x: data matrix
nosamps: number of samples to select

Optional Inputs

iint: vector of initialization indices
tol: tolerance for minimum increase in the determinant {default: 1x10^-4}

Outputs

isel: vector of selected indices

Examples

For an input matrix x that is m by 5, the following calls select five and six samples, respectively:

  isel5 = doptimal(x,5);

  isel6 = doptimal(x,6);
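The optional inputs follow the order given in the Synopsis. The sketch below is a hedged illustration that needs the toolbox providing doptimal on the path. The example data are made up, and the assumption that iint should hold one index per sample to be selected is not stated in the text above.

```matlab
% Requires the toolbox that provides doptimal.
x    = sin((1:40)'*(1:5));         % illustrative 40 x 5 candidate matrix
iint = 1:6;                        % hypothetical starting indices (assumed: one per selected sample)
tol  = 1e-6;                       % tighter than the documented 1e-4 default

isel = doptimal(x,6,iint,tol);

% Compare the criterion for the starting and the returned subsets.
d0 = det(x(iint,:)'*x(iint,:));
d1 = det(x(isel,:)'*x(isel,:));
```

Since the routine exchanges samples to increase the determinant, d1 would be expected to be no smaller than d0. Repeating the call with a different iint is a simple way to see how much the result depends on the initial guess.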

See Also

distslct, stdsslct
