Spxy

From Eigenvector Research Documentation Wiki
Revision as of 06:40, 20 November 2023 by Lyle (talk | contribs) (→‎Example)

Purpose

Selects a subset of samples from a data set by the SPXY algorithm.

Synopsis

sel = spxy(x,y,k)

Description

The selected samples should provide uniform coverage of the data set, taking both the X and Y data into account, and should include samples on the boundary of the data set. The algorithm is similar to the one described in the reference below. The distance calculation differs from the reference so that it remains consistent with how Eigenvector calculates distances in the kennardstone and duplex functions.

Reference: R.K.H. Galvao, M.C.U. Araujo, G.E. Jose, M.J.C. Pontes, E.C. Silva, T.C.B. Saldanha (2005): A method for calibration and validation subset partitioning, Talanta 67, 736-740.
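For orientation, the selection scheme in the reference can be sketched in a few lines of MATLAB. This is a minimal sketch of the published approach (Euclidean X and Y distances, each scaled by its maximum, then summed, followed by Kennard-Stone style max-min selection). It is not the Eigenvector implementation, whose distance calculation differs as noted above. The function name spxy_sketch and the helper eucdist are hypothetical, and the sketch assumes plain numeric arrays, 2 <= k <= nsamples, and at least two distinct rows in both x and y.

```matlab
function sel = spxy_sketch(x, y, k)
% SPXY_SKETCH  Illustrative SPXY-style selection (hypothetical helper,
% not the PLS_Toolbox spxy function).
  n = size(x, 1);
  assert(k >= 2 && k <= n, 'k must satisfy 2 <= k <= nsamples');

  dx = eucdist(x);                            % pairwise distances in X
  dy = eucdist(y);                            % pairwise distances in Y
  d  = dx ./ max(dx(:)) + dy ./ max(dy(:));   % joint, max-normalized distance

  % Start with the pair of samples that are farthest apart.
  [~, idx] = max(d(:));
  [i, j]   = ind2sub([n n], idx);
  chosen   = [i j];

  % Repeatedly add the sample whose nearest chosen sample is farthest away.
  while numel(chosen) < k
    rest      = setdiff(1:n, chosen);
    dmin      = min(d(rest, chosen), [], 2);
    [~, best] = max(dmin);
    chosen(end+1) = rest(best);
  end

  % Return a logical column vector (orientation chosen for this sketch only).
  sel = false(n, 1);
  sel(chosen) = true;
end

function d = eucdist(a)
% Pairwise Euclidean distances between the rows of a.
  sq = sum(a.^2, 2);
  d2 = bsxfun(@plus, sq, sq') - 2 * (a * a');
  d  = sqrt(max(d2, 0));                      % clamp small negative round-off
end
```

Because the two most distant samples are chosen first and each later pick maximizes the distance to its nearest already-chosen sample, boundary samples tend to be selected early, which matches the behavior described above.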

Inputs

  • x = array or dataset containing the X-block data from which k samples are selected.
  • y = array or dataset containing the Y-block data for the same samples.
  • k = number of samples to select.

Outputs

  • sel = logical vector of length nsamples, indicating samples which are selected (true = selected). If input x was a dataset object then sel has size (1, nincluded) where nincluded is the number of included samples, and sel indicates which included samples are selected.
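Since sel is a logical vector, it can be used directly as a row index. The snippet below is a small sketch of one possible use, splitting data into a selected and a non-selected part; the data and the sel vector are made up for illustration, and in practice sel would be the output of spxy(x,y,k). It assumes plain numeric arrays. For a dataset object, recall that sel refers to the included samples only.

```matlab
% Hypothetical 6-sample example; sel is hard-coded here for illustration.
x   = magic(6);                      % 6 samples by 6 variables
y   = (1:6)';                        % one response value per sample
sel = logical([1 0 1 0 0 1]);        % stand-in for sel = spxy(x, y, 3)

% Logical indexing splits the rows into selected and remaining samples.
xsel  = x(sel, :);   ysel  = y(sel);
xrest = x(~sel, :);  yrest = y(~sel);

% Convert to sample indices if a list of row numbers is preferred.
idx = find(sel);
```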

Example

>> load beer;
>> sel = spxy(beer, extract, 10);

See Also

distslct, reducennsamples, splitcaltest, doptimal, stdsslct, randomsplit