Spxy
Latest revision as of 06:40, 20 November 2023
Purpose
Selects a subset of samples from a data set by the SPXY algorithm.
Synopsis
- sel = spxy(x,y,k)
Description
The selected samples should provide uniform coverage of the data set, taking both the X and Y data into account, and should include samples on the boundary of the data set. The algorithm is similar to the one described in the reference below. The distance calculation differs from the reference so that it remains consistent with how Eigenvector calculates distances in the kennardstone and duplex functions.
Reference: R.K.H. Galvao, M.C.U. Araujo, G.E. Jose, M.J.C. Pontes, E.C. Silva, T.C.B. Saldanha (2005): A method for calibration and validation subset partitioning, Talanta 67, 736-740
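To make the selection idea concrete, the following is a minimal Python sketch of the procedure as described in the cited reference, not of the spxy function itself, whose distance calculation differs as noted above. It assumes Euclidean distances in X and in Y, each divided by its maximum and then summed into a joint distance, followed by Kennard-Stone style max-min selection. The names spxy_sketch and _pairwise are hypothetical and exist only for this illustration.

```python
import math


def _pairwise(rows):
    """Euclidean distance matrix for a list of equal-length numeric rows."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = math.dist(rows[i], rows[j])
    return d


def spxy_sketch(x, y, k):
    """Return a boolean mask marking k samples chosen on a joint X-Y distance.

    Illustrative only: follows the scheme in the reference, not the toolbox.
    x and y are lists of rows (one row per sample), k is the number to select.
    """
    n = len(x)
    if len(y) != n:
        raise ValueError("x and y must have the same number of samples")
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of samples")

    dx, dy = _pairwise(x), _pairwise(y)
    # Normalise each block by its largest distance so X and Y weigh equally.
    # Fall back to 1.0 when all samples in a block coincide.
    max_dx = max(map(max, dx)) or 1.0
    max_dy = max(map(max, dy)) or 1.0
    d = [[dx[i][j] / max_dx + dy[i][j] / max_dy for j in range(n)]
         for i in range(n)]

    # Start with the two samples that are farthest apart (boundary samples).
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    first, second = max(pairs, key=lambda p: d[p[0]][p[1]])
    chosen = [first, second]

    # Repeatedly add the sample whose nearest chosen neighbour is farthest away.
    while len(chosen) < k:
        remaining = [i for i in range(n) if i not in chosen]
        nxt = max(remaining, key=lambda i: min(d[i][c] for c in chosen))
        chosen.append(nxt)

    return [i in chosen for i in range(n)]
```

Starting from the most distant pair is what places samples on the boundary of the data set, and the max-min step is what spreads the remaining selections evenly.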
Inputs
- x = array, or dataset, containing X block data to select k samples from.
- y = array, or dataset, containing Y block data to select k samples from.
- k = number of samples to select.
Outputs
- sel = logical vector of length nsamples, indicating which samples are selected (true = selected). If input x is a dataset object, then sel has size (1, nincluded), where nincluded is the number of included samples, and sel indicates which of the included samples are selected.
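When x is a dataset object with excluded samples, sel refers to positions among the included samples rather than to rows of the full data. A small Python sketch of mapping such a mask back to original row indices is given below. It assumes the indices of the included samples are available as a list; the name selected_original_indices and the included argument are hypothetical, and the indices here are 0-based, whereas MATLAB indices are 1-based.

```python
def selected_original_indices(included, sel):
    """Map a mask over included samples back to original row indices.

    included: original row indices of the included samples (assumed input).
    sel: boolean mask of the same length, True where a sample is selected.
    """
    if len(included) != len(sel):
        raise ValueError("sel must have one entry per included sample")
    return [idx for idx, keep in zip(included, sel) if keep]
```

For example, with included rows 0, 2, 3 and 5 and a mask that is true in the first, third and fourth positions, the selected original rows are 0, 3 and 5.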
Example
>> load beer;
>> sel = spxy(beer, extract, 10);
See Also
distslct, reducennsamples, splitcaltest, doptimal, stdsslct, randomsplit