Duplex
Purpose
Select a subset of samples from a data set by the Duplex algorithm.
Synopsis
- [selCal, selTest] = duplex(x, k)
Description
The selected samples should provide uniform coverage of the data set and include samples on its boundary. Duplex starts by selecting the two samples furthest from each other and assigning them to the calibration set. It then finds the two furthest-apart samples among those remaining and assigns them to the test set. The procedure then iterates over the rest of the samples, alternately adding to each set the remaining sample that is furthest from the samples already in that set.
References:
- R.D. Snee, Validation of regression models: methods and examples, Technometrics 19 (1977) 415-428.
- M. Daszykowski, B. Walczak, D.L. Massart, Representative subset selection, Analytica Chimica Acta 468 (2002) 91-103.
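The selection procedure described above can be illustrated with a minimal sketch. This is not the toolbox's implementation: it is a plain Python illustration, and the function name `duplex_sketch`, the use of Euclidean distance, the "furthest from the nearest already-selected sample" (maximin) rule, and the choice to place k samples in each of the two sets are all assumptions made for the example.

```python
import math


def duplex_sketch(x, k):
    """Illustrative Duplex-style split (hypothetical helper, not the toolbox function).

    x is a list of samples (each a sequence of numbers); k is the number of
    samples placed in EACH set (an assumption of this sketch).
    Returns two boolean lists (sel_cal, sel_test) of length len(x).
    """
    n = len(x)
    if k < 2 or 2 * k > n:
        raise ValueError("need 2 <= k and 2*k <= number of samples")

    # Pairwise Euclidean distances (assumed metric).
    d = [[math.dist(a, b) for b in x] for a in x]
    remaining = set(range(n))

    def farthest_pair():
        # Pair of unassigned samples with the largest mutual distance.
        idx = sorted(remaining)
        pairs = ((i, j) for i in idx for j in idx if i < j)
        return max(pairs, key=lambda p: d[p[0]][p[1]])

    def maximin(selected):
        # Unassigned sample whose nearest already-selected sample is furthest away.
        return max(sorted(remaining), key=lambda i: min(d[i][j] for j in selected))

    cal, test = [], []

    # Seed the calibration set, then the test set, with a furthest-apart pair.
    for subset in (cal, test):
        i, j = farthest_pair()
        subset.extend([i, j])
        remaining.difference_update((i, j))

    # Alternate between the two sets until each holds k samples.
    while len(test) < k:
        for subset in (cal, test):
            i = maximin(subset)
            subset.append(i)
            remaining.discard(i)

    sel_cal = [i in cal for i in range(n)]
    sel_test = [i in test for i in range(n)]
    return sel_cal, sel_test
```

For ten equally spaced points on a line and k = 3, the sketch first takes the two end points for calibration and the next two outermost points for test, then fills in from the middle of each set's largest gap. Samples not chosen for either set are false in both output vectors.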
Inputs
- x = array or dataset containing the data from which to select k samples.
- k = number of samples to select.
Outputs
- selCal = logical vector of length nsamples indicating which samples are selected for the calibration set (true = selected). If input x is a dataset object, selCal has size (1, nincluded), where nincluded is the number of included samples, and indicates which of the included samples are selected.
- selTest = (1, nsamples) logical vector indicating which samples are selected for the test set (true = selected). If input x is a dataset object, selTest has size (1, nincluded) and indicates which of the included samples are selected.
Example
>> load arch;
>> [selCal,selTest] = duplex(arch, 50);
>> arch_subset = arch(selCal,:);
See Also
distslct, reducennsamples, splitcaltest, doptimal, stdsslct, randomsplit, spxy