Duplex

Purpose

Select a subset of samples from a data set by the Duplex algorithm.

Synopsis

[selCal, selTest] = duplex(x, k)

Description

Selected samples should provide uniform coverage of the data set and include samples on its boundary. Duplex starts by selecting the two samples furthest from each other and assigns them to the calibration set. It then finds the two remaining samples furthest from each other and assigns them to the test set. Next, it iterates over the rest of the samples: it finds the sample furthest from the samples in the calibration set and assigns it to the calibration set, then finds the sample furthest from the samples in the test set and assigns it to the test set. This is repeated until the calibration set contains the desired number of samples (k).
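The selection steps described above can be sketched as a short script. This is an illustrative sketch, not the toolbox source. It assumes Euclidean distance, and it reads "furthest from the samples in the set" as the maximin rule (the candidate whose smallest distance to any sample already in the set is largest). The data in x are made up for the example, and at least four samples are assumed.

```matlab
% Illustrative data (assumption): 8 samples (rows) by 2 variables (columns).
x = [0 0; 10 10; 0 9; 9 0; 5 5; 2 8; 7 2; 4 3];
k = 3;                                   % desired calibration set size

n  = size(x, 1);
% Pairwise Euclidean distances between all samples.
sq = sum(x.^2, 2);
D  = sqrt(max(bsxfun(@plus, sq, sq') - 2*(x*x'), 0));

selCal  = false(1, n);
selTest = false(1, n);

% Step 1: the two samples furthest apart start the calibration set.
[~, idx] = max(D(:));
[i, j]   = ind2sub([n n], idx);
selCal([i j]) = true;

% Step 2: the two remaining samples furthest apart start the test set.
Dfree = D;
Dfree(selCal, :) = -Inf;                 % mask samples already assigned
Dfree(:, selCal) = -Inf;
[~, idx] = max(Dfree(:));
[i, j]   = ind2sub([n n], idx);
selTest([i j]) = true;

% Step 3: alternate between the two sets until calibration holds k samples.
while nnz(selCal) < k
    free = find(~selCal & ~selTest);
    if isempty(free), break; end
    % Maximin rule (assumption): largest minimum distance to the calibration set.
    [~, p] = max(min(D(free, selCal), [], 2));
    selCal(free(p)) = true;

    free = find(~selCal & ~selTest);
    if isempty(free), break; end
    % Same rule, measured against the test set.
    [~, p] = max(min(D(free, selTest), [], 2));
    selTest(free(p)) = true;
end

disp(find(selCal));                      % 1 2 6
disp(find(selTest));                     % 3 4 5
```

In this sketch the test set grows in step with the calibration set, so both end with k samples and any remaining samples are left unassigned. Whether duplex handles leftover samples the same way is not stated here.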

References:

R.D. Snee, Validation of regression models: methods and examples, Technometrics 19 (1977) 415-428
M. Daszykowski, B. Walczak, D.L. Massart, Representative subset selection, Analytica Chimica Acta 468 (2002) 91-103

Inputs

x = array or dataset object containing the data from which samples are selected,
k = number of samples to select.

Outputs

selCal = logical vector of length nsamples indicating which samples are selected for the calibration set (true = selected). If input x was a dataset object, selCal has size (1, nincluded), where nincluded is the number of included samples, and indicates which of the included samples are selected.

selTest = (1, nsamples) logical vector indicating which samples are selected for the test set (true = selected). If input x was a dataset object, selTest has size (1, nincluded) and indicates which of the included samples are selected.
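When some samples are excluded, the outputs hold one flag per included sample rather than one per row of the original data. The sketch below shows how such flags map back to original row numbers. It uses a plain matrix and a hypothetical index vector named included as stand-ins. It does not call the toolbox, and the flag values are made up.

```matlab
% Hypothetical stand-ins: 6 samples, of which samples 2 and 5 are excluded.
data     = (1:6)' * [1 10];              % 6-by-2 plain matrix, row i = [i 10*i]
included = [1 3 4 6];                    % original row numbers of included samples
selCal   = [true false false true];      % one flag per INCLUDED sample
selTest  = [false true false false];

% Map flags over included samples back to original row numbers.
calRows  = included(selCal);             % [1 6]
testRows = included(selTest);            % 3

calData  = data(calRows, :);             % rows 1 and 6 of the original data
testData = data(testRows, :);            % row 3 of the original data
```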

Example

>> load arch;
>> [selCal,selTest] = duplex(arch, 50);
>> arch_subset = arch(selCal,:);
