Experimentreadr

From Eigenvector Research Documentation Wiki

Purpose

Reads an experiment file containing filenames and reference measurement data, and imports the corresponding files and data into the Analysis GUI. This makes it possible to automatically assign the data to the x- and y-blocks and to the calibration and validation blocks.

Synopsis

experimentreadr(filename)

Description

Experiment files include a list of data files and their corresponding "properties of interest" (y-values). An experiment file is expected to be a plain text file (delimited by commas, spaces, tabs, or another delimiter) or a Microsoft Excel-formatted file. The file must consist of one column of text strings naming the files to be read and used as samples in the X-block of a regression or classification model, and at least one column of numerical values to use as the corresponding y-block. If no filename is specified on the command line, the user is prompted to locate a suitable experiment file.
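The layout described above can be checked before handing a file to experimentreadr. The following is a minimal Python sketch, not part of the toolbox: the function name check_experiment_rows is hypothetical, and it assumes a comma-delimited file with no header row.

```python
import csv
import io

def check_experiment_rows(text):
    """Return (filenames, y_values), or raise ValueError if a row breaks the layout.

    Assumes comma-delimited text with no header row (an assumption of this sketch).
    """
    filenames, y_values = [], []
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # skip blank lines
        name, *values = (cell.strip() for cell in row)
        if not name:
            raise ValueError(f"line {line_no}: missing filename")
        if not values:
            raise ValueError(f"line {line_no}: no y-value column")
        try:
            y_values.append([float(v) for v in values])
        except ValueError:
            raise ValueError(f"line {line_no}: non-numeric y-value") from None
        filenames.append(name)
    return filenames, y_values

names, y = check_experiment_rows("file1.spc,13.2\nfile2.spc,19.0\n")
```

Here names holds the X-block filenames and y holds one list of y-values per sample.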

Once loaded, the experiment file can be edited: samples can be excluded using the include field of the Row Labels tab, and y-block columns can be excluded using the include field of the Column Labels tab. Samples can also be assigned to the Calibration or Validation set using the Row Labels tab.

When all manipulations are complete, the user clicks the check-mark toolbar button to import all the indicated files and automatically load the experiment data into the Analysis GUI.

X-block File Formats

The named x-block files can be in any standard readable file format. However, experiment files do not currently support multi-file formats: each named file must contain only one sample (row) of data.

Header Row

An experiment file can include an optional header row for the filenames and properties of interest. This row can contain text labels, which will be used to label the y-block columns (i.e., to give a text description of each property of interest).

Calibration/Validation

Experiment files can also contain information used to split the data into calibration and validation sets. To use this feature, include an additional column with the keyword "Calibration" or "Validation" next to each file. When the experiment is imported, the data will be automatically loaded into the appropriate data blocks. NOTE: the following synonyms are also valid (all are case insensitive):

Calibration = Cal = C
Validation = Val = V = Test = T
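
When preparing experiment files with a script, it can help to map these synonyms to one canonical keyword. The following is a minimal Python sketch of that mapping, not the toolbox's own implementation; the names SET_SYNONYMS and normalize_set_keyword are hypothetical.

```python
# Synonyms listed above, keyed in lower case for case-insensitive lookup.
SET_SYNONYMS = {
    "calibration": "Calibration", "cal": "Calibration", "c": "Calibration",
    "validation": "Validation", "val": "Validation", "v": "Validation",
    "test": "Validation", "t": "Validation",
}

def normalize_set_keyword(keyword):
    """Map a calibration/validation keyword to its canonical spelling."""
    key = keyword.strip().lower()
    if key not in SET_SYNONYMS:
        raise ValueError(f"unrecognized set keyword: {keyword!r}")
    return SET_SYNONYMS[key]

labels = [normalize_set_keyword(k) for k in ["CAL", "test", "V"]]
```

The sketch rejects unrecognized keywords rather than guessing a set for them.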

Overriding File Format

If the extension on the specified files does not unambiguously identify the importer to be used (e.g. xy files with an extension of ".txt" will be read by the delimited text file importer, not the XY importer), then the file may supply an additional "header" line above the column headers which specifies the file format to expect. This line must contain the keyword "format" followed by an equal sign and the name of the import method to use. For example:

 format=xy

Note that an overriding file format can ONLY be specified when a column header row (described above) is also included.
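
As an illustration of the resulting layout, the following minimal Python sketch assembles the text of a comma-delimited experiment file with a format line placed above the column headers. It is not part of the toolbox; the function name build_experiment_text and the file name scan1.txt are hypothetical.

```python
def build_experiment_text(rows, header, file_format=None):
    """Assemble comma-delimited experiment file text.

    A format line is only written together with a header row, because the
    override is only allowed when column headers are present.
    """
    if not header:
        raise ValueError("a column header row is required")
    lines = []
    if file_format is not None:
        lines.append(f"format={file_format}")  # override line goes above the headers
    lines.append(",".join(header))
    lines.extend(",".join(str(cell) for cell in row) for row in rows)
    return "\n".join(lines) + "\n"

text = build_experiment_text(
    [("scan1.txt", 13.2)],
    ["filename", "concentration"],
    file_format="xy",
)
print(text)
```

Making the header a required argument means the sketch cannot emit a format line without a header row.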

Example Experiment File

  filename,concentration,cal/val
  file1.spc,13.2,cal
  file2.spc,19.0,cal
  file3.spc,5.3,cal
  file4.spc,8.3,val

The above experiment file would define an experiment with four samples, with X-block data stored in the indicated files and y-values of 13.2, 19.0, 5.3, and 8.3 (with a text description of the y-values as "concentration"). The first three files would be used for calibration, the last file for validation.
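
To make the interpretation of this example concrete, the following minimal Python sketch parses the same text and splits the samples by the third column. It only illustrates the file layout and is not how experimentreadr itself is implemented; the function name split_example is hypothetical.

```python
import csv
import io

EXAMPLE = """filename,concentration,cal/val
file1.spc,13.2,cal
file2.spc,19.0,cal
file3.spc,5.3,cal
file4.spc,8.3,val
"""

def split_example(text):
    """Return (y_label, calibration, validation) from a three-column experiment file."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # first row holds the column labels
    calibration, validation = [], []
    for name, y_value, flag in reader:
        key = flag.strip().lower()
        if key in {"calibration", "cal", "c"}:
            calibration.append((name, float(y_value)))
        elif key in {"validation", "val", "v", "test", "t"}:
            validation.append((name, float(y_value)))
        else:
            raise ValueError(f"unrecognized set keyword: {flag!r}")
    return header[1], calibration, validation

y_label, cal, val = split_example(EXAMPLE)
```

Here y_label is the text description taken from the header row, and cal and val hold (filename, y-value) pairs for the calibration and validation sets.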

[Figure: Experimentreadr diagram]

See Also

analysis, autoimport, xclreadr