Demonstration Datasets

From Eigenvector Research Documentation Wiki
__TOC__


==List of DataSets==
The following is a list of the data supplied with PLS_Toolbox. It consists of the name of each file (all end in ".mat" unless otherwise specified) along with a brief description of its contents. Each file may contain one or more variables.

{| class="wikitable sortable"
|-
!File Name !! Description !! Variable Name(s) !! Variable Size !! Task !! Notes
|-
|alcohol || Biological fluid analysis of alcoholics || alcohol || 65x52 || Classification ||
|-
|aminoacids || Fluorescence EEM of 5 amino acid samples || X || 5x20x61 || Decomposition || Try with PARAFAC
|-
|arch || Archeological artifact data set || arch || 75x10 || Decomposition, Classification ||
|-
|beer || VIS-NIR transmission recorded directly on undiluted degassed beer || beer <br />extract <br />beertest <br />extracttest || 40x926 <br />40x1 <br />20x926 <br />20x1 || Regression || Good for testing Variable Selection Interface
|-
|biscuit || NIR reflectance spectra of 40 samples of biscuit dough from 1200-2398 nm. || spec <br />recipe || 40x600 <br />40x4 || Regression ||
|-
|brain_weight || Body mass (kg) and brain mass (g) for 28 animals. || brains || 28x2 || Decomposition ||
|-
|bread || Sensory evaluation of breads. || bread || 10x11x8 || Decomposition || Try with PARAFAC; also good for testing EEM Filtering preprocessing
|-
|cancer || Fluorescence EEM spectra extracted from cervical cancer images. || cancer || 563x22 || Classification, Decomposition || Unfolded EEM data
|-
|corn_dso || 80 samples of corn measured on 3 different NIR spectrometers; moisture, oil, protein and starch values for each sample are also included. || conc <br />m5nbs <br />m5spec <br />mp5nbs <br />mp5spec <br />mp6nbs <br />mp6spec || 80x4 <br />3x700 <br />80x700 <br />4x700 <br />80x700 <br />4x700 <br />80x700 || Regression ||
|-
|data_mid_IR || Data sets for correlation spectroscopy. || data_mid_IR || 21x130 || Correlation Spectroscopy || Use with data_near_IR dataset
|-
|data_near_IR || Data sets for correlation spectroscopy. || data_near_IR || 21x149 || Correlation Spectroscopy || Use with data_mid_IR dataset
|-
|dorrit || EEM of 27 samples with 4 fluorophores for PARAFAC. || EEM <br />yblock || 27x116x18 <br />27x4 || Decomposition, Regression || Try PARAFAC and multi-way PLS (NPLS)
|-
|Dupont_BSPC || 10 process variables (pressure, flow, temperature) for 55 batches with 7 steps each. || dupont_cal <br />dupont_test || 3600x10 <br />1900x10 || Batch Processor ||
|-
|etchdata || Engineering process data from semiconductor metal etch || Etchcal <br />EtchTest || 20x12x107 <br />20x12x20 || Decomposition || Use Multi-Way PCA (MPCA). See Chemometrics Tutorial, chapter 5.
|-
|fia || UV detection of Flow Injection Analysis of hydroxy-benzaldehyde (n-way data: sample, wavelength, time) || FIA || 12x50x45 || Decomposition ||
|-
|flucuttest || Fluorescence EEM data || z || 2x15x23 || Decomposition || Try PARAFAC
|-
|FTIR_microscopy || FTIR microscopy transect spectra of a three-layer polymer laminate. || FTIR_microscopy || 17x81 || Decomposition || Use PURITY program
|-
|gcwine || Dynamic headspace GCMS data of red wines from different regions. || gcwine || 71x150x24 || Decomposition || Try PARAFAC2
|-
|halddata || Hald cement curing data. || xblock <br />yblock || 13x4 <br />13x1 || Regression ||
|-
|lcms || LC/MS electrospray of 15 surfactant solution. || lcms || 345x1451 || Decomposition || Use CODA-DW Tool. See Chemometrics Tutorial, chapter 9.
|-
|lcms_compare1 || Select data from LC/MS electrospray data set. || lcms_compare1 || 675x1940 || Decomposition || Try LCMS Compare Tool. Use with lcms_compare2 and lcms_compare3. See Chemometrics Tutorial, chapter 9.
|-
|lcms_compare2 || Select data from LC/MS electrospray data set. || lcms_compare2 || 675x1940 || Decomposition || Try LCMS Compare Tool. Use with lcms_compare1 and lcms_compare3. See Chemometrics Tutorial, chapter 9.
|-
|lcms_compare3 || Select data from LC/MS electrospray data set. || lcms_compare3 || 675x1940 || Decomposition || Try LCMS Compare Tool. Use with lcms_compare1 and lcms_compare2. See Chemometrics Tutorial, chapter 9.
|-
|MS_time_resolved || Direct probe time profile MS of three color-coupling compounds. || MS_time_resolved || 20x757 || Decomposition || Use PURITY program (try 3 and 4 components)
|-
|MS_time_resolved_references || Reference spectra of pure compounds from MS_time_resolved. || MS_time_resolved_references || 3x757 ||  || Use along with MS_time_resolved data for comparing results from PURITY
|-
|nir_data || NIR spectra of pseudo gasoline samples || conc <br />spec1 <br />spec2 || 30x5 <br />30x401 <br />30x401 || Regression || Try Savitzky-Golay preprocessing
|-
|nmr_data || NMR data for GRAM demo. || nmrdata || 20x1176 || Decomposition ||
|-
|octane || NIR spectra and octane number values of 39 gasoline samples. || spec <br />octane || 39x226 <br />39x1 || Regression, Decomposition || Try some robust methods. See Chemometrics Tutorial, chapter 13.
|-
|oesdata || Optical emission spectra from metal etch. || oes1 || 46x770 || Decomposition || Try Multivariate Curve Resolution (MCR). See Chemometrics Tutorial, chapter 9.
|-
|OliveOilData || 36 FT-IR spectra (3600-600 cm-1) of olive oils. || xcal <br />xtest || 36x518 <br />36x518 || Decomposition, Classification || Try Multiplicative Scatter Correction (MSC) preprocessing.
|-
|paint || Non-linear paint formulation data. || paint_cal_X <br />paint_cal_Y <br />paint_test_X <br />paint_test_Y || 49x4 <br />49x3 <br />8x4 <br />8x3 || Regression || Try with non-linear regression methods, such as SVM-R
|-
|pcadata || Slurry Fed Ceramic Melter data. || part1 <br />part2 || 300x10 <br />200x10 || Decomposition ||
|-
|plsdata || SFCM data for PCR and PLS demos. || xblock1 <br />yblock1 <br />xblock2 <br />yblock2 || 300x20 <br />300x1 <br />200x20 <br />200x1 || Regression || Try Multiple Linear Regression (MLR). See Chemometrics Tutorial, chapter 6.
|-
|pulsdata || Time series data from a Slurry Fed Ceramic Melter for PLSPULSM demo. || melter_data || 325x3 || Decomposition ||
|-
|purvardata || Raman spectra used in the "Chemometrics Tutorial" section "MATLAB Code for Pure Variable Method" (chapter 9). || data || 6x99 || Decomposition || Use PURITY
|-
|purvardata_noise || Raman spectra used in the "Chemometrics Tutorial" section "MATLAB Code for Pure Variable Method" (chapter 9). || data || 6x99 || Decomposition || Use PURITY
|-
|raman_dust_particles || Raman spectra || raman_dust_particles || 120x1025 || Decomposition || Use PURITY. Use with raman_dust_particles_references. See Chemometrics Tutorial section "MATLAB Code for Pure Variable Method", chapter 9. <br />Also try different Baseline preprocessing techniques with this data (Hint: order = 3).
|-
|raman_dust_particles_references || Raman spectra || raman_dust_particles_references || 3x1025 || Decomposition || Use PURITY. Use with raman_dust_particles. See Chemometrics Tutorial section "MATLAB Code for Pure Variable Method", chapter 9.
|-
|raman_time_resolved || Raman spectra of a time-resolved reaction, used in "Chemometrics Tutorial". || raman_time_resolved || 16x151 || Decomposition || Try the PURITY program
|-
|sawdata || Surface acoustic wave sensor data. || SAWdata || 72x13 || Decomposition, Classification ||
|-
|SBRdata_EU || NIR transmission spectra of styrene-butadiene copolymers. || Xcal <br />Ycal <br />Xtest <br />Ytest || 60x141 <br />60x4 <br />10x141 <br />10x4 || Regression || Try different regression methods (CLS, MLR, PLS)
|-
|stars || Surface temperature and light intensity values for 47 stars. || stars || 47x2 || Decomposition ||
|-
|sugar || Sugar fluorescence EEM N-way data set || sugar || 268x44x7 || Decomposition || Try Multi-Way PCA
|-
|wine || Wine demographic data set for PCA example. || wine || 10x5 || Decomposition ||
|-
|wineregion || Metal composition of wines for classification by region. || wineregion || 38x17 || Classification ||
|}
 
==How to Load Demo Data==
 
For MAT files, the easiest way to load demo data is using the Load Dialog Box:
 
# From the File menu, select Load Data (or Import Workspace/MAT File).
# When the dialog box appears, click the From File button.
# Type the name of the demo MAT file to be loaded and press the Return key.
# Select the variables to be loaded and click the Load button.
 
[[Image:LoadDemoData.png| |600px|Load Demo Data]]
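
The demo files can also be loaded from the MATLAB command line with the built-in <code>load</code> function. The following is a minimal sketch, assuming PLS_Toolbox is installed and its demo data is on the MATLAB path. The file plsdata and its variables xblock1 and yblock1 are taken from the table of data sets; the helper variable <code>found</code> is introduced here only for illustration.

```matlab
% Minimal sketch: load a demo data set from the command line.
% Assumption: PLS_Toolbox and its demo data are on the MATLAB path.
demoFile = 'plsdata.mat';

% exist(...,'file') returns 2 when the file is found on the path.
found = (exist(demoFile, 'file') == 2);

if found
    % List the variables stored in the file without loading them.
    whos('-file', demoFile)

    % Load only the variables of interest into a structure.
    s = load(demoFile, 'xblock1', 'yblock1');

    % Display the size of each loaded variable for comparison
    % with the "Variable Size" column of the table.
    disp(size(s.xblock1))
    disp(size(s.yblock1))
else
    warning('Demo file %s was not found on the MATLAB path.', demoFile);
end
```

Calling <code>load plsdata</code> without an output argument places all variables from the file directly in the workspace instead of in a structure.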
 
 
(Sub topic of [[PLS_Toolbox_Topics|PLS_Toolbox_Topics]])

Latest revision as of 16:07, 8 May 2019
