Release Notes Version 7.0

Latest revision as of 11:17, 24 February 2014

Version 7.0 of PLS_Toolbox and Solo was released in October 2012.

For general product information, see the PLS_Toolbox Product Page. For information on Solo, see the Solo Product Page. This release was made in conjunction with MIA_Toolbox / Solo+MIA version 2.8.


New Features in Solo and PLS_Toolbox

Batch Statistical Process Control Tools

  • New top-level data processor to read, align, tag, and arrange batch data into appropriate form for batch analysis.
  • Creates data in appropriate format for analysis with these model types:
    • Summary PCA (PCA on summary of variables over time)
    • Batch Maturity (PCA with heterogeneous confidence limits)
    • MPCA (Multiway PCA)
    • PARAFAC (Parallel Factor Analysis)
    • Summary PARAFAC (PARAFAC on summary of variables over time)
    • PARAFAC2 (only available in PLS_Toolbox with MATLAB)
  • Graphical and automatic identification of batches and [optional] steps in the imported data.
  • Automatic alignment of batches (when necessary) by linear, infilling, or Correlation Optimized Warping.
  • Summary methods allow a wide range of statistics to be calculated for each variable.
  • Opens processed data directly in Analysis for immediate model building.
  • Processing steps are stored for easy application to new data (in data application mode).
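
As a rough illustration of the "linear" alignment option, the sketch below resamples batches of different lengths onto a common number of time points by linear interpolation. It is a minimal standalone Python example, not the tool's implementation: the function name resample_linear and the sample values are hypothetical, and it assumes "linear" alignment means interpolation onto a shared length.

```python
def resample_linear(values, n_points):
    """Linearly interpolate a 1-D sequence onto n_points evenly spaced positions."""
    if n_points < 2 or len(values) < 2:
        raise ValueError("need at least two input values and two output points")
    last = len(values) - 1
    out = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)  # fractional index into the original batch
        lo = int(pos)                    # sample at or just before pos
        hi = min(lo + 1, last)           # next sample, clamped at the end
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


# Hypothetical batches recorded with different numbers of time points.
batches = {
    "batch_A": [0.0, 1.0, 2.0, 3.0, 4.0],
    "batch_B": [0.0, 2.0, 4.0],
}

# After alignment every batch has the same length and can be stacked for analysis.
aligned = {name: resample_linear(v, 5) for name, v in batches.items()}
```

Once all batches share a length, they can be arranged into the batch x time x variable layout that multiway methods such as MPCA or PARAFAC expect.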

Analysis Window

  • BatchMaturity analysis type added (PCA model with heterogeneous confidence limits for scores).
  • Split data into calibration / validation sets using manual or automatic selection. (Video: http://www.eigenvector.com/eigenguide.php?m=../movies_wiki/for70/cal_val_split)
  • Calculate relative T^2 and Q contributions. New buttons on Plot Controls allow selection of sample(s) as a T^2 or Q reference set. The resulting T^2 or Q contributions are then calculated relative to the selected sample(s). (Video: http://www.eigenvector.com/eigenguide.php?m=../movies_wiki/for70/relative_contributions)
  • Y-block loadings included in bi-plots (PLS).
  • Cross-validation results in SSQ table. (Video: http://www.eigenvector.com/eigenguide.php?m=../movies_wiki/for70/crossval_info_analysis)
  • 3D Loadings from multiway methods can be plotted as 3D surfaces (or other 3D plots).
  • Change included data directly on preprocessed data plots.
  • "Export to Regression Vector" allowed for MLR models.
  • Cross-validation enabled by default, with improved user awareness of the options.
  • Model Cache "Date" mode now sorts in descending order (for faster access to the most recent models and data).
  • Tucker congruence and core consistency tests added for multiway models (warns the user when the "supposed to be one" elements of the core show signs of degeneracy).
  • Purity now has "Resolve" and "Accept" buttons to improve usability.
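
For readers unfamiliar with the Tucker congruence coefficient mentioned above, it is the uncentered cosine between two loading vectors; values close to +1 or -1 between two different components are a common warning sign of degeneracy. The sketch below is a generic Python illustration of that formula only. The function name and example vectors are hypothetical, and how the Analysis window applies or thresholds the coefficient is not described in these notes.

```python
import math


def tucker_congruence(x, y):
    """Tucker's congruence coefficient: sum(x*y) / sqrt(sum(x^2) * sum(y^2))."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if den == 0:
        raise ValueError("congruence is undefined for an all-zero vector")
    return num / den


# Two hypothetical loading vectors that are nearly proportional to each other.
load_1 = [0.10, 0.40, 0.80, 0.40, 0.10]
load_2 = [0.12, 0.41, 0.79, 0.38, 0.11]
phi = tucker_congruence(load_1, load_2)  # close to 1: the components look alike
```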

Scores Plots

(see also #Plot Controls and Visualization Tools below)

  • Double-sided confidence limits display is more configurable: display as shaded regions, lines, or both, and choose the color.
  • Cross-validation sub-sets are included as classes to show which samples were in which cross-validation groups. (Video: http://www.eigenvector.com/eigenguide.php?m=../movies_wiki/for70/crossval_info_analysis)
  • Added support for SVM One-Class models (command-line only).
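
To make the idea of cross-validation groups shown as classes concrete, the sketch below builds a per-sample vector of group numbers for a venetian-blinds split and then lists the members of each group. This is an assumption-laden Python illustration: the function names are hypothetical, and the exact encoding PLS_Toolbox uses for its class set is not specified here.

```python
def venetian_blinds_groups(n_samples, n_splits):
    """Assign every n_splits-th sample to the same leave-out group (numbered from 1)."""
    if n_splits < 1:
        raise ValueError("n_splits must be at least 1")
    return [(i % n_splits) + 1 for i in range(n_samples)]


def samples_by_group(groups):
    """Map each group number to the list of sample indexes (0-based) it contains."""
    members = {}
    for idx, g in enumerate(groups):
        members.setdefault(g, []).append(idx)
    return members


groups = venetian_blinds_groups(7, 3)   # one group number per sample
members = samples_by_group(groups)      # which samples were left out together
```

A vector like groups can be attached to the samples as a class set, so that coloring a scores plot by that class set shows which samples were left out together.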

Plot Controls and Visualization Tools

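  • Automatically find and mark peak locations. (Video: http://www.eigenvector.com/eigenguide.php?m=../movies_wiki/for70/peak_marking)
  • Quick-search bar for selecting by labels, axisscales, classes and indexes. (Video: http://www.eigenvector.com/eigenguide.php?m=../movies_wiki/for70/quick_selection_search)
  • Plot type button for quick change between various plot types. (Video: http://www.eigenvector.com/eigenguide.php?m=../movies_wiki/for70/plot_types_peak_marking)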
  • Plot types "Monotonic", "scatter" and "line" added.
  • Right-click access to adjust axis limits and other plot settings.
  • Improved appearance of selections, 3D plots, stacked plots, and class set identifiers.
  • Class population statistics by right-clicking data (shows # and % of samples in each class). (Video: http://www.eigenvector.com/eigenguide.php?m=../movies_wiki/for70/class_summary)
  • Colormap changes by right-clicking image (with MIA_Toolbox and Solo+MIA only).
  • Autosize markers option added (automatically adjusts marker sizes to match axes size, if not specified otherwise).
  • Color-by now supports axisscales and index, and clarifies what will be colored (lines or points) for all types.
  • Magnify tool now has an easy "resize" corner.
  • Label points by their axisscale value.
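
Automatic peak marking can be pictured as a search for local maxima. The sketch below is a deliberately simple Python version that marks strict local maxima, optionally above a minimum height; the function name and the toy spectrum are hypothetical, and the algorithm actually used by the plot tools is not described in these notes.

```python
def find_peaks(y, min_height=None):
    """Return indexes of strict local maxima in y, optionally above min_height."""
    peaks = []
    for i in range(1, len(y) - 1):
        # A strict local maximum is higher than both of its neighbours.
        # Flat-topped peaks are skipped by this simple rule.
        if y[i] > y[i - 1] and y[i] > y[i + 1]:
            if min_height is None or y[i] >= min_height:
                peaks.append(i)
    return peaks


spectrum = [0, 1, 3, 1, 0, 2, 5, 2, 0, 1, 0]
all_peaks = find_peaks(spectrum)                  # every local maximum
tall_peaks = find_peaks(spectrum, min_height=2)   # ignore the small bump at index 9
```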

TrendTool

  • Automatically find peaks and place markers at those peaks.
  • Images now allow showing of more than 3 markers at a time (MIA_Toolbox and Solo+MIA only).
  • Image axisscale is used when displaying images (MIA_Toolbox and Solo+MIA only).

DataSet Editor

  • Axistype (continuous, discrete,...) support in labels tabs.
  • Bulk selection changes in label tab context menus (allows quick selections based on list of all samples).
  • Export to ThermoGalactic SPC file format from File menu.
  • Data drop support (dropping onto tabs imports data).
  • Data augmentation adds classes to identify different data blocks (for column-augment mode).

Import / Export

  • SPC File Format
    • Improved multiple file reading (with unequally spaced x-axis)
    • Improved handling of automatic axis scale names
  • CSV File Format - Allow space, tab, and | as valid automatically-detected delimiters for CSV files (improves drag/drop importing behavior).
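
One way to picture automatic delimiter detection is to test each candidate character and keep the one that splits every line into the same number of fields. The Python sketch below follows that idea; the function name, the candidate order, and the tie-breaking rule are assumptions made for illustration and do not describe how xclreadr works internally.

```python
def detect_delimiter(lines, candidates=(",", "\t", "|", " ")):
    """Pick the candidate that occurs the same non-zero number of times on every line."""
    best, best_count = None, 0
    for delim in candidates:
        counts = [line.count(delim) for line in lines if line.strip()]
        if not counts:
            continue
        consistent = counts[0] > 0 and all(c == counts[0] for c in counts)
        # Prefer the consistent delimiter giving the most fields;
        # on a tie the earlier candidate is kept.
        if consistent and counts[0] > best_count:
            best, best_count = delim, counts[0]
    return best


pipe_file = ["time|temp|pressure", "0|21.5|1.01", "1|21.7|1.02"]
delimiter = detect_delimiter(pipe_file)
fields = [line.split(delimiter) for line in pipe_file]
```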

Preprocessing and Transformations

  • savgol - New "tails" mode to improve performance at ends of spectra.
  • classcentroid - New class centroid centering preprocessing method.
  • mscorr - MSC with new 'median' method for robust scaling (and for use with Probabilistic Quotient Normalization, PQN).
  • wlsbaseline - Whittaker filter option added to Weighted Least Squares baseline. Fast, and better for baselines which don't look like polynomials.
  • reducennsamples - Added access to help within settings dialog.
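
The class centroid centering idea can be sketched as follows: compute the mean of each class, average those class means into a single centroid, and subtract that centroid from every sample, so that a large class does not dominate the centering. This Python sketch rests on that reading of "centers data to the centroid of all classes"; the function name and data are hypothetical, and the details of the classcentroid function itself are not given in these notes.

```python
def class_centroid_center(rows, classes):
    """Center rows on the mean of the per-class means (not the overall mean)."""
    n_vars = len(rows[0])
    sums, counts = {}, {}
    for row, c in zip(rows, classes):
        acc = sums.setdefault(c, [0.0] * n_vars)
        for j, v in enumerate(row):
            acc[j] += v
        counts[c] = counts.get(c, 0) + 1
    # One mean vector per class, each class weighted equally afterwards.
    class_means = [[s / counts[c] for s in sums[c]] for c in sums]
    centroid = [sum(m[j] for m in class_means) / len(class_means) for j in range(n_vars)]
    centered = [[v - centroid[j] for j, v in enumerate(row)] for row in rows]
    return centered, centroid


# Three samples of class "a" and one of class "b": the overall mean would be
# [5, 6], but the centroid of the two class means is [7, 8].
rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [11.0, 12.0]]
classes = ["a", "a", "a", "b"]
centered, centroid = class_centroid_center(rows, classes)
```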

New Demo Datasets

  • cancer - Fluorescence EEM spectra from images of cervices with various states of cervical cancer.
  • Dupont_BSPC - Batch data (10 variables x 36 batches at 100 time intervals) from Nomikos & MacGregor, Technometrics, 37(1), 1995.
  • OliveOilData - Olive Oil FT-IR Classification data from Dahlberg et al., Appl. Spectrosc., 51(8), 1118-1124 (1997).

General Solo Improvements

  • Re-enable docked figures with Solo & Solo+MIA.
  • Improved memory performance (java.opts modification).


New Command-line Features and Functions

  • Full support for MATLAB R2012b

EVRIModel Objects

New high-level object to contain models. Provides an object-oriented approach to building new models and makes working with models and predictions much easier.

  • Build models using simple object-oriented assignments and methods
  • Apply models to new data directly using model methods: prediction = model.apply(newdata);
  • Review results in plots using object methods: model.plotscores, model.plotloads, model.ploteigen
  • Access commonly-used results via simple properties: .scores, .predictions, .tcon, .t2 instead of having to remember array indexes and cryptic field names.
  • Nearly complete backwards compatibility with existing user code.

Command-line Tool Changes

  • Quick Reference Card - New quick reference card (PLS_Toolbox_Quick_Reference.pdf).
  • autoexport - Added SPC export functionality.
  • chitest - Added distribution name and function name to chitest outputs (making it much easier to apply the results).
  • coreanal - Updated coreanal.m to be able to provide a list of important core values (new optional second output).
  • crossval - Added output of cvi to help identify which leave-out group each sample was in.
  • encode - Increased the number of items allowed in each row of "speed" encoded files (makes the encoding much faster).
  • ils_esterror - Various improvements to allow different types of error estimates.
  • mscorr - Added option.algorithm to include new option 'median', based on Probabilistic Quotient Normalization.
  • spcreadr
    • Improved multiple file reading (with unequally spaced x-axis)
    • Improved handling of automatic axis scale names
  • svmoc - Added support to plot scores from SVM One Class models.
  • windowfilter - Added method 'roll' (for processing rows only), slight modification to RH edge indexing during call (is last channel processed?)
  • wlsbaseline - Added Whittaker filter option to wlsbaseline and wlsbaselineset (fast, and better for baselines which don't look like polynomials).
  • xclreadr - Allow space, tab, and | as valid automatically-detected delimiters for CSV files.
  • DataSet Object - Changes:
    • Decrease dependency on PLS_Toolbox
    • Allow assignment directly onto imageaxisscale
  • DataSet Object - New Methods:
    • FINDSET - Locate a set within a label field (axisscale, label, class) in a DataSet.
    • LISTSETS - For a given field and mode, list the sets available.
    • SEARCH - Search for a given term in a DSO field, mode, and set.
    • UPDATESET - Add/update a label field (axisscale, label, class) in a DataSet.
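
The mscorr entry in the list above mentions a 'median' option based on Probabilistic Quotient Normalization. In its textbook form, PQN divides a spectrum by a reference spectrum channel by channel, takes the median of those quotients as the dilution factor, and rescales the spectrum by that factor. The Python sketch below shows only that textbook calculation; the function name and data are hypothetical, and it is not a description of how mscorr implements the option.

```python
from statistics import median


def pqn_normalize(spectrum, reference):
    """Scale a spectrum by the median of its channel-wise quotients to a reference."""
    quotients = [s / r for s, r in zip(spectrum, reference) if r != 0]
    if not quotients:
        raise ValueError("reference has no non-zero channels")
    factor = median(quotients)  # robust estimate of the overall scale difference
    if factor == 0:
        raise ValueError("median quotient is zero; spectrum cannot be rescaled")
    return [s / factor for s in spectrum], factor


# The sample is the reference at twice the intensity, plus one channel
# (the last) with a genuinely larger peak that should not drive the scaling.
reference = [1.0, 2.0, 4.0, 2.0, 1.0]
sample = [2.0, 4.0, 8.0, 4.0, 20.0]
normalized, factor = pqn_normalize(sample, reference)
```

Because the median ignores the single outlying channel, the estimated factor stays at the common scale difference, which is the robustness the 'median' method is meant to provide.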

Misc New Functions

batchalign - Convert data columns based on matching ref col to target vector.
batchmaturity - Batch process model and monitoring.
batchfold - Transform batch data into dataset for analysis.
classcentroid - Centers data to the centroid of all classes.
evrimodel - EVRI Model Object.
minimizemodel - Shrinks model by removing non-critical information.
plotmonotonic - Plot lines with breaks when the x-value "doubles-back" on itself.
roccurve - Calculate and display ROC curve(s) for yknown and ypred.
splitcaltest - Splits randomly ordered data into calibration and test sets.
unhist - Create a vector whose values follow an empirical distribution.
writespc - Writes Galactic SPC files.
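
As background for the roccurve entry above, the sketch below computes the points of an ROC curve from known class membership and predicted values, plus the area under the curve by the trapezoid rule. It is a generic Python illustration: the function names and example data are hypothetical, and the inputs and outputs of roccurve itself are not documented in these notes beyond "yknown and ypred".

```python
def roc_points(y_known, y_pred):
    """Return (false positive rate, true positive rate) pairs for each threshold."""
    pos = sum(1 for y in y_known if y)
    neg = len(y_known) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; a sample is called positive if pred >= threshold.
    for t in sorted(set(y_pred), reverse=True):
        tp = sum(1 for y, p in zip(y_known, y_pred) if p >= t and y)
        fp = sum(1 for y, p in zip(y_known, y_pred) if p >= t and not y)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


y_known = [1, 1, 0, 1, 0, 0]
y_pred = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(y_known, y_pred)
area = auc(curve)
```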