Automatic sample selection and Release Notes Version 7 0 2: Difference between pages

From Eigenvector Research Documentation Wiki
 
The Calibration/Validation Sample selection interface allows the user to choose which samples to keep in the calibration set (Cal) and which to move to the validation set (Val).
==Changes and Bug Fixes in Version 7.0.2==


Selection can be done manually, by setting the "Sample Type" class set (under the Row Labels tab) to either Calibration or Validation for each sample, or automatically, by clicking the Automatic split button (gear) in the toolbar.
===Bug Fixes and Enhancements===
{|


The sample selection interface is opened by choosing "Split into Calibration / Validation" from any of the data blocks in the Analysis status window. The resulting interface is a customized DataSet editor which shows one row for each sample in the current calibration and validation blocks and allows the user to modify the status of each sample.
|----valign="top"
|'''[[analysis]]'''
|
* Allow split cal/val even when no cal is present
* Fix for error when loading old model with custom cross-validation (the loaded cvi listed only the INCLUDED samples. The new detail.cvi field contains both included and excluded samples and is what crossval was expecting to get)
* Fix for missing "block" information when drilling down from summary contributions to full contributions in MPCA model
* Allow relative T and Q contributions in MPCA models
* Fix for multiway bug in calculating Q contributions
* Give warning when user attempts to change conf. limit on batch maturity model type that this has no effect on shown conf. limits.
* Show used conf. limit in plot controls for Batch Maturity


Once set selection is done, the "Accept Experiment Setup" toolbar button can be used to automatically sort the data into the calibration and validation blocks. All data marked as "Calibration" will be moved to the X/Y blocks in the calibration section of the Analysis window and all data marked as "Validation" will be moved to the X/Y blocks in the validation section of the Analysis window. Clicking the "Discard Experiment Setup" button will discard all Cal / Val changes.


__TOC__
|----valign="top"
|'''[[bspcgui|Batch Processor]]'''
'''[[batchfold]]'''
|
* If steps are disabled, ignore extraction by steps!
* Remove forced removal of steps if Batch Maturity.
* Add name to dataset.
* Add per batch linear axis scale.
* Updates for alignment on BM and other.
* Fix model saving. Fix cow options. Add 'none' option in alignment. Add better loading of model and settings. Fix tab enable on load of model.
* Fix for allowing no steps. Become all one step.
* Add new plotting style, apply to new data, and remove class 0 from batch list.
* Always push data into the same Analysis window (if it is still open), otherwise use a new window
* If model or data is loaded, ask how to load data when pushed (calibration / validation)
* Add default alignment plus default method for BM and other.
* Add "stacked" plotting on batch plot.
* Update to drag patch behavior in linear view.
* Fix for batch list selections, make default batch plot style = stack.
* Remove unneeded batch selection now that Class 0 has been removed.


==Manual Sample Selection==
|----valign="top"
|'''[[b3spline]]'''
|
* Fix error in display option handling


Each sample can be moved to either the Calibration or Validation set by simply changing the "Sample Type" class. If there are labels for the samples, these will be shown in the Label field of the interface.
|----valign="top"
|'''[[batchmaturity]]'''
|
* Added asymmetric standard deviation as method to calculate confidence limits
* Added confidence limit algorithm (clalgorithm) option with default to asymmetric least squares (astd)
* Adjusted default confidence limit to 95% to match default in other level 2 functions
* Remove weighting applied to deviations when calculating the score limits using "percentile" method
* Don't calculate score limits when building raw model as this would be done unnecessarily for 10 PCs. This could be time consuming.


To move more than one sample at a time, click the button at the left of each row you want to move to select that row. Once all the desired rows are selected, use the Class pull-down menu on one of the selected rows to choose Calibration or Validation, as desired. All selected samples will be switched to the indicated set.
|----valign="top"
|'''[[boxplot]]'''
|
* No "Extreme" outliers plotted if there were no "Standard" outliers. This was the case for either upper or lower outliers, so upper (lower) extremes only plotted if there were upper (lower) standard outliers.


==Automatic Sample Selection==
|----valign="top"
|'''[[browse]]'''
|
* Add message saying browse is initializing


Automatic sample selection walks the user through the selection by asking the series of questions outlined below.
|----valign="top"
|'''[[corrspecgui]]'''
|
* Fix typo in plot type.


===Disposition of Previous Selection Changes===
|----valign="top"
|'''[[summary]]'''
|
* Fix for error when all of a given variable are excluded/missing


First, if there are any samples which have been manually or automatically moved from Cal to Val, or vice versa, the user is asked if they want to Reset all samples back to their original set before automatic selection is done. Choosing "Reset" will restore all the samples to the set they were in when the sample selection interface was opened. Choosing "Select from Current Split" will keep the samples in their current split and allow '''further''' selection automatically. "Cancel" stops all selection.
|----valign="top"
|'''[[experimentreadr]]'''
|
* Switch cal/val class numbers (so calibration is 0 and shows as black circles, and validation is 1 and shows as red triangles, as with scores plots)
* Handle case when all samples are converted to validation


[[Image:Selreset.png]]
|----valign="top"
|'''[[genalgplot]]'''
|
* add drawnow to make sure some plots get updated when we switch from selection plot to the information plot


===Direction for Sample Selection===
|----valign="top"
|'''[[modelcache]]'''
|
* Add new deletedates mode to modelcache


Next, if there are any samples marked as Validation, the user is asked which "direction" they want to select, either removing samples FROM the calibration set (to the validation set), or adding samples TO the calibration set (out of the validation set). The first option is used when there are more samples in the calibration than are desired ''or'' when the user wishes to create a test set for their model. The second option is used when new data has been measured and the user wishes to add some subset of these samples to a previous set of calibration samples (to improve model performance on the new types of samples.)
|----valign="top"
|'''[[mscorr]]'''
|
* Fix typo in error message


If all the samples are in the calibration set already, Remove From Calibration is assumed.
|----valign="top"
|'''[[parafac]]'''
|
* Fix for serious but rare bug in PARAFAC: for models of higher order than three-way, the constraint in mode two was also imposed in mode three, so the bug is only seen when those constraints are different. Most of the time the constraints would be nonnegativity in all modes, so the bug is unlikely to have been seen.


[[Image:Seldirection.png]]
|----valign="top"
|'''[[peakfind]]'''
|
* Don't do search for peaks if fewer than window*2 variables!


|----valign="top"
|'''[[plotgui|Plot Controls]]'''
|
* Add separators above Bar and Mesh to make menu easier to read
* Add "enhanced surface" mode
* Better handling of duplication of data as needed for 3D plots (to avoid errors when plotting)
* Change settings on viewinterpolated so it will be available from the settings control button on the toolbar
* Fix for plotting scatter plots with n-way data in 3rd dimension (xdata is row vector instead of column vector)
* Don't reset 'PlotBoxAspectRatioMode','CameraViewAngleMode', or 'DataAspectRatioMode' in 2008b or later (seems to cause strange plot box resizing problems)
* Better position labels when rotated text is being used
* Add ability to use logical in search


===Selection Method===
|----valign="top"
|'''Adjust Axis Limits Interface'''
|
* Fix use with multiple axes and multiple figures. Fix bugs with initializing settings. Better handle restoring color.
* Fix for color of background when target figure has BLACK (or dark gray) background (can't see text!!)


Next, the selection method must be chosen from:
|----valign="top"
* Nearest Neighbor Thinning - based on [[reducennsamples]] this method selects samples by discarding samples which are very similar to existing samples. The result is a set of samples that spans the same range as the original data, but with an even distribution of samples across that range.
|'''[[plsda]]'''
* Onion - based on [[distslct]] this method first selects a ring of highly unique samples based on distance (similar to the D-Optimal criterion), then leaves out a ring of the next most unique samples, then finally selects a random subset of samples inside the boundaries selected in the "onion".
|
* Treat "0" as unknown class only if input y has more than 2 unique values


[[Image:Selmethod.png]]
|----valign="top"
|'''[[preprocess]]'''
|
* Add "Favorites" button to
: (a) move certain methods to the top of the preprocessing list OR
: (b) to create new aggregate methods from the current selection of multiple methods
* Add "Hide/Unhide" button to hide items you don't use often
* Add hidden support for font size changing


===Choosing Percentage to Keep===
|----valign="top"
|'''[[splitcaltest]]'''
|
* Fix bug where splitcaltest does nothing (all samples remain as calibration) if input data is "short and wide", as with nir_data for example with SVM, or when ncomp >=10 for PCA, LWR, etc.
* Remove requirement that the input data were acquired in a random order
* Initial demo added


Finally, the user must select the percentage of samples to "select". In the case of Removing From Calibration, this is the percentage of Calibration samples to '''keep''' in the calibration set. In the case of Adding To Calibration, this is the percentage of Validation samples to '''add''' to the calibration set. The value must be between 1 and 100.
|----valign="top"
|'''[[tconcalc]]'''
|
* Add support for tcon calculation from PCR and PLS models even when tconcalc is passed ONLY the prediction structure (as long as the necessary eigenvalues information is in the model details)


[[Image:Selpct.png]]
|----valign="top"
|'''[[trendtool]]'''
|
* Consider a "viewSpec" request for a spectrum beyond the highest numbered spectrum as a request for "the last" spectrum (e.g. "inf" will give the max)
* Add 'interpolation' as new property that trendtool can set on the trend view
* Add ability to access this through evrigui as property: obj.setInterpolation(n)
* Add plottype surface and evrigui connection to modify it (setPlottype)


|----valign="top"
|'''[[EVRIGUI Objects]]'''
|
* Add fieldnames to EVRIGUI object to allow tab-completion of valid methods and properties


===Finishing the Selection===
|----valign="top"
|'''[[EVRIModel Objects]]'''
|
* Rearrange logic when updating from old model version (generalize copying of fields from old model into new one)
* Add conrearrange as private method to re-arrange contributions into "used", "passed", or "full" forms (like with Solo_Predictor)
* Add "contributions" and "matchvarsmap" (hidden) properties
* Fix logic which assigns calibrate.options.plots and calibrate.options.display settings (also set in top-level)
* Add "matchvars" property to models as option to DISABLE call to matchvars during apply, xhat and tcon/qcon calculations.
* If user turns off model object, don't expect evrimodelversion field (use modelversion only) and automatically extract model contents. Now users can automatically down-grade models using simply:
setplspref('evrimodel','noobject',1)
:then loading the new model


Once all settings have been defined, the selection will take place and the samples will be marked in their new sets. It may be useful to create a plot (click on the Plot toolbar button, or the Plot tab) to view which samples are in which sets. Accepting the changes will move all samples to the new sets and make sure Analysis is in the appropriate configuration for analysis of the data.
|----valign="top"
|'''add3dlight'''
|
* Add "add3dlight" as new GUI utility to add 3D lighting effects for enhanced surface plots
 
|----valign="top"
|'''modelviewertool'''
|
* Fixed a bug in Tucker where the core was plotted as a loading in modelviewer when fitting e.g. Tucker(X,[3 3 1])
 
|----valign="top"
|'''peakfindgui'''
|
* Allow for more or less adjustability in sensitivity depending on the # of variables
* Encode logic to handle non-integer values for found peak position (in case center of mass calculation is used and non-integer peak positions values get returned)
 
|----valign="top"
|'''[[piconnectgui]]'''
|
* better handling of errors thrown during initialization
|----
|}
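
The Nearest Neighbor Thinning method and the percentage-to-keep setting described in the sample selection text above can be illustrated with a small conceptual sketch. This is not the [[reducennsamples]] implementation: the function name <code>thin_nearest_neighbors</code>, the use of plain Euclidean distance, and the rule of dropping the second sample of the closest pair are all assumptions made for illustration only.

```python
import math


def thin_nearest_neighbors(samples, percent_keep):
    """Greedy thinning sketch (hypothetical helper, not the toolbox code).

    Repeatedly finds the two closest remaining samples and drops one of
    them, until only the requested percentage of samples remains.
    Returns (kept_indices, removed_indices).
    """
    if not 1 <= percent_keep <= 100:
        raise ValueError("percent_keep must be between 1 and 100")

    # Number of samples to keep; always keep at least one.
    n_keep = max(1, round(len(samples) * percent_keep / 100))

    kept = list(range(len(samples)))
    while len(kept) > n_keep:
        best_dist = None
        drop_pos = None
        # Brute-force search for the closest remaining pair.
        for a in range(len(kept)):
            for b in range(a + 1, len(kept)):
                d = math.dist(samples[kept[a]], samples[kept[b]])
                if best_dist is None or d < best_dist:
                    best_dist = d
                    drop_pos = b  # assumption: drop the later sample of the pair
        kept.pop(drop_pos)

    removed = [i for i in range(len(samples)) if i not in kept]
    return kept, removed


# Three tight pairs of points; keeping 50% leaves one sample per pair.
samples = [(0.0, 0.0), (0.5, 0.0), (5.0, 0.0), (5.25, 0.0), (10.0, 0.0), (10.0, 1.0)]
kept, removed = thin_nearest_neighbors(samples, 50)
print(kept, removed)
```

In the "Remove From Calibration" direction, the kept indices would stay in the calibration set and the removed indices would be marked as validation. The retained samples still span the original range, which is the property the method description emphasizes.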

Revision as of 16:55, 20 November 2012

Changes and Bug Fixes in Version 7.0.2

Bug Fixes and Enhancements

analysis
  • Allow split cal/val even when no cal is present
  • Fix for error when loading old model with custom cross-validation (the loaded cvi listed only the INCLUDED samples. The new detail.cvi field contains both included and excluded samples and is what crossval was expecting to get)
  • Fix for missing "block" information when drilling down from summary contributions to full contributions in MPCA model
  • Allow relative T and Q contributions in MPCA models
  • Fix for multiway bug in calculating Q contributions
  • Give warning when user attempts to change conf. limit on batch maturity model type that this has no effect on shown conf. limits.
  • Show used conf. limit in plot controls for Batch Maturity


Batch Processor

batchfold

  • If steps are disabled, ignore extraction by steps!
  • Remove forced removal of steps if Batch Maturity.
  • Add name to dataset.
  • Add per batch linear axis scale.
  • Updates for alignment on BM and other.
  • Fix model saving. Fix cow options. Add 'none' option in alignment. Add better loading of model and settings. Fix tab enable on load of model.
  • Fix for allowing no steps. Become all one step.
  • Add new plotting style, apply to new data, and remove class 0 from batch list.
  • Always push data into the same Analysis window (if it is still open), otherwise use a new window
  • If model or data is loaded, ask how to load data when pushed (calibration / validation)
  • Add default alignment plus default method for BM and other.
  • Add "stacked" plotting on batch plot.
  • Update to drag patch behavior in linear view.
  • Fix for batch list selections, make default batch plot style = stack.
  • Remove unneeded batch selection now that Class 0 has been removed.
b3spline
  • Fix error in display option handling
batchmaturity
  • Added asymmetric standard deviation as method to calculate confidence limits
  • Added confidence limit algorithm (clalgorithm) option with default to asymmetric least squares (astd)
  • Adjusted default confidence limit to 95% to match default in other level 2 functions
  • Remove weighting applied to deviations when calculating the score limits using "percentile" method
  • Don't calculate score limits when building raw model as this would be done unnecessarily for 10 PCs. This could be time consuming.
boxplot
  • No "Extreme" outliers plotted if there were no "Standard" outliers. This was the case for either upper or lower outliers, so upper (lower) extremes only plotted if there were upper (lower) standard outliers.
browse
  • Add message saying browse is initializing
corrspecgui
  • Fix typo in plot type.
summary
  • Fix for error when all of a given variable are excluded/missing
experimentreadr
  • Switch cal/val class numbers (so calibration is 0 and shows as black circles, and validation is 1 and shows as red triangles, as with scores plots)
  • Handle case when all samples are converted to validation
genalgplot
  • add drawnow to make sure some plots get updated when we switch from selection plot to the information plot
modelcache
  • Add new deletedates mode to modelcache
mscorr
  • Fix typo in error message
parafac
  • Fix for serious but rare bug in PARAFAC: for models of higher order than three-way, the constraint in mode two was also imposed in mode three, so the bug is only seen when those constraints are different. Most of the time the constraints would be nonnegativity in all modes, so the bug is unlikely to have been seen.
peakfind
  • Don't do search for peaks if fewer than window*2 variables!
Plot Controls
  • Add separators above Bar and Mesh to make menu easier to read
  • Add "enhanced surface" mode
  • Better handling of duplication of data as needed for 3D plots (to avoid errors when plotting)
  • Change settings on viewinterpolated so it will be available from the settings control button on the toolbar
  • Fix for plotting scatter plots with n-way data in 3rd dimension (xdata is row vector instead of column vector)
  • Don't reset 'PlotBoxAspectRatioMode','CameraViewAngleMode', or 'DataAspectRatioMode' in 2008b or later (seems to cause strange plot box resizing problems)
  • Better position labels when rotated text is being used
  • Add ability to use logical in search
Adjust Axis Limits Interface
  • Fix use with multiple axes and multiple figures. Fix bugs with initializing settings. Better handle restoring color.
  • Fix for color of background when target figure has BLACK (or dark gray) background (can't see text!!)
plsda
  • Treat "0" as unknown class only if input y has more than 2 unique values
preprocess
  • Add "Favorites" button to
(a) move certain methods to the top of the preprocessing list OR
(b) to create new aggregate methods from the current selection of multiple methods
  • Add "Hide/Unhide" button to hide items you don't use often
  • Add hidden support for font size changing
splitcaltest
  • Fix bug where splitcaltest does nothing (all samples remain as calibration) if input data is "short and wide", as with nir_data for example with SVM, or when ncomp >=10 for PCA, LWR, etc.
  • Remove requirement that the input data were acquired in a random order
  • Initial demo added
tconcalc
  • Add support for tcon calculation from PCR and PLS models even when tconcalc is passed ONLY the prediction structure (as long as the necessary eigenvalues information is in the model details)
trendtool
  • Consider a "viewSpec" request for a spectrum beyond the highest numbered spectrum as a request for "the last" spectrum (e.g. "inf" will give the max)
  • Add 'interpolation' as new property that trendtool can set on the trend view
  • Add ability to access this through evrigui as property: obj.setInterpolation(n)
  • Add plottype surface and evrigui connection to modify it (setPlottype)
EVRIGUI Objects
  • Add fieldnames to EVRIGUI object to allow tab-completion of valid methods and properties
EVRIModel Objects
  • Rearrange logic when updating from old model version (generalize copying of fields from old model into new one)
  • Add conrearrange as private method to re-arrange contributions into "used", "passed", or "full" forms (like with Solo_Predictor)
  • Add "contributions" and "matchvarsmap" (hidden) properties
  • Fix logic which assigns calibrate.options.plots and calibrate.options.display settings (also set in top-level)
  • Add "matchvars" property to models as option to DISABLE call to matchvars during apply, xhat and tcon/qcon calculations.
  • If user turns off model object, don't expect evrimodelversion field (use modelversion only) and automatically extract model contents. Now users can automatically down-grade models using simply:
setplspref('evrimodel','noobject',1)
then loading the new model
add3dlight
  • Add "add3dlight" as new GUI utility to add 3D lighting effects for enhanced surface plots
modelviewertool
  • Fixed a bug in Tucker where the core was plotted as a loading in modelviewer when fitting e.g. Tucker(X,[3 3 1])
peakfindgui
  • Allow for more or less adjustability in sensitivity depending on the # of variables
  • Encode logic to handle non-integer values for found peak position (in case center of mass calculation is used and non-integer peak positions values get returned)
piconnectgui
  • better handling of errors thrown during initialization
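
The boxplot fix listed above (extreme outliers were not plotted unless standard outliers were also present on the same side) can be illustrated with a small sketch in which the two outlier classes are detected independently. This is not the boxplot source code: the function name classify_outliers and the fence multipliers (1.5 and 3 times the interquartile range, the common Tukey convention) are assumptions made for illustration.

```python
import statistics


def classify_outliers(values, standard_k=1.5, extreme_k=3.0):
    """Classify points as standard or extreme outliers (hypothetical helper).

    Fences are assumed to be q1 - k*IQR and q3 + k*IQR. Each point is tested
    against the extreme fence first, so an extreme outlier is reported even
    when no standard outlier exists on that side.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1

    found = {
        "upper_standard": [],
        "upper_extreme": [],
        "lower_standard": [],
        "lower_extreme": [],
    }
    for v in values:
        if v > q3 + extreme_k * iqr:
            found["upper_extreme"].append(v)
        elif v > q3 + standard_k * iqr:
            found["upper_standard"].append(v)
        elif v < q1 - extreme_k * iqr:
            found["lower_extreme"].append(v)
        elif v < q1 - standard_k * iqr:
            found["lower_standard"].append(v)
    return found


# One far-away point on the upper side and no "standard" upper outliers.
result = classify_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(result)
```

With this data the value 100 lies beyond the assumed extreme fence while no point falls between the standard and extreme fences, which is the situation in which the extreme point was previously not drawn.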