Textreadr and Release Notes Model Exporter Version 3 2: Difference between pages

From Eigenvector Research Documentation Wiki
===Purpose===
Reads ASCII flat files from MS Excel and other spreadsheets as a DataSet Object.

===Synopsis===

:out = textreadr(file,delim,''options'')

===Description===
 
TEXTREADR reads tab-, space-, comma-, semicolon-, or bar-delimited files with names on the columns (variables) and rows (samples). It also handles Excel XLS files.
 
If TEXTREADR is called with no input, or an empty matrix for file name ''file'', a dialog box allows the user to select a file to read from the hard disk.
 
====Inputs====
 
* '''file''' = One of the following identifications of files to read:
 
: '''a)'''  a single string identifying the file to read
:: ('example.txt')
:'''b)'''  a cell array of strings giving multiple files to read
:: ({'example_a' 'example_b' 'example_c'})
: '''c)''' an empty array indicating that the user should be prompted to locate the file(s) to read
:: ([])
* '''delim''' = An optional string used to specify the delimiter character.
:Supported delimiters include:
:* ''''tab''''  or  '\t' or sprintf('\t')
:* ''''space''''  or  ' '
:* ''''comma''''  or  ','
:* ''''semi''''  or  ';'
:* ''''bar''''    or  '|'
:If ''delim'' is omitted, the file is searched for a delimiter that is common to all rows of the file and that produces an equal number of columns in every row.
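
The automatic delimiter search described above can be pictured with a small sketch. This is not the TEXTREADR implementation; it is a minimal base-MATLAB illustration that assumes the rule is "take the first candidate that occurs on every line the same number of times". The sample lines and the variable names (rows, candidates, delim, ncols) are hypothetical.

```matlab
% Hypothetical sample lines standing in for the contents of a text file.
rows = {'name,x1,x2'; 's1,1.0,2.0'; 's2,3.5,4.5'};

% Candidate delimiters, in the order of the supported-delimiter list above.
candidates = {sprintf('\t'), ' ', ',', ';', '|'};

delim = '';   % stays empty if no candidate qualifies
ncols = 0;
for k = 1:numel(candidates)
    d = candidates{k};
    % Count how often this candidate occurs on each line.
    counts = cellfun(@(s) sum(s == d), rows);
    % Accept the first candidate that is present on every line
    % and splits every line into the same number of columns.
    if counts(1) > 0 && all(counts == counts(1))
        delim = d;
        ncols = counts(1) + 1;
        break
    end
end

fprintf('delimiter "%s" gives %d columns per line\n', delim, ncols);
```

Passing ''delim'' explicitly avoids this search entirely, which may be preferable when the delimiter is already known.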
 
====Outputs====
 
* '''out''' = A DataSet object containing the date, time, info (data from cell (1,1)), variable names (vars), sample names (samps), and data matrix (data). Note that the primary difference between this function and the MathWorks function xlsread is that TEXTREADR parses labels and outputs a DataSet object.
 
===Options===
 
Optional input ''options'' = a structure array with the following fields:
 
* '''parsing''': [ 'manual' | {'automatic'} | 'auto_strict' | 'stream' | 'importtool' | 'graphical_selection' | 'gui' ] determines the type of parsing to perform:
::  ''''automatic'''' : the file is automatically parsed for labels and header information. This works on many standard arrangements with different numbers of rows and column labels. May take some time to complete with larger files. See note below regarding additional options available with 'automatic' parsing.
::  ''''auto_strict'''' : faster automatic parsing which does not handle header lines, and expects that all row labels will be on the left-hand side of the data and all column labels will be on the top of the columns. If this returns the wrong result, try 'automatic'.
::  ''''manual'''' : the options below are used to determine the number of labels and header information.
::  ''''stream'''' : nearly identical to 'automatic' but reads from the file in pieces. This allows reading somewhat larger files than might otherwise be readable because of memory limitations.
:: ''''importtool'''' : shows [[importtool]] during parsemixed so that data, label, class, and other columns and rows can be designated manually.
:: ''''graphical_selection'''' : same as 'importtool'
::  ''''gui'''' : allows selection of standard options using a GUI.
: '''Note''' that when the file type is XLS, 'automatic' parsing is always performed.
 
* '''commentcharacter''': [<nowiki>''</nowiki>] any line that starts with the given character will be considered a comment and parsed into the "comment" field of the DataSet object. Default is no comment character. Example: '%' uses % as a comment character. '''NOTE:''' NOT used with 'auto_strict' parsing.
* '''headerrows''' : [{0}] number of header rows to expect in the file. '''NOTE:''' NOT used with 'auto_strict' parsing.
* '''catdim''' : [{0}] specifies the dimension that multiple text files should be joined on. 1 = rows, 2 = columns, 3 = slabs, 0 = automatically select based on sizes. Automatic mode joins in rows or columns if the other mode doesn't match in size, in the 3rd mode if BOTH dimensions match, and throws an error if no sizes match.
* '''autopermute''' : [ false | {true} ] When true, multiple files joined in the 3rd dimension are also permuted so that multiple files form the ROWS of the output and the original rows and columns are moved into columns and slabs. This option is most often used for multiway data where each file is a separate sample and, thus, should be separate rows of the output.
* '''waitbar''' : [ 'off' | {'on'} ] Governs use of waitbars to show progress.
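
The automatic '''catdim''' selection and the '''autopermute''' option can be illustrated on plain arrays. The sketch below is not TEXTREADR code; it uses only the base functions cat and permute, and it assumes one reading of the rule above: join on the dimension whose sizes differ when the other dimension matches, join as slabs when both match, and raise an error otherwise. The arrays a, b, c, and d are hypothetical stand-ins for the data read from separate files.

```matlab
% Two hypothetical data blocks with the same number of columns.
a = zeros(3,4);
b = ones(5,4);

sa = size(a);
sb = size(b);
if isequal(sa, sb)
    catdim = 3;          % both dimensions match: stack as slabs
elseif sa(2) == sb(2)
    catdim = 1;          % columns match: append as new rows
elseif sa(1) == sb(1)
    catdim = 2;          % rows match: append as new columns
else
    error('No dimension of the two arrays matches in size.');
end
joined = cat(catdim, a, b);      % 8 x 4 for the arrays above

% Two blocks of identical size are stacked in the 3rd dimension.
c = zeros(3,4);
d = ones(3,4);
stacked = cat(3, c, d);          % 3 x 4 x 2

% With autopermute, each file becomes one ROW of the output and the
% original rows and columns move to dimensions 2 and 3.
permuted = permute(stacked, [3 1 2]);   % 2 x 3 x 4

disp(size(joined));
disp(size(permuted));
```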
 
Two additional options are used ONLY with 'manual' parsing. See [[parsemixed]] for similar options to use with 'automatic' parsing.
* '''rowlabels'''  : [{1}] number of row labels to expect in the file
* '''collabels''' : [{1}] number of column labels to expect in the file
 
 
 
The default options can be retrieved using: options = textreadr('options');
 
In addition to the above options, if option parsing is set to 'automatic', any option used by the [[parsemixed]] function can be input to TEXTREADR. These options will be passed directly to [[parsemixed]] for use in parsing the file. See [[parsemixed]] for details.
 
===Examples===
 
Reading a file (in this case an XLS file) which has a single row of axis scale information at the top (first row) of the table and a single column of axis scale information at the left (first column) of the table can be done using this code:
 
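  opts = textreadr('options');            % start from the default options
  opts.axisscalecols = [1];               % first column holds axis scale values
  opts.axisscalerows = [1];               % first row holds axis scale values
  data = textreadr('myfile.xls',opts);    % read the file into a DataSet object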
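
A second sketch shows how the 'manual' parsing options described above might be combined. The option names come from the Options section; the file name example.txt, the choice of two header rows, and the '%' comment character are hypothetical. The calls to TEXTREADR are guarded so that the fragment also runs where the function or the file is not available.

```matlab
% Start from the default options when TEXTREADR is on the path;
% otherwise fall back to an empty structure so the sketch still runs.
if exist('textreadr','file')
    opts = textreadr('options');
else
    opts = struct();
end

% Manual parsing: describe the file layout explicitly.
opts.parsing          = 'manual';
opts.commentcharacter = '%';   % lines starting with % go to the comment field
opts.headerrows       = 2;     % hypothetical: two header lines before the table
opts.rowlabels        = 1;     % one column of sample labels
opts.collabels        = 1;     % one row of variable labels

fname = 'example.txt';         % hypothetical tab-delimited file
if exist('textreadr','file') && exist(fname,'file')
    % Follows the synopsis form textreadr(file,delim,options).
    out = textreadr(fname,'tab',opts);
end
```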
 
 
===See Also===
 
[[areadr]], [[dataset]], [[parsemixed]], [[spcreadr]], [[xclgetdata]], [[xclputdata]], [[xlsreadr]]

Revision as of 15:49, 15 June 2015

Version 3.2 of Model_Exporter was released in June, 2015. For general product information, see Model_Exporter_User_Guide.

NEW FEATURES / FIXES

  • Add support for non-negative least squares (NNLS) CLS models when outputting to m-file format (requires that function be enabled, because sub-functions are added to perform the NNLS calculation)
  • Add support for approximate nearest neighbor distance via the distancemetric option. The result approximates the NN distance and is NOT exact.
  • Add support for applying most preprocessing methods to MATRICES
    • Allows use of PLS, PCR, PCA, CLS, and PLSDA on an x-block of MATRICES (not just vectors) with nearly all supported preprocessing methods
  • Add basic support for arithmetic on the X-block only (undo on the y-block is not supported)
  • Change "prob" to "probs" in SVMDA so it matches PLSDA outputs
  • Fix bug where placeholder variables were not removed from data if SavGol preprocessing was the only preprocessing method
  • Fix bug that led to an indexing error in SavGol preprocessing if variables at the END of the spectrum were excluded
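
To make the non-negative CLS item in the list above concrete, the following sketch contrasts an ordinary least squares CLS estimate with a non-negative one. It is not the code that Model_Exporter writes: the base function lsqnonneg is used here as a stand-in for the exported NNLS sub-functions, and the pure-component spectra S and the measured spectrum x are made-up numbers.

```matlab
% Hypothetical pure-component spectra: 2 components x 3 variables.
S = [1 0 1;
     0 1 1];

% Hypothetical measured spectrum (1 x 3) with a negative reading.
x = [2 -0.5 1];

% Ordinary least squares CLS: solve c*S = x in the least squares sense.
c_ls = x / S;

% Non-negative least squares: minimise ||S'*c' - x'|| subject to c >= 0.
c_nn = lsqnonneg(S', x')';

disp(c_ls);   % the second contribution comes out negative
disp(c_nn);   % all contributions are constrained to be >= 0
```

The constrained estimate trades a slightly larger residual for contributions that remain physically meaningful as concentrations.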


Model Interpreter

  • Allow input of a matrix instead of just vectors (some methods support application to a matrix)
  • Force numeric conversion to expect a period as the decimal separator (resolves problems interpreting models on systems set to other numeric formats, such as French regional settings)
  • Move math steps into new MEMath object (simplifies code and exposes mathematical operations to the caller)