Introduction To The DataSet Object

From Eigenvector Documentation Wiki

Data sets include far more ancillary information than simple tables of numbers, and the DataSet object was designed as a standard container that can hold all of this information. The DataSet object is similar to a MATLAB structure, but it is intended to standardize how data are organized and maintained. The purpose of this document is to define standards and conventions for the DataSet object developed and maintained by Eigenvector Research, Inc. (EVRI). EVRI intends the DataSet object to have an open architecture and to be independent of any toolbox. For example, the DataSet object is used in PLS_Toolbox, written by EVRI, but it does not require any PLS_Toolbox functions or scripts. Although the open architecture allows for user modifications, suggestions for modifications and enhancements should be forwarded to EVRI (helpdesk@eigenvector.com), which will make free updates available at www.eigenvector.com.

The goals of the DataSet are:

  1. Standardize data set storage into a single compact variable. This includes descriptions, ancillary information, and tools that can be used in the development of data analysis tools.
  2. Standardize the input and output of data functions. The DataSet offers a compact form for passing data and its associated information between functions.
  3. Allow open and cost-free use for MATLAB users.

Other DataSet Objects in MATLAB

The DataSet object (DSO) developed by Eigenvector Research was originally released in 2000, and the design team included staff from The MathWorks. Despite this, in 2007 The MathWorks bundled its own version of a DataSet object for distribution with its Statistics Toolbox. The Eigenvector DataSet object includes all the functionality of the DSO contained in the Statistics Toolbox, so there is no need to switch between the two objects, although the MathWorks Statistics Toolbox is not compatible with the Eigenvector Research DataSet object. More information on the history of the DSO can be found here.

Several other, mutually incompatible DataSet objects have also been released by various groups. The authors of this DataSet object would appreciate information on these objects to help coordinate merging them into a single, open-source solution.