Classcentroid and File:T1267-f3.jpg: Difference between pages

From Eigenvector Research Documentation Wiki
(Difference between pages)
imported>Donal
 
imported>Benjamin
(Working with False-color images, figure 3.)
 
===Purpose===
 
Centers data to the centroid of all classes.
 
===Synopsis===
 
:[ccx,mn]      = classcentroid(x,options); %calibrate, centers the data
:[ccx,mn,pstd] = classcentroid(x,options); %calibrate, centers and scales
: ccx = classcentroid(x,mn);              %apply, centers new data
: ccx = classcentroid(x,mn,pstd);          %apply, centers and scales
 
===Description===
 
Rows in the input data are centered to the centroid of all the classes. The centroid is equivalent to a weighted mean in which each class is given the same weight, regardless of how many samples it contains. For example, if two classes A and B are present, the centroid is
  mn = mean([mean(Class A); mean(Class B)]);
 
If only two outputs are requested, the data are centered only. If three outputs are requested, the data are both centered and scaled (scaling is based on the pooled standard deviation of the classes). For more details, see [[Advanced_Preprocessing:_Variable_Centering]].
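The centroid definition above can be illustrated with a minimal base-MATLAB sketch. It does not call classcentroid or use a DataSet object; the toy data and the variable names <code>cls</code>, <code>cmeans</code>, and <code>grandmean</code> are assumptions made for illustration only.

```matlab
% Hypothetical toy data: 6 samples x 2 variables, unbalanced classes.
x   = [1 2; 3 4; 5 6; 7 8; 10 20; 12 22];
cls = [1 1 1 1 2 2];              % class A = 1 (4 samples), class B = 2 (2 samples)

% Mean of each class, one row per class.
classes = unique(cls);
cmeans  = zeros(numel(classes), size(x,2));
for k = 1:numel(classes)
    cmeans(k,:) = mean(x(cls == classes(k), :), 1);
end

% Centroid: unweighted mean of the class means (each class counts once).
mn  = mean(cmeans, 1);

% Center every row to the centroid (bsxfun also works in older MATLAB releases).
ccx = bsxfun(@minus, x, mn);

% For comparison, the ordinary mean weights each sample equally.
grandmean = mean(x, 1);
```

With unbalanced classes, <code>mn</code> and <code>grandmean</code> differ because the larger class no longer dominates the centering vector.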
 
====Inputs====
 
* '''x''' = DataSet object to be class-centered.
 
====Optional Inputs====
 
* '''mn''' = Centroids from a previous call to classcentroid. Must be passed with the associated classes (see next input).
* '''classset''' = Class set (from rows) which should be used to center data. Default is class set 1.
* '''pstd''' = Pooled standard deviation of the classes, e.g., pstd = mean([std(Class A).^2/MA; std(Class B).^2/MB]); where MA and MB are the number of samples in each class.
* '''offset''' = Offset added to pstd before scaling, i.e., pstd = pstd+offset (default = 0).
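As a rough sketch of how the scaling inputs fit together, the base-MATLAB fragment below transcribes the pstd expression from the list above literally and then applies the offset. It assumes that "scales by pstd+offset" means dividing the centered data by <code>pstd + offset</code>; that reading, the toy data, and the names <code>xa</code> and <code>xb</code> are assumptions, and the toolbox function may differ in detail.

```matlab
% Hypothetical toy data (same layout as a two-class calibration set).
x   = [1 2; 3 4; 5 6; 7 8; 10 20; 12 22];
cls = [1 1 1 1 2 2];
xa  = x(cls == 1, :);   MA = size(xa, 1);   % class A samples
xb  = x(cls == 2, :);   MB = size(xb, 1);   % class B samples

% Centroid of the two classes.
mn = mean([mean(xa, 1); mean(xb, 1)], 1);

% Expression transcribed literally from the pstd description above.
pstd = mean([std(xa, 0, 1).^2 / MA; std(xb, 0, 1).^2 / MB], 1);

% Assumed reading of "offset": added to pstd before dividing.
offset = 0;
ccx = bsxfun(@rdivide, bsxfun(@minus, x, mn), pstd + offset);
```

A small positive <code>offset</code> keeps variables with a near-zero pstd from being inflated by the division.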
 
====Outputs====
 
* '''ccx''' = Class-centered x. Dataset where each class has been centered.
* '''mn''' = Row vector of the centroids for each class.
* '''pstd''' = Row vector of pooled standard deviation of the classes.
 
===Use in Multilevel Classification and Regression===
Classcentroid can be used in multilevel classification. Multilevel data are data in which samples belong to a primary class set and samples within each class are also associated with a secondary class set. Patient data are a typical example: measurements are taken from each patient before and after treatment, so the primary class is patientID and the secondary class is "untreated"/"treated", as discussed in J.A. Westerhuis, E.J.J. van Velzen, H.C.J. Hoefsloot, and A.K. Smilde, "Multivariate paired data analysis: multilevel PLSDA versus OPLSDA", Metabolomics (2010) 6:119-128. Classcentroid can similarly be used to perform class-centered regression.
 
Steps to perform multilevel PLS:
# Assign classes to the samples in the x-block so that each pair of measurements for a subject has the same class (i.e., measurements that share a common offset share a class).
# Add class centering to the preprocessing ("Class Centroid Centering" or "Class Centroid Centering and Scaling").
# Use PLS / PLSDA / OPLS / OPLSDA to build the model as usual.
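The idea behind steps 1 and 2 can be sketched in base MATLAB for paired data. This is an illustration only: it assumes that the goal of the class-centering step is to remove the offset shared by each subject's measurements, and it does so by subtracting each subject's own mean rather than by calling the toolbox preprocessing. The data and the names <code>subject</code> and <code>xc</code> are hypothetical.

```matlab
% Hypothetical paired data: 3 subjects, each measured untreated then treated.
x       = [10 5; 12 4; 20 9; 23 7; 30 1; 31 0];
subject = [1 1 2 2 3 3];          % step 1: both measurements of a subject share a class

% Assumed illustration of step 2: remove each subject's common offset
% by subtracting that subject's mean from its rows.
xc  = zeros(size(x));
ids = unique(subject);
for k = 1:numel(ids)
    rows = (subject == ids(k));
    xc(rows,:) = bsxfun(@minus, x(rows,:), mean(x(rows,:), 1));
end
```

After this step only the within-subject variation (here, the treated versus untreated difference) remains in <code>xc</code>, which is what the subsequent model would then be built on.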
 
===See Also===
[[mncn]], [[rescale]], [[scale]], [[classcenter]]

Revision as of 14:16, 12 May 2017
