Tools: Permutation Test

Permutation Test Tool

Some regression and preprocessing methods are so exceptionally good at finding correlation between the measured data (X- and Y-blocks) that the model becomes too specific and will only apply to that exact data. Such overfit models are often useless for predictive applications, and even for interpretation. In many cases, careful use of cross-validation and/or validation data will help identify when this has happened. Permutation tests are another way to help identify an overfit model, and they also provide a probability that the given model is different from one built under the same conditions but on random data.

Permutation tests involve repeatedly randomly reordering the y-block and rebuilding the model under the current modeling settings. For a regression problem, this means each sample is assigned a nominally "incorrect" y-value (although the distribution of y-values is maintained, because every sample's y-value is simply re-assigned to a different sample). In the case of classification models, reordering the y-block is equivalent to shuffling the class assignments on each sample.

Such permutation tests indicate to what extent the modeling conditions might be finding "chance correlation" between the x-block and the y-block. After permuting the y-block samples, the values predicted for each sample from cross-validation and self-prediction (a.k.a. calibration) are recorded, as are the RMSEC and RMSECV (see Using Cross-Validation) for the given permutation. The permutation is repeated multiple times, several statistics are calculated for each permutation, and all of the RMSE results are accumulated. The result is two pieces of information: a table of "Probability of Model Insignificance" and a plot of Sum Squared Y versus Y-block correlation. It should be noted that the statistics being calculated are designed to operate even with very few iterations. Iterations are more critical for the SSQ Y plot. If the plot is not of interest, the number of iterations can be greatly reduced (down to 5 or 10, for example); otherwise, 50 or more iterations should be used.
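
The loop described above can be sketched in Python. The following is only a minimal illustration, not the PLS_Toolbox implementation: it assumes a PLS regression model built with scikit-learn, a one-dimensional y vector, and 10-fold cross-validation, and permutation_test is a hypothetical helper name.

<pre>
# Minimal sketch of a y-block permutation loop (illustrative only).
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import KFold, cross_val_predict

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((np.ravel(y_true) - np.ravel(y_pred)) ** 2)))

def permutation_test(X, y, n_components=1, n_iter=50, cv_splits=10, seed=0):
    """Permute the y-block n_iter times, rebuild the model each time, and
    return rows of (correlation with original y, RMSEC, RMSECV).
    Row 0 corresponds to the original, non-permuted y-block."""
    rng = np.random.default_rng(seed)
    cv = KFold(n_splits=cv_splits, shuffle=True, random_state=seed)
    results = []
    for it in range(n_iter + 1):
        # Iteration 0 uses the original y-block; later iterations shuffle it.
        y_perm = y if it == 0 else y[rng.permutation(len(y))]
        model = PLSRegression(n_components=n_components)
        model.fit(X, y_perm)
        rmsec = rmse(y_perm, model.predict(X))                             # self-prediction
        rmsecv = rmse(y_perm, cross_val_predict(model, X, y_perm, cv=cv))  # cross-validation
        results.append((np.corrcoef(y, y_perm)[0, 1], rmsec, rmsecv))
    return np.array(results)
</pre>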

Probability Table

The probability table shows the probabilities (calculated using several different methods) that the predictions for the original model could have come from random chance. Put another way: these are the probabilities that the original model is not significantly different from one created from randomly shuffling the y-block. Three tests are shown:

  • Wilcoxon - Pairwise Wilcoxon signed rank test
  • Sign Test - Pairwise sign test
  • Rand t-test - Randomization t-test
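
One way such probabilities could be computed from the accumulated errors is sketched below; this is a hedged illustration, and the exact pairing of errors and the randomization t-test used by the toolbox may differ. The names insignificance_probabilities, err_orig, err_perm, rmse_orig, and rmse_perm are hypothetical; feeding in either calibration or cross-validation errors gives the two rows of the table.

<pre>
# Hedged sketch of the three insignificance probabilities (illustrative only).
import numpy as np
from scipy.stats import wilcoxon, binomtest

def insignificance_probabilities(err_orig, err_perm, rmse_orig, rmse_perm):
    """err_orig:  per-sample absolute errors of the original model, shape (n_samples,)
    err_perm:  per-sample absolute errors of the permuted models, shape (n_perm, n_samples)
    rmse_orig: RMSE of the original model
    rmse_perm: RMSEs of the permuted models, shape (n_perm,)
    Returns probabilities that the original model is no better than the permuted ones."""
    err_perm_mean = np.asarray(err_perm).mean(axis=0)  # pair each sample with its mean permuted error
    # Pairwise Wilcoxon signed rank test: are the original errors systematically smaller?
    p_wilcoxon = wilcoxon(err_orig, err_perm_mean, alternative="less").pvalue
    # Sign test: binomial test on how many samples the original model predicts better.
    wins = int(np.sum(err_orig < err_perm_mean))
    p_sign = binomtest(wins, n=len(err_orig), p=0.5, alternative="greater").pvalue
    # Simple randomization p-value (a stand-in for the randomization t-test):
    # fraction of permuted RMSEs at least as good as the original.
    p_rand = (1 + int(np.sum(np.asarray(rmse_perm) <= rmse_orig))) / (1 + len(rmse_perm))
    return p_wilcoxon, p_sign, p_rand
</pre>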

The example below shows a case in which the original model is very unlikely to be random.

Probability of Model Insignificance vs. Permuted Samples
For model with 1 component(s)

Y-column:  1
                     Wilcoxon     Sign Test     Rand t-test
Self-Pred (RMSEC) :   0.000        0.000          0.005
Cross-Val (RMSECV):   0.000        0.000          0.005

Compare this to the result obtained when the number of samples is decreased to one third and the number of latent variables is raised to 2. The randomization t-test now indicates that the model is probably not significantly different from one created from randomly permuted samples.

Probability of Model Insignificance vs. Permuted Samples
For model with 2 component(s)

Y-column:  1
                     Wilcoxon     Sign Test     Rand t-test
Self-Pred (RMSEC) :   0.085        0.186          0.076
Cross-Val (RMSECV):   0.021        0.060          0.099

SSQ Y Plot

The SSQ_Y plot shows the fraction of the y-block captured by self-prediction (calibration) and by cross-validation, plotted versus the correlation of the y-block used in each iteration with the original y-block. For the non-permuted y-block the correlation should be one (1). For any permuted y-block the correlation should be significantly less.

For each permuted y-block, the root mean squared error of calibration and cross-validation (RMSEC and RMSECV, respectively) are calculated and stored. From these values, the fractional sum squared Y captured (SSQ Y) for the calibration (self-predictions) can be calculated from:

   SSQ_Y,C = 1 - (RMSEC / SSQ_Y,Total)

where SSQ_Y,Total is the total sum of squared Y responses, and for cross-validated predictions from:

   SSQ_Y,CV = 1 - (RMSECV / SSQ_Y,Total)

SSQ_Y,C is expected to increase toward a value of 1 as the model captures all of the y-block response. SSQ_Y,CV is expected to be about the same as SSQ_Y,C as long as the model is not overfit.

Thus, when examining SSQ_Y,C and SSQ_Y,CV, the values should be similar for a given model. However, both SSQ_Y values should be higher for the model built on the non-permuted y-block than for models built from permuted data (indicating, as expected, that the permuted models do not perform as well).
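
As a rough illustration, the SSQ Y plot could be reproduced from the (correlation, RMSEC, RMSECV) results collected in the earlier permutation-loop sketch, applying the formulas above as written on this page. The helper ssq_y_plot is a hypothetical name, matplotlib is assumed, and the mean-centered definition of the total sum of squared Y is an assumption.

<pre>
# Sketch of an SSQ Y plot from stored permutation results (illustrative only).
import numpy as np
import matplotlib.pyplot as plt

def ssq_y_plot(results, y):
    """results: array of (correlation, RMSEC, RMSECV) rows, e.g. from the
    permutation loop sketched earlier; y: original y-block (1-D)."""
    corr, rmsec, rmsecv = results[:, 0], results[:, 1], results[:, 2]
    # Total sum of squared y response (mean-centered here; the toolbox's
    # exact definition may differ).
    ssq_y_total = float(np.sum((y - np.mean(y)) ** 2))
    # Fractional SSQ captured, using the formulas given above.
    ssq_c = 1.0 - rmsec / ssq_y_total
    ssq_cv = 1.0 - rmsecv / ssq_y_total
    plt.scatter(corr, ssq_c, marker="o", label="Self-prediction (SSQ_Y,C)")
    plt.scatter(corr, ssq_cv, marker="^", label="Cross-validation (SSQ_Y,CV)")
    plt.xlabel("Correlation of permuted y-block with original y-block")
    plt.ylabel("Fraction of SSQ Y captured")
    plt.legend()
    plt.show()
</pre>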