Diviner review outliers: Difference between revisions

From Eigenvector Research Documentation Wiki
(3 intermediate revisions by the same user not shown)
Line 1: Line 1:
==Outlier Detection==
==Outlier Detection==


If a Diviner run is started with outlier detection turned on then outlier detection will be performed for each preprocessing method designated by the user. Decisions will need to be made with how to handle samples detected as outliers. This Outlier Detection message will appear along with a plot of the Outlier Detection Survey:
If a Diviner run is started with outlier detection turned on then outlier detection will be performed for each preprocessing method designated by the user. Decisions will need to be made regarding how to handle samples detected as outliers. This Outlier Detection message will appear along with a plot of the Outlier Detection Survey:


<gallery widths=600px heights=700px mode="nolines">
<gallery widths=600px heights=700px mode="nolines">
Line 43: Line 43:
[[File: Select_Outlier_Plot_Accept_and_Close.png | 400px]]
[[File: Select_Outlier_Plot_Accept_and_Close.png | 400px]]


====Additional Toolbar Buttons====


[[File: Select_Outlier_Plot_Toolbar.png | 600px]]


'''This page is under construction'''
# Accept and Close - accept the highlighted samples and close figure
# Highlight samples - change how samples are highlighted. See above section for more information.
# Deselect samples - un-highlight the currently highlighted samples
# Sort Samples - sort samples by index or ascending y value
# View Robust Models - open robust PLS or PCA models in Analysis
# Help - open the Help image
 
===Manually Highlight and Un-highlight Samples===
 
Click on the plot to manually highlight or un-highlight a sample. '''Note:''' If needed, zoom in on the plot to help achieve this.
 
==Too Many Potential Outliers==
 
If the number of suspected outliers seems too high, increase the '''alpha''' option in the Diviner Settings interface. Use the '''Options settings''' button on the main Diviner interface to access the options settings window:
 
[[File: Diviner_Showing_Options_button.png | 600px]]
[[File: Diviner_Options_Alpha_Setting.png| 500px]]
 
Also, increasing the '''cutoff''' option for Robust PCA models will lead to fewer potential outliers. For version 9.5, the '''cutoff''' option is not available in the Diviner Settings interface. This option can be set using the Expert Preferences interface; see this wiki page for information: [[Expert_Preferences_GUI | Expert Preferences]].
 
* Open the Expert Preferences interface from the Workspace Browser window Edit -> Options -> Preferences (Expert) menu
* In the Preferences GUI window, type <code>pca</code> in the '''Function Name''' box
* Check the '''View Options''' check box
* Select the '''roptions''' line and note that it contains a '''cutoff''' option
* Copy and paste this line: <code>struct('alpha',{ 0.75 },'cutoff',{ 0.99 })</code>  into the '''Override Value''' box
* Modify the '''cutoff''' value to the value of interest
* Click the '''Set''' button
* Click '''OK'''
 
<gallery widths=400px heights=500px mode="nolines">
File: Expert_Preferences_PCA_.png
File: Expert_Preferences_PCA_roptions.png
File: Expert_Preferences_Override_Value.png
File: Expert_Preferences_PCA_Cutoff_Set.png
</gallery>

Latest revision as of 12:35, 27 August 2024

Outlier Detection

If a Diviner run is started with outlier detection turned on then outlier detection will be performed for each preprocessing method designated by the user. Decisions will need to be made regarding how to handle samples detected as outliers. This Outlier Detection message will appear along with a plot of the Outlier Detection Survey:

The Outlier Detection Survey shows the number of samples (left axis) and the ratio of samples (right axis) for each preprocessing method designated for outlier detection.
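
As a rough illustration of what the two axes represent, the sketch below computes a count and a ratio for each preprocessing method. It assumes the count refers to samples flagged as potential outliers and the ratio is that count divided by the total number of samples. The method names, flagged indices, and sample total are hypothetical and are not taken from Diviner.

```python
# Hypothetical data: indices of samples flagged for each preprocessing method.
flagged = {
    "Autoscale": {3, 17, 42},
    "SNV": {3, 42},
    "1st Derivative": {3, 8, 17, 42, 56},
}
n_samples = 100  # assumed total number of samples in the data set


def survey(flagged, n_samples):
    """Return {method: (count, ratio)} for the flagged samples."""
    return {m: (len(idx), len(idx) / n_samples) for m, idx in flagged.items()}


for method, (count, ratio) in survey(flagged, n_samples).items():
    print(f"{method}: {count} samples ({ratio:.0%})")
```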

Use the Include All Samples button to keep all samples in the model building process. To further inspect the samples flagged as potential outliers, click the Inspect Outliers button. This opens the Potential Outlier Status for each Preprocessing plot.

Potential Outlier Status for each Preprocessing

The Potential Outlier Status for each Preprocessing plot lets you select the samples to flag as outliers. Flagged samples are not used in the model building process.

Select Outlier Plot 1.png

This plot shows the sample index numbers on the x-axis and the outlier preprocessing methods on the y-axis. Samples that are potential outliers for each preprocessing method are colored pink.

Potential Outlier Status Toolbar

Use the wrench icon to change how the potential outliers are highlighted.

Select Outlier Plot Highlight Options.png

For each outlier preprocessing method you can choose to highlight samples from:

  • The robust PLS model
  • The robust PCA model
  • The union of the robust PLS and robust PCA models. This gives every sample flagged by at least one of the two models, with each sample listed once.
  • The intersection of the robust PLS and robust PCA models. This gives only the samples flagged by both models.

The last option, Highlight Commonly Flagged Samples, allows highlighting the samples flagged as outliers in all of the preprocessing methods.
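
The highlight options above map onto ordinary set operations. The following is a minimal sketch of that logic, not Diviner's implementation: the function names, the method names, and the sample indices are all hypothetical. It also assumes that "commonly flagged" means the sample is highlighted under the chosen rule in every preprocessing method.

```python
# Hypothetical flags: for each preprocessing method, the sample indices
# flagged by the robust PLS model and by the robust PCA model.
flags = {
    "Autoscale": {"pls": {3, 17, 42}, "pca": {3, 8, 42}},
    "SNV": {"pls": {3, 42}, "pca": {3, 42, 56}},
}


def highlight(method_flags, mode):
    """Return the samples to highlight for one preprocessing method."""
    pls, pca = method_flags["pls"], method_flags["pca"]
    if mode == "pls":
        return set(pls)
    if mode == "pca":
        return set(pca)
    if mode == "union":
        return pls | pca  # flagged by at least one model
    if mode == "intersection":
        return pls & pca  # flagged by both models
    raise ValueError(f"unknown mode: {mode}")


def commonly_flagged(flags, mode="union"):
    """Return the samples highlighted in every preprocessing method."""
    per_method = [highlight(f, mode) for f in flags.values()]
    return set.intersection(*per_method) if per_method else set()


print(sorted(highlight(flags["Autoscale"], "union")))
print(sorted(highlight(flags["Autoscale"], "intersection")))
print(sorted(commonly_flagged(flags)))
```

The union is the more conservative choice because it highlights a sample when either model flags it, while the intersection highlights only the samples on which both models agree.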

Select Outlier Plot Highlighted Samples.png

Accept Highlighted Samples

Once the potential outliers are highlighted, use the Accept and Close button (green check icon) to accept this selection and close the figure. These samples will be excluded from the model building process.

Select Outlier Plot Accept and Close.png

Additional Toolbar Buttons

Select Outlier Plot Toolbar.png

  1. Accept and Close - accept the highlighted samples and close figure
  2. Highlight samples - change how samples are highlighted. See above section for more information.
  3. Deselect samples - un-highlight the currently highlighted samples
  4. Sort Samples - sort samples by index or ascending y value
  5. View Robust Models - open robust PLS or PCA models in Analysis
  6. Help - open the Help image

Manually Highlight and Un-highlight Samples

Click on the plot to manually highlight or un-highlight a sample. Note: If needed, zoom in on the plot to help achieve this.

Too Many Potential Outliers

If the number of suspected outliers seems too high, increase the alpha option in the Diviner Settings interface. Use the Options settings button on the main Diviner interface to access the options settings window:

Diviner Showing Options button.png Diviner Options Alpha Setting.png

Also, increasing the cutoff option for Robust PCA models will lead to fewer potential outliers. For version 9.5, the cutoff option is not available in the Diviner Settings interface. This option can be set using the Expert Preferences interface; see this wiki page for information: Expert Preferences.

  • Open the Expert Preferences interface from the Workspace Browser window Edit -> Options -> Preferences (Expert) menu
  • In the Preferences GUI window, type pca in the Function Name box
  • Check the View Options check box
  • Select the roptions line and note that it contains a cutoff option
  • Copy and paste this line: struct('alpha',{ 0.75 },'cutoff',{ 0.99 }) into the Override Value box
  • Modify the cutoff value to the value of interest
  • Click the Set button
  • Click OK
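
To see why a larger cutoff flags fewer samples, consider the sketch below. It assumes, as is common for robust PCA outlier maps, that cutoff is a probability level used to derive a distance limit from a chi-square distribution. This page does not confirm that Diviner computes its limits this way. Two retained components are assumed so that the chi-square quantile has a simple closed form, and the distances are made-up values.

```python
import math


def score_distance_limit(cutoff):
    """Distance limit: square root of the chi-square quantile, 2 d.o.f.

    For 2 degrees of freedom the quantile is exactly -2 * ln(1 - cutoff).
    """
    if not 0.0 < cutoff < 1.0:
        raise ValueError("cutoff must be strictly between 0 and 1")
    return math.sqrt(-2.0 * math.log(1.0 - cutoff))


def count_flagged(distances, cutoff):
    """Count the samples whose distance exceeds the limit for this cutoff."""
    limit = score_distance_limit(cutoff)
    return sum(d > limit for d in distances)


# Hypothetical score distances for eight samples.
distances = [0.4, 1.1, 1.9, 2.5, 2.8, 2.9, 3.2, 4.0]

for cutoff in (0.975, 0.99):
    limit = score_distance_limit(cutoff)
    print(f"cutoff={cutoff}: limit={limit:.3f}, "
          f"flagged={count_flagged(distances, cutoff)}")
```

Under these assumptions, raising cutoff moves the limit outward, so samples that sit just beyond the lower limit are no longer flagged.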