Galaxy: Outliers

Sometimes your Galaxy visualization will contain documents that are so different from the majority of the documents in the data set that they skew how the data set as a whole is presented in the Galaxy. These extreme outliers can cause all the other documents to be piled together on the opposite side of the Galaxy window, making it difficult to understand the relationships between the very documents that are most interesting.

Viewing the outlier documents in the Document Viewer may reveal that they are irrelevant to the analysis and should be "set aside" from the main visualization. The Outliers panel is a "holding area" where you can place documents so that they do not affect the overall Galaxy visualization but remain responsive to queries, gisting, and most other IN-SPIRE tools. You may place any selected document in the Outliers panel. Most often you will use it to remove outlier documents from the Galaxy visualization, but it is also useful when creating Subset data sets.

After a data set is recalculated with outliers excluded, the outlier documents will:

  • Remain part of the overall context of the data set, but not be considered when the document coordinates in the Galaxy and ThemeView™ are calculated.
  • Not be deleted from the data set.
  • Still be visible and accessible through the Outliers panel.
  • Be available for use in the Document Viewer, Gist, Query, Group, and Time Slicer tools when they are selected, just as documents in the Galaxy itself are.
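
The behavior in the list above can be pictured with a small model. The following Python sketch is purely illustrative and is not IN-SPIRE's implementation: the DataSet class, its method names, and the toy "layout" (a word count per document) are all assumptions invented for this example. It shows only the bookkeeping: outliers are skipped when coordinates are calculated, yet they stay in the data set and still answer queries.

```python
# Illustrative model only; class and method names are hypothetical,
# not part of IN-SPIRE.

class DataSet:
    def __init__(self, documents):
        # documents: dict mapping a document id to its text
        self.documents = dict(documents)
        self.outliers = set()   # ids held in the "Outliers panel"

    def set_aside(self, doc_ids):
        """Move documents to the holding area without deleting them."""
        self.outliers.update(d for d in doc_ids if d in self.documents)

    def layout_ids(self):
        """Ids that take part in the coordinate calculation."""
        return sorted(d for d in self.documents if d not in self.outliers)

    def recalculate(self):
        """Stand-in for the real layout: one toy value per non-outlier."""
        return {d: len(self.documents[d].split()) for d in self.layout_ids()}

    def query(self, word):
        """Queries search every document, outliers included."""
        return sorted(d for d, text in self.documents.items()
                      if word.lower() in text.lower().split())


ds = DataSet({
    "a": "solar panel efficiency",
    "b": "solar cell materials",
    "c": "recipe for apple pie with solar oven",
})
ds.set_aside(["c"])
coords = ds.recalculate()       # computed from "a" and "b" only
hits = ds.query("solar")        # still finds "c" among the outliers
```

In this sketch, document "c" never contributes to coords, but it remains in ds.documents and appears in hits, which mirrors the four points listed above.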

3 steps in creating a subset data set

Removing outliers from the Galaxy

  1. Open the Galaxy for the data set of interest.
  2. Use the S+ cursor, Query tool, or other selection method to select the outliers you want to remove. See Ways to Select.
  3. Click the downward arrow (between the Galaxy and the Outliers panel). The selected document dots disappear from the Galaxy and appear in the Outliers panel. To see the titles of the outlier documents, choose Settings > Outlier Titles Visible. The Recalculate button becomes active to remind you that the Galaxy visualization needs to be recalculated. (ThemeView™ is also recalculated when the Galaxy is recalculated.)
    If you change your mind about documents in the Outliers panel and want to move them back to the Galaxy, select them and click the upward arrow (between the Galaxy and the Outliers panel). The selected outliers are moved back to the Galaxy.
  4. When you have added to the Outliers panel all of the outliers you want to eliminate, click Recalculate. The Galaxy is recalculated without the outliers. The visualization will change.
    If you close the data set (or the IN-SPIRE application) before recalculating, you will be warned that there are Outlier changes pending and given the choice to recalculate or add the Outliers back into the Galaxy before closing.
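
The procedure above implies a simple "changes pending" state: moving documents to the Outliers panel activates Recalculate, recalculating clears it, and closing while it is still active produces a warning. The Python sketch below models only that state. It is a hypothetical illustration, not IN-SPIRE code; the GalaxySession class, its attributes, and the warning text are invented for the example, and the layout computation itself is omitted.

```python
# Hypothetical model of the pending-recalculation state; not IN-SPIRE code.

class GalaxySession:
    def __init__(self, doc_ids):
        self.galaxy = set(doc_ids)      # documents drawn in the Galaxy
        self.outliers = set()           # documents in the Outliers panel
        self.recalculate_pending = False

    def move_to_outliers(self, selected):
        """Move selected Galaxy documents into the Outliers panel."""
        moved = self.galaxy & set(selected)
        self.galaxy -= moved
        self.outliers |= moved
        if moved:
            # Mirrors the Recalculate button becoming active.
            self.recalculate_pending = True

    def recalculate(self):
        """Clear the pending flag; the real layout step is omitted here."""
        self.recalculate_pending = False

    def close(self):
        """Return a warning string if outlier changes are still pending."""
        if self.recalculate_pending:
            return "Outlier changes pending: recalculate or restore outliers"
        return None


session = GalaxySession(["d1", "d2", "d3"])
session.move_to_outliers(["d3"])
warning_before = session.close()   # a warning, because nothing was recalculated
session.recalculate()
warning_after = session.close()    # None, because no changes are pending
```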

You can also remove documents from the Galaxy and add them to the Outliers panel from the Document Viewer.

Using the Outliers Panel

Interacting with documents in the Outliers Panel
The S+ and V cursors are available in the Outliers panel and work the same way they do in the Galaxy, so documents in the Outliers panel can be selected and viewed just like documents in the Galaxy. Any query of the data set also includes the outliers, so you will probably want to examine any unexpected query hits among them.

To make all outlier document titles visible
Choose Settings > Outlier Titles Visible. Outlier documents are redrawn in a vertical list, showing the title of each.

To move outliers back into the Galaxy

  1. Select the documents in the Outliers panel that you want to add back to the Galaxy. You can turn on Outlier titles (choose Settings > Outlier Titles Visible), or select the Outlier documents and examine them in the Document Viewer to help determine which ones you might want to move back into the Galaxy. Using the V+ cursor might also be helpful.
  2. Click the upward arrow between the Galaxy and the Outliers panel. The selected outliers disappear from the Outliers panel and are drawn in the Galaxy once more.
    There is no need to click the Recalculate button. The Galaxy is automatically recalculated.

Using Outlier Terms

You may notice some terms on the major terms list which occur frequently in your documents but which do not discriminate between them, or which are "red herrings" for the purposes of your analysis. IN-SPIRE enables you to identify those terms and remove them from being considered when the visualization is calculated. Here's how:

  1. Open the Galaxy for the data set of interest.
  2. Open the Major Terms window.
  3. Scroll through the alphabetical list until you find a term which you want to remove from the major terms list. Click on it.
  4. Click Outlier Terms. The term appears in dark red in the Outlier Terms box at the right-hand side of the Outliers panel.
  5. When you have moved all of the terms you want to eliminate to the Outlier Terms box, click Recalculate on the Galaxy window. The Galaxy is recalculated without the Outlier Terms, and the terms in the Outlier Terms box turn from dark red to black.
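
The effect of the steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how IN-SPIRE works internally: the term_vectors function is hypothetical, and raw word counts stand in for whatever term weighting the application actually uses. The point is only that terms marked as outliers are dropped before the per-document term profiles are built, so they can no longer influence where documents are placed.

```python
# Hypothetical illustration; function and variable names are not IN-SPIRE's.
from collections import Counter

def term_vectors(documents, outlier_terms=()):
    """Count terms per document, skipping any term marked as an outlier."""
    excluded = {t.lower() for t in outlier_terms}
    vectors = {}
    for doc_id, text in documents.items():
        words = [w for w in text.lower().split() if w not in excluded]
        vectors[doc_id] = Counter(words)
    return vectors

docs = {
    "d1": "report on reactor safety report",
    "d2": "report on crop yields",
}
before = term_vectors(docs)
after = term_vectors(docs, outlier_terms=["report", "on"])
```

Here "report" and "on" occur in both documents and do not help tell them apart. Once they are excluded, the remaining terms ("reactor", "safety" versus "crop", "yields") are the ones that distinguish the two documents.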

Using Outlier Shortcuts


The Outlier Shortcuts dropdown menu allows you to:

  • Temporarily move all documents except selected ones to the Outliers Panel, so the Galaxy visualization shows only selected documents. Choose Only Selected Documents in Galaxy.
  • Return to the original visualization with all of the documents in the Galaxy. Choose All Documents in Galaxy.
  • See only grouped and selected documents in the Galaxy. Choose Only Colored Documents in Galaxy.
  • Save and revisit the present view. Choose Save As... and enter a name for the view. Click OK. The name you entered appears on the Shortcuts dropdown menu. When you choose a view from the Shortcuts dropdown menu, its name will appear just to the right of the dropdown menu, under the Galaxy.
  • Load, Rename, and Delete Shortcuts. Choose Manage Shortcuts...

    Load
    causes the selected view to be loaded into the Galaxy.