Creating Dataset Subsets

You can create a subset dataset that contains some but not all of the documents in an existing dataset. There are three ways to do this:

 

| Subset | Included documents | Effect of Time Slicer | Outliers |
|---|---|---|---|
| From Selection... | Selected documents, wherever they may be located. | None | Selected outliers will be included. |
| From Galaxy... | Documents visible in the Galaxy only. Move documents you want to exclude to the Outliers panel. | If the Time Slicer is on, the documents in the Galaxy are those in the current slice. | Pending outliers (still part of the Galaxy because the Galaxy has not been recalculated) will be included; processed outliers will not be. |
| From Outliers... | All processed outliers in the Outliers panel. The Galaxy has been recalculated and the Recalculate button is not active. | If the Time Slicer is on, the outliers that will be included are those processed outliers that are in the currently visible slice. | Processed outliers (now in the Outliers panel) will be included; pending outliers will not be, since they are still part of the Galaxy. |
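The inclusion rules above can be easier to follow as a small model. The Python sketch below is illustrative only and is not part of IN-SPIRE: the Doc class, the location labels, the time_slice field, and the subset_members function are all hypothetical names invented for this example. It assumes each document is in exactly one of three places (the Galaxy, pending outliers, or processed outliers) and belongs to one time slice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for a document; not an IN-SPIRE type."""
    doc_id: str
    location: str                 # "galaxy", "pending_outlier", or "processed_outlier"
    selected: bool = False
    time_slice: int = 0           # assumed: index of the slice the document falls in


def subset_members(docs: List[Doc], kind: str,
                   current_slice: Optional[int] = None) -> List[str]:
    """Return the ids a subset of the given kind would contain.

    current_slice is None when the Time Slicer is off.
    """
    def in_slice(d: Doc) -> bool:
        return current_slice is None or d.time_slice == current_slice

    if kind == "selection":
        # The Time Slicer has no effect; selected documents count wherever they are.
        return [d.doc_id for d in docs if d.selected]
    if kind == "galaxy":
        # Pending outliers are still part of the Galaxy until it is recalculated.
        return [d.doc_id for d in docs
                if d.location in ("galaxy", "pending_outlier") and in_slice(d)]
    if kind == "outliers":
        # Only processed outliers count; pending ones still belong to the Galaxy.
        return [d.doc_id for d in docs
                if d.location == "processed_outlier" and in_slice(d)]
    raise ValueError(f"unknown subset kind: {kind!r}")


docs = [
    Doc("a", "galaxy", selected=True, time_slice=0),
    Doc("b", "galaxy", time_slice=1),
    Doc("c", "pending_outlier", time_slice=0),
    Doc("d", "processed_outlier", selected=True, time_slice=0),
    Doc("e", "processed_outlier", time_slice=1),
]

print(subset_members(docs, "selection"))                  # ['a', 'd']
print(subset_members(docs, "galaxy", current_slice=0))    # ['a', 'c']
print(subset_members(docs, "outliers", current_slice=1))  # ['e']
```

In this model, document "c" (a pending outlier) ends up in the Galaxy subset, which is why the procedure below asks you to click Recalculate before creating a subset from the Galaxy or from Outliers.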

How to Create a Subset

  1. To create a dataset subset, open the parent dataset.

  2. Using the information in the table above, choose the type of subset that is most convenient for you.

  3. Follow the procedure for the type of subset you chose:

    1. To create a subset from your selection: Select the documents you want to include in the subset. You can use the Groups tool to create a group of these documents; as you discover more documents to include, add them to the group. Selected documents can be in the Galaxy, the Outliers panel, or both. From the IN-SPIRE main toolbar File menu, select Subset from Selection.

    2. To create a subset from the Galaxy: Select the documents you want to exclude from the subset, and move them to the Outliers panel. See Outliers for more information. Click Recalculate to reprocess the dataset. From the IN-SPIRE main toolbar File menu, select Subset from Galaxy.
      Or, click the Create new subset from Galaxy button above the Outliers panel.

    3. To create a subset from Outliers: Select the documents you want to include in the subset, and move them to the Outliers panel. See Outliers for more information. Click Recalculate to reprocess the dataset. From the IN-SPIRE main toolbar File menu, select Subset from Outliers.

  4. When the Dataset Wizard window opens, edit the name (the default is "<dataset name> Subset") and, if needed, any of the settings (stopwords, punctuation rules, and stopmajor list). For further information, see Stopwords, Punctuation Rules, or Stopmajor List. You should not have to edit the Fields.

  5. Click Finish. The subset dataset is processed and appears in the list in the Dataset Editor window.

  6. To open the subset dataset, choose File > Datasets. The Dataset Editor window opens.
    If the status of the subset dataset is Available, click the dataset to select it, and then click Open to view the subset galaxy.

 

7/18/05