Editing Data Sets

There are three ways to modify an existing data set:

  1. You can change which documents it contains by subsetting or merging.
  2. You can change which documents it contains by changing the harvest settings of a web or Google harvest so that the harvest retrieves different documents.
  3. For other types of data sets, you can keep the documents you have and change how they are processed.

Suppose your data set contains the right documents but has major terms that are not meaningful for the analysis, words that should be disregarded, or poorly defined fields. If it is a web harvest or Google harvest, the harvest terms themselves might be improved to return better results. Rather than creating a new data set with the improved settings, you can edit the existing one and reprocess it.

Accessing the Data Set Editor

  1. Open the Data Set Editor.

    From the main IN-SPIRE menu bar, choose File > Data Sets... The Data Set Editor window opens.

  2. Click the name of the data set you want to edit.
  3. Click Edit. The Data Set Wizard window opens.
  4. Change any of the settings associated with the data set, just as you would when you first created it. For details, see Creating New Data Sets.

What will happen to my web harvest or Google harvest data set if I edit it?
You can choose whether or not to run a new harvest with the new settings.

To run a new harvest
Make sure the Reharvest documents checkbox is checked.

To reprocess the documents currently in the data set with the new settings
Uncheck the Reharvest documents checkbox.