Editing Data Sets
There are three ways to modify an existing data set:
- You can change which documents it contains by subsetting
or merging.
- For a web or Google harvest, you can change which documents it contains by
changing the harvest settings so that the harvest retrieves different
documents.
- For other types of data sets, you can keep the documents you have
and change how they are processed.
Suppose your data set contains the right documents but has major terms
that are not meaningful for the analysis, words that should be disregarded,
or poorly defined fields. If it is a web harvest or Google harvest, the
harvest terms themselves might be improved to return a better result.
Rather than create a new data set with the improved settings, you can
edit the existing one and reprocess it.
Accessing the Data Set Editor
- Open the Data Set Editor.
From the main IN-SPIRE menu bar, choose File > Data Sets...
The Data Set Editor window opens.
- Click on the name of the data set you want to edit.
- Click Edit. The Data Set Wizard window opens.
- Change any of the settings associated with the data set, just as you
would when you first created it. For details, see Creating
New Data Sets.
What will happen to my web harvest or Google harvest data set
if I edit it?
You can choose whether to run a new harvest with the new
settings.
- To run a new harvest, make sure the Reharvest documents checkbox is checked.
- To reprocess the documents presently in the data set with the new settings,
uncheck the Reharvest documents checkbox.