Datasets: Google Harvest

Follow the basic steps for creating a new dataset. The Dataset Wizard for a Google harvest will display.

 

  1. In the Dataset Name field, enter a dataset name.

  2. Type the query text in one or more of the Google Query boxes. The boxes you fill in are joined together using the Boolean AND operator. For example, if you type "cat" in the field "with at least one of the words" and type "dog" in the field "without the words", Google search will look for documents that contain "cat" and do not contain "dog". When you are done, click Next>. The Advanced Google Options panel displays.

  3. The panel opens with the defaults for Language, Date, and Occurrences. You are not required to change them, although you may wish to refine your query by changing one or more of them.

    When you are done, click Next>.  The Settings panel will display.

    Note: At this point, you can click Finish to accept the defaults on the following panels and start processing.

  4. The settings on the above panel serve as controls for the duration of a harvest and can be useful if you are experiencing any of the following problems:

    When you are done, click Next>. The Filters panel will display.

    Filters help you to deal with the following problems:

  5. Enter the hosts or URL words to filter. To continue to Optional Settings, click Next>. To use the default settings for the remainder of the options and start processing immediately, click Finish.
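
The way the Google Query boxes are combined in the query step can be sketched in a few lines of Python. This is only an illustration of the Boolean logic, not the wizard's actual implementation. The function name build_query and its parameter names are hypothetical. The "with at least one of the words" and "without the words" boxes come from the example in that step; the "all words" and "exact phrase" boxes are assumed here and may not match the wizard's labels. The output uses ordinary Google search syntax: quotation marks for a phrase, OR between alternatives, and a leading minus sign for excluded words.

```python
def build_query(all_words="", exact_phrase="", any_words="", without_words=""):
    """Combine the query boxes into one search string (illustrative only).

    Each non-empty box contributes one clause; the clauses are separated
    by spaces, which Google treats as an implicit AND.
    """
    parts = []

    # Assumed box: every word listed here must appear.
    if all_words.split():
        parts.append(" ".join(all_words.split()))

    # Assumed box: the words must appear together, in this order.
    if exact_phrase.strip():
        parts.append('"' + exact_phrase.strip() + '"')

    # "with at least one of the words": alternatives joined with OR.
    any_terms = any_words.split()
    if len(any_terms) == 1:
        parts.append(any_terms[0])
    elif any_terms:
        parts.append("(" + " OR ".join(any_terms) + ")")

    # "without the words": each word is excluded with a leading minus.
    parts.extend("-" + word for word in without_words.split())

    return " ".join(parts)


# The example from the query step: "cat" required, "dog" excluded.
print(build_query(any_words="cat", without_words="dog"))  # prints: cat -dog
```

Typing several words into the "with at least one of the words" box would produce a grouped clause under this sketch, for example (cat OR kitten) -dog.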

Start Processing

The Processing dialog opens, informing you that the dataset is being processed. Click OK. The dataset name displays in the list of datasets in the Dataset Editor window. You can monitor its status as it is processed by clicking the Refresh button at the top of the Dataset Editor window.

Check the Status of the Google Harvest

The progress of the harvest is reflected in the Status column of the Dataset Editor window. To see the complete status details, click the Status button. The Dataset Details window will display.

Use the tabs to view status information about the dataset harvesting, preprocessing, and processing phases of the Google harvest.  This information can be used to refine your harvest.

 

6/18/05