Datasets: Word, RTF, and PDF

Word, RTF, and PDF datasets require the Universal Parsing Agent (UPA) parser, which is installed with IN-SPIRE by default.

To create a dataset from Word, RTF, and/or PDF files:

  1. Follow Basic Steps 1 and 2 in Creating New Datasets, choosing "Word, RTF, or PDF Dataset."

    The Dataset Wizard will guide you through choosing source files.
  2. Modify the Optional Settings if desired.
  3. Click Finish. The UPA wizard opens.
    (Figure: UPA wizard window)
  4. Simple Mode is the default. If this is a new dataset, rather than an existing one that you are refining by reprocessing, click Finish.