U.S. Department of Energy
IN-SPIRE™ Visual Document Analysis

FAQ: What types of documents can it process?

IN-SPIRE™ organizes and visualizes the topical content of many types of text files. These files may come from web pages, databases, optical character recognition (OCR) output, message traffic, or other sources. IN-SPIRE™ currently supports the ASCII, UTF-8, and UTF-16 encodings and ingests most PDF, MS-Word, MS-Excel, and RTF files, as well as email and spreadsheet sources. It can also ingest XML-formatted documents and read web formats such as HTML and RSS/XML. HTML documents that IN-SPIRE™ retrieves directly from the web or from a local file system are cleaned of markup. New document types and encodings are added routinely and are prioritized according to demand.
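
To illustrate what "cleaned of markup" can mean in practice, here is a minimal Python sketch that reduces an HTML document to its visible text. It is not IN-SPIRE™'s implementation, which this page does not describe. The names `TextExtractor` and `strip_markup` are hypothetical, and the choice to drop `<script>` and `<style>` content is an assumption made for the example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text and ignores tags."""

    # Assumption: script and style bodies are not topical content.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the text runs with spaces and collapse repeated whitespace.
        return " ".join(" ".join(self._chunks).split())


def strip_markup(html):
    """Return the visible text of an HTML string (illustrative only)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

For example, `strip_markup("<h1>Title</h1><p>Hello <b>world</b></p>")` yields the plain string `Title Hello world`. Because text runs are joined with spaces, a word split by inline tags (such as `wor<b>ld</b>`) would come out as two tokens; a production cleaner would need to treat inline and block elements differently.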
