Search: Advanced Term Searches

Any term search can combine single-word terms, phrase terms (enclosed in single or double quotes), and order-independent proximity search phrase terms (also enclosed in single or double quotes). Boolean logic operators AND, OR and NOT can be used between any of these terms to specify Boolean logic operations; parentheses may be used to indicate the precedence of operations.

Order-Independent Approximate Phrase Searches and Choice Lists are two advanced Term Search features that can be useful for refining a search which is returning too many false hits. If a simple phrase search is not working, you might want to try either of these alternatives.

Both of these features pertain to phrase searches, that is, to elements of a search that are contained within a pair of single or double quote marks.

Approximate Phrase Searches

Order-Dependent

Order-Dependent Approximate Phrase Searches use an asterisk (*) character between words in the search text to allow for zero or one occurrence of any word. Optionally, a one- or two-digit quantifier, N, may follow the asterisk to specify that zero to N words may appear between the first and second specified words. For example, the phrase:

Brown *3 cow

would match:

"Brown cow"

"Brown old cow"

"Brown and black cow"

"Brown and white old cow"

and so on. You could think of these phrase search terms as order-dependent, approximate phrase search terms.
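
As a rough illustration of the order-dependent rule, the following Python sketch translates such a phrase into a regular expression. It is not how IN-SPIRE implements the search. The function name phrase_to_regex is hypothetical, case-insensitive matching is an assumption, and trailing (stemming) asterisks are not handled.

```python
import re


def phrase_to_regex(phrase: str) -> re.Pattern:
    """Translate an order-dependent approximate phrase into a regex (illustrative only)."""
    parts = []
    gap = None  # pending gap: maximum number of words allowed before the next word
    for token in phrase.split():
        quantifier = re.fullmatch(r"\*(\d{1,2})?", token)
        if quantifier:
            # A bare asterisk allows zero or one word; "*N" allows zero to N words.
            gap = int(quantifier.group(1)) if quantifier.group(1) else 1
            continue
        word = re.escape(token)
        if not parts:
            parts.append(word)
        elif gap is None:
            parts.append(r"\s+" + word)
        else:
            parts.append(r"(?:\s+\w+){0,%d}\s+" % gap + word)
        gap = None
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)


pattern = phrase_to_regex("Brown *3 cow")
for text in ["Brown cow", "Brown and white old cow", "Brown and white very old cow"]:
    print(text, "->", bool(pattern.search(text)))
```

The last sample text has four words between "Brown" and "cow", one more than the quantifier allows, so it is the only one of the three that does not match.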

Order-Independent

Approximate phrase searches also include an order-independent form. If the first non-white space character following the opening single or double quote mark of a phrase search term is a tilde (~), then the search will find all of the words following the tilde, regardless of order, up to the closing quote mark. All of the words in the phrase must be present.

You may also specify, immediately after the tilde, the maximum number of extra words that may appear within the span of contiguous words that makes up a match. If no number is specified, an implied maximum applies. The extra words may occur anywhere within the span, in any order. The number of search words plus the number of extra words determines the range of the search term. A range that is too large will retrieve unrelated information.

The following are examples of such order-independent proximity search terms:

"~orbit planet sun"
"~10 mayor governor president king"
" ~3 University Ohio"

You may not mix order-dependent and order-independent notation within the same phrase term. The following are syntax errors:

  • Any phrase search that starts with a tilde and follows with an asterisk between words.
  • Any phrase term that contains a tilde anywhere except at the beginning of the phrase.

You may use a trailing, or stemming, asterisk on any or all of the words in these search terms. For example,

" ~ bomb* buil*"

would match any of the following:

"building was bombed"

"bomber was trapped in the building"

"build a bomb"

"bombs have been built"

Defining the Range

The range of an order-independent proximity search is the number of individual words in the term plus the specified number of extra words, or plus 5 extra words if no number is specified. For example,

"~2 cattle alfalfa water"

has a range of 2 + 3 = 5, and the search term

"~crop rain damage cost"

has a range of 5 + 4 = 9.

The range specifies the maximum length of a run of text that can satisfy the search criteria. Thus, for example, the text

"The cattle became bloated after consuming too much water and alfalfa."

would not match the first search phrase above, because there are ten words, inclusive, between "cattle" and "alfalfa" and the search term has a range of only five. However, it would be found by:

"~7 cattle alfalfa water"

because this term has a range of 7 + 3 = 10.
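
The range arithmetic above can be checked with a short Python sketch. This is only an illustration under stated assumptions, not IN-SPIRE's implementation: the function names are hypothetical, the optional number is assumed to follow the tilde directly, matching is assumed to be case-insensitive, and stemming asterisks and choice lists are ignored.

```python
import re

DEFAULT_EXTRA_WORDS = 5  # implied extra words when no number follows the tilde


def parse_term(term: str):
    """Split a term such as '~2 cattle alfalfa water' into (extra_words, words)."""
    m = re.fullmatch(r"\s*~(\d*)\s*(.*)", term)
    if m is None:
        raise ValueError("an order-independent term must start with a tilde")
    extra = int(m.group(1)) if m.group(1) else DEFAULT_EXTRA_WORDS
    return extra, m.group(2).split()


def term_range(term: str) -> int:
    """Range = number of words in the term + number of extra words."""
    extra, words = parse_term(term)
    return len(words) + extra


def matches_within_range(term: str, text: str) -> bool:
    """True if some run of at most term_range(term) words contains every word."""
    _, words = parse_term(term)
    limit = term_range(term)
    wanted = [w.lower() for w in words]
    tokens = re.findall(r"\w+", text.lower())
    for start in range(len(tokens)):
        window = tokens[start:start + limit]
        if all(w in window for w in wanted):
            return True
    return False


text = "The cattle became bloated after consuming too much water and alfalfa."
print(term_range("~2 cattle alfalfa water"))                  # 5
print(term_range("~crop rain damage cost"))                   # 9
print(matches_within_range("~2 cattle alfalfa water", text))  # False
print(matches_within_range("~7 cattle alfalfa water", text))  # True
```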

Highlighting in the Document Viewer

Once you run a search, IN-SPIRE will highlight text in the Document Viewer that matches an order-independent proximity search term. Each section of text highlighted will be the longest section that matches the search term within its range, beginning with the first matching word and ending with the last. For example, if we apply the search term

"~ the moon"

to the text

"and the cow jumped over the moon"

the part of the phrase that will be highlighted (shown here in square brackets) is:

and [the cow jumped over the moon]

This is because the search term has a range of seven (two words plus the default five extra words), and the first "the" is within that range of "moon", so the matching text includes everything from the first "the" to "moon", inclusive. On the other hand, the search term

"~2 the moon" 

would result in the following highlighting (shown here in square brackets):

and the cow jumped over [the moon]

This is because this search term has a range of four, and the first "the" is too far away. So the second "the" is chosen to begin the matching text.

This behavior is consistent with the way the document viewer's highlight function treats ordinary, order-dependent phrase search terms that contain asterisks between words.
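
The highlighting rule for order-independent terms can be sketched in Python as follows. This is an illustration of the rule as described, not the Document Viewer's actual code. The function find_highlight is hypothetical, square brackets stand in for the highlight, and the sketch marks only the single longest span in the text.

```python
def find_highlight(words, extra_words, text):
    """Bracket the longest run of text that contains every word within the range."""
    tokens = text.split()
    # Compare case-insensitively and ignore punctuation attached to a word.
    keys = [t.strip(".,;:!?\"'()").lower() for t in tokens]
    wanted = {w.lower() for w in words}
    limit = len(words) + extra_words  # the range of the search term

    best = None
    for start, key in enumerate(keys):
        if key not in wanted:
            continue  # a highlighted span must begin with a matching word
        last = min(len(keys), start + limit) - 1
        # Try the farthest possible end first, so the longest span wins.
        for end in range(last, start - 1, -1):
            if keys[end] in wanted and wanted <= set(keys[start:end + 1]):
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end)
                break

    if best is None:
        return text
    start, end = best
    marked = "[" + " ".join(tokens[start:end + 1]) + "]"
    return " ".join(tokens[:start] + [marked] + tokens[end + 1:])


sample = "and the cow jumped over the moon"
print(find_highlight(["the", "moon"], 5, sample))  # and [the cow jumped over the moon]
print(find_highlight(["the", "moon"], 2, sample))  # and the cow jumped over [the moon]
```

The first call corresponds to "~ the moon" (range of seven) and the second to "~2 the moon" (range of four).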

Choice Lists

Within a phrase or order-independent proximity search term, you can now specify a choice list of words, any one of which will match a word in the text at the corresponding position in the phrase. To do this, enclose the list of words in parentheses. Use any non-alphanumeric character or white space to separate terms in the list. Remember to enclose the entire phrase search term in single or double quotes.

Examples:

"New York (Yankees Mets Highlanders) baseball team"

will match "New York Yankees baseball team" or "New York Mets baseball team" or "New York Highlanders baseball team", but not "New York State baseball team".

"~ Oklahoma bomb* (facility=building=courthouse)federal"

will match, for example, any of these phrases:

"bomb went off at the Oklahoma City Federal Building"

"Oklahoma federal facility was bombed"

"bombing in Oklahoma, federal officials at the courthouse"

Choice lists can be used in ordinary phrase searches as well as in order-independent proximity searches.

When counting words to determine the range of the order-independent proximity search term, a choice list counts as one word, regardless of how many words it contains.
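
To make the counting rule concrete, here is a minimal Python sketch that parses an order-independent term containing choice lists and counts each list as one word. It is an illustration under assumptions, not IN-SPIRE's parser: the function names are hypothetical, the trailing asterisk is treated as a simple prefix wildcard, matching is assumed to be case-insensitive, and ordinary (order-dependent) phrase terms are not handled.

```python
import re


def parse_term(term: str, default_extra: int = 5):
    """Return (extra_words, slots); each slot is a list of alternative words."""
    m = re.fullmatch(r"\s*~(\d*)\s*(.*)", term)
    if m is None:
        raise ValueError("an order-independent term must start with a tilde")
    extra = int(m.group(1)) if m.group(1) else default_extra
    slots = []
    # A parenthesized group is a choice list; anything else is a single word.
    for choice, word in re.findall(r"\(([^)]*)\)|([^\s()]+)", m.group(2)):
        if choice:
            # Any non-alphanumeric character or white space separates the choices.
            slots.append([c.lower() for c in re.findall(r"\w+\*?", choice)])
        elif word:
            slots.append([word.lower()])
    return extra, slots


def word_matches(pattern: str, word: str) -> bool:
    """A trailing asterisk matches any word that begins with the given letters."""
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern


def matches_term(term: str, text: str) -> bool:
    extra, slots = parse_term(term)
    limit = len(slots) + extra  # a choice list counts as one word
    tokens = re.findall(r"\w+", text.lower())
    for start in range(len(tokens)):
        window = tokens[start:start + limit]
        if all(any(word_matches(p, t) for p in slot for t in window) for slot in slots):
            return True
    return False


term = "~ Oklahoma bomb* (facility=building=courthouse)federal"
extra, slots = parse_term(term)
print(len(slots) + extra)  # 9: four slots plus the default five extra words
print(matches_term(term, "Oklahoma federal facility was bombed"))  # True
print(matches_term(term, "Oklahoma federal agents were bombed"))   # False
```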

The stemming asterisk technique can be used with words in a choice list, for example,

"Federal (officer* official judge*)"

You can use more than one choice list in a phrase search term. Examples:

"(men women) playing (baseball softball basketball soccer)"
"~12 telescope (Hubble Palomar) photo* (asteroid comet meteor)"