Publication date: 07/08/2024

Term and Phrase Lists

The Term and Phrase Lists section of the Text Explorer report contains tables of terms and phrases found in the text after tokenization has occurred. See Figure 12.8 for an example of the Term and Phrase Lists report. The Count column in the term list indicates the number of occurrences of the term in the corpus. The Count column in the phrase list indicates the number of occurrences of the phrase in the corpus; the N column indicates the number of words in the phrase.

By default, the Term List is sorted in descending Count order; terms that are tied in count are sorted alphabetically. The Phrase List is also sorted in descending Count order; phrases that are tied in count are sorted in descending length (N) order, and any remaining ties are sorted alphabetically. You can change either list to alphabetical order by using the Alphabetical Order option in that list's pop-up menu.
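The default sort rules can be illustrated with a short Python sketch. This is only an illustration of the ordering described above, not JMP code; the function names and the sample counts are hypothetical.

```python
# Hypothetical sample rows, not output from JMP.
# Terms are (term, count); phrases are (phrase, count, n_words).
terms = [("filter", 3), ("data", 5), ("column", 3)]
phrases = [
    ("data table", 4, 2),
    ("text box", 2, 2),
    ("open data table", 2, 3),
    ("add column", 2, 2),
]

def sort_terms(rows):
    # Descending count; ties are broken alphabetically.
    return sorted(rows, key=lambda r: (-r[1], r[0]))

def sort_phrases(rows):
    # Descending count, then descending length (N), then alphabetical.
    return sorted(rows, key=lambda r: (-r[1], -r[2], r[0]))

print(sort_terms(terms))
print(sort_phrases(phrases))
```

In this sample, "open data table" is listed before the other phrases with a count of 2 because it has the largest N.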

The phrases that appear in the phrase list are determined by the settings of the Maximum Words per Phrase and Maximum Number of Phrases options in the launch window. Phrases that occur only one time in the data table do not appear in the phrase list.
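The following Python sketch shows one way these two limits and the single-occurrence rule could interact. It is not the algorithm that Text Explorer uses. The function name, the parameter names and their default values, and the assumption that a phrase has at least two words are all hypothetical.

```python
from collections import Counter

def candidate_phrases(docs, max_words=3, max_phrases=100):
    """Count word sequences of 2..max_words tokens and keep repeated ones.

    docs is a list of token lists, one per document. The parameters stand in
    for the Maximum Words per Phrase and Maximum Number of Phrases options;
    the default values here are arbitrary, not JMP defaults.
    """
    counts = Counter()
    for tokens in docs:
        for n in range(2, max_words + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    # Phrases that occur only once are dropped.
    repeated = [(p, c) for p, c in counts.items() if c > 1]
    # Same ordering as the default Phrase List sort.
    repeated.sort(key=lambda pc: (-pc[1], -len(pc[0].split()), pc[0]))
    return repeated[:max_phrases]

docs = [
    ["open", "the", "data", "table"],
    ["close", "the", "data", "table"],
    ["data", "filter"],
]
print(candidate_phrases(docs))
```

In this sample, "data filter" and "open the" occur only once, so they do not appear in the result.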

Phrases can be specified as terms at various scopes. Phrases in the phrase list that have been specified as terms are colored based on the scope of the phrase specification (Table 12.1). For more information about specifying phrases in different scopes, see Term Options Management Windows.

Table 12.1 Colors for Specified Phrases

Scope               Color
Built-in            Red
User Library        Green
Project             Blue
Column Property     Orange
Local               Gray

Actions for Terms and Phrases

You can access options in the Term List and Phrase List tables by selecting items and then right-clicking in the left-most column of each table. You can save each table as a data table by right-clicking in the Count column of each table and selecting Make into Data Table.

Term List Pop-up Menu Options

When you right-click in the Term column of the Term List table, a pop-up menu appears with the following options:

Select Rows

Selects rows in the data table that contain the selected terms.

Show Text

Shows the documents that contain the selected terms.

Note: By default, only the first 10,000 documents are shown. If there are more than 10,000 documents that contain a selected term, a window appears that enables you to increase this limit.

Alphabetical Order

Specifies the sort order of the term list. When this option is selected, the terms are sorted in alphabetical order. When this option is not selected, the terms are sorted in descending Count order.

Numerical Order

(Available only when the Alphabetical Order option is selected.) Specifies the sort order of the term list. When this option is selected, the items are split into string and numeric segments, and the numeric segments are then sorted in numerical order. For more information about the sorting rules used by the Numerical Order option, see Numerical Order in Using JMP.

Copy

Places the selected terms onto the clipboard.

Color

Enables you to assign a color to the selected terms.

Label

Places labels on the corresponding points in the Term SVD Plot for the selected terms.

Containing Phrases

Selects the phrases in the Phrase List table that contain the selected terms.

Save Indicators

Saves an indicator column to the data table for each term selected in the term list. The value of the indicator column for each row is 1 if the document in that row contains the term and 0 otherwise.

Save Formula

Saves a column formula to the data table for each term selected in the term list. The column formula for each row evaluates to 1 if the document in that row contains the term and 0 otherwise. Because the column is defined by a formula, it is also evaluated for documents that are added to the data table later.

Recode

Enables you to change the values for one or more terms. Select the terms in the list before selecting this option. After you select this option, the Recode window appears. See Recode Data in a Column in Using JMP.

Add Stop Word

Adds the selected terms to the list of stop words and removes those terms from the term list. This action also updates the phrase list.

Note: If you add a stemmed word as a stop word, all of the tokens that correspond to that stem are added as stop words.

Add Stem Exception

(Available only when the Language option is set to English, German, Spanish, French, or Italian.) Adds the selected terms to the list of terms that are excluded from stemming.

Remove Phrase

(Available only when a specified phrase is selected in the term list and the selected Stemming method is No Stemming.) Removes the selected phrase from the set of specified phrases and updates the Term Counts accordingly.

Note: If a phrase has been added as a Sentiment Phrase, the Remove Phrase option also removes the phrase from the list of sentiment terms in the current Sentiment Analysis report.

Add Sentiment

(Available only when a Sentiment Analysis report is open in the current report window.) Adds the selected terms to the list of sentiment terms in the current Sentiment Analysis report.

Note: If you add a stemmed word as a sentiment term, all of the tokens that correspond to that stem are added as sentiment terms.

Show Filter

Shows or hides a search filter above the term list. See Search Filter Options.

Make into Data Table

Creates a JMP data table from the report table.

Make Combined Data Table

Searches the report for other tables like the one you selected and combines them into a single JMP data table.
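The 0/1 values produced by the Save Indicators and Save Formula options can be sketched in Python as follows. This is an illustration only, not JMP code. The function names are hypothetical, and the lowercase, whitespace-based tokenizer is a stand-in assumption for the tokenization that Text Explorer performs.

```python
def tokenize(document):
    # Simplified stand-in tokenizer: lowercase and split on whitespace.
    return document.lower().split()

def term_indicator(documents, term):
    """Return 1 for each document that contains the term and 0 otherwise."""
    return [1 if term in tokenize(doc) else 0 for doc in documents]

documents = [
    "The data table is open",
    "Filter the rows",
    "Data filter options",
]
print(term_indicator(documents, "data"))
print(term_indicator(documents, "filter"))
```

Because term_indicator is a function rather than a stored list of values, it can be applied to documents that were not in the original set, which is the role that a saved formula plays for new rows.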

Phrase List Pop-up Menu Options

When you right-click in the Phrase column of the Phrase List table, a pop-up menu appears with the following options:

Select Rows

Selects rows in the data table that contain the selected phrases.

Show Text

Shows the documents that contain the selected phrases.

Save Indicators

Saves an indicator column to the data table for each phrase selected in the phrase list. The value of the indicator column for each row is 1 if the document in that row contains the phrase and 0 otherwise.

Alphabetical Order

Specifies the sort order of the phrase list. When this option is selected, the phrases are sorted in alphabetical order. When this option is not selected, the phrases are sorted in descending Count order.

Numerical Order

(Available only when the Alphabetical Order option is selected.) Specifies the sort order of the phrase list. When this option is selected, the items are split into string and numeric segments, and the numeric segments are then sorted in numerical order. For more information about the sorting rules used by the Numerical Order option, see Numerical Order in Using JMP.

Copy

Places the selected phrases onto the clipboard.

Select Contains

Selects larger phrases in the phrase list that contain the selected phrase.

Select Contained

Selects smaller phrases in the phrase list and terms in the term list that are contained by the selected phrase.

Add Phrase

Adds the selected phrases to the term list and updates the Term Counts accordingly.

Add Stop Word

Adds the selected phrases to the list of stop words. This action also updates the term list.

Add Sentiment Phrase

(Available only when a Sentiment Analysis report is open in the current report window.) Adds the selected phrases to the term list and to the list of sentiment terms in the current Sentiment Analysis report.

Show Filter

Shows or hides a search filter above the phrase list. See Search Filter Options.

Make into Data Table

Creates a JMP data table from the report table.

Make Combined Data Table

Searches the report for other tables like the one you selected and combines them into a single JMP data table.
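The Numerical Order option in both pop-up menus splits items into string and numeric segments. The Python sketch below shows a common way to build such a sort key. It is an approximation for illustration; the exact rules that JMP applies are described in Using JMP, and the function name and sample items are hypothetical.

```python
import re

def numeric_order_key(item):
    """Split an item into string and numeric segments for sorting."""
    key = []
    # With a capturing group, re.split alternates between non-digit text
    # (even positions) and digit runs (odd positions).
    for i, part in enumerate(re.split(r"(\d+)", item)):
        if part == "":
            continue
        if i % 2 == 1:
            key.append((0, int(part), ""))      # numeric segment
        else:
            key.append((1, 0, part.lower()))    # string segment
    return key

items = ["item10", "item2", "item1"]
print(sorted(items))                          # plain alphabetical order
print(sorted(items, key=numeric_order_key))   # numeric segments compared as numbers
```

Plain alphabetical sorting places "item10" before "item2" because it compares character by character. With the segmented key, 2 is compared with 10 as a number, so "item2" comes first.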

Search Filter Options

Click the down arrow button next to the search box to refine your search.

Contains Terms

Returns items that contain a part of the search criteria. A search for “ease oom” returns items such as “Release Zoom”.

Contains Phrase

Returns items that contain the exact search criteria. A search for “text box” returns entries that contain “text” followed directly by “box” (for example, “Context Box” and “Text Box”).

Starts With Phrase

Returns items that start with the search criteria.

Ends With Phrase

Returns items that end with the search criteria.

Whole Phrase

Returns items that consist of the entire string. A search for “text box” returns entries that contain only “text box”.

Regular Expression

Enables you to use the wildcard (*) and period (.) in the search box. A search for “get.*name” looks for items that contain “get” followed, after any number of characters, by “name”. It returns items such as “Get Color Theme Names”, “Get Name Info”, and “Get Effect Names”.

Invert Result

Returns items that do not match the search criteria.

Match All Terms

Returns items that contain both strings. A search for “t test” returns items that contain both of the search strings, such as “Pat Test”, “Shortest Edit Script”, and “Paired t test”.

Ignore Case

Ignores the case in the search criteria.

Match Whole Words

Returns items that contain each word in the string based on the Match All Terms setting. If you search for “data filter”, and Match All Terms is selected, entries that contain both “data” and “filter” are returned.
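The matching behavior of several of these options can be approximated in Python using the examples given above. This sketch is not the JMP implementation. The function names are hypothetical, and it assumes that search strings are separated by spaces and that Match All Terms applies to the Contains Terms search.

```python
import re

def _norm(text, ignore_case):
    return text.lower() if ignore_case else text

def contains_terms(item, query, match_all=False, ignore_case=True):
    """Contains Terms: substring match on each space-separated search string."""
    item = _norm(item, ignore_case)
    hits = [term in item for term in _norm(query, ignore_case).split()]
    return all(hits) if match_all else any(hits)

def contains_phrase(item, query, ignore_case=True):
    """Contains Phrase: the whole query appears as one contiguous string."""
    return _norm(query, ignore_case) in _norm(item, ignore_case)

def whole_phrase(item, query, ignore_case=True):
    """Whole Phrase: the item consists of the entire query and nothing else."""
    return _norm(item, ignore_case) == _norm(query, ignore_case)

def regular_expression(item, pattern, ignore_case=True):
    """Regular Expression: the pattern is found anywhere in the item."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, item, flags) is not None

items = ["Release Zoom", "Context Box", "Text Box", "Get Name Info",
         "Pat Test", "Shortest Edit Script", "Paired t test"]
print([i for i in items if contains_terms(i, "t test", match_all=True)])
print([i for i in items if contains_phrase(i, "text box")])
print([i for i in items if regular_expression(i, "get.*name")])
```

“Shortest Edit Script” matches “t test” with Match All Terms because “test” occurs inside “Shortest”; this sketch does not model the Match Whole Words option, which would exclude that kind of match.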
