Saves a stacked version of the document-term matrix to a JMP data table. The stacked format is appropriate for analysis in the Association Analysis platform. For more information, see Association Analysis in the Predictive and Specialized Modeling book. If you specify an ID variable in the Text Explorer launch window, the ID variable is used to identify the rows that each term came from in the original text data table. The stacked table also contains a table script to launch Association Analysis.
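To illustrate what a stacked layout looks like, the following Python sketch converts a small set of tokenized documents into rows that pair an ID value with a term. It is not JMP's implementation. The function name `stack_document_terms`, the sample IDs, and the choice of one row per distinct term per document are assumptions made for illustration only; the columns that JMP actually writes might differ.

```python
def stack_document_terms(documents):
    """Return (doc_id, term) rows, one per distinct term in each document.

    `documents` maps a hypothetical ID value to that document's list of terms.
    """
    rows = []
    for doc_id, terms in documents.items():
        # Assumption: each term appears once per document in the stacked form.
        for term in sorted(set(terms)):
            rows.append((doc_id, term))
    return rows


documents = {
    "A": ["red", "car", "red"],
    "B": ["blue", "car"],
}
stacked = stack_document_terms(documents)
for row in stacked:
    print(row)
```

Each row can be read as "this ID contains this term", which is the transaction-style shape that association analysis methods generally expect.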
Saves a vector-valued formula column to the data table. The length of the vector depends on user-specified options for the maximum number of terms, the minimum term frequency, and the weighting. The resulting column uses the Text Score() JSL function. For more information about this function, see Help > Scripting Index.
Abbreviation for term frequency-inverse document frequency. Assigns the value TF * log10( nDoc / nDocTerm ) to each term in each document. This is the default weighting. The terms in the formula are defined as follows:
TF = frequency of the term in the document
nDoc = number of documents in the corpus
nDocTerm = number of documents that contain the term
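As a worked illustration of the formula, the following Python sketch computes TF * log10( nDoc / nDocTerm ) for a tiny corpus. It is a minimal sketch that assumes the documents are already tokenized into lists of terms; the function name `tf_idf` and the sample corpus are hypothetical and are not part of JMP.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Each weight is TF * log10(nDoc / nDocTerm), following the definitions above.
    """
    n_doc = len(documents)  # nDoc: number of documents in the corpus
    # nDocTerm: number of documents that contain the term
    # (set() makes each term count at most once per document).
    n_doc_term = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        tf = Counter(doc)  # TF: frequency of the term in this document
        weights.append(
            {term: freq * math.log10(n_doc / n_doc_term[term])
             for term, freq in tf.items()}
        )
    return weights


corpus = [
    ["red", "car", "red"],
    ["blue", "car"],
    ["green", "car"],
]
weights = tf_idf(corpus)
print(weights[0])
```

In this corpus, "car" occurs in all three documents, so its weight is TF * log10( 3 / 3 ) = 0 everywhere. The term "red" occurs twice in the first document and in no other document, so its weight there is 2 * log10( 3 / 1 ), or about 0.954. Terms that appear in every document contribute nothing under this weighting, while terms concentrated in a few documents are emphasized.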