Publication date: 07/08/2024

Statistical Details for the Explore Patterns Platform

This section contains statistical details for calculating the rarity of longest runs and longest sequences in the Explore Patterns platform.

Rarity in Longest Runs

To calculate the rarity for longest runs, first define the following variables:

n = the number of rows in the column

k = the number of times a specific value occurs in the column

p = k/n = the probability of observing the specific value in the column

m = the length of the run

N = the number of unique runs

Then, the rarity for longest runs is calculated as follows:

\mathrm{Rarity} = -\log_2\left(1 - \left(1 - p^{m-1}\right)^{N}\right)
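
The following Python sketch illustrates the longest-runs calculation. It assumes the formula is read as the negative base-2 logarithm of 1 - (1 - p^(m-1))^N, that is, the negative log of the chance that at least one of the N runs reaches length m. The function name `longest_run_rarity` and the handling of a zero probability are illustrative assumptions, not part of JMP.

```python
import math


def longest_run_rarity(n, k, m, N):
    """Rarity of a longest run, using the variables defined above.

    n: number of rows in the column
    k: number of times the specific value occurs in the column
    m: length of the run
    N: number of unique runs
    """
    p = k / n  # probability of observing the specific value
    # Chance that at least one of the N runs extends to length m.
    prob_at_least_one = 1.0 - (1.0 - p ** (m - 1)) ** N
    if prob_at_least_one <= 0.0:
        # Assumption for this sketch: report an impossible run as infinitely rare.
        return math.inf
    return -math.log2(prob_at_least_one)


print(longest_run_rarity(n=100, k=50, m=2, N=1))  # 1.0
```

Under this reading, larger values indicate runs that are less likely to occur by chance, and a run of length 1 has a rarity of 0.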

Rarity in Longest Sequences

To calculate the rarity for longest sequences, first define the following variables:

p = the probability of observing the specific sequence one time in the column

k = the number of times the starting value of the sequence occurs in the column

Then, the rarity for longest sequences is calculated as follows:

\mathrm{Rarity} = -\log_2\left(1 - \left(1 - p\right)^{k}\right)
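
The following Python sketch illustrates the longest-sequences calculation. It assumes the formula is read as the negative base-2 logarithm of 1 - (1 - p)^k, that is, the negative log of the chance that the sequence appears at least once across the k occurrences of its starting value. The function name `longest_sequence_rarity` and the handling of a zero probability are illustrative assumptions, not part of JMP.

```python
import math


def longest_sequence_rarity(p, k):
    """Rarity of a longest sequence, using the variables defined above.

    p: probability of observing the specific sequence one time in the column
    k: number of times the starting value of the sequence occurs in the column
    """
    # Chance that the sequence occurs at least once in k opportunities.
    prob_at_least_one = 1.0 - (1.0 - p) ** k
    if prob_at_least_one <= 0.0:
        # Assumption for this sketch: report an impossible sequence as infinitely rare.
        return math.inf
    return -math.log2(prob_at_least_one)


print(longest_sequence_rarity(p=0.25, k=1))  # 2.0
```

Under this reading, the rarity decreases as k grows, because more occurrences of the starting value give the sequence more chances to appear.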
