Note: The strategy described here is not useful for columns that contain floating-point numbers. Use Summarize() instead. See “Store Summary Statistics in Global Variables” in Data Tables.
A key can exist only once in an associative array, so placing a column’s values into an associative array automatically reduces them to the unique values. For example, the Big Class.jmp sample data table contains 40 rows. To see how many unique values are in the column height, run this script:
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
unique heights = Associative Array( dt:height );
N Items( unique heights );
17
There are only 17 unique values for height. You can use those unique values by getting the keys:
unique heights << Get Keys;
{51, 52, 55, 56, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70}
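As a minimal sketch of working with those keys, the following script tests whether a particular value occurs in the column and then counts the rows that match each unique height. The variable names keys and i are illustrative and are not part of the original example. Because 60 appears in the key list above, the Contains test should return 1.

dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
unique heights = Associative Array( dt:height );
// Contains returns 1 if the key exists and 0 otherwise
Show( unique heights << Contains( 60 ) );
// count the rows that match each unique value
keys = unique heights << Get Keys;
For( i = 1, i <= N Items( keys ), i++,
	Write( "\!N", keys[i], ": ", N Rows( dt << Get Rows Where( :height == keys[i] ) ) )
);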
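This approach also scales to large columns. The following script builds a 100,000-row table by sampling names at random from Big Class.jmp, then reports the number of unique names and the elapsed time in seconds:
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );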
nms = dt:name << Get Values;
dtbig = New Table( "Really Big Class",
	New Column( "name",
		Character,
		// sample 100,000 names at random, with replacement
		Set Values( nms[J( 100000, 1, Random Integer( N Items( nms ) ) )] )
	)
);
Wait( 0 );
// time how long it takes to find the unique names
t1 = Tick Seconds();
Write(
	"\!N# names from Really Big Class = ",
	N Items( Associative Array( dtbig:name ) ),
	", elapsed time=",
	Tick Seconds() - t1
);
# names from Really Big Class = 39, elapsed time=0.116666666639503
Because keys are ordered lexicographically, putting the values into an associative array also sorts them. For example, the <<Get Keys message returns the keys (the unique values of the name column) in ascending order:
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
unique names = Associative Array( dt:name );
unique names << Get Keys;
{"ALFRED", "ALICE", "AMY", "BARBARA", "CAROL", "CHRIS", "CLAY", "DANNY", "DAVID", "EDWARD", "ELIZABETH", "FREDERICK", "HENRY", "JACLYN", "JAMES", "JANE", "JEFFREY", "JOE", "JOHN", "JUDY", "KATIE", "KIRK", "LAWRENCE", "LESLIE", "LEWIS", "LILLIE", "LINDA", "LOUISE", "MARION", "MARK", "MARTHA", "MARY", "MICHAEL", "PATTY", "PHILLIP", "ROBERT", "SUSAN", "TIM", "WILLIAM"}
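Because the keys are kept in order, you can also walk through them one at a time instead of retrieving the whole list. The following is a minimal sketch that uses the <<First and <<Next messages to print the first five names in sort order; the variable names key and i are illustrative.

dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
unique names = Associative Array( dt:name );
// start at the first key in sort order
key = unique names << First;
// print only the first five names
For( i = 1, i <= 5, i++,
	Write( "\!N", key );
	key = unique names << Next( key );
);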
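Associative arrays also make it easy to compare columns in two different data tables. For example, load the country names from two sample data tables into associative arrays:
dt1 = Open( "$SAMPLE_DATA/BirthDeathYear.jmp" );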
dt2 = Open( "$SAMPLE_DATA/World Demographics.jmp" );
aa1 = Associative Array( dt1:Country );
aa2 = Associative Array( dt2:Territory );
Use N Items() to see how many countries appear in each data table:
N Items(aa1);
23
N Items(aa2);
239
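Use the <<Intersect message to find the values common to both data tables. The message modifies aa1 in place, keeping only the keys that also appear in aa2: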
aa1 = Associative Array( dt1:Country );
aa1 << Intersect( aa2 );
Show(N Items(aa1), aa1 << Get Keys);
N Items(aa1) = 21;
aa1 << get keys = {"Australia", "Austria", "Belgium", "France", "Greece", "Ireland", "Israel", "Italy", "Japan", "Mauritius", "Netherlands", "New Zealand", "Norway", "Panama", "Poland", "Portugal", "Romania", "Switzerland", "Tunisia", "United Kingdom", "United States"};
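The same pattern extends to other set operations. As a sketch, sending the <<Remove message with an associative array as its argument leaves only the keys of the first array that are absent from the second. Here that lists the countries in BirthDeathYear.jmp that have no match in World Demographics.jmp. The variable name unmatched is illustrative.

dt1 = Open( "$SAMPLE_DATA/BirthDeathYear.jmp" );
dt2 = Open( "$SAMPLE_DATA/World Demographics.jmp" );
aa2 = Associative Array( dt2:Territory );
// rebuild from the column, because Intersect and Remove change the array in place
unmatched = Associative Array( dt1:Country );
// drop every key that also appears in aa2
unmatched << Remove( aa2 );
Show( N Items( unmatched ), unmatched << Get Keys );

Given the counts above (23 countries in the first table, 21 of them shared), two keys should remain.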