Publication date: 07/08/2024

Compress Selected Columns in Data Tables

JMP lets you compress columns in a data table to reduce the size of the file and the amount of memory required to analyze the data. This feature is helpful when numeric columns contain many small integers or when any column contains fewer than 255 unique values. For example, compressing columns in a data table with 389 columns and 85,000 rows might decrease the file size from 250 MB to 33 MB, depending on the type of data.
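As a rough check on the scale of that example, the following Python sketch estimates the size of the cell data before and after compression. It assumes that every cell is stored as an 8-byte number before compression and as a 1-byte value afterward. Both storage sizes are illustrative assumptions, not a description of how the example table is actually stored.

```python
# Back-of-the-envelope size estimate for the example table.
# Assumptions (not stated in the documentation): 8 bytes per cell
# before compression, 1 byte per cell after compression.
ROWS = 85_000
COLUMNS = 389

cells = ROWS * COLUMNS
bytes_before = cells * 8  # assumed 8-byte storage per cell
bytes_after = cells * 1   # assumed 1-byte storage per cell

print(f"{cells:,} cells")
print(f"{bytes_before / 1e6:.1f} MB before compression")
print(f"{bytes_after / 1e6:.1f} MB after compression")
```

Under these assumptions, the estimate is about 264.5 MB before compression and about 33.1 MB after, which is the same order of magnitude as the figures quoted above. Actual savings depend on the data in each column.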

When you compress columns, JMP verifies whether the data can be stored in a more compact form based on the data type:

In character columns with fewer than 255 unique values, the List Check property is added to the column where appropriate (Figure 4.37). When the preference Allow 16 Bit List Check Compression is selected, the List Check property is also added to character columns that have more than 255 unique values.

The List Check property restricts the values in the selected column to a list of valid values. The List Check property is not applied when the number of unique values in the selected column is too great. For example, if the number of unique values is almost the same as the number of rows, JMP does not add the List Check property to the column.

For numeric columns, only those with the Best, Fixed Dec, or Data format are compressed. Data is compressed to 1-byte, 2-byte, or 4-byte integers when possible (Figure 4.38). For more information about short integers, see The Short-Integer Format.

A numeric column with non-integer values can also be compressed if there are fewer than 255 unique values. In this case, the List Check property is added to the column.
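The checks described above can be sketched in Python as follows. This is a minimal illustration, not JMP's implementation: the function name `suggest_storage`, the `allow_16_bit` parameter, and the returned labels are hypothetical, and the integer ranges are the standard signed ranges, which might differ from the limits that JMP uses (for example, if some values are reserved for missing data). The sketch also ignores the column format and the case where the number of unique values approaches the number of rows.

```python
from typing import Optional, Sequence, Union

Value = Union[int, float, str]

# Assumed standard signed integer ranges. JMP's real limits may differ,
# for example if some values are reserved to represent missing data.
INT_RANGES = [
    ("1-byte integer", -2**7, 2**7 - 1),
    ("2-byte integer", -2**15, 2**15 - 1),
    ("4-byte integer", -2**31, 2**31 - 1),
]


def suggest_storage(values: Sequence[Value],
                    allow_16_bit: bool = False) -> Optional[str]:
    """Return a hypothetical compact-storage label, or None if none applies."""
    if not values:
        return None

    # Numeric column whose values are all whole numbers:
    # pick the narrowest integer width that holds the full range.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if numeric and all(float(v).is_integer() for v in values):
        lowest, highest = min(values), max(values)
        for label, low, high in INT_RANGES:
            if low <= lowest and highest <= high:
                return label

    # Otherwise, fall back to a list of valid values when the column has
    # fewer than 255 unique values (or up to 65,535 with the 16-bit option).
    limit = 65_535 if allow_16_bit else 254
    if len(set(values)) <= limit:
        return "List Check"
    return None
```

For example, under these assumptions:

```python
suggest_storage([1, 2, 3])                                   # "1-byte integer"
suggest_storage([0, 300])                                    # "2-byte integer"
suggest_storage([0.5, 1.5, 0.5])                             # "List Check"
suggest_storage([str(i) for i in range(300)])                # None
suggest_storage([str(i) for i in range(300)], allow_16_bit=True)  # "List Check"
```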

Notes:

To automatically compress a column that has 65,535 or fewer unique values, select the Allow 16 Bit List Check Compression preference in the General group. The List Check property is also added to the column.

In a column with the List Check property, you can enter only values that are in the list. If you try to enter any other value, JMP warns that the cell contains invalid data. See List Check.

Figure 4.37 List Check Property Added to a Compressed Character Column 


Figure 4.38 Column Info Window Showing Numeric Column before and after Compression 


To compress columns, select one or more columns and select Cols > Utilities > Compress Selected Columns. (Select all columns if you do not know which columns can be compressed.)

The column or columns are compressed if possible. The log shows which columns were compressed and how they were compressed. (Select View > Log to show the log.)

Note: To compress a numeric column manually, set your Tables preferences to allow short numeric data and then change the column’s data type to 1-byte integer, 2-byte integer, or 4-byte integer. For more information about this preference, see Preferences for Data Tables.
