The Select All Rows message selects (or highlights) all of the rows in a data table.
dt << Select All Rows;
If all rows are selected, you can deselect them all by using Invert Row Selection. This command reverses the selection state for each row, so that any selected rows are deselected, and any deselected rows are selected.
dt << Invert Row Selection;
Note: With the exception of Invert Row Selection, whose result depends on the current selection, any new selection message starts over with a new selection. If you already have certain rows selected and you then send a new message to select rows, all rows are first deselected.
To select a specific row, use Go To Row:
dt << Go To Row( 9 );
To select specific rows in a data table based on their row number, use the Select Rows command. The argument to the command is a list of row numbers. For example, to select rows 1, 3, 5, and 7 of a data table:
dt << Select Rows( {1, 3, 5, 7} );
To select a range of rows, specify one of the following messages:
dt << Select Rows( Index( 7, 10 ) );
dt << Select Where( Any( Row() == Index( 7, 10 ) ) );
Both of these examples select rows seven through 10 in the current data table.
To select rows according to data values, use Select Where or Where.
Tip: For a description of the functions and operators that you can use within a Select Where message, see Arithmetic Operators.
For example, using the Big Class.jmp sample data table, select the rows where the students’ age is greater than 13:
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
dt << Select Where( :age > 13 );
or
dt << Select Rows( Where( :age > 13 ) );
Or, select the rows where the students’ ages are less than 14:
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
col = Column( dt, 2 );
dt << Select Where( col[] < 14 );
or
dt << Select Rows( Where( :age < 14 ) );
The following example selects the rows where the student’s ages are less than 15 and the sex is “F”:
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
dt << Select Where( :age < 15 & :sex == "F" );
To select a row without deselecting a previously selected row, combine << Select Where with << Select Where and the Current Selection("extend") argument. This is an alternative to using an OR statement.
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
dt << Select Where( :age == 14 );
dt << Select Where( :sex == "F", Current Selection( "extend" ) );
To select rows that contain specific strings, follow this example:
dt = Open( "$SAMPLE_DATA/Hollywood Movies.jmp" );
my_genres = {"Romance", "Comedy"};
dt << Select Where( Contains( my_genres, :Genre ) );
You can also create more complex conditions by using regular expressions.
// select rows where :name begins with "j" or "J".
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
/* \0 is a group reference that causes the expression to return the entire matched text. !Is Missing is needed because the regex returns the matched text if it successfully matched or missing if it failed to match. So !Is Missing means the string successfully matched */
dt << Select Where( !Is Missing( Regex( :name, "^j", "\0", IGNORECASE ) ) );
// select rows where :name ends with "N" or "E".
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
dt << Select Where( !Is Missing( Regex( :name, "(N|E)$", "\0" ) ) );
To select rows that are currently excluded, hidden, or labeled:
dt << Select Excluded;
dt << Select Hidden;
dt << Select Labeled;
To select rows that are not excluded, hidden, or labeled, stack a select message and an invert selection message together in the same statement, or send the two messages sequentially:
dt << Select Hidden << Invert Row Selection;
dt << Select Hidden;
dt << Invert Row Selection;
To select duplicate rows in a data table, use the Select Duplicate Rows message:
dt = Open( "$SAMPLE_DATA/San Francisco Crime.jmp" );
dt << Select Duplicate Rows(); // selects rows 301 and 8864
The second and subsequent duplicate rows are selected. Note that the duplicate values are case sensitive.
Note: By default, if no matching column is specified, all of the columns are matched.
In the preceding example, the duplicate rows might have been data input errors. Exclude the duplicate rows to remove them from the analysis:
dt = Open( "$SAMPLE_DATA/San Francisco Crime.jmp" );
dt << Select Duplicate Rows();
dt << Exclude();
To select duplicate values in the rows of selected columns, specify the column names. The following example finds duplicate values in the rows of the Incident Number and Time columns.
dt = Open( "$SAMPLE_DATA/San Francisco Crime.jmp" );
dt << Select Duplicate Rows( Match( Column( "Incident Number" ), :Time ) );
To select duplicate rows according to data values, select the duplicate rows, specify the data values, and then restrict the duplicate rows to the current selection.
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
dt << Select Duplicate Rows( Match( :age ) );
dt << Select Where( :age > 15, Current Selection( "restrict" ) );
To refer to a specific cell, assign a subscript to the cell’s row number. In the following example, the subscript [1] is used with the weight column. The formula then calculates the ratio between each height and the first value in the weight column.
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
New Column( "ratio", Formula( height / weight[1] ) );
To obtain a random selection, use this syntax:
dt << Select Randomly( number );
dt << Select Randomly( Sample Size( number ) );
dt << Select Randomly( probability );
dt << Select Randomly( Sampling Rate( probability ) );
These commands use a conditional probability to obtain the exact count requested.
The row menu message Select Matching Cells is also implemented in JSL.
// select matching cells in the current data table
dt << Select Matching Cells;
// select matching cells in all open data tables
dt << Select All Matching Cells;
For more complicated selections, or to store selections permanently as row state data, see Row States.