Publication date: 07/08/2024

Pattern Matching

Pattern matching in JSL is a flexible method for searching and manipulating strings.

You define and use pattern variables just like any JMP variable:

i = 3; // a numeric variable
a = "Ralph"; // a character variable
t = textbox("Madge"); // a display box variable
p = ( "this" | "that" ) + patSpan(" ")
		+ ( "car" | "bus" ); // a pattern variable

When the last statement above executes, p is assigned a pattern value. A pattern value can be used either to construct another pattern or to perform a pattern match. The Pat Span function (written patSpan above; JSL ignores case and spaces in names) returns a pattern that matches a run of one or more characters drawn from its argument; for example, Pat Span("0123456789") matches runs of digits.

p2 = "Take " + p + "."; // using p to build another pattern
If( Pat Match( "Take this bus.", p2 ), // performing a match
	Print( "matches" ),
	Print( "no match" )
);
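
Pat Span can also be tried on its own. The following is a minimal sketch, with made-up sample strings and illustrative variable names (digitRun, hasDigits, noDigits), that checks whether a string contains a run of digits anywhere:

```jsl
// digitRun, hasDigits, and noDigits are illustrative names, not part of JSL.
digitRun = Pat Span( "0123456789" ); // one or more digits
hasDigits = Pat Match( "invoice 20417 paid", digitRun );
noDigits = Pat Match( "no digits here", digitRun );
Show( hasDigits, noDigits ); // expected: hasDigits = 1; noDigits = 0;
```

The first match succeeds even though the digits are in the middle of the string, because the pattern is tried at successive positions in the source text.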

Sometimes all you need to know is that the pattern matched the source text, as above. Other times, you might want to know what matched; for example, was it a bus or a car?

p = ("this" | "that") + Pat Span( " " ) + ("car" | "bus") >? vehicleType;
// conditional assignment ONLY if pattern matches
If( Pat Match( "Take this bus.", p ),
	Show( vehicleType ),
	// do not use vehicleType in the ELSE because it is not set
	Print( "no match" )
);

You could preload vehicleType with a default value if you do not want to check the outcome of the match with If. The >? conditional assignment operator takes two arguments: a pattern and a JSL variable. It constructs a pattern that matches the first argument and, after the entire match succeeds, stores the matched text in the variable. The >> operator takes the same arguments but, in contrast, does not wait for the entire match to succeed. As soon (and as often) as the >> pattern matches, the assignment is performed.

findDelimString = Pat Len( 3 ) >> beginDelim + Pat Arb() >? middlePart
	+ Expr( beginDelim );
testString = "SomeoneSawTheQuickBrownFoxJumpOverTheLazyDog'sBack";
rc = Pat Match( testString, findDelimString, "<<<" || middlePart || ">>>" );
Show( rc, beginDelim, middlePart, testString );

The above example shows a third argument to the Pat Match function: the replacement string. In this case, the replacement is formed by concatenating (|| operator) three strings. One of the three, middlePart, was extracted from testString by >?. Conditional assignment is sufficient here because the replacement cannot occur unless the pattern match succeeds (rc == 1).

Look at the pattern assigned to findDelimString. It is a concatenation of three patterns. The first uses the >> operator to match three characters and assign them to beginDelim immediately. The second uses the >? operator to match an arbitrary number of characters and, when the entire match succeeds, assign them to middlePart. The last is an unevaluated expression, Expr( beginDelim ), which stands for whatever string is in beginDelim at the time the pattern is executing, not at the time the pattern is built. As with any use of Expr(), the evaluation of its argument is postponed. As a result, the pattern hunts for two identical three-character delimiters on either side of the middle part.
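
To see the difference between >> and >? in isolation, consider a match that is designed to fail. This is only a sketch: the variable names and the sample string are made up, and the exact character left in immediateVar depends on how the matcher scans the source text, so no specific value is claimed for it.

```jsl
// Illustrative variable names; both start with a sentinel value.
immediateVar = "unset";
conditionalVar = "unset";
// The trailing "z" never appears in "abc", so the overall match fails.
rc = Pat Match(
	"abc",
	Pat Len( 1 ) >> immediateVar + Pat Len( 1 ) >? conditionalVar + "z"
);
Show( rc, immediateVar, conditionalVar );
// rc should be 0 and conditionalVar should still be "unset", because >?
// assigns only after the entire match succeeds.
// immediateVar should hold a character from "abc", because >> assigned it
// as soon as Pat Len( 1 ) matched, before the match as a whole failed.
```

This is why >> suits values that the rest of the pattern needs while it is still running, as with beginDelim above, and >? suits values that you intend to use only after a successful match.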

Other pattern functions might be faster, and might represent the problem that you are trying to solve better, than writing a lot of alternatives. For example, "a"|"b"|"c" is the same as Pat Any("abc"). Writing the equivalent of Pat Not Any("abc") as a list of alternatives is much harder. Similar to Pat Span (above), Pat Break("0123456789") matches up to, but not including, the first digit.
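
As a rough illustration of how Pat Break and Pat Span complement each other, the sketch below splits a string at its first run of digits. The sample text and the variable names prefixText and digitRun are assumptions made for this example.

```jsl
// prefixText and digitRun are illustrative names; the sample text is made up.
rc = Pat Match(
	"order 66 shipped",
	Pat Break( "0123456789" ) >? prefixText + Pat Span( "0123456789" ) >? digitRun
);
Show( rc, prefixText, digitRun );
// expected: rc = 1; prefixText = "order "; digitRun = "66";

// Pat Any and the equivalent list of alternatives should agree.
viaAlternatives = Pat Match( "xbz", "a" | "b" | "c" );
viaPatAny = Pat Match( "xbz", Pat Any( "abc" ) );
Show( viaAlternatives, viaPatAny ); // both should be 1
```

Pat Break stops just before the first digit, so the text it matched keeps the trailing space, and Pat Span then consumes the digits themselves.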

Here is a pattern that matches numbers with decimals and exponents and signs. It also matches some degenerate cases with no digits; look at the pattern assigned to digits.

digits = Pat Span( "0123456789" ) | "";
 
number = (Pat Any( "+-" ) | "") >? signPart
	+ (digits) >? wholePart
	+ ("." + digits | "") >? fractionPart
	+ (Pat Any( "eEdD" ) + (Pat Any( "+-" ) | "") + digits | "") >? exponentPart;
 
If( Pat Match( "-123.456e-78", number ),
	Show( signPart, wholePart, fractionPart, exponentPart )
);
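
The degenerate case mentioned above can be sketched as follows. The pattern is repeated so that the example runs on its own; the source string "abc" is an arbitrary choice.

```jsl
// Same pattern as above, repeated so this sketch is self-contained.
digits = Pat Span( "0123456789" ) | "";
number = (Pat Any( "+-" ) | "") >? signPart
	+ (digits) >? wholePart
	+ ("." + digits | "") >? fractionPart
	+ (Pat Any( "eEdD" ) + (Pat Any( "+-" ) | "") + digits | "") >? exponentPart;

// "abc" contains no digits, yet every piece of the pattern can fall back to "".
rc = Pat Match( "abc", number );
Show( rc, signPart, wholePart, fractionPart, exponentPart );
// rc should be 1, with all four parts set to empty strings.
```

A script that needs a real number could add its own check after the match, for example that Length( wholePart ) > 0 or Length( fractionPart ) > 1.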