This version of the Help is no longer updated. See JMP.com/help for the latest version.

Publication date: 07/30/2020

Character Functions

Most character functions take character arguments and return character strings, although some take numeric arguments or return numeric data. Arguments that are literal character strings must be enclosed in quotation marks.

Work with Character Functions in the Scripting Guide provides more information about some of the functions.

BLOB To Char(blob, <encoding>)

Description

Reinterpret binary data as a Unicode string.

Returns

A string.

Arguments

blob

A binary large object.

encoding

(Optional) A quoted string that specifies an encoding. The default encoding for the character string is utf-8. utf-16le, utf-16be, us-ascii, iso-8859-1, ascii~hex, shift-jis, and euc-jp are also supported.

Notes

The optional argument ascii is intended to make conversions of blobs containing normal ASCII data simpler when the data might contain CR, LF, or TAB characters (for example) and those characters do not need any special attention.

BLOB To Matrix(blob, type, bytes, endian, <nCols>)

Description

Creates a matrix by converting the bytes in the blob to numbers. Each number is read from the number of bytes given by the bytes argument.

Returns

A matrix that represents the blob.

Arguments

blob

A blob or reference to a blob.

type

A quoted string that contains the named type of number. The options are "int", "uint", or "float".

bytes

Byte size of the data in the blob. Options are 1, 2, 4, or 8.

endian

A quoted string that contains a named type that indicates whether the first byte is the most significant. Options are as follows:

– "big" indicates that the first byte is the most significant.

– "little" indicates that the first byte is the least significant.

– "native" indicates that the machine’s native format should be used.

<nCols>

The number of columns in the matrix. The default value is 1.
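As a minimal sketch of how the type, bytes, and endian arguments interact, the following lines read the same four-byte blob first as 1-byte integers and then as 2-byte big-endian unsigned integers. The hexadecimal input is an illustrative value, not taken from this reference.

```jsl
b = Hex To Blob( "25262728" ); // four bytes: 0x25 0x26 0x27 0x28

// One number per byte; nCols is omitted, so the result has one column.
Blob To Matrix( b, "int", 1, "big" );  // [37, 38, 39, 40]

// Two bytes per number, most significant byte first: 0x2526 and 0x2728.
Blob To Matrix( b, "uint", 2, "big" ); // [9510, 10024]
```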

Char(x, <width>, <decimal>, < <<Use Locale(Boolean)>)

Description

Converts an expression or numeric value into a character string.

Returns

A string.

Arguments

x

An expression or a numeric value. An expression must be quoted with Expr(). Otherwise, its evaluated value is converted to a string.

width

(Optional) A number that sets the maximum number of characters in the string.

decimal

(Optional) A number that sets the maximum number of decimal places included in the string.

Use Locale(Boolean)

(Optional) Preserves locale-specific numeric formatting.

Note

The width argument overrides the decimal argument.
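The difference between passing a value and passing a quoted expression can be sketched as follows. The arithmetic used here is only an illustrative input.

```jsl
// Without Expr(), the argument is evaluated first and the result is converted.
Char( 1 + 2 );         // "3"

// With Expr(), the expression itself is converted to text.
Char( Expr( 1 + 2 ) ); // "1 + 2"
```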

Example

Char( Pi(), 10, 4)

"3.1416"

Char( Pi(), 3, 4)

"3.1"

Char To BLOB("string", <"encoding">)

Description

Converts a string of characters into a binary object (blob).

Returns

A binary object.

Arguments

string

Quoted string or a reference to a string.

encoding

(Optional) A quoted string that specifies an encoding. The default encoding for the blob is utf-8. utf-16le, utf-16be, us-ascii, iso-8859-1, ascii~hex, shift-jis, and euc-jp are also supported.

Notes

Converting BLOBS into printable format escapes \ (in addition to ~ " ! and characters outside of the printable ASCII range) into hex notation (~5C for the backslash character).

x = Char To BLOB( "abc\def\!n" );

y = BLOB To Char( x, encoding = "ASCII~HEX" );

If(

	y == "abc~5Cdef~0A", "JMP 12.2 and later behavior",

	y == "abc\def~0A", "Pre-JMP 12.2 behavior"

);

"JMP 12.2 and later behavior" // output

Char To Hex(value, <"integer"|encoding="enc">)

Hex(value, <"integer"|encoding="enc"|Base(num)|Pad To(number)>)

Description

Returns the hexadecimal (or other base number system) text corresponding to the given value and encoding. The value can be a number, a string, or a blob. If the value is a number, IEEE 754 64-bit encoding is used unless one of the optional arguments integer or base is provided.

Arguments

value

Any number, quoted string, or blob.

integer

(Optional) A switch that causes the value to be interpreted as an integer.

encoding

(Optional) A quoted string that specifies an encoding. The default encoding is utf-8. utf-16le, utf-16be, us-ascii, iso-8859-1, ascii~hex, shift-jis, and euc-jp are also supported.

base(number)

(Optional) An integer value between 2 and 36 inclusive. If base is specified, the function returns the text corresponding to the specified number in that base number system instead of hexadecimal.

pad to(number)

(Optional) A value to specify the padded width of the hex output.
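A short sketch of the string form and of the optional switches. The inputs are illustrative. The outputs of the numeric calls are not shown here because they depend on padding behavior that this section does not spell out.

```jsl
// A string is converted byte by byte using the default utf-8 encoding.
Hex( "JMP" );            // "4A4D50"
Hex To Char( "4A4D50" ); // "JMP", the reverse conversion

// A number is treated as an IEEE 754 64-bit value unless a switch is given.
Hex( 255 );              // hexadecimal text of the 64-bit floating point value
Hex( 255, "integer" );   // hexadecimal text of the integer 255
Hex( 10, Base( 2 ) );    // text of the number 10 in base 2
```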

Collapse Whitespace("text")

Description

Trims leading and trailing whitespace and replaces interior whitespace with a single space. That is, each run of two or more consecutive whitespace characters is replaced with one space.

Returns

A quoted string.

Arguments

text

A quoted string.
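For instance, with an illustrative input string:

```jsl
// Leading and trailing spaces are removed; interior runs collapse to one space.
Collapse Whitespace( "  the   quick  fox " ); // "the quick fox"
```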

Concat(a, b)

Concat(A, B)

a||b

A||B

Description

For strings: Appends the string b to the string a. Neither argument is changed.

For lists: Appends the list b to the list a. Neither argument is changed.

For matrices: Horizontal concatenation of two matrices, A and B.

Returns

For strings: A string composed of the string a directly followed by the string b.

For lists: A list composed of the list a directly followed by the list b.

For matrices: A matrix.

Arguments

Two or more strings, string variables, lists, or matrices.

Notes

More than two arguments can be strung together. Each additional string is appended to the end, in left to right order. Each additional matrix is appended in left to right order.

Example

a = "Hello"; b = " "; c = "World"; a || b || c;

"Hello World"

d = {"apples", "bananas"}; e = {"peaches", "pears"}; Concat( d, e );

{"apples", "bananas", "peaches", "pears"}

A = [1 2 3]; B = [4 5 6]; Concat( A, B );

[1 2 3 4 5 6]

Concat Items

See Concat Items({string1, string2, ...}, <delimiter>).

Concat To(a, b)

a||=b

A||=B

Description

For strings: Appends the string b to the string a and places the new concatenated string into a.

For matrices: Appends the matrix b to the matrix a and places the new concatenated matrix into a.

For lists: Appends the list b to the list a and places the new concatenated list into a.

Returns

For strings: A string composed of the string a directly followed by the string b.

For matrices: A matrix.

For lists: A list composed of the list a directly followed by the list b.

Arguments

Two or more strings, string variables, matrices, or lists. The first variable must be a variable whose value can be changed.

Notes

More than two arguments can be strung together. Each additional string, matrix, or list is appended to the end, in left to right order.

Example

a = "Hello"; b = " "; c = "World"; Concat To( a, b, c ); Show( a );

a = "Hello World"

A = [1 2 3]; B = [4 5 6]; Concat To( A, B ); Show( A );

A = [1 2 3 4 5 6];

d = {"apples", "bananas"}; e = {"peaches", "pears"}; Concat to(d,e); Show( d );

d = {"apples", "bananas", "peaches", "pears"};

Contains(whole, part, <start>)

Description

Determines whether part is contained within whole.

Returns

If part is found: For lists, strings, and namespaces, the numeric position where the first occurrence of part is located. For associative arrays, 1.

If part is not found, 0 is returned in all cases.

Arguments

whole

A string, list, namespace, or associative array.

part

For a string or namespace, a string that can be part of the string whole. For a list, an item that can be an item in the list whole. For an associative array, a key that can be one of the keys in the map whole.

start

(Optional) A numeric argument that specifies a starting point within whole. If start is negative, Contains searches whole for part backwards, beginning with the position specified by the length of whole – start. Note that start is meaningless for associative arrays and is ignored.

Example

nameList={"Katie", "Louise", "Jane", "Jaclyn"};

r = Contains(nameList, "Katie");

The example returns a 1 because “Katie” is the first item in the list.
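The same function applied to strings can be sketched as follows. The strings are illustrative inputs.

```jsl
Contains( "Hello World", "World" ); // 7, the position where "World" begins
Contains( "Hello World", "xyz" );   // 0, because the part is not found

// With a positive start, the search begins at that position.
// "an" occurs at positions 2 and 4 of "banana"; starting at 3 finds the second.
Contains( "banana", "an", 3 );      // 4
```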

Contains Item(x, <item | {list} | pattern>, <delimiter>)

Description

Identifies multiple responses by searching for the specified item, list, pattern, or delimiter. The function can be used on columns with the Multiple Response modeling type or column property.

Returns

Returns a Boolean that indicates whether the word (item), one of a list of words (list), or pattern (pattern) matches one of the words in the text represented by x. Words are delimited by the characters in the optional delimiter (delimiter) string. A comma (",") is the default delimiter. Blanks are trimmed from the ends of each word extracted from the input text string (x).

Example

The following example searches the comma-delimited text for the item “pots” and then writes the result.

x = "Franklin Garden Supply is a leading online store featuring garden decor, statues, pots, shovels, benches, and much more.";

b = Contains Item( x, "pots", "," );

If( b,

	Write( "The specified items were found."  ), Write( "No match." )

);

The specified items were found.

Ends With("string", substring)

Description

Determines whether substring appears at the end of string.

Returns

1 if string ends with substring, otherwise 0.

Arguments

string

A quoted string or a string variable. Can also be a list.

substring

A quoted string or a string variable. Can also be a list.

Equivalent Expression

Right("string", Length(substring)) == substring
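A brief sketch of the function next to its equivalent expression. The file name is a hypothetical input.

```jsl
Ends With( "report.jmp", ".jmp" ); // 1
Ends With( "report.jmp", ".csv" ); // 0

// The equivalent expression compares the rightmost characters directly.
Right( "report.jmp", Length( ".jmp" ) ) == ".jmp"; // 1
```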

Hex(value, <"integer"|encoding="enc"|Base(number)|Pad To(number)>)

See Char To Hex(value, <"integer"|encoding="enc">).

Hex To BLOB("string")

Description

Converts the quoted hexadecimal string (including whitespace characters) to a blob (binary large object).

Example

Hex To BLOB( "4A4D50" );

Char To BLOB("JMP", "ascii~hex")

Hex To Char("string", <encoding>)

Description

Converts the quoted hexadecimal string to its character equivalent.

Example

Hex To Char( "30" ) results in “0”.

Notes

The default encoding for the character string is utf-8. utf-16le, utf-16be, us-ascii, iso-8859-1, ascii~hex, shift-jis, and euc-jp are also supported.

Hex To Number("string", <Base(number)>)

Description

Returns the number corresponding to the hexadecimal (or other base number system) text.

Arguments

string

A quoted hexadecimal string.

base(number)

(Optional) An integer value between 2 and 36 inclusive. If base is specified, the text is treated as a string representing the number in that base.

Example

Hex To Number( "80" );

128

Note

16-digit hexadecimal numbers are converted as IEEE 754 64-bit floating point numbers. Otherwise, the input is treated as a hexadecimal integer.

Whitespace between bytes (or pairs of digits) and in the middle of bytes is permitted (for example, FF 1919 and F F1919).
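The two interpretations described in the note, plus the base argument, can be sketched as follows. The input strings are illustrative.

```jsl
// Fewer than 16 digits: treated as a hexadecimal integer.
Hex To Number( "FF" );               // 255

// Exactly 16 digits: treated as an IEEE 754 64-bit floating point number.
Hex To Number( "3FF0000000000000" ); // 1

// With Base(), the text is read in that base instead of hexadecimal.
Hex To Number( "1010", Base( 2 ) );  // 10
```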

Insert

See Insert(source, item, <position>).

Insert Into

See Insert Into(source, item, <position>).

Item(n|[first last], string, <delimiter>, <Unmatched(result string)>, <Include Boundary Delimiters(Boolean)>)

Description

Returns the nth item, or the span from the first to last item, of the string according to the quoted string delimiters given. If you include the delimiter argument, any and all characters in that argument are taken to be delimiters.

Arguments

n

The position of the word being extracted.

[first last]

A matrix that defines the beginning and end word range to return.

string

The string that is evaluated.

delimiter

(Optional) The characters used as boundaries. If delimiter is absent, an ASCII space is used. If delimiter is an empty string, each character is treated as a separate word.

Unmatched(result string)

The string to print if no match is found.

Include Boundary Delimiters(Boolean)

(Optional) Includes the delimiters in the returned string.

Example

In Item(), consecutive delimiters are treated as though they have a word between them. In this example, the delimiters are a comma and a space.

Item( 4,"the  quick, brown fox", ", " ); // quick is preceded by two spaces

The expression is processed as follows:

the<delim[space]><word2><delim[space]>quick<delim[comma]><word4><delim[space]>brown<delim[space]>fox

Because word4 is empty, this expression returns an empty string.

Item() is the same as Word() except that Item() treats each delimiter character as a separate delimiter, and Word() treats several adjacent delimiters as a single delimiter.

Word( 4,"the  quick, brown fox", ", " ); // quick is preceded by two spaces

This expression is processed as follows:

the<delim[2 spaces]>quick<delim[comma + space]>brown<delim[space]>fox

It returns "fox".

Left("string", n, <filler>)

Left({list}, n, <filler>)

Description

Returns a truncated or padded version of the original string or list. The result contains the left n characters or list items, padded with any filler on the right if the length of string is less than n.
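A minimal sketch of truncation and padding, using illustrative inputs:

```jsl
Left( "abcdef", 3 );       // "abc"
Left( {1, 2, 3, 4}, 2 );   // {1, 2}

// The string has only 3 characters, so the result is padded
// on the right with the filler until it has 5 characters.
Left( "abc", 5, "*" );
```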

Length("string")

Description

Returns the length of the given string (in characters), list (in items), associative array (in number of keys), BLOB (in bytes), matrix (in elements), namespace (in number of functions and variables), or class (in number of methods, functions, and variables).

Lowercase("string")

Description

Converts any uppercase character found in the quoted string to the equivalent lowercase character.

Matrix to BLOB(matrix, type, bytesEach, endian)

Description

Makes a BLOB from a matrix by converting the matrix elements to 1-byte, 2-byte, or 4-byte signed or unsigned integers or 4-byte or 8-byte floating point numbers.

Arguments

matrix

The matrix.

type

The type of BLOB: int, uint, or float.

bytesEach

The number of bytes in each int, uint, or float. Integers can be 1, 2, or 4 bytes each, and floats can be 4 or 8 bytes each.

endian

The byte order of the data: big (the first byte is the most significant), little (the first byte is the least significant), or native (the machine’s native format).
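As a sketch of how these arguments pair with BLOB To Matrix, the following lines write a small matrix as 2-byte little-endian integers and read it back. The matrix values are illustrative.

```jsl
m = [1 2 3];
b = Matrix To Blob( m, "int", 2, "little" );

Length( b ); // 6, that is, 3 elements of 2 bytes each

// Reading with the same type, size, and byte order recovers the values.
// nCols is omitted, so the result is a single column.
Blob To Matrix( b, "int", 2, "little" ); // [1, 2, 3]
```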

Munger("string", offset, find|length)

Munger("string", offset, find, replace)

Description

Computes new character strings from the quoted string by inserting or deleting characters. It can also produce substrings, calculate indexes, and perform other tasks depending on how you specify its arguments.

Offset is a numeric expression indicating the starting position to search in the string. If the offset is greater than the position of the first instance of the find argument, the first instance is disregarded. If the offset is greater than the search string’s length, Munger uses the string’s length as the offset.
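Two common uses, locating text and replacing text, can be sketched as follows. The sentence is an illustrative input.

```jsl
// With a find string and no replace string, the position of the match is returned.
Munger( "the quick brown fox", 1, "quick" );         // 5

// With a replace string, the first match at or after the offset is replaced.
Munger( "the quick brown fox", 1, "quick", "fast" ); // "the fast brown fox"
```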

Num("string")

Description

Converts a character string into a number.

Regex("source", "pattern", (<replacementString>, <GLOBALREPLACE>), <format>, <IGNORECASE>)

Description

Searches for the pattern within the source string.

Returns

The matched text as a string or numeric missing if there was no match.

Arguments

source

A quoted string.

pattern

A quoted regular expression.

format

(Optional) A backreference to a capturing group. The default is \0, which is the entire matched string. \n returns the text matched by the nth capturing group.

IGNORECASE

(Optional) The search is case sensitive, unless you specify IGNORECASE.

GLOBALREPLACE

(Optional) Used with a replacement string. Applies the regular expression to the source string repeatedly until all matches are found.
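The main calling patterns can be sketched as follows. The source strings and patterns are illustrative and use only basic character classes.

```jsl
// Plain match: returns the matched text.
Regex( "abc123def", "[0-9]+" );                            // "123"

// Backreferences in the third argument rearrange the captured groups.
Regex( "John Smith", "([A-Za-z]+) ([A-Za-z]+)", "\2, \1" ); // "Smith, John"

// IGNORECASE makes the search case insensitive; \0 is the whole match.
Regex( "Hello World", "hello", "\0", IGNORECASE );         // "Hello"

// GLOBALREPLACE replaces every match rather than only the first.
Regex( "a b c", " ", "-", GLOBALREPLACE );                 // "a-b-c"
```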

Remove

See Remove(source, position, <n>).

Remove From

See Remove From(source, position, <n>).

Repeat(source, a)

Repeat(matrix, a, b)

Description

Returns a copy of source concatenated with itself a times. For a matrix with both a and b specified, returns a matrix composed of a row repeats and b column repeats. The source can be text, a matrix, or a list.
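A short sketch of both forms, using illustrative inputs:

```jsl
Repeat( "ab", 3 );      // "ababab"

// Two row repeats and two column repeats of a 1-by-2 matrix give a 2-by-4 matrix.
Repeat( [1 2], 2, 2 );  // [1 2 1 2, 1 2 1 2]
```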

Reverse

See Reverse(source).

Reverse Into

See Reverse Into(source).

Right("string", n, <filler>)

Right({list}, n, <filler>)

Description

Returns a truncated or padded version of the original string or list. The result contains the right n characters or list items, padded with any filler on the left if the length of string is less than n.

Shift

See Shift(source, <n>).

Shift Into

See Shift Into(source, <n>).

Starts With("string", "substring")

Description

Determines whether substring appears at the start of string.

Returns

1 if string starts with substring, otherwise 0.

Arguments

string

A quoted string or a reference to one. Can also be a list.

substring

A quoted string or a reference to one. Can also be a list.

Equivalent Expression

Left("string", Length("substring")) == "substring"

Substitute

See Substitute("string", "substring", "replacementString", ...).

Substitute Into

See Substitute Into("string", substring, replacementString, ...).

Substr("string", start, length)

Description

Extracts a portion of the first argument, beginning at the position given by the second argument and containing the number of characters given by the third argument. The first argument can be a character column or value, or an expression that evaluates to one. The start and length arguments can be numbers or expressions that evaluate to numbers.

Example

This example extracts the first name:

Substr( "Katie Layman", 1, 5 );

The function starts at position 1, reads through position 5, and ignores the remaining characters, which yields “Katie.”
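By the same logic, the last name can be extracted by starting at a later position. This is a sketch using the same string:

```jsl
// "Layman" begins at position 7 and is 6 characters long.
Substr( "Katie Layman", 7, 6 ); // "Layman"
```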

Text Score(text column, text-to-number, <weighting>, <{support vectors}>, <text explorer setup>)

Description

Used to create scoring formulas in Text Explorer.

Returns

Returns a vector of scores.

Arguments

text column

The data table column.

text-to-number

An associative array that maps lowercase words to numbers.

weighting

"Binary", "Ternary", "Count", "LogCount", "LCA", or an array of inverse document frequency weights for TFLogIDF. "Count" is the default value.

support vectors

A list of vectors that are used in the text scoring. The number and length of the vectors depend on the weighting argument.

text explorer setup

An expression that contains a block of Text Explorer setup information.

Titlecase("text")

Description

Converts the string to title case, that is, each word in the string has an initial uppercase character and subsequent lowercase characters.

Returns

A quoted string.

Arguments

text

A quoted string.

Example

For example, the following function:

Titlecase( "veronica layman" )

returns the following string:

"Veronica Layman"

Trim("text",<left|right|both>)

Trim Whitespace("text",<left|right|both>)

Description

Removes leading and trailing whitespace.

Returns

A quoted string.

Arguments

text

A quoted string.

left|right|both

(Optional) The second argument determines if whitespace is removed from the left, the right, or both ends of the string. If no second argument is used, whitespace is removed from both ends.
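The effect of the second argument can be sketched as follows, using an illustrative string with two leading and four trailing spaces:

```jsl
Trim( "  John    ", left );  // "John    ", trailing spaces are kept
Trim( "  John    ", right ); // "  John", leading spaces are kept
Trim( "  John    " );        // "John", both ends are trimmed by default
```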

Example

For example, the following command:

Trim( "  John    ", both )

returns the following string:

"John"

Uppercase("string")

Description

Converts any lowercase character found in the quoted string to the equivalent uppercase character.

Word(n|[first last], string, <delimiter>, <Unmatched(result string)>)

Description

Returns the nth word of the string, or the span from the first to last word, where words are substrings separated by any number of any characters in the delimiter argument.

Arguments

n

The position of the word being extracted.

[first last]

A matrix that defines the beginning and end word range to return.

string

The string that is evaluated.

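delimiter

(Optional) The characters used as boundaries between words. Several adjacent delimiter characters are treated as a single delimiter.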

Unmatched(result string)

The string to print if no match is found.

Examples

This example returns the last name:

Word( 2, "Katie Layman" );
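A few more calls sketch the delimiter handling. The strings and delimiters are illustrative inputs.

```jsl
Word( 2, "Katie Layman" );  // "Layman"

// Every character in the delimiter argument acts as a delimiter.
Word( 3, "a,b;c", ",;" );   // "c"

// Adjacent delimiters count as one, so there is no empty word between the commas.
Word( 2, "a,,b", "," );     // "b"
```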

Note

See Item(n|[first last], string, <delimiter>, <Unmatched(result string)>, <Include Boundary Delimiters(Boolean)>) for examples of how Word() differs from Item().

Words

See Words("string", <delimiter>).

XPath Query( xml, "xpath_expression")

Description

Runs an XPath expression on an XML document.

Returns

A list.

Arguments

xml

A valid XML document.

xpath_expression

A quoted XPath 1.0 expression.

Example

Suppose that you created a report of test results in JMP and exported important details to an XML document. The test results are enclosed in <result> tags.

The following example stores the XML document in a variable. The XPath Query expression parses the XML to find the text nodes inside the <result> tags. The results are returned in a list.

rpt =

"\[<?xml version="1.0" encoding="utf-8"?>

<JMP><report><title>Production Report</title>

<result>November 21st: Pass</result>

<result>November 22nd: Fail</result>

<note>Tests ran at 3:00 a.m.</note></report>

</JMP> ]\";

results = XPath Query( rpt, "//result/text()" );

{"November 21st: Pass", "November 22nd: Fail"}
