SAS - CHARACTER FUNCTIONS

Contents:

1. SUBSTR
2. PROPCASE
3. TRIMN
4. LEFT
5. INDEX
6. TRANWRD
7. COMPBL
8. SCAN
9. SUBSTR (left of =)
10. COMPRESS
11. STRIP
12. COUNTC
13. FIND
14. LENGTH
15. LENGTHC
16. CATX

1. SUBSTR
Information:
The SUBSTR function extracts a substring from a character string. The substring begins at the starting position supplied. If a length is supplied, it is the length of the substring. If no length is supplied, the substring is the remainder of the string from the starting position onward.
Category: Character Functions
Syntax: SUBSTR(string, start_position, length_required)
Usage:
a='KIDNAP';
b=substr(a,1,3);
put b;
Result: KID
Default: In a DATA step, if SUBSTR returns a value to a variable that has not previously been assigned a length, that variable is given the length of the first argument.
Additional Info:
SUBSTR can also replace a substring within a host string. To replace values, place SUBSTR on the left side of the assignment statement. See SUBSTR (left of =), later in this list, for details.
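
The case where no length is supplied can be sketched as follows. This is a minimal illustration that reuses the KIDNAP value from the usage above; the variable names are arbitrary.

```sas
data _null_;
   a = 'KIDNAP';
   /* Third argument omitted: take everything from position 4 onward. */
   b = substr(a, 4);
   put b=;   /* b=NAP */
run;
```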


2. PROPCASE
Information:
The PROPCASE function copies a character argument and converts all uppercase letters to lowercase letters. It then converts to uppercase the first character of a word that is preceded by a blank, forward slash, hyphen, open parenthesis, period, or tab. PROPCASE returns the value that is altered.
Category: Character Functions
Syntax: PROPCASE(string <, delimiters>)
Usage:
PROPCASE("SCIENCE OF ASTRONOMY") = Science Of Astronomy
PROPCASE("SCIENCE\OF\ASTRONOMY") = Science\of\astronomy
PROPCASE("SCIENCE\OF\ASTRONOMY", "\") = Science\Of\Astronomy
Default: If delimiters is not specified, the default delimiters are blank, forward slash, hyphen, open parenthesis, period, and tab. A backslash is not a default delimiter, which is why the second example capitalizes only the first word.

3. TRIMN
Information:
TRIMN copies a character argument, removes all trailing blanks, and returns the trimmed argument as a result. If the argument is blank, TRIMN returns a null string.
Category: Character Functions
Syntax: TRIMN(string)
Usage:
Data _null_;
x=" ";
y=" k ";
z1=">"||trimn(x)||"<";
z2=">"||trim(x)||"<";
z3=">"||trim(y)||"<";
Put z1=;
Put z2=;
Put z3=;
Run;
Result:
z1=><
z2=> <
z3=> k<
Additional Info: In a DATA step, if the TRIMN function returns a value to a variable that has not previously been assigned a length, then that variable is given the length of the argument. The difference between TRIMN and TRIM is that TRIMN returns a null string (zero blanks) for a blank string, whereas TRIM returns one blank. If the trimmed value is shorter than the length of the receiving variable, SAS pads the value with new blanks as it assigns it to the variable.


4. LEFT
Information:
LEFT returns an argument with leading blanks moved to the end of the value. The argument's length does not change.
Category: Character Functions
Syntax: LEFT(string)
Usage:
Data _null_;
x=" ";
y=" k ";
z1= ">"||left(x)||"<";
z2=">"||left(y)||"<";
Put z1=;
Put z2=;
Run;
Result:
z1=> <
z2=>k  <
Additional Info: If the LEFT function returns a value to a variable that has not yet been assigned a length, by default the variable length is determined by the length of the first argument.


5. INDEX 
Information:
Searches a character expression for a string of characters, and returns the position of the string's first character for the first occurrence of the string.
Category: Character Functions
Syntax: INDEX(source, excerpt)
Arguments:
  • Source - Specifies a character constant, variable, or expression to search.
  • Excerpt - Is a character constant, variable, or expression that specifies the string of characters to search for in source.
Usage:
data _null_;
a = 'ABC.DEF(X=Y)';
b = 'X=Y';
x = index(a,b);
put x=;
run;
RESULT: x=9
Additional Info: The following example shows the results when you use the INDEX function with and without the TRIM function. If you use INDEX without the TRIM function, leading and trailing spaces are considered part of the excerpt argument. If you use INDEX with the TRIM function, TRIM removes trailing spaces from the excerpt argument as you can see in this example. Note that the TRIM function is used inside the INDEX function.
data _null_;
length a b $14;
a='ABC.DEF (X=Y)';
b='X=Y';
q=index(a,b);
w=index(a,trim(b));
put q= w=;
run;
Result: q=0 w=10


6. TRANWRD 
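Information: Replaces all occurrences of a substring in a character string.
Category: Character Functions
Syntax: TRANWRD(source, target, replacement)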
Arguments
  •     Source -     Specifies a character constant, variable, or expression that you want to translate.
  •     Target -     Specifies a character constant, variable, or expression that is searched for in source.
  •     Replacement -     Specifies a character constant, variable, or expression that replaces target.
Usage:
Name=tranwrd(name, "Mrs.", "Ms.");
Name=tranwrd(name, "Miss", "Ms.");
Put name;

RESULT:
Value: Mrs. Joan Smith output: Ms. Joan Smith
Value: Miss Alice Cooper output: Ms. Alice Cooper

Additional Info: The TRANWRD function does not remove trailing blanks from the target or replacement string. A target variable that is padded with trailing blanks may therefore fail to match the source. Use the TRIM function to exclude trailing blanks from a target or replacement variable.
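
The effect of trailing blanks in the target can be sketched as follows. This is a minimal illustration; the variable names and lengths are assumptions chosen to force padding.

```sas
data _null_;
   length name $30 target $8;
   name   = 'Miss Alice Cooper';
   target = 'Miss';   /* stored as 'Miss' followed by four trailing blanks */
   /* Untrimmed target: the padded value does not occur in name. */
   unchanged = tranwrd(name, target, 'Ms.');
   /* Trimmed target: 'Miss' is found and replaced. */
   changed = tranwrd(name, trim(target), 'Ms.');
   put unchanged= / changed=;
   /* unchanged=Miss Alice Cooper */
   /* changed=Ms. Alice Cooper    */
run;
```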


7. COMPBL
Information: Removes multiple blanks from a character string.
Category: Character Functions
Syntax: COMPBL(source)
Arguments
  •     Source -     Specifies a character constant, variable, or expression to compress.
Usage:
string='125    E Main St';
length address $10;
address=compbl(string);
put address;
RESULT: 125 E Main
Additional Info: The COMPRESS function removes every occurrence of the specific character from a string. If you specify a blank as the character to remove from the source string, the COMPRESS function removes all blanks from the source string, while the COMPBL function compresses multiple blanks to a single blank and has no effect on a single blank.
In a DATA step, if the COMPBL function returns a value to a variable that has not previously been assigned a length, then the length of that variable defaults to the length of the first argument.
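
The contrast between COMPBL and COMPRESS can be sketched side by side. This is a minimal illustration; the variable names are arbitrary.

```sas
data _null_;
   string = '125    E Main St';
   one_blank = compbl(string);     /* runs of blanks collapse to one blank */
   no_blank  = compress(string);   /* every blank is removed */
   put one_blank= / no_blank=;
   /* one_blank=125 E Main St */
   /* no_blank=125EMainSt     */
run;
```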

8. SCAN 
Information: Returns the nth word from a character string.
Category: Character Functions
Syntax: SCAN(string, count<,charlist <,modifier(s)>>)
Arguments
  • string - Specifies a character constant, variable, or expression.
  • count - Is a nonzero numeric constant, variable, or expression that has an integer value that specifies the number of the word in the character string that you want SCAN to select. For example, a value of 1 indicates the first word, a value of 2 indicates the second word, and so on. If count is positive, SCAN counts words from left to right in the character string. If count is negative, SCAN counts words from right to left in the character string.
  • charlist - Specifies an optional character expression that initializes a list of characters. This list determines which characters are used as the delimiters that separate words. By default, all characters in charlist are used as delimiters. If you specify the K modifier in the modifier argument, then all characters that are not in charlist are used as delimiters. Tip: You can add more characters to charlist by using other modifiers.
  • modifier - Specifies a character constant, a variable, or an expression in which each non-blank character modifies the action of the SCAN function. Blanks are ignored.
Usage:
x='SCIENCE OF ASTRONOMY';
first=scan(x,1);
last=scan(x,-1);
put first= last=;
RESULT: first=SCIENCE last=ASTRONOMY
Additional Info: If you use the SCAN function with only two arguments, then the default delimiters depend on whether your computer uses ASCII or EBCDIC characters. In a DATA step, most variables have a fixed length. If the word returned by the SCAN function is assigned to a variable that has a fixed length greater than the length of the returned word, then the value of that variable will be padded with blanks. Macro variables have varying lengths and are not padded with blanks. The SCAN function allows character arguments to be null. Null arguments are treated as character strings with a length of zero. Numeric arguments cannot be null.
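
Supplying an explicit charlist avoids depending on the platform's default delimiters. The following minimal sketch uses a backslash as the only delimiter; the variable names are illustrative.

```sas
data _null_;
   path = 'SCIENCE\OF\ASTRONOMY';
   /* Third argument: the backslash is the only delimiter. */
   second = scan(path, 2, '\');    /* second word from the left */
   last   = scan(path, -1, '\');   /* negative count: first word from the right */
   put second= last=;              /* second=OF last=ASTRONOMY */
run;
```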


9. SUBSTR (left of =)
Information: Replaces character value contents.
Category: Character Functions
Syntax: SUBSTR(variable, position<,length>)=characters-to-replace
Arguments
  • variable - Specifies a character variable.
  • position - Specifies a numeric constant, variable, or expression that is the beginning character position.
  • length - Specifies a numeric constant, variable, or expression that is the length of the substring that will be replaced. Restriction: length cannot be larger than the length of the expression that remains in variable after position. Tip: If you omit length, SAS uses all of the characters on the right side of the assignment statement to replace the values of variable.
  • characters-to-replace - Specifies a character constant, variable, or expression that will replace the contents of variable. Tip: Enclose a literal string of characters in quotation marks.
Usage:
a='KIDNAP';
substr(a,1,3)='CAT';
put a;
RESULT: CATNAP
Additional Info: If you use an undeclared variable, it will be assigned a default length of 8 when the SUBSTR function is compiled. When you use the SUBSTR function on the left side of an assignment statement, SAS replaces the value of variable with the expression on the right side. SUBSTR replaces length characters starting at the character that you specify in position.
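
The case where length is omitted can be sketched as follows. This is a minimal illustration; the replacement value is an arbitrary choice.

```sas
data _null_;
   a = 'KIDNAP';
   /* No length given: all three characters of 'DER' are written, starting at position 4. */
   substr(a, 4) = 'DER';
   put a=;   /* a=KIDDER */
run;
```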


10. COMPRESS 
Information: Compress function is used to remove unwanted delimiters within the string of a variable. Returns a character string with specified characters removed from the original string.
Category: Character Functions
Syntax: COMPRESS(source <, chars> <, modifier(s)>)
Usage:
x='1 2 3 4 5';
y=compress(x,,'s');
put y;
RESULT: 12345
Additional Info: In a DATA step, if the COMPRESS function returns a value to a variable that has not previously been assigned a length, then that variable is given the length of the first argument. The COMPRESS function allows null arguments. A null argument is treated as a string that has a length of zero.
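
The chars and modifier arguments can be sketched as follows. This is a minimal illustration; the phone number is made-up sample data. The 'k' modifier keeps the listed characters instead of removing them, and the 'd' modifier adds digits to the list.

```sas
data _null_;
   phone = '(919) 555-0100';
   /* Second argument lists the characters to remove. */
   no_punct = compress(phone, '()- ');
   /* Empty chars plus 'kd': keep only the digits. */
   digits = compress(phone, , 'kd');
   put no_punct= digits=;   /* no_punct=9195550100 digits=9195550100 */
run;
```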

11. STRIP 
Information: Returns a character string with all leading and trailing blanks removed.
Category: Character Functions
SYNTAX: STRIP(string)
Usage:
data lengthn;
input string $char8.;
original = '*' || string || '*';
stripped = '*' || strip(string) || '*';
datalines;
abcd
  abcd
    abcd
abcdefgh
 x y z
;

proc print data=lengthn;
run;
Additional Info: The STRIP function returns the argument with all leading and trailing blanks removed. If the argument is blank, STRIP returns a string with a length of zero. Assigning the results of STRIP to a variable does not affect the length of the receiving variable. If the value that is trimmed is shorter than the length of the receiving variable, SAS pads the value with new trailing blanks.

12. COUNTC 
Information: Counts the number of characters in a string that appear or do not appear in a list of characters.
Category: Character Functions
Syntax: COUNTC(string, charlist <,modifier(s)>)
Usage:
data test;
string = 'Baboons Eat Bananas ';
a = countc(string, 'a');
b = countc(string,'b');
b_i = countc(string,'b','i');
abc_i = countc(string,'abc','i');
/* Scan string for characters that are not "a", "b", */
/* and "c", ignore case, (and include blanks). */
abc_iv = countc(string,'abc','iv');
/* Scan string for characters that are not "a", "b", */
/* and "c", ignore case, and trim trailing blanks. */
abc_ivt = countc(string,'abc','ivt');
run;
Additional Info: The COUNTC function allows character arguments to be null. Null arguments are treated as character strings with a length of zero. If there are no characters in the list of characters to be counted, COUNTC returns zero. The COUNTC function counts individual characters in a character string, whereas the COUNT function counts substrings of characters in a character string.

13. FIND 
Information: Searches for a specific substring of characters within a character string
Category: Character Functions
Syntax: FIND(string,substring<,modifiers><,startpos>)
Arguments:
  • string - Specifies a character constant, variable, or expression that will be searched for substrings. Tip: Enclose a literal string of characters in quotation marks.
  • substring - Is a character constant, variable, or expression that specifies the substring of characters to search for in string. Tip: Enclose a literal string of characters in quotation marks.
  • modifiers - Is a character constant, variable, or expression that specifies one or more modifiers. The following modifiers can be in uppercase or lowercase:
      • i - Ignores character case during the search. If this modifier is not specified, FIND only searches for character substrings with the same case as the characters in substring.
      • t - Trims trailing blanks from string and substring.
Usage:
whereisshe=find('She sells seashells? Yes, she does.','she ');
put whereisshe;
RESULT: 27

variable1='She sells seashells? Yes, she does.';
variable2='she ';
variable3='i';
whereisshe_i=find(variable1,variable2,variable3);
put whereisshe_i;
RESULT: 1
Additional Info: The FIND function searches string for the first occurrence of the specified substring, and returns the position of that substring. If the substring is not found in string, FIND returns a value of 0. If startpos is not specified, FIND starts the search at the beginning of the string and searches the string from left to right. If startpos is specified, the absolute value of startpos determines the position at which to start the search. The sign of startpos determines the direction of the search.
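
The startpos rules can be sketched with the same sentence used above. This is a minimal illustration; the variable names and start positions are arbitrary choices.

```sas
data _null_;
   string = 'She sells seashells? Yes, she does.';
   /* Positive startpos: begin at position 20 and search to the right. */
   from_20 = find(string, 'she', 20);
   /* Negative startpos: begin at position 22 and search to the left, ignoring case. */
   back_22 = find(string, 'she', 'i', -22);
   put from_20= back_22=;   /* from_20=27 back_22=14 */
run;
```

The leftward search stops at the "she" inside "seashells" (position 14) before it reaches the "She" at position 1.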

14. LENGTH 
Information: Returns the length of a non-blank character string, excluding trailing blanks, and returns 1 for a blank character string.
Category: Character Functions
Syntax: LENGTH(string)
Arguments
  • string - Specifies a character constant, variable, or expression.
Usage:
len=length('ABCDEF');
put len;
RESULT: 6
Additional Info: The LENGTH function returns an integer that represents the position of the rightmost non-blank character in string. If the value of string is blank, LENGTH returns a value of 1. If string is a numeric constant, variable, or expression (either initialized or uninitialized), SAS automatically converts the numeric value to a right-justified character string by using the BEST12. format. In this case, LENGTH returns a value of 12 and writes a note in the SAS log stating that the numeric values have been converted to character values.

15. LENGTHC 
Information: Returns the length of a character string, including trailing blanks.
Category: Character Functions
Syntax: LENGTHC(string)
Arguments
  • string - Specifies a character constant, variable, or expression.
Usage:
x=lengthc('variable with trailing blanks   ');
put x;
RESULT: 32
Additional Info: The LENGTHC function returns the number of characters, both blanks and non-blanks, in string. If string is a numeric constant, variable or expression (either initialized or uninitialized), SAS automatically converts the numeric value to a right-justified character string by using the BEST12. format. In this case, LENGTHC returns a value of 12 and writes a note in the SAS log stating that the numeric values have been converted to character values.
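
The difference between LENGTH and LENGTHC shows up most clearly on a variable with a declared length. This is a minimal sketch; the variable names and the $10 length are arbitrary choices.

```sas
data _null_;
   length s $10;
   s = 'ABC';
   blank = ' ';
   len   = length(s);       /* 3: position of the last non-blank character */
   lenc  = lengthc(s);      /* 10: declared length, trailing blanks included */
   len_b = length(blank);   /* 1: LENGTH returns 1 for a blank string */
   put len= lenc= len_b=;   /* len=3 lenc=10 len_b=1 */
run;
```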

16. CATX 
Information: The CATX function enables you to concatenate character strings, remove leading and trailing blanks, and insert separators. The CATX function returns a value to a variable, or returns a value to a temporary buffer. The results of the CATX function are usually equivalent to those that are produced by a combination of the concatenation operator and the TRIM and LEFT functions.
Category: Character Functions
Syntax: CATX(separator,string-1 <,...string-n> )
where
• separator specifies the character string that is used as a separator between concatenated strings
• string specifies a SAS character string.
Usage:
NewAddress=catx(', ',address,city,zip);
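
A fuller sketch of the usage above follows, with made-up sample values. It also shows the OF keyword with a variable list (col1-col3 here is a hypothetical set of variables), which avoids typing every argument when many variables must be joined.

```sas
data _null_;
   length address city zip $20 NewAddress $70;
   address = '  125 E Main St';
   city    = 'Cary  ';
   zip     = '27513';
   /* Leading and trailing blanks are removed from each item before joining. */
   NewAddress = catx(', ', address, city, zip);
   put NewAddress=;   /* NewAddress=125 E Main St, Cary, 27513 */

   /* OF with a variable list; blank items are skipped, so no doubled separator appears. */
   array col{3} $8 col1-col3 ('a' ' ' 'c');
   joined = catx('-', of col1-col3);
   put joined=;       /* joined=a-c */
run;
```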

