SAS-Efficient Coding Techniques 3: Importance of Format Statement



FORMAT STATEMENT



Let’s see what a missing FORMAT statement can do. Here is a data set that I create using the following DATA step.

DATA ABC;
INPUT NAME $ FLAG 1.;
CARDS;
ANDREW 1
JIM 3
JOHN 5
TINA 2
RYAN 4
;
RUN;

To this data set I want to add a column that gives the status of the customer.

DATA XYZ;
SET ABC;
IF FLAG = 0 THEN STATUS = "BRAND NEW";
ELSE IF FLAG = 1 THEN STATUS = "NEVER BUYS";
ELSE IF FLAG = 2 THEN STATUS = "RARELY BUYS";
ELSE IF FLAG = 3 THEN STATUS = "RANDOMLY BUYS";
ELSE IF FLAG = 4 THEN STATUS = "REGULARLY BUYS";
ELSE IF FLAG = 5 THEN STATUS = "SPENDTHRIFT";
RUN;


What result do you expect from this? Will you get the intended output?

The answer is NO. The output will have truncated values in the STATUS column.
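One way to check this for yourself is to print the data set and look at the variable attributes. This is only a minimal sketch, and it assumes the XYZ data set created by the DATA step above is still in the WORK library.

```sas
/* Print the derived data set to see the STATUS values as stored */
PROC PRINT DATA=XYZ;
RUN;

/* List the variable attributes; STATUS should be reported as Char with Len 9 */
PROC CONTENTS DATA=XYZ;
RUN;
```

With the five rows above, ANDREW (FLAG 1) should show up as NEVER BUY rather than NEVER BUYS, and TINA (FLAG 2) as RARELY BU rather than RARELY BUYS.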


This happens because SAS fixes the length of a character variable when it compiles the DATA step, using the first statement that refers to the variable. Here that statement is “IF FLAG = 0 THEN STATUS = "BRAND NEW";”, so SAS creates the variable STATUS with length 9 (the length of BRAND NEW), and every longer value assigned afterwards is truncated to 9 characters. It is therefore important to put a FORMAT statement before the first assignment:

DATA XYZ;
SET ABC;
FORMAT STATUS $20.;
IF FLAG = 0 THEN STATUS = "BRAND NEW";
ELSE IF FLAG = 1 THEN STATUS = "NEVER BUYS";
ELSE IF FLAG = 2 THEN STATUS = "RARELY BUYS";
ELSE IF FLAG = 3 THEN STATUS = "RANDOMLY BUYS";
ELSE IF FLAG = 4 THEN STATUS = "REGULARLY BUYS";
ELSE IF FLAG = 5 THEN STATUS = "SPENDTHRIFT";
RUN;


I used $20. just to be on the safe side; the longest value, REGULARLY BUYS, needs only 14 characters. One important thing to notice here is that the length of the variable is determined by the first statement SAS encounters that refers to it, in this case “IF FLAG = 0 THEN STATUS = "BRAND NEW";”, and not by the first data point it encounters in the data, in this case FLAG=1.
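A LENGTH statement is another common way to get the same effect. The version below is only a sketch of that alternative, not part of the original example, and it assumes the same ABC data set. The difference is that LENGTH sets only the storage length, while FORMAT also attaches a display format to the variable.

```sas
DATA XYZ;
SET ABC;
/* Declare STATUS before the first assignment so its length is 20, not 9 */
LENGTH STATUS $ 20;
IF FLAG = 0 THEN STATUS = "BRAND NEW";
ELSE IF FLAG = 1 THEN STATUS = "NEVER BUYS";
ELSE IF FLAG = 2 THEN STATUS = "RARELY BUYS";
ELSE IF FLAG = 3 THEN STATUS = "RANDOMLY BUYS";
ELSE IF FLAG = 4 THEN STATUS = "REGULARLY BUYS";
ELSE IF FLAG = 5 THEN STATUS = "SPENDTHRIFT";
RUN;
```

Whichever statement you use, it has to appear before the first assignment to STATUS; placed after it, the length has already been fixed at 9.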
