IF / ELSE Programming in SAS

Most programming languages support some form of the IF/ELSE statement. An “IF” statement executes a routine when its test resolves to true. If the test resolves to false, the routine is skipped and the program continues. Programmers use “ELSE” syntax when they need to chain several IF statements and test multiple conditions on a single variable. The benefit of ELSE statements is efficiency: if a prior IF test resolves to true, the program skips the subsequent ELSE statements.

SAS programmers often use IF/ELSE statements to recode variables. In this example, let’s use some hypothetical demographic data that contains the total population of cities in the United States:

/root/mydata/demog.sas7bdat

city totalpop
1 100
2 10000
3 500
4 200
5 .

Where:

city = A unique ID for each city
totalpop = the total population of each city
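
If you want to follow along without an existing file, the five rows above can be typed in directly. This is only a sketch: it assumes the directory /root/mydata exists and is writable, and it reuses the libref name dsn that appears in the program later in this post.

```sas
/* Assumed library location, matching the path shown above */
libname dsn "/root/mydata";

/* Build the five-row example dataset from inline data */
data dsn.demog;
    /* List input: a lone period is read as a numeric missing value */
    input city totalpop;
    datalines;
1 100
2 10000
3 500
4 200
5 .
;
run;
```

City 5 is deliberately left with a missing totalpop, which matters later in the walkthrough.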

In this example, we will use IF/ELSE statements to recode totalpop (a continuous variable) to a new categorical variable called “size”. A value of 1 will represent cities with a total population from 0 to 499; a value of 2 will represent cities with a population from 500 to 999; and a value of 3 will represent cities with a population of 1000 or more. Here is the most straightforward way to accomplish this task:


 /* Location of the input data */
libname dsn "/root/mydata";      
                                 
data newpop;
set dsn.demog;                   
                                 
/* Recode new variable using IF/ELSE statements */
if 0 <= totalpop <= 499 then size = 1 ;
else if 500 <= totalpop <= 999 then size = 2 ;
else size = 3 ;

run;

proc print data = newpop; run;


Code Review

We begin the program by specifying the location of the input data with a LIBNAME statement. The DATA step then creates a new dataset named “newpop” and reads the input records with a SET statement.

 /* Location of the input data */
libname dsn "/root/mydata";      
                                 
data newpop;
set dsn.demog;                   
                                 
/* Recode new variable using IF/ELSE statements */
if 0 <= totalpop <= 499 then size = 1 ;
else if 500 <= totalpop <= 999 then size = 2 ;
else size = 3 ;

run;

proc print data = newpop; run;

Next, SAS encounters the first “IF” statement, which prompts SAS to conduct a logical test. If the total population is between 0 and 499, SAS assigns a value of 1 to a new variable called “size”. As we saw in our input data, the first record does indeed have a population value between these bounds, so the test resolves to true. Because the first IF test resolves to true, SAS skips the two “ELSE” statements. If we had written those two ELSE statements as ordinary “IF” statements, SAS would evaluate each of them instead of skipping them. This is why ELSE syntax matters if you want to build efficiency into your programs.
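
For contrast, here is a sketch of the same recode written with independent IF statements and explicit range comparisons. The output dataset name newpop_if is made up for this illustration, and the code assumes the dsn libref has already been assigned.

```sas
data newpop_if;   /* hypothetical dataset name, used only for this comparison */
    set dsn.demog;

    /* Three independent IF statements: SAS evaluates all three tests */
    /* for every row, even when the first one has already matched.    */
    if 0 <= totalpop <= 499 then size = 1 ;
    if 500 <= totalpop <= 999 then size = 2 ;
    if totalpop >= 1000 then size = 3 ;
run;
```

Note that this is not a drop-in replacement for the IF/ELSE version: a plain IF has no catch-all branch, so the third range has to be spelled out as its own condition.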

The second case in the dataset has a value of 10,000. For this case, SAS will encounter the first IF statement, and the test will evaluate to false. SAS will then proceed on to the next else statement, which will also evaluate to false. The final else statement has no logical test – it is a “catch-all” statement that instructs SAS to assign a value of 3 to “size” if all of the previous IF tests evaluate to false. Because the first two logical tests were false, this case receives a value of 3.

 /* Location of the input data */
libname dsn "/root/mydata";      
                                 
data newpop;
set dsn.demog;                   
                                 
/* Recode new variable using IF/ELSE statements */
if 0 <= totalpop <= 499 then size = 1 ;
else if 500 <= totalpop <= 999 then size = 2 ;
else size = 3 ;

run;

proc print data = newpop; run;

We can check how the recode turned out by viewing the output from PROC PRINT:

city totalpop size
1 100 1
2 10000 3
3 500 2
4 200 1
5 . 3

Can you see the obvious problem here? The first 4 records have the appropriate codes for the “size” variable, but the fifth record has a missing value for totalpop, and still managed to end up with a value for “size” when it should be missing as well. Let’s take one more look at our IF statements and see if we can identify what went wrong:

if 0 <= totalpop <= 499 then size = 1 ;
else if 500 <= totalpop <= 999 then size = 2 ;
else size = 3 ;   

Because the first two IF tests do not take missing values into account, the “catch-all” statement at the end assigns the “size” variable a value of 3. Ideally, we would want the size variable to be missing when totalpop is also missing. As a general programming rule, I always handle missing values first in IF/ELSE programming blocks:


if missing(totalpop) then call missing(size) ;
else if 0 <= totalpop <= 499 then size = 1 ;
else if 500 <= totalpop <= 999 then size = 2 ;
else size = 3 ;   

This coding style will ensure that SAS handles all missing values appropriately when recoding variables.
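
Putting the pieces together, a complete version of the program might look like the sketch below. The PROC FREQ step at the end is an addition of mine rather than part of the original example; it is one possible way to confirm that the missing population was carried through to “size”.

```sas
/* Location of the input data (same assumed path as before) */
libname dsn "/root/mydata";

data newpop;
    set dsn.demog;

    /* Handle missing values first so they never reach the catch-all ELSE */
    if missing(totalpop) then call missing(size) ;
    else if 0 <= totalpop <= 499 then size = 1 ;
    else if 500 <= totalpop <= 999 then size = 2 ;
    else size = 3 ;
run;

/* The MISSING option counts missing values of size as their own level */
proc freq data = newpop;
    tables size / missing;
run;
```

With the MISSING option, the city that has no population value appears in its own row of the frequency table instead of being left out of it.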

Summary

IF/ELSE syntax is a staple of SAS programming. Almost all SAS programs contain IF/ELSE syntax in one form or another, because it provides an easy and readable way to conditionally execute blocks of code. As we have seen, it offers an easy way to recode variables and create new variables depending on the values of an existing variable. One thing you should always watch for is missing values. Be sure that your IF and ELSE statements handle them explicitly; otherwise they may sneak under the radar and your new variables may not have their intended values. If you need to conduct multiple IF tests on a single variable for your recodes, consider converting the later IF tests to ELSE tests. This instructs SAS to skip over them when a previous IF test evaluates to true, which makes your programs more efficient and can shorten run times when you are working with large datasets.
