Commenting In SAS

Most of the beginner to intermediate level programs on this blog are fairly simple and self-contained. This means that if you know a thing or two about SAS, then you do not need me to explain the purpose of the sample program, or what the individual lines of code are doing.

There are occasions, however, when the intent of a program will not be clear and you will not understand the code. Consider the following examples (both of which have happened to me):

  1. You start on a project and a co-worker hands off a program to you that is thousands of lines long, with no documentation.
  2. You start on a fairly complex program that you are unable to finish, and then revisit it some time later.

In the first case, even if you are able to understand any given line of code, the sheer amount of code will be too overwhelming to understand in its entirety. In the second case, many programmers have trouble following code that they wrote themselves without some kind of reminder of what they were trying to do.

SAS comments can help you avoid problems and headaches such as these. Comments in a SAS program are essentially notes that you write into the program to remind yourself (and inform others) what the code is doing. SAS ignores the comments when it runs the program and will not process them as SAS statements. In many cases, a well-commented program requires no additional documentation.

Commenting Styles

To demonstrate the importance of commenting, let’s take a look at the uncommented program below. Can you tell what the code is doing?


options linesize = 120 pagesize = 1000 mprint;

LIBNAME inpt "/root/mydata/wide";

proc contents data = inpt.demog out=vars (keep = name type where=(type=1)); run;

proc sql noprint;
select count(*) into :varcount trimmed
from vars;
quit;

data test;
set inpt.demog;

array nm (*) _NUMERIC_;
array fg (*) f1 - f&varcount;

do i = 1 to dim(nm);

if missing(nm(i)) then fg(i)=1;
else fg(i)=0;

end;
drop i ;

keep f: ;

run;

proc freq data = test ;
tables f:  ;
run;


This is some general code I use to check for missing values on numeric variables. While you may not understand each line of code, you should be able to understand what the code is doing once we add comments to it.

We can comment a program using one of two commenting styles. In the first style, we start a comment with an asterisk and end it with a semi-colon. This means that SAS will ignore everything after the asterisk until it hits the next semi-colon.


*************************************************************
Missing Value Checker:

Purpose:
This program checks for missing values on all numeric
variables in a dataset. The program has 4 general steps:

1) Extract numeric variables using the contents procedure
2) Create a macro variable for the number of numeric variables
   using PROC SQL.
3) Create a series of flag variables using the macro variable
   created in the SQL procedure.
4) Run frequencies on all of the flags. Values of 1 indicate
   missing values

Author: Sam’s SAS TIPS

*************************************************************;

options linesize = 120 pagesize = 1000 mprint;

* Location of the input data ;

LIBNAME inpt "/root/mydata/wide";

* Extract numeric variables into a dataset ;
proc contents data = inpt.demog out=vars (keep = name type where=(type=1)); run;

* Count number of numeric variables ;
proc sql noprint;
select count(*) into :varcount trimmed
from vars;
quit;

 

* Flag records with missing values;

data test;
set inpt.demog;

array nm (*) _NUMERIC_;       * All numeric variables in the dataset ;
array fg (*) f1 - f&varcount;    * Initialize flag variables as missing ;

*Loop through variables. Flag missing values as 1;
do i = 1 to dim(nm);

if missing(nm(i)) then fg(i)=1;
else fg(i)=0;
end;

drop i ;

keep f: ;

run;

* Review Flag variables with frequencies ;
proc freq data = test ;
tables f:  ;
run;


Using comments, we are now able to describe each step of the code. Perhaps the most important comment in this program is the one at the top. Because SAS treats everything between the asterisk and the next semi-colon as a comment, we are able to write a short description of the program that spans multiple lines. Writing descriptions such as these is a good standard practice when you are working on a team. This way, all of your co-workers can learn what your code is trying to do without reading the program line by line.

The one thing you need to watch out for when using this style of commenting is your use of semi-colons. If you write a semi-colon in the middle of a sentence for grammatical reasons, then SAS will treat it as the end of the comment and will try to interpret the rest of your sentence as SAS code. In all likelihood, this will cause your code to error out. An unmatched quotation mark inside this kind of comment can cause similar trouble.
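
To make the pitfall concrete, here is a minimal sketch. It assumes the SASHELP.CLASS sample dataset that ships with SAS, and the output dataset name teens is made up for illustration. The second comment is broken on purpose, so expect SAS to complain about it in the log.

```sas
* The next line is the broken comment. It ends at the first semi-colon,
  and SAS then tries to run the leftover text as a statement ;
* Keep students 14 and older; drop the younger ones;

* The same note rewritten with a comma stays a single comment ;
* Keep students 14 and older, drop the younger ones;

data teens;                * teens is a hypothetical output dataset ;
  set sashelp.class;       * sample data assumed to be installed ;
  where age >= 14;
run;
```

In the broken line, everything after "older;" is no longer part of the comment, which is why SAS typically reports an invalid statement at that point.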

For this reason, I usually default to the second commenting style. In this style, we start a comment with a forward slash and asterisk, and close the comment with an asterisk and forward slash. No semi-colons required! Let’s see how this looks by re-commenting the program.


/*************************************************************
Missing Value Checker:

Purpose:
This program checks for missing values on all numeric
variables in a dataset. The program has 4 general steps:

1) Extract numeric variables using the contents procedure
2) Create a macro variable for the number of numeric variables
   using PROC SQL.
3) Create a series of flag variables using the macro variable
   created in the SQL procedure.
4) Run frequencies on all of the flags. Values of 1 indicate
   missing values

Author: Sam’s SAS TIPS

*************************************************************/

options linesize = 120 pagesize = 1000 mprint;

/* Location of the input data */
LIBNAME inpt "/root/mydata/wide";

/* Extract numeric variables into a dataset */
proc contents data = inpt.demog out=vars (keep = name type where=(type=1)); run ;

/* Count number of numeric variables */
proc sql noprint;
select count(*) into :varcount trimmed
from vars;
quit;

 

/* Flag records with missing values */

data test;
set inpt.demog;

array nm (*) _NUMERIC_;        /* All numeric variables in the dataset  */
array fg (*) f1 - f&varcount;  /* Initialize flag variables as missing  */

/* Loop through variables. Flag missing values as 1 */
do i = 1 to dim(nm);

if missing(nm(i)) then fg(i)=1;
else fg(i)=0;

end;
drop i ;

keep f: ;

run;

/* Review Flag variables with frequencies */
proc freq data = test ;
tables f:  ;
run;
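
Because the second style does not rely on semi-colons, it can also sit in the middle of a statement or wrap whole statements that you want to switch off for a while. The sketch below is only an illustration: it assumes the SASHELP.CLASS sample dataset that ships with SAS, and the dataset and variable names older_students, height_m and weight_kg are hypothetical.

```sas
/* older_students is a hypothetical output dataset */
data older_students;
  set sashelp.class;

  /* A comment can sit inside a statement, before its semi-colon */
  where age >= 14 /* and sex = 'F' */ ;

  /* Temporarily disabled. The semi-colons inside are ignored:
  height_m  = height * 0.0254;
  weight_kg = weight * 0.4536;
  */
run;
```

One caveat: these comments do not nest. If the block you are disabling already contains an asterisk and forward slash, the comment ends at the first one it meets.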

Summary

Thorough commenting is an important component of all complex SAS programs. Ironically, the more advanced you become at programming in SAS, the more important commenting becomes. This is because people will want to simply read your comments and documentation rather than pore over your code.

There are two commenting styles in SAS. The first is to start a comment with an asterisk and end it with a semi-colon. This style will work as long as you do not put a semi-colon in the middle of your comment. The second style starts a comment with a forward slash and asterisk, and ends it with an asterisk and forward slash. This style works regardless of any semi-colons in the comment, as long as you do not nest one comment inside another.
