SAS: Simple Ways to Reduce the Size of Datasets


1. Compression of Datasets


/*USE THE DATA STEP COMPRESS OPTION*/
DATA TEMP.FILE2(COMPRESS=BINARY
REUSE=YES);
SET WORK.FILE1;
WHERE REC_TYPE='1';
RUN;

/*USE THE SYSTEM COMPRESS OPTION*/
OPTIONS COMPRESS=YES REUSE=YES;
DATA WORK.FILE3;
SET WORK.FILE1;
WHERE REC_TYPE='1';
RUN;

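When the COMPRESS= system option or data set option is in effect, SAS compresses the SAS data sets it creates on disk. Compression can greatly reduce the size of a SAS data set, and it does not change the data values stored in it. The trade-off is extra CPU time to compress and decompress observations, and a very small data set or one with short observations may not shrink at all. To turn compression on, set COMPRESS= to either YES (CHAR is an alias) or BINARY. The data set option applies only to the data set it is coded on and overrides the system option, which applies to every data set created afterward in the session.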

COMPRESS=YES uses run-length encoding (RLE), which works best on SAS data sets made up mostly of character variables, especially ones with repeated characters such as trailing blanks. COMPRESS=BINARY uses Ross Data Compression (RDC), a different algorithm that works better on data sets with long observations and many variables, including many numeric variables.

An option to use together with COMPRESS= is REUSE=. Specifying REUSE=YES allows SAS to reuse space within the compressed SAS data set that has been freed by deleted observations. Otherwise, SAS cannot reclaim that space, and new observations are always added at the end of the data set.
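
Which algorithm wins depends on the data, so it can be worth trying both on the same input. The following is a minimal sketch, not a definitive benchmark. WORK.FILE1 is the input used above, and TRY_CHAR and TRY_BIN are hypothetical names chosen for this illustration.

/*TRY BOTH ALGORITHMS ON THE SAME INPUT*/
DATA WORK.TRY_CHAR(COMPRESS=CHAR);
SET WORK.FILE1;
RUN;
DATA WORK.TRY_BIN(COMPRESS=BINARY);
SET WORK.FILE1;
RUN;
/*PROC CONTENTS SHOWS WHETHER EACH DATA SET IS COMPRESSED*/
PROC CONTENTS DATA=WORK.TRY_CHAR;
RUN;
PROC CONTENTS DATA=WORK.TRY_BIN;
RUN;

After each DATA step, SAS writes a note to the log reporting by what percentage compression decreased (or increased) the size of the data set. Comparing the two notes is usually enough to pick an option.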


2. Using KEEP and DROP Statements
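
A data set is only as wide as the variables written to it, so dropping variables you do not need reduces the size of every observation. The following is a minimal sketch of both statements. WORK.FILE1 and REC_TYPE come from the examples above, while FILE4, FILE5, ACCT_ID, AMOUNT, and COMMENT_TXT are hypothetical names used only for illustration.

/*KEEP STATEMENT: ONLY THE LISTED VARIABLES ARE WRITTEN*/
DATA WORK.FILE4;
SET WORK.FILE1;
KEEP ACCT_ID REC_TYPE AMOUNT;
RUN;

/*DROP STATEMENT: ALL VARIABLES EXCEPT THE LISTED ONES ARE WRITTEN*/
DATA WORK.FILE5;
SET WORK.FILE1;
DROP COMMENT_TXT;
RUN;

Use KEEP when the list of wanted variables is the shorter one, and DROP when the list of unwanted variables is shorter. The same lists can also be coded as KEEP= or DROP= data set options on the SET statement, in which case the unwanted variables are never read into the DATA step at all.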


3. Deleting a Dataset Before Replacing It

/*DELETE EXISTING FILE FIRST*/
PROC DATASETS LIBRARY=DISK NOLIST;
DELETE WIP_DTL;
QUIT;
DATA DISK.WIP_DTL;
SET TAPE.WIP_DTL;
WHERE REC_TYPE NE ' ';
RUN;
Whenever you update or replace a SAS data set with new data, SAS writes the new data to a temporary SAS data set until the DATA step or procedure completes successfully. Only then does SAS replace the old SAS data set with the one just created. As a result, until the step finishes, both the old data set and its replacement occupy disk space at the same time, which roughly doubles the space required. To avoid this, delete the existing SAS data set with the DATASETS or the DELETE procedure before running the DATA step or procedure that recreates it. This method cannot be used, of course, if the existing SAS data set is itself needed as input for the update or replacement. Keep in mind as well that if the step fails after the delete, the old copy is gone.
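
The DELETE procedure mentioned above can do the same job as PROC DATASETS in a single statement. The following is a minimal sketch that reuses the DISK and TAPE librefs from the example above and assumes both are already allocated.

/*ALTERNATIVE: DELETE THE EXISTING FILE WITH PROC DELETE*/
PROC DELETE DATA=DISK.WIP_DTL;
RUN;
DATA DISK.WIP_DTL;
SET TAPE.WIP_DTL;
WHERE REC_TYPE NE ' ';
RUN;

Either form works. PROC DATASETS is convenient when several members of the same library have to be deleted in one step.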


4. Using an Alternate Temporary Library

LIBNAME CD
"D:\LONG\WINDOWS\FOLDERNAME\DATA";
/*SPLIT THE INPUT FILE TO MULTIPLE
STORAGE DESTINATIONS*/
DATA WORK.FILE1 TEMP.FILE2;
SET CD.MASTER;
SELECT (DIVISION);
WHEN ("01") OUTPUT WORK.FILE1;
WHEN ("02") OUTPUT TEMP.FILE2;
OTHERWISE DELETE;
END;
RUN;


When you are short of space in the WORK library and need to have more than one SAS data set in temporary storage at the same time, an alternative is to create one or more additional temporary libraries on another drive. Once you have allocated them, use them for some of the SAS data sets that you would otherwise place in WORK. The example above assumes that your SAS WORK library is on the C: drive and that a D: drive is available. It reads a permanent SAS data set stored on a CD and creates two temporary SAS data sets, one in WORK and one in the alternate library TEMP.
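
The example does not show how the TEMP library is allocated. The following is a minimal sketch of one way to do it. The path D:\SASTEMP is hypothetical and stands for any existing folder on a drive with free space.

/*ALLOCATE AN ALTERNATE TEMPORARY LIBRARY (HYPOTHETICAL PATH)*/
LIBNAME TEMP "D:\SASTEMP";

/*...STEPS THAT WRITE TO TEMP GO HERE...*/

/*CLEAN UP: DELETE EVERY MEMBER OF TEMP, THEN RELEASE THE LIBREF*/
PROC DATASETS LIBRARY=TEMP KILL NOLIST;
QUIT;
LIBNAME TEMP CLEAR;

Unlike WORK, a library allocated this way is not emptied automatically when the SAS session ends, so the cleanup step is what makes it behave like temporary storage. The KILL option deletes all members of the library, so point it only at a folder reserved for this purpose.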
