Sunday, January 8, 2023

SAS Macro for Splitting Datasets into Multiple Parts

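Description: This SAS code defines a macro called 'split' that splits a dataset into a specified number of parts. Each part holds a consecutive block of observations and is written to its own dataset, named ds1, ds2, and so on.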

Be aware that execution time grows with the value of the 'parts' keyword parameter. The macro generates one DATA step per part, and each of those steps opens the source dataset again. For best performance, keep the number of parts small.

If you need a more efficient version of this program, you can reach me by email at srini.bits.pilani@gmail.com.


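/* Debugging options: these write the generated code and macro resolution to the log. */
options symbolgen mprint mlogic mcompilenote=all;
%macro split (dsn=, parts=);
%local n fin obsm fobs i;
%if &parts =  %then %do;
   %put ERROR: 'parts' is not specified.;
   %put Use the 'parts=' option to specify the number of parts to split the dataset into.;
   %return;
%end;
%if &dsn = %then %do;
   %put ERROR: 'dsn' is not specified.;
   %put Use the 'dsn=' option to specify the dataset to split.;
   %return;
%end;
/* Count the observations in the source dataset. */
proc sql noprint;
select count(*) into :n trimmed from &dsn;
quit;
/* Observations per part, rounded up so that no trailing observations are dropped. */
%let fin=%sysevalf(&n/&parts, ceil);
%put NOTE: &dsn has &n observations, at most &fin per part.;
%let obsm=0;
%do i=1 %to &parts;
/* First and last observation numbers for part &i. */
%let fobs=%eval(&obsm+1);
%let obsm=%eval(&fobs-1+&fin);

data ds&i;
%if &fobs > &n %then %do;
   /* No observations are left: create an empty dataset with the same variables. */
   set &dsn (obs=0);
%end;
%else %do;
   set &dsn (firstobs=&fobs obs=%sysfunc(min(&obsm,&n)));
%end;
run;
%end;
%mend split;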
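
The performance note above comes from the macro reading the source dataset once per part. As a rough sketch only, and not the more efficient version the author offers by email, the same split could be done in a single DATA step that names every output dataset and routes each observation by its position. The macro name 'split_onepass' is hypothetical, and the parameter checks are left out for brevity.

```sas
%macro split_onepass (dsn=, parts=);
%local n size i;
/* Count the observations in the source dataset. */
proc sql noprint;
select count(*) into :n trimmed from &dsn;
quit;
/* Observations per part, rounded up. */
%let size=%sysevalf(&n/&parts, ceil);

/* One DATA step that lists every output dataset: ds1 ds2 ... */
data %do i=1 %to &parts; ds&i %end; ;
   set &dsn;
   /* _n_ is the current observation number; map it to a part number. */
   select (ceil(_n_/&size));
      %do i=1 %to &parts;
      when (&i) output ds&i;
      %end;
      otherwise;
   end;
run;
%mend split_onepass;
```

With this layout the source dataset is read once regardless of the value of 'parts'. A call such as %split_onepass (dsn=sashelp.cars, parts=4); should produce the same ds1 through ds4 as the original macro.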

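Example:

  1. %split (dsn=sashelp.cars, parts=4);   /* splits sashelp.cars into ds1 through ds4 */
  2. %split (dsn=, parts=4);               /* 'dsn' is missing: an error is written to the log and the macro stops */
  3. %split (dsn=sashelp.cars, parts=);    /* 'parts' is missing: an error is written to the log and the macro stops */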
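
To confirm the result of the first call, one option is to query the row counts of the generated datasets. The check below is a sketch that assumes the parts were written to the WORK library under the names ds1 through ds4.

```sas
/* List the observation count of each generated part. */
proc sql;
select memname, nobs
   from dictionary.tables
   where libname = 'WORK'
     and memname in ('DS1', 'DS2', 'DS3', 'DS4');
quit;
```

Since sashelp.cars contains 428 observations, each of the four parts should hold 107.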
