Archive for the ‘Data management’ Category
A Macro To Rank Values Within / Without By Groups
/*—————————————————————————————————————————————————————-
| MACRO NAME : _grp
| CREATED BY : Gu, Peihua (Aug 17, 2013)
| CONTACT : lilygu10@yahoo.com
*—————————————————————————————————————————————————————–*
| PURPOSE
| This macro is to rank variables within By groups or without By groups;
| Ranked variables are named as original variable name plus _rk. (ex: if weight needs to be ranked,
| weight_rk will be the ranked variable)
*—————————————————————————————————————————————————————–*
| LIMITATION
| Can’t perform a weighted ranking. Combining Proc univariate and other procedures can do it.
| Length for variables need to be ranked should be less than 30 characters.
*—————————————————————————————————————————————————————–*
| MACRO CALL
| %_grp (_indata, _outdata, _varlist, _grpcnt, _bylist);
*—————————————————————————————————————————————————————–*
| REQUIRED PARAMETERS
| _indata : Input dataset name
| _outdata : Output dataset name
| _varlist: a list of variables need to be ranked
| _grpcnt: the number of groups variables need to be grouped into
| _bylist: a list of variables to form By groups; leave it as blank if users only rank variables for the whole group
*—————————————————————————————————————————————————————–*
| EXAMPLE
| %_grp(perm.score, perm.score_rank, height weight, 5, gender race)
*—————————————————————————————————————————————————————-*/
%macro _grp(_indata, _outdata, _varlist, _grpcnt, _bylist);
%local _varcnt _i;
%let _varcnt=%sysfunc(countw(&_varlist));
%if %length(&_bylist) ne 0 %then %do;
proc sort data=%str(&_indata); by &_bylist; run;
%end;
proc rank data=%str(&_indata) out=%str(&_outdata) groups=&_grpcnt;
%if %length(&_bylist) ne 0 %then %do;
by &_bylist;
%end;
var &_varlist;
ranks
%do _i=1 %to &_varcnt;
%scan(&_varlist, &_i)_rk
%end;;
run;
%mend;
A Macro To Automate Importing Multiple Excel Files Containing Multiple Worksheets In Which Common Variables Have Different Length Or Types
Note: Sorry that I forgot to mention another limitation. You may slightly change input function for your specific purpose. If you don’t want to print invalid data error message, you can add double question marks in your input function. Also, the code cannot be used in situations where variables which have date, time and dollar format have mixed data types (See the following example) in some Excel worksheets. I may modify the code when I have time.
BTW, you need to list all character variables for macro variable named “charvar” so the program can detect mixed type variables for each worksheet and convert them to numeric variables.
date | a |
12/5/2009 | — |
— | — |
— | — |
— | — |
— | — |
— | — |
— | $20 |
— | $5 |
$6 | |
12/8/2012 | $189 |
11/19/2007 | $19.40 |
8/25/1999 | $1,568,941 |