Friday, April 8, 2016

some questions

when do we use a sample and when to use the population for a statistical analysis?
Explain what is the use of “Proc print” in SAS?
Histogram definition , example , problem statement
Correlation definition , example , problem statement
Pareto definition , example , problem statement
Shortest distance between 2 points with many points in between.

Wednesday, March 2, 2016

Business Analytics Question


For a Restaurant Pricing Policy, create a statistical model and whether to keep the cancellation charges or not.

Base SAS Training with initial related Codes



INTRODUCTION
options formdlim = '*'; * Specifies a character to delimit page breaks in SAS output;

data FIRST; * creating a dataset by name FIRST;
input ID NAME HEIGHT WEIGHT; * to read the variable names;
datalines;
01 VIJAY 5 26
02 AJAY 6  28
03 RAVI 4 25
;
*The DATALINES statement is used to indicate internal data;
*The DATALINES statement must be the last statement in the DATA step;
*All lines in the SAS program following the DATALINES statement are considered data until SAS encounters a semicolon;
*The CARDS statement and the DATALINES statement are synonymous;

proc print data=FIRST;
* prints the values for all variables and all observation in a SAS dataset;
run; * End of the program;

data SECOND;
input ID NAME $ HEIGHT WEIGHT; * The $ after Name indicates that it is a character variable;
datalines;
01 VIJAY 5 26
02 AJAY 6  28
03 RAVI 4 25
;
proc print data=SECOND;
run;

libname SMRITHA 'C:\SAS\SMRITHA';
* specify the libref and then the location of permanent dataset to be stored;

data SMRITHA.THREE;
* creating a permanent dataset in the specified location with two level naming;
input ID NAME $ HEIGHT WEIGHT;
datalines;
01 VIJAY 5 26
02 AJAY 6  28
03 RAVI 4 25
;
proc print data=SMRITHA.THREE;
run;
proc contents data=SMRITHA.THREE; * prints the description of a SAS dataset;
RUN;


IMPORTING DATA
options formdlim='.';


* INFILE;
* Use the INFILE statement to tell SAS the filename and path, if appropriate, of the external file containing the data.
* The INFILE statement follows the DATA statement and must precede the INPUT statement. After the INFILE keyword, the file path
  and name are enclosed in quotation marks;

data EMP_DESIG;
infile 'C:\SAS\EXTERNAL FILES\SECOND.xls';
input ID NAME $ HEIGHT WEIGHT;
RUN;

/* LIST INPUT METHOD(FREE FORMATTED INPUT)*/
*The values in the raw data file are separated by atleast one space;

data EMP_DATA;
input EID NAME $;
datalines;
100 ram
101 krish
102 ranjith
;
run;

data  EMP_DATA4;
input EID NAME $ AGE SAL; *Since the default character length is 8, therefore it truncates the character observation value;
datalines;
101 Ram 34 10000
102 Ramakrishanan 22 15000
103 Ranjith 23 12000
;
run;

Informats are useful anytime you have non-standard data. (Standard numeric data contain only
numerals, decimal points, minus signs, and E for scientific notation.) Numbers with embedded
commas or dollar signs are examples of non-standard data. Other examples include data in
 
hexadecimal or packed decimal formats. SAS has informats for reading these types of data as well.


 
 









data  EMP_DATA5;
input EID NAME $13. AGE SAL;



*By using an informat in the input statement such as $13. to tell SAS that the variable field is 13 characters, then SAS will
read for 13 columns whether or not there is a space in those columns. This will inculde unwanted characters which appear after
the "NAME" on the dataline;
datalines;
101 Ram 34 10000
102 Ramakrishanan 22 15000
103 Ranjith 23 12000
;
run;

*COLON MODIFIER;

data  EMP_DATA6;
input EID NAME :$13. AGE SAL;
*To read until it encounters a space, use a colon modifier in the informat;
datalines;
101 Ram 34 10000
102 Ramakrishanan 22 15000
103 Ranjith 23 12000
;
proc print data=EMP_DATA6;
run;

data  EMP_DATA7;
length name $13;
*length statement is equivalent to using informat collan modifier;
*The length statement should always be before the input statement;
input EID NAME $ AGE SAL;
datalines;
101 Ram 34 10000
102 Ramakrishanan 22 15000
103 Ranjith 23 12000
;
proc print data=EMP_DATA7;
run;

*COLUMN INPUT METHOD;
* To use the column input method, only if:
  1. Each of the variable values is always found in the same place in the dataline.
  2. As long as the values are character or standard numeric;

data EMP_DATA9;
input  EID 1-4 NAME $ 6-12 AGE 14-15 RACE $ 17-21;
datalines;
1234 suresh  32 white
1256 kumar   32 black
1456 eeshwar 52 black
;
proc print data=EMP_DATA9;
run;

* Column input has the following advantages over list input:
  1. Space are not required between values
  2. Missing values can be left blank
  3. Character data can have embedded spaces
  4. Can skip unwanted variables;

* FORMATTED INPUT METHOD;

* 1. RELATIVE INPUT METHOD:
     Use informats by placing the informat after the variable name in the INPUT statement;

data EMP_DATA12;
input EID 4. +1 NAME $7. AGE 3. +1 RACE $5. ; * "+1" specifies to skip one character;
datalines;
1234 suresh  32 white
1256 kumar   32 black
1456 eeshwar 52 black
;
proc print data=EMP_DATA12;
run;

*2. ABSOLUTE INPUT METHOD;
*   The "@n" is the column pointers, where n is the number of columns SAS should move to.
    For example @7 says to SAS,go to Column 7 and read the data;
*The Column pointer has other uses too and can be used anytime you want SAS to skip backwards or forwards within a dataline.
 For example to skip over unneeded data, or to read a variable twice using different informats;

data EMP_DATA14;
input @1 EID 4. @7 NAME $7. @16 AGE 3. @21 RACE $5. @27 SALARY dollar7.;
format SALARY dollar7.;
datalines;
1234  suresh   32   white $15,600
1256  kumar    32   black $18,560
1456  eeshwar  52   black $20,890
;
proc print data=EMP_DATA14;
run;

*LINE POINTER;

data ADDRESS4;
input #1 EID NAME $ AGE #2 PIN STREET $15.;
datalines;
101 suman 21
560016 church street
102 pushpita 22
560027 residency road
103 sunil 23
560018 MG road
;
proc print data=ADDRESS4;
run;
* The LINE POINTER # (a pound sign) in the INPUT statement tells SAS which line to read when you have multiple lines of data per
  observation;

*NEW LINE OPERATOR (/) or RELATIVE LINE POINTER;
*An alternative way to read the multiple data is to use a relative line pointer, a forward slash(/) to tell SAS to skip to the
 next line of input;

data INCOME3;
input NAME $ / AGE / SALARY ;
datalines;
Ram   32
25
10000
Krish 45
26
12000
Ajay 55
27
15000
;
proc print data=INCOME3;
run;

* Delimited files are raw data files that have a special character sperating data values. Many programs can save data as delimited
  files, often with commas or tab characters for delimiters. SAS gives two options for the infile statement that make it easy to
  read delimited data files: the DLM=option and the DSD option;

* DLM=option;
* To read the data using list input, the DATA step expects your file to have spaces between your data values.
  The DELIMITER=, or DLM=, option in the INFILE statement allows you to read data files with other delimiters.
  The comma and tab characters are common delimiters found in data files, but can read data files with any delimiter character
  by just enclosing the delimiter character in quotation marks after the DLM=option;

data EMPLOYEE_DETAIL;
infile datalines dlm=',';
input EID NAME $ AGE;
datalines;
100,Ram,73
101,Krish,23
102,suresh,62
;
proc print data= EMPLOYEE_DETAIL;
run;

* DSD Option;
* The DSD (Delimited-sensitive Data)option for the INFILE statement does 3 things.
  1. It ignores delimiters in data values enclosed in quotation marks.
  2. It does not read quotation marks as part of the data value.
  3. It reads two delimiters as a missing value.
  The DSD option assumes that the delimiter is a comma.
  If the delimiter is not a comma then use the DLM=option with the DSD option to specify the delimiter;

data EMPLOYEE_DETAIL5;
infile datalines dlm='%' dsd;
input EID NAME $ AGE;
datalines;
100%Ram%73
101%Krish%23
102%suresh%62
;
proc print data= EMPLOYEE_DETAIL5;
run;

* If there are comma in any of the data values, then those values are enclosed in quotation marks;

data EMPLOYEE_DETAIL6;
infile datalines dlm=',' dsd;
input EID NAME $ AGE;
datalines;
100,"Mr.,Ram",73
101,Krish,23
102,suresh,62
;
proc print data= EMPLOYEE_DETAIL6;
run;

* MISSOVER;
* By default, SAS will go to the next data line to read more data if SAS has reached the end of the dataline and there are still
  more variables in the INPUT statement that have not been assigned values.
  The MISSOVER option tells SAS that if it runs out of data, don't go to the next data line. Instead, assign missing values
  to any remaining variables;

data EMPLOYEE_DETAIL11;
infile datalines missover;
input EID NAME $ AGE SAL;
datalines;
100 Ram 23 2000
101 Krish
102 Sunil 32 25000
102
102 Suresh 42 45000
102 Kumar 28 35000
;
proc print data= EMPLOYEE_DETAIL11;
run;

* FIRSTOBS;
* The firstobs=option tells SAS at what line to begin reading data.
  This is useful if you have a datafile that contains descriptive text or header information at the beginning,
  and is also useful to skip few data at the beginning;

* OBS;
* The OBS=option can be used anytime you want to read only a part of your datafile.
  It tells SAS to stop reading when it gets to that line in the raw data file;

data EMPLOYEES5;
infile datalines firstobs=3 obs=6;*will start reading this file on the third dataline and stops reading after the sixth dataline;
input DEPT $ EID SAL;
datalines;
DEPT EID SAL
CE 101 15000
TE 105 20000
RE 110 10000
TE 115 12000
TE 215 32000
RE 120 11000
CE 150 19000
RE 140 14000
CE 160 20000
RE 130 12000
TE 315 22000
;
proc print data=EMPLOYEES5;
run;

/* IMPORT Procedure; */
* TO read the delimited data files using the IMPORT procedure;
* PROC IMPORT will scan your data file and automatically determine the variable types(numeric or character), will asign proper
  lengths to the character variables, and can recognize some date formats.
* PROC IMPORT will treat two consecutive delimiters in your files as a missing value, will read values enclosed by quotation marks,
  and assign missing values to variables when it runs out of data on a line.
* The IMPORT procedure actually writes a DATA step, after submitting the program in the log window to see the DATA step it produced.

* To read the delimited data files, then must use the DBMS=option in the PROC IMPORT statement.
* If the specific SAS dataset name already exists then use the REPLACE option to overwrite it.

Type of File                                                           Extension                             DBMS Identifier
Comma-delimited .                                             csv                                          CSV
Tab-delimited                                                       .txt                                           TAB
Delimiters other than commas or tabs                                                            DLM

The IMPORT procedure, will by default get variable names from the first line in the data file. In case, if the variable names not
  be taken then add the GETNAMES=NO statement after the proc import statement. PROC IMPORT will assign the variables the names
  VAR1, VAR2, VAR3 and so on;

* If the datafile is type DLM, PROC IMPORT assumes that the delimiter is a space. if it is a different delimiter then specify it in
  the DELIMITER=statement;

/*PROC IMPORT DATAFILE='filename' OUT='data-set'
     DBMS= DLM REPLACE;
      GETNAMES=no;
    DELIMITER='delimiter-character';*/

proc import datafile='C:\SAS\EXTERNAL FILES\emp.txt' out = EMPLOYEE_DETAILS
dbms=DLM replace;
getnames=YES;
delimiter=' ';
proc print data=EMPLOYEE_DETAILS;
run;

* Reading a excel file with more than one sheet, then specify the name of which sheet to be read.
   
   SHEET=name-of sheet;

proc import datafile='C:\SAS\EXTERNAL FILES\list.xls' out= FMLIST
dbms=excel replace;
getnames=yes;
sheet=flist;
proc print data=FMLIST;
run;

proc import datatable= "emp"
out=EMPLOYEE
dbms=access replace;
database="C:\SAS\EXTERNAL FILES\db1.mdb";
proc print data=EMPLOYEE;
run;

IF/THEN
options formdlim='~';

* CONDITIONAL EXECUTION USING IF/THEN STATEMENTS;

IF-THEN/ELSE logic takes this basic form:
IF condition THEN action;
ELSE IF condition THEN action;
ELSE action;

data ACTORS;
input NAME $ JOBCODE $ 7-15 SALARY;
datalines;
JOJO  Actor I   10000
JIJO  Actor II  20000
Sunil Actor III   15000
ram   Actor     10000
;
proc print data=ACTORS;
run;

data JOBLEVELS;
set ACTORS;
if JOBCODE='Actor I' OR JOBCODE='Actor II' then JOBLEVEL='Beginner';
   else if JOBCODE='Actor III' then JOBLEVEL='Advanced';
           else JOBLEVEL='Unknown';
proc print data=JOBLEVELS;
run;

IF condition THEN DO;
action;
END;

data SALARY;
input EID SALARY;
datalines;
101 20000
102 25000
103 30000
104 35000
105 15000
106 10000
;
proc print data=SALARY;
run;

data BONUS;
set SALARY;
if SALARY>=30000 then
do
   BONUS=SALARY*(10/100);
   INCREMENT=SALARY*(8/100);
end;
   else if SALARY<30000 & SALARY>=20000 then
   do
      BONUS=SALARY*(15/100);
      INCREMENT=SALARY*(10/100);
   end;
       else
        do
          BONUS=SALARY*(20/100);
          INCREMENT=SALARY*(100/100);
        end;
proc print data=BONUS;
run;

* SUBSETTING USING IF CONDITION;

data DEPT;
infile 'C:\SAS\EXTERNAL FILES\emp2.txt';
input NO EMPNO ENAME $ JOB $ MGR SAL HIREDATE DEPTNO;
format HIREDATE date9.;
if DEPTNO=20;
proc print data=EMPLOYEE_DETAIL;
run;

PROCEDURES

* BY Statement;
* The BY statement is required for only one procedure, PROC SORT. In PROC SORT the BY statement tells SAS how to arrange the
  observations;

* PROC SORT;
* PROC SORT, sorts the data based on the values of a variable;
/*PROC SORT is used to organize data for a report, before combining datasets, or before using a BY statement in another PROC or
  data step.

          PROC SORT;
          BY variable-1....variable-n;

  The variables named in the BY statement are called BY variables. With a BY variable, SAS sorts the data based on the values of
that variable.

The DATA=and OUT=options specify the input and output datasets.
If the DATA=option, is not specified, then SAS will use the most recently created SAS dataset. I
If the OUT-option is not specified, then SAS will replace the original dataset with the newly sorted version.  */

data EXPREV;
input REGION $ STATE $ MONTH monyy5. EXPENSES REVENUE;
format month monyy5.;
datalines;
Southern GA JAN95 2000  8000
Southern GA FEB95 1200  6000
Southern FL FEB95 8500 11000
Northern NY FEB95 3000  4000
Northern NY MAR95 6000  5000
Southern FL MAR95 9800 13500
Southern NY MAR95 4000  5000
Northern MA MAR95 1500  1000
Eastern GA FEB95 4000  6000
eastern GA FEB95 3000  6000
;
proc print data= EXPREV;
run;

proc sort data=EXPREV out=EXPREV2;
by EXPENSES;
format month monyy5.;
proc print data= EXPREV2;
run;

* By default SAS sorts data in ascending order, from lowest to highestor from A to Z. To have data sorted from highest to lowest,
  add the keyword DESCENDING to the BY statement before each variable that should be sorted from highest to lowest;

proc sort data=EXPREV out=EXPREV4;
by STATE descending EXPENSES;
format month monyy5.;
proc print data= EXPREV4;
run;
* NODUPKEY;
* The NODUPKEY option tells SAS to eliminate any duplicate observations that have the same value for the BY variables;

proc sort data=MARINE out=SEASORT2 NODUPKEY;
by family;
proc print data=SEASORT2;
run;

 
 
FORMATTING THE DATA AS REQUIRED USING FORMAT STATEMENT;












* FORMAT PROCEDURE;
* Sometimes variable values are stored according to a code. The predefined SAS formats cannot help in such cases. However the
  FORMAT procedure can be used to create user-defined formats for displaying values of variables;
* The FORMAT procedure creates formats that will later be associated with variables in a FORMAT statement. The procedure starts
  with the statement PROC FORMAT and continues with one or more VALUE statements (other optional statements are available);

/*       PROC FORMAT;
              VALUE name range-1 = 'formatted-text-1'
                         range-2 = 'formatted-text-2'
                                 .
                                 .
                                 .
                         range-n = 'formatted-text-n';

The name in the VALUE statement is the name of the format for creating. If the format is for character data, the name must start
with a $. The name can't be the name of an existing format;

Character values must be enclosed in quotation marks. If there is more than one variable in the range, then separate the values
with a comma or use the hyphen for a continuous range. The keyword LOW and HIGH can be used in ranges to indicate the lowest
and the highest non-missing value for the variable.*/

DATA carsurvey;
INPUT Age Sex Income Color $;
DATALINES;
19 1 14000 Y
45 1 65000 G
72 2 35000 B
31 1 44000 Y
58 2 83000 W
;
RUN;

PROC FORMAT;
VALUE gender 1 = 'Male'
             2 = 'Female';
VALUE agegroup 13 -< 20 = 'Teen'
               20 -< 65 = 'Adult'
               65 - HIGH = 'Senior';
VALUE $col 'W' = 'Moon White'
           'B' = 'Sky Blue'
           'Y' = 'Sunburst Yellow'
           'G' = 'Rain Cloud Gray';
RUN;

PROC PRINT DATA = carsurvey;
FORMAT Sex gender. Age agegroup. Color $col. Income DOLLAR8.;
TITLE 'Survey Results Printed with User-Defined Formats';
RUN;

* MEANS PROCEDURE;
* The MEANS procedure provides simple statistics on numeric variables;
/*The MEANS procedure starts with the keywords PROC MEANS, followed by options listing the statistics to be printed:
                 PROC MEANS options;
If the options is not specified, MEANS will print the number of non-missing values, the mean, the standard deviation, and the
minimum and maximum for each variable. There are over 30 different statistics which can be requested with the MEANS procedure*/

data GRADE;
input NAME $ GENDER $ STATUS YEAR $ SECTION $ SCORE FINALGRADE;
datalines;
Abbott    F 2 97 A 90 87
Branford  M 1 98 A 92 97
Crandell  M 2 98 B 81 71
Dennison  M 1 97 A 85 72
Edgar     F 1 98 B 89 80
Faust     M 1 97 B 78 73
Greeley   F 2 97 A 82 91
;
proc print data=GRADE;
run;

* VAR variable-list;
* The VAR statement specifies which numeric variables to use in the analysis;
* BY variable-list;
* The BY statement performs separate analyses for each level of the variables in the list. The data must first be sorted in the
  same order as the variable list;

proc means data=GRADE maxdec=2;
by YEAR STATUS;
var SCORE FINALGRADE;
run;

* CLASS variable-list;
* The CLASS statement also performs separate analyses for each level of the variables in the list, but its output is more compact
  than with the BY statement, and the data do not have to be sorted first;

proc means data=GRADE maxdec=2;
class STATUS;
var SCORE;
run;

* FREQ PROCEDURE;
* A freuqency table is a simple list of counts;
* The most obvious reason for using PROC FREQ is to create tables showing the distribution of categorical data values, but PROC
  FREQ can also reveal irregularities in data;
* The FREQ PROCEDURE is a descriptive procedure as well as statistical procedure. It produces one-way and n-way frequency tables,
  and it concisely describes the data by reporting the distribution of variable values. The FREQ PROCEDURE is used to create
  cross-tabulation tables that summarize data for two or more categorical variables by showing the number of observations for
  each combination of variable values;

data COLOR;
input REGION EYES $ HAIR $ COUNT @@;
datalines;
1 blue  fair    23  1 blue  red      7  1 blue  medium  24
1 green  dark    11  1 black fair    19  1 green red      7
1 green medium  18  1 green dark    14  1 brown fair    34
1 brown red      5  1 brown medium  41  1 brown dark    40
1 brown black    3  2 blue  fair    46  2 blue  red     21
2 blue  medium  44  2 black  dark    40  2 blue  black    6
2 green fair    50  2 black red     31  2 black medium  37
2 green dark    23  2 green fair    56  2 brown red     42
2 blue medium  53  2 brown dark    54  2 brown black   13
;
proc print data=COLOR;
run;

* By default, PRO FREQ creates a one-way table with the FREQUENCY,PERCENT, CUMULATIVE FREQUENCY and CUMULATIVE PERCENT for each
  value of the variables in a SAS data set;

proc freq data=COLOR;
run;

* To specify the variables to be processed by the FREQ PROCEDURE, includes a TABLES statement;

proc freq data=COLOR;
tables EYES HAIR;
run;

* Options can be provided after a slash in the TABLES statement;

proc freq data=COLOR;
tables EYES HAIR / out=FREQCNT;     * Only for the HAIR variable the dataset is created;
proc print data=FREQCNT;
run;

* To Produce a cross tabulation, list the variables separated by an asterisk;

proc freq data=COLOR;
tables EYES * HAIR ;
run;

proc freq data=COLOR;
tables EYES * HAIR / nopercent nocum ;
run;


* SET STATEMENT;
* The SET statement in the data step allows you to read a SAS dataset so you can add new variables, create a subset, or  otherwise modify the dataset;

data EMPLOYEE;
input EID NAME $;
datalines;
100 sunil
101 ram
102 anand
103 jijo
104 vasanth
104 komal
104 praveen
105 jojo
;
proc print data=EMPLOYEE;
run;

data EMPLOYEE2;
set EMPLOYEE;
where eid < 104;
proc print data=EMPLOYEE2;
run;

* ONE-TO-ONE MERGE;
* When you want to match observations from one data set with observations from another, use the MERGE statement in the DATA step.
  Before merging the data sets, the data sets must be sorted by the commmon variable;
/*                            DATA new-data-set;
                              MERGE data-set-1 data-set-2;
                              BY variable list; */

data C;
input ID VAR_C $;
datalines;
1 Mary
3 Sue
5 Hilary
;
proc print data=C;
run;

data D;
input ID VAR_D $;
datalines;
1 Smith
3 Roos
4 Clinton
;
proc print data=D;
run;

proc sort data=C;
by ID;
run;

proc sort data=D;
by ID;
run;

data CD;
merge C D;
by ID;
proc print data=CD;
run;

* MANY TO ONE;

data P;
input ID VAR_P $;
datalines;
1 Mary
3 Sue
3 row
5 Hilary
;
proc print data=P;
run;

data Q;
input ID VAR_Q $;
datalines;
2 Smith
3 Roos
proc print data=Q;
run;

proc sort data=P;
by ID;
run;

proc sort data=Q;
by ID;
run;

data PQ;
merge P Q;
by ID;
proc print data=PQ;
run;

* ONE TO MANY;

data M;
input ID VAR_M $;
datalines;
1 Mary
3 Sue
5 Hilary
;
proc print data=M;
run;

data N;
input ID VAR_N $;
datalines;
2 Smith
3 Roos
3 row
proc print data=N;
run;

proc sort data=M;
by ID;
run;

proc sort data=N;
by ID;
run;

data MN;
merge M N;
by ID;
proc print data=MN;
run;

* IN=option;
* The IN=option is used to track which of the original data sets contributed to each observation in the new data set;
* Unlike most variables, IN=variables are temporary, existing only during the current DATA step. SAS gives the IN=variables a
  value of 0 or 1. A value of 1 means that data set did contribute to the current observation, and a value of 0 means the
  data set did not contribute;

data ONE;
input ID NAME1 $;
datalines;
1 Mary
2 jack
3 Sue
5 Hilary
;
proc print data=ONE;
run;

data TWO;
input ID NAME2 $;
datalines;
2 Smith
3 Roos
6 jimmy
;
proc print data=TWO;
run;

proc sort data=ONE;
by ID;
run;

proc sort data=TWO;
by ID;
run;

data COMBINED;
set ONE TWO;
by ID;
proc print data=COMBINED;
run;

data COMBINED2;
set ONE(in=O) TWO(in=T);
by ID;
if O;
proc print data=COMBINED2;
run;

data COMBINED3;
set ONE(in=O) TWO(in=T);
by ID;
if T;
proc print data=COMBINED3;
run;

* IN=option in MERGE;

data COMBINED4;
merge ONE TWO;
by ID;
proc print data=COMBINED4;
run;

* INNER JOIN;
data COMBINED5;
merge ONE(in=O) TWO(in=T);
by ID;
if O & T;
proc print data=COMBINED5;
run;

*RIGHT JOIN;
data COMBINED6;
merge ONE(in=O) TWO(in=T);
by ID;
if T;
proc print data=COMBINED6;
run;

*LEFT JOIN;
data COMBINED6;
merge ONE(in=O) TWO(in=T);
by ID;
if O;
proc print data=COMBINED6;
run;