*********************************************************************************************************************
February 1, 2014

Used in statistics calculator to compare two proportions (called by Pcomp.sas). P_pval is the output parameter.
Used in output for rates (called by StandardRateTable.sas). P_pval, P_pval1, or P_pval2 is the output parameter.

COMPARE 2 PROPORTIONS (2 X 2 TABLE)
   Hypothesis testing: by Mid-p method based on hypergeometric distribution
   95%CI by Taylor series approach (traditional log transformation of Cumulative Risk Ratio/prevalence ratio)
   Note: Mid-p exact method to calculate 95%CI of risk ratio is not yet available in the literature.
         Martin's program calculates point estimate of OR by CMLE and 95%CI by mid-P and exact methods.

Developed by Minn M. Soe, MBBS, MPH (Epidemiologist/Biostatistician, DHQP, CDC)

Reference:
   1. David G. Kleinbaum, Lawrence L. Kupper. Epidemiologic Research: Principles and Quantitative Methods.
   2. Dean AG, Sullivan KM, Soe MM. OpenEpi: Open Source Epidemiologic Statistics for Public Health, Version.
      www.OpenEpi.com, updated 2013/03/21
   3. John Pezzullo. Interactive Statistical Calculation Pages. www.Statpages.com.
   4. Bernard Rosner. Fundamentals of Biostatistics' (5th edition) (Example 10.20, page 375).

Two-tailed p-value calculated as 2 times whichever is smallest: left-tail, right-tail, or 0.5.
It tends to agree closely with Yates Chi-Square p-value.

Validation: with OpenEpi, WinPEPI, www.Statpages.com

Data input layout:
   A= E+,D+
   B= E-,D+
   C= E+,D-
   D= E-,D-

                     Disease/Outcome
              |      Yes           No       |
   -----------|-----------------------------|-------
   Exposure   |                             |
         Yes  |       A             C       |
         No   |       B             D       |
   -----------|-----------------------------|-------

Output:
   if any cell has a missing value, mid-P, risk ratio and 95%CI are not calculated.
   if cell-A has a null value (numerator '0' in exposed gp), risk ratio=0 but 95%CI of risk ratio is not calculated (Ref: WinPEPI).
   if cell-B has a null value (numerator '0' in unexposed gp), risk ratio (RR='infinity') is not calculated (Ref: WinPEPI).
**********************************************************************************************************************;

%MACRO Pcomp(A=,B=,C=,D=);*<--------change to PROPORTION(A,B,C,D) FOR NHSN APPLICATION;

   *LOG FACTORIAL: EXACT PRODUCT FOR Z BELOW 17, STIRLING SERIES OTHERWISE;
   %MACRO LNfact(Z=);
      F=0;LNFACT=0;z=&z;
      if(z<2) THEN LNFACT=0;
      ELSE if(z<17) THEN DO;
         f=z;
         DO while(z>2);
            z=z-1;
            f=f*z ;
         END;
         LNFACT=LOG(f) ;
      END;
      ELSE DO;
         LNFACT=(z+0.5)*LOG(z) - z + LnPi2/2 + 1/(12*z) - 1/(360*(z**3)) + 1/(1260*(z**5)) - 1/(1680*(z**7)) ;
      END;
   %MEND;

   ***********INITIAL SETTING;
   *HANDLING OF MISSING VALUES;
   IF &A = 0 AND &B = 0 AND &C =0 AND &D = 0 THEN DO;
      MID_P=.;
   END;
   ELSE DO;
      IF &A ne . AND &B ne . AND &C ne . AND &D ne . THEN DO;
         Cell_A=0; Cell_B=0; Cell_C=0; Cell_D=0;
         Cell_r1=0; Cell_r2=0; Cell_c1=0; Cell_c2=0;
         t=0;

         *INPUT PARAMETERS;
         Cell_A = &A;
         Cell_B = &B;
         Cell_C = &C;
         Cell_D = &D;
         Cell_r1 = Cell_A+Cell_B;
         Cell_r2 = Cell_C+Cell_D;
         Cell_c1 = Cell_A+Cell_C;
         Cell_c2 = Cell_B+Cell_D;
         t = Cell_A+Cell_B+Cell_C+Cell_D;

         *SETTING THE DEFAULT VALUES;
         Pi=3.141592653589793;
         Pi2=2*Pi;
         LnPi2 = LOG(Pi2);
         E1=0; E2=0; E3=0; E4=0; E5=0;

         ***************************HYPOTHESIS TESTING;
         *STAT FUNCTIONS;
         *function CalcStats(form);
         LoSlop = Cell_A;
         if(Cell_D=Cell_A ) THEN RightP = RightP + P ;
         if( k>Cell_A ) THEN RightP1 = RightP1 + P ;
         midp_Right=((RightP - RightP1)*0.5 + RightP1);
         k = k + 1;
         END;

         FisherP=2*MIN(LEFTP, RIGHTP);
         Mid_P=2*MIN(midp_left, midp_Right);
         if mid_p>1 then MID_P=1;
         if mid_p<0 then MID_P=0;

         ***************************POINT ESTIMATE OF RISK RATIO (PREVALENCE RATIO) AND 95%CI;
         N1=SUM(&A,&C);
         N2=SUM(&B,&D);
         R1=(&A/N1);
         R2=(&B/N2);
         IF R2 NE 0 THEN DO; *PREVENT DIVISION BY ZERO FOR RR;
            RR= R1 / R2;
            IF R1 NE 0 THEN DO; *PREVENT DIVISION BY ZERO FOR CI;
               LL_RR= RR * EXP( -1.96*SQRT ( (1-R1)/(N1*R1) + (1-R2)/(N2*R2) ));
               UL_RR= RR * EXP(  1.96*SQRT ( (1-R1)/(N1*R1) + (1-R2)/(N2*R2) ));
            END;
         END;
         IF &A=0 THEN LL_RR=0; *SET LOWER LIMIT OF RR TO ZERO IF &A IS ZERO (UPPER LIMIT STAYS MISSING);
      END;
   END;

   mid_p = round(mid_p,.0001);

   drop f lnfact z lnpi2 cell_a cell_b cell_c cell_d cell_r1 cell_r2 cell_c1 cell_c2 t
        pi pi2 e1 e2 e3 e4 e5 loslop hislop lnprob1 fisherp leftp leftp1 rightp1 sumcheck k
        n1 n2 rr r1 r2 RightP P midp_left midp_Right ;
%MEND;

*NOTE: RR= RISK1 / RISK2. RISK1=A/(A+C), RISK2=B/(B+D);

****************IMPORT AN EXTERNAL DATASET AND RUN THE MACRO*********************;
options mprint;

data ClabsiExample;/*Create a data set*/
   input Event1 Event2 NonEvent1 NonEvent2 ;
cards;
3 4 30 100
2 6 7 25
;
run;

data ClabsiExample_;
   set ClabsiExample;/*This step calls the macro*/
   %Pcomp(A=Event1, B=Event2, C=NonEvent1, D=NonEvent2);
run;

proc print;run;
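
*--------------------------------------------------------------------------------
 OPTIONAL CROSS-CHECK (AN ILLUSTRATIVE SKETCH, NOT PART OF THE ORIGINAL PROGRAM).
 Recomputes the risk ratio and its Taylor series 95 percent confidence limits
 directly from the formulas used inside the macro, so that LL_RR and UL_RR in
 ClabsiExample_ can be compared against an independent calculation.
 Assumptions: the data set name RRcheck and the variables RR_chk, SE_lnRR,
 LL_chk and UL_chk are hypothetical names introduced here for illustration.
 The limits are only computed when both risks are above zero, which mirrors
 the division-by-zero guards in the macro.
--------------------------------------------------------------------------------;
data RRcheck;
   set ClabsiExample;
   N1 = Event1 + NonEvent1;   * exposed group total (A + C);
   N2 = Event2 + NonEvent2;   * unexposed group total (B + D);
   R1 = Event1 / N1;          * risk in the exposed group;
   R2 = Event2 / N2;          * risk in the unexposed group;
   if R1 > 0 and R2 > 0 then do;
      RR_chk  = R1 / R2;
      * standard error of ln(RR) by the Taylor series approach;
      SE_lnRR = sqrt( (1-R1)/(N1*R1) + (1-R2)/(N2*R2) );
      LL_chk  = RR_chk * exp(-1.96*SE_lnRR);
      UL_chk  = RR_chk * exp( 1.96*SE_lnRR);
   end;
run;

proc print data=RRcheck; run;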