Simultaneous Estimation of Adjusted Rate of Two Factors Using Method of Direct Standardization
This paper presents the use of standardization, or adjustment, of rates and ratios for comparing two populations using single summary indices rather than a series of specific rates or ratios. In the example considered, the overall adjusted crude rate and the overall unadjusted crude rate have the same estimate whichever of the two standard population distributions is used. These results are obtained whenever the two standard distributions are those of the total sample. In these cases the overall adjusted crude rates based on the two sets of directly adjusted rates are equal to each other, although not necessarily always equal to the overall unadjusted crude rate, as is found to be the case here. However, if the standard population distribution chosen for a given population is different from that chosen for another, then the two resulting estimated adjusted or standardized crude rates would most likely not be equal to each other.
Standardization or adjustment of rates and ratios is often necessary because, in comparing two populations, it is usually easier to make the comparison using single summary indices rather than a series of specific rates or ratios. This approach also helps avoid the problem of small, imprecise and sometimes nonexistent specific rates and ratios [1-3].
Standardization of rates and ratios may be done for only one factor or for several factors of classification of a criterion variable of interest. In particular, if a criterion variable or condition is associated with each of two factors of classification, which may themselves be associated with each other, then a clearer analysis and interpretation of results may sometimes require simultaneously standardizing or adjusting the rates for the two factors: first specific to the levels or categories of one factor across the levels of the other factor, and then specific to the levels of the second factor across all levels or categories of the first factor [4,5]. Research interest in this case would be to identify and measure the separate effects of the two factors of classification on the criterion variable or condition. This paper proposes, develops and presents a systematic statistical method for this purpose.
The proposed method
Research interest here is in using the direct method of standardization of rates to measure or estimate the separate effects on the variable being studied of two factors of classification, which may be associated, and to obtain sample estimates of the unadjusted and adjusted crude rates specific to the levels of each factor while holding the levels of the other factor constant.
To develop the method of estimating directly standardized or adjusted rates, suppose A and B are two factors of classification with ‘a’ and ‘b’ groups or levels respectively. Factors A and B may be associated or related. Research interest is to estimate the rates of occurrence of a criterion variable or condition specific to each level of factor A for all levels of factor B, and also the rates of occurrence of the condition specific to each level of factor B for all levels of factor A, as well as the corresponding marginal rates and the overall rate. Suppose a random sample of size $N={N}_{\mathrm{..}}$ subjects is drawn from an antecedent or predisposing population C for all levels of factors A and B, of which ${N}_{ij}$ is the number of subjects at the ith level of factor A and jth level of factor B, for i=1,2,…,a and j=1,2,…,b. Also suppose there are a total of $n={n}_{\mathrm{..}}$ outcomes or cases in condition or set D for all levels of factors A and B, of which ${n}_{ij}$ cases are at the ith level of factor A and jth level of factor B, for i=1,2,…,a and j=1,2,…,b, where population D is possibly a subset of population C.
Now the rate of occurrence of cases in population D as a function of the subjects in population C, specific to the ith level of factor A and jth level of factor B, is
${r}_{ij}=\frac{{n}_{ij}}{{N}_{ij}}\qquad (1)$
for i=1,2,…,a; j=1,2,…,b.
Let
${N}_{i.}=\sum _{j=1}^{b}{N}_{ij};\quad {N}_{.j}=\sum _{i=1}^{a}{N}_{ij}\qquad (2)$
be respectively the total or marginal number of subjects or observations in population C at the ith level of factor A and at the jth level of factor B.
Similarly let
${n}_{i.}=\sum _{j=1}^{b}{n}_{ij};\quad {n}_{.j}=\sum _{i=1}^{a}{n}_{ij}\qquad (3)$
be respectively the total or marginal number of cases or outcomes in population D at the ith level of factor A and at the jth level of factor B. Then the estimated unadjusted crude rates of occurrence of cases or outcomes in population D as a function of the observations in population C, specific to the ith level of factor A for all levels of factor B and specific to the jth level of factor B for all levels of factor A, are respectively the ratios
${r}_{i.;unadj}=\frac{{n}_{i.}}{{N}_{i.}};\quad {r}_{.j;unadj}=\frac{{n}_{.j}}{{N}_{.j}}\qquad (4)$
for i=1,2,…,a; j=1,2,…,b.
Note that
$N={N}_{\mathrm{..}}=\sum _{i=1}^{a}{N}_{i.}=\sum _{j=1}^{b}{N}_{.j}=\sum _{j=1}^{b}\sum _{i=1}^{a}{N}_{ij}\qquad (5)$
and
$n={n}_{\mathrm{..}}=\sum _{i=1}^{a}{n}_{i.}=\sum _{j=1}^{b}{n}_{.j}=\sum _{j=1}^{b}\sum _{i=1}^{a}{n}_{ij}\qquad (6)$
Therefore the overall unadjusted crude rate of occurrence of event D as a function of event C for all levels of factors A and B is
${r}_{unadj}=r=\frac{{n}_{\mathrm{..}}}{{N}_{\mathrm{..}}}\qquad (7)$
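The specific rates, marginal totals and unadjusted crude rates defined above can be sketched in a few lines of Python. The counts below are hypothetical and chosen only to keep the arithmetic easy to follow; the variable names (`n_row`, `N_col` and so on) are illustrative and not part of the paper's notation.

```python
# Hypothetical 2 x 3 layout (a = 2 levels of factor A, b = 3 levels of factor B).
# n[i][j]: cases in condition D; N[i][j]: subjects from population C.
n = [[4, 6, 10], [3, 9, 8]]
N = [[40, 50, 110], [60, 90, 50]]

a, b = len(N), len(N[0])

# Cell-specific rates r_ij = n_ij / N_ij
r = [[n[i][j] / N[i][j] for j in range(b)] for i in range(a)]

# Marginal totals over factor B (rows) and over factor A (columns)
N_row = [sum(N[i]) for i in range(a)]
N_col = [sum(N[i][j] for i in range(a)) for j in range(b)]
n_row = [sum(n[i]) for i in range(a)]
n_col = [sum(n[i][j] for i in range(a)) for j in range(b)]

# Unadjusted crude rates specific to each level of A and of B
r_row_unadj = [n_row[i] / N_row[i] for i in range(a)]
r_col_unadj = [n_col[j] / N_col[j] for j in range(b)]

# Grand totals and the overall unadjusted crude rate
N_total = sum(N_row)
n_total = sum(n_row)
r_unadj = n_total / N_total

print(r_row_unadj, r_col_unadj, r_unadj)
```

Summing the marginal totals in either direction gives the same grand totals, which is a convenient check on data entry before any rates are adjusted.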
As noted above, research interest is to obtain the standardized or adjusted crude rate specific to each level of factor A for all levels of factor B, and specific to each level of factor B for all levels of factor A, as well as the overall adjusted or standardized crude rate. To obtain estimates of adjusted or standardized crude rates specific to each level of factor B for all levels of factor A, we use the proportionate distribution of the total number of observations ${N}_{\mathrm{..}}$ across the ‘a’ levels or groups of factor A, namely ${p}_{is}$, as the weighting factor, for i=1,2,…,a.
Thus
${p}_{is}=\frac{{N}_{i.}}{{N}_{\mathrm{..}}}\qquad (8)$
Similarly, to obtain estimates of the adjusted or standardized crude rate specific to each level of factor A for all levels of factor B, we use the proportionate distribution of ${N}_{\mathrm{..}}$ across the ‘b’ levels or groups of factor B, namely ${p}_{sj}$, as the weighting factor, for j=1,2,…,b. Thus
${p}_{sj}=\frac{{N}_{.j}}{{N}_{\mathrm{..}}}\qquad (9)$
Hence the adjusted or standardized crude rate of condition D as a function of population C specific to the jth level of factor B for all levels of factor A is
${r}_{.j;adj}=\sum _{i=1}^{a}{p}_{is}{r}_{ij}\qquad (10)$
Similarly the adjusted or standardized crude rate of condition D as a function of population C specific to the ith level of factor A for all levels of factor B is
${r}_{i.;adj}=\sum _{j=1}^{b}{p}_{sj}{r}_{ij}\qquad (11)$
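As a minimal sketch of equations 8 to 11, the function below computes the weighting factors from the total-sample margins and applies them to the cell-specific rates. The function name `adjusted_rates` and the small 2 x 2 data set are hypothetical and serve only to illustrate the calculation.

```python
def adjusted_rates(n, N):
    """Directly standardized rates for a two-way layout.

    n[i][j] are case counts and N[i][j] are subject counts. The weights are
    the proportionate distributions of the total sample across each factor.
    """
    a, b = len(N), len(N[0])
    total = sum(sum(row) for row in N)
    # Weights across the levels of factor A and of factor B
    p_is = [sum(N[i]) / total for i in range(a)]
    p_sj = [sum(N[i][j] for i in range(a)) / total for j in range(b)]
    # Cell-specific rates
    r = [[n[i][j] / N[i][j] for j in range(b)] for i in range(a)]
    # Rates specific to each level of B, adjusted for A
    r_col_adj = [sum(p_is[i] * r[i][j] for i in range(a)) for j in range(b)]
    # Rates specific to each level of A, adjusted for B
    r_row_adj = [sum(p_sj[j] * r[i][j] for j in range(b)) for i in range(a)]
    return p_is, p_sj, r_row_adj, r_col_adj


# Hypothetical counts in which factors A and B are associated in the sample
n = [[2, 12], [9, 8]]
N = [[20, 80], [60, 40]]
p_is, p_sj, r_row_adj, r_col_adj = adjusted_rates(n, N)
print(p_is, p_sj, r_row_adj, r_col_adj)
```

In this hypothetical layout the unadjusted rates for the two levels of factor A are 14/100 and 17/100, while the adjusted rates differ from them, because each level of A has a different mix of the levels of B.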
We then obtain the sample estimate of the overall adjusted crude rate of condition D as a function of population C for all levels of factors A and B as
${r}_{s;adj}={r}_{\mathrm{..};adj}=\sum _{i=1}^{a}{p}_{is}{r}_{i.;adj}=\sum _{j=1}^{b}{p}_{sj}{r}_{.j;adj}\qquad (12)$
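A short derivation, added here as a sketch, shows why the two weighted sums in equation 12 agree when both sets of weights are the total-sample margins and the rates being weighted are the adjusted rates of equations 10 and 11. Substituting equation 11 and interchanging the order of summation gives

$\sum _{i=1}^{a}{p}_{is}{r}_{i.;adj}=\sum _{i=1}^{a}{p}_{is}\sum _{j=1}^{b}{p}_{sj}{r}_{ij}=\sum _{j=1}^{b}{p}_{sj}\sum _{i=1}^{a}{p}_{is}{r}_{ij}=\sum _{j=1}^{b}{p}_{sj}{r}_{.j;adj}$

By contrast, applying the same weights to the unadjusted rates simply recovers the overall unadjusted crude rate:

$\sum _{i=1}^{a}{p}_{is}{r}_{i.;unadj}=\sum _{i=1}^{a}\frac{{N}_{i.}}{{N}_{\mathrm{..}}}\cdot \frac{{n}_{i.}}{{N}_{i.}}=\frac{{n}_{\mathrm{..}}}{{N}_{\mathrm{..}}}={r}_{unadj}$

The overall adjusted rate equals $\sum _{i}\sum _{j}{p}_{is}{p}_{sj}{r}_{ij}$, whereas the unadjusted rate equals $\sum _{i}\sum _{j}\frac{{N}_{ij}}{{N}_{\mathrm{..}}}{r}_{ij}$. The two therefore coincide, for instance, when $\frac{{N}_{ij}}{{N}_{\mathrm{..}}}={p}_{is}{p}_{sj}$ in every cell, that is, when the two factors are not associated in the sample, and need not coincide otherwise.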
These results are summarized in Table 1.
| Factor A (rows) / Factor B (columns) | 1 | 2 | … | b | Total $\left({n}_{i.}({N}_{i.})\right)$ | Proportion $\left({p}_{is}\right)$ | Unadjusted $\left({r}_{i.;unadj}\right)$ | Adjusted $\left({r}_{i.;adj}\right)$ |
|---|---|---|---|---|---|---|---|---|
| 1 | ${n}_{11}({N}_{11})$; ${r}_{11}$ | ${n}_{12}({N}_{12})$; ${r}_{12}$ | … | ${n}_{1b}({N}_{1b})$; ${r}_{1b}$ | ${n}_{1.}({N}_{1.})$; ${r}_{1.}$ | ${p}_{1s}$ | $\frac{{n}_{1.}}{{N}_{1.}}$ | ${r}_{1.;adj}$ |
| 2 | ${n}_{21}({N}_{21})$; ${r}_{21}$ | ${n}_{22}({N}_{22})$; ${r}_{22}$ | … | ${n}_{2b}({N}_{2b})$; ${r}_{2b}$ | ${n}_{2.}({N}_{2.})$; ${r}_{2.}$ | ${p}_{2s}$ | $\frac{{n}_{2.}}{{N}_{2.}}$ | ${r}_{2.;adj}$ |
| … | … | … | … | … | … | … | … | … |
| a | ${n}_{a1}({N}_{a1})$; ${r}_{a1}$ | ${n}_{a2}({N}_{a2})$; ${r}_{a2}$ | … | ${n}_{ab}({N}_{ab})$; ${r}_{ab}$ | ${n}_{a.}({N}_{a.})$; ${r}_{a.}$ | ${p}_{as}$ | $\frac{{n}_{a.}}{{N}_{a.}}$ | ${r}_{a.;adj}$ |
| Total $\left({n}_{.j}({N}_{.j})\right)$ | ${n}_{.1}({N}_{.1})$; ${r}_{.1}$ | ${n}_{.2}({N}_{.2})$; ${r}_{.2}$ | … | ${n}_{.b}({N}_{.b})$; ${r}_{.b}$ | ${n}_{\mathrm{..}}({N}_{\mathrm{..}})$ | | | |
| Proportion $\left({p}_{sj}\right)$ | ${p}_{s1}$ | ${p}_{s2}$ | … | ${p}_{sb}$ | | | | |
| Unadjusted $\left({r}_{.j;unadj}\right)$ | $\frac{{n}_{.1}}{{N}_{.1}}$ | $\frac{{n}_{.2}}{{N}_{.2}}$ | … | $\frac{{n}_{.b}}{{N}_{.b}}$ | | | ${r}_{unadj}=\frac{{n}_{\mathrm{..}}}{{N}_{\mathrm{..}}}$ | |
| Adjusted $\left({r}_{.j;adj}\right)$ | ${r}_{.1;adj}$ | ${r}_{.2;adj}$ | … | ${r}_{.b;adj}$ | | | | ${r}_{\mathrm{..};adj}$ |
Table 1: Data format for Estimation of Unadjusted and Adjusted Rates in two Factor Standardization by Direct method.
In Table 1 the entries in each cell are the number of cases in condition D, the number of observations in population C (in parentheses), and the ratio of these two numbers.
Illustrative Example
We now illustrate the proposed method with the sample data of Table 2 on premature and live births by birth order and age of mother in a certain population.
| Maternal age (rows) / Birth order (columns) | 1 | 2 | 3 | 4 | 5+ | Total $\left({n}_{i.}({N}_{i.})\right)$ | Proportion of total births $({p}_{is})$ |
|---|---|---|---|---|---|---|---|
| Under 20 | 11(23) | 3(72) | 3(32) | 1(43) | 0(33) | 18(203) | |
| | 0.478 | 0.042 | 0.094 | 0.023 | 0.000 | 0.089 | 0.066 |
| 20-24 | 14(329) | 15(327) | 7(176) | 3(69) | 8(67) | 47(968) | |
| | 0.043 | 0.046 | 0.040 | 0.043 | 0.119 | 0.049 | 0.312 |
| 25-29 | 6(115) | 11(209) | 11(207) | 6(132) | 6(123) | 40(786) | |
| | 0.052 | 0.053 | 0.053 | 0.045 | 0.049 | 0.051 | 0.254 |
| 30-34 | 4(78) | 8(83) | 10(117) | 9(98) | 12(150) | 43(526) | |
| | 0.051 | 0.096 | 0.085 | 0.092 | 0.080 | 0.082 | 0.170 |
| 35-39 | 4(42) | 8(56) | 11(90) | 14(56) | 3(104) | 40(348) | |
| | 0.095 | 0.143 | 0.122 | 0.250 | 0.029 | 0.115 | 0.112 |
| 40 and above | 3(34) | 4(45) | 8(72) | 10(48) | 4(68) | 29(267) | |
| | 0.088 | 0.089 | 0.111 | 0.208 | 0.059 | 0.109 | 0.086 |
| Total $\left({n}_{.j}({N}_{.j})\right)$ | 42(621) | 49(792) | 47(694) | 43(446) | 33(545) | 217(3098) | |
| | 0.068 | 0.062 | 0.068 | 0.096 | 0.061 | 0.070 | |
| Proportion of total births $({p}_{sj})$ | 0.200 | 0.256 | 0.224 | 0.144 | 0.176 | | |
Table 2: Sample Data on Premature and Live births by Birth order and Maternal age in a population.
The data of Table 2 are used to obtain estimates of the unadjusted and adjusted crude rates specific to each of the levels or groups of the two factors of classification.
Specifically, to estimate adjusted or standardized crude rates specific to birth order, we apply the proportionate distribution of the total live births across maternal age as the standard population, namely ${p}_{is}$ in the last column of Table 2, to each of the columns of rates ${r}_{ij}$ of the Table, for j=1,2,3,4,5. Similarly, to estimate adjusted or standardized crude rates specific to maternal age, we apply the proportionate distribution of total live births across birth order as the standard population, namely ${p}_{sj}$ in the last row of Table 2, to each of the rows of rates ${r}_{ij}$ of the Table, for i=1,2,3,4,5,6. The results are presented in Table 3.
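The calculation just described can be sketched in Python using the cell counts of Table 2. The counts are transcribed by hand, and one denominator (maternal age 40 and above, birth order 2) is taken as 45, the value that agrees with the row total of 267 and the column total of 792; this is an assumption of the sketch. Because the code works from unrounded cell rates, its output need not match printed three-decimal values exactly.

```python
# Premature births n[i][j] and live births N[i][j] transcribed from Table 2.
# Rows: maternal age groups; columns: birth orders 1, 2, 3, 4, 5+.
# Assumption: N for "40 and above", birth order 2 is read as 45.
ages = ["Under 20", "20-24", "25-29", "30-34", "35-39", "40 and above"]
n = [
    [11, 3, 3, 1, 0],
    [14, 15, 7, 3, 8],
    [6, 11, 11, 6, 6],
    [4, 8, 10, 9, 12],
    [4, 8, 11, 14, 3],
    [3, 4, 8, 10, 4],
]
N = [
    [23, 72, 32, 43, 33],
    [329, 327, 176, 69, 67],
    [115, 209, 207, 132, 123],
    [78, 83, 117, 98, 150],
    [42, 56, 90, 56, 104],
    [34, 45, 72, 48, 68],
]

a, b = len(N), len(N[0])
N_total = sum(sum(row) for row in N)
n_total = sum(sum(row) for row in n)

# Standard distributions taken from the total sample
p_is = [sum(N[i]) / N_total for i in range(a)]                      # by maternal age
p_sj = [sum(N[i][j] for i in range(a)) / N_total for j in range(b)]  # by birth order

# Cell-specific premature birth rates
r = [[n[i][j] / N[i][j] for j in range(b)] for i in range(a)]

# Rates specific to maternal age, adjusted for birth order
r_age_adj = [sum(p_sj[j] * r[i][j] for j in range(b)) for i in range(a)]
# Rates specific to birth order, adjusted for maternal age
r_order_adj = [sum(p_is[i] * r[i][j] for i in range(a)) for j in range(b)]

# Overall adjusted rate computed from either set of adjusted rates
overall_from_age = sum(p_is[i] * r_age_adj[i] for i in range(a))
overall_from_order = sum(p_sj[j] * r_order_adj[j] for j in range(b))
r_unadj = n_total / N_total

for label, rate in zip(ages, r_age_adj):
    print(f"{label:>12}: {rate:.3f}")
print("by birth order:", [round(x, 3) for x in r_order_adj])
print("overall:", round(overall_from_age, 3), round(overall_from_order, 3),
      "unadjusted:", round(r_unadj, 3))
```

The two overall adjusted rates agree up to floating-point error because both reduce to the same double sum over the cells; whether they also agree with the unadjusted rate depends on how strongly maternal age and birth order are associated in the sample.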
| Maternal age | Proportion of total births $({p}_{is})$ | Birth order 1 $({r}_{i1})$ | Birth order 2 $({r}_{i2})$ | Birth order 3 $({r}_{i3})$ | Birth order 4 $({r}_{i4})$ | Birth order 5+ $({r}_{i5})$ | Unadjusted crude rate $({r}_{i.;unadj})$ | Adjusted crude rate $({r}_{i.;adj})$ |
|---|---|---|---|---|---|---|---|---|
| Less than 20 | 0.066 | 0.478 | 0.042 | 0.094 | 0.023 | 0.000 | 0.089 | 0.131 |
| 20-24 | 0.312 | 0.043 | 0.046 | 0.040 | 0.043 | 0.119 | 0.049 | 0.057 |
| 25-29 | 0.254 | 0.052 | 0.053 | 0.053 | 0.045 | 0.049 | 0.051 | 0.051 |
| 30-34 | 0.170 | 0.051 | 0.096 | 0.085 | 0.092 | 0.080 | 0.082 | 0.081 |
| 35-39 | 0.112 | 0.095 | 0.143 | 0.122 | 0.250 | 0.029 | 0.115 | 0.157 |
| 40 and over | 0.086 | 0.088 | 0.089 | 0.111 | 0.208 | 0.059 | 0.109 | 0.594 |
| Proportion of total births $({p}_{sj})$ | | 0.200 | 0.256 | 0.224 | 0.144 | 0.176 | | |
| Unadjusted crude rate $({r}_{.j;unadj})$ | | 0.068 | 0.062 | 0.068 | 0.096 | 0.061 | 0.070 | |
| Adjusted crude rate $({r}_{.j;adj})$ | | 0.086 | 0.070 | 0.069 | 0.088 | 0.068 | | 0.070 |
Table 3: Simultaneous Estimates of Unadjusted and Adjusted Premature Birth Rates by Maternal Age and Birth Order: Direct Standardization
The adjusted crude rates of premature births specific to birth order for all age groups, shown in the last row of Table 3, are estimated using equation 10, while the corresponding adjusted crude rates specific to maternal age for all birth orders, shown in the last column of Table 3, are estimated using equation 11. Thus the last two rows of Table 3 show rates specific for birth order and directly adjusted for maternal age, with the standard maternal age distribution of births being that of the total sample of births. The last two columns of the Table show rates specific for maternal age and directly adjusted for birth order, with the standard birth order distribution of births being that of the total sample of births.
The estimated adjusted specific premature birth rates of Table 3 seem to indicate that the incidence of premature births may not be strongly associated with birth order, but may be somehow associated with increasing maternal age, especially from age 25 years. The overall adjusted crude premature birth rate is estimated to be 70 per 1000 live births in each case, whether the standard population distribution is the proportionate distribution of total births by birth order or by maternal age. The unadjusted crude rate is also here estimated to be 70 per 1000 live births. These results are usually the case whenever the two standard distributions are those of the total sample. In these cases the overall adjusted crude rates based on the two sets of directly adjusted rates would be equal to each other, although not necessarily always equal to the overall unadjusted crude rate as is found to be the case here. However, if the standard population distribution chosen for factor A (here maternal age) is different from that chosen for factor B (here birth order), then the two resulting estimated adjusted or standardized crude rates would most likely not be equal to each other.
References
1. Fleiss JL (1981) Statistical Methods for Rates and Proportions (2nd ed). New York: John Wiley & Sons.
2. Pepe MS (2003) The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford Statistical Science Series 28. Oxford: Oxford University Press.
3. Greenberg RS, Daniels SR, Flanders WD, Eley JW and Boring JR (2001) Medical Epidemiology. London: Lange/McGraw-Hill.
4. Cochran WG (1950) "The comparison of percentages in matched samples". Biometrika 37: 256-266.
5. Gibbons JD (1971) Nonparametric Statistical Inference. McGraw-Hill Inc.