ISSN: 2378-315X BBIJ
Biometrics & Biostatistics International Journal
Research Article
Volume 3 Issue 1  2016
Statistical Analysis of CD4+ Cell Counts progression of HIV-1-positive Patients enrolled in Antiretroviral Therapy at Hossana District Queen Elleni Mohamad Memorial Hospital, South Ethiopia
Getachew Tekle^{1}*, Wondwosen Kassahun^{2} and Abdisa Gurmessa^{3}
^{1}Department of Statistics, Wachemo University, Ethiopia
^{2}Department of Epidemiology and Biostatistics, Jimma University, Ethiopia
^{3}Department of Statistics, Jimma University, Ethiopia
Received: December 08, 2015  Published: January 09, 2016
*Corresponding author: Getachew Tekle, Wachemo University, Faculty of Natural & Computational Sciences, Department of Statistics, Hossana, Ethiopia, Email: getch.55tekle@gmail.com
Citation: Tekle G (2016) Statistical Analysis of CD4+ Cell Counts progression of HIV-1-positive Patients enrolled in Antiretroviral Therapy at Hossana District Queen Elleni Mohamad Memorial Hospital, South Ethiopia. Biom Biostat Int J 3(1): 00057. DOI: 10.15406/bbij.2016.03.00057
Abstract
Background: Human immunodeficiency virus/acquired immune deficiency syndrome (HIV/AIDS) has caused one of the world's most devastating tragedies. Mortality among patients on HAART is associated with high baseline levels of HIV RNA, WHO stage III or IV at the beginning of treatment, low body mass index, severe anemia, low CD4+ cell count, type of ART treatment, gender, resource-poor settings, and poor adherence to HAART.
Objective: The main objective of this study was to apply an appropriate modeling approach to CD4+ cell count progression and to identify the potential risk factors affecting the CD4+ cell count progression of ART patients at Hossana District Queen Elleni Mohamad Memorial Hospital.
Methods: This retrospective longitudinal study used secondary data from Hossana District Queen Elleni Mohamad Memorial Hospital. The study population consisted of 222 HIV-1-positive patients aged 15 years or older who were treated with ART drugs from September 2011 to May 2014, each with at least one CD4+ cell count measurement. The data were analyzed using the NLMIXED procedure of SAS version 9.2. Poisson, Poisson-gamma, Poisson-normal, and Poisson-normal-gamma models were applied to account for overdispersion and correlation in the data.
Results: A total of 222 adult HIV-1-positive ART patients were included in this study. Of these, 131 (59%) were female and 91 (41%) were male; 65 (29.30%) followed the drug combinations properly; the mean and standard deviation of baseline CD4+ cell counts were 355.9 and 321.4 cells per mm³ of blood, respectively; the mean and standard deviation of the age of patients (p=0.0001) were 31.06 and 8.50 years, respectively; patients were followed for a mean of 24 months (p=0.0001). The analysis showed that the covariates significant for the progression of CD4+ cell counts at the 5% level of significance were age of the patient, time since seroconversion, and sex.
Conclusion: On average, the CD4+ cell count increases after patients are initiated on the HAART program (the disease rate declines). The progression of the outcome depends on the patient's baseline sociodemographic characteristics. In the presence of overdispersion and clustering, the Poisson-normal-gamma model improves the model fit.
Keywords: CD4+ cell count; Poisson-normal-gamma model; Overdispersion; Correlation
Introduction
Background
Human immunodeficiency virus/acquired immune deficiency syndrome (HIV/AIDS) has caused one of the world's most distressing tragedies. More than 25 million people worldwide have died of AIDS since 1981, as reported by Avert Org [1]. Ethiopia launched free ART in 2005, and over 71,000 patients had been initiated by the end of November 2006. A total of 241 hospitals and health centers now provide HIV care and treatment services in the regions of the country.
According to Bayeh et al. [2], enumeration of CD4+ T cell count has been useful to initiate and monitor therapy in HIV infected individuals taking potent ART.
Count data are collected repeatedly over time in many applications, such as biology, epidemiology, and public health. Such data are often characterized by two features. First, repeated measures on the same subject are correlated; this correlation is usually accounted for using subject-specific random effects, which are assumed to be normally distributed. Second, the sample variance may exceed the mean, a phenomenon known as overdispersion. Hence, appropriate modeling approaches that can address these issues and simplify data analysis are needed.
Statement of the problem
The CD4+ cell count remains the major measure of cell-mediated immunity. Currently, there is not enough evidence that all ART centers in Ethiopia have implemented research tools to monitor patients' immune (CD4) response to HAART within a specified time frame and to identify factors that might be associated with a poor CD4 lymphocyte response to HAART.
Hence, this study seeks to answer the following questions:
 Does HAART have a positive effect on the immune system of HIV/AIDS patients, as indicated by their gain in CD4+ cell counts, at Hossana District Queen Elleni Mohamad Memorial Hospital, SNNPR, Ethiopia?
 What are the appropriate longitudinal models to handle overdispersed and correlated measurements on individual subjects in these data?
 What are the important potential determinants of HIV/AIDS patients' response to HAART at Hossana District Queen Elleni Mohamad Memorial Hospital, SNNPR, Ethiopia?
Objectives of the study
General objective: To apply an appropriate modeling approach to CD4+ cell count progression and to identify the potential risk factors affecting the CD4+ cell count progression of ART patients at Hossana District Queen Elleni Mohamad Memorial Hospital.
Specific objectives
 To explore how CD4+ cell counts of HIV-1-positive patients under ART in Hossana District Queen Elleni Mohamad Memorial Hospital change over time;
 To fit an appropriate statistical model for the average evolution of CD4+ cell counts of HIV-1-positive patients;
 To identify the potential factors affecting the evolution of CD4+ cell counts among HIV-1-positive patients under ART.
Significance of the study
This study will have the following benefits:
 It helps to identify the potential risk factors influencing the absolute CD4 count measurements in HIV infected patients.
 It helps policy makers in the health sector to set the monitoring frequency of CD4 counts, monitor therapeutic response, and judge the urgency of ART initiation using the CD4+ cell counts of patients.
 It can be used as a reference for those who want to apply techniques for handling correlation and overdispersion in the analysis of longitudinally collected count data.
Methodology
Source of data and sample size
The data were obtained from Hossana Queen Elleni Mohamad Memorial Hospital, SNNPR, Ethiopia, and cover the period from September 2011 to May 2014. The hospital is located in Hossana town, 235 km from Addis Ababa. Persons living with HIV/AIDS who were aged 15 years or older and who started ART treatment at the hospital were included in the study. A total of 222 HIV patients on HAART, each with a minimum of one and a maximum of nineteen measurements, were selected. A retrospective longitudinal study design was used. The data were analyzed with the NLMIXED procedure of SAS version 9.2.
Study Variables
Dependent/response variable
Enumeration of CD4+ cell counts (CD4+ T cells) per mm³ of blood of ART patients, which is a count variable.
Independent/explanatory variables
Eleven covariates were used to meet the goal of the study. These are:
Baseline age, Time since a month of seroconversion, Sex of the patient, Level of Education, (WHO) Clinical Stage, Area/residence of the patient, Adherence to any of the drugs, Employment status, Religion, and Marital Status.
Method of Data Analysis
Exploratory data analysis:
 Exploring the individual profile: Provide some information on within and between subject variability.
 Exploring the Mean Structure: Its purpose is to choose the fixed effects for the model.
Statistical Models
Generalized linear model/Poisson model
Generalized linear models are commonly used for modeling univariate non-Gaussian data.
The Poisson distribution belongs to the exponential family and is commonly used for the analysis of count data.
The outcome is assumed to follow a Poisson distribution, $Y_i \sim \text{Poisson}\left(\theta_i\right)$, with probability mass function
$$f\left(y_i;\theta_i\right)=\frac{\theta_i^{y_i}e^{-\theta_i}}{y_i!} \qquad (1)$$
The Poisson regression model, with $\beta$ a vector of $p$ fixed but unknown regression coefficients, is given by
$$\mathrm{log}\left(\lambda_i\right)=X_i^{\prime}\beta$$
where $\lambda_i$ denotes the Poisson mean of the $i$th observation (written $\theta_i$ in equation 1).
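The probability mass function and the log link can be checked numerically. The following is a minimal sketch, not the SAS code used in the study; the covariate row and the coefficient values are hypothetical and were chosen only to keep the mean small.

```python
import math


def poisson_pmf(y, mean):
    """P(Y = y) for a Poisson variable, computed on the log scale for stability."""
    return math.exp(y * math.log(mean) - mean - math.lgamma(y + 1))


def poisson_mean(x, beta):
    """Mean implied by the log link: exp(x' beta)."""
    return math.exp(sum(xk * bk for xk, bk in zip(x, beta)))


# Hypothetical covariate row (intercept, time in months) and coefficients.
x = [1.0, 12.0]
beta = [1.0, 0.02]
lam = poisson_mean(x, beta)

# Sum over a range wide enough that the omitted tail is negligible.
support = range(0, 200)
total = sum(poisson_pmf(y, lam) for y in support)
mean = sum(y * poisson_pmf(y, lam) for y in support)
var = sum((y - mean) ** 2 * poisson_pmf(y, lam) for y in support)
print(total, mean, var)
```

The probabilities sum to one, and the mean and the variance both equal the value implied by the linear predictor, which is the equality that overdispersed data violate.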
Overdispersion model
For count data, the Poisson distribution assumes that the variance equals the mean. Overdispersion relaxes this assumption by letting the variance be proportional to the mean. Specifically, $\mathrm{var}\left(y\right)=\phi E\left(y\right)=\phi \lambda$, where $\phi$ is the overdispersion parameter. If $\phi =1$, the variance equals the mean and we obtain the Poisson mean-variance relationship. When the variance exceeds the mean (overdispersion), the Poisson distribution is often replaced with the negative binomial distribution.
Start from a Poisson regression model and add a multiplicative random effect $\theta$ to represent unobserved heterogeneity:
$$y\mid \theta \sim \text{Poisson}\left(\theta \lambda \right) \qquad (2)$$
If $\theta$ has a gamma distribution with parameters $\alpha$ and $\beta$, the unconditional distribution of the outcome is the negative binomial distribution.
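The extra variation introduced by the gamma random effect can be made explicit with a short derivation. This is a sketch under the assumption that $\alpha$ is the shape and $\beta$ the scale of the gamma distribution, so that $E\left(\theta\right)=\alpha\beta$ and $\mathrm{var}\left(\theta\right)=\alpha\beta^{2}$. Conditioning on $\theta$ in model (2) gives

$$E\left(y\right)=E_{\theta}\left[E\left(y\mid\theta\right)\right]=\lambda E\left(\theta\right)=\lambda\alpha\beta$$

$$\mathrm{var}\left(y\right)=E_{\theta}\left[\mathrm{var}\left(y\mid\theta\right)\right]+\mathrm{var}_{\theta}\left[E\left(y\mid\theta\right)\right]=\lambda\alpha\beta+\lambda^{2}\alpha\beta^{2}$$

Under the common identifiability constraint $\alpha\beta=1$ (that is, $\beta=1/\alpha$), these reduce to $E\left(y\right)=\lambda$ and $\mathrm{var}\left(y\right)=\lambda+\lambda^{2}/\alpha$, which exceeds the mean for every finite $\alpha$ and approaches the Poisson case as $\alpha\to\infty$.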
Generalized linear mixed model
According to Engel & Keen [3], Molenberghs & Verbeke [4], and Pinheiro & Bates [5], when non-Gaussian data such as CD4+ cell counts are measured repeatedly, the GLM is usually extended to a GLMM, with a subject-specific random effect, usually of Gaussian type, added to the linear predictor to capture the correlation.
GLMMs combine the properties of linear mixed models and generalized linear models.
$$g\left(\lambda_{ij}\right)=\mathrm{ln}\left(\lambda_{ij}\right)=X_{ij}\gamma+Z_{ij}b_{i} \qquad (3)$$
or, equivalently,
$$\lambda_{ij}=\mathrm{exp}\left(X_{ij}\gamma+Z_{ij}b_{i}\right)$$
Where,
$\lambda_{ij}$: The mean of the response, which is related to the covariates $X_{ij}$ through the link function
${X}_{ij}$: Covariates of the i^{th} patient at the j^{th} time
$\gamma $: Regression coefficients of ${X}_{ij}$
${Z}_{ij}$: The covariates of the random effects of the i^{th} patient at the j^{th} time
${b}_{i}$: The subject-specific random effects of the i^{th} patient, assumed to be normally distributed
Poisson-normal-gamma model
Correlation and overdispersion can occur simultaneously in practice. Combining ideas from the overdispersion model and the GLMM, the Poisson model with normal and gamma random effects can be specified as [6]:
$${Y}_{ij}\sim \text{Poisson}\left({\lambda}_{ij}={\theta}_{ij}{K}_{ij}\right) \qquad (4)$$
with ${K}_{ij}=\mathrm{exp}\left({X}_{ij}\gamma +{Z}_{ij}{b}_{i}\right)$, where ${Y}_{ij}$ is the j^{th} outcome measured for subject $i=1,\mathrm{...},N$, $j=1,\mathrm{...},{n}_{i}$, ${b}_{i}\sim N(0,D)$ and ${\theta}_{ij}\sim \text{Gamma}(\alpha ,\beta )$. ${X}_{ij}$ and ${Z}_{ij}$ are p-dimensional and q-dimensional vectors of known covariate values, and $\gamma $ is a p-dimensional vector of unknown fixed regression coefficients.
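To illustrate how the two sets of random effects in model (4) act together, the sketch below simulates counts with and without them and compares the variance-to-mean ratio. It is an illustration only, not the analysis performed in the study: the intercept is a hypothetical small value chosen so that a simple Poisson sampler suffices, the values 0.027 and 2.287 are borrowed from the d and alpha estimates reported for the Poisson-normal-gamma model in Table 2, and the gamma effect is assumed to have shape alpha and scale 1/alpha.

```python
import math
import random
from statistics import mean, pvariance


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_counts(n_subjects, n_visits, intercept, slope, d, alpha, rng):
    """Simulate Y_ij ~ Poisson(theta_ij * K_ij) with K_ij = exp(intercept + slope*j + b_i).

    d = 0 switches off the normal random intercept; alpha = None switches off
    the gamma random effect (theta_ij is then fixed at 1).
    """
    counts = []
    for _ in range(n_subjects):
        b = rng.gauss(0.0, math.sqrt(d)) if d > 0 else 0.0
        for j in range(n_visits):
            theta = rng.gammavariate(alpha, 1.0 / alpha) if alpha is not None else 1.0
            k = math.exp(intercept + slope * j + b)
            counts.append(sample_poisson(theta * k, rng))
    return counts


def dispersion(counts):
    """Variance-to-mean ratio; close to 1 for Poisson data."""
    return pvariance(counts) / mean(counts)


rng = random.Random(2016)
plain = simulate_counts(400, 5, 1.0, 0.0, 0.0, None, rng)
combined = simulate_counts(400, 5, 1.0, 0.0, 0.027, 2.287, rng)
print(dispersion(plain), dispersion(combined))
```

The plain Poisson sample has a ratio near one, while the sample with both random effects is clearly overdispersed; the shared b_i additionally makes counts from the same subject correlated.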
Model comparison/selection and variable selection
The NLMIXED procedure fits nonlinear mixed models by maximizing an approximation to the likelihood integrated over the random effects.
By default, it uses the adaptive Gaussian quadrature method (MLH).
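The idea of integrating the likelihood over a normal random effect by quadrature can be sketched as follows. This is a simplified, non-adaptive 3-point Gauss-Hermite rule for a random-intercept Poisson model, written only for illustration; it is not the algorithm implemented in PROC NLMIXED, whose adaptive version recenters and rescales the nodes for each subject and typically uses more points. All function names are hypothetical.

```python
import math

# 3-point Gauss-Hermite rule for integrals of the form: integral of f(x) * exp(-x^2) dx.
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def normal_expectation(g, sigma):
    """Approximate E[g(b)] for b ~ N(0, sigma^2) using the substitution b = sqrt(2)*sigma*x."""
    total = sum(w * g(math.sqrt(2.0) * sigma * x) for x, w in zip(GH_NODES, GH_WEIGHTS))
    return total / math.sqrt(math.pi)


def poisson_log_pmf(y, mean):
    """Log of the Poisson probability mass function."""
    return y * math.log(mean) - mean - math.lgamma(y + 1)


def subject_marginal_likelihood(counts, linear_predictors, sigma):
    """Likelihood of one subject's counts with the random intercept integrated out."""
    def conditional_likelihood(b):
        log_lik = sum(
            poisson_log_pmf(y, math.exp(eta + b))
            for y, eta in zip(counts, linear_predictors)
        )
        return math.exp(log_lik)

    return normal_expectation(conditional_likelihood, sigma)


# Hypothetical subject with two visits and a random-intercept standard deviation of 0.164.
print(subject_marginal_likelihood([3, 4], [1.0, 1.2], 0.164))
```

The full marginal likelihood is the product of such terms over subjects, and it is this quantity that is maximized over the fixed effects and variance components.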
Although the AIC can be used with mixed models, it is not commonly used with the models discussed above to select either the optimal set of explanatory variables or other structures.
Hence, the four models were finally compared using the −2 log-likelihood to select the best one.
In all models, significant variables were selected by backward selection: the main effects and their interactions with time were first included in the initial candidate model, and non-significant terms were then removed one at a time.
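The −2 log-likelihood comparison can be sketched with the values reported in Table 2. The snippet below is an illustration, not output from the study: it computes the drop in −2 log-likelihood between two nested models and a naive chi-square p-value with one degree of freedom. Because a variance component is tested on the boundary of its parameter space, the true reference distribution is usually taken to be a mixture of chi-square distributions, so the naive p-value should be read as conservative.

```python
import math

# -2 log-likelihood values as reported in Table 2.
MINUS_2_LOGLIK = {
    "Poisson": 151870,
    "Poisson-normal": 151975,
    "Poisson-gamma": 14536,
    "Poisson-normal-gamma": 14527,
}


def deviance_drop(simpler, richer):
    """Reduction in -2 log-likelihood when moving from the simpler to the richer model."""
    return MINUS_2_LOGLIK[simpler] - MINUS_2_LOGLIK[richer]


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Adding the normal random intercept to the Poisson-gamma model.
drop = deviance_drop("Poisson-gamma", "Poisson-normal-gamma")
p_value = chi2_sf_df1(drop)
print(drop, round(p_value, 4))
```

For these two models the drop is 9, which corresponds to a naive p-value of about 0.003.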
Results
Baseline information and descriptive statistics of CD4+ cell counts: A total of 222 adult HIV-1-positive ART patients (1047 observations), with a minimum of one and a maximum of nineteen CD4+ cell count measurements per patient, were included in this study. Of these ART patients, 131 (59%) were female and 91 (41%) were male; 175 (78.8%) were living in rural areas outside Hossana town and 47 (21.2%) were living in urban centers including Hossana town. The mean and standard deviation of baseline CD4+ cell counts were 355.891 and 321.455 cells per mm³ of blood, respectively.
Exploratory analysis for CD4+ cell counts data: The first step in any modelbuilding process is exploratory data analysis as done below.
Figure 1 shows that there is higher variation in CD4+ cell counts at baseline than at the end of follow-up in both sexes, but the variation in females is more pronounced than in males; the same feature can be seen in Figure 2. Thus, between-subject and within-subject differences in variation cannot be ignored.
Figure 1: Individual profile plot for CD4+ cell counts of HIV-1-positive patients by sex.
Figure 2: Individual profile plot for CD4+ cell counts of HIV-1-positive patients.
As depicted in Figure 1, the individual profile plot indicates that most of the CD4+ cell counts are concentrated below 500 and that there is higher variation at baseline than at the end of follow-up. The CD4+ cell counts rise and fall over time, and the degree of fluctuation is very high.
The loess smooth curve, shown in Figure 3, suggests that the average profile of CD4+ cell counts is approximately linear over time and fairly constant below 500 cells per mm³ of blood. CD4+ cell counts show a slightly increasing pattern over time, but the rate of increase is very low. This also indicates that a linear time effect should be included as a fixed effect in the model.
Figure 3: Loess smoother for CD4+ cell counts of HIV-1-positive patients.
Statistical models for data analysis: The response variable in this study is a count, and the data are overdispersed, since the variance of the CD4+ cell counts (94710.06) is much greater than the mean (438.511). The Poisson model was therefore fitted with different random effects (gamma and normal) to handle overdispersion and correlation, respectively. Table 2 displays the comparison among four models: the GLM (Poisson model without random effects), the GLMM (Poisson model with normal random effects to handle the correlation), the Poisson model with gamma random effects (negative-binomial model) to capture the extra variation in the data, and the Poisson-normal-gamma model, which has both normal and gamma random effects to address correlation and overdispersion simultaneously.
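The degree of overdispersion can be summarized by the variance-to-mean ratio, which equals one under the Poisson assumption. The sketch below applies it to the summary values quoted above; the helper that works on raw counts is included only for illustration and uses a hypothetical short list of counts, since the patient-level data are not reproduced here.

```python
from statistics import mean, variance


def dispersion_ratio(var, avg):
    """Variance-to-mean ratio; values well above 1 indicate overdispersion."""
    return var / avg


def dispersion_from_counts(counts):
    """Same ratio computed from raw counts using the sample variance."""
    return dispersion_ratio(variance(counts), mean(counts))


# Summary values reported in the text for the CD4+ cell counts.
reported_ratio = dispersion_ratio(94710.06, 438.511)
print(round(reported_ratio, 1))

# Hypothetical raw counts, for illustration only.
example_ratio = dispersion_from_counts([120, 340, 510, 95, 880, 430])
print(example_ratio > 1)
```

With the reported values the ratio is about 216, far above the value of one that a Poisson model without random effects implies.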
| Variables | Levels | CD4+ cell counts, Mean (St.d) | % | Total |
|---|---|---|---|---|
| Baseline CD4+ cell counts | | 355.891 (321.455) | | 222 |
| Age of Patients | | 31.067 (8.521) | | 222 |
| Educational level of Patients | No education | 288.000 (66.30) | 25.67 | 57 |
| | Primary | 489.491 (323.71) | 44.1 | 98 |
| | Secondary | 409.077 (279.44) | 23.4 | 52 |
| | Tertiary | 444.469 (328.37) | 6.76 | 15 |
| Sex of Patients | Female | 338.061 (330.467) | 59 | 131 |
| | Male | 381.560 (308.00) | 41 | 91 |
| Marital Status of Patients | Never married | 432.714 (416.537) | 15.8 | 35 |
| | Married | 331.335 (276.971) | 57.7 | 128 |
| | Widowed | 369.500 (352.421) | 10.8 | 24 |
| | Divorced | 323.526 (347.029) | 8.6 | 19 |
| | Separated | 402.312 (349.103) | 7.2 | 16 |
| Religion of Patients | Muslim | 567.730 (355.185) | 11.7 | 26 |
| | Orthodox | 376.333 (371.689) | 28.4 | 63 |
| | Protestant | 337.969 (286.336) | 53.6 | 119 |
| | Catholic | 317.400 (174.321) | 2.3 | 5 |
| | Other | 437.222 (385.016) | 4.1 | 9 |
| Employment Status of Patients | Unemployed | 289.666 (245.476) | 25.7 | 57 |
| | Employed | 1072.000 (453.234) | 27 | 60 |
| | Working full time | 331.641 (261.385) | 17.6 | 39 |
| | Other | 383.687 (335.383) | 29.73 | 66 |
| Adherence to any of the Drugs | No | 349.758 (306.857) | 70.7 | 157 |
| | Yes | 370.707 (356.357) | 29.3 | 65 |
| WHO Stage of the Disease | I | 359.416 (332.894) | 27 | 60 |
| | II | 413.873 (323.824) | 32 | 71 |
| | III | 333.647 (317.347) | 30.6 | 68 |
| | IV | 233.478 (269.075) | 10.4 | 23 |
| Area of the Patients | Rural | 364.595 (358.506) | 21.2 | 47 |
| | Urban | 353.554 (311.835) | 78.8 | 175 |
| Total | | | 100 | 222 |

Table 1: Summary of the baseline characteristics.
* Estimates which are not significant at 5% level of significance.
Clearly, both the negative-binomial model and the PNG model are important improvements, in terms of the likelihood, relative to the PN model and the ordinary Poisson (P) model. In the PNG model there is a very strong improvement in fit when gamma and normal random effects are allowed for simultaneously, and the parameter Variance RIS (d) is also significant (P=0.0208). This implies the presence of considerable extra variability due to the grouped nature of the data, which is beyond what can be accommodated by the combined model. All covariates are significant for the progression of CD4+ cell counts of ART patients in all models, except the time-by-sex interaction in the PNG model and sex in the PG and PNG models.
Therefore, the PNG model is a more viable candidate, which is further substantiated by the above-mentioned likelihood comparison.
The random slope model strongly improves the fit of the model based on the likelihood comparison. The estimates and standard errors of the covariates are similarly significant in both models for the response variable. All the normal random effects in the slope model are significant, implying that the correlation among subjects is evident.
*Estimates which are not significant at 5% level of significance. Combined model=Poisson-normal-gamma model.
Extending the PNG model to include both a random intercept and a random slope does not improve the fit based on the likelihood comparison. The estimates of the covariates in both models are more or less similar, except for an improvement in the slope model for the time-by-sex interaction and sex. Time has a linear relationship with CD4+ cell count, which is what we observed in the graph showing the average trend. The random intercept PNG model is chosen based on the likelihood comparison.
Discussion
A retrospective longitudinal study was conducted at Hossana Queen Elleni Mohamad Memorial Hospital, SNNPR, Ethiopia, to determine the appropriate model for CD4+ cell counts and to characterize the time course of CD4+ cell progression using the SAS software package.
The data were unbalanced because some subjects did not keep the regular time schedule: they were measured at different time points, and the number of measurements differed across subjects, which is similar to Vernon et al. [7]. The time scale was expressed in months, although the six-month interval schedule was not followed by some patients, possibly because of reluctance to attend follow-up. The data were analyzed in SAS version 9.2 using PROC NLMIXED, which fits nonlinear mixed models, that is, models in which both fixed and random effects enter nonlinearly. PROC NLMIXED fits nonlinear mixed models by maximizing an approximation to the likelihood integrated over the random effects. This distinguishes the approach of this study from other studies of CD4+ cell counts.
The results of this study show that the mean and standard deviation of CD4+ cell counts per mm³ of blood are not consistent with those found by Gezahegn [8] at Durame and Hossana hospitals. This disagreement may be due to the sample size of that study, to the fact that it was conducted at two different hospitals, and to differences in educational and socioeconomic levels. Exploratory data analysis reveals that there is a time trend in the data and that the average profile of CD4+ cell counts has a weak linear relationship with time, unlike the SAS sasuser aids data analyzed using a linear mixed model by Michael [9]. This difference may be due to the smoothing technique, the time scale, the number of observations, and the progressive pattern of baseline CD4+ cell counts. However, in the current study this did not affect the random effect structure to be included in the model. There was no improvement in the analysis when nonlinearity was assumed; hence, as suggested by the observed relationship, a linear time effect on CD4+ cell counts was studied.
Most of the patients were females, and they had lower mean CD4+ cell counts than males before ART was initiated, which is similar to Moges et al. [10]. This is because females are biologically and socially more vulnerable to HIV infection in developing countries. However, this is inconsistent with the report of Kumarasamy et al. [11] from India. This difference could be due to several reasons, as described in that study: HIV-associated TB could be the contributing factor for the low CD4+ count in males, as the proportion of patients having TB was significantly higher in male than in female HIV-positive patients. In addition, it may be due to a sex-related difference in the overall CD4+ counts among males and females. HIV-seronegative Ethiopian females had relatively higher CD4+ cell counts than HIV-seronegative males, as reported by Yanis et al. [12] and Tsegay et al. [13].
Most of the HIV-infected patients enrolled in this study were living in rural areas outside Hossana town, as also found by Nuredin [14] in another study at Adama hospital. Most of the patients were young, with a mean age of 31 years, an age group that is sexually more active and thus has a higher risk of infection; this is comparable to the study conducted by Moges et al. [10] at Zewuditu Hospital, Addis Ababa, Ethiopia. These findings conform with previous reports from elsewhere in Ethiopia that HIV prevalence decreases significantly with increasing level of education and socioeconomic status.
The data in this study indicate that the majority of HIV patients started antiretroviral treatment with advanced immunodeficiency: the majority had AIDS as defined by a CD4 cell count < 200 cells/μl, indicating advanced immune suppression at initiation of ART. This proportion was significantly higher than in studies conducted in Nigeria, the south eastern United States and Thailand, which reported a lower rate of AIDS at the initiation of ART [15,16]. In the hospital of the current study, delayed enrollment in the ART program could be attributed to several factors, such as fear of stigma. In Ethiopia, as described by the above authors, only one third of HIV-infected persons disclosed their HIV status to their partner, further compromising the utilization of counseling, testing and ART services. A similar observation was made among South Africans, where patients started the ART program with advanced immunodeficiency. These findings indicate an urgent need to promote early and enhanced HIV testing to enable HIV/AIDS patients to benefit from the expanding ART services [17].
Using stepwise selection (enter and remove) and backward selection, the least significant covariates were removed one at a time and the model was refitted with the remaining covariates. The procedure ended with the following selected covariates: time since month of seroconversion, age of the patient, and sex of the patient. Except for the sex-by-time interaction term, the covariates time since month of seroconversion and age of the patient are significant for the change in the CD4+ cell counts of ART patients on HAART, which is supported by [10,14].
Thus, as in many other diseases, age is an important prognostic factor in HIV infection. Age at seroconversion and age at a given CD4 cell count were shown to be important determinants of progression and survival before the widespread introduction of HAART, starting in 1996 [18]. This supports the current study. One reviewed study examined the effect of gender, as this study does, but found no difference between males and females: there were no differences in HIV progression and response to HAART attributable to gender among patients accessing the Spanish hospital network [19]. This difference may be attributable to the method of data analysis used in that study, in which Kaplan-Meier and Cox regression were used to assess the effect of sex on time to AIDS and survival from AIDS, and to other factors.
Four models were compared: the GLM (Poisson model without random effects), the GLMM (Poisson model with normal random effects to handle the correlation), the Poisson model with gamma random effects to capture extra variation in the data, and the Poisson-normal-gamma model, with both normal and gamma random effects to capture correlation and overdispersion simultaneously. Estimation was done by maximum likelihood using numerical integration over the normal random effects, if present, as was done by Molenberghs et al. [6].
As Kassahun et al. [20] summarized, based on Molenberghs et al. [21], normal and non-normal (gamma) random effects can usefully be integrated into a single model to induce association between repeated Poisson data and to correct for overdispersion.
One possible route to deal with overdispersion is to introduce an overdispersion parameter, specify only a relationship between the mean and the variance, and then apply quasi-likelihood, whereby the extra variability in the data is captured by the dispersion parameter, as Wedderburn [22] did. For count data, it is common to combine the Poisson distribution with a gamma-distributed random effect, so that the unconditional distribution of the outcome is a negative binomial distribution; the results from SAS procedure NLMIXED are shown in Table 2 (fifth column) [23-25]. On the other hand, for hierarchical data, the GLM is usually extended to generalized linear mixed models (GLMMs), with a subject-specific random effect, typically of Gaussian form, added to the linear predictor to capture a hierarchy-induced association or to account for overdispersion [3-5], as displayed in Table 3 (fourth column). Molenberghs et al. [21], as used by Kassahun et al. [20], proposed a flexible and unified modeling structure, termed the Poisson-normal-gamma model, to simultaneously capture overdispersion and correlation for a wide range of clustered data, including count, binary and time-to-event data. Thus, two sets of random effects were brought together. The normally distributed subject-specific random effects capture the correlation, while a conjugate measurement-specific random effect on the natural parameter is used to accommodate overdispersion, as shown in Table 2 (last column).
| Effects | Parameters | Poisson, Estimate (Se.) | Poisson-Normal, Estimate (Se.) | Poisson-Gamma, Estimate (Se.) | Poisson-Normal-Gamma, Estimate (Se.) |
|---|---|---|---|---|---|
| Intercept | β_{0} | 6.055(0.038) | 6.050(0.067) | 6.131(0.081) | 6.136(0.087) |
| Time | β_{1} | 0.002(0.001) | 0.002(0.001) | 0.005(0.002) | 0.006(0.003) |
| Age | β_{2} | 0.006(0.001) | 0.006(0.001) | 0.004(0.002) | 0.004(0.002) |
| Sex | β_{3} | 0.052(0.004) | 0.052(0.004) | 0.073(0.058)* | 0.071(0.061)* |
| Sex*Time | β_{4} | 0.004(0.001) | 0.093(0.031) | 0.004(0.002) | 0.003(0.002)* |
| Sigma | σ | 0.550(0.030) | 1.568E8(0.082) | | 0.164(0.036) |
| Negative-binomial parameter | α | | | 2.170(0.089) | 2.287(0.101) |
| 1/alpha | β | | | 0.460(0.019) | 0.437(0.019) |
| Variance RIS | d | | 2.46E16() | | 0.027(0.012) |
| −2 log-likelihood | | 151870 | 151975 | 14536 | 14527 |

Table 2: Comparison of Poisson, Poisson-gamma, Poisson-normal and Poisson-normal-gamma Models.
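The asterisks in Table 2 can be related to the reported estimates and standard errors through a Wald test. The sketch below is an approximation and not the test reported by PROC NLMIXED: it assumes a standard normal reference distribution and works with the rounded values printed in the Poisson-normal-gamma column, so the p-values are only indicative.

```python
import math


def wald_p_value(estimate, std_error):
    """Two-sided p-value of z = estimate / std_error under a standard normal reference."""
    z = estimate / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))


# (estimate, standard error) pairs from the Poisson-normal-gamma column of Table 2.
png_column = {
    "Time": (0.006, 0.003),
    "Age": (0.004, 0.002),
    "Sex": (0.071, 0.061),
    "Sex*Time": (0.003, 0.002),
}

not_significant = [
    name
    for name, (estimate, std_error) in png_column.items()
    if wald_p_value(estimate, std_error) >= 0.05
]
print(not_significant)
```

Under these assumptions the terms flagged as not significant at the 5% level are Sex and Sex*Time, which agrees with the asterisks in the last column of the table.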
Molenberghs & Verbeke [4] considered a Poisson-normal model with random intercepts as well as random slopes in time. It is interesting to note that, when allowing for such an extension in our models, the random slopes improve the fit of the Poisson-normal model, but not of the Poisson-normal-gamma model (details are shown in Table 3 & Table 4). The same procedure was also applied by Booth et al. [26,27]. Whereas this study considered four different models, those authors focused on the Poisson-normal and Poisson-normal-gamma implementations. There are further differences in the actual fixed-effects and random-effects models considered.
| Effects | Parameters | Poisson-Normal (Random Intercept Only), Estimate (Se.) | Poisson-Normal (with Random Intercept & Slope), Estimate (Se.) |
|---|---|---|---|
| Intercept | β_{0} | 6.026(0.038) | 6.072(0.007) |
| Time | β_{1} | 0.003(0.001) | 0.004(0.000) |
| Age | β_{2} | 0.006(0.001) | 0.004(0.001) |
| Sex | β_{3} | 0.102(0.005) | 0.042(0.004) |
| Sex*Time | β_{4} | 0.002(0.002) | 0.003(0.001) |
| Normal Random Effect | b | 1.188(0.099) | |
| Var(b1) | d_{11} | | 0.145(0.002) |
| Var(b2) | d_{22} | | 0.001(1.552E6) |
| Cov(b1,b2) | d_{12} | | 0.004(0.001) |
| −2 log-likelihood | | 151689 | 137475 |

Table 3: Comparison of PN model among random effects.
| Effects | Parameters | Combined Model (with no random effect), Estimate (Se.) | Combined Model (random effect), Estimate (Se.) |
|---|---|---|---|
| Intercept | β_{0} | 6.136(0.087) | 5.876(0.006) |
| Time | β_{1} | 0.006(0.003) | 0.003(0.001) |
| Age | β_{2} | 0.004(0.002) | 0.006(0.001) |
| Sex | β_{3} | 0.071(0.061)* | 0.106(0.005) |
| Sex*Time | β_{4} | 0.003(0.002)* | 0.002(0.001) |
| Negative-Binomial Parameter | α | 2.287(0.102) | 2.170(0.089) |
| 1/alpha | β | 0.438(0.019) | 0.461(0.019) |
| Variance RIS | d | | 0.027(0.012) |
| Theta_1 | θ | | 0.089(0.001) |
| −2 log-likelihood | | 14527 | 152476 |

Table 4: Comparison of PNG model without and with random effects.
Therefore, as set out above, the four models were compared based on their likelihoods; AIC was not used since the models were not treated as nested, although some authors have used it. Accordingly, as discussed by Kassahun et al. [20], the Poisson-normal-gamma model, which combines normal and gamma random effects to capture both overdispersion and correlation, was selected as giving the best fit, with a −2 log-likelihood of 14527.
NB: CD4+ cell counts here means the CD4 cell counts after a patient is tested and notified that he/she is HIV positive, i.e., CD4 cell counts after the first visit.
Conclusion and Recommendation
Conclusion
 Although good CD4+ cell recovery in response to ART was observed, HIV-positive patients were enrolled in the ART program at low CD4 cell levels.
 Time since month of seroconversion, sex of the patient and age of the patient are potential risk factors for the change in the CD4+ cell counts of ART patients on HAART.
 The Poisson-normal-gamma model was the most appropriate model for handling overdispersion and correlation in the CD4+ cell count data of this study.
Recommendation
 From Table 1, a high number of patients were not following their drug combinations properly. This may affect the progress of their CD4+ cell counts, and the response to HAART may not be as expected.
 A high number of patients were married and were living in rural areas outside Hossana town, which are far from the hospital.
 Hence, HAART providers and other health-related bodies should advise patients to take care in their relationships with people other than their partners, since most patients are married, and should encourage patients to continue ART even though the service center may be far away.
 The natural evolution of CD4+ cell counts is nonlinear, as supported by some authors, but in this study there was a slight linearity in the data. This difference needs further examination.
 Further longitudinal studies with a larger number of repeated measurements per subject should be conducted on CD4+ cell counts to gain better insight into the trends and to account for both overdispersion and correlation.
Limitations of the study
The following were the limitations of the study:
 A limited number of variables was captured during patient enrolment. In order to determine probabilities of predicting CD4 response to HAART, it was necessary to identify variables that were found in the records of the respondents, captured at the commencement of ARV therapy. The problem was that the variables considered for this study were not recorded in all the files of clients on HAART.
 A limited number of variables was available to measure socioeconomic status. Only income and employment status were used as proxy measures of socioeconomic status.
 One limitation of all observational studies is unmeasured confounding; none of the co-infections, such as TB, were included.
 Some important techniques, such as Monte Carlo simulation models, were not used to track HIV disease progression and to indirectly estimate the outcomes and costs of treatment when initiated at various CD4 cell counts. Using this approach, initiation of HAART at a CD4 cell count of more than 350 cells/μl can be seen to result in longer quality-adjusted survival compared to starting HAART at lower CD4 cell counts.
Acknowledgment
Above all, I would like to thank almighty God, who allowed me to live and availed His mercy so that I could move and breathe and, as a result, accomplish all of my activities on this earth.
My special gratitude goes to Wondwosen Kassahun (Assistant Professor, PhD), my major advisor and instructor, for his kind support, advice, guidance and constructive comments from the proposal development through the final work of this thesis.
Likewise, I am sincerely grateful to my co-advisor Mr. Abdisa Gurmessa, head of the Department of Statistics, for his valuable suggestions, support and comments from the proposal development through the final work of this thesis.
References
 Avert org (2009) HIV and AIDS in ZAMBIA: The epidemic and its impact. Republic of Zambia.
 Bayeh A, Fisseha W, Tsehaye T, Atnaf A, Mohammed Yessin (2010) ART-naive HIV patients at Feleg-Hiwot Referral Hospital Northwest Ethiopia. Ethiop J Health Dev 24(1): 3-8.
 Engel B, Keen A (1992) A Simple Approach for the Analysis of Generalized Linear Mixed Models. LWA926, Agricultural Mathematics Group (GLWDLO). Wageningen the Netherlands.
 Molenberghs G, Verbeke G (2005) Models for Discrete Longitudinal Data. Springer Series in Statistics.
 Pinheiro J, Bates D (1995) Approximations to the Log-likelihood Function in the Nonlinear Mixed Effects Model. Journal of Computational and Graphical Statistics 4(1): 12-35.
 Molenberghs G, Verbeke G, Demétrio CGB (2007) An extended random-effects approach to modeling repeated, overdispersed count data. Lifetime Data Anal 13: 513-531.
 Vernon L, Demko C, Babineau D, Wang X, Toossi Z, et al. (2013) Effect of Nadir CD4+ T cell Count on Clinical Measures of Periodontal Disease in HIV+ Adults before and during Immune Reconstitution on HAART. PLoS One 8(10): e76986.
 Gezahegn A (2011) Survival Status among patient living with HIV AIDS who are on ART treatment in Durame and Hossana Hospitals.
 Michael P (2002) Longitudinal data analysis with discrete and continuous responses course notes. SAS Institute Inc 58710.
 Moges D, Monga D, Deresse D (2013) Immunological response among HIV/AIDS patients before and after ART therapy at Zewuditu Hospital Addis Ababa, Ethiopia. American Journal of Research Communication 1(1): 103-115.
 Kumarasamy N, Venkatesh K, Cecelia A, Devaleenol B, Saghayam S, et al. (2008) Gender-based differences in treatment and outcome among HIV patients in South India. J Womens Health (Larchmt) 17(9): 1471-1475.
 Nicastri E, Angeletti C, Palmisano L, Sarmati L, Chiesi A, et al. (2005) Gender difference in clinical progression of HIV-1 infected individuals during long-term highly active antiretroviral therapy. AIDS 19(6): 577-583.
 Tsegaye A, Messele T, Tilahun T, Hailu E, Sahlu T, et al. (1999) Immunohematological reference ranges for adult Ethiopians. Clin Diagn Lab Immunol 6(3): 410-414.
 Nuredin I (2007) Evaluation of factors affecting chance of survival/death status among HIV-positive people under the Anti-Retroviral Treatment Program: The Case of Adama Hospital.
 Chasombat S, McConnell M, Siangphoe U, Yuktanont P, Jirawattanapisal T, Fox K, et al. (2009) National expansion of antiretroviral treatment in Thailand, 2000-2007: program scale-up and patient outcomes. J Acquir Immune Defic Syndr 50(5): 506-512.
 Nwokedi E, Ochicha O, Mohammed A, Saddiq N (2007) Baseline CD4 lymphocyte count among HIV patients in Kano, Northern Nigeria. Afr J Health Sci 14: 212-215.
 Stohr (2007) Factors affecting CD4+ T-lymphocyte count response to HAART in HIV/AIDS patients. HIV medicine journal 8(7): 135-141.
 Grabar S, Weiss L, Costagliola D (2006) HIV infection in older patients in the HAART era. J Antimicrob Chemother 57(1): 4-7.
 Perez-Hoyos S, Rodríguez-Arenas MA, García de la Hera M, Iribarren JA, Moreno S, et al. (2007) Progression to AIDS and death and response to HAART in men and women from multicenter hospital-based cohort. J Womens Health (Larchmt) 16(7): 1052-1061.
 Kassahun W, Neyens T, Molenberghs G, Faes C, Verbeke G (2014) Modeling Hierarchical Data, Allowing for Overdispersion and Zero Inflation, in Particular Excess Zeros. Stat Med.
 Molenberghs G, Verbeke G, Demétrio C, Vieira A (2010) A family of generalized linear models for repeated measures with normal and conjugate random effects. Statist Sci 25(3): 325-347.
 Wedderburn R (1974) Quasi-likelihood functions, generalized linear models and the Gauss-Newton method. Biometrika 61(3): 439-447.
 Breslow N (1984) Extra-Poisson variation in log-linear models. Applied Statistics 33(1): 38-44.
 Hinde J, Demétrio C (1998)a Overdispersion: Models and Estimation, São Paulo: Associação Brasileira de Estatística.
 Hinde J, Demétrio C (1998)b Categorical Data Analysis, São Paulo: Associação Brasileira de Estatística.
 Booth J, Casella G, Friedl H, Hobert J (2003) Negative binomial log-linear mixed models. Stat Model 3(3): 179-191.
 Kassahun W, Neyens T, Molenberghs G, Faes C, Verbeke G (2012) Modeling overdispersed longitudinal binary data using a combined beta and normal random-effects model. Arch Public Health 70(1): 7.

