 Research
 Open Access
 Published:
Timehomogeneous Markov process for HIV/AIDS progression under a combination treatment therapy: cohort study, South Africa
Theoretical Biology and Medical Modellingvolume 15, Article number: 3 (2018)
Abstract
Background
As HIV enters the human body, its main target is the CD4 cell which it turns into a factory that produces millions of other HIV particles. These HIV particles target new CD4 cells resulting in the progression of HIV infection to AIDS. A continuous depletion of CD4 cells results in opportunistic infections, for example tuberculosis (TB). The purpose of this study is to model and describe the progression of HIV/AIDS disease in an individual on antiretroviral therapy (ART) follow up using a continuous time homogeneous Markov process. A cohort of 319 HIV infected patients on ART follow up at a Wellness Clinic in Bela Bela, South Africa is used in this study. Though Markov models based on CD4 cell counts is a common approach in HIV/AIDS modelling, this paper is unique clinically in that tuberculosis (TB) coinfection is included as a covariate.
Methods
The method partitions the HIV infection period into five CD4cell count intervals followed by the end points; death, and withdrawal from study. The effectiveness of treatment is analysed by comparing the forward transitions with the backward transitions. The effects of reaction to treatment, TB coinfection, gender and age on the transition rates are also examined. The developed models give very good fit to the data.
Results
The results show that the strongest predictor of transition from a state of CD4 cell count greater than 750 to a state of CD4 between 500 and 750 is a negative reaction to drug therapy. Development of TB during the course of treatment is the greatest predictor of transitions to states of lower CD4 cell count. Transitions from good states to bad states are higher on male patients than their female counterparts. Patients in the cohort spend a greater proportion of their total followup time in higher CD4 states.
Conclusion
From some of these findings we conclude that there is need to monitor adverse reaction to drugs more frequently, screen HIV/AIDS patients for any signs and symptoms of TB and check for factors that may explain gender differences further.
Background
The life cycle of HIV starts as it enters the human body, its major target being a white blood cell called Thelper cells or CD4 cells [1]. Once these cells are infected, HIV takes over and turns them into factories that produce thousands of copies of the virus. The HIV makes use of the enzyme Reverse Transcriptase to change copies of its Ribonucleic Acid (RNA) into Deoxyribose nucleic Acid (DNA). The viral DNA then enters the nucleus of the host cell and combines with cell DNA and starts making copies of viral RNA. The enzyme protease helps in assembling the viral particles into thousands of new viruses, which will bud and destroy the host cell. These new viruses will then be ready to attack other CD4 + T cells. Hence, the importance of CD4+ T cell count in understanding the progression of HIV. Depreciation of the CD4+ T cells in the human body leads to the deterioration of the human immune system, which is why the virus is called the Human Immunodeficiency Virus (HIV).
As the immune system is compromised, the individual is now prone to opportunistic infections like tuberculosis (TB). TB may occur at any stage of HIV disease and is frequently the first recognised presentation of underlying HIV [2]. HIV and TB coinfection is characterised by challenges that include poor adherence and overlapping toxicities resulting in an impaired CD4 Tcell recovery with antiretroviral therapy (ART) due to the effect of drugdrug interaction [3].
The need to address the challenges associated with HIV/AIDS progression in the presence of TB coinfection has prompted this study and also to analyse HIV disease history based CD4 multistates and death/loss to followup in a single model. However, most studies use KaplanMeier analysis and Cox proportional hazards regression models [3,4,5]. KaplanMeier survival methods and Cox proportional hazards regression are commonly employed tools to model mortality and time to viral suppression and/or subsequent rebound and occasionally used to model time to CD4 recovery. However, survival models are not appropriate for all studies, particularly in the presence of competing risks and when multiple or recurrent outcomes are of interest. In particular, when modelling HIV/AIDS progression, Markov models are relatively straight forward to analyse both CD4 stage and death or loss to followup within a single model which survival models fail to do. Markov models can accommodate censored data, competing risks (informative censoring), multiple outcomes, recurrent outcomes, frailty and nonconstant survival probabilities [6]. Examination of the conditions of the stochastic processes at various points in time, categorisation of the conditions, and examination of the external influences on the stochastic processes can be done using Markov models [7]. Markov models are favourable to the modelling of diseases in particular cases where the disease is grouped into a set of exhaustive and mutually exclusive health states, thereby forming a multistate model [8]. History is naturally generated as the multistates evolve over time; it contains information on previous visits, time of entry into various states, and the length of stay in states.
Continuoustime homogeneous Markov models have been used since early in the epidemic to model disease progression of HIV/AIDS patients, and there has been some recent renewed interest in the use of these models. In 1989 Longini et al. used a 5state Markov model based on the clinical indicators of the HIV disease progression [9]. In 1998 Alioum et al. estimated the effects of gender, age, mode of transmission and ART on HIV progression using a 3state Markov model [10]. In 2011 Reddy [11] carried out a research almost similar to Alioum et al. in South Africa. However, Reddy used a 5state Markov model with 4 CD4 based transient states followed by the absorbing state, ARV initiation. Reddy’s model is characterised by high rates of immune deterioration since the study was carried out on ARV naïve patients. In 2009, Binquet et al. used a multistate Markov model to analyse the impact of gender, intravenous drug use, weight loss, low haemoglobin, CD8 cell count and HIV viral load on HIV evolution in the era of highly active antiretroviral therapy (HAART) [12]. Recently, in 2013 Grover et al. assessed the impact of ART using a 5staged multistate Markov model and went further to examine the effects of explanatory variables; age, sex and mode of transmission on the transition rates [13].
In this study, we use 7staged continuoustime Markov model to assess the disease progression of HIV/AIDS patients receiving ART from a clinic in BelaBela, South Africa. The first 5 stages are based on CD4 cell counts and the end points are either death or withdrawal from study. In addition to the gender and age differences in HIV/AIDS progression, we further assess the effects of having TB as the initial marker of HIV/AIDS, developing TB during the course of treatment, developing some adverse effects to treatment (Reaction), CD4 baseline and viral load baseline. Though Markov models based on CD4 cell counts is a common approach in HIV/AIDS modelling, this paper is unique clinically in that tuberculosis (TB) coinfection is included as a covariate.
In medical research, the state of the patient at observation time is the only thing known with certainty. The researcher may know the time interval in which a transition has occurred, but not the exact time. Thus, homogeneous Markov models which are interval censored can handle such data [14]. The transition intensities, probabilities and the distribution functions associated with the times are the basic building blocks of the Markov processes [15]. For a continuoustime Markov model, transitions can occur at any (realvalued) time instant. For a timehomogeneous Markov jump process, the holding time in state i are modelled using exponential distributions. The exponential distributions may be adequate for many reallife situations, for example, time until death, and waiting time before moving to another state. However, the exponential distributions are memoryless continuous distributions, hence a limitation in the application of Markov processes. The ‘memoryless’ property could be seen as a problematic assumption in this setting. It is likely the case that patients starting on ART who respond well to treatment will continue to respond well to treatment  contradicting the Markov assumption and memoryless property.
Transition probabilities for continuoustime homogeneous models only depend on the difference between the two observation times. That is, for all t ≥ 0 the probability of moving from state i to state j is given by:
This is the Markov property, where F_{ t } is the natural Filtration of the stochastic process. P[X_{ t } = j F_{ t }], therefore, represents the probability that the stochastic process X_{ t } is in state j at time t given the history of the process up to time t. The Markov property implies that all the history of the process is contained in the state currently occupied, X_{ s } = i. The transition probabilities of a continuous time homogeneous Markov process X_{ t }, t ≥ 0 is given by:
The equations obey the ChapmanKolmogorov equations:
In this paper we describe, using the theory of continuous time Markov processes, and using real data on an evolving disease such as AIDS. Also, the effects of covariates, including TB, on baseline transition rates is considered. Models with and without covariates are fitted and compared using the likelihood ratio test.
The next section explores the methods of Markov modelling and an illustrative case study on HIV progression is given. In this section, data used in the analysis is described and we explain formulation of the model based on the data. This is followed by a section on the results and discussions. The final section concludes on the findings.
Methods
A continuoustime homogeneous Markov model
Formulation of the continuoustime homogeneous model is done by considering transition probabilities over narrow interval of time ∆t. In this study \( \Delta t=\frac{1}{2} \) year making it appropriate to assume that transition rates over these intervals are constant. These transition rates, also known as transition intensities or forces of transition, are the fundamental concept in continuous time Markov jump processes. They can take values greater than 1, unlike probabilities. In order to differentiate the transition probabilities and avoid technical problems with mathematics, the assumption is that the functions p_{ ij }(t) are continuously differentiable and are subject to the initial condition:
δ_{ ij } is a Kronecker delta, p_{ ii }(0) = 1 means that at t = 0 the system maintains its original state and p_{ ij }(0) = 0 means that there is no change of state when no time elapses. The force of transition from state i to j is defined as:
α_{ ij }, for i = 1, …, 5 and j = 1, …, 7, does not vary over time and satisfies the following conditions; \( {\sum}_{j\in X}{\alpha}_{ij}=0 \) and \( {\alpha}_{ii}={\sum}_{i\ne j}{\alpha}_{ij} \).
Once the transition intensities are known, the transition probabilities can be obtained by solving a system of differential equations known as the Kolmogorov’s forward equation subject to the initial conditions stated in eq. (2). The Kolmogorov’s forward equation is as follows:
where k is a state that the system can pass through as it makes a transition from state i to state j. The time homogeneous models are fitted for this data to assess effectiveness of the treatment by comparing the forward transition and the reverse transitions. This then lead to building of models that allow transitions in both directions.
Data description
The model is initially applied on 319 HIV patients on antiretroviral therapy (ART) from a Wellness clinic in Bela Bela, South Africa, from year 2005 to year 2009. Two hundred and twentyseven of these patients were females and 92 were males at treatment commencement (t = 0). After 3 years of treatment uptake, 173 females and 71 males were remaining in the study. Thirtyeight females had died and 16 withdrew and their status was not known after 3 years of treatment up take. Nineteen of the males died during the first three years and two had withdrawn and it was not known whether they were alive or dead. A 2year old (subject 81) together with subject 82 were detected by the residuals plot as an outlier and it was removed from the analysis meaning that the remaining 317 patients were used for analysis. About 50 and 65% of the female and male deaths respectively occurred during the first 6 months of treatment uptake. The interquartile range of patient ages is (33; 47.5) years with a mean and median age of 39.53 and 40 years respectively. The ages were negatively skewed (skew = −0.24) which means that there were more younger patients than older patients in this cohort.
At time t = 0 there were 242 individuals with CD4 baseline (CD4BL) cell count below 200, 59 individuals with CD4 cell count between 200 and 350, 11 individuals with CD4 cell count between 350 and 500, 6 individuals with CD4 cell count between 500 and 750 and 1 individual with CD4 cell count above 750. At t = 0 the CD4 cell count had mean of 156 cells/mm^{3}, a median of 116 cell/mm^{3} and the maximum CD4 cell count was 1202 cells/mm^{3}. The mean viral load baseline (VLBL) for these patients was 105,573.35 copies/mm^{3} and it ranged from 56 to 818,600 copies/mm^{3}. The median viral load was 58,523.00 copies/mm^{3}. From these individuals 155 had a WHO stage baseline (WSBL) of 4 which is related to severe HIV symptoms. WSBL is the categorisation of HIV/AIDS at baseline basing on the clinical markers as defined by World Health Organisation (WHO).
Although some individuals developed TB (DTB) during the course of treatment, 109 patients had TB as an initial marker of HIV. From the individuals who had TB before (TBB4) commencement of antiretroviral therapy (ART), 66 had a CD4 baseline below 200cells/mm^{3}, 20 had a CD4 baseline between 200 and 350cells/mm^{3}, 2 had CD4 baseline between 350 and 500cells/mm^{3}, 2 between 500 and 750cells/mm^{3} and 19 had unknown CD4 baseline. These patients completed their TB treatment before commencement of ART. Fiftytwo patients developed TB during the treatment period and 12 of these patients had TB before commencement of treatment. During the first 6 months of treatment uptake, 35 patients died and from these deaths, only five were attributed to having TB before commencement of ART.
A combination therapy was administered to all HIVinfected individual in the cohort. The therapeutic intervention inhibits the actions of reverse transcriptase enzyme and/or protease of new infectious free HIV by the HIVinfected cell. The drug regimens at t = 0 were mainly a combination of d4T3TCEFV (administered to 207 patients) and d4T3TCNVP (administered to 83 patients). The second line regimens were mainly a combination of AZT3TCEFV/NVP and were given to patients who developed some adverse reaction. These second line regimens were frequently used from 2 to 4 years posttreatment commencement. The therapeutic intervention lowers the number of infectious free virus particles in the circulation, and in some cases to beyond detection. This results in a reduction on the density of infected cells, causing a rise on the CD4 cell count of infected individuals. So generally the CD4 cell count of an individual receiving therapeutic intervention is expected to rise to well above 500 cell/mm^{3}, assuming a proper adherence to treatment. Hence the use of increase in CD4 cell count as the marker of efficacy of treatment.
During the course of treatment, some individuals developed some adverse reaction (React) to treatment. For these individuals the adverse reactions were treated and drugs administered to them were changed. Change of treatment was also based on the viral load monitoring.
For the purpose of analysis, the variables are coded into the model as follows:
WSBL  Gender  Age  CD4BL  VLBL  DTB  TBB4  React  

1  4  Male  ≤ 40years  ≤ 350  ≥10 000  Yes  Yes  Yes 
0  other  Female  > 40years  > 350  < 10 000  No  No  No 
Model formulation
At any time t + ∆t, the state of an HIVinfected individual is defined basing on the CD4 cell count level or whether the individual is dead or has withdrawn as follows:
State 1CD4 ≥ 750  State 2–500 ≤ CD4 < 750 

State 3–350 ≤ CD4 < 500  State 4–200 ≤ CD4 < 350 
State 5CD4 < 200  State 6Death 
State 7Withdrawal. 
Basing on these seven states, progression of HIV positive individuals on treatment is defined by the state diagram on Fig. 1 below. The arrows in the diagram show possible transitions between the seven states defined above.
The information in Fig. 1 shows that state 6 and 7 are absorbing states hence no transitions from these states. As HIV progresses in an individual’s body, there is a possibility of an individual being in the same state in consecutive visit times.
Basing on the classification above, Table 1 summarises transition counts that took place for the whole period of study 2005 to 2009.
Table 1 shows that, transition counts from state i to i ± j are higher for all the values in which j = 1 than for j > 1 where i, j ∈ { 1, …, 5}. As a result a bidirectional model is proposed which defines transitions from state i to i ± 1 or from i to j = 6; 7.
The model is formulated basing on the assumptions that between times (t, t + ∆ t), where ∆t is a very small value, there is a transition from any one of the states i = 1, 2, …, 5 (transient states) to state j = 1, 2, …, 7 defined as follows:

CD4 cell count of an individual is expected to rise due to efficacy of treatment at a rate of α_{ ij }, where j = i − 1;

Some individuals fail to adhere to treatment therapy. These individuals can move to a state of lower CD4 cell count at a rate of α_{ ij }, where j = i + 1;

From any state i = 1, 2, …, 5 an infected individual can die (state 6) at a rate of α_{i6};

An individual in state i = 1, 2, …, 5 can decide to withdraw (state 7) at a rate of α_{i7};

An individual can remain in the same state at a rate of α_{ ii } = − λ_{ i } = − (α_{i, i − 1} + α_{i, i + 1} + α_{i6} + α_{i7}). This is based on the fact that the sum of transition rates from any state is equal to zero.
These assumptions can be represented by the following transition rate matrix Q(t):
Once the transition rate matrix has been obtained, the matrix of transition probabilities can be obtained using Kolmogorov’s forward differential equations defined in (3). This yields the following differential equations for the Markov jump processes:
Equations (4) to (10) represent all the possible transition probabilities from state i, for i = 1, 2, ...5, to state j = 1, …, 7. p_{ ij }(t) represents the probability that a patient in state i makes a transition to state j and its coefficients represent the transition rates. For example, in equation (4), −(α_{12} + α_{16} + α_{17}) = α_{11}. These states denoted by i are defined based on the CD4 cell count grouping. So there is a possibility of a backward or forward movement transition between transient states due to failure or efficacy of treatment respectively. There is no possible transition from state i = 6 and state i = 7 because these states are absorbing states where i = 6 represents death of an infected individual and state i = 7 represents withdrawal from treatment by an infected individual. All the analysis is done using the package ‘msm’ for multistate modelling in R software. The package was developed by Jackson in 2011 [16].
Results and discussions
Estimation of the transition rate matrix
Estimation of the transition intensities is done using the method of maximum likelihood to estimate the transition intensities. The likelihood, L, is given by:
where t_{i } : i = 1, 2, …, 7, is the total number of observed waiting/holding time in state i, \( {\alpha}_{ii}={\sum}_{i\ne j}{\alpha}_{ij} \) and n_{ ij } is the number of transitions observed from state i to state j. The estimates are obtained by taking the logarithm of the likelihood and differentiating this with respect to each of the transition intensities α_{ ij }'s. This leads to the maximum likelihood estimates of the transition intensities as \( {\alpha}_{ij}=\frac{n_{ij}}{t_i} \), where n_{ ij } is the number of transitions from state i to state j, and t_{ i } the total observed waiting/holding time in state i.
The plot of residuals for each of the individuals in the study was drawn to identify the outliers (subjects with higher influence) in the data. Once the outliers are identified they can simply be deleted and the model is refit. According to Titman in 2007 [17] residuals for multistate models can be determined as follows;
If n subjects and a parameter vector θϵΘ, with maximum likelihood estimator based upon the whole data \( \widehat{\theta} \). Let \( {\widehat{\theta}}_{(j)} \) represent the estimate with subject j deleted. Thus the quantity \( {\widehat{\theta}}_{(j)}\widehat{\theta} \) for j = 1, …, n is of interest. The influence of each point on each parameter can be compared separately and to get a measure of the overall influence of a particular subject we take the scalar quantity;
where I(θ) is the observed Fisher information matrix at the maximum likelihood estimates for the full data. Consider the contribution to the score function of each subject evaluated to the maximum likelihood estimate for the full model. Highly influential subjects will have scores of high magnitude. For a single subject, the score residual is given by an analogous scalar measure:
where \( {U}_j\left(\widehat{\theta}\right) \) is the vector of first derivatives of the loglikelihood for that subject at maximum likelihood estimates θ. That is, \( U\left(\theta \right)=\frac{\partial l}{\partial \theta}\left(\theta \right), \) is determined using the derivative of the transition probability matrix P(t) with respect to θ. These derivatives were given by Kalbfleisch and Lawless [18]. The residuals plot displays the residuals for each subject in the order labelled by subject identifiers. Subjects with a higher influence on the maximum likelihood estimates will have higher score residuals [16]. The plot helps to identify any outliers in the data. Figure 2 below shows the plot of residuals.
Results from Fig. 2 show that patients with ID numbers 81 and 82 are outliers as indicated by their positions from the rest of the patients in the cohort. The corresponding residuals for these values are 1.58315799 and 1.58315999, respectively compared to the rest of the subjects whose residuals below 1. Patient number 81 is a 2 year old enrolled whilst in state 1 and maintained the state throughout the study period. Patient number 82 was enrolled whilst in state 5 and, during the third visit, was already in state 1 and maintained it throughout the study period. These patients are excluded from the analysis leaving us with 317 subjects. Table 2 shows transition intensities α_{ ij } for i = 1, 2, …, 5 and j = 1, 2, …, 6, 7 for the two models, one with outliers and the other one without outliers. The corresponding confidence interval is also given for each transition intensity. The state space is X_{ t } = {1, 2, 3, 4, 5, 6, 7}.
The results from Table 2 show narrow confidence intervals which is an indication that the suggested continuous time Markov model gives a precise estimate of the data. Results from Table 2 show that transitions to better CD4 cell count states are higher than transitions to worse CD4 cell count states which is an indication of efficacy of ART. The model with outliers has got a higher loglikelihood than the model without outliers as expected since the model with outliers has got a greater dimension. A further analysis on the transition intensities was also done for each of the CD4 baseline (CD4BL) and WHO stage baseline (WHOSBL) levels coded as follows:
The results are shown in Appendix 2 and 3 for CD4BL and WHOSBL respectively. The results from Appendix 2 show that transition rates to CD4 recovery (2 to 1, 3 to 2, 4 to 3 and 5 to 4) were high for patients who initiated therapy when their CD4 baseline level was well above 350 per mm^{3}. These rates of CD4 recovery decrease with as the CD4 cell count at treatment initiation decrease with a baseline CD4 cell count below 200 per mm^{3} recording the lowest rates of CD4 recovery. The results from Appendix 3 show that regardless of the WHO stage baseline, transition rates to CD4 recovery are higher than transition rates to CD4 deterioration. The rates of CD4 recovery are the highest for transitions from state 5 to state 4. Transition rates to state 6 (death) are the highest for those individuals who had severe HIV symptoms (WHOBL = 4) and these intensities decrease as the symptoms decrease from severe to asymptomatic levels.
Expected holding times
The expected holding time in each state also known as the mean sojourn time describes the average time an individual spends in each state in a single stay before he/she makes a transition to another state. The mean sojourn time in each state i for i = 1, 2, …, 5, is estimated as \( \frac{1}{\lambda_i} \), where \( {\lambda}_i={\sum}_{i\ne j}{\alpha}_{ij} \) is the total force of transition out of state i. For example, the expected holding time in state 1 is 1/(0.887 + 0.1016 + 0.1418) ≈ 0.8844 as shown in Table 3 below:
Results from Table 3 show estimates of the holding time, the standard error (SE), the lower bound (L) and the upper bound (U) for each of the transient state i. From the results, if an individual is in state 5 (corresponding to a CD4 count below 200cell/mm^{3}) he spends more time in that state before making a transition to other states. This could be due the time taken by an individual to respond to treatment since state 5 is the worst state in HIV/AIDS progression.
The jump chain
This is when a Markov process is observed at the times it makes transitions to a new state. In other words a jump chain is a stochastic matrix R of probabilities where each row sums to one, on the state space X_{ t }, which gives the conditional probability of the next state an individual goes to after leaving state i. If α_{ ii } > 0 then given that there is a jump to a different state, it means we never stay in state i, we make a jump out resulting in having R_{ ii } = 0 and if α_{ ii } = 0 then we never leave state i meaning that R_{ ii } = 1 (States 6 and 7). The computed matrix of probabilities of each state being next (also known as the jump chain), together with the mean sojourn times in each state, fully define a continuoustime Markov model. This is a more intuitively meaningful description of a model than the transition intensity matrix. The matrix for the probabilities that the next state after state i is state j is approximated as \( {p}_{ij}=\frac{\alpha_{ij}}{\lambda_i} \) , for each i and j such that i ≠ j. α_{ ij } is the force of transition from state i to state j and α_{ ii } is the total force of transition out of state i. For example, \( {p}_{12}=\frac{\alpha_{12}}{\lambda_1}=\frac{0.8872}{0.8872+0.1016+0.1418}=0.7847 \), as shown in the matrix below. The results are shown Table 4 below:
The results from Table 4 show that R_{i, 1 − 1} > R_{i, i + 1}, which shows that the probability of jumping to a better state is higher than the probability of jumping to a worse state. This is more pronounced for individuals in state 5 where the probability of jumping to state 4 (recovery) is 0.8123 which is very high compared to probability of making a jump to state 6. This is an indication of the effectiveness of treatment. Probability of the death state being next is the highest for those patients with CD4 counts less than 500. These probabilities increase with the decreasing number of CD4 counts.
Forecast of the total length of stay in each state
We need to forecast the total time spent in the good states and the bad states by individuals who are on HIV treatment before death or withdrawal from the study. Estimates of the forecasted total lengths of time spent in each state j between two future time points t_{1} and t_{2} are estimated using the formula:
where i is the state at the start of the process, which defaults to 1. The results are shown below:
State1  State2  State3  State4  State5  State6  State7 

8.988960  8.806075  7.767124  3.520485  1.153648  Inf  Inf 
The results show that each individual is forecasted to spend approximately 8.99 half years in state 1, 8.8 half years in state 2, 7.77 half years in state 3, 3.52 half years in state 4 and finally 1.153 half years in state 5. These results show that HIV positive individuals on treatment are expected to spend more time in good states compared to the time spent in bad states.
Percentage prevalence for the model without covariates.
Using the fitted timehomogeneous Markov model, the percentage prevalence were plotted to compare the expected values with the observed values. The results are shown in Fig. 3 below:
The results from Fig. 3 show that for the state i = 1, …, 6 the expected prevalence fit the observed data perfectly well except for the withdrawal state where the expected prevalence overestimate the observed. The plots further show a sharp decrease on state 5 percentage prevalence with the fitted model, underestimating the model for observed data up to time = 7 half years. The percentage prevalence for the death state is increasing at a slow rate and from time = 2 half years to time = 8 half years the percentage prevalence is stable.
Effects of covariates on transition intensities
A continuoustime Markov model for the effects of covariates; Age, CD4BL, VLBL, WSBL, Reaction, DTB, TBB4 and Gender is fitted. Identification of covariates that have a significant contributory effect is done by entering each covariate one after the other and performing the likelihood ratio test in comparison to the model without covariates. All the other variables proved to be significant to the progression except for the variable gender which could not be eliminated because of its demographic importance. The baseline transition intensities \( \left({\alpha}_{ij}^{(0)}\right) \) relate to the transitions from state i to state j. Baseline transition intensities and linear effect of each of the covariates is estimated and the results are shown in two separate Table 5 and Appendix 1 respectively:
The fitted time homogeneous model with covariates has 2xLL = 3699.259, which represents an improvement of 242.712 compared to the model without covariates. A Likelihood ratio test is performed to compare the two nested models that were fitted, the one without covariates and the other with covariates. The value of the \( LRT=2 lo{g}_e\left(\frac{L_s\left(\widehat{\theta}\right)}{L_g\left(\widehat{\theta}\right)}\right) \) where \( {L}_s\left(\widehat{\theta}\right) \) is the simple model (no covariates) and \( {L}_g\left(\widehat{\theta}\right) \) is the general model (with covariates). A likelihood ratio test statistic of 1770.618 is compared to a χ^{2} distribution with 144 degrees of freedom. The test was performed and the results are shown below:
−2logLR  Df  pvalue  

with.covariates  1770.186  144  10^{−4} 
The results show that the model with covariates fits significantly better than the model without covariates.
Hazard ratios of covariates on transition intensities
In this section the hazard ratios for each of the covariates; VLBLviral load baseline, DTBdevelop TB during treatment period, TBB4develop TB before treatment, Gender, Reactreaction to treatment, CD4BLCD4 baseline, WSBLWHO stage baseline and Age are estimated. The relationship between these covariates and the transition intensities is defined by the following equation:
where Z = [VLBL, DTB, TBB4, Gender, React, CD4BL, WSBL, Age ] is a k = 8dimensional vector of covariates and β_{ ij } is a vector of k regression parameters relating the instantaneous rate of transitions from state i to state j to the covariates Z and baseline intensities \( {\alpha}_{ij}^{(0)} \) relating to the transition from state i to state j as shown in Table 5 above. Estimates of β_{ ij }’s, regression coefficients, were calculated and the results are shown in Appendix 1. The regression coefficients can be interpreted similarly to those in the proportional hazards regression model [19]. The results are shown in Table 6.
The 2xLL for the model fitted in Table 6 is 3699.259. The results show that the strongest predictor of transition from state 1 to 2 is a negative reaction to treatment, which has a hazard ratio of 4.715. This means that patients who developed some form of reaction were over 4 times more likely to transit from a level of CD4 ≥ 750 to a level of 500 ≤ CD4 < 750 than patients who did not react to treatment. However, from all the other states, hazard ratios for the patients who reacted to treatment are higher for immune recovery than for immune deterioration.
The strongest predictor of immune deterioration from a CD4 level between 350 and 500 to a CD4 level between 200 and 350 (3 to 4) is developing TB during treatment, with a hazard ratio of over 2. Developing TB is also the strongest predictor of immune deterioration from 4 to 5, with a hazard ratio also greater than 2. This means that TB is the major cause of further immune deterioration when the immune system is too weak. Hence the recommendation that HIV patients should continuously have their TB status checked. Those individuals who had TB before enrolment had the strongest predictor for the transition from state 3 to state 6. These patients had a hazard ratio of over 2 times more likely to die from state 3 than those who were enrolled without having TB. However, for these individuals, transitions to better states were generally higher than transitions to worse states for almost all states.
A hazard ratio of 6.46 for the predictor variable male shows that males were over 6 times more likely to transit from state 2 to 3 than their female counterparts. The hazard ratios of males from a bad state to a better state are less than 1, which is an indication that males are less likely to respond to treatment compared to females.
The hazard ratios for the transitions to a better state for patients who were enrolled with CD4 counts below 350 are less than one, but hazards to worse states are greater than one, an indication that starting treatment when the CD4 levels are below 350 retards immune recovery. The transitions to the death state for individuals who started treatment when they were on the WHO stage of 4 are all more than one, meaning that starting treatment with a WHO stage of 4 is a leading cause of being absorbed in the death state.
Percentage prevalence for the model with covariates
The prevalence for the model with covariates were plotted to examine areas of poor fit of the timehomogeneous model with covariates. The plots are shown in Fig. 4.
Figure 4 confirms that the inclusion of covariates on the model improves the fitness of the model since the expected prevalence is now perfectly closer to the observed prevalence for all states than the model without covariates.
Conclusion
In this paper a continuoustime homogeneous Markov model is fitted to explore predictors of HIV/AIDS progression for patients on antiretroviral therapy. A continuoustime homogeneous model is fitted with and without covariates and comparison of these two models is done using the likelihood ratio test. Parameters that define progression of HIV/AIDS were estimated and these include transition intensities, mean sojourn times and probability of each state being next or jump chains. The fitted model is used to analyse the effects of the covariates on the transition intensities. These covariates were reaction to treatment, development of TB during treatment and gender among others.
Results from the likelihood ratio test show that the model with covariates provides a better fit than the model with no covariates with a pvalue =10^{{−4}}. The results show that transition rates to immune recovery are generally higher than the transition rates to immune deterioration. However, the results show that the strongest predictor of immune deterioration from state 1 (CD4 cell count greater than 750) to state 2 (CD4 cell count between 500 and 750) is reaction to treatment. These patients are 4 times more likely to transit from state 1 to state 2 than those who did not react to treatment.
Patients who developed TB during the course of treatment have higher chances of immune deterioration than immune recovery compared to those who did not develop any TB coinfection. These incidences are quite high for transition from state 4 (CD4 cell count between 200 and 350) to state 5 (CD4 cell count below 200). For these states the immune system is still weak. As a result patients on antiretroviral drugs should consistently be screened for TB coinfection. Patients who had initially been diagnosed with TB before commencement of ART recover better from HIV/AIDS disease except that transitions to death for patients with CD4 cell count between 350 and 500 cells/mm^{3} are two times higher than that of patients who were not initially diagnosed with TB.
From this cohort, transitions to bad states are higher for males than for their female counterparts. This is quite pronounced on transitions from state 2 (CD4 cell count between 500 and 750) to state 3 (CD4 cell count between 350 and 500) where the hazards for males are 6 times that of females. This result is consistent with the findings from Maskew and others, they discovered that men gain fewer CD4 cell counts than did women [20]. An assessment of published studies by Castillo and others [21] from both resourcelimited and resourcerich countries suggest an improved survival outcomes for females than males. However, the studies they assessed do not show a clear sex disparity in the disease progression or in treatment effects of viral suppression and immunologic recovery.
The results from the fitted model show that the rates of immune recovery were much higher than the rates of immune deterioration which is an indication of effectiveness of treatment. Patients who started treatment when their CD4 baseline was at least 350 had higher rates of immune recovery than those who had a lower CD4 baseline. This result is commensurate with the findings from Moore and Keruly who also discovered that patients with baseline CD4 cell count above 350 cells/mm^{3} returned to nearly normal CD4 cell count after 6 years [22]. The probability of dying increases with decreasing CD4 count of the individual at enrolment. This is supported by the findings of [23,24,25], who also concluded that being in the AIDS defining stage leads to the highest probability of reaching the death state.
The mean sojourn times revealed that patients take longer time in the AIDS defining states (CD4 cell count below 200) before they move to the other states. Research has also shown that CD4 cell count rises gradually despite the suppressed viral load particularly in older patients. Hence, there is need to use both CD4 cell count and viral load in monitoring the efficacy of treatment. The younger people below the age of 40 have higher chances of immune recovery than the older ones. This finding is supported by some previous studies who concluded lower mean CD4 increases for older patients than younger patients [20, 26]. Alioum and others further argued that this could be caused by the fact that older subjects may have a reduced capacity to generate CD4 cells in response to the viral killing [10].
Although continuous time Markov models can handle multiple or recurrent outcomes compared to the Kaplan Meier analysis and Cox proportional hazards models, the assumption of constant hazard function that is frequently unrealistic [27] and puts limitations on the disease history behaviour [28], especially on HIV/AIDS progression for patients on ART. Some studies have shown that if a patient responds well to treatment and manages to achieve viral load suppression within the first 6 months, that patient is likely to continue responding well to treatment [29]. This goes against the Markov and memoryless properties of the models. Thus a limitation in the application of time homogeneous Markov processes.
References
 1.
Perelson AS, Nelson PW. Mathematical analysis of hiv1 dynamics in vivo. SIAM Rev. 1999;41(1):3–44.
 2.
Sonnenberg P, Glynn JR, Fielding K, Murray J, GodfreyFaussett P, Shearer S. How soon after infection with HIV does the risk of tuberculosis start to increase? A retrospective cohort study in south African gold miners. J Infect Dis. 2007;191:150–8.
 3.
Cingolani A, Lepri AC, Castagna A, Goletti D, De Luca A, Scarpellini P, Fanti I, Antinori A, d’Armino Monfarte A, Girardi E. Impaired CD4 Tcell count response to combined antiretroviral therapy in antiretroviralnaive HIVinfected patients presenting with tuberculosis as AIDSdefining condition. CID. 2012;54(6):853–61.
 4.
Ndumbi P, Falutz J, Pai NP, Tsoukas CM, et al. PLoS One. 2014;9(4):e94018. https://doi.org/10.1371/journal.pone.0094018.
 5.
Phillips AN, Staszewski S, Weber R, Kirk O, Francidi P, Miller V, Vernazza P, Lundgren JD, Ledergerber B. HIV viral response to antiretroviral therapy according to the baseline CD4 cell count and viral load. JAMA. 2001;286:2560–7.
 6.
Abner EL, Nelson PT, Schmitt FA, Browning SR, Fardo DW, Wan L, Jicha GA, Cooper GE, Smith CD, CabanHolt AM, Van Eldik LJ, Kryscio RJ. Selfreported head injury and risk of latelife impairment and AD pathology in an AD center cohort. Dement Geriatr Cogn Disord. 2014;37:294–306.
 7.
Mullins LJ. Management and Organisational behaviour. 4th ed. London: Pitman Publishing; 1996.
 8.
Foucher Y, Mathieu E, SaintPierre P, Durand JF, Daures JP. A semiMarkov based on generalised Weibull distribution with an illustration for HIV disease. Biom J. 2005;6:1–9.
 9.
Longini IM, Clark WS, Byers RH, Ward J, Darrow WW. Statistical analysis of the stages of HIV infection using a Markov model. Stat Med. 1989;8:831–43.
 10.
Alioum A, Leroy V, Commenges D, et al. Effects of gender, age, transmission category, and antiretroviral therapy on the progression of human immuno virus infection using multistate Markov models. Group d'Epidemiologie clinique du SIDA en Aquitaine, Epidemiology. 1998;9(6):605–12.
 11.
Reddy T. HIV disease progression in South Africa using multistate Markov models. SACEMA. 2011; http://www.sacemaquarterly.com.
 12.
Binquet C, Le Teuff G, Abrahamovicz M, Mahboubi A, Yazdanpanah Y, Rey D, Rabaud C, Chirouze C, Berger JL, Faller JP, Chavanet P, Quantin C, Piroth L, Groupe InterCOrevih du NordEst (ICONE). Markov modelling of HIV infection evolution in the HAART era. Epidemiol Infect. 2009;137(9):1272–82. https://doi.org/10.1017/S0950268808001775. Epub 2009 Jan 12
 13.
Grover G, Gadpayle KA, Swain PK, Deka B. A multistate Markov model based on CD4 cell count for HIV/AIDS patients on antiretroviral therapy (ART). Int J Stat Med Res. 2013;2:144–51.
 14.
Gibson EL. Continuous time multistate models for survival analysis. WinstonSalem: Wake Forest University, Department of Mathematics; 2008.
 15.
Halim D. Maximum likelihood estimation for generalized semiMarkov processes. Discrete event dynamics systems: theory and applications. Dep Ind Eng. 1996;6(2):73–104.
 16.
Jackson CH. Multistate models for Panel Data: The msm package for R. J Stat Softw. 2013;38(8). http://www.jstatsoft.org/.
 17.
Titman, A.C., Model diagnostics in multistate models of biological systems, Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, 2007.
 18.
Kalbfleisch JD, Lawless JF. The analysis of panel data under a Markov assumption. J Am Stat Assoc. 1985;80:863–71.
 19.
Cox DR. The statistical analysis of dependencies in point processes. In Stochastic Point Processes. Wiley; 1972.
 20.
Maskew M, Brennary AT, Westreich D, McNamara L, MacPhail P, and Fox MP. Gender Differences in Mortality and CD4 Count Response Among HIVPositive Patients Virally Suppressed Within 6 Months of Antiretroviral Therapy Initiation. J Womens Health. 2012; 22(00). https://doi.org/10.1089/jwh.2012.3585.
 21.
Castilho JL, Melhekin VV, Sterling TR. Sex Differences in HIV Outcomes in the Highly Active Antiretroviral Therapy Era: A Systematic Review. AIDS Res Hum Retroviruses. 2014;30(5):446–56. https://doi.org/10.1089/aid.2013.0203.
 22.
Moore RD and Keruly JC. CD4+Cell Count 6 Years after Commencement of Highly Active Antiretroviral Therapy in Persons with Sustained Virologic Suppression CID. 2007; 44: 441–6.
 23.
Biadgilign S, Reda AA, Digaffe T. Predictors of mortality among HIV infected patients taking antiretroviral treatment in Ethiopia: A retrospective cohort study. AIDS Res Ther. 2012;9:15 https://doi.org/10.1186/17426405915.
 24.
Goshu TA and Dessie ZG. Modeling progression of HIV/AIDS disease stages using semiMarkov processes. 2013.
 25.
Lifson AR, Krantz EM. Grambsch PL. Macalino GE. CrumCianflone NF. Ganesan A, Okulicz JF, Eaton A, Powers JH, Eberly LE, Agan BK. Clinical, demographic and laboratory parameters at HAART initiation associated with decreased postHAART survival in a U.S. military cohort. AIDS Res Ther. 2012; 9(4).
 26.
Hasibi M, Hajiabdolbaghi M, Hamzelou S, Sardashti S, Faroughi M, Jozani ZB, Alinaghi SAS. Impact of age on CD4 response to combination antiretroviral therapy: a study in Tahran, Iran. Orld J AIDS. 2014;4:156–62.
 27.
Lange JM, Hubbard RA, Inoue LYT, Minin VN. A joint model for multistate disease processes and random informative observation times, with applications to electronic medical records data. Biometrics. 2015;71(1):90–101.
 28.
SaintPierre P, Combescure C, Daures JP and Godard P. The analysis of Asthma control under a Markov assumption with use of covariates. Statistics in Medicine. 2003;22(24):3755–3770.
 29.
Maartens G, Cotton M, Meintjes G, Mendelson M, Rabie H, Maharaj S. Clinical guidelines 9^{th} Edition. Copyright Aids for AIDS management (Pty) Ltd. 2012.
Acknowledgements
This study would not have been a success without the assistance of the Microbiology Department at the University of Venda in providing the secondary data through Professor Pascal O. Bessong.
Funding
Not applicable
Availability of data and materials
The data are available by contacting the corresponding author and can be submitted upon request.
Author information
Affiliations
Contributions
CS and DC drafted the manuscript. CS and DC contributed to the analysis and interpretation of the data. Both authors participated in critical revision of the manuscript drafts and approved the final version.
Corresponding author
Correspondence to Claris Shoko.
Ethics declarations
Authors’ information
CS is a PhD student and DC is a senior lecturer at the University of the Free State.
Ethics approval and consent to participate
Not applicable
Consent for publication
The authors give their consent to publish this work.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1
Appendix 2
Appendix 3
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Received
Accepted
Published
DOI
Keywords
 HIV/AIDS progression
 Homogeneous Markov models
 Reaction to treatment