 Research
 Open access
 Published:
A modified particle swarm optimization algorithm for parameter estimation of a biological system
Theoretical Biology and Medical Modelling volume 15, Article number: 17 (2018)
Abstract
Background
Mathematical modeling has achieved a broad interest in the field of biology. These models represent the associations among the metabolism of the biological phenomenon with some mathematical equations such that the observed time course profile of the biological data fits the model. However, the estimation of the unknown parameters of the model is a challenging task. Many algorithms have been developed for parameter estimation, but none of them is entirely capable of finding the best solution. The purpose of this paper is to develop a method for precise estimation of parameters of a biological model.
Methods
In this paper, a novel particle swarm optimization algorithm based on a decomposition technique is developed. Then, its root mean square error is compared with simple particle swarm optimization, Iterative Unscented Kalman Filter and Simulated Annealing algorithms for two different simulation scenarios and a real data set related to the metabolism of CAD system.
Results
Our proposed algorithm results in 54.39% and 26.72% average reduction in root mean square error when applied to the simulation and experimental data, respectively.
Conclusion
The results show that the metaheuristic approaches such as the proposed method are very wise choices for finding the solution of nonlinear problems with many unknown parameters.
Background
The parameter estimation of a biological model is a crucial step of a system description. These models are an approximation of the real phenomenon and some of their parameters do not have physical interpretation and their presence is to compensate the reductions and approximations in the model. These approximations often stem from the lack of our knowledge about the biological system.
The modelling of biochemical pathway is possible with the concurrent measurement of biochemicals. This is the result of latest developments in data acquisition technologies which provide us with abundance of time profiles of metabolites or proteins that can be used for mathematical modeling of biochemical networks [1].
The first step to mathematically model these phenomena is to specify a comprehensive framework that represent the underlying structure and convey the thorough information. The second step is to choose a promising approach to find a global solution for the unknown parameters of the model.
Frequently, the problem of biological parameter estimation are said to be NP hard, so they are multimodal and ill conditioned i.e. there are more than one true solutions for the estimated parameters that fit the model and produce the time course information [2]. Many algorithms have been developed to address this problem. Among these algorithms, the gradient optimization methods often fail to reach the true solution due to its high sensitivity to initialization [3]. The heuristic algorithms however have been shown to be able to solve the NP hard problems and overcome the problem of multimodality and achieve the global optimality [4].
Many researchers have investigated the problem of parameter estimation in biological systems. As the results of their works, algorithms with promising outcomes have been emerged. EKF, IUKF, alternating regression, simulated annealing, firefly, and genetic algorithms are some example [5,6,7,8,9,10]. In [11], the author has developed an improved particle filter method for estimation of the parameters of an Ssystem, and shows the superiority of this method over the ordinary particle filter algorithm from the accuracy and convergence rate point of view. Besides this method, Ding and et al. [12] have used a kalman filter based least square method for state estimation of canonical state space problem and a decomposition technique for enhancing the computational time. In [13], Moles and et al. have compared the global optimization methods for the problem of biochemical pathway modeling. In their paper, the authors have shown that the evolutionary algorithms such as metaheuristic methods are the most proper choices for solving the NP hard problems and preventing from local solutions. Some other papers have employed the evolutionary algorithms. For example, in [14], the author developed a hybrid firefly algorithm to estimate the parameter of a highly complex and nonlinear biological model. This method uses a neighborhood search by utilizing the evolutionary methods and compares its results in aspects of accuracy and computational time with the firefly and neldermead algorithms. In [15], the authors have demonstrated the capability of the evolutionary algorithms for estimation of the unknown parameters of a system in a reasonable amount of time. They proposed a parallel implementation of an enhanced Differential Evolution (DE) using Spark to reduce the computational time of the search process by including a selected local search and exploiting the available distributed resources. In [16], the author has modified the UKF algorithm to estimate the parameters of an Ecoli system. The method is called Iterative UKF and is designed to prevent from the early saturation of the filter gain. As a result the estimation error is decreased in comparison with the ordinary UKF algorithm.
One of the main problems of the parameter estimation of complex and highly nonlinear models is the capability of the method to estimate the global solution. In these models, as the fitness function is nonconvex, the number of the local solutions is large. This is the main reason of the algorithms such as the gradient base methods and optimal filtering that are unable to find the global solution, since these algorithms are highly sensitive to the initialization of the unknown parameters [3]. On the other hand, the evolutionary methods such as particle swarm optimization are powerful methods for exploring the search space in order not to stick in local solutions.
In this paper, a modified PSO algorithm is presented. In this method, a decomposition technique is employed so that the algorithm has higher ability for the exploitation technique near the final solution. The improvement of the exploitation technique results in minor movements of the particles near the global solution and prevents from larger jumps and deterioration of the value of fitness function. As a result, the method has less mean square error compare with the ordinary PSO algorithm.
The modified PSO algorithm is proposed for estimation of the parameters of a simulated biological model from a synthetic data [16]. This pathway generated data is modelled by a canonical model which is made by the Ssystem. Then, this algorithm is employed for real data of the Ecoli system [15, 16]. The results of the algorithm are then compared with the IUKF, SA and PSO algorithms in Root Mean Square Error point of view. We have demonstrated that the novel PSO algorithm is accomplished to estimate the true parameters and has less RMSE compare with the SA and IUKF and the ordinary PSO algorithm.
This paper is organized as follows. The mathematical modelling is defined in section II. The parameter estimation methods and the proposed technique are described in section III. The simulated and experimental results are presented in section IV and the paper is finally concluded in Section V.
Mathematical modelling
Review stage
Consider a nonlinear dynamical model for a biological system as below:
where x ∈ ℝ^{N} is the metabolites, u ∈ ℝ^{m} is the concentration vectors of biomolecules that is the vector of independent variables, and w ∈ ℝ^{q} is the parameter vector. The problem is to find a solution for the unknown parameters that are best fitted to the model.
Different canonical modelling frameworks are available to describe a biological phenomenon. In this paper, among some frameworks such as LotkaVolterra [17], Ssystems [18], and cooperative and saturation models [19], the Ssystem is used to express the nonlinear model of eq. (1)
where α_{i} > 0 and β_{i} > 0 are rate coefficients and g and h are kinetic orders. The number of dependent variables is N which represents the metabolism and m is the number of independent variables that are considered as inputs to the system.
Considering the eq. 2, we have x = [x_{1}, …, x_{N}]^{T}as the state vector, u = [x_{N + 1}, …, x_{N + m}]^{T} as the input vector and w = [α_{1}, …, α_{N}β_{1}, …, β_{N}, g_{11}, …, g_{N, N + m}, h_{11}, …, h_{N, N + m}]^{T}as the parameter vector, while q = N(2 + 2(N + m)) is the largest number of the unknown parameters. Many algorithms have been developed to estimate the unknown parameters of such nonlinear equations. In section III, some of these methods are presented.
Methods
In order to compare the performance of the proposed method with another heuristic approach, the Simulated Anealling method is used. Moreover, for the further analysis of heuristic methods, a non heuristic approach based on Unscented Kalman filter is applied. These methods besides the proposed algorithm are described in this section.
Iterative Kalman filter (IUKF)
The Unscented Kalman Filter (UKF) is a modified framework of the Kalman Filter for nonlinear modelling. In this filter, one or both the process and measurement equations might be nonlinear and the nonlinear kalman filter is required. The UKF algorithm employs the unscented transform as a nonlinear transformation to compute the statistics of a random variable.
The UKF algorithm is briefly explained as follow:
Consider the discrete state space model of eq. I:
By the definition ofy[k] = x[k + 1], k = 0, …, N − 1, the above equation can be rewritten as the following nonlinear process equation:
where x and u are the inputs, y is the output and w is the unknown parameter with dimension q that is to be estimated. To estimate the parameters, the following state space representation is defined as follows:
where the first model is the process equation, driven by r as the process noise. The latter is the measurement equation driven by the input and the measurement noise e. the UKF algorithm is able to estimate the parameter w based on the following pseudo code:
1. Initialize the unknown parameter and the covariance matrix
2. Assuming the parameter vector ω with mean \( \overline{\omega} \) and covariance P_{ω} a set of 2q + 1 sigma vectors \( \mathcal{W} \) are obtained through the following equations
where \( \gamma =\sqrt{q+\lambda } \), λ = ε^{2}(q + ϗ) − q is a scaling parameter, and \( \Big({\sqrt{\left(q+\lambda \right){P}_{\omega}\Big)}}_i \) represents the ith column of the matrix. The constant 10^{−4} < ε < 1 controls the sigma points spread around \( \overline{\omega} \). ϗ is also a scaling parameter, which is regularly fixed to 0 or 3 − q and q is the dimension of parameters [14].
3. The sigma points are transformed with the nonlinear process F
4. The mean and covariance of the transformed sigma point of step 3 are calculated as follows
Where \( {W}_i^{(m)}={W}_i^{(c)}=\frac{1}{2\left(\lambda +q\right)} \),\( {W}_0^{(m)}={W}_0^{(c)}=\frac{\lambda }{\left(\lambda +q\right)} \).
5. The cross covariance matrix of the measurement and parameter vectors are calculated
6. The kalman gain, parameters and the covariance matric is then updated
In [16], the authors have shown that the convergence rate of the UKF algorithm is small and the filter might not converge to the true parameters. The main reason is the early saturation of the kalman gain resulting from the assignment of not suitable initial values. Thus, the small filter gain results in small changes in the estimation of parameters and the algorithm converge very slowly.
To remedy this problem, the authors have developed the IUKF algorithm. In this algorithm, after applying the UKF algorithm, RMSE is calculated through the following equation
where y[k] is the true measurement and \( \widehat{y}\left[k\right] \) is the estimated output. If MSE is smaller than a threshold value δ_{E} or the iteration number of the algorithm is large enough the algorithm will stop, otherwise the UKF algorithm will be initialized with the final estimate of the parameters and their covariance which produces better initial values for the UKF algorithm. If the difference of two consecutive RMSE is less than a threshold value δ_{R}, the covariance matrix resets to the first initial value preventing from small changes in covariance matrix.
Simulated annealing
SA algorithm is a metaheuristic algorithm introduced by Scott Kirkpatrick and et al. [20]. This algorithm is based on heating and cooling of the materials, starting with a prior solution and improves it in a repetitive process to reach the optimized solution for the problem. It consists of an inner and outer loop. The inner loop displaces the last solution in a solution space with a local search and updates the obtained solution based on a probabilistic criterion. The outer loop decreases the temperature of the process consistently. This temperature affects the performance of the inner loop. In the beginning with high temperature, the algorithm performs well in searching the solution space and prevents from the local solution in nonconvex problems. By decreasing the temperature, it demonstrates a good capability in exploration.
At the first iterations that the temperature is high, in the contrary to the nonproper value of the cost function, the probability of a bad solution in the inner loop is high. This property prevents the algorithm from convergence in local responses. In the last iterations of the outer loop with the loose of temperature, the probability of a bad solution is decreased and the most proper solution is chosen with high probability.
Particle swarm optimization
Particle Swarm Optimization, also called PSO, is a population based stochastic optimization method intruduced by Kennedy and Eberhart [21]. PSO simulates the activities of school of birds, swarms of insects or groups of fish, in which individuals are called particles and the population is named a swarm. In this method, a position P and a velocity V are assigned to each particle. These particles are scattered around in the searchspace based on a few procedures. The particles are scattered following their own best known position P^{*}, in the searchspace as well as the total swarm’s best known position \( {P}_g^{\ast } \). The velocity of the jth dimension of particle i in iteration t, is obtained as follow:
Where W is the inertia weight and is normally in the range of 0.9–1.2 [22], c_{1} and c_{2} are constants that are set to 2, and r_{1} and r_{2} are randomly generated stochastic parameters in the interval [0, 1].
The new position of the ith particle is then updated by the following formula:
The PSO algorithm is known as a fast simple method, suitable for nonconvex NPhard problems such as biological pathway modelling. Despite its great performance in exploring the solution space, it has weakness in exploiting the optimal solution when reaching to its neighborhood. To cope with this deficiency, the researchers usually adopt some heuristics to the algorithm or hybridize it with some other methods. In this paper, a novel heuristic is embedded in the PSO algorithm to improve its exploiting capabilities. Here, we apply a decomposition technique in a sequential platform with the canonical PSO. The proposed algorithm, named as DPSO, has a twophase structure. The first phase is a standard PSO, where the algorithm finds a solution in the neighborhood of the optimal one, known as the GlobalBest. In order to enhance the capability of the first phase in reaching near the optimal solution as close as possible, the inertia weight in the eq. (16) is linearly reduced in each iteration of the PSO algorithm, ending with a final value equal to 0.1. Using this technique provides the algorithm with more precision in the final iterations, when the particles are in the neighbor of the best solution. The second phase, utilizes a decomposition technique as following pseudocode:
For i = 1: Iteration Number
For j = 1:Param_Number

Relocate the position of all the particles as the globalbest

Scatter the particles in the jth dimension in the neighbor of the globalbest

Find the optimal value of the jth dimension using the PSO algorithm

Update the globalbest
endfor.
endfor.
As described in the above pseudocode, the second phase of the algorithm has an inner loop where it tries to find the optimal value of all the decision variables, considering the values of the others as constant values, using the canonical PSO. In the second phase, we needed to be more precise in our searching method. So, we concentrated the ability of the PSO algorithm from a multidimensional search space to a singledimension exploitation: by decomposing the problem into each decision variable, the PSO was able to search the optimal value of each decision variable near the vicinity of the best found solution at hand, while considering other decision variables as fixed values. This searching technique is repeated for all decision variables. In the outer loop, the whole procedure is repeated to insure evading from a local solution. Using this technique allows the method to exploit the optimal solution with higher levels of accuracy, resulting in a reliable high quality solution.
Results
Simulations
In this section, we demonstrate the capability of the modified PSO algorithm with two synthetic simulations. In the first simulation, no additive noise has been assumed. In the second simulation, additive white Gaussian noise with SNR = 20 is considered. The data are generated through the following model:
and the parameters are as follows:
Table 1 shows the chosen parameters of the above model.
Four random initial values are generated to produce the four synthetic metabolisms. The time profiles of these four synthetic states are obtained from eq. (18) and shown in Fig. 1.
Tables 2 through 5 depict the results obtained from IUKF, SA, PSO and the proposed DPSO algorithms. In these tables, the average of estimated parameters and the corresponding average RMSE for each algorithm in 1000 simulations are presented. As it is obvious from the tables, DPSO algorithm has higher capabilty for estimating the unknown parameters in both simulation scenarios. The RMSE of the proposed method is less than the other three algorithms. These simulations demonstrate the enhancement of the exploitation phase in the PSO algorithm employing the proposed technique compared with the ordinary PSO and also the capability of the method for estimation of more precise parameters compared with the IUKF and SA algorithms. Figure 2 illustrates the value of RMSEs of noise free and noisy scenarios in each 1000 simulation runs for each algorithm in Tables 2, 3, 4 and 5.
The mean percentage improvement of DPSO algorithms over SA, IUKF and PSO methods is 60.06% and 48.72% with respect to noise free and noisy scenarios.
Real data
In order to compare the performance of the algorithms in an experimental scenario, a real dataset is utilized from the Cad system. The Cad system is one of the conditional stress response modules in E. coli, which is induced only at low pH and a lysinerich environment [23, 24]. The following Ssystem can be used to model this phenomenon
The description and the parameters of the model have obtained from [16]. The parameters of this system have been estimated by the above mentioned methods. Then, these parameters will be used to generate the time profiles of the states of the Cad system in eq. (20). To compare the results of the algorithms, the RMSE is calculated as the error of the generated time profiles and the real datasets obtained by the measurements. Table 6 shows the resultant RMSEs.
The generated time profiles obtained by the estimated parameters along with the true time profiles are depicted in Fig. 3.
The proposed decomposition technique is applied on the ordinary PSO to have smaller movements near the global solution and prevent from larger jumps that deteriorate the results. Therefore, as it is explained earlier, the DPSO algorithm has higher capability in exploitation step compared with the ordinary PSO algorithm. In the real dataset experiment, the RMSE of DPSO algorithm is 0.741 revealing the capability of the proposed method in comparison to IUKF and SA and PSO algorithms. For this real data experiment, the percentage improvement of DPSO algorithm over SA, IUKF and PSO methods is 29.36%, 33.78%, 17.02%, respectively.
Discussion
Four different algorithms have been discussed in this paper. IUKF has an analytical approach and it provides the solution under the mathematics of kalman filter. However, the heuristic approaches are very popular in the literature for the nonlinear problems with large number of parameters. Therefore, two metaheuristic approaches are considered besides IUKF. SA and PSO algorithms are two powerfull methods for finding the best solution in NP hard problems. The proposed DPSO algorithm is developed to have higher capability in exploitation step. As a result this algorithm has less RMSE compared with the other three algorithm due to smaller jumps near the global solution. The results show the superiority of DPSO algorithms over the other three methods in RMSE point of view.
In simulations, a model with 17 parameters is considered and two scenarios with different SNR have been analysed. RMSE is computed with the estimated and the true parameters. The RMSE of DPSO is less than the other three algorithms as it is expected.
In real experiment scenario the data of CAD system is used to estimate the paramters of an assumed model for the time profile of the involved metabolisms. The model is similar to the one in simulation scenarios and the parameters are estimated. The DPSO method has less RMSE between the true data and the data obtained from the model with the estimated parametrs. This accuracy is obtained as a result of more running time compared with the PSO and SA algorithms. However, this time is not a significant problem because this methods are offline and a model is estimated to be used for further analysis. The running time of each algorithm is 343, 305, 287 and 416 s for IUKF, SA, PSO and DPSO algorithms, respectively.
Conclusion
In this paper, a PSO based algorithm is suggested to estimate the unknown parameters of the nonlinear biological models. The proposed method is compared against IUKF, SA and the ordinary PSO algorithm. The results show the superiority of the DPSO algorithm compared with the other three methods. The results of the simulation and experimental scenarios reveal the capability of the proposed algorithm for estimation of the unknown parameters of a nonlinear biological system.
References
Harrigan GG, Goodacre R, eds. Metabolic profiling: its role in biomarker discovery and gene function analysis. Springer Science & Business Media, 2012.
Gu X. A multistate optimization framework for parameter estimation in biological systems. IEEE/ACM Trans Comput Biol Bioinform. 2016;13(3):472–82.
Friedl G, Kuczmann M. Population and gradient based optimization techniques, a theoretical overview. Acta Technica Jaurinensis. 2014;7(4):378–87.
Doerner KF, Maniezzo V. Metaheuristic search techniques for multiobjective and stochastic problems: a history of the inventions of Walter J. Gutjahr in the past 22 years. CEJOR. 2018;26(2):331–56.
Meskin N, et al. Parameter estimation of biological phenomena modeled by ssystems: An extended kalman filter approach. Decision and Control and European Control Conference (CDCECC), 2011 50th IEEE Conference on. IEEE, 2011.
Chou IC, Martens H, Voit EO. Parameter estimation in biochemical systems models with alternating regression. Theor Biol Med Model. 2006;3:25.
Daniels BC, Nemenman I. Efficient inference of parsimonious phenomenological models of cellular dynamics using Ssystems and alternating regression. PLoS One. 2015;10(3):e0119821.
Wang C, Zhang L. Comparison of parameters estimation methods based on the systems biology model of breast cancer. Progress in Informatics and Computing (PIC), 2015 IEEE International Conference on. IEEE, 2015.
Gonzalez OR, et al. Parameter estimation using simulated annealing for Ssystem models of biochemical networks. Bioinformatics. 2006;23(4):480–6.
Chou IC, Voit EO. Recent developments in parameter estimation and structure identification of biochemical and genomic systems. Math Biosci. 2009;219:57–83.
Mansouri MM, et al. State and parameter estimation for nonlinear biological phenomena modeled by Ssystems. Digital Signal Process. 2014;28:1–17.
Ding F, Liu X, Ma X. Kalman state filtering based least squares iterative parameter estimation for observer canonical state space systems using decomposition. J Comput Appl Math. 2016;301:135–43.
Moles CG, Mendes P, Banga JR. Parameter estimation in biochemical pathways: a comparison of global optimization methods. Genome Res. 2003;13:2467–74.
Abdullah A, et al. An evolutionary firefly algorithm for the estimation of nonlinear biological model parameters. PLoS One. 2013;8(3):e56310.
Teijeiro D, et al. A cloudbased enhanced differential evolution algorithm for parameter estimation problems in computational systems biology. Clust Comput. 2017;20(3):1937–50.
Meskin N, et al. Parameter estimation of biological phenomena: an unscented Kalman filter approach. IEEE/ACM Trans Comput Biol Bioinform. 2013;10:2.
Skvortsov A, Ristic B, Kamenev A. Predicting population extinction from early observations of the Lotka–Volterra system. Appl Math Comput. 2018;320:371–9.
Meskin N, Nounou H, Nounou M, Datta A, Dougherty ER. Parameter estimation of biological phenomena modeled by Ssystems: an extended Kalman filter approach: Proc IEEE 50th Decision and Control and European Control Conf (CDCECC); 2011. p. 4424–9.
Sorribas B, HernndezBermejo EV, Alves R. Cooperativity and saturation in biochemical networks: a Saturable formalism using Taylor series approximations. Biotechnol Bioeng. 2007;97(5):1259–77.
Kirkpatrick S, Daniel Gelatt C, Vecchi MP. Optimization by simulated annealing. Science. 1983;220(4598):671–80.
Eberhart R, Kennedy J. A new optimizer using particle swarm theory. In: Micro machine and human science, 1995. MHS’95, Proceedings of the sixth international symposium on: IEEE; 1995.
Shi Y, Eberhart RC. A modified particle swarm optimizer: Proc. of IEEE World Conf. on Computation Intelligence; 1998. p. 69–73.
Küper C, Jung K. CadCmediated activation of the cadBA promoter in Escherichia coli. J Mol Microbiol Biotechnol. 2005;10(1):26–9.
Fritz G, Koller C, Burdack K, Tetsch L, Haneburger I, Jung K, Gerland U. Induction kinetics of a conditional pH stress response system in Escherichia coli. J Mol Biol. 2009;393(2):272–86.
Acknowledgments
RM is grateful to Dr. Mehrdad GholamiShahbandi for his comments on metaheuristic approaches and for English proofreading of the paper.
Funding
This research is funded by no institute or organization.
Availability of data and materials
Please contact author for data requests.
Author information
Authors and Affiliations
Contributions
RM is the main author of the paper and FB is the advising professor and helped to draft the manuscript. Both authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Authors’ information
Raziyeh Mosayebi received her Bs.c and Ms.c. degree in Electrical Engineering from Amirkabir University of Technoogy, Tehran, Iran in 2011 and 2013, respectively. She then joined the biomedical Engineering department at University of Tehran in 2014 as a Ph.D. student. She is currently working on signal processing tools for fusion of EEG and fMRI data. Her main focus is on data decomposition and factorization methods.
Fariba Bahrami is an associate professor at the University of Tehran, School of Electrical and Computer Engineering, Department of Biomedical Engineering. She received her B.S. and M.S. degrees in Communication Engineering from Technical University of Isfahan, Isfahan, Iran, in 1985 and 1988, respectively, and her Ph.D. degree from the University of Tehran at 1998 in Biomedical Engineering. She accomplished a part of her PhD research at Technical University of Munich, in Munich, Germany from 1994 to 1997 under a stipendium from DAAD, Germany. She was the vice president of Iranian Society of Biomedical Engineering (ISBME) from 2009 to 2015 and she has been an editorial board member of Iranian Journal of Biomedical Engineering since 2007. Her field of interests are biological system modeling, computational and cognitive neuroscience, human motor control and neurorehabilitation. She is a member of IEEE, EMB Society.
Ethics approval and consent to participate
Not Applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Mosayebi, R., Bahrami, F. A modified particle swarm optimization algorithm for parameter estimation of a biological system. Theor Biol Med Model 15, 17 (2018). https://doi.org/10.1186/s1297601800896
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s1297601800896