Theoretical Biology and Medical

Background: Circadian rhythms with varying components exist in organisms ranging from humans to cyanobacteria. A simple evolutionarily plausible mechanism for the origin of such a variety of circadian oscillators, proposed in earlier work, involves the non-disruptive coupling of pre-existing ultradian transcriptional-translational oscillators (TTOs), producing "beats," in individual cells. However, as with other TTO models of circadian rhythms, it is important to establish that the inherent stochasticity of protein binding and unbinding does not invalidate the finding of clear oscillations with circadian period.


Background
We are concerned with mechanisms that can account for circadian rhythms at the cellular level. Although circadian oscillators exist in complex multicellular organisms as well as in single-cell organisms, it is thought that most occur in single cells [1][2][3]. We have previously [4] described a model for circadian oscillations in which ultradian oscillators, which have been widely observed to occur in living systems, are coupled to produce circadian periods. The model was based, as is much of the related literature, on so-called transcriptional-translational oscillators (TTOs), in which genes are activated or inhibited for transcription by protein products of the oscillating system itself (transcriptional activators or suppressors, respectively). Several models for interactions between more than one oscillator to generate a circadian one have been described [5][6][7], but ours differs in positing coupling between the protein products of independent ultradian oscillators. We argued that our model provides a plausible evolutionary origin for circadian oscillators across a range of organisms, since it allows existing ultradian oscillations to be co-opted as components of circadian oscillators without disturbing their primary functions.
A challenging feature of TTOs is the fact that in cells, a given transcribed gene is present in one, or at most a small number, of copies, and its interaction with a transcriptional regulator is not correctly modeled by deterministic differential equations as used in [4]. Rather, because the number of copies of an expressed gene, and at some times the number of transcription factor molecules in a cell, is small, such interactions are more accurately described by stochastic equations, and this has been done for a number of existing models [6,8-10], using a classical algorithm due to Gillespie [11]. In some cases this results in shorter autocorrelation times [8] or random fluctuations [6]. Typically, as for example in reference [9], the effect of the stochasticity is to degrade the circadian oscillations, but for fast enough binding rates, the circadian oscillations are maintained. Our objective here is to apply the stochastic approach to a model similar to the one described in [4]. To do this, we have estimated the rates of association and dissociation of transcription factors from their DNA binding sites. We have then incorporated these rates, together with parameters previously used [4], into the new version of the model, in which the DNA binding steps have been treated as stochastic processes. The subsequent steps of translation and turnover of protein and mRNA have been left as deterministic ones, since the numbers of molecules in these processes are large.
We suggest that if the model is well-behaved with the critical DNA-binding step as a stochastic process, then the remaining steps can be left as deterministic without compromising the reliability of the model. Three quite different time scales arise in the model. The binding and dissociation of the transcription factors to DNA sites occur on a fast time scale, as discussed below. We introduce an (artificial) parameter ε with dimensions of time to adjust the time scales for these events and to explore the limit ε → 0. In our Numerical Tests section we vary ε; for several numerical simulations we use a value that corresponds to a relatively high rate of binding and dissociation, as explained in the Model section below. Under these conditions the results are essentially indistinguishable from the simulations for a time-averaged deterministic model which is obtained in the limit ε → 0. We subsequently show that the model is well-behaved even for binding rates that are at least 1000-fold slower.
The second significant time scale is given by the periods of the individual ultradian oscillators, which are of the order of a few hours. The critical parameters for these oscillations are those describing the half-lives of mRNA, proteins, and protein complexes. Following our numerical tests, we conduct a brief exploratory analysis of the range of periods of our "primary" oscillators.
The third time scale is, of course, the circadian rhythm time scale, which in our model arises from an interaction of two of the simpler ultradian oscillators of slightly different frequencies. Natural selection could explain why pairs of frequencies leading to the right "beats" have emerged in the course of evolution. In fact, the common occurrence of ultradian oscillators would make it easy for evolution to produce circadian rhythms out of different components in different organisms, as is actually observed [4]. This mechanism has the added advantages of robustness and easy adaptability (the period of the beat will change with minor adjustments of the frequency ratio between the two primary oscillators, but this ratio could stay quite stable even if the parameters involved varied with external conditions such as temperature). A power spectrum analysis presented below demonstrates the robustness of the model with respect to the parameter ε.
We mention that power spectra could be used to analyze observational data for a potential validation of the model. First steps in this direction were taken in [12].

The model
Our model involves TTOs contained in a single cell. As described in [4], the model comprises two ultradian "primary" oscillators whose protein products are coupled to drive a circadian rhythm. For simplicity, the two coupled primary oscillators are essentially identical, with only their frequencies different, since the critical feature is the ability to couple TTOs through known molecular processes (formation of transcriptional-regulatory protein heterodimers). Therefore, the key question regarding the ability of a stochastic process to describe stable circadian oscillators can be addressed in terms of one primary oscillator. In this system, two genes (DNA sites) are transcribed into mRNA, and this process is the origin of the following chemical dynamics.
• Transcription by gene 1 occurs when site 1 (its regulatory region) is unoccupied. Its state is given by a random variable X 1 , so that X 1 = 0 if site 1 is empty; X 1 = 1 if site 1 is occupied by D 2 (see below).
• When gene 1 is active it produces mRNA (measured in molecules per cell, R 1 ) at a constant rate k 13 . These molecules undergo first-order decay with a rate constant k 14 .
• The mRNA molecules are translated into protein P 1 , which: (a) decays at rate constant k 16 , (b) forms homodimers D 1 at rate k 17 , and (c) forms heterodimers D 13 with proteins P 3 from a third gene (see below) with a rate constant k 61 .
• The homodimer D 1 binds to site 2, and thereby activates the transcription of gene 2. The state of gene 2 is given by the value of a random variable Y 1 so that Y 1 = 0 if site 2 is empty, and Y 1 = 1 if site 2 is occupied by D 1 .
• Gene 2 is transcribed and its mRNA is translated into protein P 2 , which forms the homodimer D 2 ; this in turn feeds back to inhibit gene 1 (above). In addition, the P 2 molecules decay with a certain (biological) half-life.
• These linked reactions generate a TTO for an appropriate choice of parameters. The parameters used in our subsequent calculations are listed in Table 1. Our model entails gene 1 being inhibited by homodimer D 2 and gene 2 being activated by homodimer D 1 . This is the mechanism leading to primary oscillations.
We denote by R i , P i , D i , i = 1, 2 the concentrations of the mRNA, the translated protein and the homodimer produced by site i. The above scenario is then summarized in the following system of stochastic differential equations (only two of the equations contain the random variables X 1 and Y 1 explicitly, but all dependent variables are then random variables of necessity). The parameters k 13 etc. have the same meaning as in Ref. [4], and we have kept the notation used there; this explains the unconventional numbering (some of the equations from the reference, and hence some of the parameters, are no longer needed).
The last two terms in the second equation reflect the combination of proteins P 1 and P 3 (which is produced by the second primary oscillator) to form the heterodimer D 13 . This heterodimer in turn breaks down into pairs P 1 and P 3 at rate constant k 62 .
The second primary oscillator is given by a nearly identical set of equations, except that the periods of the oscillations are slightly different. This can, of course, be achieved by changing the parameters in many ways, but the simplest method is to have the two TTOs identical in nature but with different time scales. To do this we simply multiply each right hand side by a fixed constant δ > 0, where δ is close (but not identical) to one. For example, the first equation of the second oscillator will read
The parameters chosen reflect, where available, reasonable choices of known molecular processes. The critical ones for establishing the periods of the primary oscillators are the decay times of the mRNAs and proteins. For the former, a half-life of 13-17 minutes and for the latter, 4-17 minutes generate ultradian oscillations in the model. The values used in the simulation are given in Table 1.
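As a rough illustration of how δ sets the beat period, the following Python sketch treats the two primary oscillators as uncoupled, so that multiplying every right-hand side by δ simply rescales time and shortens the period by the factor δ. The function name `beat_period` and the values used (a 3 hour period and δ = 9/8, chosen to match primary frequencies of about 8 and 9 cycles per day) are illustrative assumptions, not the entries of Table 1.

```python
def beat_period(period_1_h: float, delta: float) -> float:
    """Beat period (hours) of two oscillators with periods T1 and T1 / delta.

    Assumes the oscillators are uncoupled and that scaling every right-hand
    side by delta only rescales time (an idealisation of the model).
    """
    if delta == 1.0:
        raise ValueError("delta must differ from 1 to produce a beat")
    f1 = 1.0 / period_1_h      # frequency of the first oscillator (cycles per hour)
    f2 = delta * f1            # frequency of the time-rescaled second oscillator
    return 1.0 / abs(f2 - f1)  # beat frequency is the difference of the two


# Illustrative values only: 3 h primary period, delta = 9/8.
example_beat_h = beat_period(3.0, 9.0 / 8.0)
print(example_beat_h)  # approximately 24 hours
```

Under this idealisation the beat period is T1/|δ - 1|, so a small change in δ produces a large change in the beat period, while the beat is unaffected if both primary periods scale by the same factor.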
The coupling between the two sites communicating in each oscillator is, of course, provided by the random variables X i , Y i . The times for which these random variables stay constant are assumed to be exponentially distributed. For example, the time for which X 1 stays at 0 (site 1 free) is exponentially distributed with mean ε/D 2 , and the time for which it stays at 1 (site 1 occupied) is exponentially distributed with mean ε/r. Here ε is a time scaling parameter, introduced for convenience to exploit the fact that the binding and unbinding of the homodimers occurs on a faster time scale than the remaining processes. The constants r and s measure, relative to the scale ε, the average times for which the sites will remain occupied. As this is an internal parameter of the site it should not depend on the states of the rest of the system (like, for example, the dimer concentrations).
We use ε to gauge the rate constant for binding of the transcriptional-regulatory proteins (D 1 , D 2 ) to the binding sites on the relevant genes. Experimental work has shown that the second-order rate constant for the binding of transcription-regulating proteins to DNA can be 100 to 1000 times greater than the maximum rate predicted for three-dimensional diffusion [13,14]. With transcription-regulating protein concentrations measured in molecules/nucleus, using the experimental rate constant for binding of the lac repressor to its cognate DNA [10 -10 (Msec) -1 ], and assuming that a small eukaryotic nucleus has an effective volume of 40% of its total volume, this suggests a value for ε of 0.10 seconds (2.8 × 10^-5 hours). This can be interpreted as the time required for a binding event when D 1 or D 2 is present at 1 molecule/nucleus. At higher concentrations (of D 1 or D 2 ), this time will shorten proportionately. The average "free" time of the binding site for D 2 is thus ε/D 2 , and the average "occupied" time is ε/r. Their quotient is independent of ε, but will change with the homodimer concentration D 2 . Similar interpretations apply for X 2 and Y 2 and the random variables associated with the second primary oscillator. We have used the value ε = 0.1 sec for producing most of the numerical simulations in our Numerical Tests section below (Figures 1, 2, 3, 4). However, as shown in Figures 5 and 6, a value of ε that is 1000 times greater (corresponding to a 1000-fold slower rate of binding) yields effectively the same power spectrum for the circadian model. This is comparable to the observation by Forger and Peskin [9] that in their model for mammalian circadian rhythms the on/off times need to be on the order of seconds.
The average times for which a dimer stays bound (ε/r, ε/s, etc.) are independent of the state of the system. In contrast, the "free" times are inversely proportional to the concentration of the attaching homodimer. In one of our simulations we use r = 25 and ε = 10^-1 sec (which corresponds to ε/r = 4 × 10^-3 sec, or an average of 900,000 binding events per hour). We shall see that the corresponding stochastic simulation compares well with a limiting scenario for which ε = 0. Before we describe this limiting scenario in detail we present the remaining equations making up the complete oscillatory system.
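The switching rule for a single binding site can be sketched in a few lines of Python. This is a minimal illustration, not the simulation code used for the figures: it holds the homodimer level D 2 fixed (in the full model D 2 varies in time), and the value D 2 = 50 molecules per nucleus and the name `occupied_fraction` are hypothetical. The sketch draws exponentially distributed free and occupied times with means ε/D 2 and ε/r and compares the simulated occupied fraction with the ratio of the mean occupied time to the mean cycle length.

```python
import random


def occupied_fraction(eps, r, d2, t_end, seed=0):
    """Simulate one binding site and return the fraction of time it is occupied.

    eps   : time-scaling parameter (seconds)
    r     : dimensionless unbinding constant; mean occupied time is eps / r
    d2    : homodimer level (molecules per nucleus), held fixed here (assumption)
    t_end : total simulated time (seconds)
    """
    rng = random.Random(seed)
    t, occupied_time, state = 0.0, 0.0, 0  # state 0 = free, 1 = occupied
    while t < t_end:
        mean_dwell = eps / d2 if state == 0 else eps / r
        # Exponentially distributed dwell time, truncated at the end of the run.
        dwell = min(rng.expovariate(1.0 / mean_dwell), t_end - t)
        if state == 1:
            occupied_time += dwell
        t += dwell
        state = 1 - state
    return occupied_time / t_end


EPS, R, D2 = 0.1, 25.0, 50.0                 # D2 = 50 is a hypothetical value
mean_occupied = EPS / R                      # mean occupied time in seconds
events_per_hour = 3600.0 / mean_occupied     # about 900,000 for these values
expected = (EPS / R) / (EPS / R + EPS / D2)  # mean occupied time / mean cycle length
simulated = occupied_fraction(EPS, R, D2, t_end=60.0)
print(events_per_hour, expected, simulated)
```

Because ε cancels in the expected fraction, changing `EPS` alters only how quickly the simulated fraction settles, not the value it settles to.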
As stated earlier, the protein products P 1 and P 3 of the first and second primary oscillators combine to produce the heterodimer D 13 . As formulated in the model, this heterodimer binds to the regulatory site of a fifth gene and activates it for transcription (other constructs, involving other heterodimeric products of the two primary oscillators, and either stimulation or inhibition of transcription of the fifth gene, could also be used). Transcription, translation, and dimerization of the protein product of gene 5 yields the product D 5 , which is the primary circadian output of the model (although all variables show circadian behaviour to a greater or lesser extent, as seen in the graphical results).
The corresponding system is and

The time-averaged deterministic model
We employ renewal reward theory (see [15]) to derive a system of ordinary differential equations which replaces (1-6) by a "time-averaged" system in the limit ε → 0. To this end, note first that if D 2 were independent of time, the time average of X 1 (t) over "macroscopic" time intervals should be D 2 /(r + D 2 ). Renewal reward theory implies that this intuition is mathematically accurate.
Specifically, define a cycle to consist of a period of unoccupied time followed by a period of occupied time. The cycle ends with detachment. The period of unoccupied time is exponentially distributed with mean ε/D 2 . Suppose, in the language of renewal reward theory, that no reward is received during this time. The following occupied part of the cycle is exponentially distributed with mean ε/r, and we assume that the reward associated with this period is exactly equal to the amount of occupied time. Then, by renewal reward theory, the long-term average reward (i.e., the proportion of occupied time) is with probability 1 equal to E(R)/E(L) where E(R) is the expected reward during a cycle and E(L) is the expected length of a cycle. In the case under consideration E(R) = ε/r, E(L) = ε/r + ε/D 2 , so the long-term time average of X 1 (t) is D 2 /(r + D 2 ), i.e., lim ε→0 X 1ε (t) = D 2 /(r + D 2 ) in the time-averaged sense (here, we denote the random variables X i as X iε to emphasize the dependence on ε). This time average will hold over any time interval over which D 2 is constant or changes sufficiently slowly. In this time-averaged limit, the stochastic system (1-6) is replaced by the deterministic system referred to as Eqns. (11,12) ff.

Figure 1. The time evolution of the proteins P 1 and P 3 according to the time-averaged model.

To this end we denote by R 1ε , P 1ε , D 1ε etc. the solution of (1-6) for some ε > 0 and given initial values R 1 (0), P 1 (0),..., and denote by R 1 , P 1 , D 1 etc. the solution of Eqns. (11,12) ff. for the same initial values.
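We prove

Proposition 1
Almost surely for all t > 0, the solution of the stochastic system converges to that of the time-averaged system as ε → 0, i.e., R 1ε (t) → R 1 (t), P 1ε (t) → P 1 (t), D 1ε (t) → D 1 (t), etc.
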
Consider an arbitrary but fixed time interval [0, T] and let (ε n ) be a sequence such that ε n → 0 as n → ∞. For each n we consider a realization, again denoted by R 1ε etc., of the initial value problem (1-6) ff. with the given fixed initial data.
Step 2. We write ε rather than ε n to simplify the notation. Observe that and The central step of our proof is showing that and are also related by (11 This completes the proof.

Figure 3. The time evolution of the proteins P 1 and P 3 according to the stochastic model.
Remark. This result is only a first step in a possible more complete analysis of the whole process. Specifically, we intend to study the partial differential equations governing the probabilities that the stochastic variables R i , P i , D i assume values in certain ranges, derive the deterministic model given earlier as a set of equations for the first moments of these variables, and proceed to study fluctuations. The nonlinear coupling in our equations makes this a challenging program.

Numerical tests
Here we present some results of simulations performed with the XPPAUT package (see [16,17]). The chosen parameters are those from Table 1. Figure 1 shows the time course of the proteins P 1 and P 3 for the deterministic model; both oscillate with periods of about 3 hours that differ slightly from each other. A slight circadian variation is seen; it is much more prominent in Figure 2, where the responses of the protein products of the fifth DNA site are shown; note the time lag of D 5 with respect to D 13 .
In Figures 3 and 4 the same calculation was done for the stochastic model. This calculation used Gillespie's method [11], with ε chosen as 2.8 × 10^-5 hrs. The results are essentially identical to the ones for the time-averaged model.
As a control measure we performed some calculations with larger ε, for example ε = 2.8 × 10^-3 hrs and ε = 0.028 hrs. For the former case, especially, the results were close to the time-averaged simulations. For the latter case, deviations from the time-averaged simulations became noticeable: the amplitude of the circadian oscillations in D 5 fluctuated stochastically and their period decreased slightly.
Despite these more significant stochastic effects with larger ε, the integrity of the circadian period is remarkably robust in our model with respect to the choice of ε. We demonstrate this by computing Fourier power spectra of D 5 time series generated by simulations with ε = 2.8 × 10^-5 hrs and ε = 2.8 × 10^-2 hrs (see Figures 5 and 6). The former was calculated from a time series of 7447 data points at intervals of 1 minute, representing 124.1 hours of real time. The latter was calculated from a time series of 9920 data points at intervals of 10 minutes, representing 1653.2 hours of real time. We chose to integrate for a longer time in the latter case because the circadian oscillations were less regular. The power spectrum is shown in decibels (decibels = 10 log 10 (power), where power = |X i |^2 for X i , the i th frequency component of the Fourier transform of the time series {x k }). The frequencies of the primary oscillators show up clearly in the power spectra at close to 8 and 9 cycles per day respectively, and the circadian oscillations are overwhelmingly dominant at close to (but not exactly) 1 cycle per day in both cases. Even after 65 "days" with ε = 0.028 hrs, the stochastic oscillator remained in phase with the circadian period; the wave form appeared to persist indefinitely.
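The decibel power spectrum defined above can be computed as in the following Python sketch, which uses NumPy. It is not the analysis code behind Figures 5 and 6. The input is a synthetic stand-in for a D 5 time series: the product of two offset sinusoids at 8 and 9 cycles per day, which is an assumption made only to show where spectral lines appear when two oscillating quantities are multiplied. The function name `power_spectrum_db` is hypothetical, and no Daniell smoothing is applied.

```python
import numpy as np


def power_spectrum_db(x, dt_days):
    """Return (frequencies in cycles/day, power in decibels) for a real time series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                            # drop the constant offset
    spectrum = np.fft.rfft(x)                   # one-sided discrete Fourier transform
    power = np.abs(spectrum) ** 2               # power = |X_i|^2
    freqs = np.fft.rfftfreq(x.size, d=dt_days)  # frequency of each component
    with np.errstate(divide="ignore"):          # a bin with zero power maps to -inf dB
        power_db = 10.0 * np.log10(power)
    return freqs, power_db


# Synthetic example (assumption): one sample per minute for 10 days.
dt = 1.0 / 1440.0
t = np.arange(14400) * dt
p1 = 1.0 + np.sin(2.0 * np.pi * 8.0 * t)        # stand-in for a primary oscillator
p3 = 1.0 + np.sin(2.0 * np.pi * 9.0 * t)        # slightly faster primary oscillator
signal = p1 * p3                                # product contains the difference frequency
freqs, power_db = power_spectrum_db(signal, dt)
strongest = np.sort(freqs[np.argsort(power_db)[-4:]])
print(np.round(strongest, 3))                   # lines near 1, 8, 9 and 17 cycles per day
```

Unlike the D 5 spectra reported here, this toy signal does not make the 1 cycle per day line dominant; it only shows that a product of the two primary signals carries a component at their difference frequency.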

Remarks on the frequencies of the primary oscillators
The fundamental idea of our model is that circadian oscillations can easily be achieved via coupling of faster oscillators. We now address the question of whether the primary oscillators could attain circadian periods without need for coupling within reasonable ranges of parameter values based on known biochemistry. To this end we investigated which (if any) intrinsic limitations there are on the periods of the primary oscillators introduced earlier. We first explored (randomly) variations of the growth parameters k 13 , k 15 , k 17 , etc., and the unbinding rates r and s.

Figure 4. The time evolution of the heterodimer D 13 and the homodimer D 5 according to the stochastic model.

Figure 5. Power spectra for D 5 when ε = 2.8 × 10^-5 (no smoothing).

Figure 6. Power spectra for D 5 when ε = 2.8 × 10^-2 (smoothed with a Daniell filter of length 11).

To achieve this, we first modified the parameter k 17 governing the rate of homodimer formation. However, decreasing k 17 turns out to increase P 1E , counter-acting attempts to move the crossing pair closer to the real axis.
Finally, the actual rate constant of homodimer decay, k 18 , is not known, although it is unlikely to be smaller than 1 per hour. Choosing it to be exactly 1 per hour (earlier it was set to 15 per hour) we increased the periods up to 9 hours. Setting k 18 this low is probably not reasonable, but given no a priori firm bounds as to how small k 18 can actually be (a comment that applies to k 14 and k 16 as well), no simple predictions on the size of the periods of the primary oscillators can be made.
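To put the two values of k 18 in perspective, a rate constant can be converted to a half-life. The Python sketch below assumes simple first-order decay, so that the half-life is ln 2 divided by the rate constant; the function names are hypothetical, and whether k 18 enters the model as a pure first-order term is an assumption made here for illustration.

```python
import math


def half_life_minutes(rate_per_hour: float) -> float:
    """Half-life in minutes for first-order decay with the given rate constant (1/h)."""
    return 60.0 * math.log(2.0) / rate_per_hour


def rate_per_hour(half_life_min: float) -> float:
    """First-order rate constant (1/h) corresponding to a half-life in minutes."""
    return 60.0 * math.log(2.0) / half_life_min


print(half_life_minutes(15.0))  # roughly 2.8 minutes
print(half_life_minutes(1.0))   # roughly 42 minutes
```

On this reading, lowering k 18 from 15 per hour to 1 per hour lengthens the homodimer half-life from under 3 minutes to about 42 minutes.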

Conclusion
We have shown that TTOs in both their stochastic and time-averaged versions produce stable ultradian oscillations for reasonable parameter choices. Although the effect of the stochasticity is to degrade the circadian rhythms, as in other models like that of Forger and Peskin [9], these oscillations are nevertheless robust in our model with respect to the scaling parameter governing the dimer-driven stochastic activation or inhibition of the relevant gene sites. Couplings of such TTOs with slight variations in their periods offer a simple mechanism to explain the emergence of circadian rhythms as "beats". This explanation has the added desirable feature of making circadian rhythms readily adaptable to evolutionary pressures.