Skip to main content

On the suitability of an allometric proxy for nondestructive estimation of average leaf dry weight in eelgrass shoots I: sensitivity analysis and examination of the influences of data quality, analysis method, and sample size on precision

Abstract

Background

The effects of current anthropogenic influences on eelgrass (Zostera marina) meadows are noticeable. Eelgrass ecological services grant important benefits for mankind. Preservation of eelgrass meadows include several transplantation methods. Evaluation of establishing success relies on the estimation of standing stock and productivity. Average leaf biomass in shoots is a fundamental component of standing stock. Existing methods of leaf biomass measurement are destructive and time consuming. These assessments could alter shoot density in developing transplants. Allometric methods offer convenient indirect assessments of individual leaf biomass. Aggregation of single leaf projections produce surrogates for average leaf biomass in shoots. Involved parameters are time invariant, then derived proxies yield simplified nondestructive approximations. On spite of time invariance local factors induce relative variability of parameter estimates. This influences accuracy of surrogates. And factors like analysis method, sample size and data quality also impact precision. Besides, scaling projections are sensitive to parameter fluctuation. Thus the suitability of the addressed allometric approximations requires clarification.

Results

The considered proxies produced accurate indirect assessments of observed values. Only parameter estimates fitted from raw data using nonlinear regression, produced robust approximations. Data quality influenced sensitivity and sample size for an optimal precision.

Conclusions

Allometric surrogates of average leaf biomass in eelgrass shoots offer convenient nondestructive assessments. But analysis method and sample size can influence accuracy in a direct manner. Standardized routines for data quality are crucial on granting cost-effectiveness of the method.

Background

Based on published data for 17 ecosystem services in 16 biomes, Costanza et al. [1] estimated the value of ecosystem services at the level of the whole biosphere. They found a lower bound in the range of US$16–54 trillion (1012) per year, with an average of US$33 trillion per year. Marine systems produced near 63% of the annual value. Almost half derived from coastal ecosystems. Approximately 25% of this share related to algal beds and seagrasses. This contribution to human welfare is deem relevant. Thus maintaining the health of marine ecosystems is subject of scientific concerns. Currently, increasing anthropogenic pressures pose important threats. And global climate change could also threaten future viability of seagrass meadows [2, 3]. For instance, water quality and other local stressors promote unprecedented meadow loss [4,5,6]. This reduces mitigation of wave action [7] and filtration [8]. Diminishes food and shelter for a myriad of organisms [9,10,11]. Weakens nutrient cycling [12, 13], erosion abatement and shoreline stabilization [14,15,16]. Moderates support for detrital food web foundation [17]. And inhibits carbon sequestration [18, 19].

Eelgrass (Zostera marina L.) is a dominant along the coasts of both the North Pacific and North Atlantic [20]. This species supports communities known as among the richest and most diverse in sea life [21]. Contribution of organic materials for food webs in shallow environments [22] is noticeable. Indeed, eelgrass produced up to 64% of the whole primary production of an estuarine system [23]. Current deleterious effects of anthropogenic influences on eelgrass prompted special restoration strategies. Among remediation efforts replanting plays an important role [24,25,26,27]. Transplant success amounts to reinstatement of ecological functions of natural populations. Evaluation relies on monitoring standing stock and productivity of transplanted plants. Then comparing with assessments of a reference population, which usually settle nearby [28].

Combined biomass of leaves in shoots is an important component of standing stock. Assessments rely on the estimation of the biomass of individual leaves. This requires shoot removal followed by dry weight measurement procedures in the laboratory. Elimination of shoots could infringe damage to natural eelgrass populations [29]. And reduced shoot density makes these effects even more perceptible for transplanted plots. Allometric methods make it possible simplified-indirect estimations of eelgrass productivity and standing stock. Echavarría-Heras et al. [30] considered an allometric representation for eelgrass leaf biomass and related length. Agreeing with Solana et al. [31], the involved parameters are invariant within a given geographical region. Estimates and leaf length measurements grant nondestructive approximations of observed leaf biomass values. This way, leaf length measurements grant nondestructive approximations for observed leaf biomass values. Leaf growth rates estimation relies on successive measurements of leaf biomass. Then the allometric model in [30] entails nondestructive assessments of eelgrass productivity. But, invariance does not impede local factors to imply variability of parameter estimates. Besides, local influences other factors could explain numerical differences in parameter estimates. There are methodological influences that may render biased parameter estimates. Analysis method, sample size, and data quality can influence scaling results (e.g. [32, 33]). And, since scaling relationships are particularly sensitive to parametric uncertainties, Echavarría-Heras et al. [30] concluded that the actual precision of derived allometric surrogates requires clarification.

Here we deal with allometric surrogates for average leaf biomass in eelgrass shoots. These derive from the model w(t) = βa(t)α for leaf biomass w(t) and area a(t) measured at time t, and α and β parameters. Leaf area is more informative of eelgrass leaf biomass than corresponding length. Thus, the present scaling endures a boost in precision of parameter estimates by the model in [30]. This could increase the accuracy of derived surrogates for leaf biomass in shoots. Besides, eelgrass leaf area and length admit an isometric representation [34, 35]. Then, the time invariance found by Solana-Arellano et al. [31] also holds for parameter estimates of the present scaling. This by the way imbeds a nondestructive advantage to the present shoot-biomass substitutes. But, agreeing with Echavarría-Heras et al. [30], we must examine influences on precision of estimates for suitability of projections. Since, such an analysis was not produced before, we took here the try of filling that gap. Achieving the related goals, required the assemblage of an extensive data set. It comprises coupled measurements of eelgrass leaf biomasses and related areas. This is further called “raw data set”. A data cleaning procedure adapted from Echavarría-Heras et al. [30] removed inconsistent leaf biomass replicates from the raw data. Thereby forming what we call a “processed data set”. Differences in reproducibility strength allowed to assess data quality effects in precision. A similar procedure evaluated sample size effects. And a sensitivity analysis evaluated robustness of the projection method. This supports consistent, cost-effective allometric projections of observed values from raw data. But, this depends on nonlinear regression as an analysis method. Besides, sample size must be optimal. Data quality as expected improved reproducibility strength of the allometric projection method. But, this factor was more relevant in optimizing sample size. A detailed explanation of used procedures appears in the methods section. The results section is not only devoted to the presentation of our findings. It also examines the relative strengths of factors influencing the precision of proxies. A Discussion section emphasizes on the gains and the limitations of the method. Appendix 1 deals with the model selection problem. Appendix 2 is about data processing methodology. Appendix 3 presents the procedure for sensitivity assessment.

Methods

The symbol w m (t) will stand for average leaf dry weight of shoots collected at sampling date t. An the average of these values over all sampling dates symbolized through 〈w m (t)〉. Formal representations of these variables appear in Appendix 3 (cf. Eq. (15) and Eq. (16)). The symbols w(t) and a(t) will one to one stand for biomass and area of an individual leaf collected at a sampling time t. We assume that these variables are linked through the scaling relationship

$$ w(t)=\beta a{(t)}^{\alpha }. $$
(1)

The present raw data come from a coastal lagoon located in San Quintin Bay, México [30]. This comprises 10,412 leaves and measured lengths [mm], widths [mm] and dry weights (g). The product of length times width provided estimations of leaf area [mm2] [36]. In what follows the symbol n ra stands for number of leaves in raw data. Processed data results by applying direct and statistical data cleaning techniques. The direct hinges on the consistency of allometric models for eelgrass leaf biomass. Leaf length or area are allometric descriptors of eelgrass leaf biomass [30, 31, 34]. A model selection exploration corroborated a power function like trend assumption for the present data. Details appear in Appendix 1. Severe deviations, from the mean response curve, are inconsistent and must be removed. This took care of sets containing less than ten leaf dry weight replicates. The statistical procedure worked on sets with a larger number of replicates. It centers on properties of the median of a group of data. This is immune to sample size and also a robust estimator of scale. The adapted Median Absolute Deviation (MAD) data cleaning procedure [37] appears in Appendix 2. Processing data resulted in a number of n qa  = 6094 pairs of leaf dry weights and areas.

Parameter estimates \( \widehat{\alpha} \) and \( \widehat{\beta} \) and leaf area values yield allometric proxies \( {w}_m\left(\widehat{\alpha},\widehat{\beta},t\right) \) for w m (t). (cf. Eq. (19)). The symbol \( \left\langle {w}_m\left(\widehat{\alpha},\widehat{\beta},t\right)\right\rangle \) (cf. Eq. (20)) stands for the pertinent average over sampling dates. We use Lin’s Concordance Correlation Coefficient (CCC) [38] as an evaluation of reproducibility. This meant as the extent to which two connected variables fall on a line through the origin and with a slope of one. We represent this statistic by means of the symbol \( \widehat{\rho} \). Agreement defined as poor whenever \( \widehat{\rho}<0.90 \), moderate for \( 0.90\le \hat{\rho}<0.95 \), good for \( 0.95\le \hat{\rho}\le 0.99 \) or excellent for \( \widehat{\rho} \)>0.99 [39]. Values of \( \widehat{\rho} \) gave an evaluation of the strength of the \( {w}_m\left(\widehat{\alpha},\widehat{\beta},t\right) \) devise to reproduce observed values.

In getting parameter estimates \( \widehat{\alpha} \) and \( \widehat{\beta} \) we relied on two procedures. The traditional analysis method of allometry and nonlinear regression. Assessing analysis method effects on reproducibility strength of \( {w}_m\left(\widehat{\alpha},\widehat{\beta},t\right) \) depended on testing differences in \( \widehat{\rho} \). The traditional approach involves a linear regression equation (cf. Eq. (4)). This obtained through logarithmic transformation of response and descriptor in Eq. (1). The nonlinear regression analysis method relied on maximum likelihood [40, 41]. This approach fitted the model of Eq. (1) in a direct way in the original arithmetical scale. The nonlinear fit allowed the consideration of homoscedasticity or heteroscedasticity (cf. Eqs. (5) and (6)). All the required fittings for both raw and processed data depended on the use of the R software.

We also fitted the model of Eq. (1) to samples of different sizes taken out from primary and processed data sets. Each sample of size k; with 100 ≤ k ≤ n ra produced estimates \( \widehat{\alpha}(k) \) for α and \( \widehat{\beta}(k) \) for β, and resulting \( {w}_m\left(\widehat{\alpha}(k),\widehat{\beta}(k),t\right) \) projections. The symbol \( \widehat{\rho}(k) \) denotes the value of \( \widehat{\rho} \) for a sample of size k. Differences in \( \widehat{\rho}(k) \) allow exploring sample size influences in reproducibility.

Deviations ∆α q and ∆β r convey fluctuating values α q  = α + ∆α q and β r  = β + ∆β r for the parameters α and β one to one. The modulus of the vector of parametric changes (∆α q ∆β r ) defines a tolerance range θ(q, r). And the value of θ(q, r) determined by the standard errors of parameter estimates denoted by mean of θ ste . A fixed value of θ(q, r) leads to four possible characterizations of the pair (∆α q ∆β r ). Each one associates to a trajectory w m (α q , β r , t) shifting from a reference one w m (α, β, t). The symbol δw (α q , β r , t) (cf. Eq. (42)) denotes deviations between reference and average of shifting trajectories at sampling dates. And the average of δw (α q , β r , t) values taken over all sampling dates denoted through 〈δw (α q , β r , t)〉 (cf. Eq. (43)). The absolute value of the ratio of 〈δw (α q , β r , t)〉 to 〈w m (α, β, t)〉 defines a relative deviation index ϑ(θ). It measures sensitivity of 〈w m (α, β, t)〉 to fluctuations of tolerance θ(q, r) on α and β. Appendix 3 presents detailed formulae.

Results

Figure 1 shows the variation of leaf dry weight and area observed in the raw data. Smallest and largest leaf areas were 2 mm2 and 7868 mm2 respectively. Associated dry weights were 1 × 10−5 g and 0.1588 g one to one. The time average of mean leaf dry weight in shoots was 〈w m (t) 〉 = 0.01461g (cf. Eq. (16)). Each leaf area measurement associate to several replicates of leaf biomass. Number of replicates increased from a single association up to a largest value of 84. Dispersion masks a power function like trend. Contents of Appendix 1 corroborate this at formal level. And exploration of dispersion reveals severe deviations from the inherent power function-like trend. Inconsistencies are more visible for leaves with areas under 350 mm2 and also for those over 5000 mm2. This hints about the relevance of data quality.

Fig. 1
figure 1

Distribution of eelgrass leaf dry weight and linked area values in raw data. Dispersion shows a masked power function-like trend. Deviations from this trend are more manifest for areas under 350 mm2 and also for those bigger than 5000 mm2. This suggests data quality effects on accuracy of allometric projections

Figure 2 exhibits the spreading of leaf dry weight after data quality control. About 40% of the replicates in the raw data were eliminated. A power function like trend appears more depicted than that showing on Fig. 1. But dispersion still shows significant deviations around the expected power function like trend. This suggests lack of standardized routines for data gathering. In this work the length times width proxy [36] approximated leaf area. Errors in estimation of area of older damaged leaves could explain uneven replicates. Faulty equipment, or incorrect recording could explicate inconsistencies for small leaves.

Fig. 2
figure 2

Plot of processed data. Distribution of eelgrass leaf dry weight and area values remaining after data quality control procedures. Although about 40% of the replicates in the raw data were found inconsistent and eliminated, this plot still shows significant residual variability around an expected power function like trend

Table 1 gives estimates\( \widehat{\alpha} \) and \( \widehat{\beta} \)for the parameters α and β and corresponding standard errors. Assuming heteroscedasticity in the model of Eqs. (5) and (6) did not affect estimates. Thus, presentation of results of nonlinear regression refers to the homoscedastic case of the model of Eqs. (5) and (6). Figure 3 displays mean response curves fitted using raw data. Figure 4 shows those associated to quality controlled data. Results for the log-linear transformation method included correction for bias of retransformation [42]. The smearing estimate of bias of Duan [43] provided the form of the correction factor.

Table 1 Parameter estimates \( \widehat{\alpha} \) and \( \widehat{\beta} \) associated standard errors (\( ste\left(\widehat{\alpha}\right), ste\Big(\widehat{\beta} \))) found by fitting the model of Eq. (1). Nonlinear regression estimates associate to the homoscedastic case of the model of Eqs. (5) and (6) (see Appendix 1). Values of \( \widehat{\rho} \) give an evaluation of reproducibility strength of the proxy of Eq. (1)
Fig. 3
figure 3

Fit of the model of Eq. (1) to raw data. Panel a Fitting results of the model of Eq. (1) by the log-linear transformation method. Distribution of replicates around the mean response curve shows a significant bias. This entails poor reproducibility (\( \widehat{\uprho} \)=0.8910) of leaf dry weight values. Panel b Shows fitting results for nonlinear regression and raw data. For this arrangement parameter estimates and Eq. (1) produced \( \widehat{\uprho} \)=0.9307. Thus, nonlinear regression stands a gain in reproducibility strength

Fig. 4
figure 4

Fit of the model of Eq. (1) to processed data. Panel a Fitting results for model of Eq. (1) via traditional log-linear transformations. Though data processing improved goodness of fit, still a notorious bias remains. Panel b Fitting results by taking nonlinear regression as an analysis method. Shown spreading of replicates around the mean response curve is fair. Hence, \( \widehat{\uprho} \)=0.9777 entails suitable reproducibility of observed values via Eq. (1)

For raw data the log-linear transformation method produced \( \widehat{\rho}=0.8910 \), entailing poor reproducibility. This explains a biased distribution of replicates around the mean response curve (Fig. 3a). Meanwhile, estimates acquired by nonlinear regression from raw data conveyed adequate reproducibility (\( \widehat{\rho}=0.9307\Big) \). This explains a displayed fair distribution of projected leaf biomass values (Fig. 3b).

Estimates via log-linear transformation for processed data seemed enhance reproducibility (\( \widehat{\rho}=0.9777 \) ̂=0.9455). But, Fig. 4a, reveals a bulk of inconsistent replicates for leaves areas under 5000 mm2. Notice that this subset of replicates distributes almost evenly around the mean response curve. Yet replicate spread for areas beyond 5000 mm2 shows significant bias (Fig. 4a). Meanwhile, nonlinear regression and processed data associate to \( \widehat{\rho}=0.9777 \). This corresponded to good reproducibility strength. Indeed, spread of replicates around the mean response is fair (Fig. 4b).

Estimates acquired from raw data via the traditional analysis method of allometry returned a value of \( \widehat{\rho}=0.9285 \) for \( {w}_m\left(\hat{\alpha},\hat{\beta},t\right) \) projections (Table 2). This seems to correspond to moderate reproducibility. Yet, corresponding rms = 0.01265 was largest among analysis method- data set combinations (Table 2). Figure 3a shows a relative wider bias in spread around the mean response curve for larger leaves. This explains resulting inconsistencies in reproducibility of \( {w}_m\left(\widehat{\alpha},\widehat{\beta},t\right) \) projections shown in Fig. 5a. Display reveals biased \( {w}_m\left(\hat{\alpha},\hat{\beta},t\right) \) projections for near 50% of sampling dates. This, led to a smallest value of 0.8436 for the ratio of projected to observed averages.

Table 2 Reproducibility results for w m (α, β, t). Entries include, Lin’s concordance correlation coefficients \( \left(\hat{\rho}\right) \), root mean square deviations (rms) and ratios of 〈w m (α, β, t)〉 to 〈w m (t)〉 averages
Fig. 5
figure 5

Effects of analysis method on reproducibility of w m (α,β,t) projections (raw data). Continuous lines display w m (t) averages of leaf dry weight in shoots. Dashed lines in panel a show w m (α,β,t) projections produced by log-linear transformation. Dashed lines in panel b display those projected via nonlinear regression. Nonlinear regression estimates support greater reproducibility of observed w m (t) values through w m (α,β,t) proxies (Table 2)

Instead, nonlinear regression and raw data produced a value of \( \widehat{\rho}=0.9915. \) And root mean squared deviation attained a value of rms = 0.00460 (Table 2). This suggest a remarkable reproducibility strength for \( {w}_m\left(\hat{\alpha},\hat{\beta},t\right) \) projections (Table 2). Correspondence between projected and observed values, shown in Fig. 5b. corroborates high agreement. Moreover, the ratio of projected to observed leaf dry weight averages attained an outstanding value of 0.9773 (Table 2).

Processed data and log-linear transformation analysis produced \( \hat{\rho}=0.9489 \) for w m (α, β, t) projections. This figure is bigger than corresponding to raw data for this method. Nevertheless, lines in Fig. 6a show that this result does not correspond to a real gain in reproducibility. Besides data quality could not significantly reduce calculated root mean squared deviation (Table 2). Also, a value of 0.8588 for a ratio of projected to observed leaf dry weight averages is still low for suitable agreement (Table 2). Thus, regardless data quality, log-linear analysis failed to produce consistent w m (α, β, t) projections of w m (t) averages.

Fig. 6
figure 6

Effects of analysis method on reproducibility of w m (α,β,t) projections (processed data). Continuous lines depict observed w m (t) averages. Dashed lines in panel a) are projections yield by log-linear analysis. Dashed lines in panel b) stand for projections linked to nonlinear regression

In turn, w m (α, β, t) projections made by nonlinear regression and processed data yield the highest value of \( \hat{\rho}=0.9976 \). (Table 2). And also the smallest root mean squared deviation among analysis method–data set combinations (Table 2). As shown by Fig. 6b this corresponds to a fairly good reproducibility strength. Additionally, data quality and nonlinear regression led to an outstanding value of 0.9975 for the ratio of projected 〈w m (α, β, t)〉 to observed 〈w m (t)〉 averages.

Results exhibit that log-linear transformations failed to produce consistent projections for observed w m (t) averages. In contraposition, nonlinear regression entailed parameter estimates of noteworthy reliability. Thus studying sample size effects on reproducibility addressed only this analysis method. Figure 7 exhibits variation of CCC values depending on sample size k. This is expressed as a percentage of the extent of data set ( n ra  = 10412 for raw) or (n qa  = 6094 for processed). For raw data, a sample sized k = 0.20n ra endures reasonable reproducibility \( \left(\widehat{\rho}(k)=0.9889\right) \). But samples beyond this threshold would not induce a significant change in reproducibility. Meanwhile, for the quality controlled data, the sample size threshold for excellent reproducibility was k = 0.10n rq . This leads to a high reproducibility strength of \( \widehat{\rho}(k)=0.9929 \). Thus, a sample 10% the size of processed data set produced excellent reproducibility. Again sample sizes beyond this threshold failed to increase \( \widehat{\rho}(k) \) values.

Fig. 7
figure 7

The effects of sample size on reproducibility of w(α,β,t). For raw data a sample of size k = 0.20n ra (or near 2000 leaves) yields reasonable reproducibility (\( \widehat{\uprho} \)(k) = 0.9889). But, for quality controlled data similar reproducibility associates to only k = 0.10n rq (about 1000 leaves). Larger sample would not induce a significant changes in the values of \( \widehat{\uprho} \)(k)

In Appendix 3 we consider variations α q  = α + ∆α q and β r  = β + ∆β r . We found that shifting trajectories, w m (α q , β r , t) overestimated reference projections w m (α, β, t) whenever ∆α q  > 0 and ∆β r  > 0. In correspondence underestimation of w m (α, β, t) occurs for −α < ∆α q  < 0 and −β < ∆β r  < 0. Fig. 8 explains that for ∆α q  ∙ ∆β r  > 0, shifting trajectories overestimate (see red lines in panel a)) or underestimate the reference one (see blue lines in panel a)). We can also make certain that relatively smaller deviations between w m (α q , β r , t) and w m (α, β, t), values occur for the case, ∆α q  ∙ ∆β r  < 0, (see red and blue lines in panel b)).

Fig. 8
figure 8

Examples of changing trajectories w m (α q , β r ,t). Black lines a reference trajectory w m (α,β,t). This produced by raw data and nonlinear regression as an analysis method. For ∆α q . ∆β r  > 0, shifting trajectories w m (α q , β r ,t) overestimate or underestimate w m (α,β,t) projections (see red or blue lines in panel a)). The case ∆α q ∙∆β r  < 0, leads to relative smaller deviations between w m (α q , β r ,t) and w m (α,β,t) (see red and blue lines in panel b))

The simulation code of Eqs. (39) through (44) explored the sensitivity of the w m (α, β, t) projection method, to numerical variation of parameters α and β. Available parameter estimates yield reference values for α and β (Table 2). Again, since nonlinear regression associates to highest reproducibility strength, for easier presentation, we only explain results using this analysis method.

The variation of the absolute deviation index for raw data is shown in Fig. 9. A range 0 ≤ θ(q, r) ≤ θ ste , places the relative ϑ(θ) deviation index within the domain 0≤ ϑ(θ) ≤ 0.0205 (Table 3). Therefore, for a bound of θ(q, r) set by the standard errors of estimates largest absolute deviation between w m (α q , β r , t) and w m (t) amounts to about 2% of 〈w m (t) 〉. Moreover, a range 0 ≤ θ(q, r) ≤ 2θ ste produces 0 ≤ ϑ(θ) ≤ 0.031. This leads to an equivalent 3% of 〈w m (t) 〉. Figure 10 displays the dynamics of ϑ(θ) depending on θ(q, r) for processed data. We have that θ(q, r) varying in a range of 0 ≤ θ(q, r) ≤ θ ste implies 0 ≤ ϑ(θ) ≤ 0.003 (Table 3). This time largest absolute deviation was only 0.03% of 〈w m (t)〉. Comparing with results for raw data, we ascertained remarkable gain in precision of w m (α, β, t) projections. This exploration highlights on importance of data quality control as a procedure leading the consistent w m (α, β, t) projections. But, results show that the projection method is robust even when parameter estimates are obtained from raw data.

Fig. 9
figure 9

The variation of the relative deviation index ϑ(θ) (raw data). The standard errors of estimates acquired by nonlinear regression from raw data, produced ϑ(θ ste ) = 0.0205. And for a range 0 ≤ θ(q,r) ≤ θ ste the largest value of the absolute deviation between w m (α q , β r ,t) and w m (t) is around 2% of 〈w m (t) 〉 (see Appendix 3 for details)

Table 3 Sensitivity of the w m (α, β, t) projections to changes in estimates of the parameters α and β. Included are calculated θ ste values. This gives θ(q, r) as determined by the standard errors of estimates. We also present corresponding values of the relative deviation index ϑ(θ ste )
Fig. 10
figure 10

The variation of the relative deviation index ϑ(θ) (processed data). The standard errors of estimates acquired by nonlinear regression from processed data, produced θ ste =3.954 × 10− 3). This set a range 0 ≤ ϑ(θ) ≤ 0.003. Thus, the largest absolute deviation between w m (α q , β r ,t) and w m (t) amounts to about 0.3% of the 〈w m (t) 〉 average. Data quality control could be a factor improving accuracy of w m (α,β,t) projections

Discussion

Results of Solana-Arellano et al. [31] explain invariance of the allometric parameters α and β in Eq. (1). This suggest w m (α, β, t) proxies as possible nondestructive estimations of the average leaf dry weight in eelgrass shoots. These assessments are essential for monitoring the efficiency of transplanted eelgrass plots, fundamental in remediation aims. The present examination shows that the w m (α, β, t) proxies could in fact offer reliable and cost-effective assessments. This on condition that practitioners take in to account our guidelines. For instance, our results typify the extent on which accuracy of estimates of the parameters α and β influences the predictive power of the w m (α, β, t) projections. And, our findings clarify that there are methodological factors affecting the accuracy of estimates. Related influences could spread significant bias in approximations supported by the w m (α, β, t) device. Indeed, analysis method turned into a main factor inducing bias in parameter estimates of the model of Eq. (1). Moreover, only parameter estimates acquired by nonlinear regression yield consistency of the model of Eq. (1) (Table 1 and lines in Fig. 3b and Fig. 4b). And, only these estimates upheld conclusive predictive power of the w m (α, β, t) proxies (Table 2, as well as, lines in Fig. 5b and Fig. 6b). Our results also show that data quality could not improve the performance of w m (α, β, t) projections acquired via log-linear transformations. Without doubt, parameter estimates acquired from processed data by this method still led to significant bias in w m (α, β, t) projections (Fig. 6a). Meanwhile, data processing improved reproducibility of projections built for raw data using nonlinear regression (Table 2 and lines in Fig. 5b and Fig. 6b). Besides, relevance of data quality was also evident for optimizing sample size. Indeed, while for raw data, a sample of approximately 2000 leaves shows reasonable reproducibility, for the quality controlled data this threshold drops to near 1000. However, samples sized beyond these thresholds would not induce a noteworthy gain in reproducibility. This result on its own, ties to efficiency of the w m (α, β, t) projection method. Undoubtedly routines for leaf dry weight assessment are tedious and time consuming. So, reducing size of data set for parameter estimation increases cost-effectiveness of the w m (α, β, t) projection method.

Nonlinear regression estimation also showed advantages in sensitivity over the log-linear analysis counterpart. Estimates from raw data led to a largest absolute deviation between w m (α, β, t) and w m (t) values amounting only 3% of the average of w m (t) over sampling dates. And, for processed data, the fluctuation range for equivalent sensitivity widened to 2.8 times the range for standard errors of estimates. But, on spite of data quality relevance, sensitivity results for raw data reveal that the w m (α, β, t) projection method is robust relative to expected fluctuations in parameter estimates.

Our results show that both the accuracy and cost-effectiveness of projections can be enhanced by the addition of data quality control procedures. However, including data processing can become a weakness for the w m (α, β, t) projection method. Indeed, data cleaning procedures convey niceties that relate in a fundamental way to detection and rejection of inconsistent replicates. Also, compromising about which particular rejection edge should work, is hard to determine. Thus, the use of any data processing will endure a doubt, that the examiner selects an arrangement producing the most probable results [37]. In that order of ideas, when attempting to enhance the reproducibility power of w m (α, β, t) projections it is desirable to avoid depending in any form of data processing. For that aim, prior to data assembly, we must bear in mind standardized routines yielding accurate measurements for w(t) and a(t). This will favor direct identification of the model of Eq. (1) in a consistent way. It is of a fundamental importance to be aware, that errors in leaf dry weight or area assessment differentiate in terms of leaf size. Certainly, leaves produced anew normally present a complete and undamaged span. But, they normally yield very reduced dry weight values. Therefore, we can expect estimation errors imputable to the precision of the analytical scale for individual leaf dry weight assessments. To this, we may add errors in the reading and/or recording of the scale output. These issues could explain a perceptible accumulation of inconsistent replicates for leaves with areas between 2 mm2 and 350 mm2 (Fig. 1). And, even after data cleaning procedures, leaf dry weight replicate spread for leaves bigger than 2000 mm2 shows significant residual variability (Fig. 2). Likewise, as far as, bigger and older leaves is concerned, there are issues on dry weight estimation errors. These seem to relate to damage caused by exposure to environmental factors. The fact that we estimated leaf area by means of the product of related length and width could explain these effects. For complete undamaged eelgrass leaves, the use of a leaf times width proxy grants simplified and accurate estimations of leaf area [36].

But, this approach could deliver inaccurate estimations for long and damaged leaves. Actually, bigger leaves remain exposed during significant periods of time to environmental influences such as drag forces or herbivory. This could remove large amounts of leaf tissue while leaving length unaffected. Thus, causing overestimation of true leaf area when using a width and length product estimation. At the same time, lost portions of a leaf will produce a smaller dry weight than expected for an overestimated area. These effects will bring dry weight replicates that deviate from the power function–like trend associated to the model of Eq. (1). Estimation bias for the dry weights of smaller and longer leaves could explain the anomalous proliferation of inconsistencies (around 41 % ) found while applying the proposed data cleaning procedure to the present raw data. These effects will propagate uncertainty of parameter estimates of the model of Eq. (1), influencing accuracy of the w m (α, β, t) projections. Hence, for the sake of consistent reproducibility of observed values via w m (α, β, t) projections, we need to be aware of these effects. And as elaborated, a good starting point for granting consistency, is appropriate gathering of primary data for the identification of the model of Eq. (1). This will make subsequent data cleaning procedures unnecessary.

Conclusion

This research show that precise estimates of allometric parameters in Eq. (1) grant accurate non-destructive projections of the average leaf dry weight in eelgrass shoots. These projections offer efficient appraisal of eelgrass restoration projects, thereby contributing to the conservation of this important seagrass species. Our findings support views in Hui and Jackson [32], Packard and Birchard [33] and Packard et al. [44], on the relevance of analysis method in scaling studies. Indeed, we found that for assuring suitability of the w m (α, β, t) proxies, the use of nonlinear regression is crucial. On spite of claims that the use of the traditional log-linear analysis method is a must in allometric examination [45], exploration of spread of present crude data reveals curvature [46]. This explaining failure of the traditional analysis method to produce unbiased results for the present data. Besides proxies supported by nonlinear regression and raw data, are robust.

Data cleaning could only marginally enhance the accuracy of proxies produced by nonlinear regression and raw data. But results underline a relevant influence of data quality in setting optimal sample size for a suitable precision of parameter estimates. Nevertheless, the use of data cleaning procedures leads to intricacies. They in a fundamental way relate to choosing thresholds for rejection of detected inconsistencies, which are often regarded as subjective. Thus, instead of using later data cleaning, data gathering should seek for suitability. Special care must be taken when processing bigger and older leaves. These are often damaged or even trimmed so that their dry weights do not conform to a true weight to area relationship. Irregularities in raw data may also associate to biased estimation of leaf length or width. Moreover, in a lesser way faulty equipment for leaf dry weight assessment, rounding off, or even incorrect data recording could as well contribute.

Taking advantage of a time invariance of the parameters in Eq. (1) the w m (α, β, t) device could offer to the eelgrass conservation practitioner highly consistent and truly nondestructive assessments of the average value of leaf dry weight in shoots. But the explained guidelines on analysis method, sample size and data appropriateness are mandatory for cost-effectiveness. Moreover, the present results suggest that the use of the w m (α, β, t) method could be extended to other seagrasses species with similar leaf architecture to eelgrass.

References

  1. Costanza R, d’Arge R, de Groot R, Farber S, Garsso M, Hannon B, et al. The value of the world’s ecosystem services and natural capital. Nature. 1997;387:253–60.

    Article  CAS  Google Scholar 

  2. Short FT, Neckles HA. The effects of global climate change on seagrasses. Aquatic Bot. 1999;63:169–96.

    Article  Google Scholar 

  3. Grech A, Chartrand-Miller K, Erftemeijer P, Fonseca M, McKenzie L, Rasheed M, et al. A comparison of threats, vulnerabilities and management approaches in global seagrass bioregions. Environ Res Lett. 2012;7:024006.

    Article  Google Scholar 

  4. Unworth KFR. Seagrass meadows in a globally changing environment. Marine Poll Bull. 2014;83(2):383–6.

    Article  Google Scholar 

  5. Flindt MR, Rasmussen EK, Valdemarsen T, Erichsen A, Kaas H, Canal-Vergés P. Using a GIS-tool to evaluate potential eelgrass reestablishment in estuaries. Ecol Model. 2016;338:122–34.

    Article  Google Scholar 

  6. Hughes BB, Eby R, Dyke EV, Tinker MT, Marks CI, Johnson KS, et al. Recovery of a top predator mediates negative eutrophic effects on seagrass. Proc Natl Acad Sci. 2013;110:15313–8.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  7. van Katwijk MM, Thorhaug A, Marbà N, Orth RJ, Duarte CM, Kendrick GA, et al. Global analysis of seagrass restoration: the importance of large-scale planting. J Appl Ecol. 2016;53:567–78.

    Article  Google Scholar 

  8. Short FT, Kostenb S, Morganc PA, Malone S, Moore GE. Impact of climate change on submerged and emergent wetland plants. Aquat Bot. 2016;135:3–17.

    Article  Google Scholar 

  9. Holmquist JG, Powell GVN, Sogard SM. Decapod and stomatopod assemblages on a system of seagrass-covered mud banks in Florida bay. Mar Biol. 1989;100:473–83.

    Article  Google Scholar 

  10. Montague CL, Ley JA. A possible effect of salinity fluctuation on abundance of benthic vegetation and associated fauna in northeastern Florida bay. Estuar Coast Shelf Sci. 1993;16:703–17.

    CAS  Google Scholar 

  11. Plummer ML, Harvey CJ, Anderson LE, Guerry AD, Ruckelshaus MH. The role of eelgrass in marine community interactions and ecosystem services: results from ecosystem-scale food web models. Ecosystems. 2013;16(2):237–51.

    Article  Google Scholar 

  12. Blackburn TH, Nedwell DB, Weibe WJ. Active mineral cycling in a Jamaican seagrass sediment. Mar Ecol Prog Ser. 1994;110:233–9.

    Article  CAS  Google Scholar 

  13. Park SR, Li WT, Kim SH, Kim JW, Lee KS. A comparison of methods for estimating the productivity of Zostera marina. J Ecol Field Biol. 2010;33(1):59–65.

    Google Scholar 

  14. Terrados J, Borum J. Why are seagrasses important? goods and services provided by seagrass meadows. In: Borum J, Duarte CM, Krause-Jensen D, Greve TM, editors. European seagrasses: an introduction to monitoring and management: EU project Monitoring and Managing of European Seagrasses (M&MS); 2004. p. 8–10. http://www.seagrasses.org/handbook/european_seagrasses_high.pdf.

  15. Newell IER, Koch EW. Modeling seagrass density and distribution in response to changes in turbidity stemming from bivalve filtration and seagrass sediment stabilization. Estuar Coast Shelf Sci. 2004;27(5):793–806.

    Google Scholar 

  16. Christianen MJA, van Belzen J, Herman PMJ, van Katwijk MM, Lamers LPM, van Leent PJM, et al. Low-canopy seagrass beds still provide important coastal protection services. PLoS One. 2013;8(5):e62413.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  17. Liu X, Zhou Y, Yang H, Ru S. Eelgrass detritus as a food source for the sea cucumber Apostichopus Japonicus Selenka (Echinodermata: Holothuroidea) in coastal waters of North China: an experimental study in flow-through systems. PLoS One. 2013;8(3):e58293.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  18. Kennedy H, Beggins J, Duarte CM, Fourqurean JW, Holmer M, Marbà N, et al. Seagrass sediments as a global carbon sink: isotopic constraints. Global Biogeome Cycles. 2010;24(4):GB4026.

    Google Scholar 

  19. Duarte CM, Sintes T, Marba N. Assessing the CO2 capture potential of seagrass restoration projects. J Appl Ecol. 2013;50:1341–9.

    Article  CAS  Google Scholar 

  20. Short FT, Coles RG, Pergent-Martini C. Global seagrass distribution. In: Short FT, Coles RG, editors. Global seagrass research methods. Amsterdam: Elsevier Science B.V; 2001. p. 6.

    Google Scholar 

  21. Phillips RC. In: Odum HT, Copeland BJ, Mc Mahan EA, editors. Temperate grass flats in coastal ecological Systems of the United States. 2nd ed. Washington DC: Conservation Foundation; 1974. p. 244–99.

    Google Scholar 

  22. Jacobs RPWM. Distribution and aspects of the production and biomass of eelgrass, Zostera marina L. at Roscoff, France. Aquat Bot. 1979;7:151–72.

    Article  Google Scholar 

  23. Williams RB. Nutrient level and phytoplankton productive in the estuary. In: Chabreck RA, editor. Proceedings of the coastal marsh and estuary management symposium. Baton Rouge: Louisiana State University; 1973. p. 59.

    Google Scholar 

  24. Novak AB, Plaisted HK, Hays CG, Hughes RA. Limited effects of source population identity and number on seagrass transplant performance. PeerJ. 2017;5:e2972.

    Article  PubMed  PubMed Central  Google Scholar 

  25. Matheson FE, Reed J, Dos Santos VM, Mackay G, Cummings VJ. Seagrass rehabilitation: successful transplants and evaluation of methods at different spatial scales. N Z J Mar Freshwater Res. 2017;51(1):96–109.

    Article  Google Scholar 

  26. Fishman JR, Orth RJ, Marion S, Bieri J. A comparative test of mechanized and manual transplanting of eelgrass, Zostera marina, in Chesapeake Bay. Restor Ecol. 2004;12:214–9.

    Article  Google Scholar 

  27. Olsen JL, Coyer JA, Chesney B. Numerous mitigation transplants of the eelgrass Zostera marina in southern California shuffle genetic diversity and may promote hybridization with Zostera pacifica. Biol Conserv. 2014;176:133–43.

    Article  Google Scholar 

  28. Li WT, Kim JH, Park JI, Lee KS. Assessing establishment success of Zostera marina transplants through measurements of shoot morphology and growth. Estuar Coast Shelf Sci. 2010;88(3):377–84.

    Article  Google Scholar 

  29. Hauxwell J, Cebrián J, Valiela I. Eelgrass Zostera marina loss in temperate estuaries: relationship to land-derived nitrogen loads and effect of light limitation imposed by algae. Mar Ecol Prog Ser. 2003;247:59–73.

    Article  CAS  Google Scholar 

  30. Echavarría-Heras HA, Leal-Ramírez C, Villa-Diharce E, Cazarez-Castro NR. The effect of parameter variability in the allometric projection of leaf growth rates for eelgrass (Zostera Marina L.) II: the importance of data quality control procedures in bias reduction. Theor Biol Med Model. 2015;12:30.

    Article  PubMed  PubMed Central  Google Scholar 

  31. Solana-Arellano ME, Echavarría-Heras HA, Leal-Ramírez C, Kun-Seop L. The effect of parameter variability in the allometric projection of leaf growth rates for eelgrass (Zostera marina L). Lat Am J Aquat Res. 2014;42(5):1099–108.

    Article  Google Scholar 

  32. Hui D, Jackson RB. Uncertainty in allometric exponent estimation: a case study in scaling metabolic rate with body mass. J Theor Biol. 2007;249:168–77.

    Article  PubMed  Google Scholar 

  33. Packard GC, Birchard GF. Traditional allometric analysis fails to provide a valid predictive model for mammalian metabolic rates. J Exp Biol. 2008;211:3581–7.

    Article  PubMed  Google Scholar 

  34. Echavarría-Heras HA, Lee KS, Solana-Arellano ME, Franco-Vizcaino E. Formal analysis and evaluation of allometric methods for estimating above-ground biomass of eelgrass. Ann Appl Biol. 2011;159(3):503–15.

    Article  Google Scholar 

  35. Echavarría-Heras HA, Leal-Ramírez C, Villa-Diharce E, Castillo O. Using the value of Lin's concordance correlation coefficient as a criterion for efficient estimation of areas of leaves of eelgrass from noisy digital images. Source code Biol Med 2014; 9(1):29.

  36. Leal-Ramírez C, Echavarría-Heras HA, Castillo O. Exploring the suitability of a genetic algorithm as tool for boosting efficiency in Monte Carlo estimation of leaf area of eelgrass. In: P Melin, O Castillo, J Kacprzyk, editores. Design of intelligent systems based on fuzzy logic, neural networks and nature-inspired optimization. Studies in computational intelligence. England: Springer. 2015;601:291–303.

  37. Leys C, Ley C, Klein O, Bernard P, Licata L. Detecting outliers: do not use standard deviation around the mean, use absolute deviation around the median. J Exp Soc Psychol. 2013;49(4):764–6.

    Article  Google Scholar 

  38. Lin LI. A concordance correlation coefficient to evaluate reproducibility. Biometrics. 1989;45:255–68.

    Article  CAS  PubMed  Google Scholar 

  39. McBride GB. A proposal for strength-of-agreement criteria for Lin’s concordance correlation coefficient. NIWA client report: HAM2005–062. Hamilton: National Institute of Water & Atmospheric Research; 2005.

    Google Scholar 

  40. Paiwatin Y. Statistical Modeling and Inference Using Likelihood. In: In all Likelihood. UK: Oxford University Press; 2013.

  41. Casella G, Berger RL. Statistical inference. 2nd ed. UK: Duxbury Press; 2001.

  42. Newman MC. Regression analysis of log-transformed data: statistical bias and its correction. Environ Toxicol. 1993;12:1129–33.

    Article  CAS  Google Scholar 

  43. Duan N. Smearing estimate: a nonparametric retransformation method. J Am Stat Assoc. 1983;78:605–10.

    Article  Google Scholar 

  44. Packard GC, Boardman TJ, Birchard GF. Allometric equations for predicting body mass of dinosaurs: a comment on Cawley and Janacek. J Zool. 2010;282:221–2.

    Article  Google Scholar 

  45. Lai J, Yang B, Lin D, Kerkhoff AJ, Ma K. The allometry of coarse root biomass: log-transformed linear regression or nonlinear regression? PLoS One. 2013;8(10):e77007.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  46. Packard GC. Misconceptions about logarithmic transformation and the traditional allometric method. Zoology. 2017;123:115–20.

    Article  PubMed  Google Scholar 

  47. Huber PJ. Robust statistics. New York: Wiley; 1981.

    Book  Google Scholar 

  48. Miller J. Reaction time analysis with outlier exclusion: bias varies with sample size. Q J Exp Psychol. 1991;43(4):907–12.

    Article  CAS  Google Scholar 

Download references

Acknowledgements

We thank Francisco Javier Ponce Isguerra for art work.

Funding

All funding for the present research were obtained from institutional grants

Availability of data and materials

Data and simulation codes will be available from corresponding author upon request.

Author information

Authors and Affiliations

Authors

Contributions

HEH conceived, designed performed and supervised the whole research. CLR performed formal derivations simulation and numerical tasks. EVD performed the statistical analysis and interpretation of data. NCC revised the manuscript critically both at statistical and formal levels. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Héctor Echavarría-Heras.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interest.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

Model selection

This Appendix covers the selection problem of Eq. (1) as a model for the variation of eelgrass leaf dry weight in terms of related area. On first sight the spread in Fig. 1 may suggest that instead of clinging to the power function model of Eq. (1) we should explore fitting an alternative linear model. There are two possible ways to achieve a linear model fit for the present data. One by fitting an isometric model. This follows by setting α = 1 in Eq. (1) to yield a linear model with w-intercept through origin, that is,

$$ w(t)\kern0.5em =\kern0.5em \beta a(t) $$
(2)

A second choice is considering a standard linear equation with intercept different from zero, that is,

$$ w(t)=\beta a(t)+\delta, $$
(3)

where β is the slope and δ the intercept. But, by looking at the spread in Fig. 1, we realize that this model turns out to be inconsistent. This because for a vanishing leaf area we expect a vanishing dry weight. In any event, we fit the model of Eq. (3) as a device to analyze the suitability of the model of Eq. (2).

Fitting of the model of Eq. (1) by means the traditional analysis method of allometry

For this fit we used raw data. Appling a logarithmic transformation on both sides of the Eq. (1), since we have available data pairs (w i , a i ), we get

$$ \mathrm{In}\left({w}_i\right)=\mathrm{In}\kern0.28em \left(\beta \right)+\alpha \kern0.28em \mathrm{In}\kern0.28em \left({a}_i\right)+{\varepsilon}_i,i=1,2,\dots, n, $$
(4)

where the additive errors ε1, ε2, …, ε n are independent and identically distributed, with distribution N(0, σ2). Fitting the model of Eq. (4) to data produced entries in Tables 4, 5 and 6.

Table 4 Residual statistics for the fitting of model of Eq. (4)
Table 5 Fitting results for the model of Eq. (4)
Table 6 Fitting tests for the model of Eq. (4)

Residual and normal probability plots for the fit of the model of Eq. (4) appear in Fig. 11. We notice that a value α = 1.0 is not included in the confidence interval for the parameter α. This impairs Eq. (2) as a consistent model for the present raw data.

Fig. 11
figure 11

Residuals and normal probability plots for the fitting of the model of Eq. (4). This plot associates to a fit to raw data by means of the traditional analysis method of allometry. We observe a biased distribution of residuals around the zero line. Also, normal Q-Q plot shows heavier tails than expected for a normal distribution (left and right panels one to one)

Fitting of the model of Eq. (1) via non-linear regression

For the aim of fitting the model of Eq. (1) to raw data using nonlinear regression, we use the equation

$$ {w}_i=\beta {a_i}^{\alpha }+{\varepsilon}_i $$
(5)

the error term ε i has a normal distribution with mean μ zero and standard deviation σ i . This chosen to contemplate heteroscedasticity or homoscedasticity. Moreover, σ i depends on the predictor variable a i through the model

$$ {\sigma}_i={\gamma}_0+{\gamma}_1{a}_i, $$
(6)

where γ0 and γ1 are parameters to be estimated. The case γ1 different from zero associates to heteroscedasticity. Whenever γ1 vanishes allows the consideration of homoscedasticity.

The response variable w i resulting from Eq. (5) is normally distributed, with mean μ i  = βa i α and standard deviation σ i given by Eq. (6). Fitting this regression model to raw data relied on a likelihood approach, i.e., acquiring estimates for the parameters β, α, γ0 and γ1, which yield the largest value of the log likelihood function [40, 41]

$$ l\left(\beta, \alpha, {\gamma}_0,{\gamma}_1\right)=-\frac{n}{2}\log \left(2\pi \right)-{\sum}_1^n\log \left({\sigma}_i\right)-\frac{1}{2}{\sum}_1^n{\left(\frac{w_i-{\mu}_i}{\sigma_i}\right)}^2, $$
(7)

with μ i  = βa i α and σ i given by Eq. (6).

Estimates \( \widehat{\beta} \)and \( \widehat{\alpha} \) are very similar for the homoscedastic and heteroscedastic cases of Eq. (5). Notice that a value α = 1.0 is not included in the confidence interval for the parameter α. Thus, this fit also excludes the model of Eq. (2) (isometric case).

Confidence intervals in Table 7 show a wide overlap region. It follows that parameter estimates do not differ in a significant way. This is also supported by the estimated CCC value (\( \widehat{\rho}=0.9307 \)) that was the same for the heteroscedastic and homoscedastic cases. Likewise, plotting the response curves w i  = βa i α on a dispersion diagram, reveals that curves estimated considering heteroscedasticity or homoscedasticity coincide (Fig. 12).

Table 7 Maximum likelihood estimates for parameters β and α in Eq. (5). This fit considers heteroscedasticity (γ1 ≠ 0) or homoscedasticity (γ1 = 0)
Fig. 12
figure 12

Data dispersion and allometric functions fitted through Eq. (5). This plot shows allometric function lines fitted by means of Eq. (5) and raw data. Homoscedasticity (blue lines) and heteroscedasticity (red lines) (cf. Eq. (6)). Notice that both lines overlap

Figure 13 shows the scatter diagram of dry weight residuals against leaf area. It also displays the region bounded by lines for 95% confidence intervals (blue lines). Figure 14 displays raw data and response curve (median) for the heteroscedastic model. It includes curves for region comprising 95% of leaf dry weight values (blue lines).

Fig. 13
figure 13

Scatter diagram of residuals of Eq. (5). This shows residuals against leaf area that associate to the fit of the regression model of Eqs. (5) and (6) allowing heteroscedasticity. Blue lines bound region 95% confidence intervals

Fig. 14
figure 14

Dispersion around response curve fitted using Eq. (5) assuming heteroscedasticity. This displays leaf dry weight against leaf area in arithmetical scale and response curve (median) estimated by the heteroscedastic regression model of Eqs. (5) and (6) (red lines). Also shown is region comprising 95% of leaf dry weight values observed for a given value of leaf area

Thus, predictive strength from the homoscedastic or heteroscedastic regression models are similar. This because assumed distributions of the error term, in both models are symmetric. What makes a difference is the form capturing the uncertainty pattern.

Fitting of the linear model of Eq. (3)

Since, raw data composes of pairs (w i , a i ), in order to test the consistency of the model of Eq. (3), we consider the regression equation

$$ {w}_i=\beta\;{a}_i+\delta +{\varepsilon}_i, $$
(8)

where the additive errors ε1, ε2, …, ε n are independent and identically distributed N(0, σ2). Fitting regression Eq. (8) to raw data produced the results in Table (8),Table (9) and Table (10).

Table 8 Residual statistics for the fit of Eq. (8)
Table 9 Fitting results for the model of Eq. (8)
Table 10 Fitting test statistics for the model of Eq. (8)

Residual and normal probability plot for this fit appear in Fig. 15. Looking at the confidence intervals in Table 7, we observe that a value α = 1 is not included. This implies rejection for the model of Eq. (2) (isometric case). Also, the intercept in Eq. (8) is negative, which is inconsistent. This is inconsistent with observations. Thus, we must reject the model of Eq. (3).

Fig. 15
figure 15

Residual and normal probability plots of fitting of Eq. (3). This shows the residual and normal probability plots resulting from the fitting of the linear regression model of Eq. (3) to raw data (left and right panels respectively)

In summary, fitting results for the model of Eq. (1) excludes the value α = 1. Now, for α = 1, the model of Eq. (1) takes the form of a straight line through the origin (isometric model of Eq. (2)). In the other hand, for the fit of the linear model of Eq. (3) the intercept is negative. This means that this model could predict negative values of leaf dry weight. Thus, the model of Eq. (1) is the only one supporting a good selection criterion.

Appendix 2

Data processing

This Appendix explains the Median Absolute Deviation (MAD) criterion for data cleaning. A first step is detecting the group of leaf dry weight replicates that associate to a given leaf area. The symbol G(a) stands for this group, and the number of replicates it contains denoted by means of the symbol n(G). Formally, G(a) = {w i (a) | 1 ≤ i ≤ n(G) }. The median of G(a) is denoted by means of MED(G(a)), that is,

$$ \mathrm{MED}\left(G(a)\right)=\mathrm{MED}\left\{{w}_1(a),\dots .,{w}_{n(G)}(a)\right\}. $$
(9)

And, for 1 ≤ i ≤ n(G) the absolute deviation of replicate w i (a) from the median MED(G(a)) represented by means of δ i (a), that is,

$$ {\delta}_i(a)=\left|{w}_i(a)-\mathrm{MED}\left(G(a)\right)\right|. $$
(10)

Similarly, the median of the set of absolute deviations is signified by the symbol MED(δG(a)), that is,

$$ \mathrm{MED}\left(\delta G(a)\right)=\mathrm{MED}\left\{{\delta}_1(a),\dots .,{\delta}_{n(G)}(a)\right\}. $$
(11)

We use the character MAD(G(a)) to denote median absolute deviation of a group G(a). After Huber [47] and recalling that eelgrass leaf dry weight values are log-normally distributed, this is calculated through,

$$ \mathrm{MAD}\left(G(a)\right)=b\mathrm{MED}\left\{\ {\delta}_1(a),\dots .,{\delta}_{n(G)}(a)\right\}, $$
(12)

where b = 1/Q(0.75), being Q(0.75) the 0.75 quantile of the lognormal distribution.

The removal of inconsistent replicates in a group G(a) follows the decision criterion

$$ \mathrm{MED}\left(G(a)\right)-T\cdot \mathrm{MAD}\left(G(a)\right)<{w}_i(a)<\mathrm{MED}\left(G(a)\right)+T\cdot \mathrm{MAD}\left(G(a)\right), $$
(13)

where T is the rejection threshold. Following Miller [48], we set T = 3. In what follows, the symbol n qa stands for the size of the resulting quality controlled data set.

Appendix 3

Sensitivity of w m (α, β, t)

This Appendix presents a study of the influences of parameter fluctuations in the precision of the w m (α, β, t) projections. The examination is performed on the basis of both analytical and simulation approaches. Presentation requires a formal description of variables of interest. For that aim we label a generic eelgrass shoot by using a subscript s. Correspondingly, the characters nl(s) and w s (t) shall stand for the associated number of leaves and their combined dry weight. Similarly, ns(t) denotes the number of shoots collected at a sampling date t. And, the number of sampling date symbolized by means of n(t).

Observed dry weight in a shoot s is represented by mean of

$$ {w}_s(t)={\sum}_{nl(s)}w(t), $$
(14)

where ∑nl(s) indicates summation of the leaves that the shoot s, holds. Meanwhile, the matching average leaf dry weight of shoots denoted by the symbol w m (t) is given by

$$ {w}_m(t)=\frac{\sum_{ns(t)}{w}_s(t)}{ns(t)\kern0.5em }, $$
(15)

where ∑ns(t) indicates summation of the ns(t) shoots collected at a time t.

The character 〈w m (t)〉 stands for the average of w m (t) values over the number n(t) of sampling times. It is given by

$$ \left\langle {w}_m(t)\right\rangle =\frac{\sum_{n(t)\kern0.75em }{w}_m(t)}{n(t)}, $$
(16)

where \( \sum \limits_{n(t)} \) indicates summation over the number of sampling times.

We assumed that w(t) and a(t) are linked through the scaling expression of Eq. (1). Then, we can obtain an allometric proxy for w s (t). The symbol w s (α, β, t) represents this surrogate and it is given by

$$ {w}_s\left(\alpha, \beta, t\right)={\sum}_{nl(s)}\beta a{(t)}^{\alpha }. $$
(17)

Moreover,

$$ {w}_s(t)={w}_s\left(\alpha, \beta, t\right)+{\epsilon}_s\left(\alpha, \beta, t\right), $$
(18)

that is, w s (α, β, t) is an approximation to the true value of w s (t), being ϵ s (α, β, t) the error of estimation. Similarly, Eq. (17) produces an allometric proxy for w m (t), the average leaf dry weight in shoots at a time t. This surrogate, represented here by means of w m (α, β, t), is given by

$$ {w}_m\left(\alpha, \beta, t\right)=\frac{\sum_{ns(t)\kern0.5em }\ {w}_s\left(\alpha, \beta, t\right)}{ns(t)\kern0.5em }. $$
(19)

In turn, the average of w m (α, β, t) over the number n(t) of sampling times, denoted through 〈w m (α, β, t) 〉, is calculated by means of

$$ \left\langle {w}_m\left(\alpha, \beta, t\right)\ \right\rangle =\frac{\sum_{n(t)}\ {w}_m\left(\alpha, \beta, t\right)}{n(t)\ }. $$
(20)

Similarly,

$$ {w}_m(t)={w}_m\left(\alpha, \beta, t\right)+{\epsilon}_m\left(\alpha, \beta, t\right), $$
(21)

where ϵ m (α, β, t) stands for the resulting approximating error. Moreover, the root mean squared deviation between w m (t) and w m (α, β, t) is denoted by means of rms(α, β) and given by

$$ rms\left(\alpha, \beta \right)=\sqrt{\sum_{n(t)}{\epsilon}_m{\left(\alpha, \beta, t\right)}^2/n(t)}, $$
(22)

again ∑n(t) indicates summation over the sampling dates n(t).

We assume changes in parameters α and β given by factors of their standard errors. Then, a factor q associates to a shift of the value of the parameter α. This deviation, denoted by means of the symbol α q , is given by

$$ {\alpha}_q=\alpha +\varDelta {\alpha}_q, $$
(23)

where

$$ {\Delta \alpha}_q=q\cdot ste\left(\alpha \right), $$
(24)

with ste (α) symbolizing the standard error of α and q satisfying.

$$ \mid q\mid \le {q}_{\max .} $$
(25)

Similarly, a factor r, such that.

$$ \left|r\right|\le {r}_{\mathit{\max},} $$
(26)

associates to a fluctuation in β denoted by means of Δβ r and expressed by way of

$$ \varDelta {\beta}_r=r\cdot ste\left(\beta \right), $$
(27)

being ste(β) the standard error of β. This returns a variation of the parameter β, denoted via β r and represented in the form

$$ {\beta}_r=\beta +\varDelta {\beta}_r. $$
(28)

Qualitative exploration of sensitivity of w m (α, β, t)

A qualitative study is intended to set domains where reference estimations w m (α, β, t) are underestimated or overestimated by changing values w m (α q , β r , t).For that aim, we firstly set reasonable local variation ranges for the numerical deviations α q and β r . Fittings of Eq. (1) to eelgrass leaf dry weight and area data demonstrate that the inequality 0 < β < α holds [30]. As stated by Eqs. (24) and (27), letting |q| < ste(α)/α along with | r| < ste(β)/β set domains |∆α q | ≤ α and |∆β r | ≤ β. This enables suitable domain to pursue a qualitative study of the effects of parametric changes in w s (α, β, t) (cf. Eq. (17)). For that aim, let’s consider deviations δw s (α q , β r , t) defined through

$$ \delta {w}_s\left({\alpha}_q,{\beta}_r,t\right)={w}_s\left({\alpha}_q,{\beta}_r,t\right)-{w}_s\left(\alpha, \beta, t\right). $$
(29)

Then,

$$ \delta {w}_s\left({\alpha}_q,{\beta}_r,t\right)={\sum}_{nl(s)}\left(\left(\beta +\varDelta {\beta}_r\right)a{(t)}^{\alpha +\varDelta {\alpha}_q}-\beta a{(t)}^{\alpha}\right), $$
(30)

and factoring the term βa(t)α

$$ \delta {w}_s\left({\alpha}_q,{\beta}_r,t\right)={\sum}_{nl(s)}\beta a{(t)}^{\alpha }{\mu}_a\left(\varDelta {\alpha}_q\kern0.5em \varDelta {\beta}_r,a(t)\right), $$
(31)

where

$$ {\mu}_a\left(\varDelta {\alpha}_q\kern0.5em \varDelta {\beta}_r,a(t)\right)=\frac{\pi \left(\varDelta {\alpha}_q\right)}{\ \varphi \left(\varDelta {\beta}_r\right)}-1 $$
(32)

with

$$ \varphi \left(\varDelta {\beta}_r\right)=\frac{\beta }{\beta +\varDelta {\beta}_r} $$
(33)

and

$$ \pi \left(\varDelta {\alpha}_q\right)=a{(t)}^{\varDelta {\alpha}_q}. $$
(34)

Now, since a(t) is positive, for fixed values of the parameters α and β in the domain |∆α q | < α, the factor π(∆α q ) remains positive and increasing. Respectively, for ∆β r varying in the interval |∆β r | < β, the factor φ(∆β r ) is positive and decreasing. Furthermore, for ∆α q  = ∆β r = 0, we have π(∆α q ) = φ(∆β r ) = 1 (Fig. 16).

Fig. 16
figure 16

The variation of the auxiliary factors φ(∆β r )and π(∆β r ). This plot presents the variation of the auxiliary factors φ(∆β r ) and π(∆β r ) defined by Eqs. (33) and (34) respectively. Both ∆α and ∆β r r vary in the horizontal axis. Similarly, the corresponding variations of π(∆β r ) and φ(∆β r ) are projected in the vertical axis

From Eq. (31) the proxies w s (α, β, t) will be overestimated by shifting projections w s (α q , β r , t) whenever the inequality

$$ {\mu}_a\left(\varDelta {\alpha}_q,\varDelta {\beta}_r,a(t)\right)\kern0.5em >0 $$
(35)

is satisfied. And from Eq. (32) this last inequality leads to

$$ \pi \left(\varDelta {\alpha}_q\right)>\varphi \left(\varDelta {\beta}_r\right). $$
(36)

But, for −α ≤ ∆α q  ≤ 0 the factor π(∆α q ) is continuous and increases monotonically from a value π(−α) > 0, until it reaches a value of π(0) = 1, while the factor φ(∆β r ) takes positive and arbitrarily large values whenever ∆β r  approaches –β, and decreases monotonically towards a value of φ(0) = 1. Thus, the continuity of φ(∆β r ) and π(∆α q ) implies the order relationship π(∆α q ) < φ(∆β r ), in the domain−α ≤ ∆α q  ≤ 0 and –β < ∆β r  ≤ 0. In the other hand in the domain 0 ≤ ∆α q  ≤ α and 0 ≤ ∆β r  ≤ β, both π(∆α q ) and φ(∆β r ) steer away from one being π(∆α q ) increasing and φ(∆β r ) decreasing. Therefore, by continuity the statement of inequality (36) will hold only in the domain ∆α q  > 0 and ∆β r  > 0.

Similarly, reference values w s (α, β, t) will be underestimated by fluctuating ones w s (α q , β r , t) whenever, the statement

$$ {\mu}_a\left(\varDelta {\alpha}_q,\varDelta {\beta}_r,a(t)\right)<0 $$
(37)

holds, or equivalently whenever

$$ \pi \left(\varDelta {\alpha}_q\right)<\varphi \left(\varDelta {\beta}_r\right), $$
(38)

which as it has been discussed above, only holds whenever the increments ∆α q and ∆β r vary in.

the domain −β < ∆β r  < 0 and −α < ∆α q  < 0.

Since, the w s (α, β, t) projections are always positive, the w m (α, β, t) reference projections will be overestimated or underestimated by changing w m (α q , β r , t) values in the same domains of variation of ∆α q and and ∆β r  where the w s (α, β, t) projections are overestimated or underestimated by w s (α q , β r , t) values.

Exploration of sensitivity of w m (α, β, t) by simulation methods

In order to perform a simulation study of sensitivity, we considered a combined range of variability for α and β given by the modulus of the vector of parametric changes (∆α q ∆β r ). This is denoted by means of the symbol θ(q, r) and calculated through

$$ \theta \left(q,r\right)=\sqrt{\ \varDelta {\alpha_q}^2+\varDelta {\beta_r}^2.} $$
(39)

Equivalently, using Eqs. (24) and (27) yield

$$ \theta \left(q,r\right)=\sqrt{{\left(q\cdot ste\left(\alpha \right)\right)}^2+{\left(r\cdot ste\left(\beta \right)\right)}^2.} $$
(40)

And, setting q = 1 and r = 1 give (∆α q , ∆β r ) = (ste(α), ste(β)). This produces vectors of shifting parametric values with modulus determined by the standard errors of estimates. Denoting by means of the symbol θ ste the associated range of variability we have

$$ {\theta}_{ste}=\sqrt{ste{\left(\alpha \right)}^2+ ste{\left(\beta \right)}^2}. $$
(41)

By virtue of Eq. (39), every fixed value of θ(q, r) returns different pairs (∆α q , ∆β r ) of increments, each one giving rise to a couple of shifting parameter values (α q , β r ). Besides, for each ordered pair of changing parameters, leaf data in shoots and Eq. (19) return a shifting trajectory, that is denoted trough the symbol w m (α q , β r , t). The average of the values taken by the fluctuating trajectories at a sampling date t is denoted by 〈w m (α q , β r , t)| θ〉. The procedure estimates the deviations between the observed w m (t) and the average 〈w m (α q , β r , t)| θ〉 trajectory for each time t. This is denoted by δw (α q , β r , t) and calculated by way of

$$ \delta {w}_{m\theta}\left({\alpha}_q,{\beta}_r,t\right)={w}_m(t)-\left\langle {w}_{m\kern0.5em }\left({\alpha}_q,{\beta}_r,t\right)|\theta \right\rangle . $$
(42)

The average of the δw (α q , β r , t) deviation values over all sampling times n(t), represented thorough the symbol 〈δw (α q , β r , t)〉, is given by

$$ \left\langle \delta {w}_{m\theta}\left({\alpha}_q,{\beta}_r,t\right)\right\rangle =\frac{\sum_{n(t)}\ \delta {w}_{m\theta}\left({\alpha}_q,{\beta}_r,t\right)\kern0.5em }{n(t)}, $$
(43)

where, \( \sum \limits_{n(t)} \) indicates summation over the involved n(t) sampling dates. These statistics along with, the average of w m (t) values over the number n(t) of sampling times are used to produce a relative deviation index represented by means of the symbol ϑ(θ) and given by

$$ \vartheta \left(\theta\ \right)=\frac{\left|\left\langle \delta {w}_{m\theta}\left({\alpha}_q,{\beta}_r,t\right)\right\rangle \right|\kern0.5em }{\left\langle {w}_m(t)\right\rangle\ }. $$
(44)

The value of ϑ(θ) stands a measure of the sensitivity of the reference trajectory w m (α, β, t) to a change of tolerance θ(q, r) on the values of α and β. Moreover, ϑ(θ) assess in what percentage of 〈w m (t)〉 the absolute deviation between w m (α q , β r , t) and w m (t) amounts.

Application to present data

The results of Appendix 1 show that there are not differences in parameter estimates of the nonlinear regression model of Eqs. (5) and (6) assuming heteroscedasticity or homoscedasticity. Therefore, in order to explore the sensitivity of w m (α, β, t) by means of the simulation code of Eqs. (39)–(44), we take as a reference parameter estimates acquired from the raw data using the homoscedastic case. Parameter estimates were \( \widehat{\alpha}=1.104 \) with \( ste\left(\widehat{\alpha}\ \right)=5.101\times {10}^{-3} \) for α and of \( \widehat{\beta}=8.71\times {10}^{-6} \) with \( ste\left(\widehat{\beta}\right)=3.53\times {10}^{-7} \) for β (Table 1). Then, letting q = 1 and r = 1 in Eqs. (24) and (27) allows to consider a combined range of parametric change set by the standard errors of estimates. Eq. (41) returned θ ste  = 5.101 × 10−3. Afterwards, for every fixed value of θ(q, r) we determined all ordered pairs (∆α q , ∆β r ) complying with condition set by Eq. (39). This returned fluctuating values α q for α and β r for β, given one by one by Eqs. (23) and (28). Observed leaf area data and shifting parameter values, shaped the associated changing w m (α q , β r , t) trajectories. Moreover, while 0 ≤ θ(q, r) ≤ θ ste , the values of the relative deviation index ϑ(τ) satisfied 0≤ ϑ(θ) ≤ 0.0205 (Fig. 9). Thus, for a parametric variation range set by the standard errors of estimates the maximum value of the absolute deviation ϑ(θ) between w m (α q , β r , t) and w m (t) amounts to approximately 2% of the average of w m (t) taken over all sampling dates (〈w m (t) 〉 cf. Eq. (16)). Moreover, for θ(q, r) in a range 0 ≤ θ ≤ 2θ ste we have 0 ≤ ϑ(θ) ≤ 0.031. This leads to a maximum absolute deviation between w m (α q , β r , t) and w m (t) of around 3% of 〈w m (t) 〉. This shows the robustness of the w m (α, β, t) projection method.

In turn, fitting the model of Eq. (1) to the quality controlled data using nonlinear regression, produced estimates, \( \widehat{\alpha}=1.132 \) with \( ste\left(\widehat{\alpha}\ \right)=3.954\times {10}^{-3} \) for α and of \( \widehat{\beta}=6.956\times {10}^{-6} \) with\( ste\left(\hat{\beta}\right)=2.202\times {10}^{-7} \) for β (Table 1). Letting p = 1 and q = 1 in Eq. (40) gives θ ste  = 3.954 × 10−3, therefore, 0 ≤ θ(q, r) ≤ θ ste implies 0 ≤ ϑ(θ) ≤ 0.003 (Fig. 10). Hence, the maximum absolute-relative deviation between w m (α q , β r , t) and w m (t) amounts to only 0.03% of the 〈w m (t)〉 average. On addition, a range 0 ≤ θ ≤ 2.8θ ste leads to 0 ≤ ϑ(θ)  ≤ 0.03006. This leads to a maximum absolute deviation between w m (α q , β r , t) and w m (t) of around 3% of 〈w m (t)〉. Comparing with results for raw data, we can ascertain that data quality induces a relative gain in the precision of the w m (α, β, t) projections.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Echavarría-Heras, H., Leal-Ramírez, C., Villa-Diharce, E. et al. On the suitability of an allometric proxy for nondestructive estimation of average leaf dry weight in eelgrass shoots I: sensitivity analysis and examination of the influences of data quality, analysis method, and sample size on precision. Theor Biol Med Model 15, 4 (2018). https://doi.org/10.1186/s12976-018-0076-y

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s12976-018-0076-y

Keywords