Future body mass index modelling based on macronutrient profiles and physical activity

Background An accurate system of determining the relationship of macronutrient profiles of foods and beverages to the long-term weight impacts of foods is necessary for evidence-based, unbiased front-of-the-package food labels. Methods Data sets on diet, physical activity, and BMI came from the Food and Agriculture Organization (FAO), the World Health Organization (WHO), the Diabetes Control and Complications Trial (DCCT), and Epidemiology Diabetes Intervention and Complications (EDIC). To predict future BMI of individuals, multiple regression derived FAO/WHO and DCCT/EDIC formulas related macronutrient profiles and physical activity (independent variables) to BMI change/year (dependent variable). Similar formulas without physical activity related macronutrient profiles of individual foods and beverages to four-year weight impacts of those items and compared those forecasts to published food group profiling estimates from three large prospective studies by Harvard nutritional epidemiologists. Results FAO/WHO food and beverage formula: four-year weight impact (pounds)=(0.07710 alcohol g+11.95 (381.7+carbohydrates g per serving)*4/(2,613+kilocalories per serving)–304.9 (30.38+dietary fiber g per serving)/(2,613+kilocalories per serving)+19.73 (84.44+total fat g)*9/(2,613+kilocalories per serving)–68.57 (20.45+PUFA g per serving)*9/(2,613+kilocalories per serving))*2.941–12.78 (n=334, R2=0.29, P < 0.0001). DCCT/EDIC formula for four-year weight impact (pounds)=(0.898 (102.2+protein g per serving)*4/(2,297+kilocalories per serving)+1.063 (264.2+carbohydrates g per serving)*4/(2,297+ kilocalories per serving)–13.19 (24.29+dietary fiber g per serving)/ (2,297+kilocalories per serving)+ 0.973 (74.59+(total fat g per serving–PUFA g per serving)*9/(2,297+kilocalories per serving))*85.82–68.11 (n=1,055, R2=0.03, P < 0.0001). 
The average of the FAO/WHO and DCCT/EDIC formula forecasts correlated strongly with published food group profiling findings, except for potatoes and dairy foods (n=12, r=0.85, P=0.0004). Formula predictions did not correlate with food group profiling findings for potatoes and dairy products (n=10, r=−0.33, P=0.36). A formula-based diet and exercise analysis tool is available to researchers and individuals: http://thehealtheconomy.com/healthTool/. Conclusions Two multiple regression derived formulas from dissimilar databases produced markedly similar estimates of future BMI for 1,055 individuals with type 1 diabetes and female and male cohorts from 167 countries. These formulas predicted the long-term weight impacts of foods and beverages, closely corresponding with most food group profiling estimates from three other databases. If discrepancies with potatoes and dairy products can be resolved, these formulas present a potential basis for a front-of-the-package weight impact rating system.


Introduction
Previous mathematical modelling approaches to predict weight change have largely looked at short-term results of changes of energy intake and expenditure [1,2]. As noted in one article using USA data, "A small persistent average daily energy imbalance gap between intake and expenditure of about 30 kJ per day underlies the observed average weight gain" [3]. The effect of the macronutrient profile, independent of daily energy intake, has been given less attention. Many researchers have characterized the etiology of the worldwide obesity epidemic as much more complex than simply an imbalance of calories in versus calories out. There are both complex economic causes and consequences of obesity [4]. For example, the food industry in Europe invested €1 billion in a campaign to block evidence-based health warnings on food [5]. Producing refined and processed convenience foods, high in sugar and saturated fats, is lucrative. To defend profits from sales of unhealthy foods, the food industry has adapted techniques long used by the tobacco industry to defend cigarettes. For example, they seek to instill doubt in the public about scientific evidence linking certain foods and eating patterns to obesity, emphasize personal responsibility, hire scientists to counteract obesity research, make self-regulatory pledges, lobby to stop government public health anti-obesity programs, and, of course, heavily advertise unhealthy foods [6]. At a macroscopic level in western countries, the overconsumption of value-added foods (i.e., processed and refined) is a predictable outcome of market economies predicated on consumption-based growth [7].
Evidence-based public health strategies are needed to better understand the dietary and physical activity factors implicated in the development of obesity and to guide interventions and policies that can curb or reverse the increase in BMI globally [8].
Only with a quantitative understanding of the factors leading to excessive weight gain guiding the implementation of weight control education campaigns and practical public health strategies will the obesity epidemic be curbed. No methodology for relating macronutrient intake and exercise to long-term BMI change/year or future BMI has yet been scientifically validated.
A 2007 European Union regulation on nutrition and health claims made for foods provides for the use of nutrient profiles to determine which foods may bear claims but does not specify what the profiles should be or how they should be developed [9]. An in-depth analysis by the French Food Safety Agency of existing nutrient profiling schemes based on indicator foods [10] found fairly good concordance between (1) the UK Food Standards Agency (FSA) model, (2) the Dutch Tripartite classification model, and (3) the USA Food and Drug Administration (FDA) model but concluded, ". . . further improvement of the 'indicator foods' approach is needed if it is to serve as a 'gold standard'" [9,10]. The British Heart Foundation Health Promotion Research Group compared eight nutrient profiling models with a standard ranking of 120 foods. They found good correlations with opinions of nutrition professionals about both the continuous models of nutrient profiling (Spearman's rho = 0.6-0.8) and categorical models of food profiling (high chi squared results) [11]. However, these correlations are expert opinion-based rather than evidence-based and are therefore open to the challenge that they are biased.
In a consensus report entitled, "Front-of-Package Nutrition Rating Systems and Symbols: Promoting Healthier Choices," a USA Institute of Medicine committee concluded, ". . .it is time for a move away from front-of-package systems that mostly provide nutrition information on foods or beverages but don't give clear guidance about their healthfulness, and toward one that encourages healthier choices through simplicity, visual clarity, and the ability to convey meaning without written information. The report recommends that the FDA develop, test, and implement a single, standard front-of-package symbol system to appear on all food and beverage products, in place of other systems already in use" [12].
Food and Agriculture Organization (FAO) and World Health Organization (WHO) data worldwide [13,14] show large variations in diets, physical activity, and mean BMIs for female and male cohorts from countries around the world. This provides an opportunity to derive multiple regression formulas capturing the relationship between diet, physical activity, and mean BMIs of adults. Similarly, diet, exercise, and BMI data from the Diabetes Control and Complications Trial (DCCT) and the DCCT follow up study, Epidemiology Diabetes Intervention and Complications study (EDIC), can be used to generate multiple regression formulas predicting BMI change/year. By omitting the physical activity or exercise components of the formulas, the macronutrient components can be used to predict the long-term weight impacts of individual foods and beverages.
In the first and only genuinely evidence-based long-term weight impact assessment methodology of foods and beverages ever published, Harvard nutritional epidemiologists led by Dr. Dariush Mozaffarian analysed the statistical relationship between increases and decreases in servings per day of specific foods and beverages and four-year changes in weight of subjects from three large studies of diet and lifestyle [15]. The food group profiling component (categorical profiling) of this pioneering Harvard diet and lifestyle study serves as the comparator for results from the FAO/WHO and DCCT/EDIC macronutrient (continuous profiling) formulas for future BMI predictions for four-year weight impacts of individual foods and beverages. This paper will explore whether these macronutrient and physical activity profiling multiple regression formulas are validated by comparison with the evidence-based Harvard nutritional epidemiology food group profiling study and correlated enough with each other to provide a generalisable model to predict long-term BMI change/year for diverse individuals, populations, foods, and beverages. If so, these future BMI continuous model prediction formulas could inform obesity prevention public health policies for countries and weight control strategies for clinicians and individuals.
Increased physical activity (FAO/WHO database) and exercise (DCCT/EDIC database) were hypothesized to reduce weight gain over time. The relationships of BMI change/year with food group and macronutrient availability (cohorts of female and male adults from countries around the world) and with macronutrient consumption (individuals with type 1 diabetes) were examined as exploratory hypotheses.

FAO and WHO data
Of 200 countries in the Global Health Observatory Data Repository of the WHO and the FAO databases, 112 countries have complete data on plant and animal food commodity availability per capita [14], physical activity [16], and mean BMI (kg/m2) of adults aged 25+ in 2008 [17]. For an additional 55 countries, physical activity data were absent but diet and BMI data were available. Imputed estimates of the WHO variable "insufficient physical activity" for the female and male cohorts of these 55 countries were obtained by multiple regression analysis, with insufficient physical activity of the 112 countries as the dependent variable and food group availability profile, gender, and country per capita GDP as independent variables. In all, 334 cohorts from 167 countries served as the subjects for these univariate and multivariate statistical analyses.

FAO food group availability data and derived macronutrient profiles
The FAO supplied data on food commodity availability in kilocalories (kcals) per capita per day. These food commodity availability data were broken down to cereals (e.g., rice, maize, and corn), vegetable oils (e.g., soy, rapeseed, mustard seed, and palm), sugar and sweeteners (e.g., sucrose and fructose from sugar cane, corn, beets, and honey), meat (e.g., cow, pig, sheep, goat, offals), animal fats, roots and tubers (e.g., potatoes and cassavas), fruit (including juices), pulses (e.g., beans and lentils), milk, cheese, and eggs. The percent of total available kcals per capita per day for each food group in each country comprised that country's food group profile. Data for per capita alcohol "consumption," in contrast with "availability," came as the variable "g/day consumed" from the WHO [18], rather than as percent of total available kcals.
Macronutrients included for univariate analysis with mean adult BMI were protein (g and % of kcals), carbohydrates (g and % of kcals), dietary fiber (g and g/1,000 kcals), polyunsaturated fatty acids (PUFA: g and % of kcals), monounsaturated fatty acids (MUFA: g and % of kcals), saturated fatty acids (SFA: g and % of kcals), and total fats (g and % of kcals). In addition, per capita daily total kcals was assessed for each cohort.
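The two ways of expressing each macronutrient (grams versus percent of kcals, and g/1,000 kcals for dietary fiber) can be illustrated with a minimal sketch. The function names are hypothetical; the energy densities of 4, 9, and 7 kcals/g are the ones this paper states in its formula adjustment steps.

```python
# Energy densities (kcals per gram) as stated in this paper's adjustment steps.
KCAL_PER_G = {"protein": 4.0, "carbohydrate": 4.0, "fat": 9.0, "alcohol": 7.0}


def percent_of_kcals(grams, kcal_per_g, total_kcals):
    """Share of total energy supplied by a macronutrient, on a 0-100 scale."""
    if total_kcals <= 0:
        raise ValueError("total_kcals must be positive")
    return grams * kcal_per_g / total_kcals * 100.0


def fiber_per_1000_kcals(fiber_g, total_kcals):
    """Dietary fiber density in grams per 1,000 kcals."""
    if total_kcals <= 0:
        raise ValueError("total_kcals must be positive")
    return fiber_g / total_kcals * 1000.0


# Illustrative (hypothetical) daily intake: 100 g protein and 30 g fiber in 2,000 kcals.
protein_pct = percent_of_kcals(100.0, KCAL_PER_G["protein"], 2000.0)
fiber_density = fiber_per_1000_kcals(30.0, 2000.0)
```

Expressing intake relative to total kcals is what allows cohorts with very different energy intakes to be compared on the same scale.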
To generate the FAO/WHO macronutrient profiling formula, the FAO food commodity availability data were broken down to macronutrient availability by analyzing samples of each commodity using the United States Department of Agriculture (USDA) Nutrient Database for Standard Reference, Release 24 [19]. Correlations of the mean BMIs of female and male cohorts from different countries with the formula estimates served as a measure of the adequacy of the method of transforming food group data to macronutrient profile data (i.e., #1 physical activity and food group profiling versus #2 physical activity and macronutrient profiling).

WHO physical activity data
The WHO evaluated physical activity of females and males in countries worldwide with the variable "insufficient physical activity" (0%-100% scale). According to the WHO, "adults aged 18-64 should do at least 150 minutes of moderate-intensity aerobic physical activity throughout the week or do at least 75 minutes of vigorous-intensity aerobic physical activity throughout the week or an equivalent combination of moderate- and vigorous-intensity activity" [20]. The WHO defined "insufficient physical activity" as less than this recommended level of physical activity.

The DCCT/EDIC study
The Diabetes Control and Complications Trial (DCCT) and its follow-up, the Epidemiology of Diabetes Interventions and Complications (EDIC) study, were conducted by the DCCT/EDIC Research Group and supported by National Institutes of Health grants and contracts and by the General Clinical Research Center Program, NCRR. The data [and samples] from the DCCT/EDIC study were supplied by the NIDDK Central Repositories. This manuscript was not prepared under the auspices of the DCCT/EDIC study and does not represent analyses or conclusions of the DCCT/EDIC study group, the NIDDK Central Repositories, or the NIH.
The DCCT eligibility criteria and screening methods and the baseline characteristics of the study subjects have been reported in detail [21][22][23]. Between 1983 and 1989, investigators recruited 1,441 participants between 13 and 39 years of age (mean = 26.8 years) who were C peptide-deficient and in good general health. The range of follow-up for all subjects was 3.5 to 9 years, with a mean of 6.5 years at the end of the trial in 1993 [24]. EDIC data from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) were obtained in collaboration with the Endocrinology Department of the University of Pittsburgh. The EDIC study collected data over the 10 years immediately following the DCCT (1993-2003) [25], bringing the total length of follow-up of yearly BMI records to 14-19 years.

DCCT macronutrient profiling
Under the guidance of registered dietitians trained in collecting nutrient consumption data, all DCCT participants submitted detailed accounts of their food intake during the week previous to entry into the study. Subsequently, the dietitians obtained follow-up diet histories at years two and five and at the end of the trial [21]. With these diet histories transformed into intakes of 99 macro- and micronutrients, statisticians generated a nutrient consumption data set. This nutritional analysis instrument had a high reproducibility on repeated assessments of the diet history [26]. After screening the data set for nutrients correlating with BMI change/year, univariate correlations were performed on food energy expressed as total kcals per day per capita and on the same macronutrient variables as were included in the FAO macronutrient analysis. To include as much of the macronutrient profile spectrum as possible, the variable "total fat − PUFA" was derived and included in the multiple regression analyses in place of the variables total fat, MUFA, and SFA. Otherwise only the MUFA variable would enter the formula.
For each DCCT participant, kcals and macronutrient intakes on entry, at years two and five, and on completion of the study were averaged.

Glycemic control of DCCT participants
Since people with type 1 diabetes comprised the DCCT database, poor glycemic control could confound the relationship of diet with BMI change/year. Glycosuria due to blood sugar levels chronically greater than 9 mmol/l reduces the BMI at the expense of increased complications of diabetes for patients. Using HbA1c ≤ 9.5 as the cutoff for inclusion resulted in no significant correlation between the mean HbA1c and BMI change/year (n = 1,055, r = 0.02, P = 0.42). However, with HbA1c > 9.5, BMI change/year decreased significantly as HbA1c rose (n = 137, r = −0.19, P = 0.0304). Consequently, HbA1c ≤ 9.5 was selected as the cutoff for participant inclusion in the analysis.

Imputing values for "insufficient physical activity" for cohorts with no data from WHO
To impute the level of adult physical activity of the 55 female and 55 male cohorts from countries without insufficient physical activity data, a multiple regression formula was generated from the countries with such data, with insufficient physical activity as the dependent variable and the 26 food and beverage groups, mean BMI, gender, and country per capita GDP as independent variables. A constant was added to adjust the lowest imputed insufficient physical activity estimate to "2.7," equating with the lowest of the range of reported insufficient physical activity values (Bangladeshi males). The resulting formula is as follows:

Comparison of FAO/WHO macronutrient profiling formula results with the formula from DCCT/EDIC participants
For the purpose of comparing the FAO/WHO formula predictions of mean BMI of adults with the BMI change/year formula from the DCCT/EDIC trial, the WHO variable "insufficient physical activity" (scale: 0%-100% of the sample) was converted to the DCCT exercise variable (1-4 scale). The formula for this conversion is as follows: Y (DCCT 1-4 exercise scale) = 1.82 × (100% − Z% (WHO insufficient physical activity scale: 0%-100%)) / 59.55%. "Z%", the WHO "insufficient physical activity" score, was transformed to "Y", the DCCT exercise value used for compatibility of the two formulas. For North American DCCT participants, "1.82" was the mean exercise level on the DCCT 1-4 scale. The divisor, "59.55%", represents the mean amount of "sufficient physical activity" of males and females in the USA. WHO "insufficient physical activity" scores for USA males and females were 33.5% and 47.4%, respectively, so the mean "insufficient physical activity" was (33.5% + 47.4%) ÷ 2 = 40.45%. Therefore, "sufficient physical activity" = 100% − 40.45% = 59.55%.
The conversion of WHO activity scores to the DCCT exercise (1-4 scale) gave the following estimated means and ranges of scores for country female and male cohorts: females: mean DCCT score = 1.86, range 0.073 to 2.87 and males: mean DCCT score = 2.08, range 0.90 to 2.97. As expected due to the formula deriving DCCT "exercise" from WHO "insufficient physical activity," these variables were negatively correlated (r= −1.0).
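The conversion from the WHO "insufficient physical activity" percentage to the DCCT 1-4 exercise scale can be written as a short function. This is a minimal sketch of the conversion formula given above; the function and constant names are hypothetical, and the numeric constants are the ones reported in the text.

```python
# Mean exercise level of North American DCCT participants on the DCCT 1-4 scale.
MEAN_DCCT_EXERCISE = 1.82

# Mean "sufficient physical activity" in the USA: 100% minus the mean of the
# male (33.5%) and female (47.4%) "insufficient physical activity" scores.
USA_SUFFICIENT_ACTIVITY_PCT = 100.0 - (33.5 + 47.4) / 2.0  # 59.55


def who_to_dcct_exercise(insufficient_pct):
    """Convert a WHO 'insufficient physical activity' score (0-100%) to the DCCT scale."""
    if not 0.0 <= insufficient_pct <= 100.0:
        raise ValueError("insufficient_pct must lie between 0 and 100")
    sufficient_pct = 100.0 - insufficient_pct
    return MEAN_DCCT_EXERCISE * sufficient_pct / USA_SUFFICIENT_ACTIVITY_PCT


# A cohort at the USA mean (40.45% insufficient) maps back to 1.82, and the lowest
# reported WHO value (2.7%, Bangladeshi males) maps to the top of the male range.
usa_mean_score = who_to_dcct_exercise(40.45)
bangladesh_male_score = who_to_dcct_exercise(2.7)
```

Because the conversion is a decreasing linear function of the WHO score, the two variables are perfectly negatively correlated, as noted above.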

Evaluation of the long-term weight impact of foods and beverages
To evaluate the long-term weight impact of individual foods and beverages, each item was assessed by adding the macronutrient profile data of one serving of that food or beverage to the mean macronutrient values of cohorts/individuals from the respective databases. To compose a weight impact prediction formula based on macronutrient profiling for individual foods and beverages comparable to the four-year weight impact (in pounds) of the food profiling analysis of Mozaffarian and his nutritional epidemiology colleagues authoring the Harvard diet and lifestyle study [15], the following adjustments were made to the FAO/WHO and DCCT/EDIC macronutrient and exercise profiling BMI change/year prediction formulas:
1. Eliminate the exercise variable.
2. Switch from macronutrients as "% of kcals" to g.
3. Multiply b weights of protein, carbohydrates, and fats by 100 because of the change from % (0-100) to these variables expressed as a fraction of the total kcals (i.e., 0.00-1.00).
4. Multiply protein and total carbohydrates by "4" because they contain 4 kcals/g.
5. Multiply fats by "9" because they contain 9 kcals/g.
6. Multiply alcohol by "7" because it contains 7 kcals/g (DCCT/EDIC only).
7. Retain alcohol in grams as a variable in the FAO/WHO database because alcohol "consumption" data (WHO) and not "availability" data (FAO) were used.
8. Multiply b weights of dietary fiber by 1,000 because of the change from dietary fiber g/1,000 kcals to dietary fiber g.
9. Adjust the standard deviations (SDs) of the FAO/WHO and DCCT/EDIC formulas to the SD of the Mozaffarian predictions of the 22 food and beverage groups (SD=0.88734) by multiplying each formula by 0.88734 ÷ (the initial SD of the respective formula estimates for the 22 items).
10. Set the output of each multiple regression formula for a hypothetical food with 0.1 kcal and no macronutrients at "0.00" weight impact in four years by adding a constant, thereby centering the weight impact of each formula at 0.00 pounds in four years.
Assuming all macronutrient variables would enter the multiple regression macronutrient profiling formulas to predict BMI change/year (although not all macronutrient variables do enter the formulas), the formulas would have the following format:
Weight impact over 4 years (pounds) = (
b weight of kcals * kcals per serving of the food or beverage
+ b weight of protein * 4 (kcal/g protein) * 100 (conversion from percentages to 0.00-1.00 portions) * protein (g protein per serving + mean intake/day protein g)/(kcals per serving + average kcals/day)
+ b weight of carbohydrates * 4 (kcal/g carbohydrate) * 100 * carbohydrate (g carbohydrates per serving + mean intake/day carbohydrates g)/(kcals per serving + average kcals/day)
+ b weight of dietary fiber * dietary fiber (g per 1,000 kcals) * 1,000/(kcals per serving + average kcals/day)
+ b weight of PUFA * 9 (kcal/g PUFA) * 100 * PUFA (g PUFA per serving + mean intake/day PUFA g)/(kcals per serving + average kcals/day)
+ b weight of MUFA * 9 (kcal/g MUFA) * 100 * MUFA (g MUFA per serving + mean intake/day MUFA g)/(kcals per serving + average kcals/day)
+ b weight of SFA * 9 (kcal/g SFA) * 100 * SFA (g SFA per serving + mean intake/day SFA g)/(kcals per serving + average kcals/day)
+ b weight of total fat * 9 (kcal/g total fat) * 100 * total fat (g total fat per serving + mean intake/day total fat g)/(kcals per serving + average kcals/day)
+ b weight of alcohol * 7 (kcal/g alcohol) * 100 * alcohol (g alcohol per serving + mean intake/day alcohol g)/(kcals per serving + average kcals/day)
) * 0.88734 (the SD of the weight impacts of the Mozaffarian predictions of the 22 food and beverage groups) ÷ (the initial SD of the FAO/WHO and DCCT/EDIC formula estimates, respectively, for the 22 foods and beverages) + constant
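As an illustration of how this template is applied to a single food or beverage, the sketch below codes the two four-year weight impact formulas using the coefficients reported in the abstract of this paper. The function and parameter names are hypothetical. The DCCT/EDIC formula as printed in the abstract has unbalanced parentheses, so the grouping of the "total fat − PUFA" term used here is an assumption, chosen because it centers the output near zero for an item with no macronutrients.

```python
def fao_who_weight_impact(kcals, carb_g=0.0, fiber_g=0.0, fat_g=0.0,
                          pufa_g=0.0, alcohol_g=0.0):
    """Four-year weight impact (pounds) of one serving, FAO/WHO formula (abstract)."""
    total_kcals = 2613.0 + kcals  # cohort mean kcals/day plus the serving
    raw = (0.07710 * alcohol_g
           + 11.95 * (381.7 + carb_g) * 4.0 / total_kcals
           - 304.9 * (30.38 + fiber_g) / total_kcals
           + 19.73 * (84.44 + fat_g) * 9.0 / total_kcals
           - 68.57 * (20.45 + pufa_g) * 9.0 / total_kcals)
    return raw * 2.941 - 12.78


def dcct_edic_weight_impact(kcals, protein_g=0.0, carb_g=0.0, fiber_g=0.0,
                            fat_g=0.0, pufa_g=0.0):
    """Four-year weight impact (pounds) of one serving, DCCT/EDIC formula (abstract)."""
    total_kcals = 2297.0 + kcals  # participant mean kcals/day plus the serving
    raw = (0.898 * (102.2 + protein_g) * 4.0 / total_kcals
           + 1.063 * (264.2 + carb_g) * 4.0 / total_kcals
           - 13.19 * (24.29 + fiber_g) / total_kcals
           # Assumed grouping: mean (total fat - PUFA) plus the serving's value.
           + 0.973 * (74.59 + (fat_g - pufa_g)) * 9.0 / total_kcals)
    return raw * 85.82 - 68.11


def averaged_weight_impact(kcals, protein_g=0.0, carb_g=0.0, fiber_g=0.0,
                           fat_g=0.0, pufa_g=0.0, alcohol_g=0.0):
    """Mean of the two formula forecasts, as used for the food group comparison."""
    fao = fao_who_weight_impact(kcals, carb_g, fiber_g, fat_g, pufa_g, alcohol_g)
    dcct = dcct_edic_weight_impact(kcals, protein_g, carb_g, fiber_g, fat_g, pufa_g)
    return (fao + dcct) / 2.0


# The hypothetical reference item from adjustment step 10: 0.1 kcal, no macronutrients.
reference_fao = fao_who_weight_impact(0.1)
reference_dcct = dcct_edic_weight_impact(0.1)
```

Both reference values come out close to zero, which is consistent with the centering constant described in adjustment step 10. Nutrient values for a real item would come from a food composition table such as the USDA database cited in this paper.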

Statistical analysis
Pearson correlations related food group availability (% of total kcals for each food group), availability of kcals percapita overall, and exercise (WHO insufficient physical activity data transformed to DCCT 1-4 scale) to mean BMI of females and males in each country. Similarly for the derived FAO macronutrient profiling analysis, Pearson correlations related mean BMI in each country to exercise (DCCT 1-4 scale), availability of kcals overall, protein (g and % of total kcals), carbohydrates (g and % of total kcals), dietary fiber (g and g per 1,000 kcals), PUFA (g and % of total kcals), MUFA (g and % of total kcals), SFA (g and % of total kcals), total fat (g and % of total kcals), and total fat -PUFA (g and % of total kcals). Alcohol (g/day) consumption, not availability, was obtained from the WHO. Consequently, alcohol (% of kcals) relative to the other macronutrients was not known or estimated.
Based on the Bonferroni correction for univariate correlations of food group and exercise variables related to mean BMI, P values less than 0.002 were considered significant (i.e., for 27 food groups and exercise: 0.05/28 = 0.00179, rounded to 0.002).
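The Bonferroni threshold is simply the family-wise alpha divided by the number of tests. The following minimal sketch (the function name is hypothetical) reproduces the arithmetic for the 28 food group and exercise variables and for the 21 macronutrient and exercise variables used in the univariate analyses.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# 27 food groups plus exercise, tested against mean BMI.
food_group_threshold = bonferroni_threshold(0.05, 28)
# 21 macronutrient and exercise variables.
macronutrient_threshold = bonferroni_threshold(0.05, 21)
```

Both thresholds round to the 0.002 cutoff applied in this paper.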
In the multivariate analyses, WHO data on mean BMIs from female and male adults 25+ years old in 2008 from 167 countries (criterion variable) were correlated with FAO food commodity availability/macronutrient availability data and exercise (DCCT 1-4 scale) data (predictor variables).
In the DCCT analysis, BMI change/year equaled (BMI at the end of the trial − the initial BMI) ÷ years on trial. Pearson correlations were computed for food energy (kcals), macronutrients, and exercise with BMI change/year. Based on the Bonferroni correction for macronutrients and exercise of both the FAO/WHO and DCCT/EDIC data sets, P values less than 0.002 were considered significant in univariate correlations (i.e., for 21 macronutrient and exercise variables: 0.05/21 = 0.0024, rounded to 0.002).
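The dependent variable of the DCCT analysis can be computed per participant as follows. This is a minimal sketch with a hypothetical function name and made-up illustrative values, not participant data.

```python
def bmi_change_per_year(initial_bmi, final_bmi, years_on_trial):
    """BMI change/year = (BMI at end of trial - initial BMI) / years on trial."""
    if years_on_trial <= 0:
        raise ValueError("years_on_trial must be positive")
    return (final_bmi - initial_bmi) / years_on_trial


# Illustrative participant: BMI rises from 23.0 to 24.3 over 6.5 years on trial.
example_change = bmi_change_per_year(23.0, 24.3, 6.5)
```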
With DCCT/EDIC data, multiple regression analysis generated a formula quantifying the relationship of BMI change/year (the criterion variable) with macronutrient intake and exercise (predictor variables).
For the FAO/WHO multiple regression analyses, predictor variables gained entry in the formulas if P < 0.10 and remained if P < 0.10. For the DCCT/EDIC formula, predictor variables gained entry in the formulas and remained if P < 0.25. To derive the formulas, multipliers of each significant predictor variable were the non-standardized coefficients (i.e., b weight). Constants centered each macronutrient profiling formula.
The USDA Nutrient Database for Standard Reference Release 24 [19] served as the macronutrient composition reference for the 22 categories of food and beverage groups included in the Harvard diet and lifestyle study. These profiles were plugged into the two formulas to generate four-year weight impact estimates for each of these 22 foods and beverages. Pearson correlations compared Harvard diet and lifestyle study weight impact estimate data on food and beverage groups with results of the FAO/WHO four-year weight impact formula, the DCCT/EDIC four-year weight impact formula, and the average of the two formulas. Additionally, the FAO/WHO four-year weight impact formula and the DCCT/EDIC four-year weight impact formula were compared with each other.
SAS statistical software (release 9.1, SAS Institute, Cary, NC) was used in the performance of the data analysis.
Table 1 shows selected FAO plant and animal food group availability data (percentage of total kcals available from each major food group) of female and male cohorts from the 167 countries. Kcal intake of males was equated to 1.2 times the mean population kcal intake and for females 0.8 times the mean, in accordance with USA data from the National Health and Nutrition Examination Survey [27]. Table 2 displays the percent of "insufficient physical activity" (WHO female and male cohort data from 2009: mean [95% CI of the mean] on a 0% to 100% scale), the mean [95% CI of the mean] BMI of female and male cohorts, and the per capita GDP of the 167 countries. Table 3 presents the FAO/WHO univariate relationships between mean BMI and availability of 27 plant and animal food products in percent of total kcals of food available, percent "insufficient physical activity," and per capita GDP of the countries.
In the univariate analysis, GDP per capita in 2009 is correlated with mean adult BMI (r=0.36, P < 0.0001). However, when GDP per capita in 2009 is included among the independent variables of the multiple regression, increasing GDP tends to decrease mean BMI. So GDP is not a factor in determining the level of adult BMI independent of macronutrient profile, exercise, and gender. Kcals per capita is not an important
Table 4 shows the fairly strong correlations of the multiple regression derived formula predicting mean BMI in 2008 (n=334 female and male cohorts) with similarly derived formulas relating to single sex cohorts (female and male) and per capita 2009 GDP (below and above the median). These strong correlations suggest that the food group profiles and exercise levels determine the mean adult BMI virtually independently from gender or country per capita GDP. Table 5 shows the template for using the USDA Nutrient Database for Standard Reference, Release 24 [19] to convert FAO food group data to macronutrient profiles for each country. As an example of using this template, Table 6 demonstrates the breakdown of food availability by food group in the USA transformed into the macronutrient profile. Table 7 shows the resulting template derived macronutrient availability data in percent of kcals (g per 1,000 kcals for dietary fiber) for male and female cohorts from the 167 countries. Table 8 presents the univariate relationships between mean BMI and macronutrients available as g and percentages of total kcals of food available in the 167 countries.
In the initial FAO/WHO multiple regression analysis, protein, kcals, and MUFA appeared in the formula, and carbohydrates, total fat, PUFA, SFA, and alcohol did not appear. To increase the spectrum of the macronutrient profile covered by the formula, protein (about 10% of kcals), MUFA, SFA, and kcals were omitted, resulting in carbohydrates (about 60% of kcals), total fat, PUFA, and alcohol (mean=6.66 g/day consumed) appearing. The b-weight of the alcohol (g) variable was doubled because alcohol did not appear in the DCCT/EDIC macronutrient and exercise formula. The multiple correlation coefficient was almost identical whether protein, MUFA, SFA, and kcals were or were not included as independent variables (R2=0.56 versus R2=0.54).
Formulas of sub cohorts broken down by gender and per capita GDP are below:
Table 9 shows the strong correlations of the BMI in 2008 with the various BMI prediction formulas and the correlations of the formulas with each other.
Table 10 shows the baseline and on study age, sex, BMI, HbA1c, and exercise levels of the 1,055 DCCT/EDIC participants with average HbA1c levels < 9.5. Table 11 compares the Pearson correlations of macronutrients with BMI change/year. The DCCT/EDIC participant BMI change/year correlated inversely with baseline age (r = −0.10, P = 0.0012); however, age and caloric intake were also negatively correlated (r = −0.15, P < 0.0001). While BMI change/year did not correlate significantly with sex (r = 0.04, P = 0.24), the caloric intake of males exceeded that of females by 51% (mean [90% CI] = 2,732 [1,869 to 3,827] kcals versus 1,804 [1,231 to 2,527] kcals). As with the FAO/WHO analysis, to compensate for the marked influence of age and sex on macronutrient intake while there was an inverse correlation of age and no influence of sex on BMI change/year, the percentages of the kcals contributed by each macronutrient were evaluated in addition to grams of each macronutrient (e.g., (protein (g) × 4 kcals/g ÷ kcals) × 100 = percentage of kcals as protein). Dietary fiber was expressed as g/1,000 kcals. Since the sum of SFA % kcals, MUFA % kcals, and trans fats % kcals (i.e., total fat − PUFA) directly correlated with BMI change/year in the univariate analysis (r = 0.07, P = 0.0331, from Table 11), these
The initial formula was multiplied by 0.545 to adjust the SD of the formula from 0.416 to the SD of the DCCT/EDIC subjects (0.227). A constant adjusted the mean output to the mean BMI change/year of the DCCT/EDIC subjects (0.268).
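The SD and mean adjustment described above amounts to a linear rescaling of the formula output. A minimal sketch follows; the function name and the toy prediction values are hypothetical, and the use of the population SD (rather than the sample SD) is an assumption, since the text does not say which was used.

```python
from statistics import mean, pstdev


def rescale_predictions(values, target_sd, target_mean):
    """Scale predictions to a target SD, then shift them to a target mean.

    Returns the adjusted values, the multiplier, and the additive constant.
    """
    current_sd = pstdev(values)  # assumption: population SD
    if current_sd == 0:
        raise ValueError("predictions have zero spread and cannot be rescaled")
    multiplier = target_sd / current_sd
    scaled = [v * multiplier for v in values]
    constant = target_mean - mean(scaled)
    return [v + constant for v in scaled], multiplier, constant


# The multiplier reported in the text: target SD 0.227 over initial SD 0.416.
reported_multiplier = 0.227 / 0.416

# Toy predictions (not study data) adjusted to the DCCT/EDIC SD and mean.
adjusted, multiplier, constant = rescale_predictions(
    [0.10, 0.50, 0.90, 1.30], target_sd=0.227, target_mean=0.268)
```

The same two-step adjustment (a multiplier to match the SD, then a constant to match the mean) is what the paper applies when matching the formulas to the SD of the 22 food and beverage group estimates and when synchronizing the two BMI change/year formulas.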

Comparisons of FAO/WHO and DCCT/EDIC formulas
To facilitate the comparison of predictions of future BMIs from the FAO/WHO and DCCT/EDIC formulas, both formulas were adjusted, as described in the methods, to convert macronutrient variables from percent of total kcals to grams of macronutrients, which also enables the use of the interactive web-based future BMI prediction tool. The adjustment also involved changing the output of the FAO/WHO formula from "mean BMI" to "BMI change/year." This conversion was made by designating the mean BMI of 20-year-old people in the USA (mean BMI=22.4, according to the Centers for Disease Control and Prevention [28]) as the baseline adult BMI. US Census Bureau data were used to estimate the median age of adults over 25 years old in the USA (median age of adults > 25 years old = 48.9 years [29,30]), and the USA incidence of adult obesity (33%) was ascertained from the literature [31]. These values were plugged into the formula for a normal distribution [30]:
Consequently, where
1. the mean age of adults over 25 years old = 48.9 years,
2. μ = 0.123 BMI change/year,
3. the BMI change/year above baseline BMI required for adult obesity (x) is ≥ 0.155 BMI change/year, and
4. the USA adult obesity rate (p(x)) = 33%;
the formula SD (σ) = 0.075.
The SD of the FAO/WHO formula for female and male cohorts from all 167 countries was also derived from the normal distribution formula. Where
3. the BMI change/year above baseline required for adult obesity (x) is ≥ 0.155 BMI change/year, and
4. the weighted mean adult obesity rate in WHO countries (p(x)) = 14.1% [33];
the normal distribution formula yields an almost identical SD (σ) = 0.077 for the FAO/WHO formula.
To translate the FAO/WHO formula for mean adult BMI in the 167 countries to a FAO/WHO formula for BMI change/year, the SD was equated to 0.077 by multiplying the FAO/WHO mean adult BMI formula by 0.03882 (i.e., 1.984 (SD of FAO/WHO adult BMI prediction formula) * 0.03882 = 0.077). The b-weight of alcohol (g) in the FAO/WHO formula was doubled to compensate for alcohol not appearing in the DCCT/EDIC formula. With this adjustment for alcohol, the averaged output of the two formulas for the weight impact of alcohol should better reflect the data. Finally, the output was centered at 0.07289 BMI change/year above baseline BMI per WHO data [17] by adding a constant. This gives the adjusted FAO/WHO formula below:
BMI change/year FAO/WHO formula = (0.07710 alcohol g + 11.95 carbohydrates g * 4 (kcal/g)/kcals - 304.85 dietary fiber g/kcals + 19.7433 total fat g * 9 (kcal/g)/kcals - 63.567 PUFA g * 9 (kcal/g)/kcals - 2.14356 exercise (DCCT 1-4 scale)) * 0.04115 - 0.05033; R² = 0.54
In order to synchronize the output of the two formulas, the DCCT/EDIC formula was adjusted to correspond with the FAO/WHO formula (i.e., the SD was changed to 0.077 BMI change/year, and a constant was added to change the mean output to 0.07289 BMI change/year above the baseline (BMI = 22.4)):
Synchronized DCCT/EDIC BMI change/year formula = (0.898 protein g * 4 (kcal/g)/kcals + 1.063 carbohydrates g * 4 (kcal/g)/kcals - 13.19 dietary fiber g/kcals + 0.973 (total fat g - PUFA g) * 9 (kcal/g)/kcals - 0.04468 exercise (DCCT 1-4 scale)) * 1.574 - 1.001
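As a rough sketch, the two synchronized formulas can be written as Python functions using the coefficients printed in this section. Two readings are assumptions rather than facts from the text: the total fat term of the FAO/WHO formula is taken as total fat g × 9/kcals (matching the general format given in the Discussion), and the closing parenthesis of the DCCT/EDIC formula is placed before the multiplier 1.574. Function and parameter names are hypothetical, and the example diet is made up.

```python
def fao_who_bmi_change(alcohol_g, carb_g, fiber_g, fat_g, pufa_g, kcals, exercise):
    """Adjusted FAO/WHO formula for BMI change/year (exercise on the DCCT 1-4 scale)."""
    raw = (0.07710 * alcohol_g
           + 11.95 * carb_g * 4 / kcals
           - 304.85 * fiber_g / kcals
           + 19.7433 * fat_g * 9 / kcals   # assumed: fat term scaled by 9/kcals
           - 63.567 * pufa_g * 9 / kcals
           - 2.14356 * exercise)
    return raw * 0.04115 - 0.05033


def dcct_edic_bmi_change(protein_g, carb_g, fiber_g, fat_g, pufa_g, kcals, exercise):
    """Synchronized DCCT/EDIC formula for BMI change/year (exercise on the DCCT 1-4 scale)."""
    raw = (0.898 * protein_g * 4 / kcals
           + 1.063 * carb_g * 4 / kcals
           - 13.19 * fiber_g / kcals
           + 0.973 * (fat_g - pufa_g) * 9 / kcals
           - 0.04468 * exercise)
    return raw * 1.574 - 1.001          # assumed: parenthesis closes before 1.574


# Made-up daily diet (not study data), exercise level 2 on the DCCT scale.
diet = dict(carb_g=300.0, fiber_g=20.0, fat_g=80.0, pufa_g=15.0, kcals=2500.0, exercise=2)
fao = fao_who_bmi_change(alcohol_g=0.0, **diet)
dcct = dcct_edic_bmi_change(protein_g=95.0, **diet)
average = (fao + dcct) / 2
print(fao, dcct, average)
```

Both functions take grams and total kcals per day, so the same inputs can be passed to each formula and the outputs averaged, as the paper does when comparing predictions.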

Correlations of the FAO/WHO and DCCT/EDIC diet and exercise profiling formulas
The continuous model macronutrient and exercise formula for BMI change/year correlated strongly with the FAO/WHO categorical model food groups and exercise BMI change/year formula (r = 0.86, P < 0.0001). Testing with both the FAO/WHO and DCCT/EDIC datasets, the FAO/WHO and DCCT/EDIC macronutrient and exercise formulas also correlated with each other (r = 0.79, P < 0.0001 and r = 0.81, P < 0.0001, respectively, Tables 12 and 13).
Using data from the USDA National Nutrient Database for Standard Reference Release 24 [19], Table 14 details the kcals and macronutrients in average servings of 22 categories of foods and beverages reported by Mozaffarian and colleagues from the Harvard nutritional epidemiology team [15]. Table 15 shows the FAO/WHO and DCCT/EDIC formula predicted weight impacts of the 22 selected foods and beverages in pounds/4 years, compared with the Harvard nutritional epidemiology team food group profiling study data. Generally, for high-carbohydrate foods and beverages, if the total carbohydrate/dietary fiber ratio is < 10, the item tended to reduce weight according to the FAO/WHO and DCCT/EDIC formula predictions (e.g., fruits, vegetables, and whole grains). For high-fat foods (e.g., nuts, meat, and dairy), if the total fat/PUFA ratio is < 6, the FAO/WHO and DCCT/EDIC formulas predicted a lower weight in 4 years (e.g., nuts). For a total carbohydrate/dietary fiber ratio > 10 or a total fat/PUFA ratio > 6, an increase in weight was predicted.
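The two ratio rules of thumb above can be expressed as a small Python sketch. It is only an illustration of the stated thresholds, not part of the published formulas: the function names and the "high_carb"/"high_fat" labels are hypothetical, the text does not say what happens when a ratio equals the threshold exactly (returned here as "unspecified"), and a zero denominator is treated as an infinitely high ratio. The example macronutrient values are made up.

```python
import math


def ratio(numerator_g, denominator_g):
    """Ratio of two gram amounts; a zero denominator counts as infinitely high."""
    return numerator_g / denominator_g if denominator_g > 0 else math.inf


def direction(value, threshold):
    """Predicted direction of the four-year weight impact for a given ratio."""
    if value < threshold:
        return "loss"
    if value > threshold:
        return "gain"
    return "unspecified"  # the text gives no rule for a ratio equal to the threshold


def rule_of_thumb(carb_g, fiber_g, fat_g, pufa_g, food_type):
    """Apply the carbohydrate/fiber (< 10) or total fat/PUFA (< 6) rule."""
    if food_type == "high_carb":
        return direction(ratio(carb_g, fiber_g), 10)
    if food_type == "high_fat":
        return direction(ratio(fat_g, pufa_g), 6)
    raise ValueError("food_type must be 'high_carb' or 'high_fat'")


# Made-up servings (not USDA data).
print(rule_of_thumb(carb_g=20.0, fiber_g=4.0, fat_g=1.0, pufa_g=0.3, food_type="high_carb"))
print(rule_of_thumb(carb_g=5.0, fiber_g=2.0, fat_g=15.0, pufa_g=1.0, food_type="high_fat"))
```

The actual formula predictions depend on all macronutrients and on kcals per serving, so this classifier is a simplification of the pattern the authors describe.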
Overall, the FAO/WHO and DCCT/EDIC formula predictions had no significant correlation with the food group profiling predictions or with each other (Table 16). Alcohol consumption in grams entered the FAO/WHO future BMI prediction formula, but not the DCCT/EDIC formula. Consequently, the FAO/WHO and DCCT/EDIC formula estimates for the four-year weight impact of one drink per day of alcohol (averaging macronutrient profiles of beer, wine, and spirits) were markedly different (i.e., FAO/WHO = +2.93 pounds/4 years and DCCT/EDIC = −2.57 pounds/4 years, Table 15). However, the average value of the two formulas corresponds with the Harvard food and beverage profiling estimate of the four-year weight impact of one drink per day (i.e., 0.18 versus 0.41 pounds, Table 15). This single divergent data point causes the two formulas to have no significant overall correlation (r = 0.11, P = 0.64, Table 16). The mean FAO/WHO and DCCT/EDIC four-year weight impact estimates of beer, wine, and spirits are 0.58, 0.10, and −0.03 pounds, respectively.
The average of the FAO/WHO and DCCT/EDIC formula predictions correlated strongly with 12 food group profiling findings of Mozaffarian and colleagues (r = 0.85, P = 0.0004, Table 17). However, formula predictions trended towards a negative correlation with the Mozaffarian food group profiling findings for potatoes and dairy products (Table 18).

Web based health tool utilizing the multiple regression derived formula
To allow individuals, health professionals, and nutrition researchers to assess and monitor diet and lifestyle patterns by means of the macronutrient and exercise profiling formulas from the FAO/WHO and the DCCT/EDIC, NR designed a simple-to-use web-based tool [24]. Future BMI predictions assume that the inputted macronutrient profile and physical activity pattern are sustained on average over time. This long-term BMI prediction tool requires little nutritional or computer expertise on the part of the user.

Discussion
These univariate and multivariate analyses support the thesis that disproportionate weight gain is due primarily to lack of exercise and excessive availability/consumption of foods in Table 3 with r > 0 and not enough availability/consumption of foods with r < 0. In Table 3, the r values of the breakdown of items under a broad food group heading probably have less significance than the r value of the broad heading. For instance, individual cereals vary significantly in r values (i.e., broad heading of cereals: r = −0.46, P < 0.0001, and subheadings: rice (r = −0.41, P < 0.0001), maize (r = −0.25, P < 0.0001), and wheat (r = 0.41, P < 0.0001)). This probably indicates that low BMI countries eat more rice and maize and high BMI countries eat more wheat, and much of the wheat in high BMI countries is likely refined into white (low fiber) flour. Overall in country populations, a high proportion of kcals as cereal contributes significantly to relatively lower mean BMIs. It may be counterintuitive that fruit should be associated with excessive weight gain (i.e., fruit: r = 0.22, P < 0.0001 in Table 3). Data from the diet and lifestyle study by Harvard nutritional epidemiologists showed that fruit consumption was associated with significantly decreased weight over a four-year span while 100% fruit juices correlated with substantial weight gain (Table 15) [15]. In that study of over 120,000 USA participants, the mean intake of fruit juices was about half of the mean intake of fruit (0.73 juice servings/day versus 1.43 whole fruit servings/day). Under the FAO food group category, "FRUITS AND DERIVED PRODUCTS," is the following explanation, "Fruit crops are consumed directly as food and are processed into dried fruit, fruit juice, canned fruit, frozen fruit, jam, alcoholic beverages, etc." [34]. In the USA, US Department of Agriculture data show that about 40% of fruit availability is in the form of juices [35].
Consequently, whole unprocessed fruit likely correlates with normal BMIs while fruit juices likely correlate with overweight and obesity.
Based on both of these BMI change/year formulas, the adage, "eat less and exercise more" should be clarified to "eat more cereals, fruits, vegetables, pulses, roots, and tubers and exercise more" or "eat more high fiber carbs and more high PUFA fats and exercise more." In discussing the counterintuitive finding that one serving per day of low-fat yogurt correlated with the largest weight loss of any food or beverage in their study (−0.82 pounds/4 years), Mozaffarian and the Harvard nutritional epidemiology team allowed for the possibility of "an unmeasured confounding factor that tracks with yogurt consumption" [15]. The paradox of low-fat yogurt associated with weight loss in the Harvard study while potato consumption correlated with increased weight may be due to confounding in three ways: (1) the association of dairy product consumption, particularly low-fat yogurt, with fruits, vegetables, nuts, and whole grains and with above average exercise in educated, relatively affluent, health-conscious people, (2) the association of potato consumption with oil or fat (e.g., butter, sour cream, etc.), and (3) the greater affordability, and therefore consumption, of inexpensive foods like French fries, potato chips, sugar-sweetened drinks, hamburgers, etc. among the lower socio-economic classes with higher rates of obesity.
Among the many organizations extolling the health benefits of low-fat yogurt are Cleveland Clinic [36], Mayo Clinic [37], Center for Science in the Public Interest [38], American Heart Association [39], FDA and USDA [40]. Due to these endorsements and the heavy advertising of dairy products, educated, health-conscious, relatively more affluent people may respond by consuming more low-fat yogurt (and other dairy products) compared with less health-conscious people who drink less expensive sugar-sweetened beverages.
Data for analyzing the overall diet and exercise pattern associated with dairy foods consumption come from the "CARDIA Study," a general community sample from four U.S. metropolitan areas [41]. CARDIA Study participants were partitioned into terciles according to consumption of dairy foods. The highest tercile in dairy consumption averaged about 18% more physical activity than the lowest dairy consuming tercile. Similarly, dietary profile comparisons of the highest and lowest dairy product consuming terciles showed that the highest tercile dairy consumers averaged 68% more whole grains, 13% more fruits and vegetables, and 46% less sugar-sweetened beverages than the tercile consuming the least dairy foods. Estimating conservatively, the highest dairy product tercile consumed at least 40% more dietary fiber/day (i.e., ≥ 10 g/day more) than the lowest. Using the FAO/WHO database, plugging these fiber and exercise values (i.e., 10 g/day more fiber and 18% more exercise) into the two formulas yielded an average prediction that the highest tercile of dairy consumers will gain 0.059 BMI units/year less than the lowest tercile (FAO/WHO formula: 0.065 BMI units/year less and DCCT/EDIC formula: 0.054 BMI units/year less). A similar formula calculation using the DCCT/EDIC database predicted that the highest tercile of dairy consumers will gain 0.069 BMI units/year less than the lowest tercile (FAO/WHO formula: 0.076 BMI units/year less and DCCT/EDIC formula: 0.063 BMI units/year less). A difference of 0.059 to 0.069 BMI units/year is in the range of overall development of the obesity epidemic (i.e., 2.95 to 3.45 extra BMI units in 50 years).
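The arithmetic behind these figures can be checked with a few lines of Python. This is only a worked check using the numbers quoted above; the variable names are hypothetical.

```python
# Per-formula predicted differences (BMI units/year less for the top dairy tercile).
fao_database = {"FAO/WHO": 0.065, "DCCT/EDIC": 0.054}
dcct_database = {"FAO/WHO": 0.076, "DCCT/EDIC": 0.063}


def mean_of(values):
    """Arithmetic mean of a collection of numbers."""
    values = list(values)
    return sum(values) / len(values)


avg_fao_db = mean_of(fao_database.values())    # reported as 0.059
avg_dcct_db = mean_of(dcct_database.values())  # reported as 0.069

# Cumulative difference over 50 years, using the reported averages.
cumulative_50y = [round(rate * 50, 2) for rate in (0.059, 0.069)]
print(avg_fao_db, avg_dcct_db, cumulative_50y)
```

The simple averages come to about 0.0595 and 0.0695, so the reported 0.059 and 0.069 appear to be truncated or computed from unrounded inputs; multiplying the reported values by 50 reproduces the 2.95 to 3.45 BMI unit range.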
The National Health and Nutrition Examination Survey data show that average gains per year in BMI of the USA adult population range from 0.087 to 0.137 BMI units [42]. These data suggest that the relatively healthy overall diet and exercise pattern associated with low-fat yogurt may reduce long-term weight gain, but low-fat yogurt itself, and similarly other dairy products, more likely increase weight. Rates of obesity in the U.S. and other developed countries are much higher in the food-insecure lower socio-economic classes [43]. Potatoes (including French fries and potato chips), sugary foods, low PUFA meats, and refined grains provide dietary energy at the lowest cost and are chosen by poor people out of necessity [44]. As suggested by Drewnowski, food group (categorical) profiling studies linking inexpensive potatoes with obesity may be confounded because low income people, who carry higher risks of obesity, eat more potatoes [45]. FAO/WHO and DCCT/EDIC macronutrient profiling analyses (continuous model profiling) would not be subject to these kinds of biases related to food selections shaped by the income or health consciousness of the consumer.
Table 18 Correlations of FAO/WHO and DCCT/EDIC macronutrient profiling formulas and the Harvard categorical food profiling results for 10 potato and dairy products
FAO/WHO formula estimates for dairy products and potatoes accorded with DCCT/EDIC formula predictions (Table 18: n = 10, r = 0.68, P = 0.0311). Both macronutrient profiling formulas disagreed with the Harvard nutritional epidemiology team food group profiling estimates for dairy products and potatoes (r = −0.12, P = 0.73 and r = −0.56, P = 0.09 for the FAO/WHO and DCCT/EDIC versus Harvard weight impact predictions, respectively, Table 18). This supports the view that the health-conscious public's misperceptions of the long-term weight effect of low-fat yogurt and other dairy products, together with the greater affordability of potatoes for low income people, confounded the Harvard nutritional epidemiology team's predictions concerning dairy products and potatoes.
While this analysis has potential confounders, it is, hopefully, a valuable first step with the methodology of comparing long-term data on BMI change/year to macronutrient and exercise profiles in different databases. Limitations of this study include: (1) food availability (FAO) is used rather than food consumption for the diet variables in countries, (2) only 167/200 FAO/WHO countries provided sufficient data on which to base an analysis, (3) imputed physical activity data were used for 55 female and 55 male cohorts, (4) data are lacking in the FAO database on nuts, seeds, and vegetables, (5) the DCCT/EDIC data included subjects with a relatively narrow range of variability in macronutrient intake and exercise level, (6) people with type 1 diabetes are not typical of the population for many reasons, so the univariate and multivariate correlations of DCCT/EDIC participants cannot be assumed to be the same as other populations, (7) the foregoing DCCT/EDIC factors probably led to a weak multiple-variable correlation with BMI change/year (R² = 0.03, P < 0.0001), (8) unequal access to various foods, cultural differences, and other factors may also confound the results of this analysis, and (9) these food group/macronutrient and exercise profiling formulas, although validated by the strong correlation with the food group profiling data from the Harvard nutritional epidemiology group, still require further verification from other databases relating macronutrient and exercise profiles to BMI change/year or adult BMI or from prospective diet and exercise profiling studies.
Inferring the changes in BMI based on the diet and physical activity parameters using the data from these studies may not be optimal, but it is reasonable given that the FAO kcal and macronutrient availability data for each entire country's population would not be expected to change radically over 50 years for most countries. The exceptions will be part of the noise in the data. The DCCT diet analyses were conducted 2-5 times over the 4-9 years on trial and, unfortunately, not repeated during the EDIC 10 year follow-up. Ideally, we should have diet data on a yearly or monthly basis over decades. However, such data do not yet exist.
In countries with mean BMI levels already in overweight or obese categories or projected to increase into these categories, policymakers, nutrition professionals, and the public should consider that these formulas might inform strategies to combat the obesity epidemic. This should stimulate discussion about strategies to increase physical activity and adjust the availability and consumption of foods that increase BMI relative to BMI decreasing foods to avoid excessive weight gain and the associated health problems for individuals and populations.
Policymakers, dietary professionals, and individuals could also consider using or promoting the use of the website health tool offered in this article to base a "nudge" for people. According to the popular book, Nudge: improving decisions about health, wealth, and happiness by Richard Thaler and Cass Sunstein, the concept of nudging describes "any aspect of the choice architecture that alters people's behavior in a predictable way without forbidding any options or significantly changing their economic incentives" [46]. Nudging uses "libertarian paternalism," a political/social philosophy in which people's choices are actively guided in their best interests but they remain at liberty to behave differently [47]. Regular analysis and monitoring of the long-term weight impacts of one's diet and physical activity choices with this tool could nudge people to adopt healthier lifestyles in accordance with their own perceived best interests.
The correlations between these two formulas and the validation of the formulas by comparison with the food group profiling data of the three databases used by Mozaffarian and colleagues raise the possibility that multiple regression formulas derived from other databases may also have a similar format:
Possible general format for BMI change/year prediction formulas = (A * carbohydrates g * 4 (kcal/g)/kcals - B * dietary fiber g/kcals + C * total fat g * 9 (kcal/g)/kcals - D * PUFA g * 9 (kcal/g)/kcals - E * exercise) * F + G
While protein and alcohol were each only in one formula, the univariate correlations of both of these macronutrients in the FAO/WHO database suggest that they tend to increase weight (i.e., r > 0, Table 8). The statistical findings of this analysis support previous recommendations to encourage consumption of mostly unprocessed plant-based commodities (fruits, vegetables, cereals, pulses, roots/tubers, etc.) to combat the obesity epidemic. The formulas in this study may facilitate strategies by individuals and policy makers to nudge the patterns of food consumption in a healthy direction. Further, immediate feedback on the predicted long-term effect of exercise from the health tool should be combined with strategies to promote regular physical activity at population levels (e.g., in schools, worksites, etc.) and to incentivize regular physical activity in the health care system.
Utilizing the website tool offered in this article could provide a welcome "nudge" to motivated users to adopt diet and exercise habits in line with their wishes for long-term weight control. Academic nutrition researchers should consider partnering with the authors in undertaking prospective observational/interventional studies of individuals who use the future BMI prediction interactive website to prevent or treat obesity.