Dietary Patterns in Colombia: An Exploratory and Confirmatory Factor Analysis

This study aimed to construct and validate dietary patterns in Colombian individuals aged between 15 and 64 years. The study sample consisted of 16,216 individuals who participated in the 2015 National Survey on the Nutritional Situation of Colombia (ENSIN). To identify dietary patterns, an exploratory analysis was first conducted with 16 food groups with 70% of the study sample, followed by a confirmatory analysis with the remaining 30%. Four eating patterns were identified, referred to as traditional, industrialized, conservative, and grilled foods/drinks. The traditional pattern presented the highest variance (17.65%), followed by the industrialized (14.36%), conservative (10.74%), and grilled foods/drinks (7.57%) patterns. Regarding the frequency of food consumption, the preference for the consumption of rice/pasta and panela/sugar/honey was evident. In contrast, low consumption of whole foods, energizers, light foods, supplements, and salt was observed. The dietary patterns in Colombia continue to be characterized by a high consumption of panela/sugar/honey and low consumption of fruits/vegetables, which is why the development of public health interventions to improve these habits, such as health promotion and chronic disease prevention tools, is considered necessary.


INTRODUCTION
In recent decades, industrialization, changes in agricultural production, and resource availability have affected people's lifestyles, causing disturbances in eating patterns, among other changes. These modifications have led to the abandonment of diets that are rich in complex carbohydrates, fiber (cereals and potatoes), proteins, and vegetable fats and low in foods that are rich in animal proteins and fats. In contrast, there has been an adoption of dietary patterns characterized by a high consumption of processed or ultra-processed foods, foods rich in animal fats, and hypercaloric foods and a low intake of fruits and vegetables (Ministerio de Salud y Protección Social, 2017).
Health indicators have been significantly affected by this change in dietary patterns, leading to an increase in the prevalence of chronic non-communicable diseases, such as diabetes, cardiovascular diseases, and cancer (Khoury et al., 2014). Chronic diseases are estimated to cost €115 billion or 0.8% et al., 2019) (de Paula Matos Souza et al., 2019) (Gil-Toro et al., 2017).
Regarding studies related to the evaluation of dietary patterns in the different regions of Colombia, Quintero-Lesmes et al. conducted an exploratory factor analysis (EFA) of the data obtained in the 2010 ENSIN and identified three dietary patterns in the Colombian population between 18 and 64 years of age (Quintero-Lesmes and Herran, 2019). The first dietary pattern was classified as "fruits-vegetables/dairy" and was characterized by the consumption of milk and its derivatives, chicken, fruits, raw and cooked vegetables, whole fruits and juices, bread/arepa/cookies, grains, beef viscera, light foods, tuna, and sardines. The second was called the "traditional/starch" pattern, highlighting the intake of panela, sugar, honey, rice or pasta, fried foods, bananas, eggs, beef/pork, shellfish, coffee or tea, legumes, etc. The third was referred to as the "snack" pattern and was defined by the predominance of fast foods, soft drinks, packaged foods, butter, sausages, chicken offal, etc. (Quintero-Lesmes and Herran, 2019). Besides identifying these patterns, they also established that the most frequent dietary pattern in the Colombian population was the "traditional" one (Quintero-Lesmes and Herran, 2019).
Considering that dietary patterns are related to the risk of developing chronic non-communicable diseases (Al-Maskari, 2021) and that they are influenced by demographic transitions and changes that have occurred over time, this study aimed to construct and validate the current dietary patterns in the Colombian population aged between 15 and 64 years using EFA and confirmatory factor analysis of the data obtained via ENSIN 2015. The purpose of this study was to offer more detailed information to facilitate future research related to the association between dietary patterns, sociodemographic characteristics, and the occurrence of different chronic non-communicable diseases, as well as the development and implementation of health strategies that promote the consumption of healthy foods.

MATERIALS AND METHODS
This was a cross-sectional descriptive study to determine dietary patterns in the different regions of Colombia.
The sample consisted of 16,216 individuals aged between 15 and 64 years who participated in ENSIN 2015, a descriptive cross-sectional population survey with probabilistic, cluster, stratified, and multistage sampling (Instituto Colombiano de Bienestar Familiar, 2018).
To facilitate the establishment of eating patterns and to obtain an explained variance greater than 45%, the response options to the food consumption frequency variables were recoded (initial values from 1 to 4 [1 = daily; 2 = weekly; 3 = monthly; and 4 = not usual consumption] were converted to scores from 0 to 3 [0 = not usual consumption; 1 = monthly; 2 = weekly; and 3 = daily]) and transformed to a scale of 0-100, with 0 indicating "no consumption" and 100 indicating "maximum consumption," using the following formula: [(sum of the items that comprise the variable (raw score) − lowest possible score)/maximum possible range] × 100. With the transformed variables, new food group variables were constructed to be tested in the construction of dietary patterns.
For data analysis, the sample was divided into two groups (Ortiz-Martinez et al., 2019). The first was the "design group," which represented 70% (11,382) of the subjects, selected randomly and stratified by age and sex, and in which the EFA was applied for the construction of dietary patterns. The second was the "validation group," which comprised the remaining 30% (4,879) of subjects and in which the confirmatory factor analysis (CFA) was applied to validate the eating patterns obtained in the exploratory factor analysis. SPSS Amos 26 was used for CFA.
A descriptive analysis was performed to detail the population characteristics. The analysis included mean, median, standard deviation, and interquartile range (IQR) values for the quantitative variables and frequencies and percentages for the qualitative variables. Both groups (design and validation) were compared using chi-square statistics for qualitative variables and the Mann-Whitney U test for quantitative variables. We used nonparametric statistics because the data did not meet the normality criterion, evaluated using the Kolmogorov-Smirnov test.
The relevance of performing EFA in the design group was evaluated using Bartlett's test of sphericity and the Kaiser-Meyer-Olkin measure of sampling adequacy. The extraction and rotation of the factors were performed using the principal components method with Varimax rotation and the maximum likelihood method. The number of dietary patterns (factors) was selected according to the Kaiser rule (minimum number: eigenvalues >1; maximum number: eigenvalues >0.7), the scree plot, and the criteria of the research team. The food groups that had correlation values of <0.30 were considered irrelevant in the definition of each factor and, in the case of values > 1, a cutoff point of 0.50 was established to decide in which factor the affected variable should be placed. Communality values were assessed to determine the variables to be included in the EFA.
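The recoding and rescaling described above can be illustrated with a minimal sketch. The function names (recode_frequency, group_score) are hypothetical and are not taken from the study; the sketch only assumes that each food group is the sum of one or more frequency items scored from 0 to 3.

```python
def recode_frequency(code):
    """Map a survey code (1 = daily ... 4 = not usual) to a score (3 ... 0)."""
    if code not in (1, 2, 3, 4):
        raise ValueError(f"unexpected frequency code: {code}")
    return 4 - code


def group_score(codes):
    """Return the 0-100 score of a food group built from several items.

    Applies [(raw score - lowest possible score) / maximum possible range] * 100,
    where each recoded item ranges from 0 to 3.
    """
    scores = [recode_frequency(c) for c in codes]
    raw = sum(scores)
    lowest = 0 * len(scores)               # every item at "not usual consumption"
    max_range = 3 * len(scores) - lowest   # every item at "daily" minus the lowest
    return (raw - lowest) / max_range * 100


# Hypothetical food group with two items: one weekly (2), one monthly (3).
example = group_score([2, 3])  # (2 + 1) / 6 * 100
```

Under these assumptions, a group whose items are all reported as daily scores 100, and a group whose items are all reported as not usual scores 0.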
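A random 70/30 split stratified by age and sex could be sketched as follows. This is an illustration under stated assumptions, not the procedure used by the authors: the record layout, the age bands, the seed, and the function name stratified_split are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(records, strata_key, design_fraction=0.70, seed=2015):
    """Split records into design and validation groups within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[strata_key(rec)].append(rec)

    design, validation = [], []
    for key in sorted(strata):
        members = strata[key][:]
        rng.shuffle(members)                         # random order inside the stratum
        cut = round(len(members) * design_fraction)  # size of the design share
        design.extend(members[:cut])
        validation.extend(members[cut:])
    return design, validation


# Synthetic participants (not ENSIN data): 6 strata with 100 people each.
people = [
    {"id": i,
     "age_group": ("15-29", "30-44", "45-64")[i % 3],
     "sex": ("F", "M")[i % 2]}
    for i in range(600)
]
design, validation = stratified_split(
    people, lambda r: (r["age_group"], r["sex"])
)
```

Splitting inside each stratum keeps the age and sex distribution of both groups close to that of the full sample, which is what the comparison of the two groups is meant to verify.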
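The comparison of the design and validation groups can be sketched with SciPy. The data below are synthetic and the variable names are hypothetical; the sketch only shows which tests correspond to the description above, not the authors' actual computations. Note that a Kolmogorov-Smirnov test with the mean and standard deviation estimated from the same sample is only approximate.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic, right-skewed ages for the two groups (not ENSIN data).
age_design = rng.gamma(shape=4.0, scale=8.0, size=700) + 15
age_validation = rng.gamma(shape=4.0, scale=8.0, size=300) + 15

# Normality check against a normal distribution with the sample mean and SD.
ks_stat, ks_p = stats.kstest(
    age_design, "norm", args=(age_design.mean(), age_design.std(ddof=1))
)

# Quantitative variable: Mann-Whitney U test between the two groups.
u_stat, u_p = stats.mannwhitneyu(
    age_design, age_validation, alternative="two-sided"
)

# Qualitative variable: chi-square test on a group-by-sex contingency table.
# Rows: design, validation; columns: women, men (hypothetical counts).
sex_table = np.array([[399, 301],
                      [171, 129]])
chi2_stat, chi2_p, dof, expected = stats.chi2_contingency(sex_table)
```

In this hypothetical table both groups contain 57% women, so the chi-square test gives no evidence of a difference between the groups.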
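The adequacy checks and the Kaiser rule can be sketched with NumPy, starting from a correlation matrix. The function names and the small demonstration matrix are hypothetical; the Bartlett function returns only the chi-square statistic and its degrees of freedom, and the p-value would be obtained from the chi-square distribution.

```python
import numpy as np


def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom of Bartlett's test."""
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    statistic = -(n - 1 - (2 * p + 5) / 6) * logdet
    dof = p * (p - 1) // 2
    return statistic, dof


def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale                    # partial correlations
    off = ~np.eye(R.shape[0], dtype=bool)     # off-diagonal mask
    r2 = (R[off] ** 2).sum()
    a2 = (partial[off] ** 2).sum()
    return r2 / (r2 + a2)


def kaiser_count(R, threshold=1.0):
    """Number of eigenvalues of R above the threshold."""
    eigenvalues = np.linalg.eigvalsh(R)
    return int((eigenvalues > threshold).sum())


# Hypothetical 4-variable correlation matrix (eigenvalues 1.6, 1.2, 0.8, 0.4).
R_demo = np.array([[1.0, 0.6, 0.0, 0.0],
                   [0.6, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.2],
                   [0.0, 0.0, 0.2, 1.0]])
stat_demo, dof_demo = bartlett_sphericity(R_demo, n=500)
min_factors = kaiser_count(R_demo, threshold=1.0)   # eigenvalues > 1
max_factors = kaiser_count(R_demo, threshold=0.7)   # eigenvalues > 0.7
```

For this demonstration matrix the two thresholds bracket the number of factors between two and three, which mirrors how the minimum and maximum numbers of patterns were delimited before the scree plot and the judgment of the research team were applied.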
The dietary patterns constructed in the EFA were tested in the validation group by the calculation of global or absolute fit indices, incremental or comparative fit measures, and parsimony fit measures of the model. In addition, the path diagram of the model with the best fit was plotted.
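Several of the fit measures mentioned above can be computed from the chi-square values of the tested model and of the independence (null) model. The sketch below uses commonly cited formulas; the function name and all numeric inputs are hypothetical, and the AIC is written as chi-square plus twice the number of free parameters, which is assumed here to match the convention reported by the software.

```python
import math


def fit_indices(chi2_model, df_model, chi2_null, df_null, n, n_free_params):
    """Absolute, incremental, and parsimony fit measures of a CFA model."""
    d_model = max(chi2_model - df_model, 0.0)   # model noncentrality
    d_null = max(chi2_null - df_null, 0.0)      # null-model noncentrality
    largest = max(d_model, d_null)

    rmsea = math.sqrt(d_model / (df_model * (n - 1)))
    cfi = 1.0 if largest == 0 else 1.0 - d_model / largest
    nfi = (chi2_null - chi2_model) / chi2_null
    ratio_null = chi2_null / df_null
    ratio_model = chi2_model / df_model
    tli = (ratio_null - ratio_model) / (ratio_null - 1.0)
    pratio = df_model / df_null
    aic = chi2_model + 2 * n_free_params
    return {"RMSEA": rmsea, "CFI": cfi, "NFI": nfi,
            "TLI": tli, "PRATIO": pratio, "AIC": aic}


# Hypothetical values for a 4-factor model with 16 indicators:
# 16 loadings + 16 error variances + 6 factor covariances = 38 parameters,
# 136 distinct moments, so df = 98; the null model has df = 120.
result = fit_indices(chi2_model=1500.0, df_model=98,
                     chi2_null=12000.0, df_null=120,
                     n=4879, n_free_params=38)
```

Because the chi-square values are invented, the resulting indices carry no information about the models in Table 5; they only show how each measure depends on the chi-square, the degrees of freedom, and the sample size.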
An alpha error of less than 5% (p < 0.05) was set to establish statistical significance in all tests performed.

RESULTS
Sociodemographic Characteristics
The mean age of the study population was 31.5 ± 15.0 years (design group: 31.3 ± 14.9 years; median: 27.0; IQR: 17-43; validation group: 31.5 ± 15 years, median: 27.0, IQR: 17-44; p = 0.333). Overall, 57.0% were women, 74.6% lived in urban areas, 45.3% had an education level between complete secondary school and incomplete higher studies, 58.0% had social security coverage in the subsidized regime, 45.4% were in the first quartile of the wealth index, and 24.5% were from the central region. Table 1 shows the sociodemographic characteristics of the study population compared by analysis group (design and validation). No significant differences were found between the subgroups upon comparing the categories of the variables.

Frequency of Food Consumption During the Previous Month
More than 79% of the population reported daily consumption of rice/pasta and panela/sugar/honey. Regarding weekly consumption, ≥60% reported consuming eggs, chicken, meat, dry grains, and fried foods. Only 28% stated that they consumed fish/shellfish and tuna/sardines on a monthly basis. More than 80% reported not having a usual consumption of bienestarina, other vegetable mixes, whole-grain foods, energizing drinks, light foods, supplements, and salt. Table 2 shows the distribution of the frequency of food consumption during the previous month between the design and validation groups. The frequency of consumption of whole-grain foods was significantly different. Supplementary Table S1 presents the descriptive analysis of the frequency of consumption of the food groups included in the EFA.

Exploratory Factor Analysis
EFA was tested with five models. In all five models, Bartlett's test of sphericity, with its respective chi-square statistic and statistical significance at p < 0.001, confirmed that the correlation coefficients were not zero. Similarly, the Kaiser-Meyer-Olkin measure of sampling adequacy provided sufficient information to verify the relevance of factor analysis in the establishment of Colombian food patterns. Table 3 shows the statistics of the EFA models evaluated as well as the communality values, factors, and total variances explained.
The communality values of the 16 food groups included in the EFA of models 4 and 5 were 0.361-0.594 and 0.363-0.592, respectively. The four factors with initial eigenvalues >1 accounted for more than 50% of the variance in both models.
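The relationship between eigenvalues, communalities, and explained variance under principal components extraction can be sketched as follows. The correlation matrix is a hypothetical four-variable example and the function names are illustrative; they are not the values or procedures behind Table 3.

```python
import numpy as np


def pca_loadings(R, n_factors):
    """Eigenvalues (descending) and unrotated loadings of the first factors."""
    eigenvalues, eigenvectors = np.linalg.eigh(R)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    loadings = eigenvectors[:, :n_factors] * np.sqrt(eigenvalues[:n_factors])
    return eigenvalues, loadings


def communalities(loadings):
    """Share of each variable's variance reproduced by the retained factors."""
    return (loadings ** 2).sum(axis=1)


def percent_variance(eigenvalues, n_factors):
    """Percentage of total variance explained by each retained factor."""
    return eigenvalues[:n_factors] / eigenvalues.sum() * 100


# Hypothetical correlation matrix with eigenvalues 1.6, 1.4, 0.6, 0.4.
R_demo = np.array([[1.0, 0.6, 0.0, 0.0],
                   [0.6, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.4],
                   [0.0, 0.0, 0.4, 1.0]])
eigenvalues, loadings = pca_loadings(R_demo, n_factors=2)
h2 = communalities(loadings)
explained = percent_variance(eigenvalues, n_factors=2)
```

In this example the two retained factors have eigenvalues above 1 and together explain 75% of the variance, while the communalities (0.8 and 0.7) indicate how well each variable is represented by the retained factors.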
The rotated component matrix with the principal components analysis extraction method for models 1-5 is presented in Table 4.
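A rotated component matrix of this kind can be obtained with a Varimax rotation. The sketch below uses a widely circulated SVD-based iteration and then assigns each variable to the factor on which it loads most strongly, leaving unassigned any variable whose largest absolute loading is below 0.30. The loadings, the variable names, and both function names are hypothetical, and this is not necessarily the algorithm implemented in the software used for the study.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation of a (variables x factors) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.diag((rotated ** 2).sum(axis=0))   # column sums of squares
        gradient = loadings.T @ (rotated ** 3 - rotated @ col_ss / p)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt                              # nearest orthogonal matrix
        previous, criterion = criterion, s.sum()
        if previous != 0 and criterion / previous < 1 + tol:
            break
    return loadings @ rotation


def assign_patterns(rotated, names, min_loading=0.30):
    """Map each variable to the index of its dominant factor, or None."""
    assignment = {}
    for name, row in zip(names, rotated):
        best = int(np.argmax(np.abs(row)))
        assignment[name] = best if abs(row[best]) >= min_loading else None
    return assignment


# Hypothetical simple structure, deliberately rotated by 20 degrees.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.9]])
theta = np.deg2rad(20)
spin = np.array([[np.cos(theta), -np.sin(theta)],
                 [np.sin(theta), np.cos(theta)]])
unrotated = simple @ spin
recovered = varimax(unrotated)
groups = assign_patterns(recovered, ["a", "b", "c", "d"])
```

Because the rotation is orthogonal, the communalities are unchanged; only the distribution of each variable's loading across the factors is altered, which is what makes the rotated matrix easier to interpret.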

Confirmatory Factor Analysis
The chi-square likelihood ratio test showed statistically significant differences between the compared covariance matrices; however, because this test is highly influenced by the sample size, the other measures of goodness of fit should be considered to determine the validity of the models. In this sense, the root mean square error of approximation (RMSEA) was lower than 0.08 in all the models, which allowed us to consider that the models had a good fit (Steiger and Lind, 1984) (Hair et al., 2009) (Bentler, 1990). It is also recommended that the incremental fit measures be between 0 and 1, preferably NFI ≥ 0.90 (Marôco, 2014). As a result, we obtained a Comparative Fit Index (CFI) close to 0.9 for models 1, 3, and 4. Furthermore, among the incremental fit measures, model 4 had the best Tucker-Lewis index (TLI), with a value close to 0.9. Finally, regarding the parsimonious fit measures, the PRATIO statistic had values close to 0.90, and the Akaike Information Criterion (AIC) showed that model 4 had a better fit. Therefore, based on the nine statistics used to determine the best model, model 4 showed the best fit, with the lowest AIC value of all the models. Hence, the authors considered it the most appropriate when determining dietary patterns in the different regions of Colombia. The measures of absolute, incremental, and parsimony fit obtained by means of the CFA for models 1-5 are shown in Table 5, and Figure 1 shows the path diagram of model 4.

DISCUSSION
In this study, an investigation was conducted to identify dietary patterns in Colombian people using data from ENSIN 2015.
When evaluating the sociodemographic characteristics of the studied sample, we found that the variables of sex and distribution of the population by area and geographic region behaved similarly to those reported by the National Administrative Department of Statistics in its 2018 report (Departamento Administrativo Nacional de Estadistica, 2018).
Regarding the frequency of food consumption, the results showed a clear preference for the consumption of rice/pasta and panela/sugar/honey. However, low consumption of whole foods, energizers, light foods, supplements, and salt was observed. Alcoholic drinks also had low consumption. In contrast, Leme et al. stated in their study "The Latin American Nutrition and Health Study" that the consumption of sugary drinks and alcoholic drinks in Colombian young people and adults was 48.29% (B. Leme et al., 2021).
Regarding the construction of patterns using EFA, Bartlett's test of sphericity and the Kaiser-Meyer-Olkin measure of sampling adequacy indicated the relevance of conducting this analysis on the sample studied and, despite model 5 showing a higher explained variance than the other models, model 4 yielded the best fit indicators in the subsequent validation. Thus, four dietary patterns were determined in Colombia. The first one, called the "traditional pattern," refers to natural, minimally or moderately processed food that involves domestic preparation and cooking techniques, with little presence of ultra-processed foods. It had an explained variance of 17.65% and was made up of the following food groups: dairy (milk, cheese, butter); potatoes/legumes (dried grains, tubers/bananas); cereals (bread, arepa, bienestarina, other vegetable mixtures, cookies, rice/pasta); fried foods; coffee; panela/sugar/honey; and meat/fish/eggs/sausages (meat, black pudding/beef offal, fish/seafood, tuna/sardine, chicken offal, chicken/hen, eggs, sausages). In this dietary pattern, the potatoes/legumes group yielded the highest factor loading (0.74) and had a mean frequency of consumption of 70%. However, other authors have reported a reduction in the consumption of legumes in recent years (Gámbaro et al., 2011) (Fundación Española de la Nutrición, 2022) (Miller et al., 2017) (Téllez, 2019). Further, the Food and Agriculture Organization of the United Nations (FAO) indicated in its 2019 report that the consumption of legumes decreased in countries of the Mesoamerica Region, increased in the countries of the Caribbean Region, and remained stable in South American countries (Rapallo and Rivera, 2019) (FAO and FIDA, 2020).
The other food groups that were represented in the traditional pattern were (in descending order): coffee, panela/sugar/honey; cereals (bread, arepa, bienestarina, other vegetable mixtures, cookies, rice/pasta); meats/fish/eggs/sausages (meat, black pudding/beef offal, fish/seafood, tuna/sardines, chicken offal, chicken/hen, eggs, sausages) and fried foods; and dairy (milk, cheese, butter). In line with these results, the International Coffee Organization, in its 2020 report, determined that Colombia is one of the Latin American countries with the highest frequency of coffee consumption, along with Mexico, Honduras, and Venezuela (International Coffee Organization, 2021). Likewise, Pertuz-Cruz et al. found that during the COVID-19 pandemic, 80% of the population consumed coffee (Pertuz-Cruz et al., 2021). Regarding sugar consumption, a study of added sugars in eight Latin American countries compared the frequency of consumption in absolute terms of grams/day and showed that Colombia is among the countries with the highest intake values (Fisberg et al., 2018). Regarding the consumption of dairy products, the results were similar to those reported in the analysis of the dairy situation, where Colombia was found at the lower limit of intake with respect to other Latin American countries such as Uruguay, Argentina, and Costa Rica (Unidad de Planificación Rural Agropecuaria, 2020). In other countries, several studies identified a low frequency of dairy consumption. The gap between actual consumption and that recommended by the WHO and other food safety policies was evident in those studies (GBD 2017 Diet Collaborators, 2019) (Herrera-Cuenca et al., 2020). However, Pertuz-Cruz et al. noticed that during the COVID-19 pandemic, Colombians increased the frequency of consumption of dairy products (Pertuz-Cruz et al., 2021).
The second dietary pattern was called the "industrial pattern." It yielded an explained variance of 14.36% and was composed of the following food groups, in descending order (correlations between 0.74 and 0.71): sweets/packaged foods and fast food; soft drinks (soda/tea/soft drinks).
The high consumption of the foods that make up this pattern provides more calories, threatens the quality of the diet, promotes dental caries, and could increase subjects' predisposition to the development of chronic non-communicable diseases such as diabetes, overweight, or obesity (Silva et al., 2018) (Cediel et al., 2018) (Machado et al., 2020) (Cárdenas Sánchez et al., 2019) (Zheng et al., 2017). In this sense, FAO reported that the consumption of soft drinks or sugar-sweetened beverages has increased in Latin America, with an average intake of 0.8 servings per day, similar to what was reported in a study conducted in 195 countries, where it was found that the amount ingested (49 g per day) was much higher than that considered optimal (GBD 2017 Diet Collaborators, 2019). Another recently published study reported a reduction in the consumption of sugary drinks in the Colombian population (Pertuz-Cruz et al., 2021).
The third dietary pattern, called the "conservative pattern," obtained an explained variance of 10.74% and was made up of the following food groups, in descending order (correlations between 0.68 and 0.50): whole foods and light foods/supplements; fruits (juice fruits, whole fruits) and vegetables (cooked vegetables, raw vegetables). Several authors agree that the consumption of whole grains could reduce plasma glucose concentration and have a favorable effect on serum concentrations of triglycerides (TG) and high-density lipoprotein cholesterol (HDL-C) in overweight children (Hajihashemi et al., 2021) (Hollaender et al., 2015). In addition, this pattern included the food groups of fruits and vegetables, essential pillars of populations' dietary security and health. Seeking to promote increased consumption of these foods, WHO and FAO have deployed policies and strategies.
Finally, the fourth dietary pattern, called the "grilled food and processed drinks pattern," describes individuals with high consumption of grilled meat products and frequent consumption of processed or ultra-processed alcoholic and non-alcoholic beverages, among which sodas, processed juices with added sugar, industrialized juices, and energizers stand out. This pattern obtained an explained variance of 7.57% and was made up of the following food groups, in descending order (correlations between 0.71 and 0.55): alcohol group (alcoholic drinks), grilled foods, and energy drinks (energizers). The high consumption of these foods constitutes a risk factor for health because of its relationship with chronic diseases: alcohol consumption is associated with the main causes of morbidity and mortality (Chrystoja et al., 2021), and high intake of grilled food, like red meat, is related to mortality from cancer and cardiovascular and respiratory diseases (Zheng et al., 2019).
In this respect, the 2020 report about alcohol and health in the Americas showed that the average total consumption of Alcohol Per Capita (APC) was 25% higher in countries of the Region of the Americas than in the rest of the world population. Regarding the presence of alcohol use disorders, Colombia was among the countries of the Americas with one of the highest percentages (OPS, 2021), a concerning fact considering that this intake is associated with the main causes of morbidity and mortality (Chrystoja et al., 2021).
Regarding the results obtained in the CFA, it should be noted that all the proposed models had measures of reasonable fit, with model 4 being the most appropriate, as it presented the best values in 5 of the 9 measures of fit evaluated.
In this model, the path diagram showed standardized correlations greater than 0.30 in most of the food groups, except for the light food/supplement group (0.22) and the whole food group (0.28) in the conservative pattern (pattern 3) and the grilled food group (0.16) in the drinks/grilled food pattern (pattern 4). Although these food groups did not have acceptable correlations, they were kept in the corresponding pattern following the theoretical coherence that supports the definition of each dietary pattern (Herrán et al., 2016).
Overall, the traditional and conservative eating patterns showed a grouping of healthy foods, while the industrial and drinks/grilled foods patterns included the foods mostly associated with the occurrence of non-communicable diseases. When comparing these dietary patterns with those described by Quintero-Lesmes and Herrán (Quintero-Lesmes and Herran, 2019), a redistribution was observed in the food groups and, with it, the consequent division of the "snack" pattern into two new patterns referred to in this study as "industrial pattern" and "drinks/grilled food pattern."
This study had some limitations. First, the 2015 ENSIN reported a low percentage of consumption of some foods, such as alcohol, grilled foods, and salt, among others, compared to the results reported in other studies (Chrystoja et al., 2021) (OPS, 2021) (FAO, 2012). These differences could be conditioned by the type of survey conducted and, in the case of the report on the consumption of alcoholic drinks, could also be related to social desirability bias, in which interviewees perceive alcohol consumption as socially unacceptable and therefore give biased answers. Nevertheless, as it was a population survey with a sample of 16,216 participants, the data could be analyzed, and new food groups were created and later included in the construction of dietary patterns. In addition, the ENSIN 2015 did not contemplate the inclusion of the population over 65 years of age, so the inference of the results to this population group should be handled cautiously.
Regarding the strengths of our study, we point out the following: first, the type of sampling (probabilistic, cluster, stratified, and multistage) allowed us to have a representative sample of the different geographical areas of the country; second, using the CFA as a tool for the validation of dietary patterns provided a solid statistical basis for the determination of the patterns derived from the EFA.

CONCLUSIONS AND FUTURE PERSPECTIVES
In conclusion, the four patterns obtained through the factor analysis reflect the variety and quality of the diet in the different regions of Colombia. The traditional and conservative patterns include foods with high nutritional value to cover the needs in the different stages of life. The industrial and drinks/grilled foods patterns group ultra-processed foods and beverages with a high caloric content, which threaten the quality of the diet and are mostly associated with the occurrence of non-communicable diseases.
Future research may examine associations between dietary patterns and sociodemographic characteristics, chronic non-communicable diseases, and cancer mortality. This will make it possible to identify at-risk population groups and, in this way, design specific and sustainable interventions that improve the population's health.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding authors.

ETHICS STATEMENT
This study was reviewed and approved by The Human Research Ethics Committee of Santiago de Cali University (Act N°11 from the May 29th session). Written informed consent for participation was not required for this study in accordance with national legislation (Resolution 8430 of 1993) and international requirements (Declaration of Helsinki and standards of Good Clinical Practice/Harmonized Tripartite Guide of the International Council for Harmonisation).

AUTHOR CONTRIBUTIONS
ML, VA, and MR conceived this study. All the authors made substantial contributions to the study design, analysis, and interpretation of results. ML, BS, and VD drafted the first version of the manuscript, and all authors reviewed that version and later drafts. All authors read and approved the final manuscript.

FUNDING
This study was funded by the General Directorate of Research of Santiago de Cali University under call No. 03-2020, 01-2021, 05-2021. The authors thank the Colombian Institute of Family Welfare (ICBF), University Santiago de Cali, for providing the original database of the National Survey of the Nutritional Situation of Colombia (ENSIN); Cordoba University for providing the methodological tools for data analysis; and Universidad Santiago de Cali for funding the study (No. 441-621120-1947).