Evaluation of the growth of enterobacteria in fructooligosaccharides using the principal components analysis

Luiza Siede Kuck
luizakuck@yahoo.com.br
Universidade Federal do Rio Grande do Sul, Porto Alegre-RS

Caciano Zapata Noreña
czapatan@ufrgs.br
Universidade Federal do Rio Grande do Sul, Porto Alegre-RS

Experimental data obtained from the literature concerning the growth rate of 35 different enterobacteria and the acidification of the media using various carbohydrates as substrates (glucose and the fructooligosaccharides Profeed P95, Raftilose P95 and Raftiline LS) were submitted to multivariate statistical analysis by principal component analysis and cluster analysis. The objective was to evaluate the degree of correlation between the substrates and the microorganisms, grouping them according to their affinities. When the growth rates of the enterobacteria were evaluated, strong correlation was observed between the substrates composed of short chain fructose oligosaccharides (Profeed P95 and Raftilose P95). The microorganisms could be separated into four groups, with similarities found between enterobacteria of the same genus and/or species. With respect to acidification of the medium, strong correlation was observed between glucose and Profeed, which have a greater percentage of glucose units in the chain. However, poor correlation was found among microorganisms of the same genus and/or species, as well as between the enterobacteria species and fermentation of the carbohydrates.

KEY-WORDS: principal component analysis; cluster analysis; enterobacteria; fructooligosaccharides.


INTRODUCTION
Fructans are linear or branched fructose polymers which may or may not have a D-glucose residue, usually located at the extremity and connected by an α-1,2 bond (HENDRY; WALLACE, 1993). Fructans are classified according to the type of glycosidic bond (POLLOCK et al., 1996), and can be divided into three groups: inulin, levan and graminan (SMITH, 1993). Inulin is a mixture of oligomers and polymeric chains with a variable number of fructose molecules that generally have a terminal glucose molecule (VILLEGAS; COSTELL, 2007), and has a polymerization degree varying between 2 and 60 with an average degree of 12 (GIBSON; ROBERFROID, 1995; TÁRREGA; ROCAFULL; COSTELL, 2010). Inulin has some related compounds, namely oligofructose and fructooligosaccharides (FOS), differentiated by the degree of polymerization of the molecule (MEYER et al., 2011), which is less than 10 (DREVON; BORNET, 1992). FOS are present in several crops, such as yacon (CASTRO et al., 2013) and onion (KUMAR; PRASHANTH; VENKATESH, 2015).
Commercial FOS can be produced in two main ways: by the extraction and enzymatic hydrolysis of inulin, which can produce linear compounds of fructose units, with or without a glucose residue at the extremity of the chain, such as Raftilose P95 and Raftiline LS, or by the enzymatic transfructosylation of sucrose, producing compounds with only glucose residues, such as Profeed P95 (HARTEMINK; VAN LAERE; ROMBOUTS, 1997).
The use of prebiotic fibres in foods with the aim of modulating the gut microbiota is important, because they may either increase or decrease the production of health-related compounds, which is associated with the production of bacterial metabolites, the growth of health-promoting bacteria, a decrease in intestinal pathogens, or immune modulation (AL-SHERAJI et al., 2013). FOS are not digested by the enzymes in the human intestine, and thus reach the colon intact when ingested, where they are fermented by part of the microbiota present, liberating short chain fatty acids and gases (WANG; GIBSON, 1993).
The bacteria responsible for fermenting FOS can be divided into three main groups: the first group is beneficial, consisting of the bifidobacteria, lactobacilli and other lactic bacteria, all of positive value for human health; the second group consists of the enterobacteria and clostridia, considered to be negative for human health; and the third group consists of other bacteria regarded as neutral (MITSUOKA, 1990). Thus some pathogenic bacteria can compete with the bifidobacteria for the FOS.
Different studies with pathogens can be carried out by observing the growth of these bacteria using fructooligosaccharides as the substrate (HARA et al., 1994; LINKE et al., 2013; MAO et al., 2015; VAN LAERE et al., 2000). However, the use of a large number of different bacteria and fructooligosaccharides can generate a large amount of data that may complicate the interpretation of the results. Multivariate statistical analysis is a group of techniques that permits the simultaneous evaluation of diverse variables, aiming to quantify the correlation between them and hence extract information that is not visible in the original data, making it possible to eliminate the unimportant variables (HAIR et al., 2009; LUCHESA, 2004).

Principal component analysis is one of the techniques belonging to the multivariate analysis group (SHARMA, 1996). This analysis consists essentially of reducing the dimensionality of a data set made up of a large number of interrelated variables, while retaining as much as possible of the variation in the data set (JOLLIFFE, 2002). This is done by transforming the data set into a new set of variables called principal components (PC), which are uncorrelated and are obtained in decreasing order of variance, that is, the first principal component (PC1) retains more statistical information than the second principal component (PC2) (NETO; MOITA, 1998; JOLLIFFE, 2002). Dimensionality is reduced in this way, since the principal components retained generally hold 90% or more of the information contained in a large number of variables (NETO; MOITA, 1998).
Principal component analysis is often used as an intermediate technique for reducing the size of the data, and is commonly combined with another type of analysis (JOLLIFFE, 2002). Cluster analysis is one of the analyses most frequently used along with PCA. Its basic goal is to discover sample groupings within a data set, i.e., it evaluates the similarity between samples by measuring the distances between points in the measurement space, where similar samples lie close together and dissimilar samples lie far from each other (LAVINE, 2000).
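The transformation described above can be illustrated with a minimal sketch in Python, assuming that the PCA is computed from the correlation matrix of the variables. The function name `pca_correlation` and the toy data are hypothetical and purely illustrative; they are not the data or the SAS procedure used in this study.

```python
import numpy as np

def pca_correlation(data):
    """PCA on the correlation matrix (rows = samples, columns = variables)."""
    x = np.asarray(data, dtype=float)
    # Standardize each variable to zero mean and unit (sample) variance.
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    # Correlation matrix between the original variables.
    corr = np.corrcoef(x, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]          # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()        # fraction of variance per PC
    scores = z @ eigvecs                       # coordinates of samples on the PCs
    return eigvals, eigvecs, explained, scores

# Synthetic toy data (NOT from the study): two correlated variables and one
# independent variable, 30 samples.
rng = np.random.default_rng(0)
base = rng.normal(size=(30, 1))
toy = np.hstack([base + 0.1 * rng.normal(size=(30, 1)),
                 base + 0.1 * rng.normal(size=(30, 1)),
                 rng.normal(size=(30, 1))])
eigvals, eigvecs, explained, scores = pca_correlation(toy)
print(np.round(explained * 100, 2))  # percentage of variance held by each PC
```

In this sketch the columns of `scores` are uncorrelated and ordered by decreasing variance, which is the property that allows the first components to summarize most of the information.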
Thus the objective of the present study was to apply the Principal Component Analysis and the Cluster Analysis to data obtained from experiments carried out with a large number of variables, so as to evaluate the degree of correlation between the microorganisms and carbohydrates tested and group the microorganisms according to the degree of correlation between them.

MATERIAL AND METHODS
Table 1 shows the different species of enterobacteria used by Hartemink, Van Laere and Rombouts (1997) to study the possible degradation and fermentation of commercial fructooligosaccharides, their growth rate and the final pH of the medium. After incubation, the microorganisms were transferred to the following substrates: glucose, Profeed P95®, Raftilose P95® and Raftiline LS®, and to non-supplemented medium for comparative purposes (HARTEMINK; VAN LAERE; ROMBOUTS, 1997). The growth rates of the enterobacteria and the final pH values of the media supplemented with the different carbohydrates after cultivation of each of the microorganisms used in that study were submitted to multivariate analysis. Principal Components Analysis (PCA) was used in order to reduce the number of variables and identify similarities among the samples, so that the results could be reduced to two principal components (PCs), denominated the first principal component (PC1) and the second principal component (PC2) (SHARMA, 1996; JOLLIFFE, 2002). The SAS 9.3 statistical package was used to perform the PCA (SHARMA, 1996), which provided the Pearson correlation coefficients and the other data used to prepare the correlation charts in the STATISTICA 7.0 statistical package. Finally, a cluster analysis (CA) was applied to the resulting data. Hierarchical clustering was used with the Centroid method, in which the similarity between two clusters is the distance between their centroids. The cluster analysis, including the dendrograms, was carried out in the statistical software SAS 9.3 according to Sharma (1996).
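The Centroid method mentioned above can be sketched as follows, assuming Euclidean distance between centroids and a naive pairwise search. The function `centroid_clustering` and the toy points are hypothetical illustrations and do not reproduce the SAS 9.3 procedure or the data of the study.

```python
import numpy as np

def centroid_clustering(points, n_clusters):
    """Merge clusters until n_clusters remain, always joining the two
    clusters whose centroids are closest (Euclidean distance)."""
    x = np.asarray(points, dtype=float)
    clusters = [[i] for i in range(len(x))]    # start with one cluster per sample
    while len(clusters) > n_clusters:
        centroids = np.array([x[c].mean(axis=0) for c in clusters])
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = np.linalg.norm(centroids[a] - centroids[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return [sorted(c) for c in clusters]

# Synthetic toy points (NOT from the study), e.g. samples described by
# two coordinates each.
toy_points = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
              [5.0, 5.1], [5.2, 4.9],
              [10.0, 0.0]]
groups = centroid_clustering(toy_points, 3)
print(groups)  # [[0, 1, 2], [3, 4], [5]]
```

Recording the distance at which each merge occurs, rather than stopping at a fixed number of clusters, would give the information needed to draw a dendrogram.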

Growth
From the Pearson's correlation coefficients shown in Table 2, it can be seen that, with respect to the growth rates of the microorganisms, the largest correlation coefficients, all close to 1, were found between Raftilose P95 and Profeed P95 (0.97), Raftiline LS and Raftilose P95 (0.86) and Raftiline LS and Profeed P95 (0.82). Considering that these three substrates consist basically of fructooligosaccharides, correlation between them was to be expected. The stronger correlation between Raftilose P95 (95% oligosaccharides connected by β-2,1 bonds) and Profeed P95 (95% short chain oligosaccharides) can be explained by the fact that both substrates consist of fructose oligosaccharides, that is, short chain molecules with a polymerization degree of 2 to 8 and of 3 to 5, respectively. On the other hand, Raftiline LS contains oligosaccharides and polysaccharides (92%), that is, much larger molecules with a polymerization degree between 2 and 60. Thus, according to Hartemink, Van Laere and Rombouts (1997), Raftiline LS was not degraded, or was degraded much more slowly, by all the species. Consequently, although there was an increase in growth rate, there was no reduction in the pH value. It is important to note that the formation of the products which, in this case, would be responsible for the reduction in pH value may or may not be associated with microbial growth, that is, in some cases product formation is less directly related to the growth process (BORZANI; LIMA; AQUARONE, 1975; CASABLANCAS; SANTÍN, 1998).
The principal components analysis resulted in two principal components: PC1 (78.56%) and PC2 (10.38%), which together represented 88.94% of the total variability of the data. Figure 1A shows the two principal components, PC1 being more related to Profeed P95 and Raftilose P95, or to the medium when no additional carbohydrate was added. On the other hand, PC2 was high for glucose and negative for the other substrates. The correlation between Profeed P95 and Raftilose P95 can be observed again, in the same way as for the Pearson's correlation coefficients. When the relationship between the microorganisms tested in the experiment was evaluated with respect to their growth rates (Figure 1B), they could be placed in four groups (I, II, III and IV), with one microorganism separate from these groups (Salmonella typhimurium E-18).
The cluster analysis, shown in the dendrogram (Figure 2), confirmed the formation of the groups observed in the principal components analysis. In this dendrogram, hierarchical grouping was used to establish similarities among the different types of bacteria; the name of the microorganism corresponding to each strain number is shown in Table 1. For both clusters and biplots, the smallest distances represent the greatest similarities, showing that the bacteria in the same group were strongly correlated. In general, it can be seen that the principal components analysis and the cluster analysis allowed for the grouping of microorganisms of similar species based on their growth rates, allowing for an evaluation of the results from a different aspect. It is important to observe that a total of 13 strains of Escherichia coli, 2 strains of Yersinia enterocolitica, 9 species of the genus Salmonella and 2 of the genus Shigella were tested. In addition to these, the following 4 microorganisms were also tested: Enterobacter cloacae, Klebsiella pneumoniae, Proteus vulgaris and Serratia liquefaciens. The first group (I) included three bacterial genera in which only one species was tested: Enterobacter, Klebsiella and Proteus. In addition to correlating amongst themselves, E. coli and Salmonella species were also present. In the third group, in addition to the presence of E. coli, Salmonella and Serratia, two strains of Y. enterocolitica and two species of Shigella were also present.
It could be said that there were similarities between the growth rates of bacteria from the same genus, even though some of the E. coli strains and different Salmonella species evaluated, which corresponded to a great number of microorganisms, were distributed amongst the groups, showing stronger or weaker correlations according to the strain. This result is satisfactory since it is to be expected that microorganisms from the same genus behave similarly due to their similar metabolisms. Nevertheless, differences exist, since each microorganism has a characteristic mean doubling time, related to its specific growth rate and the limitations of the medium (CASABLANCAS; SANTÍN, 1998).
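For reference, the Pearson's correlation coefficient discussed above can be computed as in the following sketch. The function name `pearson_r` and the numbers are hypothetical and only illustrate the calculation; they are not values from Table 2.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Made-up illustrative values (NOT measured growth rates).
rates_a = [1.0, 2.0, 3.0, 4.0]
rates_b = [2.0, 4.0, 6.0, 8.0]   # rises with rates_a
rates_c = [8.0, 6.0, 4.0, 2.0]   # falls as rates_a rises
r_same = pearson_r(rates_a, rates_b)
r_opposite = pearson_r(rates_a, rates_c)
print(r_same, r_opposite)  # 1.0 -1.0
```

A coefficient close to 1, such as the 0.97 reported between Raftilose P95 and Profeed P95, indicates that strains growing fast on one substrate also tend to grow fast on the other.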
The use of the principal components analysis as a tool to group microorganisms according to their growth rates can be clearly seen in the groups formed, starting with group IV, where the bacteria showing the highest growth rates can be found. The bacteria in group I showed the second highest growth rates, followed by those in group II, with the exception of S. heidelberg E-9, whose growth rate was closer to those of group I.
It is known that FOS are compounds that are not digested by the human organism and thus arrive intact in the colon, where they are mainly degraded by the bifidobacteria present, thus favoring their development (MOLIS et al., 1996; SCHNEEMAN, 1999). These organisms are beneficial since they compete with pathogenic bacteria. Hartemink, Van Laere and Rombouts (1997) showed that the enterobacteria grew in the media supplemented with FOS, although it is important to note that for the majority of the bacteria, the growth rate was lower in FOS than in glucose alone.
The eigenvector data are an important tool for better evaluating the relationship between the carbohydrates and the bacteria used in the study. The eigenvectors give the weights used to form the equations that compute the new variables (PC1 and PC2) (SHARMA, 1996). The first principal component (PC1) is the most important component, because it corresponds to 78.56% of the total variance of the data. The equation for PC1 is a weighted sum of the growth rates in the different media. In this equation, the growth rates of the enterobacteria in the presence of Raftilose P95 and Profeed P95 account for a substantial portion (30 and 28%, respectively) of the total variance. Thus, we can assume that a single principal component is the most important for measuring the growth rate (78.56%).
The equation indicates that the PC1 value, as a weighted sum of the growth rates, is strongly affected by the presence of Raftilose P95 (0.587) and Profeed P95 (0.564). Thus, the PC1 values suggest that E. coli 139 had the highest growth rate and, on the other hand, Shigella dysenteriae E-29 had the lowest growth rate.
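The general form of such a weighted sum can be written as below. This is only a sketch: it assumes that the PCA was run on standardized variables, and the symbols z, w and p are introduced here for illustration. Only the two weights quoted in the text are shown; the remaining weights are not given here and are therefore left symbolic.

```latex
\mathrm{PC1}_i = \sum_{j=1}^{p} w_j \, z_{ij},
\qquad
w_{\text{Raftilose P95}} = 0.587, \quad
w_{\text{Profeed P95}} = 0.564
```

Here z_{ij} denotes the (standardized) growth rate of microorganism i in medium j, w_j is the element of the first eigenvector for medium j, and p is the number of media included in the analysis.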

Acidification
Table 3 shows the Pearson's correlation coefficients between the final pH values after 24 hours of incubation with the microorganisms. It can be seen that the strongest correlation was found between Profeed P95 and glucose (0.75). This strong correlation could be due to the fact that Profeed P95 is composed of one unit of glucose and two, three or four units of fructose, that is, a relatively high proportion of glucose in relation to fructose. On the other hand, Raftilose P95 only has glucose units bonded at the extremities of some chains, and Raftiline LS has glucose units linked at all the extremities but with a polymerization degree of 2 to 60, that is, with large chains that reduce the proportion of glucose in the medium. From the principal components analysis (Figure 3A) of the results obtained for the pH values, it can be seen that 36.35% of the variability in the data could be explained by principal component 1 (PC1), which was characterized by a correlation between Profeed P95 and glucose, whereas the second principal component (PC2), which described 27.35% of the variability, was high when the medium was not supplemented. Thus the behavior observed for the Pearson's correlation coefficients was repeated in the PCA. With respect to the correlation between the microorganisms for the final pH values of the media, the PCA allowed for the division of the microorganisms (Figure 3B) into 5 groups (I, II, III, IV and V) according to the correlation between them, where the microorganisms closer to each other were more strongly correlated, as shown previously.

As can be seen in the dendrogram (Figure 4), the same groups could be formed with the cluster analysis. The name of the microorganism corresponding to each strain number is shown in Table 1.

In contrast to what was observed for the growth rates, various microorganisms of the same genus and/or species presented poor correlation. This difference could be due to the fact mentioned previously, that the formation of products by some of the microorganisms was not associated with cell growth (BORZANI; LIMA; AQUARONE, 1975; CASABLANCAS; SANTÍN, 1998). According to Borzani, Lima and Aquarone (1975), there are cases showing microbial growth during the first 24 hours, with rapid assimilation of the carbon source, followed by a period showing little growth but a high production of metabolites. Rossi et al. (2005), in their studies on the fermentation of fructooligosaccharides and inulin by bifidobacteria, noted that there was no correlation among the species, the origin and the ability to ferment inulin. Moreover, they observed that the length of the carbohydrate chain influences the physiological responses. Through principal component analysis, it was observed that there was no strong correlation between the enterobacteria species in the fermentation of the carbohydrates tested.

Figure 1 - Loading plot of PC1-PC2 for the substrates for the bacterial growth rates (A), and score plot and groups (I, II, III and IV) created by principal component analysis, taking into account the correlation observed between the microorganisms evaluated in relation to their growth rates in media containing different supplements (B).

Figure 2 - Dendrogram of the clusters of microorganisms according to their growth rates in culture media supplemented with different carbohydrates.

Figure 3 - Loading plot of PC1-PC2 for the substrates for the final pH value after 24 hours of incubation with the bacteria (A), and score plot and groups (I, II, III, IV and V) created by principal component analysis of the microorganisms studied for the final pH of the media after 24 hours of incubation (B).

Figure 4 - Dendrogram of the clusters of microorganisms according to the final pH value of the media after 24 hours of incubation with the different supplements added.

Table 2 - Pearson's correlation coefficients between the substrates for the growth rates of the bacteria.

Table 3 - Pearson's correlation coefficients between the substrates for the final pH values after 24 hours of incubation with the bacteria.