Frustration in the Pattern Formation of Polysyllabic Words

A novel frustrated system is presented for the analysis of (m + 1)-syllabled vocal sounds in languages with an m-vowel system, where the number of vowel varieties is assumed to be m (m ≥ 2). The necessary and sufficient condition for observing the sound frustration is that the configuration of m vowels in an m-syllabled word has a preference for the “repulsive” type, in which no vowel is duplicated. For languages that meet this requirement, no (m + 1)-syllabled word can in principle select this type, because at most m different vowels are available and consequently the duplicated use of an identical vowel is inevitable. For languages showing a preference for the “attractive” type, where an identical vowel aggregates in a word, no such conflict arises. In this paper, we first elucidate for Arabic with m = 3 how the conflicting situation is dealt with, employing a statistical approach based on chi-square testing. In addition to the conventional three-vowel system, analyses are also made for Russian, where a polysyllabic word contains both a stressed and an indeterminate vowel. The statistical analyses show that the selection scheme for quadrisyllabic configurations depends strongly on the parts of speech as well as on the gender of nouns. In order to emphasize the relevance to the sound model of binary oppositions, analyzed results for Greek verbs are also given.


INTRODUCTION
Suppose a language with a three-vowel system in which the string of two vowels in a disyllabic word has a strong preference for the AB-type configuration (assuming that vowels A and B are different from each other). For this language, it can be expected that the string of a trisyllabic word has a preference for the ABC-type configuration (assuming C different from A and B) over the other four patterns, AAB, ABA, ABB, and AAA. In a two-vowel system, however, a conflict arises in determining the most probable vowel configuration, because the selection of ABC is in principle impossible. The situation becomes more complicated for quadrisyllabic words of a three-vowel language bearing a strong preference for both the AB and the ABC configurations, because the pattern ABCD (assuming D different from the other three) cannot be realized, and consequently a compromise among the other 14 combinations is necessary. For instance, for a language with the vowels /u/, /a/, and /i/, six choices {ua, ui, au, ai, iu, ia} are possible for AB and six {uai, uia, aui, aiu, iua, iau} for ABC. For the quadrisyllabic words, however, one has to find a compromise among ABAC, ABCA, and ABCB. It should be noted here that, according to Shannon's theory of information transmission [1], sound configurations such as AB, ABC, and ABCD would be preferable for reducing redundancy. As a matter of fact, such a conflicting situation had possibly been noted previously in the analysis of spin glasses (for details see Appendix A in Supplementary Material); it is said that since 1977 this situation, which is due to the competing interactions among spins, has been called frustration [2], in an exceptional borrowing from either psychology or psychiatry, where comparable situations such as Bateson's theory of the double bind and the prisoner's dilemma in game theory were presented. Spin-glass-like systems have also been mentioned in cross-disciplinary areas of physics. In elucidating
mechanisms that govern the Belousov-Zhabotinsky reaction, three coupled chemical oscillators were shown to synchronize at phase intervals of 2π/3 rad (triphase synchronization), as a result of a compromise that minimizes the effects of the frustration [3]. In nonlinear optics, numerical simulations predicted that an intense laser beam propagating along a three-layered waveguide could show chaotic trajectories, as if the guided optical wave were frustrated [4,5]. Only recently has optical diffraction by a frustrated system been solved numerically for the triangular Ising antiferromagnet, a disordered lattice system consisting of two kinds of scatterers and exhibiting geometric frustration [6]. In biophysics, in an attempt to construct a hexagonal lattice of repressing genes, so-called dynamical frustrated states have been found to appear, where the temporal evolution is chaotic even if there is no built-in frustration [7]. In mathematical ecology, experimental results have been given for a three-frog system, in which a third frog joined a pair calling out of phase [8]. Specifically, both the triphase and the 1:2 antiphase synchronization, including switching between the two states, have been reported. Here one might expect that a conflict similar to that observed for amphibians could also be found for other animals that communicate with each other by means of sounds. Naturally, such an inference should be valid for human beings as well. More recently, the importance of the phenomenon has been reviewed to illustrate how frustration is a fundamental concept in relating function to structural biology [9]. Besides the above research areas, the role of the phenomenon termed link frustration has been emphasized in detailed analyses of phase-repulsive complex networks of oscillators interacting repressively [10-12]. Aside from these works concerning frustration in complex networks, pioneering attempts have been made to seek a point of contact between
statistical physics and linguistics, including applications and/or testing of the statistical laws that were previously predicted by Heaps and Zipf [13-15] as well as graph-theoretical approaches to analyzing linguistic phenomena [16-18]. In parallel with these, continual efforts have been made to establish methodologies for approaching phonological complexity quantitatively [19-22]. Of these, the calculation of cohesion and stability for phonological inventories in the UCLA Phonological Segment Inventory Database (UPSID), based on the notion of attraction and repulsion between phonemes, is of particular interest [19].
In this paper, a novel frustrated system is presented for the analysis of (m + 1)-syllabled vocal sounds in languages with an m-vowel system, where the number of vowel varieties, m, is an integer larger than unity. The necessary and sufficient condition for predicting the sound frustration is that the configuration of m vowels in an m-syllabled word has a preference for the "repulsive" type, in which no vowel is duplicated. For languages that meet this requirement, (m + 1)-syllabled words cannot adopt this type, because at most m different vowels are available and in consequence the duplicated use of an identical vowel is inevitable. Here one vowel per syllable is assumed. For languages showing a preference for the "attractive" type, a specific vowel aggregates in a word, and consequently no frustration arises. Such an arrangement of vowels is usually found in non-Indo-European languages showing vowel harmony, such as Mongolian, Turkish, Hungarian, Finnish, Telugu, and Ainu. Moreover, frustration would be of little relevance to languages such as Japanese [23], Indonesian, Malayan, and Polynesian, in which similar syllables are frequently reduplicated. Here one may notice that this method of constructing polysyllabic words is avoided in European languages, possibly because the reduplication reminds the speakers of infantile talk [24,25]. For instance, in English there are disyllabic words such as "ticktack" and "zigzag," neither of which became "ticktick" or "zigzig." (In Japanese these are realized as, respectively, kati-kati and kune-kune.) Although reduplication enhances redundancy, from the perspective of achieving reliable transmission of information this scheme might not necessarily be regarded as disadvantageous. In what follows, how the conflicting situation is coped with is analyzed for Arabic with m = 3, using a statistical approach based on chi-square testing. For systems under the sound frustration, the most frequent vowel configuration cannot be predicted accurately in advance. In addition to the conventional three-vowel system, analyses are made for Russian, where a polysyllabic word contains both a stressed and an indeterminate vowel. Finally, in order to explore the relevance to the phonological theory based on binary oppositions [26], analyzed results for Greek verbs are also given. Although the minimal unit of meaning is not the word but the morpheme, below we shall concentrate on the former.
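The pattern counts used in this paper (2, 5, and 15 configurations for two, three, and four syllables, with ABCD unavailable when only three vowels exist) can be checked by brute-force enumeration. The following is a minimal sketch; the helper names `pattern` and `patterns` are illustrative and do not come from the paper.

```python
from itertools import product

def pattern(seq):
    """Relabel a sequence by order of first appearance, e.g. 'uai' -> 'ABC'."""
    labels = {}
    out = []
    for item in seq:
        if item not in labels:
            labels[item] = chr(ord("A") + len(labels))
        out.append(labels[item])
    return "".join(out)

def patterns(n_syllables, n_vowels):
    """All distinct patterns realizable with n_vowels vowel kinds in n_syllables slots."""
    found = {pattern(s) for s in product(range(n_vowels), repeat=n_syllables)}
    return sorted(found)

# Columns: syllables, patterns with as many vowels as syllables, patterns with 3 vowels
for n in (2, 3, 4):
    print(n, len(patterns(n, n)), len(patterns(n, 3)))
```

With three vowels the quadrisyllabic case yields 14 patterns rather than 15, because ABCD cannot be formed; this is the situation described above as frustration.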

ANALYSIS AND DISCUSSION
According to a large-scale survey based on linguistic typology, the number of vowel varieties in the world's languages ranges from 3 to 14 [27]. Among these, the most typical is the five-vowel system, which accounts for 32% [28]. For instance, major languages such as Spanish, Japanese, Swahili, Hausa, and Greek belong to this category [29]. However, to claim the possibility of frustration for six-syllabled words, one must verify that the "repulsive"-type configurations are selected in the entire word system, including the two-, three-, four-, and five-syllabled words. Note that, except for technical terms, the vocabularies of five- and six-syllabled words in common use are very limited, which makes any statistical approach impossible. This difficulty accompanying the most popular five-vowel system could be overcome by choosing a sample from the few minor languages with a two-vowel system. Unfortunately, if there were any in the past, all of them are nowadays either endangered or dead, so that their corpora are unavailable. In addition, the classification of Kabardian and Abaza as two-vowel languages was judged to be questionable [27,30,31]. For this reason, below we focus our attention on Arabic, which is known as a major language with only three kinds of vowels, /u/, /a/, and /i/ [29], where the distinction between short and long vowels is disregarded. The three-vowel system would be responsible for establishing frustration in the quadrisyllabic sound system, provided that the "repulsive" configuration is confirmed to be selected both for the disyllabic and for the trisyllabic words. From surveyed results of words in corpora it has been found statistically that among the four parts of speech, namely nouns, verbs, adjectives, and adverbs, only adjectives bear the "repulsive" property; the results for disyllabic and trisyllabic words are shown in Tables 1, 2, respectively. Here results based on two corpora [32,33] are compared, together with the
expected values, which are obtainable analytically from the blending among the three kinds of vowels. First, we find from Table 1 that, irrespective of the size, N, of sampled words, the distribution of the two configurations, which are denoted by AB for the "repulsive" vowel arrangement and by AA for the "attractive" counterpart, is highly distorted toward AB. To examine the statistical significance of the divergence from the expectation, we employ hypothesis testing by means of the chi-square statistic [34], $\chi^2 = \sum_{i=1}^{j} (f_i - F_i)^2 / F_i$, where f_i and F_i, respectively, are the surveyed and the expected values; the summation extends over i = 1 to i = j, with j = 2, 5, and 15, respectively, for disyllabic, trisyllabic, and quadrisyllabic words. The corresponding chi-square values are attached to the foot of each table, where α in parentheses indicates the significance level of the statistical test. For the two-dimensional correlation analysis (Table 1) the expected value for AB can be calculated from k, m, and n, which respectively indicate the frequencies of /u/, /a/, and /i/. For the three-dimensional counterpart (Table 2) these values can be obtained with the formulae that were derived for the analysis of rhyming pattern selection in haiku [35]. Subsequently we focus our attention on Table 2, which gives the results for the trisyllabic adjectives. In contrast to the two patterns shown in Table 1, there are five choices. Of these, ABC (AAA) corresponds to the "repulsive" ("attractive") type, while the remaining three, {AAB, ABA, ABB}, can be regarded as intermediate between the two extremes. It is evident from Table 2 that, independently of the corpus size N, the distribution exhibits condensation in ABC, whereas rarefaction is seen in AAA. From the results of Tables 1, 2 it would be reasonable to conclude that the quadrisyllabic system of Arabic adjectives may be frustrated because of the strong tendency to prefer AB and ABC, as well as to avoid AA and
AAA.

TABLE 1 (footnote) | The strings AB and AA symbolize, e.g., ua and ii, respectively. The distributions of the three vowels are (u, a, i) = (22, 136, 104) for (1), and (u, a, i) = (155, 552, 419) for (2).
(2) χ² = 214.11 (α = 2 × 10⁻⁸).

TABLE 2 (footnote) | There are five choices possible, which can be grouped into three categories. Of them the first group {ABC} could be regarded as a ground state, whereas the second and the third ones correspond, respectively, to the first and the second excited states. Moreover, the first excited state consists of the three-fold degenerate states {AAB, ABA, ABB}. The meanings of the strings are as in Table 1. For instance, ABC symbolizes, e.g., uai. The distributions of the three vowels are (u, a, i) = (29, 70, 45) for (1), and (u, a, i) = (180, 377, 235) for (2).

The results for quadrisyllabic adjectives, which can be classified into the 15 patterns of vowel configurations, are given in Table 3. The formulae of the expected values, F_1 to F_15, are given in Appendix B in Supplementary Material. Note that, because in Arabic there are only three kinds of vowels available, the first pattern, #1:ABCD, is not possible. From Table 3 one can summarize the results as follows:
(1) There exist twin peaks for #3:ABAC (e.g., 'ajriyā'u "brave") and #14:ABBA (e.g., thuqalā'u "heavy"), together with twin local peaks for #2:AABC (e.g., jabābiru "gigantic") and #5:ABBC (e.g., muta'addid "many"). Here the concentration on the pattern #3:ABAC could be explained as a point of compromise that avoids placing an identical vowel in neighboring syllables. In other words, frustration is relieved by finding the compromise. However, the concentration on #14:ABBA probably has a deeper linguistic cause, which cannot be explained by this avoidance effect. In order to discuss the linguistic reason why the two configurations are frequent, two examples, both of which are most typical of Arabic adjectives, are given: barī'/'abriyā'u "innocent" → AB/#3:ABAC, bakhīl/bukhalā'u "stingy" → AB/#14:ABBA.
Here the words to the left and to the right of the slash are, respectively, the singular and its plural.
(2) Contrary to a prediction, there is a dip on #4: ABCA.
(3) The concentration on the configurations #2:AABC to #7:ABCC, which reduce to a cluster with the three different vowels {A, B, C}, amounts to 56%, in contrast to 43% for the other clusters, #8:AAAB to #14:ABBA, with two different vowels {A, B}.
(4) Comparison between the surveyed and the expected values indicates that the avoidance of #15:AAAA (the completely "attractive" type) is noteworthy.
(5) Comparing the dual components in pairs such as (#2:AABC, #7:ABCC), (#3:ABAC, #6:ABCB), and (#9:AABA, #10:ABAA), one would conclude that in quadrisyllabic adjectives of Arabic an identical vowel tends to cluster toward the beginning of the word. To the author's knowledge, which type of cohesion is selected seems to depend on the history of each individual language.
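The comparison between surveyed and expected frequencies used for Tables 1-3 can be sketched numerically. The sketch below rests on an assumption that is not taken from the paper: each syllable's vowel is drawn independently with probability proportional to its overall frequency. The formulae of Appendix B and of [35] may differ from this, and the function names are illustrative only.

```python
from itertools import product

def pattern(seq):
    """Relabel a sequence by order of first appearance, e.g. 'uai' -> 'ABC'."""
    labels = {}
    return "".join(labels.setdefault(x, chr(ord("A") + len(labels))) for x in seq)

def expected_frequencies(vowel_counts, n_syllables, n_words):
    """Expected count per pattern, ASSUMING independent draws of each vowel."""
    total = sum(vowel_counts.values())
    probs = {v: c / total for v, c in vowel_counts.items()}
    expected = {}
    for seq in product(probs, repeat=n_syllables):
        p = 1.0
        for v in seq:
            p *= probs[v]
        key = pattern(seq)
        expected[key] = expected.get(key, 0.0) + p * n_words
    return expected

def chi_square(observed, expected):
    """Pearson statistic: sum over patterns of (f_i - F_i)^2 / F_i."""
    return sum((observed.get(k, 0) - F) ** 2 / F for k, F in expected.items())

# Vowel totals quoted for corpus (1) of Table 1; 262 vowels correspond to 131 disyllabic words.
counts = {"u": 22, "a": 136, "i": 104}
expected = expected_frequencies(counts, n_syllables=2, n_words=131)

# HYPOTHETICAL surveyed counts, used only to exercise the function (not data from Table 1).
observed = {"AB": 100, "AA": 31}
print(expected)
print(chi_square(observed, expected))
```

Patterns that cannot be formed with the available vowels (such as ABCD with three vowels) simply receive no expected value under this enumeration and therefore do not enter the sum.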
For the present three pairs, examples are given as follows: #2:AABC mayāmīnu "lucky," #7:ABCC Kāthūlīkīy "Catholic," #3:ABAC 'azkiyā'u "pure," #6:ABCB 'ūlā'ika "those" a., #9:AABA 'ijtimā'īy "social," #10:ABAA 'Istarlīnīy "of English currency." It can be expected that if the vowels of a language can be divided sharply into fewer categories, frustration may occur in other languages as well. As a good example, here we consider Russian, the words of which are composed of three kinds of vowels. Namely, in addition to the accentual (/á/, /é/, /í/, /ó/, /ú/, and /i ;/) as well as the accent-free vowels (/a/, /e/, /o/, /I/, /i/, and /u/), they are often accompanied by an indeterminate vowel termed schwa /@/ [37]. First of all, it should be noted that for at least the four parts of speech (i.e., nouns, verbs, adjectives, and adverbs) all the disyllabic words of Russian bear the feature of the "repulsive" type denoted by AB, e.g., Bépa [v'ε;r@] "belief"; consequently, duplicated use of vowels belonging to the same category is not allowed in a word [37]. For this reason, in order to show frustration in the quadrisyllabic system, analysis is needed solely for the trisyllabic one. The results for the nouns are given in Table 4.
TABLE 3 (footnote) | …1, 2. For instance, ABAC signifies, e.g., aiau. The distribution of the three vowels is (u, a, i) = (139, 212, 97). Note that in contrast with the former two cases (Tables 1, 2) the dominance between the vowels does not obey the general rule in accordance with which /u/ is least frequent [36]. χ² = 203.50 (α = 10⁻¹⁰).

As in other Indo-European languages such as German, Romanian, and Bulgarian, Russian nouns have three genders, namely, the masculine, the neuter, and the feminine. We find from Table 4 that, irrespective of the gender, trisyllabic Russian nouns exhibit the "repulsive" type, having a preference for the ABC arrangement (e.g., парохóд [p@raxo; t] "steamship," óзеро [o;z'Ir@] "lake," фигýра [f 'Igu;r@] "figure"); this is in contrast to the strong avoidance of the AAA type, bearing a remote resemblance to Hund's rule in quantum
chemistry. Indeed, in Table 4 there is no word allotted to the latter. The results for the other parts of speech are shown in Table 5, where those of the verbs, the adjectives, and the adverbs are compared. Evidently, the features of the distributions are very similar to those observed in the nouns (Table 4); an exception is seen solely in the surveyed frequency of the adjectives, for which a surprising concentration on the pattern ABA (e.g., корóткий [karo;tk'Ij] "short"; богáтый [baga;tij] "rich") is seen. It should be stressed here that despite this abnormality the condition necessary for identifying the "repulsive" type, i.e., a preference for ABC (e.g., мировóй [m'Ir@vo;j] "world" a.) in addition to avoidance of AAA, is preserved, as in the other two parts of speech. From the results of Tables 4, 5 we could judge that the quadrisyllabic system of Russian vocables can be frustrated. On the basis of this judgment we finally consider quadrisyllabic words of this language. The results for the nouns of the three genders and for the other three parts of speech are given, respectively, in Tables 6, 7, from which we conclude as follows:
(1) Because the number of syllables in a quadrisyllabic word exceeds that of the vowel categories, there is no frequency on #1:ABCD, corresponding to the perfectly "repulsive" type (Tables 6, 7).
(2) Both for the neuter and for the feminine nouns, the surveyed distribution is extremely distorted toward the group with the three different vowels (i.e., #2:AABC to #7:ABCC); in particular, a strong concentration is seen on #3:ABAC [Table 6(b),(c)], indicating that the frustration is relieved by preferentially selecting this configuration. To explain why this pattern is most frequent, we give two examples typical of Russian: внимáние [vn' Ima;n'Ij@] "attention" for a neuter noun, истóрия [Isto;r'Ij@] "history" for a feminine noun. Evidently, both words consist of a sequence of an accent-free, an accentual, an accent-free, and an indeterminate
vowel, which places them in #3:ABAC.
(3) In striking contrast to those of the other two genders, the distribution of the masculine nouns becomes considerably broad; specifically, over #2:AABC to #10:ABAA its profile is found to be almost uniform. In other words, the symmetry concerning the pattern selection is not broken and is approximately maintained over the broad range [Table 6(a)]. Consequently, for the masculine nouns, the frustration is not relieved.
(4) In the distributions of the verbs and the adverbs, competition among #2:AABC, #3:ABAC, and #4:ABCA, which has occurred in the process of seeking a compromise, can be observed [Table 7(a),(c)]. In other words, both parts of speech in the quadrisyllabic system select the above three configurations from the "six-fold degenerate ground states" available. Therefore, one can conclude that for the two parts of speech the frustration is relieved incompletely among the three.
(5) Again, a distinctive feature is seen in the behavior of the adjectives. Namely, a marked aggregation is found nowhere in the vicinity of #3:ABAC but is seen on #9:AABA (e.g., нехорóший [n'Ixaro; ij] "not good") as well as on #10:ABAA (e.g., какóй-нибудь [kako;jn'Ibut'] "certain"), which together account for 62% [Table 7(b)]. This property appears to be closely analogous to the population inversion of electrons in an atom.
(6) In summary, how the sound frustration is dealt with varies substantially, depending both on the gender of nouns and on the parts of speech (Tables 6, 7).
To conclude, with regard to the degree of reducing the frustration, one could decide ranking as follows:

Finally, we consider other classifications of the vowels. According to the spectral analyses based on an acoustic-engineering approach along with the auditory impression, Jakobson et al. [26] insisted that the distinctive features of all the phonemes in the world's languages can be described with 12 binary oppositions. Their idea was established with the aid of X-ray tomography of a speaker's palate and was responsible for a variety of phonological theories represented by generative phonology [38]. Of the 12 oppositions, here we concentrate our attention on the following two cases: "grave vs. acute" (Opposition I) and "compact vs. diffuse" (Opposition II).

The distributions of the vowels are (v_1, v_2, v_3) = (45, 21, …

TABLE 8 (footnote) | The meanings of the strings AB and AA are as in Table 1. The distributions of the five vowels are (u, o, a, e, i) = (10, 236, 39, 56, 71). With Opposition I (grave vs. acute) these are divided into the three categories, /u, o/, /a/, and /e, i/, while with Opposition II (compact vs. diffuse) they are divided into /a/, /o, e/, and /u, i/ [26]. Note that for Opposition I the string, e.g., ui, is classified in AB, whereas for Opposition II the same string is in AA.

TABLE 10 (footnote) | The meanings of the strings #1-#15 are as in Table 3. The distributions of the vowels are (u, o, a, e, i) = (33, 674, 415, 314, 380). Note that for Opposition I the string, e.g., eoia, is classified in #3:ABAC, while for Opposition II the same string is in #2:AABC.
The results for disyllabic and trisyllabic verbs of Modern Greek [39] are shown in Tables 8, 9, respectively, where the first-person singular present forms of the verbs are used (note that in Modern Greek there is no indeterminate form). The vowel sounds of this language are listed in Appendix C in Supplementary Material. First we find that the qualitative features of the results in Tables 8, 9 are identical to those given for the Arabic adjectives in Tables 1, 2, respectively. Namely, they show the strong tendency to prefer AB and ABC, as well as to avoid AA and AAA. Therefore, it would be reasonable to conclude that the quadrisyllabic system of the Greek verbs could be frustrated. Subsequently, the results for quadrisyllabic verbs (first-person singular present forms) of the same language are given in Table 10. Note that because the five vowels of Modern Greek are reduced to three categories, the first pattern, #1:ABCD, is not possible. From Table 10 we find the following facts:
(1) In both oppositions the avoidance of #15:AAAA is noticeable.
(2) In Opposition I (grave vs. acute) the concentration on #2:AABC (e.g., αναγγέλλω "tell") as well as on #8:AAAB (e.g., αγανακτώ "get angry") is worthy of note, which is consistent with the preference for AAB in the trisyllabic counterpart [Table 9(1)].
(3) In Opposition II (compact vs. diffuse) the concentration on #7:ABCC (e.g., αγριεύω "become violent"), #10:ABAA (e.g., αισθάνομαι "feel"), #11:ABBB (e.g., αποτελώ "form"), and #12:AABB (e.g., απαγγέλνω "recite") can be seen, which is explainable by the preference for ABB in the trisyllabic counterpart [Table 9(2)].
(4) In both oppositions it is particularly interesting to compare the three competing configurations, i.e., #3:ABAC, #4:ABCA, and #6:ABCB. For Opposition I frustration is reduced by selecting #3:ABAC (68%), whereas for Opposition II it is reduced by selecting #4:ABCA (52%), which would appear to compete also with #2:AABC. Here the percentage in parentheses stands for the relative difference between the surveyed and the expected frequencies.
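How the same vowel string falls into different patterns under the two oppositions can be illustrated with a short sketch. The groupings follow the three categories stated in the text for Opposition I (/u, o/, /a/, /e, i/) and Opposition II (/a/, /o, e/, /u, i/); the numeric group labels and the function names are illustrative assumptions, not notation from the paper.

```python
# Vowel -> category index; only the grouping matters, the numbers are arbitrary labels.
OPPOSITION_I = {"u": 0, "o": 0, "a": 1, "e": 2, "i": 2}   # grave vs. acute
OPPOSITION_II = {"a": 0, "o": 1, "e": 1, "u": 2, "i": 2}  # compact vs. diffuse

def pattern(seq):
    """Relabel a sequence by order of first appearance, e.g. [2, 0, 2, 1] -> 'ABAC'."""
    labels = {}
    return "".join(labels.setdefault(x, chr(ord("A") + len(labels))) for x in seq)

def classify(vowels, grouping):
    """Map each vowel to its category, then reduce the category string to a pattern."""
    return pattern([grouping[v] for v in vowels])

print(classify("ui", OPPOSITION_I), classify("ui", OPPOSITION_II))      # AB AA
print(classify("eoia", OPPOSITION_I), classify("eoia", OPPOSITION_II))  # ABAC AABC
```

Both printed lines reproduce the examples given in the table footnotes: ui is AB under Opposition I but AA under Opposition II, and eoia is #3:ABAC under Opposition I but #2:AABC under Opposition II.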

CONCLUSION
With the use of a statistical approach it has been shown that sound frustration arises in quadrisyllabic vocables of Arabic, Russian, and Greek, and how the frustration is relieved has subsequently been discussed through comparison between the surveyed and the expected frequencies of the vowel configurations. It should be stressed again that the frustration presented in this paper would be more or less inevitable for languages that in general bear the minimum redundancy (notes added in Appendix D in Supplementary Material). The author believes that the results given here could provide a firm basis for finding frustrated systems in natural languages bearing more complicated vowel systems.
To conclude, emphasis should be laid on the fact that the three frustrated systems presented above are quite unusual in that they are both spatial and temporal, depending on whether words are written with letters or realized with a voice.

TABLE 1 | Frequency distribution of the vowel configurations for disyllabic adjectives of Arabic, where N indicates the size of the vocabulary corpus [32,33].