This article is part of the Research Topic

Music Cognition

Hypothesis and Theory ARTICLE

Front. Psychol., 30 March 2016

Evolution of Tonal Organization in Music Optimizes Neural Mechanisms in Symbolic Encoding of Perceptual Reality. Part-2: Ancient to Seventeenth Century

  • Braavo! Enterprises, Los Angeles, CA, USA

This paper reveals the way in which musical pitch works as a peculiar form of cognition that reflects upon the organization of the surrounding world as perceived by the majority of music users within a socio-cultural formation. Part-1 of this paper described the origin of tonal organization from verbal speech, its progress from indefinite to definite pitch, and the emergence of two main harmonic orders: heptatonic and pentatonic, each characterized by its own method of handling tension in both domains, tonal and social organization. Part-2, here, completes the line of historic development from Antiquity to the seventeenth century. Extensive archeological data are used to identify the perception of musical structures that distinguishes the temple/palace music of urban civilizations from the folk music of village cultures. The “mega-pitch-set” (MPS) organization is found to constitute the principal contribution of a math-based music theory to a new diatonic order. All ramifications for the psychology of music are discussed in detail. “Non-octave hypermode” is identified as a peculiar homogenous type of MPS, typical for plainchant. The origin of chromaticism is thoroughly examined as an earmark of “art-music” that opposes earlier forms of folk music. The role of aesthetic emotions in the formation of chromatic alteration is defined. The development of the chromatic system is traced throughout history, highlighting its modern implementation in “hemiolic modes.” The connection between tonal organization in music and spatial organization in pictorial art is established in the Baroque culture, and then tracked back to prehistoric times. Both are shown to present a form of abstraction of environmental topographic schemes, and music is proposed as the primary medium for its cultivation through the concept of pitch. The comparison of stages of tonal organization and typologies of musical texture is used to define the overall course of tonal evolution.
Tonal organization of pitch reflects the culture of thinking, adopted as a standard to optimize individual perception of reality within a social group in a way optimal for one's success, thereby setting the conventions of intellectual and emotional intelligence.


Part-1 of this paper presented the framework for study of tonal organization1 in any kind of music. Based on the available data from archeology, anthropology, ethnomusicology and psychoacoustics, the known forms of tonal organization were laid out on a timeline, where the cognitive constraints of perception of different musical typologies were used as criteria for deciding which form of organization came first. The pattern of acquisition of music skills during infancy was used to hypothesize the succession of stages in the separation of music from speech and the descent of definite pitch organization from the indefinite one. The existing types of indefinite-in-pitch music were analyzed to identify khasmatonal and ekmelic modes as specialized methods of processing indefinite pitch. Mechanisms of their evolution into oligotonal definite-pitch mode were defined. The principle of triadic induction was shown to determine the growth of oligotonal into mesotonal, and mesotonal into multitonal schemes. The resulting hemitonic heptatony and anhemitonic pentatony presented two alternative methods of organizing vertical and horizontal harmony. Each offers a dedicated style of handling tonal tension and reflects a more general style of worldview, based on the parallels between tonal tension and social tension. Commitment to heptatony or pentatony as the principal means of tonal organization within a culture, then, appears to generally correspond to the preferred lifestyle in a social group. This correspondence could be the product of abstraction of individual lifestyle preferences into the tonal schemata of a musical mode, and further mediation of the multitude of such modes within a social group, until the statistically prevailing mode establishes the model of tonal organization.

Part-2 continues drawing the lineage until the rise of Western tonality, identifying yet another avenue of musical representation of perceptual reality: vertical and horizontal tonal structures encoding the perceived relation of multiple objects in one's surroundings. The spatial organization of depicted images appears to share the same principles as the tonal organization of music in the same culture, probably originating in its environmental topography. Spatial-to-tonal correspondence is strongest in Western tonality, but is noticeable at earlier stages defined in Part-2: diatonic and chromatic mega-pitch-set (MPS) systems, and non-octave hypermode.

Genesis of Modal Family and the Role of Tetrachord

What separates prehistoric and historic forms of music is the emergence of math-based music theory and notation. Notation encourages production of complex compositions in observance of theoretical rules, and restrains discrepancies in reproduction of the same tune. Oral transmission of folk music, by contrast, employs variation as the leading music-making principle. Any information technology designed to enhance the retention of symbolic information should be regarded as a stimulus for the emergence of abstract thought (Couch, 1989).

Musical implementation of abstraction was the inference of modal family from a single mode. The model of it is documented in cuneiform notation of the Hurrian Hymns and related texts from Ugarit ca. 1400 BC. They reveal that Assyrian/Babylonian music was heptatonic, based on 7 modes2 named after a particular series of 5ths that were used to generate each of the modes (Kilmer and Tinney, 1996).

1. Audio: An arrangement of the Hurrian Hymn No. 6, Anne Kilmer's transcription (1974). Kilmer's dyadic interpretation (Kilmer, 1974) was criticized for a number of inconsistencies with the data from the recovered music theory texts (West, 1994).

2. Audio: the alternative transcription by West (1993), which also was criticized (Crocker, 1997).

3. Audio: the alternative transcription by Dumbrill (1998). Despite huge differences between the transcriptions of this single surviving sample of Babylonian music, together with the retrieved music theory texts, it provides substantial information on general principles of tonal organization.

Our “Mixolydian G” formed the base of Babylonian system3. Prioritization of Mixolydian mode is known in numerous Eurasian folk music systems (Belaiev, 1963). Mesopotamian music theory must have adopted it from folk tradition and “rasterized” it mathematically, adopting the tetrachord as the formative tool in modal genesis.

Eurasian instrument-makers have traditionally conceptualized ambitus through equivalence of 4th, which according to Beliayev (1990, p. 248) manifests “the first stage of maturity” in tonal organization—supporting professionalization of folk musical culture. Modal integrity of 4th was epitomized in the Pythagorean cult of Tetraktys (“quaternary”), originating in primitive cultures of the Bronze Age (Burkert, 1972, pp. 188–191), likely in Babylon (Barbera, 1984)4.

Tetraktys was the earliest rational conceptualization of spatial and tonal organization in a single scheme (Figure 1) of an equilateral triangle filled up with symmetrical rows of 1 to 4 dots. Each of these numbers encoded a geometric concept: 1 for point, 2 for line, 3 for surface, and 4 for tetrahedron, every one of which contained the one before it (Riedweg, 2005).


Figure 1. Tetraktys (the “fourness”): the geometric representation of harmonicity of “4th.” (A) Musical aspect of tetraktys. On the left, each row is assigned a value that expresses the ratio of the string required for production of a perfect musical interval. Numbers 1, 2, 3, and 4 follow one another sequentially. Each row contains intervals of the designated type. On the right, the tetraktys diagram is combined with the lambdoma matrix (Bruhn, 2005, p. 67) that was associated with Pythagorean teaching in Ancient Greece (Hero and Foulkrod, 1999). The lambdoma ratios are applied to the frequency value of A4 (440 Hz) taken as 1:1, producing A, D, and E, the immutable (hestote) tones that comprised the skeleton of the Ancient Greek musical system. Obviously, the harmonic aspect of tetraktys determined the tonal organization of Greek and probably earlier civilizations. (B) Spatial aspect and its melodic correspondence. Each row represented a parameter in spatial representation: 1 point indicated zero dimensions; 2 points defined a line, 1 dimension; 3 points defined a surface, 2 dimensions; 4 points defined a 3D geometric figure. Musically, this would be equivalent to distinguishing between an isolated pitch, a melodic step produced by succession of 2 pitches, a particular harmony produced by the relations of 3 pitches, and a gravitational frame of reference generated by 4 pitches, as is evident from the pitch values along any side of the tetraktys in (A). (C) Centripetal gravity encoded in tetraktys. The configuration of 10 points that comprise the tetraktys can represent 3D by implying 4 triangles. Then, their “corners” correspond to the same octave-equivalent pitch, indicating that the tetraktys is “tuned” to a certain tone (A).

Together, they represented world's order, where point symbolized unity, line—limit, surface—harmony, and tetrahedron—cosmos. This convention displayed amazing vitality throughout the ages, nourishing philosophy of Christian, Arabic, and Jewish traditions (McCartin, 2010). Since Ancients considered numbers “sounding”—following the paradigm of proportional shortening of the string length to attain different pitches, the idea of inclusiveness of a number was understood musically as well (Barbera, 1977): 4 and its integers expressed perfect consonant intervals (2:1 = octave, 3:2 = fifth, and 4:3 = fourth). Musical Tetraktys then served as the container of harmony: intervals that could be expressed in numbers not greater than 4 were considered “symphonia” (accord), whereas all other intervals, including 3rd and 6th, were considered “diaphonia” (discord) (Kholopov, 2006, p. 64). Tetraktys determined the assortment of pitches usable in music, which was also understood in cosmogonic terms. Plato, in Timaeus, described the derivation of music tuning as creation of the World-Soul—and the model for his calculations probably came from earlier Mesopotamian sources (Crickmore, 2009a).
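The arithmetic behind the Tetraktys rule can be made concrete with a short sketch. It is an illustration only, not a reconstruction of any ancient procedure: the function names are hypothetical, the cent is a modern unit, and the Pythagorean ratios 81:64 and 27:16 are assumed here as stand-ins for the 3rd and 6th. The check simply asks whether a ratio in lowest terms uses no number greater than 4, and the last lines apply the 4:3 and 3:2 ratios to a 440 Hz reference, as in Figure 1.

```python
import math
from fractions import Fraction

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(float(ratio))

def is_symphonia(ratio, limit=4):
    """True if the ratio, in lowest terms, uses no number greater than `limit`."""
    r = Fraction(ratio)
    return max(r.numerator, r.denominator) <= limit

# Ratios named in the text, plus two assumed Pythagorean ratios for 3rd and 6th.
intervals = {
    "octave": Fraction(2, 1),
    "fifth": Fraction(3, 2),
    "fourth": Fraction(4, 3),
    "3rd (assumed 81:64)": Fraction(81, 64),
    "6th (assumed 27:16)": Fraction(27, 16),
}

for name, r in intervals.items():
    kind = "symphonia" if is_symphonia(r) else "diaphonia"
    print(f"{name:22s} {str(r):>6s} {cents(r):8.1f} cents  {kind}")

# Fourth and fifth above a 440 Hz reference (A taken as 1:1).
reference_hz = 440
fourth_hz = reference_hz * Fraction(4, 3)
fifth_hz = reference_hz * Fraction(3, 2)
print(f"4:3 above A = {float(fourth_hz):.2f} Hz, 3:2 above A = {float(fifth_hz):.2f} Hz")
```

Under these assumptions the octave, fifth, and fourth pass the test, while the 3rd and 6th fall outside it, which matches the symphonia/diaphonia division described above.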

Byzantine, Arabic, Persian, Indian, Syriac, Armenian, Georgian, and Western European music theories—as well as many Eurasian folk traditions—share the tetrachord base. Sachs (1962, p. 163) notes near-omnipresence of 4th in world's music: absent only in Polynesia and Micronesia. Cultural proliferation and longevity of 4th indicate cognitive reasons for its prominence.

Ancient Greek theory may offer a hint for a possible explanation. Greeks conceived intervals in terms of stepwise singing male voices: hence, a 4th was a sum of 3 steps and an aggregate of the constituent intervals, 2nd and 3rd, each of which carried its own psychological features. Greeks must have been aware of the displacing tendency of a 2nd5. Therefore, perception of a melodic 2nd involved competition between two tones (a kind of duality), as opposed to the monadic unison. Going over the step produced a leap of a 3rd that contrasted with a 2nd by leaving a trace and thereby introducing a new dimension of vertical harmony. Whenever enclosed in a melodic 3rd, two 2nds did not create friction, making the 3rd indeed the expression of “harmony”. Increasing the leap by another step produced a 4th which contained a 2nd and a 3rd, two intervals of different valence, one implying duality (disharmonious) and the other, accordance (harmonious). Such a 4th encapsulated all basic tonal relations: sameness, otherness, and their synthesis, concordance.

This intervallic numerology could date back to Aurignacian culture, where the first cosmogonic concepts were forged in terms of solar/lunar, day/night, male/female dialectics, usually expressed in 2:1, 3:1, and 2:3 proportions (Frolov, 2003). Babylonian philosophy coordinated these proportions and expressed them mathematically as well as musically.

The integrating capacity of a 4th manifested itself in tetrachordal organization. This is still evident in maqam, where the II and III tones inside a tetrachord can shift in their tuning values as needed, whereas the marginal tones remain permanently locked (Zannos, 1990). The inclusive power of such a tetrachordal 4th is quite obvious to musicians, especially on a string instrument. And string instruments played a formative role in crystallization of the earliest epic Indo-European tradition. Four pitches that corresponded to melodic stresses of ancient Sanskrit and Greek bridged the Rig Veda to Homer, illuminating the presence of 4 PCs through the look of the 4-string lyre used to accompany epic singing (West, 1981). The idea of a tetrachord would then simply be the abstraction of a tunable lyre's strings with marginal tones locked by the 3:4 proportion.

Optimization of interval-tracking could explain the preference for tetrachordal organization: most individuals cannot track more than four simultaneously moving entities (Drew and Vogel, 2008). This becomes an issue in heptatonic music-making, where 4th often stands as a “collection” of 4 tones, each of which requires attention and memory. Storing auditory images for each pitch—as incremental representation by a lookahead feature of the brain's error-detection circuitry—occurs while singing a familiar melody in one's mind (Janata, 2012). Facility of quick arithmetic estimations (1+1+1+1 etc.), necessary for vocal coordination, could make 4th into an optimal melodic size “chunk.”

A 4th also provides the best compromise between melodic and harmonic consonance (see Part-1): vertical 4th fuses well, while horizontal 4th might not segregate in very fast tempo (Huron, 2001)6. Its closest competitor in size is 3rd, but 4th has a serious taxonomic advantage of being a perfect interval: for trained musicians, narrowing or widening a 4th by about 12 cents reduces its recognizability, making the listeners hear it as another interval, whereas for major 3rd the tolerance zone is 50% wider, about 18 cents, and for minor 3rd—25 cents (Burns and Ward, 1978). Furthermore, the tuning zone for 4th, enclosed between 3rd and 5th (that are harmonically contrasting to 4th) is substantially narrower than for the range occupied by 3rds (enclosed between the dissonant 4th and 2nd). According to Moran and Pratt (1926), 4th enjoys the lowest deviation rate amongst all intervals, with 13.5 cents average. This makes 4th a better tuning reference than 3rd, which in history of acoustics has been notorious for exorbitance of tuning standards and preferences (Barbour, 2004). Listeners' resolution of interval size is the highest for 4th (43 cents)—exceeding the 5th (50 cents)—and presenting an asymmetric bias towards the 3rd: major 3rd, extended 37% toward 4th, is still heard as a 3rd, whereas 4th allows for only 18% extension toward 3rd (Shackford, 1962).
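The tolerance figures quoted above can be compared with a small sketch. The three cent values are taken from the paragraph (attributed there to Burns and Ward, 1978); the function names, the dictionary, and the 440 Hz reference are assumptions made only for illustration.

```python
def detune(freq_hz, cents):
    """Frequency obtained by shifting freq_hz by a number of cents."""
    return freq_hz * 2 ** (cents / 1200)

# Tolerance zones in cents, as quoted in the text.
TOLERANCE_CENTS = {"perfect 4th": 12, "major 3rd": 18, "minor 3rd": 25}

def relative_width(interval, reference="perfect 4th"):
    """How many times wider the tolerance zone is than the reference zone."""
    return TOLERANCE_CENTS[interval] / TOLERANCE_CENTS[reference]

for name in TOLERANCE_CENTS:
    print(f"{name:12s} zone is {relative_width(name):.2f}x the zone of the 4th")

# Frequency band within which a 4:3 fourth above 440 Hz stays recognizable,
# assuming the 12-cent figure applies symmetrically in both directions.
upper_tone = 440 * 4 / 3
low, high = detune(upper_tone, -12), detune(upper_tone, +12)
print(f"{low:.2f} Hz .. {high:.2f} Hz around {upper_tone:.2f} Hz")
```

The ratio for the major 3rd comes out at 1.5, which is the “50% wider” zone stated in the text; the symmetric band around the upper tone is a simplification, since the cited studies also report asymmetric biases.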

Preference for 4th might have a developmental origin—there is evidence that mother-to-infant vocalizations during the first 2 years of life tend to tonally tune to the harmonic row of the same fundamental (85% of communication), where 4th is the only interval used outside the row, as an “infrafix” below the fundamental (Van Puyvelde et al., 2010). Such frame of reference would favor unison and 4th as the smallest size perfect consonances, comfortable for vocalization—and tetrachord exactly sets the unison and 4th as the structural axis for melody-making.

For Eurasian instrumental music, expansion of ambitus usually occurs by addition of an extra tetrachord above the initial one. Mixolydian tetrachord presents a choice par excellence because of its ease of tuning on the string and on the pipe (tone+tone+semitone)7, and uniformity of its conjunct reproduction: G-A-B-C+C-D-E-F (Figure 2A).


Figure 2. Genesis of Mixolydian polymodal family. Blue color marks stability, while yellow—instability. Brackets illustrate hierarchic grouping of degrees. Arrows indicate the tendency of tones to alternate as gravitational centers. Figures in circles rate relative gravitational values within the subset: negative indicate instability (−2 more unstable than −1), while positive—stability (+3 more stable than +1). Short black straight line shows the central axis for a mode formed by modal subsets within a mode. Curved arrowed lines indicate the shifts of gravity between the mutable anchors. Placement of this line between two tones indicates a disjunct connection. Placement under the middle of a tone indicates a conjunct connection between the subsets. The dashed black lines display the mega-set membership of the pitches. (A) Mixolydian conjunct heptatonic non-octave mode. The mode has 2 equivalent subsets with gravitational ambiguity in the center. The IV degree alternates from stable to unstable state, depending on the tetrachord in which it is melodically engaged. The hierarchy is limited to the binary distinction between stable degrees, where the lowest tone of the tetrachord receives greater weight. Modal mutability occurs between I and IV degrees. (B) Rotated Mixolydian disjunct heptatonic non-octave mode. This mode has two equivalent subsets with unambiguous gravitational map, where the lowest tone in the tetrachord remains the anchor. Although, its hierarchy is as elementary as in (A), this mode is closer to the uni-tonicity. Modal mutability here occurs between I and V degrees in the manner of “tonic”-“dominant.” Also equivalence of 4th produces nearly perfect match in tuning between I and VIII degrees. (C) Ionian octave equivalent diatonic mode and its three most common modal transpositions: Dorian (D), Phrygian (E), and Lydian (F). This mode features two inequivalent subsets with contrasting placement of unstable degrees. 
In the lower pentachord they alternate with stable degrees, whereas in the upper tetrachord they are entangled between the stable margins. Both subsets are conjunct through the V degree, which thereby receives greater modal prominence than the III degree. Also, the upper tetrachord is harmonically understood as the “inversion” of the basic pentachord in reference to the stable degrees. The mode is fully octave equivalent. The very same modal scheme is reproduced in every transposition (C, D, E, F). “Sameness” of the pitch values (i.e., D = D) across the sister-modes despite their difference in functionality for each of the sister-modes generates the MPS.

The second favorite is the Phrygian tetrachord (semitone+tone+tone)8, which is simply an intervallic inversion of Mixolydian. Disjunct addition of this tetrachord forms a Phrygian mode: E-F-G-A+B-C-D-E, which was probably a later development9.
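The two tetrachords and their two ways of joining can be sketched in a few lines. This is a simplified model under stated assumptions: it works only with the seven natural notes in equal-tempered semitone steps, and the helper names (`tetrachord`, `conjunct`, `disjunct`) are hypothetical labels introduced for the illustration.

```python
NOTE_NAMES = ["C", "D", "E", "F", "G", "A", "B"]
# Distance in semitones from each natural note up to the next one.
STEP_UP = {"C": 2, "D": 2, "E": 1, "F": 2, "G": 2, "A": 2, "B": 1}

def tetrachord(start):
    """Four consecutive natural notes from `start` and their step pattern."""
    i = NOTE_NAMES.index(start)
    notes = [NOTE_NAMES[(i + k) % 7] for k in range(4)]
    steps = [STEP_UP[n] for n in notes[:3]]
    return notes, steps

def conjunct(start):
    """Join two tetrachords that share a tone (top of lower = bottom of upper)."""
    lower, _ = tetrachord(start)
    upper, _ = tetrachord(lower[-1])
    return lower + upper[1:]

def disjunct(start):
    """Join two tetrachords separated by a step (no shared tone)."""
    lower, _ = tetrachord(start)
    next_note = NOTE_NAMES[(NOTE_NAMES.index(lower[-1]) + 1) % 7]
    upper, _ = tetrachord(next_note)
    return lower + upper

print(tetrachord("G"))   # Mixolydian tetrachord: tone + tone + semitone
print(tetrachord("E"))   # Phrygian tetrachord: semitone + tone + tone
print(conjunct("G"))     # G-A-B-C + C-D-E-F, seven tones, short of an octave
print(disjunct("E"))     # E-F-G-A + B-C-D-E, reaches the octave
```

In this model the Phrygian step pattern is the Mixolydian pattern read backwards, and the conjunct Mixolydian scale stops at F, a tone short of the octave, which is consistent with its description as a non-octave mode.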

Each of these tetrachords constitutes a characteristic 4-tone intonation. Perceptually, the progression of two whole-tones, followed by a semitone, designates a melodic vector: the Mixolydian tetrachord suggests ascent, whereas the Phrygian suggests descent. Both are found amongst the world's most widespread modes (Gill and Purves, 2009). Singing them creates an illusion of “resolution” toward the upper or lower 4th. A semitone is known to project directional ascending/descending melodic motion (Roederer, 2008, p. 184). Delviniotis et al. (2008) discovered that performers habitually increase the first interval length and proportionally decrease the last in ascending scales, while reversing this treatment in descending scales, which would emphasize the vector of melodic inertia.

Melodic tetrachord highlights the gravitational relations and suggests spatial concomitants.

• Pentatonic conjunction of trichords “rounds the corners” by avoiding sharp-sounding minor 2nds and projecting concordance;

• Heptatonic conjunction of Phrygian or Mixolydian tetrachords amplifies ascending or descending directionality of the resultant scale, connoting insistence and purposefulness.

Chronologically, induction of Mixolydian mode must have preceded octave equivalence. The Mixolydian conjunct mode is non-octave in its design (Beliayev, 1990, p. 281), and is characterized by alternation in gravity between the base tones of both tetrachords. The IV degree here tends to change from stable to unstable state, depending on the tetrachord in which it is melodically engaged. Like pentatony, this mode lacks gravitational hierarchy, but features greater tension, since its unstable tones tend to shift closer to a stable degree in expressive tuning, usually employed by performers (Morrison and Jánina, 2002).

According to Beliayev (1990, p. 288), conjunct Mixolydian produces a mode equivalent to Ionian via tetrachord rotation. This pseudo-Ionian remains non-octave, since its disjunct tetrachords make the upper C unstable whenever mutable10 G temporarily becomes “tonic.”

In practice of earlier oligotonal and mesotonal music, singers commonly used transposition-by-interval11.

4. Audio: Udasan Yryata, healing incantation. Occasionally, transposition was applied to a particular portion of a song as deliberate expressive means. Transposition of the oligotonal PS: from F-Ab-Bb to G-Bb-C (between the 2nd and 3rd strophes).

Heptatonic modes promoted transposition-by-degree12, introducing the pitch-class set (PCS) concept. The underlying idea of diatonicity originates from cultivation of string plucking instruments (Belaiev, 1963), which were cardinal for Mesopotamian civilization (Lawergren, 1997). The visibility of strings, each easily equated to a pitch class (PC), and the correspondence between string length and interval size make a mode obvious to players. The facility of producing a few tones simultaneously prototypes the observation of vertical intervals that emerge between vocals and instrumental accompaniment.

5. Audio: Maddoh, Pamir. The accompaniment on rubab (6-string lute) provides an example of vertical 2nds occasionally produced by plucking the adjacent strings.

This is exactly what Nippur music-instruction tablets specify: notation of vocal part with lyrics set against the pitches of the lyre (Colburn, 2009)—for the first time graphically exhibiting the dimension of musical texture.

Formulation of the Mega-Pitch-Set

The next development occurred when the triad induction (see Part-1) caused the lower tetrachord plus a tone above it to be re-conceptualized as a pentachord (258), forging a concept of melodic intonation of 5th as a modal unit13, and introducing a new hierarchic layer I-III-V into a mode. The pseudo-Ionian non-octave mode then transfigured into the Ionian octave mode, with the mutability I-V instead of I-IV. The new axis of I-V pioneered the “authentic” functionality, in light of which the older I-IV axis could be viewed as “plagal.” The novelty of the authentic relationship was that it typically supported a melodic development that would build a climax point and emphasize the prevalence of “tonic” at the end. Krohn et al. (2007) confirmed that the largest N1 component in the ERP corresponded to hearing the V degree of the major key14.

With pentachordal scheme in place, musicians begin reproducing a succession of the same tones from the II rather than the I degree—turning II into the new I degree—and filling up the upper end with an extra tone. Such transposition-by-degree creates a “sister” 7-tone mode, with identical pentachord hierarchy that shares the PS (C-D-E-F-G-A-B-C and D-E-F-G-A-B-C-D), uniting both modes into a single system. There is experimental evidence that listeners categorize such modes by ear despite their identical PCs (Rohrmeier and Widdess, 2012).
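The difference between the two kinds of transposition can be shown with a minimal sketch. It assumes twelve equal-tempered pitch classes with flat spellings, which is a modern convenience rather than a claim about historical tuning, and the function names are hypothetical.

```python
NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]

def transpose_by_interval(pitches, semitones):
    """Shift every pitch by the same interval; the pitch set changes."""
    return [NAMES[(NAMES.index(p) + semitones) % 12] for p in pitches]

def transpose_by_degree(scale, degrees):
    """Restart the same scale from another degree; the pitch set is kept."""
    k = degrees % len(scale)
    return scale[k:] + scale[:k]

def step_pattern(scale):
    """Semitone steps between successive degrees, closing back to the first."""
    pcs = [NAMES.index(p) for p in scale]
    return [(pcs[(i + 1) % len(pcs)] - pcs[i]) % 12 for i in range(len(pcs))]

# Transposition-by-interval: the oligotonal set of audio example 4, up a whole tone.
print(transpose_by_interval(["F", "Ab", "Bb"], 2))

# Transposition-by-degree: the mode on C restarted from its II degree.
mode_on_c = ["C", "D", "E", "F", "G", "A", "B"]
mode_on_d = transpose_by_degree(mode_on_c, 1)
print(mode_on_d, step_pattern(mode_on_d))
print(set(mode_on_c) == set(mode_on_d))
```

In the first case the interval pattern is preserved while the pitches change; in the second the pitches are preserved while the interval pattern rotates, which is what lets the two sister-modes share one pitch set.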

It is not an accident that the three closest Mixolydian transpositions (Dorian, Phrygian, and Lydian)15 top the interval set (IS) harmonicity list of the world's most popular heptatonic modes (Gill and Purves, 2009). Also, sister-modes “harmonize” the music repertoire by making all songs share the same interval classes (ICs). This “pan-harmonization” separates the partially octave-equivalent multitonal mode of “village” music from the completely octave-equivalent mode of the “palace” modal system. Their difference is manifested in the presence of the mega-pitch-set (MPS): a set of tones, legitimized as the building material for any musical composition by music theory.

The larger the set, the greater the harmonization, and therefore the greater the stretch of gravity, causing overall reduction in tension. The earliest Sumerian harps had 11-15 strings, which by the eighteenth century increased to 29 strings (Lawergren, 1997). The ambitus of music performed on such harps greatly exceeded that of the typical folk heptatonic music, easing tension, as befits meditation in the temple and eulogy in the palace. The “easing up effect” distinguishes MPS from earlier folk heptatonic forms.

The MPS mode loses some of the sovereignty of a stand-alone mode: it is no longer a container of characteristic intonations popular within a particular kind of music. The MPS mode has to share its degrees with other modes, evident when one mode immediately follows another mode (as in verse/chorus or song-dance)16. Perceptual “sameness” of degrees encourages the performer to strip the MPS mode of those intonations whose expressive tuning violates the tuning of a sister-mode. Eventually, all modes within a family turn out to be “averaged.” This can be seen in comparing Figure 3A from Part-1 to Figure 4C here: the hierarchy of stable degrees is the same, but the hierarchy of unstable degrees flattens in the MPS. There are only two gradations here: VII vs. II-IV-VI. In the folk heptatonic mode there were 4 gradations: least unstable IV, more unstable II, yet more subordinate VI, and leading VII. This hierarchy ends up reduced by one level by the demand to preserve the pitch values for all the member degrees across all sister modes17.


Figure 3. Different types of non-octave hypermode.27 Yellow color represents unstable degrees, while blue represents stable degrees. Brackets illustrate hierarchic grouping of the degrees. Arrows mark the tendency of the tones to alternate as gravitational centers. Figures in circles rate relative gravitational values within the subset. Short black straight lines mark the modal subsets within a hypermode: when a line is placed under the middle of a tone, this tone serves as an anchor for two subsets, generating the conjunct connection between them. (A) Three obykhodnyi hypermodes of the hexáechos system: major (G), minor (A), and diminished (B). This hypermode features nearly perfect subset equivalence. Hierarchical organization is present only for stable degrees, where the lowest pair (G-C) often generates a plagal inclination. The highest stable degree (Bb) is hardly ever used as finalis, which usually falls on the central anchors (C or F). Unstable degrees do not form any groups, as in pentatony. However, tension here is weaker than in pentatony because of deeper 3-level hierarchy of stable degrees. Position of a semitone marks the “leading tone.” Anchor points relate to each other by 4th, readily forming the 4th–chord “triads” that are treated like consonance. (B) Three Georgian tetrachordal hypermodes: major (C), minor (D), and diminished (E). Unstable tones in the center of the hypermode provide modal integrity by forming unstable dyads (E-G, G-Bb), entrapping the central tonic dyad (F-A). Sometimes this tonic dyad alternates in gravity with the upper tetrachord's anchors (Bb-D). Whenever this happens, Bb shifts from unstable to stable function. However, the hypermode remains centripetal due to the harmonic dissonance of the diminished octave E/Eb, which generates a melodic inertia toward the inward resolution. The lowest tetrachord executes a complementary plagal function. The utmost upper tetrachord usually keeps its highest tone unstable, leaning on its lowest tone.
Altogether, the tetrachords feature subset inequivalence: gravitational contrast between each other (unlike the trichord hypermode). Tonal mapping of the major hypermode (centripetal F) serves as a prototype for the minor (G) and diminished (A) versions.


Figure 4. Chromatic system according to Cleonides (Aristoxenian school), c.1st century BC. This system contains 11 subsets based on rotation of fixed diatonic tetrachords. Blue color marks the permanent degrees that were associated with stability. Yellow color marks the mutable degrees that required resolution. The tonal tension rating, marked in the circles, is my estimation based on Beliayev's method40. Mese (A3) is assigned greater stability in tetrachords with two stable tones whose “tonic”/“dominant” interaction is marked by an arrow. E3 and E4 are assigned greater stability than the marginal stable tones (B & A4). Of the two unstable tones, the upper one always involves alteration, whereas the lower one retains its “diatonic” state in the chromatic genus, suggesting subordination of the upper tone to the lower one. The black diagonal line marks diezeuxis, the break between the tetrachord meson (E3-A3) and tetrachord diezeugmenon (B3-E4). This break leaves no common tones between these tetrachords in the chromatic/enharmonic genera (B-C-Db-E vs. A-Bb-Cb-D). In order to enable a smooth melodic transition across diezeuxis, Cleonides reserves three pentachords at strategic points (A-B-C-Db-E, Gb-A-B-C-E, and F-Gb-A-B-C), which significantly complicates the Systema Teleion (which engages two upper tetrachords, B-E and E-A) as well as the Systema Metabolon (which, instead, terminates the MPS at D with the tetrachord synemmenon A-Bb-C-D). The lowest pentachord adds a stand-alone A2 in order to accomplish the A/E modal framework. The chart reflects the chromatic genus. The column on the right summarizes the pitch values for the chromatic as well as enharmonic genera of the unstable tones. The enharmonic genus can be told by the quarter-flat, three-quarter-flat, and double-flat alteration signs. The entire MPS reveals a centripetal tendency, where the greatest variability in pitch occurs next to the central Mese (A3).
The entire system is characterized by extreme gravitational diversity: altogether there are 6 types of distribution of gravity within a modal subset.

The earliest reliable sample of Ancient composition is Epitaph of Seikilos. Sustained in diatonic mode, it was likely composed in observation of the music theory of the day (Mathiesen, 1999, p. 150), exemplary of MPS melody.

6. Audio: Epitaph of Seikilos, 1st century AD. Ancient Greek Phrygian diatonic tonos (coincides with modern Dorian E). Unstable degrees are somewhat averaged and moderated in their attraction to stable degrees, as compared to the stand-alone folk heptatonic mode in the example below.

Its most obvious trait is non-formulaic structure. Diversity of Epitaph's intonations outweighs the only pattern present in the entire composition (line-endings 3-4). Abundance of directional shifts and over-degree-skips obscures anchoring.

7. Audio: Thracian Air. Modern Dorian E (Hypodorian) mode. Well-marked tonicity makes resolution of the unstable degrees clear. The phrase sampled in this example is continuously repeated throughout the recording of the entire song.

Nearly all surviving Ancient Greek music features an improvisatory style, even in choral pieces. In this, it contrasts with the overall formulaic aptitude prevalent in European folklore (Zemtsovsky, 1987), suggesting opposition of folk and palace/temple music in Antiquity (see Appendix 1 in Supplementary Material).

If a beauty-in-averageness effect (Winkielman et al., 2006) can make a folk tune, averaged by modifications of multiple musicians, appear attractive and “natural,” an authored tune can make an “artificial” and idiosyncratic impression. Likeability here is traded for originality. Certainly, the authored tune can also be orally disseminated. But practice of performance under supervision of a musical administrator in Ancient Mesopotamia (Michalowski, 2006) was not likely to provide enough freedom in variation for averaging effect to occur. Administrated music tends to turn into hard “rule.” And later Hellenic civilization made exact public reproduction of someone else's composition socially unprestigious.

Individual practice of following melodic rules sets in place hierarchic processing of pitch, where the “invented” contours are filled up with the standardized intervallic detail, establishing the modern tonal standard of pitch pattern processing (Stewart et al., 2008). Just like performers, listeners here need to know the tonal schemata before they face a particular music work. Melody processing in such music is driven by instant automatic response to the tonal progression conceived or auditioned, relating it to long-term memory (Brattico et al., 2006) for pitch-set class (PSC) and interval-set class (ISC). Pre-attentive response indicates that modal rules are optimized and hard-wired in the MPS, as opposed to earlier modal systems:

• In pre-MPS heptatony the standardized contours were filled with idiosyncratic intervallic detail;

• In MPS heptatony the idiosyncratic contours are filled with standardized intervallic detail.

Emergence of the concept of “key” in music theory reflects this advance. The term “key” is often used synonymously with “tonality,” which is inaccurate. Ancient Greek music used keys that did not constitute tonality. The modern notion of key implies presence of a fixed PCS subordinated to a single tone. In practice, key was brought to life by the necessity to retune string instruments before playing in a different mode (Kholopov, 2006, p. 73). Tuning always proceeds from a certain tone to which other tones are adjusted. Hence, one pitch is singled out from a PS and the entire PS is inferred from it. This is not exactly about stability, but rather priority materialized in audiation. Such “key” is not found in folk cultures (Kvitka, 1973, p. 25).

Tuning practice encourages a single key to incorporate multiple modes, to minimize retuning. This is where the complex interplay between “key” and “mode” begins (Solomon, 2000, p. 75). The convenience of switching immediately from one popular mode to another outweighs the importance of the key's integrity, legalizing certain alterations18. These alterations become “modal,” characterizing a certain mode. Their very presence testifies to the presence of key.

Over time, key earned its own ground, different from mode: Greeks distinguished between “modulation according to the scale,” and “modulation according to the key” (Hagel, 2009, p. 5). Their keys could be transposed like our keys, and were associated with “key signatures” (West, 1992, p. 179), but they incorporated neither the notion of tonic triad nor major/minor inclinations, nor did they imply vertical harmonic functionality (185).

Rigidity of key rules secured the processing speed, enabling the handling of larger stocks of data. Shulgi's introduction of “rigid music” set the foundation for the evolution of complexity in Western music, allowing music structures to convey more information about the perceptual reality as perceived by the creator of music.

An MPS mode becomes a member in the assortment of modes, whose knowledge is obligatory for a professional musician. He is supposed to choose the right mode appropriate to the occasion. Specialization of modes is promoted by ensemble performance and genre application19, and relies on professionalization20. Beliayev (1990, p. 296) underlines that professionalization of tradition necessarily involves development of multimodality. By the eighteenth century BC, there was already an internationally recognized system of accreditation of musicians in the courts and temples of the Near East, with clearly defined ranks, and frequent relocation and integration of musicians through conquest and gift-exchange (Franklin, 2007). A relief from as early as 2800 BC shows two lyres playing together (Krispijn, 2010). In order for harpists to stay “in tune” with one another's strings, they had to share the same understanding of a mode/key. Middle Assyrian tablet VAT 10101 (West, 1994, p. 170) presents a census of Akkadian love-songs, classified by modes (tunings)21.

It took about 2000 years for the heptatonic MPS tradition to settle before the Greeks established the status quo for the entire region. Already for Ancient Romans there was no alternative to Greek music: there are almost no traces left of original Etruscan music (Powley, 1996), which was overwhelmed by Greek influence (Landels, 2002, p. 182). Reliance on the ultimate music-making scale became an organic part of this influence (Winnington-Ingram, 2015, p. 50). This was a direct outcome of conceptualizing the ISC, and teaching the ear to center on different tones of the same PS.

The circle of 5ths was the instrument that equalized the MPS. Ernest Clements (1935) reserved the term “quintal” to refer to what I call MPS scales, as opposed to folk heptatony. The diatonic Mediterranean MPS is cross-culturally implemented in the system of 8 modes, produced by modal transposition from each of the degrees of the principal heptatonic mode, including its intervallic reproduction an octave higher, with the tonic placed in another tetrachord. Werner (1948) investigated such octoechos systems, tracking them to Mesopotamia at the beginning of the 1st millennium BC. He concluded that division in 8 modes was a melodic concomitant of the mathematically realized harmonic octave affinity assigning a dedicated mode to every degree within an octave, which originated not in musical but in cosmological and calendaric numerology.
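The modal transposition described above can be illustrated with a minimal sketch. It assumes that the modern diatonic step pattern (2-2-1-2-2-2-1 semitones on a 12-step grid) can stand in for the “principal heptatonic mode”; the function name `modal_rotations` and the semitone grid are illustrative assumptions, not a reconstruction of any historical octoechos.

```python
# Assumption: the principal heptatonic mode is approximated by the modern
# diatonic step pattern, measured in semitones on a 12-step grid.
DIATONIC_STEPS = [2, 2, 1, 2, 2, 2, 1]


def modal_rotations(steps, count):
    """Return `count` step patterns, each started one degree higher."""
    n = len(steps)
    return [[steps[(start + i) % n] for i in range(n)] for start in range(count)]


modes = modal_rotations(DIATONIC_STEPS, 8)

# Seven of the eight patterns are intervallically distinct; the eighth
# reproduces the first one, an octave higher.
distinct = {tuple(m) for m in modes}
print(len(distinct))         # 7
print(modes[7] == modes[0])  # True
```

Under these assumptions, the count of 8 modes arises from 7 distinct rotations plus the octave reproduction of the first.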

Pentatonic mode passed through a similar transformation in constructing the pentatonic system (Cook, 1995)22. The origin of the idea of pentatonic MPS must date back to the ninth century BC, when the tones of PCS obtained their standard pitch names (Kuttner, 1965). Around the fifth century music theory had in place the principles for reproduction of the “legitimate” pentatonic modes across the available tonal space. The similarity of Chinese hexagon circle of 5ths with Chaldean music theory is striking—most likely determined by their astronomic correspondences (Daniélou, 1995, p. 37). The Yang-Yin dialectics defined the anchor points on odd degrees (Yang) vs. “unstable” even degrees (Yin)23. Sixty pitches were standardized by the use of precisely manufactured bells—used as reference tones for tuning instruments (Falkenhausen, 1992). Music theory devised a nominal 12-tone system by inferring the whole-tone scale and then dividing the whole tones in halves24. Its main purpose was to absolutize the pitch values for use in all possible transpositions of the legitimate pentatonic modes (Bagley, 2005)—in effect, a mega-pentatonic PS.

The idea of concert pitch standard is a logical consequence of cultivation of MPS: the idea of maintaining the sameness of tones across the sister modes suggests adoption of some standard of reference. Especially in ensemble performance and ecclesiastic application, a particular pitch could be assigned to a specific supernatural power justifying its standardization. It is unlikely that the Chinese MPS was the only absolute one. Greeks tuned their lyres to the aulos, and designed their notation around fixed names of pitches (West, 1992, p. 273). Greek citharas incorporated tuned resonators which would ensure fixed pitches (Hagel, 2009, p. 69). Remarkably, the modern pitch standard (A4) closely corresponds to the reference tones that defined Ancient Greek MPS (A2-A4). It would be extremely interesting to find out if the phenomenon of perfect ear existed in antiquity, or if it is a byproduct of modern tonality (Steblin, 1987).

Pan-harmonization of music system can be seen as a means of resolving cognitive conflict. It is not by chance that the civilizations of Mesopotamia and Egypt, China, and India all embraced cosmogonic music theory about the same time they developed script systems. The rise of literacy25 and the analytical method of thinking26 were promoting awareness of complexity, contradictions, and imperfection of the state of things in the cultural environment. Analytic approach to text paved the road for rationalization of notions inherited from the traditional folk culture (Civil, 1994). The Mesopotamian education system trained students to grasp and put to use the meaning of texts (Michalowski, 2012).

Conflict of interests was a common motif in Sumerian and Akkadian literature, with plenty of vivid illustrations of invective and reproaching rhetoric (Foster, 1996, p. 220). In Akkadian literature, first-person speech often emphasized the state of cognitive dissonance. A very popular epistolary genre often presented complaints of unfair treatment (Vulliet, 2011). Even more conflicting were the genres of diatribe, where two persons competed in verbal attacks on each other, disputing before some deity (Hallo, 2010, p. 120). Verbal skills played a deciding role in forging and polishing counter-distinctive manner of thought, thereby amplifying awareness of cognitive dissonance. At its pinnacle was the rise of judicial rhetoric, which exposed conflicts of interest between different individuals, and rewarded better argumentation (Hallo, 2010, p. 126). Trials were held in public, and declaration of each of the parties was pivotal in influencing the court's weighing of the conflicting statements (Wilcke, 2007, p. 44).

Babylonian culture saw a marked increase in individualism (Foster, 2011), confrontation, and disorder, during the period nicknamed the “Dark Age” that hit the entire Mediterranean region around the twelfth century BC (Drews, 1995). The assumption of the state ideology that serving the king's interests serves everyone's interests failed to motivate the subjects to defend the state against external intrusions or internal plots. Cognitive dissonance should be put on the list of contributing factors in the inability of the Bronze Age palatial cultures to sustain resilience toward environmental and international stresses. The “barbaric” tribes, with more homogenous social structure and “cognitively consonant” music, would have had an advantage over Mesopotamian civilized societies, subdivided and weakened by contradictory interests of their social groups and their musics.

Codification of the MPS system should be viewed within this context of growing cognitive dissonance. Rational harmonization of the entire compass of all available music tones was not a deliberate political move in reaction to social pressures, but an elemental biological response. Inspired by correlative cosmologies, mathematically-based theories of music harmony catered to neurobiological need of the brain to reduce informational stress by employing a new strategy of organizing data and establishing ways for synthesis of new quality out of it (Farmer et al., 2000).

Non-Octave Hypermode

Ancient Greek Systema Metabolon set the theoretical foundation for yet another distinct method of tonal organization, found in Medieval Western Europe, Byzantium, Russia, Armenia, Georgia, Azerbaijan, and Bulgaria. The title hypermode (Pashinian, 1973) captures its principle of stitching multiple tetrachords or trichords into a single system, spanning well over an octave. The tonal integrity is achieved by taking small elementary subsets, each insufficient to determine the makeup of the entire melody, and uniformly conjoining them according to the “chain principle” (Sachs, 1960): addition of a twin-subset whenever melody runs over the margin of a subset. The expanded set is treated compositionally as a single entity, which is especially pronounced in a polyphonic setting.

At the heart of hypermode is the fixed registral contrast between marginal tetrachords/trichords. The PCs of each subset are permanently mounted in the overall ambitus, disallowing alterations. This music makes a fairly diatonic impression between adjacent subsets, while evoking “friction” between the remote ones, expressed in “false relation” of the octave-inequivalent tones. Equivalence of 4th (or 5th) binds the mode.

8. Audio: Ne oryol li s lebedem kupalisia, lyrical Cossack song, Southern Russia. B-C#-D-E-F#-G-A-B-C, false relation C#-C induces subtle increase in tension in the high register—in contrast to relaxation at low register.

The melody sustained in hypermode exhibits a peculiar “elastic” effect: as long as the phrases stay in the same registral position, they appear “casual,” but ascending induces tension, whereas descending induces relaxation. The entire melody contracts/expands like an elastic band through cycles of tension/relaxation. The greater the number of subsets, the greater the “elasticity.”

9. Audio: Mussorgsky—The Great Gate of Kiev, the 2nd theme. The 12-tone hypermode: G#-A#-B-C#-D#-E-F#-G#-A-B-C#-D, with 2 false relations D#-D and A#-A within 4 trichord subsets.
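The construction of the hypermode in the example above can be sketched in code. This is a minimal illustration under stated assumptions: the trichord shape (whole tone plus semitone) and the link of a perfect 4th (5 semitones) between subset starts are inferred from the tones listed in the caption, and the helper names `chain_hypermode` and `false_relations` are hypothetical. A false relation is detected here simply as two tones that share a letter name but differ in accidental.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chain_hypermode(start, subset_steps, link, subsets):
    """Stack identical subsets; each new subset starts `link` semitones
    above the start of the previous one (a reading of the chain principle)."""
    pitches = []
    for k in range(subsets):
        pitch = start + k * link
        pitches.append(pitch)
        for step in subset_steps:
            pitch += step
            pitches.append(pitch)
    return pitches


def false_relations(names):
    """Pairs of tones that share a letter name but differ in accidental."""
    pairs = []
    for i, low in enumerate(names):
        for high in names[i + 1:]:
            if low[0] == high[0] and low != high:
                pairs.append((low, high))
    return pairs


# Assumed reading: 4 trichords (whole tone + semitone), linked at a 4th,
# starting from G# (semitone 8 above C).
pitches = chain_hypermode(start=8, subset_steps=[2, 1], link=5, subsets=4)
names = [NOTE_NAMES[p % 12] for p in pitches]
print("-".join(names))         # G#-A#-B-C#-D#-E-F#-G#-A-B-C#-D
print(false_relations(names))  # [('A#', 'A'), ('D#', 'D')]
```

With these inputs the sketch reproduces the 12 tones and the two false relations named in the caption; other hypermodes would need their own subset shape and link interval.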

In a few equintervallic diatonic subsets, elasticity is minimal: e.g., the Byzantine hexáechos28 (Figure 3A) is very close to diatonic MPS.

10. Audio: The Little Entrance “Come, Let Us Worship” [Priidite, poklonimsia], 2-part Znamennyi chant, based on Byzantine hypermodal system. Minimal tonal tension from ascending motion through 3 equintervallic trichords.

Larger size subsets, such as tetrachordal and pentachordal, common in Georgian traditional music, increase functionality of PCs, inducing substantially greater instability—which is handled by more elaborate hierarchic organization.

11. Audio: Kakhuri nana. Lullaby. Georgian tetrachordal hypermode (Figure 3B) with the characteristic diminished octave G#/G (Gogotishvili, 2010). The unstable functionality prevails over the stable one.

Non-octave hypermodes presented a window for the expression of strictly controlled amounts of tension (see Appendix 2 in Supplementary Material for details) that were compartmentalized in different registers. The resulting opposition to the “natural” (for speech and animal vocal communication) association of high register with submissiveness and of low register with aggression (Ohala, 2006) marks the contribution of hypermode to the establishment of specialized musical tonal semantics, in contrast to verbal tonal semantics.

Yet another historic landmark was the divergence of hypermode from the chromatic system by providing a diatonic-based alternative to the chromatic expandability by alteration/modulation (see below). A noticeable affiliation of hypermodal organization with the Christian plainchant, which subsequently shaped the folk music of many Eastern Orthodox nations and ethnicities, expressed rejection of the cultural heritage of the Greco-Roman philosophy of music and an attempt to restore the older Sumero-Babylonian cosmology on new theological ground (see Appendix 3).

Alteration and Modulation

Unlike the hypermode, the diatonic MPS did not require degrees to sustain their pitch values throughout the music work. The need to temporarily increase tension was handled by alteration and modulation. The term “alteration” refers to raising or lowering of a degree in a PCS, involving modification of the IS. When this happens, listeners familiar with this PCS are surprised by its deviation from the norm. The impulse to restore the familiar IS is responsible for the momentary increase in tension associated with the alteration, when the listener experiences intense expectation for it to comply with the norm (Margulis, 2005).

Alteration is a form of cognitive dissonance. Formulation of Systema Metabolon (“the modulating system”) concurred with the formation of the discipline of dialectics in Ancient Greece (Losev, 2000, pp. 601–35), and with the growth of public interest in it (i.e., rhetorics, sophisms; Laertius, 1958, p. 137). As people realized the limitations of words in reference to real objects, the dialectic method of defining opposites began to make an imprint at first on the manner of conducting scholarly research and legal matters, then on the discipline of rhetoric in general, and finally on tonal organization. The primary function of music to harmonize was understood through opposition of tension and relaxation, “united by disunion” (Plato, 2012, p. 13).

Neither unfixed ekmelic degrees, nor expressively tuned multitonal degrees of pre-MPS musics involved cognitive dissonance. Rather, they constituted exaggeration of intonation in pitch—what Cazden (1971) termed “modal inflection.” The principal difference is that “chromatic” alteration implies production of two colors and cognitive conflict, whereas “inflection” implies saturating a single color and no cognitive conflict.

12. Audio: Shelkovoya travushka, Nekrasov Cossacks. The IV degree here exists in three flavors (normal, sharpened, and flattened), marking the opening of each strophe with a tonal “blot” (see the frequency analysis in the Demonstration-4 in Part-1).

Modal inflections are modally normative: justified by the permanence of melodic rule.

13. Audio: Alilo, Georgian ritual Christmas song. The melodic rule: in the middle voice, every time B goes to C#, it sharpens, but every time it descends to A# in the opening of every strophe—it stays natural.

Alteration does not possess such permanence and logic. By its nature, it is accidental. Alteration splits the normative degree into a few versions within the same composition, calling for further “resolution”: two versions cannot both be “right,” one ought to be “wrong,” and therefore “corrected29.”

Alteration is relatively rare in oral traditions30, being reserved for technically advanced professional music with fully fledged music theory31.

14. Audio: Maddoh, Pamir. Improvisation on a ghazal by Hafiz. The stanza starts with the altered degree C#, creating a dissonance in relation to the accompaniment—and then resolves into B, restoring the initial non-altered mode: E-F#-G#-A-A#-B-C-D#.

Alteration should not be mistaken for progression of natural degrees in folk “microtonal” modes, where seemingly “chromatic” degrees are normative (Petrović, 1994). Such modes can contain their own micro-alterations.

15. Audio: Falak-I Badakshani, Pamir. Microtonal alterations of four “natural chromatic” degrees within the ambitus of F#4-A4, providing extra tension for a genre of funeral lamentation (Levin, 2007).

See Presentation-1: Alterations/Micro-alterations.

“Modulation,” the transition from one musical mode to another without a break, differs from alteration by violating gravity rather than PCS. Modulation has been theorized exclusively within the framework of Western music. Similar devices are known in other advanced music systems (Indian, Arabic, Chinese), although without receiving much attention in their music theory. Modulation in folk music presents a novel and controversial object of study.

The most common form of gravitational shift in folk music is intra-modal mutability.

16. Audio: Li Weri, a Senufo funeral, Côte d'Ivoire. Intra-modal mutability in pentatonic mode from C to Eb and finally to F.

Zemtsovsky (1998) calls this “pentatonic enharmonism”: the ability of PCs to be included in different trichords, where the same PC acts as an anchor in one trichord but remains unstable in another. Similar “enharmonism” is possible in hemitonic modes, usually involving membership of the same PC in two different tetrachords.

17. Audio: Nozanin-Shod-I Uforash, call-and-response sozonda (wedding), Bukhara. Intra-modulation a step up, from Eb to F, in a heptatonic mode.

Mutability of multitonal mode (see Part-1) restricts intra-modulation to only 2-3 anchor-tones, making gravitational shifts predictable and regular.

18. Audio: Ocarina solo, Bulgaria. Each sentence (provided sample) starts in A, in major inclination, but ends in F#, in minor inclination. Such A/F# alternation shapes the form of the entire composition, only by the end of it committing to a prolonged F#.

The MPS generalizes diatonic “enharmonism”: if folk mutability shifted gravity for a single tone, MPS modulation shifts the entire set—rebuilding it from any of the degrees.

Helladic music probably featured simple diatonic modulations (Franklin, 2002). Its original pitch set constituted an Olympic trichord E-F-A (West, 1992, p. 164). As time progressed, the set size grew, ultimately reaching an octave species and allowing for inter-tetrachordal enharmonism. Regardless of their size, all MPSs are treated in the same way: music users remember the normative sets, and upon detecting modulation, hypothesize a new set from what they already know (Raman and Dowling, 2012).

The entire PSs are alternated, even if, technically speaking, the PS degrees retain the same pitch values (as C-Ionian/A-Aeolian). In reality, their pitches are not exactly retained, since each PCS imposes its own expressive tuning: certain degrees are slightly sharpened or flattened, depending on their function in the PCS (Sundberg et al., 1995). The same tone B will be intoned sharper in Ionian C, and flatter in Aeolian A (Tchesnokov, 1961, p. 58). Although this adjustment is not as drastic as a single tone mutation in a folk multitonal mode, it nevertheless does occur32. In the polymodal system, the music user remembers modes by their IS, including their characteristic expressive tuning (Brattico et al., 2006). Absence of expressive tuning is perceived as faulty performance (Sundberg, 1982). Every time music modulates from mode to mode, the melodic ISC switches, causing reassignment of expressive tuning values all at once, as in switching from one tuning table to another. This is what the phenomenon of “harmonic modulation” practically entails.

Listeners take expressive tuning as a prompt in detecting the most stable (immutable tuning) and unstable (most mutable) degrees. They estimate modulation in terms of gradations in tension determined by the intervallic value of the modulation, i.e., the interval between the old and new tonics. Thus, modulations to subdominant (C-F) are perceived as “tenser” than modulations to dominant (C-G) (Korsakova-Kreyn and Dowling, 2012). It seems that the listener's affective response to modulation is determined by the way in which the entire PS and IS of the “arrival” mode appears to the listener in relation to the “departure” mode. Thus, modulation from minor dominant to minor tonic appears different than modulation from minor subdominant to minor tonic. Transition from one PS/IS to another is probably processed as a single percept akin to the standard progression of chords33. The emotional reaction to modulation proves to be one of the most exciting stimuli in music listening experience (Korsakova-Kreyn and Dowling, 2014). We shall see later how this emotionality is important for the emergence of chromatic system.

Modulation usually involves alteration—their combination pioneered in Ancient Greece.

19. Audio: Mesomedes—Hymn to the Muse, second century AD, brief modulation from Lydian to chromatic Hypolydian mode by the end of the hymn (Hagel, 2009, p. 287).

Hellenic listeners identified melodies by intervallic differences (Lippman, 1964, p. 160), which involved IS, IC, ICS, and ISC34. The interval-tracking habit was responsible for non-formulaic composition, as opposed to the contour-tracking habit of earlier folk-musicians.

Both Babylonian and Assyrian traditions fit a single song into a single mode (Franklin, 2013, p. 218). In Classical Greek music, a song often contained a nexus of tetrachords, each bearing its own modal organization (West, 1992, p. 226).

Professionalized folk cultures can come close to what might appear as a chromatic modulation either by emulating MPS music or by forming a composite mode, a compound of 2 or more stand-alone modes (Belaiev, 1963).

20. Audio: Duma about Marussia of Bohuslav, Ukraine. Modulation from E to B that appears to be influenced by the Western classical modulation from tonic to dominant.

21. Audio: Toshto Marii Kushtymo Sem, Marian dance. Here, Pentatony that characterizes the music of Volga Finns is enriched by the composite mode C-D-Eb-E-F-G-Ab-A-C, which was most probably generated by adding together the C-D-E-G-A and incomplete C-Eb-F-G-Bb (without Bb) pentatonic modes.

When folk musicians learn a diatonic PCS, they begin to transpose it by degree. Eventually, they come to connect two tunes, each associated with its own mode, into a medley. Then, one mode becomes transposed so that it would start on the same I degree as another. As the performer gets used to this juxtaposition, he can combine intonations from both modes within the same song. Even pentatonic modes acquire quasi-chromaticism in this way. Thus, two pentatonic modes built from the same tone (i.e., C-D-E-G-A and C-Eb-F-G-Bb) produce quasi-altered III degree (C-D-Eb-E-F-G-A-Bb). The complete combination of all pentatonic modes results in a 9-tone composite mode C-D-Eb-E-F-G-Ab-A-Bb35.
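The arithmetic of composite modes in the paragraph above can be checked with a short sketch. It assumes pitch classes on a 12-semitone grid with C as the shared I degree, and derives the pentatonic modes as rotations of the step pattern of C-D-E-G-A; the helper names `mode_from` and `spell` are hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]
PENTATONIC_STEPS = [2, 2, 3, 2, 3]  # steps of C-D-E-G-A, in semitones


def mode_from(steps, rotation, tonic=0):
    """Pitch classes of the mode given by rotating `steps`, built on `tonic`."""
    n = len(steps)
    pcs, pc = [tonic], tonic
    for i in range(n - 1):
        pc = (pc + steps[(rotation + i) % n]) % 12
        pcs.append(pc)
    return pcs


def spell(pcs):
    """Name the pitch classes in ascending order from C."""
    return "-".join(NOTE_NAMES[pc] for pc in sorted(pcs))


mode_a = mode_from(PENTATONIC_STEPS, 0)  # C-D-E-G-A
mode_b = mode_from(PENTATONIC_STEPS, 4)  # C-Eb-F-G-Bb

# Two modes on the same tonic give the quasi-altered III degree (Eb and E).
print(spell(set(mode_a) | set(mode_b)))  # C-D-Eb-E-F-G-A-Bb

# All five pentatonic modes on the same tonic give the 9-tone composite mode.
all_modes = [mode_from(PENTATONIC_STEPS, r) for r in range(5)]
composite = set().union(*all_modes)
print(len(composite), spell(composite))  # 9 C-D-Eb-E-F-G-Ab-A-Bb
```

The union is taken over pitch classes only, so the sketch says nothing about how the “chromatic” tones are used melodically.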

However, “chromatic” tones in composite modes are never used in scalar fashion (Belaiev, 1963). Even when a folk musical instrument includes the entire chromatic scale, as in Chinese shen or pipa (Riemann, 1899, p. 5), it hardly ever plays chromatic successions. Tunes remain pentatonic or diatonic, while the “chromatic” tones are reserved solely for passing from one mode to another (von Hornbostel, 1975, p. 41).

Chromatic Polymodal System

The more frequent the alteration, the more likely it is to cause habituation, lose its affinity with cognitive dissonance, and acquire a more “consonant” status. This is what must have happened in the Hellenic culture. According to Ancient Greek sources, the altered tone's function was to “shade” the diatonic degrees: notable was the reference to “sweetness” of chromatic alterations (Hagel, 2009, p. 154)36. Pleasantness of alteration was responsible for the quick popularization of the lute in Greece from the fourth century BC: unlike the lyre, the lute allowed chromaticism to be produced comfortably (Higgins and Winnington-Ingram, 1965). Fashion for alterations could have had a “normalization effect” on chromaticism, so that its cultivation would have “domesticated” the cognitively dissonant aspect of it (Katsanevaki, 2011).

22. Audio: First Delphic Hymn to Apollo, second century BC. Essentially, this composition presents sparing use of chromatic alterations that shade the Phrygian tetrachord (West, 1992, p. 288).

23. Audio: Katolophyromai fragment from Orestes by Euripides, from papyrus, 3rd century BC. Most of the melodic content of this lamentation in chromatic Lydian mode is made of altered degrees.

Chromatic alteration became affiliated with aesthetic emotion after the practice of connecting certain modes with certain affects was established through the temple culture of Sumerian and Egyptian cults, some time around the 2nd millennium BC (Farmer, 1965) (see Appendix 3 in Supplementary Material).

Earlier agricultural civilizations heavily depended on the calendar, which boosted the development of astronomy and math, but carried no mystic and esoteric implications to entitle numerology to a governing status delegated to the elite (Frolov, 1992, p. 152). Babylonian music theory was first to link the arithmetic definitions of musical tones to cosmology. Cosmology empowered music with the status of natural law, equating music's influence with the sun or the moon. Just as excess or shortage of sunlight can cause problems, so presence or absence of certain modal qualities in music was believed to be beneficial or hazardous for a person. This doctrine is known as “ethos” and existed in numerous Ancient civilizations (Kaufmann, 1976; Rowell, 1981; Deva, 1995; Katz, 1996; Thrasher, 2008).

In the 6th century BC, Sakadas of Argos started combining different ethea in a single composition by employing intra-modulations between different verses of his song. Then, Aristoxenus' Perfect System rationalized the means for the composer to generate his individual map of tonal tension suitable for a particular composition.

24. Audio: Second Delphic Hymn, second century BC. The music is built on the Lydian tetrachord, alternating between Hypolydian and chromatic Lydian modes—which seems to be reserved as means of a peculiar compositional arrangement, unlike the modal stereotypicity of folk music.

Rising standard of authorship incorporated modal creativity. Greek civilization championed cultivation of melopoeia, art of composing music, put forth by Plato (Kholopov, 2006, p. 74). From the fifth century BC until the Dark Ages, authorship guided expression in the arts. Distinguished authors' names were perpetuated, encouraging other artists to either follow their steps or to compete with them. Growing popularity of chromatic style in the fifth century Athens reflected the antithesis of diatonic conventionality vs. chromatic originality. For the next half-millennium, enharmonic and chromatic genera made the diatonic genus look too predictable and unimpressive (Franklin, 2002).

Chromatic modulations were restricted to melodic junctions between the adjacent tetrachords: alteration could only follow the consonant “bounding” tones at the tetrachord's end (Hagel, 2009, p. 10). Thereby, diatonic system provided the skeleton for all modulations and alterations—very much like in a modern key. However, not all musicians followed the rules (Franklin, 2002).

Crexus, Timotheus, and Philoxenus were condemned for increasing the number of strings on the lyre and for excessive elaboration, and were blamed for using “polyharmonia” to appeal to the mob's ideas of plurality and liberation (LeVen, 2014, p. 81). This accusation should be understood in the context of dithyramb contests and theatrical plays becoming so popular that entrance fees were introduced for the first time in Greek history (Csapo, 2000). Theater musicians made lavish profit and enjoyed enormous popularity; this, together with the growing market (18 theatrical festivals per year, fourth century BC), unleashed fierce economic competition (Csapo, 2011). New Music was definitely based on the direct approval/disapproval of live audiences. The immediate reason for the split in public opinion, and for the voices condemning New Music, was its break with the conventional ties between mode and genre, and its inter-strophic modulation, which could be rather abrupt, even a semitone apart (Hagel, 2009, p. 44).

25. Audio: Lamentation from Iphigenia Aulidensis by Euripides, third century BC. Modulation from Hyperaeolian to Hyperphrygian mode by common tone.

Chromatic music represented a new philosophy of consumerism of aesthetic emotions, in opposition to Platonic philosophy that reserved diatonic music for propaganda of “right” emotions (Stamou, 2002). Chromatic music grew out of older enharmonic music that was cultivated in Dionysiac dithyramb, and became related to theater and symposium (drinking parties), both of which involved aesthetic appreciation. Chromaticism as “sweetening” of intervals by tonal shading served to evoke states ranging from “pleasant” to “lugubrious” (Franklin, 2005): essentially, aesthetic emotions37.

Athenian chromaticism replaced cosmogonic consonance admiration with admiration for realistic impersonation of humanistic character traits, interwoven into dramatic development. Aristoxenus' chromatic system instrumented this change by rejecting older Pythagorean numerology as “dogmatic,” and basing a new music theory on psychoacoustic principles put to service of the composer (Barker, 1978).

Another important issue was the topological reference frame: Babylonian/Pythagorean diatonic theory was all arithmetic, defined by prescriptive numerical proportions, whereas Aristoxenian chromatic theory was geometric, descriptive of actual distances on monochord's strings. Remarkable is the commonality of Aristoxenian and Euclidian approaches to the infinitely smallest magnitude, setting a conceptual and a terminological correspondence between musical and physical spaces (Barbera, 1977). Chromatic tetrachords reflected the contemporary advances in irrational numbers, presenting a breakthrough from Pythagorean ratios (Scriba, 2015, p. 44). Babylonian mathematics had a strong arithmetic-algebraic character: tables and lists of reciprocals and roots provided the “right” answer for a particular use, where “the geometrical form of the problem was usually only a way of presenting an algebraic question” (Struik, 1987, p. 28). In contrast, Greek geometry sought methods for inferring the relations between objects based on empirical proof.

Moreover, Euclid introduced a strong personalized aspect in such calculations, where angles and distances were estimated from the viewpoint of a particular spectator (and not “in general”), resulting in discrepancies between “optic” and “perspectival” evaluations (Andersen, 2008, p. 725). Unlike Babylonian geometry, Euclidean geometry was influenced by scenography (728), acquiring strong spatial connotations (considering geometric lines as representations of what can actually be seen around), in contrast to the Babylonian “aprioristic” line of thinking (providing ready numbers for a particular application).

Chromatic music was a tonal system engineered to present emotional theater: to convey detailed emotional information prompted by the text and/or dramatic action. Chromatic MPS broke away from a diatonic MPS by becoming a storage of modulation/alteration possibilities for the composer. To minimize the inconvenience of retuning the lyre, which remained a reference instrument for theory, musicians had to find as many common tones between different modes as possible. And seven principal modes, when built from the open string E, mark the E-A-B core of immutable tones, thereby forming the axis for categorization and hierarchical organization (Gombosi, 1951). Of E-A-B, central A3 seemed to execute the function of the ultimate tonic (West, 1992, p. 219).

Just like ekmelic and mesotonal modes, chromatic modes were crystallized by the permanence of tuning: the least frequently retuned tones acquired the status of stability, while the most alterable tones ended up at the bottom of the tonal hierarchy. The synékheia (continuity) law postulated that all chromatic modifications be derived from the diatonic MPS for better melodic coherence (Franklin, 2005); Aristoxenus was clear on using the entire MPS as a reference for chromatic alterations (Hagel, 2009, p. 44).

The MPS structure in Figure 4 represents the chromatic/enharmonic key of A (Strunk and Treitler, 1998, p. 37), expanded over all the available sonic space—what was called Systema Ametabolon (West, 1992, p. 223). Aristoxenus described 13 chromatic “keys” which altogether regulated organization of chromatic/enharmonic genera, built from each of the 12 semitones between Hypodorian F2 and Hyperphrygian F3 (Hagel, 2009, p. 48). The description of the chromatic system might sound extremely complex, but in practice, the overall number of PSCs in the MPS was not exorbitant38. There was little distinction between the chromatic and enharmonic genera39. Greek notation did not distinguish between them at all (West, 1992, p. 255), and the performance practice left the exact choice to the discretion of the performer. In reality, musicians had to deal with no more than 14 different types of tetrachords: 2 types of each of the 7 principal keys.

The entire Systema Ametabolon clearly stresses the A/E gravity, with tonic/dominant functionality. The epicenter of chromatic mutability falls at the upper middle of the MPS (Figure 4). This is the register where melodies show the greatest modal complexity. The peculiarity of the Greek system is that all alterations are descending41. The descending functionality of Ancient Greek music probably originated from the Archaic trichord E-F-A (West, 1981), with its characteristic “directing” semitone placed at the bottom. This trichord became a melodic frame, where extra tones could be placed in between E and A, forming the two oldest heptatonic genera, diatonic and enharmonic, circa seventh century BC, credited to Olympus (Barker, 2007, p. 99). The chromatic genus evolved later, as a simplification of the enharmonic genus, and gained in popularity up until AD: surviving musical fragments from the Roman period are almost wholly diatonic, and both Gaudentius and Macrobius reported that chromatic and enharmonic genera were obsolete by the fifth century AD (West, 1992, p. 165).

Chromatic music was ousted in the West, but not in the East of the Roman Empire. Greek chromatic MPS impacted all the territories between Greece and India, conquered by Alexander during the heyday of chromatic music. “Gapped” structure with chromatic/enharmonic pyknon (a pinch of three close pitches) penetrated local folk cultures and created a special intervallic class, which Kholopov (1988, p. 38) named “hemiolic” (“hemiolia” being the 1½:1 ratio). Hemiolic mode differs from diatonic by its chromaticism: recoloration (chroma) of ICs due to their inequality, most prominent in microtonal varieties of hemiolic modes, e.g., maqam Hijaz-Kar-Kurdi C-Db-E¾b-F-G-A¾b-B¾b-C (Racy, 2004, p. 108, see Appendix 4 in Supplementary Material).

26. Audio: B'utho, Syrian Orthodox chant, Tminoyo mode. Chromatic alterations with microtonal inflections: A-Bb-Cb-D(Db)-Eb-F-Gb(G)-Ab(A)-Bb-Cb-Db (Lundberg, 1997).

See Presentation-2: Post-Hellenic chromaticism.

Hemiolic modes are decidedly non-diatonic: their tones cannot be positioned in a circle of perfect 5ths. Gapped tones represent discrete—and not altered—degrees of the mode: the entire music work might be based on the stationary gapped tones, without any modification. Such mode differs from pentatony by contrast of gap and semitone, where expansion of one causes shrinking of another, inducing tonal tension. Instability of both gapped tones is responsible for their flexibility in expressive tuning, which enables them to come closer to a target stable tone, thereby exaggerating tension and relaxation (Marcus, 1993). The emotional expression of hemiolic gaps also opposes that of pentatonic trichords: related to heightened pleasure and even ecstasy (ibid.), unlike the gapped tones in pentatony, which constitute a commonplace rather than a sign of elation there.
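The non-diatonic claim above can be checked mechanically. The following is only a minimal sketch under stated assumptions: pitches are reduced to 12-tone equal-tempered pitch classes (C = 0), "positioned in a circle of perfect 5ths" is read as "occupies one unbroken arc of that circle," and the Hijaz-type scale C-Db-E-F-G-Ab-Bb is a hypothetical 12-tone approximation that ignores the microtonal inflections discussed in the text. The function names are illustrative.

```python
# Assumption: 12-tone equal-tempered pitch classes, C = 0.
FIFTH = 7  # a perfect 5th spans 7 semitones


def fifths_positions(pitch_classes):
    """Return the sorted positions of the pitch classes on the circle of 5ths."""
    # 7 is its own inverse modulo 12, so pc * 7 % 12 gives the number of
    # fifths above C at which the pitch class is reached.
    return sorted({(pc * FIFTH) % 12 for pc in pitch_classes})


def forms_fifths_chain(pitch_classes):
    """True if the set occupies one unbroken arc of the circle of 5ths."""
    pos = fifths_positions(pitch_classes)
    n = len(pos)
    if n <= 1 or n == 12:
        return True
    # Cyclic distances between neighbouring positions; they always sum to 12.
    gaps = [(pos[(i + 1) % n] - pos[i]) % 12 for i in range(n)]
    # An unbroken arc has n - 1 steps of size 1 and one large closing gap.
    return max(gaps) == 12 - (n - 1)


DIATONIC = [0, 2, 4, 5, 7, 9, 11]    # C-D-E-F-G-A-B
PENTATONIC = [0, 2, 4, 7, 9]         # C-D-E-G-A
HIJAZ_APPROX = [0, 1, 4, 5, 7, 8, 10]  # hypothetical 12-tone approximation

print(forms_fifths_chain(DIATONIC))
print(forms_fifths_chain(PENTATONIC))
print(forms_fifths_chain(HIJAZ_APPROX))
```

Under these assumptions the diatonic and pentatonic sets each form a single chain of fifths, whereas the gapped set does not, which is one way of making the contrast with pentatony explicit.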

Manuel (1989) reserves a special term for hemiolic music amalgamated by Arabic, Turkish, Greek, Jewish, Gypsy, Romanian, and Andalusian traditions, “the Mediterranean tonality,” first documented in the early nineteenth century manuscripts of linear notation. The initially strong microtonal enharmonic component of such music eventually became “retuned” into the Western classical music diatonic tonal space, as the performance practice adopted accompaniment with chords. The traditional monodic implementation, on the other hand, has preserved the microtonal adjustments, especially those of the hemiolic gap, supporting the tonal organization that is distinctly different from Western tonality (Marcus, 1993).

The Mediterranean implementation of chromatic recoloring shares affective aspirations with Western musica ficta (Westrup, 1954), but follows a different modal order.

• Western chromaticism was accidental in nature, and followed the trichordal scheme, where the chromatic degree would be jammed between two diatonic degrees.

• Mediterranean chromaticism was regular, modally driven by melodic inflections, following the tetrachordal scheme, where two chromatic degrees would be encapsulated between two diatonic degrees.

This difference determined polarly opposite paths of their development. Western chromaticism, from the twelfth century on, supported emergence of polyphony as a standard of composition—by becoming a tool of regulating the vertical harmony by means of triadic “over-a-degree” vertical relations. Mediterranean chromaticism fueled melodic complexity, serving as the primary expressive means for the composer in organizing horizontal harmony by means of intra-tetrachordal alterations and modulations. This important distinction has led to completely different spatial connotations for Western tonal key and Mediterranean tonality. Western spatio-tonal design went towards incremental geometric projection of ever growing complexity, while Eastern Mediterranean focused on ornamental patterning the nexus of small size modules (al Faruqi, 1985). This contrast seems to reflect more fundamental opposition of philosophies, where Western Christian and Eastern Islamic cultures appear to form the core for the divergence between Western and Mediterranean tonalities (see Appendix-4).

Tonality and Perspective as Models of Representational Organization

From a perceptual angle, the concept of chromatic tonal key incorporates uniformity of distribution of stability/instability within the PCS, which involves 5 hierarchical ranks (Lerdahl, 2009):

1. tonic (function of gravitational center);

2. dominant, subordinated to tonic, yet providing an anchor alternative to it (opposing function);

3. mediant, subordinated to tonic, while comprising the tonic triad together with dominant (complementary function);

4. 4 coordinated non-tonic diatonic degrees subordinated to tonic, dominant and mediant (auxiliary function);

5. chromatic degrees, each subordinated to a neighboring diatonic tone (function of a leading tone).
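The five ranks listed above can be expressed as a simple lookup. This is a hedged sketch only: it assumes a major key in 12-tone equal temperament, with the mediant a major third above the tonic, and the function name `rank_of` is hypothetical rather than taken from Lerdahl (2009).

```python
# Assumption: major key, 12-tone equal temperament, pitch classes 0-11 (C = 0).
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)  # semitone offsets of diatonic degrees


def rank_of(pitch_class, tonic=0):
    """Return the hierarchical rank (1 = most stable) of a pitch class."""
    degree = (pitch_class - tonic) % 12
    if degree == 0:
        return 1  # tonic: gravitational center
    if degree == 7:
        return 2  # dominant: opposing function
    if degree == 4:
        return 3  # mediant: complementary function
    if degree in MAJOR_SCALE:
        return 4  # remaining diatonic degrees: auxiliary function
    return 5      # chromatic degrees: leading-tone function


# Ranks of all twelve pitch classes in C major.
ranks = [rank_of(pc) for pc in range(12)]
print(ranks)
```

Because the ranks depend only on the distance from the tonic, the same table transposes to any key by changing the `tonic` argument, which mirrors the uniform distribution of stability that the list describes.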

Such organization evolved from the diatonic MPS through standardization of intervallic relations by means of counterpoint techniques which every composer was expected to know. Ability to hear equivalent concords (triads) between multiple parts, and recognize them as a single typological percept was set in place during the Renaissance (Nutting, 1974). Merging of vertical intervallic relations into a triad sonority, and its categorization by the bass together supported emergence of major/minor tonality in place of the old 8-mode MPS (Parncutt, 2011).

Subsequently, psychological representations of tonality in terms of chords, for Western listeners, became as real as hearing the tones themselves (Vuvan and Schmuckler, 2011). Tones, intervals, and chords are processed through imagery representation, and the representation of chords is derived from the representation of tones (Hubbard and Stoeckig, 1988). Therefore, chords should be regarded as common chunks of pitches, remembered by music users to facilitate tonal navigation across the music work. Standardization of chords is what separates Western polyphony from folk “natural” polyphonies (see Appendix 5 in Supplementary Material) with their unrestrained abundance of possible vertical combinations of tones. Thus, Megrelian polyphony uses 18 types of chords on the VI, 14—on the V, and 13—on the IV degrees (Arom, 2010).

27. Audio: Henry VIII—Pastime with good company (c.1510). The 3-part vertical harmony is based on a few triads: 1 chord on the VI degree, 3—on the V, and 1—on the IV.

28. Audio: Odola, Megrelian work song. The 3-part harmony contains plentiful variety of chords—in stark contrast to the example above.

Yet another crucial distinction is that non-Western “natural” homophonies lack chordal functionality, treating the vertical harmonic aspect as secondary to melodic and timbral aspects. Such “chords” should be regarded as “timbre-harmonic clusters” that form no continuous linear development, nothing close to the Western notion of standard harmonic progressions (Kubik, 1999, p. 108). Tonal tension here plays no role in vertical organization, and is reserved for horizontal harmony alone: it is not chords that resolve into one another, but tones of the principal melody. Devoid of any hierarchic relationship, “chords” only thicken the texture or provide the reference frame to illuminate the mode in the manner of a pedal cluster of tones. Such are the “chords” of Japanese Sho music (for 17-reed pipes), whose frigidity throughout a composition offsets fluidity of the principal melody (Malm, 2000).

29. Audio: Ompeh, Efufu area of Ghana. The example of a call-and-response leader/chorus form popular in West Africa where the chorus contains multi-part dubbing of the melody—what appears as a chain of reproductions of the same “chord.”

In fact, penetration of Western chordal mentality in non-Western music systems has had detrimental impact on their original tonal organization. Native performers start thinking musically in terms of Western triads and functions—which then remaps their modal intonations and produces new hybrid modes.

30. Audio: Men Kyrym, Crimean Tatar song, Uzbekistan. This solo song features unmistakable tonic, subdominant, and dominant functions in the melody—noticeably different from pentatonic organization that is traditional to neighboring Tatar ethnicities in Eastern Europe and Central Asia.

Crystallization of permanent chromatic “tendency tones” (Huron, 2006, p. 160) was another factor in shaping tonality: diatonic MPS afforded chromaticism only as melodic “accidents” of few types (Adams, 2010), whereas the tonal key embedded chromatic alterations as organic constituent of the PCS—transposable altogether with the tonic (Brown, 1992). Tonal chromaticism became “generative” (Forte, 1980)—in fact, needed to establish tonal integrity, so that its absence becomes a sign: an expression of sublimity, purity, or naiveté by strictly diatonic music (Vashkevich, 2006, p. 7).

Chromatic layer coexists with a diatonic layer as “chromatic pitch fields” (Burnett and Nitzberg, 2007) mapped in certain areas of PCS and remembered by music-users as a hierarchy of inter-connected “pitch alphabets” to be referenced during parsing of music (Deutsch, 2012).

Intelligibility of pitch alphabets embedded in a tonal key must be the immediate cause for the steady pattern of global Westernization, observable since the introduction of tonal keys in the eighteenth century. This process is often denigrated as “colonialistic,” but the truth of the matter is that functional tonality provided the cognitive framework that facilitated creation and comprehension of music for non-Western musicians, so that their Westernization is nothing but demonstration of universality of the cognitive benefits of hierarchic functionality (Yanov-Yanovskaya, 1999). Other issues, such as economic and political advantages, would not have come into play unless the Western music system presented a more effective way of encoding gradations in tension/relaxation than did the traditional local systems. Categorizing melody in terms of implied chords is a form of chunking—a way of compressing information42. Processing music in terms of standardized progressions of implied chords is another method of chunking, enabling even greater compression. Furthermore, both compression methods allow implicit learning of tonal regularities by mere exposure (Tillmann et al., 2000): figuring out which pitch constitutes a part of which chord, and which chord—part of which key, requires no teacher. Ease of implicit learning must be the underlying reason why many non-Western musicians tend to either switch to Western tonality or hybridize their native systems with tonality (Nettl, 1986). Adoption of tonality in the Third World countries essentially is the same as adoption of banking system or electrification.

Technically, what made Western tonality cognitively special was the crystallization of purely intervallic hierarchic typology, where vertical ISC came to replace melodic ISC as the basis of categorization in auditory perception. If early Medieval polyphony had all its parts share the same PS (Atkinson, 2008, p. 127), late Medieval polyphony was conceived linearly, often generating harmonic differences between parts (Bent, 1984). Compositionally, polyphonic harmony was constructed as a sum of monophonic harmonies, different parts in different species of 4ths and 5ths—until tonality of the early seventeenth century brought all parts to a common denominator of a single fixed IS, defined in semitones (Atcherson, 1973). So, the tonal composer conceived the entire texture as a single tonal construct which he had decided precompositionally—in contrast to the modal composer who could only discover the actual harmonic results after summing up all the parts (Mangani and Sabaino, 2008).

Nomothetic centripetal hierarchy of tones, fixed in their subordination and coordination relations, is quite analogous to the astronomic model of planets orbiting the sun, discovered during the Renaissance—as well as to the organization of depicted images in linear perspective (Cook, 2011). Tonality, heliocentricity, and perspective, all implement the same idea of harmonious arrangement of numerous peripheral objects in relation to a centered object. All three also deal with motion: physical, melodic, or “dynamic”—the latter term is reserved for reference to visual representation of “directed tension” in pictorial composition (Arnheim, 1984)—closely matching the idea of tonal tension in melodic harmony43.

And this is not a coincidence. Music composition has a long history of co-influence with architecture and ornamental design, all defined along the dimension of “virtual gravity” (Galeyev, 2003). Their connection comes naturally: sound and light are waves, subject to the same laws of reflection, dispersion, absorption, diffraction, and interference—differing mostly in wavelength: a musical sound-wave is about the size of a human, while optical wave is microscopic (Nazajkinskij, 1972, p. 116). The laws of physical space that rule audio and optic transmission prototype laws of virtual space constructed by art-works—and, here, musical texture becomes the cross-modal equivalent of visual depth (127).

Visual objects populate the visual space, whereas musical tones fill up the musical texture comprised of simultaneously sounding parts and voices. Percepts of pitch and visual size are cross-modally intertwined (Bien et al., 2012): we become aware of the presence of tones in the virtual music space essentially by the same mechanism as we locate visual objects. The melodic layer in musical texture serves as an equivalent of the visible surface, the one that is closest to the observer. In the pitch domain, discretization occurs in terms of intervals; in the visual domain, by formation of contours. Visual contour equates to melodic contour (Terhardt, 1995): both outline the object of perception. The correspondence between the two has been known in musical literature since Jean-Jacques Rousseau (Galeyev, 2007). It also finds confirmation in experimental research (Weinstein and Gridley, 2010).

We relate one polyphonic part to another by estimating the vertical intervals between them in terms of their harmonicity, rhythmic simultaneity of tones, and contrast in melodic contour—once we identify the concurrent parts, we track them by their vertical order (Palmer and Holleran, 1994). Cardinality of vertical order is confirmed by long-standing compositional practice of avoidance of part-crossing in voice-leading (Huron, 1991). Despite dividing our attention between all the registered parts (Demany and Semal, 2013), pitch is best detected in the upper part—and this is disregarding whether or not the upper part contains more semantically important material (Palmer and Holleran, 1994). Moreover, the ERP studies of perception of polyphonic music indicate that formation of parallel audio information streams is pre-attentive and involves better encoding in the higher part, and even years of experience playing a low-range instrument does not reverse this bias (Trainor et al., 2014). The high part superiority effect was found in 7-month-old infants, suggesting automatic ordering of segregated audio streams (Marie and Trainor, 2013).

A similar effect occurs in visual perception of direct motion: we receive more information about the motion of closer objects, whereas optical invariants of distant motion are not picked up by the observer: closer motion is processed faster and with greater accuracy (DeLucia, 2008).

Observers can estimate trajectories of up to eight simultaneously moving objects (DeLucia and Novak, 1997), quite on par with melodic motion in polyphonic parts of Italian Renaissance music, where five parts were the norm for sophisticated style, and 3-part writing was considered a sign of simple folk style (Dubravskaya, 1996, p. 56). Although experimental studies demonstrated that non-musicians are only able to count up to three concurrent parts, and musicians up to four (Stoter et al., 2013), there is evidence (Huron, 1989) that at least some musicians can identify a 5-part polyphony when they resort not to an analytical strategy of denumeration but to a holistic strategy: estimating by how many more parts the texture is thicker than two parts. Such “thickness guess” would be perceptually analogous to estimation of crowdedness in a set of visual objects.

Interestingly, the tendency of the upper part in a multi-part setting to be the most busy in contrast to the lowest part that tends to house slower rhythm (Broze and Huron, 2012) remarkably resembles the motion parallax, sensitivity and awareness of which is found in 6-month-old infants (Condry and Yonas, 2013), raising questions about the genetic roots of greater acuity of perception of proximal data.

By the same token that we ascribe greater importance to objects that are coming toward us in the depth parameter, we ascribe greater urgency to sounds that are higher in pitch: “raising voice” implies calling for attention. Everyone knows from experience of vocalizing that raising of the voice involves activation of the higher part of the vocal folds (Nazajkinskij, 1972, p. 156). Perhaps, being “higher in pitch” translates into “more important,” just as “being closer” does. Yet another implication could be drawn from the Doppler effect, which engages cross-modality, but only in relation to non-static stimuli. For dynamic stimuli, ascending pitch is congruent with growing in size, whereas descending pitch is congruent with shrinking in size, in accordance with the visual illusion of an approaching object growing in size (Eitan et al., 2014).

Renaissance Polyphony and Perspective: Parallels in Organization

Just as Renaissance painters went into experimentation with projective geometry to develop an eye for perspective, their music colleagues employed an empirical technology. At least from 1531, erasable tablets became a common object of trade, an accessory for “serious” business of composition, indispensable for sketching a music work (Owens, 2000). At first, composers used tablets as visual aids in configuration of pitches in a single part, but by 1612 entire multi-part compositions were drafted on a larger cartella (Owens, 1998, p. 74). Visual representation of music on tablet became mentalized. Monteverdi described his compositional process as “warping” melodic lines in his head before notating the music, a norm of composition at least since 1537: in his instructional treatise, Auctor Lampadius distinguished between a mental and a written stage in composing (64–73). Composers definitely abided by the graphic representation on a cartella when they applied the “musical proportions” of painters and architects to the proportions of sections in music form (Reynolds, 1987), employing a decidedly spatial approach to planning a composition.

Composers were well aware of the shift in compositional approach: Tinctoris named 1437 as the Rubicon between “discordant” and “concordant” styles (Blackburn, 2013)—distinguished by the manner of laying out parts according to the harmonic plan. Lowinsky (1989) likened the emergence of this innovative “simultaneous concept of a polyphonic whole” with the development of the theory of perspective. By 1523, composers religiously followed the established technique of a simultaneous conception of all parts in a prescribed order (Dahlhaus, 2014, p. 94)—quite similar to artists religiously applying perspective.

31. Audio: Josquin—Ave Maria (1485), 4-part motet. Graphic visualization by Stephen Malinowski demonstrates the role of spatiality in the distribution of musical phrases, evoking the state of equilibrium, corroborated by the dominance of C major triad throughout its tonal plan.

Image 1. Piero della Francesca—Brera Madonna (1472). One of the first generation works that employed precise linear perspective, characterized by strong sense of harmonicity and proportionality of composition, contributing to the impression of serenity.

The rise of tonality accompanied the rise of perspective in the same cities, sharing the same user base, artistic ideals, and similar organizational principles. Perspective made the first public impression in 1425 (Edgerton, 2009, p. 5). Renaissance “monality” (Wienpahl, 1971), that is, modality with the major/minor triadic principle of organization, presented the first style of tonal rather than modal integration of harmony. It flourished in the genre of frottola, popular since the 1490s (Prizer, 1975).

32. Audio: Tromboncino—Frottola “Ostinato vo' seguire” (1509). Solo melody establishes “one-point” perspective for the lute accompaniment that contains progression of chords and melodic figurations over implied chord.

Lute and guitar accompaniment exposed explicit “triadic thinking”—where music was processed in terms of melody supported by chords rather than counterpoint rules that still dominated vocal and keyboard treatises (Christensen, 1992). Similar homophonic unity characterized uni-syllabic delivery of the multi-part arrangements.

33. Audio: Josquin—El Grillo (1505), frottola for 4 vocal parts. This multi-part texture clearly demonstrates the integrative effect of “thinking in chords, with the melody in the upper part.”

Eventually, chordal thinking led to the establishment of general bass (Schulenberg, 1984): a practice of improvising a progression of chords to a given bass line, which can be viewed as an auditory equivalent of scaling 3D objects onto a 2D plane. Here, the bass acts as a ground line, whereas a vertical slice of texture is projected onto it by every beat, so that harmony controlled by a certain chord covers a specific number of beats aligned in relation to the bass. This constant projection of the vertical parameter onto the horizontal time-line is what every keyboard player was supposed to do while accompanying an ensemble or a solo instrument (Bach, 1949).

34. Audio: Monteverdi—Zefiro torna (1632), madrigal for 2 parts and basso continuo. Here, complex harmonic pulse is set by the repetitive formula in the bass. Toward the end, repetitions are disrupted before the movement resumes and marks the end with a flourishing cadenza.

General bass discloses harmonic pulse that has served as an important compositional means, equal to time signature in its formative power, in post-Renaissance music for most of the Common Practice Period.

35. Audio: Pachelbel—Canon in D (1680). Graphic visualization by Stephen Malinowski demonstrates the formula of 8 chords that is consistently repeated in the bass, forming the progression of 5 vertical harmonies that unite the melodic material of all three melodic parts.

Pioneers of both perspective and chordal textures found their inspiration in Ancient Greece. Renaissance theorists of perspective drew heavily on treatises by Euclid and Ptolemy (Edgerton, 1974). Music theorists leaned equally heavily on Aristoxenus and Ptolemy (Galilei, 2003). Vincenzo Galilei, the father of the famous astronomer, forged a new method of improvising harmonies on a string instrument to accompany his own singing (what he called “arie”), following the harmonic models of popular Italian folk songs of his time (Palisca, 1960). He believed that in doing so, he was restoring the venerated principles of Greek composition by creating perceptual analogs to Greek tonos and modo.

Both restorations, of “tonality” and perspective, relied on mathematics. Just as much as visual artists invested into projective gadgets, Renaissance musicians went into calculating optimal ratios for tuning in order to maintain the purity of chords while keeping the melodic line expressive. Moreover, musicians and artists utilized the same proportions—a tradition dating back to Peter Abelard, consummated in Alberti's theory of “musical proportions” (applying ratios of musical consonances to geometric figures) which remained influential until the eighteenth century (Pintore, 2004)44.

From the very beginning, Renaissance perspective was bound to proportionality (Wittkower, 1953). Perspective rules strikingly resemble centripetal gravity in tonality. Moreover, perspective rules are mirrored in vertical, as well as horizontal harmonic organization (Table 1).


Table 1. Correspondence of principles of canonic linear perspective and principles of eighteenth century Western tonality.

Horizontal unification originated from vertical unification in Renaissance polyphony: Renaissance triadic tonality was born out of polyphonic texture, by satisfying the counterpoint rules in 4-part cadences (Randel, 1971). The modern day consensus of historic musicologists holds that compositional process of 14–15th century polyphony was based on expansion of 2-part into multi-part counterpoint (Moll, 2014). Renaissance mode essentially determined the melodic composition through cadential plan of the soprano and tenor parts—in a way similar to classical tonality (Meier, 1988, pp. 123–236)—and was largely controlled by usage of major/minor triads (406–421). Most of the Renaissance went into attempts to forge the meantone temperament with optimal sonance of triads on the degrees critical for most common keys (Lindley, 1976): thereby, the idea of optimal division of octave was interconnected to the idea of better sounding chord, which is essentially the same idea that governed logarithmic intervallic distribution between parts (Huron, 2001).
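The link between dividing the octave and obtaining a better sounding chord can be illustrated numerically. The sketch below uses quarter-comma meantone, which is an assumption on our part: the text speaks of meantone temperament in general and does not single out this variety. In this tuning each fifth is narrowed so that four stacked fifths, lowered by two octaves, give a pure 5:4 major third.

```python
import math


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)


PURE_FIFTH = 3 / 2          # Pythagorean (just) perfect 5th
PURE_MAJOR_THIRD = 5 / 4    # just major 3rd
MEANTONE_FIFTH = 5 ** 0.25  # quarter-comma meantone 5th (assumed variety)

# Four fifths up, two octaves down: C-G-D-A-E brought back next to C.
pythagorean_third = PURE_FIFTH ** 4 / 4
meantone_third = MEANTONE_FIFTH ** 4 / 4

print(round(cents(PURE_FIFTH), 2), round(cents(MEANTONE_FIFTH), 2))
print(round(cents(pythagorean_third), 2), round(cents(meantone_third), 2))
print(round(cents(PURE_MAJOR_THIRD), 2))
```

The tempered fifth comes out a few cents narrower than the pure one, and in exchange the resulting major third coincides with the pure 5:4 ratio instead of the wider third produced by four pure fifths.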

Tri-unification of centrality, ratio and vectorization is specific to Western civilization. Other cultures that came in touch with the theory of perspective, notably, Arabic and Chinese, did not develop a “psychological mental set” (Edgerton, 1974) out of the intellectual components of algebra, geometry, astronomy, religion, and art. Unique was the aspiration of Westerners to rationalize capitalism as an extension of moral law, seeking “symbiosis between God and mammon,” and putting math in the service of this double-goal—using it to define the Divine order while simultaneously gaining practical benefits in making music and fine art. Neither China nor Arabia, although technologically ahead of Europe by the fifteenth century, generated a culture based on the theoretical realization of mathematically ordered world, designed for practical production of utilities. Arabic scholars of the eleventh century knew Euclid and linear perspective (Belting, 2008), yet despite their contact with artists, perspective made zero impression on Arabic visual composition (Raynaud, 2009), not going any further than affecting depiction of details in architectural design (Yazar, 1991).

Solely in the West did the artist adhere to the model of Divine emanation: God creates Man, and Man creates art, which is Divine. Christianity was a viable force in promoting perspective and tonality. Not only was perspective sanctified by the Quattrocento theologians as a faithful representation of Divine light reflected from an object (Edgerton, 2009, p. 29), and not only was the foundation of major/minor chords justified by Zarlino as uncovering perfection of God and Nature (Gozza, 2000, p. 58), but the Christian understanding of omnipresence of God in every material particle paved the road to hierarchic unification of pictorial objects, as well as musical tones.

Hierarchic unification opened doors to compression of information, which enabled great complexity, unparalleled in other cultures.

36. Audio: Tallis—Spem in alium (c.1572), a 40-part motet. Extremely thick polyphonic texture, with complex division in groups. Graphic visualization by Stephen Malinowski.

Image 2. Bruegel the Elder—The Procession to Calvary (1564). Extensive landscape is filled up with detailed rendition of over 100 characters, subdivided in multiple concurrent events, all integrated in a single bird's-eye view perspective.

Tonal organization, just like pictorial perspective, establishes a particular model of symbolic representation of reality, shared by majority of the members of the same socio-cultural formation. Such model sustains over a period of time, instilling the same approach to reality in old and new generations—until the time when, for some reason, the socio-economic change renders this approach inadequate. Then, the old model of symbolic representation is abandoned and replaced by a new one. The entire Western history consists of numerous such symbolic “revolutions.” Each major historic period in Western civilization seems to carry its own “special perspective” (Panofsky, 1991, p. 21)—and a corresponding model of tonal organization (see Appendix 6 in Supplementary Material).

Melodic Line, Geometric Line and Environmental Topography: Their Connection

Just as the shaded line can inform whether the drawing engages the depth parameter, performance of a single melodic line can indicate the vertical and horizontal harmony.

A string player or singer, brought up with Western tonality, tunes each melodic tone according to its membership in the chord implied by the vertical harmony (Friberg et al., 2006)—even if the entire music work is solo. Tonal performance presents an ongoing challenge of constant mediation between melodic and harmonic tuning.

The pre-MPS folk music does not involve such intricacies. The archaic folk singer, raised in monodic music culture, is unaware of implied chords, following only the melodic aspect of tuning. Representatives of any primordial polyphonic folk traditions are likely to combine melodic and harmonic tuning of their respective polyphonic system. Kubik (1985) observed harmonic tuning “accents” amongst the encultured groups of African ethnicities—quite similar to having an accent in speaking a language45.

Nikolai Garbuzov coined the concept of zonal hearing to address the discrepancies of such “accents.” Through a series of methodical experiments he established a frequency range within which the majority of musicians perceived a particular tone as “the same pitch”: a pitch zone. The exact values differed between performers, testifying to the presence of a recognizable individual tuning style adopted by each musician as part of his “individual sound.” Yet, overall, all individual styles overlapped within a range of 58–76 cents, disclosing what appeared to be the range of cultural convention of tuning46. Each melodic interval slightly varied in zonal width: from 24 cents for unison to 76 cents for minor 2nd (Garbuzov, 1980, p. 92)47.

This variability reflects the difference in distribution of “tendency tones” in modal intonations. Each pitch zone can be viewed as part of the representation of a traditionally established set of frequencies, remembered as a repertoire of standard intonations. The proof of that is the categorical perception of pitch errors: listeners judge about 60% of deviations from a standard frequency as “correct” in tuning, possibly due to the same error correction mechanism that is engaged in perception of verbal phonemes (Siegel and Siegel, 1977).

A pitch zone is the aggregate value of all the expressive tunings for a given degree of a PS across all intonations that characterize this mode, afforded by the music-users (Garbuzov, 1980, p. 144). Narrowing a pitch zone for a specific degree in a mode/key necessarily indicates the presence of an important modal intonation that utilizes this degree, and can reveal a particular method of tonal organization. Narrowing of all the pitch zones would signal the performer's concern with the vertical harmony.
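The cent values quoted for pitch zones can be made concrete with a short sketch. It converts a frequency deviation into cents (1200 cents per octave) and checks whether a tone falls inside a zone of a given width. The function names, the 440 Hz reference, and the choice to center the zone symmetrically on the reference frequency are assumptions made for illustration only; Garbuzov's zones were derived empirically and need not be symmetric. The widths of 24 and 76 cents are the figures cited above for the unison and the minor 2nd.

```python
import math


def cents_between(f_ref: float, f: float) -> float:
    """Signed interval from f_ref to f in cents (1200 cents = one octave)."""
    return 1200.0 * math.log2(f / f_ref)


def within_zone(f_ref: float, f: float, zone_width_cents: float) -> bool:
    """True if f lies inside a zone of the given total width.

    Assumption of this sketch: the zone is centered on f_ref,
    so it extends zone_width_cents / 2 on either side.
    """
    return abs(cents_between(f_ref, f)) <= zone_width_cents / 2.0


if __name__ == "__main__":
    reference = 440.0   # hypothetical reference tone, in Hz
    performed = 445.0   # hypothetical slightly sharp performance, in Hz
    print(round(cents_between(reference, performed), 1))
    print(within_zone(reference, performed, 76.0))  # widest zone cited
    print(within_zone(reference, performed, 24.0))  # narrowest zone cited
```

Under these assumptions, the same 5 Hz deviation counts as “the same pitch” in the widest cited zone but falls outside the narrowest one, which is the sense in which a narrower zone demands finer discrimination.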

In the same way that lack of harmonic functional hearing leads the folk-fiddler to render a tonal melody with overly strong “melodic” accent, the inability to integrate all the perceptual stimuli presented in a pictorial image prevents members of unacculturated groups from decoding the spatial information in perspective-based drawings (Hudson, 1967). The culprit here is the same—centering on a single aspect of organization due to lack of integrative experience. And just as folk musicians can be taught tonal music, non-Westernized people can learn to draw in linear perspective (Mshelua and Lapidus, 1990).

A non-classical listener hears a modernistic composition as a sort of “noise” because he treats music “unmusically”: he applies the hearing criteria for environmental listening (Nazajkinskij, 1972, p. 173). He employs fissure, the isolation of a presumably important sound signal from unimportant ones. Musical listening, in contrast, requires integration of simultaneous sounds, calling for application of an adequate scheme of tonal coding. Not knowing the scheme forces the listener to “flatten” the tonal richness into a guessed “melody,” similar to how a child “flattens” the perspective organization while drawing a 3D object. There is experimental evidence that non-musicians map pitch to height differently for isolated tones than for melodic intervals, while musicians process pitch automatically (Lidji et al., 2007). This suggests that spatial representation of tonal organization has to be learned.

The type of limitation, experienced by a viewer while encoding/decoding a perspective-based image, can be telling of his tonal hearing (see Appendix 7 in Supplementary Material for the example of tonal/spatial correspondence in perception of Nenets traditional art).

Each scheme of tonal organization abstracts those perceptual features of the living environment that are crucial for the success of the typical representative of a given culture in pursuing his life goals. Sophistication of a scheme is only an answer to the sophistication of socio-cultural conditions for survival: 2-dimensional representation with indefinite intervallic values sufficed for the earliest schemes, whereas later schemes required more dimensions and greater precision.

Tonality, as well as perspective, can be understood as a generic system of representation of 4-dimensional space in a 2-dimensional framework: pitch/time for music and x/y for pictorial art, where the observation point is defined according to the position of the individual perceiver. Similar dimensional conversion characterizes literature, where perceptual reality is described from the angle of the speaker, in 2D fashion, word-per-time (Uspensky, 1995, p. 80). Notably, the onset of naturalistic depiction, literature, and harmonic theory all coincide in Mesopotamian urban culture.

The next landmark, invention of perspective, paralleled chromatic music and lyric poetry that became means of individual self-identification. Here, the cognitive centerpiece that supported them all was concise definition of one's position in relation to something: be it a depicted object, a composed melody in a certain mode, or a subject of a poem.

Crystallization of linear perspective c.1300 further advanced the self-orientation function: defining a specific observation point optimal for the observed object. The epitome was the spectator positioning himself inside the church to view a fresco—thereby the emerging Franciscan order sought to “include” the viewer into the painting, increasing intensity of the aesthetic emotions and attracting more people to join the congregation (Benton, 1989). The musical equivalent here would be the style of musica reservata where, in the polyphonic texture, the listener had to track a theme that expressed a particular emotion related to the lyrics—shaded tonally with the help of chromaticism (Meier and Dittmer, 1956). There, orientation occurred by triadic sonority and melodic diatonicity guiding the listener in mapping spots of tension and relaxation. Polyphony, tonality, and perspective, each in its own way, all faceted the same sense of individualism that flourished circa fifteenth century (Hyer, 2008).

Their symbiosis did not fall apart after homophony replaced polyphony as the dominant compositional method during the mid-eighteenth century48.

37. Audio: Schumann—Scherzo, Piano Quintet Eb Major. Homophonic parts are engaged in complex textural “kaleidoscope” nexus by means of their pitch contours and rhythmic patterns—evident in this graphic visualization by Stephen Malinowski.

Both the implementation of tonality in musical texture and that of linear perspective in 3D representation serve the same purpose: the orientation of an individual in sonic/visual space by means of defining the foreground, background, and the distance between them. The earliest germ of foreground/background relation in harmony is constituted by the notion of “tonal distance” in monody: the remoteness of the auditioned tone from a remembered “tonic.” The idea of coordination of tonics was born in Sumer, with the establishment of the sister-mode family, in MPS modes. It took a while before, in Baroque music, it evolved into “key distance” and still later into “key relationship,” determined by the circle of 5ths. Exploration of “tonal distance” throughout the history of music very much parallels exploration of representation of spatiality in pictorial art, and the parallels do not stop there: the foreground/background dialectics penetrates into sculpture, architecture, and theater (Grauer, 1996).

Tonal organization is more likely to prototype pictorial representation than vice versa. The archetype could be the Central Asian tradition of carpet weaving, where weavers use chants to aid memorization of ornamental patterns—possibly dating to an older Indo-European tradition, mentioned by Homer and Bacchylides (Tuck, 2006).

Music behaviors are more widespread than drawing behaviors, especially in egalitarian-oriented folk cultures. Every known music tradition follows some kind of “music theory” (Blacking et al., 1995, p. 224), whereas the number of cultures that lack tradition and theory of depiction is vast. In tribal society, it is hard to estimate personal use of music by every tribe-member (although in numerous cultures, “personal songs” occupy a prominent place), but evidently collective use of music can take 2–3 h daily in a subsistence society that lives under constant environmental pressure, yet dedicates so much of the tribe's resources to music (Huron, 2003, p. 64). Despite intense professionalization, levels of public engagement in music-making (congregational and amateur music) stayed high even in the industrial West until the 1930s (Rothstein, 2006; Chybowski, 2008). Nothing comparable in scale is known in art history, whether prehistoric49 (Curtis, 2006, p. 235), modern tribal50 (Myers, 2002, p. 63), ancient (Hauser, 1999, p. 15), medieval (Florensky, 1996), or early modern (Pears, 1991): drawing from nature remained the privilege of a few ordained and/or gifted artists until the emergence of the welfare system and public education (Alexander and Rueschemeyer, 2005).

Relatively clear public consensus on what constitutes excellence in music vs. unclear criteria of excellence in visual arts (Milbrath et al., 2015) suggests that it is music that conserves the method of organization for spatial representation. Drawing adheres to a theory only in religious applications, such as iconography, or in professional institutions, such as academy of arts—where such theory noticeably lacks uniformity, with different schools adhering to different techniques and methods of drawing.

Depiction requires extensive knowledge of visual shapes and command of looking for strategies in determining which level of description to use for analysis of the depicted object and “good” copying performance (Pratt, 1984). Hockney (2006) documents the multiplicity of technical gadgets utilized by artists throughout the ages, starting from Euclid51. Producing and reproducing a tune does not require any comparable technology. To succeed in capturing the spatial organization, the drawer must learn to make things look “right” by drawing them “wrong” (Arnheim, 1954, p. 76). Copying a tune by ear is much more intuitive than copying an object by eye.

Unlike the ubiquitous acquisition of singing skills (Hargreaves, 1986, pp. 66–104), “learning” how to draw presents a substantial cognitive barrier to children due to the required alternations from an object- to a viewer-centered representation, involving curbing of “natural” instincts of viewing (Freeman, 1987). Children's drawing improves with age, because they elaborate better strategies to mediate between “what” and “how” to draw (Park and I, 1995). Their knowledge of what to draw typically interferes with their choice of depicting strategy (Tallandini and Morassi, 2008). The effective solution becomes abstraction of a graphic-motor schema adopted per object type (Phillips et al., 1978) based upon the criterion of reducing ambiguity in the picture (Willats, 1977). The entire development of pictorial skills throughout childhood is shaped by this goal (Morra, 2008). Notably, children are more prone to copy each other's pictures than to infer from life (Wilson and Wilson, 1984).

Reproducing a tune is nowhere near as technical, intellectual, problematic, and cognitively demanding as depicting an object. The Russian state general education program requires recognition/reproduction of parts in multi-part settings from 10-year-old children (Apraksina, 1983, p. 110), and the majority of the population has been complying with this program for a few decades. Children of around that age can infer a tune from a polyphonic folk-song by themselves without any assistance (Naumenko, 2013, p. 124). Flattening of a polyphonic song into a monophonic version is a common trait in many ethnic cultures (Jordania, 2006, pp. 376–378). Conversion of musical texture seems to present no obstacle comparable to 3D-2D conversion. No wonder musicality is considered a biological trait of Homo (Morley, 2013, p. 5), whereas the ability for realistic depiction is reserved for the specially gifted (Golomb, 2004, pp. 201–278).

Music occupies a prominent position in child development, only magnifying toward adolescence. In contrast, drawing attracts children at the age of 2–3 years, waning afterwards (Winner, 2007). An adolescent at large becomes estranged from drawing, once so favored at a younger age, in polar opposition to his musical interests (Vygotsky, 1984, p. 61). Modern 8–18 year olds spend on average 2.5 h daily on music (Rideout et al., 2010). The underlying reason must be the affiliation of music with cognitive consonance. Vygotsky (2008, p. 397) underlines that childhood presents the most tragic stage in life, when the organism finds itself in the greatest disharmony with the environment, being under constant pressure to quickly bring itself “in-tune” with its surroundings. This cognitive dissonance calls for “music of upbringing,” that is, harmonious nurturing. And nothing answers this call better than music itself.

The Course of Tonal Evolution

The existence of 14 schemes of tonal organization (see Appendix 8 in Supplementary Material) suggests that tonal order in music manifests a general mental organizational scheme, which serves as a cultural adaptation of the human perceptive apparatus to a life style optimal in a certain environment (Lomax, 2004, p. 281). Younger generations in a socio-cultural formation keep reproducing the same schemata as long as the same life style persists. Music appears to act as a principal enforcer of this mental enculturation. The emotional nature of music, its affinity with entrainment, and its capacity to reduce cognitive conflict, all make tonal organization a powerful tool in shaping a methodology of thinking within a given community. Gravitational laws of musical virtual reality reflect perception of physical laws of phenomenological reality, and link the navigation strategies for both of them.

The rigorous and systematic organization of music is a product of pressure imposed by natural selection to adequately collect information about the kinds of objects that exist in the environment, what they do, and how they can possibly be used (Terhardt, 1995). Through a complex process of conversion of frequency data into pitch information, the brain exercises an organizational scheme that is archetypical for majority of music users in a given community. Nature provides humans with some “template” of sensory categories in their default values (i.e., preference for vertical and horizontal consonance, binary meter and frequency range of the speaking voice), and cultural application modifies this “template” into a “document” that would serve a particular need, common for a given community. Such “document” seems to work like a script for our senses, organizing them to operate following the coherent set of principles (Walker, 2004). Auditory imagery is known to evoke visual and/or kinesthetic experience, which in turn influences auditory imagery (Hubbard, 2013). Capacity to make emotional judgments about imagined music (Halpern, 2012) makes music a par excellence tool for social engineering.

Pitch presents a perfect medium for exercising discretization. Human hearing is inherently zonal: it cannot distinguish between different pitches in a clear-cut manner; there is a zone between frequencies of tones in a familiar PS, upon hearing of which the listener cannot reliably tell which pitch it is. Even extremely gifted and well-trained soloists observe a 6–21 cent inter-zonal threshold (Garbuzov, 1980, p. 205). Parncutt and Cohen (1995) also specify a 10–20 cent threshold52.

This “margin for error” is responsible for mutability of pitch detection, requiring some error-correction mechanisms53. Jordan (1987) demonstrated that listeners can discriminate intervals of 25–50 cents, but categorize them in terms of diatonic IS, and musicians do so more than non-musicians. Models of tonal organization can be viewed exactly as algorithms designed to minimize indiscreteness in detection and production of pitches in a given socio-cultural formation. Similar mechanisms must be at play in the perception of other important attributes of musical sound. Garbuzov experimentally identified the zonal nature of perception of dynamics (Garbuzov, 1955), timbre (Garbuzov, 1956), tempo, and rhythm (Garbuzov, 1950). Garbuzov's pupil, Rags (1980, p. 33), sums them up in the following definition: “Zone is the measure—a quantitative characteristics of a musical conceptualization…of the inter-relationships between certain qualities of musical tones,” the product of an attempt to discretize gradations in “fuzzy sets” of auditory parameters.

It appears that during the past century the zones and zonal thresholds have shrunk in perception of Western classical music (Rags, 1980, p. 33). For non-tempered intervals, inter-zonal thresholds are about 2–2.5 times narrower than the thresholds of the adjacent zones (Garbuzov, 1980, pp. 89–99). Moran and Pratt (1926) confirmed this ratio for the tempered scale. Comparative investigation of how Western and native Javanese musicians estimate intervals in Western and gamelan music demonstrates that Javanese listeners have wider zones of pitch uncertainty in auditioning Western music than Western listeners do in Javanese music (Perlman and Krumhansl, 1996). Léothaud et al. (1997) report that in Central African octave-equivalent polyphony, differences as large as 80 cents54 between two structurally equivalent keys are still considered a unison.

Another trend, noticeable throughout the evolution of tonal organization, is the progressive increase of the number of pitch zones within an ambitus of a mode (Rags, 1980, p. 42).

The more modern the stage, the more pitch gradations fit within the same frequency band (from 2 entities in khasmatonal to 17 entities in tonal key), and the narrower are the zones reserved for each of the pitch entities (from up to 1½ octaves for khasmatonal to 24 cents for tonal key; Table 2).


Table 2. Pitch zone discrimination throughout the evolution of tonal organization in low voice vocal music.

Ten stages of tonal evolution are mostly cumulative (Figure 5)55. Alekseyev (1986) illustrates how archaic principles of pitch organization are still present in modern music. Khasmatonal principle manifests itself in folk yodeling or rock music growling. Ekmelic principle comes out in gliding inflections and half-spoken pitches of blues or rap.


Figure 5. 14 schemes of tonal organization that occur through 10 stages. Stages 1–6 are cumulative until the divergence between pentatony and heptatony. Second divergence occurred between chromatic and diatonic MPSs, bringing to life hypermode. Chromatic MPS produced yet another divergence between hemiolic modality and tonality. The importance of this is that competence in the first six schemes is shared by all music users, who can switch between operating in any of them and their “native” music system: e.g., a mother raised on music of Western tonality employs khasmatonal organization in her motherese. Divergence, on the other hand, requires learning of a scheme that is diverged in relation to the music user's native system.

A music-user from a later stage of tonal development possesses the apparatus to decode the representational schemata of earlier cultural formations, but a listener raised exclusively on music of earlier stages cannot adequately decode spatial aspects in music of a later stage. This is most obvious in the communication of motherese: Western mothers have no problems resorting to the khasmatonal organization in their singing despite their competence in tonality (see Presentation-3 from Part-I), while for their babies khasmatonal music remains the only form of tonal organization that they seem to be able to fully follow. Their perception of tonality is severely limited (Trehub, 1987). Walker (1987) concludes his overview of the experimental studies of cross-cultural pitch perception: “the more remote from Western auditory and visual cultural artifacts the subjects, the less likely they are to select the kind of visual metaphor for sound than those trained in Western musical practices consistently select.”

General Summary

Zemtsovsky (2012) considers the very concept of musical tone to carry a signature of tonal organization. Artificiality of periodic oscillation in musical vocalization calls for justification. Hearing a single pitched tone generates expectation, and hence, induces gravitational projection toward the following pitched tone. In order to make sense, a tone must proceed to another tone (or be repeated), forming an intonation. And a melodic intonation becomes not only a brick in the melody, but a quantum of gravity in the virtual reality of music. The entire progression of 11 modal schemes proceeds toward “triadic” mentality of perspectival depth and polyphonic texture—this direction is evident not only in Western music, but in other cultures as well, where it materializes in textures of various density depending on the social order, environmental factors, and cultural exchange (Vinogradov, 1973, p. 106).

It appears that progressive increase in tonal acuity is the result of evolution in melodic complexity. Crystallization of melodic contour typology brought to life ekmelic pitch regions to replace much wider and less definite khasmatonal registers. Forging of absolute intervals of 2nd, 3rd, and 4th between the anchored tones replaced ekmelic regions with oligotonal pitches. Triad induction caused variable tuning of unstable degrees to recede to bi-optional tuning. Transposition-by-degree of pentatonic and hemitonic motives fixed unstable degrees in tuning, and narrowed the pitch zone to a semitone. Institution of chromatic alterations further narrowed the zones below a semitone. This entire process of “pitch-zooming” presents an adaptation to the cultural need for greater discrimination.

Parallel increase in PCs and establishment of complex tonal hierarchy within a mode made encoding of tonal relations effective enough to support simultaneous data transmission via multiple sound streams, up to 7-componental music texture. Each of the components simultaneously encodes information by means of idioms of pitch, rhythm, meter, and harmony, plus the contribution of expressive parameters of dynamics, tempo, articulation, timbre, and music form56. Such unprecedented density of information makes music a par excellence tool for abstraction of important features in a living environment, and mediation of their schemata between the members of the same social group in the best interests of each individual.

Language is an important factor that shaped melodic intonation and its tonal organization. Propensity of language communication to cultivate cognitive dissonance finds a counterbalance in propensity of musical communication to cultivate “cognitive consonance”: a state of inspiration and empowerment, characterized by experience of integrity, clarity, and consistency of attitudes in a person, as well as togetherness within a group. People sing together to share the same experience, but talk one after another, usually to resolve an issue. Listeners routinely engage in musical behaviors for relaxation or recreation, which is less common for speech that normally cannot run in a semi-automatic regime, unlike music. The less stressful manner of experiencing music is likely the result of different strategies preferred in comprehension of music vs. speech. The tendency to fuse spectral content dominates the perception of music, while the tendency to segregate phonemes prevails in the perception of speech (Bregman, 1994, pp. 461–589). Then, the need to differentiate between the spectral elements greatly promotes the realization of opposition and discrepancy within the sound material, whereas the need to integrate partials into musical tones, and tones into chords, promotes the realization of similarity.

It looks like both forms of communication co-evolved from some primordial animal-like vocalization, defining each other through the different treatment of pitch—each forging its dedicated processing system (Zatorre and Baum, 2012). The ethnomusicological evidence leads one to believe that melodic intonation was formed by borrowing and exaggerating the pitch contour of the conventional verbal intonations, while contrasting the verbal timbral organization and articulation style. The decidedly “artificial” manner of khasmatonal and ekmelic intonations was shaped by counter-distinction to the “natural” manner of speech. Stabilization of pitch in oligotonal music led to reduction in timbral complexity, changing the manner of its opposition to speech. Oligotonal music started contrasting speech primarily by greater harmonicity of its tones and their timbral uniformity—a “bel canto” style of sound production, employed by a given music culture as “canonic.” Speech intonation, on the other hand, featured much more diverse spectral content of each of its phonemes, instituting the fission/fusion antithesis. At this point permanence in tuning of PS and hierarchic ordering of pitches—as well as timbral uniformity—became the “musical” traits. This border is not clear-cut: orators and poets often “melodize” their speech (Nazajkinskij, 1972, p. 261), whereas composers engage in a “recitative” melodic style, sometimes deliberately emulating verbal intonation (Pearl, 2006). However, overall, music tends to contrast speech in pitch organization—most evident in tonal languages, where the musical pitch contour of traditional songs often violates the normative intonation contours of the lyrics (List, 1961).

In a compensatory manner, cardinal stages in the development of language tend to coincide with shifts in the method of tonal organization. Development of sentence syntax conditioned the transition from khasmatonal to ekmelic mode. Emergence of epic poetry promoted oligotonal music. Introduction of literacy boosted the emergence of philosophy, law, and science, promulgating prescriptive theory of harmony and diatonic music.

The “cognitive consonance” function works in both the horizontal and vertical dimensions of music. Any ensemble attempt to sing a melody with fixed pitches would contain asynchrony, when one performer produces a new pitch while the other still carries the previous pitch, thereby converting a melodic interval into a harmonic one. Just as a single pitch is prone to generate a 2-tone intonation, it is prone to generate a 2-tone vertical harmony. And this is where spatial representation comes into play: 2-tone vertical harmony is cross-modally equivalent to 2D pictorial representation, and 3-tone harmony (chord) to 3D. In the same way that it takes at least three objects to hint at three distinct levels of pictorial depth (Cook et al., 2008), three simultaneous tones imply a certain texture. Both representations manifest the same higher order cognitive scheme of 3-factor opposition that typifies the Renaissance astronomy, major/minor tonal organization, and linear perspective, all setting the framework for “triadic cognition”: the ability to reason by 3-way associations (“X in relation to Y in light of Z”; Cook, 2012, p. 12). The upgrade from dyadic thinking constitutes the greatest achievement of human civilization57, underlying all the technological achievements. Possibly, tetradic thinking is there to follow.

By cross-modal implication, musical intonation differs from verbal. Speech is not heard in terms of visualizing auditioned tones the way music is. The bifurcation point must have occurred during the oligotonal stage:

• Speech adhered to “low-cognition” standard of correspondence “higher pitches—good disposition” and “lower pitches—expression of aggression”—shared by communication of many mammals (Ohala, 2006).

• Music adhered to a “high-cognition” standard, present only amongst humans—already evident in ekmelic organization, where ascending inclination corresponds to stress/climax, while descending—to relaxation/resolution (Alekseyev, 1976, p. 129).

Synesthetic capacity, so vivid in music, must be responsible for re-wiring of the registral associations—and this should be viewed as yet another display of integrative nature of music: musical tones trigger 3D spatial representation, unlike “flat” sounds of speech58. Establishment of centripetal gravity and permanent tuning throughout an oligotonal song brought to life this “hologram” effect.

Melodic intonation is immensely important for the human culture of thinking. Once generated, melodic intonation takes on a life of its own. It can be adopted by many music users and embedded into a mode59. It can be developed by the contribution of each of them. Historic development of intonation resembles epistemology and historic changes in syntax of language. Subsequently, musical mode functions as a “complex dynamic and self-regulated system” that operates on genetic-like principles of optimal adaptation to the cultural environment (Alekseyev, 1976, p. 113).

Analysis of intonations in a song enables reconstruction of the gravitational system observed by the creator of that song—in a way similar to a paleontologist's reconstruction of a fossil from a bunch of bones. Mode is capable of showing which intonations have stayed popular within a community of users over an extended period of time60. This makes it possible to generalize a gravitational scheme exercised in the music of that culture, and to formulate the cognitive style peculiar to a given historical socio-cultural formation. Understanding this cognitive style allows for extrapolating a method of musical order in organization of other cultural activities. Overall, music appears as a naturally formed testing ground for various principles of representation of reality, conscious and unconscious (Hubbard, 2007), used to prime emotional reactions to music idioms, thereby establishing and cultivating conventional standards of intellectual and emotional intelligence.

Author Contributions

The author confirms being the sole contributor of this work and approved it for publication.

Conflict of Interest Statement

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


Acknowledgments

I would like to take the opportunity to thank Josef Jordania, Izaly Zemtsovsky, Alma Kunanbayeva, Leon Crickmore, Eduard Alekseyev, and Leonid Perlovsky for helping me to formulate the concepts for this article. My special thanks go to Sheila Bazleh for editing the text of my manuscript. I am grateful to Stephan Vitas, Jon Solomon, Stefan Hagel, John Curtis Franklin, Anthony Seeger, Eduardo Coutinho, Robin Wilkins, Bo Lawergren, Stephen Malinowski, Marina Korsakova-Kreyn and William Griffin for their insightful comments and assistance with preparing examples.

Supplementary Material

The Supplementary Material for this article can be found online at:


1. ^The term “tonal organization” here encompasses not only what is called “tonality,” but every form of integrating musical tones together in a harmonious ensemble, including music where fundamental frequencies are not well marked in the spectral content, e.g., gamelan or throat singing.

2. ^The term “mode” is a modern-day concept applied retroactively to address the tonal coherence of a pitch set (PS) selected as the formative basis for music-making: each set adhered to a particular scale, probably associated with some tonal focus. Unfortunately, the only surviving melody is insufficient to infer the modal characteristics of the Babylonian music system (West, 1994).

3. ^According to UET-VII-126 tablet, the open strings tuning (pitum) is G-A-B-C-D-E-F-G-A, producing the same IS (tone+tone+semitone+tone+tone+semitone) in either direction (Crickmore, 2009b).
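
The symmetry claimed in footnote 3 can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the modern diatonic reading of the note names given in the footnote, and the names `PITCH_CLASS` and `step_sizes` are hypothetical helpers, not terms from the tablet or from Crickmore.

```python
# Semitone positions of the natural notes (an assumed modern diatonic reading).
PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def step_sizes(notes):
    """Return the ascending step between each pair of neighbouring notes, in semitones."""
    return [(PITCH_CLASS[b] - PITCH_CLASS[a]) % 12 for a, b in zip(notes, notes[1:])]

tuning = "G A B C D E F G A".split()
steps = step_sizes(tuning)

# The step pattern reads the same upward and downward if it is a palindrome.
is_symmetric = steps == steps[::-1]
print(steps, is_symmetric)
```

Here 2 stands for a tone and 1 for a semitone, so the first six steps from either end are tone, tone, semitone, tone, tone, semitone.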

4. ^Pythagorean Triplets probably originated from the Babylonian tablets (Skinner, 2006, pp. 17–18). Sacredness of number 9 in Antiquity stemmed from the Babylonian Lunar calendar (Wiltse and Palmer, 2011). Babylonian proportions were perpetuated in the Parthenon (Kappraff, 2006).

5. ^“Displacement” here refers to the last tone of the melodic 2nd, which perceptually “erases the trace” of the initial tone in short-term memory. This effect seems to be universal (Larson and McAdams, 2004; Deutsch, 2012; Larson, 2012) and therefore applicable to Ancient Greek music.

6. ^Such as arpeggio passages.

7. ^“Metric” division of a 4th is comfortably visible on a woodwind/string instrument through the placement of holes/frets—and relatable to the resulting changes in pitch (Beliayev, 1990). The difference between perfect 5th (702 cents) and 4th (498) produces major 2nd (204). The succession of 2nds marks the I, II, and III degrees, thereby defining the minor 2nd (90 cents) left out between the III and IV degrees (90 = 498−2 × 204).
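
The arithmetic in footnote 7 can be reproduced from the frequency ratios. This is a minimal sketch that assumes pure Pythagorean ratios (3:2 for the 5th, 4:3 for the 4th) and the standard definition of the cent as 1/1200 of an octave; the function name `cents` is illustrative.

```python
import math

def cents(ratio: float) -> float:
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)

fifth = cents(3 / 2)                          # perfect 5th
fourth = cents(4 / 3)                         # perfect 4th
major_second = fifth - fourth                 # the tone left between 5th and 4th
minor_second = fourth - 2 * major_second      # remainder of the 4th after two tones

print(round(fifth), round(fourth), round(major_second), round(minor_second))
# prints: 702 498 204 90
```

The rounded values match those quoted in the footnote; unrounded, the remainder is about 90.2 cents.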

8. ^In Ancient Greek theory our Phrygian tetrachord was called Dorian, and viewed as the trademark of Greek civility, ideal for representation of universal harmony (Crickmore, 2006). This Greek “Dorian” is congruent to the Babylonian Isartu [“normal” in Akkadian (Kilmer, 1971)] mode (Duchesne-Guillemin, 1984), by its name suggesting its commonality in Mesopotamia.

9. ^Nicomachus considered conjunct E-F-G-A-Bb-C-D the primordial scale comprised by conjoining tetrachords, only later disjointed by Pythagoras to form an octave scale (West, 1992, p. 176).

10. ^Modal mutability of multiple anchors was explained in Part-1.

11. ^Such reproduction of a progression of musical tones that retains the intervallic relations between all successive tones (C-D-E-F -> D-E-F#-G), “natural” for singers who don't have perfect pitch.

12. ^Such reproduction that adjusts the intervallic relations between the successive tones to match the PS (C-D-E-F -> D-E-F-G), “learned” by all singers.
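
The two kinds of reproduction described in footnotes 11 and 12 can be contrasted in a short sketch. It assumes a twelve-semitone chromatic spelling with sharps and takes the white-key C scale as the pitch set; the function names `exact_transpose` and `tonal_transpose` are hypothetical labels, not terms used in the article.

```python
CHROMATIC = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
PITCH_SET = ["C", "D", "E", "F", "G", "A", "B"]  # assumed pitch set (PS)

def exact_transpose(melody, semitones):
    """Shift every tone by the same number of semitones, keeping all intervals intact."""
    return [CHROMATIC[(CHROMATIC.index(n) + semitones) % 12] for n in melody]

def tonal_transpose(melody, degrees):
    """Shift every tone by scale degrees, so the result stays inside the pitch set."""
    return [PITCH_SET[(PITCH_SET.index(n) + degrees) % 7] for n in melody]

melody = ["C", "D", "E", "F"]
print(exact_transpose(melody, 2))   # ['D', 'E', 'F#', 'G']
print(tonal_transpose(melody, 1))   # ['D', 'E', 'F', 'G']
```

The first result leaves the pitch set (F#) to preserve the intervals; the second stays within the set at the cost of changing the second interval from a tone to a semitone.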

13. ^Joseph Yasser also observed similar lineage in successive development of what he called “quartve,” “quintave” and octave equivalence in evolution of tonal organization (Yasser, 1948).

14. ^Musicians in that study had an easier time discriminating the dominant from the repeated tonic, indicating that the “dominant” function is learned through music-making.

15. ^West (1981, p. 127) also describes a similar process of modal genesis of sister-modes by means of modal transposition, producing Dorian, Phrygian, and Lydian scales in Helladic Greece, according to the testimony of Alcman, 7th century BC.

16. ^A variety of songs accompanied the liturgy in Babylonian temples, with no fewer than 6 different compositions (Löhnert, 2011), at least some of which must have proceeded without stopping while requiring different modes for their liturgical expression.

17. ^Something similar to this must have happened in Helladic music, where the diatonic genus apparently allowed for building the “sister-scales” from any of the tones of a system—and not just stable tones, as in Hellenic music (Franklin, 2002). Subsequently, “old” music was criticized for producing tunes that all sounded “the same” (probably due to the perceived lack of “expressive tuning” that resurfaced again in chromatic music of the 5th century BC).

18. ^This is achieved by: (1) increasing the number of strings on the harp/lyre; (2) using the fingerboard to play closed strings on the lute.

19. ^In Mesopotamia, genre classification was the business of scribes, and proceeded according to the intended manner of performance, marked in notation with the help of the subscripts and rubrics (Rubio, 2009, p. 23).

20. ^Sumerian culture marks the onset of professionally produced music, coining the term 'šir' that implied authorship of a song (Krispijn, 2010).

21. ^Evidently, all registered love-songs belonged to one of seven standard modes, three of which were by far more common for expression of love. Specialization required expertise from the musicians—to prepare music ahead of time, and their performance could score varying degrees of success, reflected by difference in wage (Ziegler, 2011).

22. ^Because of its conservative nature (see Part-1), the pentatonic MPS lagged behind the heptatonic. The Old Babylonian tablet CBS-1766 postulated heptatonic tuning (Dumbrill, 2007) in the eighteenth century BC (Friberg, 2011), likely descending from the older tradition, perhaps the twenty-sixth century, when the Lyre of Ur was made (Rowan, 2013).

23. ^Testifying to universality of Mazel's theory of odd/even functional induction (Mazel, 1952) (see Part-1).

24. ^Like Mesopotamian symmetry, this whole-tone division is foreign to local folk cultures of the Far East, supporting the point that opposition of village folk to palace/temple authored music corresponds to opposition of pre-MPS to MPS systems, and is a cross-cultural trait that has cognitive roots.

25. ^Cuneiform writing was popular amongst commoners—cursive signs and stenography were widespread (Veldhuis, 2011). Functional and technical literacy were taught informally through apprenticeships and “community of practice” (Robson, 2008, p. 52).

26. ^The issue of competence versus incompetence is expressly addressed in a surviving dialogue between a scribe and his student, which asserts the necessity to learn whatever it takes to figure out a solution (Vanstiphout, 1997).

27. ^The third type, pentachord-based, is discussed in Appendix 2 in Supplementary Material.

28. ^Hexáechos literally means “six modes” in reference to the smaller version of the principal Byzantine MPS, the oktôêchos (Kholopov, 2006, p. 68). The Russian implementation of hexáechos is called “obykhod” (“common”), codified in 1772, but probably much older (Kholopov, 2004). The influence of obykhod on Russian folk music is tracked to the fifteenth century (Rudneva, 1994, p. 121).

29. ^Throughout Antiquity and the Middle Ages, a common tool for conception and assessment of tonal organization was the monochord with letters designating pitches (Bower, 2002), a proof of musical thinking in terms of the “legitimate” PCs.

30. ^An unstable mesotonal/multitonal degree can appear chromatic, misleading ethnographers in their transcriptions (Ambrazevičius and Wiśniewska, 2008). Correct spelling of pitches can be verified by computer analysis of the frequency of each discrete musical tone within a song, then defining its PS by mediating between the average values received by incrementing the degrees in the ambitus and the melodic intonations established by morphological analysis of a song (see Presentation-1 for an example).

31. ^Thus, maqam encompasses both inflections and alterations, possibly under Western influence (Yöre, 2012).

32. ^However, it does not defy the overall pan-harmonization of the PCs in an MPS, since it is minuscule, limited usually to just a few degrees (most notably VII), and affects mainly vocals and string instruments.

33. ^Every music theory defines a finite number of ways to modulate from one mode to another. Omnipresence of such regulation indicates that music-users remember the typical “moves” from mode to mode (Koelsch et al., 2003).

34. ^Interval-class set (ICS) is the collection of intervals that characterize PCS. Interval-set class (ISC) is the ICS of an MPS mode. Hearing chromatic modulation requires knowledge of the repertory of ICSs and ISCs.

35. ^Chinese chromatic temperament most likely grew out of the practice of combining composite pentatonic scales (Bagley, 2005).

36. ^Ancient music afforded what we call “accidentals” (Wulstan, 1971). The term “kampe” might have implied pitch bending alien to “harmonia”—“exharmonic” melodic deviations unidentifiable by reference to the departed mode (Franklin, 2013). The earliest notation of them is in the Zeno Archive papyrus, 250 BC (West, 1992, p. 196).

37. ^Hellenic culture pioneered a semiotic system of “musical emotions”—specific affective states conventionally ascribed to specific musical structures (Woodward, 2011). The outline of this system I present in my paper:

38. ^Enharmonic music and modulations were poorly suited to the lyre and kithara, which remained the principal tools of teaching music in Greco-Roman education (Marrou, 1956, p. 134). Lack of precise instrumental reference [the aulos could not support didactic descriptions of microtonal tuning due to its construction (West, 1992, p. 96)] must have been the reason for Aristoxenus to exclude enharmonic modulations altogether from his exhaustive list of modulations (Hagel, 2009, p. 44).

39. ^Widely used, “soft” and “hemiolic” gradations blurred the theoretic differences between chromatic and enharmonic intervallic divisions (153).

40. ^See Part-1, Appendix-2 in Supplementary Material.

41. ^Setting the precedent for association of low register with stability and relaxation.

42. ^Hellenistic chromatic music set the direction for Western civilization by pioneering the notions of error [kheiron] (Losev, 2000, p. 623), error-correction [diorthosis] (Barker, 2004, p. 342) and compression, implied in the idea of “complexity” [poikilia] that meant artificiality of melody, harmony or rhythm/meter, which upset unaccustomed listeners (Franklin, 2013). Technically, this “complexity” involved stuffing more expressive information into the available compositional space, making perception of music burdensome.

43. ^Just like the melodic tones appear to move in a particular way, the sequence of depicted objects suggests to a viewer a particular vector of scanning an image—by the interaction of its perceived shape, size, and centeredness.

44. ^At times, perspective and music attracted the same artist: Leonardo was equally versed in painting and in composition (Mantero, 2001), and applied “musical proportions” in his “Last Supper” (Brachert, 1971).

45. ^The presence of “accents” is best revealed when several ethnomusicologists transcribe a song, producing notations of the same performance which nevertheless feature different spellings of pitches (List, 1974).

46. ^It remains to be tested how much representatives of different geographic regions differ in their “local accents” of tuning into the tonal space of a Western key, and what tuning range defines the degrees of tonality cross-culturally.

47. ^Parncutt and Cohen (1995) consider 40 cents the threshold of perceptibility of steps in a scale.

48. ^A typical common-practice-period polyphonic-homophonic texture subordinates all parts/voices to a one-at-a-time melodic line (with possible doubling), where each of the textural components receives its own discrete function: principal melody; secondary melody (imitative/contrasting); accompanying melodic figuration; chordal/dyadic accompaniment; bass; pedal/background layer (Kholopova, 1979, p. 37). That still keeps in place about 4-5 textural components.

49. ^Paleolithic paintings suggest that much time and effort was put into artistic training: art historians regard many discovered images as “pupil's drawings” and “rough drafts” for “master-works” (Hauser, 1999, p. 9), which would be hard to accomplish without privileges relieving an artist from food-seeking.

50. ^Customs observed in Aboriginal society confirm the exclusive nature of pictorial image production: drawing presents subjects, which are associated with dreaming, entering a world of spirits and requiring restricted access of only the ordained adult male members of the tribe.

51. ^Sophisticated theory is required by the very task of 3D-to-2D shape conversion: Schäfer (1919, p. 254) describes a German peasant protesting against the crookedness of the roof in painter's sketching of his house—all to be surprised with the “magic” upon completion of the drawing, when the picture looked just right.

52. ^Certain types of music can, however, promote greater pitch acuity: the frequency resolution of professional Western classical string ensemble players is significantly higher, with the finest discrimination reaching 1 cent and the grossest error 20 cents (Shackford, 1962).

53. ^Jordan and Shepard (1987) disclosed such mechanism when they microtonally adjusted the major scale tones to stretch over an octave, or rounded up the intervals between the degrees—both distortions were overlooked by listeners.

54. ^Although the term “cent” comes from the well-tempered system, there is no other convention to refer to the micro-intervallic distances between the tones.

54. ^Db has been tuned lower than C# in vocal/string performance since at least the 1800s (Barbieri and Mangsen, 1991), presenting different PCs. Listeners' perception of vocal intonation encourages singers to exaggerate sharpening/flattening (Geringer et al., 2015)—to increase expressiveness (Sundberg et al., 2013)—therefore, vocal intonation distinguishes between 17 PCs in a key, and not 12, as commonly believed.

55. ^Thus, the Lithuanian folklore bears traces of 3 stages of tonal development: (1) the proto-Indo-European strict pentatony; (2) the later hexatonic development; (3) the Balto-Slavic heptatony, of probable Hellenic influence (Leisi, 2002).

56. ^For more information, the fuller version of this paper will be available at

57. ^Remarkably, the birth of perspectival depiction, chromatic music and mathematical 3-factor opposition all concur in Hellenic Greece—expressed in the arithmetic, geometric, and harmonic proportions of music: which bind two extremes by a single mean term (Lawlor, 1982, pp. 80–82).

58. ^Also, unlike music, prosaic speech cannot generate the experience of physical motion in a listener (Dalla et al., 2013).

59. ^Computer simulation confirms the possibility for a repertoire of intonations to evolve “from scratch” by one music-user imitating the intonation of another with whom he wishes to bond, thereby creating a network of users interacting with one another (Miranda, 2002).

60. ^Ambrazevičius (2004) demonstrates how the acoustic analysis of pitch distribution in the recordings of folk music can reveal modal organization characteristic of different epochs, from modern diatonic to ancient “squeezed anhemitonics.”

References


Adams, K. (2010). A new theory of chromaticism from the late sixteenth to the early eighteenth century. J. Music Theory 53, 255–304. doi: 10.1215/00222909-2010-004

Alekseyev, E. (1976). Problems in Genesis of Mode [Проблемы формирования лада]. Moscow: Muzyka [Музыка].

Alekseyev, E. (1986). Early Folkloric Intonation. Pitch Aspect [Раннефольклорное интонирование: звуковысотный аспект]. Moscow: Sovetskii Kompozitor [Сов. композитор].

Alexander, V. D., and Rueschemeyer, M. (2005). Art and the State: The Visual Arts in Comparative Perspective. London: Palgrave Macmillan. doi: 10.1057/9780230507920

al Faruqi, L. I. (1985). The suite in islamic history and culture. World Music 27, 46–66.

Ambrazevičius, R. (2004). “Scales in traditional solo singing: models and results,” in Proceedings of the Conference on Interdisciplinary Musicology (CIM04) Graz/Austria, 15-18 April, 2004, eds R. Parncutt, A. Kessler, and F. Zimmer (Graz: Druckerei Khil), 1–5.

Ambrazevičius, R., and Wiśniewska, I. (2008). Chromaticisms or performance rules? Evidence from traditional singing pitch transcriptions. J. Interdiscipl. Music Stud. 2, 19–31. Available online at:

Andersen, K. (2008). The Geometry of an Art: The History of the Mathematical Theory of Perspective from Alberti to Monge. Berlin: Springer.

Apraksina, O. A. (1983). Methodology of Musical Education in Primary School [Методика музыкального воспитания в школе]. Moscow: Prosvesheniye.

Arnheim, R. (1954). Art and Visual Perception: A Psychology of the Creative Eye. Berkeley, CA: University of California Press.

Arnheim, R. (1984). Perceptual dynamics in musical expression. Music. Q. 70, 295–309. doi: 10.1093/mq/LXX.3.295

Arom, S. (2010). “Outline of a syntax of chords in some songs from samegrelo,” in Proceedings: The Fifth International Symposium on Traditional Polyphony: 4–8 October, 2010, eds R. Tsurtsumia and J. Jordania (Tbilisi: Tbilisi State Conservatoire), 266–277.

Atcherson, W. (1973). Key and mode in seventeenth-century music theory books. J. Music Theory 17, 204–232. doi: 10.2307/843342

Atkinson, C. M. (2008). The Critical Nexus: Tone-System, Mode, and Notation in Early Medieval Music. Oxford: Oxford University Press. doi: 10.1093/acprof:oso/9780195148886.001.0001

Bach, C. P. E. (1949). Essay on the True Art of Playing Keyboard Instruments, ed W. J. Mitchell. New York, NY: Norton.

Bagley, R. (2005). “The prehistory of chinese music history,” in Proceedings of the British Academy, Vol. 131, ed P. J. Marshall (Oxford, UK: OUP/British Academy), 41–90. doi: 10.5871/bacad/9780197263242.001.0001

Barbera, A. (1977). Arithmetic and geometric divisions of the tetrachord. J. Music Theory 21, 294–323. doi: 10.2307/843492

Barbera, A. (1984). The consonant eleventh and the expansion of the musical tetractys: a study of ancient pythagoreanism. J. Music Theory 28, 191–223. doi: 10.2307/843532

Barbieri, P., and Mangsen, S. (1991). Violin intonation: a historical survey. Early Music 19, 69–88. doi: 10.1093/earlyj/XIX.1.69

Barbour, J. M. (2004). Tuning and Temperament: A Historical Survey. New York, NY: Dover Publications.

Barker, A. (1978). Music and perception: a study in Aristoxenus. J. Hellen. Stud. 98, 9–16. doi: 10.2307/630189

Barker, A. (2004). Greek Musical Writings: Vol. 2, Harmonic and Acoustic Theory. Cambridge, UK: Cambridge University Press.

Barker, A. (2007). The Science of Harmonics in Classical Greece. Cambridge: Cambridge University Press. doi: 10.1017/CBO9780511482465

Belaiev, V. (1963). The formation of folk modal systems. J. Int. Folk Music Council 15, 4–9. doi: 10.2307/836227

Beliayev, V. (1990). “Modal systems in the traditional music of the USSR [Ладовые системы в музыке народов СССР],” in Viktor Mikhailovich Beliayev [Виктор Михайлович Беляев], ed I. Travkina (Moscow: Sovetskii Kompozitor [Советский композитор]), 223–377.

Belting, H. (2008). Perspective: Arab mathematics and renaissance western art. Eur. Rev. 16, 183–190. doi: 10.1017/S106279870800015X

Bent, M. (1984). Diatonic ‘Ficta.’ Early Music Hist. 4, 1–48. doi: 10.1017/S0261127900000413

Benton, J. (1989). Perspective and the Spectator's pattern of circulation in assisi and padua. Artibus Historiae 10, 37–52. doi: 10.2307/1483283

Bien, N., ten Oever, S., Goebel, R., and Sack, A. T. (2012). The sound of size: crossmodal binding in Pitch-Size synesthesia: a combined TMS, EEG and psychophysics study. Neuroimage 59, 663–672. doi: 10.1016/j.neuroimage.2011.06.095

Blackburn, B. J. (2013). “On compositional process in the fifteenth century,” in Musical Theory in the Renaissance, ed C. C. Judd (Burlington, VT; Farnham: Ashgate), 25–100.

Blacking, J., Byron, R., and Nettl, B. (1995). Music, Culture, and Experience: Selected Papers of John Blacking. Chicago, IL: University of Chicago Press.

Bower, C. M. (2002). “The transmission of ancient music theory into the middle ages,” in The Cambridge History of Western Music Theory, ed T. Christensen (Cambridge: Cambridge University Press), 136–167.

Brachert, T. (1971). A musical canon of proportion in Leonardo Da Vinci's last supper. Metropol. Mus. Art Bull. 53, 461–466. doi: 10.1080/00043079.1971.10790543

Brattico, E., Tervaniemi, M., Näätänen, R., and Peretz, I. (2006). Musical scale properties are automatically processed in the human auditory cortex. Brain Res. 1117, 162–174. doi: 10.1016/j.brainres.2006.08.023

Bregman, A. (1994). Auditory Scene Analysis: The Perceptual Organization of Sound. Cambridge MA: MIT Press.

Brown, H. (1992). “Cognitive interpretations of functional chromaticism in tonal music,” in Cognitive Bases of Musical Communication, eds M. R. Jones and S. Holleran (Washington DC: American Psychological Association), 139–160.

Broze, Y., and Huron, D. (2012). “Does higher music tend to move faster? evidence for a pitch-speed relationship,” in Proceedings of the 12th International Conference on Music Perception and Cognition (ICMPC) and 8th Triennial Conference of the European Society for the Cognitive Sciences of Music (ESCOM), eds E. Cambouropoulos, C. Tsougras, O. Mavromatis, and K. Pastiadis (Thessaloniki: Aristotle University of Thessaloniki), 159–165.

Bruhn, S. (2005). The Musical Order of the World: Kepler, Hesse, Hindemith. Hillsdale, NY: Pendragon Press.

Burkert, W. (1972). Lore and Science in Ancient Pythagoreanism. Transl. by E. L. Jr. Minar (Cambridge, MA: Harvard University Press).

Burnett, H., and Nitzberg, R. J. (2007). Composition, Chromaticism, and the Developmental Process: A New Theory of Tonality. Aldershot: Ashgate.

Burns, E. M., and Ward, W. D. (1978). Categorical perception–phenomenon or epiphenomenon: evidence from experiments in the perception of melodic musical intervals. J. Acoust. Soc. Am. 63, 456–468. doi: 10.1121/1.381737

Cazden, N. (1971). “A simplified mode classification for traditional anglo-american song tunes,” in Yearbook of the International Folk Music Council, ed C. Haywood (Urbana, IL; Champaign, IL: University of Illinois Press), 45–78.

Christensen, T. (1992). The Spanish baroque guitar and seventeenth-century triadic theory. J. Music Theory 36, 1–42. doi: 10.2307/843908

Chybowski, J. J. (2008). Developing American Taste: A Cultural History of the Early Twentieth-Century Music Appreciation Movement. Madison, WI: University of Wisconsin.

Civil, M. (1994). The Farmer's Instructions A Sumerian Agricultural Manual. ed S. V. Aula Orien. Barcelona: Editorial AUSA.

Clements, E. (1935). The four parent scales of modal music. J. R. Music. Assoc. 62, 77–91. doi: 10.1093/jrma/62.1.77

Colburn, J. (2009). A new interpretation of the nippur music-instruction fragments. J. Cuneif. Stud. 67, 97–109. Available online at:

Condry, K., and Yonas, A. (2013). Six-month-old infants use motion parallax to direct reaching in depth. Infant Behav. Dev. 36, 238–244. doi: 10.1016/j.infbeh.2013.01.004

Cook, N. D. (2011). “Triadic insights in astronomy, art and music,” in Analecta Husserliana: The Yearbook of Phenomenological Research, Vol. 107, eds A.-T. Tymieniecka and A. Grandpierre (Dordrecht: Springer Science), 129–135.

Cook, N. D. (2012). Harmony, Perspective, and Triadic Cognition. Cambridge: Cambridge University Press.

Cook, N. D., Yutsudo, A., Fujimoto, N., and Murata, M. (2008). On the visual cues contributing to pictorial depth perception. Visual Percept. 26, 69–92. doi: 10.2190/em.26.1.f

Cook, S. (1995). ‘Yue Ji’–record of music: introduction, translation, notes, and commentary. Asian Music 26, 1–96. doi: 10.2307/834434

Couch, C. J. (1989). Oral technologies: a cornerstone of ancient civilizations? Sociol. Q. 30, 587–602. doi: 10.1111/j.1533-8525.1989.tb01536.x

Crickmore, L. (2006). The musicality of plato. Hermathena 180, 19–43. Available online at:

Crickmore, L. (2009a). A possible mesopotamian origin for Plato's world soul. Hermathena 186, 5–23. Available online at:

Crickmore, L. (2009b). “The tonal systems of mesopotamia and ancient greece: some similarities and differences,” in The Archaeomusicological Review of the Ancient Near East, eds R. Dumbrill and M. Marcetteau (London: Iconea Publications), 1–16.

Crocker, R. L. (1997). Mesopotamian tonal systems. Iraq 59, 189–202. doi: 10.2307/4200443

Csapo, E. G. (2000). Later euripidean music. Il. Class. Stud. 24/25, 399–426. Available online at:

Csapo, E. G. (2011). “The economics, poetics, politics, metaphysics and ethics of the ‘new music,’” in Music and Cultural Politics in Greek and Chinese Societies, Vol. 1, ed D. Yatromanolakis (Cambridge MA: Harvard University, Department of the Classics), 65–132.

Curtis, G. (2006). The Cave Painters: Probing the Mysteries of the World's First Artists. New York, NY: Knopf.

Dahlhaus, C. (2014). Studies on the Origin of Harmonic Tonality. Princeton, NJ: Princeton University Press.

Dalla, B. S., Białuńska, A., and Sowiński, J. (2013). Why movement is captured by music, but less by speech: role of temporal regularity. PLoS ONE 8:e71945. doi: 10.1371/journal.pone.0071945

Daniélou, A. (1995). Music and the Power of Sound: The Influence of Tuning and Interval on Consciousness. Rochester, VT: Inner Traditions.

DeLucia, P. R. (2008). Critical roles for distance, task, and motion in space perception: initial conceptual framework and practical implications. Hum. Fact. 50, 811–820. doi: 10.1518/001872008X312297

DeLucia, P. R., and Novak, B. J. (1997). Judgments of relative time-to-contact of more than two approaching objects: toward a method. Percept. Psychophys. 59, 913–928. doi: 10.3758/BF03205508

Delviniotis, D. S., Kouroupetroglou, G., and Theodoridis, S. (2008). Acoustic analysis of musical intervals in modern byzantine chant scales. J. Acoust. Soc. Am. 124, EL262–EL269. doi: 10.1121/1.2968299

Demany, L., and Semal, C. (2013). Dividing attention between two segregated tone streams. Proc. Meet. Acoust. 19, 050078–050078. doi: 10.1121/1.4798800

Deutsch, D. (ed.). (2012). “The processing of pitch combinations,” in Psychology of Music, 3rd Edn (New York, NY: Academic Press), 249–325.

Deva, B. C. (1995). Indian Music. New Delhi: New Age International.

Devaney, J., Wild, J., and Fujinaga, I. (2011). “Intonation in solo vocal performance: a study of semitone and whole tone tuning in undergraduate and professional sopranos,” in Proceedings of the International Symposium on Performance Science 2011 (Utrecht: Association Européenne des Conservatoires), 219–224.

Drew, T., and Vogel, E. K. (2008). Neural measures of individual differences in selecting and tracking multiple moving objects. J. Neurosci. 28, 4183–4191. doi: 10.1523/JNEUROSCI.0556-08.2008

Drews, R. (1995). The End of the Bronze Age: Changes in Warfare and the Catastrophe ca. 1200 B.C. Princeton, NJ: Princeton University Press.

Dubovsky, I. I., Yevseyev, S. V., Sokolov, V. V., and Sposobin, I. V. (1965). The Textbook of Harmony [Учебник гармонии], 4th Edn. Moscow: Muzyka.

Dubravskaya, T. (1996). History of Polyphony. Music of the Age of Renaissance: 16th Century [Музыка эпохи Возрождения]. Moscow: Muzyka [Музыка].

Duchesne-Guillemin, M. (1984). “A hurrian musical score from ugarit: the discovery of mesopotamian music,” in Sources from the Ancient Near East, Vol. 2, eds C. Buccellati and M. Kelly-Buccellati (Malibu, CA: Undena Publications), 22.

Dumbrill, R. (2007). Earliest Evidence of Heptatonism. London: Iconea Publications. Available online at:

Edgerton, S. Y. (1974). Florentine interest in ptolemaic cartography as background for renaissance painting, architecture, and the Discovery of America. J. Soc. Arch. Histor. 33, 275–292. doi: 10.2307/988935

Edgerton, S. Y. (2009). The Mirror, the Window, and the Telescope: How Renaissance Linear Perspective Changed Our Vision of the Universe. Ithaca, NY: Cornell University Press.

Eitan, Z., Schupak, A., Gotler, A., and Marks, L. E. (2014). Lower pitch is larger, yet falling pitches shrink. Exp. Psychol. 61, 1–12. doi: 10.1027/1618-3169/a000246

Falkenhausen, L. (1992). On the early development of chinese musical theory: the rise of pitch-standards. J. Am. Orient. Soc. 112, 433–439. doi: 10.2307/603079

Farmer, H. G. (1965). The ethos of antiquity. Islam. Stud. 4, 25–30.

Farmer, S., Henderson, J. B., and Witzel, M. (2000). Neurobiology, layered texts, and correlative cosmologies: a cross-cultural framework for premodern history. Bull. Mus. Far East. Anti. 72, 48–90. Available online at:

Florensky, P. (1996). Iconostasis. New York, NY: St. Vladimir's Seminary Press.

Forte, A. (1980). Generative chromaticism in mozart's music: the rondo in a minor, K. 511. Music. Q. 66, 459–483. doi: 10.1093/mq/LXVI.4.459

Foster, B. R. (1996). Before the Muses: An Anthology of Akkadian Literature. Archaic, Classical, Mature. Vol. 1. Bethesda, MD: CDL Press.

Foster, B. R. (2011). “The person in mesopotamian thought,” in The Oxford Handbook of Cuneiform Culture, eds K. Radner and E. Robson (Oxford: Oxford University Press), 117–139.

Franklin, J. C. (2002). Diatonic music in Greece: a reassessment of its antiquity. Mnemosyne 55, 669–702. doi: 10.1163/156852502320880186

Franklin, J. C. (2005). Hearing Greek Microtones. Ancient Greek Music in Performance, Vol. 29. Vienna: Wiener Studien Beiheft.

Franklin, J. C. (2007). “The Global Economy of Music in the Ancient Near East,” in Sounds of Ancient Music, ed J. G. Westenholz (Jerusalem: Bible Lands Museum), 27–37.

Franklin, J. C. (2013). “‘Song-benders of circular choruses’: dithyramb and the ‘demise of music,’” in Song Culture and Social Change: The Contexts of Dithyramb, eds P. Wilson and B. Kowalzig (Oxford: Oxford University Press), 213–236.

Freeman, N. H. (1987). Current problems in the development of representational picture production. Arch. Psychol. 55, 127–152.

Friberg, A., Bresin, R., and Sundberg, J. (2006). Overview of the KTH rule system for musical performance. Adv. Cogn. Psychol. 2, 145–161. doi: 10.2478/v10053-008-0052-x

Friberg, J. (2011). Seven-sided star figures and tuning algorithms in mesopotamian, Greek, and Islamic Texts. Archiv. Für. Orientfors. 52, 121–155. Available online at:

Frolov, B. A. (1992). Primitive Graphics of Europe [Первобытная графика Европы]. Moscow: Nauka.

Frolov, B. A. (2003). “Genesis of graphic symbolization [Генезис графической символики],” in Chinese Classic “Book of Changes” and Modern Science [Китайская классическая “Книга Перемен” и современная наука], ed P. M. Kozhin (Moscow: Luch [Луч]), 20–28.

Galeyev, B. M. (2003). Evolution of gravitational synesthesia in music: to color and light! Leonardo 36, 129–134. doi: 10.1162/002409403321554198

Galeyev, B. M. (2007). The nature and functions of synesthesia in music. Leonardo 40, 285–288. doi: 10.1162/leon.2007.40.3.285

Galilei, V. (2003). Dialogue on Ancient and Modern Music. New Haven, CT: Yale University Press.

Garbuzov, N. (1950). Zonal Nature of Tempo and Rhythm [Зонная природа темпа и ритма]. Moscow: Academy of Science of USSR [Изд-во Академии наук СССР].

Garbuzov, N. (1955). Zonal Nature of Hearing of Dynamics [Зонная природа динамического слуха]. Moscow: Gos Muz Izdat [Гос. музыкальное изд-во].

Garbuzov, N. (1956). Zonal Nature of Hearing of Timbre [Зонная природа тембрового слуха]. Moscow: Gos Muz Izdat [Гос. музыкальное изд-во].

Garbuzov, N. (1980). “Selected Works (1925-1955) [Избранные труды],” in Garbuzov N.A. - Musician, Researcher and Pedagogue [Гарбузов Н.А. - Музыкант, исследователь, педагог], ed Y. Rags (Moscow: Muzyka [Музыка]), 49–263.

Geringer, J. M., MacLeod, R. B., and Sasanfar, J. K. (2015). In tune or out of tune: are different instruments and voice heard differently? J. Res. Music Educ. 63, 89–101. doi: 10.1177/0022429415572025

Ghrab, A. (2005). The western study of intervals in Arabic music, from the eighteenth century to the Cairo Congress. World Music 47, 55–79. Available online at:

Gill, K. Z., and Purves, D. (2009). A biological rationale for musical scales. PLoS ONE 4:e8144. doi: 10.1371/journal.pone.0008144

Gogotishvili, V. (2010). “On authentic and plagal types of monotonic (non-octave) scales in Georgian traditional vocal polyphony,” in Echoes from Georgia: Seventeen Arguments on Georgian Polyphony (Focus on Civilizations and Cultures), eds R. Tsurtsumia and J. Jordania (New York, NY: Nova Science Publishers), 147–156.

Golomb, C. (2004). The Child's Creation of a Pictorial World. Mahwah, NJ: Erlbaum.

Gombosi, O. (1951). Key, mode, species. J. Am. Musicol. Soc. 4, 20–26. doi: 10.2307/830117

Gozza, P. (2000). Number to Sound: The Musical Way to the Scientific Revolution. Dordrecht: Kluwer Academic Publishers. doi: 10.1007/978-94-015-9578-0

Grauer, V. A. (1996). Toward a unified theory of the arts. Music Theory Online 2. Available online at:

Greene, R. (1983). Determining the preferred viewpoint in linear perspective. Leonardo 16, 97–102. doi: 10.2307/1574793

Hagel, S. (2009). Ancient Greek Music: A New Technical History. New York, NY: Cambridge University Press. doi: 10.1017/CBO9780511691591

Hallo, W. W. (2010). The World's Oldest Literature: Studies in Sumerian Belles-Lettres. Leiden: Koninklijke Brill NV.

Halpern, A. R. (2012). Dynamic aspects of musical imagery. Ann. N.Y. Acad. Sci. 1252, 200–205. doi: 10.1111/j.1749-6632.2011.06442.x

Hargreaves, D. J. (1986). The Developmental Psychology of Music. Cambridge: Cambridge University Press. doi: 10.1017/CBO9780511521225

Hauser, A. (1999). The Social History of Art: From Prehistoric Times to the Middle Ages. Vol. 1. New York, NY: Psychology Press.

Hero, B., and Foulkrod, R. (1999). Lambdoma matrix and harmonic intervals: the physiological and psychological effects on human adaptation from combining math and music. Eng. Med. Biol. Magaz. 18, 61–73. doi: 10.1109/51.752977

Heylen, L., Wuyts, F. L., Mertens, F., De Bodt, M., and Van de Heyning, P. H. (2002). Normative voice range profiles of male and female professional voice users. J. Voice 16, 1–7. doi: 10.1016/S0892-1997(02)00065-6

Higgins, R., and Winnington-Ingram, R. P. (1965). Lute-players in Greek art. J. Hellen. Stud. 85, 62–71. doi: 10.2307/628809

Hockney, D. (2006). Secret Knowledge: Rediscovering the Lost Techniques of the Old Masters. New York, NY: Viking Studio.

Hubbard, T. L. (2007). What is mental representation? and how does it relate to consciousness? J. Consc. Stud. 14, 37–61. Available online at:

Hubbard, T. L. (2013). “Auditory imagery contains more than audition,” in Multisensory Imagery, eds S. Lacey and R. Lawson (Heidelberg: Springer), 221–247. Available online at:

Hubbard, T. L., and Stoeckig, K. (1988). Musical imagery: generation of tones and chords. J. Exp. Psychol. 14, 656–667. doi: 10.1037/0278-7393.14.4.656

Hudson, W. (1967). The study of the problem of pictorial perception among unacculturated groups. Int. J. Psychol. 2, 89–107. doi: 10.1080/00207596708247206

Huron, D. (1989). Voice denumerability in polyphonic music of homogeneous timbres. Music Percept. Interdiscipl. J. 6, 361–382. doi: 10.2307/40285438

Huron, D. (1991). The avoidance of part-crossing in polyphonic music: perceptual evidence and musical practice. Music Percept. 9, 93–103. doi: 10.2307/40286160

Huron, D. (2001). Tone and voice: a derivation of the rules of voice-leading from perceptual principles. Music Percept. 19, 1–64. doi: 10.1525/mp.2001.19.1.1

Huron, D. (2003). “Is music an evolutionary adaptation?” in The Cognitive Neuroscience of Music, eds I. Peretz and R. Zatorre (Oxford: Oxford University Press), 466.

Huron, D. (2006). Sweet Anticipation: Music and the Psychology of Expectation. Cambridge, MA: MIT Press.

Hyer, B. (2008). “Tonality,” in The Cambridge History of Western Music Theory, ed T. Christensen (Cambridge, UK: Cambridge University Press), 726–752.

Janata, P. (2012). Acuity of mental representations of pitch. Ann. N.Y. Acad. Sci. 1252, 214–221. doi: 10.1111/j.1749-6632.2011.06441.x

Jordan, D. S. (1987). Influence of the diatonic tonal hierarchy at microtonal intervals. Percept. Psychophys. 41, 482–488. doi: 10.3758/BF03210483

Jordan, D. S., and Shepard, R. N. (1987). Tonal schemas: evidence obtained by probing distorted musical scales. Percept. Psychophys. 41, 489–504. doi: 10.3758/BF03210484

Jordania, J. (2006). Who Asked the First Question? The Origins of Human Choral Singing, Intelligence, Language and Speech. Tbilisi: Logos.

Kappraff, J. (2006). Anne Bulckens' analysis of the proportions of the Parthenon and its meanings. Symmetry 17, 91–96. Available online at:

Katsanevaki, A. N. (2011). Chromaticism: a theoretical construction or a practical transformation? Muzikologija 11, 159–180. doi: 10.2298/MUZ1111159K

Katz, J. (1996). Music and aesthetics: an early Indian perspective. Early Music 24, 407–420.

Kaufmann, W. (1976). Musical References in the Chinese Classics. Detroit: Information Coordinators.

Kholopov, Y. (1988). Harmony: A Theoretic Course [Гармония: теоретический курс]. Moscow: Muzyka [Музыка].

Kholopov, Y. (2004). “Obychodnyi modes and polyphony [Обиходные лады и многоголосие],” in Christian Culture: Past and Present; 2000 Anniversary of Birth of Christ [Христианская культура: прошлое и настоящее. К 2000-летию Рождества Христова], ed N.S. Guliantiskaya (Moscow: Gnessin Russian Academy of Music [Российская академия музыки имени Гнесиных]), 39–54.

Kholopov, Y. (2006). Musical-Theoretic Systems [Музыкально-теоретические системы]. Moscow: Kompozitor.

Kholopova, V. (1979). Texture: An Essay [Фактура: Очерк]. Moscow: Muzyka [Музыка].

Kilmer, A. D. (1971). The discovery of an ancient Mesopotamian theory of music. Proc. Am. Philosoph. Soc. 115, 131–149.

Kilmer, A. D. (1974). The cult song with music from ancient Ugarit: another interpretation. Revue d'Assyriol. D'archéol. Orient. 68, 69–82.

Kilmer, A. D., and Tinney, S. (1996). Old Babylonian music instruction texts. J. Cuneif. Stud. 48, 49–56. doi: 10.2307/1359769

Kob, M., Henrich, N., Herzel, H., Howard, D., Tokuda, I., and Wolfe, J. (2011). Analysing and understanding the singing voice: recent progress and open questions. Curr. Bioinformat. 6, 362–374. doi: 10.2174/157489311796904709

Koelsch, S., Gunter, T., Schröger, E., and Friederici, A. D. (2003). Processing tonal modulations: an ERP study. J. Cogn. Neurosci. 15, 1149–1159. doi: 10.1162/089892903322598111

Korsakova-Kreyn, M., and Dowling, W. J. (2012). “Emotion in music: affective responses to motion in tonal space,” in Proceedings: The ICMPC-ESCOM 2012 Joint Conference—12th Biennial International Conference for Music Perception and Cognition, eds E. Cambouropoulos, C. Tsougras, P. Mavromatis, and K. Pastiadis (Thessaloniki: Aristotle University of Thessaloniki), 550–558.

Korsakova-Kreyn, M., and Dowling, W. J. (2014). Emotional processing in music: study in affective responses to tonal modulation in controlled harmonic progressions and real music. Psychomusicology 24, 4–20. doi: 10.1037/pmu0000029

Krispijn, T. (2010). “Musical ensembles in ancient Mesopotamia,” in Proceedings of the International Conference of Near Eastern Archaeomusicology, Held at the British Museum, December 4-6, 2008, eds R. Dumbrill and I. Finkel (London: Iconea Publications), 125–150.

Krohn, K. I., Brattico, E., Välimäki, V., and Tervaniemi, M. (2007). Neural representations of the hierarchical scale pitch structure. Music Percept. 24, 281–296. doi: 10.1525/mp.2007.24.3.281

Kubik, G. (1985). African tone-systems: a reassessment. Yearb. Tradit. Music 17, 31–63. doi: 10.2307/768436

Kubik, G. (1999). Africa and the Blues. Jackson, MS: University Press of Mississippi.

Kuttner, F. A. (1965). A musicological interpretation of the twelve lüs in China's traditional tone system. Ethnomusicology 9, 22–38. doi: 10.2307/850415

Kvitka, K. V. (1973). Selected Works [Избранные труды], Vol. 2. ed V. L. Goshovskii. Moscow: Sovetskii Kompozitor [Советский композитор].

Laertius, D. (1958). Lives of Eminent Philosophers. Cambridge, MA: Harvard University Press.

Landels, J. G. (2002). Music in Ancient Greece and Rome. London: Routledge.

Larson, S. (2012). Musical Forces: Motion, Metaphor, and Meaning in Music. Bloomington, IN: Indiana University Press.

Larson, S., and McAdams, S. (2004). Musical forces and melodic expectations: comparing computer models and experimental results. Music Percept. 21, 457–498. doi: 10.1525/mp.2004.21.4.457

Lawergren, B. (1997). “Mesopotamia, musical instruments,” in Die Musik in Geschichte Und Gegenwart: Allgemeine Enzyklopädie Der Musik, 2nd Edn., Vol. 6, ed L. Finscher (Kassel: Bärenreiter and Metzler), 143–171.

Lawlor, R. (1982). Sacred Geometry: Philosophy and Practice. London: Thames and Hudson.

Leisiö, T. (2002). “On Old-Lithuanian modalities: a hypothesis on five stylistic strata based on Proto-Indo-European pentatonic roots,” in Ethnic Relations and Musical Folklore, ed R. Astrauskas (Vilnius: Lietuvos Muzikos Akademija), 22–51.

Léothaud, G., Voisin, F., Lamont, A., and Arom, S. (1997). “Experimental ethnomusicology: an interactive approach to the study of musical scales,” in Perception and Cognition of Music, eds I. Deliège and J. A. Sloboda (Hove, UK: Psychology Press), 3–30.

Lerdahl, F. (2009). Genesis and architecture of the GTTM project. Music Percept. 26, 187–194. doi: 10.1525/mp.2009.26.3.187

LeVen, P. A. (2014). The Many-Headed Muse: Tradition and Innovation in Late Classical Greek Lyric Poetry. Cambridge: Cambridge University Press.

Levin, T. (2007). Music of Central Asia Vol. 5: The Badakhshan Ensemble: Song and Dance from the Pamir Mountains. New York, NY: Smithsonian Folkways Recordings.

Lidji, P., Kolinsky, R., Lochy, A., and Morais, J. (2007). Spatial associations for musical stimuli: a piano in the head? J. Exp. Psychol. Hum. Percept. Perform. 33, 1189–1207. doi: 10.1037/0096-1523.33.5.1189

Lindley, M. (1976). Evidence for meantone temperament. Proc. R. Music. Assoc. 102, 37–51. doi: 10.1093/jrma/102.1.37

Lippman, E. A. (1964). Musical Thought in Ancient Greece. New York, NY: Da Capo Press.

List, G. (1961). Speech melody and song melody in Central Thailand. Ethnomusicology 5, 16–32. doi: 10.2307/924305

List, G. (1974). The reliability of transcription. Ethnomusicology 18, 353–377. doi: 10.2307/850519

Löhnert, A. (2011). “Manipulating the gods: lamenting in context,” in The Oxford Handbook of Cuneiform Culture, eds K. Radner and E. Robson (Oxford: Oxford University Press), 402–417.

Lomax, A. (2004). Alan Lomax: Selected Writings, 1934-1997. ed R. Cohen. New York, NY: Routledge.

Losev, A. (2000). History of Ancient Aesthetics: Aristotle and Late Classics [История античной эстетики: Аристотель и поздняя классика], Vol. 4. Moscow: Folio.

Lowinsky, E. E. (1989). “The concept of physical and musical space in the Renaissance: a preliminary sketch,” in Music in the Culture of the Renaissance and Other Essays, ed B. J. Blackburn (Chicago, IL: University of Chicago Press), 6–18.

Lundberg, D. (1997). Welcome to Assyria: your land on the cyber space—music and the internet in the establishment of a transnational Assyrian identity. Etnomusikologian Vuosikirja 10, 13–28.

Malm, W. P. (2000). Traditional Japanese Music and Musical Instruments. Vol. 1. Tokyo: Kodansha International.

Mangani, M., and Sabaino, D. (2008). Tonal types and modal attributions in late Renaissance polyphony: new observations. Acta Musicol. 80, 231–250. Available online at:

Mantero, R. (2001). La musique de Léonard de Vinci. Médecine Des Arts: Approche Médicale et Scientifique Des Pratiques Artistiques 37, 38–39. Available online at:

Manuel, P. (1989). Modal harmony in Andalusian, Eastern European, and Turkish syncretic musics. Yearb. Tradit. Music 21, 70–94. doi: 10.2307/767769

Marcus, S. (1993). The interface between theory and practice: intonation in Arab music. Asian Music 24, 39–58. doi: 10.2307/834466

Margulis, E. H. (2005). A model of melodic expectation. Music Percept. 22, 663–714. doi: 10.1525/mp.2005.22.4.663

Marie, C., and Trainor, L. J. (2013). Development of simultaneous pitch encoding: infants show a high voice superiority effect. Cereb. Cortex 23, 660–669. doi: 10.1093/cercor/bhs050

Marrou, H. I. (1956). A History of Education in Antiquity. Madison, WI: University of Wisconsin Press.

Mathiesen, T. J. (1999). Apollo's Lyre: Greek Music and Music Theory in Antiquity and the Middle Ages. Lincoln, NE: University of Nebraska Press.

Mazel, L. (1952). On Melody [О мелодии]. Moscow: Gos Muz Izdat [Гос. музыкальное изд-во].

Mazel, L. (1979). Structuring of the Music Works [Строение музыкальных произведений], 2nd Edn. Moscow: Muzyka [Музыка].

McCartin, J. B. (2010). Mysteries of the Equilateral Triangle. Applied Mathematics, Kettering University. Ruse: Hikari Ltd.

Meier, B. (1988). The Modes of Classical Vocal Polyphony Described according to the Sources. Transl. E. S. Beebe. New York, NY: Broude Brothers.

Meier, B., and Dittmer, L. A. (1956). The musica reservata of Adrianus Petit Coclico and its relationship to Josquin. Musica Disciplina 10, 67–105.

Michalowski, P. (2006). Love or death? observations on the role of the gala in Ur III ceremonial life. J. Cuneif. Stud. 58, 49–61.

Michalowski, P. (2012). “Literacy schooling and the transmission of knowledge in early Mesopotamian culture,” in Theory and Practice of Knowledge Transfer: Studies in School Education in the Ancient Near East and beyond, eds W. S. Van Egmond and W. H. Van Soldt (Leiden: Nederlands Instituut Voor Het Nabije Oosten), 39–5819.

Milbrath, C., McPherson, G. E., and Osborne, M. S. (2015). “Artistic development,” in Handbook of Child Psychology and Developmental Science, Vol. 2: Cognitive Processes, 7th Edn., eds R. Lerner, L. S. Liben, and U. Mueller (Hoboken, NJ: Wiley), 897–940.

Miranda, E. R. (2002). “Mimetic development of intonation,” in Music and Artificial Intelligence, eds C. Anagnōstopoulou, M. Ferrand, and A. Smaill (Berlin: Springer), 107–118.

Moll, K. N. (ed.). (2014). “Toward a comprehensive view of compositional priorities in the music of Dufay and his contemporaries,” in Counterpoint and Compositional Process in the Time of Dufay: Perspectives from German Musicology (Abingdon: Routledge), 3–63.

Moran, H., and Pratt, C. C. (1926). Variability of judgments on musical intervals. J. Exp. Psychol. 9, 492–499. doi: 10.1037/h0070061

Morley, I. (2013). The Prehistory of Music: Human Evolution, Archaeology, and the Origins of Musicality. Oxford: Oxford University Press. doi: 10.1093/acprof:osobl/9780199234080.001.0001

Morra, S. (2008). “Spatial structures in children's drawings: how do they develop?” in Drawing and the Non-Verbal Mind: A Life-Span Perspective, eds C. Lange-Küttner and A. Vinter (Cambridge: Cambridge University Press), 159–194.

Morrison, S. J., and Fyk, J. (2002). “Intonation,” in The Science and Psychology of Music Performance: Creative Strategies for Teaching and Learning, eds R. Parncutt and G. McPherson (Oxford, UK: Oxford University Press), 183–197.

Mshelia, A. Y., and Lapidus, L. B. (1990). Depth picture perception in relation to cognitive style and training in non-western children. J. Cross Cult. Psychol. 21, 414–433. doi: 10.1177/0022022190214002

Myers, F. R. (2002). Painting Culture: The Making of an Aboriginal High Art. Durham, NC: Duke University Press. doi: 10.1215/9780822384168

Naumenko, G. M. (2013). Folk ABC: Methodology of Teaching Folk Singing to Children [Фольклорная азбука: Методика обучения детей народному пению], ed F. Takun. Moscow: Sovremennaya muzyka.

Nazajkinskij, E. V. (1972). On Psychology of Musical Perception [О психологии музыкального восприятия]. Moscow: Muzyka.

Nettl, B. (1986). World music in the twentieth century: a survey of research on western influence. Acta Musicol. 58, 360–373. doi: 10.2307/932821

Nutting, G. (1974). The logic of renaissance harmony. Int. Rev. Aesthet. Soc. Music 5, 253–263. doi: 10.2307/836567

Ohala, J. J. (2006). “The frequency code underlies the sound-symbolic use of voice pitch,” in Sound Symbolism, eds L. Hinton, J. Nichols, and J. J. Ohala (Cambridge: Cambridge University Press), 325–347.

Owens, J. A. (1998). Composers at Work: The Craft of Musical Composition 1450-1600. New York, NY: Oxford University Press.

Owens, J. A. (2000). Erasable tablets as tools for musical composition. Shakespeare Stud. 28, 139–144.

Palisca, C. V. (1960). Vincenzo Galilei and some links between ‘Pseudo-Monody’ and Monody. Music. Q. 46, 344–360. doi: 10.1093/mq/XLVI.3.344

Palmer, C., and Holleran, S. (1994). Harmonic, melodic, and frequency height influences in the perception of multivoiced music. Percept. Psychophys. 56, 301–312. doi: 10.3758/BF03209764

Panofsky, E. (1991). Perspective as Symbolic Form. Transl. by C. S. Wood. Cambridge, MA: MIT Press.

Park, E., and I, B. (1995). Children's representation systems in drawing three-dimensional objects: a review of empirical studies. Visual Arts Res. 21, 42–56.

Parncutt, R. (2011). The tonic as triad: key profiles as pitch salience profiles of tonic triads. Music Percept. 28, 333–365. doi: 10.1525/mp.2011.28.4.333

Parncutt, R., and Cohen, A. J. (1995). Identification of microtonal melodies: effects of scale-step size, serial order, and training. Percept. Psychophys. 57, 835–846. doi: 10.3758/BF03206799

Pashinian, E. (1973). Universal super-modal system in Armenian music [Универсальная суперладовая система в армянской музыке]. Hist. Philol. J. Natl. Acad. Sci. Armen. 3, 194–212.

Pearl, J. (2006). Eavesdropping with a master: Leoš Janáček and the music of speech. Emp. Musicol. Rev. 1, 131–165. Available online at:

Pears, I. (1991). The Discovery of Painting: The Growth of Interest in the Arts in England, 1680-1768. New Haven, CT: Yale University Press.

Perlman, M., and Krumhansl, C. L. (1996). An experimental study of internal interval standards in Javanese and Western musicians. Music Percept. 14, 95–116. doi: 10.2307/40285714

Petrović, A. (1994). “The eastern roots of ancient Yugoslav music,” in Music Cultures in Contact: Convergences and Collisions, eds J. M. Kartomi and S. Blum (Basel: Gordon and Breach Publishers).

Phillips, W. A., Hobbs, S. B., and Pratt, F. R. (1978). Intellectual realism in children's drawings of cubes. Cognition 6, 15–33. doi: 10.1016/0010-0277(78)90007-0

Pintore, A. (2004). Musical symbolism in the works of Leon Battista Alberti. Nexus Netw. J. 6, 49–70. doi: 10.1007/s00004-004-0018-3

Plato. (2012). Symposium and Phaedrus. New York, NY: Dover Publications.

Powley, H. (1996). “The musical legacy of the Etruscans,” in Etruscan Italy: Etruscan Influences on the Civilizations of Italy from Antiquity to the Modern Era, ed J. F. Hall (Bloomington, IN: Indiana University Press), 287–303.

Pratt, F. (1984). Cognitive processes in the perception of art. Adv. Psychol. 19, 97–109.

Prizer, W. F. (1975). Performance practices in the frottola. Early Music 3, 227–235. doi: 10.1093/earlyj/3.3.227

Racy, A. J. (2004). Making Music in the Arab World: The Culture and Artistry of Tarab. Cambridge: Cambridge University Press.

Rags, Y. (1980). “The concept of zonal nature of musical hearing by N.A. Garbuzov [Концепция зонной природы музыкального слуха Н.А. Гарбузова],” in Garbuzov N.A. - Musician, Researcher and Pedagogue [Гарбузов Н.А. - Музыкант, исследователь, педагог], ed Y. Rags (Moscow: Muzyka [Музыка]), 11–48.

Raman, R., and Dowling, W. J. (2012). “Analyzing modulation in scales (Rāgams) in South Indian Classical (Carnātic) Music: a behavioral study,” in Proceedings of the 12th International Conference on Music Perception and Cognition (ICMPC), eds E. Cambouropoulos, C. Tsougras, P. Mavromatis, and K. Pastiadis (Thessaloniki: Aristotle University of Thessaloniki), 837–838.

Rameau, J.-P. (1971). Treatise on Harmony [Traité de L'harmonie Réduite à Ses Principes Naturels]. Translated by Philip Gossett. New York, NY: Dover Publications.

Randel, D. M. (1971). Emerging triadic tonality in the fifteenth century. Music. Q. 57, 73–86. doi: 10.1093/mq/LVII.1.73

Raynaud, D. (2009). “Why did geometrical optics not lead to perspective in Medieval Islam,” in Raymond Boudon. A Life in Sociology, eds M. Cherkaoui and P. Hamilton (Oxford: Bardwell Press), 243–266.

Reynolds, C. (1987). Musical evidence of compositional planning in the Renaissance: Josquin's ‘Plus Nulz Regretz.’ J. Am. Musicol. Soc. 40, 53–81. doi: 10.2307/831582

Rideout, V. J., Roberts, D. F., and Foehr, U. G. (2010). Generation M: Media in the Lives of 8-18 Year Olds. Menlo Park, CA: Kaiser Family Foundation.

Riedweg, C. (2005). Pythagoras: His Life, Teaching, and Influence. Transl. by S. Rendall and A. Schatzmann. Ithaca, NY: Cornell University Press.

Riemann, L. (1899). Über Eigentümliche Bei Natur- Und Orientalischen Kulturvölkern Vorkommende Tonreihen. Essen: G. D. Baedeker.

Robson, E. (2008). Mathematics in Ancient Iraq: A Social History. Princeton, NJ: Princeton University Press.

Roederer, J. G. (2008). The Physics and Psychophysics of Music: An Introduction. Berlin; Heidelberg: Springer Science and Business Media.

Rohrmeier, M., and Widdess, R. (2012). “Incidental learning of modal features of North Indian Music,” in Proceedings: The ICMPC-ESCOM 2012 Joint Conference—12th Biennial International Conference for Music Perception and Cognition, eds E. Cambouropoulos, C. Tsougras, P. Mavromatis, and K. Pastiadis (Thessaloniki: Aristotle University of Thessaloniki), 857–866.

Rothstein, E. (2006). “The new amateur player and listener,” in The Orchestra: A Collection of 23 Essays on Its Origins and Transformations, ed J. Peyser (Milwaukee, WI: Hal Leonard), 523–538.

Rowan, D. (2013). The universal lyre: three perspectives. Am. Harp. J. 24, 55–63. Available online at:

Rowell, L. (1981). Early Indian musical speculation and the theory of melody. J. Music Theory 25, 217–244.

Rubio, G. (2009). “Sumerian literature,” in From an Antique Land: An Introduction to Ancient Near Eastern Literature, ed S. C. Ehrlich (Lanham, MD: Rowman and Littlefield Publishers), 11–76.

Rudneva, A. (1994). Russian Traditional Musical Works: Essays on the Theory of Folklore [Русское народное музыкальное творчество: очерки по теории фольклора]. Moscow: Kompozitor [Композитор].

Sachs, C. (1960). Primitive and medieval music: a parallel. J. Am. Musicol. Soc. 13, 43–49. doi: 10.2307/830245

Sachs, C. (1962). The Wellsprings of Music. ed J. Kunst. The Hague: Martinus Nijhoff. doi: 10.1007/978-94-015-1059-2

Schäfer, H. (1919). Von ägyptischer Kunst, Besonders Der Zeichenkunst. Leipzig: J.C. Hinrichs.

Schulenberg, D. (1984). Composition before Rameau: harmony, figured bass, and style in the Baroque. Coll. Music Symp. 24, 130–148.

Scriba, C. J. (2015). 5000 Years of Geometry: Mathematics in History and Culture. Transl. by P. Schreiber. Basel: Springer Basel.

Shackford, C. (1962). Some aspects of perception III: addenda. J. Music Theory 6, 295–303. doi: 10.2307/842914

Siegel, J. A., and Siegel, W. (1977). Categorical perception of tonal intervals: musicians can't tell sharp from flat. Percept. Psychophys. 21, 399–407. doi: 10.3758/BF03199493

Skinner, S. (2006). Sacred Geometry: Deciphering the Code. New York: Sterling Publishing Company, Inc.

Solomon, J. D. (2000). Ptolemy Harmonics: Translation and Commentary. Leiden: Brill Academic Pub.

Stamou, L. (2002). Plato and Aristotle on music and music education: lessons from Ancient Greece. Int. J. Music Educ. 39, 3–16. doi: 10.1177/025576140203900102

Steblin, R. (1987). Towards a history of absolute pitch recognition. Coll. Music Symp. 27, 141–153.

Stewart, L., Overath, T., Warren, J. D., Foxton, J. M., and Griffiths, T. D. (2008). fMRI evidence for a cortical hierarchy of pitch pattern processing. PLoS ONE 3:e1470. doi: 10.1371/journal.pone.0001470

Stoter, F.-R., Schoeffler, M., Edler, B., and Herre, J. (2013). Human ability of counting the number of instruments in polyphonic music. J. Acoust. Soc. Am. 133, 3366. doi: 10.1121/1.4805760

Struik, D. J. (1987). A Concise History of Mathematics. 4th Edn. New York, NY: Dover Publications.

Strunk, W. O., and Treitler, L. (1998). Source Readings in Music History. New York, NY: Norton.

Sundberg, J. (1982). In tune or not? A study of fundamental frequency in music practise. Progr. Q. Stat. Rep. 23, 49–78.

Sundberg, J., Frydén, L., and Friberg, A. (1995). “Expressive aspects of instrumental and vocal performance,” in Music and the Mind Machine: The Psychophysiology and Psychopathology of the Sense of Music, ed R. Steinberg (Berlin: Springer), 49–62.

Sundberg, J., La, F. M. B., and Himonides, E. (2013). Intonation and expressivity: a single case study of classical western singing. J. Voice 27, 391.e1–391.e8. doi: 10.1016/j.jvoice.2012.11.009

Tallandini, M. A., and Morassi, L. (2008). “Spatial and symbolic codes in the development of three-dimensional graphic representation,” in Drawing and the Non-Verbal Mind: A Life-Span Perspective, eds C. Lange-Küttner and A. Vinter (Cambridge: Cambridge University Press), 217–238.

Tchesnokov, P. (1961). Choir and Conducting of It [Хор и управление им]. Moscow: Gos Muz Izdat [Гос. музыкальное изд-во].

Terhardt, E. (1995). “Music perception in the auditory hierarchy,” in Music and the Mind Machine: The Psychophysiology and Psychopathology of the Sense of Music, ed R. Steinberg (Berlin: Springer), 81–87.

Tillmann, B., Bharucha, J. J., and Bigand, E. (2000). Implicit learning of tonality: a self-organizing approach. Psychol. Rev. 107:885. doi: 10.1037/0033-295X.107.4.885

Thrasher, A. R. (2008). Sizhu Instrumental Music of South China: Ethos, Theory and Practice, Vol. 84, ed B. J. ter Haar. Leiden; Boston, MA: Brill.

Trainor, L. J., Marie, C., Bruce, I. C., and Bidelman, G. M. (2014). Explaining the high voice superiority effect in polyphonic music: evidence from cortical evoked potentials and peripheral auditory models. Hear. Res. 308, 60–70. doi: 10.1016/j.heares.2013.07.014

Trehub, S. E. (1987). Infants' perception of musical patterns. Percept. Psychophys. 41, 635–641. doi: 10.3758/BF03210495

Tuck, A. S. (2006). Singing the rug: patterned textiles and the origins of Indo-European metrical poetry. Am. J. Archaeol. 110, 539–550. doi: 10.3764/aja.110.4.539

Uspensky, B. (1995). Semiotics of Art [Семиотика искусства]. Moscow: Languages of Russian Culture.

Van Puyvelde, M., Vanfleteren, P., Loots, G., Deschuyffeleer, S., Vinck, B., Jacquet, W., et al. (2010). Tonal synchrony in mother-infant interaction based on harmonic and pentatonic series. Infant Behav. Dev. 33, 387–400. doi: 10.1016/j.infbeh.2010.04.003

Vanstiphout, H. L. (1997). “Sumerian canonical compositions. c. individual focus. 6. school dialogues,” in The Context of Scripture, I: Canonical Compositions from the Biblical World, ed W. W. Hallo (Leiden: Brill Academic Pub.), 588–593.

Vashkevich, N. (2006). Semantics of Musical Speech. Musical Syntax [Семантика музыкальной речи. Музыкальный синтаксис]. Tver: Musorgsky Musical College.

Veldhuis, N. (2011). “Levels of literacy,” in The Oxford Handbook of Cuneiform Culture, eds K. Radner and E. Robson (Oxford: Oxford University Press), 68–89.

Vinogradov, V. (1973). “The notes on middle Asian polyphony [Заметки о среднеазиатском многоголосии],” in Music of Peoples of Asia and Africa [Музыка народов Азии и Африки], Vol. 2, ed V. Vinogradov (Moscow: Sovetskii Kompozitor [Советский композитор]), 98–127.

von Hornbostel, E. M. (1975). Opera Omnia, Vol. 1, eds K. P. Wachsmann, D. Christensen, and H.-P. Reinecke. The Hague: Martinus Nijhoff.

Vulliet, F. H. (2011). “Letters as correspondence, letters as literature,” in The Oxford Handbook of Cuneiform Culture, eds K. Radner and E. Robson (Oxford: Oxford University Press), 486–507.

Vurma, A., and Ross, J. (2006). Production and perception of musical intervals. Music Percept. 23, 331–344. doi: 10.1525/mp.2006.23.4.331

Vuvan, D. T., and Schmuckler, M. A. (2011). Tonal hierarchy representations in auditory imagery. Mem. Cogn. 39, 477–490. doi: 10.3758/s13421-010-0032-5

Vygotsky, L. (1984). Collection of Works: Child Psychology [Собрание сочинений: Детская психология], Vol. 4, ed D. B. Elkonin. Moscow: Pedagogika.

Vygotsky, L. (2008). Pedagogical Psychology [Педагогическая психология]. Moscow: ACT Astrel.

Walker, R. (1987). The effects of culture, environment, age, and musical training on choices of visual metaphors for sound. Percept. Psychophys. 42, 491–502. doi: 10.3758/BF03209757

Walker, R. (2004). Cultural memes, innate proclivities and musical behaviour: a case study of the western traditions. Psychol. Music 32, 153–190. doi: 10.1177/0305735604041493

Weinstein, B., and Gridley, M. C. (2010). Visual perception of music. Psychol. J. 7, 80–87.

Werner, E. (1948). The origin of the eight modes of music (Octoechos). A study in musical symbolism. Hebr. Union Coll. Ann. 21, 211–255.

West, M. L. (1981). The singing of Homer and the modes of early Greek music. J. Hellen. Stud. 101, 113–129. doi: 10.2307/629848

West, M. L. (1992). Ancient Greek Music. New York, NY; London: Oxford University Press.

West, M. L. (1994). The Babylonian musical notation and the Hurrian melodic texts. Music Lett. 75, 161–179. doi: 10.1093/ml/75.2.161

Westrup, J. A. (1954). “Medieval song,” in New Oxford History of Music: Early Medieval Music up to 1300, Vol. 2 (Oxford, UK: Oxford University Press), 220–269.

Wienpahl, R. W. (1971). Modality, monality and tonality in the sixteenth and seventeenth centuries: I. Music Lett. 52, 407–417. doi: 10.1093/ml/LII.4.407

Wilcke, C. (2007). Early Ancient Near Eastern Law: A History of Its Beginnings: The Early Dynastic and Sargonic Periods. Winona Lake, IN: Eisenbrauns.

Willats, J. (1977). How children learn to draw realistic pictures. Q. J. Exp. Psychol. 29, 367–382. doi: 10.1080/14640747708400614

Wilson, B., and Wilson, M. (1984). Children's drawings in Egypt: cultural style acquisition as graphic development. Vis. Arts Res. 10, 13–26.

Wiltse, V., and Palmer, H. (2011). Hidden in plain sight: observations on the origins of the enneagram. Enneag. J. 4, 104–137.

Winkielman, P., Halberstadt, J., Fazendeiro, T., and Catty, S. (2006). Prototypes are attractive because they are easy on the mind. Psychol. Sci. 17, 799–806. doi: 10.1111/j.1467-9280.2006.01785.x

Winner, E. (2007). “Development in the arts: drawing and music,” in Handbook of Child Psychology, eds W. Damon and R. M. Lerner (Hoboken, NJ: John Wiley & Sons, Inc), 859–904.

Winnington-Ingram, R. P. (2015). Mode in Ancient Greek Music. Cambridge: Cambridge University Press.

Wittkower, R. (1953). Brunelleschi and ‘proportion in perspective.’ J. Warb. Courtauld Instit. 16, 275–291.

Woodward, L. (2011). Diogenes of Babylon reading Plato on music. Bull. Inst. Classical Stud. 54(suppl. S107), 233–253. doi: 10.1111/j.2041-5370.2011.tb00019.x

Wulstan, D. (1971). The origin of the modes. Stud. East. Chant 2, 4–20.

Yanov-Yanovskaya, N. (1999). One culture two traditions [Одна культура две традиции]. Music. Acad. [Музыкальная академия] 3, 21–27.

Yasser, J. (1948). The highway and the byways of tonal evolution. J. Am. Musicol. Soc. 11/12/13, 11–14. doi: 10.1525/jams.1948.11-12-13.1.03a00020

Yazar, H. (1991). Architecture in Miniature: Representation of Space and Form in Illustrations and Buildings in Timurid Central Asia. Cambridge, MA: Massachusetts Institute of Technology.

Yöre, S. (2012). Maqam in music as a concept, scale and phenomenon. J. World Turks 4, 267–286.

Zannos, I. (1990). Intonation in theory and practice of Greek and Turkish Music. Yearb. Tradit. Music 22, 42–59. doi: 10.2307/767931

Zatorre, R. J., and Baum, S. R. (2012). Musical melody and speech intonation: singing a different tune. PLoS Biol. 10:5. doi: 10.1371/journal.pbio.1001372

Zemtsovsky, I. (1987). “On melodic formula in Russian folklore [О мелодической ‘формульности’ в русском фольклоре],” in Ethnographic Origins of Folkloric Phenomena: Russian Folklore [Этнографические истоки фольклорных явлений. Русский фольклор], Vol. 14, ed V. Yeremina (Leningrad: Nauka), 117–128.

Zemtsovsky, I. (1998). “The melodic system of pentatonism (a Sketch about the Mongolian Version),” in Ethnologische, Historische Und Systematische Musikwissenschaft: Oskár Elschek Zum 65. Geburtstag, eds F. Födermayr and L. Burlas (Bratislava: ASCO art and science), 193–195.

Zemtsovsky, I. (2012). “Again on the origin of music [Снова о происхождении музыки],” in Zelenyi Zal [Зеленый зал: Альманах РИИИ], Vol. 3, ed A. F. Nekrylova (Saint Petersburg: Russian Institute of History of Arts [Российский институт истории искусств]), 7–27.

Ziegler, N. (2011). “Music, the work of professionals,” in The Oxford Handbook of Cuneiform Culture, eds K. Radner and E. Robson (Oxford: Oxford University Press), 288–312.

Keywords: ancient Babylonian and Greek music, diatonic/chromatic music, modulation and alteration, musical texture and pictorial perspective, musical key and pictorial perspective, environmental topography and tonal organization, pitch zone, aesthetic emotion

Citation: Nikolsky A (2016) Evolution of Tonal Organization in Music Optimizes Neural Mechanisms in Symbolic Encoding of Perceptual Reality. Part-2: Ancient to Seventeenth Century. Front. Psychol. 7:211. doi: 10.3389/fpsyg.2016.00211

Received: 16 December 2015; Accepted: 03 February 2016;
Published: 30 March 2016.

Edited by:

Leonid Perlovsky, Harvard University and Air Force Research Laboratory, USA

Reviewed by:

Stephan Thomas Vitas, Formerly affiliated with District of Columbia Psychological Association, USA
Leon Crickmore, Department of Education and Science, UK

Copyright © 2016 Nikolsky. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Aleksey Nikolsky,