These valence and arousal ratings can be translated into general emotional terms using the Circumplex Model of Affect (Russell, 1980):

The control words showed the expected results:

  • Determined was improvised with short durations, high loudness, and high articulation;
  • Delicate was improvised with low dissonance, low loudness, long note durations, and high pitch;
  • Ferocious was improvised with high loudness, dissonance, and high articulation;
  • Sorrowful was improvised with low pitch, slow tempo, soft dynamics, and low articulation.

In addition, listeners from the non-musician population mapped the improvisations to the taste words consistently, scoring well above chance level.

The first part was designed to search for any consistent pattern between pre-selected classical music (selected and performed by the London Symphony Orchestra) and four wines representing four distinct characteristics: acidity, fruitiness, tannin, and sweetness.

This research reveals that each taste corresponds to a specific pitch range:

  • Sweet was matched, on average, to pitches around C6 or slightly below;
  • Sour was matched, on average, to pitches between C5 and C6;
  • Bitter was matched, on average, to pitches between C3 and C4;
  • Salty was matched, on average, to pitches between C4 and C5.
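To make these pitch ranges concrete, the note names can be converted to frequencies. The following is a minimal sketch, assuming scientific pitch notation (C4 is middle C), twelve-tone equal temperament, and a reference tuning of A4 = 440 Hz; the source does not state the tuning used, and the helper name `note_to_hz` is hypothetical.

```python
# Semitone offsets of the natural notes from C within one octave.
SEMITONES_FROM_C = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_hz(note: str, a4_hz: float = 440.0) -> float:
    """Convert a natural note such as 'C4' to Hz.

    Assumes equal temperament and A4 = 440 Hz (an assumption, not from the source).
    """
    name, octave = note[0].upper(), int(note[1:])
    # MIDI convention: C4 (middle C) is note number 60, A4 is 69.
    midi = 12 * (octave + 1) + SEMITONES_FROM_C[name]
    return a4_hz * 2 ** ((midi - 69) / 12)


# Pitch spans taken from the list above. Sweet ("around C6 or a bit below")
# is not a closed range in the text, so only its C6 anchor is shown.
TASTE_PITCH_RANGES = {
    "bitter": ("C3", "C4"),
    "salty": ("C4", "C5"),
    "sour": ("C5", "C6"),
}

for taste, (low, high) in TASTE_PITCH_RANGES.items():
    print(f"{taste}: {note_to_hz(low):.1f} to {note_to_hz(high):.1f} Hz")
print(f"sweet: around {note_to_hz('C6'):.1f} Hz")
```

Under these assumptions, each reported span covers exactly one octave, so the upper bound of each range is twice the frequency of the lower bound.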

The review aggregates significant results from studies on crossmodal correspondences between sounds and tastes. The table from this review is presented below.

  • Sweet was rated at valence/arousal (+2, +1) in a low concentration and (+4, +3) in a high concentration, indicating a tendency toward enthusiastic and elated emotions;
  • Sour was rated at valence/arousal (-1.5, +0.5) in a low concentration and (-2, +2) in a high concentration, indicating a tendency toward tense and nervous emotions;
  • Bitter was rated at valence/arousal (-1, -1) in a low concentration and (-3, +0.5) in a high concentration, indicating a tendency toward upset emotions;
  • Salty was rated at valence/arousal (-0.5, -1) in a low concentration and (-2.5, +1) in a high concentration, indicating a tendency toward stressed emotions.
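The valence/arousal pairs above can be located on a circumplex by their quadrant and their angle from the positive valence axis. The sketch below is only an illustration: the quadrant labels (pleasant/unpleasant, activated/deactivated) and the helper names `quadrant` and `angle_deg` are assumptions, not terms or methods taken from the study.

```python
import math

# (valence, arousal) ratings from the list above: (low, high) concentration.
RATINGS = {
    "sweet": ((2.0, 1.0), (4.0, 3.0)),
    "sour": ((-1.5, 0.5), (-2.0, 2.0)),
    "bitter": ((-1.0, -1.0), (-3.0, 0.5)),
    "salty": ((-0.5, -1.0), (-2.5, 1.0)),
}


def quadrant(valence: float, arousal: float) -> str:
    """Hypothetical helper: name the circumplex quadrant of a rating."""
    v = "pleasant" if valence >= 0 else "unpleasant"
    a = "activated" if arousal >= 0 else "deactivated"
    return f"{v}-{a}"


def angle_deg(valence: float, arousal: float) -> float:
    """Angle in degrees, counter-clockwise from the positive valence axis."""
    return math.degrees(math.atan2(arousal, valence)) % 360


for taste, (low, high) in RATINGS.items():
    print(
        f"{taste}: low -> {quadrant(*low)} ({angle_deg(*low):.0f} deg), "
        f"high -> {quadrant(*high)} ({angle_deg(*high):.0f} deg)"
    )
```

Read this way, the listed ratings for bitter and salty move from the deactivated half of the plane at low concentration to the activated half at high concentration, while sweet and sour stay in the same quadrant and move further from the origin.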

Janice Wang (2016) evaluated how participants matched basic tastes in different concentrations to the pitch and volume of sounds, as well as the role of emotions in crossmodal correspondences between taste and pitch/volume. A valence and arousal rating model was used to define the different emotions. The results showed that taste quality and concentration significantly affect the choice of volume, pitch, and valence/arousal rating.

In 2012, Mesz advanced his investigation by using an algorithmic approach. With an algorithm, an extensive collection of music could be explored to find a close quotation or a matching melodic contour. He applied the results from his previous experiment (pitch, duration, articulation, and loudness) to create "flavoured" musical pieces reminiscent of the classical repertoire. The results were intriguing: participants were able to decode the taste behind the musical pieces well above chance level, and 59 percent of participants achieved a perfect match.
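The text does not describe how the algorithm searched for melodic contours, so the following is not Mesz's method. It is a minimal sketch of one common way to compare contours: reducing each melody to a string of up/down/repeat steps (often called Parsons code) and looking for the query's contour inside each melody of a collection. The function names and the toy collection are hypothetical.

```python
def contour(pitches):
    """Reduce a pitch sequence (e.g. MIDI note numbers) to U/D/R steps."""
    steps = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur > prev:
            steps.append("U")  # pitch goes up
        elif cur < prev:
            steps.append("D")  # pitch goes down
        else:
            steps.append("R")  # pitch repeats
    return "".join(steps)


def find_contour(query, collection):
    """Return the titles whose contour contains the query's contour."""
    target = contour(query)
    return [
        title
        for title, melody in collection.items()
        if target in contour(melody)
    ]


# Hypothetical toy collection (MIDI note numbers), not data from the study.
COLLECTION = {
    "melody_a": [60, 62, 64, 62, 60, 60],  # contour: UUDDR
    "melody_b": [72, 71, 69, 71, 72],      # contour: DDUU
}

# The query rises twice and then falls (contour: UUD).
print(find_contour([50, 52, 55, 53], COLLECTION))
```

Because only the direction of each step is kept, the query matches melodies in any key or register, which is what makes a contour search useful for finding loose quotations rather than exact ones.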

This type of experiment has been conducted before with wines and other beverages such as beer and whiskey. For example, one report found that wine was rated as more solemn and more powerful when powerful and heavy music, Carmina Burana by Carl Orff, was played in the background.

Moreover, Mesz observed that some common words came to the musicians' minds while improvising on the taste words:

  • Sweet triggered the word delicate;
  • Bitter triggered the words pain and sad;
  • Salty triggered the words joy, energizing, unpleasantness, and restlessness;
  • Sour triggered the words unpleasantness, fear, fast, cruel, and power.

Mesz and colleagues investigated whether the basic taste names were reliably associated with specific musical parameters by having trained musicians improvise on taste words. They then evaluated whether the improvisations elicited consistent taste words in the general population. In the experiment, Mesz examined not only the taste words, but also four control words common in musical practice: Determined, Delicate, Ferocious, and Sorrowful.

The taste words showed a weaker pattern than the control words, but a still consistent one:

  • Sour was improvised with high pitch, long durations, and high dissonance;
  • Bitter was improvised with low pitch and low articulation;
  • Sweet was improvised with long durations, low dissonance, low articulation, and soft dynamics;
  • Salty was improvised with short durations and high articulation.

Concentration and volume vary directly: a lower concentration corresponds to a lower volume, and vice versa.

This result can be anticipated, as the correspondence between these two attributes could be categorized as a Statistical Correspondence.

The more intricate crossmodal correspondence between music and complex flavours, namely wine, was explored in this research. The experiments were divided into two parts.

The results demonstrated that participants' ratings of the match between music and wine were not random, and that enjoyment was rated significantly higher when wines were tasted with music.

The results of several experiments point to possible mechanisms that create taste-auditory crossmodal correspondences:

  • Intensity matching - This kind of matching covers mappings between stimulus attributes that are "magnitude based", in other words, attributes that can be described in terms of "less" and "more". For example, increasing the volume of a sound stimulus could enhance the perceived intensity of a taste;
  • Hedonic matching - Certain crossmodal matches may be mediated by the common emotional valence of stimuli from different sensory modalities. For example, people tend to match a less pleasant taste with a less pleasant sound;
  • Statistical co-occurrences - As mentioned in the first part, statistical co-occurrences refer to crossmodal correspondences between pairs of stimuli that are correlated in nature. In this case, the matching of bitter and sweet to low and high vowel consonance may originate in innate orofacial gestures. (Humans tend to protrude the tongue in response to taste stimuli: outward and upward in response to a pleasant taste, and outward and downward in response to an unpleasant taste.);
  • Semantic matching - This kind of matching may occur when the same terms or concepts are used for two stimuli from different sensory modalities. For example, the words high and low describe both pitch and spatial elevation.

The relation between emotions and taste was tested by Robin et al. (2003). They mapped the four basic tastes onto Ekman's model of emotions (happiness, surprise, sadness, fear, anger, and disgust); the results are listed below:

  • Sweet could be transcribed as happiness and surprise;
  • Bitter could be transcribed as anger and disgust;
  • Sour and salty were associated with a variety of different emotions.

In her research, Wang used the valence and arousal model instead of emotion names, and the result is shown in the figure.

Several striking consistencies make it possible to summarize the sound attributes of each taste as follows:

  • Sweet - legato, consonance, even rhythm, small melodic intervals, slow tempo, high pitch, and low (soft) volume;
  • Sour - average to staccato articulation, dissonance, syncopated rhythm, large melodic intervals, fast tempo, high pitch, and average volume;
  • Salty - staccato, average consonance, average to low pitch, and average volume;
  • Bitter - various conclusions on articulation, average consonance, slow tempo, low pitch, and average volume.

Knoferle and Spence compiled research on sound-taste crossmodal correspondences. Their review aims to consolidate prior findings, to point out several domains that would benefit from future research, and to highlight conflicting results. The review also covers potential caveats, theoretical challenges, and potential mechanisms underlying auditory-gustatory crossmodal correspondences.

There is a significant correlation between pitch and valence ratings. In other words, people tend to match a pleasant musical pitch with a pleasant taste. For example, people match a higher pitch, which is statistically rated as pleasant, with a sweet taste. This result shows that hedonic matching might play an important role in sound-taste crossmodal correspondences.

Besides the aforementioned topics, the texture of taste stimuli might play a role; for example, we can assume that a lighter texture might match a higher pitch or a lighter sound. Conversely, a heavier texture might match a lower pitch or a heavier sound.

The second part was designed to observe whether playing matching music during a wine tasting would affect the taster's perception of wine compared with tasting in silence.

The results also gave some indication of a relation between music and psychological states. For example:

  • Sweet is associated with a high value of psychoacoustical pleasantness (soft sound and low roughness);
  • Sour is associated with a high value of sensory sharpness (loud, high pitch, and dissonant).