Vowels and Diphthongs
Diction is the art of speaking clearly so that each word is heard and understood. The words to songs are the stories that capture the emotion; they are the poetry. Words are what separate singers from other musicians. Understanding the impact of diction choices enables singers to best communicate the story to the audience so that they may develop a deeper connection to the song’s meaning(s).
When one sings, one is singing vowels. Vowels are like colors; there are infinite shades. An E can be piercing or airy — or it can be mixed with a hint of an EH sound or a U.
Try it: U is similar to E in many ways; they are the most resonant vowels. Do a long tone on E, then a long tone on U. What changes in your mouth? Your tongue? Your lips? Where do you feel the vibration?
Start to slide between the E sound and the U sound. Make the transition without moving your lips. Then make the transition by moving only your lips. How does each approach affect the sound of the vowel?
Experiment with all the vowels: E-Eh-Ah-Oh-Ou. Which vowels have a similar shape in your mouth?
In English, and especially in rock, pop and jazz, a pure vowel sound is rarely sung. Exercises such as this help singers become more aware of how their tongue and lips affect the sound of each vowel. These types of exercises can help determine how clearly one’s lyrics come across.
Many vocal teachers start students with Italian songs because this language uses mostly pure vowel sounds. I teach rock, pop and jazz, so my students are usually not singing in multiple languages. Instead, I use exercises with pure vowels and explain the Italian pure vowels to students, but work exclusively with English diction.
One of the greatest challenges for beginning students singing in English is diphthongs. Diphthong means “two sounds” in Greek and refers to two (or sometimes three!) adjacent vowel sounds occurring within the same syllable. The tongue and/or lips move during the pronunciation of the vowel. When singing diphthongs in rock, pop, and jazz, one usually holds the first vowel sound and adds the second vowel sound only at the very end. Similarly, we have to watch out for certain consonants coloring our vowels. The biggest offenders are R, L, M, N, and NG.
The above has to be learned. It’s not how we speak, so it can seem counterintuitive at first, but it will make a huge difference in the natural beauty of one’s tone. Remember: we sing the vowels.
* Sing the word “bye”: sing bah and hold out the ah sound, adding ee only at the very end.
* Sing the word “bay”: sustain the eh sound and add the ee at the very end.
* Try singing the words letting the second vowel ring out. See how it changes the sound of the word.
* Sing the word “car”
Try singing caaaah‑r
Now sing carrrr
* Sing the word “spring”
Try singing spriiii-ing
Now try sprinnnng
Notice the difference — the latter is usually not a desirable sound in singing. The word is distorted, sounds tense, and is much more challenging to sustain.
Initially this takes focus, but it will eventually become second nature. If you don’t like the way something sounds, it’s always good to analyze the vowels. You are most likely coloring the vowel with the consonant that comes after it, accentuating the end of a diphthong, or using a vowel sound that is incorrect.
Try it: Sing your song using just the vowel sounds. If the words are “there was a boy”, you would sing ehr uh ay oy. Then sing it again adding the consonants. Did it change the way you sang the song?
There are some stylistic exceptions with diphthongs. For instance, when singing country music, one might put just as much emphasis on the second vowel sound, if not more. The word “they”, for example, would be pronounced theh-eeee. Another example, from pop music, is pulsing in rhythm between vowel sounds at the end of a phrase; one could sing the word “plain” like pleh-ee-eh-ee-eh-ee-eheen.
I enjoy exploring consonants via rhythmic exercises.
Try it: Go through the alphabet using a metronome.
B bb B bb
C cccc C cccc
d ff d ff d ff ff ff
…then switch it up a bit…
d k — k B k — k
d gh d gh z — v Z‑z-z‑z
When improvising or using your voice like an instrument, practicing with the metronome or beat helps students develop a physical sense of timing. This exercise will help build the tongue and lip muscles and organically aid with enunciation.
I prefer the use of songs when working on consonants and enunciation. A singer has an intuitive knowledge of consonants. The main concern is enunciating, but not over-enunciating. Sometimes the ends of words get lost. R’s and L’s both tend to color the vowel sound too much. Less is needed than one would think; a subtle R or L will be understood clearly.
Although there are many similarities, diction technique varies between styles of music. Amplified and unamplified music require different techniques. If a singer always uses a microphone when performing, they should practice their songs amplified and be aware of basic mic technique.
When singing with a microphone, it is important to soften consonants like ss, pp, kk, and anything that will pop. In a recording setting, consonants should pop even less. Unfortunately, singers can get quite lazy when using a microphone, and lyrics may be misunderstood or completely lost. Oftentimes, the first part of the word is understood, but the ending consonant is barely audible. Many singers benefit from working with a coach before they record. Microphone technique is another ball game, so to speak; microphones used for live performance and recording are very different, so it naturally follows that one must consciously alter how one sings while in session.
The ultimate goal with diction is to connect and communicate, with the control necessary to choose a color and mood that fit the words and the story. There is no single proper way to play a note or sing a word. It changes with every song, phrase, musician and ensemble.