How to read the chart
The chart is a grid. Each row begins with an initial consonant such as b, d, j, zh, or z. Each column is a final: the vowel sound (and sometimes a nasal ending) that completes the syllable. A filled cell means that initial and final combine to form a valid Mandarin syllable. An empty cell means the combination doesn't exist in standard Mandarin. The gap is not a typo but a feature of the language's phonology. The top row holds the standalone finals: syllables that have no initial consonant at all, like a, ai, an, ang.
Click any filled cell to hear the syllable pronounced in all four tones. This is the quickest way to internalize how the same syllable shape shifts meaning when its pitch contour changes.
What's an initial?
An initial is the consonant (or consonant-like sound) that opens a syllable. Mandarin has 21 initials, grouped by where and how they're produced:
- Labials: b, p, m, f
- Alveolars: d, t, n, l
- Velars: g, k, h
- Palatals: j, q, x
- Retroflexes: zh, ch, sh, r
- Dental sibilants: z, c, s
The palatals and retroflexes are the two groups learners confuse most often. They can look similar on paper but are produced very differently in the mouth. Why j/q/x and zh/ch/sh feel similar walks through the contrast, and the retroflex challenge covers the curled-tongue series in detail.
A few syllables don't start with a consonant at all. In those cases pinyin uses y or w as a spelling helper to mark the syllable boundary, even when no extra sound is added. Using y and w as syllable starters explains the rule.
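The spelling convention can be sketched in a few lines of Python. This is a minimal illustration, not the chart's own code: the function name `spell_zero_initial` is hypothetical, and the rules encoded here are the standard Hanyu Pinyin conventions for finals beginning with i, u, or ü.

```python
def spell_zero_initial(final: str) -> str:
    """Spell a final that stands alone as a syllable (no initial consonant)."""
    # Abbreviated finals are expanded first: iu = iou, ui = uei, un = uen.
    expanded = {"iu": "iou", "ui": "uei", "un": "uen"}.get(final, final)

    if expanded.startswith("ü"):
        # ü, üe, üan, ün -> yu, yue, yuan, yun (the dots are dropped after y)
        return "yu" + expanded[1:]
    if expanded in ("i", "in", "ing"):
        # i is the only vowel, so y is added in front: i -> yi
        return "y" + expanded
    if expanded.startswith("i"):
        # i is followed by another vowel, so y replaces it: ia -> ya
        return "y" + expanded[1:]
    if expanded == "u":
        # u is the only vowel, so w is added in front: u -> wu
        return "wu"
    if expanded.startswith("u"):
        # u is followed by another vowel, so w replaces it: ua -> wa
        return "w" + expanded[1:]
    # Finals such as a, ai, an, ang, e, ou need no helper letter.
    return expanded
```

For example, `spell_zero_initial("ia")` gives `ya`, while `spell_zero_initial("ang")` is returned unchanged because it needs no helper letter.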
What's a final?
A final is everything that comes after the initial: a single vowel, a vowel cluster, or a vowel followed by a nasal ending. Mandarin uses around 35 finals, divided into three groups:
- Simple vowels: a, o, e, i, u, ü
- Compound vowels: ai, ei, ao, ou, ia, ie, iao, iu, ua, uo, uai, ui, üe
- Nasal finals: an, en, in, un, ang, eng, ing, ong, plus combinations like ian, uan, iang, uang
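Because every syllable is an optional initial followed by a final, a written syllable can be split mechanically. The sketch below is illustrative: `split_syllable` is a hypothetical helper, and it treats y and w as part of the spelling of the final rather than as initials.

```python
# The 21 initials. Two-letter initials come first so that "zh" is not
# mistaken for "z" followed by a final starting with "h".
INITIALS = ["zh", "ch", "sh",
            "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s"]

def split_syllable(syllable: str) -> tuple[str, str]:
    """Return (initial, final); the initial is "" for standalone finals."""
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    # No initial matched, so the whole syllable is a final.
    return "", syllable
```

With this, `split_syllable("zhuang")` yields `("zh", "uang")` and `split_syllable("ang")` yields `("", "ang")`.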
The vowels look like English letters but rarely sound like them. The same letter can shift pronunciation depending on the surrounding sounds, which is why beginners often misread cells in the chart. The pure vowel e and why i sounds different in chi vs li are good starting points for these mismatches.
A special note on ü (yu): it is spelled as plain u after j, q, x, and y. A true u sound never follows those letters, so no ambiguity is possible. The tricky u after j/q/x covers this shorthand, and the invisible dots explains why the umlaut quietly disappears.
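The shorthand is easy to undo in code. The following sketch uses a hypothetical helper, `restore_umlaut`, that puts the dots back wherever a written u after j, q, x, or y stands for ü. It assumes its input is a single toneless pinyin syllable.

```python
def restore_umlaut(syllable: str) -> str:
    """Rewrite the u that stands for ü after j, q, x, or y as an explicit ü."""
    if syllable[:1] in ("j", "q", "x", "y") and syllable[1:2] == "u":
        return syllable[0] + "ü" + syllable[2:]
    # Any other syllable is already spelled the way it sounds.
    return syllable
```

So `restore_umlaut("xue")` returns `xüe`, while `restore_umlaut("lu")` and `restore_umlaut("jiu")` come back unchanged, since the u in those syllables is a real u.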
Why about 400 syllables?
If you multiply 21 initials by 35 finals you get over 700 possible combinations (21 × 35 = 735), yet only about 400 actually exist. The rest are gaps the language never adopted, the way English has no native words starting with ng-. Add the four tones and the neutral fifth, and the spoken inventory grows to roughly 1,300 distinct toned syllables. Understanding the pinyin syllable structure goes deeper into why some combinations exist and others don't.
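The arithmetic behind these figures can be checked quickly. The numbers below are the rounded counts quoted in the text, not exact tallies, and the variable names are only illustrative.

```python
initials = 21
finals = 35              # "around 35"; the exact count depends on how finals are tallied
attested = 400           # approximate number of syllables that actually occur
toned_attested = 1300    # approximate number of distinct toned syllables

# Every initial paired with every final.
possible = initials * finals
print(possible)                                   # 735
print(round(attested / possible, 2))              # 0.54

# Upper bound if every syllable occurred in all four tones plus the neutral tone.
toned_slots = attested * 5
print(toned_slots)                                # 2000
print(round(toned_attested / toned_slots, 2))     # 0.65
```

On these rough figures, a little over half of the grid is filled, and about two thirds of the possible syllable-and-tone pairings are in use.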
Tones turn syllables into words
Every filled cell can be pronounced with one of four tones, plus a lighter neutral tone. The first tone is high and flat, the second rises, the third dips then rises, and the fourth falls sharply. Same syllable, four meanings: mā (mother), má (hemp), mǎ (horse), mà (to scold). The four tones introduces them, and the neutral fifth tone covers the lighter unstressed reading.
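To see how the four marks attach to one syllable, here is a small Python sketch. The helper `add_tone` is hypothetical. The placement rule it encodes (a or e takes the mark, o takes it in ou, otherwise the last vowel does) is the standard pinyin convention rather than something the chart itself spells out.

```python
import unicodedata

# Combining diacritics for tones 1-4: macron, acute, caron, grave.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}

def add_tone(syllable: str, tone: int) -> str:
    """Return the syllable with a tone mark; the neutral tone is left unmarked."""
    if tone not in TONE_MARKS:
        return syllable
    # Standard placement: a or e takes the mark if present; in "ou" the o
    # takes it; otherwise the last vowel in the syllable does.
    for vowel in ("a", "e"):
        if vowel in syllable:
            index = syllable.index(vowel)
            break
    else:
        if "ou" in syllable:
            index = syllable.index("o")
        else:
            positions = [i for i, ch in enumerate(syllable) if ch in "aeiouü"]
            if not positions:
                return syllable
            index = positions[-1]
    marked = syllable[: index + 1] + TONE_MARKS[tone] + syllable[index + 1 :]
    # NFC folds the base letter and the combining mark into one character.
    return unicodedata.normalize("NFC", marked)

print([add_tone("ma", tone) for tone in (1, 2, 3, 4)])
```

Running the last line prints the four readings of ma from the example above, and `add_tone("ma", 5)` returns plain `ma` for the neutral tone.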
How to practice with the chart
A few approaches work well:
- Read across a row. Pick one initial (say j) and click through every final it pairs with. This trains your ear to hear the same consonant shaping different vowel sounds.
- Read down a column. Pick one final and hear how different initials change the syllable. This is where the retroflex and palatal contrast becomes obvious.
- Drill your trouble spots. Most learners stumble on the same few cells. Bookmark them and return until they stop tripping you up.
- Pair each syllable with its tone. Reading without tone is half the work. Click each cell and repeat aloud, matching the pitch contour as you go.
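The row and column drills above map directly onto the grid structure. The sketch below models a tiny, hand-picked excerpt of the chart. The `CHART` data and both function names are illustrative assumptions, not the chart's actual implementation.

```python
# Each initial maps to the set of finals it combines with (the filled cells).
# "" stands for the top row of standalone finals. Excerpt only: the full
# chart has 21 initials and roughly 35 finals.
CHART = {
    "":   {"a", "ai", "an", "ang"},
    "b":  {"a", "ai", "an", "ang", "u"},
    "j":  {"ia", "ie", "in", "ing"},
    "zh": {"a", "ai", "an", "ang", "u"},
}

def read_row(initial: str) -> list[str]:
    """All syllables in one row: a fixed initial with every final it takes."""
    return sorted(initial + final for final in CHART[initial])

def read_column(final: str) -> list[str]:
    """All syllables in one column: a fixed final with every initial that takes it."""
    return sorted(
        initial + final
        for initial, finals in CHART.items()
        if final in finals
    )

print(read_row("j"))        # ['jia', 'jie', 'jin', 'jing']
print(read_column("ang"))   # ['ang', 'bang', 'zhang']
```

Empty cells need no special handling here: a combination that doesn't exist is simply absent from the set, so it never shows up in a row or column.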
Once the chart starts feeling familiar, the jump from reading pinyin in isolation to recognizing it inside actual Chinese words gets a lot shorter.