A Brief Primer on CJK Languages

This is a brief primer explaining the CJK (Chinese/Japanese/Korean) languages, their roots, how they are alike, and how they are different. I originally wrote it several years ago to explain this to an American friend, and recently came across it again while digging through some old emails.

I thought it might be of some use to someone, someday, so I decided to post it here.


The Chinese have been around as a civilized culture for thousands of years. In fact, they are known to have had dynasties with complex hierarchical structures and political intrigue as early as 2,000 BC. As for their written language, we have archaeological evidence dating back to before 1,000 BC.

The Chinese written language doesn’t have an “alphabet” that speakers of Latin-based languages would recognize; its characters are instead an evolved form of hieroglyphs. In fact, it is the only hieroglyph-based writing system still in use today, after others like Egyptian and Mayan died out. If you were to count each Chinese character as a letter of an “alphabet”, there would be over 40,000 of them, with at least 2,000 used in daily life! While this may sound shocking and totally impractical, learning it isn’t as difficult as it sounds.

The unique thing about the Chinese written language is that each character actually means something. For example, one character may mean “me” while another means “go”, and another means “house/home”. These three characters form a sentence meaning “I’m going home” (Chinese: 我回家, though I’m not sure if your browser can display this). Having so many different characters to represent different objects and ideas gives the language a kind of clarity, as you’ll seldom find ambiguous words. More than one Westerner learning Chinese has been known to half-jokingly exclaim that “there’s a different word for every single thing under the sky!”

Most (around 96%) Chinese characters are made up of semantic and phonetic elements, which serve as a sort of “meta-alphabet”. So once you learn a few hundred “words”, you’ll become familiar with these elements and be able to guess at the meaning and pronunciation of words that you don’t even know.

Many words are actually a combination of several characters. For example, “electricity” and “brain” combine to form “electronic brain”, which is the Chinese word for computer. Or “fire” and “car” combine to form “train”, a reference to steam locomotives, which burned coal. Proper nouns such as “Obama”, “Napoleon”, and “Hitler” are more of a problem, as you cannot equate any semantic components to them, so they are spelled out using characters chosen for their sound rather than their meaning. This gets the job done, but feels inelegant. That’s why some foreign companies seeking to appeal to the Chinese public spend a lot of effort on creating a Chinese brand name. One classic example is Coca-Cola, which is known as “Ke Kou Ke Le”. In addition to sounding similar to the original English pronunciation, it carries the meaning of “the more you drink, the happier you get”!

As for the spoken language, there are 7 major dialect groups. They are called “major” because people from one group can’t understand (or can only understand with extreme difficulty) what someone from another group is saying! It’s almost as if there were 7 distinct languages, with dozens of dialects within each. But the Chinese had the good sense to unify the written part of their language into a single script, so people from different dialect groups can still communicate on paper.

Now let’s move on to Korean and Japanese.

Most of Asia knows China by the name of “Zhong Guo” (Chinese: 中国), or “Central Empire” (though a more direct translation would be “Middle Kingdom”), because it was the center of civilization in the region for as long as anyone can remember. This being the case, many Asian languages (including Korean and Japanese) have been heavily influenced by Chinese.

Korea and Japan have their own spoken as well as written languages. But similar to how French was “the classy language of the ruling class and intellectuals” for a long time in most parts of Europe, ancient Koreans and Japanese tended to look down on people who couldn’t speak Chinese. This legacy continues even now, as a good part of modern Korean and Japanese vocabulary is actually made up of Chinese words pronounced in their own language.

Think of all those English words with non-Anglo-Saxon origins, which came from French or Latin instead. Now imagine that all these foreign-origin words come from a single language, and that they make up more than 70% of the vocabulary. That would be a pretty accurate picture of what the Korean and Japanese languages are like.

The Korean and Japanese written languages are based on alphabetic systems, and thus are purely phonetic. The Korean alphabet consists of 14 consonants and 6 basic vowels, although these can combine to form “stressed consonants” and “complex vowels”. The Japanese “alphabet” (strictly speaking a syllabary, since most of its symbols stand for a whole syllable) consists of 5 singular vowels, 39 distinct consonant-vowel unions, and one singular consonant.

Since each Chinese character traditionally had a distinct meaning, the Chinese don’t seem to have felt the need to differentiate the pronunciations much, and many characters with different meanings have similar pronunciations. I say “similar”, because the spoken language (Mandarin, at least) has 4 distinct “tones” or inflections for each sound, and this minimizes ambiguity.
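
To make the role of tones a bit more concrete, here is a small Python sketch built around the syllable “ma” in Mandarin, the usual textbook example. The table and the two helper functions are my own illustration (the names are made up), and the tone numbers follow the common 1-to-4 convention.

```python
# Four Mandarin characters that share the syllable "ma" and differ only in tone.
# Keys are (syllable, tone number); this is a tiny hand-picked sample.
MA_BY_TONE = {
    ("ma", 1): ("妈", "mother"),
    ("ma", 2): ("麻", "hemp"),
    ("ma", 3): ("马", "horse"),
    ("ma", 4): ("骂", "to scold"),
}


def lookup(syllable, tone):
    """Return (character, meaning) for a syllable spoken with a given tone."""
    return MA_BY_TONE[(syllable, tone)]


def candidates(syllable):
    """Return every character that matches once the tone is ignored."""
    return [
        char
        for (syl, _tone), (char, _meaning) in sorted(MA_BY_TONE.items())
        if syl == syllable
    ]


print(lookup("ma", 3))     # with the tone, exactly one match
print(candidates("ma"))    # without the tone, four possible characters
```

With the tone, the lookup lands on a single character; drop the tone and the same sound could be any of the four.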

However, this becomes a problem when the same Chinese words are “borrowed” into a purely phonetic alphabetical system like Korean and Japanese, as a single written Korean/Japanese word can often stand for multiple different Chinese characters, and therefore different meanings. This is why the habit of writing a mixture of Chinese and Korean/Japanese characters remains even today, although it has been falling out of practice in Korea over the last two dozen years.
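
As a rough illustration of that ambiguity, the Python sketch below takes one Japanese phonetic spelling, “きしゃ” (kisha), and lists a few Chinese-character words that are all written that way. The word list is a small sample I picked by hand, and the dictionary and function names are hypothetical, so treat it as a sketch rather than a real dictionary lookup.

```python
# Borrowed Chinese words that are all spelled "きしゃ" (kisha) in phonetic script.
# A hand-picked sample for illustration, not a complete list.
KANA_TO_KANJI = {
    "きしゃ": [
        ("記者", "reporter"),
        ("汽車", "steam train"),
        ("貴社", "your company"),
        ("帰社", "returning to the office"),
    ],
}


def possible_words(kana):
    """Return every (characters, meaning) pair that shares this phonetic spelling."""
    return KANA_TO_KANJI.get(kana, [])


def is_ambiguous(kana):
    """True when a phonetic spelling maps to more than one Chinese-character word."""
    return len(possible_words(kana)) > 1


for characters, meaning in possible_words("きしゃ"):
    print(characters, "=", meaning)
print(is_ambiguous("きしゃ"))
```

Written phonetically, the reader has to rely on context to pick the right word; written in Chinese characters, each of the four is unmistakable.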

The fact remains that while Chinese, Japanese and Korean speakers cannot understand each other in conversation, they can easily make out more than half of what the other is saying if they write down their words, and replace all “borrowed” words with their Chinese characters.