Unicode is not UTF-8
SwapnonilMukherjee
One of the common misconceptions about character encoding is that UTF-8 means Unicode and that Unicode means UTF-8. The two terms are used interchangeably, as if they were the same. In fact, they are completely different things. One is cheese and the other chalk; one is an apple and the other an orange.
Unicode is an abstract representation of the characters of almost all the known languages in the world. Each character is assigned a number called a code point. For example, the English word H e l p is represented in Unicode as U+0048 U+0065 U+006C U+0070.
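To see those code points for yourself, here is a minimal Python sketch. It assumes Python 3, where every string is a sequence of Unicode characters; the variable names are only for illustration.

```python
# Look up the Unicode code point of each character in "Help".
word = "Help"

# ord() returns the code point as an integer; format it in the usual U+XXXX style.
code_points = [f"U+{ord(ch):04X}" for ch in word]

print(" ".join(code_points))  # U+0048 U+0065 U+006C U+0070
```

Notice that nothing here says how the characters are stored. A code point is just a number assigned to a character.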
OK, so now that I know how to represent "Help" in Unicode, how do I store it? You see, U+0048 is a bunch of letters and numbers. Computers don't understand letters and numbers. They only understand 0s and 1s.
That's where encoders come in. They convert the Unicode representation of characters into 0s and 1s so that they can be stored in computer memory. The process carried out by encoders is known as encoding. Here is how things work: each character maps to a Unicode code point, and an encoder turns that code point into bytes.
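As a rough illustration of that step, the sketch below encodes "Help" with Python's built-in str.encode and prints the resulting bytes, first in hex and then as 0s and 1s. It assumes Python 3.8 or later (for the separator argument to bytes.hex), and UTF-8 is picked here only as an example encoder.

```python
# Turn the abstract characters of "Help" into concrete bytes.
word = "Help"

encoded = word.encode("utf-8")  # a bytes object, one byte per character here

# Show each byte as two hex digits, then as eight bits.
hex_view = encoded.hex(" ")
bits_view = " ".join(f"{byte:08b}" for byte in encoded)

print(hex_view)   # 48 65 6c 70
print(bits_view)  # 01001000 01100101 01101100 01110000

# Decoding with the same encoding gives the original text back.
decoded = encoded.decode("utf-8")
print(decoded)    # Help
```

The string and the bytes are two different things: the first is a sequence of Unicode characters, the second is what an encoder produced from them.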
As of today, several encoders can convert Unicode characters to 0s and 1s. The most common are UTF-8, UTF-16, and UTF-32. UTF-8 is the one that is wildly popular. The reason for its popularity is that it is backward compatible with ASCII. UTF-8 uses 8 bits, that is 1 byte, to store every ASCII character, which covers the basic English letters, digits, and punctuation. For other characters it uses 2 to 4 bytes. (The original design allowed up to 6 bytes, but the current standard caps it at 4.)
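A small Python sketch can make the difference between encoders concrete. The sample characters below are my own picks, not part of any standard list, and "utf-16-be" is used so that Python does not prepend a byte order mark to the output.

```python
# Compare how many bytes UTF-8 and UTF-16 need for the same characters.
samples = ["H", "é", "€", "😀"]  # illustrative characters, chosen arbitrarily

sizes = {}
for ch in samples:
    utf8_len = len(ch.encode("utf-8"))
    utf16_len = len(ch.encode("utf-16-be"))  # big-endian, no byte order mark
    sizes[ch] = (utf8_len, utf16_len)
    print(f"U+{ord(ch):04X}  UTF-8: {utf8_len} byte(s)  UTF-16: {utf16_len} byte(s)")

# ASCII compatibility: plain English text encodes to identical bytes either way.
same_as_ascii = "Help".encode("ascii") == "Help".encode("utf-8")
print(same_as_ascii)  # True
```

The same code point ends up as a different number of bytes depending on the encoder, which is exactly why Unicode and UTF-8 cannot be the same thing.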
So repeat after me:
"Unicode is an Abstract Representation of all characters in all languages"
"UTF-8 is an Encoding Scheme to Store all Unicode characters in computer memory"