Letter case
Locale-specific case mappings
Certain scripts, such as Latin, Greek, and Cyrillic, distinguish letter cases. The mappings between lower case characters and upper case characters are locale-sensitive.
👉 Note: When converting the case of characters in your application, it is highly recommended to use the String.toLowerCase(locale) and String.toUpperCase(locale) methods and to specify the Locale argument explicitly.
For example, English uses the Latin script. Its alphabet consists of 26 letters:
| Upper case | Code point | Name | Lower case | Code point | Name |
|---|---|---|---|---|---|
| A | U+000041 | LATIN CAPITAL LETTER A | a | U+000061 | LATIN SMALL LETTER A |
| B | U+000042 | LATIN CAPITAL LETTER B | b | U+000062 | LATIN SMALL LETTER B |
| C | U+000043 | LATIN CAPITAL LETTER C | c | U+000063 | LATIN SMALL LETTER C |
| D | U+000044 | LATIN CAPITAL LETTER D | d | U+000064 | LATIN SMALL LETTER D |
| E | U+000045 | LATIN CAPITAL LETTER E | e | U+000065 | LATIN SMALL LETTER E |
| F | U+000046 | LATIN CAPITAL LETTER F | f | U+000066 | LATIN SMALL LETTER F |
| G | U+000047 | LATIN CAPITAL LETTER G | g | U+000067 | LATIN SMALL LETTER G |
| H | U+000048 | LATIN CAPITAL LETTER H | h | U+000068 | LATIN SMALL LETTER H |
| I | U+000049 | LATIN CAPITAL LETTER I | i | U+000069 | LATIN SMALL LETTER I |
| J | U+00004A | LATIN CAPITAL LETTER J | j | U+00006A | LATIN SMALL LETTER J |
| K | U+00004B | LATIN CAPITAL LETTER K | k | U+00006B | LATIN SMALL LETTER K |
| L | U+00004C | LATIN CAPITAL LETTER L | l | U+00006C | LATIN SMALL LETTER L |
| M | U+00004D | LATIN CAPITAL LETTER M | m | U+00006D | LATIN SMALL LETTER M |
| N | U+00004E | LATIN CAPITAL LETTER N | n | U+00006E | LATIN SMALL LETTER N |
| O | U+00004F | LATIN CAPITAL LETTER O | o | U+00006F | LATIN SMALL LETTER O |
| P | U+000050 | LATIN CAPITAL LETTER P | p | U+000070 | LATIN SMALL LETTER P |
| Q | U+000051 | LATIN CAPITAL LETTER Q | q | U+000071 | LATIN SMALL LETTER Q |
| R | U+000052 | LATIN CAPITAL LETTER R | r | U+000072 | LATIN SMALL LETTER R |
| S | U+000053 | LATIN CAPITAL LETTER S | s | U+000073 | LATIN SMALL LETTER S |
| T | U+000054 | LATIN CAPITAL LETTER T | t | U+000074 | LATIN SMALL LETTER T |
| U | U+000055 | LATIN CAPITAL LETTER U | u | U+000075 | LATIN SMALL LETTER U |
| V | U+000056 | LATIN CAPITAL LETTER V | v | U+000076 | LATIN SMALL LETTER V |
| W | U+000057 | LATIN CAPITAL LETTER W | w | U+000077 | LATIN SMALL LETTER W |
| X | U+000058 | LATIN CAPITAL LETTER X | x | U+000078 | LATIN SMALL LETTER X |
| Y | U+000059 | LATIN CAPITAL LETTER Y | y | U+000079 | LATIN SMALL LETTER Y |
| Z | U+00005A | LATIN CAPITAL LETTER Z | z | U+00007A | LATIN SMALL LETTER Z |
Turkish also uses the Latin script. Its alphabet consists of 29 letters:
| Upper case | Code point | Name | Lower case | Code point | Name |
|---|---|---|---|---|---|
| A | U+000041 | LATIN CAPITAL LETTER A | a | U+000061 | LATIN SMALL LETTER A |
| B | U+000042 | LATIN CAPITAL LETTER B | b | U+000062 | LATIN SMALL LETTER B |
| C | U+000043 | LATIN CAPITAL LETTER C | c | U+000063 | LATIN SMALL LETTER C |
| Ç | U+0000C7 | LATIN CAPITAL LETTER C WITH CEDILLA | ç | U+0000E7 | LATIN SMALL LETTER C WITH CEDILLA |
| D | U+000044 | LATIN CAPITAL LETTER D | d | U+000064 | LATIN SMALL LETTER D |
| E | U+000045 | LATIN CAPITAL LETTER E | e | U+000065 | LATIN SMALL LETTER E |
| F | U+000046 | LATIN CAPITAL LETTER F | f | U+000066 | LATIN SMALL LETTER F |
| G | U+000047 | LATIN CAPITAL LETTER G | g | U+000067 | LATIN SMALL LETTER G |
| Ğ | U+00011E | LATIN CAPITAL LETTER G WITH BREVE | ğ | U+00011F | LATIN SMALL LETTER G WITH BREVE |
| H | U+000048 | LATIN CAPITAL LETTER H | h | U+000068 | LATIN SMALL LETTER H |
| I | U+000049 | LATIN CAPITAL LETTER I | ı | U+000131 | LATIN SMALL LETTER DOTLESS I |
| İ | U+000130 | LATIN CAPITAL LETTER I WITH DOT ABOVE | i | U+000069 | LATIN SMALL LETTER I |
| J | U+00004A | LATIN CAPITAL LETTER J | j | U+00006A | LATIN SMALL LETTER J |
| K | U+00004B | LATIN CAPITAL LETTER K | k | U+00006B | LATIN SMALL LETTER K |
| L | U+00004C | LATIN CAPITAL LETTER L | l | U+00006C | LATIN SMALL LETTER L |
| M | U+00004D | LATIN CAPITAL LETTER M | m | U+00006D | LATIN SMALL LETTER M |
| N | U+00004E | LATIN CAPITAL LETTER N | n | U+00006E | LATIN SMALL LETTER N |
| O | U+00004F | LATIN CAPITAL LETTER O | o | U+00006F | LATIN SMALL LETTER O |
| Ö | U+0000D6 | LATIN CAPITAL LETTER O WITH DIAERESIS | ö | U+0000F6 | LATIN SMALL LETTER O WITH DIAERESIS |
| P | U+000050 | LATIN CAPITAL LETTER P | p | U+000070 | LATIN SMALL LETTER P |
| R | U+000052 | LATIN CAPITAL LETTER R | r | U+000072 | LATIN SMALL LETTER R |
| S | U+000053 | LATIN CAPITAL LETTER S | s | U+000073 | LATIN SMALL LETTER S |
| Ş | U+00015E | LATIN CAPITAL LETTER S WITH CEDILLA | ş | U+00015F | LATIN SMALL LETTER S WITH CEDILLA |
| T | U+000054 | LATIN CAPITAL LETTER T | t | U+000074 | LATIN SMALL LETTER T |
| U | U+000055 | LATIN CAPITAL LETTER U | u | U+000075 | LATIN SMALL LETTER U |
| Ü | U+0000DC | LATIN CAPITAL LETTER U WITH DIAERESIS | ü | U+0000FC | LATIN SMALL LETTER U WITH DIAERESIS |
| V | U+000056 | LATIN CAPITAL LETTER V | v | U+000076 | LATIN SMALL LETTER V |
| Y | U+000059 | LATIN CAPITAL LETTER Y | y | U+000079 | LATIN SMALL LETTER Y |
| Z | U+00005A | LATIN CAPITAL LETTER Z | z | U+00007A | LATIN SMALL LETTER Z |
As these tables show, English and Turkish share many characters, and most of their case mappings are the same. However, the mappings for the letter I differ:
| Language | Upper case | Lower case |
|---|---|---|
| English | I (U+000049 LATIN CAPITAL LETTER I) | i (U+000069 LATIN SMALL LETTER I) |
| Turkish | I (U+000049 LATIN CAPITAL LETTER I) | ı (U+000131 LATIN SMALL LETTER DOTLESS I) |
| Turkish | İ (U+000130 LATIN CAPITAL LETTER I WITH DOT ABOVE) | i (U+000069 LATIN SMALL LETTER I) |
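The difference between the English and Turkish mappings for I can be observed with a small Java sketch. The class name `TurkishCaseDemo` and its helpers `lower` and `upper` are hypothetical names chosen for illustration; only `String.toLowerCase(Locale)`, `String.toUpperCase(Locale)`, and `Locale.forLanguageTag` come from the Java standard library. The sample word is an assumption as well.

```java
import java.util.Locale;

// Hypothetical demo class; not part of any library.
final class TurkishCaseDemo {

    // Lower-cases text using the case mapping rules of the given locale.
    static String lower(String text, Locale locale) {
        return text.toLowerCase(locale);
    }

    // Upper-cases text using the case mapping rules of the given locale.
    static String upper(String text, Locale locale) {
        return text.toUpperCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");

        // English rules: I (U+0049) <-> i (U+0069)
        System.out.println(lower("TITLE", Locale.ENGLISH));
        System.out.println(upper("title", Locale.ENGLISH));

        // Turkish rules: I (U+0049) -> dotless i (U+0131),
        // and i (U+0069) -> I with dot above (U+0130)
        System.out.println(lower("TITLE", turkish));
        System.out.println(upper("title", turkish));
    }
}
```

The same input therefore produces different strings depending on the locale, which is why a comparison such as `upper(input, locale).equals("TITLE")` can fail unexpectedly when the locale is taken implicitly from the user's environment.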
These special case mappings are defined by Unicode and published in SpecialCasing.txt. In some languages, case conversion can even change the length of a string.
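As an illustration of a length-changing conversion, the following sketch upper-cases a German word containing ß (U+00DF LATIN SMALL LETTER SHARP S), which maps to the two-letter sequence SS. German is used here only as an example and is not discussed above; the class name `CaseLengthDemo` and the helper `upperGerman` are hypothetical.

```java
import java.util.Locale;

// Hypothetical demo class; not part of any library.
final class CaseLengthDemo {

    // Upper-cases text using German case mapping rules.
    static String upperGerman(String text) {
        return text.toUpperCase(Locale.GERMAN);
    }

    public static void main(String[] args) {
        String lower = "stra\u00dfe";          // "strasse" spelled with sharp s, 6 chars
        String upper = upperGerman(lower);     // sharp s expands to "SS"

        System.out.println(lower.length());    // 6
        System.out.println(upper.length());    // 7
    }
}
```

Code that allocates a buffer sized from the input length, or that maps character positions between the original and the converted string, should therefore not assume that the two lengths match.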