Unicode

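Basically, a standard assigning a number (code point) to each of a wide range of symbols
  • code points run from 0 to 0x10FFFF, so each one fits in a 32-bit value
  • around 140K symbols, covering well over 100 writing systems (and so many more languages)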
Using 32 bits for every symbol would be too expensive
  • e.g. standard roman alphabet + punctuation (ASCII) needs only 7 bits
More compact character encodings have been developed (e.g. UTF-8)
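
To make the space saving concrete, here is a minimal sketch comparing how many bytes a few symbols need in UTF-8 and in a fixed 32-bit encoding (UTF-32). The sample characters and the helper name encoded_sizes are chosen for illustration only; they are not from the original slides.

```python
# Illustrative sample: an ASCII letter, an accented letter,
# a currency symbol, and an emoji.
samples = ["A", "é", "€", "😀"]

def encoded_sizes(ch):
    """Return (UTF-8 byte count, UTF-32 byte count) for one character."""
    utf8 = ch.encode("utf-8")
    utf32 = ch.encode("utf-32-be")  # big-endian, so no byte-order mark is added
    return len(utf8), len(utf32)

for ch in samples:
    u8, u32 = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}  UTF-8: {u8} byte(s)  UTF-32: {u32} bytes")
```

UTF-32 always uses 4 bytes per symbol, while UTF-8 uses 1 byte for ASCII characters and 2 to 4 bytes for everything else, so text that is mostly ASCII shrinks to roughly a quarter of its fixed-width size.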