Thursday, May 27, 2010

Character set vs encoding

A character set is a set of graphic, textual symbols, each of which is mapped to a nonnegative integer (its code point).

examples:
ASCII, ISO-8859-1 to ISO-8859-13, ISO-8859-15 (each part represents a different group of languages)
Unicode
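
As a small illustration of the "symbol to integer" mapping, here is a minimal Python sketch. It only uses the built-in ord() and the standard "iso8859-15" codec; the variable names (samples, code_points, euro_latin9) are made up for this example. It shows that "A" and "é" have the same number in Unicode as in ISO-8859-1, while the euro sign gets a different number depending on the character set.

```python
# ord() returns the Unicode code point of a single character.
samples = ["A", "é", "€"]
code_points = {ch: ord(ch) for ch in samples}

for ch, cp in code_points.items():
    print(f"{ch!r} -> {cp} (U+{cp:04X})")

# ISO-8859-15 is a single-byte character set, so the one byte produced
# here is the integer that this set assigns to the euro sign.
euro_latin9 = "€".encode("iso8859-15")[0]
print(f"euro sign in ISO-8859-15: {euro_latin9} (0x{euro_latin9:02X})")
```

The same symbol can therefore map to different integers in different character sets, which is why text is ambiguous unless the character set is known.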

An encoding maps a character set's code points to units of a specific width, and defines byte serialization and ordering rules.
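
To make the unit width and byte ordering concrete, the sketch below serializes one code point (U+00E9, "é") with several of Python's standard Unicode codecs. The names text, encodings and serialized are only illustrative.

```python
# One code point, several encodings: the number stays the same,
# only the bytes on the wire differ.
text = "é"  # U+00E9
encodings = ["utf-8", "utf-16-be", "utf-16-le", "utf-32-be"]
serialized = {name: text.encode(name) for name in encodings}

for name, data in serialized.items():
    hex_bytes = " ".join(f"{b:02x}" for b in data)
    print(f"{name:10} {hex_bytes}")
```

UTF-16 uses 16-bit units, so the big-endian and little-endian forms differ only in byte order (00 e9 vs e9 00), while UTF-8 uses 8-bit units and has no byte-order variants.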

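example: UTF-8
UTF-8 unifies US-ASCII with Unicode: it is a variable-width encoding that uses one to four bytes per code point, and the code points U+0000 to U+007F are encoded as single bytes identical to US-ASCII, so any US-ASCII text is also valid UTF-8.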
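
A quick way to check the relationship between US-ASCII and UTF-8 is to encode the same text both ways and compare the bytes. This is a minimal sketch using only str.encode(); the names ascii_text, same_bytes and utf8_lengths are invented for the example.

```python
# Pure ASCII text produces identical bytes under both encodings.
ascii_text = "charset"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
print("ASCII bytes equal UTF-8 bytes:", same_bytes)

# Characters outside the ASCII range need more than one byte in UTF-8.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in ["A", "é", "€", "😀"]}
for ch, n in utf8_lengths.items():
    print(f"{ch!r} (U+{ord(ch):04X}) takes {n} byte(s) in UTF-8")
```

Only characters beyond U+007F take more than one byte, which is what lets existing ASCII files be read as UTF-8 without conversion.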

http://java.sun.com/blueprints/guidelines/designing_enterprise_applications_2e/i18n/i18n2.html
