REVIEW - Unicode Demystified - A Practical Programmer's Guide to the Encoding Standard

Title:

Unicode Demystified - A Practical Programmer's Guide to the Encoding Standard

Author:

Richard Gillam

ISBN:

9780201700527

Publisher:

Addison-Wesley Professional (2002)

Pages:

853pp

Reviewer:

Francis Glassborow

Reviewed:

December 2002

Rating:

★★☆☆☆

Unicode has grown in importance for more than a decade. It was first conceived as a mechanism to unify the various character encodings used by computers. It was immediately made more complicated by the desire to support 8-bit encoding mechanisms, because a decade ago that was important. The initial design encoded everything in 16 bits, with extra mechanisms for working with just 8 bits. The experienced programmer will realise two things: 16 bits is just not going to be enough, and any encoding whose units are wider than one byte is vulnerable to endianness.
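To make the endianness point concrete, here is a minimal Python sketch. The sample character and variable names are my own choices for illustration, not something taken from the book; the codec names are the standard ones in Python.

```python
# Illustrative sketch: encode one character three ways and inspect the bytes.
text = "A"  # U+0041

utf8 = text.encode("utf-8")          # a single byte, so no byte-order question
utf16_be = text.encode("utf-16-be")  # big-endian: high byte first
utf16_le = text.encode("utf-16-le")  # little-endian: low byte first

print(utf8.hex())      # 41
print(utf16_be.hex())  # 0041
print(utf16_le.hex())  # 4100

# Decoding with the wrong byte order does not fail here;
# it silently produces a different character.
misread = utf16_le.decode("utf-16-be")
print(hex(ord(misread)))  # 0x4100
```

The 8-bit form has no byte order to get wrong, which is one reason it travels well between machines.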

The latest version of Unicode needs 21 bits to cover its full range (code points run up to U+10FFFF), though 16 bits is good enough for most purposes. Until now, about the only documentation readily available has been the Standard itself, published by Addison-Wesley. As those who have a copy will quickly attest, it is pretty heavy going even for people used to such arcane documents as the C++ Standard. All you need to know is certainly in 'The Unicode Standard Version 3.0', but I think you will find this new book a much better way to understand Unicode.
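As a small illustration of a character that does not fit in 16 bits, the following Python sketch encodes one such code point as UTF-16. The choice of character (a musical symbol) is mine, purely as an example.

```python
# Illustrative sketch: a code point above U+FFFF needs two 16-bit units in UTF-16.
ch = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, chosen only as an example

code_point = ord(ch)
units = ch.encode("utf-16-be")  # big-endian, without a byte-order mark

print(hex(code_point))           # 0x1d11e
print(code_point.bit_length())   # 17
print(units.hex())               # d834dd1e
print(len(units) // 2)           # 2 (a surrogate pair)
```

Code that assumes one 16-bit unit per character will miscount or split a character like this one.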

I think it is past time that computer users in general, and programmers in particular, became familiar with Unicode (and its close ally, ISO 10646). We need to know about representations and about the problems created by the same symbol being a different character in different languages. We need to distinguish between characters and context-sensitive glyphs, and so on.
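A short Python sketch may help show why a symbol on screen and a character in memory are not the same thing. The examples are my own, not drawn from the book, and they use only the standard `unicodedata` module.

```python
import unicodedata

# Three capital letters that usually look identical but are distinct characters.
lookalikes = ["\u0041", "\u0391", "\u0410"]
for ch in lookalikes:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# One visible symbol, two different character sequences.
composed = "\u00e9"     # e with acute as a single code point
decomposed = "e\u0301"  # plain e followed by a combining acute accent

print(composed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

Comparing text by what it looks like is therefore unreliable; normalising both sides first is one common way to compare equivalent sequences.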

You certainly will not find this book a quick read, but I think it is one that you should have in your personal library and browse when you have the time, so that you steadily acquire a sound grasp of this essential aspect of modern communication. Not being able to speak someone else's language is possibly excusable, but not knowing how to tackle issues of language representation should make you feel uncomfortable. Understanding ASCII is not enough; it never has been, though many have got away with it. It is time to broaden your knowledge, and this book certainly makes Unicode more digestible.

