Locale-Sensitive Perl Method

unpack TEMPLATE, EXPR

Internationalization (I18n) Method Overview

The unpack function is the reverse of pack: it takes a string (EXPR), typically a machine-level representation of some data, and expands it into a list of values using the rules from TEMPLATE.
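As a rough illustration of the string-to-list direction, the sketch below packs three values into a binary record and then unpacks them again. The template 'A3 n C' and the sample values are made up for this example.

```perl
use strict;
use warnings;

# Build a small binary record: a 3-byte ASCII tag, a 16-bit big-endian
# unsigned integer, and a single unsigned byte. (Illustrative values.)
my $record = pack('A3 n C', 'abc', 513, 7);

# unpack reverses the process: one string in, a list of values out.
my ($tag, $length, $flag) = unpack('A3 n C', $record);

printf "%s %d %d\n", $tag, $length, $flag;   # prints "abc 513 7"
```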

Note that the unpack function operates in two modes, selected in TEMPLATE with C0 and U0. In C0 (character) mode, packed strings are processed per character. In U0 mode, they are processed byte-by-byte in their UTF-8 encoded form. A TEMPLATE that begins with U starts in U0 mode. C0 may be used to get Unicode characters, while U0 may be used to get the underlying UTF-8 bytes. For instance, packing 0x20AC (the Euro symbol) with the U format produces a one-character string that Perl stores internally as the UTF-8 bytes \xe2\x82\xac. Unpacking that string with the U format again produces 0x20AC.
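The following sketch shows how the two modes (written C0 and U0 in a template) change the result. It assumes Perl 5.10 or later, where the mode semantics are as described in perlfunc and perluniintro; older releases behave differently. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Packing a code point with U yields a one-character string (the Euro sign).
my $euro = pack('U', 0x20AC);

# Unpacking that character string with U returns the code point again.
my ($code_point) = unpack('U', $euro);

# A raw byte string holding the UTF-8 encoding of the Euro sign.
my $bytes = "\xe2\x82\xac";

# A template starting with U runs in U0 mode: each byte of $bytes is
# treated as its own character, so three values come back.
my @per_byte = unpack('U*', $bytes);

# Starting with C0 keeps character mode, so U decodes the UTF-8 sequence.
my @decoded = unpack('C0U*', $bytes);

printf "U+%04X\n", $code_point;                                  # U+20AC
print join(' ', map { sprintf '0x%02X', $_ } @per_byte), "\n";   # 0xE2 0x82 0xAC
printf "U+%04X\n", $decoded[0];                                  # U+20AC
```

The same byte string gives two different answers depending only on the first characters of the template, which is how the modes come to be confused.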

See perl's unpack function documentation and perlpacktut documentation for additional details.

I18n Issues

Unpack will not correctly handle invalid Unicode byte sequences. It is also complicated to use. Finally, it is easy to confuse C and U modes.

Suggested Replacement

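The Encode module is a better solution for converting between Unicode strings and UTF-8 bytes. Encode::encode_utf8 encodes a string to UTF-8, and Encode::decode converts bytes back to characters. The module is simpler to use, and is built to handle invalid byte sequences.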
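A minimal sketch of the Encode-based approach follows. The choice of the strict 'UTF-8' encoding name and the FB_CROAK check flag is an assumption made for this example, as is the sample input. Other CHECK values (for instance, substituting a replacement character) may suit a given application better.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode);

# Encode the Euro sign to UTF-8 bytes.
my $euro  = "\x{20AC}";
my $bytes = encode_utf8($euro);          # "\xe2\x82\xac"

# Decode the bytes back to characters. FB_CROAK makes decode die on
# malformed input; LEAVE_SRC keeps decode from modifying $bytes.
my $check     = Encode::FB_CROAK | Encode::LEAVE_SRC;
my $roundtrip = decode('UTF-8', $bytes, $check);

# 0xFF can never appear in valid UTF-8, so this input is rejected.
my $bad = "abc\xffdef";
my $ok  = eval { decode('UTF-8', $bad, $check); 1 };

print $roundtrip eq $euro ? "round trip ok\n" : "round trip failed\n";
print $ok ? "decoded\n" : "rejected invalid UTF-8\n";
```

Wrapping decode in eval lets the caller decide how to report or recover from bad input instead of silently producing wrong code points.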

Globalyzer will detect this function and report it as an i18n issue. If you have determined that the call is being handled correctly, you can use Globalyzer's Ignore Comment functionality to ensure that it isn't picked up in a subsequent scan.


