Locale-Sensitive C/C++ String Operation Function

size_t strcspn(const char *string, const char *strCharSet);

size_t wcscspn(const wchar_t *string, const wchar_t *strCharSet);

size_t _mbscspn(const unsigned char *string, const unsigned char *strCharSet);

size_t _tcscspn(const _TXCHAR *string, const _TXCHAR *strCharSet);

Internationalization (I18n) Function Overview

The strcspn ("string complement span") function returns the length of the initial substring of string that consists entirely of characters that are not members of the set specified by the string strCharSet. In other words, it returns the offset of the first character in string that is a member of the set strCharSet, or the length of string if no such character is found.
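As a minimal sketch of the behavior described above, the following helper uses strcspn to measure the part of a line that precedes the first separator. The name key_length and the separator set "=:" are hypothetical and chosen only for illustration; the input is assumed to be plain single-byte text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the leading key in a "key=value" or
   "key:value" line, i.e. the offset of the first '=' or ':' byte.
   If the line contains neither separator, strcspn returns the length
   of the whole line. */
size_t key_length(const char *line)
{
    assert(line != NULL);
    return strcspn(line, "=:");
}
```

For example, key_length("lang=en") yields 4 (the offset of '='), and key_length("noseparator") yields 11, the length of the whole string.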

wcscspn is the wide-character equivalent; its parameters are wide-character strings and it returns an index counted in wide characters (wchar_t units), not bytes.

_mbscspn is supported on Windows platforms only and is the multibyte equivalent; its parameters are multibyte strings and it returns a byte index.

_tcscspn is the Windows-only generic-text function; it maps to wcscspn when _UNICODE is defined, to _mbscspn when _MBCS is defined, and to strcspn when neither is defined.

I18n Issues

Use the appropriate version of the function as required for internationalization support, noting the following:

The strcspn function does not work correctly if strCharSet contains multibyte UTF-8 characters. In a string using a multibyte encoding such as UTF-8, strcspn does not treat a character consisting of more than one byte as a single entity; each byte is treated separately.
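The byte-wise behavior can be illustrated with a small sketch. The helper name find_e_acute_bytewise is hypothetical, and the UTF-8 bytes are written as hex escapes so that the example does not depend on the source file's encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "é" (U+00E9) encoded as UTF-8: the two bytes 0xC3 0xA9. */
static const char E_ACUTE[] = "\xC3\xA9";

/* Hypothetical helper: byte-wise search for "é" using strcspn.
   strcspn sees the set as two independent bytes, so the scan stops at
   ANY byte equal to 0xC3 or 0xA9, not only at a complete "é". */
size_t find_e_acute_bytewise(const char *utf8)
{
    assert(utf8 != NULL);
    return strcspn(utf8, E_ACUTE);
}
```

With the input "\xC3\xA0 la carte" ("à la carte" in UTF-8, 11 bytes), the function returns 0 even though the string contains no "é": the lead byte 0xC3 of "à" matches the lead byte of "é". A character-aware search would return 11, the length of the string.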

If strCharSet may contain multibyte UTF-8 characters, convert both parameters (string and strCharSet) to wide characters (wchar_t) and call the wide function wcscspn. The return value is then an index counted in wide characters, so it must be converted back to a byte offset in the original UTF-8 string before it is used with that string.
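The convert, search, and convert-back sequence might look like the sketch below. It is not a definitive implementation: the names utf8_seq_len, utf8_to_wide, and utf8_cspn are hypothetical, the hand-written decoder stands in for a library conversion such as mbstowcs so that the example does not depend on the current locale, the input is assumed to be well-formed UTF-8, and wchar_t is assumed to be wide enough to hold any code point (typical on Linux and macOS, but not on Windows, where wchar_t is 16 bits).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Byte length of the UTF-8 sequence that starts with lead byte b.
   Assumes well-formed input; unexpected bytes count as 1. */
static size_t utf8_seq_len(unsigned char b)
{
    if (b >= 0xF0) return 4;
    if (b >= 0xE0) return 3;
    if (b >= 0xC0) return 2;
    return 1;
}

/* Decode UTF-8 into a newly allocated wide string, one wchar_t per
   character. The caller frees the result. Returns NULL if malloc fails. */
static wchar_t *utf8_to_wide(const char *s)
{
    size_t n = strlen(s), i = 0, k = 0;
    wchar_t *out = malloc((n + 1) * sizeof *out); /* at most 1 per byte */
    if (out == NULL) return NULL;
    while (i < n) {
        unsigned char b = (unsigned char)s[i];
        size_t len = utf8_seq_len(b), j;
        unsigned long cp = (len == 1) ? b : (b & (0xFFu >> (len + 1)));
        /* Fold in the continuation bytes, never reading past the end. */
        for (j = 1; j < len && i + j < n; j++)
            cp = (cp << 6) | ((unsigned char)s[i + j] & 0x3Fu);
        out[k++] = (wchar_t)cp;
        i += j;
    }
    out[k] = L'\0';
    return out;
}

/* Like strcspn, but compares whole UTF-8 characters. Returns the BYTE
   offset in `string` of the first character that occurs in `set`, or
   strlen(string) if there is none; (size_t)-1 if allocation fails. */
size_t utf8_cspn(const char *string, const char *set)
{
    wchar_t *wstr, *wset;
    size_t wide_index, byte_offset = 0, i, n;
    assert(string != NULL && set != NULL);
    n = strlen(string);
    wstr = utf8_to_wide(string);
    wset = utf8_to_wide(set);
    if (wstr == NULL || wset == NULL) {
        free(wstr);
        free(wset);
        return (size_t)-1;
    }
    wide_index = wcscspn(wstr, wset); /* index in wide characters */
    free(wstr);
    free(wset);
    /* Convert the wide-character index back to a byte offset. */
    for (i = 0; i < wide_index && byte_offset < n; i++)
        byte_offset += utf8_seq_len((unsigned char)string[byte_offset]);
    return byte_offset > n ? n : byte_offset;
}
```

For the UTF-8 string "na\xC3\xAFve caf\xC3\xA9" ("naïve café") and the set "\xC3\xA9" ("é"), wcscspn reports wide-character index 9, which utf8_cspn converts to byte offset 10 because "ï" occupies two bytes. Production code would more likely rely on a platform conversion routine and would need to validate its input.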

For Windows MBCS platforms, ensure that the multibyte code page is set properly, as _mbscspn depends on it. By default, the multibyte code page is set to the system-default ANSI code page obtained from the operating system at program startup. Use _getmbcp and _setmbcp to query or change the current multibyte code page, respectively.

Recommended Replacements*

*If you're already using the recommended function, see I18n Issues for other reasons why Globalyzer is detecting the function.

