Internationalization and localization tools


Configuring Rule Sets

This section explains what Rule Sets are and describes how to create, edit, copy, and delete them.

What Are Rule Sets?

A Rule Set is a set of parameters that the Globalyzer Client scanner uses to detect internationalization issues and generate scan results.

A Rule Set is divided into three categories: Detection Rules, Filter Rules and Retention Rules.

Detection Rules

Detection Rules specify the internationalization issues that the Globalyzer source code scanner detects and reports to you. Globalyzer users can alter, add to, and delete from every detection category except the Locale Sensitive Methods rules, which only the Globalyzer Server administrator can augment.


Note: Add custom locale-sensitive methods as General Patterns rules.

Locale Sensitive Methods

The Locale Sensitive Methods category includes rules used to detect method calls, functions, and constructors in your source that are potentially unsafe within certain locales and for certain character encodings. These rules can be individually selected or deselected but can't be altered. This category depends on the programming language and so is not available for all programming languages. An example of such a rule is Java's java.lang.String.toLowerCase() method. The no-argument form of this method uses the rules of the default locale to convert a string to lower case. Those rules are not correct for every language: some languages map characters to lower case differently, and some alphabets have no lower-case forms at all.
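The toLowerCase() issue can be seen in a few lines of Java. The following is a minimal sketch, not Globalyzer output; the class name LowerCaseLocaleDemo and the helper lower() are made up for illustration. It shows that the same call produces different text depending on the locale, which is why passing an explicit locale is usually the safer choice.

```java
import java.util.Locale;

// Illustrative only: class and method names are assumptions, not part of Globalyzer.
final class LowerCaseLocaleDemo {

    // Lower-cases a string using an explicitly chosen locale instead of the default one.
    static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Under Turkish rules, upper-case I maps to dotless i (U+0131).
        System.out.println(lower("TITLE", turkish));

        // Locale.ROOT gives locale-neutral behavior, suitable for identifiers and keys.
        System.out.println(lower("TITLE", Locale.ROOT));

        // The no-argument form silently depends on Locale.getDefault().
        System.out.println("TITLE".toLowerCase());
    }
}
```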

Note: When Globalyzer scans code for locale-sensitive methods, the Quick Summary generated by Globalyzer includes a link for each detected locale-sensitive method. Each link takes you to information about how to refactor or replace the detected method so that your code functions properly across all locales and character encodings.

Static File References

Another detection category is the Static File References category. The user can add to, delete from, and alter this category of rules. This category is used to detect image and other static file paths in source code. During the internationalization process, these static files need to be localized and placed in locale-specific directories. All paths to those localized files in the source code must then be altered to retrieve and load the localized file at run time from the locale-specific directory corresponding to the user's locale.
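As a rough illustration of the refactoring described above, the sketch below builds a locale-specific path instead of using a hard-coded one. The directory layout (base directory, then language tag, then file name) and the names LocalizedPathDemo and localizedPath() are assumptions made for this example; your project may organize localized files differently.

```java
import java.util.Locale;

// Illustrative only: the directory layout and all names here are assumptions.
final class LocalizedPathDemo {

    // Builds <baseDir>/<language tag>/<fileName>, e.g. images/fr-FR/welcome.png.
    static String localizedPath(String baseDir, Locale locale, String fileName) {
        return baseDir + "/" + locale.toLanguageTag() + "/" + fileName;
    }

    public static void main(String[] args) {
        // Before: a static file reference of the kind this category detects.
        String hardCoded = "images/welcome.png";
        System.out.println(hardCoded);

        // After: the path is resolved from the user's locale at run time.
        System.out.println(localizedPath("images", Locale.FRANCE, "welcome.png"));
        System.out.println(localizedPath("images", Locale.JAPAN, "welcome.png"));
    }
}
```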

Source File Extensions

A very basic detection category is the Source File Extensions category. The user can also add to, delete from, and alter this category of rules. As the name suggests, this category simply tells Globalyzer which source files to scan if you point it at an entire directory of source files.

General Patterns

Users can alter, add to and delete from the General Patterns category, which can include any regular expression that the user wants the scanner to look for during the scan. This might include patterns used to enforce coding standards or anything at all -- not necessarily internationalization-related patterns.
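Before adding a regular expression as a General Patterns rule, it can help to try it against sample lines of your own code. The sketch below uses java.util.regex for that purpose only; the regular expression dialect that Globalyzer accepts is not stated here, so treat the syntax as an assumption. The rule itself (flagging SimpleDateFormat constructed with a hard-coded pattern string) and the class name GeneralPatternDemo are hypothetical.

```java
import java.util.regex.Pattern;

// Illustrative only: the rule and all names are assumptions, not shipped Globalyzer rules.
final class GeneralPatternDemo {

    // Matches "SimpleDateFormat(" followed directly by a string literal.
    static final Pattern HARD_CODED_DATE_FORMAT =
            Pattern.compile("SimpleDateFormat\\s*\\(\\s*\"");

    // Returns true when the candidate pattern is found anywhere in the line.
    static boolean flags(String sourceLine) {
        return HARD_CODED_DATE_FORMAT.matcher(sourceLine).find();
    }

    public static void main(String[] args) {
        // Hard-coded format string: the pattern should match.
        System.out.println(flags("DateFormat f = new SimpleDateFormat(\"MM/dd/yyyy\");"));

        // Locale-aware factory call: the pattern should not match.
        System.out.println(flags("DateFormat f = DateFormat.getDateInstance(DateFormat.SHORT, locale);"));
    }
}
```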

Included HTML Tags

Finally, for HTML projects only, users will see the Included HTML Tags category. This category requires special explanation and has a section dedicated to it. Please read this section if you intend to customize this Rule Set category.

Filter Rules

Filter Rules specify conditions that the scanner will overlook. For instance, by default, Globalyzer detects embedded strings. Embedded string detection is a fixed part of each Rule Set that you create (you cannot edit the embedded string rules). However, some strings embedded in your code are not intended for display to an end user and hence should not be included in Globalyzer's reports of strings that need to be externalized for translation.

String Literal Filters

Embedded strings that shouldn't be externalized may be programmatic elements or debug messages printed to the console. Globalyzer provides several ways to filter such strings from your scan results. The first way is to specify a pattern: if Globalyzer detects that pattern within a string literal, it ignores that string. This mechanism is called String Literal Filters.

String Method Filters

The second mechanism for filtering strings is called String Method Filters. Your code likely contains many methods or functions that are passed string arguments. When you are sure that any string passed into function X is never displayed to the user, you can add that function name to the String Method Filters rules. Whenever Globalyzer sees a string literal passed as an argument to any of the methods, functions, or constructors in this list, it ignores that string.

For example, the Java method:

    javax.servlet.http.HttpServletRequest.getParameter(String s)

takes a string argument, but this string would never appear as text visible to an end user. Therefore, if you add this method (method name only) to the list of String Method Filter rules, Globalyzer will ignore any string literal passed in.
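The sketch below shows the kind of code this filter is meant for. To stay self-contained it does not use the servlet API; the static getParameter() helper over a Map is a hypothetical stand-in for the request object, and the class name MethodFilterDemo is made up. The point is the difference between the two string literals: one is a lookup key that no user ever sees, the other is display text.

```java
import java.util.Map;

// Illustrative only: getParameter() here is a stand-in, not the servlet API.
final class MethodFilterDemo {

    // Looks up a request parameter by name.
    static String getParameter(Map<String, String> params, String name) {
        return params.get(name);
    }

    static String greeting(Map<String, String> params) {
        // "userName" is a lookup key and is never displayed. With getParameter
        // in the String Method Filters list, this literal would be ignored.
        String user = getParameter(params, "userName");

        // "Welcome back, " is user-facing text and should remain in the report.
        return "Welcome back, " + user;
    }

    public static void main(String[] args) {
        System.out.println(greeting(Map.of("userName", "Ana")));
    }
}
```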

String Line Filters

A final mechanism for filtering embedded strings is called String Line Filters. In the case where there are strings that will never be displayed to the user and reside in lines of code that share a specific string pattern, you can add that pattern to the String Line Filter rules. Whenever Globalyzer detects string literals in a line of code that also contains one of the patterns in this list, it will ignore those strings.

For example, suppose you have several strings defined that are used only in conditionally compiled debug code:

    String szName = "Name";  // Debug
    String szStart = "Start";// Debug
    String szEnd = "End";    // Debug
    String szTime = "Time";  // Debug

Since these are all common words, it would be difficult to eliminate them from the embedded string report using String Literal Filters. However, since each line of code also contains the comment Debug, we can add that shared pattern to the String Line Filter rules. Now, when Globalyzer scans for string literals, it will find a matching pattern in the String Line Filters rules for these four lines, and thus filter the four string literals from the embedded string report.
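The effect of a line filter can be approximated in a few lines of Java. This is a simplified model, not how Globalyzer is implemented: it treats any line containing a double quote as containing a string literal, and the names LineFilterDemo and reportable() are assumptions. The filter pattern matches the Debug comment shared by the lines above.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Illustrative only: a simplified model of line filtering, with assumed names.
final class LineFilterDemo {

    // Keeps lines that contain a string literal and do not match the line filter.
    static List<String> reportable(List<String> lines, Pattern lineFilter) {
        return lines.stream()
                .filter(line -> line.contains("\""))              // crude literal check
                .filter(line -> !lineFilter.matcher(line).find()) // drop filtered lines
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Pattern debugComment = Pattern.compile("//\\s*Debug");
        List<String> lines = List.of(
                "String szName = \"Name\";  // Debug",
                "String szStart = \"Start\";// Debug",
                "String title = \"Name\";");

        // Only the line without the Debug comment remains.
        reportable(lines, debugComment).forEach(System.out::println);
    }
}
```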

Method Line Filters

Method Line Filters are similar to String Line Filters, but apply to the Locale Sensitive Methods detection category. Whenever Globalyzer detects a locale-sensitive method in a line of code that also contains one of the patterns in this list, it will ignore the method.

Static File Filters

Static File Filters apply to the Static File References detection category. Globalyzer will ignore the static file if it contains one of the patterns in this list.

Static File Line Filters

Another mechanism for filtering static file references is with the Static File Line Filters. In this case, Globalyzer will ignore the static file if it is in a line of code that also contains one of the patterns in this list.

General Pattern Line Filters

General Pattern Line Filters are the line filters for the General Patterns detection category. Whenever Globalyzer detects a general pattern in a line of code that also contains one of the patterns in this list, it will ignore the general pattern detection.

Retention Rules

Retention Rules let you add conditions under which strings that would normally be filtered stay in the scan results. They act as an override of the different string filters. For example, strings whose first word starts with a special character such as @ are normally filtered out, but if your product name is @Large, you want strings beginning with @Large to still be caught.
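The override behavior can be modeled as a simple precedence check: a retention pattern is consulted before any filter. The sketch below is an assumption about the logic based on the description above, not Globalyzer's actual implementation, and the names RetentionDemo and isReported() are made up.

```java
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: a model of "retention overrides filters", with assumed names.
final class RetentionDemo {

    // Returns true when the string literal should stay in the scan results.
    static boolean isReported(String literal, List<Pattern> filters, List<Pattern> retentions) {
        for (Pattern keep : retentions) {
            if (keep.matcher(literal).find()) {
                return true;  // a retention pattern wins over every filter
            }
        }
        for (Pattern drop : filters) {
            if (drop.matcher(literal).find()) {
                return false; // filtered out, and no retention pattern applied
            }
        }
        return true;          // nothing matched, so the string is reported as usual
    }

    public static void main(String[] args) {
        List<Pattern> filters = List.of(Pattern.compile("^@"));
        List<Pattern> retentions = List.of(Pattern.compile("^@Large"));

        System.out.println(isReported("@Large is now available", filters, retentions));
        System.out.println(isReported("@param name", filters, retentions));
        System.out.println(isReported("Hello", filters, retentions));
    }
}
```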

Logging In to the Server

Rule Sets must be configured on the Globalyzer Server, using a browser. The first step is to log in to your account on the server. Click here for detailed login instructions.

Creating New Rule Sets

To create a new Rule Set:

  1. Log in to the server.
  2. From your home page, select Create Rule Set. The Create a New I18N Rule Set page appears.
  3. Enter a name for the new Rule Set.
  4. From the Select Source Type list, pick the type of source code that you wish to scan. The choices are: C#, C++, Delphi, HTML, JavaScript, PHP, SQL, Perl, Java, Visual Basic, and VBScript.
  5. Click the Submit button.
  6. For C++, SQL, and Visual Basic, an additional page appears where you choose the context for the programming language.

    There are seven supported C++ programming language contexts:

      ANSI UTF-8
      ANSI UTF-16
      Windows Generic
      Windows MBCS
      Windows Unicode
      Cross Platform UTF-8
      Cross Platform UTF-16

    There are four supported SQL programming language contexts:

      Oracle PL/SQL
      MySQL
      MS SQL
      PostgreSQL

    Visual Basic has two contexts:

      Classic VB
      VB.NET

    The context determines the default set of Unsafe Methods for the Rule Set. Click here for details on C++ contexts.

    Choose a context and click the Submit button.

  7. The Configure I18N Rule Set page appears. This page allows you to edit the default settings for the Detection Rules, Filter Rules, and Retention Rules listed below:

Detection Rules

Configure Source File Extensions
Choose from and add to a list of file extensions to determine what types of files will be scanned within a chosen project directory.

Configure Locale Sensitive (Unsafe) Methods
Choose from a list of default unsafe methods to identify in the Unsafe Method code scan.

Configure Included HTML Tags
Choose from and add to a list of tags that are included as part of the embedded text when the HTML scanner is searching for text between matching tags.

Configure Static File References
Choose from and add to a list of default static file extensions to identify in the Static File Reference code scan.

Configure General Patterns
Choose from and add to a list of patterns to include in the General Patterns code scan.

Filter Rules

Configure String Literal Filters
Choose from and add to a list of default strings to filter out of the Embedded String code scan.

Configure String Method Filters
Choose from and add to a list of methods to use to filter strings passed as parameters out of the Embedded String code scan.

Configure String Line Filters
Choose from and add to a list of default patterns to filter code lines out of the Embedded String code scan.

Configure Method Line Filters
Choose from and add to a list of default patterns to filter code lines out of the Unsafe Method code scan.

Configure Static File Filters
Choose from and add to a list of default patterns to filter static file references out of the Static File Reference code scan.

Configure Static File Line Filters
Choose from and add to a list of default patterns to filter code lines out of the Static File Reference code scan.

Configure General Pattern Line Filters
Choose from and add to a list of default patterns to filter code lines out of the General Patterns code scan.

Retention Rules

Configure String Retention Patterns
Create Regular Expressions that will override any String Filter, ensuring that strings with these patterns will remain in the Scan Results.

 

Editing Rule Sets

To edit an existing Rule Set:

  1. Log in to the server, as described previously.
  2. From your home page, select View My Rule Sets. The All My Rule Sets page appears.
  3. In the All My Rule Sets page, click the Edit button next to the Rule Set you wish to edit.
  4. The Configure I18N Rule Set page appears. This page allows you to edit the current settings for the Detection Rules, Filter Rules, and Retention Rules described under Creating New Rule Sets.

 

Copying Rule Sets

To copy an existing Rule Set:

  1. Log in to the server, as described previously.
  2. From your home page, select View My Rule Sets. The All My Rule Sets page appears.
  3. In the All My Rule Sets page, click the Save As button next to the Rule Set you wish to copy.
  4. Give the new Rule Set a name in the Copy Rule Set page, and click the Save button.

Deleting Rule Sets

To delete an existing Rule Set:

  1. Log in to the server, as described previously.
  2. From your home page, select View My Rule Sets. The All My Rule Sets page appears.
  3. In the All My Rule Sets page, click the Delete button next to the Rule Set you wish to delete.
  4. In the Delete Rule Set page, click the Delete button to confirm the operation.

 
