Thursday, September 18, 2008

Identifying the language used in the software

The software you use may also be used by people who speak other languages, so software is often designed to support multiple languages.

One scenario is that the software ships in a separate version for each language. You install the version for your language; to use the software in a different language, you have to install a different version.

In another scenario, a single version of the software can be installed in any of several languages, but you have to specify the language during installation. A variation on this scenario is that you can switch languages even after installation. Either way, you can only use one language at a time.

Some software supports multiple users, each with a different language preference. The software displays its interface in each user's preferred language, and users may store their own data in different languages. At the extreme, data stored in one language by one user can be displayed to another user in a different language.

In all of the above cases, one common requirement is to let the user specify a language.

How is the language identified? How does the user tell the system which language they are using?


The standard way, and the one most software vendors use nowadays, is the locale.

It is used by Linux. You can see the locale definition in the Linux documentation.

Language codes are usually taken from the list of two-letter codes defined in ISO-639-1, country codes from the two-letter codes defined in ISO-3166-1.
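As a rough illustration of how these codes appear in practice, a Linux locale name generally has the shape language[_territory][.codeset][@modifier], for example zh_CN.UTF-8. The sketch below pulls the language and territory out of such a name. The class and method names are hypothetical, and the parsing is deliberately minimal: it assumes a well-formed name and does not handle special names such as C or POSIX.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only.
class PosixLocaleName {

    // Splits a name such as "zh_CN.UTF-8" into language and territory.
    // The codeset (".UTF-8") and modifier ("@euro") are discarded.
    public static Locale parse(String name) {
        String base = name;
        int at = base.indexOf('@');
        if (at >= 0) {
            base = base.substring(0, at);
        }
        int dot = base.indexOf('.');
        if (dot >= 0) {
            base = base.substring(0, dot);
        }
        int underscore = base.indexOf('_');
        Locale.Builder builder = new Locale.Builder();
        if (underscore < 0) {
            // Language only, e.g. "fr"
            return builder.setLanguage(base).build();
        }
        return builder
                .setLanguage(base.substring(0, underscore))
                .setRegion(base.substring(underscore + 1))
                .build();
    }

    public static void main(String[] args) {
        Locale locale = parse("zh_CN.UTF-8");
        System.out.println(locale.getLanguage() + " / " + locale.getCountry());
    }
}
```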

It is also used in Java; see the Java Locale documentation. It basically follows the same rules. An ISO language code can be used on its own; a country code is not required but can be used to further qualify the language. For example, the two-letter code for Chinese is zh. To differentiate Simplified Chinese from Traditional Chinese, the locale can be zh-CN or zh-TW respectively.
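A minimal sketch of the Chinese example in Java follows. It assumes Java 7 or later, where Locale.forLanguageTag is available; the class name and the describe helper are made up for this illustration. Here zh-CN stands for Simplified Chinese (mainland China) and zh-TW for Traditional Chinese (Taiwan).

```java
import java.util.Locale;

// Hypothetical demo class, for illustration only.
class LocaleExample {

    // Shows the two parts of a locale: the language and the optional country.
    public static String describe(Locale locale) {
        return "language=" + locale.getLanguage() + ", country=" + locale.getCountry();
    }

    public static void main(String[] args) {
        // Language only: the country part is empty.
        Locale chinese = Locale.forLanguageTag("zh");
        // Language qualified by country.
        Locale simplified = Locale.forLanguageTag("zh-CN");
        Locale traditional = Locale.forLanguageTag("zh-TW");

        System.out.println(describe(chinese));
        System.out.println(describe(simplified));
        System.out.println(describe(traditional));

        // Java also ships constants for these two locales.
        System.out.println(simplified.equals(Locale.SIMPLIFIED_CHINESE));
        System.out.println(traditional.equals(Locale.TRADITIONAL_CHINESE));
    }
}
```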

The format of these language tags is defined in RFC 1766 (later superseded by RFC 3066 and its successors).

Oracle Database is a little different. It supported NLS (National Language Support) long before the standard was defined, so Oracle has its own language and territory codes. A list of Oracle language and territory codes can be found in the Oracle documentation.

Oracle also provides a locale mapping utility to convert its language and territory codes to the ISO standard codes.
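To give a feel for what such a mapping does, here is a hand-written sketch in Java. It does not use Oracle's utility; the class name, method name, and lookup table are hypothetical, and the table holds only a few sample pairs that I believe match Oracle's naming. Check the Oracle documentation for the real list.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a locale mapping utility, for illustration only.
class OracleLocaleSketch {

    // Sample Oracle LANGUAGE_TERRITORY pairs mapped to ISO-based Java locales.
    // This table is an assumption and is far from complete.
    private static final Map<String, Locale> TABLE = new HashMap<>();
    static {
        TABLE.put("AMERICAN_AMERICA", Locale.US);
        TABLE.put("FRENCH_FRANCE", Locale.FRANCE);
        TABLE.put("SIMPLIFIED CHINESE_CHINA", Locale.SIMPLIFIED_CHINESE);
        TABLE.put("TRADITIONAL CHINESE_TAIWAN", Locale.TRADITIONAL_CHINESE);
    }

    // Returns the matching locale, or null when the pair is not in the table.
    public static Locale toIsoLocale(String language, String territory) {
        String key = language.toUpperCase(Locale.ROOT) + "_" + territory.toUpperCase(Locale.ROOT);
        return TABLE.get(key);
    }

    public static void main(String[] args) {
        Locale locale = toIsoLocale("SIMPLIFIED CHINESE", "CHINA");
        System.out.println(locale.toLanguageTag());
    }
}
```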

