December 19, 2009

New article on web site localisation

Readers may be interested in a new article I posted a few days ago on web site localisation. The article explores some of the issues involved in the actual translation process, as well as some of the technical points that are worth considering.

A broad issue that I repeatedly see in localisation is that developers often don't get the translator involved in the early stages of the project, either because they don't think to do so, or because it's just not practical from the point of view of their project workflow. Some broad recommendations that I would suggest acting on at a relatively early stage (see the article for more details on some of these) include:
  • build into your design some flexibility in terms of field lengths and validation on your DB (such-and-such a field may need to accept a different range of characters and/or length once localised);
  • test at an early stage that you can correctly place accented characters in fields (e.g. web forms) and have them safely saved to your database and retrieved again without any corruption; similarly, ensure that accented characters in your copy appear properly;
  • check that your page layouts and web forms do not depend on the particular length of strings in English;
  • try not to share messages/string properties that are superficially similar but have clearly different uses (e.g. don't share the same string property for "up" meaning "to the top of the page" and "up" meaning "higher price"); as an approximate way of guarding against these ambiguous cases, one strategy is to have a separate properties file for each major section of your site, even if this means repeating some strings in English (their translations may differ);
  • find out what localisation features are present in the programming language you are using, and consider whether you need to use each one (e.g. for sorting data in Java, it is preferable to use a Collator rather than the Collections.sort() call in its raw form).
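To illustrate the last point, here is a minimal sketch of locale-aware sorting with a Collator. The class name LocaleSortSketch, the helper sortForLocale() and the sample French words are my own illustrative choices, not taken from the article. The accented letters are written as Unicode escapes (\u00e9 is "é", \u00e8 is "è") so that the example does not depend on the encoding of the source file.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical class name, used purely for illustration.
class LocaleSortSketch {

    // Returns a sorted copy of the words, ordered by the rules of the given locale.
    static List<String> sortForLocale(List<String> words, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<>(words);
        // Collator implements Comparator, so it can be passed straight to sort().
        sorted.sort(collator);
        return sorted;
    }

    public static void main(String[] args) {
        // "zebre" with e-grave, "etude" with e-acute, plus two unaccented words.
        List<String> words = Arrays.asList("z\u00e8bre", "\u00e9tude", "abricot", "euro");

        // Raw sort: compares UTF-16 code units, so the word starting with
        // e-acute (U+00E9) is placed after every word starting with a-z.
        List<String> raw = new ArrayList<>(words);
        Collections.sort(raw);
        System.out.println("Raw sort:      " + raw);

        // Collator sort: the accented word is ordered alongside the other "e" words.
        System.out.println("Collator sort: " + sortForLocale(words, Locale.FRENCH));
    }
}
```

With the raw sort, the word beginning with "é" ends up after "zèbre", which is rarely what a French-speaking user would expect; with the Collator it sits next to the other words beginning with "e". If you sort large lists often, it may be worth reusing a single Collator instance rather than creating one per call.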
From a practical viewpoint, you should also allow extra time for the translation and localisation process. Beyond the usual recommended timescale of around 1 day per 2,000 words for the translation itself, bear in mind that there may be file conversion, importing and exporting steps that add to the overall time.