July 23, 2008

French-English text translation tool

An initial version of this site's automatic French/English translation tool is available. The tool is designed to complement the site's dictionary as follows:
  • The dictionary is intended for students, learners, translators etc. who already understand some French but need to find the translation or uses of an isolated word or phrase;
  • The translation tool is aimed more at English speakers who need to get the gist of an entire French text (or vice versa for French speakers with an English text), typically when they understand very little of the language the text is written in.
In practice, initial tests show that learners and users of the language also find the translation tool useful for looking up phrases. Just be aware of the limitations of automatic translation; you may like to take a look at the accompanying tips on using machine translation.

About the translation system

Many of the text translations are produced by Google Translate, a statistical machine translation system developed by Google. Google allows sites such as this one to query its translation system and build on the results. This site's tool adds some features of its own on top of the Google system:
  • it is assumed that the language pair you are working with is French/English; the site will detect which of these two languages is being entered and translate to the other language;
  • some orthographic corrections are made to the input text to help get a better result from Google;
  • in some cases, alternative inputs will be run simultaneously through the Google system when such inputs are known to improve the translation results (such alternatives and their translations are always listed alongside the original query);
  • in some cases, instead of using the Google system, a translation is pulled directly from this site's dictionary data. Note that the source of the translation is always clearly indicated.
The above features are being actively improved, based on how users are typically using the system.
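To make the order of these steps concrete, here is a minimal sketch in Python of how such a pipeline could be arranged. It is not the site's actual code: the hint-word lists, the `normalise` rules, the dictionary layout and the `backend` callable (which stands in for the call to Google's system) are all assumptions made for illustration, and the step that runs alternative inputs is left out for brevity.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical hint words; a real detector would use far more evidence.
FRENCH_HINTS = {"le", "la", "les", "des", "est", "et", "un", "une", "je", "ne", "pas", "que", "du"}
ENGLISH_HINTS = {"the", "is", "and", "a", "an", "of", "to", "i", "not", "that", "it"}


@dataclass
class Translation:
    source_text: str
    result: str
    origin: str  # "dictionary" or "backend", so the source is always shown


def detect_language(text: str) -> str:
    """Guess 'fr' or 'en' only, since the language pair is assumed fixed."""
    lowered = text.lower()
    words = re.findall(r"[^\W\d_]+", lowered)
    fr = sum(w in FRENCH_HINTS for w in words)
    en = sum(w in ENGLISH_HINTS for w in words)
    # Accented letters count as extra evidence for French.
    fr += len(re.findall(r"[àâçéèêëîïôùûüœ]", lowered))
    return "fr" if fr > en else "en"


def normalise(text: str) -> str:
    """Illustrative orthographic clean-up: straight apostrophes, single spaces."""
    text = text.replace("\u2019", "'")
    return re.sub(r"\s+", " ", text).strip()


def translate(
    text: str,
    dictionary: Dict[Tuple[str, str], str],
    backend: Callable[[str, str, str], str],
) -> List[Translation]:
    cleaned = normalise(text)
    source = detect_language(cleaned)
    target = "en" if source == "fr" else "fr"
    # Prefer the site's own dictionary data when it has an exact entry.
    entry = dictionary.get((source, cleaned.lower()))
    if entry is not None:
        return [Translation(cleaned, entry, "dictionary")]
    # Otherwise fall back to the machine translation backend.
    return [Translation(cleaned, backend(cleaned, source, target), "backend")]
```

Passing the backend in as a function keeps the sketch runnable without any network access, and the `origin` field reflects the point above that the source of each translation is always indicated.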

Known limitations

The system works best on texts that are similar to the types of text that it has been trained on. Google Translate is a statistical system, trained on sets of existing translations. In practice, this means the system works well on "the types of text that people tend to translate". For example, whole sentences or passages from commercial or technical texts work quite well. Isolated words and phrases, particularly when they end in a "dangling" word such as a preposition, sometimes work less well. You may also find that what you consider to be quite a "basic" phrase or question doesn't translate well, simply because a phrase used in everyday speech didn't crop up in the material that the translation system was trained on.
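As a rough illustration of these limitations, a checker like the following could flag inputs that are likely to give weaker results. This is only a sketch under stated assumptions: the function name `input_warnings`, the list of "dangling" words and the four-word threshold are invented for the example and are not part of the site's tool.

```python
import re
from typing import List

# Hypothetical sample of prepositions that often "dangle" at the end of a phrase.
DANGLING = {
    "of", "to", "with", "for", "on", "in", "at", "by", "from",
    "de", "à", "avec", "pour", "sur", "dans", "en", "par", "chez",
}


def input_warnings(text: str, min_words: int = 4) -> List[str]:
    """Return human-readable warnings about an input; an empty list means none."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    warnings = []
    # Isolated words and short phrases give the system little context.
    if len(words) < min_words:
        warnings.append("short input: isolated words and phrases may translate less well")
    # A trailing preposition has nothing to attach to.
    if words and words[-1] in DANGLING:
        warnings.append(f"ends with a dangling word: {words[-1]!r}")
    return warnings
```

With these assumptions, "avoir besoin de" would raise both warnings, while a full sentence from a commercial text would raise none.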
