Open Translation Tools for E-Science (Amsterdam conference 22-24th June)

As the share of Web content in English has fallen dramatically, from 80% in the 1990s to 34% today, Web translation in the age of linguistic e-diversity has become one of the hottest topics in the digital world. This evolution also concerns science, which too many people tend to consider monolingual, disregarding the diversity of its domains ("hard" sciences, humanities, social sciences) and of its audiences (global and local).

Financed by the Open Society Institute (George Soros's foundation) and the Ford Foundation, and organized by the California-based NGO Aspirationtech, the Amsterdam Open Translation Tools Conference recently gathered a large international community of NGOs, developers, translators and companies interested in Web translation.

The participants tackled the issues at stake across the vast spectrum of contemporary translation practice (from production to publication, by way of editing and proofreading) and needs (from software localization to cultural translation, by way of general translation workflows). Many challenges were discussed: quality, community building, remuneration, project management, intellectual property rights, problems specific to particular language areas, video subtitling, spoken corpora, etc. (see the wiki).

Some of the ongoing projects that were presented clearly pave the way to a more multilingual science, and hence to a better dissemination of knowledge and a more democratic relationship between science and society. Among other open source initiatives, improved (hybrid) human-machine translation tools such as the Worldwide Lexicon might help people read scientific RSS feeds in their own language. In particular, the collaborative English-Arabic platform Meedan might help science events reach the Arab world in an easier and more efficient way.

Google Wave might very well be open source, but neither its embedded translation service, Google Translate, nor the recently launched Google Translator Toolkit is meant to be. What is more, the Toolkit's Terms of Service remain unclear about the license of the translated data. Both the open source and the translator communities are wondering whether an Open Corpora Initiative or a Linguistic Resource Commons might be the best way to improve large-scale international Web translation.

A book synthesizing the discussions on these topics was written immediately afterwards and sprint-printed by FlossManuals; it is available online.

Comments

daniel

I would certainly enjoy science even more if it were less monolingual, but for this to occur, at least the following two conditions have to be met:

  • Scholarly communication moves from stand-alone papers to a system where each contribution (be it a single data point, a whole database, an experimental protocol or a review of the state of the art) is embedded in a coherent hyperlinked context.
  • Automated translators become available that can make use of such context.

I am currently writing up a blog post on the context aspect, and anyone interested is free to join in.