
The benefits of using your own terminology

Whether it’s choosing the correct pharmaceutical terms, or keeping up with the latest youth or social media slang, terminology is an essential part of translation.

A translation might be technically correct, but without the right terminology, it won’t meet a client’s standards.

Is it windshield or windscreen for a car manufacturer? Should a medical text use generic or brand names for drugs? What about industry-specific jargon?

When translators try to match a required tone of voice, a terminology database helps them make the best decisions. While it may seem time-consuming and expensive to maintain, a terminology database is clearly worth the effort:

• Saves time and cost
• Minimises errors
• Improves consistency and quality
• Can be continuously updated by both client and translator

 

Lingo24’s approach to building terminology databases

 

There are two major steps in how we build a terminology database from scratch.

Automated term pair extraction – using our Machine Translation technology, we extract term pairs at a success rate of between 74% and 95%, well above standards.

Terminology validation – once term pairs are extracted, they need to be validated, either by our own Lead Terminologists or by your own designated linguists. We have also built a bespoke validation interface that makes it easy to approve or reject term candidates in line with the existing Translation Memories.

The Terminology Database created is fully compatible with all high-end computer-assisted translation (CAT) tools.

 

A recent study found that the failure to provide specialised glossaries could lead to translators spending up to 90% of their time searching for the right words.

With the right terminology list loaded into a computer-assisted translation (CAT) tool, translation time drops dramatically, while the risk of errors falls even further.

Terminology creation using TermFinder

Lingo24’s TermFinder tool accelerates terminology development by building on existing client assets. TermFinder uses a refreshed statistical approach, combining techniques that look at the data differently from the traditional frequency-based methods that are often used.

 

We start with a detailed analysis of your assets

  • identifying potential terms by extracting monolingual term sets from both the source and target sides
  • stripping out all “stopwords” (like “a”, “the” etc.)
  • comparing the frequency of each potential term in a generic corpus with its frequency in the Translation Memory, giving us its log-likelihood score
  • identifying terms that appear more frequently in the Translation Memory than their frequency in generic text would lead us to expect
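The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not TermFinder's implementation: it assumes single-word candidates, a tiny illustrative stopword list, and the standard log-likelihood (G2) statistic for comparing two corpora. The names `tokenize`, `log_likelihood` and `rank_candidates` are hypothetical.

```python
import math
import re
from collections import Counter

# Illustrative stopword list only; a real list would be much longer.
STOPWORDS = {"a", "an", "the", "of", "and", "to", "in", "is"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def log_likelihood(a, b, c, d):
    """G2 score for a term seen a times in a corpus of c tokens
    and b times in a corpus of d tokens."""
    expected_a = c * (a + b) / (c + d)
    expected_b = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_a)
    if b > 0:
        score += b * math.log(b / expected_b)
    return 2 * score


def rank_candidates(tm_text, generic_text):
    """Rank terms that are relatively more frequent in the TM than in generic text."""
    tm = Counter(tokenize(tm_text))
    generic = Counter(tokenize(generic_text))
    n_tm, n_generic = sum(tm.values()), sum(generic.values())
    scored = []
    for term, a in tm.items():
        b = generic.get(term, 0)
        generic_rate = b / n_generic if n_generic else 0.0
        # Keep only terms over-represented in the Translation Memory.
        if a / n_tm > generic_rate:
            scored.append((term, log_likelihood(a, b, n_tm, n_generic)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_candidates(tm_text, generic_text)` returns candidate terms ordered from most to least characteristic of the Translation Memory. A term that is equally frequent in both corpora scores zero and is filtered out.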

Alignment and ranking

  • The resulting terms are then aligned and ranked by training a phrase-based Statistical Machine Translation engine on the Translation Memory. During this process, we use customised features (e.g. a DBpedia check) to identify the terms that are most likely to be relevant.
  • It is important to note that we are not using Machine Translation to generate terms. The Machine Translation engine is built only as ephemeral data that helps rank and align potential terms based on commonality and uniqueness.
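To give a feel for what aligning and ranking term candidates means, here is a deliberately simplified sketch. It does not train a phrase-based Statistical Machine Translation engine. Instead it swaps in a much simpler technique, the Dice co-occurrence coefficient, computed over the segment pairs of a Translation Memory. The function name `align_terms`, the single-word matching and the sample data format are all assumptions made for illustration.

```python
from collections import Counter


def align_terms(segment_pairs, source_terms, target_terms):
    """Pair each source term with the target term it co-occurs with most,
    scored by the Dice coefficient over (source, target) segment pairs."""
    source_terms = {t.lower() for t in source_terms}
    target_terms = {t.lower() for t in target_terms}
    src_count, tgt_count, joint = Counter(), Counter(), Counter()

    for src_segment, tgt_segment in segment_pairs:
        src_found = source_terms & set(src_segment.lower().split())
        tgt_found = target_terms & set(tgt_segment.lower().split())
        src_count.update(src_found)
        tgt_count.update(tgt_found)
        joint.update((s, t) for s in src_found for t in tgt_found)

    aligned = []
    for s in sorted(source_terms):
        best_target, best_score = None, 0.0
        for t in sorted(target_terms):
            total = src_count[s] + tgt_count[t]
            score = 2 * joint[(s, t)] / total if total else 0.0
            if score > best_score:
                best_target, best_score = t, score
        if best_target is not None:
            aligned.append((s, best_target, best_score))

    # Highest-scoring pairs first, ready for human validation.
    return sorted(aligned, key=lambda item: item[2], reverse=True)
```

A real engine also handles multi-word terms, word order and inflection, which this sketch ignores. The point is only that alignment scores come from how terms behave across the existing Translation Memory, not from machine-generated translations.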

The result and impact of using TermFinder

  • The outcome of this process is a high-quality, focused, bilingual terminology that can easily be approved by one or more reviewers, either from the client’s side or from Lingo24.
  • In addition to the terms extracted using TermFinder, Lingo24 can use the existing attributes as a baseline for source terms. Lingo24 can help review the attributes, ensuring the content is optimal for high-level matching and that there is no duplication before proceeding to the next steps.
  • By analysing the terms against the client’s existing Translation Memory, Lingo24 can identify any terms which do not have a target language match and require translation. Lingo24 can then advise the client of the wordcount, service options and time required to ensure target language correspondence for all attribute terms. This in turn cuts costs and reduces the time needed on all your future translation projects.
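The gap analysis in the last point can be pictured with a short sketch. It assumes that the matched term pairs found in the Translation Memory are available as a simple source-to-target lookup. The function name `untranslated_terms` and the data layout are hypothetical.

```python
def untranslated_terms(source_terms, matched_pairs):
    """Return the source terms with no target language match,
    plus their total wordcount.

    matched_pairs maps a source term to its target term; a missing
    or empty entry counts as "no match".
    """
    missing = [term for term in source_terms if not matched_pairs.get(term)]
    wordcount = sum(len(term.split()) for term in missing)
    return missing, wordcount


# Example: two of three attribute terms still need translating.
missing, wordcount = untranslated_terms(
    ["windscreen", "wiper blade", "fuel pump"],
    {"windscreen": "Windschutzscheibe", "fuel pump": ""},
)
```

Here `missing` lists the terms that still require translation, and `wordcount` gives the volume on which a quote and timeline could be based.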