Knax Shop Website Localization
May 14, 2015 admin
The first project I worked on was the KNAX Shop website localization. KNAX Shop is an online business run by Jacob Nielsen, a Dane who sells innovative Danish-designed wall hooks.
Jacob lives in France, and he had his website in 3 languages: Danish (the source language), English and French. The CMS he uses is called PrestaShop – this is a free eCommerce solution that now has 230,000 users.
I worked with John Di Rico, Josh Potter and Duncan Young.
To manage the project, we used a free cloud-based project management tool called Zoho Projects.
My part included leveraging the old version of the website in order to create a base Translation Memory and glossaries.
Josh did the most challenging part of the project – he extracted the strings from the website in three languages: Danish, English and French.
Then we aligned the file pairs using the Wordfast Anywhere alignment tool and built an EN-DA TM as well as an EN-FR TM.
I was in charge of creating glossaries, and I did some research to find the best way to extract terms from files automatically.
I first used Tilde.com, which is a free cloud-based solution powered by TAAS. But I was not happy with the results, as it extracted very few bilingual terms. It is good at extracting monolingual terms, though.
I also tried SDL MultiTerm Extract for automatic term extraction from bilingual files. For the En-Fr pair, it extracted only 20 terms, which is too few, so I would not advise buying it.
Finally, AntConc is really good for monolingual term extraction: it is a free solution that extracts a lot of terms (230 per 1000 words) and indicates their frequency.
But I am a memoQ user, and I ended up using memoQ for leveraging monolingual files and matching the results.
Here is how the process looks:
1. Add a Stop Word List
memoQ comes with very few stop word lists (SWL), and the ones it has are too short. SDL Trados Studio provides long and very good SWLs for a lot of languages.
However, adding them to memoQ can be a challenge, as SWLs in memoQ have their own syntax and file extension (.mqres). Here is how you can do it:
– go to Resource Console – Stopword Lists
– export an SWL from memoQ (.mqres) to see the .mqres syntax (open it in Notepad++ or TextWrangler)
– add it to a Word document
– add an SDL SWL to the above Word file
– format it using wildcards to observe the .mqres syntax
– import it back to memoQ: give it a name, specify the language
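The merging part of these steps could also be scripted rather than done in Word. Below is a minimal Python sketch, assuming both stop word lists have already been reduced to plain words (one per entry). The function name `merge_stopwords` and the sample words are hypothetical. The sketch does not produce .mqres syntax: each merged word would still need to be wrapped the same way as the entries in the file exported from memoQ.

```python
def merge_stopwords(*lists):
    """Merge stop word lists, dropping blanks and case-insensitive duplicates."""
    seen = set()
    merged = []
    for words in lists:
        for word in words:
            cleaned = word.strip()
            key = cleaned.casefold()
            if cleaned and key not in seen:
                seen.add(key)
                merged.append(cleaned)
    # Sort alphabetically, ignoring case, so the result is easy to review.
    return sorted(merged, key=str.casefold)


# Illustrative sample data, not taken from either tool.
memoq_words = ["the", "a", "of"]
sdl_words = ["The", "and", "of ", "", "to"]

merged = merge_stopwords(memoq_words, sdl_words)
print("\n".join(merged))
```

The first spelling seen for each word is the one that is kept, so the memoQ list should be passed first if its casing matters.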
2. Extract two monolingual glossaries:
– go to Options – Extract Terms
– indicate the frequency and add a Stopword list
– generate candidates, accept them and export them to a memoQ termbase
– export the termbase as a csv file
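To give an idea of what frequency-based extraction with a stop word list does, here is a rough Python sketch. It is not memoQ's algorithm, only an illustration of the general idea: count one- and two-word sequences, discard candidates that begin or end with a stop word, and keep those that reach a minimum frequency. The function name, the parameters `min_freq` and `max_len`, and the sample text are all assumptions made for this example.

```python
import re
from collections import Counter


def extract_candidates(text, stopwords, min_freq=2, max_len=2):
    """Return (term, count) pairs for word sequences seen at least min_freq times."""
    stop = {w.casefold() for w in stopwords}
    # Lowercase the text and keep runs of letters only (digits and punctuation are dropped).
    tokens = re.findall(r"[^\W\d_]+", text.casefold())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            # Skip candidates that start or end with a stop word.
            if gram[0] in stop or gram[-1] in stop:
                continue
            counts[" ".join(gram)] += 1
    return [(term, c) for term, c in counts.most_common() if c >= min_freq]


sample = "The wall hook is a steel wall hook. The wall hook holds coats."
candidates = extract_candidates(sample, stopwords=["the", "is", "a"])
for term, count in candidates:
    print(f"{term}\t{count}")
```

A longer stop word list removes more noise from the candidates, which is why the short built-in lists are a problem.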
3. Now that you have two columns (En-Fr, for example), you can compare and match them. First, convert the csv to Excel, then sort the left column alphabetically and start comparing and populating the right column.
I also had to generate an En-Da glossary, but I do not speak Danish. So, I used Google Translate to do back translation.
Then I pasted the Danish translation done by Google Translate back into the Excel file as the third column and started comparing. I now had one column in English and two columns in Danish: one human-translated (leveraged from the website) and one machine-translated. 80% of the Danish terms in those two columns were the same.
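This comparison was done by eye in Excel, but the same idea can be sketched in a few lines of Python. The sketch below is one possible reading of the process, not the exact procedure: for each English term, it looks up the machine-translated Danish term among the human-translated Danish terms and pairs them when they match, ignoring case and extra spaces. The function names and the sample terms are made up for illustration.

```python
def normalize(term):
    """Lowercase a term and collapse whitespace so near-identical spellings match."""
    return " ".join(term.casefold().split())


def match_via_mt(mt_pairs, human_terms):
    """Pair English terms with human-translated terms that equal the MT output.

    mt_pairs: list of (english_term, machine_translated_term)
    human_terms: list of terms extracted from the human-translated site
    """
    lookup = {normalize(t): t for t in human_terms}
    matched, unmatched = [], []
    for english, mt in mt_pairs:
        hit = lookup.get(normalize(mt))
        if hit is not None:
            matched.append((english, hit))
        else:
            # Needs a manual check: MT and the website use different terms.
            unmatched.append((english, mt))
    return matched, unmatched


# Illustrative sample data only.
mt_pairs = [("hook", "krog"), ("wall", "væg"), ("shipping", "fragt"), ("basket", "kurv")]
human_terms = ["Krog", "væg", "forsendelse", "kurv"]

matched, unmatched = match_via_mt(mt_pairs, human_terms)
print("matched:", matched)
print("to review:", unmatched)
```

Only the unmatched terms then need a closer look, ideally by someone who speaks the language.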
As a result, I generated 70 terms for En-Fr and En-Da language pairs, without speaking Danish!
When I attended the Google Translate workshop at Google, the GT speakers asked the audience: “What are the unusual ways you use Google Translate?” I shared this example, and the Google Translate speakers were apparently glad to hear it, as they had never thought that GT could be used in this way.