Discussion about this post

Rob Grayson

Interesting. I was recently working with a PDF file (in a PDF reader, not a CAT tool) where the search function refused to find words that I could clearly see were present in the file. I'm guessing this could well have been caused by the same (or a similar) issue.

James Kirchner

Many times I have run into something similar in German source texts that were not composed with Unicode fonts. You'd think everything would be Unicode now, but sometimes things still arrive in a pre-Unicode font.

In those texts, a word like "früher" will look fine, like just one word, but the programs will perceive it as something like "fru<¨>her". The programs will also not recognize it for glossary display, among other problems. Luckily, Word will select the whole thing as a misspelled word in spellcheck, so a quick spellcheck before import usually fixes the problem.
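For anyone who would rather check this with a script than with spellcheck, here is a minimal sketch in Python. It assumes the umlaut arrives as a plain "u" followed by a combining diaeresis (U+0308), which is only one way such a text can be broken; a file in a true legacy encoding, or one using the spacing diaeresis (U+00A8), would need different handling. The function name `compose_accents` is made up for illustration.

```python
import unicodedata

def compose_accents(text: str) -> str:
    """Merge base letters and combining marks into single code points (NFC)."""
    return unicodedata.normalize("NFC", text)

# Assumption: the broken word is 'u' + COMBINING DIAERESIS (U+0308).
decomposed = "fru\u0308her"
composed = compose_accents(decomposed)

# Both strings look the same on screen but differ in length.
print(len(decomposed), len(composed))  # prints: 7 6
print(composed == "fr\u00fcher")       # prints: True
```

After normalization the word is stored with a single "ü" code point, which is the form a glossary or search function is more likely to match.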

One time, I handed over a job and was told by the project manager, with the utmost urgency, that I hadn't finished it. It turned out that import into both MemoQ and Trados would stop at a certain point and the remaining text would not display. Trados wouldn't show me the text it was importing (these were text files and not Word files), so I couldn't see what was causing the glitch. However, MemoQ's text import dialog, where you can choose the text encoding, shows you exactly what the text you're importing will look like. I scrolled down to the point where import stopped, and I found a strange Chinese-looking character that didn't belong there. I opened up the text file, deleted and rewrote the area where the invisible character was, and then everything imported fine. The client always wanted everything done in Trados, but if I hadn't been using MemoQ, I'd never have found the problem, and the project manager could have blamed me, even though she didn't know how to fix it herself.
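A stray character like that can also be hunted down before import with a few lines of Python. This is only a rough sketch under some assumptions: the file has already been decoded to text, the expected script is Latin (as in a German source), and "suspicious" means either a control, format, or other invisible character, or a letter from another script. The function name `find_suspicious_chars` and the sample text are hypothetical, and I can't say whether this would have caught the exact character in my file.

```python
import unicodedata

def find_suspicious_chars(text: str, expected_script: str = "LATIN"):
    """Return (line, column, code point, name) for characters that look out of place."""
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        line = line.rstrip("\r")  # tolerate Windows line endings
        for col, ch in enumerate(line, start=1):
            category = unicodedata.category(ch)
            name = unicodedata.name(ch, "<unnamed>")
            # Categories starting with "C" cover control, format,
            # private-use, and unassigned characters; tabs are allowed.
            is_hidden = category.startswith("C") and ch != "\t"
            # Letters whose Unicode name lacks the expected script name.
            is_foreign = category.startswith("L") and expected_script not in name
            if is_hidden or is_foreign:
                hits.append((lineno, col, f"U+{ord(ch):04X}", name))
    return hits

# Hypothetical sample: a zero-width space and a CJK ideograph hidden in line 2.
sample = "Erste Zeile\nZweite \u200bZeile mit \u4e2d Zeichen\n"
for hit in find_suspicious_chars(sample):
    print(hit)
```

The line and column numbers tell you where to look in a text editor, so you can delete and retype just that spot.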
