Post-editing (PE) is correcting machine translation (MT) output. There isn’t a simpler way to put it. However, in spite of the simple definition, there is a lot that is still unclear about post-editing. The fact that there are different degrees of post-editing, and that there is no general PE business model companies can use, certainly does not help. Let’s try to clarify some of these issues and share some tips for better PE.
What is Post-Editing?
Post-editing involves improving the translations produced by an MT system, also called raw output. The task of post-editors is to take some MT output and apply any fixes necessary so that said output complies with predefined quality expectations. That’s the key to successful post-editing: making only necessary changes, and avoiding preferential changes.
Full vs Light
Different degrees of post-editing may be implemented, depending on the quality level, turn-around time, volume, and other factors expected for a given project. For example, if the content is going to be published, near-human translation quality may be required. If the content is for internal use, or only needs to be understandable, light or fast PE to remove blatant errors may be enough.
Don’t get confused by names: light PE is sometimes called fast PE, rapid PE, etc. They all mean the same thing, and proper guidelines and rules need to be defined in advance for all types of PE. Below is a brief, general checklist for the two major types of post-editing:
How To Do It
The most common approach to PE involves using a CAT tool as the editing environment. There are two main ways in which MT output is displayed to the post-editor: a) as a translation memory match, if the content was run through MT beforehand; b) as suggested translations presented on demand, if the CAT tool is connected to an MT engine. In most cases, using a translation memory adds value by leveraging existing content and helping achieve greater consistency.
PE mechanics are simple: read the source and read the target, identify issues and make only required changes. If the quality of the output is very bad, and fixing it would take more time than translating from scratch, don’t overthink it: just translate from scratch. Judging when to do this will come naturally with time and practice.
Two things to keep in mind: 1) “If it ain’t broke, don’t fix it”; 2) how much to fix needs to be determined in advance, as quality is a vague concept.
How To Do It Better
A good post-editor tries to retain as much of the MT output as possible. The fewer edits you make, the better. It is extremely important to make quick decisions. Think of post-editing as taking the shortest way to get to the right place. If all the information is correctly transmitted to the target, then move on to the next segment.
If you know how to use a quality assurance (QA) tool, like Checkmate, this will definitely help you achieve better quality. Most modern CAT tools, like memoQ, have QA features which you can use. These tools will identify terminology errors (if you are working with a glossary), consistency issues, and problems with numbers, tags, and placeholders.
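To make the number, tag, and placeholder checks concrete, here is a minimal Python sketch of what such a check might look like for a single segment. It is not how Checkmate or any CAT tool implements its QA features. The function name `qa_check`, the regular expressions, and the `{0}`-style placeholder format are all assumptions made for illustration, and the comparison ignores locale-specific number formatting (for example 1,000 vs 1.000).

```python
import re
from collections import Counter

# Hypothetical patterns: plain numbers, XML-like tags, and {0}-style placeholders.
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")
TAG_RE = re.compile(r"</?[A-Za-z][^<>]*>|\{\d+\}")


def qa_check(source, target):
    """Return warnings about numbers and tags that differ between source and target."""
    checks = (
        # Strip tags first so the digit inside a placeholder like {0} is not counted as a number.
        ("number", lambda text: NUMBER_RE.findall(TAG_RE.sub(" ", text))),
        ("tag/placeholder", TAG_RE.findall),
    )
    warnings = []
    for label, extract in checks:
        src_items = Counter(extract(source))
        tgt_items = Counter(extract(target))
        # Counter subtraction keeps only the items that occur more often on one side.
        for item in sorted((src_items - tgt_items).elements()):
            warnings.append(f"missing {label}: {item}")
        for item in sorted((tgt_items - src_items).elements()):
            warnings.append(f"unexpected {label}: {item}")
    return warnings


if __name__ == "__main__":
    src = "Ships in 3 days with <b>free</b> returns for {0}."
    tgt = "Envío en 5 días con <b>devolución</b> gratis para {0}."
    for warning in qa_check(src, tgt):
        print(warning)
```

A check like this only flags candidates. The post-editor still decides whether a flagged difference is a real error.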
During the PE process, if you are working with technical terminology, using a termbase is definitely helpful. If you can’t use a QA tool, look up the most important terms beforehand and make sure they are all translated correctly. This will let you focus on fixing other types of errors.
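If no QA tool is available, even a very small script can confirm that key terms were translated as expected. The sketch below assumes a hypothetical glossary stored as a Python dictionary mapping source terms to their approved translations. It uses plain case-insensitive substring matching, so it will not recognize inflected forms and may produce false alarms in morphologically rich languages.

```python
def check_terms(source, target, glossary):
    """Return (source term, expected translation) pairs missing from the target."""
    src, tgt = source.casefold(), target.casefold()
    return [
        (term, expected)
        for term, expected in glossary.items()
        # Flag a term only when it occurs in the source but its approved
        # translation does not occur anywhere in the target.
        if term.casefold() in src and expected.casefold() not in tgt
    ]


if __name__ == "__main__":
    # Hypothetical English-to-Spanish glossary entries.
    glossary = {
        "power supply": "fuente de alimentación",
        "warranty": "garantía",
    }
    issues = check_terms(
        "Power supply with 2-year warranty",
        "Fuente de poder con garantía de 2 años",
        glossary,
    )
    print(issues)
```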
How To Do It Faster
The key to faster and more efficient PE is fixing patterns, especially for engine training purposes. Identifying recurring errors allows post-editors to cover more ground in less time.
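One possible way to spot recurring errors is to compare raw MT segments with their post-edited versions and count the substitutions that keep coming back. The sketch below uses Python's standard `difflib` module for this. The function name `recurring_edits`, the whitespace tokenization, and the sample segments are assumptions made for illustration, not a description of any particular workflow.

```python
from collections import Counter
from difflib import SequenceMatcher


def recurring_edits(pairs, min_count=2):
    """Count word-level replacements across (raw MT, post-edited) segment pairs."""
    counts = Counter()
    for mt, pe in pairs:
        mt_words, pe_words = mt.split(), pe.split()
        matcher = SequenceMatcher(a=mt_words, b=pe_words, autojunk=False)
        for op, i1, i2, j1, j2 in matcher.get_opcodes():
            # Only substitutions are counted; insertions and deletions are ignored here.
            if op == "replace":
                edit = (" ".join(mt_words[i1:i2]), " ".join(pe_words[j1:j2]))
                counts[edit] += 1
    return [(edit, n) for edit, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    # Hypothetical raw MT output paired with its post-edited version.
    pairs = [
        ("New cover for cell phone", "New case for cell phone"),
        ("Leather cover with stand", "Leather case with stand"),
        ("Blue cover", "Blue case"),
        ("Fast charger for car", "Fast charger for cars"),
    ]
    print(recurring_edits(pairs))
```

Substitutions that appear repeatedly are good candidates for a glossary entry, an automated fix, or feedback for engine training.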
A great technique for streamlining the process is automated post-editing. It consists in applying fixes through search and replace operations to correct errors automatically. Based on observation of the MT output, post-editors can identify recurring errors produced by MT and create fixes that can be applied automatically. (More on this topic in one of the next posts – stay tuned!)
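As a rough illustration of the idea, the following Python sketch applies an ordered list of search and replace rules to a segment. The rules themselves are hypothetical examples. In practice they would come from errors actually observed in the MT output, and they are language-specific: removing the space before punctuation, for instance, would be wrong for French.

```python
import re

# Hypothetical rules, applied in order: (compiled pattern, replacement).
RULES = [
    # A recurring terminology error observed in the MT output.
    (re.compile(r"\bcell phone cover\b", re.IGNORECASE), "cell phone case"),
    # Remove stray whitespace before punctuation.
    (re.compile(r"\s+([,.;:!?])"), r"\1"),
    # Collapse an accidentally doubled word ("for for" becomes "for").
    (re.compile(r"\b(\w+) \1\b", re.IGNORECASE), r"\1"),
]


def auto_post_edit(segment):
    """Apply every search and replace rule to one segment and return the result."""
    for pattern, replacement in RULES:
        segment = pattern.sub(replacement, segment)
    return segment


if __name__ == "__main__":
    print(auto_post_edit("New cell phone cover for for iPhone , black ."))
```

Rules like the doubled-word fix can also change text that was correct ("had had"), so it is worth reviewing the effect of each new rule on a sample before applying it to a whole file.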
Also, try to make the most of the tool you are using for PE. For example, memoQ has a really cool feature called Predictive Typing. The name gives it away: it predicts what you are typing and suggests how to complete a phrase or a word. If you find yourself making the same correction over and over, consider adding that correction to the predictive typing list instead of typing it yourself.
Finally, here’s a tip for those of you who can code or write scripts. Using a simple script to automate repetitive tasks, like web searches and word order changes, will help you save a nice amount of time. For example, at eBay, we use a simple script created with AutoHotkey that turns some key combinations into commands. It can be used to run Google or Google Image searches, look up terms in dictionaries, change the order of words, and change the position of words in a segment, all using keyboard shortcuts.
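The AutoHotkey script itself is not shown here, but the kind of helper it wraps can be sketched in a few lines of Python. Everything below is an assumption for illustration: the function names are hypothetical, and binding them to keyboard shortcuts is left to whatever hotkey tool you use.

```python
from urllib.parse import quote_plus


def search_url(term):
    """Build a Google web search URL for a term (URL format assumed, not guaranteed)."""
    return "https://www.google.com/search?q=" + quote_plus(term)


def swap_adjacent_words(segment, index):
    """Swap the word at position `index` (0-based) with the word that follows it."""
    words = segment.split()
    if not 0 <= index < len(words) - 1:
        raise IndexError("index must point to a word that has a following word")
    words[index], words[index + 1] = words[index + 1], words[index]
    return " ".join(words)


if __name__ == "__main__":
    print(search_url("funda para móvil"))
    print(swap_adjacent_words("case phone red", 0))
```

The resulting URL could be opened with the standard `webbrowser.open` function, so a single shortcut can take a selected term straight to a search page.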