This is a new post on the Getting Translators to Code series. If you haven’t read previous posts, I suggest you do – you’ll like them!
This time, we’ll work on a short script to count words. Do I need to talk about the importance of counting words for translators? If you translate for a living, you know that wordcounts are how you price a translation, how you split files or documents when working in teams, how you estimate how long a translation will take, etc.
Now, if CAT tools provide wordcounts, if there are standalone tools specifically for counting words, why do you need this? Well, this is a script intended to count words quickly. This is how it works, in a nutshell:
- You highlight text
- You press the combination of keys of your preference, like ALT+1
- You get a tooltip with the wordcount
The script DOES NOT match your text against a TM. But it DOES:
- work in virtually any program (text editor, web browser, even within your CAT tool),
- exclude numbers
- exclude inline tags (HTML, but you can adapt this to your needs), and
- take practically no time to show results (of course, this depends on the amount of text you select).
Also, it is far more convenient than copying and pasting into Word – I know that’s what most of you typically do.
Let’s take a look at the code then:
    #Persistent ; Keep the script running (scripts with hotkeys already do, so this is optional)

    #q:: ; Win+Q activates the script (use !1 instead of #q for Alt+1)
    ClipSaved := ClipboardAll ; Save the entire clipboard to a variable
    Clipboard := "" ; Empty the clipboard so ClipWait can detect the new copy
    Send ^c ; This simulates CTRL+C = copy
    ClipWait, 5 ; Wait up to 5 seconds for the copied text to arrive
    StringReplace, Clipboard, Clipboard, ', x, All ; Replace single quotes with "x" so "don't" counts as one word
    StringReplace, Clipboard, Clipboard, -, x, All ; Replace dashes with "x" so "well-known" counts as one word
    Clipboard := RegExReplace(Clipboard, "<.+?>", "", Tags) ; Remove tags from clipboard. Also counts tags: not needed, but can be useful if you need to account for tags
    Clipboard := RegExReplace(Clipboard, "\b[^\d\W]+\b", "", Count) ; Count words EXCLUDING numbers
    Clipboard := ClipSaved ; Restore the original clipboard
    ClipSaved := "" ; Free the memory used by the saved copy
    ToolTip, Word Count: %Count% ; Show a tooltip with the wordcount
    SetTimer, RemoveToolTip, 5000 ; Tooltip duration in milliseconds
    return

    RemoveToolTip: ; Makes the ToolTip disappear after the timer fires
    SetTimer, RemoveToolTip, Off
    ToolTip
    return
In a nutshell, this is how it works: when you press WIN+Q after selecting text, the selection is copied to the Clipboard. Single quotes and hyphens are replaced with the letter x, so that words like "don't" or "well-known" are counted once instead of twice. Then, all tags are removed using a regular expression (see how this works here). Finally, we replace all words excluding numbers and count how many replacements were made. That count is our wordcount, displayed as a tooltip for five seconds. The original contents of the Clipboard are restored once the count is done.
We use the same variable (Clipboard) in the different steps that we need to take to get an accurate wordcount, like handling quotes and hyphens and getting rid of tags.
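If you want to see the counting trick on its own, without touching the Clipboard, here is a minimal sketch in AutoHotkey v1 syntax. The sample sentence and the variable names (sample, noTags, tagCount, wordCount) are made up for illustration; the two regular expressions are the ones used in the script.

```autohotkey
; Minimal sketch: count tags and words in a fixed string (no hotkey, no Clipboard).
; The sample text and variable names are hypothetical.
sample := "The <b>quick</b> brown fox has 4 legs"

; Strip the tags. The 4th parameter receives the number of replacements made.
noTags := RegExReplace(sample, "<.+?>", "", tagCount)

; "Remove" every word that contains no digits. The returned text is ignored;
; only the number of replacements matters.
RegExReplace(noTags, "\b[^\d\W]+\b", "", wordCount)

MsgBox, Tags: %tagCount%`nWords: %wordCount% ; shows Tags: 2 and Words: 6
```

The number 4 is skipped because the pattern only accepts word characters that are not digits.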
Note that you can tweak the regular expressions used here, like the one for tags, if your files use a different tag format. You can also change the one for words if you want to include numbers in the count.
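As a rough idea of what such tweaks could look like, here are two possible variants in AutoHotkey v1 syntax. Both are only sketches: the sample strings, the variable names and the curly-brace tag format are assumptions for the sake of the example, so adapt the patterns to the tags you actually get.

```autohotkey
; Variant 1: include numbers in the count.
; \w matches letters, digits and underscores, so "3" and "12" are counted too.
sample1 := "Order 3 apples and 12 pears" ; hypothetical sample text
RegExReplace(sample1, "\b\w+\b", "", withNumbers)

; Variant 2: remove tags written with curly braces, such as {b} and {/b}
; (an assumed format, used here only as an example), then count as usual.
sample2 := "Click {b}here{/b} now" ; hypothetical sample text
noTags := RegExReplace(sample2, "\{.+?\}", "", tagCount)
RegExReplace(noTags, "\b[^\d\W]+\b", "", wordCount)

MsgBox, With numbers: %withNumbers%`nTags: %tagCount%`nWords: %wordCount%
; shows With numbers: 6, Tags: 2 and Words: 3
```

To use either variant in the script, swap the corresponding pattern in the RegExReplace lines and leave the rest as it is.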
This is what the wordcount looks like in action: