Count Words Faster – a script (GTtC V)

This is a new post in the Getting Translators to Code series. If you haven’t read the previous posts, I suggest you do – you’ll like them!

This time, we’ll work on a short script to count words. Do I need to talk about the importance of counting words for translators? If you translate for a living, you probably know that wordcounts are how you price a translation, how you split files or documents when working in teams, how you estimate how long a translation will take, etc.

Now, if CAT tools provide wordcounts, if there are standalone tools specifically for counting words, why do you need this? Well, this is a script intended to count words quickly. This is how it works, in a nutshell:

  • You highlight text
  • You press the combination of keys of your preference, like ALT+1
  • You get a tooltip with the wordcount
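
The key combination is set on the first line of the script shown later in this post. As a minimal sketch, assuming AutoHotkey v1 syntax (which the script appears to use), this is roughly what a hotkey for ALT+1 looks like. The message box is only a placeholder for illustration, not part of the wordcount script:

```autohotkey
; Modifier symbols in hotkey definitions: # = Win, ! = Alt, ^ = Ctrl, + = Shift
!1:: ; ALT+1
MsgBox, You pressed ALT+1 ; Placeholder action, for illustration only
return
```

To use another combination in the wordcount script, only the part before the double colon needs to change.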

The script DOES NOT match your text against a TM. But it DOES:

  • work in virtually any program (text editor, web browser, even within your CAT tool),
  • exclude numbers,
  • exclude inline tags (HTML, but you can adapt this to your needs), and
  • show results almost instantly (of course, this depends on the amount of text you have).

Also, it is far more convenient than copying and pasting into Word – I know that’s what most of you typically do.

Let’s take a look at the code then:

#q:: ; Win+Q activates the script
ClipSaved := ClipboardAll ; Save the entire clipboard to a variable
Clipboard := "" ; Empty the clipboard so ClipWait can detect the copy
Send ^c ; Simulate CTRL+C (copy)
ClipWait, 5 ; Wait up to 5 seconds for the copied text to arrive
StringReplace, clipboard, clipboard, ', x, All ; Replace apostrophes with x so "don't" counts as one word
StringReplace, clipboard, clipboard, -, x, All ; Replace hyphens with x so "well-known" counts as one word
Clipboard := RegExReplace( Clipboard, "<.+?>", "", Tags) ; Remove tags from clipboard. Tags holds the number of tags removed - not needed, but can be useful if you need to account for tags
Clipboard := RegExReplace( Clipboard, "\b[^\d\W]+\b", "", Count ) ; Count words EXCLUDING numbers
Clipboard := ClipSaved ; Restore the original clipboard contents
ClipSaved := "" ; Free the memory used by the saved clipboard

#Persistent ; Keep the script running so the timer can remove the ToolTip later
ToolTip, Word Count: %Count% ;Show a tooltip with the wordcount
SetTimer, RemoveToolTip, 5000 ;Tooltip duration in milliseconds
return

RemoveToolTip:
SetTimer, RemoveToolTip, Off
ToolTip
Return

Here is what happens, step by step: when you press WIN+Q after selecting text, the selection is copied to the Clipboard. Apostrophes and hyphens are replaced with the letter x, so that words like "don't" or "well-known" are counted as one word each. Then, all tags are removed using a regular expression (see how this works here). Finally, we replace all words, excluding numbers, and count how many replacements were made. That count is our wordcount, displayed as a tooltip for five seconds.
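
If you want to check the counting logic without touching the clipboard, the same steps can be wrapped in a function. This is only a sketch, not part of the original script: the function name CountWords and the sample sentence are made up for illustration, and StrReplace() assumes AutoHotkey v1.1.21 or later.

```autohotkey
; Try the counting steps on a fixed sample sentence
MsgBox % CountWords("Don't <b>stop</b> at 42 well-known words")

; Hypothetical helper: same steps as the hotkey, applied to a plain string
CountWords(text) {
    text := StrReplace(text, "'", "x")             ; Don't -> Donxt (one word)
    text := StrReplace(text, "-", "x")             ; well-known -> wellxknown (one word)
    text := RegExReplace(text, "<.+?>", "", tags)  ; Strip inline tags; tags = how many were removed
    RegExReplace(text, "\b[^\d\W]+\b", "", count)  ; count = words made of letters only (no digits)
    return count
}
```

Run on its own, this should display 5: the words counted are Donxt, stop, at, wellxknown and words, while 42 is skipped.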

We use the same variable (Clipboard) in the different steps that we need to take to get an accurate wordcount, like removing punctuation and getting rid of tags.

Note that you can tweak the regular expressions used here, like the ones for tags. You can do the same if you want to include numbers.
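
As a rough illustration of such tweaks, the sketch below counts the same text with and without numbers, and strips curly-brace placeholders such as {1} instead of HTML tags. The sample text, the variable names and the placeholder format are assumptions for the example; adapt the tag pattern to whatever your own files contain.

```autohotkey
Sample := "Chapter 2 has {1}ten{2} pages"

; Tag tweak: remove curly-brace placeholders such as {1} instead of HTML tags
NoTags := RegExReplace(Sample, "\{.+?\}", "", TagCount)

; Original pattern: words EXCLUDING numbers
RegExReplace(NoTags, "\b[^\d\W]+\b", "", CountNoNumbers)

; Number tweak: \w also matches digits, so numbers are counted as words
RegExReplace(NoTags, "\b\w+\b", "", CountWithNumbers)

MsgBox % "Tags removed: " TagCount "`nWithout numbers: " CountNoNumbers "`nWith numbers: " CountWithNumbers
```

For this sample, the message box should report 2 tags removed, 4 words without numbers and 5 words with numbers.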

This is what the wordcount looks like in action:

[Screenshot: the wordcount script in action]
