Finding Duplicate Words

As an example of how useful back references can be, this page shows how to search for duplicated words in a text. You should be familiar with Markers before trying to understand this.

The sequence @xy will match the text between the marks 'x' and 'y', where x and y are 0-9. Obviously these marks must have already been set in the search string before you can use them there. You are advised not to use marks 0 or 9 in backreferences.

An example of where backreferences can be useful is in finding out if a text contains doubled-up words such as "the the".

Here's an expression, using set shorthands, to do this:

Search \s|\p @1 {\a}+ @2 \s|\p @12 \s|\p

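Reading the expression from left to right: it matches a whitespace or punctuation character, then a run of one or more letters enclosed between marks 1 and 2, then another whitespace or punctuation character, then @12, which must match the same text that lies between marks 1 and 2, and finally one more whitespace or punctuation character. If the search gets this far, it has ended and we have found a duplicated word.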
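For readers more used to conventional regular expressions, the same idea can be sketched in Python. This is only a rough analogue, not the editor's own syntax: the capture group stands in for the text between marks 1 and 2, \1 stands in for @12, and the character class chosen for "punctuation" (any character that is neither a word character nor whitespace) is an assumption. The names DUPLICATE_WORD and find_duplicate_words are made up for this illustration.

```python
import re

# Assumed stand-in for "\s|\p": whitespace, or any character that is
# neither a word character nor whitespace (a loose notion of punctuation).
DELIM = r"(?:\s|[^\w\s])"

# delimiter, word (captured), delimiter, the same word again, delimiter.
# The capture group plays the role of marks 1 and 2; \1 plays the role of @12.
DUPLICATE_WORD = re.compile(DELIM + r"([A-Za-z]+)" + DELIM + r"\1" + DELIM)


def find_duplicate_words(text):
    """Return each word that appears twice in a row, in order of appearance."""
    return [match.group(1) for match in DUPLICATE_WORD.finditer(text)]


if __name__ == "__main__":
    print(find_duplicate_words("Paris in the the spring."))
```

As with the expression above, a delimiter is required on both sides, so "the theory" is not reported and a doubled word at the very start of the text is missed. The comparison is also case sensitive in this sketch, so "The the" would not be found.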

Note that instead of the set shorthands we could have used the predefined sets: White for \s, Punct for \p and Alpha for \a.

If you wish to experiment you can use the sample text


Page Information
Last modified: Wed, 30 May 2018 09:33:40 BST
© 2017 - 2018 Richard Torrens.