Finding Duplicate Words

As an example of how backreferences can be useful, this page shows how to search for duplicated words in a text. You should be familiar with Markers before reading on.

The sequence @xy matches the text between the marks 'x' and 'y', where x and y are digits 0-9. Both marks must already have been set earlier in the search string before they can be referenced. You are advised not to use marks 0 or 9 in backreferences.

One place where backreferences are useful is in finding out whether a text contains doubled-up words such as "the the".

Here's an expression, using set shorthands, to do this:

Search \s|\p @1 {\a}+ @2 \s|\p @12 \s|\p

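Reading left to right: \s|\p matches a white space or punctuation character, @1 sets mark 1, {\a}+ matches a run of alphabetic characters, @2 sets mark 2, \s|\p matches the separator after the word, @12 matches the same text that lies between marks 1 and 2, and the final \s|\p makes sure the repeated word ends there. If the search gets that far, it has found a duplicated word.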

Note that instead of the set shorthands we could have used the predefined sets: White for \s, Punct for \p and Alpha for \a.
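
For readers more used to conventional regular expressions, the same idea can be sketched in Python. This is only an analogy and not the search syntax described on this page: it assumes Python's standard re module, where a capture group and \1 play the role of the two marks and @12, and \W+ stands in loosely for the white space or punctuation separator. The function name find_duplicates is made up for illustration.

```python
import re

# Rough analogue of the search expression above (an assumption, not the
# editor's own syntax): capture a run of letters, require at least one
# non-word character, then require the captured text again as a whole word.
DUPLICATE = re.compile(r"\b([A-Za-z]+)\W+\1\b")


def find_duplicates(text):
    """Return each word that is immediately followed by a copy of itself."""
    return [match.group(1) for match in DUPLICATE.finditer(text)]


if __name__ == "__main__":
    print(find_duplicates("Paris in the the spring. It is is fine"))
```

The trailing \b does the job of the final \s|\p in the expression above: without it, "the theory" would be reported as a duplicate. Unlike the original expression, this sketch compares case-sensitively and does not need a separator before the first word.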

If you wish to experiment, you can use the sample text.

Last modified: Thu, 10 Oct 2019 13:12:18 BST
© 2017 - 2024 Richard Torrens.