How an Advanced Search is Made

Some of the search possibilities differ in subtle ways, and an understanding of the search process will help you make best use of them.

To understand the search process it is important to remember that a file in memory is stored as a long string of Bytes, each Byte representing a character.
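As a small illustration of this byte-per-character view, here is a sketch in Python. It assumes a single-byte encoding (Latin-1 is used here purely as an example); the variable names are illustrative and have nothing to do with StrongED itself.

```python
# A short piece of text, as it might appear in a file.
text = "The search"

# Assumption: a single-byte encoding, so each character becomes one byte.
data = text.encode("latin-1")

# The file in memory is then just a sequence of byte values.
first_bytes = list(data[:3])
print(len(text), len(data))   # same length: one byte per character
print(first_bytes)            # byte values for 'T', 'h', 'e'
```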

The search expression consists of a number of separate search elements strung together, with spaces as desired, to make the expression readable. Spaces outside of a search element are generally ignored (see exception 1 below). The diagram shows six search elements, e1 to e6.
[Diagram: search_how/png]

During a search, StrongED maintains three pointers. Pointer Ts (Text Start) points to the Start of the Text string currently being inspected for a match, and is tied to the first element in the Search expression. Tm (Text Match) points to the byte currently being inspected for a Match against the Search expression.

Em (Element Match) points to the search element currently doing the inspecting.

At the start of a search, Tm points to the same place as Ts, and Em points to e1. If no match is found, all three are moved on together through the text, one byte at a time, until the text byte being inspected matches e1. Now e2 is matched against the next byte(s) of the text (in the diagram, starting at the letter e). If these bytes match, Tm moves on to the end of the matched section and Em moves to e3.

Thus the matching continues in chunks until one of two things happens:

  1. No match is found. This may be because the element is part of an OR, so the pointer Em must Look Ahead (Ea) to see whether there is any such element that needs to override the "no-match". If there is indeed no match, then Ts is moved on one byte and Tm and Em are returned to their starts, coupled to Ts.
  2. If a match is found, then Em moves on to the next element which is matched against the next chunk of text.
If all the elements of the search expression match, then a complete match is found.
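
The pointer movement described above can be sketched in Python. This is a simplified model, not StrongED's actual implementation: it leaves out the OR look-ahead (Ea), and the `Element` type and the helpers `literal`, `any_of` and `search` are hypothetical names introduced only for this illustration.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical element type: given the text and the position Tm, return the
# position just past the matched chunk, or None if the element does not match.
Element = Callable[[bytes, int], Optional[int]]

def literal(chunk: bytes) -> Element:
    """Element that matches an exact run of bytes."""
    def match(text: bytes, tm: int) -> Optional[int]:
        return tm + len(chunk) if text.startswith(chunk, tm) else None
    return match

def any_of(chars: bytes) -> Element:
    """Element that matches a single byte taken from a set."""
    def match(text: bytes, tm: int) -> Optional[int]:
        return tm + 1 if tm < len(text) and text[tm] in chars else None
    return match

def search(text: bytes, elements: List[Element]) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first complete match, or None."""
    for ts in range(len(text) + 1):    # Ts: start of the text being inspected
        tm = ts                        # Tm starts at the same place as Ts
        for element in elements:       # Em steps through e1, e2, ...
            end = element(text, tm)
            if end is None:
                break                  # no match: Ts moves on one byte
            tm = end                   # match: Tm moves past the matched chunk
        else:
            return ts, tm              # all elements matched
    return None

# Three elements: "s", then "ea", then one of "r" or "R".
result = search(b"The search text", [literal(b"s"), literal(b"ea"), any_of(b"rR")])
print(result)
```

Each element consumes a chunk of text rather than a single byte, which is why Tm is set to the end of the matched section before the next element is tried.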

There is also an element ~ which causes the search to look ahead without moving the Text Match (Tm) pointer, as shown by Ta (Text ahead), until the element qualified by the ~ fails to match. In other words, the ~ qualifier causes any bytes that match the qualified element to be ignored.

The text byte that caused the failure (i.e. is not ignored) is then matched against the following element.
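
The look-ahead behaviour of ~ might be sketched as follows. This is only a model under stated assumptions: the qualified element is represented as a set of single bytes, and `skip_ahead` is a hypothetical helper name, not part of StrongED.

```python
def skip_ahead(text: bytes, tm: int, ignored: bytes) -> int:
    """Return Ta: the first position at or after Tm whose byte is not ignored."""
    ta = tm
    while ta < len(text) and text[ta] in ignored:
        ta += 1                        # this byte matches the qualified element
    return ta

sample = b"key   = value"
tm = 3                                 # Tm sits on the first space after "key"
ta = skip_ahead(sample, tm, b" ")      # look ahead past the spaces
next_byte = sample[ta:ta + 1]          # the byte that caused the failure
print(ta, next_byte)
```

Here the byte at Ta is the one that would be matched against the element following the ~ qualified element.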

Some exceptions to the above

Exception 1

The list of predefined shorthands for character sets shows the letter D standing for any numerical digit. To specify several digits using the letter D, spaces must be included. Thus D D D D will find groups of four digits and is identical to ####.
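
For readers more familiar with regular expressions, the idea of four digit elements in a row can be illustrated with Python's `re` module. This is only an analogy: `\d\d\d\d` is Python syntax, not StrongED syntax, and StrongED's treatment of longer digit runs may differ.

```python
import re

# Four single-digit elements in sequence, analogous to D D D D or ####.
four_digits = re.compile(rb"\d\d\d\d")

# Find the non-overlapping groups of four digits in a sample line.
groups = four_digits.findall(b"Built in 1987, revised in 2024.")
print(groups)
```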


Page Information

Document URI: http://stronged.torrens.org/man/search/how.html
Page first published: Tuesday the 23rd of January, 2018
Last modified: Mon, 08 Jul 2024 09:18:37 BST
© 2018 - 2024 Richard John Torrens.