[ Reply not posted to the defunct group comp.lang.perl ]
CV wrote:
> How can I match 'n' number of neighbouring words of a pattern using
> regular expressions?
>
> For example, suppose I am looking for the pattern "length xyz cm"
> in some text. where xyz is a number - integer or fraction or
> decimal point. How can I also grab about 3-5 words on either side
> of the pattern "length xyz cm"? The surrounding words are not
> always constant & may be variable. Also, the original text to be
> matched is not just a single sentence, but lines from a file
> concatenated together - so the text has many newline characters
> too. I only want the words on the same line as the pattern.
>
> I have tried using regex of the form
> /\b(\w*)\b(\w*)\b(\w*)\b($pattern)\b(\w*)\b(\w*)\b( \w*), but this
> doesn't work for some reason.
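It doesn't work for several reasons, such as:
- The regex contains no space characters, so it cannot step over the
whitespace between words.
- '\w*\b\w*' cannot match two separate words. \b is a zero-width
assertion that needs a word character on one side and a non-word
character (or the string edge) on the other, so at least one of the
two \w* must match the empty string (check out the description of \b
in "perldoc perlre").
- The \w character class does not include e.g. the '$' character,
while you mentioned that a "word" may be a variable.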
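
To see the \b problem in practice, here is a minimal sketch using a shortened version of your regex. The sample text and the simple $pattern are made up for illustration, not taken from your data. The match succeeds, but both word groups come back empty:

```perl
use strict;
use warnings;

# Hypothetical sample data, only for illustration.
my $text    = "the table length 2.5 cm was measured";
my $pattern = qr/length \S+ cm/;

# Shortened form of the original attempt: two \w* groups separated
# only by \b, with no whitespace in between.
my ($w1, $w2, $hit) = $text =~ /\b(\w*)\b(\w*)\b($pattern)/;

# The groups can only match the empty string right before the pattern.
printf "w1='%s' w2='%s' hit='%s'\n", $w1, $w2, $hit;
```

Since nothing in the regex can consume the space before "length", the only way for it to match is with empty word groups.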
> Could someone please offer some suggestions?
Try something like this:
/((?:\S+ +){0,3})\b($pattern)\b((?: +\S+){0,3})/
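
A rough sketch of how that could be used follows. The $pattern below is only my guess at what "length xyz cm" might look like (integer, decimal, or simple fraction), and the sample text is invented, so adjust both to your data:

```perl
use strict;
use warnings;

# Assumed pattern: "length", a number such as 12, 2.5 or 3/4, then "cm".
my $pattern = qr{length +\d+(?:\.\d+|/\d+)? +cm};

# Invented sample: lines of a file concatenated, newlines kept.
my $text = "we found that the table length 2.5 cm was measured\n"
         . "twice on the next line\n";

my (@before, @after, $match);
if ($text =~ /((?:\S+ +){0,3})\b($pattern)\b((?: +\S+){0,3})/) {
    $match  = $2;
    @before = split ' ', $1;   # up to 3 words before the pattern
    @after  = split ' ', $3;   # up to 3 words after the pattern
    print "before: @before\n";
    print "match:  $match\n";
    print "after:  @after\n";
}
```

Because \S+ never matches whitespace and ' +' matches only spaces, the captured words cannot cross a newline, so only words on the same line as the pattern are picked up. Change {0,3} to {0,5} if you want up to five words.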
--
Gunnar Hjalmarsson
Email:
http://www.gunnar.cc/cgi-bin/contact.pl