search "window" pattern matching

 
 
Cheez
      01-11-2004
Hello, hard to describe my question in a clear way. I want to process
a string that looks like this:

$mystring = "thetextinherewillbefairlyrandom";

I want to capture chunks of text and place them in an array or hash
table. If possible, I want to make a regex that will start at the
first letter and capture letters 1 - 5, in this case $capture =
"thete". Then, I want this window to shift 1 letter so that the next
captured string is letters 2 - 6, or $capture= "hetex" and so on until
the end of the line. Can anyone offer up a sample regex that would
accomplish this task?

Thanks,
Cheez

==============================================

My idea is this (although it doesn't work):

$mystring = "thetextinherewillbefairlyrandom";

$length = scalar ($mystring);

while ($counter < $length) {

$_ =~ /\w[$counter-$counter+4]/; # 'capture' regex

push @newarray; $counter++; # regex capture window increments by 1
# pushing chunks into array
}

foreach (@newarray) { #sample output

print "$newarray";

}
 

Randal L. Schwartz
      01-11-2004
>>>>> "Cheez" == Cheez <(E-Mail Removed)> writes:

Cheez> Hello, hard to describe my question in a clear way. I want to process
Cheez> a string that looks like this:

Cheez> $mystring = "thetextinherewillbefairlyrandom";

Cheez> I want to capture chunks of text and place them in an array or hash
Cheez> table. If possible, I want to make a regex that will start at the
Cheez> first letter and capture letters 1 - 5, in this case $capture =
Cheez> "thete". Then, I want this window to shift 1 letter so that the next
Cheez> captured string is letters 2 - 6, or $capture= "hetex" and so on until
Cheez> the end of the line. Can anyone offer up a sample regex would
Cheez> accomplish this task?

Use string lookahead, so they can be overlapping:

while ($mystring =~ /(?=.{5})/sg) {
push @result, $1;
}

print "Just another Perl hacker,"

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<(E-Mail Removed)> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
 

Toby
      01-11-2004
Cheez wrote:
> Hello, hard to describe my question in a clear way. I want to process
> a string that looks like this:
>
> $mystring = "thetextinherewillbefairlyrandom";
>
> I want to capture chunks of text and place them in an array or hash


perldoc -f substr

may be what you're looking for.
 

gnari
      01-11-2004
"Randal L. Schwartz" wrote:
> >>>>> "Cheez" == Cheez <(E-Mail Removed)> writes:
>
> Cheez> Hello, hard to describe my question in a clear way. I want to process
> Cheez> a string that looks like this:
>
> Cheez> $mystring = "thetextinherewillbefairlyrandom";
>
> Cheez> I want to capture chunks of text and place them in an array or hash
> Cheez> table. If possible, I want to make a regex that will start at the
> Cheez> first letter and capture letters 1 - 5, in this case $capture =
> Cheez> "thete". Then, I want this window to shift 1 letter so that the next
> Cheez> captured string is letters 2 - 6, or $capture= "hetex" and so on until
> Cheez> the end of the line. Can anyone offer up a sample regex would
> Cheez> accomplish this task?
>
> Use string lookahead, so they can be overlapping:
>
> while ($mystring =~ /(?=.{5})/sg) {
> push @result, $1;
> }


or use pos(),
or more likely, use substr()
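
For anyone curious what the pos() route might look like, here is a minimal sketch. It assumes only full five-character windows are wanted; the @windows array name is hypothetical and not from the thread. After each match, pos() is reset to one character past where that match began, so the next /g match overlaps the previous one.

```perl
use strict;
use warnings;

my $mystring = "thetextinherewillbefairlyrandom";
my @windows;    # hypothetical name, not from the thread

while ($mystring =~ /(.{5})/sg) {
    push @windows, $1;
    # $-[0] is the offset where this match started;
    # resume the next search one character after it.
    pos($mystring) = $-[0] + 1;
}

print "$_\n" for @windows;
```

The loop ends on its own once fewer than five characters remain after the current position.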

gnari



 

Marc Bissonnette
      01-11-2004
Cheez wrote in news:1e85f7c8.0401111026.52915a71@posting.google.com:

> Hello, hard to describe my question in a clear way. I want to process
> a string that looks like this:
>
> $mystring = "thetextinherewillbefairlyrandom";
>
> I want to capture chunks of text and place them in an array or hash
> table. If possible, I want to make a regex that will start at the
> first letter and capture letters 1 - 5, in this case $capture =
> "thete". Then, I want this window to shift 1 letter so that the next
> captured string is letters 2 - 6, or $capture= "hetex" and so on until
> the end of the line. Can anyone offer up a sample regex would
> accomplish this task?
>
> Thanks,
> Cheez
>
> ==============================================
>
> My idea is this (although it doesn't work):
>
> $mystring = "thetextinherewillbefairlyrandom";
>
> $length = scalar ($mystring);
>
> while ($counter < $length) {
>
> $_ =~ /\w[$counter-$counter+4]/; # 'capture' regex
>
> push @newarray; $counter++; # regex capture window increments by
> 1
> # pushing chunks into array
> }
>
> foreach (@newarray) { #sample output
>
> print "$newarray";
>
> }


Lemme take a crack at it:

#!/usr/bin/perl
use strict;
use warnings;
my $mystring = "thetextinherewillbefairlyrandom";
# get the length of $mystring:
my $length = length $mystring;
# set / declare the counter:
my $counter=0;
# set / declare the array:
my @newarray;
# while the counter is less than the length of $mystring, grab bits of text:
while ($counter < $length) {
    # grab 5 characters from the last position used within $mystring
    my $tempstring = substr $mystring,$counter,5;
    # dump it into @newarray:
    push @newarray,$tempstring;
    # increment the counter and loop again
    ++ $counter;
}
for (@newarray) {
    print "$_\n";
}

output:

thete
hetex
etext
texti
extin
xtinh
tinhe
inher
nhere
herew
erewi
rewil
ewill
willb
illbe
llbef
lbefa
befai
efair
fairl
airly
irlyr
rlyra
lyran
yrand
rando
andom
ndom
dom
om
m
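
The last four lines of that output are shorter than five characters because the loop runs all the way to the end of the string. If only complete windows are wanted, one possible variant is sketched below. It stops at the last offset that still leaves a full window. The $width and $start names are my own, not from the posts above.

```perl
use strict;
use warnings;

my $mystring = "thetextinherewillbefairlyrandom";
my $width    = 5;    # hypothetical name for the window size
my @newarray;

# Stop at the last offset that still leaves $width characters.
# If the string is shorter than $width, the range is empty and
# nothing is pushed.
for my $start (0 .. length($mystring) - $width) {
    push @newarray, substr($mystring, $start, $width);
}

print "$_\n" for @newarray;
```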

--
Marc Bissonnette
CGI / Database / Web Management Tools: http://www.internalysis.com
Something To Sell? Looking To Buy? http://www.whitewaterclassifieds.ca
Looking for a new ISP? http://www.canadianisp.com
 

Randal L. Schwartz
      01-11-2004
>>>>> "gnari" == gnari <(E-Mail Removed)> writes:

>> Use string lookahead, so they can be overlapping:
>>
>> while ($mystring =~ /(?=.{5})/sg) {
>> push @result, $1;
>> }


gnari> or use pos(),
gnari> or more likely, use substr()

Uh, why? Any solution with pos and substr is likely to be a lot
more complex than this simple regex.

Or are you of the habit of replacing simple solutions with complex
ones for the helluvit?

 

Tad McClellan
      01-11-2004
Marc Bissonnette <(E-Mail Removed)> wrote:

> # get the length of $mystring:
> my $length = length $mystring;
> # set / declare the counter:
> my $counter=0;
> # set / declare the array:
> my @newarray;



Comments that repeat what is already said in the code are worse
than no comments.

They are distracting, plus you have to remember to change stuff
in 2 places, the code and the comment that repeats the code.
(they have a very good chance of getting out-of-sync)


--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 

gnari
      01-11-2004
"Randal L. Schwartz" wrote:
> >>>>> "gnari" == gnari <(E-Mail Removed)> writes:
>
> >> Use string lookahead, so they can be overlapping:
> >>
> >> while ($mystring =~ /(?=.{5})/sg) {
> >> push @result, $1;
> >> }
>
> gnari> or use pos(),
> gnari> or more likely, use substr()
>
> Uh, why? Any solution with pos and substr is likely to be a lot
> more complex than this simple regex.
>
> Or are you of the habit of replacing simple solutions with complex
> ones for the helluvit?


sometimes

I just have the impression that a substr() solution is
easier for a beginner to understand and change, if
necessary.
Also, it is always good to rub in the TMTOWTDI.

On the other hand, maybe the OP really just wanted
to know if there was a *regexp* solution. In that case,
he will just ignore my comment.

gnari



 

John W. Krahn
      01-11-2004
"Randal L. Schwartz" wrote:
>
> >>>>> "Cheez" == Cheez <(E-Mail Removed)> writes:
>
> Cheez> Hello, hard to describe my question in a clear way. I want to process
> Cheez> a string that looks like this:
>
> Cheez> $mystring = "thetextinherewillbefairlyrandom";
>
> Cheez> I want to capture chunks of text and place them in an array or hash
> Cheez> table. If possible, I want to make a regex that will start at the
> Cheez> first letter and capture letters 1 - 5, in this case $capture =
> Cheez> "thete". Then, I want this window to shift 1 letter so that the next
> Cheez> captured string is letters 2 - 6, or $capture= "hetex" and so on until
> Cheez> the end of the line. Can anyone offer up a sample regex would
> Cheez> accomplish this task?
>
> Use string lookahead, so they can be overlapping:
>
> while ($mystring =~ /(?=.{5})/sg) {
> push @result, $1;
> }


(?=) doesn't capture. You probably meant /(?=(.{5}))/sg
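
A minimal, self-contained sketch of the lookahead approach with the capturing group in place, assuming only full five-character windows are wanted:

```perl
use strict;
use warnings;

my $mystring = "thetextinherewillbefairlyrandom";
my @result;

# The lookahead (?=...) consumes nothing, so every match is zero-length
# and the /g search moves forward one character at a time. The inner
# parentheses are what actually fill $1.
while ($mystring =~ /(?=(.{5}))/sg) {
    push @result, $1;
}

print "$_\n" for @result;
```

In list context the same pattern should return every capture at once, so `my @result = $mystring =~ /(?=(.{5}))/sg;` is a shorter way to write the loop.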




John
--
use Perl;
program
fulfillment
 

Cheez
      01-12-2004
Blown away at how useful c.l.p.m is for a newbie perl dude. Thanks
again to all for the replies. I think gnari made a point about substr
being easier to understand for newbies... Yes! I have a Java
background so it's always nice to see a friendly face (substring)!
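
One caveat on the Java comparison, sketched below: Perl's substr takes an offset and a length, whereas Java's String.substring takes a begin index and an exclusive end index. The $window variable is a hypothetical name used only for illustration.

```perl
use strict;
use warnings;

my $mystring = "thetextinherewillbefairlyrandom";

# Perl: substr(STRING, OFFSET, LENGTH)
# The rough Java counterpart would be mystring.substring(1, 6),
# where the second argument is an end index rather than a length.
my $window = substr($mystring, 1, 5);

print "$window\n";    # prints "hetex"
```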

God is in the regexes though

Cheers,
Cheez

Cheez wrote:
> Hello, hard to describe my question in a clear way. I want to process
> a string that looks like this:

[snip]
 