String matching and alignment?

 
 
Bryan
      06-09-2004
If I have some sequence data:
my $seq = "AGCCTCAAAGTTCGG";

and some subset:
my $subset = "CAAAGTTC";

I first want to match the pattern to see if $subset is found in $seq
(which is fine), but then I also want to know the starting and end
positions of the match in the original sequence, i.e. start = 6, end =
13. Is there something (like backreferences) from the regex that will
give me this info? Or....?

thanks,
B

 
gnari
      06-09-2004
Bryan wrote:
> If I have some sequence data:
> my $seq = "AGCCTCAAAGTTCGG";
>
> and some subset:
> my $subset = "CAAAGTTC";
>
> I want to first, match the pattern to see if $subset is found in $seq
> (which is fine), but then I also want to know the starting and end
> positions of the match in the original sequence, i.e. start = 6, end =
> 13. Is there something (like backreferences) from the regex that will
> give me this info? Or....?


perldoc -f index

if you really want to use regexes:

perldoc perlvar (see @- and @+)
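
For a plain substring like this one, the index() route might look like the sketch below. The helper name find_span is made up for illustration and is not from the thread. It assumes the 1-based, inclusive positions Bryan asked for, whereas index() itself returns a 0-based offset, or -1 when the substring is absent.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): returns the
# 1-based, inclusive start and end of the first occurrence of $subset
# in $seq, or an empty list if $subset does not occur.
sub find_span {
    my ($seq, $subset) = @_;
    my $offset = index($seq, $subset);    # 0-based; -1 means "not found"
    return if $offset < 0;
    return ($offset + 1, $offset + length($subset));
}

my ($start, $end) = find_span("AGCCTCAAAGTTCGG", "CAAAGTTC");
print "start = $start, end = $end\n";     # prints "start = 6, end = 13"
```

Because index() does no pattern matching, characters in $subset that would be special in a regex need no escaping here.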

gnari



 
Simon Taylor
      06-09-2004
Bryan wrote:
> If I have some sequence data:
> my $seq = "AGCCTCAAAGTTCGG";
>
> and some subset:
> my $subset = "CAAAGTTC";
>
> I want to first, match the pattern to see if $subset is found in $seq
> (which is fine), but then I also want to know the starting and end
> positions of the match in the original sequence, i.e. start = 6, end =
> 13. Is there something (like backreferences) from the regex that will
> give me this info? Or....?


Try the following:


#!/usr/bin/perl

use strict;
use warnings;

my $seq = "AGCCTCAAAGTTCGG";
my $subset = "CAAAGTTC";

if ($seq =~ m/$subset/g) {
    # pos() is the 0-based offset just past the end of the match
    print "offset where last m//g match left off: " . pos($seq) . "\n";
    print "everything before matched string: $`\n";
    print "everything after matched string: $'\n";
    print "The entire matched string: $&\n\n";
}

Which gives me the following output:

offset where last m//g match left off: 13
everything before matched string: AGCCT
everything after matched string: GG
The entire matched string: CAAAGTTC


Hope this helps.

- Simon Taylor
--
Unisolve Pty Ltd - Melbourne, Australia
 
Sisyphus
      06-09-2004
Bryan wrote:
> If I have some sequence data:
> my $seq = "AGCCTCAAAGTTCGG";
>
> and some subset:
> my $subset = "CAAAGTTC";
>
> I want to first, match the pattern to see if $subset is found in $seq
> (which is fine), but then I also want to know the starting and end
> positions of the match in the original sequence, i.e. start = 6, end =
> 13. Is there something (like backreferences) from the regex that will
> give me this info? Or....?
>


Start = length($`) + 1
End = length($`) + length($&)
Alternatively, end = length($`) + length($subset)

See perldoc perlvar for documentation on $`, $' and $&.
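
Put into code, the formulas above could look roughly like this. It is only a sketch: the sub name span_from_prematch is invented here, and the \Q...\E quoting is an addition so that $subset is matched literally even if it ever contains regex metacharacters.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): applies
#   start = length($`) + 1
#   end   = length($`) + length($&)
# and returns 1-based, inclusive positions, or an empty list on no match.
sub span_from_prematch {
    my ($seq, $subset) = @_;
    return unless $seq =~ /\Q$subset\E/;   # \Q...\E: match $subset literally
    my $before = length($`);               # characters before the match
    return ($before + 1, $before + length($&));
}

my ($start, $end) = span_from_prematch("AGCCTCAAAGTTCGG", "CAAAGTTC");
print "start = $start, end = $end\n";      # prints "start = 6, end = 13"
```

Here the prematch is "AGCCT" (length 5) and the match is 8 characters long, which gives 6 and 13. As far as I know, perlvar describes a program-wide speed penalty for these variables on older perls, so check the documentation for the version you run.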

Cheers,
Rob


--
To reply by email u have to take out the u in kalinaubears.

 
Joe Smith
      06-10-2004
Bryan wrote:

> If I have some sequence data:
> my $seq = "AGCCTCAAAGTTCGG";
>
> and some subset:
> my $subset = "CAAAGTTC";
>
> I want to first, match the pattern to see if $subset is found in $seq
> (which is fine), but then I also want to know the starting and end
> positions of the match in the original sequence, i.e. start = 6, end =
> 13. Is there something (like backreferences) from the regex that will
> give me this info?


Use the magic arrays @- and @+.
(Do not use $`, $&, and $' as they will just slow you down.)

my $seq = "AGCCTCAAAGTTCGG";
my $subset = "CAAAGTTC";
if ($seq =~ /($subset)(.?)/) {
    print "Overall match starts at $-[0] and ends just before $+[0]\n";
    print " 1st () match starts at $-[1] and ends just before $+[1]\n";
    print " 2nd () match starts at $-[2] and ends just before $+[2]\n";
}
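
To get from those offsets to the start = 6, end = 13 numbers in the original question, something like the following sketch should do. The sub name match_span is made up for illustration, and \Q...\E is an addition that makes $subset match literally. The offsets in @- are 0-based, and $+[0] points just past the last matched character, so only the start needs adjusting.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): returns the
# 1-based, inclusive start and end of the first match of $subset in
# $seq, or an empty list if there is no match.
sub match_span {
    my ($seq, $subset) = @_;
    return unless $seq =~ /\Q$subset\E/;
    # $-[0]: 0-based offset where the whole match starts
    # $+[0]: 0-based offset just past the end of the whole match
    return ($-[0] + 1, $+[0]);
}

my $seq = "AGCCTCAAAGTTCGG";
my ($start, $end) = match_span($seq, "CAAAGTTC");
print "start = $start, end = $end\n";   # prints "start = 6, end = 13"

# The matched text can be recovered from the positions with substr().
print substr($seq, $start - 1, $end - $start + 1), "\n";   # prints "CAAAGTTC"
```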
 