String::Approx 'aindex' help

 
 
Puri
      08-24-2005
Hello,

I am trying to write a simple program using the String-Approx module
that returns the indexes of multiple matches in a single string. Here
is my code:
use String::Approx 'aindex';

$seq1 = "cagtttgtgtaagtgatcacgtnngatttacatatagccatcg";
$seq2 = "ag";
$seq2length = length($seq2);

print "$seq1\n\n";

$indexcount = 0;
#$index[0] = aindex($seq2, ["i 0"], $seq1);
$initial = 0;

until ($index[$indexcount-1] == -1) {
$index[$indexcount] = aindex($seq2, ["i 0 initial_position=$initial"],
$seq1);
$initial=($index[$indexcount]+$seq2length);
$indexcount++;
}

pop @index;

print "@index";

##End code

This returns the following array @index: 1 10 35

However, this is incorrect, because the third "ag" is found starting at
the 11th character, not the 10th. Is there something wrong with the
code, or an easier way to use the "aindex" function of String-Approx to
search for multiple matches within one string?

Thanks in advance,
Puri

 
jl_post@hotmail.com
      08-24-2005
Puri wrote:
>
> I am trying to write a simple program using the String-Approx
> module that returns the indexes of multiple matches in a
> single string. Here is my code:
> use String::Approx 'aindex';



Dear Puri,

Out of curiosity, couldn't you just use the index() function? If
you don't know how to use it, you can learn by reading "perldoc -f
index". I suggest this because it doesn't look to me like you are
trying to find an approximation, but rather an exact match.
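
For exact matches only, a minimal sketch of that suggestion might look like the following. It relies only on the core index() function and its optional third argument (the position to start searching from). The names @index and $pos are chosen here for illustration, and matches are treated as non-overlapping, as in the original loop.

```perl
use strict;
use warnings;

my $seq1 = "cagtttgtgtaagtgatcacgtnngatttacatatagccatcg";
my $seq2 = "ag";

my @index;   # zero-based start offsets of each exact match
my $pos = 0; # where the next search begins

# index() returns -1 once no further match exists.
while (($pos = index($seq1, $seq2, $pos)) != -1) {
    push @index, $pos;
    $pos += length $seq2;   # skip past this match (no overlaps)
}

print "@index\n";
```

For this sequence the sketch prints 1 11 35.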


> $seq1 = "cagtttgtgtaagtgatcacgtnngatttacatatagccatcg";
> $seq2 = "ag";

[code snipped]
>
> This returns the following array @index: 1 10 35
>
> However, this is incorrect, because the third "ag" is
> found starting at the 11th character, not the 10th.
> Is there something wrong with the code,



That doesn't seem wrong to me. Normally, positions in Perl start at
zero, which means that 0 signifies the first character (and that 10
signifies the 11th character). This property of certain programming
languages can be confusing to those who aren't familiar with it.

I hope this helps, Puri.

-- Jean-Luc

 
Puri
      08-24-2005
Jean-Luc,

Thank you for the quick reply; however, I don't think that answers my
question. I am using String-Approx (version 3.25 by the way) because I
hope to eventually be finding approximate matches within sequences as
well. However, I wanted to check to make sure that I could get the
indexing to work properly first with perfect matches (hence the 0 in
the modifiers).

As for Perl positions starting at zero, I understand this, and this is
exactly why I think there is a problem. If you look at the array, it
says there are matches starting at positions 1, 10, and 35. However,
if you look at the first sequence I have entered ($seq1), the matches
should be at 1 (literally the second character entered in my string),
11 (character#12) and 35. 1 and 35 are correct, but the indexing seems
to be getting confused with the middle match, possibly because of the
'a' in front of the 'ag'.
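
One quick way to check what actually sits at each reported offset is to slice the sequence with the core substr() function. This is only an illustrative sketch; the hash name %found and the list of offsets are chosen here and are not part of the original program.

```perl
use strict;
use warnings;

my $seq1 = "cagtttgtgtaagtgatcacgtnngatttacatatagccatcg";

# Record and print the two characters starting at each zero-based offset.
my %found;
for my $offset (1, 10, 11, 35) {
    $found{$offset} = substr($seq1, $offset, 2);
    print "$offset: $found{$offset}\n";
}
```

Offsets 1, 11 and 35 each hold "ag", while offset 10 holds "aa".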

Any other suggestions will be greatly appreciated.

-Puri

 
John W. Krahn
      08-25-2005
Puri wrote:
>
> As for Perl positions starting at zero, I understand this, and this is
> exactly why I think there is a problem. If you look at the array, it
> says there are matches starting at positions 1, 10, and 35. However,
> if you look at the first sequence I have entered ($seq1), the matches
> should be at 1 (literally the second character entered in my string),
> 11 (character#12) and 35. 1 and 35 are correct, but the indexing seems
> to be getting confused with the middle match, possibly because of the
> 'a' in front of the 'ag'.
>
> Any other suggestions will be greatly appreciated.


You should ask the author of that module whether that is a bug or the correct
behavior.


John
--
use Perl;
program
fulfillment
 
xhoster@gmail.com
      08-25-2005
Puri wrote:
....
> $seq1 = "cagtttgtgtaagtgatcacgtnngatttacatatagccatcg";
> $seq2 = "ag";
> $seq2length = length($seq2);
>
> print "$seq1\n\n";
>
> $indexcount = 0;
> #$index[0] = aindex($seq2, ["i 0"], $seq1);
> $initial = 0;
>
> until ($index[$indexcount-1] == -1) {
> $index[$indexcount] = aindex($seq2, ["i 0 initial_position=$initial"],
> $seq1);
> $initial=($index[$indexcount]+$seq2length);
> $indexcount++;
> }
>
> pop @index;
>
> print "@index";
>
> ##End code
>
> This returns the following array @index: 1 10 35


At first I thought it was a counting problem, but now I see that the problem
is that 'aa' is matching. If you change the code so that overlapping matches
are allowed, then it returns 1 10 11 35. I don't know why 'aa' matches, and
I don't know enough about String::Approx to know if this is a bug or if it
is correct for some reason I don't understand.
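
As a cross-check that does not depend on String::Approx, exact matches (overlapping ones included) can be listed with a core regular expression. This is only a sketch under the assumption that exact matching is what is wanted; it does not explain what aindex does internally. The name @overlapping is illustrative.

```perl
use strict;
use warnings;

my $seq1 = "cagtttgtgtaagtgatcacgtnngatttacatatagccatcg";
my $seq2 = "ag";

# A zero-width lookahead consumes no characters, so the search resumes
# one character later and overlapping occurrences are not skipped.
my @overlapping;
while ($seq1 =~ /(?=\Q$seq2\E)/g) {
    push @overlapping, $-[0];   # $-[0] is the start offset of the last match
}

print "@overlapping\n";
```

For this sequence it prints 1 11 35, so an exact search does not report offset 10 even when overlaps are allowed.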

Xho

 
 
 
 