"Ross" <> wrote in
news:dafs3h$b2l$:
> Dear all,
> For the sequence below (indeed a single line), when I use the
> conditional check
>
> if ($line =~ /(.*)A{10,}(.*)/ ) {
>     $tmpline = $1;
> }
>
> to try to remove the substring after 10 or more consecutive A's, Perl
> seems to recognize the last poly-A run and leave the earlier ones intact.
> What can I do? In general, how do I take action on the nth occurrence
> of a pattern?
It seems to me that you should be using index rather than regular
expressions, although I am not sure what you mean by "nth occurrence".
If I understand you correctly, you want to find the first run of at
least 10 As and keep only the part of the string that precedes it. That
can be translated directly to Perl in a very straightforward way:
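#!/usr/bin/perl
use strict;
use warnings;

# Join the wrapped sequence lines into a single string.
my $s = '';
while ( <DATA> ) {
    s/\s+//g;    # drop the newline and any stray spaces in the data
    $s .= $_;
}

# index returns -1 if there is no run of 10 As; keep the whole string then.
my $pos = index $s, 'A' x 10;
my $r   = $pos < 0 ? $s : substr( $s, 0, $pos );
print "$r\n";
__END__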
TCCTCAGTGGGAATTCGGCATTACGGCCGGGGCACCACAATGAATGATCA TTTTC
TTCTTTGCTCTCCTTGCTATTGCTGCATGCAGCGCCTCTGCGCAGTTTGA TGCTG
TTACTCAAGTTTACAGGCAATATCAGCTGCAGCCGCATCTCATGCTGCAG CAACA
GATGCTTAGCCCATGCGGTGAGTTCGTAAGGCAGCAGTGCAGCACAGTGG CAACC
CCCTTCTTCCAATCACCCGTGTTTCAACTGAGAAACTGCCAAGTCATGCA GCAGC
AGTGCTGCCAACAGCTCAGGATGATCGCGCAACAGTCTCACCGCCAGGCC ATTAG
TAGTGTTCAGGCGATTGTGCAGCAGCTACAGCTACAACAGTTTGCTGGCG TCTAC
TTCGATCAGACTCAAGCTCAAGCCCAAGCTATGTTGGCCCTAAACTTGCT GTCAA
TATGCGGTATCTACCCAAGCTACAACACTGCTCCCTGTAGCATTCCCACC GTCGG
TGGTATCTGGTACTGAATTGTAGCAGTATAGTAGTACAGGAGAGAAAAAT AAAGT
CATGCATCATCGTGTGTGACAAGTTGAAACATCGGGGTGATACAAATCTG AATAA
AAATGTCATGCAAGTTTAAACANNNNANANNNANNNNAAANAAAAAAAAA AAAAA
AAAANANAAAAAAAAAAAAAAAAAAAAAAAAAAANAAAAANAAAAAAAAA AAAAA
AAAAANNNNNNANANNNNNNAAAAAAAAAAAAAAAAANNNNNNNNNNGGG GGGGG
GGGGGGGCGGGAAGAAAAAAAAAAA
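
If by "nth occurrence" you mean the nth run of at least 10 As, the
following is a minimal sketch of one way to do it, not the only one. The
sub name prefix_before_nth_run and the short test sequence are made up
for illustration. It walks the runs with a /g match in scalar context
and uses $-[0], the start offset of the current match. The last line
shows why your pattern found the last run: (.*) is greedy and matches
as late in the string as it can, whereas (.*?) stops at the first run.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the part of $seq that precedes the $n-th
# run of at least $min consecutive As, or $seq unchanged if there are
# fewer than $n such runs.
sub prefix_before_nth_run {
    my ( $seq, $min, $n ) = @_;
    my $count = 0;
    while ( $seq =~ /A{$min,}/g ) {
        # $-[0] is the offset at which the current match starts.
        return substr( $seq, 0, $-[0] ) if ++$count == $n;
    }
    return $seq;
}

# Made-up test sequence: a run of 12 As, then a run of 10 As.
my $seq = 'CGT' . ( 'A' x 12 ) . 'GGC' . ( 'A' x 10 ) . 'TT';

print prefix_before_nth_run( $seq, 10, 1 ), "\n";    # up to the first run
print prefix_before_nth_run( $seq, 10, 2 ), "\n";    # up to the second run

# Regex alternative for the first run only: non-greedy (.*?).
print "$1\n" if $seq =~ /^(.*?)A{10,}/;
```

For the first run alone, index is still the simpler tool. The loop only
earns its keep when n is greater than 1.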
--
A. Sinan Unur <>
(reverse each component and remove .invalid for email address)
comp.lang.perl.misc guidelines on the WWW:
http://mail.augustmail.com/~tadmc/cl...uidelines.html