Matching strings with index – getting extra matches.

 
 
G
02-09-2004
I’m looping through a sales_file looking for matches. The file
has a number of entries such as the following:

sales item aaa | m423a
sales item bbb | m423
sales item ccc | m423b
sales item ddd | 423

These refer to sales_item and code respectively.

Here is the code segment:

open FILE, "<$sales_file";
while (<FILE>) {
($sales_item, $code) = split /\|/;
if (index($code, $entered_code) != -1) {
$list .= "<br>" if ($list);
$list .= $sales_item;
}
} # while
close FILE;

The problem is that if $entered_code is 423, I get matches for all 4
lines, when I only want a match for the fourth line, “sales item ddd”.
Similarly, an $entered_code of m423 matches the first 3. Any
suggestions on how I can get the right matches? Keep in mind that I
would prefer to do it in code and not alter the sales_file.

Thanks,

C
 
Paul Lalli
02-09-2004
On Mon, 9 Feb 2004, G wrote:

> I’m looping through a sales_file looking for matches. The file
> has a number of entries such as the following:
>
> sales item aaa | m423a
> sales item bbb | m423
> sales item ccc | m423b
> sales item ddd | 423
>
> These refer to sales_item and code respectively.
>
> Here is the code segment:
>
> open FILE, "<$sales_file";
> while (<FILE>) {
> ($sales_item, $code) = split /\|/;
> if (index($code, $entered_code) != -1) {
> $list .= "<br>" if ($list);
> $list .= $sales_item;
> }
> } # while
> close FILE;
>
> The problem is, if the $entered_code is 423 I get matches for all 4
> when I would only want matches for the fourth sales item “sales
> item ddd” line. Similarly, an $entered_code of m423 would match
> the first 3. Any suggestions on how I can get the right matches,
> keeping in mind that I would prefer to do it in code, and not alter
> the sales_file.


Replace the index() line with
if ($code =~ /^\s*$entered_code\s*$/) {

This will search the $code line for 'beginning of string, possible white
space, the code, possible white space, end of string', rather than just
"the code anywhere within the string" as you're doing now.

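A minimal sketch of that anchored match, run against the sample lines from the question. The helper names code_matches and matching_items are made up for illustration, and the \Q...\E around the entered code is an addition not in the line above: it assumes the code comes from user input and might contain regex metacharacters.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $code equals $entered_code once surrounding
# whitespace (including the trailing newline) is ignored.
sub code_matches {
    my ($code, $entered_code) = @_;
    # \Q...\E quotes any regex metacharacters in the entered code.
    return $code =~ /^\s*\Q$entered_code\E\s*$/ ? 1 : 0;
}

# Sample lines from the original post, as they would be read from the file.
my @lines = (
    "sales item aaa | m423a\n",
    "sales item bbb | m423\n",
    "sales item ccc | m423b\n",
    "sales item ddd | 423\n",
);

# Hypothetical helper: collect the items whose code matches exactly.
sub matching_items {
    my ($entered_code) = @_;
    my @items;
    for my $line (@lines) {
        my ($sales_item, $code) = split /\|/, $line;
        push @items, $sales_item if code_matches($code, $entered_code);
    }
    return @items;
}

# Only the "sales item ddd" line should be listed for 423.
print "$_\n" for matching_items('423');
```
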
Paul Lalli
 
gnari
02-09-2004
"G" wrote:

[problem with index not matching string exactly]

> sales item aaa | m423a
> sales item bbb | m423
> sales item ccc | m423b
> sales item ddd | 423


if there is always a space around the '|' you should
include the spaces in your split

> ...
> ($sales_item, $code) = split /\|/;

($sales_item, $code) = split / \| /;

> if (index($code, $entered_code) != -1) {



if there is no trailing space after the code (and the
newline has been removed with chomp) then
if ($code eq $entered_code) {

if on the other hand, your data is dirty with
whitespace, you are better off with a
regexp match as someone else suggested
or even replace the split with a match:

($sales_item, $code) = /^\s*(.+?)\s*\|\s*(.+?)\s*$/;
if ($code eq $entered_code) {
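
a rough sketch of the match-instead-of-split idea, assuming one record per line. the name parse_record is made up for illustration. in a pattern like this the final $ matters: the second group is non-greedy, so without an end anchor it would stop after a single character.

```perl
use strict;
use warnings;

# Hypothetical helper: split one record into (item, code), discarding
# whitespace around the '|' and at both ends of the line.
sub parse_record {
    my ($line) = @_;
    my ($sales_item, $code) = $line =~ /^\s*(.+?)\s*\|\s*(.+?)\s*$/;
    return ($sales_item, $code);
}

my ($item, $code) = parse_record("  sales item ddd |  423  \n");
print "item='$item' code='$code'\n";   # prints: item='sales item ddd' code='423'

# With clean fields, a plain string comparison is enough.
print "exact match\n" if $code eq '423';
```
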

gnari




 
Tad McClellan
02-09-2004
G wrote:

> open FILE, "<$sales_file";



You should always, yes *always*, check the return value from open():

open FILE, "<$sales_file" or die "could not open '$sales_file' $!";
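
A sketch of the same check using the three-argument form of open with a lexical filehandle, which is a common variation rather than what the original code used. The helper name read_lines and the temporary file are assumptions made so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read all lines of a file, dying with the OS error
# message ($!) if the file cannot be opened.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path
        or die "could not open '$path': $!";
    my @lines = <$fh>;
    close $fh;
    return @lines;
}

# Demonstration with a throwaway file so the sketch runs anywhere.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "sales item ddd | 423\n";
close $tmp_fh;

print read_lines($tmp_path);

# A missing file now produces a clear error instead of a silent empty loop.
my $ok = eval { read_lines("$tmp_path.missing"); 1 };
print "open failed as expected: $@" unless $ok;
```
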


--
Tad McClellan, SGML consulting and Perl programming
Fort Worth, Texas
 
G
02-10-2004
"gnari" wrote:
> "G" wrote:
>
> [problem with index not matching string exactly]
>
> > sales item aaa | m423a
> > sales item bbb | m423
> > sales item ccc | m423b
> > sales item ddd | 423

>
> if there is always a space around the '|' you should
> include the spaces in your split
>
> > ...
> > ($sales_item, $code) = split /\|/;

> ($sales_item, $code) = split / \| /;
>
> > if (index($code, $entered_code) != -1) {

>
>
> if there is no trailing space after the code (and the
> newline has been removed with chomp) then
> if ($code eq $entered_code) {
>
> if on the other hand, your data is dirty with
> whitespace, you are better off with a
> regexp match as someone else suggested
> or even replace the split with a match:
>
> ($sales_item, $code) = /^\s*(.+?)\s*\|\s*(.+?)\s*$/;
> if ($code eq $entered_code) {
>
> gnari


Thanks for the suggestions so far, but I now realize the sample text
file was flawed. For one, there is never white space around the '|'.
Secondly, a line can have multiple codes, but no duplicates (on that
line only). The sample file should have looked as follows:

sales item aaa|543,m423a
sales item bbb|m423,543
sales item ccc|m423b
sales item ddd|423,423b,m523,652

(Note how code 543 appears on both the 1st and 2nd lines.)

Given that the above has changed, how can I get a match? E.g. a code
of 423 should return the description in line 4, "sales item ddd",
whereas m423 should match only the 2nd line, etc.

Thanks,

C
 
Ben Morrow
02-10-2004

G wrote:
> Thanks for the suggestions so far, but I now realize the sample text
> file was flawed. For one, there is never white space around the '|'.
> Secondly, a line can have multiple codes, but no duplicates (on that
> line only). The sample file should have looked as follows:
>
> sales item aaa|543,m423a
> sales item bbb|m423,543
> sales item ccc|m423b
> sales item ddd|423,423b,m523,652
>
> (Note how code 543 appears on both the 1st and 2nd lines.)
>
> Given that the above has changed, how can I get a match? E.g. a code
> of 423 should return the description in line 4, "sales item ddd",
> whereas m423 should match only the 2nd line, etc.


my $code = 'm423';
while (<>) {
my ($item, $codes) = split /\|/;
my @codes = split /,/, $codes;
print $item if grep { $_ eq $code } @codes;
}

alternatively:

/(.*) \| (?:.*,|) \Q$code\E (?:,|$)/x and print $1 while <>;
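
For readers who find the one-liner dense, here is a spelled-out variation, a sketch rather than a drop-in replacement. The name item_for_code is hypothetical, and the pattern differs slightly from the one above (it anchors at the start of the line and uses a non-greedy item group).

```perl
use strict;
use warnings;

# Hypothetical helper: return the item description if $code appears as a
# complete comma-separated entry after the '|', otherwise undef.
sub item_for_code {
    my ($line, $code) = @_;
    return $line =~ /
        ^ (.*?)          # item description, up to the first separator
        \|
        (?: .* , )?      # optionally, earlier codes ending in a comma
        \Q$code\E        # the code itself, metacharacters quoted
        (?: , | $ )      # followed by a comma or the end of the line
    /x ? $1 : undef;
}

my $item = item_for_code("sales item ddd|423,423b,m523,652\n", '423');
print "$item\n" if defined $item;   # prints: sales item ddd
```

Because $ also matches just before a trailing newline, this form works on lines that have not been chomped.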

Ben

--
$.=1;*g=sub{print@_};sub r($$\$){my($w,$x,$y)=@_;for(keys%$x){/main/&&next;*p=$
$x{$_};/(\w)::$/&&(r($w.$1,$x.$_,$y),next);$y eq\$p&&&g("$w$_")}};sub t{for(@_)
{$f&&($_||&g(" "));$f=1;r"","::",$_;$_&&&g(chr(0012))}};t # (E-Mail Removed)
$J::u::t, $a::n:::t::h::e::r, $P::e::r::l, $h::a::c::k::e::r, $.
 
G
02-12-2004
Ben Morrow wrote:
> G wrote:
> > Thanks for the suggestions so far, but I now realize the sample text
> > file was flawed. For one, there is never white space around the '|'.
> > Secondly, a line can have multiple codes, but no duplicates (on that
> > line only). The sample file should have looked as follows:
> >
> > sales item aaa|543,m423a
> > sales item bbb|m423,543
> > sales item ccc|m423b
> > sales item ddd|423,423b,m523,652
> >
> > (Note how code 543 appears on both the 1st and 2nd lines.)
> >
> > Given that the above has changed, how can I get a match? E.g. a code
> > of 423 should return the description in line 4, "sales item ddd",
> > whereas m423 should match only the 2nd line, etc.

>
> my $code = 'm423';
> while (<>) {
> my ($item, $codes) = split /\|/;
> my @codes = split /,/, $codes;
> print $item if grep { $_ eq $code } @codes;
> }
>

I finally gave this code a try, but it only partially works. For
instance a code of m423b will not pull up any results. Neither will
m523. My guess is we are not splitting things out right: my
@codes = split /,/, $codes;

Thanks,

C
 
Ben Morrow
02-12-2004

G wrote:
> Ben Morrow wrote:
> > my $code = 'm423';
> > while (<>) {


chomp;

> > my ($item, $codes) = split /\|/;
> > my @codes = split /,/, $codes;
> > print $item if grep { $_ eq $code } @codes;
> > }

>
> I finally gave this code a try, but it only partially works. For
> instance a code of m423b will not pull up any results. Neither will
> m523. My guess is we are not splitting things out right: my
> @codes = split /,/, $codes;
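
Why the chomp matters, as a self-contained sketch: each line read from a file still ends in a newline, so the last code on a line is really "m423b\n" and never compares equal to "m423b". The sample data is taken from the corrected post; the helper name items_for_code is made up for illustration.

```perl
use strict;
use warnings;

# Sample data from the corrected post (every line ends in a newline,
# as it would when read from a file).
my @sales_lines = (
    "sales item aaa|543,m423a\n",
    "sales item bbb|m423,543\n",
    "sales item ccc|m423b\n",
    "sales item ddd|423,423b,m523,652\n",
);

# Hypothetical helper: return the items whose code list contains $code exactly.
sub items_for_code {
    my ($code, @lines) = @_;
    my @items;
    for my $line (@lines) {
        chomp $line;    # without this, the last code is e.g. "m423b\n"
        my ($item, $codes) = split /\|/, $line;
        my @codes = split /,/, $codes;
        push @items, $item if grep { $_ eq $code } @codes;
    }
    return @items;
}

print join(', ', items_for_code('m423b', @sales_lines)), "\n";  # sales item ccc
print join(', ', items_for_code('543',   @sales_lines)), "\n";  # sales item aaa, sales item bbb
```
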


Ben

--
The cosmos, at best, is like a rubbish heap scattered at random.
- Heraclitus