Re: memcmp versus strstr; reaction to chr(0)

 
 
Burne C
 
      07-24-2003

"Walter Dnes" <(E-Mail Removed)> wrote in message
news:bfocbv$gepau$(E-Mail Removed)-berlin.de...
> When I asked in another thread about string comparisons, I forgot
> about the chr(0) booby-trap in C strings. Since I want to compare
> random binary data, this is important to me. Someone correct me if I'm
> wrong; strstr stops at chr(0). memcmp doesn't treat chr(0) as a
> delimiter, and can compare ranges (the word "strings" is incorrect here)
> that included embedded chr(0). I realize that memcmp won't
> automatically scan a larger string, but I can put it in a loop to sweep
> through a larger string. Too bad that memmem is not standard.
>
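The difference described above can be checked with a small sketch. The buffer `sample` and the three helper names are hypothetical, made up only for this illustration; `strstr` and `memcmp` are the standard `<string.h>` functions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sample data: 7 bytes with an embedded zero byte at index 3. */
static const char sample[7] = { 'a', 'b', 'c', '\0', 'x', 'y', 'z' };

/* strstr treats its arguments as C strings, so it only sees "abc"
   and never reaches the bytes after the zero. Returns 0 here. */
int strstr_sees_tail(void)
{
    return strstr(sample, "xy") != NULL;
}

/* memcmp compares exactly the number of bytes requested, so data
   located after the zero byte can be compared. Returns 1 here. */
int memcmp_sees_tail(void)
{
    return memcmp(sample + 4, "xy", 2) == 0;
}

/* memcmp also compares straight across the zero byte. Returns 1 here. */
int memcmp_spans_zero(void)
{
    static const char expect[5] = { 'c', '\0', 'x', 'y', 'z' };
    return memcmp(sample + 2, expect, sizeof expect) == 0;
}
```

Note that memcmp only answers "are these n bytes equal?"; the scanning loop still has to be written by hand.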


You could use memchr together with memcmp to speed up the search slightly:

----------------------
void *memsearch(void *source, void *target, long soc_length, long tar_length)
{
    void *addr,*start_address;

    search_address = source;

    while(addr = memchr(search_address, *(char*)target, soc_length))
    {
        if(memcmp(addr, target, tar_length)==0)
            return addr;

        ++(char*)search_address;
    }

    return NULL;
}


--
BC


 
 
 
 
 
Burne C
 
      07-24-2003

"Burne C" <(E-Mail Removed)> wrote in message news:bfp14b$(E-Mail Removed)...
>
> "Walter Dnes" <(E-Mail Removed)> wrote in message
> news:bfocbv$gepau$(E-Mail Removed)-berlin.de...
> > When I asked in another thread about string comparisons, I forgot
> > about the chr(0) booby-trap in C strings. Since I want to compare
> > random binary data, this is important to me. Someone correct me if I'm
> > wrong; strstr stops at chr(0). memcmp doesn't treat chr(0) as a
> > delimiter, and can compare ranges (the word "strings" is incorrect here)
> > that included embedded chr(0). I realize that memcmp won't
> > automatically scan a larger string, but I can put it in a loop to sweep
> > through a larger string. Too bad that memmem is not standard.
> >

>
> You could use memchr together with memcmp to speed up the search slightly:
>
> ----------------------
> void *memsearch(void *source, void *target, long soc_length, long tar_length)
> {
> void *addr,*start_address;
>


Sorry, I made a mistake in the variable name.

The declaration should be "void *addr, *search_address".

--
BC


 
 
 
 
 
Burne C
 
      07-25-2003

"Peter Ammon" <(E-Mail Removed)> wrote in message news:bfq8uv$3cv$(E-Mail Removed)...
> Burne C wrote:
>
> > "Walter Dnes" <(E-Mail Removed)> wrote in message
> > news:bfocbv$gepau$(E-Mail Removed)-berlin.de...
> >
> >> When I asked in another thread about string comparisons, I forgot
> >>about the chr(0) booby-trap in C strings. Since I want to compare
> >>random binary data, this is important to me. Someone correct me if I'm
> >>wrong; strstr stops at chr(0). memcmp doesn't treat chr(0) as a
> >>delimiter, and can compare ranges (the word "strings" is incorrect here)
> >>that included embedded chr(0). I realize that memcmp won't
> >>automatically scan a larger string, but I can put it in a loop to sweep
> >>through a larger string. Too bad that memmem is not standard.
> >>

> >
> >
> > You could use memchr together with memcmp to speed up the search slightly:
> >
> > ----------------------
> > void *memsearch(void *source, void *target, long soc_length, long tar_length)
> > {
> > void *addr,*start_address;
> >
> > search_address = source;
> >
> > while(addr = memchr(search_address, *(char*)target, soc_length))
> > {
> > if(memcmp(addr, target, tar_length)==0)
> > return addr;
> >
> > ++(char*)search_address;
> > }
> >
> > return NULL;
> > }
> >

>
> This function has a few unfortunate properties.
>
> 1) It won't compile, since (char*)search_address is not an lvalue, and
> so ++(char*)search_address is illegal.


search_address is an lvalue; it is a local variable. I made a mistake with the variable name
"start_address" and corrected it in my last post.

>
> 2) If it were to compile, it would invoke undefined behavior in many
> cases, because you do not update the soc_length value to reflect the
> incremented pointer and so search beyond the ends of the array.
>


Right.


> 3) If the behavior were defined, it would be inefficient. For example,
> memsearch("aaaaaaaaaaaaaaaaaaax", "x1", 20, 2) would start at the first
> 'a', walk the entire string looking for an 'x', find the x, determine
> that the string "x1" was not present, then go to the second 'a' and
> repeat the process. This has O(n^2) complexity.


Yes, the "++(char*)search_address" line should be

search_address = (char*)addr+1;

It searches for the first "x" using memchr and checks the substring using memcmp; if it does
not match, the memchr search continues _after_ the returned addr.
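Putting the corrections from points 1 to 4 together, the search could look something like the sketch below. This is only one possible reading of the fix, not code posted in the thread: the name `search_mem` (chosen to stay out of the reserved "mem" + lowercase namespace), the `size_t` lengths, and the choice to return `source` for an empty target are all assumptions.

```c
#include <stddef.h>
#include <string.h>

/*
 * Find the first occurrence of the tar_length bytes at target within
 * the soc_length bytes at source. Returns a pointer to the match, or
 * NULL if there is none. Embedded zero bytes are treated as data.
 */
void *search_mem(const void *source, const void *target,
                 size_t soc_length, size_t tar_length)
{
    const unsigned char *search_address = source;
    const unsigned char *first = target;
    const unsigned char *addr;

    if (tar_length == 0)
        return (void *)source;  /* assumption: empty target matches at the start */

    while (soc_length >= tar_length) {
        /* Scan only positions where a full-length match still fits,
           so memcmp never reads past the end of source. */
        addr = memchr(search_address, first[0], soc_length - tar_length + 1);
        if (addr == NULL)
            return NULL;
        if (memcmp(addr, target, tar_length) == 0)
            return (void *)addr;

        /* Resume just after the failed candidate and shrink the
           remaining length by the bytes skipped. */
        soc_length -= (size_t)(addr - search_address) + 1;
        search_address = addr + 1;
    }
    return NULL;
}
```

With this version, search_mem("aaaaaaaaaaaaaaaaaaax", "x1", 20, 2) makes a single memchr pass and returns NULL. The worst case is still proportional to soc_length times tar_length (for example, a target of many 'a's in a source of 'a's), since each candidate is re-checked with memcmp.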

>
> 4) Functions beginning with "mem" and followed by a lowercase letter are
> reserved. You should call your function something like searchmem or
> mem_search.
>
> That said, here's my recently written real life function to do the same
> thing. Now I'm the target.
>
> const char* mymemstr(const char* hay, const char* needle,
>                      size_t hayLength, size_t needleLength) {
>     size_t hayOuter;
>     size_t needleIndex=0;
>     size_t memory=0;  /* index in hay where the current partial match began */
>     if (needleLength == 0) return hay;
>     for (hayOuter=0; hayOuter < hayLength; hayOuter++) {
>         if (needle[needleIndex]==hay[hayOuter]) {
>             if (needleIndex++==0) memory=hayOuter;
>             /* test for completion here so a match ending on the last byte is found */
>             if (needleIndex == needleLength) return hay + memory;
>         }
>         else if (needleIndex > 0) {
>             /* partial match failed: restart just after where it began */
>             needleIndex=0;
>             hayOuter=memory;  /* the loop's hayOuter++ moves on to memory+1 */
>         }
>     }
>     return NULL;
> }
>
> -Peter
>



 
 
Peter Ammon
 
      07-25-2003
Burne C wrote:

> "Peter Ammon" <(E-Mail Removed)> wrote in message news:bfq8uv$3cv$(E-Mail Removed)...
>
>>Burne C wrote:
>>
>>
>>>"Walter Dnes" <(E-Mail Removed)> wrote in message
>>>news:bfocbv$gepau$(E-Mail Removed)-berlin.de...
>>>
>>>
>>>> When I asked in another thread about string comparisons, I forgot
>>>>about the chr(0) booby-trap in C strings. Since I want to compare
>>>>random binary data, this is important to me. Someone correct me if I'm
>>>>wrong; strstr stops at chr(0). memcmp doesn't treat chr(0) as a
>>>>delimiter, and can compare ranges (the word "strings" is incorrect here)
>>>>that included embedded chr(0). I realize that memcmp won't
>>>>automatically scan a larger string, but I can put it in a loop to sweep
>>>>through a larger string. Too bad that memmem is not standard.
>>>>
>>>
>>>
>>>You could use memchr together with memcmp to speed up the search slightly:
>>>
>>>----------------------
>>>void *memsearch(void *source, void *target, long soc_length, long tar_length)
>>>{
>>> void *addr,*start_address;
>>>
>>> search_address = source;
>>>
>>> while(addr = memchr(search_address, *(char*)target, soc_length))
>>> {
>>> if(memcmp(addr, target, tar_length)==0)
>>> return addr;
>>>
>>> ++(char*)search_address;
>>> }
>>>
>>> return NULL;
>>>}
>>>

>>
>>This function has a few unfortunate properties.
>>
>>1) It won't compile, since (char*)search_address is not an lvalue, and
>>so ++(char*)search_address is illegal.

>
>
> search_address is an lvalue; it is a local variable.


Yes, but (char*)search_address is not an lvalue. Casting an lvalue
doesn't give you an lvalue.

> I made a mistake with the variable name
> "start_address" and corrected it in my last post.


I see that, but it's still not an lvalue.
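To make the lvalue point concrete: the result of a cast is a value, not an object, so ++ cannot be applied to it. A portable way to step a void * forward by one byte is to go through a char * variable, as in this sketch (the helper name `next_byte` is made up for illustration):

```c
#include <stddef.h>

/* Advance a void * by one byte without putting a cast on the
   left-hand side of an assignment or increment. */
void *next_byte(void *p)
{
    char *bytes = p;  /* implicit conversion from void * to char * */
    ++bytes;          /* bytes is a variable, hence an lvalue */
    return bytes;     /* converts back to void * implicitly */
}
```

Writing p = (char *)p + 1; has the same effect, because there the cast appears only on the right-hand side.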

[...]
>>3) If the behavior were defined, it would be inefficient. For example,
>>memsearch("aaaaaaaaaaaaaaaaaaax", "x1", 20, 2) would start at the first
>>'a', walk the entire string looking for an 'x', find the x, determine
>>that the string "x1" was not present, then go to the second 'a' and
>>repeat the process. This has O(n^2) complexity.

>
>
> Yes, the "++(char*)search_address" line should be
>
> search_address = (char*)addr+1;
>
> It searches for the first "x" using memchr and checks the substring using memcmp; if it does
> not match, the memchr search continues _after_ the returned addr.


Much better.
[...]
-Peter

 