Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Quandry with the following C code (Intermediate)

Reply
Thread Tools

Quandry with the following C code (Intermediate)

 
 
BMarsh
Guest
Posts: n/a
 
      01-12-2005
Hi all,

I have a slight problem understanding the following code that I saw on
a Unix-PAM tutorial (not OT!)

The following code will compare and old string to a new one, bombing
out if 'max' similar chars is exceeded.


------8<------

static
int compare(unsigned char *old, unsigned char *new, int max)
{
unsigned char in_old[256];
int equal = 0;

(void)memset(in_old, 0, sizeof (in_old));

while (*old)
in_old[*(old++)]++;

while (*new) {
if (in_old[*new])
equal++;
new++;
}

if (equal > max)
return (1);

return (0);
}
------->8---------

I fail to see how the 2 strings are compared for character equality,
especially in how the

in_old[*(old++)]++;

line is used.
Could anyone please shed some light on this for me?

cheers

Bry

 
Reply With Quote
 
 
 
 
Rob van der Leek
Guest
Posts: n/a
 
      01-12-2005
Hi Bry,

In article < .com>,
BMarsh wrote:
> I have a slight problem understanding the following code that I saw on
> a Unix-PAM tutorial (not OT!)
>
> The following code will compare and old string to a new one, bombing
> out if 'max' similar chars is exceeded.
>
> ------8<------
>
> static
> int compare(unsigned char *old, unsigned char *new, int max)
> {
> unsigned char in_old[256];
> int equal = 0;
>
> (void)memset(in_old, 0, sizeof (in_old));
>
> while (*old)
> in_old[*(old++)]++;
>
> while (*new) {
> if (in_old[*new])
> equal++;
> new++;
> }
>
> if (equal > max)
> return (1);
>
> return (0);
> }
> ------->8---------
>
> I fail to see how the 2 strings are compared for character equality,
> especially in how the
>
> in_old[*(old++)]++;


The numerical character value of each character in the first input
string is used as an index for an array that counts the occurrences of
that character. Think about it like this: when the input string is "aab"
the first while loop does: in_old['a']++, in_old['a']++, in_old['b']++.

The second while loop checks for each character in the second input
string if it occurred in the first input string.

The first while loop could also be written as:

while (*old) {
in_old[*old]++;
old++;
}

Regards,
--
Rob van der Leek | rob(at)ricardis(dot)tudelft(dot)nl
 
Reply With Quote
 
 
 
 
BMarsh
Guest
Posts: n/a
 
      01-12-2005
Hi Rob,

Many thanks for your answer; it's cleared it up for me! I was totally
thrown off by the way the loop was written.

Thanks again,

Bryan.

 
Reply With Quote
 
Richard Bos
Guest
Posts: n/a
 
      01-13-2005
"BMarsh" <> wrote:

> The following code will compare and old string to a new one, bombing
> out if 'max' similar chars is exceeded.


It doesn't do a compare the usual way. That is, it does something
completely different from strcmp().

(Oh, btw, if you insist on posting through Google-Broken-Beta, it would
be a good thing if you could get it not to strip all indentation. Your
code is hard to read this way.)

> static
> int compare(unsigned char *old, unsigned char *new, int max)
> {
> unsigned char in_old[256];


First of all, you need to use UCHAR_MAX here, instead of 256. If you
don't, you may try to run this code on a Unicode system some day, and be
surprised when your function scribbles all over memory when you pass it
a string with Unicode characters over 256 in it.

> int equal = 0;
>
> (void)memset(in_old, 0, sizeof (in_old));


Lose the cast. It does no good, and clutters up the code.

> while (*old)
> in_old[*(old++)]++;


This tallies the number of occurrences of each separate character value
in the first string. There's a bug in it: what happens if you pass it a
string of UCHAR_MAX 'a's?

> while (*new) {
> if (in_old[*new])
> equal++;
> new++;


(See what I mean about the indentation?)

This checks each character in the second string, and if there were any
of the same character at all in the first string, counts it as "equal".

> }
>
> if (equal > max)
> return (1);
>
> return (0);


If the number of "equal" characters, that is, the number of chars in the
second string of which there was at least one in the first string,
exceeds the passed-in maximum, return 1, else 0. This could be more
easily written as

return (equal>max);

> I fail to see how the 2 strings are compared for character equality,


So do I; they're not.

Note, in particular, the different treatment of "old" and "new".

For example, try to explain the discrepancy between

compare("abc", "dbbbe", 2)

and

compare("dbbbe", "abc", 2)

Then, when you want an exercise I can't solve, try to explain _why_
someone would write a function like that, and then call it, sec,
"compare". The logic escapes me, I'm afraid. It's reasonably clear to me
_what_ this function does, but not why.

> especially in how the
>
> in_old[*(old++)]++;


The index entry corresponding to the character at the _current_ value of
old is increased (that is, the character now under the old pointer is
tallied); and old is moved to the next character. Not necessarily in
that order, or in any order at all, but since (old++) returns the old
value of old (so to speak) no matter which order is chosen, it doesn't
matter for the result.

Richard
 
Reply With Quote
 
infobahn
Guest
Posts: n/a
 
      01-13-2005
Richard Bos wrote:
>
> "BMarsh" <> wrote:
>


<snip>

> > static
> > int compare(unsigned char *old, unsigned char *new, int max)
> > {
> > unsigned char in_old[256];

>
> First of all, you need to use UCHAR_MAX here, instead of 256.


I think you mean "UCHAR_MAX + 1"

> If you
> don't, you may try to run this code on a Unicode system some day, and be
> surprised when your function scribbles all over memory when you pass it
> a string with Unicode characters over 256 in it.


I think you mean "over 255"

<snip>
 
Reply With Quote
 
Francois Grieu
Guest
Posts: n/a
 
      01-14-2005
infobahn <> wrote:

> Richard Bos wrote:
> >
> > "BMarsh" <> wrote:
> > > unsigned char in_old[256];

> >
> > First of all, you need to use UCHAR_MAX here, instead of 256.

>
> I think you mean "UCHAR_MAX + 1"


Do we need "UCHAR_MAX + 1L" to cover the case of UCHAR_MAX
equal to UINT_MAX, say both 0xFFFF ?

Francois Grieu
 
Reply With Quote
 
Richard Bos
Guest
Posts: n/a
 
      01-14-2005
Francois Grieu <> wrote:

> infobahn <> wrote:
>
> > Richard Bos wrote:
> > >
> > > "BMarsh" <> wrote:
> > > > unsigned char in_old[256];
> > >
> > > First of all, you need to use UCHAR_MAX here, instead of 256.

> >
> > I think you mean "UCHAR_MAX + 1"


Yes (and yes).

> Do we need "UCHAR_MAX + 1L" to cover the case of UCHAR_MAX
> equal to UINT_MAX, say both 0xFFFF ?


In theory, yes. In practice, systems where SCHAR_MAX == INT_MAX or
UCHAR_MAX==UINT_MAX have so many problems that I wouldn't bother to
cater for them. Anyone porting code to that kind of implementation knows
he's getting into a hornets' (or mare's <g>) nest, and should take all
necessary precautions himself.
(And why stop there? What if UCHAR_MAX==ULONG_MAX? Could happen
(probably does happen) on a 32-bit embedded processor.)

Richard
 
Reply With Quote
 
infobahn
Guest
Posts: n/a
 
      01-14-2005
Francois Grieu wrote:
> infobahn <> wrote:
> > Richard Bos wrote:
> > > First of all, you need to use UCHAR_MAX here, instead of 256.

> >
> > I think you mean "UCHAR_MAX + 1"

>
> Do we need "UCHAR_MAX + 1L" to cover the case of UCHAR_MAX
> equal to UINT_MAX, say both 0xFFFF ?


Good spot, although I think we'd have to lump such an implementation
in with the DS9K.

Actually, this really is a problem on CSILP32 systems such as
(some) DSPs, and the L suffix doesn't help on such systems.
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
design quandry .. forums_mp@hotmail.com C++ 2 10-27-2005 12:01 PM
namespace/dictionary quandry Jack Carter Python 9 09-22-2004 07:31 PM
Client-Side Object Reference Quandry Fred ASP .Net 3 07-12-2004 10:28 PM
Quandry over which software to use Phil Digital Photography 5 02-22-2004 01:40 AM
CD quandry MB Computer Support 3 09-13-2003 02:14 AM



Advertisments
 



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57