Velocity Reviews > review of the "container library", part 1/?

# review of the "container library", part 1/?

Alan Curry
Guest
Posts: n/a

 03-01-2011
To make up for posting in one of those threads of mass stupidity, here's
something less off-topic.

I've noticed so far. This part is mostly about superficial packaging issues.
The meaty stuff is harder to comment on since I'd have to read the
documentation, and there's over 200 pages of that.

unzipping splattered all over current directory. This makes a bad first
impression. Can I apply tarball etiquette to zip files? Well, I just did.

There was a mention of ccl.tex but I don't see it anywhere.

The makefile actually worked, and is readable, instead of being a maze of
$(shell stuff) and includes and other non-POSIX features. That's rare these days. However, when I tried it with a pmake (derived from NetBSD make) it failed. The reason: line-continuing backslashes are not immediately followed by newline. They have ^M's after them. "UNIX" makefile shouldn't have MS-DOS line endings. The ^M's in the C source files aren't fun to look at either. The build rule for dotest should use test.o, not test.c. The way you have it, test.c gets compiled with the implicit .c.o rule since test.o is one of OBJS, then test.o never gets used, because the link line of dotest compiles it again (without your CFLAGS!). Also, having test.o in OBJS means it's included in the .a which it probably shouldn't be. To fix all that, just remove test.o from OBJS and change the 2 instances of test.c in the dotest rule to test.o Header file dependencies are not expressed in the Makefile, so nothing gets rebuilt if a header file is modified. That's bad. A few warnings were issued with gcc 4.4.5 and the options in the Makefile. I'll go through them all: In test.c there are a couple of %lu printf's with size_t args - gcc warns about this as a potential mismatch. It should be %zu (clean, C99) or cast to unsigned long (icky, C89) The ULL suffixed integer constants in bitstrings.c give warnings because -pedantic without -std=c99 is asking for C89 pedanticness, which disallows long long. Dead code here: bitstrings.c:503: warning: 'AddRange' defined but not used More dead code here: searchtree.c:621: warning: 'hide' defined but not used And here's one that's a serious behavioral issue: searchtree.c in the definition of struct tagBinarySearchTreeNode has "char factor;" which is used to hold the values LEFT, BALANCED, and RIGHT defined as 1, 0, and -1. On my platform, char is unsigned so assigning -1 to a char makes it 255, and then comparing it to -1 is false. So all the ==RIGHT tests are useless. gcc didn't warn about those because you didn't include -W (-Wextra), but luckily it warned about the switch() statements with case RIGHT. Your test program segfaults because it gets severely confused by that searchtree factor bug. After changing "char factor;" to "signed char factor;" it didn't segfault. I can't interpret the test output so I won't say it definitely passed all the tests, but at least it returned a 0 exit code. One more formatting thing: it looks like you use tabstop=4. Lots of things look wrong with tabstop=8, the default in unix land. If you used tabs consistently this wouldn't be a problem but there are lots of places where a line indented with a tab is followed by a line indented with 4 spaces, so they don't line up for anyone who hasn't figured out that you wanted the files to be viewed with tabstop=4. Adding my own warning options, I get a lot more warnings. The most interesting ones led me to: casting away const in test.c:testDictionary on this line: pi = (int *)iDictionary.GetElement(d,(const unsigned char *)"Two"); where GetElement returns const void *. This cast is not doing anything but throwing away the const. This is occasionally justifiable with complex interactions where the correct constness is situation-dependent, but this isn't one of those cases. Nothing attempts to modify *pi so there are no drawbacks to making pi a "const int *", and one big advantage: the cast can go away. There are lots of other bad casts losing const for no reason; this is just the one I picked as a representative of the group. A lot of casts seem to be present because there are strings being passed around as unsigned char * and then they get converted back to char * for use with strcmp and such. I'd try to get rid of the "unsigned" stuff except where it really matters. A function that wants to operate on a string of unsigned chars internally should still have its arguments declared as plain "char *" if that makes the caller's job easier. Then there's the general lack of const in the proper places. A function called Strcmp that wraps strcmp should really be able to take pointer-to-const parameters. Omitting the const is just screwing up the interface. In buffer.c:CBClear, gcc thinks the variable p "may be used uninitialized" and without studying too deeply, I can't disagree. It looks like the memset(p, ...) is reachable with p uninitialized in the case where cb->DestructorFn is false. The big picture: count me among those not convinced that "container" is a useful abstraction. Things indexed by integers and things indexed by strings don't have that much in common. part 2, later, if I have time to read the documentation. -- Alan Curry jacob navia Guest Posts: n/a  03-01-2011 Hi Alan. THANKS a LOT for your excellent review. I am at work now and can't answer all your points but I will come back to this this evening. jacob Tom St Denis Guest Posts: n/a  03-01-2011 On Feb 28, 8:42*pm, (E-Mail Removed) (Alan Curry) wrote: > To make up for posting in one of those threads of mass stupidity, here's > something less off-topic. > > I downloaded the library fromhttp://code.google.com/p/ccl/and here's what > I've noticed so far. This part is mostly about superficial packaging issues. > The meaty stuff is harder to comment on since I'd have to read the > documentation, and there's over 200 pages of that. > > unzipping splattered all over current directory. This makes a bad first > impression. Can I apply tarball etiquette to zip files? Well, I just did. > > There was a mention of ccl.tex but I don't see it anywhere. > > The makefile actually worked, and is readable, instead of being a maze of >$(shell stuff) and includes and other non-POSIX features. That's rare these
> days. However, when I tried it with a pmake (derived from NetBSD make) it
> failed. The reason: line-continuing backslashes are not immediately followed
> by newline. They have ^M's after them. "UNIX" makefile shouldn't have MS-DOS
> line endings.
>
> The ^M's in the C source files aren't fun to look at either.

Unzip with -a.

Tom

jacob navia
Guest
Posts: n/a

 03-01-2011
Le 01/03/11 02:42, Alan Curry a écrit :
> To make up for posting in one of those threads of mass stupidity, here's
> something less off-topic.
>

Thanks for this. Great help.

> I've noticed so far. This part is mostly about superficial packaging issues.
> The meaty stuff is harder to comment on since I'd have to read the
> documentation, and there's over 200 pages of that.
>
> unzipping splattered all over current directory. This makes a bad first
> impression. Can I apply tarball etiquette to zip files? Well, I just did.
>

Problem is that zip will include the .svn directory when using
the recursive option. If I go to the upper directory and issue:
zip -9 -r container-lib-src.zip ccl
it will put all .svn stuff intoi the zip. And I did not find any way
that works to exclude that (no --exclude "ccl/.svn" does NOT work
for unknown reasons.

> There was a mention of ccl.tex but I don't see it anywhere.
>

Yes, I have added images to the doc and alot of stuff so it would
require tex and a lot of other non essential stuff, so I dropped tex
doc sources and ship only pdfs

> The makefile actually worked, and is readable, instead of being a maze of
> $(shell stuff) and includes and other non-POSIX features. That's rare these > days. However, when I tried it with a pmake (derived from NetBSD make) it > failed. The reason: line-continuing backslashes are not immediately followed > by newline. They have ^M's after them. "UNIX" makefile shouldn't have MS-DOS > line endings. > OK. Eliminated that > The ^M's in the C source files aren't fun to look at either. > Please use the -a option when unzipping (unzip -a ...) This will eliminate the \r in Unix and ADD them under windows > The build rule for dotest should use test.o, not test.c. The way you have it, > test.c gets compiled with the implicit .c.o rule since test.o is one of OBJS, > then test.o never gets used, because the link line of dotest compiles it > again (without your CFLAGS!). Also, having test.o in OBJS means it's included > in the .a which it probably shouldn't be. To fix all that, just remove test.o > from OBJS and change the 2 instances of test.c in the dotest rule to test.o > Correct. Modified that. > Header file dependencies are not expressed in the Makefile, so nothing gets > rebuilt if a header file is modified. That's bad. > > A few warnings were issued with gcc 4.4.5 and the options in the Makefile. > I'll go through them all: > > In test.c there are a couple of %lu printf's with size_t args - gcc warns > about this as a potential mismatch. It should be %zu (clean, C99) or cast to > unsigned long (icky, C89) > > The ULL suffixed integer constants in bitstrings.c give warnings because > -pedantic without -std=c99 is asking for C89 pedanticness, which disallows > long long. > I did not find a way out. I wanted to avoid C99 but I think it will be necessary. > Dead code here: bitstrings.c:503: warning: 'AddRange' defined but not used > Yes, they are unfinished. > More dead code here: searchtree.c:621: warning: 'hide' defined but not used > Yes, this just sets a flag and marks the node as "hidden" instead of erasing it... I am not sure if this should be congtinued. I will ifdef it out > And here's one that's a serious behavioral issue: searchtree.c in the > definition of struct tagBinarySearchTreeNode has "char factor;" which is used > to hold the values LEFT, BALANCED, and RIGHT defined as 1, 0, and -1. On my > platform, char is unsigned so assigning -1 to a char makes it 255, and then > comparing it to -1 is false. So all the ==RIGHT tests are useless. gcc didn't > warn about those because you didn't include -W (-Wextra), but luckily it > warned about the switch() statements with case RIGHT. > Well, thanks a lot for finding this bug.I just didn't thought about having plain char as unsigned. > Your test program segfaults because it gets severely confused by that > searchtree factor bug. After changing "char factor;" to "signed char factor;" > it didn't segfault. I can't interpret the test output so I won't say it > definitely passed all the tests, but at least it returned a 0 exit code. > OK. > One more formatting thing: it looks like you use tabstop=4. Lots of things > look wrong with tabstop=8, the default in unix land. If you used tabs > consistently this wouldn't be a problem but there are lots of places where a > line indented with a tab is followed by a line indented with 4 spaces, so > they don't line up for anyone who hasn't figured out that you wanted the > files to be viewed with tabstop=4. > > Adding my own warning options, I get a lot more warnings. The most > interesting ones led me to: > > casting away const in test.c:testDictionary on this line: > pi = (int *)iDictionary.GetElement(d,(const unsigned char *)"Two"); > where GetElement returns const void *. This cast is not doing anything but > throwing away the const. This is occasionally justifiable with complex > interactions where the correct constness is situation-dependent, but this > isn't one of those cases. Nothing attempts to modify *pi so there are no > drawbacks to making pi a "const int *", and one big advantage: the cast can > go away. There are lots of other bad casts losing const for no reason; this > is just the one I picked as a representative of the group. > Yes, you are right, "const correctness" is CONSTantly getting me mad... > A lot of casts seem to be present because there are strings being passed > around as unsigned char * and then they get converted back to char * for use > with strcmp and such. I'd try to get rid of the "unsigned" stuff except where > it really matters. A function that wants to operate on a string of unsigned > chars internally should still have its arguments declared as plain "char *" > if that makes the caller's job easier. > I started a discussion here some months ago about this problem. Apparently there is no solution. strcmp requires char * but I want to have unsigned chars in all CHARACTERS since I do NOT want any sign problems with them. > Then there's the general lack of const in the proper places. A function > called Strcmp that wraps strcmp should really be able to take > pointer-to-const parameters. Omitting the const is just screwing up the > interface. Yes, you are right. > > In buffer.c:CBClear, gcc thinks the variable p "may be used uninitialized" > and without studying too deeply, I can't disagree. It looks like the > memset(p, ...) is reachable with p uninitialized in the case where > cb->DestructorFn is false. > RIGHT! I added destructor support and included the initialization of p within the if () . That was a huge mistake. Thanks > The big picture: count me among those not convinced that "container" is a > useful abstraction. Things indexed by integers and things indexed by strings > don't have that much in common. > That is why there are two types of containers: sequantial and associative. What do they have in common? Size (# of elements they store) Flags (properties like read-,ly, etc) Methods like: contains (searches one element) clear (removes all element but not the container) erase (removes one element) Finalize (destroy) Sizeof (bytes used) > part 2, later, if I have time to read the documentation. > Again: Thanks for your interest. jacob James Waldby Guest Posts: n/a  03-01-2011 On Tue, 01 Mar 2011 20:51:37 +0100, jacob navia wrote: > Le 01/03/11 02:42, Alan Curry a Ã©crit : .... >> I downloaded the library from http://code.google.com/p/ccl/ and here's >> what I've noticed so far. [...] >> unzipping splattered all over current directory. This makes a bad first >> impression. Can I apply tarball etiquette to zip files? [...] >> > Problem is that zip will include the .svn directory when using the > recursive option. If I go to the upper directory and issue: > zip -9 -r container-lib-src.zip ccl > it will put all .svn stuff into the zip. And I did not find any way > that works to exclude that (no --exclude "ccl/.svn" does NOT work for > unknown reasons. .... I don't know what the exclusion problem is, and imagine the lack of detail in your "does not work" sentence will keep others from knowing as well. However, you might try zip ... -x '.svn/\*' In any case, you can get whatever directory structure layout you want in a zip file by making a temporary clean copy of the structure and zipping that. You could write a shell script that creates the directory structure with no files, then in the structure adds links to desired files, then zips the structure, then deletes it. This would be on topic in comp.unix.shell newsgroup, rather than c.l.c. >> One more formatting thing: it looks like you use tabstop=4. Lots of >> things look wrong with tabstop=8, the default in unix land. If you used >> tabs consistently this wouldn't be a problem but there are lots of >> places where a line indented with a tab is followed by a line indented >> with 4 spaces, so they don't line up for anyone who hasn't figured out >> that you wanted the files to be viewed with tabstop=4. .... Consider running the code through indent -ut with other options set so that program layout doesn't change much except for conversion of some spaces to tabs. -- jiw Jorgen Grahn Guest Posts: n/a  03-01-2011 On Tue, 2011-03-01, James Waldby wrote: > On Tue, 01 Mar 2011 20:51:37 +0100, jacob navia wrote: >> Le 01/03/11 02:42, Alan Curry a écrit : > ... >>> I downloaded the library from http://code.google.com/p/ccl/ and here's >>> what I've noticed so far. [...] >>> unzipping splattered all over current directory. This makes a bad first >>> impression. Can I apply tarball etiquette to zip files? [...] Yes, a /very/ bad impression. >> Problem is that zip will include the .svn directory when using the >> recursive option. If I go to the upper directory and issue: >> zip -9 -r container-lib-src.zip ccl >> it will put all .svn stuff into the zip. [...] .... > In any case, you can get whatever directory structure layout you > want in a zip file by making a temporary clean copy of the structure > and zipping that. You could write a shell script that creates the > directory structure with no files, then in the structure adds links > to desired files, then zips the structure, then deletes it. This > would be on topic in comp.unix.shell newsgroup, rather than c.l.c. It's offtopic, but I'll mention it anyway since it took me a few years to discover it myself: CVS has an 'export' command which is much like a checkout but creates a clean directory tree without the CVS/ subdirectories and control files. I'm pretty sure SVN has an equivalent. That's what he should try first! /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . Ian Collins Guest Posts: n/a  03-01-2011 On 03/ 2/11 08:51 AM, jacob navia wrote: > Le 01/03/11 02:42, Alan Curry a écrit : >> To make up for posting in one of those threads of mass stupidity, here's >> something less off-topic. >> > > Thanks for this. Great help. > >> I downloaded the library from http://code.google.com/p/ccl/ and here's >> what >> I've noticed so far. This part is mostly about superficial packaging >> issues. >> The meaty stuff is harder to comment on since I'd have to read the >> documentation, and there's over 200 pages of that. >> >> unzipping splattered all over current directory. This makes a bad first >> impression. Can I apply tarball etiquette to zip files? Well, I just did. >> > > Problem is that zip will include the .svn directory when using > the recursive option. If I go to the upper directory and issue: > zip -9 -r container-lib-src.zip ccl > it will put all .svn stuff intoi the zip. And I did not find any way > that works to exclude that (no --exclude "ccl/.svn" does NOT work > for unknown reasons. You could setup anonymous read access to your SVN repository, which is a better long term option. -- Ian Collins Alan Curry Guest Posts: n/a  03-02-2011 In article <ikjio6$b0l\$(E-Mail Removed)>,
jacob navia <(E-Mail Removed)> wrote:
>Le 01/03/11 02:42, Alan Curry a écrit :
>
>> There was a mention of ccl.tex but I don't see it anywhere.
>>

>
>Yes, I have added images to the doc and alot of stuff so it would
>require tex and a lot of other non essential stuff, so I dropped tex
>doc sources and ship only pdfs

Since I didn't have an editable documentation format, I couldn't make changes
as I found errors. So you won't be getting a diff, just a list of complaints.

>
>> The ^M's in the C source files aren't fun to look at either.
>>

>
>Please use the -a option when unzipping (unzip -a ...) This will
>eliminate the \r in Unix and ADD them under windows

It's been so long since I saw a zip file (at least one with text files in
it). I used to know about that option but forgot it.

>
>> And here's one that's a serious behavioral issue: searchtree.c in the
>> definition of struct tagBinarySearchTreeNode has "char factor;" which is used
>> to hold the values LEFT, BALANCED, and RIGHT defined as 1, 0, and -1. On my
>> platform, char is unsigned so assigning -1 to a char makes it 255, and then
>> comparing it to -1 is false. So all the ==RIGHT tests are useless. gcc didn't
>> warn about those because you didn't include -W (-Wextra), but luckily it
>> warned about the switch() statements with case RIGHT.
>>

>
>Well, thanks a lot for finding this bug.I just didn't thought about
>having plain char as unsigned.

If you run your tests once with gcc -fsigned-char and once with
gcc -funsigned-char, you will have both possibilities covered. The same
segfault I saw will happen on any architecture with gcc -funsigned-char

>
>> A lot of casts seem to be present because there are strings being passed
>> around as unsigned char * and then they get converted back to char * for use
>> with strcmp and such. I'd try to get rid of the "unsigned" stuff except where
>> it really matters. A function that wants to operate on a string of unsigned
>> chars internally should still have its arguments declared as plain "char *"
>> if that makes the caller's job easier.
>>

>
>
>Apparently there is no solution. strcmp requires char * but I want to
>have unsigned chars in all CHARACTERS since I do NOT want any sign
>problems with them.

Because of your unique idea that plain char is a "problem", your interfaces
need a lot of tedious casts.

--
Alan Curry

Alan Curry
Guest
Posts: n/a

 03-02-2011
For this post, I've not only built the library and run the test program, but
also read the documentation! So mostly I'm reporting documentation bugs.

In the example code for iVector.Erase, There's a "|n" typo, where "\n" should
be.

The iDictionary.Erase example code looks like an unmodified cut and paste of
the iVector.Erase example.

There are some odd features that I can only assume are the result of syntax
errors in the source: The header for the iDlist.PushBack section is messed up,
and iHashTable.Finalize looks similar. (Both have title in a small font
preceded by "Synopsis" instead of a big font with a dividing line)

Under "7.1.2 Lists", there's an "\item" in the middle of a struct definition.

Somehow the charset has gone wrong, causing upside down question marks to
appear in the iVector.GetRange and iHashTable.Add sections.

iDeque.Front likewise has the wrong name (PeekFront) in the prototype.

Now for a serious functional issue the came to mind while reading about
key, just the value. That's a deficient interface compared to glib's hash
tables for example.

For part 3, I might even try to *use* the library...
--
Alan Curry

Kenny McCormack
Guest
Posts: n/a

 03-02-2011
In article <(E-Mail Removed)>,
Jorgen Grahn <(E-Mail Removed)> wrote:
>On Tue, 2011-03-01, James Waldby wrote:
>> On Tue, 01 Mar 2011 20:51:37 +0100, jacob navia wrote:
>>> Le 01/03/11 02:42, Alan Curry a écrit :

>> ...
>>>> what I've noticed so far. [...]
>>>> unzipping splattered all over current directory. This makes a bad first
>>>> impression. Can I apply tarball etiquette to zip files? [...]

>

Yes, if you are 12 years old.

Over the years, I've downloaded countless ZIPs, TARs, and g*d knows how
many other archive format files. One thing I learned early is that
sometimes there is a directory structure in the file, sometimes there
isn't. Therefore, you should always look inside first before doing the
"unzip" (or whatever). If you see inside the file that everything is at
the top level, then you should of course, create an empty directory and
do the unzipping there. Simple stuff, once you get used to it.

To just blindly do the unzip without looking first is, to put it
charitably, stupid. But it *is* typical CLC mindset at work...

I believe the psychologists call this mindset "passive aggressive".

--
But the Bush apologists hope that you won't remember all that. And they
also have a theory, which I've been hearing more and more - namely,
that President Obama, though not yet in office or even elected, caused the
2008 slump. You see, people were worried in advance about his future
policies, and that's what caused the economy to tank. Seriously.

(Paul Krugman - Addicted to Bush)