On Apr 17, 9:30 pm, Adrian <n...@bluedreamer.com> wrote:
> I want a const static std::set of strings which is case insensitive
> for the values.
> So I have the following which seems to work but something doesnt seem
> right about it. Is there a better way or any gotcha's from my code
> below.
Your code has undefined behavior.
> #include <iostream>
> #include <functional>
> #include <algorithm>
> #include <set>
> #include <string>
> #include <iterator>
Don't forget:
#include <cctype>
(or <locale>, if you use the toupper functions from there).
> class Test
> {
> public:
> void p()
> {
> std::copy(fields.begin(), fields.end(),
> std::ostream_iterator<std::string>(std::cout, ","));
> std::cout << std::endl;
> }
> private:
> struct nocase_cmp : public std::binary_function<const
> std::string &, const std::string &, bool>
> {
> struct nocase_char_cmp : public std::binary_function<char,
> char, bool>
> {
> bool operator()(char a, char b)
The function should be const, I think.
> {
> return std::toupper(a) < std::toupper(b);
Calling the single argument form of toupper with a char as
argument is undefined behavior. The argument type is int, with
the constraint that the value of the int must be either EOF, or
in the range [0...UCHAR_MAX]. If char is signed, it won't be in
range when converted (implicitly) to int.
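To make the range problem concrete, here is a small sketch. The helper names (implicit_int_value, valid_toupper_argument) are made up for this illustration, and the byte 0xE9 (é in ISO 8859-1) is just an example of an accented character:

```cpp
#include <climits>
#include <cstdio>

// Hypothetical helper: the int that a char turns into when it is
// passed implicitly to a function taking int, such as toupper.
inline int implicit_int_value(char c)
{
    return c;
}

// Hypothetical helper: true if v is a legal argument for the
// single argument toupper, i.e. EOF or a value in [0, UCHAR_MAX].
inline bool valid_toupper_argument(int v)
{
    return v == EOF || (v >= 0 && v <= UCHAR_MAX);
}
```

With these, valid_toupper_argument(implicit_int_value('\xE9')) is false on an implementation where plain char is signed (the value is negative and is not EOF), and true where plain char is unsigned.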
There are two solutions here: either explicitly convert the char
to unsigned char before calling toupper, e.g.:
return toupper( static_cast< unsigned char >( a ) )
< toupper( static_cast< unsigned char >( b ) ) ;
or use the toupper member function of the std::ctype facet. (In that
case, I would use something like:
class nocase_char_cmp
{
public:
    typedef std::ctype< char >
                        ctype ;
    explicit            nocase_char_cmp(
                            std::locale const& l = std::locale() )
        :   my_ctype( &std::use_facet< ctype >( l ) )
    {
    }
    bool                operator()( char a, char b ) const
    {
        return my_ctype->toupper( a ) < my_ctype->toupper( b ) ;
    }
private:
    ctype const*        my_ctype ;
} ;
)
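For what it's worth, here is a sketch of how the facet based character comparison could be wired into the string comparator and the set. It is only one way of doing it. The my_locale member (which keeps a copy of the locale so the facet cannot go away) and the make_fields helper are my own additions for the illustration, not something from the original code:

```cpp
#include <algorithm>
#include <locale>
#include <set>
#include <string>

// Compares two chars without regard to case, using the ctype
// facet of a given locale (by default, the global locale).
class nocase_char_cmp
{
public:
    typedef std::ctype<char> ctype;

    explicit nocase_char_cmp(std::locale const& l = std::locale())
        : my_locale(l)
        , my_ctype(&std::use_facet<ctype>(my_locale))
    {
    }

    bool operator()(char a, char b) const
    {
        return my_ctype->toupper(a) < my_ctype->toupper(b);
    }

private:
    std::locale  my_locale;  // assumption: kept so the facet stays alive
    ctype const* my_ctype;
};

// Orders strings lexicographically, ignoring case.
struct nocase_cmp
{
    bool operator()(std::string const& a, std::string const& b) const
    {
        return std::lexicographical_compare(a.begin(), a.end(),
                                            b.begin(), b.end(),
                                            nocase_char_cmp());
    }
};

typedef std::set<std::string, nocase_cmp> Field_names_t;

// Hypothetical helper: builds the set from the same strings as
// the array f in the original post.
inline Field_names_t make_fields()
{
    char const* const f[] = {
        "string1", "string2", "string3", "STRIng1", "string5"
    };
    return Field_names_t(f, f + sizeof f / sizeof f[0]);
}
```

In the default "C" locale, make_fields() ends up with four elements: "STRIng1" is equivalent to "string1" under this ordering, so the second of the two is not inserted.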
If you have a lot of case insensitive comparisons, it might be
worth writing a case insensitive collate facet (or there might
even be one available ready-made); in that case, just use an
std::locale with this facet as the comparison object (std::locale
has an operator() which compares two strings using the collate
facet, so it can be the ordering argument of std::set directly),
and you're done with it.
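A rough sketch of what such a facet might look like follows. The names nocase_collate and make_nocase_locale are invented for the example, and the comparison simply upper-cases each byte with the C toupper, which is an assumption that only makes sense for single byte encodings. The resulting locale is used as the comparison object itself, relying on std::locale::operator(), which compares two strings through the collate facet:

```cpp
#include <cctype>
#include <cstddef>
#include <locale>
#include <set>
#include <string>

// Hypothetical facet: a collate<char> whose compare ignores case.
class nocase_collate : public std::collate<char>
{
public:
    explicit nocase_collate(std::size_t refs = 0)
        : std::collate<char>(refs)
    {
    }

protected:
    // Returns -1, 0 or 1, as required of collate::do_compare.
    virtual int do_compare(char const* lo1, char const* hi1,
                           char const* lo2, char const* hi2) const
    {
        for (; lo1 != hi1 && lo2 != hi2; ++lo1, ++lo2) {
            int a = std::toupper(static_cast<unsigned char>(*lo1));
            int b = std::toupper(static_cast<unsigned char>(*lo2));
            if (a != b) {
                return a < b ? -1 : 1;
            }
        }
        if (lo1 != hi1) return 1;   // first string is longer
        if (lo2 != hi2) return -1;  // second string is longer
        return 0;
    }
};

// Hypothetical helper: the classic locale with the collate facet
// replaced. The locale takes ownership of the new facet.
inline std::locale make_nocase_locale()
{
    return std::locale(std::locale::classic(), new nocase_collate);
}

// A set ordered by the locale's operator().
typedef std::set<std::string, std::locale> Nocase_set;
```

Usage would be along the lines of Nocase_set fields(make_nocase_locale()); followed by the inserts. Note that the locale has to be passed to the set's constructor; a default constructed std::locale would compare with the global locale's ordinary collate facet.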
> }
> };
> bool operator()(const std::string &a, const std::string &b)
> {
> return std::lexicographical_compare(a.begin(), a.end(),
> b.begin(), b.end(),
> nocase_char_cmp());
> }
> };
> typedef std::set<std::string, nocase_cmp> Field_names_t;
> static const Field_names_t fields;
> };
> const char *f[]={
> "string1",
> "string2",
> "string3",
> "STRIng1",
> "string5"};
Try throwing in some characters whose encoding results in a
negative number, and see what happens. (On my machine, just
about any accented character will do the trick. In my test
suites, I'll generally make sure that there is a ÿ somewhere,
since in the most frequent encoding, it is 0xFF, which, when
stored into a char, becomes -1, or EOF. You'd be surprised how
many programs stop when they encounter this character in a
file.)
--
James Kanze (GABI Software) email:
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34