according to MS, stricmp is deprecated

 
 
Geoff
      04-30-2013
On Mon, 29 Apr 2013 16:08:57 -0500, Paavo Helde
<(E-Mail Removed)> wrote:

>Geoff <(E-Mail Removed)> wrote in
>news:(E-Mail Removed) :
>
>> On Mon, 29 Apr 2013 12:57:51 -0500, Paavo Helde
>> <(E-Mail Removed)> wrote:
>>
>>>Geoff <(E-Mail Removed)> wrote in
>>>news:(E-Mail Removed) :
>>>
>>>> On Sun, 28 Apr 2013 01:23:20 -0500, Paavo Helde
>>>> <(E-Mail Removed)> wrote:
>>>>
>>>>>What's wrong with an inline forwarding function?
>>>>
>>>> Nothing. As it is, I got off my ass this weekend and eliminated the
>>>> rename and implemented the strcasecmp function within the
>>>> application, conditional on _WIN32 as I posted here earlier. A hack
>>>> is a hack and it was time to eliminate that one.
>>>>
>>>> Since we're in comp.lang.c++, what do you think of this
>>>> implementation of a case-insensitive compare of C++ strings?
>>>>
>>>> int stringcasecmp(const std::string s1, const std::string s2)
>>>> {
>>>>     std::string us1 = s1;
>>>>     std::string us2 = s2;
>>>>     std::transform(us1.begin(), us1.end(), us1.begin(), ::tolower);
>>>>     std::transform(us2.begin(), us2.end(), us2.begin(), ::tolower);
>>>>     return us1.compare(us2);
>>>> }
>>>
>>>This calls C function tolower(int), which has undefined behavior in
>>>case of negative arguments. As the plain char type used by std::string
>>>is often a signed type it can easily contain negative values. So this
>>>implementation seems quite dangerous.
>>>
>>>Even if the correct C++ std::tolower() function were used, it could
>>>not properly support multibyte characters like UTF-8 encoding. I am
>>>even not talking about comparing German ß and SS.
>>>
>>>It also makes copies of the both strings (twice, as the parameters are
>>>not references), then makes two passes through both strings, which may
>>>become suboptimal if the strings do not fit in the cpu caches.
>>>
>>>As the above function basically works only for ASCII (never seen
>>>EBCDIC in real life), it would be best to acknowledge this; then
>>>calling the locale-specific ::tolower would be wrong and also not
>>>necessary. Improved version:
>>>
>>>int strcasecmp_ascii(const std::string& s1, const std::string& s2)
>>>{
>>>    const std::string::size_type n = std::min(s1.length(), s2.length());
>>>    for (std::string::size_type i = 0; i<n; ++i) {
>>>        char c1 = s1[i], c2=s2[i];
>>>        if (c1>='A' && c1<='Z') {
>>>            c1+=32;
>>>        }
>>>        if (c2>='A' && c2<='Z') {
>>>            c2+=32;
>>>        }
>>>        if (c1<c2) {
>>>            return -1;
>>>        } else if (c1>c2) {
>>>            return 1;
>>>        }
>>>    }
>>>    return s1.length()>n? 1: (s2.length()>n? -1: 0);
>>>}

>>
>> I wouldn't call that an improvement.
>>
>> I would modify mine to eliminate your objections about unnecessary
>> copies of the strings. The goal was to make it non destructive of the
>> original strings.
>>
>> //headers needed for implementing this function
>> #include <string>
>> #include <algorithm>
>> #include <locale.h>
>>
>> int stringcasecmp(std::string s1, std::string s2)
>> {
>>     std::transform(s1.begin(), s1.end(), s1.begin(), ::tolower);
>>     std::transform(s2.begin(), s2.end(), s2.begin(), ::tolower);
>>     return s1.compare(s2);
>> }

>
>This is still ignoring the undefined behavior of ::tolower with negative
>values. See http://www.unix.com/man-page/POSIX/3posix/tolower/.
>
>> You make a cogent argument for a general solution to the problem but
>> you don't offer one.

>
>I'm not quite sure what the problem is. If it is just implementing
>strcasecmp() on Windows, then one can just forward to _stricmp() or
>better yet to _wcsicmp().


This advice is amusing because Microsoft's _stricmp() and _wcsicmp()
use the ASCII and wide versions of tolower(), respectively.

>
>If it is about implementing it without using the C portion of the C++
>standard then you have failed as ::tolower is a C function as well.
>


No, it's not about that at all.

>If it is about providing a general solution of case insensitive string
>comparison then this has been already long done as strcasecmp() is part
>of POSIX.
>


This is also amusing because the POSIX documentation states it is both
locale specific and depends on the tolower conversion.

http://pubs.opengroup.org/onlinepubs/9699919799/

"...

The strcasecmp() and strncasecmp() functions use the current locale to
determine the case of the characters.

The strcasecmp_l() and strncasecmp_l() functions use the locale
represented by locale to determine the case of the characters.

When the LC_CTYPE category of the locale being used is from the POSIX
locale, these functions shall behave as if the strings had been
converted to lowercase and then a byte comparison performed.
Otherwise, the results are unspecified.

The behavior is undefined if the locale argument to strcasecmp_l() or
strncasecmp_l() is the special locale object LC_GLOBAL_LOCALE or is
not a valid locale object handle."

All the implementations of strcasecmp() I have been able to find trace
their parentage to BSD 4.3 from 1993, to wit:

int strcasecmp(const char *s1, const char *s2)
{
    const unsigned char
        *uc1 = (const unsigned char *)s1,
        *uc2 = (const unsigned char *)s2;

    while (tolower(*uc1) == tolower(*uc2++))
        if (*uc1++ == '\0')
            return (0);
    return (tolower(*uc1) - tolower(*--uc2));
}
 
 
James Kanze
      04-30-2013
On Monday, April 29, 2013 6:15:36 PM UTC+1, Geoff wrote:
> On Sun, 28 Apr 2013 01:23:20 -0500, Paavo Helde
> <(E-Mail Removed)> wrote:


> >What's wrong with an inline forwarding function?


> Nothing. As it is, I got off my ass this weekend and eliminated the
> rename and implemented the strcasecmp function within the application,
> conditional on _WIN32 as I posted here earlier. A hack is a hack and
> it was time to eliminate that one.


> Since we're in comp.lang.c++, what do you think of this implementation
> of a case-insensitive compare of C++ strings?


> int stringcasecmp(const std::string s1, const std::string s2)
> {
>     std::string us1 = s1;
>     std::string us2 = s2;
>     std::transform(us1.begin(), us1.end(), us1.begin(), ::tolower);
>     std::transform(us2.begin(), us2.end(), us2.begin(), ::tolower);
>     return us1.compare(us2);
> }


For starters, it has undefined behavior: you can't call
::tolower on a char.
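
The usual way around this is to convert each char to unsigned char
before it reaches tolower. A minimal sketch, assuming C++11 lambdas;
the name to_lower_copy is made up for illustration and is not from
the thread:

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: returns a lowercased copy of s, using the
// current C locale. The lambda takes its argument as unsigned char,
// so std::tolower only ever sees a value representable as unsigned
// char, even on platforms where plain char is signed.
std::string to_lower_copy(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return s;
}

// Small usage check; bytes above 0x7F are passed through safely.
inline void to_lower_copy_example()
{
    assert(to_lower_copy("Hello, World") == "hello, world");
    assert(to_lower_copy("\xC4\xDF").size() == 2);
}
```

This only removes the undefined behavior; it still works byte by
byte, so it does nothing for UTF-8.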

Performance-wise, of course: is there really any reason to build
two new strings? Wouldn't it be a lot better if you did the
transformations on the fly? (I'm supposing the fact that you
use pass by value, rather than references, is an oversight.)
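
Something along these lines would do the transformation on the fly.
It is only a sketch under the same single-byte assumptions as the
original; the name stringcasecmp_onthefly is hypothetical:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical name. Compares s1 and s2 without copying either one,
// lowercasing a pair of characters at a time in the current C locale.
// Returns -1, 0 or 1, with the same sign convention as
// std::string::compare.
int stringcasecmp_onthefly(const std::string& s1, const std::string& s2)
{
    const std::size_t n = s1.size() < s2.size() ? s1.size() : s2.size();
    for (std::size_t i = 0; i < n; ++i) {
        // Go through unsigned char before calling std::tolower.
        const int c1 = std::tolower(static_cast<unsigned char>(s1[i]));
        const int c2 = std::tolower(static_cast<unsigned char>(s2[i]));
        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
    }
    // Equal over the common prefix: the shorter string sorts first.
    if (s1.size() == s2.size())
        return 0;
    return s1.size() < s2.size() ? -1 : 1;
}

// Small usage check.
inline void stringcasecmp_onthefly_example()
{
    assert(stringcasecmp_onthefly("Hello", "hELLO") == 0);
    assert(stringcasecmp_onthefly("abc", "ABD") < 0);
}
```

It stops at the first difference and orders by unsigned byte value,
which is also what a strcasecmp that walks unsigned char pointers
does.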

And perhaps above all: what is the program really supposed to
do? (In most contexts, today, I would want it to handle UTF-8.
Obviously, this one doesn't, but what about "MASSE" vs. "Maße"?)
Shouldn't it be locale sensitive?
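
If a locale-sensitive, single-byte comparison is really what is
wanted, the std::ctype facet from <locale> is one option. A rough
sketch; the name equal_ignore_case and the defaulted locale
parameter are my own assumptions:

```cpp
#include <cassert>
#include <locale>
#include <string>

// Hypothetical helper: tests two strings for equality, ignoring case
// as defined by the std::ctype<char> facet of loc. The facet's
// tolower takes a char directly, so no cast is needed here.
bool equal_ignore_case(const std::string& s1, const std::string& s2,
                       const std::locale& loc = std::locale())
{
    if (s1.size() != s2.size())
        return false;
    const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);
    for (std::string::size_type i = 0; i < s1.size(); ++i) {
        if (ct.tolower(s1[i]) != ct.tolower(s2[i]))
            return false;
    }
    return true;
}

// Small usage check with the classic "C" locale.
inline void equal_ignore_case_example()
{
    assert(equal_ignore_case("Hello", "hELLO", std::locale::classic()));
    assert(!equal_ignore_case("abc", "abd", std::locale::classic()));
}
```

This still maps one char to one char, so it cannot handle UTF-8 or
treat "MASSE" and "Maße" as equal.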

Part of the problem (and perhaps the reason why C and C++ didn't
adopt anything similar) is that case-insensitive comparison
isn't really a well-defined concept.

--
James
 