Stupid std::codecvt question

 
 
wscholine
Guest
Posts: n/a
 
      07-02-2007
This is with MSVC8, if there's an implementation dependency.

I have a requirement to read lines from files that might be composed
of wchar_t (for example, text files written by MS Notepad using
"Save As Unicode"). I would like to do this:

typedef std::codecvt<wchar_t, wchar_t, mbstate_t> nullcodecvt;
...
std::wifstream myFile;
...
// somehow associate a nullcodecvt facet with myFile, if it has a Unicode BOM
...
std::wstring wline;
std::getline(myFile, wline);
...
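The "if it has a Unicode BOM" step in the outline above could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name `consume_utf16le_bom` is hypothetical, and it assumes the little-endian BOM bytes FF FE that Notepad's "Unicode" format writes, read from a narrow stream opened in binary mode.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns true and leaves the stream positioned just
// past the BOM if the next two bytes are FF FE (UTF-16LE). Otherwise it
// restores the original read position and returns false.
bool consume_utf16le_bom(std::istream& in) {
    char bom[2] = {0, 0};
    const std::istream::pos_type start = in.tellg();
    in.read(bom, 2);
    if (in.gcount() == 2 &&
        static_cast<unsigned char>(bom[0]) == 0xFF &&
        static_cast<unsigned char>(bom[1]) == 0xFE) {
        return true;
    }
    in.clear();       // a short read sets eofbit/failbit; reset before seeking
    in.seekg(start);  // rewind so the caller sees the data untouched
    return false;
}

// Usage sketch on in-memory data instead of a real file.
void bom_demo() {
    std::istringstream with_bom(std::string("\xFF\xFEh\0", 4));
    assert(consume_utf16le_bom(with_bom));
    assert(with_bom.get() == 'h');   // positioned after the BOM

    std::istringstream plain("hello");
    assert(!consume_utf16le_bom(plain));
    assert(plain.get() == 'h');      // nothing was consumed
}
```

Probing with a narrow byte stream sidesteps the question of which conversion a wide stream would apply to the BOM bytes themselves.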

What I tried is this:

// awkward-looking circumlocution seems to be the only way to get a
// reference to a nullcodecvt
const nullcodecvt &conv =
std::use_facet<nullcodecvt>(std::wcin.getloc());
const std::locale from(std::wcin.getloc(), &conv);
// the file I'm playing with contains the text of a sonnet, hence the name
std::wifstream wsonnet;
wsonnet.imbue(from);
wsonnet.open(L"sonnet-2");
// seek past the BOM
wsonnet.seekg(2, std::ios::beg);
std::wstring wline;
while (wsonnet)
{
std::getline(wsonnet, wline);
}

which does not do the trick. The first time through the loop, wline
gets the low-order half of the character after the BOM, and is empty
thereafter.

Inspecting the data structures with the debugger, I find that wsonnet
has a member of type std::basic_filebuf<wchar_t,
std::char_traits<wchar_t> >, and that this member has a member of type
std::codecvt<wchar_t, char, int> *. The call to
std::wifstream::imbue() doesn't touch that (unsurprisingly, since it's
a different type than the codecvt instantiation that I want). However,
if I manually modify the pointer to point to my nullcodecvt & conv,
the behavior is what I want: each time through the loop, the
successive lines get read without being converted.
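The debugger observation matches how the standard specifies file buffers: `std::basic_filebuf<wchar_t>` asks its locale for the facet type `std::codecvt<wchar_t, char, std::mbstate_t>` (on MSVC, `mbstate_t` is the `int` seen above), so a facet installed under any other `codecvt` instantiation is never consulted. A minimal sketch of that lookup, with the hypothetical helper name `wide_filebuf_converts`:

```cpp
#include <cassert>
#include <cwchar>
#include <locale>

// The facet type a wide file buffer looks up in its imbued locale.
using filebuf_codecvt = std::codecvt<wchar_t, char, std::mbstate_t>;

// Hypothetical helper: true if a wide file stream imbued with `loc` would
// run the file's bytes through a real conversion rather than a no-op.
bool wide_filebuf_converts(const std::locale& loc) {
    return std::has_facet<filebuf_codecvt>(loc) &&
           !std::use_facet<filebuf_codecvt>(loc).always_noconv();
}

// Usage sketch: the classic "C" locale carries this facet, and for
// wchar_t it is not a pass-through.
void facet_lookup_demo() {
    assert(wide_filebuf_converts(std::locale::classic()));
}
```

Facets are keyed by the `id` of their type, which is why a replacement has to derive from this exact instantiation to be picked up by `imbue()`.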

FWIW, wsonnet::basic_istream::basic_ios::ios_base does have a
std::locale * that includes my nullcodecvt in its facets. It doesn't
affect the behavior of std::getline() though.

Is what I am trying to do just wrong? Or is there something broken
with the MS implementation of std::wifstream?

If I'm not totally on the wrong track, is there some less kludgy-
looking way of getting the facet instantiated?

Thanks in advance.

 
P.J. Plauger
Guest
Posts: n/a
 
      07-02-2007
"wscholine" wrote in message:

> This is with MSVC8, if there's an implementation dependency.
>
> I have a requirement to read lines from files that might be composed
> of wchar_t (for example, text files written by MS Notepad using
> "Save As Unicode"). I would like to do this:
>
> typedef std::codecvt<wchar_t, wchar_t, mbstate_t> nullcodecvt;
> ...
> std::wifstream myFile;
> ...
> // somehow associate a nullcodecvt facet with myFile, if it has a Unicode BOM
> ...
> std::wstring wline;
> std::getline(myFile, wline);
> ...
>
> What I tried is this:
>
> // awkward-looking circumlocution seems to be the only way to get a
> // reference to a nullcodecvt
> const nullcodecvt &conv =
> std::use_facet<nullcodecvt>(std::wcin.getloc());
> const std::locale from(std::wcin.getloc(), &conv);
> // the file I'm playing with contains the text of a sonnet, hence the name
> std::wifstream wsonnet;
> wsonnet.imbue(from);
> wsonnet.open(L"sonnet-2");
> // seek past the BOM
> wsonnet.seekg(2, std::ios::beg);
> std::wstring wline;
> while (wsonnet)
> {
> std::getline(wsonnet, wline);
> }
>
> which does not do the trick. The first time through the loop, wline
> gets the low-order half of the character after the BOM, and is empty
> thereafter.
>
> Inspecting the data structures with the debugger, I find that wsonnet
> has a member of type std::basic_filebuf<wchar_t,
> std::char_traits<wchar_t> >, and that this member has a member of type
> std::codecvt<wchar_t, char, int> *. The call to
> std::wifstream::imbue() doesn't touch that (unsurprisingly, since it's
> a different type than the codecvt instantiation that I want). However,
> if I manually modify the pointer to point to my nullcodecvt & conv,
> the behavior is what I want: each time through the loop, the
> successive lines get read without being converted.
>
> FWIW, wsonnet::basic_istream::basic_ios::ios_base does have a
> std::locale * that includes my nullcodecvt in its facets. It doesn't
> affect the behavior of std::getline() though.
>
> Is what I am trying to do just wrong?


Yes.

> Or is there something broken
> with the MS implementation of std::wifstream?


No.

> If I'm not totally on the wrong track, is there some less kludgy-
> looking way of getting the facet instantiated?


You need one of the codecvt facets in our code conversion library.
Just which one depends on details you haven't specified, but I'm
sure what you need is in there. Or you might get lucky and find
an open-source codecvt facet that does what you want.
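For readers wondering what such a facet might look like, here is a rough sketch. It is not Dinkumware's library code and not an endorsed solution. The class `utf16le_codecvt` and the helper `read_utf16le_lines` are hypothetical names, and the sketch assumes little-endian UTF-16 input restricted to the Basic Multilingual Plane (surrogate pairs are not combined). The essential point is that it derives from `std::codecvt<wchar_t, char, std::mbstate_t>`, the instantiation the file buffer actually looks up.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <fstream>
#include <locale>
#include <string>
#include <vector>

// Hypothetical facet: maps each pair of file bytes (low byte first) to one
// wchar_t. Assumes BMP-only text; surrogates are passed through unpaired.
class utf16le_codecvt : public std::codecvt<wchar_t, char, std::mbstate_t> {
public:
    explicit utf16le_codecvt(std::size_t refs = 0)
        : std::codecvt<wchar_t, char, std::mbstate_t>(refs) {}

protected:
    result do_in(std::mbstate_t&, const char* from, const char* from_end,
                 const char*& from_next, wchar_t* to, wchar_t* to_end,
                 wchar_t*& to_next) const override {
        from_next = from;
        to_next = to;
        while (from_end - from_next >= 2 && to_next != to_end) {
            const unsigned lo = static_cast<unsigned char>(from_next[0]);
            const unsigned hi = static_cast<unsigned char>(from_next[1]);
            *to_next++ = static_cast<wchar_t>(lo | (hi << 8));
            from_next += 2;
        }
        // Leftover input means a dangling byte or a full output buffer.
        return from_next == from_end ? ok : partial;
    }

    result do_out(std::mbstate_t&, const wchar_t* from, const wchar_t* from_end,
                  const wchar_t*& from_next, char* to, char* to_end,
                  char*& to_next) const override {
        from_next = from;
        to_next = to;
        while (from_next != from_end && to_end - to_next >= 2) {
            const unsigned long c = static_cast<unsigned long>(*from_next);
            if (c > 0xFFFF) return error;  // outside the BMP: not handled here
            *to_next++ = static_cast<char>(c & 0xFF);
            *to_next++ = static_cast<char>((c >> 8) & 0xFF);
            ++from_next;
        }
        return from_next == from_end ? ok : partial;
    }

    result do_unshift(std::mbstate_t&, char* to, char*,
                      char*& to_next) const override {
        to_next = to;
        return noconv;  // the encoding is stateless
    }

    int do_encoding() const noexcept override { return 2; }  // 2 bytes per unit
    bool do_always_noconv() const noexcept override { return false; }
    int do_max_length() const noexcept override { return 2; }

    int do_length(std::mbstate_t&, const char* from, const char* from_end,
                  std::size_t max) const override {
        const std::size_t units = static_cast<std::size_t>(from_end - from) / 2;
        return static_cast<int>((units < max ? units : max) * 2);
    }
};

// Hypothetical helper: reads all lines, dropping a leading BOM and any
// trailing carriage return left over from CR LF line endings.
std::vector<std::wstring> read_utf16le_lines(const char* path) {
    assert(path != nullptr);
    std::wifstream in;
    // Imbue before opening; the locale takes ownership of the new facet.
    in.imbue(std::locale(std::locale::classic(), new utf16le_codecvt));
    in.open(path, std::ios::in | std::ios::binary);

    const wchar_t bom = static_cast<wchar_t>(0xFEFF);
    std::vector<std::wstring> lines;
    std::wstring line;
    while (std::getline(in, line)) {
        if (lines.empty() && !line.empty() && line[0] == bom) line.erase(0, 1);
        if (!line.empty() && line.back() == L'\r') line.pop_back();
        lines.push_back(line);
    }
    return lines;
}
```

Opening in binary mode keeps the runtime from touching the byte stream before the facet sees it, and decoding the BOM as an ordinary character avoids seeking by a byte count on a wide stream.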

> Thanks in advance.


P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com


 