On Dec 14, 5:41 pm, Sam <s...@email-scan.com> wrote:
> I am assuming, based on your description, that your file contents are coded
> in UTF-16.
>
> If so, each two-byte code point should've been read into a single wchar_t.
> That's what a wchar_t is, after all. Sounds like your std::wifstream thought
> that your file contents were coded in, probably, ISO-8859-1, and you're
> seeing the results.
Sounds reasonable.
> Double-check that you've set your global locale correctly to reflect that
> your system environment uses UTF-16 coding, or imbue a UTF-16 locale into
> your std::wifstream.
As I understand it, in Visual Studio, if a project is set to use
Unicode, then any wide strings are UTF-16. I also assume the Windows
API calls to read and write files treat text as UTF-16. That's a
question for an MS newsgroup though.
My questions here are:
How do I set a "global locale"?
How do I imbue a UTF-16 locale into a stream?
Are there built-in UTF-16 locales?
Are there built-in UTF-8 locales?
Are there built-in conversion methods?
I am googling the hell out of facets and locales and finding very
little, aside from similarly frustrated people.
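For reference, a minimal sketch of the first two questions, assuming a
C++11 standard library that ships <codecvt> (it is newer than some
compilers and is deprecated as of C++17). The helper names
use_environment_locale and read_utf16_file are made up for illustration;
std::locale::global, imbue, and std::codecvt_utf16 are the standard
pieces. The standard itself only guarantees the "C" and "" locale names,
so whether a named UTF-16 or UTF-8 locale exists is platform-specific.

```cpp
#include <codecvt>    // std::codecvt_utf16 (C++11; deprecated since C++17)
#include <fstream>
#include <iterator>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: make the user's environment locale the global one.
// Returns the locale that was global before the call.
std::locale use_environment_locale() {
    try {
        return std::locale::global(std::locale(""));
    } catch (const std::runtime_error&) {
        // The environment names a locale this system does not provide;
        // leave the global locale unchanged and report the current one.
        return std::locale();
    }
}

// Hypothetical helper: read a whole UTF-16 file into a wide string.
// A leading BOM is consumed and selects the byte order; without a BOM,
// little-endian is assumed. Returns an empty string if the file
// cannot be opened.
std::wstring read_utf16_file(const std::string& path) {
    typedef std::codecvt_utf16<
        wchar_t, 0x10ffff,
        std::codecvt_mode(std::consume_header | std::little_endian)>
        Utf16Facet;

    std::wifstream in;
    // The locale takes ownership of the facet allocated here.
    in.imbue(std::locale(in.getloc(), new Utf16Facet));
    // Binary mode keeps newline translation away from the two-byte units.
    in.open(path.c_str(), std::ios::in | std::ios::binary);

    return std::wstring(std::istreambuf_iterator<wchar_t>(in),
                        std::istreambuf_iterator<wchar_t>());
}
```

The facet is imbued before the file is opened so that no bytes are read
under the default conversion. As far as I understand it, on platforms
with a 16-bit wchar_t this facet produces UCS-2, so characters outside
the BMP would need separate handling.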
> > Furthermore,
> > 1) I cannot double click the file and open it as XML on Windows Server
> > 2003. It says "Invalid character. Error processing resource"
>
> If that's the case, then this has nothing to do with your code, and the
> file's coding does not match your system locale.
> The file must've been generated on a system that uses a locale with a
> different character set/code point.
I think that the encoding is not valid anywhere because of the mix and
match between multibyte, wide, ASCII, UTF-16, UTF-8, Windows-generated
text, 3rd-party-library-generated text, streaming, etc. used
throughout the project I am in, without any regard for consistency in
character encoding.
I am trying to decipher what they "thought it was" and how to get it
into something usable.
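One cheap first step in working out what a file "thought it was" is to
look at its first few bytes for a byte-order mark. A rough sketch
follows; the Encoding enum and the sniff_bom / sniff_file names are
hypothetical, and a missing BOM proves nothing either way, so this is
only a hint, not a detector.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical labels for what the leading bytes suggest.
enum Encoding { kUnknown, kUtf8Bom, kUtf16LE, kUtf16BE };

// Classify a buffer by its leading byte-order mark, if any.
// Caveat: a UTF-32LE BOM (FF FE 00 00) also starts with FF FE and
// would be reported as UTF-16LE by this simple check.
Encoding sniff_bom(const unsigned char* data, std::size_t size) {
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return kUtf8Bom;
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return kUtf16LE;
    if (size >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return kUtf16BE;
    return kUnknown;
}

// Read up to three bytes from the start of a file and classify them.
// An unreadable or empty file yields kUnknown.
Encoding sniff_file(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    unsigned char head[3] = {0, 0, 0};
    in.read(reinterpret_cast<char*>(head), sizeof head);
    return sniff_bom(head, static_cast<std::size_t>(in.gcount()));
}
```

If the result is kUnknown, looking at the raw bytes in a hex editor for
alternating zero bytes (typical of ASCII-range text stored as UTF-16)
is a reasonable next guess.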
> Additionally, all XML files should be coded in UTF-8 anyway, not UTF-16, and
> not ISO-8859-1.
It's not XML that follows the rules. It's "XML" that only resembles
XML in its use of tags, which some developer put into a file using
Windows API functions.