Character encoding (2)
Sorry for posting this again, but since my thread from last Saturday kind
of hit a dead end, I decided to start a fresh one. Refer also
The problem I'm having is basically only on the server side...
I'm working on a server that should receive HTTP requests. However, a
request that arrives at the server may not be HTTP. This is checked by
looking at the first byte of data.
(in other words:
if the first byte is equal to 0x01,
then not HTTP
else ... )
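
To make the first-byte check concrete, here is a minimal sketch. The class and method names (ProtocolSniffer, looksLikeHttp) are made up for illustration; only the 0x01 marker comes from the description above. It uses a PushbackInputStream so the byte that was inspected is put back and whatever reads the stream next still sees it from the start.

```java
import java.io.IOException;
import java.io.PushbackInputStream;

class ProtocolSniffer {
    // Marker byte for the non-HTTP case, as described in the post.
    static final int NON_HTTP_MARKER = 0x01;

    /**
     * Peeks at the first byte and pushes it back, so later readers
     * still see the complete stream. An empty stream is treated as
     * "not HTTP" (an assumption; the post does not cover that case).
     */
    static boolean looksLikeHttp(PushbackInputStream in) throws IOException {
        int first = in.read();
        if (first == -1) {
            return false; // nothing was sent at all
        }
        in.unread(first); // put the byte back for the real parser
        return first != NON_HTTP_MARKER;
    }
}
```

On the server this would wrap the socket's input stream once, e.g. new PushbackInputStream(socket.getInputStream()), and the same wrapper is then handed to either the HTTP or the non-HTTP handler.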
Assuming the data is posted over HTTP, I'm trying to resolve the
following: I don't know a priori which encoding is used for the data
stream. The following rule applies: if the regex
<?xml [^>]+encoding="([^"]+)" matches the data, $1 is used for
decoding; otherwise a default charset is used. My goal is to use both
the characters (i.e. the server's 'interpretation' of the bytes
received) and the original byte stream. I want to write the original
byte stream to a file, while using the derived character stream for
processing (using beans, XSL
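
The encoding rule above could be sketched like this. The class XmlEncodingSniffer and its methods are hypothetical names; the regex is the one from the post. The sketch assumes the body has already been read into a byte array and that the XML declaration is in an ASCII-compatible encoding (it would not find the declaration in a UTF-16 document). The bytes are decoded as ISO-8859-1 only to search for the declaration, since that charset maps every byte to exactly one char and so never fails.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class XmlEncodingSniffer {
    // Same pattern as in the post; group 1 is the declared encoding.
    private static final Pattern DECL =
            Pattern.compile("<\\?xml [^>]+encoding=\"([^\"]+)\"");

    /** Returns the declared charset, or the fallback if none is usable. */
    static Charset detect(byte[] body, Charset fallback) {
        // Only look at the start of the body; 1024 bytes is an arbitrary limit.
        int n = Math.min(body.length, 1024);
        String head = new String(body, 0, n, StandardCharsets.ISO_8859_1);
        Matcher m = DECL.matcher(head);
        if (m.find()) {
            try {
                return Charset.forName(m.group(1));
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name: use the fallback.
            }
        }
        return fallback;
    }

    /** Decodes the untouched bytes with whatever detect() decided on. */
    static String decode(byte[] body, Charset fallback) {
        return new String(body, detect(body, fallback));
    }
}
```

Because the byte array itself is never modified, the same array can be written to the file as-is while the String from decode() goes on to the processing step.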
I tried simulating the client using a basic HTML page, with a FORM
action pointing to my server's URL. Now in HTML I can specify the meta
element Content-type, and set it to "text/xml; charset=utf-8" or
whatever I like. I recall that by default HTML forms encode using the
platform default charset and content-type application/x-www-form-urlencoded
I also tried to simulate the client with a Java application that uses
java.net.HttpURLConnection. Here I set the request property
"Content-type" to "text/xml; charset=utf-8".
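
For reference, a test client along those lines might look roughly like the sketch below. XmlPostClient, prepare and post are illustrative names, and the URL is whatever the server listens on. The point it tries to show is that the charset named in the header and the charset used to turn the String into bytes must be the same one; setting the header alone does not encode anything.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;

class XmlPostClient {
    /** Configures a POST with the given charset in Content-Type; does not connect yet. */
    static HttpURLConnection prepare(URL url, Charset charset) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true); // we are going to send a request body
        conn.setRequestProperty("Content-Type", "text/xml; charset=" + charset.name());
        return conn;
    }

    /** Sends the XML encoded with the same charset that the header announces. */
    static int post(URL url, String xml, Charset charset) throws IOException {
        HttpURLConnection conn = prepare(url, charset);
        byte[] body = xml.getBytes(charset); // explicit charset, not the platform default
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        return conn.getResponseCode();
    }
}
```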
Now I'm not sure whether in either one or both cases the stream is mime
Someone in the previous thread suggested that I use HttpURLConnection
on the server side as well, but since I'm also expecting non-HTTP
requests, I'm not sure I can. Most likely I cannot use a BufferedReader,
because it works on a character stream, so I lose the original byte
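
One way around the BufferedReader problem is to stay on the byte level until all the data is in, and only then decode a copy. The sketch below is only an outline under assumptions: RawAndDecoded and its methods are made-up names, and readAll reads until end of stream, which on a live socket only works if the client closes its side. For a real HTTP request you would stop after Content-Length bytes instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.Charset;

class RawAndDecoded {
    /** Buffers the whole stream in memory without interpreting it. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }

    /**
     * Writes the bytes to the archive exactly as received, then
     * returns them decoded with the given charset for processing.
     */
    static String archiveAndDecode(InputStream in, OutputStream archive, Charset charset)
            throws IOException {
        byte[] raw = readAll(in);
        archive.write(raw); // original byte stream, untouched
        return new String(raw, charset);
    }
}
```

The archive parameter could be a FileOutputStream for the file mentioned above; the returned String (or a StringReader around it) is what the character-based processing would use.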
Re: Character encoding (2)
Actually, rereading my post, I would like to add: I want to write to a
file the original byte representation of characters after having
processed them. The bean object(s) I'm using have their own write()
method, which writes to an outputstream.
I guess the solution here is to remember what the original encoding
was, use the bean's write method to write to a ByteArrayOutputStream,
then parse that to a String using the platform default encoding, and
then rewrite that using the original encoding to the file output stream...
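
The round trip described here could be sketched as follows. Transcoder is a hypothetical helper, and the sketch assumes you know which charset the bean's write() actually used. Decoding with the platform default is only correct if the bean wrote with the platform default, so the sketch takes that charset as an explicit parameter instead.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.Charset;

class Transcoder {
    /**
     * Decodes bytes that were written in one charset and re-encodes
     * them in another. Characters that the target charset cannot
     * represent are replaced by String.getBytes (typically with '?').
     */
    static void transcode(byte[] written, Charset writtenIn, Charset original, OutputStream out)
            throws IOException {
        String text = new String(written, writtenIn); // bytes -> characters
        out.write(text.getBytes(original));           // characters -> bytes
    }
}
```

Here written would be the result of toByteArray() on the ByteArrayOutputStream the bean wrote to, original is the charset remembered from the incoming request, and out is the file output stream.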