Velocity Reviews


SHIRE 01-19-2004 01:25 PM

How do i decode an ISO-8859-1 encoded text file?
 
Hi,

I want to decode the content of the text file which looks like this:

Subject: =?iso-8859-1?Q?Sek=2Dm=F6te?=
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable

Vill bara s=E4ga att du hade r=E4tt.

I tried to do it like this:

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class MessageHandler {

    BufferedReader _reader = null;

    /**
     * Constructor: opens the file and reads its bytes as ISO-8859-1 characters.
     */
    public MessageHandler(String msgFile) {
        try {
            _reader = new BufferedReader(new InputStreamReader(
                    new FileInputStream(msgFile), "ISO-8859-1"));
        } catch (IOException e) {
            System.out.println(e.toString());
        }
    }

    //---------------------------------------------------------------
    public void close() {
        try {
            _reader.close();
        } catch (IOException e) {
            System.out.println(e.toString());
        }
    }

    //-------------------------------------------------------------
    // Reads the whole file and returns it with CRLF line endings.
    public String getFileContent() {
        StringBuilder msg = new StringBuilder();
        String line;
        try {
            while ((line = _reader.readLine()) != null) {
                msg.append(line).append("\r\n");
            }
        } catch (IOException e) {
            System.out.println(e.toString());
        }
        return msg.toString();
    }

    //-------------------------------------------------------------------
    public static void main(String[] args) {
        MessageHandler msgHandler = new MessageHandler("data.txt");
        System.out.println("Content:" + msgHandler.getFileContent());
        msgHandler.close();
    }

}//End of MessageHandler


But it didn't work.
Please advise.
Thanks for your help.

Mohamud Jama



Michael Borgwardt 01-19-2004 01:43 PM

Re: How do i decode an ISO-8859-1 encoded text file?
 
SHIRE wrote:

> Hi,
>
> I want to decode the content of the text file which looks like this:
>
> Subject: =?iso-8859-1?Q?Sek=2Dm=F6te?=
> Content-Type: text/plain; charset=iso-8859-1
> Content-Transfer-Encoding: quoted-printable
>
> Vill bara s=E4ga att du hade r=E4tt.


The key here is not ISO-8859-1, it's "quoted-printable".
To decode it, replace every equals sign that is followed by two
hexadecimal digits with the single byte whose value those digits give
(interpreted as ISO-8859-1 here, since values such as E4 lie outside
ASCII), and whenever an equals sign ends a line, remove both the equals
sign and that line break.
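
The two rules above can be sketched in plain Java roughly as follows. This is a minimal illustration, not a complete quoted-printable decoder: the class name QuotedPrintableSketch and its methods are made up for this example, the input is assumed to be plain ASCII text, and malformed escapes are simply passed through unchanged.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of any library.
class QuotedPrintableSketch {

    /** Decodes quoted-printable text, then interprets the bytes in the given charset. */
    static String decode(String encoded, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < encoded.length()) {
            char c = encoded.charAt(i);
            if (c != '=') {
                out.write(c);                 // literal character (assumed ASCII)
                i++;
            } else if (encoded.startsWith("\r\n", i + 1)) {
                i += 3;                       // soft line break: drop "=" and CRLF
            } else if (encoded.startsWith("\n", i + 1)) {
                i += 2;                       // soft line break with a bare LF
            } else if (i + 2 < encoded.length()
                    && isHex(encoded.charAt(i + 1))
                    && isHex(encoded.charAt(i + 2))) {
                // "=XX": emit the byte with hexadecimal value XX
                out.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 3;
            } else {
                out.write('=');               // malformed escape: keep it as-is
                i++;
            }
        }
        return new String(out.toByteArray(), charset);
    }

    private static boolean isHex(char c) {
        return Character.digit(c, 16) >= 0;
    }

    public static void main(String[] args) {
        String body = "Vill bara s=E4ga att du hade r=E4tt.";
        System.out.println(decode(body, StandardCharsets.ISO_8859_1));
    }
}
```

The charset only matters in the last step: the escapes yield raw bytes, and the charset named in the Content-Type header says how to turn those bytes into characters.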


Oscar Kind 01-19-2004 02:02 PM

Re: How do i decode an ISO-8859-1 encoded text file?
 
SHIRE <mohamud.jama@ericsson.com> wrote:
> Hi,
>
> I want to decode the content of the text file which looks like this:
>
> Subject: =?iso-8859-1?Q?Sek=2Dm=F6te?=
> Content-Type: text/plain; charset=iso-8859-1
> Content-Transfer-Encoding: quoted-printable
>
> Vill bara s=E4ga att du hade r=E4tt.
>
> I tried to do it like this:
>

[code that AFAIK correctly uses character encoding]

> But it didn't work.


You didn't decode the quoted-printable text (7-bit text; US-ASCII) into
its original form (8-bit text; ISO-8859-1).


Oscar

--
No trees were harmed in creating this message.
However, a large number of electrons were terribly inconvenienced.

Thomas Weidenfeller 01-19-2004 02:25 PM

Re: How do i decode an ISO-8859-1 encoded text file?
 
SHIRE wrote:

> Subject: How do i decode an ISO-8859-1 encoded text file?


For the record: That mail is NOT ISO-8859-1 encoded. You get ISO-8859-1
if you manage to decode it. ISO-8859-1 is a common 8-bit character set
(aka Latin-1), but in order to get it down to 7 bits for mail transfer,
it needs encoding. In your case, quoted-printable has been used for
encoding. You have to reverse that encoding to get an ISO-8859-1 text back.

> I want to decode the content of the text file which looks like this:
>
> Subject: =?iso-8859-1?Q?Sek=2Dm=F6te?=
> Content-Type: text/plain; charset=iso-8859-1
> Content-Transfer-Encoding: quoted-printable
>
> Vill bara s=E4ga att du hade r=E4tt.


That is, you either get the RFC which describes quoted-printable encoding
(RFC 2045) and implement your own decoder, or you get the JavaMail package
from Sun and use the decoder in that package, once you have figured out how
to use it. JavaMail will also be able to decode the subject line (which is
AFAIR defined in yet another RFC; RFC 2047 covers encoded header words).
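
For the subject line, JavaMail offers MimeUtility.decodeText for header values (and MimeUtility.decode for body streams). To show what such a header decoder has to do without requiring the JavaMail jar, here is a rough stdlib-only sketch. The class EncodedWordSketch and its methods are hypothetical names for this example; it handles only the "Q" encoding, does not merge adjacent encoded words, and will throw if the charset name is unknown to the JVM.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not the JavaMail implementation.
class EncodedWordSketch {

    // Matches =?charset?Q?encoded-text?= (Q encoding only; B/base64 is not handled)
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?[Qq]\\?([^?]*)\\?=");

    /** Replaces every Q-encoded word in a header value with its decoded text. */
    static String decodeHeader(String value) {
        Matcher m = ENCODED_WORD.matcher(value);
        StringBuilder result = new StringBuilder();
        int last = 0;
        while (m.find()) {
            result.append(value, last, m.start());   // text before the encoded word
            result.append(decodeQ(m.group(2), Charset.forName(m.group(1))));
            last = m.end();
        }
        result.append(value, last, value.length());  // remaining plain text
        return result.toString();
    }

    /** Decodes the text part of a Q-encoded word: "_" is a space, "=XX" is a byte. */
    static String decodeQ(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '_') {
                out.write(' ');
            } else if (c == '=' && i + 2 < text.length()
                    && Character.digit(text.charAt(i + 1), 16) >= 0
                    && Character.digit(text.charAt(i + 2), 16) >= 0) {
                out.write(Integer.parseInt(text.substring(i + 1, i + 3), 16));
                i += 2;                               // skip the two hex digits
            } else {
                out.write(c);                         // literal character (assumed ASCII)
            }
        }
        return new String(out.toByteArray(), charset);
    }

    public static void main(String[] args) {
        System.out.println(decodeHeader("Subject: =?iso-8859-1?Q?Sek=2Dm=F6te?="));
    }
}
```

In real code the library decoder is the safer choice, since it also covers the base64 ("B") form and the whitespace rules between adjacent encoded words.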

/Thomas


SHIRE 02-04-2004 04:03 PM

Re: How do i decode an ISO-8859-1 encoded text file?
 
Thank you all. I used JavaMail package for decoding.
Thanks Thomas!

/Mohamud