Characters aren't displayed correctly

 
 
Hussein B
03-01-2009
Hey,
I'm retrieving records from a MySQL database that contain non-English
characters.
Then I build a string that contains HTML markup and column values
from that result set.
+++++
markup = u'''<table>.....'''
for row in rows:
    markup = markup + '<tr><td>' + row['id']
markup = markup + '</table>'
+++++
Then I send the email following this recipe:
http://code.activestate.com/recipes/473810/
The email shows ????? in place of every non-English character.
Any ideas?
Ubuntu 8.04
Python 2.5.2
Evolution Mail Client
Thanks.
 
Philip Semanchuk
03-01-2009

On Mar 1, 2009, at 8:31 AM, Hussein B wrote:

> Well, the email contains ????? characters for each non english ones.
> Any ideas?


There are so many places where this could go wrong, and you haven't
narrowed down the problem.

Are the characters stored in the database correctly?

Are they stored consistently (i.e. all using the same encoding, not
some using utf-8 and others using iso-8859-1)?

What are you getting out of the database? Is it being converted to
Unicode correctly, or at all?

Are you sure that the program you're using to view the email
understands the encoding?

Isolate those questions one at a time. Add some debugging breakpoints.
Ensure that you have what you think you have. You might not fix your
problem, but you will make it much smaller and more specific.


Good luck
Philip



 
J. Clifford Dyer
03-01-2009
On Sun, 2009-03-01 at 09:51 -0500, Philip Semanchuk wrote:
> Are the characters stored in the database correctly?
>
> Are they stored consistently (i.e. all using the same encoding, not
> some using utf-8 and others using iso-8859-1)?
>
> What are you getting out of the database? Is it being converted to
> Unicode correctly, or at all?
>
> Are you sure that the program you're using to view the email
> understands the encoding?

Let me add to that checklist:

Are you sure the email you are creating has the encoding declared
properly in the headers?



Cheers,
Cliff


 
Hussein B
03-02-2009
On Mar 1, 4:51 pm, Philip Semanchuk wrote:
> Are the characters stored in the database correctly?

Yes they are.

> Are they stored consistently (i.e. all using the same encoding, not
> some using utf-8 and others using iso-8859-1)?

Yes.

> What are you getting out of the database? Is it being converted to
> Unicode correctly, or at all?

I don't know. How can I check this?

> Are you sure that the program you're using to view the email
> understands the encoding?

Yes.


 
Hussein B
03-02-2009
On Mar 1, 11:27 pm, "J. Clifford Dyer" wrote:
> Let me add to that checklist:
>
> Are you sure the email you are creating has the encoding declared
> properly in the headers?

My HTML markup contains only table tags (you know, table, tr and td)
 
J. Clifford Dyer
03-02-2009
On Mon, 2009-03-02 at 00:33 -0800, Hussein B wrote:
> My HTML markup contains only table tags (you know, table, tr and td)


Ah. The issue is not with the HTML markup, but the email headers. For
example, the email you sent me has a header that says:

Content-type: text/plain; charset="iso-8859-1"

Guessing from the recipe you linked to, you probably need something
like:

msgRoot['Content-type'] = 'text/plain; charset="utf-16"'

replacing utf-16 with whatever encoding you have encoded your email
with.

Or it may be that the header has to be attached to the individual mime
parts. I'm not as familiar with MIME.


Cheers,
Cliff
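
A minimal sketch of the header point above, written for Python 3 rather than the Python 2.5 used in this thread. It assumes that msgRoot in the recipe is a multipart container, in which case the charset would normally be declared on the text part rather than on the container. The function name build_html_message, the subject, and the sample body are made up for illustration. Note also that on the standard library's MIME classes, assigning msg['Content-type'] = ... appends a header instead of replacing the one already there.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_html_message(html_body, subject):
    """Build a multipart message whose HTML part declares UTF-8."""
    root = MIMEMultipart("related")
    root["Subject"] = subject
    # Passing the charset here makes MIMEText encode the body as UTF-8
    # and declare that charset in the part's own Content-Type header.
    root.attach(MIMEText(html_body, "html", "utf-8"))
    return root


# Sample body: a table cell holding an Arabic word (hypothetical data).
message = build_html_message(
    "<table><tr><td>\u0645\u0631\u062d\u0628\u0627</td></tr></table>",
    "Report",
)
html_part = message.get_payload()[0]
print(html_part["Content-Type"])
```

Printing the part's Content-Type is a quick way to confirm which charset the mail client will be told to use.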



 
Hussein B
03-02-2009
On Mar 2, 4:03 pm, "J. Clifford Dyer" wrote:
> Ah. The issue is not with the HTML markup, but the email headers. For
> example, the email you sent me has a header that says:
>
> Content-type: text/plain; charset="iso-8859-1"
>
> Guessing from the recipe you linked to, you probably need something
> like:
>
> msgRoot['Content-type'] = 'text/plain; charset="utf-16"'
>
> replacing utf-16 with whatever encoding you have encoded your email
> with.


Hey Cliff,
I tried your tip and I still get the same thing (?????).
I added a print statement to print each value of the result set to the
console, and it also prints ???? instead of the real characters.
Maybe a conversion happens when the data is fetched from the
database?
(The values are stored correctly in the database.)
 
John Machin
03-02-2009
On Mar 2, 7:30 pm, Hussein B wrote:
> On Mar 1, 4:51 pm, Philip Semanchuk wrote:
> > On Mar 1, 2009, at 8:31 AM, Hussein B wrote:
> > > I'm retrieving records from MySQL database that contains non english
> > > characters.

Can you reveal which language???

> > Are the characters stored in the database correctly?
>
> Yes they are.

How do you KNOW that they are stored correctly? What makes you so
sure?

> > Are they stored consistently (i.e. all using the same encoding, not
> > some using utf-8 and others using iso-8859-1)?
>
> Yes.

So what is the encoding used to store them?

> > What are you getting out of the database? Is it being converted to
> > Unicode correctly, or at all?
>
> I don't know, how to make sure of this point?

You could show us some of the output from the database query. As well
as
    print the_output
you should
    print repr(the_output)
and show us both, and also tell us what you *expect* to see.

And let's get the database output sorted out before we worry about the
email message.
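
A small sketch of what the print / print repr comparison is meant to reveal. In the Python 2.5 used in this thread, repr() escapes non-ASCII characters; in Python 3 the closest equivalent is ascii(), which is what this sketch uses. The helper name describe and both sample values are invented for illustration.

```python
def describe(value):
    """Return the type name and an ASCII-only representation of value."""
    return type(value).__name__, ascii(value)


# Arabic text that arrived intact: the escapes show the real code points.
intact = "\u0645\u0631\u062d\u0628\u0627"
# Literal question marks: the original characters are already gone.
lost = "?????"

print(describe(intact))
print(describe(lost))
```

If the escaped form shows real code points, the data reached Python intact and the problem lies further downstream. If it shows plain question marks, the damage happened before Python saw the value.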
 
Hussein B
03-02-2009
On Mar 2, 4:31 pm, John Machin wrote:
> Can you reveal which language???

Arabic.

> How do you KNOW that they are stored correctly? What makes you so
> sure?

Because MySQL Query Browser displays them correctly. In addition, I use
BIRT as the reporting system and it shows them correctly too.

> So what is the encoding used to store them?

The tables were created with the UTF-8 encoding option.

> You could show us some of the output from the database query. As well
> as
>     print the_output
> you should
>     print repr(the_output)
> and show us both, and also tell us what you *expect* to see.

The result of print repr(row['name']) is '??? ??????'
The '?' characters are supposed to be Arabic characters.

> And let's get the database output sorted out before we worry about the
> email message.

Thanks all for the help.
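
A repr made of literal '?' characters suggests the text was replaced before it ever reached Python. One common cause, which is an assumption here because the thread does not confirm it, is that the connection's character set cannot represent Arabic, so the conversion substitutes '?'. The Python 3 sketch below only imitates that substitution with errors="replace"; the helper to_charset and the sample name are hypothetical. If the driver is MySQLdb (not stated in the thread), its connect() accepts charset and use_unicode keyword arguments, which would be the place to request UTF-8.

```python
def to_charset(text, charset):
    """Imitate a lossy conversion: characters the charset lacks become '?'."""
    return text.encode(charset, errors="replace")


# Hypothetical sample value: two Arabic words separated by a space.
name = "\u0627\u0633\u0645 \u0639\u0631\u0628\u064a"

print(to_charset(name, "latin-1"))  # every Arabic letter is replaced
print(to_charset(name, "utf-8"))    # every letter survives as UTF-8 bytes

# Hypothetical connection settings, assuming the MySQLdb driver:
# MySQLdb.connect(host=..., user=..., passwd=..., db=...,
#                 charset="utf8", use_unicode=True)
```

The space survives the latin-1 conversion while each letter does not, which matches the shape of the '??? ??????' output reported above.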
 
Philip Semanchuk
03-02-2009

On Mar 2, 2009, at 9:50 AM, Hussein B wrote:

> On Mar 2, 4:31 pm, John Machin wrote:
>> On Mar 2, 7:30 pm, Hussein B wrote:
>>> On Mar 1, 4:51 pm, Philip Semanchuk wrote:
>>>> What are you getting out of the database? Is it being converted to
>>>> Unicode correctly, or at all?
>>>
>>> I don't know, how to make sure of this point?


Personally, I'd add a debug breakpoint just after extracting the
characters from the database, like so:

import pdb
pdb.set_trace()

When you're stopped at the breakpoint, examine the string you get
back. Is it what you expect? For instance, is it Unicode?

isinstance(my_string, unicode)

Or maybe you're expecting a utf-8 encoded string, so examine one of
the non-ASCII characters. Is it really utf-8 encoded?

>>> my_string = u"snö".encode("utf-8")
>>> my_string[0]
's'
>>> my_string[1]
'n'
>>> my_string[2]
'\xc3'
>>> my_string[3]
'\xb6'


Since you feel pretty confident that you're getting what you expect
out of the database, maybe you want to eliminate that from
consideration. As a test, construct "by hand" a string that represents
the email message you're trying to send. If you send that with the
proper content-type header and you still don't get the results you
want, then we can all stop discussing the database. Make sense?

Forget about the HTML markup, too. That's just a distraction. Start
with the simplest problem first, and then add pieces on.

See if you can successfully construct and send an email that says
"Hello world" in English/ASCII. If that works, change it to Arabic. If
that works, change the email format to HTML. If that works, start
pulling the content from the database. If that works, then you're
done. =)

bye
Philip
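
The staged approach above can be rehearsed without sending anything. This Python 3 sketch (the thread itself uses Python 2.5) builds a one-part message for each stage, serializes it, parses it back, and checks that the body survives. The helper round_trips and the sample bodies are made up for illustration, and the final database stage is left out because it needs a live connection.

```python
from email import message_from_string
from email.mime.text import MIMEText


def round_trips(body, subtype):
    """Serialize a one-part UTF-8 message and check the body survives parsing."""
    message = MIMEText(body, subtype, "utf-8")
    parsed = message_from_string(message.as_string())
    return parsed.get_payload(decode=True).decode("utf-8") == body


# Each stage adds one complication to the previous one.
stages = [
    ("ASCII, plain text", "Hello world", "plain"),
    ("Arabic, plain text", "\u0645\u0631\u062d\u0628\u0627", "plain"),
    ("Arabic, HTML", "<p>\u0645\u0631\u062d\u0628\u0627</p>", "html"),
]

for label, body, subtype in stages:
    print(label, round_trips(body, subtype))
```

The first stage that prints False would be the one to look at more closely.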








 