Velocity Reviews - Computer Hardware Reviews

Unicode string handling problem (revised)

Richard Schulman
The appended program fragment works correctly with an ASCII input
file. But the file I actually want to process is Unicode (UTF-16
encoding). This file must be Unicode rather than ASCII or Latin-1
because it contains mixed Chinese and English characters.

When I run the program I get an attribute_count of zero. This
is incorrect for the input file, which should give a value of fifteen
or sixteen. In other words, the count function isn't recognizing the
characters to be counted in the line read.

Here's the program:

in_file = open("c:\\pythonapps\\","rU")
# Skip the first line; make the second available for processing
in_line = in_file.readline()
attribute_count = in_line.count('",')
print attribute_count

Any suggestions?

Richard Schulman
(delete 'xx' characters for email reply)
John Machin

Richard Schulman wrote:
> in_line = in_file.readline()


We'd already deduced that that line was incorrectly published.
Please don't start new threads like this; if you want to make a
correction, do a couple-of-lines reply to your original message.
Now please leave this new thread alone, and reply to the
much-more-meaningful questions in the original thread.
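
A note on the symptom described in the question: the thread does not confirm the cause, but a count of zero is what one would expect if the UTF-16 file is read as raw bytes, because each ASCII character is then separated from its neighbor by a zero byte and the two-byte sequence '",' never appears. The sketch below assumes that is what is happening. The helper count_attributes, the sample line, and the temporary file are hypothetical, and io.open is assumed to be available (Python 2.6 or later, or Python 3).

```python
import io
import os
import tempfile


def count_attributes(path, encoding="utf-16"):
    """Count occurrences of '",' in the first line of a text file.

    The file is decoded with the given encoding before counting, so the
    search runs on characters rather than on raw bytes.
    """
    with io.open(path, "r", encoding=encoding) as in_file:
        in_line = in_file.readline()
    return in_line.count(u'",')


# Hypothetical sample line mixing English and Chinese text.
sample = u'"name","\u4e2d\u6587","value"\n'

# Write the sample to a temporary UTF-16 file (a stand-in for the real input).
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with io.open(path, "w", encoding="utf-16") as out_file:
    out_file.write(sample)

# Reading the raw bytes reproduces the symptom: the byte pair is never found.
with io.open(path, "rb") as raw_file:
    raw_count = raw_file.readline().count(b'",')

# Decoding as UTF-16 first finds both separators in the sample line.
decoded_count = count_attributes(path)

os.remove(path)

print(raw_count)      # 0
print(decoded_count)  # 2
```

On Python 2 releases older than 2.6, codecs.open(path, "r", "utf-16") would be the closest equivalent to the io.open call used here.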
