Velocity Reviews - Computer Hardware Reviews

files.py (weird encoding error)

 
 
Νικόλαος Κούρας
      06-10-2013
This all started when I used FileZilla to upload files with Greek filenames
to my remote Linux server, with PuTTY as the SSH client. The locale encoding
was set to Greek ISO (ISO-8859-7), because Windows 8 uses that by default.

Everything works when the filenames in the directory are English.
If I rename an English filename to a Greek one, I get the error shown
at the end of my post.
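
To make the mismatch concrete, here is a minimal sketch of how the same Greek name ends up as different bytes depending on the encoding. The filename is a hypothetical example, not one of the real files, and `decodes_as_utf8` is a helper name made up for this illustration.

```python
# Hypothetical Greek filename, chosen only for illustration.
name = 'Ευχή.mp3'

# The same name stored under two different encodings.
utf8_bytes = name.encode('utf-8')
greek_bytes = name.encode('iso-8859-7')

# Only bytes reprs are printed, so this is safe on an ASCII-only terminal.
print(utf8_bytes)
print(greek_bytes)


def decodes_as_utf8(raw):
    """Return True if the byte string is valid UTF-8."""
    try:
        raw.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True
```

A name uploaded as ISO-8859-7 bytes is therefore not valid UTF-8, which is why the code below tries several encodings in turn. Note that 'latin-1' accepts any byte sequence, so it acts as a catch-all at the end of that list.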

I know you guys know Linux, and there is a good chance you know Python
too, so you may be able to help me out.

Thank you.


Code:
#====================
# Collect directory and its filenames as bytes
path = b'/home/nikos/public_html/data/apps/'
files = os.listdir( path )

for filename in files:
    # Compute 'path/to/filename'
    filepath_bytes = path + filename
    for encoding in ('utf-8', 'iso-8859-7', 'latin-1'):
        try:
            filepath = filepath_bytes.decode( encoding )
        except UnicodeDecodeError:
            continue

        # Rename to something valid in UTF-8
        if encoding != 'utf-8':
            os.rename( filepath_bytes,
                       filepath.encode('utf-8') )

        assert os.path.exists( filepath )
        break
    else:
        # This only runs if we never reached the break
        raise ValueError( 'unable to clean filename %r' %
                          filepath_bytes )


#========================================================
# Collect filenames of the path dir as strings
filenames = os.listdir( '/home/nikos/public_html/data/apps/' )

# Load'em
for filename in filenames:
    try:
        # Check the presence of a file against the database and
        # insert if it doesn't exist
        cur.execute('''SELECT url FROM files WHERE url = %s''',
                    (filename,) )
        data = cur.fetchone()

        if not data:
            # First time for file; primary key is
            # automatic, hit is defaulted
            print( "iam here", filename + '\n' )
            cur.execute('''INSERT INTO files (url, host,
            lastvisit) VALUES (%s, %s, %s)''', (filename, host, lastvisit) )
    except pymysql.ProgrammingError as e:
        print( repr(e) )


#========================================================
# Collect filenames of the path dir as strings
filenames = os.listdir( '/home/nikos/public_html/data/apps/' )
filepaths = set()

# Build a set of the filenames found in the path dir
for filename in filenames:
    filepaths.add( filename )

# Delete spurious
cur.execute('''SELECT url FROM files''')
data = cur.fetchall()

# Check database's filenames against path's filenames
for rec in data:
    # Assuming the default tuple cursor, each row is a tuple such as
    # (url,), so compare its first column with the set of names
    if rec[0] not in filepaths:
        cur.execute('''DELETE FROM files WHERE url = %s''', rec )
When trying to run the above I get:

Code:
[Sun Jun 09 09:37:51 2013] [error] [client 79.103.41.173] Original exception was:, referer:http://superhost.gr/
[Sun Jun 09 09:37:51 2013] [error] [client 79.103.41.173] Traceback (most recent call last):, referer:http://superhost.gr/
[Sun Jun 09 09:37:51 2013] [error] [client 79.103.41.173]   File "/home/nikos/public_html/cgi-bin/files.py", line 83, in <module>, referer:http://superhost.gr/
[Sun Jun 09 09:37:51 2013] [error] [client 79.103.41.173]     assert os.path.exists( filepath ), referer:http://superhost.gr/
[Sun Jun 09 09:37:51 2013] [error] [client 79.103.41.173]   File "/usr/local/lib/python3.3/genericpath.py", line 18, in exists, referer:http://superhost.gr/
[Sun Jun 09 09:37:51 2013] [error] [client 79.103.41.173]     os.stat(path), referer:http://superhost.gr/
[Sun Jun 09 09:37:51 2013] [error] [client 79.103.41.173] UnicodeEncodeError: 'ascii' codec can't encode characters in position 34-37: ordinal not in range(128), refere
Why am I still receiving Unicode decode errors?
I have written a procedure just to avoid decoding issues, by renaming all
Greek-encoded (bytes) filenames to UTF-8 bytes.

Can you help, please?
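
One possible reading of the traceback, sketched below: the error is an encode error, not a decode error, and it is raised when os.stat() has to convert a str path to bytes. If the CGI process runs with an ASCII filesystem encoding (an assumption; sys.getfilesystemencoding() would confirm it), that conversion fails on the first Greek character. The directory is the one from the post; the filename is a hypothetical example.

```python
import os

# Directory taken from the post; the filename is a hypothetical example.
directory = '/home/nikos/public_html/data/apps/'
filepath = directory + 'Ευχή.mp3'

# Roughly what happens when a str path must be converted to bytes
# under an ASCII filesystem encoding (assumed here, not confirmed).
error = None
try:
    filepath.encode('ascii')
except UnicodeEncodeError as e:
    error = e
    print(e)  # the message itself is plain ASCII

# Passing bytes skips the implicit str-to-bytes conversion. The result
# is simply False unless such a file really exists on this machine.
exists = os.path.exists(filepath.encode('utf-8'))
```

The directory string is 34 characters long, so the reported positions 34-37 would correspond to the first four characters of the filename. That is consistent with the failure being in the str-to-bytes conversion rather than in the renaming loop.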
