Alphabetics respect to a given locale

 
 
candide
 
      04-01-2011
How can I retrieve the list of all characters defined as alphabetic for
the current locale?
 
Emile van Sebille
 
      04-01-2011
On 4/1/2011 1:55 PM candide said...
> How to retrieve the list of all characters defined as alphabetic for the
> current locale ?


I think this is supposed to work, but for whatever reason it doesn't
for me when I test it after changing my locale (I suspect that's a
CentOS thing)...

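import locale
import string

# Adopt the user's default locale from the environment.
locale.setlocale(locale.LC_ALL, '')
# Python 2 only: string.lowercase is documented as locale-dependent.
print string.lowercase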

I don't see where else this might be exposed in Python.

However, you can test if something is alpha:

>>> val = u'caf' u'\xE9'
>>> val.isalpha()
True


... and check its Unicode category:

>>> import unicodedata
>>> unicodedata.category(u'a')
'Ll'    # Letter, lowercase
>>> unicodedata.category(u'A')
'Lu'    # Letter, uppercase
>>> unicodedata.category(u'1')
'Nd'    # Number, decimal digit
>>> unicodedata.category(u'\x01')
'Cc'    # Other, control
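
A minimal sketch, assuming Python 3 (where chr replaces unichr and every str is Unicode), of how the category check above could be turned into a list of letters for a range of code points. The helper name letters_in_range is hypothetical, and the result is not locale-aware: it only reflects the Unicode database.

```python
import unicodedata


def letters_in_range(start, stop):
    """Return the characters in [start, stop) whose Unicode general
    category is a letter category (Lu, Ll, Lt, Lm or Lo)."""
    return [
        chr(cp)
        for cp in range(start, stop)
        if unicodedata.category(chr(cp)).startswith("L")
    ]


# Letters among the first 256 code points (the Latin-1 range).
latin1_letters = letters_in_range(0x00, 0x100)
print("".join(latin1_letters))
```

Because the filter depends only on the Unicode category, letters that a given language never uses (such as the Latin capital letter eth, U+00D0) are still included.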


HTH,

Emile

 
 
 
 
 
candide
 
      04-02-2011
On 01/04/2011 22:55, candide wrote:
> How to retrieve the list of all characters defined as alphabetic for the
> current locale ?



Thanks for the responses. Alas, neither solution works.

Under Ubuntu:

# ----------------------
import string
import locale

print locale.getdefaultlocale()
print locale.getpreferredencoding()

locale.setlocale(locale.LC_ALL, "")

print string.letters

# Character class of every BMP code point that unicode.isalpha() accepts.
letter_class = u"[" + u"".join(
    unichr(c) for c in range(0x10000) if unichr(c).isalpha()) + u"]"

#print letter_class
# ----------------------

prints the following:


('fr_FR', 'UTF8')
UTF-8
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz


I commented out the printing of letter_class because it outputs a flood
of characters that do not belong to the usual French character set.


The problem is more or less the same under Windows. For instance,
string.letters reports the "latin capital letter eth" as an alphabetic
character, which is wrong for French: we never use this letter in true
French words.
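
One possible workaround, sketched here in Python 3 and not taken from any locale database, is to write the alphabet out by hand and build the character class from it. The names FRENCH_LOWER, FRENCH_LETTERS and word_re are hypothetical, and the letter list is an assumption about what counts as "the usual French character set" (the 26 basic letters plus the accented letters and ligatures used in French spelling).

```python
import re

# Assumption: a hand-maintained list of French letters, NOT derived from
# the current locale. Adjust it to whatever your application needs.
FRENCH_LOWER = "abcdefghijklmnopqrstuvwxyzàâæçéèêëîïôœùûüÿ"
FRENCH_LETTERS = FRENCH_LOWER + FRENCH_LOWER.upper()

# None of these characters is special inside [...], so no escaping is needed.
letter_class = "[" + FRENCH_LETTERS + "]"
word_re = re.compile(letter_class + "+")

# Split a sample sentence into runs of French letters.
print(word_re.findall("l'été à Noël"))
```

Unlike str.isalpha(), this class rejects the eth and any other letter that is missing from the explicit list, at the cost of having to maintain that list yourself.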



 