Remove some characters from a string

 
 
Julien
 
      07-17-2008
Hi,

I can't seem to find the right regular expression to achieve what I
want. I'd like to remove all characters from a string that are not
numbers, letters or underscores.

For example:

>>> magic_function('si_98%u^d@.as-*gf')

str: 'si_98udasgf'

Would you have any hint?

Thanks a lot!

Julien
 
 
 
 
 
Chris
 
      07-17-2008
On Jul 17, 10:13 am, Julien <jpha...@gmail.com> wrote:
> Hi,
>
> I can't seem to find the right regular expression to achieve what I
> want. I'd like to remove all characters from a string that are not
> numbers, letters or underscores.
>
> For example:
>
> >>> magic_function('si_98%u^d@.as-*gf')
>
> str: 'si_98udasgf'
>
> Would you have any hint?
>
> Thanks a lot!
>
> Julien


One quick and dirty way would be...

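import string

# Characters to keep: ASCII letters, digits, and the underscore.
safe_chars = string.ascii_letters + string.digits + '_'
test_string = 'si_98%u^d@.as-*gf'
# Keep only the characters that appear in safe_chars.
''.join(char for char in test_string if char in safe_chars)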

You could also use a translation table: see string.translate (the
table it uses can be made with string.maketrans). In Python 3 these
are the str.translate method and str.maketrans.
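
To make the translation-table idea concrete, here is a minimal sketch, assuming Python 3, where the table is built with str.maketrans and applied with str.translate. The names SAFE_CHARS and strip_unsafe are made up for illustration and are not from the thread.

```python
import string

# Hypothetical name: the characters we want to keep.
SAFE_CHARS = string.ascii_letters + string.digits + "_"


def strip_unsafe(text, keep=SAFE_CHARS):
    """Return text with every character not in keep removed."""
    # Collect the characters that actually occur in text but are not allowed.
    unwanted = set(text) - set(keep)
    # The third argument of str.maketrans lists characters to delete.
    table = str.maketrans("", "", "".join(unwanted))
    return text.translate(table)


print(strip_unsafe("si_98%u^d@.as-*gf"))
```

The table is rebuilt on every call because the set of unwanted characters is taken from the input itself; if the input alphabet is known in advance, the table could be built once and reused.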
 
 
 
 
 
Paul Hankin
 
      07-17-2008
On Jul 17, 9:13 am, Julien <jpha...@gmail.com> wrote:
> Hi,
>
> I can't seem to find the right regular expression to achieve what I
> want. I'd like to remove all characters from a string that are not
> numbers, letters or underscores.
>
> For example:
>
> >>> magic_function('si_98%u^d@.as-*gf')
>
> str: 'si_98udasgf'


For speed, you can use 'string.translate', but the simplest approach is
a generator expression:

import string

def magic_function(s, keep=string.ascii_letters + string.digits + '_'):
    return ''.join(c for c in s if c in keep)

--
Paul Hankin
 
 
Fredrik Lundh
 
      07-17-2008
Julien wrote:

> I can't seem to find the right regular expression to achieve what I
> want. I'd like to remove all characters from a string that are not
> numbers, letters or underscores.
>
> For example:
>
> >>> magic_function('si_98%u^d@.as-*gf')
>
> str: 'si_98udasgf'


the easiest way is to replace the things you don't want with an empty
string:

>>> re.sub("\W", "", "si_98%u^d@.as-*gf")

'si_98udasgf'

("\W" matches everything that is "not numbers, letters, or underscores",
where the alphabet defaults to ASCII. to include non-ASCII letters, add
"(?u)" in front of the expression, and pass in a Unicode string).

</F>
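
As a rough illustration of the ASCII versus Unicode point, here is a sketch that assumes Python 3, where str patterns are Unicode-aware unless re.ASCII is passed. The ascii_only parameter and the two pattern names are invented for this example.

```python
import re

# re.ASCII limits \w and \W to [a-zA-Z0-9_], even for str patterns.
NON_WORD_ASCII = re.compile(r"\W", re.ASCII)
# Without the flag, a str pattern in Python 3 treats non-ASCII letters as word characters.
NON_WORD_UNICODE = re.compile(r"\W")


def magic_function(s, ascii_only=True):
    """Remove everything that is not a letter, digit, or underscore."""
    pattern = NON_WORD_ASCII if ascii_only else NON_WORD_UNICODE
    return pattern.sub("", s)


print(magic_function("si_98%u^d@.as-*gf"))
# "\u00e9" is a precomposed e with acute accent.
print(magic_function("caf\u00e9_1!", ascii_only=True))
print(magic_function("caf\u00e9_1!", ascii_only=False))
```

Compiling the patterns once avoids re-parsing the expression on every call, which matters only if the function runs in a tight loop.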

 