On 2006-07-21, Jim Cochrane <allergic-to-> wrote:
> On 2006-07-21, Logan Capaldo <> wrote:
>>
>> On Jul 21, 2006, at 3:40 PM, Jim Cochrane wrote:
>>
>>> On 2006-07-21, Theallnighter Theallnighter
>>> <> wrote:
>>>> Hi all,
>>>> How can I delete all non-alphanumeric characters in a string? Thanks
>>>>
>>> ...
>>> #!/usr/bin/ruby
>>>
>>> x = "There are 2007 beans and 15234 grains of rice in this bag."
>>> puts x
>>> x.gsub!(/\W/, '')
>>> puts x
>>> ...
>>>
>>>
>>
>> Well the only "problem" with that is
>>
>> x = '\w includes_under_scores_too'
>>
>
> Whoa! Thanks for pointing that out. It looks like
> http://www.ruby-doc.org/docs/ruby-do...rg/regexp.html
> has a bug:
>
> \w letter or digit; same as [0-9A-Za-z]
>
> It's missing a _.
>
> Here's a fixed version:
>
>
> #!/usr/bin/ruby
>
> x = "There are 2007 beans_and 15234 grains of rice in this bag."
> puts x
> x.gsub!(/\W/, '')
> puts x
> x.gsub!(/\W|_/, '')
> puts "fixed:"
> puts x
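Oops - the above has a bug (although it still "works"): the first gsub!
modifies x in place, so the second gsub! runs on the already-stripped
string rather than on the original. Here's a fixed version that works on
separate copies, with an opposite example further demonstrating the bug
in the ruby-doc site: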
#!/usr/bin/ruby

s = "There are 2007 beans_and 15234 grains of rice in this bag."
x = s.dup
y = s.dup

puts "original:"
puts x

x.gsub!(/\W/, '')
puts "\nbroken:"
puts x

y.gsub!(/\W|_/, '')
puts "\nfixed:"
puts y

puts "\nopposite:"
z = s.dup
z.gsub!(/\w/, '')
puts z
--
original:
There are 2007 beans_and 15234 grains of rice in this bag.
broken:
Thereare2007beans_and15234grainsofriceinthisbag
fixed:
Thereare2007beansand15234grainsofriceinthisbag
opposite:
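
For what it's worth, here is a rough sketch of a few other ways to get the
same "fixed" result without relying on \W at all. The helper name
alnum_only is made up for illustration, and all three variants assume that
"alphanumeric" means ASCII letters and digits only:

```ruby
#!/usr/bin/ruby

# Hypothetical helper (not from the thread): keeps only ASCII letters and
# digits, returning a new string and leaving the argument untouched.
def alnum_only(str)
  str.gsub(/[^0-9A-Za-z]/, '')   # explicit negated character class
end

s = "There are 2007 beans_and 15234 grains of rice in this bag."

a = alnum_only(s)
b = s.delete('^0-9A-Za-z')       # String#delete with a negated set, no regexp
c = s.gsub(/[\W_]/, '')          # same idea as /\W|_/, as one character class

puts a
puts b
puts c
```

All three use the non-destructive forms (gsub, delete), so s itself is
left alone and there is no need for dup.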