Parsing text with regular expression

 
 
Sebastian probst Eide
 
      04-29-2007
Hi
I am writing a class that parses text. It checks each word and counts
how many times it occurs in the text. It also checks for 'special'
words, meaning words that are capitalized, all upper case, or in mixed
case, and adds a flag to those words. Words that are not special must
meet a certain length requirement. The information is stored in a hash
like this:

{'word' => {:count => 1, :special => false},
 'other_word' => {:count => 3, :special => true}}

Everything is working fine so far. The thing I am struggling to
implement is the following:
I want to check the context each 'special' word appears in, to see
whether a capitalized word is capitalized only because it is the first
word of a new sentence, or something like that.

I thought I could check by looking for something like this:

text =~ /[[:punct:]]\s?WORD_I_AM_LOOKING_FOR/
and if I got a match (anything other than nil) it would mean that the
word is at the beginning of a sentence. But how do I insert a variable
into the regular expression? Or is there a different, much cleverer way
to do this sort of check?

Currently I am scanning for each word like this:

_inn.scan(/\w{2,}[-\w]?/i) do |word|
...
end

and then doing the checking of the words inside that iterator.
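
For readers following along, here is a minimal sketch of how the scan loop and the hash described above might fit together. The method name `count_words`, the `min_length` parameter and its default of 3 are assumptions for illustration only; the original class was not posted.

```ruby
# Hypothetical helper, not the poster's actual class.
# Builds {'word' => {:count => n, :special => true/false}} from a string.
# min_length stands in for the unspecified "length requirement".
def count_words(text, min_length = 3)
  words = {}
  text.scan(/\w{2,}[-\w]?/) do |word|
    # Treat a word as 'special' if it contains any ASCII upper-case letter
    # (covers capitalized, all upper case, and mixed case).
    special = !!(word =~ /[A-Z]/)
    # Non-special words must meet the length requirement.
    next if !special && word.length < min_length
    # Keys are case-sensitive, so "The" and "the" are counted separately.
    entry = (words[word] ||= { :count => 0, :special => special })
    entry[:count] += 1
  end
  words
end

counts = count_words("The cat saw NASA and the cat ran")
```

With this sketch, `counts["cat"]` holds a count of 2 with `:special => false`, while `counts["NASA"]` is flagged as special.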

Hope you have understood my problem and that you can point me in the
right direction.

best regards
Sebastian

--
Posted via http://www.ruby-forum.com/.

 
 
 
 
 
Timothy Hunter
 
      04-29-2007
Sebastian probst Eide wrote:
> I thought I could check by looking for something like this:
>
> text =~ /[[:punct:]]\s?WORD_I_AM_LOOKING_FOR/
> and if I got a match (anything other than nil) it would mean that the
> word is in the beginning of a sentence. But how do I insert a variable
> into the regular expression?

Use #{}, like this

word = "hello"

text =~ /[[:punct:]]\s?#{word}/

The contents of "word" are inserted as regular expression source, so it can hold any regular expression.
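
Because the interpolated string is treated as regular expression source, a word taken from arbitrary text may contain characters such as "." that have a special meaning. A small sketch of the check, using `Regexp.escape` from Ruby's core library to match the word literally; the sample text and variable names are made up for illustration:

```ruby
text = "We met at noon. Ruby is fun."

# Interpolate the word; Regexp.escape makes it match literally.
word = "Ruby"
after_punct = !(text =~ /[[:punct:]]\s?#{Regexp.escape(word)}/).nil?

# "fun" follows a space, not punctuation, so the pattern does not match.
mid_sentence = !(text =~ /[[:punct:]]\s?#{Regexp.escape("fun")}/).nil?

# Without escaping, each "." in the word acts as a wildcard.
abbrev = "U.S."
loose  = !("UXSX" =~ /\A#{abbrev}\z/).nil?
strict = !("UXSX" =~ /\A#{Regexp.escape(abbrev)}\z/).nil?
```

Here `after_punct` and `loose` come out true, while `mid_sentence` and `strict` come out false. Note that `=~` returns the match position or nil, so the result is compared against nil rather than 0. A word at the very start of the text has no punctuation before it, so that case would need separate handling, for example an added `\A` alternative.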

 
 
 
 
 
Sebastian probst Eide
 
      04-29-2007
Timothy Hunter wrote:
> Sebastian probst Eide wrote:
>> I thought I could check by looking for something like this:
>>
>> text =~ /[[:punct:]]\s?WORD_I_AM_LOOKING_FOR/
>> and if I got a match (anything other than nil) it would mean that the
>> word is in the beginning of a sentence. But how do I insert a variable
>> into the regular expression?

> Use #{}, like this
>
> word = "hello"
>
> text =~ /[[:punct:]]\s?#{word}/
>
> The contents of "word" are inserted as regular expression source, so it can hold any regular expression.


Huh... that was the first thing I tried. I must have done something else
wrong in the same expression, because it didn't work. I'll try
again.
Thanks Timothy

Sebastian

--
Posted via http://www.ruby-forum.com/.

 
 
 
 