using regular expressions...

 
 
soldier.coder
 
      11-11-2008
I have the following code:

require 'open-uri'

def scrape_table(html)
  %r{</thead.*?>(.*?)</table>}m =~ html
  $1
end

def scrape_case(a_line)
  %r{(<a\s.*?\d{6}-\d{2}'>\d{6}-\d{2}<\/a>)}m =~ a_line
  $1
end

if $0 == __FILE__

  url = 'http://localhost:8080/tests/raw.html'
  page = open(url) # open the url like a file
  text = page.read # read it into one string
  my_table = scrape_table(text) # grab or "scrape" the table
  my_link = scrape_case(my_table) # grab a html that includes a 6-2 digit number (ex: 080910-15)
  puts(my_table) # prints out my_table -- which contains the table information
  puts("\n")
  puts(my_link)

end

The code grabs the one table contained in my URL, then looks for an
HTML link that includes a number made of 6 digits, followed by a dash,
followed by 2 digits (for example, 080910-15). I'm fairly certain the
regex in scrape_case() grabs more than one HTML link, if more than one
is in the table. Is there any way I can grab all those links into an array?
 
 
 
 
 
Peter Szinek
 
      11-11-2008

On 2008.11.11., at 14:22, soldier.coder wrote:

> I have the following code:
>
> require 'open-uri'
> def scrape_table(html)
> %r{</thead.*?>(.*?)</table>}m =~html
> $1
> end
>
> def scrape_case(a_line)
> %r{(<a\s.*?\d{6}-\d{2}'>\d{6}-\d{2}<\/a>)}m =~ a_line
> $1
> end


> Is there any way I can grab all those links into an array?


Sure - String#scan is your friend:

def scrape_case(a_line)
a_line.scan(/<a\s.*?\d{6}-\d{2}'>\d{6}-\d{2}<\/a>/)
end

ex:

>> "<a href='123456-78'>123456-78</a> here is another: <a href='111111-99'>111111-99</a>".scan(/<a\s.*?\d{6}-\d{2}'>\d{6}-\d{2}<\/a>/)
=> ["<a href='123456-78'>123456-78</a>", "<a href='111111-99'>111111-99</a>"]
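
To make the difference from the original =~ version concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration (the real table comes from the poster's URL), and the capture-group variant at the end is an optional extra, not something the original code does. It assumes the links look like the ones in the example above.

```ruby
# Hypothetical sample input standing in for the scraped table.
html = "<td><a href='case?id=080910-15'>080910-15</a></td>" \
       "<td><a href='case?id=080911-02'>080911-02</a></td>"

# Same pattern as in scrape_case above.
link_re = /<a\s.*?\d{6}-\d{2}'>\d{6}-\d{2}<\/a>/

# =~ stops at the first match, so $~ only ever describes one link.
html =~ link_re
first_link = $~[0]

# scan keeps going and returns every non-overlapping match in an array.
all_links = html.scan(link_re)

# If the pattern contains a capture group, scan returns an array of
# arrays of captures instead, so flatten gives a plain list of the
# captured case numbers.
case_numbers = html.scan(/<a\s.*?\d{6}-\d{2}'>(\d{6}-\d{2})<\/a>/).flatten

puts first_link
puts all_links.inspect
puts case_numbers.inspect
```

Note that scan only changes shape when the regex has groups: without parentheses you get the whole matched strings, with them you get only what the groups captured.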


HTH,
Peter


 