Problem with a regular expression

 
 
charles.nadeau@gmail.com
      10-13-2006
I have the following code snippet:

require 'net/http'
begin
hdoc =
Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?s=Dupont&t=S&m=US'))

re = /<TD>(.*)</TD>/
if hdoc =~ re
print "#{$&}\n"
else
print "Nothing\n"
end
end

The regular expression is never matched when I use the code as shown
above (the expression for re is just a simple one for my testing).
However, if I replace the variable name hdoc by a string like
"<TD>Test</TD>Test1", the regular expression is matched. The type of
hdoc is String. What is wrong with the snippet above? I even tried to
replace hdoc by hdoc.to_s and it still doesn't work.
Thanks for your help!

Charles
------
http://radio.weblogs.com/0111823/
http://charlesnadeau.blogspot.com/

 
Aaron Patterson
      10-13-2006
Hi,

On Sat, Oct 14, 2006 at 02:55:10AM +0900, (E-Mail Removed) wrote:
> I have the following code snippet:
>
> require 'net/http'
> begin
> hdoc =
> Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?s=Dupont&t=S&m=US'))
>
> re = /<TD>(.*)</TD>/
> if hdoc =~ re
> print "#{$&}\n"
> else
> print "Nothing\n"
> end
> end
>
> The regular expression is never matched when I use the code as shown
> above (the expression for re is just a simple one for my testing).
> However, if I replace the variable name hdoc by a string like
> "<TD>Test</TD>Test1", the regular expression is matched. The type of
> hdoc is String. What is wrong with the snippet above? I even tried to
> replace hdoc by hdoc.to_s and it still doesn't work.
> Thanks for your help!


It looks like there are no upper case "TD" tags in the page that you are
fetching. Try this instead:

require 'net/http'

begin
  hdoc = Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?s=Dupont&t=S&m=US'))

  # Escape the slash in the closing tag and add "i" to ignore case.
  re = /<TD>(.*)<\/TD>/i
  if hdoc =~ re
    print "#{$&}\n"
  else
    print "Nothing\n"
  end
end

Your regular expression was case sensitive. I changed it to be case
insensitive by adding the "i" switch.
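
A small sketch of the difference the "i" switch makes. The sample string and the helper name matched_text are made up for illustration and only stand in for the fetched page:

```ruby
# Returns the matched text, or nil when the pattern does not match.
def matched_text(text, pattern)
  match = pattern.match(text)
  match && match[0]
end

# Hypothetical stand-in for the fetched page; the real page differs.
SAMPLE_HTML = '<tr><td>Name:</td></tr>'

p matched_text(SAMPLE_HTML, /<TD>(.*)<\/TD>/)   # case sensitive: nil
p matched_text(SAMPLE_HTML, /<TD>(.*)<\/TD>/i)  # case insensitive: "<td>Name:</td>"
```

With lower-case tags in the input, only the second pattern finds anything.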

>
> Charles
> ------
> http://radio.weblogs.com/0111823/
> http://charlesnadeau.blogspot.com/
>
>


--
Aaron Patterson
http://tenderlovemaking.com/

 
Morton Goldberg
      10-13-2006

On Oct 13, 2006, at 1:55 PM, (E-Mail Removed) wrote:

> require 'net/http'
> begin
> hdoc =
> Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?
> s=Dupont&t=S&m=US'))
>
> re = /<TD>(.*)</TD>/
> if hdoc =~ re
> print "#{$&}\n"
> else
> print "Nothing\n"
> end
> end


When I substitute '\/TD' for '/TD' and make the regex case
insensitive, I get a match. See below:

<code>
#! /usr/bin/env ruby -w
require 'net/http'
hdoc = Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?s=Dupont&t=S&m=US'))
re = /<TD>(.*)<\/TD>/i ### note changes
if hdoc =~ re
puts "#{$&}\n"
else
puts "Nothing\n"
end
</code>

<result>
<td><table border="0" cellpadding="6" width="100%"
cellspacing="0"><tr><td bgcolor="#556f93"><big><b
style="color:#ffffff">Symbol Lookup </b></big></td></tr></table></
td></tr><tr><td></td></tr></table></td></tr><tr><td><table
cellpadding="0" border="0" cellspacing="0"><tr><td></td></tr></
table></td></tr><tr><td valign="top"><form><table border="0"
cellpadding="4" bgcolor="a0b8c8" cellspacing="1"><tr><td
bgcolor="eeeeee"><table cellpadding="1" width="100%"
cellspacing="0"><tr><td>Name:</td><td>Type:</td><td>Market:</td><td></
td></tr><tr><td><input size="30" name="s"></td><td><select
name="t"><option selected value="S"> Stocks </option><option
value="E"> ETFs </option><option value="I"> Indices </option><option
value="M"> Mutual Funds </option><option value="F"> Futures </
option></select></td><td><select name="m"><option selected
value="US">U.S. & Canada</option><option value="ALL">World Market</
option></select></td><td><input value="Look Up" type="submit"></td></
tr><tr><td valign="bottom" colspan="4"><small><a href="http://
finance.yahoo.com/exchanges">View supported exchanges</a></small></
td></tr></table></td></tr></table></form><table><tr><td
align="left">2 results for <b>'Dupont'</b> (type=<b>Stocks</b>,
market=<b>U.S. &amp; Canada</b>)</td></result>

Regards, Morton



 
David Vallner
      10-13-2006
> re = /<TD>(.*)</TD>/


Use an HTML parser? Hpricot is considered sexy these days.

David Vallner


charles.nadeau@gmail.com
      10-13-2006
Morton Goldberg wrote:
> On Oct 13, 2006, at 1:55 PM, (E-Mail Removed) wrote:
>
> > require 'net/http'
> > begin
> > hdoc =
> > Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?
> > s=Dupont&t=S&m=US'))
> >
> > re = /<TD>(.*)</TD>/
> > if hdoc =~ re
> > print "#{$&}\n"
> > else
> > print "Nothing\n"
> > end
> > end

>
> When I substitute '\/TD' for '/TD' and make the regex case
> insensitive, I get a match. See below:
>
> <code>
> #! /usr/bin/env ruby -w
> require 'net/http'
> hdoc = Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?s=Dupont&t=S&m=US'))
> re = /<TD>(.*)<\/TD>/i ### note changes
> if hdoc =~ re
> puts "#{$&}\n"
> else
> puts "Nothing\n"
> end
> </code>
>
> <result>
> <td><table border="0" cellpadding="6" width="100%"
> cellspacing="0"><tr><td bgcolor="#556f93"><big><b
> style="color:#ffffff">Symbol Lookup </b></big></td></tr></table></
> td></tr><tr><td></td></tr></table></td></tr><tr><td><table
> cellpadding="0" border="0" cellspacing="0"><tr><td></td></tr></
> table></td></tr><tr><td valign="top"><form><table border="0"
> cellpadding="4" bgcolor="a0b8c8" cellspacing="1"><tr><td
> bgcolor="eeeeee"><table cellpadding="1" width="100%"
> cellspacing="0"><tr><td>Name:</td><td>Type:</td><td>Market:</td><td></
> td></tr><tr><td><input size="30" name="s"></td><td><select
> name="t"><option selected value="S"> Stocks </option><option
> value="E"> ETFs </option><option value="I"> Indices </option><option
> value="M"> Mutual Funds </option><option value="F"> Futures </
> option></select></td><td><select name="m"><option selected
> value="US">U.S. & Canada</option><option value="ALL">World Market</
> option></select></td><td><input value="Look Up" type="submit"></td></
> tr><tr><td valign="bottom" colspan="4"><small><a href="http://
> finance.yahoo.com/exchanges">View supported exchanges</a></small></
> td></tr></table></td></tr></table></form><table><tr><td
> align="left">2 results for <b>'Dupont'</b> (type=<b>Stocks</b>,
> market=<b>U.S. &amp; Canada</b>)</td></result>
>
> Regards, Morton


Morton, Aaron,

You are both right, thanks a lot! I also added "m" at the end of the
regular expression to match whatever might span two lines.
Cheers!

Charles
------
http://radio.weblogs.com/0111823/
http://charlesnadeau.blogspot.com/
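
For anyone wondering what the "m" switch changes: in Ruby it lets "." match a newline as well. A minimal sketch, assuming a made-up two-line cell in place of the real page; the helper first_cell and the non-greedy ".*?" variant are additions for illustration, not something used in the thread:

```ruby
# Returns the text captured by the first group, or nil when nothing matches.
def first_cell(text, pattern)
  match = pattern.match(text)
  match && match[1]
end

# Hypothetical markup whose first cell spans two lines.
MULTILINE_HTML = "<td>Dupont\nInc</td><td>DD</td>"

# Without "m", "." stops at the newline, so the first cell is skipped
# and the capture is "DD".
p first_cell(MULTILINE_HTML, /<td>(.*)<\/td>/i)

# With "m", "." also matches the newline. The greedy ".*" then runs on
# to the last </td>, so the capture covers both cells.
p first_cell(MULTILINE_HTML, /<td>(.*)<\/td>/im)

# A non-greedy ".*?" stops at the first </td> and captures only the
# first cell, newline included.
p first_cell(MULTILINE_HTML, /<td>(.*?)<\/td>/im)
```

The greedy behaviour is also why the result posted earlier in the thread runs across so many cells in one match.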

 