Velocity Reviews


Michael Lesser 06-12-2009 08:10 PM

Noob, html trees & parsing
 
Hi all.

Noob, first project, read the Poignant Guide, et al.

I have a big Perl script that parses badly-formed HTML files with
HTML::Element/HTML::Tree. I think it's time for an update.

I think the equivalent in Ruby is Hpricot? I haven't found a lot of docs
on this, so I am assuming that this type of problem is something that
becomes 'obvious' once you start working in Ruby. Or should I be
looking at another/better solution (as in, duh, it's got XXX built-in,
noob...)?

TIA
--
Posted via http://www.ruby-forum.com/.


Sanjay Sharma 06-13-2009 07:27 PM

Re: Noob, html trees & parsing
 
You might want to take a look at html5lib
<http://code.google.com/p/html5lib/> for parsing bad markup.


