[ANN] scRUBYt! 0.2.3 - Hpricot and Mechanize on steroids
Hello,
I am pleased to announce that the new release of scRUBYt!, 0.2.3, is available for download.

scRUBYt! is a very easy to learn and use, yet powerful, Web scraping framework based on Hpricot and Mechanize. Its purpose is to free you from the drudgery of web page crawling, looking up HTML tags, attributes, XPaths, form names and other typical low-level web scraping woes by figuring these out from your examples copy'n'pasted from the Web page.

The current release has a lot of new features, tons of bugfixes and some shiny new examples: scraping reddit, del.icio.us, rubyforge login and wordpress automatic commenting, for example.

Thanks everybody for the great feedback!

Cheers,
Peter
__
http://www.rubyrailways.com :: Ruby and Web2.0 blog
http://scrubyt.org :: Ruby web scraping framework
http://rubykitchensink.ca/ :: The indexed archive of all things Ruby.
Re: scRUBYt! 0.2.3 - Hpricot and Mechanize on steroids
Peter,
I really, really like scRUBYt! so far, especially after so much scraping using PHP; however, the lack of documentation kills me. Are there going to be more tutorials and examples soon? When I was first testing it I could not believe it was possible to scrape Google with only a few lines, but what about more complicated pages? For example, it isn't possible to scrape Ask.com the same way you do Google because of its markup. What are you supposed to do in those cases?

I know how scraping works but I'm not very experienced with XPath, so it would be really good to have more examples (in their final form, not only the learner form, because the learner versions usually stop working after a while), plus a more detailed explanation of everything that can be done.

Either way, this seems to be shaping up very well and I wish you good luck with it!

Cheers

On Feb 21, 7:46 am, Peter Szinek <p...@rubyrailways.com> wrote:
> Hello,
>
> I am pleased to announce that the new release of scRUBYt!, 0.2.3, is
> available for download.
>
> scRUBYt! is a very easy to learn and use, yet powerful, Web scraping
> framework based on Hpricot and Mechanize. Its purpose is to free you
> from the drudgery of web page crawling, looking up HTML tags,
> attributes, XPaths, form names and other typical low-level web scraping
> woes by figuring these out from your examples copy'n'pasted from the Web
> page.
>
> The current release has a lot of new features, tons of bugfixes and
> some shiny new examples: scraping reddit, del.icio.us, rubyforge login
> and wordpress automatic commenting, for example.
>
> Thanks everybody for the great feedback!
>
> Cheers,
> Peter
> __
> http://www.rubyrailways.com :: Ruby and Web2.0 blog
> http://scrubyt.org :: Ruby web scraping framework
> http://rubykitchensink.ca/ :: The indexed archive of all things Ruby.
Re: scRUBYt! 0.2.3 - Hpricot and Mechanize on steroids
toulax@gmail.com wrote:
> Peter,
>
> I really, really like scRUBYt! so far, especially after so much
> scraping using PHP; however, the lack of documentation kills me. Are
> there going to be more tutorials and examples soon? When I was first
> testing it I could not believe it was possible to scrape Google with
> only a few lines, but what about more complicated pages? For example,
> it isn't possible to scrape Ask.com the same way you do Google because
> of its markup. What are you supposed to do in those cases?

There are tons of examples here: http://rubyforge.org/frs/download.ph...ples-0.2.3.zip

I am also planning to finish the tutorials and add even more docs. What's up with ask.com? Send me what you would like to accomplish and I will help you with it.

> I know how scraping works but I'm not very experienced with XPath, so
> it would be really good to have more examples (in their final form,
> not only the learner form, because the learner versions usually stop
> working after a while),

OK - however, you can always replace the examples with the current ones and export the extractor to get a production one.

> plus a more detailed explanation of everything that can be done.

Yeah, that's my goal too; I am just flooded with everything else... But I am working on it all the time. Please subscribe to the feed at http://scrubyt.org if you would like to be notified when new stuff arrives. I announce everything there.

> Either way, this seems to be shaping up very well and I wish you good
> luck with it!

Great! Please send as much feedback as possible so I can improve the whole thing and add what's missing.

Cheers,
Peter
__
http://www.rubyrailways.com :: Ruby and Web2.0 blog
http://scrubyt.org :: Ruby web scraping framework
http://rubykitchensink.ca/ :: The indexed archive of all things Ruby.