Extracting text using BeautifulSoup
Greetings all.
Working with data from 'http://www.finviz.com/quote.ashx?t=SRS', I was able to get the info using re, but I thought BeautifulSoup would be a more elegant approach. I'm having a bit of a problem, though.

I'm trying to extract this text:

    SMA20 -1.77%
    SMA50 -9.73%

by keying on the body=[...] part of the td's title attribute, e.g. <td ... title="... body=[Distance from 20-Day Simple Moving Average] ...">.

From:

-----------------------HTML snippet------------------------------------------------------------
<td width="7%" class="snapshot-td2-cp" align="left" title="cssbody=[tooltip_short_bdy] cssheader=[tooltip_short_hdr] body=[Distance from 20-Day Simple Moving Average] offsetx=[10] offsety=[20] delay=[300]">
  SMA20
</td>
<td width="8%" class="snapshot-td2" align="left">
  <b> <span style="color:#aa0000;"> -1.77% </span> </b>
</td>
<td width="7%" class="snapshot-td2-cp" align="left" title="cssbody=[tooltip_short_bdy] cssheader=[tooltip_short_hdr] body=[Distance from 50-Day Simple Moving Average] offsetx=[10] offsety=[20] delay=[300]">
  SMA50
</td>
<td width="8%" class="snapshot-td2" align="left">
  <b> <span style="color:#aa0000;"> -9.73% </span> </b>
</td>
-----------------------HTML snippet------------------------------------------------------------

Using:

    import urllib
    from BeautifulSoup import BeautifulSoup

    archives_url = 'http://www.finviz.com/quote.ashx?t=SRS'
    archives_html = urllib.urlopen(archives_url).read()
    soup = BeautifulSoup(archives_html)
    t = soup.findAll('table')
    for table in t:
        g.write(str(table.name) + '\r\n')
        rows = table.findAll('tr')
        for tr in rows:
            g.write('\r\n\t')
            cols = tr.findAll('td')
            for td in cols:
                ret = str(td.find(name='title'))
                g.write('\t\t' + str(td) + '\r\n')
    g.close()

Total failure of course. Any ideas?

Thanks in advance...
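One possible approach, offered as a rough sketch rather than a definitive answer: in the posted HTML, title is an attribute of the td, not a child tag, so td.find(name='title') looks for a nested <title> element and finds nothing. The sketch below instead filters td cells by their title attribute and reads the value from the following sibling cell. It assumes the newer bs4 package (imported as `from bs4 import BeautifulSoup`) rather than the BeautifulSoup 3 module used in the post, and it runs against a trimmed copy of the snippet instead of the live page, whose markup may have changed. The names SNIPPET and extract_sma are hypothetical, chosen only for illustration.

```python
import re

from bs4 import BeautifulSoup  # assumption: bs4, not the BeautifulSoup 3 module from the post

# Trimmed copy of the cells quoted in the post, wrapped in a table row (illustrative only).
SNIPPET = """
<table><tr>
<td class="snapshot-td2-cp" title="cssbody=[tooltip_short_bdy] body=[Distance from 20-Day Simple Moving Average] offsetx=[10]"> SMA20 </td>
<td class="snapshot-td2"> <b> <span style="color:#aa0000;"> -1.77% </span> </b> </td>
<td class="snapshot-td2-cp" title="cssbody=[tooltip_short_bdy] body=[Distance from 50-Day Simple Moving Average] offsetx=[10]"> SMA50 </td>
<td class="snapshot-td2"> <b> <span style="color:#aa0000;"> -9.73% </span> </b> </td>
</tr></table>
"""


def extract_sma(html):
    """Return {label: value} for cells whose title mentions a Simple Moving Average."""
    soup = BeautifulSoup(html, "html.parser")
    pattern = re.compile(r"body=\[Distance from \d+-Day Simple Moving Average\]")
    result = {}
    # Filter on the title *attribute*; a compiled regex is searched within its value.
    for label_td in soup.find_all("td", title=pattern):
        # The percentage sits in the next <td> sibling, nested inside <b><span>.
        value_td = label_td.find_next_sibling("td")
        if value_td is not None:
            result[label_td.get_text(strip=True)] = value_td.get_text(strip=True)
    return result


if __name__ == "__main__":
    for label, value in extract_sma(SNIPPET).items():
        print(label, value)
```

With the legacy BeautifulSoup 3 module the same idea would presumably be written with findAll and an attrs dictionary, but that variant is not shown or tested here.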