Scrapy/XPath help

 
 
Always Learning
 
      12-21-2012
Hello all. I'm new to Python, but have been playing around with it for a few weeks now, following tutorials, etc. I've spun off on my own and am trying to do some basic web scraping. I used Firebug/View XPath in Firefox for help with the XPaths; however, I am still receiving errors when I try to run this script. If you could help, it would be greatly appreciated!

from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from cbb_info.items import CbbInfoItem, Field

class GameInfoSpider(BaseSpider):
    name = "game_info"
    allowed_domains = ["www.sbrforum.com"]
    start_urls = [
        'http://www.sbrforum.com/betting-odds/ncaa-basketball/',
    ]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        toplevels = hxs.select("//div[@class='eventLine-value']")
        items = []
        for toplevels in toplevels:
            item = CbbInfoItem()
            item["teams"] = toplevels.select("/span[@class='team-name'/text()").extract()
            item["lines"] = toplevels.select("/div[@rel='19']").extract()
            item.append(item)
        return items
 
Grant Rettke
 
      12-21-2012
You might have better luck if you share the Python make, version, OS,
error message, and some unit tests demonstrating what you expect.

--
Grant Rettke | ACM, AMA, COG, IEEE
(E-Mail Removed) | http://www.wisdomandwonder.com/
Wisdom begins in wonder.
((λ (x) (x x)) (λ (x) (x x)))
 
Always Learning
 
      12-21-2012
Sorry about that. I'm using Python 2.7.3, 32-bit, on Windows 7.

The errors I get are:

>>File "C:\python27\lib\site-packages\scrapy-0.16.3-py2.7.egg\scrapy\selector\lxmlsel.py", line 47, in select
>>raise ValueError("Invalid XPath: %s" % xpath)
>>exceptions.ValueError: Invalid XPath: /span[@class='team-name'/text()


Ultimately, I expect it to gather the team name as text, and then the odds from one of the columns as text as well, so I can then put them into a .csv file.
 
 
Dave Angel
 
      12-22-2012
On 12/21/2012 04:58 PM, Always Learning wrote:
> Sorry about that. I'm using Python 2.7.3, 32-bit, on Windows 7.
>
> The errors I get are
>
>>> File "C:\python27\lib\site-packages\scrapy-0.16.3-py2.7.egg\scrapy\selector\lxmlsel.py", line 47, in select
>>> raise ValueError("Invalid XPath: %s" % xpath)
>>> exceptions.ValueError: Invalid XPath: /span[@class='team-name'/text()

> Ultimately, I expect it to gather the team name as text, and then the odds from one of the columns as text as well, so I can then put them into a .csv file.


Why are you displaying only the last 3 lines of the error message?
Unless your source code is lxmlsel.py, there are other stack levels
above this one.

(I can't help, but I'm trying to save some time for someone who can)
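
To illustrate the point about stack levels, here is a minimal sketch that is not taken from the posted spider. The functions `select` and `parse` are hypothetical stand-ins for the library call and the user's own method; `traceback.format_exc()` from the standard library returns the whole stack, including the caller frames that a truncated paste leaves out.

```python
import traceback


def select(xpath):
    # Hypothetical stand-in for a library call that rejects a malformed expression.
    raise ValueError("Invalid XPath: %s" % xpath)


def parse():
    # Hypothetical stand-in for user code; a truncated paste hides this frame.
    return select("/span[@class='team-name'/text()")


def full_traceback():
    try:
        parse()
    except ValueError:
        # format_exc() returns every frame, outermost call first.
        return traceback.format_exc()


if __name__ == "__main__":
    print(full_traceback())
```

The frame for `parse` appears above the frame for `select`, which is the part that shows where in the user's own code the bad expression came from.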

--

DaveA

 
 
donarb
 
      12-25-2012
On Friday, December 21, 2012 1:58:47 PM UTC-8, Always Learning wrote:
> The errors I get are
> >>File "C:\python27\lib\site-packages\scrapy-0.16.3-py2.7.egg\scrapy\selector\lxmlsel.py", line 47, in select
> >>raise ValueError("Invalid XPath: %s" % xpath)
> >>exceptions.ValueError: Invalid XPath: /span[@class='team-name'/text()

You're missing the closing square bracket after the predicate in the XPath expression:

/span[@class='team-name']/text()
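
Beyond the bracket, the posted loop likely has a few other problems: a path that starts with "/" is evaluated from the document root rather than from the current node, the loop variable reuses the name of the list it iterates over, and `item.append(item)` was presumably meant to be `items.append(item)`. The following is a rough sketch of those fixes, not the Scrapy code itself. It uses the standard-library `xml.etree.ElementTree` so that it runs without Scrapy, and the markup in `SAMPLE`, the placeholder values, and the function name `extract_items` are all assumptions, since the real page structure is not shown in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup and placeholder values; the real page may differ.
SAMPLE = """
<html><body>
  <div class="eventLine-value">
    <span class="team-name">Team A</span>
    <div rel="19">Line A</div>
  </div>
  <div class="eventLine-value">
    <span class="team-name">Team B</span>
    <div rel="19">Line B</div>
  </div>
</body></html>
"""


def extract_items(markup):
    root = ET.fromstring(markup.strip())
    items = []
    # A distinct loop variable keeps the list being iterated intact.
    for row in root.findall(".//div[@class='eventLine-value']"):
        item = {
            # Relative paths (no leading "/") search inside the current row.
            "teams": [s.text for s in row.findall("span[@class='team-name']")],
            "lines": [d.text for d in row.findall("div[@rel='19']")],
        }
        # Append to the result list, not to the item itself.
        items.append(item)
    return items


if __name__ == "__main__":
    for item in extract_items(SAMPLE):
        print(item)
```

In the Scrapy spider, the equivalent change would be to drop the leading "/" from the two inner expressions (or write them as "./span[...]"), assuming the spans and divs are direct children of each matched row.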
 