Python - XPath help

I'm trying to scrape some data (from a bookies website) using XPath and Python.

I have something working quite well, but the output isn't as clean as I need it to be: all the data I'm scraping is being stored as a list by default and I'm not sure why.

Below is the data structure I'm trying to create through a dictionary:
{ID: [Runner Name, Odds]}

However, this is the structure I'm actually getting:
{[ID]:[[Runner Name], [Odds]]}

Code:
("['1']", ':', [['Imada'], ['11/8']])
("['8']", ':', [['Dubai Celebrity'], ['10/1']])
("['3']", ':', [['Byronegetonefree'], ['50/1']])
("['6']", ':', [['The Fresh Prince'], ['9/4']])
('[]', ':', [[], ['SP']])
("['7']", ':', [['Berkshire Downs'], ['3/1']])
("['4']", ':', [['Coachie Bear'], ['50/1']])
("['2']", ':', [['Ange Des Malberaux'], ['66/1']])
("['5']", ':', [['Grexit'], ['33/1']])

Here's my code - I'm guessing tree.xpath is pulling the data as a list?

Code:
from fractions import Fraction
from lxml import html
import requests

BookiePage = requests.get('removed')
tree = html.fromstring(BookiePage.content)

RunnersCount = int(tree.xpath('count(//table[@class="market-table  js-sort js-toggle__target  "]/tbody//tr)'))

NumList = list(range(0, RunnersCount))

    
BookieData = {}
for x in NumList:
    keyString = str(map(str.strip,tree.xpath('//table[@class="market-table  js-sort js-toggle__target  "]/tbody/tr[' + str(x + 1) + ']/td[1]/a/span/text()')))
    runnerString = map(str.strip,tree.xpath('//table[@class="market-table  js-sort js-toggle__target  "]/tbody/tr[' + str(x + 1) +']/td[3]/a/b/text()'))
    Odds = map(str.strip,tree.xpath('//table[@class="market-table  js-sort js-toggle__target  "]/tbody/tr[' + str(x + 1) +']/td[4]/span[1]/text()'))
    BookieData[keyString] = [runnerString, Odds]

for x in BookieData:
    print (x,':',BookieData[x])

Any help appreciated.
 
I don't have Python experience, but when using XPath expressions from Java or directly in XSL, the results are pretty much always lists. The evaluator can't know from the expression alone that it will match a single value, and you don't get any way to provide a "hint" that only a single value is expected back either.
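
For what it's worth, lxml appears to follow the same rule. Here is a minimal sketch, assuming lxml is installed and using a made-up two-row table rather than the real page (the markup and variable names are hypothetical). Node-set expressions such as text() come back as a list even when only one node matches, while XPath functions like count() and string() come back as a single value.

```python
from lxml import etree

# Hypothetical stand-in for the bookie's table (not the real markup).
doc = etree.fromstring(
    "<table>"
    "<tr><td>1</td><td>Imada</td><td>11/8</td></tr>"
    "<tr><td>8</td><td>Dubai Celebrity</td><td>10/1</td></tr>"
    "</table>"
)

names = doc.xpath("//tr[1]/td[2]/text()")    # node-set: a list, even for one match
missing = doc.xpath("//tr[9]/td[2]/text()")  # no match: an empty list
rows = doc.xpath("count(//tr)")              # XPath number: a float
name = doc.xpath("string(//tr[1]/td[2])")    # XPath string: a single string

print(names, missing, rows, name)
```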

If you want a single value, you probably want to wrap the XPath call in something that takes the first entry as the result, with error handling for the cases where there are zero entries or more than one.
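
A rough sketch of such a wrapper in Python. The name single and its default parameter are my own invention, not part of lxml. It only assumes the XPath call hands back a list of strings, as in the output posted above.

```python
def single(results, default=None):
    """Return the only item of an XPath result list, stripped of whitespace.

    Returns `default` when the list is empty and raises ValueError when
    there is more than one match.
    """
    if not results:
        return default
    if len(results) > 1:
        raise ValueError("expected one match, got %d" % len(results))
    return results[0].strip()


# Lists shaped like the tree.xpath(...) results shown in the question.
runner_id = single(["1"])
runner_name = single([" Imada "])
no_match = single([])  # e.g. the row that printed '[]' for its ID

print(runner_id, runner_name, no_match)
```

In the loop from the question this would be used as keyString = single(tree.xpath(...)), without the outer str() call. Calling str() on a list looks like what produces keys such as "['1']". Rows where the ID comes back as None (like the 'SP' row) could then be skipped before being added to the dictionary.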
 