I'm trying to scrape some data (from a bookies website) using XPath and Python.
I have something working quite well, but the output isn't as clean as I need it to be: every value I scrape ends up wrapped in a list by default, and I'm not sure why.
Below is the data structure I'm trying to create through a dictionary:
{ID: [Runner Name, Odds]}
However, this is the structure I'm actually getting:
{[ID]:[[Runner Name], [Odds]]}
Code:
("['1']", ':', [['Imada'], ['11/8']])
("['8']", ':', [['Dubai Celebrity'], ['10/1']])
("['3']", ':', [['Byronegetonefree'], ['50/1']])
("['6']", ':', [['The Fresh Prince'], ['9/4']])
('[]', ':', [[], ['SP']])
("['7']", ':', [['Berkshire Downs'], ['3/1']])
("['4']", ':', [['Coachie Bear'], ['50/1']])
("['2']", ':', [['Ange Des Malberaux'], ['66/1']])
("['5']", ':', [['Grexit'], ['33/1']])
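The keys in that output, such as "['1']", look like what str() produces when it is given a one-item list rather than the item inside it. Here is a minimal sketch of the difference, written for Python 3 and using a made-up list in place of a real query result (the names result, as_key and first are purely illustrative):

```python
# Hypothetical stand-in for what one query returns: a list holding one string.
result = [' 1 ']

# Strip whitespace from every item; the result is still a list.
stripped = [s.strip() for s in result]

# str() on the whole list keeps the brackets and quotes.
as_key = str(stripped)

# Indexing first gives the bare string; guard against an empty result.
first = stripped[0] if stripped else None

print(repr(as_key))
print(repr(first))
```

If that is what is happening, taking the first element before building the key should give the plain ID.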
Here's my code - I'm guessing tree.xpath is pulling the data as a list?
Code:
from fractions import Fraction
from lxml import html
import requests
BookiePage = requests.get('removed')
tree = html.fromstring(BookiePage.content)
RunnersCount = int(tree.xpath('count(//table[@class="market-table js-sort js-toggle__target "]/tbody//tr)'))
NumList = list(range(0, RunnersCount))
BookieData = {}
for x in NumList:
    keyString = str(map(str.strip,tree.xpath('//table[@class="market-table js-sort js-toggle__target "]/tbody/tr[' + str(x + 1) + ']/td[1]/a/span/text()')))
    runnerString = map(str.strip,tree.xpath('//table[@class="market-table js-sort js-toggle__target "]/tbody/tr[' + str(x + 1) +']/td[3]/a/b/text()'))
    Odds = map(str.strip,tree.xpath('//table[@class="market-table js-sort js-toggle__target "]/tbody/tr[' + str(x + 1) +']/td[4]/span[1]/text()'))
    BookieData[keyString] = [runnerString, Odds]

for x in BookieData:
    print (x,':',BookieData[x])
Any help appreciated.
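For reference, here is a rough sketch of one way the {ID: [Runner Name, Odds]} shape might be reached. It assumes Python 3 and relies on lxml's xpath() returning a list of matches for text() queries, so the first match is taken explicitly. The SAMPLE markup, the simplified class name and the first_text helper are all invented for illustration, since the real page URL was removed:

```python
from lxml import html

# Made-up markup standing in for the bookie's page (not the real site).
SAMPLE = """
<table class="market-table"><tbody>
  <tr><td><a><span> 1 </span></a></td><td></td><td><a><b>Imada</b></a></td><td><span>11/8</span></td></tr>
  <tr><td><a><span> 8 </span></a></td><td></td><td><a><b>Dubai Celebrity</b></a></td><td><span>10/1</span></td></tr>
  <tr><td></td><td></td><td></td><td><span>SP</span></td></tr>
</tbody></table>
"""


def first_text(node, path):
    """Return the first stripped text match for path, or None if nothing matched."""
    matches = node.xpath(path)  # a list, even when only one node matches
    return matches[0].strip() if matches else None


tree = html.fromstring(SAMPLE)
bookie_data = {}

# Iterate over the row elements directly instead of counting and indexing them.
for row in tree.xpath('//table[@class="market-table"]/tbody/tr'):
    runner_id = first_text(row, 'td[1]/a/span/text()')
    if runner_id is None:
        continue  # skip rows with no ID, such as the SP row
    bookie_data[runner_id] = [
        first_text(row, 'td[3]/a/b/text()'),
        first_text(row, 'td[4]/span[1]/text()'),
    ]

print(bookie_data)
```

Querying relative to each row also avoids rebuilding the full table path with a row index on every pass.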