xpath - getting php to display the text content of a remote table cell

Sorry for the late reply, yep, got it. You should have a working version... hopefully :o.
 
Use Firebug for XPath queries (for example, "/html/body/div/div[7]/div/div/div[4]/ul/li[3]/p"); XPath Checker throws out odd queries.

In the example of grabbing the Premier League table (id 'table1') rows from Sky Sports (HERE) -
Code:
$table = $xpath->query('//table[@id="table1"]/tbody/tr');
foreach($table as $node) {
    print $node->nodeValue."<br />";
}

to grab the columns (it puts them all together so you'd need to do some counting) -
Code:
$table = $xpath->query('//table[@id="table1"]/tbody/tr/td');
foreach($table as $node) {
    print $node->nodeValue."<br />";
}

and to grab the title ('caption') -
Code:
$xpath->query('//table[@id="table1"]/caption')->item(0)->nodeValue;
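On the counting point: one way to avoid counting cells by hand is to loop over the rows and run a second, relative query for the cells of each row. This is only a rough sketch; the markup below is a made-up stand-in for the real table, and the column layout is an assumption.

```php
<?php
// Stand-in markup; the real page and its columns are assumptions.
$html = '<table id="table1"><caption>Premier League</caption><tbody>'
      . '<tr><td>1</td><td>Team A</td><td>30</td></tr>'
      . '<tr><td>2</td><td>Team B</td><td>28</td></tr>'
      . '</tbody></table>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

$rows = [];
foreach ($xpath->query('//table[@id="table1"]/tbody/tr') as $tr) {
    $cells = [];
    // "./td" is evaluated relative to the current row, so cells stay grouped per row
    foreach ($xpath->query('./td', $tr) as $td) {
        $cells[] = trim($td->nodeValue);
    }
    $rows[] = $cells;
}

// One line per row, cells separated by " | "
foreach ($rows as $cells) {
    print implode(' | ', $cells) . "<br />";
}
```

Each entry in $rows is then an array of that row's cells, so $rows[0][1] would be the second column of the first row.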


Also, I'd use cURL, as it's quicker at returning a page's contents than file(), fopen(), etc.
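Something along these lines should do for the cURL fetch. It's a minimal sketch: fetchPage() is a hypothetical helper name, and the timeout and redirect settings are just assumptions you may want to change.

```php
<?php
// Hypothetical helper: fetch a URL with cURL and return the body, or null on failure.
function fetchPage($url, $timeout = 10)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);     // give up after $timeout seconds
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Usage (URL left as a placeholder):
// $html = fetchPage('INSERT URL HERE');
```

The returned string can be passed to DOMDocument::loadHTML() in place of the file_get_contents() result.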
 
Well with the help of Pho we can now grab data from any site, except the one I need which only works on WAMP lol....

I thought php was platform independent? :p
 
Getting these errors now:

Seems I need to tell the script what character encoding to use. Any ideas how to do this?

PHP:
Warning:  DOMDocument::loadHTML() [domdocument.loadhtml]: input conversion failed due to input error, bytes 0x9C 0xAC 0xE8 0xAA in /home/freeonli/public_html/oxygrab_test1.php on line 51

Warning:  DOMDocument::loadHTML() [domdocument.loadhtml]: htmlParseEntityRef: no name in Entity, line: 2 in /home/freeonli/public_html/oxygrab_test1.php on line 51

Warning:  DOMDocument::loadHTML() [domdocument.loadhtml]: htmlParseEntityRef: expecting ';' in Entity, line: 2 in /home/freeonli/public_html/oxygrab_test1.php on line 51

Warning:  DOMDocument::loadHTML() [domdocument.loadhtml]: htmlParseEntityRef: expecting ';' in Entity, line: 2 in /home/freeonli/public_html/oxygrab_test1.php on line 51

Warning:  DOMDocument::loadHTML() [domdocument.loadhtml]: Attribute border redefined in Entity, line: 16 in /home/freeonli/public_html/oxygrab_test1.php on line 51

Warning:  DOMDocument::loadHTML() [domdocument.loadhtml]: input conversion failed due to input error, bytes 0x9C 0xAC 0xE8 0xAA in /home/freeonli/public_html/oxygrab_test1.php on line 51

Warning:  DOMDocument::loadHTML() [domdocument.loadhtml]: encoder errorAttValue: " expected in Entity, line: 16 in /home/freeonli/public_html/oxygrab_test1.php on line 51

Warning:  DOMDocument::loadHTML() [domdocument.loadhtml]: Couldn't find end of Start Tag a in Entity, line: 16 in /home/freeonli/public_html/oxygrab_test1.php on line 51
£0
PHP:
<?php

    Class Scrape
    {
        var $url;
        var $xpathQuery;
        var $xpathResults;

        function __construct($url, $query)
        {
            $this->setURL($url);
            $this->setXpathQuery($query);
        }

        function getURL()
        {
            return $this->url;
        }

        function setURL($url)
        {
            $this->url = $url;
        }

        function getXpathQuery()
        {
            return $this->xpathQuery;
        }

        function setXpathQuery($query)
        {
            $this->xpathQuery= $query;
        }
        
        function getXpathResults()
        {
            return $this->xpathResults;
        }
        
        function setXpathResults($result)
        {
            $this->xpathResults = $result;
        }
        
        function execute()
        {                
            $html = file_get_contents($this->getURL());

            $dom = new DOMDocument();
            @$dom->loadHTML($html);
            $xpath = new DOMXPath($dom);                
            $results = $xpath->query($this->getXpathQuery());
            $this->setXpathResults($results);
        }
    }

// Query site
    $scrape1 = new Scrape(' INSERT URL HERE' ,'//td[@id="tdtestbox"]//td[@class="SingleRowTableCell"][2]');
    $scrape1->execute();
   
    // Get result
    // (for some reason I get: £ 2,400,000 so we'll remove that bit next)
    $output =$scrape1->getXpathResults()->item(0)->nodeValue;
   
    // Remove everything but numbers from $output
    $output = preg_replace("/\D/", "", $output);
   
    // Show the result
    echo '£'.number_format((double)($output));


    
    ?>
The page I'm attempting to scrape has this:

Code:
http-equiv="content-type" content="text/html; charset=windows-1255"
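Given that charset, one thing to try is converting the page to UTF-8 before handing it to loadHTML(). The sketch below assumes the page really is windows-1255 and that your iconv build knows that charset; loadHtmlAs() is a hypothetical helper name and the markup is a made-up stand-in for the real page. If the page contains bytes that are not valid in its declared charset (which the "input conversion failed" warnings hint at), iconv() may fail too, in which case the sketch just falls back to the raw bytes.

```php
<?php
// Hypothetical helper: convert $html from $charset to UTF-8, then parse it.
function loadHtmlAs($html, $charset)
{
    $utf8 = @iconv($charset, 'UTF-8', $html);
    if ($utf8 === false) {
        $utf8 = $html;   // conversion failed; fall back to the raw bytes
    }
    // Point the first charset declaration in the markup at the new encoding.
    $utf8 = preg_replace('/(charset\s*=\s*["\']?)[\w-]+/i', '${1}UTF-8', $utf8, 1);

    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);  // collect warnings instead of printing them
    // The XML-style prefix is a commonly used hint so loadHTML() reads the input as UTF-8.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $utf8);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $dom;
}

// Stand-in for the remote page: 0xA3 is the pound sign in windows-1255.
$page = '<html><head><meta http-equiv="content-type" content="text/html; charset=windows-1255"></head>'
      . "<body><table><tr><td class=\"SingleRowTableCell\">\xA3 2,400,000</td></tr></table></body></html>";

$dom    = loadHtmlAs($page, 'WINDOWS-1255');
$xpath  = new DOMXPath($dom);
$value  = $xpath->query('//td[@class="SingleRowTableCell"]')->item(0)->nodeValue;
$digits = preg_replace('/\D/', '', $value);   // keep digits only

echo '£' . number_format((float) $digits);
```

In the Scrape class this would replace the @$dom->loadHTML($html) call in execute(), with the charset passed in or read from the page's meta tag.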
 