Web Scraper Extraction Speed?

Hi,

I have a custom exe file coded in Perl that extracts data from an online database via screen scraping and returns it to a local PC as a text file.

The problem I am having is with the amount of time it takes to extract the online data each month. On a Sony VAIO AR51 it takes 8 hours, but on all the other PCs I have used it takes approx 36 hours.

There is something about the Sony VAIO configuration that speeds up the whole thing by a factor of about 4. The other PCs I have tried are of both higher and lower spec than the Sony VAIO, so I am a bit puzzled by it all. I no longer have the Sony VAIO and would like to get one of my other PCs to process the web scraping as fast as the Sony used to.

If this is better discussed in one of the software or hardware areas, would a mod please move it to the most appropriate area? As it is to do with the web and programming, I thought here may be best.

Has anyone got any ideas what could be going on? I do not know if it is related to any settings in the BIOS that could affect the process. I tested all the speeds using the same router and phone line etc., and started the process at the same time of day.

Thx
Binty
 
BlackDragon,

Thx for the link, I will check out the factors involved to see if I can find anything relevant.

Thx
Binty
 
It's nothing to do with hardware. It'll be some fudge of a compile to get cURL working with Perl on Windoze.

A 108k webpage should take about 4 seconds on even a sh** connection like mine. So unless the database you're scraping is HUGE, it's unlikely you need to spend 8 hours getting it.
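To put rough numbers on that, here is a quick back-of-envelope sketch in Python. It only uses the figures quoted in this thread (about 4 seconds per page, 8 hours and 36 hours per run) and assumes pages are fetched one after another; the function name `pages_in` is made up for illustration.

```python
# Back-of-envelope check using only the figures quoted in this thread.
# Assumption: pages are fetched strictly one at a time.
SECONDS_PER_PAGE = 4  # rough figure quoted above for a 108k page


def pages_in(hours, seconds_per_page=SECONDS_PER_PAGE):
    """Number of sequential page fetches that fit into the given hours."""
    return int(hours * 3600 // seconds_per_page)


for hours in (8, 36):
    print(f"{hours} h at {SECONDS_PER_PAGE} s/page = {pages_in(hours)} pages")
```

So an 8 hour run only makes sense at that rate if the scraper is pulling several thousand pages. If it is fetching far fewer than that, the time is going somewhere other than the download itself.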

Download Cygwin, install the curl libraries and try

curl www.bbc.co.uk > testfile

that'll tell you how long it takes to download the Beeb front page.
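If you'd rather not install Cygwin, the same check can be sketched in Python with nothing but the standard library. This is only a rough equivalent of the curl test, not what the Perl exe does internally; `time_fetch` and its optional `fetch` argument are names invented here for illustration.

```python
import time
import urllib.request


def time_fetch(url, fetch=None):
    """Return (elapsed_seconds, byte_count) for a single download of url.

    `fetch` is an optional stand-in downloader (hypothetical hook, handy for
    timing a different HTTP client); by default urllib is used.
    """
    if fetch is None:
        def fetch(u):
            # Plain HTTP GET with a 30 second timeout.
            with urllib.request.urlopen(u, timeout=30) as resp:
                return resp.read()

    start = time.perf_counter()
    body = fetch(url)
    elapsed = time.perf_counter() - start
    return elapsed, len(body)


# Example call (needs a network connection):
#   elapsed, size = time_fetch("http://www.bbc.co.uk")
#   print(f"{size} bytes in {elapsed:.2f} s")
```

Run it on the slow PC and on a fast one. If a single page comes down in a few seconds on both, the raw connection is not the bottleneck and the delay is inside the scraper.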

TBH if you've got a brain then you could code a web scraper in half an hour yourself.
 