Automating webpage interaction - Internet Explorer

Hi all,

At work there is an internal web page that constructs tables based on data that a VB script is pulling. I'm looking at ways of automatically accessing this page, selecting the necessary filters in a couple of drop-downs and then submitting them to create a table of data.

What would be ideal is if I could then store this information somewhere (preferably Excel) and schedule it to run daily. Are there any relatively simple ways of approaching this?

Thanks :)
 
Writing software that interacts with a website as if it were a person (click a link, grab the data, etc.) is perfectly possible - I've done it. But it can be annoying and fiddly. By far the better approach is to do it the "right" way and get the web site to return you data in a structured format such as JSON or XML (JSON is more common and easier to read, but XML is fine).
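
If you do go the "act like a person" route, it usually means driving Internet Explorer through COM from VBA or a VBScript. A rough, untested sketch of the idea - the URL and element IDs below are made up, you'd swap in the real ones from your page:

    ' Drive Internet Explorer: open the page, pick the filters, submit, wait for the table.
    ' "filterA", "filterB" and "btnSubmit" are placeholder IDs - use the real ones from the page.
    Sub PullTableViaIE()
        Dim ie As Object, doc As Object
        Set ie = CreateObject("InternetExplorer.Application")
        ie.Visible = True
        ie.Navigate "http://intranet/report.asp"            ' placeholder URL

        Do While ie.Busy Or ie.readyState <> 4               ' wait for the page to load
            DoEvents
        Loop

        Set doc = ie.document
        doc.getElementById("filterA").Value = "Widgets"      ' set the two drop-downs
        doc.getElementById("filterB").Value = "2016"
        doc.getElementById("btnSubmit").Click                ' submit

        Do While ie.Busy Or ie.readyState <> 4               ' wait for the results to come back
            DoEvents
        Loop
        Set doc = ie.document                                ' re-grab the document after the reload

        ' doc.getElementsByTagName("table") now holds the results; walk its rows/cells into a sheet.
        ie.Quit
    End Sub

It works, but it's exactly the fiddly kind of thing I mean - any change to the page layout breaks it.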

Find whoever wrote the website (or their code, if that's not possible) and write yourself a small web service to return the data that you want. So you'd find the bit that generates the data for the table, copy that code to a new page and, instead of building a table, have it return JSON (or XML). Remember to set the content type of the page you're returning to JSON (or XML). Get your software to call this page rather than the human-readable page and then use a common library to turn the JSON (or XML) into arrays / objects / whatever.
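
To illustrate the consuming side, something like this from VBA - the URL and field names are invented, and VBA has no built-in JSON parser so you'd pull in one such as the open-source VBA-JSON module (JsonConverter.bas):

    ' Call the (hypothetical) JSON endpoint and turn the response into objects.
    ' Assumes the VBA-JSON module is imported and the endpoint returns an array of records.
    Sub PullTableViaJson()
        Dim http As Object, recs As Object, rec As Variant, r As Long
        Set http = CreateObject("MSXML2.XMLHTTP")
        http.Open "GET", "http://intranet/reportdata.asp?filterA=Widgets&filterB=2016", False
        http.Send

        Set recs = JsonConverter.ParseJson(http.responseText)   ' collection of dictionaries

        ' Dump a couple of (assumed) fields straight into the active sheet.
        r = 1
        For Each rec In recs
            Cells(r, 1).Value = rec("name")       ' field names are placeholders
            Cells(r, 2).Value = rec("value")
            r = r + 1
        Next rec
    End Sub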

Trust me, you can write something that goes to a normal webpage, scrapes the data from a table, etc. But it's much cleaner to get the two systems talking to each other machine to machine. That can still be an HTTP (web) call, but you're getting nicely structured data back instead of a web-page.

Search terms you want to be using are things like "JSON", "import JSON to Excel", etc. Excel has built-in tools for connecting to websites. This should be quite doable.
 
ALTERNATIVELY (and this is the wrong way to do it, btw, but it works and will take you sixty seconds):

Assuming you have the latest version of Excel, open a new sheet, go to the Data tab, select "New Query", go to "From Other Sources..." and select "Web".

Enter the URL of the page you want to pull from. After a moment, it should present you with a list of tables present in that page. Select the one that has the data you wanted, edit it if needed, then click Load.
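
If you do go that route and later want it to run on a schedule rather than by hand, the same sort of web import can be driven from a macro - something along these lines (the URL and table index are placeholders):

    ' Scripted version of the "import a table from a web page" trick, using an
    ' old-style "URL;" web query. Adjust the URL and which table you want.
    Sub ImportWebTable()
        Dim ws As Worksheet
        Set ws = ThisWorkbook.Worksheets(1)
        ws.Cells.Clear

        With ws.QueryTables.Add( _
                Connection:="URL;http://intranet/report.asp?filterA=Widgets&filterB=2016", _
                Destination:=ws.Range("A1"))
            .WebSelectionType = xlSpecifiedTables
            .WebTables = "1"                     ' first table on the page
            .WebFormatting = xlWebFormattingNone
            .Refresh BackgroundQuery:=False
        End With
    End Sub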

Congratulations - that should have taken you less than a minute and has solved your problem. You have now cheated and created a system that two years from now somebody will hate you for not planning out properly.
 
Thanks guys for your help. A couple of things to mention, as I might not have been clear / might not fully understand...

I don't have access to the code that is used, and really I think it is best that I don't (due to potential responsibility etc.). In terms of the webpage, I think what happens is that a separate VB script pulls data and stores it in a large table/database. The webpage then lets you interface with the database and filter it into smaller tables.

I'm assuming the Excel option wouldn't work, as the table needs to be generated via the webpage first of all?

Cheers
 
If the web page has the data on it in a table from the start, the Excel approach will work. If you can only get the web page to put the table up by selecting options on the page then you'll need to find a way to do that first. The page might do it via AJAX (dynamic calls without a page load) or it might do it by submitting a form (refreshing the entire page). In either case, you can probably capture the request that is going to the server and send it yourself (unless the page is using a form token which it shouldn't be for a GET request).
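
Once you've captured that request, re-sending it yourself is only a few lines - the URL and parameters below are placeholders, you'd use exactly what the network tab showed:

    ' Re-send the captured GET request directly, no browser involved.
    Sub ResendCapturedRequest()
        Dim http As Object, doc As Object
        Set http = CreateObject("MSXML2.XMLHTTP")
        http.Open "GET", "http://intranet/report.asp?filterA=Widgets&filterB=2016", False
        http.Send

        ' The response is the same HTML the browser would have received. Load it into
        ' a throwaway DOM to get at the table, or feed the same URL to the Excel web import.
        Set doc = CreateObject("htmlfile")
        doc.body.innerHTML = http.responseText
        Debug.Print doc.getElementsByTagName("table")(0).Rows.Length & " rows returned"
    End Sub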

Really, nobody here can give you answers much beyond what we have without actually seeing the site or else giving you an entire free course on web design and how the web works via this forum.

Go to the page - is the data there in a table already? If so, just do the Excel import I suggested. If the data isn't there because it's waiting for you to select things from a drop-down menu, then open the developer tools in your browser (hit F12 in IE or Edge, or install Firebug in Firefox and use that). Switch to the Network tab and see what happens when you do whatever it is you do to get the table to display. You should see a web request go over to the server containing some parameters. That's the page request you want - use that.

Really, I'm at my limit on this now and if the above isn't enough, you need to go away and learn web design properly because if you don't understand form submissions and / or AJAX, you're likely out of your depth (though you can still learn).

Also, we're not all guys, just FYI.
 
Thanks for your response. Apologies if I offended you with my use of the word 'guys'. I understand that not everyone on the forums is male; it's just a word I use to encompass everyone, and it's a bad habit.

In terms of web design I do have a basic knowledge, but that is about it. I will spend time getting it working and learning how to do this, but I just needed to be pointed in the right direction :)

Will give it a go...
 
Not a big deal. I just sometimes get the urge to speak up when people assume I'm male. Especially on technical forums.

Well I hope it's helped. It's honestly sort of the wrong direction. Sometimes you want to tell people that they shouldn't want what they want. ;) You can do it, and I have a feeling, based on it being done with VB and your mention of select boxes, that this is just a page that reads the GET parameters passed to it and reloads itself with the appropriate data. If that's the case, you'll even be able to just copy and paste the URL, parameters and all, into your Excel import. You'd obviously need to change that for each of the different queries, but there are ways to automate that.
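
By "ways to automate that" I mean nothing fancier than building the URL in a little macro and looping over whatever filter values you need - roughly like this (the filter names and values are invented):

    ' Run the same web import once per filter value by rebuilding the URL each time.
    ' Substitute the real query-string parameters and values for your page.
    Sub PullAllQueries()
        Dim filters As Variant, f As Variant, ws As Worksheet

        filters = Array("Widgets", "Sprockets", "Gadgets")

        For Each f In filters
            Set ws = ThisWorkbook.Worksheets.Add
            ws.Name = f
            With ws.QueryTables.Add( _
                    Connection:="URL;http://intranet/report.asp?filterA=" & f & "&filterB=2016", _
                    Destination:=ws.Range("A1"))
                .WebSelectionType = xlSpecifiedTables
                .WebTables = "1"
                .Refresh BackgroundQuery:=False
            End With
        Next f
    End Sub

Stick that in a workbook, have Task Scheduler open the workbook each morning with the macro hooked to Workbook_Open, and that covers the "run daily" part too.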

It might be simpler to ask the developer (if you don't want to do it yourself) to just pass the results back via JSON or XML, however. That would be the "proper" way to do it. But good luck with it. It sounds like it should be fairly straightforward, so don't get discouraged if something doesn't work first time; that's, well, that's how programming goes. ;)
 