basic coding query

Soldato
Joined
18 Aug 2004
Posts
6,739
Location
The Toilet
Im looking to programme something that will automatically retrieve certain data from within a word document that is sent in a standard format daily, and then storing this data in a database.

Thats all I need help thinking about at the moment, the next phase of the programme will be to plot the coordinates(which are given in email) for each of the entries on a map, and then display a little point on the map which can be hovered over, bringing a popup box up which then displays other information which came with the coordinates, such as information, dates, times and its unique identifier.


Basically im flirting with the idea of making something that I can load the daily email we get at work with all the NOTAMs(notice to airmen) into, which then processes the individual ones and plots them automatically, replacing the need to replot them everyday, and having to sort through them manually for anything that is on aircraft flightpaths.
Other handy features will be that it can automatically flag up anything within certain areas, streamlining work for other flight planning tasks.

Anyway the question.. what would you as a programmer use to get the information from word into a database, and would there be something you would prefer to use to do the whole thing with, including the plotting and everything.

Im not that clued up on programming anymore, havnt done any since I finished uni last year, and didnt really touch it too much over 3rd and 2nd years seriously, but I learn fast and im willing to learn.
Not so much a deadly serious project, just something Iv flirted with doing as it will make life at work easier, and look great for me if I do it well, itll also give me something to do when im bored too.. even at work where they wont mind me sitting playing with it!
 
Do you have any control of the incoming file? If so get it out of word and into a plain text file of some format - word uses a proprietary file format so you'll be limited by what is out there to let you read it programatically. A flat file on the other hand is pretty simple to work with, either to load straight into a database if it's in a sensible format or to manipulate into a format that can be loaded easily.
 
Me being silly didnt bother looking at what file format it was sent as, I would hope its not a .doc, if it was it wouldnt be too much of a pain to save it as a RTF or something.

The data is stored inside a table within the email, but its only got 3 columns or something, and generally everything is lumped into the main column bar the identifier and something else(which i cant remember)

What would you load it straight into a database with? Can access or something similar be set to do this automatically?
 
oh forgot to add, we get the file emailed but after that we do have full control over it, being able to edit it and what not.
 
Do you have access to whatever is generating the co-ordinates for the e-mail? If you just put the data straight to a web page at source it would be a lot easier than trying to read a Word Doc. The e-mail can then reference the web page output.

Rgds
 
Connect to the database using ADO/JDO, then simply have it update whatever fields you want and return values using SQL.

It's hard to suggest what to use without knowing what tools you have available.

Personally, I would ASP it and have the page on the intranet.

If your company doesn't have a server they will let you use or that supports ASP then you could create a stand-alone App in VB.net, Java or one of the C variants. Which one to use would depend on your familiarity with the language and whether the IT guys will let you have the IDE/VM on your computer.
 
Last edited:
I'd hack about with something like

Firstly automate the process of saving the word document to a specified directory (can't help you there)
Install antiword

then you'll want to do something like this in your directory

antiword *.doc | perl getCoOrds.pl

That'll convert the word documents into a relatively decent text format, and pipe it into a perl script.

In your perl script, prey to god the word documents contain a standardized co-ordinate format. So say it's \s[0-9]\.[0-9]{5}\s+\s[0-9]\.[0-9]{5}\s

you'll want your Perl script to do something like

while(my $line = <STDIN>){
if($line =~ /(\s[0-9]\.[0-9]{5})\s+\s([0-9]\.[0-9]{5})\s/){
print "Lat = $1, Long=$2\n";
}

}

You can easily add database drivers to Perl to upload those co-ordinates to a MySQL database or such

In terms of displaying those points - you could look at exporting them from the database in some sort of KML file for Google Earth

HTH

D
 
What would I do?

Well you've already mentioned Word, so I'd hazard a guess you're using Office, so to extract the emails, information from the Word file, push it into a database, then I'd using nothing more than VBA to do ALL of this for me.

Don't see why the hell you'd need to introduce ASP, PERL and anything else. Office has all the tools you'll need to extract the data from Word, push it into a database, create KML files or even hyperlinks to maps.google.com etc etc.
 
What would I do?

Well you've already mentioned Word, so I'd hazard a guess you're using Office, so to extract the emails, information from the Word file, push it into a database, then I'd using nothing more than VBA to do ALL of this for me.

Don't see why the hell you'd need to introduce ASP, PERL and anything else. Office has all the tools you'll need to extract the data from Word, push it into a database, create KML files or even hyperlinks to maps.google.com etc etc.

If you can show me how to extract co-ords from a word document automatiicaly and push that to a MySQL databsae without using an anything but MS products I will be surprised to say the least
 
If you can show me how to extract co-ords from a word document automatiicaly and push that to a MySQL databsae without using an anything but MS products I will be surprised to say the least
If we're allowed to factor in the MySQL Connector for ODBC (you can't say *NO* MS products since MySQL is not an MS product itself) you could simply use VBA as mentioned...

With the correct SDKs VBA is actually quite powerful. We use it to document our Business Objects Universes. Although it is hideously slow.
 
If you can show me how to extract co-ords from a word document automatiicaly and push that to a MySQL databsae without using an anything but MS products I will be surprised to say the least

I don't see a problem.

The word document can be opened via a VBA macro running in outlook that checks the recipient and opens the word document.

The co-ords are text, and will have some pattern like the 18th line down, or after a word saying XCo:, something like that, so read the remainder of that line into a variable and do the same for the next.

Open up a connection to ANY database using an ODBC connection, again all still done within VBA, push the data and sorted.

Again, don't see the problem!
 
Well you've already mentioned Word, so I'd hazard a guess you're using Office, so to extract the emails, information from the Word file, push it into a database, then I'd using nothing more than VBA to do ALL of this for me.
VBA was my initial thought as well. If I simply wanted to get information from an email or document into a database I would definitely use VBA, however, it seems the original poster has some more features planned for it that may not suit VBA.


Don't see why the hell you'd need to introduce ASP, PERL and anything else. Office has all the tools you'll need to extract the data from Word, push it into a database, create KML files or even hyperlinks to maps.google.com etc etc.
Perl is amazingly good at parsing text files. ASP would allow him access to grid drawing classes that are easily modifiable and even a GoogleMaps control. :)

Everyone here has started making assumptions about what the original poster is going to be doing. The company he works for might not use Google Maps for a start...

As I said earlier

It's hard to suggest what to use without knowing what tools you have available.

If we're allowed to factor in the MySQL Connector for ODBC (you can't say *NO* MS products since MySQL is not an MS product itself) you could simply use VBA as mentioned...

With the correct SDKs VBA is actually quite powerful. We use it to document our Business Objects Universes. Although it is hideously slow.
Again we do not know what db software he is going to using, the original post mentioned Access. Let's wait to hear back with more information.

I don't see a problem.

The word document can be opened via a VBA macro running in outlook that checks the recipient and opens the word document.

The co-ords are text, and will have some pattern like the 18th line down, or after a word saying XCo:, something like that, so read the remainder of that line into a variable and do the same for the next.

Open up a connection to ANY database using an ODBC connection, again all still done within VBA, push the data and sorted.

Again, don't see the problem!
I agree with some of your points, a macro or two could turn out to the best solution. Not sure about parsing Word docs for co-ordinates though, I'd just avoid the minefield altogether and get it into csv or tab delimited format.

Once the op replies with more information we should be able to work out quite quickly what is going to be the most effective way to approach this.
 
Last edited:
Im fairly sure iv got access on my account, and ill upload one of the documents that I had received last week after iv checked theres nothing restricted on it.

Its MOD for the record, so I will be limited as to what I can get on the computer as far as applications and what not.

Theres some great replies but there seems to be people who are just worrying about getting the information into a database and leaving the rest for some other application to then process this into a database, and then theres others who are looking at doing the full thing and further suggesting things I could do the whole plotting etc with.

Some great help from everyone so far, thanks people :)
 
I don't see a problem.

The word document can be opened via a VBA macro running in outlook that checks the recipient and opens the word document.

The co-ords are text, and will have some pattern like the 18th line down, or after a word saying XCo:, something like that, so read the remainder of that line into a variable and do the same for the next.

Open up a connection to ANY database using an ODBC connection, again all still done within VBA, push the data and sorted.

Again, don't see the problem!

With the most respect that's probably the most hideous solution I've ever heard. Firstly your going to automatically open word documents with a VBA macro (massive security issues with this).

VBA is garbage at regular expressions so god help him if the formatting was non-standard, Perl kicks every scripting language in the regex department.

Last but not least it'll be slllloooowwwwww. On top of that how does then plot the points?

IMHO if someone who worked with me suggested that as a solution to this problem, I'd slap so hard, they'd go to their grave with an imprint of my palm on their face:p
 
Back
Top Bottom