XML Editing

At work we receive XML files from a number of external bodies, (loosely) conforming to an agreed schema, that are loaded into our case management system. The files contain information about jobs that have been carried out at an address.

As part of the loading process the addresses are matched to a land and property gazetteer. The loader is provided by our software supplier and we don't really have any control over it. If an address doesn't match, it is not processed and has to be matched manually or rejected.

<WorkAddressDetails>
  <WorkAddress>
    <NumberName>1</NumberName>
    <Street>Any Street</Street>
    <TownCity>Any Town</TownCity>
    <PostCode>AB1 2CD</PostCode>
  </WorkAddress>
</WorkAddressDetails>


The problem we have is that the match rate is only about 20% because many of the jobs omit the County tag from the address. If I add this in manually using xmlEditor, the match rate jumps to 80% (the other 20% are due to spelling errors and duff addresses).

What I need is a tool to go through each file and add the County, if it's not already present. All the jobs will be in the same county.

<WorkAddressDetails>
  <WorkAddress>
    <NumberName>1</NumberName>
    <Street>Any Street</Street>
    <TownCity>Any Town</TownCity>
    <County>Any County</County>
    <PostCode>AB1 2CD</PostCode>
  </WorkAddress>
</WorkAddressDetails>

Are there any tools that will allow me to do this automatically? I'm not a programmer other than dabbling a bit in VBA when I have to, so something off the shelf would be good.
 
If you really want to automate this you will need a programmer, but it's dead easy to do in nearly any programming language and wouldn't take very long. It would be a nice, easy task for someone on these forums to have a go at if they have some free time. There are several approaches as well, ranging from a simple find and replace to a full DOM traversal of the XML.

I think it would be handy if you could post a sample of the raw XML you receive, obviously with any details blanked out. That would allow a tool to be written that could be pointed at a directory full of these XML files and then spit the fixed versions out into another directory (if required).
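
For what it's worth, here is a rough sketch of the DOM-traversal route in Python, using only the standard library. It is based purely on the fragment posted above, so treat it as a starting point rather than a finished tool: it assumes the elements are called WorkAddress, TownCity and County with no XML namespace, and that one county applies to every job. The function names and the "Any County" value are made up for illustration. Note that ElementTree drops comments when it parses and may write the XML declaration slightly differently from the original file.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

COUNTY = "Any County"  # assumption: every job is in the same county


def add_missing_county(root, county=COUNTY):
    """Insert <County> after <TownCity> in each <WorkAddress> that lacks one.

    Returns the number of addresses that were changed.
    """
    changed = 0
    for address in root.iter("WorkAddress"):
        if address.find("County") is not None:
            continue  # County already present, leave this address alone
        county_el = ET.Element("County")
        county_el.text = county
        town = address.find("TownCity")
        if town is not None:
            # Place County straight after TownCity and copy its trailing
            # whitespace so the indentation stays roughly the same.
            position = list(address).index(town) + 1
            county_el.tail = town.tail
        else:
            # No TownCity to anchor on: append County as the last child.
            position = len(address)
        address.insert(position, county_el)
        changed += 1
    return changed


def fix_directory(in_dir, out_dir, county=COUNTY):
    """Write a fixed copy of every .xml file in in_dir to out_dir."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(in_dir).glob("*.xml")):
        tree = ET.parse(path)
        add_missing_county(tree.getroot(), county)
        tree.write(out_path / path.name, encoding="utf-8", xml_declaration=True)
```

Calling fix_directory("incoming", "fixed") (both folder names are placeholders) would leave the originals untouched and put the amended copies in the second folder, so a bad run can simply be thrown away. If the real files declare a namespace, the tag names in the script would need qualifying to match.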
 
As RobH has said, it would be pretty simple to write a program to do this for you.

That being said, I'd be more inclined to tell your supplier to sort it out! Unless this is for some reason outside of their control.
 
Take care with how you do the match part - you'll likely need to use the postcode, as you can't trust the town name. For example, there are three Ripleys in the UK: in Yorkshire, Derbyshire and Surrey.
 
I agree with the others: the best solution would be to get your supplier to sort it out or, failing that, have a programmer write a small application.

However, there is a quick workaround that doesn't require any programming knowledge. Just do a search and replace; nearly every text editor has this feature. Bear in mind that a blanket replace will add a second County to any address that already has one.

Search for: </TownCity>
Replace with: </TownCity><County>Any County</County>

Assuming your supplier has handled the reading of the XML in a sensible manner, it won't matter that the County appears on the same line as TownCity. The XML file won't be as tidy, but it should allow your tool to work better.
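
If someone does end up scripting the search and replace, a minimal Python sketch along these lines would skip addresses that already carry a County. It treats the file as plain text rather than parsing it, and it assumes the opening tag is written exactly as <WorkAddress> (no attributes or namespace prefix) and that the County is never written as an empty <County/> tag. The function name and the "Any County" value are only illustrative.

```python
import re

# assumption: one county applies to every job in the file
COUNTY_TAG = "<County>Any County</County>"

# Match each <WorkAddress>...</WorkAddress> block, non-greedy, across newlines.
# The literal ">" keeps this from matching <WorkAddressDetails>.
ADDRESS_BLOCK = re.compile(r"<WorkAddress>.*?</WorkAddress>", re.DOTALL)


def add_county_text(xml_text, county_tag=COUNTY_TAG):
    """Add county_tag after </TownCity> in blocks that have no <County>."""

    def fix(match):
        block = match.group(0)
        if "<County>" in block:
            return block  # already has a County, leave it unchanged
        # Only touch the first </TownCity> in this address block
        return block.replace("</TownCity>", "</TownCity>" + county_tag, 1)

    return ADDRESS_BLOCK.sub(fix, xml_text)
```

Running it twice over the same text gives the same result as running it once, which is the main advantage over a blind replace in a text editor.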
 