SQL Thought

Associate
Joined
25 Aug 2008
Posts
947
Hi,

Working on a large scale database and had the following thought...

When storing items such as postcode, would it be better to store it in stages?

Like TD 15 5BY OR TD15 5BY or TD155BY


I'm trying to create a coordinate system, that given latt/long it will spit out a list of results giving the postcode.

So i could have the table:-
Code:
CREATE TABLE postcode
{
     P1 varchar(2),
     p2 Int,
     p3 varchar(3)
}
and search :-
Code:
SELECT * 
FROM postcode
WHERE P1='EH'
AND P2=15

or have the following table:-
Code:
CREATE TABLE postcode
{
     P1 varchar(7)
}
and the following search

Code:
SELECT * 
FROM postcode
WHERE P1 CONTAINS ('"EH15*" ')

I might just end up running tests to find out which is more efficient...
 
We just use the full 'TD155BY' format for our postcodes.

Cutting the post code up into multiple column's has pro's and con's

a pro you might reduce search results (thus the time need to pull information back)
a con you will need the extra AND statement, possibly slowing it slightly.

I really don't think efficiency will be effected much either way, more how you would like it to appear.
 
As you have suggested, the solution is to test both approaches, though I suspect that most modern DB engines will level the playing field and performance will be similar.

I guess the way in which you will typically query the data will have the biggest impact.

If you typically only search on full postcodes, using a single column will be slightly faster through the omission of one Join during searches. If you are going to frequently query by Postal District(TD15), splitting into two columns will be marginally faster because you don't need to expensive operators like LIKE, CONTAINS etc.

Perhaps you want even greater resolution? Do you also want to search by Postal Area (TD) or Postal Sector (TD155)? Many GIS data suppliers will explicitly list each component in their datasets.

What are you going to use the Postcodes for? Find the nearest? Thematic mapping? Travelling Salesmen.

SQL Databases prefer narrow-but-long tables, so ideally you don't want extra columns if they aren't going to be needed. Equally, since fuzzy operators (LIKE, CONTAINS etc) are expensive, you need to be wary of being forced to use them if you opt for as single Postcode column.

So my answer is a question: How are you going to use the data?

EDIT: I assume you have purchased a geocoded dataset (PAF-related presumably) to link to? How does it break down postcodes?
 
Last edited:
So my answer is a question: How are you going to use the data?

I'm working on an interlaced website with google maps to place its own dataset on top of the map, given latt/long coordinates...

Users should be able to syphon the information they want to search for/find.

Bar that I dont really want to give away more of what I am building for.

I intend to run tests on various setups to check the bandwith/response times, was just wondering anyones general opinion on the construction of the underlying database.
 
Back
Top Bottom