Location Database help!

Associate
Joined
25 Aug 2008
Posts
947
Hi,

I am creating a site that uses location data. The easiest way to make it consistant around the globe would be by the use of lat/long details - but I would like to incorporate a worldwide postcode format if possible as well.

I will try to raise my concern as an example(not related to my site).

Lets say my mobile gives the site my lat/lon details, and I want to know all the indian takeaways in a 1km radius.

I could setup a calculation in my SQL statement to pull back results, but thats a lot of calculations - especially if the database has thousands of entries.

So to speed this up, I could for example create a box around my location 2km by 2km. That way I could say if a<lattitude<b and c<longitude<d then display results. Results not so accurate, but that should prove a faster search right? With a lot of users this would still slow the server - so could I make this faster again?

Accuracy is not so much my concern here.

My only thought of making this any faster, would be to 'grid' up the globe, on a variety of scales. I could then say my current location is in grid box A, I want to search a distance of Y, so my search results should return all entries in grid boxes ....

Does this seem feesible? To speed things up, I think I could have a database that stores say grid box A, and then has a subset of grid boxes to include, based on a search of distances 1k/5k/10k/25k/50k and so on.

Would this be faster, and require less use of the server? The onlything I am not sure about then would be the space required in the database.
 
Associate
OP
Joined
25 Aug 2008
Posts
947
At the equator, Earth's circumference is 24,901.55 miles (40,075.16 km). It is slightly smaller between the North and South poles at 24,859.82 miles (40,008 km). Earth's diameter at the poles is 7,899.80 miles (12,713.5 km) while it is 7,926.28 miles (12,756.1 km) at the equator.
 
Associate
Joined
29 Dec 2004
Posts
420
Location
Fife, Scotland
I would use WGS84 GPS co-ordinates. That makes it easy for anyone, anywhere to get their location to find information, or to record it.

How to calculate the distance between two GPS co-ordinates is well-documented.

Gridding up - how would that work for somewhere on the edge of the grid? But you could calculate 10k (or whatever) away in each direction to use as a filter searching the database rather than calculate the distance for each searched location. If you want you could calculate these co-ordinates for a number of distances from each point and store them in the database when you store the main data about the point.

Performance with lots of users will need accurate indexing.

Another approach might be to find points within certain distances when you add a point, and store them as you save each bit of data. That would give a table which joined two points in the main table along with the distance between them, but you would want to use the approach above to filter which points to calculate the distance otherwise storing a new point would become a very slow process.


If time is of interest record it in GMT and convert to local time where necessary.
 
Last edited:
Caporegime
Joined
18 Oct 2002
Posts
32,623
What you want is to use a spatial index.
In my line of work I use an R*tree as a hierarchical spatial tree composed of bounding a boxes and points. With this tree I can make spatial queries in logarithmic time, and easily make spatial search queries such as "Find the n-nearest entries" using a kind of alpha-beta (min-max) pruning of the search tree.

There are other spatial index structures like quad trees etc, but rtree a are fast and convenient for irregular spatial densities, clustering etc. quad trees work well with a more uniform distribution.

I know there are some Rtree SQL libraries out there that we considered using but we rolled are own library for speed reasons, there was never really a need for SQL.


There are lots of caveats and gotchas though. When you use spherical coordinates intersection with a bounding box is not a simple bounds test, nor is calculating distances (you need to use the haversine function to find the great circle distance). Plus dealing with date line is a pain.
 
Associate
OP
Joined
25 Aug 2008
Posts
947
Thanks for the comments.

cats_five - I want to avoid calculations for every user as much as possible, a spacial index does sound what I am looking for.
D.P - I eventually expect to hold hundred of thousands of entries in a database, I dont think having to recalculate each time would be a good idea.

I need to think of the spacial index more - I might start limiting the location details to say LAT/LONG/AREA/TOWN/REGION/COUNTRY.

The zoom level on the map might help with choosing the field to use to reduce the results.
 
Soldato
Joined
28 Oct 2006
Posts
12,456
Location
Sufferlandria
So to speed this up, I could for example create a box around my location 2km by 2km. That way I could say if a<lattitude<b and c<longitude<d then display results. Results not so accurate

What about using this idea to narrow down the results but then perform the accurate distance calculations on them.
You'll still get the same accuracy overall but a much smaller number of calculations.
 
Back
Top Bottom