Sorting a String Alphabetically in SQL?

Soldato
Joined
14 Apr 2003
Posts
4,950
Location
Deepest Yorkshire
Hi guys, just wondering if there were any sql gurus in here, I have a small problem that I'm stumped on...

OK basically I'm making an anagram solver... I put in an anagram e.g. 'odg' and it tells me things that match: 'dog', 'god'.
In order to do this I'm ordering the strings alphabetically and comparing them.
so:
odg >> dgo
dog >> dgo
god >> dgo

and then you can easily see if they match....

in order to make this run faster I want to have an 'anagram' column in the database with the word pre sorted.

I am currently trying to do this in PHP with a loop which goes through each row, sorts the word, then does an update statement to enter the anagram in.

However the DB has 750k words in and this is incredibly slow. Is there any way to do this in pure SQL for more speed?

The current sorting algorithm is like so: implode(sort(explode($word))). as far as I know SQL doesn't have any similar functionality.

This is MySQL btw.

Cheers!
Hamish
 
I'm not a SQL guru but I doubt there is such string manipulation ability in SQL (no doubt someone will prove me wrong in a minute).

However to improve the speed I would do it in batches, rather than complete one word at a time, extract a large number, sort the strings and then update them all at the same time.

It will reduce the amount of SQL chatter between your application and MySQL
 
Couple of caveats to start with

1) I'm a DB2 expert, not MySQL so this might not fly
2) I may have misunderstood the requirement.

The key to this is going to be getting stuff out of the database quickly and the best way of doing that is to get the database design right. If you have a lookup table that contains 28 columns - 1 for the solution word, 1 for the number of letters in that word then 26 integer columns which contain the number of occurrences of each letter of the alphabet for that word. A wee bit of up front code to parse your anagram then the SQL becomes (using the odg example)

Code:
select word from wordlist where length=3 and d=1 and g=1 and o=1

That will return every three letter word which contains 1 d, 1 g and 1 o.

After that it's just about tuning - a single column index on the count and each of the letter columns - and if MySQL supports index anding it'll fly.
 
Back
Top Bottom