Can this query be optimised at all?

Caporegime
Joined
18 Oct 2002
Posts
25,287
Location
Lake District
Basically what this query does is take all the keywords from one image then searches the entire db and orders them by weighting, but it takes several seconds each time and is really taxing on the CPU. Any ideas on how to make this faster and more efficient.

Code:
SELECT SQL_CALC_FOUND_ROWS main.*, MATCH(KEYWORDS) AGAINST ('Coast 019390-08 AF Abersoch Activity Beach Beache Boat Britain British Card Coast Coastal Coastline Destination Greeting Horizontal Isle Kingdom Mountain Resort Sailing Sea Seascape Seaside Summer Summery UK United VC08 VW08 Water Welsh West Western Beache Boating Boat Coast Coastal Coastline Coast Resort Sand Seascape Seaside Summery Tidal Tidal Tourism Tourist Touristy Travel Western' IN BOOLEAN MODE) AS score FROM main WHERE MATCH(KEYWORDS) AGAINST ('Coast 019390-08 AF Abersoch Activity Beach Beache Boat Britain British Card Coast Coastal Coastline Destination Greeting Horizontal Isle Kingdom Mountain Resort Sailing Sea Seascape Seaside Summer Summery UK United VC08 VW08 Water Welsh West Western Beache Boating Boat Coast Coastal Coastline Coast Resort Sand Seascape Seaside Summery Tidal Tidal Tourism Tourist Touristy Travel Western' IN BOOLEAN MODE) and ARCHIVED = 0 ORDER BY score DESC LIMIT 32, 32
 
You've got a lot of seperate search terms in there which takes the time. Off the top of my head, you could try:

1. Remove the duplicate search terms you've got in there. Words like 'Boat', 'Coast' and 'Tidal' appear twice which may have a slight hit on performance.

2. Make sure all the columns you are searching on have a FULLTEXT index on them.
 
Have done both of those already, the keywords are replicated in the query due to the way Boolean Mode works.
 
(I'm not an SQL expert)

Could you look at putting it in a stored procedure on the database? I'm pretty sure they offer good performance benefits when conditional logic is involved.
 
Well how often is the query run? If the db is not updated with data very often you could update the results in a temp table every so often?
 
The query is different every time though, it depends on the picture that people click the similar link on.
 
Lets look at the underlying problem rather than the query. People click on a similar link and you want to provide them with results of pictures that have tags in common. The picture may be different every time but are some of the tags the same each time? What are the most common tags and do you tend to get common subset groupings of the tags (all pictures tagged with "trees" normally also have "nature")?

How many results do people actually ever look through, could you look at reducing the scope of the query to return fewer results quicker as most people don't need EVERY similar photo?

Are there ways to reduce or group the tags so that there are fewer search terms to work through? Could a table of the most common 100 tags and the pictures they relate to be kept relatively up to date (during the quietest time for your server or whenever there is a low load?) and any queries with these terms in can be provided instantly without a search? The results then add the remaining uncommon tag similar pictures later once the query completes.
 
Dureth is right, you could limit the amount of tags to use. This is usually better in general as well because it forces the user to think about what they actually want rather than the stuff they see and go "oohhhhhhh ahhhhhh"
 
Back
Top Bottom