Designing SQL Query

Associate
Joined
2 Nov 2005
Posts
931
Location
Leicester
I need help designing a SQL Query.

I need to find all distinct MODULENAME's with the latest REVISION_ID and exclude the ones that have an empty/null CONTENT column.

REVISION_ID is a auto-incrementing primary key.
There will be many more MODULENAME's other than Front_Page normally.

Here's a screenshot of my table so you can make more sense of what I am talking about:
20qmcly.jpg


If I am making no sense at all, please say so!

Just give me solution in whatever flavor of SQL you use and I will work from that. I cant get my brain around how to write this one out!

Edit: Heres CREATE TABLE
Code:
CREATE TABLE content (
	revision_id INTEGER NOT NULL, 
	modulename VARCHAR(256), 
	content VARCHAR(102400), 
	summary VARCHAR(256), 
	created TIMESTAMP, 
	PRIMARY KEY (revision_id)
)
 
Last edited:
How about this.. maybe not the best way. Are you using MSSQL?

Create a function to return the details of the last entry of each distinct modulename..
PHP:
CREATE FUNCTION dbo.distinct_modules  ()

RETURNS TABLE 
AS
RETURN 
(
	SELECT
		   MAX(revision_id) AS revision_id
		  ,modulename
		  ,MAX(content)	AS content

	FROM content
	GROUP BY modulename
)
GO


Now call the function and filter out any null / empty values..
PHP:
SELECT revision_id, modulename FROM dbo.distinct_modules()
WHERE content IS NOT NULL OR LEN(content) > 0
 
Last edited:
I use sqlite and mysql mostly but I get the general idea from your reply :)
SQL is mostly a standardized language after all.

I was hoping its possible to do it with one query! Is that not possible?

Edit: That function does not work as expected, it outputs:
revision_id - modulename - content
4 ---------- Front_Page - hahaha

The content column is from row 3 and the revision_id says row 4.
 
Last edited:
Not sure what happened there, it was late :p.

I split it into two queries because if we included the 'WHERE content IS NOT NULL OR LEN(content) > 0' part in the first query it would select ID 3 not 4 because it specifically filters out entries where content is null or blank.

Try this single query instead then..

PHP:
SELECT revision_id, modulename
FROM content
WHERE revision_id IN (
	SELECT MAX(revision_id) FROM content c
	GROUP BY modulename
)
AND LEN(content) > 0
ORDER BY modulename
 
Last edited:
Not sure what happened there, it was late :p.

I split it into two queries because if we included the 'WHERE content IS NOT NULL OR LEN(content) > 0' part in the first query it would select ID 3 not 4 because it specifically filters out entries where content is null or blank.

Try this single query instead then..

PHP:
SELECT revision_id, modulename
FROM content
WHERE revision_id IN (
	SELECT MAX(revision_id) FROM content c
	GROUP BY modulename
)
AND LEN(content) > 0
ORDER BY modulename

I am impressed! You are a sql god :)

I only had to modify it slightly but its spot on!
PHP:
SELECT revision_id, modulename, content
FROM content
WHERE revision_id IN (
    SELECT MAX(revision_id) FROM content c
    GROUP BY modulename
)
AND content != ''
ORDER BY modulename

Some odd reason len wasnt working with mysql
PHP:
#1305 - FUNCTION pynguins_hn.len does not exist

pynguins_hn is the database name.
 
Last edited:
:).

LEN() might be Length() in SQLite. Just using '' might not account for null values.
 
Does that work though? If the max revision id has a content of '' wont you just get no record returned (subquery returns a max revision id that isn't found in the main query because of the condition)? Wouldnt you need to move the content != '' condition into the subquery?

Not tried it - just a thought :)
 
Does that work though? If the max revision id has a content of '' wont you just get no record returned (subquery returns a max revision id that isn't found in the main query because of the condition)? Wouldnt you need to move the content != '' condition into the subquery?

Not tried it - just a thought :)

Moving into sub query has same result (heres the query i used to test your hypothesis):
PHP:
SELECT revision_id, modulename, content
FROM content
WHERE revision_id IN (
    SELECT MAX(revision_id) FROM content c
    WHERE length(content) > 0
    GROUP BY modulename
)
ORDER BY modulename

Not sure which is more efficient performance wise, maybe the former?

If someone comes up with a even more simplified way to achieve what this query does, then I will be thankful (since it makes my programming code look cleaner and easier to understand). Maybe some info from here may help: http://jan.kneschke.de/projects/mysql/groupwise-max/
Perhaps should be some kind of digital trophy for this and this question is to help an open source project too :)

And as far as I know NULL and '' are two different things.
 
Last edited:
Are you sure the result is always the same? If you take the example in the first post, the sub query would return '4', but the main query would discard this as the content is equal to ''. Note that if the max content is not deleted, they both will return the same result...

Also, imo your design could be improved by having a 'isdeleted' column, and kept the content / summary intact, and then just do 'where isdelete = 0' rather than 'where content != '''
 
Back
Top Bottom