SQL Help

Soldato
Joined
16 Dec 2005
Posts
14,443
Location
Manchester
I have a database with a few tables that contains articles, users and other info for a site I am creating.

At the moment I am working out all the possible SQL statements I could need before I start the PHP.

I have done quite well so far, considering my lack of SQL knowledge, but I have hit a snag.

I have two tables user and article. I have an SQL statement that will give me the most recent article of the whole site and it appears to work properly:

Code:
 SELECT user_name, article_title, article_date, article_section 
FROM user, article 
WHERE user_id = article_user 
AND article_date = (SELECT MAX(article_date) FROM article);
The output is:
Code:
+-----------+----------------------------+---------------------+-----------------+
| user_name | article_title              | article_date        | article_section |
+-----------+----------------------------+---------------------+-----------------+
| pspain    | New Folding@Home V6 Client | 2007-10-04 00:52:00 |               1 | 
+-----------+----------------------------+---------------------+-----------------+
Now, I would like statements that grab the most recent article per section and per user.

So I came up with this:

Code:
SELECT user_name, article_title, MAX(article_date), article_section 
FROM user, article 
WHERE user_id = article_user GROUP BY article_section;
The output is:
Code:
+-----------+--------------------------------------------------+---------------------+-----------------+
| user_name | article_title                                    | MAX(article_date)   | article_section |
+-----------+--------------------------------------------------+---------------------+-----------------+
| pspain    | New Folding@Home V6 Client                       | 2007-10-04 00:52:00 |               1 | 
| pspain    | Llyn Ogwen to Pen Yr Ole Wen                     | 2007-10-01 19:41:28 |               3 | 
| pspain    | The new 'Age of the Five' books by Trudi Canavan | 2007-10-01 19:46:06 |               9 | 
+-----------+--------------------------------------------------+---------------------+-----------------+
This appeared to work at first, but when I actually checked I realised the article title was wrong for the second record returned as you can see by the contents of the article table:

Code:
+--------------------------------------------------+---------------------+
| article_title                                    | article_date        |
+--------------------------------------------------+---------------------+
| Llyn Ogwen to Pen Yr Ole Wen                     | 2007-10-01 19:34:59 | 
| Pen-Y-Pass to Elidir Fawr                        | 2007-10-01 19:39:09 | 
| Crowden to Black Hill                            | 2007-10-01 19:41:28 | 
| The new 'Age of the Five' books by Trudi Canavan | 2007-10-01 19:46:06 | 
| New Folding@Home V6 Client                       | 2007-10-04 00:52:00 | 
+--------------------------------------------------+---------------------+
The SQL statement correctly gets the latest date per section, but seems to choose the earliest article per section [and associated user_name]. I am not sure how to fix it!

Any help would be great :)
 
SELECT user_name, article_title, MAX(article_date), article_section

All that will do is select a username, an article title, the last date, and an article selection


Use

SELECT user_name, article_title, article_date, article_section
FROM user, article
WHERE user_id = article_user
GROUP BY user_name, article_title, article_date, article_section
ORDER BY article_date DESC


Same goes for your first one,
why use
AND article_date = (SELECT MAX(article_date) FROM article)
Just use

SELECT TOP 1 user_name, article_title, article_date, article_section
FROM user, article
WHERE user_id = article_user
order by article_date DESC


I may have my group by's slightly wrong, so you may need to alter my code... but you should get the idea :)

You over working the code, adding more code than is needed, using order by and TOP (number) will give similar results but should be less stressful
 
Yes, do what nikebee says. Just a note though, assuming you're using MySQL you'll want to be using 'LIMIT 1' at the end instead of the 'TOP 1' at the beginning :)
 
Thanks guys. Will give that a go soon.

Like I said in my OP I am very much an SQL n00b so I can only write code I know [and understand]. Although I will now be looking into TOP [or LIMIT for MySQL] and DESC.

Thanks again :)
 
Yes, do what nikebee says. Just a note though, assuming you're using MySQL you'll want to be using 'LIMIT 1' at the end instead of the 'TOP 1' at the beginning

Sure, forgot to mention that.

If you use MySQL rather than Microsoft SQL then my last part of code would not be

Code:
SELECT TOP 1 user_name, article_title, article_date, article_section 
FROM user, article 
WHERE user_id = article_user 
order by article_date DESC


But would be
Code:
SELECT user_name, article_title, article_date, article_section 
FROM user, article 
WHERE user_id = article_user 
order by article_date DESC
LIMIT 1
I'm pretty sure the rest of it is okay, been a long time since I did MySQL code :o

Thanks guys. Will give that a go soon.

Like I said in my OP I am very much an SQL n00b so I can only write code I know [and understand]. Although I will now be looking into TOP [or LIMIT for MySQL] and DESC.

Thanks again
Sure no worries man, we were all n00bs at one stage :)
Top (or Limit) will obviously only bring back the number you want.
Order by (something) DESC will order by it descending (or reverse if you will)

Say you have an table called TEST with one field called ID.

As follows

ID
---
1
2
3
4
5

Select top 2 ID from TEST order by ID desc

would produce

ID
5
4

Select top 2 ID from TEST order by ID

would just produce

ID
1
2

Select top 2 ID from TEST order by ID ASC
would also produce

ID
1
2


As all order by's are sorted ascending by default :)

It's always worth creating a temporary table or database that you can play around with. Whilst your learning, it's always best to test your code on a dummy database first to make sure you haven't got any code wrong.

E.G.
Delete from (tablename) where ID=1
that would delete one row of course

Missing that where clause would wipe the whole table...

I'm sure you probably knew that, but it's just an example... as I said it's definitely worth setting up a test database and a few test tables... you don't want to lose or screw up all you data with dodgey coding :)
 
There is no "live" database yet. So the one I set up is my test bed until I am happy everything is working. Then I have the hard task of filling it up with real data :p

Regarding your statements I can confirm that

Code:
SELECT user_name, article_title, article_date, article_section 
FROM user, article 
WHERE user_id = article_user 
order by article_date DESC
LIMIT 1;
Works as it should, selecting the newest article in the database.

I am however having a problem with selecting the newest article in each section..

Your code:
Code:
SELECT user_name, article_title, article_date, article_section 
FROM user, article 
WHERE user_id = article_user
GROUP BY user_name, article_title, article_date, article_section 
ORDER BY article_date DESC;
Produces:
Code:
+-----------+-------------------------------------------------------+---------------------+-----------------+
| user_name | article_title                                         | article_date        | article_section |
+-----------+-------------------------------------------------------+---------------------+-----------------+
| mheyes    | Evergrey Announced at Apollo                          | 2007-10-04 14:49:12 |              14 | 
| kbottoms  | Pitch Black Progress - The new album by Scar Symmetry | 2007-10-04 14:49:12 |              13 | 
| pspain    | New Folding@Home V6 Client                            | 2007-10-04 00:52:00 |               1 | 
| pspain    | The new 'Age of the Five' books by Trudi Canavan      | 2007-10-01 19:46:06 |               9 | 
| mheyes    | Crowden to Black Hill                                 | 2007-10-01 19:41:28 |               3 | 
| pspain    | Pen-Y-Pass to Elidir Fawr                             | 2007-10-01 19:39:09 |               3 | 
| pspain    | Llyn Ogwen to Pen Yr Ole Wen                          | 2007-10-01 19:34:59 |               3 | 
+-----------+-------------------------------------------------------+---------------------+-----------------+
It has got everything correct, but it has chosen all three articles in Section 3. I only want the latest in each section.

If my understanding of GROUP BY is correct, it takes all the items in the column given and basically groups all the identicle values. In your code, you are grouping several columns. There is no way those columns put together can form groups, as they will always make a different combination.

However, just grouping article_section means all the info goes out of whack, as per my original post.

I suppose I could code something into the PHP to ignore the extra articles, but that's not really ideal - especially if i have hundreds of articles to wade through.

Sorry to be a bore :p
 
Right, try this.

I didn't read what you wanted properly the first time around :o

So you want a list of article_section (ids), and next to them goes the last entered piece of data?

Code:
SELECT DISTINCT article_section,
(select top 1 user_name from user where user_id =
(select top 1 article_user from article where article_section = A.article_section order by article_date desc)) as user_name,
(select top 1 article_title from article where article_section = A.article_section order by article_date desc) as article_title,
(select top 1 article_date from article where article_section = A.article_section order by article_date desc) as article_date
FROM user U, article A
WHERE user_id = article_user
ORDER BY article_section

Again, if your using MySQL use Limit 1 after each nested select statement

Code:
SELECT DISTINCT article_section,
(select user_name from user where user_id =
(select article_user from article where article_section = A.article_section order by article_date desc LIMIT 1) LIMIT 1) as user_name,
(select article_title from article where article_section = A.article_section order by article_date desc LIMIT 1) as article_title,
(select article_date from article where article_section = A.article_section order by article_date desc LIMIT 1) as article_date
FROM user U, article A
WHERE user_id = article_user
ORDER BY article_section

Using nested select statements, means you can create each field in your table with whatever type of data you want :)

giving tables user and articles aliases (U = user and A = article) means it's easier to reference to them inside the nested selects, so the code knows that when A.article_sectionis typed, it references to the article_section in the main part of the select statement and not inside any of the nested selects.

It's a bit more complex, but it get the job done.
 
Woah. That's an SQL statement with chest hairs!

I'd never have come up with that in a million years. Heh, I am still trying to get my head around it :p However, the main thing is works! So thank you very much.

Hopefully I wont run into any other problems!
 
me said:
Hopefully I wont run into any other problems!

Ooof, spoke too soon!

I'd like a similar SQL statement to choose the latest article by each user. Unfortunately I just can't get my brain around your SQL statement, so I can't modify it.

A brief explanation would be a great help, if you don't mind :)

EDIT: After a small epiphany [or stroke] I think I understand that SQL statement... at least enough to come up with this:

Code:
SELECT DISTINCT article_user,
(SELECT user_name FROM user WHERE user_id = article_user ORDER BY article_date DESC LIMIT 1) AS user_name,
(SELECT article_title FROM article WHERE article_user = A.article_user ORDER BY article_date DESC LIMIT 1) AS article_title,
(SELECT article_date FROM article WHERE article_user = A.article_user ORDER BY article_date DESC LIMIT 1) AS article_date
FROM user U, article A
WHERE user_id = article_user
ORDER BY article_user;

Which appears to work. The output looks like:

Code:
+--------------+-----------+-------------------------------------------------------+---------------------+
| article_user | user_name | article_title                                         | article_date        |
+--------------+-----------+-------------------------------------------------------+---------------------+
|            1 | pspain    | Dimmu Borgir at Academy 1                             | 2007-10-04 17:22:40 | 
|            2 | mheyes    | Evergrey Announced at Apollo                          | 2007-10-04 14:49:12 | 
|            3 | kbottoms  | Pitch Black Progress - The new album by Scar Symmetry | 2007-10-04 14:49:12 | 
+--------------+-----------+-------------------------------------------------------+---------------------+

I get a redundant "article_user" column though. At least I am on the right track though :)
 
Last edited:
Hi,

Don't want to stick my nose in etc. but this might be another way of selecting the most recent article per user and section:

Code:
select a.article_date, a.article_title, u.username, a.article_section
from users u, article a
where u.user_id = a.article_user
and a.article_date in (select max(a2.article_date)
                        from article a2
                        where a2.article_user = a.article_user
                        and a2.article_section = a.article_section);

The main select retrieves the various details from the users and article tables. The sub-select retrieves the max date from the article table where the user and section are the ones being currently looked at in the main select. The join to the sub-select then ensures that only the article with the max date for the section is returned. The joining between the sub-select and the main select in the where clause of the sub-select seems to ensure that you only get the right row per section (not entirely sure if this is working properly though).

I've tested on a small amount of data which includes combinations of duplicate dates for different articles and sections, and it seems to work.

Worth a try?

Jim
 
I just tried your code as is and didn't work. I then whacked on a "GROUP BY article_section" at the end, and it appears to work!

Code:
SELECT a.article_date, a.article_title, u.user_name, a.article_section
FROM user u, article a
WHERE u.user_id = a.article_user
AND a.article_date IN (SELECT MAX(a2.article_date)
FROM article a2
WHERE a2.article_user = a.article_user
AND a2.article_section = a.article_section) GROUP BY article_section;

I cant make it work for per-user though. Tried to GROUP BY article_user [and user_name] but it didn't work. Any ideas?

Also, how would one modify the query to also grab information from rank and status tables, which are linked to the user table through user_rank and user_status?

I have figured out how to do it with nikebee's code [it looks awful!] but if yours can be easily modified - then great :D
 
Yeah mine is a little sketchy...

This works for your new problem however

Code:
SELECT DISTINCT
(SELECT user_name FROM user WHERE user_id = article_user ORDER BY article_date DESC LIMIT 1) AS user_name,
(SELECT article_title FROM article WHERE article_user = A.article_user ORDER BY article_date DESC LIMIT 1) AS article_title,
(SELECT article_date FROM article WHERE article_user = A.article_user ORDER BY article_date DESC LIMIT 1) AS article_date
FROM user U, article A
WHERE user_id = article_user

If you need to use an order by though (which is always useful) try this

Code:
Declare @articleusers table (user_name nvarchar (4000),article_title nvarchar (4000), article_date smalldatetime)
insert into @articleusers
SELECT DISTINCT
(SELECT user_name FROM user WHERE user_id = article_user ORDER BY article_date DESC LIMIT 1),
(SELECT article_title FROM article WHERE article_user = A.article_user ORDER BY article_date DESC LIMIT 1),
(SELECT article_date FROM article WHERE article_user = A.article_user ORDER BY article_date DESC LIMIT 1)
FROM user U, article A
WHERE user_id = article_user

select user_name,article_title,article_date
from @articleusers
order by article_date

Code:
Declare @articleusers table
Creates a temporary table that can be referenced to only when the code is run, you insert your data into it as you normally would, and can then do any select statements that you wish


What are the names of the fields in both tables that link the rank and status tables?

is rank just user_rank in both, and status just user_status in both?

If so you should just be able to do

FROM user U, article A, rank R, status S
WHERE user_id = article_user
and U.user_rank = R.user_rank
and U.status = S.status

failing that, you could do (proper) inner joins to get the same effect and not do it the girls way :p
 
All this SQL is making my brain hurt.

I have no idea what is going on with the DECLARE stuff. I only know it doesn't work [syntax errors]. I can't fix it either :(

This SQL is getting horribly complicated - I am finding it hard to keep up. I just about got around the code you gave me yesterday, and I adapted it to get most recent article per user by myself [and got rid of the redundant column].

I also managed to adapt your original code to also grab the rank_name and status_name for each user. However it looks horrendous. Before JIMA posted yesterday I was looking for a way of storing the result of a sub-query so I could use it again, as my new SQL statement from hell repeats the same sub-query a few times as you can see below.

Small children and those of a nervous disposition should look away now:

Code:
SELECT DISTINCT article_section,
(SELECT user_name FROM user WHERE user_id = (SELECT article_user FROM article WHERE article_section = A.article_section ORDER BY article_date DESC LIMIT 1) LIMIT 1)  AS user_name,
(SELECT rank_name FROM rank WHERE rank_id = (SELECT user_rank FROM user WHERE user_id = (SELECT article_user FROM article WHERE article_section = A.article_section ORDER BY article_date DESC LIMIT 1) LIMIT 1) LIMIT 1) AS rank_name, 
(SELECT status_name FROM status WHERE status_id = (SELECT user_status FROM user WHERE user_id = (SELECT article_user FROM article WHERE article_section = A.article_section ORDER BY article_date DESC LIMIT 1) LIMIT 1) LIMIT 1) AS status_name,
(SELECT article_title FROM article WHERE article_section = A.article_section ORDER BY article_date DESC LIMIT 1) AS article_title,
(SELECT article_date FROM article WHERE article_section = A.article_section ORDER BY article_date DESC LIMIT 1) AS article_date
FROM user U, article A, rank R, status S
WHERE user_id = article_user AND user_rank = rank_id AND user_status = status_id
ORDER BY article_section;
The above outputs:

Code:
+-----------------+-----------+-----------+-------------+-------------------------------------------------------+---------------------+
| article_section | user_name | rank_name | status_name | article_title                                         | article_date        |
+-----------------+-----------+-----------+-------------+-------------------------------------------------------+---------------------+
|               1 | kbottoms  | Member    | Active      | Windows SMP Client                                    | 2007-10-04 22:50:27 | 
|               3 | mheyes    | Moderator | Active      | Crowden to Black Hill                                 | 2007-10-01 19:41:28 | 
|               9 | pspain    | Admin     | Active      | The new 'Age of the Five' books by Trudi Canavan      | 2007-10-01 19:46:06 | 
|              13 | kbottoms  | Member    | Active      | Pitch Black Progress - The new album by Scar Symmetry | 2007-10-04 14:49:12 | 
|              14 | pspain    | Admin     | Active      | Dimmu Borgir at Academy 1                             | 2007-10-04 17:22:40 | 
+-----------------+-----------+-----------+-------------+-------------------------------------------------------+---------------------+
So it does work. It's just a shame it is a horrible monstrosity of an SQL statement! There has got to be a way to reduce the repetitive statments.

The new rank and status tables look like:

rank
rank_id
rank_name
.... [other non related info]

status
status_id
status_name
... [other info]

Both the tables link to user through user_rank and user_status

The status table might not be needed, as the data it holds could probably be put somewhere else [or be derived in the PHP]. However, i wont give up and nuke the table just yet.
 
Code:
SELECT DISTINCT article_section,
(SELECT user_name FROM user WHERE user_id = (SELECT article_user FROM article WHERE article_section = A.article_section ORDER BY article_date DESC LIMIT 1) LIMIT 1)  AS user_name,
(SELECT rank_name FROM rank WHERE rank_id = (SELECT user_rank FROM user WHERE user_id = (SELECT article_user FROM article WHERE article_section = A.article_section ORDER BY article_date DESC LIMIT 1) LIMIT 1) LIMIT 1) AS rank_name, 
(SELECT status_name FROM status WHERE status_id = (SELECT user_status FROM user WHERE user_id = (SELECT article_user FROM article WHERE article_section = A.article_section ORDER BY article_date DESC LIMIT 1) LIMIT 1) LIMIT 1) AS status_name,
(SELECT article_title FROM article WHERE article_section = A.article_section ORDER BY article_date DESC LIMIT 1) AS article_title,
(SELECT article_date FROM article WHERE article_section = A.article_section ORDER BY article_date DESC LIMIT 1) AS article_date
FROM user U, article A, rank R, status S
WHERE user_id = article_user AND user_rank = rank_id AND user_status = status_id
ORDER BY article_section;

Holy crap man... that has got a bit insane :eek:
If I get a chance later I'll have a proper play at lunch or during a quiet 5 minutes and I'll see if I can write a simpler version - I think you've expanded on my original query a little too much :)

I didn't expect that declare stuff to work on MySQL but it was worth a shot.
 
Holy crap man... that has got a bit insane :eek:
If I get a chance later I'll have a proper play at lunch or during a quiet 5 minutes and I'll see if I can write a simpler version - I think you've expanded on my original query a little too much :)

I know! Insane isn't it?

At first it all seemed so logical... then I found it worked and I was like "hurrah!". Then I stepped back and realise what I had created! :eek:

Bloody SQL. It's not like other languages. The learning curve is more like a brick wall.

As usual, your help is greatly appreciated. Please don't feel the need to do this during your breaks though. Do it when you get back to work :p
 
As usual, your help is greatly appreciated. Please don't feel the need to do this during your breaks though. Do it when you get back to work :p

:D will do


edit:
Bloody SQL. It's not like other languages. The learning curve is more like a brick wall.

And yeah, SQL makes things that should really be simple a ruddy pain in the arse :-/
 
little off topic but what statemant do you use to produce the outputs of the results like so ?

Code:
+--------------+-----------+-------------------------------------------------------+---------------------+
| article_user | user_name | article_title                                         | article_date        |
+--------------+-----------+-------------------------------------------------------+---------------------+
|            1 | pspain    | Dimmu Borgir at Academy 1                             | 2007-10-04 17:22:40 | 
|            2 | mheyes    | Evergrey Announced at Apollo                          | 2007-10-04 14:49:12 | 
|            3 | kbottoms  | Pitch Black Progress - The new album by Scar Symmetry | 2007-10-04 14:49:12 | 
+--------------+-----------+-------------------------------------------------------+---------------------+
 
Hi,

Try this one:

Code:
select a.article_date, a.article_section, a.article_title,
         u.user_id , u.username
from article a, users u
where a.article_user = u.user_id
and a.article_date in (select max(a2.article_date)
                         from article a2, users u2
                         where a.article_section = a2.article_section
                  	 and      u2.user_id = u.user_id
                  	 and a2.article_user = u2.user_id) 
order by a.article_section;

This select will (is intended to anyway) retrieve details of the latest article for each user within a section. So, if three users have put a number of articles each into a section it will retrieve one row per user, that row being their latest article in the section.

Is that any help?

Jim
 
Back
Top Bottom