Denormalizing Relation Data With SQL

Soldato
Joined
14 Feb 2004
Posts
14,315
Location
Peoples Republic of Histonia, Cambridge
I have some relational data I need to export to a customer as a flat table excel document.

For arguments sake lets say the data is a simple parent child relationship.

Parent
-----------
ParentID
ParentName

Child
-----------
ChildID
ParentID
ChildName


Let’s also say for arguments sake a parent can have no more than 3 children.

How would I get the information into the format below using an SQL statement?

Export
-----------
ParentID
ParentName
FirstChildName
SecondChildName
ThirdChildName

If a parent has less than 3 children the child columns are nullable.


EDIT: I forgot to say I'm using SQL Server 2005.

I think I've found a way using three sub queries, but it's a bit messy. It depends on numbering each of the children from 1 to 3 for each parent.

Each subquery would look something like this -

Code:
SELECT RANK() OVER (ORDER BY c.ChildName) as Rank, c.ChildName
FROM Child c
WHERE c.ParentID = p.ParentID
AND Rank = [1-3]
ORDER BY Rank

I'm not sure if it works because I've only go access to an Oracle database at the moment. If any one can confirm it might work or suggest a better solution I'd be grateful.
 
Last edited:
My first thought was to use a union or two -
Code:
SELECT ParentID FROM Parent WHERE ParentID = 2 -- you could just select this into a variable if needed, just done it this way for clarity
UNION
SELECT ParentName FROM Parent WHERE ParentID = 2
UNION
SELECT Childname FROM Child WHERE ParentID = 2 ORDER BY ChildID
This should give you the result you need, however if there are only 1 or two children, then there will be no record returned for those, rather than a null. If there are more then three you can just do a Select TOP 3 * on the final query.

You may need to do some Converting or Collating to make sure the field types match (especially if your ID field is an int and the name are nvarchars), but I've tested it on SQL2000 and it worked for me.

But how would that give you the three separate child columns?
 
Because the final query will return more then one record.
I think you might have got slightly the wrong end of the stick.

These are the table headings.

Export
-----------
ParentID
ParentName
FirstChildName
SecondChildName
ThirdChildName

So each row should have these values, and I want multiple rows
 
Last edited:
I'm happy to say I create a local sqlexpress 2005 DB earlier the RANK() function works. :)

I'll post the final query up when I've got it sorted out
 
I'm sure there's an easier way of doing it but here we go.

Code:
SELECT p.ParentID, p.ParentName,

(SELECT ChildName
FROM         (SELECT     RANK() OVER (ORDER BY c.ChildName) AS Rank, c.ChildName
FROM         Child c
WHERE     c.ParentID = p.ParentID) AS rankselect
WHERE Rank = 1) AS FirstChild,

(SELECT ChildName
FROM         (SELECT     RANK() OVER (ORDER BY c.ChildName) AS Rank, c.ChildName
FROM         Child c
WHERE     c.ParentID = p.ParentID) AS rankselect
WHERE Rank = 2) AS SecondChild, 

(SELECT ChildName
FROM         (SELECT     RANK() OVER (ORDER BY c.ChildName) AS Rank, c.ChildName
FROM         Child c
WHERE     c.ParentID = p.ParentID) AS rankselect
WHERE Rank = 3) AS ThirdChild

FROM Parent p
 
Can you not use this then?
I can't test as I don't have 2005
Code:
select p.ParentName, 
case when [c.ChildPosition] = 1 then c.ChildName END as FirstChildName,
case when [c.ChildPosition] = 2 then c.ChildName END as SecondChildName,
case when [c.ChildPosition] = 3 then c.ChildName END as ThirdChildName
From Parent p
Inner join 
(SELECT ChildName, ParentID, rankselect.Rank as ChildPosition
FROM         (SELECT RANK() OVER (ORDER BY c1.ChildName) AS Rank,
                           c.ChildName
                  FROM  Child c1
                  WHERE c1.ParentID = p.ParentID) AS rankselect
) AS c, 
on c.ParentID = p.ParentID

Simon

I've not tried it but I think it probably would work yes. It's not radically different though.
 
The way I've done similar things in the past is to use a table variable. Something like this:

--table variable
declare @table table
(
intId int primary key identity(1,1)
, intParentId int
, vchParentName varchar(100)
, intChildId1 int
, vchChild1 varchar(100)
, intChildId2 int
, vchChild2 varchar(100)
, intChildId3 int
, vchChild3 varchar(100)
)

--initial insert
insert into @table
(
intParentId
, vchParentName
, intChildId1
, vchChild1
)
select p.intparentid
, p.vchparentname
, c.intchildId
, top 1(c.vchChildName)
from parent p
inner join child c on p.intParentId = c.intParentId

--first update
update @table
set vchChild2 = top 1 (c.vchChild)
from child c
inner join @table t on c.intparentid = t.intParentId
where c.intchildid not in (select childid from @table)

--second update etc

I don't have access to the original code I wrote i'm afraid, but the concept should be sound. I've also not had a good day, and am a little drunk so your mileage may vary! :p
Using an identity is another way I was thinking of doing it, but it's much more work using temporary tables etc. I just need a quick and easy for now. It probably more efficient though so may be useful for when I have a bit more data :)
 
Back
Top Bottom