Denormalizing Relation Data With SQL

Stretch · 30 Sep 2008 at 11:07

I have some relational data I need to export to a customer as a flat table excel document.

For arguments sake lets say the data is a simple parent child relationship.

Parent
-----------
ParentID
ParentName

Child
-----------
ChildID
ParentID
ChildName

Let’s also say for arguments sake a parent can have no more than 3 children.

How would I get the information into the format below using an SQL statement?

Export
-----------
ParentID
ParentName
FirstChildName
SecondChildName
ThirdChildName

If a parent has less than 3 children the child columns are nullable.

EDIT: I forgot to say I'm using SQL Server 2005.

I think I've found a way using three sub queries, but it's a bit messy. It depends on numbering each of the children from 1 to 3 for each parent.

Each subquery would look something like this -

Code:

SELECT RANK() OVER (ORDER BY c.ChildName) as Rank, c.ChildName
FROM Child c
WHERE c.ParentID = p.ParentID
AND Rank = [1-3]
ORDER BY Rank

I'm not sure if it works because I've only go access to an Oracle database at the moment. If any one can confirm it might work or suggest a better solution I'd be grateful.

Stretch · 30 Sep 2008 at 11:42

Spunkey said:
My first thought was to use a union or two -

Code:

SELECT ParentID FROM Parent WHERE ParentID = 2 -- you could just select this into a variable if needed, just done it this way for clarity UNION SELECT ParentName FROM Parent WHERE ParentID = 2 UNION SELECT Childname FROM Child WHERE ParentID = 2 ORDER BY ChildID

This should give you the result you need, however if there are only 1 or two children, then there will be no record returned for those, rather than a null. If there are more then three you can just do a Select TOP 3 * on the final query.

You may need to do some Converting or Collating to make sure the field types match (especially if your ID field is an int and the name are nvarchars), but I've tested it on SQL2000 and it worked for me.

But how would that give you the three separate child columns?

Stretch · 30 Sep 2008 at 11:56

Spunkey said:
Because the final query will return more then one record.

I think you might have got slightly the wrong end of the stick.

These are the table headings.

Export
-----------
ParentID
ParentName
FirstChildName
SecondChildName
ThirdChildName

So each row should have these values, and I want multiple rows

Stretch · 30 Sep 2008 at 12:09

SimonCHere said:
If you had childPosition on the Child table it would be easy:

That's pretty much what this code dose

Code:

SELECT RANK() OVER (ORDER BY c.ChildName) as Rank, c.ChildName
FROM Child c
WHERE c.ParentID = p.ParentID
AND Rank = [1-3]
ORDER BY Rank

but I'm not sure it works???

How to dynamically number rows in a SELECT Transact-SQL statement

It's a piece of **** to d in Oracel

Stretch · 30 Sep 2008 at 12:19

SimonCHere said:
Well SQL 2000 doesn't recognise RANK() so I can't help there.

Sorry, forgot to say it's 2005 only

Stretch · 30 Sep 2008 at 14:51

I'm happy to say I create a local sqlexpress 2005 DB earlier the RANK() function works.

I'll post the final query up when I've got it sorted out

Stretch · 30 Sep 2008 at 15:18

I'm sure there's an easier way of doing it but here we go.

Code:

SELECT p.ParentID, p.ParentName,

(SELECT ChildName
FROM         (SELECT     RANK() OVER (ORDER BY c.ChildName) AS Rank, c.ChildName
FROM         Child c
WHERE     c.ParentID = p.ParentID) AS rankselect
WHERE Rank = 1) AS FirstChild,

(SELECT ChildName
FROM         (SELECT     RANK() OVER (ORDER BY c.ChildName) AS Rank, c.ChildName
FROM         Child c
WHERE     c.ParentID = p.ParentID) AS rankselect
WHERE Rank = 2) AS SecondChild, 

(SELECT ChildName
FROM         (SELECT     RANK() OVER (ORDER BY c.ChildName) AS Rank, c.ChildName
FROM         Child c
WHERE     c.ParentID = p.ParentID) AS rankselect
WHERE Rank = 3) AS ThirdChild

FROM Parent p

Stretch · 30 Sep 2008 at 21:29

SimonCHere said:

Can you not use this then?
I can't test as I don't have 2005

Code:

select p.ParentName, 
case when [c.ChildPosition] = 1 then c.ChildName END as FirstChildName,
case when [c.ChildPosition] = 2 then c.ChildName END as SecondChildName,
case when [c.ChildPosition] = 3 then c.ChildName END as ThirdChildName
From Parent p
Inner join 
(SELECT ChildName, ParentID, rankselect.Rank as ChildPosition
FROM         (SELECT RANK() OVER (ORDER BY c1.ChildName) AS Rank,
                           c.ChildName
                  FROM  Child c1
                  WHERE c1.ParentID = p.ParentID) AS rankselect
) AS c, 
on c.ParentID = p.ParentID

Simon

I've not tried it but I think it probably would work yes. It's not radically different though.

Stretch · 30 Sep 2008 at 21:31

lokkers said:
The way I've done similar things in the past is to use a table variable. Something like this:

--table variable
declare @table table
(
intId int primary key identity(1,1)
, intParentId int
, vchParentName varchar(100)
, intChildId1 int
, vchChild1 varchar(100)
, intChildId2 int
, vchChild2 varchar(100)
, intChildId3 int
, vchChild3 varchar(100)
)

--initial insert
insert into @table
(
intParentId
, vchParentName
, intChildId1
, vchChild1
)
select p.intparentid
, p.vchparentname
, c.intchildId
, top 1(c.vchChildName)
from parent p
inner join child c on p.intParentId = c.intParentId

--first update
update @table
set vchChild2 = top 1 (c.vchChild)
from child c
inner join @table t on c.intparentid = t.intParentId
where c.intchildid not in (select childid from @table)

--second update etc

I don't have access to the original code I wrote i'm afraid, but the concept should be sound. I've also not had a good day, and am a little drunk so your mileage may vary!

Using an identity is another way I was thinking of doing it, but it's much more work using temporary tables etc. I just need a quick and easy for now. It probably more efficient though so may be useful for when I have a bit more data