Help needed with an MS SQL query

redeye_uk · 7 Jul 2010 at 16:05

I have some database corruption and need to write an SQL script to rectify it.

Basically I have something similar to the following

Code:

table1
{
 ID int PK
 ModelID int FK
 ClientID int
}

and some typical data might look like

Code:

ID   ModelID    ClientID
1        1             -1
2        1             -2
3        1             -2
4        1             -3
5        1             -3
6        2             -1
7        2             -2
8        2             -3

so each entry for a model should have a unique ClientID. But you will note that model 1 has duplicate rows for clientID -2 & -3

Now I cant delete these duplicate rows, instead I need to update the ClientID value. So for ID 3 I need to update the ClientID to -4 and for ID 5 I need to update the ClientID to -5. (basically the next available ID)

Hopefully you're still following me

I have written a query to get the duplicate rows from the table and tell me what the current highest ClientID used for that Model is.

Code:

select 
t1.ClientID, 
t1.ModelID, 
(SELECT TOP(1) KnowdeID from Table1 where ModelID = t1.ModelID and ClientID = t1.ClientID) ID, 
(SELECT min(ClientID)  from Table1 where ModelID = t1.ModelID) HighestClientID
from Table1 t1
group by t1.ClientID, t1.ModelID
having count(t1.ClientID) > 1

So this query will give me something like

Code:

ClientID   ModelID   ID    HighestClientID
   -2        1       3           -3
   -3        1       5           -3

How can I tweak the above query so I do an update and change the ClientID to "HighestClientID" + 1. In the above result set it would work for the first row, but not the second. i.e. it would still leave me with duplicates for ClientID -3

Pho · 7 Jul 2010 at 18:19

PHP:

-- I'm using table variables for my testing, you can remove all this and use your real tables if you want
DECLARE @Table1 TABLE ( ID INT PRIMARY KEY,
						ModelID INT,
						ClientID INT)
						
INSERT @Table1 ( ID, ModelID, ClientID )
	SELECT 1, 1, -1 UNION
	SELECT 2, 1, -2 UNION
	SELECT 3, 1, -2 UNION
	SELECT 4, 1, -3 UNION
	SELECT 5, 1, -3 UNION
	SELECT 6, 2, -1 UNION
	SELECT 7, 2, -2 UNION
	SELECT 8, 2, -3


-- Grab a list of duplicates and save them into the temporary table #UPDATE_TEMP
SELECT
    ModelID ,
    ClientID ,
	(SELECT MAX(ID) FROM @Table1 WHERE ModelID = t1.ModelID and ClientID = t1.ClientID) ID
INTO #UPDATE_TEMP
FROM @Table1 t1
GROUP BY ModelID, ClientID
HAVING COUNT(ClientID) > 1


-- Loop over all rows in #UPDATE_TEMP and update them one at a time
-- so that the next client ID is calculated properly
WHILE EXISTS (SELECT * FROM #UPDATE_TEMP)
BEGIN
	-- Get the next ID from #UPDATE_TEMP
	DECLARE @ID INT = ( SELECT TOP 1 ID FROM #UPDATE_TEMP ORDER BY ID )
	
	-- Update the ClientID to the latest available
	UPDATE @Table1
		SET ClientID = (SELECT MIN(ClientID)-1 FROM @Table1 WHERE ModelID = t1.ModelID)
	FROM @Table1 T1
	INNER JOIN #UPDATE_TEMP UT
		ON UT.ID = T1.ID
	WHERE UT.ID = @ID
		
	-- Remove the row from the #UPDATE_TEMP table (so that the next loop iteration will pick up the next available row to fix)
	DELETE FROM #UPDATE_TEMP WHERE ID = @ID
END


-- Ta-da!
SELECT * FROM @Table1

Wonderfully over-engineered, but seems to work

.

redeye_uk · 12 Jul 2010 at 10:11

Cheers Pho, exactly what I was after. I agree it's wonderfully over-engineered but its a change script so performance isn't too much of an issue

I was wanting to be really elegant and do the whole thing in a single query by updating variables at the same time as columns in an UPDATE statement. so something like....

PHP:

declare @clientid int
select @clientid = select max(clientid)+1 from Table1 -- whatever you have to do to get the next number
update t1
   set clientid = @clientid,
       @clientid = @clientid + 1
  from Table1 t1
 where ...

.. problem with that is it'll only work for a single Model with duplicates (in reality I have multiple models with duplicates). I'm sure some t-sql demigod can prove me wrong, but time is against me so the while loop will have to suffice.

Pho · 12 Jul 2010 at 10:36

I was doing something similar a while ago and I seem to remember you need to do it in a loop because if you use a single query the individual row updates aren't committed to the database until all rows have been updated (which means you won't be able to calculate the correct Client ID). Doing it in a loop means each iteration is committed as it loops meaning you can calculate the next ID properly.

philldutton · 12 Jul 2010 at 15:54

If you're using Sql Server 2005 you could do it with something like this:

Code:

update t
set t.clientid = updated_t1.new_client_id
from table1 t
inner join (
	select id, 
	modelid, 
	clientid, 
	row_number() over (partition by modelid order by id) * -1 as new_client_id
	from table1
) updated_t1 on t.id = updated_t1.id