SQL PIVOT (I think!)

Associate
Joined
25 Feb 2007
Posts
905
Location
Midlands
I have some data that looks like this:

Address| Location
123 Fake St | Kitchen
123 Fake St | Bathroom
123 Fake St | Bedroom
123 Fake St | Lounge
56 New Road | Kitchen
56 New Road | Lounge
12 Another Lane | Bathroom
12 Another Lane | Lounge

I need it to look like this:

123 Fake St | Kitchen | Bathroom | Bedroom | Lounge
56 New Road | Kitchen | | | Lounge
12 Another Lane | | Bathroom | | Lounge

How would I do this? I thought I needed to use PIVOT, but all the examples I've found use aggregates on numbers (sum of sales etc.)

Cheers!
 
Saying that... if it's going to be running on an enourmous table, you might be better off setting up multiple CTE's (One for each location). Then you can join them all together outside...
 
I was afraid you'd say that...I don't know how many different locations there are (my example is very dumbed down!)

I could find out, but wouldn't like to write a subquery/cte for each possibility...

Any other ideas? Doesn't have to be massively efficient, as it's only a one off.
 
I've never really used a pivot table before, so this has been a bit of a learning experience for me too! However, the below appears to work with your test data.

Code:
SELECT *

FROM #TestData 
	 PIVOT
	 (
	  MIN(Location)
	  FOR Location
	  IN ([Kitchen], [Bathroom], [Bedroom], [Lounge])
	 ) PTable

Now, you said you may have a lot more locations than this in your actual example, so I'd suggest pulling all of those into Excel and concatenating them with the [ ], so - ="["&A1&"]," would do that for you in cell A1. Once that's done, you can copy that list into the IN of the PIVOT (Remember to remove the last comma!), and run it! It seems a little silly that you can't us a subquery in the IN!

The data set will look like this:

Adress | Kitchen | Bathroom | Bedroom | Lounge |
12 Another Lane | NULL | Bathroom | NULL | Lounge
123 Fake St | Kitchen | Bathroom | Bedroom | Lounge
56 New Road | Kitchen | NULL | NULL | Lounge

Hope that helps?

Good luck - let me know how you get on.
 
Could you define a variable = a subquery then run against that list?

You would probably need to use a function to take the distinct records from the location field and then change those into a comma seperated list. But might work?
 
It's a bit on the crazy side but if you absolutely must have:

- indeterminate number of locations
- each column representing a different location after the address column

Maybe something like this that builds the SQL for you, so long as you don't generate a SQL statement with it that's > 4K characters long.....

Code:
create table raw (address nvarchar(128), location nvarchar(128))
go

insert into raw 
select '123 Fake St' address, 'Kitchen' location
union select '123 Fake St' address, 'Bathroom' location
union select '123 Fake St' address, 'Bedroom' location
union select '123 Fake St' address, 'Lounge' location
union select '56 New Road' address, 'Kitchen' location
union select '56 New Road' address, 'Lounge' location
union select '12 Another Lane' address, 'Bathroom' location
union select '12 Another Lane' address, 'Lounge' location
go

declare locations cursor local for
select location from raw group by location;

declare @location nvarchar(100);
declare @sql_prefix nvarchar(1000);
declare @sql_mid nvarchar(1000);
declare @sql_suffix nvarchar(1000);
declare @sql_part1 nvarchar(4000);
declare @sql_part2 nvarchar(4000);
begin

set @sql_prefix = 'select address, 
';

set @sql_mid = ' '''' as dummy
from (
select address,
';

set @sql_suffix = ' '''' as dummy 
from raw
) subq
group by address';

set @sql_part1 = '';
set @sql_part2 = '';

OPEN locations;
FETCH NEXT FROM locations INTO @location;
WHILE (@@FETCH_STATUS = 0)
  BEGIN

  set @sql_part1 = @sql_part1 + 
'case when sum(' + @location + 's) = 0 then '''' else ''' + @location + ''' end as ' + @location + ',' + char(10);

  set @sql_part2 = @sql_part2 + 
'case when location = ''' + @location + ''' then 1 else 0 end as ' + @location + 's,' + char(10);

  FETCH NEXT FROM locations INTO @location;
END         

close locations;
deallocate locations;

print @sql_prefix + @sql_part1 + @sql_mid + @sql_part2 + @sql_suffix;

exec(@sql_prefix + @sql_part1 + @sql_mid + @sql_part2 + @sql_suffix);

end
go

drop table raw
go

Outputs:
Code:
select address, 
case when sum(Bathrooms) = 0 then '' else 'Bathroom' end as Bathroom,
case when sum(Bedrooms) = 0 then '' else 'Bedroom' end as Bedroom,
case when sum(Kitchens) = 0 then '' else 'Kitchen' end as Kitchen,
case when sum(Lounges) = 0 then '' else 'Lounge' end as Lounge,
 '' as dummy
from (
select address,
case when location = 'Bathroom' then 1 else 0 end as Bathrooms,
case when location = 'Bedroom' then 1 else 0 end as Bedrooms,
case when location = 'Kitchen' then 1 else 0 end as Kitchens,
case when location = 'Lounge' then 1 else 0 end as Lounges,
 '' as dummy 
from raw
) subq
group by address

 address          Bathroom     Bedroom     Kitchen     Lounge     dummy    
 ---------------  -----------  ----------  ----------  ---------  -------- 
 12 Another Lane  Bathroom                             Lounge              
 123 Fake St      Bathroom     Bedroom     Kitchen     Lounge              
 56 New Road                               Kitchen     Lounge              

 3 record(s) selected [Fetch MetaData: 0/ms] [Fetch Data: 1/ms]

(Ignore the dummy column, it was added for simplicity in the SQL building).
 
Thanks - I'll give it a go!

Just to pick up on something you said - I don't have to have each location in a separate column, I'd be quite happy to have them all concatenated in one column with a delimiter, as long as I'm left with one row per address.

Alternatively, if this can be done in Excel with the original data set then I'm open to that, as it's going to end up there anyway!
 
But would you be able to have this:

123 Fake St | Kitchen | Bathroom | Bedroom | Lounge
56 New Road | Kitchen | Lounge

Or do you NEED to have delimiters in the second row where a location doesn't exist?

I.e. precisely as you had it before:
123 Fake St | Kitchen | Bathroom | Bedroom | Lounge
56 New Road | Kitchen | | | Lounge
 
Code:
select a.address, 
stuff(
  (select cast('|' as varchar(max)) + b.location 
   from yourtable b 
   where a.address = b.address 
   for xml path('')), 1, 1, '') as locations
from yourtable a 
group by a.address
go

 address          locations                       
 ---------------  ------------------------------- 
 12 Another Lane  Bathroom|Lounge                 
 123 Fake St      Bathroom|Bedroom|Kitchen|Lounge 
 56 New Road      Kitchen|Lounge                  

 3 record(s) selected [Fetch MetaData: 0/ms] [Fetch Data: 0/ms]

or

Code:
select a.address, 
  (select cast('|' as varchar(max)) + b.location 
   from yourtable b 
   where a.address = b.address 
   for xml path('')) as locations
from yourtable a 
group by a.address
go

 address          locations                        
 ---------------  -------------------------------- 
 12 Another Lane  |Bathroom|Lounge                 
 123 Fake St      |Bathroom|Bedroom|Kitchen|Lounge 
 56 New Road      |Kitchen|Lounge                  

 3 record(s) selected [Fetch MetaData: 0/ms] [Fetch Data: 1/ms]

.. as preferred.
 
Last edited:
That's great topdog.

I've used XML PATH before, not sure why it didn't occur to me to use it this time! Suppose I didn't really think that the locations would be fine concatenated in one field.

On an unrelated question, I see you've cast to varchar(max) on the delimiter - I'd tend to specify the number of characters rather than use max. Are there any benefits to this?

Thanks for you help guys!
 
It's only a single character in this case anyway so doesn't really matter. I'm not sure how it might compare under some heavier data/performance testing and if differences might arise then; things like this I tend to hope/rely on the database optimizer to know what to do when faced with a hard-coded constant and the best way of dealing with it.
 
Back
Top Bottom