SQL and Standard Deviation

Associate
Joined
25 Feb 2007
Posts
905
Location
Midlands
Hi,

I need to calculate the standard deviation for the number of tasks completed each week.

I have a table consisting of task id and date completed, so the following gives me the number of taskscompleted each week:

Code:
SELECT DATEPART(wk,DateCompleted) AS WeekNum
, COUNT(TaskID) AS NumTasks

FROM Tasks

GROUP BY DATEPART(wk,DateCompleted)

I then need to get the standard deviation for each week. I've tried using STDEV() but I get null for every row.

I've replicated this in Excel and I get the correct figure.

Can this be done?

Thanks,
 
I'm guessing there is only 1 value for each WeekNum? In which case STDEV will always return null as there is only 1 value for it to calculate the deviation of.

Eg:

Code:
WeekNum		NumTasks
------------------------------
1		15
2		30
3		40
 
Hi,

Yes that's right - when I mocked this up in Excel, I was calculating the standard deviation of the number of tasks for the current week and any previous weeks. Is this possible using STDEV?

I might end up not using STDEV and calculating it manually.

Cheers,
 
I'm not sure you're getting what "standard deviation" is, properly. It's one single number that tells you information about a set. It applies to the entire set as a unit, not to each individual item within the set. Same as the mean. Each item doesn't have a mean, only the set does.

If I have the numbers (1,2,3,4,5,6), they have a mean of 3.5. They have an SD of (at a guess) 2. Each number in there doesn't have a mean, and nor does it have an SD. Only the set as a whole.

So.

What you either need to do is realise that the SD applies only to the entire set of weeks, or what I suspect you're looking for is the actual distance from the mean for each row. "distance from the mean for each data item" probably doesn't have a statistical name or direct function to calculate it (unlike "average distance from the mean for the data set, aka SD") so you'll have to do it manually, determining the mean and subtracting the row value from it, and removing the sign.

Probably you should stick to just looking at the SD though. It means more than looking at a list of "distance from the mean" for each data item would.

P.S. So hopefully you realise you can't pull this out from the same single SQL query, as it's not a per-row thing. You should pull it separately or just calculate it in PHP.
 
Last edited:
Thanks for your reply.

A user has requested this for a report I'm writing in reporting services.

I currently have a graph showing a trend in the number of tasks completed each month. The user want's a rolling average to also be plotted on the graph and also lines to show this +/- the SD - is this what you were suggesting I needed?

From speaking to the user, it also sounds like the SD should be 'rolling', but I'm not sure if this is correct?

As mentioned above, this is in Reporting Services, so needs to be entirely in T-SQL

Thanks,
 
I know nothing about T-SQL I'm afraid, I write stuff with PHP/MySQL so have more flexibility with how I process data. Not sure how to achieve specifically what you're looking for, in fact I don't even know how a "rolling average" would work, or what one is, off the top of my head. I'd guess at it being a "limited" average for a given data point based on X number of points either side of it, in which case, yes, you could have a "rolling average" for each data point, and even a "rolling SD" for each point, but how you'd go about pulling that from any sort of SQL-only interface... I'm envisaging an absolute nightmare of nested queries.
 
Back
Top Bottom