T-SQL - Column is NULL, or False. Which to use?

Associate
Joined
11 Jun 2009
Posts
438
Hi Guys,

Just part way through writing some t-sql and I've created a bit column 'Approved' I auto import data into this column with a NULL value for approved.

Just wondering if this is proper in this case, as my WHERE statements then have 'approved IS NULL', or should I be importing as False and coding appropriately?

It'll work either way I know, but just one of those questions that's bugging me.

Thanks in advance guys.
 

Pho

Pho

Soldato
Joined
18 Oct 2002
Posts
9,325
Location
Derbyshire
I'd say import it as false. A null value for approved is surely the same as false is it not (e.g., it's not approved)? Do you actually need it to be null for any reason?
 
Last edited:
Associate
Joined
14 May 2010
Posts
1,136
Location
Somerset
Well, that depends :)

NULL should indicate that the value is unknown. So if you know that they are Approved then you should us 1, if you know they are Not Approved you should use 0. If you don't know if they are approved or not then you should use NULL.

If you will always know if they are approved or not (i.e. if everyone/thing defaults to Not Approved until Approved) then you should set the column to non-nullable and set the value to 0.
 
Associate
Joined
1 Dec 2005
Posts
803
Well, that depends :)

NULL should indicate that the value is unknown. So if you know that they are Approved then you should us 1, if you know they are Not Approved you should use 0. If you don't know if they are approved or not then you should use NULL.

If you will always know if they are approved or not (i.e. if everyone/thing defaults to Not Approved until Approved) then you should set the column to non-nullable and set the value to 0.

Exactly this. It's a tri-state attribute, although not all data access layers will treat it as such. You could change your WHERE criteria to look for approved = 1 or approved <> 1, which might simplify things for you.
 
Associate
Joined
2 Jan 2007
Posts
1,976
WHERE criteria to look for approved = 1 or approved <> 1

Beware if you do have Nulls then using this would not bring them up at all.

You would need and "Or Is Null" criteria if you want them to come through to view.
 
Man of Honour
Joined
26 Dec 2003
Posts
31,094
Location
Shropshire
I'd set it as true/false just makes more sense plus NULL's can be a bit more difficult to deal with depending on what you're planning on doing with the data.
 
Associate
OP
Joined
11 Jun 2009
Posts
438
Thanks for all the replies guys.

I was swaying towards making everything either FALSE or TRUE and, to be honest, setting the column to non-nullable completely slipped my mind when I was posting this. That makes the most sense to me.

As I said, I just needed some justification for either method as I know it would have worked either way. Something to keep in mind for future :)
 
Associate
Joined
15 Jan 2007
Posts
1,071
For flag field I normally use integers, 0 for FALSE and 1 for TRUE.

Dependent on DB engine there are BOOLEAN datatypes which are TRUE/FALSE (sometimes 0/1) or you can use BIT (0/1).

One of the neat things about some DB engines is when using a BIT or INTEGER datatype is using them in if/while statements in code, they can treat 0 as FALSE and any value above that is TRUE. Also, some coding languages use 0/1 as TRUE/FALSE as well which makes it quite handy without conversions.

eg.
_tempvar = 1;
if (_tempvar) {
-- TRUE
} else {
-- FALSE
}
 
Last edited:

aln

aln

Associate
Joined
7 Sep 2009
Posts
2,076
Location
West Lothian, Scotland.
You sure you want to go with just true/false?

Personally I'd assume you'd at least want not-processed, rejected, approved. You could do that with null, true, false, or -1, 0, 1, etc.
 

aln

aln

Associate
Joined
7 Sep 2009
Posts
2,076
Location
West Lothian, Scotland.
Or .. just to be completely crazy.. "pending", "rejected" and "approved" :p

While it probably largely doesn't matter, using strings isn't exactly the most efficient use of a database, which is why I didn't suggest it. But yeah, that'll work as well. Then again, I haven't actually got any clue about T-SQL. :p
 
Last edited:
Soldato
Joined
12 Dec 2003
Posts
8,141
Location
East Sussex
While it probably largely doesn't matter, using strings isn't exactly the most efficient use of a database, which is why I didn't suggest it.

True but the performance difference is marginal for most real-world systems. Null's meaning is totally ambiguous to anyone other than the person who wrote the code that relies it. Even worse someone might assume that it means no value. You could map a constant in the code to the value but the database in isolation holds misleading data. Unless speed is a top issue I believe clarity is always king. This is true even if you don't work in a team.
 
Last edited:
Caporegime
Joined
18 Oct 2002
Posts
29,491
Location
Back in East London
As above. Premature optimisation. Descriptive values until performance becomes an issue. Do not assume it to ever be the other way around. Why?

In short: it won't ever become a problem.

In long: it won't until you are hitting millions of transactions per hour and you will have countless far better optimisations to undertake long before this becomes an issue.
 

aln

aln

Associate
Joined
7 Sep 2009
Posts
2,076
Location
West Lothian, Scotland.
My understanding of premature optimisation is generally refering to technical debt, for example jumping on the noSQL bandwagon for wins that probably aren't there, which is going to take your typical rdbms team of devs a significant amount of effort. Doing something which is typically easy to do in your typical path of development doesn't really fall into the same criteria.

You're right that it probably won't matter in the real world, I admitted as much in my previous post, but I don't entirly agree that it's unambiguous. The null would only exist before the system actually attempted to process anything, thus having no makes some sense. Histoically I'd go with the numbers though, which can be commented in databases like MySQL but that's me living within someone elses convention anyway.

Regardless, my point wasn't really to defend any singular method of storing the data, it was to back up the argument that you probably want 3 states as opposed to 2, which is the more important issue. I don't care how you store those dates, but I will care whether or not I can tell if somethings failing, or if somethings never happening. Obviously it depends on specifics though.
 
Caporegime
Joined
18 Oct 2002
Posts
29,491
Location
Back in East London
That's one form of PO, yes. That's not what I think would be called technical debt, though. Using NoSQL or whatever flavour of the month "just because that's what the cool kids are doing" should have its own name like "fanboism" or something. Perhaps "CV Driven Development." Technical debt is when you accrue bad design for the sake of producing a product quickly/cheaply. It (the current design) will hinder development in the future, but allow you to push the current stuff under development out quicker. There are two types of TD - the stuff you know is bad, but you want to get something out the door ASAP, and then there is the stuff you just don't know about at the time and won't discover until later. Typically the latter is when there is a disconnect between the business and the developer - the develop doesn't know what is to come, so makes a design decision that may not suit the upcoming work.

Premature optimisation is very specifically about worrying about small performance gains where you sacrifice readability/simplicity for the sake of a few CPU cycles or such, and/or when you are trying to solve a problem you don't even have. What it says on the tin: it is about prematurely optimising.

But yeah, using strings is not going to make any measurable difference to the responsiveness of the system. The only time that kind of optimisation comes into it is when you are rapidly accessing/indexing millions of rows per hour. Besides which, the conversion/comparison to a string (i.e. the readable bit of descriptive text for your users to read - using "null, 1, 0" or other is going to mean nothing to them) will soak up most if not all of the gains anyway.
 
Last edited:

aln

aln

Associate
Joined
7 Sep 2009
Posts
2,076
Location
West Lothian, Scotland.
That's one form of PO, yes. That's not what I think would be called technical debt, though. Using NoSQL or whatever flavour of the month "just because that's what the cool kids are doing" should have its own name like "fanboism" or something. Perhaps "CV Driven Development." Technical debt is when you accrue bad design for the sake of producing a product quickly/cheaply. It (the current design) will hinder development in the future, but allow you to push the current stuff under development out quicker. There are two types of TD - the stuff you know is bad, but you want to get something out the door ASAP, and then there is the stuff you just don't know about at the time and won't discover until later. Typically the latter is when there is a disconnect between the business and the developer - the develop doesn't know what is to come, so makes a design decision that may not suit the upcoming work.

I don't know, excusing my poor choice of words, I would suggest CV Driven Development (that's a great term) and PO both leads to the same bad decisions which will be refered to as technical debt at some point in the future, of course this doesn't encompass the entire scope of TD, nor does TD cover the wasted development time, but we'll generally end up in the same place while we support flavour of the month that nobody really understands. In reality if it didn't take longer and doesn't become technical debt, it's actually not much of an issue in my opinion.

That said, I don't participate in CVDD and I often wonder if I haven't done the wrong thing not padding my CV because most job adverts seems to be based around such fads. I often see jobs requires 4 technologies I've never actually used because I was busy doing my job, but the guy who sats across from me constantly using new tech and comitting broken code every day probably is 'qualified'. =D

But yeah, using strings is not going to make any measurable difference to the responsiveness of the system. The only time that kind of optimisation comes into it is when you are rapidly accessing/indexing millions of rows per hour. Besides which, the conversion/comparison to a string (i.e. the readable bit of descriptive text for your users to read - using "null, 1, 0" or other is going to mean nothing to them) will soak up most if not all of the gains anyway.

While that's true, it'll likely be done on a different piece of the stack that is traditionally easier to scale.
 
Last edited:
Back
Top Bottom