Database design question

ProfessorFate · 22 Oct 2006 at 16:48

What is the best way to deal with situations where you have a table (TableA) with a primary key with say 4 fields in it, which has a relationship with another table (TableB) so that TableB has a foreign key consisting of the 4 key fields in TableA?

Is it a good idea to introduce a "rowid" here and use that instead (because it's much more convenient and probably more efficient) and how would you use it, or is this bad practice?

Any thoughts would be much appreciated

HTMLHugo · 22 Oct 2006 at 17:28

maybe ive misread in which case im sorry

but i think you just mean you have a table (a) which includes a pk and 4 fields and in table (b) you just want to refer to the 4 fields in table (a)

im wondering why not just refer to table 1 by unique identifier anyway ?

table (b) would just be a lookup table other wise which means you would be best using a rowid and a fk but im sure its not best practice to create a table for just sticking a fk in

Mr^B · 22 Oct 2006 at 18:03

Standard practice would be to NOT have a composite key here, but to do as you say and use another generated id as the key.

General rule of thumb is to never use composite keys - there's practically no situation where it would be beneficial to use one over a generated/internal/artificial ID.

Artificial keys can be created out of what would have been the composite key, such as:

280768smithec1a4qj

Which would be d.o.b. + surname + postcode.

Beansprout · 22 Oct 2006 at 18:04

Describe the situation and the data you're working with, and the relationships between the data

ProfessorFate · 22 Oct 2006 at 19:34

Thankyou all for your replies

To clarify, TableA is a table with a primary key made up of 4 of its fields (it could have other fields too but these aren't part of its key),
and there is a one-to-one relationship between TableA and TableB.

It was more of a general theoretical question really. I've always been inclined to use artifical record IDs in this case, but I've been told this was a bad thing to do (by my databases lecturer no less!).

Having done a bit of Googling, I think the reason I was told this was to do with making sure I get a properly normalised set of tables with referential integrity and constraints based on the "real" data. Having first done that, then I could look at introducing artificial keys (surrogate keys) to make implementation easier.

So looking at Mr^B's example with his "Person" table in which he replaces a composite key with an artifical key; the composite key fields still remain as a unique index in the table (to prevent duplicates), but the artifical key becomes the primary key which would be used in queries, and as a foreign key in related tables. Am I close?

Mr^B · 22 Oct 2006 at 20:36

You wouldn't use it for searching, you'd still have the other information elsewhere in the table. It would be hideously inefficient to search on a key like that.

It's just "nicer" (and also more secure if it's an externally facing system) than just having an incrementing number - many companies use it to help their customers remember their customer reference numbers, etc.

Eriedor · 22 Oct 2006 at 20:47

Best practice is to normalise the database (3rd form at least) and then de-normalise using artificial pk's inplace of composite keys.

JIMA · 22 Oct 2006 at 23:51

IMHO, having a system generated pk is fine for having one table refer to a row on another table for use by the application code, particularly in cases where the composite key contains a lot of different elements. However, I think there are reasons for keeping composite keys in a lot (most) cases.

Keeping composite keys makes it clear as to how the table is structured and the meaning of the key. This meaning would be removed by having a system-generated key.

Having system-generated keys also adds an extra level of maintenance that could be unnecessary.

In most cases I don't think having a composite key and either searching on it or using it joins to other tables would be inefficient. I've seldom found it to be a problem. In some instances having an incremental system-generated key could lead to poor performance as the database engine wouldn't be able to gather meaningful stats on key values e.g frequency etc.

Just my opinion, and apologies if I've misunderstood anything along the way.

ProfessorFate · 24 Oct 2006 at 20:43

Having worked with a database (not designed by me I should add!) where all tables used generated keys exclusively with no other indices at all (it was horrid), I think I see your point JIMA.

One article I read suggested having both, i.e; keeping the composite key as a unique index, but adding a system generated field as well, and making this the primary key. I think this is what others have suggested in their replies too. Some articles I read elsewhere even suggested having a surrogate primary key for all tables, and some said they are always bad.

I guess I need more practice and experience so I can form my own opinions, but it definitely helps having experienced people to bounce my ideas off on the forums :cool:

Mr^B · 24 Oct 2006 at 21:47

A primary key is also unique index, having more than one is pointless.

In most cases having a composite foreign key obfuscates the structure rather than clarifying it.

An example will clarify:

Table: Customer
Primary Key: Customer_ID

Table: Order
Primary Key: Order_ID
Foreign Key: Customer_ID

Table: OrderItem
Primary Key: OrderItem_ID
Foreign keys: Item_ID, Order_ID

Table: Item
Primary Key: Item_ID

All with system generated IDs and VERY easy to see how the tables relate to each other.

Table: Customer
Primary Key: Surname, Forename, DOB (composite key)

Table: Order
Primary Key: ItemName, Colour, Supplier
Foreign Key: Surname, Forename, DOB

Already, without including the 3rd table, we are breaking normalisation rules and we have duplicated data where it really isn't necessary. And the relationship between the tables becomes unclear

Composite keys are OK for mucking around with and doing noddy DIY systems. Anyone caught implementing it in any "professional" capacity needs shooting, it really is the first and fundamental principle of good relational database design.

"a smart key is a dumb thing to do, while a dumb key is a smart thing to do."

JIMA · 24 Oct 2006 at 23:19

My thinking would be that if the OrderItem table is to be used purely as a resolver (junction table) between Order and Item, dealing with the many-to-many relationship between them, there isn't anything wrong in structuring the OrderItem table as:

Table: OrderItem
Primary Key: Item_ID, Order_ID

However, if the key from OrderItem was to be used in a child table of OrderItem then there would be a definite advantage to creating an OrderItem_ID key rather than use the Item_ID, Order_ID key (possibly in multiple places).

In the case of entities like customers matching on character fields etc. is definitely not something to do, as shown in Mr^B's example. In these cases a (preferably) numeric key is a must.

It's the first point with the resolver table that I was getting at in my original mail.

Mr^B · 25 Oct 2006 at 09:12

Yup, I agree with all that.

Having another surrogate key in a table purely to resolve a many-to-many join isn't really necessary, a composite key here is the best solution.

Although there might be a reason to have duplicate order_ids and item_ids, perhaps if you wanted to track what people were thinking about adding to that order, what they removed, how they varied the quantity in the order before placing it, etc - but that might be better in a separate order audit table...

Oh what fun we have designing databases!