Creating a database of what's in my database?

Hamish · 26 Nov 2007 at 23:32

ok basically at work I look after these banking databases, they have around 800 tables, many of which have around 120 columns in each. I write a lot of reports using lots of these tables and have to join them all over the place, I'm trying to make a database which will show me what columns will match up to other columns: simple query! Only problem is getting the table and column names into the database, ideally I would have a table like so:

tablename, columnname, datatype, comment

Easy enough! Now how do I populate it?! I'm not going through 350,000 columns by hand!

SiriusB · 26 Nov 2007 at 23:58

Came across these commands with a quick search:

Code:

select * from information_schema.columns
select * from information_schema.key_column_usage

From here: http://www.thescripts.com/forum/thread81434.html

Beansprout · 26 Nov 2007 at 23:59

Depending on the db you could probably link it to Visio

Goksly · 27 Nov 2007 at 00:16

All databases will have information on themselves - but different databases have different system tables.

Hamish · 27 Nov 2007 at 00:29

it's MSSQL can use either 2000 enterprise or 2005 express. Guessing it's sysobjects I need?

SiriusB · 27 Nov 2007 at 01:22

Woops! Just reloaded firefox and it posted again :/

Haircut · 27 Nov 2007 at 07:38

It will also depend on whether they have foreign keys defined, banks tend not to be good at that sort of thing!

I worked at an investment bank on a system that was used to make hundreds of millions each year and there wasn't a single primary or foreign key in the database :eek:

Daveyboyuk · 27 Nov 2007 at 09:16

Haircut said:
It will also depend on whether they have foreign keys defined, banks tend not to be good at that sort of thing!

I worked at an investment bank on a system that was used to make hundreds of millions each year and there wasn't a single primary or foreign key in the database

Denormalised for speed maybe? We do that a lot (I work on IB systems).

Dj_Jestar · 27 Nov 2007 at 10:11

Denormalised for speed?! I've never heard anything so ironic in my life :/

To the OP, use indexes and temporary views if your queries are becoming too complex.

Haircut · 27 Nov 2007 at 10:41

Daveyboyuk said:
Denormalised for speed maybe? We do that a lot (I work on IB systems).

No, they just couldn't be bothered (or most likely didn't have the time after traders kept changing their mind every two minutes) to do it properly.

What's ironic about denormalising a database for speed purposes anyway? It's pretty much what happens in a data warehouse.

rpstewart · 27 Nov 2007 at 11:16

Haircut said:
What's ironic about denormalising a database for speed purposes anyway? It's pretty much what happens in a data warehouse.

Yup, it's standard practice. Space is cheaper than CPU so rather than spending time joining left right and centre it's far quicker to have big tables and simple indexes.

Dj_Jestar · 27 Nov 2007 at 11:27

Bigger tables need more resources to search.. which is why indexes and views were designed.

Daveyboyuk · 27 Nov 2007 at 11:36

Dj_Jestar said:
Bigger tables need more resources to search.. which is why indexes and views were designed.

You optimise table for it's purpose. Denormalising for speed is standard practice. A table that you rarely search and but do a lot of UDI, you would not put an index on as this carries its on over head of having to update the index. For large amounts of data, column based databases like KDB beat the socks of SQL.

rpstewart · 27 Nov 2007 at 11:52

Dj_Jestar said:
Bigger tables need more resources to search.. which is why indexes and views were designed.

You still index the denormalised table. If you normalise the data and have to retrieve columns from two tables then you need to do (as a minimum) two index scans, a join (which is expensive) and two fetches. On a denormalised table you do one index scan and one fetch.

Views in my opinion are a waste of time. In almost all cases you have to execute the view SQL first into a temporary data set and then run the user query against that. You're better off either materialising the view (if you're database supports it) or creating a summary table as a batch process.

Daveyboyuk · 27 Nov 2007 at 12:12

I agree with rpstewart aswell, views are a waste of time.

Hamish · 27 Nov 2007 at 19:39

Yep it doesn't have any foreign keys! only primary keys, the program sorts out all the validation. It's OPICS by the way.

oneilldo · 27 Nov 2007 at 21:08

I don't have SQL to hand but from the top of my head you could try the following.
NB I can't remember how the master table links in that defines what the type of column is (int,varchar etc). Will you want the length of the field as well? you could do select name from master..sysdatabases where dbid > 4 and then put the results in a cursor and cycle through SQL similar to whats below for each DB (I can post exact SQL if need be).

I can post the exact SQL tomorrow if need be but the following should get you started

--unsure on this as can't remember the master table to link to
select so.name as TableName, sc.name as ColumnName, mt.Type as ColumnType, 'COMMENTS' as Comments
FROM sysobjects so, syscolumns sc, master..xtypes mt
where so.id = sc.id
and sc.type = mt.type --UNSURE HERE - will check tomorrow
and so.xtype = 'U' --user table
order by so.name, sc.name

--This will definately run and get a list of tables with each column for the current DB you are in
select so.name as TableName, sc.name as ColumnName, 'COMMENTS' as Comments
FROM sysobjects so, syscolumns sc
where so.id = sc.id
and so.xtype = 'U' --user table
order by so.name, sc.name

oneilldo · 27 Nov 2007 at 21:10

rpstewart said:
You still index the denormalised table. If you normalise the data and have to retrieve columns from two tables then you need to do (as a minimum) two index scans, a join (which is expensive) and two fetches. On a denormalised table you do one index scan and one fetch.

Views in my opinion are a waste of time. In almost all cases you have to execute the view SQL first into a temporary data set and then run the user query against that. You're better off either materialising the view (if you're database supports it) or creating a summary table as a batch process.

Completely agree, although I hope you mean index seek and not scan

Clustered Indexed views might offer some gain (I assume materialising the view is the same in oracle)

rpstewart · 27 Nov 2007 at 21:57

oneilldo said:
Completely agree, although I hope you mean index seek and not scan
Clustered Indexed views might offer some gain (I assume materialising the view is the same in oracle)

I deal with DB2 so I use those terms but yes it's a proper B-Tree seek rather than a top down scan.

SiriusB · 28 Nov 2007 at 01:03

Speaking as a DB n00b.. but what's wrong with using views?

I use one in a database I have created to GREATLY simplify some of my SQL queries. Admittedly this is a very small database compared to the huge ugly monsters some of you guys obviously deal with.