Simple database /data storage and analysis solution?

I volunteer at a local charity who have a small office LAN and want a new software system to manage the storage and handling of their offline data.

Currently they store all their data in spreadsheets (no doubt unencrypted). Different people have their own spreadsheets they've created: some list horse information such as name, age, people who have previously loaned them and the dates of those loans, and certain special requirements; other sheets hold staff/volunteer data, others donor/trustee information, others financial data, mailing lists et al.

Sometimes data gets overwritten because two people are altering a file at once.

What I suggested as a possible solution is a standard relational database, with tables representing entities derived from the imported, normalised spreadsheet data. Then input forms (backed by SQL) to let users enter information into the database, forms to let users bring up whatever data they want, and some custom queries to expedite common lookups. I'd also like data to be exportable to various common formats such as Word, Excel and PDF as appropriate, and maybe HTML/PHP for the website too. But I haven't finished discussing requirements relating to their website and how such a database could be useful to them there.
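To make the idea concrete, here's a minimal sketch of what a few normalised tables might look like, using SQLite in Python purely as a stand-in for whatever DBMS they end up on. All table and column names (`horses`, `contacts`, `loans`) are illustrative guesses based on the spreadsheets described, not a finished design.

```python
import sqlite3

# In-memory SQLite stand-in for the eventual schema; names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (
    contact_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    role       TEXT          -- e.g. 'loaner', 'volunteer', 'donor'
);
CREATE TABLE horses (
    horse_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    age      INTEGER,
    special_requirements TEXT
);
CREATE TABLE loans (
    loan_id    INTEGER PRIMARY KEY,
    horse_id   INTEGER NOT NULL REFERENCES horses(horse_id),
    contact_id INTEGER NOT NULL REFERENCES contacts(contact_id),
    start_date TEXT,
    end_date   TEXT
);
""")
conn.execute("INSERT INTO horses (name, age) VALUES ('Bella', 12)")
conn.execute("INSERT INTO contacts (name, role) VALUES ('J. Smith', 'loaner')")
conn.execute(
    "INSERT INTO loans (horse_id, contact_id, start_date) VALUES (1, 1, '2013-05-01')"
)
# A 'common query': each loan row resolves to a horse and a contact.
rows = conn.execute("""
    SELECT h.name, c.name FROM loans l
    JOIN horses h ON h.horse_id = l.horse_id
    JOIN contacts c ON c.contact_id = l.contact_id
""").fetchall()
print(rows)
```

The point of the join at the end is that loan history stops living inside a horse's spreadsheet row and becomes its own table linked by keys, which is what kills the duplicate/overwrite problem.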

Of course I'll need to be able to set up various user groups with appropriate permissions. Some way of managing it remotely would be great for support purposes too. I'll also need a decent encryption system, maybe utilising key files on pen drives as well as a reasonable-length password for high-level users?

I may be a little out of the loop here regarding databases, cos it's been a few years since I studied them, but what do you guys think?

Also, any suggestions for a decent database software package that a charity could afford (free obviously being preferable), and something that doesn't require a supercomputer to run, please. ;)
 
Use a hosted website that includes MySQL with phpMyAdmin. I heartily recommend TSOHost for this.

You can then use the online forms to CRUD the data in your database.

You can specify permissions.

You can export this data.

You can allow remote access to the database.

You can create a PHP website to show the data.


If you wanted a localised approach, you can use MS SQL Express for free. It has limitations, but there's plenty of support on the web.
 
Depending on what you're storing in the database, you may need to check the organisation is registered with the ICO. It only costs a few quid, but it's illegal not to register for certain data types.

Only saying this as one of my mates was caught out recently.. :-P
 
You see, I had considered a web-based approach, and I use Tsohost for my own web hosting, but I doubt there's much chance of them going for that; they don't have a huge amount of technical understanding and, on initial impression, seemed a bit wary about any suggestion of "remote access".

Thus it seems like it's gonna have to be a localised approach.

I didn't realise that Microsoft had any free DBMS available, so thanks for that; I'll definitely check it out.

I'm dreading having to read up on the legal side, hopefully though I shouldn't need to personally process any confidential information they store.
 
If you work for a charity, you may be entitled to free legal advice somewhere.

As for the 'remote access', I'm sure TSoHost or another forum member can ease your mind.

If you use a free MySQL DB client and secure passwords, the end user won't notice a difference.
 
Yeah I'm sure they will have some sort of legal advice, however with regards to security it's not my mind that needs easing, I use Tsohost!

Though admittedly, the fact that Tsohost allows one to change the cPanel password via the my.tsohost area without needing to know the previous one, combined with the fact that the cPanel password is displayed in plaintext above that same input form (and is thus clearly not hashed), does concern me somewhat. :p
 
Perhaps set up MySQL locally with all the data in it, then create views of the data that would have a use on the website, and find a way to replicate those views into a separate web-hosted MySQL DB. That way the more sensitive info is only stored locally. I can't recall if the MySQL synchronise feature allows just specific tables or views, or only the whole DB.

If that isn't the solution, you could write some SQL to truncate the table(s) in the web MySQL (this drops the current data but keeps the structure), then run insert queries to push the specific data from the local DB back to the web DB. I've assumed here you only want some reference data in the web DB rather than transactional data. Toad for MySQL could be used locally to connect to both DBs and run the SQL to do the above.
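A rough sketch of that truncate-and-insert push, using two in-memory SQLite databases to stand in for the local and web-hosted MySQL servers. The `horses_public` table and all column names are made up for illustration; note SQLite has no `TRUNCATE`, so `DELETE` plays that role here.

```python
import sqlite3

# Two in-memory DBs stand in for the local server and the web host's MySQL.
local = sqlite3.connect(":memory:")
web = sqlite3.connect(":memory:")

local.executescript("""
CREATE TABLE horses (horse_id INTEGER PRIMARY KEY, name TEXT, sponsor_notes TEXT);
INSERT INTO horses VALUES (1, 'Bella', 'private'), (2, 'Star', 'private');
""")
web.execute("CREATE TABLE horses_public (horse_id INTEGER PRIMARY KEY, name TEXT)")

def push_reference_data(local, web):
    """Drop the current web rows but keep the structure, then re-insert
    only the non-sensitive columns (the truncate-and-insert pattern)."""
    rows = local.execute("SELECT horse_id, name FROM horses").fetchall()
    with web:  # one transaction: either the full refresh lands or nothing does
        web.execute("DELETE FROM horses_public")  # stands in for TRUNCATE
        web.executemany("INSERT INTO horses_public VALUES (?, ?)", rows)

push_reference_data(local, web)
print(web.execute("SELECT name FROM horses_public ORDER BY horse_id").fetchall())
```

The sensitive `sponsor_notes` column never leaves the local DB, which is the whole point of only replicating reference data outwards.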
 
After my recent experience there I definitely agree that there needs to be some mirroring of data between local and web servers. Aside from the fact that the internet connection is unreliable, their electric also went off for over an hour when I was there the other day, and we can't afford for data submitted to the web server to just vaporise if the local network is down.

Can you recommend any good software for the actual design/modelling process for database projects? I've realised that the technical aspects are the easy part for me; modelling the organisation's processes that the database needs to facilitate, and things I can vaguely recall from uni like "use cases" :p and all that UML stuff for giving a graphical representation, are harder for me to take into account. It would be a lot easier for them to understand the structural design that way too, rather than showing them an entity-relationship diagram!
 
I think I would first use a spreadsheet to record what data each process needs to capture, gathered from those who currently create the ad hoc spreadsheets you mentioned. I'd also pay particular attention to whether each piece of data is always available at the time the record is created, as understanding this will help in defining required fields. It may be that some processes don't always have all the data to fully create a record initially, and are instead back-filled by the user revisiting the record later. You may not be able to appreciate this from their fully completed current spreadsheets, but the requirements gathering should aim to clarify it.

It sounds like you also have a bit of a mission in getting those people to buy in to creating processes going forward, rather than the way they've been used to working with spreadsheets. Once you know what they need (which I'm guessing is mainly what's in their existing spreadsheets), I would see how you can normalise to a degree, less so for transactional data, e.g. price should represent the price paid, not a foreign key to the current price.

Perhaps use something like Visio to create flow charts for each process, with call-outs to other processes they rely on or enable. When you collect the users' requirements for data capture, you can group them into processes/screens as you record them into a spreadsheet, depending on the order a user would work through when recording that data. You may need to consider user access control too, i.e. which type of user can perform each process.

I think it was SSADM when I was at uni, but I think you can get too carried away with that side of things and end up with a set of processes that look pretty but lose sight of the detail that needs to be captured. I've always used Access to model the DB, as I've always found it quicker to put the structure together. It can give you ERDs and table structures to print out and work from later too; although some field types will differ, there will be comparable types you can relate them to in MySQL.
 
Thanks. I've been pretty careful about making sure whether they will have the information available as soon as the record is created. For example, I thought about using the horse's passport # as the unique ID to increase efficiency, but obviously that's not going to work if that info isn't available at the start.

Regarding server specifications for this kind of project, with only a few local users, what would you recommend? Just any old computer running Linux with RAID 1 and a backup power supply?
 
Hopefully the process talks you have with them will thrash out whether they need to enter horse details prior to having the passport #. If they do, you might need to use an auto-number field, but then you'd not be able to make the passport # a required field, and you'd need some secondary process to flag up records that aren't filled in as much as would be desired.
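That "secondary process to flag up records" could be as simple as a query for rows where the passport number is still NULL. A sketch, with SQLite standing in for MySQL and all names hypothetical:

```python
import sqlite3

# Auto-number internal ID plus a nullable passport column, as discussed.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE horses (
    horse_id    INTEGER PRIMARY KEY AUTOINCREMENT,  -- internal auto-number ID
    name        TEXT NOT NULL,
    passport_no TEXT                                -- nullable until known
)""")
conn.execute("INSERT INTO horses (name, passport_no) VALUES ('Bella', 'GB123')")
conn.execute("INSERT INTO horses (name) VALUES ('Star')")  # passport not yet known

# The 'flag up' query: records still waiting to be back-filled.
incomplete = conn.execute(
    "SELECT horse_id, name FROM horses WHERE passport_no IS NULL"
).fetchall()
print(incomplete)
```

The front end could run this on a schedule or on demand and present the list as a "chase these up" report.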

Server-wise, go with whatever is going to be most suitable for whoever manages it going forward, as far as keeping it patched goes. If you go the Windows route, make sure the HDDs are not using FAT, as it limits MySQL in size and efficiency. Yes, I would use RAID 1 and backup power, but also some sort of offsite data backup too.

Also, when creating the front end you should consider using transactions to commit data into the DB. For instance, a screen in your front-end UI may be collecting several bits of data that belong in different, related tables. You'd want a transaction to wrap the posting of this data to all the tables involved, so that it commits only when every insert succeeds and rolls back otherwise. Something in your front end should check for transactions that have not been committed, and successful transactions should trigger the cached copy of that transaction in the front end being dropped. That should reduce the opportunity for the DB to become corrupt.
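As a sketch of that transaction wrapping, here's the pattern in Python with SQLite standing in for MySQL (table names are illustrative): if any insert in the group fails, the whole post rolls back rather than leaving half a record behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE horses (horse_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE treatments (
    treatment_id INTEGER PRIMARY KEY,
    horse_id     INTEGER NOT NULL REFERENCES horses(horse_id),
    detail       TEXT NOT NULL
);
""")

def save_horse_with_treatment(conn, name, detail):
    """Post related rows to both tables inside one transaction: if any
    insert fails, everything rolls back and nothing half-lands."""
    try:
        with conn:  # commits on success, rolls back on exception
            cur = conn.execute("INSERT INTO horses (name) VALUES (?)", (name,))
            conn.execute(
                "INSERT INTO treatments (horse_id, detail) VALUES (?, ?)",
                (cur.lastrowid, detail),
            )
        return True
    except sqlite3.Error:
        return False

ok = save_horse_with_treatment(conn, "Bella", "worming")
bad = save_horse_with_treatment(conn, "Star", None)  # NOT NULL violation -> rollback
print(ok, bad)
print(conn.execute("SELECT name FROM horses").fetchall())  # 'Star' never landed
```

The second call shows the point: the `horses` insert succeeded, but because the `treatments` insert violated a constraint, the whole transaction was rolled back and no orphan horse row remains.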

You could also put some sort of numbering-pool logic together, where each record takes the next number from a pool; then you can check for missing record numbers, which would indicate records that have been deleted, lost, or are waiting in the front end to be committed. You'd need some logic in the front end that lets the current user lock that numbering table momentarily whilst writing something to indicate the next available number has been taken (and therefore has a record being created against it). If the transaction is abandoned in a controlled manner, you can update the number pool to make that number available again. I'd date/time stamp all records too, so if a problem does occur you can trace an affected time period.
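A minimal sketch of the numbering-pool idea, with SQLite standing in for MySQL. The transaction plays the role of the momentary lock on the pool, numbers given back in a controlled abandonment are reused first, and records are timestamped; all table names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE number_pool (next_no INTEGER NOT NULL);
INSERT INTO number_pool VALUES (1);
CREATE TABLE returned_numbers (record_no INTEGER PRIMARY KEY);
CREATE TABLE records (
    record_no  INTEGER PRIMARY KEY,
    created_at TEXT NOT NULL DEFAULT (datetime('now'))  -- audit timestamp
);
""")

def take_number(conn):
    """Claim the next record number, reusing any returned ones first."""
    with conn:  # the transaction acts as the momentary lock on the pool
        row = conn.execute("SELECT record_no FROM returned_numbers "
                           "ORDER BY record_no LIMIT 1").fetchone()
        if row:
            conn.execute("DELETE FROM returned_numbers WHERE record_no = ?", row)
            return row[0]
        no = conn.execute("SELECT next_no FROM number_pool").fetchone()[0]
        conn.execute("UPDATE number_pool SET next_no = next_no + 1")
        return no

def give_back(conn, no):
    """Abandoned in a controlled manner: make the number available again."""
    with conn:
        conn.execute("INSERT INTO returned_numbers VALUES (?)", (no,))

a, b = take_number(conn), take_number(conn)   # claims 1, then 2
give_back(conn, a)                            # 1 goes back to the pool
c = take_number(conn)                         # reuses 1 before taking 3
conn.execute("INSERT INTO records (record_no) VALUES (?)", (b,))
print(a, b, c)
```

Any gap that persists in the committed `records` numbering then points at a lost or still-uncommitted record, which is the audit check being described.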
 
Thanks for your advice, you clearly have a pretty advanced DB knowledge!

After talking with them, they do enter details before having the passport number, as far as the transfer waiting list is concerned. Maybe I could suggest obtaining the passport number at the starting point; I'm not sure.

I wonder if "Registered name" would be a suitable primary key for the horses? I can't imagine that, in reality, two horses at a centre that size would ever have the same registered name, but is it bad design?

As far as the server goes I'll probably be using Linux to keep costs as low as possible.

This is what I have so far regarding structure, just based on the information they gave me. Obviously there's a lot of improvement to be done and relationships to be added!

[Attachment: diagramA.jpg — draft table structure]
 
Thanks, I've done a few over the years but am more into analysis and data migrations these days.

Not sure if 'Waiting list' relates to horses that haven't yet made it into the 'Horses' table? If not, then the HorseID should be all that's needed to link to 'Horses', and 'Waiting list' would just hold stuff relating to the waiting-list request?

'Age' doesn't seem right, but do horses have DOBs? That brings me onto the passport number: could you use horse name and DOB as a combined key? It's pretty unlikely to have two horses named the same born on the same day.

I wouldn't use the name on its own, as I'm guessing they acquire horses after they've already been named, and there are no doubt some common names that crop up. You could keep it as you have it, though, and treat the HorseID with a unique ID as an internal ID that the centre allocates to each horse. If you did that, you'd need some validation in the front end that triggers a query after the horse name and DOB are entered to see if a record already exists with those details, and presents it to the end user in a separate pane of the UI with some sort of warning to draw their attention.
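That duplicate check might look something like this. A sketch using SQLite in Python; the `birth_year` column (a year rather than a full DOB), the table layout and the sample data are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE horses (
    horse_id   INTEGER PRIMARY KEY,   -- internal ID allocated by the centre
    name       TEXT NOT NULL,
    birth_year INTEGER
)""")
conn.execute("INSERT INTO horses (name, birth_year) VALUES ('Bella', 2001)")

def possible_duplicates(conn, name, birth_year):
    """Return existing records the user should be warned about before
    a new horse with the same name and birth year is created."""
    return conn.execute(
        "SELECT horse_id, name, birth_year FROM horses "
        "WHERE name = ? AND birth_year = ?",
        (name, birth_year),
    ).fetchall()

warn = possible_duplicates(conn, "Bella", 2001)   # matches -> warn the user
clear = possible_duplicates(conn, "Star", 2005)   # no match -> carry on
print(warn, clear)
```

The front end would run this after those two fields are entered and, if the result is non-empty, show the matching records in a separate pane with a warning rather than blocking the insert outright.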

I think your 'Treatment type', 'Wormer Type', 'Work Done', 'Work Type', 'Procedure Type' and 'Vaccination Type' across the Horse Info tables would be better as foreign keys to respective category tables. This will prevent several different spellings or descriptions being entered for essentially the same thing, and make later analysis of the data easier.
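The category-table idea sketched out: the free-text type column becomes a foreign key into a lookup table, so a misspelt or unknown category is rejected outright. SQLite stands in for MySQL here (note it needs foreign-key enforcement switched on explicitly), and all names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE treatment_types (
    type_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE      -- the one canonical spelling
);
CREATE TABLE treatments (
    treatment_id INTEGER PRIMARY KEY,
    horse_id     INTEGER NOT NULL,
    type_id      INTEGER NOT NULL REFERENCES treatment_types(type_id)
);
INSERT INTO treatment_types (name) VALUES ('Worming'), ('Vaccination');
""")

# A known category: accepted.
conn.execute("INSERT INTO treatments (horse_id, type_id) VALUES (1, 1)")

# An unknown category ID: refused by the foreign key constraint.
try:
    conn.execute("INSERT INTO treatments (horse_id, type_id) VALUES (1, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

In the front end the lookup table is exactly what populates the drop-down boxes, so users pick a category rather than type one.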

You might want some sort of provider info too, so you know where, for instance, a vet treatment was procured from; likewise for the other medical-type tables.

You might also want 'Location', 'Height' and 'Weight' in separate many-to-one tables, so that a date can be recorded on each record to give some relevance to when each of these measurements was taken.

eg
table: height
HorseID
Height
Date height taken
 
Thanks. The waiting list is for horses that haven't made it to the centre but whose owner has requested they be transferred there. And no, horses don't have DOBs; their age is purely based on the year they were born. For example, my mare was born in 2001, so she's classed as 12 years old regardless of month or day.

As for "work done" etc., I was thinking drop-down boxes, because there are a number of common treatments that will be entered, which they have given me a list of.

The ContactID is supposed to be the vet/farrier etc., so they can tell who performed the treatment.
 
I was wondering, do you think for the server I'd be better off with 2x ~64GB SSDs rather than 2x ~500GB HDDs?
 
Depends how many people are likely to be accessing it at once. SSDs may be marginally faster, but at least with traditional HDDs you usually get some warning that they're starting to fail (event ID 7s etc.).

Ultimately you'd be backing it up anyway so the failure thing shouldn't matter too much.
 
Well, isn't that precisely what RAID is for? :p

I just remember being told somewhere that the performance of transactional databases is greatly increased when operating from an SSD or RAM disk.

In any case, I'm now setting up a MySQL server on my PC for test purposes.
 
RAID shouldn't be used instead of a backup solution. If both disks fail (which can happen) then you still need a backup of the data to restore.

In regards to the disks, unless the database is going to be under serious load (are we talking 10 users, or 10,000?), I would go with traditional HDDs. I would imagine cost is an issue if this is for a charity? I doubt the cost vs benefit will be worth it.
 
I mean with regards to availability, i.e. it not mattering about a warning of an imminent failure. I will of course be using a backup; even with all the protections I can put on it, I wouldn't be surprised if it became infested with viruses. :p

Only a few users. If the consensus is that hard disks are better then I will go with that, but in my example the SSD costs the same as the hard drive, so there's no cost issue (I imagine the database, OS and software will not exceed 64GB).
 