I expect many of us are doing 3rd or 4th year projects in our degree course at the moment, and I thought it would be interesting to post some details about them.
Mine has the ever so catchy title "An Investigation into the Application of Regression, Expectation Maximisation (EM) and Neural Network Methods to the Imputation of Missing or Incorrect Data".
Basically this involves analysing an existing data set for either missing values (easy enough) or erroneous values (a bit less easy). Once the data has been analysed and the outliers found, the real fun begins.
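To give a rough idea of what that detection step can look like, here's a minimal sketch in Python. It is not the code from my utility: the function name `flag_suspect_values`, the sample readings and the 3.5 cut-off are all made up for illustration, and it assumes a single numeric column where outliers are flagged with a modified z-score based on the median absolute deviation (MAD).

```python
import math
import statistics


def flag_suspect_values(values, threshold=3.5):
    """Return (missing_indices, outlier_indices) for one numeric column.

    Hypothetical helper: missing entries are None or NaN, and outliers
    are values whose MAD-based modified z-score exceeds `threshold`.
    """
    missing = {i for i, v in enumerate(values)
               if v is None or (isinstance(v, float) and math.isnan(v))}
    present = [(i, v) for i, v in enumerate(values) if i not in missing]
    if not present:
        return sorted(missing), []

    median = statistics.median(v for _, v in present)
    mad = statistics.median(abs(v - median) for _, v in present)
    if mad == 0:
        # No spread around the median, so this score is undefined; flag nothing.
        return sorted(missing), []

    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # when the underlying data is roughly normal.
    outliers = [i for i, v in present
                if 0.6745 * abs(v - median) / mad > threshold]
    return sorted(missing), outliers


# Made-up sample data: one gap and one obviously wrong reading.
readings = [10.1, 9.8, None, 10.3, 55.0, 10.0, 9.9]
missing_idx, outlier_idx = flag_suspect_values(readings)
print(missing_idx, outlier_idx)
```

The median and MAD are used here instead of the mean and standard deviation because a single wild value drags the mean towards itself and can hide the very outlier you are looking for.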
The aim of the project is to impute these erroneous values back to what they should be, based on the overall trend of the data. This can be done by formulating simple, multiple or polynomial regression models, as well as by using probabilistic methods such as the Expectation Maximisation (EM) algorithm. Next I am going to look at whether it's possible to improve on these methods by using Neural Networks to formulate a data model.
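For anyone wondering what regression imputation boils down to, here's a small Python sketch of the simplest case only (simple linear regression with one predictor). Again, this is an illustration rather than my actual utility: `fit_line`, `impute_with_regression` and the sample numbers are hypothetical, and it assumes the suspect rows have already been identified and that the clean rows contain at least two distinct x values.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes the x values are not all identical
    intercept = mean_y - slope * mean_x
    return slope, intercept


def impute_with_regression(xs, ys, suspect_idx):
    """Replace the y values at `suspect_idx` with the fitted line's prediction.

    The line is fitted on the remaining (trusted) rows only, so missing or
    erroneous values do not influence the model that replaces them.
    """
    suspect = set(suspect_idx)
    clean = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i not in suspect]
    slope, intercept = fit_line([x for x, _ in clean], [y for _, y in clean])
    return [slope * x + intercept if i in suspect else y
            for i, (x, y) in enumerate(zip(xs, ys))]


# Made-up data following y = 2x + 1, with one gap and one bad reading.
xs = [0, 1, 2, 3, 4]
ys = [1.0, 3.0, None, 7.0, 100.0]
print(impute_with_regression(xs, ys, suspect_idx=[2, 4]))
```

Multiple and polynomial regression follow the same pattern (fit on the trusted rows, predict the suspect ones), just with more terms in the model.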
It's all going pretty well so far and is just about on target, so I should end up with a pretty decent mark at the end of it all.
Here are a few screenshots of the utility I'm working on to automate all this (very primitive).
Ignore the error exception. The highlighted bit is what's important.
Now you