Best way to analyse 1m rows of data?

I use qlikview, Excel, R, MySql

If you don't want to install any new software, that rules out QlikView and R (and R has too steep a learning curve if you're averse to new tools anyway).
Of the two, QlikView would be the easier to pick up.


But if you have Access, use that, as it has a familiar GUI.

How many excel sheets need importing?

Usually I convert the Excel sheets to CSV, import them into a database, then analyse with the appropriate tool.
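
If it helps, that workflow is only a few lines in R using DBI. The file name, table name and connection details below are just placeholders for illustration, not anything from this thread:

library(DBI)

# connect to a local MySQL database (details are made up)
con <- dbConnect(RMariaDB::MariaDB(), dbname = "analysis",
                 user = "me", password = "secret")

# load one converted CSV and push it into a table
orders <- read.csv("orders_2023.csv", stringsAsFactors = FALSE)
dbWriteTable(con, "orders", orders, overwrite = TRUE)

# then analyse with whatever tool suits, e.g. a quick SQL aggregate
dbGetQuery(con, "SELECT region, COUNT(*) AS n FROM orders GROUP BY region")

dbDisconnect(con)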
 
How would you use Access? I had to build some relational databases during my master's, but that was a long time ago!

I've got Access, so if there's a better way I'm all ears.

My problem would be getting all the data out of the sheets and into the program. Unless you can batch import? (There are 5 tabs in each workbook and I only need one of them, not the rest.)
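
If R is still on the table, batch-importing one tab from every workbook is straightforward with the readxl package. The folder path and tab name here are guesses, just to show the shape of it:

library(readxl)

# every workbook in the folder (path is made up)
files <- list.files("C:/data/monthly", pattern = "\\.xlsx$", full.names = TRUE)

# read only the tab you care about from each workbook, ignoring the other four
one_tab <- lapply(files, read_excel, sheet = "Data")

# stack them into a single data frame ready for import/analysis
all_data <- do.call(rbind, one_tab)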

The GUI on Access is pretty straightforward, once you've got the data sorted out.

You probably need to give more detail on what you're doing with the data in terms of VLOOKUPs etc., but I'd argue it's as easy, if not easier, to do the equivalent in Access using the GUI to link a few tables and then create the output.

Importing data is also pretty straightforward using the wizards and you can save the steps involved to make it easily repeatable.

I'd imagine a half day spent building on your database knowledge to learn the importing, table linking and a couple of formulae in Access would pay off in terms of processing time very quickly with data sets that big. I certainly got a lot of payback moving about 600k lines from Excel to Access recently.
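
For what it's worth, the "link a few tables and create the output" step is just a join, whatever tool you do it in. Here's the rough equivalent sketched in R, with invented file and column names:

orders    <- read.csv("orders.csv")
customers <- read.csv("customers.csv")

# the equivalent of a VLOOKUP from orders into customers on a shared key
output <- merge(orders, customers, by = "customer_id", all.x = TRUE)
write.csv(output, "orders_with_customers.csv", row.names = FALSE)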
 

Seconded

For me, I moved to MySQL.
Learning the database language and its differences from Excel took a bit of time, but once you have your schema and tables created it's so much more useful.

For basic use, just being able to edit and add to the data, and then have every other dependent tool access it on the fly, is so useful.
The time to learn a few basics is more than worth it, even if you only have to do this sort of job a few more times.

Excel, R, QlikView etc. can now all access an up-to-date version of the data without the need for multiple copies of the original files.

I also keep a backup as a CSV in case I corrupt the database, just to be on the safe side!
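
A rough sketch of what that looks like from R, with made-up connection details and table name: pull the live table for analysis, and drop a CSV copy as the safety net.

library(DBI)

con <- dbConnect(RMariaDB::MariaDB(), dbname = "analysis",
                 user = "me", password = "secret")

# every tool reads the same up-to-date table
sales <- dbReadTable(con, "sales")

# CSV safety copy in case the database gets corrupted
write.csv(sales, "sales_backup.csv", row.names = FALSE)

dbDisconnect(con)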
 
Looking for any help from Excel Gurus on this one!

I've written a macro to pick up and process all the files in a folder, but I'm finding Excel struggles as it gets close to the 1 million row mark and crashes every time.

My brother wrangles million-row spreadsheets routinely. Some tips I've learned from him:

Excel's calculation engine is multi-threaded and can work on independent chunks in parallel. Design your spreadsheet so the data is processed across multiple tabs.

The VBA engine is single-threaded. Unwind as much VBA as possible into worksheet formulas on those separate tabs. Yes, using VBA makes the spreadsheet much cleaner, but it introduces a huge bottleneck.

Repeated identical lookups are bad. Try to do the lookup once, store it somewhere, and then reference that. Again, you're trading code cleanliness for efficiency.

FWIW I asked a similar question on his behalf when he encountered the VBA issue and he was recommended to use Matlab.
 