Best way to analyse 1m rows of data?

Thug
Soldato
Joined
4 Jan 2013
Posts
3,783
Looking for any help from Excel Gurus on this one!

I've written a macro to pick up and process all files in a folder, but I'm finding excel is struggling when it gets close to the 1 million row mark and is crashing everytime.

Is there a better way to do this? Anything that would be causing the crashing? Am I pulling it wrong, do I need to split it?

Usually I'd take a sample, but on this occasion I actually need the data for totals, and for generating stats...

There is a further caveat, it'll have to be done on a pc with excel 2007 (don't ask). pc is reasonably specced (4770k, 8gb ram), but nothing special. Can't see speed being a hindrance but I think excel 2007 has limitations?

There'll be 5 formulas on each line (data goes across to AC) but they're simple conditional formulas based on vlookup. Will be using a pivot table to derive my results.

Any help at all would be useful. Got a tight deadline :(
 
1m row is the limit in excel IIRC.

Just split the file into two data-sets if excel is the only tool you have & pool the aggregates. (two pivots) - then pivot those results.

Ah, didn't realise that! I thought it was unlimited for some stupid reason...

Great idea on the two pivots will try that. Having researched the topic more, looks like I scraped everything including formatting which probably doesn't help...

Do the stats in the source data and just pool the results?

Excel doesnt like lots of formulas at once, if you have 5 add them one at a time then paster over with their values

Thanks for the idea, unfortunately there are several hundred spreadsheets. Also I'll be using external data (Libor rates etc) and I don't want to run the risk of getting values wrong for each sheet. Much better to have one sheet and know everything is fully standardised.

Good idea on pasting the values though, I like it!

Were they all created with 64bit excel in xlsx?

We found that files made in 32bit excel are subject to the performance limitations (both ram and cpu) even when opened in a 64bit version.

Hmm not sure on this, will have to check work version. Thanks for the heads up though.

With so much data, have you looked into something like R? It's geared towards big data analysis.

Alas I don't have the time to learn new software, nor do I think work would be happy with me installing random stuff on pcs.

Is this software particularly useful? I'm going to be doing a lot more over time, so if there is an advantage over excel, then I'd be happy to learn it myself in my spare time. : )
 
Could you use Access?

This might be the better option, if you've got Excel you surely have Access too?

Linking a few tables instead of VLOOKUPS would be a 5 minute job to learn even if you've had no experience in Access.

How would you use access? I had to build some relational databases during my masters but that was a long time ago!

I've got access so if theres a better way I'm all ears.

My problem would be scraping all the data from the sheets into the program. Unless you can batch import? (THere are 5 tabs on each sheet, just need the one, not the rest).
 
Back
Top Bottom