Bulk Loading CSV Files Into MySQL Database?

Associate
Joined
22 Dec 2009
Posts
1,286
Location
Upper Skurt
Hi,

Is there a way to bulk load a list of csv files into a MySQL database where a new table is created from the import of the 1st file and then all the remaining csv files are appended to this table.

The files are named:

2011-01-07.csv
2011-01-14.csv
2011-01-21.csv
2011-01-28.csv.....

up to the end of 2011, ie, 52 files.

The files are quite large, each csv file is approx 275mb.

I can create a macro in microsoft access to do it but the problem is that an access database mdb file cannot exceed 2gb and this is exceeded quite quickly. I know that MySQL will not have the same file size constraint but I am not familiar with how to bulk load or create a macro in MySQL to do the job.



Binty
 
It wouldn't be that difficult to do apart from the obvious issues with datatypes. You can read the column names of the first csv and build the table from that but you need to be aware that:

a) It probably won't be the most efficient storage means (using INT that is longer than it needs be etc)
b) You will probably have to loop over all the data to get a good picture of what the data is like so you don't choose the wrong datatype / length of field.
c) If an edge case appears in one of the subsequent csvs that wasn't present in the "template" csv that you based your schema off, you may have problems.

What is the data and what does it relate to as that might allow us to suggest other options or at least be a bit better informed.
 
I have an update I have to perform occasionally with a similar dataset, multiple csv's encompassing 20+ million rows.

I personally use SQL Server for the big stuff so I wrote a simple .NET app to loop through the dir containing the .csv's and execute a bulk upload command using a .fmt file to define the format (you could get away with not doing that though). Took a few hours to tweak and produce but means every refresh now is just a button click rather than messing around so its worth wasting a few hours on to set up if the update is a regular thing.

There is a similar bulk load class for the MySql connector so maybe look at: http://dev.mysql.com/doc/refman/5.1/en/connector-net-programming-bulk-loader.html as it's fairly simple.
 
Last edited:
Thx for the replies.

Each csv file is has the " symbol as a text qualifier in the appropriate places and the fields are text, number and dates. There are no real issues with the data files so any bulk load process would be OK as I check after the import and do some cleansing anyway.

I will check out the link for the MySQL bulk load.


Thx
 
Back
Top Bottom