Been working a lot with rails lately and I needed to import a large dataset (almost 3200 records). If you do something like this in rails you have to use the fastercsv gem, as, well it is faster. The thing is though, ActiveRecord isn’t. So although I can read the csv file in about 2-3 seconds creating the records ends up taking about 10-12. I didn’t really like this performance wise so I poked around on the web and found this method using “LOAD DATA INFILE” – a feature in mysql that lets you specify a csv to load and map to columns and boy, it is fast. The whole process takes about as long as it takes fastercsv to read the file (2-3 seconds now).

So here is the method:


def fast_import
    csv_string = FasterCSV.generate do |csv| 
      FasterCSV.foreach( absolute_path, {:headers => true, :return_headers => false, :header_converters => :symbol} ) do |row|
        csv << [row[:firstname],row[:lastname],row[:doctor_name],row[:homephone],self.id,row[:insurance],row[:mi],row[:dob],row[:address1],row[:address2],row[:city],row[:state],row[:zip],row[:id],row[:salutation],row[:survey_result_code] ]
      end
    end
    tmp_file = absolute_path + '.tmp'
    File.open(tmp_file, 'w') {|f| f.write(csv_string) }
    ActiveRecord::Base.connection.execute(load_data_infile(tmp_file))
    system("rm", tmp_file) # remove the temp file
  end

private
def load_data_infile(temp_path)
           <<-EOF
                 LOAD DATA INFILE "#{temp_path}" 
                          INTO TABLE attendees
                          FIELDS TERMINATED BY ','
                             (first_name, last_name, doctor_name, home_phone, event_id, insurance, middle_initial, date_of_birth, address, address_continued, city, state, zip_code, barcode_id, salutation, survey_result_code)
                            SET created_at = '#{Time.current.to_s(:db)}',
                            updated_at = '#{Time.current.to_s(:db)}';
            EOF
        end


Ok all we are doing is using fastercsv to parse the uploaded file, read it and then write it back to the disk so that mysql can import it. The first method is rails specific but the second could be used in any language as it is really just some mysql code. It is pretty self explanatory. If you need some help getting it working, just leavea comment!

2 Responses to “Fast csv import mysql in rails or any language”

  1. johno Says:

    Hi, you might find my gem useful http://github.com/jsuchal/activerecord-fast-import/tree/master

  2. efehav Says:

    Thanx :)

Leave a Reply