Tuesday, 23 October 2012

Import/insert data from a CSV file into MySQL using PHP

Importing/inserting data into MySQL with the LOAD DATA query. Using the LOAD DATA INFILE query we can easily insert CSV data into a table.
We have already seen the tutorial on how to export data to CSV - click here to see the export CSV tutorial.

It is quite often required to insert bulk data into a database table, and doing it with a script is a bit lengthy. Instead, we can easily import the data using MySQL's LOAD DATA query.

Syntax for LOAD DATA INFILE

LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name'
          [REPLACE | IGNORE] INTO TABLE tbl_name
          [CHARACTER SET charset_name]
          [FIELDS [TERMINATED BY 'string'] [[OPTIONALLY] ENCLOSED BY 'char'] [ESCAPED BY 'char']]
          [LINES [STARTING BY 'string'] [TERMINATED BY 'string']]
          [IGNORE number LINES]
          [(col_name_or_user_var, ...)]
Ex. 
LOAD DATA INFILE 'PATH_TO_CSV' INTO TABLE tblTableName FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' IGNORE 1 LINES (tblCol1,tblCol2,tblCol3)
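For reference, the example above would match a CSV file shaped like the hypothetical one below (PATH_TO_CSV stays a placeholder). The header row on the first line is what IGNORE 1 LINES skips, and each field is enclosed in double quotes as the ENCLOSED BY '"' clause expects:

```
tblCol1,tblCol2,tblCol3
"value 1","value 2","value 3"
"value 4","value 5","value 6"
```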
[LOW_PRIORITY | CONCURRENT]
LOW_PRIORITY : If you use LOW_PRIORITY, execution of the LOAD DATA statement is delayed until no other clients are reading from the table. This affects only those storage engines that use table-level locking (such as MyISAM, MEMORY, and MERGE).
CONCURRENT : If you specify CONCURRENT with a MyISAM table that satisfies the condition for concurrent inserts, other threads can retrieve data from the table while LOAD DATA is executing.
[LOCAL]
If LOCAL is specified, the file is treated as being on the client end of the connection. That means the file is read on the client side and sent to the server, where a temporary copy is created for execution.
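As a sketch, loading a file that sits on the client machine rather than on the database server might look like this (the file path is a placeholder; the table and columns reuse the earlier example). Note that LOCAL works only if the local_infile capability is enabled on both the server and the client:

```sql
LOAD DATA LOCAL INFILE '/home/user/data.csv' INTO TABLE tblTableName
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES (tblCol1, tblCol2, tblCol3);
```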
 [REPLACE | IGNORE]
These two keywords control the handling of input rows that duplicate existing rows on unique key values:
REPLACE : Input rows replace existing rows that have the same value for a primary key or unique index.
IGNORE : Duplicate rows are skipped.
If you specify neither option, the behavior depends on the LOCAL keyword. Without LOCAL, an error is generated when a duplicate entry is found. With LOCAL, the default behavior is IGNORE.
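For instance, to overwrite rows whose key already exists instead of raising an error, the earlier example could be written with REPLACE (table and column names are the same hypothetical ones as above):

```sql
LOAD DATA INFILE 'PATH_TO_CSV' REPLACE INTO TABLE tblTableName
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES (tblCol1, tblCol2, tblCol3);
```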
[FIELDS [TERMINATED BY 'string'] [[OPTIONALLY] ENCLOSED BY 'char'] [ESCAPED BY 'char']]
These arguments define how the fields are treated. TERMINATED BY defines the string that marks the end of each field. ENCLOSED BY defines the character that encloses each field (with OPTIONALLY, only some fields need to be enclosed), and ESCAPED BY defines the escape character.
If you don't specify any of the above arguments, the default behavior is as below:
FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
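As an illustration, for a file that uses semicolons as delimiters instead of commas (a common format for spreadsheet exports in some locales), the clause might be adjusted like this, reusing the hypothetical table from earlier:

```sql
LOAD DATA INFILE 'PATH_TO_CSV' INTO TABLE tblTableName
FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
LINES TERMINATED BY '\n';
```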
[LINES [STARTING BY 'string'] [TERMINATED BY 'string']]
These arguments determine line handling. STARTING BY matches only the lines that begin with the specified string, skipping that prefix. TERMINATED BY is used to specify the string that ends each line.
If you don't specify anything, the default behavior for LINES is as below:
LINES TERMINATED BY '\n' STARTING BY ''
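CSV files created on Windows typically end lines with \r\n rather than \n; in that case the example from earlier would need its LINES clause changed like this:

```sql
LOAD DATA INFILE 'PATH_TO_CSV' INTO TABLE tblTableName
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES (tblCol1, tblCol2, tblCol3);
```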
[IGNORE number LINES]
This specifies the number of lines to skip from the top of the file. If you specify IGNORE 1 LINES, the first line (typically the CSV header row) is skipped and execution starts from the second row/record.
[(col_name_or_user_var,...)] 
Specify the respective columns, in the order the values appear in the file, for your insertion operation.
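Finally, since this tutorial is about doing the import from PHP, here is a minimal sketch of running the query with mysqli. The host, credentials, database, file path, table, and column names are all placeholders for illustration; LOCAL also requires local_infile to be allowed on both the MySQL server and the PHP client:

```php
<?php
// Minimal sketch: run LOAD DATA LOCAL INFILE from PHP using mysqli.
// Host, credentials, file path, table and column names are placeholders.
$mysqli = new mysqli('localhost', 'db_user', 'db_pass', 'db_name');
if ($mysqli->connect_error) {
    die('Connection failed: ' . $mysqli->connect_error);
}

// Escape the path before interpolating it into the statement.
$file = $mysqli->real_escape_string('/path/to/data.csv');
$sql  = "LOAD DATA LOCAL INFILE '$file'
         INTO TABLE tblTableName
         FIELDS TERMINATED BY ',' ENCLOSED BY '\"'
         LINES TERMINATED BY '\n'
         IGNORE 1 LINES
         (tblCol1, tblCol2, tblCol3)";

if ($mysqli->query($sql)) {
    echo 'Imported ' . $mysqli->affected_rows . ' rows';
} else {
    echo 'Import failed: ' . $mysqli->error;
}
$mysqli->close();
?>
```

After the import, $mysqli->affected_rows tells you how many records were inserted, which is a quick sanity check that the field and line terminators matched your file.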