Wednesday, 24 October 2012

Read Word files using PHP

In this PHP tutorial we will see how to read a Word document (.doc or .docx) and show its text in the browser. Browsers cannot display Word files directly, and PHP has no built-in function for reading them, whereas PDFs can easily be embedded in HTML.

Here we will look at two different methods that work well for displaying the characters from Word .doc and .docx files.

To create .docx files with PHP you can use phpdocx. If you want to create PDFs, you can use FPDF or mPDF.

I suggest you prefer method 1.

Method 1: Use a COM object to read MS Word files. This works well with both .docx and .doc. It needs a Windows server with MS Word installed and the PHP COM extension enabled.


<div style="border:2px solid #1a4572; width:720px;padding:15px">
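<?php

$filename = 'msword.docx';

// new COM() throws com_exception on failure, so "or die" would never run.
// CP_UTF8 makes Word return UTF-8 strings.
try {
    $word = new COM("word.application", null, CP_UTF8);
} catch (com_exception $e) {
    die("Could not initialise MS Word object.");
}
$word->Documents->Open(realpath($filename));

// Extract content as plain text.
$content = (string) $word->ActiveDocument->Content->Text;

// Escape the text before printing it as HTML; nl2br turns paragraph breaks into <br>.
echo nl2br(htmlspecialchars($content));

$word->ActiveDocument->Close(false);

$word->Quit();
$word = null;
?>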
</div>


Method 2: Read the text straight out of the file. This works with .doc files only and does not need MS Word.

<?php

$filename = 'msword.doc';

if(file_exists($filename))
{
    // Open in binary mode so the bytes are read unchanged on Windows.
    if(($fh = fopen($filename, 'rb')) !== false )
    {
       // Read the first 0xA00 bytes; the code assumes the text starts right after them.
       $headers = fread($fh, 0xA00);

       // Bytes 0x21C to 0x21F are combined as one number, lowest byte first,
       // less a fixed offset of 1 + 8*256.

       // Lowest byte: enough for documents of 0 to 255 characters
       $n1 = ( ord($headers[0x21C]) - 1 );

       // Second byte: counts blocks of 256 characters
       $n2 = ( ( ord($headers[0x21D]) - 8 ) * 256 );

       // Third byte: counts blocks of 256*256 characters
       $n3 = ( ( ord($headers[0x21E]) * 256 ) * 256 );

       // Fourth byte: counts blocks of 256*256*256 characters
       $n4 = ( ( ( ord($headers[0x21F]) * 256 ) * 256 ) * 256 );

       // Total length of text in the document
       $textLength = ($n1 + $n2 + $n3 + $n4);

       // fread() rejects a length below 1, so guard against it
       $extracted_plaintext = ($textLength > 0) ? fread($fh, $textLength) : '';
       fclose($fh);

       // simple print of the character stream without new lines
       //echo $extracted_plaintext;

       // if you want to see your paragraphs on new lines, do this
       echo nl2br($extracted_plaintext);
       // need more spacing after each paragraph? use another nl2br
    }
}
?>
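
The four-byte arithmetic in method 2 amounts to reading one 32-bit little-endian number and subtracting a fixed offset (1 + 8*256 = 0x801). A minimal sketch of the same calculation with PHP's built-in unpack() follows. The helper name docTextLength and the synthetic header are made up for illustration; they are not part of the original code, and no real .doc file is involved.

```php
<?php
// Hypothetical helper: same result as the $n1 + $n2 + $n3 + $n4 calculation above.
function docTextLength(string $headers): int
{
    if (strlen($headers) < 0x220) {
        throw new InvalidArgumentException('Header block is too short.');
    }
    // 'V' = unsigned 32-bit little-endian integer; unpack() returns a 1-indexed array.
    $raw = unpack('V', substr($headers, 0x21C, 4))[1];
    // Same fixed offset the byte-by-byte version subtracts: 1 + 8 * 256 = 0x801.
    return $raw - 0x801;
}

// Synthetic header for illustration only (not a real .doc file).
$headers = str_repeat("\0", 0xA00);
$headers = substr_replace($headers, pack('V', 0x801 + 1234), 0x21C, 4);

echo docTextLength($headers), "\n"; // prints 1234
```

The length check also avoids reading past the end of the header when the file is shorter than expected, which the byte-by-byte version does not guard against.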

Tuesday, 23 October 2012

Import/insert data from a CSV file into MySQL using PHP

Importing data into MySQL with the LOAD DATA query. Using the LOAD DATA INFILE query we can easily insert the contents of a file into a table.
We have already seen the tutorial on how to export data into CSV - click here to see the export CSV tutorial

Quite often we need to insert data in bulk into a database table, and doing that row by row with a script is a bit lengthy. Instead we can easily import the data using the MySQL LOAD DATA query.

Syntax for LOAD DATA INFILE

LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name' 
          [REPLACE | IGNORE] INTO TABLE tbl_name 
          [CHARACTER SET charset_name] 
          [FIELDS [TERMINATED BY 'string'] [[OPTIONALLY] ENCLOSED BY 'char'] [ESCAPED BY 'char']] 
          [LINES [STARTING BY 'string'] [TERMINATED BY 'string']]
          [IGNORE number LINES] 
          [(col_name_or_user_var,...)] 
Ex. 
LOAD DATA INFILE 'PATH_TO_CSV' INTO TABLE tblTableName FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' IGNORE 1 LINES (tblCol1,tblCol2,tblCol3)
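
To run the example statement from PHP, a minimal sketch could look like the following. It assumes a PDO connection to MySQL; the function names buildLoadDataSql and importCsv are hypothetical, the identifier check is deliberately strict, and the path escaping is a simplification rather than a complete quoting routine.

```php
<?php
// Hypothetical helper: builds the example LOAD DATA statement shown above.
function buildLoadDataSql(string $csvPath, string $table, array $columns, int $ignoreLines = 1): string
{
    // Accept only plain identifiers, because names cannot be bound as parameters.
    $isIdentifier = static function (string $name): bool {
        return preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', $name) === 1;
    };
    if (!$isIdentifier($table) || count(array_filter($columns, $isIdentifier)) !== count($columns)) {
        throw new InvalidArgumentException('Unexpected table or column name.');
    }
    // Simplified escaping: backslashes first, then single quotes.
    $path = str_replace(['\\', "'"], ['\\\\', "\\'"], $csvPath);

    return sprintf(
        "LOAD DATA INFILE '%s' INTO TABLE %s"
        . " FIELDS TERMINATED BY ',' ENCLOSED BY '\"'"
        . " LINES TERMINATED BY '\\n'"
        . " IGNORE %d LINES (%s)",
        $path,
        $table,
        max(0, $ignoreLines),
        implode(',', $columns)
    );
}

// Hypothetical wrapper: executes the statement on an existing PDO connection
// and returns the number of affected rows (or false on failure).
function importCsv(PDO $pdo, string $csvPath, string $table, array $columns)
{
    return $pdo->exec(buildLoadDataSql($csvPath, $table, $columns));
}

echo buildLoadDataSql('/tmp/data.csv', 'tblTableName', ['tblCol1', 'tblCol2', 'tblCol3']), "\n";
```

Without LOCAL the file has to be on the database server and the MySQL user needs the FILE privilege. With LOCAL the file is sent from the machine running PHP, and the PDO connection has to be created with the PDO::MYSQL_ATTR_LOCAL_INFILE option enabled.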
[LOW_PRIORITY | CONCURRENT]
LOW_PRIORITY : If you use LOW_PRIORITY, execution of the LOAD DATA statement is delayed until no other clients are reading from the table. This affects only storage engines that use only table-level locking (such as MyISAM, MEMORY, and MERGE).
CONCURRENT : If you specify CONCURRENT with a MyISAM table that satisfies the condition for concurrent inserts, other threads can retrieve data from the table while LOAD DATA is executing.
[LOCAL]
If LOCAL is specified, the file is read by the client program on the client host and sent to the server, where a temporary copy is created for execution. Without LOCAL, the file must be located on the server host and is read directly by the server.
[REPLACE | IGNORE]
These two keywords control the handling of input rows that duplicate existing rows on unique key values:
REPLACE : Input rows replace existing rows that have the same value for a primary key or unique index.
IGNORE : Input rows that duplicate an existing row are skipped.
If you don't specify either option, the behavior depends on the LOCAL keyword. Without LOCAL, an error is generated when a duplicate entry is found and the rest of the file is ignored. With LOCAL, the default behavior is the same as IGNORE.
[FIELDS [TERMINATED BY 'string'] [[OPTIONALLY] ENCLOSED BY 'char'] [ESCAPED BY 'char']]
These arguments define how the fields in the file are treated. TERMINATED BY defines the string that ends each field. ENCLOSED BY defines the character that surrounds each field value. ESCAPED BY defines the character that escapes special characters.
If you don't specify any of the above arguments, the default behavior is as below
FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
[LINES [STARTING BY 'string'] [TERMINATED BY 'string']]
This clause determines how lines are handled. STARTING BY defines a prefix: everything up to and including the prefix is skipped, and lines that do not contain the prefix are skipped entirely. TERMINATED BY specifies the string that ends each line.
If you don't specify anything, the default behavior for LINES is as below
LINES TERMINATED BY '\n' STARTING BY ''
[IGNORE number LINES]
This specifies how many lines to skip at the top of the file. If you specify IGNORE 1 LINES, the first line (usually the column headings) is skipped and loading starts from the second row/record.
[(col_name_or_user_var,...)] 
Specify the table columns, in the order the fields appear in the file, that the values should be loaded into.
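
To see what the clauses in the example statement do to a file, the rough PHP sketch below mimics them on a small in-memory CSV string. It is only an approximation for previewing data: the function name previewCsvRows and the sample rows are invented for illustration, and MySQL's own parser differs in details such as escape handling and quoted fields that contain line breaks.

```php
<?php
// Hypothetical helper: mimics the clauses of the example statement in plain PHP.
function previewCsvRows(string $csv, array $columns, int $ignoreLines = 1): array
{
    // LINES TERMINATED BY '\n'
    $lines = explode("\n", rtrim($csv, "\n"));

    // IGNORE n LINES: drop the first n lines (usually the column headings)
    $lines = array_slice($lines, $ignoreLines);

    $rows = [];
    foreach ($lines as $line) {
        // FIELDS TERMINATED BY ',' ENCLOSED BY '"'
        $fields = str_getcsv($line, ',', '"', '\\');

        // (col1, col2, ...): assign fields to columns by position.
        // Lines with a different field count are skipped in this sketch.
        if (count($fields) === count($columns)) {
            $rows[] = array_combine($columns, $fields);
        }
    }
    return $rows;
}

// Made-up sample data; the first line is a header row.
$sample = "id,name,city\n1,\"Smith, John\",Pune\n2,Asha,Mumbai\n";
print_r(previewCsvRows($sample, ['tblCol1', 'tblCol2', 'tblCol3']));
```

The header line is dropped, and the comma inside "Smith, John" stays part of one field because the value is enclosed in double quotes.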