Creating a Batch API operation to parse a large CSV file

In this code snippet, I'll show how you can parse a (large) CSV file using Drupal's Batch API. The purpose of batching an operation is to avoid PHP memory limits and timeouts. Before you begin, I recommend reviewing the Batch API documentation; be sure to review the additional batch parameters outlined there, as you might need them.

function MYMODULE_callback_csv_import() {
  // Define the path to the CSV file.
  $csv_file_path = file_directory_path() . '/import_path/myfile.csv';
  // Define a redirect path for when the batch completes.
  $redirect_path = 'admin/import-csv';
  // Define the batch array structure.
  // NOTE: only the minimal parameters are defined, to simplify the code.
  $batch = array(
    'title' => t('Reading File'),
    'operations' => array(
      array('_MYMODULE_batch_read', array($csv_file_path)),
    ),
  );
  // Set and process the batch.
  batch_set($batch);
  batch_process($redirect_path);
}

Next, we'll define the batch callback function. This function will be called repeatedly until the $context['finished'] variable is set to 1 (values between 0 and 1 report progress to the batch engine).

function _MYMODULE_batch_read($csv_file_path, &$context) {
  // Define the number of lines to read per batch pass.
  $batch_limit = 100;
  // Assume the batch process has not completed.
  $context['finished'] = 0;
  // Open the file for reading.
  $file_handle = fopen($csv_file_path, 'r');
  // If a file pointer position exists in the sandbox, jump to that
  // location in the file (i.e. resume where the previous pass stopped).
  if (isset($context['sandbox']['file_pointer_position'])) {
    fseek($file_handle, $context['sandbox']['file_pointer_position']);
  }
  // Loop through the file, stopping at the batch limit.
  for ($i = 0; $i < $batch_limit; $i++) {
    // Get the next file line as a CSV array.
    $csv_line = fgetcsv($file_handle);
    // NOTE: at this point, do whatever you'd like with the CSV array data!
    if (is_array($csv_line)) {
      // db_query(), etc.
    }
    // Retain the current file pointer position.
    $context['sandbox']['file_pointer_position'] = ftell($file_handle);
    // Check for end of file.
    if (feof($file_handle)) {
      // Complete the batch process and end the loop.
      $context['finished'] = 1;
      break;
    }
  }
  // Close the file until the next pass.
  fclose($file_handle);
}
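The fseek()/ftell() resumption pattern at the heart of this callback can be demonstrated outside of Drupal with plain PHP. This is just a minimal sketch: the sample data, the $limit of 2 lines per pass, and the temporary file are placeholders, and each pass reopens the file and resumes from the saved pointer position exactly as the batch callback does.

```php
<?php

// Write a small sample CSV to a temporary file (placeholder data).
$csv_path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($csv_path, "a,1\nb,2\nc,3\nd,4\ne,5\n");

// Read the file $limit lines at a time, reopening it on each pass and
// resuming from the saved pointer position, as the batch callback does.
$limit = 2;
$position = 0;
$rows = array();
$finished = FALSE;

while (!$finished) {
  $handle = fopen($csv_path, 'r');
  // Jump to wherever the previous pass stopped.
  fseek($handle, $position);
  for ($i = 0; $i < $limit; $i++) {
    $line = fgetcsv($handle);
    if (is_array($line)) {
      $rows[] = $line;
    }
    // Retain the current file pointer position for the next pass.
    $position = ftell($handle);
    if (feof($handle)) {
      $finished = TRUE;
      break;
    }
  }
  fclose($handle);
}

print count($rows) . " rows read\n";
unlink($csv_path);
```

Because the pointer position (here a local variable, in Drupal the $context['sandbox'] array) is the only state carried between passes, memory use stays flat no matter how large the file is.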