Bulk insert into an Oracle database

One of the things I worked on today was to import thousands of records (originating from an xml file) into an Oracle database in a .NET component. Normally the data access layer uses entity framework and Devart to handle database operations, but obviously this was not going to be a good solution in this case, as this would definitely be way too slow. A quick test proved this: inserting 25.000 records took more than half an hour and locked up the database server during this time.

So I needed a better way to bulk import data into my Oracle database.

After some research I found two options:

Both allow you to do bulk inserts using something called array binding. This basically means that you can insert multiple records at once in one database call, thereby avoiding unnecessary round trips.

Because in the company we were already using Devart and we have a license for it, I decided to use that one (but the way of working is about the same with ODP.NET).

The table that I had to fill was called IMPORTJOBS:

CREATE TABLE IMPORTJOBS
(
ID RAW(16) NOT NULL,
IMPORTBATCHID RAW(16) NOT NULL,
DATA VARCHAR2(4000 BYTE) NOT NULL,
STATUSCODE VARCHAR2(20 BYTE) NOT NULL,
RESULTID RAW(16)
)

The component that has to call the bulk insert is an operation in the business layer. Because I like abstracting out components to reduce coupling and dependencies, I’ll created an OracleBulkInsertProvider that will contain the actual data logic to insert the records, so that my operations in the business layer do not have to worry about the technical details.

public interface IOracleBulkInsertProvider
{
void BulkInsertImportBatches(string connectionString,
List<ImportJob> importJobs);
}

public class OracleBulkInsertProvider : IOracleBulkInsertProvider
{
public void BulkInsertImportBatches(string connectionString,
List<ImportJob> importJobs)
{
var ids = importJobs.Select(ib => ib.Id).ToList();
var importBatchIds =
importJobs.Select(ib => ib.ImportBatchId).ToList();
var datas = importJobs.Select(ib => ib.Data).ToList();
var statusCodes =
importJobs.Select(ib => ib.ImportJobStatus.Code).ToList();

const string sql = "insert into IMPORTJOBS (ID, IMPORTBATCHID, DATA,
STATUSCODE) values (:id, :importBatchId, :data, :statusCode)"
;

using (var oracleConnection =
new OracleConnection(connectionString))
{
oracleConnection.Open();
using (var command = oracleConnection.CreateCommand())
{
command.CommandText = sql;
command.CommandType = CommandType.Text;
command.Parameters.Add(":id", OracleDbType.Raw,
ids.ToArray(), ParameterDirection.Input);
command.Parameters.Add(":importBatchId", OracleDbType.Raw,
importBatchIds.ToArray(), ParameterDirection.Input);
command.Parameters.Add(":data", OracleDbType.VarChar,
datas.ToArray(), ParameterDirection.Input);
command.Parameters.Add(":statusCode", OracleDbType.VarChar,
statusCodes.ToArray(), ParameterDirection.Input);
command.ExecuteArray(importJobs.Count);
}
}
}
}

I think the code is straightforward. The goal is to create a command, add input parameters and at the same time provide a list of actual values to insert for that parameter. At the end we simply call the ExecuteArray operation.

This way, the time needed to insert 25.000 records takes about 2 seconds.

About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.