Performance Issues with Disconnected Recordsets
This article will establish performance differences between disconnected and connected recordsets when using them with Microsoft Access and Microsoft SQL Server, and uncover potential pitfalls when designing your system for multi-database access.
Author: Jens G. Balchen

Introduction

In a recent project I was working on, the client wanted a database system where data would be maintained locally on his machine, and then transferred to a web server in batches. For the client application, I used Microsoft Access and a standard set of ADO routines I have developed. These routines use disconnected recordsets, since their primary use have been in transactional, server-side, stateless components. I saw no potential dangers in using disconnected recordsets in a client application.

As it turns out, there is a bug in the MFC data access library. You cannot have more than 40 fields in a recordset and at the same time perform batch updates on that recordset. Since disconnected recordsets use batch updates, this bug caused my client application to crash whenever it tried to update a record. The solution was to use connected recordsets instead, which do not require batch updates. A curious side effect to this, to our client's immense pleasure, was a significant decrease in database access times.

I decided to perform some tests on connected vs. disconnected recordsets using both Microsoft Access and Microsoft SQL Server as database servers. The test code, test results and my conclusion make out this article.

Writing the test code

The test code was kept very simple. A form containing two command buttons formed the framework, and the code beneath each button had only minor differences. The basic code looks like this:

Private Sub cmdDisconnected_Click()

Dim Connection As ADODB.Connection
Dim Rs         As ADODB.Recordset

Dim a          As Variant

Dim Opening    As SETimer
Dim Traversing As SETimer

   Set Opening = New SETimer
   Set Traversing = New SETimer

   Set Connection = New ADODB.Connection
   Set Rs = New ADODB.Recordset
   
   Opening.StartTimer
   
   ' Open a database connection using a client library cursor.
   With Connection
      .ConnectionString = "(whichever connection string you like)"
      .CursorLocation = adUseClient
      .Open
   End With
   
   ' Open a recordset with lots of data.
   With Rs
      .CursorType = adOpenStatic
      .LockType = adLockBatchOptimistic
      .Open "SELECT * FROM Files", Connection
   End With
   
   ' Disconnect recordset.
   Set Rs.ActiveConnection = Nothing
   
   Opening.StopTimer
   
   ' Close connection.
   Connection.Close
   Set Connection = Nothing
   
   Traversing.StartTimer
   
   ' Traverse entire recordset.
   Do While Not Rs.EOF
      ' Get a text (BLOB) field to increase data load.
      a = Rs("Description")
      Rs.MoveNext
   Loop
   
   Traversing.StopTimer
   
   ' Close recordset.
   Rs.Close
   Set Rs = Nothing
   
   txtTimes = txtTimes & Opening.TimeMS & vbTab & Traversing.TimeMS & vbCrLf

End Sub

The code for cmdConnected_Click looks the same, except for some important differences:

   ' Open a database connection using a server library cursor.
   With Connection
      .ConnectionString = "(whichever connection string you like)"
      .CursorLocation = adUseServer
      .Open
   End With

   ' Open a recordset with lots of data.
   With Rs
      .CursorType = adOpenStatic
      .LockType = adLockOptimistic
      .Open "SELECT * FROM Files", Connection
   End With
   
   ' The recordset isn't disconnected and the connection 
   ' isn't closed before the traversing starts.
   
As you can see, I performed two tests -- one recorded the time used to open and populate the recordset, and one recorded the time used to traverse the entire recordset, including some basic data access. The output into txtTimes was tab-separated to ease the migration to Microsoft Excel (to do some basic graph creation).

The test results

Since both Microsoft Access and Microsoft SQL Server perform data caching, subsequent tests show little variation in data access times. However, a comparison between database servers and access methods show huge differences.

Microsoft SQL Server Access Times

DisconnectedConnected
OpeningTraversingOpeningTraversing
9447161453
10931161453
10932161453
11031151469
10931161453
11046151438
11047151438
1094701438
1093201454
11046151438
1103101453
10931161437
10932161437
12531161453
11031161437
10932151453
11031161453
9332161453
1103101453
1253101437
9347161453
11031151469
10931161516

These test results clearly show the advantage of disconnected recordsets with Microsoft SQL Server. In average, the basic open, traverse, and close is done 10 times faster than with connected recordsets.

See graph.

Microsoft Access Access Times

DisconnectedConnected
OpeningTraversingOpeningTraversing
547311579
125311678
125313278
125311678
110317878
125321578
125323278
125311578
125311579
125321579
125311678
141321678
125321578
125313278
125313178
125321678
125311678
125311693
125311678
109311578
125311594
125321678
125313178

As you can see, disconnected recordsets show a slight performance degrade with Microsoft Access, but the most important thing about these data is the significant increase in speed with connected recordsets. The access time has been reduced to 1 / 15 of the time used with Microsoft SQL Server, and is only 2 / 3 of the time used with disconnected recordsets.

See graph.

Conclusion

Disconnected recordsets provide a fast and convenient way to access data in stateless, transactional enviroments, and work very well with Microsoft SQL Server. This simple test does not take into account factors such as cross-thread marshalling that typically apply to larger systems, but nonetheless, disconnected recordsets prove the most efficient way to access SQL Server data. Connected recordsets seem to work best with Microsoft Access. The access times are significantly reduced, and I suspect the decrease will be even larger when the recordset contains more fields.

When you write applications meant to run with both database servers, give this topic some consideration. If you chose the wrong kind of data access, your application could suffer from slow access times.