- Too much data is pulled across the network and dealt with at the client.
- "Functionally rich," fat clients require extra maintenance and reuse limit reuse.
- Databases only scale to a certain number of users.
Using stored procedures in the database is a means of extending the limits of a system. This means you can typically go from a system which in two-tier has a ceiling of 100 users and often as much as double that ceiling.
How do stored procedures do this? To start with, stored procedures can improve database scalability, since they use a database's own optimized code to efficiently process data, and may even use the database's SQL extensions to do specific tasks more efficiently. Stored procedures are also stored in an efficient manner (usually in some precompiled, ready-to-rock state) so they are quick to start as well as rapid to run. Obviously, because they are within the database itself, stored procedures avoid dragging lots of data across a network, since by their nature they must process the data at its source. Stored procedures are frequently used as a way of restricting data access, in that they can be restricted to specific user groups or even individuals. This means that the stored procedure-or its cousin the database view-is in fact an encapsulation of the data. This can help to avoid the necessity of changing the application if the database structure has to change. There's also an element of reuse with stored procedures: different interfaces and applications can reuse the same stored procedures.
But the land of stored procedures isn't all milk and honey. SQL is not as powerful as more general programming languages such as Visual Basic. Long-running stored procedures can tie up database connections. Stored procedures, again due to their nature, must always run on the same machine as the data, which limits and essentially eradicates hardware scaling of an application.
Cursors-Just Say No: Or, How I Learned to Stop Worrying about Locking and Love Read-Only Data
The big three in Visual Basic data access libraries-DAO/ODBCDirect, RDO, and ADO-can all implement cursors. The kind of cursor you choose has a vast impact on the performance, concurrency, and thus scalability of your system. Getting it wrong can really hurt. So it helps to know your cursor options carnally.
…in the evening we found a reply waiting for us at our hotel. Holmes tore it open, and then with a bitter curse hurled it into the grate.
-Sir Arthur Conan Doyle, The Final Problem:
The Memoirs of Sherlock Holmes
Choice of cursor option primarily affects where a resultset will be built, how it will travel to its consumer, how an application can move through the data, and whether and when data in the cursor is updated. For a long time there have been two ways to look at cursors. The first is the lazy, sloppy, thoughtless way (at TMS, the shorthand for this is "evil cursors"). Rather than thinking carefully about what cursor functionality is required in each given data access situation, and then choosing a cursor that provides a close fit and as minimal and lightweight a set of functionality as can be lived with, many developers choose the fullest functionality and thus the heaviest and most expensive cursor option available. The principle here seems to be, "If it can't be done with the biggest cursor, it can't be done in Visual Basic." In addition, some developers feel that a generic cursor solution is best, rather than taking the trouble to understand the benefits and drawbacks of specific cursor options. In reality though, time spent choosing the most appropriate cursor options is time well spent. If we don't spend it here, much more time will be squandered by our frustrated users as they wait for data to travel across the network or for locks to be released.
The following is a list of options you must decide on when selecting a cursor.
- Location This is where the cursor will be physically built-there is also a link with membership, in that the combination of location and membership can control where the data and keys are and how and when they move from client to server. Location has the following options:Client The cursor is built on the client. If it is a static cursor, all the data travels across the network at one time, down to the client. This raises questions of capacity-does the client have enough RAM and disk space for this? If the resultset is large, and especially if few of the records in it will become current records, this might be a bad choice. Don't get fooled into thinking that the client is the sink client (where the user interface is) if you open a client-side cursor on a middle-tier machine. When you finally have the resultset on the client, performance can seem fast. As a general rule, client-side cursors scale better, since you are spreading the load across many workstations.
Server Some databases support building the cursor on the database server machine in a temporary area. SQL Server does this and caches the recordset in the TempDB. Then when the client wants to make a given record the current record, the data for that record travels from the server to the client. Thus all that is on the client at any given time is the current record's data. This is mainly useful for lightweight clients and large recordsets, especially if you are only going to directly access and manipulate a few of the rows in a large recordset. If you use the server option wisely it can greatly improve the effect on network traffic. So this option can be the best choice when there isn't much spare bandwidth on a network, or when you have to have a lot of network traffic using client-side cursors. The cursor has to be fully populated before control goes back to the client code, so it can sometimes feel a bit slower, especially if you like to pull the trick of trying to show the first row in the user interface as soon as it is available, and then putting the others in while the user is ogling it. Also, try not to have too many of these, especially if they are big and open at the same time-the TempDB is finite, too. If you use these you can't have disconnected recordsets (see "Returning Changes from a Disconnected Recordset" later in this chapter). Opening server-side cursors in transactions can have locking implications such as blocking other users. Note also that server-side cursors don't support executing queries that return more than one recordset. Alternatives to server-side cursors are using a client-side cursor, or preferably a disconnected recordset.
- Membership/Cursortype This refers to which rows are in the cursor and which aren't, and when the database engine picks them. (Initially the query processor decides based on the WHERE clause and puts in the resultset.) Also included here is what is in the cursor, in terms of data or just keys (IDs of records). This has an impact on what travels across the network when. This has the following options:Static Static cursors usually retrieve a copy of all records in the cursor's membership. Membership freezes when the last row is found (usually when code or database engine moves the current row pointer to the last row), so data changes made by other users don't join the membership. Often some sort of locks (usually share locks) are kept in place until this is decided. Usually this means that all the data travels from the database engine across the network to the client and has to be stored there. Static cursors are usually fully scrollable.
Keyset The cursor is built with a key for each record in the membership. Although membership is usually static in that new rows won't be added to the membership, in this case the keys (the IDs) of the records are what initially travels to the client. When a given record becomes the current record, a request goes back to the database for the data for that record, using its key. This means that at any time all the keys plus the data for the current record are likely to be on the client. Because keys, not full records, are being stored, the overhead is less than for static or dynamic cursors. This can help when not all records will be touched (become current record) or where the client has limited data space.
Dynamic Some cursors never close membership-instead they keep asking the cursor driver to requery the database to make sure that rows added, updated, and deleted are taken into account in the membership. This can be very costly to performance, chewing up lots of memory and network resources. Dynamic cursors can change their membership to see changes made by other users. They are fully scrollable and normally updateable-the gold-plated cursor.
- Scrolling This is the capability provided by the combination of database engine and cursor driver to let you move from row to row. The following are the scrolling options:Scrollable This allows capabilities such as bookmarks so that you can mark a record as you scroll, move the current record and then return to it later, or jump up and down a recordset using MovePrevious and MoveFirst type functionality. This is expensive functionality. Usually most applications read through a resultset once to put it in the interface, so they don't need this.
Forward-only This is not really a cursor, but it's usually the first choice, especially when read-only. This provides the capability to go from start to finish sequentially (FIFO-first in first out) through the resultset, making each item in it the current record in turn. This is adequate for most applications bringing back large numbers of rows at a time. (By "large" here we often mean more than one!) The forward-only cursor is generally the fastest and the lightest on resources.
- Updateability This is how your database and cursor driver let you go about changing membership and data values. Updateability has the following options:Updateable You can maintain data (add, delete, and update new rows) using the cursor's functionality-AddNew, Delete, and Edit/Update methods on a resultset object are how this normally looks. If you don't use this option (which again is very expensive functionality), you can use action queries. This option is expensive because each row has to be individually addressable, which normally requires a primary key on the table. Use Read-only if you aren't going to update, since it is more efficient.
Read-only This is exactly what it says on the side of the can. This option is cheap on resources and is usually an excellent choice, especially when combined with Forward-only scrolling.
So when picking a cursor, you should consider what the functionality needs, and then find the cheapest cursor option to implement that will provide it. Once you've made this decision a few times you'll probably have been through all the normal scenarios-such as
Scenario Cursor Choice Other Information Look up data Read-only, Forward-only, Static On the server Provide a picklist to select from Read-only, Forward-only, Static On the client if the number of list options is small, on the server if you have the option and have to provide large numbers. We always try and restrict this kind of query to a maximum size of 50-100 rows (who is going to scroll more?) and show the first 50-100 if more are found, but prompt them to qualify their query more Bring back details of one item (chosen from a picklist) into a Detail/Edit screen Read-only, Forward-only, Static On the client-we'll do the up-dates/delete using an action query if we can (not methods on a recordset, but we'll talk more about this later). If you need scrolling, use a client-side keyset cursor
A useful concept here is that of the current row: essentially with most resultsets and cursors, only one row of data is "exposed" for dealing with at a time. Scrollable cursors can vary in a number of ways. (Go to the first, go to the last, go the next, go the previous, go to a specific one, or go to one in an absolute or relative position.)
You to your beauteous blessings add a curse, Being fond on praise, which makes your praises worse.
-William Shakespeare, Sonnet 84
Cursors are rarely the best way to access your data. But sometimes you need to be able to do the following:
- Scroll forward and backward (browse) through a limited resultset
- Move to a specific row based on a saved value (a bookmark)
- Move to the "nth" row of a resultset in absolute or relative terms
- Update limited resultsets created against base tables using the RemoteData control
If you don't need a cursor, you will usually find your application is faster and more efficient. So, how to avoid them? With RDO, if you set the rdoDefaultCursorDriver property to rdUseNone, all resultsets created by your application will be created as if you used the OpenResultset method with the rdOpenForwardOnly and rdConcurReadOnly options set, and with the RowsetSize property set to 1. This is often the most efficient way to pass data from the remote server to your application. ADO doesn't have such an equivalent setting (well actually, that is a lie-it has an obsolete setting, adUseNone, which exists only for backward compatibility)-ADO's default cursor location setting is adUseServer.
Although it is also possible to create low-impact resultsets that are also updateable through use of the Edit/Update methods, this is usually impractical because the base tables are not directly updateable, so creating an updateable cursor is not possible. This is particularly so in stateless tiered designs.
Whenever you create a resultset with a stored procedure, the resultset is not updateable-at least not using the Edit/Update methods. In these cases, you can use the WillUpdateRows event to execute an action query that performs the actual update operation(s).
However you create a cursor, you can usually update the data using one of the following techniques-even if the cursor is not updateable:
- Executing a stored procedure that updates a selected row based on a code-provided key.
- Executing an action query that changes specifically addressed rows. In this case your code creates a suitable WHERE clause used in the query.
- Using the WillUpdateRows event to trap update operations and substitute appropriate stored procedure calls to perform the actual changes.
Try not to update directly through cursors-it's almost always more efficient to use a stored procedure or an action query. If you are using ADO, use the adExecuteNoRecords execution option-this stops ADO setting any cursor properties.Dim conn As New Connection Dim sOpen As String, sExecute As String sOpen = "Provider=SQLOLEDB;Data Source=TMS1;" & _ "Database=DUDES;User Id=sa;Password=;" conn.Open sOpen sExecute = "Insert into DudeMain values(5, 'Chris'," & _ " 'Star Programmer')" conn.Execute sExecute, , adExecuteNoRecords
It is even harder to justify using a query that creates an unrestrained, off-the-leash kind of cursor against one or more base tables (e.g. "SELECT * FROM Table"). Not only is this not a permissible option in protected systems, it can cause serious concurrency problems as you attempt to scale your application to more than a few users. Whenever you create a cursor, be sure to limit the scope to the fewest number of rows possible. In interactive systems (where there is a human operator), fetching more than a few hundred rows is often counterproductive and leads to increasingly complex concurrency problems.