Presentation on Optimizing JDBC performance
John Goodson - Data Direct Technologies
http://www.javalobby.org/eps/jdbc-performance/
This is a really great presentation on that explains how the JDBC Drivers actually work and what methods you should and shouldn’t call on a driver. A lot of the time when dealing with JDBC you just take the driver for granted and there are implications of how the driver is built that should affect your application code.
I have summarized some of the key points below
The Main Point: Network and Disc I/O are the most expensive.
You should fill in the null arguments in any fo the JDBC calls will speed up the driver, think about what the driver is going to do.
You should not be uising getColumns anywhere , only using this in code generation which is fine, but use rs.getMetaData. This is very network i/o intensive
In stored procedures use ? (bound parameters ) instead of acual literals, Driver can take advantage of RPC at the DB protocol level, beneath the application code.
Forward Only Cursors are fastest for small sets, driver prefetches rows. Cursor position is not the same as JDBC driver position. Scroll Insesitive is not scalable.
Batch Statements are the best way to load a table. DB CPUs are effectively cheap compared to Network Disk I/O , we will always take this tradeoff. Less Roundtrips to the DB the better.
Use ordinal position of the result column, almost always 100% faster, We use Firestorm DAO and it does this.
String fullname = rs.getString(3); //good
String fullname = rs.getString(“lname”); //bad
getObject generally bad
String children = (String) rs.getObject(2); //bad
int children = rs.getInt(2); //good
Select * is bad , should almost always specify the list of columns your using, Especially bad when table contains long varchars or clob data, a lot of network IO, Firestorm kind of handles, it doesn’t do * but it does do all columns unless that column is removed from the xml, Stay away from clobs if you can
Calling getGeneratedKeys allows you to switch between dbs (rowId rownum) This way the psuedocolumns can be extracted, abstracted.
Avoid rs.getLast always slow cause driver must cache, try to us forward only cursors
Don’t write code that depends on the number of rows in a query.
Instead write code like while loop that doesn't care about the number of rows in query:
while (rs.next)
{
//code
}
setMaxRows can limit the network IO if you know app never going to use more than 10K records
setMaxFieldSize can limit the size of long data 1MB
Don’t run in autoCommit mode, applciation controls commits, even in select statements driver is commiting after select statement
use JTA only if required , distributed txn is 4 X as slow as non distributed, only use if you have multiple connections to different datasources