What is the overhead of getting a fresh
connection from
the connection pool instead of passing the connection around?
How much faster are repeated SQL statements when using a fixed
query string with bind variables as opposed to directly interpolating
the data into the query string? How much faster when re-using
the same prepared statement? How much faster when using a batched update?
I ran a little benchmark.
- A) 1000x [getConnection createStatement executeUpdate commit]
- B) 1000x [getConnection prepareStatement executeUpdate commit]
- C) getConnection prepareStatement 1000x [executeUpdate commit]
- D) getConnection prepareStatement 1000x [executeUpdate] commit
- E) getConnection prepareStatement 1000x [addBatch] executeBatch commit
I wanted to test Oracle XE on Ubuntu, but did not get either installed
(the eMachine did not like the Ubuntu CD, and Oracle's web-site was unresponsive),
so I went with Postgresql 8.3 and MySQL5(InnoDB) instead. The databases were
running on Windows XP, both fresh installs using the default settings,
accessed from the Java test program on a Mac mini via local ethernet network.
|
- Commits against MySQL are amazingly slow. I assume that this is a problem with my setup, or with Windows. This also probably only affects the transactional InnoDB backend.
- With MySQL, there is no speed difference between methods A, B, and C, and hence no visible performance advantage to prepared statements. Maybe the JDBC driver does not implement the feature. With Postgresql it seems to improve throughput, but not by much. The Oracle figures should be interesting here.
- Committing only once instead of separately after every update makes a big difference, especially with MySQL (see above). Of course, performance considerations should not be a factor in deciding what a transaction is.
- Bulk updates give another big boost to Postgresql, not so much to MySQL.



