Merce Select

Relational databases

Database goes relational

  • We have developed applications to integrate with Oracle, IBM DB2, MySQL, PostgreSQL, mSQL, SQLite, and Microsoft SQL Server database management systems. We have integrated with these database from Perl, PHP and Java code.

  • Batch data On various occasions, we have had to manipulate very large volumes of data in batches. For instance, there may be a list of 50,000 unique customer IDs in a file, and we have to update the status fields of all customer records whose IDs are listed in the file. This is a batch operation, but part of the data is in a bach file and part is in the database. These situations give rise to performance challenges with interesting options.

    We have tried three approaches to tackle these situations. One approach is to use application code and read the batch file line by line, and issue SQL update statements to update the status of each correspoding customer record in the database. This is the slowest, as expected. The second approach is to load all data from the batch file into a temporary table and then run a stored procedure which searches through the temp table and does the status updates in the main table. This is relatively much faster. The third approach involves a temporary table but involves no stored procedure. We just run one or more complex SQL statements to do the status updates. This approach is roughly as fast as the stored procedure but has the advantage of not requiring a stored procedure. We try to minimise the use of stored procedures where performance permits so that the application code remains database independent to the extent feasible.

  • Infrastructure We have set up database mirrors and data protection in interesting ways where required. We have configured database systems to replicate from active to backup server on a few occasions. PostgreSQL v9 has excellent near real-time replication support, and retains the backup server in an active state to handle read-only SQL statements if needed.

    One customer had many small servers, each of which ran a copy of their application and database. This customer wanted data to be replicated from the active to the backup database when both database instances were on the same server, but keeping their data on two different spindles. We set up this replication by hand-configuring two different database server instances to listen on two different port numbers on the same computer.