The Dummy Key
There's no question about it: the Alpha AXP platform was amazing. Here in the UK we have an expression: "like sh*t off a shovel", meaning extremely fast, and the Alpha was certainly that. I have never been quite so astounded as when I started up Sybase ASE on an Alpha platform running Tru64. Honestly, the thing had started before I'd even lifted my finger off the Return key.
So it was somewhat surprising when I was called in to investigate a performance issue with a database application running on this platform. In fact, the site had recently upgraded their version of Sybase ASE, and the new version was showing a noticeable decrease in performance compared to the previous version.
The reason for the performance slow-down was due to the way Sybase reads from the network. After any outstanding queries have been processed, and all the results have been sent back to the client processes, Sybase will continue polling a configurable number of times ("runnable process search count") to check for new queries. After that time it will issue a blocking network read, and let the operating system wake it up when there is something new. You could think of it as driving a car in traffic - if you are constantly stopping and starting then you keep the engine running so that you can pull away quickly when the traffic flows again. However, if you get stuck in a traffic jam, or otherwise haven't moved in a while, you may decide to switch the engine off. When the traffic moves again, the time it takes to restart the engine and pull away is longer than if you had kept the engine running.
From a database point of view, 'keeping the engine running' is a good way of reducing the latency (i.e. the delay) when processing new queries, since the server doesn't need to 'wake up' each time. As a side note, system administrators often see high CPU usage on their machines and incorrectly assume that the database server is working too hard, not realising that a lot of this time is spent simply polling for new queries.
So back to the performance problem: the issue with this particular application was that the new version of Sybase issued its blocking network read differently, and so took longer to 'wake up' every time a new query was received over the network. Because the Alpha platform was so fast, the database server completed its default number of network polls for incoming queries (2000, in this case) before a new query had been issued, and thus went to sleep. The work-around was obviously to change the number of times that Sybase polled to a higher number (20,000), so that the 'engine' was still 'running' when a new query came in.
During the course of the investigation, however, I discovered probably one of the most inefficient ways of generating unique IDs for new rows that I have ever encountered. Most of you will have come across a table that has a number of columns, including one that is a unique ID. There are a number of ways of generating a unique ID for the primary key, but the two most common ways are: (a) to use an identity column, which gets populated automatically by the server when the new row is inserted, or (b) to have a separate table holding the next unique ID - the application retrieves this and then increments it for the next insert.
In these two cases, the unique ID will either be generated during the insert (for the identity column), or before the insert (as in the separate ID table). This application, however, took a unique approach to this, which was as follows:
- A new row was inserted into the table with an ID of -1.
- A trigger fired, which retrieved a new ID from a separate ID table, and then updated the ID of the newly inserted row to the new value.
A mistake I often see with triggers is when they are used to alter values in newly inserted rows (why not just put the correct values in to start with?!), but doing this with the primary key is nothing short of bizzarre. I can only assume that the idea came about after a serious night on the tequila shots.
Of course, had they done the insert properly, this would have more than compensated for the increased latency following the upgrade.
The White Elephant Settlement System
Those of you who have read my article describing name-value pairs will already be aware of the star of this next mishap: the massively inefficient settlement system.
Instead of writing a dedicated settlement system, the creators of this application decided that they were going to have a bit of fun writing a generic workflow system, and then tailor it for settlements. Underlying the workflow system was a number of worker processes, each of which could perform a unit of work. There were different types of work unit, and these were all represented by virtual objects that were passed between the worker processes. Some work units generated additional work units, and others destroyed them. All the virtual objects were held in the database as name-value pairs, and some could have hundreds of attributes.
Furthermore, this particular software house took the 'helpful' step of taking the workflow design away from the developers, and putting them it in the hands of the business users. Clearly, these users didn't have efficiency at the forefront of their minds, so all in all, to settle just one trade involved over 10,000 (yes ten thousand!) INSERT, UPDATE, or DELETE queries. To make matters worse, often thousands of queries would be encompassed in a single transaction (BEGIN - COMMIT block).
Unsurprisingly, they had problems with performance, and particularly deadlocks.
The Lazy Auto-generator
I have to be honest, I'm not a big fan of auto-generated code. It's a lazy way of producing software, and it almost invariably ends up producing drastically inefficient code. There, I said it.
This next example, though, was quite special.
The system processed orders. Each order had over a hundred attributes associated with it, and these were periodically persisted (saved) back to the database. When a new order was persisted, a row was inserted into the order table, and when an existing order was persisted, a row was updated. That all sounds simple enough. Unfortunately, the external system was written in such a way that it did not keep track of whether an order was new (and thus needed inserting) or existing (and thus needed updating), and furthermore it had no idea what attributes had been updated. To put the boot in, it persisted everything periodically - both new orders and existing ones, whether they had been updated or not!
Naturally it was left to a stored procedure to check if this was a new or updated order. The insert was standard and uninteresting, but the update was simply a gem: every column was updated. That's right - every column, including the primary key - the order ID (which of course never changed). In their rush to 'save' time on coding, the developer had created an auto-generator that simply updated every column as part of the update, including the primary key, whilst using the very same primary key to identify the row in the WHERE clause.
As well as the primary key index, the order table had a further six indexes. This query was running on a Sybase Adaptive Server Enterprise (ASE) database, so the query plan for the stored procedure was generated the first time the stored procedure was called. Naturally, not knowing what columns were actually updated, the optimiser was forced to maintain all the indexes as part of the update. So every time an existing order was persisted back to the database, no matter if zero or all the columns actually changed, all the indexes were maintained. In this case, they were maintained expensively, meaning as deletes followed by inserts.
Add to the mix massive connection pools, and it won't be at all surprising to hear that at busy times there were huge numbers of deadlocks. After a while the external processes would simply give up and try again later. The orders were persisted back to the database eventually.
The developer in this story was unrepentant, insisting that they weren't going to change anything because it made his life easier ...
The Fire-and-go-on-holiday Report
For this mistake we turn to telecoms. One particular telecoms company had a traffic route report that they used to run every week. Essentially it was a big GROUP BY query that summarised all their voice traffic over their interconnect routes, and came out with a report at the end of it. Back when it started up and the volume traffic was low, it could be run as a single database query. As the traffic began to grow, and there were many millions of calls made every month, the database server could no longer handle the grouping as a single query.
The developers were obviously quite keen to keep all processing on the database server, because the solution they came up with was a rather elaborate scheme that worked as follows:
- A temporary table was created, and populated with a list of all the customer accounts.
- A number of sub-processes were spawned to work in parallel. The exact number was generally picked out of the air according to wind speed and direction ...
- Each sub-process would take a customer account, group their traffic by interconnect route, and then merge the aggregates (number of calls etc.) with a central list of routes.
Of course, this was starting to get a bit shaky, especially since single customers were starting to get such large volumes that grouping them in single queries was approaching the limit of the server's capacity. The job was so inefficient that the report took 5 days to run. If that wasn't enough, periodically the sub-processes would die and have to be restarted - so the job needed constant monitoring.
The report was re-written in such a way that an external application simply pulled out all the traffic data from the database, and then merge-sorted and grouped the traffic in blocks. The report subsequently took 5 hours to run.
There are two main lessons that we can learn from this example, which seem to crop up repeatedly with database development:
- A database server is not an application server.
- Not all sorting and grouping needs to be done in the database.
The System Start-up Scandal
The administrators of this next system would restart it every weekend. Sometimes this would be in order to perform routine maintenance, but even if there was no maintenance, every weekend it would be brought down and subsequently back up again. The application was pretty memory hungry, being implemented in Java, and had numerous memory leaks, so bouncing all the servers was no doubt a worthwhile activity.
Among the server processes that comprised the application were a set of order management servers. These held financial product information in memory, and the set of products that they would load were the complete set that had been traded to date, by any of the order management servers. During start-up, each server would in turn run a medium-complexity database query, and load up the complete set of products that they were interested in. Every server would run the same query, and load the same product set. About 6 of these were started, each one took half an hour to run, and they were started sequentially - once one had finished, the next one would be kicked off.
Three hours is a long time to wait for 6 processes to run exactly the same query, and load exactly the same set of products.
Strangely, no-one questioned that this was an extraordinary waste of time. It had become accepted as a normal weekend activity that would be shared out between the various system administrators, on a rota system. Sigh ... sometimes I think my IT colleagues don't want a life ...
The Oversized Cache
For the last mistake we return to Sybase Adaptive Server Enterprise.
Sybase ASE has a feature that allows administrators to create user-defined partitions within its data cache, known as "named caches". Database objects, and even databases in their entirety, can then be "bound" to the named cache, which could be used to provide objects with a dedicated area of memory, for instance.
Unfortunately this feature tends to be over-, and often mis-, used, as this next example shows. The administrators of this system had created a 500 MB named cache, but bound just one 50 MB database to it. That left 450 MB of partitioned but unallocated cache that would never be used - memory that was sorely needed by the rest of the server.
My take on named caches? If your system is well written and running efficiently then they're great for tuning, but unless you can prove to me that your system is "well written and running efficiently" and can show me some figures that prove that your application runs faster with them - don't bother. Don't micro-manage resources unless you have some pretty specific requirements.
In a similar vein to the previous example, I once came across a site that was using an EMC for its storage array. It turns out that the engineer who had installed it had left the read and write caches at their default sizes, despite there being double the amount of physical cache available on the array controller. The moral of this story: don't rely on default settings!