Afterwood

Afterwood

By Chris Oldwood

Overload, 31(178):16, December 2023


Halloween has been and gone. Chris Oldwood therefore takes time to consider the ghosts in the machine.

I always find writing an article at this time of year particularly challenging. Due to the lead time in publishing, we’re currently celebrating Halloween, but by the time this hits the metaphorical shelves, we’ll be enjoying the Christmas festivities. In the world of programming, this Halloween/Christmas duality is even immortalised in a joke that riffs on date formats and number bases to illustrate that ‘OCT 31 == DEC 25’.

If you’ve ever worked with databases, you may have come across The Halloween Problem. This was an issue coined back in the 1970s, where a database update to adjust employee salaries used an index on the same column that was being updated. As each salary was increased, it caused the row to move further down the index and consequently was visited multiple times, meaning that every salary was bumped repeatedly until it exceeded the threshold specified for the raise. If you were one of the lowest paid employees in that company, then Christmas would definitely have come very early that year!

By the time I had started working with databases, all these issues had been worked out and the ACID guarantee had become a staple question at job interviews. It wasn’t until the rise of the NoSQL movement in the late 2000s and the new breed of document-oriented databases like CouchDB and MongoDB that I started to become more aware of issues like The Halloween Problem, and its associated problems like Phantom Reads. The performance demands of ‘Web 2.0 at scale’ meant that the cool kids were happy to trade their ACID guarantees for throughput, and so the pendulum started to swing back the other way, and we also got to use the new excuse ‘eventual consistency’ whenever things didn’t quite add up. In those early days, some database products traded off more than just the ACID guarantees. In the race to appear fastest in the benchmarks, they chose dangerous defaults which meant you couldn’t even be sure if your request left the machine. The pitchforks came out, the ‘NoSQL Considered Harmful’ posts were written, and the pendulum swung back again towards the Pit of Success.

Even if you do manage to avoid the phantoms in your result-sets you’ll struggle to escape one of the most curious perversions in relational database logic – the non-value NULL. After scratching your head wondering why your SQL query doesn’t work as you expect when NULLs are present, you learn that one NULL does not equal another NULL and you need to litter your SQL code with IS NULL, ISNULL, COALESCE, etc. instead. Until, that is, you introduce certain aggregations, grouping, or sorting constructs, at which point NULL starts to feel like it does have equivalence semantics after all. But the SQL standards committee are a cunning bunch and with a little sleight of hand they sidestep the apparent similarities of equivalence by introducing the concept of ‘distinctness’ instead and any notion of equivalence remains merely a figment of your imagination.

If a database NULL is a value which doesn’t exist then, in the world of floating-point numbers, the undead comes in the form of a nan – once a number, full of life, but now destined to walk the Earth turning every other number it meets into the undead, too. This one really is evil, though, as there is no standards committee to save you here if one creeps into your collection before you try and sort it. Depending on the language and sort implementation, you might be lucky and escape with a sequence that remains intact, whereas if you’re unlucky, your sort won’t complete until the heat death of the universe, making the outcome a moot point. I’m pretty sure IEEE754 wasn’t what the late Fred Brooks had in mind when he warned us there was no silver bullet but maybe he also advised us to decorate our collection types with garlic in one of his lesser-known essays.

If you’ve dabbled in computing for even a small amount of time, you’ll likely have experienced ‘The Ghost in the Machine’. Like many real-world ghost sightings, they eventually get debunked. That elusive bug, which initially appears to be from another realm, turns out to be entirely real, and all too often self-inflicted. We might call it Undefined Behaviour to avoid frightening the children, but it’s really a portal to another dimension where only those tooled up like Ash Williams will make it out alive. C++ in particular has the kind of power to contact the dead all too easy, although it tends to be hackers that celebrate its ‘use after free’ abilities.

Einstein famously coined the phrase ‘spooky action at a distance’ to describe the weirdness of quantum mechanics but you don’t need to delve that far down the technology stack to experience the spookiness of hardware. Even though we’re becoming less susceptible to the vagaries of some older technologies – like hard disk drives, aka ‘spinning rust’– we are more reliant on network connectivity, meaning that Leslie Lamport’s famous quip from the late 80s about distributed systems and being reliant on computers you didn’t even know exist is becoming ever more prescient. Failing and loosely seated RAM chips also provide just enough of a distraction to make you question your sanity before declaring the host cursed.

The complexities of modern hardware and software can make it feel like you’re constantly being haunted by a poltergeist as you struggle to reason why your code is not behaving the way you intended. Halloween might only be one day for normal people but for programmers it can feel like we’re permanently living in The Upside Down. When you’re battling with phantoms, daemons, and zombies, sometimes it can feel more like exorcism than programming.

Chris Oldwood is a freelance programmer who started out as a bedroom coder in the 80s writing assembler on 8-bit micros. These days it’s enterprise grade technology from plush corporate offices the comfort of his breakfast bar. He has resumed commentating on the Godmanchester duck race but continues to be easily distracted by emails and DMs.

Web Editor's Note: ACID stands for Atomicity, Consistency, Isolation, and Durability. See https://en.wikipedia.org/wiki/ACID.






Your Privacy

By clicking "Accept All Cookies" you agree ACCU can store cookies on your device and disclose information in accordance with our Privacy Policy and Cookie Policy.

By clicking "Share IP Address" you agree ACCU can forward your IP address to third-party sites to enhance the information presented on the site, and that these sites may store cookies on your device.