The Vector Refactored

By Teedy Deigh

Overload, 30(168):15-16, April 2022


Finding the right level of abstraction can be challenging. Teedy Deigh razes the level of abstraction.

@Polymorpheus: What if I told you that vector is not the only data structure available from the standard library?
@Teedy: But what if I told you, Polly, that the C++ standard says that “vector is the type of sequence that should be used by default”? As any engineer knows, don’t mess with the defaults, amiright?
[Teedy looks for support from the others on the call. She has Zoom in Muppet Show mode, but all she gets is a wall of muted silence against a backdrop of either camera-off darkness or Star Wars backgrounds and devs who are obviously ‘multitasking’.]
@Polymorpheus: I think it’s reasonable to use string for strings rather than vector<char>, don’t you?
@smith: And also u16string and u32string – but please, please, don’t use wstring.
@Polymorpheus: Yes, thank you, Winston. For a relatively new codebase, this system does seem a little stuck in the last century. Either way, basic_string is the template you’re looking for.
@Teedy: I can roll with that – still an array under the hood, which we know is the one true data structure.
@Polymorpheus: I appreciate the fundamental nature of arrays, but it feels a little low level for some of the code, such as – wait a moment, let me just share my screen [code appears] – this code here, which queues requests.
@Teedy: Oh yes, love that code.
@smith: Love is blind.
@Polymorpheus: I think a deque would have been just fine.
@Teedy: Pfft, deque’s just a vector wannabe that lacks cache locality. It’s all over the place. This variant of a circular buffer does everything needed, and with the kind of code that keeps me in work.
@smith: Wouldn’t it be better to encapsulate all this [waves hand at camera] in a class rather than leaving the logic strewn all over the code, big footprints of duplicated control flow everywhere the vector – sorry, circular buffer – passes? There’s so much copy and pasting [cat walks across keyboard in front of camera, a stream of garbled characters fills the chat] I get a haunting sense of déjà vu every time I look through this code.
@Teedy: Repetition legitimises.
@smith: ...
@Teedy: What?
@smith: That, Ms Deigh, is the sound of incredulity.
@Polymorpheus: Winston has a point, though, Teedy. I think for a system that is I/O bound – spending most of its time in the OS, the network, setting cookies and chatting to the Oracle database – this micro-optimisation doesn’t justify the code complexity it brings.
I presume low-level caching is also why you steer clear of map in – wait a moment – this code?
@Teedy: Strictly speaking, that is a map... it’s just that it’s implemented as a vector. I mean, really, that’s all a map is, isn’t it? It’s a binary tree that wants to be a sorted array. I just chopped it down to size.
@smith: I think you should be trying to raise the level of abstraction rather than raze it to the ground.
@Teedy: Look, all those other data structures are just simulating certain algorithms on a vector. Rather than live in a simulation, I prefer to keep it real.
@Polymorpheus: How do you define ‘real’? At one level, these are all simply electrical signals. We need to work at a higher level than that.
@Teedy: This talk of higher-level stuff is all very well, but performance has always been an issue with this system – it’s like watching everything in slow motion. I’m doing my bit to make my code as cache friendly as possible, given how cash unfriendly some of our customer response has been.
And what about thread safety, atomic operations and low-lock code? A lot easier with vector than any data structure based on linking together fragmented memory. I could go on.
@smith: I don’t doubt it.
Well, rather than let Teedy rabbit on, I think we should address the elephant in the Zoom: the architecture. Teedy’s micro-optimisations are infuriating from a coding perspective, but her desire to optimise is understandable given what she’s working with and around.
@Teedy: Thanks... I think.
@smith: I mean, this code shouldn’t even be multi-threaded. A thread per request? That’s so 1990s. I’m not even convinced we should be using Oracle.
@Teedy: Agreed. With hindsight, NoSQL might be the better option.
@Polymorpheus: You’re SWOMming.
@Teedy: What?
@Polymorpheus: Not you, the architect. I think he’s been trying to say something, but has been speaking while on mute.
@TheArchitect: Ah, yes, umm... as I have been saying for the last few minutes, I planned the architecture up front and in meticulous detail. I sent the spec round a few months ago.
@smith: I think that ended up in my spam folder.
@Teedy: I got it. It inspired me to write that limerick.
@smith: Oh yes, that was actually quite good.
@Teedy: Thanks... I think.
@smith: How did it go?
@Teedy: UML, UML, UML,
Bloody Hell, Bloody Hell, Bloody Hell,
UML, Bloody Hell,
Bloody Hell, UML,
UML, UML, Bloody Hell.
[Thumbs-up and applause emojis appear next to some of the callers.]
@TheArchitect: Anyway, this is the sixth time we have rewritten this system, and I am convinced it is going to work out well this time. There is no need to change anything.
@smith: You mean, there’s no need to change anything from all the times it didn’t work out?
@Polymorpheus: We’re a bit more agile these days. We run retrospectives to avoid repeating the mistakes of the past.
@TheArchitect: Repetition legitimises.
@smith: Dammit, not everyone believes what you believe.
@TheArchitect: My beliefs do not require them to.
@Polymorpheus: I think it would help us and the system if we understood the rationale behind some of your more, errm, interesting and dated–
@TheArchitect: –timeless–
@Polymorpheus: –decisions. I’m not sure we’re ready to just swallow the take-it-as-read pill.
@TheArchitect: Some of my answers you will understand, and some of them you will not.
@Teedy: Always riddles with you, huh?
@smith: Do not try and bend the architect. That’s impossible. Instead, only try to realise the truth.
@Teedy: What truth?
[@TheArchitect disappears.]
@Polymorpheus: There is no architect. One of the advantages of being the meeting host.
[Thumbs-up and applause emojis appear next to some of the callers.]
@smith: Yeah, he’ll probably think it was a bad connection or blame it on Zoom.
@Teedy: OK, so now we can redesign the system, right?
@smith: We should aim for something more stateless, functional and habitable. For storing options and other data, I’d like to move away from XML... all those bloody, pointy angle brackets – whoever said XML was human readable clearly had a very specific human in mind. Perhaps we could bridge the code–data divide and opt for something like LISP? It’s like wiping your config with silk.
@Teedy: I know Scheme.
@Polymorpheus: Show me.

Teedy Deigh noodles with code and codes with noodles. She enjoys unlimited coffee refills and avoiding long walks... and short walks... and, in fact, any kind of walk whatsoever. She has no love of the outdoors and believes that doors are there for a reason. For her star sign, she identifies more with -> than *. She has no pets, and the only cat she likes is on the command line.