We are taught to count as children. Frances Buontempo wonders: how hard can it be?
I’ve been somewhat distracted by various goings on recently. I attended the SoCraTes UK Unconference and the inaugural ‘AI for the rest of us’ conference. I’ve blogged about these if you’re interested [Buontempo24a, Buontempo24b]. We’ve also been looking after some neighbours’ animals while they are on holiday. All of which means I have, of course, not written an editorial.
The neighbours have chickens and guinea pigs, which we have looked after before. They now have several quails too. We were told there are 11, but counting quails turns out to be difficult. For starters, I have never seen a quail in real life before. When we first went round, I could see one and thought it was near three light coloured stones. But the stones then opened their eyes and moved. Quails come in different colours. By the end of the week, I had started to get the hang of counting small animals I don’t know much about, and believe I spotted 11. Hard to be sure. Had we spotted more than 11 that would have been a surprise. Counting small birds isn’t easy, even when you’ve got the hang of the shapes and colours to look for. They also move very quickly. Fortunately, we didn’t spot any outside the run. We might be invited back.
Counting quails is difficult, but counting anything can be problematic. How many unfixed bugs does a software system have? Hopefully, they are in a bug tracking system, and you can query to get an answer. As with the quails though, are some reported twice, which means you are double counting? Or have some been marked as ‘won’t fix’, so they are no longer open? They are still bugs, surely? Needing to qualify ‘how many’ might not be your first thought, but it’s important. Knowing an absolute number might not be that useful, but a trend might be informative. Does the number of bugs go up over time? On the face of it, an increasing number of bugs sounds dreadful. However, this might mean more people are using the software, rather than new bugs being added to the system on a regular basis. A raw number isn’t always helpful.
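To see how much the answer depends on the question, here is a minimal sketch with a made-up tracker export. The data, the status names and the assumption that a duplicate report shares an id with the original are all hypothetical; in a real tracker, duplicates usually have different ids and need a person to spot them.

```python
from collections import Counter

# Hypothetical tracker export: (bug id, status). Bug 101 was reported twice.
reports = [
    (101, "open"),
    (102, "open"),
    (101, "open"),
    (103, "won't fix"),
    (104, "closed"),
]

# Deduplicate by id so a bug reported twice is counted once.
# If an id appeared with different statuses, the last one listed would win.
unique = dict(reports)

by_status = Counter(unique.values())
open_bugs = by_status["open"]
# 'Won't fix' bugs are no longer open, but they are still unfixed.
unfixed_bugs = open_bugs + by_status["won't fix"]

print(len(reports), open_bugs, unfixed_bugs)  # 5 2 3
```

Three different numbers come out of the same five rows, depending on whether you count reports, open bugs or unfixed bugs.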
BBC Radio 4 sometimes runs a programme called More or Less by Tim Harford [BBC]. The latest episode unpacked a claim that 50 million leaves will be removed from railway tracks in the South East UK this year. He asked if this was a big number, or a small number, or a silly one. I’m not sure what a silly number is: perhaps I should make up a definition. A biodiversity strategy manager at Network Rail was interviewed to try to understand where the number had come from. If you’re not from the UK, you may not know that we frequently have news in the autumn telling us about too many or even the wrong sort of leaves blocking train lines. There are myriad other excuses for our trains not running on time, so it’s a bit of a joke. Anyway, Network Rail counted trees (using LiDAR), 13 million in total on the UK train network (or nearby), with 1.5 million in the South East. So, how many leaves are there per tree? It depends. Apple trees apparently have 50,000. Harford didn’t explain how that figure was arrived at, but that’s estimates for you. South Eastern Rail used 50,000 leaves per tree to arrive at their total figure of 75 billion leaves. This number was “pruned back to 50 billion for reasons unknown”. They then said 99.9% of these might not need to be cleared, but the others might. And there you have 50 million. All of which begs the question, how useful is this number? Harford suggested this is what he thinks of as a silly number. It might be more informative to state how much time or money might be spent clearing leaves.
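The arithmetic behind the headline figure can be reconstructed in a few lines. This is only a sketch using the numbers quoted above; the variable names are my own, and integers are used throughout so the results are exact.

```python
# Figures as quoted in the programme.
trees_south_east = 1_500_000   # trees counted by LiDAR in the South East
leaves_per_tree = 50_000       # the apple tree estimate

raw_leaves = trees_south_east * leaves_per_tree
print(f"{raw_leaves:,}")       # 75,000,000,000

# "Pruned back to 50 billion for reasons unknown".
pruned_leaves = 50_000_000_000

# 99.9% might not need clearing, so 0.1% (1 in 1,000) might.
leaves_to_clear = pruned_leaves // 1_000
print(f"{leaves_to_clear:,}")  # 50,000,000
```

Every step is a multiplication of rough guesses, so the final 50 million inherits all of their uncertainty.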
Estimation is, by definition, usually inaccurate. It can be useful, though. I’ve written a couple of books, and have started a third recently. The proposal requires an expected number of pages. How do you guess? I prefer shorter books, so I can manage to read them over a few days or weeks without forgetting earlier details. I wrote about book lengths a long time ago, in ‘Too Much Information’ [Buontempo12]. I weighed K&R (The C Programming Language) and discovered it was 375 grammes, which makes it suitable for carrying in a bag on a journey. The exact number of pages varies depending on where you look, but it’s around 250. Yes, I know, the weight probably varies too. It’s a ball-park figure.
How did I estimate the number of pages I would write? Badly, certainly. But sketching out the potential chapter titles, and having in mind that 250 pages is ideal, meant I could claim each chapter would be about 20 pages, and bosh, a total page count somewhere around 250. I suspect what is more important than the actual number is using that as a guide, so you can tell if you are going into more detail than you planned, or haven’t written as much. An estimate isn’t a commitment, but it can be a guide. Sergey Ignatchenko wrote about ‘The Importance of Back-of-Envelope Estimates’ [Ignatchenko17]. I used a strapline starting “Guestimate questions make many people grumble.” I have been asked to guestimate the number of petrol stations in the UK so many times at interviews that I hit a point where I had to make an effort not to groan out loud when asked. The article emphasized finding an order of magnitude, which could show something would be impossible. This could stop you wasting hours trying to code up something that could never work. Both providing a page count and doing a back of the envelope calculation can provide some helpful input to a process from the start, potentially avoiding problems later on. Another simple example might be walking somewhere unfamiliar. If you know approximately how far you need to go, you can track how long you walk for as a hint about whether you are probably going in the wrong direction. If somewhere ought to be a 20-minute walk and you aren’t there within half an hour, you may need to retrace your steps.
Estimates are used in various contexts. Agile teams usually come out with story point estimates for code requirements. As I am sure you know, the story points are meant to indicate effort rather than a time commitment. Whether they are always used like that is another story. Maybe you have seen Lunar Logic’s ‘1/TFB/NFC’ estimation cards [LunarLogic]? The letters on the estimation deck are a bit sweary in full, but your options are 1 point, too big or no chance. I like the idea of deciding if something is plausible, giving it one point, or too big, so able to be broken into smaller chunks, or downright impossible. You may not agree, which is fine. A recent LinkedIn post [Ottinger24] talked about product owners thinking developers tend to seem obstinate and reluctant. Surely they just need to type in the code? The product owners then spent a day pairing up with developers, and saw what they actually needed to do. Imagining how something works and actually doing it can be miles apart. Walking a day in another person’s shoes, as the phrase goes, can be illuminating. By the same token, counting quails might sound easy, until you try it yourself. They blend in with the background and move. It’s complicated.
To answer questions such as “How many quails?” or “How much effort?”, you clearly need a definition of ‘quails’ or ‘effort’. You also need a way to qualify the ‘how much/many’ as well. Obviously, with numbers? Well, maybe not, as the estimation cards just mentioned show. However, we are usually fundamentally counting when we answer these questions. ChatGPT tells me:
Counting is a basic mathematical process used to determine the quantity or number of items in a set. It involves incrementally assigning numbers to items, either one-by-one or in groups, until reaching a total count.
It has sneaked the word set in there, perhaps to avoid double counting, as I am sure I did with the quails. Ignoring the AI for now, you count by mapping items to the natural numbers, giving you labels 1, 2, 3,… n for each quail, or whatever you are counting. This is also called an enumeration and neatly avoids defining numbers, which is a whole other topic. There are several different kinds of numbers. How many, I wonder. The AI listed 7:
- Natural numbers: 1, 2, 3, …
- Whole numbers: 0, 1, 2, 3, …
- Integers: … -3, -2, -1, 0, 1, 2, 3, …
- Rational numbers like ½, -¾
- Irrational numbers like √2, π
- Real numbers (no examples given)
- Complex numbers like 3 + 4i.
The rational numbers were not put in order because that’s slightly difficult. The AI failed to mention transcendental numbers, and left out Hamilton’s quaternions. I could have made the list myself, but hey. If you’ve not come across quaternions before, go have a read [Wikipedia-1]. They are lots of fun. They extend complex numbers, using a j and k as well, with the property
- i² = j² = k² = −1
- ij = −ji = k, jk = −kj = i and ki = −ik = j.
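These identities are easy to check in code. The sketch below stores a quaternion a + bi + cj + dk as a tuple (a, b, c, d) and multiplies two of them with the standard Hamilton product; the function and constant names are my own choices rather than any library’s API.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i part
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j part
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k part
    )


def qneg(q):
    """Negate every component of a quaternion."""
    return tuple(-x for x in q)


ONE = (1, 0, 0, 0)
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

# i² = j² = k² = −1
assert qmul(I, I) == qmul(J, J) == qmul(K, K) == qneg(ONE)

# ij = −ji = k, so the order of multiplication matters.
assert qmul(I, J) == K
assert qmul(J, I) == qneg(K)
```

The last two assertions show the property that makes quaternions feel strange at first: multiplication is not commutative.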
I recall a classmate during a mathematics lesson at school going off on a rant when complex numbers were introduced, based on them being obviously made up. They are, but they are interesting in and of themselves. They also have practical uses, including connections with sine and cosine, making them useful for cycles such as sinusoidal currents and voltages [ECStudio]. (Beware that electrical engineers use j for imaginary numbers, rather than i, which is obviously reserved for current.) We didn’t cover quaternions at school, which would probably have upset my classmate even more.
I wonder if you can enumerate all the types of numbers. To enumerate, you must be able to order the elements, and I don’t know if you can really do that. Some types of numbers are subsets of others. The natural numbers are included in the integers, and so on. This gives you some kind of partial ordering. Weirdly, the sets of whole numbers and integers contain the same number of elements. Map 0 to 0, odd whole numbers to 1, 2, 3, … and even whole numbers to -1, -2, -3, … and you have a one-to-one mapping, so they must be the same size. There are more real numbers though. The proof is left as an exercise for the reader, or go read about Cantor’s diagonal argument [Wikipedia-2]. The real numbers are therefore an example of an uncountable set. The quails were almost uncountable, but for different reasons.
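The one-to-one mapping between whole numbers and integers can be written down directly. This is a small sketch with function names of my own invention; it follows the mapping described above, sending 0 to 0, the odd whole numbers to the positive integers and the even whole numbers to the negative integers.

```python
def whole_to_integer(n):
    """Map 0, 1, 2, 3, 4, ... to 0, 1, -1, 2, -2, ..."""
    if n % 2 == 1:
        return (n + 1) // 2   # odd: 1, 3, 5, ... -> 1, 2, 3, ...
    return -(n // 2)          # even: 0, 2, 4, ... -> 0, -1, -2, ...


def integer_to_whole(z):
    """Inverse mapping, from integers back to whole numbers."""
    if z > 0:
        return 2 * z - 1
    return -2 * z


print([whole_to_integer(n) for n in range(7)])  # [0, 1, -1, 2, -2, 3, -3]

# Going there and back again returns the starting number.
assert all(integer_to_whole(whole_to_integer(n)) == n for n in range(1000))
```

Because each function undoes the other, no integer is missed and none is hit twice, which is what one-to-one means here. Checking the first thousand values is a spot check rather than a proof.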
Part of the difficulty with the quails was their movement. Counting or, more generally, measuring is difficult when things move. Heisenberg’s uncertainty principle immediately springs to mind [Wikipedia-3], which states that we cannot measure both the position and the momentum of a subatomic particle to arbitrary precision at the same time. To be fair, the movement itself might not be the problem, though you don’t get much momentum without movement. Trying to measure software does run into similar problems. Instrumenting code for profiling changes the code itself. The numbers will be wrong, but can still be informative.
Now, I’m sure I double-counted some quails, as I mentioned. The approximate figure was close, though. Doing the same thing twice isn’t the end of the world, but can be slightly annoying. If you have come across the Lunar estimation cards before, I notice I mentioned them in ‘I am not a number’ [Buontempo17]. I suspect what I have written now is a slight overlap, rather than a complete clone. Writing an editorial is impossible! Counting is also rather difficult, so don’t be too hard on yourself if you have an off-by-one error, or get a number wrong. Spotting the inaccuracy is brilliant, and the order of magnitude approximation might be good enough.
References
[BBC] ‘More or Less’ on BBC Radio 4: https://www.bbc.co.uk/programmes/b006qshd
[Buontempo12] Frances Buontempo, ‘Too Much Information’ (editorial), Overload 111, published October 2012, available at https://accu.org/journals/overload/20/111/buontempo_1885/
[Buontempo17] Frances Buontempo, ‘I Am Not a Number’ (editorial), Overload 139, published June 2017, available at https://accu.org/journals/overload/25/139/buontempo_2377/
[Buontempo24a] Frances Buontempo, ‘SoCraTesUK 2024’ on BuontempoConsulting, posted 24 September 2024 at https://buontempoconsulting.blogspot.com/2024/09/socratesuk-2024.html
[Buontempo24b] Frances Buontempo, ‘AI for the Rest of Us’ on BuontempoConsulting, posted 29 October 2024 at https://buontempoconsulting.blogspot.com/2024/10/ai-for-rest-of-us.html
[ECStudio] ‘Complex Numbers in Electronics’, ECStudio, available at https://ecstudiosystems.com/discover/textbooks/basic-electronics/ac-circuits/complex-numbers-in-electronics/
[Ignatchenko17] Sergey Ignatchenko, ‘The Importance of Back-of-Envelope Estimates’ in Overload 137, available at https://accu.org/journals/overload/25/137/ignatchenko_2341/
[LunarLogic] Lunar Logic estimation cards: https://estimation.lunarlogic.io/
[Ottinger24] LinkedIn post (original poster, Tim Ottinger, but many contributors), initially posted during November 2024 at https://www.linkedin.com/posts/agileotter_ive-had-pos-tell-me-that-they-didnt-know-activity-7257067028925108224-Stxo
[Wikipedia-1] Quaternion: https://en.wikipedia.org/wiki/Quaternion
[Wikipedia-2] ‘Cantor’s diagonal argument’: https://en.wikipedia.org/wiki/Cantor%27s_diagonal_argument
[Wikipedia-3] ‘Uncertainty principle’: https://en.wikipedia.org/wiki/Uncertainty_principle
Frances Buontempo has a BA in Maths + Philosophy, an MSc in Pure Maths and a PhD using AI and data mining. She’s written a book about machine learning: Genetic Algorithms and Machine Learning for Programmers. She has been a programmer since the 90s, and learnt to program by reading the manual for her Dad’s BBC model B machine.