Are we nearly there yet?

By Frances Buontempo

Overload, 26(147):2-3, October 2018


Deciding if you are making progress can be a challenge. Frances Buontempo considers various metrics and their effects.

Summer is a time for holidays, including road trips for many, possibly with children screaming, “Are we nearly there yet?” from the back of the car. We didn’t get a chance to have a holiday but did go to a metal festival, so I’ve been too busy discovering new music to write an editorial. In fact, the deadline for this issue of Overload sneaked up on me, so I haven’t even tried thinking about it. Keeping a sense of how close targets or deadlines are is difficult. Even how long it might take to get to a location is not straightforward. You can be physically close to your destination, yet the estimated time of arrival may be surprisingly large if there is a huge traffic queue. So near, and yet so far away.

‘Nearly’ is relative, of course. Topology, a branch of mathematics, defines such relationships in a very abstract way. Lisa Lippincott gave a keynote at the ACCU conference this year, called ‘The shape of a program’ [Lippincott18]. She drew on ideas from topology to talk about places and meaningful areas, particularly in a codebase. Now, topology provides a formal definition of points in a neighbourhood. Such points are close, in some sense, but topology doesn’t have a metric or distance measure, so ‘close’ is very abstract here. The concept of belonging to a neighbourhood depends on the definition of an open set, which in turn has a complementary closed set, which has nothing to do with being close or nearby. Furthermore, sets can be both open and closed [Wikipedia-a]. This sounds counter-intuitive. Some sets are neither open nor closed. How can this possibly be useful? There is more to life than usefulness. I find topology mesmerising, but it is useful too: it gives an abstract way to think about closeness and convergence. We could think more about convergence, but that will have to wait for another time. For now, think of tending towards somewhere. Where, exactly?
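For readers who want the formal version, here is a minimal sketch of the standard textbook definitions this paragraph leans on; it assumes nothing beyond an introductory topology course and is not taken from Lisa’s talk.

    % A topology tau on a set X declares which subsets count as 'open':
    \emptyset \in \tau, \quad X \in \tau, \qquad
    \bigcup_{i \in I} U_i \in \tau \text{ for any } \{U_i\}_{i \in I} \subseteq \tau, \qquad
    U \cap V \in \tau \text{ for } U, V \in \tau
    % N is a neighbourhood of x if an open set sits between them:
    \exists\, U \in \tau : x \in U \subseteq N
    % C is closed when its complement is open; in the discrete topology,
    % where every subset is open, every subset is also closed ('clopen'):
    C \text{ closed} \iff X \setminus C \in \tau

Note that no distance appears anywhere in these definitions, which is why ‘close’ stays so abstract.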

Defining ‘there’ is equally complicated. Some cultures or groups have an initiation ceremony to mark a transition to adulthood, or full membership status. Some professions require certified status. An event marks a boundary between before and after. A novice takes a vow and becomes an initiate. A trainee engineer passes exams and gains experience, allowing certification. Not everything has a clear definition or demarcation between in and out, open and closed, before and after. Even success and failure. If you are developing a product, how can you be sure what your customer needs? If you are learning a new subject, how do you know when you really understand it? What about learning a new programming language?

How do you prepare a talk, or a proposal for a talk at a conference? How do you write an editorial, for that matter? You need a topic or title. These might get chosen up front, or come into focus as you knock a few ideas around. Something similar happens when you try to learn a new programming paradigm or language. You need a topic, a toy example or something to build. Once you decide, you start somewhere, possibly in the middle. Trying to measure your progress needs an idea of what ‘done’ means. For learning tasks or anything with a deadline, done often means time’s up. You may not have all the features you were dreaming of, but you have learnt or written something. A time limit provides a clear definition of ‘there’. Other tasks, such as tidying your bedroom, redecorating or documenting a system, do not immediately give a way to define ‘done’. Again, a time limit may enforce this, or running out of money. Some tasks like this stop when you run out of stamina. If you go swimming in a pool, you may plan to swim for an hour, but discover you can only manage five lengths because you are out of practice. You may not achieve your target, but at least you turned up. Sometimes just getting started is good enough.

Other tasks, like building a Lego rocket or a robot, have a clearly defined endpoint: all the pieces are in the right place. In the case of the robot hiding behind my sofa, the failure was due to missing parts. Not something you want to discover when you’re right near the end of a twelve-hour build. That’s the reason my Dad used to lay out and count pieces before assembling self-assembly furniture, or making jigsaws or models. He did try to fix things like clocks from time to time. The universal rule there appears to be that a spare screw or two is bound to materialise and can be safely stored in a tin, without affecting the performance of the machine. Something similar seems to happen when you refactor code, but you don’t keep the spare lines in a tin. Just delete them and jog on.

Other tasks are longer running, and do need some kind of milestones to check the project is on target, or at least heading in the right direction. Some projects follow more traditional methods, forming detailed project plans in the form of Gantt charts [Wikipedia-b]: think rows of tasks and columns of dates, showing who will do what, when. Other times, Kanban boards [Wikipedia-c] are used, usually starting with to-do, doing and done columns. In both cases, as time marches forward, things change. Relabelling the weeks or months or even years on a Gantt chart might help to provide new estimates of what’s left to be done before the final project is complete. The Kanban board might sprout a few extra columns. A project I worked on ended up with 17 columns. It used a web interface, rather than post-it notes on a wall, and I had to scroll right for quite a long way to get to the final column. And don’t get me started on scroll bars that move as you try to scroll, having cached the length of the data and then updating as they discover more.

When tasks don’t form a neat linear progression, they often make something like a tree structure, which isn’t easy to shape into a list or table of rows and columns. I have never been a project manager so can’t claim expertise in this area, but I do usually try to track whether a project is moving forward and keep a to-do list somewhere. I do this for personal projects too, which does mean I have several lists scattered around the house. I need a list of lists to help me find the right list, which suggests a tree structure yet again. As more items get added to a list, it becomes hard to estimate how long it will take to finish a project. If you add things at a faster rate than you complete them, you might start to feel as though you are going backwards. This can be disheartening. Frequently, what started as a straightforward collection of a few things to achieve over the next couple of days might end up as a few months’ worth of research, experimentation and learning. The original estimate failed to take on board all the details required. We frequently underestimate the effort involved. George W Bush once claimed people had misunderestimated him [BBC09]. This suggests he thought you can make correct and incorrect underestimates. Logically ridiculous; however, many projects do leave a time slot towards the end for validation, deployment or testing, which are really code words for unknown unknowns [Wikipedia-d]. The misunderestimation comes from not allowing enough time to complete everything required, or possibly being too ambitious.
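As an aside, here is a toy sketch of why a tree of tasks resists being flattened into neat rows; it is written in C++ purely for illustration, and the Task type and remaining function are made up for this example rather than taken from any real tool. Each item can sprout subtasks of its own, so ‘how much is left?’ becomes a recursive question, and the answer can grow even while you are busy ticking things off.

    #include <string>
    #include <vector>

    // A hypothetical task with nested subtasks: a tree, not a flat list.
    struct Task
    {
        std::string name;
        bool done = false;
        std::vector<Task> subtasks;  // each entry may have subtasks of its own
    };

    // Count the outstanding items anywhere in the tree.
    int remaining(const Task& task)
    {
        int count = task.done ? 0 : 1;
        for (const auto& sub : task.subtasks)
            count += remaining(sub);
        return count;
    }

The number reported by remaining can go up overnight if a single ‘simple’ item acquires half a dozen children, which is exactly the going-backwards feeling described above.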

Project planning usually involves aiming towards a final goal, such as deploying some working software. Learning, whether via an academic institution or attempting to teach yourself, doesn’t always have a clear final goal. A timetable for a term, a series of lectures or a book with exercises might give the impression of a linear progression from zero to master. A final exam sets a deadline and time-boxes the learning into a term or period of time. Some employers may take people on as apprentices, perhaps offering a twelve-month training programme. How does the neophyte or learner assess their progress? I was discussing this with Chris Oldwood recently. He has a mentee who asked how to tell how much she had improved. Chris suggested she noted how many of his jokes she laughed at. If you’ve not had the chance to listen to Chris’s stand-up comedy routine, you are missing out, though traces of it are out there, somewhere [Oldwood15]. I suspect in a few cases a groan rather than a laugh would distinguish a yellow belt from a Dan master, but I’m no expert. This metric, which I shall refer to henceforth as ‘the Oldwood quota’, gives a straightforward way to measure something. It’s not immediately clear what. If Chris tells ten jokes and his mentee laughs at all of them, she gets ten out of ten. Does this mean she’s wiser? As children learn about jokes, they laugh, but you gain more insight when they try to re-tell the jokes. The punchlines sometimes get told first. The words come out wrong. Eventually the child starts to invent their own jokes, usually with varying success. I suspect Chris will know his mentee has surpassed him when she can make him laugh rather than vice versa.

The conversation with Chris was very interesting. I love his jokes and am sure he’s a great mentor. As we talked, it struck me that general knowledge quizzes are a thing. You can count how many answers people get correct. You can automate it, using a multiple choice format. Precise and straightforward, as long as the questions make sense and the answers are correct. Now, have you ever heard of a general wisdom quiz? No. It would be very difficult to award marks for such a thing, yet when you are trying to learn a new subject, this is part of what you are trying to measure. Multiple choice questions are easy to mark, but do not show how much people understand. College exams are designed to test pupils’ understanding, but this is very hard to do. An essay-based subject may tend to have shorter questions, which are relatively quick to think up, but the marking will take a long time. A more mathematical subject may have longer questions in several parts, which can take a long time to devise, trying to avoid questions from previous years whose answers can be memorised, though the marking might be slightly quicker. Assessing someone’s understanding isn’t easy. An exam result of 80% would usually be a pass, in fact possibly a distinction. A rocket launch that achieved 80% (I’m not sure how you would measure a percentage; perhaps four fifths of it went into space) is more likely to be a fail.

Trying to shoe-horn a metric onto a space leads to all kinds of anomalies. Sometimes introducing a measure to drive league tables or performance targets leads to problems. As we know, measuring car exhaust emissions caused a stir a while ago [Wikipedia-e]. This involved a claim of deliberately measuring the wrong thing to ‘comply’ with the Clean Air Act. Even without deliberate misreporting, metrics can still have unintended consequences. If a team is required to achieve 100% test coverage, then deleting all the code is a sure-fire way to achieve this. Target hit, intention missed. When the league tables for GCSEs (exams for 16-year-olds in the UK) were introduced years ago, schools reported the percentage of pupils who achieved a grade C or above at age 16. Many schools allowed their pupils to take exams a year early, letting them start on A levels or similar sooner. I remember reading about one school where the pupils took all their exams a year early, so despite being a popular school that was hard to get into, it would have ended up at the bottom of this table. Solution? Make the pupils take the exams a year later. Ridiculous, in my opinion. However, trying to get the metric changed would have been difficult. Some programming teams using Agile measure the effort of a task in story points. We know this is an attempt to avoid a one-to-one correspondence with time estimates. I wonder whether these form a metric, in a mathematical sense. I’ll leave that as an exercise for the reader.
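For anyone tempted by that exercise, these are the standard axioms a distance function d on a set X must satisfy to be a metric in the mathematical sense; decide for yourself whether anything built from story points gets past them.

    % d assigns a real-valued distance to each pair of points:
    d : X \times X \to \mathbb{R}
    d(x, y) \ge 0, \qquad d(x, y) = 0 \iff x = y   % non-negativity and identity
    d(x, y) = d(y, x)                              % symmetry
    d(x, z) \le d(x, y) + d(y, z)                  % triangle inequality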

Assessing progress is difficult, but important. Perhaps you have some metrics you want to share with us? Write in. Or perhaps you learn a smattering of new things by reading Overload. Don’t forget to let authors know if you enjoyed reading their articles.

References

[BBC09] ‘The ‘misunderestimated’ president’: http://news.bbc.co.uk/1/hi/7809160.stm

[Lippincott18] ‘The shape of a program’: https://www.youtube.com/watch?v=IP5akjPwqEA

[Oldwood15] ‘The Daily Stand-Up’: http://chrisoldwood.blogspot.com/2015/04/the-daily-stand-up.html

[Wikipedia-a] ‘Open set’: https://en.wikipedia.org/wiki/Open_set#Open_and_closed_are_not_mutually_exclusive

[Wikipedia-b] ‘Gantt chart’: https://en.wikipedia.org/wiki/Gantt_chart

[Wikipedia-c] ‘Kanban board’: https://en.wikipedia.org/wiki/Kanban_board

[Wikipedia-d] ‘There are known knowns’: https://en.wikipedia.org/wiki/There_are_known_knowns

[Wikipedia-e] ‘Volkswagen emissions scandal’: https://en.wikipedia.org/wiki/Volkswagen_emissions_scandal

Frances Buontempo has a BA in Maths + Philosophy, an MSc in Pure Maths and a PhD technically in Chemical Engineering, but mainly programming and learning about AI and data mining. She has been a programmer since the 90s, and learnt to program by reading the manual for her Dad’s BBC model B machine.





