By Frances Buontempo & Steve Love
How often do people claim that testing their code is difficult, if not impossible? Or have hundreds of tests, most of which fail on a regular basis?
As data science becomes more prevalent, we often see few useful tests alongside the code. Often the "science" relies on data, and the tests end up pulling in entire datasets, running for ages and only stating "fail" when they get to the end.
You could hit the code with a hammer until the tests pass, ignore the tests or change the asserts, but does that give you confidence to deploy your solution?
Even when the code uses randomness, you can test it. We’ll investigate some typical problems in machine learning, data science and programming in general, suggesting varied approaches to testing in order to increase confidence that your code Does The Right Thing™.
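As a taste of what testing random code can look like, here is a minimal sketch in Python. It is only an illustration, not necessarily the approach taken in the session: `noisy_sample` is a hypothetical function under test, and the seeds and tolerance are arbitrary choices. The idea is to pass the random number generator in, so a test can fix the seed for repeatability and also check a statistical property with a generous tolerance.

```python
import random
import statistics


def noisy_sample(n, rng):
    """Hypothetical function under test: draw n values from a standard normal.

    The generator is passed in rather than taken from global state,
    so a test can control the seed.
    """
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def test_same_seed_gives_same_sample():
    # Two generators built from the same seed produce identical sequences,
    # so the result is repeatable from one run to the next.
    first = noisy_sample(5, random.Random(42))
    second = noisy_sample(5, random.Random(42))
    assert first == second


def test_sample_mean_is_close_to_zero():
    # Check a property of the output rather than exact values.
    # With 10,000 draws the standard error of the mean is about 0.01,
    # so a tolerance of 0.1 leaves a very wide margin.
    sample = noisy_sample(10_000, random.Random(0))
    assert abs(statistics.fmean(sample)) < 0.1
```

Both functions follow the naming convention that pytest collects, though they can also be called directly. A fixed seed makes a failure reproducible, while the property check still says something about what the code is supposed to do.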