Be Afraid. Be Very Afraid.

May 18, 2013

I know how easy it is to make mistakes when writing software and I know how difficult it is to ensure that you find these mistakes prior to releasing that software. I have, consequently, to work quite hard to set these concerns aside when using any device dependent upon software for its operation. This could make life quite difficult; fortunately, I am becoming gradually inured and can now use everyday household devices without excessive concern. I still look slightly nervously at medical devices and worry about air-traffic control and the like. It is a professional affliction, shared, I know, by many computer scientists, eased by a strong dose of 'get real' taken when required. My level of concern has, however, been raised unexpectedly by a growing awareness of my dependence on a class of software system I had hitherto wholly ignored: policy models. Let me explain.

Many public policy decisions are too complex to be determined by straightforward 'qualitative' reasoning. They involve multiple interlocking constraints that relate resources, effects, finance and anticipated behaviours by a range of independent actors. Often these give rise to feedbacks yielding dynamic behaviour that is far from obvious. Further, there are timing factors that mean that certain actions are only possible at certain times or only if specified environmental conditions hold. This creates further networks of temporal dependencies. It is clear in these circumstances that modelling is required to aid understanding and analysis. This, however, is where the problem starts.

It is unclear whether those who understand the policy dimensions of the problem are equipped to do the necessary modelling, or whether they really understand what such models can, and cannot, yield. Even if they are equipped, their first instinct is to construct an enormous and elaborate spreadsheet. Now, don't get me wrong, spreadsheets can be an extremely powerful tool and have the benefit of being relatively simple to use for straightforward planning tasks. They are, however, also very difficult to debug, to test, to document systematically and to understand except in a piecemeal fashion. The sophisticated user, or worse the clever but unsophisticated user, can create untold havoc. The full functionality of Excel, with workbooks, scripting and pivot tables, can approximate to that of a powerful software development environment. The shallow and deep ends are not far apart ... and there are sharks.

I am not sure that there has ever been a full accounting for the societal consequences of buggy spreadsheets. If there were, I feel certain that it would encompass much misery, economic loss and perhaps worse. The fact remains, however, that serious and important policy alternatives, the case for major infrastructure investments for instance, depend on spreadsheet models.

When policy analysis becomes really thorny it is necessary to construct a fully fledged computational model. Climate change mitigation and energy policy are cases in point, but health resourcing and transport planning are also candidate examples. The models are usually built by experienced modellers who have access to sophisticated tools and have a good appreciation of issues such as sensitivity and model validation. Unfortunately, modelling experience is not necessarily associated with an engineering understanding of model construction. Some of the basics - specification, documentation, clear interfaces, modularisation schemes, built-in error checking, systematic testing - are commonly neglected. Errors can be introduced in the data, in the modelling assumptions, in the model, in the model coding and in the language and data storage components on which the whole thing rests. They can also arise in the interrogation of the model and the ways in which results are presented or visualised. Chasing these errors through the multiple layers represents a major challenge that lies at, or somewhat beyond, the state of the art in software engineering.
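
To make a couple of those basics concrete, here is a minimal sketch in Python of what built-in error checking and systematic testing might look like for a deliberately trivial model. Everything in it is an assumption made for illustration: the names `DemandInputs`, `project_demand` and `run_checks`, the compound-growth formula and the plausibility limits are hypothetical and are not drawn from any real policy model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemandInputs:
    """Hypothetical inputs for a toy demand projection."""
    base_demand: float    # demand in the starting year (any consistent unit)
    annual_growth: float  # fractional growth per year, e.g. 0.02 for 2%
    years: int            # number of years to project

    def __post_init__(self):
        # Built-in error checking: reject implausible inputs at the boundary,
        # before they can quietly distort the results.
        if self.base_demand < 0:
            raise ValueError("base_demand must be non-negative")
        if not -1.0 < self.annual_growth < 1.0:
            # Catches the classic slip of entering 2 when 0.02 (2%) was meant.
            raise ValueError("annual_growth must be a fraction between -1 and 1")
        if self.years < 0:
            raise ValueError("years must be non-negative")


def project_demand(inputs: DemandInputs) -> list[float]:
    """Return projected demand for year 0 to year `years`, compounding annually."""
    return [
        inputs.base_demand * (1 + inputs.annual_growth) ** year
        for year in range(inputs.years + 1)
    ]


def run_checks() -> None:
    """A few systematic tests of properties the toy model ought to satisfy."""
    # Zero growth must leave demand unchanged in every year.
    flat = project_demand(DemandInputs(100.0, 0.0, 3))
    assert flat == [100.0, 100.0, 100.0, 100.0]

    # Ten per cent growth compounds to 121 after two years (within rounding).
    grown = project_demand(DemandInputs(100.0, 0.10, 2))
    assert abs(grown[2] - 121.0) < 1e-9

    # A growth rate entered as a percentage rather than a fraction is refused.
    try:
        DemandInputs(100.0, 2, 5)
    except ValueError:
        pass
    else:
        raise AssertionError("implausible growth rate was accepted")


if __name__ == "__main__":
    run_checks()
```

None of this is sophisticated, which is rather the point: the checks sit next to the model, they run every time the model changes, and an implausible input stops the run rather than propagating into a chart.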

It is sensible to assume that many large, societally critical decisions are being made with the support of computational models that are based on sound data, incorporate well-founded assumptions and leverage the best analysis available, but which yield erroneous results because, for example, the data that is actually used is some earlier, poor-quality version or because, again for example, the assumptions are wrongly coded. This is worrying. We require two advances: the development of a greater consciousness amongst the policy community of the engineering dimensions of modelling; and the emergence of a composite discipline of 'model engineering' that brings together the computational modellers and the software engineers. Until these steps have been taken, a degree of scepticism about public policy built on the outcomes of modelling is reasonable.

prof serious
