During 2005-07 I was part of the team at the Met Office researching and developing techniques in health forecasting and anticipatory care. Many sectors take weather forecasts as part of their planning and risk management processes. We were asking whether the health services and patients would also benefit from forecasts of increased risk to the health of individuals and populations; how those forecasts could be developed and delivered; and how they might be integrated into short- and medium-term action.
Our focus was on conditions which epidemiological studies have shown to be weather-dependent, such as COPD and other respiratory illnesses, but the concept of health forecasting need not be limited to these factors.
The Met Office still works with Government and other partners to provide advice on seasonal health and wellbeing, and on the impacts of climate change on health. However, it proved difficult to build a foundation for delivery of a forecasting service on the shifting sands of NHS structure and personnel, and the Health Forecasting project closed a number of years ago.
At some point I conducted a thought experiment, looking at the different aspects of numerical weather prediction and weather forecasting at the Met Office, and asking whether these techniques might have an analogue in health forecasting. At the time, I thought there might be a paper in it. It never came to be written, but I kept the idea alive.
Some years later, big data, machine learning and artificial intelligence have significantly altered the modelling and forecasting landscape. Data have been used in many new ways to support the COVID-19 response. But although the thought experiment might be dated in some respects, there is still value in the overview and insights. Also, I always enjoy making connections between things, and I decided it is never too late to write up an idea. Not least because this is not about health forecasting per se, but about what can be learnt from weather forecasting. Hence there are possible applications to forecasting in other domains.
I have broken my ideas down into several sections, starting from the modelling, through production of the forecasts, to how they are used. There is a rough but not necessarily direct correspondence between the bullet points in each section. I have not elaborated – it would take a lot of work to write this up properly so the points are little more than headlines – but I have indicated what I think are the key points in bold type.
There should be nothing in what follows that is not in the public domain. It has basically come out of my head, so I have not included references. It is intended to be helpful, so if any of it is problematic in any way, I apologise, and please let me know. Any formal paper would have to be fact-checked and referenced properly, of course.
Models are typically a combination of what we know and can incorporate directly as mathematical equations, such as thermodynamics; what we know but have to come at from a slant, because it can’t easily be represented by equations or the detail is computationally expensive; and what we don’t know but can guess at.
There are further complications. Although the weather system is continuous in space and time (at the Newtonian scale!), computers can’t solve continuous differential equations directly, and therefore models use a spatial grid and time steps. Then the atmosphere behaves very differently at small-scale resolutions and the large-scale average; think of the difference between the behaviour of an individual and a crowd.
At least the weather doesn’t change its behaviour based on the forecast information available – not in the short term, at any rate; compare with economic forecasts that trigger actions that change the system. I hope very much that most of the climate forecasts turn out to be very wrong because we have acted on them.
- Physics* vs statistics
- Equations of physics; parametrisation of physical processes; statistics to explain any remaining error
- Higher horizontal resolution (smaller grid size) → capture smaller-scale processes and local detail
- Embed a small area at high resolution (UK Continental Shelf) within a larger area at low resolution (N Atlantic and Europe or the whole world)
- Higher vertical resolution (more layers in the upper atmosphere, lower atmosphere, ocean) → capture smaller-scale processes
- Higher temporal resolution (smaller timesteps) → not necessarily better or more useful forecasts
- There are trade-offs between physics, resolution and model horizon. Eg climate models are lower resolution; some are atmosphere-only with parametrisation of the atmosphere-ocean boundary and land processes, some are ocean-only plus parametrisation, and some are fully coupled
- Model can learn to correct biases, eg using a Kalman filter, which requires observations
* Physics is a shorthand that also includes chemistry and biology
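The bias-correction point above can be made concrete. Below is a minimal scalar sketch of a Kalman-filter-style correction – a running estimate of the model’s bias, updated whenever an observation arrives. This is an illustration only, not any operational centre’s actual scheme, and the gain value is an arbitrary assumption:

```python
def correct_forecasts(forecasts, observations, gain=0.2):
    """Scalar sketch of Kalman-filter-style bias correction.

    Keeps a running estimate of the model's bias; each forecast is
    corrected by subtracting the current estimate, which is then
    nudged towards the latest observed error.
    """
    bias = 0.0
    corrected = []
    for forecast, observed in zip(forecasts, observations):
        corrected.append(forecast - bias)   # apply the current estimate
        error = forecast - observed         # observed model error
        bias += gain * (error - bias)       # update the estimate
    return corrected

# A model that consistently runs 2 degrees warm is gradually corrected:
raw = [12.0, 13.0, 11.0, 14.0, 12.5]
truth = [10.0, 11.0, 9.0, 12.0, 10.5]
corrected = correct_forecasts(raw, truth)
```

In a full Kalman filter the gain would itself be derived from the estimated error variances rather than fixed, but the structure – observe, compare, update – is the same.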
- Physiology vs statistics
- What is deductive, based on physical realities, and what is inductive, based on conclusions inferred from data?
- Parametrisation of physiological processes; statistical relationships derived from epidemiology and behavioural science
- Risk factors for patients and populations
- Resolution depends on availability of data used to create model
- Higher spatial resolution with respect to population → capture more local effects, but greater noise in data
- Patient resolution, eg by condition or age group → may capture effects hidden by aggregation, but more noise in data
- How to capture changes in patient behaviour, value of patients’ diaries and case studies
- Temporal resolution depends on data collection
- Take the long-term or population average (analogous to the climatology), or what happened yesterday or this time last year as a first guess
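The last bullet is worth writing down explicitly, because these first guesses are the baselines any health forecasting model has to beat. A minimal sketch with made-up daily admission counts (the numbers and the lag are illustrative assumptions only):

```python
# Made-up daily admissions series (illustrative numbers only).
admissions = [40, 42, 39, 45, 41, 44, 43, 40, 46, 42]

# Climatology: forecast tomorrow as the long-term average.
climatology = sum(admissions) / len(admissions)

# Persistence: forecast tomorrow as today's value.
persistence = admissions[-1]

# Seasonal analogue: forecast as the value one "year" ago
# (here a lag of 7 stands in for "this time last year").
analogue = admissions[-7]
```

Trivial as these look, a model that cannot beat them adds no value.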
Depending on what physical processes are included in the model, and whether the model is running an ensemble or scenario, some assumptions must be made regarding its drivers and constraints. These assumptions are not necessarily fixed in time but may evolve.
A parameter-based ensemble can be run to analyse the sensitivity of the model to parametrisation of physical processes, introducing small differences in the parameters of interest and evaluating the range of end-points.
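As a sketch of the idea, using a toy relaxation process in place of real physics (the model, the parameter value and the perturbation size are all assumptions for illustration):

```python
import random

def toy_model(rate, steps=50, state=1.0):
    """Toy stand-in for a parametrised physical process:
    simple relaxation at an uncertain 'rate' (illustrative only)."""
    for _ in range(steps):
        state -= rate * state
    return state

random.seed(1)
base_rate = 0.05

# Perturb the parameter of interest by a few percent per member...
members = [toy_model(base_rate * (1 + random.uniform(-0.05, 0.05)))
           for _ in range(20)]

# ...and evaluate the range of end-points as a sensitivity measure.
spread = max(members) - min(members)
```

The spread of end-points indicates how sensitive the model is to that parameter; a real sensitivity study would perturb many parameters across many model components.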
Climate models used to take carbon emission pathways as drivers for different scenarios. The more recent models use radiative forcing – the difference between the sun’s energy absorbed by the earth and the energy radiated back to space – at levels that roughly correspond to emissions pathways.
- Solar cycle
- Boundary conditions:
- horizontal – eg global model provides boundary for local-area model
- upper – top of the atmosphere or top of the ocean
- lower – surface exchange scheme, ocean boundary or sea-bed
- Topography, land use, vegetation
- Human activity, volcanic activity
- Weekly and seasonal cycles (including Christmas and other bank holidays)
- Population under consideration – general population, or a subset by eg location, age, condition
- Risk factors in the population
- Weather, prevailing viruses and other drivers – some drivers may be available as real-time datafeeds, others may not
- Resource/bed availability – can ward function be switched at short notice, eg to respiratory (thunderstorm asthma) or trauma (freezing rain)?
Better knowledge of the starting point means a better forecast. In the case of weather forecasting, a tiny difference in the starting point may make a big difference to the future – the fabled butterfly flapping its wings.
Starting-point ensembles involve running the model many times with small differences in the starting conditions. The range of end-points indicates possible futures for the weather, and the probability of and confidence in each.
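Both points can be demonstrated with the logistic map – a standard toy chaotic system, standing in here for a weather model purely for illustration:

```python
def logistic(x, r=4.0, steps=30):
    """Iterate the chaotic logistic map x -> r*x*(1-x)."""
    for _ in range(steps):
        x = r * x * (1 - x)
    return x

# Two starting points differing by one part in a billion...
a = logistic(0.400000000)
b = logistic(0.400000001)
# ...diverge: tiny starting differences grow exponentially.
divergence = abs(a - b)

# A starting-point ensemble maps out the range of possible end-points.
ensemble = [logistic(0.4 + i * 1e-9) for i in range(20)]
spread = max(ensemble) - min(ensemble)
```

After a few dozen iterations the two near-identical trajectories are effectively unrelated, which is exactly why weather forecasts are run as ensembles rather than single deterministic predictions.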
- Sources of weather observations include satellites, aircraft and weather balloons, shipping and buoys, radar, automated land stations
- Assimilation – running recent observations through a model to build a coherent picture of the atmosphere in 3D at a single point in time, or in 4D over eg 6 hours
- Is it possible to build a picture of the current health service or situation in real-time?
- Data are collected for eg: ambulance and 111 calls; GP and out-of-hours visits; A&E attendances and hospital admissions; in-patient and out-patient elective attendances; discharges; hospital situation reports; flu surveillance, track & trace data.
- What data and information should private health providers be required to supply in the public interest?
- What of this is available in real-time, subject to information security and data protection regulations? How much extra information is there about the patient, symptoms, diagnosis?
- What proxies are there, eg internet searches for coronavirus symptoms?
- Is it possible to find a simple exemplar for how improved knowledge of the current system can improve a health forecasting model?
- What is needed by the end user or system? What is available?
- Frequency of forecast production
- Forecast horizon, timestep
- Location – model grid point, specific site, route
- Average over an area or over time
- Weather and other model variables, derived variables such as windchill
- Avoid spurious precision in number of significant figures or decimal places
- Accuracy of forecasts falls with time – see Verification
- Deterministic forecast – the weather will be this
- Probabilistic forecast – there is an x% chance that the weather will be this; probably more guidance needed re usage
- Confidence in the forecast – not the same as probability
- What would be useful to the patient or health provider? What is their appetite for risk? How would they act on the information? How would they benefit?
- Difference between a targeted health forecast and a forecast of the drivers coupled with knowledge of the likely health impacts
- Providing a health forecast, eg risk of COPD exacerbation, or a health-outcome forecast, eg presentation at A&E with COPD exacerbation?
- Forecast horizon, timestep, frequency, accuracy, precision
- Requirements may change with season, eg heatwaves or thunderstorm asthma in summer, cold or freezing rain in winter
- Outcomes from an area or group, or to a provider
- Population of interest
- Catchment areas of providers
- Errors introduced through: inaccurate observations; physics parametrisation; the need to use a grid; non-linearity and chaos; open system or closed system isolated from its environment
- Need observations to test
- Hindcasting – testing the model by running it forward from a starting point in the past over a period when observations are available
- Skill scores – root-mean-square error, bias, contingency (event forecast vs event observed?)
- High-level measures of forecasting skill, eg the Numerical Weather Prediction (NWP) Index – should be determined by an external agency, not cherry-picked by the forecast provider, and be reproducible over years
- Compare with the long-term average and persistence – the weather will be the same tomorrow as today
- Deterministic vs probabilistic verification
- Does intervention by a human forecaster improve model performance? (see below)
- Need verification to get acceptance
- Verification of model outputs or health outcomes
- Errors in drivers and errors in the health forecasting model compound
- Errors introduced through: faulty understanding of cause and effect; averaging population characteristics and behaviour; small numbers → noise; big numbers → smoothing; capturing all interactions in an open system
- Need observations of reasonable quality in near real-time to avoid divergences between the model and reality – also useful for model development
- In absence of real-time observations, compare with long-term averages or what happened this time last year rather than persistence
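The common skill scores above can be sketched in a few lines – RMSE, bias, a contingency-table hit rate, and the comparison with persistence (the numbers are synthetic, not real verification data):

```python
import math

def rmse(forecasts, observations):
    """Root-mean-square error of forecasts against observations."""
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecasts, observations))
                     / len(forecasts))

def bias(forecasts, observations):
    """Mean signed error: positive means the model runs high."""
    return sum(f - o for f, o in zip(forecasts, observations)) / len(forecasts)

def hit_rate(forecasts, observations, threshold):
    """Contingency: of the occasions the event was observed,
    how often was it also forecast?"""
    hits = sum(1 for f, o in zip(forecasts, observations)
               if o >= threshold and f >= threshold)
    observed = sum(1 for o in observations if o >= threshold)
    return hits / observed if observed else float("nan")

obs = [10.0, 12.0, 15.0, 11.0, 14.0]
model = [11.0, 12.5, 14.0, 10.0, 15.0]
persistence = [9.0] + obs[:-1]   # "the same as yesterday"

# A useful model should beat persistence on RMSE:
model_rmse = rmse(model, obs)
persistence_rmse = rmse(persistence, obs)
```

A serious verification framework would add probabilistic scores and be computed reproducibly by an external party, as noted above, but the mechanics are no more than this.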
- Collection, quality control and assimilation of observations
- Scheduling model runs to maximise efficiency of supercomputer and other resources
- Raw forecast model outputs
- Post-processing of forecasts – eg Kalman filters to correct bias
- Post-processing to create products – eg calculate indices, site-specific or route-based forecasts, decision-making tools
- Human forecaster intervention to quality-control forecasts
- Known weaknesses in model for certain weather conditions
- Dealing with low confidence
- Decision-making tools on a knife-edge, eg grit/don’t grit
- Local knowledge of biases or peculiarities
- Customer insights
- International collaboration – pooling observations and forecasts with other forecast centres → improved starting conditions, multi-model ensembles
- Provide raw model output for others to use in creating products – shared innovation
- First-guess health forecasting model outputs based on best-available drivers
- Supplement with other, more ad-hoc information on possible drivers or surveillance data
- Role of international organisations in pandemics, pooling data and knowledge
- Is there currently any advantage to pooling real-time data and forecasts with other UK and international organisations looking into health forecasting?
Presentation and dissemination
- Model outputs as formatted data: gridded datasets; subsets as CSV
- Post-processing and auto-generation of symbols, time-series graphs, text, maps
- Interpretation and value-added by human forecasters, eg as script for Shipping Forecast, presentation for TV broadcast
- Interpolation in time and space
- Regular forecasts vs ad-hoc warnings of eg severe weather, atmospheric pollution events
- How are thresholds determined? How best to communicate risk, probabilities and impacts?
- What guidance and caveats are needed to ensure the information is used correctly?
- Dissemination by ftp, email, web, TV, radio, teletext, fax, Telex, SMS, WAP, etc
- Free to the public at the point of access
- Providing information to government, commerce, service providers for use in decision making – eg direct to the bridge or cockpit
- Model outputs as absolute number, % change, standard deviations, difference from average/expected, category, category of change, thresholds and warnings, indices
- Human interpretation and value-added
- Regular forecasts vs ad-hoc warnings of pollen, thunderstorm-asthma, freezing rain, elevated risk of exacerbation of underlying condition
- How are thresholds determined? How best to communicate risk, probabilities and impacts?
- What guidance and ‘health warnings’ are needed to ensure the forecast information is used correctly?
- How to reach general public at risk – eg thunderstorm-asthma warnings to hayfever sufferers
- How to reach patients at risk – eg with long-term conditions – via GP registers, carers, outbound calling, email, SMS
- Providing information to providers – via web, NHS net, email, fax, and eg direct to the ambulance cab or the ward
Use of forecasts
- For interest only, as a national obsession!
- How does the public access weather forecasts? Eg older people struggle to take in audio and visual at the same time
- What triggers the public to act on forecasts and warnings?
- What other qualitative and quantitative information do the public consciously or subconsciously combine with a forecast or warning in taking a decision or action? Includes past performance!
- What different information is needed by public and private sector organisations, and what decisions and actions are they taking that are affected by the weather?
- Integration with other information in a decision-making tool, or as a data input to another model, eg flood forecasting, electricity/gas demand
- Take preventative action to reduce risk, eg grit the roads, close bridge, warn public
- Prepare for action to mitigate impact, eg deploy sandbags
- Decision might be go/no go; change route; change timing of eg harvesting, holiday
- Don’t recommend a particular decision or action – ensure liability is with the forecast user
- Compare with the language of climate change
- For interest and awareness of general health
- What are the public’s, patients’ and service providers’ attitudes to risk?
- Are public, patients, service providers willing to base decisions or actions on information with <100% confidence?
- Don’t harm the patient, don’t withdraw care, improve patient care, don’t increase the burden on the provider
- So what information is required?
- Fit with strategy, objectives, plans
- Ensure, or even enhance, the integration of primary and secondary care
- Integration with other information in a decision-making tool, or as a data input to another model, eg hospital bed state, COVID projections
- Make intervention or take preventative action to reduce risk, eg trigger GP call to patient on risk register, vaccination programme
- Make preparations to mitigate impact, eg plan for set-up of emergency treatment unit
- Planning – electives, rotas, holiday banks
- Targeting resources – ambulance location, containing epidemic or pandemic spread
- Protocols for responding to warnings of infrequent but acute events
- Combine health forecasts with advice, education, patient knowledge
- Demonstrate cost-benefit, cost-utility, cost-effectiveness
Hindcasting involves testing the model by running it forward from a starting point in the past over a period when observations are available, enabling verification.
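Structurally, a hindcast is just a loop over an archive: at each past starting point the model sees only the data up to that point, and its forecast is paired with the observation that actually followed. A minimal sketch, using persistence as a stand-in model (the series and horizon are illustrative assumptions):

```python
def hindcast(model, history, horizon=1):
    """Run the model forward from each past starting point and
    pair its forecast with the observation that actually followed."""
    pairs = []
    for start in range(len(history) - horizon):
        forecast = model(history[: start + 1])   # model sees only the past
        observed = history[start + horizon]      # what actually happened
        pairs.append((forecast, observed))
    return pairs

def persistence_model(past):
    """Stand-in model: forecast tomorrow as today's value."""
    return past[-1]

series = [3.0, 4.0, 4.5, 4.0, 5.0]
pairs = hindcast(persistence_model, series)
mean_abs_error = sum(abs(f - o) for f, o in pairs) / len(pairs)
```

The same loop works for any candidate model, which is what makes hindcasting such a cheap and fair way to compare them before deploying anything.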
I wrote the following in the past, and there have been many observable changes since…
Four ways of improving weather forecasts:
- Improving observations → better start point
- Improving the model physics
- Increasing the resolution
- Running ensembles → probabilities and confidence in the forecasts
Observations can also be improved by duplicating sources or means of measuring variables, both to improve quality control and to build in redundancy against lost observations. Many observations are lost when aircraft are grounded, as during the Eyjafjallajökull eruption of 2010 and now during the coronavirus pandemic.
Improved observations mean investment in observing instruments and in faster computers for processing and assimilation. Better physics, higher resolutions and running ensembles all mean faster computers. Both weather satellites and supercomputers are expensive, so there is always a trade-off.
Sharing across weather services and other organisations is key to improving observations and the understanding of the physics, and to building multi-model forecast ensembles.
Four ways of improving health forecasts:
- Improving the understanding of decision-making and attitude towards risk across the system → what forecast outputs are required
- More, better and more timely observations of health outcomes and demand for services → better models, starting conditions and verification
- More, better and more timely observations and forecasts of health drivers → more, and more accurate, drivers are available as inputs to model development and in real-time to forecasting systems
- Improving the understanding of physiology, eg how the environment and risk factors in the population and individuals affect health
Four ways of improving the usefulness of health forecasts:
- Shifting providers’ thinking from reactive to proactive
- Introducing incentives and accountability for shifting healthcare interventions down the chain from secondary to primary to self-care, and setting appropriate targets for eg reducing admissions and bed days.
- Better information collection and management across healthcare, so that the forecasts can be combined with other sources of information to give a better knowledge of the current state and likely future real-time evolution of the system
- And in the long-term, re-integrating public health in healthcare provision and improving public, patient, and provider education
So do these suggestions verify at all against the observed changes?
Given the lack of preparedness for COVID, despite the recommendations that came out of Exercise Cygnus, I suspect not. So there is one underlying way of improving health forecasts and the usefulness of health forecasts: stop using the NHS as a political football and fund healthcare provision properly.
There is a lot of forecasting out there, and a lot more potential for useful forecasts. I’ve also worked in economic forecasting in the past, and I can stick my finger in the air and guess at: climate change (very similar to weather forecasting); stocks and derivatives; macroeconomics, microeconomics and behavioural economics; business planning; scenario analysis; social, scientific and technological developments; engineering and materials performance; oil and mineral prospecting, energy production and demand; gambling.
Some existing forecasting is sophisticated, usually where there is money to be made, eg derivatives. Much is surprisingly crude, eg general equilibrium economic models. But much like different sports being open to learn from each other to enhance performance, forecasting disciplines should be able to learn from each other.
I’d like to think that forecasting could be used to improve the common good, rather than to make money or grab power. But I’ll leave you with this Arabian Proverb, to ponder and meditate on our human contingency:
They who foretell the future lie, even if they tell the truth.