More defenses are being built to protect communities worldwide from flooding. But defenses are often under-represented in flood modeling, which can lead to distorted views of risk. This research, led by University of Bristol PhD student Gang Zhao and co-authored by Fathom’s Paul Bates and Jeff Neal, uses a machine-learning approach to address the problem.
Flooding is the most frequently occurring – and most costly – natural hazard. The number of major floods in 2000–2019 was double that of the previous 20 years, causing significant loss of life and severe economic impacts to populations and property worldwide.
To combat the impact of floods, countries around the world have been building more – and better – flood defenses such as dams, reservoirs and levees. In the US, it is estimated that as of 2018 approximately 11 million people and $1.3 trillion of property value were located in flood-defended areas.
Flood defenses have a major effect on flood distribution and exposure, which needs to be taken into account when assessing flood hazard. While it is now possible to map flooding on a large scale at high resolution, detailed information on the location of flood defenses and, in particular, flood defense standards for most rivers in the world is still scarce (Samson et al, 2015).
This makes modeling flood defenses a challenge. Without the necessary information, hazard models typically make educated guesses about defense standards or assume there are no flood defenses at all. This leads to misestimation of flood hazard and a distorted view of risk.
Scientists have been looking at new methods, based on machine learning, to improve estimation of flood defense standards. A previous research paper led by Fathomers, for example, described a new automated method of extracting levees from high-resolution terrain data, which is used in products such as Fathom’s US Flood Map.
In this new study, researchers tested a random forest regression machine-learning model to estimate flood defense standards in the conterminous United States and England.
Flood defenses need to be considered for a realistic picture of risk. Fathom’s US Flood Map includes defended and undefended views of flood risk in the United States.
Regression models for estimating flood defenses
Random forest regression is a widely used machine learning model, which the researchers used to estimate the defense standard for unlabeled sites (i.e. levee sites with no declared defense standard).
The team evaluated the relationship between the defense standard and 10 explanatory factors contained in publicly available datasets. These included river flood hazard, physical factors such as elevation and slope, and socio-economic conditions such as the GDP and population density of the protected areas.
They compared the random forest regression method with a second type of model, multiple linear regression. Finally, they incorporated the estimated defense standards into Fathom’s global flood hazard model to simulate flooding in three case studies, and validated the results against official flood maps.
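To make the comparison concrete, here is a minimal sketch of fitting a random forest regressor and a multiple linear regression to the same data and then estimating values for unlabeled sites. It is not the study's code or data: the ten factor names, the synthetic "defense standard" and all parameter choices are assumptions made purely for illustration, and scikit-learn is used as a stand-in for whatever tooling the researchers used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins for the 10 explanatory factors; the study's real
# variables and data are not reproduced here.
FEATURES = [f"factor_{i}" for i in range(10)]

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(800, len(FEATURES)))

# Synthetic "defense standard" (return period in years), deliberately
# nonlinear: a site only gets a high standard when two factors are both high.
high = (X[:, 0] > 0.5) & (X[:, 1] > 0.5)
y = 10.0 + 90.0 * high + rng.normal(0.0, 2.0, size=800)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit both model types on the labeled training sites.
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
mlr = LinearRegression().fit(X_train, y_train)

# Score both on held-out labeled sites. r2_score computes
# 1 - SSE / SST, the same formula as the Nash-Sutcliffe efficiency.
rf_score = r2_score(y_test, rf.predict(X_test))
mlr_score = r2_score(y_test, mlr.predict(X_test))
print(f"random forest: {rf_score:.2f}, multiple linear regression: {mlr_score:.2f}")

# "Unlabeled" sites: explanatory factors known, defense standard unknown.
X_unlabeled = rng.uniform(0.0, 1.0, size=(5, len(FEATURES)))
estimated_standard = rf.predict(X_unlabeled)
print(estimated_standard)
```

On this synthetic data the target depends on an interaction between two factors, which a linear model cannot represent, so the random forest scores higher. That illustrates why a tree-based model can help when relationships are nonlinear, but it says nothing about the size of the gap in the study itself.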
The results: Accuracy of machine-learning
There were several key findings:
- The random forest regression model performed better than multiple linear regression
- The model achieved a ‘good’ performance, i.e. a Nash–Sutcliffe efficiency of 0.85 in the conterminous United States (CONUS) and 0.76 in England
- The approach was incorporated into large-scale flood hazard modeling
- There were obvious overestimations of flood hazard when defenses were not taken into account
- Flood defenses were successfully represented using the new approach
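For readers unfamiliar with the Nash–Sutcliffe efficiency quoted above, it is defined as one minus the ratio of the squared estimation errors to the squared deviations of the observations from their mean. A value of 1 means a perfect match, and 0 means the model does no better than always predicting the observed mean. The sketch below is a plain-Python illustration of that definition; the function name and the sample numbers are hypothetical and are not taken from the study.

```python
from statistics import fmean


def nash_sutcliffe_efficiency(observed, estimated):
    """Return 1 - sum((obs - est)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(estimated) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = fmean(observed)
    squared_error = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observed values are all identical; NSE is undefined")
    return 1.0 - squared_error / spread


# Made-up declared vs. estimated defense standards (return periods in years).
declared = [10.0, 50.0, 100.0, 200.0]
modeled = [12.0, 45.0, 110.0, 190.0]
print(nash_sutcliffe_efficiency(declared, modeled))
```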
Read the original research article, published in Water Resources Research.