Data analytics research and development

The following themes underpin many of the projects currently being undertaken within the hazard and resilience modelling team and cover many of the core concepts considered within our data product development workflows.

Themes

Data-driven analytics

Partially collapsed house in Ripon

We are using data-driven techniques to investigate further the large datasets that we have developed and curate. By examining these datasets more closely and understanding the relationships between them, we are investigating how we might develop and modify various outputs, including improving existing hazard susceptibility maps. Examples of work in this area include:

  • analysing the outputs of various clustering algorithms to identify inter-relationships between our datasets
  • comparing data-driven approaches with heuristic methodologies, such as those used for GeoSure
  • identifying models currently available to investigate space-time processes for applications such as urban hazard modelling.
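As an illustration of the clustering idea in the list above, the following is a minimal sketch of k-means written with the Python standard library only. It is not the team's method: the function name `kmeans`, the two features (slope and clay fraction) and all site values are hypothetical, chosen only to show how sites with similar attributes can be grouped.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means on a list of equal-length numeric tuples."""
    # Deterministic start: use the first k points as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Hypothetical (slope in degrees, clay fraction) pairs for six sites.
sites = [(2.0, 0.70), (25.0, 0.10), (3.0, 0.65),
         (27.0, 0.15), (2.5, 0.72), (26.0, 0.12)]
labels, centroids = kmeans(sites, k=2)
print(labels)
print(centroids)
```

In this sketch the features are left unscaled, so slope dominates the distance calculation. A real analysis would normally standardise the features first and would compare several algorithms and values of k.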

Uncertainty and confidence

Our overarching aim is to ensure user confidence in the application of our models and outputs through reliability and transparency.

Data are inherently noisy, and it is important to account for this noise, or uncertainty, when analysing data and incorporating them into product development. As well as accounting for uncertainty during analysis and development, it is vital to communicate it effectively to product end users. This allows end users to be confident in using our products while remaining aware of their potential limitations, so that the products are used both effectively and appropriately.

The incorporation and communication of uncertainty in both our product development and delivery is therefore an active area of research. This work includes:

  • the incorporation of uncertainty in third-party datasets that we use (e.g. elevation data)
  • the uncertainty associated with applying different algorithms during data analysis and product development
  • the effect of changes in spatial scale, in terms of both data resolution and feature-length scale.
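One common way to carry uncertainty in an input dataset through to a derived value is Monte Carlo simulation. The sketch below is an assumption-laden illustration rather than a description of our workflow: it perturbs two hypothetical elevation values with independent Gaussian errors and reports the resulting spread in slope. The function names, the elevations, the 50 m spacing and the 1 m error are all invented for the example.

```python
import math
import random
import statistics


def slope_degrees(z_upper, z_lower, spacing):
    """Slope between two elevation samples a fixed horizontal distance apart."""
    return math.degrees(math.atan2(z_upper - z_lower, spacing))


def simulate_slope(z_upper, z_lower, spacing, sigma, n=5000, seed=0):
    """Monte Carlo estimate of the slope mean and standard deviation.

    Assumes (hypothetically) that each elevation carries an independent,
    zero-mean Gaussian error with standard deviation `sigma` metres.
    """
    rng = random.Random(seed)  # seeded so that the run is repeatable
    samples = [
        slope_degrees(rng.gauss(z_upper, sigma), rng.gauss(z_lower, sigma), spacing)
        for _ in range(n)
    ]
    return statistics.mean(samples), statistics.stdev(samples)


# Hypothetical values: two cells 50 m apart, elevation error of 1 m.
mean_slope, sd_slope = simulate_slope(z_upper=105.0, z_lower=100.0,
                                      spacing=50.0, sigma=1.0)
print(f"slope = {mean_slope:.2f} +/- {sd_slope:.2f} degrees")
```

Elevation errors in real terrain models are usually spatially correlated, so treating them as independent, as this sketch does, would tend to overstate the uncertainty in slope.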

Temporal data modelling

An undulating road surface due to ground movement in Lincolnshire.

We are beginning to investigate how we can incorporate temporal modelling into data products effectively. Current work is investigating the seasonally changing patterns of rainfall and temperature across multiple years and associating these with geohazard occurrence and potential. For example, the new GeoClimate shrink–swell dataset (currently in beta testing) analyses multiple years of weather data to ascertain trigger thresholds for this particular hazard in Great Britain. Similar analytics are also being employed to feed into landslide forecast modelling and coastal vulnerability. Ultimately, we aim to provide temporal data features within our hazard data products.
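To make the idea of a trigger threshold concrete, the sketch below flags days on which rainfall accumulated over a rolling window reaches a chosen value. It is a generic illustration only, not the GeoClimate method: the function names, the three-day window, the 50 mm threshold and the rainfall series are all hypothetical.

```python
def rolling_totals(values, window):
    """Sum of each run of `window` consecutive values, oldest run first."""
    return [
        sum(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


def exceedance_indices(daily_rain_mm, window, threshold_mm):
    """Indices of days whose trailing `window`-day total meets the threshold."""
    totals = rolling_totals(daily_rain_mm, window)
    # totals[0] corresponds to day index window - 1, hence the offset.
    return [i + window - 1 for i, total in enumerate(totals) if total >= threshold_mm]


# Hypothetical daily rainfall in millimetres for ten days.
rain = [0, 5, 20, 30, 2, 0, 0, 15, 40, 10]
wet_days = exceedance_indices(rain, window=3, threshold_mm=50)
print(wet_days)  # [3, 4, 8, 9]
```

In practice, thresholds of this kind would be derived by comparing multi-year weather records with recorded hazard occurrences, and they may involve temperature or soil moisture as well as rainfall.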

Social media data

Projects such as GeoSocial, which use data extracted from social media platforms like Twitter, provide an opportunity for us to assess real-time trends and gain situational awareness of geo-related events as they unfold.

GeoSocial is a tool that allows users to display social media posts related to geoscience. It currently focuses on themes such as landslides, aurorae, flooding, volcanic eruptions and earthquakes. We are looking to expand this research to other geohazards and to include different social media sites. We are constantly seeking to improve and update the way the application analyses search terms and gathers data.
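A very simple way to relate posts to themes is keyword matching. The sketch below is a hypothetical illustration and does not describe how GeoSocial itself works: the theme names follow the list above, but the search terms, the `THEME_KEYWORDS` table and the `tag_post` function are invented for the example.

```python
import re

# Hypothetical search terms for each theme; a real list would be far larger.
THEME_KEYWORDS = {
    "aurora": ["aurora", "aurorae", "northern lights"],
    "earthquake": ["earthquake", "tremor"],
    "flooding": ["flood", "flooding", "flooded"],
    "landslide": ["landslide", "landslip", "mudslide"],
    "volcano": ["volcano", "eruption"],
}


def tag_post(text):
    """Return the sorted list of themes whose terms appear as whole words."""
    lowered = text.lower()
    return sorted(
        theme
        for theme, terms in THEME_KEYWORDS.items()
        if any(re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in terms)
    )


print(tag_post("Huge landslide after flooding near the coast"))
print(tag_post("Lovely weather today"))
```

Whole-word matching avoids some false positives, but it cannot tell a report of an event from a figure of speech, which is one reason why the analysis of search terms needs continual refinement.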

Contact BGS citizen science for more information on this.

Contact

Contact Katy Lee for more information.
