Volcanic Eruption Prediction

Use Case

CATEGORY

Use Case

DATE

20 October

TIME

8 Minutes

Predicting volcanic eruptions using earthquake data

With this use case, we present a data app about the recent volcanic eruptions on the island of La Palma (Spain). The eruptions led to the declaration of a disaster area covering more than 400 hectares, land that produces 20 million kilograms of bananas every year, and to the destruction of 320 homes. We want to highlight the relationship between earthquakes and volcanic eruptions, and the importance of seismic monitoring for anticipating volcanic activity as early as possible.

In the data app, the user selects a date range and sees the location of every earthquake recorded on the island of La Palma between the two dates. The user can also run predictions with a trained model that relies on several features extracted from earthquake data in order to predict volcanic eruptions. To do so, the user picks an earthquake from an up-to-date list of all earthquakes recorded on the island since records began, and obtains the predicted probability of a volcanic eruption following that earthquake. All the data used in this data app is provided by the Spanish National Geographic Institute (www.ign.es).
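The date-range selection described above can be sketched in a few lines. This is only an illustration: the record layout (`date`, `lat`, `lon`, `magnitude`) and the sample values are invented for the example and do not come from the IGN catalogue or from the app's code.

```python
from datetime import date

# Invented sample records, for illustration only (not real IGN data).
SAMPLE_EARTHQUAKES = [
    {"date": date(2019, 3, 14), "lat": 28.60, "lon": -17.85, "magnitude": 1.8},
    {"date": date(2020, 12, 24), "lat": 28.57, "lon": -17.88, "magnitude": 2.1},
    {"date": date(2021, 9, 12), "lat": 28.55, "lon": -17.84, "magnitude": 3.3},
    {"date": date(2021, 9, 19), "lat": 28.61, "lon": -17.87, "magnitude": 3.3},
]


def filter_by_date(records, start, end):
    """Return the records whose date lies in [start, end], both ends inclusive."""
    return [r for r in records if start <= r["date"] <= end]


# Keep only the earthquakes recorded during 2021.
in_2021 = filter_by_date(SAMPLE_EARTHQUAKES, date(2021, 1, 1), date(2021, 12, 31))
print(len(in_2021))
```

The filtered list is what a map widget would then plot, using each record's latitude and longitude.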

THE CHALLENGE

The main challenge solved with this data app is the retrieval of earthquake information from a public data source. Through its website, the Spanish National Geographic Institute (www.ign.es) provides detailed data about every earthquake on record across the country.

In this case, the data is stored on a server and is accessed through HTTP requests: the request parameters (latitude and longitude ranges, and a date range) are sent to the server, and the response is a TXT file containing the data for all the earthquakes found. Additionally, a machine learning model stored as a pickle file can be loaded and run directly from the data app, using the data retrieved for a specific earthquake selected by the user.
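As a rough sketch of the pickle-based inference step, the snippet below loads a model from a binary stream and turns a feature vector into a probability. The source does not describe the model type or its features, so a plain dictionary of logistic-regression coefficients stands in for the real model here; the names `load_model` and `eruption_probability` and the coefficient values are assumptions made for illustration.

```python
import io
import math
import pickle


def load_model(file_obj):
    """Load a pickled model from a binary file-like object.

    Only unpickle files from a trusted source: pickle can execute code on load.
    """
    return pickle.load(file_obj)


def eruption_probability(model, features):
    """Apply a logistic model (hypothetical stand-in) to one feature vector."""
    z = model["bias"] + sum(w * x for w, x in zip(model["weights"], features))
    return 1.0 / (1.0 + math.exp(-z))


# Stand-in model with made-up coefficients, pickled into an in-memory buffer
# instead of a file on disk.
stand_in = {"weights": [0.5, -0.25], "bias": 0.0}
buffer = io.BytesIO(pickle.dumps(stand_in))

model = load_model(buffer)
print(eruption_probability(model, [2.0, 4.0]))  # z = 0, so this prints 0.5
```

With a real model file, the buffer would be replaced by `open(path, "rb")`, and the probability would come from whatever prediction method the pickled object exposes.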

METHODOLOGY

In order to build this data app, the following steps are covered:

  • Building a custom function (get_earthquakes_IGN) that, given a map area (i.e. longitude and latitude ranges), a start date and an end date, performs an HTTP request to download a list of all earthquakes from the servers of the Spanish National Geographic Institute. The coordinates are constant and correspond to the island of La Palma, and the dates are set automatically to the oldest possible date and today’s date, so that all available earthquake records are downloaded.
  • The function get_earthquakes_IGN relies on another function (encode_multipart_formdata), which helps build the right HTTP request, incorporating all the needed information about geographic coordinates and date range.
  • All earthquakes returned by get_earthquakes_IGN are converted into options for a selector, in which the user can choose an earthquake. When the user requests the probability of a volcanic eruption, the function compute_probability is executed: it downloads the data file of the chosen earthquake, builds the features, normalizes them, feeds them into the model, and finally returns the desired probability.
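To make the request-building step more concrete, here is a minimal sketch of what a helper like encode_multipart_formdata might look like. It is not the app's actual code: the form field names (`latMin`, `startDate`, and so on), the date format, and the bounding box used in the example are assumptions, and the real IGN form may expect different ones.

```python
import uuid
from datetime import date


def encode_multipart_formdata(fields, boundary=None):
    """Encode a dict of text fields as a multipart/form-data body.

    Returns a (content_type, body_bytes) pair ready for an HTTP POST.
    """
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")  # blank line separates part headers from the value
        lines.append(str(value))
    lines.append(f"--{boundary}--")  # closing delimiter
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", body


def build_query_fields(lat_min, lat_max, lon_min, lon_max, start, end):
    """Collect the search parameters; field names here are hypothetical."""
    return {
        "latMin": lat_min,
        "latMax": lat_max,
        "longMin": lon_min,
        "longMax": lon_max,
        "startDate": start.strftime("%d/%m/%Y"),  # assumed date format
        "endDate": end.strftime("%d/%m/%Y"),
    }


# Approximate bounding box around La Palma (illustrative values).
fields = build_query_fields(28.4, 28.9, -18.1, -17.7,
                            date(2021, 1, 1), date(2021, 12, 31))
content_type, body = encode_multipart_formdata(fields)
print(content_type)
```

The resulting pair can be passed to `urllib.request.Request(url, data=body, headers={"Content-Type": content_type})`, where `url` is the catalogue endpoint, which is not given in this article.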

METRICS

Since the machine learning model attempts to classify an earthquake as related or not to a volcanic eruption, three basic classification metrics are used:

  • Precision reflects the fraction of actual volcanic eruptions among all the eruptions predicted by the model. The model provided achieves 67% precision.
  • Recall reflects the fraction of actual volcanic eruptions found by the model among all actual volcanic eruptions. The model provided achieves 52% recall.
  • F1-score is a classification metric based on the harmonic mean of precision and recall, and reflects how good a classifier is when precision and recall are considered simultaneously. The model provided achieves an F1-score of 57%.
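The three metrics above follow directly from the counts of true positives, false positives and false negatives. The sketch below shows the arithmetic; the counts are hypothetical and are not the model's actual confusion matrix, which the article does not report.

```python
def harmonic_mean(a, b):
    """Harmonic mean of two non-negative numbers (0.0 if both are zero)."""
    return 2 * a * b / (a + b) if (a + b) > 0 else 0.0


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts.

    tp: eruptions predicted and real; fp: predicted but not real;
    fn: real eruptions the model missed.
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall, harmonic_mean(precision, recall)


# Hypothetical counts, chosen only to illustrate the formulas.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Applying `harmonic_mean` to the rounded figures quoted above (0.67 and 0.52) gives roughly 0.59, so the reported 57% presumably reflects a different averaging scheme or evaluation setup; the source does not say which.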

SYNTHESIZED RESOLUTION

The user may choose different date ranges in order to obtain information about the seismic activity on the island of La Palma. For instance, these were the earthquakes that happened on the island during 2019.

If we now choose to see the earthquakes between 2020 and 2021, we can see a considerable increase in seismic activity.

Finally, we can check the seismic activity during 2021. The image speaks for itself.

If we choose a record from September 12th 2021, one week before the eruption started (3:13 am, Fuencaliente de La Palma, magnitude 3.3), we obtain a probability of eruption of 54.88%. For September 19th 2021, the day on which the longest registered eruption started, we obtain a probability of 83.58% for an earthquake recorded at 6:28 in El Paso, also with a magnitude of 3.3.

RESULTS

Even though this data app is just a simple example of data retrieval, data visualization and model inference, we can highlight the following results:

  • The number of earthquakes increased dramatically in the months leading up to the eruption in September 2021 (the largest ever registered for this volcano).
  • According to the trained model, which achieved a precision of 67%, a recall of 52% and an F1-score of 57%, some earthquakes indicated imminent volcanic activity as early as one week before the first eruption happened.
  • The data for some earthquakes recorded on the day the longest eruption started indicate a probability of eruption of 83.58%.

HOW DOES SHAPELETS HELP SOLVE THIS CHALLENGE?

In this use case, Shapelets makes it easy to build a data app that retrieves earthquake data from a public database using HTTP requests. It also makes it easy to build powerful geographical visualizations with the retrieved data. Finally, it lets the user run inference with trained models on the retrieved data, obtaining predictions with a single click.

CLOSURE

With this use case, we have learnt about earthquakes, volcanic eruptions and the relationship between them. We have built an app that helps the user retrieve data from a third-party provider, in this case the Spanish National Geographic Institute, visualize this data on geographic maps and use it to make predictions. Similar tasks can be performed with other types of data, such as meteorological, geological or astronomical data: visualize and analyse it, build models, and share the predictions of those models seamlessly with your team.