The Future of Time-Series Data

CATEGORY

Shapelets

 CATEGORY

   BigData

 DATE

  08 Feb

 TIME

   7 Minutes

“We must learn from the past to build a brighter future.”

There are many famous quotes reiterating the idea that to create a better future, you must first dig into the past. With our desire to create an impact in the time-series analysis world, we decided to do just that.

We were fortunate to have the opportunity to talk to Tobias Oetiker, the Swiss software engineer who in 1999 created RRDtool, the first open-source time-series database.

We wanted to learn about his story, the reasons that motivated him to create RRDtool, and his opinion about the future of time-series databases. We even found out that RRDtool wasn’t the first program that Tobias successfully created.

A platform specialising in Data Science

Creator of RRDtool

The first open-source time-series database

 

THE PAST

In our interview with Tobias (as seen in the video above), we found out that the first program he created was MRTG. He wrote it in 1995 at De Montfort University in Leicester, where he worked as a system administrator responsible for monitoring and managing the university’s network. To better understand and combat the recurring problem of a slow and unstable internet link, Tobias created what came to be MRTG.

 

Still available for use today, MRTG records the traffic on a network link and makes it available on a web page. The program worked well for monitoring networks, but Tobias felt inspired to build something that could be used for all sorts of time-series monitoring and database applications. He dreamed up the idea of a standalone time-series database tool.

This dream came to fruition in the summer of 1999, when Tobias embarked on a work exchange in San Diego, California, with CAIDA. CAIDA gave him the chance to continue developing his program, and he successfully launched the first version of RRDtool.

Tobias had just created the first open-source time-series database, and the legacy of RRDtool continues even today.

THE PRESENT

Because RRDtool is open source, Tobias never received great financial gain from his endeavour, but he did earn great respect and recognition in the computer science world. His contribution marked an important moment in the development of Big Data storage and data analysis.

RRDtool has been used in numerous metric-monitoring projects, for tracking satellites, and even by weather stations.

Today, when compared with newer database solutions, RRDtool still holds competitive advantages. It may need more information up front to get a project started, but once your data set is in, extracting data is quite easy. This contrasts with other databases, such as InfluxDB, which make initial storage a breeze but can cause problems at extraction time if you haven’t stored your data in a specific way.

Another competitive advantage of RRDtool is that it is well suited to systems that run for long periods. RRDtool enforces the deletion of old data, which eliminates the problem of filling up disk space. This is unlike many modern database solutions, which often follow the classic database style, where data is stored continuously until you manually do something about the old data.
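To make the idea of mandatory deletion concrete, here is a minimal sketch of a fixed-size store in which each new sample evicts the oldest one once the store is full. The class name `RoundRobinArchive` and its `update` and `fetch` methods are our own illustrative names, not RRDtool's API, and the sketch does not attempt to reproduce RRDtool's file format or any of its other features.

```python
from collections import deque
from typing import List, Tuple


class RoundRobinArchive:
    """Hypothetical fixed-capacity store: once full, each new sample
    evicts the oldest one, so storage never grows."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # A deque with maxlen silently drops the oldest item on append.
        self._samples = deque(maxlen=capacity)

    def update(self, timestamp: int, value: float) -> None:
        """Store one (timestamp, value) sample."""
        self._samples.append((timestamp, value))

    def fetch(self) -> List[Tuple[int, float]]:
        """Return the retained samples, oldest first."""
        return list(self._samples)


archive = RoundRobinArchive(capacity=3)
for t, v in [(0, 1.0), (60, 2.0), (120, 3.0), (180, 4.0)]:
    archive.update(t, v)

# The first sample has been overwritten; only the latest three remain.
print(archive.fetch())  # [(60, 2.0), (120, 3.0), (180, 4.0)]
```

Because the capacity is fixed when the archive is created, the space it needs is known in advance, which is what makes this style attractive for long-running systems.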

In the time-series analysis world, we have seen a lot of development in data storage, much of which is available for free, and also in data monitoring, time-series modeling and troubleshooting. But the same question keeps resurfacing: “How can we get the most value from our vast amounts of data?”

THE FUTURE

Tobias Oetiker gave us great insight into the beginnings of time-series databases and how far they have come today, but we still wanted to know what the future holds for time-series data.  

Tobias pointed out that the value of time-series data comes from being able to identify the normal case, so that we can compare and troubleshoot when things go wrong, while also coping with scale and growth in data.

We have come a long way in the world of time-series data, but Tobias notes that what still seems to be missing is a way to find the interesting data in the huge mountain of data that we are able to collect today.
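As a rough illustration of what "identifying the normal case" can mean in practice, the sketch below treats the mean and standard deviation of the preceding few samples as the baseline and flags any value that strays too far from it. This is our own simplified example, not a method Tobias described; the function name `flag_unusual`, the window size and the threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev
from typing import List


def flag_unusual(values: List[float], window: int = 5,
                 threshold: float = 3.0) -> List[int]:
    """Return the indexes of values lying more than `threshold` standard
    deviations from the mean of the preceding `window` values."""
    flagged = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]  # the assumed "normal case"
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: any change counts as a deviation.
            if values[i] != mu:
                flagged.append(i)
        elif abs(values[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical link-traffic readings with one obvious spike.
traffic = [10.0, 11.0, 9.0, 10.0, 10.5, 10.2, 55.0, 10.1, 9.8]
print(flag_unusual(traffic))  # [6]
```

Only the spike at index 6 is flagged. A real system would need a more robust notion of "normal" (seasonality, trends, outliers inside the baseline itself), but the principle of comparing new data against a learned baseline is the same.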

Tobias has played with the idea of creating a second version of RRDtool, but he feels it would only be worth investing in a database that has an analytical aspect requiring as little user interaction as possible. The analytical aspect is the future of time-series data.

   

FINAL THOUGHTS

We felt honoured to have the opportunity to talk to someone so knowledgeable yet so humble, especially given everything that he has accomplished. Tobias gave us many insights about time-series analysis to reflect upon.

We found it striking how individual open-source projects such as RRDtool or Graphite started out with a specific purpose but, almost by serendipity, grew into widely used tools with many other functionalities.

For us at Shapelets, it has always been a priority to offer effective solutions for the technology community. This is why we decided to publish Khiva, an open-source library of novel time-series algorithms optimized for GPU and CPU, as soon as we started our journey in 2018. At Shapelets, we want to empower time-series forecasting and time-series modeling for everyone in the technology community.

Reflecting on what Tobias Oetiker said about the future of time-series data, we couldn’t agree more. With an overwhelming amount of data, we need tools to be able to find, collect and analyze the most interesting parts of this data. That is our mission. We will always strive to offer all the tools necessary to help people find the most valuable information in time series data.

This is why we have created a platform providing the best tools for data scientists to discover valuable and interesting data amongst the mountains of data available. Using the combination of our platform with a range of new and powerful algorithms that we offer, we know that, together, we are beginning to create the future of time series analysis.

 
