Real-time data is central to Internet of Things applications, which deal with event-based, machine-generated data flowing in uninterrupted streams into various IoT systems. The Waylay platform is one such system, built with a strong focus on real-time data acquisition, real-time orchestration and real-time rule-based actuation. But there is new value to be extracted from the vast amounts of real-time IoT data that are not put to use in the live moment. This data can get a whole new makeover in its afterlife, where new developments in ML and AI make it easier than ever to extract new meaning from it. It is this rich afterlife of time-based data, and how we enable it at Waylay for our customers, that we will discuss in this and the following blog post.

Real-time IoT data and historical IoT data

A real-time automation engine like Waylay ingests data (coming from a variety of sources, not just IoT devices), processes it in real time and, based on the results of this processing, performs different automated actions. Real-time processing lies at the heart of all typical IoT applications, as it allows businesses to improve operations or offer new digital services to customers by gaining live insights into their connected products and reacting in real time.

But many data applications also require access to, and management of, historical data, in order to get an overview of what is going on over longer periods of time and to allow more structured, better-resourced exploration. Business Intelligence reporting, statistical analysis of devices, training and testing of statistical models: this is where business analysts and data scientists create knowledge out of historical big data. As a first service for these types of analytics requirements, the Waylay platform provides an ETL (Extract, Transform, Load) export mechanism that regularly offloads your Waylay data into files that you can handle with your existing batch-oriented ETL tools. We’ll explain below how this mechanism works and what sort of data you gain access to by using our ETL service.

Once you have your ETL files, you can proceed with getting the data into a professional data pipeline and carry out the actual offline data analysis. For these last two steps, you can either rely on your own internal resources or use Waylay’s data science team, which can assist with these and with more advanced use cases, where you close the loop and feed your offline findings back into the real-time engine to refine your running logic scenarios and ultimately improve your IoT solutions.

ETL to complement real-time processing

There are two types of data that we export and deliver to customers via our ETL service, and they come in two separate files with different formats. The first is time-series data, with the actual sensor measurements and the actions taken. The second is metadata: data about the physical products themselves, such as manufacturer and model, user, location, etc. Both types of data are important, and both are needed in order to gain a complete view of the IoT solution over time.

Time-series is king, but context matters too

Time-series data is a big deal in IoT because it records the moving real-time data that we described as the heart of IoT applications. It consists of series of data points collected at regular intervals and indexed in time order, the sort of reading you might see, for example, from a smart meter or thermostat in a home or from a connected ventilation system in an office building. It arrives in volume, requires careful handling, and is distinctive in that every reading is time-stamped so that change can be measured over time: for example, a rise in energy consumption when a family returns home, or the different peaks of HVAC equipment use in an office building.

Metadata is at least as important as the time-series data because, with IoT data, context matters. For some applications it is not only desirable but even critical that the user (be it a person or a company) be provided with as much contextual information as possible. This can include the provenance of the products, such as their make and type; the literal context of where the sensors are situated (e.g. inside a moving vehicle, attached to power poles, etc.); whether the source devices are maintained and calibrated; and their change logs. All of this surrounding information is found and conveyed outside of the data streams themselves.

Putting it all back together

Merging your historical time-series data with your metadata lets you answer questions like “What is the average registered temperature over this set period of time for this particular model of device?” or “What is the periodicity of alarms triggered for this type of meter?”, and so on. A historical view of how your real-time IoT solution behaves over time enables richer insights that can in turn help with longer-term planning and strategizing. Waylay’s ETL export is a first important service that we provide in order to help customers achieve this. The service is available as a daily, weekly or monthly export, and customers choose the frequency based on their specific use case.
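
To make the first of those questions concrete, here is a minimal sketch in Python of joining the two kinds of exported data. It is an illustration only: the CSV layout, the column names (resource, timestamp, metric, value, manufacturer, model), the sample rows and the average_by_model helper are all assumptions made up for this example, not the actual format of the Waylay ETL files.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical stand-ins for the two exported files. The real export
# layout and column names may differ; these are assumptions.
TIMESERIES_CSV = """resource,timestamp,metric,value
meter-1,2024-01-01T00:00:00Z,temperature,20.0
meter-1,2024-01-01T01:00:00Z,temperature,22.0
meter-2,2024-01-01T00:00:00Z,temperature,18.0
meter-3,2024-01-01T00:00:00Z,temperature,30.0
"""

METADATA_CSV = """resource,manufacturer,model
meter-1,Acme,T100
meter-2,Acme,T100
meter-3,Globex,X9
"""


def average_by_model(timeseries_file, metadata_file, metric, start, end):
    """Average `metric` per device model over the window [start, end)."""
    # Metadata lookup: device identifier -> model.
    model_of = {
        row["resource"]: row["model"]
        for row in csv.DictReader(metadata_file)
    }
    values = defaultdict(list)
    for row in csv.DictReader(timeseries_file):
        if row["metric"] != metric:
            continue
        # Timestamps that share one ISO 8601 UTC format sort
        # lexicographically, so plain string comparison is enough here.
        if not (start <= row["timestamp"] < end):
            continue
        model = model_of.get(row["resource"])
        if model is None:
            continue  # skip measurements from devices without metadata
        values[model].append(float(row["value"]))
    return {model: mean(readings) for model, readings in values.items()}


averages = average_by_model(
    io.StringIO(TIMESERIES_CSV),
    io.StringIO(METADATA_CSV),
    metric="temperature",
    start="2024-01-01T00:00:00Z",
    end="2024-01-02T00:00:00Z",
)
print(averages)
```

With real exports you would pass open file handles instead of the in-memory strings. At larger volumes the same join is more comfortably done in a batch ETL tool or a dataframe library, but the shape of the operation stays the same: look up each measurement's device in the metadata, then aggregate by the metadata attribute you care about.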

In addition to the ETL export service, Waylay also offers optional professional services to take the next steps beyond the data export. If you are curious to learn more, in this follow-up post we explain what the next steps are and how our data pipeline works.