Univariate data plausibility checks in water management

The workflows of the hetida platform enable univariate data plausibility checks that quickly identify incorrect measured values. In this article, we walk through such a use case from the water industry.

Purpose of the plausibility check

According to the SüwVO Abw. (Self-Monitoring Ordinance for Wastewater), the municipal drainage company is obliged to install continuously recording water level measuring devices in its storage sewer. Shortly before the quarterly report on the city’s sewer structures is due, an employee notices that the sensor has been sending strange readings for several days:

On 24.11.2024, some values are initially missing, then the water level is suddenly three meters higher than before and thus even above the maximum level of 600 cm. During her investigations, the employee finds out that the sensor was reconfigured during the period of missing data – it is a distance-measuring water level sensor and the suspension height must have been set incorrectly. This is easy to fix. The only annoying thing is that the error only becomes apparent after two days of incorrect measurements.

To avoid implausible measurement data, the causes of errors must be identified and rectified as quickly as possible. To do this, we need an automatic measurement data plausibility check that provides an overview of all incorrectly measuring sensors at a glance. We implement this requirement below in the IoT and analytics solution hetida platform.

We proceed as follows:

First, we visualize typical implausible data for each sensor type used by the urban drainage system.
On this basis, we introduce univariate rules for the automatic detection of such implausibilities.
We then use these rules to check the plausibility of measurement data by creating a hierarchy of all urban drainage sensors in the hetida platform and implementing workflows in hetida designer that execute the rules.
Finally, we create a user-friendly dashboard in the hetida platform that shows all implausibilities in the data at a glance.

Examples of implausible data

Implausible data can occur not only with water level measuring devices. The urban drainage system uses a variety of sensors to measure water levels, precipitation, temperatures and humidity values. Depending on the type of sensor, different patterns in the data are implausible. For temperature sensors, sudden outliers are not to be expected, but can occur due to hardware errors, for example:

In the time series of a precipitation sensor, however, such fluctuations occur frequently. Therefore, values that remain exactly the same over a longer period of time are implausible – unless the value is 0. However, such measurements can occur, for example, with an optical sensor due to heavy soiling of the lens:

For water level and precipitation sensors, we must also recognize values that lie outside a certain range (less than 0, or greater than the maximum fill level for water level sensors) as implausible. An example of this has already been shown in the introduction. In this example, we have also seen that sensors that transmit at a regular fixed frequency must be recognized as faulty if measured values are missing.

Univariate rules to detect implausible data

All the implausibilities described above can already be recognized by simple univariate rules (i.e. rules that can be applied to the time series of a single sensor), which can be applied to different sensor types:

Rule for detecting outliers: All measured values that deviate unusually strongly from the previous measured values are implausible. This rule applies to water level and temperature sensors.

Formal: Let m_t := max{ |v_t - v_s| : t - d < s < t } be the maximum deviation of a measured value v_t at time t from the previous measured values within a time period of length d. Let Q₁ be the first quartile of all m_t, Q₃ the third quartile and IQR := Q₃ - Q₁ the interquartile range. Then all measured values v_t for which m_t ∉ [Q₁ - c⋅IQR, Q₃ + c⋅IQR] are defined as implausible, where c is a configurable factor.
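As a minimal sketch of the outlier rule, assuming the deviation is measured as an absolute difference and the time series is given as a list of (time, value) pairs sorted by time. The function names, the default c = 1.5 and the sample temperatures are illustrative assumptions, not part of the hetida platform:

```python
from statistics import quantiles

def max_deviations(series, d):
    """Return m_t per point: the largest |v_t - v_s| over t - d < s < t.

    m_t is None when the window contains no earlier value.
    """
    result = []
    for i, (t, v) in enumerate(series):
        window = [abs(v - v_s) for s, v_s in series[:i] if t - d < s < t]
        result.append(max(window) if window else None)
    return result

def outlier_flags(series, d, c=1.5):
    """Flag values whose m_t lies outside [Q1 - c*IQR, Q3 + c*IQR]."""
    m = max_deviations(series, d)
    defined = [x for x in m if x is not None]
    if len(defined) < 2:
        return [False] * len(series)  # too little data to estimate quartiles
    q1, _, q3 = quantiles(defined, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - c * iqr, q3 + c * iqr
    return [x is not None and not (lower <= x <= upper) for x in m]

# Hypothetical temperature readings (°C), one per minute, with a spike at minute 6.
values = [20.0, 20.5, 20.25, 20.75, 20.5, 20.0, 35.0, 20.25, 20.5, 20.0, 20.25, 20.5]
series = list(enumerate(values))
flagged = [t for (t, _), f in zip(series, outlier_flags(series, d=2)) if f]
```

With d = 2 each value is compared only with its direct predecessor. In this example the spike at minute 6 is flagged, and so is the value at minute 7: by the definition of m_t, the return to the normal level deviates just as strongly from the spike.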

Rule for plateau detection: All measured values that match the previous measured values exactly, but are not a value from a certain set of legitimate constant values, are implausible. This rule applies to water level, temperature and precipitation sensors, whereby 0 is a legitimate constant value for precipitation sensors.

Formal: Let L be a possibly empty set of legitimate constant values. Define all measured values v_t for which m_t = 0 but v_t ∉ L as implausible.
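The plateau rule can be sketched in the same style. The condition m_t = 0 is checked directly: every earlier value in the window of length d must equal v_t. The function name and the precipitation readings are hypothetical:

```python
def plateau_flags(series, d, legitimate=frozenset()):
    """Flag values that equal all earlier values within t - d < s < t,
    unless the value belongs to the set of legitimate constant values."""
    flags = []
    for i, (t, v) in enumerate(series):
        window = [v_s for s, v_s in series[:i] if t - d < s < t]
        # m_t = 0 means: the window is non-empty and every value in it equals v
        is_constant = bool(window) and all(v_s == v for v_s in window)
        flags.append(is_constant and v not in legitimate)
    return flags

# Hypothetical precipitation readings (mm), one per minute; 0 is a legitimate constant.
values = [0.0, 0.0, 0.0, 1.2, 2.4, 2.4, 2.4, 2.4, 0.0, 0.0]
series = list(enumerate(values))
flags = plateau_flags(series, d=3, legitimate={0.0})
flagged = [t for (t, _), f in zip(series, flags) if f]
```

The window length d controls how long a value must stay constant before it counts as a plateau. With d = 3, a reading is flagged only once the two preceding readings carry the same value, so the first two readings of 2.4 remain unflagged.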

Rule for detecting values outside a valid range: All measured values that lie outside a previously defined valid value range are implausible. This rule applies to water level and precipitation sensors, where the lower limit of plausible values is 0.

Formal: Let v_min, v_max ∈ ℝ ∪ {-∞, ∞} with v_min < v_max. Define all measured values v_t with v_t < v_min or v_t > v_max as implausible.
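A minimal sketch of the range rule, with infinite defaults standing in for a missing bound. The function name and the readings are assumptions; the limits 0 and 600 cm are taken from the water level example in the introduction:

```python
import math

def range_flags(values, v_min=-math.inf, v_max=math.inf):
    """Flag every value that lies below v_min or above v_max."""
    if not v_min < v_max:
        raise ValueError("v_min must be smaller than v_max")
    return [v < v_min or v > v_max for v in values]

# Hypothetical water levels in cm, checked against the range 0 to 600 cm.
levels = [312, 318, 615, 620, -4, 300]
flags = range_flags(levels, v_min=0, v_max=600)
```

Values exactly on a limit count as plausible here, because the rule only excludes values strictly outside the range.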

Rule for detecting missing values: All periods in which no data was sent for an unusually long time are classified as faulty. This rule applies to all sensors.

Formal: Let d_t := t - max{ s ≤ t | there is a value v_s at time s } be the time that has elapsed at time t since the last transmitted value. Let Q₁ be the first quartile of all d_t, Q₃ the third quartile and IQR := Q₃ - Q₁ the interquartile range. Then define all those time periods as faulty within which d_t ∉ [Q₁ - c⋅IQR, Q₃ + c⋅IQR], where c is a configurable factor.
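One possible way to sketch this rule is to evaluate d_t only at the moment a new value arrives, where it equals the gap between two consecutive transmissions. This discretization, the function name, the default c = 1.5 and the timestamps are assumptions. Because the rule targets unusually long silences, the sketch checks only the upper bound:

```python
from statistics import quantiles

def missing_data_periods(timestamps, c=1.5):
    """Return (start, end) pairs of transmission gaps longer than Q3 + c*IQR."""
    pairs = list(zip(timestamps, timestamps[1:]))
    gaps = [b - a for a, b in pairs]
    if len(gaps) < 2:
        return []  # too little data to estimate quartiles
    q1, _, q3 = quantiles(gaps, n=4, method="inclusive")
    upper = q3 + c * (q3 - q1)
    return [(a, b) for a, b in pairs if b - a > upper]

# Hypothetical transmission times in minutes: roughly every 5 minutes,
# with one long silence between minute 25 and minute 70.
timestamps = [0, 5, 10, 16, 20, 25, 70, 75, 81, 85]
periods = missing_data_periods(timestamps)
```

Small jitter in the transmission interval stays within the bound, while the 45-minute silence is reported as a faulty period.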

Hierarchical organization of sensors in the hetida platform

Each urban drainage sensor sends its measurements to the hetida platform. Accordingly, each sensor is represented in the hetida platform as a data channel. These data channels can be organized in a hierarchical structure:

For example, the city’s drainage company has decided to group all water level sensors in one branch of the hierarchy and all meteorological sensors in another.

Further information on the hierarchical asset structure of the hetida platform

Implementation of the rules in hetida designer

To apply the rules defined above, we create a workflow in hetida designer for each sensor type. The workflow receives the time series of values measured by the sensor as input and identifies all implausible data using the rules that match the sensor type. The output is a plot of the time series in which all incorrect values are marked as such. Here is the workflow for a water level sensor, which contains the rules for the valid value range and missing data:

Each of the rules is implemented in a component, then the implausibilities recognized by these components are merged again and plotted together with the original data:
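The merging step can be illustrated independently of hetida designer. The following sketch is not the component API of the workflow tool; it assumes that each rule reports its findings as (start, end) periods and combines them into disjoint regions that a plot could highlight. All names and numbers are hypothetical:

```python
def merge_intervals(*interval_lists):
    """Merge (start, end) periods reported by several rules into disjoint regions."""
    intervals = sorted(iv for lst in interval_lists for iv in lst)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous region: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Hypothetical findings for a water level sensor, in minutes:
missing = [(25, 70)]                    # from the missing-values rule
out_of_range = [(70, 120), (200, 210)]  # from the valid-range rule
regions = merge_intervals(missing, out_of_range)
```

In this example the period of missing data and the directly following period of values above the maximum are combined into one highlighted region, similar to the situation described in the introduction.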

The workflow tool hetida designer recognizes the missing data and values above the permitted maximum and marks them with an orange background.

Further information on the workflow tool hetida designer

Presentation of the results in the hetida platform

The urban drainage employee now creates a dashboard in the hetida platform in which she brings together all the installed sensors. For each sensor, the measured time series is sent through the hetida designer workflow that matches the sensor type and any implausibilities detected are plotted together with the time series. At the time selected here, there were three periods of implausible data in the last 24 hours and one of the sensors is still affected:

Equipped with this dashboard, the urban drainage employee can in future see immediately whether a sensor has a problem that requires human intervention. She can then locate the faulty precipitation sensor on the map and arrange maintenance before long periods of faulty data accumulate and before critical systems such as flood forecasting, which rely on error-free data, stop delivering good results.

Further information on the dynamic dashboards of the hetida platform
