Novel approach for burst detection in water distribution systems based on graph neural networks
Introduction
Sustainable water management is a key priority for the future, since the constant increase in water demand driven by socio-economic factors and climate change puts water distribution systems (WDSs) under increasing strain. Hence, wasting water is no longer acceptable for today’s utilities. To limit losses and waste, optimizing water pressure in WDSs and using smart meters can provide essential savings (Spedaletti et al., 2022). Under these conditions, water management needs proper care to undertake a transition towards a smart and sustainable paradigm (Oberascher et al., 2022, Puchol-Salort et al., 2021). The new era of big data (Shafiee, Barker, & Rasekh, 2018) and artificial intelligence offers innovative ways to make water management more efficient (Savic, 2019) and to advance towards a sustainable management paradigm (Ávila, Sánchez-Romero, López-Jiménez, & Pérez-Sánchez, 2022).
Research aimed at developing new data-driven strategies for improving water management has blossomed in the last few years. Notable applications include water demand modeling (e.g., House-Peters & Chang, 2011), data analysis techniques for smart water metering systems (e.g., Nguyen et al., 2018, Rahim et al., 2020), intrusion detection methods (e.g., Mboweni, Abu-Mahfouz, & Ramotsoela, 2021) and water demand forecasting (e.g., Brentan et al., 2017, Herrera et al., 2010). All these research approaches share the aim of helping water utilities manage WDSs efficiently. For example, Bakker et al. (2013) showed that a system for forecasting urban water demand in the Netherlands reduced both the energy consumption and the energy cost of a WDS. These achievements encouraged the scientific community to focus on developing innovative and powerful methodologies and on adapting the novel techniques that are emerging across different research fields (Bronstein et al., 2017, Schmidhuber, 2015).
One crucial research task for water distribution systems lies in leak detection and water waste reduction (Cavazzini, Pavesi, & Ardizzon, 2020). Leaks can be divided into background losses and pipe bursts (Puust et al., 2010, Zaman et al., 2020). While background losses are distributed along the network and are mainly caused by the deterioration over time of water network assets (i.e., pipes, valves, fire hydrants), pipe bursts are characterized by a significant crack in a pipe with a consequent high water outflow. Identifying this latter type of leak is fundamental to avoiding water waste and service interruptions (Misiunas, Lambert, Simpson, & Olsson, 2005). Consequently, efficient leak detection algorithms are essential for the correct management of WDSs. The development of data-driven methods for leak detection has become possible in recent years thanks to the increasing use of cyber–physical systems by water utilities and the data they make available.
The literature review of Wu and Liu (2017) identifies three categories of approaches to address the anomaly detection problem: classification, prediction–classification, and statistical methods.
- The classification methods consist of distinguishing data affected by bursts from normal data. Usually, classification methods require labeled data, that is, data with full knowledge of whether an anomaly has occurred or not. Still, a range of data-driven and machine learning algorithms have been developed for these methods. For instance, Aksela, Aksela, and Vahala (2009) proposed an approach based on a self-organizing map artificial neural network (ANN) for leak detection. The output of the latter was transformed into an alarm system based on a threshold value. Mounce and Machell (2006) proposed two conceptually different ANNs to detect bursts: a static ANN and a time-delay ANN. The authors showed the crucial role played by the temporal dimension.
- The prediction–classification methods rely on the idea that, when an anomaly (i.e., a burst) occurs, the prediction output differs significantly from the measured data. These methods have the advantage of requiring only normal hydraulic data to develop a prediction model. However, obtaining such clean data requires removing outliers and abnormal data. Therefore, some authors adopt data pre-processing and statistical methods (e.g., Romano, Kapelan, & Savić, 2014). A range of data-driven methodologies has been developed for this category, including support vector regression (e.g., Mounce et al., 2011, Zhang et al., 2016) and ANNs (e.g., Arsene et al., 2012, Fang et al., 2019). However, the prediction–classification methods rely on a first prediction phase that is followed by a second classification phase.
- The statistical methods for burst identification do not require any prediction or classification models. Instead, in most cases the detection relies on statistical theory, meaning this class of methods usually falls into the category of statistical process control (SPC). This consists of monitoring the process variation caused by anomalous events through control charts and analytic tools (e.g., Jung et al., 2015, Loureiro et al., 2016). Although this class of methods does not require any of the complex data-driven models that are usually developed in the previously described classification and prediction–classification methods, the results from SPC methods can be affected by high uncertainty (Wu & Liu, 2017).
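To make the prediction–classification idea concrete, the following minimal sketch flags a time step as anomalous when the residual between measured and predicted values leaves a band calibrated on burst-free data. It is only an illustration under stated assumptions: the function name `residual_alarms`, the `k = 3` standard-deviation band (a simple control-limit rule borrowed from SPC practice) and all numerical values are hypothetical and are not taken from any of the cited works.

```python
from statistics import mean, stdev


def residual_alarms(measured, predicted, calib_residuals, k=3.0):
    """Flag steps whose residual lies outside mean +/- k * std of burst-free residuals.

    calib_residuals: residuals (measured - predicted) collected on normal data.
    k: width of the tolerance band in standard deviations (assumed value).
    """
    mu = mean(calib_residuals)
    sigma = stdev(calib_residuals)
    lower, upper = mu - k * sigma, mu + k * sigma
    return [not (lower <= m - p <= upper) for m, p in zip(measured, predicted)]


# Hypothetical burst-free residuals used to calibrate the band (e.g., in L/s).
calib = [0.1, -0.2, 0.0, 0.15, -0.05, -0.1, 0.05, 0.05]

# Hypothetical model predictions and measurements; the last two steps mimic a burst.
predicted = [10.0, 10.5, 11.0, 10.8]
measured = [10.1, 10.4, 13.0, 12.9]

alarms = residual_alarms(measured, predicted, calib)
print(alarms)
```

In this toy case only the last two steps raise an alarm, because their residuals are far larger than the variability observed on the clean calibration data. The quality of that calibration data drives the result, which is why outlier removal matters for this family of methods.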
It is worth mentioning that all the different methodologies rely on the quality and the quantity of available data, which are fundamental for the proper development of anomaly detection approaches (Chan et al., 2018, Menapace et al., 2020).
Today, many data-driven problems in science and engineering have seen the rise of graph-structured approaches to provide closer representations of problems in non-Euclidean spaces (Bronstein et al., 2017). Graphs are structures that can model a set of objects (vertices) along with their relationships/connections (edges). The use of such structures has been previously adapted for dealing with engineering problems, such as anomaly detection in internet traffic networks (Herrera, Proselkov, Pérez-Hernández, & Parlikad, 2021). Other researchers use machine learning on graphs due to the ability of graphs to represent and analyze data with graph neural network (GNN) models (Wu et al., 2021). In machine learning, the non-Euclidean structure of graphs allows many different tasks such as node level classification, link prediction, and clustering, among others (Zhou et al., 2020). Graph convolutional neural networks (GCN) are one of the multiple variants of GNNs. Due to the expressive power of GCNs to learn graph representation, GCNs have demonstrated superior performance in many deep learning problems (Zhang, Tong, Xu, & Maciejewski, 2019).
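As an illustration of how a graph convolution combines node features with the graph structure, the sketch below implements a single layer of one widely used propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 H W), in plain Python. This rule is assumed here purely for illustration, since the text does not specify which GCN variant is meant. The toy graph, the feature values and the weights are hypothetical, and a real model would use a deep learning library and learn the weights from data.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2: symmetric normalization with self-loops."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [
        [inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
        for i in range(n)
    ]


def gcn_layer(adj, features, weights):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    propagated = matmul(matmul(normalized_adjacency(adj), features), weights)
    return [[max(0.0, v) for v in row] for row in propagated]


# Toy path graph 0 - 1 - 2 with one feature per node (hypothetical values).
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0], [2.0], [3.0]]
weights = [[1.0, -1.0]]  # maps 1 input feature to 2 output features

hidden = gcn_layer(adj, features, weights)
print(hidden)
```

Each output row mixes a node's own feature with those of its neighbors, weighted by node degree. This is how the layer captures information that depends on the graph topology rather than on each node in isolation.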
GCN has recently been adopted for different applications with anomaly detection purposes. For instance, Zhou et al. (2021) proposed a cross-network contamination source identification method based on GCN, where the latter is used to capture spatial information of the network topology. The authors showed the ability of the model to identify contamination source nodes reliably and to transfer knowledge from one WDS to another. Furthermore, Wang, Luo, and Zhou (2020) adopted GCN to build an anomaly detection model in the context of blockchain-based healthcare systems. The authors showed that their proposed approach could deal with the associated security requirements. In a different field, John, Thomas, and Emmanuel (2020) proposed an anomaly detection method based on GCN in the context of Android malware. Their proposed approach showed promising performance compared to other machine learning techniques. Another important application is the one introduced by Arifoglu, Charif, and Bouchachia (2020), where the authors deal with the problem of detecting the abnormal activities of people affected by dementia. In this context, the authors explored the use of GCN to detect anomalies from activation data and compared the results with some state-of-the-art methods in their field. They highlighted the ability of the GCN model to recognize abnormal activity related to dementia.
This paper proposes a novel approach for burst detection in WDSs. The approach consists of developing a classification model based on a graph convolutional neural network (GCN) to identify abnormal data in a dataset of pressure and flow rate measurements. The model relies on a graph structure that represents the input and forms the basis of a graph neural network classification model. Because labeling data for classification models is complex, the WDS data generator developed by Menapace et al. (2020) is adopted to obtain demand data affected by bursts. For the sake of clarity, having labeled data means having full knowledge of burst occurrences. The generator makes it possible to create suitable water demand time series affected by realistic burst events, following the formulations of van Zyl and Cassa (2014), and to simulate them in a distributed pressure-driven hydraulic solver (Menapace & Avesani, 2019). The solver simulates time series of both the flow rates along the pipes and the pressures at the nodes. This study uses the well-known WDS of Modena (Italy) with four datasets of synthetic water demand to test the classification methodology. These data are then used to build a graph in which the nodes are the meters involved and the links represent their correlation. The graph structure is used to build the GCN-based models that aim to classify whether an anomaly is present. In particular, two models based on GCN are proposed: (1) a model based on GCN that does not use past observations in the inputs, but only data from the same time frame as the event to be classified; and (2) a graph convolutional recurrent neural network (GCRNN) that is also fed with past observations. The GCRNN uses recurrent layers to extract temporal features from the data.
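The graph construction described above, with meters as nodes and their correlation as links, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the use of the absolute Pearson correlation as edge weight, the `threshold = 0.8` cut-off, the meter names and the time series values are all assumptions introduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long series (both must be non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_graph(series, threshold=0.8):
    """Build a weighted, undirected graph as a dict of dicts.

    Nodes are meters; an edge is kept when |correlation| >= threshold
    (the threshold is a hypothetical design choice).
    """
    names = list(series)
    adj = {name: {} for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            weight = abs(pearson(series[a], series[b]))
            if weight >= threshold:
                adj[a][b] = weight
                adj[b][a] = weight
    return adj


# Hypothetical readings: two pressure meters (p1, p2) and one flow meter (q1).
series = {
    "p1": [50.0, 49.0, 48.0, 47.0, 46.0],
    "p2": [40.0, 39.1, 38.0, 37.2, 36.0],
    "q1": [5.0, 9.0, 4.0, 8.0, 6.0],
}

graph = correlation_graph(series)
print(graph)
```

In this toy example only the two pressure meters are linked, because their series move together, while the flow meter remains an isolated node. Whether to threshold the correlations or keep a fully connected weighted graph is a modeling decision that the sketch leaves open.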
Furthermore, to compare the performance of the two GCN-based models, two classification models based on a multi-layer perceptron (MLP) architecture are developed. These are employed to emphasize the differences between the novel graph-based approaches (i.e., the GCN and GCRNN models) and conventional data-driven ones (i.e., the models based on MLP). As with the two graph-based approaches, the two MLP models have been designed using, in one case, past observations and, in the other, only data from the same moment as the event to be classified. The results highlight the high potential of the proposed graph-based approaches to detect bursts, showing reliable detection with high accuracy in all the tested case studies. An additional novelty of this paper lies in highlighting the significant advantages and superior performance of anomaly detection models that can learn from a graph structure.
Methodology
This section presents a description of the proposed graph-based classification methods for burst detection. Fig. 1 summarizes the overall research proposal.
The proposed methodology relies on a classification model for estimating whether a burst occurs. Given a WDS, the data generation method of Menapace et al. (2020) is adopted to generate realistic time series of water demand affected by bursts, together with the pressures at the network nodes. A description of the generator is proposed in
Benchmark models
To benchmark the two proposed graph-based models, the well-established multi-layer perceptron architecture is used, which is still employed in burst detection methodologies (Fallahi, Jalili Ghazizadeh, Aminnejad, & Yazdi, 2021). The proposed benchmark model mechanism is depicted in Fig. 4.
The proposed benchmark models are composed of a sequence of dense layers, starting from an input layer that is fed with the pressure and flow rate data of the
Case studies
To test the proposed burst detection methodologies, the well-known Modena network (Bragalli, D’Ambrosio, Lee, Lodi, & Toth, 2008), located in Italy, is used. The sensor configuration of the network is selected by means of the procedure described in Section 2.2, as also shown in Zanfei, Menapace, Santopietro, and Righetti (2020). The network is depicted in Fig. 5.
The Modena network model consists of 4 reservoirs, 267 nodes and 317 pipes with a total length of 71.8 kilometers.
Results and discussion
A reliable burst detection algorithm should provide correct detections, with true positive rate (TPR) values close to 1, and low detection times. Furthermore, a small number of false alarms with low persistence over time indicates a robust detection algorithm. The results of the proposed anomaly detection models are shown in Table 2. In particular, it reports the detection time, from instantaneous to an 8 h delay, and metrics including false negatives (FN), false positives (FP) and TPR for the 4 considered models.
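For reference, the metrics mentioned above can be computed from per-time-step labels and predictions as in the following minimal sketch, where TPR = TP / (TP + FN). The helper names `confusion_counts` and `detection_delay`, the label encoding (1 for burst, 0 for normal) and the example series are hypothetical. The delay is expressed in time steps and only the first burst in the series is considered.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false negatives and false positives, and derive the TPR."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    return {"TP": tp, "FN": fn, "FP": fp, "TPR": tpr}


def detection_delay(y_true, y_pred):
    """Steps between the start of the first burst and the first alarm raised during it.

    Returns None when the series has no burst or the first burst is never detected.
    """
    if 1 not in y_true:
        return None
    start = y_true.index(1)
    for i in range(start, len(y_true)):
        if not y_true[i]:  # the first burst has ended without any alarm
            break
        if y_pred[i]:
            return i - start
    return None


# Hypothetical labels (1 = burst) and model outputs (1 = alarm).
y_true = [0, 0, 1, 1, 1, 1, 0, 0]
y_pred = [0, 1, 0, 0, 1, 1, 0, 0]

counts = confusion_counts(y_true, y_pred)
delay = detection_delay(y_true, y_pred)
print(counts, delay)
```

With these toy series there is one false alarm before the burst, the burst is detected two steps after it starts, and half of the burst-labelled steps are flagged. Multiplying the delay by the sampling interval converts it into a detection time.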
Table 2 highlights the
Conclusions
Accurate detection of leakages improves the sustainable management of water resources by decreasing water waste and increasing the resilience of such systems. This study proposes a novel approach for detecting bursts in WDSs based on graph convolutional neural networks. The proposed GCN-based models are developed to detect abnormal events, using as input the generated pressure and flow rate data at some metering locations to build a graph structure. The position of the sensors that
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Funding
This study has been partially funded by the project “TESES-Urb - Techno-economic methodologies to investigate sustainable energy scenarios at urban level” of the Free University of Bozen-Bolzano, Italy, and by the project DIADEM “Data driven anomaly detection for sustainable water and energy smart grids management” of the Free University of Bozen-Bolzano, Italy.
References (55)
- et al. Detecting indicators of cognitive impairment via graph convolutional networks. Engineering Applications of Artificial Intelligence (2020).
- et al. Decision support system for water distribution systems based on neural networks and graphs theory for leakage detection. Expert Systems with Applications (2012).
- et al. Improve leakage management to reach sustainable water supply networks through by green energy systems. Optimized case study. Sustainable Cities and Society (2022).
- et al. Hybrid regression model for near real-time urban water demand forecasting. Journal of Computational and Applied Mathematics (2017).
- et al. Optimal assets management of a water distribution network for leakage minimization based on an innovative index. Sustainable Cities and Society (2020).
- et al. Predictive models for forecasting hourly urban water demand. Journal of Hydrology (2010).
- et al. Re-engineering traditional urban water management practices with smart metering and informatics. Environmental Modelling & Software (2018).
- et al. Towards a smart water city: A comprehensive review of applications, data requirements, and communication technologies for integrated management. Sustainable Cities and Society (2022).
- et al. An urban planning sustainability framework: Systems approach to blue green urban design. Sustainable Cities and Society (2021).
- Deep learning in neural networks: An overview. Neural Networks (2015).
- Enhancing water system models by integrating big data. Sustainable Cities and Society.
- Improvement of the energy efficiency in water systems through water losses reduction using the district metered area (DMA) approach. Sustainable Cities and Society.
- Guardhealth: Blockchain empowered secure data management and graph convolutional network enabled anomaly detection in smart healthcare. Journal of Parallel and Distributed Computing.
- Graph neural networks: A review of methods and applications. AI Open.
- Graph convolutional networks based contamination source identification across water distribution networks. Process Safety and Environmental Protection.
- Leakage detection in a real distribution network using a SOM. Urban Water Journal.
- Better water quality and higher energy efficiency by using model predictive flow control at water supply systems. Journal of Water Supply: Research and Technology-Aqua.
- Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks.
- Water network design by MINLP. Rep. No. RC24495.
- Geometric deep learning: going beyond euclidean data. IEEE Signal Processing Magazine.
- Review of current technologies and proposed intelligent methodologies for water distributed network leakage detection. IEEE Access.
- Keras.
- Leakage detection in water distribution networks using hybrid feedforward artificial neural networks. Journal of Water Supply: Research and Technology-Aqua.
- Detection of multiple leakage points in water distribution networks based on convolutional neural networks. Water Science and Technology: Water Supply.
- Graph neural networks in tensorflow and keras with spektral.
- Mining graph-Fourier transform time series for anomaly detection of internet traffic at core and metro networks. IEEE Access.