Anomaly Detection for Reoccurring Concept Drift in Smart Environments

Agate V.; Drago S.; Ferraro P.; Lo Re G.
2022-12-14

Abstract

Many crowdsensing applications today rely on learning algorithms applied to data streams to accurately classify information and events of interest in smart environments. Unfortunately, the statistical properties of the input data may change in unexpected ways. As a result, the definition of anomalous and normal data can vary over time, and machine learning models may need to be retrained incrementally. This problem is known as concept drift, and it has often been ignored by anomaly detection systems, resulting in significant performance degradation. In addition, the statistical distribution of past data often tends to repeat itself, so old learning models can be reused, avoiding costly retraining on new data that would waste computational and energy resources. In this paper, we propose a hybrid anomaly detection system for streaming data in smart environments that accounts for concept drift and minimizes the number of machine learning models that need to be retrained when shifts in the incoming data distribution are detected. The system is multi-tier and relies on two different concept drift detection modules and an ensemble of anomaly detection models. An extensive experimental evaluation has been carried out using two real datasets and a synthetic one; the results show that the system achieves high performance on common metrics such as F1-score and accuracy.
14 Dec 2022
Sector ING-INF/05 - Information Processing Systems
978-1-6654-6457-4
Agate V., Drago S., Ferraro P., Lo Re G. (2022). Anomaly Detection for Reoccurring Concept Drift in Smart Environments. In Proceedings - 2022 18th International Conference on Mobility, Sensing and Networking, MSN 2022 (pp. 113-120). Institute of Electrical and Electronics Engineers Inc. [10.1109/MSN57253.2022.00031].
Files in this record:
Anomaly_Detection_for_Reoccurring_Concept_Drift_in_Smart_Environments.pdf
Access: archive managers only
Description: paper + TOC
Type: publisher's version
Size: 2.99 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/588276