SDDM: an interpretable statistical concept drift detection method for data streams
Date
2021
Publisher
ProQuest Central
Abstract
Machine learning models assume that data is drawn from a stationary distribution. However,
in practice, challenges are imposed on models that need to make sense of fast-evolving data
streams, whose content changes over time. A difference between
the distribution of the training data seen so far and the distribution of newly arriving data is
called concept drift. Detecting concept drift is essential to maintaining the accuracy and reliability of online classifiers. Reactive drift detectors monitor the performance of
the underlying machine learning model: to detect a drift, the detector must receive feedback on the classifier
output, a setup known as prequential evaluation. In many real-life scenarios, immediate feedback on classifier output is not possible, so drift detection
is delayed and loses its context. Moreover, the output of such a drift detector is a
binary answer: either there is a drift or there is not. However, it is equally important to explain the source
of drift. In this paper, we present the Statistical Drift Detection Method (SDDM) which can
detect drifts by monitoring the change of data distribution without the need for feedback on
classifier output. Moreover, the detection is quantified and the source of drift is identified.
We empirically evaluate our method against the state of the art on both synthetic and real-life
data sets. SDDM outperforms other related approaches by producing fewer
false positives and false negatives.
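The idea of detecting drift from the data distribution alone, with a quantified score and an identified source, can be illustrated with a minimal sketch. This is not the SDDM algorithm from the paper: the choice of Jensen-Shannon divergence, the equal-width histogram binning, the threshold of 0.1, and all function names are assumptions made for illustration only.

```python
import math
import random

def histogram(values, lo, hi, bins):
    """Relative frequencies of values over equal-width bins on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # guard against a constant feature
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 for identical distributions, at most 1."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def drift_scores(reference, current, bins=10):
    """One divergence score per feature (column) between two windows of rows."""
    scores = []
    for ref_col, cur_col in zip(zip(*reference), zip(*current)):
        lo = min(min(ref_col), min(cur_col))
        hi = max(max(ref_col), max(cur_col))
        p = histogram(ref_col, lo, hi, bins)
        q = histogram(cur_col, lo, hi, bins)
        scores.append(js_divergence(p, q))
    return scores

def detect_drift(reference, current, threshold=0.1, bins=10):
    """Return the per-feature scores and the indices of features above the threshold."""
    scores = drift_scores(reference, current, bins)
    drifted = [j for j, s in enumerate(scores) if s > threshold]
    return scores, drifted

# Synthetic example: only the second feature (index 1) shifts its mean.
rng = random.Random(0)
reference = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(2000)]
current = [[rng.gauss(0, 1), rng.gauss(3, 1), rng.gauss(0, 1)] for _ in range(2000)]
scores, drifted = detect_drift(reference, current)
print(scores, drifted)
```

No classifier labels are used anywhere in this sketch. The per-feature scores quantify how strongly each distribution changed, and the list of indices above the threshold points to where the change occurred.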
Keywords
Online machine learning · Concept drift detection · Data streams analytics
Citation
Awad, A. (2021) “SDDM: an interpretable statistical concept drift detection method for data streams,” Journal of Intelligent Information Systems, 56(3), pp. 459–484.