Search for "[R.] [State]" found 5 publications
    Digital, Applied Computer Science

    Contribution (edited volume or conference proceedings)

    Patrick Glauner, P. Valtchev, R. State

    Impact of Biases in Big Data

    Proceedings of the 26th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2018) [April 27-29, 2018; Bruges, Belgium]

    The underlying paradigm of big data-driven machine learning reflects the desire to derive better conclusions from simply analyzing more data, without the necessity of looking at theory and models. Is having simply more data always helpful? In 1936, The Literary Digest collected 2.3M filled-in questionnaires to predict the outcome of that year's US presidential election. The outcome of this big data prediction proved to be entirely wrong, whereas George Gallup only needed 3K handpicked people to make an accurate prediction. Generally, biases occur in machine learning whenever the distributions of the training set and the test set are different. In this work, we provide a review of different sorts of biases in (big) data sets in machine learning. We provide definitions and discussions of the most commonly appearing biases in machine learning: class imbalance and covariate shift. We also show how these biases can be quantified and corrected. This work is an introductory text for both researchers and practitioners to become more aware of this topic and thus to derive more reliable models for their learning problems.

    Digital, Applied Computer Science

    Contribution (edited volume or conference proceedings)

    Patrick Glauner, A. Boechat, L. Dolberg, R. State, F. Bettinger, Y. Rangoni, D. Duarte

    Large-scale detection of non-technical losses in imbalanced data sets

    Proceedings of the 2016 Seventh IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT 2016) [September 6-9, 2016; Minneapolis, MN, USA]

    DOI: 10.1109/ISGT.2016.7781159

    Non-technical losses (NTL) such as electricity theft cause significant harm to our economies, as in some countries they may range up to 40% of the total electricity distributed. Detecting NTLs requires costly on-site inspections. Accurate prediction of NTLs for customers using machine learning is therefore crucial. To date, related research has largely ignored that the two classes of regular and non-regular customers are highly imbalanced and that NTL proportions may change. It has also mostly considered small data sets, which often prevents the results from being deployed in production. In this paper, we present a comprehensive approach to assess three NTL detection models for different NTL proportions in large real-world data sets of 100Ks of customers: Boolean rules, fuzzy logic and Support Vector Machine. This work has resulted in appreciable results that are about to be deployed in a leading industry solution. We believe that the considerations and observations made in this contribution are necessary for future smart meter research in order to report their effectiveness on imbalanced and large real-world data sets.

    Digital, Applied Computer Science


    Patrick Glauner, J. Meira, P. Valtchev, R. State, F. Bettinger

    The Challenge of Non-Technical Loss Detection Using Artificial Intelligence: A Survey

    International Journal of Computational Intelligence Systems, vol. 10, no. 1, pp. 760-775

    DOI: 10.2991/ijcis.2017.10.1.51

    Detection of non-technical losses (NTL), which include electricity theft, faulty meters or billing errors, has attracted increasing attention from researchers in electrical engineering and computer science. NTLs cause significant harm to the economy, as in some countries they may range up to 40% of the total electricity distributed. The predominant research direction is employing artificial intelligence to predict whether a customer causes NTL. This paper first provides an overview of how NTLs are defined and their impact on economies, which includes loss of revenue and profit of electricity providers and decrease of the stability and reliability of electrical power grids. It then surveys the state-of-the-art research efforts in an up-to-date and comprehensive review of algorithms, features and data sets used. It finally identifies the key scientific and engineering challenges in NTL detection and suggests how they could be addressed in the future.

    Digital, Applied Computer Science

    Contribution (edited volume or conference proceedings)

    Patrick Glauner, N. Dahringer, O. Puhachov, J. Meira, P. Valtchev, R. State, D. Duarte

    Identifying Irregular Power Usage by Turning Predictions into Holographic Spatial Visualizations

    Proceedings of the 2017 IEEE International Conference on Data Mining Workshops (ICDMW 2017) [November 18-21, 2017; New Orleans, LA, USA]

    DOI: 10.1109/ICDMW.2017.40

    Power grids are critical infrastructure assets that face non-technical losses (NTL) such as electricity theft or faulty meters. NTL may range up to 40% of the total electricity distributed in emerging countries. Industrial NTL detection systems are still largely based on expert knowledge when deciding whether to carry out costly on-site inspections of customers. Electricity providers are reluctant to move to large-scale deployments of automated systems that learn NTL profiles from data due to the latter's propensity to suggest a large number of unnecessary inspections. In this paper, we propose a novel system that combines automated statistical decision making with expert knowledge. First, we propose a machine learning framework that classifies customers into NTL or non-NTL using a variety of features derived from the customers' consumption data. The methodology used is specifically tailored to the level of noise in the data. Second, in order to allow human experts to feed their knowledge into the decision loop, we propose a method for visualizing prediction results at various granularity levels in a spatial hologram. Our approach allows domain experts to put the classification results into the context of the data and to incorporate their knowledge when making the final decisions on which customers to inspect. This work has resulted in appreciable results on a real-world data set of 3.6M customers. Our system is being deployed in commercial NTL detection software.

    Digital, Healthy, Applied Computer Science

    Contribution (edited volume or conference proceedings)

    L. Trestioreanu, Patrick Glauner, J. Meira, M. Gindt, R. State

    Using Augmented Reality and Machine Learning in Radiology

    Innovative Technologies for Market Leadership: Investing in the Future

    ISBN: 978-3-030-41308-8