Supervised learning requires labeled data. However, labels may not always be available, or creating a labeled dataset may be costly. Even when the data is labeled, the labeling is often inconsistent, incomplete, or inaccurate. If the data changes over time, a model also needs to be retrained periodically. A machine learning model therefore needs to learn from data "in the wild", not just from an initial training dataset. This problem can be addressed by techniques that combine clustering and classification with user feedback. This paper describes one such technique in the form of a pattern: Incremental Analysis. The target audience includes developers with little experience using machine learning in dynamic environments. This is the first of a number of planned papers on patterns for machine learning.

Labeling data, Machine learning, Patterns
24th European Conference on Pattern Languages of Programs, EuroPLoP 2019
Department of Systems and Computer Engineering

Weiss, M., Muegge, S., & Nazari, A. (Ali). (2019). Incremental analysis in machine learning. In ACM International Conference Proceeding Series. doi:10.1145/3361149.3361150