Finding and Tracking Multi-Density Clusters in Online Dynamic Data Streams

Change is one of the biggest challenges in dynamic stream mining. From a data-mining perspective, adapting to and tracking change is desirable in order to understand how and why change has occurred. Clustering, a form of unsupervised learning, can be used to identify the underlying patterns in a stream...

Main author: Fahy, Conor
Co-author: Yang, Shengxiang
Format: BB
Language: English
Publication info: IEEE Xplore 2020
Subjects:
Online access: http://tailieuso.tlu.edu.vn/handle/DHTL/9783
id oai:localhost:DHTL-9783
record_format dspace
spelling oai:localhost:DHTL-9783
2020-11-25T08:18:08Z
Finding and Tracking Multi-Density Clusters in Online Dynamic Data Streams
Fahy, Conor
Yang, Shengxiang
Data stream clustering
multi-density clustering
concept drift
concept evolution
swarm intelligence
change detection
https://doi.org/10.1109/TBDATA.2019.2922969
2020-11-25T08:17:20Z
2020-11-25T08:17:20Z
2019
BB
http://tailieuso.tlu.edu.vn/handle/DHTL/9783
en
IEEE Transactions on Big Data, (2019), pp 15, issue 99
application/pdf
IEEE Xplore
institution Trường Đại học Thủy Lợi
collection DSpace
language English
topic Data stream clustering
multi-density clustering
concept drift
concept evolution
swarm intelligence
change detection
description Change is one of the biggest challenges in dynamic stream mining. From a data-mining perspective, adapting to and tracking change is desirable in order to understand how and why change has occurred. Clustering, a form of unsupervised learning, can be used to identify the underlying patterns in a stream. Density-based clustering identifies clusters as areas of high density separated by areas of low density. This paper proposes a Multi-Density Stream Clustering (MDSC) algorithm to address two problems: the multi-density problem and the problem of discovering and tracking changes in a dynamic stream. MDSC consists of two on-line components: a set of discovered, labelled clusters and an outlier buffer. Incoming points are assigned to a live cluster or passed to the outlier buffer. New clusters are discovered in the buffer using an ant-inspired swarm intelligence approach. Each newly discovered cluster is uniquely labelled and added to the set of live clusters. Processed data is subject to an ageing function and disappears when it is no longer relevant. MDSC is shown to compare favourably with state-of-the-art peer stream-clustering algorithms on a range of real and synthetic data streams. Experimental results suggest that MDSC can discover qualitatively useful patterns while being scalable and robust to noise.
author2 Yang, Shengxiang
author_facet Yang, Shengxiang
Fahy, Conor
format BB
author Fahy, Conor
author_sort Fahy, Conor
title Finding and Tracking Multi-Density Clusters in Online Dynamic Data Streams
publisher IEEE Xplore
publishDate 2020
url http://tailieuso.tlu.edu.vn/handle/DHTL/9783
work_keys_str_mv AT fahyconor findingandtrackingmultidensityclustersinonlinedynamicdatastreams
_version_ 1768590412731121664
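
The description field outlines a two-component online loop: each incoming point is assigned to a live cluster or sent to an outlier buffer, new clusters are discovered in the buffer, and processed data ages out. The following is a minimal Python sketch of that control flow only, not the authors' MDSC implementation. Every name and parameter here (`StreamSketch`, `radius`, `decay`, `min_weight`, `min_points`) is hypothetical, the ageing rule is assumed to be a simple exponential decay, and the ant-inspired discovery step is replaced by a plain neighbour count. The fixed radius also means the sketch does not handle clusters of differing density.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Cluster:
    label: int
    centre: tuple
    radius: float
    weight: float = 1.0


@dataclass
class StreamSketch:
    radius: float = 1.0       # hypothetical fixed assignment radius
    decay: float = 0.9        # hypothetical per-step ageing factor
    min_weight: float = 0.1   # clusters lighter than this are dropped
    min_points: int = 3       # buffer points needed to seed a cluster
    clusters: list = field(default_factory=list)
    buffer: list = field(default_factory=list)
    next_label: int = 0

    def process(self, point):
        """Handle one incoming point; return its cluster label or None."""
        self._age()
        for c in self.clusters:
            if math.dist(point, c.centre) <= c.radius:
                c.weight += 1.0   # reinforce the live cluster
                return c.label
        # No live cluster claims the point, so it goes to the outlier buffer.
        self.buffer.append(point)
        return self._discover(point)

    def _age(self):
        # Assumed ageing rule: exponential decay, then prune stale clusters.
        for c in self.clusters:
            c.weight *= self.decay
        self.clusters = [c for c in self.clusters if c.weight >= self.min_weight]

    def _discover(self, seed):
        # Stand-in for the swarm-based discovery step: count buffer
        # neighbours of the newest point and promote them if numerous enough.
        near = [p for p in self.buffer if math.dist(p, seed) <= self.radius]
        if len(near) < self.min_points:
            return None
        centre = tuple(sum(xs) / len(near) for xs in zip(*near))
        cluster = Cluster(self.next_label, centre, self.radius, float(len(near)))
        self.next_label += 1
        self.clusters.append(cluster)
        self.buffer = [p for p in self.buffer if p not in near]
        return cluster.label
```

Calling `process` once per arriving point returns either the label of the cluster that absorbed it or `None` when the point stays in the buffer. Clusters that stop receiving points lose weight at every step and are eventually removed.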