Finding and Tracking Multi-Density Clusters in Online Dynamic Data Streams
Change is one of the biggest challenges in dynamic stream mining. From a data-mining perspective, adapting and tracking change is desirable in order to understand how and why change has occurred. Clustering, a form of unsupervised learning, can be used to identify the underlying patterns in a stream...
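Main author: | Fahy, Conor |
---|---|
Co-author: | Yang, Shengxiang |
Format: | BB |
Language: | English |
Published: | IEEE Xplore, 2020 |
Subjects: | Data stream clustering; multi-density clustering; concept drift; concept evolution; swarm intelligence; change detection |
Online access: | http://tailieuso.tlu.edu.vn/handle/DHTL/9783 |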
id | oai:localhost:DHTL-9783 |
---|---|
record_format | dspace |
spelling | oai:localhost:DHTL-9783; 2020-11-25T08:18:08Z; Finding and Tracking Multi-Density Clusters in Online Dynamic Data Streams; Fahy, Conor; Yang, Shengxiang; Data stream clustering; multi-density clustering; concept drift; concept evolution; swarm intelligence; change detection; Change is one of the biggest challenges in dynamic stream mining. From a data-mining perspective, adapting to and tracking change is desirable in order to understand how and why change has occurred. Clustering, a form of unsupervised learning, can be used to identify the underlying patterns in a stream. Density-based clustering identifies clusters as areas of high density separated by areas of low density. This paper proposes a Multi-Density Stream Clustering (MDSC) algorithm to address two problems: the multi-density problem and the problem of discovering and tracking changes in a dynamic stream. MDSC consists of two on-line components: discovered, labelled clusters and an outlier buffer. Incoming points are assigned to a live cluster or passed to the outlier buffer. New clusters are discovered in the buffer using an ant-inspired swarm intelligence approach. The newly discovered cluster is uniquely labelled and added to the set of live clusters. Processed data is subject to an ageing function and will disappear when it is no longer relevant. MDSC is shown to compare favourably with state-of-the-art peer stream-clustering algorithms on a range of real and synthetic data-streams. Experimental results suggest that MDSC can discover qualitatively useful patterns while being scalable and robust to noise.; https://doi.org/10.1109/TBDATA.2019.2922969; 2020-11-25T08:17:20Z; 2020-11-25T08:17:20Z; 2019; BB; http://tailieuso.tlu.edu.vn/handle/DHTL/9783; en; IEEE Transactions on Big Data, (2019), pp 15, issue 99; application/pdf; IEEE Xplore |
institution | Trường Đại học Thủy Lợi |
collection | DSpace |
language | English |
topic | Data stream clustering; multi-density clustering; concept drift; concept evolution; swarm intelligence; change detection |
spellingShingle | Data stream clustering; multi-density clustering; concept drift; concept evolution; swarm intelligence; change detection; Fahy, Conor; Finding and Tracking Multi-Density Clusters in Online Dynamic Data Streams |
description | Change is one of the biggest challenges in dynamic stream mining. From a data-mining perspective, adapting to and tracking change is desirable in order to understand how and why change has occurred. Clustering, a form of unsupervised learning, can be used to identify the underlying patterns in a stream. Density-based clustering identifies clusters as areas of high density separated by areas of low density. This paper proposes a Multi-Density Stream Clustering (MDSC) algorithm to address two problems: the multi-density problem and the problem of discovering and tracking changes in a dynamic stream. MDSC consists of two on-line components: discovered, labelled clusters and an outlier buffer. Incoming points are assigned to a live cluster or passed to the outlier buffer. New clusters are discovered in the buffer using an ant-inspired swarm intelligence approach. The newly discovered cluster is uniquely labelled and added to the set of live clusters. Processed data is subject to an ageing function and will disappear when it is no longer relevant. MDSC is shown to compare favourably with state-of-the-art peer stream-clustering algorithms on a range of real and synthetic data-streams. Experimental results suggest that MDSC can discover qualitatively useful patterns while being scalable and robust to noise. |
author2 | Yang, Shengxiang |
author_facet | Yang, Shengxiang; Fahy, Conor |
format | BB |
author | Fahy, Conor |
author_sort | Fahy, Conor |
title | Finding and Tracking Multi-Density Clusters in Online Dynamic Data Streams |
title_short | Finding and Tracking Multi-Density Clusters in Online Dynamic Data Streams |
title_full | Finding and Tracking Multi-Density Clusters in Online Dynamic Data Streams |
title_fullStr | Finding and Tracking Multi-Density Clusters in Online Dynamic Data Streams |
title_full_unstemmed | Finding and Tracking Multi-Density Clusters in Online Dynamic Data Streams |
title_sort | finding and tracking multi-density clusters in online dynamic data streams |
publisher | IEEE Xplore |
publishDate | 2020 |
url | http://tailieuso.tlu.edu.vn/handle/DHTL/9783 |
work_keys_str_mv | AT fahyconor findingandtrackingmultidensityclustersinonlinedynamicdatastreams |
_version_ | 1768590412731121664 |
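
The description above outlines a flow in which incoming points are assigned to a live cluster or passed to an outlier buffer, and processed data fades under an ageing function. The sketch below is one possible reading of that flow, not the MDSC algorithm itself. The names `LiveCluster`, `StreamSketch`, `aged_weight`, `decay_rate`, and `min_weight` are hypothetical, the exponential form of the ageing function and the fixed-radius assignment rule are assumptions not stated in the record, and the ant-inspired discovery of new clusters in the buffer is left out entirely.

```python
import math
from dataclasses import dataclass, field


@dataclass
class LiveCluster:
    label: int
    centre: tuple
    radius: float            # assumed per-cluster radius; a multi-density method would set this per cluster
    weight: float = 1.0
    last_update: float = 0.0


def aged_weight(weight, elapsed, decay_rate=0.1):
    """Fade a weight exponentially with elapsed time (assumed form, not from the paper)."""
    return weight * 2.0 ** (-decay_rate * elapsed)


@dataclass
class StreamSketch:
    clusters: list = field(default_factory=list)
    outlier_buffer: list = field(default_factory=list)
    min_weight: float = 0.05  # hypothetical relevance threshold

    def process(self, point, now):
        # Age every live cluster, then drop those that are no longer relevant.
        for c in self.clusters:
            c.weight = aged_weight(c.weight, now - c.last_update)
            c.last_update = now
        self.clusters = [c for c in self.clusters if c.weight >= self.min_weight]

        # Assign the point to the nearest live cluster whose radius covers it.
        best, best_dist = None, math.inf
        for c in self.clusters:
            d = math.dist(point, c.centre)
            if d <= c.radius and d < best_dist:
                best, best_dist = c, d
        if best is not None:
            best.weight += 1.0
            return best.label

        # Otherwise hold the point in the outlier buffer for later cluster discovery.
        self.outlier_buffer.append((point, now))
        return None
```

Calling `process` with a point and a timestamp returns the label of the cluster that absorbed the point, or `None` when the point was buffered. Giving each cluster its own `radius` is a loose stand-in for the multi-density idea: a dense cluster can use a small radius while a sparse one uses a larger radius.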