Rough K-means Incremental Clustering Algorithm Considering Neighborhood Belonging Information of Boundary Samples
Affiliation:

1.Nanjing University of Finance and Economics;2.Nanjing University of Posts and Telecommunications

CLC number:

TP18

Fund Project:

The National Natural Science Foundation of China (General Program, Key Program, Major Research Plan); the Natural Science Foundation of Jiangsu Province

    Abstract:

    Improving the quality of incremental clustering depends on how new data are assigned to clusters on the basis of the existing clustering results. Existing incremental clustering algorithms mostly consider the location distribution of the newly added data and ignore the cluster membership of the points in its neighborhood. Building on the rough K-Means clustering algorithm, and targeting the uncertainty of new data points that fall into the boundary regions of existing clusters, this paper proposes a rough K-Means incremental clustering algorithm based on neighborhood belonging information. The algorithm considers both the location distribution of a new sample in the boundary region and the cluster membership of its neighboring data points, which makes the measure of how strongly the new point belongs to each cluster more reasonable. In addition, during incremental clustering, clusters are merged or split according to the changes in cluster structure caused by the new data, so that the partition adjusts adaptively. Comparative experiments on artificial data sets and UCI standard data sets verify the effectiveness of the algorithm.
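    The abstract does not give the belonging measure itself, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a distance-ratio rule for deciding whether a new point lies in a boundary region (a common rough K-Means convention) and a weighted sum of centroid closeness and the cluster labels of the k nearest existing points. The names `rough_assign`, `hybrid_belonging`, `place_new_point`, and the parameters `ratio`, `k`, and `alpha` are all hypothetical. The merge and split steps are not covered.

```python
import numpy as np


def rough_assign(x, centroids, ratio=1.3):
    """Find the nearest centroid and every centroid almost as close.

    Assumption: a cluster is a candidate when its distance is within
    `ratio` times the smallest distance (ratio-style rough K-Means rule).
    """
    dist = np.linalg.norm(centroids - x, axis=1)
    nearest = int(np.argmin(dist))
    candidates = [j for j in range(len(dist)) if dist[j] <= ratio * dist[nearest]]
    return nearest, candidates


def hybrid_belonging(x, centroids, data, labels, candidates, k=3, alpha=0.5):
    """Score candidate clusters for a boundary point (hypothetical measure).

    Mixes normalized inverse distance to each candidate centroid with the
    share of the k nearest existing points labelled with that cluster.
    """
    cand = np.asarray(candidates)
    inv = 1.0 / (np.linalg.norm(centroids[cand] - x, axis=1) + 1e-12)
    closeness = inv / inv.sum()
    nn = np.argsort(np.linalg.norm(data - x, axis=1))[:k]
    share = np.array([np.mean(labels[nn] == j) for j in cand])
    score = alpha * closeness + (1.0 - alpha) * share
    return {int(j): float(s) for j, s in zip(cand, score)}


def place_new_point(x, centroids, data, labels, ratio=1.3, k=3, alpha=0.5):
    """Return the region of a new point and its per-cluster scores."""
    nearest, candidates = rough_assign(x, centroids, ratio)
    if len(candidates) == 1:
        # Only one plausible cluster: treat the point as a certain member.
        return "lower", {nearest: 1.0}
    return "boundary", hybrid_belonging(x, centroids, data, labels, candidates, k, alpha)


# Toy data: cluster 0 stretches toward the gap, cluster 1 stays compact.
data = np.array([[0.0, 0.0], [1.5, 0.0], [1.6, 0.1], [1.7, -0.1],
                 [4.0, 0.0], [4.5, 0.0], [3.9, 0.2], [4.2, -0.1]])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
centroids = np.array([[0.0, 0.0], [4.0, 0.0]])  # illustrative, not fitted

region, scores = place_new_point(np.array([2.1, 0.0]), centroids, data, labels)
print(region, scores)
```

    In this toy case the new point is slightly closer to the centroid of cluster 1, but its three nearest existing points all belong to cluster 0, so the combined score favors cluster 0. This illustrates why neighborhood membership can change the outcome relative to a purely position-based rule.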

History
  • Received: 2021-04-12
  • Last revised: 2021-07-20
  • Accepted: 2021-07-29