The National Natural Science Foundation of China (General Program, Key Program, Major Research Plan)
Many application scenarios require the clusters to be of roughly equal size, which the traditional fuzzy C-means (FCM) clustering algorithm cannot guarantee. To address this, we use label information to construct a label distribution entropy that measures how balanced a clustering is. The label distribution entropy and the squared loss between the fuzzy membership matrix and the label matrix are then introduced jointly into the traditional FCM objective, yielding a fuzzy C-means balanced clustering method with label distribution entropy regularization (FCMLDE). An optimization algorithm for the proposed model is designed using an iterative strategy and the augmented Lagrange multiplier method. Finally, clustering experiments on six real-world data sets show that the proposed method has advantages in both clustering performance and balance performance.
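To make the notion of a balance measure concrete, the following minimal sketch scores a clustering by the entropy of its cluster-size proportions. The abstract does not give the exact definition of the label distribution entropy used in FCMLDE, so the normalized Shannon entropy here is an assumption, and the names `label_distribution_entropy` and `harden` are hypothetical helpers introduced only for illustration.

```python
import math
from collections import Counter


def harden(memberships):
    """Turn a fuzzy membership matrix (one row per sample) into hard labels
    by picking the cluster with the largest membership in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]


def label_distribution_entropy(labels, n_clusters):
    """Normalized Shannon entropy of the cluster-size proportions.

    ASSUMPTION: this is an illustrative definition, not necessarily the one
    used in the paper. Returns 1.0 for perfectly balanced clusters and 0.0
    when every sample falls into a single cluster.
    """
    if n_clusters < 2:
        raise ValueError("n_clusters must be at least 2")
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    n = len(labels)
    entropy = 0.0
    for k in range(n_clusters):
        p = counts.get(k, 0) / n
        if p > 0:  # empty clusters contribute nothing (0 * log 0 := 0)
            entropy -= p * math.log(p)
    return entropy / math.log(n_clusters)


# Two samples lean toward cluster 0, two toward cluster 1.
U = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8], [0.4, 0.6]]
balance = label_distribution_entropy(harden(U), n_clusters=2)
```

Under this assumed definition, a higher value indicates a more balanced partition, so a regularizer of this kind would reward memberships that spread samples evenly across clusters.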