Entropic Weights
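The weights assigned to different features can be optimized using the differences in the features' entropy, so that features whose values are more dispersed across samples receive a larger share of the weight. Under Eq. (5) below, such features are the ones with a lower normalized entropy $H_j$, because the weight is proportional to $1 - H_j$. Intuitively, a feature dimension with larger differences between samples is better at distinguishing between classes. The entropy-based weighting formulas are summarized below.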
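Suppose the dataset contains m samples and n features, and let $x_{ij}$ be the value of the j-th feature in the i-th sample. Features measured in different units are not directly comparable, so each feature is first standardized using the equations of relative optimum membership degree. There are two ways to standardize the feature values. In both, the maximum or minimum is taken over the m samples of feature j, consistent with the sums over i in Eqs. (3) and (4).

$$r'_{ij} = \frac{x_{ij}}{\max_{i} x_{ij}}, \quad (i = 1, \dots, m;\ j = 1, \dots, n) \tag{1}$$

$$r'_{ij} = \frac{\min_{i} x_{ij}}{x_{ij}}, \quad \min_{i} x_{ij} \neq 0, \quad (i = 1, \dots, m;\ j = 1, \dots, n) \tag{2}$$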
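The two standardization rules can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function name `standardize` and the flag `larger_is_better` are hypothetical, the data are assumed to be strictly positive, and the maximum and minimum are assumed to be taken over the samples of each feature. The flag name reflects the common reading of Eq. (1) as the larger-is-better case and Eq. (2) as the smaller-is-better case, which the text above does not state explicitly.

```python
def standardize(X, larger_is_better=True):
    """Standardize an m x n matrix (rows = samples, columns = features).

    Assumes every value is strictly positive (an assumption of this sketch).
    """
    m, n = len(X), len(X[0])
    R = [[0.0] * n for _ in range(m)]
    for j in range(n):
        column = [X[i][j] for i in range(m)]
        if larger_is_better:
            col_max = max(column)
            for i in range(m):
                R[i][j] = X[i][j] / col_max   # Eq. (1)
        else:
            col_min = min(column)
            for i in range(m):
                R[i][j] = col_min / X[i][j]   # Eq. (2)
    return R


X = [[2.0, 10.0],
     [4.0, 40.0]]
print(standardize(X, larger_is_better=True))
print(standardize(X, larger_is_better=False))
```

In both cases the standardized values fall in the interval (0, 1], with the best sample of each feature mapped to 1.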
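Next, compute the entropy of each feature according to the definition of entropy. The entropy of the j-th feature is given by Eq. (3):

$$H_j = -\frac{\sum_{i=1}^{m} f_{ij} \ln f_{ij}}{\ln m}, \quad (j = 1, \dots, n) \tag{3}$$

where

$$f_{ij} = \frac{r'_{ij}}{\sum_{i=1}^{m} r'_{ij}}, \quad (i = 1, \dots, m;\ j = 1, \dots, n) \tag{4}$$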
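As a rough sketch of Eqs. (3) and (4), the entropy of a single feature can be computed from its column of standardized values. The function name `feature_entropy` is illustrative. The sketch assumes the column holds at least two samples and has a positive sum, and it treats a term with $f_{ij} = 0$ as contributing 0, a convention the text does not state.

```python
import math


def feature_entropy(r_column):
    """Normalized entropy H_j of one feature, given its standardized values."""
    m = len(r_column)                      # number of samples, assumed >= 2
    total = sum(r_column)
    f = [r / total for r in r_column]      # Eq. (4): proportions that sum to 1
    # Assumption of this sketch: terms with f == 0 are skipped (0 * ln 0 := 0).
    s = sum(p * math.log(p) for p in f if p > 0)
    return -s / math.log(m)                # Eq. (3): divide by ln m


print(feature_entropy([1.0, 1.0, 1.0, 1.0]))  # identical values
print(feature_entropy([1.0, 0.5]))            # values that differ
```

A column whose values are all identical gives an entropy of approximately 1, the maximum, while a column whose values differ gives a smaller entropy.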
Finally, compute the entropy weight of each feature. The entropy weight of the j-th feature is given by Eq. (5):

$$w_j = \frac{1 - H_j}{n - \sum_{j=1}^{n} H_j}, \quad \sum_{j=1}^{n} w_j = 1, \quad (j = 1, \dots, n) \tag{5}$$
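Eq. (5) can be illustrated with a few lines of Python. This is only a sketch under the assumption that the entropies have already been computed and that at least one of them is below 1. The function name `entropy_weights` and the sample entropies are made up for illustration.

```python
def entropy_weights(H):
    """Entropy weights w_j from a list of feature entropies H_j, per Eq. (5)."""
    n = len(H)
    denom = n - sum(H)
    if denom == 0:
        # Every entropy equals 1: no feature carries information, so Eq. (5) is undefined.
        raise ValueError("all entropies equal 1; weights are undefined")
    return [(1.0 - h) / denom for h in H]


# Hypothetical entropies for three features.
weights = entropy_weights([0.9, 0.8, 0.7])
print(weights)
print(sum(weights))
```

With these sample entropies the weights are roughly 1/6, 1/3 and 1/2, so the feature with the lowest entropy receives the largest weight and the weights sum to 1.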
References
[1] Yufeng WANG, Ming DAI. Safety assessment based on information entropy and unascertained theory. Shanxi Coal, 2009(5):8-10.
[2] Li, Xiangxin, et al. "Application of the entropy weight and TOPSIS method in safety evaluation of coal mines." Procedia engineering 26 (2011): 2085-2091.