Clustering with Mclust(): extracting the cluster assignments (R)

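I am using the mclust::Mclust() function to cluster a small dataset. However, I am struggling to extract the cluster classification for each observation so that I can add it to the dataset.

Here is the data: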

df <- structure(list(latitud = c(-43.8189010620117,-34.2731018066406,-47.0666999816895,-35.7543983459473,-47.1413993835449,-36.6260986328125,-37.2118988037109,-33.3086013793945,-37.2792015075684,-35.4524993896484,-36.5856018066406,-44.6591987609863,-28.6996994018555,-48.1591987609863,-45.4000015258789,-29.94580078125,-30.4386005401611,-31.6646995544434,-51.2000007629395,-51.3328018188477,-51.25,-45.551700592041,-39.0144004821777,-38.6081008911133,-34.9844017028809,-32.8403015136719,-29.9953002929688,-18.3999996185303,-35.6169013977051,-35.9085998535156,-35.4068984985352,-32.7571983337402,-32.8502998352051,-33.5938987731934,-38.4303016662598,-38.6866989135742,-45.4057998657227,-37.5503005981445,-37.8997001647949,-38.0368995666504,-37.7047004699707,-37.7963981628418,-37.7092018127441,-31.5835990905762,-30.9242000579834,-38.2008018493652,-31.6881008148193,-31.8117008209229,-27.9747009277344,-30.7047004699707,-36.6500015258789,-34.4921989440918,-34.6581001281738,-47.3499984741211,-47.5,-33.7219009399414,-33.6613998413086,-35.5574989318848
),longitud = c(-72.38330078125,-71.371696472168,-72.8000030517578,-71.0864028930664,-72.7257995605469,-72.4891967773438,-72.3242034912109,-70.3572006225586,-71.9847030639648,-71.7332992553711,-71.5255966186523,-71.8082962036133,-70.5500030517578,-73.0888977050781,-72.5999984741211,-70.5327987670898,-71.002197265625,-71.2546997070312,-72.9332962036133,-73.1091995239258,-72.5167007446289,-72.0680999755859,-73.0828018188477,-72.8478012084961,-72.0100021362305,-71.0255966186523,-70.5867004394531,-70.3000030517578,-71.7677993774414,-71.2981033325195,-72.2082977294922,-70.736701965332,-70.5093994140625,-70.3792037963867,-72.0105972290039,-72.502799987793,-72.6231002807617,-72.5903015136719,-71.6239013671875,-71.4781036376953,-71.7683029174805,-71.6988983154297,-71.823600769043,-71.4606018066406,-70.7731018066406,-71.2988967895508,-71.2658004760742,-70.9302978515625,-69.997802734375,-70.9244003295898,-72.4499969482422,-71.3731002807617,-71.3019027709961,-72.8499984741211,-72.9749984741211,-71.5550003051758,-71.3371963500977,-71.7067031860352)),row.names = c(NA,-58L),class = c("tbl_df","tbl","data.frame"))

The clustering:

library(mclust)
d_clust <- Mclust(df)

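Now, when I run plot(d_clust), it shows all the plots. But that does not tell me which cluster each row belongs to. I went through the documentation and several related questions, including ones about Mclust(), but none of them solved my problem.

I am looking for something like this: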

| latitud | longitud | cluster_id |

By the way, class(d_clust) returns Mclust. If running d_clust on its own does not print a table or data frame that could be plotted, how is plot() able to draw d_clust?

a409788679 answered:

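When you run Mclust, it fits several different covariance models, each with several values of G (the number of clusters). So have a look at the BIC plot: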

[Figure: BIC plot]

Mclust simply picks the best model according to BIC and stores the choice in d_clust$modelName and d_clust$G.
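As a minimal sketch of how the selected model can be inspected: the built-in faithful dataset stands in for df here (an assumption made only so the snippet runs on its own), and fit is a hypothetical name playing the role of d_clust.

```r
library(mclust)

# Assumption: the built-in `faithful` data replaces df for illustration.
fit <- Mclust(faithful)

fit$modelName            # covariance model selected by BIC
fit$G                    # number of clusters selected by BIC
summary(fit)             # short overview of the selected model

# BIC for every model and every G that was tried
plot(fit, what = "BIC")
```

With the data from the question, the same calls on d_clust should show which model and which G were chosen.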

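Once you know which model was used (I think it is EVE with G = 4 in your case), the classification is already stored in the fitted object, and you can pull it out with: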

d_clust$classification
# or
results = data.frame(df,cluster=d_clust$classification)
head(results)
   latitud longitud cluster
1 -43.8189 -72.3833       1
2 -34.2731 -71.3717       2
3 -47.0667 -72.8000       1
4 -35.7544 -71.0864       3
5 -47.1414 -72.7258       1
6 -36.6261 -72.4892       3
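The same idea can be extended to produce the cluster_id column asked for in the question and to carry the classification uncertainty along. This is only a sketch: faithful again stands in for df, and the names fit and results are illustrative.

```r
library(mclust)

# Assumption: the built-in `faithful` data replaces df for illustration.
fit <- Mclust(faithful)

# One row per observation, with its cluster label and the
# uncertainty of that label attached as extra columns.
results <- data.frame(
  faithful,
  cluster_id  = fit$classification,
  uncertainty = fit$uncertainty
)

head(results)
table(results$cluster_id)   # number of observations in each cluster

# Posterior membership probabilities, one column per cluster
head(fit$z)
```

The rows of fit$classification, fit$uncertainty and fit$z are in the same order as the input data, which is why they can be bound to it column-wise.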

You can also plot the result:

with(results,plot(latitud,longitud,col=factor(cluster)))

[Figure: scatter plot of latitud against longitud, coloured by cluster]
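On the side question of how plot() can draw the result when printing it shows no table: the fitted object is a list with class Mclust, and plot() dispatches to the method that mclust provides for that class. A small sketch, again assuming faithful as stand-in data and fit as an illustrative name:

```r
library(mclust)

# Assumption: the built-in `faithful` data replaces df for illustration.
fit <- Mclust(faithful)

class(fit)     # "Mclust"
is.list(fit)   # the fit is a list of results, not a data frame
names(fit)     # components such as classification, z, parameters

# plot() uses the Mclust method; `what` selects a single plot
plot(fit, what = "classification")
plot(fit, what = "uncertainty")
```

Printing the object only shows a short summary, while the data needed for plotting sit in the list components.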

From there you can judge, for example, whether this clustering is appropriate at all and whether G = 4 is the right choice.
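One way to check the choice of G is to refit with a fixed number of clusters and compare. The sketch below assumes faithful as stand-in data and G = 2 as an arbitrary alternative; neither comes from the original answer.

```r
library(mclust)

# Assumption: the built-in `faithful` data replaces df for illustration.
fit_auto <- Mclust(faithful)          # model and G both chosen by BIC
fit_g2   <- Mclust(faithful, G = 2)   # G fixed at 2, model chosen by BIC

# Best few model / G combinations according to BIC
summary(fit_auto$BIC)

# mclust reports BIC so that larger values are better
c(auto = fit_auto$bic, g2 = fit_g2$bic)

# Cross-tabulate how the observations move between the two solutions
table(auto = fit_auto$classification, g2 = fit_g2$classification)
```

A specific covariance model can be forced in the same way through the modelNames argument, for example modelNames = "EVE".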
