Subsetting cluster-specific values in a list of data.frames in R

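Inspired by this answer, my goal is to find, among the m clusters of data, the variable values that are specific to exactly one cluster (e.g. m[[15]]) and do not occur in any other cluster.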

For example, I know that genre == 4 is specific to m[[15]] only ("Fazio", i.e. names(m)[15]); genre == 4 does not occur in any of the other m clusters (confirmed by subset(d,genre == 4)).

So I expect the output to list the name "Fazio" together with genre == 4.

I would like to repeat this process for all variables listed in mods, not just genre.

I tried the following, without success:

d <- read.csv("https://raw.githubusercontent.com/rnorouzian/m/master/v.csv",h = T) # DATA

mods <- c("genre","cont.type","time","cf.timely","ssci","setting","ed.level",# mods
          "Age","profic","motivation","Ss.aware","random.grp","equiv.grp","rel.inter","rel.intra","sourced","timed","Location","cf.scope","cf.type","error.key","cf.provider","cf.revision","cf.oral","Length","instruction","graded","acc.measure","cf.training","error.type")

m <- split(d,d$study.name) # `m` clusters of data.frames

# SOLUTION TRIED:

tmp = do.call(rbind,lapply(mods,function(x){
  d = unique(d[c("study.name",x)])
  names(d) = c("study.name","val")
  transform(d,nm = x)
}))

# this logic may need to change:
tmp = tmp[ave(as.numeric(as.factor(tmp$val)),tmp$val,FUN = length) == 1,] 

lapply(split(tmp,tmp$study.name),function(a){
 setNames(a$val,a$nm)
})                               # doesn't return anything
gg85806539 answered: Subsetting cluster-specific values in a list of data.frames in R

We can add 'nm' as a second grouping variable in ave, so that each value is counted within its own moderator:

tmp1 <- tmp[with(tmp,ave(val,val,nm,FUN = length)==1),]
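To see why grouping on both val and nm matters, here is a minimal sketch on made-up data. The study names and values are hypothetical and are not taken from v.csv; the table is only shaped like `tmp`. It keeps the rows whose (value, moderator) pair occurs in exactly one study, and shows that grouping on the value alone finds nothing.

```r
# Toy long-format table shaped like `tmp` (hypothetical data, not from v.csv)
toy <- data.frame(
  study.name = c("A", "B", "C", "A", "B", "C"),
  val        = c(1, 1, 4, 4, 2, 2),
  nm         = c("genre", "genre", "genre", "time", "time", "time")
)

# Number of rows sharing each (val, nm) pair
n_by_pair <- with(toy, ave(val, val, nm, FUN = length))

# Grouping on val alone lumps genre == 4 together with time == 4
n_by_val <- with(toy, ave(val, val, FUN = length))

# Rows whose (val, nm) pair belongs to a single study
unique_pairs <- toy[n_by_pair == 1, ]
print(unique_pairs)
```

In this toy table every value occurs twice when the moderator is ignored, so filtering on `n_by_val == 1` would return zero rows, while `n_by_pair == 1` keeps genre == 4 for study "C" and time == 4 for study "A".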

Now do the split:

tmp2 <- lapply(split(tmp1,tmp1$study.name,drop = TRUE),`row.names<-`,NULL)
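# rows to exclude: 3 for "Bitc_Knch_c" and 2 for "Sun" (assumed, so that all three columns have 5 entries)
rm.df <- data.frame(study.name = c(rep("Bitc_Knch_c",3),rep("Sun",2)),code = c(88,88,7,4,0),mod.name = c("error.type","cf.scope","cf.type","error.type","error.key"))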
rm.these <- split(rm.df,rm.df$study.name)

tmp2[names(rm.these)] <- Map(function(x,y) {
     subset(x,!(nm %in% y$mod.name & val %in% y$code))},tmp2[names(rm.these)],rm.these)
Filter(nrow,tmp2)
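The exclusion step can be tried in isolation with a small sketch. The list `res` and the table `excl` below are hypothetical stand-ins for `tmp2` and `rm.df`; the sketch uses plain indexing instead of subset, which should behave the same way here.

```r
# Hypothetical per-study results shaped like `tmp2`
res <- list(
  A = data.frame(study.name = "A", val = c(4, 7), nm = c("time", "cf.type")),
  C = data.frame(study.name = "C", val = 4, nm = "genre")
)

# Hypothetical exclusion table: drop cf.type == 7 from A and genre == 4 from C
excl <- data.frame(
  study.name = c("A", "C"),
  code       = c(7, 4),
  mod.name   = c("cf.type", "genre")
)
excl_by_study <- split(excl, excl$study.name)

# Remove the excluded rows, study by study
res[names(excl_by_study)] <- Map(
  function(x, y) x[!(x$nm %in% y$mod.name & x$val %in% y$code), , drop = FALSE],
  res[names(excl_by_study)], excl_by_study
)

# Drop studies that are left with zero rows
kept <- Filter(nrow, res)
print(kept)
```

Note that the two `%in%` tests are evaluated independently: a row is dropped whenever its moderator and its value each appear somewhere in that study's exclusion rows, even if they do not appear on the same row. If exact pairs are required, matching on a combined key such as paste(nm, val) would be one option.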