How do I compute the min/max of one column for each unique combination of other columns in an R data frame?

2024-05-15 • Q&A

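I have Billboard Hot 100 data loaded in R. I can count the number of unique hits for a given artist (see the code below), but I'm struggling to work out how to find the highest chart position each song reached. The only approach I can think of is to loop over every song and compute the minimum before filtering down to the unique values. I'm new to R, so I'm not aware of other, simpler ways.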

mydata=read.csv("Hot100.csv")
mydata <- mydata[order(mydata$artist,mydata$song,mydata$date),]
head(mydata)

#              date position        song  artist
# 218482 2000-07-01       40 Bye Bye Bye 'N Sync
# 226912 2002-02-09       70  Girlfriend 'N Sync
# 226997 2002-02-16       55  Girlfriend 'N Sync
# 227072 2002-02-23       30  Girlfriend 'N Sync
# 227164 2002-03-02       22  Girlfriend 'N Sync
# 227260 2002-03-09       18  Girlfriend 'N Sync

# to remove some cols - leaves artist and song.  Has duplicates
mysub = subset(mydata,select = -c(date,position))

# now to make unique
mysub_u = unique(mysub[,c(1,2)])
View(mysub_u)

# put into table form
mytable = table(mysub_u$artist)

# but this is table form,not df
df=as.data.frame(mytable)

head(df)

#                       Var1 Freq
# 1                  'N Sync    7
# 2 'N Sync & Gloria Estefan    1
# 3  'N Sync Featuring Nelly    1
# 4             'Til Tuesday    1
# 5      "Weird Al" Yankovic    2
# 6                    (+44)    1

How can I create a table that lists the artist, the song, and the highest position it reached, where 1 is the highest?

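library(readr)
library(dplyr)

# Rebuild the sample rows from the question. Spaces inside names are replaced
# with underscores so the text parses as whitespace-separated columns.
# Note: read_table2() is deprecated as of readr 2.0 in favour of read_table().
mydata <- read_table2("index date position song artist
218482 2000-07-01 40 Bye_Bye_Bye 'N_Sync
226912 2002-02-09 70 Girlfriend 'N_Sync
226997 2002-02-16 55 Girlfriend 'N_Sync
227072 2002-02-23 30 Girlfriend 'N_Sync
227164 2002-03-02 22 Girlfriend 'N_Sync
227260 2002-03-09 18 Girlfriend 'N_Sync")

# A lower position is a better rank, so the peak is the minimum per artist/song.
out <- mydata %>%
  group_by(artist, song) %>%
  mutate(highest_position = min(position)) %>%
  select(-index, -date, -position) %>%
  unique(.)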

> out
# A tibble: 2 x 3
# Groups:   artist, song [2]
  song        artist  highest_position
  <chr>       <chr>              <dbl>
1 Bye_Bye_Bye 'N_Sync               40
2 Girlfriend  'N_Sync               18
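The same result can probably be reached without any packages. The sketch below is an alternative, not the answerer's method: it uses base R's aggregate() to take the minimum position per artist/song pair, then reuses that summary to count distinct songs per artist. The sample rows are copied from the question; the names peak, hits, highest_position and n_songs are made up for illustration.

```r
# Sample rows in the same shape as the question's data (values copied from the post).
mydata <- data.frame(
  date     = as.Date(c("2000-07-01", "2002-02-09", "2002-02-16",
                       "2002-02-23", "2002-03-02", "2002-03-09")),
  position = c(40, 70, 55, 30, 22, 18),
  song     = c("Bye Bye Bye", rep("Girlfriend", 5)),
  artist   = "'N Sync",
  stringsAsFactors = FALSE
)

# One row per artist/song pair, keeping the smallest (best) chart position.
peak <- aggregate(position ~ artist + song, data = mydata, FUN = min)
names(peak)[names(peak) == "position"] <- "highest_position"

# Number of distinct charting songs per artist, derived from the summary above.
hits <- aggregate(song ~ artist, data = peak, FUN = length)
names(hits)[names(hits) == "song"] <- "n_songs"

print(peak)
print(hits)
```

Because peak already holds one row per artist/song pair, the separate subset(), unique() and table() steps from the question should no longer be needed for the per-artist count.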
