Here is my code so far (MWE):
# https://www.kaggle.com/kaggle/kaggle-survey-2017/data
#### Primary analysis of the dataset ####
response <- read.csv(file = "multipleChoiceResponses.csv",na.strings = "")
# Select only some of the variables:
Variables <- c("GenderSelect","Country","Age","CurrentJobTitleSelect","MLToolNextYearSelect","LanguageRecommendationSelect","FormalEducation","FirstTrainingSelect","EmployerIndustry")
# Keep only the selected variables in memory:
response <- response[,Variables]
# Because of the counts involved, keep only Male and Female
Response <- response[response$GenderSelect %in% c("Male","Female"),]
# Add a column (continent) with the continent each country (Country) belongs to
library(countrycode)
Response$continent <- countrycode(sourcevar = Response[,"Country"],origin = "country.name",destination = "continent")
# Convert this new variable to a factor
Response$continent <- as.factor(Response$continent)
# Remove the rows that contain NA values
Response <- Response[complete.cases(Response),]
# Renumber the rows sequentially
rownames(Response) <- 1:nrow(Response)
Response <- droplevels(Response)
bp_Continent <- barplot(table(Response$continent),main = "Distribucion de DS por continentes",ylim = c(0,3500))
# Add GenderSelect proportion by continent in label argument ("BLABLABLA")
text(x = bp_Continent,y = table(Response$continent),label = "BLABLABLA",pos = 3,cex = 0.8,col = "red")
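Basically, the script loads the data, selects some variables, creates a new variable (continent), and finally cleans the data. The next step is to draw a bar plot and print the proportion of men and women above each bar.
What I want to do is replace "BLABLABLA" with the male/female proportion (the GenderSelect variable) for each continent.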
My question is not a duplicate of: How to display the frequency at the top of each factor in a barplot in R
because what I am interested in is computing the proportions and printing them above the bars.
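To make the goal concrete, here is a rough sketch of the kind of result I am after. It runs on a small made-up data frame (`toy`, with hypothetical values) instead of the Kaggle file. Using `prop.table` for the per-continent proportions and the "F x% / M y%" label format are only my assumptions about one possible approach, not a confirmed solution.

```r
# Toy stand-in for the cleaned Response data frame (hypothetical values)
toy <- data.frame(
  GenderSelect = factor(c("Male", "Male", "Female", "Male", "Female", "Male")),
  continent    = factor(c("Americas", "Americas", "Americas", "Europe", "Europe", "Asia"))
)

# Counts of each gender (rows) within each continent (columns)
counts <- table(toy$GenderSelect, toy$continent)

# Column-wise proportions: each continent's column sums to 1
props <- prop.table(counts, margin = 2)

# One label per continent, in the same order as the bars
lab <- sprintf("F %.0f%% / M %.0f%%",
               100 * props["Female", ], 100 * props["Male", ])

# Bar heights are the total respondents per continent
totals <- colSums(counts)
bp <- barplot(totals, ylim = c(0, max(totals) * 1.2))

# Print the proportion labels just above each bar
text(x = bp, y = totals, labels = lab, pos = 3, cex = 0.8, col = "red")
```

With the real data, `toy` would be replaced by `Response`, and `lab` would take the place of "BLABLABLA" in the `text()` call.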