Calculating with multiple conditions in an R DataFrame

I have the following dataset on investment returns:

df <- structure(list(
  Date = structure(c(1620604800, 1620604800, 1620604800, 1620604800, 1620604800, 1627257600, 1627257600),
                   class = c("POSIXct", "POSIXt"), tzone = "UTC"),
  Client = c(56203, 56203, 56203, 56203, 56203, 56203, 56203),
  FundName = c("SAFRA KEPLER EQUITY HEDGE FI MULTIMERCADO", "SAFRA KEPLER FI MULTIMERCADO",
               "SAFRA S&P SPECIAL FIC MULTIMERCADO", "SAFRA AÇÕES LIVRE FIC AÇÕES",
               "SAFRA CONSUMO AMERICANO FIC AÇÕES BDR NÍVEL I", "SAFRA S&P SPECIAL FIC MULTIMERCADO",
               "SAFRA CONSUMO AMERICANO FIC AÇÕES BDR NÍVEL I"),
  Nature = c("Aplicação", "Aplicação", "Aplicação", "Aplicação", "Aplicação", "Resgate", "Resgate"),
  Quantity = c(145.927569, 62.684383, 16.964545, 57.852278, 14.900635, 0, 0),
  Value = c(21240, 12120, 8760, 9120, 8760, -9157.45, -1125),
  Saldo = c(21240, 12120, 8760, 9120, 8760, 55.60576570794, 9198.89221970512),
  FundCode = c(25079578000106, 21144577000147, 19107923000175, 32666326000149,
               19436818000180, 19107923000175, 19436818000180),
  NAVInitialDate = c(145.551660000114, 193.349593000021, 521.980031999759, 157.489282999886,
                     639.441026999615, 543.077092000283, 692.849144999869),
  NAVYesterday = c(147.526190999895, 196.017907999922, 531.332205999643, 166.330892999889,
                   675.641246000305, 531.332205999643, 675.641246000305)
), row.names = c(NA, -7L), class = c("tbl_df", "tbl", "data.frame"))

df$NAVInitialDate = df$Value / df$Quantity

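The "NAVInitialDate" column is the result of Value / Quantity for each row. What I want to tell R is: "if a row has Quantity = 0, then NAVInitialDate = ABS(Value) divided by the sum of Quantity over all rows that have the same FundName as that row."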

The result should be: for row 6, NAVInitialDate = 539.793; for row 7, NAVInitialDate = 75.50.
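
Both expected figures can be checked by hand: each is the absolute redemption value divided by the quantity bought earlier in the same fund. A quick sketch in base R, where the variable names are only illustrative and the numbers are taken from the data above:

```r
# Row 6: redemption of 9157.45 in SAFRA S&P SPECIAL FIC MULTIMERCADO,
# divided by the 16.964545 units bought in row 3
nav_row6 <- abs(-9157.45) / 16.964545

# Row 7: redemption of 1125 in SAFRA CONSUMO AMERICANO FIC AÇÕES BDR NÍVEL I,
# divided by the 14.900635 units bought in row 5
nav_row7 <- abs(-1125) / 14.900635

print(c(nav_row6, nav_row7))
```

The printed values should agree with the targets quoted above to within rounding.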

How can I do this? I suppose it may need 3 conditions.

zhuzhu1107 answered: Calculating with multiple conditions in an R DataFrame

We can use dplyr: group_by FundName, then mutate NAVInitialDate with case_when, so that every row where Quantity == 0 is divided by the total Quantity of its group.

library(dplyr)

df %>%
  group_by(FundName) %>%
  # Rows with Quantity != 0 match no case, so case_when() returns NA for them
  mutate(NAVInitialDate = case_when(Quantity == 0 ~ abs(Value) / sum(Quantity, na.rm = TRUE)))

# A tibble: 7 x 10
# Groups:   FundName [5]
  Date                Client FundName       Nature Quantity  Value  Saldo FundCode NAVInitialDate
  <dttm>               <dbl> <chr>          <chr>     <dbl>  <dbl>  <dbl>    <dbl>          <dbl>
1 2021-05-10 00:00:00  56203 SAFRA KEPLER … Aplic…    146.  21240  2.12e4  2.51e13           NA  
2 2021-05-10 00:00:00  56203 SAFRA KEPLER … Aplic…     62.7 12120  1.21e4  2.11e13           NA  
3 2021-05-10 00:00:00  56203 SAFRA S&P SPE… Aplic…     17.0  8760  8.76e3  1.91e13           NA  
4 2021-05-10 00:00:00  56203 SAFRA AÇÕES L… Aplic…     57.9  9120  9.12e3  3.27e13           NA  
5 2021-05-10 00:00:00  56203 SAFRA CONSUMO… Aplic…     14.9  8760  8.76e3  1.94e13           NA  
6 2021-07-26 00:00:00  56203 SAFRA S&P SPE… Resga…      0   -9157. 5.56e1  1.91e13          540. 
7 2021-07-26 00:00:00  56203 SAFRA CONSUMO… Resga…      0   -1125  9.20e3  1.94e13           75.5
# … with 1 more variable: NAVYesterday <dbl>
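
In the output above, the rows with a non-zero Quantity come back as NA because case_when() has no branch for them. If the plain Value / Quantity result should be kept for those rows, one option is to add a fallback branch. A minimal sketch on a cut-down copy of the data (the `small` data frame and `result` are illustrative names, not part of the original answer):

```r
library(dplyr)

# Cut-down copy of the question's data: only the columns the calculation uses
small <- data.frame(
  FundName = c("SAFRA KEPLER FI MULTIMERCADO",
               "SAFRA S&P SPECIAL FIC MULTIMERCADO",
               "SAFRA CONSUMO AMERICANO FIC AÇÕES BDR NÍVEL I",
               "SAFRA S&P SPECIAL FIC MULTIMERCADO",
               "SAFRA CONSUMO AMERICANO FIC AÇÕES BDR NÍVEL I"),
  Quantity = c(62.684383, 16.964545, 14.900635, 0, 0),
  Value    = c(12120, 8760, 8760, -9157.45, -1125)
)

result <- small %>%
  group_by(FundName) %>%
  mutate(NAVInitialDate = case_when(
    Quantity == 0 ~ abs(Value) / sum(Quantity, na.rm = TRUE),  # redemption rows
    TRUE          ~ Value / Quantity                           # every other row
  )) %>%
  ungroup()

print(result)
```

The `TRUE ~ ...` branch acts as the "else" case, so no row is left without a value.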

Try:

# Total Quantity per fund, repeated on every row of that fund
sum_by_fund <- ave(df$Quantity, df$FundName, FUN = sum)
df$NAVInitialDate <- ifelse(df$Quantity == 0, abs(df$Value) / sum_by_fund, df$Value / df$Quantity)
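
Note that the denominator has to be computed per fund. The sketch below, on a cut-down copy of the data (`small`, `qty_by_fund` and `pooled` are illustrative names), contrasts a per-fund sum from ave() with a single sum pooled across every fund that has a zero-quantity row:

```r
# Cut-down copy of the question's data: one purchase and one redemption per fund
small <- data.frame(
  FundName = c("SAFRA S&P SPECIAL FIC MULTIMERCADO",
               "SAFRA CONSUMO AMERICANO FIC AÇÕES BDR NÍVEL I",
               "SAFRA S&P SPECIAL FIC MULTIMERCADO",
               "SAFRA CONSUMO AMERICANO FIC AÇÕES BDR NÍVEL I"),
  Quantity = c(16.964545, 14.900635, 0, 0),
  Value    = c(8760, 8760, -9157.45, -1125)
)

# Per-fund total: each row sees only the Quantity of its own fund
qty_by_fund <- ave(small$Quantity, small$FundName, FUN = sum)
small$NAVInitialDate <- ifelse(small$Quantity == 0,
                               abs(small$Value) / qty_by_fund,
                               small$Value / small$Quantity)

# Pooled total: adds the quantities of both funds together, which would
# make the denominator too large for either redemption row
pooled <- sum(small$Quantity[small$FundName %in% small$FundName[small$Quantity == 0]])

print(small)
print(pooled)
```

With the per-fund denominator, the two redemption rows reproduce the figures the question asks for.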

Try:

library(tidyverse)


df <- df %>%
  group_by(FundName) %>%
  mutate(sum_Quantity = sum(Quantity),
         mycol = ifelse(Quantity == 0, abs(Value) / sum_Quantity, Value / Quantity))

Here you group by FundName and apply a simple mutate call.

To check, let's look at the rows and columns of interest:

df %>% filter(Quantity==0) %>% select(FundName,Quantity,NAVInitialDate,mycol)
# A tibble: 2 x 4
# Groups:   FundName [2]
  FundName                                      Quantity NAVInitialDate mycol
  <chr>                                            <dbl>          <dbl> <dbl>
1 SAFRA S&P SPECIAL FIC MULTIMERCADO                   0           -Inf 540. 
2 SAFRA CONSUMO AMERICANO FIC AÇÕES BDR NÍVEL I        0           -Inf  75.5