How do I get mean and mode summaries of a data frame at the same time?

I have a data frame with 10 numeric columns and 3 character columns. As an example, I prepared this smaller data frame:

df <- data.frame(
  name = c("ANCON","ANCON","ANCON", "LUNA", "MAGOLLO", "MANCHAY", "MANCHAY","PATILLA","PATILLA"),
  destiny = c("sea","reuse","sea","sea", "reuse","sea","sea","sea","sea"),
  year = c("2022","2015","2022","2022", "2015","2016","2016","2018","2018"),
  QQ = c(10,11,3,4,13,11,12,23,7),
  Temp = c(14,16,16,15,16,20,19,14,18))

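I need to group it by the `name` column and get the mean of the `QQ` and `Temp` columns together with the mode of the `destiny` and `year` columns. I can get the mean summary, but I cannot work out how to include the mode: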

library(dplyr)

df_mean <- df %>%
  group_by(name) %>%
  summarise_all(mean, na.rm = TRUE)

  name    destiny  year    QQ  Temp
  <chr>     <dbl> <dbl> <dbl> <dbl>
1 ANCON        NA    NA   8    15.3
2 LUNA         NA    NA   4    15  
3 MAGOLLO      NA    NA  13    16  
4 MANCHAY      NA    NA  11.5  19.5
5 PATILLA      NA    NA  15    16

The desired output, with the median, looks like this:

     name destiny year   QQ Temp
1   ANCON     sea 2022  8.0 15.3
2    LUNA     sea 2022  4.0 15.0
3 MAGOLLO   reuse 2015 13.0 16.0
4 MANCHAY     sea 2016 11.5 19.5
5 PATILLA     sea 2018 15.0 16.0

How can I do this? Please help.

Answer from a uj5u.com user:

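Use `across` and `cur_column`. Note, though, that the median only applies to data that is at least ordinal. For categorical data such as your character columns, use the mode: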

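# Most frequent value of x; ties go to the value that appears first.
# Note: defining this masks base::mode() in the current environment.
mode <- function(x) {
   x_unique <- unique(x)   # distinct values, in order of first appearance
   # count each distinct value, then pick the one with the highest count
   x_unique[which.max(tabulate(match(x, x_unique)))]
}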

Then:

mode_columns <- c('destiny', 'year')
df %>% 
    group_by(name) %>%
    summarise(
        across(
            everything(),
            ~ if (cur_column() %in% mode_columns) mode(.x) else mean(.x)
        )
    )

# A tibble: 5 × 5
  name    destiny year     QQ  Temp
  <chr>   <chr>   <chr> <dbl> <dbl>
1 ANCON   sea     2022    8    15.3
2 LUNA    sea     2022    4    15  
3 MAGOLLO reuse   2015   13    16  
4 MANCHAY sea     2016   11.5  19.5
5 PATILLA sea     2018   15    16
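
To see how the `mode` helper above treats ties and missing values, here is a minimal sketch in base R. The name `stat_mode` is hypothetical (it is not in the answer) and is used only so that `base::mode()` stays unmasked. The logic is unchanged.

```r
# Same logic as the `mode` helper above, under a hypothetical name
# (`stat_mode`) so that base::mode() is not masked.
stat_mode <- function(x) {
  x_unique <- unique(x)                    # distinct values, in order of first appearance
  counts <- tabulate(match(x, x_unique))   # occurrences of each distinct value
  x_unique[which.max(counts)]              # which.max() returns the first maximum
}

stat_mode(c("sea", "reuse", "sea"))   # "sea": the most frequent value
stat_mode(c("2015", "2022"))          # "2015": on a tie, the first value seen wins
stat_mode(c(NA, NA, "sea"))           # NA: missing values are counted like any other
```

If missing values should not be able to win, one option is to drop them first, for example `stat_mode(x[!is.na(x)])`.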

UPD: alternatively, you can summarise the two sets of columns separately:

df %>%
    group_by(name) %>%
    summarise(
        across(all_of(mode_columns), mode),
        across(!all_of(mode_columns), mean)
    )
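
With many columns (the question mentions 10 numeric and 3 character ones), listing the mode columns by name may become tedious. The following is a minimal sketch that selects by column type with `where()` instead, assuming every character column should get the mode and every numeric column the mean. `stat_mode` is a hypothetical name for the helper defined earlier, and `na.rm = TRUE` is carried over from the question's own attempt.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  name = c("ANCON", "ANCON", "ANCON", "LUNA", "MAGOLLO", "MANCHAY", "MANCHAY", "PATILLA", "PATILLA"),
  destiny = c("sea", "reuse", "sea", "sea", "reuse", "sea", "sea", "sea", "sea"),
  year = c("2022", "2015", "2022", "2022", "2015", "2016", "2016", "2018", "2018"),
  QQ = c(10, 11, 3, 4, 13, 11, 12, 23, 7),
  Temp = c(14, 16, 16, 15, 16, 20, 19, 14, 18)
)

# Hypothetical name for the mode helper, so base::mode() is not masked
stat_mode <- function(x) {
  x_unique <- unique(x)
  x_unique[which.max(tabulate(match(x, x_unique)))]
}

df_summary <- df %>%
  group_by(name) %>%
  summarise(
    across(where(is.character), stat_mode),              # mode for character columns
    across(where(is.numeric), ~ mean(.x, na.rm = TRUE))  # mean for numeric columns
  )

df_summary
```

The grouping column `name` is left out of both `across()` calls automatically, so it is not summarised. Note that `year` is stored as character here, which is why it receives the mode. If it were numeric, it would fall into the mean branch and would need to be selected by name instead.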

