I have the following data frame:
my_df <- data.frame(
  Municipality = c('a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'd'),
  state = c('ac', 'ac', 'ac', 'ac', 'pb', 'pb', 'am', 'am', 'am', 'pi', 'pi'),
  votes = c(541, 463, 246, 49, 2443, 2287, 1035, 3530, 9999, 666, 3809))
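I want to calculate the vote share per Municipality, and the difference of each entry relative to the highest vote share per state (margin_victory). I tried the following code: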
library(dplyr)
actual_df <- my_df %>%
  group_by(Municipality, state) %>%
  mutate(
    share_vote = votes / sum(votes), # calculate vote shares
    # top share minus runner-up share: a single value per group
    margin_victory = max(share_vote) - max(share_vote[share_vote != max(share_vote)])
  ) %>%
  ungroup()
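This code calculates share_vote correctly, as expected. However, margin_victory is only correct when a Municipality has exactly two rows. Here is what I would like to have: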
desired_df <- data.frame(
  Municipality = c('a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'd'),
  state = c('ac', 'ac', 'ac', 'ac', 'pb', 'pb', 'am', 'am', 'am', 'pi', 'pi'),
  votes = c(541, 463, 246, 49, 2443, 2287, 1035, 3530, 9999, 666, 3809),
  margin_victory = c(0.06004619, -0.06004619, 0.2270978, 0.3787529,
                     0.03298097, -0.03298097,
                     -0.6154902, -0.44417742, 0.44417742,
                     -0.70234637, 0.70234637))
I tried replacing margin_victory in the actual_df code with
margin_victory = for (i in share_vote) {max(share_vote)-share_vote},
but without success.
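A likely reason the for attempt does nothing useful: in R, a for loop always evaluates to NULL, and inside mutate() a NULL value means no column is created. Arithmetic on vectors is already element-wise, so no loop is needed. The sketch below is base R only and uses the votes of Municipality a from my_df; the names loop_result and gap_to_leader are made up for illustration.

```r
# Votes for Municipality "a", copied from my_df
votes <- c(541, 463, 246, 49)
share_vote <- votes / sum(votes)

# A for loop evaluates to NULL, whatever its body computes
loop_result <- for (i in share_vote) { max(share_vote) - share_vote }
is.null(loop_result)  # TRUE

# Vectorized arithmetic gives one value per row (0 for the leader)
gap_to_leader <- max(share_vote) - share_vote
round(gap_to_leader, 4)
```

Note that gap_to_leader is 0 for the leading row, so this alone does not give the leader's margin over the runner-up; that case needs separate handling.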
A uj5u.com user replied:
Are you sure about the signs in your desired result? If not, I would suggest the following:
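library(tidyverse)
my_df %>%
  group_by(Municipality, state) %>%
  mutate(
    share_vote = votes / sum(votes),
    # Leader: lead over the runner-up; all other rows: deficit to the leader.
    # Dividing by sum(votes) expresses the gap as a share of the group's votes.
    mar = ifelse(votes == max(votes),
                 votes - max(votes[votes != max(votes)]),
                 votes - max(votes)) / sum(votes)) %>%
  ungroup()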
#> # A tibble: 11 × 5
#>    Municipality state votes share_vote     mar
#>    <chr>        <chr> <dbl>      <dbl>   <dbl>
#>  1 a            ac      541     0.416   0.0600
#>  2 a            ac      463     0.356  -0.0600
#>  3 a            ac      246     0.189  -0.227
#>  4 a            ac       49     0.0377 -0.379
#>  5 b            pb     2443     0.516   0.0330
#>  6 b            pb     2287     0.484  -0.0330
#>  7 c            am     1035     0.0711 -0.615
#>  8 c            am     3530     0.242  -0.444
#>  9 c            am     9999     0.687   0.444
#> 10 d            pi      666     0.149  -0.702
#> 11 d            pi     3809     0.851   0.702
Created on 2022-06-17 by the reprex package (v2.0.1)
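To see where the suggested mar column and the desired margin_victory disagree, the two can be compared directly. This is only a sketch: it assumes dplyr is installed, groups by Municipality alone (state is constant within each Municipality in this data), and uses the illustrative names desired and res.

```r
library(dplyr)

my_df <- data.frame(
  Municipality = c("a", "a", "a", "a", "b", "b", "c", "c", "c", "d", "d"),
  votes = c(541, 463, 246, 49, 2443, 2287, 1035, 3530, 9999, 666, 3809)
)

# margin_victory values copied from desired_df in the question
desired <- c(0.06004619, -0.06004619, 0.2270978, 0.3787529,
             0.03298097, -0.03298097,
             -0.6154902, -0.44417742, 0.44417742,
             -0.70234637, 0.70234637)

res <- my_df %>%
  group_by(Municipality) %>%
  mutate(mar = ifelse(votes == max(votes),
                      votes - max(votes[votes != max(votes)]),
                      votes - max(votes)) / sum(votes)) %>%
  ungroup()

# Magnitudes agree on every row
isTRUE(all.equal(abs(res$mar), abs(desired), tolerance = 1e-6))

# Rows whose sign differs from the desired output
which(sign(res$mar) != sign(desired))
```

The magnitudes match throughout; only rows 3 and 4 (the third- and fourth-placed entries of Municipality a) differ in sign, which is the inconsistency the answer asks about.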