My data frame is as follows.
df <- data.frame(stat = c(3.38, -3.40, 4.45, -4.21, 3.33),
patient1 = c(-0.44, -0.22, 0.80, -0.21, -0.22),
patient2 = c(0.40, 0.045, -0.14, -0.078, -0.16),
patient3 = c(0.35, 0.21, -0.23, -0.019, -0.21),
row.names = c("gene1","gene2","gene3","gene4","gene5"))
> df
stat patient1 patient2 patient3
gene1 3.38 -0.44 0.400 0.350
gene2 -3.40 -0.22 0.045 0.210
gene3 4.45 0.80 -0.140 -0.230
gene4 -4.21 -0.21 -0.078 -0.019
gene5 3.33 -0.22 -0.160 -0.210
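I have been struggling to write a script or a loop that computes, for each patient column, the sum of the products of the "stat" column and that column. My real patient dataset has 141 columns and 142 rows, so doing this by hand is not practical.
So I want a new row named "Signature Score", with values calculated as follows: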
df[nrow(df) + 1, ] <- NA  # append an empty row to hold the score
row.names(df)[nrow(df)] <- "Signature Score"
sum_multi_1 <- sum(df[c(1:nrow(df)-1),2]*df[c(1:nrow(df)-1),1])
sum_multi_2 <- sum(df[c(1:nrow(df)-1),3]*df[c(1:nrow(df)-1),1])
sum_multi_3 <- sum(df[c(1:nrow(df)-1),4]*df[c(1:nrow(df)-1),1])
df[nrow(df),2] <- sum_multi_1
df[nrow(df),3] <- sum_multi_2
df[nrow(df),4] <- sum_multi_3
This gives...
> df
stat patient1 patient2 patient3
gene1 3.38 -0.4400 0.40000 0.35000
gene2 -3.40 -0.2200 0.04500 0.21000
gene3 4.45 0.8000 -0.14000 -0.23000
gene4 -4.21 -0.2100 -0.07800 -0.01900
gene5 3.33 -0.2200 -0.16000 -0.21000
Signature Score NA 2.9723 0.37158 -1.17381
I tried to write a for loop like this...
for (i in 1:nrow(df)){
df[nrow(df),i+1] <- sum(df[c(1:nrow(df)-1,i+1)]*df[c(1:nrow(df)-1),1])
}
But it does not do the job. Can anyone tell me what I am missing or what I need to write?
All the best, Tj
A uj5u.com user replied:
You can use mutate() and across() to compute the required products, then adorn_totals() from the janitor package to add the totals row.
library(dplyr)
library(tibble)  # provides rownames_to_column()
df <- data.frame(stat = c(3.38, -3.40, 4.45, -4.21, 3.33),
patient1 = c(-0.44, -0.22, 0.80, -0.21, -0.22),
patient2 = c(0.40, 0.045, -0.14, -0.078, -0.16),
patient3 = c(0.35, 0.21, -0.23, -0.019, -0.21),
row.names = c("gene1","gene2","gene3","gene4","gene5")) %>%
rownames_to_column(var = "genes") %>%
mutate(across(patient1:patient3, ~.x * stat)) %>%
janitor::adorn_totals(name = "Signature Score")
df[nrow(df), "stat"] <- NA  # the column total of stat is not meaningful here
Output:
genes stat patient1 patient2 patient3
gene1 3.38 -1.4872 1.35200 1.18300
gene2 -3.40 0.7480 -0.15300 -0.71400
gene3 4.45 3.5600 -0.62300 -1.02350
gene4 -4.21 0.8841 0.32838 0.07999
gene5 3.33 -0.7326 -0.53280 -0.69930
Signature Score NA 2.9723 0.37158 -1.17381
A uj5u.com user replied:
Another possible solution, in base R:
rbind(df, signa = c(NA,colSums(df[,1] * df[-1])))
#> stat patient1 patient2 patient3
#> gene1 3.38 -0.4400 0.40000 0.35000
#> gene2 -3.40 -0.2200 0.04500 0.21000
#> gene3 4.45 0.8000 -0.14000 -0.23000
#> gene4 -4.21 -0.2100 -0.07800 -0.01900
#> gene5 3.33 -0.2200 -0.16000 -0.21000
#> signa NA 2.9723 0.37158 -1.17381
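The same score can also be read as a matrix-vector product: each patient's score is the dot product of that patient's column with stat. The following is a minimal sketch of that view, assuming every patient column is numeric and contains no missing values. The variable names patients and score are only illustrative.

```r
df <- data.frame(stat = c(3.38, -3.40, 4.45, -4.21, 3.33),
                 patient1 = c(-0.44, -0.22, 0.80, -0.21, -0.22),
                 patient2 = c(0.40, 0.045, -0.14, -0.078, -0.16),
                 patient3 = c(0.35, 0.21, -0.23, -0.019, -0.21),
                 row.names = c("gene1", "gene2", "gene3", "gene4", "gene5"))

# crossprod(x, y) computes t(x) %*% y, i.e. one weighted sum per patient column
patients <- as.matrix(df[-1])
score <- drop(crossprod(patients, df$stat))  # drop() turns the 3 x 1 matrix into a named vector

# sanity check against the colSums() approach used above
stopifnot(isTRUE(all.equal(score, colSums(df$stat * df[-1]))))

rbind(df, `Signature Score` = c(NA, score))
```

For a 142 x 141 table either form should be fast. The matrix form mainly helps if you later want scores for several weight vectors at once.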
A uj5u.com user replied:
I noticed that you subtract 1, presumably so that indexing starts at 0. However, unlike Python, indexing in R starts at 1. So you probably want this:
colSums(df[-1]*df$stat)
# patient1 patient2 patient3
# 2.97230 0.37158 -1.17381
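To make the indexing point concrete: in R the colon operator binds more tightly than subtraction, so 1:n - 1 is parsed as (1:n) - 1, and an index of 0 is silently dropped when subsetting. The sketch below shows that, followed by one possible corrected version of the loop from the question. It assumes the first column is stat and all remaining columns are patients. The name genes is only illustrative.

```r
n <- 6
all((1:n - 1) == 0:5)   # TRUE: 1:n - 1 means (1:n) - 1, not 1:(n - 1)
1:(n - 1)               # 1 2 3 4 5

df <- data.frame(stat = c(3.38, -3.40, 4.45, -4.21, 3.33),
                 patient1 = c(-0.44, -0.22, 0.80, -0.21, -0.22),
                 patient2 = c(0.40, 0.045, -0.14, -0.078, -0.16),
                 patient3 = c(0.35, 0.21, -0.23, -0.019, -0.21),
                 row.names = c("gene1", "gene2", "gene3", "gene4", "gene5"))

genes <- seq_len(nrow(df))                    # rows holding gene values
df[nrow(df) + 1, ] <- NA                      # append an empty row
row.names(df)[nrow(df)] <- "Signature Score"

# loop over the patient columns (2, 3, ...), not over the rows
for (j in 2:ncol(df)) {
  df[nrow(df), j] <- sum(df[genes, j] * df[genes, "stat"])
}
df
```

The loop in the question iterates over 1:nrow(df) and places i+1 inside the c(...) used for row selection, which is likely why it did not produce the expected result. Here the column index j is kept separate from the row index genes.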
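A uj5u.com user replied:
You are overcomplicating it.
To make the code clearer, define an auxiliary function fun that multiplies two columns and sums the result. Then apply that function to the data.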
df <- data.frame(stat = c(3.38, -3.40, 4.45, -4.21, 3.33),
patient1 = c(-0.44, -0.22, 0.80, -0.21, -0.22),
patient2 = c(0.40, 0.045, -0.14, -0.078, -0.16),
patient3 = c(0.35, 0.21, -0.23, -0.019, -0.21),
row.names = c("gene1","gene2","gene3","gene4","gene5"))
# auxiliary function
fun <- function(x, y) sum(x * y)
apply(df[-1], 2, fun, y = df[[1]])
#> patient1 patient2 patient3
#> 2.97230 0.37158 -1.17381
sigscore <- apply(df[-1], 2, fun, y = df[[1]])
rbind(df, `Signature Score` = c(NA, sigscore))
#> stat patient1 patient2 patient3
#> gene1 3.38 -0.4400 0.40000 0.35000
#> gene2 -3.40 -0.2200 0.04500 0.21000
#> gene3 4.45 0.8000 -0.14000 -0.23000
#> gene4 -4.21 -0.2100 -0.07800 -0.01900
#> gene5 3.33 -0.2200 -0.16000 -0.21000
#> Signature Score NA 2.9723 0.37158 -1.17381
Created on 2022-05-05 by the reprex package (v2.0.1)
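For a larger table such as the 141-column dataset mentioned in the question, the same idea can be wrapped in a small helper. This is only a sketch: add_signature_score and its arguments weight_col and label are hypothetical names, not part of any package, and it assumes every column other than the weight column is numeric.

```r
# Hypothetical helper: append a row of weighted column sums to a data frame
add_signature_score <- function(data, weight_col = "stat",
                                label = "Signature Score") {
  w <- data[[weight_col]]
  others <- setdiff(names(data), weight_col)

  # multiply each remaining column by the weights, then sum per column
  scores <- colSums(data[others] * w)

  # build a one-row data frame with the same columns, NA in the weight column
  new_row <- data[1, , drop = FALSE]
  new_row[1, ] <- NA
  new_row[1, others] <- scores
  rownames(new_row) <- label

  rbind(data, new_row)
}

df <- data.frame(stat = c(3.38, -3.40, 4.45, -4.21, 3.33),
                 patient1 = c(-0.44, -0.22, 0.80, -0.21, -0.22),
                 patient2 = c(0.40, 0.045, -0.14, -0.078, -0.16),
                 patient3 = c(0.35, 0.21, -0.23, -0.019, -0.21),
                 row.names = c("gene1", "gene2", "gene3", "gene4", "gene5"))

res <- add_signature_score(df)
res
```

Selecting the patient columns by name rather than by position means the helper does not depend on stat being the first column.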
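A uj5u.com user replied:
Here is another tidyverse option, in which I apply the function inside summarise() to get the column totals, then change the row name, and finally bind the result back onto the original data frame.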
library(tidyverse)
df %>%
summarise(across(-stat, ~ sum(.x * stat, na.rm = TRUE))) %>%
`row.names<-`("Signature Score") %>%
bind_rows(df, .)
Output
stat patient1 patient2 patient3
gene1 3.38 -0.4400 0.40000 0.35000
gene2 -3.40 -0.2200 0.04500 0.21000
gene3 4.45 0.8000 -0.14000 -0.23000
gene4 -4.21 -0.2100 -0.07800 -0.01900
gene5 3.33 -0.2200 -0.16000 -0.21000
Signature Score NA 2.9723 0.37158 -1.17381
Please credit the source when reposting. Link to this article: https://www.uj5u.com/net/470697.html