I have the following strings:
x <- "??????????DRHRTRHLAK??????????"
x2 <- "????????????????????TRCYHIDPHH"
x3 <- "FKDHKHIDVK????????????????????TRCYHIDPHH"
x4 <- "FKDHKHIDVK????????????????????"
What I want to do is replace all the ? characters with the characters of another string:
rep <- "ndqeegillkkkkfpssyvv"
The result should be:
ndqeegillkDRHRTRHLAKkkkfpssyvv # x
ndqeegillkkkkfpssyvvTRCYHIDPHH # x2
FKDHKHIDVKndqeegillkkkkfpssyvvTRCYHIDPHH # x3
FKDHKHIDVKndqeegillkkkkfpssyvv # x4
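Basically, rep is spread over the ? positions, in order, around the part that stays fixed (DRHRTRHLAK in x). The total number of ? characters in x is the same as the total length of rep: both are 20 characters.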
Note that I don't want to split rep manually as an extra step.
I tried this, but it failed:
> gsub(pattern = "\\?+", replacement = rep, x = x)
[1] "ndqeegillkkkkfpssyvvDRHRTRHLAKndqeegillkkkkfpssyvv"
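The attempt fails because gsub() inserts the whole replacement string at every match, so each of the two runs of ? receives all 20 characters of rep. A minimal sketch that makes this visible, assuming the intended pattern is \\?+ (one or more literal question marks); the variable names run_start, run_length and out are only for illustration:

```r
x   <- "??????????DRHRTRHLAK??????????"
rep <- "ndqeegillkkkkfpssyvv"

# Locate every run of '?' and record where it starts and how long it is.
runs       <- gregexpr("\\?+", x)[[1]]
run_start  <- as.integer(runs)            # 1 and 21
run_length <- attr(runs, "match.length")  # 10 and 10

# gsub() pastes all of `rep` into each run, so the result grows
# from 30 characters to 20 + 10 + 20 = 50 characters.
out <- gsub("\\?+", rep, x)
nchar(out)
```

So the replacement has to be cut into pieces whose lengths match the runs, which is what the answers below do in different ways.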
A helpful uj5u.com user replied:
Sample data:
x <- c(
"??????????DRHRTRHLAK??????????",
"????????????????????TRCYHIDPHH",
"FKDHKHIDVK????????????????????TRCYHIDPHH"
)
rep <- "ndqeegillkkkkfpssyvv"
Use regmatches<- to do the replacement in a vectorized way:
gr <- gregexpr("\\?+", x)
csml <- lapply(gr, \(x) cumsum(attr(x, "match.length")) )
# each slice starts one character after the previous slice ends
regmatches(x, gr) <- lapply(csml, \(x) substring(rep, c(1, head(x, -1) + 1), x) )
##[1] "ndqeegillkDRHRTRHLAKkkkfpssyvv"
##[2] "ndqeegillkkkkfpssyvvTRCYHIDPHH"
##[3] "FKDHKHIDVKndqeegillkkkkfpssyvvTRCYHIDPHH"
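To check the fourth string from the question as well, the same regmatches<- idea can be wrapped in a small function. This is only a sketch: fill_runs and its fill argument are hypothetical names, the slice starts are computed as one past the previous slice's end (an assumption about the intended slicing), and every input string is assumed to contain at least one run of ?. The strings are built with strrep() but are meant to be the same as the sample data.

```r
# Hypothetical helper: fill each run of '?' with consecutive slices of `fill`.
fill_runs <- function(x, fill) {
  gr <- gregexpr("\\?+", x)
  slices <- lapply(gr, function(m) {
    end   <- cumsum(attr(m, "match.length"))  # where each slice ends in `fill`
    start <- c(1L, head(end, -1) + 1L)        # one past the previous slice's end
    substring(fill, start, end)
  })
  regmatches(x, gr) <- slices
  x
}

q10 <- strrep("?", 10)
q20 <- strrep("?", 20)
x <- c(
  paste0(q10, "DRHRTRHLAK", q10),
  paste0(q20, "TRCYHIDPHH"),
  paste0("FKDHKHIDVK", q20, "TRCYHIDPHH"),
  paste0("FKDHKHIDVK", q20)
)
res <- fill_runs(x, "ndqeegillkkkkfpssyvv")
res
## [1] "ndqeegillkDRHRTRHLAKkkkfpssyvv"
## [2] "ndqeegillkkkkfpssyvvTRCYHIDPHH"
## [3] "FKDHKHIDVKndqeegillkkkkfpssyvvTRCYHIDPHH"
## [4] "FKDHKHIDVKndqeegillkkkkfpssyvv"
```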
A helpful uj5u.com user replied:
Split the replacement string with substr():
x <- "??????????DRHRTRHLAK??????????"
rep <- "ndqeegillkkkkfpssyvv"
x <- gsub(pattern = "^\\?+", replacement = substr(rep, 1, 10), x = x)   # leading run of ?
x <- gsub(pattern = "\\?+$", replacement = substr(rep, 11, 20), x = x)  # trailing run of ?
x
#[1] "ndqeegillkDRHRTRHLAKkkkfpssyvv"
In the regular expression, ^ matches the start of the string and $ matches the end.
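The split points 10 and 11 are hard-coded above. As a hedged variation, the length of the leading run can be measured with regexpr() so that the same two anchored substitutions also cover strings with ? at only one end. fill_ends and its fill argument are hypothetical names; the sketch handles a single string and leaves a run in the middle of the string untouched.

```r
# Hypothetical helper: fill a leading and/or trailing run of '?' from `fill`.
fill_ends <- function(x, fill) {
  # Length of the leading run; regexpr() reports -1 when there is none.
  n_lead <- max(attr(regexpr("^\\?+", x), "match.length"), 0L)
  out <- sub("^\\?+", substr(fill, 1, n_lead), x)
  # Whatever is left of `fill` goes into the trailing run, if there is one.
  sub("\\?+$", substr(fill, n_lead + 1, nchar(fill)), out)
}

rep <- "ndqeegillkkkkfpssyvv"
fill_ends("??????????DRHRTRHLAK??????????", rep)
#[1] "ndqeegillkDRHRTRHLAKkkkfpssyvv"
fill_ends(paste0("FKDHKHIDVK", strrep("?", 20)), rep)
#[1] "FKDHKHIDVKndqeegillkkkkfpssyvv"
```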
A helpful uj5u.com user replied:
You can count the number of ? characters and then cut rep accordingly:
x <- "??????????DRHRTRHLAK??????????"
rep <- "ndqeegillkkkkfpssyvv"
pattern <- "(\\?+)(DRHRTRHLAK)(\\?+)"
n <- nchar(gsub(pattern, "\\1", x))
gsub(pattern, paste0(substr(rep, 1, n), "\\2", substr(rep, n + 1, nchar(rep))), x)
#[1] "ndqeegillkDRHRTRHLAKkkkfpssyvv"
Edit, for the new examples:
A very verbose approach is an if-else chain that checks where the ? characters are and fills in rep accordingly.
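library(magrittr)  # provides %>%

# Works on a single string x. The per-branch patterns are an assumption:
# the single pattern defined earlier only matches the first case.
if (grepl("^\\?.+\\?$", x)) {                 # ?'s on both ends
  pattern <- "^(\\?+)([A-Z]+)(\\?+)$"
  n <- gsub(pattern, "\\1", x) %>% nchar()
  gsub(pattern, paste0(substr(rep, 1, n), "\\2", substr(rep, n + 1, nchar(rep))), x)
} else if (grepl("^\\?", x)) {                # ?'s only at the start
  pattern <- "^(\\?+)([A-Z]*)$"
  n <- gsub(pattern, "\\1", x) %>% nchar()
  gsub(pattern, paste0(substr(rep, 1, n), "\\2"), x)
} else if (grepl("\\?$", x)) {                # ?'s only at the end
  pattern <- "^([A-Z]*)(\\?+)$"
  n <- gsub(pattern, "\\2", x) %>% nchar()
  gsub(pattern, paste0("\\1", substr(rep, 1, n)), x)
} else if (grepl("^[A-Z]+\\?+[A-Z]+$", x)) {  # ?'s only in the middle
  pattern <- "^([A-Z]+)(\\?+)([A-Z]+)$"
  n <- gsub(pattern, "\\2", x) %>% nchar()
  gsub(pattern, paste0("\\1", substr(rep, 1, n), "\\3"), x)
}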
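For comparison, here is a different technique from the chain above that avoids regular expressions and case analysis altogether: split both strings into single characters and overwrite the ? positions in order. It is a sketch under the assumption that a string never contains more ? characters than rep has characters; fill_chars and its fill argument are hypothetical names.

```r
# Hypothetical helper: overwrite the i-th '?' in `x` with the i-th character of `fill`.
fill_chars <- function(x, fill) {
  chars <- strsplit(x, "")[[1]]     # one element per character of x
  src   <- strsplit(fill, "")[[1]]  # one element per character of fill
  pos   <- which(chars == "?")      # positions still to be filled
  stopifnot(length(pos) <= length(src))
  chars[pos] <- src[seq_along(pos)]
  paste(chars, collapse = "")
}

q20 <- strrep("?", 20)
x <- c(
  paste0(strrep("?", 10), "DRHRTRHLAK", strrep("?", 10)),
  paste0(q20, "TRCYHIDPHH"),
  paste0("FKDHKHIDVK", q20, "TRCYHIDPHH"),
  paste0("FKDHKHIDVK", q20)
)
rep <- "ndqeegillkkkkfpssyvv"
res <- vapply(x, fill_chars, character(1), fill = rep, USE.NAMES = FALSE)
res
#[1] "ndqeegillkDRHRTRHLAKkkkfpssyvv"
#[2] "ndqeegillkkkkfpssyvvTRCYHIDPHH"
#[3] "FKDHKHIDVKndqeegillkkkkfpssyvvTRCYHIDPHH"
#[4] "FKDHKHIDVKndqeegillkkkkfpssyvv"
```

Because the positions are filled one character at a time, the number and placement of the runs does not matter, at the cost of looping over the strings with vapply().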