How to replace interspersed characters with another string in R

I have the following strings:

x  <- "??????????DRHRTRHLAK??????????"
x2 <- "????????????????????TRCYHIDPHH"
x3 <- "FKDHKHIDVK????????????????????TRCYHIDPHH"
x4 <- "FKDHKHIDVK????????????????????"

What I want to do is replace all the ? characters with another string:

rep <- "ndqeegillkkkkfpssyvv"

resulting in:

ndqeegillkDRHRTRHLAKkkkfpssyvv           # x
ndqeegillkkkkfpssyvvTRCYHIDPHH           # x2
FKDHKHIDVKndqeegillkkkkfpssyvvTRCYHIDPHH # x3
FKDHKHIDVKndqeegillkkkkfpssyvv           # x4

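Basically, the characters of rep are kept in order and interspersed around the existing characters (for example DRHRTRHLAK in x).

The total number of ? characters is the same as the total length of rep: 20 characters in both cases.

Note that I don't want to split rep manually as an extra step.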

I tried this, but it failed:

> gsub(pattern = "\\?+", replacement = rep, x = x)
[1] "ndqeegillkkkkfpssyvvDRHRTRHLAKndqeegillkkkkfpssyvv"

A uj5u.com user replied:

Sample data:

x <- c(
    "??????????DRHRTRHLAK??????????",
    "????????????????????TRCYHIDPHH",
    "FKDHKHIDVK????????????????????TRCYHIDPHH"
)
rep <- "ndqeegillkkkkfpssyvv"

Use regmatches<- to do the replacement in a vectorized way:

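gr <- gregexpr("\\?+", x)
# End position within rep of each run of ?: the cumulative run lengths
csml <- lapply(gr, \(m) cumsum(attr(m, "match.length")))
# Each run starts one character past the previous run's end
regmatches(x, gr) <- lapply(csml, \(e) substring(rep, c(1, head(e, -1) + 1), e))
x
##[1] "ndqeegillkDRHRTRHLAKkkkfpssyvv"
##[2] "ndqeegillkkkkfpssyvvTRCYHIDPHH"
##[3] "FKDHKHIDVKndqeegillkkkkfpssyvvTRCYHIDPHH"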
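
To make the mapping from each run of ? to a slice of rep easier to follow, here is a minimal sketch that prints nothing fancy and only walks through the intermediate values for a single string. The variable names (s, gr, len, end, start, pieces) are illustrative and not part of the answer; only base R functions are used.

```r
s   <- "??????????DRHRTRHLAK??????????"
rep <- "ndqeegillkkkkfpssyvv"

gr  <- gregexpr("\\?+", s)             # one match object per input string
len <- attr(gr[[1]], "match.length")   # length of each run of "?": 10 and 10
end <- cumsum(len)                     # last position in rep used by each run
start <- end - len + 1                 # first position in rep used by each run

pieces <- substring(rep, start, end)   # slice of rep for each run, in order
regmatches(s, gr) <- list(pieces)      # put the slices back where the runs were
s
```

With two runs of ten characters each, start is c(1, 11) and end is c(10, 20), so the two runs consume rep without overlapping.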

A uj5u.com user replied:

Split the replacement string with substr():

x <- "??????????DRHRTRHLAK??????????"
rep <- "ndqeegillkkkkfpssyvv"
x<-gsub(pattern = "^\\?+", replacement = substr(rep, 1, 10), x = x)
x<-gsub(pattern = "\\?+$", replacement = substr(rep, 11, 20), x = x)
x
#[1] "ndqeegillkDRHRTRHLAKkkkfpssyvv"

In the regular expression, ^ matches the start of the string and $ matches the end.
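
As a small illustration of the two anchors, the sketch below extracts the leading and trailing runs of ? separately and uses the length of the leading run as the split point instead of a hard-coded 10. It assumes the string has ? characters at both ends; the names s, lead, trail, n and out are illustrative.

```r
s   <- "??????????DRHRTRHLAK??????????"
rep <- "ndqeegillkkkkfpssyvv"

lead  <- regmatches(s, regexpr("^\\?+", s))  # run of "?" anchored at the start
trail <- regmatches(s, regexpr("\\?+$", s))  # run of "?" anchored at the end
n <- nchar(lead)                             # how much of rep the first run uses

out <- sub("^\\?+", substr(rep, 1, n), s)               # fill the leading run
out <- sub("\\?+$", substr(rep, n + 1, nchar(rep)), out) # fill the trailing run
out
```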

A uj5u.com user replied:

You can count the number of ? characters and cut rep accordingly:

x <- "??????????DRHRTRHLAK??????????"
rep <- "ndqeegillkkkkfpssyvv"

pattern <- "(\\?+)(DRHRTRHLAK)(\\?+)"
n <- nchar(gsub(pattern, "\\1", x))  # length of the leading run of ?

gsub(pattern, paste0(substr(rep, 1, n), "\\2", substr(rep, n + 1, nchar(rep))), x)
#[1] "ndqeegillkDRHRTRHLAKkkkfpssyvv"

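Edit, for the new examples:

A very verbose approach is to write an if/else chain that checks where the ? characters are and replaces them with the corresponding part of rep.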

# x is a single string here; each branch counts the ?'s (n) and cuts rep to fit
if (grepl("^\\?.+\\?$", x)) { # ?'s on both ends
  pattern <- "^(\\?+)([^?]+)(\\?+)$"
  n <- nchar(sub(pattern, "\\1", x))
  sub(pattern, paste0(substr(rep, 1, n), "\\2", substr(rep, n + 1, nchar(rep))), x)
} else if (grepl("^\\?", x)) { # ?'s only at the start
  n <- nchar(sub("^(\\?+).*$", "\\1", x))
  sub("^\\?+", substr(rep, 1, n), x)
} else if (grepl("\\?$", x)) { # ?'s only at the end
  n <- nchar(sub("^[^?]*(\\?+)$", "\\1", x))
  sub("\\?+$", substr(rep, 1, n), x)
} else if (grepl("^[A-Z]+\\?+[A-Z]+$", x)) { # ?'s only in the middle
  n <- nchar(sub("^[A-Z]+(\\?+)[A-Z]+$", "\\1", x))
  sub("\\?+", substr(rep, 1, n), x)
}
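
As an alternative to handling each layout in its own branch, here is a minimal sketch that works character by character: it splits each string into single characters and overwrites the ? positions with the characters of rep, in order. This is a different technique from the answers above, not a restatement of them. The helper name fill_placeholders is hypothetical, and the sketch assumes that no string contains more ? characters than rep has characters.

```r
# Hypothetical helper: fill the "?" positions of each string with the
# characters of `rep`, left to right, leaving all other characters in place.
fill_placeholders <- function(x, rep) {
  rep_chars <- strsplit(rep, "")[[1]]
  vapply(x, function(s) {
    chars <- strsplit(s, "")[[1]]
    idx <- which(chars == "?")
    # Assumption: rep is long enough to cover every "?" in the string
    stopifnot(length(idx) <= length(rep_chars))
    chars[idx] <- rep_chars[seq_along(idx)]
    paste(chars, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

strings <- c(
  "??????????DRHRTRHLAK??????????",
  "????????????????????TRCYHIDPHH",
  "FKDHKHIDVK????????????????????TRCYHIDPHH",
  "FKDHKHIDVK????????????????????"
)
result <- fill_placeholders(strings, "ndqeegillkkkkfpssyvv")
result
```

Because the position of each ? is handled individually, the same call covers ? characters at the start, the end, both ends, or the middle, at the cost of splitting every string into characters.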

Please credit the source when reposting. Link to this article: https://www.uj5u.com/ruanti/531695.html
