R: gsub with pattern = vector and replacement = vector
As the title says, I am trying to use vectors for the "pattern" and "replacement" arguments of gsub. At the moment, my code looks like this:
names(x1) <- gsub("2110027599", "Inv1", names(x1)) # x1 is a data frame
names(x1) <- gsub("2110025622", "Inv2", names(x1))
names(x1) <- gsub("2110028045", "Inv3", names(x1))
names(x1) <- gsub("2110034716", "Inv4", names(x1))
names(x1) <- gsub("2110069349", "Inv5", names(x1))
names(x1) <- gsub("2110023264", "Inv6", names(x1))
What I would like to do is something like this:
a <- c("2110027599", "2110025622", "2110028045", "2110034716", "2110069349", "2110023264")
b <- c("Inv1", "Inv2", "Inv3", "Inv4", "Inv5", "Inv6")
names(x1) <- gsub(a, b, names(x1))
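I am guessing there is an apply function that can do this, but I am not sure which one to use!
Edit: names(x1) looks like this (there are more columns, but I have left them out):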
> names(x1)
 [1] "2110023264A.Ms.Amp" "2110023264A.Ms.Vol" "2110023264A.Ms.Watt" "2110023264A1.Ms.Amp"
 [5] "2110023264A2.Ms.Amp" "2110023264A3.Ms.Amp" "2110023264A4.Ms.Amp" "2110023264A5.Ms.Amp"
 [9] "2110023264B.Ms.Amp" "2110023264B.Ms.Vol" "2110023264B.Ms.Watt" "2110023264B1.Ms.Amp"
[13] "2110023264Error" "2110023264E-Total" "2110023264GridMs.Hz" "2110023264GridMs.PhV.phsA"
[17] "2110023264GridMs.PhV.phsB" "2110023264GridMs.PhV.phsC" "2110023264GridMs.TotPFPrc" "2110023264Inv.TmpLimStt"
[21] "2110023264InvCtl.Stt" "2110023264Mode" "2110023264Mt.TotOpTmh" "2110023264Mt.TotTmh"
[25] "2110023264Op.EvtCntUsr" "2110023264Op.EvtNo" "2110023264Op.GriSwStt" "2110023264Op.TmsRmg"
[29] "2110023264Pac" "2110023264PlntCtl.Stt" "2110023264Serial Number" "2110025622A.Ms.Amp"
[33] "2110025622A.Ms.Vol" "2110025622A.Ms.Watt" "2110025622A1.Ms.Amp" "2110025622A2.Ms.Amp"
[37] "2110025622A3.Ms.Amp" "2110025622A4.Ms.Amp" "2110025622A5.Ms.Amp" "2110025622B.Ms.Amp"
[41] "2110025622B.Ms.Vol" "2110025622B.Ms.Watt" "2110025622B1.Ms.Amp" "2110025622Error"
[45] "2110025622E-Total" "2110025622GridMs.Hz" "2110025622GridMs.PhV.phsA" "2110025622GridMs.PhV.phsB"
What I am hoping to get is this:
> names(x1)
 [1] "Inv6A.Ms.Amp" "Inv6A.Ms.Vol" "Inv6A.Ms.Watt" "Inv6A1.Ms.Amp" "Inv6A2.Ms.Amp"
 [6] "Inv6A3.Ms.Amp" "Inv6A4.Ms.Amp" "Inv6A5.Ms.Amp" "Inv6B.Ms.Amp" "Inv6B.Ms.Vol"
[11] "Inv6B.Ms.Watt" "Inv6B1.Ms.Amp" "Inv6Error" "Inv6E-Total" "Inv6GridMs.Hz"
[16] "Inv6GridMs.PhV.phsA" "Inv6GridMs.PhV.phsB" "Inv6GridMs.PhV.phsC" "Inv6GridMs.TotPFPrc" "Inv6Inv.TmpLimStt"
[21] "Inv6InvCtl.Stt" "Inv6Mode" "Inv6Mt.TotOpTmh" "Inv6Mt.TotTmh" "Inv6Op.EvtCntUsr"
[26] "Inv6Op.EvtNo" "Inv6Op.GriSwStt" "Inv6Op.TmsRmg" "Inv6Pac" "Inv6PlntCtl.Stt"
[31] "Inv6Serial Number" "Inv2A.Ms.Amp" "Inv2A.Ms.Vol" "Inv2A.Ms.Watt" "Inv2A1.Ms.Amp"
[36] "Inv2A2.Ms.Amp" "Inv2A3.Ms.Amp" "Inv2A4.Ms.Amp" "Inv2A5.Ms.Amp" "Inv2B.Ms.Amp"
[41] "Inv2B.Ms.Vol" "Inv2B.Ms.Watt" "Inv2B1.Ms.Amp" "Inv2Error" "Inv2E-Total"
[46] "Inv2GridMs.Hz" "Inv2GridMs.PhV.phsA" "Inv2GridMs.PhV.phsB"
There are plenty of solutions here already, but here is one more:
The qdap package:
library(qdap)
names(x1) <- mgsub(a, b, names(x1))
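For readers who would rather not add a package, the same idea can probably be sketched in base R with a loop over the pattern/replacement pairs. The loop is needed because gsub itself uses only the first element of a pattern vector (and warns about it). This is only a sketch: `nms` is a made-up stand-in for names(x1), and `fixed = TRUE` assumes the patterns are literal strings rather than regular expressions.

```r
a <- c("2110027599", "2110025622", "2110028045",
       "2110034716", "2110069349", "2110023264")
b <- c("Inv1", "Inv2", "Inv3", "Inv4", "Inv5", "Inv6")

# Hypothetical stand-in for names(x1)
nms <- c("2110023264A.Ms.Amp", "2110025622Error", "2110023264Pac")

# Apply each pattern/replacement pair in turn to the whole vector
for (i in seq_along(a)) {
  nms <- gsub(a[i], b[i], nms, fixed = TRUE)
}
nms
# [1] "Inv6A.Ms.Amp" "Inv2Error"    "Inv6Pac"
```

Because every pair is applied to every name, this does not depend on the order of the names, at the cost of one pass over the vector per pattern.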
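New answer
If we can make one more assumption, the following should work. The assumption this time is that you are really interested in replacing the first 10 characters of each value in names(x1).
Here, I have stored names(x1) as a character vector named "X1". The solution essentially uses substr to split the values in X1 into two parts, match to find the correct replacement option, and paste0 to put everything back together.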
a <- c("2110027599", "2110025622", "2110028045", "2110034716", "2110069349", "2110023264")
b <- c("Inv1","Inv2","Inv3","Inv4","Inv5","Inv6")
X1pre <- substr(X1, 1, 10)                  # the 10-character prefix
X1post <- substr(X1, 11, max(nchar(X1)))    # everything after the prefix
paste0(b[match(X1pre, a)], X1post)
#  [1] "Inv6A.Ms.Amp" "Inv6A.Ms.Vol" "Inv6A.Ms.Watt"
#  [4] "Inv6A1.Ms.Amp" "Inv6A2.Ms.Amp" "Inv6A3.Ms.Amp"
#  [7] "Inv6A4.Ms.Amp" "Inv6A5.Ms.Amp" "Inv6B.Ms.Amp"
# [10] "Inv6B.Ms.Vol" "Inv6B.Ms.Watt" "Inv6B1.Ms.Amp"
# [13] "Inv6Error" "Inv6E-Total" "Inv6GridMs.Hz"
# [16] "Inv6GridMs.PhV.phsA" "Inv6GridMs.PhV.phsB" "Inv6GridMs.PhV.phsC"
# [19] "Inv6GridMs.TotPFPrc" "Inv6Inv.TmpLimStt" "Inv6InvCtl.Stt"
# [22] "Inv6Mode" "Inv6Mt.TotOpTmh" "Inv6Mt.TotTmh"
# [25] "Inv6Op.EvtCntUsr" "Inv6Op.EvtNo" "Inv6Op.GriSwStt"
# [28] "Inv6Op.TmsRmg" "Inv6Pac" "Inv6PlntCtl.Stt"
# [31] "Inv6Serial Number" "Inv2A.Ms.Amp" "Inv2A.Ms.Vol"
# [34] "Inv2A.Ms.Watt" "Inv2A1.Ms.Amp" "Inv2A2.Ms.Amp"
# [37] "Inv2A3.Ms.Amp" "Inv2A4.Ms.Amp" "Inv2A5.Ms.Amp"
# [40] "Inv2B.Ms.Amp" "Inv2B.Ms.Vol" "Inv2B.Ms.Watt"
# [43] "Inv2B1.Ms.Amp" "Inv2Error" "Inv2E-Total"
# [46] "Inv2GridMs.Hz" "Inv2GridMs.PhV.phsA" "Inv2GridMs.PhV.phsB"
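One thing to watch with the substr/match approach: match returns NA for any prefix that is not in `a`, and pasting that would produce names starting with "NA". A possible guard, sketched here with a made-up `X1` whose last element has an unknown prefix, is to keep the original name whenever no prefix matches:

```r
a <- c("2110027599", "2110025622", "2110028045",
       "2110034716", "2110069349", "2110023264")
b <- c("Inv1", "Inv2", "Inv3", "Inv4", "Inv5", "Inv6")

# Hypothetical names; the last prefix does not occur in a
X1 <- c("2110023264A.Ms.Amp", "2110025622Error", "9999999999Mode")

# Position of each 10-character prefix in a (NA when it is not found)
idx <- match(substr(X1, 1, 10), a)

# Swap the prefix where it matched, otherwise leave the name untouched
res <- ifelse(is.na(idx), X1, paste0(b[idx], substr(X1, 11, nchar(X1))))
res
```

Whether unmatched names should be kept, dropped, or flagged depends on the data, so treat the `ifelse` fallback as one option rather than the right answer.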
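Old answer
If we can assume that names(x1) is in the same order as the pattern and replacement vectors, and that it is basically a one-to-one replacement, then you might be able to get by with just sapply.
Here is an example of that particular situation:
Imagine "names(x)" looks something like this: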
# a is the pattern vector defined just below
X1 <- paste0("A2", a, sequence(length(a)))
X1
# [1] "A221100275991" "A221100256222" "A221100280453"
# [4] "A221100347164" "A221100693495" "A221100232646"
Here are our pattern and replacement vectors:
a <- c("2110027599", "2110025622", "2110028045", "2110034716", "2110069349", "2110023264")
b <- c("Inv1","Inv2","Inv3","Inv4","Inv5","Inv6")
If those assumptions hold, we can use this approach.
sapply(seq_along(a), function(x) gsub(a[x], b[x], X1[x]))
# [1] "A2Inv11" "A2Inv22" "A2Inv33" "A2Inv44" "A2Inv55" "A2Inv66"
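From the stringr documentation for str_replace_all: "If you want to apply multiple patterns and replacements to the same string, pass a named version to pattern." In other words, pass a named character vector whose names are the patterns and whose values are the replacements.
So, using a, b and names(x1) from above: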
library(stringr)
names(b) <- a   # name each replacement by the pattern it replaces
str_replace_all(names(x1), b)
Somehow names<- and match seem more appropriate here…
names( x1 ) <- b[ match( names( x1 ) , a ) ]
But I am assuming that the elements of vector a are the actual names of the data.frame.
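To see why that assumption matters, here is a small sketch with made-up names (`nms` is hypothetical, not the asker's data). match compares whole strings, so a name that merely starts with one of the patterns gets NA instead of a replacement:

```r
a <- c("2110025622", "2110023264")
b <- c("Inv2", "Inv6")

# First name equals a pattern exactly; second only contains one
nms <- c("2110023264", "2110025622A.Ms.Amp")

res <- b[match(nms, a)]
res
# [1] "Inv6" NA
```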
If a really holds patterns that are found within each of the names of x1, then this grepl approach with names<- might be useful.
new <- sapply( a , grepl , x = names( x1 ) )
names( x1 ) <- b[ apply( new , 1 , which.max ) ]
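A caveat with the grepl version, shown in the sketch below on made-up names (`nms` is hypothetical): which.max returns 1 for a row with no TRUE at all, so a name that matches none of the patterns would silently receive b[1]. One possible guard is to only overwrite rows that have at least one hit. Like the code above, this replaces the whole name with the element of b, not just the matched part.

```r
a <- c("2110027599", "2110025622", "2110023264")
b <- c("Inv1", "Inv2", "Inv6")

# Hypothetical names; "Timestamp" matches none of the patterns
nms <- c("2110023264A.Ms.Amp", "2110025622Error", "Timestamp")

# Logical matrix: one row per name, one column per pattern
hits <- sapply(a, grepl, x = nms, fixed = TRUE)

# Column of the first TRUE in each row (1 when the row is all FALSE)
idx <- apply(hits, 1, which.max)

# Only replace names that matched at least one pattern
matched <- rowSums(hits) > 0
res <- nms
res[matched] <- b[idx[matched]]
res
# [1] "Inv6"      "Inv2"      "Timestamp"
```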
Try mapply.
names(x1) <- mapply(gsub, a, b, names(x1), USE.NAMES = FALSE)
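Note that mapply pairs its arguments element by element: a[1] and b[1] are applied to the first name, a[2] and b[2] to the second, and so on. The sketch below, on made-up names, suggests it only gives the intended result when the names line up with the patterns in order:

```r
a <- c("2110027599", "2110025622")
b <- c("Inv1", "Inv2")

# Hypothetical names in the same order as a: each pattern meets its own name
aligned <- c("2110027599Pac", "2110025622Error")
res_aligned <- mapply(gsub, a, b, aligned, USE.NAMES = FALSE)
res_aligned
# [1] "Inv1Pac"   "Inv2Error"

# Same names in the opposite order: no pattern meets a matching name
shuffled <- c("2110025622Error", "2110027599Pac")
res_shuffled <- mapply(gsub, a, b, shuffled, USE.NAMES = FALSE)
res_shuffled
# [1] "2110025622Error" "2110027599Pac"
```

For names like those in the question, where many columns share one prefix, an approach that tries every pattern against every name is likely the safer choice.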
Or, even simpler, str_replace from stringr.
library(stringr)
names(x1) <- str_replace(names(x1), a, b)