The letter "y" comes after "i" when sorting alphabetically
When I use the function sort(x), where x is a character vector, the letter "y" jumps to the middle, right after the letter "i":
    > letters
     [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t"
    [21] "u" "v" "w" "x" "y" "z"
    > sort(letters)
     [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "y" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
    [21] "t" "u" "v" "w" "x" "z"
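The reason is probably that I am located in Lithuania and this is the "Lithuanian" way of sorting letters, but I need the normal sorting. How can I change the sorting method back to normal in R code?
I am using R 2.15.2 on Win7.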
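You need to change the locale that R is running in. Either do that for your entire Windows installation (which seems suboptimal) or do it within the R session via: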
Sys.setlocale("LC_COLLATE", "C")
You can use any other valid locale string in place of "C", but this should get you back to the sort order you want for letters.
Read ?locales for more information.
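If changing the session locale is not desirable, the method argument of sort() may be worth a look. This is a minimal sketch that assumes R 3.3.0 or later (newer than the R 2.15.2 in the question), where method = "radix" is documented to order character vectors as in the C locale regardless of LC_COLLATE:

```r
# Assumes R >= 3.3.0; this use of method = "radix" is not available in R 2.15.2.
# The radix method orders character vectors as in the C locale,
# whatever LC_COLLATE is currently set to.
sorted <- sort(letters, method = "radix")
print(sorted)
```

This leaves the locale untouched, so other locale-dependent behaviour in the session is unaffected.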
I suppose it is worth noting the sister function Sys.getlocale(), which queries the current setting of a locale parameter. So you could do
    (locCol <- Sys.getlocale("LC_COLLATE"))
    Sys.setlocale("LC_COLLATE", "lt_LT")
    sort(letters)
    Sys.setlocale("LC_COLLATE", locCol)
    sort(letters)
    Sys.getlocale("LC_COLLATE")

    ## giving:
    > (locCol <- Sys.getlocale("LC_COLLATE"))
    [1] "en_GB.UTF-8"
    > Sys.setlocale("LC_COLLATE", "lt_LT")
    [1] "lt_LT"
    > sort(letters)
     [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "y" "j" "k" "l" "m" "n"
    [16] "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "z"
    > Sys.setlocale("LC_COLLATE", locCol)
    [1] "en_GB.UTF-8"
    > sort(letters)
     [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o"
    [16] "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
    > Sys.getlocale("LC_COLLATE")
    [1] "en_GB.UTF-8"
This is, of course, what @hadley's answer shows with_collate() doing far more succinctly, once you have devtools installed.
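Without devtools, the same save-and-restore pattern can be wrapped in a small helper. The sketch below is only an illustration: sort_with_collate is a hypothetical name, not a function from base R or devtools, and it assumes the requested locale string is valid on the current system.

```r
# Hypothetical helper (not part of base R or devtools): sort x under a
# given LC_COLLATE setting, then restore the previous setting.
sort_with_collate <- function(x, collate = "C") {
  old <- Sys.getlocale("LC_COLLATE")
  # on.exit() restores the old collation even if sort() fails
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)
  Sys.setlocale("LC_COLLATE", collate)
  sort(x)
}

before <- Sys.getlocale("LC_COLLATE")
res <- sort_with_collate(letters)   # sorts under the "C" collation
after <- Sys.getlocale("LC_COLLATE")
```

Using on.exit() rather than a manual reset at the end means the original collation comes back even when the sort raises an error.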
If you want to do this temporarily, devtools provides the with_collate function:
    library(devtools)
    with_collate("C", sort(letters))
    #  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
    # [20] "t" "u" "v" "w" "x" "y" "z"
    with_collate("lt_LT", sort(letters))
    #  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "y" "j" "k" "l" "m" "n" "o" "p" "q" "r"
    # [20] "s" "t" "u" "v" "w" "x" "z"