The letter "y" comes after "i" when sorting alphabetically

When I use the function sort(x), where x is a character vector, the letter "y" jumps to the middle, right after the letter "i":

    > letters
     [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t"
    [21] "u" "v" "w" "x" "y" "z"
    > sort(letters)
     [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "y" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
    [21] "t" "u" "v" "w" "x" "z"

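The reason may be that I am located in Lithuania and this is the "Lithuanian" alphabetical ordering, but I need the normal ordering. How do I change the sorting method back to normal from within R code?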

I am using R 2.15.2 on Win7.

You need to change the locale that R is running in. Either do that for your entire Windows install (which seems suboptimal), or do it within the R session via:

    Sys.setlocale("LC_COLLATE", "C")

You could use any other valid locale string in place of "C", but "C" should get you back to the sort order for letters that you want.

Read ?locales for more information.
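If changing the locale is undesirable, a possible alternative is to ask sort() for a method that ignores locale collation. The sketch below assumes R 3.3.0 or later, where method = "radix" is available for character vectors and is documented to always collate as in the "C" locale; it would not work on R 2.15.2. The vector x is only an illustration.

```r
# Sketch, assuming R >= 3.3.0: radix sorting of character vectors
# always follows "C" collation, whatever LC_COLLATE is set to.
x <- c("j", "y", "i", "z", "a")

sorted_c <- sort(x, method = "radix")
print(sorted_c)

# The session's collation setting is left untouched.
Sys.getlocale("LC_COLLATE")
```

This keeps the rest of the session (for example, sorting of real Lithuanian text) under the native collation rules.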

I think it is worth noting the sister function Sys.getlocale(), which queries the current setting of a locale parameter. So you could do

    (locCol <- Sys.getlocale("LC_COLLATE"))
    Sys.setlocale("LC_COLLATE", "lt_LT")
    sort(letters)
    Sys.setlocale("LC_COLLATE", locCol)
    sort(letters)
    Sys.getlocale("LC_COLLATE")

    ## giving:
    > (locCol <- Sys.getlocale("LC_COLLATE"))
    [1] "en_GB.UTF-8"
    > Sys.setlocale("LC_COLLATE", "lt_LT")
    [1] "lt_LT"
    > sort(letters)
     [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "y" "j" "k" "l" "m" "n"
    [16] "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "z"
    > Sys.setlocale("LC_COLLATE", locCol)
    [1] "en_GB.UTF-8"
    > sort(letters)
     [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o"
    [16] "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
    > Sys.getlocale("LC_COLLATE")
    [1] "en_GB.UTF-8"
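The save, set and restore steps above can be wrapped in a small function so that the original collation comes back even if sort() throws an error. This is only a sketch: sort_with_collation is a hypothetical helper name, not something provided by base R, and it uses on.exit() to do the restoring.

```r
# Hypothetical helper (not part of base R): sort x under the given
# LC_COLLATE setting, then restore whatever was set before.
sort_with_collation <- function(x, locale = "C") {
  old <- Sys.getlocale("LC_COLLATE")
  # on.exit() runs when the function returns or fails.
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)
  Sys.setlocale("LC_COLLATE", locale)
  sort(x)
}

before <- Sys.getlocale("LC_COLLATE")
sorted <- sort_with_collation(letters, "C")
after  <- Sys.getlocale("LC_COLLATE")

print(sorted)
identical(before, after)  # the collation setting is unchanged afterwards
```

Passing a locale name such as "lt_LT" only works where that locale exists on the system; valid locale names differ between Windows and Unix-alikes.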

Of course, as @Hadley's answer shows, with_collate() does this more succinctly once you have devtools installed.

If you only want to do this temporarily, devtools provides the with_collate function:

    library(devtools)
    with_collate("C", sort(letters))
    #  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
    # [20] "t" "u" "v" "w" "x" "y" "z"
    with_collate("lt_LT", sort(letters))
    #  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "y" "j" "k" "l" "m" "n" "o" "p" "q" "r"
    # [20] "s" "t" "u" "v" "w" "x" "z"
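On newer installations the with_* helpers are, as far as I know, no longer exported by devtools and live in the separate withr package instead. The sketch below assumes withr is installed and guards the call so that it is skipped otherwise.

```r
# Sketch, assuming the withr package is installed; the collation is
# changed only for the duration of the sort() call.
if (requireNamespace("withr", quietly = TRUE)) {
  sorted <- withr::with_collate("C", sort(letters))
  print(sorted)
}
```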