How to split a string in Haskell?
Is there a standard way to split a string in Haskell?
lines and words work great for splitting on spaces or newlines, but surely there is a standard way to split on a comma? I couldn't find it on Hoogle.
To be specific, I'm looking for something where split "," "my,comma,separated,list" returns ["my","comma","separated","list"].
Thanks.
There is a package for this called split.
cabal install split
Use it like this:
ghci> import Data.List.Split
ghci> splitOn "," "my,comma,separated,list"
["my","comma","separated","list"]
It comes with a lot of other functions for splitting on matching delimiters or on several delimiters at once.
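To give a feel for those other functions, here is a small sketch using a few of the combinators exported by Data.List.Split. It assumes the split package is installed; the examples list and its labels are only there for illustration, and the comments show what each call should return.

```haskell
import Data.List.Split (endBy, splitOn, splitOneOf, splitWhen, wordsBy)

-- Each pair is (function name, result); `examples` is an illustrative name.
examples :: [(String, [String])]
examples =
  [ ("splitOn",    splitOn ", " "a, b, c")     -- multi-character delimiter: ["a","b","c"]
  , ("splitOneOf", splitOneOf ",;" "a,b;c")    -- any of several delimiters: ["a","b","c"]
  , ("splitWhen",  splitWhen (== ',') "a,,b")  -- predicate, keeps empty fields: ["a","","b"]
  , ("endBy",      endBy ";" "a;b;")           -- delimiter as terminator: ["a","b"]
  , ("wordsBy",    wordsBy (== ',') ",a,,b,")  -- predicate, drops empty fields: ["a","b"]
  ]
```

Evaluating mapM_ print examples in ghci prints each name next to its result.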
Remember that you can look up the definition of Prelude functions!
http://www.haskell.org/onlinereport/standard-prelude.html
Looking there, the definition of words is:
words :: String -> [String]
words s = case dropWhile Char.isSpace s of
  "" -> []
  s' -> w : words s''
    where (w, s'') = break Char.isSpace s'
So, change it into a function that takes a predicate:
wordsWhen :: (Char -> Bool) -> String -> [String]
wordsWhen p s = case dropWhile p s of
  "" -> []
  s' -> w : wordsWhen p s''
    where (w, s'') = break p s'
Then call it with whatever predicate you want!
main = print $ wordsWhen (==',') "break,this,string,at,commas"
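One thing worth knowing before relying on this: like words, wordsWhen skips runs of delimiters, so empty fields never appear in the result. The sketch below repeats wordsWhen so it stands alone; splitCommaOrSemicolon and collapsed are illustrative names, not part of the answer.

```haskell
-- wordsWhen exactly as defined above, repeated so this block is self-contained.
wordsWhen :: (Char -> Bool) -> String -> [String]
wordsWhen p s = case dropWhile p s of
  "" -> []
  s' -> w : wordsWhen p s''
    where (w, s'') = break p s'

-- Hypothetical helper: the predicate can match more than one delimiter.
splitCommaOrSemicolon :: String -> [String]
splitCommaOrSemicolon = wordsWhen (`elem` ",;")

-- Leading, trailing and doubled delimiters are all swallowed by dropWhile,
-- so this evaluates to ["a","b"] rather than ["a","","b",""].
collapsed :: [String]
collapsed = wordsWhen (== ',') "a,,b,"
```

If the empty fields matter (for example when parsing CSV-like rows), this is probably not the function you want.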
If you use Data.Text, there is splitOn:
http://hackage.haskell.org/packages/archive/text/0.11.2.0/doc/html/Data-Text.html#v:splitOn
It ships with the Haskell Platform.
For example:
import qualified Data.Text as T

main = print $ T.splitOn (T.pack " ") (T.pack "this is a test")
or:
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text as T

main = print $ T.splitOn " " "this is a test"
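Since the question asks for a function from String to [String], here is a minimal sketch of wrapping the Data.Text functions that way. The names splitOnString and splitOnChar are made up for this example. T.splitOn raises an error when given an empty delimiter, so the wrapper guards against that by returning the input unsplit, which is an assumption about what you would want in that case.

```haskell
import qualified Data.Text as T

-- Hypothetical wrapper: split a String on a delimiter String.
-- An empty delimiter returns the input as a single field (our choice),
-- because T.splitOn itself would raise an error.
splitOnString :: String -> String -> [String]
splitOnString delim s
  | null delim = [s]
  | otherwise  = map T.unpack (T.splitOn (T.pack delim) (T.pack s))

-- Hypothetical wrapper around T.split, which takes a character predicate.
splitOnChar :: Char -> String -> [String]
splitOnChar c = map T.unpack . T.split (== c) . T.pack
```

With these, splitOnString "," "my,comma,separated,list" gives ["my","comma","separated","list"]. Both keep empty fields, so "a,,b" yields three elements.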
In the module Text.Regex (part of the Haskell Platform), there is a function:
splitRegex :: Regex -> String -> [String]
which splits a string based on a regular expression. The API can be found on Hackage.
Use Data.List.Split, which comes from the split package:
[me@localhost]$ ghci
Prelude> import Data.List.Split
Prelude Data.List.Split> let l = splitOn "," "1,2,3,4"
Prelude Data.List.Split> :t l
l :: [[Char]]
Prelude Data.List.Split> l
["1","2","3","4"]
Prelude Data.List.Split> let { convert :: [String] -> [Integer]; convert = map read }
Prelude Data.List.Split> let l2 = convert l
Prelude Data.List.Split> :t l2
l2 :: [Integer]
Prelude Data.List.Split> l2
[1,2,3,4]
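Note that map read throws an exception at runtime if any field is not a valid number. A possible safer variant, sketched here with readMaybe from Text.Read (in base), returns Nothing instead. The name convertSafe is hypothetical.

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: parse every field as an Integer, or return Nothing
-- as soon as one field fails to parse.
convertSafe :: [String] -> Maybe [Integer]
convertSafe = traverse readMaybe
```

For instance, convertSafe ["1","2","3","4"] is Just [1,2,3,4], while convertSafe ["1","x"] is Nothing.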
Try this one:
import Data.List (unfoldr)

separateBy :: Eq a => a -> [a] -> [[a]]
separateBy chr = unfoldr sep
  where
    sep [] = Nothing
    sep l  = Just . fmap (drop 1) . break (== chr) $ l
It only works for a single-element delimiter, but should be easy to extend.
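As one possible way to do that extension, the sketch below keeps the unfoldr structure but matches a whole delimiter list with isPrefixOf. The name separateByList and the handling of an empty delimiter (return the input as a single chunk) are assumptions of this example, not something from the answer.

```haskell
import Data.List (isPrefixOf, unfoldr)

-- Hypothetical extension of separateBy to a multi-element delimiter.
separateByList :: Eq a => [a] -> [a] -> [[a]]
separateByList delim xs
  | null delim = [xs]            -- assumption: an empty delimiter splits nothing
  | otherwise  = unfoldr sep xs
  where
    sep [] = Nothing
    sep l  = Just (chunk l)
    -- chunk returns (elements before the next delimiter, input after it)
    chunk [] = ([], [])
    chunk rest@(y : ys)
      | delim `isPrefixOf` rest = ([], drop (length delim) rest)
      | otherwise               = let (c, remaining) = chunk ys
                                  in (y : c, remaining)
```

Like separateBy, it keeps empty chunks between adjacent delimiters and drops a single trailing delimiter, so separateByList ", " "a, b, " gives ["a","b"].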
split :: Eq a => a -> [a] -> [[a]]
split d [] = []
split d s = x : split d (drop 1 y)
  where (x, y) = span (/= d) s
E.g.
split ';' "a;bb;ccc;;d"
> ["a","bb","ccc","","d"]
A single trailing delimiter will be dropped:
split ';' "a;bb;ccc;;d;"
> ["a","bb","ccc","","d"]
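If the trailing empty field should be kept instead, one option is to pattern match on the result of span rather than calling drop 1. This is a sketch under that assumption; splitKeepTrailing is a hypothetical name, and note that it returns [[]] for empty input where split returns [].

```haskell
-- Hypothetical variant of split that keeps a trailing empty field.
splitKeepTrailing :: Eq a => a -> [a] -> [[a]]
splitKeepTrailing d s = case span (/= d) s of
  (x, [])       -> [x]                           -- no delimiter left: last field
  (x, _ : rest) -> x : splitKeepTrailing d rest  -- skip the delimiter, continue
```

With this version, splitKeepTrailing ';' "a;bb;ccc;;d;" evaluates to ["a","bb","ccc","","d",""].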
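I don't know how to add a comment to Steve's answer, but I would like to recommend the GHC libraries documentation, and in there specifically the sublist functions in Data.List, which is much better as a reference than just reading the plain Haskell report.
Generically, a fold with a rule on when to create a new sublist to feed should solve it too.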
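A minimal sketch of that fold idea, assuming a single-element delimiter: the accumulator is a non-empty list of sublists, and the rule is "start a new sublist whenever the delimiter is seen, otherwise push onto the current one". The name splitFold is made up for this example.

```haskell
-- Hypothetical fold-based splitter; the accumulator always has a head sublist.
splitFold :: Eq a => a -> [a] -> [[a]]
splitFold d = foldr step [[]]
  where
    step x acc@(cur : rest)
      | x == d    = [] : acc          -- delimiter: open a new sublist
      | otherwise = (x : cur) : rest  -- ordinary element: extend current sublist
    step _ [] = [[]]                  -- unreachable, keeps the match total
```

Because foldr works from the right, no reversing is needed. Empty fields are preserved, including a trailing one, so splitFold ',' "a,,b," gives ["a","","b",""].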
I started learning Haskell yesterday, so correct me if I'm wrong, but:
split :: Eq a => a -> [a] -> [[a]]
split x y = func x y [[]]
  where
    func x [] z = reverse $ map (reverse) z
    func x (y:ys) (z:zs) =
      if y == x
        then func x ys ([] : (z:zs))
        else func x ys ((y:z) : zs)
gives:
*Main> split ' ' "this is a test"
["this","is","a","test"]
or maybe you wanted
*Main> splitWithStr " and " "this and is and a and test"
["this","is","a","test"]
which would be:
splitWithStr :: Eq a => [a] -> [a] -> [[a]]
splitWithStr x y = func x y [[]]
  where
    func x [] z = reverse $ map (reverse) z
    func x (y:ys) (z:zs) =
      if (take (length x) (y:ys)) == x
        then func x (drop (length x) (y:ys)) ([] : (z:zs))
        else func x ys ((y:z) : zs)
Example in ghci:
> import qualified Text.Regex as R
> R.splitRegex (R.mkRegex "x") "2x3x777"
["2","3","777"]
In addition to the efficient, pre-built functions given in the other answers, I'll add my own, which are simply part of the collection of Haskell functions I wrote while learning the language:
-- Correct but inefficient implementation
wordsBy :: String -> Char -> [String]
wordsBy s c = reverse (go s [])
  where
    go s' ws = case dropWhile (\c' -> c' == c) s' of
      ""   -> ws
      rest -> go (dropWhile (\c' -> c' /= c) rest) (takeWhile (\c' -> c' /= c) rest : ws)

-- Breaks up by predicate function to allow for more complex conditions (\c -> c == ',' || c == ';')
wordsByF :: String -> (Char -> Bool) -> [String]
wordsByF s f = reverse (go s [])
  where
    go s' ws = case dropWhile (\c' -> f c') s' of
      ""   -> ws
      rest -> go (dropWhile (\c' -> (f c') == False) rest) (takeWhile (\c' -> (f c') == False) rest : ws)
The solutions are at least tail-recursive, so they won't incur a stack overflow.