如何使用Haskell在列表中分组类似的项目?
给出这样的元组列表:
dic = [(1,"aa"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg"),(1,"bb")]
如何分组的项目产生一个列表grp在哪里,
grp = [(1,["aa","bb","cc"]), (2, ["aa"]), (3, ["ff","gg"])]
我其实是Haskell的新手…似乎正在爱上它..
在Data.List中使用group或groupBy将只分组列表中的相似的相邻项目。 我为此写了一个低效率的函数,但是由于我需要处理一个非常大的编码string列表,所以导致内存失败。 希望你能帮我find一个更有效的方法。
这是我的解决scheme:
import Data.Function (on) import Data.List (sortBy, groupBy) import Data.Ord (comparing) myGroup :: (Eq a, Ord a) => [(a, b)] -> [(a, [b])] myGroup = map (\l -> (fst . head $ l, map snd l)) . groupBy ((==) `on` fst) . sortBy (comparing fst)
这个工作首先用sortBy
对列表进行sortBy
:
[(1,"aa"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg"),(1,"bb")] => [(1,"aa"),(1,"bb"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg")]
然后用groupBy
将关联的键分组列表元素:
[(1,"aa"),(1,"bb"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg")] => [[(1,"aa"),(1,"bb"),(1,"cc")],[(2,"aa")],[(3,"ff"),(3,"gg")]]
然后将分组项转换为带有map
元组:
[[(1,"aa"),(1,"bb"),(1,"cc")],[(2,"aa")],[(3,"ff"),(3,"gg")]] => [(1,["aa","bb","cc"]), (2, ["aa"]), (3, ["ff","gg"])]`)
testing:
> myGroup dic [(1,["aa","bb","cc"]),(2,["aa"]),(3,["ff","gg"])]
只要有可能,重用库代码。
import Data.Map sortAndGroup assocs = fromListWith (++) [(k, [v]) | (k, v) <- assocs]
尝试在ghci:
*Main> sortAndGroup [(1,"aa"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg"),(1,"bb")] fromList [(1,["bb","cc","aa"]),(2,["aa"]),(3,["gg","ff"])]
您也可以使用TransformListComp扩展名,例如:
Prelude> :set -XTransformListComp Prelude> import GHC.Exts (groupWith, the) Prelude GHC.Exts> let dic = [ (1, "aa"), (1, "bb"), (1, "cc") , (2, "aa"), (3, "ff"), (3, "gg")] Prelude GHC.Exts> [(the key, value) | (key, value) <- dic, then group by key using groupWith] [(1,["aa","bb","cc"]),(2,["aa"]),(3,["ff","gg"])]
-
如果列表没有sorting在第一个元素上,我不认为你可以比O(nlog(n))做的更好。
-
一个简单的方法就是对第二部分的答案进行
sort
然后使用任何东西。 -
你可以从
Data.Map
中使用一个像Map k [a]
这样的Map k [a]
来使用元组的第一个元素作为关键字,并继续添加这些值。 -
你可以编写自己的复杂函数,即使你所有的尝试仍然会采取O(nlog(n))。
-
-
如果list与第一个元素一样,就像在你的例子中那样,那么对于像@Mikhail回答中给出的groupBy或者使用foldr这样的任务是微不足道的,还有很多其他的方法。
使用foldr的一个例子是:
grp :: Eq a => [(a,b)] -> [(a,[b])] grp = foldr f [] where f (z,s) [] = [(z,[s])] f (z,s) a@((x,y):xs) | x == z = (x,s:y):xs | otherwise = (z,[s]):a
{-# LANGUAGE TransformListComp #-} import GHC.Exts import Data.List import Data.Function (on) process :: [(Integer, String)] -> [(Integer, [String])] process list = [(the a, b) | let info = [ (x, y) | (x, y) <- list, then sortWith by y ], (a, b) <- info, then group by a using groupWith]