如何使用Haskell在列表中分组类似的项目?

给出这样的元组列表:

dic = [(1,"aa"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg"),(1,"bb")] 

如何分组的项目产生一个列表grp在哪里,

 grp = [(1,["aa","bb","cc"]), (2, ["aa"]), (3, ["ff","gg"])] 

我其实是Haskell的新手…似乎正在爱上它..
Data.List中使用groupgroupBy将只分组列表中的相似的相邻项目。 我为此写了一个低效率的函数,但是由于我需要处理一个非常大的编码string列表,所以导致内存失败。 希望你能帮我find一个更有效的方法。

这是我的解决scheme:

 import Data.Function (on) import Data.List (sortBy, groupBy) import Data.Ord (comparing) myGroup :: (Eq a, Ord a) => [(a, b)] -> [(a, [b])] myGroup = map (\l -> (fst . head $ l, map snd l)) . groupBy ((==) `on` fst) . sortBy (comparing fst) 

这个工作首先用sortBy对列表进行sortBy

 [(1,"aa"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg"),(1,"bb")] => [(1,"aa"),(1,"bb"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg")] 

然后用groupBy将关联的键分组列表元素:

 [(1,"aa"),(1,"bb"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg")] => [[(1,"aa"),(1,"bb"),(1,"cc")],[(2,"aa")],[(3,"ff"),(3,"gg")]] 

然后将分组项转换为带有map元组:

 [[(1,"aa"),(1,"bb"),(1,"cc")],[(2,"aa")],[(3,"ff"),(3,"gg")]] => [(1,["aa","bb","cc"]), (2, ["aa"]), (3, ["ff","gg"])]`) 

testing:

 > myGroup dic [(1,["aa","bb","cc"]),(2,["aa"]),(3,["ff","gg"])] 

只要有可能,重用库代码。

 import Data.Map sortAndGroup assocs = fromListWith (++) [(k, [v]) | (k, v) <- assocs] 

尝试在ghci:

 *Main> sortAndGroup [(1,"aa"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg"),(1,"bb")] fromList [(1,["bb","cc","aa"]),(2,["aa"]),(3,["gg","ff"])] 

您也可以使用TransformListComp扩展名,例如:

 Prelude> :set -XTransformListComp Prelude> import GHC.Exts (groupWith, the) Prelude GHC.Exts> let dic = [ (1, "aa"), (1, "bb"), (1, "cc") , (2, "aa"), (3, "ff"), (3, "gg")] Prelude GHC.Exts> [(the key, value) | (key, value) <- dic, then group by key using groupWith] [(1,["aa","bb","cc"]),(2,["aa"]),(3,["ff","gg"])] 
  1. 如果列表没有sorting在第一个元素上,我不认为你可以比O(nlog(n))做的更好。

    • 一个简单的方法就是对第二部分的答案进行sort然后使用任何东西。

    • 你可以从Data.Map中使用一个像Map k [a]这样的Map k [a]来使用元组的第一个元素作为关键字,并继续添加这些值。

    • 你可以编写自己的复杂函数,即使你所有的尝试仍然会采取O(nlog(n))。

  2. 如果list与第一个元素一样,就像在你的例子中那样,那么对于像@Mikhail回答中给出的groupBy或者使用foldr这样的任务是微不足道的,还有很多其他的方法。

使用foldr的一个例子是:

  grp :: Eq a => [(a,b)] -> [(a,[b])] grp = foldr f [] where f (z,s) [] = [(z,[s])] f (z,s) a@((x,y):xs) | x == z = (x,s:y):xs | otherwise = (z,[s]):a 
 {-# LANGUAGE TransformListComp #-} import GHC.Exts import Data.List import Data.Function (on) process :: [(Integer, String)] -> [(Integer, [String])] process list = [(the a, b) | let info = [ (x, y) | (x, y) <- list, then sortWith by y ], (a, b) <- info, then group by a using groupWith]