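Making a flat list out of a list of lists in Python

I wonder whether there is a shortcut to make a simple list out of a list of lists in Python.

I can do that in a for loop, but maybe there is some cool "one-liner"? I tried it with reduce, but I get an error.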

Code

    l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
    reduce(lambda x, y: x.extend(y), l)

Error message

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 1, in <lambda>
    AttributeError: 'NoneType' object has no attribute 'extend'

    flat_list = [item for sublist in l for item in sublist]

which means:

    flat_list = []
    for sublist in l:
        for item in sublist:
            flat_list.append(item)

is faster than the shortcuts posted so far. (l is the list to flatten.)

Here is the corresponding function:

    flatten = lambda l: [item for sublist in l for item in sublist]

As evidence, as always, you can use the timeit module in the standard library:

    $ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
    10000 loops, best of 3: 143 usec per loop
    $ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
    1000 loops, best of 3: 969 usec per loop
    $ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'
    1000 loops, best of 3: 1.1 msec per loop

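Explanation: the shortcuts based on + (including the implied use in sum) are, of necessity, O(L**2) when there are L sublists. As the intermediate result list keeps getting longer, each step allocates a new intermediate result list object, and all the items in the previous intermediate result must be copied over (as well as a few new ones added at the end). So, for simplicity and without actual loss of generality, say you have L sublists of I items each: the first I items are copied back and forth L-1 times, the second I items L-2 times, and so on. The total number of copies is I times the sum of x for x from 1 to L excluded, i.e., I * (L**2)/2.

The list comprehension generates just one list, once, and copies each item over (from its original place of residence to the result list) exactly once as well.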
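The copy-count argument above can be checked with a small sketch. It counts element copies rather than measuring time, assuming L sublists of I items each. The helper names copies_with_plus and copies_with_comprehension are hypothetical and exist only for this illustration.

```python
def copies_with_plus(num_sublists, items_per_sublist):
    """Count element copies made by repeated + (as in sum(l, []))."""
    copies = 0
    result_len = 0
    for _ in range(num_sublists):
        # Each + allocates a new list holding the old result
        # plus the items of the next sublist, so all of them are copied.
        result_len += items_per_sublist
        copies += result_len
    return copies


def copies_with_comprehension(num_sublists, items_per_sublist):
    """A comprehension copies each item into the result exactly once."""
    return num_sublists * items_per_sublist


for L in (10, 100, 1000):
    print(L, copies_with_plus(L, 3), copies_with_comprehension(L, 3))
```

The first count grows roughly like I * L**2 / 2, while the second grows like I * L.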

You can use itertools.chain():

    >>> import itertools
    >>> list2d = [[1,2,3],[4,5,6], [7], [8,9]]
    >>> merged = list(itertools.chain(*list2d))

Or, on Python >= 2.6, use itertools.chain.from_iterable(), which doesn't require unpacking the list:

    >>> import itertools
    >>> list2d = [[1,2,3],[4,5,6], [7], [8,9]]
    >>> merged = list(itertools.chain.from_iterable(list2d))

This approach is arguably more readable than [item for sublist in l for item in sublist] and appears to be faster too:

    [me@home]$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99;import itertools' 'list(itertools.chain.from_iterable(l))'
    10000 loops, best of 3: 24.2 usec per loop
    [me@home]$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
    10000 loops, best of 3: 45.2 usec per loop
    [me@home]$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
    1000 loops, best of 3: 488 usec per loop
    [me@home]$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'
    1000 loops, best of 3: 522 usec per loop
    [me@home]$ python --version
    Python 2.7.3
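One practical difference between the two itertools spellings is how they consume their input. The sketch below assumes a hypothetical generator named sublists that produces the sublists lazily; it is not part of the original answer.

```python
import itertools


def sublists():
    """Hypothetical generator that produces three sublists lazily."""
    for start in range(0, 9, 3):
        yield [start + 1, start + 2, start + 3]


# chain(*iterable) has to consume the whole generator up front to build
# the argument tuple; chain.from_iterable pulls one sublist at a time.
unpacked = list(itertools.chain(*sublists()))
lazy = list(itertools.chain.from_iterable(sublists()))

print(unpacked)
print(lazy)
```

Both calls produce the same flat list; the from_iterable form simply avoids materialising all sublists as function arguments first.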

    >>> sum(l, [])
    [1, 2, 3, 4, 5, 6, 7, 8, 9]

Note that this only works on lists of lists. For lists of lists of lists, you'll need another solution.
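A small sketch of that limitation: sum() with a list as the start value removes exactly one level of nesting and requires every top-level element to be a list. The variable names here are made up for the illustration.

```python
nested = [[1, [2, 3]], [4], [[5]]]

# Only the top-level sublists are concatenated; inner lists survive unchanged.
one_level = sum(nested, [])
print(one_level)  # [1, [2, 3], 4, [5]]

# A non-list element at the top level cannot be concatenated to a list.
raised = False
try:
    sum([[1, 2], 3], [])
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```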

@Nadia: You have to use much longer lists. Then you will see the difference quite strikingly! My results for len(l) = 1600:

    A took 14.323 ms
    B took 13.437 ms
    C took 1.135 ms

where:

    A = reduce(lambda x,y: x+y,l)
    B = sum(l, [])
    C = [item for sublist in l for item in sublist]

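    >>> from functools import reduce  # only needed on Python 3
    >>> l = [[1,2,3],[4,5,6], [7], [8,9]]
    >>> reduce(lambda x,y: x+y,l)
    [1, 2, 3, 4, 5, 6, 7, 8, 9]

The extend() method in your example modifies x in place instead of returning a useful value (which reduce() expects).

A faster way to do the reduce version would be

    >>> import operator
    >>> l = [[1,2,3],[4,5,6], [7], [8,9]]
    >>> reduce(operator.concat, l)
    [1, 2, 3, 4, 5, 6, 7, 8, 9]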
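To make the failure in the question concrete, here is a minimal sketch of what happens step by step. It only uses the standard library; the variable names returned and message are introduced for the illustration.

```python
from functools import reduce  # reduce is a builtin only on Python 2

x = [1, 2, 3]
returned = x.extend([4, 5, 6])
print(returned)  # None: extend() mutates x in place and returns nothing
print(x)         # [1, 2, 3, 4, 5, 6]

l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
message = ""
try:
    # The first step returns None, so the second step calls None.extend([7]).
    reduce(lambda a, b: a.extend(b), l)
except AttributeError as exc:
    message = str(exc)
    print(message)
```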

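I tested most of the suggested solutions with perfplot (a pet project of mine, essentially a wrapper around timeit), and found

    list(itertools.chain.from_iterable(a))

to be the fastest solution (if more than 10 lists are concatenated).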


Code to reproduce the plot:

    import functools
    import itertools
    import numpy
    import operator
    import perfplot


    def forfor(a):
        return [item for sublist in a for item in sublist]


    def sum_brackets(a):
        return sum(a, [])


    def functools_reduce(a):
        return functools.reduce(operator.concat, a)


    def itertools_chain(a):
        return list(itertools.chain.from_iterable(a))


    def numpy_flat(a):
        return list(numpy.array(a).flat)


    def numpy_concatenate(a):
        return list(numpy.concatenate(a))


    perfplot.show(
        # list(range(10)) rather than range(10): on Python 3, range objects
        # cannot be concatenated with + by sum() or operator.concat
        setup=lambda n: [list(range(10))] * n,
        kernels=[
            forfor, sum_brackets, functools_reduce, itertools_chain,
            numpy_flat, numpy_concatenate
        ],
        n_range=[2**k for k in range(12)],
        logx=True,
        logy=True,
        xlabel='num lists'
    )

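I take my statement back. sum is not the winner. It is faster when the list is small, but the performance degrades significantly with larger lists.

    >>> timeit.Timer(
            '[item for sublist in l for item in sublist]',
            'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]] * 10000'
        ).timeit(100)
    2.0440959930419922

The sum version has been running for more than a minute and still hasn't finished processing!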

For medium lists:

    >>> timeit.Timer(
            '[item for sublist in l for item in sublist]',
            'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]] * 10'
        ).timeit()
    20.126545906066895
    >>> timeit.Timer(
            'reduce(lambda x,y: x+y,l)',
            'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]] * 10'
        ).timeit()
    22.242258071899414
    >>> timeit.Timer(
            'sum(l, [])',
            'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]] * 10'
        ).timeit()
    16.449732065200806

Using small lists and timeit with number = 1000000:

    >>> timeit.Timer(
            '[item for sublist in l for item in sublist]',
            'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]]'
        ).timeit()
    2.4598159790039062
    >>> timeit.Timer(
            'reduce(lambda x,y: x+y,l)',
            'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]]'
        ).timeit()
    1.5289170742034912
    >>> timeit.Timer(
            'sum(l, [])',
            'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]]'
        ).timeit()
    1.0598428249359131
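For readers who want to rerun this kind of comparison on Python 3, the following is one possible sketch using only the standard library. The candidates dictionary and its labels are made up for this example, the input size is borrowed from the other benchmarks in this thread, and the timings will differ from machine to machine.

```python
import itertools
import operator
import timeit
from functools import reduce

l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99

# Each candidate is a zero-argument callable so timeit can call it directly.
candidates = {
    "comprehension": lambda: [item for sublist in l for item in sublist],
    "chain.from_iterable": lambda: list(itertools.chain.from_iterable(l)),
    "sum": lambda: sum(l, []),
    "reduce+concat": lambda: reduce(operator.concat, l),
}

runs = 200
for name, func in candidates.items():
    seconds = timeit.timeit(func, number=runs)
    print(f"{name:>20}: {seconds / runs * 1e6:8.1f} usec per call")
```

Increasing the multiplier on l should make the quadratic behaviour of the sum and reduce variants more visible.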

Here is a general approach that applies to nested lists of lists, numbers, strings, and other mixed container types.

    from collections.abc import Iterable  # collections.Iterable was removed in Python 3.10


    def flatten(items):
        """Yield items from any nested iterable; see REF."""
        for x in items:
            if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
                yield from flatten(x)
            else:
                yield x


    l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]  # list of lists
    list(flatten(l))
    # [1, 2, 3, 4, 5, 6, 7, 8, 9]

    items = [[1, [2]], (3, 4, {5, 6}, 7), 8, "9"]  # numbers & mixed containers
    list(flatten(items))
    # [1, 2, 3, 4, 5, 6, 7, 8, '9']

This solution uses Python 3's yield from keyword, which extracts items from sub-generators. ~~Note this solution does not apply to strings.~~ UPDATE: Now supports strings.

REF: solution modified from Beazley, D. and B. Jones. Recipe 4.14, Python Cookbook 3rd Ed., O'Reilly Media Inc. Sebastopol, CA: 2013.
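The (str, bytes) exclusion in flatten() matters because a one-character string is itself an iterable that yields the same one-character string. The sketch below uses a hypothetical variant, naive_flatten, that omits the exclusion, to show what goes wrong without it.

```python
from collections.abc import Iterable


def naive_flatten(items):
    """Like flatten(), but without the (str, bytes) exclusion."""
    for x in items:
        if isinstance(x, Iterable):
            yield from naive_flatten(x)
        else:
            yield x


try:
    list(naive_flatten(["ab"]))
    recursed_forever = False
except RecursionError:
    # "a" is iterable and yields "a" again, so the recursion never bottoms out.
    recursed_forever = True

print(recursed_forever)
```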

Why do you use extend?

    reduce(lambda x, y: x+y, l)

This should work fine.

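There seems to be a confusion with operator.add! When you add two lists together, the correct term for that is concat, not add. operator.concat is what you need to use.

If you're thinking functionally, it is as easy as this:

    >>> import operator
    >>> from functools import reduce  # only needed on Python 3
    >>> list2d = ((1,2,3),(4,5,6), (7,), (8,9))
    >>> reduce(operator.concat, list2d)
    (1, 2, 3, 4, 5, 6, 7, 8, 9)

You see that reduce respects the sequence type, so when you supply a tuple, you get back a tuple. Let's try with a list:

    >>> list2d = [[1,2,3],[4,5,6], [7], [8,9]]
    >>> reduce(operator.concat, list2d)
    [1, 2, 3, 4, 5, 6, 7, 8, 9]

Aha, you get back a list.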

How about performance:

    >>> import itertools
    >>> list2d = [[1,2,3],[4,5,6], [7], [8,9]]
    >>> %timeit list(itertools.chain.from_iterable(list2d))
    1000000 loops, best of 3: 1.36 µs per loop

from_iterable is pretty fast! But, at least on this small input, it is no match for reduce with concat:

    >>> list2d = ((1,2,3),(4,5,6), (7,), (8,9))
    >>> %timeit reduce(operator.concat, list2d)
    1000000 loops, best of 3: 492 ns per loop

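The reason your function didn't work: extend extends the list in place and doesn't return it. You can still return x from the lambda, using a small trick:

    reduce(lambda x,y: x.extend(y) or x, l)

Note: extend is more efficient than + on lists.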
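One caveat with the extend-based trick, sketched below: without an initial value, reduce uses the first sublist as the accumulator, so that sublist is modified in place. Passing a fresh empty list as the third argument to reduce is one way to avoid this; the names l1, l2, flat, and flat_safe are made up for the illustration.

```python
from functools import reduce

l1 = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]

# Without an initializer, the first sublist is the accumulator and is mutated.
flat = reduce(lambda x, y: x.extend(y) or x, l1)
print(flat is l1[0])  # True: l1[0] now holds all nine items

l2 = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]

# With a fresh list as the initializer, the input is left untouched.
flat_safe = reduce(lambda x, y: x.extend(y) or x, l2, [])
print(flat_safe)
print(l2[0])
```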

If you want to flatten a data structure where you don't know how deeply it is nested, you could use iteration_utilities.deepflatten¹

    >>> from iteration_utilities import deepflatten

    >>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
    >>> list(deepflatten(l, depth=1))
    [1, 2, 3, 4, 5, 6, 7, 8, 9]

    >>> l = [[1, 2, 3], [4, [5, 6]], 7, [8, 9]]
    >>> list(deepflatten(l))
    [1, 2, 3, 4, 5, 6, 7, 8, 9]

It's a generator, so you need to cast the result to a list or explicitly iterate over it.

To flatten only one level, and if each of the items is itself iterable, you can also use iteration_utilities.flatten, which itself is just a thin wrapper around itertools.chain.from_iterable:

    >>> from iteration_utilities import flatten
    >>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
    >>> list(flatten(l))
    [1, 2, 3, 4, 5, 6, 7, 8, 9]

¹ Disclaimer: I'm the author of that library.

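One bad feature of Anil's function is that it requires the user to always manually specify the second argument as an empty list []. This should instead be a default. Because Python evaluates default argument values only once, when the function is defined, a mutable default such as a list should be created inside the function rather than in the argument list.

Here's a working function:

    def list_flatten(l, a=None):
        # check a
        if a is None:
            # initialize with empty list
            a = []

        for i in l:
            if isinstance(i, list):
                list_flatten(i, a)
            else:
                a.append(i)
        return a

Testing:

    In [2]: lst = [1, 2, [3], [[4]],[5,[6]]]

    In [3]: lst
    Out[3]: [1, 2, [3], [[4]], [5, [6]]]

    In [11]: list_flatten(lst)
    Out[11]: [1, 2, 3, 4, 5, 6]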
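To see why the default has to be created inside the function, here is a deliberately buggy sketch that puts the empty list in the signature instead. The name flatten_shared is hypothetical and used only for this demonstration.

```python
def flatten_shared(l, a=[]):
    """Buggy on purpose: the default list is created once and then reused."""
    for i in l:
        if isinstance(i, list):
            flatten_shared(i, a)
        else:
            a.append(i)
    return a


# list(...) takes a snapshot, because every call returns the same shared list.
first = list(flatten_shared([1, [2]]))
second = list(flatten_shared([3, [4]]))

print(first)   # [1, 2]
print(second)  # [1, 2, 3, 4]  (items from the first call leak in)
```

The a=None pattern avoids this because a new list is built on every top-level call.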

The following seems simplest to me:

    >>> import numpy as np
    >>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
    >>> print (np.concatenate(l))
    [1 2 3 4 5 6 7 8 9]

Consider installing the more_itertools package.

    > pip install more_itertools

It ships with an implementation of flatten (source, from the itertools recipes):

    import more_itertools

    # Using flatten()
    l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
    list(more_itertools.flatten(l))
    # [1, 2, 3, 4, 5, 6, 7, 8, 9]

As of version 2.4, you can flatten more complicated, nested iterables with more_itertools.collapse (source, contributed by abarnet).

    # Using collapse()
    l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]  # given example
    list(more_itertools.collapse(l))
    # [1, 2, 3, 4, 5, 6, 7, 8, 9]

    l = [[1, 2, 3], [[4, 5, 6]], [[[7]]], 8, 9]  # complex nesting
    list(more_itertools.collapse(l))
    # [1, 2, 3, 4, 5, 6, 7, 8, 9]

You can also use NumPy's flat:

    import numpy as np
    list(np.array(l).flat)

Edit 11/02/2016: This only works when the sublists have identical dimensions.

Simple code for fans of the underscore.py package:

    from underscore import _
    _.flatten([[1, 2, 3], [4, 5, 6], [7], [8, 9]])
    # [1, 2, 3, 4, 5, 6, 7, 8, 9]

It solves all flatten problems (non-list items or complex nesting):

    from underscore import _
    # 1 is a non-list item
    # [2, [3]] is complex nesting
    _.flatten([1, [2, [3]], [4, 5, 6], [7], [8, 9]])
    # [1, 2, 3, 4, 5, 6, 7, 8, 9]

You can install underscore.py with pip:

    pip install underscore.py

If you are willing to give up some speed for a cleaner look, then you could use numpy.concatenate().tolist() or numpy.concatenate().ravel().tolist():

    import numpy

    l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99

    %timeit numpy.concatenate(l).ravel().tolist()
    1000 loops, best of 3: 313 µs per loop

    %timeit numpy.concatenate(l).tolist()
    1000 loops, best of 3: 312 µs per loop

    %timeit [item for sublist in l for item in sublist]
    1000 loops, best of 3: 31.5 µs per loop

You can find out more in the documentation for numpy.concatenate and numpy.ravel.

    def flatten(l, a):
        for i in l:
            if isinstance(i, list):
                flatten(i, a)
            else:
                a.append(i)
        return a

    print(flatten([[[1, [1,1, [3, [4,5,]]]], 2, 3], [4, 5],6], []))

    # [1, 1, 1, 3, 4, 5, 2, 3, 4, 5, 6]

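I found the fastest solution (for large lists anyway):

    import numpy as np
    # turn the list into an array and flatten() it
    # (assumes all sublists have the same length)
    np.array(l).flatten()

Done! You can of course turn it back into a list by calling list() on the result.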

    def flatten(alist):
        if alist == []:
            return []
        elif type(alist) is not list:
            return [alist]
        else:
            return flatten(alist[0]) + flatten(alist[1:])

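Cleaned up @Deleet's example:

    from collections.abc import Iterable

    def flatten(l, a=None):
        # create the accumulator per call; a mutable default would be shared between calls
        if a is None:
            a = []
        for i in l:
            # note: strings are Iterable too and would recurse without end here
            if isinstance(i, Iterable):
                flatten(i, a)
            else:
                a.append(i)
        return a

    daList = [[1,4],[5,6],[23,22,234,2],[2], [ [[1,2],[1,2]],[[11,2],[11,22]] ] ]

    print(flatten(daList))

Example: https://repl.it/G8mb/0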

I recently came across a situation where I had a mix of strings and numeric data in sublists, such as

    test = ['591212948',
            ['special', 'assoc', 'of', 'Chicago', 'Jon', 'Doe'],
            ['Jon'],
            ['Doe'],
            ['fl'],
            92001,
            555555555,
            'hello',
            ['hello2', 'a'],
            'b',
            ['hello33', ['z', 'w'], 'b']]

where methods like flat_list = [item for sublist in test for item in sublist] did not work. So, I came up with the following solution for one or more levels of sublists:

    def concatList(data):
        results = []
        for rec in data:
            if isinstance(rec, list):
                # append the sublist's items, then flatten anything still nested
                results += rec
                results = concatList(results)
            else:
                results.append(rec)
        return results

Result:

    In [38]: concatList(test)
    Out[38]:
    ['591212948', 'special', 'assoc', 'of', 'Chicago', 'Jon', 'Doe', 'Jon',
     'Doe', 'fl', 92001, 555555555, 'hello', 'hello2', 'a', 'b', 'hello33',
     'z', 'w', 'b']

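    weird_list = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
    nice_list = list(map(int, ''.join([elem for elem in str(weird_list) if elem not in '[ ]']).split(',')))

You can flatten it by first converting it to a string (other answers are probably better). As written, this only works when every item is an integer.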

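You can avoid recursive calls on the call stack quite simply by using an actual stack data structure.

    alist = [1,[1,2],[1,2,[4,5,6],3, "33"]]
    newlist = []

    # note: this consumes alist (and its sublists) as it goes
    while len(alist) > 0:
        templist = alist.pop()
        if isinstance(templist, list):
            while len(templist) > 0:
                temp = templist.pop()
                if isinstance(temp, list):
                    # push the nested items back so they are popped next
                    for x in temp:
                        templist.append(x)
                else:
                    newlist.append(temp)
        else:
            newlist.append(templist)

    # items were collected back to front, so reverse them
    print(list(reversed(newlist)))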
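A variation on the same idea, offered as a sketch rather than a replacement: keeping a stack of iterators instead of popping from the lists themselves leaves the input intact and produces the items in order without a final reversal. The function name flatten_iterative is made up for this example, and it only descends into list instances.

```python
def flatten_iterative(nested):
    """Flatten arbitrarily nested lists without recursion or mutating the input."""
    result = []
    stack = [iter(nested)]  # explicit stack of iterators, innermost last
    while stack:
        for item in stack[-1]:
            if isinstance(item, list):
                stack.append(iter(item))  # descend into the sublist
                break
            result.append(item)
        else:
            stack.pop()  # the current iterator is exhausted
    return result


alist = [1, [1, 2], [1, 2, [4, 5, 6], 3, "33"]]
flat = flatten_iterative(alist)
print(flat)
print(alist)  # unchanged
```

Because each iterator remembers its position, resuming the parent after a sublist is finished needs no extra bookkeeping.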
