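Does Python have an ordered set?

Python has an ordered dictionary. What about an ordered set?

There is an ordered set recipe for this, which is referred to from the Python 2 documentation. It runs on Py2.6 or later and 3.0 or later without any modifications. The interface is almost exactly the same as a normal set, except that initialisation should be done with a list.

    OrderedSet([1, 2, 3])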

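This is a MutableSet, so the signature for .union doesn't match that of set, but since it includes __or__ something similar can easily be added:

    def union(self, *sets):
        # Return a new OrderedSet; self is left unchanged.
        result = OrderedSet(self)
        for other in sets:
            result |= other  # |= is provided by MutableSet
        return result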

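An ordered set is functionally a special case of an ordered dictionary.

The keys of a dictionary are unique. Thus, if one disregards the values in an ordered dictionary (e.g. by assigning them None), then one has essentially an ordered set.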
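As a minimal sketch of that idea (the variable names are only illustrative), the keys of a collections.OrderedDict can be used directly as the set, with every value set to None:

```python
from collections import OrderedDict

# Use the keys of an ordered dictionary as the set; the values are ignored.
items = OrderedDict.fromkeys(["b", "a", "c", "a", "b"])

items["d"] = None      # add: a new key is appended at the end
items["a"] = None      # re-adding an existing key keeps its original position
items.pop("c", None)   # discard: no error if the key is missing

ordered_unique = list(items)
print(ordered_unique)  # ['b', 'a', 'd']
print("a" in items)    # True
```

This gives uniqueness, insertion order and fast membership tests, but none of the set operators; those are what a full OrderedSet class adds.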

As of Python 3.1 there is collections.OrderedDict. The following is an example implementation of an OrderedSet. (Note that only a few methods need to be defined or overridden: collections.OrderedDict and collections.abc.MutableSet do the heavy lifting.)

    import collections
    import collections.abc

    # MutableSet lives in collections.abc; the old collections.MutableSet
    # alias was removed in Python 3.10.
    class OrderedSet(collections.OrderedDict, collections.abc.MutableSet):

        def update(self, *args, **kwargs):
            if kwargs:
                raise TypeError("update() takes no keyword arguments")
            for s in args:
                for e in s:
                    self.add(e)

        def add(self, elem):
            # Only the key matters; the value is a placeholder.
            self[elem] = None

        def discard(self, elem):
            self.pop(elem, None)

        def __le__(self, other):
            return all(e in other for e in self)

        def __lt__(self, other):
            return self <= other and self != other

        def __ge__(self, other):
            return all(e in self for e in other)

        def __gt__(self, other):
            return self >= other and self != other

        def __repr__(self):
            return 'OrderedSet([%s])' % (', '.join(map(repr, self.keys())))

        def __str__(self):
            return '{%s}' % (', '.join(map(repr, self.keys())))

        # Named methods are thin aliases for the operator methods.
        difference = property(lambda self: self.__sub__)
        difference_update = property(lambda self: self.__isub__)
        intersection = property(lambda self: self.__and__)
        intersection_update = property(lambda self: self.__iand__)
        issubset = property(lambda self: self.__le__)
        issuperset = property(lambda self: self.__ge__)
        symmetric_difference = property(lambda self: self.__xor__)
        symmetric_difference_update = property(lambda self: self.__ixor__)
        union = property(lambda self: self.__or__)

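Implementations on PyPI

While others have pointed out that there is no built-in implementation of an insertion-order preserving set in Python (yet), I feel that this question is missing an answer which states what there is to be found on PyPI.

To the best of my knowledge there currently are:

ordered-set
oset

Both implementations are based on the recipe posted by Raymond Hettinger to ActiveState, which is also mentioned in other answers here. I have checked out both and identified the following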

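critical differences:

ordered-set (version 1.1)
- advantage: O(1) for lookups by index (e.g. my_set[5])
- disadvantage: remove(item) not implemented
oset (version 0.1.3)
- advantage: O(1) for remove(item)
- disadvantage: apparently O(n) for lookups by index

Both implementations have O(1) for add(item) and __contains__(item) (item in my_set).

Unfortunately neither implementation has method-based set operations like set1.union(set2), so you have to use the operator-based form like set1 | set2 instead. See the Python documentation on Set Objects for a full list of the set operation methods and their operator-based equivalents.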
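To illustrate why the operator forms can exist without the named methods, here is a small sketch. TinyOrderedSet is a hypothetical class written for this example, not the code of either PyPI package; it assumes Python 3.7+ so that a plain dict keeps insertion order. The collections.abc.MutableSet base class supplies |, &, - and ^ as mixins once the five core methods are defined, but it does not supply union() and friends.

```python
from collections.abc import MutableSet


class TinyOrderedSet(MutableSet):
    """Hypothetical minimal ordered set backed by a dict (Python 3.7+)."""

    def __init__(self, iterable=()):
        self._data = dict.fromkeys(iterable)

    def __contains__(self, item):
        return item in self._data

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def add(self, item):
        self._data[item] = None

    def discard(self, item):
        self._data.pop(item, None)

    def __repr__(self):
        return "TinyOrderedSet(%r)" % list(self._data)


a = TinyOrderedSet([3, 1, 2])
b = TinyOrderedSet([2, 5])

union = a | b    # operator form, provided by the Set mixin
common = a & b

print(list(union))          # [3, 1, 2, 5]
print(list(common))         # [2]
print(hasattr(a, "union"))  # False: the ABC supplies no named method
```

A class built this way therefore supports set1 | set2 out of the box, while set1.union(set2) has to be written by hand.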

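I first went with ordered-set until I used remove(item) for the first time, which crashed my script with a NotImplementedError. As I have never used lookup by index so far, I have meanwhile switched to oset.

If you know about other implementations on PyPI, let me know in the comments.

If you're using the ordered set to maintain a sorted order, consider using a sorted set implementation from PyPI. The sortedcontainers module provides a SortedSet for just this purpose. Some benefits: pure-Python, fast-as-C implementations, 100% unit test coverage, hours of stress testing.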

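Installing from PyPI is easy with pip:

    pip install sortedcontainers

Note that if you can't pip install, simply pull down the sortedlist.py and sortedset.py files from the open-source repository.

Once installed you can simply:

    from sortedcontainers import SortedSet
    help(SortedSet)

The sortedcontainers module also maintains a performance comparison with several alternative implementations.

For the comment that asked about Python's bag data type, there's alternatively a SortedList data type which can be used to efficiently implement a bag.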
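To make concrete what "sorted set" means here (unique elements, always iterated in sorted order), the following is a stdlib-only sketch built on bisect. TinySortedSet is a hypothetical name used for illustration; this is not how sortedcontainers is implemented, and it is slower for large collections because list insertion is O(n).

```python
import bisect


class TinySortedSet:
    """Hypothetical stdlib-only sorted set: unique items kept in sorted order."""

    def __init__(self, iterable=()):
        self._items = sorted(set(iterable))

    def add(self, item):
        # Binary search for the insertion point; insert only if absent.
        i = bisect.bisect_left(self._items, item)
        if i == len(self._items) or self._items[i] != item:
            self._items.insert(i, item)  # list insertion itself is O(n)

    def __contains__(self, item):
        i = bisect.bisect_left(self._items, item)
        return i < len(self._items) and self._items[i] == item

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


s = TinySortedSet([5, 1, 3, 1])
s.add(4)
s.add(3)         # already present, ignored
print(list(s))   # [1, 3, 4, 5]
```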

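I can do you one better than an OrderedSet: boltons has a pure-Python, 2/3-compatible IndexedSet type that is not only an ordered set, but also supports indexing (as with lists).

Simply pip install boltons (or copy setutils.py into your codebase), import the IndexedSet and:

    >>> from boltons.setutils import IndexedSet
    >>> x = IndexedSet(list(range(4)) + list(range(8)))
    >>> x
    IndexedSet([0, 1, 2, 3, 4, 5, 6, 7])
    >>> x - set(range(2))
    IndexedSet([2, 3, 4, 5, 6, 7])
    >>> x[-1]
    7
    >>> fcr = IndexedSet('freecreditreport.com')
    >>> ''.join(fcr[:fcr.index('.')])
    'frecditpo'

Everything is unique and retained in order. Full disclosure: I wrote the IndexedSet, but that also means you can bug me if there are any issues. 🙂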

A little late to the game, but I've written a class setlist as part of collections-extended that fully implements both Sequence and Set.

    >>> from collections_extended import setlist
    >>> sl = setlist('abracadabra')
    >>> sl
    setlist(('a', 'b', 'r', 'c', 'd'))
    >>> sl[3]
    'c'
    >>> sl[-1]
    'd'
    >>> 'r' in sl  # testing for inclusion is fast
    True
    >>> sl.index('d')  # so is finding the index of an element
    4
    >>> sl.insert(1, 'd')  # inserting an element already in raises a ValueError
    ValueError
    >>> sl.index('d')
    4

GitHub: https://github.com/mlenzen/collections-extended

Documentation: http://collections-extended.lenzm.net/en/latest/

PyPI: https://pypi.python.org/pypi/collections-extended

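If you're already using pandas in your code, its Index object behaves pretty much like an ordered set, as shown in this article.

The ParallelRegression package provides a setList() ordered set class that is more method-complete than the options based on the ActiveState recipe. It supports all methods available for lists and most, if not all, methods available for sets.

For many purposes simply calling sorted will suffice. For example

    >>> s = set([0, 1, 2, 99, 4, 40, 3, 20, 24, 100, 60])
    >>> sorted(s)
    [0, 1, 2, 3, 4, 20, 24, 40, 60, 99, 100]

If you are going to use this repeatedly, each call to sorted incurs overhead, so you might want to save the resulting list once you are done changing the set. If you need to maintain unique elements and keep them sorted, I agree with the suggestion of using OrderedDict from collections with an arbitrary value such as None.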

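There are four kinds of ordering one might want, I believe:

1. Ordered by key
2. Ordered by value (I've not heard of anyone asking for this one though)
3. Ordered by modification time
4. Ordered by addition time

I believe collections.OrderedDict gets you #4. Or you could remove a key and re-add it, for #3.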
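A short sketch of the difference between #4 and #3 with collections.OrderedDict (the keys and values are arbitrary examples). Plain reassignment leaves a key where it was first added, while removing and re-adding it moves it to the end; move_to_end, available since Python 3.2, has the same effect in one call.

```python
from collections import OrderedDict

d = OrderedDict()
d["x"] = 1
d["y"] = 2
d["z"] = 3

d["x"] = 10                  # plain reassignment keeps the original position
by_addition = list(d)
print(by_addition)           # ['x', 'y', 'z']  (addition order, #4)

value = d.pop("x")           # remove the key ...
d["x"] = value               # ... and re-add it, so it moves to the end
by_modification = list(d)
print(by_modification)       # ['y', 'z', 'x']  (modification order, #3)

d.move_to_end("y")           # Python 3.2+ shortcut for the same effect
after_move = list(d)
print(after_move)            # ['z', 'x', 'y']
```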

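For #1, you probably should look into a red-black tree or a treap:

Red-black trees have low variability in operation times (so they might be better for interactive applications), but they aren't as fast as treaps on average. Treaps might be better for batch processing: they don't reorganize themselves often, which makes them fast on average, but when they do reorganize it can take a relatively long while.

Both of these are established data structures with implementations in many languages.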

You can use reduce() to get a list of unique values in one line:

    >>> from functools import reduce  # needed on Python 3
    >>> mylist = [4, 1, 2, 1, 3, 2, 4, 1, 3, 2, 3, 1, 3, 2, 4]
    >>> reduce(lambda a, b: b[0] in a and a or a + b, [[i] for i in mylist])
    [4, 1, 2, 3]

    >>> a = {3, 4, 2, 6, 1, 7}
    >>> type(a)
    <class 'set'>
    >>> sorted(a, reverse=True)
    [7, 6, 4, 3, 2, 1]
    >>> sorted(a)
    [1, 2, 3, 4, 6, 7]
