为什么SortedSet <T> .GetViewBetween不是O（log N）？

在.NET 4.0+中，类SortedSet<T>具有一个名为GetViewBetween(l, r) ，该方法返回包含指定的两个值之间的所有值的树部分的接口视图。鉴于SortedSet<T>被实现为红黑树，我自然希望它在O(log N)时间运行。在C ++中类似的方法是std::set::lower_bound/upper_bound ，在Java中它是TreeSet.headSet/tailSet ，它们是对数的。

但是，这不是事实。下面的代码在32秒内运行，而等价的GetViewBetween O(log N)版本将使这个代码在1-2秒内运行。

 var s = new SortedSet<int>(); int n = 100000; var rand = new Random(1000000007); int sum = 0; for (int i = 0; i < n; ++i) { s.Add(rand.Next()); if (rand.Next() % 2 == 0) { int l = rand.Next(int.MaxValue / 2 - 10); int r = l + rand.Next(int.MaxValue / 2 - 10); var t = s.GetViewBetween(l, r); sum += t.Min; } } Console.WriteLine(sum);

我使用dotPeek反编译System.dll，这里是我得到的：

 public TreeSubSet(SortedSet<T> Underlying, T Min, T Max, bool lowerBoundActive, bool upperBoundActive) : base(Underlying.Comparer) { this.underlying = Underlying; this.min = Min; this.max = Max; this.lBoundActive = lowerBoundActive; this.uBoundActive = upperBoundActive; this.root = this.underlying.FindRange(this.min, this.max, this.lBoundActive, this.uBoundActive); this.count = 0; this.version = -1; this.VersionCheckImpl(); } internal SortedSet<T>.Node FindRange(T from, T to, bool lowerBoundActive, bool upperBoundActive) { SortedSet<T>.Node node = this.root; while (node != null) { if (lowerBoundActive && this.comparer.Compare(from, node.Item) > 0) { node = node.Right; } else { if (!upperBoundActive || this.comparer.Compare(to, node.Item) >= 0) return node; node = node.Left; } } return (SortedSet<T>.Node) null; } private void VersionCheckImpl() { if (this.version == this.underlying.version) return; this.root = this.underlying.FindRange(this.min, this.max, this.lBoundActive, this.uBoundActive); this.version = this.underlying.version; this.count = 0; base.InOrderTreeWalk((TreeWalkPredicate<T>) (n => { SortedSet<T>.TreeSubSet temp_31 = this; int temp_34 = temp_31.count + 1; temp_31.count = temp_34; return true; })); }

所以， FindRange显然是O(log N) ，但是在那之后我们调用了VersionCheckImpl …它只对已find的子树进行线性时间遍历，只是为了重新计算它的节点！

为什么你需要一直这样遍历呢？
为什么.NET不包含O(log N)方法来分割基于键的树，如C ++或Java？这在很多情况下都很有用。

关于`version`字段

UPDATE1：

在我的记忆中，BCL中的许多（也许是所有？）集合都有字段version 。

首先，关于`foreach` ：

根据这个msdn链接

foreach语句为数组或对象集合中的每个元素重复一组embedded式语句。 foreach语句用于遍历集合以获取所需信息，但不应该用于更改集合的内容以避免不可预知的副作用。

在其他许多集合中， version被保护，数据在foreach期间不被修改

例如， HashTable的MoveNext() ：

 public virtual bool MoveNext() { if (this.version != this.hashtable.version) { throw new InvalidOperationException(Environment.GetResourceString("InvalidOperation_EnumFailedVersion")); } .......... }

但是在SortedSet<T>的MoveNext()方法中：

 public bool MoveNext() { this.tree.VersionCheck(); if (this.version != this.tree.version) { ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_EnumFailedVersion); } .... }

UPDATE2：

但是O（N）循环不仅可以用于version而且可以用于Count属性。

因为GetViewBetween的MSDN表示：

此方法返回落在lowerValue和upperValue之间的元素范围的视图，如比较器所定义的…. 您可以在视图和基础SortedSet（Of T）中进行更改 。

所以对于每一个更新应该是同步count字段（键和值已经相同）。确保Count是正确的

有两个政策可以达到目标：

微软的
Mono的

First.MS在代码中牺牲了GetViewBetween()的性能，赢得了Count Property的性能。

VersionCheckImpl()是同步Count属性的一种方法。

二，单声道。在mono的代码中， GetViewBetween()更快，但是在GetCount()方法中：

 internal override int GetCount () { int count = 0; using (var e = set.tree.GetSuffixEnumerator (lower)) { while (e.MoveNext () && set.helper.Compare (upper, e.Current) >= 0) ++count; } return count; }

它总是一个O（N）操作！

为什么SortedSet <T> .GetViewBetween不是O（log N）？

关于`version`字段

UPDATE1：

首先，关于`foreach` ：

UPDATE2：

如何计算excel中的行数与数据？

使用entity framework6和SQLite的问题

我如何判断一个types是“简单”types？即拥有一个单一的值

为什么IIS线程与常规CLR线程相比如此珍贵？

LINQ插入后可以返回'id'字段吗？

字节到ASCII

如何debugging.NET Windows服务OnStart方法？

内联常量列表的实例化

SOLID原则是否真的很稳固？

HashSet <T>和List <T>有什么区别？

为什么SortedSet <T> .GetViewBetween不是O（log N）？

关于version字段

UPDATE1：

首先，关于foreach ：

UPDATE2：

如何计算excel中的行数与数据？

使用entity framework6和SQLite的问题

我如何判断一个types是“简单”types？ 即拥有一个单一的值

为什么IIS线程与常规CLR线程相比如此珍贵？

LINQ插入后可以返回'id'字段吗？

字节到ASCII

如何debugging.NET Windows服务OnStart方法？

内联常量列表的实例化

SOLID原则是否真的很稳固？

HashSet <T>和List <T>有什么区别？

关于`version`字段

首先，关于`foreach` ：

我如何判断一个types是“简单”types？即拥有一个单一的值