什么是重写的System.Object.GetHashCode的最佳algorithm?
在.NET中System.Object.GetHashCode
方法被用在很多地方,在整个.NET基类库中。 特别是在快速查找collections中的物品或确定平等时。 是否有一个标准的algorithm/最佳做法如何实现我的自定义类的GetHashCode
覆盖,所以我不降低性能?
我通常使用类似于Josh Bloch 神话般的 Effective Java中的实现 。 这很快,并创build一个不太可能造成冲突的相当不错的哈希。 select两个不同的素数,例如17和23,然后:
public override int GetHashCode() { unchecked // Overflow is fine, just wrap { int hash = 17; // Suitable nullity checks etc, of course :) hash = hash * 23 + field1.GetHashCode(); hash = hash * 23 + field2.GetHashCode(); hash = hash * 23 + field3.GetHashCode(); return hash; } }
正如评论中指出的那样,您可能会发现,select一个大的素数来代替乘法会更好。 显然486187739是好的…虽然我看到的小数字的大多数例子倾向于使用素数,但是至less有类似的algorithm,其中经常使用非素数。 例如,在后来的FNV不太完整的例子中,我使用的数字显然效果不错 – 但是初始值并不是主要的。 (乘法常数是主要的,但我不知道这是多么的重要。)
由于两个主要原因,这比通常的XOR
更好。 假设我们有一个带有两个int
字段的types:
XorHash(x, x) == XorHash(y, y) == 0 for all x, y XorHash(x, y) == XorHash(y, x) for all x, y
顺便说一下,早期的algorithm是C#编译器当前使用的匿名types的algorithm。
这个页面提供了很多选项。 我认为在大多数情况下,上述内容“足够好”,记住和正确的logging非常容易。 FNV的select同样简单,但使用不同的常量和XOR
而不是ADD
作为组合操作。 它看起来像下面的代码,但正常的FNValgorithm对单个字节进行操作,所以这需要修改每个字节执行一次迭代,而不是每个32位散列值。 FNV也被devise用于可变长度的数据,而我们在这里使用的方式始终是相同数量的字段值。 对这个答案的评论表明,这里的代码实际上并不如上面的加法方法那样工作(在样本情况下testing)。
// Note: Not quite FNV! public override int GetHashCode() { unchecked // Overflow is fine, just wrap { int hash = (int) 2166136261; // Suitable nullity checks etc, of course :) hash = (hash * 16777619) ^ field1.GetHashCode(); hash = (hash * 16777619) ^ field2.GetHashCode(); hash = (hash * 16777619) ^ field3.GetHashCode(); return hash; } }
请注意,有一点需要注意的是,理想情况下,应该在将其添加到依赖哈希代码的集合中之后,防止对平等敏感(因此对哈希码敏感)状态发生更改。
根据文件 :
你可以为不可变的引用types覆盖GetHashCode。 通常,对于可变引用types,只有在以下情况下才应该重写GetHashCode:
- 你可以从不可变的字段计算哈希代码; 要么
- 当对象包含在依赖哈希码的集合中时,可以确保可变对象的哈希码不会更改。
微软已经提供了一个很好的通用HashCode生成器:只需将您的属性/字段值复制到一个匿名types并将其哈希:
new { PropA, PropB, PropC, PropD }.GetHashCode();
这将适用于任何数量的属性。 它不使用拳击或额外的资源。 它只是使用匿名types框架中已经实现的algorithm。
这是我的hashcode助手。
它的优点是它使用genericstypes参数,因此不会导致装箱:
public static class HashHelper { public static int GetHashCode<T1, T2>(T1 arg1, T2 arg2) { unchecked { return 31 * arg1.GetHashCode() + arg2.GetHashCode(); } } public static int GetHashCode<T1, T2, T3>(T1 arg1, T2 arg2, T3 arg3) { unchecked { int hash = arg1.GetHashCode(); hash = 31 * hash + arg2.GetHashCode(); return 31 * hash + arg3.GetHashCode(); } } public static int GetHashCode<T1, T2, T3, T4>(T1 arg1, T2 arg2, T3 arg3, T4 arg4) { unchecked { int hash = arg1.GetHashCode(); hash = 31 * hash + arg2.GetHashCode(); hash = 31 * hash + arg3.GetHashCode(); return 31 * hash + arg4.GetHashCode(); } } public static int GetHashCode<T>(T[] list) { unchecked { int hash = 0; foreach (var item in list) { hash = 31 * hash + item.GetHashCode(); } return hash; } } public static int GetHashCode<T>(IEnumerable<T> list) { unchecked { int hash = 0; foreach (var item in list) { hash = 31 * hash + item.GetHashCode(); } return hash; } } /// <summary> /// Gets a hashcode for a collection for that the order of items /// does not matter. /// So {1, 2, 3} and {3, 2, 1} will get same hash code. /// </summary> public static int GetHashCodeForOrderNoMatterCollection<T>( IEnumerable<T> list) { unchecked { int hash = 0; int count = 0; foreach (var item in list) { hash += item.GetHashCode(); count++; } return 31 * hash + count.GetHashCode(); } } /// <summary> /// Alternative way to get a hashcode is to use a fluent /// interface like this:<br /> /// return 0.CombineHashCode(field1).CombineHashCode(field2). /// CombineHashCode(field3); /// </summary> public static int CombineHashCode<T>(this int hashCode, T arg) { unchecked { return 31 * hashCode + arg.GetHashCode(); } }
也有扩展方法提供stream畅的界面,所以你可以这样使用它:
public override int GetHashCode() { return HashHelper.GetHashCode(Manufacturer, PartN, Quantity); }
或者像这样:
public override int GetHashCode() { return 0.CombineHashCode(Manufacturer) .CombineHashCode(PartN) .CombineHashCode(Quantity); }
我在帮助库中有一个哈希类,用于此目的。
/// <summary> /// This is a simple hashing function from Robert Sedgwicks Hashing in C book. /// Also, some simple optimizations to the algorithm in order to speed up /// its hashing process have been added. from: www.partow.net /// </summary> /// <param name="input">array of objects, parameters combination that you need /// to get a unique hash code for them</param> /// <returns>Hash code</returns> public static int RSHash(params object[] input) { const int b = 378551; int a = 63689; int hash = 0; // If it overflows then just wrap around unchecked { for (int i = 0; i < input.Length; i++) { if (input[i] != null) { hash = hash * a + input[i].GetHashCode(); a = a * b; } } } return hash; }
那么,只要你可以使用它:
public override int GetHashCode() { return Hashing.RSHash(_field1, _field2, _field3); }
我没有评估它的performance,所以欢迎任何反馈。
这是我的助手类,它使用Jon Skeet的实现 。
public static class HashCode { public const int Start = 17; public static int Hash<T>(this int hash, T obj) { var h = EqualityComparer<T>.Default.GetHashCode(obj); return unchecked((hash * 31) + h); } }
用法:
public override int GetHashCode() { return HashCode.Start .Hash(field1) .Hash(field2) .Hash(field3); }
如果你想避免写一个System.Int32的扩展方法:
public struct HashCode { private readonly int hashCode; public HashCode(int hashCode) { this.hashCode = hashCode; } public static HashCode Start { get; } = new HashCode(17); public static implicit operator int(HashCode hashCode) => hashCode.GetHashCode(); public HashCode Hash<T>(T obj) { var h = EqualityComparer<T>.Default.GetHashCode(obj); return unchecked(new HashCode((this.hashCode * 31) + h)); } public override int GetHashCode() => this.hashCode; }
它仍然是通用的,它仍然避免了任何堆分配,它的使用方式完全相同:
public override int GetHashCode() { // This time `HashCode.Start` is not an `Int32`, it's a `HashCode` instance. // And the result is implicitly converted to `Int32`. return HashCode.Start .Hash(field1) .Hash(field2) .Hash(field3); }
马丁评论后更新:
obj != null
导致拳击,所以我切换到默认比较。
- 看到这个答案关于默认比较器的性能。
- 看到这个问题的空值的哈希代码的讨论。
在Equals()比较多个字段的大多数情况下,如果GetHash()在一个字段或多个字段上散列并不重要。 你只需要确保计算哈希值真的很便宜(请不要分配 ),而且速度快( 没有繁重的计算 ,当然也没有数据库连接),并提供了一个很好的分配。
繁重的工作应该是Equals()方法的一部分。 哈希应该是一个非常便宜的操作,以尽可能less的项目调用Equals()。
最后一个提示: 不要依赖GetHashCode()在多个应用程序运行中保持稳定 。 许多.Nettypes不保证它们的哈希码在重新启动后保持不变,所以您应该只在内存数据结构中使用GetHashCode()的值。
直到最近,我的答案都非常接近Jon Skeet在这里。 不过,我最近开始了一个项目,它使用了幂级别为2的散列表,即内部表的大小为8,16,32等的散列表。有一个很好的理由来支持素数大小,但是在那里对于两款机型来说也是一些优势。
而且非常吸引人。 所以经过一些实验和研究之后,我开始用以下方式对哈希进行重新哈希处理:
public static int ReHash(int source) { unchecked { ulong c = 0xDEADBEEFDEADBEEF + (ulong)source; ulong d = 0xE2ADBEEFDEADBEEF ^ c; ulong a = d += c = c << 15 | c >> -15; ulong b = a += d = d << 52 | d >> -52; c ^= b += a = a << 26 | a >> -26; d ^= c += b = b << 51 | b >> -51; a ^= d += c = c << 28 | c >> -28; b ^= a += d = d << 9 | d >> -9; c ^= b += a = a << 47 | a >> -47; d ^= c += b << 54 | b >> -54; a ^= d += c << 32 | c >> 32; a += d << 25 | d >> -25; return (int)(a >> 1); } }
然后我的权力的哈希表没有再吸了。
这让我感到不安,因为上述不应该起作用。 或者更确切地说,除非原始的GetHashCode()
以非常特殊的方式很差,否则它不应该工作。
重新混合一个哈希码不能改善一个好的哈希码,因为唯一可能的影响就是我们引入了更多的碰撞。
重新混合哈希码不能改善可怕的哈希码,因为唯一可能的影响是我们将例如53值上的大量碰撞改变为大量值183487291。
重新混合一个哈希码只能改善一个哈希码,它至less可以很好地避免整个范围内的绝对冲突(2 32个可能的值),但是在哈希表中实际使用时,避免了冲突的冲突。 虽然更简单的“二的幂次表”模式使这一点变得更加明显,但它也对更常见的素数表产生了负面影响,但这并不明显(重新调整的额外工作将超过收益,但好处仍然在那里)。
编辑:我也使用开放寻址,这也会增加对碰撞的敏感度,可能比它的功率更强。
好吧,这是多lessstring.GetHashCode()
可以改进这种方式(按照testing的顺序运行约20-30倍,由于更less的冲突),更扰乱多less我自己的哈希代码可以改善(更多比起那个来说)。
我过去编写的所有GetHashCode()实现,实际上用作本网站的答案的基础,比我经历的要糟得多 。 大部分时间对于大部分用途来说“足够好”,但是我想要更好的东西。
所以我把这个项目放在一边(反正是一个宠物项目),并开始着眼于如何快速地在.NET中生成一个良好的,散布的散列码。
最后我决定把SpookyHash移植到.NET上。 实际上,上面的代码是使用SpookyHash从32位input产生32位输出的快速path版本。
现在,SpookyHash不是一个很好的快速记住一段代码。 我的端口更是如此,因为我为了更好的速度手工添加了很多内容。 但是这就是代码重用的目的。
然后我把这个项目放在一边,因为正如原来的项目产生了如何产生更好的哈希代码的问题,所以项目产生了如何产生更好的.NET memcpy的问题。
然后我回来了,并产生了很多重载,轻松地将几乎所有的本地types(除了decimal
†)都送入哈希码。
Bob Jenkins值得信赖,因为他的原始代码更快,尤其是在64位机器上,这个algorithm是针对‡优化的。
完整的代码可以在https://bitbucket.org/JonHanna/spookilysharp/src看到,但是可以认为上面的代码是它的简化版本。;
但是,由于现在已经写好了,所以可以更容易地使用它:
public override int GetHashCode() { var hash = new SpookyHash(); hash.Update(field1); hash.Update(field2); hash.Update(field3); return hash.Final().GetHashCode(); }
它也需要种子值,所以如果你需要处理不可信的input,并希望防止哈希DoS攻击,你可以设置基于正常运行时间或类似的种子,并使结果不可预知的攻击者:
private static long hashSeed0 = Environment.TickCount; private static long hashSeed1 = DateTime.Now.Ticks; public override int GetHashCode() { //produce different hashes ever time this application is restarted //but remain consistent in each run, so attackers have a harder time //DoSing the hash tables. var hash = new SpookyHash(hashSeed0, hashSeed1); hash.Update(field1); hash.Update(field2); hash.Update(field3); return hash.Final().GetHashCode(); }
*一个很大的惊喜就是手动旋转返回的旋转方法(x << n) | (x >> -n)
(x << n) | (x >> -n)
改进了一些东西。 我可以肯定的是,抖动会为我说明,但分析表明,否则。
† decimal
不是从.NET的angular度来看,尽pipe它来自C#。 问题在于它自己的GetHashCode()
将精度视为重要,而它自己的Equals()
则不精确。 两者都是有效的select,但不是那样混合。 在实现你自己的版本时,你需要select做一个,或者其他的,但我不知道你想要的。
‡通过比较。 如果用在一个string上,64位的SpookyHash比32位的string.GetHashCode()
要快得多,这比64位的string.GetHashCode()
要快一些,这比32位的SpookyHash快得多,但仍然快足以成为一个合理的select。
这是一个很好的:
/// <summary> /// Helper class for generating hash codes suitable /// for use in hashing algorithms and data structures like a hash table. /// </summary> public static class HashCodeHelper { private static int GetHashCodeInternal(int key1, int key2) { unchecked { var num = 0x7e53a269; num = (-1521134295 * num) + key1; num += (num << 10); num ^= (num >> 6); num = ((-1521134295 * num) + key2); num += (num << 10); num ^= (num >> 6); return num; } } /// <summary> /// Returns a hash code for the specified objects /// </summary> /// <param name="arr">An array of objects used for generating the /// hash code.</param> /// <returns> /// A hash code, suitable for use in hashing algorithms and data /// structures like a hash table. /// </returns> public static int GetHashCode(params object[] arr) { int hash = 0; foreach (var item in arr) hash = GetHashCodeInternal(hash, item.GetHashCode()); return hash; } /// <summary> /// Returns a hash code for the specified objects /// </summary> /// <param name="obj1">The first object.</param> /// <param name="obj2">The second object.</param> /// <param name="obj3">The third object.</param> /// <param name="obj4">The fourth object.</param> /// <returns> /// A hash code, suitable for use in hashing algorithms and /// data structures like a hash table. /// </returns> public static int GetHashCode<T1, T2, T3, T4>(T1 obj1, T2 obj2, T3 obj3, T4 obj4) { return GetHashCode(obj1, GetHashCode(obj2, obj3, obj4)); } /// <summary> /// Returns a hash code for the specified objects /// </summary> /// <param name="obj1">The first object.</param> /// <param name="obj2">The second object.</param> /// <param name="obj3">The third object.</param> /// <returns> /// A hash code, suitable for use in hashing algorithms and data /// structures like a hash table. /// </returns> public static int GetHashCode<T1, T2, T3>(T1 obj1, T2 obj2, T3 obj3) { return GetHashCode(obj1, GetHashCode(obj2, obj3)); } /// <summary> /// Returns a hash code for the specified objects /// </summary> /// <param name="obj1">The first object.</param> /// <param name="obj2">The second object.</param> /// <returns> /// A hash code, suitable for use in hashing algorithms and data /// structures like a hash table. /// </returns> public static int GetHashCode<T1, T2>(T1 obj1, T2 obj2) { return GetHashCodeInternal(obj1.GetHashCode(), obj2.GetHashCode()); } }
这里是如何使用它:
private struct Key { private Type _type; private string _field; public Type Type { get { return _type; } } public string Field { get { return _field; } } public Key(Type type, string field) { _type = type; _field = field; } public override int GetHashCode() { return HashCodeHelper.GetHashCode(_field, _type); } public override bool Equals(object obj) { if (!(obj is Key)) return false; var tf = (Key)obj; return tf._field.Equals(_field) && tf._type.Equals(_type); } }
这是我简单化的方法。 我正在使用经典的生成器模式。 这是types安全(没有装箱/拆箱),也与.NET 2.0(没有扩展方法等)compatbile。
它是这样使用的:
public override int GetHashCode() { HashBuilder b = new HashBuilder(); b.AddItems(this.member1, this.member2, this.member3); return b.Result; }
这里是acutalbuild造者类:
internal class HashBuilder { private const int Prime1 = 17; private const int Prime2 = 23; private int result = Prime1; public HashBuilder() { } public HashBuilder(int startHash) { this.result = startHash; } public int Result { get { return this.result; } } public void AddItem<T>(T item) { unchecked { this.result = this.result * Prime2 + item.GetHashCode(); } } public void AddItems<T1, T2>(T1 item1, T2 item2) { this.AddItem(item1); this.AddItem(item2); } public void AddItems<T1, T2, T3>(T1 item1, T2 item2, T3 item3) { this.AddItem(item1); this.AddItem(item2); this.AddItem(item3); } public void AddItems<T1, T2, T3, T4>(T1 item1, T2 item2, T3 item3, T4 item4) { this.AddItem(item1); this.AddItem(item2); this.AddItem(item3); this.AddItem(item4); } public void AddItems<T1, T2, T3, T4, T5>(T1 item1, T2 item2, T3 item3, T4 item4, T5 item5) { this.AddItem(item1); this.AddItem(item2); this.AddItem(item3); this.AddItem(item4); this.AddItem(item5); } public void AddItems<T>(params T[] items) { foreach (T item in items) { this.AddItem(item); } } }
Here is another fluent implementation of the algorithm posted above by Jon Skeet , but which includes no allocations or boxing operations:
public static class Hash { public const int Base = 17; public static int HashObject(this int hash, object obj) { unchecked { return hash * 23 + (obj == null ? 0 : obj.GetHashCode()); } } public static int HashValue<T>(this int hash, T value) where T : struct { unchecked { return hash * 23 + value.GetHashCode(); } } }
用法:
public class MyType<T> { public string Name { get; set; } public string Description { get; set; } public int Value { get; set; } public IEnumerable<T> Children { get; set; } public override int GetHashCode() { return Hash.Base .HashObject(this.Name) .HashObject(this.Description) .HashValue(this.Value) .HashObject(this.Children); } }
The compiler will ensure HashValue
is not called with a class due to the generic type constraint. But there is no compiler support for HashObject
since adding a generic argument also adds a boxing operation.
Most of my work is done with database connectivity which means that my classes all have a unique identifier from the database. I always use the ID from the database to generate the hashcode.
// Unique ID from database private int _id; ... { return _id.GetHashCode(); }
Pretty much similar to nightcoder's solution except it's easier to raise primes if you want to.
PS: This is one of those times where you puke a little in your mouth, knowing that this could be refactored into one method with 9 default's but it would be slower, so you just close your eyes and try to forget about it.
/// <summary> /// Try not to look at the source code. It works. Just rely on it. /// </summary> public static class HashHelper { private const int PrimeOne = 17; private const int PrimeTwo = 23; public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7, T8, T9, T10>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8, T9 arg9, T10 arg10) { unchecked { int hash = PrimeOne; hash = hash * PrimeTwo + arg1.GetHashCode(); hash = hash * PrimeTwo + arg2.GetHashCode(); hash = hash * PrimeTwo + arg3.GetHashCode(); hash = hash * PrimeTwo + arg4.GetHashCode(); hash = hash * PrimeTwo + arg5.GetHashCode(); hash = hash * PrimeTwo + arg6.GetHashCode(); hash = hash * PrimeTwo + arg7.GetHashCode(); hash = hash * PrimeTwo + arg8.GetHashCode(); hash = hash * PrimeTwo + arg9.GetHashCode(); hash = hash * PrimeTwo + arg10.GetHashCode(); return hash; } } public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7, T8, T9>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8, T9 arg9) { unchecked { int hash = PrimeOne; hash = hash * PrimeTwo + arg1.GetHashCode(); hash = hash * PrimeTwo + arg2.GetHashCode(); hash = hash * PrimeTwo + arg3.GetHashCode(); hash = hash * PrimeTwo + arg4.GetHashCode(); hash = hash * PrimeTwo + arg5.GetHashCode(); hash = hash * PrimeTwo + arg6.GetHashCode(); hash = hash * PrimeTwo + arg7.GetHashCode(); hash = hash * PrimeTwo + arg8.GetHashCode(); hash = hash * PrimeTwo + arg9.GetHashCode(); return hash; } } public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7, T8>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8) { unchecked { int hash = PrimeOne; hash = hash * PrimeTwo + arg1.GetHashCode(); hash = hash * PrimeTwo + arg2.GetHashCode(); hash = hash * PrimeTwo + arg3.GetHashCode(); hash = hash * PrimeTwo + arg4.GetHashCode(); hash = hash * PrimeTwo + arg5.GetHashCode(); hash = hash * PrimeTwo + arg6.GetHashCode(); hash = hash * PrimeTwo + arg7.GetHashCode(); hash = hash * PrimeTwo + arg8.GetHashCode(); return hash; } } public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7) { unchecked { int hash = PrimeOne; hash = hash * PrimeTwo + arg1.GetHashCode(); hash = hash * PrimeTwo + arg2.GetHashCode(); hash = hash * PrimeTwo + arg3.GetHashCode(); hash = hash * PrimeTwo + arg4.GetHashCode(); hash = hash * PrimeTwo + arg5.GetHashCode(); hash = hash * PrimeTwo + arg6.GetHashCode(); hash = hash * PrimeTwo + arg7.GetHashCode(); return hash; } } public static int GetHashCode<T1, T2, T3, T4, T5, T6>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6) { unchecked { int hash = PrimeOne; hash = hash * PrimeTwo + arg1.GetHashCode(); hash = hash * PrimeTwo + arg2.GetHashCode(); hash = hash * PrimeTwo + arg3.GetHashCode(); hash = hash * PrimeTwo + arg4.GetHashCode(); hash = hash * PrimeTwo + arg5.GetHashCode(); hash = hash * PrimeTwo + arg6.GetHashCode(); return hash; } } public static int GetHashCode<T1, T2, T3, T4, T5>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5) { unchecked { int hash = PrimeOne; hash = hash * PrimeTwo + arg1.GetHashCode(); hash = hash * PrimeTwo + arg2.GetHashCode(); hash = hash * PrimeTwo + arg3.GetHashCode(); hash = hash * PrimeTwo + arg4.GetHashCode(); hash = hash * PrimeTwo + arg5.GetHashCode(); return hash; } } public static int GetHashCode<T1, T2, T3, T4>(T1 arg1, T2 arg2, T3 arg3, T4 arg4) { unchecked { int hash = PrimeOne; hash = hash * PrimeTwo + arg1.GetHashCode(); hash = hash * PrimeTwo + arg2.GetHashCode(); hash = hash * PrimeTwo + arg3.GetHashCode(); hash = hash * PrimeTwo + arg4.GetHashCode(); return hash; } } public static int GetHashCode<T1, T2, T3>(T1 arg1, T2 arg2, T3 arg3) { unchecked { int hash = PrimeOne; hash = hash * PrimeTwo + arg1.GetHashCode(); hash = hash * PrimeTwo + arg2.GetHashCode(); hash = hash * PrimeTwo + arg3.GetHashCode(); return hash; } } public static int GetHashCode<T1, T2>(T1 arg1, T2 arg2) { unchecked { int hash = PrimeOne; hash = hash * PrimeTwo + arg1.GetHashCode(); hash = hash * PrimeTwo + arg2.GetHashCode(); return hash; } } }
ReSharper users can generate GetHashCode, Equals, and others with ReSharper -> Edit -> Generate Code -> Equality Members
.
// ReSharper's GetHashCode looks like this public override int GetHashCode() { unchecked { int hashCode = Id; hashCode = (hashCode * 397) ^ IntMember; hashCode = (hashCode * 397) ^ OtherIntMember; hashCode = (hashCode * 397) ^ (RefMember != null ? RefMember.GetHashCode() : 0); // ... return hashCode; } }
397
seems to be an ersatz ReSharper "signature", as all methods as of version 2016.2 use it.
Microsoft lead for several way of hashing….
return this.value;// for classes that contain a single int value return x ^ y;//for classes that contain multiple int value return ((int)value ^ (int)(value >> 32));//Forclasses that contain single number bigger than int return obj1.GetHashCode();//for classes that contain class instance field which inherit from object return obj1.GetHashCode() ^ obj2.GetHashCode() ^ obj3.GetHashCode();//for classes that contain multiple class instance field which inherit from object
i can guess that for multiple big int u can use this:
int a=((int)value1 ^ (int)(value1 >> 32)); int b=((int)value2 ^ (int)(value2 >> 32)); int c=((int)value3 ^ (int)(value3 >> 32)); return a ^ b ^ c;
and same for multi-type… all converted first to int using GetHashCode() then the int values will be xor'ed and the result is ur hash…
for those who use hash as ID (i mean a unique value), hash is naturally limit number of digits, i think it was 5 bytes for hashing algorithm, at last MD5…
you may turn multiple value to a hashed value and some of them be same, so don't use it as an identifier… (maybe some day i gonna use your component)
I ran into an issue with floats and decimals using the implementation selected as the answer above.
This test fails (floats; hash is the same even though I switched 2 values to be negative):
var obj1 = new { A = 100m, B = 100m, C = 100m, D = 100m}; var obj2 = new { A = 100m, B = 100m, C = -100m, D = -100m}; var hash1 = ComputeHash(obj1.A, obj1.B, obj1.C, obj1.D); var hash2 = ComputeHash(obj2.A, obj2.B, obj2.C, obj2.D); Assert.IsFalse(hash1 == hash2, string.Format("Hashcode values should be different hash1:{0} hash2:{1}",hash1,hash2));
But this test passes (with ints):
var obj1 = new { A = 100m, B = 100m, C = 100, D = 100}; var obj2 = new { A = 100m, B = 100m, C = -100, D = -100}; var hash1 = ComputeHash(obj1.A, obj1.B, obj1.C, obj1.D); var hash2 = ComputeHash(obj2.A, obj2.B, obj2.C, obj2.D); Assert.IsFalse(hash1 == hash2, string.Format("Hashcode values should be different hash1:{0} hash2:{1}",hash1,hash2));
I changed my implementation to not use GetHashCode for the primitive types and it seems to work better
private static int InternalComputeHash(params object[] obj) { unchecked { var result = (int)SEED_VALUE_PRIME; for (uint i = 0; i < obj.Length; i++) { var currval = result; var nextval = DetermineNextValue(obj[i]); result = (result * MULTIPLIER_VALUE_PRIME) + nextval; } return result; } } private static int DetermineNextValue(object value) { unchecked { int hashCode; if (value is short || value is int || value is byte || value is sbyte || value is uint || value is ushort || value is ulong || value is long || value is float || value is double || value is decimal) { return Convert.ToInt32(value); } else { return value != null ? value.GetHashCode() : 0; } } }
Here is a class that I built with methods to use inside of the overrides of Equals(object obj) and GetHashCode(). I decided to use generics and a hashing algorithm that should be able to cover most objects. Please let me know if you see anything here that doesn't work for some types of object and you have a way to improve it.
public class Equality<T> { public int GetHashCode(T classInstance) { List<FieldInfo> fields = GetFields(); unchecked { int hash = 17; foreach (FieldInfo field in fields) { hash = hash * 397 + field.GetValue(classInstance).GetHashCode(); } return hash; } } public bool Equals(T classInstance, object obj) { if (ReferenceEquals(null, obj)) { return false; } if (ReferenceEquals(this, obj)) { return true; } if (classInstance.GetType() != obj.GetType()) { return false; } return Equals(classInstance, (T)obj); } private bool Equals(T classInstance, T otherInstance) { List<FieldInfo> fields = GetFields(); foreach (var field in fields) { if (!field.GetValue(classInstance).Equals(field.GetValue(otherInstance))) { return false; } } return true; } private List<FieldInfo> GetFields() { Type myType = typeof(T); List<FieldInfo> fields = myType.GetTypeInfo().DeclaredFields.ToList(); return fields; } }
Here is how it's used in a class:
public override bool Equals(object obj) { return new Equality<ClassName>().Equals(this, obj); } public override int GetHashCode() { unchecked { return new Equality<ClassName>().GetHashCode(this); } }