Collectives™ on Stack Overflow
Find centralized, trusted content and collaborate around the technologies you use most.
Learn more about Collectives
Teams
Q&A for work
Connect and share knowledge within a single location that is structured and easy to search.
Learn more about Teams
While writing my own immutable
ByteArray
class that uses a byte array internally, I implemented the
IStructuralEquatable
interface. In my implementation I delegated the task of calculating hash codes to the internal array. While testing it, to my great surprise, I found that my
two different arrays had the same structural hash code
, i.e. they returned the same value from
GetHashCode
. To reproduce:
IStructuralEquatable array11 = new int[] { 1, 1 };
IStructuralEquatable array12 = new int[] { 1, 2 };
IStructuralEquatable array22 = new int[] { 2, 2 };
var comparer = EqualityComparer<int>.Default;
Console.WriteLine(array11.GetHashCode(comparer)); // 32
Console.WriteLine(array12.GetHashCode(comparer)); // 32
Console.WriteLine(array22.GetHashCode(comparer)); // 64
IStructuralEquatable
is quite new and unknown, but I read somewhere that it can be used to compare the contents of collections and arrays. Am I wrong, or is my .Net wrong?
Note that I am not talking about Object.GetHashCode
!
Edit:
So, I am apparently wrong as unequal objects may have equal hash codes. But isn't GetHashCode
returning a somewhat randomly distributed set of values a requirement? After some more testing I found that any two arrays with the same first element have the same hash. I still think this is strange behavior.
–
–
–
What you have described is not a bug. GetHashCode()
does not guarantee unique hashes for nonequal objects.
From MSDN:
If two objects compare as equal, the GetHashCode method for each object must return the same value. However, if two objects do not compare as equal, the GetHashCode methods for the two object do not have to return different values.
While the MSFT .NET implementation of GetHashCode()
for Array.IStructuralEquatable
obeys the principles in the above MSDN documentation, it appears that the authors did not implement it as intended.
Here is the code from "Array.cs":
int IStructuralEquatable.GetHashCode(IEqualityComparer comparer) {
if (comparer == null)
throw new ArgumentNullException("comparer");
Contract.EndContractBlock();
int ret = 0;
for (int i = (this.Length >= 8 ? this.Length - 8 : 0); i < this.Length; i++) {
ret = CombineHashCodes(ret, comparer.GetHashCode(GetValue(0)));
return ret;
Notice in particular this line:
ret = CombineHashCodes(ret, comparer.GetHashCode(GetValue(
0)));
Unless I am mistaken, that 0
was intended to be i
. Because of this, GetHashCode()
always returns the same value for arrays with the same max(0, n-8th) element, where n is the length of the array. This isn't wrong (doesn't violate documentation), but it is clearly not as good as it would be if 0
were replaced with i
. Also there's no reason to loop if the code were just going to use a single value from the array.
–
–
–
–
This bug has been fixed, at least as of .NET 4.6.2. You can see it through Reference Source.
ret = CombineHashCodes(ret, comparer.GetHashCode(GetValue(i)));
GetHashCode
does not return unique values for instances that are not equal. However, instances that are equal will always return the same hash code.
To quote from Object.GetHashCode
method:
If two objects compare as equal, the GetHashCode method for each object must return the same value. However, if two objects do not compare as equal, the GetHashCode methods for the two object do not have to return different values.
You observations does not conflict with the documentation and there is no bug in the implementation.
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.