The hashCode() method has a few nuances worth noting. If you ever add your Objects to a hash-based collection such as a HashSet or a HashMap, hashCode() effectively takes precedence over equals() when Objects are compared: equals() is only consulted once the hash codes match. This is quite subtle, but it makes sense when you think about it for a while.
Say you create the ubiquitous Person class with a name, surname and age.
    package com.blogspot.babyloncandle;

    public final class Person {
        private String name;
        private String surname;
        private int age;

        public Person(final String name, final String surname, final int age) {
            this.name = name;
            this.surname = surname;
            this.age = age;
        }

        public String getName() {
            return name;
        }

        public String getSurname() {
            return surname;
        }

        public int getAge() {
            return age;
        }

        @Override
        public int hashCode() {
            return surname.hashCode();
        }

        @Override
        public boolean equals(final Object obj) {
            if (obj == null || !(obj instanceof Person)) {
                return false;
            }
            final Person incoming = (Person) obj;
            return incoming.getSurname().equals(surname);
        }
    }
Say we initially state that any two Persons with the same surname are equal. The hashCode() implementation also returns a value based on the surname of a Person.
Using Instinct, we could then create a Context for Person to specify that 2 People with the same surname are equal:
    package com.blogspot.babyloncandle;

    import static com.googlecode.instinct.expect.Expect.expect;

    import com.googlecode.instinct.integrate.junit4.InstinctRunner;
    import com.googlecode.instinct.marker.annotate.Context;
    import com.googlecode.instinct.marker.annotate.Specification;
    import org.junit.runner.RunWith;

    import java.util.HashSet;
    import java.util.Set;

    @Context
    @RunWith(InstinctRunner.class)
    public class APersonContext {

        @Specification
        public void shouldEquatePeopleWithTheSameSurname() {
            final Person person1 = new Person("Humpty", "Dumpty", 5);
            final Person person2 = new Person("Lumpy", "Dumpty", 6);
            expect.that(person1).isEqualTo(person2);

            final Set<Person> uniquePeople = new HashSet<Person>();
            uniquePeople.add(person1);
            uniquePeople.add(person2);
            expect.that(uniquePeople).isOfSize(1);
        }
    }
The specification passes with no dramas. Now if we change the hashCode() method of the Person object such that the hashCode is calculated on age:
    @Override
    public int hashCode() {
        return age;
    }
If we rerun the specification, it fails with:
java.lang.AssertionError:
Expected: <1>
got: <2>
This implies that the uniquePeople Set now has 2 elements instead of 1. This almost seems counter-intuitive as we have not changed the implementation of the equals() method of Person.
Why is the spec failing? It has to do with how equals() and hashCode() work together. If 2 Objects are equal, they MUST have the same hashCode() (but 2 Objects with the same hashCode() need not be equal). We have now broken this contract, as equals() works on surname while hashCode() works on age. The implications are more subtle though. The only reason the specification actually failed is that we used a hash implementation (HashSet) to store the 2 Person Objects.
If we add a System.out.println() call to the equals() method of Person, we see that it is never called while the Objects are added to the Set. If we add a System.out.println() call to the hashCode() method of Person, we see that it is called once for each Person Object added to the Set.
This has to do with the way hashing works in Java. The hashCode is used to select a "bucket" into which each Object is added. If Objects have different hashCodes (and so usually land in different buckets), they are never compared for equality. This is very important because, if you go against the equals/hashCode specification, your Objects might never be found in hash implementations. If you revert the hashCode() of Person to use surname while retaining the System.out.println() statements, you'll see that hashCode() is called on each Object first, and equals() is called only once the second Object's hashCode matches the first.
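The println() experiment above can be sketched as a small standalone program. This is only an illustration, not the original Person class: HashCallTrace and TracedPerson are hypothetical names, the println() calls are replaced with a list that records each call, and a boolean flag switches between the surname-based and age-based hashCode(). It assumes Java 7 or later for the diamond operator.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; TracedPerson is a cut-down stand-in for Person.
final class HashCallTrace {

    // Records the order in which HashSet invokes hashCode() and equals().
    static final List<String> calls = new ArrayList<>();

    static final class TracedPerson {
        private final String surname;
        private final int age;
        private final boolean hashOnSurname;

        TracedPerson(String surname, int age, boolean hashOnSurname) {
            this.surname = surname;
            this.age = age;
            this.hashOnSurname = hashOnSurname;
        }

        @Override
        public int hashCode() {
            calls.add("hashCode");
            // Either consistent with equals() (surname) or inconsistent (age).
            return hashOnSurname ? surname.hashCode() : age;
        }

        @Override
        public boolean equals(Object obj) {
            calls.add("equals");
            return obj instanceof TracedPerson
                    && ((TracedPerson) obj).surname.equals(surname);
        }
    }

    // Adds two people with the same surname to a HashSet and returns
    // the recorded calls followed by the resulting set size.
    static List<String> trace(boolean hashOnSurname) {
        calls.clear();
        Set<TracedPerson> people = new HashSet<>();
        people.add(new TracedPerson("Dumpty", 5, hashOnSurname));
        people.add(new TracedPerson("Dumpty", 6, hashOnSurname));
        List<String> result = new ArrayList<>(calls);
        result.add("size=" + people.size());
        return result;
    }

    public static void main(String[] args) {
        System.out.println(trace(true));  // [hashCode, hashCode, equals, size=1]
        System.out.println(trace(false)); // [hashCode, hashCode, size=2]
    }
}
```

With the age-based hashCode(), equals() never appears in the trace, which is why the Set ends up holding both Objects.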
So it's not just that you should override hashCode() whenever you override equals(), as many people do automatically without paying much attention to the hashCode value. You should override it such that you always maintain the contract between the two, or you could end up with hard-to-find errors.
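One way to keep the two methods in step is to derive both from exactly the same fields. The following is a minimal sketch under that assumption, not the only valid approach: ConsistentPerson and ConsistentPersonDemo are hypothetical names, and java.util.Objects requires Java 7 or later.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical demo; ConsistentPerson is an illustrative variant of Person.
final class ConsistentPersonDemo {

    static final class ConsistentPerson {
        private final String name;
        private final String surname;
        private final int age;

        ConsistentPerson(String name, String surname, int age) {
            this.name = name;
            this.surname = surname;
            this.age = age;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (!(obj instanceof ConsistentPerson)) {
                return false; // instanceof is also false for null
            }
            final ConsistentPerson other = (ConsistentPerson) obj;
            // Equality is defined on surname only.
            return Objects.equals(surname, other.surname);
        }

        @Override
        public int hashCode() {
            // Hash exactly the fields equals() compares, and nothing else.
            return Objects.hash(surname);
        }
    }

    // Adds two people with the same surname and returns the set size.
    static int uniqueCount() {
        final Set<ConsistentPerson> uniquePeople = new HashSet<>();
        uniquePeople.add(new ConsistentPerson("Humpty", "Dumpty", 5));
        uniquePeople.add(new ConsistentPerson("Lumpy", "Dumpty", 6));
        return uniquePeople.size();
    }

    public static void main(String[] args) {
        System.out.println(uniqueCount()); // prints 1
    }
}
```

If the definition of equality later changes (say, to surname plus age), both methods have to change together; editing only one of them reintroduces the bug described above.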
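Lots of people return a constant (e.g. 42) from hashCode(). This satisfies the equals/hashCode contract, since equal Objects trivially share the same hashCode. It is not so great for hash performance though, as all Objects land in the same bucket and every lookup has to fall back on equals() comparisons within that bucket, increasing processing time.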
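The cost of a constant hashCode() can be made visible by counting equals() calls. The sketch below is illustrative only: ConstantHashDemo and Key are hypothetical names, and the exact number of equals() calls depends on the HashSet implementation, so the code only distinguishes "none" from "some". It assumes Java 7 or later.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo: compares a constant hashCode() with a distinct one.
final class ConstantHashDemo {

    // Counts how often HashSet has to fall back on equals().
    static int equalsCalls = 0;

    static final class Key {
        private final int id;
        private final boolean constantHash;

        Key(int id, boolean constantHash) {
            this.id = id;
            this.constantHash = constantHash;
        }

        @Override
        public int hashCode() {
            // Both variants honour the contract: equal ids give equal hashes.
            return constantHash ? 42 : id;
        }

        @Override
        public boolean equals(Object obj) {
            equalsCalls++;
            return obj instanceof Key && ((Key) obj).id == id;
        }
    }

    // Adds `count` distinct keys and returns how many equals() calls that took.
    static int equalsCallsFor(int count, boolean constantHash) {
        equalsCalls = 0;
        final Set<Key> keys = new HashSet<>();
        for (int id = 0; id < count; id++) {
            keys.add(new Key(id, constantHash));
        }
        if (keys.size() != count) {
            throw new IllegalStateException("distinct keys were merged");
        }
        return equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println("distinct hashes: " + equalsCallsFor(5, false));
        System.out.println("constant hash:   " + equalsCallsFor(5, true));
    }
}
```

Both variants store all the keys correctly; the difference is that the constant hash forces each insertion to be compared against keys already in the bucket, while distinct hashes need no equals() calls at all.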
You also have to be careful when choosing between a highly unique hashCode (a bucket per group of equal Objects of a certain type, which can lead to a massive increase in the number of buckets) and a hashCode that is not unique at all (1 bucket for all Objects, which can lead to long processing times). Depending on the performance requirements of your application, you may need to make some compromises on the hashCode value you choose to return.
