Understanding Caching in Hibernate

Understanding Caching in Hibernate – Part One : The Session Cache

Hibernate offers caching functionality which is designed to reduce the amount of necessary database access. This is a very powerful feature if used correctly. However, I have seen many cases, and talked to many people, where caching in Hibernate is either not understood correctly or used the wrong way.

There are already a number of good articles on Hibernate caching, which provide good hints on how to use the cache. The Hibernate documentation itself offers good advice. Still, I see value in having a deeper look at the dynamics and behaviour of the Hibernate cache, as it might help people to understand it even better.

Hibernate Cache Types

Hibernate uses different types of caches. Each type of cache is used for a different purpose. Let us first have a look at these cache types.

  • The first cache type is the session cache. The session cache caches objects within the current session.
  • The second cache type is the query cache. The query cache is responsible for caching queries and their results.
  • The third cache type is the second-level cache. The second-level cache is responsible for caching objects across sessions.

Now that we have an overview of the caching types, we can look at them in more detail. We will use dynaTrace to provide detailed insight into the dynamics of Hibernate caching. In this post we will start by looking closer at the session cache.

Sample Data Model

For the samples used in the post, we will use a very simple data model. Although it is simple it is sufficient for illustration purposes and you should not have any problems mapping it to your application’s use cases.

Our data model consists of two entities – Persons and Addresses. Persons have references to Addresses. This allows us to explore caching behaviour regarding single entities as well as relations.

The Session Cache

As already mentioned, the session cache caches objects within the current session. This cache is enabled by default. Let us have a look at the following code sample. We create two queries that load the same Person object. As we are loading the same object twice, we expect the second one to be served from the cache.

    Session session = getSessionFactory().openSession();
    Transaction tx = session.beginTransaction();

    // First query: load the person with id 1
    Query query = session.createQuery("from Person p where p.id=1");
    Iterator it = query.list().iterator();
    while (it.hasNext()) {
        Person p = (Person) it.next();
        System.out.println(p.getFirstName());
    }

    // Second, identical query within the same session
    query = session.createQuery("from Person p where p.id=1");
    it = query.list().iterator();
    while (it.hasNext()) {
        Person p = (Person) it.next();
        System.out.println(p.getFirstName());
    }

    tx.commit();
    session.close();

Using the code above we would expect the query to be executed only once. However, if we look at the PurePath of this transaction, we can see that two database queries have been executed.

Loading a person two times in a row, but no session cache involved

Now we will not use createQuery but instead the load method, passing the key directly as shown in the code below. The System.out.println calls, by the way, are required to force Hibernate to load data at all. If we did not put them in, nothing would get loaded. This is because load by default returns a lazily initialized proxy and touches the database only when a property is first accessed. Though interesting, this is off topic for this post.

    Session session = getSessionFactory().openSession();
    Transaction tx = session.beginTransaction();

    // First load by key; accessing a property triggers the SELECT
    Person person1 = (Person) session.load(Person.class, 1L);
    System.out.println(person1.getFirstName());

    // Second load with the same key within the same session
    Person person2 = (Person) session.load(Person.class, 1L);
    System.out.println(person2.getFirstName());

    tx.commit();
    session.close();

As we can see in the trace below, now only one database query is issued. The same behaviour could have been achieved by using get instead of load. For more information on the difference between these two methods, either refer to the Hibernate documentation or read this nice blog post (I have chosen one from the many out there).

Loading by class and key using session cache

The question now is what the difference between these two scenarios is, and why it works in one case and not in the other. To answer it we have to look deeper into Hibernate to see what is going on in the second example. As shown below, Hibernate first tries to retrieve the object within the session. If this fails (as in the green section), the object is loaded from the database. Object retrieval and storage is handled by the _PersistenceContext_ object, which is kept within the Hibernate session.

Difference First and Second Time Loading
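The key-based lookup described above can be sketched with a small toy model. This is not Hibernate code: ToyDatabase, ToySession and the statement counter are hypothetical names invented for this illustration, and the real PersistenceContext is considerably more involved. The sketch only shows why a lookup by key can be answered from the session, while a query has to be sent to the database each time because the session cannot know in advance which keys it will return.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Toy model for illustration only: none of these classes are Hibernate types.
class ToySessionCacheDemo {

    static class Person {
        final long id;
        final String firstName;

        Person(long id, String firstName) {
            this.id = id;
            this.firstName = firstName;
        }
    }

    /** Stand-in for the database; counts every statement it executes. */
    static class ToyDatabase {
        final Map<Long, String> rows = new HashMap<>();
        int statements = 0;

        String selectById(long id) {
            statements++;
            return rows.get(id);
        }

        Map<Long, String> selectWhere(Predicate<String> firstNameFilter) {
            statements++;
            Map<Long, String> matches = new HashMap<>();
            for (Map.Entry<Long, String> row : rows.entrySet()) {
                if (firstNameFilter.test(row.getValue())) {
                    matches.put(row.getKey(), row.getValue());
                }
            }
            return matches;
        }
    }

    /** Stand-in for a session with its key-indexed persistence context. */
    static class ToySession {
        private final ToyDatabase db;
        private final Map<Long, Person> persistenceContext = new HashMap<>();

        ToySession(ToyDatabase db) {
            this.db = db;
        }

        /** Lookup by key: a known key is answered without any statement. */
        Person load(long id) {
            Person cached = persistenceContext.get(id);
            if (cached != null) {
                return cached;
            }
            String firstName = db.selectById(id);
            if (firstName == null) {
                return null;
            }
            Person loaded = new Person(id, firstName);
            persistenceContext.put(id, loaded);
            return loaded;
        }

        /** Query: the matching keys are unknown up front, so a statement runs every time. */
        List<Person> query(Predicate<String> firstNameFilter) {
            List<Person> result = new ArrayList<>();
            for (Map.Entry<Long, String> row : db.selectWhere(firstNameFilter).entrySet()) {
                // Rows whose key is already known resolve to the existing instance.
                Person p = persistenceContext.computeIfAbsent(
                        row.getKey(), id -> new Person(id, row.getValue()));
                result.add(p);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        ToyDatabase db = new ToyDatabase();
        db.rows.put(1L, "Alice");

        ToySession querySession = new ToySession(db);
        querySession.query(name -> name.equals("Alice"));
        querySession.query(name -> name.equals("Alice"));
        System.out.println("statements after two queries: " + db.statements);

        ToySession loadSession = new ToySession(db);
        loadSession.load(1L);
        loadSession.load(1L);
        System.out.println("statements after two loads: " + db.statements);
    }
}
```

In this model the two queries cost two statements, while the two loads in a fresh session add only one more. Both paths store their results in the same map, which mirrors the observation that objects are kept in the persistence context regardless of how they were loaded.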

The handling for storing objects in the persistence context is the same whether we use the _load_ method or a Hibernate query. The figure below shows the dynamic behavior for loading the object via a Hibernate query. This however also means that all objects loaded within a session are kept as long as this session is open. This can lead to performance problems due to memory consumption in cases where a large number of objects is loaded.

Persistence Context Behavior When Using Hibernate Query

Conclusion

Hibernate internally always uses the session cache transparently. We have also seen that Hibernate requires a key to load an object from the session cache. So if we have a key available, it is preferable to use load with the key instead of an HQL query.

 

Understanding Caching in Hibernate – Part Two : The Query Cache

by Alois Reitbauer, Feb 16, 09

In the last post I wrote on caching in Hibernate in general as well as on the behavior of the session cache. In this post we will have a closer look at the query cache. I will not explain the query cache in detail, as there are very good articles like Hibernate: Truly Understanding the Second-Level and Query Caches.

As we have seen in the last post, the session cache can help in caching values when we have an _EntityKey_ available. If we do not have the key, we run into the problem of having to issue multiple queries for retrieving the same object. This was the reason why the session cache worked fine for the _load_ method but not when we used _session.createQuery()_.

Now this is the point where the query cache comes into play. The query cache is responsible for caching the results of queries, or to be more precise, the keys of the objects returned by queries. Let us have a look at how Hibernate uses the query cache to retrieve objects. In order to make use of the query cache we have to modify the person loading example as follows.

    Session session = getSessionFactory().openSession();
    Transaction tx = session.beginTransaction();

    Query query = session.createQuery("from Person p where p.id=1");
    query.setCacheable(true);   // mark the query as cacheable
    Iterator it = query.list().iterator();
    while (it.hasNext()) {
        Person p = (Person) it.next();
        System.out.println(p.getFirstName());
    }

    query = session.createQuery("from Person p where p.id=1");
    query.setCacheable(true);   // must be set on every Query instance
    it = query.list().iterator();
    while (it.hasNext()) {
        Person p = (Person) it.next();
        System.out.println(p.getFirstName());
    }

    tx.commit();
    session.close();

Compared to the previous example, we had to add the line query.setCacheable(true) to declare that the query is actually cacheable. If we do not do this, the result will not be cached. (Note: The while loops could be omitted here. I am using them for other examples where we have multiple results, just for the lovers of code aesthetics.) Additionally we also have to change the Hibernate configuration to enable the query cache. This is done by adding the following line to the Hibernate configuration.

    <property name="hibernate.cache.use_query_cache">true</property>

Unlike most examples I found on the web, I will not immediately enable the second-level cache, as the basic workings do not depend on it and I do not want to create the impression that the query cache requires the second-level cache or vice versa. Let us now verify that everything is working correctly. As we can see below, only the first _query.list()_ results in a SQL statement being issued.

Caching by the Query Cache For Two Subsequent Queries

The question now is what happens internally. Therefore we analyze what happens within the _get_ method of the _StandardQueryCache_ during the second query. As we can see in the image below, Hibernate first tries to retrieve the key values from the cache (as we can see, the query cache internally uses _EhCache_). After retrieving the keys, the person entity is loaded from the session cache.

Internal Behavior of the Hibernate Query Cache
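The two-step behavior described above (first the keys from the query cache, then the entities through the session cache) can be sketched as a toy model. None of the classes below are Hibernate types; ToyQueryCache, ToySession and the query key format are assumptions made up for this illustration. The sketch also leaves out invalidation entirely, which the real query cache has to handle.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model for illustration only: none of these classes are Hibernate types.
class ToyQueryCacheDemo {

    /** Stand-in for the database; counts every statement it executes. */
    static class ToyDatabase {
        final Map<Long, String> rows = new HashMap<>();
        int statements = 0;

        Map<Long, String> selectByFirstName(String firstName) {
            statements++;
            Map<Long, String> matches = new HashMap<>();
            for (Map.Entry<Long, String> row : rows.entrySet()) {
                if (row.getValue().equals(firstName)) {
                    matches.put(row.getKey(), row.getValue());
                }
            }
            return matches;
        }

        String selectById(long id) {
            statements++;
            return rows.get(id);
        }
    }

    /** Maps a query key to the ids of its results; it stores no entity state. */
    static class ToyQueryCache {
        private final Map<String, List<Long>> idsByQuery = new HashMap<>();

        List<Long> get(String queryKey) {
            return idsByQuery.get(queryKey);
        }

        void put(String queryKey, List<Long> ids) {
            idsByQuery.put(queryKey, new ArrayList<>(ids));
        }
    }

    static class ToySession {
        private final ToyDatabase db;
        private final ToyQueryCache queryCache;
        private final Map<Long, String> sessionCache = new HashMap<>();

        ToySession(ToyDatabase db, ToyQueryCache queryCache) {
            this.db = db;
            this.queryCache = queryCache;
        }

        List<String> findByFirstName(String firstName, boolean cacheable) {
            String queryKey = "from Person p where p.firstName=" + firstName;
            List<Long> ids = cacheable ? queryCache.get(queryKey) : null;
            if (ids == null) {
                // Miss: run the query, keep the rows in the session, remember the keys.
                Map<Long, String> found = db.selectByFirstName(firstName);
                sessionCache.putAll(found);
                ids = new ArrayList<>(found.keySet());
                if (cacheable) {
                    queryCache.put(queryKey, ids);
                }
            }
            List<String> result = new ArrayList<>();
            for (Long id : ids) {
                // Keys are resolved through the session cache before the database.
                result.add(sessionCache.computeIfAbsent(id, db::selectById));
            }
            return result;
        }
    }

    public static void main(String[] args) {
        ToyDatabase db = new ToyDatabase();
        db.rows.put(1L, "Alice");
        ToySession session = new ToySession(db, new ToyQueryCache());

        session.findByFirstName("Alice", true);
        session.findByFirstName("Alice", true);
        System.out.println("statements with cacheable queries: " + db.statements);
    }
}
```

With cacheable set to true the second call finds the ids in the query cache and resolves them from the session cache, so only one statement is counted. With cacheable set to false each call runs its own statement, which corresponds to the behavior seen in part one.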

Query Cache Pitfalls

The query cache can be really useful to optimize the performance of your data access layer. However, there are a number of pitfalls as well. This blog post describes a serious problem regarding memory consumption of the Hibernate query cache when using objects as parameters.

Conclusion

We have learned that the query cache helps us to cache the keys of the results of Hibernate queries. These keys are then used to retrieve data objects using Hibernate's internal loading behavior, which involves the session cache and potentially also the second-level cache.

 

Understanding Caching in Hibernate – Part Three : The Second Level Cache

by Alois Reitbauer, Mar 24, 09

In the previous posts I covered the session cache as well as the query cache. In this post I will focus on the second-level cache. The Hibernate documentation provides a good entry point for reading up on the second-level cache.

The key characteristic of the second-level cache is that it is used across sessions. This differentiates it from the session cache, which, as the name says, only has session scope. Hibernate provides a flexible concept for exchanging cache providers for the second-level cache. By default Ehcache is used as the caching provider. However, more sophisticated caching implementations can be used, like the distributed JBoss Cache or Oracle Coherence.

First we have to modify our code sample so that we now load the Person object in two sessions. The source code then looks as follows:

    public void loadInTwoSessions() {
        // loading in first session
        Session session = getSessionFactory().openSession();
        Transaction tx = session.beginTransaction();
        Person p = (Person) session.load(Person.class, 1L);
        System.out.println(p.getFirstName());
        tx.commit();
        session.close();

        // loading in second session
        session = getSessionFactory().openSession();
        tx = session.beginTransaction();
        p = (Person) session.load(Person.class, 1L);
        System.out.println(p.getFirstName());
        tx.commit();
        session.close();
    }

As we have not activated the second-level cache, we expect the SQL query to be executed twice. Looking at the PurePath of this transaction verifies our assumption.

Loading a person object in two sessions without second-level cache

Now we activate the second-level cache. Activating the second-level cache requires us to change the Hibernate configuration file: we enable second-level caching and additionally specify the cache provider, as shown below.

    <property name="hibernate.cache.use_second_level_cache">true</property>
    <property name="hibernate.cache.provider_class">org.hibernate.cache.EhCacheProvider</property>

In this example I am using Ehcache for demonstration purposes. In order to enable caching of our Person objects we have to specify the caching configuration in the ehcache.xml file. The actual cache configuration depends on the caching provider. For Ehcache the configuration is defined as follows. The configuration for the Person class used in the example is boilerplate Ehcache configuration. It can be adapted to specific needs. Note that with eternal set to true, Ehcache ignores the two timeout settings. Describing all possible configuration options, like using multiple cache regions, is beyond the scope of this post.

    <cache name="com.dynatrace.samples.database.Person"
           maxElementsInMemory="300"
           eternal="true"
           overflowToDisk="false"
           timeToIdleSeconds="12000"
           timeToLiveSeconds="12000"
           diskPersistent="false"
           diskExpiryThreadIntervalSeconds="120"
           memoryStoreEvictionPolicy="LRU"
    />

Finally we have to configure caching at the Hibernate level as well. Hibernate supports multiple settings for caching. As we are only reading data at the moment, a read-only cache is sufficient for our purposes. Hibernate of course supports read-write caches as well, and also transactional caches if the cache provider supports them. The following line in the Hibernate mapping of the Person class enables read-only caching for Person objects. Alternatively also Hibernate associations could be used.

    <cache usage="read-only" />

Now we expect the object to be retrieved from the second-level cache the second time it is loaded. A PurePath trace verifies this assumption. Now a database call gets executed only the first time.

Loading a person object in two sessions with enabled second-level cache

Read-Write Caching

After having looked at plain read caching, we look at read-write caching in the next step. Our code example gets a bit more complex. We again use two sessions. We load the object in the first session, update it, then load it in the second session. Both sessions are created up front and are kept open until the end.

    public void complexLoad() {
        // Both sessions are opened before any work is done
        Session session1 = getSessionFactory().openSession();
        Session session2 = getSessionFactory().openSession();

        // First session: load and update the person
        Transaction tx1 = session1.beginTransaction();
        Person p1 = (Person) session1.load(Person.class, 1L);
        System.out.println(p1.getFirstName());
        p1.setFirstName("" + System.currentTimeMillis());
        tx1.commit();

        // Second session: load the same person again
        Transaction tx2 = session2.beginTransaction();
        Person p2 = (Person) session2.load(Person.class, 1L);
        System.out.println(p2.getFirstName());
        tx2.commit();

        session1.close();
        session2.close();
    }

We expect the object to be retrieved from the cache when it is loaded in the second session.  Looking at the PurePath of this transaction however shows something different. This method executes three SQL statements. First a SELECT to load the Person object, then an UPDATE to update the record in the database and then again a SELECT to load the Person object for the second session.

Two transactions loading a person object, both times leading to a database query.

This is not necessarily what we were expecting. The object could have been retrieved from the cache in the second session. However, it got loaded from the database. So why was the object not taken from the cache? A closer look at the internal Hibernate behaviour unveils this secret.

Details on loading from second-level cache

The PurePath snippet above shows the details for loading the Person object in the second session. The key is the isGettable method, which in this case returns false. The input to isGettable is the session creation timestamp, as indicated by the arrow. A look at the source code unveils what is checked within this method.

    public boolean isGettable(long txTimestamp) {
        return freshTimestamp < txTimestamp;
    }

The method verifies whether the session's timestamp (txTimestamp) is greater than the freshTimestamp of the cached object. In our case the second session was created BEFORE the object was updated. Consequently this method will return false. If we modify our code as follows the object will be loaded from the second-level cache.

    public void complexLoad() {
        // First session: load and update the person
        Session session1 = getSessionFactory().openSession();
        Transaction tx1 = session1.beginTransaction();
        Person p1 = (Person) session1.load(Person.class, 1L);
        System.out.println(p1.getFirstName());
        p1.setFirstName("" + System.currentTimeMillis());
        tx1.commit();

        // Second session is now opened AFTER the update was committed
        Session session2 = getSessionFactory().openSession();
        Transaction tx2 = session2.beginTransaction();
        Person p2 = (Person) session2.load(Person.class, 1L);
        System.out.println(p2.getFirstName());
        tx2.commit();

        session1.close();
        session2.close();
    }

The PurePath snippet below verifies this assumption and shows that this time isGettable returns true and the object is retrieved from the cache.

Loading the Person object from second-level cache directly
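The effect of the timestamp comparison on the two variants of complexLoad can be replayed in isolation. The sketch below is a toy model, not Hibernate code: only the body of isGettable is taken from the article, while Clock, CachedItem and readFromCache are hypothetical names, and a simple counter stands in for the real timestamps.

```java
// Toy model for illustration only: these are not Hibernate classes.
class ToyTimestampCheckDemo {

    /** Logical clock standing in for session and cache timestamps. */
    static class Clock {
        private long now = 0;

        long next() {
            return ++now;
        }
    }

    /** A cache entry that remembers when it was last written. */
    static class CachedItem {
        final String value;
        final long freshTimestamp;

        CachedItem(String value, long freshTimestamp) {
            this.value = value;
            this.freshTimestamp = freshTimestamp;
        }

        // Same comparison as the isGettable method shown in the article.
        boolean isGettable(long txTimestamp) {
            return freshTimestamp < txTimestamp;
        }
    }

    /** Returns the cached value, or null when the caller has to go to the database. */
    static String readFromCache(CachedItem item, long sessionTimestamp) {
        if (item != null && item.isGettable(sessionTimestamp)) {
            return item.value;
        }
        return null;
    }

    public static void main(String[] args) {
        Clock clock = new Clock();

        // First variant: the second session is opened BEFORE the update is cached.
        long earlySession = clock.next();
        CachedItem updated = new CachedItem("new first name", clock.next());
        System.out.println("early session reads: " + readFromCache(updated, earlySession));

        // Second variant: the second session is opened AFTER the update is cached.
        long lateSession = clock.next();
        System.out.println("late session reads: " + readFromCache(updated, lateSession));
    }
}
```

The early session gets null and therefore has to issue a SELECT, while the late session receives the cached value. The strict comparison also means that a session with exactly the same timestamp as the cache entry is treated as too old.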

Interaction of Session and Second-Level Cache

Finally I want to take a short look at the interaction between the session and the second-level cache. The important point to understand is that as soon as we use the second-level cache we have two caches in place. Caches are always a source of inconsistent information, which we accept as the price for better performance and scalability. In order to avoid problems and unwanted behaviour we have to understand their internal behaviour. Hibernate always tries to retrieve objects from the session first. If this fails, it tries to retrieve them from the second-level cache. If this fails again, objects are loaded directly from the database. The PurePath snippet below shows this loading behavior.

Load hierarchy in Hibernate showing logical flow of object retrieval
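The lookup order just described (session first, then second-level cache, then database) can be written down as a small toy model. The classes below are hypothetical and invented for this illustration; they only model the order of the lookups and deliberately ignore timestamps, locking and cache concurrency strategies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Toy model for illustration only: none of these classes are Hibernate types.
class ToyLoadHierarchyDemo {

    enum Source { SESSION_CACHE, SECOND_LEVEL_CACHE, DATABASE }

    /** Shared across sessions, like a second-level cache region. */
    static class SecondLevelCache {
        final Map<Long, String> entries = new HashMap<>();
    }

    static class ToySession {
        private final Map<Long, String> sessionCache = new HashMap<>();
        private final SecondLevelCache secondLevel;
        private final LongFunction<String> database;
        Source lastSource;

        ToySession(SecondLevelCache secondLevel, LongFunction<String> database) {
            this.secondLevel = secondLevel;
            this.database = database;
        }

        String load(long id) {
            // 1. Session cache: private to this session.
            String value = sessionCache.get(id);
            if (value != null) {
                lastSource = Source.SESSION_CACHE;
                return value;
            }
            // 2. Second-level cache: shared between sessions.
            value = secondLevel.entries.get(id);
            if (value != null) {
                lastSource = Source.SECOND_LEVEL_CACHE;
            } else {
                // 3. Database: the last resort, which also fills the shared cache.
                value = database.apply(id);
                lastSource = Source.DATABASE;
                if (value != null) {
                    secondLevel.entries.put(id, value);
                }
            }
            if (value != null) {
                sessionCache.put(id, value);
            }
            return value;
        }
    }

    public static void main(String[] args) {
        Map<Long, String> table = new HashMap<>();
        table.put(1L, "Alice");
        SecondLevelCache shared = new SecondLevelCache();

        ToySession first = new ToySession(shared, id -> table.get(id));
        first.load(1L);
        System.out.println("first session, first load: " + first.lastSource);
        first.load(1L);
        System.out.println("first session, second load: " + first.lastSource);

        ToySession second = new ToySession(shared, id -> table.get(id));
        second.load(1L);
        System.out.println("second session, first load: " + second.lastSource);
    }
}
```

In this model the first load in the first session comes from the database, a repeated load in the same session comes from the session cache, and the first load in a new session comes from the shared cache. Because the session cache is consulted first, a session keeps returning its own copy even if the shared cache or the database has changed in the meantime.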

Conclusion

The second-level cache is a powerful mechanism for improving the performance and scalability of your database-driven application. Read-only caches are easy to handle, while read-write caches are more subtle in their behavior. Especially the interaction with the Hibernate session can lead to unwanted behavior. Sessions should therefore be used as what they are designed for: a transactional context. There are more details on the second-level cache that I did not elaborate on, like synchronization or replication behavior. However, the combination of the three caching articles should provide good insight into Hibernate caching behavior.

See Also

http://blog.dynatrace.com/tag/hibernate/

For Testing Tool:

http://blog.dynatrace.com/2009/11/17/a-step-by-step-guide-to-dynatrace-ajax-edition-available-today-for-public-download/


About Sanju
I am Software Programmer. I am working in JAVA/J2EE Technologies.

2 Responses to Understanding Caching in Hibernate

  1. parthu says:

    Hi, I’m facing a problem with reloading an object.

    The scenario is
    a) storing an object with status ‘R’
    b) calling a stored procedure, which changes the status to ‘S’ in db
    c) fetching the updated status. Instead of giving ‘S’, still the status is showing as ‘R’

    Hibernate cache configuration is
    true
    org.hibernate.cache.EhCacheProvider

    any idea what could be the problem here?

  2. parthu says:

    This issue is resolved.

    Called session.evict() after step 1. The object status is stored as ‘R’ in the cache, and the status was being fetched from the cache instead of the database, hence I called the method evict() to clear the cache. Once the cache is cleared, step 3 fetches the modified value from the database. Thank you for the article, it helped me understand the caching concept.
