A Simple Output Cache Solution for DD4T Java

This blog post presents a very simple output cache solution in the form of a Java Servlet filter. The output cache works together with DD4T Java, although the approach is fairly generic.

The idea behind this output cache is to store the response output in a memory cache, so that a subsequent request to the same URL is served the cached response content instead of being processed again through the normal request processing pipeline. This approach has, of course, a few constraints, the biggest being that there cannot be any per-user personalization in the response: the same response is returned for all requests to that URL, regardless of user, session, cookies, etc. Hence this 'simple' output cache.

The output cache filter uses the incoming request URL as the key in the local memory cache. If a previously cached response is found in cache, it is served; otherwise, the request is let through the processing pipeline and the resulting response is cached, so that a subsequent request for the same URL path will be served from cache.

The filter uses the concept of a 'cached response', which holds not only the actual response content, but also additional attributes such as mime type, encoding, content length, headers and status code.

This particular DD4T implementation relies on a DD4T CacheProvider and CacheInvalidator to store the cached responses in cache and to invalidate them, respectively.
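
The post does not show how the filter gets hold of the CacheProvider. A minimal wiring sketch, assuming the web application uses Spring and exposes the DD4T cache provider as a bean (the bean type and lookup are assumptions and may differ per DD4T version and setup), could be:

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
        // Assumption: the DD4T CacheProvider is available as a Spring bean
        // in the web application context
        WebApplicationContext context =
                WebApplicationContextUtils.getRequiredWebApplicationContext(filterConfig.getServletContext());
        cacheProvider = context.getBean(CacheProvider.class);
    }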

Invalidation occurs when the Page whose response was cached is published again (or unpublished). At that moment, the cache entry corresponding to the page URL path is expired, which triggers either a reload of the expired page (should it be requested again) or an eviction from cache (if the page is not requested again within a given time interval).

Enough talk, let’s see the code… :)

OutputCache Filter

The entire logic is contained in the doFilter() method. Namely, a cache lookup is made for a CachedResponse object corresponding to the requested URL path. If such an entry exists, the CachedResponse object is sent back. Otherwise, the request goes through the processing pipeline and a response wrapper is used to collect the headers, status code, response content, mime type, etc. We use this response wrapper to build the CachedResponse object and then store the latter in the cache.

    @Override
    public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest request = (HttpServletRequest) servletRequest;
        HttpServletResponse response = (HttpServletResponse) servletResponse;

        String url = HttpUtils.getOriginalUri(request);
        url = HttpUtils.appendDefaultPageIfRequired(url);

        String key = getKey(url);
        CacheElement<CachedResponse> cacheElement = cacheProvider.loadFromLocalCache(key);
        CachedResponse cachedResponse = null;

        // Double-checked locking on the cache element: only one thread regenerates
        // an expired entry, while concurrent threads wait and then read the fresh payload
        if (cacheElement.isExpired()) {
            synchronized (cacheElement) {
                if (cacheElement.isExpired()) {
                    // Not cached (or expired): let the request run through the chain,
                    // capturing its output in a response wrapper
                    CharResponseWrapper responseWrapper = new CharResponseWrapper(response);
                    chain.doFilter(request, responseWrapper);

                    cachedResponse = new CachedResponse(responseWrapper);
                    cacheElement.setPayload(cachedResponse);

                    GenericPage page = getPage(url);
                    if (page == null) {
                        // Page not found in the Content Delivery Database: cache without a dependency
                        cacheProvider.storeInItemCache(key, cacheElement);
                    } else {
                        // Cache with a dependency on the Page, so (re)publishing it expires this entry
                        TCMURI tcmuri = new TCMURI(page.getId());
                        cacheProvider.storeInItemCache(key, cacheElement, tcmuri.getPublicationId(), tcmuri.getItemId());
                    }
                } else {
                    cachedResponse = cacheElement.getPayload();
                }
            }
        } else {
            cachedResponse = cacheElement.getPayload();
        }

        sendCachedResponse(request, response, cachedResponse);
    }

The helper method HttpUtils.getOriginalUri returns the original request URL. If the request was forwarded internally, it returns the URL as it was before the forward took place.

Method HttpUtils.appendDefaultPageIfRequired appends a default page (e.g. /index.html) if the current request is for a directory. This is done to facilitate the DD4T PageFactory lookup for the page model.
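
As an illustration only, the two helpers could look roughly like the sketch below (the actual DD4T implementations may differ); the pre-forward URI is available through the standard javax.servlet.forward.request_uri request attribute:

    public static String getOriginalUri(HttpServletRequest request) {
        // If the container forwarded the request internally, the URI before the
        // forward is exposed as a standard request attribute
        String forwardedUri = (String) request.getAttribute("javax.servlet.forward.request_uri");
        return forwardedUri != null ? forwardedUri : request.getRequestURI();
    }

    public static String appendDefaultPageIfRequired(String url) {
        // Directory requests get the default page appended, so the URL matches
        // the Page URL as stored in the Content Delivery Database
        return url.endsWith("/") ? url + "index.html" : url;
    }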

If a page model is found in the Content Delivery Database, the Page item id and Publication id are used to set a dependency on the CachedResponse. This ensures the cache entry is invalidated once the Page is (re-)published. Otherwise, if the page is not found in the Content Delivery Database, the CachedResponse is stored in cache with a dummy TTL.
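
The getKey and getPage helpers referenced in doFilter are not listed in the post. A possible shape for them, assuming a simple key prefix and a DD4T page factory injected into the filter (both pageFactory and publicationId are hypothetical fields here, and the lookup method depends on the DD4T version in use), is:

    private String getKey(String url) {
        // Prefix the URL path so output-cache entries cannot collide with other cache keys
        return "OUTPUT:" + url;
    }

    private GenericPage getPage(String url) {
        try {
            // Look up the page model by URL in the Content Delivery Database
            return (GenericPage) pageFactory.findPageByUrl(url, publicationId);
        } catch (Exception e) {
            // No page model found for this URL
            return null;
        }
    }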

Another method of interest is sendCachedResponse. This method populates the current response with values from the CachedResponse object. It also handles the caching-related request/response headers, namely If-Modified-Since and the 304 Not Modified status.

    private void sendCachedResponse(HttpServletRequest request, HttpServletResponse response, CachedResponse cachedResponse) throws IOException {
        if (cachedResponse != null) {
            if (request.getHeader(IF_MODIFIED_SINCE) != null) {
                try {
                    DateTime lmDate = CachedResponse.parseDateFromHeader(cachedResponse.getHeader(LAST_MODIFIED));

                    String modifiedSince = request.getHeader(IF_MODIFIED_SINCE);

                    if (modifiedSince.contains(";")) {
                        modifiedSince = modifiedSince.substring(0, modifiedSince.indexOf(";"));
                    }

                    DateTime imsDate = CachedResponse.parseDateFromHeader(modifiedSince);
                    if (!lmDate.isAfter(imsDate)) {
                        // the cached Last-Modified date is not newer than the 'If-Modified-Since' date,
                        // so we return a 304 Not Modified
                        response.setStatus(304);
                        response.setHeader(LAST_MODIFIED, cachedResponse.getHeader(LAST_MODIFIED));
                        return;
                    }
                } catch (ParseException | IllegalArgumentException e) {
                    LOG.error("Error parsing Header.", e);
                }
            }

            response.setStatus(cachedResponse.getStatus());
            response.setContentType(cachedResponse.getContentType());

            for (Map.Entry<String, String> entry : cachedResponse.getHeaders().entrySet()) {
                response.setHeader(entry.getKey(), entry.getValue());
            }

            String content = cachedResponse.getContent();
            if (StringUtils.isNotEmpty(content)) {
                response.setContentLength(cachedResponse.getContentLength());
                PrintWriter writer = response.getWriter();
                writer.print(content);
            }
        }
    }

Finally, it’s worth mentioning the CachedResponse object, which contains the following properties and functionality:

    public class CachedResponse {
        private static DateTimeFormatter dateFormat = DateTimeFormat.forPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'")
                .withZoneUTC();
        private String content;
        private int status;
        private String contentType;
        private int contentLength;
        private Map<String, String> headers;

        public CachedResponse(CharResponseWrapper responseWrapper) {
            content = responseWrapper.toString();
            status = responseWrapper.getStatus();
            contentType = responseWrapper.getContentType();
            contentLength = StringUtils.isEmpty(content) ? 0 : content.length();
            headers = new HashMap<>();
            setHeaders(responseWrapper);
        }

        static DateTime parseDateFromHeader(String dateString) throws ParseException {
            return dateFormat.parseDateTime(dateString);
        }

        static String formatDateForHeader(DateTime date) {
            return dateFormat.print(date);
        }

        private void setHeaders(CharResponseWrapper responseWrapper) {
            for (String name : responseWrapper.getHeaderNames()) {
                // Skip the original Last-Modified header; it is replaced below
                if (name.equals("Last-Modified")) {
                    continue;
                }
                headers.put(name, responseWrapper.getHeader(name));
            }
            // Stamp the moment the response was cached as its Last-Modified date
            headers.put("Last-Modified", formatDateForHeader(new DateTime(DateTimeZone.UTC)));
        }

        public String getHeader(String name) {
            return headers.get(name);
        }
    }
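
The CharResponseWrapper consumed by the CachedResponse constructor is not listed in the post either. A minimal sketch of such a wrapper (an assumption based on how it is used above; it relies on the Servlet 3.0+ API already exposing getStatus(), getContentType(), getHeaderNames() and getHeader() on the wrapped response) could be:

    public class CharResponseWrapper extends HttpServletResponseWrapper {
        // Buffers the character output written by the rest of the processing pipeline
        private final CharArrayWriter buffer = new CharArrayWriter();
        private final PrintWriter writer = new PrintWriter(buffer);

        public CharResponseWrapper(HttpServletResponse response) {
            super(response);
        }

        @Override
        public PrintWriter getWriter() {
            return writer;
        }

        @Override
        public String toString() {
            // Flush so everything written through the PrintWriter ends up in the buffer
            writer.flush();
            return buffer.toString();
        }
    }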

Configuration Bits

In the web.xml deployment descriptor, add the following filter definition and filter mapping:

    <filter>
        <filter-name>OutputCacheFilter</filter-name>
        <filter-class>com.anchorage.web.filters.OutputCacheFilter</filter-class>
    </filter>

    <filter-mapping>
        <filter-name>OutputCacheFilter</filter-name>
        <url-pattern>*.html</url-pattern>
        <dispatcher>REQUEST</dispatcher>
    </filter-mapping>

All that is left for me now is to bid you "happy caching!" :)

