
A DD4T.net Implementation - Taxonomy Performance Issues

Retrieving Classified Items for Multiple Keywords

My intention when retrieving the classified items for a taxonomy is to resolve the entire taxonomy (i.e. all its keywords) first, and then cache it. This way, retrieving the related content for a given keyword is very fast. I described this approach for DD4T Java in the post Retrieve Classified Items for Multiple Keywords.

There is one issue with that approach in .NET: it is not part of the standard API, so one must write custom Java Hibernate code to retrieve the classified items. In DD4T Java that is not a problem, but in DD4T .NET, exposing custom Java logic to the .NET CLR is not easy. It would involve writing JNI proxy classes to bridge the two virtual machines. I just don't feel like writing that code.

Enter the second-best approach: resolving classified items on the fly, one keyword at a time, on demand, as described in the post Taxonomy Factory.

Retrieving Large Taxonomies

Another performance issue is retrieving large taxonomies. The API call that reads an entire taxonomy is TaxonomyFactory.GetTaxonomyKeywords(taxonomyUri). This is the only method on the Tridion.ContentDelivery.Taxonomies.TaxonomyFactory class that retrieves a root Keyword with all its Parent/Child keyword properties resolved, so the taxonomy is fully navigable up and down.

However, this method is a bottleneck for large (really large) taxonomies. Internally, it reads all keywords in the taxonomy and all their custom meta objects, which can cause a significant performance hit.

My solution to this problem is a discovery algorithm that reads one keyword in the taxonomy and resolves its parent keywords up to the root. Resolving a keyword means reading its custom meta and the items directly classified against it. The root keyword is cached, together with all its discovered child keywords.

When a new keyword is requested, the algorithm first tries to read it from the cached root keyword (as one of its descendants). If that search does not find the keyword, we assume it has not been resolved yet; the discovery process starts again and attaches the new keyword at its appropriate level in the taxonomy.

Slowly and on demand, the taxonomy structure is built up, and it consists only of the keywords that were requested. The following code shows this algorithm. Note the use of the TaxonomyFactory.GetTaxonomyKeyword() method, which returns a partially resolved keyword.

using dd4t = DD4T.ContentModel;
using tridion = Tridion.ContentDelivery.Taxonomies;

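// Entry point: returns the resolved keyword, discovering it on demand.
public IMyKeyword ResolveKeywordLazy(dd4t.IKeyword keyword)
{
    IMyKeyword result;

    if (keyword == null)
    {
        return null;
    }

    if (keyword is IMyKeyword)
    {
        // Already resolved; nothing to do.
        result = (IMyKeyword)keyword;
    }
    else
    {
        // One cached root keyword per taxonomy.
        IMyKeyword root;
        string key = GetKey(keyword.TaxonomyId);
        CacheWrapper.TryGet(key, out root);
        if (root == null)
        {
            // First request for this taxonomy: resolve the keyword and its
            // ancestors up to the root, then cache the root.
            result = ResolveKeywordLazyRecursive(keyword, out root);
            CacheWrapper.Insert(key, root, cacheMinutes);
        }
        else
        {
            // Look the keyword up in the cached tree; attach it if missing.
            result = ResolveKeywordLazyRecursive(root, keyword);
        }
    }
    return result;
}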

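// Used when a root is already cached: finds the keyword in the cached tree,
// or reads it and attaches it (and any missing ancestors) to the tree.
private IMyKeyword ResolveKeywordLazyRecursive(IMyKeyword root, dd4t.IKeyword keyword)
{
    if (keyword == null)
    {
        return null;
    }

    IMyKeyword result = GetKeywordByUri(root, keyword.Id);
    if (result == null)
    {
        // Not discovered yet: read the partially resolved keyword.
        tridion.Keyword tridionKeyword = taxonomyFactory.GetTaxonomyKeyword(keyword.Id);
        result = new TaxonomyConverter().ConvertToDD4T(tridionKeyword);

        // Walk up until an ancestor that is already in the cached tree is found.
        IMyKeyword parent = ResolveKeywordLazyRecursive(root, result.ParentKeyword);

        if (parent != null)
        {
            // Link the new keyword to its resolved parent in both directions.
            result.ParentKeywords.Clear();
            result.ParentKeywords.Add(parent);
            parent.ChildKeywords.Add(result);
        }
    }

    return result;
}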

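// Used when nothing is cached yet: resolves the keyword and all its ancestors,
// returning the topmost ancestor through the root parameter.
private IMyKeyword ResolveKeywordLazyRecursive(dd4t.IKeyword keyword, out IMyKeyword root)
{
    if (keyword == null)
    {
        root = null;
        return null;
    }

    tridion.Keyword tridionKeyword = taxonomyFactory.GetTaxonomyKeyword(keyword.Id);
    IMyKeyword result = new TaxonomyConverter().ConvertToDD4T(tridionKeyword);

    // Recurse upwards; root is set by the deepest call that has a parent.
    IMyKeyword parent = ResolveKeywordLazyRecursive(result.ParentKeyword, out root);

    if (parent == null)
    {
        // No parent: this keyword is the root of the taxonomy.
        root = result;
    }
    else
    {
        // Link the keyword to its resolved parent in both directions.
        result.ParentKeywords.Clear();
        result.ParentKeywords.Add(parent);
        parent.ChildKeywords.Add(result);
    }

    return result;
}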
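
The listing above calls GetKeywordByUri(root, keyword.Id) without showing it. The following is only a minimal sketch of what such a lookup might look like, assuming it is a plain depth-first search over the keywords discovered so far. KeywordNode is a hypothetical stand-in for IMyKeyword that carries only the Id and ChildKeywords members the search needs; the real helper may well be implemented differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for IMyKeyword: only the members the search needs.
public class KeywordNode
{
    public string Id { get; set; }
    public List<KeywordNode> ChildKeywords { get; } = new List<KeywordNode>();
}

public static class KeywordSearch
{
    // Depth-first search of the cached tree, starting at root.
    // Returns the node whose Id equals uri, or null if it was not discovered yet.
    public static KeywordNode GetKeywordByUri(KeywordNode root, string uri)
    {
        if (root == null || uri == null)
        {
            return null;
        }

        if (string.Equals(root.Id, uri, StringComparison.Ordinal))
        {
            return root;
        }

        foreach (KeywordNode child in root.ChildKeywords)
        {
            KeywordNode found = GetKeywordByUri(child, uri);
            if (found != null)
            {
                return found;
            }
        }

        return null;
    }
}
```

A search like this is linear in the number of keywords discovered so far. Since the cached tree only contains requested keywords, that may be acceptable; if it is not, a dictionary keyed by keyword URI could be cached alongside the root.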

