A DD4T.net Implementation - Taxonomy Performance Issues

Retrieving Classified Items for Multiple Keywords

My intention when retrieving the classified items for a taxonomy is to resolve the entire taxonomy (i.e. all its keywords) first, and then cache it. This way, retrieving the related content for a given keyword would be very fast. I described this approach for DD4T Java in post Retrieve Classified Items for Multiple Keywords.

There is one issue with that approach in .NET: it is not part of the standard API, so one must write custom Java Hibernate code to retrieve the classified items. In DD4T Java that is not an issue, but in DD4T .NET, exposing the custom Java logic to the .NET CLR is not easy. It would involve writing JNI proxy classes to bridge the two virtual machines. I just don't feel like writing that code.

Enter the second-best approach: resolving classified items on the fly, one keyword at a time, on demand, as described in post Taxonomy Factory.

Retrieving Large Taxonomies

Another performance issue is retrieving large taxonomies. The API call to read an entire taxonomy is TaxonomyFactory.GetTaxonomyKeywords(taxonomyUri). This is the only method on the Tridion.ContentDelivery.Taxonomies.TaxonomyFactory class that retrieves a root Keyword with all its Parent/Child keyword properties resolved, so the taxonomy is fully navigable up and down.

However, this method is a bottleneck for large (really large) taxonomies. Internally it reads all keywords in the taxonomy and all their custom meta objects, which can significantly hurt performance.

My solution to this problem was a discovery algorithm that reads one keyword in the taxonomy and resolves its parent keywords up to the root. Resolving a keyword means reading its custom meta and the items directly classified against it. The root keyword is then cached, together with all its discovered child keywords.

When a new keyword is requested, the algorithm first tries to read it from the cached root keyword (as one of its descendants). If that search does not find the keyword, we assume the keyword has not been resolved yet. The discovery process starts again and attaches the new keyword at its appropriate level in the taxonomy.

Slowly and on demand, the taxonomy structure is built up, and it consists only of the keywords that were actually requested. The following code shows this algorithm. Note the use of the TaxonomyFactory.GetTaxonomyKeyword() method, which returns a partially resolved keyword.

using dd4t = DD4T.ContentModel;
using tridion = Tridion.ContentDelivery.Taxonomies;

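// Entry point: returns the resolved keyword, discovering it (and its ancestors)
// on demand and caching the taxonomy root under a per-taxonomy key.
public IMyKeyword ResolveKeywordLazy(dd4t.IKeyword keyword)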
{
    IMyKeyword result;

    if (keyword == null)
    {
        return null;
    }

    if (keyword is IMyKeyword)
    {
        result = (IMyKeyword)keyword;
    }
    else
    {
        IMyKeyword root;
        string key = GetKey(keyword.TaxonomyId);
        CacheWrapper.TryGet(key, out root);
        if (root == null)
        {
            result = ResolveKeywordLazyRecursive(keyword, out root);
            CacheWrapper.Insert(key, root, cacheMinutes);
        }
        else
        {
            result = ResolveKeywordLazyRecursive(root, keyword);
        }
    }
    return result;
}

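// Cached-root case: look the keyword up under the cached root; if it is missing,
// read it, resolve its parent recursively, and attach it to the existing tree.
private IMyKeyword ResolveKeywordLazyRecursive(IMyKeyword root, dd4t.IKeyword keyword)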
{
    if (keyword == null)
    {
        return null;
    }

    IMyKeyword result = GetKeywordByUri(root, keyword.Id);
    if (result == null)
    {
        tridion.Keyword tridionKeyword = taxonomyFactory.GetTaxonomyKeyword(keyword.Id);
        result = new TaxonomyConverter().ConvertToDD4T(tridionKeyword);
        IMyKeyword parent = ResolveKeywordLazyRecursive(root, result.ParentKeyword);

        if (parent != null)
        {
            result.ParentKeywords.Clear();
            result.ParentKeywords.Add(parent);
            parent.ChildKeywords.Add(result);
        }
    }

    return result;
}

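// No-cache case: read the keyword and walk up its parents; the topmost keyword
// (the one without a parent) is returned through the out parameter as the root.
private IMyKeyword ResolveKeywordLazyRecursive(dd4t.IKeyword keyword, out IMyKeyword root)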
{
    if (keyword == null)
    {
        root = null;
        return null;
    }

    tridion.Keyword tridionKeyword = taxonomyFactory.GetTaxonomyKeyword(keyword.Id);
    IMyKeyword result = new TaxonomyConverter().ConvertToDD4T(tridionKeyword);
    IMyKeyword parent = ResolveKeywordLazyRecursive(result.ParentKeyword, out root);

    if (parent == null)
    {
        root = result;
    }
    else
    {
        result.ParentKeywords.Clear();
        result.ParentKeywords.Add(parent);
        parent.ChildKeywords.Add(result);
    }

    return result;
}
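
The code above calls GetKeywordByUri(root, keyword.Id), which is not shown in this post. A minimal sketch of what such a lookup could look like follows, assuming it is a plain depth-first search over the cached tree. The IMyKeyword interface below is a hypothetical stand-in that exposes only the two members the search needs; the real interface and the real helper may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal stand-in for IMyKeyword (assumption: the real
// interface exposes an Id and a list of child keywords).
public interface IMyKeyword
{
    string Id { get; }
    IList<IMyKeyword> ChildKeywords { get; }
}

public static class KeywordSearch
{
    // Depth-first search of the cached tree for the keyword with the given URI.
    // Returns null when the keyword has not been discovered yet.
    public static IMyKeyword GetKeywordByUri(IMyKeyword root, string uri)
    {
        if (root == null || string.IsNullOrEmpty(uri))
        {
            return null;
        }

        // An explicit stack avoids deep recursion on very deep taxonomies.
        var stack = new Stack<IMyKeyword>();
        stack.Push(root);

        while (stack.Count > 0)
        {
            IMyKeyword current = stack.Pop();
            if (string.Equals(current.Id, uri, StringComparison.OrdinalIgnoreCase))
            {
                return current;
            }

            foreach (IMyKeyword child in current.ChildKeywords)
            {
                stack.Push(child);
            }
        }

        return null;
    }
}
```

Because the cached tree contains only the keywords discovered so far, this search stays cheap even when the full taxonomy is very large.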
