
Retrieve Classified Items for Multiple Keywords

Using the standard Content Delivery API, you can retrieve related content (i.e. items that are classified against Keywords) for only one Keyword at a time. If you need to read the classified items for all Keywords in a Taxonomy, calling the API once per Keyword quickly becomes a performance killer.

To illustrate, this is what I'm looking for:

Keyword A -- classified with Component 1, Component 2
    Keyword B -- classified with Component 3, Page 1
    Keyword C -- classified with Component 1, Page 2

I want to read the classified items for all Keywords in a Taxonomy and still maintain the 'classified' relationship between each Keyword and its corresponding list of Components & Pages.

The way I did it was to write my own Spring/Hibernate query and piggy-back on the existing DAO objects and methods available in the CD Storage API.

The key is the RelatedKeyword bean, which maps the records in the DB table ITEM_CATEGORIES_AND_KEYWORDS. This table contains all the information we need, in the following columns:
  • publicationId
  • itemId
  • taxonomyId
  • keywordId
This provides the mapping from (TaxonomyId, KeywordId) to (PublicationId, ItemId). The only unknown is the ItemType, but that is available in table ITEMS. So it's enough to join the two tables on PublicationId and ItemId and retrieve the items by type. We are only interested in Components and Pages, so we'll make 2 queries -- one for each type.
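To make the join concrete, here is a minimal in-memory sketch of the same logic in plain Java. It does not touch the CD Storage API: the Classification and Item classes and the classifiedItems method are hypothetical stand-ins for rows of ITEM_CATEGORIES_AND_KEYWORDS and ITEMS, and all IDs are made up. The item types 16 (Component) and 64 (Page) are the standard Tridion values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ClassificationJoinSketch {

    /** Hypothetical stand-in for one row of ITEM_CATEGORIES_AND_KEYWORDS. */
    static final class Classification {
        final int publicationId, itemId, taxonomyId, keywordId;

        Classification(int publicationId, int itemId, int taxonomyId, int keywordId) {
            this.publicationId = publicationId;
            this.itemId = itemId;
            this.taxonomyId = taxonomyId;
            this.keywordId = keywordId;
        }
    }

    /** Hypothetical stand-in for one row of ITEMS (only the columns the join needs). */
    static final class Item {
        final int publicationId, itemId, itemType;

        Item(int publicationId, int itemId, int itemType) {
            this.publicationId = publicationId;
            this.itemId = itemId;
            this.itemType = itemType;
        }
    }

    /** Keeps the classifications of one Taxonomy whose item has the requested type. */
    static List<Classification> classifiedItems(List<Classification> classifications,
            List<Item> items, int publicationId, int taxonomyId, int itemType) {
        List<Classification> matches = new ArrayList<>();
        for (Classification c : classifications) {
            if (c.publicationId != publicationId || c.taxonomyId != taxonomyId) {
                continue;
            }
            // Join on (publicationId, itemId), then filter on the item type
            for (Item i : items) {
                if (i.publicationId == c.publicationId && i.itemId == c.itemId
                        && i.itemType == itemType) {
                    matches.add(c);
                    break;
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Classification> classifications = Arrays.asList(
                new Classification(5, 101, 3, 10),   // a Component under Keyword 10
                new Classification(5, 201, 3, 11));  // a Page under Keyword 11
        List<Item> items = Arrays.asList(
                new Item(5, 101, 16),   // 16 = Component
                new Item(5, 201, 64));  // 64 = Page

        for (Classification c : classifiedItems(classifications, items, 5, 3, 16)) {
            System.out.println("Keyword " + c.keywordId + " classifies Component " + c.itemId);
        }
    }
}
```

In the real solution the database performs this join, so nothing is loaded into memory just to be filtered.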

The Hibernate query looks like this:

select distinct(rk) from RelatedKeyword rk, ItemMeta im
    where rk.publicationId = :publicationId and rk.taxonomyId = :taxonomyId and
    im.itemType = :itemType and rk.itemId = im.itemId and rk.publicationId = im.publicationId

We execute the Hibernate query in the following way:

public List<RelatedKeyword> getRelatedItems(String taxonomyURI, int itemType) {
    TCMURI taxonomyTcmUri = new TCMURI(taxonomyURI);
    int publicationId = taxonomyTcmUri.getPublicationId();

    // Named parameters referenced by the Hibernate query above
    Map<String, Object> queryParams = new HashMap<>();
    queryParams.put("publicationId", publicationId);
    queryParams.put("taxonomyId", taxonomyTcmUri.getItemId());
    queryParams.put("itemType", itemType);

    // Piggy-back on the existing ITEM_META DAO to run the custom query
    JPABaseDAO itemDAO = (JPABaseDAO) StorageManagerFactory.getDAO(
        publicationId, StorageTypeMapping.ITEM_META);
    // THE_QUERY is a String constant holding the Hibernate query shown above
    return itemDAO.executeQueryListResult(THE_QUERY, queryParams);
}

From the calling code, we need to call getRelatedItems twice -- once for Components and once for Pages:

List<RelatedKeyword> components = getRelatedItems(taxonomyURI, ItemTypes.COMPONENT);
List<RelatedKeyword> pages = getRelatedItems(taxonomyURI, ItemTypes.PAGE);

Finally, I merge the components and pages lists into a single Map where the key is the Keyword TCMURI and the value is a Set of TCMURIs of the items directly classified against that Keyword. The following method accomplishes just that:

private void mergeRelatedItems(List<RelatedKeyword> relatedKeywords,
        Map<String, Set<TCMURI>> result, int itemType) {
    for (RelatedKeyword keyword : relatedKeywords) {
        int publicationId = keyword.getPublicationId();
        // 1024 is the item type of a Keyword
        String key = String.format("tcm:%d-%d-1024", publicationId, keyword.getKeywordId());
        Set<TCMURI> itemList = result.get(key);
        if (itemList == null) {
            // First item seen for this Keyword: create its Set
            itemList = new TreeSet<>();
            result.put(key, itemList);
        }
        TCMURI itemURI = new TCMURI(publicationId, keyword.getItemId(), itemType, 0);
        itemList.add(itemURI);
    }
}

We call the method with the following code:

Map<String, Set<TCMURI>> result = new HashMap<>();
mergeRelatedItems(components, result, ItemTypes.COMPONENT);
mergeRelatedItems(pages, result, ItemTypes.PAGE);
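For readers who want to try the merge step without a Content Delivery installation, here is a self-contained sketch of the same idea. It is an illustration rather than the code above: each row is a plain int[] of {publicationId, keywordId, itemId} standing in for a RelatedKeyword, the URIs are plain Strings instead of TCMURI objects, and every ID is made up to mirror the Keyword A/B/C example from the beginning of the post. It also uses Map.computeIfAbsent in place of the explicit null check.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class MergeSketch {

    static final int COMPONENT = 16; // Tridion item type of a Component
    static final int PAGE = 64;      // Tridion item type of a Page

    /** Each row is {publicationId, keywordId, itemId}, a hypothetical stand-in for RelatedKeyword. */
    static void merge(List<int[]> rows, Map<String, Set<String>> result, int itemType) {
        for (int[] row : rows) {
            String key = String.format("tcm:%d-%d-1024", row[0], row[1]);
            String itemUri = String.format("tcm:%d-%d-%d", row[0], row[2], itemType);
            // Create the Set on first use, then add the item URI to it
            result.computeIfAbsent(key, k -> new TreeSet<>()).add(itemUri);
        }
    }

    /** Builds the map for made-up data: Keywords 10, 11, 12 play the roles of A, B, C. */
    static Map<String, Set<String>> example() {
        List<int[]> components = Arrays.asList(
                new int[] {5, 10, 101},  // Keyword A -- Component 1
                new int[] {5, 10, 102},  // Keyword A -- Component 2
                new int[] {5, 11, 103},  // Keyword B -- Component 3
                new int[] {5, 12, 101}); // Keyword C -- Component 1
        List<int[]> pages = Arrays.asList(
                new int[] {5, 11, 201},  // Keyword B -- Page 1
                new int[] {5, 12, 202}); // Keyword C -- Page 2

        // TreeMap only to get a stable iteration order when printing
        Map<String, Set<String>> result = new TreeMap<>();
        merge(components, result, COMPONENT);
        merge(pages, result, PAGE);
        return result;
    }

    public static void main(String[] args) {
        example().forEach((keyword, items) -> System.out.println(keyword + " -> " + items));
    }
}
```

Each Keyword URI ends up mapped to the URIs of the Components and Pages classified directly against it, which is exactly the relationship the two queries were meant to preserve.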

