
Publishing Queue metrics in CloudWatch

This post is part of a bigger topic: Autoscaling Publishers in AWS.

To autoscale a group of servers, we need metrics that drive the scaling logic, i.e. when to spin up new instances and when to terminate them. In Tridion terms, a good measure is the size of the Publishing Queue. For autoscaling Publishers specifically, the useful number is the count of items in the Publishing Queue that are in the state "Waiting for Publish".

The approach is to read this metric from the Tridion Content Manager database and make it available in AWS for later use. AWS CloudWatch lets us define rules, either scheduled or triggered by events, that execute some code. That code reads the Publishing Queue and pushes the item counts into CloudWatch as custom metrics.

1. Define Lambda Function

This function is the code that the CloudWatch rule executes. It reads the size of the Publishing Queue and pushes it into CloudWatch as custom metrics.

The languages available in AWS Lambda at the moment include .NET Core 1 and Python 2.7. I tried writing a nice .NET application that uses Tridion's CoreService client to read the Publishing Queue metrics I needed. Unfortunately, I had to give this up after realizing the limitations in .NET Core 1 regarding connectivity to WCF services. Connecting to a service is really a big deal in 2017 -- you need a ton of DLLs!

Instead, I wrote the Lambda code in Python 2.7, using direct DB access to read the metrics from the Tridion CM DB. Definitely not the nicest approach, but it seemed like the only way to do it. Also, because the DB is an RDS instance in the same VPC, I wasn't too concerned about security.

After a few iterations and optimizations, the code looks like this:

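from os import getenv
import pymssql
import boto3

# Created once per container, so warm invocations reuse the client.
client = boto3.client('cloudwatch')

# PUBLISH_TRANSACTIONS.STATE values we care about and their metric names.
STATE_METRICS = {1: 'Waiting for Publish', 4: 'Waiting for Deployment'}

def handler(event, context):

    # Connection settings come from the Lambda environment variables.
    server = getenv("PYMSSQL_SERVER")
    user = getenv("PYMSSQL_USERNAME")
    password = getenv("PYMSSQL_PASSWORD")
    database = getenv("PYMSSQL_DB")

    conn = pymssql.connect(server, user, password, database)
    try:
        cursor = conn.cursor()
        cursor.execute('select STATE, COUNT(*) from PUBLISH_TRANSACTIONS where STATE=1 or STATE=4 group by STATE')
        rows = cursor.fetchall()
    finally:
        # Release the connection even if the query fails.
        conn.close()

    # Start from zero so that a state with no rows still reports a data point.
    metrics = {'Waiting for Publish': 0, 'Waiting for Deployment': 0}

    for state, count in rows:
        if state in STATE_METRICS:
            metrics[STATE_METRICS[state]] = count

    # Ends up in the function's CloudWatch log; works on Python 2.7 and 3.
    print('Metrics: %s' % metrics)

    # Push each count as a custom metric in the 'SDL Web' namespace.
    for metric in metrics:
        client.put_metric_data(
            Namespace='SDL Web',
            MetricData=[
                {
                    'MetricName': metric,
                    'Value': metrics[metric],
                    'Unit': 'Count',
                },
            ]
        )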

I used environment variables to keep the code portable and clean. They are specified on the Lambda function in the AWS console.
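If one of these variables is missing, the handler only fails later, when pymssql tries to connect. A minimal sketch of a fail-fast check is shown here; the helper name load_db_settings is hypothetical and not part of the original function, and only the variable names are taken from the code above.

```python
import os

# Names match the environment variables read by the handler above.
REQUIRED_VARS = ("PYMSSQL_SERVER", "PYMSSQL_USERNAME",
                 "PYMSSQL_PASSWORD", "PYMSSQL_DB")

def load_db_settings(env=None):
    """Return the DB settings as a dict, failing fast if any variable is unset.

    Hypothetical helper: pass a plain dict as env to test it locally,
    or leave it as None to read the real process environment.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return dict((name, env[name]) for name in REQUIRED_VARS)
```

Taking the environment as an optional argument makes the check easy to try outside Lambda without touching the real environment.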

The code reads two values:
  • the number of items in the Waiting for Publish state;
  • the number of items in the Waiting for Deployment state.

Since I'm also going to implement autoscaling for Deployers, I might as well read the relevant metrics in one go.

The code uses the pymssql library to interact with the CM DB. It also uses the boto3 CloudWatch client to push the custom metrics into CloudWatch.
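As a possible variation, the mapping from query rows to metrics can be pulled into a small pure function, and both data points sent in one put_metric_data call instead of one call per metric. This is only a sketch: build_metric_data and publish are hypothetical names, the state-to-name mapping is taken from the query above, and the cloudwatch argument is assumed to be a boto3 CloudWatch client.

```python
# Mapping taken from the query above: STATE 1 and 4 in PUBLISH_TRANSACTIONS.
STATE_METRICS = {1: "Waiting for Publish", 4: "Waiting for Deployment"}

def build_metric_data(rows):
    """Turn (STATE, COUNT) rows into a MetricData list for put_metric_data."""
    # Default to zero so a state with no rows still produces a data point.
    counts = dict((name, 0) for name in STATE_METRICS.values())
    for state, count in rows:
        name = STATE_METRICS.get(state)
        if name is not None:
            counts[name] = count
    return [
        {"MetricName": name, "Value": counts[name], "Unit": "Count"}
        for name in sorted(counts)
    ]

def publish(cloudwatch, rows, namespace="SDL Web"):
    """Send all data points in a single put_metric_data call.

    cloudwatch is assumed to be boto3.client('cloudwatch').
    """
    cloudwatch.put_metric_data(
        Namespace=namespace,
        MetricData=build_metric_data(rows),
    )
```

Keeping the row handling separate from the AWS call means it can be checked locally with hand-written rows, without a database or AWS credentials.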

2. Define Rule in CloudWatch

CloudWatch rules can be defined based on a time schedule (like a cron job) or based on events raised somewhere else.

In this situation, a time-based rule made sense, so I created a rule that fires every minute.

You also associate a target with the rule. This specifies what happens when the rule fires. In my case it executes the Lambda function created in step 1.
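I set this up in the console, but the same rule and target can also be scripted. The sketch below uses the boto3 calls put_rule, add_permission and put_targets; the function name schedule_lambda, the rule name and the target Id are placeholders I made up for illustration, and the clients are passed in as arguments.

```python
def schedule_lambda(events, lambda_client, function_arn,
                    rule_name="publishing-queue-metrics"):  # hypothetical name
    """Create a one-minute schedule rule and point it at the Lambda function.

    events is assumed to be boto3.client('events') and lambda_client
    boto3.client('lambda'); function_arn is the ARN of the function
    from step 1.
    """
    rule_arn = events.put_rule(
        Name=rule_name,
        ScheduleExpression="rate(1 minute)",
        State="ENABLED",
        Description="Push Publishing Queue counts to CloudWatch every minute",
    )["RuleArn"]

    # Allow the rule to invoke the function. This call fails if a statement
    # with the same StatementId already exists, so it is not idempotent.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=rule_name + "-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    # Register the function as the rule's target.
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
    return rule_arn

# Usage (needs boto3 and AWS credentials):
#   import boto3
#   schedule_lambda(boto3.client("events"), boto3.client("lambda"),
#                   "<ARN of the Lambda function>")
```

When the target is added through the console, the invoke permission is normally granted for you; in a script it has to be added explicitly, which is what the add_permission call does.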


Give the rule a name and a description and save it.


3. Visualize Data in Dashboard

One can inspect the new custom metrics in CloudWatch and use them to create alarms (presented in a later post) or place them on a dashboard.
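For reference, a dashboard with both metrics can also be defined in code rather than clicked together. The following sketch builds the JSON body for a single graph widget; the function name, widget title, size and statistic are my own assumptions, and only the namespace and metric names come from the Lambda function in step 1.

```python
import json

def build_dashboard_body(region, namespace="SDL Web"):
    """Return a CloudWatch dashboard body with one graph for both metrics.

    region is the AWS region the metrics are stored in. The layout values
    and the 'Maximum' statistic are illustrative choices, not requirements.
    """
    widget = {
        "type": "metric",
        "x": 0, "y": 0, "width": 12, "height": 6,
        "properties": {
            "title": "Publishing Queue",
            "region": region,
            "stat": "Maximum",
            "period": 60,  # seconds; matches the one-minute rule
            "metrics": [
                [namespace, "Waiting for Publish"],
                [namespace, "Waiting for Deployment"],
            ],
        },
    }
    return json.dumps({"widgets": [widget]})

# Usage (needs boto3 and AWS credentials; the dashboard name is hypothetical):
#   import boto3
#   boto3.client("cloudwatch").put_dashboard(
#       DashboardName="publishing-queue",
#       DashboardBody=build_dashboard_body("<your region>"),
#   )
```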


