To squeeze the most performance out of publishing in a Tridion / SDL Web system, a large number of parameters and configurations must all be tuned. This post presents a few of these parameters, especially in the context of AWS infrastructure.
CME
The Content Manager Explorer (CME) server, hosting the Tridion GUI, is a normal Tridion installation, but with the Publisher and Transport services disabled. This server is used only for running the CME web application.
Publishers
Publisher servers are Tridion installations without the CME. Namely, these servers have the Publisher and Transport services enabled, while any other Tridion services, if installed, are disabled. The purpose of a Publisher is simply to pick up publishable items from the Publishing Queue, render them, hand them over to the Transport service, and have the transport package pushed to Content Delivery.
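Conceptually, a Publisher runs a simple poll-render-transport loop. The sketch below only illustrates that pipeline; the object names and methods are hypothetical, not Tridion's actual API:

```python
import time

def publisher_loop(publishing_queue, renderer, transport):
    """Hypothetical sketch of the Publisher pipeline described above."""
    while True:
        item = publishing_queue.get_next_publishable_item()  # hypothetical call
        if item is None:
            time.sleep(1)  # nothing queued; poll again shortly
            continue
        package = renderer.render(item)  # render the item into a transport package
        transport.push(package)          # hand it over to the Transport service
```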
Publishers should use approximately twice the number of CPU cores as the number of rendering threads, and the same number again as transporting threads. Using more threads typically yields no significant performance gain; instead, it is better to scale out the Publisher servers.
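As a quick illustration of this rule of thumb (plain Python arithmetic, not Tridion configuration syntax):

```python
import os

cores = os.cpu_count() or 4               # fall back to 4 if undetectable
rendering_threads = 2 * cores             # rule of thumb: ~2x CPU cores
transporting_threads = rendering_threads  # same count for transporting threads
print(f"{cores} cores -> {rendering_threads} rendering / "
      f"{transporting_threads} transporting threads")
```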
In an AWS context, it is advisable to scale out the Publishers to 2-4 instances in order to obtain optimal performance. Scaling out to more than 4 Publishers yields little to no additional gain, mostly due to limitations on database transactions and the number of permitted I/O operations, but more about that below.
In terms of instance type, anything from t2.medium to m4.xlarge is advisable for Publishers. Interestingly, several smaller instances often perform better than fewer larger ones.
Deployer Receiver
On the Content Delivery side, transport packages are received by the so-called Deployer Receiver, which simply listens for incoming HTTP connections that post transport packages and other admin/update notification requests.
A receiver is usually a lightweight server, so smaller instances in the t2.large vicinity will do the job nicely.
Receivers can be scaled out in an active-active manner by placing them behind an Elastic Load Balancer with normal round-robin request allocation.
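For instance, receiver instances can be registered with an existing load balancer target group using boto3; the target group ARN, instance IDs, and port below are placeholders, assuming the load balancer and target group already exist:

```python
import boto3

# Register two Deployer Receiver instances with an existing target group.
elbv2 = boto3.client("elbv2", region_name="eu-west-1")
elbv2.register_targets(
    TargetGroupArn="arn:aws:elasticloadbalancing:eu-west-1:123456789012:"
                   "targetgroup/deployer-receivers/abc123",  # placeholder ARN
    Targets=[
        {"Id": "i-0aaa11112222bbbb3", "Port": 8084},  # placeholder instance ids/port
        {"Id": "i-0ccc44445555dddd6", "Port": 8084},
    ],
)
```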
The location where deployers store incoming transport packages can be configured to be a shared file system (e.g. EFS) or a Redis instance. Redis performs better than a shared file system, and it is also more reliable, because shared file systems are prone to locking issues under load. However, one big limitation of Redis is that it cannot store transport packages larger than 512MB. So, as a rule of thumb: use Redis if possible, and otherwise fall back to a shared file system, for example EFS.
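The deployer applies its configured storage internally, but the rule of thumb can be illustrated with a minimal sketch using the redis-py client; the Redis host and EFS mount path are hypothetical:

```python
import redis

REDIS_MAX_VALUE = 512 * 1024 * 1024  # Redis cannot store values larger than 512 MB

r = redis.Redis(host="my-redis.example.com", port=6379)  # placeholder host

def store_package(package_id: str, data: bytes) -> None:
    """Store a transport package in Redis, falling back to shared FS (e.g. EFS)."""
    if len(data) < REDIS_MAX_VALUE:
        r.set(package_id, data)
    else:
        # oversized package: fall back to the shared file system
        with open(f"/mnt/efs/packages/{package_id}.zip", "wb") as f:
            f.write(data)
```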
The notification mechanism the receiver uses to tell workers that new transport packages are available can be configured to use file system queues (e.g. on EFS) or JMS (e.g. SQS or ActiveMQ). File system queues must exist on the shared file system, and their performance is greatly impacted by the fact that no notification is actually sent; instead, the workers monitor a certain folder in order to detect changes in it. This approach is also prone to file locking issues and is in general less stable than a messaging system. Therefore, the JMS implementation (e.g. SQS) is highly recommended here: the receiver posts a message to SQS, and that message is relayed further to the listening worker servers.
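A minimal sketch of this push-based pattern with boto3 (the queue URL and package id are placeholders; the real deployer wires this up through its own configuration):

```python
import boto3

sqs = boto3.client("sqs", region_name="eu-west-1")
queue_url = ("https://sqs.eu-west-1.amazonaws.com/123456789012/"
             "deployer-notifications")  # placeholder queue

# Receiver side: announce that a new transport package is available.
sqs.send_message(QueueUrl=queue_url, MessageBody="package-12345")  # placeholder id

# Worker side: long-poll the queue instead of watching a folder on EFS.
resp = sqs.receive_message(QueueUrl=queue_url, WaitTimeSeconds=20,
                           MaxNumberOfMessages=1)
for msg in resp.get("Messages", []):
    print("package available:", msg["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```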
Deployer Worker
This server is the one that actually performs the deploy/undeploy of content. Upon noticing that a transport package is available, it proceeds to handle it and reports its status back to the receiver.
The configuration that yields the best performance is to use Redis as binary storage, if possible, and a JMS notification mechanism such as SQS.
The worker server can be configured with a number of threads that perform the deploying/undeploying internally. A good number is around 15 worker threads: fewer than 10 results in poor performance, while more than 20 does not yield any noticeable gain.
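Conceptually, this is a fixed-size worker pool. A minimal Python illustration of the tunable (not the deployer's actual implementation; the deploy function is a stand-in):

```python
from concurrent.futures import ThreadPoolExecutor

WORKER_THREADS = 15  # sweet spot: <10 hurts throughput, >20 adds little

def deploy(package_id: str) -> str:
    # stand-in for the actual deploy/undeploy work on one package
    return f"{package_id}: DEPLOYED"

with ThreadPoolExecutor(max_workers=WORKER_THREADS) as pool:
    for status in pool.map(deploy, ["pkg-1", "pkg-2", "pkg-3"]):
        print(status)
```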
In an AWS context, the instance type of a Deployer Worker can range from t2.medium to m4.xlarge, with several smaller instances again delivering very nice surprises compared to fewer larger ones. 2-3 instances are sufficient to yield great publishing throughput.