
A DD4T.net Implementation - IIS URL Rewrite

I have used the IIS URL Rewrite module in several of my .NET projects. It is a very neat module that provides a lot of URL rewrite/redirect functionality out of the box. Namely, the module can do:
  • URL rewrite -- rewrites the URL path before request processing starts (similar to a server transfer);
  • URL redirect -- redirects the client browser to a modified URL by sending back a redirect HTTP status code;
  • Outbound rewrite -- rewrites outgoing URLs in the response body.

My requirements have so far involved using URL rewrite together with outbound URL rewriting in the response. For example, I recently had the following use case -- my client has legacy .aspx pages under the location /devices. There are new DD4T .html pages in the system, but because of some routing restrictions they had to be placed under a temporary location, /device (notice the difference from /devices). The .html pages should, however, be exposed to the internet as if they belonged to the folder /devices.

Example: /device/page.html should be accessible to the Internet as /devices/page.html, even though in Tridion, the page is under Structure Group "/device".

Enter IIS URL Rewrite. I set a rewrite rule in web.config that takes a URL starting with /devices/ and rewrites it server side to /device/ followed by the same sub-path and .html page name. The rule uses regular expressions and replacement groups:

<system.webServer>
  <rewrite>
    <rules>
      <rule name="'devices' to 'device'" stopProcessing="true">
        <match url="^devices/(.*\.html)$" />
        <conditions logicalGrouping="MatchAll" trackAllCaptures="false" />
        <action type="Rewrite" url="device/{R:1}" />
      </rule>
...
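
To sanity-check the pattern outside IIS, the mapping can be sketched in a few lines of Python. This is only an approximation: Python's re module stands in for the regex engine used by URL Rewrite, rewrite_inbound is a hypothetical helper name, and the case-insensitive flag assumes the rule keeps the module's default case-insensitive matching. IIS matches the pattern against the URL path without its leading slash, so the sample paths omit it too.

```python
import re

# Same pattern as the <match url="..."> attribute of the rule above.
# IGNORECASE is an assumption mirroring the default ignoreCase behaviour.
INBOUND = re.compile(r"^devices/(.*\.html)$", re.IGNORECASE)


def rewrite_inbound(path):
    """Return the server-side path for a request path (hypothetical helper)."""
    m = INBOUND.match(path)
    # Group 1 plays the role of the {R:1} back-reference in the rule's action.
    return "device/" + m.group(1) if m else path


if __name__ == "__main__":
    for p in ("devices/page.html", "devices/sub/page.html", "devices/legacy.aspx"):
        print(p, "->", rewrite_inbound(p))
```

Only .html requests under devices/ are rewritten; the legacy .aspx pages fall through unchanged.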

With the rule above, a page under /device is now available under both /device and /devices. From an SEO perspective, this is something to avoid: the same page should not be accessible on the same website under more than one URL. The next rule takes care of that. It forces direct requests for pages under /device to yield HTTP status 404.

<rule name="'device' yields 404" stopProcessing="true">
  <match url="^device/.*\.html" />
  <conditions logicalGrouping="MatchAll" trackAllCaptures="false" />
  <action type="CustomResponse" statusCode="404" statusReason="Page Not Found"
    statusDescription="Page Not Found" />
</rule>
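
How the two inbound rules interact can be sketched as follows. This is a simplified model, not the module's actual implementation: it assumes the rewrite rule is listed before the 404 rule, treats stopProcessing="true" as an early return, and uses 200 merely as a stand-in for "the request continues to the normal handler". The function name handle is hypothetical.

```python
import re

# Patterns copied from the two inbound rules; matching is assumed case-insensitive.
REWRITE = re.compile(r"^devices/(.*\.html)$", re.IGNORECASE)
BLOCK = re.compile(r"^device/.*\.html", re.IGNORECASE)


def handle(path):
    """Return (status, server_side_path) for a request path without leading slash."""
    m = REWRITE.match(path)
    if m:
        # stopProcessing="true": later rules never see the rewritten path,
        # so the 404 rule does not fire for it.
        return 200, "device/" + m.group(1)
    if BLOCK.match(path):
        # Direct access to the temporary location is rejected.
        return 404, None
    return 200, path


if __name__ == "__main__":
    for p in ("devices/page.html", "device/page.html", "devices/legacy.aspx"):
        print(p, "->", handle(p))
```

Because the first rule stops processing, the path it produces (device/page.html) is not re-evaluated against the 404 rule, while a browser asking for device/page.html directly is turned away.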

There is one more issue left to deal with -- Tridion-resolved Component links to the page under /device. These links will be resolved to the actual URL of the page in Tridion, i.e. /device/page.html. But we want this URL to be exposed as /devices/page.html.

I chose to use the URL Rewrite module to rewrite these links in the response body. The rule below intercepts only anchor href values that point to the /device/ URL and rewrites them to /devices/. The rewrite uses regular expressions and replacement groups, but it is restricted to responses whose content MIME type is text/html. This is done simply for performance reasons, so that we don't accidentally try to rewrite a binary response, for example.

<outboundRules>
  <rule name="'device' to 'devices'" preCondition="IsHTML"
        enabled="true" stopProcessing="false">
    <match filterByTags="A" pattern="^/device/(.*)$" />
    <action type="Rewrite" value="/devices/{R:1}" />
    <conditions logicalGrouping="MatchAny"/>
  </rule>

  <preConditions>
    <preCondition name="IsHTML" logicalGrouping="MatchAny">
      <add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/html" />
    </preCondition>
  </preConditions>
</outboundRules>
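
The effect of the outbound rule can be approximated with the sketch below. It is an illustration under stated assumptions rather than a description of how the module parses responses: the A_HREF expression is a deliberately naive stand-in for filterByTags="A" that only handles double-quoted href attributes, and rewrite_outbound is a hypothetical helper name.

```python
import re

# Naive stand-in for filterByTags="A": captures the href value of <a> tags only.
A_HREF = re.compile(r'(<a\b[^>]*?\bhref=")([^"]*)(")', re.IGNORECASE)
# Pattern copied from the outbound rule's <match> element.
OUTBOUND = re.compile(r"^/device/(.*)$", re.IGNORECASE)
# Pattern copied from the IsHTML precondition.
IS_HTML = re.compile(r"^text/html", re.IGNORECASE)


def rewrite_outbound(body, content_type):
    """Rewrite /device/ anchor links to /devices/ in HTML responses only."""
    if not IS_HTML.match(content_type):
        return body  # precondition failed: leave the response untouched

    def fix(m):
        # The rule's pattern is applied to the attribute value, not the whole tag.
        href = OUTBOUND.sub(r"/devices/\1", m.group(2))
        return m.group(1) + href + m.group(3)

    return A_HREF.sub(fix, body)


if __name__ == "__main__":
    html = '<a href="/device/page.html">Page</a> <img src="/device/logo.png">'
    print(rewrite_outbound(html, "text/html; charset=utf-8"))
```

Only the anchor is rewritten; the img src keeps its /device/ path because the rule is limited to anchor tags, and a non-HTML response is returned as is.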

Redirect Directories to index.html

Take the following rule as a bonus example. It redirects the client browser to index.html when the requested URL points to a directory. The rule is a bit more complex, so I'll explain the regular expression ^((|devices)[^\.]*)(\/)?$ :

  • ^(|devices) matches a URL path that starts with either nothing OR 'devices' -- the match on nothing is for the home page, which doesn't have a URL path at all. Because the empty alternative can match at the start of any path, this group does not by itself limit the rule to /devices;
  • [^\.]* followed by zero or more characters that are not a dot (.) -- this indicates that the request URL path is a directory (i.e. it doesn't contain an extension);
  • (\/)?$ the URL path ends with an optional forward slash (/).

<rule name="Redirect directories to /index.html" stopProcessing="true">
  <match url="^((|devices)[^\.]*)(\/)?$" />
  <conditions logicalGrouping="MatchAll" trackAllCaptures="false">
  </conditions>
  <action type="Redirect" url="{R:1}/index.html" />
</rule>
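
The pattern is easy to misread, so here is a small sketch for probing what it captures. Python's re module is used as a stand-in for the module's regex engine, and redirect_target is a hypothetical helper. DIRECTORY_LAZY is not part of the rule above; it is an assumed variant with a lazy quantifier, included only to show how a trailing slash could be kept out of {R:1}.

```python
import re

# Pattern copied from the rule above.
DIRECTORY = re.compile(r"^((|devices)[^\.]*)(\/)?$", re.IGNORECASE)
# Assumed variant (not in the original rule): the lazy *? leaves a trailing
# slash to the optional (\/)? group instead of capturing it in group 1.
DIRECTORY_LAZY = re.compile(r"^((|devices)[^\.]*?)(\/)?$", re.IGNORECASE)


def redirect_target(path, pattern=DIRECTORY):
    """Return the redirect URL built as {R:1}/index.html, or None if no match."""
    m = pattern.match(path)
    return m.group(1) + "/index.html" if m else None


if __name__ == "__main__":
    for p in ("", "devices", "devices/", "devices/page.html", "products/list"):
        print(repr(p), "->", redirect_target(p), "| lazy:",
              redirect_target(p, DIRECTORY_LAZY))
```

Two things stand out when running it. With the greedy [^\.]*, a request for devices/ captures the trailing slash in group 1, so the target becomes devices//index.html, whereas the lazy variant yields devices/index.html. And a path such as products/list also matches, through the empty alternative, so the rule applies to every extensionless path and not just to those under devices.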


