A DD4T.net Implementation - IIS URL Rewrite

I have used IIS's URL Rewrite module in several of my .NET projects. It is a very neat module that provides a lot of URL rewrite/redirect functionality out of the box. Namely, the module can do:
  • URL rewrite -- rewrites the URL path before request processing starts (similar to a server transfer);
  • URL redirect -- redirects the client browser to a modified URL by sending back a redirect HTTP status code;
  • Outbound rewrite -- rewrites outgoing URLs in the response body.
My requirements have so far involved using URL rewrite together with outbound URL rewriting in the response. For example, I recently had the following use case -- my client has legacy .aspx pages under location /devices. There are new DD4T .html pages in the system, but because of some routing restrictions they had to be placed under a temporary location /device (notice the difference from /devices). The .html pages should, however, be exposed to the Internet as if they belonged to folder /devices.

Example: /device/page.html should be accessible to the Internet as /devices/page.html, even though in Tridion, the page is under Structure Group "/device".

Enter IIS URL Rewrite. I set a rewrite rule in web.config that takes a URL starting with /devices/ and rewrites it server-side to /device/ followed by the same sub-path and .html page name. The rule uses a regular expression and a back-reference ({R:1}) to the captured group:

<system.webServer>
  <rewrite>
    <rules>
      <rule name="'devices' to 'device'" stopProcessing="true">
        <match url="^devices/(.*\.html)$" />
        <conditions logicalGrouping="MatchAll" trackAllCaptures="false" />
        <action type="Rewrite" url="device/{R:1}" />
      </rule>
...
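
To see what the pattern captures and how {R:1} is substituted, the rule can be mimicked outside IIS. The following is only a rough Python sketch, not how the module is implemented: the helper name `rewrite` is made up for illustration, and it assumes Python's regex engine treats this particular pattern the same way the module does. IIS matches against the path without its leading slash and ignores case by default, which the sketch imitates.

```python
import re

# Pattern and target copied from the rule above.
# IIS matches the path without its leading slash and ignores case by default.
MATCH_URL = re.compile(r"^devices/(.*\.html)$", re.IGNORECASE)
REWRITE_TARGET = "device/{R:1}"


def rewrite(path):
    """Hypothetical helper: return the rewritten path, or None if the rule does not match."""
    m = MATCH_URL.match(path)
    if m is None:
        return None
    # {R:1} stands for the first capture group of the match pattern.
    return REWRITE_TARGET.replace("{R:1}", m.group(1))


if __name__ == "__main__":
    for p in ["devices/page.html", "devices/sub/page.html", "devices/legacy.aspx"]:
        print(p, "->", rewrite(p))
```

Under these assumptions, only .html requests are rewritten; the legacy .aspx pages under /devices fall through untouched.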

With the rule above, a page under /device is now also available under /devices. From an SEO perspective, this is something to avoid: the same page should not be accessible on the same website under more than one URL. The next rule takes care of that by forcing direct requests for pages under /device to yield HTTP status 404.

<rule name="'device' yields 404" stopProcessing="true">
  <match url="^device/.*\.html" />
  <conditions logicalGrouping="MatchAll" trackAllCaptures="false" />
  <action type="CustomResponse" statusCode="404" statusReason="Page Not Found"
    statusDescription="Page Not Found" />
</rule>
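
The interplay of the two inbound rules can be sketched as a small simulation. This is an illustration under assumptions, not the module's actual pipeline: the function `handle` is hypothetical, the 200 status merely stands for "handed on to the application", and the rules are assumed to be evaluated in the order shown in this post. The point it illustrates is that stopProcessing="true" on the first rule keeps the rewritten device/... path away from the 404 rule.

```python
import re

# Patterns copied from the two inbound rules; IIS ignores case by default.
REWRITE_RULE = re.compile(r"^devices/(.*\.html)$", re.IGNORECASE)
BLOCK_RULE = re.compile(r"^device/.*\.html", re.IGNORECASE)


def handle(path):
    """Hypothetical simulation: return (status, path passed to the application)."""
    m = REWRITE_RULE.match(path)
    if m:
        # stopProcessing="true": no later rule sees the rewritten path.
        return 200, "device/" + m.group(1)
    if BLOCK_RULE.match(path):
        # Direct access to the temporary location is answered with 404.
        return 404, None
    # No rule matched; the request continues unchanged.
    return 200, path


if __name__ == "__main__":
    for p in ["devices/page.html", "device/page.html", "devices/legacy.aspx"]:
        print(p, "->", handle(p))
```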

There is one more issue left to deal with -- Tridion-resolved Component links to the page under /device. These links will be resolved to the actual URL of the page in Tridion, i.e. /device/page.html. But we want this URL to be exposed as /devices/page.html.

I chose to use the URL Rewrite module to rewrite these links in the response body. The rule below intercepts only anchor href links to the /device/ URL and rewrites them to /devices/. The rewrite again uses a regular expression and a back-reference, but it is restricted to responses whose content MIME type is text/html. This is done simply for performance reasons, so that we don't accidentally try to rewrite a binary response, for example.

<outboundRules>
  <rule name="'device' to 'devices'" preCondition="IsHTML"
        enabled="true" stopProcessing="false">
    <match filterByTags="A" pattern="^/device/(.*)$" />
    <action type="Rewrite" value="/devices/{R:1}" />
    <conditions logicalGrouping="MatchAny"/>
  </rule>

  <preConditions>
    <preCondition name="IsHTML" logicalGrouping="MatchAny">
      <add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/html" />
    </preCondition>
  </preConditions>
</outboundRules>
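
The effect of the outbound rule on a response can be approximated as follows. This is a simplified sketch, not the module's algorithm: `rewrite_links` is a hypothetical helper, and the regex-based search for double-quoted href attributes in <a> tags is only a stand-in for what filterByTags="A" does.

```python
import re

# Stand-in for filterByTags="A": double-quoted href values inside <a ...> tags.
HREF_IN_ANCHOR = re.compile(r'(<a\b[^>]*?\bhref=")([^"]*)(")', re.IGNORECASE)
# Pattern copied from the outbound rule.
DEVICE_LINK = re.compile(r"^/device/(.*)$")


def rewrite_links(body, content_type):
    """Hypothetical helper: rewrite /device/ anchor links in HTML responses only."""
    # Mirrors the IsHTML precondition: other content types are left alone.
    if not re.match(r"^text/html", content_type):
        return body

    def fix(m):
        url = DEVICE_LINK.sub(r"/devices/\1", m.group(2))
        return m.group(1) + url + m.group(3)

    return HREF_IN_ANCHOR.sub(fix, body)


if __name__ == "__main__":
    html = '<a href="/device/page.html">New</a> <img src="/device/pic.png">'
    print(rewrite_links(html, "text/html; charset=utf-8"))
```

In this sketch the image src is left as-is, because only anchor tags are inspected, and a link that already points to /devices/ does not match the pattern.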

Redirect Directories to index.html

Take the following rule as a bonus example. It redirects the client browser to /index.html when the requested URL points to a directory. The rule is a bit more complex, so I'll explain the regular expression ^((|devices)[^\.]*)(\/)?$ :

  • ^(|devices) matches a URL path that starts either with nothing OR with 'devices' -- the match on nothing is for home pages, which don't have a URL path at all;
  • [^\.]* followed by zero or more characters that are not a dot (.) -- this indicates that our request URL path is a directory (i.e. it doesn't contain an extension);
  • (\/)?$ the URL path ends with an optional forward slash (/).

<rule name="Redirect directories to /index.html" stopProcessing="true">
  <match url="^((|devices)[^\.]*)(\/)?$" />
  <conditions logicalGrouping="MatchAll" trackAllCaptures="false">
  </conditions>
  <action type="Redirect" url="{R:1}/index.html" />
</rule>
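
A quick way to check what {R:1} ends up holding for different paths is to run the same expression in a scratch script. The sketch below uses Python's regex engine as a stand-in, so it rests on the assumption that the module evaluates this pattern the same way; `redirect_target` is a hypothetical helper that mirrors the action's url template.

```python
import re

# Pattern copied from the rule; the path is matched without its leading slash.
DIRECTORY = re.compile(r"^((|devices)[^\.]*)(\/)?$")


def redirect_target(path):
    """Hypothetical helper: build '{R:1}/index.html', or None if the rule does not match."""
    m = DIRECTORY.match(path)
    if m is None:
        return None
    return m.group(1) + "/index.html"


if __name__ == "__main__":
    for p in ["", "devices", "devices/phones", "devices/phones/", "devices/page.html"]:
        print(repr(p), "->", redirect_target(p))
```

Two things stand out when running this in Python. A path that contains a dot, such as devices/page.html, is not matched, so pages are never redirected. A path that already ends in a slash is captured with that slash, because [^\.]* is greedy and also accepts '/', which yields a double slash before index.html. If the module behaves the same way, a lazy quantifier ([^\.]*?) would keep the trailing slash out of {R:1}; that variation has not been verified against IIS here.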


