The Curious Dev

Various programming sidetracks and shiny-object detours

Three Ways of Mapping With XSL

When transforming a payload from one form to another, it’s often necessary to map various fields. An example may be as simple as mapping a country code to the full name of that country, e.g. AUS to Australia.

There are several ways to achieve this, some more flexible than others.

To use the example of country code mapping, here’s the input data that we’re required to map/transform:

<countries>
    <country>AUS</country>
    <country>BRA</country>
    <country>GRC</country>
    <country>JPN</country>
    <country>GBR</country>
</countries>

External mapping file

First up is referring to an external file that contains the relevant mapping and using XPATH to grab the correct value.

For an external mapping file we simply include the relevant path to the mapping file, but this could just as easily be a URL from the web. Either of:

<xsl:variable name="countryList">../xml/countrylist.xml</xsl:variable>

or

<xsl:variable name="countryList">http://s3.amazonaws.com/thecuriousdev.com/countrylist.xml</xsl:variable>

This file simply consists of both the code and name for each country, i.e.

<?xml version="1.0" encoding="UTF-8"?>
<countries>
    <country>
        <name>Afghanistan</name>
        <code>AFG</code>
    </country>
    <country>
        <name>Albania</name>
        <code>ALB</code>
    </country>
    ...
    <country>
        <name>Zambia</name>
        <code>ZMB</code>
    </country>
    <country>
        <name>Zimbabwe</name>
        <code>ZWE</code>
    </country>
</countries>

Performing the mapping is as easy as loading the external file with the document function and then extracting the appropriate value with an XPATH:

<xsl:value-of select="document($countryList)/countries/country[code=$country]/name"/>

Full template is here.
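
For readers who want to check the lookup logic outside an XSLT processor, here is a minimal Python sketch of the same idea: load the mapping document, then select the name whose sibling code matches. The sample mapping entries (including the country names chosen for each code) and the function name map_codes are assumptions for illustration, not the contents of the real countrylist.xml.

```python
import xml.etree.ElementTree as ET

# Assumed sample of the mapping file; the real file lists every country.
MAPPING_XML = """<countries>
    <country><name>Australia</name><code>AUS</code></country>
    <country><name>Brazil</name><code>BRA</code></country>
    <country><name>Greece</name><code>GRC</code></country>
    <country><name>Japan</name><code>JPN</code></country>
    <country><name>United Kingdom</name><code>GBR</code></country>
</countries>"""

# The input payload from the start of the post.
INPUT_XML = """<countries>
    <country>AUS</country>
    <country>BRA</country>
    <country>GRC</country>
    <country>JPN</country>
    <country>GBR</country>
</countries>"""


def map_codes(input_xml, mapping_xml):
    """Return the country name for each code in the input, in order."""
    mapping = ET.fromstring(mapping_xml)
    names = []
    for country in ET.fromstring(input_xml).findall("country"):
        code = (country.text or "").strip()
        # Same shape as the XPATH in the post: country[code=$country]/name.
        # Codes are assumed not to contain quote characters.
        match = mapping.find(f"country[code='{code}']/name")
        # Like xsl:value-of, fall back to an empty string when nothing matches.
        names.append(match.text if match is not None else "")
    return names


if __name__ == "__main__":
    for name in map_codes(INPUT_XML, MAPPING_XML):
        print(name)
```

An unknown code simply maps to an empty string here, which mirrors what xsl:value-of produces when the XPATH selects nothing.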

Local mapping variable

Sometimes it may not be desirable to have an external mapping file, so an embedded variable with the relevant mappings could be included in the XSL itself.

This is the same data as in the external mapping file from above, only it’s now in a variable:

<xsl:variable name="countryList">
            <countries>
                <country>
                    <name>Afghanistan</name>
                    <code>AFG</code>
                </country>
                <country>
                    <name>Albania</name>
                    <code>ALB</code>
                </country>
                ...
                <country>
                    <name>Zambia</name>
                    <code>ZMB</code>
                </country>
                <country>
                    <name>Zimbabwe</name>
                    <code>ZWE</code>
                </country>
            </countries>
        </xsl:variable>

In XSLT 1.0 it’s not possible to simply call an XPATH on this variable: a variable with element content holds a “result tree fragment”, which can’t be navigated with path expressions. With a widely supported extension, though, it’s very easy.

Typically I use the default XSLT processor in Java (accessed via JAXP), which as I understand it was derived from Xalan-J at some point. For this processor the extension functions are built in, so enabling them is as easy as declaring the namespace in the opening tag of the XSL file:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exsl="http://exslt.org/common" extension-element-prefixes="exsl">

With that enabled, we can then simply pass the countryList variable through the exsl:node-set() function to produce a useful node-set that can be XPATH’d.

<xsl:value-of select="exsl:node-set($countryList)/countries/country[code=$country]/name"/>

Full template is here.

Local string mapping variable

In a recent piece of work there was a constraint where it was not possible to have external resources, nor to use extension functions, so both of the previously discussed options were ruled out. What this left was good ole string manipulation, which I term “string bashing”. Essentially we include a string variable made up of key/value properties, which we then substring to extract the relevant data as needed for the mapping.

I wrote a separate simple XSL to produce this key/value string, which we just include in our template:

<xsl:variable name="countryList">
            ;AFG=Afghanistan;ALB=Albania;DZA=Algeria;AND=Andorra;AGO=Angola;ATG=Antigua and Barbuda;ARG=Argentina;ARM=Armenia;AUS=Australia;AUT=Austria;AZE=Azerbaijan;BHS=Bahamas
            ...
        </xsl:variable>

Then with a compact but powerful call we can grab the desired value:

<xsl:value-of select="substring-before(substring-after($countryList, concat(';',$countryCode,'=')),';')"/>

The result of this is just the same as the first two mapping options discussed above; it just doesn’t have the simplicity of being able to use an XPATH.
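
To make the nested substring call easier to follow, here is a small Python sketch that imitates the two XPath string functions and applies them in the same order. The shortened COUNTRY_LIST sample and the helper names are assumptions for illustration only.

```python
# Shortened, assumed sample of the key/value string from the template.
# Note the leading and trailing ';' so every entry is delimited on both sides.
COUNTRY_LIST = ";AFG=Afghanistan;ALB=Albania;AUS=Australia;AUT=Austria;"


def substring_after(text, marker):
    """Imitate XPath substring-after(): '' when the marker is absent.

    The marker is assumed to be non-empty, as it always is in this lookup.
    """
    _, found, tail = text.partition(marker)
    return tail if found else ""


def substring_before(text, marker):
    """Imitate XPath substring-before(): '' when the marker is absent."""
    head, found, _ = text.partition(marker)
    return head if found else ""


def lookup_country(code, country_list=COUNTRY_LIST):
    """Find ';CODE=' in the list, then cut the value off at the next ';'."""
    key = ";" + code + "="  # same as concat(';', $countryCode, '=')
    return substring_before(substring_after(country_list, key), ";")


if __name__ == "__main__":
    print(lookup_country("AUS"))        # Australia
    print(repr(lookup_country("XXX")))  # ''
```

Two details fall out of this: an unknown code yields an empty string, and the last entry in the list needs a trailing ';', otherwise the final substring-before finds no delimiter and also returns an empty string.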

Full template is here.

So there are three ways to do mapping with XSL, and there are likely others. Generally I prefer the first option with an external mapping file, as that gives us the best option for re-use elsewhere, but that is not always possible.

Updating DNS Automatically

This post is the next in the series on Cloudifying MoinMoin and Scheduling an ASG; here we update the DNS record automatically when an instance is provisioned.

Now that we’ve got everything up and running on the schedule that we want, and being automatically replaced upon failure, we need to ensure our domain alias A record, i.e. wiki.easyas.info, gets updated with the IP of each new instance as it’s provisioned.

An easy way to do this is to simply execute a script that runs at provisioning time for the instance and updates Route53 with the appropriate IP. A great place to execute this script from is the UserData section of the LaunchConfiguration within the CloudFormation template.

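#!/bin/bash
# Point the wiki A record at this instance's public IP at provisioning time.

DOMAIN=easyas.info
HOSTNAME=wiki
# An A record needs an IPv4 address, so ask the instance metadata for public-ipv4
EC2_PUBLIC=$(/usr/bin/curl -s http://169.254.169.254/latest/meta-data/public-ipv4)

# Credentials for a user that is only allowed to update Route53
export AWS_ACCESS_KEY_ID="<your-key>"
export AWS_SECRET_ACCESS_KEY="<your-secret-key>"

# Create the record, or replace it if it already exists, with a 60 second TTL
RESULT=$(/usr/bin/cli53 rrcreate --replace "$DOMAIN" "$HOSTNAME 60 A $EC2_PUBLIC")
#e.g. cli53 rrcreate --replace easyas.info "wiki 60 A 123.234.101.50"

exit 0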

This script was inspired by this answer on Stack Overflow. The script utilises a great little tool, cli53, that provides the needed DNS functionality from the command line. Alternatively I could use the appropriate AWS CLI command for Route53, which would be cleaner as it wouldn’t require storing the AWS Access Keys on the instance. Unfortunately this time around I gave up after 30+ mins of trying to work it out with somewhat unhelpful “Invalid request” responses. What I have done is create an AWS user and associated Access Keys solely for updating Route53.

Another way is to avoid the changing IP altogether and simply fork out the few extra $/month for an Elastic IP; they’re free while attached to a running instance. But working within the boundaries of the configured schedule, it’d cost almost as much to pay for the EIP (0.5c/hr) as it would for the instance (0.6c/hr), based on current US-West-2 / Oregon pricing.
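
For anyone who wants to try the plain AWS CLI route instead, the part that is easy to get wrong is the JSON change batch. The following is a minimal Python sketch that builds one for the same A record; the function name and the comment text are assumptions, and it has not been run against a real hosted zone.

```python
import json


def build_change_batch(hostname, domain, ip_address, ttl=60):
    """Build a Route53 change batch that creates or updates one A record."""
    return {
        "Comment": "Updated at instance provisioning time",
        "Changes": [
            {
                # UPSERT creates the record if missing, otherwise replaces it.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": f"{hostname}.{domain}.",
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": ip_address}],
                },
            }
        ],
    }


if __name__ == "__main__":
    batch = build_change_batch("wiki", "easyas.info", "123.234.101.50")
    print(json.dumps(batch, indent=2))
```

Saved to a file such as change.json (a hypothetical name), the output could be passed to aws route53 change-resource-record-sets with --hosted-zone-id set to your own zone ID and --change-batch file://change.json.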

Scheduling Actions on an ASG

This post builds further on Cloudified MoinMoin: we’re going to schedule the application to be down at certain times of the day/week. A relatively new feature of CloudFormation templates is the AWS::AutoScaling::ScheduledAction block, which allows us to alter the MinSize and MaxSize properties of the AutoScalingGroup on a schedule.

This can be useful in a number of settings, but perhaps a good example is an internal webapp for a business where there are quite clear “business hours” where it’s critical to have the application up and responsive all day long. The flipside is at night time and other out of hours times it might be desirable to trim the number of instances deployed down to the minimum, which could be zero.

So in this case I’m going to bring the MoinMoin ASG down completely just before midnight every night and then wake it up at 6pm the next day. This is a dead simple way to save 75% on instance costs (obviously the EBS storage keeps on costing, but that is quite minimal anyway), and it still gives me a great wiki for use in the evenings.

I’ve added two new sections to the Resources segment of the template: one to schedule the ASG down to MinSize=0 and MaxSize=0, and another to bring it back up to MinSize=1 and MaxSize=1. As this little wiki is hardly taxing the t2.nano instance, there’s plenty of room to scale up and certainly no need to scale out. If high availability were desired, though, it would likely be easy enough to have two or more instances in an ASG sharing an EFS file system.

Scaling down the ASG:

"moinmoinScheduleSleep": {
  "Type" : "AWS::AutoScaling::ScheduledAction",
  "Properties" : {
    "AutoScalingGroupName" : { "Ref": "moinmoinASG" },
    "DesiredCapacity" : 0,
    "MaxSize" : 0,
    "MinSize" : 0,
    "Recurrence": "59 15 * * *"
  }
}

Waking the ASG back up:

"moinmoinScheduleWake": {
  "Type" : "AWS::AutoScaling::ScheduledAction",
  "Properties" : {
    "AutoScalingGroupName" : { "Ref": "moinmoinASG" },
    "DesiredCapacity" : 1,
    "MaxSize" : 1,
    "MinSize" : 1,
    "Recurrence": "0 10 * * *"
  }
}

This is how it then appears in your ASG configuration within the AWS console:

Schedule tab for ASG

Note: You may have noticed the cron / “Recurrence” times are not 6pm and midnight as I described above. These times are actually in UTC and I’m in UTC+8, so I’ve had to adjust for this.
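
The adjustment is just a shift of the hour field. As a minimal sketch, assuming a whole-hour UTC offset and a daily schedule (the function name is made up for this example):

```python
def local_time_to_utc_cron(hour, minute, utc_offset_hours):
    """Return a daily cron expression in UTC for a local wall-clock time.

    Only whole-hour offsets and every-day schedules are handled. If the shift
    crosses midnight, a day-of-week field would also need adjusting, which
    this sketch does not attempt.
    """
    utc_hour = (hour - utc_offset_hours) % 24
    return f"{minute} {utc_hour} * * *"


if __name__ == "__main__":
    print(local_time_to_utc_cron(23, 59, 8))  # 59 15 * * *
    print(local_time_to_utc_cron(18, 0, 8))   # 0 10 * * *
```

These two results match the Recurrence values used in the sleep and wake actions above.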
