Message Content Filtering with WSO2 ESB

29 Mar

Every integration architect or developer should be familiar with the Enterprise Integration Patterns (EIP) as described by Gregor Hohpe and Bobby Woolf. One of these patterns is the ‘Content Filter’ (not to be confused with the Message Filter pattern).
There are multiple ways to achieve this in WSO2 ESB, using different mediators. One way is the XSLT Mediator, where you simply use an XSLT stylesheet to do the filtering. The other one (not so obvious from its name) is the Enrich Mediator.
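As a rough illustration of the Enrich Mediator approach, the sketch below replaces the whole message body with a single child element, so everything else is filtered out. The sequence name, the ‘ord’ namespace and the element names are hypothetical and only serve the example; it is not necessarily the configuration used in the full post.

```xml
<sequence xmlns="http://ws.apache.org/ns/synapse" name="ContentFilterSequence">
    <!-- Hypothetical payload: an <ord:order> with many children, of which
         only <ord:customer> should survive the filtering. -->
    <enrich>
        <!-- Select the part of the message to keep -->
        <source xmlns:ord="http://example.org/order"
                clone="true"
                type="custom"
                xpath="//ord:order/ord:customer"/>
        <!-- Replace the current body with the selected element -->
        <target type="body"/>
    </enrich>
    <!-- Log the filtered message so the result can be inspected -->
    <log level="full"/>
</sequence>
```

The same result could be reached with an XSLT stylesheet that copies only the wanted elements; the Enrich variant keeps the filter inside the mediation configuration.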
Continue reading

Running PageRank Hadoop job on AWS Elastic MapReduce

2 Mar

In a previous post I described an example of a PageRank calculation with Apache Hadoop, which is part of the Mining Massive Datasets course. In that post I took an existing Hadoop job in Java and modified it somewhat (I added unit tests and made the file paths configurable through a parameter). This post shows how to run this job on a real-life Hadoop cluster. The cluster is an AWS EMR cluster of 1 master node and 5 core nodes, each backed by an m3.xlarge instance.
The first step is to prepare the input for the cluster. I use AWS S3 since it is convenient when working with EMR. I created a new bucket, ‘emr-pagerank-demo’, with the following subfolders:

  • in: the folder containing the input files for the job
  • job: the folder containing my executable Hadoop jar file
  • log: the folder where EMR will put its log files

Continue reading

Calculate PageRanks with Apache Hadoop

18 Feb

Currently I am following the Coursera course ‘Mining Massive Datasets’. I have been interested in MapReduce and Apache Hadoop for some time, and with this course I hope to get more insight into when and how MapReduce can help to solve real-world business problems (I described another way to do so here). The course focuses mainly on the theory behind the algorithms and less on the coding itself. The first week is about PageRank and how Google used it to rank pages. Luckily there is a lot to find about this topic in combination with Hadoop. I ended up here and decided to have a closer look at this code.
Continue reading

Sharing your sources stored in a Git repository

12 Feb

I have been using Git for some time now and so far I like it a lot. Especially the setup described by Vincent Driessen, in combination with git-flow (or, perhaps even better for lots of Java projects, the Maven implementation of it), makes it easy to use.
However, you might end up in a situation, as I did lately, where you have to share your sources with someone who doesn’t use Git. In that case there is a simple Git command to help you out: ‘git archive’. You can use it like this:
git archive --format zip --output develop
In this case a zip file ‘’ is created containing all the sources that are in the ‘develop’ branch.
For lots more Git commands see this page and for more general background info about the way Git works see this book.

Making use of the open sources of WSO2 ESB

28 Jan

When implementing services using the WSO2 stack (or any other open source Java framework) you will sooner or later run into a situation where the framework doesn’t behave as you expect. Or you just want to verify the way a product works. I recently had several of these experiences and got around them by setting up a remote debug session, so I could step through the code to see exactly what was happening. Of course this only makes sense if you have the source code available (long live open source :-)).
This post shows an example with the WSO2 ESB (v 4.8.1) in combination with IntelliJ IDEA.
Continue reading

Base64 encoding of binary file content

28 Jan

To test a base64Binary XML type in one of my projects I needed an example of base64-encoded file content. There is a simple command for that (at least when you are working on a Mac). For a file called ‘abc.pdf’ the command is:

openssl base64 -in abc.pdf -out encoded.txt

The result is a file ‘encoded.txt’ containing a base64-encoded string:
Continue reading
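
For context, an element of the base64Binary type could be declared along these lines. This is only a sketch: the element names (‘document’, ‘fileName’ and ‘content’) are made up for illustration and are not taken from the project mentioned above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">
    <!-- Hypothetical wrapper element for an uploaded file -->
    <xs:element name="document">
        <xs:complexType>
            <xs:sequence>
                <!-- Plain file name, e.g. abc.pdf -->
                <xs:element name="fileName" type="xs:string"/>
                <!-- File content as base64 text, i.e. the content of encoded.txt -->
                <xs:element name="content" type="xs:base64Binary"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>
```

In a test message the text of the ‘content’ element would then simply be the string from ‘encoded.txt’.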

Using your own WSDL with a WSO2 ESB Proxy Service

21 Jan

It is common practice to use an external XSD file in your WSDL. This way you can easily reuse your XSD in other places. However, if you want to use such a WSDL in your WSO2 ESB Proxy Service, you have to configure the path to the XSD correctly.
This post describes how to set this up. More background info about this can be found here. I created a multi-module Maven project and added the WSDL artifact and the XSDs, so I got a result like this:
[Screenshot: resulting project structure]
In the WSDL I imported the ‘EchoElements.xsd’ like this:

<?xml version='1.0' encoding='UTF-8' standalone='no'?>
<wsdl:definitions xmlns:soap='' 
xmlns:tns='' xmlns:elm='' 
xmlns:wsdl='' xmlns:xsd='' name='EchoWsdl' targetNamespace=''>
       <xsd:import namespace=''
		schemaLocation='../xsd/EchoElements.xsd' />

As you can see I go up one directory and look for the XSD file in the ‘xsd’ folder.
In the corresponding EchoElements.xsd I also import another XSD like this:

  <xs:import namespace=''  schemaLocation='./EchoTypes.xsd' />

So I am looking for the ‘EchoTypes.xsd’ file in the same folder as the ‘parent’ XSD.

Now this translates to the following proxy service configuration:

  <publishWSDL key='gov:wsdl/EchoWsdl.wsdl'>
     <resource location='../xsd/EchoElements.xsd' key='gov:/xsd/EchoElements.xsd'/>
     <resource location='./EchoTypes.xsd' key='gov:/xsd/EchoTypes.xsd'/>
  </publishWSDL>

As you can see, the keys point to the keys of the artifacts in the registry. Another important thing to notice is that the location attribute of the resource has to match the schemaLocation attribute in the artifact that imports the resource.
If we, for example, modify ‘EchoElements.xsd’ and rewrite the import element to this:

  <xs:import namespace=''  schemaLocation='../xsd/EchoTypes.xsd' />

This would actually point to the same (physical) location as seen from this XSD. However, if we deploy this configuration the ESB will throw an error, because it is not able to match the resource with the defined import:

ERROR - ProxyService Error building service from WSDL
org.apache.axis2.AxisFault: WSDLException (at /wsdl:definitions/wsdl:types/xsd:schema/xs:schema): faultCode=PARSER_ERROR: Problem parsing 'file:../xsd/./EchoTypes.xsd'.: ../xsd/./EchoTypes.xsd (No such file or directory)
at org.apache.axis2.AxisFault.makeFault(
at org.apache.axis2.description.WSDL11ToAxisServiceBuilder.populateService(
at org.apache.synapse.core.axis2.ProxyService.buildAxisService(
at org.apache.synapse.deployers.ProxyServiceDeployer.deploySynapseArtifact(
at org.wso2.carbon.proxyadmin.ProxyServiceDeployer.deploySynapseArtifact(
at org.apache.synapse.deployers.AbstractSynapseArtifactDeployer.deploy(
at org.wso2.carbon.application.deployer.synapse.SynapseAppDeployer.deployArtifacts(
at org.wso2.carbon.application.deployer.internal.ApplicationManager.deployCarbonApp(
at org.wso2.carbon.application.deployer.CappAxis2Deployer.deploy(
at org.apache.axis2.deployment.repository.util.DeploymentFileData.deploy(
at org.apache.axis2.deployment.DeploymentEngine.doDeploy(

If we now also change the resource declaration in the Proxy Service so it matches the import in the XSD it works again:

  <publishWSDL key='gov:wsdl/EchoWsdl.wsdl'>
     <resource location='../xsd/EchoElements.xsd' key='gov:/xsd/EchoElements.xsd'/>
     <resource location='../xsd/EchoTypes.xsd' key='gov:/xsd/EchoTypes.xsd'/>
  </publishWSDL>
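
To show where this element lives, here is a minimal sketch of a complete proxy service around it. The proxy name ‘EchoProxy’ and the backend address are assumptions made for this example only; the publishWSDL part is the one discussed above.

```xml
<proxy xmlns='http://ws.apache.org/ns/synapse'
       name='EchoProxy'
       transports='http https'
       startOnLoad='true'>
    <target>
        <!-- Hypothetical backend; replace with the real service address -->
        <endpoint>
            <address uri='http://localhost:9000/services/EchoService'/>
        </endpoint>
        <outSequence>
            <!-- Return the backend response to the client -->
            <send/>
        </outSequence>
    </target>
    <!-- Each location must be literally equal to the schemaLocation
         used in the artifact that imports the resource -->
    <publishWSDL key='gov:wsdl/EchoWsdl.wsdl'>
        <resource location='../xsd/EchoElements.xsd' key='gov:/xsd/EchoElements.xsd'/>
        <resource location='../xsd/EchoTypes.xsd' key='gov:/xsd/EchoTypes.xsd'/>
    </publishWSDL>
</proxy>
```

The point to take away is that the location is compared as a plain string: ‘./EchoTypes.xsd’ and ‘../xsd/EchoTypes.xsd’ may end up at the same file, but only the one written in the import will match.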

Developing with WSO2

29 Nov

For a few months now I have been working with WSO2 products again. In the upcoming posts I will describe some of the (small) issues I ran into and how to solve them.

The first thing I did when setting up my development environment was to download Developer Studio (64-bit version) on my Mac.
Continue reading

right-pad values with XSLT

19 Oct

This post presents an XSLT function that can be used to right-pad the value of an element with a chosen character to a certain length. No rocket science, but this might come in handy again, so by putting it down here I don’t have to reinvent it later. The function itself looks like this:

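<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:functx="http://my/functions">

    <!-- Append $length pad characters, then cut the result back to $length -->
    <xsl:function name="functx:pad-string-to-length" as="xsd:string">
    <xsl:param name="stringToPad" as="xsd:string?"/>
    <xsl:param name="padChar" as="xsd:string"/>
    <xsl:param name="length" as="xsd:integer"/>
    <xsl:sequence select="
     substring(
       string-join (
         ($stringToPad, for $i in (1 to $length) return $padChar)
         ,'')
       ,1,$length)"/>
    </xsl:function>
</xsl:stylesheet>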


Continue reading

Running MapReduce Design Patterns on Cloudera’s CDH5

9 Sep

One of the better books I have read so far about MapReduce is ‘MapReduce Design Patterns’, as I mentioned in my previous post. In this post I describe the steps to get started with running the Hadoop source code that accompanies the book on Cloudera’s latest Hadoop distribution, CDH5. I decided to use HDFS and YARN for testing the patterns. Take the following steps to get it all up and running:

  • Get CDH5 and run it
  • Install IntelliJ IDEA
  • Upgrade Git client
  • Create local directory
  • Check out source code
  • Install source data
  • Run the job

Continue reading

