Using AWS SQS as JMS provider with Spring

21 Apr

Recently AWS published a new client library that implements the JMS 1.1 specification and uses their Simple Queue Service (SQS) as the JMS provider (see Jeff Barr’s post here). In this post I will show you how to set up your Maven project to use this library with the Spring Framework.
We will perform the following steps:

  • Create the queue in the AWS Management Console
  • Set up your AWS credentials on your machine
  • Set up your Maven project
  • Create the Spring configuration
  • Create the Java files to produce and receive messages
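
For the Spring configuration step, the wiring can be sketched roughly as below. This is only a sketch based on the library’s documented Spring sample: the bean names, region and queue name are illustrative, and the builder properties should be checked against the version of the library you use.

```xml
<!-- illustrative sketch: credentials are picked up from your machine (step two above) -->
<bean id="credentialsProvider" class="com.amazonaws.auth.DefaultAWSCredentialsProviderChain"/>

<bean id="connectionFactoryBuilder" class="com.amazon.sqs.javamessaging.SQSConnectionFactory$Builder">
    <property name="regionName" value="eu-west-1"/>
    <property name="numberOfMessagesToPrefetch" value="5"/>
    <property name="awsCredentialsProvider" ref="credentialsProvider"/>
</bean>

<!-- the builder's build() method yields a JMS ConnectionFactory backed by SQS -->
<bean id="connectionFactory" factory-bean="connectionFactoryBuilder" factory-method="build"/>

<!-- a plain Spring JmsTemplate can then send to the queue created in step one -->
<bean id="jmsTemplate" class="org.springframework.jms.core.JmsTemplate">
    <property name="connectionFactory" ref="connectionFactory"/>
    <property name="defaultDestinationName" value="MyQueue"/>
</bean>
```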

This post only shows some basic use of the SQS possibilities, but it should be enough to get you started. I assume you have already created your AWS account and that you are familiar with Maven and a basic Spring setup. Continue reading

Finished the course “Mining Massive Data Sets”

11 Apr

As I mentioned in a previous post I have been following the Coursera course ‘Mining Massive Datasets‘. Anyone who is not familiar with Coursera should have a look, as they offer a lot of (free) courses that you can follow remotely. This specific course is taught by three instructors with a Stanford background: Jure Leskovec, Anand Rajaraman and Jeffrey Ullman. The three of them are also the authors of the corresponding book ‘Mining of Massive Datasets’, which you can find here. Continue reading

Message Content Filtering with WSO2 ESB

29 Mar

Every integration architect or developer should be familiar with the Enterprise Integration Patterns (EIP) as described by Gregor Hohpe and Bobby Woolf. One of these patterns is the ‘Content Filter’ (not to be confused with the Message Filter pattern).
There are multiple ways to achieve this in WSO2, using different mediators. One way is the XSLT Mediator, where you can simply use an XSLT to do the filtering. The other (not so obvious, given its name) is the Enrich Mediator.
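
A minimal sketch of the Enrich approach (the element name and namespace below are made up for illustration): the source XPath selects the part of the message you want to keep, and by targeting the body everything else in the payload is effectively filtered out.

```xml
<enrich>
    <!-- select only the <ord:order> element from the incoming message ... -->
    <source clone="true" xpath="//ord:order" xmlns:ord="http://example.org/order"/>
    <!-- ... and make it the new message body, discarding the rest -->
    <target type="body"/>
</enrich>
```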
Continue reading

Running PageRank Hadoop job on AWS Elastic MapReduce

2 Mar

In a previous post I described an example of performing a PageRank calculation, which is part of the Mining Massive Datasets course, with Apache Hadoop. In that post I took an existing Hadoop job in Java and modified it somewhat (added unit tests and made file paths configurable by a parameter). This post shows how to use this job on a real-life Hadoop cluster. The cluster is an AWS EMR cluster of 1 master node and 5 core nodes, each backed by an m3.xlarge instance.
The first step is to prepare the input for the cluster. I make use of AWS S3, since this is convenient when working with EMR. I created a new bucket, ’emr-pagerank-demo’, with the following subfolders:

  • in: the folder containing the input files for the job
  • job: the folder containing my executable Hadoop jar file
  • log: the folder where EMR will put its log files

Continue reading
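
With the bucket layout above in place, the Hadoop job can be submitted to the cluster as a custom jar step. A sketch of what such a step definition could look like (the jar name, main arguments and failure action are illustrative, not taken from the post):

```json
[
  {
    "Type": "CUSTOM_JAR",
    "Name": "PageRank",
    "ActionOnFailure": "TERMINATE_CLUSTER",
    "Jar": "s3://emr-pagerank-demo/job/pagerank-job.jar",
    "Args": ["s3://emr-pagerank-demo/in", "s3://emr-pagerank-demo/out"]
  }
]
```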

Calculate PageRanks with Apache Hadoop

18 Feb

Currently I am following the Coursera training ‘Mining Massive Datasets‘. I have been interested in MapReduce and Apache Hadoop for some time, and with this course I hope to get more insight into when and how MapReduce can help to solve real-world business problems (I described another way to do so here). This Coursera course mainly focuses on the theory of the algorithms used and less on the coding itself. The first week is about PageRank and how Google used it to rank pages. Luckily there is a lot to be found about this topic in combination with Hadoop. I ended up here and decided to have a closer look at this code. Continue reading

Sharing your sources stored in a Git repository

12 Feb

I have been using Git for some time now and so far I like it a lot. Especially the setup described by Vincent Driessen, in combination with git-flow (and, perhaps even better for lots of Java projects, its Maven implementation), makes using it easy.
However, you might end up in a situation, as I did lately, where you have to share your sources with someone who doesn’t use Git. In that case there is a simple Git command to help you out: ‘git archive‘. You can use it like this:
git archive --format zip --output <filename>.zip develop
In this case a zip file is created (with the name you pass to --output) containing all the sources that are in the ‘develop’ branch.
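
To see the command in action you can try it on a throwaway repository (assuming git is installed; the file and directory names are just examples):

```shell
set -e
# create a throwaway repository with one committed file
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "hello" > readme.txt
git add readme.txt
git -c user.name=demo -c user.email=demo@example.com commit -qm "add readme"
# archive the current branch (HEAD here; a branch name like 'develop' works too)
git archive --format zip --output demo-sources.zip HEAD
```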
For lots more Git commands see this page and for more general background info about the way Git works see this book.

Making use of the open sources of WSO2 ESB

28 Jan

When implementing services using the WSO2 stack (or any other open source Java framework) you will sooner or later run into a situation where the framework doesn’t behave the way you expect it to. Or you just want to verify the way a product works. I lately had several of these experiences, and I got around them by setting up a remote debug session so I could step through the code to see exactly what was happening. Of course this only makes sense if you have the source code available (long live open source :-)).
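
Setting up such a remote debug session comes down to starting the server JVM with the standard JDWP agent enabled; the Carbon startup script exposes this through a debug flag (port 5005 is just an example):

```
# passes the JDWP agent options to the JVM, roughly equivalent to
# -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005
sh bin/wso2server.sh --debug 5005
```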
In this post I show an example with the WSO2 ESB (v4.8.1) in combination with IntelliJ IDEA. Continue reading

Base64 encoding of binary file content

28 Jan

For testing a base64Binary XML type in one of my projects I needed an example of base64 encoded file content. There is a simple command for that (at least when you are working on a Mac). For a file called ‘abc.pdf’ the command is:

openssl base64 -in abc.pdf -out encoded.txt

The result is a file ‘encoded.txt’ with a base64 encoded string:
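
To sanity-check the result you can decode it back with the -d flag and compare it with the original (a quick sketch; the sample file just stands in for ‘abc.pdf’):

```shell
set -e
cd "$(mktemp -d)"
# create a small sample file to stand in for abc.pdf
printf 'sample file content\n' > abc.pdf
# encode, then decode back
openssl base64 -in abc.pdf -out encoded.txt
openssl base64 -d -in encoded.txt -out roundtrip.pdf
# the round trip reproduces the original byte-for-byte
cmp abc.pdf roundtrip.pdf
```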
Continue reading

Using your own WSDL with a WSO2 ESB Proxy Service

21 Jan

It is common practice to use an external XSD file in your WSDL. This way you can easily reuse your XSD in other places. However, if you want to use such a WSDL in your WSO2 ESB Proxy Service, you have to configure the path to the XSD correctly.
This post describes how to set this up. More background info about this can be found here. I created a Multi Module Maven project and added the WSDL artifact and the XSDs, so that the WSDL and the XSD files each ended up in their own folder (‘wsdl’ and ‘xsd’).
In the WSDL I imported the ‘EchoElements.xsd’ like this:

<?xml version='1.0' encoding='UTF-8' standalone='no'?>
<wsdl:definitions xmlns:soap='http://schemas.xmlsoap.org/wsdl/soap/'
    xmlns:tns='' xmlns:elm=''
    xmlns:wsdl='http://schemas.xmlsoap.org/wsdl/'
    xmlns:xsd='http://www.w3.org/2001/XMLSchema'
    name='EchoWsdl' targetNamespace=''>
  <xsd:import namespace=''
      schemaLocation='../xsd/EchoElements.xsd' />

As you can see I go up one directory and look for the XSD file in the ‘xsd’ folder.
In the corresponding EchoElements.xsd I also import another XSD like this:

  <xs:import namespace=''  schemaLocation='./EchoTypes.xsd' />

So I am looking for the ‘EchoTypes.xsd’ file in the same folder as the ‘parent’ XSD.

Now this will translate to the following proxy service configuration:

  <publishWSDL key='gov:wsdl/EchoWsdl.wsdl'>
     <resource location='../xsd/EchoElements.xsd' key='gov:/xsd/EchoElements.xsd'/>
     <resource location='./EchoTypes.xsd' key='gov:/xsd/EchoTypes.xsd'/>
  </publishWSDL>

As you can see, the keys defined point to the keys of the artifacts in the registry. Another important thing to notice is that the location attribute of each resource has to match the schemaLocation attribute defined in the artifact that imports the resource.
If we for example modify the ‘EchoElements.xsd’ and rewrite the import element to this:

  <xs:import namespace=''  schemaLocation='../xsd/EchoTypes.xsd' />

This would actually point to the same (physical) location as seen from this XSD. However, if we deploy this configuration the ESB will throw an error, because it won’t be able to match the resource with the defined import:

ERROR - ProxyService Error building service from WSDL
org.apache.axis2.AxisFault: WSDLException (at /wsdl:definitions/wsdl:types/xsd:schema/xs:schema): faultCode=PARSER_ERROR: Problem parsing 'file:../xsd/./EchoTypes.xsd'.: ../xsd/./EchoTypes.xsd (No such file or directory)
at org.apache.axis2.AxisFault.makeFault(
at org.apache.axis2.description.WSDL11ToAxisServiceBuilder.populateService(
at org.apache.synapse.core.axis2.ProxyService.buildAxisService(
at org.apache.synapse.deployers.ProxyServiceDeployer.deploySynapseArtifact(
at org.wso2.carbon.proxyadmin.ProxyServiceDeployer.deploySynapseArtifact(
at org.apache.synapse.deployers.AbstractSynapseArtifactDeployer.deploy(
at org.wso2.carbon.application.deployer.synapse.SynapseAppDeployer.deployArtifacts(
at org.wso2.carbon.application.deployer.internal.ApplicationManager.deployCarbonApp(
at org.wso2.carbon.application.deployer.CappAxis2Deployer.deploy(
at org.apache.axis2.deployment.repository.util.DeploymentFileData.deploy(
at org.apache.axis2.deployment.DeploymentEngine.doDeploy(

If we now also change the resource declaration in the Proxy Service so that it matches the import in the XSD, it works again:

  <publishWSDL key='gov:wsdl/EchoWsdl.wsdl'>
     <resource location='../xsd/EchoElements.xsd' key='gov:/xsd/EchoElements.xsd'/>
     <resource location='../xsd/EchoTypes.xsd' key='gov:/xsd/EchoTypes.xsd'/>
  </publishWSDL>

Developing with WSO2

29 Nov

For a few months now I have been back working with WSO2 products. In upcoming posts I will describe some of the (small) issues I ran into and how to solve them.

The first thing I did when setting up my development environment was downloading the Developer Studio (64-bit version) on my Mac. Continue reading

