About me
Pascal Alma
Pascal is a senior IT consultant and has been working in IT since 1997. He closely follows the latest developments in new technologies (Mobile, Cloud, Big Data) and is particularly interested in Java open source tool stacks, cloud-related technologies like AWS, and mobile development such as building iOS apps with Swift. Specialties: Java/JEE/Spring, Amazon AWS, API/REST, Big Data, Continuous Delivery, Swift/iOS
Tag Archives: Hadoop
Running PageRank Hadoop job on AWS Elastic MapReduce
In a previous post I described an example of performing a PageRank calculation with Apache Hadoop, which is part of the Mining Massive Datasets course. In that post I took an existing Hadoop job in Java and modified it somewhat …
Calculate PageRanks with Apache Hadoop
Currently I am following the Coursera training ‘Mining Massive Datasets’. I have been interested in MapReduce and Apache Hadoop for some time, and with this course I hope to get more insight into when and how MapReduce can help to …
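To make the PageRank idea concrete, here is a minimal sketch in plain Java. It is not the Hadoop job from the post. It assumes a tiny hypothetical three-page graph, a damping factor of 0.85, and that every page has at least one outgoing link. The class and method names are made up for illustration.

```java
import java.util.Arrays;

public class PageRankSketch {

    // One power-iteration step: every page splits its current rank evenly
    // over its out-links, and a (1 - beta) share is spread uniformly.
    // Assumption: every page has at least one out-link (no dead ends).
    static double[] step(int[][] outLinks, double[] rank, double beta) {
        int n = rank.length;
        double[] next = new double[n];
        Arrays.fill(next, (1.0 - beta) / n);
        for (int page = 0; page < n; page++) {
            for (int target : outLinks[page]) {
                next[target] += beta * rank[page] / outLinks[page].length;
            }
        }
        return next;
    }

    // Start from a uniform distribution and repeat the step a fixed number of times.
    static double[] pageRank(int[][] outLinks, double beta, int iterations) {
        int n = outLinks.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);
        for (int i = 0; i < iterations; i++) {
            rank = step(outLinks, rank, beta);
        }
        return rank;
    }

    public static void main(String[] args) {
        // Hypothetical graph: page 0 links to 1 and 2, page 1 links to 2, page 2 links to 0.
        int[][] outLinks = { {1, 2}, {2}, {0} };
        System.out.println(Arrays.toString(pageRank(outLinks, 0.85, 50)));
    }
}
```

In a MapReduce setting, the inner loop roughly corresponds to the map phase (emitting rank contributions per target page) and the summation per target to the reduce phase.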
Running MapReduce Design Patterns on Cloudera’s CDH5
One of the better books I have read so far about MapReduce is ‘MapReduce Design Patterns’, as I mentioned in my previous post. In this post I describe the steps to get started with running the Hadoop source code that goes …
Hadoop and MapReduce Design Patterns
Recently I finished my latest project, in which I implemented Mule ESB. This gives me some room in my schedule to dive into the world of Big Data again (more specifically the Hadoop ecosystem). I have looked into this …
Run your Hadoop MapReduce job on Amazon EMR
A while ago I posted about how to set up an EMR cluster using the CLI. In this post I will show how to set up the cluster using the AWS SDK for Java. The best way to show how to …
Unit testing a Java Hadoop job
In my previous post I showed how to set up a complete Maven-based project to create a Hadoop job in Java. Of course it wasn’t complete, because it was missing the unit test part :-). In this post I show …
Writing a Hadoop MapReduce task in Java
Although the Hadoop framework itself is written in Java, MapReduce jobs can be written in many different languages. In this post I show how to create a MapReduce job in Java based on a Maven project like any other Java …
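As a rough illustration of the shape of a MapReduce job, the sketch below imitates the map, shuffle and reduce phases in plain Java, without the Hadoop API. Word counting is an assumed example task here, not necessarily the one used in the post, and all names are hypothetical.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MapReduceSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: sum all counts that were emitted for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: run map on every line, group values by key (the "shuffle"),
    // then run reduce once per key. A real framework does this across machines.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {hadoop=2, mapreduce=2, on=1, runs=1}
        System.out.println(run(Arrays.asList("Hadoop runs MapReduce", "MapReduce on Hadoop")));
    }
}
```

In an actual Hadoop job the `map` and `reduce` methods would live in Mapper and Reducer classes, and the grouping step would be handled by the framework.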
Hadoop on the Amazon cloud by BigData’University’
Last night I stumbled upon this free online training program where you can learn how to combine Hadoop and Amazon AWS! You even receive a certificate if you pass the test at the end of the course. So …
Running Hive jobs on AWS EMR
In a previous post I showed how to run a simple job using AWS Elastic MapReduce (EMR). In this example we continue to make use of EMR but now to run a Hive job. Hive is a data warehouse system …