Tag Archives: MapReduce

Finished the course “Mining Massive Data Sets”

As I mentioned in a previous post, I have been following the Coursera course ‘Mining Massive Datasets’. Anyone who is not familiar with Coursera should have a look, as they offer a lot of (free) courses that you can follow … Continue reading

Posted in Hadoop, MapReduce | Comments Off on Finished the course “Mining Massive Data Sets”

Running PageRank Hadoop job on AWS Elastic MapReduce

In a previous post I described an example of performing a PageRank calculation with Apache Hadoop, which is part of the Mining Massive Datasets course. In that post I took an existing Hadoop job in Java and modified it somewhat … Continue reading

Posted in AWS, Hadoop | 4 Comments

Calculate PageRanks with Apache Hadoop

Currently I am following the Coursera course ‘Mining Massive Datasets’. I have been interested in MapReduce and Apache Hadoop for some time, and with this course I hope to get more insight into when and how MapReduce can help to … Continue reading

Posted in Hadoop, MapReduce | 1 Comment

Running MapReduce Design Patterns on Cloudera’s CDH5

One of the better books I have read so far about MapReduce is ‘MapReduce Design Patterns’, as I mentioned in my previous post. In this post I describe the steps to get started with running the Hadoop source code that goes … Continue reading

Posted in Hadoop | Comments Off on Running MapReduce Design Patterns on Cloudera’s CDH5

Hadoop and MapReduce Design Patterns

Recently I finished my latest project, in which I was implementing Mule ESB. This gives me some room in my schedule to dive into the world of Big Data again (more specifically, the Hadoop ecosystem). I have looked into this … Continue reading

Posted in Hadoop | 2 Comments

Unit testing a Java Hadoop job

In my previous post I showed how to set up a complete Maven-based project to create a Hadoop job in Java. Of course it wasn’t complete, because it was missing the unit test part :-). In this post I show … Continue reading

Posted in Hadoop, Maven | 2 Comments

Writing a Hadoop MapReduce task in Java

Although the Hadoop framework itself is written in Java, MapReduce jobs can be written in many different languages. In this post I show how to create a MapReduce job in Java based on a Maven project like any other Java … Continue reading

Posted in Hadoop | 5 Comments