Java Hadoop weird join behaviour - java

Aim
I have two CSV files and I am trying to join them: one contains movieId, title and the other contains userId, movieId, comment-tag. I want to find out how many comment-tags each movie has, by printing title, comment_count. Here is my code:
Driver
public class Driver
{
public Driver(String[] args)
{
if (args.length < 3) {
System.err.println("input path ");
}
try {
Job job = Job.getInstance();
job.setJobName("movie tag count");
// set file input/output path
MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, TagMapper.class);
MultipleInputs.addInputPath(job, new Path(args[2]), TextInputFormat.class, MovieMapper.class);
FileOutputFormat.setOutputPath(job, new Path(args[3]));
// set jar class name
job.setJarByClass(Driver.class);
// set mapper and reducer to job
job.setReducerClass(Reducer.class);
// set output key class
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
int returnValue = job.waitForCompletion(true) ? 0 : 1;
System.out.println(job.isSuccessful());
System.exit(returnValue);
} catch (IOException | ClassNotFoundException | InterruptedException e) {
e.printStackTrace();
}
}
}
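As a side note, and not the cause of the "_" problem: with MultipleInputs the mappers are already bound to the job, but it is worth checking that setReducerClass resolves to your own Reducer class and that the map and final output types are declared explicitly, since the reducer below emits IntWritable counts. A hedged sketch of those settings:
// Sketch only (not the cause of the "_" issue): type settings a MultipleInputs
// reduce-side join usually declares explicitly.
// Make sure this resolves to YOUR Reducer, not org.apache.hadoop.mapreduce.Reducer:
job.setReducerClass(Reducer.class);

// map output types: key = movieId, value = title or "_"
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);

// final output types: what the reducer actually writes
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);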
MovieMapper
public class MovieMapper extends org.apache.hadoop.mapreduce.Mapper<Object, Text, Text, Text>
{
@Override
protected void map(Object key, Text value, Context context) throws IOException, InterruptedException
{
String line = value.toString();
String[] items = line.split("(?!\\B\"[^\"]*),(?![^\"]*\"\\B)"); //comma not in quotes
String movieId = items[0].trim();
if(tryParseInt(movieId))
{
context.write(new Text(movieId), new Text(items[1].trim()));
}
}
private boolean tryParseInt(String s)
{
try {
Integer.parseInt(s);
return true;
} catch (NumberFormatException e) {
return false;
}
}
}
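Both mappers rely on the same "comma not in quotes" regex. A quick standalone check of how it splits a quoted title, using a made-up sample line (note the quotes stay attached to the title):
// Standalone check of the split regex; the sample line is made up.
public class SplitCheck {
    public static void main(String[] args) {
        String line = "11,\"American President, The (1995)\"";
        String[] items = line.split("(?!\\B\"[^\"]*),(?![^\"]*\"\\B)");
        for (String item : items) {
            System.out.println(item);
        }
        // Prints two items: 11 and "American President, The (1995)"
        // i.e. the comma inside the quoted title is not treated as a delimiter.
    }
}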
TagMapper
public class TagMapper extends org.apache.hadoop.mapreduce.Mapper<Object, Text, Text, Text>
{
@Override
protected void map(Object key, Text value, Context context) throws IOException, InterruptedException
{
String line = value.toString();
String[] items = line.split("(?!\\B\"[^\"]*),(?![^\"]*\"\\B)");
String movieId = items[1].trim();
if(tryParseInt(movieId))
{
context.write(new Text(movieId), new Text("_"));
}
}
private boolean tryParseInt(String s)
{
try {
Integer.parseInt(s);
return true;
} catch (NumberFormatException e) {
return false;
}
}
}
Reducer
public class Reducer extends org.apache.hadoop.mapreduce.Reducer<Text, Text, Text, IntWritable>
{
@Override
protected void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException
{
int noOfFrequency = 0;
Text movieTitle = new Text();
for (Text o : values)
{
if(o.toString().trim().equals("_"))
{
noOfFrequency++;
}
else
{
System.out.println(o.toString());
movieTitle = o;
}
}
context.write(movieTitle, new IntWritable(noOfFrequency));
}
}
The problem
The result I get is something like this:
title, count
_, count
title, count
title, count
_, count
title, count
_, count
How does this _ get to be the key? I can't understand it. There is an if statement that checks for an _, counts it, and does not use it as the title. Is there something wrong with the toString() method, so that the equals comparison fails? Any ideas?

It is not weird: you iterate through values, and o is a reference to the elements of values, which here are Text objects. Hadoop reuses one Text instance for all the values in the iterator, so when you write movieTitle = o you make movieTitle point to that shared object. In a later iteration the same object is refilled with "_", so movieTitle now also reads "_".
If you change your code like this, everything works fine:
int noOfFrequency = 0;
Text movieTitle = null;
for (Text o : values)
{
if(o.toString().trim().equals("_"))
{
noOfFrequency++;
}
else
{
movieTitle = new Text(o.toString());
}
}
context.write(movieTitle, new IntWritable(noOfFrequency));
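A sketch of the whole reducer with that copy applied, plus a guard for movieIds that only ever received "_" values so movieTitle is never left null; Text's copy constructor does the same job as new Text(o.toString()):
public class Reducer extends org.apache.hadoop.mapreduce.Reducer<Text, Text, Text, IntWritable>
{
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException
    {
        int noOfFrequency = 0;
        Text movieTitle = null;
        for (Text o : values)
        {
            if (o.toString().trim().equals("_"))
            {
                noOfFrequency++;
            }
            else
            {
                movieTitle = new Text(o); // copy; do not keep a reference to the reused object
            }
        }
        if (movieTitle != null) // only emit movies for which a title record was seen
        {
            context.write(movieTitle, new IntWritable(noOfFrequency));
        }
    }
}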

Related

why is the first output line in map reduce null in java

I don't understand why the first output of my MapReduce job is 0 and null.
The output is: url ; number of visits
and here is the mapper class:
public class WordCountMapper extends
Mapper<LongWritable, Text, Text, IntWritable>
{
public void map(LongWritable cle, Text valeur, Context sortie)
throws IOException
{
String url="";
int nbVisites=0;
Pattern httplogPattern = Pattern.compile("([^\\s]+) - - \\[(.+)\\] \"([^\\s]+) (/[^\\s]*) HTTP/[^\\s]+\" [^\\s]+ ([0-9]+)");
String ligne = valeur.toString();
if (ligne.length()>0) {
Matcher matcher = httplogPattern.matcher(ligne);
if (matcher.matches()) {
url = matcher.group(1);
nbVisites = Integer.parseInt(matcher.group(5));
}
}
Text urlText = new Text(url);
IntWritable value = new IntWritable(nbVisites);
try
{
sortie.write(urlText, value);
System.out.println(urlText + " ; " + value);
}
catch (InterruptedException e)
{
e.printStackTrace();
}
}
}
and the reducer:
public class WordCountReducer extends
Reducer<Text, IntWritable, Text, IntWritable>
{
public void reduce(Text key, Iterable<IntWritable> values, Context sortie) throws IOException, InterruptedException
{
Iterator<IntWritable> it = values.iterator();
int nb=0;
while (it.hasNext()) {
nb = nb + it.next().get();
}
try {
sortie.write(key, new IntWritable(nb));
System.out.println(key.toString() + ";" + nb);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
Each line of the input file looks like this:
199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245
and here is the output:
0
04-dynamic-c.rotterdam.luna.net 4
06-dynamic-c.rotterdam.luna.net 1
10.salc.wsu.edu 3
11.ts2.mnet.medstroms.se 1
128.100.183.222 4
128.102.149.149 4
As you can see, the first line is a couple of null values.
Thank you
You get an empty key (not null) because url defaults to an empty string whenever a line does not match the pattern (and nbVisites stays 0), so the mapper emits ("", 0) pairs and the reducer sums them to 0.
It works fine if you check that the line actually matched before writing the output.
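If you only want the minimal change rather than a rewrite, it is enough to move the write inside the match check; a sketch against the original mapper body, keeping its variable names:
// Minimal fix: only emit a (url, nbVisites) pair for lines that match the log pattern.
if (ligne.length() > 0) {
    Matcher matcher = httplogPattern.matcher(ligne);
    if (matcher.matches()) {
        url = matcher.group(1);
        nbVisites = Integer.parseInt(matcher.group(5));
        try {
            sortie.write(new Text(url), new IntWritable(nbVisites));
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}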
Here's a refactored version of your code
public class WebLogDriver extends Configured implements Tool {
public static final String APP_NAME = WebLogDriver.class.getSimpleName();
public static void main(String[] args) throws Exception {
final int status = ToolRunner.run(new Configuration(), new WebLogDriver(), args);
System.exit(status);
}
@Override
public int run(String[] args) throws Exception {
Configuration conf = getConf();
Job job = Job.getInstance(conf, APP_NAME);
job.setJarByClass(WebLogDriver.class);
// outputs for mapper and reducer
job.setOutputKeyClass(Text.class);
// setup mapper
job.setMapperClass(WebLogDriver.WebLogMapper.class);
job.setMapOutputValueClass(IntWritable.class);
// setup reducer
job.setReducerClass(WebLogDriver.WebLogReducer.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
final Path outputDir = new Path(args[1]);
FileOutputFormat.setOutputPath(job, outputDir);
return job.waitForCompletion(true) ? 0 : 1;
}
static class WebLogMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
static final Pattern HTTP_LOG_PATTERN = Pattern.compile("(\\S+) - - \\[(.+)] \"(\\S+) (/\\S*) HTTP/\\S+\" \\S+ (\\d+)");
final Text keyOut = new Text();
final IntWritable valueOut = new IntWritable();
@Override
protected void map(LongWritable key, Text value, Mapper<LongWritable, Text, Text, IntWritable>.Context context) throws IOException, InterruptedException {
String line = value.toString();
if (line.isEmpty()) return;
Matcher matcher = HTTP_LOG_PATTERN.matcher(line);
if (matcher.matches()) {
keyOut.set(matcher.group(1));
try {
valueOut.set(Integer.parseInt(matcher.group(5)));
context.write(keyOut, valueOut);
} catch (NumberFormatException e) {
e.printStackTrace();
}
}
}
}
static class WebLogReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
static final IntWritable valueOut = new IntWritable();
@Override
protected void reduce(Text key, Iterable<IntWritable> values, Reducer<Text, IntWritable, Text, IntWritable>.Context context) throws IOException, InterruptedException {
int nb = StreamSupport.stream(values.spliterator(), true)
.mapToInt(IntWritable::get)
.sum();
valueOut.set(nb);
context.write(key, valueOut);
}
}
}
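Since the reduce step is a plain sum (associative and commutative), the same class can also be registered as a combiner to shrink the shuffle; a one-line addition inside run(), assuming the classes above:
// Optional: pre-aggregate visit counts on the map side before the shuffle.
job.setCombinerClass(WebLogDriver.WebLogReducer.class);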

In MapReduce program, reducer is not getting called by Driver

Following the MapReduce programming model, I wrote this program, where the driver code is as follows.
MY DRIVER CLASS
public class MRDriver extends Configured implements Tool
{
@Override
public int run(String[] strings) throws Exception {
if(strings.length != 2)
{
System.err.println("usage : <inputlocation> <inputlocation> <outputlocation>");
System.exit(0);
}
Job job = new Job(getConf(), "multiple files");
job.setJarByClass(MRDriver.class);
job.setMapperClass(MRMapper.class);
job.setReducerClass(MRReducer.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
FileInputFormat.addInputPath(job, new Path(strings[0]));
FileOutputFormat.setOutputPath(job, new Path(strings[1]));
return job.waitForCompletion(true) ? 0 : 1;
//throw new UnsupportedOperationException("Not supported yet."); //To change body of generated methods, choose Tools | Templates.
}
public static void main(String[] args) throws Exception
{
Configuration conf = new Configuration();
System.exit(ToolRunner.run(conf, new MRDriver(), args));
}
}
MY MAPPER CLASS
class MRMapper extends Mapper<LongWritable, Text, Text, Text>
{
@Override
public void map(LongWritable key, Text value, Context context)
{
try
{
StringTokenizer iterator;
String idsimval = null;
iterator = new StringTokenizer(value.toString(), "\t");
String id = iterator.nextToken();
String sentival = iterator.nextToken();
if(iterator.hasMoreTokens())
idsimval = iterator.nextToken();
context.write(new Text("unique"), new Text(id + "_" + sentival + "_" + idsimval));
} catch (IOException | InterruptedException e)
{
System.out.println(e);
}
}
}
MY REDUCER CLASS
class MRReducer extends Reducer<Text, Text, Text, Text> {
String[] records;
HashMap<Long, String> sentiMap = new HashMap<>();
HashMap<Long, String> cosiMap = new HashMap<>();
private String leftIdStr;
private ArrayList<String> rightIDList, rightSimValList, matchingSimValList, matchingIDList;
private double leftVal;
private double rightVal;
private double currDiff;
private double prevDiff;
private int finalIndex;
Context newContext;
private int i;
public void reducer(Text key, Iterable<Text> value, Context context) throws IOException, InterruptedException {
for (Text string : value) {
records = string.toString().split("_");
sentiMap.put(Long.parseLong(records[0]), records[1]);
if (records[2] != null) {
cosiMap.put(Long.parseLong(records[0]), records[2]);
}
if(++i == 2588)
{
newContext = context;
newfun();
}
context.write(new Text("hello"), new Text("hii"));
}
context.write(new Text("hello"), new Text("hii"));
}
void newfun() throws IOException, InterruptedException
{
for (HashMap.Entry<Long, String> firstEntry : cosiMap.entrySet()) {
try {
leftIdStr = firstEntry.getKey().toString();
rightIDList = new ArrayList<>();
rightSimValList = new ArrayList<>();
matchingSimValList = new ArrayList<>();
matchingIDList = new ArrayList<>();
for (String strTmp : firstEntry.getValue().split(" ")) {
rightIDList.add(strTmp.substring(0, 18));
rightSimValList.add(strTmp.substring(19));
}
String tmp = sentiMap.get(Long.parseLong(leftIdStr));
if ("NULL".equals(tmp)) {
leftVal = Double.parseDouble("0");
} else {
leftVal = Double.parseDouble(tmp);
}
tmp = sentiMap.get(Long.parseLong(rightIDList.get(0)));
if ("NULL".equals(tmp)) {
rightVal = Double.parseDouble("0");
} else {
rightVal = Double.parseDouble(tmp);
}
prevDiff = Math.abs(leftVal - rightVal);
int oldIndex = 0;
for (String s : rightIDList) {
try {
oldIndex++;
tmp = sentiMap.get(Long.parseLong(s));
if ("NULL".equals(tmp)) {
rightVal = Double.parseDouble("0");
} else {
rightVal = Double.parseDouble(tmp);
}
currDiff = Math.abs(leftVal - rightVal);
if (prevDiff > currDiff) {
prevDiff = currDiff;
}
} catch (Exception e) {
}
}
oldIndex = 0;
for (String s : rightIDList) {
tmp = sentiMap.get(Long.parseLong(s));
if ("NULL".equals(tmp)) {
rightVal = Double.parseDouble("0");
} else {
rightVal = Double.parseDouble(tmp);
}
currDiff = Math.abs(leftVal - rightVal);
if (Objects.equals(prevDiff, currDiff)) {
matchingSimValList.add(rightSimValList.get(oldIndex));
matchingIDList.add(rightIDList.get(oldIndex));
}
oldIndex++;
}
finalIndex = rightSimValList.indexOf(Collections.max(matchingSimValList));
newContext.write(new Text(leftIdStr), new Text(" " + rightIDList.get(finalIndex) + ":" + rightSimValList.get(finalIndex)));
} catch (NumberFormatException nfe) {
}
}
}
}
What is the problem: is it in the MapReduce program or in the Hadoop system configuration? Whenever I run this program, it only writes the mapper output to HDFS.
Inside your Reducer class you must override the reduce method. You have declared a method named reducer, which Hadoop never calls, so the inherited identity reduce runs instead.
Try modifying your method inside the Reducer class:
@Override
public void reduce(Text key, Iterable<Text> value, Context context) throws IOException, InterruptedException {
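With @Override in place the compiler rejects the wrong name or a wrong parameter list outright. A minimal skeleton of the corrected method for the question's MRReducer, with the join logic elided:
class MRReducer extends Reducer<Text, Text, Text, Text> {

    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            // ... existing per-record logic from the question goes here ...
        }
        context.write(new Text("hello"), new Text("hii"));
    }
}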

Reduce method in Reducer class is not executing

In the code below, the reduce method inside the reducer class is not executing. Please help me. In my reduce method I want to write output to multiple files, so I have used MultipleOutputs.
public class DataValidation {
public static class Map extends Mapper<LongWritable, Text, Text, Text> {
int flag = 1;
boolean result;
private HashMap<String, FileConfig> fileConfigMaps = new HashMap<String, FileConfig>();
private HashMap<String, List<LineValidator>> mapOfValidators = new HashMap<String, List<LineValidator>>();
private HashMap<String, List<Processor>> mapOfProcessors = new HashMap<String, List<Processor>>();
protected void setup(Context context) throws IOException {
System.out.println("configure inside map class");
ConfigurationParser parser = new ConfigurationParser();
Config config = parser.parse(new Configuration());
List<FileConfig> file = config.getFiles();
for (FileConfig f : file) {
try {
fileConfigMaps.put(f.getName(), f);
System.out.println("quotes in" + f.isQuotes());
System.out.println("file from xml : " + f.getName());
ValidationBuilder builder = new ValidationBuilder();
// ProcessorBuilder constructor = new ProcessorBuilder();
List<LineValidator> validators;
validators = builder.build(f);
// List<Processor> processors = constructor.build(f);
mapOfValidators.put(f.getName(), validators);
// mapOfProcessors.put(f.getName(),processors);
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
protected void map(LongWritable key, Text value, Context context)
throws IOException, InterruptedException {
// String filename = ((FileSplit) context.getInputSplit()).getPath()
// .getName();
FileSplit fs = (FileSplit) context.getInputSplit();
String fileName = fs.getPath().getName();
System.out.println("filename : " + fileName);
String line = value.toString();
String[] csvDataArray = null;
List<LineValidator> lvs = mapOfValidators.get(fileName);
flag = 1;
csvDataArray = line.split(",", -1);
FileConfig fc = fileConfigMaps.get(fileName);
System.out.println("filename inside fileconfig " + fc.getName());
System.out.println("quote values" + fc.isQuotes());
if (fc.isQuotes()) {
for (int i = 0; i < csvDataArray.length; i++) {
csvDataArray[i] = csvDataArray[i].replaceAll("\"", "");
}
}
for (LineValidator lv : lvs) {
if (flag == 1) {
result = lv.validate(csvDataArray, fileName);
if (result == false) {
String write = line + "," + lv.getFailureDesc();
System.out.println("write" + write);
System.out.println("key" + new Text(fileName));
// output.collect(new Text(filename), new Text(write));
context.write(new Text(fileName), new Text(write));
flag = 0;
if (lv.stopValidation(csvDataArray) == true) {
break;
}
}
}
}
}
protected void cleanup(Context context) {
System.out.println("clean up in mapper");
}
}
public static class Reduce extends Reducer<Text, Text, NullWritable, Text> {
protected void reduce(Text key, Iterator<Text> values, Context context)
throws IOException, InterruptedException {
System.out.println("inside reduce method");
while (values.hasNext()) {
System.out.println(" Nullwritable value" + NullWritable.get());
System.out.println("key inside reduce method" + key.toString());
context.write(NullWritable.get(), values.next());
// out.write(NullWritable.get(), values.next(), "/user/hadoop/"
// + context.getJobID() + "/" + key.toString() + "/part-");
}
}
}
public static void main(String[] args) throws Exception {
System.out.println("hello");
Configuration configuration = getConf();
Job job = Job.getInstance(configuration);
job.setJarByClass(DataValidation.class);
job.setMapperClass(Map.class);
job.setReducerClass(Reduce.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
job.setOutputKeyClass(NullWritable.class);
job.setOutputValueClass(Text.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.waitForCompletion(true);
}
private static Configuration getConf() {
return new Configuration();
}
}
You have not properly overridden the reduce method. Use this signature:
public void reduce(Text key, Iterable<Text> values,
Context context) throws IOException, InterruptedException
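For the Reduce class in the question that means taking an Iterable<Text> (the new API hands the reducer an Iterable, not an Iterator) and adding @Override so a mismatch fails at compile time. A sketch:
public static class Reduce extends Reducer<Text, Text, NullWritable, Text> {

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            // same body as before, just iterating the Iterable
            context.write(NullWritable.get(), value);
        }
    }
}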

Hadoop mapper is never called, custom input format might be the issue

So I am doing a little test program just to get the hang of Hadoop's InputFormat classes. I had a word search already built which took in lines as values and searched for the word line by line. I wanted to see if I could get Hadoop to take in values word by word, but Hadoop doesn't seem to like that and keeps giving me results using the default mapper. My mapper's initialize function is never even called.
I do know my record reader is called and that it is doing more or less what it is supposed to, and I'm pretty sure the output of the record reader is what my mapper is searching for, so why does Hadoop decide not to call it?
Here is the relevant code
Input Format Class
public class WordReader extends FileInputFormat<Text, Text> {
@Override
public RecordReader<Text, Text> createRecordReader(InputSplit split,
TaskAttemptContext context) {
return new MyWholeFileReader();
}
}
Record Reader
public class MyWholeFileReader extends RecordReader<Text, Text> {
private long start;
private LineReader in;
private Text key = null;
private Text value = null;
private ArrayList<String> outputvalues;
public void initialize(InputSplit genericSplit,
TaskAttemptContext context) throws IOException {
outputvalues = new ArrayList<String>();
FileSplit split = (FileSplit) genericSplit;
Configuration job = context.getConfiguration();
start = split.getStart();
final Path file = split.getPath();
// open the file and seek to the start of the split
FileSystem fs = file.getFileSystem(job);
FSDataInputStream fileIn = fs.open(split.getPath());
in = new LineReader(fileIn, job);
if (key == null) {
key = new Text();
}
key.set(split.getPath().getName());
if (value == null) {
value = new Text();
}
}
public boolean nextKeyValue() throws IOException {
if (outputvalues.size() == 0) {
Text buffer = new Text();
int i = in.readLine(buffer);
String str = buffer.toString();
for (String vals : str.split(" ")) {
outputvalues.add(vals);
}
if (i == 0 || outputvalues.size() == 0) {
key = null;
value = null;
return false;
}
}
value.set(outputvalues.remove(0));
System.out.println(value.toString());
return true;
}
@Override
public Text getCurrentKey() {
return key;
}
@Override
public Text getCurrentValue() {
return value;
}
/**
*
* Get the progress within the split
*/
public float getProgress() {
return 0.0f;
}
public synchronized void close() throws IOException {
if (in != null) {
in.close();
}
}
}
Mapper
public class WordSearchMapper extends Mapper<Text, Text, OutputCollector<Text,IntWritable>, Reporter> {
static String keyword;
BloomFilter<String> b;
public void configure(JobContext jobConf) {
keyword = jobConf.getConfiguration().get("keyword");
System.out.println("keyword>> " + keyword);
b = new BloomFilter<String>(.01,10000);
b.add(keyword);
System.out.println(b.getExpectedBitsPerElement());
}
public void map(Text key, Text value, OutputCollector<Text,IntWritable> output,
Reporter reporter) throws IOException {
int wordPos;
System.out.println("value.toString()>> " + value.toString());
System.out.println(((FileSplit) reporter.getInputSplit()).getPath()
.getName());
String[] tokens = value.toString().split("[\\p{P} \\t\\n\\r]");
for (String st :tokens) {
if (b.contains(st)) {
if (value.toString().contains(keyword)) {
System.out.println("Found one");
wordPos = ((Text) value).find(keyword);
output.collect(value, new IntWritable(wordPos));
}
}
}
}
}
Driver:
public class WordSearch {
public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
Job job = new Job(conf,"WordSearch");
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
job.setMapperClass(WordSearchMapper.class);
job.setInputFormatClass( WordReader.class);
job.setOutputFormatClass(TextOutputFormat.class);
conf.set("keyword", "the");
FileInputFormat.setInputPaths(job, new Path("search.txt"));
FileOutputFormat.setOutputPath(job, new Path("outputs"+System.currentTimeMillis()));
System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}
And I figured it out... this is why Hadoop needs to stop supporting multiple versions of itself, or why I should stop jamming multiple tutorials together. It turns out my mapper needs to be declared like this for the way my mapper and record reader interact:
public class WordSearchMapper extends Mapper { static String keyword;
I only realized this after looking at my imports and seeing that Reporter came from the package org.apache.hadoop.mapred as opposed to org.apache.hadoop.mapreduce.
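In other words, the mapper has to use the new-API (org.apache.hadoop.mapreduce) Mapper whose generic key/value types line up with the <Text, Text> records produced by MyWholeFileReader. A hedged sketch of what that looks like; the (Text, IntWritable) output types are an assumption based on the original OutputCollector<Text, IntWritable> usage, and the Bloom-filter details are elided:
// New-API mapper matching the <Text, Text> records emitted by MyWholeFileReader.
// Output types (Text, IntWritable) are assumed from the original OutputCollector usage.
public class WordSearchMapper extends Mapper<Text, Text, Text, IntWritable> {

    private String keyword;

    @Override
    protected void setup(Context context) {
        keyword = context.getConfiguration().get("keyword");
    }

    @Override
    protected void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        if (value.toString().contains(keyword)) {
            context.write(value, new IntWritable(value.find(keyword)));
        }
    }
}
Two driver-side details are also worth double-checking: job.setMapOutputValueClass(IntWritable.class) is needed because the map value type differs from the job's output value class, and conf.set("keyword", "the") has to happen before the Job is created, since the Job constructor copies the Configuration.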

PowerMockito for testing MapReduce Code

I have the following reducer code and I am trying to use PowerMock to test it.
package com.cerner.cdh.examples.reducer;
public class LinkReversalReducer extends TableReducer<Text, Text, ImmutableBytesWritable> {
@Override
protected void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
StringBuilder inlinks = new StringBuilder();
for (Text value : values) {
inlinks.append(value.toString());
inlinks.append(" ");
}
byte[] docIdBytes = Bytes.toBytes(key.toString());
Put put = new Put(docIdBytes);
put.add(WikiConstants.COLUMN_FAMILY_BYTES, WikiConstants.INLINKS_COLUMN_QUALIFIER_BYTES,
Bytes.toBytes(inlinks.toString().trim()));
context.write(new ImmutableBytesWritable(docIdBytes), put);
}
}
Below is the test I have written for the above:
@Test
public void testLinkReversalReducer() throws IOException, InterruptedException {
Text key = new Text("key");
@SuppressWarnings("rawtypes")
Context context = PowerMockito.mock(Context.class);
Iterable<Text> values = generateText();
StringBuilder inlinks = new StringBuilder();
for (Text value : values) {
inlinks.append(value);
inlinks.append(" ");
}
LinkReversalReducer reducer = new LinkReversalReducer();
byte[] docIdBytes = Bytes.toBytes(key.toString());
byte[] argument1 = WikiConstants.COLUMN_FAMILY_BYTES;
byte[] argument2 = WikiConstants.INLINKS_COLUMN_QUALIFIER_BYTES;
byte[] argument3 = Bytes.toBytes(inlinks.toString().trim());
Put put = new Put(docIdBytes);
put.add(argument1, argument2, argument3);
reducer.reduce(key, values, context);
Mockito.verify(context).write(new ImmutableBytesWritable(docIdBytes), put);
}
private List<Text> generateText() {
Text value = new Text("AB");
List<Text> texts = new ArrayList<Text>();
texts.add(value);
return texts;
}
}
So the thing is that my Mockito.verify(context).write(new ImmutableBytesWritable(docIdBytes), put); seems to get called with the right values in place, and my JUnit result shows that the invoked and the actual calls give the same response. But the test still fails. Does anyone have a clue? Any help would be appreciated :)
The problem here is that the Put class does not define an equals method. Therefore the verify method thinks that the actual Put passed to context.write inside your LinkReversalReducer.reduce method is different from the expected Put assembled in your testLinkReversalReducer method.
To work around this problem you could do the following:
Mockito.verify(context).write(Mockito.eq(new ImmutableBytesWritable(docIdBytes)), MockitoHelper.eq(put));
...
class MockitoHelper {
public static Put eq(final Put expectedPut) {
return Mockito.argThat(new CustomTypeSafeMatcher<Put>(expectedPut.toString()) {
@Override
protected boolean matchesSafely(Put actualPut) {
return Bytes.equals(toBytes(expectedPut), toBytes(actualPut));
}
});
}
private static byte[] toBytes(Put put) {
ByteArrayDataOutput out = ByteStreams.newDataOutput(); // ByteArrayDataOutput is a Guava interface; create it via ByteStreams
try {
put.write(out);
return out.toByteArray();
} catch (IOException e) {
throw new RuntimeException(e);
}
}
}
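An alternative that avoids the custom matcher is to capture the Put that the reducer actually passed to context.write and assert on its contents afterwards; a sketch using Mockito's ArgumentCaptor (the row-key comparison is shown, cell contents can be compared the same way as in the matcher above):
// Capture the actual Put and compare it piecewise, since Put has no equals().
ArgumentCaptor<Put> putCaptor = ArgumentCaptor.forClass(Put.class);
Mockito.verify(context).write(Mockito.eq(new ImmutableBytesWritable(docIdBytes)), putCaptor.capture());

Put actualPut = putCaptor.getValue();
Assert.assertArrayEquals(docIdBytes, actualPut.getRow());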
