I am wondering to see this issue while parsing the file by Mapper. My code is pretty simple, I am taking the data by "::" separated file line.
For example (input):
1::Toy Story (1995)::2077
Using below snip code of mapper which I usually doing in my practice
String tokens[]= value.toString().split("::");
int empId = Integer.parseInt(tokens[0]) ;
int count = Integer.parseInt(tokens[2]) ;
Technically line should split as below.
1 Toy Story (1995) 2077
tokens[0] tokens[1] tokens[2]
So, If I am looking for tokens[0] and tokens[2] then also why job is picking tokens[1], which is throwing below NumberFormatException exception and this is expected exception if I am trying to parse char to int. Could you please help me out from this.
17/09/05 19:06:49 INFO mapreduce.Job: Task Id : attempt_1500305785265_0095_m_000000_2, Status : FAILED
Error: java.lang.NumberFormatException: For input string: "1::Toy Story (1995)::2077"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at com.dataflair.comparableTest.ValueSortExp$MapTask.map(ValueSortExp.java:93)
at com.dataflair.comparableTest.ValueSortExp$MapTask.map(ValueSortExp.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
CODE
import java.io.IOException;
import java.nio.ByteBuffer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.IntWritable.Comparator;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparator;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.Mapper.Context;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
public class ValueSortExp2 {
public static void main(String[] args) throws Exception {
Configuration conf = new Configuration(true);
String arguments[] = new GenericOptionsParser(conf, args).getRemainingArgs();
Job job = new Job(conf, "Test commond");
job.setJarByClass(ValueSortExp.class);
// Setup MapReduce
job.setMapperClass(ValueSortExp.MapTask.class);
job.setReducerClass(ValueSortExp.ReduceTask.class);
job.setNumReduceTasks(1);
// Specify key / value
job.setMapOutputKeyClass(IntWritable.class);
job.setMapOutputValueClass(IntWritable.class);
job.setOutputKeyClass(IntWritable.class);
job.setOutputValueClass(IntWritable.class);
//job.setSortComparatorClass(IntComparator.class);
// Input
FileInputFormat.addInputPath(job, new Path(arguments[0]));
job.setInputFormatClass(TextInputFormat.class);
// Output
FileOutputFormat.setOutputPath(job, new Path(arguments[1]));
job.setOutputFormatClass(TextOutputFormat.class);
/*
* // Delete output if exists FileSystem hdfs = FileSystem.get(conf); if
* (hdfs.exists(outputDir)) hdfs.delete(outputDir, true);
*
* // Execute job int code = job.waitForCompletion(true) ? 0 : 1;
* System.exit(code);
*/
// Execute job
int code = job.waitForCompletion(true) ? 0 : 1;
System.exit(code);
}
/*public static class IntComparator extends WritableComparator {
public IntComparator() {
super(IntWritable.class);
}
#Override
public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
Integer v1 = ByteBuffer.wrap(b1, s1, l1).getInt();
Integer v2 = ByteBuffer.wrap(b2, s2, l2).getInt();
return v1.compareTo(v2) * (-1);
}
}*/
public static class MapTask extends Mapper<LongWritable, Text, IntWritable, IntWritable> {
public void map(LongWritable key,Text value, Context context) throws IOException, InterruptedException {
String tokens[]= value.toString().split("::");
int empId = Integer.parseInt(tokens[0]) ;
int count = Integer.parseInt(tokens[2]) ;
context.write(new IntWritable(count), new IntWritable(empId));
}
}
public static class ReduceTask extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {
public void reduce(IntWritable key, Iterable<IntWritable> list, Context context)
throws java.io.IOException, InterruptedException {
for (IntWritable value : list) {
context.write(key, value);
}
}
}
}
INPUT DATA
1::Toy Story (1995)::2077
10::GoldenEye (1995)::888
100::City Hall (1996)::128
1000::Curdled (1996)::20
Related
I have a problem where I need to chain
Mapper >> Reducer >> Reducer
This is my data:
Dpt.csv
EmpNo1,DeptNo1
EmpNo2,DeptNo2
EmpNo3,DeptNo1
EmpNo4,DeptNo2
EmpNo5,DeptNo2
EmpNo6,DeptNo1
Emp.csv
EmpNo1,10000
EmpNo2,4675432
EmpNo3,76568658
EmpNo4,241423
EmpNo5,75756
EmpNo6,9796854
And finally I want something like this:
Dept1 >> Total_Salary_Dept_1
One major issue is my first reducer is not getting called when I use multiple files as input.
The second issue is that I can't pass that output to next reducer. (ChainReducer can't chain 2 reducers)
I was using this as a reference but quickly realized it won't help.
I found this link where, in one of the comments the author says this: "In Hadoop 2.X series, internally you can chain mappers before reducer with ChainMapper and chain Mappers after reducer with ChainReducer."
Does this mean I will have a structure like this:
Chain Mapper(mapper 1) --> Chain Reducer(reducer 1) --> ChainMapper(unnecessary mapper) --> Chain Reducer(rreducer 2)
And if this is the case then how exactly is the data handed off from Reducer 1 to Mapper 2?
Can someone help me out?
This is my code so far.
Thanks.
package Aggregate;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import org.apache.commons.io.FileUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.chain.ChainMapper;
import org.apache.hadoop.mapreduce.lib.chain.ChainReducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.map.InverseMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
public class Sales extends Configured implements Tool{
public static class CollectionMapper extends Mapper<LongWritable, Text, Text, Text>{
public void map(LongWritable key, Text value, Context context)
throws IOException, InterruptedException {
String[] vals = value.toString().split(",");
context.write(new Text(vals[0]), new Text(vals[1]));
}
}
public static class DeptSalaryJoiner extends Reducer<Text, Text, Text, Text>{
public void reduce(Text key, Iterable<Text> values, Context context)
throws IOException, InterruptedException {
ArrayList<String> DeptSal = new ArrayList<>();
for (Text val : values) {
DeptSal.add(val.toString());
}
context.write(new Text(DeptSal.get(0)), new Text(DeptSal.get(1)));
}
}
public static class SalaryAggregator extends Reducer<Text, Text, Text, IntWritable>{
public void reduce(Text key, Iterable<Text> values, Context context)
throws IOException, InterruptedException {
Integer totalSal = 0;
for (Text val : values) {
Integer salary = new Integer(val.toString());
totalSal += salary;
}
context.write(key, new IntWritable(totalSal));
}
}
public static void main(String[] args) throws Exception {
int exitFlag = ToolRunner.run(new Sales(), args);
System.exit(exitFlag);
}
#Override
public int run(String[] args) throws Exception {
String input1 = "./emp.csv";
String input2 = "./dept.csv";
String output = "./DeptAggregate";
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "Sales");
job.setJarByClass(getClass());
Configuration mapConf = new Configuration(false);
ChainMapper.addMapper(job, CollectionMapper.class, LongWritable.class, Text.class, Text.class, Text.class, mapConf);
Configuration reduce1Conf = new Configuration(false);
ChainReducer.setReducer(job, DeptSalaryJoiner.class, Text.class, Text.class, Text.class, Text.class, reduce1Conf);
Configuration reduce2Conf = new Configuration(false);
ChainReducer.setReducer(job, SalaryAggregator.class, Text.class, Text.class, Text.class, IntWritable.class, reduce2Conf);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(input1));
FileInputFormat.addInputPath(job, new Path(input2));
try {
File f = new File(output);
FileUtils.forceDelete(f);
} catch (Exception e) {
}
FileOutputFormat.setOutputPath(job, new Path(output));
return job.waitForCompletion(true) ? 0 : 1;
}
}
import java.io.IOException;
import java.util.*;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class CommonFriends {
public static class TokenizerMapper
extends Mapper<Object, Text, Text, IntWritable>{
private IntWritable friend = new IntWritable();
private Text friends = new Text();
public void map(Object key, Text value, Context context ) throws IOException, InterruptedException {
StringTokenizer itr = new StringTokenizer(value.toString(),"\n");
while (itr.hasMoreTokens()) {
String[] line = itr.nextToken().split(" ");
if(line.length > 2 ){
int person = Integer.parseInt(line[0]);
for(int i=1; i<line.length;i++){
int ifriend = Integer.parseInt(line[i]);
friends.set((person < ifriend ? person+"-"+ifriend : ifriend+"-"+person));
for(int j=1; j< line.length; j++ ){
if( i != j ){
friend.set(Integer.parseInt(line[j]));
context.write(friends, friend);
}
}
}
}
}
}
}
public static class IntSumReducer extends Reducer<Text,IntWritable,Text,Text> {
private Text result = new Text();
public void reduce(Text key, Iterable<IntWritable> values, Context context)
throws IOException, InterruptedException {
HashSet<IntWritable> duplicates = new HashSet();
ArrayList<Integer> tmp = new ArrayList();
for (IntWritable val : values) {
if(duplicates.contains(val))
tmp.add(val.get());
else
duplicates.add(val);
}
result.set(tmp.toString());
context.write(key, result);
}
}
public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "Common Friends");
job.setJarByClass(CommonFriends.class);
job.setMapperClass(TokenizerMapper.class);
job.setCombinerClass(IntSumReducer.class);
job.setReducerClass(IntSumReducer.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(IntWritable.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}
Error: java.io.IOException: wrong value class: class org.apache.hadoop.io.Text is not class org.apache.hadoop.io.IntWritable
at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:194)
at org.apache.hadoop.mapred.Task$CombineOutputCollector.collect(Task.java:1350)
at org.apache.hadoop.mapred.Task$NewCombinerRunner$OutputConverter.write(Task.java:1667)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:105)
at CommonFriends$IntSumReducer.reduce(CommonFriends.java:51)
at CommonFriends$IntSumReducer.reduce(CommonFriends.java:38)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1688)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1637)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1489)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:723)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
This is my code, The error message is the following.
Any idea??
I think the problem in the configuration of output classes of mapper and the reducer
the input files are a list of numbers in file.
Some more details will be provided if needed.
The program finds the common friend between friends
remove job.setCombinerClass(IntSumReducer.class); in your code could solve this problem
Just had a look into your code, it seems you are using reducer code as combiner code.
One thing you need to check.
Your combiner code will take input in form of <Text, IntWritable> and output of Combiner would be <Text, Text> format .
Then the input to your Reducer would be in format of < Text, Text> but you had specified the input to Reducer as < Text, IntWritable > , so it is throwing the error.
Two things can be done :-
1) You might consider changing the output type of Reducer .
2) You might consider writing a separate Combiner code.
I am trying to run a MapReduce job using(using new API) on Hadoop 2.7.1 using command line. I have followed the below steps. No error in compiling and creating a jar file.
javac -cp `hadoop classpath` MaxTemperatureWithCompression.java -d /Users/gangadharkadam/hadoopdata/build
jar -cvf MaxTemperatureWithCompression.jar /Users/gangadharkadam/hadoopdata/build
hadoop jar MaxTemperatureWithCompression.jar org.myorg.MaxTemperatureWithCompression user/ncdc/input /user/ncdc/output
Error Messages-
Exception in thread "main" java.lang.ClassNotFoundException: org.myorg.MaxTemperatureWithCompression
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.hadoop.util.RunJar.run(RunJar.java:214)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Java Code-
package org.myorg;
//Standard Java Classes
import java.io.IOException;
import java.util.regex.Pattern;
//extends the class Configured, and implements the Tool utility class
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.util.GenericOptionsParser;
//send debugging messages from inside the mapper and reducer classes
import org.apache.log4j.Logger;
//Job class in order to create, configure, and run an instance of your MapReduce
import org.apache.hadoop.mapreduce.Job;
//extend the Mapper class with your own Map class and add your own processing instructions
import org.apache.hadoop.mapreduce.Mapper;
//extend it to create and customize your own Reduce class
import org.apache.hadoop.mapreduce.Reducer;
//Path class to access files in HDFS
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.FileSystem;
//pass required paths using the FileInputFormat and FileOutputFormat classes
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
//Writable objects for writing, reading,and comparing values during map and reduce processing
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.GzipCodec;
public class MaxTemperatureWithCompression extends Configured implements Tool {
private static final Logger LOG = Logger.getLogger(MaxTemperatureWithCompression.class);
//main menhod to invoke the toolrunner to create instance of MaxTemperatureWithCompression
public static void main(String[] args) throws Exception {
int res = ToolRunner.run(new MaxTemperatureWithCompression(), args);
System.exit(res);
}
//call the run method to configure the job
public int run(String[] args) throws Exception {
if (args.length != 2) {
System.err.println("Usage: MaxTemperatureWithCompression <input path> " + "<output path>");
System.exit(-1);
}
Job job = Job.getInstance(getConf(), "MaxTemperatureWithCompression");
//set the jar to use based on the class
job.setJarByClass(MaxTemperatureWithCompression.class);
//set the input and output path
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
//set the output key and value
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
//set the compressionformat
/*[*/FileOutputFormat.setCompressOutput(job, true);
FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);/*]*/
//set the mapper and reducer class
job.setMapperClass(Map.class);
job.setCombinerClass(Reduce.class);
job.setReducerClass(Reduce.class);
return job.waitForCompletion(true) ? 0 : 1;
}
//mapper
public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
private static final int MISSING = 9999;
#Override
public void map(LongWritable key, Text value, Context context)
throws IOException,InterruptedException {
String line = value.toString();
String year = line.substring(15,19);
int airTemperature;
if (line.charAt(87) == '+') {
airTemperature = Integer.parseInt(line.substring(88, 92));
}
else {
airTemperature = Integer.parseInt(line.substring(87, 92));
}
String quality = line.substring(92,93);
if (airTemperature != MISSING && quality.matches("[01459]")) {
context.write(new Text(year), new IntWritable(airTemperature));
}
}
}
//reducer
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
#Override
public void reduce(Text key, Iterable<IntWritable> values, Context context)
throws IOException, InterruptedException {
int maxValue = Integer.MIN_VALUE;
for (IntWritable value : values) {
maxValue = Math.max(maxValue, value.get());
}
context.write(key, new IntWritable(maxValue));
}
}
}
I see few posts on the same issue but those couldn't help me to resolve this issue. Any help on resolving this is highly appreciated. Thanks in advance.
I am learning Hadoop and tried executing my Mapreduce program. All Map tasks and Reducer tasks are completed fine, but Reducer Writing Mapper Output into Output file. It means Reduce function not at all invoked. My sample input is like below
1,a
1,b
1,c
2,s
2,d
and the expected output is like below
1 a,b,c
2 s,d
Below is my Program.
package patentcitation;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class MyJob
{
public static class Mymapper extends Mapper <Text, Text, Text, Text>
{
public void map (Text key, Text value, Context context) throws IOException, InterruptedException
{
context.write(key, value);
}
}
public static class Myreducer extends Reducer<Text,Text,Text,Text>
{
StringBuilder str = new StringBuilder();
public void reduce(Text key, Iterable<Text> value, Context context) throws IOException, InterruptedException
{
for(Text x : value)
{
if(str.length() > 0)
{
str.append(",");
}
str.append(x.toString());
}
context.write(key, new Text(str.toString()));
}
}
public static void main(String args[]) throws IOException, ClassNotFoundException, InterruptedException
{
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "PatentCitation");
FileSystem fs = FileSystem.get(conf);
job.setJarByClass(MyJob.class);
FileInputFormat.addInputPath(job,new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.setMapperClass(Mymapper.class);
job.setReducerClass(Myreducer.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
job.setInputFormatClass(KeyValueTextInputFormat.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
conf.set("mapreduce.input.keyvaluelinerecordreader.key.value.separator",",");
if(fs.exists(new Path(args[1]))){
//If exist delete the output path
fs.delete(new Path(args[1]),true);
}
System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}
Same question is asked here, I used the Iterable value in my reduce function as the answer suggested in that thread. But that doesnt fix the issue. I cannot comment there since my reputation score is low. So created the new Thread
Kindly help me where am doing wrong.
You have made few mistakes in your program. Following are the mistakes:
In the driver, following statement should be called before instantiating the Job class:
conf.set("mapreduce.input.keyvaluelinerecordreader.key.value.separator",",");
In reducer, you should put the StringBuilder inside the reduce() function.
I have modified your code as below and I got the output:
E:\hdp\hadoop-2.7.1.2.3.0.0-2557\bin>hadoop fs -cat /out/part-r-00000
1 c,b,a
2 d,s
Modified code:
package patentcitation;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import java.io.IOException;
public class MyJob
{
public static class Mymapper extends Mapper <Text, Text, Text, Text>
{
public void map(Text key, Text value, Context context) throws IOException, InterruptedException
{
context.write(key, value);
}
}
public static class Myreducer extends Reducer<Text,Text,Text,Text>
{
public void reduce(Text key, Iterable<Text> value, Context context) throws IOException, InterruptedException
{
StringBuilder str = new StringBuilder();
for(Text x : value)
{
if(str.length() > 0)
{
str.append(",");
}
str.append(x.toString());
}
context.write(key, new Text(str.toString()));
}
}
public static void main(String args[]) throws IOException, ClassNotFoundException, InterruptedException
{
Configuration conf = new Configuration();
conf.set("mapreduce.input.keyvaluelinerecordreader.key.value.separator",",");
Job job = Job.getInstance(conf, "PatentCitation");
FileSystem fs = FileSystem.get(conf);
job.setJarByClass(MyJob.class);
FileInputFormat.addInputPath(job,new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.setMapperClass(Mymapper.class);
job.setReducerClass(Myreducer.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
job.setInputFormatClass(KeyValueTextInputFormat.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
/*if(fs.exists(new Path(args[1]))){
//If exist delete the output path
fs.delete(new Path(args[1]),true);
}*/
System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}
I'm working in a java project using Hadoop and I have a java.lang.VerifyError and I don't know how to resolve it. I saw people with the same type of question but without answer or the solution are not working in my case.
My class :
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;
public class GetStats {
public static List<Statistique> stats; // class with one String an one int
public static class TokenizerMapper extends
Mapper<Object, Text, Text, IntWritable> {
private final static IntWritable one = new IntWritable(1);
private Text word = new Text();
public void map(Object key, Text value, Context context)
throws IOException, InterruptedException {
StringTokenizer itr = new StringTokenizer(value.toString());
while (itr.hasMoreTokens()) {
word.set(itr.nextToken());
context.write(word, one);
}
}
}
public static class IntSumReducer extends
Reducer<Text, IntWritable, Text, IntWritable> {
private IntWritable result = new IntWritable();
public void reduce(Text key, Iterable<IntWritable> values,
Context context) throws IOException, InterruptedException {
int sum = 0;
for (IntWritable val : values) {
sum += val.get();
}
result.set(sum);
if (key.toString().contains("HEAD")
|| key.toString().contains("POST")
|| key.toString().contains("GET")
|| key.toString().contains("OPTIONS")
|| key.toString().contains("CONNECT"))
GetStats.stats.add(new Statistique(key.toString().replace("\"", ""), sum));
context.write(key, result);
}
}
public static void main(String[] args) throws Exception {
System.out.println("Start wc");
stats = new ArrayList<>();
// File file = new File("err.txt");
// FileOutputStream fos = new FileOutputStream(file);
// PrintStream ps = new PrintStream(fos);
// System.setErr(ps);
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "word count");
job.setJarByClass(GetStats.class);
job.setMapperClass(TokenizerMapper.class);
// job.setCombinerClass(IntSumReducer.class);
job.setReducerClass(IntSumReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path("input"));
job.setOutputFormatClass(NullOutputFormat.class);
job.waitForCompletion(true);
System.out.println(stats);
System.out.println("End");
}
}
and the error :
Exception in thread "main" java.lang.VerifyError: Bad type on operand stack
Exception Details:
Location:
org/apache/hadoop/mapred/JobTrackerInstrumentation.create(Lorg/apache/hadoop/mapred/JobTracker;Lorg/apache/hadoop/mapred/JobConf;)Lorg/apache/hadoop/mapred/JobTrackerInstrumentation; #5: invokestatic
Reason:
Type 'org/apache/hadoop/metrics2/lib/DefaultMetricsSystem' (current frame, stack[2]) is not assignable to 'org/apache/hadoop/metrics2/MetricsSystem'
Current Frame:
bci: #5
flags: { }
locals: { 'org/apache/hadoop/mapred/JobTracker', 'org/apache/hadoop/mapred/JobConf' }
stack: { 'org/apache/hadoop/mapred/JobTracker', 'org/apache/hadoop/mapred/JobConf', 'org/apache/hadoop/metrics2/lib/DefaultMetricsSystem' }
Bytecode:
0000000: 2a2b b200 03b8 0004 b0
at org.apache.hadoop.mapred.LocalJobRunner.<init>(LocalJobRunner.java:573)
at org.apache.hadoop.mapred.JobClient.init(JobClient.java:494)
at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:479)
at org.apache.hadoop.mapreduce.Job$1.run(Job.java:563)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.connect(Job.java:561)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:549)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
at hadoop.GetStats.main(GetStats.java:79)
Do you have any idea ? if you need something more to help me just ask.
I solved my problem.
The imported jar was good, but another version (probably older one) which I had tried earlier, was also in the project folder. When I called the class, it appears that older version of the jar in was used. Also, that jar was before the one I wanted in the class path. I deleted the older jar from the project folder and it worked.