Errors running flume agent with interceptor - java

I am trying to run a custom Flume agent from the terminal on Linux. I am working on a Cloudera VM. The command running Flume looks like:
flume-ng agent --conf . -f spoolDirLocal2hdfs_memoryChannel.conf -Dflume.root.logger=DEBUG,console -n Agent5
The source with the interceptor looks like:
Agent5.sources.spooldir-source.interceptors = i1
Agent5.sources.spooldir-source.interceptors.i1.type = org.flumefiles.flume.HtmlInterceptor$Buider
I've placed my jar file into both /usr/lib/hadoop/lib/ and /usr/lib/flume-ng/lib/. I've also created plugins.d at /usr/lib/flume-ng/plugins.d/ and placed the jar there. But when running the Flume agent I get an error:
15/02/18 06:10:46 ERROR channel.ChannelProcessor: Builder class not found. Exception follows.
java.lang.ClassNotFoundException: org.intropro.flume.HtmlInterceptor$Buider
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
.....
Where should I place my jar file so that Flume finds the builder?

Place it into FLUME_HOME/lib and then restart Flume.
If that doesn't work, make sure your interceptor actually implements the Interceptor.Builder interface; that is another common cause. Note also that the config above spells the builder class HtmlInterceptor$Buider; if that is a typo for HtmlInterceptor$Builder, the class lookup will fail no matter where the jar lives.
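For reference, here is a minimal sketch of such an interceptor, reusing the package and builder name from the question's config (the method bodies are placeholders, not the asker's actual code):

package org.flumefiles.flume;

import java.util.List;

import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

public class HtmlInterceptor implements Interceptor {

    @Override
    public void initialize() {
        // no setup needed in this sketch
    }

    @Override
    public Event intercept(Event event) {
        // inspect or modify the event here
        return event;
    }

    @Override
    public List<Event> intercept(List<Event> events) {
        for (Event event : events) {
            intercept(event);
        }
        return events;
    }

    @Override
    public void close() {
        // no resources to release in this sketch
    }

    // Flume instantiates interceptors through this nested class; the
    // agent config must reference its fully qualified name, e.g.
    // ...interceptors.i1.type = org.flumefiles.flume.HtmlInterceptor$Builder
    public static class Builder implements Interceptor.Builder {
        @Override
        public Interceptor build() {
            return new HtmlInterceptor();
        }

        @Override
        public void configure(Context context) {
            // read interceptor properties from the agent config here
        }
    }
}

Note that the $Builder suffix in the .type property must match the nested class name exactly.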

Related

Java NoClassDefFoundError : Apache Flink Complex Event Processing

I am trying to understand the Apache Flink CEP program for monitoring rack temperatures in a data center, as described in the official Flink documentation. But when I follow the steps, build a jar using mvn clean package, and try to execute the package using the command
java -cp "../cep-monitoring-1.0.jar" org.stsffap.cep.monitoring.CEPMonitoring
I get the following error:
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/flink/streaming/api/functions/source/SourceFunction
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
at java.lang.Class.getMethod0(Class.java:3018)
at java.lang.Class.getMethod(Class.java:1784)
at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: org.apache.flink.streaming.api.functions.source.SourceFunction
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 7 more
I tried different variations of specifying the classpath as described here, but I keep getting the same error. Can someone point out my mistake in running the program?
The Flink dependencies in the quickstart POM are typically provided-scope, so they are not packaged into your jar, and running it with plain java therefore cannot find classes such as SourceFunction. Submit the job to the local Flink cluster instead:
Run Flink.
/path/to/flink-1.4.0/bin/start-local.sh
Submit the job.
/path/to/flink-1.4.0/bin/flink run -c com.package.YourClass /path/to/jar.jar
Alternatively, you can run the job directly from your IDE:
Your job in this case will run in a local Flink environment.
Check Flink's example: https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/wordcount/WordCount.java
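For illustration, a minimal sketch of a streaming job run straight from the IDE (the class name and data here are made up, not from the CEP project):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LocalJobSketch {
    public static void main(String[] args) throws Exception {
        // When launched from an IDE, this returns a local embedded
        // environment, so no cluster or CLI submission is required.
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("rack-1", "rack-2", "rack-3")
           .map(new MapFunction<String, String>() {
               @Override
               public String map(String value) {
                   return "monitoring " + value;
               }
           })
           .print();

        env.execute("local-job-sketch");
    }
}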
The CEP example uses Flink version 1.3.2, so here are the steps to run it.
Install version 1.3.2 of Apache Flink (wget it from here and extract it).
cd into flink-1.3.2.
Run ./bin/start-local.sh; this will start the Flink cluster. Then cd .. back to the parent directory.
Clone this repo using git clone and cd into it.
Run mvn clean package to build the project. This will create the target directory.
Run ../flink-1.3.2/bin/flink run target/cep-monitoring-1.0.jar to start the process.
In a separate terminal, the output can be followed like this (assuming you are in the same directory as in the previous step): tail -f ../flink-1.3.2/log/flink-*-jobmanager-*.out (the * parts will be filled in with user and host details; press Tab to autocomplete them).
Here is the sample output:
rshah9@bn18-20:~/tools/cep-monitoring-master$ tail -f ../flink-1.3.2/log/flink-rshah9-jobmanager-0-bn18-20.dcs.mcnc.org.out
TemperatureWarning(9, 102.45860162626161)
TemperatureWarning(6, 113.21295716135027)
TemperatureWarning(5, 105.46064102697723)
TemperatureWarning(0, 106.44635415722034)
TemperatureWarning(4, 112.07396748089734)
TemperatureWarning(9, 114.53346561628322)
TemperatureWarning(3, 109.05305417712648)
TemperatureWarning(7, 112.3698094257147)
TemperatureWarning(3, 107.78609416982076)
TemperatureWarning(9, 107.34373990230458)
TemperatureWarning(5, 111.46480675461656)

Apache Ignite - error 'JavaLoggerFileHandler' when running Spark shell

I am starting spark-shell (Spark 2.2) and have added a bunch of jars on the spark-shell command line (from the Ignite 2.1 directory).
I still get the error:
Can't load log handler "org.apache.ignite.logger.java.JavaLoggerFileHandler"
I also followed the recommendation from here:
https://apacheignite.readme.io/v1.2/docs/installation--deployment
# Optionally set IGNITE_HOME here.
# IGNITE_HOME=/path/to/ignite
IGNITE_LIBS="${IGNITE_HOME}/libs/*"
for file in ${IGNITE_HOME}/libs/*
do
if [ -d ${file} ] && [ "${file}" != "${IGNITE_HOME}"/libs/optional ]; then
IGNITE_LIBS=${IGNITE_LIBS}:${file}/*
fi
done
export SPARK_CLASSPATH=$IGNITE_LIBS
I also set logging to ERROR only, but I still get the error:
Can't load log handler "org.apache.ignite.logger.java.JavaLoggerFileHandler"
java.lang.ClassNotFoundException: org.apache.ignite.logger.java.JavaLoggerFileHandler
java.lang.ClassNotFoundException: org.apache.ignite.logger.java.JavaLoggerFileHandler
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.util.logging.LogManager$5.run(LogManager.java:965)
at java.security.AccessController.doPrivileged(Native Method)
It looks like you are using the documentation for the old Ignite version 1.2, while you are running Ignite 2.1. Check the documentation for the latest version here: https://apacheignite-fs.readme.io/v2.2/docs/installation-deployment
Also, please make sure that you have configured IGNITE_HOME in your environment. JavaLoggerFileHandler lives in the ignite-core module; it looks like the Spark classpath doesn't see any Ignite libs at all.
The documentation describes the issue here:
https://apacheignite-fs.readme.io/v2.2/docs/troubleshooting
This issue appears when you do not have any loggers included in the classpath and Ignite tries to fall back to standard Java logging. By default, Spark loads all user jar files using a separate class loader. The Java logging framework, on the other hand, uses the application class loader to initialize log handlers. To resolve this, you can either add the ignite-log4j module to the list of used jars, so that Ignite uses Log4j as its logging subsystem, or alter the default Spark classpath as described above.
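If you start Ignite from your own code, one way to make that choice explicit is to pin the Log4j logger programmatically. A hedged sketch (the log4j config path is an assumption for illustration, and the ignite-log4j module must be on the classpath):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.logger.log4j.Log4JLogger;

public class IgniteWithLog4j {
    public static void main(String[] args) throws Exception {
        IgniteConfiguration cfg = new IgniteConfiguration();
        // Explicitly use the Log4j logger from the ignite-log4j module,
        // so Ignite never falls back to java.util.logging. The config
        // file path below is an assumption for this sketch.
        cfg.setGridLogger(new Log4JLogger("config/ignite-log4j.xml"));

        try (Ignite ignite = Ignition.start(cfg)) {
            System.out.println("Ignite started: " + ignite.name());
        }
    }
}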

How to run MapReduce Program from local IDE on remote cluster

I have a simple MapReduce program that I want to run on a remote cluster. I can do this from the command line by simply running
hadoop jar myjar.jar input output
but when I want to run a function in my JUnit TestCase class from my IDE, which invokes the MR job, I get the following warnings:
WARN org.apache.hadoop.mapreduce.JobSubmitter - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
INFO org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
although I have this line set before submitting the MR job:
job.setJarByClass(MyJob.class);
and hence the job fails, as it cannot find the appropriate classes it needs to operate (like MyMapKey, which is the mapper key class):
Error: java.io.IOException: Initialization of all the collectors failed. Error in last collector was :java.lang.RuntimeException: java.lang.ClassNotFoundException: Class MyMapKey not found
at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:414)
at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:81)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:698)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Any thoughts on this?
First, you should add the remote Hadoop cluster's config files (i.e. core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, ssl-client.xml) as resources to your Configuration object. Then follow the steps in the above link to see how to manually add the job jar to the classpath on the remote cluster; a sketch of the idea follows.
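A hedged sketch of that setup (the config file locations and jar path are assumptions for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class RemoteSubmit {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Config files copied from the remote cluster; paths are
        // assumptions for this sketch.
        conf.addResource(new Path("conf/core-site.xml"));
        conf.addResource(new Path("conf/hdfs-site.xml"));
        conf.addResource(new Path("conf/mapred-site.xml"));
        conf.addResource(new Path("conf/yarn-site.xml"));

        Job job = Job.getInstance(conf, "my-job");
        // Point at the built jar file itself instead of relying on
        // setJarByClass(), which finds no jar when classes are run
        // from the IDE's build directory.
        job.setJar("target/myjar.jar");
        // ... set mapper, reducer, input and output paths here ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}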

Configuring memory for mappers and reducer during mapreduce job submission

I am trying to configure memory for the mapper/reducer during a MapReduce job submission, as below:
hadoop jar Word-0.0.1-SNAPSHOT.jar -Dmapreduce.map.memory.mb=5120 com.test.Word.App /tmp/ilango/input /tmp/ilango/output/
Is there anything wrong with the command above? I am getting the following exception. It looks like either I need to place the JAR file differently or configure something to use the -D option in Hadoop. Thanks in advance.
Exception in thread "main" java.lang.ClassNotFoundException: -Dmapreduce.map.memory.mb=5120
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.util.RunJar.main(RunJar.java:205)
The command to run an MR job is
hadoop jar jarname classname input output
With your command,
hadoop jar jarname -Dmapreduce.map.memory.mb=5120 classname input output
hadoop looks for a driver class named "-Dmapreduce.map.memory.mb=5120".
That's why it is throwing the java.lang.ClassNotFoundException.
The -D option should be supplied after your driver class.
Try the command below:
hadoop jar Word-0.0.1-SNAPSHOT.jar com.test.Word.App -D mapreduce.map.memory.mb=5120 /tmp/ilango/input /tmp/ilango/output/
Hope this solves your issue.
It looks like you are missing a space after -D.
Try -D mapreduce.map.memory.mb=5120 instead.
There is a difference between -Dproperty=value and -D property=value. The first sets a JVM system property, whereas the second sets a Hadoop configuration property.
Quoting from the book Hadoop: The Definitive Guide:
-D property=value Sets the given Hadoop configuration property to the given value. Overrides any default or site properties in the configuration, and any properties set via the -conf option.
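For context, Hadoop only parses the generic -D property=value option when the driver runs through ToolRunner / GenericOptionsParser. A minimal sketch of such a driver, reusing the question's class name (the job wiring is a placeholder):

package com.test.Word;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class App extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // getConf() already contains any properties passed with -D on
        // the command line, e.g. mapreduce.map.memory.mb=5120.
        Job job = Job.getInstance(getConf(), "word");
        // ... set jar, mapper, reducer, input and output paths here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner strips the generic options before run() sees args.
        System.exit(ToolRunner.run(new Configuration(), new App(), args));
    }
}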
If you're using Maven and have added the main class to the manifest (in this case com.test.Word.App), the driver class must not be repeated on the command line; otherwise -D mapreduce.map.memory.mb=5120 will be taken as plain input.
So just remove com.test.Word.App from the command.

Java SQLite org.sqlite.JDBC classpath broken?

I stumbled upon a weird error while using SQLite JDBC with org.sqlite.JDBC.
My code compiles and runs fine on Windows, but when I tried moving it to Ubuntu it started showing this:
Exception in thread "main" java.lang.ClassNotFoundException: org.sqlite.JDBC
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:259)
at mall.SQLiteJDBC.<init>(SQLiteJDBC.java:27)
at mall.AllegroReader.<init>(AllegroReader.java:33)
at mall.Mall.main(Mall.java:31)
I'm running it with java -classpath "sqlite-jdbc-3.7.2.jar" -jar Mall.jar and java -classpath "sqlite-jdbc4-3.8.2-SNAPSHOT.jar" -jar Mall.jar,
with both versions in the same directory as my jar, and I've tried a dozen different ways of specifying the classpath; it behaves exactly the same. I tried OpenJDK and Oracle JDK.
I tried rebuilding it on Ubuntu, changing the Ant .xmls, changing paths, etc.
I have no idea what is going on. Please help.
Here is what happens inside my dist directory:
work1@workwork:/var/www/mall/dist$ ls
mall.db Mall.jar Mall.jar.old sqlite-jdbc-3.8.4.3-SNAPSHOT.jar
work1@workwork:/var/www/mall/dist$ java -classpath "sqlite-jdbc-3.8.4.3-SNAPSHOT.jar:Mall.jar" Mall
Error: Could not find or load main class Mall
The classpath is ignored when you use -jar.
You have to either include the dependencies in the jar itself (or at least have the jar's manifest point to them), or run it with java -classpath sqlite.jar:Mall.jar the.main.class.
Error: Could not find or load main class Mall.main. All files are there; my main class comes from Mall.java and is in the mall package, which compiles to Mall.jar.
So the correct command line is:
java -classpath "sqlite-jdbc-3.8.4.3-SNAPSHOT.jar:Mall.jar" mall.Mall
OP findings
To view the classes in a jar, use jar tf Mall.jar. It showed
mall/Mall.class
meaning my class containing main was mall.Mall, so I should have used mall.Mall as the class to run (instead of pulling my hair out).
After spending over 6 hours in total, with many failed attempts at running a "portable" jar package using the classpath and whatnot, and after trying OneJar and jarjar to no avail (I ended up with "Class file too large!"), I decided to rewrite the offending piece of code in PHP.
It proved to be more portable than Java in my case.
