I am using neo4j-reco to pre-compute real-time recommendations.
I have a sample graph, and the .jar files have been placed into the plugins directory of the Neo4j installation as described in the readme file,
but I get the following error when restarting the server.
2015-12-01 15:38:35.769+0530 INFO Neo4j Server shutdown initiated by request
15:38:35.788 [Thread-12] INFO c.g.s.f.b.GraphAwareServerBootstrapper - stopped
2015-12-01 15:38:35.789+0530 INFO Successfully shutdown Neo4j Server
15:38:36.399 [Thread-12] INFO c.g.runtime.BaseGraphAwareRuntime - Shutting down GraphAware Runtime...
15:38:36.399 [Thread-12] INFO c.g.r.schedule.RotatingTaskScheduler - Terminating task scheduler...
15:38:36.399 [Thread-12] INFO c.g.r.schedule.RotatingTaskScheduler - Task scheduler terminated successfully.
15:38:36.399 [Thread-12] INFO c.g.runtime.BaseGraphAwareRuntime - GraphAware Runtime shut down.
2015-12-01 15:38:36.405+0530 INFO Successfully stopped database
2015-12-01 15:38:36.405+0530 INFO Successfully shutdown database
15:38:40.041 [main] INFO c.g.r.b.RuntimeKernelExtension - GraphAware Runtime enabled, bootstrapping...
15:38:40.069 [main] INFO c.g.r.b.RuntimeKernelExtension - Bootstrapping module with order 1, ID reco, using com.graphaware.reco.neo4j.module.RecommendationModuleBootstrapper
15:38:40.077 [main] INFO c.g.r.n.m.RecommendationModuleBootstrapper - Constructing new recommendation module with ID: reco
15:38:40.080 [main] INFO c.g.r.n.m.RecommendationModuleBootstrapper - Trying to instantiate class FriendsComputingEngine
15:38:40.089 [main] ERROR c.g.r.n.m.RecommendationModuleBootstrapper - Engine FriendsComputingEngine wasn't found on the classpath. Will not pre-compute recommendations
java.lang.ClassNotFoundException: FriendsComputingEngine
at java.net.URLClassLoader$1.run(URLClassLoader.java:366) ~[na:1.7.0_91]
at java.net.URLClassLoader$1.run(URLClassLoader.java:355) ~[na:1.7.0_91]
at java.security.AccessController.doPrivileged(Native Method) ~[na:1.7.0_91]
at java.net.URLClassLoader.findClass(URLClassLoader.java:354) ~[na:1.7.0_91]
at java.lang.ClassLoader.loadClass(ClassLoader.java:425) ~[na:1.7.0_91]
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) ~[na:1.7.0_91]
at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ~[na:1.7.0_91]
at java.lang.Class.forName0(Native Method) ~[na:1.7.0_91]
at java.lang.Class.forName(Class.java:195) ~[na:1.7.0_91]
How do I solve this?
You need to build one first if you're referring to it in your config. If you follow the steps in the readme you mention, you will end up building one.
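For illustration, here is a hedged sketch of what the relevant neo4j.properties entries might look like once the engine has been compiled and its jar dropped into the plugins directory. The keys follow the pattern visible in the log above; the package name com.graphaware.example is an assumption, so substitute the fully qualified name of your own class:

# GraphAware runtime and recommendation module (package name below is an assumed example)
com.graphaware.runtime.enabled=true
com.graphaware.module.reco.1=com.graphaware.reco.neo4j.module.RecommendationModuleBootstrapper
com.graphaware.module.reco.engine=com.graphaware.example.FriendsComputingEngine

The ClassNotFoundException in the log names an unqualified FriendsComputingEngine, which suggests the engine entry refers to a class that either has not been built yet or is missing its package prefix.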
I'm new to Maven and Java, but while searching for a good EID reader this looked like the best choice (https://github.com/grakic/jfreesteel/).
I ran "mvn install" and built the .jar file, which, when run, resulted in:
the terminal waiting for me to insert the card;
the terminal recognising the card when it is inserted and continuing;
and finally a crash.
This is the full log of what happens:
[main] INFO net.devbase.jfreesteel.nativemessaging.EidWebExtensionApp - Starting web extensions native messaging background app...
[main] INFO net.devbase.jfreesteel.nativemessaging.EidWebExtensionApp - Using terminal factory type PC/SC
[Thread-0] INFO net.devbase.jfreesteel.nativemessaging.EidWebExtensionApp - Card inserted
[Thread-0] INFO net.devbase.jfreesteel.EidCard - exclusive
[Thread-0] INFO net.devbase.jfreesteel.EidCard - exclusive free
[Thread-0] INFO net.devbase.jfreesteel.EidCard - photo exclusive
[Thread-0] INFO net.devbase.jfreesteel.EidCard - photo exclusive free
Exception in thread "Thread-0" java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter
at net.devbase.jfreesteel.Utils.image2Base64String(Utils.java:220)
at net.devbase.jfreesteel.nativemessaging.EidWebExtensionApp.inserted(EidWebExtensionApp.java:261)
at net.devbase.jfreesteel.nativemessaging.EidWebExtensionApp.access$400(EidWebExtensionApp.java:25)
at net.devbase.jfreesteel.nativemessaging.EidWebExtensionApp$2.run(EidWebExtensionApp.java:155)
at java.base/java.lang.Thread.run(Thread.java:830)
Caused by: java.lang.ClassNotFoundException: javax.xml.bind.DatatypeConverter
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:602)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
... 5 more
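The java.base/ prefixes in this trace indicate a Java 9-or-newer runtime, and the javax.xml.bind (JAXB) classes were removed from the JDK entirely in Java 11, which is exactly what the NoClassDefFoundError points at. A hedged sketch of one common fix is to declare JAXB as an explicit Maven dependency and rebuild (the coordinates are the standard JAXB artifacts; the version is an example):

<!-- JAXB API, no longer shipped with the JDK from Java 11 on -->
<dependency>
    <groupId>javax.xml.bind</groupId>
    <artifactId>jaxb-api</artifactId>
    <version>2.3.1</version>
</dependency>
<!-- a runtime implementation is needed as well -->
<dependency>
    <groupId>org.glassfish.jaxb</groupId>
    <artifactId>jaxb-runtime</artifactId>
    <version>2.3.1</version>
</dependency>

Alternatively, running the jar on a Java 8 runtime, where JAXB is still bundled, avoids rebuilding.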
I am following this document to set up a Hive hook:
http://dharmeshkakadia.github.io/hive-hook/
But I get this error when I run show tables:
2018-08-12 09:57:38,122 ERROR org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-315]: hive.exec.pre.hooks Class not found: HiveExampleHook
2018-08-12 09:57:38,122 ERROR org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-315]: FAILED: Hive Internal Error: java.lang.ClassNotFoundException(HiveExampleHook)
java.lang.ClassNotFoundException: HiveExampleHook
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.hive.ql.hooks.HooksLoader.getHooks(HooksLoader.java:100)
at org.apache.hadoop.hive.ql.hooks.HooksLoader.getHooks(HooksLoader.java:64)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1674)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1501)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1285)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1280)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:236)
at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:89)
at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:301)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2018-08-12 09:57:38,122 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-315]: </PERFLOG method=Driver.execute start=1534067858120 end=1534067858122 duration=2 from=org.apache.hadoop.hive.ql.Driver>
2018-08-12 09:57:38,122 INFO org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-315]: Completed executing command(queryId=hive_20180812095757_e6516d83-ddc9-4f82-8151-def7e7f1eb37); Time taken: 0.002 seconds
2018-08-12 09:57:38,122 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-315]: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2018-08-12 09:57:38,122 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-315]: </PERFLOG method=releaseLocks start=1534067858122 end=1534067858122 duration=0 from=org.apache.hadoop.hive.ql.Driver>
2018-08-12 09:57:38,130 ERROR org.apache.hive.service.cli.operation.Operation: [HiveServer2-Background-Pool: Thread-315]: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Hive Internal Error: java.lang.ClassNotFoundException(HiveExampleHook)
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238)
at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:89)
at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:301)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: HiveExampleHook
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.hive.ql.hooks.HooksLoader.getHooks(HooksLoader.java:100)
at org.apache.hadoop.hive.ql.hooks.HooksLoader.getHooks(HooksLoader.java:64)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1674)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1501)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1285)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1280)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:236)
... 11 more
I am sure the last step, add jar target/Hive-hook-example-1.0.jar;, is wrong.
I tried the following:
I put the jar file into hdfs /user/hive/:
add jar hdfs:///user/hive/Hive-hook-example-1.0.jar;
I also set "Hive Auxiliary JARs Directory" to /home/centos/HiveExampleHook/target/Hive-hook-example-1.0.jar on the HiveServer2 node and restarted Hive and beeline.
I copied the jar file to /opt/cloudera/parcels/CDH/jars/.
I copied the jar file to /opt/cloudera/parcels/CDH/lib/hive/lib/.
Nothing helps.
Any idea?
UPDATE 1:
If I run LIST JARS;, it shows:
+----------------------------------------------------+--+
| resource |
+----------------------------------------------------+--+
| /tmp/3fe67bb1-5cfd-427f-8faa-cab6524afeb3_resources/Hive-hook-example-1.0.jar |
+----------------------------------------------------+--+
I also tried two ways of running CREATE FUNCTION:
CREATE TEMPORARY FUNCTION test1 AS 'HiveExampleHook';
INFO : Compiling command(queryId=hive_20180812153838_47589f9d-eaeb-410d-80b0-9cf414ae557f): CREATE TEMPORARY FUNCTION test1 AS 'HiveExampleHook'
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20180812153838_47589f9d-eaeb-410d-80b0-9cf414ae557f); Time taken: 0.002 seconds
INFO : Executing command(queryId=hive_20180812153838_47589f9d-eaeb-410d-80b0-9cf414ae557f): CREATE TEMPORARY FUNCTION test1 AS 'HiveExampleHook'
INFO : Starting task [Stage-0:FUNC] in serial mode
ERROR : FAILED: Class HiveExampleHook not found
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask
INFO : Completed executing command(queryId=hive_20180812153838_47589f9d-eaeb-410d-80b0-9cf414ae557f); Time taken: 0.003 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask (state=08S01,code=1)
and...
CREATE TEMPORARY FUNCTION test1 AS 'HiveExampleHook' USING JAR 'hdfs:///user/hive/Hive-hook-example-1.0.jar';
INFO : Compiling command(queryId=hive_20180812153939_cf1f31c9-0361-47dc-8903-78221bd12401): CREATE TEMPORARY FUNCTION test1 AS 'HiveExampleHook' USING JAR 'hdfs:///user/hive/Hive-hook-example-1.0.jar'
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20180812153939_cf1f31c9-0361-47dc-8903-78221bd12401); Time taken: 0.004 seconds
INFO : Executing command(queryId=hive_20180812153939_cf1f31c9-0361-47dc-8903-78221bd12401): CREATE TEMPORARY FUNCTION test1 AS 'HiveExampleHook' USING JAR 'hdfs:///user/hive/Hive-hook-example-1.0.jar'
INFO : Starting task [Stage-0:FUNC] in serial mode
INFO : converting to local hdfs:///user/hive/Hive-hook-example-1.0.jar
INFO : Added [/tmp/3fe67bb1-5cfd-427f-8faa-cab6524afeb3_resources/Hive-hook-example-1.0.jar] to class path
INFO : Added resources: [hdfs:///user/hive/Hive-hook-example-1.0.jar]
ERROR : FAILED: Class HiveExampleHook not found
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask
INFO : Completed executing command(queryId=hive_20180812153939_cf1f31c9-0361-47dc-8903-78221bd12401); Time taken: 0.03 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask (state=08S01,code=1)
So clearly it can find the jar but not the class. Am I right?
UPDATE 2:
I tried this:
[Hive-hook-example]# java -cp `pwd`/target/Hive-hook-example-1.0.jar HiveExampleHook
And still got this:
Error: Could not find or load main class HiveExampleHook
I believe this is some silly mistake on my part.
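One hedged diagnostic here is to list the jar's contents and confirm the class is really inside it, and under which package path (jar tf is standard JDK tooling; the grep filter is just for convenience):

jar tf target/Hive-hook-example-1.0.jar | grep -i HiveExampleHook

Note that "Could not find or load main class" can also appear when the class is present but one of its supertypes (here, Hive's hook interface) is not on the classpath, so the java -cp test above can fail even with a correctly built jar.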
UPDATE 3:
OK, I figured it out. You have to use the hive CLI, not beeline, for this to work.
hive> add jar hdfs:///user/hive/Hive-hook-example-1.0.jar;
add jar hdfs:///user/hive/Hive-hook-example-1.0.jar
converting to local hdfs:///user/hive/Hive-hook-example-1.0.jar
Added [/tmp/0a90132d-70cd-4ef0-b4cd-e75dc823e5ca_resources/Hive-hook-example-1.0.jar] to class path
Added resources: [hdfs:///user/hive/Hive-hook-example-1.0.jar]
hive> set hive.exec.pre.hooks=HiveExampleHook;
set hive.exec.pre.hooks=HiveExampleHook
hive> show tables;
show tables
Hello from the hook !!
OK
test1
Time taken: 0.023 seconds, Fetched: 5 row(s)
So the question is: how do I run this in beeline, given that the hive CLI is deprecated?
UPDATE 4:
I decided to do this:
I ran beeline and saw this:
2018-08-12 16:39:13,286 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-60]: <PERFLOG method=PreHook.HiveExampleHook from=org.apache.hadoop.hive.ql.Driver>
2018-08-12 16:39:13,286 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-60]: </PERFLOG method=PreHook.HiveExampleHook start=1534091953286 end=1534091953286 duration=0 from=org.apache.hadoop.hive.ql.Driver>
That is some progress, although I am not sure what it means or whether the class was run, as I see no output.
With beeline, you have to use an HDFS path when adding the jar. Remember, beeline is just a JDBC client, so when you use add jar with a local path, the reference is to your local path, which is not accessible to the Hive session running on the cluster.
(Thanks for asking for help at https://twitter.com/quanghoc/status/1028671393376874496. I am the author of the blog you referred to.)
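For completeness, a hedged sketch of the beeline session that mirrors the working hive CLI transcript above, with the jar referenced by its HDFS path so that HiveServer2 can fetch it (the prompt is abbreviated):

0: jdbc:hive2://...> add jar hdfs:///user/hive/Hive-hook-example-1.0.jar;
0: jdbc:hive2://...> set hive.exec.pre.hooks=HiveExampleHook;
0: jdbc:hive2://...> show tables;

Since the hook runs inside HiveServer2, its output lands in the server log rather than the beeline console, which is consistent with the PreHook.HiveExampleHook PerfLogger lines in UPDATE 4.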
Is there any technical reason why Spark 2.3 does not work with Java 10 (as of July 2018)?
Here is the output when I run the SparkPi example using spark-submit.
$ ./bin/spark-submit ./examples/src/main/python/pi.py
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2018-07-13 14:31:30 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-07-13 14:31:31 INFO SparkContext:54 - Running Spark version 2.3.1
2018-07-13 14:31:31 INFO SparkContext:54 - Submitted application: PythonPi
2018-07-13 14:31:31 INFO Utils:54 - Successfully started service 'sparkDriver' on port 58681.
2018-07-13 14:31:31 INFO SparkEnv:54 - Registering MapOutputTracker
2018-07-13 14:31:31 INFO SparkEnv:54 - Registering BlockManagerMaster
2018-07-13 14:31:31 INFO BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2018-07-13 14:31:31 INFO BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2018-07-13 14:31:31 INFO DiskBlockManager:54 - Created local directory at /private/var/folders/mp/9hp4l4md4dqgmgyv7g58gbq0ks62rk/T/blockmgr-d24fab4c-c858-4cd8-9b6a-97b02aa630a5
2018-07-13 14:31:31 INFO MemoryStore:54 - MemoryStore started with capacity 434.4 MB
2018-07-13 14:31:31 INFO SparkEnv:54 - Registering OutputCommitCoordinator
...
2018-07-13 14:31:32 INFO StateStoreCoordinatorRef:54 - Registered StateStoreCoordinator endpoint
Traceback (most recent call last):
File "~/Documents/spark-2.3.1-bin-hadoop2.7/./examples/src/main/python/pi.py", line 44, in <module>
count = spark.sparkContext.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
File "~/Documents/spark-2.3.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/rdd.py", line 862, in reduce
File "~/Documents/spark-2.3.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/rdd.py", line 834, in collect
File "~/Documents/spark-2.3.1-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
File "~/Documents/spark-2.3.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
File "~/Documents/spark-2.3.1-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: java.lang.IllegalArgumentException
at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:46)
at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:449)
at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:432)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:103)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:103)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:103)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at org.apache.spark.util.FieldAccessFinder$$anon$3.visitMethodInsn(ClosureCleaner.scala:432)
at org.apache.xbean.asm5.ClassReader.a(Unknown Source)
at org.apache.xbean.asm5.ClassReader.b(Unknown Source)
at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:262)
at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:261)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:261)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:159)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2299)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2073)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2099)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:939)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.collect(RDD.scala:938)
at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:162)
at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.base/java.lang.Thread.run(Thread.java:844)
2018-07-13 14:31:33 INFO SparkContext:54 - Invoking stop() from shutdown hook
...
I resolved the issue by switching to Java 8 instead of Java 10, as mentioned here.
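For reference, a minimal sketch of pinning spark-submit to a Java 8 runtime for the current shell; the java_home helper shown is the macOS way (the log paths above suggest macOS), and on Linux you would export the path of a Java 8 installation instead:

# select the installed Java 8 JDK for this shell (macOS example)
export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)
./bin/spark-submit ./examples/src/main/python/pi.py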
The primary technical reason is that Spark depends heavily on direct access to native memory via sun.misc.Unsafe, which the module system encapsulates as of Java 9.
https://issues.apache.org/jira/browse/SPARK-24421
http://apache-spark-developers-list.1001551.n3.nabble.com/Java-9-td20875.html
Committer here. It's actually a fair bit of work to support Java 9+: SPARK-24417
It's also almost done and should be ready for Spark 3.0, which should run on Java 8 through 11 and beyond.
The goal (well, mine) is to make it work without opening up module access. The key issues include:
sun.misc.Unsafe usage has to be removed or worked around
Changes to the structure of boot classloader
Scala support for Java 9+
A bunch of dependency updates to work with Java 9+
JAXB no longer automatically available
Spark depends on memory APIs that were changed in JDK 9, so they are no longer available in the same form starting with JDK 9.
That is the reason for this error.
Please check the issue:
https://issues.apache.org/jira/browse/SPARK-24421
I am trying to write Java code in MapReduce form. It ran in Eclipse, but when I try to run it as a MapReduce job I get this error. Please help me understand what this error means and how to fix it.
16/07/15 14:05:17 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
16/07/15 14:05:17 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
16/07/15 14:05:17 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
16/07/15 14:05:18 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/07/15 14:05:18 INFO mapred.FileInputFormat: Total input paths to process : 2
16/07/15 14:05:18 INFO mapreduce.JobSubmitter: Cleaning up the staging area file:/app/hadoop/tmp/mapred/staging/hadoop1148163758/.staging/job_local1148163758_0001
Exception in thread "main" java.io.IOException: Not a file: hdfs://localhost:54310/TcTest/NewTest
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:320)
at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:624)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:616)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:492)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:833)
at TextClassification.run(TextClassification.java:38)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at TextClassification.main(TextClassification.java:43)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
You need to provide the complete path of the input file as an argument to your program: right-click the program --> Run Configurations --> Arguments.
Before this, you need to add a few jar files from your hadoop-version/share/hadoop folder. You can refer to the blog below for complete details on how to run a MapReduce program through Eclipse in local mode.
https://acadgild.com/blog/running-mapreduce-in-local-mode-2/
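To make the argument wiring concrete, here is a hedged driver sketch using the same old mapred API that appears in the stack trace. The class name TextClassification comes from the trace; everything else, including the omission of mapper/reducer setup, is illustrative. Note that the "Not a file" exception means hdfs://localhost:54310/TcTest/NewTest is a directory that itself contains subdirectories, so the input argument must point at files or a flat directory of files:

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class TextClassification extends Configured implements Tool {
    public int run(String[] args) throws Exception {
        JobConf conf = new JobConf(getConf(), TextClassification.class);
        // args[0] must resolve to files or a flat directory of files;
        // a directory containing subdirectories triggers "Not a file: ..."
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf); // mapper/reducer configuration omitted in this sketch
        return 0;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new TextClassification(), args));
    }
}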
I am trying to solve an issue where a Hadoop app throws java.lang.ClassNotFoundException:
WARN mapreduce.FaunusCompiler: Using the distribution Faunus job jar: ../lib/faunus-0.4.4-hadoop2-job.jar
INFO mapreduce.FaunusCompiler: Compiled to 1 MapReduce job(s)
INFO mapreduce.FaunusCompiler: Executing job 1 out of 1: VerticesMap.Map > CountMapReduce.Map > CountMapReduce.Reduce
INFO mapreduce.FaunusCompiler: Job data location: output/job-0
INFO client.RMProxy: Connecting to ResourceManager at yuriys-bigdata3/172.31.8.161:8032
WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner
INFO input.FileInputFormat: Total input paths to process : 1
INFO mapreduce.JobSubmitter: number of splits:1
INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1402963354379_0016
INFO impl.YarnClientImpl: Submitted application application_1402963354379_0016
INFO mapreduce.Job: The url to track the job: http://local-bigdata3:8088/proxy/application_1402963354379_0016/
INFO mapreduce.Job: Running job: job_1402963354379_0016
INFO mapreduce.Job: Job job_1402963354379_0016 running in uber mode : false
INFO mapreduce.Job: map 0% reduce 0%
INFO mapreduce.Job: Task Id : attempt_1402963354379_0016_m_000000_0, Status : FAILED
Error: java.lang.ClassNotFoundException:
com.tinkerpop.blueprints.util.DefaultVertexQuery
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at com.thinkaurelius.faunus.formats.graphson.GraphSONInputFormat.setConf(GraphSONInputFormat.java:39)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:726)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
The app does create a "fat" jar file, where all the dependency jars (including the one that contains the missing class) are included under the lib directory.
The app does call Job.setJar with this fat jar file.
The code does not do anything unusual:
job.setJar(hadoopFileJar); // point the job at the fat jar
...
boolean success = job.waitForCompletion(true); // submit and block until the job finishes
Besides, I looked up the configuration in yarn-site.xml and verified that a job dir under yarn.nodemanager.local-dirs does contain that jar (renamed to job.jar, though), along with a lib directory holding the extracted jars.
That is, the jar that contains the missing class is there. Yarn/MR recreates this dir with all the required files after each job is scheduled, so the files do get transferred there.
What I've discovered so far is that the classpath of the Java worker processes that execute the failing code is set to
C:\hdp\data\hadoop\local\usercache\user\appcache\application_1402963354379_0013\container_1402963354379_0013_02_000001\classpath-3824944728798396318.jar
and this jar contains just a MANIFEST.MF. That manifest contains paths to the directory with the "fat" jar file and its contents (unwrapped here from the manifest's line folding):
file:/c:/hdp/data/hadoop/local/usercache/user/appcache/application_1402963354379_0013/container_1402963354379_0013_02_000001/job.jar/job.jar
file:/c:/hdp/data/hadoop/local/usercache/user/appcache/application_1402963354379_0013/container_1402963354379_0013_02_000001/job.jar/classes/
file:/c:/hdp/data/hadoop/local/usercache/user/appcache/application_1402963354379_0013/container_1402963354379_0013_02_000001/jobSubmitDir/job.splitmetainfo
file:/c:/hdp/data/hadoop/local/usercache/user/appcache/application_1402963354379_0013/container_1402963354379_0013_02_000001/jobSubmitDir/job.split
file:/c:/hdp/data/hadoop/local/usercache/user/appcache/application_1402963354379_0013/container_1402963354379_0013_02_000001/job.xml
file:/c:/hdp/data/hadoop/local/usercache/user/appcache/application_1402963354379_0013/container_1402963354379_0013_02_000001/job.jar/
However, this classpath does not explicitly add the jars inside those directories. That is, the directory from the above manifest
file:/c:/hdp/data/hadoop/local/usercache/user/appcache/application_1402963354379_0013/container_1402963354379_0013_02_000001/job.jar/
does contain the jar file with the class that is not being found by Yarn (this directory holds all the jars from the "fat" jar's lib section), but to the Java classloader this kind of classpath entry is insufficient: a bare directory entry only picks up .class files, so the directory would have to be included with a star wildcard,
e.g.:
file:/c:/hdp/data/hadoop/local/usercache/user/appcache/application_1402963354379_0013/container_1402963354379_0013_02_000001/job.jar/*
What am I doing wrong with passing dependencies to Yarn?
Could cluster configuration be an issue, or could this be a bug in my Hadoop distro (HDP 2.1, Windows x64)?
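As an aside, a hedged sketch of the other common way to hand dependencies to Yarn, bypassing the fat jar's lib/ mechanism entirely: upload the dependency jars to HDFS and register each one on the task classpath with Job.addFileToClassPath. The HDFS location and the blueprints jar version below are assumed examples; com.tinkerpop.blueprints.util.DefaultVertexQuery lives in blueprints-core:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class SubmitWithExplicitDeps {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "faunus-job");
        job.setJar(args[0]); // thin application jar, as in the code above
        // register each dependency jar already uploaded to HDFS;
        // path and version are hypothetical examples
        job.addFileToClassPath(new Path("hdfs:///libs/blueprints-core-2.4.0.jar"));
        // input/output/mapper/reducer setup omitted in this sketch
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}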