Unable to connect to spark on remote system

Unable to connect to spark on remote system - java

I am trying to connect to spark master on a remote system through java app
I am using
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.0.1</version>
</dependency>
and code
{
SparkSession sparkSession = SparkSession.builder().
master("spark://ip:7077")
.appName("spark session example")
.getOrCreate();
JavaSparkContext sc = new JavaSparkContext(sparkSession.sparkContext());
}
Getting
Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;
at org.apache.spark.sql.SparkSession$Builder.config(SparkSession.scala:713)
at org.apache.spark.sql.SparkSession$Builder.master(SparkSession.scala:766)
at com.mobelisk.spark.JavaSparkPi.main(JavaSparkPi.java:9)
Also If I change to
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
**<version>2.0.1</version>**
</dependency>
on the same program getting
Caused by: java.lang.RuntimeException: java.io.InvalidClassException: org.apache.spark.rpc.netty.RequestMessage; local class incompatible: stream classdesc serialVersionUID = -2221986757032131007, local class serialVersionUID = -5447855329526097695
In Spark-shell on remote
Spark context available as 'sc' (master = local[*], app id = local-1477561433881).
Spark session available as 'spark'.
Welcome to
____ __
/ / _ _____/ /
_\ / _ / _ `/ / '/
// .__/_,// //_\ version 2.0.1
//
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_101)
Type in expressions to have them evaluated.
Type :help for more information.
As I am very new to all this, I am not able to figure out the issue in program

I figured it out, posting this in case if someone is going to follow the similar approach.
I had added
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>2.0.0-M3</version>
which comes with scala-library 2.10.6
but there already exists a scala-library 2.11.8 in spark-core
so I had to exclude the earlier one like this
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>2.0.0-M3</version>
<exclusions>
<exclusion>
<artifactId>scala-library</artifactId>
<groupId>org.scala-lang</groupId>
</exclusion>
<exclusion>
<artifactId>scala-reflect</artifactId>
<groupId>org.scala-lang</groupId>
</exclusion>
</exclusions>
</dependency>
Now everything is working fine

This Spark version mismatch:
you use 2.10 in project.
cluster uses 2.11
Update dependency to 2.11.

Related

Helidon MP OpenAPI aren't generating a updated openapi endpoint response

I'm currently building a microservice-based on Helidon Microprofile following guides and tutorials from Oracle themselves, but I've run into a problem related to the 'Automatic OpenAPI specification generator' when using Annotations.
My POM consists of an MP bundle and integrations to make it work with Hibernate-provided JPA.
Even after setting up all the annotations on my Resource, it doesn't generate an updated specification.
POM
<dependencies>
<dependency>
<groupId>io.helidon.microprofile.bundles</groupId>
<artifactId>helidon-microprofile</artifactId>
<version>1.4.0</version>
</dependency>
<dependency>
<groupId>org.jboss</groupId>
<artifactId>jandex</artifactId>
<version>2.1.1.Final</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>io.helidon.integrations.cdi</groupId>
<artifactId>helidon-integrations-cdi-datasource-hikaricp</artifactId>
<version>1.4.0</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>io.helidon.integrations.cdi</groupId>
<artifactId>helidon-integrations-cdi-jta-weld</artifactId>
<version>1.4.0</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>io.helidon.integrations.cdi</groupId>
<artifactId>helidon-integrations-cdi-hibernate</artifactId>
<version>1.4.0</version>
</dependency>
<dependency>
<groupId>javax.transaction</groupId>
<artifactId>javax.transaction-api</artifactId>
<version>1.3</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>jakarta.persistence</groupId>
<artifactId>jakarta.persistence-api</artifactId>
<version>2.2.3</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.mariadb.jdbc</groupId>
<artifactId>mariadb-java-client</artifactId>
<version>2.5.2</version>
</dependency>
<dependency>
<groupId>com.googlecode.json-simple</groupId>
<artifactId>json-simple</artifactId>
<version>1.1.1</version>
</dependency>
<dependency>
<groupId>org.glassfish.jersey.media</groupId>
<artifactId>jersey-media-json-jackson</artifactId>
<version>2.29.1</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>com.auth0</groupId>
<artifactId>java-jwt</artifactId>
<version>3.8.3</version>
</dependency>
<dependency>
<groupId>com.auth0</groupId>
<artifactId>jwks-rsa</artifactId>
<version>0.9.0</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.12</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-api</artifactId>
<version>5.5.2</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-engine</artifactId>
<version>5.5.2</version>
<scope>test</scope>
</dependency>
</dependencies>
I'm only using Annotations specified in the guides and #OpenAPIDefinition for defining things like Title and Licence.
RESOURCE
#OpenAPIDefinition(
info = #Info(
title = "Newsletter Microservice",
version = "1.0", description = "Microservice in charge of handling newsletter",
license = #License(name = "Apache 2.0", url = "https://www.apache.org/licenses/LICENSE-2.0"),
contact = #Contact(name = "Email", url = "mailto:email")
),
tags = {
#Tag(name = "public"), #Tag(name = "private")
}
)
#Path("/newsletter")
#RequestScoped
public class NewsletterClientResource {
LOG
Connected to the target VM, address: '127.0.0.1:0', transport: 'socket'
Java HotSpot(TM) 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
2019.12.02 20:01:19 INFO org.jboss.weld.Version Thread[main,5,main]: WELD-000900: 3.1.1 (Final)
2019.12.02 20:01:20 INFO org.jboss.weld.Bootstrap Thread[main,5,main]: WELD-ENV-000020: Using jandex for bean discovery
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.jboss.weld.util.bytecode.ClassFileUtils$1 (file:/C:/Users/Brenno%20Fagundes/.m2/repository/org/jboss/weld/weld-core-impl/3.1.1.Final/weld-core-impl-3.1.1.Final.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int)
WARNING: Please consider reporting this to the maintainers of org.jboss.weld.util.bytecode.ClassFileUtils$1
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2019.12.02 20:01:21 INFO org.jboss.weld.Event Thread[main,5,main]: WELD-000411: Observer method [BackedAnnotatedMethod] public org.glassfish.jersey.ext.cdi1x.internal.ProcessAllAnnotatedTypes.processAnnotatedType(#Observes ProcessAnnotatedType<?>, BeanManager) receives events for all annotated types. Consider restricting events using #WithAnnotations or a generic type with bounds.
2019.12.02 20:01:21 INFO org.jboss.weld.Event Thread[main,5,main]: WELD-000411: Observer method [BackedAnnotatedMethod] private io.helidon.microprofile.openapi.IndexBuilder.processAnnotatedType(#Observes ProcessAnnotatedType<X>) receives events for all annotated types. Consider restricting events using #WithAnnotations or a generic type with bounds.
2019.12.02 20:01:22 INFO org.jboss.weld.Bootstrap Thread[main,5,main]: WELD-ENV-002003: Weld SE container 404f642b-892f-4676-960e-8df848aee3a3 initialized
2019.12.02 20:01:22 INFO io.helidon.microprofile.security.SecurityMpService Thread[main,5,main]: Security extension for microprofile is enabled, yet security configuration is missing from config (requires providers configuration at key security.providers). Security will not have any valid provider.
2019.12.02 20:01:22 INFO io.smallrye.openapi.api.OpenApiDocument Thread[main,5,main]: OpenAPI document initialized: io.smallrye.openapi.api.models.OpenAPIImpl#7793ad58
2019.12.02 20:01:23 INFO io.helidon.webserver.NettyWebServer Thread[main,5,main]: Version: 1.4.0
2019.12.02 20:01:24 INFO io.helidon.webserver.NettyWebServer Thread[nioEventLoopGroup-2-1,10,main]: Channel '#default' started: [id: 0x4e1f119b, L:/0:0:0:0:0:0:0:0:7200]
2019.12.02 20:01:24 INFO io.helidon.microprofile.server.ServerImpl Thread[nioEventLoopGroup-2-1,10,main]: Server initialized on http://localhost:7200 (and all other host addresses) in 5254 milliseconds.
BUMP, also, the generation works using custom filters and models, a static definition in META-INF also works. Currently using JDK 13.
EDIT: this is my Application class after Tim Quinn's suggested modifications.
APPLICATION CLASS
#ApplicationScoped
#ApplicationPath("/")
#OpenAPIDefinition(
info = #Info(
title = "Newsletter Microservice",
version = "1.0", description = "Microservice in charge of handling newsletter",
license = #License(name = "Apache 2.0", url = "https://www.apache.org/licenses/LICENSE-2.0"),
contact = #Contact(name = "Email", url = "mailto:john.doe#gmail.com")
),
tags = {
#Tag(name = "public"), #Tag(name = "private")
}
)
public class NewsletterApplication extends Application {
#Override
public Set<Class<?>> getClasses(){
HashSet<Class<?>> classes = new HashSet<Class<?>>();
classes.add(NewsletterClientResource.class);
return classes;
}
}
GENERATED FILE
---
openapi: 3.0.1
info:
title: Generated API
version: "1.0"
paths: {}

Brenno,
You didn't show your updated source code, but I'm assuming you added the #OpenAPIDefinition annotation to the quickstart GreetResource class, is that right?
That annotation's attributes describe the whole application, not a subset of its resources, so try moving the annotation to the GreetApplication class instead:
#ApplicationScoped
#ApplicationPath("/")
#OpenAPIDefinition(info = #Info(title = "QuickStart API", version = "1.1"))
public class GreetApplication extends Application {...}
That should work. I just tried this on the generated Helidon MP quickstart example source, rebuilt, and the server returned the expected results.
Here is the diff for the change I made:
diff --git a/examples/quickstarts/helidon-quickstart-mp/src/main/java/io/helidon/examples/quickstart/mp/GreetApplication.java b/examples/quickstarts/helidon-quickstart-mp/src/main/java/io/helidon/examples/quickstart/mp/GreetApplication.java
index fd140738..cca60da2 100644
--- a/examples/quickstarts/helidon-quickstart-mp/src/main/java/io/helidon/examples/quickstart/mp/GreetApplication.java
+++ b/examples/quickstarts/helidon-quickstart-mp/src/main/java/io/helidon/examples/quickstart/mp/GreetApplication.java
## -23,12 +23,15 ## import javax.ws.rs.ApplicationPath;
import javax.ws.rs.core.Application;
import io.helidon.common.CollectionsHelper;
+import org.eclipse.microprofile.openapi.annotations.OpenAPIDefinition;
+import org.eclipse.microprofile.openapi.annotations.info.Info;
/**
* Simple Application that produces a greeting message.
*/
#ApplicationScoped
#ApplicationPath("/")
+#OpenAPIDefinition(info = #Info(title = "QuickStart API", version = "1.1"))
public class GreetApplication extends Application {
#Override
And here is the beginning of the updated OpenAPI document returned from the server:
---
openapi: 3.0.1
info:
title: QuickStart API
version: "1.1"
paths:
The title and version have changed, as expected.

Case 1 : You are using jandex and your /openapi is not getting updated.
If you are using jandex, there is a high chance that your jandex.idx is not getting updated. You can do this by running mvn process-classes
Case 2 : You are not using jandex and when you hit /openapi you get somewhat blank response.
This seems to be an issue with Helidon. The dependencies helidon-integrations-cdi-jpa, helidon-integrations-cdi-jta, helidon-integrations-cdi-eclipselink etc... contains jandex.idx and Helidon now thinks that jandex is enabled and it
will read only from those jandex files, skipping your resources. So for the time-being, you can include jandex plugin to solve the issue.

Why does running MLlib project in IntelliJ IDEA fail with "AssertionError: assertion failed: unsafe symbol CompatContext"?

I'm trying to run the Logistic Regression example (https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/ml/JavaLogisticRegressionWithElasticNetExample.java)
This is the code:
public final class GettingStarted {
public static void main(final String[] args) throws InterruptedException {
System.setProperty("hadoop.home.dir", "C:\\winutils");
SparkSession spark = SparkSession
.builder()
.appName("JavaLogisticRegressionWithElasticNetExample")
.config("spark.master", "local")
.getOrCreate();
// $example on$
// Load training data
Dataset<Row> training = spark.read().format("libsvm").load("data/mllib/sample_libsvm_data.txt");
LogisticRegression lr = new LogisticRegression()
.setMaxIter(10)
.setRegParam(0.3)
.setElasticNetParam(0.8);
// Fit the model
LogisticRegressionModel lrModel = lr.fit(training);
// Print the coefficients and intercept for logistic regression
System.out.println("Coefficients: "
+ lrModel.coefficients() + " Intercept: " + lrModel.intercept());
// We can also use the multinomial family for binary classification
LogisticRegression mlr = new LogisticRegression()
.setMaxIter(10)
.setRegParam(0.3)
.setElasticNetParam(0.8)
.setFamily("multinomial");
// Fit the model
LogisticRegressionModel mlrModel = mlr.fit(training);
// Print the coefficients and intercepts for logistic regression with multinomial family
System.out.println("Multinomial coefficients: " + lrModel.coefficientMatrix()
+ "\nMultinomial intercepts: " + mlrModel.interceptVector());
// $example off$
spark.stop();}}
I'm also using the same file of the example (https://github.com/apache/spark/blob/master/data/mllib/sample_libsvm_data.txt)
But I get these errors:
Exception in thread "main" java.lang.AssertionError: assertion failed: unsafe symbol CompatContext (child of package macrocompat) in runtime reflection universe
at scala.reflect.internal.Symbols$Symbol.<init>(Symbols.scala:184)
at scala.reflect.internal.Symbols$TypeSymbol.<init>(Symbols.scala:2984)
at scala.reflect.internal.Symbols$ClassSymbol.<init>(Symbols.scala:3176)
at scala.reflect.internal.Symbols$StubClassSymbol.<init>(Symbols.scala:3471)
at scala.reflect.internal.Symbols$Symbol.newStubSymbol(Symbols.scala:498)
at scala.reflect.internal.pickling.UnPickler$Scan.readExtSymbol$1(UnPickler.scala:258)
at scala.reflect.internal.pickling.UnPickler$Scan.readSymbol(UnPickler.scala:284)
at scala.reflect.internal.pickling.UnPickler$Scan.readSymbolRef(UnPickler.scala:649)
at scala.reflect.internal.pickling.UnPickler$Scan.readType(UnPickler.scala:417)
at scala.reflect.internal.pickling.UnPickler$Scan$LazyTypeRef$$anonfun$6.apply(UnPickler.scala:725)
at scala.reflect.internal.pickling.UnPickler$Scan$LazyTypeRef$$anonfun$6.apply(UnPickler.scala:725)
at scala.reflect.internal.pickling.UnPickler$Scan.at(UnPickler.scala:179)
at scala.reflect.internal.pickling.UnPickler$Scan$LazyTypeRef.completeInternal(UnPickler.scala:725)
at scala.reflect.internal.pickling.UnPickler$Scan$LazyTypeRef.complete(UnPickler.scala:749)
at scala.reflect.internal.Symbols$Symbol.info(Symbols.scala:1489)
at scala.reflect.runtime.SynchronizedSymbols$SynchronizedSymbol$$anon$12.scala$reflect$runtime$SynchronizedSymbols$SynchronizedSymbol$$super$info(SynchronizedSymbols.scala:162)
at scala.reflect.runtime.SynchronizedSymbols$SynchronizedSymbol$$anonfun$info$1.apply(SynchronizedSymbols.scala:127)
at scala.reflect.runtime.SynchronizedSymbols$SynchronizedSymbol$$anonfun$info$1.apply(SynchronizedSymbols.scala:127)
at scala.reflect.runtime.Gil$class.gilSynchronized(Gil.scala:19)
at scala.reflect.runtime.JavaUniverse.gilSynchronized(JavaUniverse.scala:16)
at scala.reflect.runtime.SynchronizedSymbols$SynchronizedSymbol$class.gilSynchronizedIfNotThreadsafe(SynchronizedSymbols.scala:123)
at scala.reflect.runtime.SynchronizedSymbols$SynchronizedSymbol$$anon$12.gilSynchronizedIfNotThreadsafe(SynchronizedSymbols.scala:162)
at scala.reflect.runtime.SynchronizedSymbols$SynchronizedSymbol$class.info(SynchronizedSymbols.scala:127)
at scala.reflect.runtime.SynchronizedSymbols$SynchronizedSymbol$$anon$12.info(SynchronizedSymbols.scala:162)
at scala.reflect.internal.Mirrors$RootsBase.ensureClassSymbol(Mirrors.scala:94)
at scala.reflect.internal.Mirrors$RootsBase.getClassByName(Mirrors.scala:102)
at scala.reflect.internal.Mirrors$RootsBase.getClassIfDefined(Mirrors.scala:114)
at scala.reflect.internal.Mirrors$RootsBase.getClassIfDefined(Mirrors.scala:111)
at scala.reflect.internal.Definitions$DefinitionsClass.BlackboxContextClass$lzycompute(Definitions.scala:496)
at scala.reflect.internal.Definitions$DefinitionsClass.BlackboxContextClass(Definitions.scala:496)
at scala.reflect.runtime.JavaUniverseForce$class.force(JavaUniverseForce.scala:305)
at scala.reflect.runtime.JavaUniverse.force(JavaUniverse.scala:16)
at scala.reflect.runtime.JavaUniverse.init(JavaUniverse.scala:147)
at scala.reflect.runtime.JavaUniverse.<init>(JavaUniverse.scala:78)
at scala.reflect.runtime.package$.universe$lzycompute(package.scala:17)
at scala.reflect.runtime.package$.universe(package.scala:17)
at org.apache.spark.sql.catalyst.ScalaReflection$.<init>(ScalaReflection.scala:40)
at org.apache.spark.sql.catalyst.ScalaReflection$.<clinit>(ScalaReflection.scala)
at org.apache.spark.sql.catalyst.encoders.RowEncoder$.org$apache$spark$sql$catalyst$encoders$RowEncoder$$serializerFor(RowEncoder.scala:74)
at org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:61)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:67)
at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:415)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:172)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:156)
at GettingStarted.main(GettingStarted.java:95)
Do you know what I'm wrong about?
EDIT:
I run it on IntelliJ, it is a Maven project and I added the dependencies:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>org.mongodb.spark</groupId>
<artifactId>mongo-spark-connector_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.10</artifactId>
<version>2.2.0</version>
</dependency>

tl;dr As soon as you start seeing errors internal to scala, mentionning reflection universe, think incompatible scala versions.
Your scala versions on your libs do not match one another (2.10 and 2.11).
You should align all on your actual scala version.
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId> <!-- This is scala v2.11 -->
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.10</artifactId> <!-- This is scala v2.10 -->
<version>2.2.0</version>
</dependency>

HiveContext.sql() gives runtime No Such Method Error

Hi I am trying to run a simple java program using Apache Hive and Apache Spark. The program compiles without any error, but on runtime I get the following error:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.sql.hive.HiveContext.sql(Ljava/lang/String;)Lorg/apache/spark/sql/DataFrame;
at SparkHiveExample.main(SparkHiveExample.java:13)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Following is my code:
import org.apache.spark.SparkContext;
import org.apache.spark.SparkConf;
import org.apache.spark.sql.hive.HiveContext;
import org.apache.spark.sql.DataFrame;
public class SparkHiveExample {
public static void main(String[] args) {
SparkConf conf = new SparkConf().setAppName("SparkHive Example");
SparkContext sc = new SparkContext(conf);
HiveContext hiveContext = new HiveContext(sc);
System.out.println("Hello World");
DataFrame df = hiveContext.sql("show tables");
df.show();
}
}
My pom.xml file looks as follows:
<project>
<groupId>edu.berkeley</groupId>
<artifactId>simple-project</artifactId>
<modelVersion>4.0.0</modelVersion>
<name>Simple Project</name>
<packaging>jar</packaging>
<version>1.0</version>
<dependencies>
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_2.10</artifactId>
<version>1.3.0</version>
</dependency>
</dependencies>
</project>
What could be the problem?
EDIT: I tried using SQLContext.sql() method and I still get a similar method not found runtime error. This stackoverflow answer suggests that the problem is caused due to dependency problem, but I am unable to figure out what.

make sure your spark core and spark hive dependencies are set to the scope of provided as shown below. These dependencies are provided by the cluster and not by your application.
And ensure the version of your spark installation is 1.3 or above. prior to 1.3 the sql method returned a RDD (SchemaRDD) instead of DataFrame. It is most likely the version of spark that is installed is older than 1.3.
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.3.0</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_2.10</artifactId>
<version>1.3.0</version>
<scope>provided</scope>
</dependency>
And it is recommended you use SparkSession object to run queries instead of HiveContext. The below code snippet explains the usage of SparkSession.
val spark = SparkSession.builder.
master("local")
.appName("spark session example")
.enableHiveSupport()
.getOrCreate()
spark.sql("show tables")

The error is because you are applying query of showing table and assigning to a Dataframe.
You can assign to a DataFrame when you use select query or similar queries but not show query

from pyspark.sql.types import DecimalType,StringType
from pyspark.sql.functions import *
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("Your APPName").enableHiveSupport().getOrCreate()
from pyspark.sql import HiveContext
hive_context = HiveContext(spark)
hive_context.sql("select current_date()").show()

NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less

I have seen many answers related to this error, but all re-directing to scala versions etc. But I think my case is different.
I have a remote spark master-worker cluster set up with version 2.10. I was able to verify it through http://master-ip:8080 listing all worker nodes
From my application, I am trying to create SparkConf with Java 7 code. Following below is the code
sparkConf = new SparkConf(true)
.set("spark.cassandra.connection.host", "localhost")
.set("spark.cassandra.auth.username", "username")
.set("spark.cassandra.auth.password", "pwd")
.set("spark.master", "spark://master-ip:7077")
.set("spark.app.name","Test App");
Following are the maven dependencies i added
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>2.0.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.1.0</version>
<exclusions>
<exclusion>
<groupId>javax.validation</groupId>
<artifactId>validation-api</artifactId>
</exclusion>
</exclusions>
</dependency>
I get the below error
Caused by: java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
at org.apache.spark.util.Utils$.getSystemProperties(Utils.scala:1710)
at org.apache.spark.SparkConf.loadFromSystemProperties(SparkConf.scala:73)
at org.apache.spark.SparkConf.<init>(SparkConf.scala:68)
Spark Version from one of the worker nodes
./spark-shell --version
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.1.0
/_/
Using Scala version 2.11.8, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_91
Branch
Compiled by user jenkins on 2016-12-16T02:04:48Z
Revision
Url
Type --help for more information.

It is related to Scala version.
Your cluster has Scala 2.10, but Spark dependency is
spark-core_2.11
which means Scala 2.11
Change it to 2.10 and will work

NoSuchMethodError : Spark + Kafka : Java

Use Case
Simple message fetching and printing from Kafka topic using Spark with Java as programming language
Background
Experience in dealing with Kafka Storm Integration, developed and maintained kafka cluster and storm topologies more than a year.
No experience with Apache Spark and Scala
Simple word count application built and tested successfully using stand alone spark cluster.
Problem
Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;
at org.apache.spark.streaming.kafka.KafkaUtils$.createStream(KafkaUtils.scala:64)
at org.apache.spark.streaming.kafka.KafkaUtils$.createStream(KafkaUtils.scala:110)
at org.apache.spark.streaming.kafka.KafkaUtils.createStream(KafkaUtils.scala)
at com.random.spark.EventsToFileAggregator.main(EventsToFileAggregator.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
At EventsToFileAggregator.java:54
JavaPairReceiverInputDStream<String, String> messages =
KafkaUtils.createStream(jsc, args[0], args[1], topicMap,
StorageLevel.MEMORY_AND_DISK_SER());
pom.xml
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>1.6.1</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>1.6.1</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka_2.11</artifactId>
<version>1.6.1</version>
</dependency>
</dependencies>
Build
Successful without any warnings
Command
./bin/spark-submit --class com.random.spark.EventsToFileAggregator --master spark://host:7077 /usr/local/spark/stats/target/stats-1.0-SNAPSHOT-jar-with-dependencies.jar localhost:2181 test topic 2

NoSuchMethodError is almost always an indication that two libraries are not at a compatible version. In this case Spark-Streaming Kafka is attempting to use a Scala language feature that doesn't exist. Check that the version of Spark-Streaming Kafka is compatible with the version of Scala you're using. Make sure you're not actually running with Scala and not Java.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Unable to connect to spark on remote system - java

This Spark version mismatch: you use 2.10 in project. cluster uses 2.11 Update dependency to 2.11.

Related

Helidon MP OpenAPI aren't generating a updated openapi endpoint response

Why does running MLlib project in IntelliJ IDEA fail with "AssertionError: assertion failed: unsafe symbol CompatContext"?

HiveContext.sql() gives runtime No Such Method Error

NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less

NoSuchMethodError : Spark + Kafka : Java

Categories

Resources