Java to Scala to Spark: NoClassDefFoundError

I have some Scala utility classes for loading CSV files and manipulating them as DataFrames. They work fine from Scala.
I just tried using these classes by invoking my Scala util from Java, and I got the following exception:
java.lang.NoClassDefFoundError: scala/Product$class
at org.apache.spark.SparkConf$DeprecatedConfig.<init>(SparkConf.scala:723)
at org.apache.spark.SparkConf$.<init>(SparkConf.scala:571)
Both my Java and Scala projects are Maven projects.
My Java application's pom.xml has just one dependency: the dependency on my Scala util.
My Scala util creates a SparkSession, loads the CSV files, and manipulates the data in DataFrames. It has the following dependencies (which work fine when running as standalone Scala):
<!---spark-->
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.11 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming_2.11 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-mllib_2.11 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-graphx_2.11 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-graphx_2.11</artifactId>
<version>2.2.0</version>
</dependency>
Can someone offer a hint as to what I am missing?
UPDATE: This is not a duplicate question.
Scala is not invoked directly in Java. Even so, I added a property to the Java pom:
<scala.version>2.11.11</scala.version>
This made no difference. The property is already in the Scala pom.

Package 'org.pmml4s.model' is declared in module with an invalid name ('pmml4s.2.10')

I'm trying to use PMML4S to make predictions from a model imported from sklearn. I have the model in an XML file that I am trying to load into Java using PMML4S. I am trying to follow this. However, I am having issues getting it to work; specifically, I get the error "Package 'org.pmml4s.model' is declared in module with an invalid name ('pmml4s.2.10')". I am using IntelliJ as my IDE. Please let me know if I can provide other information or code. Any help is appreciated!
Error is here:
import org.pmml4s.model.Model;
Dependencies:
<dependencies>
<dependency>
<groupId>org.openjfx</groupId>
<artifactId>javafx-controls</artifactId>
<version>18.0.2</version>
</dependency>
<dependency>
<groupId>org.openjfx</groupId>
<artifactId>javafx-fxml</artifactId>
<version>18.0.2</version>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-api</artifactId>
<version>${junit.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-engine</artifactId>
<version>${junit.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.pmml4s</groupId>
<artifactId>pmml4s_2.10</artifactId>
<version>0.9.16</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.36</version>
</dependency>
<dependency>
<groupId>com.example</groupId>
<artifactId>doctor</artifactId>
<version>1.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-math3</artifactId>
<version>3.6.1</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-text</artifactId>
<version>1.9</version>
</dependency>
<dependency>
<groupId>org.osgi</groupId>
<artifactId>org.osgi.core</artifactId>
<version>6.0.0</version>
</dependency>
</dependencies>
You used Java 9 modules in your project and imported a library that has not been converted to Java 9 modules. Java therefore treats this library as an automatic module and derives the module name from the JAR file name. The derived name, 'pmml4s.2.10', happens to be illegal, because the parts '2' and '10' are not valid Java identifiers.
See:
What is an automatic module?
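The derivation described above can be sketched in Java. This is a simplified approximation of the naming rules documented for `ModuleFinder.of`, not the JDK's own implementation. The class and method names are hypothetical, and the JAR file name `pmml4s_2.10-0.9.16.jar` is an assumption based on the dependency coordinates in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class AutomaticModuleName {

    // Start of a version suffix such as "-0.9.16".
    private static final Pattern VERSION_START = Pattern.compile("-(\\d+(\\.|$))");

    /** Approximates the module name derived from a JAR file name. */
    static String derive(String jarFileName) {
        String name = jarFileName.endsWith(".jar")
                ? jarFileName.substring(0, jarFileName.length() - 4)
                : jarFileName;
        Matcher m = VERSION_START.matcher(name);
        if (m.find()) {
            name = name.substring(0, m.start()); // drop the version part
        }
        return name.replaceAll("[^A-Za-z0-9]", ".") // non-alphanumerics become dots
                   .replaceAll("\\.{2,}", ".")      // collapse repeated dots
                   .replaceAll("^\\.|\\.$", "");    // trim a leading or trailing dot
    }

    /**
     * Checks that every dot-separated part looks like a Java identifier.
     * Simplification: reserved keywords are not rejected here.
     */
    static boolean isLegal(String moduleName) {
        for (String part : moduleName.split("\\.")) {
            if (part.isEmpty() || !Character.isJavaIdentifierStart(part.charAt(0))) {
                return false;
            }
            for (int i = 1; i < part.length(); i++) {
                if (!Character.isJavaIdentifierPart(part.charAt(i))) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String derived = derive("pmml4s_2.10-0.9.16.jar");
        System.out.println(derived + " legal=" + isLegal(derived)); // pmml4s.2.10 legal=false
    }
}
```

The Scala binary suffix `_2.10` is what turns into the `.2.10` parts, so any artifact named this way is likely to hit the same problem unless the JAR declares its own module name.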
You have two options:
Option 1: Don't use Java 9 modules
See:
Is there any need to switch to modules when migrating to Java 9 or later?
Option 2: Sanitize the JAR name
See:
Unable to derive module descriptor for auto generated module names in Java 9?
The Scala Suffix Maven plugin looks like a tool designed precisely to solve your problem.
Note that you need to require the automatic module in this approach:
How to use 3rd party library in Java9 module?

How to run single scenario

Every time I try to run a single feature file or a single scenario in a feature file, IntelliJ creates a new run configuration. The Glue property is empty, and the "Feature or folder path" property points to one specific feature file, the one containing that scenario.
After trying to run a single feature file or a single scenario, I get this error:
Undefined step: .... for every step in the feature file/scenario I'm trying to run.
Is there any solution to this problem other than creating 1000 configurations?
My dependencies:
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-java8</artifactId>
<version>1.2.5</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.12</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-jvm</artifactId>
<version>1.2.5</version>
<type>pom</type>
</dependency>
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-junit</artifactId>
<version>1.2.4</version>
<scope>test</scope>
</dependency>
You have two different versions of Cucumber involved, 1.2.4 and 1.2.5. Is that the reason you're having problems?
IDEA adds "glue" automatically if your step definition file is located in a named package (not directly under test->java).
I checked this example, http://github.com/czeczotka/cucumber-jvm-maven, with your dependencies and it seems to work fine.
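The version mismatch mentioned above can be spotted with a small check like the following. It is only a sketch: the class and method names are hypothetical, and the `groupId:artifactId:version` strings are typed in by hand from the question's dependency list rather than read from the pom.xml.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, for illustration only.
class DependencyVersionCheck {

    /** Maps each groupId to the set of versions declared for it. */
    static Map<String, Set<String>> versionsByGroup(List<String> coordinates) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (String coordinate : coordinates) {
            String[] parts = coordinate.split(":"); // groupId:artifactId:version
            result.computeIfAbsent(parts[0], k -> new TreeSet<>()).add(parts[2]);
        }
        return result;
    }

    /** True if artifacts of the given group are declared with more than one version. */
    static boolean hasMixedVersions(List<String> coordinates, String groupId) {
        return versionsByGroup(coordinates)
                .getOrDefault(groupId, Collections.emptySet())
                .size() > 1;
    }

    public static void main(String[] args) {
        // Coordinates copied from the question's dependencies.
        List<String> deps = Arrays.asList(
                "info.cukes:cucumber-java8:1.2.5",
                "junit:junit:4.12",
                "info.cukes:cucumber-jvm:1.2.5",
                "info.cukes:cucumber-junit:1.2.4");
        System.out.println(versionsByGroup(deps));                // {info.cukes=[1.2.4, 1.2.5], junit=[4.12]}
        System.out.println(hasMixedVersions(deps, "info.cukes")); // true
    }
}
```

In a real project, `mvn dependency:tree` gives the same information for the full resolved dependency graph, including transitive dependencies.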

Getting Spark Logging class not found when using Spark SQL

I am trying to write a simple Spark SQL program in Java. In the program, I get data from a Cassandra table, convert the RDD into a Dataset, and display the data. When I run the spark-submit command, I get the error: java.lang.ClassNotFoundException: org.apache.spark.internal.Logging.
My program is:
SparkConf sparkConf = new SparkConf().setAppName("DataFrameTest")
        .set("spark.cassandra.connection.host", "abc")
        .set("spark.cassandra.auth.username", "def")
        .set("spark.cassandra.auth.password", "ghi");
SparkContext sparkContext = new SparkContext(sparkConf);
// The element type must match the class passed to mapRowTo.
JavaRDD<Log> logsRDD = javaFunctions(sparkContext).cassandraTable("test", "log",
        mapRowTo(Log.class));
SparkSession sparkSession = SparkSession.builder().appName("Java Spark SQL").getOrCreate();
Dataset<Row> logsDF = sparkSession.createDataFrame(logsRDD, Log.class);
logsDF.show();
My POM dependencies are:
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.0.2</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>2.0.2</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.11</artifactId>
<version>1.6.3</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.0.2</version>
</dependency>
</dependencies>
My spark-submit command is: /home/ubuntu/spark-2.0.2-bin-hadoop2.7/bin/spark-submit --class "com.jtv.spark.dataframes.App" --master local[4] spark.dataframes-0.1-jar-with-dependencies.jar
How do I solve this error? Downgrading to 1.5.2 does not work as 1.5.2 does not have org.apache.spark.sql.Dataset and org.apache.spark.sql.SparkSession.
This may be a problem in your IDE. Because some of these packages are written in Scala and used from a Java project, the IDE is sometimes unable to understand what is going on. I am using IntelliJ and it keeps displaying this message to me, but when I run "mvn test" or "mvn package" everything is fine. Please check whether this is really a package error or just the IDE being confused.
Spark Logging is available in Spark 1.5.2 and lower, but not in higher versions. So the dependencies in your pom.xml should look like this:
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.5.2</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>1.5.2</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>1.5.2</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>1.5.2</version>
</dependency>
</dependencies>
Please let me know if it works or not.
The below dependency worked fine for my case.
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.2.0</version>
<scope>provided</scope>
</dependency>
Pretty late to the party here, but I added
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.1.1</version>
<scope>provided</scope>
</dependency>
to solve this issue. It seems to work in my case.
Make sure you have the correct Spark version in the pom.xml.
Previously, I had a different version of Spark installed locally, which is why I was getting the error "Can not have access Spark.logging class" in IntelliJ.
In my case, changing the version from 2.4.2 to 2.4.3 solved it.
You can get the Spark and Scala versions from the spark-shell command.
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.4.3</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.4.3</version>
</dependency>
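To see which Spark classes are actually visible at runtime, a small probe like the one below can be run with the same classpath as the application. It is a sketch with hypothetical class and method names. It relies on the fact that Spark 2.x provides `org.apache.spark.internal.Logging`, while Spark 1.x provided `org.apache.spark.Logging`.

```java
// Hypothetical helper, for illustration only.
class ClasspathProbe {

    /** Returns true if the class can be found, without initializing it. */
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] candidates = {
            "org.apache.spark.internal.Logging", // location in Spark 2.x
            "org.apache.spark.Logging"           // location in Spark 1.x
        };
        for (String name : candidates) {
            System.out.println(name + " -> " + (isPresent(name) ? "found" : "missing"));
        }
    }
}
```

If only the Spark 1.x class is found, the Spark JARs on the classpath are older than the version the application was compiled against.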

Can't run feature in Cucumber

I'm having issues running a feature in Cucumber. The feature is very basic, as it's from a tutorial.
It is not defined and is as follows:
Feature: Proof that my concept works
  Scenario: My first test
    Given this is my first step
    When this is my second step
    Then this is my final step
My Cucumber runner class is as follows:
package cucumber;

import org.junit.runner.RunWith;
import cucumber.api.junit.Cucumber;

// JUnit runner: picks up the feature files under src/cucumber/
@RunWith(Cucumber.class)
@Cucumber.Options(
        format = {"pretty", "json:target/"},
        features = {"src/cucumber/"}
)
public class CucumberRunner {
}
Also the external .jar files that I have in the project are as follows:
The exception that I'm getting is:
Exception in thread "main" cucumber.runtime.CucumberException: Failed to instantiate public cucumber.runtime.java.JavaBackend(cucumber.runtime.io.ResourceLoader) with [cucumber.runtime.io.MultiLoader@75d837b6]
I've tried to look around online for the solution to this problem but have not had any luck.
I've also discussed with the OP of the tutorial and I'm still awaiting feedback but it has been a while.
I ran into a similar issue and got the same error as you did.
First, specify the path to the feature file:
features = {"src/cucumber/myfile.feature"}
Anyway, that wasn't what caused the error.
To just run your Cucumber runner class, the only dependencies you need are
cucumber-junit,
cucumber-java, and
junit.
I had an additional cucumber-guice dependency which was causing the problem; once I removed it, the error went away and the runner executed successfully.
From the image you linked, it looks like you are not using cucumber-guice, but I would still recommend that you remove other unnecessary Cucumber dependencies and try again.
1. I ran into this too a few days ago. The fix is simple: just remove cucumber-spring from the dependencies.
2. If that doesn't work, try updating cucumber-core, cucumber-junit, and cucumber-java all to version 1.2.3.
I believe the issue is that many of the cucumber add-ins, such as cucumber-testng, cucumber-spring, and (in my case) cucumber-guice, expect the corresponding module they link to be included as well. But apparently the cucumber experts decided not to include this dependency in their pom.xml files, so the problem doesn't manifest itself until runtime.
So (to answer Eugene S's question under LING's answer) if you want to actually use guice with cucumber, you need to also add guice itself as a dependency.
This worked for me, I hope it will work for you as well.
Update your Cucumber dependencies in pom.xml, i.e.
cucumber-java (1.2.2)
cucumber-jvm (1.2.2)
cucumber-junit (1.2.2)
And update your JUnit dependency as well (4.11).
The only reason for this error is that the versions of the Cucumber libraries are not all the same. It should be like this:
<dependency>
<groupId>io.cucumber</groupId>
<artifactId>cucumber-java8</artifactId>
<version>4.2.6</version>
</dependency>
<!-- https://mvnrepository.com/artifact/io.cucumber/cucumber-picocontainer -->
<dependency>
<groupId>io.cucumber</groupId>
<artifactId>cucumber-picocontainer</artifactId>
<version>4.2.6</version>
</dependency>
<!-- https://mvnrepository.com/artifact/io.cucumber/cucumber-testng -->
<dependency>
<groupId>io.cucumber</groupId>
<artifactId>cucumber-testng</artifactId>
<version>4.2.6</version>
<exclusions>
<exclusion>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
</exclusion>
</exclusions>
</dependency>
First thing: please use Cucumber v4.0.0 or later, as you are using a pretty old Cucumber dependency (v1.2.5).
Key point: do not mix direct and transitive dependencies, especially their versions! Doing so can cause unpredictable outcomes.
Solution: please remove cucumber-core, cucumber-java, cucumber-jvm-deps, gherkin, and cucumber-html. They're transitive dependencies and will be provided by your direct dependencies.
You can add the minimal set of Cucumber dependencies below.
<dependency>
<groupId>io.cucumber</groupId>
<artifactId>cucumber-junit</artifactId>
<version>4.2.6</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.cucumber</groupId>
<artifactId>cucumber-picocontainer</artifactId>
<version>4.2.6</version>
<scope>test</scope>
</dependency>
After spending a lot of time on this issue, I found that most of the errors I was receiving were due to dependency version mismatches. Adding these dependencies to the pom.xml file worked for me:
<!-- https://mvnrepository.com/artifact/junit/junit -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.13</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.cucumber</groupId>
<artifactId>cucumber-scala_2.11</artifactId>
<version>4.7.1</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/io.cucumber/cucumber-jvm -->
<dependency>
<groupId>io.cucumber</groupId>
<artifactId>cucumber-jvm</artifactId>
<version>4.8.1</version>
<type>pom</type>
</dependency>
<!-- https://mvnrepository.com/artifact/io.cucumber/cucumber-junit -->
<dependency>
<groupId>io.cucumber</groupId>
<artifactId>cucumber-junit</artifactId>
<version>4.8.1</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/io.cucumber/cucumber-java8 -->
<dependency>
<groupId>io.cucumber</groupId>
<artifactId>cucumber-java8</artifactId>
<version>4.8.1</version>
</dependency>

Difference between org.apache.hbase hbase and hbase-client

I'm working on a project that uses HBase, and I need an API for working with both data (Put and other classes) and schema (HTable, HColumn, etc.). I found two Maven dependencies for this:
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
</dependency>
and
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase</artifactId>
</dependency>
They both have similar classes, so which library should I use, and what is the difference between them?
