Using Java, I start a basic Spark app using:
SparkConf conf = new SparkConf().setAppName("myApp").setMaster("local");
JavaSparkContext javaSparkContext = new JavaSparkContext(conf);
javaSparkContext.setLogLevel("INFO");
SQLContext sqlContext = new SQLContext(javaSparkContext);
I try to make the system a little less verbose by adding the setLogLevel() call, but it has no effect: I still get a lot of DEBUG output.
Ideally, I would like to shut off all org.apache.spark.* except errors...
Update #1:
Here is my pom.xml:
<dependencies>
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>5.1.6</version>
</dependency>
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-core</artifactId>
<version>5.2.0.Final</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.6.2</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>1.6.2</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.databricks</groupId>
<artifactId>spark-csv_2.10</artifactId>
<version>1.4.0</version>
</dependency>
</dependencies>
There is a file conf/log4j.properties.template; copy it and modify it according to your logging needs:
cd spark/conf
cp log4j.properties.template log4j.properties
Adding this line to log4j.properties should work:
log4j.logger.org.apache.spark=ERROR
[Edit]
If it is a Maven Java project running standalone Spark, copy log4j.properties to src/main/resources (or to src/test/resources if it is for test cases) and modify it accordingly.
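Alternatively, the level can be set programmatically. This is only a minimal sketch, assuming log4j 1.x (which Spark 1.6 ships with), placed before the SparkContext is created:
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

// Silence everything under org.apache.spark except errors;
// the root logger level is left untouched here.
Logger.getLogger("org.apache.spark").setLevel(Level.ERROR);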
Related
I am trying to use the Selenium API with Gradle. This is my build.gradle dependencies section:
dependencies {
compile 'com.google.api-client:google-api-client:1.23.0'
compile 'com.google.oauth-client:google-oauth-client-jetty:1.23.0'
compile 'com.google.apis:google-api-services-sheets:v4-rev506-1.23.0'
compile group: 'org.seleniumhq.selenium', name: 'selenium-java', version: '2.9.0'
compile group: 'org.seleniumhq.selenium', name: 'selenium-chrome-driver', version: '2.9.0' }
My Selenium Java code:
System.setProperty("webdriver.chrome.driver", "C:\\Program Files(x86)\\Google\\Chrome\\Application\\chrome.exe");
WebDriver driver = new ChromeDriver();
The code works fine, and I am able to get the Chrome browser opened.
However, in build.gradle I am using version 2.9.0 of selenium-java and selenium-chrome-driver. If I try to use any version after 2.9.0, the WebDriver driver = new ChromeDriver(); call gives me the error below:
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkState(ZLjava/lang/String;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)V
at org.openqa.selenium.remote.service.DriverService.findExecutable(DriverService.java:124)
at org.openqa.selenium.chrome.ChromeDriverService.access$000(ChromeDriverService.java:32)
at org.openqa.selenium.chrome.ChromeDriverService$Builder.findDefaultExecutable(ChromeDriverService.java:137)
at org.openqa.selenium.remote.service.DriverService$Builder.build(DriverService.java:339)
at org.openqa.selenium.chrome.ChromeDriverService.createDefaultService(ChromeDriverService.java:88)
at org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:123)
at Quickstart.main(Quickstart.java:130)
I tried looking for which Gradle/Maven and Selenium versions are supported together, but was not able to find any good info. Any ideas?
Try updating your Guava dependency to:
<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>27.1-jre</version>
</dependency>
That should solve your issue.
This error message...
Exception in thread "main" java.lang.NoSuchMethodError:
com.google.common.base.Preconditions.checkState(ZLjava/lang/String;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)V
...implies that the Java Client was unable to find ChromeDriver().
Issue & Solution
As per the Selenium Java code you have shared, the System.setProperty() line is meant to set the path to the ChromeDriver binary, not the path to the Chrome browser binary. You have to download the ChromeDriver binary from the ChromeDriver - WebDriver for Chrome page, place it on your system, and pass its absolute path via System.setProperty(). Hence you have to change:
System.setProperty("webdriver.chrome.driver", "C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe");
WebDriver driver = new ChromeDriver();
To:
System.setProperty("webdriver.chrome.driver", "C:\\path\\to\\chromedriver.exe");
WebDriver driver = new ChromeDriver();
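Separately, if you ever do need to point Selenium at a non-default Chrome installation as well, that is configured through ChromeOptions rather than the webdriver.chrome.driver property. A minimal sketch, assuming a recent Selenium release and hypothetical Windows paths:
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

// The system property points at chromedriver.exe; the options point at chrome.exe.
System.setProperty("webdriver.chrome.driver", "C:\\path\\to\\chromedriver.exe");
ChromeOptions options = new ChromeOptions();
options.setBinary("C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe");
WebDriver driver = new ChromeDriver(options);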
I have the exact same problem (I am using Maven though).
I noticed that the error occurs when one of com.google.api-client, com.google.oauth-client, or com.google.apis:google-api-services-sheets is used alongside org.seleniumhq.selenium.
The problem is that these dependencies pull in different versions of the com.google.guava:guava artifact.
In order to solve the error, you should explicitly depend on a single com.google.guava:guava artifact.
So go ahead and add the following in your build.gradle:
compile 'com.google.guava:guava:27.0.1-jre'
Just wanted to post here in case anyone else comes to this from Google like I did. For whatever reason, I needed to run with sudo. I was having issues using the npm selenium-standalone package and running:
/node_modules/selenium-standalone/bin/selenium-standalone start
and it would show that error. What fixed it was running with sudo:
sudo /node_modules/selenium-standalone/bin/selenium-standalone start
I don't think I needed to do this before but suddenly it's the only way it works now.
I had the same problem and ran a dependency check and found that there were conflicts. The solution that worked for me was to exclude the conflicting dependencies.
Your project will probably have different dependencies than mine. So, listing the specific conflicts in my project may not be helpful.
Copy and paste the following dependencies into pom.xml and then do a Maven build:
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.12</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.testng/testng -->
<dependency>
<groupId>org.testng</groupId>
<artifactId>testng</artifactId>
<version>7.1.0</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/io.rest-assured/rest-assured -->
<dependency>
<groupId>io.rest-assured</groupId>
<artifactId>rest-assured</artifactId>
<version>3.0.0</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/io.rest-assured/json-path -->
<dependency>
<groupId>io.rest-assured</groupId>
<artifactId>json-path</artifactId>
<version>3.0.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/io.rest-assured/json-schema-validator -->
<dependency>
<groupId>io.rest-assured</groupId>
<artifactId>json-schema-validator</artifactId>
<version>3.0.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/io.rest-assured/xml-path -->
<dependency>
<groupId>io.rest-assured</groupId>
<artifactId>xml-path</artifactId>
<version>3.0.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.hamcrest/java-hamcrest -->
<dependency>
<groupId>org.hamcrest</groupId>
<artifactId>java-hamcrest</artifactId>
<version>2.0.0.0</version>
<scope>test</scope>
</dependency>
<!-- cucumber dependency begins -->
<!-- https://mvnrepository.com/artifact/net.masterthought/cucumber-reporting -->
<dependency>
<groupId>net.masterthought</groupId>
<artifactId>cucumber-reporting</artifactId>
<version>4.7.0</version>
</dependency>
<!-- starts here -->
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-core</artifactId>
<version>1.2.5</version>
</dependency>
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-java</artifactId>
<version>1.2.5</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-jvm</artifactId>
<version>1.2.5</version>
<type>pom</type>
</dependency>
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-junit</artifactId>
<version>1.2.5</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-jvm-deps</artifactId>
<version>1.0.5</version>
</dependency>
<!-- https://mvnrepository.com/artifact/info.cukes/cucumber-html -->
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-html</artifactId>
<version>0.2.3</version>
</dependency>
<!-- https://mvnrepository.com/artifact/info.cukes/gherkin -->
<dependency>
<groupId>info.cukes</groupId>
<artifactId>gherkin</artifactId>
<version>2.12.2</version>
<scope>provided</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/io.cucumber/cucumber-testng -->
<dependency>
<groupId>io.cucumber</groupId>
<artifactId>cucumber-testng</artifactId>
<version>5.4.2</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.theoryinpractise/cucumber-testng-factory -->
<dependency>
<groupId>com.theoryinpractise</groupId>
<artifactId>cucumber-testng-factory</artifactId>
<version>1.0.1</version>
</dependency>
<!-- https://stackoverflow.com/questions/49021707/java-lang-nosuchmethoderror-com-google-common-base-preconditions-checkstatezlj?rq=1 -->
<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>27.1-jre</version>
</dependency>
</dependencies>
Just adding the dependency below was not enough:
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>27.1-jre</version>
</dependency>
It was not working at first. Then I moved this dependency higher in pom.xml than the junit dependency and it worked. So make sure that in the POM file it appears above junit, testng, or whatever runner you are using.
A comment in this post helped
Adding the Guava dependency and the selenium-chrome-driver dependency worked for me:
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>25.0-jre</version>
</dependency>
<dependency>
<groupId>org.seleniumhq.selenium</groupId>
<artifactId>selenium-chrome-driver</artifactId>
<version>3.141.59</version>
<scope>test</scope>
</dependency>
The problem might also be the inclusion of google-collections:
// https://mvnrepository.com/artifact/com.google.collections/google-collections
implementation 'com.google.collections:google-collections:1.0'
I had this included via one library that I linked in, and it drove me nuts finding the reason.
Adding the -verbose:class JVM parameter helped pinpoint the culprit.
I'm trying to connect Spark and a Cassandra database using Java. To connect Spark and Cassandra I'm using the latest version of the Spark Cassandra Connector, i.e. 2.4.0. Currently I can connect Spark and Cassandra using the connector, and I get data back in RDD form, but I cannot read the data from that structure. If I pass a row reader factory as the third parameter of cassandraTable(), I get:
> Wrong 3rd argument type. Found:
> 'java.lang.Class<com.journaldev.sparkdemo.JohnnyDeppDetails>',
> required:
> 'com.datastax.spark.connector.rdd.reader.RowReaderFactory<T>'
Can anyone tell me which version I should use, or what the problem is here?
CassandraTableScanJavaRDD pricesRDD2 =
CassandraJavaUtil.javaFunctions(sc).cassandraTable(keyspace,table,JohnnyDeppDetails.class);
My pom.xml:
<!-- Import Spark -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.4.0</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.11</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector -->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.11</artifactId>
<version>2.4.0</version>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-java_2.10</artifactId>
<version>1.5.0-M2</version>
</dependency>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>2.1.9</version>
</dependency>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-mapping</artifactId>
<version>2.1.9</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>2.4.0</version>
</dependency>
</dependencies>
Instead of passing the class instance, you need to create a RowReaderFactory using the mapRowTo function, like this (this is from my example):
CassandraJavaRDD<UUIDData> uuids = javaFunctions(spark.sparkContext())
.cassandraTable("test", "utest", mapRowTo(UUIDData.class));
When you write back, you can convert the class into the corresponding factory via the mapToRow function.
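For the write direction, a minimal sketch, assuming the same test/utest keyspace and table and the UUIDData bean from the read example above:
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;

// mapToRow builds the RowWriterFactory for writes, just as mapRowTo
// builds the RowReaderFactory for reads.
javaFunctions(uuids)
    .writerBuilder("test", "utest", mapToRow(UUIDData.class))
    .saveToCassandra();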
I was trying a very basic hello world program for java+spark+cassandra.
Initially I had some mixed library versions, which caused the NoSuchMethodError (#5). When I got the versions right, I got the NoClassDefFoundError for Spark logging (#4). This comes from the Cassandra connector code; I built it from the b2.3 branch on GitHub (using sbt), which is only a couple of commits behind master.
All the solutions for the Spark logging issue point to moving to older versions. That is not practical for us, as we need to figure this out for future development.
I wonder why the latest stable build of the Cassandra connector refers to Spark logging, which is no longer available?
Any help is appreciated.
Spark version: 2.3.0
Cassandra: 3.9.0
The relevant code snippet is pasted below.
#1 SparkConf sparkConf = new SparkConf().setAppName("appname")
.setMaster("local");
#2 sparkConf.set("spark.cassandra.connection.host", "127.0.0.1");
#3 JavaSparkContext ctx = new JavaSparkContext(sparkConf);
#4 CassandraConnector connector = CassandraConnector.apply(ctx.getConf()); <<<< org/apache/spark/logging noclassdeffound error
#5 try (Session session = connector.openSession()) { <<< nosuchmethoderror: scala.runtime.objectref.zero()lscala/runtime/objectref
The POM is below
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.mygroup</groupId>
<artifactId>apache-spark</artifactId>
<version>1.0-SNAPSHOT</version>
<packaging>jar</packaging>
<name>apache-spark</name>
<url>http://maven.apache.org</url>
<dependencies>
<dependency>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
<version>1.1.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.2.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.3.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>2.2.1</version>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-java_2.11</artifactId>
<version>1.6.0-M1</version>
</dependency>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>3.5.0</version>
</dependency>
<dependency>
<groupId>org.apache.thrift</groupId>
<artifactId>libthrift</artifactId>
<version>0.11.0</version>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.17</version>
</dependency>
</dependencies>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<org.apache.spark.spark-core.version>2.2.1</org.apache.spark.spark-core.version>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
</properties>
The first thing to fix is this dependency:
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-java_2.11</artifactId>
<version>1.6.0-M1</version>
</dependency>
It does not match any of your other build versions. The Java module was merged into the main artifact. You also shouldn't include the Java driver module on its own, since it will most likely have issues with Guava inclusions.
Take a look at
https://github.com/datastax/SparkBuildExamples/blob/master/scala/maven/oss/pom.xml for example pom files.
I'm trying to communicate with HBase using Spark. I'm using the code below:
SparkConf sparkConf = new SparkConf().setAppName("HBaseRead");
JavaSparkContext jsc = new JavaSparkContext(sparkConf);
Configuration conf = HBaseConfiguration.create();
conf.addResource(new Path("/etc/hbase/conf/core-site.xml"));
conf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"));
JavaHBaseContext hbaseContext = new JavaHBaseContext(jsc, conf);
Scan scan = new Scan();
scan.setCaching(100);
JavaRDD<Tuple2<ImmutableBytesWritable, Result>> hbaseRdd = hbaseContext.hbaseRDD(TableName.valueOf("climate"), scan);
System.out.println("Number of Records found : " + hbaseRdd.count());
If I execute this, I get the following error:
Exception in thread "dag-scheduler-event-loop" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/regionserver/StoreFileWriter
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.getDeclaredMethod(Class.java:2128)
at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1475)
at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:498)
at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
at java.security.AccessController.doPrivileged(Native Method)
at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:472)
at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369)
...
I did not find any solution via Google. Does anyone have an idea?
--------edit--------
I'm using Maven. My POM looks like:
<dependencies>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-server</artifactId>
<version>1.3.0</version>
</dependency>
<dependency>
<groupId>org.sharegov</groupId>
<artifactId>mjson</artifactId>
<version>1.4.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.5.2</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>1.5.2</version>
</dependency>
<dependency>
<groupId>com.databricks</groupId>
<artifactId>spark-csv_2.10</artifactId>
<version>1.5.0</version>
</dependency>
<dependency>
<groupId>com.databricks</groupId>
<artifactId>spark-xml_2.10</artifactId>
<version>0.3.5</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-spark</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
</dependencies>
I'm building my application with dependencies using the maven-assembly-plugin.
You are getting the NoClassDefFoundError because Spark is not able to find the HBase jars on the classpath. You need to supply the required jars to spark-submit explicitly with the --jars parameter when launching the job:
${SPARK_HOME}/bin/spark-submit \
--jars ${..add hbase jars comma separated...}
--class ....
.........
I am trying to do some simple Spark SQL programming in Java. In the program, I get data from a Cassandra table, convert the RDD into a Dataset, and display the data. When I run the spark-submit command, I get the error: java.lang.ClassNotFoundException: org.apache.spark.internal.Logging.
My program is:
SparkConf sparkConf = new SparkConf().setAppName("DataFrameTest")
.set("spark.cassandra.connection.host", "abc")
.set("spark.cassandra.auth.username", "def")
.set("spark.cassandra.auth.password", "ghi");
SparkContext sparkContext = new SparkContext(sparkConf);
JavaRDD<EventLog> logsRDD = javaFunctions(sparkContext).cassandraTable("test", "log",
mapRowTo(Log.class));
SparkSession sparkSession = SparkSession.builder().appName("Java Spark SQL").getOrCreate();
Dataset<Row> logsDF = sparkSession.createDataFrame(logsRDD, Log.class);
logsDF.show();
My POM dependencies are:
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.0.2</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>2.0.2</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.11</artifactId>
<version>1.6.3</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.0.2</version>
</dependency>
</dependencies>
My spark-submit command is: /home/ubuntu/spark-2.0.2-bin-hadoop2.7/bin/spark-submit --class "com.jtv.spark.dataframes.App" --master local[4] spark.dataframes-0.1-jar-with-dependencies.jar
How do I solve this error? Downgrading to 1.5.2 does not work as 1.5.2 does not have org.apache.spark.sql.Dataset and org.apache.spark.sql.SparkSession.
This may be a problem with your IDE. Since some of these packages are written in Scala while the project is Java, the IDE is sometimes unable to understand what is going on. I am using IntelliJ and it keeps displaying this message to me, but when I run mvn test or mvn package everything is fine. Please check whether this is really a package error or just the IDE getting lost.
Spark Logging is available in Spark version 1.5.2 and lower, but not in higher versions. So your dependencies in pom.xml should look like this:
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.5.2</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>1.5.2</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>1.5.2</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>1.5.2</version>
</dependency>
</dependencies>
Please let me know if it works or not.
The dependency below worked fine in my case:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.2.0</version>
<scope>provided</scope>
</dependency>
Pretty late to the party here, but I added
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.1.1</version>
<scope>provided</scope>
</dependency>
to solve this issue. It seems to work in my case.
Make sure you have the correct Spark version in pom.xml.
Previously I had a different version of Spark locally, and that is why I was getting the error in the IntelliJ IDE: "Cannot access Spark.logging class".
In my case, changing it from 2.4.2 to 2.4.3 solved it.
You can get the Spark version and Scala version from the spark-shell command.
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.4.3</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.4.3</version>
</dependency>