Assume part of my code is as follows,
where doc comes from a List[Document] that contains stu_name and roll_number.
Sometimes stu_name and roll_number may be null.
I used Try to avoid a NullPointerException in the first two lines,
but why am I getting a NullPointerException again at val myRow?
val name = Try {Option.apply(doc.getFieldValue("stu_name"))}.getOrElse(null)
val rollNumber = Try {Option.apply(doc.getFieldValue("roll_number"))}.getOrElse(null)
val myRow = (
  doc.getFieldValue("ID").asInstanceOf[Int], // can't be null
  name.getOrElse(null).toString, // NullPointerException
  rollNumber.getOrElse(null).asInstanceOf[Int] // NullPointerException
)
.....
.....
I am getting the following error:
[2016-01-14 22:40:16,896] WARN o.a.s.s.TaskSetManager [] [akka://JobServer/user/context-supervisor/demeter] - Lost task 0.0 in stage 0.0 (TID 0, 10.29.23.136): java.lang.NullPointerException
at com.test.events.Monitoring$$anonfun$geteventTableReplicateDayFunc$1.apply(Monitoring.scala:75)
at com.test.events.Monitoring$$anonfun$geteventTableReplicateDayFunc$1.apply(Monitoring.scala:57)
at com.test.events.Monitoring$$anonfun$27.apply(Monitoring.scala:104)
at com.test.events.Monitoring$$anonfun$27.apply(Monitoring.scala:104)
I tried the following in the console but did not see any error:
scala> val a = Try (Option.apply("atar")).getOrElse(null)
a: Option[String] = Some(atar)
scala> a.getOrElse(null)
res16: String = atar
scala> val a = Try (Option.apply(null)).getOrElse(null)
a: Option[Null] = None
scala> a.getOrElse(null)
res17: Null = null
This is all wrong. By using getOrElse(null) you are removing all the advantages of using an Option in the first place, and you are generating much more complexity than needed.
You need to decide what you will do when the values are null. This version simply keeps them as Options (None on null input):
val myRow = (
  doc.getFieldValue("ID").toInt, // Fails if null
  Option(doc.getFieldValue("stu_name")), // `None` if null
  Option(doc.getFieldValue("roll_number")).map(_.toInt) // `None` if null
)
Or use default values:
val myRow = (
  doc.getFieldValue("ID").toInt,
  Option(doc.getFieldValue("stu_name")).getOrElse("default"),
  Option(doc.getFieldValue("roll_number")).map(_.toInt).getOrElse(0)
)
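For the record, the NullPointerException in the original code most likely comes from name.getOrElse(null).toString: when the field is null, name is None, getOrElse(null) yields null, and calling .toString on that null throws. The console test above stops one step short of reproducing it:
scala> val a = Try(Option.apply(null)).getOrElse(null)
a: Option[Null] = None
scala> a.getOrElse(null).toString
java.lang.NullPointerException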
I'm trying to make a generic value type in my HashMap like so:
import scala.collection.mutable.{ArrayBuffer, HashMap}

val aMap = ArrayBuffer[HashMap[String, Any]]()
aMap += HashMap()
aMap(0)("aKey") = "aStringVal"
aMap(0)("aKey2") = true // a bool value
aMap(0)("aKey3") = 23 // an int value
This works in my spark-shell but it gives me this ClassNotFoundException on scala.Any in my IntelliJ Project:
org.apache.spark.streaming.scheduler.JobScheduler logError - Error running job streaming job 1521859195000 ms.0
java.lang.ClassNotFoundException: scala.Any
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
I'm using Scala 2.11. Any ideas what could be causing this?
What this ended up being for me was creating a DataFrame with mixed data types using .toDF.
I had:
val baseDataFrame = Seq(
  ("value1", "one"),
  ("value2", 2),
  ("value3", 3)
).toDF("column1", "column2")
and this change fixed the issue:
val baseDataFrame = Seq(
  ("value1", "one"),
  ("value2", "2"),
  ("value3", "3")
).toDF("column1", "column2")
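The root cause is Scala's type inference: mixing a String and an Int widens the tuples' second element type to Any, and Spark has no encoder for scala.Any, which is what surfaces as the ClassNotFoundException. A quick REPL check (plain Scala, no Spark needed) shows the inferred type:
scala> val rows = Seq(("value1", "one"), ("value2", 2), ("value3", 3))
rows: Seq[(String, Any)] = List((value1,one), (value2,2), (value3,3))
Quoting the numbers makes every element a (String, String), which toDF can encode.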
I accessed a MySQL database and fetched a table.
Everything works fine up to that point.
When I try to save the records as text or in other formats, I get the error:
ExitCodeException exitCode=1: ChangeFileModeByMask error (5): Access is denied.
Any help will be appreciated.
import java.sql.DriverManager
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.JdbcRDD

object jdbcConnect {
  def main(args: Array[String]) {
    val url = "jdbc:mysql://127.0.0.1:3306/mydb"
    val username = "root"
    val password = "token_password"
    Class.forName("com.mysql.jdbc.Driver").newInstance
    //DriverManager.registerDriver(new com.mysql.jdbc.Driver());
    val conf = new SparkConf().setAppName("JDBC RDD").setMaster("local[2]").set("spark.executor.memory", "1g")
    val sc = new SparkContext(conf)
    val myRDD = new JdbcRDD(sc,
      () => DriverManager.getConnection(url, username, password),
      "select s_Id,issue_date from store_details limit ?, ?",
      0, 10, 1,
      r => r.getString("s_Id") + ", " + r.getString("issue_date"))
    myRDD.foreach(println)
    myRDD.saveAsTextFile("C:/jdbcrddexamplee")
  }
}
Error
17/07/18 11:10:19 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
ExitCodeException exitCode=1: ChangeFileModeByMask error (5): Access is denied.
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
    at org.apache.hadoop.util.Shell.run(Shell.java:479)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:866)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:849)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:733)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:225)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
    at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:307)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:296)
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:328)
It turned out to be a permission error. My foolishness...
Make sure to run everything as an admin. Though I would suggest using a DataFrame instead of an RDD. :D
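For completeness, a minimal sketch of the DataFrame route suggested above, assuming Spark 2.x (SparkSession) and reusing the connection details from the question; the output path is illustrative:
import org.apache.spark.sql.SparkSession

object JdbcConnectDF {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("JDBC DF")
      .master("local[2]")
      .getOrCreate()

    // Spark's built-in JDBC source handles connections and column types for you
    val storeDetails = spark.read
      .format("jdbc")
      .option("url", "jdbc:mysql://127.0.0.1:3306/mydb")
      .option("driver", "com.mysql.jdbc.Driver")
      .option("dbtable", "store_details")
      .option("user", "root")
      .option("password", "token_password")
      .load()

    // Write the two columns from the question as CSV (illustrative path;
    // on Windows, run with permissions to create it, per the answer above)
    storeDetails.select("s_Id", "issue_date")
      .write
      .csv("C:/jdbcdfexample")

    spark.stop()
  }
}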
I am new to Scala and Java altogether and am trying to run a sample producer. All it does is take some raw products and referrers stored in CSV files and use rnd to generate some random log lines. Following is my code:
import java.io.FileWriter
import scala.util.Random

object LogProducer extends App {
  //WebLog config
  val wlc = Settings.WebLogGen

  val Products = scala.io.Source.fromInputStream(getClass.getResourceAsStream("/products.csv")).getLines().toArray
  val Referrers = scala.io.Source.fromInputStream(getClass.getResourceAsStream("/referrers.csv")).getLines().toArray
  val Visitors = (0 to wlc.visitors).map("Visitors-" + _)
  val Pages = (0 to wlc.pages).map("Pages-" + _)

  val rnd = new Random()
  val filePath = wlc.filePath
  val fw = new FileWriter(filePath, true)

  //adding randomness to time increments for demo
  val incrementTimeEvery = rnd.nextInt(wlc.records - 1) + 1

  var timestamp = System.currentTimeMillis()
  var adjustedTimestamp = timestamp

  for (iteration <- 1 to wlc.records) {
    adjustedTimestamp = adjustedTimestamp + ((System.currentTimeMillis() - timestamp) * wlc.timeMultiplier)
    timestamp = System.currentTimeMillis()
    val action = iteration % (rnd.nextInt(200) + 1) match {
      case 0 => "purchase"
      case 1 => "add_to_cart"
      case _ => "page_view"
    }
    val referrer = Referrers(rnd.nextInt(Referrers.length - 1))
    val prevPage = referrer match {
      case "Internal" => Pages(rnd.nextInt(Pages.length - 1))
      case _ => ""
    }
    val visitor = Visitors(rnd.nextInt(Visitors.length - 1))
    val page = Pages(rnd.nextInt(Pages.length - 1))
    val product = Products(rnd.nextInt(Products.length - 1))
    val line = s"$adjustedTimestamp\t$referrer\t$action\t$prevPage\t$visitor\t$page\t$product\n"
    fw.write(line)
    if (iteration % incrementTimeEvery == 0) {
      //os.flush()
      println(s"Sent $iteration messages!")
      val sleeping = rnd.nextInt(incrementTimeEvery * 60)
      println(s"Sleeping for $sleeping ms")
      Thread.sleep(sleeping) // actually sleep, as the log line above says
    }
  }
  fw.close()
}
It is pretty straightforward: it basically generates some variables and adds them to the line.
However, I am getting a big exception stack trace which I am not able to understand:
"C:\Program Files\Java\jdk1.8.0_92\bin\java...
Exception in thread "main" java.nio.charset.MalformedInputException: Input length = 1
at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:161)
at java.io.BufferedReader.readLine(BufferedReader.java:324)
at java.io.BufferedReader.readLine(BufferedReader.java:389)
at scala.io.BufferedSource$BufferedLineIterator.hasNext(BufferedSource.scala:70)
at scala.collection.Iterator.foreach(Iterator.scala:929)
at scala.collection.Iterator.foreach$(Iterator.scala:929)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1417)
at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:59)
at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:50)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
at scala.collection.TraversableOnce.to(TraversableOnce.scala:310)
at scala.collection.TraversableOnce.to$(TraversableOnce.scala:308)
at scala.collection.AbstractIterator.to(Iterator.scala:1417)
at scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:302)
at scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:302)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1417)
at scala.collection.TraversableOnce.toArray(TraversableOnce.scala:289)
at scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:283)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1417)
at clickstream.LogProducer$.delayedEndpoint$clickstream$LogProducer$1(logProducer.scala:16)
at clickstream.LogProducer$delayedInit$body.apply(logProducer.scala:12)
at scala.Function0.apply$mcV$sp(Function0.scala:34)
at scala.Function0.apply$mcV$sp$(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App.$anonfun$main$1$adapted(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:389)
at scala.App.main(App.scala:76)
at scala.App.main$(App.scala:74)
at clickstream.LogProducer$.main(logProducer.scala:12)
at clickstream.LogProducer.main(logProducer.scala)
Process finished with exit code 1
Can someone please help me identify what the exception means? Thanks, all.
So it wasn't hard... it was my amateurish knowledge. It was a simple I/O exception where IntelliJ wasn't able to get the values from my CSV file. When I imported it into the resources root directory, it gave me a warning message about wrong encoding.
The error was at this point:
val Products = scala.io.Source.fromInputStream(getClass.getResourceAsStream("/products.csv")).getLines().toArray
Thanks for the efforts, though.
It was an encoding issue. For Scala, a quick fix would be to replace:
val Products = scala.io.Source.fromInputStream(getClass.getResourceAsStream("/products.csv")).getLines().toArray
val Referrers = scala.io.Source.fromInputStream(getClass.getResourceAsStream("/referrers.csv")).getLines().toArray
with this:
val Products = scala.io.Source.fromInputStream(getClass.getResourceAsStream("/products.csv"))("UTF-8").getLines().toArray
val Referrers = scala.io.Source.fromInputStream(getClass.getResourceAsStream("/referrers.csv"))("UTF-8").getLines().toArray
For Java and more details, please check out this link: http://biercoff.com/malformedinputexception-input-length-1-exception-solution-for-scala-and-java/
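If the files may genuinely contain bytes that are invalid in the chosen charset, another option (a sketch using scala.io.Codec, not from the original answer) is to supply an implicit codec that replaces malformed input instead of throwing:
import java.nio.charset.CodingErrorAction
import scala.io.{Codec, Source}

// Decode as UTF-8, but substitute the replacement character for any byte
// sequence that is not valid UTF-8 instead of throwing MalformedInputException.
implicit val codec: Codec = Codec("UTF-8")
  .onMalformedInput(CodingErrorAction.REPLACE)
  .onUnmappableCharacter(CodingErrorAction.REPLACE)

val Products = Source.fromInputStream(getClass.getResourceAsStream("/products.csv")).getLines().toArray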
I am trying to write a Scala program to generate an output file omega0_Real.txt which contains pre-calculated values of the cosine function for inputs ranging from 0 to pi/2 radians. Each of these calculated values is 72 bits long and is stored in hex format. The code I have written so far is as follows:
import java.io._
import scala.math._

object omega0_Real {
  def main(args: Array[String]) {
    val arg = (0.0).to(2 - pow(2, -10), pow(2, -10))
    val cosArg = arg.map(i => cos(i))
    val cosBit = cosArg.map(i => List.tabulate(72)(j => BigDecimal(i * pow(2, j)).toBigInt % 2))
    val cosStr = cosBit.map(i => i.mkString)
    val cosBig = cosStr.map(i => BigInt(i, 2))
    val cosBigStr = cosBig.map(i => i.toString(16))
    val cosList = cosBigStr.toList
    val file = "omega0_Real.txt"
    val writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file)))
    for (x <- cosList) {
      writer.write(x + "\n")
    }
    writer.close()
  }
}
This gives the error java.lang.NumberFormatException: Illegal embedded sign character, followed by many others. Please help me debug this code.
PS: I ran this code line by line in the sbt console and it did not give any error, although the values generated were erroneous.
I have the following code. We are not using System.out.println statements but have to use a logger to print to the console.
Here is example code (Java):
public void printColumnStats() {
    java.util.logging.Logger log = java.util.logging.Logger
            .getLogger("ProfileStatusClass");
    log.setLevel(Level.ALL);
    ConsoleHandler handler = new ConsoleHandler();
    handler.setFormatter(new MyFormatter());
    handler.setLevel(Level.ALL);
    log.addHandler(handler);

    // This will print the current Column Profiling stats
    log.fine("FieldName : " + this.datasetFieldName);
    log.fine("Field index : " + this.fieldIndex);
    NumberFormat formatter = new DecimalFormat("#0.00000000");
    if (this.fieldType.equalsIgnoreCase("number")) {
        log.fine("Field Null Count : " + this.datasetFieldNullCount);
        log.fine("Field Valid/Obs Count : " + this.datasetFieldObsCount);
        log.fine("Field Min : " + (0l + this.datasetFieldMin));
        ...
I have the following call for it (sorry, this part is in Scala, but it should be straightforward):
for (e <- tResults) {
  e._2.printColumnStats()
  println("++........................................................++")
}
What I am getting is tons of repeats before the next set of stats comes up, even though there is just one of each type in the loop:
Field Null Count : 0.0
Field Null Count : 0.0
Field Null Count : 0.0
Field Null Count : 0.0
Field Null Count : 0.0
Field Null Count : 0.0
Field Null Count : 0.0
Field Null Count : 0.0
Field Null Count : 0.0
Field Null Count : 0.0
Field Null Count : 0.0
Field Null Count : 0.0
Field Null Count : 0.0
Field Null Count : 0.0
Field Null Count : 0.0
You are adding a new ConsoleHandler on every call to printColumnStats. You only want to install one handler. If you are going to use code to set up the logger, then move the setup code out of the printColumnStats function and into a static block:
private static final Logger log = Logger.getLogger("ProfileStatusClass");
static {
    log.setLevel(Level.ALL);
    ConsoleHandler handler = new ConsoleHandler();
    handler.setFormatter(new MyFormatter());
    handler.setLevel(Level.ALL);
    log.addHandler(handler);
}
By default, the JVM will install a ConsoleHandler on the root logger too. Your logger should call setUseParentHandlers(false) so you don't publish to that handler as well.
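Since the calling code in the question is Scala, here is a minimal sketch of the same one-time setup as a Scala object (java.util.logging works the same from Scala; it assumes the MyFormatter class from the question is on the classpath):
import java.util.logging.{ConsoleHandler, Level, Logger}

object ProfileStatusLogging {
  // An object's body runs once, on first reference, so the handler
  // is installed exactly one time, like a Java static block.
  val log: Logger = Logger.getLogger("ProfileStatusClass")
  log.setLevel(Level.ALL)

  private val handler = new ConsoleHandler()
  handler.setFormatter(new MyFormatter()) // MyFormatter is from the question
  handler.setLevel(Level.ALL)
  log.addHandler(handler)

  // Do not also publish to the root logger's default ConsoleHandler,
  // which would duplicate every record.
  log.setUseParentHandlers(false)
}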