I'm having a bit of a struggle with these writeObject/readObject methods.
Let's say I have a

trait AbstractPosition {
  def file: Path
  def start: String
  def end: String
}

with

class SourcePosition(val file: Path, val start: String, val end: String)
  extends AbstractPosition

object SourcePosition {
  def apply(file: Path, start: String, end: String) =
    new SourcePosition(file, start, end)
  def unapply(sp: SourcePosition) = Some((sp.file, sp.start, sp.end))
}
And say that I now have to store such positions to a file. The naive attempt fails because SourcePosition is not serializable:
java.io.NotSerializableException: ... .SourcePosition
So I rewrite:

trait AbstractPosition extends Serializable {
  def file: Path
  def start: String
  def end: String
}

class SourcePosition(@transient var fileArg: Path, val start: String, val end: String)
  extends AbstractPosition {

  private var fileString: String = null

  override def file: Path = this.fileArg

  @throws(classOf[IOException])
  private def writeObject(out: ObjectOutputStream): Unit = {
    fileString = file.toString
    out.defaultWriteObject()
  }

  @throws(classOf[IOException])
  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject()
    fileArg = Paths.get(fileString)
  }
}

object SourcePosition {
  def apply(file: Path, start: String, end: String) =
    new SourcePosition(file, start, end)
  def unapply(sp: SourcePosition) = Some((sp.file, sp.start, sp.end))
}
But to no avail:
java.io.NotSerializableException: sun.nio.fs.WindowsPath$WindowsPathWithAttributes
What am I doing wrong?
And how can I achieve what I'm trying to do?
Make your SourcePosition a case class: it's a perfect candidate, as it's fully immutable. Case classes are serializable by default, without all this writeObject/readObject stuff. As a bonus, you get apply/unapply methods generated automatically by scalac.
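For illustration, a minimal sketch of the case-class route. Backing the file with a String is my assumption, not the answerer's code; it keeps default serialization working, since Path implementations are not serializable:

import java.nio.file.{Path, Paths}

// Sketch only: keep the path as a String so the default serialized form
// contains no Path; derive the Path on demand.
case class SourcePosition(fileName: String, start: String, end: String)
    extends AbstractPosition {
  override def file: Path = Paths.get(fileName)
}

object SourcePosition {
  // Convenience overload matching the original apply(file: Path, ...)
  def apply(file: Path, start: String, end: String): SourcePosition =
    SourcePosition(file.toString, start, end)
}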
The above actually seems to work.
The problem appears to have been that I had overlooked a val using file. Changing that val to a def allowed me to serialize SourcePosition.
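To illustrate the root cause described above (a hypothetical reconstruction, since the offending val isn't shown): a val computed from file stores its result in a field, so a Path ends up in the serialized form even though fileArg is @transient; a def creates no field.

class SourcePosition(@transient var fileArg: Path, val start: String, val end: String)
  extends AbstractPosition {
  override def file: Path = this.fileArg
  // val parentDir: Path = file.getParent // a val like this adds a Path field => NotSerializableException
  def parentDir: Path = file.getParent    // a def adds no field; nothing extra is serialized
}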
I have a dataframe a2 in Scala:

val a3 = a2.select(printme.apply(col("PlayerReference")))

The column PlayerReference contains a string. The select calls a UDF:
val printme = udf({
  st: String =>
    val x = new JustPrint(st)
    x.printMe()
})
This UDF calls a Java class:
public class JustPrint {
    private String ss = null;

    public JustPrint(String ss) {
        this.ss = ss;
    }

    public void printMe() {
        System.out.println("Value : " + this.ss);
    }
}
but I get this error for the UDF:
java.lang.UnsupportedOperationException: Schema for type Unit is not supported
The goal of this exercise is to validate the chain of calls.
What should I do to solve this problem ?
The reason you're getting this error is that your UDF doesn't return anything, which, in Spark terms, is Unit.
What you should do depends on what you actually want, but, assuming you just want to track values coming through your UDF, you should either change printMe so it returns a String, or change the UDF.
Like this:
public String printMe() {
    System.out.println("Value : " + this.ss);
    return this.ss;
}
or like this:
val printme = udf({
  st: String =>
    val x = new JustPrint(st)
    x.printMe()
    st // return the input so the UDF's result type is String, not Unit
})
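With either fix the UDF yields a String, so the original select works unchanged. A short usage sketch, assuming the dataframe a2 and the column name from the question:

import org.apache.spark.sql.functions.{col, udf}

val a3 = a2.select(printme(col("PlayerReference")))
a3.show() // triggers the UDF: prints each value via JustPrint and shows the returned strings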
I have two Scala objects:
common_code
dependent_code
In common_code I have one method in which I write my common code and declare some variables. I want to use these variables and code in my 2nd object, but when I try to access the variables I get a "not found: value Table_Name" error.
I'm using the code below.
object comman_code {
  def common_method(args: Array[String]) {
    val properties: Properties = new Properties()
    val hdfsConf = new Configuration()
    val fs: FileSystem = FileSystem.get(hdfsConf)
    val is = fs.open(new Path(args(0)))
    properties.load(is)
    // created sparkSesssion
    // Table_Name I want to use in the 2nd program
    val Table_Name = properties.getProperty("Table_Name")
  }
}

object dependent_code {
  def main(args: Array[String]): Unit = {
    val common_method = comman_code.common_method(args)
    val mydf = sparksesssion.sql(s"select * from ${Table_Name}").show() // not able to access: not found: value Table_Name
  }
}
Can someone please suggest how I can access the Table_Name variable in my other object?
As you are working with Scala objects, they are instantiated automatically, and you can access them as shown below.
object common_code {
  def common_method(args: Array[String]): String = {
    val properties: Properties = new Properties()
    val hdfsConf = new Configuration()
    val fs: FileSystem = FileSystem.get(hdfsConf)
    val is = fs.open(new Path(args(0)))
    properties.load(is)
    // created sparksesssion
    val Table_Name: String = properties.getProperty("Table_Name")
    Table_Name
  }
}
object dependent_code {
  def main(args: Array[String]): Unit = {
    val tableName: String = common_code.common_method(args)
    val mydf = sparksesssion.sql(s"""select * from ${tableName}""").show()
  }
}
One important thing here is that you cannot access local values declared inside a method from outside that method.
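A two-line illustration of that scoping rule (my own example, not from the question):

def m(): Unit = {
  val x = 1 // x is local to m
}
// println(x) // does not compile: not found: value x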
You should not assign the result to a local variable (val Table_Name) on the last line of common_method; you should return it. Otherwise the method's return type is Unit, meaning nothing is returned to the caller. Here is a small improvement you can try:
object comman_code {
  def common_method(args: Array[String]): String = {
    val properties: Properties = new Properties()
    val hdfsConf = new Configuration()
    val fs: FileSystem = FileSystem.get(hdfsConf)
    val is = fs.open(new Path(args(0)))
    properties.load(is)
    // created sparksesssion
    properties.getProperty("Table_Name")
  }
}

object dependent_code {
  def main(args: Array[String]): Unit = {
    val tableName = comman_code.common_method(args)
    val mydf = sparksesssion.sql(s"select * from $tableName").show()
  }
}
Note: I called common_method on the comman_code object and assigned the result to a variable called tableName, which is then used in the string interpolation.
A couple of other suggestions:
Naming Conventions
How to post a question
Is it possible to get ClassTag information from a Java Class instance obtained via reflection?
Here's the situation. I have a Scala case class that looks like this:
case class Relation[M : ClassTag](id: UUID,
                                  model: Option[M] = None)
And it is used like this (although with many more classes related to each other):
case class Organization(name: String)
case class Person(firstName: String,
lastName: String,
organization: Relation[Organization])
What I'm trying to do is programmatically build up a tree of these relations using something that looks like this:
private def generateFieldMap(clazz: Class[_]): Map[String, Class[_]] = {
  clazz.getDeclaredFields.foldLeft(Map.empty[String, Class[_]])((map, field) => {
    map + (field.getName -> field.getType)
  })
}

private def getRelationModelClass[M : ClassTag](relationClass: Class[_ <: Relation[M]]): Class[_] = {
  classTag[M].runtimeClass
}

def treeOf[M: ClassTag](relations: List[String]): Map[String, Any] = {
  val normalizedRelations = ModelHelper.normalize(relations)
  val initialFieldMap = Map("" -> generateFieldMap(classTag[M].runtimeClass))
  val relationFieldMap = relations.foldLeft(initialFieldMap)((map, relation) => {
    val parts = relation.split('.')
    val parentRelation = parts.dropRight(1).mkString(".")
    val relationClass = map(parentRelation)(parts.last)
    val relationModelClass = relationClass match {
      case clazz: Class[_ <: Relation[_]] => getRelationModelClass(clazz)
      case _ => throw ProcessStreetException("cannot follow non-relation: " + relation)
    }
    val fieldMap = generateFieldMap(relationModelClass)
    map + (relation -> fieldMap)
  })
  relationFieldMap
}
val relations = List("organization")
val tree = treeOf[Person](relations)
This won't compile. I get this error:
[error] Foo.scala:148: not found: type _$12
[error] case clazz: Class[_ <: Relation[_]] => getRelationModelClass(clazz)
[error] ^
[error] one error found
[error] (compile:compile) Compilation failed
Basically, what I'd like to do is be able to access the ClassTag information when all I have is a Java Class. Is this possible?
Yes, it is absolutely possible and very easy:
val clazz = classOf[String]
val ct = ClassTag(clazz) // just use ClassTag.apply() method
In your example you'd want to call getRelationModelClass method like this:
getRelationModelClass(clazz)(ClassTag(clazz))
This is possible because the [T: ClassTag] syntax implicitly creates a second parameter list like (implicit ct: ClassTag[T]). Usually it is filled in by the compiler, but nothing prevents you from providing it explicitly.
You also don't really need to pass both the class and the class tag to the method; you aren't even using the explicit class object in its body. Just pass the class tag, it will be enough.
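To make that concrete, a minimal sketch of the simplified method (my phrasing, not the answerer's code):

import scala.reflect.ClassTag

// The Class[_] parameter is redundant: the ClassTag alone carries the
// runtime class, and it can be built from a reflectively obtained Class.
private def getRelationModelClass(ct: ClassTag[_]): Class[_] =
  ct.runtimeClass

// call site:
// getRelationModelClass(ClassTag(relationClass))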
I ended up accomplishing my goal using TypeTags and the Scala reflection API. Here are the changes necessary.
First, change the Relation class to use a TypeTag.
case class Relation[M : TypeTag](id: UUID,
                                 model: Option[M] = None)
Then change the rest of the code to use the Scala reflection API:
private def generateFieldMap(tpe: Type): Map[String, Type] =
  tpe.members.filter(_.asTerm.isVal).foldLeft(Map.empty[String, Type])((map, field) => {
    map + (field.name.toString.trim -> field.typeSignature)
  })

private def getRelationModelType(tpe: Type): Type =
  tpe match { case TypeRef(_, _, args) => args.head }

def treeOf[M: TypeTag](relations: List[String]): Map[String, Any] = {
  val normalizedRelations = ModelHelper.normalize(relations)
  val initialFieldMap = Map("" -> generateFieldMap(typeTag[M].tpe))
  val relationFieldMap = relations.foldLeft(initialFieldMap)((map, relation) => {
    val parts = relation.split('.')
    val parentRelation = parts.dropRight(1).mkString(".")
    val relationType = map(parentRelation)(parts.last)
    val relationModelType = getRelationModelType(relationType)
    val fieldMap = generateFieldMap(relationModelType)
    map + (relation -> fieldMap)
  })
  relationFieldMap
}
I have tested three variations of the same code and I get different behavior. I want to know why.
So I have this working code, which converts a long timestamp to a string in the standard ECMA date format:
lazy val dateFormat = new java.text.SimpleDateFormat("yyyy-MM-DD'T'HH:mm:ss.sssZ")

implicit def dateToECMAFormat(time: Long) = new {
  def asECMADateString: String = {
    dateFormat.format(new java.util.Date(time))
  }
}
Another variation that works:
implicit def dateToECMAFormat(time: Long) = new {
  val dateFormat = new java.text.SimpleDateFormat("yyyy-MM-DD'T'HH:mm:ss.sssZ")
  def asECMADateString: String = {
    dateFormat.format(new java.util.Date(time))
  }
}
But I do not want the SimpleDateFormat to be re-instantiated every time, so I prefer the first one. But now the real mystery:
val dateFormat = new java.text.SimpleDateFormat("yyyy-MM-DD'T'HH:mm:ss.sssZ")

implicit def dateToECMAFormat(time: Long) = new {
  def asECMADateString: String = {
    dateFormat.format(new java.util.Date(time))
  }
}
This last piece of code compiles but throws an exception at run time; I did not manage to get the stack trace out of Play Framework. I just know my controller in Play Framework 2.1 returns a 500 (Internal Server Error) without any more information (the other controllers work, though, and the main services are still up).
In each case the call looks like this: 100000L.asECMADateString
Can someone explain the different behaviors, and why the last one does not work? I thought I had a good grasp of the difference between val, lazy val and def, but now I feel like I am missing something.
UPDATE
The code is called in an object like this:
object MyController extends Controller {

  implicit val myExecutionContext = getMyExecutionContext

  lazy val dateFormat = new java.text.SimpleDateFormat("yyyy-MM-DD'T'HH:mm:ss.sssZ")

  implicit def dateToECMAFormat(time: Long) = new {
    def asECMADateString: String = {
      dateFormat.format(new java.util.Date(time))
    }
  }

  def myAction = Action {
    Async {
      future {
        blocking {
          // here get some result from a db
          val result = getStuffFromDb
          result.someLong.asECMADateString
        }
      } map { result => Ok(result) } recover { /* return some error code */ }
    }
  }
}
It is your basic Play Framework Async action call.
Since the difference between the 1st and 3rd examples is the lazy val, I'd look at exactly where your call (100000L.asECMADateString) is being made. lazy val helps correct some "order of initialization" issues with mix-ins; see this recent issue to check whether it's similar to yours.
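A minimal sketch of the kind of initialization-order problem meant here (my own example, not the asker's code): a plain val in a trait is initialized before the subclass's vals, so it can observe them as null, while a lazy val is deferred until first use.

trait Formatting {
  val pattern: String // abstract, supplied by the implementor
  lazy val format = new java.text.SimpleDateFormat(pattern) // safe: evaluated on first access
  // val format = new java.text.SimpleDateFormat(pattern)   // would NPE: pattern is still null here
}

object Demo extends Formatting {
  val pattern = "yyyy-MM-dd"
}

// Demo.format.format(new java.util.Date()) works because format is lazy.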
I am using case classes to define the different "models" of data in our app; the reason is to enable easy use of Jerkson (a Scala interface to Jackson). To convert my User to a domain object in Riak, I have used the @RiakKey annotation on my guid. I have the following:
case class User(
  @RiakKey val guid: String,
  @RiakIndex(name = "email") val email: String,
  val salt: String,
  val passwordHash: String,
  val emailHash: String,
  val firstName: String,
  val lastName: String,
  val suspended: Boolean = false,
  val created: Timestamp = now
)
When I go to perform a domain conversion on the case class, the @RiakKey isn't recognized: it throws a NoKeySpecifedException. Here's my converter:
class UserConverter(val bucket: String) extends Converter[User] {
  def fromDomain(domainObject: User, vclock: VClock) = {
    val key = getKey(domainObject)
    if (key == null) throw new NoKeySpecifedException(domainObject)
    val kryo = new Kryo()
    kryo.register(classOf[User])
    val ob = new ObjectBuffer(kryo)
    val value = ob.writeObject(domainObject)
    RiakObjectBuilder.newBuilder(bucket, key)
      .withValue(value)
      .withVClock(vclock)
      .withContentType(Constants.CTYPE_OCTET_STREAM)
      .build()
  }
}
Is this an issue in Scala with Java annotations? Is there a workaround?
Update
Here's where the User object is created and stored, and where the converter is referenced:
1)
val user = parse[User](body) // jerkson parse, body is a string of JSON
User.store(user)
2)
object User {
  val bucketName = "accounts-users"
  val bucket = DB.client.createBucket(bucketName).execute()

  def fetch(id: String) = bucket.fetch(id).execute().getValueAsString()
  def store(o: User) = bucket.store(o).withConverter(new UserConverter(bucketName)).execute()
}
Stack Trace
com.basho.riak.client.convert.NoKeySpecifedException
at com.basho.riak.client.bucket.DefaultBucket.store(DefaultBucket.java:455)
at com.threetierlogic.AccountService.models.User$.store(User.scala:58)
at com.threetierlogic.AccountService.controllers.Users$$anonfun$routes$3.apply(Users.scala:54)
at com.threetierlogic.AccountService.controllers.Users$$anonfun$routes$3.apply(Users.scala:51)
(I apologize for the long conversation before this answer)
After learning a bit more about Scala, I discovered that with a case class you have to do it a little differently.
http://piotrbuda.eu/2012/10/scala-case-classes-and-annotations-part-1.html
If you do:

@(RiakKey @field) guid: String

it works.
I wrote a small test program in Scala and was able to extract the annotated key using the static getKey() method that DefaultBucket uses (the method that was returning null and causing the exception to be thrown).
import com.basho.riak.client.convert.KeyUtil.getKey

object Main {
  def main(args: Array[String]): Unit = {
    val u = User("my_key")
    val k = getKey(u)
    println(k)
  }
}
User.scala
// with Scala 2.9.1 this would be scala.annotation.target.field
import scala.annotation.meta.field
import com.basho.riak.client.convert.RiakKey

case class User(@(RiakKey @field) guid: String)
Output:
my_key
(And, if you change the annotation back to the way you had it, it returns null as expected)
Here's my proposed workaround for the problem. Instead of relying on annotations, I am just going to use the DefaultBucket.store method and manually designate a key.
My User companion object:
object User {
  val bucketName = "accounts-users"
  val bucket = DB.client.createBucket(bucketName).execute()

  def store(key: String, o: User) = bucket.store(key, o).withConverter(new UserConverter(bucketName)).execute()
}
And using it:
val user = parse[User](body)
User.store(user.guid, user)