How can I call this function from Java? Or do I need a wrapper in Scala?
package com.datastax.spark.connector

class DataFrameFunctions(dataFrame: DataFrame) extends Serializable {
  def createCassandraTable(
      keyspaceName: String,
      tableName: String,
      partitionKeyColumns: Option[Seq[String]] = None,
      clusteringKeyColumns: Option[Seq[String]] = None)(
      connector: CassandraConnector = CassandraConnector(sparkContext.getConf)): Unit = {
I used the following code:
DataFrameFunctions frameFunctions = new DataFrameFunctions(dfTemp2);
Seq<String> argumentsSeq1 = JavaConversions.asScalaBuffer(Arrays.asList("CategoryName")).seq();
Option<Seq<String>> some1 = new Some<Seq<String>>(argumentsSeq1);
Seq<String> argumentsSeq2 = JavaConversions.asScalaBuffer(Arrays.asList("DealType")).seq();
Option<Seq<String>> some2 = new Some<Seq<String>>(argumentsSeq2);
frameFunctions.createCassandraTable("coupons", "IdealFeeds", some1, some2, connector);
I am trying to read Avro data from Kafka using Spark Streaming, but I get the following error message:
Streaming Query Exception caught!: org.apache.spark.sql.streaming.StreamingQueryException: Job aborted.
=== Streaming Query ===
Identifier: [id = 8b54c92d-6bbc-4dbc-84d0-55b762c21ba2, runId = 4bc92b3c-343e-4886-b0bc-0777b89f9ec8]
Current Committed Offsets: {KafkaV2[Subscribe[customer-avro4]]: {"customer-avro":{"0":17}}}
Current Available Offsets: {KafkaV2[Subscribe[customer-avro4]]: {"customer-avro":{"0":20}}}
Current State: ACTIVE
Thread State: RUNNABLE
Any idea what the issue might be and how to resolve it? The code is below (inspired by the xebia-france spark-structured-streaming-blog). I believe it ran successfully earlier, but now it fails.
import com.databricks.spark.avro.SchemaConverters
import io.confluent.kafka.schemaregistry.client.{CachedSchemaRegistryClient, SchemaRegistryClient}
import io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer
import org.apache.avro.Schema
import org.apache.avro.generic.GenericRecord
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.StreamingQueryException
object AvroConsumer {
private val topic = "customer-avro4"
private val kafkaUrl = "http://localhost:9092"
private val schemaRegistryUrl = "http://localhost:8081"
private val schemaRegistryClient = new CachedSchemaRegistryClient(schemaRegistryUrl, 128)
private val kafkaAvroDeserializer = new AvroDeserializer(schemaRegistryClient)
private val avroSchema = schemaRegistryClient.getLatestSchemaMetadata(topic + "-value").getSchema
private val sparkSchema = SchemaConverters.toSqlType(new Schema.Parser().parse(avroSchema))
def main(args: Array[String]): Unit = {
val spark = SparkSession
spark.udf.register("deserialize", (bytes: Array[Byte]) =>
val kafkaDataFrame = spark
.option("kafka.bootstrap.servers", kafkaUrl)
.option("subscribe", topic)
val valueDataFrame = kafkaDataFrame.selectExpr("""deserialize(value) AS message""")
import org.apache.spark.sql.functions._
val formattedDataFrame =
from_json(col("message"), sparkSchema.dataType).alias("parsed_value"))
val writer = formattedDataFrame
.option("checkpointLocation", "hdfs://localhost:9000/data/spark/parquet/checkpoint")
while (true) {
val query = writer.start("hdfs://localhost:9000/data/spark/parquet/total")
try {
catch {
case e: StreamingQueryException => println("Streaming Query Exception caught!: " + e);
object DeserializerWrapper {
val deserializer: AvroDeserializer = kafkaAvroDeserializer
class AvroDeserializer extends AbstractKafkaAvroDeserializer {
def this(client: SchemaRegistryClient) {
this.schemaRegistry = client
override def deserialize(bytes: Array[Byte]): String = {
val genericRecord = super.deserialize(bytes).asInstanceOf[GenericRecord]
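Figured it out: the problem was not with the Spark-Kafka integration itself, as I had thought, but with the checkpoint information stored in HDFS. Deleting and recreating the checkpoint folder in HDFS solved it for me. This is consistent with the log above, where the query subscribes to customer-avro4 while the committed offsets are recorded for customer-avro.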
I know I can compile individual "snippets" in Scala using the Toolbox like this:
import scala.reflect.runtime.universe
object Compiler {
val tb = universe.runtimeMirror(getClass.getClassLoader).mkToolBox()
def main(args: Array[String]): Unit = {
Is there any way I can compile more than just "snippets", i.e., classes that refer to each other? Like this:
import scala.reflect.runtime.universe
object Compiler {
private val tb = universe.runtimeMirror(getClass.getClassLoader).mkToolBox()
val a: String =
|package pkg {
|class A {
|def compute(): Int = 42
val b: String =
|import pkg._
|class B {
|def fun(): Unit = {
| new A().compute()
def main(args: Array[String]): Unit = {
val compiledA = tb.parse(a)
val compiledB = tb.parse(b)
Obviously, my snippet doesn't work as I have to tell the toolbox how to resolve "A" somehow:
Exception in thread "main" reflective compilation has failed:
not found: type A
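import scala.reflect.runtime.universe._
import scala.reflect.runtime.universe
import scala.tools.reflect.ToolBox // provides mkToolBox on the mirror

val tb = universe.runtimeMirror(getClass.getClassLoader).mkToolBox()
val a = q"""
  class A {
    def compute(): Int = 42
  }"""
// define returns the symbol of A, which can be spliced into later trees
val symbA = tb.define(a)
val b = q"""
  class B {
    def fun(): Unit = {
      new $symbA().compute()
    }
  }"""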
In cases more complex than those the toolbox can handle, you can always run the compiler manually:
import scala.reflect.internal.util.{AbstractFileClassLoader, BatchSourceFile}
import scala.reflect.io.{AbstractFile, VirtualDirectory}
import scala.tools.nsc.{Global, Settings}
import scala.reflect.runtime
import scala.reflect.runtime.universe
import scala.reflect.runtime.universe._
val a: String =
|package pkg {
|class A {
| def compute(): Int = 42
val b: String =
|import pkg._
|class B {
| def fun(): Unit = {
| println(new A().compute())
| }
val directory = new VirtualDirectory("(memory)", None)
compileCode(List(a, b), List(), directory)
val runtimeMirror = createRuntimeMirror(directory, runtime.currentMirror)
val bInstance = instantiateClass("B", runtimeMirror)
runClassMethod("B", runtimeMirror, "fun", bInstance) // 42
def compileCode(sources: List[String], classpathDirectories: List[AbstractFile], outputDirectory: AbstractFile): Unit = {
val settings = new Settings
classpathDirectories.foreach(dir => settings.classpath.prepend(dir.toString))
settings.usejavacp.value = true
val global = new Global(settings)
val files = sources.zipWithIndex.map { case (code, i) => new BatchSourceFile(s"(inline-$i)", code) }
(new global.Run).compileSources(files)
def instantiateClass(className: String, runtimeMirror: Mirror, arguments: Any*): Any = {
val classSymbol = runtimeMirror.staticClass(className)
val classType = classSymbol.typeSignature
val constructorSymbol = classType.decl(termNames.CONSTRUCTOR).asMethod
val classMirror = runtimeMirror.reflectClass(classSymbol)
val constructorMirror = classMirror.reflectConstructor(constructorSymbol)
constructorMirror(arguments: _*)
def runClassMethod(className: String, runtimeMirror: Mirror, methodName: String, classInstance: Any, arguments: Any*): Any = {
val classSymbol = runtimeMirror.staticClass(className)
val classType = classSymbol.typeSignature
val methodSymbol = classType.decl(TermName(methodName)).asMethod
val instanceMirror = runtimeMirror.reflect(classInstance)
val methodMirror = instanceMirror.reflectMethod(methodSymbol)
methodMirror(arguments: _*)
//def runObjectMethod(objectName: String, runtimeMirror: Mirror, methodName: String, arguments: Any*): Any = {
// val objectSymbol = runtimeMirror.staticModule(objectName)
// val objectModuleMirror = runtimeMirror.reflectModule(objectSymbol)
// val objectInstance = objectModuleMirror.instance
// val objectType = objectSymbol.typeSignature
// val methodSymbol = objectType.decl(TermName(methodName)).asMethod
// val objectInstanceMirror = runtimeMirror.reflect(objectInstance)
// val methodMirror = objectInstanceMirror.reflectMethod(methodSymbol)
// methodMirror(arguments: _*)
def createRuntimeMirror(directory: AbstractFile, parentMirror: Mirror): Mirror = {
val classLoader = new AbstractFileClassLoader(directory, parentMirror.classLoader)
I am trying to use Kotlin instead of Java, but I cannot find a good equivalent of try-with-resources.
The Java code looks like this:
import org.tensorflow.Graph;
import org.tensorflow.Session;
import org.tensorflow.Tensor;
import org.tensorflow.TensorFlow;
public class HelloTensorFlow {
public static void main(String[] args) throws Exception {
try (Graph g = new Graph()) {
final String value = "Hello from " + TensorFlow.version();
// Construct the computation graph with a single operation, a constant
// named "MyConst" with a value "value".
try (Tensor t = Tensor.create(value.getBytes("UTF-8"))) {
// The Java API doesn't yet include convenience functions for adding operations.
g.opBuilder("Const", "MyConst").setAttr("dtype", t.dataType()).setAttr("value", t).build();
// Execute the "MyConst" operation in a Session.
try (Session s = new Session(g);
// Generally, there may be multiple output tensors,
// all of them must be closed to prevent resource leaks.
Tensor output = s.runner().fetch("MyConst").run().get(0)) {
System.out.println(new String(output.bytesValue(), "UTF-8"));
To do the same in Kotlin, I have to write this:
fun main(args: Array<String>) {
    val g = Graph()
    try {
        val value = "Hello from ${TensorFlow.version()}"
        val t = Tensor.create(value.toByteArray(Charsets.UTF_8))
        try {
            g.opBuilder("Const", "MyConst").setAttr("dtype", t.dataType()).setAttr("value", t).build()
        } finally {
            t.close()
        }
        val sess = Session(g)
        try {
            val output = sess.runner().fetch("MyConst").run().get(0)
            println(String(output.bytesValue(), Charsets.UTF_8))
        } finally {
            sess.close()
        }
    } finally {
        g.close()
    }
}
I have tried to use Kotlin's use function like this:
Graph().use {
it -> ....
I got an error like this:
Error:(16, 20) Kotlin: Unresolved reference. None of the following candidates is applicable because of receiver type mismatch:
@InlineOnly public inline fun ???.use(block: (???) -> ???): ??? defined in
I was simply using the wrong dependency:
compile "org.jetbrains.kotlin:kotlin-stdlib"
Replace it with:
compile "org.jetbrains.kotlin:kotlin-stdlib-jdk8"
I have to invoke external Java methods in XQuery using Saxon HE. I was able to invoke the methods with the code below, but I want to bind my input externally.
final Configuration config = new Configuration();
config.registerExtensionFunction(new ShiftLeft());
final StaticQueryContext sqc = new StaticQueryContext(config);
final XQueryExpression exp = sqc.compileQuery(new FileReader(
final DynamicQueryContext dynamicContext = new DynamicQueryContext(config);
String xml = "<student_list><student><name>George Washington</name><major>Politics</major><phone>312-123-4567</phone><email></email></student><student><name>Janet Jones</name><major>Undeclared</major><phone>311-122-2233</phone><email></email></student><student><name>Joe Taylor</name><major>Engineering</major><phone>211-111-2333</phone><email></email></student></student_list>";
DocumentBuilderFactory newInstance = DocumentBuilderFactory.newInstance();
Document parse = newInstance.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
DocumentWrapper sequence = new DocumentWrapper(parse, "", config);
StructuredQName qname = new StructuredQName("", "", "student_list");
dynamicContext.setParameter(qname, sequence);
Properties props = new Properties();
final SequenceIterator iter = exp.iterator(dynamicContext);
props.setProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
props.setProperty(OutputKeys.INDENT, "yes");
StringWriter writer = new StringWriter();
QueryResult.serializeSequence(iter, config, writer, props);
System.out.println("Result is " + writer);
declare namespace eg="";
declare namespace xs = "";
declare variable $student_list as element(*) external;
<value> {
let $n := eg:shift-left(2, 2)
return $n
{ $student_list//student_list/student/name }
But I get the following error:
Error at procedure student_list on line 3 of students.xml:
XPTY0004: Required item type of value of variable $student_list is element(); supplied
value has item type document-node(element(Q{}student_list))
net.sf.saxon.trans.XPathException: Required item type of value of variable $student_list is element(); supplied value has item type document-node(element(Q{}student_list))
at net.sf.saxon.expr.ItemTypeCheckingFunction.testConformance(
at net.sf.saxon.expr.ItemTypeCheckingFunction.mapItem(
at net.sf.saxon.expr.CardinalityCheckingIterator.<init>(
at net.sf.saxon.type.TypeHierarchy.applyFunctionConversionRules(
at net.sf.saxon.expr.instruct.GlobalParameterSet.convertParameterValue(
at net.sf.saxon.expr.instruct.Bindery.useGlobalParameter(
at net.sf.saxon.expr.instruct.GlobalParam.evaluateVariable(
at net.sf.saxon.expr.GlobalVariableReference.evaluateVariable(
at net.sf.saxon.expr.VariableReference.evaluateItem(
at net.sf.saxon.expr.Atomizer.evaluateItem(
at net.sf.saxon.expr.Atomizer.evaluateItem(
at net.sf.saxon.expr.AtomicSequenceConverter.evaluateItem(
at net.sf.saxon.expr.AtomicSequenceConverter.evaluateItem(
at net.sf.saxon.functions.Doc.doc(
at net.sf.saxon.functions.Doc.evaluateItem(
at net.sf.saxon.functions.Doc.evaluateItem(
at net.sf.saxon.expr.SimpleStepExpression.iterate(
at net.sf.saxon.expr.SlashExpression.iterate(
at net.sf.saxon.expr.sort.DocumentSorter.iterate(
at net.sf.saxon.expr.SlashExpression.iterate(
at net.sf.saxon.expr.sort.DocumentSorter.iterate(
at net.sf.saxon.expr.Expression.process(
at net.sf.saxon.expr.instruct.ElementCreator.processLeavingTail(
at net.sf.saxon.expr.instruct.ElementCreator.processLeavingTail(
at net.sf.saxon.expr.instruct.Block.processLeavingTail(
at net.sf.saxon.expr.instruct.Instruction.process(
at net.sf.saxon.expr.instruct.ElementCreator.constructElement(
at net.sf.saxon.expr.instruct.ElementCreator.evaluateItem(
at net.sf.saxon.expr.instruct.Instruction.iterate(
at net.sf.saxon.query.XQueryExpression.iterator(
at com.example.saxon.ExternalMethodCaller.main(
Thanks in advance..
Unless you have a very good reason not to, my advice is to use Snappi (the Saxon 9 API, or s9api):
Processor saxon = new Processor(false);
saxon.registerExtensionFunction(new MyExtension());
XQueryCompiler compiler = saxon.newXQueryCompiler();
XQueryExecutable exec = compiler.compile(new File("input/names.xq"));
XQueryEvaluator query = exec.load();
DocumentBuilder builder = saxon.newDocumentBuilder();
String students = "<xml>...</xml>";
Source src = new StreamSource(new StringReader(students));
XdmNode doc = builder.build(src);
query.setExternalVariable(new QName("student_list"), doc);
XdmValue result = query.evaluate();
With MyExtension looking something like the following:
public class MyExtension
        implements ExtensionFunction {
    public QName getName() {
        return new QName("", "my-fun");
    }
    public SequenceType getResultType() {
        return SequenceType.makeSequenceType(
                ItemType.INTEGER, OccurrenceIndicator.ONE);
    }
    public SequenceType[] getArgumentTypes() {
        return new SequenceType[] {
            SequenceType.makeSequenceType(
                    ItemType.INTEGER, OccurrenceIndicator.ONE),
            SequenceType.makeSequenceType(
                    ItemType.INTEGER, OccurrenceIndicator.ONE)
        };
    }
    public XdmValue call(XdmValue[] args) throws SaxonApiException {
        // Each argument arrives as its own one-item sequence; the second is args[1].
        long first = ((XdmAtomicValue)args[0].itemAt(0)).getLongValue();
        long second = ((XdmAtomicValue)args[1].itemAt(0)).getLongValue();
        long result = ...;
        return new XdmAtomicValue(result);
    }
}
See the documentation at for details.
EXPath also has a project called tools-saxon, which contains several tools for using Saxon from Java, including support for extension functions. It introduces the concept of a function library, which is convenient if you have several extension functions. It also introduces a function definition builder, which lets you build a function definition with as little boilerplate code as possible (and provides convenient shortcuts for sequence types). In the above code, replace the function registration (the first 2 lines) with:
Processor saxon = new Processor(false);
Library lib = new MyLibrary();
and replace the extension class with the following two classes (a library and a function, respectively):
public class MyLibrary
extends Library
public MyLibrary()
super("", "my");
protected Function[] functions()
return new Function[] {
new MyFunction(this)
public class MyFunction
extends Function
public MyFunction(Library lib)
protected Definition makeDefinition()
return library()
.function(this, "my-fun")
.param(Types.SINGLE_INTEGER, "first")
.param(Types.SINGLE_INTEGER, "second")
public Sequence call(XPathContext ctxt, Sequence[] args)
throws XPathException
Parameters params = checkParams(args);
long first = params.asLong(0, true);
long second = params.asLong(1, true);
long result = 0;
return Return.value(result);
See all the information on the project home on GitHub, at
Note: not tested.
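As for the XPTY0004 error in the question: the message says that $student_list is declared as element(*) while the supplied value is a document node, which is what you get when the whole parsed document is bound to the variable. The sketch below uses only the JDK's DOM API (no Saxon calls) to show the two node kinds side by side; the class and method names are made up for illustration. Supplying the document element instead, or declaring the variable as document-node(), are two ways to make the types agree, and which one fits depends on the rest of the query.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Saxon: contrasts the document node
// with its document element.
final class NodeKinds {
    private NodeKinds() {}

    // Parses an XML string into a DOM Document (the document node).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Names the node kind in the spelling XQuery uses for item types.
    static String kindOf(Node node) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: return "document-node()";
            case Node.ELEMENT_NODE:  return "element(" + node.getNodeName() + ")";
            default:                 return "other";
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<student_list><student><name>Joe Taylor</name></student></student_list>");
        Element root = doc.getDocumentElement();
        System.out.println(kindOf(doc));   // document-node()
        System.out.println(kindOf(root));  // element(student_list)
    }
}
```

Note that if the element is bound rather than the document, a path such as $student_list//student_list/student/name no longer matches anything, because student_list is then the context element itself rather than one of its descendants.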
Edit: Using Kryo 1.04
I'm currently serializing a User class in Scala that contains a java.sql.Timestamp field. For some reason, Kryo can't find a zero-arg constructor and throws an error:
Caused by: com.esotericsoftware.kryo.SerializationException: Class cannot be created (missing no-arg constructor): java.sql.Timestamp
Serialization trace:
created (com.threetierlogic.AccountService.models.User)
at com.esotericsoftware.kryo.Kryo.newInstance(
at com.esotericsoftware.kryo.Serializer.newInstance(
at com.esotericsoftware.kryo.serialize.FieldSerializer.readObjectData(
at com.esotericsoftware.kryo.serialize.FieldSerializer.readObjectData(
at com.esotericsoftware.kryo.serialize.FieldSerializer.readObjectData(
at com.esotericsoftware.kryo.Serializer.readObject(
at com.esotericsoftware.kryo.Kryo.readObject(
... 84 more
Caused by: java.lang.InstantiationException: java.sql.Timestamp
at java.lang.Class.newInstance0(
at java.lang.Class.newInstance(
at com.esotericsoftware.kryo.Kryo.newInstance(
... 90 more
This is part of a converter class to convert domain objects for Riak. Here's my converter class:
/** Kryo converter for passing domain objects into Riak */
class UserConverter(val bucket: String) extends Converter[User] {
def fromDomain(domainObject: User, vclock: VClock): IRiakObject = {
val key = domainObject.guid
if(key == null) throw new NoKeySpecifedException(domainObject)
val kryo = new Kryo()
val ob = new ObjectBuffer(kryo)
val value = ob.writeObject(domainObject)
RiakObjectBuilder.newBuilder(bucket, key)
def toDomain(riakObject: IRiakObject): User = {
if(riakObject == null) null
val kryo = new Kryo()
val ob = new ObjectBuffer(kryo)
ob.readObject(riakObject.getValue(), classOf[User])
Do I need to extend Timestamp and create a zero argument constructor? Or is there a better workaround?
If I need to upgrade to 2.20, what's the replacement for ObjectBuffer without writing to a file?
A quick look at the Kryo home page suggests that in the absence of a zero-arg constructor, you can create what Kryo calls an "InstantiatorStrategy" to handle that class. Look in the "Object Creation" section.
You can do something like this:
class KryoSO {
import com.esotericsoftware.kryo.KryoSerializable
import de.javakaffee.kryoserializers.KryoReflectionFactorySupport
import com.esotericsoftware.kryo.Kryo
import com.esotericsoftware.kryo.Serializer
import java.io.{ InputStream, OutputStream }
import com.esotericsoftware.kryo.io.{ Output, Input }
import java.sql.Timestamp
object TimestampSerializer extends Serializer[Timestamp] {
override def write(kryo: Kryo, output: Output, t: Timestamp): Unit = {
output.writeLong(t.getTime(), true);
override def read(kryo: Kryo, input: Input, t: Class[Timestamp]): Timestamp = {
new Timestamp(input.readLong(true));
override def copy(kryo: Kryo, original: Timestamp): Timestamp = {
new Timestamp(original.getTime());
val kryo: Kryo = new KryoReflectionFactorySupport
kryo.addDefaultSerializer(classOf[Timestamp], TimestampSerializer)
def serialize(o: Any, os: OutputStream) = {
val output = new Output(os);
this.kryo.writeClassAndObject(output, o);
def deserialize(is: InputStream): Any = {
kryo.readClassAndObject(new Input(is));
val k = new KryoSO
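val b = new java.io.ByteArrayOutputStream() // assumption: an in-memory stream
val timestamp = new java.sql.Timestamp(System.currentTimeMillis())
k.serialize(timestamp, b)
val result = k.deserialize(new java.io.ByteArrayInputStream(b.toByteArray)) // assumption: read back the same bytes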
println(timestamp == result)
Result:
2013-02-07 10:59:19.482
class java.sql.Timestamp
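One thing to be aware of with the serializer above: it stores only getTime(), which has millisecond resolution, so any finer fraction held in the Timestamp's nanos field is not restored. If that matters, the same idea extends to writing the nanos as well. The following stdlib-only Java sketch (no Kryo involved; the class name TimestampCodec is made up for illustration) shows an encode/decode pair that the write and read methods of such a serializer could mirror.

```java
import java.nio.ByteBuffer;
import java.sql.Timestamp;

// Hypothetical helper: encodes a Timestamp as 8 bytes of epoch millis
// followed by 4 bytes of nanos, and decodes it again.
final class TimestampCodec {
    private TimestampCodec() {}

    static byte[] encode(Timestamp ts) {
        return ByteBuffer.allocate(Long.BYTES + Integer.BYTES)
                .putLong(ts.getTime())   // milliseconds since the epoch
                .putInt(ts.getNanos())   // full fractional second, in nanoseconds
                .array();
    }

    static Timestamp decode(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        Timestamp ts = new Timestamp(buf.getLong());
        ts.setNanos(buf.getInt());       // restores the sub-millisecond digits
        return ts;
    }

    public static void main(String[] args) {
        Timestamp original = Timestamp.valueOf("2013-02-07 10:59:19.482123456");
        Timestamp viaMillisOnly = new Timestamp(original.getTime());
        Timestamp viaCodec = decode(encode(original));
        System.out.println(original.equals(viaMillisOnly)); // false
        System.out.println(original.equals(viaCodec));      // true
    }
}
```

For timestamps created from System.currentTimeMillis(), as in the test above, the millisecond-only form loses nothing; the extra field only matters for values that carry finer precision, such as those read from a database.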