java.lang.ClassCastException - java

I have written code to extract the grammatical dependencies of a text, but when I run the program this exception comes up:
java.lang.ClassCastException
Here is the class that I run:
public class paerser {
    public static void main(String[] arg) {
        LexicalizedParser lp = new LexicalizedParser("grammar/englishPCFG.ser.gz");
        lp.setOptionFlags("-maxLength", "500", "-retainTmpSubcategories");
        TreebankLanguagePack tlp = new PennTreebankLanguagePack();
        GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
        String text = "John, who was the CEO of a company, played golf.";
        edu.stanford.nlp.trees.Tree parse = lp.apply(Arrays.asList(text)); // throws the exception below
        GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
        List<TypedDependency> tdl = gs.typedDependenciesCCprocessed();
        System.out.println(tdl);
    }
}
Update:
Here is the full stack trace:
Loading parser from serialized file grammar/englishPCFG.ser.gz ... done [1.5 sec].
Following exception caught during parsing:
java.lang.ClassCastException: java.lang.String cannot be cast to edu.stanford.nlp.ling.HasWord
at edu.stanford.nlp.parser.lexparser.ExhaustivePCFGParser.parse(ExhaustivePCFGParser.java:346)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.parse(LexicalizedParser.java:386)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.apply(LexicalizedParser.java:304)
at paerser.main(paerser.java:19)
Recovering using fall through strategy: will construct an (X ...) tree.
Exception in thread "main" java.lang.ClassCastException: java.lang.String cannot be cast to edu.stanford.nlp.ling.HasWord
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.apply(LexicalizedParser.java:317)
at paerser.main(paerser.java:19)

The stack trace shows that ExhaustivePCFGParser's parse method is being used. It expects a List of HasWord objects, but you are passing a list of Strings; hence the exception.
public boolean parse(List<? extends HasWord> sentence) { // ExhaustivePCFGParser
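One way to fix it, as a sketch (the exact classes vary a bit across Stanford parser versions), is to tokenize the raw text into HasWord tokens with DocumentPreprocessor instead of wrapping the whole string in a List<String>:
import java.io.StringReader;
import java.util.List;
import edu.stanford.nlp.ling.HasWord;
import edu.stanford.nlp.process.DocumentPreprocessor;
import edu.stanford.nlp.trees.Tree;

// Tokenize and sentence-split the raw text; each sentence comes back as a
// List<HasWord>, which matches the parse(List<? extends HasWord>) signature above.
DocumentPreprocessor dp = new DocumentPreprocessor(new StringReader(text));
for (List<HasWord> sentence : dp) {
    Tree parse = lp.apply(sentence);
    GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
    System.out.println(gs.typedDependenciesCCprocessed());
}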


How to create a Spark UDF in Java which accepts array of Strings?

This question has been asked here for Scala, but it does not help me as I am working with the Java API. I have been throwing everything and the kitchen sink at it, so this was my approach:
List<String> sourceClasses = new ArrayList<String>();
// Add elements
List<String> targetClasses = new ArrayList<String>();
// Add elements

dataset = dataset.withColumn("Transformer", callUDF(
        "Transformer",
        lit((String[]) sourceClasses.toArray())
                .cast(DataTypes.createArrayType(DataTypes.StringType)),
        lit((String[]) targetClasses.toArray())
                .cast(DataTypes.createArrayType(DataTypes.StringType))
));
And for my UDF declaration:
public class Transformer implements UDF2<Seq<String>, Seq<String>, String> {
    // @SuppressWarnings("deprecation")
    public String call(Seq<String> sourceClasses, Seq<String> targetClasses)
            throws Exception {
When I run the code, the execution does not proceed past the UDF call, which is expected because I am not able to match up the types. Please help me in this regard.
EDIT
I tried the solution suggested by @Oli. However, I got the following exception:
org.apache.spark.SparkException: Failed to execute user defined function($anonfun$261: (array<string>, array<string>) => string)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.project_doConsume_0$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$13$$anon$1.hasNext(WholeStageCodegenExec.scala:636)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:255)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:836)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:836)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to scala.collection.immutable.Seq
at com.esrx.dqm.uuid.UUIDTransformerEngine$1.call(UUIDTransformerEngine.java:1)
at org.apache.spark.sql.UDFRegistration$$anonfun$261.apply(UDFRegistration.scala:774)
... 22 more
This line specifically seems to be indicative of a problem:
Caused by: java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to scala.collection.immutable.Seq
From what I understand from the type of your UDF, you are trying to create a UDF that takes two arrays as inputs and returns a string.
In Java, that's a bit painful but manageable.
Let's say that you want to join both arrays and link them with the word AND. You could define the UDF as follows:
UDF2<WrappedArray<String>, WrappedArray<String>, String> my_udf2 =
        new UDF2<WrappedArray<String>, WrappedArray<String>, String>() {
    public String call(WrappedArray<String> a1, WrappedArray<String> a2)
            throws Exception {
        // Convert the Scala arrays to Java collections before manipulating them.
        ArrayList<String> l1 = new ArrayList<>(JavaConverters
                .asJavaCollectionConverter(a1)
                .asJavaCollection());
        ArrayList<String> l2 = new ArrayList<>(JavaConverters
                .asJavaCollectionConverter(a2)
                .asJavaCollection());
        return l1.stream().collect(Collectors.joining(",")) +
                " AND " +
                l2.stream().collect(Collectors.joining(","));
    }
};
Note that you need to use Scala's WrappedArray in the method signature and convert the arrays in the method body with JavaConverters to be able to manipulate them in Java. Here are the required imports, just in case:
import scala.collection.mutable.WrappedArray;
import scala.collection.JavaConverters;
Then you can register the UDF and use it with Spark. To demonstrate, I created a sample dataframe and two dummy arrays from the 'id' column. Note that it also works with the lit function, as you were trying to do in your question:
spark.udf().register("my_udf2", my_udf2, DataTypes.StringType);

String[] data = {"abcd", "efgh", "ijkl"};
spark.range(3)
    .withColumn("id", col("id").cast("string"))
    .withColumn("array", functions.array(col("id"), col("id")))
    .withColumn("string_of_arrays",
        functions.callUDF("my_udf2", col("array"), lit(data)))
    .show(false);
which yields:
+---+------+----------------------+
|id |array |string_of_arrays |
+---+------+----------------------+
|0 |[0, 0]|0,0 AND abcd,efgh,ijkl|
|1 |[1, 1]|1,1 AND abcd,efgh,ijkl|
|2 |[2, 2]|2,2 AND abcd,efgh,ijkl|
+---+------+----------------------+
In Spark >= 2.3, you could also do it like this:
UserDefinedFunction my_udf2 = udf(
    (WrappedArray<String> s1, WrappedArray<String> s2) -> "some_string",
    DataTypes.StringType
);
df.select(my_udf2.apply(col("a1"), col("a2"))).show(false);
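For reference, this lambda variant assumes imports along these lines (a sketch; adjust to your Spark version):
// Static helpers from org.apache.spark.sql.functions
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.udf;

import org.apache.spark.sql.expressions.UserDefinedFunction;
import org.apache.spark.sql.types.DataTypes;
import scala.collection.mutable.WrappedArray;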

Kafka Streamer: Issue with user defined 'Serdes'

I am using Confluent 3.2.1 as a Kafka streamer. I am trying to aggregate my KGroupedStream<String, MyClass1> into KTable<Windowed<String>, MsgAggr>. For the aggregation I am using TimeWindows.of(TimeUnit.SECONDS.toMillis(5)) and a user-defined "Serdes" as an argument. The code for the user-defined "Serdes" is:
Map<String, Object> serdeProps = new HashMap<>();

final Serializer<MsgAggr> pageViewSerializer = new JsonPOJOSerializer<>();
serdeProps.put("JsonPOJOClass", MsgAggr.class);
pageViewSerializer.configure(serdeProps, false);

final Deserializer<MsgAggr> pageViewDeserializer = new JsonPOJODeserializer<>();
serdeProps.put("JsonPOJOClass", MsgAggr.class);
pageViewDeserializer.configure(serdeProps, false);

final Serde<MsgAggr> pageViewSerde = Serdes.serdeFrom(pageViewSerializer, pageViewDeserializer);
The streaming code is:
KGroupedStream<String, MyClass1> msg_grp = message
    .groupByKey();

KTable<Windowed<String>, MsgAggr> msg_win = msg_grp
    //.reduce(new Reduced(), arg1, arg2);
    .aggregate(new Init(),
        new Aggr(),
        TimeWindows.of(TimeUnit.SECONDS.toMillis(5)),
        pageViewSerde,
        "MySample_out");
When I run the code I get the following errors:
[2017-05-23 18:16:45,648] ERROR stream-thread [StreamThread-1] Streams application error during processing: (org.apache.kafka.streams.processor.internals.StreamThread:249)
java.lang.ClassCastException: my.kafka.strm.MyClass1 cannot be cast to java.lang.String
at org.apache.kafka.common.serialization.StringSerializer.serialize(StringSerializer.java:24)
at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.send(RecordCollectorImpl.java:64)
at org.apache.kafka.streams.processor.internals.SinkNode.process(SinkNode.java:82)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:202)
at org.apache.kafka.streams.kstream.internals.KStreamFilter$KStreamFilterProcessor.process(KStreamFilter.java:44)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:82)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:202)
at org.apache.kafka.streams.kstream.internals.KStreamMap$KStreamMapProcessor.process(KStreamMap.java:43)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:82)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:202)
at org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:66)
at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:180)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:436)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:242)
Exception in thread "StreamThread-1" java.lang.ClassCastException: my.kafka.strm.MyClass1 cannot be cast to java.lang.String
at org.apache.kafka.common.serialization.StringSerializer.serialize(StringSerializer.java:24)
at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.send(RecordCollectorImpl.java:64)
at org.apache.kafka.streams.processor.internals.SinkNode.process(SinkNode.java:82)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:202)
at org.apache.kafka.streams.kstream.internals.KStreamFilter$KStreamFilterProcessor.process(KStreamFilter.java:44)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:82)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:202)
at org.apache.kafka.streams.kstream.internals.KStreamMap$KStreamMapProcessor.process(KStreamMap.java:43)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:82)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:202)
at org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:66)
at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:180)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:436)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:242)
The problem is with message.groupByKey(): it is using the String serde for your custom class MyClass1. Implement a custom serializer and deserializer for MyClass1 and pass them to the overloaded version of groupByKey: https://kafka.apache.org/0102/javadoc/org/apache/kafka/streams/kstream/KStream.html#groupByKey(org.apache.kafka.common.serialization.Serde,%20org.apache.kafka.common.serialization.Serde)
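As a minimal sketch of that fix, reusing the JsonPOJOSerializer/JsonPOJODeserializer pattern already in your question (MyClass1 is assumed to be a plain POJO like MsgAggr):
Map<String, Object> props = new HashMap<>();
props.put("JsonPOJOClass", MyClass1.class);

Serializer<MyClass1> myClass1Serializer = new JsonPOJOSerializer<>();
myClass1Serializer.configure(props, false);
Deserializer<MyClass1> myClass1Deserializer = new JsonPOJODeserializer<>();
myClass1Deserializer.configure(props, false);
final Serde<MyClass1> myClass1Serde = Serdes.serdeFrom(myClass1Serializer, myClass1Deserializer);

// Pass explicit key and value serdes so the default String serde
// is never applied to MyClass1.
KGroupedStream<String, MyClass1> msg_grp = message
    .groupByKey(Serdes.String(), myClass1Serde);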

How to read and write a custom class from parquet file

I am trying to write a Parquet read/write class for a certain class type using DataFrames/Datasets.
Class schema:
class A {
    long count;
    List<B> listOfValues;
}

class B {
    String id;
    long count;
}
Code:
String path = "some path";
List<A> entries = somerandomAentries();
JavaRDD<A> rdd = sc.parallelize(entries, 1);
DataFrame df = sqlContext.createDataFrame(rdd, A.class);
df.write().parquet(path);
DataFrame newDataDF = sqlContext.read().parquet(path);
newDataDF.show();
When I try to run this, it throws an error. What am I missing here? Do I need to provide a schema for the whole class while creating the data frame?
Error:
Caused by: scala.MatchError: B(Id=abc, count=0) (of class B)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:255)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:250)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toCatalyst(CatalystTypeConverters.scala:102)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$ArrayConverter.toCatalystImpl(CatalystTypeConverters.scala:169)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$ArrayConverter.toCatalystImpl(CatalystTypeConverters.scala:153)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toCatalyst(CatalystTypeConverters.scala:102)
at org.apache.spark.sql.catalyst.CatalystTypeConverters$$anonfun$createToCatalystConverter$2.apply(CatalystTypeConverters.scala:401)
at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358)
at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1358)
at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1356)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:263)
... 8 more
You are getting the error because nested JavaBeans are not supported in Spark 1.6. See https://spark.apache.org/docs/1.6.0/sql-programming-guide.html#inferring-the-schema-using-reflection:
"Currently, Spark SQL does not support JavaBeans that contain nested or contain complex types such as Lists or Arrays."
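If upgrading is an option, here is a minimal sketch assuming Spark 2.x, whose bean encoders do handle nested JavaBeans (spark is a SparkSession; A and B would need public getters/setters and no-arg constructors):
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;

// Encode the nested bean explicitly instead of relying on reflection over an RDD.
Dataset<A> ds = spark.createDataset(entries, Encoders.bean(A.class));
ds.write().parquet(path);

Dataset<A> back = spark.read().parquet(path).as(Encoders.bean(A.class));
back.show();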

Javassist CannotCompileException when trying to add a line to create a Map

I'm trying to instrument a method to do the following task.
Task: create a Map and insert values into the map.
Adding System.out.println lines doesn't cause any exception, but when I add the line that creates the Map, it throws a CannotCompileException due to a missing ';'. When I print the final string, it doesn't seem to be missing one. What am I doing wrong here?
public void createInsertAt(CtMethod method, int lineNo, Map<String, String> parameterMap)
        throws CannotCompileException {
    StringBuilder atBuilder = new StringBuilder();
    atBuilder.append("System.out.println(\"" + method.getName() + " is running\");");
    atBuilder.append("java.util.Map<String,String> arbitraryMap = new java.util.HashMap<String,String>();");
    for (Map.Entry<String, String> entry : parameterMap.entrySet()) {
    }
    System.out.println(atBuilder.toString());
    method.insertAt(1, atBuilder.toString());
}
The string obtained by printing the output of the string builder is:
System.out.println("prepareStatement is running");java.util.Map arbitraryMap = new java.util.HashMap();
The exception received is:
javassist.CannotCompileException: [source error] ; is missing
at javassist.CtBehavior.insertAt(CtBehavior.java:1207)
at javassist.CtBehavior.insertAt(CtBehavior.java:1134)
at org.wso2.das.javaagent.instrumentation.InstrumentationClassTransformer.createInsertAt(InstrumentationClassTransformer.java:126)
at org.wso2.das.javaagent.instrumentation.InstrumentationClassTransformer.instrumentMethod(InstrumentationClassTransformer.java:100)
at org.wso2.das.javaagent.instrumentation.InstrumentationClassTransformer.transform(InstrumentationClassTransformer.java:37)
at sun.instrument.TransformerManager.transform(TransformerManager.java:188)
at sun.instrument.InstrumentationImpl.transform(InstrumentationImpl.java:424)
at sun.instrument.InstrumentationImpl.retransformClasses0(Native Method)
at sun.instrument.InstrumentationImpl.retransformClasses(InstrumentationImpl.java:144)
at org.wso2.das.javaagent.instrumentation.Agent.premain(Agent.java:39)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:382)
at sun.instrument.InstrumentationImpl.loadClassAndCallPremain(InstrumentationImpl.java:397)
Caused by: compile error: ; is missing
at javassist.compiler.Parser.parseDeclarationOrExpression(Parser.java:594)
at javassist.compiler.Parser.parseStatement(Parser.java:277)
at javassist.compiler.Javac.compileStmnt(Javac.java:567)
at javassist.CtBehavior.insertAt(CtBehavior.java:1186)
... 15 more
(Is there any way to debug these kinds of issues?) Some help, please.
Javassist's compiler doesn't support generics. Either remove or comment them out:
.append("java.util.Map arbitraryMap = new java.util.HashMap();")
or
.append("java.util.Map/*<String,String>*/ arbitraryMap = new java.util.HashMap/*<String,String>*/();")
The latter is useful as a comment for yourself only; of course, it has no special meaning for Javassist.

Exception in thread "main" java.util.UnknownFormatConversionException: Conversion = 'ti'

package chapterreader;

import java.util.Scanner;
import java.io.File;

public class ChapterReader {

    public static void main(String[] args) throws Exception {
        Chapter myChapter = new Chapter();
        File chapterFile = new File("toc.txt");
        Scanner chapterScanner;
        // Check to see if the file exists to read the data
        if (chapterFile.exists()) {
            System.out.printf("%7Chapter %14Title %69Page %80Length");
            chapterScanner = new Scanner(chapterFile);
            // Set delimiter as ';' and 'new line'
            chapterScanner.useDelimiter(";|\r\n");
            while (chapterScanner.hasNext()) {
                // Reads all the data from the file and sets it on the Chapter object
                myChapter.setChapterNumber(chapterScanner.nextInt());
                myChapter.setChapterTitle(chapterScanner.next());
                myChapter.setStartingPageNumber(chapterScanner.nextInt());
                myChapter.setEndingPageNumber(chapterScanner.nextInt());
                displayProduct(myChapter);
            }
            chapterScanner.close();
        } else {
            System.out.println("Missing Chapter File");
        }
    }

    // Display the chapter information in the correct format
    public static void displayProduct(Chapter reportProduct) {
        System.out.printf("%7d", reportProduct.getChapterNumber());
        System.out.printf("%-60s", reportProduct.getChapterTitle());
        System.out.printf("%-6d", reportProduct.getStartingPageNumber());
        System.out.printf("%-7d%n", reportProduct.getEndingPageNumber());
    }
}
But then I got this error:
run:
Exception in thread "main" java.util.UnknownFormatConversionException: Conversion = 'ti'
    at java.util.Formatter$FormatSpecifier.checkDateTime(Formatter.java:2915)
    at java.util.Formatter$FormatSpecifier.<init>(Formatter.java:2678)
    at java.util.Formatter.parse(Formatter.java:2528)
    at java.util.Formatter.format(Formatter.java:2469)
    at java.io.PrintStream.format(PrintStream.java:970)
    at java.io.PrintStream.printf(PrintStream.java:871)
    at chapterreader.ChapterReader.main(ChapterReader.java:17)
Java Result: 1
BUILD SUCCESSFUL (total time: 0 seconds)
What's wrong here? Please help!
The statement below is not a valid format string; that is why it throws UnknownFormatConversionException:
System.out.printf("%7Chapter %14Title %69Page %80Length");
If you want to lay out these words as column headers, use %s conversions and pass the headers as arguments:
System.out.printf("%7s %14s %69s %80s", "Chapter", "Title", "Page", "Length");
Instead of
System.out.printf("%7Chapter %14Title %69Page %80Length");
I think you wanted something like
System.out.printf("%7s %14s %69s %80s%n", "Chapter", "Title", "Page",
"Length");
and your message is telling you that your format strings aren't valid (%14Ti). The Formatter syntax javadoc says (in part):
't', 'T' date/time Prefix for date and time conversion characters. See Date/Time Conversions.
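One more hedged note: the %69s/%80s header widths above are only illustrative and don't line up with the %7d/%-60s/%-6d/%-7d row format used in displayProduct; a header matching those row widths would look like this sketch:
// Widths chosen to mirror the row format in displayProduct
System.out.printf("%7s%-60s%-6s%-7s%n", "Chapter", "Title", "Page", "Length");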
