kinesis analytics flink write parquet file - java

Using Amazon Kinesis Data Analytics with a Java Flink application, I am taking data from a Firehose and trying to write it to an S3 bucket as a series of Parquet files. I am hitting the following exception in my CloudWatch logs, which is the only error I can see that might be related.
I have enabled checkpointing as specified in the documentation and included the Flink/Avro dependencies. Running this locally works: the Parquet files are written to the local disk when a checkpoint is reached.
The exception:
"message": "Exception type is USER from filter results [UserClassLoaderExceptionFilter -> USER, UserAPIExceptionFilter -> SKIPPED, UserSerializationExceptionFilter -> SKIPPED, UserFunctionExceptionFilter -> SKIPPED, OutOfMemoryExceptionFilter -> NONE, TooManyOpenFilesExceptionFilter -> NONE, KinesisServiceExceptionFilter -> NONE].",
"throwableInformation": [
"java.lang.Exception: Error while triggering checkpoint 1360 for Source: Custom Source -> Map -> Sink: HelloS3 (1/1)",
"org.apache.flink.runtime.taskmanager.Task$1.run(Task.java:1201)",
"java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)",
"java.util.concurrent.FutureTask.run(FutureTask.java:266)",
"java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)",
"java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)",
"java.lang.Thread.run(Thread.java:748)",
"Caused by: java.lang.AbstractMethodError: org.apache.parquet.hadoop.ColumnChunkPageWriteStore$ColumnChunkPageWriter.writePage(Lorg/apache/parquet/bytes/BytesInput;IILorg/apache/parquet/column/statistics/Statistics;Lorg/apache/parquet/column/Encoding;Lorg/apache/parquet/column/Encoding;Lorg/apache/parquet/column/Encoding;)V",
"org.apache.parquet.column.impl.ColumnWriterV1.writePage(ColumnWriterV1.java:53)",
"org.apache.parquet.column.impl.ColumnWriterBase.writePage(ColumnWriterBase.java:315)",
"org.apache.parquet.column.impl.ColumnWriteStoreBase.flush(ColumnWriteStoreBase.java:152)",
"org.apache.parquet.column.impl.ColumnWriteStoreV1.flush(ColumnWriteStoreV1.java:27)",
"org.apache.parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:172)",
"org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:114)",
"org.apache.parquet.hadoop.ParquetWriter.close(ParquetWriter.java:308)",
"org.apache.flink.formats.parquet.ParquetBulkWriter.finish(ParquetBulkWriter.java:62)",
"org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.closeForCommit(BulkPartWriter.java:62)",
"org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.closePartFile(Bucket.java:235)",
"org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.prepareBucketForCheckpointing(Bucket.java:276)",
"org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.onReceptionOfCheckpoint(Bucket.java:249)",
"org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.snapshotActiveBuckets(Buckets.java:244)",
"org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.snapshotState(Buckets.java:235)",
"org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink.snapshotState(StreamingFileSink.java:347)",
"org.apache.flink.streaming.util.functions.StreamingFunctionUtils.trySnapshotFunctionState(StreamingFunctionUtils.java:118)",
"org.apache.flink.streaming.util.functions.StreamingFunctionUtils.snapshotFunctionState(StreamingFunctionUtils.java:99)",
"org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.snapshotState(AbstractUdfStreamOperator.java:90)",
"org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:395)",
"org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.checkpointStreamOperator(StreamTask.java:1138)",
"org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing(StreamTask.java:1080)",
"org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(StreamTask.java:754)",
"org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:666)",
"org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpoint(StreamTask.java:584)",
"org.apache.flink.streaming.runtime.tasks.SourceStreamTask.triggerCheckpoint(SourceStreamTask.java:114)",
"org.apache.flink.runtime.taskmanager.Task$1.run(Task.java:1190)",
"\t... 5 more"
Below are my code snippets. I am seeing my log output when processing the events, and even the logging from the BucketAssigner.
env.setStateBackend(new FsStateBackend("s3a://<BUCKET>/checkpoint"));
env.setParallelism(1);
env.enableCheckpointing(5000, CheckpointingMode.EXACTLY_ONCE);
StreamingFileSink<Metric> sink = StreamingFileSink
.forBulkFormat(new Path("s3a://<BUCKET>/raw"), ParquetAvroWriters.forReflectRecord(Metric.class))
.withBucketAssigner(new EventTimeBucketAssigner())
.build();
My pom:
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-parquet_2.11</artifactId>
<version>1.11-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.parquet</groupId>
<artifactId>parquet-avro</artifactId>
<version>1.11.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>3.2.1</version>
</dependency>
My AWS configuration has 'Snapshots' enabled. Write permissions to the bucket are working when I use row writing instead of bulk writing.
I'm really unsure what to look for to get this working now.
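For comparison, the row-format variant mentioned above (the one that does write to the bucket successfully) would look roughly like the sketch below; the SimpleStringEncoder is an assumed stand-in for whatever encoder is actually in use.
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

// Sketch of the row-format sink referred to above; the encoder choice is an assumption.
StreamingFileSink<Metric> rowSink = StreamingFileSink
    .forRowFormat(new Path("s3a://<BUCKET>/raw"), new SimpleStringEncoder<Metric>("UTF-8"))
    .withBucketAssigner(new EventTimeBucketAssigner())
    .build();
The failing bulk-format path differs from this only in the Parquet writer classes that appear in the stack trace above.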

Related

MongoDb error: 'cannot use 'j' option when a host does not have journaling enabled'

I was using Mongo in dev just fine; when deploying the app into the test environment I got this error:
com.mongodb.MongoCommandException: Command failed with error 2 (BadValue): 'cannot use 'j' option when a host does not have journaling enabled' on server localhost:34653. The full response is {"ok": 0.0, "errmsg": "cannot use 'j' option when a host does not have journaling enabled", "code": 2, "codeName": "BadValue"}
Full stack trace:
Caused by: com.mongodb.MongoCommandException: Command failed with error 2 (BadValue): 'cannot use 'j' option when a host does not have journaling enabled' on server localhost:34653. The full response is {"ok": 0.0, "errmsg": "cannot use 'j' option when a host does not have journaling enabled", "code": 2, "codeName": "BadValue"}
at com.mongodb.internal.connection.ProtocolHelper.getCommandFailureException(ProtocolHelper.java:175)
at com.mongodb.internal.connection.InternalStreamConnection.receiveCommandMessageResponse(InternalStreamConnection.java:303)
at com.mongodb.internal.connection.InternalStreamConnection.sendAndReceive(InternalStreamConnection.java:259)
at com.mongodb.internal.connection.UsageTrackingInternalConnection.sendAndReceive(UsageTrackingInternalConnection.java:99)
at com.mongodb.internal.connection.DefaultConnectionPool$PooledConnection.sendAndReceive(DefaultConnectionPool.java:450)
at com.mongodb.internal.connection.CommandProtocolImpl.execute(CommandProtocolImpl.java:72)
at com.mongodb.internal.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:218)
at com.mongodb.internal.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:269)
at com.mongodb.internal.connection.DefaultServerConnection.command(DefaultServerConnection.java:131)
at com.mongodb.internal.connection.DefaultServerConnection.command(DefaultServerConnection.java:123)
at com.mongodb.operation.CommandOperationHelper.executeWriteCommand(CommandOperationHelper.java:369)
at com.mongodb.operation.CommandOperationHelper.executeWriteCommand(CommandOperationHelper.java:360)
at com.mongodb.operation.CommandOperationHelper.executeCommand(CommandOperationHelper.java:284)
at com.mongodb.operation.CommandOperationHelper.executeCommand(CommandOperationHelper.java:277)
at com.mongodb.operation.CreateIndexesOperation$1.call(CreateIndexesOperation.java:177)
at com.mongodb.operation.CreateIndexesOperation$1.call(CreateIndexesOperation.java:172)
at com.mongodb.operation.OperationHelper.withConnectionSource(OperationHelper.java:530)
at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:492)
at com.mongodb.operation.CreateIndexesOperation.execute(CreateIndexesOperation.java:172)
at com.mongodb.operation.CreateIndexesOperation.execute(CreateIndexesOperation.java:72)
at com.mongodb.client.internal.MongoClientDelegate$DelegateOperationExecutor.execute(MongoClientDelegate.java:206)
at com.mongodb.client.internal.MongoCollectionImpl.executeCreateIndexes(MongoCollectionImpl.java:886)
at com.mongodb.client.internal.MongoCollectionImpl.createIndexes(MongoCollectionImpl.java:869)
at com.mongodb.client.internal.MongoCollectionImpl.createIndexes(MongoCollectionImpl.java:864)
at com.mongodb.client.internal.MongoCollectionImpl.createIndex(MongoCollectionImpl.java:849)
at com.github.cloudyrock.mongock.driver.mongodb.sync.v4.repository.MongoSync4RepositoryBase.createRequiredUniqueIndex(MongoSync4RepositoryBase.java:99)
at com.github.cloudyrock.mongock.driver.mongodb.sync.v4.repository.MongoSync4RepositoryBase.ensureIndex(MongoSync4RepositoryBase.java:58)
at com.github.cloudyrock.mongock.driver.mongodb.sync.v4.repository.MongoSync4RepositoryBase.initialize(MongoSync4RepositoryBase.java:43)
at com.github.cloudyrock.mongock.driver.core.driver.ConnectionDriverBase.initialize(ConnectionDriverBase.java:40)
at com.github.cloudyrock.mongock.runner.core.executor.MigrationExecutor.initializationAndValidation(MigrationExecutor.java:225)
at com.github.cloudyrock.spring.v5.core.SpringMigrationExecutor.initializationAndValidation(SpringMigrationExecutor.java:31)
at com.github.cloudyrock.mongock.runner.core.executor.MigrationExecutor.executeMigration(MigrationExecutor.java:63)
at com.github.cloudyrock.spring.v5.core.SpringMigrationExecutor.executeMigration(SpringMigrationExecutor.java:37)
at com.github.cloudyrock.mongock.runner.core.executor.MongockRunnerBase.execute(MongockRunnerBase.java:53)
... 49 common frames omitted
dependencies:
<dependency>
<groupId>com.github.cloudyrock.mongock</groupId>
<artifactId>mongock-bom</artifactId>
<version>4.3.8</version>
<type>pom</type>
<scope>import</scope>
</dependency>
<dependency>
<groupId>com.github.cloudyrock.mongock</groupId>
<artifactId>mongock-spring-v5</artifactId>
</dependency>
<dependency>
<groupId>com.github.cloudyrock.mongock</groupId>
<artifactId>mongodb-springdata-v3-driver</artifactId>
</dependency>
configuration:
@Bean
public MongockSpring5.MongockApplicationRunner mongockApplicationRunner(
ApplicationContext springContext,
MongoTemplate mongoTemplate) {
log.debug("Configuring Mongock");
return MongockSpring5.builder()
.setDriver(SpringDataMongoV3Driver.withDefaultLock(mongoTemplate))
// package to scan for migrations
.addChangeLogsScanPackage("ru.fabit.visor.config.dbmigrations")
.setSpringContext(springContext)
.setEnabled(true)
.buildApplicationRunner();
}
I set the command mongod --journal, but I get the same error.
The error is clear: you should enable journaling on the server. Look here. Another option: don't configure the journal in writeConcern (which is not recommended).
This is happening because Mongock, by default, requires strong consistency (it's the only way to guarantee a change is applied only once). This means that MongoDB needs to have journaling enabled.
However, as said, that's the default configuration (and highly recommended for production), but you can relax it for specific scenarios, such as tests, where you may want to set up an in-memory MongoDB (which by definition doesn't support journaling). Take a look at the Mongock documentation for MongoDB.
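For a test-only setup along those lines, a minimal sketch (an illustration of relaxing the write concern, not the recommended production configuration; whether Mongock honours it depends on the driver version) is to give the MongoTemplate a write concern that does not request the journal:
import com.mongodb.WriteConcern;
import org.springframework.data.mongodb.core.MongoTemplate;

// Sketch only: relax the write concern so writes no longer request journaling.
// Keep journaling enabled (and this line removed) in production.
mongoTemplate.setWriteConcern(WriteConcern.ACKNOWLEDGED.withJournal(false));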

How to manage RecordTooLargeException avoiding Flink job restarting

Is there any way to ignore oversized messages without the Flink job restarting?
If I try to produce (using KafkaSink) a message which is too large (greater than max.message.bytes), a RecordTooLargeException occurs, the Flink job restarts, and this "exception and restart" cycle repeats endlessly!
I don't need to increase message size limits such as max.message.bytes (Kafka topic config) and max.request.size (Flink producer config); they are fine, they are already big. I just want to handle the situation when an unrealistically large message is about to be produced. In this case the big message should be ignored, an error should be logged, no runtime exception should be thrown, and the endless restart loop should not start.
I tried to use ProducerInterceptor: it cannot intercept/reject a message, it can only modify it.
I tried to ignore oversized messages in SerializationSchema (implemented a custom wrapper of SerializationSchema): it cannot discard a message from being produced either.
I am trying to override the KafkaWriter and KafkaSink classes, but it seems to be challenging.
I will be grateful for any advice!
A few quick environment details:
Kafka version is 2.8.1
Flink code is Java code based on the newer KafkaSource/KafkaSink API, not the older KafkaConsumer/KafkaProducer API.
The flink-clients and flink-connector-kafka version is 1.15.0
Code sample which throws the RecordTooLargeException:
int numberOfRows = 1;
int rowsPerSecond = 1;
DataStream<String> stream = environment.addSource(
new DataGeneratorSource<>(
RandomGenerator.stringGenerator(1050000), // max.message.bytes=1048588
rowsPerSecond,
(long) numberOfRows),
TypeInformation.of(String.class))
.setParallelism(1)
.name("string-generator");
KafkaSinkBuilder<String> builder = KafkaSink.<String>builder()
.setBootstrapServers("localhost:9092")
.setDeliverGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
.setRecordSerializer(
KafkaRecordSerializationSchema.builder().setTopic("test.output")
.setValueSerializationSchema(new SimpleStringSchema())
.build());
KafkaSink<String> sink = builder.build();
stream.sinkTo(sink).setParallelism(1).name("output-producer");
Exception Stack Trace:
2022-06-02/14:01:45.066/PDT [flink-akka.actor.default-dispatcher-4] INFO output-producer: Writer -> output-producer: Committer (1/1) (a66beca5a05c1c27691f7b94ca6ac025) switched from RUNNING to FAILED on 271b1b90-7d6b-4a34-8116-3de6faa8a9bf @ 127.0.0.1 (dataPort=-1).
org.apache.flink.util.FlinkRuntimeException: Failed to send data to Kafka null with FlinkKafkaInternalProducer{transactionalId='null', inTransaction=false, closed=false}
at org.apache.flink.connector.kafka.sink.KafkaWriter$WriterCallback.throwException(KafkaWriter.java:440) ~[flink-connector-kafka-1.15.0.jar:1.15.0]
at org.apache.flink.connector.kafka.sink.KafkaWriter$WriterCallback.lambda$onCompletion$0(KafkaWriter.java:421) ~[flink-connector-kafka-1.15.0.jar:1.15.0]
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50) ~[flink-streaming-java-1.15.0.jar:1.15.0]
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90) ~[flink-streaming-java-1.15.0.jar:1.15.0]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsNonBlocking(MailboxProcessor.java:353) ~[flink-streaming-java-1.15.0.jar:1.15.0]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:317) ~[flink-streaming-java-1.15.0.jar:1.15.0]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:201) ~[flink-streaming-java-1.15.0.jar:1.15.0]
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:804) ~[flink-streaming-java-1.15.0.jar:1.15.0]
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:753) ~[flink-streaming-java-1.15.0.jar:1.15.0]
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948) ~[flink-runtime-1.15.0.jar:1.15.0]
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927) ~[flink-runtime-1.15.0.jar:1.15.0]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741) ~[flink-runtime-1.15.0.jar:1.15.0]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563) ~[flink-runtime-1.15.0.jar:1.15.0]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_292]
Caused by: org.apache.kafka.common.errors.RecordTooLargeException: The message is 1050088 bytes when serialized which is larger than 1048576, which is the value of the max.request.size configuration.
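For illustration, one workaround not tried above is to drop oversized payloads with a filter before they reach the KafkaSink. A minimal sketch follows; the class name and logger are made up for the example, and the size limit is an assumption derived from the max.request.size value in the stack trace.
import java.nio.charset.StandardCharsets;
import org.apache.flink.api.common.functions.FilterFunction;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Sketch: reject records that would exceed the producer limit instead of letting the sink fail.
public class OversizedRecordFilter implements FilterFunction<String> {
    private static final Logger LOG = LoggerFactory.getLogger(OversizedRecordFilter.class);
    private static final int MAX_RECORD_BYTES = 1_000_000; // margin below max.request.size (1048576)

    @Override
    public boolean filter(String value) {
        int size = value.getBytes(StandardCharsets.UTF_8).length;
        if (size > MAX_RECORD_BYTES) {
            LOG.error("Dropping oversized record of {} bytes", size);
            return false; // record is dropped, the job keeps running
        }
        return true;
    }
}

// Usage with the code sample above: filter first, then hand the stream to the sink.
stream.filter(new OversizedRecordFilter())
      .sinkTo(sink)
      .setParallelism(1)
      .name("output-producer");
The limit deliberately leaves a margin, since the final Kafka record is slightly larger than the raw string once headers and record overhead are added (1050088 bytes for a 1050000-character payload in the trace above).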

Siddhi HTTP NoSuchMethodError

This question is about the Java library of Siddhi CEP.
Description:
I tried to establish an HTTP source to receive data. There was no error when creating the runtime and starting it.
[nioEventLoopGroup-2-1] INFO org.wso2.transport.http.netty.listener.ServerConnectorBootstrap$HTTPServerConnector - HTTP(S) Interface starting on host localhost and port 9056
[main] INFO org.wso2.extension.siddhi.io.http.source.HttpConnectorPortBindingListener - siddhi: started HTTP server connector localhost:9056
[main] INFO org.wso2.extension.siddhi.io.http.source.HttpSourceListener - Source Listener has created for url http://localhost:9056/endpoints/
However, when I send a POST request to the designated address, I get an error:
[nioEventLoopGroup-3-1] ERROR org.wso2.extension.siddhi.io.http.source.HTTPConnectorListener - Error in http server connector
java.lang.NoSuchMethodError: io.netty.handler.codec.http.HttpRequest.method()Lio/netty/handler/codec/http/HttpMethod;
at org.wso2.transport.http.netty.listener.CustomHttpContentCompressor.decode(CustomHttpContentCompressor.java:44)
at org.wso2.transport.http.netty.listener.CustomHttpContentCompressor.decode(CustomHttpContentCompressor.java:14)
at io.netty.handler.codec.MessageToMessageCodec$2.decode(MessageToMessageCodec.java:81)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
at io.netty.handler.codec.MessageToMessageCodec.channelRead(MessageToMessageCodec.java:111)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:318)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:304)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:276)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:354)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:318)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:304)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
at java.lang.Thread.run(Thread.java:748)
Could anyone suggest what I might have done wrong? Thank you in advance.
Affected Product Version:
4.1.17
OS, DB, other environment details and versions:
IntelliJ IDEA 2017.3.5 (Community Edition)
Build #IC-173.4674.33, built on March 6, 2018
JRE: 1.8.0_152-release-1024-b15 amd64
JVM: OpenJDK 64-Bit Server VM by JetBrains s.r.o
Windows 10 10.0
Steps to reproduce:
The test code I wrote:
import org.wso2.siddhi.core.SiddhiAppRuntime;
import org.wso2.siddhi.core.SiddhiManager;
import org.wso2.siddhi.core.event.Event;
import org.wso2.siddhi.core.stream.output.StreamCallback;
import org.wso2.siddhi.core.util.EventPrinter;
//import org.wso2.extension.siddhi.io.http.source.*;
public class httpTest
{
public static void main(String[] args) {
String siddhiString = "@App:name(\"haha\") " +
"@App:description(\"fasd\") " +
"@App:statistics(reporter = \"jmx\", interval = \"30\") " +
"@source(type=\"http\",receiver.url=\"http://localhost:9056/endpoints/\",@map(type=\"text\",fail.on.missing.attribute=\"true\",regex.A=\"(.*)\",@attributes(data=\"A\"))) " +
"@sink(type=\"mqtt\",url=\"tcp://120.78.71.179:1883\",topic=\"34\",@map(type=\"text\")) " +
"define stream a4P068X5YCK(data String);";
SiddhiManager siddhiManager = new SiddhiManager();
SiddhiAppRuntime siddhiAppRuntime = siddhiManager.createSiddhiAppRuntime(siddhiString);
siddhiAppRuntime.addCallback("a4P068X5YCK", new StreamCallback() {
@Override
public void receive(Event[] events) {
EventPrinter.print(events);
}
});
siddhiAppRuntime.start();
}
}
Then I send a POST request to http://localhost:9056/endpoints/. It returns the exception posted above.
Update:
I went back and checked the siddhi-io-http GitHub documentation page. I found that it says:
... This extension only works inside the WSO2 Data Analytic Server and cannot be run with standalone siddhi.
I guess this might suggest that HTTP is not supported by the standalone Siddhi library at the moment. I have submitted an issue on the Siddhi repository page to ask for confirmation.
Update 2:
I have changed my Siddhi query so that it copies the source stream into a separate sink stream. The other parts of the code remain the same:
String siddhiString = "@App:name(\"haha\") " +
"@App:description(\"fasd\") " +
"@App:statistics(reporter = \"jmx\", interval = \"30\") " +
"@source(type=\"http\",receiver.url=\"http://localhost:9056/endpoints/\",@map(type=\"text\",fail.on.missing.attribute=\"true\",regex.A=\"(.*)\",@attributes(data=\"A\"))) " +
"define stream a4P068X5YCK(data String); " +
"@sink(type=\"mqtt\",url=\"tcp://120.78.71.179:1883\",topic=\"34\",@map(type=\"text\")) " +
"define stream pout(data String); " +
"from a4P068X5YCK " +
"select * " +
"insert into pout; " +
"";
The same problem still exists. I tried the WSO2 processor and it works fine. Now my guesses are:
1. a version mismatch
2. my project lacks some packages that are in the WSO2 processor dependencies.
I will try to investigate in those two directions and will update here and on the issue page as soon as I find something new.
Update 3:
As I keep adding updates the formatting seems to have some problems, but fortunately this issue also comes to an end. I tried including all the dependencies from the WSO2 processor source code and my test program started working. Therefore I assume there is a component in the WSO2 processor that the Siddhi library is lacking.
I deleted the dependencies one by one to see if my test program still worked, and finally found the package. With this package my code works well.
<dependency>
<groupId>org.wso2.msf4j</groupId>
<artifactId>org.wso2.msf4j.feature</artifactId>
<version>${msf4j.version}</version>
<type>zip</type>
</dependency>
As I am a beginner at coding, I am not exactly sure what the problem was. I would be grateful if someone could explain the reason behind it. I appreciate all the help received in this process; it has also been a great experience for me.
Update 4: @Grainier I tried the sample code you posted and it actually worked, although I still have no idea why. I copied your exact code into a new .java file in my project and it still won't work, so I guess it has something to do with the POM file.
Something I noticed is that when I ran your sample code, a few more WARNINGs were printed to the console. SMALL UPDATE: I have found that the warnings appeared because I was using JDK 10. As soon as I switched back to 1.8 the warnings disappeared and the code still works, so maybe this is not the reason.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by io.netty.util.internal.ReflectionUtil (file:/C:/Users/ktz001/.m2/repository/io/netty/netty-common/4.1.16.Final/netty-common-4.1.16.Final.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of io.netty.util.internal.ReflectionUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
The second difference is in the POM file: you have one more repository added compared to mine.
<repository>
<id>wso2-nexus</id>
<name>WSO2 internal Repository</name>
<url>http://maven.wso2.org/nexus/content/groups/wso2-public/</url>
<releases>
<enabled>true</enabled>
<updatePolicy>daily</updatePolicy>
<checksumPolicy>ignore</checksumPolicy>
</releases>
</repository>
It would be great if you could suggest any reason.
Thank you for all of your work! It has been really helpful.
There seems to be an issue with the documentation... This should work with standalone Siddhi. All you have to do is add the following dependencies to your project (also MQTT, which I haven't included below):
<dependencies>
<dependency>
<groupId>org.wso2.siddhi</groupId>
<artifactId>siddhi-core</artifactId>
<version>${siddhi.version}</version>
</dependency>
<dependency>
<groupId>org.wso2.siddhi</groupId>
<artifactId>siddhi-annotations</artifactId>
<version>${siddhi.version}</version>
</dependency>
<dependency>
<groupId>org.wso2.siddhi</groupId>
<artifactId>siddhi-query-compiler</artifactId>
<version>${siddhi.version}</version>
</dependency>
<dependency>
<groupId>org.wso2.extension.siddhi.io.http</groupId>
<artifactId>siddhi-io-http</artifactId>
<version>${siddhi.io.http.version}</version>
</dependency>
<dependency>
<groupId>org.wso2.extension.siddhi.map.text</groupId>
<artifactId>siddhi-map-text</artifactId>
<version>${siddhi.mapper.text.version}</version>
</dependency>
</dependencies>
However, there's an issue with your query: you have defined a @source and a @sink on a single stream, which is wrong. If you want to make it a passthrough, you have to define two streams (one for the source and one for the sink) and write a query to insert events from the source stream into the sink stream.
UPDATE:
A sample can be found here; please try that and see whether it works.

java.lang.LinkageError on Websphere while trying to load HttpUriRequest

I'm using CUPS4J for my project, which depends on http-client, http-core, and slf4j.
To resolve dependencies we use Maven, and I have defined dependencies as follows:
<dependency>
<groupId>cups4j</groupId>
<artifactId>cups4j</artifactId>
<version>0.6.4</version>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>4.0.3</version>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpcore</artifactId>
<version>4.1</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.7</version>
</dependency>
The cups4j dependency is on our Artifactory server (I couldn't find it online).
Everything works like a charm if I create a sample main method to print some document and launch it as a Java application.
When I publish my classes to the WebSphere server and call that method from a webpage, it generates a java.lang.LinkageError.
This is the relevant part of the stacktrace:
Caused by: java.lang.LinkageError: loader constraint violation: loader "org/eclipse/osgi/internal/baseadaptor/DefaultClassLoader#208c132" previously initiated loading for a different type with name "org/apache/http/client/methods/HttpUriRequest" defined by loader "com/ibm/ws/classloader/CompoundClassLoader#1e0f797"
at java.lang.ClassLoader.defineClassImpl(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:260)
at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.defineClass(DefaultClassLoader.java:188)
at org.eclipse.osgi.baseadaptor.loader.ClasspathManager.defineClass(ClasspathManager.java:580)
at org.eclipse.osgi.baseadaptor.loader.ClasspathManager.findClassImpl(ClasspathManager.java:550)
at org.eclipse.osgi.baseadaptor.loader.ClasspathManager.findLocalClassImpl(ClasspathManager.java:481)
at org.eclipse.osgi.baseadaptor.loader.ClasspathManager.findLocalClass_LockClassName(ClasspathManager.java:460)
at org.eclipse.osgi.baseadaptor.loader.ClasspathManager.findLocalClass(ClasspathManager.java:447)
at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.findLocalClass(DefaultClassLoader.java:216)
at org.eclipse.osgi.internal.loader.BundleLoader.findLocalClass(BundleLoader.java:393)
at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(BundleLoader.java:469)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:422)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:410)
at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.loadClass(DefaultClassLoader.java:107)
at java.lang.ClassLoader.loadClass(ClassLoader.java:612)
at org.apache.http.impl.client.AbstractHttpClient.determineTarget(AbstractHttpClient.java:584)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:708)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:700)
at org.cups4j.operations.IppOperation.sendRequest(IppOperation.java:207)
at org.cups4j.operations.IppOperation.request(IppOperation.java:76)
at org.cups4j.CupsPrinter.print(CupsPrinter.java:113)
at it.dropcomp.tasks.print.PrinterService.printPDF(PrinterService.java:160)
This is the method that prints the PDF (Inside it.dropcomp.tasks.print.PrinterService):
public void printPDF() throws RemoteServiceException {
/*
* generatedPDF is defined as File, and it's properly initialized
* before calling this method.
*/
if(generatedPDF == null) {
throw new RemoteServiceException("You must generate a file first!");
}
try {
CupsPrinter selectedPrinter = new CupsPrinter(
new URL(Constants.PRINTER_FULL_URL),
Constants.PRINTER_NAME, true
);
InputStream is = new FileInputStream(generatedPDF);
PrintJob pj = new PrintJob.Builder(is).build();
selectedPrinter.print(pj); //this is line 160
} catch (Exception e) {
LOG.error("Exception", e);
throw new RemoteServiceException(e);
}
}
It seems that HttpUriRequest already exists on the server and conflicts with the one provided by Apache's httpclient library, but if I try removing that dependency from pom.xml, I get a NoClassDefFoundException for that class.
If it matters, my IDE is Eclipse Luna.
How can I solve this exception?
WebSphere also uses the httpclient library, which may conflict with the one you are providing.
Try creating an isolated shared library in the admin console via Environment > Shared Libraries. Put the http-*, slf4j, and cups4j JARs there and associate that shared library with your application.

ElasticSearch: xerial.snappy error FAILED_TO_LOAD_NATIVE_LIBRARY

I'm trying to run an Elasticsearch client and am getting the xerial.snappy error FAILED_TO_LOAD_NATIVE_LIBRARY.
I'm using Elasticsearch v0.20.5:
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>0.20.5</version>
</dependency>
and also added snappy v1.0.4.1 to my dependencies (but it did not help either):
<dependency>
<groupId>org.xerial.snappy</groupId>
<artifactId>snappy-java</artifactId>
<version>1.0.4.1</version>
</dependency>
Here is the error I'm getting (my app continues to run, but I suspect the compression lib is not in use):
INFO Log4jESLogger.internalInfo - [Human Top II] loaded [], sites []
DEBUG Log4jESLogger.internalDebug - using [UnsafeChunkDecoder] decoder
DEBUG Log4jESLogger.internalDebug - failed to load xerial snappy-java
org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] null
at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:229)
at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
at org.elasticsearch.common.compress.snappy.xerial.XerialSnappy.<clinit>(XerialSnappy.java:42)
at org.elasticsearch.common.compress.CompressorFactory.<clinit>(CompressorFactory.java:58)
at org.elasticsearch.client.transport.TransportClient.<init>(TransportClient.java:161)
at org.elasticsearch.client.transport.TransportClient.<init>(TransportClient.java:109)
My code that generates this issue:
public static void main(String[] args)
{
// Error happens during client creation...
Client client = new TransportClient().addTransportAddress(new InetSocketTransportAddress("localhost", 9300));
try
{
SearchResponse res = client.prepareSearch().execute().actionGet();
SearchHits hits = res.getHits();
}
finally
{
client.close();
}
}
Can anyone shed some light on this issue? How do I make Snappy load its native lib? I'm currently on Win7 64-bit, but want to run on AWS (CentOS, RHEL, etc.).
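Not a definitive fix, but a sketch of one thing to try: snappy-java reads a few system properties when loading its native library (verify the property names against the snappy-java version you actually have on the classpath), and setting them before the client is created controls where the bundled library is extracted or which locally installed library is used. The paths below are placeholders.
// Sketch only: must run before the first class that touches Snappy is loaded.
// Property names are snappy-java loader settings; confirm them for your snappy-java version.
System.setProperty("org.xerial.snappy.tempdir", "C:/temp/snappy"); // writable dir for extracting the bundled .dll/.so
// Or point at a natively installed snappy library instead of the bundled one:
// System.setProperty("org.xerial.snappy.lib.path", "C:/native/snappy");

Client client = new TransportClient()
        .addTransportAddress(new InetSocketTransportAddress("localhost", 9300));
Since the "failed to load xerial snappy-java" message is logged at DEBUG level and the app keeps running, the log above suggests the client falls back to a non-Snappy code path, so this may only matter if you specifically need Snappy compression.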
