I'm trying to measure the latency of my service at a lower level. Poking around, I saw that it is possible to register a stream tracer via addStreamTracerFactory on the gRPC server builder.
I've written a simple implementation like this and printed the logs:
val server = io.grpc.netty.NettyServerBuilder.forPort(ApplicationConfig.Service.bindPort).addStreamTracerFactory(ServerStreamTracerFactory)....
class Telemetry(fullMethodName: String, headers: Metadata) extends ServerStreamTracer with LazyLogging {
override def serverCallStarted(callInfo: ServerStreamTracer.ServerCallInfo[_, _]): Unit = {
logger.info(s"Telemetry '$fullMethodName' '$headers' callinfo:$callInfo")
super.serverCallStarted(callInfo)
}
override def inboundMessage(seqNo: Int): Unit = {
logger.info(s"inboundMessage $seqNo")
super.inboundMessage(seqNo)
}
override def inboundMessageRead(seqNo: Int, optionalWireSize: Long, optionalUncompressedSize: Long): Unit = {
logger.info(s"inboundMessageRead $seqNo $optionalWireSize $optionalUncompressedSize")
super.inboundMessageRead(seqNo, optionalWireSize, optionalUncompressedSize)
}
override def outboundMessage(seqNo: Int): Unit = {
logger.info(s"outboundMessage $seqNo")
super.outboundMessage(seqNo)
}
override def outboundMessageSent(seqNo: Int, optionalWireSize: Long, optionalUncompressedSize: Long): Unit = {
logger.info(s"outboundMessageSent $seqNo $optionalWireSize $optionalUncompressedSize")
super.outboundMessageSent(seqNo, optionalWireSize, optionalUncompressedSize)
}
override def streamClosed(status: Status): Unit = {
logger.info(s"streamClosed $status")
super.streamClosed(status)
}
}
object ServerStreamTracerFactory extends Factory with LazyLogging{
logger.info("called")
override def newServerStreamTracer(fullMethodName: String, headers: Metadata): ServerStreamTracer = {
logger.info(s"called with $fullMethodName $headers")
new Telemetry(fullMethodName, headers)
}
}
I'm running a simple gRPC client in a loop and examining the output of the server stream tracer.
I see that the "lifecycle" of logs repeats itself. Here is one iteration (but it spews out the exact same again and again):
22:15:06 INFO [grpc-default-worker-ELG-3-2] [newServerStreamTracer:38] [ServerStreamTracerFactory$] called with com.dy.affinity.service.AffinityService/getAffinities Metadata(content-type=application/grpc,user-agent=grpc-python/1.15.0 grpc-c/6.0.0 (osx; chttp2; glider),grpc-accept-encoding=identity,deflate,gzip,accept-encoding=identity,gzip)
22:15:06 INFO [grpc-default-executor-0] [serverCallStarted:8] [Telemetry] Telemetry 'com.dy.affinity.service.AffinityService/getAffinities' 'Metadata(content-type=application/grpc,user-agent=grpc-python/1.15.0 grpc-c/6.0.0 (osx; chttp2; glider),grpc-accept-encoding=identity,deflate,gzip,accept-encoding=identity,gzip)' callinfo:io.grpc.internal.ServerCallInfoImpl@5badffd8
22:15:06 INFO [grpc-default-worker-ELG-3-2] [inboundMessage:13] [Telemetry] inboundMessage 0
22:15:06 INFO [grpc-default-worker-ELG-3-2] [inboundMessageRead:17] [Telemetry] inboundMessageRead 0 19 -1
22:15:06 INFO [pool-1-thread-5] [outboundMessage:21] [Telemetry] outboundMessage 0
22:15:06 INFO [pool-1-thread-5] [outboundMessageSent:25] [Telemetry] outboundMessageSent 0 0 0
22:15:06 INFO [grpc-default-worker-ELG-3-2] [streamClosed:29] [Telemetry] streamClosed Status{code=OK, description=null, cause=null}
A few things that aren't quite clear to me from just looking at these logs:
Why is a new stream being created for each request? I thought the gRPC client is supposed to re-use the connection, so "stream closed" shouldn't be called, right?
If the stream is being re-used, how come the inboundMessage number (and outboundMessage) is always "0"? (Also, when I start multiple clients in parallel it is always 0.) In what case should the message number not be 0?
If the stream isn't being re-used, how should I be configuring the clients differently to re-use the connection?
In gRPC one HTTP/2 stream is created for each RPC (and if retries or hedging is enabled there can be more than one stream per RPC). HTTP/2 streams are multiplexed on one connection, and it's pretty cheap to open and close streams. So it's the connection being re-used, not the stream.
The seqNo you get from the tracer methods is the seqNo of messages within that stream, which starts from 0. It looks like you are doing unary RPCs, which make one request, get one response, and then close. What you see is totally normal.
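Since the original goal was to measure latency, here is a minimal Java sketch (my own illustration, not part of the answer above) of a tracer that records the time from stream creation to streamClosed, i.e. the server-side lifetime of each RPC:
import io.grpc.Metadata;
import io.grpc.ServerStreamTracer;
import io.grpc.Status;

// Hypothetical latency-measuring tracer: one instance is created per RPC stream.
class LatencyTracerFactory extends ServerStreamTracer.Factory {
    @Override
    public ServerStreamTracer newServerStreamTracer(String fullMethodName, Metadata headers) {
        final long startNanos = System.nanoTime(); // the stream is created here
        return new ServerStreamTracer() {
            @Override
            public void streamClosed(Status status) {
                long elapsedMicros = (System.nanoTime() - startNanos) / 1_000;
                // Replace with your metrics or logging backend of choice.
                System.out.printf("%s closed with %s after %d us%n",
                        fullMethodName, status.getCode(), elapsedMicros);
            }
        };
    }
}
It would be registered the same way as in the question, e.g. NettyServerBuilder.forPort(port).addStreamTracerFactory(new LatencyTracerFactory()).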
Is there any way to ignore oversized messages without the Flink job restarting?
If I try to produce (using KafkaSink) a message which is too large (greater than max.message.bytes), a RecordTooLargeException occurs and the Flink job restarts; this "exception & restart" cycle then repeats endlessly!
I don't need to increase the message size limits such as max.message.bytes (Kafka topic config) and max.request.size (Flink producer config); they are already large enough. I just want to handle the case where an unrealistically large message is about to be produced. In that case the message should be ignored, an error should be logged, no runtime exception should be thrown, and the endless restart loop should not start.
I tried to use ProducerInterceptor -> it cannot intercept/reject a message, it can just modify it.
I tried to ignore oversized messages in SerializationSchema (implemented a custom wrapper of SerializationSchema) -> it cannot discard a message from being produced either.
I am trying to override the KafkaWriter and KafkaSink classes, but it seems to be challenging.
I will be grateful for any advice!
A few quick environment details:
Kafka version is 2.8.1
Flink code is Java code based on the newer KafkaSource/KafkaSink API, not the older KafkaConsumer/KafkaProducer API.
The flink-clients and flink-connector-kafka version is 1.15.0
Code sample which throws the RecordTooLargeException:
int numberOfRows = 1;
int rowsPerSecond = 1;
DataStream<String> stream = environment.addSource(
new DataGeneratorSource<>(
RandomGenerator.stringGenerator(1050000), // max.message.bytes=1048588
rowsPerSecond,
(long) numberOfRows),
TypeInformation.of(String.class))
.setParallelism(1)
.name("string-generator");
KafkaSinkBuilder<String> builder = KafkaSink.<String>builder()
.setBootstrapServers("localhost:9092")
.setDeliverGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
.setRecordSerializer(
KafkaRecordSerializationSchema.builder().setTopic("test.output")
.setValueSerializationSchema(new SimpleStringSchema())
.build());
KafkaSink<String> sink = builder.build();
stream.sinkTo(sink).setParallelism(1).name("output-producer");
Exception Stack Trace:
2022-06-02/14:01:45.066/PDT [flink-akka.actor.default-dispatcher-4] INFO output-producer: Writer -> output-producer: Committer (1/1) (a66beca5a05c1c27691f7b94ca6ac025) switched from RUNNING to FAILED on 271b1b90-7d6b-4a34-8116-3de6faa8a9bf @ 127.0.0.1 (dataPort=-1).
org.apache.flink.util.FlinkRuntimeException: Failed to send data to Kafka null with FlinkKafkaInternalProducer{transactionalId='null', inTransaction=false, closed=false}
    at org.apache.flink.connector.kafka.sink.KafkaWriter$WriterCallback.throwException(KafkaWriter.java:440) ~[flink-connector-kafka-1.15.0.jar:1.15.0]
    at org.apache.flink.connector.kafka.sink.KafkaWriter$WriterCallback.lambda$onCompletion$0(KafkaWriter.java:421) ~[flink-connector-kafka-1.15.0.jar:1.15.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50) ~[flink-streaming-java-1.15.0.jar:1.15.0]
    at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90) ~[flink-streaming-java-1.15.0.jar:1.15.0]
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsNonBlocking(MailboxProcessor.java:353) ~[flink-streaming-java-1.15.0.jar:1.15.0]
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:317) ~[flink-streaming-java-1.15.0.jar:1.15.0]
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:201) ~[flink-streaming-java-1.15.0.jar:1.15.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:804) ~[flink-streaming-java-1.15.0.jar:1.15.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:753) ~[flink-streaming-java-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948) ~[flink-runtime-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927) ~[flink-runtime-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741) ~[flink-runtime-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563) ~[flink-runtime-1.15.0.jar:1.15.0]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_292]
Caused by: org.apache.kafka.common.errors.RecordTooLargeException: The message is 1050088 bytes when serialized which is larger than 1048576, which is the value of the max.request.size configuration.
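For what it's worth, here is a minimal sketch of one possible workaround (my own illustration, not part of the original post): drop oversized records with a filter before they reach the KafkaSink. It assumes the same UTF-8 encoding that SimpleStringSchema uses and a size limit chosen to stay below max.request.size; header and key bytes are not counted, so it only approximates the broker-side check.
import java.nio.charset.StandardCharsets;
import org.apache.flink.streaming.api.datastream.DataStream;

public class OversizedRecordFilter {
    // Assumed limit; keep it safely below max.request.size / max.message.bytes.
    private static final int MAX_RECORD_BYTES = 1_000_000;

    public static DataStream<String> dropOversized(DataStream<String> stream) {
        return stream
                .filter(value -> {
                    int size = value.getBytes(StandardCharsets.UTF_8).length;
                    if (size > MAX_RECORD_BYTES) {
                        // Log and drop instead of letting the producer fail the job.
                        System.err.println("Dropping oversized record of " + size + " bytes");
                        return false;
                    }
                    return true;
                })
                .name("oversized-record-filter");
    }
}
With this in place, stream.sinkTo(sink) in the sample above would become OversizedRecordFilter.dropOversized(stream).sinkTo(sink), so clearly oversized payloads never reach the producer and the restart loop is avoided.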
One of our applications just suffered from some nasty deadlocks. I had quite a hard time recreating the problem because the deadlock (or stack trace) did not show up immediately in my Java application logs.
To my surprise, the MarkLogic Java API retries failing requests (e.g. because of a deadlock). This might make sense if your request is not a multi-statement request, but otherwise I'm not sure it does.
So let's stick with this deadlock problem. I created a simple code snippet in which I create a deadlock on purpose. The snippet creates a document test.xml and then tries to read and write it from two different transactions, each on a new thread.
public static void main(String[] args) throws Exception {
final Logger root = (Logger) LoggerFactory.getLogger(Logger.ROOT_LOGGER_NAME);
final Logger ok = (Logger) LoggerFactory.getLogger(OkHttpServices.class);
root.setLevel(Level.ALL);
ok.setLevel(Level.ALL);
final DatabaseClient client = DatabaseClientFactory.newClient("localhost", 8000, new DatabaseClientFactory.DigestAuthContext("username", "password"));
final StringHandle handle = new StringHandle("<doc><name>Test</name></doc>")
.withFormat(Format.XML);
client.newTextDocumentManager().write("test.xml", handle);
root.info("t1: opening");
final Transaction t1 = client.openTransaction();
root.info("t1: reading");
client.newXMLDocumentManager()
.read("test.xml", new StringHandle(), t1);
root.info("t2: opening");
final Transaction t2 = client.openTransaction();
root.info("t2: reading");
client.newXMLDocumentManager()
.read("test.xml", new StringHandle(), t2);
new Thread(() -> {
root.info("t1: writing");
client.newXMLDocumentManager().write("test.xml", new StringHandle("<doc><t>t1</t></doc>").withFormat(Format.XML), t1);
t1.commit();
}).start();
new Thread(() -> {
root.info("t2: writing");
client.newXMLDocumentManager().write("test.xml", new StringHandle("<doc><t>t2</t></doc>").withFormat(Format.XML), t2);
t2.commit();
}).start();
TimeUnit.MINUTES.sleep(5);
client.release();
}
This code will produce the following log:
14:12:27.437 [main] DEBUG c.m.client.impl.OkHttpServices - Connecting to localhost at 8000 as admin
14:12:27.570 [main] DEBUG c.m.client.impl.OkHttpServices - Sending test.xml document in transaction null
14:12:27.608 [main] INFO ROOT - t1: opening
14:12:27.609 [main] DEBUG c.m.client.impl.OkHttpServices - Opening transaction
14:12:27.962 [main] INFO ROOT - t1: reading
14:12:27.963 [main] DEBUG c.m.client.impl.OkHttpServices - Getting test.xml in transaction 5298588351036278526
14:12:28.283 [main] INFO ROOT - t2: opening
14:12:28.283 [main] DEBUG c.m.client.impl.OkHttpServices - Opening transaction
14:12:28.286 [main] INFO ROOT - t2: reading
14:12:28.286 [main] DEBUG c.m.client.impl.OkHttpServices - Getting test.xml in transaction 8819382734425123844
14:12:28.289 [Thread-1] INFO ROOT - t1: writing
14:12:28.289 [Thread-1] DEBUG c.m.client.impl.OkHttpServices - Sending test.xml document in transaction 5298588351036278526
14:12:28.289 [Thread-2] INFO ROOT - t2: writing
14:12:28.290 [Thread-2] DEBUG c.m.client.impl.OkHttpServices - Sending test.xml document in transaction 8819382734425123844
Neither t1 nor t2 will get committed. The MarkLogic logs confirm that there actually is a deadlock:
==> /var/opt/MarkLogic/Logs/8000_AccessLog.txt <==
127.0.0.1 - admin [24/Nov/2018:14:12:30 +0000] "PUT /v1/documents?txid=5298588351036278526&category=content&uri=test.xml HTTP/1.1" 503 1034 - "okhttp/3.9.0"
==> /var/opt/MarkLogic/Logs/ErrorLog.txt <==
2018-11-24 14:12:30.719 Info: Deadlock detected locking Documents test.xml
This would not be a problem if one of the requests simply failed and threw an exception, but this is not the case. The MarkLogic Java API retries every request for up to 120 seconds, and one of the updates times out after roughly 120 seconds:
Exception in thread "Thread-1" com.marklogic.client.FailedRequestException: Service unavailable and maximum retry period elapsed: 121 seconds after 65 retries
at com.marklogic.client.impl.OkHttpServices.putPostDocumentImpl(OkHttpServices.java:1422)
at com.marklogic.client.impl.OkHttpServices.putDocument(OkHttpServices.java:1256)
at com.marklogic.client.impl.DocumentManagerImpl.write(DocumentManagerImpl.java:920)
at com.marklogic.client.impl.DocumentManagerImpl.write(DocumentManagerImpl.java:758)
at com.marklogic.client.impl.DocumentManagerImpl.write(DocumentManagerImpl.java:717)
at Scratch.lambda$main$0(scratch.java:40)
at java.lang.Thread.run(Thread.java:748)
What are possible ways to overcome this problem? One way might be to set a maximum time to live for a transaction (like 5 seconds), but this feels hacky and unreliable. Any other ideas? Are there any other settings I should check out?
I'm on MarkLogic 9.0-7.2 and using marklogic-client-api:4.0.3.
Edit: One way to solve the deadlock would be to synchronize the calling function, which is actually how I solved it in my case (see comments). But I think the underlying problem still exists: a deadlock in a multi-statement transaction should not be hidden away behind a 120-second timeout. I'd rather have an immediately failing request than a 120-second lock on one of my documents plus 64 failing retries per thread.
Deadlocks are usually resolvable by retrying. Internally, the server does an inner retry loop because deadlocks are usually transient and incidental, lasting a very short time. In your case you have constructed a scenario that will never succeed with any timeout that's equal for both threads.
Deadlocks can be avoided at the application layer by avoiding multi-statement transactions when using the REST API (which is what the Java API uses).
Multi-statement transactions over REST cannot be implemented 100% safely, because the client is responsible for managing the transaction ID while the server cannot detect client-side errors or client-side identity. Very subtle problems can and do occur unless you are aggressively proactive with respect to error handling and multithreading. If you 'push' the logic to the server (XQuery or JavaScript), the server is able to manage things much better.
As for whether it's 'good' or not for the Java API to implement retries in this case, that's debatable either way. (The compromise for a seemingly easy-to-use interface is that many things that would otherwise be options are decided for you as a convention. There's generally no one-size-fits-all answer. In this case I presume the thought was that a deadlock is more likely caused by independent code/logic colliding by 'accident' rather than by identical code running in tandem; a retry in that case would be a good choice. In your example it's not, but then an earlier failure would still fail predictably until you change your code to 'not do that'.)
If it doesn't already exist, a feature request for configurable timeout and retry behaviour seems reasonable. I would recommend, however, trying to avoid any REST calls that result in an open transaction -- that is inherently problematic, particularly if you don't notice the problem up front (then it's more likely to bite you in production). Unlike JDBC, which keeps a connection open so that the server can detect client disconnects, HTTP and the ML REST API do not -- which leads to a different programming model than traditional database coding in Java.
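If a hard upper bound on how long a transaction may stay open is acceptable, here is a minimal sketch of the time-limited variant the question already hints at (my own illustration; check the exact openTransaction overloads available in your client version):
import com.marklogic.client.DatabaseClient;
import com.marklogic.client.DatabaseClientFactory;
import com.marklogic.client.Transaction;

public class TimeLimitedTransaction {
    public static void main(String[] args) {
        DatabaseClient client = DatabaseClientFactory.newClient("localhost", 8000,
                new DatabaseClientFactory.DigestAuthContext("username", "password"));
        try {
            // The second argument is assumed to be a time limit in seconds after
            // which the server rolls the transaction back on its own.
            Transaction txn = client.openTransaction("short-lived-txn", 5);
            try {
                // ... reads and writes that pass txn ...
                txn.commit();
            } catch (RuntimeException e) {
                txn.rollback();
                throw e;
            }
        } finally {
            client.release();
        }
    }
}
This does not remove the deadlock itself; it only caps how long the document stays locked while the retries run, which is exactly the 'hacky' mitigation the question mentions.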
My configuration for the consumer is as documented in the Spring Cloud Stream consumer properties documentation.
spring-cloud-dependencies:Finchley.SR1
springBootVersion = '2.0.5.RELEASE'
I have 4 partitions for the kstream_test topic, and they are filled with messages from the producer as seen below:
root#kafka:/# kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic kstream_test --time -1
kstream_test:2:222
kstream_test:1:203
kstream_test:3:188
kstream_test:0:278
My Spring Cloud Stream Kafka binder based configuration is:
spring.cloud.stream.bindings.input:
destination: kstream_test
group: consumer-group-G1_test
consumer:
useNativeDecoding: true
headerMode: raw
startOffset: latest
partitioned: true
concurrency: 3
KStream Listener class
@StreamListener
@SendTo(MessagingStreams.OUTPUT)
public KStream<?, ?> process(@Input(MessagingStreams.INPUT) KStream<?, ?> kstreams) {
......
log.info("Got a message");
......
return kstreams;
}
My producer sends 100 messages in one run. But the logs seem to show only one thread, StreamThread-1, handling the messages, even though I have concurrency set to 3. What might be wrong here? Are 100 messages not enough to see the concurrency at play?
2018-10-18 11:50:01.923 INFO 10228 --- [-StreamThread-1] c.c.c.s.KStreamHandler : Got a message
2018-10-18 11:50:01.923 INFO 10228 --- [-StreamThread-1] c.c.c.s.KStreamHandler : Got a message
2018-10-18 11:50:01.945 INFO 10228 --- [-StreamThread-1] c.c.c.s.KStreamHandler : Got a message
2018-10-18 11:50:01.956 INFO 10228 --- [-StreamThread-1] c.c.c.s.KStreamHandler : Got a message
2018-10-18 11:50:01.972 INFO 10228 --- [-StreamThread-1] c.c.c.s.KStreamHandler : Got a message
UPDATE:
As per the answer, the below num.stream.threads configuration works at the binder level.
spring.cloud.stream.kafka.streams.binder.configuration:
num.stream.threads: 3
It seems that the num.stream.threads needs to be set to increase the concurrency...
/** {@code num.stream.threads} */
@SuppressWarnings("WeakerAccess")
public static final String NUM_STREAM_THREADS_CONFIG = "num.stream.threads";
private static final String NUM_STREAM_THREADS_DOC = "The number of threads to execute stream processing.";
...it defaults to 1.
The binder should really set that based on the ...consumer.concurrency property; please open a GitHub issue to that effect against the binder.
In the meantime, you can just set that property directly in ...consumer.configuration.
CORRECTION
I've just been told that the ...consumer.configuration is not currently applied to the streams binder either; you would have to set it at the binder level.
I have a simple application that exposes a RESTful GET endpoint called 'getAllDeviceData', which simply returns a list of data fetched for all devices from the device table in a DB.
For each request I authenticate the user by validating HttpServletRequest.getUserPrincipal().
To speed up the process I have used parallelStream() with lambda expressions.
Inside the parallel stream I invoke another method called 'getDeviceData', in which I do the authentication and fetch the data from the DB.
The problem is that when the parallel stream invokes the getDeviceData method, I get a NullPointerException and the parallel stream fails to complete.
The cause is that HttpServletRequest.getUserPrincipal() is null inside that method, even though it is present in 'getAllDeviceData' (where the lambda expression is).
This works without any issue if I replace 'parallelStream()' with just 'stream()', but then of course the processing is no longer parallel.
@Override
@ResponseBody
@RequestMapping(value = "getAllDeviceData", method = RequestMethod.GET, consumes = "*")
public List<List<Data>> getAllDeviceData(
@RequestParam(value = "recordLimit", required = false) final Integer recordLimit,
final HttpServletRequest request) {
final List<Device> deviceList = deviceService.getAllDevices();
final List<List<Data>> dataList = deviceList.parallelStream().map(device -> getDeviceData(recordLimit, device.getDeviceId(), request)).collect(Collectors.toList());
return dataList;
}
private List<Data> getDeviceData(@RequestParam(value = "recordLimit", required = false) Integer recordLimit, String deviceId, HttpServletRequest request) {
if(request.getUserPrincipal() == null){
logger.info("User Principle Null - 1");
}else {
logger.info("User Principle Not Null - 1");
}
authService.doAuthenticate(request);
// if authenticated, proceed with the following...
List<Data> deviceData = deviceService.getGetDeviceData(deviceId);
return deviceData;
}
However, I have observed something.
Look at the following log of the above application (unnecessary parts have been omitted).
In it, the main request threads (e.g. http-nio-7070-exec-2, i.e. the threads of the application server's thread pool) work fine, because they print 'User Principle Not Null - 1'; but in the forked worker threads of the parallel stream, such as ForkJoinPool.commonPool-worker-2, HttpServletRequest.getUserPrincipal() becomes null.
2018-01-15 15:28:06,897 INFO [http-nio-7070-exec-2] User Principle Not Null - 1
2018-01-15 15:28:06,897 INFO [ForkJoinPool.commonPool-worker-2] User Principle Null - 1
2018-01-15 15:28:06,906 INFO [ForkJoinPool.commonPool-worker-3] User Principle Null - 1
2018-01-15 15:28:06,955 INFO [ForkJoinPool.commonPool-worker-2] User Principle Null - 1
2018-01-15 15:28:06,955 INFO [ForkJoinPool.commonPool-worker-1] User Principle Null - 1
2018-01-15 15:28:06,957 INFO [ForkJoinPool.commonPool-worker-2] User Principle Null - 1
2018-01-15 15:28:06,959 INFO [ForkJoinPool.commonPool-worker-3] User Principle Null - 1
2018-01-15 15:28:07,064 INFO [ForkJoinPool.commonPool-worker-2] User Principle Null - 1
2018-01-15 15:28:07,076 INFO [http-nio-7070-exec-2] User Principle Not Null -1
2018-01-15 15:28:07,078 INFO [ForkJoinPool.commonPool-worker-1] User Principle Null - 1
I am still new to lambda expressions and parallel stream.
Please help me understand what is the issue here.
Java Details:
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
The root cause is that Spring injects a SecurityContextHolderAwareRequestWrapper instance into your method. This wrapper calls the following lines when request.getUserPrincipal() is called:
private Authentication getAuthentication() {
Authentication auth = SecurityContextHolder.getContext().getAuthentication();
SecurityContextHolder has different strategies. By default the MODE_THREADLOCAL strategy is used. That's why you have a user principal in the main threads but don't have one in the ForkJoinPool threads.
The -Dspring.security.strategy=MODE_INHERITABLETHREADLOCAL VM option is a solution to your problem. The InheritableThreadLocal javadoc and the InheritableThreadLocalSecurityContextHolderStrategy source code might bring additional value to the understanding.
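Equivalently, as a sketch of the programmatic route rather than the VM option, the strategy can be switched once at startup, before the first request is handled:
import org.springframework.security.core.context.SecurityContextHolder;

public class SecurityContextStrategyConfig {
    // Call once during application startup (e.g. from the main method) so that
    // threads forked later inherit the SecurityContext of their parent thread.
    public static void useInheritableThreadLocal() {
        SecurityContextHolder.setStrategyName(
                SecurityContextHolder.MODE_INHERITABLETHREADLOCAL);
    }
}
One caveat worth noting: inheritable thread locals are copied only when a thread is created, so re-used ForkJoinPool common-pool workers keep whatever context the thread that created them had.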
I'm trying to call 2 xsl transforms in my route via recipientList. I have one wrapped in a choice step, and the other added as normal:
from("direct:cosTransform")
.routeId(TransformerConstants.TRANSFORM_XSLT_ROUTE)
.process(exchange -> {
// Get the xml payload from the exchange body
final String xml = exchange.getIn().getBody(String.class);
// Determine LOB and set as header. If the LOB is invalid, the xslt step will fail.
final String lob = TransformerUtil.getLOBFromAcordXml(xml);
if (null == lob) {
throw new GeneralException("Could not derive the LOB from the ACORD Form. ACORD Form["
+ TransformerUtil.getAcordFormFromAcordXml(xml) + "]");
}
exchange.getIn().setHeader("lineOfBusiness", lob);
// Market should be NI or BI. If the market is invalid, the xslt step will fail.
final String market = TransformerUtil.getMarketFromAccordXml(xml);
if (!StringUtils.equals(market, TransformerConstants.BI_MARKET)
&& !StringUtils.equals(market, TransformerConstants.NI_MARKET)) {
throw new GeneralException("Missing or invalid market[" + market + "].");
}
exchange.getIn().setHeader("market", market);
})
.log("Executing an xsl transform for Market=${header.market} and LOB=${header.lineOfBusiness}")
.choice()
.when(header("market").isEqualTo("NI"))
.recipientList(simple("xslt:./xsl/${header.market}/Common.xsl?saxon=true&contentCache=false")).id("commonTransformNI")
.log("after common :${body}")
.endChoice()
.recipientList(simple("xslt:./xsl/${header.market}/${header.lineOfBusiness}.xsl?saxon=true&contentCache=false"))
.log("after lob : ${body}")
.choice().id("postTransform")
.when(header("market").isEqualTo("NI"))
.process(exchange -> {
String xml = exchange.getIn().getBody(String.class);
xml = TransformerUtil.setUniqueIds(xml);
exchange.getIn().setBody(xml);
})
.endChoice()
.end();
}
When an exchange to be transformed (with the relevant market header) hits the first recipientList call (Common.xsl), the XML is transformed and it works fine. When it hits the second call, I get the following in the console:
[ #0 - seda://transform-receive] [CLM] [CID=UNKNOWN] o.a.c.impl.ProcessorEndpoint$1.doStart DEBUG Starting producer: Producer[xslt://./xsl/NI/CGL.xsl?contentCache=false&saxon=true]
[ #0 - seda://transform-receive] [CLM] [CID=UNKNOWN] o.a.camel.impl.ProducerCache .doGetProducer DEBUG Adding to producer cache with key: Endpoint[xslt://./xsl/NI/CGL.xsl?contentCache=false&saxon=true] for producer: Producer[xslt://./xsl/NI/CGL.xsl?contentCache=false&saxon=true]
[ #0 - seda://transform-receive] [CLM] [CID=UNKNOWN] o.a.c.b.xml.XsltUriResolver .resolve DEBUG Resolving URI from classpath:: classpath:./xsl/NI/CGL.xsl
[ #0 - seda://transform-receive] [CLM] [CID=UNKNOWN] o.a.c.p.DefaultErrorHandler .log DEBUG Failed delivery for (MessageId: ID-LIBP03P-QK70A9V-60563-1487254928941-1-7 on ExchangeId: ID-LIBP03P-QK70A9V-60563-1487254928941-1-8). On delivery attempt: 0 caught: java.lang.NullPointerException
[ #0 - seda://transform-receive] [CLM] [CID=UNKNOWN] TRANSFORM.COR.XSLT.ROUTE .log ERROR null
[ #0 - seda://transform-receive] [CLM] [CID=UNKNOWN] o.a.c.processor.SendProcessor .process DEBUG >>>> Endpoint[log://showException=true] Exchange[ID-LIBP03P-QK70A9V-60563-1487254928941-1-8]
I've tried a few tests: adding a mock endpoint after the second call and checking for a message, and using log messages to track how far along the route the exchange gets. It never seems to make it to the second call; it always throws an exception when heading to the second XSLT step. Even when I send in an exchange that won't match the when condition, it doesn't hit the second XSLT step either. The paths in the XSL strings are correct: when I remove the choice logic and the second XSLT step, the XSL runs fine, whether the string points at the Common.xsl file or a line-of-business XSL file.
Note: due to certain templates in the XSL, I have to split the transforms out into their own steps, i.e. I can't import one of the files into the other, so the transforms have to be called separately.
EDIT: I have implemented a processor to run the XSLT manually for now. I'd still like to get to the bottom of this though; even when using a .to call and hard-coding the parameters, I still get the same issue. Is it something around the XSLT component in Camel?
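For reference, a minimal sketch (my own reconstruction, not the actual workaround code) of running the line-of-business stylesheet manually inside a Processor, roughly as described in the edit above, assuming the stylesheets live on the classpath under xsl/<market>/<lob>.xsl:
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

import org.apache.camel.Exchange;
import org.apache.camel.Processor;

public class ManualXsltProcessor implements Processor {
    @Override
    public void process(Exchange exchange) throws Exception {
        String market = exchange.getIn().getHeader("market", String.class);
        String lob = exchange.getIn().getHeader("lineOfBusiness", String.class);
        // Load the stylesheet from the classpath, e.g. xsl/NI/CGL.xsl.
        StreamSource stylesheet = new StreamSource(getClass().getClassLoader()
                .getResourceAsStream("xsl/" + market + "/" + lob + ".xsl"));
        Transformer transformer = TransformerFactory.newInstance().newTransformer(stylesheet);
        StringWriter out = new StringWriter();
        transformer.transform(
                new StreamSource(new StringReader(exchange.getIn().getBody(String.class))),
                new StreamResult(out));
        exchange.getIn().setBody(out.toString());
    }
}
In the route, the second recipientList step would then be replaced by .process(new ManualXsltProcessor()). This sidesteps the NullPointerException but, as noted, does not explain why the dynamic xslt: endpoint fails on the second call.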