I've recently been experiencing timeouts with scala.concurrent.Future objects created while awaiting processing within an Akka actor, and I was wondering how to handle those timed-out events. Are they really lost? Are they retried and kept in memory, or how does it work?
To put things in context, the code goes as follows.
List<Future<MyMessage>> futureMessageList = plainMessages.stream()
.map(this::toFuture)
.collect(Collectors.toList());
Futures.sequence(futureMessageList, ExecutionContexts.global())
.onComplete(new OnComplete<Iterable<MyMessage>>() {
@Override
public void onComplete(Throwable throwable, Iterable<MyMessage> messages) {
... // iterate futureMessageList list
Within onComplete, an iteration over futureMessageList takes place; that list is basically composed of Future objects which encapsulate MyMessage.
However, the toFuture function does a Patterns.ask() with a given dispatcher, and that seems to be taking more than the timeout I set (60 seconds). Take into account that the response times depend on an underlying system which may be under high load or on a slow network, depending on the environment it runs in.
Future<MyMessage> message = Patterns.ask(actorSystem.getSampleDispatcher(), msg, TIMEOUT_60_SECS)
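For context, toFuture essentially just wraps the ask and maps the untyped reply back to MyMessage, roughly like this (a sketch; the exact mapping is an assumption on my part, the names come from the code above):
private Future<MyMessage> toFuture(MyMessage msg) {
    // Patterns.ask returns Future<Object>, so the reply is cast back to MyMessage here
    return Patterns.ask(actorSystem.getSampleDispatcher(), msg, TIMEOUT_60_SECS)
        .map(new Mapper<Object, MyMessage>() {
            @Override
            public MyMessage apply(Object reply) {
                return (MyMessage) reply;
            }
        }, ExecutionContexts.global());
}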
So my question is: after onComplete throws the following exception because the Future was not completed in time...
java.lang.NullPointerException
at my.package.Clazz.onComplete(Clazz.java:4)
at my.package.Clazz$1.onComplete(Clazz.java:5)
at akka.dispatch.OnComplete.internal(Future.scala:258)
at akka.dispatch.OnComplete.internal(Future.scala:256)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)
at scala.concurrent.impl.CallbackRunnable.run$$$capture(Promise.scala:32)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala)
at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
Are those MyMessage objects kept in memory and retried afterwards? Should I somehow handle the exception and track the timed-out messages in an in-memory list, or how should I work around this?
When ask times out without getting a reply, it completes the Future (or CompletionStage) with a failure. The message may still be somewhere being processed, and if there is a response it will end up in dead letters (https://doc.akka.io/docs/akka/current/general/message-delivery-reliability.html#dead-letters). Other scenarios where the timeout could hit are if the actor has stopped or crashed while processing the message, or if the request or response got lost (not likely unless the responding actor is remote).
Future.sequence will either complete successfully when all futures passed to it have completed successfully, or fail if any of them fails.
This means that if any of the asks time out you will get null as the messages parameter and the exception from the first failing future as the throwable parameter in your onComplete callback.
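So the callback needs to check throwable before touching messages. A minimal sketch, reusing the names from your snippet:
Futures.sequence(futureMessageList, ExecutionContexts.global())
    .onComplete(new OnComplete<Iterable<MyMessage>>() {
        @Override
        public void onComplete(Throwable throwable, Iterable<MyMessage> messages) {
            if (throwable != null) {
                // at least one ask failed (typically an AskTimeoutException);
                // messages is null here, so do not iterate over it
                throwable.printStackTrace();
                return;
            }
            for (MyMessage message : messages) {
                // handle each successfully completed message
            }
        }
    }, ExecutionContexts.global());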
If you would rather get a partial list of results, each being either a successful value or an exception, you can do that with the help of recover on each future before passing them to Future.sequence.
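For example (a sketch using akka.dispatch.Recover; the MyMessage.timedOut(...) fallback factory is hypothetical, you would substitute whatever placeholder value makes sense for you):
List<Future<MyMessage>> recoveredList = plainMessages.stream()
    .map(this::toFuture)
    .map(f -> f.recover(new Recover<MyMessage>() {
        @Override
        public MyMessage recover(Throwable failure) {
            // fallback value instead of failing the whole sequence
            return MyMessage.timedOut(failure);
        }
    }, ExecutionContexts.global()))
    .collect(Collectors.toList());
// Futures.sequence(recoveredList, ...) now completes with a full list,
// where each timed-out entry carries the fallback value instead of an error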
Related
I have 2 data sources: DB and server. When I start the application, I call the method from the repository (MyRepository):
public Observable<List<MyObj>> fetchMyObjs() {
Observable<List<MyObj>> localData = mLocalDataSource.fetchMyObjs();
Observable<List<MyObj>> remoteData = mRemoteDataSource.fetchMyObjs();
return Observable.concat(localData, remoteData);
}
I subscribe to it as follows:
mMyRepository.fetchMyObjs()
.compose(applySchedulers())
.subscribe(
myObjs -> {
//do something
},
throwable -> {
//handle error
}
);
I expect that the data from the database will be loaded faster, and when the download of data from the network is completed, I will simply update the data in Activity.
When the Internet is connected, everything works well. But when we open the application without a network connection, mRemoteDataSource.fetchMyObjs() throws an UnknownHostException and the whole Observable terminates (the subscriber never receives the localData, although the logs show the data was read from the database). When I then call fetchMyObjs() from the MyRepository class again (via SwipeRefresh), the localData subscriber is triggered.
How can I make sure that the subscriber still receives the localData results when the application starts without a network connection?
Try some of error handling operators:
https://github.com/ReactiveX/RxJava/wiki/Error-Handling-Operators
I'd guess onErrorResumeNext() will be fine, but you have to test it yourself. Maybe something like this would work for you:
Observable<List<MyObj>> remoteData = mRemoteDataSource.fetchMyObjs()
        .onErrorResumeNext(Observable.empty()); // replace the error with completion so the concatenated stream is not terminated by onError
Additionally, I am not in a position to judge whether your idea is right or not, but maybe it's worth thinking about rebuilding this flow. Ignoring errors is not the right thing to do - that's for sure ;)
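For completeness, here is a sketch of the repository method with that fallback applied (using Observable.empty() as the fallback is an assumption; you might instead emit a cached value or a special error item):
public Observable<List<MyObj>> fetchMyObjs() {
    Observable<List<MyObj>> localData = mLocalDataSource.fetchMyObjs();
    Observable<List<MyObj>> remoteData = mRemoteDataSource.fetchMyObjs()
            // if the network call fails (e.g. UnknownHostException), complete silently
            // so that concat still delivers the local results to the subscriber
            .onErrorResumeNext(Observable.empty());
    return Observable.concat(localData, remoteData);
}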
You can observe your chain with observeOn(Scheduler scheduler, boolean delayError) and delayError set to true.
delayError - indicates if the onError notification may not cut ahead of onNext notification on the other side of the scheduling boundary. If true a sequence ending in onError will be replayed in the same order as was received from upstream
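Applied to the subscription from the question, that might look like the sketch below (AndroidSchedulers.mainThread() and Schedulers.io() are assumptions, since the original hides its schedulers behind compose(applySchedulers())):
mMyRepository.fetchMyObjs()
        .subscribeOn(Schedulers.io())
        // delayError = true: items already emitted (the localData list) are delivered
        // before the error from the remote source reaches the subscriber
        .observeOn(AndroidSchedulers.mainThread(), true)
        .subscribe(
                myObjs -> {
                    //do something
                },
                throwable -> {
                    //handle error
                }
        );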
Kafka allows for asynchronous message sending through below methods on Producer (KafkaProducer) class:
public java.util.concurrent.Future<RecordMetadata> send(ProducerRecord<K,V> record)
public java.util.concurrent.Future<RecordMetadata> send(ProducerRecord<K,V> record, Callback callback)
Successes can be handled through
1) the Future<RecordMetadata> object, or
2) the onCompletion method invoked by the callback. The full method signature and usage of onCompletion are as below (taken from the Kafka docs)
ProducerRecord<byte[],byte[]> record = new ProducerRecord<byte[],byte[]>("the-topic", key, value);
producer.send(record,
new Callback() {
public void onCompletion(RecordMetadata metadata, Exception e) {
if (e != null)
    e.printStackTrace();
else
    System.out.println("The offset of the record we just sent is: " + metadata.offset());
}
});
Failures, on the other hand, need to be handled through the Exception e passed to the onCompletion method.
Fine, everything looks good so far.
But if I am getting it right, the only reasonable information that can be obtained from the exception object e is the stack trace and the exception message. What I mean to point out here is that e does not contain any information about the actual record sent. In other words, it does not hold a reference to the record that was sent to the Kafka broker. So what useful processing or handling can the producer do if the record was not sent successfully? Really not much.
Why I say this is: ideally I would like to log the failed message somewhere and then try to resend it. But with the little information (e) provided by the framework, I feel this is not possible.
Can someone point out if I am right or wrong?
You could easily create a callback that receives the producerRecord as a constructor argument. So upon onCompletion with an exception, you can have complete knowledge of the producer record, and even try to send it again.
I dealt with the same issue: I created a callback that gets the ProducerRecord, plus a callback handler that uses an executor service to send the record again. That way I can tolerate any number of failures (e.g. network issues, or Kafka being down) and recover from them.
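A minimal sketch of that idea (the class name, retry count and logging are illustrative, not either answer's actual code):
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

class RetryingCallback implements Callback {
    private final KafkaProducer<byte[], byte[]> producer;
    private final ProducerRecord<byte[], byte[]> record;
    private final int attemptsLeft;

    RetryingCallback(KafkaProducer<byte[], byte[]> producer,
                     ProducerRecord<byte[], byte[]> record,
                     int attemptsLeft) {
        this.producer = producer;
        this.record = record;
        this.attemptsLeft = attemptsLeft;
    }

    @Override
    public void onCompletion(RecordMetadata metadata, Exception e) {
        if (e == null) {
            System.out.println("Sent record to offset " + metadata.offset());
        } else if (attemptsLeft > 0) {
            // the callback still holds the original record, so it can be logged and resent
            System.err.println("Send failed for " + record + ", retrying: " + e.getMessage());
            producer.send(record, new RetryingCallback(producer, record, attemptsLeft - 1));
        } else {
            System.err.println("Giving up on record " + record);
        }
    }
}

// usage: producer.send(record, new RetryingCallback(producer, record, 3));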
I am consuming a REST web service from Java code using the Apache Commons HTTP client API. If no response arrives within the socket timeout value configured in the connection manager parameters, a socket timeout exception occurs. In such cases, since the thread returns the exception to the caller class, the response is lost even if the REST service returns it a few seconds later.
Is it possible to create a new thread which will still listen to the service even after the timeout and just log the response, while the main thread returns the exception to the caller class?
Is there any better way to achieve this?
Thanks.
The pattern you are most likely looking for involves asynchronous requests. For every action you post you create a unique "job" id and with that a specific URL for the job status. After starting the job, you can then query on that specific job instance's status. For example:
POST to /actions
Returns 202 Accepted & include a Location header to /actions/results/1234
Immediately GET /actions/results/1234 to ascertain its status.
If it returns a 2xx your job is done.
If it returns 404, wait 10 seconds (or whatever) and try again.
Once you are happy with the result, issue a DELETE to /actions/results/1234 to clean up after yourself.
Of course you don't have to return 404 if the job is not done, there are other strategies for checking on the status - the key thing is that it's a subsequent call.
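A rough sketch of the client side of this pattern with Apache HttpClient 4.x (the URL, the 10-second interval and the "404 means not done yet" convention are illustrative):
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class JobPoller {
    public static void main(String[] args) throws Exception {
        try (CloseableHttpClient client = HttpClients.createDefault()) {
            // start the job; assume the server answers 202 Accepted with a Location header
            String jobUrl;
            try (CloseableHttpResponse post = client.execute(new HttpPost("http://example.com/actions"))) {
                jobUrl = post.getFirstHeader("Location").getValue();
            }
            // poll the job resource until it reports completion
            while (true) {
                try (CloseableHttpResponse get = client.execute(new HttpGet(jobUrl))) {
                    int status = get.getStatusLine().getStatusCode();
                    if (status >= 200 && status < 300) {
                        break; // job is done, read the result body here
                    }
                }
                Thread.sleep(10_000); // not done yet (e.g. 404), wait and try again
            }
        }
    }
}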
I'm playing around with Vert.x and am quite new to servers based on an event loop, as opposed to the thread/connection model.
public void start(Future<Void> fut) {
vertx
.createHttpServer()
.requestHandler(r -> {
LocalDateTime start = LocalDateTime.now();
System.out.println("Request received - "+start.format(DateTimeFormatter.ISO_DATE_TIME));
final MyModel model = new MyModel();
try {
for(int i=0;i<10000000;i++){
//some simple operation
}
model.data = start.format(DateTimeFormatter.ISO_DATE_TIME) +" - "+LocalDateTime.now().format(DateTimeFormatter.ISO_DATE_TIME);
} catch (Exception e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
r.response().end(
new Gson().toJson(model)
);
})
.listen(4568, result -> {
if (result.succeeded()) {
fut.complete();
} else {
fut.fail(result.cause());
}
});
System.out.println("Server started ..");
}
I'm just trying to simulate a long running request handler to understand how this model works.
What I've observed is that the so-called event loop is blocked until my first request completes. However little time it takes, a subsequent request is not acted upon until the previous one completes.
Obviously I'm missing a piece here and that's the question that I have here.
Edited based on the answers so far:
Isn't accepting all requests considered to be asynchronous? If a new connection can only be accepted when the previous one is cleared off, how is it async?
Assume a typical request takes anywhere between 100 ms and 1 sec (based on the kind and nature of the request). So it means the event loop can't accept a new connection until the previous request finishes (even if it winds up in a second). And if I as a programmer have to think through all these and push such request handlers to a worker thread, then how does it differ from a thread/connection model?
I'm just trying to understand how this model is better than the traditional thread/connection server models. Assume there is no I/O op, or all the I/O ops are handled asynchronously. How does it even solve the c10k problem, when it can't start all concurrent requests in parallel and has to wait till the previous one terminates?
Even if I decide to push all these operations to a worker thread (pooled), then I'm back to the same problem, isn't it? Context switching between threads?
Edits and topping this question for a bounty
I do not completely understand how this model is claimed to be asynchronous.
Vert.x has an async JDBC client (asynchronous is the keyword) which I tried to adapt with RxJava.
Here is a code sample (Relevant portions)
server.requestStream().toObservable().subscribe(req -> {
LocalDateTime start = LocalDateTime.now();
System.out.println("Request for " + req.absoluteURI() +" received - " +start.format(DateTimeFormatter.ISO_DATE_TIME));
jdbc.getConnectionObservable().subscribe(
conn -> {
// Now chain some statements using flatmap composition
Observable<ResultSet> resa = conn.queryObservable("SELECT * FROM CALL_OPTION WHERE UNDERLYING='NIFTY'");
// Subscribe to the final result
resa.subscribe(resultSet -> {
req.response().end(resultSet.getRows().toString());
System.out.println("Request for " + req.absoluteURI() +" Ended - " +LocalDateTime.now().format(DateTimeFormatter.ISO_DATE_TIME));
}, err -> {
System.out.println("Database problem");
err.printStackTrace();
});
},
// Could not connect
err -> {
err.printStackTrace();
}
);
});
server.listen(4568);
The select query there takes approximately 3 seconds to return the complete table dump.
When I fire concurrent requests (tried with just 2), I see that the second request waits entirely for the first one to complete.
If the JDBC select is asynchronous, isn't it a fair expectation that the framework handles the second connection while it waits for the select query to return?
The Vert.x event loop is, in fact, a classical event loop that exists on many platforms. And of course, most explanations and docs can be found for Node.js, as it's the most popular framework based on this architecture pattern. Take a look at one more or less good explanation of the mechanics under the Node.js event loop. The Vert.x tutorial has a fine explanation between "Don't call us, we'll call you" and "Verticles" too.
Edit for your updates:
First of all, when you are working with an event loop, the main thread should work very quickly for all requests. You shouldn't do any long job in this loop. And of course, you shouldn't wait for a response to your call to the database. Instead:
- Schedule a call asynchronously
- Assign a callback (handler) to result
- Callback will be executed in the worker thread, not event loop thread. This callback, for example, will return a response to the socket.
So, your operations in the event loop should just schedule all asynchronous operations with callbacks and go to the next request without awaiting any results.
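For the CPU-bound loop in the original handler, one way to do this is executeBlocking, which runs the body on a worker thread and calls you back on the event loop. A sketch against the Vert.x 3 API (error handling kept minimal):
vertx.createHttpServer()
    .requestHandler(r ->
        vertx.executeBlocking(future -> {
            // runs on a worker thread, so the event loop stays free for other requests
            MyModel model = new MyModel();
            for (int i = 0; i < 10000000; i++) {
                // some simple operation
            }
            future.complete(new Gson().toJson(model));
        }, res -> {
            // back on the event loop: write the response
            if (res.succeeded()) {
                r.response().end((String) res.result());
            } else {
                r.response().setStatusCode(500).end();
            }
        }))
    .listen(4568);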
Assume a typical request takes anywhere between 100 ms to 1 sec (based on the kind and nature of the request).
In that case, your request has some computation-expensive parts or IO access - your code in the event loop shouldn't wait for the results of these operations.
I'm just trying to understand how is this model better from a traditional thread/conn server models? Assume there is no I/O op or all the I/O op are handled asynchronously?
When you have too many concurrent requests and a traditional programming model, you will create a thread per request. What will these threads do? They will mostly be waiting for IO operations (for example, a result from the database). It's a waste of resources. In our event loop model, you have one main thread that schedules operations and a preallocated pool of worker threads for long tasks. None of these workers actually waits for the response; they can execute other code while waiting for the IO result (this can be implemented as callbacks or by periodically checking the status of the IO jobs currently in progress). I would recommend you go through Java NIO and Java NIO 2 to understand how this async IO can actually be implemented inside the framework. Green threads are a closely related concept that would be good to understand too. Green threads and coroutines are a kind of shadowed event loop trying to achieve the same thing - fewer threads, because a system thread can be reused while a green thread is waiting for something.
How does it even solve c10k problem, when it can't start all concurrent requests parallel and have to wait till the previous one terminates?
For sure, we don't wait in the main thread to send the response for the previous request. Get a request, schedule the long/IO task's execution, move on to the next request.
Even if I decide to push all these operations to a worker thread(pooled), then I'm back to the same problem isn't it? Context switching between threads?
If you do everything right - no. Even more, you will get good data locality and execution flow prediction. One CPU core will execute your short event loop and schedule async work without context switching, and nothing more. Other cores make calls to the database and return responses, and only that. Switching between callbacks or checking different channels for IO status doesn't actually require any system-thread context switching - it all happens in one worker thread. So, we have one worker thread per core, and this one system thread awaits/checks result availability from, for example, multiple connections to the database. Revisit the Java NIO concept to understand how it can work this way. (A classical example for NIO is a proxy server that can accept many parallel connections (thousands), proxy requests to some other remote servers, listen for responses, and send responses back to clients, all of this using one or two threads.)
About your code, I made a sample project for you to demonstrate that everything works as expected:
public class MyFirstVerticle extends AbstractVerticle {
@Override
public void start(Future<Void> fut) {
JDBCClient client = JDBCClient.createShared(vertx, new JsonObject()
.put("url", "jdbc:hsqldb:mem:test?shutdown=true")
.put("driver_class", "org.hsqldb.jdbcDriver")
.put("max_pool_size", 30));
client.getConnection(conn -> {
if (conn.failed()) {throw new RuntimeException(conn.cause());}
final SQLConnection connection = conn.result();
// create a table
connection.execute("create table test(id int primary key, name varchar(255))", create -> {
if (create.failed()) {throw new RuntimeException(create.cause());}
});
});
vertx
.createHttpServer()
.requestHandler(r -> {
int requestId = new Random().nextInt();
System.out.println("Request " + requestId + " received");
client.getConnection(conn -> {
if (conn.failed()) {throw new RuntimeException(conn.cause());}
final SQLConnection connection = conn.result();
connection.execute("insert into test values ('" + requestId + "', 'World')", insert -> {
// query some data with arguments
connection
.queryWithParams("select * from test where id = ?", new JsonArray().add(requestId), rs -> {
connection.close(done -> {if (done.failed()) {throw new RuntimeException(done.cause());}});
System.out.println("Result " + requestId + " returned");
r.response().end("Hello");
});
});
});
})
.listen(8080, result -> {
if (result.succeeded()) {
fut.complete();
} else {
fut.fail(result.cause());
}
});
}
}
@RunWith(VertxUnitRunner.class)
public class MyFirstVerticleTest {
private Vertx vertx;
@Before
public void setUp(TestContext context) {
vertx = Vertx.vertx();
vertx.deployVerticle(MyFirstVerticle.class.getName(),
context.asyncAssertSuccess());
}
@After
public void tearDown(TestContext context) {
vertx.close(context.asyncAssertSuccess());
}
@Test
public void testMyApplication(TestContext context) {
for (int i = 0; i < 10; i++) {
final Async async = context.async();
vertx.createHttpClient().getNow(8080, "localhost", "/",
response -> response.handler(body -> {
context.assertTrue(body.toString().contains("Hello"));
async.complete();
})
);
}
}
}
Output:
Request 1412761034 received
Request -1781489277 received
Request 1008255692 received
Request -853002509 received
Request -919489429 received
Request 1902219940 received
Request -2141153291 received
Request 1144684415 received
Request -1409053630 received
Request -546435082 received
Result 1412761034 returned
Result -1781489277 returned
Result 1008255692 returned
Result -853002509 returned
Result -919489429 returned
Result 1902219940 returned
Result -2141153291 returned
Result 1144684415 returned
Result -1409053630 returned
Result -546435082 returned
So, we accept a request - schedule a request to the database, go to the next request, we consume all of them and send a response for each request only when everything is done with the database.
About your code sample, I see two possible issues - first, it looks like you don't close() the connection, which is important for returning it to the pool. Second, how is your pool configured? If there is only one free connection, these requests will serialize while waiting for it.
I recommend adding a timestamp print for both requests to find the place where they serialize. You have something that makes the calls in the event loop blocking. Or... check that you send the requests in parallel in your test, not one after another, each only after the previous response arrives.
How is this asynchronous? The answer is in your question itself
What I've observed is that the so-called event loop is blocked until my first request completes. However little time it takes, a subsequent request is not acted upon until the previous one completes
The idea is that instead of spawning a new thread to serve each HTTP request, the same thread is used - which you have blocked with your long-running task.
The goal of the event loop is to save the time involved in context switching from one thread to another and to use the otherwise idle CPU time while a task is performing IO/network activities. If, while handling your request, it had to do some other IO/network operation - e.g. fetching data from a remote MongoDB instance - your thread would not be blocked during that time, and another request would instead be served by the same thread. That is the ideal use case for the event loop model (considering that you have concurrent requests coming to your server).
If you have long-running tasks that do not involve network/IO operations, you should consider using a thread pool instead; if you block your main event loop thread itself, other requests will be delayed. That is, for long-running tasks you are okay to pay the price of context switching so that your server stays responsive.
EDIT:
The way a server can handle requests can vary:
1) Spawn a new thread for each incoming request (in this model the context switching would be high and there is the additional cost of spawning a new thread every time)
2) Use a thread pool to serve the requests (the same set of threads serves the requests and extra requests get queued up)
3) Use an event loop (a single thread for all the requests; context switching is negligible, because only a few threads are running, e.g. to queue up the incoming requests)
First of all, context switching is not bad; it is required to keep the application server responsive. But too much context switching can be a problem if the number of concurrent requests goes too high (roughly more than 10k). If you want to understand this in more detail, I recommend reading the C10K article.
Assume a typical request takes anywhere between 100 ms and 1 sec (based on the kind and nature of the request). So it means the event loop can't accept a new connection until the previous request finishes (even if it winds up in a second).
If you need to respond to a large number of concurrent requests (more than 10k), I would consider anything above 500 ms a longer-running operation. Secondly, like I said, there are some threads/context switches involved, e.g. to queue up incoming requests, but the context switching amongst threads would be greatly reduced as there would be very few threads at a time. Thirdly, if there is a network/IO operation involved in resolving the first request, the second request would get a chance to be resolved before the first one is - this is where the model plays well.
And if I as a programmer have to think through all these and push such request handlers to a worker thread, then how does it differ from a thread/connection model?
Vert.x is trying to give you the best of threads and the event loop, so, as a programmer, you can make a call on how to make your application efficient under both scenarios, i.e. long-running operations with and without network/IO operations.
I'm just trying to understand how this model is better than the traditional thread/connection server models. Assume there is no I/O op, or all the I/O ops are handled asynchronously. How does it even solve the c10k problem, when it can't start all concurrent requests in parallel and has to wait till the previous one terminates?
The above explanation should answer this.
Even if I decide to push all these operations to a worker thread (pooled), then I'm back to the same problem, isn't it? Context switching between threads?
Like I said, both approaches have pros and cons, and Vert.x gives you both models; depending on your use case you have to choose what is ideal for your scenario.
In this sort of processing engine, you are supposed to turn long-running tasks into asynchronously executed operations, and there is a methodology for doing this, so that the critical thread can complete as quickly as possible and return to perform another task. In other words, any IO operations are passed to the framework, which calls you back when the IO is done.
The framework is asynchronous in the sense that it supports you producing and running these asynchronous tasks, but it doesn't change your code from being synchronous to asynchronous.
I am working on a Java application that pulls messages from an Azure Service Bus queue. I am using the Java Azure API (com.microsoft.windowsazure.services). The problem that I'm experiencing is that the deletion of brokered messages after they had been processed sometimes fails.
My application pulls a message from the queue using the receiveQueueMessage() method on a ServiceBusContract object, using peek-lock receive mode. Once the message has been successfully processed, I remove the message from the queue by calling the deleteMessage() method (I believe this method corresponds to the Complete() method in the .NET API).
However, sometimes this method call fails. A com.sun.jersey.api.client.UniformInterfaceException is logged to the console by deleteMessage(), but the method does not throw it (I'll include the output below). The exception seems to indicate that the message could not be found. When this happens, the message stays in the queue. In fact, the next call to receiveQueueMessage() retrieves this message again. The deletion then fails once or twice more, and then it succeeds. The messages retrieved thereafter delete successfully.
Here is the code where the problem occurs:
ReceiveMessageOptions receiveOptions = ReceiveMessageOptions.DEFAULT;
receiveOptions.setReceiveMode(ReceiveMode.PEEK_LOCK);
BrokeredMessage message = serviceBus.receiveQueueMessage("my_queue",receiveOptions).getValue();
// Process the message
System.out.println("Delete message with ID: "+message.getMessageId());
serviceBus.deleteMessage(message);
Here is an example of the output when the problem occurs:
Delete message with ID: 100790000086491
2013/01/22 12:58:29 com.microsoft.windowsazure.services.serviceBus.implementation.ServiceBusExceptionProcessor processCatch
WARNING: com.sun.jersey.api.client.UniformInterfaceException: DELETE https://voyagernetzmessaging.servicebus.windows.net/sms_queue/messages/24/efa56a1c-95e8-4cd6-931a-972eac21563a returned a response status of 404 Not Found
com.sun.jersey.api.client.UniformInterfaceException: DELETE https://voyagernetzmessaging.servicebus.windows.net/sms_queue/messages/24/efa56a1c-95e8-4cd6-931a-972eac21563a returned a response status of 404 Not Found
at com.sun.jersey.api.client.WebResource.voidHandle(WebResource.java:697)
at com.sun.jersey.api.client.WebResource.delete(WebResource.java:261)
at com.microsoft.windowsazure.services.serviceBus.implementation.ServiceBusRestProxy.deleteMessage(ServiceBusRestProxy.java:260)
at com.microsoft.windowsazure.services.serviceBus.implementation.ServiceBusExceptionProcessor.deleteMessage(ServiceBusExceptionProcessor.java:176)
at microworks.voyagernetzmessaging.smsservice.SmsSender$Runner.finalizeSms(SmsSender.java:114)
at microworks.voyagernetzmessaging.smsservice.SmsSender$Runner.finalizeSms(SmsSender.java:119)
at microworks.voyagernetzmessaging.smsservice.SmsSender$Runner.run(SmsSender.java:340)
com.microsoft.windowsazure.services.core.ServiceException: com.sun.jersey.api.client.UniformInterfaceException: DELETE https://voyagernetzmessaging.servicebus.windows.net/sms_queue/messages/24/efa56a1c-95e8-4cd6-931a-972eac21563a returned a response status of 404 Not Found
Response Body: <Error><Code>404</Code><Detail>The lock supplied is invalid. Either the lock expired, or the message has already been removed from the queue..TrackingId:4b112c5a-5919-4680-b6bb-e10a2c081ba3_G15_B9,TimeStamp:1/22/2013 10:58:30 AM</Detail></Error>
at com.microsoft.windowsazure.services.serviceBus.implementation.ServiceBusExceptionProcessor.deleteMessage(ServiceBusExceptionProcessor.java:179)
at microworks.voyagernetzmessaging.smsservice.SmsSender$Runner.finalizeSms(SmsSender.java:114)
at microworks.voyagernetzmessaging.smsservice.SmsSender$Runner.finalizeSms(SmsSender.java:119)
at microworks.voyagernetzmessaging.smsservice.SmsSender$Runner.run(SmsSender.java:340)
Caused by: com.sun.jersey.api.client.UniformInterfaceException: DELETE https://voyagernetzmessaging.servicebus.windows.net/sms_queue/messages/24/efa56a1c-95e8-4cd6-931a-972eac21563a returned a response status of 404 Not Found
at com.sun.jersey.api.client.WebResource.voidHandle(WebResource.java:697)
at com.sun.jersey.api.client.WebResource.delete(WebResource.java:261)
at com.microsoft.windowsazure.services.serviceBus.implementation.ServiceBusRestProxy.deleteMessage(ServiceBusRestProxy.java:260)
at com.microsoft.windowsazure.services.serviceBus.implementation.ServiceBusExceptionProcessor.deleteMessage(ServiceBusExceptionProcessor.java:176)
... 3 more
Do note that the URI in the exception seems to refer to a different message ID (efa56a1c-95e8-4cd6-931a-972eac21563a, while the message's ID is in fact 100790000086491). I do not know if this could be a key to the failure, but I have a hunch.
Another interesting observation: it looks as though the error always happens with the first message that is retrieved from the queue after the application had been started, or after the queue had been empty. All the messages coming thereafter don't seem to ever cause this type of problem.
The queue has a lock duration of 2 minutes, and the processing of the messages takes well under that duration, so an expiring lock cannot be the cause.
Any ideas?
I would suggest you call Complete() of the BrokeredMessage class.
So in your case, try calling:
message.Complete();
When the Service bus sees Complete(), it considers the message to be consumed and removes it from the queue.
The UUID that appears in the URL is a random token that the server uses to track which message is locked; it is not supposed to be the same as the message id. You can access the lock URL using message.getLockLocation().
The code you have looks correct; I cannot see any obvious reason why it would fail, especially in the way you describe. Some things to check:
Check that the message you get is a valid message. If you peek-lock an empty queue, it will return an empty message. Then the lock location should be null. (But that would not cause the failure you see.)
You could get the "lock supplied is invalid" error if you are trying to delete the same message more than once. That could happen if you have code that notices when the service returns an empty message and substitutes the previous message. (But that would not explain why trying to delete the message eventually works, unless it is a different message that is getting deleted.)
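For the first check, a small guard around the question's receive code would skip processing and deletion when the receive came back empty (a sketch; getLockLocation() is the accessor mentioned above):
BrokeredMessage message = serviceBus.receiveQueueMessage("my_queue", receiveOptions).getValue();
if (message != null && message.getLockLocation() != null) {
    // process the message ...
    serviceBus.deleteMessage(message);
} else {
    // empty receive: the queue had no message to lock, so there is nothing to delete
}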
Hopefully that will help!