Retry logic until the database comes up - Java

From my Java code I am connecting to multiple databases using connection pooling. If a database goes down, I need retry logic that keeps trying to get a connection until it returns a connection object.

If your DB connection attempt throws some sort of exception, you can just sleep for a bit and retry the operation.
In the example below, worker is an object that does some work, such as connecting to a DB. It's pretty generic, so you can retry any sort of operation, such as reading from a file.
Note that catching Throwable is not necessarily a great idea, since it also catches Errors such as OutOfMemoryError.
boolean success = false;
int i = 0;
long delay = retryDelay;
LOGGER.info("Starting operation");
/*
 * Loop until you cannot retry anymore or the operation completed successfully
 * The catch block has a nested try catch to ensure that nothing goes wrong
 * while trying to sleep
 *
 * In case of failure the last retry exception is propagated up to the calling
 * class.
 */
while (i++ < retryMax && !success)
{
    try
    {
        worker.work();
        success = true;
    }
    catch (Throwable t)
    {
        try
        {
            LOGGER.warn("Caught throwable", t);
            if (i == retryMax)
            {
                LOGGER.warn("Retry maximum reached, propagating error");
                throw t;
            }
            if (retryPolicy == RetryPolicy.ESCALATING)
            {
                delay *= 2;
            }
            LOGGER.info("Sleeping for " + delay + " milliseconds");
            Thread.sleep(delay);
        }
        catch (Throwable tt)
        {
            /*
             * Quick check to see if the maximum has been hit, so we don't log twice
             *
             * t is the original error, and tt is the error we got while retrying
             * tt would most likely be a InterruptedException or something
             */
            if (i == retryMax)
            {
                throw t;
            }
            LOGGER.warn("Error while retrying, propagating original error up", tt);
            throw t;
        }
    }
} // end retry loop
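For the original question (waiting until a pooled database comes back), the same loop can be packaged as a small helper. The following is a minimal sketch, not a definitive implementation: the Retry class, the callWithRetry method and its parameters are hypothetical names, and it catches Exception rather than Throwable, in line with the caveat above. With a connection pool, the operation passed in would typically be something like dataSource::getConnection.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: names and parameters are illustrative, not from the original answer.
final class Retry {
    private Retry() {}

    /**
     * Runs the operation until it succeeds or maxAttempts is reached,
     * doubling the delay after each failed attempt.
     */
    static <T> T callWithRetry(Callable<T> operation, int maxAttempts, long initialDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.call();
            } catch (InterruptedException e) {
                // Never retry an interruption: restore the flag and give up.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts: propagate the last failure
                }
                Thread.sleep(delay); // an InterruptedException here also ends the loop
                delay *= 2;
            }
        }
    }
}
```

Passing a very large maxAttempts approximates "retry until the database is up", though a bounded count with an alert on failure is usually easier to operate.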

Related

How to check when polling stopped

I have a message stream from which messages arrive that I need to process and then store in a database. In Java, I've written polling code that polls the stream and consumes messages every 20 seconds.
This is done inside an infinite for-loop, like below:
for (;;) {
    try {
        //1. Logic for polling.
        //2. Logic for processing the message.
        //3. Logic for storing the message in database.
        Thread.sleep(20000 - <time taken for above 3 steps >);
    } catch (Exception E) {
        //4. Exception handling.
    }
}
This logic runs as expected and the stream is polled, but once in a while it hits an exception or something goes wrong, and polling stops.
I want a mechanism so that as soon as polling has stopped, let's say the for-loop has not run for 60 seconds, I receive a mail or ping.
What is the best way to invoke a method if this for-loop has not run for 60 seconds?
My idea is that each for-loop iteration sends a heartbeat, and when no heartbeat has been received from the loop, a mail is sent.
There are two different reasons why polling stops making progress, and each needs a different approach:
If the logic throws a Throwable other than an Exception, for instance an Error, the catch does not match, and execution will leave the for-loop, and likely reach the thread's UncaughtExceptionHandler, the default implementation of which logs the exception to System.err and terminates the thread. To prevent this, you should catch Throwable rather than Exception.
The second possibility is that some step in your logic doesn't terminate, for instance due to an infinite loop, a deadlock, waiting for I/O operations, or whatever. In this case, you'll want to take a thread dump to see where the thread is stuck. You can automate this as follows:
import java.time.Duration;
import java.time.Instant;

class Watchdog {
    final Duration gracePeriod;
    final Thread watchedThread;
    volatile Instant lastProgress;

    public Watchdog(Duration gracePeriod) {
        this.gracePeriod = gracePeriod;
        watchedThread = Thread.currentThread();
        everythingIsFine();
        var t = new Thread(this::keepWatch);
        t.setDaemon(true);
        t.start();
    }

    public void everythingIsFine() {
        lastProgress = Instant.now();
    }

    void keepWatch() {
        while (true) {
            var silence = Duration.between(lastProgress, Instant.now());
            if (silence.compareTo(gracePeriod) > 0) {
                System.err.println("Watchdog hasn't seen any progress for " + silence.toSeconds() + " seconds. The watched thread is currently at:");
                for (var element : watchedThread.getStackTrace()) {
                    System.err.println("\tat " + element);
                }
            }
            try {
                // Thread.sleep(Duration) requires Java 19 or later
                Thread.sleep(gracePeriod);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}
which you can use as follows:
public class Test {
    void step() throws Exception {
        System.in.read();
    }

    void job() {
        var snoopy = new Watchdog(Duration.ofSeconds(2));
        for (;;) {
            try {
                step();
                snoopy.everythingIsFine();
                Thread.sleep(1000);
            } catch (Throwable t) {
                System.err.println(t);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        new Test().job();
    }
}
Once the grace period elapses, the Watchdog will print something like:
Watchdog hasn't seen any progress for 2 seconds. The watched thread is currently at:
at java.base/java.io.FileInputStream.readBytes(Native Method)
at java.base/java.io.FileInputStream.read(FileInputStream.java:293)
at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:255)
at java.base/java.io.BufferedInputStream.implRead(BufferedInputStream.java:289)
at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:276)
at stackoverflow.Test.step(Test.java:48)
at stackoverflow.Test.job(Test.java:55)
at stackoverflow.Test.main(Test.java:65)
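The watchdog above only prints a thread dump. To get the notification the question asks for (a mail or ping after 60 seconds of silence), the same heartbeat idea can drive a callback. This is a sketch under assumptions: HeartbeatMonitor, beat() and the onStall callback are hypothetical names, and sending the mail itself is left to whatever the callback does.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: class and method names are illustrative.
final class HeartbeatMonitor implements AutoCloseable {
    private final long timeoutNanos;
    private final Runnable onStall;
    private final AtomicLong lastBeat = new AtomicLong(System.nanoTime());
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "heartbeat-monitor");
                t.setDaemon(true); // do not keep the JVM alive just for monitoring
                return t;
            });

    HeartbeatMonitor(long timeout, TimeUnit unit, Runnable onStall) {
        this.timeoutNanos = unit.toNanos(timeout);
        this.onStall = onStall;
        // Check a few times per timeout window so a stall is noticed promptly.
        long period = Math.max(1, timeoutNanos / 4);
        scheduler.scheduleAtFixedRate(this::check, period, period, TimeUnit.NANOSECONDS);
    }

    /** Called by the polling loop after every completed iteration. */
    void beat() {
        lastBeat.set(System.nanoTime());
    }

    /** True if no beat has arrived within the timeout, as seen at time nowNanos. */
    boolean isStalled(long nowNanos) {
        return nowNanos - lastBeat.get() > timeoutNanos;
    }

    private void check() {
        if (isStalled(System.nanoTime())) {
            try {
                onStall.run(); // e.g. send a mail; fires on every check while stalled
            } catch (RuntimeException e) {
                // Swallow so a failing alert does not cancel the periodic task.
                e.printStackTrace();
            }
        }
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

In the polling loop, the monitor would be created once with a 60 second timeout and beat() called at the end of every iteration. Because the callback fires on every check while the loop is stalled, a real alert would probably want to de-duplicate.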

Same error line multiple times in error log

Could someone explain what could cause an error log like this, and when it would be printed? I am not able to understand it, and it is causing a performance issue in my app.
My error log looks like this:
at xxx.createBooking(MailEJB3ServiceZipProxy.java:453)
at xxxx.onSelectBooking(Main.java:2524)
at xxxx.onSelectBooking(Main.java:2603)
at xxxx.onSelectBooking(Main.java:2603)
at xxxx.onSelectBooking(Main.java:2603)
at xxxx.onSelectBooking(Main.java:2603)
at xxxx.onSelectBooking(Main.java:2603)
at xxxx.onSelectBooking(Main.java:2603)
My catch block looks like this:
public void onSelectBooking() {
    try {
        ////
    } catch (ReservationBusinessException ex) {
        m_hModuleContainer.setBusy(false);
        List mail = ex.getMailHeader();
        m_hCargoRecordDTO = (CargoRecordDTO) mail.get(0);
        ReservationObserver m_hReservationObserver = new ReservationObserver();
        m_hReservationObserver.setCargoRecordDTO(m_hCargoRecordDTO);
        m_hReservationParameterDTO.setReservationObserver(m_hReservationObserver);
        ExceptionTab exceptionTab = new ExceptionTab(m_hReservationParameterDTO, m_hCargoRecordDTO, ex);
        if (exceptionTab.isErrorsOverridden()) {
            // set all overridden flags
            m_hCargoRecordDTO.setSoftErrorsAccepted(true);
            m_hCargoRecordDTO.setErrorShown(true);
            m_hMailHeaderDTO.addHtCar(m_hCargoRecordDTO);
            onSelectBooking(); // line 2603
        }
    }
}
Presuming the error printed before the first line is StackOverflowError, it's because you have infinite recursion, where the logic in the try block fails with ReservationBusinessException, causing the code to retry infinitely with a recursive call in line 2603, until the call stack is full.
There are 2 ways to fix this:
1. Change the code to use a loop, instead of recursion, to retry, e.g.
    public void onSelectBooking() {
        boolean retry;
        do {
            retry = false;
            try {
                ...
            } catch (ReservationBusinessException ex) {
                ...
                retry = true; // instead of recursive call
            }
        } while (retry);
    }
   The problem with this solution is that the code may never complete if the cause of the exception isn't resolved.
2. Limit the number of retries. This should be done using a loop like above but with a retry count instead of a boolean, but it can still be done using recursion:
private static final int MAX_RETRIES = 3;

public void onSelectBooking() {
    onSelectBooking(0); // first attempt is not a "retry"
}

public void onSelectBooking(int retry) {
    try {
        ...
    } catch (ReservationBusinessException ex) {
        // >= rather than >, so that at most MAX_RETRIES retries are made
        if (retry >= MAX_RETRIES) {
            throw new RuntimeException("Max. number of retries (" + MAX_RETRIES + ") exceeded: " + ex, ex);
        }
        ...
        onSelectBooking(retry + 1);
    }
}
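The loop-based variant of the bounded retry, which the answer recommends but does not show, could look roughly like this. It is a sketch only: BookingRetry, runWithRetries and the Runnable standing in for the elided body of onSelectBooking() are hypothetical, and it retries on any RuntimeException where the real code would catch ReservationBusinessException and do its override handling before retrying.

```java
// Hypothetical sketch: the Runnable stands in for the elided try-block body.
final class BookingRetry {
    static final int MAX_RETRIES = 3;

    private BookingRetry() {}

    /** Runs the action once, then retries up to MAX_RETRIES times on failure. */
    static void runWithRetries(Runnable action) {
        int retries = 0;
        while (true) {
            try {
                action.run();
                return; // success: leave the loop
            } catch (RuntimeException ex) {
                if (retries >= MAX_RETRIES) {
                    throw new RuntimeException(
                            "Max. number of retries (" + MAX_RETRIES + ") exceeded: " + ex, ex);
                }
                retries++; // loop again instead of making a recursive call
            }
        }
    }
}
```

Unlike the recursive version, the call stack stays flat no matter how many retries are allowed.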

SocketChannel high CPU load on read

I have a SocketChannel configured for reading only, registered with SelectionKey.OP_CONNECT | SelectionKey.OP_READ.
The profiler shows that runChannel is the most CPU-consuming method, which is reasonable because it is an infinite loop that calls selector.select() all the time. On the other hand, I have dozens of such connections and it kills the CPU.
Is it possible to decrease the CPU load without missing any incoming messages?
public void runChannel() {
    while (session.isConnectionAlive()) {
        try {
            // Wait for an event
            int num = selector.select();
            // If you don't have any activity, loop around and wait
            // again.
            if (num == 0) {
                continue;
            }
        } catch (IOException e) {
            log.error("Selector error: {}", e.toString());
            log.debug("Stacktrace: ", e);
            session.closeConnection();
            break;
        }
        handleSelectorkeys(selector.selectedKeys());
    }
}
Unsubscribe from OP_CONNECT: select() won't block if you're subscribed to OP_CONNECT and already connected, so the loop spins without ever waiting.
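A minimal sketch of that fix, assuming the key handler sees the SelectionKey of the connecting channel: once finishConnect() succeeds, drop OP_CONNECT from the interest set so that select() blocks again until data arrives. The class and method names here are hypothetical; finishConnect(), interestOps() and the OP_* constants are standard java.nio.channels API.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;

// Hypothetical helper; only the java.nio calls are standard API.
final class ConnectHandling {
    private ConnectHandling() {}

    /** Returns the interest set with OP_CONNECT removed and everything else kept. */
    static int withoutConnect(int interestOps) {
        return interestOps & ~SelectionKey.OP_CONNECT;
    }

    /** Call this for a selected key when key.isConnectable() is true. */
    static void onConnectable(SelectionKey key) throws IOException {
        SocketChannel channel = (SocketChannel) key.channel();
        if (channel.finishConnect()) {
            // Connected: stop selecting for OP_CONNECT, keep OP_READ and any other ops.
            key.interestOps(withoutConnect(key.interestOps()));
        }
    }
}
```

Processed keys also need to be removed from selector.selectedKeys() inside the key handler; otherwise they remain in the selected set on the next pass through the loop.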

Loop On Exception

I have a custom-built API for interacting with a messaging system. This API doesn't give me any way to confirm that I have established a connection, other than throwing an exception when it is unable to connect.
When I receive an exception while connected, I have an exception listener that attempts to reconnect to the server. I'd like this to loop on exception to retry the connection, looping indefinitely until I am able to connect or until the program is closed. I attempted to do this with break labels like so:
reconnect: try {
    attemptReconnection();
} catch (Exception e) {
    log.error(e);
    break reconnect;
}
but that was unable to find the reconnect label for me, and it is a bit too close to using a GOTO statement for me to be comfortable putting it into production.
Proceed this way:
do { // optional loop choice
    try {
        attemptReconnection();
        break; // Connection was successful, break out of the loop
    } catch (Exception e) {
        // Exception thrown, do nothing and move on to the next connection attempt (iteration)
        log.error(e);
    }
} while (true);
If the execution flow reaches the break; instruction then that means that you successfully connected. Otherwise, it will keep moving on to the next iteration. (Note that the loop choice is optional, you can use pretty much any loop you want)
Can't say I have experience with APIs, but I would think something like this would achieve the result you're after.
boolean success = false;
while (!success) {
    try {
        attemptReconnection();
        success = true;
    } catch (Exception e) {
        log.error(e);
    }
}
Once attemptReconnection() executes without errors, success would be set to true and terminate the loop.
Have attemptReconnection return true when the connection succeeds, false otherwise.
The method attemptReconnection should also catch and log the Exception.
Then:
while (!attemptReconnection()) {
    log.error("Connection failure");
}
I would suggest controlling the reconnection attempts not with a while loop but with a scheduled event. This way you can easily initiate multiple connections and implement a back-off mechanism, so that you do not over-consume resources while trying to reconnect.
private ScheduledExecutorService scheduler;
...
public void connect() {
    for (int i = 0; i < numberOfConnections; i++) {
        final Runnable r = new Runnable() {
            int j = 1;

            public void run() {
                try {
                    final Connection connection = createNewConnection();
                } catch (IOException e) {
                    // here we do a back off mechanism every 1,2,4,8,16... 512 seconds
                    final long sleep = j * 1000L;
                    if (j < 512) {
                        j *= 2;
                    } else {
                        j = 1;
                    }
                    LOGGER.error("Failed connect to host:port: {}:{}. Retrying... in {} millis",
                            host, port, sleep);
                    LOGGER.debug("{}", e);
                    scheduler.schedule(this, sleep, TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    // currentThread is a method, so it needs parentheses
                    Thread.currentThread().interrupt();
                }
            }
        };
        scheduler.schedule(r, 1, TimeUnit.SECONDS);
    }
}
Do not forget to call scheduler.shutdownNow() when you close the application, to avoid leaking the thread pool.
You can even implement a reconnect mechanism for the case where you were connected and then got disconnected, by having the listener call the connect method when the connection status changes.
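A self-contained variant of the scheduled approach might look like the sketch below. The Reconnector class, its constructor parameters and the Callable standing in for createNewConnection() are all hypothetical. Unlike the code above, it caps the delay at a maximum instead of resetting it to the initial value, and it takes the delays in milliseconds.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: the Callable stands in for createNewConnection().
final class Reconnector implements Runnable {
    private final ScheduledExecutorService scheduler;
    private final Callable<?> connectAttempt;
    private final long maxDelayMillis;
    private long delayMillis; // only touched from scheduled runs, one at a time

    Reconnector(ScheduledExecutorService scheduler, Callable<?> connectAttempt,
                long initialDelayMillis, long maxDelayMillis) {
        this.scheduler = scheduler;
        this.connectAttempt = connectAttempt;
        this.delayMillis = initialDelayMillis;
        this.maxDelayMillis = maxDelayMillis;
    }

    /** Doubles the delay, but never beyond the cap. */
    static long nextDelay(long current, long max) {
        return Math.min(current * 2, max);
    }

    @Override
    public void run() {
        try {
            connectAttempt.call(); // returning normally means we are connected
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop retrying when interrupted
        } catch (Exception e) {
            long sleep = delayMillis;
            delayMillis = nextDelay(delayMillis, maxDelayMillis);
            // Re-schedule this task instead of blocking a thread in a loop.
            scheduler.schedule(this, sleep, TimeUnit.MILLISECONDS);
        }
    }
}
```

A first attempt would be started with something like scheduler.execute(new Reconnector(scheduler, attempt, 1000, 512_000)). Once the scheduler has been shut down, further re-scheduling is rejected, which ends the retry chain.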

Couchbase backoff with Java API

I'm trying to understand how incremental backoff works in the Java Couchbase API. The following code snippet is from the Couchbase Java Tutorial (I have added a few comments).
public OperationFuture<Boolean> contSet(String key,
                                        int exp,
                                        Object value,
                                        int tries) {
    OperationFuture<Boolean> result = null;
    OperationStatus status;
    int backoffexp = 0;
    try {
        do {
            if (backoffexp > tries) {
                throw new RuntimeException("Could not perform a set after "
                        + tries + " tries.");
            }
            result = cbc.set(key, exp, value);
            status = result.getStatus(); // Is this a blocking call?
            if (status.isSuccess()) {
                break;
            }
            if (backoffexp > 0) {
                double backoffMillis = Math.pow(2, backoffexp);
                backoffMillis = Math.min(1000, backoffMillis); // 1 sec max
                Thread.sleep((int) backoffMillis);
                System.err.println("Backing off, tries so far: " + backoffexp);
            }
            backoffexp++;
            // Why are we checking again if the operation previously failed
            if (!status.isSuccess()) {
                System.err.println("Failed with status: " + status.getMessage());
            }
            // If we break on success, why not do while(true)?
        } while (status.getMessage().equals("Temporary failure"));
    } catch (InterruptedException ex) {
        System.err.println("Interrupted while trying to set. Exception:"
                + ex.getMessage());
    }
    if (result == null) {
        throw new RuntimeException("Could not carry out operation.");
    }
    return result;
}
Do calls to getStatus() return only when the operation has either succeeded or failed (i.e. are they synchronous)? The Java Tutorial seems to say it is blocking, but the Java API says:
Get the current status of this operation. Note that the operation status may change as the operation is tried and potentially retried against the servers specified by the NodeLocator.
Why do we need to check status.isSuccess() multiple times? If it was successful we would have broken out of the loop, so can't we assume it has failed?
Is there any reason to do while (status.getMessage().equals("Temporary failure")) instead of while(true), since we call break when the status is successful?
Thanks
getStatus() forces the Future to complete (i.e. it internally calls get()), so that the status of the operation can be known. To put it another way, there is no such thing as a pending status in the API, so you must force the Future to complete to be able to determine its status.
See the source code of OperationFuture.getStatus for details.
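For reference, the delay that contSet sleeps before each retry can be worked out in isolation. The sketch below copies only the arithmetic from the snippet in the question; the BackoffDelay class and millisFor method are hypothetical names. Because the exponent is the try counter itself, the first failure is retried without sleeping, the following delays are 2 ms, 4 ms, 8 ms and so on up to 512 ms, and from the tenth backoff onward the 1000 ms cap applies.

```java
// Hypothetical helper isolating the arithmetic used by contSet in the question.
final class BackoffDelay {
    private BackoffDelay() {}

    /** Delay in milliseconds for the given backoff exponent; 0 means no sleep. */
    static int millisFor(int backoffexp) {
        if (backoffexp <= 0) {
            return 0; // the first failure is retried immediately
        }
        double backoffMillis = Math.pow(2, backoffexp);
        return (int) Math.min(1000, backoffMillis); // 1 sec max
    }
}
```

With these numbers the first ten tries add up to roughly one second of waiting in total, so the backoff only becomes noticeable when the tries argument is well above ten.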
