Quartz Failure On Job Recovery - java

I'm new to JSP and Quartz scheduling. In this project, I want the Quartz scheduler to keep working after the server is shut down and restarted, ignoring any jobs that were missed in the meantime.
To do this, I read up on job persistence and modified the quartz.properties file as follows:
org.quartz.threadPool.threadCount=5
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.tablePrefix = QRTZ_
org.quartz.jobStore.useProperties = true
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.dataSource = myDB
org.quartz.dataSource.myDB.driver = com.mysql.jdbc.Driver
org.quartz.dataSource.myDB.URL = jdbc:mysql://localhost:3306/contacts
org.quartz.dataSource.myDB.user = root
org.quartz.dataSource.myDB.password = root
The web.xml file contains the following:
...
<listener>
<listener-class>
org.quartz.ee.servlet.QuartzInitializerListener
</listener-class>
</listener>
...
I've added the tables to the DB, and when I run a SELECT I can see that triggers really were inserted into them.
The trigger is built as follows:
Trigger trig = TriggerBuilder
    .newTrigger()
    .startAt(scal.getTime())
    .withSchedule(
        SimpleScheduleBuilder.simpleSchedule()
            .withIntervalInMinutes(minutes).repeatForever())
    .endAt(ecal.getTime()).build();
Now, when I run my web app, I schedule a job and it executes. Then I shut down the Tomcat server and start it again. It logs the following error:
org.quartz.SchedulerConfigException: Failure occured during job recovery. [See nested exception: org.quartz.JobPersistenceException: Couldn't recover jobs: null [See nested exception: java.lang.NullPointerException]]
I tried executing the following statement once in MySQL Workbench:
UPDATE QRTZ_TRIGGERS SET NEXT_FIRE_TIME=1 WHERE NEXT_FIRE_TIME < 0;
Now I get this new error:
.manage - MisfireHandler: Error handling misfires: Unexpected runtime exception: null
org.quartz.JobPersistenceException: Unexpected runtime exception: null [See nested exception: java.lang.NullPointerException]
If you want me to edit the question and include the full stack trace, I can do that.

You may want to set the following in the properties file:
org.quartz.scheduler.misfirePolicy = doNothing
because the missed jobs are apparently what is causing your problems.
I know this is an old post, but if you have an answer, please share it with us all!

Related

Quartz trigger does not fire immediately

I'd like to execute a job almost immediately with the Quartz scheduler using a JDBC job store. However, there is a 20-30 second delay between scheduling and the trigger firing, even though I schedule with now() or call triggerJob.
I tried to execute the job with a simple trigger:
JobKey key = //...
JobDetail jobDetail = newJob(jobBean.getClass())
    .withIdentity(key)
    .usingJobData(new JobDataMap(jobParams))
    .storeDurably()
    .build();
Trigger trigger = newTrigger()
    .withIdentity(key.getName(), key.getGroup())
    .startNow()
    .withSchedule(SimpleScheduleBuilder.simpleSchedule()
        .withMisfireHandlingInstructionFireNow()
        .withRepeatCount(0))
    .build();
scheduler.scheduleJob(jobDetail, trigger);
And I also tried to trigger with scheduler:
JobKey key = // ...
JobDetail jobDetail = newJob(jobBean.getClass())
    .withIdentity(key)
    .storeDurably()
    .build();
scheduler.addJob(jobDetail, true);
scheduler.triggerJob(key, new JobDataMap(jobParams));
Here are the listener logs that show the delay:
2019-05-15 13:59:52,066Z INFO [nio-8081-exec-2] c.m.f.s.logger.SchedulingListener : Job added: newsJobTemplate:1557928791965
2019-05-15 13:59:52,066Z INFO [nio-8081-exec-2] c.m.f.s.logger.SchedulingListener : Job scheduled: newsJobTemplate:1557928791965
2019-05-15 14:00:18,660Z INFO [eduler_Worker-1] c.m.f.s.logger.TriggerStateListener : Trigger fired: QUARTZ_JOBS.newsJobTemplate:1557928791965 {}
2019-05-15 14:00:18,703Z INFO [eduler_Worker-1] c.m.f.s.logger.JobExecutionListener : Job will be executed: QUARTZ_JOBS.newsJobTemplate:1557928791965
2019-05-15 14:00:19,284Z INFO [eduler_Worker-1] c.m.f.s.logger.JobExecutionListener : Job was executed: QUARTZ_JOBS.newsJobTemplate:1557928791965
I found crumbs here and there suggesting that the problem is transaction-related.
So I removed @Transactional from the service method and, voilà, it worked.
It looks like when you call trigger, the scheduler thread asynchronously tries to look up schedules and triggers in the DB, but the transaction has not been committed at that point. Later the scheduler thread queries the DB again and finally finds them.
zolee's answer describes the problem perfectly, but there are also a few things one can do to solve it.
One imperfect solution is to reduce org.quartz.scheduler.idleWaitTime. In fact, the problem itself is described, though somewhat obliquely, in the org.quartz.scheduler.idleWaitTime section of the Quartz configuration docs:
Normally you should not have to ‘tune’ this parameter, unless you’re
using XA transactions, and are having problems with delayed firings of
triggers that should fire immediately.
That will allow you to reduce the 30-second delay to 5 seconds or even less.
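As a rough illustration of that setting, the sketch below reads org.quartz.scheduler.idleWaitTime from quartz.properties-style text and falls back to 30000 ms, which is Quartz's documented default and matches the delay seen above. IdleWaitTimeConfig and idleWaitMillis are hypothetical names used only for this example; they are not part of Quartz.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper: reads idleWaitTime from quartz.properties-style text. */
final class IdleWaitTimeConfig {
    // Quartz's documented default for org.quartz.scheduler.idleWaitTime.
    static final long DEFAULT_IDLE_WAIT_MS = 30_000L;

    static long idleWaitMillis(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String raw = props.getProperty("org.quartz.scheduler.idleWaitTime");
        return raw == null ? DEFAULT_IDLE_WAIT_MS : Long.parseLong(raw.trim());
    }

    public static void main(String[] args) {
        // The line you would add to quartz.properties to shorten the wait.
        String tuned = "org.quartz.scheduler.idleWaitTime = 5000\n";
        System.out.println(idleWaitMillis(tuned)); // prints 5000
        System.out.println(idleWaitMillis(""));    // prints 30000
    }
}
```

If I remember the same doc section correctly, it also advises against values below 5000 ms because of the extra database polling, so treat 5000 as a floor rather than a starting point.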
A full solution is to extend QuartzScheduler to add transaction support. Exact implementation will depend on what library/code you're using for transaction support, but it worked for us perfectly.
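class TransactionAwareScheduler extends QuartzScheduler {
    // Constructor omitted. insideTransaction and transaction are placeholders
    // for whatever your transaction library provides.
    @Override
    protected void notifySchedulerThread(long candidateNewNextFireTime) {
        if (insideTransaction) {
            // Defer the notification until the surrounding transaction commits.
            transaction.addCommitHook(() -> {
                super.notifySchedulerThread(candidateNewNextFireTime);
            });
        } else {
            super.notifySchedulerThread(candidateNewNextFireTime);
        }
    }
}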
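To show the idea in isolation, here is a toy model that runs without Quartz or any transaction library. FakeTransaction and Notifier are hypothetical stand-ins invented for this sketch; a real implementation would hook into your actual transaction manager. The point is only that the notification is held back until commit, so the scheduler thread never looks for a trigger that is not yet visible in the DB.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of "notify the scheduler only after the transaction commits". */
final class CommitHookDemo {

    /** Hypothetical stand-in for a real transaction; it only collects hooks. */
    static final class FakeTransaction {
        private final List<Runnable> commitHooks = new ArrayList<>();
        private boolean active = true;

        boolean isActive() { return active; }

        void addCommitHook(Runnable hook) { commitHooks.add(hook); }

        void commit() {
            active = false;
            commitHooks.forEach(Runnable::run);
            commitHooks.clear();
        }
    }

    /** Records notifications; a real scheduler would wake its polling thread. */
    static final class Notifier {
        final List<Long> notified = new ArrayList<>();

        void notifySchedulerThread(long candidateNewNextFireTime, FakeTransaction tx) {
            if (tx != null && tx.isActive()) {
                // Inside a transaction: defer until the data is committed.
                tx.addCommitHook(() -> notified.add(candidateNewNextFireTime));
            } else {
                // No transaction: notify right away.
                notified.add(candidateNewNextFireTime);
            }
        }
    }

    public static void main(String[] args) {
        FakeTransaction tx = new FakeTransaction();
        Notifier notifier = new Notifier();
        notifier.notifySchedulerThread(1000L, tx);
        System.out.println(notifier.notified.size()); // prints 0
        tx.commit();
        System.out.println(notifier.notified.size()); // prints 1
    }
}
```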

Grails NPE in java.net.URI$Parser.parse() when sending mail inside Quartz

We have a scheduled invoicing service, where we send invoices to customers' email.
asynchronousMailService.sendMail {
    multipart true
    to emailTo.split("[,;]")
    bcc bccString
    from fromString
    subject subjectString
    html view: '/email/invoiceEmailTemplate',
         model: [companyName: companyName, customerFirstName: order.customer.firstName,
                 xeroInvoiceId: invoice.invoiceNumber, invoiceTotal: order.totalAmount,
                 invoiceUrl: invoiceUrl,
                 currencyCode: invoice.currencyCode, dueDate: invoice.dueDate]
    attachBytes invoice.invoiceNumber + ".pdf", 'application/pdf', invoiceBytes
}
This causes the following error:
2016-03-09 18:22:23,073 [quartzScheduler_Worker-10] ERROR listeners.ExceptionPrinterJobListener - Exception occurred in job: Grails Job
org.quartz.JobExecutionException: java.lang.NullPointerException [See nested exception: java.lang.NullPointerException]
at grails.plugins.quartz.GrailsJobFactory$GrailsJob.execute(GrailsJobFactory.java:111)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
Caused by: java.lang.NullPointerException
at java.net.URI$Parser.parse(URI.java:3023)
at java.net.URI.<init>(URI.java:595)
at grails.plugin.mail.MailMessageContentRenderer$PageRenderRequestCreator.createInstance(MailMessageContentRenderer.groovy:198)
at grails.plugin.mail.MailMessageContentRenderer$RenderEnvironment.init(MailMessageContentRenderer.groovy:147)
at grails.plugin.mail.MailMessageContentRenderer$RenderEnvironment.with(MailMessageContentRenderer.groovy:178)
at grails.plugin.mail.MailMessageContentRenderer.render(MailMessageContentRenderer.groovy:63)
at grails.plugin.asyncmail.AsynchronousMailMessageBuilder.doRender(AsynchronousMailMessageBuilder.groovy:281)
at grails.plugin.asyncmail.AsynchronousMailMessageBuilder.html(AsynchronousMailMessageBuilder.groovy:267)
at com.mycompany.thirdparty.InvoiceService$_emailInvoiceToCustomer_closure5.doCall(InvoiceService.groovy:118)
at grails.plugin.asyncmail.AsynchronousMailService.sendAsynchronousMail(AsynchronousMailService.groovy:21)
at AsynchronousMailGrailsPlugin$_configureSendMail_closure9.doCall(AsynchronousMailGrailsPlugin.groovy:132)
at com.mycompany.thirdparty.InvoiceService.emailInvoiceToCustomer(InvoiceService.groovy:112)
at com.mycompany.thirdparty.InvoiceService$_createInvoice_closure2.doCall(InvoiceService.groovy:47)
at grails.plugin.multitenant.core.MultiTenantService$_doWithTenantId_closure1_closure2.doCall(MultiTenantService.groovy:32)
at grails.plugin.hibernatehijacker.template.HibernateTemplates$_withTransaction_closure1.doCall(HibernateTemplates.groovy:39)
at grails.plugin.hibernatehijacker.template.HibernateTemplates.withTransaction(HibernateTemplates.groovy:37)
at grails.plugin.multitenant.core.MultiTenantService$_doWithTenantId_closure1.doCall(MultiTenantService.groovy:31)
at grails.plugin.hibernatehijacker.template.HibernateTemplates$_withNewSession_closure2.doCall(HibernateTemplates.groovy:65)
at grails.plugin.hibernatehijacker.template.HibernateTemplates.withNewSession(HibernateTemplates.groovy:57)
at grails.plugin.multitenant.core.MultiTenantService.doWithTenantId(MultiTenantService.groovy:30)
at grails.plugin.multitenant.singledb.MtSingleDbPluginSupport$_createWithTenantIdMethod_closure2.doCall(MtSingleDbPluginSupport.groovy:141)
at com.mycompany.thirdparty.InvoiceService.createInvoice(InvoiceService.groovy:38)
at com.mycompany.thirdparty.InvoiceJob.execute(InvoiceJob.groovy:13)
at grails.plugins.quartz.GrailsJobFactory$GrailsJob.execute(GrailsJobFactory.java:102)
The above errors only occur on the production server; I have had no success reproducing them on my local machine or on the dev server.
Any ideas?
I answered that question once here: Grails Mail service not working with guartz scheduler in war mode
The cause is AsyncMail being unable to get the server's URL. The simplest fix is to configure it via the grails.serverURL config property.
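The top frames of the trace (java.net.URI.<init> calling URI$Parser.parse) are what you would see if a null string reached the URI constructor. The sketch below reproduces just that JDK behavior. It assumes, without having checked the plugin source, that the mail renderer builds a URI from the configured server URL; NullUriDemo is a hypothetical name.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Shows that constructing a java.net.URI from a null string throws an NPE. */
final class NullUriDemo {

    /** Returns true when new URI(serverUrl) fails with a NullPointerException. */
    static boolean failsWithNpe(String serverUrl) {
        try {
            new URI(serverUrl);
            return false;
        } catch (NullPointerException e) {
            // Same exception type as in the production stack trace.
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsWithNpe(null));                        // prints true
        System.out.println(failsWithNpe("http://localhost:8080/app")); // prints false
    }
}
```

That would also explain why the problem shows up only in production: an environment that has no server URL configured has nothing to hand to the constructor.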

Cassandra NoHostAvailableException: All host(s) tried for query failed in Production

We have 10 Cassandra nodes in production running Cassandra 2.1.8, to which we recently upgraded. Previously we had only 3 nodes running Cassandra 2.1.2. First we upgraded the original 3 nodes from 2.1.2 to 2.1.8 (following the procedure described in Upgrading Cassandra). Then we added 7 more nodes running Cassandra 2.1.8 to the cluster and started our client programs. For the first few hours everything worked fine, but after a few hours we saw errors like this in the client logs:
Thread-0 [29/07/15 17:41:23.356] ERROR com.cleartrail.entityprofiling.engine.InterpretationWriter - Error:com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: [/172.50.33.161:9041, /172.50.33.162:9041, /172.50.33.95:9041, /172.50.33.96:9041, /172.50.33.165:9041, /172.50.33.166:9041, /172.50.33.163:9041, /172.50.33.164:9041, /172.50.33.42:9041, /172.50.33.167:9041] - use getErrors() for details)
at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:65)
at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:259)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:175)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
at com.cleartrail.entityprofiling.engine.InterpretationWriter.WriteInterpretation(InterpretationWriter.java:430)
at com.cleartrail.entityprofiling.engine.Profiler.buildProfile(Profiler.java:1042)
at com.cleartrail.messageconsumer.consumer.KafkaConsumer.run(KafkaConsumer.java:336)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: [/172.50.33.161:9041, /172.50.33.162:9041, /172.50.33.95:9041, /172.50.33.96:9041, /172.50.33.165:9041, /172.50.33.166:9041, /172.50.33.163:9041, /172.50.33.164:9041, /172.50.33.42:9041, /172.50.33.167:9041] - use getErrors() for details)
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:102)
at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:176)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I have double-checked the firewall (as suggested in a few posts), the ports, and the timeouts on both the client and the nodes, and they are all correct.
I am also not closing the connection anywhere in between. I am using batch queries with a batch size of 1000; the queries are updates to counters in a table with three columns:
entity, twfwv, cvalue
where entity and twfwv are text columns that form the primary key, and cvalue is a counter column.
I even restarted all my nodes (this trick helped in my dev environment when I faced the same exception), but it's not helping. Please suggest what the probable cause could be.
My issue was resolved by checking the errors collection of NoHostAvailableException as advised by Olivier Michallat in the comments. For me it was the protocol version on the cluster configuration. Mine was null, setting it to 3 fixed the problem.
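For anyone wondering what "checking the errors collection" looks like in practice, here is a small sketch that turns a per-host error map into log lines. It assumes your driver version's NoHostAvailableException.getErrors() returns a map keyed by InetSocketAddress; the map is built by hand here so the example runs without the driver, and HostErrorReport is a hypothetical helper name.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: formats a per-host error map into one line per host. */
final class HostErrorReport {

    static List<String> describe(Map<InetSocketAddress, Throwable> errors) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<InetSocketAddress, Throwable> entry : errors.entrySet()) {
            InetSocketAddress host = entry.getKey();
            Throwable cause = entry.getValue();
            lines.add(host.getHostString() + ":" + host.getPort()
                    + " -> " + cause.getClass().getSimpleName()
                    + ": " + cause.getMessage());
        }
        return lines;
    }

    public static void main(String[] args) {
        // In real code this map would come from catching the exception:
        //   catch (NoHostAvailableException e) { describe(e.getErrors()); }
        Map<InetSocketAddress, Throwable> errors = new LinkedHashMap<>();
        errors.put(InetSocketAddress.createUnresolved("172.50.33.161", 9041),
                new IllegalStateException("example cause"));
        describe(errors).forEach(System.out::println);
    }
}
```

The per-host cause is usually far more telling than the top-level "All host(s) tried for query failed" message.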
My issue was resolved by adding a property that switches the custom TokenAwarePolicy load-balancing policy my connection was using on or off, and relying on the default policy when it is off.
Specifically, I was trying to get a local spring boot app talking to a single dockerized Cassandra instance.
Cluster.Builder builder = Cluster.builder()
    .addContactPoints(cassandraProperties.getHosts())
    .withPort(cassandraProperties.getPort())
    .withProtocolVersion(ProtocolVersion.V4)
    .withRetryPolicy(new LoggingRetryPolicy(DefaultRetryPolicy.INSTANCE))
    .withCredentials(cassandraProperties.getUsername(), cassandraProperties.getPassword())
    .withCodecRegistry(codecRegistry);
if (loadBalanced) {
    builder.withLoadBalancingPolicy(
        new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().withLocalDc(localDc).build()));
}

about misfire in java quartz scheduling

I have set up a trigger using a cron schedule with a misfire instruction, as follows:
trigger = newTrigger()
    .withIdentity("autoLockTrigger", "autoLockGroup")
    .startNow()
    .withSchedule(cronSchedule(croneExpression).withMisfireHandlingInstructionFireAndProceed())
    .forJob("autoLockJob", "autoLockGroup")
    .build();
My quartz.properties is as follows:
org.quartz.scheduler.instanceName =MyScheduler
# Configuring ThreadPool
org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 1
org.quartz.threadPool.threadPriority = 9
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.dataSource = myDS
org.quartz.jobStore.tablePrefix = QRTZ_
#org.quartz.dataSource.myDS.jndiURL = jdbc/vikas
org.quartz.dataSource.myDS.driver = com.mysql.jdbc.Driver
org.quartz.dataSource.myDS.URL = jdbc:mysql://staging:3307/facao
org.quartz.dataSource.myDS.user = root
org.quartz.dataSource.myDS.password = toor
org.quartz.dataSource.myDS.maxConnections = 30
#org.quartz.jobStore.nonManagedTXDataSource = myDS
#to store data in string format (name-value pair)
org.quartz.jobStore.useProperties=true
org.quartz.jobStore.misfireThreshold = 60000
If I schedule a trigger for a particular time and the server is running, the scheduler fires properly. But if the server is down at the time the trigger is supposed to fire and is started again some time later, the scheduler should apply the misfire instruction. In my case the misfire instruction is applied only some of the time, not always, so my purpose is not fulfilled. Please suggest a solution. Thank you in advance.
I am not sure about cron triggers, but for simple triggers, yes:
if the end time of the trigger has passed, then some of the provided misfire instructions
will not work. See the javadoc snippet for more info.
I guess the same applies to cron triggers too.
So it depends entirely on which cron expression you use.
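It may also help to see what misfireThreshold = 60000 means in the question's config. As far as I understand the JDBC job store, a trigger only counts as misfired once its next fire time is more than the threshold behind the current time; anything less late is simply fired as normal. The sketch below shows that comparison. MisfireCheck and isMisfired are hypothetical names, not Quartz API.

```java
/** Hypothetical sketch of the misfire-threshold comparison. */
final class MisfireCheck {

    /**
     * A trigger is treated as misfired when its next fire time is older
     * than (now - misfireThreshold). All values are epoch milliseconds.
     */
    static boolean isMisfired(long nextFireTimeMs, long nowMs, long misfireThresholdMs) {
        return nextFireTimeMs < nowMs - misfireThresholdMs;
    }

    public static void main(String[] args) {
        long threshold = 60_000L; // org.quartz.jobStore.misfireThreshold from the question
        long due = 1_000_000L;

        // Server came back 30 s late: within the threshold, not a misfire.
        System.out.println(isMisfired(due, due + 30_000L, threshold));  // prints false
        // Server came back 2 min late: beyond the threshold, a misfire.
        System.out.println(isMisfired(due, due + 120_000L, threshold)); // prints true
    }
}
```

So whether the misfire instruction appears to "run" can depend on how long the server was down relative to the threshold, which is one possible reason for the inconsistent behavior.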

weblogic context lookup error : java.rmi.UnmarshalException: error unmarshalling arguments

We are facing an issue in our production environment. We have searched the net high and low and have not been able to come up with any answers.
This error (stack trace below) occurs when an EJB lookup is made from managed server 1 to managed server 2. A virtual IP is used for the lookup. It occurs intermittently and at random intervals. We are not able to identify any pattern, and if the EJB call is attempted two or three times, it gets through successfully.
Env details:
server: WebLogic 10.0 MP1 running on Java 1.5
OS: Solaris
Please let me know if any other details are required.
Source used for lookup:
private TreControlRemote getController() throws Exception {
    Context context = null;
    Properties p = new Properties();
    TreControlHome treHome = null;
    TreControlRemote remote = null;
    ConfigurationLoader lAppLoader = null;
    try {
        mLog.debug("Entering");
        lAppLoader = PropertiesFileLoader.getInstance("context.properties");
        p.put(Context.INITIAL_CONTEXT_FACTORY, lAppLoader.getValue("INITIAL_CONTEXT_FACTORY"));
        p.put(Context.PROVIDER_URL, lAppLoader.getValue("PROVIDER_URL"));
        context = new InitialContext(p);
        mLog.debug("context : " + context.getEnvironment());
        remote = null;
        treHome = (TreControlHome) context.lookup("CONTROL");
        mLog.debug("Object --->>>>" + treHome);
        remote = (TreControlRemote) treHome.create();
        mLog.debug("Leaving");
    } catch (Exception ex) {
        mLog.fatal("Exception while getting remote", ex);
        ex.printStackTrace();
        throw ex;
    } finally {
        lAppLoader = null;
    }
    return remote;
}
The URL is a virtual IP pointing to managed server 2, which contains an EJB with the JNDI name "CONTROL". The problem is that the lookup succeeds on some occasions and fails randomly with the error below.
Stack trace of the error:
*javax.naming.CommunicationException [Root exception is java.rmi.UnmarshalException: error unmarshalling arguments; nested exception is:
java.io.StreamCorruptedException]
at weblogic.jndi.internal.ExceptionTranslator.toNamingException(ExceptionTranslator.java:74)
at weblogic.jndi.internal.WLContextImpl.translateException(WLContextImpl.java:426)
at weblogic.jndi.internal.WLContextImpl.lookup(WLContextImpl.java:382)
at weblogic.jndi.internal.WLContextImpl.lookup(WLContextImpl.java:367)
at javax.naming.InitialContext.lookup(InitialContext.java:351)
Caused by: java.rmi.UnmarshalException: error unmarshalling arguments; nested exception is:
java.io.StreamCorruptedException
at weblogic.rjvm.ResponseImpl.unmarshalReturn(ResponseImpl.java:221)
at weblogic.rmi.cluster.ClusterableRemoteRef.invoke(ClusterableRemoteRef.java:338)
at weblogic.rmi.cluster.ClusterableRemoteRef.invoke(ClusterableRemoteRef.java:252)
at weblogic.jndi.internal.ServerNamingNode_1001_WLStub.lookup(Unknown Source)
at weblogic.jndi.internal.WLContextImpl.lookup(WLContextImpl.java:379)
... 33 more
Caused by: java.io.StreamCorruptedException
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1332)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
at weblogic.utils.io.ChunkedObjectInputStream.readObject(ChunkedObjectInputStream.java:195)
at weblogic.rjvm.MsgAbbrevInputStream.readObject(MsgAbbrevInputStream.java:565)
at weblogic.utils.io.ChunkedObjectInputStream.readObject(ChunkedObjectInputStream.java:191)
at weblogic.jndi.internal.RootNamingNode_WLSkel.invoke(Unknown Source)
at weblogic.rmi.internal.BasicServerRef.invoke(BasicServerRef.java:589)
at weblogic.rmi.cluster.ClusterableServerRef.invoke(ClusterableServerRef.java:224)
at weblogic.rmi.internal.BasicServerRef$1.run(BasicServerRef.java:479)
at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:363)
at weblogic.security.service.SecurityManager.runAs(Unknown Source)
at weblogic.rmi.internal.BasicServerRef.handleRequest(BasicServerRef.java:475)
at weblogic.rmi.internal.BasicServerRef.access$300(BasicServerRef.java:59)
at weblogic.rmi.internal.BasicServerRef$BasicExecuteRequest.run(BasicServerRef.java:1016)
... 2 more*
We obtained the stack trace below from the WebLogic log. Could this error be related to the problem described above?
*####<Aug 25, 2009 2:11:04 AM BST> <Info> <RJVM> <pkssv049> <M1AP4> <ACTIVE ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <1251162664181> <BEA-000513> <Failure in heartbeat trigger for RJVM: 5433424963141690658S:169.93.73.0:10040,10040,-1,-1,-1,-1,-1:pkssv049.***.net:10240,pkssv049.***.net:10241,pkssv050.***.net:10240,pkssv050.***.net:10241:LIQP1_LMSDomain:M1AP3
java.io.IOException: The connection manager to ConnectionManager for: 'weblogic.rjvm.RJVMImpl@189ed0e - id: '5433424963141690658S:169.93.73.0:10040,10040,-1,-1,-1,-1,-1:pkssv049.***.net:10240,pkssv049.***.net:10241,pkssv050.***.net:10240,pkssv050.***.net:10241:LIQP1_LMSDomain:M1AP3' connect time: 'Mon Aug 24 20:24:02 BST 2009'' has already been shut down.
java.io.IOException: The connection manager to ConnectionManager for: 'weblogic.rjvm.RJVMImpl@189ed0e - id: '5433424963141690658S:169.93.73.0:10040,10040,-1,-1,-1,-1,-1:pkssv049.***.net:10240,pkssv049.***.net:10241,pkssv050.***.net:10240,pkssv050.***.net:10241:LIQP1_LMSDomain:M1AP3' connect time: 'Mon Aug 24 20:24:02 BST 2009'' has already been shut down
at weblogic.rjvm.ConnectionManager.getOutputStream(ConnectionManager.java:1686)
at weblogic.rjvm.ConnectionManager.createHeartbeatMsg(ConnectionManager.java:1629)
at weblogic.rjvm.ConnectionManager.sendHeartbeatMsg(ConnectionManager.java:607)
at weblogic.rjvm.RJVMImpl$HeartbeatChecker.timerExpired(RJVMImpl.java:1540)
at weblogic.timers.internal.TimerImpl.run(TimerImpl.java:273)
at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:464)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:200)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:172)*
Any help would be greatly appreciated.
Here is some additional info:
Q: Is the problem intermittent, or does it reproduce every single time? If it is intermittent, do you know under what conditions it occurs?
A: It occurs intermittently and we are not able to observe any pattern.
Q: Are there any other errors/warnings logged on either the local server or the remote server?
A: We see a lot of "connection refused" errors in the WebLogic log.
Q: Are both managed servers in the same domain?
A: Yes.
When you pass an instance of com.myclientcompany.server.eai.interactionspecimpl as an argument to
your EJB, WebLogic needs to deserialize (unmarshal) the object in the EJB's context, and it needs the corresponding class to do so. So if you include the interactionspecimpl class in your ejb-jar file, you do not need to add it to the server's classpath.
This issue can occur if you have either a Duplicate entry for or due to a blank space in between.
You need to check all the configuration files, including the JDBC and JMS files and config.xml, to find such an entry.
Also check whether you left a blank space when entering the JNDI name in the console.
Removing the blank space or the duplicate entry resolves this issue.
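If you have many entries to go through, a quick script can flag suspicious names before you edit anything. The sketch below is only an illustration: it assumes you have already collected the JNDI names into a list (how you extract them from your config files is up to you), and JndiNameCheck is a hypothetical helper, not a WebLogic tool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: flags JNDI names containing whitespace or duplicated. */
final class JndiNameCheck {

    static List<String> findProblems(List<String> jndiNames) {
        List<String> problems = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String name : jndiNames) {
            // Any whitespace (leading, trailing, or in the middle) is suspicious.
            if (name.chars().anyMatch(Character::isWhitespace)) {
                problems.add("whitespace: '" + name + "'");
            }
            // Compare trimmed names so "CONTROL " counts as a duplicate of "CONTROL".
            if (!seen.add(name.trim())) {
                problems.add("duplicate: '" + name.trim() + "'");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("CONTROL", "CONTROL ", "jdbc/ds");
        findProblems(names).forEach(System.out::println);
    }
}
```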
