JDBC Communication link failure under heavy load - java

I have a Scala / Akka / Slick 3.x runnable that retrieves 25,000 records from a MySQL 5.6 Community DB. For each record, it spawns an actor that runs 7 counts and returns a case class containing those 7 values to the runner, which, once finished, saves the results as a CSV file.
It works perfectly fine locally: it gets all the data, stores it in memory, outputs it, and shuts itself down. Against my production DB, it runs fine for about 40 seconds, and I can see in the logs that it is retrieving data until this happens:
[INFO] [03/24/2016 08:41:46.710] [indicator-runner-akka.actor.default-dispatcher-17] [akka://indicator-runner/user/user-usage-provider/$gO] Valid data set sent for xxxxxx
[INFO] [03/24/2016 08:41:46.711] [indicator-runner-akka.actor.default-dispatcher-17] [akka://indicator-runner/user/user-usage-provider/$fO] Valid data set sent for xxxxxx
[INFO] [03/24/2016 08:41:46.722] [indicator-runner-akka.actor.default-dispatcher-9] [akka://indicator-runner/user/user-usage-provider/$uN] Valid data set sent for xxxxxx
[INFO] [03/24/2016 08:41:46.731] [indicator-runner-akka.actor.default-dispatcher-12] [akka://indicator-runner/user/user-usage-provider/$hO] Valid data set sent for xxxxxx
[ERROR] [03/24/2016 08:41:46.823] [indicator-runner-akka.actor.default-dispatcher-6] [akka://indicator-runner/user/user-usage-provider/$kO] Should'nt receive this message : Failure(com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.)$
At this point, I shut down the actor system and return the appropriate exit code, but the operation has not completed.
I previously hit the maxQueue exception in Slick, which I fixed with:
executor = AsyncExecutor("HeavyLoad", numThreads = 24, queueSize = 10000)
I also tried:
echo 4096 > /proc/sys/net/core/somaxconn
I also set max_connections to 500, with no luck.
Any ideas about what could be going wrong?
Thank you!

Redisson client: RedisTimeoutException issue

I am using a Google Cloud managed Redis cluster (v5) via Redisson (3.12.5).
Here is my single-server configuration in the YAML file:
singleServerConfig:
  idleConnectionTimeout: 10000
  connectTimeout: 10000
  timeout: 3000
  retryAttempts: 3
  retryInterval: 1500
  password: null
  subscriptionsPerConnection: 5
  clientName: null
  address: "redis://127.0.0.1:6379"
  subscriptionConnectionMinimumIdleSize: 1
  subscriptionConnectionPoolSize: 50
  connectionMinimumIdleSize: 40
  connectionPoolSize: 250
  database: 0
  dnsMonitoringInterval: 5000
threads: 0
nettyThreads: 0
codec: !<org.redisson.codec.JsonJacksonCodec> {}
I get the following exceptions when I increase the load on my application:
org.redisson.client.RedisTimeoutException: Unable to acquire connection! Increase connection pool size and/or retryInterval settings Node source: NodeSource
org.redisson.client.RedisTimeoutException: Command still hasn't been written into connection! Increase nettyThreads and/or retryInterval settings. Payload size in bytes: 34. Node source: NodeSource
There seems to be no issue on the Redis cluster itself, so I think I need to tweak my client-side Redis connection pooling configuration (shown above) to make it work.
Please suggest the changes I need to make to my configuration.
I am also curious whether I should close the Redis connection after making get/set calls. I have searched for this but found nothing conclusive on how to close Redis connections.
Finally, is there any mechanism in Redisson to get Redis connection pool stats (active connections, idle connections, etc.)?
Edit1:
I have tried changing the following values in 3 different iterations:
Iteration 1:
idleConnectionTimeout: 30000
connectTimeout: 30000
timeout: 30000
Iteration 2:
nettyThreads: 0
Iteration 3:
connectionMinimumIdleSize: 100
connectionPoolSize: 750
None of these changes has worked for me.
Any help is appreciated.
Thanks in advance.
I am assuming you are getting low-memory alerts on your cache JVM.
You may have to analyze the traffic and determine which of 2 things is happening:
1. Too many parallel cache persists.
2. A huge chunk of data being persisted at once.
Both can be determined from the traffic on your server.
For option 1, configuring the pool size should solve your issue, but for option 2 you may have to refactor your code to persist data in smaller chunks.
Try setting nettyThreads to 64.

MQ7 with Java 7 and SSL is not working; it was working 6 months ago

We have one queue manager (QM), one channel, and many queues created for clients. Around 5 clients connect to this QM for their transactions, each to its own queues. A JKS file was created for this QM for SSL connections. Each of the 5 clients connects from its Java client using the JKS file and SSL_RSA_WITH_RC4_128_SHA. The QM is configured with SSLCIPH(RC4_SHA_US).
Now, all of a sudden and without any change to the Java client, 1 client can no longer connect to the configured QM. All the others can connect to the same QM without any issue.
AMQERR01.LOG does not contain any specific exception or error.
The application logs show only the common MQ exception:
Error as com.ibm.mq.MQException: MQJE001: Completion Code '2', Reason '2397'
2397: could this be a CipherSpec/CipherSuite mismatch?
We enabled tracing (strmqtrc -m TEST.QM -t detail -t all) and looked at the trace logs in C:\Program Files (x86)\IBM\Websphere MQ\trace, but could not find any details on why the SSL connection fails.
We also ran one more experiment: we created a new QM for the affected client and tested without SSL, and it works. When we enabled SSL on the new QM and in the Java client, the same 2397 was logged again.
Could someone point me to better logging and tracing in MQ that would show why 2397 is thrown?
Could someone point me to better logging and tracing in Java using -D options (for example -Djavax.net.debug=all) that would show why 2397 is thrown?
MQ version: 7
MQ server OS: Windows
From the trace logs:
returning TEST.QM
Freeing cbmindex:0 pointer:24DDB540 length:2080
-----} TreeNode.getMQQmgrExtObject (rc=OK)
cbmindex:10
-------------} xcsFreeMemFn (rc=OK)
------------} amqjxcoa.wmqGetAttrs (rc=OK)
-----{ UiQueueManager.testQmgrAttribute
-------------{ Message.getMessage
testing object 'TEST.QM'
An internal method detected an unexpected system return code. The method {0} returned {1}. (AMQ4580)
checking attribute 'QmgrCmdLevelGreaterThan'
-------------} Message.getMessage (rc=OK)
for value '510'
-----------}! NativeCalls.getAttrs (rc=Unknown(C35E))
-----} UiQueueManager.testQmgrAttribute (rc=OK)
Message = An internal method detected an unexpected system return code. The method wmq_get_attrs returned "retval.rc2 = 268460388". (AMQ4580), msgID = AMQ4580, rc = 50014, reason = 268460388, severity = 30
result = true
---} TreeNode.testAttribute (rc=OK)
---{ TreeNode.testAttribute
-----{ QueueManagerTreeNode.toString
-----} QueueManagerTreeNode.toString (rc=OK)
testing object 'TEST.QM'
checking attribute 'OamTreeNode'
-----------{ NativeCalls.getAttrs
------------{ amqjxcoa.wmqGetAttrs
qmgr:2A7B32C8, stanza:2A7B32C4, version:1
for value 'true'
QMgrName('TEST.QM')
-----{ TreeNode.getMQQmgrExtObject
StanzaName('QMErrorLog')
testing object 'TEST.QM'
Full QM.INI filename: SOFTWARE\IBM\MQSeries\CurrentVersion\Configuration\QueueManager\TEST!QM, Multi-Instance: FALSE
--------------} xcsGetIniFilename (rc=OK)
--------------{ xcsGetIniAttrs
---------------{ xcsBrowseIniCallback
FileType = (1)
----------------{ xcsBrowseRegistryCallback
xcsBrowseRegistryCallback
-----------------{ xusAddStanzaLineList
------------------{ xcsGetMemFn
checking attribute 'PluginEnabled'
component:24 function:15 length:2080 options:0 cbmindex:0 *pointer:24DDB540
------------------} xcsGetMemFn (rc=OK)
for value 'com.ibm.mq.explorer.oam'
RetCode (OK)
-----------------} xusAddStanzaLineList (rc=OK)
-----------------{ xusAddStanzaLineList
------------------{ xcsGetMemFn
-----{ UiPlugin.isPluginEnabled
component:24 function:15 length:2080 options:0 cbmindex:1 *pointer:24DDDFE8
------------------} xcsGetMemFn (rc=OK)
RetCode (OK)
-----------------} xusAddStanzaLineList (rc=OK)
testing plugin_id: com.ibm.mq.explorer.oam
-----------------{ xurGetSpecificRegStanza
-------{ PluginRegistrationManager.isPluginEnabled
Couldn't open key (QMErrorLog) result 2: The system cannot find the file specified.
MQ version 7.0.1.9
jdk1.8.0_181-i586
com.ibm.mq*jar Version
Specification -version : 6.0.2.1
Implementation-Version :6.0.2.1 -j600-201-070305
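As far as I know, Reason 2397 is documented as MQRC_JSSE_ERROR, meaning the failure surfaced in the Java JSSE layer, so it can help to check what the client JRE itself thinks of the cipher suite before digging further into MQ traces. The following is only a diagnostic sketch, not a confirmed cause: it assumes the suite name SSL_RSA_WITH_RC4_128_SHA from the question, and the class name CipherSuiteCheck is made up for illustration. Run it with the same JRE the failing client uses.

```java
import java.security.NoSuchAlgorithmException;
import java.security.Security;
import java.util.Arrays;
import javax.net.ssl.SSLContext;

// Hypothetical helper: reports whether this JRE supports and enables a cipher suite.
public class CipherSuiteCheck {

    // True if the given suite name appears in the array of suite names.
    static boolean contains(String[] suites, String name) {
        return Arrays.asList(suites).contains(name);
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Suite name taken from the question; pass another name as the first argument.
        String suite = args.length > 0 ? args[0] : "SSL_RSA_WITH_RC4_128_SHA";
        SSLContext ctx = SSLContext.getDefault();

        System.out.println("java.version       = " + System.getProperty("java.version"));
        System.out.println("supported          = "
                + contains(ctx.getSupportedSSLParameters().getCipherSuites(), suite));
        System.out.println("enabled by default = "
                + contains(ctx.getDefaultSSLParameters().getCipherSuites(), suite));
        // Security property listing algorithms the JRE refuses to negotiate (may be null).
        System.out.println("jdk.tls.disabledAlgorithms = "
                + Security.getProperty("jdk.tls.disabledAlgorithms"));
    }
}
```

If the suite is reported as not enabled on the failing client but enabled on a working one, a JRE update on that machine would be worth investigating; the -Djavax.net.debug=all output mentioned above should then show the handshake being rejected.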

IBM MQ test returns MQJE001: Completion Code '2', Reason '2035'

I have a web app that sends messages to a queue. It is deployed on WebSphere Application Server and works very well.
I am trying to build a lightweight environment for automated tests, but when I try to send a message to the queue from a test, it returns MQJE001: Completion Code '2', Reason '2035'.
I thought the problem was in the CHLAUTH rules, but it seems that I have all the rights.
C:/> dspmqaut -m M00.EDOGO -n OEP.FROM.GW_SBAST.DLV -t q -p out-bychek-ao
Entity out-bychek-ao has the following authorizations for object OEP.FROM.GW_SBAST.DLV:
get
browse
put
inq
set
crt
dlt
chg
dsp
passid
passall
setid
setall
clr
Error from the logs:
AMQ8075: Authorization failed because the SID for entity 'out-bychek-a' cannot
be obtained.
EXPLANATION:
The Object Authority Manager was unable to obtain a SID for the specified
entity. This could be because the local machine is not in the domain to locate
the entity, or because the entity does not exist.
ACTION:
Ensure that the entity is valid, and that all necessary domain controllers are
available. This might mean creating the entity on the local machine.
----- amqzfubn.c : 2252 -------------------------------------------------------
7/9/2018 15:39:57 - Process(2028.3) User(MUSR_MQADMIN) Program(amqrmppa.exe)
Host(SBT-ORSEDG-204) Installation(Installation1)
VRMF(7.5.0.4) QMgr(M00.EDOGO)
AMQ9557: Queue Manager User ID initialization failed.
EXPLANATION:
The call to initialize the User ID failed with CompCode 2 and Reason 2035.
ACTION:
Correct the error and try again.
----- cmqxrsrv.c : 1975 -------------------------------------------------------
7/9/2018 15:39:57 - Process(2028.3) User(MUSR_MQADMIN) Program(amqrmppa.exe)
Host(SBT-ORSEDG-204) Installation(Installation1)
VRMF(7.5.0.4) QMgr(M00.EDOGO)
AMQ9999: Channel 'SC.EDOGO' to host '10.82.38.188' ended abnormally.
EXPLANATION:
The channel program running under process ID 2028(11564) for channel 'SC.EDOGO'
ended abnormally. The host name is '10.82.38.188'; in some cases the host name
cannot be determined and so is shown as '????'.
ACTION:
Look at previous error messages for the channel program in the error logs to
determine the cause of the failure. Note that this message can be excluded
completely or suppressed by tuning the "ExcludeMessage" or "SuppressMessage"
attributes under the "QMErrorLog" stanza in qm.ini. Further information can be
found in the System Administration Guide.
----- amqrmrsa.c : 909 --------------------------------------------------------
Notice that in AMQ8075 ("Authorization failed because the SID for entity 'out-bychek-a' cannot be obtained") my account name has lost its last letter. Is that normal?
And this:
DISPLAY CHLAUTH('SYSTEM.DEF.SVRCONN') MATCH(RUNCHECK) ALL ADDRESS('127.0.0.1') CLNTUSER('out-bychek-ao')
7 : DISPLAY CHLAUTH('SYSTEM.DEF.SVRCONN') MATCH(RUNCHECK) ALL ADDRESS('127.0.0.1') CLNTUSER('out-bychek-ao')
AMQ8898: Display channel authentication record details - currently disabled.
CHLAUTH(SYSTEM.*) TYPE(ADDRESSMAP)
DESCR(Default rule to disable all SYSTEM channels)
CUSTOM( ) ADDRESS(*)
USERSRC(NOACCESS) WARN(NO)
ALTDATE(2016-11-14) ALTTIME(17.33.34)
dmpmqaut -m M00.EDOGO -n OEP.FROM.GW_SBAST.DLV -t q -p out-bychek-ao -e
profile : OEP.FROM.GW_SBAST.DLV
object type: queue
entity : out-bychek-ao#alpha
entity type: principal
authority : allmqi dlt chg dsp clr
- - - - - - - - -
profile : CLASS
object type: queue
entity : out-bychek-ao#alpha
entity type: principal
authority : clt

DSE: Unable to load data with sstableloader from 4.8.9 to 5.0.2

I have 5 GB of data in DSE 4.8.9 and am trying to load the same data into DSE 5.0.2. The command I use is the following:
root#dse:/mnt/cassandra/data$ sstableloader -d 10.0.2.91 /mnt/cassandra/data/my-keyspace/my-table-0b168ba1637111e6b40131c603254a9b/
This gives me the following exception:
DEBUG 15:27:12,850 Using framed transport.
DEBUG 15:27:12,850 Opening framed transport to: 10.0.2.91:9160
DEBUG 15:27:12,850 Using thriftFramedTransportSize size of 16777216
DEBUG 15:27:12,851 Framed transport opened successfully to: 10.0.2.91:9160
Could not retrieve endpoint ranges:
InvalidRequestException(why:unconfigured table schema_columnfamilies)
java.lang.RuntimeException: Could not retrieve endpoint ranges:
at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:342)
at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:156)
at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:109)
Caused by: InvalidRequestException(why:unconfigured table schema_columnfamilies)
at org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result$execute_cql3_query_resultStandardScheme.read(Cassandra.java:50297)
at org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result$execute_cql3_query_resultStandardScheme.read(Cassandra.java:50274)
at org.apache.cassandra.thrift.Cassandra$execute_cql3_query_result.read(Cassandra.java:50189)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
at org.apache.cassandra.thrift.Cassandra$Client.recv_execute_cql3_query(Cassandra.java:1734)
at org.apache.cassandra.thrift.Cassandra$Client.execute_cql3_query(Cassandra.java:1719)
at org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:321)
... 2 more
Thoughts?
For scenarios where you have few nodes and not much data, you can follow these steps for a cluster migration (ensure the clusters are at most 1 major release apart):
1) Create the schema in the new cluster.
2) Move both nodes' data to each new node (into the new cfid tables).
3) Run nodetool refresh to pick up the data.
4) Run nodetool cleanup to clear out the extra data.
5) If the old cluster was on a previous major version, run an sstable upgrade on the new cluster.

Java code to get service start time

Can anyone help me with Java code to get the start time of a Windows service, the way Process Explorer shows it?
(Screenshots: enabling the service start time in Process Explorer.)
This is highly Windows-specific, so there is nothing built in to Java or its libraries for this. One possible approach is to use two external commands, sc and wmic to extract this information.
Use sc to get the process ID of the service you're interested in, for example for service W32Time:
C:\>sc queryex W32Time
SERVICE_NAME: W32Time
TYPE : 20 WIN32_SHARE_PROCESS
STATE : 4 RUNNING
(STOPPABLE, NOT_PAUSABLE, ACCEPTS_SHUTDOWN)
WIN32_EXIT_CODE : 0 (0x0)
SERVICE_EXIT_CODE : 0 (0x0)
CHECKPOINT : 0x0
WAIT_HINT : 0x0
PID : 1072
FLAGS :
Parse out the PID value (1072) and then do
C:\Users\jim>wmic process where processid="1072"
Caption CommandLine CreationClassName CreationDate CSCreationClassName CSName Description Execu
tablePath ExecutionState Handle HandleCount InstallDate KernelModeTime MaximumWorkingSetSize MinimumWorkingSetSiz
e Name OSCreationClassName OSName OtherO
perationCount OtherTransferCount PageFaults PageFileUsage ParentProcessId PeakPageFileUsage PeakVirtualSize PeakW
orkingSetSize Priority PrivatePageCount ProcessId QuotaNonPagedPoolUsage QuotaPagedPoolUsage QuotaPeakNonPagedPool
Usage QuotaPeakPagedPoolUsage ReadOperationCount ReadTransferCount SessionId Status TerminationDate ThreadCount
UserModeTime VirtualSize WindowsVersion WorkingSetSize WriteOperationCount WriteTransferCount
svchost.exe Win32_Process 20160709170336.990827-420 Win32_ComputerSystem HOME svchost.exe
1072 765 21060135
svchost.exe Win32_OperatingSystem Microsoft Windows 7 Professional |C:\Windows|\Device\Harddisk0\Partition2 66053
3433281 18371 17072 828 17616 142090240 28740
8 17481728 1072 46 185 51
232 240 9800 0 24
11076071 117727232 6.1.7601 28708864 6 820
Buried in that mess is the CreationDate field (value 20160709170336.990827-420) which is what you want. The -420 appears to be a timezone offset in minutes.
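Once you have that CreationDate string, it still needs converting into something Java can work with. Here is a minimal sketch, assuming the value always has the fixed 25-character layout seen above (yyyyMMddHHmmss, a dot, six microsecond digits, then a signed three-digit UTC offset in minutes). The class name CimDateTime is made up for illustration, and values containing wildcard characters are not handled.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Hypothetical helper: converts a WMI CreationDate string to an OffsetDateTime.
public class CimDateTime {

    static OffsetDateTime parse(String cim) {
        String s = cim.trim();
        if (s.length() != 25 || s.charAt(14) != '.') {
            throw new IllegalArgumentException("Unexpected CreationDate format: " + cim);
        }
        int year = Integer.parseInt(s.substring(0, 4));
        int month = Integer.parseInt(s.substring(4, 6));
        int day = Integer.parseInt(s.substring(6, 8));
        int hour = Integer.parseInt(s.substring(8, 10));
        int minute = Integer.parseInt(s.substring(10, 12));
        int second = Integer.parseInt(s.substring(12, 14));
        int micros = Integer.parseInt(s.substring(15, 21));
        // Signed offset from UTC in minutes, e.g. "-420" means UTC-07:00.
        int offsetMinutes = Integer.parseInt(s.substring(21));

        LocalDateTime local =
                LocalDateTime.of(year, month, day, hour, minute, second, micros * 1000);
        return local.atOffset(ZoneOffset.ofTotalSeconds(offsetMinutes * 60));
    }

    public static void main(String[] args) {
        // Value copied from the wmic output above.
        System.out.println(parse("20160709170336.990827-420"));
        // prints 2016-07-09T17:03:36.990827-07:00
    }
}
```

Plain substring parsing is used rather than a DateTimeFormatter pattern because the offset is expressed in minutes, which the standard offset patterns do not cover.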
You can implement a class that runs a Windows command to query this information. It can be done along these lines:
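// Requires: import java.io.*;
Runtime rt = Runtime.getRuntime();
try {
    // Replace with the real command, e.g. the sc or wmic calls shown above
    Process p = rt.exec("Your command");
    BufferedReader out = new BufferedReader(new InputStreamReader(p.getInputStream()));
    for (String line; (line = out.readLine()) != null; ) {
        System.out.println(line); // parse each output line here
    }
} catch (IOException e) {
    e.printStackTrace();
}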
Using wmic, you'll be able to find the start time you want, as stated above.
Unfortunately, sc on its own won't provide this kind of information. Another way (not sure if it would work, though) is to query the Windows Event Viewer for the logged event of a service starting (I think its event ID is 902). After getting that information, you can parse the string to find the details of the service you're interested in.
One word of warning, though: if you're planning to deploy your app on older Windows installations, be careful, as old installations (XP and the like) may not always contain a valid WMIC installation, meaning that the command would not be available.
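To tie the sc and wmic steps together from Java, something like the following could work. Treat it as a sketch under assumptions: the class and method names are hypothetical, it assumes sc prints a "PID : n" line as in the sample output above, and it asks wmic for just the one field with "get CreationDate /value", which prints it as CreationDate=<value>. It is Windows-only. Also keep in mind that for a WIN32_SHARE_PROCESS service the value is the start time of the hosting svchost.exe process, which is not necessarily when the service itself started.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds a service's process ID via sc, then its CreationDate via wmic.
public class ServiceStartTime {
    private static final Pattern PID_LINE = Pattern.compile("\\bPID\\s*:\\s*(\\d+)");
    private static final Pattern CREATION_DATE = Pattern.compile("CreationDate=(\\S+)");

    // Extracts the PID from the text printed by "sc queryex <service>".
    static int parsePid(String scOutput) {
        Matcher m = PID_LINE.matcher(scOutput);
        if (!m.find()) {
            throw new IllegalArgumentException("No PID line found in sc output");
        }
        return Integer.parseInt(m.group(1));
    }

    // Extracts the raw value from the text printed by "wmic ... get CreationDate /value".
    static String parseCreationDate(String wmicOutput) {
        Matcher m = CREATION_DATE.matcher(wmicOutput);
        if (!m.find()) {
            throw new IllegalArgumentException("No CreationDate found in wmic output");
        }
        return m.group(1);
    }

    // Runs a command and returns its combined stdout and stderr as one string.
    static String run(String... command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        p.waitFor();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String service = args.length > 0 ? args[0] : "W32Time";
        int pid = parsePid(run("sc", "queryex", service));
        String wmic = run("wmic", "process", "where", "processid=" + pid,
                "get", "CreationDate", "/value");
        System.out.println(service + " (PID " + pid + ") started at " + parseCreationDate(wmic));
    }
}
```

The two parse methods are kept separate from the process handling so they can be checked against captured output on any machine, without needing Windows.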
