Kubernetes "fatal alert: protocol_version" error when creating a Deployment - java

We're running a Kubernetes cluster on Google Cloud Platform with a Deployment of 8 Hazelcast-based replicas. This has been running fine for over a month, but recently we started receiving the error below whenever we try to start our Deployment (non-relevant stack frames omitted):
2016-07-15 12:58:02,117 [My-hazelcast.my-deployment-368708980-8v7ig # my-deployment-368708980-8v7ig] ERROR - [10.68.5.3]:5701 [MyProject] [3.6.2] Error executing: GET at: https://kubernetes.default.svc/api/v1/namespaces/default/endpoints/my-service. Cause: Received fatal alert: protocol_version
io.fabric8.kubernetes.client.KubernetesClientException: Error executing: GET at: https://kubernetes.default.svc/api/v1/namespaces/default/endpoints/my-service. Cause: Received fatal alert: protocol_version
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestException(OperationSupport.java:272) ~[kubernetes-client-1.3.66.jar:na]
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:205) ~[kubernetes-client-1.3.66.jar:na]
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:196) ~[kubernetes-client-1.3.66.jar:na]
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:483) ~[kubernetes-client-1.3.66.jar:na]
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:108) ~[kubernetes-client-1.3.66.jar:na]
at com.noctarius.hazelcast.kubernetes.ServiceEndpointResolver.resolve(ServiceEndpointResolver.java:62) ~[hazelcast-kubernetes-discovery-0.9.2.jar:na]
at com.noctarius.hazelcast.kubernetes.HazelcastKubernetesDiscoveryStrategy.discoverNodes(HazelcastKubernetesDiscoveryStrategy.java:74) ~[hazelcast-kubernetes-discovery-0.9.2.jar:na]
at com.hazelcast.spi.discovery.impl.DefaultDiscoveryService.discoverNodes(DefaultDiscoveryService.java:74) ~[hazelcast-all-3.6.2.jar:3.6.2]
....
Caused by: javax.net.ssl.SSLException: Received fatal alert: protocol_version
at sun.security.ssl.Alerts.getSSLException(Alerts.java:208) ~[na:1.7.0_95]
at sun.security.ssl.Alerts.getSSLException(Alerts.java:154) ~[na:1.7.0_95]
at sun.security.ssl.SSLSocketImpl.recvAlert(SSLSocketImpl.java:1991) ~[na:1.7.0_95]
...
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:203) ~[kubernetes-client-1.3.66.jar:na]
... 18 common frames omitted
When I google this error, I get a lot of hits about TLS protocol version mismatches. Apparently, Java 8 defaults to a newer TLS protocol version (TLS 1.2) than Java 7 and 6 (TLS 1.0). However, all of our containers run the same Docker image (based off of the hazelcast/hazelcast:3.6.2 image), which uses Java 7, so there should be no protocol version mismatch (and this layer of our image has not changed).
We've tried reverting all of our recent changes in an attempt to resolve this error, to no avail. Frankly, nobody on our team has recently changed anything related to SSL or the Hazelcast Kubernetes discovery mechanism. We did recently update our Google Cloud SDK components (gcloud components update) at the urging of the Cloud SDK tools ("Updates are available for some Cloud SDK components."). We're now running Google Cloud SDK version 117.0.0, but I don't see any breaking changes related to SSL or TLS in the release notes.
Why would we suddenly start seeing this "fatal alert: protocol_version" error message in our kubernetes pods, and how can I resolve it?

The initial Google searches indicating this was a TLS version error (a TLS 1.0 vs 1.2 incompatibility) turned out to be useful. This answer to a question about a similar SSLException protocol_version error is what pointed me in the right direction.
I got a test container running and, using kubectl exec my-test-pod -i -t -- /bin/bash -il to open an interactive bash shell in the container, determined that the endpoint the Hazelcast discovery service queries could NOT be reached over TLS 1.0, but could be over TLS 1.2:
/opt/hazelcast# curl -k --tlsv1.0 https://kubernetes.default.svc/api/v1/namespaces/default/endpoints/my-service
curl: (35) error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version
/opt/hazelcast# curl -k --tlsv1.2 https://kubernetes.default.svc/api/v1/namespaces/default/endpoints/my-service
Unauthorized # <-- Unauthorized is expected, as I didn't specify a user/passwd.
I am still not sure what exactly changed: possibly a layer of a public Docker image we use, possibly something within the Google cloud service (Java 7 is end of life, after all), and the fine folks at Hazelcast suggested perhaps the REST API had been updated. But evidently something changed that now causes the Kubernetes API endpoint used by the discovery service to require TLS 1.2 from clients.
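Before changing anything, a quick way to see which TLS versions a given JVM will actually offer is to print the protocols of its default SSLContext. This is a small diagnostic sketch (not from the original troubleshooting; the class name is arbitrary):
import javax.net.ssl.SSLContext;
import java.util.Arrays;

// Prints which TLS versions the running JVM enables by default and which it
// supports at all. On a stock Java 7 the default client protocols typically
// stop at TLSv1, while Java 8 enables TLSv1.2 out of the box.
public class TlsCheck {
    public static void main(String[] args) throws Exception {
        SSLContext ctx = SSLContext.getDefault();
        System.out.println("Default protocols:   "
                + Arrays.toString(ctx.getDefaultSSLParameters().getProtocols()));
        System.out.println("Supported protocols: "
                + Arrays.toString(ctx.getSupportedSSLParameters().getProtocols()));
    }
}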
The solution was to download the Hazelcast Docker image we were using, and tweak it to use Java 8 instead of Java 7, and then rebuild the image in our own development sandbox:
$ pwd
/home/jdoe/devel/hazelcast-docker-3.6.2/hazelcast-oss
$ head -n3 Dockerfile
FROM java:8
ENV HZ_VERSION 3.6.2
ENV HZ_HOME /opt/hazelcast/
Voila! Our Deployment is running again.
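For completeness: if rebuilding the image on Java 8 is not an option, Java 7 does ship TLSv1.2 support, it is just not enabled by default on the client side. The sketch below is an untested alternative, not something that was tried here, and it assumes the HTTP client used by the fabric8/Hazelcast discovery code honors the JVM-wide default SSLContext:
import javax.net.ssl.SSLContext;

// Untested sketch: force a TLSv1.2-capable SSLContext as the JVM default on Java 7.
// Libraries that build their sockets from SSLContext.getDefault() will then offer
// TLSv1.2; libraries that construct their own SSLContext are not affected.
public class ForceTls12 {
    public static void main(String[] args) throws Exception {
        SSLContext ctx = SSLContext.getInstance("TLSv1.2"); // available, but not default, on Java 7
        ctx.init(null, null, null); // default key managers, trust managers and secure random
        SSLContext.setDefault(ctx);
        // ...then bootstrap the Hazelcast member as usual
    }
}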

Related

SimpleEmail works within IntelliJ but fails from command line

Using org.apache.commons.mail.SimpleEmail to send an email from within a background application. This has been working for 8 months with no problems. On January 1st it started failing. The application is Scala (2.12.8) and Java (1.8) running on a Mac (macOS 10.15.7).
Sending the email to smtp.googlemail.com port 465 (also tried smtp.gmail.com).
Using IntelliJ as the IDE. If the application runs within IntelliJ, it still works perfectly, but if you create a Jar and run from the command line it fails every time. Running from a Jar using 'sudo' also fails.
So did some setting change at Google as of Jan 1? Why would it still work within the IDE - is there a context or certificate present within IntelliJ? A certificate needed for SSL?
Appreciate any suggestions!
-------- Addendum - all parameters to the send() method ---------
Heading: FAILURE --- my addition
ID: (None) --- my addition
To: j.crowley#computer.org
Subject: Backup for JDCMacBook was 150.1M
Message: For JDCMacBook, Drive: USBExtA, Backup 2021-01-08-103310 compared to 2021-01-08-053055 Adds: 3.3M Changes: 142.3M Deletes: 4.4M
SendIfPossible: true
To Host: smtp.googlemail.com
To Port: 465
Auth User: tmviewer.smtp#gmail.com
Auth Pwd: .... redacted ....
Set SSL: true
------- Stack trace -----------
Exception: org.apache.commons.mail.EmailException: Sending the email to the following server failed : smtp.googlemail.com:465
10:44:11.749 0:00.002 ERROR: org.apache.commons.mail.Email.sendMimeMessage(Email.java:1469)
10:44:11.749 ERROR: org.apache.commons.mail.Email.send(Email.java:1496)
10:44:11.749 ERROR: jdctm.Utils$.sendEMail(Utils.scala:547)
10:44:11.750 0:00.001 ERROR: jdctm.NotifyInfo.notify(Cache.scala:1036)
10:44:11.750 ERROR: jdctm.Cache$.notify(Cache.scala:722)
10:44:11.750 ERROR: jdctm.ExecuteTMUtil.run(Cache.scala:952)
10:44:11.750 ERROR: java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
10:44:11.751 0:00.001 ERROR: java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
10:44:11.751 ERROR: java.base/java.lang.Thread.run(Thread.java:831)
IntelliJ is using Java 8; I had upgraded to Java 16 (early access) for some tests, and that is what is used on the command line. I turned on SimpleEmail setDebug(true): it is failing during the SSL handshake with "No appropriate protocol (protocol is disabled or cipher suites are inappropriate)", so something changed in the protocols/cipher suites between Java 8 and 16 (ea). It WORKs under Java 15.0.1 and FAILs under jdk16.ea.30.2130. I will revert to Java 8 for now and retest once Java 16 is released.
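For anyone who cannot simply stay on an older JDK, a commonly suggested workaround is to pin the SMTP SSL session to TLSv1.2 via the JavaMail mail.smtp.ssl.protocols property. The sketch below is hedged: it assumes a commons-email version that has setSSLOnConnect and that builds its mail Session on top of the system properties, and the addresses and credentials are placeholders:
import org.apache.commons.mail.SimpleEmail;

public class PinnedTlsMail {
    public static void main(String[] args) throws Exception {
        // JavaMail property: restrict the SSL socket to TLSv1.2 so the handshake
        // does not depend on protocols a newer JDK may have disabled.
        System.setProperty("mail.smtp.ssl.protocols", "TLSv1.2");

        SimpleEmail email = new SimpleEmail();
        email.setHostName("smtp.googlemail.com");
        email.setSslSmtpPort("465");
        email.setSSLOnConnect(true);
        email.setAuthentication("someone@gmail.com", "app-password"); // placeholders
        email.setFrom("someone@gmail.com");
        email.addTo("recipient@example.org");
        email.setSubject("Backup report");
        email.setMsg("test message");
        email.send();
    }
}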

ONOS Service Start FrameworkEvent Error and GUI not ready yet

I have installed ONOS 2.3.0 on an Ubuntu Server 18.04.4 virtual machine running on Hyper-V, following these steps (taken from here and here):
Firstly, I have installed Java 11 (openjdk-11-jdk and openjdk-11-jre), maven and curl;
then I have downloaded ONOS 2.3.0 from here and extracted it with tar xzf onos-2.3.0.tar.gz;
lastly, I exported the required environment variable export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64.
When I try to launch it using the command ./onos-service start (tested both from a normal user and sudo), it gives me the following errors:
21:54:57.869 ERROR [onos-core-net] FrameworkEvent ERROR - org.onosproject.onos-core-net
org.osgi.framework.ServiceException: Service factory returned null. (Component: org.onosproject.store.cfg.DistributedComponentConfigStore (6))
at org.apache.felix.framework.ServiceRegistrationImpl.getFactoryUnchecked(ServiceRegistrationImpl.java:380)
at org.apache.felix.framework.ServiceRegistrationImpl.getService(ServiceRegistrationImpl.java:247) org.apache.felix.framework.EventDispatcher.fireEventImmediately(EventDispatcher.java:834)
[...]
at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1373)
at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:308) at java.base/java.lang.Thread.run(Thread.java:834)
[...]
21:54:57.881 WARN [NettyMessagingService] Failed to bind TCP server to port 0.0.0.0:9876 due to {}
java.net.BindException: Address already in use
at java.base/sun.nio.ch.Net.bind0(Native Method)
[...]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:906)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.base/java.lang.Thread.run(Thread.java:834)
21:54:57.899 ERROR [onos-core-primitives] bundle org.onosproject.onos-core-primitives:2.3.0 (192)[org.onosproject.store.atomix.impl.AtomixManager(115)] : The activate method has thrown an exception
java.util.concurrent.CompletionException: java.net.BindException: Address already in use
at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
[...]
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.net.BindException: Address already in use
at java.base/sun.nio.ch.Net.bind0(Native Method)
at java.base/sun.nio.ch.Net.bind(Net.java:455)
at java.base/sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:227)
at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:132)
at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:563)
... 12 more
Connecting to the Karaf instance with ssh -p 8101 karaf@localhost confirms that ONOS is working (at least partially), and the web interface login page loads, but after login it hangs, saying ONOS GUI not ready yet... please stand by....
Does anyone have an idea about how to solve this problem?
Thanks in advance.
UPDATE 19-03-2020: I have prepared another virtual machine following exactly the same steps on another PC using VirtualBox, with fewer virtual resources assigned, and it works. Honestly, I don't understand why it fails on the Hyper-V configuration.
UPDATE 20-03-2020: I have reinstalled Ubuntu, configuring the network directly from the installer, and installed the prerequisites and dependencies of ONOS offline (downloaded on another machine via sudo apt install --download-only <package-name>), and it worked. I think the problem was related to something in the network configuration that didn't let ONOS recognize its own process on port 9876 (see the WARN above).
Hope this can be helpful for others.
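As a side note, the "Address already in use" warning on port 9876 can be checked in isolation before starting ONOS. The helper below is hypothetical and not part of the original post:
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Tries to bind the same address ONOS wants (0.0.0.0:9876). If the bind fails,
// some other process (or a stale ONOS instance) already owns the port.
public class PortCheck {
    public static void main(String[] args) {
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 9876;
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(new InetSocketAddress("0.0.0.0", port));
            System.out.println("Port " + port + " is free.");
        } catch (IOException e) {
            System.out.println("Port " + port + " is already in use: " + e.getMessage());
        }
    }
}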
I had this problem. ONOS is locked to the IP address it sees at first install. I grepped for my IP in the ~/onos folder and was able to reset the binding by deleting the following files that contained the IP. They were rebuilt at the next ONOS run.
grep -rl 192.168. --exclude=*.log ~/onos
rm ~/onos/apache-karaf-4.2.9/data/db/partitions/data/partitions/1/raft-partition-1.conf
rm ~/onos/apache-karaf-4.2.9/data/db/partitions/data/partitions/1/raft-partition-1.meta
rm ~/onos/apache-karaf-4.2.9/data/db/partitions/data/partitions/1/.raft-partition-1.lock
rm ~/onos/apache-karaf-4.2.9/data/db/partitions/system/partitions/1/.system-partition-1.lock
rm ~/onos/apache-karaf-4.2.9/data/db/partitions/system/partitions/1/system-partition-1.conf
rm ~/onos/apache-karaf-4.2.9/data/db/partitions/system/partitions/1/system-partition-1.meta
I have faced this issue after changing the IP address of the controller (the host machine).
The quick way to solve it is to set the controller's IP address back to what it was (static),
then reboot your machine.
After opening the URL (YourIP:8181/onos/ui/index.html),
Karaf will ask you for login credentials; use username karaf / password karaf,
then on ONOS's login page, use onos/rocks as the credentials.
Good luck.

"UNAVAILABLE" gRPC failure from android client to python server

I've been struggling all day with the following issue with gRPC on Android when trying to make an RPC call to a python RPC server running on my machine (on the same network).
My Android gRPC client compiles and runs; however, I get the following error message on the client (and silence on the Python server):
io.grpc.StatusRuntimeException: UNAVAILABLE
Caused by: java.io.IOException: PROTOCOL_ERROR invalid settings id: -509
The full stack-trace reads:
05-06 18:39:01.133 5018-5302/com.example.android.cimi I/SyncAdapter: Failed... :
io.grpc.StatusRuntimeException: UNAVAILABLE
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:227)
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:208)
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:141)
at com.example.android.cimi.CimiSyncerGrpc$CimiSyncerBlockingStub.getHouseholdTimestamps(CimiSyncerGrpc.java:209)
at com.example.android.cimi.SyncAdapter.onPerformSync(SyncAdapter.java:130)
at android.content.AbstractThreadedSyncAdapter$SyncThread.run(AbstractThreadedSyncAdapter.java:259)
Caused by: java.io.IOException: PROTOCOL_ERROR invalid settings id: -509
at io.grpc.okhttp.internal.framed.Http2.ioException(Http2.java:589)
at io.grpc.okhttp.internal.framed.Http2.access$200(Http2.java:47)
at io.grpc.okhttp.internal.framed.Http2$Reader.readSettings(Http2.java:304)
at io.grpc.okhttp.internal.framed.Http2$Reader.nextFrame(Http2.java:162)
at io.grpc.okhttp.OkHttpClientTransport$ClientFrameHandler.run(OkHttpClientTransport.java:868)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1112)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:587)
at java.lang.Thread.run(Thread.java:818)
05-06 18:39:01.139 5018-5018/com.example.android.cimi I/SyncService: Service destroyed
When shutting off the server, I get a different error, so something else must be going on.
I'm using the following versions:
io.grpc:protoc-gen-grpc-java:1.3.0
com.google.protobuf:protoc:3.0.0
Things I've tried:
Downgrading to grpc 1.0.0
Downgrading Android Studio
Tried Python 2 & 3
Connecting from a python client works!
How else could I debug this? Is there any way to get more meaningful exceptions? Is there a way to make the server more verbose?
Any help would be greatly appreciated!
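One way to get more detail out of the Java client (not something mentioned in the original post) is to raise gRPC's java.util.logging level; the OkHttp transport logs extra detail about the frames it reads and writes at finer levels, although exactly how much depends on the gRPC version:
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical debugging helper: call GrpcDebugLogging.enable() once before
// creating the channel so the io.grpc loggers emit FINE/FINER records.
public final class GrpcDebugLogging {
    // Keep a strong reference so the logger configuration is not garbage collected.
    private static final Logger GRPC_LOGGER = Logger.getLogger("io.grpc");

    public static void enable() {
        GRPC_LOGGER.setLevel(Level.ALL);
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.ALL);
        GRPC_LOGGER.addHandler(handler);
    }

    private GrpcDebugLogging() {
    }
}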
I've narrowed it down to a problem with the python gRPC server by confirming that the call works when using a Java server instead.
It turns out it must have been related to recent gRPC/protobuf Python packages (they were just installed via pip install, which must have grabbed the most recent ones):
The following sequence of commands for downgrading the gRPC installation solved the issue:
cd /usr/lib/python2.7/site-packages
rm -rf packaging*
rm -rf pyparsing*
pip2 install protobuf==3.0.0
pip2 install grpcio==1.0.0
pip2 install grpcio-tools==1.0.0

getBalance.sh on Amazon Mechanical Turk (mturk) Command Line Tools (CLT) returns an error

Could someone help me figure out my mistake? Thank you in advance :) I'm trying to set up the Command Line Tool (CLT) on my Mac OS X Yosemite and I'm getting error messages.
My problem seems similar to the one in the link below, but not identical; I have changed "http" to "https" in the mturk.properties file after installing the CLT.
getBalance in Amazon Turk gives error
CODE: This is what I entered in Terminal (initially thinking that my problem was Java location):
$ export MTURK_CMD_HOME=/Applications/aws-mturk-clt-1.3.1
$ java -version
java version "1.8.0_51"
Java(TM) SE Runtime Environment (build 1.8.0_51-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.51-b03, mixed mode)
$ which java
/usr/bin/java
$ export JAVA_HOME=/usr
$ cd /Applications/aws-mturk-clt-1.3.1/bin/
$ ./getBalance.sh
ERROR: This is an excerpt of the error message I received
Unable to find a $JAVA_HOME at "/usr", continuing with system-provided Java...
I/O exception (javax.net.ssl.SSLPeerUnverifiedException) caught when processing request: HTTPS hostname invalid: expected '176.32.98.23', received 'mechanicalturk.amazonaws.com'
Retrying request
I/O exception (javax.net.ssl.SSLPeerUnverifiedException) caught when processing request: HTTPS hostname invalid: expected '176.32.98.23', received 'mechanicalturk.amazonaws.com'
Retrying request
I/O exception (javax.net.ssl.SSLPeerUnverifiedException) caught when processing request: HTTPS hostname invalid: expected '176.32.98.23', received 'mechanicalturk.amazonaws.com'
Retrying request
An error occurred while fetching your balance: javax.net.ssl.SSLPeerUnverifiedException: HTTPS hostname invalid: expected '176.32.98.23', received 'mechanicalturk.amazonaws.com'
com.amazonaws.mturk.service.exception.InternalServiceException: javax.net.ssl.SSLPeerUnverifiedException: HTTPS hostname invalid: expected '176.32.98.23', received 'mechanicalturk.amazonaws.com'
at com.amazonaws.mturk.service.axis.AWSService.executeRequestMessage(AWSService.java:243)
at com.amazonaws.mturk.filter.FinalFilter.execute(FinalFilter.java:38)
at com.amazonaws.mturk.filter.Filter.passMessage(Filter.java:56)
at com.amazonaws.mturk.filter.ErrorProcessingFilter.execute(ErrorProcessingFilter.java:46)
at com.amazonaws.mturk.filter.Filter.passMessage(Filter.java:56)
at com.amazonaws.mturk.filter.RetryFilter.execute(RetryFilter.java:115)
at com.amazonaws.mturk.filter.Filter.passMessage(Filter.java:56)
at com.amazonaws.mturk.util.CLTExceptionFilter.sendMessage(CLTExceptionFilter.java:77)
at com.amazonaws.mturk.util.CLTExceptionFilter.execute(CLTExceptionFilter.java:62)
at com.amazonaws.mturk.service.axis.FilteredAWSService.executeRequests(FilteredAWSService.java:172)
at com.amazonaws.mturk.service.axis.FilteredAWSService.executeRequest(FilteredAWSService.java:152)
at com.amazonaws.mturk.service.axis.FilteredAWSService.executeRequest(FilteredAWSService.java:116)
at com.amazonaws.mturk.service.axis.RequesterServiceRaw.getAccountBalance(RequesterServiceRaw.java:1193)
at com.amazonaws.mturk.service.axis.RequesterService.getAccountBalance(RequesterService.java:922)
at com.amazonaws.mturk.cmd.GetBalance.getBalance(GetBalance.java:50)
at com.amazonaws.mturk.cmd.GetBalance.runCommand(GetBalance.java:41)
at com.amazonaws.mturk.cmd.AbstractCmd.run(AbstractCmd.java:148)
at com.amazonaws.mturk.cmd.GetBalance.main(GetBalance.java:28)
Caused by: javax.net.ssl.SSLPeerUnverifiedException: HTTPS hostname invalid: expected '176.32.98.23', received 'mechanicalturk.amazonaws.com'
at org.apache.axis.AxisFault.makeFault(AxisFault.java:101)
The error message continues a bit. So, what do you think? Thanks again for looking at this. - Soon
mturk.properties file
# -------------------
# ADVANCED PROPERTIES
# -------------------
#
# If you want to test your solution in the Amazon Mechanical Turk Developers Sandbox (http://sandbox.mturk.com)
# use the service_url defined below:
#service_url=https://mechanicalturk.sandbox.amazonaws.com/?Service=AWSMechanicalTurkRequester
# If you want to have your solution work against the Amazon Mechnical Turk Production site (http://www.mturk.com)
# use the service_url defined below:
service_url=https://mechanicalturk.amazonaws.com/?Service=AWSMechanicalTurkRequester
# The settings below should only be modified under special circumstances.
# You should not need to adjust these values.
retriable_errors=Server.ServiceUnavailable,503
retry_attempts=6
retry_delay_millis=500'
Guan, have you tried with an earlier version of the JDK (e.g., JDK 1.5)? I realize it's much older, but I'm curious whether this is related to using the CLT on JDK 1.8. Just an idea.
Additionally, it would help if we could see the mturk.properties file (please don't share your access keys or secret key) to make sure the endpoints are well-formed. Thanks!
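As a further hedged diagnostic (not something either poster ran), printing the certificate chain the production endpoint actually presents can show whether something in between, such as a proxy, is serving a certificate for an unexpected host:
import javax.net.ssl.HttpsURLConnection;
import java.net.URL;
import java.security.cert.Certificate;
import java.security.cert.X509Certificate;

// Hypothetical check: connect to the MTurk production endpoint and print the
// subject of each certificate in the chain the server sends back.
public class PeerCertCheck {
    public static void main(String[] args) throws Exception {
        URL url = new URL("https://mechanicalturk.amazonaws.com/");
        HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();
        conn.connect();
        for (Certificate cert : conn.getServerCertificates()) {
            if (cert instanceof X509Certificate) {
                System.out.println(((X509Certificate) cert).getSubjectX500Principal());
            }
        }
        conn.disconnect();
    }
}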

hivemq illegal state exception

I'm trying to develop a hivemq authentication plugin.
I have followed the hivemq guide for creating a project, and I am not doing anything in the plugin itself. I return true immediately.
mosquitto_sub -t hey
When I try to connect with mosquitto_sub using the command above, I get the error below.
INFO - Started HiveMQ 1.4.2 in 1528ms
ERROR - An unexpecteed error occured:
java.lang.IllegalStateException: illegal state during login of client mosq_sub_12248_ahmetce
at com.dcsquare.hivemq.handler.protocol.ConnectMessageHandler.logStatus(ConnectMessageHandler.java:176)
at com.dcsquare.hivemq.handler.protocol.ConnectMessageHandler.processSuccessfulLogin(ConnectMessageHandler.java:114)
This is the code I'm testing with: http://pastie.org/8555786#22-23
Has anyone had a similar error?
This is a regression in the 1.4.x HiveMQ line and is fixed in all HiveMQ versions newer than 1.4.2. The solution is to upgrade to a more recent HiveMQ version. (At the time of this writing, 1.4.3 is the most recent stable version, and hotfix versions are also available.)
