Getting "Permission denied (publickey)" when starting Hadoop cluster on AWS - java

I am getting "Permission denied (publickey)" error when starting hadoop multi node cluster in AWS. But when i do ssh to each individual slave node without starting the cluster then i am able to access them. I did all the settings correct and checked twice.Any help on what may be wrong?

The problem was that I created a new user, i.e. hduser, and then configured Hadoop in it.
I did all the setup (Hadoop configuration) under the ubuntu user (the default for EC2 Ubuntu instances) and it worked. I think it's better to use the default users on AWS instances than to create a new one and then struggle with permissions and other errors.
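For anyone hitting the same thing: the start scripts ssh into every slave as whatever user runs them, so that user's public key has to be authorized on each node. A rough sketch of the check, assuming the default ubuntu user and a slave host named slave1 (both placeholders):

# on the master, as the user that will run start-dfs.sh / start-yarn.sh
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# append the public key to the same user's authorized_keys on every slave
cat ~/.ssh/id_rsa.pub | ssh ubuntu@slave1 "cat >> ~/.ssh/authorized_keys"
# this must succeed without a password prompt before starting the cluster
ssh ubuntu@slave1 hostname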

Related

WebSphere 9.0.0 not able to connect to Db2 database from WAS admin on Linux

It had been working for the last 3 months, but for the last 3 days I have been facing this issue.
Even after creating the JNDI entry in WebSphere, when I try the test connection it gives me the following error:
java.sql.SQLNonTransientException: java.sql.SQLNonTransientException: null DSRA0010E: SQL State = 08001, Error Code = -1,639
I am also not able to restart the node agent; both ./startNode.sh and ./stopNode.sh give me the following error:
serverNode01/servers/nodeagent/server.xml file is missing
Please give me an idea of how to restart the node agent.
Thanks
The description of SQL1639N:
SQL1639N The database server was unable to perform authentication
because security-related database manager files on the server do not
have the required operating system permissions.
Explanation
The DB2 database system requires that your instance and database
directories, and the files in those directories, have a minimum level
of operating system permissions. When the instance and database
directories are created by the database manager the permissions are
accurate, and changing those permissions could cause database manager
functions to fail. The complexity of DB2 file permissions is increased
in the case of non-root installed instances and operating system-based
authentication.
This message is returned when security-related database manager
executable files do not have necessary permissions for the database
manager to perform remote connection authentication-related tasks.
There are several reasons why these security-related files might not
have the necessary permissions, including the following reasons:
- The database manager instance is a non-root installed instance and operating system-based authentication has not been enabled using the db2rfe command
- Operating system permissions of database manager files were accidentally changed
User response
Respond to this message in one of the following ways:
- If the instance is a non-root installed instance, enable operating system-based authentication using the db2rfe command.
- Reset all of the operating system permissions for the database manager binary files for this instance by running the following command as a superuser:
db2iupdt -k <instance-name>
where <instance-name> is the name of the affected instance.
Note that both the db2rfe command and the db2iupdt command require
that the database manager instance be stopped and restarted.
Are you able to connect to the database manually from some remote client (using JDBC/ODBC/CLI/DB2 CLP)?
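If not, a minimal standalone JDBC test run from the WAS host can separate a WebSphere configuration problem from the server-side permission problem described above. A hedged sketch, assuming the IBM JCC driver (db2jcc4.jar) is on the classpath; host, port, database name and credentials are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;

public class Db2ConnTest {
    public static void main(String[] args) throws Exception {
        // Type 4 JCC driver
        Class.forName("com.ibm.db2.jcc.DB2Driver");
        String url = "jdbc:db2://dbhost:50000/MYDB"; // placeholders
        try (Connection conn = DriverManager.getConnection(url, "dbuser", "dbpassword")) {
            System.out.println("Connected: " + conn.getMetaData().getDatabaseProductVersion());
        }
    }
}

If this fails with the same SQL1639N / -1639 error, the problem is on the database server (file permissions), not in WebSphere.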

Apache Flink throws UnknownHostException on cluster

I have a Flink project that is connecting to NiFi to pull data. The setup to get the data stream works just fine when running locally.
.url("http://1.2.3.4:8080/nifi")
.portName("MyPortName")
.requestBatchCount(5)
.buildConfig();
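For context, that config object is then handed to the Flink NiFi source, roughly like this (a minimal sketch of the standard flink-connector-nifi usage; the variable names are illustrative):

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.nifi.NiFiDataPacket;
import org.apache.flink.streaming.connectors.nifi.NiFiSource;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// NiFiSource pulls NiFiDataPacket records from the NiFi output port named above
DataStream<NiFiDataPacket> nifiStream = env.addSource(new NiFiSource(clientConfig));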
But when I add the .jar to the remote cluster and run the job it throws this:
java.net.UnknownHostException
at sun.nio.ch.Net.translateException(Net.java:177)
at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:127)
at org.apache.nifi.remote.client.socket.EndpointConnectionPool.establishSiteToSiteConnection(EndpointConnectionPool.java:712)
at org.apache.nifi.remote.client.socket.EndpointConnectionPool.establishSiteToSiteConnection(EndpointConnectionPool.java:685)
at org.apache.nifi.remote.client.socket.EndpointConnectionPool.getEndpointConnection(EndpointConnectionPool.java:301)
at org.apache.nifi.remote.client.socket.SocketClient.createTransaction(SocketClient.java:129)
at org.apache.flink.streaming.connectors.nifi.NiFiSource.run(NiFiSource.java:90)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:78)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:55)
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:56)
at org.apache.flink.streaming.runtime.tasks.StoppableSourceStreamTask.run(StoppableSourceStreamTask.java:39)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:272)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:655)
at java.lang.Thread.run(Thread.java:745)
The only reason I can find for an UnknownHostException is that the host name's IP can't be resolved, but I am already giving the IP. There was an issue earlier with it being unable to connect to NiFi because I have to set which IPs are allowed to access the NiFi instance. I added the AWS server as allowed and that fixed it, but obviously now I have this.
Any help is greatly appreciated!
I figured out the problem. I had my NiFi cluster and my Flink cluster in different regions. I moved the Flink cluster to the same region, used either the public or private URL for the cluster, and it works fine.

Drill not finding drillbit from zookeeper?

I am trying out the Drill sample in my project using this example:
https://github.com/vicenteg/DrillJDBCExample/blob/master/src/main/java/com/mapr/drill/DrillJDBCExample.java
I have started drillbits on all my datanodes with the same "cluster-id", and I specify "zk.connect" to point to "zookeeper1,zookeeper2,zookeeper3" in my drill-override.conf (picked up by default, I believe).
I am getting the following error:
java.lang.IllegalStateException: No DrillbitEndpoint can be found
Am I supposed to start drillbits on my zookeeper nodes too in addition to my datanodes? Or what is wrong?
My drill-override.conf is as follows:
drill.exec: {
cluster-id: "testcluster",
zk.connect: "zookeeper1:2181,zookeeper2:2181,zookeeper3:2181"
}
If you are trying to connect to a different machine where Drill is installed, then when connecting from Windows, give the IP address of the machine where Drill is running.
Important
If in drill-override.conf (on the Linux machine where Drill is running) you have written "zookeeper1" as the node name, then you should modify the "c:\Windows\System32\Drivers\etc\hosts" file on your client machine and map that IP to the same DNS name.
Example:
192.168.32.84 zookeeper1
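With that hosts entry in place, the client can connect through ZooKeeper using the same names as drill-override.conf. A hedged sketch along the lines of the linked DrillJDBCExample; the /drill/testcluster path segment is assumed from the cluster-id above, and the query just reads Drill's bundled sample data:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DrillZkTest {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.drill.jdbc.Driver");
        // zk= lists the ZooKeeper quorum; /drill/<cluster-id> selects the Drillbit cluster
        String url = "jdbc:drill:zk=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181/drill/testcluster";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT full_name FROM cp.`employee.json` LIMIT 5")) {
            while (rs.next()) {
                System.out.println(rs.getString("full_name"));
            }
        }
    }
}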

Apache Tez configuration with Hadoop

Here is what I have done in a nutshell:
STEP1: I have successfully configured Hadoop 2.6 on my laptop (single node) and ran a sample MapReduce job.
STEP2: I cloned the Tez repository, successfully built version 0.8.0, copied the jar files into HDFS, and exported the required variables. I also changed the value of mapreduce.framework.name to yarn-tez in mapred-site.xml.
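For reference, the mapred-site.xml change is just this property (a sketch; the enclosing <configuration> element and the rest of the file are omitted):

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn-tez</value>
</property>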
But when I want to run a tez orderedwordcount job, I got this error:
15/07/04 18:45:03 INFO ipc.Client: Retrying connect to server: hostname/hostIP:57339.
Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/07/04 18:45:12 INFO client.DAGClientImpl: DAG completed. FinalState=FAILED
I have checked the resource manager and it is listening on port 8030.
But it seems the client tries to connect to a random port. Is that correct?
What can I do to get it to work correctly?
It seems that it was a problem with this version (0.8.0) connecting to the resource manager. I compiled and integrated the previous stable release (0.7.0) and everything is good to go now. I hope they will figure the problem out.
From your logs it seems to be a firewall issue rather than an issue with the Tez version. And it is irrespective of Tez; even if you run Hadoop only, you can face this.
Hadoop uses multiple ports for communication with clients and between service components. To enable Hadoop communication, open the specific ports that Hadoop uses.
To open specific ports, you can set the access rules in Windows. For example, the following command will open up port 80 in the active Windows Firewall:
netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=80
For more see here http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0-Win/bk_HDP_Install_Win/content/ref-79239257-778e-42a9-9059-d982d0c08885.1.html

Binding a port < 1024 as a non-root user in Java

I have a Java application which runs as a non-root user.
My app creates a TFTP server (using Apache Commons TFTP). The TFTP server is bound to port 69 (the default TFTP port). When running the app from the IDE everything works fine, since the IDE runs as root. But if the app is run by another user I get the error
java.net.BindException: Permission denied
It is clear that a non-root user cannot open the port. Is there a workaround for this issue?
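A minimal reproduction outside the TFTP code shows the same behaviour (a small sketch; the port numbers are just examples):

import java.net.DatagramSocket;

public class BindTest {
    public static void main(String[] args) throws Exception {
        // A port above 1024 binds fine for any user
        try (DatagramSocket high = new DatagramSocket(6969)) {
            System.out.println("Bound UDP 6969");
        }
        // Port 69 (TFTP) is privileged: as a non-root user this throws
        // java.net.BindException: Permission denied
        try (DatagramSocket tftp = new DatagramSocket(69)) {
            System.out.println("Bound UDP 69");
        }
    }
}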
For binding ports less than 1024 on Linux, the application needs to run as root. There is no way around this. If you need to do this you have to run as root; sudo might be the command to look into.
BTW - running your IDE as root is not a very good idea.
To resolve this issue, you can use the setuid() and setgid() system calls, so that the process can start with elevated permissions to bind the port and then drop back to normal user permissions.
In my case, this problem happened on Solaris 11. I added privileges to the user to use the ports under 1024.
https://technicalsanctuary.wordpress.com/2014/06/03/allowing-a-user-to-use-ports-under-1024-on-solaris-11/
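On Solaris 11 that boils down to granting the net_privaddr privilege to the user (a hedged sketch; the user name is a placeholder and the change takes effect at the next login):

# allow the application user to bind privileged ports (< 1024)
usermod -K defaultpriv=basic,net_privaddr appuser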
