Hi folks, I am stuck on a very strange problem. I am installing HBase and Hadoop on another VM, accessing it from my machine. I have properly installed Hadoop, ran ./start-all.sh, and saw that all processes were running perfectly. So I ran jps and saw:
JobTracker
TaskTracker
NameNode
SecondaryNameNode
DataNode
Everything was running fine. But when I set up HBase and then started Hadoop and HBase together, the NameNode did not run, and in the NameNode log file I got this exception:
java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.hdfs.server.namenode.DecommissionManager$Monitor.run(DecommissionManager.java:65)
at java.lang.Thread.run(Thread.java:662)
2012-05-19 08:46:07,493 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0
2012-05-19 08:46:07,516 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException: Problem binding to localhost/23.21.195.24:54310 : Cannot assign requested address
at org.apache.hadoop.ipc.Server.bind(Server.java:227)
at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:301)
at org.apache.hadoop.ipc.Server.<init>(Server.java:1483)
at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:545)
at org.apache.hadoop.ipc.RPC.getServer(RPC.java:506)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:294)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:497)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1268)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1277)
Caused by: java.net.BindException: Cannot assign requested address
at sun.nio.ch.Net.bind(Native Method)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
at org.apache.hadoop.ipc.Server.bind(Server.java:225)
... 8 more
2012-05-19 08:46:07,516 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
I checked the ports and revised all the conf files again and again but didn't find the solution. Please guide me if anyone has an idea.
Thanks a lot.
Based on your comment, your problem is most probably related to the hosts file.
Firstly, you should uncomment the 127.0.0.1 localhost entry; this is a fundamental entry.
Secondly, have you set up Hadoop and HBase to run with externally accessible services? I'm not too up on HBase, but for Hadoop the services need to be bound to non-localhost addresses for external access, so your masters and slaves files in $HADOOP_HOME/conf need to name the actual machine names (or IP addresses if you don't have a DNS server). None of your configuration files should mention localhost; they should use either the host names or the IP addresses.
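As a rough sketch (the hostname and address below are assumptions, not values from your setup), the VM's /etc/hosts would keep the localhost line and map the machine's real address to a name:

127.0.0.1     localhost
192.168.1.50  hadoop-vm

and core-site.xml (Hadoop 1.x property name, matching the 54310 port in your log) would point at that name rather than localhost:

<property>
  <name>fs.default.name</name>
  <value>hdfs://hadoop-vm:54310</value>
</property>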
I'm using the Apache Flink Kubernetes operator to deploy a standalone job on an Application cluster setup.
I have set up the following files using the official Flink documentation - Link
jobmanager-application-non-ha.yaml
taskmanager-job-deployment.yaml
flink-configuration-configmap.yaml
jobmanager-service.yaml
I have not changed any of the configurations in these files and am trying to run a simple WordCount example from the Flink examples using the Apache Flink Operator.
After running the kubectl commands to set up the job manager and the task manager, the job manager goes into a NotReady state while the task manager goes into a CrashLoopBackOff loop:
NAME READY STATUS RESTARTS AGE
flink-jobmanager-28k4b 1/2 NotReady 2 (4m24s ago) 16m
flink-kubernetes-operator-6585dddd97-9hjp4 2/2 Running 0 10d
flink-taskmanager-6bb88468d7-ggx8t 1/2 CrashLoopBackOff 9 (2m21s ago) 15m
The job manager logs look like this
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Slot request bulk is not fulfillable! Could not allocate the required slot within slot request timeout
at org.apache.flink.runtime.jobmaster.slotpool.PhysicalSlotRequestBulkCheckerImpl.lambda$schedulePendingRequestBulkWithTimestampCheck$0(PhysicalSlotRequestBulkCheckerImpl.java:86) ~[flink-dist-1.16.0.jar:1.16.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRunAsync$4(AkkaRpcActor.java:453) ~[flink-rpc-akka_be40712e-8b2e-47cd-baaf-f0149cf2604d.jar:1.16.0]
at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) ~[flink-rpc-akka_be40712e-8b2e-47cd-baaf-f0149cf2604d.jar:1.16.0]
The task manager, it seems, cannot connect to the job manager:
2023-01-28 19:21:47,647 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Connecting to ResourceManager akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*(00000000000000000000000000000000).
2023-01-28 19:21:57,766 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*.
2023-01-28 19:22:08,036 INFO akka.remote.transport.ProtocolStateActor [] - No response from remote for outbound association. Associate timed out after [20000 ms].
2023-01-28 19:22:08,057 WARN akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://flink@flink-jobmanager:6123] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@flink-jobmanager:6123]] Caused by: [No response from remote for outbound association. Associate timed out after [20000 ms].]
2023-01-28 19:22:08,069 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*.
2023-01-28 19:22:08,308 WARN akka.remote.transport.netty.NettyTransport [] - Remote connection to [null] failed with org.jboss.netty.channel.ConnectTimeoutException: connection timed out: flink-jobmanager/100.127.18.9:6123
The flink-configuration-configmap.yaml looks like this
flink-conf.yaml: |+
jobmanager.rpc.address: flink-jobmanager
taskmanager.numberOfTaskSlots: 2
blob.server.port: 6124
jobmanager.rpc.port: 6123
taskmanager.rpc.port: 6122
queryable-state.proxy.ports: 6125
jobmanager.memory.process.size: 1600m
taskmanager.memory.process.size: 1728m
parallelism.default: 2
This is what the pom.xml looks like - Link
You deployed the Kubernetes Operator in the namespace, but you did not create the FlinkDeployment/FlinkSessionJob resources the Operator works with; instead you tried to create a standalone Flink Kubernetes cluster.
The Flink Operator makes it a lot easier to deploy your Flink jobs: you only need to deploy the operator itself and then create FlinkDeployment/FlinkSessionJob resources. The operator will manage your deployment after that.
Please use this documentation for the Kubernetes Operator: Link
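For reference, a minimal FlinkDeployment for the bundled WordCount example looks roughly like the sketch below (modelled on the operator's basic example; the image tag, memory sizes and parallelism are illustrative assumptions, not taken from your files):

apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: wordcount-example
spec:
  image: flink:1.16
  flinkVersion: v1_16
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
  serviceAccount: flink
  jobManager:
    resource:
      memory: "1600m"
      cpu: 1
  taskManager:
    resource:
      memory: "1728m"
      cpu: 1
  job:
    jarURI: local:///opt/flink/examples/streaming/WordCount.jar
    parallelism: 2
    upgradeMode: stateless

Applying this with kubectl apply -f and letting the operator reconcile it replaces the four standalone YAML files entirely.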
I have followed the necessary instructions for setting up a distributed JMeter testing environment with JMeter 4.0.
I have one master and one slave. Both are on the same subnet, and I have (whether I should have or not) opened both inbound and outbound firewall ports for 1099 (for RMI) and 23 on both the master and the slave. I could not shut down the firewall entirely as there is some 'group' policy at my workplace.
I have the necessary rmi_keystore.jks file, created with the name 'rmi', and its path referenced correctly in the properties file. I have put it in the jmeter\bin directory for both the slave and the master. Hence, it starts the slave object properly.
When I start the master I wait for a bit and eventually get the following:
c:\XXX\>jmeter -n -t YYY.jmx -r -l ZZZ.jtl -e -o Result
Creating summariser <summary>
Created the tree successfully using YYY.jmx
Configuring remote engine: AAA.AAA.AAA.AAA
Starting remote engines
Starting the test # Fri May 18 17:29:19 BST 2018 (xxxxxxxxxxxxx)
Error in rconfigure() method java.rmi.ConnectException: Connection refused to
host: AAA.AAA.AAA.AAA; nested exception is:
java.net.ConnectException: Connection timed out: connect
Remote engines have been started
Waiting for possible Shutdown/StopTestNow/Heapdump message on port 4445
I am not quite sure what else to do, as I have followed the necessary instructions, so I would really appreciate some help. Thanks.
AFAIK, the remote engines need to be started manually.
Be sure you started them before starting the master.
Furthermore, if the master has two (or more) network interfaces, you need to specify the one where the RMI server sits and listens.
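As a rough sketch (the addresses are placeholders, not taken from your setup): start the remote engine on each slave before kicking off the run from the master, and pin RMI to the externally reachable interface via the standard java.rmi.server.hostname property:

rem on the slave machine (jmeter-server.bat on Windows)
jmeter-server -Djava.rmi.server.hostname=AAA.AAA.AAA.AAA

rem on the master machine
jmeter -n -t YYY.jmx -r -l ZZZ.jtl -Djava.rmi.server.hostname=<master-ip>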
Here is what I have done in a nutshell:
STEP 1: I have successfully configured Hadoop 2.6 on my laptop (single node) and ran a sample MapReduce job.
STEP 2: I cloned the Tez repository, successfully built version 0.8.0, copied the jar files into HDFS, and exported the required variables. I also changed the value of mapreduce.framework.name to yarn-tez in mapred-site.xml.
But when I try to run a Tez orderedwordcount job, I get this error:
15/07/04 18:45:03 INFO ipc.Client: Retrying connect to server: hostname/hostIP:57339.
Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/07/04 18:45:12 INFO client.DAGClientImpl: DAG completed. FinalState=FAILED
I have checked the resource manager and it is listening on port 8030.
But it seems the client tries to connect to a random port. Is that correct?
What can I do to get it to work correctly?
It seems that it was a problem with this version (0.8.0) connecting to the resource manager. I compiled and integrated the previous stable release (0.7.0) and everything is good to go now. I hope they will figure the problem out.
From your logs it seems like a firewall issue rather than an issue with the Tez version. And it is irrespective of Tez; you can face this even if you run Hadoop alone.
Hadoop uses multiple ports for communication with clients and between service components. To enable Hadoop communication, open the specific ports that Hadoop uses.
To open specific ports, you can set the access rules in Windows. For example, the following command will open up port 80 in the active Windows Firewall:
netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=80
For more see here http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0-Win/bk_HDP_Install_Win/content/ref-79239257-778e-42a9-9059-d982d0c08885.1.html
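For instance, an analogous rule for the resource manager scheduler port mentioned above (8030) would look like this (the port choice is illustrative; open whichever Hadoop/YARN ports your cluster actually uses):

netsh advfirewall firewall add rule name=AllowYarnScheduler dir=in action=allow protocol=TCP localport=8030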
There is a Linux VM with Hadoop installed and running.
And there is Java app running in Eclipse that retrieve data from HDFS.
If I am copying file(s) to or from HDFS inside the VM everything works fine.
But when I am running the app from my Windows physical machine, I get the following exception:
WARN hdfs.DFSClient: Failed to connect to /127.0.0.1:50010 for block, add to
deadNodes and continue. java.net.ConnectException: Connection refused: no further
information. Could not obtain BP-*** from any node: java.io.IOException:
No live nodes contain current block. Will get new block locations from namenode and retry
I can only retrieve the list of files from HDFS.
It seems that when retrieving data from a DataNode, the client connects to my Windows localhost,
because when I made a tunnel in PuTTY from my localhost to the VM, everything was fine.
Here is my Java code:
Configuration config = new Configuration();
config.set("fs.defaultFS", "hdfs://ip:port/");
config.set("mapred.job.tracker", "hdfs://ip:port");
FileSystem dfs = FileSystem.get(new URI("hdfs://ip:port/"), config, "user");
dfs.copyToLocalFile(false, new Path("/tmp/sample.txt"), new Path("D://sample.txt"), true);
How can it be fixed?
Thanks.
P.S. This error occurs when I am using QuickStart VM from Cloudera.
Your DataNode is advertising its address to the NameNode as 127.0.0.1. You need to re-configure your Pseudo distributed cluster such that the nodes use externally available addresses (hostnames or IP addresses) when opening socket services.
I imagine if you run a netstat -atn on your VM, you'll see the Hadoop ports bound to 127.0.0.1 rather than 0.0.0.0 - this means they will only accept internal connections.
You need to look at your VM's /etc/hosts configuration file and ensure the hostname doesn't have an entry resolving to 127.0.0.1.
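A minimal client-side sketch of that idea (assuming Hadoop 2.x property names; this is an illustration, not something from the question): ask the DFS client to contact DataNodes by hostname rather than by the 127.0.0.1 address the NameNode reports, and make that hostname resolve to the VM's external IP on the Windows side:

// next to the existing config.set(...) calls in the Java client
config.set("dfs.client.use.datanode.hostname", "true");

The matching server-side knob in hdfs-site.xml is dfs.datanode.use.datanode.hostname, and the DataNode itself should be bound to 0.0.0.0 rather than 127.0.0.1.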
Whenever you start a VM, it gets its own IP, something like 192.x.x.x or 172.x.x.x.
Using 127.0.0.1 for HDFS won't help when you are executing from your Windows box, because it maps to the local IP. So if you use 127.0.0.1 from your Windows machine, it will think that HDFS is running on the Windows machine. This is why your connection is failing.
Find the IP associated with your VM. Here is a link for how to get that if you are using Hyper-V: http://windowsitpro.com/hyper-v/quickly-view-all-ip-addresses-hyper-v-vms
Once you get the VM's IP, use it in the application.
You need to change the IP. First go to the Linux VM and, in its terminal, find the IP address of the VM.
The command to see the IP address in the Linux VM is below:
ifconfig
Then, in your code, change the IP address to the one shown in your Linux VM.
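For example (the address and port below are assumptions; use whatever ifconfig reports and your NameNode's RPC port, which is 8020 by default on the Cloudera QuickStart VM):

config.set("fs.defaultFS", "hdfs://192.168.56.101:8020/");
FileSystem dfs = FileSystem.get(new URI("hdfs://192.168.56.101:8020/"), config, "user");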
I am trying to connect to Cassandra. I installed the latest stable version, apache-cassandra-1.2.4, and extracted it on my desktop. When I run Cassandra, it starts up nicely, listening for thrift clients and displaying the following:
sudo cassandra -f
log :
INFO 15:30:34,646 Cassandra version: 1.0.12
INFO 15:30:34,646 Thrift API version: 19.20.0
INFO 15:30:34,646 Loading persisted ring state
INFO 15:30:34,650 Starting up server gossip
INFO 15:30:34,661 Enqueuing flush of Memtable-LocationInfo@1117603949(29/36 serialized/live bytes, 1 ops)
INFO 15:30:34,661 Writing Memtable-LocationInfo@1117603949(29/36 serialized/live bytes, 1 ops)
INFO 15:30:34,877 Completed flushing /var/lib/cassandra/data/system/LocationInfo-hd-54-Data.db (80 bytes)
INFO 15:30:34,892 Starting Messaging Service on port 7000
INFO 15:30:34,901 Using saved token 143186062733850112297005303551620336860
INFO 15:30:34,903 Enqueuing flush of Memtable-LocationInfo@1282534304(53/66 serialized/live bytes, 2 ops)
INFO 15:30:34,904 Writing Memtable-LocationInfo@1282534304(53/66 serialized/live bytes, 2 ops)
INFO 15:30:35,102 Completed flushing /var/lib/cassandra/data/system/LocationInfo-hd-55-Data.db (163 bytes)
INFO 15:30:35,106 Node localhost/127.0.0.1 state jump to normal
INFO 15:30:35,107 Bootstrap/Replace/Move completed! Now serving reads.
INFO 15:30:35,108 Will not load MX4J, mx4j-tools.jar is not in the classpath
INFO 15:30:35,150 Binding thrift service to localhost/127.0.0.1:9160
INFO 15:30:35,155 Using TFastFramedTransport with a max frame size of 15728640 bytes.
INFO 15:30:35,160 Using synchronous/threadpool thrift server on localhost/127.0.0.1 : 9160
INFO 15:30:35,168 Listening for thrift clients...
Now, when I run cassandra-cli -h localhost -p 9160, it throws the error below. I have checked that the port is free and Cassandra is listening on it:
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
at org.apache.cassandra.cli.CliMain.connect(CliMain.java:80)
at org.apache.cassandra.cli.CliMain.main(CliMain.java:256)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
at java.net.Socket.connect(Socket.java:579)
at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
... 3 more
Exception connecting to localhost/9160. Reason: Connection refused.
I had the same error. Now it is OK.
The main problem was that the configuration was wrong.
My configuration is as follows:
My virtual machine's IP is 192.168.11.11, and Cassandra is installed on that machine. So I configured:
listen_address: 192.168.11.11
rpc_address: 0.0.0.0
broadcast_rpc_address: 192.168.11.11
That is OK.
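After editing cassandra.yaml with those values and restarting Cassandra, connect using the VM's address rather than localhost (the ports shown are the defaults; adjust them if you changed them):

cassandra-cli -h 192.168.11.11 -p 9160
./cqlsh 192.168.11.11 9042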
The documentation of cassandra-stress seems to be sketchy; maybe in due course that will be corrected. As of now, this command worked for me:
./cassandra-stress write -node <IP_OF_NODE1>
Once this works, we could try putting in the other optional parameters to tweak our command.
Option 1:
Run the jps command as the root user and kill CassandraDaemon if you see it. After this, start Cassandra again.
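A rough sketch of that check (the PID is whatever jps prints; the commands assume a Linux shell):

sudo jps | grep CassandraDaemon   # note the PID of any stale daemon
sudo kill <pid>                   # stop it
sudo cassandra -f                 # then start Cassandra again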
Option 2:
Try to connect to Cassandra with CQL:
./cqlsh 10.234.31.232 9042
Final Check:
An intermediate firewall may be blocking the JVM from making the connection.
An operating system firewall or antivirus could be causing the problem as well.
I think you installed it on Windows, and it looks like a firewall is blocking your connection.