It is more of a design and architecture question.
I want to have a number of nodes in a cluster, all pre-installed with Java 6 on Windows/Linux. On each node I want to install my application (which I will maintain on a server), and this application will be used to run some tasks in parallel.
On the server I want to monitor the traffic of all the nodes and the task execution status.
How can I achieve this?
Any comments on this will be appreciated.
Thanks in advance.
If I understood your question correctly, you can use parallel-ssh and its pscp and pssh commands to copy your distribution onto the remote hosts and run the commands you need to install it.
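For illustration, a minimal sketch of driving such a deployment from Java with ProcessBuilder; pssh/pscp must already be installed, and the host file, user name, jar name, and target path below are all hypothetical:

    import java.io.IOException;

    public class DeployToCluster {
        public static void main(String[] args) throws IOException, InterruptedException {
            // Copy the application jar to every host listed in hosts.txt.
            run("pscp", "-h", "hosts.txt", "-l", "deploy", "app.jar", "/opt/app/app.jar");
            // Then start the application on all hosts in parallel.
            run("pssh", "-h", "hosts.txt", "-l", "deploy", "java -jar /opt/app/app.jar");
        }

        private static void run(String... cmd) throws IOException, InterruptedException {
            // inheritIO() forwards the tool's output to this process's console.
            Process p = new ProcessBuilder(cmd).inheritIO().start();
            System.out.println(cmd[0] + " exited with code " + p.waitFor());
        }
    }

The same two commands can of course be run straight from a shell script; the Java wrapper just makes it easy to hook into an existing build or monitoring process.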
There are also some alternatives: dsh and clusterit.
I have created a 4-node Hadoop cluster. I start all the DataNodes, the NameNode, the ResourceManager, etc.
To find out whether all of my nodes are actually working, I tried the following procedure:
Step 1. I run my program when all the nodes are active.
Step 2. I run my program when only the master is active.
The completion time in both cases was almost the same.
So I would like to know if there is any other means by which I can find out how many nodes are actually used while running the program.
We discussed this in chat. The problem is caused by an incorrect Hadoop installation: in both cases the job was started locally using LocalJobRunner.
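One quick way to see where jobs will actually run is to print the relevant configuration values; a minimal sketch (note that mapreduce.framework.name defaults to local, which means LocalJobRunner is used and the other nodes never participate):

    import org.apache.hadoop.conf.Configuration;

    public class WhereWillJobsRun {
        public static void main(String[] args) {
            // Configuration picks up core-site.xml/mapred-site.xml from the classpath.
            Configuration conf = new Configuration();
            // "local" => LocalJobRunner on this machine; "yarn" => the real cluster.
            System.out.println("mapreduce.framework.name = "
                    + conf.get("mapreduce.framework.name", "local"));
            // "file:///" => local filesystem instead of HDFS.
            System.out.println("fs.defaultFS = "
                    + conf.get("fs.defaultFS", "file:///"));
        }
    }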
Some recommendations:
Install Hadoop using Ambari (http://ambari.apache.org/)
Change platform to CentOS 6.4+
Use Oracle JDK 7
Be careful with host names and firewall settings
Get familiar with the cluster commands for health diagnostics (e.g. hdfs dfsadmin -report) and the default Hadoop web UIs
I need to determine the list of JVMs running on a remote machine and then connect to each of them using JMX. I am a newbie and have looked into the following approaches:
1. Using jps and jstat: I read that these commands may not be available in future JDK versions.
2. Using the attach API's VirtualMachine.list() method. The problem with this, though, is that it fetches the list of JVMs only for the local machine. I do not know how to connect to a remote machine and then obtain this list.
Can anyone please suggest how to use either VirtualMachine.list() or any other method to obtain a list of the JVMs on a remote machine?
The problem is that all the methods I have studied for connecting to a remote JVM (including the way jconsole works) target a SPECIFIC JVM whose port number I must provide. But I need a list of all the JVMs that are running. How can I do this? Is it even possible?
One option would be to launch a small Java application on the remote machine and have it run VirtualMachine.list() or similar, and then send back the information or make it accessible via JMX. This application could be running all the time, or you could perhaps launch it remotely.
Some other ideas are mentioned here: Get System Information of a Remote Machine (Using Java).
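A minimal sketch of such a helper, using the JDK attach API (com.sun.tools.attach; on JDK 6-8 it requires tools.jar on the classpath). It must run on the remote host itself, since the attach API only sees local JVMs:

    import com.sun.tools.attach.VirtualMachine;
    import com.sun.tools.attach.VirtualMachineDescriptor;

    public class ListLocalJvms {
        public static void main(String[] args) {
            // Lists every attachable JVM on this machine: pid and main class/jar.
            for (VirtualMachineDescriptor vmd : VirtualMachine.list()) {
                System.out.println(vmd.id() + "\t" + vmd.displayName());
            }
        }
    }

The output could then be written to a socket or exposed through JMX, as suggested above.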
You could add a java-agent or some other common component to each of the remote JVMs and have them "phone home" their JMXServiceURLs to a central clearing-house JVM. Other than that, I think your only options are derived broadly from monex0's suggestion.
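As a sketch of the phone-home idea: each JVM can start its own JMXConnectorServer on an ephemeral port and report the resulting address. How the address gets to the clearing house (here it is just printed) is up to you:

    import java.lang.management.ManagementFactory;
    import javax.management.MBeanServer;
    import javax.management.remote.JMXConnectorServer;
    import javax.management.remote.JMXConnectorServerFactory;
    import javax.management.remote.JMXServiceURL;

    public class PhoneHomeAgent {
        public static void main(String[] args) throws Exception {
            MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
            // An RMI connector on an ephemeral port; no fixed port required.
            JMXConnectorServer server = JMXConnectorServerFactory.newJMXConnectorServer(
                    new JMXServiceURL("service:jmx:rmi://"), null, mbs);
            server.start();
            // getAddress() is the URL a central monitor would connect to;
            // a real agent would send it to the clearing house instead of printing it.
            System.out.println("Phone home: " + server.getAddress());
            Thread.sleep(Long.MAX_VALUE); // keep the connector alive
        }
    }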
Hi,
I want to perform operations on files, such as rename, copy, etc.
These are not local files; they are located on remote computers.
I have two options:
1. Run some kind of telnet client (a framework that I already have in my system) from the Java code, connect to the remote computer, and perform the operation from the command line.
2. Perform a regular java.io operation on the remote path.
The problem with option 1 is that it is not cross-platform (only a theoretical problem for me), and I generally don't want to use this telnet framework.
The problem with option 2 is that large operations on remote files are slower than the same operations performed on the machine itself via telnet.
Am I right?
Any other options?
Any additional input?
Thanks.
If you can deploy your application to the remote computer, you can simply write your own little client and server for these file operations.
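A minimal sketch of such a server, assuming a made-up one-line protocol ("RENAME <from> <to>") and no authentication; the port and protocol are purely illustrative:

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.ServerSocket;
    import java.net.Socket;

    public class FileOpServer {
        public static void main(String[] args) throws IOException {
            try (ServerSocket server = new ServerSocket(4000)) { // hypothetical port
                while (true) {
                    try (Socket client = server.accept();
                         BufferedReader in = new BufferedReader(
                                 new InputStreamReader(client.getInputStream()));
                         PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                        // One command per connection: "RENAME <from> <to>".
                        String line = in.readLine();
                        String[] parts = line == null ? new String[0] : line.split(" ");
                        if (parts.length == 3 && parts[0].equals("RENAME")) {
                            boolean ok = new File(parts[1]).renameTo(new File(parts[2]));
                            out.println(ok ? "OK" : "FAIL");
                        } else {
                            out.println("UNSUPPORTED");
                        }
                    }
                }
            }
        }
    }

The client side is a plain Socket that writes the command line and reads the reply. The advantage over option 2 is that the actual file operation runs locally on the remote machine, so only the small command travels over the network.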
I have a Windows distribution server that runs an Ant task to build enterprise software. What I need is to have the Ant task copy and run a Linux VM image, and then talk to that Linux VM from the host operating system (from the Ant task itself). We need to be able to send files and/or commands to it.
Is there a practical way to go about this? I know that we already have a way to send commands to VMs that are also running Windows (Windows-to-Windows interaction), but is there a way to do Windows-to-Linux interaction?
I have implemented exactly what you want, for my own purposes, and then just found this question by googling the keywords "vmware" and "ant".
https://github.com/zhuravlik/ant-vix-tasks
This is a task set for Ant to manage VMware VMs.
It works via the VIX API, so Linux guests should be supported.
I did not test it with VMware Server, though, only with Workstation.
But the API is common, so it should work.
Using SSH is probably the simplest approach. There is an Ant task for that (sshexec), and there is also an scp task to copy files.
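Ant's optional sshexec and scp tasks are built on the JSch library, so the same thing can also be done directly from Java if you need more control. A minimal sketch of running a command on the Linux VM with JSch; the host, user, and password are placeholders:

    import com.jcraft.jsch.ChannelExec;
    import com.jcraft.jsch.JSch;
    import com.jcraft.jsch.Session;
    import java.io.InputStream;

    public class RunOnLinuxVm {
        public static void main(String[] args) throws Exception {
            Session session = new JSch().getSession("user", "linux-vm", 22);
            session.setPassword("secret");
            // Acceptable for a throwaway lab VM; use known_hosts checking in production.
            session.setConfig("StrictHostKeyChecking", "no");
            session.connect();
            ChannelExec channel = (ChannelExec) session.openChannel("exec");
            channel.setCommand("uname -a");
            InputStream out = channel.getInputStream(); // obtain before connect()
            channel.connect();
            int c;
            while ((c = out.read()) != -1) {
                System.out.print((char) c);
            }
            channel.disconnect();
            session.disconnect();
        }
    }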
It will depend on what you need to do, but:
The Linux system could expose an SSH server, and the host can do just about anything it needs to via SSH.
The Linux system could expose a web service that the host consumes.
The Linux system could expose a Samba share which the host then connects to and reads/writes from (if all you need to do is deal with some files, but that seems unlikely).
There are probably dozens of options.
I have a cluster of 32 servers and I need a tool to distribute a Java service, packaged as a Jar file, to each machine and remotely start the service. The cluster consists of Linux (Suse 10) servers with 8 cores per blade. The application is a data grid which uses Oracle Coherence. What is the best tool for doing this?
I asked something similar once, and it seems that the Java Parallel Processing Framework might be what you need:
http://www.jppf.org/
From the web site:
JPPF is an open source Grid Computing platform written in Java that makes it easy to run applications in parallel, and speed up their execution by orders of magnitude. Write once, deploy once, execute everywhere!
Have a look at OpenMOLE: http://www.openmole.org/
This tool enables you to distribute a computing workflow to several kinds of resources: from multicore machines to clusters and computing grids.
It is nicely documented and can be controlled through Groovy code or a GUI.
Distributing a jar on a cluster should be very easy to do with OpenMOLE.
Is your service packaged as an EJB? JBoss does a fairly good job with clustering.
Use BitTorrent. Peer-to-peer sharing on clusters can really speed up your deployment.
It depends on which operating system you have and how security is set up on your network.
If you can use NFS or a Windows share, I suggest you put the software on a network drive that is visible to all machines. That way you can run them all from one copy.
If you have remote shell or secure shell access, you can write a script which runs the same command on each machine, e.g. to start or stop the service on all machines.
On Windows you might want to set up a service on each machine. On Linux you might want to add a startup/shutdown script to each machine.
When you have a number of machines, it can be useful to have a tool which monitors that all your services are running, collects the logs and errors in one place, and/or allows you to start/stop them from a GUI. There are a number of tools that do this; I am not sure which is best these days.