Pentaho - Execute a Kettle Transformation remotely using Java

I am executing my jobs/transformations using the Java API and am able to do so correctly on my host.
Now I am looking for a way to execute the transformation on a remote host (where Carte is running). Please help me or redirect me to the proper documentation where I can find the classes to use to accomplish this.
PDI Version - 5.0.1
Currently I am executing my job as below:
try {
    if (jobDetails.getGraphlocation() != null) {
        KettleEnvironment.init();
        JobMeta jobMeta = new JobMeta(jobDetails.getGraphlocation(), null);
        // parameters are passed as "name=value" strings
        for (String s : jobDetails.getArguments()) {
            String[] splitString = s.split("=");
            if (splitString.length == 2) {
                jobMeta.setParameterValue(splitString[0], splitString[1]);
            } else {
                System.err.println("Parameter should be of the form - name=value");
            }
        }
        Job job = new Job(null, jobMeta);
        job.setLogLevel(LogLevel.valueOf(jobDetails.getLoglevel().toString()));
        job.start();
        job.waitUntilFinished();
        if (job.getErrors() != 0) {
            System.out.println("Error encountered!");
        }
    }
} catch (KettleException e) {
    e.printStackTrace();
}
The above code is able to execute the job wherever I am running it. But I want to execute it on a slave server by just passing the Carte username, password and server IP address.

You can do it through Spoon by registering the Carte server, or you can do it in a job by specifying the name and port of the Carte server in the actual job/transformation step. That is, you can create a launcher job which just has Start, Job (pointing at the Carte server), and Success steps. A programmatic sketch of the remote approach follows below.
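If you want to stay in the Java API, here is a minimal sketch of shipping a job to a Carte slave server. It assumes the PDI 5.x classes (the exact sendToSlaveServer signature varies slightly between versions), and the job path, host, port and credentials are placeholders:

import org.pentaho.di.cluster.SlaveServer;
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.logging.LogLevel;
import org.pentaho.di.job.Job;
import org.pentaho.di.job.JobExecutionConfiguration;
import org.pentaho.di.job.JobMeta;

public class RemoteJobRunner {
    public static void main(String[] args) throws Exception {
        KettleEnvironment.init();
        JobMeta jobMeta = new JobMeta("/path/to/job.kjb", null); // placeholder path
        // the Carte slave server: name, host, port, username, password (placeholders)
        SlaveServer slaveServer = new SlaveServer("remote-carte", "192.168.1.10", "8080", "cluster", "cluster");
        JobExecutionConfiguration config = new JobExecutionConfiguration();
        config.setRemoteServer(slaveServer);
        config.setLogLevel(LogLevel.BASIC);
        // ship the job to Carte and start it there; returns the Carte object id
        String carteObjectId = Job.sendToSlaveServer(jobMeta, config, null, null);
        System.out.println("Started on Carte with id " + carteObjectId);
    }
}

Trans.sendToSlaveServer(...) plays the same role for transformations, and SlaveServer.getJobStatus(...) can be polled with the returned id to track the remote run.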

Related

How to programmatically connect to the internet via a datacard with AT commands?

I have a datacard ZTE MF190. I want to use AT commands to register on 2G or 3G and access the internet via the datacard. I found this article about how to make a data call:
AT+cgatt=1
AT+CGDCONT=1,"IP","epc.tmobile.com" //I used my operator's PDP context
AT+CGACT=1,1
But pinging from the OS terminal shows 100% packet loss.
I've tried on Ubuntu 14 and Windows 7.
How can I connect to the internet with AT commands using a datacard on Ubuntu?
UPDATE
I gave the bounty to @tripleee's answer because it is more complete than the first one and answered all my questions. But I'm not satisfied with the answers, so I'll answer my own question in a week.
In my answer I'll show how to handle this process with Java. So please do not move this question to other Stack Exchange websites.
Creating a connection between the card and your provider is not sufficient. You need some mechanism for creating a network interface out of this connection, and set up your network stack to route packets over this interface.
Traditionally, the pppd daemon has been a popular choice for this task. You would create a "chat script" with the commands for establishing a data call (these days, pppd might come packaged with a suitable canned script) and the daemon would handle the entire process of placing the call, authenticating, setting up a network interface over the circuit, configuring the system to route packets over it, and pointing DNS resolver queries at it. An illustrative chat script is sketched below.
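For illustration only, a chat script for this kind of modem might look like the following; the file path and APN are assumptions, and pppd would be pointed at it via its connect option:

# /etc/chatscripts/gprs-connect (hypothetical path)
# chat works in expect-send pairs; ABORT lines abandon the call on errors
ABORT "BUSY"
ABORT "NO CARRIER"
ABORT "ERROR"
"" ATZ
OK AT+CGDCONT=1,"IP","epc.tmobile.com"
OK ATD*99***1#
CONNECT ""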
I tried to sniff the USB port, but in that case the dashboard cannot connect because the port is busy.
It is certainly possible. See this question
Found this article about how to make data call
What that article is about is how to set up the call, not how to make it.
After you have completed the setup, connect to the internet with this command:
ATD*99***1#
UPDATE1: After a bit of research I believe that article was written only to promote their software and has no practical use. In reality, dialing is done with pppd or wvdial.
UPDATE2: We discussed ways to solve the problem in a chat room (in Russian). It turned out that cnetworkmanager is the way to go.
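That said, since the question asks for a Java approach, here is a minimal sketch of issuing the setup and dial commands from Java. It assumes the jSSC serial library and that /dev/ttyUSB2 is the modem's command port; both are assumptions, and any serial library would do:

import jssc.SerialPort;

public class DialSketch {
    public static void main(String[] args) throws Exception {
        // hypothetical port; find the right ttyUSB* for your modem
        SerialPort port = new SerialPort("/dev/ttyUSB2");
        port.openPort();
        port.setParams(SerialPort.BAUDRATE_9600, SerialPort.DATABITS_8,
                SerialPort.STOPBITS_1, SerialPort.PARITY_NONE);
        // attach to the packet domain, define the PDP context, then dial
        port.writeString("AT+CGATT=1\r");
        Thread.sleep(500);
        port.writeString("AT+CGDCONT=1,\"IP\",\"epc.tmobile.com\"\r");
        Thread.sleep(500);
        port.writeString("ATD*99***1#\r");
        Thread.sleep(500);
        System.out.println(port.readString()); // modem responses, e.g. OK / CONNECT
        port.closePort();
    }
}

Note that, as explained above, a successful CONNECT only gives you a circuit; you still need pppd (or wvdial, below) on top of it to get a network interface.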
As far as I know, wvdial uses the ppp daemon to connect to the internet using a modem. wvdial is preinstalled on the desktop version of Ubuntu.
wvdial uses a config file located at /etc/wvdial.conf. Let's edit this file. Type in your terminal
sudo nano /etc/wvdial.conf
and you will see something like this
[Dialer Defaults]
Init1 = ATZ
Init2 = ATQ0 V1 E1 S0=0 &C1 &D2
Stupid Mode = yes
ISDN = 0
Modem Type = Analog Modem
New PPPD = yes
Phone = *99#
Modem = /dev/ttyUSB2
Username = ''
Password = ''
Baud = 9600
Dial Timeout = 30
Dial Attempts = 3
An explanation of all the keys can be found in the wvdial.conf(5) Linux man page. If you need to change your provider's dial number, username, password, or any other information about the connection and device, edit the file content and save it.
There are three serial ports for the ZTE MF190, normally ttyUSB0, ttyUSB1 and ttyUSB2. In my case ttyUSB2 is the one for the internet connection; it would not work on the other ports. So you need to find the right serial port for your modem.
There is an automatic configurator which edits the wvdial.conf file, sets the serial port baud rate, etc. Since it does not always configure things correctly, I would not recommend using it:
sudo wvdialconf /etc/wvdial.conf
It is better to configure wvdial manually.
Now, when your device is connected and wvdial is configured to work with it, you can execute this line from the terminal:
wvdial
You will see a lot of output. If you see lines like the following, you have succeeded:
local IP address XX.XX.XX.XX
remote IP address XX.XX.XX.XX
primary DNS address XX.XX.XX.XX
secondary DNS address XX.XX.XX.XX
Now, how can we use this from a program? I'll provide some Java code to work with it. You can use this code to dial:
public int dialer() {
    // connection status for debugging; 4 means fully connected.
    // An AtomicInteger (java.util.concurrent.atomic) is used because a plain
    // local variable cannot be reassigned from the listener thread below.
    final AtomicInteger status = new AtomicInteger(0);
    // create the wvdial process
    ProcessBuilder builder = new ProcessBuilder("wvdial");
    try {
        // start wvdial
        final Process process = builder.start();
        // wvdial reports progress on stderr; watch it on a separate thread
        final Thread ioThread = new Thread() {
            @Override
            public void run() {
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(process.getErrorStream()))) {
                    // one line of wvdial output
                    String line;
                    while ((line = reader.readLine()) != null) {
                        // advance the status as each expected line appears
                        if (line.contains("local IP address")) {
                            status.set(1);
                        }
                        if (line.contains("remote IP address")) {
                            status.set(2);
                        }
                        if (line.contains("primary DNS address")) {
                            status.set(3);
                        }
                        if (line.contains("secondary DNS address")) {
                            status.set(4);
                        }
                    }
                } catch (final Exception e) {
                    e.printStackTrace();
                }
            }
        };
        // start the listener
        ioThread.start();
        // crude timeout: give wvdial 6 seconds, then report whatever we saw
        Thread.sleep(6000);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return status.get();
}
And here is a disconnect method. All you need to do is kill the wvdial process; the listener thread will then terminate:
public boolean disconnect() {
    // ask the OS to terminate wvdial, which drops the connection
    ProcessBuilder builder = new ProcessBuilder("pkill", "wvdial");
    try {
        builder.start();
        return true;
    } catch (IOException e) {
        return false;
    }
}

Killing an Oozie workflow from Java

So I've been playing around with the Oozie Java API, and all was fine and dandy until I hit the following problem while trying to run this Java code:
OozieClient oc = new OozieClient(OOZIE_URL);
Properties conf = oc.createConfiguration();
conf.setProperty(OozieClient.APP_PATH, PATH_TO_WF);
String jobId = oc.run(conf);
while (oc.getJobInfo(jobId).getStatus() == WorkflowJob.Status.PREP) {
    Thread.sleep(1000);
}
oc.kill(jobId);
This fails with the following exception:
E0508: User [?] not authorized for WF job [JOB_ID_GOES_HERE]
I've been able to find some related issues on Google, though the ones I noticed were only related to the command-line Oozie client.
My main question is: considering that you can run an Oozie workflow from Java as another user by simply adding
conf.setProperty("user.name", "user123");
is there something similar that can be done when killing a workflow?
Use AuthOozieClient and set the system user:
OozieClient oc = new AuthOozieClient(OOZIE_URL);
System.setProperty("user.name", userName);
oc.kill(jobId);
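Putting it together, here is a minimal end-to-end sketch; the Oozie URL, user name and workflow path are placeholders, and it assumes the server accepts the configured authentication:

import java.util.Properties;
import org.apache.oozie.client.AuthOozieClient;
import org.apache.oozie.client.OozieClient;
import org.apache.oozie.client.WorkflowJob;

public class KillWorkflow {
    public static void main(String[] args) throws Exception {
        // act as the user who owns the job; needed for kill() to be authorized
        System.setProperty("user.name", "user123");
        OozieClient oc = new AuthOozieClient("http://oozie-host:11000/oozie");
        Properties conf = oc.createConfiguration();
        conf.setProperty(OozieClient.APP_PATH, "hdfs:///user/user123/my-wf");
        conf.setProperty("user.name", "user123");
        String jobId = oc.run(conf);
        // wait until the job leaves PREP, then kill it
        while (oc.getJobInfo(jobId).getStatus() == WorkflowJob.Status.PREP) {
            Thread.sleep(1000);
        }
        oc.kill(jobId);
        System.out.println(jobId + " -> " + oc.getJobInfo(jobId).getStatus());
    }
}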

How to programmatically get all running jobs in a Hadoop cluster using the new API?

I have a software component which submits MR jobs to Hadoop. I now want to check if there are other jobs running before submitting it. I found out that there is a Cluster object in the new API which can be used to query the cluster for running jobs, get their configurations and extract the relevant information from them. However I am having problems using this.
Just doing new Cluster(conf) where conf is a valid Configuration which can be used to access this cluster (e.g., to submit jobs to it) leaves the object unconfigured, and the getAllJobStatuses() method of Cluster returns null.
Extracting mapreduce.jobtracker.address from the configuration, constructing an InetSocketAddress from it and using the other constructor of Cluster throws Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses..
Using the old API, doing something like new JobClient(conf).getAllJobs() throws an NPE.
What am I missing here? How can I programmatically get the running jobs?
I investigated even more, and I solved it. Thomas Jungblut was right, it was because of the mini cluster. I used the mini cluster following this blog post which turned out to work for MR jobs, but set up the mini cluster in a deprecated way with an incomplete configuration. The Hadoop Wiki has a page on how to develop unit tests which also explains how to correctly set up a mini cluster.
Essentially, I do the mini cluster setup the following way:
// Create a YarnConfiguration for bootstrapping the minicluster
final YarnConfiguration bootConf = new YarnConfiguration();
// Base directory to store HDFS data in
final File hdfsBase = Files.createTempDirectory("temp-hdfs-").toFile();
bootConf.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR, hdfsBase.getAbsolutePath());
// Start Mini DFS cluster
final MiniDFSCluster hdfsCluster = new MiniDFSCluster.Builder(bootConf).build();
// Configure and start Mini MR YARN cluster
bootConf.setInt(YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB, 64);
bootConf.setClass(YarnConfiguration.RM_SCHEDULER, FifoScheduler.class, ResourceScheduler.class);
final MiniMRYarnCluster yarnCluster = new MiniMRYarnCluster("test-cluster", 1);
yarnCluster.init(bootConf);
yarnCluster.start();
// Get the "real" Configuration to use from now on
final Configuration conf = yarnCluster.getConfig();
// Get the filesystem
final FileSystem fs = new Path("hdfs://localhost:" + hdfsCluster.getNameNodePort() + "/").getFileSystem(conf);
Now I have conf and fs which I can use to submit jobs and access HDFS, and new Cluster(conf) and cluster.getAllJobStatuses() work as expected.
When everything is done, to shut down and clean up, I call:
yarnCluster.stop();
hdfsCluster.shutdown();
FileUtils.deleteDirectory(hdfsBase); // from Apache Commons IO
Note: JAVA_HOME must be set for this to work. When building on Jenkins, make sure JAVA_HOME is set for the default JDK. Alternatively, you can explicitly state a JDK to use; Jenkins will then set up JAVA_HOME automatically.
I tried it like this and it worked for me, but only after submitting the job:
JobClient jc = new JobClient(job.getConfiguration());
for (JobStatus js : jc.getAllJobs()) {
    if (js.getState().getValue() == State.RUNNING.getValue()) {
        // a job is currently running on the cluster
    }
}
jc.close();
Alternatively, we can get the Cluster from the Job API; it has methods which return all the jobs and their statuses:
cluster.getAllJobStatuses();
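For example, a minimal sketch along those lines, assuming a Configuration that actually points at the cluster (see the discussion above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Cluster;
import org.apache.hadoop.mapreduce.JobStatus;

public class RunningJobs {
    public static void main(String[] args) throws Exception {
        Cluster cluster = new Cluster(new Configuration());
        // getAllJobStatuses() returns null on a misconfigured Cluster (see above)
        for (JobStatus status : cluster.getAllJobStatuses()) {
            if (status.getState() == JobStatus.State.RUNNING) {
                System.out.println(status.getJobID() + " is running");
            }
        }
        cluster.close();
    }
}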

Selenium server is not starting for easyb project

[FAILURE: Could not contact Selenium Server; have you started it on 'localhost:4444' ? Read more at http://seleniumhq.org/projects/remote-control/not-started.html Connection refused]
Hi,
I am working on easyb and encountered the above problem.
How can I start the Selenium RC server, and what is this problem all about?
Thanks.
Well, you could write a Groovy script in [your-webapp]/scripts/_Events.groovy to start and stop Selenium.
(You would have to install the selenium-rc plugin first to have access to the _SeleniumConfig and _SeleniumServer scripts.)
includeTargets << new File("$seleniumRcPluginDir/scripts/_SeleniumConfig.groovy")
includeTargets << new File("$seleniumRcPluginDir/scripts/_SeleniumServer.groovy")
eventTestPhaseStart = { phase ->
if(isAcceptance(phase)){
startSeleniumServer()
}
}
eventTestPhaseEnd = { phase ->
if(isAcceptance(phase)){
stopSeleniumServer()
}
}
isAcceptance = { phase->
phase?.contains("acceptance");
}
You need to start the Selenium Server before you can use the client instance.
So before you create your DefaultSelenium instance, you can start the server by using a RemoteControlConfiguration object as an argument to the SeleniumServer constructor call, then boot the server using serverInstance.boot().
Something like
RemoteControlConfiguration rcc = new RemoteControlConfiguration()
//set whatever values you want your RC to start with: port, logOutFile, profile etc.
SeleniumServer ss = new SeleniumServer(rcc)
ss.boot()
Make sure you shut the server down when you are done with your tests, as sketched below.
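A minimal sketch of that lifecycle, wrapping the boot/stop pair so the port is always released; SeleniumServer.stop() is the counterpart to boot():

RemoteControlConfiguration rcc = new RemoteControlConfiguration();
rcc.setPort(4444); // the port the easyb/Selenium client expects
SeleniumServer ss = new SeleniumServer(rcc);
ss.boot();
try {
    // ... run your easyb/Selenium tests here ...
} finally {
    ss.stop(); // always release the port, even if tests fail
}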

How to connect to custom network when building image/container in Docker-Java API

I am using the Docker-Java API found here https://github.com/docker-java/docker-java.
I have a Dockerfile from which I have to build 3 containers. I am trying to automate the process of building the 3 images and the commands to go with them when running the shell inside. The 3 containers must be on the same local network in order for them to communicate. I am able to do this manually just fine.
So first, using the docker-java API, I am building a custom network using the following function:
private void createNetwork() {
    CreateNetworkResponse networkResponse = dockerClient.createNetworkCmd()
            .withName("ETH")
            .withDriver("bridge")
            .withAttachable(true)
            .exec();
    System.out.printf("Network %s created...\n", networkResponse.getId());
}
This works great, and if I run docker network ls, I can see the ETH network listed.
The next step is building the image. I am running the following function:
public String buildImage(String tag) {
    String imageID = dockerClient.buildImageCmd()
            .withDockerfile(new File("/Dockerfile"))
            .withPull(true)
            .withNoCache(false)
            .withTags(new HashSet<>(Collections.singletonList(tag)))
            .withNetworkMode("ETH")
            .exec(new BuildImageResultCallback())
            .awaitImageId();
    System.out.println("Built image: " + imageID);
    return imageID;
}
The image builds fine and I can see it when I run the docker images command in the terminal. I expected the image to be connected to the ETH network, but I do not see that.
I thought that maybe I have to connect to the network when creating the container instead, so I pass the same arguments I would use manually, through the following function:
private String createContainer(String name, String imageID, int port) {
    CreateContainerResponse container = dockerClient
            .createContainerCmd(name)
            .withImage(imageID)
            .withCmd("docker", "run", "--rm", "-i", "-p", port + ":" + port, "--net=ETH", name)
            .withExposedPorts(new ExposedPort(port))
            .exec();
    dockerClient.startContainerCmd(container.getId()).exec();
    return container.getId();
}
Unfortunately, when passing in the arguments like this, the built container does not show up in the ETH network when running the command docker network inspect ETH.
I'm not sure what I am doing wrong. If I build the image using the API and then manually run docker run --rm -it -p 8545:8545 --net=ETH miner_one, everything works fine. Any help would be greatly appreciated. Thank you!
The docker-java client supports a subset of the Docker Remote API. To connect to a network when you create the container, set the NetworkMode field (see HostConfig -> NetworkMode in the Container Create section):
Network mode to use for this container. Supported standard values are:
bridge, host, none, and container:<name|id>. Any other value is taken
as a custom network's name to which this container should connect.
Therefore, in order for the container to connect to the custom network, set the value of the network mode to ETH.
In Java, for older versions of the Docker-Java client, use the withNetworkMode() method:
CreateContainerResponse container = dockerClient
        .createContainerCmd(name)
        .withImage(imageID)
        .withNetworkMode("ETH")
        ...
In the latest version, the methods in CreateContainerCmd used to set the fields in HostConfig are deprecated. Use withHostConfig() instead:
CreateContainerResponse container = dockerClient.createContainerCmd(name)
        .withImage(imageID)
        .withHostConfig(HostConfig.newHostConfig().withNetworkMode("ETH"))
        ...
Here is a basic example:
List<Network> networks = dockerClient.listNetworksCmd().withNameFilter("ETH").exec();
if (networks.isEmpty()) {
    CreateNetworkResponse networkResponse = dockerClient
            .createNetworkCmd()
            .withName("ETH")
            .withAttachable(true)
            .withDriver("bridge").exec();
    System.out.printf("Network %s created...\n", networkResponse.getId());
}
CreateContainerResponse container = dockerClient
        .createContainerCmd("ubuntu")
        .withName("my-ubuntu")
        .withCmd("sleep", "10")
        .withHostConfig(HostConfig
                .newHostConfig()
                .withNetworkMode("ETH")
                .withAutoRemove(true))
        .exec();
String containerId = container.getId();
dockerClient.startContainerCmd(containerId).exec();
Network ethNetwork = dockerClient.inspectNetworkCmd()
        .withNetworkId("ETH")
        .exec();
Set<String> containerIds = ethNetwork.getContainers().keySet();
if (containerIds.contains(containerId)) {
    System.out.printf("Container with id:%s is connected to network %s%n", containerId, ethNetwork.getName());
}
It creates a network named ETH and a container my-ubuntu from an ubuntu image. The container is connected to the ETH network.
Hope this helps.
