Running an Oozie workflow using Java code


I'm new to Java and am having some trouble running an Oozie job from Java code. I can't figure out what is wrong with the code. Any help would be really appreciated. Here's my code:
import java.util.Properties;
import org.apache.oozie.client.OozieClient;
import org.apache.oozie.client.WorkflowJob;
public class oozie {
public static void main(String[] args) {
OozieClient wc = new OozieClient("http://host:11000/oozie");
Properties conf = wc.createConfiguration();
conf.setProperty(OozieClient.APP_PATH, "hdfs://cluster/user/apps/merge-psp-logs/merge-wf/workflow.xml");
conf.setProperty("jobTracker", "jobtracker.bigdata.com:8021");
conf.setProperty("nameNode", "hdfs://namenode.bigdata.com:8020");
conf.setProperty("queueName", "jobtracker.bigdata.com:8021");
conf.setProperty("appsRoot", "hdfs://namenode.bigdata.com:8020/user/workspace/apps");
conf.setProperty("appLibLoc", "hdfs://namenode.bigdata.com:8020/user/workspace/lib");
conf.setProperty("rawlogsLoc", "hdfs://namenode.bigdata.com:8020/user/workspace/");
conf.setProperty("mergedlogsLoc", "jobtracker.bigdata.com:8021");
try {
String jobId = wc.run(conf);
System.out.println("Workflow job submitted");
while (wc.getJobInfo(jobId).getStatus() == WorkflowJob.Status.RUNNING) {
System.out.println("Workflow job running ...");
Thread.sleep(10 * 1000);
}
System.out.println("Workflow job completed ...");
System.out.println(wc.getJobInfo(jobId));
} catch (Exception r) {
System.out.println("Errors");
}
}
}
I am able to launch the job from the command line, though.

Without any further information, I would say this is the probable cause of your runtime errors:
conf.setProperty(OozieClient.APP_PATH,
"hdfs://cluster/user/apps/merge-psp-logs/merge-wf/workflow.xml");
conf.setProperty("jobTracker", "jobtracker.bigdata.com:8021");
conf.setProperty("nameNode", "hdfs://namenode.bigdata.com:8020");
conf.setProperty("queueName", "jobtracker.bigdata.com:8021");
Unless you have two clusters, my guess is you meant the APP_PATH to point to the same HDFS instance as the one named in your nameNode property, in which case try:
conf.setProperty(OozieClient.APP_PATH,
"hdfs://namenode.bigdata.com:8020/user/apps/merge-psp-logs/merge-wf/workflow.xml");
You might also want to change the queueName to a real queue name (probably "default", unless jobtracker.bigdata.com:8021 is the actual name of your queue):
conf.setProperty("queueName", "default");
Aside from those observations, try and post the actual runtime error you're seeing.
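To catch this kind of mismatch before submitting, a small check can compare the authority (host and port) of the application path with that of the nameNode property. The following is only a sketch built on java.net.URI; the names HdfsAuthorityCheck and sameAuthority are made up for illustration and are not part of the Oozie client API.

```java
import java.net.URI;

// Hypothetical helper, not part of the Oozie client API.
final class HdfsAuthorityCheck {

    // Returns true when both URIs name the same host[:port] part.
    static boolean sameAuthority(String appPath, String nameNode) {
        String appAuthority = URI.create(appPath).getAuthority();
        String nameNodeAuthority = URI.create(nameNode).getAuthority();
        return appAuthority != null && appAuthority.equalsIgnoreCase(nameNodeAuthority);
    }

    public static void main(String[] args) {
        String nameNode = "hdfs://namenode.bigdata.com:8020";
        // The path from the question points at "cluster", not at the nameNode.
        System.out.println(sameAuthority(
                "hdfs://cluster/user/apps/merge-psp-logs/merge-wf/workflow.xml", nameNode)); // false
        // The corrected path uses the same host and port as nameNode.
        System.out.println(sameAuthority(
                "hdfs://namenode.bigdata.com:8020/user/apps/merge-psp-logs/merge-wf/workflow.xml", nameNode)); // true
    }
}
```

A false result does not prove the configuration is wrong (two clusters, or an HA nameservice alias, are legitimate setups), but it is a cheap hint about where to look first.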

Related

Broadleaf Commerce Embedded Solr cannot run with root user

I downloaded a fresh Broadleaf Commerce 6.1 and ran it successfully on my MacBook with java -javaagent:./admin/target/agents/spring-instrument.jar -jar admin/target/admin.jar. But on CentOS 7, when I run sudo java -javaagent:./admin/target/agents/spring-instrument.jar -jar admin/target/admin.jar, I get the following error:
2020-10-12 13:20:10.838 INFO 2481 --- [ main] c.b.solr.autoconfigure.SolrServer : Syncing solr config file: jar:file:/home/mynewuser/seafood-broadleaf/admin/target/admin.jar!/BOOT-INF/lib/broadleaf-boot-starter-solr-2.2.1-GA.jar!/solr/standalone/solrhome/configsets/fulfillment_order/conf/solrconfig.xml to: /tmp/solr-7.7.2/solr-7.7.2/server/solr/configsets/fulfillment_order/conf/solrconfig.xml
*** [WARN] *** Your Max Processes Limit is currently 62383.
It should be set to 65000 to avoid operational disruption.
If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
WARNING: Starting Solr as the root user is a security risk and not considered best practice. Exiting.
Please consult the Reference Guide. To override this check, start with argument '-force'
2020-10-12 13:20:11.021 ERROR 2481 --- [ main] c.b.solr.autoconfigure.SolrServer : Problem starting Solr
Here is the source code of the Solr configuration. I believe this is the place to change the configuration so that Solr is started with the -force argument programmatically.
package com.community.core.config;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.broadleafcommerce.core.search.service.SearchService;
import org.broadleafcommerce.core.search.service.solr.SolrConfiguration;
import org.broadleafcommerce.core.search.service.solr.SolrSearchServiceImpl;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.stereotype.Component;
/**
 * @author Phillip Verheyden (phillipuniverse)
 */
@Component
public class ApplicationSolrConfiguration {

    @Value("${solr.url.primary}")
    protected String primaryCatalogSolrUrl;

    @Value("${solr.url.reindex}")
    protected String reindexCatalogSolrUrl;

    @Value("${solr.url.admin}")
    protected String adminCatalogSolrUrl;

    @Bean
    public SolrClient primaryCatalogSolrClient() {
        return new HttpSolrClient.Builder(primaryCatalogSolrUrl).build();
    }

    @Bean
    public SolrClient reindexCatalogSolrClient() {
        return new HttpSolrClient.Builder(reindexCatalogSolrUrl).build();
    }

    @Bean
    public SolrClient adminCatalogSolrClient() {
        return new HttpSolrClient.Builder(adminCatalogSolrUrl).build();
    }

    @Bean
    public SolrConfiguration blCatalogSolrConfiguration() throws IllegalStateException {
        return new SolrConfiguration(primaryCatalogSolrClient(), reindexCatalogSolrClient(), adminCatalogSolrClient());
    }

    @Bean
    protected SearchService blSearchService() {
        return new SolrSearchServiceImpl();
    }
}

Let me preface this by saying you would be better off simply not starting the application as root. If you are in Docker, you can use the USER command to switch to a non-root user.
The Solr server startup in Broadleaf Community is done programmatically via the broadleaf-boot-starter-solr dependency. This is the wrapper around Solr that ties it to the Spring lifecycle. All of the real magic happens in the com.broadleafcommerce.solr.autoconfigure.SolrServer class.
In that class, you will see a startSolr() method. This method is what adds startup arguments to Solr.
In your case, you will need to mostly copy this method wholesale and use cmdLine.addArgument(...) to add additional arguments. Example:
class ForceStartupSolrServer extends SolrServer {
public ForceStartupSolrServer(SolrProperties props) {
super(props);
}
protected void startSolr() {
if (!isRunning()) {
if (!downloadSolrIfApplicable()) {
throw new IllegalStateException("Could not download or expand Solr, see previous logs for more information");
}
stopSolr();
synchConfig();
{
CommandLine cmdLine = new CommandLine(getSolrCommand());
cmdLine.addArgument("start");
cmdLine.addArgument("-p");
cmdLine.addArgument(Integer.toString(props.getPort()));
// START MODIFICATION
cmdLine.addArgument("-force");
// END MODIFICATION
Executor executor = new DefaultExecutor();
PumpStreamHandler streamHandler = new PumpStreamHandler(System.out);
streamHandler.setStopTimeout(1000);
executor.setStreamHandler(streamHandler);
try {
executor.execute(cmdLine);
created = true;
checkCoreStatus();
} catch (IOException e) {
LOG.error("Problem starting Solr", e);
}
}
}
}
}
Then create a @Configuration class to override the blAutoSolrServer bean created by SolrAutoConfiguration (note the specific package requirement for org.broadleafoverrides.config):
package org.broadleafoverrides.config;
@Configuration
public class OverrideConfiguration {
    @Bean
    public ForceStartupSolrServer blAutoSolrServer(SolrProperties props) {
        return new ForceStartupSolrServer(props);
    }
}

URI Schema: Infinite command prompts are opening

I went through the following doc center and tried to create my own URI schema myDocs:
https://msdn.microsoft.com/en-us/library/aa767914(v=vs.85).aspx
The following is my Java program. It takes a command-line argument and opens that URL in the browser.
import java.awt.Desktop;
import java.io.IOException;
public class URIOpen {
public static void main(String args[]) {
if (args.length == 0) {
return;
}
String uri = args[0];
try {
Desktop.getDesktop().browse(java.net.URI.create(uri));
} catch (IOException e) {
System.out.println(e.getMessage());
}
}
}
I updated the (Default) value field of the command key like below.
"C:\Program Files (x86)\Java\jdk1.8.0_102\bin\java" -cp "C:\Users\Krishna\Documents\Study\Miscellaneous\examples" "URIOpen" "%1"
When I try to run the command myDocs:http://google.com, I end up opening infinite command prompts.
The following is my URI schema entry structure in the registry. Any help on this?

Your solution ends up opening infinite command prompts because:
1. You registered your custom URIOpen class to be launched by the system whenever it has to handle a URI with the myDocs: scheme.
2. When URIOpen executes the line Desktop.getDesktop().browse(java.net.URI.create(uri)); the system once again receives a URI with the same scheme (myDocs:), so it launches a new command to execute your class, again and again.
You probably want to change your code to something like this:
try {
java.net.URI theURI = java.net.URI.create(uri);
// System.out.println(theURI.getScheme()); => myDocs
String uriBrowsablePart = theURI.getRawSchemeSpecificPart();
// System.out.println(uriBrowsablePart); => http://google.com
Desktop.getDesktop().browse(java.net.URI.create(uriBrowsablePart));
// the above statement will open default browser on http://google.com
} catch (IOException e) {
System.out.println(e.getMessage());
}
Try replacing your try-catch block with the one above and see if it works as required.
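The unwrapping step can also be guarded so that the handler never passes on a URI that would be routed straight back to itself. The sketch below is one way to do that with java.net.URI; the class name SchemeUnwrapper and the method unwrap are hypothetical, and the scheme name is passed in rather than read from the registry.

```java
import java.net.URI;

// Hypothetical helper illustrating the unwrap-and-guard idea.
final class SchemeUnwrapper {

    // Strips the custom scheme and returns the inner URI. Refuses to return
    // a URI that has no scheme or that uses the custom scheme again, because
    // the latter would be dispatched back to this same handler.
    static URI unwrap(String raw, String customScheme) {
        URI outer = URI.create(raw);
        if (!customScheme.equalsIgnoreCase(outer.getScheme())) {
            throw new IllegalArgumentException("Not a " + customScheme + ": URI: " + raw);
        }
        URI inner = URI.create(outer.getRawSchemeSpecificPart());
        if (inner.getScheme() == null || customScheme.equalsIgnoreCase(inner.getScheme())) {
            throw new IllegalArgumentException("Refusing to open: " + inner);
        }
        return inner;
    }

    public static void main(String[] args) {
        // Prints http://google.com
        System.out.println(unwrap("myDocs:http://google.com", "myDocs"));
    }
}
```

The returned URI can then be handed to Desktop.getDesktop().browse(...) as in the snippet above; a nested myDocs:myDocs:... argument is rejected instead of spawning another process.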

java.lang.ClassNotFoundException: org.zeromq.ZContext when trying to start windows service

I have a basic Maven Java app that depends on JeroMQ, which is a full Java implementation of ZeroMQ. Since I also need to wrap this Java app as a Windows service, I chose to use Apache Commons Daemon and, specifically, followed this excellent example: http://web.archive.org/web/20090228071059/http://blog.platinumsolutions.com/node/234 Here's what the Java code looks like:
package com.org.SubscriberACD;
import java.nio.charset.Charset;
import org.zeromq.ZContext;
import org.zeromq.ZMQ;
import org.zeromq.ZMQ.Socket;
/**
* JeroMQ Subscriber for Apache Commons Daemon
*
*/
public class Subscriber
{
/**
* Single static instance of the service class
*/
private static Subscriber subscriber_service = new Subscriber();
/**
* Static method called by prunsrv to start/stop
* the service. Pass the argument "start"
* to start the service, and pass "stop" to
* stop the service.
*/
public static void windowsService(String args[]) {
String cmd = "start";
if(args.length > 0) {
cmd = args[0];
}
if("start".equals(cmd)) {
subscriber_service.start();
}
else {
subscriber_service.stop();
}
}
/**
* Flag to know if this service
* instance has been stopped.
*/
private boolean stopped = false;
/**
* Start this service instance
*/
public void start() {
stopped = false;
System.out.println("My Service Started "
+ new java.util.Date());
ZContext context = new ZContext();
Socket subscriber = context.createSocket(ZMQ.SUB);
subscriber.connect("tcp://localhost:5556");
String subscription = "MySub";
subscriber.subscribe(subscription.getBytes(Charset.forName("UTF-8")));
while(!stopped) {
System.out.println("My Service Executing "
+ new java.util.Date());
String topic = subscriber.recvStr();
if (topic == null)
break;
String data = subscriber.recvStr();
assert(topic.equals(subscription));
System.out.println(data);
synchronized(this) {
try {
this.wait(60000); // wait 1 minute
}
catch(InterruptedException ie){}
}
}
subscriber.close();
context.close();
context.destroy();
System.out.println("My Service Finished "
+ new java.util.Date());
}
/**
* Stop this service instance
*/
public void stop() {
stopped = true;
synchronized(this) {
this.notify();
}
}
}
Then I created the following folder structure just like the tutorial suggested:
E:\SubscriberACD
\bin
\subscriberACD.exe
\subscriberACDw.exe
\classes
\com\org\SubscriberACD\Subscriber.class
\logs
I then navigated to the bin directory and issued the following command to install the service:
subscriberACD.exe //IS//SubscriberACD --Install=E:\SubscriberACD\bin\subscriberACD.exe --Description="Subscriber using Apache Commons Daemon" --Jvm=c:\glassfish4\jdk7\jre\bin\server\jvm.dll --Classpath=E:\SubscriberACD\classes --StartMode=jvm --StartClass=com.org.SubscriberACD.Subscriber --StartMethod=windowsService --StartParams=start --StopMode=jvm --StopClass=com.org.SubscriberACD.Subscriber --StopMethod=windowsService --StopParams=stop --LogPath=E:\SubscriberACD\logs --StdOutput=auto --StdError=auto
The install works fine since I can see it in Windows Services. However, when I try to start it from there, I get an error saying "Windows cannot start the SubscriberACD on Local Computer".
I checked the error logs and see the following entry:
2016-04-14 14:38:40 Commons Daemon procrun stderr initialized
Exception in thread "main" java.lang.NoClassDefFoundError: org/zeromq/ZContext
at com.org.SubscriberACD.Subscriber.start(Subscriber.java:57)
at com.org.SubscriberACD.Subscriber.windowsService(Subscriber.java:33)
Caused by: java.lang.ClassNotFoundException: org.zeromq.ZContext
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
... 2 more
It's worth noting that JeroMQ is currently a jar under my Maven Dependencies. I configured it from my POM.xml file.
I think the problem might be that my service doesn't have access to the JeroMQ jar that is under my Maven Dependencies. My assumption is that the class file doesn't contain the dependencies. So what I tried was exporting my entire project as a jar and stuck that baby under E:\SubscriberACD\classes\
So my structure now looks like this:
E:\SubscriberACD
\bin
\subscriberACD.exe
\subscriberACDw.exe
\classes
\com\org\SubscriberACD\
\Subscriber.class
\Subscriber.jar
\logs
However, that didn't fix the issue. Can anyone shed some light on this?

Change your --Classpath argument to :
--Classpath=E:\SubscriberACD\classes\your-jar-filename.jar
You almost certainly have other jarfiles you'll need, so just append them to the end of the --Classpath using ; (semi-colon) delimiters...
--Classpath=E:\SubscriberACD\classes\your-jar-filename.jar;e:\other-dir\classes\some-other.jar;etc...
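If there are many dependency jars, typing the list by hand gets error-prone. As a rough sketch, the value can be generated from a directory of jars (for example, one filled by mvn dependency:copy-dependencies). The class name ClasspathBuilder and the choice of directory are assumptions for illustration only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: builds a ';'-delimited classpath from the jars in one directory.
final class ClasspathBuilder {

    static String jarClasspath(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(p -> p.getFileName().toString().endsWith(".jar"))
                    .map(p -> p.toAbsolutePath().toString())
                    .sorted()
                    .collect(Collectors.joining(";"));
        }
    }

    public static void main(String[] args) throws IOException {
        // The directory is an assumption; pass your own as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("--Classpath=" + jarClasspath(dir));
    }
}
```

The printed value can be pasted into the install command in place of the existing --Classpath argument. Files that do not end in .jar are ignored, and subdirectories are not searched.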

Java opening File Streams in one class and closing/deletion of file in another class

I want to delete a file that has been opened and written to, but not closed. Please refer to the code below:
Class A (can't be changed):
import java.io.FileOutputStream;
public class A {
public void run(String file) throws Exception {
FileOutputStream s = new FileOutputStream(file);
}
}
Class B:
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
public class B {
public static void main(String[] args) throws Exception {
String path = "D:\\CONFLUX_HOME\\TestClient\\Maps\\test\\newTest.txt";
A a = new A();
a.run(path);
File f = new File(path);
Files.delete(Paths.get(f.getAbsolutePath()));
}
}
Class A just opens the stream without closing the file.
Class B calls A's run method and then tries to delete the file.
Since the file is still open, I'm unable to delete it.
The error is:
The process cannot access the file because it is being used by another process.
The actual scenario is:
We are loading the jars dynamically. Classes inside jar are creating the file. When there is an exception, a file gets created whose size will be 0 bytes. We need to delete this file. Since the file is not closed during the exception, we can't delete the file.
We could fix the issue if we could close the streams in the jar classes, but we can't modify the jars that create the files as they are client specific jars.
Please suggest how to delete the opened file, without modifying the code in class A.

Make sure you close the file, even if there was an exception when writing to it.
E.g.
public void run(String file) throws Exception {
    // try-with-resources closes the stream even if an exception is thrown,
    // and avoids calling close() on a null reference if the open itself fails
    try (FileOutputStream s = new FileOutputStream(file)) {
        // write to s here
    }
}

You have to close the file before any delete operation: leaving it open is bad practice, and it leaks resources, because the file handle stays open.

If you are using Tomcat, it is possible to set antiResourceLocking and antiJARLocking in $CATALINA_HOME/conf/context.xml for Windows:
<Context antiJARLocking="true" antiResourceLocking="true" >
Important note:
The antiResourceLocking option can stop JSPs from being reloaded when they are edited, so a redeploy is then required.
Read more about this option:
http://tomcat.apache.org/tomcat-7.0-doc/config/context.html
antiResourceLocking:
If true, Tomcat will prevent any file locking. This will significantly impact startup time of applications, but allows full webapp hot deploy and undeploy on platforms or configurations where file locking can occur. If not specified, the default value is false.

Pass the resource as a parameter, so that it becomes the caller's responsibility to clean up the resource:
public void run(FileOutputStream stream) throws Exception {
    ...
}
Caller:
try (FileOutputStream stream = new FileOutputStream(path)) {
    A a = new A();
    a.run(stream);
} catch (Exception e) {
    // exception handling
}
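To make the caller-owns-the-stream idea concrete for the failure case described in the question, here is a minimal sketch. It assumes the writing code can accept a stream, which is not true of the unmodifiable class A; CallerOwnsStream, StreamTask and runAndCleanUp are hypothetical names. It relies on the fact that try-with-resources closes the stream before the catch block runs, so the delete inside the catch is not blocked by an open handle.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: the caller opens, closes and, on failure, deletes the file.
final class CallerOwnsStream {

    // Stand-in for the code that writes to the file.
    interface StreamTask {
        void run(OutputStream out) throws IOException;
    }

    static void runAndCleanUp(Path path, StreamTask task) throws IOException {
        try (OutputStream out = new FileOutputStream(path.toFile())) {
            task.run(out);
        } catch (IOException e) {
            // The stream is already closed at this point, so the delete can succeed.
            Files.deleteIfExists(path);
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("newTest", ".txt");
        try {
            runAndCleanUp(tmp, out -> { throw new IOException("simulated failure"); });
        } catch (IOException e) {
            System.out.println("file removed: " + Files.notExists(tmp));
        }
    }
}
```

When the task succeeds, the file is left in place with its contents; only a failed write triggers the cleanup.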

Updated according to OPs comment.
Another approach could be to subclass A and override run().
public static void main(String[] args) throws Exception {
    String path = "D:\\CONFLUX_HOME\\TestClient\\Maps\\test\\newTest.txt";
    A a = new A() {
        @Override
        public void run(String file) throws Exception {
            FileOutputStream s = new FileOutputStream(file);
            s.close();
        }
    };
    a.run(path);
    File f = new File(path);
    Files.delete(Paths.get(f.getAbsolutePath()));
    System.out.println("foo");
}

I don't think you'll find a pure java solution to this problem. One option is to install Unlocker (being careful to hit "Skip" on all the junkware) and invoke it from your code.
If you have UAC enabled, you'll also need to be running your java in an elevated process (e.g. start command prompt as Administrator). Then, assuming unlocker is in C:\Program Files\Unlocker:
Process p = new ProcessBuilder("c:\\Program Files\\Unlocker\\Unlocker.exe",path,"-s").start();
p.waitFor();
And after that you can delete the file as before. Or you could use "-d" instead of "-s" and Unlocker will delete the file for you.

Kafka Storm Integration using Kafka Spout

I am using KafkaSpout. Please find the test program below.
I am using Storm 0.8.1. The MultiScheme class is only present from Storm 0.8.2, which I will be moving to. I just want to know how the earlier versions worked simply by instantiating the StringScheme class. Where can I download earlier versions of the Kafka spout? I doubt that would be a better alternative than working on Storm 0.8.2, though.
When I run the code given below on the Storm cluster (i.e. when I submit my topology), I get the following error. This happens when the scheme part is commented out; otherwise I get a compiler error, as the class is not there in 0.8.1:
java.lang.NoClassDefFoundError: backtype/storm/spout/MultiScheme
at storm.kafka.TestTopology.main(TestTopology.java:37)
Caused by: java.lang.ClassNotFoundException: backtype.storm.spout.MultiScheme
In the code given below you will find the spoutConfig.scheme = new StringScheme(); line commented out. I get a compiler error if I don't comment out that line, which is only natural, as there is no such constructor there. Also, when I instantiate MultiScheme I get an error, as I don't have that class in 0.8.1.
public class TestTopology {
public static class PrinterBolt extends BaseBasicBolt {
public void declareOutputFields(OutputFieldsDeclarer declarer) {
}
public void execute(Tuple tuple, BasicOutputCollector collector) {
System.out.println(tuple.toString());
}
}
public static void main(String [] args) throws Exception {
List<HostPort> hosts = new ArrayList<HostPort>();
hosts.add(new HostPort("127.0.0.1",9092));
LocalCluster cluster = new LocalCluster();
TopologyBuilder builder = new TopologyBuilder();
SpoutConfig spoutConfig = new SpoutConfig(new KafkaConfig.StaticHosts(hosts, 1), "test", "/zkRootStorm", "STORM-ID");
spoutConfig.zkServers=ImmutableList.of("localhost");
spoutConfig.zkPort=2181;
//spoutConfig.scheme=new StringScheme();
spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());
builder.setSpout("spout",new KafkaSpout(spoutConfig));
builder.setBolt("printer", new PrinterBolt())
.shuffleGrouping("spout");
Config config = new Config();
cluster.submitTopology("kafka-test", config, builder.createTopology());
Thread.sleep(600000);
}
}

I had the same problem. Finally resolved it, and I put the complete running example up on github.
You are welcome to check it out here >
https://github.com/buildlackey/cep
(click on the storm+kafka directory for a sample program that should get you up and running).

We had a similar issue.
Our solution:
Open pom.xml
Change scope from provided to <scope>compile</scope>
If you want to know more about dependency scopes, check the Maven documentation on dependency scopes.
