I am executing MapReduce code by calling a Driver class from an Oozie java action. The MapReduce job runs successfully and I get the expected output. However, the log statements in my driver class are not shown in the Oozie job logs. I am using log4j for logging in my driver class.
Do I need to make some configuration changes to see the logs? Here is a snippet of my workflow.xml:
<action name="MyAppDriver">
<java>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/home/hadoop/work/surjan/outpath/20160430" />
</prepare>
<main-class>com.surjan.driver.MyAppMainDriver</main-class>
<arg>/home/hadoop/work/surjan/PoC/wf-app-dir/MyApp.xml</arg>
<job-xml>/home/hadoop/work/surjan/PoC/wf-app-dir/AppSegmenter.xml</job-xml>
</java>
<ok to="sendEmailSuccess"/>
<error to="sendEmailKill"/>
</action>
The logs go into YARN.
In my case I have a custom java action. If you look in the YARN UI, you have to dig into the mapper task that the java action runs in. In my case the Oozie workflow item was 0070083-200420161025476-oozie-xxx-W, and oozie job -log ${wf_id} showed that the java action 0070083-200420161025476-oozie-xxx-W#java1 failed with a Java exception, with no context around it. In the Oozie web UI, only the "Job Error Log" is populated, matching what is shown on the command line; the actual application logging isn't shown. The oozie job -info ${wf_id} status shows failed:
Actions
------------------------------------------------------------------------------------------------------------------------------------
ID Status Ext ID Ext Status Err Code
------------------------------------------------------------------------------------------------------------------------------------
0070083-200420161025476-oozie-xxx-W#:start: OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
0070083-200420161025476-oozie-xxx-W#java1 ERROR job_1587370152078_1090 FAILED/KILLED JA018
------------------------------------------------------------------------------------------------------------------------------------
0070083-200420161025476-oozie-xxx-W#fail OK - OK E0729
------------------------------------------------------------------------------------------------------------------------------------
You can find the actual YARN application in the YARN Resource Manager web UI (not in the "yarn logs" web console, which holds YARN's own logs, not the logs of what it is hosting). You can easily grep for the correct application on the command line by looking for the Oozie workflow job ID:
user@host:~/apidemo$ yarn application --list -appStates FINISHED | grep 0070083-200420161025476
20/04/22 20:42:12 INFO client.AHSProxy: Connecting to Application History server at your.host.com/130.178.58.221:10200
20/04/22 20:42:12 INFO client.RequestHedgingRMFailoverProxyProvider: Looking for the active RM in [rm1, rm2]...
20/04/22 20:42:12 INFO client.RequestHedgingRMFailoverProxyProvider: Found active RM [rm2]
application_1587370152078_1090 oozie:launcher:T=java:W=java-whatever-sql-task-wf:A=java1:ID=0070083-200420161025476-oozie-xxx-W MAPREDUCE kerberos-id default FINISHED SUCCEEDED 100% https://your.host.com:8090/jobhistory/job/job_1587370152078_1090
user@host:~/apidemo$
Note that Oozie says things failed, yet the state of the YARN application is "FINISHED" and its final status is "SUCCEEDED", which seems strange. This is presumably because YARN is tracking the Oozie launcher job, which itself ran to completion, even though the code it launched failed.
Helpfully, that command-line output also shows the URL to the job history. That opens the web page for the parent application that ran your Java code. If you click the little logs link on that page, you see some logs. Looking closely, the page says it ran one operation of task type "map". Clicking the link in that row takes you to the actual task, in my case task_1587370152078_1090_m_000000. You have to click into that to see the first attempt, attempt_1587370152078_1090_m_000000_0, and then on the right-hand side there is a tiny logs link which shows more specific logging.
You can also ask yarn for the logs once you know the application id:
yarn logs -applicationId application_1587370152078_1090
That showed me very detailed logs, including the custom Java logging that I couldn't easily see in the console, where I could finally see what was really going on.
Note that if you are writing custom code, you want to let YARN set the log4j properties file rather than supplying your own version, so that the YARN tools can find your logs. The code will be run with this flag:
-Dlog4j.configuration=container-log4j.properties
The detailed logs show all the jars that are added to the classpath. You should ensure that your custom code uses the same jars and log4j version.
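As an illustration, here is a minimal sketch of a custom java action that simply uses log4j and inherits the container-supplied configuration (the class name is mine, not from the original job):
import org.apache.log4j.Logger;

public class MyJavaAction {
    // With container-log4j.properties in effect, these messages end up in the
    // container logs that "yarn logs -applicationId ..." retrieves.
    private static final Logger LOG = Logger.getLogger(MyJavaAction.class);

    public static void main(String[] args) {
        LOG.info("Java action started with " + args.length + " argument(s)");
        // ... action logic ...
    }
}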
I've been getting the following error page when visiting my app:
Application error
An error occurred in the application and your page could not be served. If you are the application owner, check your logs for details. You can do this from the Heroku CLI with the command
heroku logs --tail
However, heroku logs --tail does not return anything. If I go to the dashboard through the web (https://dashboard.heroku.com/apps/app-name/logs), the logs window keeps loading, but nothing is shown.
I have nothing to go on, and Heroku won't accept a support ticket for a free app. Does anyone know any more steps I can take to trace the error?
Additional info:
The build logs all look fine:
BUILD SUCCESSFUL in 49s
5 actionable tasks: 5 executed
Discovering process types
Procfile declares types -> (none)
Default types for buildpack -> web
Compressing...
Done: 85.2M
Launching...
Released v3
https://app-name.herokuapp.com/ deployed to Heroku
I am creating a Spring Java API; it works fine when I run it locally. If I try to run it through heroku local, I do get an error:
[FAIL] No Procfile and no package.json file found in Current Directory - See run --help
I had the same issue. I just added a system.properties file to the root folder of the Spring app; this file contained one line:
java.runtime.version=11
See also: the Heroku Procfile documentation and the Spring documentation for Heroku deployment.
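Relatedly, the heroku local failure above complains about a missing Procfile. A minimal sketch of one for a Spring Boot fat jar (assuming a Gradle build; the jar path is illustrative):
web: java -Dserver.port=$PORT -jar build/libs/app-name-0.0.1-SNAPSHOT.jar
Heroku injects the PORT environment variable, so the app must bind to it.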
I'm usually running stuff from JUnit but I've also tried running from main and it makes no difference.
I have read nearly two dozen SO questions, blog posts, and articles and tried almost everything to get Spark to stop logging so much.
Things I've tried:
A log4j.properties in the resources folder (in both src and test)
Using spark-submit to add a log4j.properties, which failed with "error: missing application resources" (see the sketch after this list)
Logger.getLogger("com").setLevel(Level.WARN);
Logger.getLogger("org").setLevel(Level.WARN);
Logger.getLogger("akka").setLevel(Level.WARN);Logger.getRootLogger().setLevel(Level.WARN);spark.sparkContext().setLogLevel("WARN");
In another project I got the logging to be quiet with:
Logger.getLogger("org").setLevel(Level.WARN);
Logger.getLogger("akka").setLevel(Level.WARN);
but it is not working here.
How I'm creating my SparkSession:
SparkSession spark = SparkSession
    .builder()
    .appName("RS-LDA")
    .master("local")
    .getOrCreate();
Let me know if you'd like to see more of my code.
Thanks
I'm using IntelliJ and Spark, and this works for me:
Logger.getRootLogger().setLevel(Level.ERROR);
You could change the Spark log configuration too:
$ cd SPARK_HOME/conf
$ gedit log4j.properties.template
# find these lines in the file
# Set everything to be logged to the console
log4j.rootCategory=INFO, console
and change it to ERROR:
log4j.rootCategory=ERROR, console
This file has other options you can change too:
# Set the default spark-shell log level to WARN. When running the spark-shell, the
# log level for this class is used to overwrite the root logger's log level, so that
# the user can have different defaults for the shell and regular Spark apps.
log4j.logger.org.apache.spark.repl.Main=WARN
# Settings to quiet third party logs that are too verbose
.....
And finally, rename the log4j.properties.template file:
$ mv log4j.properties.template log4j.properties
You can follow this link for further configuration:
Logging in Spark with Log4j
or this one too:
Logging in Spark with Log4j. How to customize the driver and executors for YARN cluster mode.
It might be an old question, but I just ran into the same problem.
To fix it, what I did was:
1. Add private static Logger log = LoggerFactory.getLogger(Spark.class); as a field of the class.
2. Call spark.sparkContext().setLogLevel("WARN"); after creating the Spark session.
Step 2 will only work after step 1.
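Putting both steps together, a minimal sketch (assuming slf4j and Spark are on the classpath; the class name Spark mirrors the field declaration above):
import org.apache.spark.sql.SparkSession;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Spark {
    // Step 1: declare the logger as a field of the class.
    private static final Logger log = LoggerFactory.getLogger(Spark.class);

    public static void main(String[] args) {
        SparkSession spark = SparkSession
                .builder()
                .appName("RS-LDA")
                .master("local")
                .getOrCreate();
        // Step 2: raise the log level only after the session exists.
        spark.sparkContext().setLogLevel("WARN");
        log.warn("Spark session ready; Spark internals now log at WARN");
    }
}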
I am trying to run a simple Java Spark job using Oozie on an EMR cluster. The job just takes files from an input path, does few basic actions on it and places the result in different output path.
When I try to run it from command line using spark-submit as shown below, it works fine:
spark-submit --class com.someClassName --master yarn --deploy-mode cluster /home/hadoop/some-local-path/my-jar-file.jar yarn s3n://input-path s3n://output-path
Then I set up the same thing in an Oozie workflow. However, when run from there, the job always fails. The stdout log contains this line:
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Attempt to add (hdfs://[emr-cluster]:8020/user/oozie/workflows/[WF-Name]/lib/[my-jar-file].jar) multiple times to the distributed cache.
java.lang.IllegalArgumentException: Attempt to add (hdfs://[emr-cluster]:8020/user/oozie/workflows/[WF-Name]/lib/[my-jar-file].jar) multiple times to the distributed cache.
I found a KB note and another question here on Stack Overflow that deal with a similar error. But in those cases, the job was failing due to an internal JAR file, not the one the user passes in to run. Nonetheless, I tried those resolution steps and removed the JAR files shared between Spark and Oozie in the sharelib, deleting a few files from "/user/oozie/share/lib/lib_*/spark". Unfortunately, that did not solve the problem either.
Any ideas on how to debug this issue?
So we finally figured out the issue - at least in our case.
While creating the workflow using Hue, when a Spark action is added, it prompts by default for "File" and "Jar/py name". We provided the path to the JAR file we wanted to run and the name of that JAR file in those fields. The final XML it created was as follows:
<action name="spark-210e">
<spark xmlns="uri:oozie:spark-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<master>yarn</master>
<mode>cluster</mode>
<name>CleanseData</name>
<class>com.data.CleanseData</class>
<jar>JCleanseData.jar</jar>
<spark-opts>--driver-memory 2G --executor-memory 2G --num-executors 10 --files hive-site.xml</spark-opts>
<arg>yarn</arg>
<arg>[someArg1]</arg>
<arg>[someArg2]</arg>
<file>lib/JCleanseData.jar#JCleanseData.jar</file>
</spark>
<ok to="[nextAction]"/>
<error to="Kill"/>
</action>
The default <file> tag it generated was causing the issue in our case: JARs in the workflow's lib/ directory are added to the distributed cache automatically, so the <file> tag ended up adding the same JAR a second time.
So we removed it and edited the definition as seen below, and that worked. Note the change to the <jar> tag as well.
<action name="spark-210e">
<spark xmlns="uri:oozie:spark-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<master>yarn</master>
<mode>cluster</mode>
<name>CleanseData</name>
<class>com.data.CleanseData</class>
<jar>hdfs://path/to/JCleanseData.jar</jar>
<spark-opts>--driver-memory 2G --executor-memory 2G --num-executors 10 --files hive-site.xml</spark-opts>
<arg>yarn</arg>
<arg>[someArg1]</arg>
<arg>[someArg2]</arg>
</spark>
<ok to="[nextAction]"/>
<error to="Kill"/>
</action>
PS: We had a similar issue with Hive actions too. The hive-site.xml file we were supposed to pass with the Hive action - which created a <job-xml> tag - was also causing issues. So we removed it, and it worked as expected.
I have a Java application (a standard Spring Boot app from the default tutorial: https://spring.io/guides/gs/spring-boot-for-azure/ ) that I "successfully" deploy to my WebApp (created during deployment) via the Eclipse/Maven plugin goal azure-webapp:deploy.
Once deployed, the files are inside the WebApp; I can see them. If start-up is successful, I get a running application, but if it is not, I do not know how to troubleshoot. I don't know where to find error logs, what caused the problem, and consequently how to solve it.
As an example of how to make it fail, add this line:
throw new RuntimeException("Doomed to fail");
I tried enabling logs from the "Diagnostic logs" tab and expected to see them under LogFiles/Applications, but that folder remains empty.
How do I troubleshoot a Java application that fails to start in Azure WebApps?
Edit: an additional example of an exception to troubleshoot:
public static void main(String[] args) {
    throw new RuntimeException("start failure #21");
    //SpringApplication.run(Application.class, args);
}
It sounds like you followed the Spring Boot tutorial Deploying a Spring Boot app to Azure to build the GitHub project microsoft/gs-spring-boot and deploy it to Azure, but it does not work.
Here are my steps; I also followed the tutorial, but deployed in my own way.
I created a directory SpringBoot on my local machine, then ran the commands cd SpringBoot and git clone https://github.com/microsoft/gs-spring-boot.
Then I built it via the commands cd gs-spring-boot/complete and mvnw clean package.
Note: I reviewed the sections of the tutorial under Create a sample Spring Boot web app, which seem to be written for Linux, but the web.config file in microsoft/gs-spring-boot/complete is intended for an Azure WebApp for Windows. However, there are no comments describing whether the deployment target is an Azure WebApp for Windows or Linux.
So I used my existing WebApp for Windows to test my deployment. I opened the Kudu console in a browser via the URL https://<webapp name>.scm.azurewebsites.net/DebugConsole and dragged the files complete/web.config and complete/target/gs-spring-boot-0.1.0.jar into site/wwwroot. Then I started my webapp, and it worked fine.
Note: Please check the JAVA_HOME environment variable configured on Azure via the command echo %JAVA_HOME%.
If it is not set, you need to set the Java runtime in the Application settings tab of the Azure portal.
Or you can configure the web.config file to replace the reference to %JAVA_HOME% with an existing Java runtime installed under D:\Program Files\Java on the Azure WebApp, as below.
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<system.webServer>
<handlers>
<add name="httpPlatformHandler" path="*" verb="*" modules="httpPlatformHandler" resourceType="Unspecified"/>
</handlers>
<!-- <httpPlatform processPath="%JAVA_HOME%\bin\java.exe" -->
<httpPlatform processPath="D:\Program Files\Java\jre1.8.0_181\bin\java.exe"
arguments="-Djava.net.preferIPv4Stack=true -Dserver.port=%HTTP_PLATFORM_PORT% -jar "%HOME%\site\wwwroot\gs-spring-boot-0.1.0.jar"">
</httpPlatform>
</system.webServer>
</configuration>
I didn't manage to find logs on a Windows-based machine, but if you enable logs on a Linux-based machine, you will see them in the "Diagnostic logs" output. There is a catch, though.
There is a 230-second timeout. It will wait the full timeout before producing anything in the log file, after which you can access the logs via the log file or through "Diagnostic logs". Make sure to enable logging before you start the application. This applies to Linux-based machines; I don't know whether it applies to Windows-based machines.
Then it waits for an answer trigger. The answer trigger, as it turns out, is a phrase in the console output: "Application is started in X seconds". I increased the timeout to 500 seconds because, although the app starts in 60 seconds on my local machine, it took 430 seconds on the remote Linux-based machine of Microsoft Azure.*
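If you need to raise that limit, as far as I know the Linux App Service start-time limit is controlled by the WEBSITES_CONTAINER_START_TIME_LIMIT app setting; a sketch using the Azure CLI (the app and resource-group names are placeholders):
az webapp config appsettings set \
  --name <app-name> \
  --resource-group <resource-group> \
  --settings WEBSITES_CONTAINER_START_TIME_LIMIT=500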
Second, I changed the name of my main class from "GameStart" to "Application", and after that it actually caught the trigger and the application started. Nowhere in the manuals did I find the "no logs until timeout" and "trigger phrase" behaviors mentioned.
PS: For reference, it took me 20 minutes to upload the application and 5 minutes to start it. I was using a Central US server, because Central EU did not work out for me (it took even longer), although I'm in central EU myself.
-
*Using a test account. Numbers on a paid account might be different or similar.
I am trying to execute a MapReduce task in an Oozie workflow using a <java> action.
O'Reilly's Apache Oozie (Islam and Srinivasan, 2015) notes that:
While it’s not recommended, Java action can be used to run Hadoop MapReduce jobs because MapReduce jobs are nothing but Java programs after all. The main class invoked can be a Hadoop MapReduce driver and can call Hadoop APIs to run a MapReduce job. In that mode, Hadoop spawns more mappers and reducers as required and runs them on the cluster.
However, I'm not having success using this approach.
The action definition in the workflow looks like this:
<java>
<!-- Namenode etc. in global configuration -->
<prepare>
<delete path="${transformOut}" />
</prepare>
<configuration>
<property>
<name>mapreduce.job.queuename</name>
<value>default</value>
</property>
</configuration>
<main-class>package.containing.TransformTool</main-class>
<arg>${transformIn}</arg>
<arg>${transformOut}</arg>
<file>${avroJar}</file>
<file>${avroMapReduceJar}</file>
</java>
The Tool implementation's main() method looks like this:
public static void main(String[] args) throws Exception {
    int res = ToolRunner.run(new TransformTool(), args);
    if (res != 0) {
        throw new Exception("Error running MapReduce.");
    }
}
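For reference, a minimal sketch of what such a Tool-based driver can look like, assuming the "new" org.apache.hadoop.mapreduce API (the job name and input/output wiring are illustrative, not my actual code):
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class TransformTool extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // Use the Configuration injected by ToolRunner so properties from the
        // Oozie <configuration> block (e.g. the queue name) reach the job.
        Job job = Job.getInstance(getConf(), "transform");
        job.setJarByClass(TransformTool.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Mapper, reducer, and output key/value classes would be set here.
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new TransformTool(), args);
        if (res != 0) {
            throw new Exception("Error running MapReduce.");
        }
    }
}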
The workflow crashes with the "Error running MapReduce" exception above every time. How do I get the output of the MapReduce job to diagnose the problem? Is there a problem with using this kind of Tool to run a MapReduce application? Am I using the wrong API calls?
I am extremely disinclined to use the Oozie <map-reduce> action, as each action in the workflow relies on several separately versioned AVRO schemas.
What's the issue here? I am using the 'new' mapreduce API for the task.
Thanks for any help.
> how do I get the output of the MapReduce...
Back to the basics.
Since you don't care to mention which version of Hadoop and which version of Oozie you are using, I will assume a "recent" setup (e.g. Hadoop 2.7 with the TimelineServer and Oozie 4.2). And since you don't mention which kind of interface you use (command line? native Oozie/YARN UI? Hue?), I will give a few examples using the good old CLI.
> oozie jobs -localtime -len 10 -filter name=CrazyExperiment
Shows the last 10 executions of the "CrazyExperiment" workflow, so that you can inject the appropriate "Job ID" into the next commands.
> oozie job -info 0000005-151217173344062-oozie-oozi-W
Shows the status of that execution, from Oozie's point of view. If your Java action is stuck in PREP mode, then Oozie failed to submit it to YARN; otherwise you will find something like job_1449681681381_5858 under "External ID". But beware! The job_ prefix is a legacy thing; the actual YARN ID is application_1449681681381_5858.
> oozie job -log 0000005-151217173344062-oozie-oozi-W
Shows the Oozie log, as could be expected.
> yarn logs -applicationId application_1449681681381_5858
Shows the consolidated logs for the AppMaster (container #1) and the Java action Launcher (container #2) -- after execution is over. The stdout log for the Launcher contains a whole shitload of Oozie debug stuff; the real stdout is at the very bottom.
In case your Java action successfully spawned another YARN job, and you were careful to display the child application ID, you should be able to retrieve it there and run another yarn logs command against it.
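As a sketch of that last point (my own illustration, not from the original code), the driver can submit the child job explicitly and print its ID so it lands in the Launcher's stdout:
import org.apache.hadoop.mapreduce.Job;

public class SubmitHelper {
    // Submit without blocking so the child job ID is available immediately,
    // print it to stdout (which "yarn logs" captures), then block and monitor.
    public static boolean submitAndLog(Job job) throws Exception {
        job.submit();
        System.out.println("Child job ID: " + job.getJobID());
        System.out.println("Tracking URL: " + job.getTrackingURL());
        return job.waitForCompletion(true);
    }
}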
Enjoy your next 5 days of debugging ;-)