Setting up Hadoop YARN on Ubuntu (single node) - Java

I set up Hadoop YARN (2.5.1) on Ubuntu 13 as a single-node cluster. When I run start-dfs.sh, it gives the following output and the processes do not start (I confirmed this using the jps and ps commands). My bashrc additions are also copied below. Any thoughts on what I need to reconfigure?
bashrc additions:
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export HADOOP_INSTALL=/opt/hadoop/hadoop-2.5.1
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
start-dfs.sh output:
14/09/22 12:24:13 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /opt/hadoop/hadoop-2.5.1/logs/hadoop-hduser-namenode-zkserver1.fidelus.com.out
localhost: nice: $HADOOP_INSTALL/bin/hdfs: No such file or directory
localhost: starting datanode, logging to /opt/hadoop/hadoop-2.5.1/logs/hadoop-hduser-datanode-zkserver1.fidelus.com.out
localhost: nice: $HADOOP_INSTALL/bin/hdfs: No such file or directory
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is cf:e1:ea:86:a4:0c:cd:ec:9d:b9:bc:90:9d:2b:db:d5.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /opt/hadoop/hadoop-2.5.1/logs/hadoop-hduser-secondarynamenode-zkserver1.fidelus.com.out
0.0.0.0: nice: $HADOOP_INSTALL/bin/hdfs: No such file or directory
14/09/22 12:24:58 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
The bin directory has the hdfs file and its owner is hduser (I am running the process as hduser). My $HADOOP_INSTALL setting points to the Hadoop directory (/opt/hadoop/hadoop-2.5.1). Should I change the permissions or configuration, or simply move the directory out of /opt, perhaps to /usr/local?
Update:
When I run start-yarn.sh, I get the following message:
localhost: Error: Could not find or load main class org.apache.hadoop.yarn.server.nodemanager.NodeManager
Update
I moved the directory to /usr/local but I get the same warning message.
Update
I have the ResourceManager running according to the jps command. However, when I try to start YARN, it fails with the error given above. I can access the ResourceManager UI on port 8088. Any ideas?

Try starting the daemons individually with the following commands (instead of using start-dfs.sh) and see if that works.
hadoop-daemon.sh start namenode
hadoop-daemon.sh start secondarynamenode
hadoop-daemon.sh start datanode
hadoop-daemon.sh start nodemanager
mr-jobhistory-daemon.sh start historyserver
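If the daemons come up this way, a quick check with jps should list them; a rough sketch of what jps might show (the PIDs will differ, and only the daemons you actually started will appear):
jps
12001 NameNode
12002 SecondaryNameNode
12003 DataNode
12004 NodeManager
12005 JobHistoryServer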

Unable to create directory path [/User/Desktop/db2/logs] for Neo4j store

I am trying to use a tool that, in two steps, analyzes code smells for Android.
In the first step, the tool parses an APK and generates .db files inside a directory; in the next step those files should be converted to CSV files. However, whenever I try to run the second step, the console returns the following error:
java.io.IOException: Unable to create directory path [/User/Desktop/db2/logs] for Neo4j store.
I think it is a Neo4j configuration problem.
I am currently running the tool with the following Java configuration:
echo $JAVA_HOME
/home/User/openlogic-openjdk-11.0.15
update-alternatives --config java
* 0 /usr/lib/jvm/java-11-openjdk-amd64/bin/java 1111 auto mode
To be safe, I also started Neo4j, which returned the following output:
sudo systemctl status neo4j.service
neo4j.service - Neo4j Graph Database
Loaded: loaded (/lib/systemd/system/neo4j.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2022-07-06 20:11:04 CEST; 16min ago
Main PID: 1040 (java)
Tasks: 57 (limit: 18901)
Memory: 705.4M
CPU: 16.639s
CGroup: /system.slice/neo4j.service
└─1040 /usr/bin/java -cp "/var/lib/neo4j/plugins:/etc/neo4j:/usr/share/neo4j/lib/*:/var/lib/neo4j/plugins/*" -XX:+UseG1GC -XX:-OmitStackTraceInFastThrow -XX:+AlwaysPreTouch -XX:+UnlockExper>.
How can I solve this?
You posted this error:
java.io.IOException: Unable to create directory path [/User/Desktop/db2/logs] for Neo4j store.
From that error, it looks like:
Neo4j was installed at "/User/Desktop/db2"
That directory does not have "write" permission for the user running the tool
I tried to reproduce this locally using Neo4j Community 4.4.5, following the steps below.
I do see an IOException related to "logs", but it's slightly different from what you posted. Perhaps we're on different versions of Neo4j.
Open terminal into install directory: cd neo4j
Verify "neo4j" is stopped: ./bin/neo4j stop
Rename existing "logs" directory: mv logs logs.save
Remove write permission for the Neo4j install: chmod u-w .
Start neo4j in console mode: ./bin/neo4j console
Observe errors in console output
2022-07-08 03:28:38.081+0000 INFO Starting...
ERROR StatusLogger Unable to create file [****************************]/neo4j/logs/debug.log
java.io.IOException: Could not create directory [****************************]/neo4j/logs
...
To fix things, try:
Get a terminal into your Neo4j directory:
cd /User/Desktop/db2
Set write permissions for the entire directory tree:
chmod u+w -R .
Start neo4j in console mode:
./bin/neo4j console
If this works and you're able to run neo4j fine, it points to an issue with user permissions when running neo4j as a system service.
The best steps from there depend on the system, your access, how comfortable you are making changes, and probably other things. An easy, brute-force approach would be to manually create each directory you discover (such as "/User/Desktop/db2/logs") and grant write permissions to all users (chmod ugo+w .), then try re-running the service and see what errors pop up. Repeat that until you're able to run the service without errors.
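If you do need Neo4j to keep running as a system service, a less permissive alternative (a sketch, assuming the Debian/Ubuntu package defaults where the service runs as the neo4j user) is to hand ownership of the store directory to that user instead of opening it up to everyone:
sudo chown -R neo4j:neo4j /User/Desktop/db2
sudo systemctl restart neo4j.service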

Exception in remote debugging for SpringBoot Application

I have a Spring Boot application which internally communicates with JMS and ActiveMQ. I have a .cmd file to start that application. I have added arguments to enable remote debugging, so I can debug the application in Eclipse. The .cmd file is as below:
set JAVA_CP=./;./config;./lib/*
set JAVA_JMX=-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=10090 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false
set JAVA_CL=com.myapp.test.server.TestServer
set JAVA_OP=-Xmx280m -Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=8090,suspend=n %JAVA_JMX%
"%JAVA_HOME%\bin\java" %JAVA_OP% -cp "%JAVA_CP%" %JAVA_CL%
Now when I start ActiveMQ and then my application with the above .cmd file, I get the following error:
18:27:53.234 [main] ERROR [o.a.coyote.http11.Http11NioProtocol] Failed to start end point associated with ProtocolHandler ["http-nio-8080"]
java.net.BindException: Address already in use: bind
If I remove the debugging arguments (-Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=8090,suspend=n) from the .cmd file, it works fine.
I searched for this and found that it may be possible that two instances are running, but I ruled that out as well. Can you please help?
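One way to double-check what is already bound to the relevant ports on Windows (since this is a .cmd file; 8080 is the embedded Tomcat port from the error, 8090 the JDWP address in the arguments above):
netstat -ano | findstr :8080
netstat -ano | findstr :8090
rem the last column is the owning PID; identify the process with:
tasklist /fi "pid eq <PID>"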

Java agent library failed to init: instrument

I'm working on an open-source project called "CloudNet-v3". On my local machine, /data is a symlink to the data folder in my IntelliJ projects folder.
I have the following startup command:
[java, -XX:+UseG1GC, -XX:MaxGCPauseMillis=50, -XX:-UseAdaptiveSizePolicy, -XX:CompileThreshold=100, -XX:+UnlockExperimentalVMOptions, -XX:+UseCompressedOops, -Dcom.mojang.eula.agree=true, -Djline.terminal=jline.UnsupportedTerminal, -Dfile.encoding=UTF-8, -Dio.netty.noPreferDirect=true, -Dclient.encoding.override=UTF-8, -Dio.netty.maxDirectMemory=0, -Dio.netty.leakDetectionLevel=DISABLED, -Dio.netty.recycler.maxCapacity=0, -Dio.netty.recycler.maxCapacity.default=0, -DIReallyKnowWhatIAmDoingISwear=true, -Dcloudnet.wrapper.receivedMessages.language=english, -Xmx372M, -javaagent: "/data/temp/caches/wrapper.jar", -cp, "/data/launcher/libs/io/kubernetes/client-java/4.0.0/client-java-4.0.0.jar:/data/launcher/libs/io/netty/netty-codec-http/4.1.36.Final/netty-codec-http-4.1.36.Final.jar:/data/launcher/libs/io/netty/netty-handler/4.1.36.Final/netty-handler-4.1.36.Final.jar:/data/launcher/libs/io/netty/netty-transport-native-epoll/4.1.36.Final/netty-transport-native-epoll-4.1.36.Final-linux-x86_64.jar:/data/launcher/libs/io/netty/netty-transport-native-kqueue/4.1.36.Final/netty-transport-native-kqueue-4.1.36.Final-osx-x86_64.jar:/data/launcher/libs/io/kubernetes/client-java-api/4.0.0/client-java-api-4.0.0.jar:/data/launcher/libs/io/kubernetes/client-java-proto/4.0.0/client-java-proto-4.0.0.jar:/data/launcher/libs/org/yaml/snakeyaml/1.19/snakeyaml-1.19.jar:/data/launcher/libs/commons-codec/commons-codec/1.11/commons-codec-1.11.jar:/data/launcher/libs/org/apache/commons/commons-compress/1.18/commons-compress-1.18.jar:/data/launcher/libs/org/apache/commons/commons-lang3/3.7/commons-lang3-3.7.jar:/data/launcher/libs/com/squareup/okhttp/okhttp-ws/2.7.5/okhttp-ws-2.7.5.jar:/data/launcher/libs/com/google/guava/guava/25.1-jre/guava-25.1-jre.jar:/data/launcher/libs/org/slf4j/slf4j-api/1.7.25/slf4j-api-1.7.25.jar:/data/launcher/libs/org/bouncycastle/bcprov-ext-jdk15on/1.59/bcprov-ext-jdk15on-1.59.jar:/data/launcher/libs/org/bouncycastle/bcpkix-jdk15on/1.59/bcpkix-jdk15on-1.59.jar:/data/launcher/libs/com/google/protobuf/protobuf-java/3.4.0/protobuf-java-3.4.0.jar:/data/launcher/libs/com/google/code/gson/gson/2.8.2/gson-2.8.2.jar:/data/launcher/libs/io/netty/netty-codec/4.1.36.Final/netty-codec-4.1.36.Final.jar:/data/launcher/libs/io/netty/netty-transport-native-unix-common/4.1.36.Final/netty-transport-native-unix-common-4.1.36.Final.jar:/data/launcher/libs/io/netty/netty-transport/4.1.36.Final/netty-transport-4.1.36.Final.jar:/data/launcher/libs/io/netty/netty-buffer/4.1.36.Final/netty-buffer-4.1.36.Final.jar:/data/launcher/libs/io/netty/netty-resolver/4.1.36.Final/netty-resolver-4.1.36.Final.jar:/data/launcher/libs/io/netty/netty-common/4.1.36.Final/netty-common-4.1.36.Final.jar:/data/launcher/libs/io/sundr/builder-annotations/0.9.2/builder-annotations-0.9.2.jar:/data/launcher/libs/io/swagger/swagger-annotations/1.5.12/swagger-annotations-1.5.12.jar:/data/launcher/libs/com/squareup/okhttp/logging-interceptor/2.7.5/logging-interceptor-2.7.5.jar:/data/launcher/libs/com/squareup/okhttp/okhttp/2.7.5/okhttp-2.7.5.jar:/data/launcher/libs/joda-time/joda-time/2.9.3/joda-time-2.9.3.jar:/data/launcher/libs/org/joda/joda-convert/1.2/joda-convert-1.2.jar:/data/launcher/libs/com/google/code/findbugs/jsr305/3.0.2/jsr305-3.0.2.jar:/data/launcher/libs/org/checkerframework/checker-qual/2.0.0/checker-qual-2.0.0.jar:/data/launcher/libs/com/google/errorprone/error_prone_annotations/2.1.3/error_prone_annotations-2.1.3.jar:/data/launcher/libs/com/google/j2objc/j2objc-annotations/1.1/j2objc-annotations-1.1.jar:/data/launcher/libs/org
/codehaus/mojo/animal-sniffer-annotations/1.14/animal-sniffer-annotations-1.14.jar:/data/launcher/libs/org/bouncycastle/bcprov-jdk15on/1.59/bcprov-jdk15on-1.59.jar:/data/launcher/libs/io/sundr/sundr-core/0.9.2/sundr-core-0.9.2.jar:/data/launcher/libs/io/sundr/sundr-codegen/0.9.2/sundr-codegen-0.9.2.jar:/data/launcher/libs/io/sundr/resourcecify-annotations/0.9.2/resourcecify-annotations-0.9.2.jar:/data/launcher/libs/com/squareup/okio/okio/1.6.0/okio-1.6.0.jar:/data/launcher/versions/3.0.0-RELEASE-e48128a/driver.jar:/data/temp/caches/wrapper.jar", de.dytanic.cloudnet.wrapper.Main, nogui]
And my current workdir is: /data/temp/services/Lobby-1#4a517311-09e6-4f77-89a5-64b4bc15399a
So whenever I am in the workdir and execute the given command, it fails with the following error: Error opening zip file or JAR manifest missing :
Error occurred during initialization of VM
agent library failed to init: instrument Full Log
Now I am wondering, because it works in the automatic environment, even though there are no changes to the master-branch source other than a changed path, /data/launcher instead of launcher (System.getProperty("cloudnet.launcher.dir", "/data/launcher"), https://github.com/CloudNetService/CloudNet-v3/blob/master/cloudnet-launcher/src/main/java/de/dytanic/cloudnet/launcher/Constants.java).
A short lookup: ls -laR /Users/.../Documents/IdeaProjects/cloudnet-parent/data
The -javaagent option is misused. The correct syntax is:
-javaagent:/data/temp/caches/wrapper.jar
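More generally, the option takes the form below; there must be no space between the colon and the path, and options for the agent, if any, follow an equals sign:
-javaagent:jarpath[=options]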

Why does start-all.sh from root cause "failed to launch org.apache.spark.deploy.master.Master: JAVA_HOME is not set"?

I am trying to execute a Spark application, built in Scala IDE, through my standalone Spark service running on the Cloudera QuickStart VM 5.3.0.
My cloudera account's JAVA_HOME is /usr/java/default.
However, I am facing the error message below while executing the start-all.sh command as the cloudera user:
[cloudera@localhost sbin]$ pwd
/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/sbin
[cloudera@localhost sbin]$ ./start-all.sh
chown: changing ownership of `/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/sbin/../logs': Operation not permitted
starting org.apache.spark.deploy.master.Master, logging to /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/sbin/../logs/spark-cloudera-org.apache.spark.deploy.master.Master-1-localhost.localdomain.out
/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/sbin/spark-daemon.sh: line 151: /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/sbin/../logs/spark-cloudera-org.apache.spark.deploy.master.Master-1-localhost.localdomain.out: Permission denied
failed to launch org.apache.spark.deploy.master.Master:
tail: cannot open `/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/sbin/../logs/spark-cloudera-org.apache.spark.deploy.master.Master-1-localhost.localdomain.out' for reading: No such file or directory
full log in /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/sbin/../logs/spark-cloudera-org.apache.spark.deploy.master.Master-1-localhost.localdomain.out
cloudera@localhost's password:
localhost: chown: changing ownership of `/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/logs': Operation not permitted
localhost: starting org.apache.spark.deploy.worker.Worker, logging to /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/logs/spark-cloudera-org.apache.spark.deploy.worker.Worker-1-localhost.localdomain.out
localhost: /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/sbin/spark-daemon.sh: line 151: /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/logs/spark-cloudera-org.apache.spark.deploy.worker.Worker-1-localhost.localdomain.out: Permission denied
localhost: failed to launch org.apache.spark.deploy.worker.Worker:
localhost: tail: cannot open `/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/logs/spark-cloudera-org.apache.spark.deploy.worker.Worker-1-localhost.localdomain.out' for reading: No such file or directory
localhost: full log in /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/logs/spark-cloudera-org.apache.spark.deploy.worker.Worker-1-localhost.localdomain.out
I had added export CMF_AGENT_JAVA_HOME=/usr/java/default in /etc/default/cloudera-scm-agent and ran sudo service cloudera-scm-agent restart (see "How to set CMF_AGENT_JAVA_HOME").
I had also added export JAVA_HOME=/usr/java/default in the locate_java_home function definition in the file /usr/share/cmf/bin/cmf-server and restarted the cluster and the standalone Spark service.
But the error below keeps repeating when starting the Spark service as the root user:
[root@localhost spark]# sbin/start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-localhost.localdomain.out
failed to launch org.apache.spark.deploy.master.Master:
JAVA_HOME is not set
full log in /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-localhost.localdomain.out
root@localhost's password:
localhost: Connection closed by UNKNOWN
Can anybody suggest how to set JAVA_HOME so that the Spark standalone service starts under Cloudera Manager?
The solution turned out to be quite easy and straightforward: just add export JAVA_HOME=/usr/java/default to /root/.bashrc, and the Spark services start successfully as the root user without the "JAVA_HOME is not set" error. Hope it helps somebody facing the same problem.
Set the JAVA_HOME variable in ~/.bashrc as follows:
sudo gedit ~/.bashrc
Write this line in the file (the path of your installed JDK); the export is needed so that child processes such as the Spark scripts can see it:
export JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64"
Then run:
source ~/.bashrc
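Alternatively, Spark's start scripts also read JAVA_HOME from conf/spark-env.sh, so it can be set there instead of in a shell profile; a sketch, with the conf path assumed from the parcel layout shown above:
echo "export JAVA_HOME=/usr/java/default" >> /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/spark/conf/spark-env.sh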

Installing, Configuring, and running Hadoop 2.2.0 on Mac OS X

I've installed Hadoop 2.2.0 and set up everything (for a single node) based on this tutorial: Hadoop YARN Installation. However, I can't get Hadoop to run.
I think my problem is that I can't connect to my localhost, but I'm not really sure why. I've spent upwards of about 10 hours installing, googling, and hating open-source software installation guides, so I've now turned to the one place that has never failed me.
Since a picture is worth a thousand words, I give you my set up ... in many many words pictures:
Basic profile/setup
I'm running Mac OS X (Mavericks 10.9.5)
For whatever it's worth, here's my /etc/hosts file:
My bash profile:
Hadoop file configurations
The setup for core-site.xml and hdfs-site.xml:
note: I have created folders in the locations you see above
The setup for my yarn-site.xml:
Setup for my hadoop-env.sh file:
Side Note
Before I show the results of when I run start-dfs.sh, start-yarn.sh, and check to see what's running with jps, keep in mind that I have a hadoop pointing to hadoop-2.2.0.
Starting up Hadoop
Now, here are the results when I start the daemons up:
For those of you who don't have a microscope (it looks super small on the preview of this post), here's a code chunk of what shows above:
mrp:~ mrp$ start-dfs.sh
2014-11-08 13:06:05.695 java[17730:1003] Unable to load realm info from SCDynamicStore
14/11/08 13:06:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop-2.2.0/logs/hadoop-mrp-namenode-mrp.local.out
localhost: starting datanode, logging to /usr/local/hadoop-2.2.0/logs/hadoop-mrp-datanode-mrp.local.out
localhost: 2014-11-08 13:06:10.954 java[17867:1403] Unable to load realm info from SCDynamicStore
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop-2.2.0/logs/hadoop-mrp-secondarynamenode-mrp.local.out
0.0.0.0: 2014-11-08 13:06:16.065 java[17953:1403] Unable to load realm info from SCDynamicStore
2014-11-08 13:06:20.982 java[17993:1003] Unable to load realm info from SCDynamicStore
14/11/08 13:06:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
mrp:~ mrp$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-mrp-resourcemanager-mrp.local.out
2014-11-08 13:06:43.765 java[18053:20b] Unable to load realm info from SCDynamicStore
localhost: starting nodemanager, logging to /usr/local/hadoop-2.2.0/logs/yarn-mrp-nodemanager-mrp.local.out
Check to see what's running:
Time Out
OK. So far, I think, so good. At least this looks good based on all the other tutorials and posts. I think.
Before I try to do anything fancy, I just want to see if it's working properly and run a simple command like hadoop fs -ls.
Failure
When I run hadoop fs -ls, here's what I get:
Again, in case you can't see that pic, it says:
2014-11-08 13:23:45.772 java[18326:1003] Unable to load realm info from SCDynamicStore
14/11/08 13:23:45 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ls: Call From mrp.local/127.0.0.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
I've tried to run other commands, and I get the same basic error at the beginning of each one:
Call From mrp.local/127.0.0.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Now, I've gone to that website mentioned, but honestly, everything in that link means nothing to me. I don't get what I should do.
I would very much appreciate any assistance with this. You'll make me the happiest hadooper, ever.
...this should go without saying, but obviously I'd be happy to edit/update with more info if needed. Thanks!
Add these to .bashrc:
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
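After adding them, reload the profile and restart the HDFS daemons so the new options take effect:
source ~/.bashrc
stop-dfs.sh
start-dfs.sh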
I had a very similar problem and found this question while googling for a solution.
Here is how I resolved it (on Mac OS 10.10 with Hadoop 2.5.1); I'm not sure if the question is exactly the same problem. I checked the log files generated by the datanode (/usr/local/hadoop-2.2.0/logs/hadoop-mrp-datanode-mrp.local.out) and found the following entry:
2014-11-09 17:44:35,238 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode:
Exception in namenode join org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
Directory /private/tmp/hadoop-kthul/dfs/name is in an inconsistent state: storage
directory does not exist or is not accessible.
Based on this, I concluded that something is wrong with the HDFS data on the datanode.
I deleted the directory with the HDFS data and reformatted HDFS:
rm -rf /private/tmp/hadoop-kthul
hdfs namenode -format
Now, I am up and running again. Still wondering if /private/tmp is a good place to keep the HDFS data - looking for options to change this.
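One way to move the data out of /private/tmp (a sketch; the directories below are placeholders to replace with paths you own) is to set explicit storage locations in hdfs-site.xml and then reformat the namenode:
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///usr/local/hadoop_data/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///usr/local/hadoop_data/datanode</value>
</property>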
So I've got Hadoop up and running. I had two problems (I think).
When starting up the NameNode and DataNode, I received the following error: Unable to load realm info from SCDynamicStore.
To fix this, I added the following two lines to my hadoop-env.sh file:
HADOOP_OPTS="${HADOOP_OPTS} -Djava.security.krb5.realm= -Djava.security.krb5.kdc="
HADOOP_OPTS="${HADOOP_OPTS} -Djava.security.krb5.conf=/dev/null"
I found those two lines in the solution to this post, Hadoop on OSX "Unable to load realm info from SCDynamicStore". The answer was posted by Matthew L Daniel.
I had formatted the NameNode folder more than once, which apparently screws things up?
I can't verify that this was the cause, because I don't have any errors in any of my log files; however, once I followed Workaround 1 (deleting and recreating the NameNode/DataNode folders, then reformatting) from this post, No data nodes are started, I was able to bring up the DataNode and get everything working.
Since the native library isn't supported on Mac, if you want to suppress this warning:
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Add this to the log4j.properties in ${HADOOP_HOME}/libexec/etc/hadoop:
# Turn off native library warning
log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR
