How to configure Hadoop with Eclipse - Java

I am new to Hadoop. I have downloaded the Hortonworks Sandbox image and mounted it in VirtualBox. The sandbox UI comes up when I type 192.168.56.101/ into Chrome, and I am able to log in to the Hadoop shell with the hue/hadoop username and password. Now I want to run a simple program in Eclipse. I have added the hadoop-0.18.3-eclipse-plugin to Eclipse and then tried the following steps.
1. Chose Map/Reduce from Eclipse.
2. Went to the Hadoop location editor and set:
Host name: localhost
Under Map/Reduce Master: port 9000
Under DFS Master: port 9001
But I am getting this error
Cannot connect to the Map/Reduce location: localhost Call to
localhost/127.0.0.1:9001 failed on connection exception:
java.net.ConnectException: Connection refused: no further information
VirtualBox is running.

Add the required Hadoop dependency jar files to your Eclipse classpath.
In the main method of your MapReduce program, add these lines:
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:50000");
conf.set("mapreduce.job.tracker", "localhost:50001");
If you are running Hadoop in a virtual machine, change localhost to the required IP address (where the Hadoop daemon runs); you can get the IP address by typing ifconfig.
Run the MapReduce program as a simple Java program, and you will get the output in the Eclipse console.
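Putting those lines in context, here is a minimal driver sketch, assuming a Hadoop 1.x-era client on the classpath; the IP address, ports, job name, and the identity Mapper/Reducer are placeholders to swap for your own values and classes:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyJobDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder address: use your VM's IP (from ifconfig) and the ports
        // from the cluster's core-site.xml / mapred-site.xml.
        conf.set("fs.default.name", "hdfs://192.168.56.101:9000");
        conf.set("mapreduce.job.tracker", "192.168.56.101:9001");

        Job job = new Job(conf, "my-job");
        job.setJarByClass(MyJobDriver.class);
        job.setMapperClass(Mapper.class);    // identity mapper; replace with yours
        job.setReducerClass(Reducer.class);  // identity reducer; replace with yours
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Run it as a plain Java application and watch the Eclipse console for the job's progress output.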

Related

In Windows, how to start the Appium server without mentioning any specific port and utilize available free ports?

I am using Java code to start the Appium server using command-line arguments that mention a specific port. I am currently looking for how to start the Appium server using any available port on a Windows machine.
If you have installed Appium using Node.js, then you can start it using the following command:
appium -a 127.0.0.1 -p 4723
If you have downloaded the .exe file, then you have to open the executable and click the Start Appium Server button; if the server starts successfully, the window will show it running (the original answer illustrated both states with screenshots).
To start the Appium service without providing any port, you can use the AppiumDriverLocalService class and AppiumServiceBuilder. The method usingAnyFreePort() configures the Appium server to start on any available port. Node.js must be installed on the system for this to work.
We need to provide the path to Appium's node.exe file and the appium.js file, as below:
// Note: backslashes must be escaped in Java string literals.
String Appium_Node_Path = "C:\\Program Files\\nodejs\\node.exe";
// Appium.js can be found at one of these two paths:
String Appium_JS_Path = "C:\\Program Files (x86)\\Appium\\resources\\app\\node_modules\\appium\\build\\lib\\appium.js";
// or:
// String Appium_JS_Path = "C:\\Users\\username\\AppData\\Local\\Programs\\appium-desktop\\resources\\app\\node_modules\\appium\\lib\\appium.js";
AppiumDriverLocalService appiumService = AppiumDriverLocalService.buildService(
        new AppiumServiceBuilder()
                .usingAnyFreePort()
                .usingDriverExecutable(new File(Appium_Node_Path))
                .withAppiumJS(new File(Appium_JS_Path)));
appiumService.start();
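Once the service is up, you can read back whichever free port was chosen via getUrl() and stop the service when you are done. This fragment continues the snippet above; the driver creation is left out as it depends on your setup:

java.net.URL serverUrl = appiumService.getUrl(); // e.g. http://0.0.0.0:54321/wd/hub
System.out.println("Appium server running at " + serverUrl);
// ... create your AndroidDriver/IOSDriver against serverUrl ...
appiumService.stop();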

What is a simple, effective way to debug custom Kafka connectors?

I'm working on a couple of Kafka connectors, and I don't see any errors in their creation/deployment in the console output; however, I am not getting the results I'm looking for (no results whatsoever, for that matter, desired or otherwise). I made these connectors based on Kafka's example FileStream connectors, so my debug technique was based on the use of the SLF4J Logger that is used in the example. I've searched for the log messages that I thought would be produced in the console output, but to no avail. Am I looking in the wrong place for these messages? Or perhaps is there a better way to go about debugging these connectors?
Example uses of the SLF4J Logger that I referenced for my implementation:
Kafka FileStreamSinkTask
Kafka FileStreamSourceTask
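For reference, the logging pattern those example classes use looks roughly like the sketch below; MySinkTask and the log messages are placeholders of mine, not code from the Kafka examples:

import java.util.Collection;
import java.util.Map;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTask;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class MySinkTask extends SinkTask {
    private static final Logger log = LoggerFactory.getLogger(MySinkTask.class);

    @Override
    public String version() {
        return "0.1";
    }

    @Override
    public void start(Map<String, String> props) {
        log.info("Starting task with props {}", props);
    }

    @Override
    public void put(Collection<SinkRecord> records) {
        // DEBUG messages are dropped unless the worker's log4j config lowers
        // the Connect log level (see the accepted answer below).
        log.debug("Received {} records", records.size());
    }

    @Override
    public void stop() {
        log.info("Stopping task");
    }
}

These messages end up in the Connect worker's log, not in the console of whatever process created the connector, which is usually why they seem to disappear.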
I will try to reply to your question in a broad way. A simple way to do Connector development could be as follows:
Structure and build your connector source code by looking at one of the many Kafka Connectors available publicly (you'll find an extensive list available here: https://www.confluent.io/product/connectors/ )
Download the latest Confluent Open Source edition (>= 3.3.0) from https://www.confluent.io/download/
Make your connector package available to Kafka Connect in one of the following ways:
Store all your connector jar files (connector jar plus dependency jars excluding Connect API jars) to a location in your filesystem and enable plugin isolation by adding this location to the
plugin.path property in the Connect worker properties. For instance, if your connector jars are stored in /opt/connectors/my-first-connector, you will set plugin.path=/opt/connectors in your worker's properties (see below).
Store all your connector jar files in a folder under ${CONFLUENT_HOME}/share/java. For example: ${CONFLUENT_HOME}/share/java/kafka-connect-my-first-connector. (Needs to start with kafka-connect- prefix to be picked up by the startup scripts). $CONFLUENT_HOME is where you've installed Confluent Platform.
Optionally, increase your logging by changing the log level for Connect in ${CONFLUENT_HOME}/etc/kafka/connect-log4j.properties to DEBUG or even TRACE.
Use Confluent CLI to start all the services, including Kafka Connect. Details here: http://docs.confluent.io/current/connect/quickstart.html
Briefly: confluent start
Note: The Connect worker's properties file currently loaded by the CLI is ${CONFLUENT_HOME}/etc/schema-registry/connect-avro-distributed.properties. That's the file you should edit if you choose to enable classloading isolation but also if you need to change your Connect worker's properties.
Once you have Connect worker running, start your connector by running:
confluent load <connector_name> -d <connector_config.properties>
or
confluent load <connector_name> -d <connector_config.json>
The connector configuration can be either in java properties or JSON format.
Run confluent log connect to open the Connect worker's log file, or navigate directly to where your logs and data are stored by running:
cd "$( confluent current )"
Note: you can change where your logs and data are stored during a session of the Confluent CLI by setting the environment variable CONFLUENT_CURRENT appropriately. E.g., given that /opt/confluent exists and is where you want to store your data, run:
export CONFLUENT_CURRENT=/opt/confluent
confluent current
Finally, to interactively debug your connector, a possible way is to apply the following before starting Connect with the Confluent CLI:
confluent stop connect
export CONNECT_DEBUG=y; export DEBUG_SUSPEND_FLAG=y;
confluent start connect
and then attach your debugger (for instance, remotely to the Connect worker on the default port 5005). To stop running Connect in debug mode, just run unset CONNECT_DEBUG; unset DEBUG_SUSPEND_FLAG; when you are done.
I hope the above will make your connector development easier and ... more fun!
I love the accepted answer. One thing - the environment variables didn't work for me... I'm using Confluent Community Edition 5.3.1...
Here's what I did that worked...
I installed the Confluent CLI from here:
https://docs.confluent.io/current/cli/installing.html#tarball-installation
I ran Confluent using the command confluent local start
I got the Connect app details using the command ps -ef | grep connect
I copied the resulting command to an editor and added the arg (right after java):
-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005
Then I stopped Connect using the command confluent local stop connect
Then I ran the connect command with the arg
Brief intermission ---
VS Code development is led by Erich Gamma - of Gang of Four fame, who also worked on Eclipse. VS Code is becoming a first-class Java IDE; see https://en.wikipedia.org/wiki/Erich_Gamma
Intermission over ---
Next I launched VS Code and opened the Debezium Oracle connector folder (cloned from https://github.com/debezium/debezium-incubator).
Then I chose Debug - Open Configurations
and entered the debugging configuration (highlighted in the original post),
and then ran the debugger - it will hit your breakpoints!
The connect command should look something like this:
/Library/Java/JavaVirtualMachines/jdk1.8.0_221.jdk/Contents/Home/bin/java -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005 -Xms256M -Xmx2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dkafka.logs.dir=/var/folders/yn/4k6t1qzn5kg3zwgbnf9qq_v40000gn/T/confluent.CYZjfRLm/connect/logs -Dlog4j.configuration=file:/Users/myuserid/confluent-5.3.1/bin/../etc/kafka/connect-log4j.properties -cp /Users/myuserid/confluent-5.3.1/share/java/kafka/*:/Users/myuserid/confluent-5.3.1/share/java/confluent-common/*:/Users/myuserid/confluent-5.3.1/share/java/kafka-serde-tools/*:/Users/myuserid/confluent-5.3.1/bin/../share/java/kafka/*:/Users/myuserid/confluent-5.3.1/bin/../support-metrics-client/build/dependant-libs-2.12.8/*:/Users/myuserid/confluent-5.3.1/bin/../support-metrics-client/build/libs/*:/usr/share/java/support-metrics-client/* org.apache.kafka.connect.cli.ConnectDistributed /var/folders/yn/4k6t1qzn5kg3zwgbnf9qq_v40000gn/T/confluent.CYZjfRLm/connect/connect.properties
The connector module is executed by the Kafka Connect framework. For debugging, we can use standalone mode and configure the IDE to use the ConnectStandalone main function as the entry point.
Create a debug configuration as follows. Remember to tick "Include dependencies with 'Provided' scope" if it is a Maven project.
The connector properties file needs to specify the connector class name via "connector.class" for debugging.
The worker properties file can be copied from the Kafka folder, e.g. /usr/local/etc/kafka/connect-standalone.properties.
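If your IDE makes it awkward to pass program arguments, a tiny wrapper main works as well. This is a sketch; the class name and both properties paths are placeholders for your own files:

import org.apache.kafka.connect.cli.ConnectStandalone;

public class DebugConnectorMain {
    public static void main(String[] args) throws Exception {
        // First argument: worker config; second: connector config
        // (which must set connector.class to your connector).
        ConnectStandalone.main(new String[] {
                "/usr/local/etc/kafka/connect-standalone.properties",
                "/path/to/my-connector.properties"
        });
    }
}

Set breakpoints in your connector code and run this class in debug mode.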

Apache Connection Refused when running Docker-client Java API

I am trying to install the Docker-client Remote API library (https://github.com/spotify/docker-client) to do some image searches and inspect image data (all in public repositories). I have the Boot2Docker VM downloaded, installed, and running. Commands such as docker pull ubuntu work fine, but I would now like to do this via a Java program. I used the Eclipse IDE EGit plugin to import the GitHub project and created a Maven/Java project from the existing master branch. The source code imported completely and no errors are reported. I then tried writing a simple test:
final DockerClient docker = DefaultDockerClient.fromEnv().build();
// docker.pull("busybox");
List<ImageSearchResult> results = docker.searchImages("ubuntu");
for (ImageSearchResult res : results) {
    System.out.println(res.getName());
}
However, when running the code in Eclipse I get the following error:
Exception in thread "main" com.spotify.docker.client.DockerException: java.util.concurrent.ExecutionException: javax.ws.rs.ProcessingException: org.apache.http.conn.HttpHostConnectException: Connect to localhost:2375 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused: connect
at com.spotify.docker.client.DefaultDockerClient.propagate(DefaultDockerClient.java:1109)
at com.spotify.docker.client.DefaultDockerClient.request(DefaultDockerClient.java:1028)
at com.spotify.docker.client.DefaultDockerClient.searchImages(DefaultDockerClient.java:653)
at com.spotify.docker.client.main.Test.main(Test.java:28)
I tried setting up an Apache server on that port, but then I get:
Exception in thread "main" com.spotify.docker.client.DockerRequestException: Request error: GET http://localhost:2375/v1.12/images/search?term=ubuntu: 404
at com.spotify.docker.client.DefaultDockerClient.propagate(DefaultDockerClient.java:1100)
at com.spotify.docker.client.DefaultDockerClient.request(DefaultDockerClient.java:1028)
at com.spotify.docker.client.DefaultDockerClient.searchImages(DefaultDockerClient.java:653)
at com.spotify.docker.client.main.Test.main(Test.java:28)
Can anyone tell me what I am supposed to do here to make my search/pull call work? This is my first try with Docker and I've searched through the documentation and googled the problem but can't find anyone with a similar problem.
Thank you!
EDIT: I am running Docker on Windows 7 via the pre-built Boot2Docker VM. Maybe the Docker daemon running inside it is not accessible from programs outside the VM, such as Eclipse?
EDIT: Solved it by upgrading to v1.6 instead of v1.5, which makes the daemon available on the Windows host. The current problem is that all my API calls return "The server failed to respond with a valid HTTP response".
I encountered a similar issue and managed to solve it by building the DockerClient the following way:
final DockerClient docker = DefaultDockerClient.builder()
        .uri(URI.create("unix:///var/run/docker.sock"))
        .build();
I had been getting the same exception, but adding the URI part above solved the issue for me.
A better explanation of an issue similar to the above, and how to solve it, is provided in the following issue tracker:
https://github.com/spotify/docker-maven-plugin/issues/61
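Putting the pieces together, here is a minimal sketch, assuming the Spotify docker-client library is on the classpath and the daemon listens on the default Linux unix socket (on boot2docker you would use the TCP/TLS endpoint instead, as in the answer further below):

import java.net.URI;
import java.util.List;
import com.spotify.docker.client.DefaultDockerClient;
import com.spotify.docker.client.DockerClient;
import com.spotify.docker.client.messages.ImageSearchResult;

public class DockerSearchExample {
    public static void main(String[] args) throws Exception {
        // Talk to the local daemon over its unix socket (Linux default path).
        DockerClient docker = DefaultDockerClient.builder()
                .uri(URI.create("unix:///var/run/docker.sock"))
                .build();
        List<ImageSearchResult> results = docker.searchImages("ubuntu");
        for (ImageSearchResult res : results) {
            System.out.println(res.getName());
        }
        docker.close();
    }
}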
The Java program essentially does a docker search; that can only work in an environment where the Docker engine is present:
either in the boot2docker VM,
or on a full Linux host.
I encountered the same problem on Mac with Eclipse and Docker version 1.10.3. I searched for a solution before settling on a workaround: using the Docker CLI docker-machine to create a new VirtualBox machine, then getting the DOCKER_HOST and DOCKER_CERT_PATH values of that machine to create a new builder.
In my case, I created a virtual machine default2 using the Docker CLI command docker-machine create -d virtualbox default2
Docker CLI
$ docker-machine env
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://192.168.99.103:2376"
export DOCKER_CERT_PATH="/Users/XXXX/.docker/machine/machines/default2"
export DOCKER_MACHINE_NAME="default2"
Docker-client Java
DockerCertificates defaultCertificates =
        new DockerCertificates(Paths.get("/Users/XXXX/.docker/machine/machines/default2"));
DockerClient docker = DefaultDockerClient.builder()
        .uri("https://192.168.99.103:2376")
        .dockerCertificates(defaultCertificates)
        .build();

Hadoop with Eclipse is not connecting

I am using Ubuntu 12.04 and trying to connect Hadoop in Eclipse. I successfully installed the plugin for 1.04 and am using Java 1.7 for this.
My configuration data are:
username: hduser, location name: test, Map/Reduce host port localhost:9101, and M/R master host localhost:9100.
My temp directory is /app/hduser/temp.
As per this location I set the advanced parameters. But I was not able to set fs.s3.buffer.dir, as no such directory like /app/hadoop/tmp//s3 had been created, and I was unable to set the map reduce master directory; I only found a local directory. I did not find mapred.jobtracker.persist.job.dir, nor the mapred temp dir.
When I ran Hadoop in pseudo-distributed mode, I did not find any datanode running with jps either.
I am not sure what the problem is here. In Eclipse I got an error while setting the DFS server. I got a message like...
An internal error occurred during: "Connecting to DFS test".
org/apache/commons/configuration/Configuration
Thanks all
I was facing the same issue. Later I found this:
Hadoop eclipse mapreduce is not working?
The main blog post is this. HTH someone who is looking for a solution.

FileNotFoundException while running SolrCloud on Tomcat

I have a Solr 4.2.0 server running under a Tomcat 7.0 container. I'm trying to wire it to my external ZooKeeper (actually, it doesn't work with the embedded ZooKeeper either).
I tried these Java opts:
-Dbootstrap_confdir=./solr/collection1/conf
-Dcollection.configName=myconf
-DzkRun
-DnumShards=2
for running the embedded ZooKeeper.
And also these Java opts:
-Dbootstrap_confdir=./solr/collection1/conf
-Dcollection.configName=myconf
-DzkHost=localhost:2181
-DnumShards=2
for connecting to the external ZooKeeper.
In both cases I continue to get the same exception:
java.io.FileNotFoundException: File '.\solr\collection1\conf \admin-extra.html' does not exist
But the problem is that the file admin-extra.html exists, and it is right there. I can't figure out what the problem is.
From your exception it seems your path has a whitespace after the config directory.
Try to define your bootstrap_confdir between double quotes, like:
-Dbootstrap_confdir="./solr/collection1/conf"
