I am trying to run Pig scripts remotely from my Java application, and for that I have written the code below.
Code:
import java.io.IOException;
import java.util.Properties;
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;
import org.apache.pig.backend.executionengine.ExecException;
public class Javapig{
public static void main(String[] args) {
try {
Properties props = new Properties();
props.setProperty("fs.default.name", "hdfs://hdfs://192.168.x.xxx:8022");
props.setProperty("mapred.job.tracker", "192.168.x.xxx:8021");
PigServer pigServer = new PigServer(ExecType.MAPREDUCE, props);
runIdQuery(pigServer, "fact");
}
catch(Exception e) {
System.out.println(e);
}
}
public static void runIdQuery(PigServer pigServer, String inputFile) throws IOException {
pigServer.registerQuery("A = load '" + inputFile + "' using org.apache.hive.hcatalog.pig.HCatLoader();");
pigServer.registerQuery("B = FILTER A by category == 'Aller';");
pigServer.registerQuery("DUMP B;");
System.out.println("Done");
}
}
but while executing it I am getting the error below.
Error
ERROR 4010: Cannot find hadoop configurations in classpath (neither hadoop-site.xml nor core-site.xml was found in the classpath).
I don't know what I am doing wrong.
Well, it's a self-describing error...
neither hadoop-site.xml nor core-site.xml was found in the classpath
You need both of those files in the classpath of your application.
Ideally you would get those from your $HADOOP_CONF_DIR folder and copy them into your project's src/main/resources, assuming you have a Maven structure.
Also, with those files in place, you should rather use a Hadoop Configuration object:
PigServer(ExecType execType, org.apache.hadoop.conf.Configuration conf)
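A minimal sketch of that approach might look like the following (assuming core-site.xml and hdfs-site.xml from $HADOOP_CONF_DIR have been copied into src/main/resources; the class name is a placeholder and the 'fact' relation is just reused from the question):
import java.util.Iterator;
import org.apache.hadoop.conf.Configuration;
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;
import org.apache.pig.data.Tuple;
public class JavapigWithConf {
    public static void main(String[] args) throws Exception {
        // Configuration picks up core-site.xml (and the other *-site.xml files)
        // from the classpath, so the cluster addresses are no longer hard-coded.
        Configuration conf = new Configuration();
        PigServer pigServer = new PigServer(ExecType.MAPREDUCE, conf);
        pigServer.registerQuery("A = load 'fact' using org.apache.hive.hcatalog.pig.HCatLoader();");
        pigServer.registerQuery("B = FILTER A by category == 'Aller';");
        // Fetch the results of B through the embedded API.
        Iterator<Tuple> it = pigServer.openIterator("B");
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}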
Related
I have this piece of code which can fetch a file from a Hadoop filesystem. I set up Hadoop on a single node and ran this code from my local machine to see if it would be able to fetch a file from the HDFS setup on that node. It worked.
package com.hdfs.test.hdfs_util;
/* Copy file from hdfs to local disk without hadoop installation
*
* params are something like
* hdfs://node01.sindice.net:8020 /user/bob/file.zip file.zip
*
*/
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class HDFSdownloader{
public static void main(String[] args) throws Exception {
System.getProperty("java.classpath");
if (args.length != 3) {
System.out.println("use: HDFSdownloader hdfs src dst");
System.exit(1);
}
System.out.println(HDFSdownloader.class.getName());
HDFSdownloader dw = new HDFSdownloader();
dw.copy2local(args[0], args[1], args[2]);
}
private void copy2local(String hdfs, String src, String dst) throws IOException {
System.out.println("!! Entering function !!");
Configuration conf = new Configuration();
conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
conf.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());
conf.set("fs.default.name", hdfs);
FileSystem.get(conf).copyToLocalFile(new Path(src), new Path(dst));
System.out.println("!! copytoLocalFile Reached!!");
}
}
Now I took the same code, bundled it in a jar and tried to run it on another node (say B). This time the code had to fetch a file from a proper distributed Hadoop cluster. That cluster has Kerberos enabled in it.
The code ran but gave an exception:
Exception in thread "main" org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2115)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:337)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:2030)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1999)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1975)
at com.hdfs.test.hdfs_util.HDFSdownloader.copy2local(HDFSdownloader.java:49)
at com.hdfs.test.hdfs_util.HDFSdownloader.main(HDFSdownloader.java:35)
Is there a way to programmatically make this code run? For some reason, I can't install kinit on the source node.
Here's a code snippet that works in the scenario you have described above, i.e. programmatically accessing a Kerberos-enabled cluster. Important points to note are:
Provide the keytab file location in UserGroupInformation
Provide the Kerberos realm details in the JVM arguments - the krb5.conf file
Define the Hadoop security authentication mode as kerberos
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;
public class KerberosHDFSIO {
public static void main(String[] args) throws IOException {
Configuration conf = new Configuration();
//The following property is enough for a non-kerberized setup
// conf.set("fs.defaultFS", "localhost:9000");
//need following set of properties to access a kerberized cluster
conf.set("fs.defaultFS", "hdfs://devha:8020");
conf.set("hadoop.security.authentication", "kerberos");
//The location of krb5.conf file needs to be provided in the VM arguments for the JVM
//-Djava.security.krb5.conf=/Users/user/Desktop/utils/cluster/dev/krb5.conf
UserGroupInformation.setConfiguration(conf);
UserGroupInformation.loginUserFromKeytab("user@HADOOP_DEV.ABC.COM",
"/Users/user/Desktop/utils/cluster/dev/.user.keytab");
try (FileSystem fs = FileSystem.get(conf);) {
FileStatus[] fileStatuses = fs.listStatus(new Path("/user/username/dropoff"));
for (FileStatus fileStatus : fileStatuses) {
System.out.println(fileStatus.getPath().getName());
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
I'm trying to load a file from the resources/ path using
getClassLoader().getResourceAsStream("file.LIB")
but the method always returns null unless I rename the file to another extension, say ".dll".
I've looked into the official Java documentation, but to no avail.
Why does the method act strangely on that file type?
Note: I'm using JDK 1.8.0_111 x86 (due to constraints on that lib file, which only works well with a 32-bit JVM)
It does work for me; you need to be sure what exactly you are doing with the lib file.
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;
public class FileHelper {
public String getFilePathToSave() {
Properties prop = new Properties();
String filePath = "";
try {
InputStream inputStream =
getClass().getClassLoader().getResourceAsStream("abc.lib");
prop.load(inputStream);
filePath = prop.getProperty("json.filepath");
} catch (IOException e) {
e.printStackTrace();
}
return filePath;
}
public static void main(String args[]) {
FileHelper fh = new FileHelper();
System.out.println(fh.getFilePathToSave());
}
}
Here is my code:
import com.bmc.arsys.api.ARException;
import com.bmc.arsys.api.ARServerUser;
public class ARServer {
public static void main(String[] args) {
ARServerUser ar = new ARServerUser();
ar.setServer("ServerName");
ar.setUser("Username");
ar.setPassword("Password");
ar.connect();
ar.login();
try {
ar.verifyUser();
} catch (ARException e) {
System.out.println(e.getMessage());
}
}
}
I have added this jar file "ardoc7604_build002.jar" to the build path, but I am still getting errors like:
import com.bmc.arsys.api.ARException can not be resolved
import com.bmc.arsys.api.ARServerUser; can not be resolved
ARserver can not be resolved
ARException can not be resolved to a type.
Thanks in advance for help.
You have put the Javadoc jar on your build path - you need to use the Java API jar instead:
arapi7604_build002.jar
Are you only a Java person, or do you understand C# as well? If so, I can give you some examples.
I have a GWT project running in dev and production mode as well as on web and mobile.
I have different web.xml files for each mode.
I also need different constants for each version. Currently I use this:
class Params {
public static final String SOME_CONSTANT = "value";
...
}
The value of SOME_CONSTANT may change across modes (versions of the app).
How can I have different constants for each mode (dev, prod, web, mobile)?
Move these constants into properties files, one for each environment.
Create a folder structure like this (it must be outside of your final generated war file, somewhere on the server):
resources
|__dev
|__prod
|__web
|__mobile
Each folder contains a properties file with values for that environment.
Pass the environment name at server start-up as a system property or environment variable. Load all the properties at application context initialization and use them anywhere in your application.
Use a ServletContextListener to read all the properties at server start-up.
How do you load the properties file based on a system property or environment variable?
Use
System.getProperty()
or
System.getenv()
to read the location of the properties file, and then load it:
Properties properties = new Properties();
properties.load(new FileInputStream(new File(absolutePath)));
You can store the properties as an application context attribute that can be read from anywhere, including JSPs.
--EDIT--
Load the properties file at server start-up:
web.xml
<listener>
<listener-class>com.x.y.z.server.AppServletContextListener</listener-class>
</listener>
AppServletContextListener.java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Properties;
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
public class AppServletContextListener implements ServletContextListener {
private static Properties properties = new Properties();
static {
// load properties file
String absolutePath = null;
if (System.getenv("properties_absolute_path") == null) {
absolutePath = System.getProperty("properties_absolute_path");
} else {
absolutePath = System.getenv("properties_absolute_path");
}
try {
File file = new File(absolutePath);
properties.load(new FileInputStream(file));
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
@Override
public void contextDestroyed(ServletContextEvent servletContextEvent) {
}
@Override
public void contextInitialized(ServletContextEvent servletContextEvent) {
servletContextEvent.getServletContext().setAttribute("properties", properties);
}
public static Properties getProperties() {
return properties;
}
}
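As a rough sketch of how that context attribute could then be consumed (the servlet class and the "some.constant" key are hypothetical, and the servlet would still need to be declared in web.xml or annotated with @WebServlet):
import java.io.IOException;
import java.util.Properties;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
public class ConfigAwareServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        // Read the Properties object the listener stored at context initialization.
        Properties properties = (Properties) getServletContext().getAttribute("properties");
        // "some.constant" is a placeholder key; use whatever keys your per-environment files define.
        resp.getWriter().println(properties.getProperty("some.constant"));
        // The static AppServletContextListener.getProperties() accessor would work here as well.
    }
}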
I'm happily connecting to HDFS and listing my home directory:
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://hadoop:8020");
conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
FileSystem fs = FileSystem.get(conf);
RemoteIterator<LocatedFileStatus> ri = fs.listFiles(fs.getHomeDirectory(), false);
while (ri.hasNext()) {
LocatedFileStatus lfs = ri.next();
log.debug(lfs.getPath().toString());
}
fs.close();
What I want to do now, though, is connect as a specific user (not the whoami user). Does anyone know how you specify which user you connect as?
As far as I can see, this is done through the UserGroupInformation class and PrivilegedAction or PrivilegedExceptionAction. Here is sample code to connect to remote HDFS 'as' a different user ('hbase' in this case). I hope this will solve your task. If you need a full scheme with authentication, you will need to improve the user handling, but for the SIMPLE authentication scheme (actually no authentication) it works just fine.
package org.myorg;
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileStatus;
public class HdfsTest {
public static void main(String args[]) {
try {
UserGroupInformation ugi
= UserGroupInformation.createRemoteUser("hbase");
ugi.doAs(new PrivilegedExceptionAction<Void>() {
public Void run() throws Exception {
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://1.2.3.4:8020/user/hbase");
conf.set("hadoop.job.ugi", "hbase");
FileSystem fs = FileSystem.get(conf);
fs.createNewFile(new Path("/user/hbase/test"));
FileStatus[] status = fs.listStatus(new Path("/user/hbase"));
for(int i=0;i<status.length;i++){
System.out.println(status[i].getPath());
}
return null;
}
});
} catch (Exception e) {
e.printStackTrace();
}
}
}
If I understood you correctly, all you want is to get the home directory of the user you specify, not the whoami user.
In your configuration file, set your home directory property to /user/${user.name}. Make sure you have a system property named user.name.
This worked in my case.
I hope this is what you want to do; if not, add a comment.
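For what it's worth, the ${user.name} expansion this relies on can be checked with a tiny sketch. The "my.home.dir" key below is made up purely for illustration; the point is that Hadoop's Configuration substitutes ${...} placeholders from its own properties and from JVM system properties when a value is read:
import org.apache.hadoop.conf.Configuration;
public class HomeDirExpansionDemo {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // "my.home.dir" is a hypothetical key; the value references the standard
        // user.name JVM system property, which Configuration expands on get().
        conf.set("my.home.dir", "/user/${user.name}");
        System.out.println(conf.get("my.home.dir")); // e.g. /user/alice
    }
}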