Hadoop setting fsPermission recursively to dir using Java


Hi, I have a test program that loads files into HDFS at the path user/user1/data/app/type/file.gz. This test program is run multiple times by multiple users, so I want to set the file permission to rwx so that anyone can delete this file. I have the following code:
fs.setPermission(new Path("user/user1/data"), new FsPermission(FsAction.ALL, FsAction.ALL, FsAction.ALL));
The line above gives drwxrwxrwx to all dirs, but file.gz gets -rw-r--r--. Why? Because of this, no user other than me can delete the file through the test program. I can delete it because I have full permission.
Please guide me. I am new to Hadoop. Thanks in advance.

Using the FsShell API solved my directory permission problem. It may not be the optimal way, but since this is test code it should be fine.
FsShell shell = new FsShell(conf);
try {
    // Equivalent to running: hadoop fs -chmod -R 777 user/user1/data
    shell.run(new String[]{"-chmod", "-R", "777", "user/user1/data"});
} catch (Exception e) {
    LOG.error("Couldn't change the file permissions", e);
    throw new IOException(e);
}

I think Hadoop doesn't provide a Java API to set permissions recursively in the version you are using. Your code only sets the permission on the directory user/user1/data and nothing else. You are better off using FileSystem.listStatus(Path f) to list everything in the directory and calling FileSystem.setPermission on each entry individually. This currently works on my 2.0.5-alpha-gphd-2.1.0.0 cluster.
From Hadoop 2.2.0 onward there is also a constructor
FsPermission(FsAction u, FsAction g, FsAction o, boolean sb)
but the boolean sb sets the sticky bit. It is not a recursive flag, so it does not help here.
Also have a look at Why does "hadoop fs -mkdir" fail with Permission Denied?, although your situation may be different (in my case dfs.permissions.enabled is still true).

I wrote this in Scala, but I think you could easily adapt it:
def changeUserGroup(user: String, fs: FileSystem, path: Path): Boolean = {
  // The fourth argument sets the sticky bit; it does not make the call recursive.
  val changedPermission = new FsPermission(FsAction.ALL, FsAction.ALL, FsAction.ALL, true)
  // listFiles(path, true) walks the tree recursively but returns files only, not directories.
  val fileList = fs.listFiles(path, true)
  while (fileList.hasNext()) {
    fs.setPermission(fileList.next().getPath(), changedPermission)
  }
  true
}
You will also have to add error handling; as written, it always returns true.

Related

Getting a specific version of an image with Jib (Maven, Docker, testcontainers)

I'm trying to understand a comment that a colleague made. We're using testcontainers to create a fixture:
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.utility.DockerImageName;
public class SalesforceFixture extends GenericContainer<SalesforceFixture> {
    private static final String APPLICATION_NAME = "salesforce-emulator";
    public SalesforceFixture() {
        // super(ImageResolver.resolve(APPLICATION_NAME));
        super(DockerImageName.parse("gcr.io/ad-selfserve/salesforce-emulator:latest"));
        ...
    }
    ...
The commented code is what it used to be. The next line is my colleague's suggestion. And on that line he commented:
This is the part I don't know. The [ImageResolver] gets the specific version of the emulator, rather than the latest. You need a docker-info file for that though, which jib doesn't automatically generate (but I think it can).
This is what I know or have figured so far:
SalesforceFixture is a class that will be used by other projects to write tests. It spins up a container in Docker, running a service that emulates the real service's API. It's like a local version of the service that behaves enough like the real thing that if one writes code and tests using the fixture, it should work the same in production. (This is where my knowledge ends.)
I looked into ImageResolver—it seems to be a class we wrote that searches a filesystem for something:
public static String resolve(String applicationName, File... roots) {
    Stream<File> searchPaths = Arrays.stream(roots).flatMap((value) -> {
        return Stream.of(new File(value, "../" + applicationName), new File(value, applicationName));
    });
    Optional<File> buildFile = searchPaths.flatMap((searchFile) -> {
        if (searchFile.exists()) {
            File imageFile = new File(searchFile + File.separator + "/target/docker/image-name");
            if (imageFile.exists()) {
                return Stream.of(imageFile);
            }
        }
        return Stream.empty();
    }).findAny();
    InputStream build = (InputStream)buildFile.map(ImageResolver::fileStream).orElseGet(() -> {
        return searchClasspath(applicationName);
    });
    if (build != null) {
        try {
            return IOUtils.toString(build, Charset.defaultCharset()).trim();
        } catch (IOException var6) {
            throw new RuntimeException("An exception has occurred while reading build file", var6);
        }
    } else {
        throw new RuntimeException("Could not resolve target image for application: " + applicationName);
    }
}
But I'm confused. What filesystem? Like, what is the present working directory? My local computer, wherever I ran the Java program from? Or is this from within some container? (I don't think so.) Or maybe the directory structure inside a .jar file? Or somewhere in gcr.io?
What does he mean about a "specific version number" vs. "latest"? I mean, when I build this project, whatever it built is all I have. Isn't that equivalent to "latest"? In what case would an older version of an image be present? (That's what made me think of gcr.io.)
Or, does he mean, that in the project using this project's image, one will not be able to specify a version via Maven/pom.xml—it will always spin up the latest.
Sorry this is long, just trying to "show my work." Any hints welcome. I'll keep looking.

I can't comment on specifics of your own internal implementations, but ImageResolver seems to work on your local filesystem, e.g. it looks into your target/ directory and also touches the classpath. I can imagine this code was just written for resolving an actual image name (not an image), since it also returns a String.
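To make the "which filesystem?" point concrete, here is a minimal sketch. It assumes the roots handed to ImageResolver are relative paths, which the question does not confirm. A relative java.io.File is resolved against the working directory of the JVM that runs the test (the user.dir system property), not against anything inside a container or a registry. The class name WorkingDirDemo and its method are made up for illustration; only the target/docker/image-name suffix comes from the code in the question.

```java
import java.io.File;

// Hypothetical demo class, not part of the project in the question.
final class WorkingDirDemo {

    // Returns the absolute location a relative path resolves to in this JVM.
    static String resolveAgainstWorkingDir(String relativePath) {
        return new File(relativePath).getAbsolutePath();
    }

    public static void main(String[] args) {
        // The directory the JVM was started from.
        System.out.println("user.dir    = " + System.getProperty("user.dir"));
        // Where a relative lookup such as the one in ImageResolver would land.
        System.out.println("resolves to = "
                + resolveAgainstWorkingDir("salesforce-emulator/target/docker/image-name"));
    }
}
```

Running this from the module directory of a Maven build should show the lookup landing under that module, which is probably why the file has to be produced by the build before the fixture can find it.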
Regarding latest, using a latest tag for a Docker image is generally considered an anti-pattern, so likely your colleague is commenting about this. Here is a random article from the web explaining some of the issues with latest tag:
https://vsupalov.com/docker-latest-tag/
Besides, I don't understand why you ask these questions which are very specific to your project here on SO rather than asking your colleague.

Read the jar version for a class

For a webservice client I'd like to use Implementation-Title and Implementation-Version from the jar file as user-agent string. The question is how to read the jar's manifest.
This question has been asked multiple times, however the answer seems not applicable for me. (e.g. Reading my own Jar's Manifest)
The problem is that simply reading /META-INF/MANIFEST.MF almost always gives wrong results. In my case, it would almost always refer to JBoss.
The solution proposed in https://stackoverflow.com/a/1273196/4222206
is problematic for me as you'd have to hardcode the library name to stop the iteration, and then still it may mean two versions of the same library are on the classpath and you just return the first - not necessarily the right - hit.
The solution in https://stackoverflow.com/a/1273432/4222206
seems to work with jar:// URLs only, which fails completely within JBoss, where the application classloader produces vfs:// URLs.
Is there a way for code in a class to find its own manifest?
I tried the above-mentioned approaches, which seem to run well in small applications started from the java command line, but I'd like a portable solution, as I cannot predict where my library will be used later.
public static Manifest getManifest() {
    log.debug("getManifest()");
    synchronized (Version.class) {
        if (manifest == null) {
            try {
                // this works wrongly in JBoss
                //ClassLoader cl = Version.class.getProtectionDomain().getClassLoader();
                //log.debug("found classloader={}", cl);
                //URL manifesturl = cl.getResource("/META-INF/MANIFEST.MF");
                URL jar = Version.class.getProtectionDomain().getCodeSource().getLocation();
                log.debug("Class loaded from {}", jar);
                URL manifesturl = null;
                switch (jar.getProtocol()) {
                    case "file":
                        manifesturl = new URL(jar.toString() + "META-INF/MANIFEST.MF");
                        break;
                    default:
                        manifesturl = new URL(jar.toString() + "!/META-INF/MANIFEST.MF");
                }
                log.debug("Expecting manifest at {}", manifesturl);
                manifest = new Manifest(manifesturl.openStream());
            } catch (Exception e) {
                log.info("Could not read version", e);
            }
        }
        // Still null here if reading the manifest failed above.
        return manifest;
    }
}
The code detects the correct jar path. I assumed that modifying the URL to point to the manifest would give the required result, but I get this:
Class loaded from vfs:/C:/Users/user/Documents/JavaLibs/wildfly-18.0.0.Final/bin/content/webapp.war/WEB-INF/lib/library-1.0-18.jar
Expecting manifest at vfs:/C:/Users/user/Documents/JavaLibs/wildfly-18.0.0.Final/bin/content/webapp.war/WEB-INF/lib/library-1.0-18.jar!/META-INF/MANIFEST.MF
Could not read version: java.io.FileNotFoundException: C:\Users\hiran\Documents\JavaLibs\wildfly-18.0.0.Final\standalone\tmp\vfs\temp\tempfc75b13f07296e98\content-e4d5ca96cbe6b35e\WEB-INF\lib\library-1.0-18.jar!\META-INF\MANIFEST.MF (The system cannot find the path specified)
I checked that path and it seems even the first URL to the jar (obtained via Version.class.getProtectionDomain().getCodeSource().getLocation() ) was wrong already. It should have been C:\Users\user\Documents\JavaLibs\wildfly-18.0.0.Final\standalone\tmp\vfs\temp\tempfc75b13f07296e98\content-e4d5ca96cbe6b35e\WEB-INF\lib\library-1.0.18.jar.
So this could even point to a problem in Wildfly?

It seems I found some suitable solution here:
https://stackoverflow.com/a/37325538/4222206
So in the end this code can display the correct version of the jar (at least) in JBoss:
this.getClass().getPackage().getImplementationTitle();
this.getClass().getPackage().getImplementationVersion();
Hopefully I will find this answer when I search next time...
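For the original goal of building a user-agent string, the two calls above could be wrapped as follows. This is only a sketch: the class name UserAgent, its method names and the "unknown/0" fallback are assumptions for illustration and do not come from the linked answer. The fallback matters because both getters return null when the class was not loaded from a jar whose manifest carries these attributes, for example when running from an IDE output directory.

```java
// Hypothetical helper; the class and method names are illustrative only.
final class UserAgent {

    // Builds "title/version", substituting placeholders when a manifest
    // attribute is missing (null or empty).
    static String format(String title, String version) {
        String t = (title == null || title.isEmpty()) ? "unknown" : title;
        String v = (version == null || version.isEmpty()) ? "0" : version;
        return t + "/" + v;
    }

    // Reads Implementation-Title and Implementation-Version through the
    // Package metadata of the given class.
    static String forClass(Class<?> type) {
        Package p = type.getPackage();
        if (p == null) {
            // Some class loaders do not define a Package object at all.
            return format(null, null);
        }
        return format(p.getImplementationTitle(), p.getImplementationVersion());
    }

    public static void main(String[] args) {
        System.out.println(forClass(UserAgent.class));
    }
}
```

Passing a class from the library itself (rather than this.getClass() of a possible subclass in another jar) keeps the lookup tied to the library's own manifest.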

Execute .jar file in Spoon (Pentaho Kettle)

I need to execute a java jar file from Spoon.
The program has only one class, and all I want is to run it with or without parameters.
The class is named "Limpieza", and is inside a package named:
com.overflow.csv.clean
I have deployed the jar to:
C:\Program Files (x86)\Kettle\data-integration\lib
And from a Modified JavaScriptValue step, I am calling it this way:
var jar = com.everis.csv.clean.Limpieza;
This is not working at all, is there a way around to make it work?
Also would be nice to have a way to see the logs printed by the program when it runs.
I am not getting any error when I run the transformation.
Thanks.

Check the blog below:
https://anotherreeshu.wordpress.com/2015/02/07/using-external-jars-import-in-pentaho-data-integration/
Hope this might help :)

Spoon will load any jar files present in its
data-integration\lib
folder and its subfolders during startup, so if you want to access classes from a custom jar, you could place the jar here.
So you need to create a custom jar and place the jar in
data-integration\lib
location.
When calling a custom class in a "Modified Java Script Value" step or in a "User Defined Java Class" step, use its fully qualified name. For example: var jar = com.everis.csv.clean.Limpieza.getInstance().getMyString();
Note: After placing the jar, make sure you restart Spoon.
If it still does not work, please attach the Pentaho.log (data-integration-server/logs/Pentaho.log) and catalina.out (data-integration-server/tomcat/logs) logs.

The answer was to create a User Defined Java Class (follow the guide Rishu pointed), and here is my working code:
import java.util.*;
import com.everis.csv.Cleaner;
public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
{
    Cleaner c = new Cleaner();
    c.clean();
    // The rest of it is for making it work
    // You will also need to make a Generate Rows step that inputs a row to this step.
    Object[] r = getRow();
    if (r == null) {
        setOutputDone();
        return false;
    }
    r = createOutputRow(r, data.outputRowMeta.size());
    putRow(data.outputRowMeta, r);
    return true;
}

Purpose and usage of "application.path" variable

In the Play! framework source for the main method in Server.java I found these two lines:
File root = new File(System.getProperty("application.path"));
if (System.getProperty("precompiled", "false").equals("true")) {
Play.usePrecompiled = true;
}
Where can I find the application.path value?

System.getProperty("application.path") reads a JVM system property, so this looks like a value passed with a -D option. At the start of the server there is a call like
java -Dapplication.path=/opt/play/myApp
Line 251 of /play/framework/pym/play/application.py does the work.
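As a small illustration of how a -D option reaches System.getProperty, here is a sketch. The class name ApplicationPathDemo is made up, and the fallback to the working directory is an assumption added so the example runs without the flag; the Play code quoted in the question has no such fallback.

```java
import java.io.File;

// Hypothetical demo class; only the property name comes from the question.
final class ApplicationPathDemo {

    // Reads the value passed as -Dapplication.path=..., falling back to the
    // JVM working directory when the property is not set (an assumption of
    // this sketch, not Play's behavior).
    static File applicationRoot() {
        String path = System.getProperty("application.path",
                System.getProperty("user.dir"));
        return new File(path);
    }

    public static void main(String[] args) {
        System.out.println("application root = " + applicationRoot());
    }
}
```

Started as java -Dapplication.path=/opt/play/myApp ApplicationPathDemo, it reports the path given on the command line; without the flag it reports the directory the JVM was started from.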

There might be a properties file in your application and also there might be a mechanism to load all those properties into System properties.
Search for application.path in your application folder file contents and you may get a clue.

It may be in your application.conf file. Check there.

How can I make OS X recognize drive letters?

I know. Heresy. But I'm in a bind. I have a lot of config files that use absolute path names, which creates an incompatibility between OS X and Windows. If I can get OS X (which I'm betting is the more flexible of the two) to recognize Q:/foo/bar/bim.properties as a valid absolute file name, it'll save me days of work spelunking through stack traces and config files.
In the end, I need this bit of Java test code to print "SUCCESS!" when it runs:
import java.io.*;
class DriveLetterTest {
    static public void main(String... args) {
        File f = new File("S:");
        if (f.isDirectory()) {
            System.out.println("SUCCESS!");
        } else {
            System.out.println("FAIL!");
        }
    }
}
Anyone know how this can be done?
UPDATE: Thanks for all the feedback, everyone. It's now obvious to me I really should have been clearer in my question.
Both the config files and the code that uses them belong to a third-party package I cannot change. (Well, I can change them, but that means incurring an ongoing maintenance load, which I want to avoid if at all possible.)
I'm in complete agreement with all of you who are appalled by this state of affairs. But the fact remains: I can't change the third-party code, and I really want to avoid forking the config files.

Short answer: No.
Long answer: For Java you should use System.getProperty(XXX).
Then you can load a Properties file or configuration based on what you find in os.name.
Alternate solution: just strip off the S: when you read the existing configuration files on non-Windows machines and replace it with the appropriate thing.
Opinion: Personally I would bite the bullet and deal with the technical debt now, fix all the configuration files at build time when the deployment for OS X is built, and be done with it.
public class WhichOS
{
public static void main(final String[] args)
{
System.out.format("System.getProperty(\"os.name\") = %s\n", System.getProperty("os.name"));
System.out.format("System.getProperty(\"os.arch\") = %s\n", System.getProperty("os.arch"));
System.out.format("System.getProperty(\"os.version\") = %s\n", System.getProperty("os.version"));
}
}
the output on my iMac is:
System.getProperty("os.name") = Mac OS X
System.getProperty("os.arch") = x86_64
System.getProperty("os.version") = 10.6.4

Honestly, don't hard-code absolute paths in a program, even for a single-platform app. Do the correct thing.
The following is my wrong solution, kept here to remind myself not to give misdirected advice again ... shame on me.
Just create a symbolic link named Q: just at the root directory / to / itself.
$ cd /
$ ln -s / Q:
$ ln -s / S:
You might need to use sudo. Then, at the start of your program, just chdir to /.
If you don't want Q: and S: to show up in the Finder, perform
$ /Developer/Tools/SetFile -P -a V Q:
$ /Developer/Tools/SetFile -P -a V S:
which set the invisible-to-the-Finder bit of the files.

The only way you can replace java.io.File is to replace that class in rt.jar.
I don't recommend that, but the best way to do this is to grab a bsd-port of the OpenJDK code, make necessary changes, build it and redistribute the binary with your project. Write a shell script to use your own java binary and not the built-in one.
PS. Just change your config files! Practice your regex skills and save yourself a lot of time.

If you are not willing to change your config file per OS, what are they for in first place?
Every installation should have its own set of config files and use it accordingly.
But if you insist, you just have to detect the OS and, if it is not Windows, ignore the drive letter.
Something along these lines:
boolean isWindows = System.getProperty("os.name").toLowerCase()
        .contains("windows");
String folder = "S:";
if (!isWindows && folder.matches("\\w:")) {
    folder = "/";
} else if (!isWindows && folder.matches("\\w:.+")) {
    folder = folder.substring(2); // drop the two-character drive prefix, e.g. "S:"
}
You get the idea.

Most likely you'd have to provide a different java.io.File implementation that can parse out the file paths correctly, maybe there's one someone already made.
The real solution is to put this kind of stuff (hard-coded file paths) in configuration files and not in the source code.

Just tested something out, and discovered something interesting: In Windows, if the current directory is on the same logical volume (i.e. root is the same drive letter), you can leave off the drive letter when using a path. So you could just trim off all those drive letters and colons and you should be fine as long as you aren't using paths to items on different disks.

Here's what I finally ended up doing:
I downloaded the source code for the java.io package, and tweaked the code for java.io.File to look for path names that start with a letter and a colon. If it finds one, it prepends "/Volumes/" to the path name, coughs a warning into System.err, then continues as normal.
I've added symlinks under /Volumes to the "drives" I need mapped, so I have:
/Volumes/S:
/Volumes/Q:
I put it into its own jar, and put that jar at the front of the classpath for this project only. This way, the hack affects only me, and only this project.
Net result: java.io.File sees a path like "S:/bling.properties", and then checks the OS. If the OS is OS X, it prepends "/Volumes/", and looks for a file in /Volumes/S:/bling.properties, which is fine, because it can just follow the symlink.
Yeah, it's ugly as hell. But it gets the job done for today.
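The mapping rule described above can be sketched as a standalone helper. This is not the patched java.io.File from this answer; DriveLetterMapper and its methods are hypothetical names, and a helper like this only helps where you control the code that builds the File, which the question says is not the case for the third-party package. It is shown only to make the rule explicit: leave paths alone on Windows, and elsewhere prepend /Volumes/ to anything that starts with a letter and a colon.

```java
// Hypothetical helper illustrating the "/Volumes/" mapping rule.
final class DriveLetterMapper {

    // Rewrites "S:/foo" to "/Volumes/S:/foo" on non-Windows systems and
    // returns every other path unchanged.
    static String map(String path, boolean isWindows) {
        boolean hasDrivePrefix = path.length() >= 2
                && Character.isLetter(path.charAt(0))
                && path.charAt(1) == ':';
        if (isWindows || !hasDrivePrefix) {
            return path;
        }
        return "/Volumes/" + path;
    }

    // Detects Windows from the os.name system property.
    static boolean runningOnWindows() {
        return System.getProperty("os.name").toLowerCase().contains("windows");
    }

    public static void main(String[] args) {
        System.out.println(map("S:/bling.properties", runningOnWindows()));
    }
}
```

As in the answer, the mapped path only resolves if a matching symlink such as /Volumes/S: exists.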
