How to get the filename without the extension in Java? - java

Can anyone tell me how to get the filename without the extension?
Example:
fileNameWithExt = "test.xml";
fileNameWithOutExt = "test";

If you, like me, would rather use some library code where they probably have thought of all special cases, such as what happens if you pass in null or dots in the path but not in the filename, you can use the following:
import org.apache.commons.io.FilenameUtils;
String fileNameWithOutExt = FilenameUtils.removeExtension(fileNameWithExt);

The easiest way is to use a regular expression.
fileNameWithOutExt = "test.xml".replaceFirst("[.][^.]+$", "");
The above expression will remove the last dot followed by one or more characters. Here's a basic unit test.
public void testRegex() {
assertEquals("test", "test.xml".replaceFirst("[.][^.]+$", ""));
assertEquals("test.2", "test.2.xml".replaceFirst("[.][^.]+$", ""));
}
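The expression treats whatever follows the last dot as the extension, so it helps to check a few edge cases before relying on it. The sketch below is only an illustration: the class and method names are made up, and the second pattern adds a lookbehind so that a leading dot (as in ".gitignore") is not taken for an extension.

```java
import java.util.regex.Pattern;

final class RegexExtensionDemo {
    // Same expression as above: the last dot followed by one or more non-dot characters.
    static final Pattern LAST_EXT = Pattern.compile("[.][^.]+$");
    // Assumed variant: require at least one character before the dot.
    static final Pattern LAST_EXT_NOT_LEADING = Pattern.compile("(?<=.)[.][^.]+$");

    static String strip(String name) {
        return LAST_EXT.matcher(name).replaceFirst("");
    }

    static String stripKeepingDotfiles(String name) {
        return LAST_EXT_NOT_LEADING.matcher(name).replaceFirst("");
    }

    public static void main(String[] args) {
        for (String n : new String[] {"test.xml", "archive.tar.gz", "test.", ".gitignore"}) {
            System.out.println(n + " -> [" + strip(n) + "] / [" + stripKeepingDotfiles(n) + "]");
        }
    }
}
```

With the original pattern, "test." is left unchanged (nothing follows the dot) and ".gitignore" becomes an empty string; the lookbehind variant keeps ".gitignore" intact.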

Here is the consolidated list, ordered by my preference.
Using apache commons
import org.apache.commons.io.FilenameUtils;
String fileNameWithoutExt = FilenameUtils.getBaseName(fileName);
OR
String fileNameWithOutExt = FilenameUtils.removeExtension(fileName);
Using Google Guava (if you are already using it)
import com.google.common.io.Files;
String fileNameWithOutExt = Files.getNameWithoutExtension(fileName);
Or using Core Java
1)
String fileName = file.getName();
int pos = fileName.lastIndexOf(".");
if (pos > 0 && pos < (fileName.length() - 1)) { // If '.' is not the first or last character.
fileName = fileName.substring(0, pos);
}
if (fileName.indexOf(".") > 0) {
return fileName.substring(0, fileName.lastIndexOf("."));
} else {
return fileName;
}
private static final Pattern ext = Pattern.compile("(?<=.)\\.[^.]+$");
public static String getFileNameWithoutExtension(File file) {
return ext.matcher(file.getName()).replaceAll("");
}
Liferay API
import com.liferay.portal.kernel.util.FileUtil;
String fileName = FileUtil.stripExtension(file.getName());
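If none of the libraries above is available, the core-Java variants can be combined with java.nio.file so that directory components are dropped first. The following is a minimal sketch, not taken from any of the libraries listed; the class name BaseNames and the decision to leave names such as ".bashrc" untouched are assumptions made for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class BaseNames {
    /** Returns the last path element without its extension, or null for null input. */
    static String baseName(String path) {
        if (path == null) {
            return null;
        }
        Path fileName = Paths.get(path).getFileName();
        if (fileName == null) {
            return ""; // a bare root such as "/" has no file name
        }
        String name = fileName.toString();
        int pos = name.lastIndexOf('.');
        // pos > 0 leaves dotfiles (".bashrc") and extension-less names unchanged.
        return pos > 0 ? name.substring(0, pos) : name;
    }
}
```

For example, baseName("/a/b/test.2.xml") gives "test.2", and baseName("dir.d/readme") gives "readme" because only the final element is inspected for a dot.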

See the following test program:
public class javatemp {
static String stripExtension (String str) {
// Handle null case specially.
if (str == null) return null;
// Get position of last '.'.
int pos = str.lastIndexOf(".");
// If there wasn't any '.' just return the string as is.
if (pos == -1) return str;
// Otherwise return the string, up to the dot.
return str.substring(0, pos);
}
public static void main(String[] args) {
System.out.println ("test.xml -> " + stripExtension ("test.xml"));
System.out.println ("test.2.xml -> " + stripExtension ("test.2.xml"));
System.out.println ("test -> " + stripExtension ("test"));
System.out.println ("test. -> " + stripExtension ("test."));
}
}
which outputs:
test.xml -> test
test.2.xml -> test.2
test -> test
test. -> test

If your project uses Guava (14.0 or newer), you can go with Files.getNameWithoutExtension().
(Essentially the same as FilenameUtils.removeExtension() from Apache Commons IO, as the highest-voted answer suggests. Just wanted to point out Guava does this too. Personally I didn't want to add dependency to Commons—which I feel is a bit of a relic—just because of this.)

Below is reference from https://android.googlesource.com/platform/tools/tradefederation/+/master/src/com/android/tradefed/util/FileUtil.java
/**
* Gets the base name, without extension, of given file name.
* <p/>
* e.g. getBaseName("file.txt") will return "file"
*
* @param fileName
* @return the base name
*/
public static String getBaseName(String fileName) {
int index = fileName.lastIndexOf('.');
if (index == -1) {
return fileName;
} else {
return fileName.substring(0, index);
}
}

If you don't like to import the full apache.commons, I've extracted the same functionality:
public class StringUtils {
public static String getBaseName(String filename) {
return removeExtension(getName(filename));
}
public static int indexOfLastSeparator(String filename) {
if(filename == null) {
return -1;
} else {
int lastUnixPos = filename.lastIndexOf(47);
int lastWindowsPos = filename.lastIndexOf(92);
return Math.max(lastUnixPos, lastWindowsPos);
}
}
public static String getName(String filename) {
if(filename == null) {
return null;
} else {
int index = indexOfLastSeparator(filename);
return filename.substring(index + 1);
}
}
public static String removeExtension(String filename) {
if(filename == null) {
return null;
} else {
int index = indexOfExtension(filename);
return index == -1?filename:filename.substring(0, index);
}
}
public static int indexOfExtension(String filename) {
if(filename == null) {
return -1;
} else {
int extensionPos = filename.lastIndexOf(46);
int lastSeparator = indexOfLastSeparator(filename);
return lastSeparator > extensionPos?-1:extensionPos;
}
}
}

For Kotlin it's now as simple as:
val fileNameStr = file.nameWithoutExtension

While I am a big believer in reusing libraries, the org.apache.commons.io JAR is 174KB, which is noticeably large for a mobile app.
If you download the source code and take a look at their FilenameUtils class, you can see there are a lot of extra utilities, and it does cope with Windows and Unix paths, which is all lovely.
However, if you just want a couple of static utility methods for use with Unix style paths (with a "/" separator), you may find the code below useful.
The removeExtension method preserves the rest of the path along with the filename. There is also a similar getExtension.
/**
* Remove the file extension from a filename, that may include a path.
*
* e.g. /path/to/myfile.jpg -> /path/to/myfile
*/
public static String removeExtension(String filename) {
if (filename == null) {
return null;
}
int index = indexOfExtension(filename);
if (index == -1) {
return filename;
} else {
return filename.substring(0, index);
}
}
/**
* Return the file extension from a filename, including the "."
*
* e.g. /path/to/myfile.jpg -> .jpg
*/
public static String getExtension(String filename) {
if (filename == null) {
return null;
}
int index = indexOfExtension(filename);
if (index == -1) {
return ""; // no extension found
} else {
return filename.substring(index);
}
}
private static final char EXTENSION_SEPARATOR = '.';
private static final char DIRECTORY_SEPARATOR = '/';
public static int indexOfExtension(String filename) {
if (filename == null) {
return -1;
}
// Check that no directory separator appears after the
// EXTENSION_SEPARATOR
int extensionPos = filename.lastIndexOf(EXTENSION_SEPARATOR);
int lastDirSeparator = filename.lastIndexOf(DIRECTORY_SEPARATOR);
if (lastDirSeparator > extensionPos) {
LogIt.w(FileSystemUtil.class, "A directory separator appears after the file extension, assuming there is no file extension");
return -1;
}
return extensionPos;
}

Simplest way to get name from relative path or full path is using
import org.apache.commons.io.FilenameUtils;
FilenameUtils.getBaseName(definitionFilePath)

You can use Java's split function to separate the filename from the extension, if you are sure the filename contains only one dot, the one before the extension.
File filename = new File("test.txt");
String[] split = filename.getName().split("[.]");
So split[0] will be "test" and split[1] will be "txt".

fileEntry.getName().substring(0, fileEntry.getName().lastIndexOf("."));

Given the String filename, you can do:
String filename = "test.xml";
filename.substring(0, filename.lastIndexOf(".")); // Output: test
filename.split("\\.")[0]; // Output: test

public static String getFileExtension(String fileName) {
if (TextUtils.isEmpty(fileName) || !fileName.contains(".") || fileName.endsWith(".")) return null;
return fileName.substring(fileName.lastIndexOf(".") + 1);
}
public static String getBaseFileName(String fileName) {
if (TextUtils.isEmpty(fileName) || !fileName.contains(".") || fileName.endsWith(".")) return null;
return fileName.substring(0,fileName.lastIndexOf("."));
}

Use FilenameUtils.removeExtension from Apache Commons IO
Example:
You can provide full path name or only the file name.
String myString1 = FilenameUtils.removeExtension("helloworld.exe"); // returns "helloworld"
String myString2 = FilenameUtils.removeExtension("/home/abc/yey.xls"); // returns "/home/abc/yey" (the path is preserved; use getBaseName to get just "yey")
Hope this helps ..

The fluent way:
public static String fileNameWithOutExt (String fileName) {
return Optional.of(fileName.lastIndexOf(".")).filter(i-> i >= 0)
.filter(i-> i > fileName.lastIndexOf(File.separator))
.map(i-> fileName.substring(0, i)).orElse(fileName);
}

You can split on "." so that index 0 holds the file name and index 1 the extension, but I would lean towards the best solution, FilenameUtils from Apache commons-io, as mentioned in the first answer. The extension does not have to be removed explicitly; this is sufficient:
String fileName = FilenameUtils.getBaseName("test.xml");

Keeping it simple, use Java's String.replaceAll() method as follows:
String fileNameWithExt = "test.xml";
String fileNameWithoutExt
= fileNameWithExt.replaceAll( "^.*?(([^/\\\\\\.]+))\\.[^\\.]+$", "$1" );
This also works when fileNameWithExt includes the fully qualified path.
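One caveat worth checking: because the capture group excludes dots, a name with several dots keeps only the segment before the final extension. The sketch below compares the expression above with a two-step alternative; the class and method names and the alternative itself are assumptions for illustration, not part of the answer.

```java
final class PathRegexDemo {
    // The expression from the answer above, unchanged.
    static String viaAnswerRegex(String s) {
        return s.replaceAll("^.*?(([^/\\\\\\.]+))\\.[^\\.]+$", "$1");
    }

    // Assumed alternative: first drop everything up to the last '/' or '\',
    // then drop the last ".ext" unless the dot is the first character.
    static String viaTwoSteps(String s) {
        return s.replaceFirst("^.*[/\\\\]", "")
                .replaceFirst("(?<=.)[.][^.]+$", "");
    }

    public static void main(String[] args) {
        String[] samples = {"test.xml", "/home/abc/test.xml", "test.2.xml", "C:\\dir\\a.b.txt"};
        for (String s : samples) {
            System.out.println(s + " -> " + viaAnswerRegex(s) + " | " + viaTwoSteps(s));
        }
    }
}
```

For "test.2.xml" the original expression yields "2", while the two-step variant yields "test.2"; both return "test" for "/home/abc/test.xml".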

My solution needs the following import.
import java.io.File;
The following method should return the desired output string:
private static String getFilenameWithoutExtension(File file) throws IOException {
String filename = file.getCanonicalPath();
String filenameWithoutExtension;
if (filename.contains("."))
filenameWithoutExtension = filename.substring(filename.lastIndexOf(System.getProperty("file.separator"))+1, filename.lastIndexOf('.'));
else
filenameWithoutExtension = filename.substring(filename.lastIndexOf(System.getProperty("file.separator"))+1);
return filenameWithoutExtension;
}

com.google.common.io.Files
Files.getNameWithoutExtension(sourceFile.getName())
can do the job as well.

This returns the file name only, even when the full path is included. No need for external libraries, regex, etc.
public class MyClass {
public static void main(String args[]) {
String file = "some/long/directory/blah.x.y.z.m.xml";
System.out.println(file.substring(file.lastIndexOf("/") + 1, file.lastIndexOf(".")));
//outputs blah.x.y.z.m
}
}

Try the code below. Using core Java basic functions. It takes care of Strings with extension, and without extension (without the '.' character). The case of multiple '.' is also covered.
String str = "filename.xml";
if (!str.contains("."))
System.out.println("File Name=" + str);
else {
str = str.substring(0, str.lastIndexOf("."));
// Because extension is always after the last '.'
System.out.println("File Name=" + str);
}
You can adapt it to work with null strings.

Related

How to use var name dynamically in [java]

I need to call a function with variable names that differ only slightly. Example:
private final String REQ_DROPDOWN_1 = "//div[text()='XXX']";
private final String REQ_DROPDOWN_2 = "//div[text()='YYY']";
public boolean goodVar(String num){
return this.IsVisible(REQ_DROPDOWN_ + num);
}
How can I use the name of the variable dynamically?
The simplest solution is this:
private final String[] REQ_DROPDOWNS = {
"//div[text()='XXX']", "//div[text()='YYY']"};
// NB: I have changed the argument type to 'int'
public Boolean goodVar(int num) {
if (num > 0 && num <= REQ_DROPDOWNS.length) {
return this.IsVisible(REQ_DROPDOWNS[num - 1]);
} else {
throw new IllegalArgumentException("Num out of range: " + num);
}
}
Java does not support dynamic variables; see https://stackoverflow.com/questions/6729605.
You could dynamically lookup a Field and then access it using reflection, but it is more complicated and error prone to do that.
You could also use a HashMap, but that too is unnecessarily complicated for the use-case in your example. But if you wanted the name lookup to be more flexible, this would be a good option.
private final Map<String, String> map = new HashMap<>(){{
put("REQ_DROPDOWN_1", "//div[text()='XXX']");
put("REQ_DROPDOWN_2", "//div[text()='YYY']");
}};
public Boolean goodVar(String suffix) {
String path = map.get("REQ_DROPDOWN_" + suffix);
if (path == null) {
throw new IllegalArgumentException("Unknown suffix: " + suffix);
}
return this.IsVisible(path);
}
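For completeness, the reflection route mentioned above could look roughly like the following. It is a sketch only: the class name DropdownLocators and the method locatorFor are hypothetical, and the array or map versions remain the simpler choice.

```java
import java.lang.reflect.Field;

final class DropdownLocators {
    private final String REQ_DROPDOWN_1 = "//div[text()='XXX']";
    private final String REQ_DROPDOWN_2 = "//div[text()='YYY']";

    /** Looks up the field REQ_DROPDOWN_<num> by name and returns its value. */
    String locatorFor(String num) {
        try {
            Field field = DropdownLocators.class.getDeclaredField("REQ_DROPDOWN_" + num);
            field.setAccessible(true);
            return (String) field.get(this);
        } catch (NoSuchFieldException | IllegalAccessException e) {
            // A misspelled suffix only shows up at runtime, which is the main
            // drawback compared with an array or a map.
            throw new IllegalArgumentException("Unknown suffix: " + num, e);
        }
    }
}
```

Calling new DropdownLocators().locatorFor("2") returns "//div[text()='YYY']"; an unknown suffix ends in an IllegalArgumentException rather than a compile-time error.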

Use JLine to Complete Multiple Commands on One Line

I was wondering how I could implement an ArgumentCompleter such that if I complete a full and valid command, then it would begin tab completing for a new command.
I would have assumed it could be constructed doing something like this:
final ConsoleReader consoleReader = new ConsoleReader();
final ArgumentCompleter cyclicalArgument = new ArgumentCompleter();
cyclicalArgument.getCompleters().addAll(Arrays.asList(
new StringsCompleter("foo"),
new StringsCompleter("bar"),
cyclicalArgument));
consoleReader.addCompleter(cyclicalArgument);
consoleReader.readLine();
However, right now this stops working after tab-completing the first foo bar.
Is anyone familiar enough with the library to tell me how I would go about implementing this? Or is there a known way to do this that I am missing? Also this is using JLine2.
That was quite a task :-)
It is handled by the completer you are using. The complete() method of the completer has to use for the search only what comes after the last blank.
If you look for example at the FileNameCompleter of the library: this is not done at all, so you will find no completion, because the completer searches for <input1> <input2> and not only for <input2> :-)
You will have to do your own implementation of a completer that is able to find input2.
Additionally the CompletionHandler has to append what you found to what you already typed.
Here is a basic implementation changing the default FileNameCompleter:
protected int matchFiles(final String buffer, final String translated, final File[] files,
final List<CharSequence> candidates) {
// THIS IS NEW
String[] allWords = translated.split(" ");
String lastWord = allWords[allWords.length - 1];
// the lastWord is used when searching the files now
// ---
if (files == null) {
return -1;
}
int matches = 0;
// first pass: just count the matches
for (File file : files) {
if (file.getAbsolutePath().startsWith(lastWord)) {
matches++;
}
}
for (File file : files) {
if (file.getAbsolutePath().startsWith(lastWord)) {
CharSequence name = file.getName() + (matches == 1 && file.isDirectory() ? this.separator() : " ");
candidates.add(this.render(file, name).toString());
}
}
final int index = buffer.lastIndexOf(this.separator());
return index + this.separator().length();
}
And here the complete()-Method of the CompletionHandler changing the default CandidateListCompletionHandler:
@Override
public boolean complete(final ConsoleReader reader, final List<CharSequence> candidates, final int pos)
throws IOException {
CursorBuffer buf = reader.getCursorBuffer();
// THIS IS NEW
String[] allWords = buf.toString().split(" ");
String firstWords = "";
if (allWords.length > 1) {
for (int i = 0; i < allWords.length - 1; i++) {
firstWords += allWords[i] + " ";
}
}
//-----
// if there is only one completion, then fill in the buffer
if (candidates.size() == 1) {
String value = Ansi.stripAnsi(candidates.get(0).toString());
if (buf.cursor == buf.buffer.length() && this.printSpaceAfterFullCompletion && !value.endsWith(" ")) {
value += " ";
}
// fail if the only candidate is the same as the current buffer
if (value.equals(buf.toString())) {
return false;
}
// firstWords already ends with a blank (or is empty), so no extra separator is added
CandidateListCompletionHandler.setBuffer(reader, firstWords + value, pos);
return true;
} else if (candidates.size() > 1) {
String value = this.getUnambiguousCompletions(candidates);
CandidateListCompletionHandler.setBuffer(reader, value, pos);
}
CandidateListCompletionHandler.printCandidates(reader, candidates);
// redraw the current console buffer
reader.drawLine();
return true;
}
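The core idea of both changes, completing only what follows the last blank and keeping everything before it, can be sketched without JLine at all. The helper below is hypothetical (the names LastWord, lastWord and head are not part of the library). Unlike split(" "), it treats a buffer that ends in a blank as the start of a new, empty word.

```java
final class LastWord {
    /** Index at which the word under completion starts (0 if there is no blank). */
    static int lastWordStart(String buffer) {
        return buffer.lastIndexOf(' ') + 1;
    }

    /** The part a completer should search for. */
    static String lastWord(String buffer) {
        return buffer.substring(lastWordStart(buffer));
    }

    /** The already-typed part that the handler should put back in front of the match. */
    static String head(String buffer) {
        return buffer.substring(0, lastWordStart(buffer));
    }
}
```

For the buffer "foo bar ba", lastWord returns "ba" and head returns "foo bar ", so head + completedValue rebuilds the full line.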

Getting local path of a file [duplicate]

Given two absolute paths, e.g.
/var/data/stuff/xyz.dat
/var/data
How can one create a relative path that uses the second path as its base? In the example above, the result should be: ./stuff/xyz.dat
It's a little roundabout, but why not use URI? It has a relativize method which does all the necessary checks for you.
String path = "/var/data/stuff/xyz.dat";
String base = "/var/data";
String relative = new File(base).toURI().relativize(new File(path).toURI()).getPath();
// relative == "stuff/xyz.dat"
Please note that for file paths there's java.nio.file.Path#relativize since Java 1.7, as pointed out by @Jirka Meluzin in the other answer.
Since Java 7 you can use the relativize method:
import java.nio.file.Path;
import java.nio.file.Paths;
public class Test {
public static void main(String[] args) {
Path pathAbsolute = Paths.get("/var/data/stuff/xyz.dat");
Path pathBase = Paths.get("/var/data");
Path pathRelative = pathBase.relativize(pathAbsolute);
System.out.println(pathRelative);
}
}
Output:
stuff/xyz.dat
At the time of writing (June 2010), this was the only solution that passed my test cases. I can't guarantee that this solution is bug-free, but it does pass the included test cases. The method and tests I've written depend on the FilenameUtils class from Apache commons IO.
The solution was tested with Java 1.4. If you're using Java 1.5 (or higher) you should consider replacing StringBuffer with StringBuilder (if you're still using Java 1.4 you should consider a change of employer instead).
import java.io.File;
import java.util.regex.Pattern;
import org.apache.commons.io.FilenameUtils;
public class ResourceUtils {
/**
* Get the relative path from one file to another, specifying the directory separator.
* If one of the provided resources does not exist, it is assumed to be a file unless it ends with '/' or
* '\'.
*
* @param targetPath targetPath is calculated to this file
* @param basePath basePath is calculated from this file
* @param pathSeparator directory separator. The platform default is not assumed so that we can test Unix behaviour when running on Windows (for example)
* @return
*/
public static String getRelativePath(String targetPath, String basePath, String pathSeparator) {
// Normalize the paths
String normalizedTargetPath = FilenameUtils.normalizeNoEndSeparator(targetPath);
String normalizedBasePath = FilenameUtils.normalizeNoEndSeparator(basePath);
// Undo the changes to the separators made by normalization
if (pathSeparator.equals("/")) {
normalizedTargetPath = FilenameUtils.separatorsToUnix(normalizedTargetPath);
normalizedBasePath = FilenameUtils.separatorsToUnix(normalizedBasePath);
} else if (pathSeparator.equals("\\")) {
normalizedTargetPath = FilenameUtils.separatorsToWindows(normalizedTargetPath);
normalizedBasePath = FilenameUtils.separatorsToWindows(normalizedBasePath);
} else {
throw new IllegalArgumentException("Unrecognised dir separator '" + pathSeparator + "'");
}
String[] base = normalizedBasePath.split(Pattern.quote(pathSeparator));
String[] target = normalizedTargetPath.split(Pattern.quote(pathSeparator));
// First get all the common elements. Store them as a string,
// and also count how many of them there are.
StringBuffer common = new StringBuffer();
int commonIndex = 0;
while (commonIndex < target.length && commonIndex < base.length
&& target[commonIndex].equals(base[commonIndex])) {
common.append(target[commonIndex] + pathSeparator);
commonIndex++;
}
if (commonIndex == 0) {
// No single common path element. This most
// likely indicates differing drive letters, like C: and D:.
// These paths cannot be relativized.
throw new PathResolutionException("No common path element found for '" + normalizedTargetPath + "' and '" + normalizedBasePath
+ "'");
}
// The number of directories we have to backtrack depends on whether the base is a file or a dir
// For example, the relative path from
//
// /foo/bar/baz/gg/ff to /foo/bar/baz
//
// ".." if ff is a file
// "../.." if ff is a directory
//
// The following is a heuristic to figure out if the base refers to a file or dir. It's not perfect, because
// the resource referred to by this path may not actually exist, but it's the best I can do
boolean baseIsFile = true;
File baseResource = new File(normalizedBasePath);
if (baseResource.exists()) {
baseIsFile = baseResource.isFile();
} else if (basePath.endsWith(pathSeparator)) {
baseIsFile = false;
}
StringBuffer relative = new StringBuffer();
if (base.length != commonIndex) {
int numDirsUp = baseIsFile ? base.length - commonIndex - 1 : base.length - commonIndex;
for (int i = 0; i < numDirsUp; i++) {
relative.append(".." + pathSeparator);
}
}
relative.append(normalizedTargetPath.substring(common.length()));
return relative.toString();
}
static class PathResolutionException extends RuntimeException {
PathResolutionException(String msg) {
super(msg);
}
}
}
The test cases that this passes are
public void testGetRelativePathsUnix() {
assertEquals("stuff/xyz.dat", ResourceUtils.getRelativePath("/var/data/stuff/xyz.dat", "/var/data/", "/"));
assertEquals("../../b/c", ResourceUtils.getRelativePath("/a/b/c", "/a/x/y/", "/"));
assertEquals("../../b/c", ResourceUtils.getRelativePath("/m/n/o/a/b/c", "/m/n/o/a/x/y/", "/"));
}
public void testGetRelativePathFileToFile() {
String target = "C:\\Windows\\Boot\\Fonts\\chs_boot.ttf";
String base = "C:\\Windows\\Speech\\Common\\sapisvr.exe";
String relPath = ResourceUtils.getRelativePath(target, base, "\\");
assertEquals("..\\..\\Boot\\Fonts\\chs_boot.ttf", relPath);
}
public void testGetRelativePathDirectoryToFile() {
String target = "C:\\Windows\\Boot\\Fonts\\chs_boot.ttf";
String base = "C:\\Windows\\Speech\\Common\\";
String relPath = ResourceUtils.getRelativePath(target, base, "\\");
assertEquals("..\\..\\Boot\\Fonts\\chs_boot.ttf", relPath);
}
public void testGetRelativePathFileToDirectory() {
String target = "C:\\Windows\\Boot\\Fonts";
String base = "C:\\Windows\\Speech\\Common\\foo.txt";
String relPath = ResourceUtils.getRelativePath(target, base, "\\");
assertEquals("..\\..\\Boot\\Fonts", relPath);
}
public void testGetRelativePathDirectoryToDirectory() {
String target = "C:\\Windows\\Boot\\";
String base = "C:\\Windows\\Speech\\Common\\";
String expected = "..\\..\\Boot";
String relPath = ResourceUtils.getRelativePath(target, base, "\\");
assertEquals(expected, relPath);
}
public void testGetRelativePathDifferentDriveLetters() {
String target = "D:\\sources\\recovery\\RecEnv.exe";
String base = "C:\\Java\\workspace\\AcceptanceTests\\Standard test data\\geo\\";
try {
ResourceUtils.getRelativePath(target, base, "\\");
fail();
} catch (PathResolutionException ex) {
// expected exception
}
}
When using java.net.URI.relativize you should be aware of Java bug:
JDK-6226081 (URI should be able to relativize paths with partial roots)
At the moment, the relativize() method of URI will only relativize URIs when one is a prefix of the other.
Which essentially means java.net.URI.relativize will not create ".."'s for you.
In Java 7 and later you can simply use (and in contrast to URI, it is bug free):
Path#relativize(Path)
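The difference is easy to see side by side. The following is a small sketch with made-up class and method names; it uses scheme-less URIs and normalizes backslashes to forward slashes so the result reads the same on Windows.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

final class RelativizeComparison {
    // URI.relativize only strips the base when it is a prefix of the target;
    // otherwise the target URI comes back unchanged.
    static String viaUri(String base, String target) {
        return URI.create(base).relativize(URI.create(target)).getPath();
    }

    // Path.relativize also walks upwards with ".." when needed.
    static String viaPath(String base, String target) {
        Path relative = Paths.get(base).relativize(Paths.get(target));
        return relative.toString().replace('\\', '/');
    }

    public static void main(String[] args) {
        System.out.println(viaUri("/var/data/", "/var/data/stuff/xyz.dat")); // stuff/xyz.dat
        System.out.println(viaUri("/var/data/", "/var/other/xyz.dat"));      // /var/other/xyz.dat
        System.out.println(viaPath("/var/data", "/var/other/xyz.dat"));      // ../other/xyz.dat
    }
}
```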
The bug referred to in another answer is addressed by URIUtils in Apache HttpComponents
public static URI resolve(URI baseURI, String reference)
Resolves a URI reference against a base URI. Work-around for bug in java.net.URI ()
If you know the second string is part of the first:
String s1 = "/var/data/stuff/xyz.dat";
String s2 = "/var/data";
String s3 = s1.substring(s2.length());
or if you really want the period at the beginning as in your example:
String s3 = ".".concat(s1.substring(s2.length()));
Recursion produces a smaller solution. This throws an exception if the result is impossible (e.g. different Windows disk) or impractical (root is only common directory.)
/**
* Computes the path for a file relative to a given base, or fails if the only shared
* directory is the root and the absolute form is better.
*
* @param base File that is the base for the result
* @param name File to be "relativized"
* @return the relative name
* @throws IOException if files have no common sub-directories, i.e. at best share the
* root prefix "/" or "C:\"
*/
public static String getRelativePath(File base, File name) throws IOException {
File parent = base.getParentFile();
if (parent == null) {
throw new IOException("No common directory");
}
String bpath = base.getCanonicalPath();
String fpath = name.getCanonicalPath();
if (fpath.startsWith(bpath)) {
return fpath.substring(bpath.length() + 1);
} else {
return (".." + File.separator + getRelativePath(parent, name));
}
}
Here is a solution that needs no external library:
Path sourceFile = Paths.get("some/common/path/example/a/b/c/f1.txt");
Path targetFile = Paths.get("some/common/path/example/d/e/f2.txt");
Path relativePath = sourceFile.relativize(targetFile);
System.out.println(relativePath);
Outputs
..\..\..\..\d\e\f2.txt
[EDIT] Actually, it outputs one more ..\ than needed because the source is a file, not a directory. The correct solution for my case is:
Path sourceFile = Paths.get("some/common/path/example/a/b/c/f1.txt").getParent();
Path targetFile = Paths.get("some/common/path/example/d/e/f2.txt");
Path relativePath = sourceFile.relativize(targetFile);
System.out.println(relativePath);
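The file-versus-directory distinction can be wrapped in a small helper. This is a sketch under the assumption that the first argument always denotes a file; the class and method names are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class RelativeToFile {
    /**
     * Relativizes target against the directory containing sourceFile,
     * since links are resolved from that directory, not from the file itself.
     */
    static Path relativeToFile(Path sourceFile, Path target) {
        Path baseDir = sourceFile.getParent();
        // A bare file name has no parent; the target is then already relative to its directory.
        return baseDir == null ? target : baseDir.relativize(target);
    }

    public static void main(String[] args) {
        Path source = Paths.get("some/common/path/example/a/b/c/f1.txt");
        Path target = Paths.get("some/common/path/example/d/e/f2.txt");
        System.out.println(relativeToFile(source, target));
    }
}
```

With the paths from the example above this climbs three levels (c, b, a) instead of four before descending into d/e/f2.txt.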
My version is loosely based on Matt and Steve's versions:
/**
* Returns the path of one File relative to another.
*
* @param target the target directory
* @param base the base directory
* @return target's path relative to the base directory
* @throws IOException if an error occurs while resolving the files' canonical names
*/
public static File getRelativeFile(File target, File base) throws IOException
{
String[] baseComponents = base.getCanonicalPath().split(Pattern.quote(File.separator));
String[] targetComponents = target.getCanonicalPath().split(Pattern.quote(File.separator));
// skip common components
int index = 0;
for (; index < targetComponents.length && index < baseComponents.length; ++index)
{
if (!targetComponents[index].equals(baseComponents[index]))
break;
}
StringBuilder result = new StringBuilder();
if (index != baseComponents.length)
{
// backtrack to base directory
for (int i = index; i < baseComponents.length; ++i)
result.append(".." + File.separator);
}
for (; index < targetComponents.length; ++index)
result.append(targetComponents[index] + File.separator);
if (!target.getPath().endsWith("/") && !target.getPath().endsWith("\\"))
{
// remove final path separator
result.delete(result.length() - File.separator.length(), result.length());
}
return new File(result.toString());
}
Matt B's solution gets the number of directories to backtrack wrong -- it should be the length of the base path minus the number of common path elements, minus one (for the last path element, which is either a filename or a trailing "" generated by split). It happens to work with /a/b/c/ and /a/x/y/, but replace the arguments with /m/n/o/a/b/c/ and /m/n/o/a/x/y/ and you will see the problem.
Also, it needs an else break inside the first for loop, or it will mishandle paths that happen to have matching directory names, such as /a/b/c/d/ and /x/y/c/z -- the c is in the same slot in both arrays, but is not an actual match.
All these solutions lack the ability to handle paths that cannot be relativized to one another because they have incompatible roots, such as C:\foo\bar and D:\baz\quux. Probably only an issue on Windows, but worth noting.
I spent far longer on this than I intended, but that's okay. I actually needed this for work, so thank you to everyone who has chimed in, and I'm sure there will be corrections to this version too!
public static String getRelativePath(String targetPath, String basePath,
String pathSeparator) {
// We need the -1 argument to split to make sure we get a trailing
// "" token if the base ends in the path separator and is therefore
// a directory. We require directory paths to end in the path
// separator -- otherwise they are indistinguishable from files.
String[] base = basePath.split(Pattern.quote(pathSeparator), -1);
String[] target = targetPath.split(Pattern.quote(pathSeparator), 0);
// First get all the common elements. Store them as a string,
// and also count how many of them there are.
String common = "";
int commonIndex = 0;
for (int i = 0; i < target.length && i < base.length; i++) {
if (target[i].equals(base[i])) {
common += target[i] + pathSeparator;
commonIndex++;
}
else break;
}
if (commonIndex == 0)
{
// Whoops -- not even a single common path element. This most
// likely indicates differing drive letters, like C: and D:.
// These paths cannot be relativized. Return the target path.
return targetPath;
// This should never happen when all absolute paths
// begin with / as in *nix.
}
String relative = "";
if (base.length == commonIndex) {
// Comment this out if you prefer that a relative path not start with ./
//relative = "." + pathSeparator;
}
else {
int numDirsUp = base.length - commonIndex - 1;
// The number of directories we have to backtrack is the length of
// the base path MINUS the number of common path elements, minus
// one because the last element in the path isn't a directory.
for (int i = 1; i <= (numDirsUp); i++) {
relative += ".." + pathSeparator;
}
}
relative += targetPath.substring(common.length());
return relative;
}
And here are tests to cover several cases:
public void testGetRelativePathsUnixy()
{
assertEquals("stuff/xyz.dat", FileUtils.getRelativePath(
"/var/data/stuff/xyz.dat", "/var/data/", "/"));
assertEquals("../../b/c", FileUtils.getRelativePath(
"/a/b/c", "/a/x/y/", "/"));
assertEquals("../../b/c", FileUtils.getRelativePath(
"/m/n/o/a/b/c", "/m/n/o/a/x/y/", "/"));
}
public void testGetRelativePathFileToFile()
{
String target = "C:\\Windows\\Boot\\Fonts\\chs_boot.ttf";
String base = "C:\\Windows\\Speech\\Common\\sapisvr.exe";
String relPath = FileUtils.getRelativePath(target, base, "\\");
assertEquals("..\\..\\Boot\\Fonts\\chs_boot.ttf", relPath);
}
public void testGetRelativePathDirectoryToFile()
{
String target = "C:\\Windows\\Boot\\Fonts\\chs_boot.ttf";
String base = "C:\\Windows\\Speech\\Common\\";
String relPath = FileUtils.getRelativePath(target, base, "\\");
assertEquals("..\\..\\Boot\\Fonts\\chs_boot.ttf", relPath);
}
public void testGetRelativePathDifferentDriveLetters()
{
String target = "D:\\sources\\recovery\\RecEnv.exe";
String base = "C:\\Java\\workspace\\AcceptanceTests\\Standard test data\\geo\\";
// Should just return the target path because of the incompatible roots.
String relPath = FileUtils.getRelativePath(target, base, "\\");
assertEquals(target, relPath);
}
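On Java 7 and later, the incompatible-roots case (C:\foo\bar versus D:\baz\quux) can be detected up front by comparing the roots of the two paths. The sketch below is an assumption-laden illustration: the class name SafeRelativize and the choice to return an empty Optional instead of the target path are not taken from the answers above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Objects;
import java.util.Optional;

final class SafeRelativize {
    /**
     * Returns target relative to base, or empty when the two paths do not share
     * a root (different drive letters, or one absolute and one relative path).
     */
    static Optional<Path> tryRelativize(Path base, Path target) {
        if (!Objects.equals(base.getRoot(), target.getRoot())) {
            return Optional.empty();
        }
        return Optional.of(base.relativize(target));
    }

    public static void main(String[] args) {
        System.out.println(tryRelativize(Paths.get("/a/b"), Paths.get("/a/c/d")));
        System.out.println(tryRelativize(Paths.get("/a/b"), Paths.get("rel/x")));
    }
}
```

The caller can then decide whether to fall back to the absolute target path, as the version above does, or to report an error.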
Actually my other answer didn't work if the target path wasn't a child of the base path.
This should work.
public class RelativePathFinder {
public static String getRelativePath(String targetPath, String basePath,
String pathSeparator) {
// find common path
String[] target = targetPath.split(pathSeparator);
String[] base = basePath.split(pathSeparator);
String common = "";
int commonIndex = 0;
for (int i = 0; i < target.length && i < base.length; i++) {
if (target[i].equals(base[i])) {
common += target[i] + pathSeparator;
commonIndex++;
}
}
String relative = "";
// is the target a child directory of the base directory?
// i.e., target = /a/b/c/d, base = /a/b/
if (commonIndex == base.length) {
relative = "." + pathSeparator + targetPath.substring(common.length());
}
else {
// determine how many directories we have to backtrack
for (int i = 1; i <= commonIndex; i++) {
relative += ".." + pathSeparator;
}
relative += targetPath.substring(common.length());
}
return relative;
}
public static String getRelativePath(String targetPath, String basePath) {
// File.separator is the directory separator; File.pathSeparator separates entries in a path list
return getRelativePath(targetPath, basePath, File.separator);
}
}
public class RelativePathFinderTest extends TestCase {
public void testGetRelativePath() {
assertEquals("./stuff/xyz.dat", RelativePathFinder.getRelativePath(
"/var/data/stuff/xyz.dat", "/var/data/", "/"));
assertEquals("../../b/c", RelativePathFinder.getRelativePath("/a/b/c",
"/a/x/y/", "/"));
}
}
Cool!! I need a bit of code like this but for comparing directory paths on Linux machines. I found that this wasn't working in situations where a parent directory was the target.
Here is a directory friendly version of the method:
public static String getRelativePath(String targetPath, String basePath,
String pathSeparator) {
boolean isDir = false;
{
File f = new File(targetPath);
isDir = f.isDirectory();
}
// We need the -1 argument to split to make sure we get a trailing
// "" token if the base ends in the path separator and is therefore
// a directory. We require directory paths to end in the path
// separator -- otherwise they are indistinguishable from files.
String[] base = basePath.split(Pattern.quote(pathSeparator), -1);
String[] target = targetPath.split(Pattern.quote(pathSeparator), 0);
// First get all the common elements. Store them as a string,
// and also count how many of them there are.
String common = "";
int commonIndex = 0;
for (int i = 0; i < target.length && i < base.length; i++) {
if (target[i].equals(base[i])) {
common += target[i] + pathSeparator;
commonIndex++;
}
else break;
}
if (commonIndex == 0)
{
// Whoops -- not even a single common path element. This most
// likely indicates differing drive letters, like C: and D:.
// These paths cannot be relativized. Return the target path.
return targetPath;
// This should never happen when all absolute paths
// begin with / as in *nix.
}
String relative = "";
if (base.length == commonIndex) {
// Comment this out if you prefer that a relative path not start with ./
relative = "." + pathSeparator;
}
else {
int numDirsUp = base.length - commonIndex - (isDir?0:1); /* only subtract 1 if it is a file. */
// The number of directories we have to backtrack is the length of
// the base path MINUS the number of common path elements, minus
// one because the last element in the path isn't a directory.
for (int i = 1; i <= (numDirsUp); i++) {
relative += ".." + pathSeparator;
}
}
// Append whatever is left of the target path after the common part.
if (targetPath.length() > common.length()) {
relative += targetPath.substring(common.length());
}
return relative;
}
I'm assuming you have fromPath (an absolute path for a folder) and toPath (an absolute path for a folder/file), and you're looking for a path that will represent the file/folder in toPath as a relative path from fromPath (your current working directory is fromPath). Then something like this should work:
public static String getRelativePath(String fromPath, String toPath) {
// A backslash separator is a regex escape character, so it must be doubled for String.split()
String regexCharacter = File.separator;
if (File.separatorChar == '\\') {
regexCharacter = "\\\\";
}
String[] fromSplit = fromPath.split(regexCharacter);
String[] toSplit = toPath.split(regexCharacter);
// Find the common path
int common = 0;
while (common < fromSplit.length && common < toSplit.length && fromSplit[common].equals(toSplit[common])) {
common++;
}
StringBuffer result = new StringBuffer(".");
// Work your way up the FROM path to common ground
for (int i = common; i < fromSplit.length; i++) {
result.append(File.separatorChar).append("..");
}
// Work your way down the TO path
for (int i = common; i < toSplit.length; i++) {
result.append(File.separatorChar).append(toSplit[i]);
}
return result.toString();
}
Lots of answers already here, but I found they didn't handle all cases, such as the base and target being the same. This function takes a base directory and a target path and returns the relative path. If no relative path exists, the target path is returned. File.separator is unnecessary.
public static String getRelativePath (String baseDir, String targetPath) {
String[] base = baseDir.replace('\\', '/').split("\\/");
targetPath = targetPath.replace('\\', '/');
String[] target = targetPath.split("\\/");
// Count common elements and their length.
int commonCount = 0, commonLength = 0, maxCount = Math.min(target.length, base.length);
while (commonCount < maxCount) {
String targetElement = target[commonCount];
if (!targetElement.equals(base[commonCount])) break;
commonCount++;
commonLength += targetElement.length() + 1; // Directory name length plus slash.
}
if (commonCount == 0) return targetPath; // No common path element.
int targetLength = targetPath.length();
int dirsUp = base.length - commonCount;
StringBuffer relative = new StringBuffer(dirsUp * 3 + targetLength - commonLength + 1);
for (int i = 0; i < dirsUp; i++)
relative.append("../");
if (commonLength < targetLength) relative.append(targetPath.substring(commonLength));
return relative.toString();
}
Here is a method that resolves a relative path from a base path, regardless of whether they are in the same root or in different roots:
public static String GetRelativePath(String path, String base){
final String SEP = "/";
// if base is not a directory -> return empty
if (!base.endsWith(SEP)){
return "";
}
// check if path is a file -> remove last "/" at the end of the method
boolean isfile = !path.endsWith(SEP);
// get URIs and split them by using the separator
String a = "";
String b = "";
try {
a = new File(base).getCanonicalFile().toURI().getPath();
b = new File(path).getCanonicalFile().toURI().getPath();
} catch (IOException e) {
e.printStackTrace();
}
String[] basePaths = a.split(SEP);
String[] otherPaths = b.split(SEP);
// check common part
int n = 0;
for(; n < basePaths.length && n < otherPaths.length; n ++)
{
if( basePaths[n].equals(otherPaths[n]) == false )
break;
}
// compose the new path
StringBuffer tmp = new StringBuffer("");
for(int m = n; m < basePaths.length; m ++)
tmp.append(".."+SEP);
for(int m = n; m < otherPaths.length; m ++)
{
tmp.append(otherPaths[m]);
tmp.append(SEP);
}
// get path string
String result = tmp.toString();
// remove last "/" if path is a file
if (isfile && result.endsWith(SEP)){
result = result.substring(0,result.length()-1);
}
return result;
}
This passes Dónal's tests. The only change: if there is no common root, it returns the target path (which could already be relative).
import static java.util.Arrays.asList;
import static java.util.Collections.nCopies;
import static org.apache.commons.io.FilenameUtils.normalizeNoEndSeparator;
import static org.apache.commons.io.FilenameUtils.separatorsToUnix;
import static org.apache.commons.lang3.StringUtils.getCommonPrefix;
import static org.apache.commons.lang3.StringUtils.isBlank;
import static org.apache.commons.lang3.StringUtils.isNotEmpty;
import static org.apache.commons.lang3.StringUtils.join;
import java.io.File;
import java.util.ArrayList;
import java.util.List;
public class ResourceUtils {
public static String getRelativePath(String targetPath, String basePath, String pathSeparator) {
File baseFile = new File(basePath);
if (baseFile.isFile() || !baseFile.exists() && !basePath.endsWith("/") && !basePath.endsWith("\\"))
basePath = baseFile.getParent();
String target = separatorsToUnix(normalizeNoEndSeparator(targetPath));
String base = separatorsToUnix(normalizeNoEndSeparator(basePath));
String commonPrefix = getCommonPrefix(target, base);
if (isBlank(commonPrefix))
return targetPath.replaceAll("/", pathSeparator);
// Strip the prefix by length; replaceFirst would treat it as a regex.
target = target.substring(commonPrefix.length());
base = base.substring(commonPrefix.length());
List<String> result = new ArrayList<>();
if (isNotEmpty(base))
result.addAll(nCopies(base.split("/").length, ".."));
result.addAll(asList(target.replaceFirst("^/", "").split("/")));
return join(result, pathSeparator);
}
}
If you're writing a Maven plugin, you can use Plexus' PathTool:
import org.codehaus.plexus.util.PathTool;
String relativeFilePath = PathTool.getRelativeFilePath(file1, file2);
If Paths is not available (for example, on a JRE 1.5 runtime or in a Maven plugin), you can use this:
package org.afc.util;
import java.io.File;
import java.util.LinkedList;
import java.util.List;
public class FileUtil {
public static String getRelativePath(String basePath, String filePath) {
return getRelativePath(new File(basePath), new File(filePath));
}
public static String getRelativePath(File base, File file) {
List<String> bases = new LinkedList<String>();
bases.add(0, base.getName());
for (File parent = base.getParentFile(); parent != null; parent = parent.getParentFile()) {
bases.add(0, parent.getName());
}
List<String> files = new LinkedList<String>();
files.add(0, file.getName());
for (File parent = file.getParentFile(); parent != null; parent = parent.getParentFile()) {
files.add(0, parent.getName());
}
int overlapIndex = 0;
while (overlapIndex < bases.size() && overlapIndex < files.size() && bases.get(overlapIndex).equals(files.get(overlapIndex))) {
overlapIndex++;
}
StringBuilder relativePath = new StringBuilder();
for (int i = overlapIndex; i < bases.size(); i++) {
relativePath.append("..").append(File.separatorChar);
}
for (int i = overlapIndex; i < files.size(); i++) {
relativePath.append(files.get(i)).append(File.separatorChar);
}
relativePath.deleteCharAt(relativePath.length() - 1);
return relativePath.toString();
}
}
I know this is a bit late, but I created a solution that works with any Java version.
public static String getRealtivePath(File root, File file)
{
String path = file.getPath();
String rootPath = root.getPath();
boolean plus1 = path.contains(File.separator);
return path.substring(path.indexOf(rootPath) + rootPath.length() + (plus1 ? 1 : 0));
}
org.apache.ant has a FileUtils class with a getRelativePath method. Haven't tried it myself yet, but could be worthwhile to check it out.
http://javadoc.haefelinger.it/org.apache.ant/1.7.1/org/apache/tools/ant/util/FileUtils.html#getRelativePath(java.io.File, java.io.File)
private String relative(String left, String right){
String[] lefts = left.split("/");
String[] rights = right.split("/");
int min = Math.min(lefts.length, rights.length);
int commonIdx = -1;
for(int i = 0; i < min; i++){
if(commonIdx < 0 && !lefts[i].equals(rights[i])){
commonIdx = i - 1;
break;
}
}
if(commonIdx < 0){
return null;
}
StringBuilder sb = new StringBuilder(Math.max(left.length(), right.length()));
sb.append(left).append("/");
for(int i = commonIdx + 1; i < lefts.length;i++){
sb.append("../");
}
for(int i = commonIdx + 1; i < rights.length;i++){
sb.append(rights[i]).append("/");
}
return sb.deleteCharAt(sb.length() -1).toString();
}
Pseudo-code:
Split the strings by the path separator ("/")
Find the greatest common path by iterating through the result of the split strings (so you'd end up with "/var/data" or "/a" in your two examples)
return "." + whicheverPathIsLonger.substring(commonPath.length);
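The split-and-compare steps above can be sketched in Java roughly as follows. This is only a sketch, not the answerer's code: the names RelativePathSketch and relativize are made up for illustration, and it assumes both paths are absolute, use "/" as the separator, and that the base is a directory. Unlike the last pseudo-code line, it also emits ".." for every base element outside the common part, so it covers targets that are not below the base. It needs Java 8 or later for String.join.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class RelativePathSketch {

    // Returns the path of target relative to the directory base.
    // Assumes both arguments are absolute, "/"-separated paths.
    static String relativize(String target, String base) {
        String[] t = target.split("/");
        String[] b = base.split("/"); // trailing "/" yields no extra element

        // Count the leading path elements the two paths share.
        int common = 0;
        while (common < t.length && common < b.length
                && t[common].equals(b[common])) {
            common++;
        }

        List<String> parts = new ArrayList<>();
        // One ".." for each base element below the common part.
        for (int i = common; i < b.length; i++) {
            parts.add("..");
        }
        // Then walk down the remaining target elements.
        for (int i = common; i < t.length; i++) {
            parts.add(t[i]);
        }
        // Identical paths have nothing left on either side.
        return parts.isEmpty() ? "." : String.join("/", parts);
    }
}
```

With the two examples from the question, relativize("/var/data/stuff/xyz.dat", "/var/data/") gives "stuff/xyz.dat" and relativize("/a/b/c", "/a/x/y/") gives "../../b/c".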

Find a specific file in a detected USB from Java

I am using Java code to find a file that ends with a certain extension on a detected removable storage device. I am trying to link two pieces of code together, but I am not sure how to do so. This is the code I am using:
DetectDrive.java
import java.io.*;
import java.util.*;
import javax.swing.filechooser.FileSystemView;
public class DetectDrive
{
public String USBDetect()
{
String driveLetter = "";
FileSystemView fsv = FileSystemView.getFileSystemView();
File[] f = File.listRoots();
for (int i = 0; i < f.length; i++)
{
String drive = f[i].getPath();
String displayName = fsv.getSystemDisplayName(f[i]);
String type = fsv.getSystemTypeDescription(f[i]);
boolean isDrive = fsv.isDrive(f[i]);
boolean isFloppy = fsv.isFloppyDrive(f[i]);
boolean canRead = f[i].canRead();
boolean canWrite = f[i].canWrite();
if (canRead && canWrite && !isFloppy && isDrive && (type.toLowerCase().contains("removable") || type.toLowerCase().contains("rimovibile")))
{
//log.info("Detected PEN Drive: " + drive + " - "+ displayName);
driveLetter = drive;
break;
}
}
/*if (driveLetter.equals(""))
{
System.out.println("Not found!");
}
else
{
System.out.println(driveLetter);
}
*/
//System.out.println(driveLetter);
return driveLetter;
}
}
FileSearch.java
import java.io.*;
public class FileSearch
{
public String find(File dir)
{
String pattern = ".raw";
File listFile[] = dir.listFiles();
if (listFile != null)
{
for (int i=0; i<listFile.length; i++)
{
if (listFile[i].isDirectory())
{
find(listFile[i]);
} else
{
if (listFile[i].getName().endsWith(pattern))
{
System.out.println(listFile[i].getPath());
}
}
}
}
return pattern;
}
}
The file I want the program to find ends with a .raw extension, and I want the program to search for it on the detected removable storage (e.g. F:). How do I link these two pieces of code together? If possible, I would like example code that links them. I got the code for FileSearch.java from http://rosettacode.org/wiki/Walk_a_directory/Recursively#Java
Here's how I would do it. I would also make the methods USBDetect and find static, since neither seems to use any instance state of its class. Also consider making USBDetect return a File instead of a String.
public static void main(String [] args) {
// look for the drive
String drive = (new DetectDrive()).USBDetect();
// if it found a drive (null or empty string says no)
if(drive != null && !drive.isEmpty()) {
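// look for a file in that drive; USBDetect() already returns the full
// root path from File.getPath(), so nothing needs to be appended
FileSearch fileSearch = new FileSearch();
fileSearch.find(new File(drive));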
}
}
First, change USBDetect by replacing String driveLetter with File removableDrive, and return that. Then pass the value returned from USBDetect into find.
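A minimal sketch of that wiring, assuming the search is changed to collect its matches instead of printing them. The class RawFileFinder and its find(File, String) signature are hypothetical, and the usage shown in the comment assumes USBDetect has been changed to return a File (null when no drive is found), as suggested above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical replacement for FileSearch, for illustration only.
final class RawFileFinder {

    // Recursively collects every regular file under dir whose name
    // ends with the given extension (for example ".raw").
    static List<File> find(File dir, String extension) {
        List<File> matches = new ArrayList<>();
        File[] entries = dir.listFiles();
        if (entries == null) {
            // Not a directory, or it could not be read.
            return matches;
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                matches.addAll(find(entry, extension));
            } else if (entry.getName().endsWith(extension)) {
                matches.add(entry);
            }
        }
        return matches;
    }

    // Assumed usage, once USBDetect() returns a File:
    //   File drive = new DetectDrive().USBDetect();
    //   if (drive != null) {
    //       for (File f : RawFileFinder.find(drive, ".raw")) {
    //           System.out.println(f.getPath());
    //       }
    //   }
}
```

Returning the matches rather than printing them inside the recursion lets the caller decide what to do with the files, and it avoids the original find method's habit of returning the search pattern instead of a result.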

Normalizing using MapReduce

There is this sample record,
100,1:2:3
Which I want to normalize as,
100,1
100,2
100,3
A colleague of mine wrote a Pig script to achieve this, and my MapReduce code took more time. I was using the default TextInputFormat before. But to improve performance, I decided to write a custom InputFormat class with a custom RecordReader. Taking the LineRecordReader class as a reference, I wrote the following code.
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.util.LineReader;
import com.normalize.util.Splitter;
public class NormalRecordReader extends RecordReader<Text, Text> {
private long start;
private long pos;
private long end;
private LineReader in;
private int maxLineLength;
private Text key = null;
private Text value = null;
private Text line = null;
public void initialize(InputSplit genericSplit, TaskAttemptContext context) throws IOException {
FileSplit split = (FileSplit) genericSplit;
Configuration job = context.getConfiguration();
this.maxLineLength = job.getInt("mapred.linerecordreader.maxlength", Integer.MAX_VALUE);
start = split.getStart();
end = start + split.getLength();
final Path file = split.getPath();
FileSystem fs = file.getFileSystem(job);
FSDataInputStream fileIn = fs.open(split.getPath());
in = new LineReader(fileIn, job);
this.pos = start;
}
public boolean nextKeyValue() throws IOException {
int newSize = 0;
if (line == null) {
line = new Text();
}
while (pos < end) {
newSize = in.readLine(line);
if (newSize == 0) {
break;
}
pos += newSize;
if (newSize < maxLineLength) {
break;
}
// line too long. try again
System.out.println("Skipped line of size " + newSize + " at pos " + (pos - newSize));
}
Splitter splitter = new Splitter(line.toString(), ",");
List<String> split = splitter.split();
if (key == null) {
key = new Text();
}
key.set(split.get(0));
if (value == null) {
value = new Text();
}
value.set(split.get(1));
if (newSize == 0) {
key = null;
value = null;
return false;
} else {
return true;
}
}
@Override
public Text getCurrentKey() {
return key;
}
@Override
public Text getCurrentValue() {
return value;
}
/**
* Get the progress within the split
*/
public float getProgress() {
if (start == end) {
return 0.0f;
} else {
return Math.min(1.0f, (pos - start) / (float)(end - start));
}
}
public synchronized void close() throws IOException {
if (in != null) {
in.close();
}
}
}
Although this works, I haven't seen any performance improvement. Here I break the record at "," and set 100 as the key and 1:2:3 as the value. I only run a mapper, which does the following:
public void map(Text key, Text value, Context context)
throws IOException, InterruptedException {
try {
Splitter splitter = new Splitter(value.toString(), ":");
List<String> splits = splitter.split();
for (String split : splits) {
context.write(key, new Text(split));
}
} catch (IndexOutOfBoundsException ibe) {
System.err.println(value + " is malformed.");
}
}
The Splitter class splits the data, as I found String's own split to be slower. Its split method is:
public List<String> split() {
List<String> splitData = new ArrayList<String>();
int beginIndex = 0, endIndex = 0;
while(true) {
endIndex = dataToSplit.indexOf(delim, beginIndex);
if(endIndex == -1) {
splitData.add(dataToSplit.substring(beginIndex));
break;
}
splitData.add(dataToSplit.substring(beginIndex, endIndex));
beginIndex = endIndex + delimLength;
}
return splitData;
}
Can the code be improved in any way?
Let me summarize here what I think you can improve instead of in the comments:
As explained, currently you are creating a Text object several times per record (number of times will be equal to your number of tokens). While it may not matter too much for small input, this can be a big deal for decently sized jobs. To fix that, do the following:
private final Text text = new Text();
public void map(Text key, Text value, Context context) {
....
for (String split : splits) {
text.set(split);
context.write(key, text);
}
}
For your splitting, what you're doing right now is, for every record, allocating a new list, populating it, and then iterating over it to write your output. You don't really need a list in this case since you're not maintaining any state. Using the implementation of the split method you provided, you only need to make one pass over the data:
public void map(Text key, Text value, Context context) {
String dataToSplit = value.toString();
String delim = ":";
int beginIndex = 0;
int endIndex = 0;
while(true) {
endIndex = dataToSplit.indexOf(delim, beginIndex);
if(endIndex == -1) {
text.set(dataToSplit.substring(beginIndex));
context.write(key, text);
break;
}
text.set(dataToSplit.substring(beginIndex, endIndex));
context.write(key, text);
beginIndex = endIndex + delim.length();
}
}
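To check the loop's boundary behaviour without a cluster, the same scan can be pulled into a plain method. This is a sketch under assumptions: SinglePassSplit and forEachToken are hypothetical names, and a java.util.function.Consumer stands in for the text.set / context.write pair. It also rejects an empty delimiter, which would otherwise never advance the scan.

```java
import java.util.function.Consumer;

// Hypothetical stand-alone version of the single-pass loop.
final class SinglePassSplit {

    // Passes each delimiter-separated token of data to sink,
    // without building an intermediate list.
    static void forEachToken(String data, String delim, Consumer<String> sink) {
        if (delim.isEmpty()) {
            // indexOf("") always matches at beginIndex, so the loop
            // below would never make progress.
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        int beginIndex = 0;
        while (true) {
            int endIndex = data.indexOf(delim, beginIndex);
            if (endIndex == -1) {
                // Last token: everything after the final delimiter.
                sink.accept(data.substring(beginIndex));
                return;
            }
            sink.accept(data.substring(beginIndex, endIndex));
            beginIndex = endIndex + delim.length();
        }
    }
}
```

In the mapper, the sink would be something that sets the reused Text and writes it to the context. Since context.write throws checked exceptions, the inline loop shown above may be simpler there than a lambda.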
I don't really see why you wrote your own InputFormat. It seems that KeyValueTextInputFormat is exactly what you need, and it has probably already been optimized. Here is how you use it:
conf.set("key.value.separator.in.input.line", ",");
job.setInputFormatClass(KeyValueTextInputFormat.class);
Based on your example, the key for each record seems to be an integer. If that's always the case, then using a Text as your mapper input key is not optimal; it should be an IntWritable or maybe even a ByteWritable, depending on what's in your data.
Similarly, you may want to use an IntWritable or ByteWritable as your mapper output key and output value.
Also, if you want a meaningful benchmark, you should test on a bigger dataset, a few GB if possible. One-minute tests are not really meaningful, especially in the context of distributed systems. One job may run quicker than another on a small input, but the trend may be reversed for bigger inputs.
That being said, you should also know that Pig does a lot of optimizations under the hood when translating to Map/Reduce, so I'm not too surprised that it runs faster than your Java Map/Reduce code; I've seen that in the past. Try the optimizations I suggested. If it's still not fast enough, here is a link on profiling your Map/Reduce jobs with a few more useful tricks (especially tip 7 on profiling is something I've found useful).
