Does Java have a path joining method? [duplicate] - java

Exact Duplicate:
combine paths in java
I would like to know if there is such a method in Java. Take this snippet as an example:
// this will output a/b
System.out.println(path_join("a","b"));
// a/b
System.out.println(path_join("a","/b"));

This concerns Java versions 7 and earlier.
To quote a good answer to the same question:
If you want it back as a string later, you can call getPath(). Indeed, if you really wanted to mimic Path.Combine, you could just write something like:
public static String combine (String path1, String path2) {
File file1 = new File(path1);
File file2 = new File(file1, path2);
return file2.getPath();
}

Try:
String path1 = "path1";
String path2 = "path2";
String joinedPath = new File(path1, path2).toString();

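One way is to read the system property that holds the file separator for the operating system (file.separator, also exposed as File.separator). You can then join the strings with that separator. Note that this is different from path.separator, which separates the entries in a list of paths.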
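A minimal sketch of the separator-based join described above. The class and method names (SeparatorJoin, join) are made up for illustration and are not part of the JDK; the sketch assumes you want the behaviour from the question, where a leading separator on the second part is dropped.

```java
import java.io.File;

final class SeparatorJoin {
    // Hypothetical helper (not a JDK method): joins two parts with exactly one separator.
    static String join(String left, String right) {
        String sep = File.separator;
        // Drop leading separators from the right part so "a" + "/b" does not become "a//b".
        while (right.startsWith(sep)) {
            right = right.substring(sep.length());
        }
        // Avoid adding a second separator if the left part already ends with one.
        if (left.isEmpty() || left.endsWith(sep)) {
            return left + right;
        }
        return left + sep + right;
    }

    public static void main(String[] args) {
        System.out.println(join("a", "b"));
        System.out.println(join("a", File.separator + "b"));
    }
}
```

Both calls in main print the same joined path, using whatever separator the platform reports.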

This is a start. I don't think it works exactly as you intend, but it at least produces a consistent result.
import java.io.File;
public class Main
{
public static void main(final String[] argv)
throws Exception
{
System.out.println(pathJoin());
System.out.println(pathJoin(""));
System.out.println(pathJoin("a"));
System.out.println(pathJoin("a", "b"));
System.out.println(pathJoin("a", "b", "c"));
System.out.println(pathJoin("a", "b", "", "def"));
}
public static String pathJoin(final String ... pathElements)
{
final String path;
if(pathElements == null || pathElements.length == 0)
{
path = File.separator;
}
else
{
final StringBuilder builder;
builder = new StringBuilder();
for(final String pathElement : pathElements)
{
final String sanitizedPathElement;
// String.replace treats the separator as a literal, so no regex
// escaping is needed and this works on every platform
sanitizedPathElement = pathElement.replace(File.separator, "");
if(sanitizedPathElement.length() > 0)
{
builder.append(sanitizedPathElement);
builder.append(File.separator);
}
}
path = builder.toString();
}
return (path);
}
}


Strange behaviour of String.length()

I have class with main:
public class Main {
// args[0] - is path to file with first and last words
// args[1] - is path to file with dictionary
public static void main(String[] args) {
try {
List<String> firstLastWords = FileParser.getWords(args[0]);
System.out.println(firstLastWords);
System.out.println(firstLastWords.get(0).length());
} catch (IOException ex) {
ex.printStackTrace();
}
}
}
and I have FileParser:
public class FileParser {
public FileParser() {
}
final static Charset ENCODING = StandardCharsets.UTF_8;
public static List<String> getWords(String filePath) throws IOException {
List<String> list = new ArrayList<String>();
Path path = Paths.get(filePath);
try (BufferedReader reader = Files.newBufferedReader(path, ENCODING)) {
String line = null;
while ((line = reader.readLine()) != null) {
String line1 = line.replaceAll("\\s+","");
if (!line1.equals("") && !line1.equals(" ") ){
list.add(line1);
}
}
reader.close();
}
return list;
}
}
args[0] is the path to a text file with just two words. So if the file contains:
тор
кит
the program prints:
[тор, кит]
4
If the file contains:
т
тор
кит
the program prints:
[т, тор, кит]
2
Even if the file starts with an empty line:
(empty line)
тор
кит
the program prints:
[, тор, кит]
1
where the number is the length of the first string in the list.
So the question is: why does it count one extra character?
Thanks, all.
As @Bill said, this character is a BOM (byte order mark, http://en.wikipedia.org/wiki/Byte_order_mark), which sits at the beginning of the text file.
I found it with this line:
System.out.println(((int)firstLastWords.get(0).charAt(0)));
which printed 65279 (U+FEFF).
Then I changed this line:
String line1 = line.replaceAll("\\s+","");
to this:
String line1 = line.replaceAll("\uFEFF","");
Cyrillic characters are difficult to capture using regex; for example, \p{Graph} does not match them by default, although they are clearly visible characters. Anyway, that is beside the point of the OP's question.
The actual problem is likely caused by non-visible characters, such as control characters, in the input. Try the following regex to remove more of them: replaceAll("(\\s|\\p{Cntrl})+",""). Note that the BOM itself (U+FEFF) is a format character rather than a control character, so it has to be added to the pattern explicitly. You can extend the regex further to cover other cases.
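A small sketch of the cleanup discussed in this thread, assuming you want to strip whitespace, ASCII control characters and the BOM in one pass. The class name BomStrip and the method clean are invented for this example. The sample word is written with Unicode escapes so the source file compiles regardless of its encoding.

```java
final class BomStrip {
    // U+FEFF is listed explicitly because \p{Cntrl} only covers ASCII control characters.
    static String clean(String line) {
        return line.replaceAll("[\\s\\p{Cntrl}\\uFEFF]+", "");
    }

    public static void main(String[] args) {
        // A BOM followed by the three-letter Cyrillic word from the question.
        String first = "\uFEFF" + "\u0442\u043E\u0440";
        System.out.println(first.length());        // prints 4
        System.out.println(clean(first).length()); // prints 3
    }
}
```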

Removing the BOM character with Java [duplicate]

I am trying to read files using FileReader and write them into a separate file.
These files are UTF-8 encoded, but unfortunately some of them still contain a BOM.
The relevant code I tried is this:
private final String UTF8_BOM = "\uFEFF";
private String removeUTF8BOM(String s)
{
if (s.startsWith(UTF8_BOM))
{
s=s.replace(UTF8_BOM, "");
}
return s;
}
line=removeUTF8BOM(line);
But for some reason the BOM is not removed. Is there any other way I can do this with FileReader? I know that there is the BOMInputStream that should work, but I'd rather find a solution using FileReader.
The class FileReader is an old utility class that uses the platform default encoding. On Windows that is likely not UTF-8.
It is best to read with another class that lets you specify the encoding.
For amusement, and to clarify the error, here is a dirty hack that works on platforms with single-byte encodings:
private final String UTF8_BOM = new String("\uFEFF".getBytes(StandardCharsets.UTF_8));
This takes the UTF-8 bytes of the BOM and decodes them into a String using the current platform encoding.
Needless to say, FileReader is also non-portable, since it deals only with local files.
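As a rough sketch of "reading with another class": the example below assumes the files really are UTF-8 and uses Files.newBufferedReader with an explicit charset. Java's UTF-8 decoder hands the BOM through as the character U+FEFF, so the sketch drops it from the first line only. Utf8Lines and read are names made up for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class Utf8Lines {
    // Reads all lines as UTF-8 and removes a leading U+FEFF from the first line, if present.
    static List<String> read(Path file) throws IOException {
        List<String> lines = new ArrayList<String>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) {
                if (first && line.startsWith("\uFEFF")) {
                    line = line.substring(1);
                }
                first = false;
                lines.add(line);
            }
        }
        return lines;
    }
}
```

Because the charset is named explicitly, the result does not depend on the platform default encoding, which is where the FileReader version goes wrong.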
Naive Solution to the question as asked:
public static void main(final String[] args)
{
final String hasbom = "\uFEFF" + "Hello World!";
final String nobom = hasbom.charAt(0) == '\uFEFF' ? hasbom.substring(1) : hasbom;
System.out.println(hasbom.equals(nobom));
}
Outputs:
false
Proper Solution Approach:
You should never program to a File based API and instead program against InputStream/OutputStream so that your code is portable to different source locations.
This is just an untested example of how you might go about encapsulating this behavior into an InputStream to make it transparent.
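public class BomProofInputStream extends InputStream
{
    // the UTF-8 encoding of U+FEFF is the three bytes EF BB BF
    private static final int[] UTF8_BOM = { 0xEF, 0xBB, 0xBF };
    private final PushbackInputStream is;
    private boolean isFirstRead = true;
    public BomProofInputStream(@Nonnull final InputStream is)
    {
        // pushback buffer large enough to return the bytes if they are not a BOM
        this.is = new PushbackInputStream(is, UTF8_BOM.length);
    }
    @Override
    public int read() throws IOException
    {
        if (this.isFirstRead)
        {
            this.isFirstRead = false;
            final byte[] head = new byte[UTF8_BOM.length];
            int n = 0;
            int b;
            while (n < head.length && (b = this.is.read()) != -1) { head[n++] = (byte) b; }
            boolean isBom = n == head.length;
            for (int i = 0; isBom && i < n; i++) { isBom = (head[i] & 0xFF) == UTF8_BOM[i]; }
            // not a BOM: push the bytes back so the caller still sees them
            if (!isBom) { this.is.unread(head, 0, n); }
        }
        return this.is.read();
    }
}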
A full-fledged example can be found with some searching.

How to separate string in text file into different array (java)

I have a text file that consists of strings. What I want to do is separate the strings containing "[ham]" and the strings containing "[spam]" into different arrays. How can I do that? I thought about using a regex to recognize the pattern (ham and spam), but I have no idea where to start. Please help me.
Strings in the text file:
good [ham]
very good [ham]
bad [spam]
very bad [spam]
very bad, very bad [spam]
and I want the output to be like this:
Ham array:
good
very good
Spam array:
bad
very bad
very bad, very bad
Help me please.
Instead of arrays, I think you should use ArrayList:
List<String> ham=new ArrayList<String>();
List<String> spam=new ArrayList<String>();
// for each line read from the file (trim() drops the space before the label):
if(line.contains("[ham]"))
ham.add(line.substring(0,line.indexOf("[ham]")).trim());
if(line.contains("[spam]"))
spam.add(line.substring(0,line.indexOf("[spam]")).trim());
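Here is a self-contained sketch of that list-based approach, using the sample lines from the question in place of a file. The names HamSpamSplit, textBefore and collect are invented for the example; it assumes the label always comes after the text on the same line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class HamSpamSplit {
    // Returns the trimmed text before the label, or null if the line does not contain it.
    static String textBefore(String line, String label) {
        int at = line.indexOf(label);
        return at < 0 ? null : line.substring(0, at).trim();
    }

    // Collects the text of every line that carries the given label.
    static List<String> collect(List<String> lines, String label) {
        List<String> out = new ArrayList<String>();
        for (String line : lines) {
            String text = textBefore(line, label);
            if (text != null) {
                out.add(text);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "good [ham]", "very good [ham]",
                "bad [spam]", "very bad [spam]", "very bad, very bad [spam]");
        System.out.println(collect(lines, "[ham]")); // prints [good, very good]
        // If a real array is required, convert at the end.
        String[] spamArray = collect(lines, "[spam]").toArray(new String[0]);
        System.out.println(spamArray.length);        // prints 3
    }
}
```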
If you really need to do it that way (with a regex and arrays as output), write code like this:
public class StringResolve {
public static void main(String[] args) {
try {
// read data from some source
URL exampleTxt = StringResolve.class.getClassLoader().getResource("me/markoutte/sandbox/_25989334/example.txt");
Path path = Paths.get(exampleTxt.toURI());
List<String> strings = Files.readAllLines(path, Charset.forName("UTF8"));
// init all my patterns & arrays
Pattern ham = getPatternFor("ham");
List<String> hams = new LinkedList<>();
Pattern spam = getPatternFor("spam");
List<String> spams = new LinkedList<>();
// check all of them
for (String string : strings) {
Matcher hamMatcher = ham.matcher(string);
if (hamMatcher.matches()) {
// we choose only text without label here
hams.add(hamMatcher.group(1));
}
Matcher spamMatcher = spam.matcher(string);
if (spamMatcher.matches()) {
// we choose only text without label here
spams.add(spamMatcher.group(1));
}
}
// output data through arrays
String[] hamArray = hams.toArray(new String[hams.size()]);
System.out.println("Ham array");
for (String s : hamArray) {
System.out.println(s);
}
System.out.println();
String[] spamArray = spams.toArray(new String[spams.size()]);
System.out.println("Spam array");
for (String s : spamArray) {
System.out.println(s);
}
} catch (URISyntaxException | IOException e) {
e.printStackTrace();
}
}
private static Pattern getPatternFor(String label) {
// Regex pattern for string with same kind: some text [label]
return Pattern.compile(String.format("(.+?)\\s(\\[%s\\])", label));
}
}
You can use Paths.get("some/path/to/file") if you need to read the file from somewhere on your drive.

How do I trim a file extension from a String in Java?

What's the most efficient way to trim the suffix in Java, like this:
title part1.txt
title part2.html
=>
title part1
title part2
This is the sort of code that we shouldn't be doing ourselves. Use libraries for the mundane stuff, save your brain for the hard stuff.
In this case, I recommend using FilenameUtils.removeExtension() from Apache Commons IO
str.substring(0, str.lastIndexOf('.'))
While using String.substring and String.lastIndexOf in a one-liner is fine for simple names, it does not cope with certain file paths.
Take for example the following path:
a.b/c
Using the one-liner will result in:
a
That's incorrect.
The result should have been c. The file has no extension, but the path contains a directory with a . in its name, so the one-liner was tricked into returning part of the path as the filename.
Need for checks
Inspired by skaffman's answer, I took a look at the FilenameUtils.removeExtension method of the Apache Commons IO.
In order to recreate its behavior, I wrote a few tests the new method should fulfill, which are the following:
Path             Filename
--------------   --------
a/b/c            c
a/b/c.jpg        c
a/b/c.jpg.jpg    c.jpg
a.b/c            c
a.b/c.jpg        c
a.b/c.jpg.jpg    c.jpg
c                c
c.jpg            c
c.jpg.jpg        c.jpg
(And that's all I've checked for -- there probably are other checks that should be in place that I've overlooked.)
The implementation
The following is my implementation for the removeExtension method:
public static String removeExtension(String s) {
String separator = System.getProperty("file.separator");
String filename;
// Remove the path up to the filename.
int lastSeparatorIndex = s.lastIndexOf(separator);
if (lastSeparatorIndex == -1) {
filename = s;
} else {
filename = s.substring(lastSeparatorIndex + 1);
}
// Remove the extension.
int extensionIndex = filename.lastIndexOf(".");
if (extensionIndex == -1)
return filename;
return filename.substring(0, extensionIndex);
}
Running this removeExtension method with the above tests yields the results listed above.
The method was tested with the following code. As this was run on Windows, the path separator is a \ which must be escaped with a \ when used as part of a String literal.
System.out.println(removeExtension("a\\b\\c"));
System.out.println(removeExtension("a\\b\\c.jpg"));
System.out.println(removeExtension("a\\b\\c.jpg.jpg"));
System.out.println(removeExtension("a.b\\c"));
System.out.println(removeExtension("a.b\\c.jpg"));
System.out.println(removeExtension("a.b\\c.jpg.jpg"));
System.out.println(removeExtension("c"));
System.out.println(removeExtension("c.jpg"));
System.out.println(removeExtension("c.jpg.jpg"));
The results were:
c
c
c.jpg
c
c
c.jpg
c
c
c.jpg
The results are the desired results outlined in the test the method should fulfill.
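One thing to keep in mind: the implementation above asks the system for file.separator, so the same input can give different results on different platforms. The sketch below is a variation, not the original author's code, that treats both / and \ as separators regardless of platform. BaseName is a made-up class name.

```java
final class BaseName {
    // Sketch: strips the directory part and the last extension,
    // accepting both '/' and '\\' as separators on any platform.
    static String removeExtension(String s) {
        int lastSeparator = Math.max(s.lastIndexOf('/'), s.lastIndexOf('\\'));
        String filename = s.substring(lastSeparator + 1);
        int dot = filename.lastIndexOf('.');
        return dot == -1 ? filename : filename.substring(0, dot);
    }

    public static void main(String[] args) {
        System.out.println(removeExtension("a.b/c"));            // prints c
        System.out.println(removeExtension("a.b\\c.jpg.jpg"));   // prints c.jpg
    }
}
```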
String foo = "title part1.txt";
foo = foo.substring(0, foo.lastIndexOf('.'));
BTW, in my case, when I wanted a quick solution to remove a specific extension, this is approximately what I did:
if (filename.endsWith(ext))
return filename.substring(0,filename.length() - ext.length());
else
return filename;
Use a method in com.google.common.io.Files class if your project is already dependent on Google core library. The method you need is getNameWithoutExtension.
You can try this very basic function. Note that it throws an exception if the path contains no dot:
public String getWithoutExtension(String fileFullPath){
return fileFullPath.substring(0, fileFullPath.lastIndexOf('.'));
}
String fileName="foo.bar";
int dotIndex=fileName.lastIndexOf('.');
if(dotIndex>=0) { // to prevent exception if there is no dot
fileName=fileName.substring(0,dotIndex);
}
Is this a trick question? :p
I can't think of a faster way atm.
I found coolbird's answer particularly useful.
But I changed the last result statements to:
if (extensionIndex == -1)
return s;
return s.substring(0, lastSeparatorIndex+1)
+ filename.substring(0, extensionIndex);
as I wanted the full path name to be returned.
So "C:\Users\mroh004.COM\Documents\Test\Test.xml" becomes
"C:\Users\mroh004.COM\Documents\Test\Test" and not
"Test"
filename.substring(filename.lastIndexOf('.'), filename.length()).toLowerCase();
Use a regex. This one replaces the last dot, and everything after it.
String baseName = fileName.replaceAll("\\.[^.]*$", "");
You can also create a Pattern object if you want to precompile the regex.
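A minimal sketch of the precompiled variant, using the same regex as above. ExtensionRegex and baseName are names chosen for this example only.

```java
import java.util.regex.Pattern;

final class ExtensionRegex {
    // Compiled once and reused; matches the last dot and everything after it.
    private static final Pattern LAST_EXTENSION = Pattern.compile("\\.[^.]*$");

    static String baseName(String fileName) {
        return LAST_EXTENSION.matcher(fileName).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(baseName("title part1.txt")); // prints title part1
        System.out.println(baseName("archive.tar.gz"));  // prints archive.tar
        System.out.println(baseName("noext"));           // prints noext
    }
}
```

Like the plain one-liner, this looks only at the last dot in the whole string, so a path such as a.b/c is still cut at the directory name.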
If you use Spring you could use
org.springframework.util.StringUtils.stripFilenameExtension(String path)
Strip the filename extension from the given Java resource path, e.g.
"mypath/myfile.txt" -> "mypath/myfile".
Params: path – the file path
Returns: the path with stripped filename extension
private String trimFileExtension(String fileName)
{
String[] splits = fileName.split( "\\." );
return StringUtils.remove( fileName, "." + splits[splits.length - 1] );
}
// split takes a regex, so the dot must be escaped; an unescaped "." matches every character
String[] splitted = fileName.split("\\.");
String fileNameWithoutExtension = fileName.replace("." + splitted[splitted.length - 1], "");
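Create a File from the image path string and take the extension from its name:
String imagePath = "path/to/image.png"; // hypothetical path, for illustration only
File test = new File(imagePath);
test.getName();
test.getPath();
getExtension(test.getName());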
public static String getExtension(String uri) {
if (uri == null) {
return null;
}
int dot = uri.lastIndexOf(".");
if (dot >= 0) {
return uri.substring(dot);
} else {
// No extension.
return "";
}
}
org.apache.commons.io.FilenameUtils version 2.4 implements it as follows:
public static String removeExtension(String filename) {
if (filename == null) {
return null;
}
int index = indexOfExtension(filename);
if (index == -1) {
return filename;
} else {
return filename.substring(0, index);
}
}
public static int indexOfExtension(String filename) {
if (filename == null) {
return -1;
}
int extensionPos = filename.lastIndexOf(EXTENSION_SEPARATOR);
int lastSeparator = indexOfLastSeparator(filename);
return lastSeparator > extensionPos ? -1 : extensionPos;
}
public static int indexOfLastSeparator(String filename) {
if (filename == null) {
return -1;
}
int lastUnixPos = filename.lastIndexOf(UNIX_SEPARATOR);
int lastWindowsPos = filename.lastIndexOf(WINDOWS_SEPARATOR);
return Math.max(lastUnixPos, lastWindowsPos);
}
public static final char EXTENSION_SEPARATOR = '.';
private static final char UNIX_SEPARATOR = '/';
private static final char WINDOWS_SEPARATOR = '\\';
The best I can write while sticking to the Path class:
Path removeExtension(Path path) {
return path.resolveSibling(path.getFileName().toString().replaceFirst("\\.[^.]*$", ""));
}
No need to strain over this; I have done it many times. Just copy this static method into your static utils library for future use ;-)
static String removeExtension(String path){
String filename;
String foldrpath;
String filenameWithoutExtension;
if(path.equals("")){return "";}
if(path.contains("\\")){ // the direct substring method gives a wrong result for "a.b.c.d\e.f.g\supersu"
filename = path.substring(path.lastIndexOf("\\"));
foldrpath = path.substring(0, path.lastIndexOf('\\'));
if(filename.contains(".")){
filenameWithoutExtension = filename.substring(0, filename.lastIndexOf('.'));
}else{
filenameWithoutExtension = filename;
}
return foldrpath + filenameWithoutExtension;
}else{
// guard against names without a dot, where lastIndexOf returns -1
return path.contains(".") ? path.substring(0, path.lastIndexOf('.')) : path;
}
}
I would do it like this:
String title_part = "title part1.txt";
int i;
for(i=title_part.length()-1 ; i>=0 && title_part.charAt(i)!='.' ; i--);
if(i>=0) title_part = title_part.substring(0,i); // i is -1 when there is no '.'
Scan from the end back to the '.', then call substring.
Edit:
It might not win at code golf, but it's effective :)
This keeps in mind the scenarios where there is no file extension or more than one file extension.
Example file names: file | file.txt | file.tar.bz2
/**
*
* @param fileName
* @return file extension
* example file.fastq.gz => fastq.gz
*/
private String extractFileExtension(String fileName) {
String type = "undefined";
// start from the last dot so that a single extension (file.txt => txt) is handled too
int indexOfExtension = FilenameUtils.indexOfExtension(fileName);
if (indexOfExtension != -1) {
String fileBaseName = FilenameUtils.getBaseName(fileName);
while (fileBaseName.contains(".")) {
indexOfExtension = FilenameUtils.indexOfExtension(fileBaseName);
fileBaseName = FilenameUtils.getBaseName(fileBaseName);
}
type = fileName.substring(indexOfExtension + 1, fileName.length());
}
return type;
}
String img = "example.jpg";
// String imgLink = "http://www.example.com/example.jpg";
URI uri = null;
try {
uri = new URI(img);
String[] segments = uri.getPath().split("/");
System.out.println(segments[segments.length-1].split("\\.")[0]);
} catch (Exception e) {
e.printStackTrace();
}
This will output example for both img and imgLink
private String trimFileName(String fileName)
{
String[] ext;
ext = fileName.split("\\.");
return fileName.replace(ext[ext.length - 1], "");
}
This code splits the file name wherever it contains a ".". For example, the file name file-name.hello.txt is split into the string array { "file-name", "hello", "txt" }. The last element of that array, at index ext.length - 1, is the file extension, so the code replaces it with an empty string in the file name. This returns file-name.hello. with a trailing period. To remove the period as well, prepend "." to the last element in the return line, which then looks like this:
return fileName.replace("." + ext[ext.length - 1], "");
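One caveat with this approach: String.replace removes every occurrence of the target, not just the last one, so a name that repeats its extension in the middle is changed in more than one place. The sketch below contrasts it with a lastIndexOf-based version. ReplaceCaveat, viaReplace and viaLastIndex are names invented for the comparison.

```java
final class ReplaceCaveat {
    // The split-and-replace approach from the answer above.
    static String viaReplace(String fileName) {
        String[] ext = fileName.split("\\.");
        return fileName.replace("." + ext[ext.length - 1], "");
    }

    // Cuts only at the last dot and leaves names without a dot untouched.
    static String viaLastIndex(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? fileName : fileName.substring(0, dot);
    }

    public static void main(String[] args) {
        System.out.println(viaReplace("notes.txt.old.txt"));   // prints notes.old
        System.out.println(viaLastIndex("notes.txt.old.txt")); // prints notes.txt.old
    }
}
```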
public static String removeExtension(String file) {
if(file != null && file.length() > 0) {
while(file.contains(".")) {
file = file.substring(0, file.lastIndexOf('.'));
}
}
return file;
}

Comparing text files with Junit

I am comparing text files in junit using:
public static void assertReaders(BufferedReader expected,
BufferedReader actual) throws IOException {
String line;
while ((line = expected.readLine()) != null) {
assertEquals(line, actual.readLine());
}
assertNull("Actual had more lines then the expected.", actual.readLine());
assertNull("Expected had more lines then the actual.", expected.readLine());
}
Is this a good way to compare text files? What is preferred?
Here's one simple approach for checking if the files are exactly the same:
assertEquals("The files differ!",
FileUtils.readFileToString(file1, "utf-8"),
FileUtils.readFileToString(file2, "utf-8"));
Where file1 and file2 are File instances, and FileUtils is from Apache Commons IO.
Not much own code for you to maintain, which is always a plus. :) And very easy if you already happen to use Apache Commons in your project. But no nice, detailed error messages like in mark's solution.
Edit:
Heh, looking closer at the FileUtils API, there's an even simpler way:
assertTrue("The files differ!", FileUtils.contentEquals(file1, file2));
As a bonus, this version works for all files, not just text.
junit-addons has nice support for it: FileAssert
It gives you exceptions like:
junitx.framework.ComparisonFailure: aa Line [3] expected: [b] but was:[a]
Here is a more exhaustive list of file comparators in various third-party Java libraries:
org.apache.commons.io.FileUtils
org.dbunit.util.FileAsserts
org.fest.assertions.FileAssert
junitx.framework.FileAssert
org.springframework.batch.test.AssertFile
org.netbeans.junit.NbTestCase
org.assertj.core.api.FileAssert
As of 2015, I would recommend AssertJ, an elegant and comprehensive assertion library. For files, you can assert against another file:
@Test
public void file() {
File actualFile = new File("actual.txt");
File expectedFile = new File("expected.txt");
assertThat(actualFile).hasSameTextualContentAs(expectedFile);
}
or against inline strings:
@Test
public void inline() {
File actualFile = new File("actual.txt");
assertThat(linesOf(actualFile)).containsExactly(
"foo 1",
"foo 2",
"foo 3"
);
}
The failure messages are very informative as well. If a line is different, you get:
java.lang.AssertionError:
File:
<actual.txt>
and file:
<expected.txt>
do not have equal content:
line:<2>,
Expected :foo 2
Actual :foo 20
and if one of the files has more lines you get:
java.lang.AssertionError:
File:
<actual.txt>
and file:
<expected.txt>
do not have equal content:
line:<4>,
Expected :EOF
Actual :foo 4
Simple comparison of the content of two files with java.nio.file API.
byte[] file1Bytes = Files.readAllBytes(Paths.get("Path to File 1"));
byte[] file2Bytes = Files.readAllBytes(Paths.get("Path to File 2"));
String file1 = new String(file1Bytes, StandardCharsets.UTF_8);
String file2 = new String(file2Bytes, StandardCharsets.UTF_8);
assertEquals("The content in the strings should match", file1, file2);
Or if you want to compare individual lines:
List<String> file1 = Files.readAllLines(Paths.get("Path to File 1"));
List<String> file2 = Files.readAllLines(Paths.get("Path to File 2"));
assertEquals(file1.size(), file2.size());
for(int i = 0; i < file1.size(); i++) {
System.out.println("Comparing line: " + i);
assertEquals(file1.get(i), file2.get(i));
}
I'd suggest using Assert.assertThat and a hamcrest matcher (junit 4.5 or later - perhaps even 4.4).
I'd end up with something like:
assertThat(fileUnderTest, containsExactText(expectedFile));
where my matcher is:
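class FileMatcher {
    static Matcher<File> containsExactText(final File expectedFile) {
        return new TypeSafeMatcher<File>() {
            private String failure = "";
            @Override
            public boolean matchesSafely(File underTest) {
                // create readers for each file (UTF-8 is an assumption here)
                try (BufferedReader expected = Files.newBufferedReader(expectedFile.toPath(), StandardCharsets.UTF_8);
                     BufferedReader actual = Files.newBufferedReader(underTest.toPath(), StandardCharsets.UTF_8)) {
                    String line;
                    int lineNumber = 0;
                    while ((line = expected.readLine()) != null) {
                        lineNumber++;
                        String actualLine = actual.readLine();
                        if (!line.equals(actualLine)) {
                            failure = "line " + lineNumber + ": expected <" + line + "> but was <" + actualLine + ">";
                            return false;
                        }
                    }
                    // record a failure for files of uneven length
                    if (actual.readLine() != null) {
                        failure = "actual file has more lines than expected";
                        return false;
                    }
                    return true;
                } catch (IOException e) {
                    failure = e.toString();
                    return false;
                }
            }
            @Override
            public void describeTo(Description description) {
                description.appendText("file with the same text as ").appendValue(expectedFile);
            }
            // available in Hamcrest 1.2 and later
            @Override
            protected void describeMismatchSafely(File underTest, Description mismatch) {
                mismatch.appendText(failure);
            }
        };
    }
}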
Matcher pros:
Composition and reuse
Use in normal code as well as test
Collections
Used in mock framework(s)
Can be used as a general predicate function
Really nice log-ability
Can be combined with other matchers and descriptions and failure descriptions are accurate and precise
Cons:
Well it's pretty obvious right? This is way more verbose than assert or junitx (for this particular case)
You'll probably need to include the hamcrest libs to get the most benefit
FileUtils sure is a good one. Here's yet another simple approach for checking if the files are exactly the same.
assertEquals(FileUtils.checksumCRC32(file1), FileUtils.checksumCRC32(file2));
While assertEquals() does provide a little more feedback than assertTrue(), the result of checksumCRC32() is a long, so a failure message showing two checksums may not be intrinsically helpful.
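If you would rather not pull in Commons IO just for this, a similar checksum can be computed with java.util.zip.CRC32 from the standard library. This is a sketch under the assumption that the files are small enough to read into memory. FileChecksum and crc32 are names made up for the example. Keep in mind that matching checksums make equal content very likely but do not strictly prove it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

final class FileChecksum {
    // Reads the whole file into memory and returns its CRC-32 value.
    static long crc32(Path file) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(Files.readAllBytes(file));
        return crc.getValue();
    }
}
```

In a test this could be used as assertEquals(FileChecksum.crc32(expectedPath), FileChecksum.crc32(actualPath)), where both paths are your own variables.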
If expected has more lines than actual, you'll fail an assertEquals before getting to the assertNull later.
It's fairly easy to fix though:
public static void assertReaders(BufferedReader expected,
BufferedReader actual) throws IOException {
String expectedLine;
while ((expectedLine = expected.readLine()) != null) {
String actualLine = actual.readLine();
assertNotNull("Expected had more lines than the actual.", actualLine);
assertEquals(expectedLine, actualLine);
}
assertNull("Actual had more lines than the expected.", actual.readLine());
}
This is my own implementation of equalFiles, no need to add any library to your project.
private static boolean equalFiles(String expectedFileName,
        String resultFileName) {
    boolean equal = false;
    BufferedReader bExp = null;
    BufferedReader bRes = null;
    try {
        bExp = new BufferedReader(new FileReader(expectedFileName));
        bRes = new BufferedReader(new FileReader(resultFileName));
        String expLine;
        String resLine;
        do {
            expLine = bExp.readLine();
            resLine = bRes.readLine();
            // at end of file both lines must be null, otherwise they must match
            equal = (expLine == null) ? (resLine == null) : expLine.equals(resLine);
        } while (equal && expLine != null);
    } catch (Exception e) {
        // any I/O problem counts as "not equal"
        equal = false;
    } finally {
        try {
            if (bExp != null) {
                bExp.close();
            }
            if (bRes != null) {
                bRes.close();
            }
        } catch (Exception e) {
            // ignore failures while closing
        }
    }
    return equal;
}
To use it, just call the regular JUnit assertTrue method:
assertTrue(equalFiles(expected, output)) ;
