Extract number from string using pattern - java

Ok so this is what I have
"C:\this\file\is\rev12\oh\A_12345\doll\classes"
I want to extract from this string the 12345 only.
How can it be done using Java Pattern.compile?

You should first define, in general terms, where this number appears in the string.
If it always sits between a leading underscore _ and a trailing backslash \, the pattern is _(\d+)\\.
The number can then be read from the matched capturing group.
Try it.
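A minimal sketch of that suggestion, assuming the number always sits between an underscore and a backslash. The class and method names here are made up for illustration. Note that the regex `_(\d+)\\` has to be written with doubled backslashes inside a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberExtractor {
    // Regex _(\d+)\\ : underscore, one or more digits (group 1), literal backslash.
    private static final Pattern NUMBER = Pattern.compile("_(\\d+)\\\\");

    // Returns the digits of the first match, or null if the pattern is not found.
    static String extractNumber(String path) {
        Matcher m = NUMBER.matcher(path);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String path = "C:\\this\\file\\is\\rev12\\oh\\A_12345\\doll\\classes";
        System.out.println(extractNumber(path)); // prints 12345
    }
}
```

The "12" in rev12 is skipped because it is not preceded by an underscore.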

Below is code you could use. I changed the backslashes in the path to forward slashes and used an absolute path. I also tried the escaped Windows form "C:\\this\\file\\is\\rev12\\oh\\A_12345\\doll\\classes", that is, with every '\' written as '\\'. Both path strings work with the code below. Note that it prints every number found in the path, so the 12 from rev12 is printed as well as 12345.
File file = new java.io.File("C:/this/file/is/rev12/oh/A_12345/doll/classes").getAbsoluteFile();
System.out.println(file.getAbsolutePath());
Pattern pat = Pattern.compile("-?\\d+");
Matcher mat = pat.matcher(file.getAbsolutePath());
while (mat.find()) {
System.out.println(mat.group());
}

Related

How to get File Path with double-backslashes in Java

I have a program where I save school grades in a .txt File.
I want to let the user choose where this File should be saved.
It works fine with the JFileChooser, but Java has a problem with the
file path.
The filepath from the JFileChooser looks like this:
C:\Users\...\Documents\n.txt
But if I want to read the TextFile in the Program Java says that
it couldn't find the Filepath.
It should look like this:
C:\\Users\\...\\Documents\\n.txt
How can I get the Path with double-backslashes?
public void actionPerformed(ActionEvent e) {
JFileChooser jf = new JFileChooser();
jf.showSaveDialog(null);
String fPath = jf.getSelectedFile().getPath();
fPath.replaceAll('\', '\\');
System.out.println(p);
}
That does not work; the compiler says "invalid character constant".
There are some places where the backslash serves as an escape character and must itself be escaped to stand for the plain backslash of a Windows path separator.
These places include .properties files and Java string literals, among others.
For Windows paths you could alternatively use a forward slash, which Windows also accepts:
fPath = fPath.replace('\\', '/');
To double the backslashes instead:
fPath = fPath.replace("\\", "\\\\");
The explanation is that a single backslash inside char and string literals must be escaped: two backslashes in the source represent a single backslash.
With regular expressions (replaceAll) the backslash also introduces a command: a digit is expressed as \d, which as a Java string literal is "\\d". Hence the backslash itself becomes (behold):
fPath = fPath.replaceAll("\\\\", "\\\\\\\\"); // PLEASE NOT
I almost did not see it, but methods on String do not alter the string; they return a new value, so one needs to assign the result.
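As a rough illustration of the three variants above, the following sketch compares them side by side. The class and method names are made up for this example. It relies only on String.replace (literal replacement) and String.replaceAll (regex replacement), and each method returns the new string rather than modifying its argument.

```java
class BackslashReplace {
    // Literal replacement: each backslash becomes two backslashes.
    static String doubleWithReplace(String path) {
        return path.replace("\\", "\\\\");
    }

    // Regex replacement: the pattern \\ matches one backslash; in the
    // replacement string, each pair \\ produces one literal backslash.
    static String doubleWithReplaceAll(String path) {
        return path.replaceAll("\\\\", "\\\\\\\\");
    }

    // Char replacement: each backslash becomes a forward slash.
    static String toForwardSlashes(String path) {
        return path.replace('\\', '/');
    }

    public static void main(String[] args) {
        String fPath = "C:\\Users\\n.txt";   // runtime value: C:\Users\n.txt
        fPath.replace("\\", "\\\\");         // result discarded, fPath unchanged
        System.out.println(fPath);
        System.out.println(doubleWithReplace(fPath));
        System.out.println(toForwardSlashes(fPath));
    }
}
```

Both doubling methods give the same result; the replace version is simply easier to read.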
When using hard coded file names in Java you should always use forward slashes / as file separators. Java knows how to handle them on Windows.
Also, you should not use absolute paths. You don't know whether those paths will exist on the target system. You should use either relative paths starting with your classpath as root "/..." or get some system-dependent locations from System.getProperty() https://docs.oracle.com/javase/8/docs/api/java/lang/System.html#getProperties--
Multiple issues in your code:
public void actionPerformed(ActionEvent e) {
JFileChooser jf = new JFileChooser();
jf.showSaveDialog(null);
String fPath = jf.getSelectedFile().getPath();
// fPath is a proper file path. This can be used directly with
// new File(fPath). The contents will contain single \ character
// as Path separator
fPath.replaceAll('\', '\\');
// I guess you are trying to replace a single \ character with \\
// character. You need to escape the \ character. You need to
// consider that both parameters are regexes.
// doing it is:
// fPath.replaceAll("\\\\", "\\\\\\\\");
// And then you need to capture the return value. Strings are
// immutable in java. So it is:
// fPath = fPath.replaceAll("\\\\", "\\\\\\\\");
System.out.println(p);
// I don't know what p is. I guess you want to use fPath
}
That said, I do not understand why you want to convert the path returned by JFileChooser.
You don't need the file path with double backslashes in Java. Double backslashes are for:
The Java compiler, inside string literals.
The Java regex compiler.
Everywhere else you can obtain backslashes, or use forward slashes.
Possibly you are looking for java.util.Properties?
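To make the point about literals concrete, here is a small sketch (the class and helper names are hypothetical) showing that doubled backslashes exist only in the source code. The string held at run time contains single backslashes, just like the path that JFileChooser returns.

```java
class LiteralVsRuntime {
    // Counts the backslash characters actually present in the string.
    static int backslashCount(String s) {
        int count = 0;
        for (char c : s.toCharArray()) {
            if (c == '\\') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String literal = "C:\\Users\\n.txt";          // four backslashes in the source
        System.out.println(literal);                   // prints C:\Users\n.txt
        System.out.println(backslashCount(literal));   // prints 2
    }
}
```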

How to get (split) filenames from string in java?

I have a string that contains file names like:
"file1.txt file2.jpg tricky file name.txt other tricky filenames containing áéíőéáóó.gif"
How can I get the file names, one by one?
I am looking for the safest, most thorough method, preferably something standard in Java. There has got to be a regular expression for this already out there; I am counting on your experience.
Edit: expected results:
"file1.txt", "file2.jpg", "tricky file name.txt", "other tricky filenames containing áéíőéáóó.gif"
Thanks for the help,
Sziro
The regular expression that enrico.bacis suggested, (\S.*?\.\S+), will not work if there are file names with no characters before the ".", such as .project.
A correct pattern would be:
(([^ .]+ +)*\S*\.\S+)
A Java program that extracts the file names could look like this:
String patternStr = "([^ .]+ +)*\\S*\\.\\S+";
String input = "file1.txt .project file2.jpg tricky file name.txt other tricky filenames containing áéíoéáóó.gif";
Pattern pattern = Pattern.compile(patternStr, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
System.out.println(matcher.group());
}
If you want to use regular expressions you can find all the occurrences of:
(\S.*?\.\S+)
If there are spaces in the file names, it gets trickier.
If you can assume there are no dots (.) inside the file names other than the one before the extension, you can use the dot to find each individual record, as has been suggested.
If you can't assume that (e.g. my file.new something.txt), I'd suggest you create a list of acceptable extensions, e.g. .doc, .jpg, .pdf etc.
I know the list may be long, so it's not ideal. Once you have it, you can look for these extensions and assume that what comes before each one is a valid file name.
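The extension-list idea could be sketched roughly as follows. The class and method names are hypothetical, and the set of extensions is an assumption the caller has to supply. The regex takes the shortest run of text that ends in a dot plus one of the listed extensions, followed by whitespace or the end of the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtensionSplitter {
    // Splits the input into file names that end in one of the given extensions.
    static List<String> split(String input, String... extensions) {
        StringBuilder alternatives = new StringBuilder();
        for (String ext : extensions) {
            if (alternatives.length() > 0) {
                alternatives.append('|');
            }
            alternatives.append(Pattern.quote(ext)); // treat the extension literally
        }
        // \S.*?  shortest text starting with a non-space character
        // \.(?:...)  a dot followed by a known extension
        // (?=\s|$)  which must be followed by whitespace or end of input
        Pattern pattern = Pattern.compile("\\S.*?\\.(?:" + alternatives + ")(?=\\s|$)");
        List<String> names = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            names.add(matcher.group());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(split("my file.new something.txt file2.jpg", "txt", "jpg", "gif"));
    }
}
```

With this input, "my file.new something.txt" stays in one piece because .new is not in the list. A name that consists only of a dot and an extension is not matched by this sketch.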
String txt = "file1.txt file2.jpg tricky file name.txt other tricky filenames containing áéíőéáóó.gif";
Pattern pattern = Pattern.compile("\\S.*?\\.\\S+"); // Get regex from enrico.bacis
Matcher matcher = pattern.matcher(txt);
while (matcher.find()) {
System.out.println(matcher.group().trim());
}

Java regex for Windows file path

I'm trying to build a Java regex to search a .txt file for a Windows formatted file path, however, due to the file path containing literal backslashes, my regex is failing.
The .txt file contains the line:
C\Windows\SysWOW64\ntdll.dll
However, some of the filenames in the text file are formatted like this:
C\Windows\SysWOW64\ntdll.dll (some developer stuff here...)
So I'm unable to use String.equals
To match this line, I'm using the regex:
filename = "C\\Windows\\SysWOW64\\ntdll.dll"
read = BufferedReader.readLine();
if (Pattern.compile(Pattern.quote(filename), Pattern.CASE_INSENSITIVE).matcher(read).find()) {
I've tried escaping the literal backslashes, using the replace method, i.e:
filename.replace("\\", "\\\\");
However, this fails to find a match. I'm guessing that is because I need to escape the backslashes further after the Pattern has been built; I'm thinking I might need up to four additional backslashes, i.e.:
Pattern.replaceAll("\\\\", "\\\\\\\\");
However, each time I try, the pattern doesn't get matched. I'm certain it's a problem with the backslashes, but I'm not sure where to do the replacement, or if there's a better way of building the pattern.
I think the problem is further compounded because the replaceAll method also uses a regex, which means the pattern will have its own backslashes in there to deal with the case insensitivity.
Any input or advice would be appreciated.
Thanks
It seems like you're attempting to do a direct comparison of one String against another. For exact matches, you could do:
if (read.equalsIgnoreCase(filename)) {
or, if the line only has to begin with the file name, simply:
if (read.startsWith(filename)) {
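As a hedged sketch of both routes without any manual backslash escaping: Pattern.quote already makes the whole file name literal for the regex engine, and String.regionMatches gives a case-insensitive prefix test with no regex at all. The class and method names below are invented for this example.

```java
import java.util.regex.Pattern;

class PathLineMatcher {
    // Regex route: quote the file name so backslashes and dots are literal.
    static boolean containsIgnoreCase(String line, String filename) {
        return Pattern.compile(Pattern.quote(filename), Pattern.CASE_INSENSITIVE)
                .matcher(line)
                .find();
    }

    // Plain-string route: case-insensitive check that the line starts with the name.
    static boolean startsWithIgnoreCase(String line, String filename) {
        return line.regionMatches(true, 0, filename, 0, filename.length());
    }

    public static void main(String[] args) {
        String filename = "C\\Windows\\SysWOW64\\ntdll.dll";
        String line = "c\\windows\\syswow64\\NTDLL.DLL (some developer stuff here...)";
        System.out.println(containsIgnoreCase(line, filename));   // prints true
        System.out.println(startsWithIgnoreCase(line, filename)); // prints true
    }
}
```

In both cases the line must contain single backslashes, exactly as it was read from the file; no replace call is needed on either string.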
Try this :
While reading each line from the file, replace '\' by '\\'.
Then :
String lLine = "C\\Windows\\SysWOW64\\ntdll.dll";
Pattern lPattern = Pattern.compile("C\\\\Windows\\\\SysWOW64\\\\ntdll\\.dll");
Matcher lMatcher = lPattern.matcher(lLine);
if(lMatcher.find()) {
System.out.println(lMatcher.group());
}
lLine = "C\\Windows\\SysWOW64\\ntdll.dll (some developer stuff here...)";
lMatcher = lPattern.matcher(lLine);
if(lMatcher.find()) {
System.out.println(lMatcher.group());
}
The correct usage will be:
String filename = "C\\Windows\\SysWOW64\\ntdll.dll";
String file = filename.replace('\\', ' ');

Java: parsing file path

I need to parse a file path to get the file name from it. What confuses me is that Windows uses \ as the delimiter and Linux uses /, and somehow the provided file path can even contain both delimiters at the same time.
Of course I can do:
int slash = filePath.lastIndexOf('/');
int backslash = filePath.lastIndexOf('\\');
// +1 skips the separator itself; if neither is found, the whole string is kept
String fileName = filePath.substring(Math.max(slash, backslash) + 1);
but is there a better way in case I have more delimiters? (probably not for a file path)
Just use the File class:
String fileName = new File(path).getName();
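On Windows it handles forward and backward slashes, plus combinations of the two. On Unix-like systems only / is treated as a separator, so a backslash is taken as part of the name.
You can use
String separator = System.getProperty("file.separator");
to get your system's file separator. (The "path.separator" property is something else: it separates the entries in a list of paths, e.g. ; on Windows.)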
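If the path can mix both delimiters and the code has to behave the same on every platform, a plain-string sketch along the lines of the question's own approach avoids depending on the host's separator. The class and method names are made up for illustration.

```java
class FileNames {
    // Returns everything after the last '/' or '\', whichever comes later.
    static String baseName(String path) {
        int cut = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        return path.substring(cut + 1); // cut is -1 when no separator is present
    }

    public static void main(String[] args) {
        System.out.println(baseName("C:\\dir/sub\\file.txt")); // prints file.txt
        System.out.println(baseName("/home/user/notes.md"));   // prints notes.md
    }
}
```

The trade-off is that on Unix-like systems a backslash is a legal character in a file name, so this sketch would cut such a name short.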

Splitting filenames using system file separator symbol

I have a complete file path and I want to get the file name.
I am using the following instruction:
String[] splittedFileName = fileName.split(System.getProperty("file.separator"));
String simpleFileName = splittedFileName[splittedFileName.length-1];
But on Windows it gives:
java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
\
^
Can I avoid this exception? Is there a better way to do this?
The problem is that \ has to be escaped in order to use it as backslash within a regular expression. You should either use a splitting API which doesn't use regular expressions, or use Pattern.quote first:
// Alternative: use Pattern.quote(File.separator)
String pattern = Pattern.quote(System.getProperty("file.separator"));
String[] splittedFileName = fileName.split(pattern);
Or even better, use the File API for this:
File file = new File(fileName);
String simpleFileName = file.getName();
When you write a file name, you should use System.getProperty("file.separator").
When you read a file name, you could possibly have either the forward slash or the backward slash as a file separator.
You might want to try the following:
fileName = fileName.replace("\\", "/");
String[] splittedFileName = fileName.split("/");
String simpleFileName = splittedFileName[splittedFileName.length-1];
First of all, for this specific problem I'd recommend using the java.io.File class instead of a regex.
That being said, the root of the problem you're running into is that the backslash character '\' signifies an escape sequence in Java regular expressions. What's happening is the regex parser is seeing the backslash and expecting there to be another character after it which would complete the escape sequence. The easiest way to get around this is to use the java.util.regex.Pattern.quote() method which will escape any special characters in the string you give it.
With this change your code becomes:
String splitRegex = Pattern.quote(System.getProperty("file.separator"));
String[] splittedFileName = fileName.split(splitRegex);
String simpleFileName = splittedFileName[splittedFileName.length-1];
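The difference can be seen in a small self-contained sketch. The class and method names are hypothetical, and the separator is passed in explicitly so the Windows case can be reproduced on any platform.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SeparatorSplit {
    // Splits on the separator taken literally and returns the last segment.
    static String lastSegment(String path, String separator) {
        String[] parts = path.split(Pattern.quote(separator));
        return parts[parts.length - 1];
    }

    // Reports whether the string compiles as a regular expression.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String backslash = "\\"; // the Windows value of file.separator
        System.out.println(isValidRegex(backslash));                // prints false
        System.out.println(isValidRegex(Pattern.quote(backslash))); // prints true
        System.out.println(lastSegment("C:\\dir\\file.txt", backslash)); // prints file.txt
    }
}
```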
Another simpler way could be to do
File f = new File(path);
String fileName = f.getName();
I believe this will work provided the paths are compatible with the platform, i.e. not sure if path "c:\file.txt" will work on Linux or not.
