Java RegularExpression for " double quotes and ' ' spaces - java

I am trying to find and replace text in a file using Java, but I am unable to find a solution.
The file contents are:
"ProductCode" = "8:{3E3CDCB6-286C-4B7F-BCA6-D347A4AE37F5}"
"ProductCode" = "8:.NETFramework,Version=v4.5"
I have to update the GUID in the first line, which is 3E3CDCB6-286C-4B7F-BCA6-D347A4AE37F5. This is my attempt:
String line = "\"ProductCode\" = \"8:{3E3CDCB6-286C-4B7F-BCA6-D347A4AE37F5}\"";
String pattern = "[\"]([P][r][o][d][u][c][t][C][o][d][e]).+([\"])(\\s)[\"][8][:][{]";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);
System.out.println(m.matches());
I am getting false.
Please provide a solution if possible.
Thanks in advance.

"ProductCode" = "8:{3E3CDCB6-286C-4B7F-BCA6-D347A4AE37F5}" This is of the form:
quote + ProductCode + quote + whitespace + equals + whitespace +
quote + number + colon + any + quote
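A simple regex for this is "ProductCode"\s*=\s*"\d:(.+)" (a double quote is not special in a regex, so it needs no escaping there).
Written as a Java string literal, where quotes and backslashes must be escaped, the contents become \"ProductCode\"\\s*=\\s*\"\\d:(.+)\"
Note that matches() returns true only when the pattern covers the whole line. The pattern in the question stops at the opening brace, so it returns false.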
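Since the question asks to update the GUID rather than only match it, here is a minimal sketch of one way to do the replacement with capture groups. The class name, the method name and the replacement GUID are made up for illustration, and the pattern assumes the GUID consists only of hex digits and hyphens.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ProductCodeUpdater {
    // Group 1: the prefix up to and including "{"; group 2: the GUID; group 3: the closing "}" and quote.
    private static final Pattern PRODUCT_CODE = Pattern.compile(
            "(\"ProductCode\"\\s*=\\s*\"\\d+:\\{)([0-9A-Fa-f-]+)(\\}\")");

    /** Returns the line with its GUID replaced, or the line unchanged if it does not match. */
    static String replaceGuid(String line, String newGuid) {
        Matcher m = PRODUCT_CODE.matcher(line);
        if (!m.find()) {
            return line;
        }
        // Concatenate by hand so that newGuid is never interpreted as a replacement pattern.
        return line.substring(0, m.start()) + m.group(1) + newGuid + m.group(3)
                + line.substring(m.end());
    }

    public static void main(String[] args) {
        String guidLine = "\"ProductCode\" = \"8:{3E3CDCB6-286C-4B7F-BCA6-D347A4AE37F5}\"";
        String otherLine = "\"ProductCode\" = \"8:.NETFramework,Version=v4.5\"";
        // Hypothetical replacement GUID, for illustration only.
        String newGuid = "11111111-2222-3333-4444-555555555555";
        System.out.println(replaceGuid(guidLine, newGuid));  // GUID line is rewritten
        System.out.println(replaceGuid(otherLine, newGuid)); // .NETFramework line is left alone
    }
}
```

Because the pattern requires an opening brace after the colon, the second "ProductCode" line in the file is not touched.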

Try this pattern:
String pattern = "^\\\"(ProductCode)\\\"\\s\\=\\s\\\"\\w\\:\\{(\\w+\\-*\\w+\\-\\w+\\-\\w+\\-\\w+)\\}\\\"$";

Using a regex for this problem is like taking a sledgehammer to crack a nut. Plain string operations are simpler:
final String line = "\"ProductCode\" = \"8:{3E3CDCB6-286C-4B7F-BCA6-D347A4AE37F5}\"";
final String prefix = "\"ProductCode\" = \"8:{";
final int prefixIndex = line.indexOf(prefix);
final String suffix = "}\"";
final int suffixIndex = line.indexOf(suffix);
final String guid = line.substring(prefixIndex + prefix.length(), suffixIndex);

Related

Android Java - String .replaceAll to replace specific characters (regex)

I need to remove some specific "special" characters and replace them with empty string if they show up.
I am currently having a problem with the regex, probably with the Java escaping. I can't put them all together, it just doesn't work, I tried a lot! T_T
Currently I am doing it one by one, which is kinda silly, but for now at least it works, like this:
public static String filterSpecialCharacters(String string) {
string = string.replaceAll("-", "");
string = string.replaceAll("\\[", "");
string = string.replaceAll("\\]", "");
string = string.replaceAll("\\^", "");
string = string.replaceAll("/", "");
string = string.replaceAll(",", "");
string = string.replaceAll("'", "");
string = string.replaceAll("\\*", "");
string = string.replaceAll(":", "");
string = string.replaceAll("\\.", "");
string = string.replaceAll("!", "");
string = string.replaceAll(">", "");
string = string.replaceAll("<", "");
string = string.replaceAll("~", "");
string = string.replaceAll("#", "");
string = string.replaceAll("#", "");
string = string.replaceAll("$", "");
string = string.replaceAll("%", "");
string = string.replaceAll("\\+", "");
string = string.replaceAll("=", "");
string = string.replaceAll("\\?", "");
string = string.replaceAll("|", "");
string = string.replaceAll("\"", "");
string = string.replaceAll("\\\\", "");
string = string.replaceAll("\\)", "");
string = string.replaceAll("\\(", "");
return string;
}
These are all the characters I need to remove:
- [ ] ^ / , ' * : . ! > < ~ # # $ % + = ? | " \ ) (
I am clearly missing something, I can't figure out how to put it all in one line. Help?
In fact, your code does not work, because .replaceAll("$", "") replaces the end of the string with an empty string. To replace a literal $, you need to escape it. The same issue applies to the pipe symbol removal.
All you need to do is put the characters you want to remove into a character class and apply the + quantifier for better performance, like this:
string = string.replaceAll("[-\\[\\]^/,'*:.!><~##$%+=?|\"\\\\()]+", "");
Note that inside a character class, most "special regex metacharacters" lose their special status. You only have to escape [, ], \, a hyphen (if it is not at the start or end of the character class), and a ^ (if it is the first symbol in a "positive" character class).
DEMO:
String s = "-[]^/,'*:.!><~##$%+=?|\"\\()TEXT";
s = s.replaceAll("[-\\[\\]^/,'*:.!><~##$%+=?|\"\\\\()]+", "");
System.out.println(s); // => TEXT
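To make the point about $ and | concrete, here is a small sketch comparing the unescaped calls from the question with escaped and literal alternatives. The class and helper names are hypothetical. String.replace and Pattern.quote are standard JDK methods that treat their argument as plain text.

```java
import java.util.regex.Pattern;

class DollarAndPipeDemo {
    /** Removes every occurrence of a literal string, with no regex interpretation. */
    static String removeLiteral(String input, String literal) {
        return input.replace(literal, "");
    }

    /** Same result via replaceAll, quoting the literal so metacharacters lose their meaning. */
    static String removeQuoted(String input, String literal) {
        return input.replaceAll(Pattern.quote(literal), "");
    }

    public static void main(String[] args) {
        String s = "a$b|c";
        System.out.println(s.replaceAll("$", ""));   // unescaped $ is the end-of-input anchor
        System.out.println(s.replaceAll("|", ""));   // unescaped | is an empty alternation
        System.out.println(s.replaceAll("\\$", "")); // escaped: removes the literal $
        System.out.println(s.replaceAll("\\|", "")); // escaped: removes the literal |
        System.out.println(removeLiteral(s, "$"));
        System.out.println(removeQuoted(s, "|"));
    }
}
```

The first two calls leave the string unchanged, which is why the one-by-one version in the question silently keeps $ and | in the output.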
Use this code:
String REGEX = "YOUR_REGEX";
Pattern p = Pattern.compile(REGEX);
Matcher m = p.matcher(yourString);
yourString = m.replaceAll("");
UPDATE:
Your REGEX looks something like
String REGEX = "-|\\[|\\]|\\^|\\/|,|'|\\*|\\:|\\.|!|>|<|\\~|#|#|\\$|%|\\+|=|\\?|\\||\\\\|\\\\\\\\|\\)|\\(";
SAMPLE:
String yourString = "#My (name) -is #someth\\ing\"";
// Use the code above
Log.d("yourString",yourString);
OUTPUT

java: Extract a substring using regular expression

I have String data from which I want to extract a substring, but I am stuck on creating the regex pattern for it. The String data is the following:
$.ajax({url:"Q" + "uestions?"
+ "" + "action="
+ "maxim" + "um&"
+ "p043366329446409=08315891235072667&"
+ "c" + "ity="
+ k.val() + "&"
+ e + "=888",success:succFun,error:errFun,async:false});
};
I want to extract the p043366329446409=08315891235072667 part from the above string. This data changes every time I make a request to the server, but "p0" will always start the substring and &" will always end it.
Thanks, everyone.
Try this one:
String mydata = "<query string>";
Pattern pattern = Pattern.compile("p0([0-9]+)=([0-9]+)&");
Matcher matcher = pattern.matcher(mydata);
int start=0,end=0;
if(matcher.find())
{
start=matcher.start();
end=matcher.end();
System.out.println(mydata.substring(start,end-1));
}
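Try this (the (?s) flag lets . match across line breaks, and the value is not required to follow an &):
String p0 = s.replaceAll("(?s).*(p0\\d+=\\d+)&.*", "$1");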
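As a variation on the find()-based answer, a capture group can hold the wanted part so that no index arithmetic is needed to drop the trailing &. This is only a sketch: the class and method names are hypothetical, and it assumes the parameter name and value consist of digits only, as in the sample data.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QueryParamExtractor {
    // "p0", more digits, "=", digits; the trailing "&" must be present but stays outside group 1.
    private static final Pattern P0_PARAM = Pattern.compile("(p0\\d+=\\d+)&");

    /** Returns the first "p0...=..." pair found in the data, if any. */
    static Optional<String> extractP0(String data) {
        Matcher m = P0_PARAM.matcher(data);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // A shortened copy of the data from the question, including the line breaks.
        String data = "+ \"maxim\" + \"um&\"\n"
                + "+ \"p043366329446409=08315891235072667&\"\n"
                + "+ \"c\" + \"ity=\"";
        System.out.println(extractP0(data).orElse("not found"));
    }
}
```

Returning an Optional makes the "no match" case explicit for the caller instead of printing nothing.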

How to create variables from an input file

In my program I need to loop through a variety of dates. I am writing this program in java, and have a bit of experience with readers, but I do not know which reader would complete this task the best, or if another class would work better.
The dates would be input into a text file in the format as follows:
1/1/2013 to 1/7/2013
1/8/2013 to 1/15/2013
Or something of this manner. I would need to break each range of dates into 6 local variables for the loop, then change them for the next loop. The variables would be coded for example:
private static String startingMonth = "1";
private static String startingDay = "1";
private static String startingYear = "2013";
private static String endingMonth = "1";
private static String endingDay = "7";
private static String endingYear = "2013";
I imagine this could be done by creating several delimiters to look for, but I do not know whether this would be the easiest way. I have been looking at this post for help, but can't seem to find a relevant answer. What would be the best way to go about this?
There are several options.
You could use a Scanner and set the delimiter to include the slash. If you want the values as ints and not Strings, just use sc.nextInt().
// Split on runs of whitespace or on a slash
Scanner sc = new Scanner(input).useDelimiter("\\s+|/");
// You can skip the loop to just read a single line.
while(sc.hasNext()) {
startingMonth = sc.next();
startingDay = sc.next();
startingYear = sc.next();
// skip "to"
sc.next();
endingMonth = sc.next();
endingDay = sc.next();
endingYear = sc.next();
}
You can use a regex, as alfasin suggests, but this case is rather simple, so you can just look for the first and last space.
String str = "1/1/2013 to 1/7/2013";
String startDate = str.substring(0,str.indexOf(" "));
String endDate = str.substring(str.lastIndexOf(" ")+1);
// The rest is the same:
String[] start = startDate.split("/");
System.out.println(start[0] + "-" + start[1] + "-" + start[2]);
String[] end = endDate.split("/");
System.out.println(end[0] + "-" + end[1] + "-" + end[2]);
String str = "1/1/2013 to 1/7/2013";
Pattern pattern = Pattern.compile("(\\d+/\\d+/\\d+)");
Matcher matcher = pattern.matcher(str);
matcher.find();
String startDate = matcher.group();
matcher.find();
String endDate = matcher.group();
String[] start = startDate.split("/");
System.out.println(start[0] + "-" + start[1] + "-" + start[2]);
String[] end = endDate.split("/");
System.out.println(end[0] + "-" + end[1] + "-" + end[2]);
...
OUTPUT
1-1-2013
1-7-2013
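The six separate values the question asks for can also be pulled out in one step with six capture groups. The sketch below is one possible way, assuming every line has exactly the form "M/D/YYYY to M/D/YYYY". The class and method names are hypothetical, and the lines are held in an array here instead of being read from a file, to keep the example self-contained.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateRangeParser {
    // Six groups: start month, day, year, then end month, day, year.
    private static final Pattern RANGE = Pattern.compile(
            "\\s*(\\d{1,2})/(\\d{1,2})/(\\d{4})\\s+to\\s+(\\d{1,2})/(\\d{1,2})/(\\d{4})\\s*");

    /** Returns the six parts in order, or null if the line is not a date range. */
    static String[] parse(String line) {
        Matcher m = RANGE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] parts = new String[6];
        for (int i = 0; i < parts.length; i++) {
            parts[i] = m.group(i + 1); // groups are numbered from 1
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] lines = {"1/1/2013 to 1/7/2013", "1/8/2013 to 1/15/2013"};
        for (String line : lines) {
            String[] p = parse(line);
            System.out.println("start " + p[0] + "-" + p[1] + "-" + p[2]
                    + ", end " + p[3] + "-" + p[4] + "-" + p[5]);
        }
    }
}
```

When reading from a real file, each line returned by a BufferedReader could be passed to parse in the same way.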

Regex for floor in address

I have this regex:
String regexPattern = "[0-9A-Za-z]+(st|nd|rd|th)" + " " + "floor";
I want to test it against:
String lineString = "8th floor, Prince's Building, 12 Chater Road";
so I do:
boolean isMatching = lineString.matches(regexPattern);
and it returns false. Why?
I thought it had something to do with whitespaces in Java, so I removed the whitespace in the regexPattern variable so it reads
regexPattern = "[0-9A-Za-z]+(st|nd|rd|th)floor";
and matched it with a string without white space:
String lineString = "8thfloor,Prince'sBuilding,12ChaterRoad";
it still returns false. Why? Any help very much appreciated.
String.matches() only returns true if the entire string matches the pattern.
Try adding .* to the beginning and end of your regex.
Example:
String regex = ".*[0-9A-Za-z]+(st|nd|rd|th)" + " " + "floor.*";
This is not the best approach, however...
Here's a better alternative:
String input = "8th floor, Prince's Building, 12 Chater Road";
String regex = "[0-9A-Za-z]+(st|nd|rd|th)" + " " + "floor";
Pattern p = Pattern.compile(regex);
boolean isMatch = p.matcher(input).find();
If you want to extract the floor number, do this:
String input = "8th floor, Prince's Building, 12 Chater Road";
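// The quantifier sits inside the group so that group(1) holds the whole number, not just its last character
String regex = "([0-9]+)(st|nd|rd|th)" + " " + "floor";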
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(input);
if (m.find()) {
String num = m.group(1);
String suffix = m.group(2);
System.out.println("Welcome to the " + num + suffix + " floor!");
// prints 'Welcome to the 8th floor!'
}
Check out the Pattern API for a boatload of info about Java regular expressions.
Edited, per comments ...
The [0-9A-Za-z]+ part greedily matches letters as well as digits, so it runs through the "th" suffix before backtracking.
Try [0-9]+ instead, so that only the digits are matched.
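To put the two points of this answer side by side (whole-string matches() versus find(), and a digits-only group), here is a short sketch. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FloorMatcher {
    // Group 1: the digits; group 2: the ordinal suffix.
    private static final Pattern FLOOR = Pattern.compile("([0-9]+)(st|nd|rd|th) floor");

    /** Returns the floor number as text, or null when no such phrase occurs in the address. */
    static String floorNumber(String address) {
        Matcher m = FLOOR.matcher(address);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "8th floor, Prince's Building, 12 Chater Road";
        // matches() must consume the entire string, so this is false.
        System.out.println(input.matches("[0-9]+(st|nd|rd|th) floor"));
        // find() looks for the pattern anywhere in the string.
        System.out.println(floorNumber(input));
        System.out.println(floorNumber("12th floor, Prince's Building"));
    }
}
```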

Regex to match "path/*.extension"

I am trying to find a regular expression that would match the following format:
path/*.file_extension
For example:
temp/*.jpg
usr/*.pdf
var/lib/myLib.so
tmp/
Using the regex, I want to store the matching parts into a String array, such as:
String[] tokens;
// regex magic here
String path = tokens[0];
String filename = tokens[1];
String extension = tokens[2];
In the last case, tmp/, which contains no filename or extension, tokens[1] and tokens[2] would be null.
In the case of:
usr/*.pdf
tokens[1] would contain only the string "*".
Thank you very much for your help.
If you can use Java 7, you can use named groups like this:
String data = "temp/*.jpg, usr/*.pdf, var/lib/*.so, tmp/*, usr/*, usr/*.*";
Pattern p = Pattern
.compile("(?<path>(\\w+/)+)((?<name>\\w+|[*]))?([.](?<extension>\\w+|[*]))?");
Matcher m = p.matcher(data);
while (m.find()) {
System.out.println("data=" + m.group());
System.out.println("path=" + m.group("path"));
System.out.println("name=" + m.group("name"));
System.out.println("extension=" + m.group("extension"));
System.out.println("------------");
}
This code should work:
String line = "var/lib/myLib.so";
Pattern p = Pattern.compile("(.+?(?=/[^/]*$))/([^.]+)\\.(.+)$");
Matcher m = p.matcher(line);
List<String> tokens = new ArrayList<String>();
if (m.find()) {
for (int i=1; i <= m.groupCount(); i++) {
tokens.add(m.group(i));
}
}
System.out.println("Tokens => " + tokens);
OUTPUT:
Tokens => [var/lib, myLib, so]
I'm assuming you're using Java. This should work:
Pattern.compile("path/(.*?)(?:\\.(file_extension))?");
Why use a regular expression?
I personally find lastIndexOf more readable.
String path;
String filename;
@Nullable String extension;
// Look for the last slash; everything before it is the path
int lastSlash = fullPath.lastIndexOf('/');
path = lastSlash < 0 ? "" : fullPath.substring(0, lastSlash);
// Look for the last dot; it only counts if it comes after the last slash
int lastDot = fullPath.lastIndexOf('.');
if (lastDot <= lastSlash) {
filename = fullPath.substring(lastSlash + 1);
// If there is no dot, then there is no extension which
// is distinct from the empty extension in "foo/bar."
extension = null;
} else {
filename = fullPath.substring(lastSlash + 1, lastDot);
extension = fullPath.substring(lastDot + 1);
}
As a different approach, simple use of the substring() and lastIndexOf() methods should serve the purpose:
String filePath = "var/lib/myLib.so";
// File name including its extension, e.g. "myLib.so"
String fullName = filePath.substring(filePath.lastIndexOf('/')+1);
String path = filePath.substring(0, filePath.lastIndexOf('/'));
String fileName = fullName.substring(0, fullName.lastIndexOf('.'));
String extension = fullName.substring(fullName.lastIndexOf('.')+1);
Please note: you need to handle the alternate scenarios, e.g. a file path without an extension.
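One way to cover those alternate scenarios is sketched below. It returns the three tokens the question asks for, with null for a missing filename or extension. The class and method names are hypothetical, and the choice to return the path without its trailing slash is an assumption, since the question does not say which form it wants.

```java
import java.util.Arrays;

class PathSplitter {
    /** Returns {path, filename, extension}; filename and extension are null when absent. */
    static String[] split(String fullPath) {
        int lastSlash = fullPath.lastIndexOf('/');
        String path = lastSlash < 0 ? "" : fullPath.substring(0, lastSlash);
        String name = fullPath.substring(lastSlash + 1);
        if (name.isEmpty()) {
            return new String[] {path, null, null}; // e.g. "tmp/"
        }
        int dot = name.lastIndexOf('.');
        if (dot < 0) {
            return new String[] {path, name, null}; // a filename without an extension
        }
        return new String[] {path, name.substring(0, dot), name.substring(dot + 1)};
    }

    public static void main(String[] args) {
        String[] examples = {"temp/*.jpg", "usr/*.pdf", "var/lib/myLib.so", "tmp/"};
        for (String example : examples) {
            System.out.println(example + " => " + Arrays.toString(split(example)));
        }
    }
}
```

Searching for the dot only within the part after the last slash avoids mistaking a dot in a directory name for the start of an extension.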
