Java: Get all Strings using Regex

I'm trying to get all the strings from a JavaScript script. I wrote some code, but it doesn't catch all of them; it skips some.
My code:
String Strings;
public String GetStrings(String str){
try{
String Str= str;
Strings = "";
while(true){
Pattern pattern = Pattern.compile("('|\")");
Matcher matcher = pattern.matcher(Str);
if(matcher.find()){
Pattern pattern1 = Pattern.compile("(" + matcher.group(1) + "[^" + matcher.group(1) + "]*" + matcher.group(1) + ")");
Matcher matcher1 = pattern1.matcher(Str);
if(matcher1.find()){
Strings += "|" + matcher1.group(1) + "|";
Str = Str.replace(matcher1.group(1)," ");
}
}else{
break;
}
}
}catch(Exception err){return err.toString(); }
return Strings;
}
Input
var A="&";var B="(";var D="[]";var X="'";var W='&';var Q='';var STR="'";var Q="'******'";var G="^";var F="...";var T='$';var wm = "()"
console.log(A + B + D + "^" + wm + '#');
Output
|"&"||"("||"[]"||"'"||'&'||''||"'******'"||"^"||"..."||'$'||"()"||'#'|
As you can see, not all of the strings were captured; some did not appear. If anyone has a solution or can point out the problem, please help me.

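Your loop skips strings because Str.replace(...) replaces every occurrence of the matched literal, so a literal that appears twice in the input (such as "'" or "^") disappears the first time it is found. Instead, match all the literals in a single pass with the following regex:
(\"(.*?)\")|(\'(.*?)\')
Example: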
public String getStrings(String str){
String regex = "(\\\"(.*?)\\\")|(\\'(.*?)\\')";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
String output = "";
while (matcher.find()){
output = output+"|"+matcher.group(0)+"|";
}
return output;
}
Output:
|"&"||"("||"[]"||"'"||'&'||''||"'"||"'******'"||"^"||"..."||'$'||"()"||"^"||'#'|
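The two alternatives in the regex above can also be folded into one by using a backreference. The following is only a minimal sketch, not the answerer's code: the class name QuotedLiterals and the method extract are made up for illustration, and like the regex above it assumes the literals contain no backslash-escaped quotes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuotedLiterals {
    // (['"]) captures the opening quote; \1 requires the same quote to close it.
    // Assumption: literals contain no backslash-escaped quotes.
    private static final Pattern LITERAL = Pattern.compile("(['\"])(.*?)\\1");

    // Hypothetical helper: returns every quoted literal, quotes included.
    static List<String> extract(String source) {
        List<String> literals = new ArrayList<>();
        Matcher matcher = LITERAL.matcher(source);
        while (matcher.find()) {
            literals.add(matcher.group());
        }
        return literals;
    }

    public static void main(String[] args) {
        String js = "var X=\"'\";var W='&';var STR=\"'\";console.log(X + \"^\" + '#');";
        System.out.println(extract(js));
    }
}
```

Because the matcher walks the input once and never edits it, repeated literals are reported each time they occur.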

The input and the expected output do not match, but this is what I understood:
public String GetStrings(String str){
StringBuffer b = new StringBuffer();
for (int i = 0; i < str.length(); ++i) {
char ch = str.charAt(i);
if (Character.isWhitespace(ch))
b.append("\\s");
else if (Character.isDigit(ch))
b.append("\\d");
else if (Character.isUpperCase(ch))
b.append("A-Z");
else if (Character.isLowerCase(ch))
b.append("a-z");
}
b.append("||");
// The method is declared to return a String, so return the built text.
return b.toString();
}

Related

Converting all float values in String from scientific notation to decimal notation

So I have an XML string that looks like this:
<CONFIG><Setting1><o1>44</o1><o2>1.0E-4</o2><o3>955</o3><o4>1.5E-4</o4><o5>Surname</o5></setting1>....</CONFIG>
How would I go about converting every float in the string from scientific notation to decimal notation?
Edit: To clarify, I'm not looking to convert only a single float value from scientific to decimal notation. The string is read from an XML file that I serialized from a POJO, so all of the float values in the string need to be converted. Sadly, the XML framework I used (SimpleXML) only represents floats in scientific notation.
UPDATE:
I tried finding the float values with a regex, and it works. "found" will be the newly converted decimal. How would I go about replacing each matched pattern with the "found" string?
public static void ScientificToDecimal(String text){
String found;
Pattern pattern = Pattern.compile("\\d+[.]\\d+E[+-]\\d");
Matcher matcher = pattern.matcher(text);
while(matcher.find()){
found = new BigDecimal(matcher.group()).toPlainString();
Log.i("Converted: ", matcher.group() + " to " + found);
}
}
UPDATE2: This works well enough for me.
public static String scientificToDecimal(String text){
String replacementText = "";
StringBuffer sb = new StringBuffer();
Pattern pattern = Pattern.compile("\\d+[.]\\d+E[+-]\\d");
Matcher matcher = pattern.matcher(text);
while(matcher.find()){
replacementText = new BigDecimal(matcher.group()).toPlainString();
matcher.appendReplacement(sb,replacementText);
Log.i("Converted: ", matcher.group() + " to " + replacementText);
}
matcher.appendTail(sb);
return sb.toString();
}
Think about patterns like >1.0E-4< or >1.5E-4<, regular expressions, string replacement, and so on.
Use XmlPullParser (consult its guide) to get the double values and then convert them, or alternatively use your regex.
An enhancement to handle the following scenarios:
a) 1.0E-4
b) 1.0E4
c) 1.0E+4
public static String scientificToDecimal(String text){
StringBuffer sb = new StringBuffer();
// An optional sign and one or more exponent digits cover 1.0E-4, 1.0E4 and 1.0E+4
// in one pass, so a text that mixes these forms is converted completely.
Pattern pattern = Pattern.compile("\\d+[.]\\d+E[-+]?\\d+");
Matcher matcher = pattern.matcher(text);
while(matcher.find()){
String replacementText = new BigDecimal(matcher.group()).toPlainString();
matcher.appendReplacement(sb,replacementText);
// System.out.println("Converted: " + matcher.group() + " to " + replacementText);
}
matcher.appendTail(sb);
return sb.toString();
}
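On Java 9 and later, the appendReplacement/appendTail loop can be written more compactly with Matcher.replaceAll(Function). The sketch below is an illustration only: the class name ScientificNotation and the method toDecimal are hypothetical, it assumes the same number format as the regexes above (digits, a dot, digits, then E with an optional sign), and you should check that this overload is available on your target platform before relying on it.

```java
import java.math.BigDecimal;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ScientificNotation {
    // Optional sign and one or more exponent digits: 1.0E-4, 1.0E4, 1.0E+4.
    private static final Pattern SCIENTIFIC = Pattern.compile("\\d+[.]\\d+E[-+]?\\d+");

    // Hypothetical helper: rewrites every match in plain decimal notation.
    static String toDecimal(String text) {
        // Matcher.replaceAll(Function) exists from Java 9 onwards.
        return SCIENTIFIC.matcher(text).replaceAll(match ->
                Matcher.quoteReplacement(new BigDecimal(match.group()).toPlainString()));
    }

    public static void main(String[] args) {
        System.out.println(toDecimal("<o2>1.0E-4</o2><o3>955</o3><o4>1.5E4</o4>"));
    }
}
```

Values that are not in scientific notation, such as 955 here, do not match the pattern and are left untouched.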

Java regex modifying the lookaround assertions

I have a string
String str = "(varA=0.1;varB=0.2;varC<0.3;varD>=0.4)<?0.1>(varA=1.1;varB=1.2;varC<1.3;varD>=1.4)";
and I want to split it into
(varA=0.1;varB=0.2;varC<0.3;varD>=0.4)
<?0.1>
(varA=1.1;varB=1.2;varC<1.3;varD>=1.4)
I have tried
String[] parts = str.split("(?<=>)|(?=<)");
But it didn't work (it pops an error).
Any suggestions? Thanks!
You can use:
String[] parts = str.split("(?<=\\))(?=<)|(?<=>)(?=\\()");
Output:
(varA=0.1;varB=0.2;varC<0.3;varD>=0.4)
<?0.1>
(varA=1.1;varB=1.2;varC<1.3;varD>=1.4)
Try
public static void main(String[] args) {
String line = "(varA=0.1;varB=0.2;varC<0.3;varD>=0.4)<?0.1>(varA=1.1;varB=1.2;varC<1.3;varD>=1.4)";
String pattern = "(\\(.*\\))(<.*?>)(.*)";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);
if (m.find( )) {
System.out.println("Found value: " + m.group(1) );
System.out.println("Found value: " + m.group(2) );
System.out.println("Found value: " + m.group(3) );
}
}
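The split attempted in the question, "(?<=>)|(?=<)", cuts after every > and before every <, including the ones inside the groups (varC<0.3, varD>=0.4), which is why it produces too many pieces. As an alternative to splitting, the pieces can be matched directly. This is a sketch under the assumption that a group never contains ')' and a separator never contains '>'; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupTokenizer {
    // Either a parenthesised group or an angle-bracketed separator.
    // Assumption: groups contain no ')' and separators contain no '>'.
    private static final Pattern TOKEN = Pattern.compile("\\([^)]*\\)|<[^>]*>");

    // Hypothetical helper: collects the tokens by matching instead of splitting.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    // The lookaround split from the answer above, for comparison.
    static List<String> splitWithLookarounds(String input) {
        return Arrays.asList(input.split("(?<=\\))(?=<)|(?<=>)(?=\\()"));
    }

    public static void main(String[] args) {
        String str = "(varA=0.1;varB=0.2;varC<0.3;varD>=0.4)<?0.1>(varA=1.1;varB=1.2;varC<1.3;varD>=1.4)";
        tokenize(str).forEach(System.out::println);
    }
}
```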

How to match the text file against multiple regex patterns and count the number of occurences of these patterns?

I want to find and count all the occurrences of the words unit, device, method, module in every line of the text file separately. That's what I've done, but I don't know how to use multiple patterns and how to count the occurrence of every word in the line separately? Now it counts only occurrences of all words together for every line. Thank you in advance!
private void countPaterns() throws IOException {
Pattern nom = Pattern.compile("unit|device|method|module|material|process|system");
String str = null;
BufferedReader r = new BufferedReader(new FileReader("D:/test/test1.txt"));
while ((str = r.readLine()) != null) {
Matcher matcher = nom.matcher(str);
int countnomen = 0;
while (matcher.find()) {
countnomen++;
}
//intList.add(countnomen);
System.out.println(countnomen + " davon ist das Wort System");
}
r.close();
//return intList;
}
It is better to use word boundaries and a map that keeps a count for each matched keyword.
Pattern nom = Pattern.compile("\\b(unit|device|method|module|material|process|system)\\b");
String str = null;
BufferedReader r = new BufferedReader(new FileReader("D:/test/test1.txt"));
Map<String, Integer> counts = new HashMap<>();
while ((str = r.readLine()) != null) {
Matcher matcher = nom.matcher(str);
while (matcher.find()) {
String key = matcher.group(1);
int c = 0;
if (counts.containsKey(key))
c = counts.get(key);
counts.put(key, c+1);
}
}
r.close();
System.out.println(counts);
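The map above accumulates counts over the whole file, while the question asks for counts per line. One way to get that is to build a fresh map for each line. The sketch below assumes the lines are already in memory (it takes strings rather than reading D:/test/test1.txt), and the class name KeywordCounter and its method are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class KeywordCounter {
    private static final Pattern KEYWORD =
            Pattern.compile("\\b(unit|device|method|module|material|process|system)\\b");

    // Hypothetical helper: counts each keyword within a single line.
    static Map<String, Integer> countInLine(String line) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted keys, stable output
        Matcher matcher = KEYWORD.matcher(line);
        while (matcher.find()) {
            // merge() inserts 1 for a new key or adds 1 to the existing count.
            counts.merge(matcher.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "the unit and the device unit",
                "a method for a system");
        for (String line : lines) {
            System.out.println(countInLine(line));
        }
    }
}
```

When reading from a file, the same method can be called on each string returned by readLine().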
Here's a Java 9 (and above) solution:
public static void main(String[] args) {
List<String> expressions = List.of("(good)", "(bad)");
String phrase = " good bad bad good good bad bad bad";
for (String regex : expressions) {
Pattern gPattern = Pattern.compile(regex);
Matcher matcher = gPattern.matcher(phrase);
long count = matcher.results().count();
System.out.println("Pattern \"" + regex + "\" appears " + count + (count == 1 ? " time" : " times"));
}
}
Outputs
Pattern "(good)" appears 3 times
Pattern "(bad)" appears 5 times

Use regex to parse in Android

I would like to use regex to parse a message received through a socket in an Android Client and put part of the message in a list.
This is the message to parse:
{Code=1;NumServices=3;Service1=World Weather Online;Link1=http://www.worldweatheronline.com/;Service2=Open Weather Map;Link2=http://openweathermap.org/;Service3=Weather;Link3=http://www.weather.gov/;}
and the method I'm using:
private void parse(String mess) {
String Code="0";
Pattern pattern = Pattern.compile("Code=(.*?);");
Matcher matcher = pattern.matcher(mess);
while (matcher.find()) {
Code = matcher.group(1);
Log.d("Matcher", "PATTERN MATCHES! Code parsed "+Code );
// System.out.println("Code: "+Code);
}
Log.d("Matcher", "PATTERN MATCHES! Code not parsed "+Code );
if(Code.compareTo("1")==0){
// System.out.println("testing the parser");
// Pattern pattern1 = Pattern.compile(";CPU=(.*?);Screen");
Pattern pattern2 = Pattern.compile("NumServices=(.*?);");
Matcher matcher2 = pattern2.matcher(mess);
int number=0;
if (matcher2.find()) {
String numb = matcher2.group(1);
this.tester = numb;
Log.d("Matcher", "PATTERN MATCHES! numb services");
number = Integer.parseInt(numb);
}
else{
this.tester = "NOT FOUND";
Log.d("Matcher", "PATTERN MATCHES! match num failed");
}
int i;
for(i=1;i<=number;i++){
Pattern pattern3 = Pattern.compile(";Service"+i+"=(.*?);");
Pattern pattern4 = Pattern.compile(";Link"+i+"=(.*?);");
Matcher matcher3 = pattern3.matcher(mess);
Matcher matcher4 = pattern4.matcher(mess);
if (matcher3.find()) {
// Log.d("Matcher", "PATTERN MATCHES! services");
String serv = matcher3.group(1);
// this.tester = serv;
your_array_list.add(serv);
}
if (matcher4.find()) {
Log.d("Matcher", "PATTERN MATCHES! links");
String link = matcher4.group(1);
your_array_list2.add(link);
}
}
}
}
None of the Log.d calls print anything, so I cannot verify the flow of the code. What's weird is that I tested the same code in Eclipse and it works. When I use a Toast to display the result, it shows the value of Code, but not of Service. Is there an error somewhere, or does regex work differently on Android?
Thanks.
You can actually use a single regex to capture all the pertinent data:
Code=([^;]*);NumServices=([^;]*);|Service(\d+)=([^;]*);Link\d+=([^;]*);
Here is some sample code:
String str = "{Code=1;NumServices=3;Service1=World Weather Online;Link1=http://www.worldweatheronline.com/;Service2=Open Weather Map;Link2=http://openweathermap.org/;Service3=Weather;Link3=http://www.weather.gov/;}";
Pattern ptrn = Pattern.compile("Code=([^;]*);NumServices=([^;]*);|Service(\\d+)=([^;]*);Link\\d+=([^;]*);");
Matcher matcher = ptrn.matcher(str);
while (matcher.find()) {
if (matcher.group(1) != null) {
System.out.println("Code: " + matcher.group(1));
System.out.println("NumServices: " + matcher.group(2));
}
else if (matcher.group(1) == null && matcher.group(2) == null) {
System.out.println("Service #: " + matcher.group(3));
System.out.println("Service Name: " + matcher.group(4));
System.out.println("Link: " + matcher.group(5));
}
}
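Since the message is a flat list of key=value pairs separated by semicolons, it can also be parsed without any regex groups. This swaps in a different technique (plain splitting) from the one in the answer above, and it is only a sketch: MessageParser and parse are hypothetical names, and it assumes that values never contain ';'.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MessageParser {
    // Hypothetical helper: turns "{k1=v1;k2=v2;}" into an ordered map.
    // Assumption: values never contain ';'.
    static Map<String, String> parse(String message) {
        String body = message.trim();
        if (body.startsWith("{")) {
            body = body.substring(1);
        }
        if (body.endsWith("}")) {
            body = body.substring(0, body.length() - 1);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : body.split(";")) {
            int eq = pair.indexOf('='); // split on the first '=' only
            if (eq > 0) {
                fields.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String mess = "{Code=1;NumServices=2;Service1=World Weather Online;"
                + "Link1=http://www.worldweatheronline.com/;Service2=Open Weather Map;"
                + "Link2=http://openweathermap.org/;}";
        Map<String, String> fields = parse(mess);
        int n = Integer.parseInt(fields.get("NumServices"));
        for (int i = 1; i <= n; i++) {
            System.out.println(fields.get("Service" + i) + " -> " + fields.get("Link" + i));
        }
    }
}
```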

Words inside square brackets - RegExp

String linkPattern = "\\[[A-Za-z_0-9]+\\]";
String text = "[build]/directory/[something]/[build]/";
RegExp reg = RegExp.compile(linkPattern,"g");
MatchResult matchResult = reg.exec(text);
for (int i = 0; i < matchResult.getGroupCount(); i++) {
System.out.println("group" + i + "=" + matchResult.getGroup(i));
}
I am trying to get all the blocks that are enclosed in square brackets from a path string,
but I only get group0="[build]". What I want is:
1:"[build]" 2:"[something]" 3:"[build]"
EDIT:
Just to be clear, the words inside the brackets are generated as random text:
public static String genText()
{
final int LENGTH = (int)(Math.random()*12)+4;
StringBuffer sb = new StringBuffer();
for (int x = 0; x < LENGTH; x++)
{
sb.append((char)((int)(Math.random() * 26) + 97));
}
String str = sb.toString();
str = str.substring(0,1).toUpperCase() + str.substring(1);
return str;
}
EDIT 2:
The JDK works fine; GWT RegExp gives this problem.
SOLVED:
Answer from Didier L
String linkPattern = "\\[[A-Za-z_0-9]+\\]";
String result = "";
String text = "[build]/directory/[something]/[build]/";
RegExp reg = RegExp.compile(linkPattern,"g");
MatchResult matchResult = null;
while((matchResult=reg.exec(text)) != null){
if(matchResult.getGroupCount()==1)
System.out.println( matchResult.getGroup(0));
}
I don't know which regex library you are using, but with the one from the JDK it would go along the lines of:
String linkPattern = "\\[[A-Za-z_0-9]+\\]";
String text = "[build]/directory/[something]/[build]/";
Pattern pat = Pattern.compile(linkPattern);
Matcher mat = pat.matcher(text);
while (mat.find()) {
System.out.println(mat.group());
}
Output:
[build]
[something]
[build]
Try:
String linkPattern = "(\\[[A-Za-z_0-9]+\\])*";
EDIT:
Second try:
String linkPattern = "\\[(\\w+)\\]+";
Third try, see http://rubular.com/r/eyAQ3Vg68N
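If only the words themselves are wanted, without the brackets, a capturing group such as the one in the second try can be read with group(1). This sketch uses java.util.regex from the JDK, not GWT's RegExp, and the names BracketWords and extract are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BracketWords {
    // Group 1 holds the word between '[' and ']'.
    private static final Pattern BRACKETED = Pattern.compile("\\[(\\w+)\\]");

    // Hypothetical helper: returns the bracketed words without their brackets.
    static List<String> extract(String path) {
        List<String> words = new ArrayList<>();
        Matcher matcher = BRACKETED.matcher(path);
        while (matcher.find()) {
            words.add(matcher.group(1));
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(extract("[build]/directory/[something]/[build]/"));
    }
}
```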
