How to get the middle strings with regex? - java

I have an input string that looks like this
DatalogSetupFile: BTS50xx1EJA\3.20\log_all.stp
The DatalogSetupFile: and \3.20\log_all.stp are constant. I wish to extract BTS50xx1EJA from the string. How should I do it?

You can use groups: put the constant parts in non-capturing groups and the dynamic part in a capturing group, so that the dynamic content comes back as a single group.
Define the regex as follows (the backslashes and dots in the constant suffix must be escaped, otherwise \3 is read as a backreference):
^(?:DatalogSetupFile:\s)(.*)(?:\\3\.20\\log_all\.stp)$
Here you can use the first group to get your dynamic string
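A minimal Java sketch of this group-based approach, assuming the constant suffix is written with escaped backslashes and dots so that it matches literally. The class and method names (DatalogName, extract) are illustrative and not part of the original answer.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DatalogName {
    // Constant prefix and suffix sit in non-capturing groups; group 1 is the dynamic part.
    // In the Java string literal every regex backslash is doubled.
    private static final Pattern LINE = Pattern.compile(
            "^(?:DatalogSetupFile:\\s)(.*)(?:\\\\3\\.20\\\\log_all\\.stp)$");

    // Returns the dynamic middle part, or empty if the line has a different shape.
    static Optional<String> extract(String line) {
        Matcher m = LINE.matcher(line);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String input = "DatalogSetupFile: BTS50xx1EJA\\3.20\\log_all.stp";
        System.out.println(extract(input).orElse("no match")); // BTS50xx1EJA
    }
}
```

matches() is used here because the pattern is anchored and is meant to cover the whole line.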

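In PCRE-style engines you could use:
\s\K[^\\]+
Java's regex engine does not support \K, so use the equivalent lookbehind instead:
(?<=\s)[^\\]+
Which, in Java, would look like:
String myInputString = "DatalogSetupFile: BTS50xx1EJA\\3.20\\log_all.stp";
// Match everything after the first whitespace, up to the first backslash
Pattern myPattern = Pattern.compile("(?<=\\s)[^\\\\]+");
Matcher myMatcher = myPattern.matcher(myInputString);
// find() must succeed before group() can be called
if (myMatcher.find()) {
    System.out.println(myMatcher.group(0));
}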

Related

Java/Scala regex expression

I am a little unfamiliar with regex. I have a string along the following lines
val str20 = "unit/virtual-ExtractMe/domain-testing-ExtracMe-IgnoreMe"
The word I want to extract is "ExtractMe" (the first one above, right before domain-). The start of the string always has the same format, but the content after the second slash varies, and I need to ignore whatever comes after the second slash. I want whatever is written between virtual- and the second /. In this case that is the first occurrence of ExtractMe. For example, if I have this
val str20 = "unit/virtual-YouGotMe/domain-testing-ExtracMe-IgnoreMe"
The regex should get me the word "YouGotMe" as it is between virtual- and the second forward slash
The regex /virtual-(.*?)/ captures the wanted text in group 1; you just need the first match. See: https://regex101.com/r/KX9VTt/2
The / needs no escaping in either Scala or Java regexes; it is only special in tools that use / as the regex delimiter.
In Scala, you can use findFirstMatchIn to extract the first matched group as follows:
val pattern = """virtual-(.*?)/""".r
val str20 = "unit/virtual-ExtractMe/domain-testing-ExtracMe-IgnoreMe"
pattern.
findFirstMatchIn(str20).
map(_.group(1)).
getOrElse("Error: No Match!!!")
res1: String = ExtractMe
val str20 = "unit/virtual-YouGotMe/domain-testing-ExtracMe-IgnoreMe"
pattern.
findFirstMatchIn(str20).
map(_.group(1)).
getOrElse("Error: No Match!!!")
res2: String = YouGotMe
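For readers working in Java rather than Scala, a rough equivalent might look like the sketch below. The class and method names are hypothetical; the pattern is the same one used above, and find() plays the role of findFirstMatchIn.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class VirtualSegment {
    // Lazy group: capture everything after "virtual-" up to the next slash.
    private static final Pattern VIRTUAL = Pattern.compile("virtual-(.*?)/");

    // Returns group 1 of the first match, or a fallback message if nothing matches.
    static String firstVirtual(String s) {
        Matcher m = VIRTUAL.matcher(s);
        return m.find() ? m.group(1) : "Error: No Match!!!";
    }

    public static void main(String[] args) {
        System.out.println(firstVirtual("unit/virtual-ExtractMe/domain-testing-ExtracMe-IgnoreMe")); // ExtractMe
        System.out.println(firstVirtual("unit/virtual-YouGotMe/domain-testing-ExtracMe-IgnoreMe"));  // YouGotMe
    }
}
```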

How can I manipulate part of a string in java using regular expressions

I have a string that looks like this:
String pathTokenString = "CMS/{brandPath}/Shows/{showPath}";
I want to remove the Shows part and everything that follows. I also want to replace "{brandPath}" with a token that I'm given.
This has been my approach. However, my string isn't getting updated at all:
//remove the '/Shows/{showPath}'
pathTokenString = pathTokenString.replace("/Shows$", "");
//replace passed in brandPath in the tokenString
String answer = pathTokenString.replace("{(.*?)}", brandPath);
Is there something wrong with my regular expressions?
String.replace treats its first argument as a literal string, not a regex, so use replaceAll when you want to pass a regex as the pattern to be replaced. Your regex patterns also need updating:
pathTokenString = pathTokenString.replaceAll("/Shows.*$", "");
// The curly braces need to be escaped because they denote quantifiers
String answer = pathTokenString.replaceAll("\\{(.*?)\\}", brandPath);
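One caveat that may be worth sketching: replaceAll gives $ and \ a special meaning in the replacement string, so if the token could contain those characters, wrapping it in Matcher.quoteReplacement keeps it literal. The class name, the resolve method, and the sample tokens below are assumptions for illustration only.

```java
import java.util.regex.Matcher;

final class PathTokens {
    static String resolve(String pathTokenString, String brandPath) {
        // Drop "/Shows" and everything that follows it.
        String trimmed = pathTokenString.replaceAll("/Shows.*$", "");
        // Replace the {...} placeholder; quoteReplacement keeps '$' and '\' in the token literal.
        return trimmed.replaceAll("\\{(.*?)\\}", Matcher.quoteReplacement(brandPath));
    }

    public static void main(String[] args) {
        String template = "CMS/{brandPath}/Shows/{showPath}";
        System.out.println(resolve(template, "acme")); // CMS/acme
        System.out.println(resolve(template, "a$b"));  // CMS/a$b
    }
}
```

Without quoteReplacement, a token such as a$b would be interpreted as a group reference and rejected.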

Regex string modifications

I have the following String and I want to filter the MBRB1045T4G out with a regular expression in Java. How would I achieve that?
String:
<p class="ref">
<b>Mfr Part#:</b>
MBRB1045T4G<br>
<b>Technologie:</b>
Tab Mount<br>
<b>Bauform:</b>
D2PAK-3<br>
<b>Verpackungsart:</b>
REEL<br>
<b>Standard Verpackungseinheit:</b>
800<br>
As Wrikken correctly says, HTML can't be parsed correctly by regex in the general case. However, it seems you're looking at an actual website and want to scrape some content. In that case, assuming the whitespace and formatting of the HTML don't change, you can use a regex like this:
Mfr Part#:</b>\s*([^<]+)<br>
And collect the first capture group like so (where string is your HTML):
// \s must be written as \\s inside a Java string literal
Pattern pt = Pattern.compile("Mfr Part#:</b>\\s*([^<]+)<br>");
Matcher m = pt.matcher(string);
// find() searches within the input; matches() would require the whole input to match
if (m.find())
    System.out.println(m.group(1));
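A self-contained sketch of this scrape, run against a shortened copy of the HTML from the question. It assumes optional whitespace between </b> and the value and uses find(), since matches() only succeeds when the pattern covers the entire input. The class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PartNumberScraper {
    // Optional whitespace after </b>, then capture everything up to the next tag.
    private static final Pattern PART = Pattern.compile("Mfr Part#:</b>\\s*([^<]+)<br>");

    static Optional<String> partNumber(String html) {
        Matcher m = PART.matcher(html);
        // find() looks for the pattern anywhere inside the input.
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) {
        String html = "<p class=\"ref\">\n<b>Mfr Part#:</b>\nMBRB1045T4G<br>\n"
                + "<b>Technologie:</b>\nTab Mount<br>\n";
        System.out.println(partNumber(html).orElse("not found")); // MBRB1045T4G
        // matches() is false here because the pattern does not span the whole input.
        System.out.println(PART.matcher(html).matches());         // false
    }
}
```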

How can I get all content between two pipes using regular expression

I have a String say
String s = "|India| vs Aus";
In this case result should be only India.
Second case :
String s = "Aus vs |India|";
In this case result should be only India.
3rd case:
String s = "|India| vs |Aus|";
The result should contain only India and Aus; vs should not be present in the output.
In these scenarios, any other word can appear in place of vs. For example, the string could be |India| in |Aus|, or |India| and |Sri Lanka| in |Aus|. I want the words that sit between two pipes, such as India, Sri Lanka, and Aus.
I want to do it in Java.
Any pointer will be helpful.
You would use a regex like...
\|[^|]+\|
...or...
\|.+?\|
You must escape the pipe because an unescaped pipe means alternation ("or") in a regex.
You are looking at something similar to this:
String s = "|India| vs |Aus|";
Pattern p = Pattern.compile("\\|(.*?)\\|");
Matcher m = p.matcher(s);
while(m.find()){
System.out.println(m.group(1));
}
You need group(1) to get the contents matched inside the parentheses of the regex.
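To cover all the cases from the question at once, the loop above can be wrapped in a small helper that collects the matches into a list. This is only a sketch; the class and method names are made up for illustration, and it uses the [^|]+ variant of the pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PipeFields {
    // Escaped pipes on both sides; group 1 is the text between them.
    private static final Pattern BETWEEN_PIPES = Pattern.compile("\\|([^|]+)\\|");

    static List<String> between(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = BETWEEN_PIPES.matcher(s);
        // Each find() consumes one |...| pair, so text between pairs is skipped.
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(between("|India| vs Aus"));                    // [India]
        System.out.println(between("Aus vs |India|"));                    // [India]
        System.out.println(between("|India| and |Sri Lanka| in |Aus|")); // [India, Sri Lanka, Aus]
    }
}
```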

Java regular expression for extracting the data between tags

I am trying to write a regular expression which extracts the data from a string like
<B Att="text">Test</B><C>Test1</C>
The extracted output needs to be Test and Test1. This is what I have done till now:
public class HelloWorld {
public static void main(String[] args)
{
String s = "<B>Test</B>";
String reg = "<.*?>(.*)<\\/.*?>";
Pattern p = Pattern.compile(reg);
Matcher m = p.matcher(s);
while(m.find())
{
String s1 = m.group();
System.out.println(s1);
}
}
}
But this is producing the result <B>Test</B>. Can anybody point out what I am doing wrong?
Three problems:
Your test string is incorrect.
You need a non-greedy modifier in the group.
You need to specify which group you want (group 1).
Try this:
String s = "<B Att=\"text\">Test</B><C>Test1</C>"; // <-- Fix 1
String reg = "<.*?>(.*?)</.*?>"; // <-- Fix 2
// ...
String s1 = m.group(1); // <-- Fix 3
You also don't need to escape a forward slash, so I removed that.
(Also, don't use regular expressions to parse HTML - use an HTML parser.)
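Put together, the three fixes might look like the following complete sketch. The class and method names are illustrative, and the regex-on-markup caveat still applies: this only works for flat, well-behaved input like the sample.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TagText {
    // Lazy quantifiers everywhere so each match stops at the nearest closing tag.
    private static final Pattern TAG_BODY = Pattern.compile("<.*?>(.*?)</.*?>");

    static List<String> texts(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = TAG_BODY.matcher(s);
        while (m.find()) {
            out.add(m.group(1)); // group 1 is the text between the tags
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "<B Att=\"text\">Test</B><C>Test1</C>";
        System.out.println(texts(s)); // [Test, Test1]
    }
}
```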
If you are using Eclipse, there is a nice plugin that lets you check your regular expression without writing a class to test it.
Here is the link:
http://regex-util.sourceforge.net/update/
You will need to show the view by choosing Window -> Show View -> Other, and then Regex Util.
I hope it helps you in your fight with regular expressions.
It looks like you're trying to use regex on XML and/or HTML. I'd suggest not using regex and instead using a parser or lexer to handle this kind of input.
I think the best way to get the values of XML nodes is simply to treat the input as XML.
If you really want to stick to regex try:
<B[^>]*>(.+?)</B\s*>
keeping in mind that this will always give you the value of the B tag only.
Or, if you want the value of any tag, use something like:
<.*?>(.*?)</.*?>
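For the parser route suggested above, a minimal sketch using the JDK's built-in DOM parser could look like this. It assumes the fragment is well-formed XML and wraps it in a synthetic root element, because an XML document must have exactly one root; the wrapper element and all names are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class XmlText {
    static List<String> childTexts(String fragment) {
        // Assumption: wrap the fragment so the parser sees a single root element.
        String xml = "<root>" + fragment + "</root>";
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            List<String> out = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                Node n = children.item(i);
                // Only element children carry tag values; skip stray text nodes.
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    out.add(n.getTextContent());
                }
            }
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(childTexts("<B Att=\"text\">Test</B><C>Test1</C>")); // [Test, Test1]
    }
}
```

Real-world HTML is often not well-formed XML, in which case a dedicated HTML parser is the safer choice.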
