How to replace tokens in java using regex? - java

I have a string template containing $variables that need to be replaced.
String template: "hi my name is $name.\nI am $age old. I am $sex"
The solution I tried does not work in my Java program.
http://regexr.com/3dtq1
I also looked at https://www.regex101.com/, but I could not check there whether the pattern works for Java. In one of the tutorials I read that "$ matches end of line". What is the best way to replace the tokens in the template with the variables?
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class PatternCompiler {
static String text = "hi my name is $name.\nI am $age old. I am $sex";
static Map<String,String> replacements = new HashMap<String,String>();
static Pattern pattern = Pattern.compile("\\$\\w+");
static Matcher matcher = pattern.matcher(text);
public static void main(String[] args) {
replacements.put("name", "kumar");
replacements.put("age", "26");
replacements.put("sex", "male");
StringBuffer buffer = new StringBuffer();
while (matcher.find()) {
String replacement = replacements.get(matcher.group(1));
if (replacement != null) {
// matcher.appendReplacement(buffer, replacement);
// see comment
matcher.appendReplacement(buffer, "");
buffer.append(replacement);
}
}
matcher.appendTail(buffer);
System.out.println(buffer.toString());
}
}

You are using matcher.group(1) but you didn't define any group in the regexp (( )), so you can use only group() for the whole matched string, which is what you want.
Replace line:
String replacement = replacements.get(matcher.group(1));
With:
String replacement = replacements.get(matcher.group().substring(1));
Notice the substring: your map contains only the bare words, but the matcher also matches the $, so you need to look up "$age".substring(1) in the map while still replacing the whole $age.

You can try replacing the pattern string with
\\$(\\w+)
and the variable replacement works. Your current pattern only has group 0 (the entire match) but no group 1. Adding the parentheses makes the variable name group 1, and the replacement still replaces both the dollar sign and the variable name.
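A minimal sketch of the capturing-group approach, written as a standalone helper. The class name TemplateFiller and the method fill are made up for illustration, and leaving unknown variables untouched is an assumption about the desired behaviour, not something the question specifies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question's code.
class TemplateFiller {
    // Group 1 captures the variable name without the leading '$'.
    private static final Pattern TOKEN = Pattern.compile("\\$(\\w+)");

    static String fill(String template, Map<String, String> values) {
        Matcher matcher = TOKEN.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = values.get(matcher.group(1));
            // Assumption: variables missing from the map are kept as they are.
            String replacement = (value != null) ? value : matcher.group();
            // quoteReplacement stops '$' and '\' in the value from being
            // interpreted as group references by appendReplacement.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("name", "kumar");
        values.put("age", "26");
        values.put("sex", "male");
        System.out.println(fill("hi my name is $name.\nI am $age old. I am $sex", values));
    }
}
```

Using Matcher.quoteReplacement lets the replacement go straight through appendReplacement, so the workaround of appending an empty string and then writing to the buffer by hand is not needed.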

Your code has just minor glitches.
static Map<String,String> replacements = new HashMap<>();
static Pattern pattern = Pattern.compile("\\$\\w+\\b"); // \b not really needed
// As there are no parentheses (...) there is no group(1); group() is the whole match, including the $
String replacement = replacements.get(matcher.group().substring(1));

You're not using the right thing as your key. Change to group(), and change the map keys to "$name" etc.:
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class HelloWorld {
static String text = "hi my name is $name.\nI am $age old. I am $sex";
static Map<String,String> replacements = new HashMap<String,String>();
static Pattern pattern = Pattern.compile("\\$\\w+");
static Matcher matcher = pattern.matcher(text);
public static void main(String[] args) {
replacements.put("$name", "kumar");
replacements.put("$age", "26");
replacements.put("$sex", "male");
StringBuffer buffer = new StringBuffer();
while (matcher.find()) {
String replacement = replacements.get(matcher.group());
System.out.println(replacement);
if (replacement != null) {
// matcher.appendReplacement(buffer, replacement);
// see comment
matcher.appendReplacement(buffer, "");
buffer.append(replacement);
}
}
matcher.appendTail(buffer);
System.out.println(buffer.toString());
}
}

Related

Remove double quotes from output Java

I am trying to extract a url from the string. But I am unable to skip the double quotes in the output.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
class Main {
public static void main(String[] args) {
String s1 = "<a id=\"BUTTON_LINK\" style=\"%%BUTTON_LINK%%\" target=\"_blank\" href=\"https://||domainName||/basketReviewPageLoadAction.do\">%%CHECKOUT%%</a>";
//System.out.println(s1);
Pattern pattern = Pattern.compile("\\s*(?i)href\\s*=\\s*(\"([^\"]*\")|'[^']*'|([^'\">\\s]+))");
Matcher matcher = pattern.matcher(s1);
if(matcher.find()){
String url = matcher.group(1);
System.out.println(url);
}
}
}
My Output is:
"https://||domainName||/basketReviewPageLoadAction.do"
Expected Output is:
https://||domainName||/basketReviewPageLoadAction.do
I cannot simply strip the quotes with a string replace. I have to add a few GET parameters to this URL and then put it back into the original string.
Regex: (?<=href=")([^\"]*) Substitution: $1?params...
Details:
(?<=) Positive Lookbehind
() Capturing group
[^] Match a single character not present in the list
* Matches between zero and unlimited times
$1 Group 1.
Java code:
Using replaceAll, you can append your parameters (?abc=12) after capturing group $1, which here is the href value.
String text = "<a id=\"BUTTON_LINK\" style=\"%%BUTTON_LINK%%\" target=\"_blank\" href=\"https://||domainName||/basketReviewPageLoadAction.do\">%%CHECKOUT%%</a>";
text = text.replaceAll("(?<=href=\")([^\"]*)", String.format("$1%s", "?abc=12"));
System.out.print(text);
Output:
<a id="BUTTON_LINK" style="%%BUTTON_LINK%%" target="_blank" href="https://||domainName||/basketReviewPageLoadAction.do?abc=12">%%CHECKOUT%%</a>
You can try one of these options:
System.out.println(url.replaceAll("^\"|\"$", ""));
System.out.println(url.substring(1, url.length()-1));
It's ugly, but it seems to work. Hope this helps.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;
class Main {
public static void main(String[] args) {
String s1 = "<a id=\"BUTTON_LINK\" style=\"%%BUTTON_LINK%%\" target=\"_blank\" href= \"https://||domainName||/basketReviewPageLoadAction.do\">%%CHECKOUT%%</a>";
//System.out.println(s1);
Pattern pattern = Pattern.compile("\\s*(?i)href\\s*=\\s*(\"([^\"]*)\"|'([^']*)'|([^'\">\\s]+))");
Matcher matcher = pattern.matcher(s1);
if (matcher.find()) {
String url = Stream.of(matcher.group(2), matcher.group(3),
matcher.group(4)).filter(s -> s != null).collect(Collectors.joining());
System.out.print(url);
}
}
}
This solution worked for now.
Pattern pattern = Pattern.compile("\\s*(?i)href\\s*=\\s*\"([^\"]*)");
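As a rough sketch of how the quote-free capture can be combined with putting the modified URL back into the original string: the helper below uses the match offsets of the URL group. The names HrefParams and appendToHref are hypothetical, and it assumes a double-quoted href and only handles the first one it finds.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class HrefParams {
    // Group 1: "href", optional whitespace, "=" and the opening quote.
    // Group 2: the URL itself, without the surrounding quotes.
    private static final Pattern HREF =
            Pattern.compile("(?i)(href\\s*=\\s*\")([^\"]*)");

    static String appendToHref(String html, String suffix) {
        Matcher matcher = HREF.matcher(html);
        if (!matcher.find()) {
            return html; // no double-quoted href attribute found
        }
        String url = matcher.group(2);
        // Rebuild the string around the URL using the offsets of group 2.
        return html.substring(0, matcher.start(2))
                + url + suffix
                + html.substring(matcher.end(2));
    }

    public static void main(String[] args) {
        String html = "<a target=\"_blank\" href=\"https://||domainName||/basketReviewPageLoadAction.do\">%%CHECKOUT%%</a>";
        System.out.println(appendToHref(html, "?abc=12"));
    }
}
```

Working with start(2) and end(2) avoids running a second search for the URL text, which matters if the same URL also appears elsewhere in the markup.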
You can try this:
s1 = s1.replace("\"", "");

How to preserve delimiters while using String.split() in Java?

String TextValue = "hello{MyVar} Discover {MyVar2} {MyVar3}";
String[] splitString = TextValue.split("\\{*\\}");
The output I'm getting in splitString is [{MyVar, {MyVar2, {MyVar3].
But my requirement is to preserve the delimiters {}, i.e. [{MyVar}, {MyVar2}, {MyVar3}].
I need a way to produce the output above.
Use something like so:
Pattern p = Pattern.compile("(\\{\\w+\\})");
String str = ...
Matcher m = p.matcher(str);
while(m.find())
System.out.println(m.group(1));
Note that the code above is untested, but it looks for words within curly brackets and places them in a group. It then goes over the string and prints every substring that matches the expression.
Thanks kelvin & npinti.
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class CreateMatcherExample {
public static void main(String[] args) {
String TextValue = "hello{MyVar} Discover {My_Var2} {My_Var3}";
String patternString = "\\{\\w+\\}";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(TextValue);
while(matcher.find()) {
System.out.println(matcher.group());
}
}
}
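If the goal is an array or list like the one split() would return, the matches can be collected instead of printed. This is only a sketch; the class name BraceTokens and the method extract are invented here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class BraceTokens {
    // Matches a word wrapped in curly braces, braces included.
    private static final Pattern TOKEN = Pattern.compile("\\{\\w+\\}");

    static List<String> extract(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group()); // whole match, e.g. "{MyVar}"
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(extract("hello{MyVar} Discover {MyVar2} {MyVar3}"));
    }
}
```

Calling extract(...).toArray(new String[0]) gives a String[] if an array is required.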

java tokenizer for strings

I have a text file and want to tokenize its lines -- but only the sentences with the # character.
For example, given...
Buah... Molt bon concert!! #Postconcert #gintonic
...I want to print only #Postconcert #gintonic.
I have already tried this code with some changes...
public class MyTokenizer {
/**
* @param args
*/
public static void main(String[] args) {
tokenize("Europe3.txt","allo.txt");
}
public static void tokenize(String sFile,String sFileOut) {
String sLine="", sToken="";
MyBufferedReaderWriter f = new MyBufferedReaderWriter();
f.openRFile(sFile);
MyBufferedReaderWriter fOut = new MyBufferedReaderWriter();
fOut.openWFile(sFileOut);
while ((sLine=f.readLine()) != null) {
//StringTokenizer st = new StringTokenizer(sLine, "#");
String[] tokens = sLine.split("\\#");
for (String token : tokens)
{
fOut.writeLine(token);
//System.out.println(token);
}
/*while (st.hasMoreTokens()) {
sToken = st.nextToken();
System.out.println(sToken);
}*/
}
f.closeRFile();
}
}
Can anyone help?
You can try something like this with a regex:
package com.stackoverflow.answers;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class HashExtractor {
public static void main(String[] args) {
String strInput = "Buah... Molt bon concert!! #Postconcert #gintonic";
String strPattern = "(?:\\s|\\A)[##]+([A-Za-z0-9-_]+)";
Pattern pattern = Pattern.compile(strPattern);
Matcher matcher = pattern.matcher(strInput);
while (matcher.find()) {
System.out.println(matcher.group());
}
}
}
As per the given example, when using the split() function the values would be stored something like this:
tokens[0]=Buah... Molt bon concert!!
tokens[1]=Postconcert
tokens[2]=gintonic
So you just need to skip the first value and prepend '#' (if you need it in your output) to the other string values.
Hope this helps.
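The split-based idea could look roughly like this. SplitHashTags and fromLine are hypothetical names, and trimming the pieces is an added assumption so that the trailing space after "Postconcert" does not end up in the result.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
class SplitHashTags {
    static List<String> fromLine(String line) {
        String[] tokens = line.split("#");
        List<String> tags = new ArrayList<>();
        // tokens[0] is the text before the first '#', so start at index 1.
        for (int i = 1; i < tokens.length; i++) {
            String tag = tokens[i].trim();
            if (!tag.isEmpty()) {
                tags.add("#" + tag); // put back the '#' that split() consumed
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        System.out.println(fromLine("Buah... Molt bon concert!! #Postconcert #gintonic"));
    }
}
```

One limitation of this approach: any ordinary text between two tags stays attached to the preceding tag, so it only fits lines where the hashtags come last.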
You have not specifically asked for this, but I assume you are trying to extract all the #hashtags from your text file.
To do this, regex is your friend:
String text = "Buah... Molt bon concert!! #Postconcert #gintonic";
System.out.println(getHashTags(text));
public Collection<String> getHashTags(String text) {
Pattern pattern = Pattern.compile("(#\\w+)");
Matcher matcher = pattern.matcher(text);
Set<String> htags = new HashSet<>();
while (matcher.find()) {
htags.add(matcher.group(1));
}
return htags;
}
Compile a pattern like #\w+: everything that starts with a # followed by one or more (+) word characters (\w).
In a Java string literal the \ has to be escaped as \\.
Finally, put the expression in a group by surrounding it with parentheses, (#\w+), to get access to the matched text.
For every match, add the first matched group to the set htags; in the end we get a set with all the hashtags in it.
[#gintonic, #Postconcert]

Find the String Starts with $ symbol using Regular Expression

String str = "hai ${name} .... Welcome to ${sitename}....";
From this str I need to replace ${name} with "jack" and ${sitename} with "google".
Is it possible to do this with a regular expression?
Or is there a faster way to replace the placeholders?
EDITED:
The name and sitename placeholders in str are dynamic.
1. So first I have to find the keys.
E.g. here name and sitename are the keys.
2. Then I have a HashMap which has key-value pairs.
Based on the key I have to replace the placeholder in str.
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Replacer
{
public static String replacePlaceHolders(String text)
{
Map<String, String> fields = new HashMap<>();
fields.put("name", "jack");
fields.put("sitename", "google");
Pattern p = Pattern.compile("\\$\\{(.*?)\\}");
Matcher matcher = p.matcher(text);
StringBuffer result = new StringBuffer();
while (matcher.find()) {
String key = matcher.group(1);
if (!fields.containsKey(key)) {
continue;
}
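// quoteReplacement keeps '$' and '\' in the value from being treated as group references
matcher.appendReplacement(result, Matcher.quoteReplacement(fields.get(key)));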
}
matcher.appendTail(result);
return result.toString();
}
public static void main(String[] args)
{
System.out.println(
replacePlaceHolders("hai ${name} .... Welcome to ${sitename}...."));
}
}
NO Regex is needed!
You could iterate the keySet of your map, and do:
str=str.replace("${"+key+"}", map.get(key));
about the method:
http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#replace(java.lang.CharSequence, java.lang.CharSequence)
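Spelled out, the loop over the map might look like the sketch below. PlainReplacer and fill are invented names, and iterating over entrySet() instead of keySet() is just a small convenience.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
class PlainReplacer {
    static String fill(String text, Map<String, String> values) {
        String result = text;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            // String.replace works on literal text, so no regex escaping is needed.
            result = result.replace("${" + entry.getKey() + "}", entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("name", "jack");
        values.put("sitename", "google");
        System.out.println(fill("hai ${name} .... Welcome to ${sitename}....", values));
    }
}
```

Two things to keep in mind: the string is scanned once per map entry, and a replacement value that itself contains something like ${sitename} may get replaced again on a later iteration.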
I think you'd be better off using String.format
String str = String.format("hai %s .... Welcome to %s....", name, site);
Maybe this will help you:
\$\{name\}
This changes ${name} to test:
${name} trete ${sitename} -> test trete ${sitename}
You have to escape the backslashes to use this expression in Java:
\\$\\{name\\}
If you can replace ${name} and ${sitename} with {0} and {1} you could use MessageFormat.
String str = "hai {0} .... Welcome to {1}....";
String output = MessageFormat.format(str, new String[]{"jack","google"});

Regex to remove the uri prefix (within tag) only from xml tag

I need a regular expression to remove the uri prefix(within tag) only from xml tag.
Example
input:
<ns1:fso xlmns:="http://xyz"><sender>abc</sender></ns1:fso>
output:
<fso xlmns:="http://xyz"><sender>abc</sender></fso>
Here is my code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public final class RegularExpressionTest {
private static String REGEX1 = "<\\/?([a-z0-9]+?:).*?>";
private static String INPUT = "<ns1:fso xmlns:ns1='https://www.example.com/fsoCanonical'>
<ns2:senderId xmlns='http://www.example.com/fsoCanonical'>abc</ns2:senderId>
<receiverId xmlns='http://www.example.com/fsoCanonical'>testdata</receiverId>
<messageId xmlns='http://www.example.com/fsoCanonical'>4CF4DC05126A0077E10080000A66C871</messageId>
</ns1:fso> ";
private static String REPLACE = "";
public static void main(String[] args) {
Pattern p = Pattern.compile(REGEX1);
Matcher m = p.matcher(INPUT); // get a matcher object
StringBuffer sb = new StringBuffer();
while (m.find()) {
m.appendReplacement(sb, REPLACE);
}
m.appendTail(sb);
System.out.println(sb.toString());
}
I am not able to paste the input XML here, so the value of
private static String INPUT =
shown in the code above is not the real one. You can use any SOAP message as an example instead.
I am more used to Perl's regex engine, but if Java's works the same way, this could be it:
private static String REGEX1 = "(<\\/?)[a-z0-9]+:";
and
private static String REPLACE = "$1";
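For what it's worth, the same pattern and replacement can be tried with a single replaceAll call. This is a small sketch with invented names (PrefixStripper, strip); the character class is widened to a-zA-Z0-9 as an assumption so that upper-case prefixes are covered too.

```java
// Hypothetical helper for illustration only.
class PrefixStripper {
    static String strip(String xml) {
        // Group 1 keeps "<" or "</"; the prefix and its colon are dropped.
        return xml.replaceAll("(</?)[a-zA-Z0-9]+:", "$1");
    }

    public static void main(String[] args) {
        String input = "<ns1:fso xmlns:ns1='http://xyz'><sender>abc</sender></ns1:fso>";
        System.out.println(strip(input));
    }
}
```

Because the pattern requires a "<" or "</" directly before the prefix, attributes such as xmlns:ns1 and the colon inside the URL are left alone. The namespace declarations remain in the output, so a namespace-aware parser may interpret the result differently from the input.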
You can match ns1 with following regex:
<\/?([a-z0-9]+?:).*?>
I would improve this code, since the soapenv prefix on Envelope, Body and Header should not be removed:
(</?)[a-zA-Z0-9]+:(?!Header|Body|Envelope)
I would include A-Z as well.
