ReplaceAll with java8 lambda functions - java

Given the following variables
templateText = "Hi ${name}";
variables.put("name", "Joe");
I would like to replace the placeholder ${name} with the value "Joe" using the following code (that does not work)
variables.keySet().forEach(k -> templateText.replaceAll("\\${\\{"+ k +"\\}" variables.get(k)));
However, if I do the "old-style" way, everything works perfectly:
for (Entry<String, String> entry : variables.entrySet()){
String regex = "\\$\\{" + entry.getKey() + "\\}";
templateText = templateText.replaceAll(regex, entry.getValue());
}
Surely I am missing something here :)

Java 8
The proper way to implement this has not changed in Java 8; it is based on appendReplacement()/appendTail():
Pattern variablePattern = Pattern.compile("\\$\\{(.+?)\\}");
Matcher matcher = variablePattern.matcher(templateText);
StringBuffer result = new StringBuffer();
while (matcher.find()) {
matcher.appendReplacement(result, variables.get(matcher.group(1)));
}
matcher.appendTail(result);
System.out.println(result);
Note that, as mentioned by drrob in the comments, the replacement String of appendReplacement() may contain group references using the $ sign, and escaping using \. If this is not desired, or if your replacement String can potentially contain those characters, you should escape them using Matcher.quoteReplacement().
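To illustrate the note above, here is a minimal sketch of the same loop with Matcher.quoteReplacement() applied. The class name QuotedTemplateDemo and the method name fill are made up for this example, and it assumes every placeholder has an entry in the map (a missing key would cause a NullPointerException).

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class QuotedTemplateDemo {
    private static final Pattern VARIABLE = Pattern.compile("\\$\\{(.+?)\\}");

    // Replaces every ${key} with its map value, treating the value as literal text.
    static String fill(String templateText, Map<String, String> variables) {
        Matcher matcher = VARIABLE.matcher(templateText);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            // Assumption: the key is present; variables.get() returns null otherwise.
            String value = variables.get(matcher.group(1));
            // quoteReplacement escapes '$' and '\' so they are not read as group references.
            matcher.appendReplacement(result, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        // The value contains both '$' and '\', which would otherwise be interpreted.
        System.out.println(fill("Total: ${price}", Map.of("price", "$5\\each")));
    }
}
```

Without the quoteReplacement() call, the "$5" in the value would be read as a reference to capture group 5, which this pattern does not have.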
Being more functional in Java 8
If you want a more Java-8-style version, you can extract the search-and-replace boilerplate code into a generalized method that takes a replacement Function:
private static StringBuffer replaceAll(String templateText, Pattern pattern,
Function<Matcher, String> replacer) {
Matcher matcher = pattern.matcher(templateText);
StringBuffer result = new StringBuffer();
while (matcher.find()) {
matcher.appendReplacement(result, replacer.apply(matcher));
}
matcher.appendTail(result);
return result;
}
and use it as
Pattern variablePattern = Pattern.compile("\\$\\{(.+?)\\}");
StringBuffer result = replaceAll(templateText, variablePattern,
m -> variables.get(m.group(1)));
Note that having a Pattern as parameter (instead of a String) allows it to be stored as a constant instead of recompiling it every time.
Same remark applies as above concerning $ and \ – you may want to enforce the quoteReplacement() inside the replaceAll() method if you don't want your replacer function to handle it.
Java 9 and above
Java 9 introduced Matcher.replaceAll(Function) which basically implements the same thing as the functional version above. See Jesse Glick's answer for more details.

You can also use Stream.reduce(identity, accumulator, combiner).
identity
The initial value passed to the accumulator; here, the template text.
accumulator
Folds the next stream element into the running result. In a sequential stream, each result becomes the input of the next step.
combiner
Merges two partial results in a parallel stream. In practice it is not called for a sequential stream, so here it simply throws.
BinaryOperator<String> combinerNeverBeCalledInSequentiallyStream=(identity,t) -> {
throw new IllegalStateException("Can't be used in parallel stream");
};
String result = variables.entrySet().stream()
.reduce(templateText
, (it, var) -> it.replaceAll(format("\\$\\{%s\\}", var.getKey())
, var.getValue())
, combinerNeverBeCalledInSequentiallyStream);

import java.util.HashMap;
import java.util.Map;
public class Repl {
public static void main(String[] args) {
Map<String, String> variables = new HashMap<>();
String templateText = "Hi, ${name} ${secondname}! My name is ${name} too :)";
variables.put("name", "Joe");
variables.put("secondname", "White");
templateText = variables.keySet().stream().reduce(templateText, (acc, e) -> acc.replaceAll("\\$\\{" + e + "\\}", variables.get(e)));
System.out.println(templateText);
}
}
output:
Hi, Joe White! My name is Joe too :)
However, it's not the best idea to reinvent the wheel; the preferred way to achieve what you want would be to use StrSubstitutor from Apache Commons Lang:
Map<String, String> valuesMap = new HashMap<String, String>();
valuesMap.put("animal", "quick brown fox");
valuesMap.put("target", "lazy dog");
String templateString = "The ${animal} jumped over the ${target}.";
StrSubstitutor sub = new StrSubstitutor(valuesMap);
String resolvedString = sub.replace(templateString);

Your code should be changed as shown below:
String templateText = "Hi ${name}";
Map<String,String> variables = new HashMap<>();
variables.put("name", "Joe");
templateText = variables.keySet().stream().reduce(templateText, (originalText, key) -> originalText.replaceAll("\\$\\{" + key + "\\}", variables.get(key)));
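One reason the forEach attempt in the question has no visible effect is that String is immutable: replaceAll returns a new String and leaves the receiver untouched, and a lambda cannot reassign a local variable such as templateText. The sketch below contrasts the two behaviors; the class and method names are invented for illustration, and Pattern.quote/Matcher.quoteReplacement are used here only to keep keys and values literal.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class contrasting a discarded and a kept replaceAll result.
class ImmutableStringDemo {
    // Returns the template unchanged: each replaceAll result is thrown away.
    static String discarded(String templateText, Map<String, String> variables) {
        variables.forEach((k, v) ->
                templateText.replaceAll(Pattern.quote("${" + k + "}"), Matcher.quoteReplacement(v)));
        return templateText;
    }

    // Carries the result of each step forward in a local variable.
    static String kept(String templateText, Map<String, String> variables) {
        String current = templateText;
        for (Map.Entry<String, String> e : variables.entrySet()) {
            current = current.replaceAll(Pattern.quote("${" + e.getKey() + "}"),
                    Matcher.quoteReplacement(e.getValue()));
        }
        return current;
    }
}
```

The reduce call does the same as kept(): each intermediate String is passed on as the accumulator value instead of being dropped.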

Performing replaceAll repeatedly, i.e. once for every replaceable variable, can become quite expensive, especially as the number of variables grows. This doesn't become more efficient when using the Stream API. The regex package contains the necessary building blocks to do this more efficiently:
public static String replaceAll(String template, Map<String,String> variables) {
String pattern = variables.keySet().stream()
.map(Pattern::quote)
.collect(Collectors.joining("|", "\\$\\{(", ")\\}"));
Matcher m = Pattern.compile(pattern).matcher(template);
if(!m.find()) {
return template;
}
StringBuffer sb = new StringBuffer();
do {
m.appendReplacement(sb, Matcher.quoteReplacement(variables.get(m.group(1))));
} while(m.find());
m.appendTail(sb);
return sb.toString();
}
If you are performing the operation with the same Map very often, you may consider keeping the result of Pattern.compile(pattern), as it is immutable and safely shareable.
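A minimal sketch of keeping the compiled pattern around could look like the following. The class name TemplateReplacer and its apply method are hypothetical; the sketch assumes a non-empty map without null values and copies the map so that later changes by the caller cannot get out of sync with the compiled pattern.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper that compiles the alternation pattern once per variable map.
final class TemplateReplacer {
    private final Map<String, String> variables;
    private final Pattern pattern;

    TemplateReplacer(Map<String, String> variables) {
        if (variables.isEmpty()) {
            // An empty alternation would match "${}", for which there is no value.
            throw new IllegalArgumentException("at least one variable is required");
        }
        this.variables = new HashMap<>(variables); // defensive copy
        this.pattern = Pattern.compile(this.variables.keySet().stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|", "\\$\\{(", ")\\}")));
    }

    // Reuses the precompiled pattern; only a new Matcher is created per call.
    String apply(String template) {
        Matcher m = pattern.matcher(template);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, Matcher.quoteReplacement(variables.get(m.group(1))));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```

Because the pattern only matches known keys, placeholders that are not in the map are left as they are.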
On the other hand, if you are using this operation with different maps frequently, it might be an option to use a generic pattern instead, combined with handling the possibility that the particular variable is not in the map. This adds the option to report occurrences of the ${…} pattern with an unknown variable:
private static Pattern VARIABLE = Pattern.compile("\\$\\{([^}]*)\\}");
public static String replaceAll(String template, Map<String,String> variables) {
Matcher m = VARIABLE.matcher(template);
if(!m.find())
return template;
StringBuffer sb = new StringBuffer();
do {
m.appendReplacement(sb,
Matcher.quoteReplacement(variables.getOrDefault(m.group(1), m.group(0))));
} while(m.find());
m.appendTail(sb);
return sb.toString();
}
m.group(0) is the actual match, so using this as a fall-back for the replacement string establishes the original behavior of not replacing ${…} occurrences when the key is not in the map. As said, alternative behaviors, like reporting the absent key or using a different fall-back text, are possible.
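As one possible variant of the "reporting the absent key" behavior, the sketch below throws an exception for an unknown variable instead of leaving the placeholder in place. The class name StrictTemplate, the method name replaceAllStrict, and the choice of IllegalArgumentException are assumptions made for this example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical strict variant: unknown variables are reported rather than kept.
final class StrictTemplate {
    private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^}]*)\\}");

    static String replaceAllStrict(String template, Map<String, String> variables) {
        Matcher m = VARIABLE.matcher(template);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String value = variables.get(m.group(1));
            if (value == null) {
                // m.group(0) is the whole ${...} occurrence, useful in the message.
                throw new IllegalArgumentException("Unknown variable: " + m.group(0));
            }
            m.appendReplacement(sb, Matcher.quoteReplacement(value));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```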

To update @didier-l's answer, in Java 9 this is a one-liner!
Pattern.compile("[$][{](.+?)[}]").matcher(templateText).replaceAll(m -> variables.get(m.group(1)))
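The one-liner passes the map value straight to replaceAll, so a missing key yields null and a value containing $ or \ is interpreted as a group reference or escape. A slightly more defensive sketch, assuming Java 9 or later, is shown below; the class name Java9Template and the fall-back of keeping the original placeholder are choices made for this example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around Matcher.replaceAll(Function), available since Java 9.
final class Java9Template {
    private static final Pattern VARIABLE = Pattern.compile("[$][{](.+?)[}]");

    static String fill(String templateText, Map<String, String> variables) {
        return VARIABLE.matcher(templateText).replaceAll(m ->
                // Unknown keys fall back to the matched text; values are quoted as literals.
                Matcher.quoteReplacement(variables.getOrDefault(m.group(1), m.group())));
    }
}
```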

Related

PatternReplaceCharFilterFactory arguments problem in Lucene (java)

I am doing a practice in Java using Lucene. I want to remove "{", "}" and ";" using a CharFilter in a CustomAnalyzer but I don't know how to call the "PatternReplaceCharFilterFactory". I have tried to call it passing it "map" but it doesn't work and it returns an exception. I have also tried with pattern "p" but it's the same.
public static ArrayList<String> analyzer_codigo(String texto)throws IOException{
Map<String, String> map = new HashMap<String, String>();
map.put("{", "");
map.put("}", "");
map.put(";", "");
Pattern p = Pattern.compile("([^a-z])");
boolean replaceAll = Boolean.TRUE;
Reader r = new Reader(texto);
Analyzer ana = CustomAnalyzer.builder(Paths.get("."))
.addCharFilter(PatternReplaceCharFilterFactory.class,p,"",r)
.withTokenizer(StandardTokenizerFactory.class)
.addTokenFilter(LowerCaseFilterFactory.class)
.build();
return muestraTexto(ana, texto);
}
You can pass a Map to the PatternReplaceCharFilterFactory - but the keys you use for the map are those defined in the JavaDoc for the factory class:
pattern="([^a-z])" replacement=""
This uses Solr documentation to define the keys (pattern and replacement) together with their Solr default values.
Using these keys, your map becomes:
Map<String, String> map = new HashMap<>();
map.put("pattern", "\\{|\\}|;");
map.put("replacement", "");
The regular expression \\{|\\}|; needs to escape the { and } characters because they have special meanings, and then the regex backslashes also need to be escaped in the Java string.
So, the above regular expression means { and } and ; will all be replaced by the empty string.
Your custom analyzer then becomes:
Analyzer analyzer = CustomAnalyzer.builder()
.withTokenizer(StandardTokenizerFactory.NAME)
.addCharFilter(PatternReplaceCharFilterFactory.NAME, map)
.addTokenFilter(LowerCaseFilterFactory.NAME)
.build();
If you use this to index the following input string:
foo{bar}baz;bat
Then the indexed value will be stored as:
foobarbazbat
Very minor point: I prefer to use PatternReplaceCharFilterFactory.NAME instead of PatternReplaceCharFilterFactory.class or even just "patternReplace" - but these all work.
Update
Just for completeness:
The CustomAnalyzer.Builder supports different ways to add a CharFilter. See its addCharFilter methods.
As well as the approach shown above, using a Map...
.addCharFilter(PatternReplaceCharFilterFactory.NAME, map)
...you can also use Java varargs:
"key1", "value1", "key2", "value2", ...
So, in our case, this would be:
.addCharFilter(PatternReplaceCharFilterFactory.NAME,
"pattern", "\\{|\\}|;", "replacement", "")

Java-Stream & Optional - Find a value that matches to a stream-element or provide a Default value

I have a Dictionary object which consists of several entries:
record Dictionary(String key, String value, String other) {};
I would like to replace words in a given String that are present as a "key" in one of the dictionary entries with the corresponding value. I can achieve it like this, but I guess there must be a better way to do it.
An example:
> Input: One <sup>a</sup> Two <sup>b</sup> Three <sup>D</sup> Four
> Output: One [a-value] Two [b-value] Three [D] Four
The code to be improved:
public class ReplaceStringWithDictionaryEntries {
public static void main(String[] args) {
List<Dictionary> dictionary = List.of(new Dictionary("a", "a-value", "a-other"),
new Dictionary("b", "b-value", "b-other"));
String theText = "One <sup>a</sup> Two <sup>b</sup> Three <sup>D</sup> Four";
Matcher matcher = Pattern.compile("<sup>([A-Za-z]+)</sup>").matcher(theText);
StringBuilder sb = new StringBuilder();
int matchLast = 0;
while (matcher.find()) {
sb.append(theText, matchLast, matcher.start());
Optional<Dictionary> dict = dictionary.stream().filter(f -> f.key().equals(matcher.group(1))).findFirst();
if (dict.isPresent()) {
sb.append("[").append(dict.get().value()).append("]");
} else {
sb.append("[").append(matcher.group(1)).append("]");
}
matchLast = matcher.end();
}
if (matchLast != 0) {
sb.append(theText.substring(matchLast));
}
System.out.println("Result: " + sb.toString());
}
}
Output:
Result: One [a-value] Two [b-value] Three [D] Four
Do you have a more elegant way to do this?
Since Java 9, Matcher#replaceAll can accept a callback function to return the replacement for each matched value.
String result = Pattern.compile("<sup>([A-Za-z]+)</sup>").matcher(theText)
.replaceAll(mr -> "[" + dictionary.stream().filter(f -> f.key().equals(mr.group(1)))
.findFirst().map(Dictionary::value)
.orElse(mr.group(1)) + "]");
Create a map from your list using key as key and value as value. Use the Matcher#appendReplacement method to replace matches using that map, calling Map.getOrDefault with the group(1) value as the default. Use String#join to put the replacements in square brackets:
public static void main(String[] args) {
List<Dictionary> dictionary = List.of(
new Dictionary("a", "a-value", "a-other"),
new Dictionary("b", "b-value", "b-other"));
Map<String,String> myMap = dictionary.stream()
.collect(Collectors.toMap(Dictionary::key, Dictionary::value));
String theText = "One <sup>a</sup> Two <sup>b</sup> Three <sup>D</sup> Four";
Matcher matcher = Pattern.compile("<sup>([A-Za-z]+)</sup>").matcher(theText);
StringBuilder sb = new StringBuilder();
while (matcher.find()) {
matcher.appendReplacement(sb,
String.join("", "[", myMap.getOrDefault(matcher.group(1), matcher.group(1)), "]"));
}
matcher.appendTail(sb);
System.out.println(sb.toString());
}
record Dictionary( String key, String value, String other) {};
Map vs List
As @Chaosfire has pointed out in the comments, a Map is a more suitable collection for the task than a List, because it eliminates the need to iterate over the collection to access a particular element:
Map<String, Dictionary> dictByKey = Map.of(
"a", new Dictionary("a", "a-value", "a-other"),
"b", new Dictionary("b", "b-value", "b-other")
);
I would also recommend wrapping the Map in a class to provide convenient access to the string values of the dictionary. Otherwise we are forced to check whether the dictionary returned from the map is non-null before obtaining the required value, which is inconvenient. A utility class can provide the target value in a single method call.
To avoid complicating the answer, I will not implement such a utility class; for simplicity I'll go with a Map<String,String>, which serves the same purpose by providing the value in a single call:
public static final Map<String, String> dictByKey = Map.of(
"a", "a-value",
"b", "b-value"
);
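For readers who do want the utility class mentioned above, here is one way it might look. Everything in it is hypothetical: the name DictionaryLookup, the method valueOrDefault, and the nested Dictionary class, which is a plain-class stand-in for the question's record so that the sketch also compiles on JDKs without record support.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper that hides the null check behind a single method call.
final class DictionaryLookup {
    // Stand-in for the question's record Dictionary(String key, String value, String other).
    static final class Dictionary {
        private final String key;
        private final String value;
        private final String other;

        Dictionary(String key, String value, String other) {
            this.key = key;
            this.value = value;
            this.other = other;
        }

        String key() { return key; }
        String value() { return value; }
        String other() { return other; }
    }

    private final Map<String, Dictionary> byKey = new HashMap<>();

    DictionaryLookup(List<Dictionary> entries) {
        for (Dictionary d : entries) {
            byKey.put(d.key(), d);
        }
    }

    // Returns the entry's value, or the given fallback when the key is unknown.
    String valueOrDefault(String key, String fallback) {
        Dictionary d = byKey.get(key);
        return d != null ? d.value() : fallback;
    }
}
```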
Pattern.splitAsStream()
We can replace the while-loop with a stream created via splitAsStream().
To separate the substrings enclosed in tags (<sup>text</sup>) from the plain text, we can make use of the special constructs called Lookbehind (?<=</sup>) and Lookahead (?=<sup>).
(?<=foo) - matches a position that is immediately preceded by foo, i.e. the position right after it.
(?=foo) - matches a position that is immediately followed by foo, i.e. the position right before it.
For more information, have a look at this tutorial
The pattern "(?=<sup>)|(?<=</sup>)" would match a position in the given string right before the opening tag and immediately after the closing tag. So when we apply this pattern splitting the string with splitAsStream(), it would produce a stream containing elements like "<sup>a</sup>" enclosed with tags, and plain string like "One", "Two", "Three".
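To make the effect of the split visible, this small sketch collects the pieces into a list instead of joining them. The class name SplitDemo and the method name tokens are invented for the example.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo showing what splitAsStream produces for the lookaround pattern.
final class SplitDemo {
    // Zero-width split points: right before "<sup>" and right after "</sup>".
    private static final Pattern BOUNDARY = Pattern.compile("(?=<sup>)|(?<=</sup>)");

    static List<String> tokens(String text) {
        return BOUNDARY.splitAsStream(text).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Prints [One , <sup>a</sup>,  Two]
        System.out.println(tokens("One <sup>a</sup> Two"));
    }
}
```

Since the split points are zero-width, no characters are lost: the surrounding spaces stay attached to the plain-text pieces, which is why joining the mapped pieces without a delimiter reproduces the original spacing.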
Note that in order to reuse the pattern without recompiling, it can be declared on a class level:
public static final Pattern pattern = Pattern.compile("(?=<sup>)|(?<=</sup>)");
The final solution results in a lean and simple stream:
public static void foo(String text) {
String result = pattern.splitAsStream(text)
.map(str -> getValue(str)) // or MyClass::getValue
.collect(Collectors.joining());
System.out.println(result);
}
Instead of tackling conditional logic inside a lambda, it's often better to extract it into a separate method (sure, you can use a ternary operator and place this logic right inside the map operation in the stream if you wish instead of having this method, but it'll be a bit messy):
public static String getValue(String str) {
if (str.matches("<sup>\\p{Alpha}+</sup>")) {
String key = str.replaceAll("<sup>|</sup>", "");
return "[" + dictByKey.getOrDefault(key, key) + "]";
}
return str;
}
main()
public static void main(String[] args) {
foo("One <sup>a</sup> Two <sup>b</sup> Three <sup>D</sup> Four");
}
Output:
One [a-value] Two [b-value] Three [D] Four

How to split string using regular expression and put string values into a map in Java

I have the following String and I want to read it using a regular expression and put it into a map as keys and values. I have already split it and put it into a map, but the problem is that I used string arrays, and there is a high risk of an array index out of bounds error, so I don't think this approach is good coding practice.
public static void read(String log,Map<String, String> logMap) {
String sanitizeLog = "";
String commaSeparatedLine[];
String equalSeparatedLine[];
String patternComma = ",";
String patternEqual = "=";
String patternSanitize = "(?<=]:).*";
Pattern pattern = Pattern.compile(patternSanitize);
Matcher matcher = pattern.matcher(log);
if (matcher.find()) {
sanitizeLog = matcher.group();
}
pattern = Pattern.compile(patternComma);
commaSeparatedLine = pattern.split(sanitizeLog);
for (String line : commaSeparatedLine) {
pattern = Pattern.compile(patternEqual);
equalSeparatedLine = pattern.split(line);
for (int i = 0; i < equalSeparatedLine.length; i += 2) {
logMap.put(equalSeparatedLine[i].trim(),
equalSeparatedLine[i + 1]);
}
}
}
The code snippet above works fine, but I used a lot of string arrays to store the split values. Is there any way to do the same thing without string arrays, putting the split values into a map using a regular expression? I am new to regular expressions.
Output Map should contain like this.
Key -> value
DB.UPDATE_CT -> 2
DB.DUPQ_CT -> 1
...
String value to be split
[2015-01-07 07:17:56,911]: R="InProgressOrders.jsp", REQUEST_UUID="77ed2ab1-b799-4715-acd5-e77ab756192e", HTTP_M="POST",
PFWD="login.jsp", USER_ORG="TradeCustomer.1717989", TX_ORG1="1717989",
DB.QUERY_CT=61, DB.UPDATE_CT=2, DB.DUPQ_CT=1, DB.SVR_MS=59,
DB.IO_MS=111, DB.DRV_MS=144, DB.LOCK_MS=31, DB.BYTES_W=1501, KV.PUT=1,
KV.GET=5, KV.PWAIT_MS=2, KV.GWAIT_MS=4, KV.BYTES_W=193,
KV.BYTES_R=367, MCACHE.GET=30, MCACHE.PUT=18, MCACHE.L1HIT=10,
MCACHE.L2HIT=1, MCACHE.HIT=1, MCACHE.MISS=18, MCACHE.WAIT_MS=51,
MCACHE.BYTES_W=24538, MCACHE.BYTES_R=24282, ROOTS.READ_CT=6,
ROOTS.DUPRSV_CT=3, THREAD.WALL_MS=594, THREAD.CPU_MS=306,
THREAD.CPU_USER_MS=300, THREAD.MEM_K=19318
You seem to have a lot of code. Here is how to do it in one statement, assuming input holds only the comma-separated key=value part (sanitizeLog in your code):
Map<String, String> map = Arrays.stream(input.split(","))
.map(s -> s.split("="))
.collect(Collectors.toMap(a -> a[0].trim(), a -> a[1]));
To instead add the entries to another map (as in your code):
Arrays.stream(input.split(",")).map(s -> s.split("="))
.forEach(a -> logMap.put(a[0].trim(), a[1]));
Disclaimer: Not tested or compiled, just thumbed in.
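Since the question asks for a way without intermediate arrays, another option is to let a regular expression find each key=value pair directly. The sketch below is an assumption-laden example, not the answer's method: the class name LogParser and the pattern are invented here, keys are taken to consist of word characters and dots, and values are either a double-quoted string (kept with its quotes) or a run of characters without commas or whitespace.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser that extracts key=value pairs with Matcher.find().
final class LogParser {
    // Group 1: key (word characters and dots). Group 2: quoted string or bare token.
    private static final Pattern PAIR =
            Pattern.compile("([\\w.]+)=(\"[^\"]*\"|[^,\\s]+)");

    static Map<String, String> parse(String log) {
        Map<String, String> result = new LinkedHashMap<>(); // keeps the order of appearance
        Matcher m = PAIR.matcher(log);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }
}
```

The timestamp prefix needs no separate sanitizing step here, because it contains no "=" and therefore never matches the pair pattern.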

Place all text in quotes into ArrayList

I'm looking for an easy way to take a string and have all values in quotes placed into an ArrayList
Eg
The "car" was "faster" than the "other"
I would like to have an ArrayList that contains
car, faster, other
I think I might need to use RegEx for this but I'm wondering if there is another simpler way.
Using a regex, it is actually quite easy. Note: this solution supposes that there cannot be nested quotes:
private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");
// ...
public List<String> getQuotedWords(final String input)
{
// Note: Java 7 type inference used; in Java 6, use new ArrayList<String>()
final List<String> ret = new ArrayList<>();
final Matcher m = QUOTED.matcher(input);
while (m.find())
ret.add(m.group(1));
return ret;
}
The regex is:
" # find a quote, followed by
([^"]+) # one or more characters not being a quote, captured, followed by
" # a quote
Of course, since this is in a Java string quotes need to be quoted... Hence the Java string for this regex: "\"([^\"]+)\"".
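On Java 9 or later, the same regex can also be consumed as a stream through Matcher.results(). This is a sketch of that variant rather than part of the answer above; the class name QuotedWords and the method name extract are made up for the example.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical stream-based variant using Matcher.results() (Java 9+).
final class QuotedWords {
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");

    static List<String> extract(String input) {
        return QUOTED.matcher(input).results()
                .map(r -> r.group(1)) // the text between the quotes
                .collect(Collectors.toList());
    }
}
```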
Use this script to parse the input:
public static void main(String[] args) {
String input = "The \"car\" was \"faster\" than the \"other\"";
List<String> output = new ArrayList<String>();
Pattern pattern = Pattern.compile("\"\\w+\"");
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
output.add(matcher.group().replaceAll("\"",""));
}
}
Output list contains:
[car,faster,other]
You can use the substringsBetween method of Apache Commons Lang's StringUtils:
String[] arr = StringUtils.substringsBetween(input, "\"", "\"");
List<String> list = new ArrayList<String>(Arrays.asList(arr));

Fastest way to put contents of Set<String> to a single String with words separated by a whitespace?

I have a few Set<String>s and want to transform each of these into a single String where each element of the original Set is separated by a whitespace " ".
A naive first approach is doing it like this
Set<String> set_1;
Set<String> set_2;
StringBuilder builder = new StringBuilder();
for (String str : set_1) {
builder.append(str).append(" ");
}
this.string_1 = builder.toString();
builder = new StringBuilder();
for (String str : set_2) {
builder.append(str).append(" ");
}
this.string_2 = builder.toString();
Can anyone think of a faster, prettier or more efficient way to do this?
With commons/lang you can do this using StringUtils.join:
String str_1 = StringUtils.join(set_1, " ");
You can't really beat that for brevity.
Update:
Re-reading this answer, I would prefer the other answer regarding Guava's Joiner now. In fact, these days I don't go near apache commons.
Another Update:
Java 8 introduced the method String.join()
String joined = String.join(",", set);
While this isn't as flexible as the Guava version, it's handy when you don't have the Guava library on your classpath.
If you are using Java 8, you can use the native
String.join(CharSequence delimiter, Iterable<? extends CharSequence> elements)
method:
Returns a new String composed of copies of the CharSequence elements joined together with a copy of the specified delimiter.
For example:
Set<String> strings = new LinkedHashSet<>();
strings.add("Java"); strings.add("is");
strings.add("very"); strings.add("cool");
String message = String.join("-", strings);
//message returned is: "Java-is-very-cool"
Set implements Iterable, so simply use:
String.join(" ", set_1);
As a counterpoint to Seanizer's commons-lang answer, if you're using Google's Guava Libraries (which I'd consider the 'successor' to commons-lang, in many ways), you'd use Joiner:
Joiner.on(" ").join(set_1);
with the advantage of a few helper methods to do things like:
Joiner.on(" ").skipNulls().join(set_1);
// If the 2nd item was null, would produce "1 3"
or
Joiner.on(" ").useForNull("<unknown>").join(set_1);
// If the 2nd item was null, would produce "1 <unknown> 3"
It also has support for appending direct to StringBuilders and Writers, and other such niceties.
Maybe a shorter solution:
public String test78 (Set<String> set) {
return set
.stream()
.collect(Collectors.joining(" "));
}
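or (note that this variant leaves a leading space, because the identity "" is joined with the first element)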
public String test77 (Set<String> set) {
return set
.stream()
.reduce("", (a,b)->(a + " " + b));
}
but the native method is definitely faster
public String test76 (Set<String> set) {
return String.join(" ", set);
}
I don't have the StringUtils library available (I have no choice over that), so using standard Java I came up with this.
If you're confident that your set data won't include any commas or square brackets, you could use:
mySet.toString().replaceAll("\\[|\\]", "").replaceAll(", ", " ");
A set of "a", "b", "c" converts via .toString() to the string "[a, b, c]".
Then replace the extra punctuation as necessary.
Filth.
I use this method:
public static String join(Set<String> set, String sep) {
String result = null;
if(set != null) {
StringBuilder sb = new StringBuilder();
Iterator<String> it = set.iterator();
if(it.hasNext()) {
sb.append(it.next());
}
while(it.hasNext()) {
sb.append(sep).append(it.next());
}
result = sb.toString();
}
return result;
}
I'm confused about the code replication, why not factor it into a function that takes one set and returns one string?
Other than that, I'm not sure that there is much that you can do, except maybe giving the stringbuilder a hint about the expected capacity (if you can calculate it based on set size and reasonable expectation of string length).
There are library functions for this as well, but I doubt they're significantly more efficient.
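The two suggestions above (one reusable function, plus a capacity hint) might be combined as in the following sketch. The class name SetJoiner, the method name joinWithSpaces, and the averageLength parameter (a caller-supplied guess at the typical element length) are all assumptions for illustration.

```java
import java.util.Set;

// Hypothetical helper: one function for any set, with a pre-sized StringBuilder.
final class SetJoiner {
    static String joinWithSpaces(Set<String> set, int averageLength) {
        // Rough capacity estimate: each element plus one separator character.
        int capacity = Math.max(16, set.size() * (averageLength + 1));
        StringBuilder builder = new StringBuilder(capacity);
        boolean first = true;
        for (String str : set) {
            if (!first) {
                builder.append(' '); // separator only between elements, none trailing
            }
            builder.append(str);
            first = false;
        }
        return builder.toString();
    }
}
```

Unlike the loop in the question, this version does not leave a trailing space. Whether the capacity hint makes a measurable difference depends on the data, so it is worth profiling before relying on it.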
This can be done by creating a stream out of the set and then combining the elements using a reduce operation, as shown below:
Optional<String> joinedString = set1.stream().reduce(new
BinaryOperator<String>() {
@Override
public String apply(String t, String u) {
return t + " " + u;
}
});
return joinedString.orElse("");
