Place all text in quotes into ArrayList - java

I'm looking for an easy way to take a string and have all values in quotes placed into an ArrayList.
E.g.
The "car" was "faster" than the "other"
I would like to have an ArrayList that contains
car, faster, other
I think I might need to use RegEx for this, but I'm wondering if there is another, simpler way.

Using a regex, it is actually quite easy. Note: this solution supposes that there cannot be nested quotes:
private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");
// ...
public List<String> getQuotedWords(final String input)
{
// Note: Java 7 type inference used; in Java 6, use new ArrayList<String>()
final List<String> ret = new ArrayList<>();
final Matcher m = QUOTED.matcher(input);
while (m.find())
ret.add(m.group(1));
return ret;
}
The regex is:
" # find a quote, followed by
([^"]+) # one or more characters not being a quote, captured, followed by
" # a quote
Of course, since this is in a Java string quotes need to be quoted... Hence the Java string for this regex: "\"([^\"]+)\"".
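On Java 9 or later, the same pattern can be combined with Matcher.results() so that the explicit loop disappears. The following is only a sketch under that assumption; the class name QuotedWords and the method name extract are illustrative and not part of the answer above.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class QuotedWords {
    // Same regex as above: a quote, one or more non-quote characters (captured), a quote
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");

    // Hypothetical helper: returns the captured text of every quoted run, in order
    static List<String> extract(String input) {
        return QUOTED.matcher(input)
                .results()                 // Stream<MatchResult>, available since Java 9
                .map(mr -> mr.group(1))    // group 1 is the text between the quotes
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(extract("The \"car\" was \"faster\" than the \"other\""));
    }
}
```

As with the loop version, nested or escaped quotes are not handled.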

Use this script to parse the input:
public static void main(String[] args) {
String input = "The \"car\" was \"faster\" than the \"other\"";
List<String> output = new ArrayList<String>();
Pattern pattern = Pattern.compile("\"\\w+\"");
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
output.add(matcher.group().replaceAll("\"",""));
}
}
Output list contains:
[car, faster, other]

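You can use the substringsBetween method of StringUtils from Apache Commons Lang:
String[] arr = StringUtils.substringsBetween(input, "\"", "\"");
// substringsBetween returns null when nothing matches, so guard before wrapping
List<String> list = (arr == null) ? new ArrayList<String>() : new ArrayList<String>(Arrays.asList(arr));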

Related

Java-Stream & Optional - Find a value that matches to a stream-element or provide a Default value

I have a Dictionary object which consists of several entries:
record Dictionary(String key, String value, String other) {};
I would like to replace the words in a given String that are present as a "key" in one of the dictionary entries with the corresponding value. I can achieve it like this, but I guess there must be a better way to do it.
An example:
> Input: One <sup>a</sup> Two <sup>b</sup> Three <sup>D</sup> Four
> Output: One [a-value] Two [b-value] Three [D] Four
The code to be improved:
public class ReplaceStringWithDictionaryEntries {
public static void main(String[] args) {
List<Dictionary> dictionary = List.of(new Dictionary("a", "a-value", "a-other"),
new Dictionary("b", "b-value", "b-other"));
String theText = "One <sup>a</sup> Two <sup>b</sup> Three <sup>D</sup> Four";
Matcher matcher = Pattern.compile("<sup>([A-Za-z]+)</sup>").matcher(theText);
StringBuilder sb = new StringBuilder();
int matchLast = 0;
while (matcher.find()) {
sb.append(theText, matchLast, matcher.start());
Optional<Dictionary> dict = dictionary.stream().filter(f -> f.key().equals(matcher.group(1))).findFirst();
if (dict.isPresent()) {
sb.append("[").append(dict.get().value()).append("]");
} else {
sb.append("[").append(matcher.group(1)).append("]");
}
matchLast = matcher.end();
}
if (matchLast != 0) {
sb.append(theText.substring(matchLast));
}
System.out.println("Result: " + sb.toString());
}
}
Output:
Result: One [a-value] Two [b-value] Three [D] Four
Do you have a more elegant way to do this?
Since Java 9, Matcher#replaceAll can accept a callback function to return the replacement for each matched value.
String result = Pattern.compile("<sup>([A-Za-z]+)</sup>").matcher(theText)
.replaceAll(mr -> "[" + dictionary.stream().filter(f -> f.key().equals(mr.group(1)))
.findFirst().map(Dictionary::value)
.orElse(mr.group(1)) + "]");
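For reference, here is a self-contained sketch of the same callback approach with the pattern kept in a constant and the lookup done against a Map. It assumes Java 9+; the class name SupReplacer and the plain Map<String, String> are illustrative stand-ins for the Dictionary list used in the question. The result is passed through Matcher.quoteReplacement because the string returned by the callback is interpreted as a replacement string.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SupReplacer {
    // Compiled once; group 1 is the text between the tags
    private static final Pattern SUP = Pattern.compile("<sup>([A-Za-z]+)</sup>");

    // Hypothetical helper: replaces each tagged key with "[value]", or "[key]" when unknown
    static String replace(String text, Map<String, String> dictionary) {
        return SUP.matcher(text).replaceAll(mr -> {
            String key = mr.group(1);
            String value = dictionary.getOrDefault(key, key);
            // Quote so that '$' or '\' in a value is inserted literally
            return Matcher.quoteReplacement("[" + value + "]");
        });
    }

    public static void main(String[] args) {
        Map<String, String> dictionary = Map.of("a", "a-value", "b", "b-value");
        System.out.println(replace(
                "One <sup>a</sup> Two <sup>b</sup> Three <sup>D</sup> Four", dictionary));
    }
}
```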
Create a map from your list using key as the key and value as the value. Then use the Matcher#appendReplacement method to replace the matches, calling Map.getOrDefault on that map with the group(1) value as the default. Use String#join to put the replacements in square brackets:
public static void main(String[] args) {
List<Dictionary> dictionary = List.of(
new Dictionary("a", "a-value", "a-other"),
new Dictionary("b", "b-value", "b-other"));
Map<String,String> myMap = dictionary.stream()
.collect(Collectors.toMap(Dictionary::key, Dictionary::value));
String theText = "One <sup>a</sup> Two <sup>b</sup> Three <sup>D</sup> Four";
Matcher matcher = Pattern.compile("<sup>([A-Za-z]+)</sup>").matcher(theText);
StringBuilder sb = new StringBuilder();
while (matcher.find()) {
matcher.appendReplacement(sb,
String.join("", "[", myMap.getOrDefault(matcher.group(1), matcher.group(1)), "]"));
}
matcher.appendTail(sb);
System.out.println(sb.toString());
}
record Dictionary( String key, String value, String other) {};
Map vs List
As @Chaosfire has pointed out in the comments, a Map is a more suitable collection for the task than a List, because it eliminates the need to iterate over the collection to access a particular element:
Map<String, Dictionary> dictByKey = Map.of(
"a", new Dictionary("a", "a-value", "a-other"),
"b", new Dictionary("b", "b-value", "b-other")
);
And I would also recommend wrapping the Map with a class in order to provide convenient access to the string-values of the dictionary; otherwise we are forced to check whether the dictionary returned from the map is not null and only then make a call to obtain the required value, which is inconvenient. The utility class can facilitate getting the target value in a single method call.
To avoid complicating the answer, I will not implement such a utility class; for simplicity I'll go with a Map<String,String> (which basically acts the way the utility class is intended to act, providing the value within a single call).
public static final Map<String, String> dictByKey = Map.of(
"a", "a-value",
"b", "b-value"
);
Pattern.splitAsStream()
We can replace the while-loop with a stream created via splitAsStream().
In order to split around the values enclosed in tags, <sup>text</sup>, without consuming them, we can make use of the special constructs called lookbehind, (?<=</sup>), and lookahead, (?=<sup>).
(?<=foo) - lookbehind: matches a position that is immediately preceded by foo (i.e. the position right after foo).
(?=foo) - lookahead: matches a position that is immediately followed by foo (i.e. the position right before foo).
For more information, have a look at this tutorial
The pattern "(?=<sup>)|(?<=</sup>)" would match a position in the given string right before the opening tag and immediately after the closing tag. So when we apply this pattern splitting the string with splitAsStream(), it would produce a stream containing elements like "<sup>a</sup>" enclosed with tags, and plain string like "One", "Two", "Three".
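To make the effect of the zero-width split concrete, here is a small sketch that only collects the pieces produced by the pattern. The class name SplitDemo and the method name tokens are made up for illustration. Note that the whitespace around the tags stays attached to the plain-text pieces.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class SplitDemo {
    // Splits right before "<sup>" (lookahead) and right after "</sup>" (lookbehind)
    private static final Pattern BOUNDARY = Pattern.compile("(?=<sup>)|(?<=</sup>)");

    // Hypothetical helper: returns the pieces without modifying them
    static List<String> tokens(String text) {
        return BOUNDARY.splitAsStream(text).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Quotes are printed around each piece to make the kept whitespace visible
        tokens("One <sup>a</sup> Two").forEach(t -> System.out.println("'" + t + "'"));
    }
}
```

Because both constructs are zero-width, nothing is removed from the input; joining the pieces back together reproduces the original string.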
Note that in order to reuse the pattern without recompiling, it can be declared on a class level:
public static final Pattern pattern = Pattern.compile("(?=<sup>)|(?<=</sup>)");
The final solution results in a lean and simple stream:
public static void foo(String text) {
String result = pattern.splitAsStream(text)
.map(str -> getValue(str)) // or MyClass::getValue
.collect(Collectors.joining());
System.out.println(result);
}
Instead of tackling conditional logic inside a lambda, it's often better to extract it into a separate method (sure, you can use a ternary operator and place this logic right inside the map operation in the stream if you wish instead of having this method, but it'll be a bit messy):
public static String getValue(String str) {
if (str.matches("<sup>\\p{Alpha}+</sup>")) {
String key = str.replaceAll("<sup>|</sup>", "");
return "[" + dictByKey.getOrDefault(key, key) + "]";
}
return str;
}
main()
public static void main(String[] args) {
foo("One <sup>a</sup> Two <sup>b</sup> Three <sup>D</sup> Four");
}
Output:
One [a-value] Two [b-value] Three [D] Four

Extracting values corresponding to a certain "key" from a string List in Java [duplicate]

I have a String that's formatted like this:
"key1=value1;key2=value2;key3=value3"
for any number of key/value pairs.
I need to check that a certain key exists (let's say it's called "specialkey"). If it does, I want the value associated with it. If there are multiple "specialkey"s set, I only want the first one.
Right now, I'm looking for the index of "specialkey". I take a substring starting at that index, then look for the index of the first = character. Then I look for the index of the first ; character. The substring between those two indices gives me the value associated with "specialkey".
This is not an elegant solution, and it's really bothering me. What's an elegant way of finding the value that corresponds with "specialkey"?
I would parse the String into a map and then just check for the key:
String rawValues = "key1=value1;key2=value2;key3=value3";
Map<String,String> map = new HashMap<String,String>();
String[] entries = rawValues.split(";");
for (String entry : entries) {
String[] keyValue = entry.split("=");
// putIfAbsent keeps the first value when a key occurs more than once
map.putIfAbsent(keyValue[0],keyValue[1]);
}
if (map.containsKey("specialkey")) {
return map.get("specialkey");
}
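The snippet above assumes that every entry contains a '=' with something after it; an entry such as "key3=" or "broken" would cause an ArrayIndexOutOfBoundsException. A minimal sketch that guards against this and returns an Optional follows. The class name KeyValueParser and the decision to silently skip malformed entries are assumptions, not requirements from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

final class KeyValueParser {
    // Hypothetical helper: parses "k1=v1;k2=v2" into a map, keeping the first value per key
    static Map<String, String> parse(String raw) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String entry : raw.split(";")) {
            // Limit 2: split only at the first '=', so values may contain '=' and may be empty
            String[] kv = entry.split("=", 2);
            if (kv.length == 2 && !kv[0].isEmpty()) {
                map.putIfAbsent(kv[0], kv[1]);
            }
            // Entries without '=' or with an empty key are skipped (an assumption)
        }
        return map;
    }

    // Returns the first value for the key, or an empty Optional when the key is absent
    static Optional<String> find(String raw, String key) {
        return Optional.ofNullable(parse(raw).get(key));
    }

    public static void main(String[] args) {
        System.out.println(find("key1=value1;specialkey=first;specialkey=second", "specialkey"));
    }
}
```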
Use String.split:
String[] kvPairs = "key1=value1;key2=value2;key3=value3".split(";");
This will give you an array kvPairs that contains these elements:
key1=value1
key2=value2
key3=value3
Iterate over these and split them, too:
for(String kvPair: kvPairs) {
String[] kv = kvPair.split("=");
String key = kv[0];
String value = kv[1];
// Now do whatever you want with key and value...
if(key.equals("specialkey")) {
// Do something with value if the key is "specialkey"...
}
}
If it's just the one key you're after, you could use regex \bspecialkey=([^;]+)(;|$) and extract capturing group 1:
Pattern p = Pattern.compile("\\bspecialkey=([^;]+)(;|$)");
Matcher m = p.matcher("key1=value1;key2=value2;key3=value3");
if (m.find()) {
System.out.println(m.group(1));
}
If you're doing something with the other keys, then split on ; and then = within a loop - no need for regex.
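One caveat with \b is that it also matches after punctuation, so a key such as "my-specialkey" would be picked up as well. A possible variation, sketched here for an arbitrary key, anchors the key to the start of the string or to a preceding ';' and quotes the key with Pattern.quote. The class name SingleKeyLookup and the method name lookup are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SingleKeyLookup {
    // Hypothetical helper: returns the value of the first entry whose key equals 'key'
    static Optional<String> lookup(String raw, String key) {
        // (?:^|;) requires the key to start an entry; Pattern.quote treats the key literally
        Pattern p = Pattern.compile("(?:^|;)" + Pattern.quote(key) + "=([^;]*)");
        Matcher m = p.matcher(raw);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(lookup("key1=value1;specialkey=v;key3=value3", "specialkey"));
    }
}
```

Unlike the pattern above, this variation also accepts an empty value, since it uses [^;]* rather than [^;]+.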
Just in case anyone is interested in a pure Regex-based approach, the following snippet works.
Pattern pattern = Pattern.compile("([\\w]+)?=([\\w]+)?;?");
Matcher matcher = pattern.matcher("key1=value1;key2=value2;key3=value3");
while (matcher.find()) {
System.out.println("Key - " + matcher.group(1) + " Value - " + matcher.group(2));
}
Output will be
Key - key1 Value - value1
Key - key2 Value - value2
Key - key3 Value - value3
However, as others explained before, String.split() is recommended any day for this sort of task. You shouldn't complicate your life trying to use Regex when there's an alternative to use.
There are many ways to do this. Perhaps the simplest is to use the Streams API (Java 8 and later) together with Matcher.results() (Java 9 and later) to process the match results:
List<String> OriginalList = Arrays.asList("A=1,B=2,C=3",
"A=11,B=12,C=13,D=15", "A=5,B=4,C=9,D=10,E=13",
"A=19,B=20,C=91,D=40,E=33", "A=77,B=27,C=37");
This streams the strings,
matches on the pattern and extracts the integer,
then collects to a list:
Pattern p = Pattern.compile("A=(\\d+)");
List<Integer> result = OriginalList.stream().
flatMap(str->p.matcher(str).results())
.map(mr->Integer.valueOf(mr.group(1)))
.collect(Collectors.toList());
System.out.println(result);
Prints:
[1, 11, 5, 19, 77]
Try : (?:(?:A=)([^,]*))
Demo : https://regex101.com/r/rziGDz/1
Otherwise, here is code that applies the regex to your list to get the answer:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.*;
public class Main
{
public static void main(String[] args)
{
List<Integer> results = new ArrayList<>();
Pattern pattern = Pattern.compile("(?:(?:A=)([^,]*))", Pattern.CASE_INSENSITIVE);
List<String> OriginalList = Arrays.asList(
"A=1,B=2,C=3",
"A=11,B=12,C=13,D=15",
"A=5,B=4,C=9,D=10,E=13",
"A=19,B=20,C=91,D=40,E=33",
"A=77,B=27,C=37");
for (int i = 0; i < OriginalList.size(); i++)
{
Matcher matcher = pattern.matcher(OriginalList.get(i));
boolean matchFound = matcher.find();
if(matchFound)
{
System.out.println( matcher.group(1) );
results.add( Integer.parseInt(matcher.group(1)) );
}
}
}
}
This may be implemented using Stream API by simple splitting of each string in the input list by comma and Stream::flatMap
// assuming A is not always at the beginning
List<String> list = Arrays.asList(
"A=1,B=2,C=3",
"A=11,B=12,C=13,D=15",
"A=5,B=4,C=9,D=10,E=13",
"B=20,C=91,D=40,E=33",
"B=27, A=19, C=37, A=77");
List<Integer> aNums = list.stream() // Stream<String>
.flatMap(
s -> Arrays.stream(s.split("\\s*,\\s*")) // Stream<String> pairs of letter=digits
.filter(pair -> pair.startsWith("A="))
.map(pair -> Integer.valueOf(pair.substring(2)))
)
.collect(Collectors.toList());
System.out.println(aNums);
Output:
[1, 11, 5, 19, 77]
Update
A pattern to split an input string and keep only the digits related to A may be applied as follows:
Pattern splitByA = Pattern.compile("A\\s*=\\s*|\\s*,\\s*|[^A]\\s*=\\s*\\d+");
List<Integer> aNums2 = list.stream()
.flatMap(splitByA::splitAsStream) // Stream<String>
.filter(Predicate.not(String::isEmpty)) // need to remove empty strings
.map(Integer::valueOf)
.collect(Collectors.toList());
System.out.println(aNums2);
Output is the same
[1, 11, 5, 19, 77]
Using basic filter: Split using [A-Z=,]+ regex. Pick the 2nd element.
public List<Integer> filter() {
List<String> originalList = Arrays.asList("A=1,B=2,C=3", "A=11,B=12,C=13,D=15", "A=5,B=4,C=9,D=10,E=13",
"A=19,B=20,C=91,D=40,E=33", "A=77,B=27,C=37");
List<Integer> parsedData = new ArrayList<>();
for(String str: originalList) {
Integer data = Integer.parseInt(str.split("[A-Z=,]+")[1]);
parsedData.add(data);
}
return parsedData;
}
Try this:
List<Integer> results = new ArrayList<>();
Pattern p = Pattern.compile("(?:(?:A=)([^,]*))");
Matcher m = null;
for (String tmp : OriginalList) {
m = p.matcher(tmp);
if (m.find()) {
// group(1) already holds the text after "A="
int r = Integer.parseInt(m.group(1));
results.add(r);
}
}

Parsing String by pattern of substrings

I need to parse a formula and get all the variables that were used. The list of variables is available. For example, the formula looks like this:
String f = "(Min(trees, round(Apples1+Pears1,1)==1&&universe==big)*number";
I know that possible variables are:
String[] vars = {"trees","rivers","Apples1","Pears1","Apricots2","universe","galaxy","big","number"};
I need to get the following array:
String[] varsInF = {"trees", "Apples1","Pears1", "universe", "big","number"};
I believe that split method is good here but can’t figure the regexp required for this.
No need for any regex pattern - just check which item of the supported vars is contained in the given string:
List<String> varsInf = new ArrayList<>();
for(String var : vars)
if(f.contains(var))
varsInf.add(var);
Using Stream<> you can:
String[] varsInf = Arrays.stream(vars).filter(f::contains).toArray(String[]::new);
Assuming a "variable" is a run of one or more alphanumeric characters, you should split on non-alphanumeric characters, i.e. [^\w]+, and then collect the matching tokens by iteration or filter:
Set<String> varSet = new HashSet<>(Arrays.asList(vars));
List<String> result = new ArrayList<>();
for (String s : f.split("[^\\w]+")) {
if (varSet.contains(s)) {
result.add(s);
}
}
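The same tokenizing idea can be written as a stream. This is a sketch only; the class name FormulaVariables and the method name usedVariables are illustrative. Unlike the contains-based check, matching whole tokens does not report a variable just because its name is a substring of a longer identifier.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class FormulaVariables {
    // \W+ is equivalent to [^\w]+ : one or more non-word characters
    private static final Pattern NON_WORD = Pattern.compile("\\W+");

    // Hypothetical helper: known variables that occur in the formula, in order of first use
    static List<String> usedVariables(String formula, String[] knownVars) {
        Set<String> known = new HashSet<>(Arrays.asList(knownVars));
        return NON_WORD.splitAsStream(formula)
                .filter(known::contains)   // drops function names, numbers and empty tokens
                .distinct()                // report each variable once
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String f = "(Min(trees, round(Apples1+Pears1,1)==1&&universe==big)*number";
        String[] vars = {"trees", "rivers", "Apples1", "Pears1", "Apricots2",
                "universe", "galaxy", "big", "number"};
        System.out.println(usedVariables(f, vars));
    }
}
```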

ReplaceAll with java8 lambda functions

Given the following variables
templateText = "Hi ${name}";
variables.put("name", "Joe");
I would like to replace the placeholder ${name} with the value "Joe" using the following code (that does not work)
variables.keySet().forEach(k -> templateText.replaceAll("\\$\\{" + k + "\\}", variables.get(k)));
However, if I do the "old-style" way, everything works perfectly:
for (Entry<String, String> entry : variables.entrySet()){
String regex = "\\$\\{" + entry.getKey() + "\\}";
templateText = templateText.replaceAll(regex, entry.getValue());
}
Surely I am missing something here :)
Java 8
The proper way to implement this has not changed in Java 8, it is based on appendReplacement()/appendTail():
Pattern variablePattern = Pattern.compile("\\$\\{(.+?)\\}");
Matcher matcher = variablePattern.matcher(templateText);
StringBuffer result = new StringBuffer();
while (matcher.find()) {
matcher.appendReplacement(result, variables.get(matcher.group(1)));
}
matcher.appendTail(result);
System.out.println(result);
Note that, as mentioned by drrob in the comments, the replacement String of appendReplacement() may contain group references using the $ sign, and escaping using \. If this is not desired, or if your replacement String can potentially contain those characters, you should escape them using Matcher.quoteReplacement().
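To illustrate the difference, here is a small sketch that substitutes the same value with and without Matcher.quoteReplacement(). The class name QuoteReplacementDemo and the quote flag exist only for this demonstration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuoteReplacementDemo {
    private static final Pattern VARIABLE = Pattern.compile("\\$\\{(.+?)\\}");

    // Hypothetical helper: replaces every ${...} with 'value', optionally quoting it first
    static String substitute(String template, String value, boolean quote) {
        Matcher m = VARIABLE.matcher(template);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, quote ? Matcher.quoteReplacement(value) : value);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Unquoted: "$1" is read as a reference to capturing group 1 (the variable name)
        System.out.println(substitute("Hi ${name}", "$1", false)); // prints "Hi name"
        // Quoted: "$1" is inserted literally
        System.out.println(substitute("Hi ${name}", "$1", true));  // prints "Hi $1"
    }
}
```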
Being more functional in Java 8
If you want a more Java-8-style version, you can extract the search-and-replace boilerplate code into a generalized method that takes a replacement Function:
private static StringBuffer replaceAll(String templateText, Pattern pattern,
Function<Matcher, String> replacer) {
Matcher matcher = pattern.matcher(templateText);
StringBuffer result = new StringBuffer();
while (matcher.find()) {
matcher.appendReplacement(result, replacer.apply(matcher));
}
matcher.appendTail(result);
return result;
}
and use it as
Pattern variablePattern = Pattern.compile("\\$\\{(.+?)\\}");
StringBuffer result = replaceAll(templateText, variablePattern,
m -> variables.get(m.group(1)));
Note that having a Pattern as parameter (instead of a String) allows it to be stored as a constant instead of recompiling it every time.
Same remark applies as above concerning $ and \ – you may want to enforce the quoteReplacement() inside the replaceAll() method if you don't want your replacer function to handle it.
Java 9 and above
Java 9 introduced Matcher.replaceAll(Function) which basically implements the same thing as the functional version above. See Jesse Glick's answer for more details.
You can also use Stream.reduce(identity, accumulator, combiner).
identity
The identity is the initial value passed to the reducing function, i.e. the accumulator.
accumulator
The accumulator combines the current partial result with the next element; in a sequential stream, its result becomes the input for the next reduction step.
combiner
This function merges two partial results in a parallel stream. In practice it is not invoked for a sequential stream.
BinaryOperator<String> combinerNeverBeCalledInSequentiallyStream=(identity,t) -> {
throw new IllegalStateException("Can't be used in parallel stream");
};
String result = variables.entrySet().stream()
.reduce(templateText
, (it, var) -> it.replaceAll(String.format("\\$\\{%s\\}", var.getKey())
, var.getValue())
, combinerNeverBeCalledInSequentiallyStream);
import java.util.HashMap;
import java.util.Map;
public class Repl {
public static void main(String[] args) {
Map<String, String> variables = new HashMap<>();
String templateText = "Hi, ${name} ${secondname}! My name is ${name} too :)";
variables.put("name", "Joe");
variables.put("secondname", "White");
templateText = variables.keySet().stream().reduce(templateText, (acc, e) -> acc.replaceAll("\\$\\{" + e + "\\}", variables.get(e)));
System.out.println(templateText);
}
}
output:
Hi, Joe White! My name is Joe too :)
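However, it's not the best idea to reinvent the wheel; the preferred way to achieve what you want would be to use Apache Commons Lang, as stated here. (In recent versions of Commons Lang, StrSubstitutor is deprecated in favor of StringSubstitutor from Apache Commons Text.)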
Map<String, String> valuesMap = new HashMap<String, String>();
valuesMap.put("animal", "quick brown fox");
valuesMap.put("target", "lazy dog");
String templateString = "The ${animal} jumped over the ${target}.";
StrSubstitutor sub = new StrSubstitutor(valuesMap);
String resolvedString = sub.replace(templateString);
Your code should be changed like below,
String templateText = "Hi ${name}";
Map<String,String> variables = new HashMap<>();
variables.put("name", "Joe");
templateText = variables.keySet().stream().reduce(templateText, (originalText, key) -> originalText.replaceAll("\\$\\{" + key + "\\}", variables.get(key)));
Performing replaceAll repeatedly, i.e. once for every replaceable variable, can become quite expensive, especially as the number of variables grows. This doesn't become more efficient when using the Stream API. The regex package contains the necessary building blocks to do this more efficiently:
public static String replaceAll(String template, Map<String,String> variables) {
String pattern = variables.keySet().stream()
.map(Pattern::quote)
.collect(Collectors.joining("|", "\\$\\{(", ")\\}"));
Matcher m = Pattern.compile(pattern).matcher(template);
if(!m.find()) {
return template;
}
StringBuffer sb = new StringBuffer();
do {
m.appendReplacement(sb, Matcher.quoteReplacement(variables.get(m.group(1))));
} while(m.find());
m.appendTail(sb);
return sb.toString();
}
If you are performing the operation with the same Map very often, you may consider keeping the result of Pattern.compile(pattern), as it is immutable and safely shareable.
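One way to keep the compiled pattern around is to bundle it with its map in a small class. This is a sketch under a few assumptions: the class name CompiledTemplate is made up, the map is non-empty, and none of its values is null.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class CompiledTemplate {
    private final Map<String, String> variables;
    private final Pattern pattern; // compiled once per map and reused by every render call

    CompiledTemplate(Map<String, String> variables) {
        if (variables.isEmpty()) {
            // An empty alternation would match "${}" and look up a missing key
            throw new IllegalArgumentException("at least one variable is required");
        }
        this.variables = new HashMap<>(variables); // defensive copy
        String regex = this.variables.keySet().stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|", "\\$\\{(", ")\\}"));
        this.pattern = Pattern.compile(regex);
    }

    // Replaces every ${key} whose key is in the map; other text is left untouched
    String render(String template) {
        Matcher m = pattern.matcher(template);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, Matcher.quoteReplacement(variables.get(m.group(1))));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        CompiledTemplate t = new CompiledTemplate(Map.of("name", "Joe", "secondname", "White"));
        System.out.println(t.render("Hi, ${name} ${secondname}!"));
    }
}
```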
On the other hand, if you are using this operation with different maps frequently, it might be an option to use a generic pattern instead, combined with handling the possibility that the particular variable is not in the map. This adds the option to report occurrences of the ${…} pattern with an unknown variable:
private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^}]*)\\}");
public static String replaceAll(String template, Map<String,String> variables) {
Matcher m = VARIABLE.matcher(template);
if(!m.find())
return template;
StringBuffer sb = new StringBuffer();
do {
m.appendReplacement(sb,
Matcher.quoteReplacement(variables.getOrDefault(m.group(1), m.group(0))));
} while(m.find());
m.appendTail(sb);
return sb.toString();
}
m.group(0) is the actual match, so using this as a fall-back for the replacement string establishes the original behavior of not replacing ${…} occurrences when the key is not in the map. As said, alternative behaviors, like reporting the absent key or using a different fall-back text, are possible.
To update @didier-l’s answer, in Java 9 this is a one-liner!
Pattern.compile("[$][{](.+?)[}]").matcher(templateText).replaceAll(m -> variables.get(m.group(1)))
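The one-liner assumes that every variable is present in the map and that no value contains $ or \. A slightly more defensive sketch, with the hypothetical names Java9Template and render, leaves unknown placeholders untouched and quotes the replacement:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Java9Template {
    private static final Pattern VARIABLE = Pattern.compile("[$][{](.+?)[}]");

    // Hypothetical helper: requires Java 9+ for Matcher.replaceAll(Function)
    static String render(String templateText, Map<String, String> variables) {
        return VARIABLE.matcher(templateText).replaceAll(m ->
                // Fall back to the whole match (m.group()) when the key is unknown,
                // and quote the result so '$' and '\' are inserted literally
                Matcher.quoteReplacement(variables.getOrDefault(m.group(1), m.group())));
    }

    public static void main(String[] args) {
        System.out.println(render("Hi ${name}, you owe ${amount} ${unknown}",
                Map.of("name", "Joe", "amount", "$5")));
    }
}
```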

How to split string using regular expression and put string values into a map in Java

I have the following String, and I want to read it using a regular expression and put it into a map as keys and values. I have already split it and put it into a map, but the problem is that I have used string arrays, and there is a high risk of an array index out of bounds error. So I think that approach is not good coding practice.
public static void read(String log,Map<String, String> logMap) {
String sanitizeLog = "";
String commaSeparatedLine[];
String equalSeparatedLine[];
String patternComma = ",";
String patternEqual = "=";
String patternSanitize = "(?<=]:).*";
Pattern pattern = Pattern.compile(patternSanitize);
Matcher matcher = pattern.matcher(log);
if (matcher.find()) {
sanitizeLog = matcher.group();
}
pattern = Pattern.compile(patternComma);
commaSeparatedLine = pattern.split(sanitizeLog);
for (String line : commaSeparatedLine) {
pattern = Pattern.compile(patternEqual);
equalSeparatedLine = pattern.split(line);
for (int i = 0; i < equalSeparatedLine.length; i += 2) {
logMap.put(equalSeparatedLine[i].trim(),
equalSeparatedLine[i + 1]);
}
}
}
The above code snippet works fine, but I used a lot of string arrays to store the split values. Please let me know whether there is any way to do the same thing without string arrays and put the split values into a map using a regular expression. I am a newbie with regular expressions.
The output map should contain entries like this:
Key -> value
DB.UPDATE_CT -> 2
DB.DUPQ_CT -> 1
...
String value to be split
[2015-01-07 07:17:56,911]: R="InProgressOrders.jsp", REQUEST_UUID="77ed2ab1-b799-4715-acd5-e77ab756192e", HTTP_M="POST",
PFWD="login.jsp", USER_ORG="TradeCustomer.1717989", TX_ORG1="1717989",
DB.QUERY_CT=61, DB.UPDATE_CT=2, DB.DUPQ_CT=1, DB.SVR_MS=59,
DB.IO_MS=111, DB.DRV_MS=144, DB.LOCK_MS=31, DB.BYTES_W=1501, KV.PUT=1,
KV.GET=5, KV.PWAIT_MS=2, KV.GWAIT_MS=4, KV.BYTES_W=193,
KV.BYTES_R=367, MCACHE.GET=30, MCACHE.PUT=18, MCACHE.L1HIT=10,
MCACHE.L2HIT=1, MCACHE.HIT=1, MCACHE.MISS=18, MCACHE.WAIT_MS=51,
MCACHE.BYTES_W=24538, MCACHE.BYTES_R=24282, ROOTS.READ_CT=6,
ROOTS.DUPRSV_CT=3, THREAD.WALL_MS=594, THREAD.CPU_MS=306,
THREAD.CPU_USER_MS=300, THREAD.MEM_K=19318
You seem to have a lot of code. Here is how to do it in 1-line:
Map<String, String> map = Arrays.stream(input.split(","))
.map(s -> s.split("="))
.collect(Collectors.toMap(a -> a[0].trim(), a -> a[1]));
To instead add the entries to another map (as in your code):
Arrays.stream(input.split(",")).map(s -> s.split("="))
.forEach(a -> logMap.put(a[0].trim(), a[1]));
Disclaimer: Not tested or compiled, just thumbed in.
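Since the timestamp prefix itself contains a comma and no '=', splitting on "," and indexing into the resulting arrays can still fail on the sample line. An alternative sketch that avoids arrays altogether is to match the key=value pairs directly. The class name LogLineParser, the exact pattern, and the decision to strip the surrounding double quotes from quoted values are assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogLineParser {
    // Key: word characters or dots. Value: a double-quoted string (group 2)
    // or a run of characters without comma or whitespace (group 3).
    private static final Pattern PAIR =
            Pattern.compile("([\\w.]+)=(?:\"([^\"]*)\"|([^,\\s]+))");

    // Hypothetical helper: collects every key=value pair found in the log line
    static Map<String, String> parse(String log) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(log);
        while (m.find()) {
            String value = m.group(2) != null ? m.group(2) : m.group(3);
            result.put(m.group(1), value);
        }
        return result;
    }

    public static void main(String[] args) {
        String log = "[2015-01-07 07:17:56,911]: R=\"InProgressOrders.jsp\", "
                + "DB.UPDATE_CT=2, DB.DUPQ_CT=1";
        parse(log).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

The timestamp is skipped simply because it contains no '=', so no separate sanitizing step is needed in this sketch.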
