Splitting a string into multiple strings using regex - java

I am working on a task to parse a Cucumber feature string, where I need to split the string into five parts, as shown below.
String data = "calls 'create' using 'POST' on some uri";
I have been calling the basic split functionality several times (without any regex, which is very tedious) to turn the data into the following:
String dataArray[] = {"calls '","create","' using '","POST", "' on some uri"};
I want to obtain the values of dataArray[1] and dataArray[3]. Is there a way to generate the above dataArray using regex and split, or some other straightforward method?

Simply use this?
String dataArray[] = data.split("'");
->
[calls , create, using , POST, on some uri]

Here is a solution that uses regex:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public static void main (String[] args) {
    String data = "calls 'create' using 'POST' on some uri";
    // sized for the two quoted values in this input
    String[] dataArray = new String[2];
    Matcher matcher = Pattern.compile("'[a-zA-Z]+'").matcher(data);
    int counter = 0;
    while (matcher.find()) {
        String result = matcher.group(0);
        // strip the surrounding quotes
        dataArray[counter++] = result.substring(1, result.length() - 1);
    }
}
Output:
dataArray[0] --> create
dataArray[1] --> POST
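If the number of quoted values is not known in advance, a capturing group and a list avoid both the fixed-size array and the manual quote stripping. This is only a sketch: the class name QuotedTokens and the method extract are made up for illustration, and it assumes values are delimited by plain single quotes with no escaping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answers.
class QuotedTokens {
    // Matches a single-quoted run and captures the text between the quotes.
    private static final Pattern QUOTED = Pattern.compile("'([^']*)'");

    static List<String> extract(String input) {
        List<String> values = new ArrayList<>();
        Matcher matcher = QUOTED.matcher(input);
        while (matcher.find()) {
            // group(1) is the captured text without the surrounding quotes
            values.add(matcher.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        // prints [create, POST]
        System.out.println(extract("calls 'create' using 'POST' on some uri"));
    }
}
```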

Related

N-th indexOf in String?

I need to extract a sub-string of a URL.
URLs
/service1/api/v1.0/foo -> foo
/service1/api/v1.0/foo/{fooId} -> foo/{fooId}
/service1/api/v1.0/foo/{fooId}/boo -> foo/{fooId}/boo
And some of those URLs may have request parameters.
Code
String str = request.getRequestURI();
str = str.substring(str.indexOf("/") + 1);
str = str.substring(str.indexOf("/") + 1);
str = str.substring(str.indexOf("/") + 1);
str = str.substring(str.indexOf("/") + 1, str.indexOf("?"));
Is there a better way to extract the sub-string instead of recurrent usage of indexOf method?
There are many alternative ways:
Use the Java Stream API on the string split with the / delimiter:
String str = "/service1/api/v1.0/foo/{fooId}/boo";
String[] split = str.split("\\/");
String url = Arrays.stream(split).skip(4).collect(Collectors.joining("/"));
System.out.println(url);
To also strip the request parameters, the stream would look like this:
String url = Arrays.stream(split)
.skip(4)
.map(i -> i.replaceAll("\\?.+", ""))
.collect(Collectors.joining("/"));
This is also a good place for regex. Use the Pattern and Matcher classes:
String str = "/service1/api/v1.0/foo/{fooId}/boo";
Pattern pattern = Pattern.compile("\\/.*?\\/api\\/v\\d+\\.\\d+\\/(.+)");
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
If you prefer to stay with indexOf(..), you can use a while loop:
String str = "/service1/api/v1.0/foo/{fooId}/boo?parameter=value";
String string = str;
while(!string.startsWith("v1.0")) {
string = string.substring(string.indexOf("/") + 1);
}
System.out.println(string.substring(string.indexOf("/") + 1, string.indexOf("?")));
As other answers point out (@JB Nizet), if the prefix never changes, a single call to indexOf(..) is enough:
string.substring("/service1/api/v1.0/".length(), string.indexOf("?"));
All of these solutions depend on your input and on the pattern being known: either the number of leading sections delimited by /, or the version v1.0 used as a checkpoint. The best solution might not appear here, since URLs can take many different forms. You need to know all the possible shapes of the input URL to find the best way to handle it.
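For the indexOf-based variants, the repeated calls can be wrapped in a small "n-th indexOf" helper. The sketch below is one possible shape, not something from the answers: the names NthIndexOf, nthIndexOf and afterNthSlash are hypothetical, and it assumes the wanted part starts after the fourth slash. Unlike the snippets above, it also copes with a missing query string or too few slashes.

```java
// Hypothetical helper, not part of the original answers.
class NthIndexOf {
    /** Returns the index of the n-th occurrence (1-based) of ch in s, or -1 if there are fewer. */
    static int nthIndexOf(String s, char ch, int n) {
        int index = -1;
        for (int i = 0; i < n; i++) {
            // continue searching right after the previous hit instead of copying the string
            index = s.indexOf(ch, index + 1);
            if (index < 0) {
                return -1;
            }
        }
        return index;
    }

    /** Returns the text after the n-th slash, without any query string. */
    static String afterNthSlash(String uri, int n) {
        int slash = nthIndexOf(uri, '/', n);
        if (slash < 0) {
            return uri; // fewer than n slashes: leave the input untouched
        }
        int query = uri.indexOf('?', slash);
        return query < 0 ? uri.substring(slash + 1) : uri.substring(slash + 1, query);
    }

    public static void main(String[] args) {
        // prints foo/{fooId}/boo
        System.out.println(afterNthSlash("/service1/api/v1.0/foo/{fooId}/boo?parameter=value", 4));
    }
}
```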
Path is quite useful for that :
public static void main(String[] args) {
Path root = Paths.get("/service1/api/v1.0/foo");
Path relativize = root.relativize(Paths.get("/service1/api/v1.0/foo/{fooId}/boo"));
System.out.println(relativize);
}
Output :
{fooId}/boo
How about this:
String s = "/service1/api/v1.0/foo/{fooId}/boo";
String[] sArray = s.split("/");
StringBuilder sb = new StringBuilder();
for (int i = 4; i < sArray.length; i++) {
sb.append(sArray[i]).append("/");
}
sb.deleteCharAt(sb.length() - 1);
System.out.println(sb.toString());
Output:
foo/{fooId}/boo
If the url prefix is always /service1/api/v1.0/, you just need to do s.substring("/service1/api/v1.0/".length()).
There are a few good options here.
1) If you know "foo" will always be the 4th token, then you have the right idea already. The only issue with your way is that you have the information you need to be efficient, but you aren't using it. Instead of copying the String multiple times and looping anew from the beginning of the new String, you could just continue from where you left off, 4 times, to find the starting point of what you want.
String str = "/service1/api/v1.0/foo/{fooId}/boo";
// start at the beginning
int start = 0;
// get the 4th index of '/' in the string
for (int i = 0; i != 4; i++) {
// get the next index of '/' after the index 'start'
start = str.indexOf('/',start);
// increase the pointer to the next character after this slash
start++;
}
// get the substring
str = str.substring(start);
This avoids creating intermediate strings and compiling a pattern, so it should be considerably cheaper than a regex-based approach.
2) Regex (java.util.regex.*). This will work if what you want is always preceded by "service1/api/v1.0/". There may be other directories before it, e.g. "one/two/three/service1/api/v1.0/".
// \\. escapes the dot so that it matches a literal '.'; wrapping a literal in \Q...\E would escape any special chars automatically
// (.+) will capture the matched text at that position
// $ marks the end of the string (it also matches just before a final line terminator)
Pattern pattern = Pattern.compile("/service1/api/v1\\.0/(.+)$");
// get a matcher for it
Matcher matcher = pattern.matcher(str);
// if there is a match
if (matcher.find()) {
// get the captured text
str = matcher.group(1);
}
If your path can vary somewhat, you can use regex to account for it. For example, "/service/api/v3/foo/{bar}/baz/" (note the different number format and the trailing '/') could be matched as well by changing the regex to "/service\\d*/api/v\\d+(?:\\.\\d+)?/(.+?)/?$". The capture is lazy here so that an optional trailing '/' is left out of group 1.
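A minimal sketch of a pattern for such varying paths, assuming the version looks like v1.0 or v3 and that an optional trailing slash should not be part of the result. The class and method names (ApiPathTail, tail) are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answers.
class ApiPathTail {
    // (.+?) is lazy so that the optional trailing '/' before the end is not captured.
    private static final Pattern TAIL =
            Pattern.compile("/service\\d*/api/v\\d+(?:\\.\\d+)?/(.+?)/?$");

    /** Returns the part after the versioned prefix, or null if the path does not match. */
    static String tail(String path) {
        Matcher m = TAIL.matcher(path);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(tail("/service1/api/v1.0/foo/{fooId}/boo"));    // foo/{fooId}/boo
        System.out.println(tail("one/two/service/api/v3/foo/{bar}/baz/")); // foo/{bar}/baz
    }
}
```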

Parse string value from URL

I have a string (which is an URL) in this pattern https://xxx.kflslfsk.com/kjjfkskfjksf/v1/files/media/93939393hhs8.jpeg
now I want to clip it to this
media/93939393hhs8.jpeg
I want to remove all the characters before the second-to-last slash /.
I'm a newbie in Java, but in Swift (iOS) this is how we do it:
if let url = NSURL(string:"https://xxx.kflslfsk.com/kjjfkskfjksf/v1/files/media/93939393hhs8.jpeg"), pathComponents = url.pathComponents {
let trimmedString = pathComponents.suffix(2).joinWithSeparator("/")
print(trimmedString) // "output = media/93939393hhs8.jpeg"
}
Basically, I'm removing everything from this URL except the last 2 items, and then I'm joining those 2 items using /.
String ret = url.substring(url.indexOf("media"));
Are you familiar with regex? Try this regex, which captures the last 2 items separated by /:
.*?\/([^\/]+?\/[^\/]+?$)
Here is the example in Java (don't forget the escaping with \\):
Pattern p = Pattern.compile("^.*?\\/([^\\/]+?\\/[^\\/]+?$)");
Matcher m = p.matcher(string);
if (m.find()) {
System.out.println(m.group(1));
}
Alternatively, there is the split(..) function, although I recommend the approach above. (Finally, concatenate the separated strings with StringBuilder.)
String part[] = string.split("/");
int l = part.length;
StringBuilder sb = new StringBuilder();
String result = sb.append(part[l-2]).append("/").append(part[l-1]).toString();
Both give the same result: media/93939393hhs8.jpeg
String result = url.substring(url.substring(0, url.lastIndexOf('/')).lastIndexOf('/') + 1);
or
use split and join the last 2 items:
String[] arr = url.split("/");
String result = arr[arr.length - 2] + "/" + arr[arr.length - 1];
public static String parseUrl(String str) {
return (str.lastIndexOf("/") > 0) ? str.substring(1+(str.substring(0,str.lastIndexOf("/")).lastIndexOf("/"))) : str;
}
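Since the input is a full URL, another option is to let java.net.URI isolate the path first, so that a query string or fragment cannot end up in the result. The following is a sketch under the assumption that the input is a well-formed absolute URL; LastSegments and lastTwoSegments are hypothetical names.

```java
import java.net.URI;

// Hypothetical helper, not part of the original answers.
class LastSegments {
    /** Returns the last two path segments joined by '/'; shorter paths are returned unchanged. */
    static String lastTwoSegments(String url) {
        // getPath() leaves out the scheme, host, query string and fragment
        String path = URI.create(url).getPath();
        int last = path.lastIndexOf('/');
        // a negative fromIndex simply yields -1, so short paths fall through safely
        int secondLast = path.lastIndexOf('/', last - 1);
        return path.substring(secondLast + 1);
    }

    public static void main(String[] args) {
        // prints media/93939393hhs8.jpeg
        System.out.println(lastTwoSegments(
                "https://xxx.kflslfsk.com/kjjfkskfjksf/v1/files/media/93939393hhs8.jpeg"));
    }
}
```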

How can i derive specific data from the string?

I have the following string and I want to extract the number (104321) from the a href tag. How can I do that?
Hello this is testing string Ap<img src=\"Image Url" width=\"222\" height=\"149\"/><br/><br/>test\u00e4n p\u00e4\u00e4ll\u00e4 test, test\u00e4, test?
I want the final output to look like this:
String[] strExample= {"testing", "104321","test\u00e4n p\u00e4\u00e4ll\u00e4 test, test\u00e4, test?"};
Any help is appreciated.
You could try a simple Pattern matcher with the regexp:
String THE_PATTERN = "<a\\s+href\\s*=\\s*\"/([a-zA-Z]+)/([0-9]+)";
Matcher m = Pattern.compile(THE_PATTERN).matcher(THE_INPUT_STRING);
String[] results = new String[2];
if (m.find()) {
results[0] = m.group(1);
results[1] = m.group(2);
}
Haven't tried it though, so there could be small/easy-to-fix errors.
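To try the pattern out, here is a small self-contained sketch. The anchor tag is not visible in the question's sample string, so the input used here (a link of the form /testing/104321) is a made-up stand-in, and the class name HrefNumber is hypothetical as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the sample input is invented, not taken from the question.
class HrefNumber {
    private static final Pattern LINK =
            Pattern.compile("<a\\s+href\\s*=\\s*\"/([a-zA-Z]+)/([0-9]+)");

    /** Returns {name, number} from the first matching link, or an empty array if none is found. */
    static String[] extract(String html) {
        Matcher m = LINK.matcher(html);
        return m.find() ? new String[] { m.group(1), m.group(2) } : new String[0];
    }

    public static void main(String[] args) {
        String[] parts = extract("Hello <a href=\"/testing/104321\">link</a> and more text");
        // prints testing 104321
        System.out.println(parts[0] + " " + parts[1]);
    }
}
```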
For that single case
String[] strExample = str.split("^.+?\\\"/|\\\\\">.+<br/>|/");
will work. It will break if the string you want to parse changes much, though. A few more examples would help if there are more patterns you need to account for.

How to replace a set of tokens in a Java String?

I have the following template String: "Hello [Name] Please find attached [Invoice Number] which is due on [Due Date]".
I also have String variables for name, invoice number and due date - what's the best way to replace the tokens in the template with the variables?
(Note that if a variable happens to contain a token it should NOT be replaced).
EDIT
With thanks to @laginimaineb and @alan-moore, here's my solution:
public static String replaceTokens(String text,
Map<String, String> replacements) {
Pattern pattern = Pattern.compile("\\[(.+?)\\]");
Matcher matcher = pattern.matcher(text);
StringBuffer buffer = new StringBuffer();
while (matcher.find()) {
String replacement = replacements.get(matcher.group(1));
if (replacement != null) {
// matcher.appendReplacement(buffer, replacement);
// see comment
matcher.appendReplacement(buffer, "");
buffer.append(replacement);
}
}
matcher.appendTail(buffer);
return buffer.toString();
}
I really don't think you need to use a templating engine or anything like that for this. You can use the String.format method, like so:
String template = "Hello %s Please find attached %s which is due on %s";
String message = String.format(template, name, invoiceNumber, dueDate);
The most efficient way would be using a matcher to continually find the expressions and replace them, then append the text to a string builder:
Pattern pattern = Pattern.compile("\\[(.+?)\\]");
Matcher matcher = pattern.matcher(text);
HashMap<String,String> replacements = new HashMap<String,String>();
//populate the replacements map ...
StringBuilder builder = new StringBuilder();
int i = 0;
while (matcher.find()) {
String replacement = replacements.get(matcher.group(1));
builder.append(text.substring(i, matcher.start()));
if (replacement == null)
builder.append(matcher.group(0));
else
builder.append(replacement);
i = matcher.end();
}
builder.append(text.substring(i, text.length()));
return builder.toString();
Unfortunately, the convenient String.format method mentioned above is only available starting with Java 1.5 (which should be pretty standard nowadays, but you never know). Instead, you might also use Java's MessageFormat class for replacing the placeholders.
It supports placeholders in the form '{number}', so your message would look like "Hello {0} Please find attached {1} which is due on {2}". These strings can easily be externalized using ResourceBundles (e.g. for localization with multiple locales). The replacement is done using the static 'format' method of the MessageFormat class:
String msg = "Hello {0} Please find attached {1} which is due on {2}";
String[] values = {
"John Doe", "invoice #123", "2009-06-30"
};
System.out.println(MessageFormat.format(msg, values));
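One thing to watch for with MessageFormat is that a single quote starts a quoted section in the pattern, so an apostrophe in the template has to be doubled. The sketch below illustrates this; the class name MessageFormatQuotes and the wrapper method render are only there to make the example self-contained.

```java
import java.text.MessageFormat;

// Hypothetical demo class, not part of the original answers.
class MessageFormatQuotes {
    static String render(String pattern, Object... args) {
        return MessageFormat.format(pattern, args);
    }

    public static void main(String[] args) {
        // The lone apostrophe opens a quoted section, so {0} is treated as literal text.
        System.out.println(render("It's due on {0}", "2009-06-30"));  // Its due on {0}
        // A doubled apostrophe produces one literal apostrophe and keeps {0} active.
        System.out.println(render("It''s due on {0}", "2009-06-30")); // It's due on 2009-06-30
    }
}
```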
You could try using a templating library like Apache Velocity.
http://velocity.apache.org/
Here is an example:
import org.apache.velocity.VelocityContext;
import org.apache.velocity.app.Velocity;
import java.io.StringWriter;
public class TemplateExample {
public static void main(String args[]) throws Exception {
Velocity.init();
VelocityContext context = new VelocityContext();
context.put("name", "Mark");
context.put("invoiceNumber", "42123");
context.put("dueDate", "June 6, 2009");
String template = "Hello $name. Please find attached invoice" +
" $invoiceNumber which is due on $dueDate.";
StringWriter writer = new StringWriter();
Velocity.evaluate(context, writer, "TemplateName", template);
System.out.println(writer);
}
}
The output would be:
Hello Mark. Please find attached invoice 42123 which is due on June 6, 2009.
You can use a template library for complex template replacement.
FreeMarker is a very good choice.
http://freemarker.sourceforge.net/
But for a simple task, there is a simple utility class that can help you:
org.apache.commons.lang3.text.StrSubstitutor
It is very powerful, customizable, and easy to use.
This class takes a piece of text and substitutes all the variables
within it. The default definition of a variable is ${variableName}.
The prefix and suffix can be changed via constructors and set methods.
Variable values are typically resolved from a map, but could also be
resolved from system properties, or by supplying a custom variable
resolver.
For example, if you want to substitute system environment variable into a template string,
here is the code:
public class SysEnvSubstitutor {
public static final String replace(final String source) {
StrSubstitutor strSubstitutor = new StrSubstitutor(
new StrLookup<Object>() {
@Override
public String lookup(final String key) {
return System.getenv(key);
}
});
return strSubstitutor.replace(source);
}
}
System.out.println(MessageFormat.format("Hello {0}! You have {1} messages", "Join",10L));
Output:
Hello Join! You have 10 messages
String.format("Hello %s Please find attached %s which is due on %s", name, invoice, date)
It depends on where the actual data that you want to replace is located. You might have a Map like this:
Map<String, String> values = new HashMap<String, String>();
containing all the data that can be replaced. Then you can iterate over the map and change everything in the String as follows:
String s = "Your String with [Fields]";
for (Map.Entry<String, String> e : values.entrySet()) {
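// String.replace treats both arguments literally, so regex characters in keys or '$' in values cause no trouble
s = s.replace("[" + e.getKey() + "]", e.getValue());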
}
You could also iterate over the String and find the elements in the map. But that is a little bit more complicated because you need to parse the String searching for the []. You could do it with a regular expression using Pattern and Matcher.
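A sketch of that single-pass variant, which matters for the requirement in the question: replacing tokens one map entry at a time can substitute a token that only appeared because an earlier value contained it, whereas one pass over the template never rescans inserted values. The sketch assumes Java 9 or later (for Matcher.replaceAll with a function); the class name SinglePassTokens is hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answers.
class SinglePassTokens {
    private static final Pattern TOKEN = Pattern.compile("\\[(.+?)\\]");

    static String replaceTokens(String template, Map<String, String> values) {
        // Each token is looked up once; the inserted values are never scanned again.
        return TOKEN.matcher(template).replaceAll(match -> {
            String value = values.get(match.group(1));
            // Unknown tokens are kept as they are; quoteReplacement protects '$' and '\' in values.
            return Matcher.quoteReplacement(value != null ? value : match.group());
        });
    }

    public static void main(String[] args) {
        Map<String, String> values = Map.of(
                "Name", "[Due Date]", // a value that happens to look like a token
                "Invoice Number", "INV-1",
                "Due Date", "2009-06-30");
        // prints Hello [Due Date] Please find attached INV-1 which is due on 2009-06-30
        System.out.println(replaceTokens(
                "Hello [Name] Please find attached [Invoice Number] which is due on [Due Date]",
                values));
    }
}
```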
My solution for replacing ${variable} style tokens (inspired by the answers here and by the Spring UriTemplate):
public static String substituteVariables(String template, Map<String, String> variables) {
Pattern pattern = Pattern.compile("\\$\\{(.+?)\\}");
Matcher matcher = pattern.matcher(template);
// Before Java 9, Matcher.appendReplacement only accepts a StringBuffer, not a StringBuilder
StringBuffer buffer = new StringBuffer();
while (matcher.find()) {
if (variables.containsKey(matcher.group(1))) {
String replacement = variables.get(matcher.group(1));
// quote to work properly with $ and {,} signs
matcher.appendReplacement(buffer, replacement != null ? Matcher.quoteReplacement(replacement) : "null");
}
}
matcher.appendTail(buffer);
return buffer.toString();
}
With the Apache Commons library, you can simply use StringUtils.replaceEach:
public static String replaceEach(String text,
String[] searchList,
String[] replacementList)
From the documentation:
Replaces all occurrences of Strings within another String.
A null reference passed to this method is a no-op, or if any "search
string" or "string to replace" is null, that replace will be ignored.
This will not repeat. For repeating replaces, call the overloaded
method.
StringUtils.replaceEach(null, *, *) = null
StringUtils.replaceEach("", *, *) = ""
StringUtils.replaceEach("aba", null, null) = "aba"
StringUtils.replaceEach("aba", new String[0], null) = "aba"
StringUtils.replaceEach("aba", null, new String[0]) = "aba"
StringUtils.replaceEach("aba", new String[]{"a"}, null) = "aba"
StringUtils.replaceEach("aba", new String[]{"a"}, new String[]{""}) = "b"
StringUtils.replaceEach("aba", new String[]{null}, new String[]{"a"}) = "aba"
StringUtils.replaceEach("abcde", new String[]{"ab", "d"}, new String[]{"w", "t"}) = "wcte"
(example of how it does not repeat)
StringUtils.replaceEach("abcde", new String[]{"ab", "d"}, new String[]{"d", "t"}) = "dcte"
You can use Apache Commons StringSubstitutor:
For example:
// Build map
Map<String, String> valuesMap = new HashMap<>();
valuesMap.put("animal", "quick brown fox");
valuesMap.put("target", "lazy dog");
String templateString = "The ${animal} jumped over the ${target}.";
// Build StringSubstitutor
StringSubstitutor sub = new StringSubstitutor(valuesMap);
// Replace
String resolvedString = sub.replace(templateString);
yielding:
"The quick brown fox jumped over the lazy dog."
You can also customize the prefix and suffix delimiters (${ and } respectively in the example above) by using:
setVariablePrefix
setVariableSuffix
You can also specify a default value using syntax like below:
String templateString = "The ${animal:giraffe} jumped over the ${target}.";
which would yield "The giraffe jumped over the lazy dog." when no animal parameter was supplied.
http://github.com/niesfisch/tokenreplacer
FYI
In the new language Kotlin,
you can use "String Templates" directly in your source code;
no 3rd-party library or template engine is needed to do the variable replacement.
It is a feature of the language itself.
See:
https://kotlinlang.org/docs/reference/basic-types.html#string-templates
In the past, I've solved this kind of problem with StringTemplate and Groovy Templates.
Ultimately, the decision of using a templating engine or not should be based on the following factors:
Will you have many of these templates in the application?
Do you need the ability to modify the templates without restarting the application?
Who will be maintaining these templates? A Java programmer or a business analyst involved on the project?
Will you need the ability to put logic in your templates, like conditional text based on values in the variables?
Will you need the ability to include other templates in a template?
If any of the above applies to your project, I would consider using a templating engine, most of which provide this functionality, and more.
I used
String template = "Hello %s Please find attached %s which is due on %s";
String message = String.format(template, name, invoiceNumber, dueDate);
The following replaces variables of the form <<VAR>> with values looked up from a Map.
For example, with the following input string
BMI=(<<Weight>>/(<<Height>>*<<Height>>)) * 70
Hi there <<Weight>> was here
and the following variable values
Weight, 42
Height, HEIGHT 51
outputs the following
BMI=(42/(HEIGHT 51*HEIGHT 51)) * 70
Hi there 42 was here
Here's the code
static Pattern pattern = Pattern.compile("<<([a-z][a-z0-9]*)>>", Pattern.CASE_INSENSITIVE);
public static String replaceVarsWithValues(String message, Map<String,String> varValues) {
try {
StringBuffer newStr = new StringBuffer(message);
int lenDiff = 0;
Matcher m = pattern.matcher(message);
while (m.find()) {
String fullText = m.group(0);
String keyName = m.group(1);
String newValue = varValues.get(keyName)+"";
String replacementText = newValue;
newStr = newStr.replace(m.start() - lenDiff, m.end() - lenDiff, replacementText);
lenDiff += fullText.length() - replacementText.length();
}
return newStr.toString();
} catch (Exception e) {
return message;
}
}
public static void main(String args[]) throws Exception {
String testString = "BMI=(<<Weight>>/(<<Height>>*<<Height>>)) * 70\n\nHi there <<Weight>> was here";
HashMap<String,String> values = new HashMap<>();
values.put("Weight", "42");
values.put("Height", "HEIGHT 51");
System.out.println(replaceVarsWithValues(testString, values));
}
Although not requested, you can use a similar approach to replace variables in a string with properties from your application.properties file, though this may already be done for you:
private static Pattern patternMatchForProperties =
Pattern.compile("[$][{]([.a-z0-9_]*)[}]", Pattern.CASE_INSENSITIVE);
protected String replaceVarsWithProperties(String message) {
try {
StringBuffer newStr = new StringBuffer(message);
int lenDiff = 0;
Matcher m = patternMatchForProperties.matcher(message);
while (m.find()) {
String fullText = m.group(0);
String keyName = m.group(1);
String newValue = System.getProperty(keyName);
String replacementText = newValue;
newStr = newStr.replace(m.start() - lenDiff, m.end() - lenDiff, replacementText);
lenDiff += fullText.length() - replacementText.length();
}
return newStr.toString();
} catch (Exception e) {
return message;
}
}

URL to namespace conversion in Java

I need to be able to convert:
(url) http://www.joe90.com/showroom
to
(namespace) com.joe90.showroom
I can do this using tokens etc. and an enforced rule set.
However, is there a way (a Java package) that will do this for me,
or do I need to write one myself?
Thanks
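java.net.URL url = new java.net.URL("http://www.joe90.com/showroom");
// getHost() returns "www.joe90.com"; the dot is escaped because split() takes a regex
String[] tokens = url.getHost().split("\\.");
StringBuilder sb = new StringBuilder();
// walk the host parts in reverse order, skipping a leading "www"
for (int i = tokens.length - 1; i >= 0; i--) {
    if (i == 0 && tokens[i].equals("www")) {
        continue;
    }
    if (sb.length() > 0) {
        sb.append('.');
    }
    sb.append(tokens[i]);
}
// append the path, turning "/showroom" into ".showroom"
sb.append(url.getPath().replace('/', '.'));
String namespace = sb.toString();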
Alternatively you can parse the hostname out.
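Pattern p = Pattern.compile("^(\\w+://)?(.*?)/");
Matcher m = p.matcher(urlString); // the URL as a String
// find() rather than matches(): the pattern stops at the first '/' after the host
if (m.find()) {
    // escape the dot, because split() takes a regex
    String tokens[] = m.group(2).split("\\.");
    // etc
}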
Of course that regex doesn't match all URLs, for example:
http://username@hostname.com/...
That's why I suggested using java.net.URL: it does all the URL validation and parsing for you.
Your best bet would be to split the string on the . and / characters (e.g. using String.split()), and then concatenate the host pieces in reverse order, skipping over any you don't want to include (e.g. www).
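A minimal sketch of that split-and-reverse idea as a reusable method, assuming an absolute URL whose host labels and path segments are already valid namespace parts. The names UrlToNamespace and toNamespace are made up for the example, and only a leading "www." is dropped.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the original answers.
class UrlToNamespace {
    static String toNamespace(String url) {
        URI uri = URI.create(url);
        String host = uri.getHost();
        if (host.startsWith("www.")) {
            host = host.substring("www.".length());
        }
        // Reverse the host labels: joe90.com -> com, joe90
        List<String> parts = new ArrayList<>(Arrays.asList(host.split("\\.")));
        Collections.reverse(parts);
        // Path segments keep their order; empty segments from leading slashes are skipped.
        for (String segment : uri.getPath().split("/")) {
            if (!segment.isEmpty()) {
                parts.add(segment);
            }
        }
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        // prints com.joe90.showroom
        System.out.println(toNamespace("http://www.joe90.com/showroom"));
    }
}
```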
