How can I derive specific data from the string? - java

I have the following string and I want to extract the number (104321) from the a href tag. How can I get this number?
Hello this is testing string Ap<img src=\"Image Url" width=\"222\" height=\"149\"/><br/><br/>test\u00e4n p\u00e4\u00e4ll\u00e4 test, test\u00e4, test?
I want the final output to look like this:
String[] strExample= {"testing", "104321","test\u00e4n p\u00e4\u00e4ll\u00e4 test, test\u00e4, test?"};
Any help is appreciated.

You could try a simple Pattern matcher with the regexp:
String THE_PATTERN = "<a\\s+href\\s*=\\s*\"/([a-zA-Z]+)/([0-9]+)";
Matcher m = Pattern.compile(THE_PATTERN).matcher(THE_INPUT_STRING);
String[] results = new String[2];
if (m.find()) {
    results[0] = m.group(1);
    results[1] = m.group(2);
}
I haven't tested it, so there may be small, easy-to-fix errors.

For that single case
String[] strExample = str.split("^.+?\\\"/|\\\\\">.+<br/>|/");
will work. It will break if the string you want to parse changes much, though. More examples would help if there are other patterns you need to account for.

Related

Regex Redirect URL excludes token

I'm trying to create a redirect URL for my client. We have a service in which you specify "fromUrl" -> "toUrl" rules, and it uses a Java regex Matcher. But I can't get it to include the token when it converts the URL. For example:
/fromurl/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf
Should be:
/tourl/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf
but it excludes the token so the result I get is:
/fromurl/login/
/tourl/login/
I tried various regex patterns like: " ?.* and [%5E//?]+)/([^/?]+)/(?.*)?$ and (/*) etc" but none of them seems to work.
I'm not that familiar with regex. How can I solve this?
This can be easily done using simple string replace but if you insist on using regular expressions:
Pattern p = Pattern.compile("fromurl");
String originalUrlAsString = "/fromurl/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf";
String newRedirectedUrlAsString = p.matcher(originalUrlAsString).replaceAll("tourl");
System.out.println(newRedirectedUrlAsString);
If I understand you correctly you need something like this?
String from = "/my/old/url/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf";
String to = from.replaceAll("\\/(.*)\\/", "/my/new/url/");
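System.out.println(to); // /my/new/url/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf
This will replace everything between the first and the last forward slash, because .* is greedy. It therefore only works as long as the query string contains no literal forward slash (here they are all encoded as %2F).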
Can you describe more exactly what the original expression looks like? This is necessary because the regular expression is based on it.
Assuming that the first occurrence of fromurl should simply be replaced, you can use the following code:
String from = "/fromurl/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf";
String to = from.replaceFirst("fromurl", "tourl");
But if it is necessary to use more complex rules to determine the substring to replace, you can use:
String from = "/fromurl/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf";
String to = "";
String regularExpresion = "(<<pre>>)(fromurl)(<<pos>>)";
Pattern pattern = Pattern.compile(regularExpresion);
Matcher matcher = pattern.matcher(from);
if (matcher.matches()) {
    to = from.replaceAll(regularExpresion, "$1tourl$3");
}
NOTE: the <<pre>> and <<pos>> parts are placeholders, because I don't know the real expression for the URL.
NOTE 2: $1 and $3 refer to the first and the third group.
Although the existing answers should solve the issue, and some are similar, the solution below may also help. It uses a fairly simple regex (assuming your input has the same format as your example):
private static String replaceUrl(String inputUrl) {
    String regex = "/.*(/login\\?token=.*)";
    String toUrl = "/tourl";
    Pattern p = Pattern.compile(regex);
    Matcher matcher = p.matcher(inputUrl);
    if (matcher.find()) {
        return toUrl + matcher.group(1);
    } else {
        return null;
    }
}
You can write a test to check whether it works for other expected inputs and outputs, in case you need to change the format and adjust the regex:
String inputUrl = "/fromurl/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf";
String expectedUrl = "/tourl/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf";
if (expectedUrl.equals(replaceUrl(inputUrl))) {
    System.out.println("Success");
}
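If the rule really is just "swap the leading path segment and keep everything after it", an anchored pattern may be a safer sketch than the greedy ones above. The class and method names below are made up for illustration, and the shortened token in main is a stand-in for the real one. The lookahead only ensures that "fromurl" is a complete segment, so the rest of the URL, including the query string, is never touched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RedirectSketch {
    // Hypothetical rule: the leading segment "/fromurl" maps to "/tourl".
    // (?=[/?]|$) requires the segment to end at '/', '?' or end of input.
    private static final Pattern FROM = Pattern.compile("^/fromurl(?=[/?]|$)");

    static String redirect(String url) {
        // quoteReplacement keeps any '$' or '\' in the target literal.
        return FROM.matcher(url).replaceFirst(Matcher.quoteReplacement("/tourl"));
    }

    public static void main(String[] args) {
        // Shortened, made-up token for readability.
        System.out.println(redirect("/fromurl/login?token=abc%2Fdef%2Bghi"));
    }
}
```

URLs that do not start with the /fromurl segment come back unchanged, which makes it easy to tell whether the rule applied.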

How to remove an id out of a path using a Java Regex?

I am trying to get rid of an "id" in URI paths and I can only use Java regex transformation.
The paths look like this:
/web/service/1223345/add
/web/service/1223345/delete
/web/service/v2/1223345/add
/web/service/1223345
/web/service/do
The id is always a series of numbers. In the example above it is "1223345".
I have tried a couple of regexes but none of them worked. Here are my tries:
(/\w.*)/?[0-9]*/(.*)
([^0-9]+){0,}
(/.*/)[0-9]*(/.*)
Thanks for your help
String input = "/web/service/1223345/add";
System.out.println(input.replaceAll("/\\d*/","/"));
Output:
/web/service/add
If you want to remove the id, you could do the following:
String input = "/web/service/v2/1223345/add";
String removed = input.replaceAll("/\\d*/?", "/");
System.out.println(removed);
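Note that arnoud's regex "/\d*/" will not work for e.g. /web/service/1223345, because there is no slash after the id.
The question mark at the end of the regex takes care of such cases: "/\d*/?". For that path the result keeps a trailing slash: /web/service/
If, on the other hand, you want to extract the id: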
Pattern pattern = Pattern.compile(".*?/(\\d*?)(/.*)?$");
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {
    String id = matcher.group(1);
    System.out.println(id);
}
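A slightly stricter variant of the removal, as a sketch: require at least one digit and use a lookahead so that only whole numeric segments are dropped and no trailing slash is left behind. The class and method names are hypothetical. It assumes, as the question states, that the id is the only all-digit path segment.

```java
final class IdStripper {
    // "/\d+" must be followed by '/' or the end of the path, so "v2" or "123abc" are left alone.
    static String stripId(String path) {
        return path.replaceAll("/\\d+(?=/|$)", "");
    }

    public static void main(String[] args) {
        System.out.println(stripId("/web/service/1223345/add"));    // /web/service/add
        System.out.println(stripId("/web/service/v2/1223345/add")); // /web/service/v2/add
        System.out.println(stripId("/web/service/1223345"));        // /web/service
        System.out.println(stripId("/web/service/do"));             // /web/service/do
    }
}
```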

More efficient way splitting than this?

Is there a more efficient way of splitting a string than this?
String input = "=example>";
String[] split = input.split("=");
String[] split1 = split[1].split(">");
String result = split1[0];
The result would be "example".
String result = input.replaceAll("[=>]", "");
Very simple regex!
Do you really need a regex? You can do:
String result = input.substring(1, input.length()-1);
Otherwise if you really have a case for regex then use character class:
String result = input.replaceAll("[=>]", "");
If you just want to get example out of that do this:
input.substring(1, input.lastIndexOf(">"))
If your string definitely has a constant format, use substring; otherwise go for a regex:
String result = input.substring(1, input.length() - 1);
You can do it more elegantly with regex groups:
String sourceString = "=example>";
// When matching, we can "mark" a part of the matched pattern with parentheses...
String patternString = "=(.*?)>";
Pattern p = Pattern.compile(patternString);
Matcher m = p.matcher(sourceString);
m.find();
// ... and access it later
String result = m.group(1);
You can try this regex: ".*?((?:[a-z][a-z]+))"
But it would be better to use something like this:
String result = input.substring(1, input.length()-1);
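Try this (replaceAll is needed here, because String.replace treats its first argument as literal text, not as a regex):
String result = input.replaceAll("[\\W]", "");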
You can try this too
String input = "=example>";
System.out.println(input.replaceAll("[^\\p{L}\\p{Nd}]", ""));
This removes every character that is not a Unicode letter or decimal digit.
A regex would do the job perfectly well, but for future reference you could also use a third-party library such as Google's Guava. It adds a lot of functionality to a project, and its Splitter class is really helpful for problems like this one.
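On the efficiency question itself, a regex-free sketch using indexOf avoids both the intermediate arrays created by split and the pattern compilation done by replaceAll. The class and method names are hypothetical, and returning null for a missing delimiter is just one possible choice.

```java
final class Between {
    // Returns the text between the first 'open' and the next 'close' after it,
    // or null if either delimiter is missing.
    static String between(String input, char open, char close) {
        int start = input.indexOf(open);
        if (start < 0) {
            return null;
        }
        int end = input.indexOf(close, start + 1);
        if (end < 0) {
            return null;
        }
        return input.substring(start + 1, end);
    }

    public static void main(String[] args) {
        System.out.println(between("=example>", '=', '>')); // example
    }
}
```

Unlike substring(1, length - 1), this does not assume the delimiters sit at the very start and end of the string.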

regex or string parsing

I am trying to parse a string which has a specific pattern. An example valid string is as follows:
<STX><DATA><ETX>
<STX>A?123<ETX>
<STX><DATA><ETX>
<STX>name!xyz<ETX>
<STX>age!27y<ETX>
<STX></DATA><ETX>
<STX>A?234<ETX>
<STX><DATA><ETX>
<STX>name!abc<ETX>
<STX>age!24y<ETX>
<STX></DATA><ETX>
<STX>A?345<ETX>
<STX><DATA><ETX>
<STX>name!bac<ETX>
<STX>age!22y<ETX>
<STX></DATA><ETX>
<STX>OK<ETX>
<STX></DATA><ETX>
This data is sent by a device. All I need is to parse this string into records such as id: 123, name: xyz, age: 27y.
I am trying to use this regex:
final Pattern regex = Pattern.compile("<DATA>(.*?)</DATA>", Pattern.DOTALL);
This does output the required data:
<ETX>
<STX>A?123<ETX>
<STX><DATA><ETX>
<STX>name!xyz<ETX>
<STX>age!27y<ETX>
<STX>
How can I loop over the string repeatedly and copy all matches into a list of strings?
I am trying to loop over it and delete each extracted pattern, but nothing gets deleted.
final Pattern regex = Pattern.compile("<DATA>(.*?)</DATA>", Pattern.DOTALL);// Q?(.*?)
final StringBuffer buff = new StringBuffer(frame);
final Matcher matcher = regex.matcher(buff);
while (matcher.find())
{
    final String dataElements = matcher.group();
    System.out.println("Data:" + dataElements);
}
Are there any better ways to do this?
This is the output I am currently getting:
Data:<DATA><ETX><STX>A?123<ETX><STX><DATA><ETX><STX>name!xyz<ETX><STX>age!27y<ETX><STX> </DATA>
Data:<DATA><ETX><STX>name!abc<ETX><STX>age!24y<ETX><STX></DATA>
Data:<DATA><ETX><STX>name!bac<ETX><STX>age!22y<ETX><STX></DATA>
I am missing the A?234 and A?345 in the next two matches.
I really don't know exactly what you want to achieve with this, but if you want to remove the occurrences of that pattern, this line:
buff.toString().replace(dataElements, "")
doesn't look right. You are only editing a String copy of buff: replace returns a new String, which is discarded here, and buff itself is left untouched. You have to write the edited version back into buff (after converting it).
Using this regex solves my issue:
<STX>(A*)(.*?)<DATA>(.*?)</DATA>
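As a rough sketch of how the whole frame could be turned into a list of records, here is one way to loop with Matcher.find(). It assumes every record has an A?<digits> line followed by a name! line and an age! line, as in the sample; if a record could lack one of them, the lazy .*? would run on into the next record. The class name, method name and output format are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FrameRecords {
    // Groups: 1 = id after "A?", 2 = value after "name!", 3 = value after "age!".
    // DOTALL lets .*? cross line breaks between the <STX>...<ETX> lines.
    private static final Pattern RECORD = Pattern.compile(
            "<STX>A\\?(\\d+)<ETX>.*?<STX>name!(.*?)<ETX>.*?<STX>age!(.*?)<ETX>",
            Pattern.DOTALL);

    static List<String> parse(String frame) {
        List<String> records = new ArrayList<>();
        Matcher m = RECORD.matcher(frame);
        while (m.find()) { // each find() resumes after the previous match
            records.add("id:" + m.group(1) + " name:" + m.group(2) + " age:" + m.group(3));
        }
        return records;
    }

    public static void main(String[] args) {
        String frame = "<STX>A?123<ETX>\n<STX><DATA><ETX>\n<STX>name!xyz<ETX>\n"
                + "<STX>age!27y<ETX>\n<STX></DATA><ETX>";
        System.out.println(parse(frame)); // [id:123 name:xyz age:27y]
    }
}
```

Nothing has to be deleted from the input: find() advances on its own, so the original string can stay immutable.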

Regular Expression Search On String

I am having great trouble searching a string for particular parameters that are needed in my application. I am under the assumption that the only real way to do this is with regular expressions, but they are giving me a huge headache! I don't usually write them myself but get them from other websites; however, what I need isn't simple enough to be included :(
Here is the string:
10 50 u E2U+pstn:tel "!^(.*)$!tel:\\1;spn=42180;mcc=234;mnc=33!" .
I need to extract the spn, mcc, and mnc from this string. Unfortunately, the API I call changes the position of these within the string for some requests, which makes indexing into the string difficult. What I really need is to name the key I want, spn= for example, and then read the number that follows it, but nothing I have tried works.
I wouldn't use a regex; I would simply split:
String[] tokens = str.split(";");
int spn = -1; // stays -1 if no "spn=" token is found
for (int i = 0; i < tokens.length; i++) {
    if (tokens[i].startsWith("spn=")) {
        spn = Integer.parseInt(tokens[i].substring("spn=".length()));
    }
}
Of course, you could make this more object-oriented, or use a constant for "spn=".
A solution using Pattern and Matcher:
String s = "10 50 u E2U+pstn:tel \"!^(.*)$!tel:\\\\1;spn=42180;mcc=234;mnc=33!\"";
Pattern p = Pattern.compile("^.*spn=([0-9]+);mcc=([0-9]*);mnc=([0-9]*)!.*$");
Matcher matcher = p.matcher(s);
matcher.matches(); // true
String spn = matcher.group(1); // 42180
String mcc = matcher.group(2); // 234
String mnc = matcher.group(3); // 33
Edit: You can use named capturing groups (Java 7 and later), too:
Pattern p =
Pattern.compile("^.*spn=(?<spn>[0-9]+);mcc=(?<mcc>[0-9]*);mnc=(?<mnc>[0-9]*)!.*$");
Matcher matcher = p.matcher(s);
matcher.matches(); // true
String spn = matcher.group("spn");
String mcc = matcher.group("mcc");
String mnc = matcher.group("mnc");
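Because the question says the position of the parameters can change, here is a sketch that does not depend on their order: it scans for each key=digits pair and collects them into a map. The class and method names are hypothetical, and it assumes the values are plain decimal numbers small enough to fit in an int.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ParamExtractor {
    // Matches spn=, mcc= or mnc= followed by digits, wherever they appear.
    private static final Pattern PARAM = Pattern.compile("\\b(spn|mcc|mnc)=(\\d+)");

    static Map<String, Integer> parse(String record) {
        Map<String, Integer> values = new LinkedHashMap<>();
        Matcher m = PARAM.matcher(record);
        while (m.find()) {
            values.put(m.group(1), Integer.valueOf(m.group(2)));
        }
        return values;
    }

    public static void main(String[] args) {
        String s = "10 50 u E2U+pstn:tel \"!^(.*)$!tel:\\\\1;spn=42180;mcc=234;mnc=33!\" .";
        System.out.println(parse(s)); // {spn=42180, mcc=234, mnc=33}
    }
}
```

A key that is missing from the input is simply absent from the map, so callers can check with containsKey instead of catching an exception from a failed match.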
