I have a long text like below:
name="sessionValidity" value="2018-09-13T16:28:28Z" type="hidden"
name="shipBeforeDate" value= "2018-09-17" name="merchantReturnData"
value= "",name="shopperLocale" value="en_GB" name="skinCode"
value="CeprdxkMuQ" name="merchantSig"
value="X70xAkOaaAeWGxNgWnTJolmy6/FFoFaBD47IzyBYWf4="
Now I need to extract all the data stored in the value attributes.
Please help.
Parsing HTML with regex is usually the worst thing you could do. Detailed explanation here.
To parse and manipulate the data properly, you should consider using a real markup parser such as JAXB, jsoup, or any other.
Of course, it is a case-specific decision, and in your case maybe this will do the job:
private static List<String> extractValuesAsUselessList(String theString) {
    List<String> attributes = new ArrayList<>();
    if (theString != null && !theString.isEmpty()) {
        // Match name="value" pairs, tolerating whitespace around '=' as in the sample input.
        Matcher matcher = Pattern.compile("([\\w:-]+)\\s*=\\s*\"(.*?)\"").matcher(theString);
        while (matcher.find()) {
            if ("value".equals(matcher.group(1))) {
                attributes.add(matcher.group(2));
            }
        }
    }
    return attributes;
}
Say I have the following string in a variable:
cookie-one=someValue;HttpOnly;Secure;Path=/;SameSite=none, cookie-two=someOtherValue;Path=/;Secure;HttpOnly, cookie-three=oneMoreValue;Path=/;Secure
and I want to extract the substring for a cookie whose name I choose, say cookie-two, running from the cookie's name to the end of its attributes.
So basically I need
cookie-two=someOtherValue;Path=/;Secure;HttpOnly
How can I get this substring out?
You can split the String on commas first to separate the cookies. For example, if you wanted just the cookie named cookie-two:
String s = "cookie-one=someValue;HttpOnly;Secure;Path=/;SameSite=none, cookie-two=someOtherValue;Path=/;Secure;HttpOnly, cookie-three=oneMoreValue;Path=/;Secure";
String[] cookies = s.split(",");
for (String cookie : cookies) {
    // Trim once so the printed entry has no leading space.
    String trimmed = cookie.trim();
    if (trimmed.startsWith("cookie-two")) {
        System.out.println(trimmed);
    }
}
This can be achieved in several different ways, depending on how the data might vary in the string. For your specific example we could, for instance, do it like this:
String cookieString = "cookie-one=someValue;HttpOnly;Secure;Path=/;SameSite=none, cookie-two=someOtherValue;Path=/;Secure;HttpOnly, cookie-three=oneMoreValue;Path=/;Secure";
String result = "";
for (String s : cookieString.split(", ")) {
    if (s.startsWith("cookie-two")) {
        result = s;
        break;
    }
}
We could also use regex and/or streams to make the code look nicer, but this is probably one of the most straightforward ways of achieving what you want.
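For completeness, here is roughly what a streams version might look like. This is only a sketch: the class name CookieFinder and the method find are made up for illustration, and it assumes no attribute value contains a comma (an Expires date would break the split).

```java
import java.util.Arrays;
import java.util.Optional;

public class CookieFinder {
    // Returns the full "name=value;attr;..." entry for the named cookie, if present.
    // Assumption: entries are separated by commas and no attribute contains a comma.
    public static Optional<String> find(String header, String name) {
        return Arrays.stream(header.split(","))
                .map(String::trim)
                // Appending "=" avoids matching a cookie whose name merely starts with `name`.
                .filter(entry -> entry.startsWith(name + "="))
                .findFirst();
    }

    public static void main(String[] args) {
        String header = "cookie-one=someValue;HttpOnly;Secure;Path=/;SameSite=none, "
                + "cookie-two=someOtherValue;Path=/;Secure;HttpOnly, "
                + "cookie-three=oneMoreValue;Path=/;Secure";
        System.out.println(find(header, "cookie-two").orElse("not found"));
    }
}
```

Returning an Optional makes the "cookie not present" case explicit instead of relying on an empty string.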
I have an unformatted string like this:
Tabs,[
{ tab1 = {
Title = tab1name
}
}
{ tab2 = {
Title = tab2name
}
}
{ tab3 = {
Title = tab3name
}
}
]
I need to parse this string and get the titles from it.
Is there any way to do this, similar to JSON parsing?
Any help, please.
Your question is a bit unclear: are you trying to parse source code, or are you trying to parse the elements within the Tab[] object? If you're looking into this for a serious project, I'd recommend looking into something like CUP. If it's something simpler and you merely need specific information from a collection of strings, you can use a variety of string methods. For instance:
replace()
split()
substring()
toUpperCase()
etc...
You can find more in the documentation here; I'd recommend reading it, as it may help you answer this and future questions.
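If all you need are the titles, a small regex may be enough. The following is just a sketch under the assumption that every title sits on its own "Title = ..." line, as in the sample; the class name TitleExtractor is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TitleExtractor {
    // Captures everything after "Title =" up to the end of the line or a closing brace.
    private static final Pattern TITLE = Pattern.compile("Title\\s*=\\s*([^\\r\\n}]+)");

    public static List<String> titles(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = TITLE.matcher(input);
        while (m.find()) {
            result.add(m.group(1).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "Tabs,[\n{ tab1 = {\nTitle = tab1name\n}\n}\n{ tab2 = {\nTitle = tab2name\n}\n}\n]";
        System.out.println(titles(input));
    }
}
```

This ignores the nesting entirely, so it will not tell you which tab a title belongs to; for that you would need a real parser.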
I was trying to utilize Simple Java XML Parser (SJXP) but ran into problems with the XML I need to parse into a data class.
Data.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE SYSTEM "local-1.2.2.dtd" >
<data>
<article>
<field name="name" type="text">HOTELS</field>
<field name="name_de" type="text">HOTELS</field>
<field name="name_zh" type="text">酒店</field>
<field name="color" type="text">6,68,109,0.85</field>
<field name="textcolor" type="text">255,255,255,1</field>
</article>
<!-- 20000 more articles ... -->
</data>
I was trying to do it that way:
XmlHandler.java
private Map<String, Category> categoryMap;
private XMLParser<Category> categoryParser = new XMLParser<Category>(
    new DefaultRule<Category>(IRule.Type.CHARACTER, "/data/article") {
        @Override
        public void handleParsedCharacters(XMLParser<Category> parser, String text, Category category) {
            Log.d("XmlHandler", "Hello");
            if (category != null) {
                categoryMap.put(category.getName(Category.LANG.EN), category);
                Log.d("XmlHandler", "Saved category to map. New count="+categoryMap.values().size());
                category.reset();
            } else {
                category = new Category();
                Log.d("XmlHandler", "Creating a new category");
            }
        }
    },
    new CategoryNameRule()
);
private class CategoryNameRule extends DefaultRule<Category> {
    private String nameKey = "";
    public CategoryNameRule() {
        super(Type.ATTRIBUTE, "/data/article/field", "name");
    }
    @Override
    public void handleParsedAttribute(XMLParser parser, int index,
            String value, Category category) {
        nameKey = value;
    }
    @Override
    public void handleParsedCharacters(XMLParser parser, String text, Category category) {
        Log.d("XmlHandler", "Handling nameKey="+nameKey);
        if (nameKey == null || nameKey.length() == 0) {
            return;
        } else if ("name".equals(nameKey)) {
            category.setName(Category.LANG.EN, text);
        } else if ("name_de".equals(nameKey)) {
            category.setName(Category.LANG.DE, text);
        } else if ("name_zh".equals(nameKey)) {
            category.setName(Category.LANG.ZH, text);
        } else if ("color".equals(nameKey)) {
            category.colorBackground = getConvertedColor(text);
        } else if ("textcolor".equals(nameKey)) {
            category.colorForeground = getConvertedColor(text);
        }
    }
}
The problem is that my HashMap turns out empty after the whole document is parsed, and I don't know why. My guess is that I need a combination of IRule.Type.CHARACTER and IRule.Type.ATTRIBUTE, but I don't know how to achieve that.
Any ideas/experience with that?
Stefan,
I am so sorry for missing this question (just found it by accident via Google while I was searching for something else).
There are a few points of confusion here, so let me clarify, and then outline how I would recommend solving this (assuming you didn't find a way, though I realize this was 3 months ago).
First, you are correct, you need a combination of CHARACTER rules and ATTRIBUTE rules. The CHARACTER rules will give you the content between tags, for example:
<tag>this is CHARACTER data</tag>
Second, your rules should target the tags that contain the data you need. In your example above, it looks like you don't get any differentiating data until you reach the /data/article/field level (individual fields contain both the ATTRIBUTES and the CHARACTERS that you want).
It does look like you key off of <article> open tags to tell when you have entered a new article, so you know you are collecting field information for a specific, unique article. In that case you can use TAG rules, which fire when the parser hits a start tag, so you can run some logic such as creating a new record in your HashMap for the soon-to-be-parsed article.
Lastly, the Category argument you have being passed to the handlers is a pass-through user-variable.
What that means is you can call the Parse method:
XMLParser<List<Article>> p = new XMLParser<List<Article>>(... stuff ...);
List<Article> articleList = new ArrayList<Article>();
p.parse(input, articleList);
This allows ALL your handlers to have direct access to your articleList so they can parse and store information directly in it; when the call to parse(...) returns, you know your list is current and updated.
If you do not pass anything to the parse method in the userObject field, then the handlers will all receive a null argument.
Your null check on Category confused me and made me think you were expecting a changing value there each time the handlers were called, which isn't the case. I just wanted to clarify that.
Summary
I think your perfect parser design would include 3 rules:
/data/article TAG rule -- do something when a START article tag is encountered (and optionally when a CLOSE article tag is).
/data/article/field CHARACTER rule -- store the parsed character data for the field.
/data/article/field ATTRIBUTE rule -- store the parsed attribute data for the field.
I hope that helps!
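To make the three-rule flow concrete without depending on SJXP, here is a sketch of the same idea using the JDK's built-in StAX reader (javax.xml.stream). To be clear, this is not SJXP's API; it is a swapped-in stand-in that mirrors the logic. The class name ArticleFieldReader and the choice of one Map per article are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class ArticleFieldReader {
    // Returns one map per <article>, keyed by each field's "name" attribute.
    public static List<Map<String, String>> read(String xml) {
        List<Map<String, String>> articles = new ArrayList<>();
        Map<String, String> current = null;
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String tag = reader.getLocalName();
                    if ("article".equals(tag)) {
                        // "TAG rule": a start tag opens a fresh record.
                        current = new LinkedHashMap<>();
                    } else if ("field".equals(tag) && current != null) {
                        // "ATTRIBUTE rule" and "CHARACTER rule" for the same field.
                        String key = reader.getAttributeValue(null, "name");
                        String text = reader.getElementText();
                        current.put(key, text);
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "article".equals(reader.getLocalName())) {
                    // Close tag: the record is complete, so store it.
                    articles.add(current);
                    current = null;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return articles;
    }
}
```

The point carried over from the rules above is that the record is created on the article start tag and only stored once the article closes, rather than being created inside a character handler.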
I am looking for a nice solution. I've got a couple of text fields on my page and I am sending them via Ajax using jQuery's serialize method. The serialized string is parsed in my Java method into a HashMap with key = 'nameOfTextfield' and value = 'valueInTextfield'.
For example, I've got this String stdSel=value1&stdNamText=value2&stdRevText=value3 and everything works fine.
String[] sForm = serializedForm.split("&");
Map<String, String> fForm = new HashMap<String, String>();
for (String part : sForm) {
    String key = null;
    String value = null;
    try {
        key = part.split("=")[0];
        value = part.split("=",2)[1];
        fForm.put(key, value);
    //if textfield is empty
    } catch(IndexOutOfBoundsException e) {
        fForm.put(key, "");
    }
}
But this method breaks when an ampersand appears in a text field, for example stdSel=value1&stdNamText=value2&stdRevText=val&&ue3. My thought was to replace the ampersand separator in the serialized string with some other character, or maybe several characters. Is that possible and a good idea, or is there a better way?
Regards
Ondrej
Ampersands are escaped by the serialize function, so they don't break the URL.
What you need to unescape a field you got from a URL is
value = URLDecoder.decode(value,"UTF-8");
But, as was pointed out by... Pointy, if you're using a web framework rather than plain java.net, you probably don't have to do this.
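Putting the two pieces together, a parsing routine could look something like the sketch below. The class name FormParser and the decision to skip empty segments are my own assumptions; the important part is that splitting happens before decoding, so an escaped ampersand (%26) inside a value never acts as a separator.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class FormParser {
    public static Map<String, String> parse(String serializedForm) {
        Map<String, String> form = new LinkedHashMap<>();
        if (serializedForm == null || serializedForm.isEmpty()) {
            return form;
        }
        for (String part : serializedForm.split("&")) {
            if (part.isEmpty()) {
                continue; // assumption: ignore stray separators
            }
            // Limit 2 keeps any further '=' inside the value.
            String[] kv = part.split("=", 2);
            String value = kv.length > 1 ? decode(kv[1]) : "";
            form.put(decode(kv[0]), value);
        }
        return form;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support.
            throw new IllegalStateException(e);
        }
    }
}
```

An empty text field arrives as "name=", which the limit-2 split turns into an empty value, so no exception handling is needed for that case.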
Given a string like so:
Hello {FIRST_NAME}, this is a personalized message for you.
Where FIRST_NAME is an arbitrary token (a key in a map passed to the method), I need to write a routine which would turn that string into:
Hello Jim, this is a personalized message for you.
given a map with an entry FIRST_NAME -> Jim.
It would seem that StringTokenizer is the most straightforward approach, but the Javadocs say you should prefer the regex approach. How would you do that in a regex-based solution?
Thanks everyone for the answers!
Gizmo's answer was definitely out of the box, and a great solution, but unfortunately not appropriate as the format can't be limited to what the Formatter class does in this case.
Adam Paynter really got to the heart of the matter, with the right pattern.
Peter Nix and Sean Bright had a great workaround to avoid all of the complexities of the regex, but I needed to raise some errors if there were bad tokens, which that didn't do.
But in terms of both doing a regex and a reasonable replace loop, this is the answer I came up with (with a little help from Google and the existing answer, including Sean Bright's comment about how to use group(1) vs group()):
private static Pattern tokenPattern = Pattern.compile("\\{([^}]*)\\}");
public static String process(String template, Map<String, Object> params) {
    StringBuffer sb = new StringBuffer();
    Matcher myMatcher = tokenPattern.matcher(template);
    while (myMatcher.find()) {
        String field = myMatcher.group(1);
        myMatcher.appendReplacement(sb, "");
        sb.append(doParameter(field, params));
    }
    myMatcher.appendTail(sb);
    return sb.toString();
}
Where doParameter gets the value out of the map and converts it to a string and throws an exception if it isn't there.
Note also I changed the pattern to find empty braces (i.e. {}), as that is an error condition explicitly checked for.
EDIT: Note that appendReplacement is not agnostic about the content of the string. Per the javadocs, it recognizes $ and backslash as a special character, so I added some escaping to handle that to the sample above. Not done in the most performance conscious way, but in my case it isn't a big enough deal to be worth attempting to micro-optimize the string creations.
Thanks to the comment from Alan M, this can be made even simpler to avoid the special character issues of appendReplacement.
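For anyone who wants something they can paste and run, here is a self-contained sketch of the loop above together with one possible doParameter. The exception type and messages are assumptions on my part, as is the class name TemplateProcessor; the original only says the helper throws when a token is missing.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TemplateProcessor {
    private static final Pattern TOKEN = Pattern.compile("\\{([^}]*)\\}");

    public static String process(String template, Map<String, Object> params) {
        StringBuffer sb = new StringBuffer();
        Matcher m = TOKEN.matcher(template);
        while (m.find()) {
            String field = m.group(1);
            // Appending "" copies the text before the token; the value is then
            // appended directly, so '$' and '\' in it are never interpreted.
            m.appendReplacement(sb, "");
            sb.append(doParameter(field, params));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Hypothetical helper: looks up the token and fails loudly on bad input.
    static String doParameter(String field, Map<String, Object> params) {
        if (field.isEmpty()) {
            throw new IllegalArgumentException("Empty token {}");
        }
        Object value = params.get(field);
        if (value == null) {
            throw new IllegalArgumentException("Unknown token: " + field);
        }
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        params.put("FIRST_NAME", "Jim");
        System.out.println(process("Hello {FIRST_NAME}, this is a personalized message for you.", params));
    }
}
```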
Well, I would rather use String.format(), or better MessageFormat.
text.replaceAll("\\{FIRST_NAME\\}", actualName); // braces must be escaped because replaceAll takes a regex
Check out the javadocs for it here.
Try this:
Note: The author's final solution builds upon this sample and is much more concise.
public class TokenReplacer {
    private Pattern tokenPattern;
    public TokenReplacer() {
        tokenPattern = Pattern.compile("\\{([^}]+)\\}");
    }
    public String replaceTokens(String text, Map<String, String> valuesByKey) {
        StringBuilder output = new StringBuilder();
        Matcher tokenMatcher = tokenPattern.matcher(text);
        int cursor = 0;
        while (tokenMatcher.find()) {
            // A token is defined as a sequence of the format "{...}".
            // A key is defined as the content between the brackets.
            int tokenStart = tokenMatcher.start();
            int tokenEnd = tokenMatcher.end();
            int keyStart = tokenMatcher.start(1);
            int keyEnd = tokenMatcher.end(1);
            output.append(text.substring(cursor, tokenStart));
            String token = text.substring(tokenStart, tokenEnd);
            String key = text.substring(keyStart, keyEnd);
            if (valuesByKey.containsKey(key)) {
                String value = valuesByKey.get(key);
                output.append(value);
            } else {
                output.append(token);
            }
            cursor = tokenEnd;
        }
        output.append(text.substring(cursor));
        return output.toString();
    }
}
With import java.util.regex.*:
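Pattern p = Pattern.compile("\\{([^{}]*)\\}"); // braces must be escaped in a Java regex
Matcher m = p.matcher(line); // line being "Hello, {FIRST_NAME}..."
StringBuffer out = new StringBuffer();
while (m.find()) {
    String key = m.group(1);
    // Leave the token unchanged when the map has no entry for it.
    String value = map.containsKey(key) ? map.get(key) : m.group();
    // quoteReplacement keeps '$' and '\' in the value literal.
    m.appendReplacement(out, Matcher.quoteReplacement(value));
}
m.appendTail(out);
String result = out.toString();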
So, the regex is recommended because it can easily identify the places that require substitution in the string, as well as extracting the name of the key for substitution. It's much more efficient than breaking the whole string.
You'll probably want to loop with the Matcher line inside and the Pattern line outside, so you can replace all lines. The pattern never needs to be recompiled, and it's more efficient to avoid doing so unnecessarily.
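A rough sketch of that arrangement: the Pattern is compiled once as a constant and a new Matcher is created for each line. The class name LineRenderer, the render method, and the choice to leave unknown tokens untouched are my assumptions; Matcher.replaceAll with a function argument needs Java 9 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineRenderer {
    // Compiled once and shared by every call.
    private static final Pattern TOKEN = Pattern.compile("\\{([^{}]*)\\}");

    public static List<String> render(List<String> lines, Map<String, String> values) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            // Matchers are cheap; only the Pattern is worth reusing.
            Matcher m = TOKEN.matcher(line);
            out.add(m.replaceAll(r -> Matcher.quoteReplacement(
                    // Unknown keys fall back to the original "{KEY}" text.
                    values.getOrDefault(r.group(1), r.group()))));
        }
        return out;
    }
}
```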
The most straightforward approach would seem to be something along the lines of this:
public static void main(String[] args) {
    String tokenString = "Hello {FIRST_NAME}, this is a personalized message for you.";
    Map<String, String> tokenMap = new HashMap<String, String>();
    tokenMap.put("{FIRST_NAME}", "Jim");
    String transformedString = tokenString;
    for (String token : tokenMap.keySet()) {
        transformedString = transformedString.replace(token, tokenMap.get(token));
    }
    System.out.println("New String: " + transformedString);
}
It loops through all your tokens and replaces every token with what you need, and uses the standard String method for replacement, thus skipping the whole RegEx frustrations.
Depending on how ridiculously complex your string is, you could try using a more serious string templating language, like Velocity. In Velocity's case, you'd do something like this:
Velocity.init();
VelocityContext context = new VelocityContext();
context.put( "name", "Bob" );
StringWriter output = new StringWriter();
// Velocity references use a '$' prefix, so the template refers to $name.
Velocity.evaluate( context, output, "",
    "Hello, $name, this is a personalized message for you.");
System.out.println(output.toString());
But that is likely overkill if you only want to replace one or two values.
import java.util.HashMap;
public class ReplaceTest {
    public static void main(String[] args) {
        HashMap<String, String> map = new HashMap<String, String>();
        map.put("FIRST_NAME", "Jim");
        map.put("LAST_NAME", "Johnson");
        map.put("PHONE", "410-555-1212");
        String s = "Hello {FIRST_NAME} {LAST_NAME}, this is a personalized message for you.";
        for (String key : map.keySet()) {
            // replace() is literal, so '$' or '\' in a value cannot break the substitution.
            s = s.replace("{" + key + "}", map.get(key));
        }
        System.out.println(s);
    }
}
The docs mean that you should prefer writing a regex-based tokenizer, IIRC. What might work better for you is a standard regex search-replace.
Generally we'd use MessageFormat in a case like this, coupled with loading the actual message text from a ResourceBundle. This gives you the added benefit of being globalization (g11n) friendly.
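A minimal sketch of that combination follows. To keep it self-contained, the bundle is an in-memory ListResourceBundle; in a real application the text would more likely come from a properties file loaded with ResourceBundle.getBundle. The names GreetingDemo, Messages and the key "greeting" are made up for illustration. Note that MessageFormat uses numbered placeholders such as {0} rather than named tokens.

```java
import java.text.MessageFormat;
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

public class GreetingDemo {
    // Hypothetical in-memory bundle standing in for a messages.properties file.
    static class Messages extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                {"greeting", "Hello {0}, this is a personalized message for you."}
            };
        }
    }

    public static String greet(String firstName) {
        ResourceBundle bundle = new Messages();
        // {0} is replaced by the first argument after the pattern.
        return MessageFormat.format(bundle.getString("greeting"), firstName);
    }

    public static void main(String[] args) {
        System.out.println(greet("Jim"));
    }
}
```

One thing to watch for: MessageFormat treats a single quote as a quoting character, so a literal apostrophe in the message text has to be written as two single quotes.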