Android Matcher.replaceAll/replaceFirst problem with groups count > 9

I've run into a problem with Matcher.replaceFirst/replaceAll when the regex contains more than 9 capturing groups.
A simple example:
String res = "abcdefghij".replaceFirst("(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)", "$1 $2 $3 $4 $5 $6 $7 $8 $9 $10");
The expected result is "a b c d e f g h i j", but I get "a b c d e f g h i a0".
The problem reproduces on the Android runtime, but in local unit tests with desktop Java it works correctly.
When I stepped through it in the debugger, I found the following code in the Android sources of the Matcher class:
private void appendEvaluated(StringBuffer buffer, String s) {
boolean escape = false;
boolean dollar = false;
boolean escapeNamedGroup = false;
int escapeNamedGroupStart = -1;
for (int i = 0; i < s.length(); i++) {
char c = s.charAt(i);
if (c == '\\' && !escape) {
escape = true;
} else if (c == '$' && !escape) {
dollar = true;
} else if (c >= '0' && c <= '9' && dollar) { //<<<<------ WHAT IS IT?!
buffer.append(group(c - '0'));
dollar = false;
} else if (c == '{' && dollar) {
escapeNamedGroup = true;
escapeNamedGroupStart = i;
} else if (c == '}' && dollar && escapeNamedGroup) {
String namedGroupName =
s.substring(escapeNamedGroupStart + 1, i);
buffer.append(group(namedGroupName));
dollar = false;
escapeNamedGroup = false;
} else if (c != '}' && dollar && escapeNamedGroup) {
continue;
} else {
buffer.append(c);
dollar = false;
escape = false;
escapeNamedGroup = false;
}
}
if (escape) {
throw new IllegalArgumentException("character to be escaped is missing");
}
if (dollar) {
throw new IllegalArgumentException("Illegal group reference: group index is missing");
}
if (escapeNamedGroup) {
throw new IllegalArgumentException("Missing ending brace '}' from replacement string");
}
}
This is part of the SDK for API 29. I checked API level 30 and it has the same code.
Has anyone already solved this problem?
I think I need to write a custom replacer that handles multi-digit group references correctly.
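A custom replacer along those lines could look like the sketch below. It is not taken from the Android or JDK sources: the class name GroupReplacer and its methods are made up for illustration. It reads as many digits after '$' as still name an existing group (so "$10" means group 10 when the pattern has at least 10 groups), and it relies only on find(), group(int), start(), end() and groupCount(). Named references such as ${name} are deliberately not handled here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupReplacer {
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Expands \x escapes and $N references for the current match of m.
    static String expand(Matcher m, String replacement) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < replacement.length()) {
            char c = replacement.charAt(i);
            if (c == '\\') {
                if (++i == replacement.length()) {
                    throw new IllegalArgumentException("character to be escaped is missing");
                }
                out.append(replacement.charAt(i++));
            } else if (c == '$') {
                i++;
                if (i == replacement.length() || !isDigit(replacement.charAt(i))) {
                    throw new IllegalArgumentException("Illegal group reference");
                }
                int group = replacement.charAt(i++) - '0';
                // Keep consuming digits while the longer number is still a valid group.
                while (i < replacement.length() && isDigit(replacement.charAt(i))) {
                    int next = group * 10 + (replacement.charAt(i) - '0');
                    if (next > m.groupCount()) {
                        break;
                    }
                    group = next;
                    i++;
                }
                String value = m.group(group); // throws if the group does not exist
                if (value != null) {
                    out.append(value);
                }
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    static String replaceFirst(String input, String regex, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return input;
        }
        return input.substring(0, m.start()) + expand(m, replacement) + input.substring(m.end());
    }

    static String replaceAll(String input, String regex, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start()).append(expand(m, replacement));
            last = m.end();
        }
        return out.append(input, last, input.length()).toString();
    }

    public static void main(String[] args) {
        String res = replaceFirst("abcdefghij", "(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)",
                "$1 $2 $3 $4 $5 $6 $7 $8 $9 $10");
        System.out.println(res); // prints: a b c d e f g h i j
    }
}
```

Because the replacement string is expanded by hand, the result does not depend on how the platform's own appendReplacement interprets "$10".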

Related

Denial of service: regular expression: Fortify pointed out an issue

Hi, I am getting a "Denial of Service: Regular Expression" warning on the line below:
billingApplicationAcctId = billingApplicationAcctId.replaceAll("\\" + s, "");
See the code below for further reference:
if (null != formatBillingAcctIdInd && formatBillingAcctIdInd.equals("Y")
&& billingApplicationCode.equalsIgnoreCase(EPWFReferenceDataConstants.BILLING_APPICATION_ID.KENAN.name())) {
Pattern pt = Pattern.compile("[^a-zA-Z0-9]");
Matcher match = pt.matcher(payment.getBillingApplicationAccntId());
while (match.find()) {
String s = match.group();
billingApplicationAcctId = billingApplicationAcctId.replaceAll("\\" + s, "");
}
}
What should I do instead of the above code so that I do not get the Fortify DoS warning?

If you want to get away from the regex code entirely, you can check the input character by character. Just replace
Pattern pt = Pattern.compile("[^a-zA-Z0-9]");
Matcher match = pt.matcher(payment.getBillingApplicationAccntId());
while (match.find()) {
String s = match.group();
billingApplicationAcctId = billingApplicationAcctId.replaceAll("\\" + s, "");
}
with:
String rawInput = payment.getBillingApplicationAccntId();
StringBuilder sb = new StringBuilder();
for (char c : rawInput.toCharArray()) {
// any char that is an english letter or 0-9 is included. The rest is thrown away...
if ((c >= 'a' && c <= 'z')
|| (c >= 'A' && c <= 'Z')
|| (c >= '0' && c <= '9')) {
sb.append(c);
}
}
billingApplicationAcctId = sb.toString();
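If a regex is still acceptable, another option is a single constant pattern instead of building a new pattern from each matched character. The sketch below assumes the warning is triggered by the pattern being assembled from input data ("\\" + s); whether Fortify accepts this form is an assumption, not something confirmed here. The class and method names are illustrative only.

```java
import java.util.regex.Pattern;

public class AccountIdCleaner {
    // Compiled once from a constant; no part of the pattern comes from the input.
    private static final Pattern NON_ALPHANUMERIC = Pattern.compile("[^a-zA-Z0-9]");

    // Removes every character that is not an English letter or a digit.
    static String clean(String rawInput) {
        return NON_ALPHANUMERIC.matcher(rawInput).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(clean("AB-12_34/x")); // prints: AB1234x
    }
}
```

This produces the same result as the character loop above, in one pass over the input.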

How do I handle punctuation in this Pig Latin translator?

The rest of the code is working perfectly but I cannot figure out how to prevent punctuation from being translated.
public class PigLatintranslator
{
public static String translateWord (String word)
{
String lowerCaseWord = word.toLowerCase ();
int pos = -1;
char ch;
for (int i = 0 ; i < lowerCaseWord.length () ; i++)
{
ch = lowerCaseWord.charAt (i);
if (isVowel (ch))
{
pos = i;
break;
}
}
if (pos == 0 && lowerCaseWord.length () != 1) //translates if word starts with vowel
{
return lowerCaseWord + "way"; // Adding "way" to the end of string
}
else if (lowerCaseWord.length () == 1) //Ignores words that are only 1 character
{
return lowerCaseWord;
}
else if (lowerCaseWord.charAt(0) == 'q' && lowerCaseWord.charAt(1) == 'u')//words that start with qu
{
String a = lowerCaseWord.substring (2);
return a + "qu" + "ay";
}
else
{
String a = lowerCaseWord.substring (1);
String b = lowerCaseWord.substring (0,1);
return a + b + "ay"; // Adding "ay" at the end of the extracted words after joining them.
}
}
public static boolean isVowel (char ch) // checks for vowel
{
if (ch == 'a' || ch == 'e' || ch == 'i' || ch == 'o' || ch == 'u' || ch == 'y')
{
return true;
}
return false;
}
}
I need the translation to ignore punctuation. For example, "Question?" should be translated to "estionquay?" (the question mark stays in the same position and is not translated).

As Andreas said, if the function expects only one word, it is the responsibility of the caller to make sure no full sentence or punctuation is passed to it. That said, if you need the translator to handle this, you have to find the index in the string where the punctuation or other non-letter character occurs. I added a main method to test the function:
public static void main(String[] args) {
System.out.println(translateWord("QUESTION?"));
}
I added a loop to the qu case to find where the punctuation starts. The two comparisons check whether the character at position i falls outside the range a-z, and the loop stops at the first such character. The substring then only extends to the point where the punctuation was found.
int i;
for (i = 0; i < lowerCaseWord.length(); i++) {
if(lowerCaseWord.charAt(i) > 'z' || lowerCaseWord.charAt(i) < 'a') {
break;
}
}
String a = lowerCaseWord.substring (2, i);
String b = lowerCaseWord.substring(i);
return a + "qu" + "ay" + b;
This may need some tweaking if you're worried about words with hyphens and the like, but it should get the basic idea across.
Here's the output I received:
$javac PigLatintranslator.java
$java -Xmx128M -Xms16M PigLatintranslator
estionquay?
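The same idea can be applied to every branch, not only the qu case, by splitting off the trailing non-letters before translating. The following is a sketch under that assumption: translateLetters is a hypothetical helper that restates the question's rules for a letters-only word, and only punctuation that follows the letters is handled.

```java
public class PigLatinPunctuation {
    static boolean isVowel(char ch) {
        return "aeiouy".indexOf(ch) >= 0;
    }

    // Same rules as the question's translateWord, for a word made of letters only.
    static String translateLetters(String w) {
        if (w.length() <= 1) {
            return w; // empty and one-character words are left alone
        }
        if (isVowel(w.charAt(0))) {
            return w + "way";
        }
        if (w.startsWith("qu")) {
            return w.substring(2) + "quay";
        }
        return w.substring(1) + w.charAt(0) + "ay";
    }

    // Translates the leading run of letters and re-attaches whatever follows it.
    static String translateWord(String word) {
        String lower = word.toLowerCase();
        int end = 0;
        while (end < lower.length() && lower.charAt(end) >= 'a' && lower.charAt(end) <= 'z') {
            end++;
        }
        return translateLetters(lower.substring(0, end)) + lower.substring(end);
    }

    public static void main(String[] args) {
        System.out.println(translateWord("QUESTION?")); // prints: estionquay?
        System.out.println(translateWord("hello,"));    // prints: ellohay,
    }
}
```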

TitleCaps program will not translate first character in a String correctly (why?)

See if you guys can solve this. I wrote a title caps program in Java, that is, a program which takes a string of ASCII characters and converts all words (substrings made up of only letters A-Z or a-z) to title case. So the string "##hello!_world$" becomes "##Hello!_World$". But this program refuses to handle non-letters at the first index of the string correctly, despite my best efforts to fix it.
public static String LetterCapitalize(String str) {
String newStr = "";
System.out.println(newStr);
for (int i = 0; i < str.length(); i++) {
// if first character is a letter and not uppercase
if (i == 0 && (!isUpperCase(str.charAt(i)))) {
Character m = (char) ((int) str.charAt(i) - 32);
newStr = newStr + m;
} // if first character is a letter and uppercase
else if (i == 0 && (isUpperCase(str.charAt(i)))) {
Character m = str.charAt(i);
newStr = newStr + m;
} // if first character is not a letter
else if (i == 0 && (!isLetter(str.charAt(i)))) {
Character m = str.charAt(i);
newStr = newStr + m + m;
} // if character is first letter in a word
else if (!isLetter(str.charAt(i - 1)) && isLetter(str.charAt(i)) && !isUpperCase(str.charAt(i))) {
Character m = (char) ((int) str.charAt(i) - 32);
newStr = newStr + m;
} // all other
else {
Character m = str.charAt(i);
newStr = newStr + m;
}
}
return newStr;
}
public static boolean isUpperCase(char c) {
boolean isCap;
if (c >= 'A' && c <= 'Z') {
isCap = true;
} else {
isCap = false;
}
return isCap;
}
public static boolean isLetter(char c) {
boolean isLetter;
if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {
isLetter = true;
} else {
isLetter = false;
}
return isLetter;
}

This line is wrong: if (i == 0 && (!isUpperCase(str.charAt(i)))) {. It assumes the character is a lowercase letter, but not every character that isn't uppercase is a letter. You need to check that it really is a lowercase letter, with something like
if (i == 0 && isLetter(str.charAt(i)) && !isUpperCase(str.charAt(i)))
The other way to do it is to put the non-letter check first.
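One way to avoid the special cases for index 0 altogether is to track whether the previous character was a letter. This is a sketch, not the asker's code: the class name TitleCaps and the isLowerCase helper are introduced here for illustration, and like the original it only upper-cases word starts without lower-casing the rest.

```java
public class TitleCaps {
    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    static boolean isLowerCase(char c) {
        return c >= 'a' && c <= 'z';
    }

    static String letterCapitalize(String str) {
        StringBuilder out = new StringBuilder(str.length());
        boolean previousIsLetter = false; // start of string counts as a word boundary
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (!previousIsLetter && isLowerCase(c)) {
                out.append((char) (c - 32)); // ASCII lowercase to uppercase
            } else {
                out.append(c);
            }
            previousIsLetter = isLetter(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(letterCapitalize("##hello!_world$")); // prints: ##Hello!_World$
    }
}
```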

How to use PropertyResourceBundle with keys containing whitespaces

I want to use properties files through PropertyResourceBundle for i18n. My current issue is that keys in the files I have can include white space, e.g.:
key number 1 = value number 1
key2 = value2
So when I load the corresponding properties file, the first white space is used as the key-value delimiter instead of the '=' sign.
My questions are: how can I use a key with white space in it without modifying the properties file (I'd like to avoid adding any backslash or Unicode escape)? Is there any way to override the default properties file delimiter so that '=' is the only one considered?

You will have to write your own Properties class, because the one in the JDK treats white space as a separator. Here is its code. You'll see that as soon as it encounters an unescaped white space character, it ends the key and starts the value.
private void load0 (LineReader lr) throws IOException {
char[] convtBuf = new char[1024];
int limit;
int keyLen;
int valueStart;
char c;
boolean hasSep;
boolean precedingBackslash;
while ((limit = lr.readLine()) >= 0) {
c = 0;
keyLen = 0;
valueStart = limit;
hasSep = false;
//System.out.println("line=<" + new String(lineBuf, 0, limit) + ">");
precedingBackslash = false;
while (keyLen < limit) {
c = lr.lineBuf[keyLen];
//need check if escaped.
if ((c == '=' || c == ':') && !precedingBackslash) {
valueStart = keyLen + 1;
hasSep = true;
break;
} else if ((c == ' ' || c == '\t' || c == '\f') && !precedingBackslash) {
valueStart = keyLen + 1;
break;
}
if (c == '\\') {
precedingBackslash = !precedingBackslash;
} else {
precedingBackslash = false;
}
keyLen++;
}
while (valueStart < limit) {
c = lr.lineBuf[valueStart];
if (c != ' ' && c != '\t' && c != '\f') {
if (!hasSep && (c == '=' || c == ':')) {
hasSep = true;
} else {
break;
}
}
valueStart++;
}
String key = loadConvert(lr.lineBuf, 0, keyLen, convtBuf);
String value = loadConvert(lr.lineBuf, valueStart, limit - valueStart, convtBuf);
put(key, value);
}
}
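A replacement loader that treats only the first '=' as the delimiter could be sketched as follows. This is an illustration under stated assumptions, not a drop-in equivalent of java.util.Properties: the names EqualsOnlyProperties, parse and asBundle are made up, and backslash escapes, Unicode escapes and line continuations are not supported. The lines could come from something like Files.readAllLines.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.ResourceBundle;

public class EqualsOnlyProperties {
    // Splits each line at its first '=' only; white space inside the key is kept.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#") || trimmed.startsWith("!")) {
                continue; // blank line or comment
            }
            int eq = trimmed.indexOf('=');
            if (eq < 0) {
                continue; // no '=' on this line: skipped in this sketch
            }
            result.put(trimmed.substring(0, eq).trim(), trimmed.substring(eq + 1).trim());
        }
        return result;
    }

    // Wraps the parsed entries so they can be used where a ResourceBundle is expected.
    static ResourceBundle asBundle(Map<String, String> entries) {
        return new ResourceBundle() {
            @Override
            protected Object handleGetObject(String key) {
                return entries.get(key);
            }

            @Override
            public Enumeration<String> getKeys() {
                return Collections.enumeration(entries.keySet());
            }
        };
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("key number 1 = value number 1", "key2 = value2");
        ResourceBundle bundle = asBundle(parse(lines));
        System.out.println(bundle.getString("key number 1")); // prints: value number 1
    }
}
```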

Reading in text file gives ArrayIndexOutOfBoundsException

I am attempting to read this .txt file into my program (as an improvement over manual input) and I am having trouble converting my methods to accept the input file. I get an ArrayIndexOutOfBoundsException on the line "infix[--pos]='\0';".
class Functions {
void postfix(char infix[], char post[]) {
int position, und = 1;
int outposition = 0;
char topsymb = '+';
char symb;
Stack opstk = new Stack();
opstk.top = -1;
for (position = 0; (symb = infix[position]) != '\0'; position++) {
if (isoperand(symb))
post[outposition++] = symb;
else {
if (opstk.isempty() == 1)
und = 1;
else {
und = 0;
topsymb = opstk.pop();
}
while (und == 0 && precedence(topsymb, symb) == 1) {
post[outposition++] = topsymb;
if (opstk.isempty() == 1)
und = 1;
else {
und = 0;
topsymb = opstk.pop();
}
}// end while
if (und == 0)
opstk.push(topsymb);
if (und == 1 || (symb != ')'))
opstk.push(symb);
else
topsymb = opstk.pop();
}// end else
}// end for
while (opstk.isempty() == 0)
post[outposition++] = opstk.pop();
post[outposition] = '\0';
}// end postfix function
int precedence(char topsymb, char symb) {
/* check precedence and return 0 or 1 */
if (topsymb == '(')
return 0;
if (symb == '(')
return 0;
if (symb == ')')
return 1;
if (topsymb == '$' && symb == '$')
return 0;
if (topsymb == '$' && symb != '$')
return 1;
if (topsymb != '$' && symb == '$')
return 0;
if ((topsymb == '*' || topsymb == '/') && (symb != '$'))
return 1;
if ((topsymb == '+' || topsymb == '-') && (symb == '-' || symb == '+'))
return 1;
if ((topsymb == '+' || topsymb == '-') && (symb == '*' || symb == '/'))
return 0;
return 1;
} /* end precedence function */
private boolean isoperand(char symb) {
/* Return 1 if symbol is digit and 0 otherwise */
if (symb >= '0' && symb <= '9')
return true;
else
return false;
}/* end isoperand function */
}
public class Driver {
public static void main(String[] args) throws IOException {
Functions f = new Functions();
char infix[] = new char[80];
char post[] = new char[80];
int pos = 0;
char c;
System.out.println("\nEnter an expression is infix form : ");
try {
BufferedReader in = new BufferedReader(new FileReader("infix.txt"));
String str;
while ((str = in.readLine()) != null) {
infix = str.toCharArray();
}
in.close();
} catch (IOException e) {
}
infix[--pos] = '\0';
System.out.println("The original infix expression is : ");
for (int i = 0; i < pos; i++)
System.out.print(infix[i]);
f.postfix(infix, post);
System.out.println("\nThe postfix expression is : ");
for (int i = 0; post[i] != '\0'; i++)
System.out.println(post[i]);
}
}

You should never do this:
try {
...
} catch (IOException e) {
}
You lose essential information about what happened while your code was running.
At least print the stack trace so that you can investigate:
e.printStackTrace();
You may be getting a FileNotFoundException.
In addition, infix[--pos] tries to index your array at -1, because pos is set to 0 before this statement.

1) Totally aside, but I think the line in main should read: System.out.println("\nEnter an expression in infix form : ");
2) I also agree about the catch statement. You have already narrowed it down to an IOException, but you can find out much more by printing either of the following inside the catch:
System.err.println(e.getMessage()); or e.printStackTrace();
Now to answer your question. You are initializing pos to 0, but then you do a PRE-DECREMENT on the line infix[--pos] = '\0'; so pos becomes -1 (clearly outside the array bounds).
I think you want to change that to a post-decrement, infix[pos--] = '\0';. Perhaps? Note that pos is never updated while the file is read, so it does not point at the end of the expression in either case.
And yes, your code DOES look like C...
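Since postfix() stops at a '\0' sentinel, one way to avoid the index arithmetic on pos is to build the terminated array directly from the line that was read. The sketch below is an assumption about what the asker wants: the class InfixReader and its methods are hypothetical, it reads only the first line, and in the real program the Reader would be a FileReader for infix.txt.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class InfixReader {
    // Copies the line into an array one element longer and ends it with '\0'.
    static char[] terminate(String line) {
        char[] infix = new char[line.length() + 1];
        line.getChars(0, line.length(), infix, 0);
        infix[line.length()] = '\0';
        return infix;
    }

    // Reads the first line of the source and returns it as a '\0'-terminated array.
    static char[] readExpression(Reader source) throws IOException {
        try (BufferedReader in = new BufferedReader(source)) {
            String line = in.readLine();
            if (line == null) {
                throw new IOException("input is empty");
            }
            return terminate(line);
        }
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for new FileReader("infix.txt") here.
        char[] infix = readExpression(new StringReader("2+3*4\n"));
        for (int i = 0; infix[i] != '\0'; i++) {
            System.out.print(infix[i]);
        }
        System.out.println(); // the loop above printed: 2+3*4
    }
}
```

Letting the IOException propagate, instead of swallowing it in an empty catch block, also makes a missing file visible immediately.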
