How to use Java's regex to split nested mathematical equations

I was curious how it would be possible to split mathematical equations with parentheses meaningfully using Java's String regex support. It's hard to explain without an example, so one is below.
A generic solution pattern would be appreciated, rather than one which just works for the example provided below.
String s = "(5 + 6) + (2 - 18)";
// I want to split this string via the regex pattern of "+",
// (but only the non-nested ones)
// with the result being [(5 + 6), (2 - 18)]
s.split("\\+"); // Won't work, this will split via every plus.
What I'm mainly looking for is first-level splitting. I want a regex check to see whether a symbol like "+" or "-" is nested in any form: if it is, don't split on it; if it isn't, split on it. Nesting can be in the form of () or [].
Thank you.

Unfortunately, this can't be done with regex; you need a library like JEP.

If you don't expect to split nested expressions like ((6 + 5)-4), I have a pretty simple function that splits the expressions without using regular expressions:
// Requires: import java.util.ArrayList;
public static String[] subExprs(String expr) {
/* Actual logic to split the expression */
int fromIndex = 0;
int subExprStart;
ArrayList<String> subExprs = new ArrayList<String>();
// Pair each "(" with the next ")"; nested parentheses are not handled
while ((subExprStart = expr.indexOf("(", fromIndex)) != -1) {
int subExprEnd = expr.indexOf(")", subExprStart);
if (subExprEnd == -1) {
break; // unmatched "(": stop instead of looping forever
}
subExprs.add(expr.substring(subExprStart, subExprEnd + 1));
fromIndex = subExprEnd + 1;
}
/* Logic only for printing */
System.out.println("Original expression : " + expr);
System.out.println();
System.out.print("Sub expressions : [ ");
for (String string : subExprs) {
System.out.print(string + ", ");
}
System.out.print("]");
String[] subExprsArray = {};
return subExprs.toArray(subExprsArray);
}
Sample output :
Original expression : (a+b)+(5+6)+(57-6)
Sub expressions : [ (a+b), (5+6), (57-6), ]
EDIT
For the extra condition of also getting expressions enclosed in [], this code will handle expressions inside both () and [].
public static String[] subExprs(String expr) {
/* Actual logic to split the expression */
int fromIndex = 0;
ArrayList<String> subExprs = new ArrayList<String>();
while (true) {
int parenStart = expr.indexOf("(", fromIndex);
int bracketStart = expr.indexOf("[", fromIndex);
if (parenStart == -1 && bracketStart == -1)
break; // no more sub expressions
/* Check the type of current bracket: whichever opener comes first */
boolean isParenthesis = bracketStart == -1
|| (parenStart != -1 && parenStart < bracketStart);
/* Extract the sub expression */
int subExprStart = isParenthesis ? parenStart : bracketStart;
int subExprEnd = expr.indexOf(isParenthesis ? ")" : "]", subExprStart);
if (subExprEnd == -1)
break; // unmatched opener: stop instead of looping forever
subExprs.add(expr.substring(subExprStart, subExprEnd + 1));
fromIndex = subExprEnd + 1;
}
/* Logic only for printing */
System.out.println("Original expression : " + expr);
System.out.println();
System.out.print("Sub expressions : [ ");
for (String string : subExprs) {
System.out.print(string + ", ");
}
System.out.print("]");
String[] subExprsArray = {};
return subExprs.toArray(subExprsArray);
}
Sample Output :
Original expression : (a+b)+[5+6]+(57-6)-[a-b]+[c-d]
Sub expressions : [ (a+b), [5+6], (57-6), [a-b], [c-d], ]
Do suggest improvements in the code. :)

You can't know that you will never get more than one level of parentheses, and you can't analyze recursive syntax with a regular expression, by definition. You need to use or write a parser. Have a look around for Dijkstra's shunting-yard algorithm, a recursive descent expression parser, or a library that will do either.
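For the first-level splitting the question asks about, a full parser may be more than is needed. The following is a minimal sketch, not a complete expression parser: it walks the string once and keeps a nesting counter for () and []. The class and method names (TopLevelSplitter, splitTopLevel) are made up for illustration. It assumes the brackets are balanced and that every top-level + or - is a binary operator, so a leading unary minus would produce an empty first piece.

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelSplitter {
    // Splits expr on '+' and '-' that sit outside every () and [] pair.
    // Assumption: brackets are balanced; bracket types are not checked against each other.
    static List<String> splitTopLevel(String expr) {
        List<String> parts = new ArrayList<>();
        int depth = 0; // current nesting level
        int start = 0; // start index of the piece being collected
        for (int i = 0; i < expr.length(); i++) {
            char c = expr.charAt(i);
            if (c == '(' || c == '[') {
                depth++;
            } else if (c == ')' || c == ']') {
                depth--;
            } else if ((c == '+' || c == '-') && depth == 0) {
                // Operator at nesting level 0: close the current piece here
                parts.add(expr.substring(start, i).trim());
                start = i + 1;
            }
        }
        parts.add(expr.substring(start).trim());
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitTopLevel("(5 + 6) + (2 - 18)"));
        System.out.println(splitTopLevel("((6 + 5) - 4) + [2 - (3 + 1)]"));
    }
}
```

Because only the depth is tracked, deeper nesting such as ((6 + 5) - 4) stays in one piece, which the indexOf-based functions above cannot do.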

Related

JAVA How to check whether a string contains " / " three times or not [duplicate]

How can I count the number of times a particular string occurs in another string? For example, this is what I am trying to do in JavaScript:
var temp = "This is a string.";
alert(temp.count("is")); //should output '2'

The g in the regular expression (short for global) says to search the whole string rather than just find the first occurrence. This matches "is" twice:
var temp = "This is a string.";
var count = (temp.match(/is/g) || []).length;
console.log(count);
And, if there are no matches, it returns 0:
var temp = "Hello World!";
var count = (temp.match(/is/g) || []).length;
console.log(count);

/** Function that count occurrences of a substring in a string;
* @param {String} string The string
* @param {String} subString The sub string to search for
* @param {Boolean} [allowOverlapping] Optional. (Default:false)
*
* @author Vitim.us https://gist.github.com/victornpb/7736865
* @see Unit Test https://jsfiddle.net/Victornpb/5axuh96u/
* @see https://stackoverflow.com/a/7924240/938822
*/
function occurrences(string, subString, allowOverlapping) {
string += "";
subString += "";
if (subString.length <= 0) return (string.length + 1);
var n = 0,
pos = 0,
step = allowOverlapping ? 1 : subString.length;
while (true) {
pos = string.indexOf(subString, pos);
if (pos >= 0) {
++n;
pos += step;
} else break;
}
return n;
}
Usage
occurrences("foofoofoo", "bar"); //0
occurrences("foofoofoo", "foo"); //3
occurrences("foofoofoo", "foofoo"); //1
allowOverlapping
occurrences("foofoofoo", "foofoo", true); //2
Matches:
foofoofoo
1 `----´
2 `----´
Unit Test
https://jsfiddle.net/Victornpb/5axuh96u/
Benchmark
I ran a benchmark, and my function is more than 10 times
faster than the regexp match function posted by gumbo. My test
string is 25 characters long, with 2 occurrences of the character 'o'. I
executed it 1 000 000 times in Safari.
Safari 5.1
Benchmark> Total time execution: 5617 ms (regexp)
Benchmark> Total time execution: 881 ms (my function 6.4x faster)
Firefox 4
Benchmark> Total time execution: 8547 ms (regexp)
Benchmark> Total time execution: 634 ms (my function 13.5x faster)
Edit: changes I've made
cached substring length
added type-casting to string.
added optional 'allowOverlapping' parameter
fixed correct output for "" empty substring case.
Gist
https://gist.github.com/victornpb/7736865
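Since the thread title asks about Java, here is a rough Java sketch of the same indexOf-based counting idea. It is not part of the original answer; the class name SubstringCounter is hypothetical, and the empty-needle convention simply mirrors the JavaScript function above.

```java
class SubstringCounter {
    // Counts occurrences of needle in haystack using indexOf, without regex.
    // allowOverlapping mirrors the optional flag of the JavaScript function.
    static int occurrences(String haystack, String needle, boolean allowOverlapping) {
        if (needle.isEmpty()) {
            return haystack.length() + 1; // same convention as the JavaScript version
        }
        int count = 0;
        int step = allowOverlapping ? 1 : needle.length();
        int pos = haystack.indexOf(needle);
        while (pos >= 0) {
            count++;
            // Continue searching after (or one character into) the current match
            pos = haystack.indexOf(needle, pos + step);
        }
        return count;
    }
}
```

With this sketch, checking whether a string contains "/" exactly three times would be `SubstringCounter.occurrences(s, "/", false) == 3`.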

function countInstances(string, word) {
return string.split(word).length - 1;
}
console.log(countInstances("This is a string", "is"))

You can try this:
var theString = "This is a string.";
console.log(theString.split("is").length - 1);

My solution:
var temp = "This is a string.";
function countOccurrences(str, value) {
var regExp = new RegExp(value, "gi");
return (str.match(regExp) || []).length;
}
console.log(countOccurrences(temp, 'is'));

You can use match to define such function:
String.prototype.count = function(search) {
var m = this.match(new RegExp(search.toString().replace(/(?=[.\\+*?[^\]$(){}\|])/g, "\\"), "g"));
return m ? m.length:0;
}

Just code-golfing Rebecca Chernoff's solution :-)
alert(("This is a string.".match(/is/g) || []).length);

The non-regex version:
var string = 'This is a string',
searchFor = 'is',
count = 0,
pos = string.indexOf(searchFor);
while (pos > -1) {
++count;
pos = string.indexOf(searchFor, ++pos);
}
console.log(count); // 2

String.prototype.Count = function (find) {
return this.split(find).length - 1;
}
console.log("This is a string.".Count("is"));
This will return 2.

Here is the fastest function!
Why is it faster?
Doesn't check char by char (with 1 exception)
Uses a while and increments 1 var (the char count var) vs. a for loop checking the length and incrementing 2 vars (usually var i and a var with the char count)
Uses WAY less vars
Doesn't use regex!
Uses an (hopefully) highly optimized function
All operations are as combined as they can be, avoiding slowdowns due to multiple operations
String.prototype.timesCharExist=function(c){var t=0,l=0,c=(c+'')[0];while(l=this.indexOf(c,l)+1)++t;return t};
Here is a slower and more readable version:
String.prototype.timesCharExist = function ( chr ) {
var total = 0, last_location = 0, single_char = ( chr + '' )[0];
while( last_location = this.indexOf( single_char, last_location ) + 1 )
{
total = total + 1;
}
return total;
};
This one is slower because of the counter, long var names and misuse of 1 var.
To use it, you simply do this:
'The char "a" only shows up twice'.timesCharExist('a');
Edit: (2013/12/16)
DON'T use with Opera 12.16 or older! It will take almost 2.5x longer than the regex solution!
On chrome, this solution will take between 14ms and 20ms for 1,000,000 characters.
The regex solution takes 11-14ms for the same amount.
Using a function (outside String.prototype) will take about 10-13ms.
Here is the code used:
String.prototype.timesCharExist=function(c){var t=0,l=0,c=(c+'')[0];while(l=this.indexOf(c,l)+1)++t;return t};
var x=Array(100001).join('1234567890');
console.time('proto');x.timesCharExist('1');console.timeEnd('proto');
console.time('regex');x.match(/1/g).length;console.timeEnd('regex');
var timesCharExist=function(x,c){var t=0,l=0,c=(c+'')[0];while(l=x.indexOf(c,l)+1)++t;return t;};
console.time('func');timesCharExist(x,'1');console.timeEnd('func');
The result of all the solutions should be 100,000!
Note: if you want this function to count more than 1 char, change where is c=(c+'')[0] into c=c+''

var temp = "This is a string.";
console.log((temp.match(new RegExp("is", "g")) || []).length);

A simple way would be to split the string on the required word, the word whose occurrences we want to count, and subtract 1 from the number of parts:
function countOccurences(string, word) {
return string.split(word).length - 1;
}
const text="Let us see. see above, see below, see forward, see backward, see left, see right until we will be right";
const count=countOccurences(text,"see "); // 6

I think the purpose of regex is quite different from that of indexOf.
indexOf simply finds the occurrence of a certain string, while in a regex you can use character classes like [A-Z], which match any capital letter in the word without stating the actual character.
Example:
var index = "This is a string".indexOf("is");
console.log(index);
var length = "This is a string".match(/[a-z]/g).length;
// [a-z] is a regex character class; that's why it's slower
console.log(length);

Super duper old, but I needed to do something like this today and only thought to check SO afterwards. Works pretty fast for me.
String.prototype.count = function(substr,start,overlap) {
overlap = overlap || false;
start = start || 0;
var count = 0,
offset = overlap ? 1 : substr.length;
while((start = this.indexOf(substr, start) + offset) !== (offset - 1))
++count;
return count;
};

var myString = "This is a string.";
var foundAtPosition = 0;
var Count = 0;
while (foundAtPosition != -1)
{
foundAtPosition = myString.indexOf("is",foundAtPosition);
if (foundAtPosition != -1)
{
Count++;
foundAtPosition++;
}
}
document.write("There are " + Count + " occurrences of the word IS");
Refer to "count a substring appears in the string" for a step-by-step explanation.

Building upon @Vitim.us's answer above. I like the control his method gives me, making it easy to extend, but I needed to add case insensitivity and limit matches to whole words with support for punctuation (e.g. "bath" is in "take a bath." but not in "bathing").
The punctuation regex came from: https://stackoverflow.com/a/25575009/497745 (How can I strip all punctuation from a string in JavaScript using regex?)
function keywordOccurrences(string, subString, allowOverlapping, caseInsensitive, wholeWord)
{
string += "";
subString += "";
if (subString.length <= 0) return (string.length + 1); //deal with empty strings
if(caseInsensitive)
{
string = string.toLowerCase();
subString = subString.toLowerCase();
}
var n = 0,
pos = 0,
step = allowOverlapping ? 1 : subString.length,
stringLength = string.length,
subStringLength = subString.length;
while (true)
{
pos = string.indexOf(subString, pos);
if (pos >= 0)
{
var matchPos = pos;
pos += step; //slide forward the position pointer no matter what
if(wholeWord) //only whole word matches are desired
{
if(matchPos > 0) //if the string is not at the very beginning we need to check if the previous character is whitespace
{
if(!/[\s\u2000-\u206F\u2E00-\u2E7F\\'!"#$%&\(\)*+,\-.\/:;<=>?@\[\]^_`{|}~]/.test(string[matchPos - 1])) //ignore punctuation
{
continue; //then this is not a match
}
}
var matchEnd = matchPos + subStringLength;
if(matchEnd < stringLength) //if the match does not end the string, check the character right after it
{
if (!/[\s\u2000-\u206F\u2E00-\u2E7F\\'!"#$%&\(\)*+,\-.\/:;<=>?@\[\]^_`{|}~]/.test(string[matchEnd])) //ignore punctuation
{
continue; //then this is not a match
}
}
}
++n;
} else break;
}
return n;
}
Please feel free to modify and refactor this answer if you spot bugs or improvements.

For anyone that finds this thread in the future, note that the accepted answer will not always return the correct value if you generalize it, since it will choke on regex operators like $ and . in the needle. Here's a version that escapes them first, so it can handle any needle:
function occurrences (haystack, needle) {
// Escape every regex metacharacter so the needle is matched literally
var _needle = needle.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
return (
haystack.match(new RegExp(_needle, 'g')) || []
).length
}

Try it
<?php
$str = "33,33,56,89,56,56";
echo substr_count($str, '56');
?>
<script type="text/javascript">
var temp = "33,33,56,89,56,56";
var count = temp.match(/56/g);
alert(count.length);
</script>

Simple version without regex:
var temp = "This is a string.";
var count = (temp.split('is').length - 1);
alert(count);

No one will ever see this, but it's good to bring back recursion and arrow functions once in a while (pun gloriously intended)
String.prototype.occurrencesOf = function(s, i) {
return (n => (n === -1) ? 0 : 1 + this.occurrencesOf(s, n + 1))(this.indexOf(s, (i || 0)));
};

function substrCount( str, x ) {
let count = -1, pos = 0;
do {
pos = str.indexOf( x, pos ) + 1;
count++;
} while( pos > 0 );
return count;
}

ES2020 offers a new matchAll method which might be of use in this particular context.
Here we create a new RegExp, please ensure you pass 'g' into the function.
Convert the result using Array.from and count the length, which returns 2 as per the original requestor's desired output.
let strToCheck = RegExp('is', 'g')
let matchesReg = "This is a string.".matchAll(strToCheck)
console.log(Array.from(matchesReg).length) // 2

This is a very old thread I've come across, but since many have posted their answers, here is mine in the hope that this simple code helps someone.
var search_value = "This is a dummy sentence!";
var letter = 'a'; /*Can take any letter, have put in a var if anyone wants to use this variable dynamically*/
letter = letter && "string" === typeof letter ? letter : "";
var count;
for (var i = count = 0; i < search_value.length; count += (search_value[i++] == letter));
console.log(count);
I'm not sure if it is the fastest solution, but I preferred it for its simplicity and for not using regex (I just don't like using them!)

You could try this
let count = s.length - s.replace(/is/g, "").length;

We can use the JS split function; the length of its result minus 1 is the number of occurrences.
var temp = "This is a string.";
alert(temp.split('is').length-1);

Here is my solution. I hope it would help someone
const countOccurence = (string, char) => {
// match() returns null when nothing matches, so fall back to an empty array
const chars = (string.match(new RegExp(char, 'g')) || []).length
return chars;
}

var countInstances = function(body, target) {
var globalcounter = 0;
var concatstring = '';
for(var i=0;i+target.length<=body.length;i++){
// compare the window of target.length characters starting at i
concatstring = body.substring(i,i+target.length);
if(concatstring === target){
globalcounter += 1;
concatstring = '';
}
}
return globalcounter;
};
console.log( countInstances('abcabc', 'abc') ); // ==> 2
console.log( countInstances('ababa', 'aba') ); // ==> 2
console.log( countInstances('aaabbb', 'ab') ); // ==> 1

substr_count translated to Javascript from php
Locutus (Package that translates Php to JS)
substr_count (official page, code copied below)
function substr_count (haystack, needle, offset, length) {
// eslint-disable-line camelcase
// discuss at: https://locutus.io/php/substr_count/
// original by: Kevin van Zonneveld (https://kvz.io)
// bugfixed by: Onno Marsman (https://twitter.com/onnomarsman)
// improved by: Brett Zamir (https://brett-zamir.me)
// improved by: Thomas
// example 1: substr_count('Kevin van Zonneveld', 'e')
// returns 1: 3
// example 2: substr_count('Kevin van Zonneveld', 'K', 1)
// returns 2: 0
// example 3: substr_count('Kevin van Zonneveld', 'Z', 0, 10)
// returns 3: false
var cnt = 0
haystack += ''
needle += ''
if (isNaN(offset)) {
offset = 0
}
if (isNaN(length)) {
length = 0
}
if (needle.length === 0) {
return false
}
offset--
while ((offset = haystack.indexOf(needle, offset + 1)) !== -1) {
if (length > 0 && (offset + needle.length) > length) {
return false
}
cnt++
}
return cnt
}
Check out Locutus's Translation Of Php's substr_count function

The parameters:
ustring: the superset string
countChar: the substring
A function to count substring occurrence in JavaScript:
function subStringCount(ustring, countChar){
var correspCount = 0;
var corresp = false;
var amount = 0;
var prevChar = null;
for(var i=0; i!=ustring.length; i++){
if(ustring.charAt(i) == countChar.charAt(0) && corresp == false){
corresp = true;
correspCount += 1;
if(correspCount == countChar.length){
amount+=1;
corresp = false;
correspCount = 0;
}
prevChar = 1;
}
else if(ustring.charAt(i) == countChar.charAt(prevChar) && corresp == true){
correspCount += 1;
if(correspCount == countChar.length){
amount+=1;
corresp = false;
correspCount = 0;
prevChar = null;
}else{
prevChar += 1 ;
}
}else{
corresp = false;
correspCount = 0;
}
}
return amount;
}
console.log(subStringCount('Hello World, Hello World', 'll'));

var str = 'stackoverflow';
var arr = Array.from(str);
console.log(arr);
for (let a = 0; a < arr.length; a++) {
var temp = arr[a];
var c = 0;
for (let b = 0; b < arr.length; b++) {
if (temp === arr[b]) {
c++;
}
}
console.log(`the ${arr[a]} is counted for ${c}`)
}

Check for multiple occurrence of certain character in string

Edit: To those who downvoted me, this question is different from the duplicate question you linked. The other question is about returning the indexes. In my case, I do not need the index; I just want to check whether there is a duplicate.
This is my code:
String word = "ABCDE<br>XYZABC";
String[] keywords = word.split("<br>");
for (int index = 0; index < keywords.length; index++) {
if (keywords[index].toLowerCase().contains(word.toLowerCase())) {
if (index != (keywords.length - 1)) {
endText = keywords[index];
definition.setText(endText);
}
}
}
}
My problem is that if the keyword is "ABC", the string endText will only show "ABCDE". However, "XYZABC" contains "ABC" as well. How do I check whether the string has multiple occurrences? I would like the definition TextView to become definition.setText(endText + "More"); if there are multiple occurrences.
I tried this. The code works, but it makes my app very slow. I guess the reason is that I get the String word through a TextWatcher.
String[] keywords = word.split("<br>");
for (int index = 0; index < keywords.length; index++) {
if (keywords[index].toLowerCase().contains(word.toLowerCase())) {
if (index != (keywords.length - 1)) {
int i = 0;
Pattern p = Pattern.compile(search.toLowerCase());
Matcher m = p.matcher( word.toLowerCase() );
while (m.find()) {
i++;
}
if (i > 1) {
endText = keywords[index];
definition.setText(endText + " More");
} else {
endText = keywords[index];
definition.setText(endText);
}
}
}
}
Is there any faster way?

It's a little hard for me to understand your question, but it sounds like:
You have some string (e.g. "ABCDE<br>XYZABC"). You also have some target text (e.g. "ABC"). You want to split that string on a delimiter (e.g. "<br>"), and then:
If exactly one substring contains the target, display that substring.
If more than one substring contains the target, display the last substring that contains it plus the suffix "More".
In your posted code, the performance is really slow because of the Pattern.compile() call. Re-compiling the Pattern on every loop iteration is very costly. Luckily, there's no need for regular expressions here, so you can avoid that problem entirely.
String search = "ABC".toLowerCase();
String word = "ABCDE<br>XYZABC";
String[] keywords = word.split("<br>");
int count = 0;
for (String keyword : keywords) {
if (keyword.toLowerCase().contains(search)) {
++count;
endText = keyword;
}
}
if (count > 1) {
definition.setText(endText + " More");
}
else if (count == 1) {
definition.setText(endText);
}
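The snippet above relies on endText and definition from the asker's code. As a self-contained sketch of the same idea, the logic can be pulled into a method that returns the text to display; the names KeywordMatch and describe are hypothetical, and returning an empty string when nothing matches is an assumption, since the question does not say what should happen in that case.

```java
class KeywordMatch {
    // Returns the last "<br>"-separated piece that contains search (case-insensitive),
    // with " More" appended when more than one piece matches.
    // Assumption: an empty string is returned when no piece matches.
    static String describe(String word, String search) {
        String needle = search.toLowerCase(); // lower-case once, outside the loop
        String endText = "";
        int count = 0;
        for (String keyword : word.split("<br>")) {
            if (keyword.toLowerCase().contains(needle)) {
                count++;
                endText = keyword;
            }
        }
        return count > 1 ? endText + " More" : endText;
    }
}
```

The caller would then do something like `definition.setText(KeywordMatch.describe(word, search));`, with no Pattern involved at all.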

You are doing it correctly, but you are doing an unnecessary check: if (index != (keywords.length - 1)). This ignores a match in the last element of the keywords array. I'm not sure whether that is part of your requirement.
To improve performance, break out of the loop as soon as you find the second match. You don't need to check any further.
public static void main(String[] args) {
String word = "ABCDE<br>XYZABC";
String pattern = "ABC";
String[] keywords = word.split("<br>");
String endText = "";
int count = 0;
for (int index = 0; index < keywords.length; index++) {
if (keywords[index].toLowerCase().contains(pattern.toLowerCase())) {
// Reaching this point means a match was found.
if(count == 1) {
// This is the second match, so break out of the loop; no need to check further.
// Keep the first matching String and append " more" to it.
endText += " more";
break;
}
endText = keywords[index];
count++;
}
}
System.out.println(endText);
}
This will print ABCDE more

Hi. You have to write your condition like this:
if (word.toLowerCase().contains(keywords[index].toLowerCase()))

You can use this:
String word = "ABCDE<br>XYZABC";
String[] keywords = word.split("<br>");
for (int i = 0; i < keywords.length - 1; i++) {
int c = 0;
Pattern p = Pattern.compile(keywords[i].toLowerCase());
Matcher m = p.matcher(word.toLowerCase());
while (m.find()) {
c++;
}
if (c > 1) {
definition.setText(keywords[i] + " More");
} else {
definition.setText(keywords[i]);
}
}
But as I mentioned in the comments, no keyword occurs twice in "ABCDE<br>XYZABC" when you split it by <br>.
If you use "ABCDE<br>XYZABCDE" instead, there are two occurrences of "ABCDE".

void test() {
String word = "ABCDE<br>XYZABC";
String sequence = "ABC";
if(word.replaceFirst(sequence,"{---}").contains(sequence)){
int startIndex = word.indexOf(sequence);
int endIndex = word.indexOf("<br>");
Log.v("test",word.substring(startIndex,endIndex)+" More");
}
else{
//your code
}
}
Try this

Substring between two same or different delimiters (when delimiters occur multiple times)

I need to fetch a substring that lies between two delimiters, which may be the same or different. The delimiters occur multiple times in the string, so I need to extract the substring that lies between the mth occurrence of delimiter1 and the nth occurrence of delimiter2.
For example:
myString : Ron_CR7_MU^RM^_SAF_34^
What should I do here if I need to extract the substring that lies between the 3rd occurrence of '_' and the 3rd occurrence of '^'?
Substring = SAF_34
Or I could look for a substring that lies between the 2nd '^' and the 4th '_', i.e.:
Substring = _SAF
An SQL equivalent would be :
substr(myString, instr(myString, '_',1,3)+1,instr(myString, '^',1,3)-1-instr(myString, '_',1,3))

I would use,
public static int findNth(String text, String toFind, int count) {
int pos = -1;
do {
pos = text.indexOf(toFind, pos+1);
} while(--count > 0 && pos >= 0);
return pos;
}
int from = findNth(text, "_", 3);
int to = findNth(text, "^", 3);
String found = text.substring(from+1, to);
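To make the snippet above easy to try, here is a small self-contained sketch that wraps it in a helper. findNth is copied from the answer; the class name NthDelimiter, the between method, and the choice to return null when a delimiter is missing or the two are in the wrong order are assumptions added for illustration. It also assumes the occurrence numbers are 1 or greater.

```java
class NthDelimiter {
    // Index of the count-th occurrence of toFind in text, or -1 if there are fewer.
    static int findNth(String text, String toFind, int count) {
        int pos = -1;
        do {
            pos = text.indexOf(toFind, pos + 1);
        } while (--count > 0 && pos >= 0);
        return pos;
    }

    // Text between the m-th occurrence of d1 and the n-th occurrence of d2.
    // Assumption: null is returned when either delimiter is missing or they are out of order.
    static String between(String text, String d1, int m, String d2, int n) {
        int from = findNth(text, d1, m);
        int to = findNth(text, d2, n);
        if (from < 0 || to < 0) {
            return null;
        }
        int start = from + d1.length();
        return start <= to ? text.substring(start, to) : null;
    }
}
```

For the string in the question, `between("Ron_CR7_MU^RM^_SAF_34^", "_", 3, "^", 3)` gives SAF_34, and `between("Ron_CR7_MU^RM^_SAF_34^", "^", 2, "_", 4)` gives _SAF.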

If you can use a solution without regex, you can find the indexes in your string where the resulting string needs to start and where it needs to end. Then simply call myString.substring(start, end) to get your result.
The biggest problem is finding start and end. To do it, you can repeat this N (M) times, adding up the consumed lengths to get an index into the original string:
int pos = myString.indexOf(delimiterX);
myString = myString.substring(pos + 1); // skip past the delimiter; you may want to work on a copy of myString
I hope you get the idea.

You could create a little method that simply hunts for such substrings between delimiters sequentially, using (as noted) String.indexOf(String). You do need to decide whether you want all substrings, overlapping or not (which is what your question indicates), or only non-overlapping ones. Here is a trial of such code:
import java.util.Vector;
public class FindDelimitedStrings {
public static void main(String[] args) {
String[] test = getDelimitedStrings("Ron_CR7_MU'RM'_SAF_34'", "_", "'");
if (test != null) {
for (int i = 0; i < test.length; i++) {
System.out.println(" " + (i + 1) + ". |" + test[i] + "|");
}
}
}
public static String[] getDelimitedStrings(String source,
String leftDelimiter, String rightDelimiter) {
String[] answer = null;
Vector<String> results = new Vector<String>();
if (source == null || leftDelimiter == null || rightDelimiter == null) {
return null;
}
int loc = 0;
int begin = source.indexOf(leftDelimiter, loc);
int end;
while (begin > -1) {
end = source
.indexOf(rightDelimiter, begin + leftDelimiter.length());
if (end > -1) {
results.add(source.substring(begin, end));
// loc = end + rightDelimiter.length(); if strings must be
// returned as pairs
loc = begin + 1;
if (loc < source.length()) {
begin = source.indexOf(leftDelimiter, loc);
} else {
begin = -1;
}
} else {
begin = -1;
}
}
if (results.size() > 0) {
answer = new String[results.size()];
results.toArray(answer);
}
return answer;
}
}

RegEx for dividing complex number String in Java

Looking for a Regular Expression in Java to separate a String that represents complex numbers. A code sample would be great.
The input string will be in the form:
"a+bi"
Example: "300+400i", "4+2i", "5000+324i".
I need to retrieve 300 & 400 from the String.
I know we can do it crudely in this way.
str.substring(0, str.indexOf('+'));
str.substring(str.indexOf('+')+1,str.indexOf("i"));

I need to retrieve 300 & 400 from the String.
What about using String.split(regex) function:
String s[] = "300-400i".split("[\\Q+-\\Ei]");
System.out.println(s[0]+" "+s[1]); //prints 300 400

Regex that matches this is: /[0-9]{1,}[+-][0-9]{1,}i/
You can use this method:
Pattern complexNumberPattern = Pattern.compile("[0-9]{1,}");
Matcher complexNumberMatcher = complexNumberPattern.matcher(myString);
and use find and group methods on complexNumberMatcher to retrieve numbers from myString
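As a sketch of what "use find and group" could look like in practice: the loop below collects every run of digits. The class and method names (DigitRuns, numbers) are made up for this example. Note that this pattern ignores signs, so "5000-324i" yields 5000 and 324, not -324.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitRuns {
    // [0-9]+ is equivalent to the [0-9]{1,} pattern from the answer
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // Collects every run of digits in the input, in order of appearance.
    static List<Integer> numbers(String input) {
        List<Integer> result = new ArrayList<>();
        Matcher m = DIGITS.matcher(input);
        while (m.find()) {
            // group() returns the text of the most recent match
            result.add(Integer.parseInt(m.group()));
        }
        return result;
    }
}
```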

Use this one:
[0-9]{1,}
It'll return the numbers.
Hope it helps.

Regex
([-+]?\d+\.?\d*|[-+]?\d*\.?\d+)\s*\+\s*([-+]?\d+\.?\d*|[-+]?\d*\.?\d+)i
Example Regex
http://rubular.com/r/FfOAt1zk0v
Example C#
string regexPattern =
// Match any float, negative or positive, group it
@"([-+]?\d+\.?\d*|[-+]?\d*\.?\d+)" +
// ... possibly following that with whitespace
@"\s*" +
// ... followed by a plus
@"\+" +
// and possibly more whitespace:
@"\s*" +
// Match any other float, and save it
@"([-+]?\d+\.?\d*|[-+]?\d*\.?\d+)" +
// ... followed by 'i'
@"i";
Regex regex = new Regex(regexPattern);
Console.WriteLine("Regex used: " + regex);
while (true)
{
Console.WriteLine("Write a number: ");
string imgNumber = Console.ReadLine();
Match match = regex.Match(imgNumber);
double real = double.Parse(match.Groups[1].Value, CultureInfo.InvariantCulture);
double img = double.Parse(match.Groups[2].Value, CultureInfo.InvariantCulture);
Console.WriteLine("RealPart={0};Imaginary part={1}", real, img);
}
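Since the question asks for Java, a rough Java counterpart using java.util.regex might look like the sketch below. It is not a literal translation of the pattern above: it uses a simpler number pattern (digits with an optional fractional part) and also accepts a minus between the two parts. The names ComplexParser and parse, and returning null for input that does not match, are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ComplexParser {
    // Group 1: real part; group 2: sign between the parts; group 3: imaginary magnitude.
    private static final Pattern COMPLEX = Pattern.compile(
        "\\s*([-+]?\\d+(?:\\.\\d+)?)\\s*([-+])\\s*(\\d+(?:\\.\\d+)?)i\\s*");

    // Returns {real, imaginary}, or null when the input is not of the form a+bi or a-bi.
    static double[] parse(String input) {
        Matcher m = COMPLEX.matcher(input);
        if (!m.matches()) {
            return null;
        }
        double re = Double.parseDouble(m.group(1));
        double im = Double.parseDouble(m.group(3));
        if (m.group(2).equals("-")) {
            im = -im; // apply the sign that separated the two parts
        }
        return new double[] { re, im };
    }
}
```

Compiling the Pattern once as a static constant avoids recompiling it on every call.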

Try this one. As for me, it works.
public static void main(String[] args) {
String[] attempts = new String[]{"300+400i", "4i+2", "5000-324i", "555", "2i", "+400", "-i"};
for (String s : attempts) {
System.out.println("Parsing\t" + s);
printComplex(s);
}
}
static void printComplex(String in) {
String[] parts = in.split("[+-]");
int re = 0, im = 0, pos = -1;
for (String s : parts) {
if (pos != -1) {
s = in.charAt(pos) + s;
} else {
pos = 0;
if ("".equals(s)) {
continue;
}
}
pos += s.length();
if (s.lastIndexOf('i') == -1) {
if (!"+".equals(s) && !"-".equals(s)) {
re += Integer.parseInt(s);
}
} else {
s = s.replace("i", "");
if ("+".equals(s)) {
im++;
} else if ("-".equals(s)) {
im--;
} else {
im += Integer.parseInt(s);
}
}
}
System.out.println("Re:\t" + re + "\nIm:\t" + im);
}
Output:
Parsing 300+400i
Re: 300
Im: 400
Parsing 4i+2
Re: 2
Im: 4
Parsing 5000-324i
Re: 5000
Im: -324
Parsing 555
Re: 555
Im: 0
Parsing 2i
Re: 0
Im: 2

In theory you could use something like this:
Pattern complexNumberPattern = Pattern.compile("(.*)\\+(.*)"); // escape the plus so it is matched literally
Matcher complexNumberMatcher = complexNumberPattern.matcher(myString);
if (complexNumberMatcher.matches()) {
String prePlus = complexNumberMatcher.group(1);
String postPlus = complexNumberMatcher.group(2);
}
The advantage this would give you over selecting the numbers is that it would allow you to read things like:
5b+17c as 5b and 17c
Edit: I just noticed you didn't want the letters, so never mind, but this would give you more control in case other letters appear in it.

Best Loop Idiom for special casing the last element

I run into this case a lot when doing simple text processing and print statements: I am looping over a collection and want to special-case the last element (for example, every normal element is followed by a comma except the last one).
Is there some best-practice idiom or elegant form that doesn't require duplicating code or shoving an if/else into the loop?
For example, I have a list of strings that I want to print as a comma-separated list. (The do-while solution already assumes the list has 2 or more elements; otherwise it would be just as bad as the more correct for loop with a conditional.)
e.g. List = ("dog", "cat", "bat")
I want to print "[dog, cat, bat]"
I present 2 methods:
For loop with conditional
public static String forLoopConditional(String[] items) {
String itemOutput = "[";
for (int i = 0; i < items.length; i++) {
// Check if we're not at the last element
if (i < (items.length - 1)) {
itemOutput += items[i] + ", ";
} else {
// last element
itemOutput += items[i];
}
}
itemOutput += "]";
return itemOutput;
}
do while loop priming the loop
public static String doWhileLoopPrime(String[] items) {
String itemOutput = "[";
int i = 0;
itemOutput += items[i++];
if (i < (items.length)) {
do {
itemOutput += ", " + items[i++];
} while (i < items.length);
}
itemOutput += "]";
return itemOutput;
}
Tester class:
public static void main(String[] args) {
String[] items = { "dog", "cat", "bat" };
System.out.println(forLoopConditional(items));
System.out.println(doWhileLoopPrime(items));
}
In the Java AbstractCollection class it has the following implementation (a little verbose because it contains all edge case error checking, but not bad).
public String toString() {
Iterator<E> i = iterator();
if (! i.hasNext())
return "[]";
StringBuilder sb = new StringBuilder();
sb.append('[');
for (;;) {
E e = i.next();
sb.append(e == this ? "(this Collection)" : e);
if (! i.hasNext())
return sb.append(']').toString();
sb.append(", ");
}
}

I usually write it like this:
static String commaSeparated(String[] items) {
StringBuilder sb = new StringBuilder();
String sep = "";
for (String item: items) {
sb.append(sep);
sb.append(item);
sep = ",";
}
return sb.toString();
}

There are a lot of for loops in these answers, but I find that an Iterator and while loop reads much more easily. E.g.:
Iterator<String> itemIterator = Arrays.asList(items).iterator();
if (itemIterator.hasNext()) {
// special-case first item. in this case, no comma
while (itemIterator.hasNext()) {
// process the rest
}
}
This is the approach taken by Joiner in Google collections and I find it very readable.
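Filling in the two placeholder comments, a minimal sketch of that iterator pattern might look like the following. The class and method names are made up for illustration, and this is not Joiner's actual source, just the same shape of loop.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IteratorJoinSketch {
    // Hypothetical helper: formats the items as "[a, b, c]".
    static String join(Iterable<String> items) {
        StringBuilder sb = new StringBuilder("[");
        Iterator<String> it = items.iterator();
        if (it.hasNext()) {
            // Special-case the first item: no separator in front of it.
            sb.append(it.next());
            while (it.hasNext()) {
                // Every remaining item is preceded by the separator.
                sb.append(", ").append(it.next());
            }
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("dog", "cat", "bat");
        System.out.println(join(items)); // prints [dog, cat, bat]
    }
}
```

An empty list simply skips the if block and yields "[]".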

String value = "[" + StringUtils.join( items, ',' ) + "]";

My usual take is to test if the index variable is zero, e.g.:
var result = "[ ";
for (var i = 0; i < list.length; ++i) {
if (i != 0) result += ", ";
result += list[i];
}
result += " ]";
But of course, that's only if we talk about languages that don't have some Array.join(", ") method. ;-)
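For Java in particular, version 8 and later do ship such a method, String.join. A short sketch, assuming Java 8+ is available (the class and method names here are hypothetical, and the brackets are written without the inner spaces used above):

```java
class StringJoinSketch {
    // Hypothetical helper: wraps String.join (Java 8+) in brackets.
    static String bracketed(String[] items) {
        // String.join puts the delimiter between elements only,
        // so no first/last special case is needed.
        return "[" + String.join(", ", items) + "]";
    }

    public static void main(String[] args) {
        String[] items = { "dog", "cat", "bat" };
        System.out.println(bracketed(items)); // prints [dog, cat, bat]
    }
}
```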

I think it is easier to think of the first element as the special case because it is much easier to know if an iteration is the first rather than the last. It does not take any complex or expensive logic to know if something is being done for the first time.
public static String prettyPrint(String[] items) {
String itemOutput = "[";
boolean first = true;
for (int i = 0; i < items.length; i++) {
if (!first) {
itemOutput += ", ";
}
itemOutput += items[i];
first = false;
}
itemOutput += "]";
return itemOutput;
}

I'd go with your second example, ie. handle the special case outside of the loop, just write it a bit more straightforward:
String itemOutput = "[";
if (items.length > 0) {
itemOutput += items[0];
for (int i = 1; i < items.length; i++) {
itemOutput += ", " + items[i];
}
}
itemOutput += "]";

Java 8 solution, in case someone is looking for it:
String res = Arrays.stream(items).reduce((t, u) -> t + "," + u).orElse(""); // orElse avoids an exception on an empty array

I like to use a flag for the first item.
ArrayList<String> list = new ArrayList<String>() {{
add("dog");
add("cat");
add("bat");
}};
String output = "[";
boolean first = true;
for(String word: list){
if(!first) output += ", ";
output+= word;
first = false;
}
output += "]";

Since your case is simply processing text, you don't need the conditional inside the loop. A C example:
char* items[] = {"dog", "cat", "bat"};
char output[STRING_LENGTH] = {0};
char* pStr = &output[1];
int i;
output[0] = '[';
for (i=0; i < (sizeof(items) / sizeof(char*)); ++i) {
sprintf(pStr,"%s,",items[i]);
pStr = &output[0] + strlen(output);
}
output[strlen(output)-1] = ']';
Instead of adding a conditional to avoid generating the trailing comma, go ahead and generate it (to keep your loop simple and conditional-free) and simply overwrite it at the end. Many times, I find it clearer to generate the special case just like any other loop iteration and then manually replace it at the end (although if the "replace it" code is more than a couple of lines, this method can actually become harder to read).
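The same "emit the separator every time, then fix it up afterwards" idea carries over to Java. This is only a sketch with made-up names. It uses StringBuilder.setLength to drop the trailing separator and guards against an empty array, which the C version above does not.

```java
class TrailingSeparatorSketch {
    // Hypothetical helper: appends ", " after every item, then trims the last one.
    static String join(String[] items) {
        StringBuilder sb = new StringBuilder("[");
        for (String item : items) {
            // No conditional in the loop: every item gets a trailing separator.
            sb.append(item).append(", ");
        }
        if (items.length > 0) {
            // Remove the final ", " (two characters) in one step.
            sb.setLength(sb.length() - 2);
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        String[] items = { "dog", "cat", "bat" };
        System.out.println(join(items)); // prints [dog, cat, bat]
    }
}
```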

...
String[] items = { "dog", "cat", "bat" };
String res = "[";
for (String s : items) {
res += (res.length() == 1 ? "" : ", ") + s;
}
res += "]";
or so is quite readable. You can put the conditional in a separate if clause, of course. What makes it idiomatic (I think so, at least) is that it uses a foreach loop and does not use a complicated loop header.
Also, no logic is duplicated (i.e. there is only one place where an item from items is actually appended to the output string - in a real world application this might be a more complicated and lengthy formatting operation, so I wouldn't want to repeat the code).

In this case, you are essentially concatenating a list of strings using some separator string. You can maybe write something yourself which does this. Then you will get something like:
String[] items = { "dog", "cat", "bat" };
String result = "[" + joinListOfStrings(items, ", ") + "]";
with
public static String joinListOfStrings(String[] items, String sep) {
StringBuffer result = new StringBuffer();
for (int i=0; i<items.length; i++) {
result.append(items[i]);
if (i < items.length-1) result.append(sep);
}
return result.toString();
}
If you have a Collection instead of a String[] you can also use iterators and the hasNext() method to check if this is the last or not.
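A possible Collection variant is sketched below, using hasNext() as the "is this the last element?" test. The class and method names are hypothetical, and StringBuilder is used in place of StringBuffer since no synchronization is needed here.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;

class CollectionJoinSketch {
    // Hypothetical counterpart of joinListOfStrings for any Collection<String>.
    static String joinCollection(Collection<String> items, String sep) {
        StringBuilder result = new StringBuilder();
        Iterator<String> it = items.iterator();
        while (it.hasNext()) {
            result.append(it.next());
            // hasNext() is false only after the last element, so the
            // separator is appended after every element except the last.
            if (it.hasNext()) {
                result.append(sep);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        Collection<String> items = Arrays.asList("dog", "cat", "bat");
        System.out.println("[" + joinCollection(items, ", ") + "]"); // prints [dog, cat, bat]
    }
}
```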

If you are building a string dynamically like that, you shouldn't be using the += operator.
The StringBuilder class works much better for repeated dynamic string concatenation.
public String commaSeparate(String[] items, String delim){
StringBuilder bob = new StringBuilder();
for(int i=0;i<items.length;i++){
bob.append(items[i]);
if(i+1<items.length){
bob.append(delim);
}
}
return bob.toString();
}
Then call it like this:
String[] items = {"one","two","three"};
StringBuilder bob = new StringBuilder();
bob.append("[");
bob.append(commaSeparate(items,","));
bob.append("]");
System.out.print(bob.toString());

Generally, my favourite is the multi-level exit. Change
for ( s1; exit-condition; s2 ) {
doForAll();
if ( !modified-exit-condition )
doForAllButLast();
}
to
for ( s1;; s2 ) {
doForAll();
if ( modified-exit-condition ) break;
doForAllButLast();
}
It eliminates any duplicate code or redundant checks.
Your example:
for (int i = 0;; i++) {
itemOutput.append(items[i]);
if ( i == items.length - 1) break;
itemOutput.append(", ");
}
It works for some things better than others. I'm not a huge fan of this for this specific example.
Of course, it gets really tricky for scenarios where the exit condition depends on what happens in doForAll() and not just s2. Using an Iterator is such a case.
Here's a paper from the prof that shamelessly promoted it to his students :-). Read section 5 for exactly what you're talking about.

I think there are two answers to this question: the best idiom for this problem in any language, and the best idiom for this problem in java. I also think the intent of this problem wasn't the tasks of joining strings together, but the pattern in general, so it doesn't really help to show library functions that can do that.
Firstly though the actions of surrounding a string with [] and creating a string separated by commas are two separate actions, and ideally would be two separate functions.
For any language, I think the combination of recursion and pattern matching works best. For example, in haskell I would do this:
join [] = ""
join [x] = x
join (x:xs) = concat [x, ",", join xs]
surround before after str = concat [before, str, after]
yourFunc = surround "[" "]" . join
-- example usage: yourFunc ["dog", "cat"] will output "[dog,cat]"
The benefit of writing it like this is it clearly enumerates the different situations that the function will face, and how it will handle it.
Another very nice way to do this is with an accumulator type function. Eg:
join [] = ""
join strings = foldr1 (\a b -> concat [a, ",", b]) strings
This can be done in other languages as well, eg c#:
public static string Join(List<string> strings)
{
if (!strings.Any()) return string.Empty;
return strings.Aggregate((acc, val) => acc + "," + val);
}
Not very efficient in this situation, but can be useful in other cases (or efficiency may not matter).
Unfortunately, java can't use either of those methods. So in this case I think the best way is to have checks at the top of the function for the exception cases (0 or 1 elements), and then use a for loop to handle the case with more than 1 element:
public static String join(String[] items) {
if (items.length == 0) return "";
if (items.length == 1) return items[0];
StringBuilder result = new StringBuilder();
for(int i = 0; i < items.length - 1; i++) {
result.append(items[i]);
result.append(",");
}
result.append(items[items.length - 1]);
return result.toString();
}
This function clearly shows what happens in the two edge cases (0 or 1 elements). It then uses a loop for all but the last elements, and finally adds the last element on without a comma. The inverse way of handling the non-comma element at the start is also easy to do.
Note that the if (items.length == 1) return items[0]; line isn't actually necessary; however, I think it makes what the function does easier to determine at a glance.
(Note that if anyone wants more explanation on the haskell/c# functions ask and I'll add it in)
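The inverse handling mentioned above (append the first element without a comma, then let the loop put a comma in front of each remaining element) could be sketched roughly like this. The class name is made up, and the separator is kept as a bare "," to match the join function above.

```java
class FirstElementJoinSketch {
    // Hypothetical inverse of the join above: the special case is the first element.
    static String join(String[] items) {
        if (items.length == 0) return "";
        // Start with the first element, which takes no separator.
        StringBuilder result = new StringBuilder(items[0]);
        for (int i = 1; i < items.length; i++) {
            // Every later element is preceded by a comma.
            result.append(",").append(items[i]);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String[] items = { "dog", "cat", "bat" };
        System.out.println(join(items)); // prints dog,cat,bat
    }
}
```

With this shape the single-element case needs no separate check, because the loop body never runs.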

It can be achieved using Java 8 lambda and Collectors.joining() as -
List<String> items = Arrays.asList("dog", "cat", "bat");
String result = items.stream().collect(Collectors.joining(", ", "[", "]"));
System.out.println(result);

I usually write a for loop like this:
public static String forLoopConditional(String[] items) {
StringBuilder builder = new StringBuilder();
builder.append("[");
for (int i = 0; i < items.length - 1; i++) {
builder.append(items[i] + ", ");
}
if (items.length > 0) {
builder.append(items[items.length - 1]);
}
builder.append("]");
return builder.toString();
}

If you are just looking for a comma-separated list like this: "[The, Cat, in, the, Hat]", don't even waste time writing your own method. Just use List.toString:
List<String> strings = Arrays.asList("The", "Cat", "in", "the", "Hat");
System.out.println(strings.toString());
Provided the generic type of the List has a toString with the value you want to display, just call List.toString:
public class Dog {
private String name;
public Dog(String name){
this.name = name;
}
public String toString(){
return name;
}
}
Then you can do:
List<Dog> dogs = Arrays.asList(new Dog("Frank"), new Dog("Hal"));
System.out.println(dogs);
And you'll get:
[Frank, Hal]

A third alternative is the following
StringBuilder output = new StringBuilder();
for (int i = 0; i < items.length - 1; i++) {
output.append(items[i]);
output.append(",");
}
if (items.length > 0) output.append(items[items.length - 1]);
But the best is to use a join()-like method. For Java there's a join in third-party libraries such as Apache Commons (StringUtils.join); that way your code becomes:
StringUtils.join(items,',');
FWIW, the join() method (line 3232 onwards) in Apache Commons does use an if within a loop though:
public static String join(Object[] array, char separator, int startIndex, int endIndex) {
if (array == null) {
return null;
}
int bufSize = (endIndex - startIndex);
if (bufSize <= 0) {
return EMPTY;
}
bufSize *= ((array[startIndex] == null ? 16 : array[startIndex].toString().length()) + 1);
StringBuilder buf = new StringBuilder(bufSize);
for (int i = startIndex; i < endIndex; i++) {
if (i > startIndex) {
buf.append(separator);
}
if (array[i] != null) {
buf.append(array[i]);
}
}
return buf.toString();
}
