How to divide a String into identical repeating parts - java

Given a String, I want to divide it up into substrings that are all identical. For example:
"abcabcabcabc" -> ["abc", "abc", "abc", "abc"]
"aaaaaa" -> ["a", "a", "a", "a", "a", "a"]
"abc" -> ["abc"]
My problem is figuring out the logic of finding where to break the characters. My initial attempt is:
public static void FindPattern(String s) {
int no_of_characters = 256;
int[] count = new int[no_of_characters];
Arrays.fill(count, 0);
for (int i= 0; i < s.length();i++){
count[s.charAt(i)]++;
}
}
public static void main(String[] args) {
String s = "abcabcabd";
FindPattern(s);
}
but I have no idea of where to go from there.

You can use regex to find the smallest substring that when repeated is the same as the whole string:
String part = str.replaceAll("^(.+?)\\1*$", "$1");
Breaking down the regex:
^ means "start of input"
(.+?) means "capture (as group 1) the smallest amount of input (at least one character) that will result in a match"
\1 is a back reference to group 1, meaning "another copy of what was captured in group 1"
* zero or more of the back reference
$1 the replacement is what was captured in group 1
Because zero further copies are allowed to complete the match, when there is no repeating group, the whole string is returned, which is correct behaviour.
Once you have this string, you don't actually need to "divide" the string up; you just need n copies of it. However, as a convenience you can split the string into equal parts by splitting on the length of the result of the above:
String[] parts = str.split("(?<=\\G.{" + str.replaceAll("^(.+?)\\1*$", "$1").length() + "})");
More simply, the split regex is (?<=\G.{n}), which means "there are n characters between the end of the previous match and the current position".
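As a rough sketch of the same idea without the \G split, the unit found by the regex can be used to slice the string with substring. The class and method names (RepeatingUnit, smallestUnit, divide) are made up for illustration and are not from the answer above.

```java
import java.util.ArrayList;
import java.util.List;

public class RepeatingUnit {
    // Smallest prefix that, repeated, rebuilds the whole string (regex from the answer).
    static String smallestUnit(String str) {
        return str.replaceAll("^(.+?)\\1*$", "$1");
    }

    // Cuts str into consecutive pieces of the unit's length.
    static List<String> divide(String str) {
        List<String> parts = new ArrayList<>();
        if (str.isEmpty()) {
            return parts; // nothing to divide
        }
        int n = smallestUnit(str).length();
        for (int i = 0; i < str.length(); i += n) {
            parts.add(str.substring(i, i + n));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(divide("abcabcabcabc")); // [abc, abc, abc, abc]
        System.out.println(divide("aaaaaa"));       // [a, a, a, a, a, a]
        System.out.println(divide("abcabcabd"));    // [abcabcabd]
    }
}
```

Because the regex only matches when the unit tiles the whole string, the unit length always divides the input length, so the loop never reads past the end.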

Related

How to split the string with slash correctly

Code:
String line = "/abc/1/";
String[] tokens = line.split("/");
I want to get {"", "abc", "1", ""}.
However, the actual output is {"", "abc", "1"}.
What confuses me is why there is only one "", maybe there is something wrong with line.split("/").
Use the not-often-used second parameter of String#split:
String line = "/abc/1/";
String[] tokens = line.split("/", -1);
This returns {"", "abc", "1", ""}.
From the documentation for String#split(String pattern, int n):
If n is non-positive then the pattern will be applied as many times as possible and the array can have any length
Just a follow-up to Tim's answer: as the documentation points out, split takes a second parameter that controls how many times the regex is applied to the string. There are three different options for the limit:
public String[] split(String regex, int limit)
If the limit n is positive, the returned array's length will be no greater than n, and the array's last entry will contain all remaining input.
If the limit n is negative, there is no limit and all elements separated by the pattern will be returned, including trailing empty strings.
If the limit n is zero, the pattern is applied as many times as possible, as in the negative case, but all trailing empty strings are discarded.
So to your problem, you should try:
line.split("/", -1); // include all results.
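To see the three cases side by side, here is a small sketch that prints the result of each limit for the string from the question. The class name SplitLimitDemo and the helper show are only illustrative.

```java
import java.util.Arrays;

public class SplitLimitDemo {
    // Formats the result of line.split("/", limit) so empty strings stay visible.
    static String show(String line, int limit) {
        return Arrays.toString(line.split("/", limit));
    }

    public static void main(String[] args) {
        String line = "/abc/1/";
        System.out.println(show(line, 0));  // [, abc, 1]    same as line.split("/")
        System.out.println(show(line, -1)); // [, abc, 1, ]  trailing empty string kept
        System.out.println(show(line, 2));  // [, abc/1/]    at most two elements
    }
}
```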

Java extracting substring from sentences

There are combinations of words such as "is", "is not", and "does not contain". We have to match these words in a sentence and split it.
Input: if name is tom and age is not 45 or name does not contain tom then let me know.
Expected output:
If name is
tom and age is not
45 or name does not contain
tom then let me know
I tried the code below to split and extract, but "is" also occurs inside "is not", which my code is not able to distinguish:
public static void loadOperators(){
operators.add("is");
operators.add("is not");
operators.add("does not contain");
}
public static void main(String[] args) {
loadOperators();
for(String s : operators){
System.out.println(str.split(s).length - 1);
}
}
Since an operator can occur more than once, and "is" and "is not" are different operators for you, a plain split won't solve your use case. Ideally you would:
Iterate:
1. Find the index of the operator.
2. Search for the next space or word.
3. Then update your string to the substring from that index to the end.
I am not entirely sure about what you try to achieve, but let's give it a shot.
For your case, a simple "workaround" might work just fine:
Sort the operators by their length, descending. This way the "largest match" will be found first. You can define "largest" as either literally the longest string or, preferably, the number of words (number of spaces contained), so that "is a" has precedence over "contains"
You'll need to make sure that no matches overlap though, which can be done by comparing all matches' start and end indices and discarding overlaps by some criteria, like first match wins
This code does what you seem to be wanting to do (or what I guessed you are wanting to do):
public static void main(String[] args) {
List<String> operators = new ArrayList<>();
operators.add("is");
operators.add("is not");
operators.add("does not contain");
String input = "if name is tom and age is not 45 or name does not contain tom then let me know.";
List<String> output = new ArrayList<>();
int lastFoundOperatorsEndIndex = 0; // First start at the beginning of input
for (String operator : operators){
int indexOfOperator = input.indexOf(operator); // Find current operator's position
if (indexOfOperator > -1) { // If operator was found
int thisOperatorsEndIndex = indexOfOperator + operator.length(); // Get length of operator and add it to the index to include operator
output.add(input.substring(lastFoundOperatorsEndIndex, thisOperatorsEndIndex).trim()); // Add operator to output (and remove trailing space)
lastFoundOperatorsEndIndex = thisOperatorsEndIndex; // Update startindex for next operator
}
}
output.add(input.substring(lastFoundOperatorsEndIndex, input.length()).trim()); // Add rest of input as last entry to output
for (String part : output) { // Output to console
System.out.println(part);
}
}
But it is highly dependent on the order of the sentence and the operators. If we're talking about user input, the task will be much more complicated.
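The "longest operator first" idea described above could be sketched roughly as follows. This is only one possible reading of that suggestion: the names OperatorSplitter and splitAfterOperators are hypothetical, and using \b word boundaries with a single alternation (which also prevents overlapping matches) is my assumption, not something the answer prescribes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OperatorSplitter {
    // Cuts the input right after each operator, trying longer operators first.
    static List<String> splitAfterOperators(String input, List<String> operators) {
        List<String> sorted = new ArrayList<>(operators);
        sorted.sort((a, b) -> Integer.compare(b.length(), a.length())); // longest first

        StringBuilder alternation = new StringBuilder();
        for (String op : sorted) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(op)); // treat the operator as literal text
        }

        Matcher m = Pattern.compile("\\b(?:" + alternation + ")\\b").matcher(input);
        List<String> parts = new ArrayList<>();
        int start = 0;
        while (m.find()) {
            parts.add(input.substring(start, m.end()).trim());
            start = m.end();
        }
        String rest = input.substring(start).trim();
        if (!rest.isEmpty()) {
            parts.add(rest);
        }
        return parts;
    }

    public static void main(String[] args) {
        List<String> operators = Arrays.asList("is", "is not", "does not contain");
        String input = "if name is tom and age is not 45 or name does not contain tom then let me know.";
        for (String part : splitAfterOperators(input, operators)) {
            System.out.println(part);
        }
    }
}
```

Since the regex engine tries the alternatives left to right at each position, putting "is not" before "is" means the longer operator wins wherever both could match.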
A better method using regular expressions (regExp) would be:
public static void main(String... args) {
// Define inputs
String input1 = "if name is tom and age is not 45 or name does not contain tom then let me know.";
String input2 = "the name is tom and he is 22 years old but the name does not contain jack, but merry is 24 year old.";
// Output split strings
for (String part : split(input1)) {
System.out.println(part.trim());
}
System.out.println();
for (String part : split(input2)) {
System.out.println(part.trim());
}
}
private static String[] split(String input) {
// Define list of operators - 'is not' has to precede 'is'!!
String[] operators = { "\\sis not\\s", "\\sis\\s", "\\sdoes not contain\\s", "\\sdoes contain\\s" };
// Concatenate operators to regExp-String for search
StringBuilder searchString = new StringBuilder();
for (String operator : operators) {
if (searchString.length() > 0) {
searchString.append("|");
}
searchString.append(operator);
}
// Replace all operators by operator+\n and split resulting string at \n-character
return input.replaceAll("(" + searchString.toString() + ")", "$1\n").split("\n");
}
Notice the order of the operators! 'is' has to come after 'is not' or 'is not' will always be split.
You can prevent this by using a negative lookahead for the operator 'is'.
So "\\sis\\s" would become "\\sis(?! not)\\s" (reading like: "is", not followed by a " not").
A minimalist version (String.join requires JDK 1.8+) could look like this:
private static String[] split(String input) {
String[] operators = { "\\sis(?! not)\\s", "\\sis not\\s", "\\sdoes not contain\\s", "\\sdoes contain\\s" };
return input.replaceAll("(" + String.join("|", operators) + ")", "$1\n").split("\n");
}

Regex does not store the element in the first index

I have a function which takes a String containing a math expression such as 6+9*8 or 4+9 and it evaluates them from left to right (without normal order of operation rules).
I've been stuck with this problem for the past couple of hours and have finally found the culprit, BUT I have no idea why it is doing what it does. When I split the string with regex (.split("\\d") and .split("\\D")), I put the results into 2 arrays: an int[] containing the numbers involved in the expression and a String[] containing the operations.
What I've realized is that when I do the following:
String question = "5+9*8";
String[] mathOperations = question.split("\\d");
for(int i = 0; i < mathOperations.length; i++) {
System.out.println("Math Operation at " + i + " is " + mathOperations[i]);
}
it does not put the first operation sign in index 0, rather it puts it in index 1. Why is this?
This is the system.out on the console:
Math Operation at 0 is
Math Operation at 1 is +
Math Operation at 2 is *
Because on position 0 of mathOperations there's an empty String. In other words
mathOperations = {"", "+", "*"};
According to split documentation
The array returned by this method contains each substring of this
string that is terminated by another substring that matches the given
expression or is terminated by the end of the string. ...
Why isn't there an empty string at the end of the array too?
Trailing empty strings are therefore not included in the resulting
array.
More detailed explanation - your regex matched the String like this:
"(5)+(9)*(8)" -> "" + (5) + "+" + (9) + "*" + (8) + ""
but the trailing empty string is discarded as specified by the documentation.
(hope this silly illustration helps)
Also a thing worth noting, the regex you used "\\d", would split following string "55+5" into
["", "", "+"]
That's because you match only a single character, you should probably use "\\d+"
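A small sketch of how this plays out with "\\d+" and "\\D+", and one way to drop the leading empty string. The class and method names here are invented for the example.

```java
import java.util.Arrays;

public class SplitLeadingEmpty {
    // Operators of the expression; drops the "" that appears when it starts with a digit.
    static String[] operatorsOf(String expr) {
        String[] raw = expr.split("\\d+");
        if (raw.length > 0 && raw[0].isEmpty()) {
            return Arrays.copyOfRange(raw, 1, raw.length);
        }
        return raw;
    }

    // Numbers of the expression, still as strings.
    static String[] numbersOf(String expr) {
        return expr.split("\\D+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString("55+5".split("\\d")));   // [, , +]
        System.out.println(Arrays.toString("55+5".split("\\d+")));  // [, +]
        System.out.println(Arrays.toString(operatorsOf("5+9*8")));  // [+, *]
        System.out.println(Arrays.toString(numbersOf("5+9*8")));    // [5, 9, 8]
    }
}
```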
You may find the following variation on your program helpful, as one split does the jobs of both of yours...
public class zw {
public static void main(String[] args) {
String question = "85+9*8-900+77";
String[] bits = question.split("\\b");
for (int i = 0; i < bits.length; ++i) System.out.println("[" + bits[i] + "]");
}
}
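and its output (as produced on Java 7; since Java 8 a zero-width match at the start of the input no longer yields a leading empty string, so the first [] line does not appear there):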
[]
[85]
[+]
[9]
[*]
[8]
[-]
[900]
[+]
[77]
In this program, I used \b as a "zero-width boundary" to do the splitting. No characters were harmed during the split, they all went into the array.
More info here: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
and here: http://www.regular-expressions.info/wordboundaries.html

How to evaluate >9 number in a String expression Java

Life is very easy if the expression only has values from 0 to 9, but
if the user enters expression = 23+52*5, I store it in a String named expression.
What I want is a new String or char array such that:
String s or char[] ch = ['23','+','52','*','5']
so that ch[0] or s.charAt(0) gives me 23 and not 2.
To do so I have tried the following and am stuck on what to do next:
for(int i=0;i<expression.length();i++)
{
int operand = 0;
while(i<expression.length() && sol.isOperand(expression.charAt(i))) {
// For a number with more than one digits, as we are scanning
// from left to right.
// Everytime , we get a digit towards right, we can
// multiply current total in operand by 10
// and add the new digit.
operand = (operand*10) + (expression.charAt(i) - '0');
i++;
}
// Finally, you will come out of while loop with i set to a non-numeric
// character or end of string decrement it because it will be
// incremented in increment section of loop once again.
// We do not want to skip the non-numeric character by
// incrementing it twice.
i--;
/**
* I have the desired integer value but how to put it on a string
* or char array to fulfill my needs as stated above.
**/
// help or ideas for code here to get desired output
}
In the while loop, the method isOperand(char) returns true if the given char is >= '0' and <= '9'.
You're going to want a String[] when you break the expression apart. Regex lookbehind/lookahead (Java Pattern Reference) allows you to split a String and keep the delimiters. In this case, your delimiters are the operators. You can use a split pattern like this:
public static void main(String[] args) throws Exception {
String expression = "23+52*5";
String[] pieces = expression.split("(?<=\\+|-|\\*|/)|(?=\\+|-|\\*|/)");
System.out.println(Arrays.toString(pieces));
}
Results:
[23, +, 52, *, 5]
One thing to consider: if your expression contains any spaces, you're going to want to remove them before splitting the expression apart. Otherwise, the spaces will be included in the results. You can remove the spaces from the expression with String.replaceAll() like this:
expression = expression.replaceAll("\\s+", "");
The "\\s+" is a regular expression pattern that means one or more whitespace characters: [ \t\n\x0B\f\r]. This statement replaces all whitespace with an empty string, essentially removing it.
String expr = "23+52*5";
String[] operands = expr.split("[^0-9]");
String[] operators = expr.split("[0-9]+");
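This breaks it into operands = [23, 52, 5] and operators = [, +, *]. Note the leading empty string in operators, which appears because the expression starts with a number.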
Try using a switch statement:
// read the first character
switch on the character
    if it is a digit: which one is it (0, 1, 2, 3, ...)?
        save the digit
    if it is an operator
        save the operator
// note: the switch has 4 cases for the operators and 10 for the digits
// read the next character
switch again
    if it is a digit after a digit, add both chars to one string
    if it is an operator, "close" the number string and convert it into an int
If you need some code I will gladly help.
Hope the answer is useful.
for i = 1 to string.length
    switch char(i):
        case 1,2,3,4,5,6,7,8,9,0 (a digit)
            x(char) = char(i)
        case +,-,*,/ (an operator)
            y(char) = char(i)
            x[1] + x[2] + ... + x[n] = X(string) // join all the digits saved before the operator into 1 string
            convert_to_int(X(string))
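The character-by-character idea in the pseudocode above might look roughly like this in Java. It is a sketch under my own assumptions: the names CharTokenizer and tokenize are invented, tokens are kept as strings rather than converted to int, and an if/else on Character.isDigit stands in for the 14-case switch.

```java
import java.util.ArrayList;
import java.util.List;

public class CharTokenizer {
    // Scans left to right, collecting consecutive digits into one token.
    static List<String> tokenize(String expression) {
        List<String> tokens = new ArrayList<>();
        StringBuilder number = new StringBuilder();
        for (char c : expression.toCharArray()) {
            if (Character.isDigit(c)) {
                number.append(c); // digit after digit: keep building the number
            } else {
                if (number.length() > 0) {
                    tokens.add(number.toString()); // "close" the current number
                    number.setLength(0);
                }
                if (c == '+' || c == '-' || c == '*' || c == '/') {
                    tokens.add(String.valueOf(c));
                }
                // any other character (such as a space) is skipped
            }
        }
        if (number.length() > 0) {
            tokens.add(number.toString()); // number at the end of the input
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("23+52*5")); // [23, +, 52, *, 5]
    }
}
```

Each number token can then be converted with Integer.parseInt when the expression is evaluated.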
You can use regex:
\d+
Will grab any instance of one or more consecutive digits.
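For instance, a Matcher can walk over the expression and collect each run of digits. The sketch below extends the pattern with the four operators so that they are captured too; that extension, and the names RegexTokenizer and tokens, are my own additions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTokenizer {
    // A token is either a run of digits or a single operator character.
    private static final Pattern TOKEN = Pattern.compile("\\d+|[-+*/]");

    static List<String> tokens(String expression) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(expression);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("23+52*5")); // [23, +, 52, *, 5]
    }
}
```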

split string only on first instance - java

I want to split a string by the '=' character, but I want it to split on the first instance only. How can I do that? Here is a JavaScript example for the '_' character, but it doesn't work for me:
split string only on first instance of specified character
Example :
apple=fruit table price=5
When I try String.split('='); it gives
[apple],[fruit table price],[5]
But I need
[apple],[fruit table price=5]
Thanks
string.split("=", 2); // the second argument is the limit
As String.split(java.lang.String regex, int limit) explains:
The array returned by this method contains each substring of this string that is terminated by another substring that matches the given expression or is terminated by the end of the string. The substrings in the array are in the order in which they occur in this string. If the expression does not match any part of the input then the resulting array has just one element, namely this string.
The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter.
The string boo:and:foo, for example, yields the following results with these parameters:
Regex   Limit   Result
:       2       { "boo", "and:foo" }
:       5       { "boo", "and", "foo" }
:       -2      { "boo", "and", "foo" }
o       5       { "b", "", ":and:f", "", "" }
o       -2      { "b", "", ":and:f", "", "" }
o       0       { "b", "", ":and:f" }
Yes, you can: just pass the integer limit parameter to the split method
String stSplit = "apple=fruit table price=5";
String[] parts = stSplit.split("=", 2); // ["apple", "fruit table price=5"]
Here is a java doc reference : String#split(java.lang.String, int)
Many other answers suggest the limit approach, so here is another way.
You can use the indexOf method on String, which returns the index of the first occurrence of the given character. Using that index you can get the desired output:
String target = "apple=fruit table price=5" ;
int x= target.indexOf("=");
System.out.println(target.substring(x+1));
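A slightly fuller sketch of the indexOf approach, returning both halves and handling the case where the delimiter is missing. The names FirstEqualsSplit and splitOnFirst are hypothetical, and returning a one-element array when nothing is found is an assumption chosen to mirror what split does.

```java
import java.util.Arrays;

public class FirstEqualsSplit {
    // Splits target at the first occurrence of delimiter only.
    static String[] splitOnFirst(String target, char delimiter) {
        int x = target.indexOf(delimiter);
        if (x < 0) {
            return new String[] { target }; // delimiter not present
        }
        return new String[] { target.substring(0, x), target.substring(x + 1) };
    }

    public static void main(String[] args) {
        String target = "apple=fruit table price=5";
        System.out.println(Arrays.toString(splitOnFirst(target, '=')));
        // [apple, fruit table price=5]
    }
}
```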
String string = "This is test string on web";
String splitData[] = string.split("\\s", 2);
Result:
splitData[0] => This
splitData[1] => is test string on web
String string = "This is test string on web";
String splitData[] = string.split("\\s", 3);
Result:
splitData[0] => This
splitData[1] => is
splitData[2] => test string on web
By default, the split method produces as many array elements as the regex allows. If you want to restrict the number of elements created by a split, pass an integer limit as the second argument.
This works:
public class Split
{
public static void main(String...args)
{
String a = "%abcdef&Ghijk%xyz";
String b[] = a.split("%", 2);
System.out.println("Value = "+b[1]);
}
}
String[] func(String apple){
    String[] tmp = new String[2]; // stays {null, null} if there is no '='
    for(int i=0;i<apple.length();i++){
        if(apple.charAt(i)=='='){
            tmp[0]=apple.substring(0,i);
            tmp[1]=apple.substring(i+1,apple.length());
            break;
        }
    }
    return tmp;
}
//returns string_ARRAY_!
I like writing my own methods :)
