Finding pairs in strings - java

I was wondering if I could get some help with this problem. Suppose I had the string
34342
I would like to find the number of pairs in this string, which would be two. How would I go about doing that?
EDIT: OK, what I really wanted was to match the occurrences of characters that are the same in the string.

You can use backreferences to find pairs of things that appear in a row:
(\d+)\1
This will match one or more digit characters followed by the same sequence again. \1 is a backreference, which refers to the contents of the first capturing group.
If you want to match numbers that appear multiple times in the string, you could use a pattern like
(\d)(?=\d*\1)
Again we're using a backreference, but this time we also use a lookahead. A lookahead is a zero-width assertion: it specifies something that must be matched (or not matched, if using a negative lookahead) after the current position in the string, but it doesn't consume any characters or move the regex engine's position in the string. In this case, we assert that the contents of the first capture group must be found again, though not necessarily directly beside the first occurrence. Because the lookahead uses \d*, it will only be considered a pair if both digits are within the same number (so if there's a space between numbers, the pair won't be matched; if this is undesired, the \d inside the lookahead can be changed to ., which will match any character).
It'll match the first 3 and 4 in 34342 and the first 1, 2, 3, and 4 in 12332144. Note, however, that because lookaheads do not consume, a digit that occurs n times produces n - 1 matches, so three or more repetitions give extra matches (e.g. 1112 will match the first two 1s).
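To try the lookahead pattern from Java, something like the following sketch could be used. The class and method names (RepeatedDigits, countRepeats) are made up for illustration; only Pattern and Matcher come from the standard library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedDigits {
    // Counts matches of (\d)(?=\d*\1): each digit that occurs again
    // later in the same run of digits is counted once.
    // The method name is hypothetical, chosen for this example.
    static int countRepeats(String input) {
        Matcher m = Pattern.compile("(\\d)(?=\\d*\\1)").matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countRepeats("34342"));    // prints 2
        System.out.println(countRepeats("12332144")); // prints 4
        System.out.println(countRepeats("1112"));     // prints 2
    }
}
```

Note that the backslashes have to be doubled inside a Java string literal.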

Here's one way, if a regex doesn't seem appropriate. One method here uses a map, the other uses pure arrays. I don't really know what a pair is. Is "555" three pairs, one pair, or what? So these routines print a list of all characters that occur more than once.
import java.util.Map;
import java.util.TreeMap;
public class Pairs {
    public static void main(String[] args) {
        usingMap("now is the time for all good men");
        System.out.println("-----------");
        usingArrays("now is the time for all good men");
    }
    // Counts characters in a sorted map, then prints those seen more than once.
    private static void usingMap(String s) {
        Map<Character, Integer> m = new TreeMap<Character, Integer>();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (m.containsKey(c)) {
                m.put(c, m.get(c) + 1);
            } else {
                m.put(c, 1);
            }
        }
        for (Character c : m.keySet()) {
            if (m.get(c) > 1) {
                System.out.println(c + ":" + m.get(c));
            }
        }
    }
    // Same idea with a plain array; assumes every character code is below 256.
    private static void usingArrays(String s) {
        int[] count = new int[256]; // Java initializes the elements to 0
        for (int i = 0; i < s.length(); i++) {
            count[s.charAt(i)]++;
        }
        for (int i = 0; i < count.length; i++) {
            if (count[i] > 1) {
                System.out.println((char) i + ":" + count[i]);
            }
        }
    }
}

Related

Recursion: Longest Palindrome Substring

This is a very common problem in which we would have to find the longest substring which is also a palindrome substring for the given input string.
Now there are multiple possible approaches to this, and I am aware of the dynamic programming solution, expand from the middle, etc. Any of these solutions should be used for a practical use case.
I was experimenting with using recursion to solve this problem and trying to implement the simple idea.
Let us assume that s is the given input string and i and j represent any valid character indexes of input string. So if s[i] == s[j], my longest substring would be:
s.charAt(i) + longestSubstring(s, i + 1, j - 1) + s.charAt(j)
And if these two characters are not equal then:
max of longestSubstring(s, i + 1, j) or longestSubstring(s, i, j - 1)
I tried to implement this solution below:
// end is inclusive
private static String longestPalindromeHelper(String s, int start, int end) {
if (start > end) {
return "";
} else if (start == end) {
return s.substring(start, end + 1);
}
// if the character at start is equal to end
if (s.charAt(start) == s.charAt(end)) {
// I can concatenate the start and end characters to my result string
// plus I can concatenate the longest palindrome in start + 1 to end - 1
// now logically this makes sense to me, but this would fail in the case
// for ex: a a c a b d k a c a a (space added for visualization)
// when start = 3 (a character)
// end = 7 (again end character)
// it will go in recursion with start = 4 and end = 6 from now onwards
// there is no palindrome substrings apart from the single character
// substring (which are palindrome by itself) so recursion tree for
// start = 3 and end = 7 would return any single character from b d k
// let's say it returns b so result would be a a c a b a c a a
// this would be correct answer for longest palindrome subsequence but
// not substring because for sub strings I need to have consecutive
// characters
return s.charAt(start)
+ longestPalindromeHelper(s, start + 1, end - 1) + s.charAt(end);
} else {
// characters are not equal, increment start
String s1 = longestPalindromeHelper(s, start + 1, end);
String s2 = longestPalindromeHelper(s, start, end - 1);
return s1.length() > s2.length() ? s1 : s2;
}
}
public static String longestPalindrome(String s) {
return longestPalindromeHelper(s, 0, s.length() - 1);
}
public static void main(String[] args) throws Exception {
String ans = longestPalindrome("aacabdkacaa");
System.out.println("Answer => " + ans);
}
For a moment, let us forget about time complexity or runtime. I am focused on making it work for the simple case above.
As you can see in the comments, I understand why this is failing, but I have tried hard to rectify the problem while following exactly the same approach. I don't want to use loops here.
What could be a possible fix that follows the same approach?
Note: I am interested in the actual string as answer and not the length. FYI I had a look at all the other questions and it seems no one is following this approach for correctness so I am trying.

Once you have a call wherein s[i] == s[j], you could flip a boolean flag or switch to a modified method that communicates to child calls that they can no longer use the "don't match, try i + 1 and j - 1" branch (else condition). This ensures you're looking at substrings, not subsequences, for the remainder of the recursion.
Secondly, for the substring variant, even if s[i] == s[j], you should also try i + 1 and j - 1 as if these characters didn't match, because one or both of these characters might not be part of the final best substring between i and j. In the subsequence version, there's never any reason not to add any matching characters to the current palindromic subsequence for the range i to j, but that's not always the case with substrings.
For example, given input "aabcbda" and we're at a call frame where i = 1 and j = length - 1, we need to maximize over three possibilities:
The best substring includes both 'a' characters. Call the subroutine with the flag that says we have to consume from both ends on down and can no longer try skipping characters.
The best substring might still include s[i] but not s[j], try j - 1.
The best substring might still include s[j] but not s[i], try i + 1.
Another observation: it might make more sense to pass best indices up the helper call chain, then grab the longest palindromic substring based on these indices at the very end in the wrapper function.
On a similar note, if you're struggling, you might simplify the problem and return the longest palindromic substring length using your recursive method, then switch to getting the actual substring itself. This makes it easier to focus on the substring logic without the return value complicating things as much.
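One way the ideas above might look in code is sketched below. It is not the asker's or the answerer's exact method: instead of a boolean flag, the "must consume from both ends" mode is a separate recursive method (isPalindrome here), and the other two branches skip one end at a time. All names are hypothetical, and the running time is exponential, which the question explicitly set aside.

```java
class PalindromeRecursion {
    // Strict mode: true only if s[start..end] matches at both ends all the way down.
    static boolean isPalindrome(String s, int start, int end) {
        if (start >= end) {
            return true;
        }
        return s.charAt(start) == s.charAt(end) && isPalindrome(s, start + 1, end - 1);
    }

    // Longest palindromic substring inside s[start..end], end inclusive.
    static String longest(String s, int start, int end) {
        if (start > end) {
            return "";
        }
        // If the whole range is a palindrome, nothing inside it can be longer.
        if (s.charAt(start) == s.charAt(end) && isPalindrome(s, start + 1, end - 1)) {
            return s.substring(start, end + 1);
        }
        // Otherwise the best substring drops s[start] or s[end]; try both.
        String withoutStart = longest(s, start + 1, end);
        String withoutEnd = longest(s, start, end - 1);
        return withoutStart.length() >= withoutEnd.length() ? withoutStart : withoutEnd;
    }

    static String longestPalindrome(String s) {
        return longest(s, 0, s.length() - 1);
    }

    public static void main(String[] args) {
        System.out.println(longestPalindrome("aacabdkacaa")); // prints aca
        System.out.println(longestPalindrome("aabcbda"));     // prints bcb
    }
}
```

Because the strict check never falls back to skipping characters, the result is always a contiguous substring rather than a subsequence.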

It is much easier to use loops here, rather than recursion, something like this:
public static void main(String[] args) {
    System.out.println(longestPalindrome("abbqa")); // bb
    System.out.println(longestPalindrome("aacabdkacaa")); // aca
    System.out.println(longestPalindrome("aacabdkaccaa")); // acca
}
public static String longestPalindrome(String str) {
    String palindrome = "";
    for (int i = 0; i < str.length(); i++) {
        // substring(i, j) excludes index j, so j must be allowed to reach
        // str.length(); otherwise the last character is never considered.
        for (int j = i + 1; j <= str.length(); j++) {
            String substring = str.substring(i, j);
            if (isPalindrome(substring)
                    && substring.length() > palindrome.length()) {
                palindrome = substring;
            }
        }
    }
    return palindrome;
}
public static boolean isPalindrome(String str) {
    // Compare characters from both ends moving towards the middle.
    for (int i = 0; i < str.length() / 2; i++) {
        if (str.charAt(i) != str.charAt(str.length() - i - 1)) {
            return false;
        }
    }
    return true;
}

Maximum repeating sequence instead of longest repeating sequence

I am trying to get the most repeated sequence of characters in a string.
For example :
Input:
s = "abccbaabccba"
Output:
2
I have used dynamic programming to figure out the repeating sequence, but this returns the longest repeating character sequence. For example:
Input:
s = "abcabcabcabc"
Output:
2
2(abcabc,abcabc) instead of 4(abc,abc,abc,abc)
Here is the part of the code where I'm filling the DP table and extracting repeating sequence. Can anyone suggest how I can get the most repeating sequence?
//Run through the string and fill the DP table.
char[] chars = s.toCharArray();
for(int i = 1; i <= length; i++){
for(int j = 1; j <= length; j++){
if( chars[i-1] == chars[j-1] && Math.abs(i-j) > table[i-1][j-1]){
table[i][j] = table[i-1][j-1] + 1;
if(table[i][j] > max_length_sub){
max_length_sub = table[i][j];
array_index = Math.min(i, j);
}
}else{
table[i][j] = 0;
}
}
}
//Check if there was a repeating sequence and return the number of times it occurred.
if( max_length_sub > 0 ){
String temp = s;
String subSeq = "";
for(int i = (array_index - max_length_sub); i< max_length_sub; i++){
subSeq = subSeq + s.charAt(i);
}
System.out.println( subSeq );
Pattern pattern = Pattern.compile(subSeq);
Matcher matcher = pattern.matcher(s);
int count = 0;
while (matcher.find())
count++;
// To find left overs - doesn't seem to matter
String[] splits = temp.split(subSeq);
if (splits.length == 0){
return count;
}else{
return 0;
}
}

Simple and dumb: the smallest sequence to be considered is a pair of characters (*):
loop over the whole String and get every consecutive pair of characters, for example using a for loop and substring to get the characters;
count the occurrences of that pair in the String: create a method countOccurrences() using indexOf(String, int) or regular expressions; and
store the greatest count: use one variable maxCount outside the loop and an if to check whether the current count is greater (or use Math.max()).
(*) if "abc" occurs 5 times, then "ab" (and "bc") will occur at least 5 times too, so it is enough to search just for "ab" and "bc"; there is no need to check "abc".
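The steps above could be sketched as follows. countOccurrences and maxCount are the names used in the answer; the class name and maxPairCount are made up here, and the choice to count non-overlapping occurrences is an assumption, since the answer does not say which is wanted.

```java
class MostRepeatedPair {
    // Counts non-overlapping occurrences of sub in s using indexOf(String, int).
    static int countOccurrences(String s, String sub) {
        int count = 0;
        int from = s.indexOf(sub);
        while (from >= 0) {
            count++;
            from = s.indexOf(sub, from + sub.length());
        }
        return count;
    }

    // Takes every consecutive pair of characters and keeps the greatest count.
    static int maxPairCount(String s) {
        int maxCount = 0;
        for (int i = 0; i + 2 <= s.length(); i++) {
            String pair = s.substring(i, i + 2);
            maxCount = Math.max(maxCount, countOccurrences(s, pair));
        }
        return maxCount;
    }

    public static void main(String[] args) {
        System.out.println(maxPairCount("abcabcabcabc")); // prints 4
        System.out.println(maxPairCount("abccbaabccba")); // prints 2
    }
}
```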
Edit: without leftovers (see comments), in summary:
check if the first character is repeated over the whole string; if not,
check if the first 2 characters are repeated all over; if not,
check if the first 3 ...
At least 2 counters/loops are needed: one for the number of characters to test, and a second for the position being tested. Some arithmetic could be used to improve performance: the length of the string must be divisible by the number of repeated characters without remainder.
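A minimal sketch of this "no leftovers" check, assuming the goal is the number of times the shortest repeating unit covers the whole string. The class and method names are hypothetical; regionMatches is the standard String method.

```java
class RepeatingUnit {
    // Returns how many times the shortest prefix that tiles the whole
    // string is repeated (1 if nothing shorter than the string repeats,
    // 0 for an empty string).
    static int repeatCount(String s) {
        int n = s.length();
        for (int len = 1; len <= n; len++) {      // number of characters to test
            if (n % len != 0) {
                continue;                          // length must divide evenly
            }
            boolean tiles = true;
            for (int pos = len; pos < n && tiles; pos += len) { // position being tested
                tiles = s.regionMatches(pos, s, 0, len);
            }
            if (tiles) {
                return n / len;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(repeatCount("abcabcabcabc")); // prints 4
        System.out.println(repeatCount("abccbaabccba")); // prints 2
    }
}
```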

Identifying set of integers in strings

I'm new to Java, trying to learn more.
How do I identify a contiguous set of integers in a string?
For example, if I have the string "123hh h3ll0 wor1d" the program should output 4 as the answer.
Here's what I've worked on, and as a result, my program outputs 6. I understand why but I don't know how to implement what I want the program to do.
public static void main (String[] args) throws java.lang.Exception
{
String string = "123hh h3ll0 w0rld";
int count = 0;
if (string.isEmpty())
count = 0;
for (int i = 0; i < string.length(); i++)
{
char c = string.charAt(i);
if (Character.isDigit(c))
count++;
}
System.out.println(count);
}

Your program is a good start, but it counts all digits. You need to avoid count++ when you are in a contiguous group of digits. You can do it by adding a boolean flag which you set to true when you see a digit, and then to false when you see a non-digit:
boolean inDigits = false;
for (int i = 0; i < string.length(); i++)
{
char c = string.charAt(i);
if (Character.isDigit(c)) {
if (!inDigits) count++;
inDigits = true;
} else {
inDigits = false;
}
}
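A simpler way to find the number of groups is to split on \\d+ (a sequence of one or more digits), count the number of pieces you get, and subtract one. Pass -1 as the limit so that trailing empty strings are kept; otherwise a string that ends with a digit is undercounted:
System.out.println(string.split("\\d+", -1).length - 1);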
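Another option, not mentioned in the answer above, is to let a Matcher find each run of digits and count the finds. This is only a sketch; DigitGroups and countGroups are names invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitGroups {
    // Each successful find() is one maximal run of consecutive digits.
    static int countGroups(String s) {
        Matcher m = Pattern.compile("\\d+").matcher(s);
        int groups = 0;
        while (m.find()) {
            groups++;
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(countGroups("123hh h3ll0 w0rld")); // prints 4
        System.out.println(countGroups("abc1"));              // prints 1
    }
}
```

This avoids having to reason about leading and trailing empty strings in the array returned by split.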

Your program counts each occurrence of a digit within the input string, and as there are 6 digits, 6 is your result. So no surprises there. You have to understand: if you are interested in sequences of digits, then just asking each character "are you a digit?" isn't enough!
If you are interested in the length of the longest sequence, then your counting must be enhanced:
While you are within a sequence, you keep increasing a currentSequenceLength counter.
When you hit a non-digit, you stop counting and compare the length of the last sequence with the maximum, which you also have to remember.
That should be enough to get you going; the idea is not that we do the homework/learning part for you.

Based on what you stated about contiguous, you want to reset the count every time you are not on a digit and store the maximum count achieved during this process.
Add int currentMaximum = 0, and when a non-digit is read in, check whether count is greater than currentMaximum; if so, set currentMaximum to count. Then set count = 0. This causes the count to reset at each non-digit and only count up while digits are contiguous. Do the same comparison once more after the loop, in case the string ends with a digit.
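If the length of the longest run of digits is what is wanted, the reset-and-remember idea might look like this sketch. The names are hypothetical, and the maximum is updated on every digit rather than on every non-digit, which is a small variation on the description above.

```java
class LongestDigitRun {
    // Length of the longest run of consecutive digits in s.
    static int longestRun(String s) {
        int current = 0; // length of the run we are currently inside
        int max = 0;     // longest run seen so far
        for (int i = 0; i < s.length(); i++) {
            if (Character.isDigit(s.charAt(i))) {
                current++;
                max = Math.max(max, current);
            } else {
                current = 0; // a non-digit breaks the run
            }
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(longestRun("123hh h3ll0 w0rld")); // prints 3
    }
}
```

Updating the maximum inside the digit branch means a run at the very end of the string needs no extra check after the loop.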

So your code snippet is simply counting every digit in the string, which you said you knew. You have to work out in which situation you actually want to count. For contiguous digits, the situation you're looking for is when the current character is a digit and the next one is not (or there is no next character). That is when the chain of contiguous digits is broken. I have edited your for loop to use this technique to find the number of contiguous digit sets:
public static void main (String[] args) throws java.lang.Exception {
    String string = "123hh h3ll0 w0rld";
    int count = 0;
    for (int i = 0; i < string.length(); i++) {
        char current = string.charAt(i);
        // A group ends when a digit is followed by a non-digit
        // or by the end of the string.
        boolean atEnd = (i == string.length() - 1);
        if (Character.isDigit(current)
                && (atEnd || !Character.isDigit(string.charAt(i + 1)))) {
            count++;
        }
    }
    System.out.println(count);
}

I would target the numbers in the string via pre-processing OR real time, your problem statement doesn't define a requirement for either, and then refer to the related problem: here and additionally here (which provides a c++ and java sample too).
There's not too much to elaborate on because you haven't set up the problem space to define other factors (more in-depth problem statement by OP would help answers reflect more accurate responses). Try to specify such things as:
does/should it reset when encountering non-digits?
can you use objects like sets?
are you reading in simple test strings or large data amounts?
etc.
Without more info, I think there will be a varying response of answers here.
Cheers

This code will generate count = 6 and group = 4.
public static void main (String[] args) throws java.lang.Exception
{
String string = "123hh h3ll0 w0rld";
boolean isGroup = false;
int count = 0;
int group = 0;
if (string.isEmpty()) {
count = 0;
} else {
for (int i = 0; i < string.length(); i++) {
char c = string.charAt(i);
if (Character.isDigit(c)) {
count++;
if (!isGroup) {
group++;
isGroup = true;
}
} else {
isGroup = false;
}
}
}
System.out.println(count);
System.out.println(group);
}

Count the number of dots[.] , ! and ? in a text

I'm working on a Java project and I need to count the number of dots, ! and ? in a string. My current approach is to use a regex. I used the following code, but it's not giving the correct result.
for(int i=0; i<words.length; i++){
String w = words[i];
if(w.matches("(.)+[.!?]")){
count++; //increasing the count.
}
}
For some other function I have converted the string into an array of words, so I'm using it here.
I want to increase the count by one for each occurrence of a dot, ! or ? indicating the terminating point of a sentence. For example
test. - count increases by 1
test.. - count increases by 1
test?. - count increases by 1
Repeated use of symbols shouldn't increase the count.
Can you tell me what is wrong here?

Use a wildcard in the regex.
int count = 0;
for( int i = 0; i < words.length; i++ )
if( words[i].matches(".*[.!?]") )
count++;
.*[.!?] will match all strings that end in a period, exclamation point, or question mark.
The first . is outside the brackets, so it stands for any character. The * means 0 or more of the previous thing, so .* is 0 or more of any character. The . inside the brackets does not need escaping: within a character class it is just a regular period.

The easiest way is this one-liner:
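int count = str.replaceAll("[.!?]+", ".").length() - str.replaceAll("[.!?]+", "").length();
Rather than count matches directly, collapse each run of terminators to a single character, delete the runs entirely, and compare the two lengths. (Subtracting from str.length() instead would count every terminator character, so test.. would count as 2.)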

You can do -
public static void main(String args[])
{
String str = "Test, test!.\tTEST:\nTeST?;";
Pattern p = Pattern.compile("[.!?]");
Matcher matcher = p.matcher(str);
int count = 0;
while(matcher.find()) {
count++;
}
System.out.println("Count : " + count);
}
and the output is Count : 3, as expected.
Can you tell me why the same regex in str.matches("[.!?]") is not giving the expected result?
Because str.matches("[.!?]") tests whether the whole String matches, not whether the regex is found somewhere in the String. The above regex will match only if the string is exactly '.', '?', or '!':
String s = ".";
System.out.println(s.matches("[.!?]"));
will give true.
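Since the question asks that repeated symbols such as test.. count only once, a variation with a + quantifier may be closer to what was wanted. The following is a sketch under that assumption; SentenceEnds and countTerminators are invented names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SentenceEnds {
    // Counts runs of '.', '!' or '?'; a run like "?." or ".." counts once.
    static int countTerminators(String text) {
        Matcher m = Pattern.compile("[.!?]+").matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countTerminators("test. test.. test?.")); // prints 3
        System.out.println(countTerminators("Test, test!.\tTEST:\nTeST?;")); // prints 2
    }
}
```

With the single-character pattern [.!?] the second string gives 3, because ! and . are counted separately.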

Why am I getting java.lang.StringIndexOutOfBoundsException?

I want to write a program that prints words incrementally until a complete sentence appears. For example : I need to write (input), and output:
I
I need
I need to
I need to write.
Here is my code:
public static void main(String[] args) {
String sentence = "I need to write.";
int len = sentence.length();
int numSpace=0;
System.out.println(sentence);
System.out.println(len);
for(int k=0; k<len; k++){
if(sentence.charAt(k)!='\t')
continue;
numSpace++;
}
System.out.println("Found "+numSpace +"\t in the string.");
int n=1;
for (int m = 1; m <=3; m++) {
n=sentence.indexOf('\t',n-1);
System.out.println("ligne"+m+sentence.substring(0, n));
}
}
and this is what I get:
I need to write.
16
Found 0 in the string.
Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
String index out of range: -1 at
java.lang.String.substring(String.java:1937) at
split1.Split1.main(Split1.java:36) Java Result: 1 BUILD SUCCESSFUL
(total time: 0 seconds)
I don't understand why numSpace doesn't count the occurrences of spaces, nor why I don't get the correct output (even if I replace numSpace by 3 for example).

You don't have a \t character, so indexOf(..) returns -1.
You then try a substring from 0 to -1, which fails.
The solution is to check:
if (n > -1) {
    System.out.println(...);
}

Your loop that computes numSpace is incorrect. You are looking for \t, which is a tab character, and there are none in the string.
Further, in the loop at the bottom you get an exception because you are searching for that same \t, which again is not found. The call n=sentence.indexOf('\t',n-1); returns -1, which means "the character you are looking for was not found". Then you try to take a substring with an end index of -1, which is invalid, so you get an exception.
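The behaviour described above can be seen in a small sketch. prefixBefore is a hypothetical helper written for this example; indexOf and substring are the standard String methods.

```java
class PrefixDemo {
    // Returns the part of s before the first occurrence of sep,
    // or the whole string when sep is not found (indexOf returns -1).
    static String prefixBefore(String s, char sep) {
        int n = s.indexOf(sep);
        if (n == -1) {
            return s; // guard: substring(0, -1) would throw
        }
        return s.substring(0, n);
    }

    public static void main(String[] args) {
        String sentence = "I need to write.";
        System.out.println(sentence.indexOf('\t'));       // prints -1
        System.out.println(sentence.indexOf(' '));        // prints 1
        System.out.println(prefixBefore(sentence, ' '));  // prints I
        System.out.println(prefixBefore(sentence, '\t')); // prints I need to write.
    }
}
```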

You have misunderstood \t: it is the escape sequence for a horizontal tab, not for a space character. Searching for ' ' would do the trick and find the spaces in your sentence.

This looks like homework, so my answer is a hint.
Hint: read the javadoc for String.indexOf paying attention to what it says about the value returned when the string / character is not found.
(In fact - even if this is not formal homework, you are clearly a Java beginner. And beginners need to learn that the javadocs are the first place to look when using an unfamiliar method.)

The easiest way to solve this, I guess, would be to split the String first using String.split. Something like this:
static void sentence(String snt) {
    String[] split = snt.split(" ");
    for (int i = 0; i < split.length; i++) {
        // Print words 0..i on one line, separated by single spaces.
        for (int j = 0; j <= i; j++) {
            if (j > 0) System.out.print(" ");
            System.out.print(split[j]);
        }
        System.out.println();
    }
}
As other people pointed out, your loop counts only tab characters (\t), not spaces. You need to check for spaces with
if (sentence.charAt(k) == ' ')

\t represents a tab. To look for a space, just use ' '.
.indexOf() returns -1 if it can't find a character in the string. So we keep looping until .indexOf() returns -1.
Use of continue wasn't really needed here. We increment numSpaces when we encounter a space.
System.out.format is useful when we want to mix literal strings and variables. No ugly +s needed.
String sentence = "I need to write.";
int len = sentence.length();
int numSpace = 0;
for (int k = 0; k < len; k++) {
    if (sentence.charAt(k) == ' ') {
        numSpace++;
    }
}
System.out.format("Found %d spaces in the string.%n", numSpace);
int index = sentence.indexOf(' ');
while (index > -1) {
    System.out.println(sentence.substring(0, index));
    index = sentence.indexOf(' ', index + 1);
}
System.out.println(sentence);

Try this, it should pretty much do what you want. I figure you have already finished this so I just made the code real fast. Read the comments for the reasons behind the code.
public static void main(String[] args) {
String sentence = "I need to write.";
int len = sentence.length();
String[] broken = sentence.split(" "); //Doing this instead of the counting of characters is just easier...
/*
* The split method makes it where it populates the array based on either side of a " "
* (blank space) so at the array index of 0 would be 'I' at 1 would be "need", etc.
*/
boolean done = false;
int n = 0;
while (!done) { // While done is false do the below
for (int i = 0; i <= n; i++) { //This prints out the below however many times the count of 'n' is.
/*
* The reason behind this is so that it will print just 'I' the first time when
* 'n' is 0 (because it only prints once starting at 0, which is 'I') but when 'n' is
* 1 it goes through twice making it print 2 times ('I' then 'need") and so on and so
* forth.
*/
System.out.print(broken[i] + " ");
}
System.out.println(); // Since the above method is a print this puts an '\n' (enter) moving the next prints on the next line
n++; //Makes 'n' go up so that it is larger for the next go around
if (n == broken.length) { //the '.length' portion says how many indexes there are in the array broken
/* If you don't have this then the 'while' will go on forever. basically when 'n' hits
* the same number as the amount of words in the array it stops printing.
*/
done = true;
}
}
}
