regex in java not capture last group [duplicate] - java

((\d{1,2})/(\d{1,2})/(\d{2,4}))
Is there a way to retrieve a list of all the capture groups from the Pattern object? I debugged the object and all it tells me is how many groups there are (5).
I need to retrieve a list of the following capture groups.
Example of output:
0 ((\d{1,2})/(\d{1,2})/(\d{2,4}))
1 (\d{2})/(\d{2})/(\d{4})
2 \d{2}
3 \d{2}
4 \d{4}
Update:
I am not necessarily asking whether a regular expression for this exists, though that would be most favorable. So far I have created a rudimentary parser (I do not check for most out-of-bounds conditions) that only matches the innermost groups. I would like to know if there is a way to hold a reference to already-visited parentheses. Would I have to implement a tree structure?
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;
public class App {
public final char S = '(';
public final char E = ')';
public final char X = '\\';
String errorMessage = "Malformed expression: ";
/**
* Actual Output:
* Groups: [(//), (\d{1,2}), (\d{1,2}), (\d{2,4})]
* Expected Output:
* Groups: [\\b((\\d{1,2})/(\\d{1,2})/(\\d{2,4}))\\b, ((\\d{1,2})/(\\d{1,2})/(\\d{2,4})), (\d{1,2}), (\d{1,2}), (\d{2,4})]
*/
public App() {
String expression = "\\b((\\d{1,2})/(\\d{1,2})/(\\d{2,4}))\\b";
String output = "";
if (isValidExpression(expression)) {
List<String> groups = findGroups(expression);
output = "Groups: " + groups;
} else {
output = errorMessage;
}
System.out.println(output);
}
public List<String> findGroups(String expression) {
List<String> groups = new ArrayList<>();
int[] pos;
int start;
int end;
String sub;
boolean done = false;
while (expression.length() > 0 && !done) {
pos = scanString(expression);
start = pos[0];
end = pos[1];
if (start == -1 || end == -1) {
done = true;
continue;
}
sub = expression.substring(start, end);
expression = splice(expression, start, end);
groups.add(0, sub);
}
return groups;
}
public int[] scanString(String str) {
int[] range = new int[] { -1, -1 };
int min = 0;
int max = str.length() - 1;
int start = min;
int end = max;
char curr;
while (start <= max) {
curr = str.charAt(start);
if (curr == S) {
range[0] = start;
}
start++;
}
end = range[0];
while (end > -1 && end <= max) {
curr = str.charAt(end);
if (curr == E) {
range[1] = end + 1;
break;
}
end++;
}
return range;
}
public String splice(String str, int start, int end) {
if (str == null || str.length() < 1)
return "";
if (start < 0 || end > str.length()) {
System.err.println("Positions out of bounds.");
return str;
}
if (start >= end) {
System.err.println("Start must not exceed end.");
return str;
}
String first = str.substring(0, start);
String last = str.substring(end, str.length());
return first + last;
}
public boolean isValidExpression(String expression) {
try {
Pattern.compile(expression);
} catch (PatternSyntaxException e) {
errorMessage += e.getMessage();
return false;
}
return true;
}
public static void main(String[] args) {
new App();
}
}

Here is my solution. I simply wrote a regex that matches the regex, as @SotiriosDelimanolis suggested in a comment.
public static void printGroups() {
String sp = "((\\(\\\\d\\{1,2\\}\\))\\/(\\(\\\\d\\{1,2\\}\\))\\/(\\(\\\\d\\{2,4\\}\\)))";
Pattern p = Pattern.compile(sp);
Matcher m = p.matcher("(\\d{1,2})/(\\d{1,2})/(\\d{2,4})");
if (m.matches())
for (int i = 0; i <= m.groupCount(); i++)
System.out.println(m.group(i));
}
Note that you cannot remove the if-statement: the group method can only be used after a successful match attempt such as matches(), otherwise it throws an IllegalStateException (I didn't know that!).
Hope this is what you were asking for ...
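On the question of remembering already-visited parentheses: a stack of open-parenthesis positions is usually enough, and no tree is needed. The following is a minimal sketch, not the asker's or answerer's code. The class name GroupScanner and its method are hypothetical, and it assumes a simple pattern: it skips escaped characters but does not handle character classes such as [(], \Q...\E quoting, or non-capturing constructs such as (?:...).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class GroupScanner {
    /**
     * Returns the source text of each group, ordered by opening parenthesis.
     * Index 0 is the whole expression, mirroring Matcher.group(0).
     */
    static List<String> findGroups(String regex) {
        Deque<Integer> open = new ArrayDeque<>();        // positions of unmatched '('
        Map<Integer, String> byStart = new TreeMap<>();  // start index -> group text
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\') {
                i++;                                     // skip the escaped character
            } else if (c == '(') {
                open.push(i);
            } else if (c == ')' && !open.isEmpty()) {
                int start = open.pop();                  // innermost open parenthesis
                byStart.put(start, regex.substring(start, i + 1));
            }
        }
        List<String> groups = new ArrayList<>();
        groups.add(regex);
        groups.addAll(byStart.values());                 // TreeMap iterates in start order
        return groups;
    }

    public static void main(String[] args) {
        String expression = "\\b((\\d{1,2})/(\\d{1,2})/(\\d{2,4}))\\b";
        List<String> groups = findGroups(expression);
        for (int i = 0; i < groups.size(); i++) {
            System.out.println(i + " " + groups.get(i));
        }
    }
}
```

Sorting by the position of the opening parenthesis gives the same numbering that java.util.regex uses for capturing groups, which is why the outer group comes before the three inner ones.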

Related

Replace all doubled or tripled letters with single ones

Task:
For a given string of characters consisting only of letters: a, b and
c swap all doubled or tripled letters for single ones
I prepared such a code:
public static String doubleLetters(String str) {
StringBuilder ret = new StringBuilder(str.length());
if (str.length() == 0) return "";
for (int i = 1; i < str.length(); i++)
{
if(str.charAt(i) == str.charAt(i-1)
|| str.charAt(i) == str.charAt(i-1) && str.charAt(i) == str.charAt(i-2))
{
ret.append(str.charAt(i));
}
}
return ret.toString();
}
However, I cannot define the condition to take into account tripled letters.
By entering "aaabbbccc" I want "abc".
By entering "aabbcc" I want "abc".
By entering "aaaaabbbbbbbccc" I want "aabbbc".
IMPORTANT
Letters that are converted to 1 letter are not taken into account.
How should I approach this problem?
It is not entirely clear whether you want to replace only triple or double letters or a repeated character of any length by a single one. I'm assuming the latter one:
public static String eliminateMultipleLetters(String s) {
if (s.isEmpty())
return s; // nothing to collapse; also avoids charAt(-1) below
StringBuilder sb = new StringBuilder(); // better for loops than concatenation
for (int i = 0; i < s.length() - 1; i++) {
if (s.charAt(i) != s.charAt(i + 1))
sb.append(s.charAt(i));
}
sb.append(s.charAt(s.length() - 1)); // append last character
return sb.toString();
}
Edit: to replace 3 characters by 1 as long as there are 3 and then 2 if possible, you could do as follows (the logic is very similar, just the step at the end gets more complicated):
public static String replace3or2Letters(String s) {
if (s.length() < 2)
return s;
StringBuilder sb = new StringBuilder();
int i;
for (i = 0; i < s.length() - 2; i++) {
sb.append(s.charAt(i));
if (s.charAt(i) == s.charAt(i + 1)) {
if (s.charAt(i) == s.charAt(i + 2))
i += 2;
else
i++;
}
}
if (i == s.length() - 2) {
sb.append(s.charAt(s.length() - 2));
if (s.charAt(s.length() - 2) != s.charAt(s.length() - 1)) // compare the last two characters
sb.append(s.charAt(s.length() - 1));
} else if (i == s.length() - 1) {
sb.append(s.charAt(s.length() - 1));
}
return sb.toString();
}
More elegant:
public static String replaceUpToXbySingle(String s, int x) { // x = 3 for you
StringBuilder sb = new StringBuilder();
char last = 'c'; // placeholder; overwritten on the first iteration because count starts at 0
int count = 0;
for (int i = 0; i < s.length(); i++) {
if (count == 0 || s.charAt(i) != last) {
if (count > 0)
sb.append(last);
last = s.charAt(i);
count = 1;
} else if (++count == x) {
sb.append(last);
count = 0;
}
}
if (count > 0)
sb.append(last);
return sb.toString();
}
Updated response
If you want to replace groups of three, followed by groups of two, you need to build a list of contiguous frequencies. After you have this list, you could build a string by applying div/mod logic to the total.
I included a Pair class that implements Map.Entry and stores key-value associations.
import java.util.*;
public class StringUtil {
public static void main(String[] args) {
System.out.println(dedupe("aabbcc").equals("abc"));
System.out.println(dedupe("aaabbbccc").equals("abc"));
System.out.println(dedupe("aaaaabbbbbbbccc").equals("aabbbc"));
}
public static String dedupe(String str) {
if (str == null || str.isEmpty()) {
return str;
}
StringBuilder buffer = new StringBuilder();
List<Pair<Character, Integer>> pairs = new ArrayList<>();
char[] chars = str.toCharArray();
char curr, prev = chars[0];
int total = 0, i, add3, add2;
for (i = 1; i < chars.length; i++) {
curr = chars[i];
total++;
if (curr != prev) {
pairs.add(new Pair<>(prev, total));
total = 0;
prev = curr;
}
}
total++;
pairs.add(new Pair<>(prev, total));
for (Pair<Character, Integer> pair : pairs) {
total = pair.getValue();
add3 = total / 3;
for (i = 0; i < add3; i++) {
buffer.append(pair.getKey());
}
total %= 3;
add2 = total / 2;
for (i = 0; i < add2; i++) {
buffer.append(pair.getKey());
}
total %= 2;
for (i = 0; i < total; i++) {
buffer.append(pair.getKey());
}
}
return buffer.toString();
}
private static final class Pair<K, V> implements Map.Entry<K, V> {
private final K key;
private V value;
public Pair(K key, V value) {
this.key = key;
this.value = value;
}
@Override
public K getKey() {
return key;
}
@Override
public V getValue() {
return value;
}
@Override
public V setValue(V value) {
V old = this.value;
this.value = value;
return old;
}
}
}
Original response
All you need to do is store the previous (prev) value, loop over the characters, and append to the buffer whenever the current character (curr) does not match the previous one.
public class StringUtil {
public static void main(String[] args) {
System.out.println(dedupe("aaabbbccc")); // "abc"
}
public static String dedupe(String str) {
StringBuilder buffer = new StringBuilder();
char prev = 0;
for (char curr : str.toCharArray()) {
if (curr != prev) {
buffer.append(curr);
prev = curr;
}
}
return buffer.toString();
}
}
Another approach: count the number of consecutive occurrences of a character, and then print them in bulk, by dividing their number by three.
public static String doubleLetters(String str) {
StringBuilder ret = new StringBuilder(str.length());
if (str.length() == 0) return "";
int count = 1;
for (int i = 1; i < str.length(); i++)
{
if (str.charAt(i) == str.charAt(i-1)) {
count++;
continue;
}
for (; count > 0; count -= 3)
ret.append(str.charAt(i-1));
count = 1;
}
for (; count > 0; count -= 3)
ret.append(str.charAt(str.length() - 1));
return ret.toString();
}
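As a cross-check for the counting loop above, the same "three first, then two" rule can also be written as a single regular-expression replacement. This is only a sketch and not part of the original answers; the class name RunCollapser and the method collapse are hypothetical.

```java
final class RunCollapser {
    /** Replaces each tripled letter, and then each remaining doubled letter, with one letter. */
    static String collapse(String s) {
        // (.) captures one character; \1{1,2} greedily matches one or two repeats of it,
        // so a run of three is consumed before a run of two is considered.
        return s.replaceAll("(.)\\1{1,2}", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("aaabbbccc"));
        System.out.println(collapse("aabbcc"));
        System.out.println(collapse("aaaaabbbbbbbccc"));
    }
}
```

Because the quantifier is greedy and matching proceeds left to right, a run of five letters is consumed as a group of three followed by a group of two, which is the behaviour the question asks for.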
However, I cannot define the condition to take into account tripled
letters.
By entering "aaabbbccc" I want "abc".
By entering "aabbcc" I want "abc".
By entering "aaaaabbbbbbbccc" I want "aabbbc".
IMPORTANT Letters that are converted to 1 letter are not taken into
account.
Imagine this problem in three stages:
First, generate a sequence (or a stream) of all the single, double, or triple letter occurrences in your input
Next, replace each of the occurrences in the sequence with its first letter
Finally, put the sequence (or stream) back together into a single String instance
To realize this solution very readably, I would use a main method and a static helper method as follows:
public static String deDupe(String input) {
String result = generateStreamOfOccurrences(input)
.map(occurrence -> occurrence.substring(0, 1))
.collect(Collectors.joining());
return result;
}
private static Stream<String> generateStreamOfOccurrences(String input) {
List<String> listOfOccs = new ArrayList<>();
if (input != null) {
while (input.length() > 0) {
String occ = input.substring(0, 1);
if (input.length() > 2 && input.substring(1, 3).equals(occ + occ))
occ = input.substring(0, 3);
if (input.length() > 1 && input.substring(1, 2).equals(occ))
occ = input.substring(0, 2);
input = input.substring(occ.length());
listOfOccs.add(occ);
}
}
return listOfOccs.stream();
}
The main function is easy to read. It obtains a stream of letter occurrences in the String, each of which is a single, double, or triple of its first letter. The work of obtaining this Stream<String> pipeline is encapsulated in the helper function generateStreamOfOccurrences().
However, I cannot define the condition to take into account tripled
letters.
By entering "aaabbbccc" I want "abc".
By entering "aabbcc" I want "abc".
By entering "aaaaabbbbbbbccc" I want "aabbbc".
IMPORTANT Letters that are converted to 1 letter are not taken into
account.
Imagine this problem in three stages:
First, generate a sequence (or a stream) of all the single, double, or triple letter occurrences in your input
Next, replace each of the occurrences in the sequence with its first letter
Finally, put the sequence (or stream) back together into a single String instance
To realize this solution very readably, I would use a main method and a static helper method as follows:
public static String deDupe(String input) {
String result = generateStreamOfOccurrences(input)
.map(occurrence -> occurrence.substring(0, 1))
.collect(Collectors.joining());
return result;
}
private static Stream<String> generateStreamOfOccurrences(String str) {
String input = str;
List<String> listOfOccs = new ArrayList<>();
if (input != null) {
while (input.length() > 0) {
String occ = input.substring(0, 1);
if (input.length() > 2 && input.substring(1, 3).equals(occ + occ)) {
occ = input.substring(0, 3);
}
else if (input.length() > 1 && input.substring(1, 2).equals(occ)) {
occ = input.substring(0, 2);
}
input = input.substring(occ.length());
listOfOccs.add(occ);
}
}
return listOfOccs.stream();
}
The main function is easy to read. It obtains a stream of letter occurrences in the String, each of which is a single, double, or triple of its first letter. The work of obtaining this Stream<String> pipeline is encapsulated in the helper function generateStreamOfOccurrences(). Each element in the stream is then replaced by its first letter, and the results are joined together to form the output.

Java code unable to find genes in a dna string

The following code is from a Java MOOC and is supposed to find all genes in a given file. The problem is that my getAllGenes method does not return any genes for a given DNA string. I don't know what is wrong with the code.
The file I'm testing against (brca1line.fa) is located here https://github.com/polde-live/duke-java-1/tree/master/AllGenesFinder/dna
Thank you.
This is my java code.
public class AllGenesStored {
public int findStopCodon(String dnaStr,
int startIndex,
String stopCodon){
int currIndex = dnaStr.indexOf(stopCodon,startIndex+3);
while (currIndex != -1 ) {
int diff = currIndex - startIndex;
if (diff % 3 == 0) {
return currIndex;
}
else {
currIndex = dnaStr.indexOf(stopCodon, currIndex + 1);
}
}
return -1;
}
public String findGene(String dna, int where) {
int startIndex = dna.indexOf("ATG", where);
if (startIndex == -1) {
return "";
}
int taaIndex = findStopCodon(dna,startIndex,"TAA");
int tagIndex = findStopCodon(dna,startIndex,"TAG");
int tgaIndex = findStopCodon(dna,startIndex,"TGA");
int minIndex = 0;
if (taaIndex == -1 ||
(tgaIndex != -1 && tgaIndex < taaIndex)) {
minIndex = tgaIndex;
}
else {
minIndex = taaIndex;
}
if (minIndex == -1 ||
(tagIndex != -1 && tagIndex < minIndex)) {
minIndex = tagIndex;
}
if (minIndex == -1){
return "";
}
return dna.substring(startIndex,minIndex + 3);
}
public StorageResource getAllGenes(String dna) {
//create an empty StorageResource, call it geneList
StorageResource geneList = new StorageResource();
//Set startIndex to 0
int startIndex = 0;
//Repeat the following steps
while ( true ) {
//Find the next gene after startIndex
String currentGene = findGene(dna, startIndex);
//If no gene was found, leave this loop
if (currentGene.isEmpty()) {
break;
}
//Add that gene to geneList
geneList.add(currentGene);
//Set startIndex to just past the end of the gene
startIndex = dna.indexOf(currentGene, startIndex) +
currentGene.length();
}
//Your answer is geneList
return geneList;
}
public void testOn(String dna) {
System.out.println("Testing getAllGenes on " + dna);
StorageResource genes = getAllGenes(dna);
for (String g: genes.data()) {
System.out.println(g);
}
}
public void test() {
FileResource fr = new FileResource();
String dna = fr.asString();
// ATGv TAAv ATG v v TGA
//testOn("ATGATCTAATTTATGCTGCAACGGTGAAGA");
testOn(dna);
// ATGv v v v TAAv v v ATGTAA
//testOn("ATGATCATAAGAAGATAATAGAGGGCCATGTAA");
}
}
The data in the file looks like "acaagtttgtacaaaaaagcagaagggccgtcaaggcccaccatgcctattggatccaaagagaggccaacatttttt". You are searching for uppercase codons in a string that contains only lowercase characters, so String.indexOf will never find ATG, TAA, TAG or TGA. Change the search strings to lower case:
int startIndex = dna.indexOf("atg", where);
...
int taaIndex = findStopCodon(dna,startIndex,"taa");
int tagIndex = findStopCodon(dna,startIndex,"tag");
int tgaIndex = findStopCodon(dna,startIndex,"tga");
Responding to the comment below: if you want to be able to handle mixed case, as in your tests, you need to normalize the string first, for example with toLowerCase().
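A minimal sketch of that normalization, assuming the whole input is converted once to upper case so that the codon literals from the question can stay as they are. The class name CaseInsensitiveGeneSearch and the method findTaaGene are hypothetical, and only the TAA stop codon is handled to keep the example short.

```java
import java.util.Locale;

final class CaseInsensitiveGeneSearch {
    // Same frame-aware search as findStopCodon in the question.
    static int findStopCodon(String dna, int startIndex, String stopCodon) {
        int curr = dna.indexOf(stopCodon, startIndex + 3);
        while (curr != -1) {
            if ((curr - startIndex) % 3 == 0) {
                return curr;                      // stop codon is in frame with the start
            }
            curr = dna.indexOf(stopCodon, curr + 1);
        }
        return -1;
    }

    // Returns the first ATG...TAA gene in upper case, or "" if there is none.
    static String findTaaGene(String rawDna) {
        String dna = rawDna.toUpperCase(Locale.ROOT); // normalize once, up front
        int start = dna.indexOf("ATG");
        if (start == -1) {
            return "";
        }
        int stop = findStopCodon(dna, start, "TAA");
        return stop == -1 ? "" : dna.substring(start, stop + 3);
    }

    public static void main(String[] args) {
        System.out.println(findTaaGene("atgatctaatttatgctgcaacggtgaaga"));
        System.out.println(findTaaGene("AtGatCTaaTTT"));
    }
}
```

Normalizing the input rather than the search strings means the rest of the code does not need to know which case the file uses.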

Autocomplete byReverseWeightOrder comparator issue

I have been working on this problem for several hours now and I just cannot figure out what I am doing wrong here. Could anyone help point me in the right direction?
I was asked to write an Autocomplete program and I've completed everything except for this one method I cannot get working. Each term has: 1. String query and 2. long weight.
Here is the method:
public static Comparator<Term> byReverseWeightOrder() {
return new Comparator<Term>() { // LINE CAUSING PROBLEM
public int compare(Term t1, Term t2) {
if (t1.weight > t2.weight) { // LINE CAUSING PROBLEM
return -1;
} else if (t1.weight == t2.weight) {
return 0;
} else {
return 1;
}
}
};
}
My problem is that no matter how I change the method, I always get a NullPointerException. The stack trace points to this method (byReverseWeightOrder) as well as these two statements:
Arrays.sort(matches, Term.byReverseWeightOrder());
Term[] results = autocomplete.allMatches(prefix);
Here is the rest of the code if it can be found helpful:
Term
import java.util.Comparator;
public class Term implements Comparable<Term> {
public String query;
public long weight;
public Term(String query, long weight) {
if (query == null) {
throw new java.lang.NullPointerException("Query cannot be null");
}
if (weight < 0) {
throw new java.lang.IllegalArgumentException("Weight cannot be negative");
}
this.query = query;
this.weight = weight;
}
public static Comparator<Term> byReverseWeightOrder() {
return new Comparator<Term>() {
public int compare(Term t1, Term t2) {
if (t1.weight > t2.weight) {
return -1;
} else if (t1.weight == t2.weight) {
return 0;
} else {
return 1;
}
}
};
}
public static Comparator<Term> byPrefixOrder(int r) {
if (r < 0) {
throw new java.lang.IllegalArgumentException("Cannot order with negative number of characters");
}
final int ref = r;
return
new Comparator<Term>() {
public int compare(Term t1, Term t2) {
String q1 = t1.query;
String q2 = t2.query;
int min;
if (q1.length() < q2.length()) {
min = q1.length();
}
else {
min = q2.length();
}
if (min >= ref) {
return q1.substring(0, ref).compareTo(q2.substring(0, ref));
}
else if (q1.substring(0, min).compareTo(q2.substring(0, min)) == 0) {
if (q1.length() == min) {
return -1;
}
else {
return 1;
}
}
else {
return q1.substring(0, min).compareTo(q2.substring(0, min));
}
}
};
}
public int compareTo(Term that) {
String q1 = this.query;
String q2 = that.query;
return q1.compareTo(q2);
}
public long getWeight() {
return this.weight;
}
public String toString() {
return this.weight + "\t" + this.query;
}
}
BinarySearchDeluxe
import java.lang.*;
import java.util.*;
import java.util.Comparator;
public class BinarySearchDeluxe {
public static <Key> int firstIndexOf(Key[] a, Key key, Comparator<Key> comparator) {
if (a == null || key == null || comparator == null) {
throw new java.lang.NullPointerException();
}
if (a.length == 0) {
return -1;
}
int left = 0;
int right = a.length - 1;
while (left + 1 < right) {
int middle = left + (right - left)/2;
if (comparator.compare(key, a[middle]) <= 0) {
right = middle;
} else {
left = middle;
}
}
if (comparator.compare(key, a[left]) == 0) {
return left;
}
if (comparator.compare(key, a[right]) == 0) {
return right;
}
return -1;
}
public static <Key> int lastIndexOf(Key[] a, Key key, Comparator<Key> comparator) {
if (a == null || key == null || comparator == null) {
throw new java.lang.NullPointerException();
}
if (a == null || a.length == 0) {
return -1;
}
int left = 0;
int right = a.length - 1;
while (left + 1 < right) {
int middle = left + (right - left)/2;
if (comparator.compare(key, a[middle]) < 0) {
right = middle;
} else {
left = middle;
}
}
if (comparator.compare(key, a[right]) == 0) {
return right;
}
if (comparator.compare(key, a[left]) == 0) {
return left;
}
return -1;
}
}
AutoComplete
import java.util.Arrays;
import java.util.Scanner;
import java.io.File;
import java.io.IOException;
import java.util.Comparator;
public class Autocomplete {
public Term[] terms;
public Autocomplete(Term[] terms) {
if (terms == null) {
throw new java.lang.NullPointerException();
}
this.terms = terms.clone();
Arrays.sort(this.terms);
}
public Term[] allMatches(String prefix) {
if (prefix == null) {
throw new java.lang.NullPointerException();
}
Term theTerm = new Term(prefix, 0);
int start = BinarySearchDeluxe.firstIndexOf(terms, theTerm, Term.byPrefixOrder(prefix.length()));
int end = BinarySearchDeluxe.lastIndexOf(terms, theTerm, Term.byPrefixOrder(prefix.length()));
int count = start;
System.out.println("Start: " + start + " End: " + end);
if (start == -1 || end == -1) {
// System.out.println("PREFIX: " + prefix);
throw new java.lang.NullPointerException();
} // Needed?
Term[] matches = new Term[end - start + 1];
//matches = Arrays.copyOfRange(terms, start, end);
for (int i = 0; i < end - start; i++) {
matches[i] = this.terms[count];
count++;
}
Arrays.sort(matches, Term.byReverseWeightOrder());
System.out.println("Finished allmatches");
return matches;
}
public int numberOfMatches(String prefix) {
if (prefix == null) {
throw new java.lang.NullPointerException();
}
Term theTerm = new Term(prefix, 0);
int start = BinarySearchDeluxe.firstIndexOf(terms, theTerm, Term.byPrefixOrder(prefix.length()));
int end = BinarySearchDeluxe.lastIndexOf(terms, theTerm, Term.byPrefixOrder(prefix.length()));
System.out.println("Finished numberMatches");
return end - start + 1; // +1 needed?
}
public static void main(String[] args) throws IOException {
// Read the terms from the file
Scanner in = new Scanner(new File("wiktionary.txt"));
int N = in.nextInt(); // Number of terms in file
Term[] terms = new Term[N];
for (int i = 0; i < N; i++) {
long weight = in.nextLong(); // read the next weight
String query = in.nextLine(); // read the next query
terms[i] = new Term(query.replaceFirst("\t",""), weight); // construct the term
}
Scanner ip = new Scanner(System.in);
// TO DO: Data Validation Here
int k;
do {
System.out.println("Enter how many matching terms do you want to see:");
k = ip.nextInt();
} while (k < 1 || k > N);
Autocomplete autocomplete = new Autocomplete(terms);
// TO DO: Keep asking the user to enter the prefix and show results till user quits
boolean cont = true;
do {
// Read in queries from standard input and print out the top k matching terms
System.out.println("Enter the term you are searching for. Enter * to exit");
String prefix = ip.next();
if (prefix.equals("*")) {
cont = false;
break;
}
Term[] results = autocomplete.allMatches(prefix);
System.out.println(results.length);
for(int i = 0; i < Math.min(k,results.length); i++)
System.out.println(results[i].toString());
} while(cont);
System.out.println("Done!");
}
}
I apologize for the sloppy code. I have been pulling my hair out for a while now and keep forgetting to clean it up.
Two examples:
Example 1:
int k = 2;
String prefix = "auto";
Enter how many matching terms do you want to see:
2
Enter the term you are searching for. Enter * to exit
auto
619695 automobile
424997 automatic
Example 2:
int k = 5;
String prefix = "the";
Enter how many matching terms do you want to see:
5
Enter the term you are searching for. Enter * to exit
the
5627187200 the
334039800 they
282026500 their
250991700 them
196120000 there
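The thread records no answer, but one thing can be checked from the posted code itself: in allMatches, the matches array has end - start + 1 slots while the copy loop runs only end - start times, so the last slot stays null and the comparator then dereferences it in t1.weight or t2.weight. The sketch below assumes that this is the cause of the exception. MatchCopyDemo and its nested Term are cut-down, hypothetical stand-ins for the classes in the question.

```java
import java.util.Arrays;

final class MatchCopyDemo {
    // Simplified stand-in for the Term class in the question.
    static final class Term {
        final String query;
        final long weight;

        Term(String query, long weight) {
            this.query = query;
            this.weight = weight;
        }
    }

    /** Copies sorted[start..end] (both inclusive) and orders the copy by descending weight. */
    static Term[] matchesByWeight(Term[] sorted, int start, int end) {
        // copyOfRange's upper bound is exclusive, so end + 1 keeps the last match
        // and leaves no null slot behind.
        Term[] matches = Arrays.copyOfRange(sorted, start, end + 1);
        Arrays.sort(matches, (a, b) -> Long.compare(b.weight, a.weight));
        return matches;
    }

    public static void main(String[] args) {
        Term[] terms = {
            new Term("auto", 10),
            new Term("automatic", 424997),
            new Term("automobile", 619695)
        };
        for (Term t : matchesByWeight(terms, 0, 2)) {
            System.out.println(t.weight + "\t" + t.query);
        }
    }
}
```

Keeping the hand-written loop also works if its bound is changed to i <= end - start; the important part is that every element of the array is filled before it is sorted.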

puzzling recursion with java

Given a string and a non-empty substring sub, compute recursively the largest substring which starts and ends with sub and return its length.
strDist("catcowcat", "cat") → 9
strDist("catcowcat", "cow") → 3
strDist("cccatcowcatxx", "cat") → 9
My solution:
public int strDist(String str, String sub) {
int i = sub.length();
int j = str.length();
int count = 0;
if (str.length() == 1 && str.equals(sub)) {
return 1;
} else if (str.length() < sub.length() || str.length() <= 1) {
return 0;
}
if (str.substring(0, i).equals(sub)) {
if (str.substring(str.length() - i, str.length()).equals(sub)) {
return str.length();
} else {
strDist(str.substring(0, str.length() - i), sub);
}
} else {
strDist(str.substring(1, str.length()), sub);
}
return 0;
}
How can I correct my code?
Why does this need to be done with recursion?
Edit: fixed code to handle case where sub is not present in str, or only present once.
public int strDist(String str, String sub) {
int first = str.indexOf(sub);
if (first == -1) {
return 0; // sub does not occur in str
}
int last = str.lastIndexOf(sub);
// a single occurrence gives first == last, so the result is sub.length()
return last - first + sub.length();
}
Recursion is great, if it is suited to the problem. In this case, recursion doesn't add value, and writing it with recursion for the sake of recursion makes the code inefficient.
This will "compute recursively the largest substring which starts and ends with sub and return its length", as you described.
public class PuzzlingRecursion {
static String substringFound = "";
public static void main(String[] args) {
String sentence = "catcowcat";
String substring = "cat";
int sizeString = findNumberOfStrings(sentence, substring, 0);
System.out.println("you are searching for: " + substring);
System.out.println("in: " + sentence);
System.out.println("substring which starts and ends with sub and return its length is:"+substringFound + ", " + sizeString);
}
private static int findNumberOfStrings(String subStringPassed,
String setenecePassed, int count) {
if (subStringPassed.length() == 0) {
return count + 0;
}
if (subStringPassed.length() < setenecePassed.length()) {
return count + 0;
}
count++;
String lastStringMiddle = subStringPassed.replaceAll("(.*?)" + "("
+ setenecePassed + ")" + "(.*?)" + "(" + setenecePassed + ")"
+ "(.*?.*)", "$3");
if (subStringPassed.contains(setenecePassed)
&& lastStringMiddle.length() != setenecePassed.length()) {
if (subStringPassed.contains(setenecePassed)
&& lastStringMiddle.contains(setenecePassed)) {
// only found one item no pattern but according to the example
// you posted it should return the length of one word/substring
count = setenecePassed.length();
substringFound = subStringPassed;
return count;
}
}
// makesure the lastSrtringMiddle has the key we are search
if (!lastStringMiddle.equals(subStringPassed)) {
subStringPassed = subStringPassed.replaceFirst(setenecePassed, "");
String lastString = subStringPassed.substring(0,
subStringPassed.lastIndexOf(setenecePassed));
if (null != lastString && !"".equals(lastString)) {
count = lastStringMiddle.length() + setenecePassed.length()
+ setenecePassed.length();
substringFound = setenecePassed + lastStringMiddle
+ setenecePassed;
subStringPassed = "";
}
return findNumberOfStrings(subStringPassed, setenecePassed, count);
}
return count;
}
}
I think this is a much nicer recursive solution:
public int strDist(String str, String sub) {
if (str.length()==0) return 0;
if (!str.startsWith(sub))
return strDist(str.substring(1),sub);
if (!str.endsWith(sub))
return strDist(str.substring(0,str.length()-1),sub);
return str.length();
}
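For comparison with the attempt in the question: that version falls through to return 0 because the results of its recursive calls are discarded, and it trims sub.length() characters from the end instead of one. The sketch below keeps the asker's if/else structure and changes only those two points. The class name StrDistFixed is hypothetical, and sub is assumed to be non-empty as the problem statement says.

```java
final class StrDistFixed {
    static int strDist(String str, String sub) {
        if (str.length() < sub.length()) {
            return 0;                         // too short to contain sub at all
        }
        if (str.startsWith(sub)) {
            if (str.endsWith(sub)) {
                return str.length();          // starts and ends with sub
            }
            // return the recursive result, trimming a single character from the end
            return strDist(str.substring(0, str.length() - 1), sub);
        }
        // return the recursive result, trimming a single character from the front
        return strDist(str.substring(1), sub);
    }

    public static void main(String[] args) {
        System.out.println(strDist("catcowcat", "cat"));
        System.out.println(strDist("catcowcat", "cow"));
        System.out.println(strDist("cccatcowcatxx", "cat"));
    }
}
```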
