Get all captured groups in Java - java

I want to match a single word inside brackets (including the brackets). My regex below works, but it's not returning all the groups.
Here's my code:
String text = "This_is_a_[sample]_text[notworking]";
Matcher matcher = Pattern.compile("\\[([a-zA-Z_]+)\\]").matcher(text);
if (matcher.find()) {
for (int i = 0; i <= matcher.groupCount(); i++) {
System.out.println("------------------------------------");
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
Also I've tested it in Regex Planet and it seems to work.
I expected it to return 4 groups:
------------------------------------
Group 0: [sample]
------------------------------------
Group 1: sample
------------------------------------
Group 2: [notworking]
------------------------------------
Group 3: notworking
But it returns only this:
------------------------------------
Group 0: [sample]
------------------------------------
Group 1: sample
What's wrong?

Java does not offer a global option to find all the matches at once, so you need a while loop here:
int i = 0;
while (matcher.find()) {
for (int j = 0; j <= matcher.groupCount(); j++) {
System.out.println("------------------------------------");
System.out.println("Group " + i + ": " + matcher.group(j));
i++;
}
}

Groups are not meant to find several matches. They identify several subparts of a single match, e.g. the expression "([A-Za-z]*):([A-Za-z]*)" would match a key-value pair, and you could get the key as group 1 and the value as group 2.
There is only one group (one pair of parentheses) in your expression, so only group 0 (always the whole match, independent of the groups you define) and group 1 (the single group you defined) are returned.
In your case, call find repeatedly if you want more matches:
int i = 0;
while (matcher.find()) {
System.out.println("Match " + i + ": " + matcher.group(1));
i++;
}
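To illustrate the difference between groups and matches, here is a minimal sketch built on the key-value expression mentioned in this answer. The class name, method name, and sample input are made up for the illustration, and the pattern uses + instead of * so that empty keys and values are not matched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupsVersusMatches {
    // Two groups per match: group 1 is the key, group 2 is the value.
    // Assumption: + is used instead of * so that a lone ":" does not match.
    private static final Pattern PAIR = Pattern.compile("([A-Za-z]+):([A-Za-z]+)");

    /** Returns one "key=value" entry per match found in the text. */
    static List<String> pairs(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = PAIR.matcher(text);
        // Each find() call advances to the next match; the groups
        // always describe the parts of the current match only.
        while (matcher.find()) {
            found.add(matcher.group(1) + "=" + matcher.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(pairs("color:red size:big")); // prints [color=red, size=big]
    }
}
```

The number of groups is fixed by the pattern (two here), while the number of matches depends on the input.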

Also, if you know how many matches to expect, you can collect the groups into a list and then read the values from the list wherever you need to assign them:
public static List<String> groupList(String text, Pattern pattern) {
List<String> list = new ArrayList<>();
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
for (int i = 1; i <= matcher.groupCount(); i++) {
list.add(matcher.group(i));
}
}
return list;
}
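On Java 9 or later, the same collection step can probably be written with Matcher.results(), which streams one MatchResult per match. This is only a sketch; the class and method names are hypothetical, and the pattern is the one from the question.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class BracketWords {
    // Pattern from the question: one word inside square brackets.
    private static final Pattern BRACKETED = Pattern.compile("\\[([a-zA-Z_]+)\\]");

    /** Collects group 1 of every match, i.e. each word without its brackets. */
    static List<String> words(String text) {
        return BRACKETED.matcher(text)
                .results()                       // Stream<MatchResult>, available since Java 9
                .map(result -> result.group(1))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(words("This_is_a_[sample]_text[notworking]")); // prints [sample, notworking]
    }
}
```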

Related

How can I find compound string in text

I have been searching for a way to find compound strings such as howareyou in a sentence and remove them from it. For example:
We have a sentence - Hello there, how are you?
And a compound - how are you
As a result I want to have this string - Hello there, ? (with the compound removed).
My current solution splits the string into words and checks whether the compound contains each word, but it doesn't work well: any other word that happens to occur inside the compound is also removed, e.g.:
If we look for foreseenfuture in the string - I have foreseen future for all of you - then my solution also removes for, because it occurs inside the compound.
Code
String[] words = text.split("[^a-zA-Z]");
String compound = "foreseenfuture";
int startIndex = -1;
int endIndex = -1;
for(String word : words){
if(compound.contains(word)){
if(startIndex == -1){
startIndex = text.indexOf(word);
}
endIndex = text.indexOf(word) + word.length() - 1;
}
}
if(startIndex != -1 && endIndex != -1){
text = text.substring(0, startIndex) + "" + text.substring(endIndex + 1, text.length() - 1);
}
So, is there any other way to solve this?
I'm going to assume that when you compound you only remove whitespace. So with this assumption "for,seen future. for seen future" would become "for,seen future. " since the comma breaks up the other compound. In this case then this should work:
String example1 = "how are you?";
String example2 = "how, are you... here?";
String example3 = "Madam, how are you finding the accommodations?";
String example4 = "how are you how are you how are you taco";
String compound = "howareyou";
StringBuilder compoundRegexBuilder = new StringBuilder();
//This matches to a word boundary before the first word
compoundRegexBuilder.append("\\b");
// inserts each character into the regex
for(int i = 0; i < compound.length(); i++) {
compoundRegexBuilder.append(compound.charAt(i));
// between each letter there could be any amount of whitespace
if(i<compound.length()-1) {
compoundRegexBuilder.append("\\s*");
}
}
// Makes sure the last word isn't part of a larger word
compoundRegexBuilder.append("\\b");
String compoundRegex = compoundRegexBuilder.toString();
System.out.println(compoundRegex);
System.out.println("Example 1:\n" + example1 + "\n" + example1.replaceAll(compoundRegex, ""));
System.out.println("\nExample 2:\n" + example2 + "\n" + example2.replaceAll(compoundRegex, ""));
System.out.println("\nExample 3:\n" + example3 + "\n" + example3.replaceAll(compoundRegex, ""));
System.out.println("\nExample 4:\n" + example4 + "\n" + example4.replaceAll(compoundRegex, ""));
The output is as follows:
\bh\s*o\s*w\s*a\s*r\s*e\s*y\s*o\s*u\b
Example 1:
how are you?
?
Example 2:
how, are you... here?
how, are you... here?
Example 3:
Madam, how are you finding the accommodations?
Madam, finding the accommodations?
Example 4:
how are you how are you how are you taco
taco
You can also use this to match any other alpha-numeric compound.
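The regex-building loop above could be wrapped in a reusable method, roughly as sketched below. The class and method names are hypothetical. Quoting each character with Pattern.quote is an addition that keeps regex metacharacters in the compound from being interpreted; the \b anchors still assume that the compound starts and ends with a word character.

```java
import java.util.regex.Pattern;

public class CompoundRemover {
    /** Builds \b c1 \s* c2 \s* ... cn \b for the characters of the compound. */
    static Pattern compoundPattern(String compound) {
        StringBuilder regex = new StringBuilder("\\b");
        for (int i = 0; i < compound.length(); i++) {
            if (i > 0) {
                regex.append("\\s*"); // any amount of whitespace between letters
            }
            // Quote the character so that e.g. '.' or '(' is matched literally.
            regex.append(Pattern.quote(String.valueOf(compound.charAt(i))));
        }
        return Pattern.compile(regex.append("\\b").toString());
    }

    /** Removes every occurrence of the compound, ignoring whitespace inside it. */
    static String removeCompound(String text, String compound) {
        return compoundPattern(compound).matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(removeCompound("Hello there, how are you?", "howareyou")); // prints Hello there, ?
    }
}
```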

How do I output to the console in a specific layout?

I am working on a small project that takes user input (match results) on one line, splits the input and outputs the same data in a different format. I am struggling to find a way to output the data in a specific format. As well as total games played, I want my program to produce a chart like output in the format
home_name [home_score] | away_name [away_score]
This is the code I have at the minute which allows users to input results line after line in the following format
home_name : away_name : home_score : away_score
until they enter stop, which breaks the loop (and hopefully soon outputs the data).
import java.util.*;
public class results {
public static void main(String[] args) {
Scanner scan = new Scanner(System.in);
int totalGames = 0;
String input = null;
System.out.println("Please enter results in the following format"
+ " home_name : away_name : home_score : away_score"
+ ", or enter stop to quit");
while (null != (input = scan.nextLine())){
if ("stop".equals(input)){
break;
}
String results[] = input.split(" : ");
for (int x = 0; x < results.length; x++) {
}
totalGames++;
}
System.out.println("Total games played is " + totalGames);
}
}
You can format your text as you wish with a format string.
The general syntax is
%[arg_index$][flags][width][.precision]conversion char
Argument numbering starts with 1 (not 0). So to print the first argument, you should use 1$ (if you are using explicit ordering).
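As a small illustration of explicit argument indices, the sketch below turns one input line into the requested layout. The class and method names are hypothetical, and the input is assumed to use exactly the " : " separator from the question.

```java
public class ResultFormatter {
    /** Converts "home : away : homeScore : awayScore" to "home [homeScore] | away [awayScore]". */
    static String formatResult(String line) {
        String[] parts = line.split(" : ");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 fields: " + line);
        }
        // The arguments are passed in input order; the 1$..4$ indices
        // pick them out in display order.
        return String.format("%1$s [%3$s] | %2$s [%4$s]",
                parts[0], parts[1], parts[2], parts[3]);
    }

    public static void main(String[] args) {
        System.out.println(formatResult("team1 : team2 : 1 : 0")); // prints team1 [1] | team2 [0]
    }
}
```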
You can use a regex to parse the line:
(\w+)\s(\w+)\s\|\s(\w+)\s(\w+)
Based on the Java code from http://tutorials.jenkov.com/java-regex/matcher.html:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class MatcherFindStartEndExample{
public static void main(String[] args){
String text = "Belenenses 6 | Benfica 0";
String patternString = "(\\w+)\\s(\\w+)\\s\\|\\s(\\w+)\\s(\\w+)";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(text);
while (matcher.find()){
System.out.println("found: " + matcher.group(1));
System.out.println("found: " + matcher.group(2));
System.out.println("found: " + matcher.group(3));
System.out.println("found: " + matcher.group(4));
}
}
}
Use this code instead of your:
String results[] = input.split(" : ");
for (int x = 0; x < results.length; x++) {
}
You should do this in two steps:
1) Retrieve the information entered by the user and store it in instances of a custom class, PlayerResult.
2) Produce the output in the expected format. You should also compute the maximum width of each column before rendering the table; otherwise the rendering could be ugly.
First step:
List<PlayerResult> playerResults = new ArrayList<PlayerResult>();
...
String[] results = input.split(" : ");
playerResults.add(new PlayerResult(results[0], results[1], results[2], results[3]));
Second step :
// compute length of column
int[] lengthByColumn = computeLengthByColumn(playerResults);
int lengthHomeColumn = lengthByColumn[0];
int lengthAwayColumn = lengthByColumn[1];
// render header
System.out.print(adjustLength("home_name [home_score]", lengthHomeColumn));
System.out.println(adjustLength("away_name [away_score]", lengthAwayColumn));
// render data
for (PlayerResult playerResult : playerResults){
System.out.print(adjustLength(playerResult.getHomeName() + "[" + playerResult.getHomeScore() + "]", lengthHomeColumn));
System.out.println(adjustLength(playerResult.getAwayName() + "[" + playerResult.getAwayScore() + "]", lengthAwayColumn));
}
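The answer names PlayerResult, computeLengthByColumn and adjustLength without defining them. One possible shape for them is sketched below; all bodies are assumptions made for illustration, as are the " | " separator between columns and the homeCell/awayCell helpers.

```java
import java.util.Arrays;
import java.util.List;

public class ResultTable {
    static final String HOME_HEADER = "home_name [home_score]";
    static final String AWAY_HEADER = "away_name [away_score]";

    /** Hypothetical value class holding one entered line. */
    static final class PlayerResult {
        private final String homeName, awayName, homeScore, awayScore;

        PlayerResult(String homeName, String awayName, String homeScore, String awayScore) {
            this.homeName = homeName;
            this.awayName = awayName;
            this.homeScore = homeScore;
            this.awayScore = awayScore;
        }

        String homeCell() { return homeName + " [" + homeScore + "]"; }
        String awayCell() { return awayName + " [" + awayScore + "]"; }
    }

    /** Widest cell per column, headers included: index 0 is home, index 1 is away. */
    static int[] computeLengthByColumn(List<PlayerResult> results) {
        int home = HOME_HEADER.length();
        int away = AWAY_HEADER.length();
        for (PlayerResult result : results) {
            home = Math.max(home, result.homeCell().length());
            away = Math.max(away, result.awayCell().length());
        }
        return new int[] { home, away };
    }

    /** Pads the text with trailing spaces up to the given length. */
    static String adjustLength(String text, int length) {
        StringBuilder padded = new StringBuilder(text);
        while (padded.length() < length) {
            padded.append(' ');
        }
        return padded.toString();
    }

    /** Renders the header and one row per result, with aligned columns. */
    static String render(List<PlayerResult> results) {
        int[] widths = computeLengthByColumn(results);
        StringBuilder out = new StringBuilder();
        out.append(adjustLength(HOME_HEADER, widths[0])).append(" | ")
           .append(adjustLength(AWAY_HEADER, widths[1])).append('\n');
        for (PlayerResult result : results) {
            out.append(adjustLength(result.homeCell(), widths[0])).append(" | ")
               .append(adjustLength(result.awayCell(), widths[1])).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<PlayerResult> results = Arrays.asList(
                new PlayerResult("team1", "team2", "1", "0"),
                new PlayerResult("team3", "team1", "3", "2"));
        System.out.print(render(results));
    }
}
```

Computing the widths over all rows before printing anything is what keeps the separator in the same column on every line.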
You can keep the game history by adding the formatted values of the results array to the finalResults ArrayList and then printing them once stop is entered.
For the total score per team, a HashMap<String, Integer> is a good choice.
Here is the complete code, with comments to make it clear:
import java.util.*;
// following the naming conventions class name must start with a capital letter
public class Results {
public static void main(String[] args) {
Scanner scan = new Scanner(System.in);
int totalGames = 0;
String input;
System.out.println("Please enter results in the following format: \n"
+ "'HOME_NAME : AWAY_NAME : HOME_SCORE : AWAY_SCORE' \n"
+ "or enter 'stop' to quit");
// HashMap to keep team name as a key and its total score as value
Map<String, Integer> scoreMap = new HashMap<>();
// ArrayList for storing game history
List<String> finalResults = new ArrayList<>();
// don't compare null to value. Read more http://stackoverflow.com/questions/6883646/obj-null-vs-null-obj
while ((input = scan.nextLine()) != null) {
if (input.equalsIgnoreCase("stop")) { // 'Stop', 'STOP' and 'stop' are all OK
scan.close(); // close Scanner object
break;
}
String[] results = input.split(" : ");
// add result as String.format. Read more https://examples.javacodegeeks.com/core-java/lang/string/java-string-format-example/
finalResults.add(String.format("%s [%s] | %s [%s]", results[0], results[2], results[1], results[3]));
// check if the map already contains the team
// results[0] and results[1] are team names, results[2] and results[3] are their scores
for (int i = 0; i < 2; i++) {
// here is used the Ternary operator. Read more http://alvinalexander.com/java/edu/pj/pj010018
scoreMap.put(results[i], !scoreMap.containsKey(results[i]) ?
Integer.valueOf(results[i + 2]) :
Integer.valueOf(scoreMap.get(results[i]) + Integer.valueOf(results[i + 2])));
}
totalGames++; // increment totalGames
}
System.out.printf("%nTotal games played: %d.%n", totalGames); // output the total played games
// output the games statistics from ArrayList finalResults
for (String finalResult : finalResults) {
System.out.println(finalResult);
}
// output the score table from HashMap scoreMap
System.out.println("\nScore table:");
for (Map.Entry<String, Integer> score : scoreMap.entrySet()) {
System.out.println(score.getKey() + " : " + score.getValue());
}
}
}
Now testing with input:
team1 : team2 : 1 : 0
team3 : team1 : 3 : 2
team3 : team2 : 2 : 2
sToP
The output is:
Total games played: 3.
team1 [1] | team2 [0]
team3 [3] | team1 [2]
team3 [2] | team2 [2]
Score table:
team3 : 5
team1 : 3
team2 : 2
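The containsKey/ternary update in the code above could also be expressed with Map.merge, as in the sketch below. The class and method names are hypothetical. A TreeMap is swapped in for the HashMap only so that the printed order is predictable; the totals are the same either way.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ScoreTally {
    /** Sums the scores per team from lines shaped like "home : away : homeScore : awayScore". */
    static Map<String, Integer> tally(List<String> lines) {
        Map<String, Integer> totals = new TreeMap<>(); // sorted by team name
        for (String line : lines) {
            String[] parts = line.split(" : ");
            if (parts.length != 4) {
                continue; // assumption: silently skip malformed lines
            }
            // merge() stores the score if the team is new, otherwise adds it to the total.
            totals.merge(parts[0], Integer.parseInt(parts[2].trim()), Integer::sum);
            totals.merge(parts[1], Integer.parseInt(parts[3].trim()), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "team1 : team2 : 1 : 0",
                "team3 : team1 : 3 : 2",
                "team3 : team2 : 2 : 2");
        System.out.println(tally(lines)); // prints {team1=3, team2=2, team3=5}
    }
}
```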

Java: Find Integers in a String (Calculator)

If I have a String that looks like this: String calc = "5+3". Can I substring the integers 5 and 3?
In this case, you do know how the String looks, but it could look like this: String calc = "55-23" Therefore, I want to know if there is a way to identify integers in a String.
For something like that, regular expression is your friend:
String text = "String calc = 55-23";
Matcher m = Pattern.compile("\\d+").matcher(text);
while (m.find())
System.out.println(m.group());
Output
55
23
Now, you might need to expand it to support decimals:
String text = "String calc = 1.1 + 22 * 333 / (4444 - 55555)";
Matcher m = Pattern.compile("\\d+(?:\\.\\d+)?").matcher(text);
while (m.find())
System.out.println(m.group());
Output
1.1
22
333
4444
55555
You could use a regex like ([\d]+)([+-])([\d]+) to obtain the full binary expression.
Pattern pattern = Pattern.compile("([\\d]+)([+-])([\\d]+)");
String calc = "5+3";
Matcher matcher = pattern.matcher(calc);
if (matcher.matches()) {
int lhs = Integer.parseInt(matcher.group(1));
int rhs = Integer.parseInt(matcher.group(3));
char operator = matcher.group(2).charAt(0);
System.out.print(lhs + " " + operator + " " + rhs + " = ");
switch (operator) {
case '+': {
System.out.println(lhs + rhs);
break; // without break, execution falls through to the next case
}
case '-': {
System.out.println(lhs - rhs);
break;
}
}
}
Output:
5 + 3 = 8
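If the result is needed as a value rather than printed, the same idea can be packaged as a method. The following is a sketch under a few assumptions: the names are hypothetical, * and / are added to the operator set, optional whitespace is allowed around the operands, and only non-negative integers are handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BinaryCalculator {
    // group 1: left operand, group 2: operator, group 3: right operand
    private static final Pattern EXPRESSION =
            Pattern.compile("\\s*(\\d+)\\s*([+\\-*/])\\s*(\\d+)\\s*");

    /** Evaluates a single binary expression such as "55-23". */
    static int evaluate(String calc) {
        Matcher matcher = EXPRESSION.matcher(calc);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Not a binary expression: " + calc);
        }
        int lhs = Integer.parseInt(matcher.group(1));
        int rhs = Integer.parseInt(matcher.group(3));
        switch (matcher.group(2).charAt(0)) {
            case '+': return lhs + rhs;
            case '-': return lhs - rhs;
            case '*': return lhs * rhs;
            case '/': return lhs / rhs; // integer division; throws on division by zero
            default: throw new IllegalStateException("Operator not covered by the pattern");
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate("55-23")); // prints 32
    }
}
```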
You can read each character and look at its ASCII code. If the code is between 48 and 57, the character is a digit; otherwise it is a symbol.
If the next character is also a digit, append it to the current number, and keep going until you reach a symbol.
String calc="55-23";
String intString="";
char tempChar;
for (int i=0;i<calc.length();i++){
tempChar=calc.charAt(i);
int ascii=(int) tempChar;
if (ascii>47 && ascii <58){
intString=intString+tempChar;
}
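else {
// a symbol ends the current number, so print it and start a new one
System.out.println(intString);
intString="";
}
}
// print the last number, which is not followed by a symbol
if (!intString.isEmpty()) {
System.out.println(intString);
}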
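The character-by-character scan can also return the numbers instead of printing them. The sketch below is one way to do that; the class and method names are hypothetical, and it only recognizes the ASCII digits 0 to 9, as in the answer.

```java
import java.util.ArrayList;
import java.util.List;

public class DigitScanner {
    /** Collects every run of consecutive ASCII digits as one integer. */
    static List<Integer> extractIntegers(String calc) {
        List<Integer> numbers = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < calc.length(); i++) {
            char c = calc.charAt(i);
            if (c >= '0' && c <= '9') {
                current.append(c); // still inside a number
            } else if (current.length() > 0) {
                // a symbol ends the current number
                numbers.add(Integer.parseInt(current.toString()));
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            // the last number is not followed by a symbol
            numbers.add(Integer.parseInt(current.toString()));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(extractIntegers("55-23")); // prints [55, 23]
    }
}
```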

Why does a string that clearly contains part of another string not produce a hit using .contains()?

This program scans a text file containing a radio music log and then matches the songs to a directory of WAV files. All files are named with the same convention, artist-title, e.g. lukebryan-kickthedustup.wav. I swap the positions of the title and artist by splitting on the delimiter, which allows for easy comparison with the music log, which is already formatted as title, artist.
Now, let's say I'm searching for the term "lovingyoueasyza", which is Loving You Easy by the Zac Brown Band. When it reaches the file in the directory with the assigned string "lovingyoueasyzacbrownband", it ignores it, even though it contains that string. You'll see I'm calling:
if(searchMe.contains(findMe))
Yet it doesn't return a hit. It will return matches if the findMe string contains only the song title, but as soon as any part of the artist name creeps into that string, it stops working. Why? For shorter titles it's critical that I be able to search by artist name as well, which is why I can't just search by song title.
I've tried using .trim() to no avail. Here is some sample output from when a match is found:
searching term: "onehellofanamen"
comparing to: "onehellofanamendbrantleygilbert"
Match found!
value of findMe: onehellofanamen
value of searchMe: onehellofanamendbrantleygilbert
value of y: 49
value of x: 79
Here is sample output of a failed attempt to match:
searching term: "lovingyoueasyza"
comparing to: "keepmeinminddzacbrownband"
searching term: "lovingyoueasyza"
comparing to: "lovingyoueasydzacbrownband"
searching term: "lovingyoueasyza"
comparing to: "nohurrydzacbrownband"
searching term: "lovingyoueasyza"
comparing to: "toesdzacbrownband"
searching term: "lovingyoueasyza"
This is what the findMe values look like going into the method:
fileToProcess var is: C:\test\06012015.TXT
slot #0: topofhouridplac
slot #1: lovemelikeyoume
slot #2: wearetonightbil
slot #3: turnitoneliyoun
slot #4: lonelytonightbl
slot #5: stopset
slot #6: alrightdariusru
slot #7: lovingyoueasyza
slot #8: sundazefloridag
slot #9: stopset
The final output of matchesFound looks like this:
Item Number: 0 ****TOP OF HOUR****
Item Number: 1 d:\tocn\kelseaballerini-lovemelikeyoumeanit.wav
Item Number: 2 null
Item Number: 3 null
Item Number: 4 null
Item Number: 5 ****STOP SET****
Item Number: 6 null
... through 82.
public static String[] regionMatches(String[] directoryArray,
String[] musicLogArray) throws InterruptedException {
String[] matchesFound = new String[musicLogArray.length];
String[] originalFileList = new String[directoryArray.length];
for (int y = 0; y < directoryArray.length; y++) {
originalFileList[y] = directoryArray[y];
System.out.println("o value: " + originalFileList[y]);
System.out.println("d value: " + directoryArray[y]);
}
for (int q = 0; q < originalFileList.length; q++) {
originalFileList[q] = originalFileList[q].replaceAll(".wav", "");
originalFileList[q] = originalFileList[q].replaceAll("\\\\", "");
originalFileList[q] = originalFileList[q].replaceAll("[+.^:,]", "");
originalFileList[q] = originalFileList[q].replaceAll("ctestmusic",
"");
originalFileList[q] = originalFileList[q].replaceAll("tocn", "");
originalFileList[q] = originalFileList[q].toLowerCase();
String[] parts = originalFileList[q].split("-");
originalFileList[q] = parts[1] + parts[0];
System.out.println(originalFileList[q]);
}
for (int x = 0; x < musicLogArray.length; x++) {
for (int y = 0; y < directoryArray.length; y++) {
//System.out.println("value of x: " + x);
//System.out.println("value of y: " + y);
String searchMe = originalFileList[y];
String findMe = musicLogArray[x];
int searchMeLength = searchMe.length();
int findMeLength = findMe.length();
boolean foundIt = false;
updateDisplay("searching term: " + "\"" + findMe+"\"");
updateDisplay("comparing to: " + "\"" + searchMe + "\"");
//for (int i = 0; i <= (searchMeLength - findMeLength); i++) {
if(searchMe.contains(findMe)){
updateDisplay("Match found!");
updateDisplay("value of findMe: " + findMe);
updateDisplay("value of searchMe: " + searchMe);
updateDisplay("value of y: " + y);
updateDisplay("value of x: " + x);
matchesFound[x] = directoryArray[y];
break;
// if (searchMe.regionMatches(i, findMe, 0, findMeLength)) {
// foundIt = true;
// updateDisplay("MATCH FOUND!: "
// + searchMe.substring(i, i + findMeLength));
//
// matchesFound[x] = directoryArray[y];
//
// break;
} else if (findMe.contains("stopset")){
matchesFound[x] = "****STOP SET****";
break;
} else if (findMe.contains("topofho")) {
matchesFound[x] = "****TOP OF HOUR****";
break;
}
}
//if (!foundIt) {
// updateDisplay("No match found.");
//}
}
//}
return matchesFound;
}
It seems to me that the strings built from your music directory pick up an unwanted "d" where you put the pieces back together.
searching term: "lovingyoueasyza"
comparing to: "lovingyoueasydzacbrownband"
The "comparing to" string does not contain the search term because there is a "d" after "easy". That breaks the match, which is why you get failures whenever part of the artist name is included.
Here:
searching term: "lovingyoueasyza"
comparing to: "lovingyoueasydzacbrownband"
In your second string, note that there is an extra d after easy.
So the second string does not contain the first string.
I think you are adding an extra 'd' when combining song name with the artist name.
The same thing is happening for all your other strings, e.g.
searching term: "onehellofanamen"
comparing to: "onehellofanamendbrantleygilbert"
which I suppose is one hell of an amen + the extra 'd' + brantley gilbert.
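Judging from the code in the question, a plausible source of the extra "d" is the drive letter: stripping the backslashes, the colon and "tocn" from a path like d:\tocn\artist-title.wav leaves "dartist-title", so swapping the two parts puts the "d" between title and artist. A minimal sketch that builds the key from the file name only is shown below. The class name, method name, and the Zac Brown Band file name are assumptions inferred from the log output.

```java
public class SongKey {
    /** Turns ".../artist-title.wav" into "titleartist", ignoring the directory part. */
    static String toSearchKey(String path) {
        // Drop everything up to the last separator, so drive letters and folder
        // names never reach the key. Both separators are handled explicitly
        // so the method behaves the same on any platform.
        int separator = Math.max(path.lastIndexOf('\\'), path.lastIndexOf('/'));
        String name = path.substring(separator + 1).toLowerCase();
        if (name.endsWith(".wav")) {
            name = name.substring(0, name.length() - ".wav".length());
        }
        String[] parts = name.split("-", 2); // artist, title
        String key = parts.length == 2 ? parts[1] + parts[0] : name;
        return key.replaceAll("[^a-z0-9]", ""); // keep letters and digits only
    }

    public static void main(String[] args) {
        // Hypothetical file name, inferred from the log in the question.
        System.out.println(toSearchKey("d:\\tocn\\zacbrownband-lovingyoueasy.wav")); // prints lovingyoueasyzacbrownband
    }
}
```

With a key built this way, "lovingyoueasyzacbrownband".contains("lovingyoueasyza") is true.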

How to divide string into two parts using regex in java?

String strArray="135(i),15a,14(g)(q)12,67dd(),kk,159"; // split by ','
I want to divide each string after the first occurrence of an alphanumeric value/character.
Expected output:
original    expected o/p
15a         s1=15   s2=a
67dd()      s1=67   s2=dd()
kk          s1=""   s2=kk
159         s1=159  s2=""
Please help.
You could use the group-method of Pattern/Matcher:
String strArray = "135(i),15a,14(g)(q)12,67dd(),kk,159"; // split by ','
Pattern pattern = Pattern.compile("(?<digits>\\d*)(?<chars>[^,]*)");
Matcher matcher = pattern.matcher(strArray);
while (matcher.find()) {
if (!matcher.group().isEmpty()) //omit empty groups
System.out.println(matcher.group() + " : " + matcher.group("digits") + " - " + matcher.group("chars"));
}
The method group(String name) gives you the String captured by the pattern's named group (here 'digits' or 'chars') within the match.
The method group(int i) gives you the String captured by the i-th pair of parentheses of the pattern within the match.
See the Oracle tutorial at http://docs.oracle.com/javase/tutorial/essential/regex/ for more examples of using regex in Java.
You can use a Pattern and a Matcher to find the first index of a letter preceded by a number and split at that position.
Code
public static void main(String[] args) {
String[] inputs = { "15a", "67dd()", "kk", "159" };
for (String input : inputs) {
Pattern p = Pattern.compile("(?<=[0-9])[a-zA-Z]");
Matcher m = p.matcher(input);
System.out.println("Input: " + input);
if (m.find()) {
int splitIndex = m.end();
// System.out.println(splitIndex);
System.out.println("1.\t"+input.substring(0, splitIndex - 1));
System.out.println("2.\t"+input.substring(splitIndex - 1));
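} else if (input.matches("[0-9]+")) {
// digits only: everything belongs to the first part
System.out.println("1.\t"+input);
System.out.println("2.");
} else {
System.out.println("1.");
System.out.println("2.\t"+input);
}
}
}
Output
Input: 15a
1. 15
2. a
Input: 67dd()
1. 67
2. dd()
Input: kk
1.
2. kk
Input: 159
1. 159
2.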
Use java.util.regex.Pattern and java.util.regex.Matcher
String strArray="135(i),15a,14(g)(q)12,67dd(),kk,159";
String arr[] = strArray.split(",");
for (String s : arr) {
Matcher m = Pattern.compile("([0-9]*)([^0-9]*)").matcher(s);
System.out.println("String in = " + s);
if(m.matches()){
System.out.println(" s1: " + m.group(1));
System.out.println(" s2: " + m.group(2));
} else {
System.out.println(" unmatched");
}
}
outputs:
String in = 135(i)
s1: 135
s2: (i)
String in = 15a
s1: 15
s2: a
String in = 14(g)(q)12
unmatched
String in = 67dd()
s1: 67
s2: dd()
String in = kk
s1:
s2: kk
String in = 159
s1: 159
s2:
Note how '14(g)(q)12' is not matched. It's not clear what the OP's required output is in this instance (or if a comma is missing from this portion of the example input string).
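If the intended rule is "s1 is the leading run of digits and s2 is everything after it" (an assumption, since the question does not say what should happen to '14(g)(q)12'), a pattern such as (\d*)(.*) would cover that item as well. The class and method names below are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DigitPrefixSplitter {
    // group 1: leading digits (possibly empty), group 2: the rest (possibly empty)
    private static final Pattern PARTS = Pattern.compile("(\\d*)(.*)", Pattern.DOTALL);

    /** Returns { s1, s2 } for one item of the comma-separated input. */
    static String[] split(String item) {
        Matcher matcher = PARTS.matcher(item);
        if (!matcher.matches()) {
            // Both groups may be empty, so this should not happen for any input.
            throw new IllegalArgumentException("Unexpected input: " + item);
        }
        return new String[] { matcher.group(1), matcher.group(2) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("67dd()"))); // prints [67, dd()]
    }
}
```

Under this rule, '14(g)(q)12' is split into s1=14 and s2=(g)(q)12.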
