Java Scanner Class useDelimiter Method - java

I have to read from a text file containing all the NCAA Division 1 championship games since 1933. The file is in this format:
1939:Villanova:42:Brown:30
1945:New York University:70:Ohio State:65
Some university names contain spaces, which is giving me a lot of trouble, because we are only supposed to read the school names and discard the year, the points, and the colons. I do not know whether I need a delimiter that discards whitespace, but the bottom line is that I am very lost.
I am slightly familiar with the useDelimiter method, but I have read that a .split("") might be useful. I am having a great deal of trouble because I know little about patterns.
This is what I have so far:
class NCAATeamTester
{
public static void main(String[]args)throws IOException
{
NCAATeamList myList = new NCAATeamList(); //ArrayList containing teams
Scanner in = new Scanner(new File("ncaa2012.data"));
in.useDelimiter("[A-Za-z]+"); //String Delimeter excluding non alphabetic chars or ints
while(in.hasNextLine()){
String line = in.nextLine();
String name = in.next(line);
String losingTeam = in.next(line);
//Creating team object with winning team
NCAATeamStats win = new NCAATeamStats(name);
myList.addToList(win); //Adds to List
//Creating team object with losing team
NCAATeamStats lose = new NCAATeamStats(losingTeam);
myList.addToList(lose)
}
}
}

What about splitting each line on the colon?
String[] spl = line.split(":");
String name1 = spl[1];
String name2 = spl[3];
Or, if there are several records on the same line, use regular expressions:
String line = "1939:Villanova:42:Brown:30 1945:New York University:70:Ohio State:65";
Pattern p = Pattern.compile("(.*?:){4}[0-9]+");
Matcher m = p.matcher(line);
while (m.find())
{
String[] spl = m.group().split(":");
String name = spl[1];
String name2 = spl[3];
}
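As a rough sketch of the split approach applied to whole records, the helper below returns the two team names from one line. It assumes every record has exactly five colon-separated fields, as in the samples from the question; the class name ChampionshipLineParser and the method teamNames are made up for illustration and are not part of the asker's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ChampionshipLineParser {

    // Returns {winner, loser} for a line such as "1939:Villanova:42:Brown:30",
    // or null if the line does not have exactly five colon-separated fields.
    static String[] teamNames(String line) {
        String[] fields = line.trim().split(":");
        if (fields.length != 5) {
            return null;
        }
        // fields[0] is the year, fields[2] and fields[4] are the scores
        return new String[] { fields[1].trim(), fields[3].trim() };
    }

    public static void main(String[] args) {
        // Sample records from the question; a real program would read them
        // from the file, for example with Scanner.nextLine().
        List<String> lines = Arrays.asList(
                "1939:Villanova:42:Brown:30",
                "1945:New York University:70:Ohio State:65");
        List<String> teams = new ArrayList<>();
        for (String line : lines) {
            String[] names = teamNames(line);
            if (names != null) {
                teams.add(names[0]);
                teams.add(names[1]);
            }
        }
        System.out.println(teams);
    }
}
```

Because the colon is the only delimiter, spaces inside a name such as New York University need no special handling.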

Related

Java - String splitting

I read a text file with data in the following format: Name Address Hobbies
Example: Bob Smith ABC Street Swimming
and assigned the line to a String z.
Then I used z.split with " " (a space) as the delimiter to separate the fields, but it split Bob Smith into two strings when it should be one field, and the same happened with the address. Is there a method I can use to get the fields in the format I want?
P.S. Apologies if I explained this vaguely; English isn't my first language.
String z;
try {
BufferedReader br = new BufferedReader(new FileReader("desc.txt"));
z = br.readLine();
} catch(IOException io) {
io.printStackTrace();
}
String[] temp = z.split(" ");
If the format of name and address parts is fixed to consist of two parts, you could just join them:
String z = ""; // z must be initialized
// use try-with-resources to ensure the reader is closed properly
try (BufferedReader br = new BufferedReader(new FileReader("desc.txt"))) {
z = br.readLine();
} catch(IOException io) {
io.printStackTrace();
}
String[] temp = z.split(" ");
String name = String.join(" ", temp[0], temp[1]);
String address = String.join(" ", temp[2], temp[3]);
String hobby = temp[4];
Another option could be to create a format string as a regular expression and use it to parse the input line using named groups (?<group_name>capturing text):
// use named groups to define parts of the line
Pattern format = Pattern.compile("(?<name>\\w+\\s\\w+)\\s(?<address>\\w+\\s\\w+)\\s(?<hobby>\\w+)");
Matcher match = format.matcher(z);
if (match.matches()) {
String name = match.group("name");
String address = match.group("address");
String hobby = match.group("hobby");
System.out.printf("Input line matched: name=%s address=%s hobby=%s%n", name, address, hobby);
} else {
System.out.println("Input line not matching: " + z);
}
I can think of three solutions, in order from best to worst:
1. Use a different delimiter.
2. Enforce the format so that it always has two name parts, two address parts and one hobby.
3. Keep a dictionary of names and hobbies, check each word to determine its type, and then group the words as needed.
(The third option is not meant as a serious alternative.)
As others have mentioned, using spaces as both field delimiter and inside fields is problematic. You could use a regex pattern to split the line (paste (\w+ \w+) (\w+ \w+) (.+) in Regex101 for an explanation):
Pattern pattern = Pattern.compile("(\\w+ \\w+) (\\w+ \\w+) (.+)");
Matcher matcher = pattern.matcher("Bob Smith ABC Street Bowling Fishing Rollerblading");
System.out.println("matcher.matches() = " + matcher.matches());
for (int i = 0; i <= matcher.groupCount(); i++) {
System.out.println("matcher.group(" + i + ") = " + matcher.group(i));
}
This would give the following output:
matcher.matches() = true
matcher.group(0) = Bob Smith ABC Street Bowling Fishing Rollerblading
matcher.group(1) = Bob Smith
matcher.group(2) = ABC Street
matcher.group(3) = Bowling Fishing Rollerblading
However this only works for this exact format. If you get a line with three name parts for example:
John B Smith ABC Street Swimming
This will get split into John B as the name, Smith ABC as the address and Street Swimming as hobbies.
So either make 100% sure your input will always match this format or use a different delimiter.
The split() method works with two things:
the delimiter and
the String object,
and sometimes with a limit as well.
Whatever limit you provide, split() applies it mechanically.
It does not know whether the substring on the left is a name or not, and the same goes for the substring on the right.
Have a look at this code snippet:
String assets = "Gold:Stocks:Fixed Income:Commodity:Interest Rates";
String[] splits = assets.split(":");
System.out.println("splits.size: " + splits.length);
for(String asset: splits){
System.out.println(asset);
}
Output
splits.size: 5
Gold
Stocks
Fixed Income // with space
Commodity
Interest Rates // with space
The fields keep their internal spaces because I used : as the delimiter rather than a space.
This should help you find your answer.
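To illustrate the limit mentioned above, here is a small sketch using the two-argument form of String.split. The class name SplitLimitDemo and the helper splitWithLimit are invented for this example.

```java
import java.util.Arrays;

class SplitLimitDemo {

    // Splits on ':' but produces at most 'limit' pieces;
    // the last piece keeps any remaining delimiters untouched.
    static String[] splitWithLimit(String text, int limit) {
        return text.split(":", limit);
    }

    public static void main(String[] args) {
        String assets = "Gold:Stocks:Fixed Income:Commodity:Interest Rates";
        System.out.println(Arrays.toString(splitWithLimit(assets, 3)));
        // prints [Gold, Stocks, Fixed Income:Commodity:Interest Rates]
    }
}
```

A limit is handy when only the first few fields have a fixed shape and the rest of the line should stay together.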
It depends on the data you're dealing with. Will the name always consist of a first and last name? Then you can simply combine the first two elements from the resulting array into a new string.
Otherwise, you might have to find a different way to separate out the different pieces within the txt file. Possibly a comma? Some character that you know won't ever be used in your normal data.
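If the file format can be changed, a sketch of the different-delimiter idea might look like the following. It assumes the fields are separated by a semicolon, which is only an illustrative choice, and the class DelimitedRecord with its parse method is hypothetical.

```java
class DelimitedRecord {
    final String name;
    final String address;
    final String hobbies;

    DelimitedRecord(String name, String address, String hobbies) {
        this.name = name;
        this.address = address;
        this.hobbies = hobbies;
    }

    // Parses a line such as "Bob Smith;ABC Street;Swimming".
    // The semicolon delimiter is an assumption; any character that never
    // appears inside a field would work.
    static DelimitedRecord parse(String line) {
        // A negative limit keeps trailing empty fields, so a missing hobby
        // is reported as an empty string instead of silently dropped.
        String[] fields = line.split(";", -1);
        if (fields.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields: " + line);
        }
        return new DelimitedRecord(fields[0].trim(), fields[1].trim(), fields[2].trim());
    }

    public static void main(String[] args) {
        DelimitedRecord r = parse("John B Smith;ABC Street;Bowling Fishing");
        System.out.println(r.name + " | " + r.address + " | " + r.hobbies);
    }
}
```

With an unambiguous delimiter, a three-part name such as John B Smith no longer shifts the other fields.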
Assuming that every line follows the format
Bob Smith ABC Street Swimming
i.e. first name, then surname, and so on, this code merges the first two tokens into a single name field:
String[] temp = z.split(" ");
String[] temp2 = new String[temp.length - 1];
temp2[0] = temp[0] + " " + temp[1]; // join first name and surname
for (int i = 2; i < temp.length; i++) {
temp2[i - 1] = temp[i]; // shift the remaining tokens left by one
}
temp = temp2;

Trying to split up a string with blank space

I'm writing a piece of code in which I am trying to split the user's input into 3 parts, using the spaces between the values the user has entered. However, every time I run the code I get this error:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
at Substring.main(Substring.java:18)
Java Result: 1
I have tried a different delimiter when entering the text and it worked fine; for example, using a / split the exact same input correctly and did what I wanted so far.
Any help would be appreciated!
Here's my code if needed
import java.util.Scanner;
public class Substring{
public static void main(String[]args){
Scanner user_input = new Scanner(System.in);
String fullname = ""; //declaring a variable so the user can enter their full name
String[] NameSplit = new String[2];
String FirstName;
String MiddleName;
String LastName;
System.out.println("Enter your full name (First Middle Last): ");
fullname = user_input.next(); //saving the user's name in the string fullname
NameSplit = fullname.split(" ");//We are splitting up the value of fullname every time there is a space between words
FirstName = NameSplit[0]; //Putting the values that are in the array into seperate string values, so they are easier to handle
MiddleName = NameSplit[1];
LastName = NameSplit[2];
System.out.println(fullname); //outputting the user's orginal input
System.out.println(LastName+ ", "+ FirstName +" "+ MiddleName);//outputting the last name first, then the first name, then the middle name
new StringBuilder(FirstName).reverse().toString();
System.out.println(FirstName);
}
}
split takes a regular expression, so you can look for one or more spaces (" +") instead of exactly one space (" ").
String[] array = s.split(" +");
Or you can use StringTokenizer:
String message = "MY name is ";
String delim = " \n\r\t,.;"; // list all delimiter characters here
StringTokenizer st = new StringTokenizer(message,delim);
while (st.hasMoreTokens()) {
System.out.println(st.nextToken());
}
You have made mistakes in the following places:
fullname = user_input.next();
It should be nextLine() instead of next(), since you want to read the complete line from the Scanner.
String[] NameSplit = new String[2];
There is no need for this allocation, because NameSplit is reassigned by NameSplit = fullname.split(...) later; if you keep it, it should be new String[3] rather than new String[2], since you are storing three entries: the first name, the middle name, and the last name.
Here is the corrected program:
import java.util.Scanner;
class Substring {
public static void main (String[] args) throws java.lang.Exception {
Scanner user_input = new Scanner(System.in);
String[] NameSplit = new String[3];
String FirstName;
String MiddleName;
String LastName;
System.out.println("Enter your full name (First Middle Last): ");
String fullname = user_input.nextLine();
NameSplit = fullname.split(" ");
FirstName = NameSplit[0];
MiddleName = NameSplit[1];
LastName = NameSplit[2];
System.out.println(fullname);
System.out.println(LastName+ ", "+ FirstName +" "+ MiddleName);
new StringBuilder(FirstName).reverse().toString();
System.out.println(FirstName);
}
}
Output:
Enter your full name (First Middle Last): John Mayer Smith
Smith, John Mayer
John
java.util.Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace.
Hence, even though you entered 'Elvis John Presley', only 'Elvis' is stored in the fullname variable.
You can use a BufferedReader to read the full line:
BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
try {
fullname = reader.readLine();
} catch (IOException e) {
e.printStackTrace();
}
Or you can change the default behavior of the Scanner by calling:
user_input.useDelimiter("\n");
The exception tells you that the code indexes past the end of the array. The index 1 in the message refers to MiddleName = NameSplit[1]. Two things are worth changing:
1- String[] NameSplit = new String[2] declares room for only two entries, although three names are expected. Note, however, that this array is discarded as soon as NameSplit = fullname.split(" ") runs, so changing it to new String[3] (or removing the allocation) does not by itself remove the error.
Read more here: [ How do I declare and initialize an array in Java? ]
The real cause is that user_input.next(); reads only the first word (basically until whitespace is detected), so split(" ") returns a one-element array and NameSplit[1] does not exist. So:
2- Change user_input.next(); to user_input.nextLine(); because nextLine() reads the entire line (basically until a '\n' is detected).
Read more here: [ http://www.cs.utexas.edu/users/ndale/Scanner.html ]
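Pulling the suggestions in this thread together, a defensive version could check the number of parts before indexing into the array. This is only a sketch; the class FullNameParser and the method parseFullName are names invented here, not part of the original program.

```java
import java.util.Scanner;

class FullNameParser {

    // Splits on runs of whitespace, so extra spaces between names are tolerated.
    // Returns null unless exactly three parts (first, middle, last) are present.
    static String[] parseFullName(String input) {
        String[] parts = input.trim().split("\\s+");
        return parts.length == 3 ? parts : null;
    }

    public static void main(String[] args) {
        Scanner userInput = new Scanner(System.in);
        System.out.println("Enter your full name (First Middle Last): ");
        if (!userInput.hasNextLine()) {
            return; // no input available
        }
        // nextLine() reads the whole line, unlike next(), which stops at whitespace
        String[] parts = parseFullName(userInput.nextLine());
        if (parts == null) {
            System.out.println("Please enter exactly three names.");
        } else {
            System.out.println(parts[2] + ", " + parts[0] + " " + parts[1]);
        }
    }
}
```

Checking the length first turns the ArrayIndexOutOfBoundsException into a readable message for the user.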

Splitting team names with scores separated by space and commas

I have multiple lines of game scores as input. The input looks something like this:
Lions 1, FCAwesome 1
I'm currently splitting the line on either a comma or whitespace.
Charset charset = Charset.forName("US-ASCII");
String REGEX = ",?\\s+";
Pattern pattern = Pattern.compile(REGEX);
try(BufferedReader reader = Files.newBufferedReader(path, charset)){
int count = 0;
String line = null;
while((line = reader.readLine()) != null){
String[] arr = pattern.split(line);
}
This works fine for the provided input. However, if a team name has more than one word, my code breaks:
Lions 1, FC Awesome 1
How do I modify my REGEX to handle this case? FC Awesome still needs to be one team name.
Try splitting on a space that
has a comma before it (including that comma), to separate the team/score pairs, or
has a digit after it, to separate the team name from the score.
So try split(",\\s|\\s(?=\\d)")
If part of a team name can start with a digit, we can make the condition more specific: require that the [space][digits] sequence is followed either by a comma or by the end of the text.
split(",\\s|\\s(?=\\d+(?=,|$))")
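The stricter pattern above can be tried out with a short sketch like this one. The class ScoreLineSplitter and the method splitLine are illustrative names, and the second input (with a team name starting with a digit) is an invented test case.

```java
import java.util.Arrays;

class ScoreLineSplitter {

    // ", " separates the two team/score pairs; a whitespace character that is
    // followed by digits running up to a comma or the end of the text
    // separates a team name from its score.
    private static final String SEPARATOR = ",\\s|\\s(?=\\d+(?=,|$))";

    static String[] splitLine(String line) {
        return line.split(SEPARATOR);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLine("Lions 1, FC Awesome 1")));
        // prints [Lions, 1, FC Awesome, 1]
        System.out.println(Arrays.toString(splitLine("Lions 1, 49ers 2")));
        // prints [Lions, 1, 49ers, 2]
    }
}
```

The lookahead only inspects the text after the space without consuming it, so the score digits stay in the result.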
Try splitting the whole data by comma, then use the getTeam method below:
class Team {
String name;
int score;
public Team(String name, int score) {
this.name = name;
this.score = score;
}
@Override
public String toString() {
return this.name + ", " + this.score;
}
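public static Team getTeam(String data) {
data = data.trim(); // drop surrounding whitespace so the score digits come last
String score = "";
int i = data.length() - 1;
// collect the trailing digits (in reverse order)
for (; i >= 0 && Character.isDigit(data.charAt(i)); i--) {
score += data.charAt(i);
}
// everything before the trailing digits, minus the separating space
String name = data.substring(0, i + 1).trim();
return new Team(name, Integer.parseInt(new StringBuilder(score).reverse().toString()));
}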
}
For example, suppose the input is like this:
LION_## 1234 OLD 5555 ,TEAM2345NAME NAME 123NAME 4444
The first name is LION_## 1234 OLD and its score is 5555.
The second name is TEAM2345NAME NAME 123NAME and its score is 4444.
Note: both names contain digits or special characters, and there is even a space after the first score.
Now all that is needed is to create instances of the Team class, as in the example below:
String all_data = "LION_## 1234 OLD 5555 ,TEAM2345NAME NAME 123NAME 4444";
// splitting data by comma
String parts[] = all_data.split(",");
// calling getTeam method
Team t1 = Team.getTeam(parts[0]);
Team t2 = Team.getTeam(parts[1]);
Then use its fields, for example to print them:
System.out.println(t1.name);
System.out.println(t2.score);

Java Scanner reading special formatted lines

How can I read a String (below) and replace the `%rackname` or `%sysname` with new text?
KW Actual `%rackname` -> KW Actual THE RACK NAME
KW Difference `%rackname` -> KW Difference THE RACK NAME
KW Predicted `%rackname` -> KW Predicted THE RACK NAME
Loads Capacity `%sysname` -> Loads Capacity SYS NAME
Loads Cost Difference `%sysname` -> Loads Cost Difference SYS NAME
Loads EEPR `%sysname` -> Loads EEPR SYS NAME
I need to apply this formatting to all strings in an ArrayList, and some strings will have multiple variables to replace.
What is the best way to find these replaceable fields in these lists?
My first thought was to use a Scanner to scan through each string with next(); if I find a word starting with a `, I read to the end of the string and figure out which field to replace.
List<String []> vars = new ArrayList<String[]>() {};
int numfans, numsg, numcomp, numsys;
String [] newString;
String temp;
Scanner scan;
for(int i = 0; i < numRacks; i++){
// do all racks
for(String s: rackStr){
newString = new String[1];
scan = new Scanner(s);
while(scan.hasNext()){
temp = scan.next();
if(temp.startsWith("`") && temp.endsWith("`")){
System.out.println("Temp: " + temp);
System.out.println("Success");
newString[0] += findVar(temp);
}else {
newString[0] += temp;
}
}
vars.add(newString);
}
I figured this is probably horrible if I have to create a new Scanner for every string in the multiple ArrayLists I will have.
Edit: OK, so str.replaceAll() is a much easier solution /facepalm
Use String.replaceAll to replace patterns in a string, for example:
str = str.replaceAll("`%rackname`", "THE RACK NAME");
str = str.replaceAll("`%sysname`", "SYS NAME");
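For a whole list of strings with several placeholders, one possible sketch keeps the replacements in a map and applies them to each entry. The class PlaceholderFiller and the method fill are hypothetical names. The sketch uses String.replace instead of replaceAll because the placeholders are literal text, and replace does not give $ or \ in the replacement any special meaning.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PlaceholderFiller {

    // Replaces every `%key` placeholder (including the backticks) in each
    // template with the matching value from the map.
    static List<String> fill(List<String> templates, Map<String, String> values) {
        List<String> result = new ArrayList<>();
        for (String template : templates) {
            String filled = template;
            for (Map.Entry<String, String> entry : values.entrySet()) {
                // String.replace treats both arguments as literal text
                filled = filled.replace("`%" + entry.getKey() + "`", entry.getValue());
            }
            result.add(filled);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("rackname", "THE RACK NAME");
        values.put("sysname", "SYS NAME");
        List<String> templates = Arrays.asList(
                "KW Actual `%rackname`",
                "Loads Capacity `%sysname`");
        System.out.println(fill(templates, values));
    }
}
```

No Scanner is needed, and a string containing both placeholders is handled by the inner loop.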

Building a pattern to extract data out of a string

I have strings of the form:
"abc" 1 2 1 13
"efgh" 2 5
Basically, a string in quotes followed by numbers separated by whitespace characters.
I need to extract the string and the numbers out of the line.
So, for example, for the first line I'd want
abc to be stored in a String variable (i.e. without the quotations) and
an array of int to store [1,2,1,13].
I tried to create a pattern that'd do this, but I'm a little confused.
Pattern P = Pattern.compile("\A\".+\"(\s\d+)+");
Not sure how to proceed now. I realized that with this pattern I'd kinda be extracting the whole line out? Perhaps multiple patterns would help?
Pattern P1 = Pattern.compile("\A\".+\"");
Pattern P2 = Pattern.compile("(\s\d+)+");
Again, not very sure how to get the string and ints out of the line though. Any help is appreciated!
I would rather just split the string on whitespace than build a complex regex and use it with the Pattern and Matcher classes.
Something like this:
String str = "\"abc\" 1 2 1 13 ";
String[] arrr = str.split("\\s");
System.out.println(Arrays.toString(arrr));
Output:
["abc", 1, 2, 1, 13]
This shows your intent much more clearly.
Then you can get the string and integer parts from the string array. You would need to call Integer.parseInt() on the integer elements.
If your quoted string may contain spaces, then you would need a regex. A better one is the one in @m.buettner's answer.
Use capturing groups to get both parts in one go, then split the numbers at spaces.
Pattern pattern = Pattern.compile("\"([^\"]*)\"\\s*([\\d\\s]*)");
Matcher m = pattern.matcher(input);
while (m.find()) {
String str = m.group(1);
String[] numbers = m.group(2).split("\\s");
// process both of them
}
Each set of parentheses in the regex will later correspond to one group (counting opening parentheses from left to right, starting at 1).
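A fuller sketch of the capturing-group approach, including the conversion to an int array, might look like this. The class QuotedLineParser and its two methods are invented for illustration; the pattern is the one from the answer above.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedLineParser {

    // Group 1: the text between the quotes; group 2: the digits and spaces after it.
    private static final Pattern LINE = Pattern.compile("\"([^\"]*)\"\\s*([\\d\\s]*)");

    // Returns the quoted text without the quotes, or null if the line does not match.
    static String label(String line) {
        Matcher m = LINE.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    // Returns the numbers that follow the quoted text (empty array if there are none).
    static int[] numbers(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return new int[0];
        }
        String tail = m.group(2).trim();
        if (tail.isEmpty()) {
            return new int[0];
        }
        String[] tokens = tail.split("\\s+");
        int[] result = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            result[i] = Integer.parseInt(tokens[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "\"abc\" 1 2 1 13";
        System.out.println(label(line));                    // prints abc
        System.out.println(Arrays.toString(numbers(line))); // prints [1, 2, 1, 13]
    }
}
```

Splitting the second group on \\s+ rather than \\s avoids empty tokens when the numbers are separated by more than one space.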
Please try this; it separates the strings from the ints as well:
String s = "\"abc\" 1 2 1 13 ";
s = s.replace("\"", "");
String sarray[] = s.split(" ");
int i[] = new int[10];
String si[] = new String[10];
int siflag = 0;
int iflag = 0;
for (String st : sarray) {
try {
int ii = Integer.parseInt(st);
i[iflag++] = ii;
} catch (NumberFormatException e) {
si[siflag++] = st;
}
}
StringTokenizer st = new StringTokenizer(str,"\" ");
String token = null;
String strComponent = null;
int num[] = new int[10]; // can change length dynamically by using ArrayList
int i = 0;
int numTemp = -1;
while(st.hasMoreTokens()){
token = st.nextToken();
try{
numTemp = Integer.parseInt(token);
num[i++] = numTemp ;
}catch(NumberFormatException nfe){
strComponent = token; // not a number, so it is the quoted string
}
}
