Java StringTokenizer - Problems with nextToken() usage with substring

I have a text file that I must iterate through, and I want to move certain elements of each line into an ArrayList. Each line of the file is in the format: number. String number. decimal decimal
As the two numbers have a full stop (.) at the end, I read them as a String, remove the . using substring, and then convert the result to a primitive data type (int or short).
Example on file:
294. ABC123 66. .00 .00
I get a string range error if I try this: (* temp is a String)
while(fileLine.hasMoreTokens())
{
oneNumber = Integer.valueOf(fileLine.nextToken().substring(0,
fileLine.nextToken().indexOf('.')));
twoString = fileLine.nextToken();
threeNumber = Short.valueOf(fileLine.nextToken().substring(0,
fileLine.nextToken().indexOf('.')));
temp = fileLine.nextToken(); //Handle attributes not required
temp = fileLine.nextToken(); //Handle attributes not required
}
I believe this happens because the nextToken() call inside substring's arguments is confusing the StringTokenizer. So I fixed it like this:
while(fileLine.hasMoreTokens())
{
temp = fileLine.nextToken();
oneNumber = Integer.valueOf(temp.substring(0, temp.indexOf('.')));
twoString = fileLine.nextToken();
temp = fileLine.nextToken();
threeNumber= Short.valueOf(temp.substring(0, temp.indexOf('.')));
temp = fileLine.nextToken();
temp = fileLine.nextToken();
}
While this works it feels a bit redundant. Is there something I can try to make this cleaner, while retaining use of the StringTokenizer?

This is the intended behavior of .nextToken(): it returns the token and moves past the current token. When you use Integer.valueOf(fileLine.nextToken().substring(0, fileLine.nextToken().indexOf('.'))), you are calling .nextToken() twice, which means you are dealing with two distinct tokens. It has nothing to do with how String#substring works. You need to store the token in a variable if you need to perform additional operations on it. This exact same problem can also be caused by using BufferedReader#readLine twice when one should be storing the value.
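To make the double consumption concrete, here is a small sketch using the sample line from the question. The class and method names (TokenConsumptionDemo, leadingNumber) are made up for illustration and are not part of the original code.

```java
import java.util.StringTokenizer;

class TokenConsumptionDemo {
    // Reads ONE token and strips the trailing full stop before parsing.
    static int leadingNumber(StringTokenizer t) {
        String token = t.nextToken(); // consumed exactly once
        return Integer.parseInt(token.substring(0, token.indexOf('.')));
    }

    public static void main(String[] args) {
        StringTokenizer broken = new StringTokenizer("294. ABC123 66. .00 .00");
        String first = broken.nextToken();  // "294."
        String second = broken.nextToken(); // "ABC123": already a different token
        // "ABC123" contains no '.', so indexOf returns -1,
        // and "294.".substring(0, -1) is what throws in the question's code.
        System.out.println(first + " / " + second + " / " + second.indexOf('.'));

        StringTokenizer ok = new StringTokenizer("294. ABC123 66. .00 .00");
        System.out.println(leadingNumber(ok)); // 294
    }
}
```

Storing the token in a local variable first, as in leadingNumber, guarantees that substring and indexOf look at the same string.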

Yup. nextToken() is stateful, calling it changes things, so using it twice in a single line would consume two tokens.
Your second snippet seems much easier to read to me, so I'm not sure what the problem is. Presumably you want your code to be more readable.
An easy fix is to make helper methods:
while (fileLine.hasMoreTokens()) {
oneNumber = fetchHeadingNumber(fileLine);
twoString = fileLine.nextToken();
threeNumber = fetchHeadingNumber(fileLine);
fileLine.nextToken(); // no need to assign it.
fileLine.nextToken();
}
with this method:
int fetchHeadingNumber(StringTokenizer t) {
String token = t.nextToken();
return Integer.parseInt(token.substring(0, token.indexOf('.')));
}
you can go even further and make a class representing a line, which has all the code needed to parse it (I made up names; your snippet doesn't make clear what kind of thing the line represents):
@lombok.Value class InventoryItem {
int warehouse;
String name;
int shelf;
public static InventoryItem read(StringTokenizer tokenizer) {
int warehouse = num(tokenizer);
String name = tokenizer.nextToken();
int shelf = num(tokenizer);
tokenizer.nextToken();
tokenizer.nextToken();
return new InventoryItem(warehouse, name, shelf);
}
private static int num(StringTokenizer t) {
String token = t.nextToken();
return Integer.parseInt(token.substring(0, token.indexOf('.')));
}
}
and then reading a line and retrieving, say, the location where it is stored is so much nicer: Now things actually have names!
InventoryItem item = InventoryItem.read(fileLine);
System.out.println("This item is in warehouse " + item.getWarehouse());
NB: Uses lombok's @Value to avoid putting a lot of boilerplate in this answer.

Related

Reading data and storing in array Java

I am writing a program which will allow users to reserve a room in a hotel (University Project). I have got this problem where when I try and read data from the file and store it in an array I receive a NumberFormatException.
I have been stuck on this problem for a while now and cannot figure out where I am going wrong. I've read up on it, and apparently it happens when I try to convert a String to a numeric type, but I cannot figure out how to fix it.
Any suggestions, please?
This is my code for my reader.
FileReader file = new FileReader("rooms.txt");
Scanner reader = new Scanner(file);
int index = 0;
while(reader.hasNext()) {
int RoomNum = Integer.parseInt(reader.nextLine());
String Type = reader.nextLine();
double Price = Double.parseDouble(reader.nextLine());
boolean Balcony = Boolean.parseBoolean(reader.nextLine());
boolean Lounge = Boolean.parseBoolean(reader.nextLine());
String Reserved = reader.nextLine();
rooms[index] = new Room(RoomNum, Type, Price, Balcony, Lounge, Reserved);
index++;
}
reader.close();
This is the error message
This is the data in my file which I am trying to read:

Change your while loop like this
while (reader.hasNextLine())
{
// then split reader.nextLine() data using .split() function
// and store it in string array
// after that you can extract data from the array and do whatever you want
}

You're trying to parse the whole line as an Integer. Instead, read the whole line as a String and call .split(" ") on it. This splits the line into multiple values and puts them into an array. Then you can grab each item from the array and parse it separately, as you intended.
Please avoid posting screenshots next time, use proper formatting and text so someone can easily copy your code or test data to IDE and reproduce the scenario.

Use next() instead of nextLine().

With Scanner one must use hasNextLine, nextLine, hasNext, next, hasNextInt, nextInt etcetera. I would do it as follows:
Using Path and Files, the newer and more general classes, instead of File.
Files can read lines, here I use Files.lines which gives a Stream of lines, a bit like a loop.
Try-with-resources: try (AutoCloseable in = ...) { ... } ensures that in.close() is always called implicitly, even on exception or return.
The line is without line ending.
The line is split into words separated by one or more spaces.
Only lines with at least 6 words are handled.
Create a Room from the words.
Collect an array of Room-s.
So:
Path file = Paths.get("rooms.txt");
try (Stream<String> in = Files.lines(file)) {
rooms = in // Stream<String>
.map(line -> line.split(" +")) // Stream<String[]>
.filter(words -> words.length >= 6)
.map(words -> {
int roomNum = Integer.parseInt(words[0]);
String type = words[1];
double price = Double.parseDouble(words[2]);
boolean balcony = Boolean.parseBoolean(words[3]);
boolean lounge = Boolean.parseBoolean(words[4]);
String reserved = words[5];
return new Room(roomNum, type, price, balcony, lounge, reserved);
}) // Stream<Room>
.toArray(Room[]::new); // Room[]
}
For local variables use camelCase with a small letter in front.
Files.lines(file) decodes the bytes in the file as UTF-8, whereas the FileReader in the question relies on the platform's default charset. If you want to state the encoding explicitly, or use a different one, pass it as a second argument:
try (Stream<String> in = Files.lines(file, StandardCharsets.UTF_8)) {
Another issue is the imprecision of the floating-point type double. You might use BigDecimal instead; it keeps exact decimal precision:
BigDecimal price = new BigDecimal(words[2]);
It is however much more verbose, so you need to look at a couple of examples.
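As one such example, here is a minimal sketch of BigDecimal arithmetic for a room price. The class name PriceSketch, the method total, and the sample price are made up for illustration.

```java
import java.math.BigDecimal;

class PriceSketch {
    // Total cost of a stay; the price is parsed from text so it stays an exact decimal.
    static BigDecimal total(String pricePerNight, int nights) {
        BigDecimal price = new BigDecimal(pricePerNight);
        return price.multiply(BigDecimal.valueOf(nights));
    }

    public static void main(String[] args) {
        System.out.println(0.1 + 0.2); // double arithmetic: 0.30000000000000004
        System.out.println(new BigDecimal("0.1").add(new BigDecimal("0.2"))); // 0.3
        System.out.println(total("79.99", 3)); // 239.97
    }
}
```

Note that arithmetic goes through methods such as add and multiply rather than operators, which is where the extra verbosity comes from.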

String Fragment Combinations Puzzle

Let's say I am given a list of String fragments. Two fragments can be concatenated on their overlapping substrings.
e.g.
"sad" and "den" = "saden"
"fat" and "cat" = cannot be combined.
Sample input:
aw was poq qo
Sample output:
awas poqo
So, what's the best way to write a method which find the longest string that can be made by combining the strings in a list. If the string is infinite the output should be "infinite".
public class StringUtil {
public static String combine(List<String> fragments) {
StringBuilder combined = new StringBuilder();
for (int i = 0; i < fragments.size(); i++) {
char last = (char) (fragments.get(i).length() - 1);
if (Character.toString(last).equals(fragments.get(i).substring(0))) {
combined.append(fragments.get(i)).append(fragments.get(i+1));
}
}
return combined.toString();
}
}
Here's my JUnit test:
public class StringUtilTest {
@Test
public void combine() {
List<String> fragments = new ArrayList<String>();
fragments.add("aw");
fragments.add("was");
fragments.add("poq");
fragments.add("qo");
String result = StringUtil.combine(fragments);
assertEquals("awas poqo", result);
}
}
This code doesn't seem to be working on my end... It's returning an empty string:
org.junit.ComparisonFailure: expected:<[awas poqo]> but was:<[]>
How can I get this to work? And also how can I get it to check for infinite strings?

I don't understand how fragments.get(i).length() - 1 is supposed to be a char. You clearly cast it on purpose, but I can't for the life of me tell what that purpose is. A string of length < 63 will be converted to an ASCII (Unicode?) character that isn't a letter.
I'm thinking you meant to compare the last character in one fragment to the first character in another, but I don't think that's what that code is doing.
My helpful answer is to undo some of the method chaining (function().otherFunction()), store the results in temporary variables, and step through it with a debugger. Break the problem down into small steps that you understand and verify the code is doing what you think it SHOULD be doing at each step. Once it works, then go back to chaining.
Edit: ok I'm bored and I like teaching. This smells like homework so I won't give you any code.
1) method chaining is just convenience. You could (and should) do:
String tempString = fragments.get(i);
int lengthOfString = tempString.length() - 1;
char lastChar = (char) lengthOfString;//WRONG
Etc.
This lets you SEE the intermediate steps, and THINK about what you are doing. You are literally taking the length of a string, say 3, and converting that Integer to a Char. You really want the last character in the string. When you don't use method chaining, you are forced to declare a Type of intermediate variable, which of course forces you to think about what the method ACTUALLY RETURNS. And this is why I told you to forgo method chaining until you are familiar with the functions.
2) I'm guessing at the point you wrote the function, the compiler complained that it couldn't implicitly cast to char from int. You then explicitly cast to a char to get it to shut up and compile. And now you are trying to figure out why it's failing at run time. The lesson is to listen to the compiler while you are learning. If it's complaining, you're messing something up.
3) I knew there was something else. Debugging. If you want to code, you'll need to learn how to do this. Most IDE's will give you an option to set a break point. Learn how to use this feature and "step through" your code line by line. THINK about exactly what step you are doing. Write down the algorithm for a short two letter pair, and execute it by hand on paper, one step at a time. Then look at what the code DOES, step by step, until you see somewhere it does something that you don't think is right. Finally, fix the section that isn't giving you the desired result.
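For readers who only want to see the type mix-up from point 1 in isolation, here is a minimal sketch. The class and method names are made up, and it deliberately does not solve the puzzle itself.

```java
class LastCharSketch {
    // What the question's code computes: the length minus one, reinterpreted as a character code.
    static char castLength(String s) {
        return (char) (s.length() - 1);
    }

    // What was probably intended: the character stored at the last index.
    static char lastChar(String s) {
        return s.charAt(s.length() - 1);
    }

    public static void main(String[] args) {
        String fragment = "aw";
        System.out.println((int) castLength(fragment)); // 1, a control character
        System.out.println(lastChar(fragment));         // w
        System.out.println("was".charAt(0));            // w, the first character of the next fragment
    }
}
```

Declaring the return types explicitly, as suggested above, is what makes the difference between the two methods visible.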

Looking at your unit test, the answer seems to be quite simple.
public static String combine(List<String> fragments) {
StringBuilder combined = new StringBuilder();
for (String fragment : fragments) {
if (combined.length() == 0) {
combined.append(fragment);
} else if (combined.charAt(combined.length() - 1) == fragment.charAt(0)) {
combined.append(fragment.substring(1));
} else {
combined.append(" " + fragment);
}
}
return combined.toString();
}
But looking at your inquisition example, you might be looking for something like this:
public static String combine(List<String> fragments) {
StringBuilder combined = new StringBuilder();
for (String fragment : fragments) {
if (combined.length() == 0) {
combined.append(fragment);
} else if (combined.charAt(combined.length() - 1) == fragment.charAt(0)) {
int i = 1;
while (i < fragment.length() && i < combined.length() && combined.charAt(combined.length() - i - 1) == fragment.charAt(i))
i++;
combined.append(fragment.substring(i));
} else {
combined.append(" " + fragment);
}
}
return combined.toString();
}
But note that for your test, it will generate aws poq which seems to be logical.

Splitting and saving data in Java

I'm trying to read a data file and save the different variables into an array list.
The format of the data file looks a little like this
5003639MATH131410591
5003639CHEM434111644
5003639PSYC230110701
Working around the bad formatting of the data file, I added commas to the different sections to make a split work. The new text file created looks something like this
5,003639,MATH,1314,10591
5,003639,CHEM,4341,11644
5,003639,PSYC,2301,10701
After creating said file, I tried to save the information into an array list.
The following is the snippet of trying to do this.
FileReader reader3 = new FileReader("example.txt");
BufferedReader br3 = new BufferedReader(reader3);
while ((strLine = br3.readLine())!=null){
String[] splitOut = strLine.split(", ");
if (splitOut.length == 5)
list.add(new Class(splitOut[0], splitOut[1], splitOut[2], splitOut[3], splitOut[4]));
}
br3.close();
System.out.println(list.get(0));
The following is the structure it is trying to save into
public static class Class{
public final String recordCode;
public final String institutionCode;
public final String subject;
public final String courseNum;
public final String sectionNum;
public Class(String rc, String ic, String sub, String cn, String sn){
recordCode = rc;
institutionCode = ic;
subject = sub;
courseNum = cn;
sectionNum = sn;
}
}
At the end I wanted to print out one of the variables to see that it's working but it gives me an IndexOutOfBoundsException. I wanted to know if I'm maybe saving the info incorrectly, or am I perhaps trying to get it to print out incorrectly?

You have a space in your split delimiter specification, but no spaces in your data.
String[] splitOut = strLine.split(", "); // <-- notice the space?
This will result in a splitOut array of only length 1, not 5 like you expect.
Since you only add to the list when you see a length of 5, checking the list for the 0th element at the end will result in checking for the first element of an empty list, hence your exception.
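A quick way to confirm this is to print the array lengths for one of the sample lines. This is only a sketch; the class name SplitDelimiterSketch and the helper fieldCount are made up for illustration.

```java
import java.util.Arrays;

class SplitDelimiterSketch {
    // Number of pieces produced by splitting the line on the given delimiter.
    static int fieldCount(String line, String delimiter) {
        return line.split(delimiter).length;
    }

    public static void main(String[] args) {
        String line = "5,003639,MATH,1314,10591";
        System.out.println(fieldCount(line, ", "));           // 1: ", " never occurs in the line
        System.out.println(fieldCount(line, ","));            // 5
        System.out.println(Arrays.toString(line.split(","))); // [5, 003639, MATH, 1314, 10591]
    }
}
```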

If you expect your data to have a comma or a space separating the characters then you would alter the split line to be:
String[] splitOut = strLine.split("[, ]");
The split takes a regular expression as an argument.
Rather than artificially adding commas I would look at String.substring in order to cut the line you have read into pieces. For example:
while ((strLine = br3.readLine())!=null) {
if (strLine.length() != 20)
throw new BadLineException("line length is not valid");
list.add(new Class(strLine.substring(0,1), strLine.substring(1,7), strLine.substring(7,11), strLine.substring(11,15), strLine.substring(15,20)));
}
[ Untested: my numbers might be off because I'm a bit knackered, but you get the idea ]

Splitting string algorithm in Java

I'm trying to make the following algorithm work. What I want to do is split the given string into substrings consisting of either a series of numbers or an operator.
So for this string = "22+2", I would get an array in which [0]="22" [1]="+" and [2]="2".
This is what I have so far, but I get an index out of bounds exception:
public static void main(String[] args) {
String string = "114+034556-2";
int k,a,j;
k=0;a=0;j=0;
String[] subStrings= new String[string.length()];
while(k<string.length()){
a=k;
while(((int)string.charAt(k))<=57&&((int)string.charAt(k))>=48){
k++;}
subStrings[j]=String.valueOf(string.subSequence(a,k-1)); //exception here
j++;
subStrings[j]=String.valueOf(string.charAt(k));
j++;
}}
I would rather be told what's wrong with my reasoning than be offered an alternative, but of course I will appreciate any kind of help.

I'm deliberately not answering this question directly, because it looks like you're trying to figure out a solution yourself. I'm also assuming that you're purposefully not using the split or the indexOf functions, which would make this pretty trivial.
A few things I've noticed:
If your input string is long, you'd probably be better off working with a char array and stringbuilder, so you can avoid memory problems arising from immutable strings
Have you tried catching the exception, or printing out what the value of k is that causes your index out of bounds problem?
Have you thought through what happens when your string terminates? For instance, have you run this through a debugger when the input string is "454" or something similarly trivial?
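For readers who want to compare their own fix afterwards, here is one possible corrected loop, offered as a sketch and not as the only way to do it. Tracing the question's code by hand, the index k is never advanced past the operator, so the second pass calls subSequence(a, k-1) with k-1 smaller than a, which throws. The end index of subSequence is also exclusive, so k-1 would drop the last digit. The class and method names below are made up.

```java
import java.util.ArrayList;
import java.util.List;

class ManualSplitSketch {
    static List<String> tokenize(String s) {
        List<String> out = new ArrayList<String>();
        int k = 0;
        while (k < s.length()) {
            int a = k;
            // Check the bound BEFORE reading the character, so the end of the string is safe.
            while (k < s.length() && Character.isDigit(s.charAt(k))) {
                k++;
            }
            if (k > a) {
                out.add(s.substring(a, k)); // end index is exclusive, so k and not k - 1
            }
            if (k < s.length()) {
                out.add(String.valueOf(s.charAt(k)));
                k++; // step past the operator, otherwise the next pass starts on it again
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("114+034556-2")); // [114, +, 034556, -, 2]
        System.out.println(tokenize("454"));          // [454]
    }
}
```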

You could use a regular expression to split the numbers from the operators using lookahead and lookbehind assertions
String equation = "22+2";
String[] tmp = equation.split("(?=[+\\-/])|(?<=[+\\-/])");
System.out.println(Arrays.toString(tmp));

If you're interested in the general problem of parsing, then I'd recommend thinking about it on a character-by-character level, and moving through a finite state machine with each new character. (Often you'll need a terminator character that cannot occur in the input--such as the \0 in C strings--but we can get around that.)
In this case, you might have the following states:
initial state
just parsed a number.
just parsed an operator.
The characters determine the transitions from state to state:
You start in state 1.
Numbers transition into state 2.
Operators transition into state 3.
The current state can be tracked with something like an enum, changing the state after each character is consumed.
With that setup, then you just need to loop over the input string and switch on the current state.
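// Sketch of the state machine; the State enum and the class name are illustrative.
import java.util.ArrayList;
import java.util.List;

class ExpressionParser {
    enum State { INIT_STATE, JUST_NUM, JUST_OP }

    static List<String> parse(String inputString) {
        State state = State.INIT_STATE;
        StringBuilder acc = new StringBuilder();
        List<String> subStrs = new ArrayList<String>();
        for (char c : inputString.toCharArray()) {
            State next = Character.isDigit(c) ? State.JUST_NUM : State.JUST_OP;
            if (state != next && acc.length() > 0) {
                // state change, so save and reset the accumulator:
                subStrs.add(acc.toString());
                acc.setLength(0);
            }
            // the current character always belongs to the token being built:
            acc.append(c);
            // update the state
            state = next;
        }
        // the last token has no following state change, so save it here:
        if (acc.length() > 0) {
            subStrs.add(acc.toString());
        }
        return subStrs;
    }
}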
With a structure like that, you can more easily add new features / constructs by adding new states and updating the behavior depending on the current state and incoming character. For example, you could add a check to throw errors if letters appear in the string (and include offset locations, if you wanted to track that).

If your criterion is simply "Anything that is not a number", then you can use some simple regex if you don't mind working with parallel arrays:
String[] operands = string.split("\\D"); // split around anything that is NOT a digit
char[] operators = string.replaceAll("\\d", "").toCharArray(); // replace all digits with "" and turn into char array

String input="22+2-3*212/21+23";
String number="";
String op="";
List<String> numbers=new ArrayList<String>();
List<String> operators=new ArrayList<String>();
for(int i=0;i<input.length();i++){
char c=input.charAt(i);
if(i==input.length()-1){
number+=String.valueOf(c);
numbers.add(number);
}else if(Character.isDigit(c)){
number+=String.valueOf(c);
}else{
if(c=='+' || c=='-' || c=='*' ||c=='/'){
op=String.valueOf(c);
operators.add(op);
numbers.add(number);
op="";
number="";
}
}
}
for(String x:numbers){
System.out.println("number="+x+",");
}
for(String x:operators){
System.out.println("operators="+x+",");
}
This will be the output:
number=22,
number=2,
number=3,
number=212,
number=21,
number=23,
operators=+,
operators=-,
operators=*,
operators=/,
operators=+,

Why is the size of this vector 1?

When I use System.out.println to show the size of a vector after calling the following method then it shows 1 although it should show 2 because the String parameter is "7455573;photo41.png;photo42.png" .
private void getIdClientAndPhotonames(String csvClientPhotos)
{
Vector vListPhotosOfClient = new Vector();
String chainePhotos = "";
String photoName = "";
String photoDirectory = new String(csvClientPhotos.substring(0, csvClientPhotos.indexOf(';')));
chainePhotos = csvClientPhotos.substring(csvClientPhotos.indexOf(';')+1);
chainePhotos = chainePhotos.substring(0, chainePhotos.lastIndexOf(';'));
if (chainePhotos.indexOf(';') == -1)
{
vListPhotosOfClient.addElement(new String(chainePhotos));
}
else // aaa;bbb;...
{
for (int i = 0 ; i < chainePhotos.length() ; i++)
{
if (chainePhotos.charAt(i) == ';')
{
vListPhotosOfClient.addElement(new String(photoName));
photoName = "";
continue;
}
photoName = photoName.concat(String.valueOf(chainePhotos.charAt(i)));
}
}
}
So the vector should contain the two String photo41.png and photo42.png , but when I print the vector content I get only photo41.png.
So what is wrong in my code ?

The answer is not valid for this question anymore, because it has been retagged to java-me. Still true if it was Java (like in the beginning): use String#split if you need to handle csv files.
It'd be far easier to split the string:
String[] parts = csvClientPhotos.split(";");
This will give a string array:
{"7455573","photo41.png","photo42.png"}
Then you'd simply copy parts[1] and parts[2] to your vector.
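A minimal sketch of that copy step, assuming Java SE (String#split and generics are not available on Java ME, as noted above). The class and method names are made up, and the loop copies every field after the id instead of hard-coding parts[1] and parts[2].

```java
import java.util.Vector;

class SplitToVectorSketch {
    // Returns every field after the first one (the client id).
    static Vector<String> photoNames(String csvClientPhotos) {
        String[] parts = csvClientPhotos.split(";");
        Vector<String> photos = new Vector<String>();
        for (int i = 1; i < parts.length; i++) {
            photos.addElement(parts[i]);
        }
        return photos;
    }

    public static void main(String[] args) {
        System.out.println(photoNames("7455573;photo41.png;photo42.png")); // [photo41.png, photo42.png]
    }
}
```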

You have two immediate problems.
The first is with your initial manipulation of the string. The two lines:
chainePhotos = csvClientPhotos.substring(csvClientPhotos.indexOf(';')+1);
chainePhotos = chainePhotos.substring(0, chainePhotos.lastIndexOf(';'));
when applied to 7455573;photo41.png;photo42.png will end up giving you photo41.png.
That's because the first line removes everything up to the first ; (7455573;) and the second strips off everything from the final ; onwards (;photo42.png). If your intent is to just get rid of the 7455573; bit, you don't need the second line.
Note that fixing this issue alone will not solve all your ills, you still need one more change.
Even though your input string (to the loop) is the correct photo41.png;photo42.png, you still only add an item to the vector each time you encounter a delimiting ;. There is no such delimiter at the end of that string, meaning that the final item won't be added.
You can fix this by putting the following immediately after the for loop:
if (! photoName.equals(""))
vListPhotosOfClient.addElement(new String(photoName));
which will catch the case of the final name not being terminated with the ;.
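Putting both changes together, the method could look roughly like the sketch below. It returns the vector so the result can be inspected, and it uses Vector<String> for readability; both are my assumptions and differ from the original signature (on Java ME a raw Vector would be needed).

```java
import java.util.Vector;

class ManualPhotoSplitSketch {
    static Vector<String> photoNames(String csvClientPhotos) {
        Vector<String> photos = new Vector<String>();
        // Fix 1: only drop the leading "id;" part; do not cut at the last ';'.
        String chainePhotos = csvClientPhotos.substring(csvClientPhotos.indexOf(';') + 1);
        String photoName = "";
        for (int i = 0; i < chainePhotos.length(); i++) {
            char c = chainePhotos.charAt(i);
            if (c == ';') {
                photos.addElement(photoName);
                photoName = "";
            } else {
                photoName = photoName + c;
            }
        }
        // Fix 2: the last name has no trailing ';', so add it after the loop.
        if (!photoName.equals("")) {
            photos.addElement(photoName);
        }
        return photos;
    }

    public static void main(String[] args) {
        System.out.println(photoNames("7455573;photo41.png;photo42.png")); // [photo41.png, photo42.png]
    }
}
```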

These two lines are the problem:
chainePhotos = csvClientPhotos.substring(csvClientPhotos.indexOf(';') + 1);
chainePhotos = chainePhotos.substring(0, chainePhotos.lastIndexOf(';'));
After the first one, chainePhotos contains "photo41.png;photo42.png", but the second one reduces it to photo41.png, which triggers the if and ends the method with only one element in the vector.
EDITED: what a mess.
I ran it with correct input (as provided by the OP) and made a comment above.
I then fixed it as suggested above, while accidentally changing the input to 7455573;photo41.png;photo42.png; which worked, but is probably incorrect and doesn't match the explanation above input-wise.
I wish someone would un-answer this.

You can split the string manually. If the string uses ; as its separator, you can do it like this:
private void getIdClientAndPhotonames(String csvClientPhotos)
{
Vector vListPhotosOfClient = split(csvClientPhotos);
}
private Vector split(String original) {
Vector nodes = new Vector();
String separator = ";";
// Parse nodes into vector
int index = original.indexOf(separator);
while(index>=0) {
nodes.addElement( original.substring(0, index) );
original = original.substring(index+separator.length());
index = original.indexOf(separator);
}
// Get the last node
nodes.addElement( original );
return nodes;
}
