Java String Split to get individual data from a large string - java

I have the following data which is stored as a big string.
"John Chips Monday \n"
"Tom Pizza Tuesday\n"
"Jerry IceCream Wednesday\n"
"Jennifer Coffee Thursday\n"
Now I want to split this string so I can get the individual values and place each kind of value in its own array. For example:
Each element of the names array stores one of the names seen above, like names[0] = John, names[1] = Tom, etc.
Each element of the food array stores one of the foods seen above, like food[0] = Chips, food[1] = Pizza.
I have tried doing this
John + "\t" + Chips + "-" + Monday + "\n"
Tom + "\t" + Pizza + "-" + Tuesday+ "\n"
Jerry + "\t" + IceCream + "-" + Wednesday+ "\n"
Jennifer + "\t" + Coffee + "-" + Thursday+ "\n"
String nameCol[] = data.split("\\t");
String foodCol[] = data.split("-");
The output I get is nearly there but wrong, as it contains data that I don't want in the array. For example, the output for the first array is
nameCol[0] = John
nameCol[1] = Chips -
nameCol[2] = Monday
Element 0 contains John, but the other elements contain parts I don't want.
I tried using a limit, but this did not work
String nameCol[] = data.split("\\t",1);
String foodCol[] = data.split("-",1);

This will work:
String yourLine = "John Chips Monday\n"; // Read your line in here
String[] resultCol = yourLine.split(" ");
resultCol[2] = resultCol[2].split("\\n")[0];
System.out.println( resultCol[2] );
The first split on the string will give you "John", "Chips" and "Monday\n". The second split takes "Monday\n" from the array, splits it on the newline, and stores "Monday" back into the final index of the array, resultCol[2]. From here you can simply assign each element in the array to the arrays you require.

Don't use them separately; use the delimiters together in one character class, like: String[] dataArr = data.split("[\\t\\n-]");
Then iterate through the String[] three elements at a time (name, food, day):
for (int i = 0; i + 2 < dataArr.length; i += 3) {
String name = dataArr[i];
String food = dataArr[i+1];
String day = dataArr[i+2];
// ... do whatever you want with them.
}
Or, you could also try the similar Pattern-Matcher approach
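The Pattern-Matcher approach mentioned above could look roughly like the following. This is a minimal sketch, assuming every record has the layout from the question (name, tab, food, dash, day, newline). The class name RecordMatcher and the method column are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RecordMatcher {
    // Assumed record layout: name <tab> food - day
    private static final Pattern RECORD = Pattern.compile("(\\S+)\\t(\\S+)-(\\S+)");

    // Collects one capture group (1 = name, 2 = food, 3 = day) from every record.
    static List<String> column(String data, int group) {
        List<String> values = new ArrayList<>();
        Matcher matcher = RECORD.matcher(data);
        while (matcher.find()) {
            values.add(matcher.group(group));
        }
        return values;
    }

    public static void main(String[] args) {
        String data = "John\tChips-Monday\nTom\tPizza-Tuesday\n";
        System.out.println("names: " + column(data, 1));
        System.out.println("foods: " + column(data, 2));
        System.out.println("days:  " + column(data, 3));
    }
}
```

Lines that do not fit the pattern are simply skipped by find(), so malformed records never end up in the wrong column.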

You should:
use Lists for each column, (Lists can increase their size dynamically, while arrays have fixed size)
iterate over each line,
split it on whitespace (or any separator you are using)
and add column[0] to list of names, column[1] to list of food and so on with other columns.
Or, if you know that each line has exactly three words, you could simply use a Scanner and iterate over words instead of splitting lines.
while(scanner.hasNext()){
listOfNames.add(scanner.next());
listOfFood.add(scanner.next());
listOfDays.add(scanner.next());
}
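For completeness, here is a self-contained sketch of the Scanner loop above. It assumes the whole input is already in one String and that the number of words is a multiple of three; if a line is short, scanner.next() throws NoSuchElementException. The class name ScannerColumns is only an illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerColumns {
    final List<String> names = new ArrayList<>();
    final List<String> foods = new ArrayList<>();
    final List<String> days = new ArrayList<>();

    // Reads whitespace-separated words three at a time: name, food, day.
    static ScannerColumns parse(String data) {
        ScannerColumns cols = new ScannerColumns();
        try (Scanner scanner = new Scanner(data)) {
            while (scanner.hasNext()) {
                cols.names.add(scanner.next());
                cols.foods.add(scanner.next());
                cols.days.add(scanner.next());
            }
        }
        return cols;
    }

    public static void main(String[] args) {
        ScannerColumns cols = parse("John Chips Monday \nTom Pizza Tuesday\n");
        System.out.println(cols.names);
        System.out.println(cols.foods);
        System.out.println(cols.days);
    }
}
```

Because Scanner treats any run of whitespace (spaces, tabs, newlines) as a separator by default, the trailing space after "Monday" causes no trouble.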

Try this,
String str="John" + "\t" + "Chips" + "\t" + "Monday" + "-" + "Test"+"\n"+"chet";
String st1= str.replaceAll("(\\t)|(-)|(\n)"," ");
String []st=st1.split(" ");
for(String s : st)
System.out.println(s);

From your data, I assume that you are reading these values from a file. If you know how many lines there are, you could use 3 arrays, one for each type of data that needs to be retrieved. If you don't know the size, you could go with 3 ArrayLists. Your problem is that after making the split, you didn't put the values in the correct arrays. The following code assumes that you already have all the data in one String.
final String values[] = data.split("\\n");
final ArrayList<String> names = new ArrayList<String>();
final ArrayList<String> foods = new ArrayList<String>();
final ArrayList<String> days = new ArrayList<String>();
for (String line : values) {
String[] split = line.trim().split("[ ]+");
names.add(split[0]);
foods.add(split[1]);
days.add(split[2]);
}
Another thing you must consider is whether the data always has 3 values on a "line"; if it does not, further error checking is needed.
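One possible shape for that error checking is sketched below. It is not part of the answer's code: the class name SafeColumns and the choice to skip (rather than reject) malformed lines are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SafeColumns {
    // Returns only the rows that contain exactly three whitespace-separated values.
    static List<String[]> parseRows(String data) {
        List<String[]> rows = new ArrayList<>();
        for (String line : data.split("\\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length != 3) {
                System.err.println("Skipping malformed line: " + trimmed);
                continue;
            }
            rows.add(parts);
        }
        return rows;
    }

    public static void main(String[] args) {
        String data = "John Chips Monday \nTom Pizza\n\nJerry IceCream Wednesday\n";
        for (String[] row : parseRows(data)) {
            System.out.println(row[0] + " / " + row[1] + " / " + row[2]);
        }
    }
}
```

Each returned row can then be distributed over the names, foods and days lists exactly as in the loop above.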

If your string is always going to be "[name] [food] [day]", then you could do:
String[] names = new String[allData.length]; //A list of names
String[] food = new String[allData.length]; //A list of food
String[] day = new String[allData.length]; //A list of days
for(int i = 0 ; i < allData.length ; i++)
{
String[] contents = allData[i].split(" "); //Or use a similar delimiter.
names[i] = contents[0];
food[i] = contents[1];
day[i] = contents[2];
}

Try this code
String s = "John Chips Monday \n Tom Pizza Tuesday \n Jerry IceCream Wednesday \n Jennifer Coffee Thursday \n";
String split[] = s.split("\n");
String names[] = new String[split.length];
String foods[] = new String[split.length];
String days[] = new String[split.length];
for (int i = 0; i < split.length; i++) {
String split1[] = split[i].trim().split(" ");
names[i]=split1[0];
foods[i]=split1[1];
days[i]=split1[2];
System.out.println("name=" + names[i] + ",food=" + foods[i] + ",day=" + days[i]);
}

Related

How to Split a text file with no pattern

I am given a messy set of data and have to split the data into categories. The first line is the tax rate. The following lines are supposed to be the item name, the number of items, and the price (columns). I just need help splitting the data accordingly. Any help would be appreciated.
0.05 5
Colored Pencils 3 0.59
Notebook 5 0.99
AAA Battery 5 0.99
Java Book 5 59.95
iPhone X 2 999.99
iPhone 8 3 899.99.
import java.io.*;
import java.util.Scanner;
public class ShoppingBagClient {
public static void main(String[] args) {
File file = new File("data.txt");
shoppingBag(file);
}
public static void shoppingBag (File file) {
Scanner scan;
String itemName=" ";
int quantity = 0;
float price = 0;
int count = 0;
float taxRate;
ShoppingBag shoppingBag = new ShoppingBag(6, .5f);
try {
scan = new Scanner(new BufferedReader(new FileReader(file)));
String dilimeter;
while(count < 1) {
String line = scan.nextLine();
String[] arr = line.split(" ");
taxRate = Float.parseFloat(arr[0]);
}
while(scan.hasNextLine() && count > 0){
String line = scan.nextLine();
String delimeter;
String arr[] = line.split();
itemName = arr[0];
quantity = Integer.parseInt(arr[1]);
price = Float.parseFloat(arr[2]);
Item item = new Item(itemName, quantity, price);
shoppingBag.push(item);
System.out.println(itemName);
}
}
catch(IOException e) {
e.printStackTrace();
}
}
}
@Henry's comment gives a good approach. If you know the structure of each line and that it is delimited in a consistent manner (e.g. single space-separated) then you can combine lastIndexOf and substring to do the job.
String delimiter = " ";
String line = scan.nextLine(); // "Colored pencils 3 0.59"
int last = line.lastIndexOf(delimiter); // "Colored pencils 3_0.59"
String price = line.substring(last + 1); // "0.59"
line = line.substring(0, last); // "Colored pencils 3"
last = line.lastIndexOf(delimiter); // "Colored pencils_3"
String quantity = line.substring(last + 1); // "3"
line = line.substring(0, last); // "Colored pencils"
String product = line;
This can be refactored to be tidier but illustrates the point. Be mindful that if the line ends with the delimiter, substring(last + 1) returns an empty string, which will then fail when parsed as a number. You should also check for the case where lastIndexOf does not find a match: it returns -1, and substring(0, last) then throws a StringIndexOutOfBoundsException.
Edit: The price and quantity can then be converted to an int or float as necessary.
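A tidier version with those checks might look like the sketch below. It assumes fields are separated by single spaces; the class name ItemLineParser, the method splitFromEnd, and the choice to return null for lines with fewer than three fields are all illustrative assumptions.

```java
class ItemLineParser {
    // Splits "name quantity price" from the right.
    // Returns {name, quantity, price}, or null if the line has fewer than three fields.
    static String[] splitFromEnd(String line) {
        String trimmed = line.trim();
        int priceAt = trimmed.lastIndexOf(' ');
        if (priceAt <= 0) {
            return null; // no delimiter found
        }
        int quantityAt = trimmed.lastIndexOf(' ', priceAt - 1);
        if (quantityAt <= 0) {
            return null; // only two fields
        }
        String name = trimmed.substring(0, quantityAt);
        String quantity = trimmed.substring(quantityAt + 1, priceAt);
        String price = trimmed.substring(priceAt + 1);
        return new String[] { name, quantity, price };
    }

    public static void main(String[] args) {
        String[] fields = splitFromEnd("Colored Pencils 3 0.59");
        int quantity = Integer.parseInt(fields[1]);
        float price = Float.parseFloat(fields[2]);
        System.out.println(fields[0] + " x" + quantity + " @ " + price);
    }
}
```

Trimming first avoids the trailing-delimiter case, and the two-argument lastIndexOf searches backwards from a given index, so the line never has to be shortened in place.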
Because the item name doesn't have any restrictions, and as shown can include both spaces and numbers, it would be difficult to start processing the line from the beginning. On the other hand processing from the end is easier.
Consider this change to your code:
String arr[] = line.split(" ");
int len = arr.length;
float price = Float.parseFloat(arr[len - 1]);
int quantity = Integer.parseInt(arr[len - 2]);
String itemName = "";
for(int i = 0; i < len - 2; i++)
itemName += arr[i] + " ";
itemName = itemName.trim(); // drop the trailing space
This works because you know the last element will always be the price, and the second to last will always be the quantity. Therefore the rest of the array contains the name.
Alternatively, you could use a Java 8 stream to build the name:
itemName = Stream.of(arr).limit(len - 2).collect(Collectors.joining(" "));

How can I find compound string in text

I have been searching for a way to find compound strings like howareyou in a sentence and remove them from it. For example:
We have a sentence - Hello there, how are you?
And compound - how are you
As a result I want this string - Hello there, ? - with the compound removed.
My current solution splits the string into words and checks whether the compound contains each word, but it doesn't work well: any other word in the text that happens to appear inside the compound is also removed. E.g.:
If we look for foreseenfuture in this string - I have foreseen future for all of you - then my solution also removes for, because it is contained in the compound.
Code
String[] words = text.split("[^a-zA-Z]");
String compound = "foreseenfuture";
int startIndex = -1;
int endIndex = -1;
for(String word : words){
if(compound.contains(word)){
if(startIndex == -1){
startIndex = text.indexOf(word);
}
endIndex = text.indexOf(word) + word.length() - 1;
}
}
if(startIndex != -1 && endIndex != -1){
text = text.substring(0, startIndex) + "" + text.substring(endIndex + 1, text.length() - 1);
}
So, is there any other way to solve this?
I'm going to assume that when you form a compound you only remove whitespace. With this assumption, "for,seen future. for seen future" would become "for,seen future. ", since the comma breaks up the first occurrence. In that case, this should work:
String example1 = "how are you?";
String example2 = "how, are you... here?";
String example3 = "Madam, how are you finding the accommodations?";
String example4 = "how are you how are you how are you taco";
String compound = "howareyou";
StringBuilder compoundRegexBuilder = new StringBuilder();
//This matches to a word boundary before the first word
compoundRegexBuilder.append("\\b");
// inserts each character into the regex
for(int i = 0; i < compound.length(); i++) {
compoundRegexBuilder.append(compound.charAt(i));
// between each letter there could be any amount of whitespace
if(i<compound.length()-1) {
compoundRegexBuilder.append("\\s*");
}
}
// Makes sure the last word isn't part of a larger word
compoundRegexBuilder.append("\\b");
String compoundRegex = compoundRegexBuilder.toString();
System.out.println(compoundRegex);
System.out.println("Example 1:\n" + example1 + "\n" + example1.replaceAll(compoundRegex, ""));
System.out.println("\nExample 2:\n" + example2 + "\n" + example2.replaceAll(compoundRegex, ""));
System.out.println("\nExample 3:\n" + example3 + "\n" + example3.replaceAll(compoundRegex, ""));
System.out.println("\nExample 4:\n" + example4 + "\n" + example4.replaceAll(compoundRegex, ""));
The output is as follows:
\bh\s*o\s*w\s*a\s*r\s*e\s*y\s*o\s*u\b
Example 1:
how are you?
?
Example 2:
how, are you... here?
how, are you... here?
Example 3:
Madam, how are you finding the accommodations?
Madam, finding the accommodations?
Example 4:
how are you how are you how are you taco
taco
You can also use this to match any other alpha-numeric compound.
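To reuse this for arbitrary compounds, the regex construction can be wrapped in a helper. The sketch below is one way to do that; the names CompoundRemover, compoundPattern and remove are made up, and Pattern.quote is added (it is not in the code above) so that a character with a special meaning in regex is treated literally.

```java
import java.util.regex.Pattern;

class CompoundRemover {
    // Builds a pattern that matches the compound's characters in order,
    // allowing any amount of whitespace between consecutive characters.
    static Pattern compoundPattern(String compound) {
        StringBuilder regex = new StringBuilder("\\b");
        for (int i = 0; i < compound.length(); i++) {
            if (i > 0) {
                regex.append("\\s*");
            }
            regex.append(Pattern.quote(String.valueOf(compound.charAt(i))));
        }
        regex.append("\\b");
        return Pattern.compile(regex.toString());
    }

    // Removes every occurrence of the compound from the text.
    static String remove(String text, String compound) {
        return compoundPattern(compound).matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(remove("Hello there, how are you?", "howareyou"));
        System.out.println(remove("I have foreseen future for all of you", "foreseenfuture"));
    }
}
```

Note that the whitespace surrounding a removed compound is left in place, so you may want to collapse double spaces afterwards.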

How do I output to the console in a specific layout?

I am working on a small project that takes user input (match results) on one line, splits the input and outputs the same data in a different format. I am struggling to find a way to output the data in a specific format. As well as the total number of games played, I want my program to produce chart-like output in the format
home_name [home_score] | away_name [away_score]
This is the code I have at the minute which allows users to input results line after line in the following format
home_name : away_name : home_score : away_score
until they enter stop, which breaks the loop (and hopefully soon outputs the data).
import java.util.*;
public class results {
public static void main(String[] args) {
Scanner scan = new Scanner(System.in);
int totalGames = 0;
String input = null;
System.out.println("Please enter results in the following format"
+ " home_name : away_name : home_score : away_score"
+ ", or enter stop to quit");
while (null != (input = scan.nextLine())){
if ("stop".equals(input)){
break;
}
String results[] = input.split(" : ");
for (int x = 0; x < results.length; x++) {
}
totalGames++;
}
System.out.println("Total games played is " + totalGames);
}
}
You can see here.
You can format your text as you wish.
The general syntax is
%[arg_index$][flags][width][.precision]conversion char
Argument numbering starts with 1 (not 0). So to print the first argument, you should use 1$ (if you are using explicit ordering).
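As a small illustration of that syntax applied to the requested layout, the sketch below uses explicit argument indexes, the "-" flag (left-justify) and a width. The column width of 12 and the names FormatDemo and row are arbitrary choices for the example.

```java
class FormatDemo {
    // %1$-12s : argument 1, left-justified, padded to 12 characters
    // %2$d    : argument 2 as a decimal integer
    static String row(String home, int homeScore, String away, int awayScore) {
        return String.format("%1$-12s [%2$d] | %3$-12s [%4$d]",
                home, homeScore, away, awayScore);
    }

    public static void main(String[] args) {
        System.out.println(row("Belenenses", 6, "Benfica", 0));
        System.out.println(row("team1", 1, "team2", 0));
    }
}
```

With a fixed width, names longer than 12 characters are not truncated; they just push the rest of the row to the right.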
You can use a regex to parse the line:
(\w+)\s(\w+)\s\|\s(\w+)\s(\w+)
Based on the Java code from http://tutorials.jenkov.com/java-regex/matcher.html:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class MatcherFindStartEndExample{
public static void main(String[] args){
String text = "Belenenses 6 | Benfica 0";
String patternString = "(\\w+)\\s(\\w+)\\s\\|\\s(\\w+)\\s(\\w+)";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(text);
while (matcher.find()){
System.out.println("found: " + matcher.group(1));
System.out.println("found: " + matcher.group(2));
System.out.println("found: " + matcher.group(3));
System.out.println("found: " + matcher.group(4));
}
}}
Use this code instead of your
String results[] = input.split(" : ");
for (int x = 0; x < results.length; x++) {
}
You should do this in two steps:
1) Retrieve the information entered by the user and store it in instances of a custom class: PlayerResult.
2) Produce the output in the expected format. You should also compute the maximum width of each column before rendering the table.
Otherwise the rendering could look ugly.
First step :
List<PlayerResult> playerResults = new ArrayList<PlayerResult>();
...
String[] results = input.split(" : ");
playerResults.add(new PlayerResult(results[0],results[1],results[2],results[3]));
Second step :
// compute length of column
int[] lengthByColumn = computeLengthByColumn(playerResults);
int lengthHomeColumn = lengthByColumn[0];
int lengthAwayColumn = lengthByColumn[1];
// render header
System.out.print(adjustLength("home_name [home_score]", lengthHomeColumn));
System.out.println(adjustLength("away_name [away_score]", lengthAwayColumn));
// render data
for (PlayerResult playerResult : playerResults){
System.out.print(adjustLength(playerResult.getHomeName() + "[" + playerResult.getHomeScore() + "]", lengthHomeColumn));
System.out.println(adjustLength(playerResult.getAwayName() + "[" + playerResult.getAwayScore() + "]", lengthAwayColumn));
}
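The answer above leaves PlayerResult, computeLengthByColumn and adjustLength undefined. The following is a hypothetical sketch of what they could look like; the field layout, the " | " separator and the render method are assumptions, not the answerer's actual code.

```java
import java.util.ArrayList;
import java.util.List;

class ResultTable {
    static final String HOME_HEADER = "home_name [home_score]";
    static final String AWAY_HEADER = "away_name [away_score]";

    // Hypothetical value class holding one line of user input.
    static class PlayerResult {
        final String homeName, awayName, homeScore, awayScore;

        PlayerResult(String homeName, String awayName, String homeScore, String awayScore) {
            this.homeName = homeName;
            this.awayName = awayName;
            this.homeScore = homeScore;
            this.awayScore = awayScore;
        }

        String homeCell() { return homeName + " [" + homeScore + "]"; }
        String awayCell() { return awayName + " [" + awayScore + "]"; }
    }

    // Pads text with trailing spaces up to the given length.
    static String adjustLength(String text, int length) {
        StringBuilder padded = new StringBuilder(text);
        while (padded.length() < length) {
            padded.append(' ');
        }
        return padded.toString();
    }

    // Returns {widest home cell, widest away cell}, headers included.
    static int[] computeLengthByColumn(List<PlayerResult> results) {
        int home = HOME_HEADER.length();
        int away = AWAY_HEADER.length();
        for (PlayerResult r : results) {
            home = Math.max(home, r.homeCell().length());
            away = Math.max(away, r.awayCell().length());
        }
        return new int[] { home, away };
    }

    // Renders the header and one aligned row per result.
    static String render(List<PlayerResult> results) {
        int[] width = computeLengthByColumn(results);
        StringBuilder out = new StringBuilder();
        out.append(adjustLength(HOME_HEADER, width[0])).append(" | ").append(AWAY_HEADER).append('\n');
        for (PlayerResult r : results) {
            out.append(adjustLength(r.homeCell(), width[0])).append(" | ").append(r.awayCell()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<PlayerResult> results = new ArrayList<>();
        String[] parts = "team1 : team2 : 1 : 0".split(" : ");
        results.add(new PlayerResult(parts[0], parts[1], parts[2], parts[3]));
        System.out.print(render(results));
    }
}
```

Computing the widths before printing is what keeps the "|" separators aligned even when team names differ in length.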
You can keep the game statistics by adding the values of the results array to the finalResults ArrayList, and then print them when stop is entered.
For counting the total score per team, a HashMap<String, Integer> is a good choice.
Here is the complete code with comments to make it clear:
import java.util.*;
// following the naming conventions class name must start with a capital letter
public class Results {
public static void main(String[] args) {
Scanner scan = new Scanner(System.in);
int totalGames = 0;
String input;
System.out.println("Please enter results in the following format: \n"
+ "'HOME_NAME : AWAY_NAME : HOME_SCORE : AWAY_SCORE' \n"
+ "or enter 'stop' to quit");
// HashMap to keep team name as a key and its total score as value
Map<String, Integer> scoreMap = new HashMap<>();
// ArrayList for storing game history
List<String> finalResults = new ArrayList<>();
// don't compare null to value. Read more http://stackoverflow.com/questions/6883646/obj-null-vs-null-obj
while ((input = scan.nextLine()) != null) {
if (input.equalsIgnoreCase("stop")) { // 'Stop', 'STOP' and 'stop' are all OK
scan.close(); // close Scanner object
break;
}
String[] results = input.split(" : ");
// add result as String.format. Read more https://examples.javacodegeeks.com/core-java/lang/string/java-string-format-example/
finalResults.add(String.format("%s [%s] | %s [%s]", results[0], results[2], results[1], results[3]));
// check if the map already contains the team
// results[0] and results[1] are team names, results[2] and results[3] are their scores
for (int i = 0; i < 2; i++) {
// here is used the Ternary operator. Read more http://alvinalexander.com/java/edu/pj/pj010018
scoreMap.put(results[i], !scoreMap.containsKey(results[i]) ?
Integer.valueOf(results[i + 2]) :
Integer.valueOf(scoreMap.get(results[i]) + Integer.valueOf(results[i + 2])));
}
totalGames++; // increment totalGames
}
System.out.printf("%nTotal games played: %d.%n", totalGames); // output the total played games
// output the games statistics from ArrayList finalResults
for (String finalResult : finalResults) {
System.out.println(finalResult);
}
// output the score table from HashMap scoreMap
System.out.println("\nScore table:");
for (Map.Entry<String, Integer> score : scoreMap.entrySet()) {
System.out.println(score.getKey() + " : " + score.getValue());
}
}
}
Now testing with input:
team1 : team2 : 1 : 0
team3 : team1 : 3 : 2
team3 : team2 : 2 : 2
sToP
The output is:
Total games played: 3.
team1 [1] | team2 [0]
team3 [3] | team1 [2]
team3 [2] | team2 [2]
Score table:
team3 : 5
team1 : 3
team2 : 2

ReplaceAll and Regular Expression

I am trying to insert tuples into newly created tables of a database schema I am building for SQL.
The issue is that I should expect the first line to be
ssn INTEGER(9), cname VARCHAR(25), gender VARCHAR(6), age VARCHAR(3), profession VARCHAR(25)
But I want it to just be this:
ssn, cname, gender, age, profession
The previous method I tried, with two splits (one on the space and the other on the comma), is not working, so I thought using replaceAll would be easier. However, I am not sure what to use for the regular expression. How should it be written?
private static String parseFile (String[] x, Connection conn,
String tableName) {
// assume the first line is the relation name layout
String query = "INSERT INTO " + tableName;
String firstLine = x[0];
//System.out.println(firstLine);
String[] splits = firstLine.split(" ");
String[] finalSplit = new String[50];
String finalString = "";
for (int i=0; i<splits.length; i++) {
int counter = 0;
String[] split2 = splits[i].split(",");
//System.out.println (splits[i]);
for (int j=0; j<split2.length; j++) {
finalSplit[j+counter] = split2[j];
//System.out.println (split2[j]);
if (j%2 == 0)
finalString += split2[j];
counter += 1;
}
} // end outside for
System.out.println ("The attribute string is: " + finalString);
for (int i=1 ; i<x.length; i++)
{
String line = x[i];
String Final = query + " " + finalString + " " + line;
System.out.println ("Final string: " + Final);
}
return finalString;
}
I would appreciate a bit of guidance here.
EDIT:
Some of the output is:
The attribute string is: ssnINTEGER(9)cnameVARCHAR(25)genderVARCHAR(6)ageVARCHAR(3)professionVARCHAR(25)
Final string: INSERT INTO customer ssnINTEGER(9)cnameVARCHAR(25)genderVARCHAR(6)ageVARCHAR(3)professionVARCHAR(25) 3648993,Emily,male,63,Consulting
Final string: INSERT INTO customer ssnINTEGER(9)cnameVARCHAR(25)genderVARCHAR(6)ageVARCHAR(3)professionVARCHAR(25) 5022334,Barbara,male,26,Finance
Final string: INSERT INTO customer ssnINTEGER(9)cnameVARCHAR(25)genderVARCHAR(6)ageVARCHAR(3)professionVARCHAR(25) 1937686,Tao,female,5,IT
Some of the input of x is:
ssn INTEGER(9), cname VARCHAR(25), gender VARCHAR(6), age VARCHAR(3), profession VARCHAR(25)
3648993,Emily,male,63,Consulting
5022334,Barbara,male,26,Finance
1937686,Tao,female,5,IT
Try
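firstLine = firstLine.replaceAll(" [A-Z]+\\(\\d+\\)","");
Explanation: This regex finds a space followed by one or more capital letters, a left parenthesis, one or more digits, and a right parenthesis. The commas between the columns are not part of the match, so they are kept.
replaceAll replaces all instances of this with an empty string and returns the result as a new String, so assign it back to the variable.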
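Here is a small runnable check of that regex on the header from the question. The class and method names (ColumnNames, stripTypes, insertPrefix) are illustrative, and wrapping the column list in parentheses is an assumption about the SQL you want to build.

```java
class ColumnNames {
    // Removes " TYPE(n)" after each column name, leaving "ssn, cname, ...".
    static String stripTypes(String header) {
        return header.replaceAll(" [A-Z]+\\(\\d+\\)", "");
    }

    // Builds the leading part of an INSERT statement with the column list.
    static String insertPrefix(String tableName, String header) {
        return "INSERT INTO " + tableName + " (" + stripTypes(header) + ")";
    }

    public static void main(String[] args) {
        String header = "ssn INTEGER(9), cname VARCHAR(25), gender VARCHAR(6), "
                + "age VARCHAR(3), profession VARCHAR(25)";
        System.out.println(stripTypes(header));
        System.out.println(insertPrefix("customer", header));
    }
}
```

The pattern only covers types written as uppercase letters with a single number in parentheses; a type without a length (such as DATE) or with two numbers (such as DECIMAL(10,2)) would need a broader regex.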

Rearranging a string in Java

I have a String of the form "Firstname MiddleInitial Lastname".
I want to convert it to "Lastname, Firstname MiddleInitial"
Some names may have a middle initial, but some may not:
String name1 = "John Papa P";
String name2 = "Michael Jackson";
// Desired Output
result1 = "Papa, John P";
result2 = "Jackson, Michael";
How can I accomplish this?
Maybe something like this?
public class HelloWorld{
public static void main(String []args){
String name1 = "John Papa P";
String name2 = "Michael Jackson";
String[] split = name1.split(" ");
String result;
if (split.length > 2) {
result = split[1] + ", " + split[0] + " " + split[2];
} else {
result = split[1] + ", " + split[0];
}
System.out.println(result);
}
}
You can use the split() method on your String to split it into an array of Strings using a space as a delimiter, and rearrange the array as necessary.
A possible way to do this is to use the split function to turn each name into an array.
String one = "John Doe";
String two = "Albert Einstein";
String [] onelst = one.split(" ");
String [] twolst = two.split(" ");
String oneMod = onelst[1]+" "+onelst[0];
String twoMod = twolst[1]+" "+twolst[0];
System.out.println(oneMod);
System.out.println(twoMod);
Output for this:
Doe John
Einstein Albert
Just use split() to create an array of names. Then use the array's length field (arrays have no size() method) to get its size: if it's 3 you have a MiddleInitial, if it's 2 you don't.
Then for each case rearrange the array as you want.
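Putting the length check into a reusable method might look like this. The sketch follows the word order of the example data in the question ("John Papa P", where the last name is the second word and the initial comes last); the names NameFormatter and rearrange, and the decision to return unexpected input unchanged, are assumptions.

```java
class NameFormatter {
    // "First Last"         -> "Last, First"
    // "First Last Initial" -> "Last, First Initial"
    static String rearrange(String fullName) {
        String[] parts = fullName.trim().split("\\s+");
        if (parts.length == 3) {
            return parts[1] + ", " + parts[0] + " " + parts[2];
        }
        if (parts.length == 2) {
            return parts[1] + ", " + parts[0];
        }
        return fullName.trim(); // unexpected shape: leave unchanged
    }

    public static void main(String[] args) {
        System.out.println(rearrange("John Papa P"));
        System.out.println(rearrange("Michael Jackson"));
    }
}
```

Splitting on "\\s+" instead of a single space means accidental double spaces in the input do not produce empty array elements.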
