I have a text file and I need to read its data into a 2D array. The file contains strings as well as numbers.
String[][] arr = new String[3][5];
BufferedReader br = new BufferedReader(new FileReader("C:/Users/kp/Desktop/sample.txt"));
String line = " ";
String [] temp;
int i = 0;
while ((line = br.readLine())!= null){
temp = line.split(" ");
for (int j = 0; j<arr[i].length; j++) {
arr[i][j] = (temp[j]);
}
i++;
}
The sample text file is:
name age salary id gender
jhon 45 4900 22 M
janey 33 4567 33 F
philip 55 5456 44 M
When the name is a single word with no space in it, the code works, but it fails when the name is something like "jhon desuja". How can I overcome this?
I need to store the data in a 2D array. How can I validate the input? For example, the name should not contain numbers, and the age should not be negative or contain letters. Any help will be highly appreciated.
A regular expression might be a better option:
Pattern p = Pattern.compile("(.+) (\\d+) (\\d+) (\\d+) ([MF])");
String[] test = new String[]{"jhon 45 4900 22 M","janey 33 4567 33 F","philip 55 5456 44 M","john mayer 56 4567 45 M"};
for(String line : test){
Matcher m = p.matcher(line);
if(m.find())
System.out.println(m.group(1) +", " +m.group(2) +", "+m.group(3) +", " + m.group(4) +", " + m.group(5));
}
which prints
jhon, 45, 4900, 22, M
janey, 33, 4567, 33, F
philip, 55, 5456, 44, M
john mayer, 56, 4567, 45, M
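The question also asks about validation and about keeping the result in a 2D array. The following is only a minimal sketch of one way to combine both, assuming one record per line, a name made of letters and single spaces, and non-negative integers for age, salary and id. The class name `RecordParser` and the methods `parseLine` and `parseAll` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RecordParser {
    // Assumed layout: name (letters and single spaces), age, salary, id, gender.
    private static final Pattern ROW =
            Pattern.compile("([A-Za-z]+(?: [A-Za-z]+)*) (\\d+) (\\d+) (\\d+) ([MF])");

    /** Returns the five fields of a valid line, or null if the line is invalid. */
    public static String[] parseLine(String line) {
        Matcher m = ROW.matcher(line.trim());
        if (!m.matches()) {
            return null; // e.g. digits in the name, or a negative or non-numeric age
        }
        String[] fields = new String[5];
        for (int j = 0; j < fields.length; j++) {
            fields[j] = m.group(j + 1);
        }
        return fields;
    }

    /** Collects the valid rows into a 2D array; invalid lines are skipped. */
    public static String[][] parseAll(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            String[] fields = parseLine(line);
            if (fields != null) {
                rows.add(fields);
            }
        }
        return rows.toArray(new String[0][]);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "name age salary id gender",   // header: rejected, no numeric columns
                "jhon desuja 45 4900 22 M",
                "janey -33 4567 33 F");        // negative age: rejected
        for (String[] row : parseAll(lines)) {
            System.out.println(String.join(", ", row));
        }
    }
}
```

Because `matches()` must consume the whole line, a line that fails any single field check is rejected as a whole. Collecting rows in a list first avoids having to know the number of lines before sizing the array.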
I have to divide an address into street and number. Examples
Lievensberg 31D
Jablunkovska 21/2
Weimarstraat 113 A
Pastoor Baltesenstraat 22
Van Musschenboek strasse 84
I need to split like this:
Street1: Lievensberg
Number1: 31D
Street2: Jablunkovska
Number2: 21/2
Street3: Weimarstraat
Number3: 113 A
Street4: Pastoor Baltesenstraat
Number4: 22
Street5: Van Musschenboek strasse
Number5: 84
I used this code, but it does not work, because I need to split only when the character after the whitespace is a digit:
String[] arrSplit = address_line.split("\\s");
for (int i = 0; i < arrSplit.length; i++) {
System.out.println(arrSplit[i]);
}
But I don't know how to do it so that all my requirements are met. Any idea?
If the number can be optional, instead of using split, you could use 2 capturing groups where the second group is optional.
^([^\d\r\n]+?)(?:\h*(\d.*)|$)
Explanation
^ Start of string
([^\d\r\n]+?) Capture group 1: match 1+ times any char except a digit or newline, non-greedy
(?: Non-capturing group
\h*(\d.*) Match 0+ horizontal whitespace chars, then capture in group 2 a digit followed by the rest of the line
| Or
$ End of string
) Close non capture group
Example code
String regex = "^([^\\d\\r\\n]+?)(?:\\h*(\\d.*)|$)";
String string = "Lievensberg 31D\n"
+ "Jablunkovska 21/2\n"
+ "Weimarstraat 113 A\n"
+ "Pastoor Baltesenstraat 22\n"
+ "Van Musschenboek strasse 84\n"
+ "Lievensberg";
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Street: " + matcher.group(1));
if (matcher.group(2) != null) {
System.out.println("Number: " + matcher.group(2));
}
System.out.println("------------------");
}
Output
Street: Lievensberg
Number: 31D
------------------
Street: Jablunkovska
Number: 21/2
------------------
Street: Weimarstraat
Number: 113 A
------------------
Street: Pastoor Baltesenstraat
Number: 22
------------------
Street: Van Musschenboek strasse
Number: 84
------------------
Street: Lievensberg
------------------
Something like this:
ArrayList<String> list = new ArrayList<>();
list.add("Lievensberg 31D");
list.add("Jablunkovska 21/2");
list.add("Weimarstraat 113 A");
list.add("Pastoor Baltesenstraat 22");
list.add("Van Musschenboek strasse 84");
for(int i=0;i<list.size();i++){
System.out.println("Street"+(i+1)+": "+ list.get(i).split("\\s+(?=\\d)")[0]);
System.out.println("Number"+(i+1)+": "+ list.get(i).split("\\s+(?=\\d)")[1]);
}
You can use regex to verify first whether it matches or not, then only process it.
String str1 = "Lievensberg 31D"; // street = Lievensberg, number = 31D
String str2 = "Lievensberg NN31D"; // doesn't match
String str3 = "Lievensberg"; // street = Lievensberg, number = null
String str4 = "Pastoor Baltesenstraat 22"; // street = Pastoor Baltesenstraat, number = 22
Pattern pattern = Pattern.compile("([a-zA-Z ]+?)(\\s(\\d+)(.*))?");
Matcher matcher = pattern.matcher(str1);
if(matcher.matches()) {
String street = matcher.group(1);
String number = matcher.group(2) != null ? matcher.group(3) + matcher.group(4) : null;
System.out.println("street = " + street);
System.out.println("number = " + number);
}
You can use this logic:
Find the index of the first number
Split the string based on this index
For better understanding use below code
public static void main(String[] args) {
String address_line = "Weimarstraat 113 A";
// Walk the characters up to the first digit; i ends at the index just before it
int i = -1;
for(char c: address_line.toCharArray() ){
if('0'<=c && c<='9')
break;
i++;
}
//Split string using index
System.out.println(address_line.substring(0, i));
System.out.println(address_line.substring(i+1));
}
Its output will be:
Weimarstraat
113 A
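The index-based steps above can also be sketched so that an address without any digit is handled explicitly. This is only an illustration under that assumption; the class name `AddressSplitter` and the method `split` are hypothetical.

```java
import java.util.Arrays;

public class AddressSplitter {
    /** Returns {street, number}; number is "" when the line contains no digit. */
    public static String[] split(String addressLine) {
        // Step 1: find the index of the first ASCII digit.
        int firstDigit = -1;
        for (int i = 0; i < addressLine.length(); i++) {
            char c = addressLine.charAt(i);
            if (c >= '0' && c <= '9') {
                firstDigit = i;
                break;
            }
        }
        if (firstDigit < 0) {
            return new String[] { addressLine.trim(), "" };
        }
        // Step 2: split the string at that index and drop the separating space.
        return new String[] {
            addressLine.substring(0, firstDigit).trim(),
            addressLine.substring(firstDigit).trim()
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("Weimarstraat 113 A")));
        System.out.println(Arrays.toString(split("Lievensberg")));
    }
}
```

Storing the digit's own index, rather than the index before it, keeps both `substring` calls free of off-by-one adjustments.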
Here's a simple solution using regex and split:
String str = "Jablunkovska 21/2";
String[] split = str.split("\\s(?=\\d)", 2);
System.out.println(Arrays.toString(split));
Output:
[Jablunkovska, 21/2]
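The expression (?=\\d) is a lookahead for a digit: it only checks that a digit follows the whitespace without consuming it, so the digit is not removed by the split. The limit of 2 ensures the string is split at most once.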
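To see how the lookahead split behaves on the other sample addresses, here is a small sketch. The class name `LookaheadSplit` and the method `splitOnce` are hypothetical, and `\\s+` is used instead of `\\s` on the assumption that several spaces may separate street and number.

```java
import java.util.Arrays;

public class LookaheadSplit {
    /**
     * Splits at the first whitespace run that is followed by a digit.
     * The limit of 2 means at most one split is made.
     */
    public static String[] splitOnce(String address) {
        return address.split("\\s+(?=\\d)", 2);
    }

    public static void main(String[] args) {
        String[] samples = { "Jablunkovska 21/2", "Weimarstraat 113 A", "Lievensberg" };
        for (String s : samples) {
            System.out.println(Arrays.toString(splitOnce(s)));
        }
    }
}
```

An address without a number comes back as a one-element array, so callers should check the array length before reading index 1.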
Edit
Here is the updated code. I got the strings loaded with the matches, but now how do I match them against the second file?
The strings contain this for reference:
String 1
1913 2016 1 1913 186 2016 1711 32843 2016 518
3 1913 32843 32001 4 250 5 3500 6 7
8 27 73 9 10 1711 73 11 2
12 1913 19 1930 20 21 1947
22 1955 23 1961 23 1969 27 1995 26 27
1962 28 29 30 1970 31 31
String 2
1.4 33.75278 84.38611
And I need to see how many times each number matches with this string
1913
2016
32843
31
27
1.4
4
7
2
23
public static void main(String[] args) throws FileNotFoundException {
Scanner INPUT_TEXT = new Scanner(new File("C:\\Users\\josep\\OneDrive\\Classes\\Programming\\assignment3_1.txt"));
INPUT_TEXT.useDelimiter(" ");
int count = 0;
ArrayList<String> integerInFirstFile =new ArrayList<>();
ArrayList<String> decimalsInFirstFile =new ArrayList<>();
while (INPUT_TEXT.hasNext()) {
String TempString = INPUT_TEXT.next();
String temp1 = TempString.replaceAll("[\\,]", "");
String pattern1 = "[0-9]+\\.{1}[0-9]+";
Pattern r1 = Pattern.compile(pattern1);
Matcher m1 = r1.matcher(temp1);
String pattern2 = "[0-9]+";
Pattern r2 = Pattern.compile(pattern2);
Matcher m2 = r2.matcher(temp1);
if (!(m1.find())) {
while (m2.find()) {
count++;
String number = m2.group(0);
integerInFirstFile.add(number);
if (count % 10 == 0) {
System.out.println(number);
} else
System.out.print(number + " ");
}
} else {
count++;
String number = m1.group(0);
decimalsInFirstFile.add(number);
System.out.println("Next File");
if (count % 10 == 0) {
System.out.println(number);
} else
System.out.print(number + " ");
}
}
Scanner INPUT_TEXT2 = new Scanner(new File("C:\\Users\\josep\\OneDrive\\Classes\\Programming\\assignment3_2.txt"));
INPUT_TEXT2.useDelimiter(" ");
System.out.println("");
System.out.println("Next File");
while (INPUT_TEXT2.hasNext()) {
String TempString1 = INPUT_TEXT2.next();
if (decimalsInFirstFile.equals(TempString1) || integerInFirstFile.equals(TempString1)) ;
{
System.out.println(TempString1);
}
}
INPUT_TEXT.close();
INPUT_TEXT2.close();
}
}
The first file contains a mix of text and numbers, which is why I had to use the matcher to remove the commas and other text. My question is: how do I then match against the other text file and display how many times a number in the second file appears in the first file? Do I use the Matcher class, or .match, nextLine, or something similar? Also, where is the best place to put the code? I don't want to mess up the loops already in place.
Once you have a number, store it for future use; don't just display it.
Scanner INPUT_TEXT = new Scanner(new File("C:\\Users\\josep\\Downloads\\assignment3_1.txt"));
INPUT_TEXT.useDelimiter(" ");
int count=0;
// ** Creating the storage for the numbers (as strings)
ArrayList<String> numbersInFirstFile=new ArrayList<>();
Then when you find one, store it:
while(INPUT_TEXT.hasNext()){
String TempString=INPUT_TEXT.next();
String temp1 = TempString.replaceAll("[\\,]", "");
// other things that were here are still needed, put them back
if(!(m1.find())){
while(m2.find( )) {
count++;
String number = m2.group(0);
// *********************************
// Hey, I found one, let's store it:
numbersInFirstFile.add(number);
// Do the same in other places where you find numbers in the first file
Now that you have the numbers from the first file, whenever you read a number from the second file, you have something to search in.
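One possible way to do that search is sketched below, assuming the numbers are compared as exact strings (so "1.4" and "1.40" would count as different). The class name `NumberCounter` and the method `countMatches` are hypothetical, and the lists in `main` hold only the first line of the sample data rather than the real files.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NumberCounter {
    /** For each token of the second file, counts its occurrences among the first file's numbers. */
    public static Map<String, Integer> countMatches(List<String> numbersInFirstFile,
                                                    List<String> tokensInSecondFile) {
        // Tally every number from the first file once.
        Map<String, Integer> tally = new HashMap<>();
        for (String number : numbersInFirstFile) {
            tally.merge(number, 1, Integer::sum);
        }
        // Look up each token of the second file, keeping the second file's order.
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String token : tokensInSecondFile) {
            result.put(token, tally.getOrDefault(token, 0));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList(
                "1913", "2016", "1", "1913", "186", "2016", "1711", "32843", "2016", "518");
        List<String> second = Arrays.asList("1913", "2016", "32843", "31");
        for (Map.Entry<String, Integer> e : countMatches(first, second).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

In the original program this would run after the first loop has filled the list, with the second `Scanner` supplying the tokens, so the existing loops stay untouched.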
If I have a String that looks like this: String calc = "5+3". Can I substring the integers 5 and 3?
In this case, you do know how the String looks, but it could look like this: String calc = "55-23" Therefore, I want to know if there is a way to identify integers in a String.
For something like that, a regular expression is your friend:
String text = "String calc = 55-23";
Matcher m = Pattern.compile("\\d+").matcher(text);
while (m.find())
System.out.println(m.group());
Output
55
23
Now, you might need to expand it to support decimals:
String text = "String calc = 1.1 + 22 * 333 / (4444 - 55555)";
Matcher m = Pattern.compile("\\d+(?:\\.\\d+)?").matcher(text); // the dot is escaped so it matches only a literal '.'
while (m.find())
System.out.println(m.group());
Output
1.1
22
333
4444
55555
You could use a regex like ([\d]+)([+-])([\d]+) to obtain the full binary expression.
Pattern pattern = Pattern.compile("([\\d]+)([+-])([\\d]+)");
String calc = "5+3";
Matcher matcher = pattern.matcher(calc);
if (matcher.matches()) {
int lhs = Integer.parseInt(matcher.group(1));
int rhs = Integer.parseInt(matcher.group(3));
char operator = matcher.group(2).charAt(0);
System.out.print(lhs + " " + operator + " " + rhs + " = ");
switch (operator) {
case '+': {
System.out.println(lhs + rhs);
break; // without break, execution falls through to the '-' case
}
case '-': {
System.out.println(lhs - rhs);
break;
}
}
}
Output:
5 + 3 = 8
You can read each character and look at its ASCII code. If the code is between 48 and 57, the character is a digit; otherwise it is a symbol.
If the next character is also a digit, append it to the current number until you reach a symbol.
String calc="55-23";
String intString="";
char tempChar;
for (int i=0;i<calc.length();i++){
tempChar=calc.charAt(i);
int ascii=(int) tempChar;
if (ascii>47 && ascii <58){
intString=intString+tempChar;
}
else {
System.out.println(intString);
intString="";
}
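}
// Print the last number, which is not followed by a symbol
if (!intString.isEmpty()) {
System.out.println(intString);
}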
Let’s say I am looping through a text file and come across the following two strings with random words and integer values
“foo 11 25”
“foo 38 15 976 24”
I write a regex pattern that would match both strings, for example:
((?:[a-z][a-z]+)\\s+\\d+\\s\\d+)
But, the problem is I don’t think this regex would allow me to get to all 4 integer values in the 2nd string.
Q1.) How can I create a single pattern that leaves these 3rd and 4th integers optional?
Q2.) How do I write the matcher code to only go after the 3rd and 4th values when they are found by the pattern?
Here is a template program to help anyone willing to offer a hand. Thanks.
public void foo(String fooFile) throws IOException {
//Assume fooFile contains the two strings
//"foo 11 25";
//"foo 38 15 976 24";
Pattern p = Pattern.compile("((?:[a-z][a-z]+)\\s+\\d+\\s\\d+)", Pattern.CASE_INSENSITIVE);
BufferedReader br = new BufferedReader(new FileReader(fooFile));
String line;
while ((line = br.readLine()) != null) {
//Process the patterns
Matcher m1 = p.matcher(line);
if (m1.find()) {
int int1, int2, int3, int4;
//Need help to write the matcher code
}
}
}
If you want to retrieve every int value, you can use regex:
[a-z]+\s(\d+)\s(\d+)\s?(\d+)?\s?(\d+)?
and every int will be in groups 1 to 4. Then you can use something like:
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args){
String[] strings = {"foo 11 25","foo 67 45 97",
"foo 38 15 976 24"};
for(String string : strings) {
ArrayList<Integer> numbers = new ArrayList<Integer>();
Matcher matcher = Pattern.compile("[a-z]+\\s(\\d+)\\s(\\d+)\\s?(\\d+)?\\s?(\\d+)?").matcher(string);
matcher.find();
for(int i = 0; i < 4; i++){
if(matcher.group(i+1) != null) {
numbers.add(Integer.valueOf(matcher.group(i + 1)));
}else{
System.out.println("group " + (i+1) + " is " + matcher.group(i+1));
}
}
System.out.println("Match from string: "+ "\""+ string + "\"" + " : " + numbers.toString());
}
}
}
with output:
group 3 is null
group 4 is null
Match from string: "foo 11 25" : [11, 25]
group 4 is null
Match from string: "foo 67 45 97" : [67, 45, 97]
Match from string: "foo 38 15 976 24" : [38, 15, 976, 24]
Another way would be to capture all the ints in one group with:
[a-z]+\s((?:\d+\s?)+)
and split matcher.group(1) on whitespace to get a String[] of values. Implementation in Java:
public class Test {
public static void main(String[] args){
String[] strings = {"foo 11 25","foo 67 45 97",
"foo 38 15 976 24"};
for(String string : strings) {
ArrayList<Integer> numbers = new ArrayList<Integer>();
Matcher matcher = Pattern.compile("[a-z]+\\s((?:\\d+\\s?)+)").matcher(string);
matcher.find();
String[] nums = matcher.group(1).split("\\s");
for(String num : nums){
numbers.add(Integer.valueOf(num));
}
System.out.println("Match from string: "+ "\""+ string + "\"" + " : " + numbers.toString());
}
}
}
with output:
Match from string: "foo 11 25" : [11, 25]
Match from string: "foo 67 45 97" : [67, 45, 97]
Match from string: "foo 38 15 976 24" : [38, 15, 976, 24]
The regex pattern you are currently using requires exactly two numbers (\s+\d+\s\d+) at the end. To allow any number of numbers, each preceded by whitespace, use (\s+\d+)+.
So the full regex would be ((?:[a-z][a-z]+)(\s+\d+)+)
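Note that in Java a repeated capturing group such as (\s+\d+)+ keeps only its last iteration, so the group alone will not hand back every integer. A minimal sketch of one workaround: validate the line shape first, then loop over `\d+` with `find()`. The class name `FooLineParser`, the method `parse`, and the limit of two to four numbers are assumptions taken from the question's examples.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FooLineParser {
    // A word of at least two letters followed by two to four whitespace-separated integers.
    private static final Pattern LINE =
            Pattern.compile("[a-z]{2,}(?:\\s+\\d+){2,4}", Pattern.CASE_INSENSITIVE);
    private static final Pattern NUMBER = Pattern.compile("\\d+");

    /** Returns every integer on a valid line, or an empty list if the line does not match. */
    public static List<Integer> parse(String line) {
        List<Integer> numbers = new ArrayList<>();
        if (!LINE.matcher(line.trim()).matches()) {
            return numbers;
        }
        // The third and fourth values are simply absent from the list when not present.
        Matcher m = NUMBER.matcher(line);
        while (m.find()) {
            numbers.add(Integer.valueOf(m.group()));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(parse("foo 11 25"));
        System.out.println(parse("foo 38 15 976 24"));
    }
}
```

The caller can check the size of the returned list to decide whether the optional third and fourth values are available.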
String strArray="135(i),15a,14(g)(q)12,67dd(),kk,159"; // split by ','
Divide each element after the first occurrence of an alphanumeric value/character.
Expected output:
original    expected o/p
15a         s1=15   s2=a
67dd()      s1=67   s2=dd()
kk          s1=""   s2=kk
159         s1=159  s2=""
Please help me.
You could use the group-method of Pattern/Matcher:
String strArray = "135(i),15a,14(g)(q)12,67dd(),kk,159"; // split by ','
Pattern pattern = Pattern.compile("(?<digits>\\d*)(?<chars>[^,]*)");
Matcher matcher = pattern.matcher(strArray);
while (matcher.find()) {
if (!matcher.group().isEmpty()) // skip empty matches
System.out.println(matcher.group() + " : " + matcher.group("digits") + " - " + matcher.group("chars"));
}
The method group(String name) gives you the String captured by the named group in the pattern (here 'digits' or 'chars') within the match.
The method group(int i) gives you the String captured by the i-th pair of parentheses of the pattern within the match.
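As a small illustration that both lookups address the same capturing groups, here is a sketch that applies the pattern to a single token. The class name `GroupDemo` and the method `split` are hypothetical; the pattern is the one from the answer above.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    private static final Pattern TOKEN =
            Pattern.compile("(?<digits>\\d*)(?<chars>[^,]*)");

    /** Returns {digits, chars} for one token such as "67dd()", or null if it contains a comma. */
    public static String[] split(String token) {
        Matcher m = TOKEN.matcher(token);
        if (!m.matches()) {
            return null;
        }
        // group("digits") is the same group as group(1); group(2) is the same as group("chars").
        return new String[] { m.group("digits"), m.group(2) };
    }

    public static void main(String[] args) {
        for (String token : "135(i),15a,14(g)(q)12,67dd(),kk,159".split(",")) {
            System.out.println(token + " : " + Arrays.toString(split(token)));
        }
    }
}
```

Named groups tend to stay readable when the pattern is later extended, because adding a group does not renumber the existing ones.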
See the Oracle tutorial at http://docs.oracle.com/javase/tutorial/essential/regex/ for more examples of using regex in Java.
You can use a Pattern and a Matcher to find the first index of a letter preceded by a number and split at that position.
Code
public static void main(String[] args) {
String[] inputs = { "15a", "67dd()", "kk", "159" };
for (String input : inputs) {
Pattern p = Pattern.compile("(?<=[0-9])[a-zA-Z]");
Matcher m = p.matcher(input);
System.out.println("Input: " + input);
if (m.find()) {
int splitIndex = m.end();
// System.out.println(splitIndex);
System.out.println("1.\t"+input.substring(0, splitIndex - 1));
System.out.println("2.\t"+input.substring(splitIndex - 1));
} else {
System.out.println("1.");
System.out.println("2.\t"+input);
}
}
}
Output
Input: 15a
1. 15
2. a
Input: 67dd()
1. 67
2. dd()
Input: kk
1.
2. kk
Input: 159
1.
2. 159
Use java.util.regex.Pattern and java.util.regex.Matcher
String strArray="135(i),15a,14(g)(q)12,67dd(),kk,159";
String arr[] = strArray.split(",");
for (String s : arr) {
Matcher m = Pattern.compile("([0-9]*)([^0-9]*)").matcher(s);
System.out.println("String in = " + s);
if(m.matches()){
System.out.println(" s1: " + m.group(1));
System.out.println(" s2: " + m.group(2));
} else {
System.out.println(" unmatched");
}
}
outputs:
String in = 135(i)
s1: 135
s2: (i)
String in = 15a
s1: 15
s2: a
String in = 14(g)(q)12
unmatched
String in = 67dd()
s1: 67
s2: dd()
String in = kk
s1:
s2: kk
String in = 159
s1: 159
s2:
Note how '14(g)(q)12' is not matched. It's not clear what the OP's required output is in this instance (or if a comma is missing from this portion of the example input string).